March 5, 2020

Five predictions for the future of sound recognition and voice assistants in 2020

Our mission at Audio Analytic is to give machines a sense of hearing.

I explained this to a friend, and they asked whether our mission had been inspired by the Terminator franchise and whether this sense of hearing would help machines take over the world in the near future and annihilate humanity.

I had to disappoint them. This is not part of our 2020 strategy. But there are still some very exciting times ahead for sound recognition technology.

By giving smart speakers, mobile phones and automobiles the ability to recognise sounds beyond speech, we can provide voice assistants with powerful context awareness.

I’m particularly fascinated by the ways that cloudless AI can help voice assistants be more proactive.

It’s said that a good assistant will do something when you ask them to, while a great assistant doesn’t need asking. Voice assistants are gradually making this transition from good to great, from being purely reactive to proactive, and sound recognition can help them do this.

Here are five pieces of contextual information that on-edge sound recognition could provide to a proactive voice assistant:

  • “Don’t speak now, someone else is speaking” – Voice assistants may want to wait to talk when someone or something else is trying to communicate with their end-user. Relevant situations include the presence of sounds such as people talking, a dog barking, or the sounds from a doorbell, phone ringtone or alarm clock.
  • “Don’t speak now, they won’t hear you” – Voice interfaces are typically poorly suited to noisy environments, and voice assistants may want to save their breath or repeat themselves later when the end-user’s ability to perceive sounds in their environment is reduced. Perhaps the acoustic scene is generally chaotic, or perhaps there are sounds present like loud music, a crying baby, a hairdryer, or a vacuum cleaner.
  • “Now might be a good time to say something” – The voice assistant’s soundscape can indicate the best opportunities for proactive communications with end-users. If the acoustic scene is generally calm and not chaotic, that’s a good start. Beyond this, the presence of sounds such as laughter might present a timely opportunity for the assistant to interrupt with appropriate content.
  • “Here’s a voice app they might enjoy” – Discoverability remains a big challenge for third-party voice apps and skills. The presence of certain sounds in the environment can indicate that an end-user has certain interests, enabling the assistant to suggest relevant apps and skills to them. Examples of these sounds could include a dog barking, a child laughing, the sounds of various cooking activities, as well as various musical instruments being played.
  • “Don’t say it out loud, write it down” – Voice interfaces are typically not conducive to privacy in public environments. On smart displays and mobile phones the assistant has a choice between using a voice UI, a graphical UI or both. When the soundscape indicates you’re somewhere chaotic like a busy café or boring like public transport – which are both indicators of multiple people being present – the assistant could respond just through the graphical UI for increased privacy, especially when handling sensitive or personal information.
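
As a rough illustration, the five cues above could be combined into a simple decision policy. The sketch below is hypothetical throughout: the label names, the scene names, the `Soundscape` fields, the `decide` and `suggest_topics` functions and the priority order among the rules are all assumptions made for the example, not part of any real product API.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import FrozenSet, List


class Action(Enum):
    WAIT = auto()          # someone or something else is communicating
    DEFER = auto()         # too noisy to be heard; try again later
    SPEAK = auto()         # reasonable moment for a proactive prompt
    DISPLAY_ONLY = auto()  # respond on screen only, for privacy


# Hypothetical label sets, based on the example sounds in the list above.
COMMUNICATING = {"speech", "dog_bark", "doorbell", "phone_ringtone", "alarm_clock"}
MASKING = {"loud_music", "baby_cry", "hairdryer", "vacuum_cleaner"}
PUBLIC_SCENES = {"chaotic", "boring"}  # scenes that suggest other people are present

# Hypothetical mapping from detected sounds to interests for app suggestions.
INTERESTS = {
    "dog_bark": "pets",
    "child_laugh": "family",
    "cooking": "recipes",
    "guitar": "music",
    "piano": "music",
}


@dataclass(frozen=True)
class Soundscape:
    scene: str               # e.g. "calm", "chaotic", "boring"
    sounds: FrozenSet[str]   # labels currently detected
    has_display: bool = False
    sensitive: bool = False  # is the pending response personal or sensitive?


def decide(ctx: Soundscape) -> Action:
    """Pick an action; the rule ordering here is an assumed priority."""
    if ctx.sensitive and ctx.has_display and ctx.scene in PUBLIC_SCENES:
        return Action.DISPLAY_ONLY
    if ctx.sounds & COMMUNICATING:
        return Action.WAIT
    if ctx.scene == "chaotic" or ctx.sounds & MASKING:
        return Action.DEFER
    return Action.SPEAK


def suggest_topics(sounds: FrozenSet[str]) -> List[str]:
    """Return the sorted, de-duplicated interests implied by the sounds."""
    return sorted({INTERESTS[s] for s in sounds if s in INTERESTS})
```

For instance, `decide(Soundscape("calm", frozenset({"doorbell"})))` would choose `Action.WAIT` under these assumed rules, while a sensitive response on a device with a display in a chaotic scene would go to the screen only. A real assistant would presumably weigh confidence scores and timing rather than apply hard rules like these.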

One thing is for sure – sound recognition can and will add a whole new dimension to the way we interact with voice assistants, and that level of innovation is set to come to fruition this year.



About Audio Analytic

Audio Analytic is the pioneer of AI sound recognition software. The company is on a mission to map the world of sounds, offering our sense of hearing to consumer technology. By transferring our sense of hearing to consumer products and digital personal assistants we give them the ability to react to the world around us, helping satisfy our entertainment, safety, security, wellbeing and communication needs.

Audio Analytic’s ai3™ sound recognition software enables device manufacturers and chip companies to equip products with Artificial Audio Intelligence, recognizing and automatically responding to our growing list of sound profiles.

