When it comes to training and evaluating sound recognition systems that perform to a high standard within a range of diverse consumer products, you cannot rely on recordings downloaded from the internet.

This is due to a range of legal, ethical and technical limitations which are highlighted in a whitepaper that we have published today.

Although it can be accessed at the click of a mouse, audio and video content shared online is made available for specific purposes only – typically social networking, entertainment, and sound effects for media production.

From a legal and ethical perspective, you need appropriate licence and copyright permissions to use audio data for training a system that will be commercialised. Launching a consumer product into the global market without such permissions – or without the ability to prove them – opens organisations up to significant risk. As the regulations around AI evolve, these risks will only increase.

From a technical perspective, audio clips found online don’t usually match the target usage scenarios and suffer from certain biases, which makes them unsuitable for training. In addition, these recordings are affected by many layers of distortions, artifacts, codecs, echoes and speaker limitations – from both the environment they were recorded in and the one they are played back in – making them unsuitable for evaluating the performance of a sound recognition system.

Giving machines a compact sense of hearing is right at the cutting edge of AI. It opens up new consumer values and opportunities for companies whether they manufacture smart speakers, smartphones, hearables, wearables, smart home devices or vehicles.

However, to match consumer expectations, the technology has to work in the environments where the products will be used. This means that those training and evaluating this technology need to better understand that humans and machines listen in very different ways. For example, a machine is trained to hear using information that is often irrelevant or imperceptible to human ears.

To be better informed, download and read the full whitepaper for free here.


About Audio Analytic

Audio Analytic is the pioneer of AI sound recognition technology. The company is on a mission to give machines a compact sense of hearing. This empowers them with the ability to react to the world around us, helping satisfy our entertainment, safety, security, wellbeing, convenience, and communication needs across a huge range of consumer products.

Audio Analytic’s ai3™ and ai3-nano™ sound recognition software enables device manufacturers to equip products at the edge with the ability to recognise and automatically respond to our growing list of sounds and acoustic scenes.