We’ve built the world’s largest dedicated audio data set
To teach our technology to recognise sounds, we have to expose it to real-world data. Quantity matters, but quality, relevance and diversity matter more.
That is why we record these sounds (both audio events and acoustic scenes) ourselves, either in our dedicated sound labs or through responsible data-gathering initiatives. We have been doing this for seven years, and in that time we have developed unrivalled expertise in the sounds around us.
All of this expertise is captured in Alexandria, our proprietary data platform, which enables us to label, organise and analyse sounds in a way that has never been done before.
Algorithms based on ideophones
We have developed a dedicated sound recognition framework based on ideophones, the language of sounds. This provides us with the tools and ability to fuel cutting-edge machine listening algorithms with real-world data.
Our framework models hundreds of ideophonic features of sound, enabling our embedded software platform to accurately recognise and react to the world around consumer devices.
Artificial intelligence, everywhere
We envision a future where omnipresent, intelligent, context-aware computing helps people by responding to the sounds around them, wherever they are, and taking appropriate action on their behalf.
Our mission is to map the world of sounds and give machines a sense of hearing, whether that is in the home, out and about, or in the car.
Our sound profiles
Our customers integrate our ai3™ software platform into a diverse range of products because of its flexible architecture and scalability.
ai3™ recognises sounds from the profiles that we have taught it, and the number of profiles is growing all the time. What's more, sound profiles can be deployed over-the-air, which helps our customers stay one step ahead of their competition.
Window glass break
How is this different from speech and music?
Speech recognition and wake-word detection are bounded by the types of sound that the human mouth can produce and conditioned by the communicative structure of human language, both of which can be exhaustively mapped.
Similarly, music mostly results from physical resonance, and is conditioned by the rules of various musical genres.
So whilst the human ear and brain are very good at interpreting sounds in spite of acoustic variations, computers were originally designed to perform repeatable tasks. Teaching a machine to recognise speech and music therefore benefits greatly from such pre-defined rules and prior knowledge.
Sounds, on the other hand, can be much more diverse, unbounded and unstructured than speech and music.
Think about a window being smashed, and all the different ways glass shards can hit the floor randomly, without any particular intent or style. Or think about the difference between a long baby cry and a short dog bark, or the relative loudness of a naturally spoken conversation versus an explosive glass crash.
Now you understand why sound recognition required us to develop a special kind of expertise: collecting sound data ourselves and tackling real-world sound recognition problems made us pioneering experts in understanding the full extent of sound variability.