Scalable, high-performance ML toolboxes designed to deliver a compact and broad sense of hearing
Our technology allows us to lead the sound recognition field in both performance and compactness.
Compared to Apple’s iOS accessibility feature, our technology delivers 56x better classification performance.
Compared with MobileNet (a popular network architecture used for compact mobile and embedded ML applications), our network architecture requires 144x fewer FLOPS (floating point operations per second).
Data: The bedrock of good audio ML
To teach our technology to model sounds accurately, we have to expose it to high-quality, real-world data. Quantity matters, but high performance is also about relevance and diversity. Read more in our whitepaper.
We record sounds either in our dedicated anechoic Sound Labs, through our network of volunteers, or via our dedicated data collection teams. This means that we know everything about the data we use to train our models.
Our approaches to data collection and management are also in line with global legal and ethical best practice, which means that our customers can confidently deploy our technology into the global market. Read more in our whitepaper.
Alexandria™ contains 30,000,000 labelled recordings across 1,000 sound classes, and 400,000,000 metadata points.
Alexandria™ – The world’s largest, commercially exploitable audio dataset
Alexandria™ provides us with a rich platform from which to build and manage the data necessary for giving machines a compact sense of hearing.
By leveraging the vast amount of data in Alexandria™ and by using our Acoustically Smart Temporal Augmentation techniques, we can construct acoustically accurate polyphonic soundscapes that represent a wide range of real-world scenarios which would be practically impossible to collect.
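The idea of building a polyphonic soundscape from labelled recordings can be illustrated with a generic mixing sketch. This is not the Acoustically Smart Temporal Augmentation technique, whose details are not described here; it only shows the basic principle of overlaying labelled clips on a background at chosen onsets and levels while keeping the annotations. The function names, the `(label, clip, onset, snr_db)` event format and the signal-to-noise scaling rule are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical event format: (label, samples, onset_sample, snr_db)
Event = Tuple[str, Sequence[float], int, float]


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a signal (0.0 for an empty signal)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def mix_soundscape(background: Sequence[float], events: List[Event]):
    """Overlay labelled clips on a background and return (mix, annotations).

    Each clip is scaled so that its RMS level sits snr_db decibels above
    the background RMS. Clips may overlap, which makes the result polyphonic.
    """
    mix = list(background)
    bg_rms = rms(background)
    annotations = []
    for label, clip, onset, snr_db in events:
        clip_rms = rms(clip)
        if clip_rms == 0.0 or bg_rms == 0.0:
            gain = 1.0  # nothing to scale against; leave the clip unchanged
        else:
            gain = bg_rms * 10 ** (snr_db / 20) / clip_rms
        for i, value in enumerate(clip):
            j = onset + i
            if 0 <= j < len(mix):  # ignore samples that fall outside the background
                mix[j] += gain * value
        # Keep the label with its onset and offset so the mix stays annotated.
        annotations.append((label, max(onset, 0), min(onset + len(clip), len(mix))))
    return mix, annotations


if __name__ == "__main__":
    background = [0.1, -0.1] * 50
    events = [("dog_bark", [1.0] * 10, 20, 0.0), ("glass_break", [0.5] * 10, 25, 20.0)]
    mix, annotations = mix_soundscape(background, events)
    print(annotations)
```

Because the annotations are generated alongside the audio, each synthetic soundscape comes with exact onset and offset labels, including for overlapping events.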
AuditoryNET™ – Sophisticated and optimized DNNs built for hearing
AuditoryNET™ is a range of compact, topologically optimized DNNs, associated training frameworks and toolboxes designed specifically for the formal ML tasks within sound recognition (sound event detection, scene recognition and tagging).
Giving machines a broad sense of hearing beyond speech and music presents a number of specific challenges. Tackling them has led to many breakthroughs, including:
- Our patented sound recognition loss function tools and our range of model compression tools specific to sound recognition.
- A robust metric for evaluating sound recognition systems. Our published work on the Polyphonic Sound Detection Score (PSDS) has led to it fast becoming the industry’s standard approach to evaluating sound recognition.
Comparing network size and computational load
To illustrate the compactness of our technology, we’ve compared one of our models from AuditoryNET™ (recognizing six sound classes) with two network architectures used in the research arena: MobileNet (recognizing six sound classes) and ResNet (recognizing 10 sound classes). Both are familiar to a broad range of people working in machine learning, and both are often used by groups entering the annual academic DCASE Challenge for sound recognition tasks.
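As a rough illustration of how computational load is compared between architectures, the sketch below counts multiply-accumulate operations (MACs) for a single convolutional layer in two forms: a standard convolution, and the depthwise separable convolution that MobileNet is built on. The layer shape is hypothetical and is not taken from AuditoryNET™ or from the comparison above; the sketch does not reproduce the 144x figure.

```python
def conv2d_macs(h_out: int, w_out: int, kernel: int, c_in: int, c_out: int) -> int:
    """MACs for a standard 2D convolution with a square kernel (bias ignored)."""
    return h_out * w_out * kernel * kernel * c_in * c_out


def depthwise_separable_macs(h_out: int, w_out: int, kernel: int, c_in: int, c_out: int) -> int:
    """MACs for a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = h_out * w_out * kernel * kernel * c_in  # one filter per input channel
    pointwise = h_out * w_out * c_in * c_out            # 1x1 convolution mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 32x32 output map, 3x3 kernel, 64 -> 128 channels.
    shape = dict(h_out=32, w_out=32, kernel=3, c_in=64, c_out=128)
    standard = conv2d_macs(**shape)
    separable = depthwise_separable_macs(**shape)
    print(f"standard:  {standard:,} MACs")
    print(f"separable: {separable:,} MACs")
    print(f"ratio:     {standard / separable:.2f}x")
```

For this layer shape the separable form needs roughly 8.4x fewer MACs. Summing such counts over every layer gives the per-inference load of a whole network; one MAC is commonly counted as two floating point operations.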
ai3™ and ai3-nano™ – Proven, ultra-compact and flexible software platforms
Our customers license ai3™ and ai3-nano™, which give a wide range of consumer products a sense of hearing and require as little as 40 kB of RAM and ~1 mA of current.
Their flexible, scalable architecture means that they can:
- Be adapted to customer hardware
- Support technologies such as our patented post-biasing, which can adapt to contextual information based on other sensor data
- Take advantage of SoC acceleration
Polyphonic Sound Detection Score
First published and presented at ICASSP 2020, the Polyphonic Sound Detection Score (PSDS) is an industry-standard evaluation framework and metric for polyphonic sound recognition systems. It was designed by Audio Analytic and is based on our extensive expertise in real-world sound recognition.
PSDS solves the fundamental shortcomings of previous evaluation approaches, and we believe that the wider sound recognition community will benefit from our expertise: feel free to access the source code and benefit from using PSDS in your own work.
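To give a flavour of the intersection-based matching that PSDS builds on, here is a simplified single-class sketch of two of its criteria as described in the ICASSP 2020 paper: a detection counts as valid when enough of its duration overlaps ground truth (the detection tolerance criterion), and a ground-truth event counts as detected when enough of its duration is covered by valid detections (the ground truth intersection criterion). This is not the published source code and does not compute the PSDS value itself; it omits cross-triggers and the PSD-ROC curve. The function names and the default thresholds of 0.5 are assumptions, and the sketch assumes events of the same kind do not overlap each other.

```python
from typing import List, Tuple

Event = Tuple[float, float]  # (onset_seconds, offset_seconds)


def overlap(a: Event, b: Event) -> float:
    """Duration in seconds for which two events intersect."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def covered_fraction(event: Event, others: List[Event]) -> float:
    """Fraction of the event's duration intersected by the other events."""
    duration = event[1] - event[0]
    if duration <= 0:
        return 0.0
    return sum(overlap(event, other) for other in others) / duration


def count_true_positives(
    detections: List[Event],
    ground_truths: List[Event],
    rho_dtc: float = 0.5,  # assumed detection tolerance threshold
    rho_gtc: float = 0.5,  # assumed ground truth intersection threshold
) -> int:
    """Count ground-truth events detected under the two simplified criteria."""
    # Keep only detections that mostly lie on top of ground truth.
    valid = [d for d in detections if covered_fraction(d, ground_truths) >= rho_dtc]
    # A ground truth is detected if valid detections cover enough of it.
    return sum(1 for g in ground_truths if covered_fraction(g, valid) >= rho_gtc)


if __name__ == "__main__":
    ground_truths = [(0.0, 2.0), (5.0, 6.0)]
    detections = [(0.5, 1.0), (1.0, 2.5), (8.0, 9.0)]
    print(count_true_positives(detections, ground_truths))
```

In this example two short detections jointly cover the first ground-truth event, so it counts as detected even though neither detection matches its boundaries exactly.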
We have a portfolio of 32 patents (filed and granted), covering the uses of sound recognition in products as well as the specialist techniques that we use to give machines a sense of hearing.
The Cortex M0+ challenge: How low can we go?
We set ourselves a really complicated hardware challenge with exciting potential: to see if we can embed our class-leading software on an M0+-based chip.
Audio Analytic named in CB Insights AI 100 List of Most Innovative Artificial Intelligence Startups