Our scalable technology platform

Data collection – It’s the foundation

To teach our technology to recognise sounds, we have to expose it to high-quality, real-world data. Quantity matters, but so do relevance and diversity.

We record audio events and acoustic scenes in our dedicated Sound Labs, through our network of volunteers, or via our own data collection team.

Read about our data collection process in WIRED

Alexandria™ – The world’s largest commercially exploitable audio dataset

Sound recognition was a zero-data problem when we started this journey.

We built Alexandria™, a dedicated ML-ready audio dataset, which is used to train our sound recognition algorithms.

Alexandria™ contains 30 million labelled recordings of relevant sound events and acoustic scenes. All audio data is expertly labelled, with full data provenance built in from the start. The dataset is structured according to our unique taxonomy, encompassing anthrophonic, biophonic and geophonic sounds.

Alexandria™ contains 30,000,000 labelled recordings across 1,000 sound classes.
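As a concrete illustration of how such a record might be structured, here is a minimal Python sketch; the field names and example values are purely illustrative and are not Alexandria™'s actual schema.

```python
# Illustrative only: not Alexandria's actual schema.
from dataclasses import dataclass
from enum import Enum

class SoundOrigin(Enum):
    ANTHROPHONIC = "anthrophonic"  # sounds made by humans and their artefacts
    BIOPHONIC = "biophonic"        # sounds made by other living things
    GEOPHONIC = "geophonic"        # non-biological natural sounds

@dataclass
class LabelledEvent:
    clip_id: str         # provenance: which source recording this came from
    onset_s: float       # event start within the clip, in seconds
    offset_s: float      # event end within the clip, in seconds
    sound_class: str     # one of the dataset's sound classes
    origin: SoundOrigin  # top-level branch of the taxonomy

event = LabelledEvent("clip_000042", 1.2, 2.7,
                      "Window glass break", SoundOrigin.ANTHROPHONIC)
```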

AuditoryNET™ – Our specialised deep neural network for sound recognition

Intelligent sound recognition requires a deep knowledge of the ideophonic features of sounds; it is the only way to teach machines how to hear. We’ve built our own highly optimised, dedicated deep neural network that accurately models sounds based on these ideophonic features.
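AuditoryNET™ itself is proprietary, but to illustrate the general shape of compact sound classifiers, here is a minimal PyTorch sketch of a small convolutional network over log-mel spectrograms; every name and size in it is our assumption, not AuditoryNET™'s actual design.

```python
# Illustrative only: a tiny CNN over log-mel spectrograms, the common
# shape of sound event classifiers. Not AuditoryNET's actual design.
import torch
import torch.nn as nn

class TinySoundNet(nn.Module):
    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # time-frequency filters
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # pool over both time and frequency
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, n_mels, n_frames)
        return self.classifier(self.features(x).flatten(1))

scores = TinySoundNet()(torch.randn(1, 1, 64, 100))  # per-class scores
```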

ai3™ and ai3-nano™ – Our proven, ultra-lightweight and flexible software platforms

Our customers license ai3™ and ai3-nano™, which give a wide range of consumer products a sense of hearing.

Because our DNN is dedicated to sound recognition, it is extremely compact, which makes it perfect for a wide range of products, from smart speakers to hearables.

Polyphonic Sound Detection Score

First published and presented at ICASSP 2020, the Polyphonic Sound Detection Score (PSDS) is an industry-standard evaluation framework and metric for polyphonic sound recognition systems. It was designed by Audio Analytic, drawing on our extensive expertise in real-world sound recognition.

PSDS solves the fundamental shortcomings of previous evaluation approaches, and we believe the wider sound recognition community will benefit from our expertise: the source code is freely available, so you can use PSDS in your own work.
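For readers who want to try it, the sketch below follows the documented usage of the open-source psds_eval package released alongside the paper; the toy ground-truth and detection tables are our own invention, not a real evaluation set.

```python
# A minimal sketch following psds_eval's documented usage; the toy
# tables below are illustrative only.
import pandas as pd
from psds_eval import PSDSEval

# Ground truth events and per-file durations
gt = pd.DataFrame({"filename": ["clip1.wav"], "onset": [1.0],
                   "offset": [3.0], "event_label": ["Dog bark"]})
meta = pd.DataFrame({"filename": ["clip1.wav"], "duration": [10.0]})

psds_eval = PSDSEval(dtc_threshold=0.5, gtc_threshold=0.5,
                     cttc_threshold=0.3, ground_truth=gt, metadata=meta)

# One operating point = the detections at one decision threshold
det = pd.DataFrame({"filename": ["clip1.wav"], "onset": [0.9],
                    "offset": [2.8], "event_label": ["Dog bark"]})
psds_eval.add_operating_point(det)

psds = psds_eval.psds(alpha_ct=0.0, alpha_st=0.0, max_efpr=100)
print(f"PSD-Score: {psds.value:.3f}")
```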


Sounds are fundamentally different from voice and music

Speech recognition and wake words are limited by the types of sounds the human mouth can produce, and conditioned by the communicative structure of human language, both of which can be exhaustively mapped.

Similarly, music mostly results from physical resonance and is conditioned by the rules of various musical genres.

So whilst the human ear and brain are very good at interpreting sounds in spite of acoustic variation, computers were originally designed to process repeatable tasks. Teaching a machine to recognise speech or music therefore benefits greatly from such pre-defined rules and prior knowledge.

Sounds, on the other hand, can be much more diverse, unbounded and unstructured than speech and music.

Think about a window being smashed, and all the different ways glass shards can hit the floor randomly, without any particular intent or style. Or think about the difference between a long baby cry and a short dog bark, or the relative loudness of a naturally spoken conversation versus an explosive glass crash.

Now you understand why sound recognition required us to develop a special kind of expertise: collecting sound data ourselves and tackling real-world sound recognition problems made us pioneering experts in understanding the full extent of sound variability.

Sound recognition for a wide range of products

Expertise in data collection, the world's leading audio dataset for machine learning and a highly specialised DNN enable us to create ai3™, a flexible software platform capable of detecting a large number of sounds in a wide range of devices. A selection of these sound classes is listed below, followed by a sketch of how detections might surface in a product.

Dog bark

Baby cry

Window glass break

Smoke/CO alarm

Outdoor glass break

Speech detection

Car horn

Car alarm

Acoustic scene recognition

Snore

Laugh

Acoustic guitar

Clapping

Singing

Applause

Music

Door knock

Bicycle bell

Vehicle approaching

Emergency vehicle siren

Indoor physical scene

Outdoor physical scene

Running footsteps

Shushing

Warning shout
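
To show how such detections might reach a product, here is a deliberately hypothetical sketch of an event-callback interface; ai3™'s real API is not public, so every name and signature below is our invention.

```python
# Hypothetical sketch of an event-callback interface for embedded
# sound recognition. ai3's real API is not public; all names here
# are invented for illustration.
from typing import Callable

class SoundRecognizer:
    """Accepts fixed-size PCM frames and fires a callback on detections."""

    def __init__(self, on_event: Callable[[str, float], None]):
        self.on_event = on_event

    def push_frame(self, pcm_frame: bytes) -> None:
        # A real system would run feature extraction and DNN inference
        # here; we fake one detection to show the callback contract.
        self.on_event("Smoke/CO alarm", 0.97)

recognizer = SoundRecognizer(
    on_event=lambda label, conf: print(f"{label} ({conf:.0%})"))
recognizer.push_frame(b"\x00" * 640)  # 20 ms of 16 kHz, 16-bit mono audio
```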

PSDS adopted by DCASE for Task 4 challenge

The organisers of the 2020 DCASE Challenge have included Audio Analytic’s Polyphonic Sound Detection Score (PSDS) as one of the two evaluation metrics for ‘Task 4: Sound event detection and separation in domestic environments’.

Visualising the complex world of sounds

Our colourful and vibrant Sound Map illustrates the complexity of sound, and the challenges involved with teaching machines to hear.

Our patents

We have a portfolio of 31 patents (filed and granted), covering the uses of sound recognition in products as well as the specialist techniques that we use to give machines a sense of hearing.

The Cortex M0+ challenge: How low can we go?

So we set ourselves a genuinely difficult hardware challenge with exciting potential: to see if we could embed our class-leading software on an M0+ based chip.
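To give a sense of scale, here is a back-of-envelope sketch of what a 40 kB footprint (the figure announced for ai3-nano™ below) leaves for model weights under 8-bit quantisation; the overhead figure is our assumption, not a published breakdown.

```python
# Back-of-envelope only: all figures besides the 40 kB footprint
# are our assumptions, not published numbers.
footprint_bytes = 40 * 1024   # ai3-nano's announced 40 kB footprint
runtime_overhead = 8 * 1024   # assumed headroom for code and buffers
bytes_per_weight = 1          # assuming 8-bit quantised weights

max_weights = (footprint_bytes - runtime_overhead) // bytes_per_weight
print(f"Room for roughly {max_weights:,} weights")  # 32,768
```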

Latest news

Introducing ai3-nano™: The gold standard in sound recognition, now at 40kB


Loss function: There is more to DNNs than their architecture


Patent US010783434: Application of loss functions for training sound event recognition systems

