As 2020 drew to a close, we were granted a significant technology patent. The key technology in this patent, called Post-Biasing, has been essential to how our technology delivers exceptional performance in many diverse products and environments today. But it will also play a critical role in new consumer experiences of the future, where sound recognition not only understands but also adapts to context.

In this post I will share more about what it is and why it is so important to high performance and adaptive sound recognition, which in turn will deliver the best experiences to end users, wherever they are and whatever they are doing.

Building a dream machine

I think of machine learning (ML) systems as being rather like sports cars: simply throwing all the components and fuel together is not sufficient to create a well-performing and efficient machine. Instead, the optimal components must be put together carefully with the relevant know-how. In addition, the fuel required to run the machine, not unlike the data we feed to our ML systems, must be refined to the correct potency.

So, for any ML system to deliver optimal performance, it needs training and evaluation data of the correct quality and diversity. It also requires underlying ML technology and domain expertise that are relevant to the specific task. But even if you do all this, it still only gets you so far. The most important measure of success is how the system performs in the real world with end users, and this requires adaptability in application. After all, consumer expectations are not always in step with lab tests.

Imagine owning the Ferrari 488 Pista (I imagine that on a regular basis – although I can’t decide between classic red or yellow) and having to rebuild it each time depending on what you want to use it for – cruising down the coast, taking it round a track or simply driving along a freeway on a wet Wednesday. Luckily, your lovely, expensive Ferrari 488 Pista features Ferrari Dynamic Enhancer (FDE) technology (to describe it as just traction control would be an injustice). This means that the car can adapt to what is happening to improve the driving experience, whether that is because of a poor driving style or challenging driving conditions.

Take away FDE and you are left with a powerful machine that may not always deliver the optimal experience for the driver.

In the same spirit, from the outset we recognised the importance of building core ML systems and technology that can cope with new environments and new use cases, and so we invested in developing the technology to make that possible. Post-Biasing is the fruit of these efforts. As in the Ferrari analogy above, it lets us keep the same underlying dream machine while giving it the ability to adapt to real-world applications.

What is Post-Biasing?

In simple terms, Post-Biasing means giving an ML system the ability to change its output, or change what the host device does, based on properties not considered a core part of the sound recognition model. These ‘properties’ could be the location, the time, the device type, or the output of a separate ML system or of a device feature such as an accelerometer.

For example, think about what you do while wearing your earbuds. You could be sitting at your desk, running through the park or on a treadmill, sitting on the train or cycling along a busy road. In each of these scenarios, the sound recognition system will be able to adapt to improve experiences by tweaking outputs or switching between models automatically. For example, it is highly unlikely that a door knock would be present or relevant if you were cycling, so the model output can be adjusted to avoid a false alert, even if the sound had similar features.
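
To make the door knock example concrete, here is a minimal sketch of the general idea in Python. It is an illustration only and not the patented method: the function name `post_bias`, the context labels and every number in `CONTEXT_WEIGHTS` are hypothetical values chosen for the example. The core model's scores are left untouched; a separate step rescales them by how plausible each sound is in the current context and then renormalises.

```python
from typing import Dict

# Hypothetical plausibility of each sound class per activity (made-up values).
CONTEXT_WEIGHTS: Dict[str, Dict[str, float]] = {
    "at_desk": {"door_knock": 1.0, "car_horn": 0.2, "bicycle_bell": 0.1},
    "cycling": {"door_knock": 0.05, "car_horn": 1.0, "bicycle_bell": 1.0},
}


def post_bias(scores: Dict[str, float], context: str) -> Dict[str, float]:
    """Rescale core-model scores by context plausibility, then renormalise."""
    # An unknown context has no weights, so every class keeps a weight of 1.0.
    weights = CONTEXT_WEIGHTS.get(context, {})
    biased = {label: p * weights.get(label, 1.0) for label, p in scores.items()}
    total = sum(biased.values())
    if total == 0:
        return dict(scores)  # nothing plausible left: fall back to the raw scores
    return {label: p / total for label, p in biased.items()}


# Assumed output of the core sound recognition model for one audio frame.
core_scores = {"door_knock": 0.6, "car_horn": 0.3, "bicycle_bell": 0.1}

cycling = post_bias(core_scores, "cycling")
desk = post_bias(core_scores, "at_desk")

print(max(cycling, key=cycling.get))  # car_horn
print(max(desk, key=desk.get))        # door_knock
```

The same acoustic evidence leads to a different top result in each context, and the core model never had to be retrained to get there.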

As we rely on sound recognition more and more for smarter assistance in our lives, using information sources in addition to the audio itself is essential.

The applications for this adaptability increase as sound recognition is embedded into portable devices with multiple sensors and inputs, such as smartphones and true wireless earbuds which are used in many different environments, at all times of the day and for different applications.

How can our sound recognition system make sense of all these kinds of additional information?

At first glance, it may seem like we can treat the additional bit of data like any other and push our system to learn it from a blank slate. From a training point of view, this is inefficient and would, if we continue with the sports car analogy, be akin to building a completely new car for each driving condition. The more significant impact would be on data. Building these properties into the model would put astronomical burdens on data collection, as you would need to collect enough diverse and representative audio data plus additional information related to every variation of the additional properties (for example, is the user walking, cycling, running, in a car, etc). Every new property that you would want to account for could effectively mean starting again on the data front, rendering this approach unfeasible due to the costs and development time.

We can further observe that the task of understanding the sound activity itself does not change from one use case to the next; what differs is how that activity should be interpreted. Hence, the better solution is to adapt our system using additional information where possible, instead of recreating it from scratch for each use case. Post-Biasing is a technical solution that enables just that: improving the decisions of the core sound recognition system, based on the additional information available in the deployed environment. This is achieved without making the core system needlessly re-learn the acoustics; instead, we can adapt the system to the new information with much smaller data requirements.
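
One way to picture the smaller data requirement is the sketch below. It assumes (my assumption, for illustration only) that adapting to a context amounts to estimating how much more or less likely each sound is in that context than in general, which needs only a handful of (context, label) annotations and no new audio training. The function `estimate_context_weights` and the toy annotations are hypothetical, and the ratio-of-frequencies estimate with additive smoothing is a standard textbook technique, not a description of the patented method.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def estimate_context_weights(
    labelled: Iterable[Tuple[str, str]],  # (context, true_label) pairs
    labels: List[str],
    smoothing: float = 1.0,
) -> Dict[str, Dict[str, float]]:
    """Estimate weight = P(label | context) / P(label) with additive smoothing."""
    pairs = list(labelled)
    overall = Counter(label for _, label in pairs)
    per_context: Dict[str, Counter] = {}
    for context, label in pairs:
        per_context.setdefault(context, Counter())[label] += 1

    n_total = len(pairs) + smoothing * len(labels)
    weights: Dict[str, Dict[str, float]] = {}
    for context, counts in per_context.items():
        n_ctx = sum(counts.values()) + smoothing * len(labels)
        weights[context] = {
            label: ((counts[label] + smoothing) / n_ctx)
            / ((overall[label] + smoothing) / n_total)
            for label in labels
        }
    return weights


# Twenty made-up annotations: no audio is needed for this step.
annotations = (
    [("cycling", "car_horn")] * 6
    + [("cycling", "bicycle_bell")] * 4
    + [("at_desk", "door_knock")] * 8
    + [("at_desk", "car_horn")] * 2
)
weights = estimate_context_weights(
    annotations, ["door_knock", "car_horn", "bicycle_bell"]
)

# A weight below 1 means the sound is less likely in that context than overall.
print(weights["cycling"]["door_knock"] < 1.0)  # True
print(weights["at_desk"]["door_knock"] > 1.0)  # True
```

Adding a new property here means collecting a small table of counts, rather than a new audio dataset covering every combination of conditions.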

As illustrated above, a Post-Biasing module within our ai3™ and ai3-nano™ inference engines can consider additional information such as the time of day, or the output of another ML system that takes its input from an accelerometer.

Using this additional information, the post-biasing module acts as a translator to make the system aware that it is not recognizing the sounds in a ‘generic’ condition, but instead in a smaller specific set of conditions. With this awareness, the initial output of the acoustic analysis can be updated to decide when to alert the user, change device settings or tag the correct sound.
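
The sketch below shows how such a module could, in principle, combine the acoustic scores with the uncertain output of a separate activity classifier before deciding whether to alert the user. It is a hedged illustration rather than the ai3™ implementation: `decide_alert`, the `PLAUSIBILITY` table and `ALERT_THRESHOLD` are hypothetical names and values. Each sound's weight is the plausibility averaged over the activity classifier's probabilities, so a confident "cycling" estimate suppresses a door knock more strongly than an unsure one.

```python
from typing import Dict, Optional

# Hypothetical plausibility of each sound per activity (made-up values).
PLAUSIBILITY: Dict[str, Dict[str, float]] = {
    "stationary": {"door_knock": 1.0, "car_horn": 0.3},
    "cycling": {"door_knock": 0.05, "car_horn": 1.0},
}
ALERT_THRESHOLD = 0.5  # assumed value, for illustration only


def decide_alert(
    sound_scores: Dict[str, float],
    activity_probs: Dict[str, float],
) -> Optional[str]:
    """Return the sound to alert on, or None if nothing passes the threshold."""
    biased = {}
    for label, score in sound_scores.items():
        # Expected plausibility of this sound under the activity estimate.
        weight = sum(
            p * PLAUSIBILITY.get(activity, {}).get(label, 1.0)
            for activity, p in activity_probs.items()
        )
        biased[label] = score * weight
    best = max(biased, key=biased.get)
    return best if biased[best] >= ALERT_THRESHOLD else None


# Assumed core-model output: the audio sounds a lot like a door knock.
scores = {"door_knock": 0.7, "car_horn": 0.2}

print(decide_alert(scores, {"stationary": 0.9, "cycling": 0.1}))    # door_knock
print(decide_alert(scores, {"stationary": 0.05, "cycling": 0.95}))  # None
```

The acoustic analysis is identical in both calls; only the context changes the decision, and so the false alert while cycling is avoided.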

Post-Biasing is an exciting technology that enables us to separate the solutions to core sound recognition problems from the individual challenges presented by various applications. This is key in an era with increasing use of sound recognition for applications such as entertainment, security, accessibility, smarter mobility and smart homes.

The adaptability achieved through Post-Biasing enables us to develop core technology that works across an increasing number of applications without sacrificing quality or performance. A lot like my imaginary red (or yellow) Ferrari.
