We are often asked if we accelerate using Neon on Arm. Our answer may seem odd: for us, optimizing for Neon is a rather academic exercise.

Our lightweight code performs fixed-point calculations, which balances the trade-offs between space, time and value range while ensuring cross-platform consistency. The main driver for this is that our primary targets are Cortex-M class processors operating at ~50 MHz with a few hundred kilobytes of RAM. Optimizing separately for a Cortex-A class processor, which normally operates above 300 MHz, often has 1 GB of RAM or more and is typically superscalar, is somewhat redundant. A 10% improvement obtained on a Cortex-M4, for example, will carry over to a Neon-class processor, and it means we need only maintain a single code base with simpler test requirements.

Choosing fixed-point code gives us maximal cross-platform consistency and better control over memory usage, but it comes at a small price: simple arithmetic operations are made more complex by the need to shift according to the required dynamic range of the calculation.
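To illustrate the kind of shifting involved, here is a minimal sketch of a fixed-point multiply in C. It is not taken from our code base: the function name fx_mul and the choice of rounding and saturation are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (illustration only): multiply two fixed-point values
 * that both carry frac_bits fractional bits, returning the same format. */
static int32_t fx_mul(int32_t a, int32_t b, unsigned frac_bits)
{
    assert(frac_bits >= 1 && frac_bits <= 31);

    /* The raw product carries 2 * frac_bits fractional bits, so it needs
     * a wider intermediate type. */
    int64_t wide = (int64_t)a * (int64_t)b;

    /* Add half of one output LSB so the shift rounds instead of truncating. */
    wide += (int64_t)1 << (frac_bits - 1);

    /* Shift back to the original format. This assumes an arithmetic right
     * shift for negative values, which is what mainstream compilers do. */
    wide >>= frac_bits;

    /* Saturate rather than wrap if the result no longer fits in 32 bits. */
    if (wide > INT32_MAX) wide = INT32_MAX;
    if (wide < INT32_MIN) wide = INT32_MIN;
    return (int32_t)wide;
}
```

With 15 fractional bits, fx_mul(16384, 16384, 15) multiplies 0.5 by 0.5 and returns 8192, which is 0.25. In floating point the same operation is a single multiply, which is the small price mentioned above.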

With Helium, Arm’s recently announced M-Profile Vector Extension (MVE) for the Armv8.1-M architecture, the game changes. If optimizing for Neon is largely an academic exercise, exploiting MVE is far from it. With Helium, next-generation Cortex-M processors gain the vectorising capabilities of Neon-class processors as well as a few other abilities that deliver tangible improvements in both performance and cost in the microcontroller space.

Helium builds upon Neon’s capabilities. Optimising a few selected routines can quickly yield a 50% or greater improvement in execution speed, but these vector extensions offer other advantages too.

Helium offers another valuable trade-off with regard to memory usage: half-precision floating-point support. Our code sees a 50% speed-up by performing operations in batches of four. Half-precision would widen those batches to eight and avoid some of the extra shifting that fixed-point values require, while keeping a more compact data representation at an acceptable cost in accuracy.

Helium also helps with the hassle of dealing with leftovers: when the length of a vector is not an exact multiple of the number of available lanes, the remaining elements need special processing steps. There are several techniques for dealing with this, but Helium offers streamlined loops that help resolve the problem. This is not necessarily a huge performance gain, but it delivers simpler, smaller code.
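The difference between the two styles can be sketched in portable C. This is an emulation for illustration only, not Helium code: the inner lane loops stand in for a single vector operation, and the names LANES, scale_with_tail and scale_predicated are hypothetical. The first function uses the classic approach of full batches followed by a scalar clean-up loop; the second mimics a predicated loop, where one loop body handles every batch and simply switches off the unused lanes in the last one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4  /* a 128-bit vector holds four 32-bit elements */

/* Classic approach: full batches of LANES, then a separate scalar tail. */
static void scale_with_tail(int32_t *dst, const int32_t *src, size_t n, int32_t k)
{
    assert(n == 0 || (dst != NULL && src != NULL));
    size_t i = 0;
    for (; i + LANES <= n; i += LANES) {
        /* Stands in for one vector multiply over LANES elements. */
        for (size_t lane = 0; lane < LANES; ++lane)
            dst[i + lane] = src[i + lane] * k;
    }
    for (; i < n; ++i)  /* leftovers need their own loop */
        dst[i] = src[i] * k;
}

/* Predicated approach: one loop, inactive lanes are masked off. */
static void scale_predicated(int32_t *dst, const int32_t *src, size_t n, int32_t k)
{
    assert(n == 0 || (dst != NULL && src != NULL));
    for (size_t i = 0; i < n; i += LANES) {
        /* Number of lanes that still hold valid data in this batch. */
        size_t active = (n - i < LANES) ? (n - i) : LANES;
        for (size_t lane = 0; lane < LANES; ++lane) {
            if (lane < active)  /* emulated per-lane predicate */
                dst[i + lane] = src[i + lane] * k;
        }
    }
}
```

Both functions produce the same output for any length, including lengths such as 7 that leave three leftover elements. In this emulation the predicate is an explicit test per lane; the point of the sketch is only that the second version needs no separate clean-up loop, which is where the code size saving comes from.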

Optimizing for Helium makes the goal of true wireless battery-powered devices a reality. For example, when combined with the piezoelectric MEMS microphone from Vesper, which can boot the processor on reaching a suitable sound level, a Helium-enabled processor can then quickly analyse the observed sound, identifying whether it is a known sound before returning to sleep. It will be able to do this at a lower power level than comparable systems and could potentially eliminate the need for a DSP in the system, thus simplifying the architecture and reducing the BOM (Bill of Materials).

Helium matters. Edge-based machine learning, also known as tinyML, offers OEMs major benefits, such as reducing the cost of cloud computing. It also appeals to consumers, as processing is done without data leaving the device. Arm’s Helium technology means that you can, as my colleague Chris puts it, ‘supercharge AI at even lower-power and lower-cost’.


About Audio Analytic

Audio Analytic is the pioneer of AI sound recognition software. The company is on a mission to map the world of sounds, offering our sense of hearing to consumer technology. By transferring our sense of hearing to consumer products and digital personal assistants we give them the ability to react to the world around us, helping satisfy our entertainment, safety, security, wellbeing and communication needs.

Audio Analytic’s ai3™ sound recognition software enables device manufacturers and chip companies to equip products with Artificial Audio Intelligence, recognizing and automatically responding to our growing list of sound profiles.