February 12, 2020
tinyML: The quest for compactness
At today’s tinyML Summit in San Francisco, I will stand on stage and demonstrate tinyML sound recognition in action in front of nearly 400 delegates.
I’ll show that we can build sound recognition software that is highly accurate and dependable while also working on the most constrained of endpoints – in this case, the Cortex-M0+. For me, this is what makes tinyML such an exciting area of research and innovation.
Artificial intelligence promises a lot of benefits for consumers. It isn’t hard to imagine the impact of machines that can react to contextual information or anticipate a need at the moment it arises. In some cases, this intelligence is spread across the network and the device at the edge is merely the interaction point. But in many cases, like sound recognition, it can be done on the device, saving the costs of cloud processing and offering consumers reassuring levels of privacy.
The ability to put more on a chip is not purely down to the chip technology. Obviously, more powerful and capable silicon is a major help: Arm’s introduction of the Cortex-M55 and Ethos-U55 earlier this week is a sign of great progress. But this is just part of the challenge.
The software has to be compact and accurate. There is no point making something small if you have to accept poor-quality performance.
You need great embedded software engineers who know how to get the best out of the code and the hardware, but that isn’t the only answer. You cannot simply assume that some form of quantization will compress the model to the level you need.
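To see why quantization alone is not a free lunch, here is a minimal sketch in plain NumPy (the weights are randomly generated stand-ins, not a real model): symmetric int8 quantization shrinks a layer’s storage fourfold, but every weight now carries a rounding error bounded by half the quantization step. Whether that error is tolerable depends entirely on the model and the task, which is why compression has to be validated, not assumed.

```python
import numpy as np

# Hypothetical layer weights standing in for a trained model.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.4, size=(64, 64)).astype(np.float32)

# Symmetric linear quantization: map the float range onto int8 [-127, 127].
scale = np.abs(weights).max() / 127.0
q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)

# Dequantize to measure the error the compression introduced.
dequant = q.astype(np.float32) * scale

print(f"storage: {weights.nbytes} B -> {q.nbytes} B")          # 4x smaller
print(f"max abs weight error: {np.abs(weights - dequant).max():.5f}")
```

The storage win is guaranteed by construction; the accuracy cost is not, and it compounds across layers. That gap is what the rest of the pipeline has to manage.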
This is where the pipeline comes in, and why compact ML is a challenge not just for the embedded software department but for the whole organisation. The quest for compactness drives everything: from collecting and labelling the right data for the challenge; to an evaluation framework that identifies the best-performing models, grounded in end-user experience rather than abstract or ill-fitting metrics; to tinyML that does not materially compromise on performance.
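The evaluation point above can be made concrete. For an always-on detector, a metric a user actually feels is something like “false alerts per hour of background audio” at a given miss rate, rather than raw classification accuracy. The sketch below is purely illustrative (the function names, scores, and threshold are invented for this example, not from any Audio Analytic tooling):

```python
def false_accepts_per_hour(scores, labels, threshold, hours_of_audio):
    """Count background windows (label 0) scoring above threshold,
    normalised to an hourly rate a listener would actually notice."""
    fa = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return fa / hours_of_audio

def miss_rate(scores, labels, threshold):
    """Fraction of true events (label 1) the detector fails to fire on."""
    events = [s for s, y in zip(scores, labels) if y == 1]
    return sum(1 for s in events if s < threshold) / len(events)

# Toy detector scores for labelled audio windows (1 = target sound).
scores = [0.92, 0.15, 0.81, 0.72, 0.97, 0.05, 0.66, 0.40]
labels = [1,    0,    1,    0,    1,    0,    1,    0]

print(false_accepts_per_hour(scores, labels, 0.7, hours_of_audio=2.0))  # 0.5
print(miss_rate(scores, labels, 0.7))                                   # 0.25
```

Sweeping the threshold trades one number against the other, and a compressed model is only acceptable if it holds up on metrics like these, not just on aggregate accuracy.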
The end result is something compact and efficient, but the steps to get there are significant and sophisticated.