"It's not who I am underneath, but what I do that defines me."
-Batman
Over the last few years, as sensor and MCU prices plummeted and shipped volumes have gone through the roof, more and more companies have tried to take advantage by adding sensor-driven embedded AI to their products.
Automotive is leading the trend – the average non-autonomous vehicle now has 100 sensors, sending data to 30-50 microcontrollers that run about 1 million lines of code and generate 1TB of data per car per day. Luxury vehicles may have twice as many, and autonomous vehicles increase the sensor count even more dramatically.
But it's not just an automotive trend. Industrial equipment is becoming increasingly "smart" as makers of rotating, reciprocating and other types of equipment rush to add functionality for condition monitoring and predictive maintenance, and a slew of new consumer products, from toothbrushes to vacuum cleaners to fitness monitors, add instrumentation and "smarts".
Real-time, at the Edge, and a Reasonable Price Point
What these applications have in common is the need to use real-time, streaming, complex sensor data – accelerometer, vibration, sound, electrical, and biometric signals – to find signatures of specific events and conditions, or detect anomalies, and to do it locally on the device, with code that runs in firmware on a microcontroller that fits the product's price point.
When setting out to build a product with these kinds of sensor-driven smarts, there are three main challenges that need to be overcome:
1. Variation
Real-world data is noisy and full of variation – meaning that the things you're looking for may look different in different circumstances. You will face variation in your targets. (Want to detect sit-ups in a wearable device? The first thing you will hit is that people all do them slightly differently, with myriad variations.)
But you will also face variation in backgrounds (vibration sensors on industrial equipment will also pick up vibrations transmitted through the structure from nearby equipment). Background variation can sometimes be as important as target variation, so you'll want to collect both examples and counter-examples in as many backgrounds as possible.
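One lightweight way to keep that collection effort honest is to track which (label, background) combinations you have actually recorded and steer fieldwork toward the empty cells. The sketch below is purely illustrative – the labels, backgrounds, and helper name are hypothetical, not from any particular toolchain:

```python
from collections import Counter

# Hypothetical recording metadata: one (label, background) pair per captured
# clip. "target" clips contain the event of interest; "background" clips are
# counter-examples recorded with no event present.
recordings = [
    ("target", "gym"), ("target", "gym"), ("target", "home"),
    ("background", "gym"), ("background", "home"), ("background", "office"),
]

def coverage_gaps(recordings, labels, backgrounds):
    """Return (label, background) combinations with no recordings yet,
    so data collection can be steered toward them."""
    counts = Counter(recordings)
    return [(lab, bg) for lab in labels for bg in backgrounds
            if counts[(lab, bg)] == 0]

gaps = coverage_gaps(recordings,
                     labels=("target", "background"),
                     backgrounds=("gym", "home", "office"))
print(gaps)  # [('target', 'office')] -- no sit-up examples recorded at the office yet
```

The same bookkeeping scales to any metadata you care about (device placement, subject, machine load), and it is far cheaper to find a coverage hole before training than after a field failure.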
2. Real-time Detection in Firmware
The need to accomplish detections locally, fast enough to give a user a "real-time" experience or to trigger a time-sensitive control response in a machine, adds complexity to the problem.
3. Constraints – Physical, Power and Economic
With infinite computing power, lots of problems would be a lot easier. But real-world products have to deliver within a combination of form factor, weight, power consumption, and cost constraints.
Traditional Engineering vs. Machine Learning
Doing all of this simultaneously – overcoming variation to accomplish difficult detections in real time, at the edge, within the necessary constraints – is not at all easy. But with modern tools, including new options for machine learning on signals (like Reality AI), it is becoming easier.
Certainly, traditional engineering models constructed with tools like Matlab are a viable option for creating locally embeddable detection code. Matlab has a very powerful signal processing toolbox which, in the hands of an engineer who really knows what they are doing, can be used to create highly sophisticated, yet computationally compact, models for detection.
Why Use Machine Learning?
But machine learning is increasingly a tool of choice. Why?
For starters, the more sophisticated machine learning tools that are optimized for signal problems and embedded deployment can cut months, or even years, from an R&D cycle. They can get to answers quickly, generating embeddable code fast, allowing product developers to focus on their functionality rather than on the mathematics of detection.
But more importantly, they can often accomplish detections that elude traditional engineering models. They do this by making much more efficient and effective use of data to overcome variation. Where traditional engineering approaches will typically be based on a physical model, using data to estimate parameters, machine learning approaches can learn independently of those models. They learn how to detect signatures directly from the raw data and use the mechanics of machine learning (mathematics) to separate targets from non-targets without falling back on physics.
Different Approaches for Different Problems
It is also important to know that there are several different approaches to machine learning for this kind of complex data. The one getting most of the press is "Deep Learning", a machine learning method that uses layers of convolutional and/or recurrent neural networks to learn how to predict accurately from large amounts of data.
Deep Learning has been very successful in many use cases, but it also has drawbacks – in particular, that it requires very large data sets on which to train, and that for deployment it typically requires specialized hardware ($$$).
Other approaches, like the one we take, may be more appropriate if your deployment faces cost, size or power constraints.
Three Things to Keep in Mind when Building Products with Embedded AI
If you're thinking of using machine learning for embedded product development, there are three things you should understand:
- Use rich data, not poor data. The best machine learning approaches work best with information-rich data. Make sure you are capturing what you need.
- It's all about the features. Once you have good data, the features you choose to employ as inputs to the machine learning model will be far more important than which algorithm you use.
- Be prepared for compromises and tradeoffs. The sample rate at which you collect data and the size of the decision window will drive much of the requirements for memory and clock speed on the controller you select. But they will also affect detection accuracy. Be sure to experiment with the relationship between accuracy, sample rate, window size, and computational intensity. The best machine learning tools in the market will make it easy for you to do this.
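The third tradeoff lends itself to back-of-envelope arithmetic before you ever pick a part. The sketch below shows one such sizing calculation for the RAM needed just to buffer a decision window of raw samples; the sample width, rates, and channel counts are illustrative assumptions, not figures from any particular design:

```python
BYTES_PER_SAMPLE = 2  # e.g. a 16-bit ADC reading (an assumption)

def window_bytes(sample_rate_hz, window_s, channels=1):
    """RAM required to hold one decision window of raw samples."""
    return int(sample_rate_hz * window_s * channels * BYTES_PER_SAMPLE)

# A 3-axis accelerometer at 1 kHz with a 0.5 s decision window:
print(window_bytes(1000, 0.5, channels=3))  # 3000 bytes
# Doubling the sample rate doubles the buffer -- and raises compute cost too:
print(window_bytes(2000, 0.5, channels=3))  # 6000 bytes
```

On a microcontroller with 16-32 KB of SRAM, the raw buffer alone can dominate the budget, which is exactly why sample rate and window size deserve the same experimental attention as detection accuracy.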