Machine learning is the primary methodology for delivering AI applications. In previous articles, I discussed the main reasons behind moving machine learning to the network edge. These include the need for real-time performance, security considerations, and a lack of connectivity. However, ML models typically require powerful hardware to run efficiently. Fortunately, in recent years, MCUs have become ever more powerful, alongside the release of new frameworks enabling edge ML. TensorFlow Lite, in particular, has gained significant traction.
The last article discussed choosing a suitable platform for your edge ML project. It also explored some of the limitations imposed by TensorFlow Lite. In this article, we look at a practical example of running a TensorFlow Lite model on an NXP i.MX RT1050 EVK. Specifically, I’ll show how you can perform gesture recognition using the board’s built-in accelerometer and a simple neural network.
Gesture recognition
Gesture recognition is a growing field in human-computer interaction (HCI) that uses a combination of sensors and algorithms to identify specific human gestures. It enables many HCI applications, ranging from controlling your car stereo to interactive gaming. It can even identify individuals based on the different ways they hold and use a device.
Applications of gesture recognition
One of the earliest well-known applications of gesture recognition was the Nintendo Wii. The Wii’s hand-held controllers contained accelerometers, enabling players to interact with various virtual sports games. It could only detect large gestures, but that was ideal for playing virtual sports like tennis or golf.
More recently, we have seen a rush to install gesture recognition as a safety feature in cars. Many motoring accidents are caused by the driver being distracted, one of the most frequent distractions being the audio system. Gesture recognition allows the driver to control the system without taking their eyes off the road: waving in a specific pattern to skip a track, increase volume, or change channel.
Techniques for gesture recognition
There are three broad forms of gesture recognition. The first relies on detecting movement in a sensor’s field of view. The second uses accelerometers within a device being held or worn by the user. The third uses computer vision to process the input from a camera.
IR sensors
IR movement detection has existed for decades and is still widely used in burglar alarm systems, for example. However, this sort of movement detection is binary: either it detects movement or it doesn’t, which is not sufficient to enable gesture recognition. Instead, a time-of-flight sensor is required. These send out pulses of IR energy and measure the time taken for the reflected pulse to return, allowing them to track movement and identify simple gestures.
Accelerometers
Our world is full of accelerometers: in mobile phones, in video game controllers, even in TV remote controls. Modern motion sensors are typically 6-axis devices, often combining an accelerometer with other sensors to provide data on movement along the X, Y, and Z axes, together with pitch, yaw, and roll. All of this data can be used to determine gestures, as sketched below.
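To make this concrete, here is a minimal sketch (not taken from the Magic Wand code) of how raw accelerometer readings might be buffered into a fixed-length window before being handed to a gesture classifier. The 128-sample window length and the classify_gesture() placeholder are illustrative assumptions.

```python
from collections import deque

# Illustrative assumption: the classifier expects a fixed-length window of samples.
WINDOW_LENGTH = 128  # number of most recent accelerometer readings to keep

# Ring buffer holding the latest (x, y, z) acceleration samples
window = deque(maxlen=WINDOW_LENGTH)

def on_sample(x: float, y: float, z: float) -> None:
    """Called once per accelerometer reading."""
    window.append((x, y, z))
    if len(window) == WINDOW_LENGTH:
        classify_gesture(list(window))

def classify_gesture(samples) -> None:
    # Placeholder: a real implementation would feed the window to a trained
    # model, such as the Magic Wand network used later in this article.
    pass
```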
Computer vision
The most powerful approach for gesture recognition involves the application of computer vision. Here, you train a set of ML models to identify a hand in a video, track its movement, and recognise gestures. This approach has two key advantages over the other techniques: it doesn’t require the user to be holding a physical object (as is the case with accelerometers), and it can work at much longer ranges than systems based on time-of-flight sensors.
Time of flight cameras
One of the newest approaches for gesture recognition takes advantage of time-of-flight (ToF) cameras. These devices combine a camera with the ability to measure the distance to an object. They provide the ability to create 3D models of the environment without the need for expensive LIDAR scanning technology. As a result, they are perfect for recognising complex gestures at a distance.
Machine learning for gesture recognition
Machine learning is one of the most flexible techniques for enabling gesture recognition. It allows the identification of complex gestures yet is agnostic to the person making the gesture. ML models can take inputs from multiple different sensors and can cope with noisy data, outperforming the purely algorithmic approach taken by systems like the Nintendo Wii or in-car audio controls. So, how do you create a machine learning model for gesture recognition and port it to your edge device?
A practical example of edge ML
The previous article explained the steps needed to create a TensorFlow Lite model and port it to a device. Now let’s see what this looks like in practice using a simple gesture recognition model.
This example uses the NXP i.MX RT1050 EVK, one of the most powerful MCU boards in its class. It is based on an Arm Cortex-M7 core running at up to 600MHz. The board specification includes:
- Processor: MIMXRT1052DVL6A processor clocked at up to 600MHz
- Memory:
- 256 Mb SDRAM memory
- 512 Mb Hyper Flash
- 64 Mb QSPI Flash
- Footprint for QSPI Flash
- TF socket for SD card
- Display and Audio:
- Parallel LCD connector
- Camera connector
- Audio codec
- 4-pole audio headphone jack
- External speaker connection
- Microphone
- S/PDIF connector
- Connectivity:
- Micro USB host and OTG connectors
- Ethernet (10/100T) connector
- CAN transceivers
- Arduino® interface
- IoT sensors: FXOS8700CQ 3D Accelerometer and Magnetometer
This board is ideally suited for running sophisticated and powerful ML models at the edge. It is capable of handling large neural networks, making it suitable for image recognition tasks.
Selecting a model
You could choose to create a model from scratch. However, this is only worthwhile if you need to use a non-standard sensor or identify unusual gestures. Typically, it’s easier to find an existing model and retrain it. An excellent starting point to find existing models is Model Zoo – a vast collection of ML models you can use for free. However, in this case, we want to find a model that converts accelerometer readings into gestures. Currently, there is no suitable model in Model Zoo. Instead, we will use one of the sample models provided for TensorFlow Lite. This example is called Magic Wand and is trained to recognise the three gestures shown below.
How to build your edge ML model
The first step is to set up your development environment. I recommend using NXP’s MCUXpresso, which is based on Eclipse. You will also need to build and download a suitable SDK for your development board. This requires you to create an account on the NXP website.
When you are creating your SDK, make sure you remember to select the correct board and include the eIQ, ISSDK, and sdmmc modules. These respectively provide native support for neural networks, drivers for reading the on-board accelerometer, and SD card IO.
Now install your SDK into MCUXpresso and create a new project called Magic Wand. Use git to clone the Magic Wand repository to the correct project folder on your local system.
Converting the ML model to TensorFlow Lite
The current model is written for TensorFlow. Using the TensorFlow Lite Converter, convert it to TensorFlow Lite FlatBuffer format – an efficient flat structure for storing hierarchical data.
In brief, the steps needed are:
- Using the Python API, convert the model to a TensorFlow Lite FlatBuffer. You need to import TensorFlow and use the tf.lite.TFLiteConverter.from_saved_model() API, as shown in the sketch after this list.
- Convert the FlatBuffer to a C byte array. The easiest way to do this manually is with the Linux xxd command.
- Integrate the C++ Library. You need to include the TensorFlow Lite C++ library in your build.
- Deploy the model. You can choose to flash the model into the evaluation board’s flash memory. Alternatively, you could load it onto an SD card.
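As a rough sketch, the first two steps might look like this, assuming the trained model has been exported as a SavedModel in a directory called saved_model_dir (the directory and output filenames are placeholders):

```python
import tensorflow as tf

# Step 1: convert the trained SavedModel to a TensorFlow Lite FlatBuffer.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional size optimisation
tflite_model = converter.convert()

with open("magic_wand.tflite", "wb") as f:
    f.write(tflite_model)

# Optional sanity check on the desktop before deploying to the board.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
print(interpreter.get_input_details()[0]["shape"])  # expected input shape

# Step 2 is then run from a Linux shell to produce the C byte array:
#   xxd -i magic_wand.tflite > magic_wand_model_data.cc
```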
Building the model
When building your model, it is advantageous to set your development board as the build target as this ensures:
- The IDE handles the process of converting the TensorFlow model to TensorFlow Lite and packaging it as a FlatBuffer.
- The IDE calls the correct SDKs for the board. This automatically includes the eIQ SDK, together with a large selection of machine learning resources and libraries, and ensures the TensorFlow Lite library is loaded.
- The IDE can communicate with the board’s debug port to read debug information.
Within MCUXpresso, click on the build icon to initiate all steps needed to convert the model and load it to the development board. As with all Eclipse-based IDEs, the build icon is a small hammer.
At this stage, you may encounter some problems if you didn’t include the correct SDKs in your download or forgot to install them in the IDE. If that is the case, go back to the SDK download page and make sure you select all the required elements. Download the SDK package again. Now, in MCUXpresso, go to Window > Show View > Installed SDKs, which opens a new view in the IDE. To install the SDK, drag and drop the zip file into this view.
Running the model
The Magic Wand model should now be loaded onto your evaluation board. As soon as the board is powered up, the model should start running and be ready to test. To test it, you need to connect to the board’s serial port using PuTTY or a similar terminal emulator.
Connecting to the serial port
Plug the board into your computer using a USB cable connected to the debug port. If you used MCUXpresso to build and deploy the model, the board should already be connected. Now use Device Manager (on Windows) to identify which COM port the board appears as and set up your terminal emulator with the following settings:
- Port: Find the correct port for the USB cable connected to the board
- Baud Rate: 115200 bits per second
- Data Bits: 8
- Stop Bits: 1
- Parity: None
- Flow Control: None
If all is well, you should see the words Magic starts! appear on the screen.
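If you would rather script the connection than use PuTTY, a short sketch using the pyserial package can read the same output with the settings listed above. Note that pyserial and the COM3 port name are assumptions for illustration; use whatever port Device Manager reports.

```python
import serial  # pip install pyserial

PORT = "COM3"  # placeholder: use the COM port shown in Device Manager

# Open the debug serial port with the settings from the list above (115200 8N1)
with serial.Serial(PORT, baudrate=115200, bytesize=serial.EIGHTBITS,
                   parity=serial.PARITY_NONE, stopbits=serial.STOPBITS_ONE,
                   timeout=1) as board:
    while True:
        line = board.readline().decode("utf-8", errors="ignore").strip()
        if line:
            print(line)  # e.g. "Magic starts!" followed by detected gestures
```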
Testing gesture recognition
Pick up the board and draw a letter W in the air. You should see the following appearing on the terminal emulator:
Now try drawing a circle in the air. This time you should see:
Finally, try drawing an angle in the air:
Taking things a step further
Clearly, it’s impractical to wave a large development board about when using the Magic Wand code. Fortunately, you can buy much smaller devices capable of running this code. Then, with a little ingenuity and maybe a 3D printer, it’s possible to create a real wand. Alternatively, companies such as Kano produce suitable hardware designed to teach children how to code. Since the Kano wand includes an accelerometer and is Bluetooth enabled, it is easy to see how you could use it with the NXP MCU board by adding an Arduino Bluetooth shield.
Conclusions and what next
Edge ML is an essential tool for enabling gesture recognition. Modern MCUs are powerful enough to run complex ML algorithms in real time, enabling gesture recognition systems based on accelerometers, time-of-flight sensors, and even computer vision. The next article will show how edge ML can be used for person detection and explain how this can be extended to full facial recognition.