Think of a production chain in a large industrial plant: it is easy to imagine how many machines must work together to sustain high productivity. In such contexts, a single faulty mechanism can have a huge impact on the whole chain, causing significant economic damage.
This is why troubleshooting is crucial in these contexts. In most cases, techniques such as predictive maintenance can reduce the risk of unexpected malfunctions.
Moreover, even when a failure is detected, well-trained on-site operators can often identify the cause of the malfunction and fix it in a timely manner. However, it is not unusual for the investigation to demand greater expertise, and even an experienced operator may not be able to solve the issue alone.
This is why Capgemini developed Andy 3D, a technology that enables remote troubleshooting. Andy 3D allows on-site operators to cooperate with a remote team of experts to identify and solve possible issues faster and more effectively.
Andy 3D combines several technologies, such as augmented reality (AR), artificial intelligence (AI), and machine learning (ML), together with 5G, to provide both the on-site operators and the remote team with all the tools required to analyze and solve malfunctions.
This article provides an overview of the technologies used within Andy 3D, discussing how they work and why combining them is a winning decision.
AR and AI
Andy 3D uses Microsoft HoloLens, a pair of smart glasses that allows users to overlay visual information on top of the real view, thanks to the transparent display integrated within the lenses. HoloLens also includes a camera that can record (and stream) the user’s view.
The use of Microsoft HoloLens (and other similar devices, such as the Vuzix M100 or the pioneering, although now discontinued, Google Glass) enables the implementation of AR applications, combining the user’s view with additional overlaid information.
However, simply displaying information is not enough: the overlay must be rendered according to what the user is actually looking at. Consider, for example, an operator wearing smart glasses while inspecting a faulty car engine.
It might be useful to highlight certain areas of the engine to focus the operator’s attention. However, unless the system understands what is present in the user’s view, such a highlight cannot be placed correctly.
Object Detection and Image Labelling
This is where AI comes into play. To properly integrate the overlaid information with the user’s real view, the latter must be continuously processed and analyzed, so that relevant objects can be identified and highlighted.
Object detection can be implemented in many ways, but nowadays it is mostly achieved with machine learning (typically deep learning) models.
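To make this concrete, here is a minimal sketch of frame-level object detection using a pretrained Faster R-CNN from torchvision. It is purely illustrative: the actual models used within Andy 3D are not public, and a production system would use a model trained on domain-specific images, as discussed below. The file name `frame.jpg` is a hypothetical frame captured from the headset camera.

```python
import torch
from torchvision.io import read_image
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn,
    FasterRCNN_ResNet50_FPN_Weights,
)

# Load a generic pretrained detector (a stand-in for a domain-specific model).
weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn(weights=weights).eval()

# "frame.jpg" stands for a single frame from the camera stream.
img = read_image("frame.jpg")
batch = [weights.transforms()(img)]

with torch.no_grad():
    pred = model(batch)[0]

# Keep only confident detections; each has a bounding box, a label and a score.
for box, label, score in zip(pred["boxes"], pred["labels"], pred["scores"]):
    if score > 0.8:
        print(weights.meta["categories"][label], box.tolist(), float(score))
```

In an AR setting, the returned bounding boxes are what the renderer would use to anchor highlights onto the user’s view.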
Generally speaking, machine learning models are trained on a set of images taken from a specific context. These images need to be labelled manually by domain experts.
The labelling process is usually conceptually simple, but it tends to be long and repetitive, because of the high number of images required to achieve good recognition performance.
Labelling can be done at various levels of precision, depending on what the final model is intended to do. For instance, a model that implements image segmentation requires the images to be labelled in terms of regions of interest (ROIs), each of which can have one or more labels.
On the other hand, if the model is an image classifier, the label can be associated with the whole image, which usually translates into a simpler labelling process.
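For illustration, here are two hypothetical annotation records showing the difference between the two levels. The field names are invented for this sketch; real labelling tools each define their own formats.

```python
# Segmentation / detection labelling: each region of interest gets its own labels.
segmentation_annotation = {
    "image": "engine_0042.jpg",
    "regions": [
        {"polygon": [[120, 80], [240, 80], [240, 200], [120, 200]],
         "labels": ["alternator", "belt_worn"]},
        {"polygon": [[300, 150], [420, 150], [420, 260], [300, 260]],
         "labels": ["coolant_hose"]},
    ],
}

# Classification labelling: a single label for the whole image.
classification_annotation = {
    "image": "engine_0042.jpg",
    "label": "belt_worn",
}
```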
Returning to the example given above, for an operator looking at car engines, the images to be labelled would be sets of car engine views, with domain experts manually labelling the various areas. The labelled images are then fed into a training process to build an image segmentation model that acts as an object detector.
Theoretically, the whole training process can run on any computer equipped with one or more GPUs to handle the massive computations involved. In practice, however, many cloud services make this far more convenient. Capgemini chose Microsoft Azure AI to train the deep learning models used within Andy 3D.
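As a rough sketch of what such a cloud training flow can look like, here is an example using the Azure Custom Vision Python SDK, one of the Azure AI services that can train image models. The endpoint, key, project name, and tag are placeholders; the actual Andy 3D training pipeline is not public.

```python
import time

from azure.cognitiveservices.vision.customvision.training import (
    CustomVisionTrainingClient,
)
from msrest.authentication import ApiKeyCredentials

# Placeholder endpoint and key: replace with real Azure resource values.
credentials = ApiKeyCredentials(in_headers={"Training-key": "<training-key>"})
trainer = CustomVisionTrainingClient("<endpoint>", credentials)

# Hypothetical project and tag names, purely for illustration.
project = trainer.create_project("engine-fault-detector")
belt_tag = trainer.create_tag(project.id, "belt_worn")

# Upload one labelled image; a real dataset would contain thousands.
with open("engine_0042.jpg", "rb") as f:
    trainer.create_images_from_data(project.id, f.read(), tag_ids=[belt_tag.id])

# Kick off training and poll until the iteration completes.
iteration = trainer.train_project(project.id)
while iteration.status != "Completed":
    time.sleep(10)
    iteration = trainer.get_iteration(project.id, iteration.id)
print("Model trained:", iteration.id)
```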
Digital Twins for Remote Troubleshooting
Another interesting paradigm used within Andy 3D is the digital twin. Microsoft Azure provides services (such as Azure Digital Twins) that allow a digital replica of a physical machine to be defined, built, and stored directly in the cloud.
With such a digital replica, while the on-site operator inspects the machine locally, a remote team of experts can investigate the problem by merging two information streams:
- the operator’s view, augmented with the overlaid information produced by machine learning models
- the machine’s digital replica status, as a form of comparison with what is happening in the actual machine, according to monitoring data and operator inputs
In other words, the digital twin paradigm makes a richer set of information available to the experts, so that they can investigate the problem and find a solution more easily and quickly.
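As a sketch of how a remote expert’s tooling might query a twin and compare it with live telemetry, here is an example using the Azure Digital Twins Python SDK. The instance URL, twin ID, property names, and tolerance are all hypothetical; the real Andy 3D twin model is not public.

```python
from azure.identity import DefaultAzureCredential
from azure.digitaltwins.core import DigitalTwinsClient

# Placeholder instance URL and twin ID.
client = DigitalTwinsClient(
    "https://<your-instance>.api.weu.digitaltwins.azure.net",
    DefaultAzureCredential(),
)
twin = client.get_digital_twin("press-line-03")

# Compare the replica's expected operating values with live sensor readings
# (here a made-up dict standing in for the plant's telemetry pipeline).
live_reading = {"temperature": 94.2, "vibration": 0.71}
for key, live_value in live_reading.items():
    expected = twin.get(key)
    if expected is not None and abs(live_value - expected) > 0.1 * abs(expected):
        print(f"{key}: live value {live_value} deviates from twin value {expected}")
```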
Use Case
Now that we have a general overview of all the technologies involved in Capgemini’s Andy 3D, let’s look at a use case to better understand how all the elements mentioned above are integrated.
When a malfunction is detected, an operator is asked to check the machine directly on site. The operator wears the HoloLens and connects to the remote team.
As soon as the smart glasses are put on, the device starts recognizing objects within the user’s view and their status (represented by the predicted labels). This is the first step of the process; the main goal is to understand which maintenance activities are needed.
The HoloLens camera is also used to stream the user’s view to a remote server, where two things happen in parallel (a minimal sketch of this fan-out follows the list):
- the live stream is sent to the remote team of experts, who can communicate with the operator while watching the video
- the video frames are processed by machine learning models on the Cloud (Microsoft Azure AI), where object detection is performed
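The fan-out described above can be sketched with a small asyncio pipeline. Both helper functions below are placeholders standing in for a real stream relay and a real cloud inference call; they are not part of Andy 3D’s actual codebase.

```python
import asyncio


async def relay_to_experts(frame: bytes) -> None:
    # Placeholder: a real system would push the frame to a live-stream relay.
    await asyncio.sleep(0.01)


async def detect_objects(frame: bytes) -> list:
    # Placeholder: a real system would call a cloud inference endpoint.
    await asyncio.sleep(0.03)
    return [{"label": "belt_worn", "box": [120, 80, 240, 200], "score": 0.91}]


async def handle_frame(frame: bytes) -> list:
    # Run the two branches concurrently so neither blocks the other.
    _, detections = await asyncio.gather(
        relay_to_experts(frame), detect_objects(frame)
    )
    return detections


if __name__ == "__main__":
    detections = asyncio.run(handle_frame(b"<jpeg bytes>"))
    print(detections)
```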
The results of object detection are integrated both into the HoloLens visualization and into the video shown to the remote team, implementing the augmented reality paradigm.
This approach also enables collaborative analysis between the remote team and the operator during maintenance activities. Beyond traditional voice-based communication, the remote team can annotate the user’s view to enrich the information available to the operator. This is useful, for instance, to indicate which part of the machine requires more attention.
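As an illustration of what such an annotation might look like and how it could be drawn onto a frame, here is a minimal OpenCV sketch. The annotation structure is hypothetical; Andy 3D’s actual annotation protocol is not public.

```python
import cv2

# Hypothetical annotation sent by a remote expert: a box plus a short note.
annotation = {"box": (300, 150, 420, 260), "note": "check this hose"}

frame = cv2.imread("frame.jpg")
x1, y1, x2, y2 = annotation["box"]
# Draw the highlight rectangle and the expert's note onto the frame.
cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 0, 255), 2)
cv2.putText(frame, annotation["note"], (x1, y1 - 8),
            cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
cv2.imwrite("frame_annotated.jpg", frame)
```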
It is also worth mentioning that the HoloLens visualization is enriched with data received from the cloud, including readings from the sensors installed on the machine, which gives a clearer view of the machine’s status with minimal latency.
The whole Andy 3D use case is also depicted and explained in an accompanying video.
The role of 5G
So far, we have not discussed how all this communication is managed. The amount of data to be transmitted is huge: audio for the conversation between the team and the operator, video streaming, annotations, and sensor data.
To achieve real-time performance over conventional networks, all of the above would have to be heavily compressed. However, this could degrade the system’s ability to evaluate the issues and, therefore, to find an effective solution. To overcome this, Andy 3D makes intensive use of 5G technology.
All the data sent to and from the HoloLens travels over this channel, which offers high bandwidth and very low latency without requiring heavy compression. In the best case, 5G can bring latency down to around 1 ms, roughly 20 times lower than typical 4G latency.
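A back-of-envelope calculation shows why bandwidth matters here: a single uncompressed 1080p, 30 fps RGB stream alone requires roughly 1.5 Gbit/s, far beyond what 4G can sustain but within reach of 5G. The figures below are generic video parameters, not Andy 3D specifications.

```python
# Raw (uncompressed) bandwidth of a 1080p, 30 fps, 24-bit RGB video stream.
width, height, fps, bytes_per_pixel = 1920, 1080, 30, 3
bits_per_second = width * height * fps * bytes_per_pixel * 8
print(f"{bits_per_second / 1e9:.2f} Gbit/s")  # ~1.49 Gbit/s
```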
The integration of 5G can also be introduced gradually and in a hybrid manner, supporting both new and legacy technologies and ensuring backward compatibility.
Conclusions
Capgemini’s Andy 3D offers an effective and innovative way to assist on-site operators and implement remote troubleshooting via AR and AI. Combining these paradigms is what makes the technology so powerful.
Although this technology is currently applied almost exclusively within industrial production chains (especially in smart manufacturing), it is not difficult to imagine extensions of the idea to other fields.