A digital twin is a virtual replica of a physical entity. In the Industrial Internet of Things (IIoT) realm, a digital twin is a dynamic virtual copy of a physical framework, a vehicle, a machine, or any other device. Most often, a digital twin is developed to carry out simulations before the actual device is manufactured, but nowadays digital twins are being used during operations too. The concept is taking the IoT industry by storm, and the market is expected to hit $16 billion by 2023. The healthcare and manufacturing industries, along with the automotive, utilities, and construction sectors, are using digital twins extensively.
The Architecture of Digital Twins
In the digital twin architecture, each physical component has a virtual counterpart, also known as a virtual mirror model. Each virtual mirror model provides capabilities for analysis, evaluation, prediction, and monitoring of its physical entity. All of this is facilitated by a combination of technologies such as Industrial IoT, Cloud computing, Robotics, Edge computing, Machine learning, AI, and Big Data analytics.
In other words, the digital twin architecture establishes a powerful mechanism for the physical and digital worlds to communicate through data. The physical and digital components stay synchronized, thereby forming a closed loop.
From a high-level perspective, a digital twin model comprises two elements – real-time data integration and real-time machine learning.
Real-time data integration
When it comes to live data integration, tech giants like IBM are leaders in real-time and batch data integration. Some of the most notable applications used for this purpose include IBM Streams, Apache Kafka, Apache Spark Structured Streaming, Apache Flink, and Node-RED. Apache Spark is especially versatile since it is capable of both batch processing and streaming of live data. As of version 2.3, its Structured Streaming engine processes live data in micro-batches and is considered a practical solution for real-time data integration.
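The micro-batching idea can be illustrated without a full Spark cluster. The sketch below is a minimal pure-Python stand-in (the `micro_batches` helper and the sensor stream are hypothetical, not part of any of the libraries named above): an unbounded stream of readings is grouped into small fixed-size batches, and each batch is then processed as a unit, which is essentially what a micro-batch trigger does.

```python
from itertools import islice

def micro_batches(stream, batch_size):
    """Group an unbounded iterator of readings into small fixed-size batches."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

# Hypothetical sensor readings arriving as a stream.
readings = (t * 0.5 for t in range(10))

# Process each micro-batch as a unit, e.g. compute a per-batch average.
averages = [sum(b) / len(b) for b in micro_batches(readings, 4)]
print(averages)  # → [0.75, 2.75, 4.25]
```

In a real deployment the batching, scheduling, and fault tolerance are handled by the streaming engine; the point here is only that "streaming" can be reduced to many small batch jobs.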
A discussion about real-time data integration remains incomplete without edge computing. This technology is essentially an extension of IoT, Big Data analytics, and Cloud computing. Data integration will not always involve central cloud storage; there is also a need to process data directly by distributing work across edge devices of all forms. This is why edge computing plays a crucial role. Besides, it also addresses three key concerns –
1. Network partitioning
Network connections tend to become less reliable the closer they are to the edge, especially in the case of disconnected edges. Processing data locally with edge computing can help mitigate this problem.
2. Network latency
Network latency plays a major role in Industrial IoT since sensor data tend to lose their value within the first few seconds. Routing every reading through a central system adds round-trip latency to each decision. To avoid this, data-driven decisions must be made faster, on the edge itself, using edge computing.
3. Data privacy
IoT sensors such as cameras and microphones facilitate data accumulation in an IoT network, but this also raises data privacy concerns. If data are processed directly on the edge, sensitive data remain within the desired realm. For instance, if elevator occupancy is first gauged from a video stream on the device, scheduling and floor allocation can be optimized to decrease overall waiting time and balance the elevator workload. Care must be taken, however, that the image or video stream from inside the elevator never leaves the device.
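The elevator example above can be sketched in a few lines. Everything here is hypothetical illustration (the toy frame format, the `count_occupants` stand-in for an on-device vision model, and the payload shape are all assumptions): the key point is that only the aggregate needed for scheduling leaves the edge device, never the raw frame.

```python
def count_occupants(frame):
    """Stand-in for an on-device vision model: in this toy frame
    representation, each occupant is marked with a 'P' character."""
    return sum(row.count("P") for row in frame)

def edge_process(frame):
    """Process the raw frame on the edge device and emit only the
    aggregate needed for scheduling; the frame itself never leaves."""
    return {"occupancy": count_occupants(frame)}

frame = ["..P..", ".P.P.", "....."]  # hypothetical camera frame
payload = edge_process(frame)
print(payload)  # only the count is sent upstream: {'occupancy': 3}
```

The design choice is simply where the boundary sits: the privacy-sensitive input is consumed inside `edge_process`, and the cloud side only ever sees the derived count.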
Real-time machine learning
Since digital twins capture and process data in real time, the physical systems must also be able to act on real-time data. This mandates the adoption of machine learning for physical entities. For instance, an anomaly sensor should generate an alert and stop the production line to prevent further damage. Or, after outputs are simulated on the digital twin for a set of parameters, the actual system should be updated with the acceptable parameter set.
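The anomaly-to-actuation path can be sketched minimally. This is an assumed illustration, not a real control-system API – the threshold value, the `on_reading` handler, and the `shutdown` callback are all hypothetical names: a live reading crossing a limit raises an alert and triggers the shutdown action.

```python
ANOMALY_THRESHOLD = 80.0  # assumed vibration limit; tuned per machine in practice

def on_reading(value, shutdown):
    """React to a live sensor value: alert and stop the line
    when the reading crosses the anomaly threshold."""
    if value > ANOMALY_THRESHOLD:
        shutdown()
        return "alert"
    return "ok"

events = []
status = [on_reading(v, shutdown=lambda: events.append("line_stopped"))
          for v in (42.0, 77.5, 93.1)]
print(status, events)  # ['ok', 'ok', 'alert'] ['line_stopped']
```

In a real system the shutdown callback would be a command sent back to the physical asset, closing the loop between twin and machine.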
Since machine learning models are generally trained on static data, there is a need to store real-time data in a place where it can be conveniently retrieved.
- One critical part of a machine learning implementation is the hyperparameter tuning phase, wherein outcomes are repeatedly tested with varying parameter sets to see which one drives the best results. Hyperparameter tuning is extremely difficult with data streams since the data cannot simply be stored and replayed: the stream keeps flowing and yields different outcomes, which makes tuning harder.
- When training a model in real time, the system's throughput must keep pace with the rate at which data arrives. If the system is slower than the data arrival rate, buffers will overflow and the system will discard important data.
- Since training on windows limits how much data can be considered at once, sequentially distant events cannot be captured by real-time machine learning alone. Ideally, data processing should therefore combine real-time streams with historical records to get optimal results.
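The last point – combining a window over the live stream with historical records – can be sketched as follows. The `WindowedMonitor` class and its drift score are hypothetical assumptions for illustration: a baseline is learned once from historical data, while a sliding window tracks recent behaviour, and the decision uses both.

```python
from collections import deque

class WindowedMonitor:
    """Combine a baseline learned from historical records with a
    sliding window over the live stream, so that both recent and
    long-past behaviour inform the decision."""
    def __init__(self, historical, window_size):
        self.baseline = sum(historical) / len(historical)
        self.window = deque(maxlen=window_size)  # old values fall out automatically

    def update(self, value):
        self.window.append(value)
        window_mean = sum(self.window) / len(self.window)
        # Drift score: how far the recent window sits from history.
        return window_mean - self.baseline

monitor = WindowedMonitor(historical=[10.0, 10.2, 9.8], window_size=3)
drift = [round(monitor.update(v), 2) for v in (10.1, 12.0, 14.0)]
print(drift)  # → [0.1, 1.05, 2.03]
```

A production version would persist the stream to storage for later retraining; the sketch only shows why the historical baseline is needed to make the windowed signal meaningful.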
Different kinds of digital twins
There are three types of digital twins –
- Part Twin – As the name suggests, a part twin is associated with a component of a larger system. For instance, consider a bearing in an energy plant. The bearing can have a digital twin during its operations that conveys information about its condition, such as Mean Time To Failure (MTTF) or Mean Time Between Failures (MTBF). This information can be predicted from current data and from data gathered during the build or design phase. During the operational phase, findings about the part can in turn be fed back into the design phase.
- Product Twin – A product twin is essentially a collection of part twins that reflects their mutual interactions. From a software viewpoint, a product twin is the same kind of artifact as a part twin, so part twins can be accessed by drilling down into a product twin. A fine example of a product twin is the generator module of an energy-production plant; it comprises several bearings, each with its own part twin.
- System Twin – A system twin can be considered one step higher than a product twin. In software, a system twin offers features similar to a product or part twin, but it provides a view of the entire system. Continuing with the power plant example – a system twin can reflect the current and historical state, and forecast the future state, of a power train within the energy plant, the entire power plant, or a panel of an energy grid.
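The part/product/system hierarchy above can be sketched as a simple composition of objects. The class names, the `mttf_hours` attribute, and the `drill_down` method are hypothetical illustrations, not a standard API: the point is only that a product twin aggregates part twins, a system twin aggregates product twins, and each level can be drilled into from the one above.

```python
class PartTwin:
    """Virtual mirror of a single component, e.g. a bearing."""
    def __init__(self, name, mttf_hours):
        self.name = name
        self.mttf_hours = mttf_hours  # assumed condition metric (MTTF)

class ProductTwin:
    """A product twin aggregates the part twins of its components."""
    def __init__(self, name, parts):
        self.name = name
        self.parts = {p.name: p for p in parts}

    def drill_down(self, part_name):
        return self.parts[part_name]

class SystemTwin:
    """One step higher: a view across whole products, e.g. a power train."""
    def __init__(self, name, products):
        self.name = name
        self.products = {p.name: p for p in products}

bearing = PartTwin("bearing-7", mttf_hours=12000)
generator = ProductTwin("generator", [bearing])
plant = SystemTwin("energy-plant", [generator])

# Drilling down from system to product to part:
print(plant.products["generator"].drill_down("bearing-7").mttf_hours)  # → 12000
```

Mirroring the text, the same drill-down works in both directions conceptually: operational findings recorded on the part twin can be read back at the product or system level and fed into the design phase.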
Today, the emerging concept of digital twins is benefiting from the abundance of data generated by machines. This is helpful since an abundance of information helps in realizing and adopting models enabled by deep learning. Digital twins in Industrial IoT act much like a new form of control centre, effectively combining the historical and current state of a system with predictions about its future state. Although still at a nascent stage, digital twins can unleash their true potential when both product development and design are considered.