ETL (Extract, Transform, Load) is a data preparation process in which data is extracted from homogeneous or heterogeneous sources, transformed into the format or structure required for querying and analysis, and finally loaded into the target database. About 80% of a typical analytics project is spent on ETL: outliers are deleted, faulty entries eliminated, timestamps aligned, metadata added, and the cleaned-up data formatted. If the data is already available in an OPC UA information model, ETL can be skipped, cutting up to 80% of the duration of an industrial analytics project. The analytics results are then written back into the OPC UA information model and thus made available for third-party use.
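The cleanup steps named above can be sketched in a few lines of Python. This is a minimal, generic illustration with hypothetical readings and thresholds, not any vendor's implementation:

```python
from datetime import datetime, timezone

# Hypothetical raw sensor readings: a value plus a local timestamp string.
raw = [
    {"ts": "2023-05-01T10:00:00+02:00", "value": 21.4},
    {"ts": "2023-05-01T10:00:01+02:00", "value": None},   # faulty entry
    {"ts": "2023-05-01T10:00:02+02:00", "value": 999.0},  # outlier
    {"ts": "2023-05-01T10:00:03+02:00", "value": 21.6},
]

def clean(records, lo=-50.0, hi=150.0):
    """Drop faulty entries and outliers, align timestamps to UTC, add metadata."""
    out = []
    for r in records:
        if r["value"] is None:              # eliminate faulty entries
            continue
        if not lo <= r["value"] <= hi:      # delete outliers
            continue
        ts = datetime.fromisoformat(r["ts"]).astimezone(timezone.utc)
        out.append({"ts": ts.isoformat(),   # timestamps aligned to UTC
                    "value": r["value"],
                    "unit": "degC"})        # metadata added
    return out

cleaned = clean(raw)
```

With an OPC UA information model, typed variables, engineering units and consistent timestamps make exactly this kind of boilerplate unnecessary.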
The Industrial Data Intelligence group is an interdisciplinary team at Softing Industrial that deals exclusively with data-based production optimization. We support users in eliminating unresolved disruptions, reducing scrap, and optimizing plant performance, and thus in production-related revenue and cost optimization. Our dataTHINK industrial analytics solution processes production data in real time on the production line and thus provides streamed production insight on site. Many decision makers are uneasy when asked to move their production data into the cloud. With an "edge" solution close to, or part of, field devices and machines, data stays on the production line and is processed on the spot. Additional security precautions are unnecessary.
When we started our Industrial Data Intelligence initiative back in 2015, we concentrated our production optimization activities on brownfield environments. We therefore considered building a data gathering solution specifically for Profibus, the communication protocol with the largest installed base in factories in Germany and Europe. To our surprise, our first projects involved OPC UA, which has since been chosen as the Industrie 4.0 communication standard but was at the time a relatively new protocol compared to, for example, Profibus. We believe the reason is that our first customers were trendsetters and had started working with OPC UA several years earlier.
As our company is a major provider of OPC UA technology, with detailed OPC UA knowledge as well as software and engineering capabilities, we built our industrial analytics solution, so to speak, between OPC UA in and OPC UA out. First we acquire data from automation components and field devices such as PLCs, sensors, actuators and databases, as well as from additional sources such as production flow or weather data systems. Although we also accept data in any other format, we prefer an OPC UA information model where one is available: not only because it implicitly provides additional meta-information that can help with root cause analysis after the analytics model has spotted an anomaly, but mainly because the data is available in a structured, clean form; there is no need for ETL.
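To illustrate why a structured model with meta-information helps, here is a simplified, dictionary-based stand-in for an OPC UA address space. All node names and ranges are hypothetical; real access would go through an OPC UA client, not a plain dictionary:

```python
# Each variable carries its value together with meta-information
# (engineering unit, plausible range) and sits under the machine it
# belongs to -- structure that flat CSV exports lose.
model = {
    "Line1": {
        "Press01": {
            "Temperature": {"value": 72.5, "unit": "degC", "range": (0, 150)},
            "Pressure":    {"value": 5.1,  "unit": "bar",  "range": (0, 10)},
        },
        "Robot02": {
            "CycleTime":   {"value": 8.4,  "unit": "s",    "range": (0, 60)},
        },
    },
}

def browse(node, path=""):
    """Recursively yield every variable with its full browse path."""
    for name, child in node.items():
        full = f"{path}/{name}" if path else name
        if "value" in child:                 # leaf: a variable node
            yield full, child
        else:                                # object/folder node: descend
            yield from browse(child, full)

variables = dict(browse(model))
```

When an anomaly is later flagged on `Line1/Press01/Temperature`, the path and unit immediately tell the analyst which machine and physical quantity to investigate.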
After choosing the relevant variables in the OPC UA information model, we semi-automatically apply specific machine learning algorithms. Our dataTHINK analytics solution uses machine learning procedures such as anomaly detection and time series analysis to detect outliers and prevent unwanted situations. It is based on open-source software such as Python, Elastic/Kibana and OPC UA.

Data is the oil of the twenty-first century. More data has been generated in the last two years than in the entire previous history of mankind. In addition to land, capital and labor, data is becoming a production factor. Data facilitates cost reduction, additional revenue and new business models. Data-gathering companies (especially US ones such as Google, Apple, Amazon and Facebook) already dominate the global economy today.
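To give a feel for anomaly detection on a time series, here is a minimal z-score sketch. This is a generic textbook technique with made-up readings, not dataTHINK's actual algorithm:

```python
from statistics import mean, stdev

def detect_anomalies(series, threshold=2.0):
    """Flag indices whose z-score (distance from the mean in
    standard deviations) exceeds the threshold."""
    mu, sigma = mean(series), stdev(series)
    return [i for i, x in enumerate(series)
            if abs(x - mu) / sigma > threshold]

# Made-up temperature readings with one obvious outlier at index 5.
readings = [21.1, 21.3, 21.2, 21.4, 21.2, 35.0, 21.3, 21.1]
anomalies = detect_anomalies(readings)  # flags index 5
```

Production-grade procedures would additionally model trends and seasonality rather than assume a stationary mean, but the principle of scoring each point against expected behavior is the same.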
Machine learning, driven forward by Moore's law and continuously growing amounts of data, has started changing the world on a previously unforeseen scale. Without machine learning, major parts of our society would stand still. Machine learning algorithms, for example, decide within milliseconds whether a customer will receive money at a cash dispenser, recognize the faces of our friends on social media platforms, let us talk to our smartphones, and will soon make autonomous driving a reality.
The same algorithms can tell us how and where we can optimize our production, as long as we feed them relevant data. Yet machine learning is not new. As early as 1959, the American computer scientist and pioneer Arthur Samuel defined machine learning as a "field of study that gives computers the ability to learn without being explicitly programmed". Machine learning, like data mining, is built on top of statistics. While statistics is about reviewing information, about what has happened, and data mining about why something has happened, machine learning is about what will happen and how specific situations, such as anomalies, can be optimized or avoided. Industrie 4.0, thought through to its conclusion, may eventually result in an autonomous economy in which algorithms communicate with each other for the greater well-being of mankind. Before we get there, algorithms can help improve Overall Equipment Effectiveness (OEE), which represents machine availability and performance as well as the quality of the produced goods. The goal of improving OEE is not new. New is the data-based approach by means of machine learning algorithms. And newer still is the OPC UA based approach, saving up to 80% of your industrial analytics process.
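For reference, OEE is conventionally computed as the product of the three factors just named. The figures below are purely illustrative:

```python
def oee(availability, performance, quality):
    """OEE = Availability x Performance x Quality, each as a fraction 0..1."""
    return availability * performance * quality

# Illustrative shift: 90% uptime, 95% of ideal speed, 98% good parts.
score = oee(0.90, 0.95, 0.98)  # about 0.84, i.e. roughly 84% OEE
```

Because the factors multiply, a modest loss in each compounds quickly, which is why even small algorithm-driven improvements in availability, speed or quality are worthwhile.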
Peter Seeberg has been actively involved in marketing OPC UA around the globe and has been business development manager for Industrial Data Intelligence at Softing Industrial GmbH since 2015.