Anomaly Detection and Root Cause Analysis using TIBCO Analytics and Microsoft Cognitive Services

Published:
9:45am Jul 27, 2020

TIBCO recently announced that TIBCO Spotfire® and TIBCO® Data Science now support Microsoft Azure Cognitive Services. TIBCO's solutions extend these services to analyze multivariate anomalies and add root cause analysis, using input from the Key Phrase Extraction cognitive skill to render decisions and automate actions like anomaly detection. Using visual analytics and data science, we can analyze sensor and log data at the edge to detect anomalies within large volumes of data and alert case managers so that they can take preventive action.

Anomaly detection using equipment maintenance dashboard

The TIBCO anomaly detection solution includes a Microsoft Cognitive Services container deployment with anomaly detection, text mining, and root cause analysis. Anomaly detection and analysis provides value across nearly every industry, including energy, financial fraud and risk, algorithmic insurance, connected vehicles, healthcare and insurance claims, and manufacturing fault detection and yield optimization.

This blog focuses on energy, identifying anomalies for asset management. This specific example uses machine learning techniques to detect anomalies, understand root cause from related text data, and alert case managers with a suggested maintenance action when sensor readings deviate from expected patterns. This enables operators to carry out predictive maintenance before equipment fails and to prevent costly manufacturing process shutdowns.

Reference Architecture

The solution is built out in three phases:

Phase 1 – Anomaly Detection on historical data

The TIBCO Data Science platform is used to detect anomalies across all the sites of a power plant. The input file is power plant data consisting of a timestamp column and several numeric sensor readings. The response variable in this case is ‘Prodperminute’ (a sensor reading tracking production of power per minute). The workflow consists of multiple steps, including data pre-processing, transformation of time-series data, filtering, and ultimately calling the MS Azure services from a custom TIBCO Data Science anomaly detection operator designed to invoke them. The results from the model are made available offline so that the maintenance engineer can carry them along to the remote site.
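
To make the last step of that workflow more concrete, here is a minimal Python sketch of how pre-processed rows might be turned into a request body for a time-series anomaly detection service. It is not the TIBCO operator itself. The function name build_detect_request and the sample readings are hypothetical, and the "series", "timestamp", "value" and "granularity" fields reflect our understanding of the Azure Anomaly Detector request format, which should be checked against the service documentation. Dropping missing readings is a simplifying assumption.

```python
from datetime import datetime, timedelta


def build_detect_request(rows, value_column, granularity="minutely"):
    """Build a request body from rows that hold a timestamp and sensor readings.

    Field names are an assumption about the service's request format.
    """
    series = [
        {
            "timestamp": row["timestamp"].strftime("%Y-%m-%dT%H:%M:%SZ"),
            "value": float(row[value_column]),
        }
        # Sort by time and drop rows with a missing reading (simplification).
        for row in sorted(rows, key=lambda r: r["timestamp"])
        if row.get(value_column) is not None
    ]
    return {"series": series, "granularity": granularity}


# Hypothetical sample data: one reading per minute, one of them missing.
start = datetime(2020, 1, 1, 0, 0)
rows = [
    {"timestamp": start + timedelta(minutes=i), "Prodperminute": v}
    for i, v in enumerate([50.1, 49.8, None, 50.3, 12.0])
]
payload = build_detect_request(rows, "Prodperminute")
print(len(payload["series"]), payload["granularity"])
```

The payload would then be posted to the service endpoint exposed by the container or cloud deployment; that call is left out here because the endpoint and credentials depend on the deployment.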

Phase 2 – Detecting Anomalies at remote site

Once the maintenance engineer is at the remote site, which may not be connected to the internet, the first step is to perform anomaly detection analysis using the Spotfire equipment maintenance dashboard. The dashboard can run entirely offline, so the engineer at the site can continue their critical inspection regardless of network access.

The user selects a response variable (in the demo: ‘Prodperminute’) and a time granularity. Anomalies are then detected using historical data collected at that particular site. This produces two visualizations: the original data readings alongside the expected values provided by the containerized service, and the difference between the original and expected values over time. The red markers indicate the anomalies detected by the Anomaly Detection container.
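
The data behind those two visualizations can be sketched in a few lines of Python. This is an illustration only, not the dashboard's implementation: the function residual_view and the sample numbers are hypothetical, and we assume the service returns one expected value and one anomaly flag per reading.

```python
def residual_view(timestamps, actual, expected, is_anomaly):
    """Combine readings, expected values and anomaly flags into plot-ready points."""
    points = []
    for ts, a, e, flag in zip(timestamps, actual, expected, is_anomaly):
        points.append(
            {
                "timestamp": ts,
                "actual": a,            # first chart: original reading
                "expected": e,          # first chart: expected value
                "difference": a - e,    # second chart: original minus expected
                "anomaly": bool(flag),  # drawn as a red marker when True
            }
        )
    return points


# Hypothetical sample values for four one-minute readings.
timestamps = ["00:00", "00:01", "00:02", "00:03"]
actual = [50.0, 49.5, 12.0, 50.5]
expected = [50.0, 50.0, 50.0, 50.0]
is_anomaly = [False, False, True, False]

view = residual_view(timestamps, actual, expected, is_anomaly)
red_markers = [p["timestamp"] for p in view if p["anomaly"]]
print(red_markers)
```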

Phase 3 – Performing root cause analysis

The site engineer investigates anomalies from a certain time window to perform a root cause analysis. The engineer selects the time window using TIBCO Spotfire’s brush-linking capabilities, and the key driving factors then indicate which factors are contributing to the anomalies in production per minute within that window.
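
The article does not say how the key driving factors are computed, so the following Python sketch swaps in a deliberately simple stand-in: it ranks each sensor by how far its mean inside the selected window sits from its mean outside it, scaled by the sensor's overall standard deviation. The function rank_driving_factors, the ‘AmbientTemp’ sensor and all sample values are hypothetical.

```python
from statistics import mean, pstdev


def rank_driving_factors(readings, in_window):
    """Rank sensors by standardized mean shift inside the selected window.

    This is an illustrative heuristic, not the method used by the dashboard.
    """
    scores = {}
    for sensor, values in readings.items():
        inside = [v for v, w in zip(values, in_window) if w]
        outside = [v for v, w in zip(values, in_window) if not w]
        spread = pstdev(values)
        if not inside or not outside or spread == 0:
            scores[sensor] = 0.0  # nothing to compare, or a constant sensor
            continue
        scores[sensor] = abs(mean(inside) - mean(outside)) / spread
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical readings; the True flags mark the brushed time window.
readings = {
    "BaroPressure": [30.0, 30.1, 29.9, 25.0, 24.8, 30.0],
    "AmbientTemp": [20.0, 20.5, 19.5, 20.2, 19.8, 20.0],
}
in_window = [False, False, False, True, True, False]

ranking = rank_driving_factors(readings, in_window)
print(ranking[0][0])
```

A real deployment would more likely use a model-based importance measure; the point here is only that the selected window is compared against the rest of the data, sensor by sensor.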

At this point, the site engineer digs deeper to understand which maintenance actions were performed before these anomalies occurred, drawing insights from log data using key phrase extraction from the Text Analytics container. For example, the key phrases generated from the text log “Complaint of Faulty BaroPressure Meter Reading” would be ['Complaint of Faulty BaroPressure Meter Reading', 'Needle bearing', 'diagnostics', 'SNO 476BD78', 'PCB', 'replacement', 'Restart PCU unit', 'psi'].
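
As a rough illustration of that exchange, the Python sketch below builds a request body for a key phrase extraction service and reads the phrases back out of a response. The helper names are hypothetical, the "documents", "id", "language", "text" and "keyPhrases" fields reflect our understanding of the Text Analytics format and should be verified against the service documentation, and the response shown is a hand-written stand-in that reuses a few phrases from the example above rather than real service output. Only the log title is used as the text, since the full log entry is not reproduced in this post.

```python
def build_key_phrase_request(log_entries, language="en"):
    """Wrap each log entry as a document with a string id (assumed format)."""
    return {
        "documents": [
            {"id": str(i), "language": language, "text": text}
            for i, text in enumerate(log_entries, start=1)
        ]
    }


def key_phrases_by_id(response):
    """Map each document id to its list of extracted key phrases."""
    return {doc["id"]: doc["keyPhrases"] for doc in response.get("documents", [])}


log_entries = ["Complaint of Faulty BaroPressure Meter Reading"]
request_body = build_key_phrase_request(log_entries)

# Hand-written stand-in for a service response, not real output.
sample_response = {
    "documents": [
        {
            "id": "1",
            "keyPhrases": [
                "Complaint of Faulty BaroPressure Meter Reading",
                "replacement",
                "Restart PCU unit",
            ],
        }
    ],
    "errors": [],
}
phrases = key_phrases_by_id(sample_response)
print(phrases["1"])
```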

From here, the equipment maintenance dashboard generates a recommendation for next steps by further pre-processing the key phrases extracted from the logs. In the example shown in the demo, the maintenance logs from the selected time window indicate that the ‘BaroPressure’ sensor was restarted five times but recommended for replacement just once. Because this sensor was one of the top key driving factors, the dashboard recommends replacing the ‘BaroPressure’ sensor.
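
One way to picture this rule is the short Python sketch below. It is an assumption about the kind of logic involved, not the dashboard's actual code: the functions count_actions and recommend, the restart_threshold parameter and the sample logs are all hypothetical. It counts how often restart and replacement phrases appear in logs that mention a sensor, and suggests replacement when a top driving factor has been restarted repeatedly.

```python
from collections import Counter


def count_actions(log_key_phrases, sensor):
    """Count restart and replacement mentions in logs that refer to the sensor."""
    counts = Counter()
    for phrases in log_key_phrases:
        text = " ".join(phrases).lower()
        if sensor.lower() not in text:
            continue  # this log is about a different component
        if "restart" in text:
            counts["restart"] += 1
        if "replace" in text:  # also matches "replacement"
            counts["replace"] += 1
    return counts


def recommend(log_key_phrases, sensor, top_factors, restart_threshold=3):
    """Suggest replacement for a top driving factor that keeps being restarted."""
    counts = count_actions(log_key_phrases, sensor)
    repeated = counts["restart"] >= restart_threshold
    if sensor in top_factors and repeated and counts["restart"] > counts["replace"]:
        return f"Replace the {sensor} sensor"
    return f"No action recommended for the {sensor} sensor"


# Hypothetical key phrases per log entry: five restarts, one replacement note.
logs = [["Faulty BaroPressure Meter Reading", "Restart PCU unit"]] * 5
logs.append(["BaroPressure sensor", "replacement"])

counts = count_actions(logs, "BaroPressure")
advice = recommend(logs, "BaroPressure", top_factors=["BaroPressure"])
print(counts["restart"], counts["replace"], advice)
```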

The solution can be adapted to data from different domains. To learn more, visit TIBCO Exchange.

This use case is just one illustrative example of the many ways customers use TIBCO’s analytics and data science capabilities to solve some of the most complex business problems, such as pricing and promotion optimization, supply chain analytics, fraud detection, yield enhancement, and real-time process control.

To learn more about TIBCO’s anomaly detection solution, visit Anomaly Detection at TIBCO.

Many thanks go to my colleague Prem Shah, who helped me throughout this work.

Vaibhav Gedigeri is a Lead Data Scientist on the TIBCO Data Science team. Prior to joining TIBCO, Vaibhav worked as a Full Stack Data Scientist at Honeywell for 3 years, where he developed data science solutions at scale spanning the aerospace, energy, and industrial IoT sectors. His focus has been on making complex data science solutions more explainable to business end users. He received his Master’s in applied statistics and operations research and, before that, his bachelor’s in computer science with a minor in artificial intelligence.