Anomaly Detection using TensorFlow
Last updated: 5:59pm Feb 20, 2019

Overview

TIBCO Spotfire’s Python data function lets users draw on any package available on PyPI to build custom functionality into their dashboards. Users can combine document properties and data functions to execute custom Python code and use the results to update visualizations on a Spotfire dashboard. This demo uses an unsupervised neural network to solve a business problem in the manufacturing industry.

Prerequisites

The following requirements must be met to enable running Python code from the data function extension:

  • Spotfire 7.13 (or later) client and server
  • Latest copy of
    • SpotfirePS.DataFunctions.Python*.spk
    • SpotfirePS.DataFunctions.PythonForms.*.spk
  • A Python runtime distribution (preferred versions are 2.7.x and 3.5.x) installed in your local environment, with the path to the Python executable listed in your PATH environment variable. This lets Spotfire detect your Python version automatically and use the provided boilerplate Python code.
  • Any packages used in your Python code must be installed manually in your Python environment before the code calls them.
  • The ‘pandas’ and ‘numpy’ packages must be installed for the Python data function to work. This demo also uses the ‘sklearn’ and ‘tensorflow’ Python packages.
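
The Python-related prerequisites above can be checked before opening the demo. The following is a minimal sketch, assuming Python 3 and that the packages are importable under the names listed above; the check_environment helper is hypothetical and is not part of the Spotfire extension.

```python
import importlib.util
import shutil

# Import names of the packages listed in the prerequisites.
REQUIRED_PACKAGES = ["pandas", "numpy", "sklearn", "tensorflow"]


def check_environment(packages=REQUIRED_PACKAGES):
    """Return the Python executable found on PATH and the packages that cannot be imported.

    Hypothetical helper for illustration only.
    """
    # shutil.which searches the directories listed in the PATH environment variable.
    python_path = shutil.which("python") or shutil.which("python3")
    # find_spec returns None when a top-level package is not installed.
    missing = [name for name in packages if importlib.util.find_spec(name) is None]
    return python_path, missing


python_path, missing = check_environment()
print("Python executable on PATH:", python_path)
print("Missing packages:", missing if missing else "none")
```

Any package reported as missing would need to be installed in the same Python environment that Spotfire detects.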

Download

Custom Data Function for TIBCO Spotfire® to Execute Python Code is available from the TIBCO Exchange.

Installation

To add the Custom Data Function to the client software, deploy the above packages to a Spotfire server, then use the client to log in to the deployment area containing the .spk files so that the client is updated.

See here for details.

Manufacturing Use Case

The dataset contains 16 columns of data captured during the third and fourth weeks of January 2010. The company produces three different products across five plant locations. Various metrics from this period can help us identify abnormal behavior on our machines.

The anomaly detection models created using the techniques outlined below can be used in real-time applications to proactively identify risks and mitigate them.

Disclaimer: The data used in this demo is likely fictitious and has been created for the purpose of the demo.

 

TensorFlow

TensorFlow is an open-source software library for dataflow programming. Today it is used in industry primarily for training neural networks efficiently. More information is available here.

The library has a rich ecosystem: APIs in many programming languages, ancillary products for serving models, visualization frameworks (e.g. TensorBoard), deployment packages for mobile and edge devices, and hosted products for productivity.

Autoencoders

Autoencoders, a type of unsupervised neural network, are an important deep learning technique used for a variety of use cases, primarily anomaly detection.

Anomaly detection is a way of detecting abnormal behavior. This technique uses past data to understand a pattern of expected behavior. This pattern is compared to real-time events to highlight any abnormal or unexplained activity occurring at that moment.

Use cases for anomaly detection include:

  • Monitoring sensors on edge devices
  • Fraud detection (financial or healthcare)
  • Early failure detection for manufacturing equipment
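
The general idea described above, learning expected behavior from past data and comparing new events against it, can be sketched with a simple statistical baseline. This is not the method used in the demo (which uses an autoencoder); it is a minimal stand-in that uses the mean and standard deviation, and the readings and function names are made up for illustration.

```python
import statistics


def fit_baseline(history):
    """Summarize past readings as (mean, standard deviation)."""
    return statistics.mean(history), statistics.stdev(history)


def is_anomalous(value, baseline, z_threshold=3.0):
    """Flag a reading that lies more than z_threshold standard deviations from the mean."""
    mean, std = baseline
    return abs(value - mean) > z_threshold * std


# Made-up past sensor readings that define the expected behavior.
history = [70.1, 69.8, 70.4, 70.0, 69.9, 70.2, 70.3, 69.7]
baseline = fit_baseline(history)

# New readings arriving in real time: the first is typical, the second is not.
flags = [is_anomalous(value, baseline) for value in [70.1, 75.0]]
print(flags)
```

A single-variable baseline like this cannot capture relationships between columns, which is one reason to use an autoencoder on multi-column data.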

Autoencoders are similar to ordinary feed-forward neural networks in that they can have multiple layers of neurons that attempt to learn a pattern in the dataset. However, unlike traditional feed-forward networks, autoencoders do not require a target (i.e., a dependent column). Instead, an autoencoder has one set of layers that encodes the dataset and another that decodes the encoded dataset. For data points that follow a clear pattern, the decoded output closely matches the input, so the reconstruction error is very small or zero. For data points that exhibit abnormal behavior, the reconstruction error is higher. This information helps us identify the data points that require closer examination.
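
The encode, decode and reconstruction error steps can be illustrated with a small example. The sketch below is written in plain NumPy rather than TensorFlow so that it stays self-contained; it is not the code used in the demo. The synthetic data, the single hidden layer, the layer size, the learning rate and the threshold rule are all assumptions chosen for illustration.

```python
import numpy as np


def train_autoencoder(X, hidden=2, learning_rate=0.1, epochs=3000, seed=0):
    """Train a one-hidden-layer autoencoder with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n_cols = X.shape[1]
    W_enc = rng.normal(0.0, 0.1, (n_cols, hidden))
    b_enc = np.zeros(hidden)
    W_dec = rng.normal(0.0, 0.1, (hidden, n_cols))
    b_dec = np.zeros(n_cols)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W_enc + b_enc)   # encode
        X_hat = H @ W_dec + b_dec        # decode
        diff = X_hat - X
        losses.append(float(np.mean(diff ** 2)))
        # Backpropagate the mean squared reconstruction error.
        g_out = 2.0 * diff / diff.size
        g_W_dec = H.T @ g_out
        g_b_dec = g_out.sum(axis=0)
        g_pre = (g_out @ W_dec.T) * (1.0 - H ** 2)
        g_W_enc = X.T @ g_pre
        g_b_enc = g_pre.sum(axis=0)
        W_enc -= learning_rate * g_W_enc
        b_enc -= learning_rate * g_b_enc
        W_dec -= learning_rate * g_W_dec
        b_dec -= learning_rate * g_b_dec
    return (W_enc, b_enc, W_dec, b_dec), losses


def reconstruction_error(params, X):
    """Mean squared difference between each row and its reconstruction."""
    W_enc, b_enc, W_dec, b_dec = params
    X_hat = np.tanh(X @ W_enc + b_enc) @ W_dec + b_dec
    return np.mean((X_hat - X) ** 2, axis=1)


# Synthetic "normal" data: column 3 = column 1 + column 2, column 4 = column 1 - column 2.
rng = np.random.default_rng(42)
latent = rng.normal(0.0, 0.5, (200, 2))
mixing = np.array([[1.0, 0.0, 1.0, 1.0],
                   [0.0, 1.0, 1.0, -1.0]])
X_train = latent @ mixing + rng.normal(0.0, 0.01, (200, 4))

params, losses = train_autoencoder(X_train)
train_errors = reconstruction_error(params, X_train)

# Assumed threshold rule: mean training error plus three standard deviations.
threshold = train_errors.mean() + 3.0 * train_errors.std()

# A row that breaks the pattern (column 3 should be 4.0, not -2.0).
anomaly = np.array([[2.0, 2.0, -2.0, 0.0]])
anomaly_error = reconstruction_error(params, anomaly)[0]
print("final training loss:", losses[-1])
print("anomaly flagged:", bool(anomaly_error > threshold))
```

The list of losses corresponds to the kind of loss curve one would plot against training steps, and the per-row errors are the values that would be inspected for anomalies.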

More information on autoencoders and their variants is available here.

DXP Walkthrough

  • The first page of the DXP contains information about the use case and the tools, technologies and techniques used to solve the problem.
  • The ‘Data’ page contains information about the columns in the dataset and a table visualization showing the raw data.
  • The ‘Tensorflow’ page contains four sections – Input parameters (left), Reconstruction Error (top left), Loss vs Steps (top right) and Reconstruction error over time (bottom).
    • The section on the left lets users enter input parameters that define the two layers of an autoencoder, the learning rate and the number of epochs, and specify a name for their TensorBoard folder. It also contains an explanation of the visualizations on the page.
    • The Reconstruction Error visualization is a bar chart of the reconstruction error, which is useful for identifying anomalies. In our use case, data points toward the right of the chart (e.g. those with an error greater than 1.20) are labelled as anomalies. These anomalies can be used downstream for further investigation.
  • The ‘Analysis’ page contains four sections – a general explanation of the visualizations on the page, a vertical reconstruction error bar chart, a line chart of the actual data (top) and a line chart of the predictions (bottom).
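
The labelling rule described for the Reconstruction Error chart can be expressed in a few lines. This is a minimal sketch: the 1.20 cutoff comes from the example above, while the error values and the label_anomalies function are made up for illustration.

```python
def label_anomalies(errors, threshold=1.20):
    """Return the positions of rows whose reconstruction error exceeds the threshold."""
    return [index for index, error in enumerate(errors) if error > threshold]


# Made-up reconstruction errors, one per data point.
errors = [0.12, 0.35, 1.45, 0.08, 1.21, 1.20]
anomalous_rows = label_anomalies(errors)
print(anomalous_rows)
```

A value exactly equal to the cutoff is not labelled here; whether the boundary should be inclusive is a choice to make for the use case at hand.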

License

TIBCO Component Exchange License

Copyright (c) 2019 TIBCO Software Inc. All Rights Reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
  3. Neither the name of TIBCO Software Inc.  nor the names of any contributors may  be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT OWNER AND CONTRIBUTORS  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE