Image Recognition in TIBCO Spotfire® using Python and AWS
Image recognition is a powerful tool used in countless industries as the machine learning explosion continues. It has applications in the health and medical industries to help scan medical imagery, in manufacturing to assess quality and detect errors, in maintenance to detect failing equipment, and in intelligence and security to identify people or objects. It can even be used on satellite imagery to detect environmental issues or ship movements. With this in mind, I wanted to experiment with whether TIBCO Spotfire® could be used to run image recognition models while providing a visual and interactive way to build, assess or simply review model results. In this blog I will tell the story of this work.
As a data scientist and analyst of many years, I have largely dealt with structured, rectangular data in ever-increasing sizes. Historically, processing non-traditional data such as images and text was a considerable task requiring much computational power and lengthy coding or development times. However, as analytical and modelling tools have greatly improved, and with the advent of cloud-based computing and services, the opportunities for data science are much greater. We now have the ability to run complex pre-trained models utilising massive computational power, with minimal coding or infrastructure required.
With the main cloud providers all offering image recognition services, I wanted to utilise their existing functionality in Spotfire® if possible. This would show not only that Spotfire can be used to run image recognition models, but also that it can connect to cloud services.
Figure: Summary of cloud providers' image services
To build this image recognition Spotfire tool I would need to be able to do the following:
- Read images into Spotfire and extract metadata, e.g. image name and dimensions
- Submit images to a cloud service and handle the returned results
- Visualise the results in Spotfire
Read images into Spotfire and extract metadata
To read the images into Spotfire I used a single IronPython script. IronPython gives you access not only to Spotfire’s own API but also to the libraries available in .NET. This opens up huge potential for Spotfire users, who can perform tasks such as interacting with file systems and controlling Spotfire itself. Using .NET classes such as Image and various IO functions, I can read all images in a directory specified by the user and extract metadata, including location information. This can then be displayed in Spotfire:
Figure: Example of reading images in a directory into Spotfire
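The application itself does this step in IronPython with .NET classes. Purely as an illustration of the same idea in standard Python, here is a minimal sketch that walks a directory and builds one metadata row per image. The function names (`png_dimensions`, `list_images`) and column names are my own for this example, and to stay within the standard library it only reads dimensions from PNG headers; other formats get `None` and would need an imaging library.

```python
import os
import struct
from datetime import datetime

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(path):
    """Return (width, height) read from a PNG header, or (None, None) otherwise."""
    with open(path, "rb") as f:
        header = f.read(24)
    # A PNG starts with an 8-byte signature followed by the IHDR chunk,
    # whose first two fields are the big-endian width and height.
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        return (None, None)
    width, height = struct.unpack(">II", header[16:24])
    return (width, height)


def list_images(directory, extensions=(".png", ".jpg", ".jpeg")):
    """Build one metadata row (a dict) per image file found in `directory`."""
    rows = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path) or not name.lower().endswith(extensions):
            continue
        info = os.stat(path)
        width, height = png_dimensions(path)
        rows.append({
            "image_name": name,
            "image_path": path,
            "size_bytes": info.st_size,
            "modified": datetime.fromtimestamp(info.st_mtime),
            "width": width,    # None for non-PNG files in this sketch
            "height": height,
        })
    return rows
```

Each dict corresponds to one row of the image table, so the list could be turned into a data table for display.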
As a later extension, I added an option to specify an S3 bucket to read images from and download them to Spotfire, showing that we can also interact with AWS storage services. This was done using Spotfire’s Python data function. You can see the code for this in this Wiki article. More on this below.
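For readers who want a feel for the S3 step without opening the Wiki article, here is a minimal sketch (not the code from the article). It assumes a Boto3 S3 client with credentials already configured; `download_images`, `make_s3_client`, the bucket name and the returned column names are hypothetical. `list_objects_v2` and `download_file` are the standard Boto3 S3 client calls.

```python
import os


def make_s3_client():
    """Create a Boto3 S3 client (assumes AWS credentials are already configured)."""
    import boto3  # imported lazily so the sketch loads even without boto3 installed
    return boto3.client("s3")


def download_images(s3_client, bucket, local_dir, prefix="",
                    extensions=(".png", ".jpg", ".jpeg")):
    """Download every image under `prefix` in `bucket` into `local_dir`."""
    os.makedirs(local_dir, exist_ok=True)
    downloaded = []
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3_client.list_objects_v2(**kwargs)
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if not key.lower().endswith(extensions):
                continue  # skip folders and non-image objects
            local_path = os.path.join(local_dir, os.path.basename(key))
            s3_client.download_file(bucket, key, local_path)
            downloaded.append({"s3_key": key, "image_path": local_path})
        # list_objects_v2 returns results in pages; keep going until the last one
        if not page.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
    return downloaded
```

A call might look like `download_images(make_s3_client(), "my-image-bucket", "C:/temp/images")`. Note that this sketch keeps only the file name of each key, so two objects with the same name under different prefixes would overwrite each other.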
Submit images to a cloud service and handle the returned results
I chose to use Amazon’s (AWS) Rekognition service as I had previously tried it through the web interface, and knew it could be called from Python or using the AWS CLI executable (note that you must first configure the AWS CLI tool before these options will work). It is possible to call executables through Spotfire’s IronPython functions, but a much cleaner way is to use the Python data function. This allows you to use any Python libraries, as well as pass data between Spotfire and Python.
Amazon provides a Python library called Boto3 which has functionality for the vast majority of their services. Calling it from Spotfire opens up AWS’s huge functionality to Spotfire. I then followed this guide from AWS to get the correct Python code, and implemented this in Spotfire. This meant I could submit one or many images (as chosen by the user) live from Spotfire to Amazon’s Rekognition model. The Python to loop over rows of data and pass the images to AWS is relatively simple:
Figure: Example of Python code that loops over images and calls the AWS service
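As a rough sketch of what such a loop might look like (not the exact code from the application), the function below reads each image file and passes its bytes to Rekognition's `detect_labels` call. The function names, the default thresholds and the idea of passing in a plain list of file paths (for example a column handed over from Spotfire) are assumptions for illustration.

```python
def make_rekognition_client():
    """Create a Boto3 Rekognition client (assumes AWS credentials are configured)."""
    import boto3  # imported lazily so the sketch loads even without boto3 installed
    return boto3.client("rekognition")


def detect_labels_for_images(rekognition_client, image_paths,
                             max_labels=10, min_confidence=70.0):
    """Call detect_labels once per image and return the raw responses by path."""
    responses = {}
    for path in image_paths:
        # Rekognition accepts the raw image bytes directly in the request
        with open(path, "rb") as f:
            image_bytes = f.read()
        responses[path] = rekognition_client.detect_labels(
            Image={"Bytes": image_bytes},
            MaxLabels=max_labels,        # illustrative default, tune as needed
            MinConfidence=min_confidence,
        )
    return responses
```

Passing the client in as an argument keeps the loop easy to test with a stand-in object, and means the same client is reused for every image.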
The output from the Rekognition service is a JSON object which contains all the identified labels (i.e. objects), the confidence of each detection, and the coordinates of any detections, with the latter only being provided in certain circumstances. The Spotfire Python data function was then written to flatten this JSON data into two tables:
- A table of labels and confidence in each image
- A table of coordinates of any bounding boxes for labels found in each image
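The flattening into these two tables can be sketched as follows. It relies on the documented shape of a `detect_labels` response, where each entry in `Labels` has a `Name`, a `Confidence` and an `Instances` list whose items may carry a `BoundingBox`. The function name and output column names are my own choices for this example.

```python
def flatten_labels(image_name, response):
    """Split one detect_labels response into label rows and bounding-box rows."""
    label_rows = []
    box_rows = []
    for label in response.get("Labels", []):
        label_rows.append({
            "image_name": image_name,
            "label": label["Name"],
            "confidence": label["Confidence"],
        })
        # Instances is empty for labels that have no located objects
        for instance in label.get("Instances", []):
            box = instance.get("BoundingBox")
            if not box:
                continue
            box_rows.append({
                "image_name": image_name,
                "label": label["Name"],
                "confidence": instance.get("Confidence"),
                # All four values are ratios of the overall image size
                "left": box["Left"],
                "top": box["Top"],
                "width": box["Width"],
                "height": box["Height"],
            })
    return label_rows, box_rows
```

Both results are lists of dicts, so inside a data function they could be converted with `pandas.DataFrame(label_rows)` and `pandas.DataFrame(box_rows)` before being returned as tables.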
We can now display the results from AWS in Spotfire.
Visualise the results in Spotfire
In Spotfire I used map charts and bar charts to display the data returned. Spotfire’s map charts can be used to display images rather than actual geographical layers, and you can plot data on them using coordinates relative to the image dimensions. This means that, using the bounding box data returned from AWS, we can not only view the image but also display exactly where labels were found. Here is my output in Spotfire:
Figure: Example of using a map chart to display 9 bounding boxes of persons identified in the image
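To plot a box on top of an image, the ratios returned by Rekognition need converting into coordinates in the image's own units. Here is a minimal sketch of that conversion. Rekognition measures `Top` from the top edge of the image; this example assumes the chart's Y axis runs upwards from the bottom-left corner, so the vertical values are flipped. That axis orientation, the function name and the output keys are assumptions for illustration.

```python
def box_to_image_coords(box, image_width, image_height):
    """Convert a Rekognition BoundingBox (ratios) into pixel-based rectangle corners."""
    x_min = box["Left"] * image_width
    x_max = (box["Left"] + box["Width"]) * image_width
    # Flip vertically: Rekognition's Top is measured downwards from the top edge,
    # whereas this sketch assumes Y increases upwards from the bottom edge.
    y_max = (1.0 - box["Top"]) * image_height
    y_min = (1.0 - box["Top"] - box["Height"]) * image_height
    return {"x_min": x_min, "x_max": x_max, "y_min": y_min, "y_max": y_max}


# Example: a box covering the middle half of the width of an 800 x 400 image
corners = box_to_image_coords(
    {"Left": 0.25, "Top": 0.5, "Width": 0.5, "Height": 0.25}, 800, 400
)
print(corners)
```

If the chart's Y axis instead runs downwards from the top, the flip can be dropped and `Top` used directly.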
To make the image recognition interactive in this Spotfire application, I used markings, i.e. the ability to select data points by clicking on them. This lets the user control which images are displayed in the map chart as shown above, and which are sent to the AWS image recognition service. This interaction is possible because Spotfire gives you the ability to trigger a Python data function based upon markings. You are then able to mark/click on any row or bar in the charts on the right that have bounding box data, and these will be highlighted as blue rectangles on the map chart. I also used an IronPython script to control the configuration of the map chart, so that the correct background image is used and the image dimensions are set.
Figure: Summary of the interactive process
I now have a complete image recognition application running through Spotfire utilising Amazon’s Rekognition service. However, since I have access to Python, there is nothing stopping me implementing my own model, or using any other Python-based modelling library such as TensorFlow or Keras, should I wish to do so.
My final step was to extend this to a passion of mine, which is nature and the environment. I wanted to test whether I could analyse images from explore.org’s live bear cam in Katmai National Park, Alaska. My idea was to produce a timeline of when bears appeared on camera and in what number. Here is the application I produced after capturing images at 10-minute intervals over an evening:
Figure: Timeline of detected animals from video feed
Here I have used the date created property of each image to produce the timeline shown in the bar charts (showing hour and minute on a 24-hour clock). Each bar represents an animal type identified, and how many. We can see it identified many bears and birds, showing a distinct pattern for when more bears were on camera. We also see some unusual identifications of penguins, a cow and a dog, showing you must always test and understand a model’s capabilities before using it in a real production environment! However, given this is a generic image recognition service working on low quality images, the results are still impressive.
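The counting behind such a timeline can be sketched in a few lines. This example assumes one row per detected object (an image name and a label) and a lookup from image name to its creation time; the function name, the argument names and the choice to count one detection per row are illustrative rather than taken from the application.

```python
from collections import Counter


def animal_timeline(detection_rows, created_by_image, labels_of_interest=None):
    """Count detections per (HH:MM, label), ready to plot as a bar chart."""
    counts = Counter()
    for row in detection_rows:
        label = row["label"]
        if labels_of_interest is not None and label not in labels_of_interest:
            continue
        created = created_by_image[row["image_name"]]  # a datetime per image
        counts[(created.strftime("%H:%M"), label)] += 1
    # Sort by time then label so the bars appear in chronological order
    return [
        {"time": time, "label": label, "count": n}
        for (time, label), n in sorted(counts.items())
    ]
```

Leaving `labels_of_interest` as `None` keeps every label, which is handy for spotting surprises such as the penguins mentioned above; passing `("Bear", "Bird")` would restrict the timeline to those animals.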
I hope you have enjoyed this blog on how I was able to perform image recognition in Spotfire, making it an interactive and visual process while also showing that we can call out to cloud services such as AWS quickly and easily from Spotfire.
You can watch a live and full explanation of how these examples work on our YouTube channel.
Please feel free to ask any questions on the TIBCO community with a link to this blog.
Colin Gray - TIBCO Data Science - Aug 2019.
Colin Gray is a data scientist working in the Data Science team in TIBCO, in the EMEA region. He has a keen interest in all things data science especially around how we present and communicate machine learning models and analytics, as well as finding innovative ways to combine technologies to bring data science to more fields and users. He loves sports (boxing, NFL, Formula 1), music and dogs and will find any excuse to combine data science with these interests!