Visualizing TIBCO Spotfire® data table column dependencies using TERR/R/JSViz

Last updated:
1:29pm Jan 11, 2022

Introduction

In a TIBCO Spotfire® data table, next to the standard set of imported columns, one can create as many calculated, binned, tag, hierarchy, TERR™ and marking filter columns as needed. However, when those columns become dependent on one another, one can lose track of the consequences when changing the logic of such a column in the chain of column dependencies.

In other words, when changing the logic of a non-imported column it is rather difficult to find out how it will impact its dependency columns, and its dependencies, and its… (etc.)

An example:

Create calculated column [A] that is used as input for calculated column [B] that is used as input for hierarchy [C].

When changing the expression for [A] it may have an adverse effect on the outcome of [C] via column [B].

To mitigate this problem, for a data table, one should be able to create an overview of all non-imported columns and for each of these show its dependencies on other (non)-imported columns.

For this, a column dependency network plot would be an excellent means.

This article shows a method to retrieve column dependency information using the Spotfire® API in IronPython. Next, it also provides two distinct methods to visualize this information in a network plot.

The first network plot display method uses TERR and R function ggnet2() to create a static graphic image.

The second network plot display method uses TERR function visNetwork() and JSViz to create an interactive html page.

Referenced web material

For this article the following web resources have provided valuable information and methods:

  1. https://community.tibco.com/wiki/spotfire-tips-tricks-create-r-graphics-spotfire

    DivyaJyoti DJ Rajdev (2018) - Spotfire Tips & Tricks: Create R Graphics in Spotfire
  2. https://community.tibco.com/wiki/displaying-terr-graphics-tibco-spotfire-using-jsviz

    Dave Leigh (2018) - Displaying TERR Graphics in TIBCO Spotfire® using JSViz
  3. https://briatte.github.io/ggnet/

    Briatte, F. (2015) - ggnet2: network visualization with ggplot2
  4. http://www.kateto.net/network-visualization

    Ognyanova, K. (2018) - Network visualization with R

Prerequisites

Static network plot image using ggplot2() with TERR/R

For this plot the following requirements must be met:

  • The end-user has installed a local R client that can be accessed either via an R client command prompt or via RStudio.

    The graphics created in this article is based on 64-bit R client v3.5.0 in combination with RStudio 1.0.143.0.
  • The following packages must be installed in the R client (copy/paste in R session):
    install.packages("ggplot2")
    install.packages("GGally")
    install.packages("sna")

    Any dependent package will be downloaded and installed as well.

  • In case a Webplayer is used, it requires a properly configured TERR statistics server that has access to an R engine in which the packages listed above are installed.

Dynamic network plot using visNetwork() with TERR/JSViz

For this plot the following requirements must be met:

  • On the Spotfire server the JSViz package must be deployed, to be obtained via a link in web resource 2.

    The Spotfire client has downloaded the JSViz software via an appropriate deployment area. For this article JSViz v3.4.0.13 is used.
  • In the Spotfire client TERR Tools the visNetwork package must be installed that can be downloaded from the default cloud repository.

    Any visNetwork dependent package will be downloaded and installed as well.
  • The TERR visNetwork package depends on software package pandoc. This can be obtained at:

    https://pandoc.org/installing.html (for latest version) - or -

    https://github.com/jgm/pandoc/releases/tag/2.3.1 (for this article the 64-bit version is used)
  • In case a Webplayer is used, the Webplayer service must have been provided the JSViz deployment packages via an appropriate deployment area. Also, it requires a properly configured TERR statistics server in which the visNetwork package is installed and has access to pandoc.

The Spotfire example files

The TERR/R ggnet2() implementation is found in attached file “Dependency Plot - ggnet2.dxp”.

The TERR visNetwork() implementation is found in attached file “Dependency Plot - visNetwork.dxp”.

Both analysis files have been verified to work in Spotfire 7.11-LTS and 10.0.

Creating a Spotfire data table column dependency network plot

Step 1 - Obtaining Spotfire data table column dependency information

Spotfire provides the ColumnExpression class API that accepts a column expression while returning a list of dependency columns as present in the expression. Note that this API takes care of expanding the expression in case document properties are used, including list properties that expand using $map(), prior to parsing its contents.

A demo of the base code for this is found in IronPython script SingleExprDeps in either attached dxp. The script is invoked on page “2 - Demo - Extract deps from expression”. The demo takes a Spotfire expression from an input field, and returns its dependencies in a text label. See script and screenshot below.

from Spotfire.Dxp.Data.Expressions import ColumnExpression

# Resulting info
depInfo = ""

# Create an expression object using the input field expression
myTable = Document.Data.Tables["Project portfolio"]
myExpression = Document.Properties["myExpression"]
spotColExpr = ColumnExpression.Create(myExpression)

if (spotColExpr.IsValid == False):
  # Syntax error
  depInfo = spotColExpr.ErrorMessage
elif (spotColExpr.Validate(myTable) == False):
  # Semantic error
  depInfo = "Invalid column or function name"
else:
  # See if Spotfire can find dependency columns in the expression
  for depName in spotColExpr.ColumnNames:
    depInfo = depInfo + "[" + depName + "], "
  # See if Spotfire can find dependency hierarchies in the expression
  for depName in spotColExpr.HierarchyNames:
    depInfo = depInfo + "[" + depName + "], "
  # Strip last ", " separator when present
  if (depInfo != ""):
    depInfo = depInfo[:-2]
  else:
    depInfo = "<No dependencies found>"

# Return dependency info back to doc property in textbox
Document.Properties["singleExprDeps"] = depInfo
#print depInfo

 

Here's a screenshot from page 2 in either attached dxp.

Extracting column dependencies from a Spotfire expression

Using this base code, it is now possible to scan all columns in a data table, and extract dependency information on a column by column basis for non-imported columns. This results in a single text string that contains the following network information:

  • The number of nodes in the graph.

    This includes all non-imported columns and all imported columns that are used as input for at least one of the non-imported columns.

    Using this value one can unpack the remainder of the string.
  • The column name (label) of each column in-scope.
  • The column type of each column in-scope.

    This is either “Imported”, “Calculated”, “Binned”, “Tags”, “Result”, “Mask” or “Hierarchy”.
  • For each column a list of “0”s and “1”s, denoting whether there is a dependency from another column in-scope towards the current one.

    For example, assuming the number of columns in-scope in the network is 10, we end up with a matrix of 10x10=100 indicators being either “0” or “1”.

Invoking the script is done on page “3 - Dependency chart - fx()” in either attached dxp file using button “Show column dependencies”. The end result is stored in document property ${strDepInfo} for further processing in TERR.

The IronPython script TableExprDeps to generate the network info string is listed below.

from Spotfire.Dxp.Data.Expressions import ColumnExpression
from Spotfire.Dxp.Data import DataColumnProperties

# Text summary for text box
textSummary = ""
Document.Properties["textSummary"] = ""
textSeparator = ", "

# The resulting column network info string to be used in TERR
strDepInfo = ""
Document.Properties["strDepInfo"] = ""
strDepSeparator = "\t"

# Column name, column type and dependency columns name dictionaries
colTable = Document.Data.Tables["Project portfolio"]
colNameList = []
colTypeList = []
colDepList = []

# Function: colNameAdd()
# Register a column name, its type and an empty column dependency list.
# Do not re-register the column name when already registered.
# Returns: The list index assigned to the registered column name.
def colNameAdd(colName, colType):
  colIndex = 0
  try:
    colIndex = colNameList.index(colName)
  except ValueError:
    colNameList.append(colName)
    colTypeList.append(colType)
    colDepList.append([])
    colIndex = len(colNameList) - 1
  return colIndex

# Function: colDepAdmin()
# Register a dependency column and add it to the dep list of its parent
def colDepAdmin(depColName, colDep):
  depColType = colTable.Columns[depColName].Properties["ColumnType"].ToString()
  colNameAdd(depColName, depColType)
  colDep.append(depColName)

# Iterate through the data table columns and process those that are either
# calculated/binned/hierarchy/tag/result/mask columns. For each retrieve its
# column dependencies.
# In the process of this create the human readable text box text summary.
textSummary = textSummary + "\n--- Table: [" + colTable.Name + "]\n\n"
for tableColumn in colTable.Columns:
  colName = tableColumn.Properties["Name"]
  colType = tableColumn.Properties["ColumnType"].ToString()
  colExpr = tableColumn.Properties["Expression"]
  
  if (colType in ("Calculated", "Binned")):
    # Add column to dependency list and prepare text info for textbox
    colIndex = colNameAdd(colName, colType)
    textCol = "*** " + colType + " column: [" + colName + "]\n" + colExpr + \
      "\n" + "--- Dependencies: "

    # Get column deps by feeding its expression through a ColumnExpression
    # object
    spotColExpr = ColumnExpression.Create(colExpr)
    # Find dependencies in other columns and hierarchies
    for depColName in spotColExpr.ColumnNames:
      colDepAdmin(depColName, colDepList[colIndex])
      textCol += "[" + depColName + "]" + textSeparator
    for depColName in spotColExpr.HierarchyNames:
      colDepAdmin(depColName, colDepList[colIndex])
      textCol += "[" + depColName + "]" + textSeparator

    # Remove last ", " separator from text info and update the text summary
    if (spotColExpr.ColumnNames.Count + spotColExpr.HierarchyNames.Count > 0):
      textCol = textCol[:-len(textSeparator)]
    textSummary += textCol + "\n\n"

  elif (colType in ("Hierarchy")):
    # Add column to dependency list and prepare text info for textbox
    colIndex = colNameAdd(colName, colType)
    textCol = "*** " + colType + " column: [" + colName + "]\n" + colExpr + \
      "\n" + "--- Dependencies: "

    # Get column deps by retrieving them from the hierarchy itself
    for depLevel in tableColumn.Hierarchy.Levels:
      colDepAdmin(depLevel.Name, colDepList[colIndex])
      textCol += "[" + depLevel.Name + "]" + textSeparator

    # Remove last ", " separator from text info and update the text summary
    if (tableColumn.Hierarchy.Levels.Count > 0):
      textCol = textCol[:-len(textSeparator)]
    textSummary += textCol + "\n\n"

  elif (colType in ("Tags", "Result")):
    # Add column to dependency list and prepare text info for textbox.
    # Tags don't have dependencies as they're user selected.
    colIndex = colNameAdd(colName, colType)
    textCol = "*** " + colType + " column: [" + colName + "]\n" + \
      "--- Dependencies:"
    textSummary += textCol + "\n\n"

# All appropriate columns in the table are now scanned for dependencies.
# All the info we need for a network graph is found in colNameList[],
# colTypeList[] and colDepList[].
# We also completed the dependency text summary.
Document.Properties["textSummary"] = textSummary

# Create a text string with network info for TERR, requiring four steps.
# The TERR script will unpack the string back into network info suitable
# for creating a network plot.

# Step 1: Start with the number of nodes (=columns)
colCount = len(colNameList)
strDepInfo = str(colCount) + strDepSeparator

# Step 2: Add for each node its label name
for i in range(0, colCount):
  strDepInfo += "[" + colNameList[i] + "]" + strDepSeparator

# Step 3: Add for each node its column type
for i in range(0, colCount):
  strDepInfo += colTypeList[i] + strDepSeparator

# Step 4: Add for each node its dependency on other registered nodes (0 or 1)
for i in range(0, colCount):
  # Create a zero dependency list for this node and mark actual dependencies
  depNodeList = [0] * colCount
  for colDepName in colDepList[i]:
    depIndex = colNameList.index(colDepName)
    depNodeList[depIndex] = 1
  # Add the dependency info for this node
  for j in range(0, colCount):
    strDepInfo += str(depNodeList[j]) + strDepSeparator

# Strip the last separator from the string and assign to doc property.
# When the value differs from the old one it will trigger the TERR script.
strDepInfo = strDepInfo[:-len(strDepSeparator)]
Document.Properties["strDepInfo"] = strDepInfo
#print strDepInfo

 

An example of a generated network info string is found below for a network containing 13 columns.

Please note that the value separator shown here is a single “|” for readability reasons. This will become a problem when a column has such a character defined in its name. To circumvent this, the actual script uses a “\t” (tab) separator instead.

13|[Report region]|[Region]|[Project leader type]|[Project leader]|[Project control]|
[Priority]|[Region projects (H)]|[Project risk]|[Binned Capex]|[Capex]|[KPI combined]|
[Manpower]|[Taxes]|Calculated|Imported|Calculated|Imported|Calculated|Imported|Hierarchy|
Tags|Binned|Imported|Calculated|Imported|Imported|0|1|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|0|0|0|0|0|0|1|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0|1|0|0|0|0|0|
0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0|0|1|0|0|1|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|0|0|1|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0|0|0|1|0|1|1|0|0|0|0|0|0|0|0|
0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0

 

Next to this, the script also generates a raw text summary of all non-imported columns and its derived dependencies in document property ${textSummary} from which a snippet is shown below.

Notice that calculated column [KPI combined] uses entries from a multi-select listbox on the same page to generate its final expression using $map().

--- Table: [Project portfolio]

*** Calculated column: [Report region]
if([Region] in ("Europe", "Middle East", "Africa"),"EMEA",[Region])
--- Dependencies: [Region]

*** Calculated column: [Project leader type]
If([Project leader] in ("Al Stewart", "Anneke Grönloh", "Bette Midler", "Billy Joel", "Bob Dylan", "Cat Stevens", 
"Eric Clapton", "Guus Meeuwis", "Jacques Brel", "John Lennon", "John Miles", "Ramses Shaffy", "Robbie Williams", 
"Stef Bos", "Van Morrison"),"Senior","Junior")
--- Dependencies: [Project leader]

*** Calculated column: [Project control]
case 
when ([Priority]<=2) and ([Project leader type]="Junior") then "Caution" 
when ([Priority]>=3) and ([Project leader type]="Senior") then "Overkill" 
else "Neutral" 
end
--- Dependencies: [Priority], [Project leader type]

*** Calculated column: [KPI combined]
Sum($map("[${lbmsKPI}]","+")) over ([Project control])
--- Dependencies: [Manpower], [Capex], [Taxes], [Project control]

 

Step 2a – Generating a static network dependency plot using ggplot2()

The Spotfire basics of this can be found in web reference 1. Examples of using ggnet2() are described in web reference 3.

Refer to attached file “Dependency Plot - ggnet2.dxp”.

The TERR script is triggered by a change in doc property ${strDepInfo} containing the network info string. This doc property is passed on into R-in-R function RGraph(). The R script that is run inside TERR RGraph() unpacks the network info string and drives ggnet2() that generates a network plot image. The end result is returned in binary document property ${ggnet2Graph} that is shown in a textbox.

The TERR/R script is as follows:

# Configure access to the R environment. You may or may not need an explicit
# reference to the R engine that will be used. Normally your environment points
# to the latest installed R client.
library(RinR)
#configureREvaluator(REvaluator, FullPath="C:/Program Files/R/R-3.5.0/bin/R")

# Run the graphics engine in the R client using RinR function RGraph().
# We'll also misuse RGRaph() to make R do all the preparational work for
# creating the network data from which to make the graph. This way we don't
# need to add any additional libraries in TERR. It's a trade-off between
# having to configure TERR vs the speed of TERR when compared to R. Since our
# data size is very small I've chosen for the first.
myGraph <- RGraph(
  # The script to run in R
  {
    # Load all the libraries we need
    library(network)
    library(ggplot2)
    library(GGally)

    # Unpack the network string info we receive, as built in python
    netInfo <- strsplit(strDepInfo, "\t", fixed=TRUE)[[1]]
    netSize <- as.numeric(netInfo[1])
    
    # Extract network node labels and node types using the number of network
    # nodes
    myLabel <- netInfo[2:(netSize + 1)]
    myType <- netInfo[(netSize + 2):(netSize * 2 + 1)]
    
    # Extract the network connections, put it in a matrix and generate a
    # network object that is used later on in ggnet2()
    netConns <- as.numeric(netInfo[(netSize * 2 + 2):(netSize * (netSize + 2) + 1)])
    netMatrix <- matrix(netConns, nrow=netSize, ncol=netSize, byrow=FALSE)
    myNet <- network(netMatrix, directed=TRUE)
    
    # Define a color translation palette 
    myColPalette <- c("Imported"="gray", "Binned"="tomato",
      "Calculated"="gold", "Hierarchy"="lightgreen", "Tags"="lightblue",
	  "Mask"="darkgreen", "Result"='#2B7CE9')
    
    # Make labels and nodes of non-imported columns a bit bigger
    myLabelSize <- rep(4, netSize)
    myNodeSize <- rep("Imported", netSize)
    for (i in 1:netSize)
    {
      if (myType[i] != "Imported")
      {
        myLabelSize[i] <- 6
        myNodeSize[i] <- "Other"
      }
    }
    
    # Create color and size for the network edges. When an edge is in-between
    # non-imported columns make it red (as changing the parent may impact the
    # child and its children (etc)). 
    edgeCount <- length(subset(netConns, netConns==1))
    myEdgeColor <- rep("gray", edgeCount)
    myEdgeSize <- rep(0.25, edgeCount)
    edgeIndex <- 1
    for (i in 1:(netSize * netSize))
    {
      if (netConns[i] == 1)
      {
        # Found a network edge. Check the parent and child.
        if (myType[as.integer((i / netSize) + 1)] != "Imported" &&
           myType[((i -1) % netSize) + 1] != "Imported")
        {
          # Both parent and child are calculated: draw big red edge
          myEdgeColor[edgeIndex] <- "red"
          myEdgeSize[edgeIndex] <- 1.35
        }
        edgeIndex <- edgeIndex + 1
      }
    }

    # Make the network plot graph, but note we need to enclose ggnet2() with
    # print() or else RGraph() won't be able to pick it up and pass it back to
    # TERR
    print(ggnet2(myNet, arrow.size=10, arrow.gap=0.020, edge.color=myEdgeColor,
      edge.size=myEdgeSize, edge.alpha=0.65, palette=myColPalette,
      node.size=myNodeSize, node.color=myType, node.alpha=0.75,
      label.color="black", label.size=myLabelSize, label=myLabel,
      layout.exp=0.25, size.legend="Node type", color.legend="Column type"))
  },
  # The data to transfer from TERR to R (=doc property with network info)
  data=list(strDepInfo=strDepInfo),
  # Don't attempt to open the image in a separate window (as it'll fail anyway)
  display=FALSE
)

# Return the image to Spotfire, but note we need to do this as raw data type
ggnet2Graph <- as.raw(myGraph)

 

An example of the generated network plot by this script, as seen in Spotfire, is as follows:

A ggnet2() network plot of Spotfire column dependencies

Please note the following:

  • Efforts have been made to highlight only relevant columns, in this case by size and color. Imported columns are the smaller dots and shown in gray as they only serve as input for other columns.
  • An edge between two nodes is shown as a thick red line in case both columns are non-imported columns. By following a red line once can instantly derive multi-column dependencies.

    For example, when changing [Project leader type] it is easy to derive that it will impact the outcome of calculated column [KPI combined], as well as hierarchy [Region projects (H)] via calculated column [Project control].
  • The image shown here is static and there is no way to interact with it.

    In case the objects in the network are not nicely spread press button “Shuffle network graph” on the same page to rerun ggnet2() using a dummy update on the network info string doc property.

Step 2b – Generating a dynamic network dependency plot using visNetwork() and JSViz

The Spotfire basics of this can be found in web reference 2. Examples of using visNetwork() are described in web reference 4.

Refer to attached file “Dependency Plot - visNetwork.dxp”.

Again, the TERR script is triggered by a change in doc property ${strDepInfo} containing the network info string. This time however, everything can be done in TERR. The script unpacks the network info string and drives TERR function visNetwork() that generates an interactive html network plot image. Via a temporary html file on the local filesystem it is loaded into document property ${htmlVisNetwork}. A JSViz chart is used to display the html.

The TERR script is as follows:

# Load visNetwork without reporting errors/warnings
suppressWarnings(suppressMessages(library(visNetwork)))

# Unpack the network string info we receive, as built in python
netInfo <- strsplit(strDepInfo, "\t", fixed=TRUE)[[1]]
netSize <- as.numeric(netInfo[1])

# Extract network info and prepare node data for the visNetwork
myLabel <- netInfo[2:(netSize + 1)]
myType <- netInfo[(netSize + 2):(netSize * 2 + 1)]
myShadow <- rep(TRUE,netSize)
myColNodes <- data.frame(id=myLabel,label=myLabel,group=myType,shadow=myShadow)

# Setup edges with arrow, edge width and edge color
netConns <- as.numeric(netInfo[(netSize * 2 + 2):(netSize * (netSize + 2) + 1)])
netMatrix <- matrix(netConns, nrow=netSize, ncol=netSize, byrow=FALSE)
myColEdges <- data.frame(from=character(),to=character(),arrows=character(),
  color=character(),width=numeric())
for (i in 1:netSize)
{
  for (j in 1:netSize)
  {
    if (netMatrix[i,j] == 1)
    {
      if (myType[i] != "Imported" && myType[j] != "Imported")
      {
        # An edge between non-imported columns is wide and red
        color <- c("red")
        width <- c(4)
      }
      else
      {
        # Any other edge is thin and gray
        color <- c("gray")
        width <- c(1)
      }
      myColEdges <- rbind(myColEdges,data.frame(from=c(myLabel[i]),
        to=c(myLabel[j]),arrows=c("middle"),color=color,width=width))
    }
  }
}

# Generate the network graph, add group info based on the column type and add
# a simple legend.
# For imported columns do not use a black border to improve contrast with other
# column types.
myVis <- visNetwork(nodes=myColNodes,edges=myColEdges,width="100%",height="620px")
myVis <- visGroups(myVis,groupname="Imported",shape="dot",value=1,
  color=list(background="gray",border="gray"))
myVis <- visGroups(myVis,groupname="Calculated",shape="dot",value=2,
  color=list(background="gold",border="black"))
myVis <- visGroups(myVis,groupname="Binned",shape="dot",value=2,
  color=list(background="tomato", border="black"))
myVis <- visGroups(myVis,groupname="Hierarchy",shape="dot",value=2,
  color=list(background="lightgreen",border="black"))
myVis <- visGroups(myVis,groupname="Tags",shape="dot",value=2,
  color=list(background="lightblue",border="black"))
myVis <- visGroups(myVis,groupname="Mask",shape="dot",value=2,
  color=list(background="darkgreen",border="black"))
myVis <- visGroups(myVis,groupname="Result",shape="dot",value=2,
  color=list(background='#2B7CE9',border="black"))
myVis <- visOptions(myVis,highlightNearest=TRUE,selectedBy="label")
myVis <- visNodes(myVis,scaling=list(min=10,max=20))
myVis <- visLegend(myVis,position="right",ncol=1,zoom=TRUE) 

# Create a temp file and save the html of the generated visNetwork.
# Saving an html file as selfcontained package requires an installation
# of pandoc software.
myHtmlFilePath <- tempfile("visNetwork.html")
visSave(myVis,myHtmlFilePath,selfcontained=TRUE,background="white")

# Return the html output by reading from the temp file
htmlVisNetwork <- readChar(myHtmlFilePath, file.info(myHtmlFilePath)$size)

 

Unfortunately, showing the end result as a self-contained html providing all interactive functionality is not possible on the community. As such, a static screenshot is provided instead. The attached Spotfire dxp will provide the interactive visualization. Or, refer to attachment vizNetwork.zip that includes an example of the interactive self-contained html that can be shown in a webbrowser.

A visNetwork plot of Spotfire column dependencies

Please note the following:

  • This plot looks "nicer" than the static ggnet2() plot, but this is a matter of personal preference.
  • The static image shown here is taken from a dynamic html page.

    In the actual plot in Spotfire one can use the scroll wheel to zoom in/out of the network plot and legend. One can move the image by a click-hold in the white area and dragging it in any direction. By clicking on a node its links to neighbor nodes are highlighted. One can search for a node by selecting it from the dropdown list at the top left. And finally, one can rearrange the plot by dragging a node across the plot area.
  • Efforts have been made to highlight only relevant columns, in this case by size and color. Imported columns are the smaller dots and shown in gray as they only serve as input for other columns.
  • An edge between two nodes is shown as a thick red line in case both columns are non-imported columns. By following a red line once can instantly derive multi-column dependencies.

    For example, when changing [Project leader type] it is easy to derive that it will impact the outcome of calculated column [KPI combined] and hierarchy [Region projects (H)] via calculated column [Project control].

Conclusions and remarks

The methods shown here can be used for Spotfire analysis file documentation purposes.

However, it is rather cumbersome to implement the several IronPython and TERR/R scripts in each relevant analysis file. To address this issue, but out-of-scope for this article, one could create a Spotfire plugin that implements the functionality described above to generate dependency plots at will. This could also mean that software installations outside Spotfire would not be needed. Please contact Quintus consultants (info@quintus.nl) if you’re interested in such a plugin.

The ColumnExpression class API also supports method CreateFromPlotExpression() to examine custom expressions. Although out-of-scope for this article, it technically allows to automatically generate documentation on all visualizations within a Spotfire analysis file.

However, although the ColumnExpression class allows to retrieve column dependencies, it looks like there is no means to retrieve them as fully qualified names, being [TableName].[ColumnName]. This means that in a custom expression for table X that refers to a column in table Y, for example [Y].[Column], the API only reports a dependency on [Column] and not on [Y].[Column]. This is considered a severe limitation in the API that, imho, should be corrected.

One of the data table column types is “Result”. Such a column is the result of a TERR script that is incorporated into the data table. As the interface to and from a TERR script is not arranged via column expressions, the TERR script input dependencies cannot be derived via the ColumnExpression class.

Spotfire also supports the GenerateSourceView class API that would allow us not only to trace data transformation actions within a data table but also derive dependencies between columns across data tables. For example, calculated column [X] in table A is added to table B where it is used as input for calculated column [Y].

But this sounds easier than I think it is. For example, adding a pivot transformation on the source data table before adding the result to another table, replacing a column with another, or a combination of both would make it difficult to trace column input to column output.

Attachments