TIBCO Statistica® Log-Linear Analysis of Frequency Tables

Last updated: 12:31pm Sep 29, 2020

This module contains log-linear modeling procedures for multi-way frequency tables. The term log-linear derives from the fact that we can, through logarithmic transformations, restate the problem of analyzing multi-way frequency tables in terms that are very similar to ANOVA. Specifically, we may think of the multi-way frequency table as reflecting various main effects and interaction effects that add together linearly, on the logarithmic scale, to produce the observed table of frequencies.
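As an illustration of the ANOVA-like form, consider the simplest case of a two-way table. The notation here (mu for the overall mean term, lambda for the effect terms, A and B for the row and column factors) is the conventional textbook one and is an assumption, not taken from the module's own output. Under independence the expected frequency is a product of the total and the two marginal proportions, so its logarithm is a sum, which is what makes the model linear:

```latex
% Saturated model for a two-way table with factors A (rows) and B (columns):
% main effects plus an interaction, all additive in the log of the fitted frequency
\ln F_{ij} = \mu + \lambda^{A}_{i} + \lambda^{B}_{j} + \lambda^{AB}_{ij}

% Independence model: the interaction term is dropped
\ln F_{ij} = \mu + \lambda^{A}_{i} + \lambda^{B}_{j}
```

Higher-way tables extend this pattern with additional main effects and higher-order interaction terms.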

For four-way and higher-order tables, finding the best-fitting model becomes increasingly difficult. The module therefore includes an automatic model fitting option to facilitate the search for a good model.

The statistical significance of the goodness-of-fit of a particular model is evaluated via a Chi-square test. This module computes two types of Chi-squares:

  • traditional Pearson Chi-square statistic
  • maximum likelihood ratio Chi-square statistic

Both tests evaluate whether the expected cell frequencies under the respective model are significantly different from the observed cell frequencies.
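The two statistics can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook definitions, not the module's own code; the function names and the example table are hypothetical, and treating empty observed cells as contributing zero to the likelihood ratio statistic is an assumed convention.

```python
import math

def pearson_chi_square(observed, fitted):
    """Pearson Chi-square: sum of (f - F)^2 / F over all cells."""
    return sum((f - F) ** 2 / F for f, F in zip(observed, fitted))

def likelihood_ratio_chi_square(observed, fitted):
    """Likelihood ratio Chi-square: sum of 2 * f * ln(f / F) over all cells.

    Assumption: cells with f == 0 are skipped, i.e. they contribute 0.
    """
    return sum(2.0 * f * math.log(f / F)
               for f, F in zip(observed, fitted) if f > 0)

# Hypothetical 2 x 2 table, flattened row by row.
observed = [10, 20, 30, 40]
# Fitted values under independence: row total * column total / N.
fitted = [12.0, 18.0, 28.0, 42.0]

x2 = pearson_chi_square(observed, fitted)           # about 0.794
g2 = likelihood_ratio_chi_square(observed, fitted)  # about 0.804
print(f"Pearson Chi-square: {x2:.4f}")
print(f"Likelihood ratio Chi-square: {g2:.4f}")
```

Turning either value into a significance level additionally requires the degrees of freedom of the model and the Chi-square distribution, which this sketch leaves out.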

The following residual statistics are available. Note: $F_{ijk\ldots}$ denotes the fitted (expected) frequency for cell $i, j, k, \ldots$, and $f_{ijk\ldots}$ denotes the observed frequency.

Raw residuals ($r_{ijk\ldots}$) are computed as:

$$ r_{ijk\ldots} = f_{ijk\ldots} - F_{ijk\ldots} $$

Standardized residuals ($s_{ijk\ldots}$) are computed as:

$$ s_{ijk\ldots} = \frac{f_{ijk\ldots} - F_{ijk\ldots}}{\sqrt{F_{ijk\ldots}}} $$

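Components of the maximum likelihood ratio Chi-square ($c_{ijk\ldots}$) are the contributions of each cell to the overall maximum likelihood ratio Chi-square goodness-of-fit statistic. In the equation below, ln denotes the natural logarithm:

$$ c_{ijk\ldots} = 2 \, f_{ijk\ldots} \, \ln\!\left( \frac{f_{ijk\ldots}}{F_{ijk\ldots}} \right) $$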

Freeman-Tukey deviates ($fr_{ijk\ldots}$) represent a normalizing transformation that is appropriate when the frequencies in the table come from a Poisson distribution:

$$ fr_{ijk\ldots} = \sqrt{f_{ijk\ldots}} + \sqrt{f_{ijk\ldots} + 1} - \sqrt{4 F_{ijk\ldots} + 1} $$
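The four residual statistics above can be computed for a single cell as in the following sketch. The function name and the dictionary keys are hypothetical, chosen only for illustration, and returning 0 for the likelihood ratio component of an empty cell is an assumed convention.

```python
import math

def cell_residuals(f, F):
    """Residual statistics for one cell.

    f: observed frequency (f >= 0)
    F: fitted (expected) frequency (F > 0)
    """
    raw = f - F
    standardized = (f - F) / math.sqrt(F)
    # Assumption: an empty observed cell contributes 0 to the
    # likelihood ratio Chi-square (the limit of f * ln(f) as f -> 0).
    component = 2.0 * f * math.log(f / F) if f > 0 else 0.0
    freeman_tukey = math.sqrt(f) + math.sqrt(f + 1) - math.sqrt(4 * F + 1)
    return {
        "raw": raw,
        "standardized": standardized,
        "component": component,
        "freeman_tukey": freeman_tukey,
    }

# Hypothetical cell: 10 observed cases where the model expects 12.
res = cell_residuals(10, 12.0)
for name, value in res.items():
    print(f"{name}: {value:.4f}")
```

All four values are negative in this example because the observed frequency falls below the fitted one. Summing the components over all cells of a table gives the overall likelihood ratio Chi-square.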

For additional information on the computations of the Pearson Chi-square and maximum likelihood ratio Chi-square statistics, see Bishop, Fienberg, and Holland (1974) and Fienberg (1978).

For additional information on the logic of automatic model selection, see Goodman (1971).

For information on fitting models (marginal tables) to observed frequency tables via iterative proportional fitting, see Deming and Stephan (1940), Brown (1959), Ireland and Kullback (1968), and Haberman (1972, 1974).
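To make the idea of iterative proportional fitting concrete, here is a minimal sketch for the simplest case: fitting the row and column margins of a two-way table (the independence model). It is not the module's implementation. The function name, the starting table of ones, the stopping rule, and the example data are all assumptions for illustration, and the sketch requires every row and column total to be positive.

```python
def ipf_two_way(observed, max_iter=50, tol=1e-8):
    """Fit a two-way table to the observed row and column totals.

    Starts from a table of ones and alternately rescales rows and
    columns until the fitted values stop changing.
    Assumption: all row and column totals are positive.
    """
    n_rows, n_cols = len(observed), len(observed[0])
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(n_rows))
                  for j in range(n_cols)]
    fitted = [[1.0] * n_cols for _ in range(n_rows)]

    for _ in range(max_iter):
        # Step 1: scale each row so it matches the observed row total.
        for i in range(n_rows):
            s = sum(fitted[i])
            fitted[i] = [v * row_totals[i] / s for v in fitted[i]]
        # Step 2: scale each column so it matches the observed column total.
        max_change = 0.0
        for j in range(n_cols):
            s = sum(fitted[i][j] for i in range(n_rows))
            for i in range(n_rows):
                new_value = fitted[i][j] * col_totals[j] / s
                max_change = max(max_change, abs(new_value - fitted[i][j]))
                fitted[i][j] = new_value
        # Stop once a full cycle no longer changes the table.
        if max_change < tol:
            break
    return fitted

# Hypothetical 2 x 2 table of observed frequencies.
observed = [[10, 20], [30, 40]]
fitted = ipf_two_way(observed)
for row in fitted:
    print([round(v, 4) for v in row])
```

For this two-margin case the procedure settles after one cycle and reproduces the familiar row total times column total divided by N. The iteration matters for higher-way models whose fitted margins overlap, where no closed-form solution may exist.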

Note: The GLM module provides additional options for analyzing binomial and multinomial logit models with coded ANOVA/ANCOVA-like designs.

Historical Note: The term likelihood ratio for Chi-square was introduced by Neyman and Pearson (1931). The term maximum likelihood was first used by Fisher (1922a).