A contingency table is a statistical table classifying observed data frequencies according to the categories in two or more variables. A table formed by the cross-classification of two variables is called a two-way contingency table (for an example, see Table 1), a table of three variables is termed a three-way contingency table, and, in general, a table of three or more variables is known as a multiway contingency table. The analysis of multiway contingency tables is sometimes known as multivariate contingency analysis.
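As a minimal sketch of what cross-classification means in practice, the following Python snippet tallies a handful of paired observations into a two-way table of counts. The records and the function name `cross_classify` are invented for illustration and are not drawn from Table 1.

```python
from collections import Counter

# Hypothetical raw records: (row category, column category) per respondent.
records = [
    ("A", "Well"), ("A", "Impaired"), ("B", "Well"),
    ("B", "Well"), ("A", "Well"), ("B", "Impaired"),
]

def cross_classify(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Combinations that never occur are reported as zero counts.
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

table = cross_classify(records)
for row, cells in table.items():
    print(row, cells)
```

Each inner dictionary is one row of the two-way table; adding a third element to every record and a third level of nesting would give a three-way table in the same manner.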

Not every cell in a table necessarily contains observations. Cells with zero counts are known as empty cells. A zero count may arise from two distinct mechanisms. If the sample is simply too small to capture a rare combination (e.g., Chinese American farmers in Illinois), the zero is known as a sampling zero. If it is theoretically impossible for a cell to contain observations (e.g., male [contraceptive] diaphragm users), the zero is called a structural zero. Contingency tables with at least one structural zero are termed incomplete tables. When many cells in a contingency table have low observed frequencies, be they zero or not, such a table is known as a SPARSE TABLE.

Historical Development

Early work on CATEGORICAL DATA ANALYSIS at the beginning of the 20th century was primarily concerned with the analysis of contingency tables. The well-known debate between Karl Pearson and G. Udny Yule over how a 2 × 2 contingency table should be analyzed sparked interest in the subject. Pearson was a firm believer in continuous bivariate distributions underlying the observed counts in the cross-classified table; Yule was of the opinion that variables containing discrete categories, such as inoculation versus no inoculation, would be best treated as discrete without assuming underlying distributions. Pearson's contingency coefficient was calculated based on the approximated underlying correlation of bivariate normal distributions collapsed into discrete cross-classifications, whereas Yule's Q was computed using a function of the ODDS RATIO of the observed discrete data directly.
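To make the relation between Yule's Q and the odds ratio concrete: for a 2 × 2 table with cell counts a, b, c, d, the odds ratio is ad/bc and Yule's Q is (ad - bc)/(ad + bc), which equals (OR - 1)/(OR + 1). The Python sketch below uses invented inoculation counts, not data from the historical debate, and the function names are illustrative only.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of a 2 x 2 table laid out as [[a, b], [c, d]]."""
    return (a * d) / (b * c)

def yules_q(a, b, c, d):
    """Yule's Q = (ad - bc) / (ad + bc); it ranges from -1 to 1."""
    return (a * d - b * c) / (a * d + b * c)

# Hypothetical counts: rows = inoculated / not inoculated,
# columns = survived / died.
a, b, c, d = 90, 10, 60, 40

theta = odds_ratio(a, b, c, d)
q = yules_q(a, b, c, d)
print(f"odds ratio = {theta:.2f}, Yule's Q = {q:.3f}")
```

Q is 0 when the odds ratio is 1 (no association) and approaches 1 or -1 as the odds ratio grows without bound or shrinks toward 0. Note that `odds_ratio` as written fails when b or c is zero, a case this sketch does not handle.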

Contributions throughout the 20th century can often be traced by the names of those inventing the statistics or tests for contingency table analysis. Pearson's chi-square test, Yule's Q, Fisher's exact test, Kendall's tau, Cramér's V, Goodman and Kruskal's tau, and the Cochran-Mantel-Haenszel test (or the Mantel-Haenszel test) are just some examples. These statistics are commonly included in today's statistical software packages (e.g., proc freq in SAS generates most of the above).

Table 1 A Contingency Table of Parents' Socioeconomic Status (SES) and Mental Health Status (expected values in parentheses)

                                 Mental Health Status
Parents' SES   Well         Mild Symptom   Moderate Symptom   Impaired
A (high)       64 (48.5)     94 (95.0)     58 (57.1)          46 (61.4)
B              57 (45.3)     94 (88.8)     54 (53.4)          40 (57.4)
C              57 (53.1)    105 (104.1)    65 (62.6)          60 (67.3)
D              72 (71.0)    141 (139.3)    77 (83.7)          94 (90.0)
E              36 (49.0)     97 (96.1)     54 (57.8)          78 (62.1)
F (low)        21 (40.1)     71 (78.7)     54 (47.3)          71 (50.9)
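
The parenthesized values in Table 1 are consistent with expected counts under independence of the row and column variables, that is, row total × column total / grand total. The entry does not state how they were computed, so the following Python sketch treats that formula as an assumption and also computes the Pearson chi-square statistic from the observed counts. The function names are illustrative.

```python
# Observed counts from Table 1 (rows: parents' SES, A to F;
# columns: Well, Mild Symptom, Moderate Symptom, Impaired).
observed = [
    [64, 94, 58, 46],
    [57, 94, 54, 40],
    [57, 105, 65, 60],
    [72, 141, 77, 94],
    [36, 97, 54, 78],
    [21, 71, 54, 71],
]

def expected_counts(table):
    """Expected counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pearson_chi_square(table):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

expected = expected_counts(observed)
for row in expected:
    print([round(e, 1) for e in row])

chi2 = pearson_chi_square(observed)
# Degrees of freedom for a two-way table: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)
print("chi-square:", round(chi2, 2), "df:", df)
```

Comparing the printed rows against the parenthesized entries cell by cell is a quick check on the table; the largest gaps between observed and expected counts fall in the corner cells (high SES with Well, low SES with Impaired).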

Contingency table analysis can be viewed as the foundation for contemporary categorical data analysis. We may see the series of developments in log-linear modeling in the latter half of the 20th century as refinements in analyzing contingency tables. Similarly, the advancement in latent trait and latent class analysis is inseparable from the statistical foundation of contingency table analysis.

...
