Cross-classifications of categorical variables (CONTINGENCY TABLE) are ubiquitous in the social sciences, and log-linear models provide a powerful and flexible framework for their analysis. The log-linear model has direct analogies to the linear model routinely used to perform an ANALYSIS OF VARIANCE. There are, of course, important issues that are particular to the analysis of contingency tables, and those are the primary focus of this entry.

This entry begins with an exposition of the log-linear model in the context of the 2 × 2 contingency table. From the basic concepts and models of this simple setting, extensions and generalizations are made to illustrate how log-linear models can be applied to settings with more than two categories per variable, more than two variables, or a one-to-one matching between the categories of two variables. The concept and representation of the odds ratio are fundamental to understanding the log-linear model and its applications, and this will be demonstrated throughout.

ASSOCIATION: THE 2 × 2 CONTINGENCY TABLE, LOG-ODDS RATIO, AND LOG-LINEAR MODEL PARAMETERS

The correlation coefficient is the standard measure for the assessment of (linear) ASSOCIATION between two continuous variables, and the parameters in the classical LINEAR REGRESSION model are readily related to it. The odds ratio is the analogous measure in the log-linear model, as well as in the numerous generalizations of this model that have been developed for the modeling and analysis of associations between categorical variables. That is, log-linear model parameters have immediate interpretations in terms of log-odds, log-odds ratios, and contrasts between log-odds ratios. The analogy to linear models for analysis of variance is that the parameters in those models have interpretations in terms of means, differences in means, or mean contrasts. In addition, there are immediate connections between log-odds ratios for association and the parameters in regression models for log-odds, just as there are immediate connections between PARTIAL CORRELATIONS and REGRESSION COEFFICIENTS in ordinary linear regression.

The fundamental concepts and parameterization of the log-linear model are readily demonstrated by considering the 2 × 2 contingency table, as well as sets of 2 × 2 contingency tables. Let πij denote the probability associated with the cell in row i and column j of a 2 × 2 contingency table, and let nij denote the corresponding observed count. For example, consider the following cross-classification of applicants’ gender and admission decisions for all 12,763 applications to the 101 graduate programs at the University of California at Berkeley for 1973 (see, e.g., Agresti, 1984, pp. 71–73):

Table 1 Berkeley Graduate Admissions Data

          Admissions Decision
Gender    Yes      No
Male      3,738    4,704
Female    1,494    2,827

In this case, n11 = 3,738, n12 = 4,704, n21 = 1,494, and n22 = 2,827. These data were presented in an analysis to examine possible sex bias in graduate admissions: simple calculations show that more than 44% of males were offered admission (i.e., 3,738/8,442 > 0.44), whereas less than 35% of females were (i.e., 1,494/4,321 < 0.35). Log-linear models can be applied to test the hypothesis that admissions decisions were (statistically) independent of applicants’ gender versus the alternative that there was some association or dependence.
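
The arithmetic above can be checked with a short sketch in Python. It recomputes the admission proportions from Table 1, forms the sample odds ratio n11 n22 / (n12 n21) and its logarithm, and compares the observed counts with those expected under independence. The variable names are illustrative assumptions, and the two statistics shown (Pearson's chi-square and the likelihood-ratio statistic) are standard choices for testing independence in a 2 × 2 table rather than necessarily the ones this entry goes on to use.

```python
import math

# Observed counts from Table 1.
# Rows: male, female. Columns: admitted (yes), not admitted (no).
n = [[3738, 4704],
     [1494, 2827]]

row_totals = [sum(row) for row in n]          # applicants per gender
col_totals = [sum(col) for col in zip(*n)]    # applicants per decision
total = sum(row_totals)                       # all applications

# Proportion admitted within each gender.
admit_rates = [n[i][0] / row_totals[i] for i in range(2)]

# Sample odds ratio (cross-product ratio) and its logarithm.
odds_ratio = (n[0][0] * n[1][1]) / (n[0][1] * n[1][0])
log_odds_ratio = math.log(odds_ratio)

# Expected counts if decision and gender were independent:
# (row total * column total) / grand total.
expected = [[row_totals[i] * col_totals[j] / total for j in range(2)]
            for i in range(2)]

cells = [(i, j) for i in range(2) for j in range(2)]

# Pearson chi-square statistic.
pearson_x2 = sum((n[i][j] - expected[i][j]) ** 2 / expected[i][j]
                 for i, j in cells)

# Likelihood-ratio (deviance) statistic.
g2 = 2 * sum(n[i][j] * math.log(n[i][j] / expected[i][j])
             for i, j in cells)

print(f"admission rates (male, female): {admit_rates[0]:.4f}, {admit_rates[1]:.4f}")
print(f"odds ratio: {odds_ratio:.4f}, log-odds ratio: {log_odds_ratio:.4f}")
print(f"Pearson X2: {pearson_x2:.2f}, likelihood-ratio G2: {g2:.2f}")
```

An odds ratio of 1 (log-odds ratio of 0) corresponds to independence. For a 2 × 2 table, both statistics are referred to a chi-square distribution with 1 degree of freedom, whose 0.05 critical value is about 3.84.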

...
