Sparse Table

Michael S. Lewis-Beck; Alan Bryman; Tim Futing Liao

doi:10.4135/9781412950589

Edited by: Michael S. Lewis-Beck, Alan Bryman & Tim Futing Liao
In: The SAGE Encyclopedia of Social Science Research Methods
Chapter DOI: https://doi.org/10.4135/9781412950589.n939
Subject: Anthropology, Business and Management, Criminology and Criminal Justice, Communication and Media Studies, Counseling and Psychotherapy, Economics, Education, Geography, Health, History, Marketing, Nursing, Political Science and International Relations, Psychology, Social Policy and Public Policy, Social Work, Sociology

A sparse table is a cross-classification of observations by two or more discrete variables that has many cells with small or zero frequencies. Sparse contingency tables occur most often when the total number of observations is small relative to the number of cells. For example, consider a table with N = 4 cells and n = 12 observations. If the observations are spread evenly over the four cells, then every cell frequency is small (i.e., n/N = 3). Furthermore, if the occurrence of observations in one of the four cells is a rare event, then a large sample would be needed to obtain observations in this cell. As a second example, consider a cross-classification of four variables, each with seven categories, which has N = 7^4 = 2,401 cells. A sample size considerably larger than 2,401 is required to ensure that all cells contain nonzero frequencies, and a much larger sample size is needed to ensure that all frequencies are large enough for statistical tests and modeling.
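The arithmetic in the two examples above can be sketched in a few lines of Python. The function names are hypothetical, the 2 x 2 layout for the four-cell table is an assumption (the entry gives only N = 4), and the last function assumes that every cell is equally likely, which the entry does not state. Under that assumption, the expected sample size needed before every cell is nonempty is the coupon-collector value N(1 + 1/2 + ... + 1/N).

```python
from math import prod


def table_size(categories):
    """Number of cells N in a cross-classification with the given category counts."""
    return prod(categories)


def mean_cell_count(n_obs, categories):
    """Average frequency per cell, n / N."""
    return n_obs / table_size(categories)


def expected_draws_to_fill(n_cells):
    """Expected sample size until no cell is empty.

    ASSUMPTION: observations fall into cells independently and with equal
    probability (the coupon-collector setting), so the value is N * H_N.
    """
    return n_cells * sum(1.0 / k for k in range(1, n_cells + 1))


# First example: N = 4 cells (assumed to be a 2 x 2 table), n = 12 observations.
print(table_size([2, 2]), mean_cell_count(12, [2, 2]))

# Second example: four variables with seven categories each.
n_cells = table_size([7, 7, 7, 7])
print(n_cells, round(expected_draws_to_fill(n_cells)))
```

With equiprobable cells, the second example needs roughly 20,000 observations on average before all 2,401 cells are occupied. Unequal cell probabilities, which are the norm in survey data, push that figure higher.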

Table 1 Example of a Two-Way Table With Sampling Zeros

                                      Genetic Testing: More Harm or Good?
How Much Know About Genetic Tests?    More Good   More Harm   It Depends   Total
Great Deal                                   54          10            0      64
Not Very Much                               170          88            0     258
Nothing at All                               17          14            0      31
Total                                       241         112            0     353

Sparseness invalidates standard statistical hypothesis tests, such as chi-square tests of independence or model goodness-of-fit. The justification for comparing test statistics (e.g., Pearson's chi-square statistic or the likelihood ratio statistic) to a chi-square distribution depends critically on having “large” samples, where large means having expected values for cells that are greater than or equal to 5. Without large samples, the probability distribution with which test statistics should be compared is unknown. Possible solutions to this problem include using an alternative statistic, performing exact tests, or approximating the probability distribution of test statistics by resampling or Monte Carlo methods.
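As an illustration of the resampling option, the following sketch approximates the null distribution of Pearson's chi-square statistic by permutation, which keeps both margins fixed. It is one possible approach and not a procedure taken from the entry. The function names, the number of simulations, the seed, and the small example tables are all assumptions made for illustration.

```python
import random


def pearson_chi2(table):
    """Pearson chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # cells in an empty margin contribute nothing
                stat += (observed - expected) ** 2 / expected
    return stat


def monte_carlo_p_value(table, n_sim=2000, seed=0):
    """Permutation estimate of the p-value for the test of independence."""
    rng = random.Random(seed)
    # Expand the table into one (row label, column label) pair per observation.
    row_labels, col_labels = [], []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            row_labels += [i] * count
            col_labels += [j] * count
    observed_stat = pearson_chi2(table)
    n_rows, n_cols = len(table), len(table[0])
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(col_labels)  # break any association, keep both margins
        sim = [[0] * n_cols for _ in range(n_rows)]
        for i, j in zip(row_labels, col_labels):
            sim[i][j] += 1
        if pearson_chi2(sim) >= observed_stat - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_sim + 1)


# Hypothetical tables with N = 4 cells and n = 12 observations.
print(monte_carlo_p_value([[6, 0], [0, 6]]))  # strong association
print(monte_carlo_p_value([[3, 3], [3, 3]]))  # no association
```

Because the reference distribution is built from the data's own margins, the approach does not rely on the large-sample chi-square approximation. The price is simulation error, which shrinks as the number of simulated tables grows.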

Sparse tables often contain zero frequencies, which can cause estimation problems, including biased descriptive statistics (e.g., odds ratios), difficulties in estimating log-linear model parameters, and difficulties for computational algorithms that fit models to data. Whether an estimation problem exists depends on the pattern of zero frequencies in the data and the particular model being estimated. Parameters cannot be estimated when there are zeros in the corresponding margins. For example, Table 1 consists of the cross-classification of responses to the following two questions and possible responses from the 1996 General Social Survey (http://www.icpsr.umich.edu:8080/GSS/homepage.htm): (a) “Based on what you know, do you think genetic testing will do more good than harm?” with possible responses of “more good than harm,” “more harm than good,” and “it depends”; and (b) “How much would you say that you have heard or read about genetic screening?” with possible responses of “a great deal,” “not very much,” and “nothing at all.” None of the respondents answered “It depends” to the first question. For the independence log-linear model, a parameter for the marginal effect for “It depends” cannot be estimated. The information needed to estimate this parameter is the column marginal value, which has no observations.
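The point about the empty margin can be checked directly with the counts in Table 1. Under the independence model, the fitted value for a cell is (row total x column total) / n, so an empty column gives fitted values of zero. Their logarithms are not finite, which is why the corresponding column parameter has no estimate. The sketch below is illustrative only, and its function names are hypothetical.

```python
# Observed counts from Table 1.
# Rows: Great Deal, Not Very Much, Nothing at All.
# Columns: More Good, More Harm, It Depends.
observed = [
    [54, 10, 0],
    [170, 88, 0],
    [17, 14, 0],
]


def margins(table):
    """Return (row totals, column totals, grand total)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return row_totals, col_totals, sum(row_totals)


def independence_fit(table):
    """Fitted counts under independence: row total * column total / n."""
    row_totals, col_totals, n = margins(table)
    return [[r * c / n for c in col_totals] for r in row_totals]


def empty_columns(table):
    """Indices of columns whose marginal total is zero."""
    _, col_totals, _ = margins(table)
    return [j for j, total in enumerate(col_totals) if total == 0]


row_totals, col_totals, n = margins(observed)
print(row_totals, col_totals, n)
print(empty_columns(observed))
for fitted_row in independence_fit(observed):
    print([round(value, 2) for value in fitted_row])
```

The column totals reproduce the bottom row of Table 1, and the only empty column is the third one, "It Depends". Every fitted value in that column is zero regardless of the row totals.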

...
