Missing Data

Edith D. de Leeuw; Joop Hox

doi:10.4135/9781412963947


In: Encyclopedia of Survey Research Methods
Chapter DOI: https://doi.org/10.4135/9781412963947.n298
Subject: Anthropology, Business and Management, Criminology and Criminal Justice, Communication and Media Studies, Economics, Education, Geography, Health, Marketing, Nursing, Political Science and International Relations, Psychology, Social Policy and Public Policy, Social Work, Sociology


An important indicator of data quality is the fraction of missing data. Missing data (also called "item nonresponse") means that for some reason data on particular items or questions are not available for analysis. In practice, many researchers tend to solve this problem by restricting the analysis to complete cases through "listwise" deletion of all cases with missing data on the variables of interest. However, this results in loss of information, and therefore estimates will be less efficient. Furthermore, there is the possibility of systematic differences between units that respond to a particular question and those that do not respond, that is, item nonresponse error. If this is the case, the basic assumptions necessary for analyzing only complete cases are not met, and the analysis results may be severely biased.
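
As a rough illustration of the information loss that listwise deletion causes, the following minimal Python sketch drops every case with a missing value on any variable of interest. The records, the variable names (`age`, `income`, `trust`), and the helper `listwise_delete` are hypothetical and chosen only for this example; `None` stands for item nonresponse.

```python
# Hypothetical survey records; None marks item nonresponse.
records = [
    {"age": 34, "income": 42000, "trust": 4},
    {"age": 51, "income": None, "trust": 3},
    {"age": 27, "income": 31000, "trust": None},
    {"age": 45, "income": 58000, "trust": 5},
    {"age": 62, "income": None, "trust": 2},
]


def listwise_delete(rows, variables):
    """Keep only rows that have an observed value on every listed variable."""
    return [r for r in rows if all(r[v] is not None for v in variables)]


# Restrict the analysis to complete cases on all three variables.
complete = listwise_delete(records, ["age", "income", "trust"])
print(len(records), "cases before,", len(complete), "after listwise deletion")
```

In this toy data set only three values are missing, yet more than half of the cases are discarded, including the observed answers those respondents did give.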

Modern strategies to cope with missing data are imputation and direct estimation. Imputation replaces the missing values with plausible estimates to make the data set complete. Direct estimation means that all available (incomplete) data are analyzed using a maximum likelihood approach. The increasing availability of user-friendly software will undoubtedly stimulate the use of both imputation and direct estimation techniques.
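
To make the idea of imputation concrete, here is a deliberately naive Python sketch that fills each missing value with the mean of the observed values. Mean imputation is used here only as the simplest possible stand-in; the entry does not name a specific imputation method, and the data and the function `mean_impute` are hypothetical.

```python
from statistics import mean, pvariance


def mean_impute(values):
    """Replace each None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical income answers; None marks item nonresponse.
incomes = [42000, None, 31000, 58000, None]
observed = [v for v in incomes if v is not None]
completed = mean_impute(incomes)

print("completed data:", completed)
print("variance of observed values:", pvariance(observed))
print("variance after mean imputation:", pvariance(completed))
```

The completed data set has no gaps, but every imputed case sits exactly at the mean, so the variance of the completed variable is smaller than that of the observed values. This is one reason a single naive fill-in is usually treated as a starting point rather than a recommended procedure.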

However, a prerequisite for the statistical treatment of missing data is to understand why the data are missing. For instance, a missing value originating from accidentally skipping a question differs from a missing value originating from reluctance of a respondent to reveal sensitive information. Moreover, the information that is missing can never be replaced. Thus, the first goal in dealing with missing data is to have none. Prevention is an important step in dealing with missing data. Reduction of item nonresponse will lead to more information in a data set, to more data to investigate patterns of the remaining item nonresponse and select the best corrective treatment, and finally to more data on which to base imputation and a correct analysis.

A Typology Of Missing Data

There are several types of missing data patterns, and each pattern can be caused by different factors. The first concern is the randomness or nonrandomness of the missing data.

Missing At Random Or Not Missing At Random

A basic distinction is that data are (a) missing completely at random (MCAR), (b) missing at random (MAR), or (c) not missing at random (NMAR). This distinction is important because it refers to quite different processes that require different strategies in data analysis.
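
The three mechanisms are commonly stated in terms of the probability of missingness. The notation below is an assumption introduced for this sketch and does not appear in the entry: $Y_{\text{obs}}$ and $Y_{\text{mis}}$ denote the observed and missing parts of the data, and $R$ is the indicator of which values are missing.

```latex
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) = P(R) \\
\text{MAR:}\quad  & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) = P(R \mid Y_{\text{obs}}) \\
\text{NMAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) \neq P(R \mid Y_{\text{obs}})
  \quad \text{(missingness still depends on } Y_{\text{mis}}\text{)}
\end{align*}
```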

Data are MCAR if the missingness of a variable is unrelated to its unknown value and also unrelated to the values of all other variables. An example is inadvertently skipping a question in a questionnaire. When data are missing completely at random, the missing values are a random sample of all values and are not related to any observed or unobserved variable. Thus, results of data analyses will not be biased, because there are no systematic differences between respondents and nonrespondents, and problems that arise are mainly a matter of reduced statistical power. It should be noted that the standard solutions in many statistical packages, those of listwise and pairwise deletion, both assume that the data are MCAR. However, this is a strong and often unrealistic assumption.
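
The claim that complete-case results are unbiased under MCAR, but not when missingness depends on the unobserved value itself, can be checked with a small simulation. The sketch below is illustrative only: the population (normal with mean 50 and standard deviation 10), the missingness rates, the cutoff, and both function names are assumptions made for this example.

```python
import random
from statistics import mean

random.seed(12345)  # fixed seed so the sketch is reproducible

# Hypothetical population of true scores.
population = [random.gauss(50, 10) for _ in range(20000)]


def observe_mcar(values, p_missing=0.3):
    """MCAR: every value has the same chance of being missing."""
    return [v for v in values if random.random() >= p_missing]


def observe_value_dependent(values, cutoff=55, p_missing=0.6):
    """Not MCAR: values above the cutoff are often missing,
    so missingness depends on the unobserved value itself."""
    return [v for v in values
            if not (v > cutoff and random.random() < p_missing)]


true_mean = mean(population)
mcar_mean = mean(observe_mcar(population))
nmar_mean = mean(observe_value_dependent(population))

print("true mean:                        ", round(true_mean, 2))
print("complete-case mean, MCAR:         ", round(mcar_mean, 2))
print("complete-case mean, value-driven: ", round(nmar_mean, 2))
```

Under MCAR the complete-case mean stays close to the true mean and only the number of usable cases shrinks. When high values are more likely to be missing, the complete-case mean is pulled downward no matter how large the sample is.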

...
