- Edited by: Paul Atkinson, Sara Delamont, Alexandru Cernat, Joseph W. Sakshaug & Richard A. Williams
- Publisher: SAGE Publications Ltd
- Publication year: 2020
- Discipline: Anthropology, Business and Management, Communication and Media Studies, Computer Science, Counseling and Psychotherapy, Criminology and Criminal Justice, Economics, Education, Engineering, Geography, Health, History, Marketing, Mathematics, Medicine, Nursing, Political Science and International Relations, Psychology, Social Policy and Public Policy, Science, Social Work, Sociology, Technology
- Methods: Statistical modelling, Missing data, Categorical variables
- Length: 10k+ Words
Missing values are a common problem in almost any dataset, whether collected through surveys, clinical trials, or epidemiological studies. Missing data can lead to biased results if the missingness is not taken into account at the analysis stage. Multiple imputation is widely accepted as the most convenient strategy for properly dealing with item nonresponse. With multiple imputation, missing values are imputed (i.e., replaced with plausible values given the observed data) more than once. The multiple copies make it possible to account for the extra uncertainty from nonresponse using simple formulae (Rubin’s combining rules), ensuring valid inferences based on the imputed data. This extra uncertainty is typically ignored with single imputation, resulting in estimated standard errors and confidence intervals that are too small and p values that overstate significance.
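As a rough illustration of the combining rules mentioned above, the following minimal sketch pools a point estimate and its variance across m imputed datasets. The function name `pool_rubin` and the example numbers are hypothetical and not taken from the entry; the sketch assumes a scalar estimate and omits the degrees-of-freedom adjustment used for confidence intervals.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool scalar estimates from m imputed datasets (hypothetical helper).

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)          # pooled point estimate
    u_bar = mean(variances)          # average within-imputation variance
    b = variance(estimates)          # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b  # total variance of the pooled estimate

    return {"estimate": q_bar, "within": u_bar, "between": b, "total": total}


# Illustrative (made-up) results from three imputed datasets.
pooled = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled)
```

The between-imputation term is what single imputation leaves out: dropping it leaves only the average within-imputation variance, which is why single-imputation standard errors tend to be too small.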
Following a general introduction, this entry starts by discussing the requirements for inference based on partially observed data. The inferential procedures for analyzing multiply imputed datasets are presented next, followed by an illustration of the two main approaches for generating multiply imputed datasets: joint modeling and sequential regression. Various parametric and nonparametric imputation strategies are then discussed, followed by a simulation study, which illustrates how multiple imputation would be implemented in practice. Next, the entry discusses practical considerations, such as deciding which variables to include in the imputation models or choosing the number of imputations. The entry concludes with a critical review of the limitations of multiple imputation, a discussion of potential alternatives, and an illustration of applications of the multiple imputation framework beyond the nonresponse context.