Missing Data


Missing data are ubiquitous in social and medical research. This entry illustrates the issues raised by missing data, discussing when an analysis restricted to complete records is likely to be sufficient and the shortcomings of ad hoc imputation approaches. The entry introduces Rubin’s missing data framework (missing completely at random, missing at random, and missing not at random) and uses a cohort study of school children to show how to apply it. It also emphasizes that, with a nontrivial proportion of missing data, investigators must make assumptions about the missing data mechanism for the analysis to proceed. Multiple imputation, full information maximum likelihood, and inverse probability weighting are then discussed, and their use when data are plausibly missing at random is compared, noting that in many applications multiple imputation provides the most general and accessible approach. Because any assumption about the missingness mechanism is untestable given the data at hand, it is important for investigators to explore the robustness of their conclusions to different, contextually plausible assumptions about that mechanism. Through an example, this entry shows how multiple imputation provides an accessible way to explore this robustness.
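
As a rough illustration of why the missingness mechanism matters, the following sketch simulates an outcome whose mean is estimated from complete records only. It is not taken from the entry's cohort study: the function name `simulate`, the variables `x` and `y`, and the observation probabilities (0.5, 0.8, and 0.2) are all assumptions made for this example. It also assumes the observation probabilities are known, which is rarely true in practice, in order to show a simple inverse probability weighted estimate.

```python
import random
import statistics


def simulate(n=20000, seed=0):
    """Compare estimates of the mean of y under two missingness mechanisms.

    All names and probabilities here are illustrative assumptions.
    """
    rng = random.Random(seed)

    # Fully observed covariate x and outcome y; the true mean of y is 0.
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [xi + rng.gauss(0, 1) for xi in x]

    # Missing completely at random: y is observed with probability 0.5,
    # unrelated to x or y.
    mcar_observed = [yi for yi in y if rng.random() < 0.5]

    # Missing at random: the chance of observing y depends only on the
    # fully observed x (0.8 when x > 0, otherwise 0.2).
    mar_observed = []  # (y value, probability it was observed)
    for xi, yi in zip(x, y):
        p = 0.8 if xi > 0 else 0.2
        if rng.random() < p:
            mar_observed.append((yi, p))

    # Complete records analysis simply averages the observed values.
    mar_complete = statistics.fmean(yi for yi, _ in mar_observed)

    # Inverse probability weighting: each observed record is weighted by
    # 1 / P(observed), here using the known (assumed) probabilities.
    weight_total = sum(1 / p for _, p in mar_observed)
    mar_ipw = sum(yi / p for yi, p in mar_observed) / weight_total

    return {
        "full_data": statistics.fmean(y),
        "mcar_complete_records": statistics.fmean(mcar_observed),
        "mar_complete_records": mar_complete,
        "mar_ipw": mar_ipw,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this simulated setting the complete records mean stays close to the full data mean when values are missing completely at random, but drifts upward when missingness depends on `x`; weighting by the inverse of the observation probability brings the estimate back toward the full data mean.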
