Exploratory data analysis (sometimes abbreviated as EDA) is an approach to data analysis that lets the data themselves reveal their underlying structure and gives the researcher a “feel” for the data. It relies heavily on graphs and displays because visual inspection offers special insight into the data, but it also uses many numerical techniques. Both the visual and numerical techniques focus on searching for knowledge that allows more effective testing of theories and hypotheses. Because exploratory data analysis represents a general philosophy of understanding data, the techniques serve as means to that end rather than ends in themselves.

The philosophy of exploratory data analysis can be said to value two traits: openness and skepticism (Hartwig & Dearing, 1979, pp. 9–12). Analysts should be open to unanticipated data patterns that go beyond the planned model and expectations. Analysts should also be skeptical of numerical summaries of data that can conceal or misrepresent the most informative aspects of the data.
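The reason for skepticism toward numerical summaries can be illustrated with a minimal sketch. The two small data sets below are hypothetical, invented only for this illustration, and the helper `text_histogram` is likewise an assumption rather than a technique taken from the cited sources. Both data sets share the same mean and median, yet a crude text display shows that one is spread evenly while the other falls into two clusters.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical data, made up for illustration only.
evenly_spread = [1, 2, 3, 4, 5, 6, 7, 8, 9]
two_clusters = [1, 1, 1, 1, 5, 9, 9, 9, 9]


def text_histogram(values):
    """Return one line per integer value, with one '*' per observation."""
    counts = Counter(values)
    lines = []
    for v in range(min(values), max(values) + 1):
        lines.append(f"{v:>2} | " + "*" * counts[v])
    return "\n".join(lines)


for name, data in [("evenly_spread", evenly_spread),
                   ("two_clusters", two_clusters)]:
    # The numerical summaries are identical for both data sets ...
    print(f"{name}: mean={mean(data)}, median={median(data)}")
    # ... but the display reveals very different shapes.
    print(text_histogram(data))
    print()
```

An analyst who reported only the mean and median would treat the two data sets as interchangeable; the display suggests that they call for different models.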

Historical Development

The seminal works on exploratory data analysis come from John W. Tukey (1977) and Frederick Mosteller and John W. Tukey (1977). Tukey (1977, p. vii) noted that over the 20th century, confirmatory data analysis, which assesses how precisely sample statistics can be used to make inferences about population parameters, had come to dominate statistical research. He believed that although exploratory and confirmatory data analyses emphasize different principles, both should be used by practicing researchers. In aiming to complement the dominant methods of statistical analysis, he argued that researchers cannot get along without confirmatory data analysis but need not start with it. His central principle follows from this point (Tukey, 1977): “It is important to understand what you CAN DO before you learn to measure how WELL you seem to have DONE it” (p. v).

Consider the differences between confirmatory and exploratory data analysis (NIST/SEMATECH, 2003):

  • Confirmatory approaches begin with a prespecified model and use the parameters obtained from the data analysis to evaluate the model, whereas exploratory approaches begin with analysis of the data to infer the model and specify its assumptions.
  • Confirmatory approaches are formal and rigorous in efforts to evaluate a model, whereas exploratory approaches are more informal and active in the effort to discover meanings contained in the data.
  • Confirmatory approaches impose a model on the data, whereas exploratory approaches let the data suggest appropriate models.
  • Confirmatory approaches filter the data by reducing the information to a small number of parameters, whereas exploratory approaches rely on techniques that reflect all parts of the data.
  • Confirmatory approaches require strong assumptions, whereas exploratory approaches make few assumptions or attempt to validate assumptions before model testing.
  • Confirmatory approaches have much power to precisely test a hypothesis and estimate parameters under narrow circumstances, whereas exploratory approaches have greater generality and less sensitivity.

The two approaches represent ideal types that, in good research, both receive attention, but Tukey and others have had considerable influence in countering a belief that mere description and exploration of data were inferior to formal hypothesis testing and model confirmation. Today, most empirical social scientific research relies on confirmatory models, such as in the use of regression and analysis of variance to test theories, but exploratory techniques have become essential parts of the research process leading to the confirmatory models.

...
