Data analysis in qualitative research is quite different from that in quantitative research due not only to differences in the data themselves but also to substantial differences in the goals, assumptions, research questions, and data collection methods of the two styles of research. Because qualitative approaches and methods are an important part of educational research, both researchers and practitioners need to understand these differences, the strengths and limitations of the two approaches, and how they can be productively integrated. Data analysis may be the least understood aspect of qualitative research, partly because the term qualitative analysis has several different meanings. This entry reviews the aspects of qualitative research that are most important for data analysis, describes the history of its development, and surveys the current diversity of approaches to analysis in qualitative research.
The phrase qualitative analysis in the physical sciences, and in some quantitative research in the social sciences, refers to categorical rather than numerical analysis. For example, qualitative analysis in chemistry simply determines what elements are present in a solution, while quantitative analysis also measures the amount of each element. Some quantitative researchers have assumed that this distinction also applies to the social sciences—that qualitative analysis deals with data that are simply categorized, rather than measured numerically, and that the basic principles of quantitative research can be applied to both. This represents a profound misunderstanding of qualitative research and analysis, which rests on quite different premises from quantitative research, and uses distinct strategies for analyzing data.
These strategies are grounded in the primarily inductive, rather than hypothesis testing, nature of qualitative research. This, in turn, is shaped by the nature of qualitative data. Such data are primarily descriptions of what people did or said in particular contexts—either observations of actual settings and events or transcripts of interviews. Instead of converting these descriptions to variables and measuring or correlating these, as quantitative researchers do, qualitative researchers retain the data in their original, descriptive form and analyze these in ways that, at least to a greater extent than in quantitative research, retain their narrative, contextualized character. Qualitative research reports tend to contain many verbatim quotes and descriptions, and the analysis process is to a substantial extent devoted to selecting these as well as to aggregating, comparing, and summarizing them. The use of numbers, to make more precise statements of how often something happened or how many participants reported a particular experience or event, is legitimate and common in qualitative research, but such uses are supplementary to the primary descriptive and interpretive goals of analysis.
Because of the inductive character of qualitative research, and its particularistic focus, data analysis is not a “stage” that occurs in a sequential order with theorizing, research design, data collection, and writing up results. Data analysis should begin as soon as any data are collected and should be continued as long as any significant questions remain about the meaning and implications of the data. Although the relative emphasis on the different aspects of the research process varies over time, they are not chronologically separated components of a linear series.
Although the term qualitative research is a more recent development, its actual practice has a long history, extending back at least to the 19th-century work of anthropologists and the study of social problems by Charles Booth, Jane Addams, and others and to later community studies such as Middletown and Yankee City. Despite this, the analysis of qualitative data has, until relatively recently, received little theoretical attention. This is in striking contrast to quantitative research, which has a well-developed body of theory, statistics, that informs quantitative analysis.
The first widely recognized, named, and systematically developed method for analyzing qualitative data was analytic induction (AI). It was created by the sociologist Florian Znaniecki in the 1930s, during his research with W. I. Thomas for their classic work The Polish Peasant in Europe and America, and was further developed by Alfred Lindesmith in his research on opiate addiction in the 1940s. In contrast to quantitative research, which typically collects and analyzes data in order to test previously developed theories, AI proceeds inductively to generate categories, concepts, and theories from the data. These inductively developed theories specify the necessary preconditions for a type of case (e.g., of people who embezzle money from a firm to deal with unexpected personal financial problems); the theory is tested by seeking negative instances, and revising the theory, or limiting its scope, until no negative cases are found.
The goal of AI was to develop explanatory theories about the phenomena studied. This was done by iteratively examining cases to see whether the theorized conditions were present; any case that lacked one of these preconditions required revision of the theory. The view that any exception to the preconditions necessitated revision of the theory is now seen by most researchers as too stringent. Nevertheless, the inductive development of categories for sorting and classifying (coding) data has been a feature of most subsequent strategies for qualitative analysis.
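The revision cycle described above can be illustrated with a deliberately simplified sketch. It assumes that each case has already been reduced to a set of condition labels, which in real AI is itself an interpretive step, and it handles a negative instance only by revising the theory, not by limiting its scope. The case data and the labels (financial_problem, access_to_funds, secrecy, gambling_debt) are hypothetical and are not taken from any actual study.

```python
# Hypothetical cases: each has the conditions observed and whether the
# phenomenon under study (e.g., embezzlement) occurred.
cases = [
    {"id": "A", "phenomenon": True,
     "conditions": {"financial_problem", "access_to_funds", "secrecy", "gambling_debt"}},
    {"id": "B", "phenomenon": True,
     "conditions": {"financial_problem", "access_to_funds", "secrecy"}},
    {"id": "C", "phenomenon": False,
     "conditions": {"access_to_funds"}},
    {"id": "D", "phenomenon": True,
     "conditions": {"financial_problem", "access_to_funds", "secrecy", "gambling_debt"}},
]


def analytic_induction(cases):
    """Return (preconditions, revisions) after examining every positive case.

    The provisional theory starts as all conditions of the first positive
    case. Any later positive case lacking a theorized precondition is a
    negative instance, and the theory is revised by dropping that condition.
    """
    positives = [c for c in cases if c["phenomenon"]]
    if not positives:
        return set(), []
    theory = set(positives[0]["conditions"])
    revisions = []
    for case in positives[1:]:
        missing = theory - case["conditions"]
        if missing:
            theory -= missing
            revisions.append((case["id"], sorted(missing)))
    return theory, revisions


theory, revisions = analytic_induction(cases)
print("Preconditions retained:", sorted(theory))
print("Revisions (case, conditions dropped):", revisions)
```

In this sketch, case B forces the only revision, because it shows the phenomenon without one of the conditions first theorized as necessary.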
The most influential and widely used strategy for qualitative analysis, grounded theory, was presented by Barney Glaser and Anselm Strauss in their 1967 book The Discovery of Grounded Theory. Their work was in part a response to the growing prestige of quantitative research in sociology and some other social sciences, as sophisticated statistical analyses of survey data became dominant in academic influence and funding. The book challenged the growing separation of theory development from research, in which broad abstract theories, often generated without reference to actual data, were then tested by researchers, using quantitative data to establish correlations between variables derived from the theories. As with AI, grounded theory emphasized the inductive development of theory but established a more systematic and flexible way of doing this. The phrase grounded theory was intended to emphasize the generation of theory that was “grounded” in, and developed in interaction with, the collection and analysis of data.
A key concept for grounded theory was the constant comparative method, a strategy that Glaser and Strauss distinguished from both the quantification of data in order to test existing theory and the simple examination of data to generate theory. Constant comparison integrates the coding of data with the development of theory and hypothesis generation in an iterative process. This strategy, a radical departure from standard practice (at least as theorized) when it was first presented, is now a fairly typical part of most qualitative research.
A second innovation that Glaser and Strauss introduced was the use of memos (written reflections on methods, data, or other aspects of the research) as an explicit data analysis strategy. Although memos were used informally in earlier research, Glaser and Strauss recognized these as a distinct strategy for qualitative analysis. The Discovery of Grounded Theory treated memos very briefly, in only a few paragraphs, but Strauss’s later work (Qualitative Analysis for Social Scientists, 1987; Strauss and Corbin, Basics of Qualitative Research, 1990), as well as that of Matthew Miles and A. Michael Huberman, provided a much more extensive discussion of the uses of memos for data analysis and theory development.
In his later work, Strauss also developed additional strategies for analysis, including what he called axial and selective coding. The terminology he used for these is potentially confusing, because neither involves “coding” in the usual sense of creating categories and sorting data by category; Strauss used “coding” to mean broadly “the process of analyzing data.” In axial coding, the researcher connects a categorized phenomenon to the conditions that gave rise to it, its context, the strategies by which it is handled, and the consequences of these; selective coding involves relating a category to the core categories of the emerging theory. These are both ways of connecting a category to other categories; such strategies are discussed in more detail later in this entry.
There are now at least three different versions of grounded theory in use: Glaser’s development of traditional grounded theory, Strauss and Juliet Corbin’s subsequent elaboration of this approach (which Glaser rejected), and constructivist grounded theory, as developed by Kathy Charmaz. The last of these combines the grounded theory approach with social constructivism, the epistemological position that people construct the realities in which their lives are embedded. This view, a reaction to the positivism that has dominated quantitative research, has become widespread (though by no means universal) in qualitative research. It emphasizes research relationships, participants’ subjectivity, and the social context of the research.
Another major contribution to the development of qualitative analysis was Miles and Huberman’s Qualitative Data Analysis: A Sourcebook of New Methods (1984). This work, although it covered most traditional forms of analysis, emphasized what they called displays: visual ways of presenting and analyzing data. Most of these strategies were qualitative adaptations of two forms of data analysis and presentation that had been used in quantitative research: matrices (tables) and networks (concept maps or flowcharts). In contrast to quantitative displays such as numerical tables or structural equation models, Miles and Huberman presented numerous examples of genuinely qualitative displays. Matrices are formed by crossing lists of categories (including individuals, groups, or times) to create cells; but rather than numbers, the cells contain qualitative data, either verbatim quotes or field note excerpts, or summaries of these. Networks, on the other hand, can display relationships among categories (similar to what are called concept maps) or the sequence of actual events. Networks can be used to display both sequences of specific events or properties of a particular group or institution (what they called an event-state network) and hypothesized relationships (usually causal) among categories. Both types of displays can be used either within particular cases or in cross-case analysis.
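The structure of a qualitative matrix can be made concrete with a small sketch. It is only an illustration of the idea of crossing two lists of categories so that each cell holds excerpts and not numbers; the participants, categories, excerpts, and the function names build_matrix and cell are all hypothetical, and Miles and Huberman's own displays are considerably richer.

```python
# Hypothetical coded segments: (participant, category, verbatim excerpt).
coded_segments = [
    ("Teacher A", "workload", "I grade until midnight most days."),
    ("Teacher A", "support", "My mentor checks in every week."),
    ("Teacher B", "workload", "The paperwork never ends."),
    ("Teacher B", "workload", "I skip lunch to finish reports."),
]


def build_matrix(segments):
    """Cross participants (rows) with categories (columns).

    Each cell is a list of the verbatim excerpts coded for that pair.
    """
    matrix = {}
    for participant, category, excerpt in segments:
        matrix.setdefault(participant, {}).setdefault(category, []).append(excerpt)
    return matrix


def cell(matrix, participant, category):
    """Return the excerpts in one cell as text, or a marker for an empty cell."""
    excerpts = matrix.get(participant, {}).get(category, [])
    return " / ".join(excerpts) if excerpts else "(no data)"


matrix = build_matrix(coded_segments)
for participant in ("Teacher A", "Teacher B"):
    for category in ("workload", "support"):
        print(f"{participant} x {category}: {cell(matrix, participant, category)}")
```

Empty cells are kept visible in the output because, in a display of this kind, the absence of data for a participant and category can itself prompt further analysis or data collection.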
Charles Ragin’s qualitative comparative analysis, a method originally developed in political science and sociology but more recently used in other fields as well, is a way of analyzing a collection of cases (traditionally done using qualitative case study methods) in a more systematically comparative way to identify cross-case patterns. It is actually a combination of qualitative and quantitative strategies for identifying different combinations of causal conditions (variables) that can generate the same outcome. It is most useful when the number of cases is larger than qualitative researchers can easily handle but too small for rigorous statistical analysis. Ragin’s 2014 presentation of this approach dropped the term qualitative, titling the book The Comparative Method: Moving Beyond Qualitative and Quantitative Strategies.
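The idea that different combinations of conditions can generate the same outcome can be sketched as a simple truth table. This is a minimal illustration only: it assumes conditions and outcomes are coded as 0 or 1, treats a configuration as sufficient only when every case with that configuration shows the outcome, and omits the Boolean minimization step that a full qualitative comparative analysis would apply. The schools, condition names, and codings are hypothetical.

```python
from collections import defaultdict

# Hypothetical conditions and cases, coded 1 (present) or 0 (absent).
CONDITIONS = ("strong_leadership", "external_funding", "parent_involvement")

cases = [
    {"name": "School 1", "strong_leadership": 1, "external_funding": 0, "parent_involvement": 1, "outcome": 1},
    {"name": "School 2", "strong_leadership": 1, "external_funding": 0, "parent_involvement": 1, "outcome": 1},
    {"name": "School 3", "strong_leadership": 0, "external_funding": 1, "parent_involvement": 1, "outcome": 1},
    {"name": "School 4", "strong_leadership": 0, "external_funding": 0, "parent_involvement": 1, "outcome": 0},
    {"name": "School 5", "strong_leadership": 1, "external_funding": 1, "parent_involvement": 0, "outcome": 0},
    {"name": "School 6", "strong_leadership": 0, "external_funding": 1, "parent_involvement": 1, "outcome": 1},
]


def truth_table(cases, conditions):
    """Group the observed outcomes by each configuration of conditions."""
    rows = defaultdict(list)
    for case in cases:
        configuration = tuple(case[name] for name in conditions)
        rows[configuration].append(case["outcome"])
    return dict(rows)


def sufficient_configurations(table):
    """Return configurations in which every observed case shows the outcome."""
    return sorted(config for config, outcomes in table.items() if all(outcomes))


table = truth_table(cases, CONDITIONS)
for configuration in sufficient_configurations(table):
    present = [name for name, value in zip(CONDITIONS, configuration) if value]
    print("Outcome observed with:", " AND ".join(present))
```

Here two distinct configurations are each associated with the outcome, which is the kind of cross-case pattern the method is designed to identify.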
All these approaches are based on some form of coding and categorization of data. However, there are other ways of doing qualitative data analysis that draw more from the humanities than the social sciences. The most widespread of these is narrative analysis, but this is really a loose collection of rather different approaches to analyzing narrative forms of data. Some of these approaches involve coding and thematic analysis and are thus similar to the types discussed previously. Others focus on the structure of narratives, using strategies drawn from literature or linguistics. However, all of these tend to be more holistic in their approach than are approaches that primarily involve coding, which intrinsically segment or “fracture” the data and re-sort the segments into categories; they focus more on identifying connections within the data and retaining these connections in the analysis.
The more holistic types of narrative research result in rather different forms of presentation of the results of the analysis, and the creation of these forms of presentation may largely constitute the analysis. For example, Irving Seidman, in his book Interviewing as Qualitative Research, described two types of presentation of life history interviews, which he called vignettes and profiles. These are created by rearranging and condensing the interview transcripts, to generate a clearer flow to the narrative, while retaining the interviewee’s own words. Similarly, what Frederick Erickson called ethnographic microanalysis of interaction involves taking observations (usually videotaped and transcribed) of some event, analytically decomposing these, and then reconnecting them to create a holistic portrayal of social interaction. This sort of analytic segmentation and rearrangement of data is common in qualitative case studies, as well as in much narrative research, but has rarely been discussed as a type of analysis.
Other researchers have used poetry as a way to communicate the meaning of interviews, but the analytic strategies that are involved in this are rarely explicit. An exception, which Carolyn Mears called the gateway approach for analyzing and displaying interview material, is presented in her book Interviewing for Education and Social Science Research. Drawing on humanities-based practices, including oral history interviewing, poetic forms of transcription and display, and Elliot Eisner’s educational connoisseurship, Mears created poetic renditions of her interviews, retaining the interviewee’s own language, but editing and rearranging this to better convey the experience and emotion that may be obscured or missed in a verbatim transcription.
It is also possible to combine categorizing and connecting strategies in analysis—not simply by connecting the results of a prior categorizing analysis, as Strauss did with axial and selective coding, but by integrating connecting strategies from the beginning of the analysis. An example is the listening guide approach to analysis, developed by Carol Gilligan and her associates, for analyzing interviews. This approach, which they describe as a voice-centered relational method, involves a series of “listenings” that attempt to identify the “plot” (the story that is being told), the stance of the speaker (identifying the “I” statements and creating a separate document from these, an “I poem”), and different “voices” in the interview; in Gilligan’s original use of this approach, which contrasted men’s and women’s views on moral judgment, these were the voices of justice and of caring, and of a separate and a connected self. However, the particular voices identified depend on the goals of the research and may be inductively developed during the study. Such an approach interweaves categorizing and connecting steps rather than keeping these separate.
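The step of pulling the “I” statements into a separate document can be sketched mechanically, although in practice the researcher decides which words accompany each “I”. The sketch below assumes a fixed window of two following words as a stand-in for that judgment and stops a phrase at punctuation or at the next “I”. The transcript, the function name i_poem, and the window parameter are hypothetical.

```python
import re

# Hypothetical interview excerpt.
transcript = (
    "I wanted to help her, but I didn't know how. "
    "I think I was afraid. She left and I stayed."
)


def is_first_person(token):
    """True for the pronoun I and contractions such as I'm or I've."""
    return token == "I" or token.startswith(("I'", "I\u2019"))


def i_poem(text, window=2):
    """Return each "I" with up to `window` following words, in original order."""
    lines = []
    # Split on punctuation so that a phrase never runs across a clause boundary.
    for clause in re.split(r"[.,;:!?]+", text):
        tokens = clause.split()
        for i, token in enumerate(tokens):
            if not is_first_person(token):
                continue
            phrase = [token]
            for following in tokens[i + 1 : i + 1 + window]:
                if is_first_person(following):
                    break  # the next "I" starts its own line
                phrase.append(following)
            lines.append(" ".join(phrase))
    return lines


for line in i_poem(transcript):
    print(line)
```

Each extracted phrase is printed on its own line, preserving the order in which the statements occurred in the interview.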
The analysis of qualitative data has been substantially transformed by the development of computer-assisted qualitative data analysis software that facilitates many of the processes involved in analyzing data; many different programs are available, with different strengths and limitations. However, unlike quantitative analysis programs, which actually carry out the chosen statistical procedures, qualitative software simply automates some of the actions involved in the analysis; every decision about what categories to use in coding the data, and selecting which segments to code, must still be made by the researcher, although the software can then display the results. In addition, such software is most useful for categorizing analysis; this can lead the researcher to employ this strategy even when the research purposes would be best served by a connecting approach. It is important to keep in mind, though, that the development of qualitative analysis software is progressing rapidly and that any attempt to characterize the field may quickly become out of date.