The further analysis of an existing data set with the aim of addressing a research question distinct from that for which the data set was originally collected, and generating novel interpretations and conclusions.
The use of secondary analysis with large-scale survey data, using statistical analysis techniques, is most common, and has been well documented. A researcher may re-analyse a data set that he or she has collected earlier, or may use a data set collected by another researcher, or team of researchers. The technique can be distinguished from meta-analysis, which attempts to summarize the already interpreted results of a number of primary research studies; in contrast, secondary analysis attempts to re-interpret the original data set in relation to a new research question.
The first stage of secondary analysis is to obtain a suitable data set. Data sets are available from databases, which contain information about the context (time of collection, sampling techniques employed and so on) of each data set they offer. The data sets available are of high quality, often the result of years of data collection by a team of highly qualified experts. The main UK database is the UK Data Archive, housed at the University of Essex (www.data-archive.ac.uk). Most developed countries have an equivalent national database. Both government data sets and those gathered by academics and social scientists are important resources for secondary analysis. Data sets can be continuous, that is, gathered on a number of occasions over a period of time, or ad hoc, that is, carried out just once in response to a specific research request (Procter, 1993). An important example of a continuous data set is the UK government's General Household Survey (GHS), carried out every year since 1971. An example of an ad hoc survey is the Women and Employment survey carried out in 1984 by the UK government's Office of Population Censuses and Surveys (OPCS). Other data sets have been generated precisely for the purpose of providing a secondary analysis resource; an example is the General Social Survey collected by US academics at the National Opinion Research Center (NORC) at the University of Chicago. While secondary analysis has traditionally been more widely used with quantitative data sets, the analysis of qualitative data sets has recently received increasing attention. A collection of qualitative data sets, ‘Qualidata’, is now available at the UK Data Archive at Essex (www.qualidata.essex.ac.uk).
Having accessed a suitable data set, the next step is to prepare it for re-analysis, which may involve recoding the original variables (this can be done using a statistical software package such as SPSS). Stewart (1984) has outlined a number of questions that should be answered prior to engaging in secondary analysis, including: What was the purpose of the original research? What sampling procedure was used? What information was collected? Is the original data set valid and reliable?
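As an illustration of what recoding can involve, the following is a minimal sketch in Python rather than SPSS. It assumes a tiny set of invented respondent records: the variable names (age, empstat), the numeric codes and the new categories are all hypothetical and are not taken from any real survey. It shows two common operations: collapsing the categories of an existing variable, and banding a continuous variable.

```python
# Hypothetical respondent records; the variable names and codes are invented
# for illustration and do not come from any real survey.
records = [
    {"id": 1, "age": 23, "empstat": 1},
    {"id": 2, "age": 41, "empstat": 2},
    {"id": 3, "age": 67, "empstat": 4},
    {"id": 4, "age": 35, "empstat": 9},  # 9 = assumed "not answered" code
]

# Assumed original coding: 1 = full-time, 2 = part-time,
# 3 = unemployed, 4 = retired.
# Collapse it into the two categories the (hypothetical) new question needs.
EMPSTAT_RECODE = {1: "in work", 2: "in work", 3: "not in work", 4: "not in work"}


def age_band(age):
    """Recode age in years into three broad bands."""
    if age < 30:
        return "16-29"
    if age < 60:
        return "30-59"
    return "60+"


def recode(record):
    """Return a copy of the record with derived variables added.

    Original values are kept so the recoding can be checked later.
    Codes absent from the mapping (such as 9) become None, i.e. missing.
    """
    new = dict(record)
    new["age_band"] = age_band(record["age"])
    new["in_work"] = EMPSTAT_RECODE.get(record["empstat"])
    return new


recoded = [recode(r) for r in records]
for r in recoded:
    print(r["id"], r["age_band"], r["in_work"])
```

Keeping the original variables alongside the derived ones is a deliberate choice in this sketch: it allows the secondary analyst to check each recode against the documentation supplied with the data set.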
Secondary analysis is an important technique for certain types of research, in particular research requiring large-scale and longitudinal data sets. Using data that have been collected by a specialist team of experts not only maximizes data quality, but also proves efficient in terms of time and cost. Large data sets often contain a wealth of information that is not exhausted in relation to one particular research question. Secondary analysis both facilitates research where the data required would be too costly and time-consuming for the researcher to collect themselves, and can enable access to special populations due to the sheer size of the data sets often available (Procter, 1993). Longitudinal studies are also facilitated through access to continuous data sets such as the GHS. Instead of having to plan the collection of primary data, the secondary analyst is freed up to focus on analysis and interpretation.
A number of drawbacks emerge when evaluating secondary analysis, however. The researcher is not able to make decisions concerning the way the data are collected, and therefore must be careful to assess the original data set on a number of key aspects (such as those identified by Stewart, 1984) in order to decide whether it is suitable for the new research question. There is always the danger in secondary research of tailoring the research question to fit the data (Procter, 1993). Lack of contact with the primary researchers can be a problem if further questions emerge about the context of data collection that are not addressed in the annotations provided in the relevant database. This issue of the context of the original data collection procedures may be especially important in the re-analysis of qualitative data sets; indeed, the question has been raised of whether it is possible for a secondary analyst to engage properly with qualitative data that they were not involved in collecting. The researcher engaging in secondary analysis must also be aware of ethical requirements, and of the extent to which the original ethical procedures (such as obtaining informed consent) extend to the new analysis being carried out.