‘Longitudinal’ is a broad term. It can be defined as research in which: (1) data are collected for two or more distinct periods (implying the notion of repeated measurements); (2) the subjects or cases analysed are the same, or at least comparable, from one period to the next; and (3) the analysis involves some comparison of data between or among periods (Menard, 1991: 4).
There are a number of different designs for the construction of longitudinal evidence: repeated cross-sectional studies; prospective studies, such as household panel surveys or cohort panels; and retrospective studies, such as life and work histories and oral histories.
In the social sciences, cross-sectional observations are the form of data most commonly used for assessing the determinants of behaviour (Davies, 1994; Blossfeld and Rohwer, 1995). However, the cross-sectional survey, because it is conducted at just one point in time, is not suited for the study of social change. It is therefore common for cross-sectional data to be recorded in a succession of surveys at two or more points in time, with a new sample on each occasion. These samples either contain entirely different sets of cases for each period, or the overlap is so small as to be considered negligible. Where cross-sectional data are repeated over time with a high level of consistency between questions, it is possible to incorporate a time trend into the analysis. Examples of repeated cross-sectional social surveys are: the UK's General Household Survey and Family Expenditure Survey, and the EU's Eurobarometer Surveys.
The temporal data most often available to social researchers are panel data, in which the same individuals are interviewed repeatedly across time. Variations of this design (Buck et al., 1994: 21-2; Ruspini, 2002) include:
Household Panel Studies (HPS). A random sample of respondents with repeated data collections from the same individuals at fixed intervals (usually, but not necessarily, annually). HPS trace individuals at regular discrete points in time: they seek to discover what happens/has happened to the same subjects over a certain period of time. Thus, the fundamental feature they offer is that they make it possible to detect and establish the nature of individual change. For this reason, they are well-suited to the statistical analysis of both social change and dynamic behaviour. Among the best known prospective panel studies are the US Panel Study of Income Dynamics (PSID), the British Household Panel Survey (BHPS) and the German Socio-Economic Panel (SOEP).
Cohort Panels. A specific form of panel study that takes the process of generation replacement explicitly into account. A cohort is defined as those people within a geographically or otherwise delineated population who experienced the same significant life event or events within a given period of time. A random sample of the individuals in the cohort is followed over time. Usually a researcher will choose one or more birth cohorts and administer a questionnaire to a sample drawn from within that group: thus longitudinal analysis is used on groups that are homogeneous and a number of generations are followed, over time, throughout their life courses. The interest is usually in the study of long-term change and in individual development processes. Such studies typically re-interview every five years. If, in each particular generation the same people are investigated, a cohort study amounts to a series of panel studies; if, in each generation, at each period of observation, a new sample is drawn, a cohort study consists of a series of trend studies (Hagenaars, 1990). Examples are the UK National Child Development Study and the German Life History Study.
Linked or Administrative Panels. In these cases data items which are not collected primarily for panel purposes (census or administrative data) are linked together using unique personal identifiers (the combination of name, birth date and place of birth is normally enough to identify individuals and enable linkage of administrative and/or other records). This is the least intrusive method of collecting longitudinal data (Buck et al., 1994).
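The linkage step described for linked or administrative panels can be sketched as follows. This is a minimal illustration only: the record layout, the field names (`name`, `birth_date`, `birth_place`, `tenure`) and the two small data sources are hypothetical, and real linkage exercises usually need more careful handling of spelling variants and missing values than the simple normalisation assumed here.

```python
from collections import defaultdict


def link_key(record):
    """Build a linkage key from name, birth date and place of birth.

    Assumption: trimming whitespace and lower-casing is enough to make
    the identifying fields comparable across sources.
    """
    return (
        record["name"].strip().lower(),
        record["birth_date"],
        record["birth_place"].strip().lower(),
    )


def link_records(*sources):
    """Group records from several (source_name, records) pairs by linkage key."""
    linked = defaultdict(list)
    for source_name, records in sources:
        for record in records:
            linked[link_key(record)].append((source_name, record))
    return dict(linked)


# Hypothetical extracts from two data collections ten years apart.
census_1991 = [
    {"name": "Ann Lee", "birth_date": "1960-03-14", "birth_place": "Leeds", "tenure": "renter"},
    {"name": "Bob Ray", "birth_date": "1955-11-02", "birth_place": "Cardiff", "tenure": "owner"},
]
census_2001 = [
    {"name": "ann lee ", "birth_date": "1960-03-14", "birth_place": "Leeds", "tenure": "owner"},
]

panel = link_records(("census_1991", census_1991), ("census_2001", census_2001))

# Each linked individual now carries one observation per source in which
# he or she was found; unmatched individuals have a single observation.
for key, observations in panel.items():
    print(key, [(source, record["tenure"]) for source, record in observations])
```

Individuals found in only one source remain in the result with a single observation, which mirrors the discontinuous coverage that such panels often have.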
All the data types discussed so far have been recorded with reference to fixed and predetermined time points. But, for many processes within the social sciences, continuous measurement may be the most suitable method of empirically assessing social change. When data are recorded in continuous time, the number and sequence of events and the duration between them can all be calculated. Data recorded in continuous time are often collected retrospectively via life history studies that question backwards over the whole life course of individuals. The main advantage of this approach lies in the greater detail and precision of information (Blossfeld and Rohwer, 1995). A good example is the UK 1980 Women and Employment Survey, which obtained very detailed past work histories from a nationally representative sample of women of working age in Britain.1
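To illustrate what continuous-time recording makes possible, the sketch below derives the number of events, their sequence and the duration of each spell from a single work history. The history, the state labels and the interview date are invented for the example, and treating the final spell as still open at the interview (right-censored) is an assumption of this sketch rather than a feature of any particular survey.

```python
from datetime import date

# Hypothetical retrospective work history: each entry gives the state
# entered and the date on which it began.
history = [
    ("employed", date(1970, 1, 1)),
    ("not employed", date(1974, 7, 1)),
    ("employed", date(1978, 1, 1)),
]
interview_date = date(1980, 6, 1)  # assumed end of the observation window


def build_spells(history, end_of_observation):
    """Turn dated state changes into spells with durations in days."""
    spells = []
    for i, (state, start) in enumerate(history):
        is_last = i + 1 == len(history)
        end = end_of_observation if is_last else history[i + 1][1]
        spells.append({
            "state": state,
            "start": start,
            "end": end,
            "days": (end - start).days,
            # The last spell is still in progress at the interview.
            "censored": is_last,
        })
    return spells


spells = build_spells(history, interview_date)
number_of_events = len(history) - 1           # transitions between states
sequence = [spell["state"] for spell in spells]

print("events:", number_of_events)
print("sequence:", " -> ".join(sequence))
for spell in spells:
    print(spell["state"], spell["days"], "days", "(censored)" if spell["censored"] else "")
```

With data recorded only at fixed survey points, none of these quantities could be recovered exactly, because state changes between interviews would go unobserved.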
Longitudinal data: allow the analysis of the duration of social phenomena; permit the measurement of differences or change from one period to another in the values of one or more variables; make it possible to explain these changes in terms of certain other characteristics, which can be stable (such as gender) or unstable (that is, time-varying, such as income) (van der Kamp and Bijleveld, 1998: 3); and can be used to locate the causes of social phenomena and ‘sleeper effects’, that is, connections between events that are widely separated in time (Hakim, 1987).
Insights into processes of social change can thus be greatly enhanced by making more extensive use of longitudinal data. Dynamic data are the necessary empirical basis for a new type of dynamic thinking about the processes of social change (Gershuny, 1998). The possibility of developing research based on longitudinal data also builds a bridge between ‘quantitative’ and ‘qualitative’ research traditions and enables the concepts of qualitative and quantitative to be re-shaped (Ruspini, 1999). Longitudinal surveys usually combine both extensive and intensive approaches (Davies and Dale, 1994). Life history surveys facilitate the construction of individual trajectories since they collect continuous information throughout the life course. Panel data trace individuals and households through historical time: information is gathered about them at regular intervals. Moreover, they often include relevant retrospective information, so that the respondents have continuous records in key fields from the beginning of their lives. As an example, the British Household Panel Survey took the opportunity (over the first three waves) to get a very good picture of respondents’ previous lives by asking for life-time retrospective work, marital and fertility histories. Longitudinal analysis thus presupposes the development of a methodological mix where neither of the two aspects alone is sufficient to produce an accurate picture of social dynamics (Mingione, 1999).
However, although dynamic data have the potential to provide richer information about individual behaviour, their use poses theoretical and methodological problems. In addition, longitudinal research typically costs more and can be very time-consuming.
The principal limitations of the repeated cross-sectional design are its inappropriateness for studying developmental patterns within cohorts and its inability to resolve issues of causal order. Both of these limitations result directly from the fact that in a repeated cross-sectional design, the same cases are neither measured repeatedly nor for multiple periods (Menard, 1991). Thus, more data are required to characterise empirically the dynamic process that lies behind the cross-sectional snapshot (Davies, 1994).
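A small worked example may help to show why the cross-sectional snapshot is not enough. In the sketch below, two waves have exactly the same poverty rate, so a repeated cross-sectional comparison would report no change, yet following the same individuals reveals that half of them changed state. The four cases and their states are invented purely for illustration.

```python
from collections import Counter

# Hypothetical states of the same four individuals at two waves.
wave1 = {"p1": "poor", "p2": "poor", "p3": "not poor", "p4": "not poor"}
wave2 = {"p1": "not poor", "p2": "poor", "p3": "poor", "p4": "not poor"}


def poverty_rate(wave):
    """Share of cases classified as poor in one wave (the cross-sectional view)."""
    return sum(state == "poor" for state in wave.values()) / len(wave)


# Net change: all that two independent cross-sections could show.
net_change = poverty_rate(wave2) - poverty_rate(wave1)

# Gross change: individual transitions, visible only when the same
# cases are observed at both waves.
transitions = Counter((wave1[person], wave2[person]) for person in wave1)
movers = sum(count for (before, after), count in transitions.items() if before != after)

print("poverty rate, wave 1:", poverty_rate(wave1))
print("poverty rate, wave 2:", poverty_rate(wave2))
print("net change:", net_change)
print("individuals who changed state:", movers)
```

The net change is zero because one exit from poverty is offset by one entry into it; only the transition table distinguishes this turnover from a situation in which the same people remain poor.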
Concerning panel data, the main operational problems with prospective studies (other than linked panels) (Magnusson and Bergmann, 1990; Menard, 1991; Duncan, 1992; Blossfeld and Rohwer, 1995; Rose, 2000) are:
Panel attrition. If the same set of cases is used in each period, there may be some variation from one period to another as a result of missing data (due to refusals, changes of residence or death of the respondent). When those who drop out differ systematically from those who remain, the resulting differences between waves bias estimates. For example, a major problem in most surveys on poverty is the under-sampling of poor people: they are hard to contact (and therefore usually undersampled in the first wave of data) and hard to retain for successive annual interviews. Even though weight variables could be used to mitigate under-representation, it is difficult to assess the real effectiveness of such weights.
Course of events. Since there is only information on the states of the units at predetermined survey points (discrete time points), the course of the events between the discrete points in time remains unknown.
Panel conditioning. Precisely because they are repeated, panel studies tend to influence the phenomena that they are hoping to observe. It is possible that responses given in one wave will be influenced by participation in previous waves (Trivellato, 1999). During subsequent waves, interviewees often answer differently from how they answered at the first wave due solely to their experience of being interviewed previously. This may occur because they have lost some of their inhibitions or because, having been sensitised by the questioning in previous waves, they have acquired new information that they would not otherwise have acquired (Duncan, 2000).
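The attrition problem listed above, and the use of weights to mitigate it, can be sketched with a toy panel. The sample, the group labels and the retention pattern are hypothetical, and the weighting scheme (the inverse of each group's retention rate, based on wave 1 status) is only one simple possibility. It restores the wave 1 composition of the sample but cannot correct for drop-out related to characteristics that were not observed, which is one reason why the real effectiveness of such weights is hard to assess.

```python
from collections import Counter

# Hypothetical wave 1 sample: four poor and six non-poor respondents.
wave1_status = {
    "p01": "poor", "p02": "poor", "p03": "poor", "p04": "poor",
    "p05": "not poor", "p06": "not poor", "p07": "not poor",
    "p08": "not poor", "p09": "not poor", "p10": "not poor",
}
# Hypothetical wave 2 respondents: poor cases are lost more often.
wave2_respondents = {"p01", "p02", "p05", "p06", "p07", "p08", "p09"}


def group_counts(person_ids):
    """Count cases by their wave 1 status."""
    return Counter(wave1_status[person] for person in person_ids)


n_start = group_counts(wave1_status)
n_kept = group_counts(wave2_respondents)

# Inverse retention weights (assumes every group keeps at least one case).
weights = {group: n_start[group] / n_kept[group] for group in n_start}

share_wave1 = n_start["poor"] / sum(n_start.values())
share_unweighted = n_kept["poor"] / sum(n_kept.values())
weighted_total = sum(weights[group] * n_kept[group] for group in n_kept)
share_weighted = weights["poor"] * n_kept["poor"] / weighted_total

print("share poor at wave 1:", share_wave1)
print("share poor among wave 2 respondents, unweighted:", share_unweighted)
print("share poor among wave 2 respondents, weighted:", share_weighted)
```

In this example the unweighted wave 2 sample under-represents those who were poor at wave 1, while the weighted share matches the wave 1 figure by construction.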
Consequently, the potential of panel data can only be fully realised if such data meet high quality standards (Duncan, 1992). In particular, Trivellato (1999) stated that for a panel survey to be successful, the key ingredients are a good initial sample and appropriate following rules, that is, a set of rules that permit mimicking the population that almost always changes in composition over time. Taking the British Household Panel Survey as an example, because the BHPS tracks household formation and dissolution, individuals may join and leave the sample. Thus, the study has a number of following rules determining who is eligible to be interviewed at each wave. New eligibility for sample inclusion could occur between waves in the following ways: (a) a baby is born to an Original Sample Member (OSM); (b) an OSM moves into a household with one or more new people; (c) one or more new people move in with an OSM (Freed Taylor et al., 1995).
The drawback of linked panels is that they can only provide a very limited range of information and often on a highly discontinuous temporal basis (as in the case of a census). Moreover, such panels suffer from problems of confidentiality and of data protection legislation, so there is often only very limited access (Buck et al., 1994).
Although retrospective data usually have the advantage of being cheaper to collect than panel data, retrospective studies suffer from several limitations (Davies and Dale, 1994; Blossfeld and Rohwer, 1995):
Recall bias. Many subjects simply forget things about events, feelings, or considerations, and even when an event has not been wholly forgotten, they may have trouble recalling it (memory loss and retrieval problems). Retrospective questions concerning motivational, attitudinal, cognitive or affective states are particularly problematic because respondents find it hard to accurately recall the timing of changes in these states.
Tolerance. Retrospective surveys tend to be quite lengthy. There is a limit to respondents’ tolerance for the amount of data that can be collected on one occasion.
Reinterpretation. The way in which individuals interpret their own past behaviour will be influenced by subsequent events in their lives. Subjects tend to interpret and re-interpret events, opinions and feelings so that they fit in with their own current perceptions of their present and past lives and constitute a sequence of events that ‘bears some logic’ (van der Kamp and Bijleveld, 1998).
Misrepresentation. Like panel studies, retrospective studies are also subject to distortions caused by changes within the sample, changes brought about by death, emigration or even a refusal to continue.
The use of longitudinal data (both prospective and retrospective) can ensure a more complete approach to empirical research. Longitudinal data are collected in a time sequence that clarifies the direction as well as the magnitude of change among variables. However, the world of longitudinal research is quite heterogeneous. Some important general suggestions are (Menard, 1991):
- If the measurement of change is not a concern, if causal and temporal order are known, or if there is no concern with causal relationships, then cross-sectional data and analysis may be sufficient. Repeated cross-sectional designs may be appropriate if it is thought that the problem of panel conditioning may arise.
- If change is to be measured over a long span of time, then a prospective panel design is the most appropriate, because independent samples may differ from one another unless both formal and informal procedures for sampling and data collection are rigidly replicated for each wave of data. Within this context, it is important to remember that some time must pass before an analysis of social change becomes feasible: a sufficient number of waves is necessary to permit in-depth long-term analyses to be carried out.
- If change is to be measured over a relatively short time (weeks or months), then a retrospective design may be appropriate for data on events or behaviour (but probably not for attitudes or beliefs).
- In order to combine the strengths of panel designs and the virtues of retrospective studies, a mixed design employing a follow-up and a follow-back strategy seems appropriate (Blossfeld and Rohwer, 1995).
Finally, due to the complexity of longitudinal data sets, user documentation is crucial for the researcher. It should contain essential information required for the analysis of the data (including details of fieldwork, sampling, weighting and imputation procedures) and information to assist users in linking and aggregating data across waves. The documentation should both make the analysis easier and more straightforward and help evaluate data quality.
This is a revised and updated version of an article first published in Social Research Update, 28 (Department of Sociology, University of Surrey).
1 Strictly speaking, longitudinal studies are limited to prospective studies, while retrospective studies have been defined as a quasi-longitudinal design, since they do not offer the same strengths for research on causal processes because of distortions due to inaccuracies in memories (Hakim, 1987: 97).