Case
Abstract
Collecting the perfect data to answer a research question is ideal but not always feasible. Existing data can sometimes be leveraged to answer a question, but this opportunity comes with important potential pitfalls. To test the effect of early life exposures, such as being born preterm or living in a materially deprived neighborhood, on early childhood academic performance, we create a longitudinal cohort by deterministically linking records of all singleton live births to Georgia-resident women over a 5-year period to the children's subsequent performance on first-grade Georgia standardized tests. In this case study, we cover the basic features of epidemiologic cohort studies, the construction of the linked dataset for our study, and the assessment of linkage quality and potential for bias.