In October 2014, I embarked on a PhD study to consider the application of novel statistical approaches to model the relationship between risk factors and youth offending. My research addresses how the risk factor-reoffending relationship is conceptualized in youth justice by directly exploring the relationship between risk factors and reoffending. In doing so, it is intended to consider more sensitive measures of key variables than have been typical of previous risk factor research. The analysis discussed here was undertaken to consider what would initially appear to be a simple and straightforward question: do all young offenders live in deprived areas? Investigating this question using administrative data from a local Youth Offending Team, however, gave rise to a number of complications, not least the quality of the data. This case study provides an account of some of the challenges encountered while preparing my data, taking the reader through some of the methodological and analytical problems that arose during the course of the research. As such, it highlights issues that are commonly not taken into account when undertaking analysis of data in the social sciences, including the need to understand types of data together with their distributions.