The National Institutes of Health and several other public and private agencies have funded longitudinal, observational studies of numerous mental and neurological disorders. Funding agencies increasingly push for these data to be readily available to any interested researcher for secondary data analysis. While several databases are well established and widely used, many new, large datasets remain underutilized. Integrated use of biological, behavioral, and brain data is an underdeveloped yet potentially fruitful enterprise. In particular, there is an increasing emphasis on developing biomarkers to track the causes and progression of neurological disorders. Furthermore, it is now possible to combine data across several datasets to answer novel questions that would otherwise be difficult to study within a single research institution or center. Such databases also allow researchers to investigate established topics with larger sample sizes, which reduces the risk of spurious associations and enhances replicability. This case presents an approach to identifying datasets of interest, the logistics of combining data, and analysis considerations.