You can preview and download the dataset from this tab. The dataset is available in multiple file formats, compatible with most common software packages. You can also view and download the Codebook, which provides information on the structure, contents, and layout of the dataset.
This dataset is designed for teaching common pre-processing methods for text analysis. It is a subset of data derived from the 2016 How ISIS Uses Twitter dataset, and the accompanying example demonstrates why pre-processing matters when counting word frequencies in tweets. The dataset file is accompanied by a Teaching Guide, a Student Guide, and a How-to Guide for Python.
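To illustrate why pre-processing changes word-frequency counts, here is a minimal Python sketch. It is not the method from the How-to Guide. The sample sentences, the tiny stop-word list, and the function names (`raw_counts`, `preprocess`, `clean_counts`) are all hypothetical and stand in for the real tweets and for whichever steps the guides actually use.

```python
import re
from collections import Counter

# Hypothetical sample text standing in for tweets; not taken from the dataset.
SAMPLE_TWEETS = [
    "The news is out! Read THE report: https://example.com/report",
    "the report, the news... and The reaction to the News",
]

# A tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "is", "and", "to", "out"}


def raw_counts(texts):
    """Count whitespace-separated tokens with no pre-processing."""
    return Counter(token for text in texts for token in text.split())


def preprocess(text):
    """Lowercase, drop URLs, strip punctuation, and remove stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    tokens = re.findall(r"[a-z0-9']+", text)   # keep word-like tokens only
    return [t for t in tokens if t not in STOP_WORDS]


def clean_counts(texts):
    """Count tokens after applying preprocess() to each text."""
    return Counter(token for text in texts for token in preprocess(text))


if __name__ == "__main__":
    print("Raw:    ", raw_counts(SAMPLE_TWEETS).most_common(5))
    print("Cleaned:", clean_counts(SAMPLE_TWEETS).most_common(5))
```

Without pre-processing, "news", "News", and "news..." are counted as three different tokens, and stop words such as "the" dominate the ranking. After lowercasing, punctuation stripping, and stop-word removal, the variants collapse into a single count per word.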