You can preview and download the dataset from this tab. The dataset is available in multiple file formats, compatible with most common software packages. You can also view and download the Codebook, which provides information on the structure, contents, and layout of the dataset.
This dataset is designed for teaching basic concepts in modern text analysis. It is a subset of data derived from the 2016 How ISIS Uses Twitter dataset, and the accompanying example demonstrates fundamental concepts such as the corpus, tokenization, n-grams, and the document-term matrix (DTM). The dataset file is accompanied by a Teaching Guide, a Student Guide, and a How-to Guide for Python.
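As a rough illustration of the four concepts named above, the following minimal sketch uses only the Python standard library. It is not taken from the How-to Guide. The two sentences in `corpus` are made-up placeholders rather than records from the dataset, and the helper names `tokenize`, `ngrams`, and `document_term_matrix` are hypothetical.

```python
import re
from collections import Counter

# Corpus: a collection of documents. These two sentences are made-up
# placeholders, NOT rows from the actual dataset.
corpus = [
    "The cat sat on the mat",
    "The dog sat on the log",
]


def tokenize(text):
    """Tokenization: lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n):
    """N-grams: every run of n consecutive tokens, returned as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def document_term_matrix(docs):
    """DTM: one row per document, one column per vocabulary term, cells hold counts."""
    tokenized = [tokenize(doc) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[term] for term in vocab])
    return vocab, rows


tokens = tokenize(corpus[0])
bigrams = ngrams(tokens, 2)
vocab, dtm = document_term_matrix(corpus)

print(tokens)
print(bigrams)
print(vocab)
for row in dtm:
    print(row)
```

Each row of `dtm` corresponds to one document in `corpus`, and each column to one term in `vocab`. A real analysis would typically rely on an established text-processing library and would also handle punctuation, hashtags, and non-English text, which this sketch ignores.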