Researchers in the social sciences and beyond increasingly face massive quantities of text data that require analysis, from historical letters to the constant stream of social media content. Traditional texts on statistical analysis have focused on numbers; this book instead provides a practical introduction to the quantitative analysis of textual data. Using up-to-date R methods, it takes readers through the text analysis process, from text mining and pre-processing to final analysis. It includes two major case studies, one using historical and one using more contemporary text data, to demonstrate the practical applications of these methods. Currently, there is no introductory how-to book on textual data analysis with R that is up-to-date and applicable across the social sciences. Code and a variety of additional resources are available on the book's accompanying website.
Automated textual analysis uses computers to analyze large collections of text. It has become a popular and useful method for extracting information and meaning from text in fields that work with large corpora, such as history, literature, business, medicine, and the social sciences. Text analysis, like statistical big data methods for numerical data, has developed rapidly now that computing resources are more readily available. Many of the big data tools from machine learning can be applied to text once it has been transformed into component parts, such as lists of words and n-grams with their associated frequencies. Such transformations allow text analysis to become a statistical analysis of word (or n-gram) ...
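As a rough illustration of the transformation described above, the following minimal sketch turns a short passage into word and n-gram frequency counts. It is written in Python using only the standard library, not in R as the book is, and it is not taken from the book. The function names `tokenize`, `ngrams`, and `frequency_table` and the sample sentence are hypothetical, and the tokenizer is deliberately crude.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract runs of letters and apostrophes as tokens.

    This is a crude, assumed tokenizer for illustration only; real
    pre-processing usually also handles stop words, stemming, and so on.
    """
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequency_table(text, n=1):
    """Count how often each n-gram occurs in the text (n=1 gives single words)."""
    return Counter(ngrams(tokenize(text), n))


# Hypothetical sample text, not drawn from the book's case studies.
sample = "The cat sat on the mat. The cat slept."

word_counts = frequency_table(sample, n=1)
bigram_counts = frequency_table(sample, n=2)

# Each key is a tuple of tokens; each value is its frequency in the sample.
print(word_counts.most_common(3))
print(bigram_counts.most_common(3))
```

Once text is in this form, each document can be treated as a vector of counts, which is the kind of numerical input that standard statistical and machine learning tools expect. Note that this sketch ignores sentence boundaries, so a bigram such as ("mat", "the") spans two sentences; whether that is acceptable depends on the analysis.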