Teach students how to construct a viable research project based on online sources. Gabe Ignatow and Rada Mihalcea’s An Introduction to Text Mining: Research Design, Data Collection, and Analysis provides a foundation for readers seeking a solid introduction to mining text data. The book covers the most critical issues that must be taken into consideration for research projects, including web scraping and crawling, strategic data selection, data sampling, use of specific text analysis methods, and report writing. In addition to covering technical aspects of various approaches to contemporary text mining and analysis, the book covers ethical and philosophical dimensions of text-based research and social science research design.
Chapter 15: Information Extraction
The goals of Chapter 15 are to help you to do the following:
- Define the task of information extraction (IE) and its applications.
- Explain entity and relation extraction.
- Familiarize yourself with more advanced topics, such as web IE and template filling.
- Learn about existing software and data sets for IE.
Concrete pieces of information, such as the names of people or organizations, or the relations between them, are often buried in unstructured text. Consider, for instance, the following sentence: “Virgin America CEO David Cush said Thursday that there will be ‘continued fare wars’ as American Airlines and Delta fight back against the discounters.” Short as it is, this text mentions several entities: for example, the organization names Virgin America, American Airlines, and Delta; a person’s name, David ...
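To make the idea of pulling entity mentions out of the example sentence concrete, here is a minimal sketch of dictionary (gazetteer) lookup, one of the simplest ways to approach entity extraction. It is an illustration under stated assumptions, not the method the chapter goes on to describe: the `GAZETTEER` table and the `extract_entities` function are hypothetical names, and the table is filled in by hand from the entities named in the sentence above.

```python
import re

# Hypothetical gazetteer: a hand-built lookup table assumed for illustration.
# A real system would use a much larger list or a trained model.
GAZETTEER = {
    "Virgin America": "ORGANIZATION",
    "American Airlines": "ORGANIZATION",
    "Delta": "ORGANIZATION",
    "David Cush": "PERSON",
}


def extract_entities(text, gazetteer=GAZETTEER):
    """Return (mention, label, start_offset) tuples sorted by position in text."""
    found = []
    for name, label in gazetteer.items():
        # Word-boundary anchors keep a name from matching inside a longer word.
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text):
            found.append((match.group(), label, match.start()))
    return sorted(found, key=lambda item: item[2])


sentence = (
    "Virgin America CEO David Cush said Thursday that there will be "
    "'continued fare wars' as American Airlines and Delta fight back "
    "against the discounters."
)

entities = extract_entities(sentence)
for mention, label, start in entities:
    print(f"{start:3d}  {label:12s}  {mention}")
```

A lookup table like this only finds names it already knows and cannot tell an airline called Delta from a river delta, which is part of why entity extraction is usually treated as a learning problem rather than a list-matching one.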