Selecting, Scraping, and Sampling Big Data Sets from the Internet: Fan Blogs as Exemplar

Published: 2015 | Product: SAGE Research Methods Cases Part 1

This case study uses a study of fan blog commentary to explain how researchers decide precisely what text to gather from the Internet, how they identify where that content is located, how they decide exactly which websites to gather data from, and how they get the data from the Internet into text files for analysis. If the resulting data set is too large for the chosen method of analysis, we offer detailed descriptions of how to apply random sampling to data gathered from multiple websites to ensure representativeness, and how to use random selection when assigning chunks of the sampled data to multiple coders for analysis. Next, we explain how to incorporate tests for intercoder reliability into the research design. Finally, we explain one method of analysis: thematic analysis using grounded theory with axial coding.
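
As a rough illustration of the sampling and assignment steps summarized above, the following Python sketch draws the same fraction of comments from each website and then deals the sampled items out to coders at random. It is only one possible reading of those steps, not the procedure used in the case: the function names, the 20% sampling fraction, the blog names, and the coder labels are all hypothetical, and the comments are assumed to be already scraped into lists of strings keyed by site.

```python
import random


def sample_per_site(comments_by_site, fraction, seed=0):
    """Draw the same fraction of comments from every site.

    Sampling within each site (rather than from the pooled data) keeps
    every site represented in proportion to its size. Assumption: this
    proportionate scheme is illustrative, not taken from the case.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    sample = []
    for site, comments in sorted(comments_by_site.items()):
        # At least one comment per non-empty site, never more than exist.
        k = min(len(comments), max(1, round(len(comments) * fraction)))
        sample.extend((site, text) for text in rng.sample(comments, k))
    return sample


def assign_to_coders(items, coders, seed=0):
    """Shuffle the sampled items, then deal them out to coders in turn."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return {coder: shuffled[i::len(coders)] for i, coder in enumerate(coders)}


if __name__ == "__main__":
    # Hypothetical stand-in data: two blogs with 10 and 20 comments.
    scraped = {
        "blog_a": [f"comment a{i}" for i in range(10)],
        "blog_b": [f"comment b{i}" for i in range(20)],
    }
    sampled = sample_per_site(scraped, fraction=0.2, seed=1)
    workload = assign_to_coders(sampled, ["coder_1", "coder_2", "coder_3"], seed=1)
    for coder, chunk in workload.items():
        print(coder, len(chunk))
```

Fixing the seed makes both the sample and the coder assignment reproducible, which helps when the design has to be documented or audited later.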
