Local communities often struggle to have their public affairs information needs met. The authors of this case identified one way in which local newsrooms could better inform and engage their users: embedding news quizzes on their websites. This case study overviews the controlled and field experiments used to test the effects of news quizzes. The methods, and their limitations, are described, along with the decisions that the researchers made related to the participants, stimuli, measures, and research partner used in the project. The researchers also provide practical details about how they collaborated with a survey research firm, built a relationship with a newsroom, and statistically analyzed the data. The case concludes by suggesting some general lessons learned: Social science research is collaborative, multi-methodological approaches help build confidence in research findings, and research that builds bridges between academics and practitioners is important.
By the end of this case, students should be able to
- Understand the types of decisions that researchers make when designing experiments
- Differentiate between a controlled experiment and a field experiment
- Evaluate the strengths and weaknesses of a controlled experiment and a field experiment
- Assess ethical issues associated with data anonymity
- Describe possible statistical approaches for dealing with challenging behavioral data
Project Overview and Context
One of the most important functions of the news media is to inform citizens about their communities. Yet research suggests that coverage often falls short of this goal (Waldman, 2011). Although there are many examples in which local news does not live up to its potential of informing communities, one example in particular is important to this case. A previous study that we conducted found that nearly half of local news websites used polls—interactive tools where visitors to the site can answer questions about a topic (Stroud, Muddiman, & Scacco, 2015). Newsrooms likely include polls on a site to encourage site visitors to interact with a webpage and stay on that page longer, thus generating more ad revenue. However, online polls are problematic for a few reasons. First, the results of the polls often are made public, even though they are nonscientific. Anyone who visits a news website can take the poll, so the results are not necessarily representative of a local newsroom’s community. Some people may even try to game the system by taking the poll multiple times. Second, many of the polls we found were not about substantive issues at all. For instance, one station from Oregon asked site visitors about their favorite season of the year. A newsroom in Indiana asked about a political debate and gave its audience the option of selecting “Who cares? It won’t matter who wins,” which is a fairly cynical response for a newsroom to promote.
In the project overviewed here, we wanted to see if we could improve upon news polls by designing a digital tool that could (1) help newsrooms promote learning about substantive issues and (2) fight political misinformation, all while also (3) keeping news users engaged on a site like polls attempt to do. We came up with the idea of testing a news quiz tool that, like polls, would allow individuals to actively interact with a news site, but that, unlike polls, provided correct answers about public policy so that news users could learn from the tool rather than become misinformed by skewed poll results.
Our approach bridges a gap between academic research and practical application. We drew from theory and best research practices to design an experiment and partnered with a local newsroom to test the effects of the quiz we designed in the field. This approach also is championed by the Center for Media Engagement (CME), where we are both faculty research associates. The Center hopes to create “a vibrant news media” by using systematic, academic research to improve business outcomes for journalistic institutions and democratic outcomes for news audiences (About Us, 2018). This case is one example of such an approach. The result of our efforts became a usable quiz tool that has been implemented by newsrooms across the United States. Simultaneously, we hope, the tool has educated people about public affairs topics and encouraged engagement with news websites.
To make sure that our quiz tool idea worked, we needed to test it out on real people to determine whether a quiz could increase learning and engagement with a news page compared with noninteractive, statically presented information. We decided that an experimental approach would work best for this study. Experiments allow researchers to gather human participants, show some of those participants one item and others something else, and find out if the groups react in different ways (see Treadwell, 2017). In the case of the quizzes project, we showed one group of participants paragraphs about political facts and two other groups of participants different types of quizzes about the same facts. We then examined the differences between the groups to see whether participants in one group remembered more information and spent more time on a webpage than participants in the other groups.
Importantly, we used both a controlled experiment, where we gathered participants specifically for the study, and a field experiment, where we partnered with a news organization to test how actual visitors to the outlet’s website reacted to quizzes posted.
The controlled experiment allowed us to recruit participants online, show them political information in either a static paragraph or in quiz form, and then ask them some questions to see if they could remember the correct information. We used this method first for two reasons. First, we could isolate the effect of the quizzes. We could show the participants in our study only the information we wanted to test and limit the other information that the participants saw. This increased our confidence in claiming that the quizzes, and not something else that appeared on a news site (e.g., a news article, a photo, a user comment), influenced knowledge. Second, we were able to gather evidence that quizzes actually helped people learn political information and kept people on a site longer. We could then take this evidence to a newsroom. We wanted to be sure that our idea had potential before asking journalists to take a chance on implementing it.
For the controlled experiment, we needed to make a number of very precise decisions about who we were going to ask to participate in the study (participants), what they were going to see in the study (stimuli), and how we were going to measure knowledge and time on page (measures):
- Participants: We wanted to gather a diverse group of participants who could take the study in a setting that was relatively similar to how they would typically encounter digital news. Thus, we used the online survey firm Research Now (formerly SSI) to gather our participants. We created the study using an online platform called Qualtrics, then emailed a link to the study to Research Now. A Research Now associate then emailed our link to people who had previously agreed to take part in research projects so that they could access our study. This approach met both of our goals. First, the participants were diverse because the survey firm made sure that the characteristics of our participants matched demographic data (e.g., age, sex, race) of the people who use the Internet. Second, participants were able to take the study on an electronic device (e.g., computer, tablet, mobile phone) of their choosing, rather than at a computer in a lab. This made the experience similar to how people encounter digital news on their own time. We discuss additional details of collaborating with Research Now in the “Research Practicalities” section.
- Stimuli: A number of our decisions involved what stimuli we were going to show the different groups of participants. Remember that an experimental design allows researchers to assign different groups of participants to see different things. We knew that we wanted some participants to view static political information presented in paragraph format and others to view news quizzes. Beyond this, we had many decisions to make. What and how much political information should we share with participants? What form should the static information and the quizzes take? How could we even make a working quiz in the first place?
This last question was among the easiest for us to answer: We would work with a computer programmer to design the quiz to make sure it could function both in the study and on live news sites. It is important for researchers to know their areas of expertise; none of us knew how to program a working digital website tool, so it was worthwhile for us to spend time (and research funds) with someone who could do so professionally.
As for the content of the stimuli, experiments cannot cover every topic—that would lead to studies that are too long for most participants. Instead, we provided information about three topics (health care, income taxes, and Social Security funding) that covered a range of political viewpoints and focused on contemporary public issues.
We also decided that the quizzes would include either multiple-choice (e.g., choose 5%, 25%, 50%, or 75%) or slider (e.g., choose 0%–100%) responses to provide different levels of interactivity (see Figure 1). We used available academic research in political science and communication to design these quizzes. We wanted the quizzes to promote different kinds of media interactivity where individuals interact with an electronic interface (Stromer-Galley, 2004) and be in line with the findings of survey researchers who conduct public opinion polling. The slider quiz was designed to be the most interactive, because users could choose a wide variety of responses. Multiple-choice quizzes, alternatively, gave news users a way to potentially recognize an answer even if they may not know the answer without a prompt. The multiple-choice quizzes were designed with four answers to choose from because survey research demonstrates that four to seven response options is the optimal range for closed-ended questions (see Lozano, Garcia-Cueto, & Muniz, 2008). Overall, including two different types of quizzes in the study was important because different types of question formats have different strengths and weaknesses (see Funke, Reips, & Thomas, 2011). We wanted to see if both increased knowledge and engagement or if one type was better than the other.
Figure 1. Multiple-choice and slider quiz tools tested in the study.
We decided that the static information presented to some participants would be shown in paragraph form, to mimic a paragraph in a news story. This same information also was incorporated into both quiz types as pop-up boxes that would either affirm a correct response or correct a wrong response that a participant gave. These pop-ups were inspired by previous research that has examined how misinformation can be remedied if it is immediately corrected after an individual has offered an incorrect guess (Kuklinski, Quirk, Jerit, Schwieder, & Rich, 2000). We wanted individuals to learn about public affairs content from our quiz tool, and we found the pop-up “answers” to be a successful means for allowing participants to learn more compared with the presentation of static information.
- Measures: Our final decision for the controlled experiment concerned how we would measure webpage engagement and participants’ information retention. These measures would be statistically compared to see if they differed across the static and quiz groups, so they were important for us to get right. To determine webpage engagement, we decided to measure the time a participant spent with a page. However, it would be too much to ask participants to simply remember how much time they spent with a news quiz or paragraph with political information. Instead, we set the survey software program Qualtrics to record the amount of time each participant spent with their particular stimulus. Then, we could compare the amount of time people spent with the static information with the amount of time people spent with each type of quiz. To measure information retention, we asked people knowledge questions after they were exposed to all of the topics. For instance, to measure whether the participants could remember the correct information related to Social Security, they were asked: “What percentage of the estimated 2012 federal budget is spent on Social Security?” These two measures—time on page and information retention—allowed us to determine the effects of the quiz tool we created. We found that, compared with the static information, people spent more time with and learned more from both the multiple-choice and slider quizzes.
With all methods, there are limitations. For a controlled experiment, two are of note. First, the participants in our controlled experiment, while similar in characteristics to the population of people who use the Internet, were not necessarily typical news users. People who actually visit news websites may react differently to news quizzes. Second, isolating the effects of quizzes was important to provide support for our concept, but we could not determine whether a quiz would have a similar effect on an actual news website, especially given all of the additional information that would surround a news quiz. The field experiment, to which we now turn, helped counter some of these limitations.
The field experiments allowed us to see if the quizzes increased time on page in an actual news setting as they did in the controlled experiment. We partnered with an actual newsroom to test our quiz design in a real-life setting. The newsroom tested the quiz tool twice. In both cases, they embedded a quiz in one of their actual news stories. For the first test, they randomly showed some site visitors a multiple-choice quiz and other site visitors a slider quiz. For the second test, they showed site visitors two quizzes. The type and order of the quizzes varied for each webpage visitor, with some seeing two multiple-choice quizzes, some seeing two slider quizzes, and some seeing a mix of quiz types. This field test allowed us to use some experimental principles, particularly varying the information that participants saw, while also demonstrating that a quiz could work in a real news setting.
In the field experiment, some of the decisions were similar to those we made in the controlled experiment, particularly those related to the stimuli, but others were different enough to discuss in detail. Specifically, we needed to make decisions about the newsroom partner, the articles on which the quizzes would be posted, and the measures:
- Newsroom partner: After the controlled experiment, we knew that we wanted to partner with a news organization, but we needed to set guidelines about the type of organization that would work best for the study. We preferred to work with a local newsroom, rather than a national news organization like The Washington Post, because local communities have the greatest information needs (Waldman, 2011) and because local news was one of the most popular ways for people in the United States to get their news at the time (Olmstead, Jurkowitz, Mitchell, & Enda, 2013). However, we also needed to partner with a news organization that had an active website with a large number of site visitors. If the local news organization was too small, then we might not collect enough data to test the questions we were interested in answering. We were able to find a local television news station willing to work with us that was located in a top-50 designated media market and that averaged over 250,000 unique website visitors a month.
- Stimuli: As with the controlled experiment, we needed to consider the stimuli that website visitors saw. The quiz tools were embedded within the newsroom’s website next to a news article. The quiz questions asked about information present in the news article. Again, we worked with a computer programmer to make sure that the quiz tools would work on the newsroom’s website and would randomly appear as one type of quiz for some people reading a news article and another type of quiz for other people reading a news article. Unlike the controlled experiment, we as researchers had less control over the topics the quizzes would cover. We, instead, worked with the newsroom on issues that the journalists felt were meaningful to their community. We detail more about this collaboration in the “Research Practicalities” section.
- Measures: Figuring out how to measure our variables of interest was a difficult task in a real-life setting. In the field tests, although we were interested in how much people learned from quizzes, we could not actually ask site visitors knowledge questions like we did in the controlled experiment because the quiz was always embedded in a news article. It would be impossible to know whether site visitors learned from reading the article or from taking the news quiz. Thus, we decided not to measure knowledge gain in the field tests. Instead, we focused only on how site visitors interacted with the quizzes on the site. Specifically, our computer programmer designed the quiz tool to track whether a visitor took a quiz and the time in seconds a visitor remained on the site. This information, known as metadata, was then recorded on a separate website to which we had access. With this information, we were able to add to the findings from the controlled experiment: Site visitors were most engaged with a news site when they could interact with two different types of quizzes.
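The tracking logic described above can be sketched in miniature. This is an illustrative Python sketch only: the actual tool was built by a computer programmer for the newsroom's website, and every function and field name below is ours, not the study's.

```python
import random

def choose_quiz_variant(rng=random):
    """Randomly pick which quiz type a given site visitor sees,
    mirroring the per-visitor randomization built into the tool."""
    return rng.choice(["multiple_choice", "slider"])

def log_visit(visitor_id, variant, took_quiz, seconds_on_page, store):
    """Record the behavioral data the field test tracked: whether the
    visitor took a quiz and how long they stayed on the page."""
    store.append({
        "visitor_id": visitor_id,  # numeric identifier only; no demographics
        "variant": variant,
        "took_quiz": took_quiz,
        "seconds_on_page": seconds_on_page,
    })

# One simulated impression
records = []
log_visit(1, choose_quiz_variant(random.Random(0)), True, 42, records)
```

Because only a numeric identifier is stored per visitor, records like these cannot answer questions about who the quiz takers were, which is the anonymity trade-off the researchers faced.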
As with the controlled experiment, the field experiments had a few limitations. As already mentioned, we could only measure engagement with a webpage, not whether site visitors learned from engaging with a quiz tool above and beyond what they would have learned from reading the news article. In addition, because we used website tracking data, we had only numerical identifiers for each site visitor, not any information about that site visitor’s characteristics. Did only people who were curious about the news engage with the news quizzes? Were young people the most likely to use the quizzes? We could not answer these or similar questions because we had no information about the people who engaged with the quizzes in the study.
Research Practicalities

This research project was both academically derived and practically driven, an orientation that aligns with emergent trends in communication research associated with engaged scholarship (Dempsey & Barge, 2014). To successfully execute a project with both of these goals, we dealt with three critical issues: (1) collaborating with a national survey vendor to conduct the controlled experiment, (2) building a relationship with a local television news outlet that would assist us in testing the quiz tool, and (3) analyzing data that did not neatly conform to the idealized versions of data discussed in statistics courses.
Survey Vendor Collaboration
We worked with Research Now to identify our sample of interest, pilot test and check our controlled experiment, and finally fully launch our study. During this phase of the study, we were in regular contact with Research Now.
Sample of Interest
Our first step was to identify our sample of interest for Research Now. We contacted the Pew Research Center to obtain their latest information on the population of Internet users, derived from a 2012 tracking survey (see page 317 of the study). As our quizzes study was to be conducted in an online setting, we wished to obtain a final sample that closely approximated the demographic breakdown of the Internet population. For instance, the Pew Research Center’s data showed that approximately 49% of U.S. Internet users were women in 2012. We provided this demographic information to Research Now so that when they recruited individuals to take our study, our final research sample would closely approximate this percentage of women. We repeated this process for age, race/ethnicity, and income to obtain a sample that closely matched the characteristics of Internet users as a whole.
Functionality Pilot Test
A good research practice, particularly for researchers engaging in experimental design and execution, is to pilot test a study. This test entails releasing the full study to a sample equal to 10% of your final sample size. In this study, our final sample goal was between 400 and 500 participants, so we opted for a pilot test sample size of 50 participants. Our pilot was designed to examine the functionality of questions, experimental randomization, and experimental manipulations.
First, the design of survey questions—from their order to their wording—can be very important to the quality of participant responses. We wanted to make sure that our survey questions were understandable to audiences and free of mistakes (e.g., spelling and grammatical errors). Although we were very careful in (re)reading our survey many times, survey participants may still pick up on small mistakes that we missed. Our pilot launch of the survey and experiment therefore included an open-ended question at the conclusion of the survey encouraging respondents to leave feedback for us to review. Luckily, the pilot test did not uncover any errors.
Second, careful pilot testing allows researchers to check for successful randomization of participants to experimental groups. Controlled experiments are important methodological tools because they establish causality, or internal validity, between a cause (X) and an effect (Y) (Treadwell, 2017). Experiments only work when you can rule out rival explanations for, and other influences on, the relationship between X and Y. Random assignment to experimental groups allows this to happen. By pilot testing a study, a researcher can determine whether the survey program—in this case Qualtrics—is successfully assigning individuals to experimental groups at random. One indicator of successful randomization is that the groups do not vary on demographic factors such as sex, education, and age. Our pilot indicated that randomization was successful.
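The randomization check described here can be sketched in Python. This is a toy illustration, not the study's code: Qualtrics performs the assignment internally, and all function names below are ours.

```python
import random
from collections import Counter

def assign_groups(participant_ids, n_groups, seed=None):
    """Randomly assign each participant to one of n_groups conditions."""
    rng = random.Random(seed)
    return {pid: rng.randrange(n_groups) for pid in participant_ids}

def female_share_by_group(assignments, is_female):
    """Proportion of women in each group. Roughly equal shares across
    groups (on this and other demographics) indicate successful
    randomization; a lopsided group would be a red flag."""
    totals, females = Counter(), Counter()
    for pid, group in assignments.items():
        totals[group] += 1
        females[group] += is_female[pid]
    return {g: females[g] / totals[g] for g in totals}

# Pilot-sized check: 50 participants across 8 experimental groups
assignments = assign_groups(range(50), n_groups=8, seed=1)
shares = female_share_by_group(
    assignments, {pid: pid % 2 == 0 for pid in range(50)}
)
```

In practice, the same comparison would be repeated for each demographic factor (sex, education, age) before concluding that randomization succeeded.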
Third, a functionality test also checks for whether a researcher’s experimental stimuli are working properly. In the case of our quizzes experiment, we embedded the quiz tool in the Qualtrics survey. To check the functionality of the experimental stimuli, we examined the open-ended responses for problems individuals may have encountered (e.g., a quiz not appearing because of a browser issue). We did not uncover any functionality issues with the quiz tool.
Once we found that the pilot test did not uncover issues associated with the questions, randomization, and manipulation, we notified Research Now that the study was ready for a full launch.
Full Study Launch
We fully launched the quizzes study in October 2012. Our collaboration with Research Now did not end once the study was launched. To ensure that our sample matched the demographic targets, Research Now would release the survey to batches of approximately 100 individuals and then we (as the researchers) would check the demographics of the developing sample. If we found, for example, that our sample did not match our ideal target percentage of individuals with only a high school education, we would alert Research Now, and they would recruit more individuals in this category to take the study.
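The quota-monitoring loop can be sketched as a comparison of running sample proportions against demographic targets. This is a hypothetical Python sketch: the 49% female target echoes the Pew figure cited earlier, and all function and field names are ours.

```python
def quota_gaps(sample, targets, tolerance=0.03):
    """Flag demographic categories whose share of the developing sample
    falls short of its target proportion by more than the tolerance."""
    n = len(sample)
    gaps = {}
    for (attribute, value), target in targets.items():
        share = sum(1 for person in sample if person.get(attribute) == value) / n
        if share < target - tolerance:
            gaps[(attribute, value)] = target - share  # shortfall to recruit
    return gaps

# A developing sample that under-represents women relative to the target
targets = {("sex", "female"): 0.49}
sample = [{"sex": "female"}] * 40 + [{"sex": "male"}] * 60
gaps = quota_gaps(sample, targets)
# female share is 0.40, below the 0.49 target, so the category is flagged
```

A flagged category corresponds to the moment in the study when the researchers alerted Research Now to recruit more individuals from that group.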
When our final sample (n = 456) approximated the demographics of the Internet population and met our minimum sample to achieve statistical power, we stopped data collection. Power is a researcher’s ability to find significant effects if these effects actually exist (Treadwell, 2017). Our target was a minimum of 50 participants per experimental group (eight total groups in this study) to detect potential effects.
News Outlet Collaboration

Following the completion of our controlled experiment, we began conversations with a local television news outlet in one of the top 50 largest media markets in the United States. Our goal was to field test the quiz tool as well as to pilot the outreach process that we would later use with other news outlets that wished to collaborate with us by using quizzes (and other ongoing projects with CME). Engaged scholarship focuses on the shared learnings and understandings that develop between academics and practitioners. In beginning our collaboration with a local news outlet, we understood and appreciated that the journalists we would work with could teach us as much as we could teach them.
Quiz Testing and Programming
The quiz tool we developed in collaboration with a computer programmer was designed as an iframe. We first provided the news outlet with the iframe code to the quiz so that the outlet could test the tool on its website. This process involved the news outlet embedding the quiz on a hidden webpage and providing feedback to us on the look and functionality of the quiz. Initially, we worked with the news outlet and our computer programmer on resizing the quiz to display correctly on the website, as well as across multiple Internet browsers. We also checked (many times) that the quiz was correctly randomizing by examining the news website as well as the logged metadata on a separate CME-based website. Once the quizzes went “live” on the news website, we continued to check for any functionality issues.
In conversation with the journalists about upcoming news stories, we created new quiz questions regarding current public affairs topics in the local community. For each test, the news outlet wrote a news article on the topic, then we worked with journalists at the news outlet to create quiz questions that reflected some of the facts presented in that news article. Because of the nature of the news schedule and what the outlet deemed as newsworthy, we did not control the types of issues covered by the quizzes during the field experiment. These issues involved local vehicle burglaries and student loan debt. This decision may have been beneficial for quiz participation because these issues were considered important for the local news audience.
News Outlet Anonymity
Because news outlets are business entities in the United States, these organizations (rightfully) protect information about their audiences, news production practices, and other strategies. As a condition of our collaboration with the news organization, we agreed to keep the outlet and those journalists with whom we worked anonymous in future discussions of this research. We continue to honor this agreement. This presented a challenge when describing the research in our article in the Journal of Information Technology & Politics. The news outlet could not provide us with proprietary data on the number and makeup of visitors to the website. We subsequently relied on data purchased from comScore to describe unique visits to and overall demographic makeup of the website.
Analyzing Real-Life Behavioral Data

A final practical challenge of conducting experimental research that includes relatively natural human behaviors, such as website clicks or time on page, is that the data obtained are typically messier than the data described in a statistics class. In our case, the time on page data were very skewed (see Treadwell’s, 2017, chapter 7 for a discussion of skewed data). This skew meant that many people spent only a few seconds on the page while one or two spent a lot of time, maybe even an hour, on the page (for an example of skewed data, see Figure 2). The few people who spent a lot of time on the site are known as outliers, or data points that are very different from the rest of the data. Researchers may take a variety of strategies to deal with skewed data, including Winsorizing the data to limit the influence of extreme outliers or using nonparametric statistics designed for non-normally distributed data. We opted for nonparametric statistics to examine time on page engagement for both the controlled and field experiments. The reader can note the use of these statistics when they see the Kruskal–Wallis test for differences between groups (similar to the parametric analysis of variance [ANOVA] test) and the Mann–Whitney U test (similar to the parametric independent-samples t test).
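To make the nonparametric approach concrete, the rank-based idea behind the Mann–Whitney U statistic can be sketched in pure Python. This is an illustration under our own function names, not the study's analysis code; a real analysis would use a statistics package that also supplies p-values.

```python
def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranked = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i..j
        for k in range(i, j + 1):
            ranked[order[k]] = average_rank
        i = j + 1
    return ranked

def mann_whitney_u(a, b):
    """U statistic for two independent samples (statistic only, no p-value)."""
    r = ranks(list(a) + list(b))
    rank_sum_a = sum(r[: len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    return min(u_a, len(a) * len(b) - u_a)

# Hypothetical time-on-page data (seconds) with one extreme outlier
static_times = [4, 6, 5, 7, 3600]
quiz_times = [20, 25, 18, 30, 22]
u = mann_whitney_u(static_times, quiz_times)  # outlier contributes only its rank
```

Because the test works on ranks, the visitor who stayed an hour counts only as "the largest value," not as thousands of seconds, which is why rank-based tests are robust to the kind of skew shown in Figure 2.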
Figure 2. Example of skewed data.
Conclusions and Practical Lessons
This research set out to test a news quiz tool that could then be used by news outlets to engage and educate individuals. We implemented a novel design to establish that the quizzes worked as intended in a controlled environment and then observed how they functioned in a field setting. In the process, we formed collaborative relationships with a national survey vendor and a local newsroom with a large viewership. This research confronted challenges associated with the limitations of controlled and field experiments, the analysis of real-life behavioral data, and data anonymity. Our final result was a usable quiz tool that could be adopted by newsrooms and an approach that CME continued in future research projects. This approach appreciates that social scientific research is intensely collaborative, that multi-methodological approaches increase confidence in study results, and that bridging theory and practice between academics and practitioners is necessary for confronting difficult political and social issues.
Social Science Research as Intensely Collaborative
Although our publication in the Journal of Information Technology & Politics included three authors, we also collaborated closely with a computer programmer, a survey research firm, and a local news station. In other words, many research projects include hidden hands that do not make the author line (or even the acknowledgments). To foster these partnerships, we had to communicate in a clear and organized manner among ourselves and with our collaborators at each stage of the project. We also had to be flexible for collaborators, like journalists, who work on different deadline schedules and who bring diverse knowledge and skill sets to the project. As researchers begin new projects, they should be open to potential partners both inside and outside of university settings.
Multi-Methodological Approaches Increase Confidence
As we explain above, both controlled and field experiments (and, more broadly, all types of research methods) have limitations. Only by taking multiple approaches to the same topic can researchers become more confident in their results. We were able to use a controlled experiment to examine knowledge gain from digital news quizzes and a field test to examine whether those quizzes increased engagement in a real news setting. Researchers can either follow suit, by publishing data from more than one method in a single paper, or they can see what has already been done on a topic and conduct a new study with a novel methodological approach to see if established findings still hold. Researchers can ask themselves questions such as: Do the experimental findings replicate when using a survey? Do the findings from a lab test still hold when a study is moved to a more natural setting? Approaching an issue from many different research angles improves the social scientific understanding of that issue and is an important basis for knowledge creation.
Bridging Theory and Practice Is Necessary
Finally, this research project demonstrates the importance of bridging academic and industry perspectives. We identified an issue with the current news environment (use of online news polls), drew from academic theory to figure out a way to fix the problem (design news quizzes), tested the academic (political knowledge) and practical (time on page) benefits the solution could bring to working newsrooms, and collaborated with a newsroom to learn how the quiz tool worked for that organization. Because the controlled and field experiments found that news quizzes could increase news user engagement and political knowledge, we used this evidence to encourage other news organizations not involved in the study to adopt the news quiz. In fact, CME has put together a quiz creator webpage where journalists can create their own news quizzes to embed in their websites (Quiz Tool, 2018). Over 150 newsrooms have used the quiz tool as of this writing. A research approach that engages with individuals from beyond the academy can plant research-based ideas in everyday industry practices.
The authors would like to thank Dr. Natalie Jomini Stroud and the Center for Media Engagement for their assistance on the project as well as the Democracy Fund for funding the research.
Exercises and Discussion Questions
- What categories of decisions need to be made when designing controlled experiments? Which of these decisions seems most difficult to make? Which seems easiest?
- Under what circumstances might it be beneficial for researchers to consider conducting a field experiment?
- How did previous academic research inform the design of the quiz tool in the study?
- What is one challenge the researchers faced in analyzing human behavioral data? How did the researchers confront this challenge?
- What are some ethical considerations that researchers should consider when conducting a field experiment? For example, describe how the researchers dealt with news outlet anonymity in the study.
Quiz Tool. (2018). Center for Media Engagement. Retrieved from https://mediaengagement.org/quiz-creator/