In this guide you will learn how to produce a one-way Analysis of Variance (ANOVA) in IBM® SPSS® Statistics software (SPSS), using a practical example to illustrate the process. You are provided with links to the example dataset and are encouraged to replicate this example. An additional practice example is suggested at the end of this guide. The example assumes you have already opened the data file in SPSS.
One-way ANOVA is a method used to test whether the mean value of a continuous variable differs across two or more subsets of the data. Those subsets are generally defined by the categories of a categorical variable. For example, you might compute the average weight for people living in three regions of a country to see if the average weight differs across those regions. In this way, one-way ANOVA allows researchers to explore whether a continuous variable (e.g. weight) and a categorical variable (e.g. region) are related to each other.
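The weight-by-region example above can be sketched in a few lines of Python to show what the test computes. This is only an illustration: the function name `one_way_anova` and the nine weights are hypothetical, not taken from any dataset, and SPSS performs the equivalent arithmetic for you.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total

    # Between-groups sum of squares: each group mean versus the grand mean,
    # weighted by the number of cases in the group.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: each value versus its own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical weights (kg) for people in three regions; illustrative only.
north = [70.0, 72.0, 74.0]
south = [80.0, 82.0, 84.0]
west = [75.0, 77.0, 79.0]

f_stat, df_between, df_within = one_way_anova([north, south, west])
print(f_stat, df_between, df_within)
```

A large F value indicates that the group means differ by more than the spread within the groups would lead us to expect.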
This example illustrates one-way ANOVA using two variables from the Pew Research Center’s Project for Excellence in Journalism News Coverage Index for 2012. There are 11,475 observations in the dataset. The two variables examined are:
Duration of the news story, in seconds (durasec)
Geographic focus of the news story (focus)
The duration variable ranges from a low of 3 seconds to a high of 1136 seconds, with a mean of 126.8 seconds and a standard deviation of 119.3 seconds. The duration variable is a continuous variable and the geographic focus variable is a categorical variable, making these variables appropriate for a one-way ANOVA.
When conducting a one-way ANOVA, it is often wise to examine each variable in isolation first. We start by presenting a histogram of the duration of news stories. This is done in SPSS by selecting from the menu:
Analyze → Descriptive Statistics → Explore
In the Explore dialog box that opens, move the duration of the story variable (durasec) into the Dependent List: box. On the right-hand side of the Explore dialog box, click the “Plots” button. This opens another dialog box, where you can select the plots you want to produce. For this example, just check “Histogram” under the Descriptive heading. Then click Continue and OK to perform the analysis.
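For readers who want to see what a histogram tallies, the sketch below groups durations into fixed-width bins. It is a rough illustration, not the SPSS procedure: the helper `bin_counts`, the 100-second bin width, and the duration values are all assumptions (only the minimum of 3 and the maximum of 1136 echo figures reported in this guide).

```python
from collections import Counter


def bin_counts(values, width):
    """Count how many values fall into each [start, start + width) bin."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical story durations in seconds; NOT the actual dataset.
durations = [3, 25, 40, 95, 110, 130, 150, 180, 210, 420, 1136]

bins = bin_counts(durations, 100)
for start, n in bins.items():
    # One '#' per story whose duration falls in the bin.
    print(f"{start:>5}-{start + 99:<5} {'#' * n}")
```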
We should also produce a frequency distribution of the variable named focus. This is done in SPSS by selecting from the menu:
Analyze → Descriptive Statistics → Frequencies
In the Frequencies dialog box that opens, move the geographic focus variable (focus) into the Variable(s) box and click OK.
Screenshots for the procedures to produce histograms and frequency distributions in SPSS are available in the How-to Guides for the Dispersion of a Continuous Variable and Frequency Distribution topics, respectively.
You can estimate a one-way ANOVA model in SPSS by selecting from the menu:
Analyze → Compare Means → One-Way ANOVA
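In the One-Way ANOVA dialog box that opens, you move:
The duration variable (durasec) into the Dependent List: box
The geographic focus variable (focus) into the Factor: box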
Figure 2 shows what this looks like in SPSS.
Once you are done, click OK to perform the analysis.
SPSS will produce one figure and two tables based on the procedures outlined above. The histogram of durasec is presented in Figure 3.
The values are clustered at duration times of 200 seconds or less, meaning that the vast majority of news stories lasted about 3 minutes or less. There are a few extreme values, with the largest being 1136 seconds, which is almost 19 minutes long. Researchers may want to explore whether cases with these extreme values have undue influence on the analysis.
Figure 4 presents a frequency distribution of the geographic focus of TV news stories. It shows that 8683 (75.7%) of stories aired on TV news shows focused on U.S. national topics, 1599 (13.9%) focused on international stories involving the U.S., and just 1193 (10.4%) focused on international stories that did not involve the U.S.
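The percentages reported for Figure 4 can be checked directly from the counts. The sketch below uses only the three counts stated in this guide; the category labels are shortened descriptions chosen for illustration and may not match the value labels in the data file.

```python
# Counts as reported in this guide; labels are illustrative shorthand.
counts = {
    "U.S. national": 8683,
    "International, U.S. involved": 1599,
    "International, U.S. not involved": 1193,
}

total = sum(counts.values())
# Each category's share of all stories, rounded to one decimal place.
percents = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, pct in percents.items():
    print(f"{label}: {counts[label]} ({pct}%)")
print("Total:", total)
```

The counts sum to the 11,475 observations in the dataset, which confirms that no stories are missing a value on this variable.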
Figure 5 presents the results of the one-way ANOVA.
Figure 5 reports the Between Groups, Within Groups, and Total sums of squares, along with their associated degrees of freedom. The Mean Square for Between Groups and for Within Groups is the respective sum of squares divided by its degrees of freedom. The F statistic shown in the table is the Mean Square between groups divided by the Mean Square within groups. The results shown in Figure 5 lead us to reject the null hypothesis that the mean length of news stories does not differ by geographic focus. The F statistic equals 92.84 and the associated p-value is well below the 0.05 threshold. Thus, we conclude that the length of a news story and its geographic focus are related to each other. Notice that these results do not tell us how the two variables are related. Additional analysis, such as comparing the mean story length for each category of geographic focus, should be carried out to explore that question.
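As a rough cross-check of the reported result, the degrees of freedom and an approximate p-value can be worked out from figures given in this guide. The sketch assumes all 11,475 observations enter the analysis with no missing values, and it relies on a closed-form upper-tail probability that holds only when the numerator degrees of freedom equal 2. The function name `f_upper_tail_two_df` is hypothetical.

```python
n_total = 11475   # observations reported in this guide
n_groups = 3      # categories of geographic focus
f_stat = 92.84    # F statistic reported for Figure 5

df_between = n_groups - 1
df_within = n_total - n_groups  # assumes no cases are dropped for missing data


def f_upper_tail_two_df(f, df2):
    """P(F > f) for an F distribution with 2 and df2 degrees of freedom.

    Special case only: with 2 numerator degrees of freedom the tail
    probability reduces to (1 + 2f/df2) ** (-df2/2).
    """
    return (1 + 2 * f / df2) ** (-df2 / 2)


p_value = f_upper_tail_two_df(f_stat, df_within)
print(df_between, df_within, p_value)
```

Under these assumptions the p-value is vanishingly small, consistent with the statement that it falls well below 0.05.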
Download the sample dataset and see if you can replicate these results. Then repeat the analysis, this time replacing the geographic focus variable with a variable named placement, which records where within the news program the story was placed.
IBM® SPSS® Statistics software (SPSS) screenshots Republished Courtesy of International Business Machines Corporation, © International Business Machines Corporation. SPSS Inc. was acquired by IBM in October, 2009. IBM, the IBM logo, ibm.com, and SPSS are trademarks or registered trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “IBM Copyright and trademark information” at http://www.ibm.com/legal/copytrade.shtml.