How-to Guide for IBM® SPSS® Statistics Software

Introduction

In this guide you will learn how to produce a difference of means t-test in IBM® SPSS® Statistics software (SPSS) using a practical example to illustrate the process. You will find links to the example dataset and you are encouraged to replicate this example. An additional practice example is suggested at the end of this guide. The example assumes you have already opened the data file in SPSS.

Contents

- 1 Difference of Means T-Test
- 2 An Example in SPSS: Science Literacy and Gender
- 2.1 The SPSS Procedure
- 2.2 Exploring the SPSS Output
- 3 Your Turn

1 Difference of Means T-Test

A difference of means t-test is a method for testing whether or not the means of a given variable are different between two subsets of the data. Those subsets are typically defined by categories of another variable. For example, you might compute the mean weight for men and women in your sample of data and be interested in determining if those two means are statistically significantly different from each other. Thus the method allows researchers to explore whether a continuous variable (e.g. weight) and a categorical variable (e.g. gender) are related to each other. There are many variants of difference of means testing – this example focuses on the independent samples t-test.
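To make the idea concrete, the following is a minimal Python sketch of the independent samples t-test statistic under the assumption of equal variances. It is not part of the SPSS procedure. The function name pooled_t_test and the weight values are hypothetical and exist only for illustration.

```python
import math
from statistics import mean, variance


def pooled_t_test(group1, group2):
    """Return (t, df) for an independent samples t-test assuming equal variances."""
    n1, n2 = len(group1), len(group2)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_variance = (
        (n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)
    ) / (n1 + n2 - 2)
    # Standard error of the difference between the two group means.
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (mean(group1) - mean(group2)) / standard_error
    return t, n1 + n2 - 2


# Made-up weights in kilograms, for illustration only.
women = [58.0, 62.5, 65.0, 70.0, 61.5]
men = [72.0, 80.5, 77.0, 85.0, 69.5]

t, df = pooled_t_test(women, men)
print(f"t = {t:.2f}, df = {df}")
```

The sign of t depends on which group is listed first: a negative value here means the first group has the lower mean. Converting t to a p-value requires the t distribution with df degrees of freedom, which SPSS does for you.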

2 An Example in SPSS: Science Literacy and Gender

This example uses a subset of data from EB63.1. This extract includes 21,886 respondents. The two variables we examine are:

- Whether a respondent is male or not (male).
- Score on a science ‘quiz’ composed of 13 true/false items (kstot).

The first variable, male, is coded 1 for male respondents and 0 for female respondents. The science knowledge quiz (the measure of science literacy) has a range of 0 to 13. Its mean is about 8.6. Its standard deviation is about 2.6.

2.1 The SPSS Procedure

When conducting a difference of means t-test, it is often wise to examine each variable in isolation first. We start by presenting a histogram of science literacy (kstot). This is done in SPSS by selecting from the menu:

Analyze → Descriptive Statistics → Explore

In the Explore dialog box that opens, move the variable kstot into the Dependent List: box. On the right of the Explore dialog box, click the “Plots” button. This opens another dialog box where you can select the plots you want to produce. For this example, we checked only “Histogram” under the Descriptive heading.

Once you are done, click Continue and then OK to perform the analysis.

We should also present a frequency distribution of the variable male. This is done in SPSS by selecting from the menu:

Analyze → Descriptive Statistics → Frequencies

In the dialog box that opens, move the variable male into the Variable(s): box and click OK.

Screenshots of the procedures for producing frequency distributions and histograms in SPSS are available in the How-to Guides for the Frequency Distribution and the Dispersion of a Continuous Variable topics, respectively, that are part of SAGE Research Methods Datasets.
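For readers who want to see what these two descriptive steps compute, here is a small Python sketch that builds a frequency table and a text histogram. It assumes the data are available as plain lists. The values and the helper name frequency_table are made up for illustration and are not taken from the EB63.1 extract.

```python
from collections import Counter

# Made-up values for illustration only; these are not the EB63.1 data.
male = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0]
kstot = [9, 7, 11, 8, 10, 6, 9, 12, 8, 9]


def frequency_table(values):
    """Map each distinct value to a (count, percent) pair, sorted by value."""
    counts = Counter(values)
    total = len(values)
    return {
        value: (count, 100 * count / total)
        for value, count in sorted(counts.items())
    }


# Frequency distribution of the grouping variable.
for value, (count, percent) in frequency_table(male).items():
    print(f"male = {value}: n = {count} ({percent:.1f}%)")

# Text histogram of the quiz score, one row per observed score.
for score, (count, _) in frequency_table(kstot).items():
    print(f"{score:>2} | {'#' * count}")
```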

You compute a difference of means t-test in SPSS by selecting from the menu:

Analyze → Compare Means → Independent-Samples T Test

In the dialog box that opens, move:

- the variable kstot into the Test Variable(s): box
- the variable male into the Grouping Variable: box

Figure 1 shows what this looks like in SPSS.

Figure 1: Selecting Independent Samples T Test from the Analyze menu in SPSS.

Below the Grouping Variable: box there is a button labelled “Define Groups…”. Click this to open a second dialog box where you define the values of the grouping variable that identify the two groups whose means you want to compare. In this example, define Group 1 as equal to “0” for female and Group 2 as equal to “1” for male. Figure 2 shows what this looks like in SPSS.

Figure 2: Selecting values for the grouping variable for an Independent Samples T Test in SPSS.

Once you are done, click Continue and then OK to perform the analysis.
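Defining the groups amounts to splitting the test variable by the two codes of the grouping variable and summarising each part. The Python sketch below illustrates that step under the assumption that both variables are held as parallel lists. The function name group_statistics and the data values are hypothetical.

```python
import math
from statistics import mean, stdev


def group_statistics(test_values, group_values, group1=0, group2=1):
    """Summarise the test variable separately for the two grouping codes."""
    summary = {}
    for code in (group1, group2):
        # Keep only the scores whose grouping value matches this code.
        scores = [x for x, g in zip(test_values, group_values) if g == code]
        sd = stdev(scores)
        summary[code] = {
            "n": len(scores),
            "mean": mean(scores),
            "sd": sd,
            "se_mean": sd / math.sqrt(len(scores)),
        }
    return summary


# Made-up values for illustration only; these are not the EB63.1 data.
kstot = [9, 7, 11, 8, 10, 6, 9, 12, 8, 9]
male = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0]

# Group 1 is coded 0 (female) and Group 2 is coded 1 (male), as in the dialog box.
for code, stats in group_statistics(kstot, male, group1=0, group2=1).items():
    print(code, stats)
```

Cases with any other code on the grouping variable are simply left out of both groups in this sketch.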

2.2 Exploring the SPSS Output

SPSS will produce a number of figures and tables based on following the procedures outlined above. The histogram of the variable kstot is presented in Figure 3. This figure shows that most respondents score between 6 and 11 on the quiz, which follows a roughly normal distribution with a slight negative skew.

Figure 3: Histogram showing the distribution of science literacy.

Figure 4 shows the SPSS output of a frequency distribution of male and female respondents. There are slightly more women than men in the sample.

Figure 4: Frequency distribution of male and female respondents.

Figures 3 and 4 show the distribution of each of these variables by themselves. Next we compare the mean of our continuous variable for the two groups of participants to see if there is a difference.

The SPSS procedure for conducting a difference of means t-test outlined above produces two tables. Each serves a different purpose. The first table is shown in Figure 5. It simply reports summary statistics for the test variable (kstot) for each of the two groups of observations (male and female). The second table, shown in Figure 6, reports the results of the difference of means t-test. SPSS produces a lot of output in this table. First it provides an F-test (denoted as Levene’s test) of whether the variances of kstot for the two groups can be assumed to be equal, along with an appropriate estimate of statistical significance. It also produces the t-test, the degrees of freedom, a significance level, the mean difference, the standard error of that mean difference, and a 95% confidence interval for that mean difference under both the assumption that the variances of the two groups are equal and the assumption that they are not.
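As a rough illustration of two quantities in this table, the Python sketch below computes a mean-centred Levene statistic for two groups and the t statistic and degrees of freedom that apply when equal variances are not assumed (the Welch–Satterthwaite approximation). The function names levene_f and welch_t and the input values are hypothetical, and the sketch is not a description of how SPSS computes its output internally.

```python
import math
from statistics import mean, variance


def levene_f(group1, group2):
    """Mean-centred Levene statistic for two groups; returns (F, df1, df2)."""
    deviations = []
    for group in (group1, group2):
        centre = mean(group)
        # Absolute deviation of each score from its own group mean.
        deviations.append([abs(x - centre) for x in group])
    n_total = len(group1) + len(group2)
    grand_mean = mean([d for devs in deviations for d in devs])
    group_means = [mean(devs) for devs in deviations]
    between = sum(
        len(devs) * (m - grand_mean) ** 2
        for devs, m in zip(deviations, group_means)
    )
    within = sum(
        (d - m) ** 2 for devs, m in zip(deviations, group_means) for d in devs
    )
    # One-way ANOVA F on the absolute deviations (undefined if within is zero).
    return (n_total - 2) * between / within, 1, n_total - 2


def welch_t(group1, group2):
    """Return (t, df) for the t-test that does not assume equal variances."""
    n1, n2 = len(group1), len(group2)
    a, b = variance(group1) / n1, variance(group2) / n2
    t = (mean(group1) - mean(group2)) / math.sqrt(a + b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


# Made-up values for illustration only.
print(levene_f([1, 2, 3], [2, 4, 6]))
print(welch_t([1, 2, 3], [2, 4, 6]))
```

A large Levene statistic with a small significance level points towards reading the "equal variances not assumed" row. Otherwise the "equal variances assumed" row is used.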

Figure 5: Descriptive statistics for science literacy for men and women.

It is common to reject a null hypothesis when a reported level of statistical significance for a test is less than 0.05. The F-statistic for Levene’s test tells us we cannot reject the null hypothesis that the variance of kstot is equal across the two groups. That means we can focus our remaining attention on the results for the t-test when equal variances are assumed. We see the mean difference between the two groups is about −0.64 of a point on the knowledge quiz scale. The t-test and its associated level of significance lead us to reject the null hypothesis that these two means are equal in the population (t = −18.06, df = 21,884, p-value < 0.001). We can conclude that there is a statistically significant relationship between gender and science literacy. The analysis supports the conclusion that men and women have differing levels of science literacy, at least insofar as this is measured in our quiz, with men knowing a little more about science than women. Thus science literacy and gender appear to be related to each other.
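The reported numbers can be related to one another with a little arithmetic. The Python sketch below backs out the standard error implied by the mean difference and the t statistic, then forms an approximate 95% confidence interval. It assumes a normal critical value, which is reasonable only because the degrees of freedom are so large. Because the mean difference above is rounded, the values in Figure 6 may differ slightly from what this sketch produces.

```python
from statistics import NormalDist

# Values reported in the text above (the mean difference is approximate).
mean_difference = -0.64
t_statistic = -18.06

# t is the mean difference divided by its standard error, so the
# standard error can be recovered by division.
standard_error = mean_difference / t_statistic

# Assumption: with 21,884 degrees of freedom the t distribution is
# practically normal, so a normal critical value is used here.
critical_value = NormalDist().inv_cdf(0.975)
ci_lower = mean_difference - critical_value * standard_error
ci_upper = mean_difference + critical_value * standard_error

# Two-sided p-value under the same normal approximation.
p_two_sided = 2 * NormalDist().cdf(-abs(t_statistic))

print(f"implied standard error: {standard_error:.4f}")
print(f"approximate 95% CI: ({ci_lower:.2f}, {ci_upper:.2f})")
print(f"p < 0.001: {p_two_sided < 0.001}")
```

An interval that lies entirely below zero is consistent with rejecting the null hypothesis of equal means at the 0.05 level.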

Figure 6: Results from using a t-test to test the difference between men and women’s science literacy.

3 Your Turn

Download this sample dataset to see if you can replicate these results. Then repeat the process of testing the difference between men and women for another variable – belief that science and technology will solve all problems. The variable name is solveprob.

IBM® SPSS® Statistics software (SPSS) screenshots Republished Courtesy of International Business Machines Corporation, © International Business Machines Corporation. SPSS Inc. was acquired by IBM in October, 2009. IBM, the IBM logo, ibm.com, and SPSS are trademarks or registered trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “IBM Copyright and trademark information” at http://www.ibm.com/legal/copytrade.shtml.