How-to Guide for IBM® SPSS® Statistics Software

Introduction

In this guide you will learn how to produce a difference of means t-test in IBM® SPSS® Statistics software (SPSS) using a practical example to illustrate the process. You will find links to the example dataset and you are encouraged to replicate this example. An additional practice example is suggested at the end of this guide. The example assumes you have already opened the data file in SPSS.

Contents

- 1 Difference of Means T-Test
- 2 An Example in SPSS: Automobile Fuel Consumption and Transmission Type
  - 2.1 The SPSS Procedure
  - 2.2 Exploring the SPSS Output
- 3 Your Turn

1 Difference of Means T-Test

A difference of means t-test is a method for testing whether or not the means of a given variable are different between two subsets of the data. Those subsets are typically defined by categories of another variable. For example, you might compute the mean weight for men and women in your sample of data and be interested in determining if those two means are statistically significantly different from each other. This allows researchers to explore whether a continuous (e.g. weight) and a categorical variable (e.g. gender) are related to each other. There are many variants of difference of means testing – this example focuses on the independent samples t-test.
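To make the idea of comparing means across subsets concrete, here is a minimal Python sketch. Python is not part of this guide's SPSS workflow. The records, the group labels, and the helper name group_means are hypothetical and serve only as illustration. The sketch computes the mean of a continuous variable within each category and the difference between those means, which is the quantity the t-test then evaluates.

```python
from statistics import mean

# Hypothetical toy records (not from the guide's dataset): (group, weight in kg).
records = [
    ("men", 82.0), ("men", 78.0), ("men", 90.0),
    ("women", 65.0), ("women", 70.0), ("women", 63.0),
]


def group_means(rows):
    """Return {group: mean of values} for an iterable of (group, value) pairs."""
    groups = {}
    for group, value in rows:
        groups.setdefault(group, []).append(value)
    return {group: mean(values) for group, values in groups.items()}


means = group_means(records)
difference = means["men"] - means["women"]
print(means, difference)
```

The t-test asks whether a difference like this one is larger than we would expect from sampling variation alone.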

2 An Example in SPSS: Automobile Fuel Consumption and Transmission Type

This example uses a subset of data from the 2015 Fuel Consumption Report from Natural Resources Canada. This extract includes data on 1082 automobiles. The two variables we examine are:

- Whether an automobile has an automatic or manual transmission (trans2), coded 1 = Automatic and 2 = Manual.
- The city fuel consumption rate for each automobile (fuelusecity), measured in liters per 100 kilometers.

The variable fuelusecity ranges between 4.5 and 30.60 in this sample dataset, with a mean of 12.53. The fuel consumption variable is continuous while the transmission type variable is dichotomous, making the difference of means t-test appropriate for this example.

2.1 The SPSS Procedure

When conducting a difference of means t-test, it is often wise to examine each variable in isolation first. We start by presenting a histogram of the number of liters of fuel per 100 kilometers traveled. This is done in SPSS by selecting from the menu:

Analyze → Descriptive Statistics → Explore

In the Explore dialog box that opens, move the variable fuelusecity into the Dependent List: box. On the right of the Explore dialog box, click the “Plots” button. This opens another dialog box where you can select the plots you want to produce. For this example, we checked only “Histogram” under the Descriptive heading.

Once you are done, click Continue and then OK to perform the analysis.
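If you want to see what a histogram computes, the following minimal Python sketch counts how many values fall into each fixed-width bin. It is an illustration only and not what SPSS runs internally. The sample values, the bin width of 5, and the function name histogram_counts are all hypothetical.

```python
import math
from collections import Counter


def histogram_counts(values, bin_width):
    """Count values per bin; each bin covers [left, left + bin_width)."""
    counts = Counter(math.floor(v / bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical fuel consumption values in liters per 100 km (not the real data).
sample = [4.5, 8.2, 9.9, 11.3, 12.5, 12.8, 13.1, 15.0, 19.7, 30.6]
bins = histogram_counts(sample, 5)

# Print a simple text histogram, one row per non-empty bin.
for left, n in bins.items():
    print(f"{left:>4}-{left + 5:<4} {'#' * n}")
```

A long tail of sparsely filled bins on the right of such a display is the kind of pattern that signals right skew.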

We should also present a frequency distribution of the variable trans2. This is done in SPSS by selecting from the menu:

Analyze → Descriptive Statistics → Frequencies

In the dialog box that opens, move the variable trans2 into the Variable(s): box and click OK.

Screenshots of the procedures for producing frequency distributions and histograms in SPSS are available in the How-to Guides for the Frequency Distribution and the Dispersion of a Continuous Variable topics, respectively.

To compute a difference of means t-test in SPSS, select from the menu:

Analyze → Compare Means → Independent-Samples T Test

In the dialog box that opens, move:

- the variable fuelusecity into the Test Variable(s): box
- the variable trans2 into the Grouping Variable: box.

Figure 1 shows what this looks like in SPSS.

Figure 1: Selecting Independent-Samples T Test from the Analyze menu in SPSS.

Below the Grouping Variable: box there is a “Define Groups” button. You must click this to open a second dialog box where you define the values of the grouping variable that identify the two groups whose means you want to compare. In this example, define Group 1 as equal to “1” for automatic transmissions and Group 2 as equal to “2” for manual transmissions. Figure 2 shows what this looks like in SPSS.

Figure 2: Selecting values for the grouping variable for an Independent-Samples T Test in SPSS.

Once you are done, click Continue and then OK to perform the analysis.
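The calculation behind this procedure can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook formulas, not SPSS's own implementation. The rows below are hypothetical (trans2, fuelusecity) pairs rather than the real data, and t_tests is an illustrative name. The sketch splits the test variable by the two group codes and computes the t statistic and degrees of freedom twice: once assuming equal variances (pooled) and once without that assumption (Welch).

```python
from math import sqrt
from statistics import mean, variance


def t_tests(group1, group2):
    """Return pooled and Welch t statistics with their degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    diff = mean(group1) - mean(group2)
    v1, v2 = variance(group1), variance(group2)  # sample variances (n - 1)

    # Equal variances assumed: pool the two sample variances.
    df_pooled = n1 + n2 - 2
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / df_pooled
    t_pooled = diff / sqrt(pooled_var * (1 / n1 + 1 / n2))

    # Equal variances not assumed (Welch): keep the variances separate and
    # use the Welch-Satterthwaite approximation for the degrees of freedom.
    se1, se2 = v1 / n1, v2 / n2
    t_welch = diff / sqrt(se1 + se2)
    df_welch = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))

    return {"pooled": (t_pooled, df_pooled), "welch": (t_welch, df_welch)}


# Hypothetical rows: (trans2 code, fuelusecity). 1 = Automatic, 2 = Manual.
rows = [(1, 13.0), (1, 15.0), (1, 11.0), (1, 17.0),
        (2, 10.0), (2, 11.0), (2, 9.0)]
automatic = [y for code, y in rows if code == 1]
manual = [y for code, y in rows if code == 2]

results = t_tests(automatic, manual)
print(results)
```

Turning either t statistic into a significance level requires the t distribution, which the Python standard library does not provide, so the sketch stops at the statistic and its degrees of freedom.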

2.2 Exploring the SPSS Output

Following the procedures outlined above, SPSS produces a number of figures and tables. The histogram of the variable fuelusecity is presented in Figure 3.

Figure 3: Histogram showing the distribution of the number of liters of fuel consumed per 100 kilometers of city driving, Fuel Consumption Report, Natural Resources Canada, 2015.

Figure 3 shows the number of liters of fuel per 100 kilometers of city driving consumed by the cars in the dataset. It shows that most of the observations fall between about 7 and 20 liters, with the bulk of them clustering around the mean of 12.53. There is a slight right skew to the data, as there are a handful of relatively large values. Researchers might want to explore those cases further to make sure they are not having a disproportionate impact on the analysis.

The SPSS output for the frequency distribution of trans2 is presented in Figure 4.

Figure 4: Frequency distribution of whether automobiles in the dataset have automatic or manual transmissions, Fuel Consumption Report, Natural Resources Canada, 2015.

Figure 4 reports that 877 (81.1%) of the cars have automatic transmissions while 205 (18.9%) of them have manual transmissions. Figure 3 and Figure 4 show the distribution of each of these variables by themselves. Next we compare the mean of our continuous variable for the two groups of automobiles to see if there is a difference.
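The percentages in Figure 4 follow directly from the counts. As a quick check, this short Python sketch (an illustration outside SPSS) recomputes them from the two counts reported above. Only the variable names are our own.

```python
# Counts reported in Figure 4 of this guide.
counts = {"Automatic": 877, "Manual": 205}

total = sum(counts.values())
percentages = {label: round(100 * n / total, 1) for label, n in counts.items()}

print(total)
print(percentages)
```

The total matches the 1082 automobiles in the extract, and the rounded shares match the 81.1% and 18.9% shown in the frequency table.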

The SPSS procedure for conducting a difference of means t-test outlined above produces two tables. Each serves a different purpose.

The first table is shown in Figure 5. It simply reports summary statistics for the test variable (here, fuelusecity) for each of the two groups of observations (here, Automatic and Manual transmissions on the trans2 variable).

Figure 5: Descriptive statistics for fuelusecity for each group defined by trans2.

The second table shown in Figure 6 reports the results of the difference of means t-test.

Figure 6: Results from using a t-test to test the difference in fuel consumption between automobiles with automatic and manual transmissions, respectively, assuming unequal variance, Fuel Consumption Report, Natural Resources Canada, 2015.

SPSS produces a lot of output in this table. First, it provides an F-test (labeled Levene’s test) of whether the variances of fuelusecity for the two groups can be assumed to be equal, along with its level of statistical significance. It then reports the t statistic, the degrees of freedom, a significance level, the mean difference, the standard error of that mean difference, and a 95% confidence interval for that mean difference, both under the assumption that the variances of the two groups are equal and under the assumption that they are not.
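To show what a Levene-type test measures, here is a minimal Python sketch of the classic form of the statistic, which is built on absolute deviations from each group's mean. We assume, but have not verified here, that this mean-centered form corresponds to the F value SPSS reports. The function name levene_f and the two small samples are hypothetical.

```python
from statistics import mean


def levene_f(*groups):
    """Levene's F statistic using absolute deviations from each group mean."""
    # Replace each observation by its absolute distance from its group mean.
    z = [[abs(y - mean(g)) for y in g] for g in groups]
    n_total = sum(len(g) for g in z)
    k = len(z)
    grand = mean([v for g in z for v in g])

    # One-way ANOVA F ratio computed on the absolute deviations.
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in z)
    within = sum((v - mean(g)) ** 2 for g in z for v in g)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical fuel consumption values for two groups (not the real data).
automatic = [13.0, 15.0, 11.0, 17.0]
manual = [10.0, 11.0, 9.0]

f_stat = levene_f(automatic, manual)
print(f_stat)
```

A large F value indicates that the typical spread around the group mean differs between the groups, which is evidence against equal variances.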

It is common to reject a null hypothesis when the reported level of statistical significance for a test is less than 0.05. The F-test tells us we should reject the null hypothesis that the variance of fuelusecity is equal across the two groups. That means we should focus our remaining attention on the results for the t-test when equal variances are not assumed (known as Welch’s t-test). We see that the mean difference between the two groups is about 1.66 liters of fuel. The t-test and its associated level of significance lead us to reject the null hypothesis that these two means are equal (t = 7.545, df = 402.021, p-value < 0.05). We can conclude that there is a statistically significant relationship between whether a car has an automatic or manual transmission and the amount of fuel it consumes per 100 kilometers of city driving.
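The decision rule above can be checked with a short Python sketch using only the values reported in this guide. One assumption is made and should be read as an approximation: with roughly 400 degrees of freedom the t distribution is very close to the standard normal, so the sketch uses the normal distribution from the Python standard library in place of the t distribution. The implied standard error is likewise approximate, since the mean difference of 1.66 is a rounded figure.

```python
from statistics import NormalDist

# Values reported in this guide for the test with equal variances not assumed.
t_stat = 7.545
df = 402.021
mean_diff = 1.66  # rounded in the text
alpha = 0.05

# Assumption: for df this large, approximate the t distribution by the
# standard normal. An exact p-value would use the t distribution with df.
p_approx = 2 * NormalDist().cdf(-abs(t_stat))

# Because t = mean difference / standard error, the SE is implied by the two.
implied_se = mean_diff / t_stat

reject_null = p_approx < alpha
print(p_approx, implied_se, reject_null)
```

The approximate two-sided p-value is far below 0.05, which agrees with the conclusion that the difference in means is statistically significant.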

3 Your Turn

Download this sample dataset and see if you can replicate these results. Then repeat the process replacing the fuelusecity variable with one that measures fuel use under highway driving conditions, named fuelusehwy.

IBM® SPSS® Statistics software (SPSS) screenshots Republished Courtesy of International Business Machines Corporation, © International Business Machines Corporation. SPSS Inc. was acquired by IBM in October, 2009. IBM, the IBM logo, ibm.com, and SPSS are trademarks or registered trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “IBM Copyright and trademark information” at http://www.ibm.com/legal/copytrade.shtml.