In this guide you will learn how to estimate a simple regression model in IBM® SPSS® Statistics software (SPSS) using a practical example to illustrate the process. You will find links to the example dataset and you are encouraged to replicate this example. An additional practice example is suggested at the end of this guide. The example assumes you have already opened the data file in SPSS.
Simple regression expresses a dependent, or response, variable as a linear function of an independent variable. This requires estimating an intercept (often called a constant) and a slope, which together give the expected value of the dependent variable at any particular value of the independent variable. Most attention is typically focused on the slope estimate because it captures the relationship between the dependent and independent variables. The dependent variable should be continuous. This example will focus on using an independent variable that is also continuous, though the model can also accommodate a categorical independent variable (see Regression with Dummy Variables).
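The estimation that SPSS carries out for this model can also be sketched by hand. The Python snippet below is a minimal illustration of the ordinary least squares formulas for a single independent variable, using only the standard library. The function name fit_simple_regression and the five data points are hypothetical, chosen for illustration only; they are not taken from the IAT2012 dataset.

```python
from statistics import mean

def fit_simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope: covariation of x and y divided by the variation in x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    # Intercept: chosen so the line passes through the point of means.
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical toy data, NOT from the IAT2012 dataset.
x = [-1.0, -0.5, 0.0, 0.5, 1.0]
y = [9.0, 8.0, 8.0, 7.0, 6.0]

intercept, slope = fit_simple_regression(x, y)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

With real data, the Linear Regression procedure described below reports the same two quantities in its coefficients table, along with standard errors and significance tests that this sketch does not compute.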
This example uses two variables from the IAT2012 dataset: impwhitegood, a scale of implicit attitudes, and tblack, a scale of self-reported attitudes to African Americans.
The implicit attitudes scale has a range of −1.9 to 1.8. Its mean is 0.32. The self-reported attitudes to African Americans scale runs from 0 to 10. Its mean is 7.2. tblack is formally an ordinal variable, while impwhitegood is an interval-ratio variable. In accordance with common practice in applied research, we treat both as continuous, interval-level variables.
When conducting a simple regression, it is often wise to examine each variable in isolation first. We start by presenting histograms of implicit attitude and self-reported attitude. This is done in SPSS by selecting from the menu:
Analyze → Descriptive Statistics → Explore
In the Explore dialog box that opens, move the impwhitegood and tblack variables into the Dependent List: box. On the right-hand side of the Explore dialog box, click the “Plots” button. This opens another dialog box where you can select the plots you want to produce. For this example, just check “Histogram” under the Descriptive heading. Then click Continue and OK to perform the analysis.
Screenshots for the procedure to produce histograms in SPSS are available in the How-to Guides for the Dispersion of a Continuous Variable topic that is part of SAGE Research Methods Datasets.
To estimate a simple regression model in SPSS, select from the menu:
Analyze → Regression → Linear
In the Linear Regression dialog box that opens, move the tblack variable into the Dependent: window and move the impwhitegood variable into the Independent(s): window. Then click OK to perform the analysis. Figure 1 shows what this looks like in SPSS.
Figure 2 shows that impwhitegood is approximately normally distributed with a slight negative skew. This makes it appropriate as an independent variable in simple regression analysis.
Figure 3 shows that scores on tblack are approximately normally distributed but negatively skewed. OLS regression is relatively robust to violations of the assumption of normality, and the variable looks suitable for inclusion in the analysis.
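To make the idea of negative skew concrete, the following Python sketch computes a simple moment-based skewness statistic. The function name skewness and the ten ratings are hypothetical and are not drawn from the IAT2012 dataset. The skewness value SPSS reports may also differ slightly, since statistical packages often apply a sample-size adjustment that this sketch omits.

```python
from statistics import mean

def skewness(values):
    """Moment-based skewness: third central moment divided by variance ** 1.5."""
    n = len(values)
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - m) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical 0-10 ratings, NOT from the IAT2012 dataset:
# most responses are high and a few are much lower.
ratings = [10, 9, 9, 8, 8, 8, 7, 7, 5, 2]

# A negative result means the longer tail lies on the low side of the scale.
print(f"skewness = {skewness(ratings):.2f}")
```

A value below zero corresponds to the pattern described for tblack: most respondents sit near the top of the scale, with a tail of lower scores.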
Figure 4 presents four tables of results that are produced by the simple linear regression procedure in SPSS. The fourth table shows the results of primary interest.
The first three tables in Figure 4 report the independent variable(s) entered into the model, some summary fit statistics for the regression model, and an analysis of variance for the model as a whole. While detailed examination of these tables is beyond the scope of this example, we note that in the second table, R Square measures the proportion of the variance in the dependent variable explained by the model, which in this case consists of a single independent variable. An R Square of 0.057 means that only about 5.7% of the variance in self-reported attitudes is accounted for by implicit attitudes.
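As a small worked illustration of the R Square figure, the Python sketch below relies on the fact that, with a single independent variable, R Square equals the squared Pearson correlation between the two variables. The only inputs are the values reported in this guide (R Square of 0.057 and a negative slope); the variable names are chosen for illustration.

```python
import math

r_square = 0.057   # R Square reported in the second table of Figure 4
slope_sign = -1    # the slope reported in the fourth table is negative

# With one independent variable, R Square is the squared Pearson correlation,
# so the correlation is its square root carrying the sign of the slope.
r = slope_sign * math.sqrt(r_square)

explained_pct = 100 * r_square
unexplained_pct = 100 - explained_pct

print(f"correlation r is about {r:.2f}")
print(f"explained: {explained_pct:.1f}%, unexplained: {unexplained_pct:.1f}%")
```

In other words, the two attitude measures are only weakly correlated, and well over 90% of the variance in self-reported attitudes is left unexplained by this model.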
The fourth table in Figure 4 presents an estimate of the intercept (or constant) of approximately 7.55. The constant of a simple regression model can be interpreted as the expected value of the dependent variable when the independent variable equals zero. In this case, our independent variable has a mean that is just above zero (0.32). This means that the constant is the expected self-reported attitude score for someone just below the mean on implicit attitudes. Researchers rarely have predictions about the intercept, so it often receives little attention.
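The interpretation of the constant can be checked with a little arithmetic. The Python sketch below plugs the approximate estimates reported in this guide (constant 7.55, slope −1.2) into the regression equation; the function name predict_tblack is hypothetical, and because the coefficients are rounded the results are approximate.

```python
INTERCEPT = 7.55   # constant from the fourth table in Figure 4 (approximate)
SLOPE = -1.2       # slope from the fourth table in Figure 4 (approximate)

def predict_tblack(impwhitegood):
    """Expected self-reported attitude for a given implicit attitude score."""
    return INTERCEPT + SLOPE * impwhitegood

at_zero = predict_tblack(0.0)    # equals the constant by definition
at_mean = predict_tblack(0.32)   # evaluated at the mean of impwhitegood

print(f"impwhitegood = 0.00 -> expected tblack of about {at_zero:.2f}")
print(f"impwhitegood = 0.32 -> expected tblack of about {at_mean:.2f}")
```

The prediction at the mean of impwhitegood comes out close to 7.2, the mean of tblack reported earlier. This is expected, because a least-squares line always passes through the point defined by the two means.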
The fourth table in Figure 4 reports that the slope coefficient linking implicit to self-reported attitude is estimated to be approximately −1.2. This represents the average marginal effect of implicit attitude on self-reported attitude, and can be interpreted as the expected change in the dependent variable for a one-unit increase in the independent variable. For this example, that means that each one-point increase on the implicit attitude scale (where a higher score indicates more positive feeling towards Whites) is associated with a decrease of 1.2 points in self-reported attitude towards African Americans, which is measured on an 11-point scale. The table also reports that this estimate is statistically significantly different from zero because the p-value is less than 0.001. This leads us to reject the null hypothesis of no relationship and conclude that there appears to be a relationship between implicit and self-reported attitudes in the expected direction.
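To see what a slope of −1.2 implies in practice, the Python sketch below applies the approximate estimates reported in this guide across the observed range of the implicit attitude scale (−1.9 to 1.8). The function name predict_tblack is hypothetical, and the results are approximate because the coefficients are rounded.

```python
INTERCEPT = 7.55   # constant from the fourth table in Figure 4 (approximate)
SLOPE = -1.2       # slope from the fourth table in Figure 4 (approximate)

def predict_tblack(impwhitegood):
    """Expected self-reported attitude for a given implicit attitude score."""
    return INTERCEPT + SLOPE * impwhitegood

# A one-unit increase in impwhitegood shifts the prediction by the slope,
# no matter where on the scale the increase starts.
one_unit_change = predict_tblack(1.0) - predict_tblack(0.0)

# Predictions at the two ends of the observed impwhitegood range.
pred_low = predict_tblack(-1.9)
pred_high = predict_tblack(1.8)

print(f"change per one-unit increase: {one_unit_change:.2f}")
print(f"impwhitegood = -1.9 -> expected tblack of about {pred_low:.2f}")
print(f"impwhitegood =  1.8 -> expected tblack of about {pred_high:.2f}")
```

Both predictions fall inside the 0 to 10 range of tblack, so the fitted line does not produce impossible values anywhere within the observed range of the independent variable.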
There are multiple diagnostic tests researchers might perform following the estimation of a simple regression model to evaluate whether the model appears to violate any of the OLS assumptions or whether there are other kinds of problems such as particularly influential cases. Describing all of these diagnostic tests is well beyond the scope of this example.
Download this sample dataset and see if you can replicate the results presented here. The dataset also includes another variable called twhite, which is a self-reported measure of attitude towards White Americans, measured in the same way as tblack. Try producing your own simple regression by replacing tblack with twhite as the dependent variable.