How-to Guide for IBM® SPSS® Statistics Software

Introduction

In this guide, you will learn how to estimate an Autoregressive Integrated Moving Average (ARIMA) model for a single time series variable in IBM® SPSS® Statistics software (SPSS), using a practical example to illustrate the process. You are provided with links to the example dataset, and you are encouraged to replicate this example. An additional practice example is suggested at the end of this guide. The example assumes you have already opened the data file in SPSS.

Contents

- Time Series: ARIMA Models
- An Example in SPSS: Daily Air Quality in New York County, 2017
- The Procedure in SPSS
- Your Turn

1 Time Series: ARIMA Models

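An ARIMA model is a statistical model used to estimate the temporal dynamics of an individual time series. ARIMA models have three components: (1) an autoregressive (AR) component, (2) an integration (I) component, and (3) a moving average (MA) component. A model is commonly written as ARIMA (p,d,q), where p is the order of the AR component, d is the number of times the series is differenced, and q is the order of the MA component. ARIMA models are frequently used for forecasting future values of the time series in question.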
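To make the three components concrete, the following minimal Python sketch (not part of the SPSS procedure) shows how an AR term, an MA term, and differencing each act on a series. The function names, the zero start-up values, and the sign convention y_t = const + phi*y_(t-1) + e_t + theta*e_(t-1) are assumptions made for illustration; software packages differ in the sign they attach to the MA coefficient.

```python
def difference(series, d=1):
    """Apply d rounds of first differencing (the 'I' component)."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def arma_from_shocks(shocks, phi=0.0, theta=0.0, const=0.0):
    """Build y_t = const + phi*y_{t-1} + e_t + theta*e_{t-1} from given shocks e_t.

    Start-up values y_0 and e_0 are assumed to be zero (illustrative choice).
    """
    y = []
    prev_y, prev_e = 0.0, 0.0
    for e in shocks:
        value = const + phi * prev_y + e + theta * prev_e
        y.append(value)
        prev_y, prev_e = value, e
    return y


# A single unit shock followed by zeros (hypothetical input).
shocks = [1.0, 0.0, 0.0, 0.0]

ar1 = arma_from_shocks(shocks, phi=0.5)    # AR(1): the shock fades gradually
ma1 = arma_from_shocks(shocks, theta=0.5)  # MA(1): the shock lasts one extra period

trend = [2, 4, 6, 8, 10]                   # hypothetical trending series
flat = difference(trend, d=1)              # differencing removes the linear trend

print(ar1, ma1, flat)
```

In the AR(1) case the effect of the shock never fully disappears but shrinks each period, whereas in the MA(1) case it vanishes after one lag. This contrast is what the ACF and PACF plots later in the guide are used to detect.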

2 An Example in SPSS: Daily Air Quality in New York County, 2017

This example explores daily air quality in New York County in the United States in the year 2017. It uses a subset of data from the EPA’s Air Quality System Data Mart (https://aqs.epa.gov/aqsweb/documents/data_mart_welcome.html). The variable we examine is

- AQI: air quality index

There are 275 observations in the dataset. The air quality index (AQI) is a continuous variable recorded daily from 1 January 2017 to 2 October 2017. Because it is measured at regular intervals over time, it is appropriate for ARIMA modeling.
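As a quick check on the arithmetic, the number of daily observations implied by the date range can be counted with a few lines of Python. This is only a sanity check on the dates given above and assumes one observation per calendar day with no gaps.

```python
from datetime import date

start = date(2017, 1, 1)
end = date(2017, 10, 2)

# Add 1 so that both the first and the last day are counted.
n_days = (end - start).days + 1
print(n_days)  # 275
```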

2.1 The Procedure in SPSS

To produce an ARIMA model of a single time series, you first have to produce an autocorrelation function (ACF) and a partial autocorrelation function (PACF) for the time series variable in question. The procedure for doing so is described in detail in the SAGE Research Methods Datasets example for time series ACFs and PACFs.

Figure 1 shows what the ACF looks like for the air quality variable. We can see a moderately large positive spike at the first lag, followed by correlations that alternate between positive and negative values and that are either not statistically significant or only barely cross the threshold of statistical significance.

Figure 1: ACF for Daily Air Quality Index in New York County in 2017.
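For readers who want to see what lies behind a plot like Figure 1, the sketch below computes sample autocorrelations in plain Python. The series values are hypothetical, not the example data. The significance bound of 1.96 divided by the square root of n is a common large-sample approximation and is an assumption here; the limits SPSS draws may be computed somewhat differently.

```python
import math


def sample_acf(series, max_lag):
    """Return sample autocorrelations at lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares
    acf = []
    for k in range(1, max_lag + 1):
        ck = sum(dev[t] * dev[t - k] for t in range(k, n))
        acf.append(ck / c0)
    return acf


def approx_bound(n, z=1.96):
    """Approximate 95% bound for a white-noise autocorrelation."""
    return z / math.sqrt(n)


# Hypothetical daily values for illustration only.
aqi_like = [42, 45, 51, 48, 39, 37, 44, 50, 53, 47, 41, 40]

acf_values = sample_acf(aqi_like, max_lag=3)
bound = approx_bound(len(aqi_like))
for lag, r in enumerate(acf_values, start=1):
    flag = "outside" if abs(r) > bound else "inside"
    print(f"lag {lag}: r = {r:.3f} ({flag} the approximate bound of {bound:.3f})")
```

With the 275 observations in this example, the same approximation gives a bound of roughly 0.12, which is the scale against which a "moderately large" spike at lag 1 should be judged.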

Figure 2 shows what the PACF looks like for the air quality variable. The partial correlations mostly decay toward zero as the lag increases.

Figure 2: PACF for Daily Air Quality Index in New York County in 2017.
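The patterns in Figures 1 and 2 can be compared with what a first-order moving average process implies in theory. The sketch below is an illustration under assumptions, not the SPSS computation: it takes the theoretical ACF of an MA(1) process with a hypothetical coefficient of 0.5 (written as y_t = e_t + 0.5*e_(t-1)) and derives the matching partial autocorrelations with the Durbin-Levinson recursion.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations at lags 1..len(rho) via Durbin-Levinson.

    rho[k - 1] holds the autocorrelation at lag k.
    """
    pacf = []
    prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, len(rho) + 1):
        if k == 1:
            phi_kk = rho[0]
            current = [phi_kk]
        else:
            num = rho[k - 1] - sum(prev[j] * rho[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * rho[j] for j in range(k - 1))
            phi_kk = num / den
            current = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
            current.append(phi_kk)
        pacf.append(phi_kk)
        prev = current
    return pacf


theta = 0.5                            # hypothetical MA(1) coefficient
rho1 = theta / (1.0 + theta ** 2)      # theoretical lag-1 autocorrelation
ma1_acf = [rho1, 0.0, 0.0, 0.0, 0.0]   # the ACF is zero beyond lag 1
ma1_pacf = pacf_from_acf(ma1_acf)

print([round(r, 3) for r in ma1_acf])
print([round(p, 3) for p in ma1_pacf])
```

The theoretical ACF has a single spike at lag 1 and nothing afterward, while the PACF shrinks in magnitude lag by lag. An estimated ACF and PACF will never match this exactly, but a single clear ACF spike paired with a decaying PACF is the usual signature to look for.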

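An ACF with a single notable spike at the first lag, combined with a PACF that decays toward zero, is the typical signature of a moving average process of order one. A reasonable conclusion from Figures 1 and 2 together is therefore that the daily AQI is best characterized as following a first-order moving average process.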

This suggests estimating an ARIMA (0,0,1) model. To estimate an ARIMA model in SPSS, follow the menus:

Analyze → Forecasting → Create Traditional Models

This will open the Time Series Modeler dialog box as shown in Figure 3. (Note: You may get a message asking you to define the starting time and time interval for your time series. If you get this message, typically you can select OK, and SPSS will execute this task for you automatically.)

Figure 3: Time Series Modeler Dialog Box From the Analyze → Forecasting → Create Traditional Models Menu in SPSS.

First, find the air quality variable, named AQI, in the variable list on the left-hand side of the dialog box. Select it and use the top arrow to move it into the window labeled “Dependent Variables.” Next, near the middle of the dialog box is a button labeled “Method.” It will likely show “Expert Modeler” by default. Click that button and select “ARIMA” instead.

Right next to that button is another button labeled “Criteria ….” Select that button and another dialog box will open, named Time Series Modeler: ARIMA Criteria. The middle of this box has what looks like a small table with rows labeled “Autoregressive (p),” “Difference (d),” and “Moving Average (q).” Under the heading “Nonseasonal,” you will see zeros by default. To estimate an ARIMA (0,0,1) model, you need to change the “Moving Average (q)” entry under “Nonseasonal” from 0 to 1, leaving the other entries at 0. Just click in the box and type the number “1.” When you are done, click the Continue button in the lower right-hand corner.

Figure 4 shows what the Time Series Modeler: ARIMA Criteria dialog box looks like.

Figure 4: Time Series Modeler: ARIMA Criteria Dialog Box in SPSS.

Once you have finished with the Time Series Modeler: ARIMA Criteria dialog box and clicked Continue, you will return to the Time Series Modeler dialog box. Under the button where you selected the Method to be ARIMA, you should see that the Model Type is set to ARIMA (0,0,1).
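For readers curious about what fitting an ARIMA (0,0,1) model involves numerically, the following Python sketch estimates the mean and the moving average coefficient of a simulated series. It is a simplified stand-in, not the algorithm SPSS uses: it minimizes a conditional sum of squared one-step residuals over a grid of candidate coefficients. The simulated data, the true values (mean 50, coefficient 0.5), the random seed, and the sign convention y_t = mean + e_t + theta*e_(t-1) are all assumptions made for illustration.

```python
import random


def css_ma1(series, theta, mean):
    """Conditional sum of squared residuals for an MA(1) model.

    Residuals are recovered recursively as e_t = y_t - mean - theta * e_{t-1},
    with the start-up residual e_0 assumed to be zero.
    """
    prev_e = 0.0
    total = 0.0
    for y in series:
        e = y - mean - theta * prev_e
        total += e * e
        prev_e = e
    return total


def fit_ma1_grid(series, step=0.01):
    """Grid search over theta in (-1, 1); the mean is fixed at the sample mean."""
    mean = sum(series) / len(series)
    n_steps = int(round(0.99 / step))
    best_theta = 0.0
    best_css = css_ma1(series, best_theta, mean)
    for i in range(-n_steps, n_steps + 1):
        theta = i * step
        css = css_ma1(series, theta, mean)
        if css < best_css:
            best_theta, best_css = theta, css
    return mean, best_theta, best_css


# Simulate a hypothetical MA(1) series with mean 50 and theta 0.5.
rng = random.Random(2017)
shocks = [rng.gauss(0.0, 1.0) for _ in range(2000)]
lagged = [0.0] + shocks[:-1]
simulated = [50.0 + e + 0.5 * pe for e, pe in zip(shocks, lagged)]

mean_hat, theta_hat, css_hat = fit_ma1_grid(simulated)
print(round(mean_hat, 2), round(theta_hat, 2))
```

The grid is restricted to values strictly between -1 and 1 so that the residual recursion stays stable. Dedicated software typically uses likelihood-based estimation and reports standard errors, so results from a sketch like this should be read as approximate.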

Next, click the “Statistics” button near the top of the Time Series Modeler dialog box. Make sure the check box at the top left next to the label “Display fit measures, Ljung–Box statistic, and number of outliers by model” is checked. Under the “Fit Measures” heading, check the boxes next to “Stationary R square” and “R square.” Since we are only estimating one model, you can uncheck the box next to “Goodness of fit” under the “Statistics for Comparing Models” heading. Figure 5 shows what this window should look like when you are finished.

Figure 5: Statistics Window in the Time Series Modeler Dialog Box in SPSS.

Next, click the “Plots” button near the top center of the Time Series Modeler dialog box. This will switch the appearance of this dialog box, so you can select plots that you want SPSS to produce. Since we are just estimating a single model, you should uncheck the box next to “Series” in the middle left of the dialog box. Then, click to check the boxes near the middle of the dialog box labeled “Residual autocorrelation function (ACF)” and “Residual partial autocorrelation function (PACF).” Figure 6 shows what this looks like.

Figure 6: Plots Window in the Time Series Modeler Dialog Box in SPSS.

Once you have finished this, you can click OK to estimate the model. Executing this process will produce results that include two tables and one plot. Figure 7 shows the two tables.

Figure 7: Model Description and Model Results From Estimating the ARIMA (0,0,1) Model for This Example in SPSS.

The first table identifies the variable used in this analysis and shows that the model estimated was an ARIMA (0,0,1) model. The second table reports a Stationary R-squared of 0.174 and an R-squared of 0.174. It also reports the value of the Ljung–Box Q statistic (26.048), its associated degrees of freedom (17), and its associated level of statistical significance (0.074).
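The Ljung–Box figures in the table can be related to one another with a short calculation. The sketch below shows the standard form of the Q statistic, Q = n(n+2) times the sum of r_k squared over (n - k), and a chi-square upper-tail probability written with the Python standard library only. The function names are illustrative. The degrees of freedom are taken as reported; they are conventionally the number of lags tested minus the number of estimated AR and MA parameters, so 17 would be consistent with 18 lags and one MA parameter, which is an inference rather than something stated in the output.

```python
import math


def ljung_box_q(residual_acf, n):
    """Ljung-Box Q from residual autocorrelations at lags 1..h and sample size n."""
    return n * (n + 2) * sum(
        r * r / (n - k) for k, r in enumerate(residual_acf, start=1)
    )


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df.

    Uses the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / gamma(a + 1),
    starting from Q(1, y) = exp(-y) for even df or
    Q(1/2, y) = erfc(sqrt(y)) for odd df, where y = x / 2.
    """
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Values reported in the SPSS output for this example.
q_reported = 26.048
df_reported = 17

p_value = chi2_sf(q_reported, df_reported)
print(round(p_value, 3))
```

Under these assumptions, the computed probability should land close to the significance level shown in the table. A value above 0.05 means the hypothesis that the residuals are free of autocorrelation is not rejected at the conventional 5% level.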

Figure 8 reports the ACF and the PACF, respectively, for the residuals resulting from the estimated ARIMA (0,0,1) model. Reading from the bottom up, both plots show no pattern in the correlations among the residuals, and the correlations are either not significant or barely touch the vertical 95% confidence limits included in the plots. In addition, the significance level of the Ljung–Box Q statistic (0.074) is above the conventional 0.05 threshold, so we cannot reject the hypothesis that the residuals are free of autocorrelation. Together, these results suggest that the ARIMA (0,0,1) model appropriately captures the dynamics of this time series.

Figure 8: ACF and PACF for Residuals of the ARIMA (0,0,1) Model Estimated in This Example.

3 Your Turn

Download this sample dataset and see whether you can replicate the results. Then, repeat the process using either the variable PM2.5 or the variable Ozone to estimate an ARIMA model. These two variables measure the level of PM2.5 in micrograms per cubic meter and the level of ozone in parts per million, respectively, in New York County in 2017.

IBM® SPSS® Statistics software (SPSS) screenshots Republished Courtesy of International Business Machines Corporation, © International Business Machines Corporation. SPSS Inc. was acquired by IBM in October, 2009. IBM, the IBM logo, ibm.com, and SPSS are trademarks or registered trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “IBM Copyright and trademark information” at http://www.ibm.com/legal/copytrade.shtml.