In this guide you will learn how to produce frequency distributions in IBM® SPSS® Statistics software (SPSS), using a practical example to illustrate this process. You will find links to the example dataset and you are encouraged to replicate this example. An additional practice example is suggested at the end of this guide. This example assumes you have already opened the data file in SPSS.
A frequency distribution presents the distribution of values for a single categorical variable in a table. Specifically, a frequency table reports the count and the percentage of observations in each category of the variable in question. Frequency distributions are useful for describing how observations are spread across the categories of a variable, and they can help detect coding errors, because a miscoded value shows up as an unexpected category in the table.
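To make the idea concrete outside SPSS, here is a minimal Python sketch of a frequency table. The data values, the coding scheme (0 and 1), and the function name `frequency_table` are all hypothetical and are not taken from the CHNS dataset. The stray value 9 is planted to show how a coding error appears as an unexpected category.

```python
from collections import Counter

# Hypothetical coded responses: 0 = never smoked, 1 = has smoked.
# The value 9 is a deliberately planted miscode for illustration.
responses = [0, 1, 0, 0, 1, 9, 0, 1, 0, 0]


def frequency_table(values):
    """Return (category, count, percent) rows sorted by category."""
    counts = Counter(values)
    total = len(values)
    return [(cat, n, 100 * n / total) for cat, n in sorted(counts.items())]


table = frequency_table(responses)
for category, count, percent in table:
    print(f"{category!s:>8} {count:>5} {percent:>6.1f}%")
```

The row for category 9 stands out immediately because only 0 and 1 are valid codes in this made-up scheme.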
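This example presents frequency distributions for two variables taken from the 2006 China Health and Nutrition Survey (CHNS) of adults. The two variables we examine are:
smoked: whether the respondent has ever smoked
province: the province in which the respondent lives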
For the first variable, respondents are categorized as either having smoked or never having smoked. The second variable categorizes respondents into one of nine different provinces in China. Both of these are categorical variables, making them each appropriate for a frequency distribution.
A frequency distribution can be produced in SPSS by selecting from the Menu:
Analyze → Descriptive Statistics → Frequencies
In the Frequencies dialog box that opens, move the variable you want from the list on the left into the Variable(s) box (note: you can do this for multiple variables, producing a frequency distribution for each one). Figure 1 shows what this looks like in SPSS.
You may also wish to generate a bar chart to illustrate the contents of a frequency distribution graphically. From the Frequencies dialog box, click:
Charts → Bar charts → Continue
Figure 2 shows what this looks like in SPSS. To run the full analysis, click OK in the Frequencies dialog box.
Executing this process for the variable province will produce one frequency table and one bar chart. Both contain the same information, but will suit different presentational purposes. Figure 3 and Figure 4 show what the SPSS output looks like.
Looking first at the frequency table shown in Figure 3, SPSS provides four columns of output: the Frequency count, the Percent, the Valid Percent, and the Cumulative Percent. In this example, Valid Percent and Percent are equal, but that would not be the case if there were missing data for this variable. When there are missing data, most researchers would report the Valid Percent.
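The difference between the columns is easiest to see with missing data. The following Python sketch is an illustration under stated assumptions, not SPSS's own code: the data and the function name `spss_style_table` are hypothetical. It computes Percent from all cases, Valid Percent from non-missing cases only, and Cumulative Percent as the running total of Valid Percent.

```python
from collections import Counter

# Hypothetical data: None marks a missing response.
values = ["A", "B", "A", None, "C", "A", "B", None, "A", "B"]


def spss_style_table(values):
    """Return (category, count, percent, valid_percent, cumulative_percent) rows."""
    total = len(values)                       # all cases, including missing
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    rows = []
    cumulative = 0.0
    for cat in sorted(counts):
        n = counts[cat]
        percent = 100 * n / total             # base: all cases
        valid_percent = 100 * n / len(valid)  # base: non-missing cases only
        cumulative += valid_percent           # running total of Valid Percent
        rows.append((cat, n, percent, valid_percent, cumulative))
    return rows


rows = spss_style_table(values)
for row in rows:
    print(row)
```

With 2 of the 10 hypothetical cases missing, category A is 40% of all cases but 50% of valid cases. When no data are missing, the two columns coincide, as they do for province in this example.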
For this example, we see a total of 9775 observations in this dataset for the variable named province. We see that 1057 (10.8%) respondents lived in Liaoning province, 979 (10.0%) lived in Hubei province, and so forth. The same information is presented as a bar chart in Figure 4.
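The reported percentages can be checked by hand from the counts. This short Python sketch uses only the figures quoted above; the variable names are chosen for illustration.

```python
# Counts and total as reported in the frequency table for province.
total = 9775
counts = {"Liaoning": 1057, "Hubei": 979}

# Percent = 100 * count / total, rounded to one decimal place.
percents = {name: round(100 * n / total, 1) for name, n in counts.items()}
print(percents)  # {'Liaoning': 10.8, 'Hubei': 10.0}
```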
Download the sample dataset to see if you can replicate these results. Then repeat the process with the variable named smoked, or any of the other variables in the dataset.
IBM® SPSS® Statistics software (SPSS) screenshots Republished Courtesy of International Business Machines Corporation, © International Business Machines Corporation. SPSS Inc. was acquired by IBM in October, 2009. IBM, the IBM logo, ibm.com, and SPSS are trademarks or registered trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “IBM Copyright and trademark information” at http://www.ibm.com/legal/copytrade.shtml.