When describing a distribution of scores, one should use at least three indices: the shape of the distribution (e.g., unimodal, normal, or skewed), a measure of central tendency (e.g., mean or median), and a measure of the spread of scores. The variance is an example of the last of these. The importance of a measure for the spread of scores can be seen in the following example:

  • Distribution X: 93 95 97 99 100 101 103 105 107
  • Distribution Y: 75 80 90 95 100 105 110 120 125

Both distributions have the same mean ($\bar{X} = \bar{Y} = 100$), but the scores in distribution X cluster closer to the mean than those in distribution Y.
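
As a quick check of this claim, the following sketch (Python standard library only; the variable names are illustrative and not part of the original entry) computes both means and lists how far each score lies from its own mean.

```python
from statistics import mean

# Scores copied from the two example distributions.
dist_x = [93, 95, 97, 99, 100, 101, 103, 105, 107]
dist_y = [75, 80, 90, 95, 100, 105, 110, 120, 125]

mean_x = mean(dist_x)
mean_y = mean(dist_y)

# Absolute distance of each score from its distribution's mean.
abs_dev_x = [abs(score - mean_x) for score in dist_x]
abs_dev_y = [abs(score - mean_y) for score in dist_y]

print("means:", mean_x, mean_y)
print("distances from mean, X:", abs_dev_x)
print("distances from mean, Y:", abs_dev_y)
```

Both means come out to 100, but no score in X lies more than 7 points from the mean, whereas scores in Y lie up to 25 points away.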

Several measures can be used to describe the spread of scores. The range (highest score minus the lowest score) is simple and easy to understand but takes into account only the two outermost scores. One aberrant score can greatly affect the value of the range and give a false impression of how scores actually cluster together. The semi-interquartile range gets around this problem by considering only the central 50% of scores but ignores half the scores and is not a useful measure in inferential statistics. The most commonly used measures of spread of scores are the variance and the standard deviation. The standard deviation is merely the square root of the variance, and thus, it is the variance that is the important indicator.
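
To make the comparison concrete, here is a minimal sketch of these measures using Python's standard library. It assumes the "inclusive" quartile method of `statistics.quantiles` (other quartile conventions can give slightly different values) and the divide-by-n form of the variance, which is what `statistics.pvariance` computes. The aberrant score of 150 is a made-up illustration, not part of the original example.

```python
from statistics import pstdev, pvariance, quantiles


def score_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)


def semi_interquartile_range(scores):
    """Half the distance between the first and third quartiles."""
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    return (q3 - q1) / 2


dist_x = [93, 95, 97, 99, 100, 101, 103, 105, 107]
# Hypothetical variant: the top score is replaced by one aberrant score.
dist_x_aberrant = [93, 95, 97, 99, 100, 101, 103, 105, 150]

for label, scores in [("X", dist_x), ("X with aberrant score", dist_x_aberrant)]:
    print(label)
    print("  range:", score_range(scores))
    print("  semi-interquartile range:", semi_interquartile_range(scores))
    print("  variance:", round(pvariance(scores), 2))
    print("  standard deviation:", round(pstdev(scores), 2))
```

Under these assumptions the range jumps from 14 to 57 when the aberrant score is introduced, while the semi-interquartile range stays at 3 because it looks only at the central half of the scores.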

The variance is commonly referred to as the average squared deviation from the mean. Its formula (using notation for a sample of scores, X) is

$$S^2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{SS}{n}$$

where

  • Capital S squared ($S^2$) is the symbol for the variance;
  • $\bar{X}$ (“X bar”) is the mean of the scores;
  • $(X - \bar{X})$ indicates a deviation from the mean (how far away a score is from the mean);
  • The symbol $\Sigma$ (capital Greek letter sigma) is a direction “to sum” or “add”;
  • n is sample size; and
  • SS is the sum of the squared deviations from the mean (the numerator).
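
The definition can be sketched directly in code. This is a minimal illustration that assumes the divide-by-n form described in this entry (the sum of squared deviations from the mean, divided by the sample size); the function names are hypothetical. Note that many statistics packages divide by n − 1 instead when estimating a population variance from a sample, so their results will differ slightly.

```python
def sum_of_squares(scores):
    """SS: the sum of squared deviations from the mean."""
    n = len(scores)
    x_bar = sum(scores) / n
    return sum((x - x_bar) ** 2 for x in scores)


def variance(scores):
    """S squared: SS divided by n (the divide-by-n form used in this entry)."""
    return sum_of_squares(scores) / len(scores)


dist_x = [93, 95, 97, 99, 100, 101, 103, 105, 107]
dist_y = [75, 80, 90, 95, 100, 105, 110, 120, 125]

print("X: SS =", sum_of_squares(dist_x), " variance =", variance(dist_x))
print("Y: SS =", sum_of_squares(dist_y), " variance =", variance(dist_y))
```

For the two example distributions, SS is 168 for X and 2300 for Y, so the variance of Y is far larger, in line with its scores lying farther from the mean.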

Notice several important aspects of the variance. The mean is the most commonly used measure of central tendency, and the variance is calculated by taking deviations from the mean. Thus, the variance shows how spread out scores are around the mean. Deviation scores are squared because the sum of the deviations from the mean, $\sum (X - \bar{X})$, always equals zero. An interesting feature of the variance is that the sum of the squared deviations from the mean, $\sum (X - \bar{X})^2$, is a smaller value than the sum of the squared deviations taken from any other score.
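
Both properties follow from a short derivation, sketched here with c standing for an arbitrary reference value (a symbol introduced only for this illustration):

```latex
\begin{align*}
\sum (X - \bar{X}) &= \sum X - n\bar{X} = n\bar{X} - n\bar{X} = 0, \\
\sum (X - c)^2 &= \sum \bigl[(X - \bar{X}) + (\bar{X} - c)\bigr]^2 \\
  &= \sum (X - \bar{X})^2 + 2(\bar{X} - c)\sum (X - \bar{X}) + n(\bar{X} - c)^2 \\
  &= \sum (X - \bar{X})^2 + n(\bar{X} - c)^2 .
\end{align*}
```

Because the term $n(\bar{X} - c)^2$ is zero when $c = \bar{X}$ and positive otherwise, the sum of squared deviations is smallest when the deviations are taken from the mean.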

Note also that because the sum of the squared deviations from the mean is divided by n, the variance itself is a type of mean: the mean of squared deviation scores. Finally, like the mean of the scores, the variance takes every score into account. This is generally considered a desirable quality, but in very skewed distributions or distributions with a few very aberrant scores, one might wish to use another measure.

As an example, here is the calculation of the variance for distribution X. The mean is

$$\bar{X} = \frac{\sum X}{n} = \frac{900}{9} = 100$$

Next, take deviations from the mean, square them, and sum all of the squared

...
