
In data analysis, the question of interest can rarely be answered by a single statistical test or comparison. The term multiple comparisons describes an analysis that involves more than one statistical test or comparison based on the same set of data.

Why Multiple Comparisons Need Special Treatment

Multiple comparisons add an additional level of complexity to the analysis. For any individual comparison or test, you set an alpha level that determines the probability of a Type I error (i.e., concluding that the difference is significant when, in reality, it is not).

Suppose you choose a customary alpha of .05 (5%). Now, if you perform two independent comparisons using the same alpha, the chance of making a Type I error on each of them is 5%, and the chance of avoiding a Type I error on each is 95%. So, the chance of avoiding a Type I error altogether is 0.95 × 0.95 = 0.9025 = 90.25%. The complement of this, 9.75%, represents the probability of making at least one Type I error somewhere in the set of comparisons. This probability is sometimes called the familywise alpha level (or the familywise error rate).

You can see that the familywise alpha is substantially higher than the nominal alpha of 5% that you specified. The situation gets worse as the number of comparisons increases: With five comparisons, the familywise alpha is about 22.6%. The general formula for calculating the familywise alpha is

familywise alpha = 1 − (1 − alpha)^k

where k is the number of comparisons.
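The formula above can be computed directly. A minimal Python sketch (the function name is illustrative, not from the source):

```python
def familywise_alpha(alpha, k):
    """Probability of at least one Type I error across k independent tests,
    each performed at the nominal alpha level."""
    return 1 - (1 - alpha) ** k

# Reproduces the values discussed in the text, at a nominal alpha of .05:
print(familywise_alpha(0.05, 2))   # 0.0975 (the two-comparison example)
print(familywise_alpha(0.05, 5))   # ~0.226 (five comparisons)
print(familywise_alpha(0.05, 50))  # ~0.92, nearly certain to err at least once
```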

Figure 1 shows how familywise alpha varies in relation to the number of comparisons. By the time you perform about 50 comparisons, you are almost certain to commit at least one Type I error. Fifty comparisons may sound like a lot, but consider that if you scan a correlation matrix for 10 variables for significant correlations, you are performing 45 tests.


Figure 1 Experimentwise Alpha Based on Number of Comparisons


Figure 2 Expected Number of Type I Errors Based on Number of Comparisons


Figure 3 Correlation Matrix of 10 Randomly Generated Variables, Showing Two Type I Errors (Variable Pairs (5,8) and (1,10))

Figure 2 shows the expected number of Type I errors based on the number of comparisons. The more comparisons, the more Type I errors. In the hypothetical example of a correlation matrix for 10 variables, you should expect to find two to three significant (but spurious) correlations by chance alone, even if all the variables are independent.
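The correlation-matrix scenario is easy to simulate: generate 10 independent random variables, so that any "significant" correlation among them is by construction a Type I error, and count how many exceed the critical value. A hedged sketch (the Fisher-z critical value is an approximation, and the exact count varies with the random seed):

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_vars = 100, 10

# 10 independent standard-normal variables: any significant correlation is spurious
data = rng.standard_normal((n_obs, n_vars))
corr = np.corrcoef(data, rowvar=False)

# Approximate two-sided critical |r| at alpha = .05 via the Fisher z-transform
r_crit = np.tanh(1.96 / np.sqrt(n_obs - 3))  # ~0.196 for n = 100

# The 45 unique variable pairs above the diagonal
iu = np.triu_indices(n_vars, k=1)
false_positives = int(np.sum(np.abs(corr[iu]) > r_crit))
print(false_positives, "spurious correlations out of", len(iu[0]), "tests")
```

On average, about 45 × 0.05 ≈ 2.25 of these pairs will cross the threshold, matching the two-to-three expected Type I errors described above.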

Approaches for Handling Multiple Comparisons

To avoid the problem of familywise alpha exploding as the number of comparisons increases, various methods have been developed to control the familywise alpha. Most methods center around selecting a conservative nominal alpha level, so that the familywise alpha is controlled given the number of comparisons to be made.
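A widely used instance of this strategy (the source does not single out a specific method here) is the Bonferroni correction, which divides the target familywise alpha by the number of comparisons. A minimal sketch:

```python
def bonferroni_alpha(familywise_target, k):
    """Per-comparison alpha that keeps the familywise Type I error rate
    at or below the target, by the Bonferroni rule alpha/k."""
    return familywise_target / k

# e.g. the 45 correlations among 10 variables from the earlier example
per_test = bonferroni_alpha(0.05, 45)
print(per_test)  # ~0.00111 per comparison

# The resulting familywise alpha stays just below the 5% target
familywise = 1 - (1 - per_test) ** 45
print(round(familywise, 4))  # ~0.0488
```

The price of this control is conservatism: with a per-test alpha this small, each individual comparison needs a much stronger effect to reach significance.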

In controlling the familywise alpha, it is important to account for the right number of comparisons. In some cases, this will not be the same as the actual number of tests or comparisons performed, as explained below.

...
