Control Variables

In correlational research, a control variable is often described as a confounding variable or nuisance variable that is “held constant” by statistical means. Suppose we want to know the relation between study time and scores on a test of American history, but we are worried that interest in history might be an alternative explanation of the association. If we allowed students to choose and report their own study times for the test, and we also measured their interest in history, we could adjust the relation between study time and test score by statistically holding constant the scores on interest in history. In such a study, interest in history would be described as a control variable.

Statistical Control

The mathematics of statistical control is based on correlation and regression, which can be illustrated graphically. In Figure 1, the variance of the distribution of American history test scores is partitioned into four areas labeled A, B, C, and D. Area A is the part of the variance in test scores that is accounted for by neither study time nor interest in history; it is what cannot be predicted by either variable. Area B is accounted for by study time alone. Area C is shared by study time and interest in history: those more interested in history might spend more time studying, so either or both variables can account for this part of achievement. Finally, Area D is the variance in achievement accounted for by interest in history alone.

Figure 1 Statistical control through removal of shared variance

The magnitude of association is indicated by the degree of overlap, that is, by the size of the shared portions. If study time and interest in history did an excellent job of predicting achievement, Areas B, C, and D would expand and Area A would shrink. However, if study time and interest in history were highly correlated with each other, they would largely overlap, and Area C would grow at the expense of Areas B and D.
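The four areas can be estimated from the three pairwise correlations. The following is a minimal sketch, not part of the original entry: the data are made up purely for illustration, the function names `pearson` and `partition_variance` are hypothetical, and the areas are expressed as proportions of the total variance in test scores, using the standard two-predictor formula for R squared.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partition_variance(score, study, interest):
    """Split the variance of `score` into proportions A, B, C, and D."""
    r_ys = pearson(score, study)      # score with study time
    r_yi = pearson(score, interest)   # score with interest
    r_si = pearson(study, interest)   # study time with interest
    # R^2 for the regression of score on both predictors.
    r2_full = (r_ys**2 + r_yi**2 - 2 * r_ys * r_yi * r_si) / (1 - r_si**2)
    return {
        "A": 1 - r2_full,                  # predicted by neither variable
        "B": r2_full - r_yi**2,            # study time alone
        "C": r_ys**2 + r_yi**2 - r2_full,  # shared by both predictors
        "D": r2_full - r_ys**2,            # interest alone
    }

# Hypothetical data for eight students (illustration only).
interest = [2, 4, 5, 7, 3, 8, 6, 9]
study = [1, 3, 2, 5, 2, 6, 4, 5]
score = [55, 65, 60, 80, 58, 88, 72, 85]

areas = partition_variance(score, study, interest)
for name, value in areas.items():
    print(f"Area {name}: {value:.3f}")
```

The four proportions always sum to 1. In unusual data patterns (suppression), the computed value for Area C can be negative, which the overlapping-areas picture cannot represent.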

Statistical control removes the shared variance. In statistical terms, the partial correlation removes the influence of the control variable from both of the variables of interest. In our example, we could compute a partial correlation between study time and achievement, controlling for interest in history. The squared partial correlation would represent the ratio of B to (A + B), that is, the association of what is left of achievement with what is left of study time once interest in history is removed from both. The semipartial correlation removes the control variable from only one of the variables of interest. For example, we could compute the semipartial correlation between study time and achievement, holding interest in history constant for study time only. The squared semipartial correlation would represent the ratio of B to (A + B + C + D), because we would remove interest in history from study time but not from achievement. The semipartial correlation is closely related to the regression coefficient. In essence, the multiple regression equation holds constant, or controls, each independent variable for all other independent variables.
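One way to see the difference between the two coefficients is to compute them by residualization: regress a variable on the control variable and keep what is left over. The sketch below is an illustration under stated assumptions, not the entry's own procedure: the data are invented, and the helper names `pearson` and `residualize` are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def residualize(y, z):
    """Residuals of y after a simple least-squares regression on z."""
    n = len(y)
    my, mz = sum(y) / n, sum(z) / n
    slope = (sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
             / sum((zi - mz) ** 2 for zi in z))
    intercept = my - slope * mz
    return [yi - (intercept + slope * zi) for yi, zi in zip(y, z)]

# Hypothetical data for eight students (illustration only).
interest = [2, 4, 5, 7, 3, 8, 6, 9]
study = [1, 3, 2, 5, 2, 6, 4, 5]
score = [55, 65, 60, 80, 58, 88, 72, 85]

# Remove interest in history from study time and from test score.
study_resid = residualize(study, interest)
score_resid = residualize(score, interest)

# Partial: interest removed from BOTH study time and test score.
partial = pearson(study_resid, score_resid)
# Semipartial: interest removed from study time ONLY.
semipartial = pearson(study_resid, score)

print(f"partial r       = {partial:.3f} (squared: {partial**2:.3f})")
print(f"semipartial r   = {semipartial:.3f} (squared: {semipartial**2:.3f})")
```

Because the semipartial correlation leaves the variance that achievement shares with interest in history in the denominator, its absolute value can never exceed that of the partial correlation.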

...
