
All observations and all measurements contain error. The focus of much work in measurement is to minimize and estimate the amount of error in any given measurement. In classical test theory, X is an observed score that is composed of T, the true score, and E, the error score: X = T + E. The true score is never known, but it can be thought of as the long-range average of scores from a single instrument administered to an individual an infinite number of times (the expected value or expected score). The error score is random and may have many sources, including testing conditions, individual characteristics that fluctuate from administration to administration, differences in forms, or instability of an individual's ability or trait over time.

This random error score is quite different from systematic sources of error, such as testwiseness, which may systematically increase an individual's score on each administration. Since testwiseness is systematic or constant, it finds its way into the true score and creates problems regarding validity, because the trait being measured may inadvertently be influenced by testwiseness. Random error, since it varies randomly, influences the consistency of scores but not the expected value of a score (the true score); it thus influences reliability, not validity.

Theoretically, we can estimate the amount of error if we know how much of a given score is due to errors of measurement. If we were able to test a single person repeatedly without the effects of recall and fatigue, variation in their scores would be considered measurement error. If there were no measurement error, they would get the same score on each administration. Since it is not possible to test individuals repeatedly without the interference of recall and fatigue, we employ groups to estimate measurement error variance. This allows us to estimate the standard error of measurement, or the typical amount of measurement error in a set of scores.

If we take the classical test theory model of scores and consider groups of scores and their variances, we see that the variance of the observed scores equals the sum of the variance of true scores and the variance of error scores: S²X = S²T + S²E (in sample notation).
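This variance decomposition can be checked with a small simulation. The sketch below (sample size, means, and standard deviations are illustrative assumptions, not values from the text) generates true scores and independent random errors, then shows that the observed-score variance is approximately the sum of the true-score and error-score variances.

```python
import numpy as np

# Illustrative simulation of X = T + E for a large group of examinees.
# All numeric choices (n, means, SDs) are assumptions for demonstration.
rng = np.random.default_rng(42)
n = 100_000
T = rng.normal(50, 10, n)   # true scores: SD 10, so variance near 100
E = rng.normal(0, 5, n)     # random error: SD 5, so variance near 25
X = T + E                   # observed scores

# Because T and E are uncorrelated, Var(X) is close to Var(T) + Var(E).
print(round(X.var(), 1), round(T.var() + E.var(), 1))
```

With random error uncorrelated with true scores, the covariance term vanishes in expectation, which is why the two printed values agree closely in a large sample.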

This is the long way of introducing the need for reliability: Reliability is a tool used to estimate the standard error of measurement, but it also has some intrinsic benefits in and of itself. Theoretically, reliability is considered the correlation between scores on two parallel forms of a test. The idea is that if there is no measurement error at work, scores from two parallel forms administered to the same group of individuals should be perfectly correlated—each individual should obtain the same score. It can be shown that the correlation between two parallel forms of a test is equal to the ratio of true score variance to observed score variance, or the proportion of variance in observed scores that is due to true individual differences:

rXX′ = S²T / S²X
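The claim that the correlation between two parallel forms equals the ratio of true-score variance to observed-score variance can be illustrated with a simulation. In this sketch (all numeric values are illustrative assumptions), both forms share the same true score but have independent error, and their correlation lands near the true-to-observed variance ratio.

```python
import numpy as np

# Illustrative check: two parallel forms share T but have independent errors.
# With Var(T) = 100 and Var(E) = 25, reliability should be near 100/125 = 0.8.
rng = np.random.default_rng(7)
n = 100_000
T = rng.normal(50, 10, n)
X1 = T + rng.normal(0, 5, n)   # parallel form 1
X2 = T + rng.normal(0, 5, n)   # parallel form 2

r_parallel = np.corrcoef(X1, X2)[0, 1]   # correlation between the forms
true_ratio = T.var() / X1.var()          # Var(T) / Var(X)
print(round(r_parallel, 3), round(true_ratio, 3))  # both near 0.8
```

The near-agreement of the two printed values is exactly the classical result: parallel-forms correlation estimates the proportion of observed variance that is true variance.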

This reliability coefficient can then be used in estimation of the standard error of measurement because it tells us the proportion of observed variance that is true variance; the standard error of measurement is a function of the proportion of observed variance that is true variance.
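That estimation step can be made concrete with the standard formula SEM = S_X · sqrt(1 − rXX′), where S_X is the observed-score standard deviation and rXX′ is the reliability coefficient. The numbers below are illustrative assumptions, not values from the text.

```python
import math

# Standard error of measurement from the standard formula:
#   SEM = S_X * sqrt(1 - reliability)
# Both inputs here are illustrative assumptions.
s_x = 15.0          # observed-score standard deviation (an IQ-type scale)
reliability = 0.91  # reliability coefficient for the test

sem = s_x * math.sqrt(1 - reliability)
print(round(sem, 2))  # 4.5
```

Read this way, a highly reliable test (rXX′ close to 1) has a small SEM: most of the observed spread reflects true individual differences rather than measurement error.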

...
