Classical Test Theory

Classical test theory (CTT) is an approach to measurement that considers the relationship between the expected score (or “true” score) and the observed score on any given assessment. The word classical is used in the sense that the theory is considered to be the first practical application of mathematics to describe this relationship. CTT offers a relatively parsimonious, elegant, and intuitive way to scale individuals according to some theorized latent construct. This entry further describes CTT and its basic principles and estimation procedures, then discusses its framework for determining a measure’s proportion of true score variance, standard error of measurement, item analysis, and validity. Finally, it looks at the limitations of the theory.

Although more contemporary, model-based approaches to measurement, such as item response theory (IRT), have garnered more attention, CTT retains its relevance and importance for several reasons. First, CTT offers a relatively simple and intuitive analysis of response characteristics for an assessment. Even if the goal is to use more contemporary methods of measurement, CTT provides an initial framework of analyses for exploring data; its relatively simple approach supports data diagnostic efforts. Second, CTT rests on a less restrictive set of assumptions than the more complex IRT approach to measurement, so it can easily be applied to a wide variety of testing situations. Third, CTT places fewer demands on the data for scaling procedures. Fourth, CTT builds on a framework of simpler computations; variance, covariance, and correlation statistics lay the groundwork for CTT. Thus, almost any statistical software or data management program can be employed for most CTT analyses.

Basic Principles and Estimation Procedures

CTT was born out of the culmination of two particular advances in the field of measurement. The first was the growing recognition of symmetrically distributed random errors in measurement (a concept that dates back to Galileo’s masterpiece, Dialogue on Two Main Systems of the Universe: Ptolemaic and Copernican). By the latter half of the 19th century, it was well accepted that experimental observations were jointly affected by a stable true score and an error in measurement defined as a random variable.

The second advance was the advent of a metric to describe the degree of relationship between two variables. Francis Galton derived the correlation statistic in 1886 to indicate the extent to which deviations from the mean in one variable reflect corresponding deviations from the mean in another variable. This metric laid the foundation for estimating the impact of random errors on the stability of a test score (reliability analysis).

These two ideas (randomly distributed error terms and correlation) were brought together in a landmark paper by Charles Spearman in 1904, in which he recognized that observed correlations between tests would be attenuated as a function of the amount of measurement error in each test. By many accounts, this paper set the stage for the development of CTT as a proper measurement paradigm. Frederic Lord and Melvin Novick are credited with organizing the psychometric developments of the time into a cohesive framework in their 1968 book, Statistical Theories of Mental Test Scores.
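
The attenuation Spearman described can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not a procedure taken from this entry: each observed score is generated as a true score plus an independent, normally distributed error, true score variance is fixed at 1, and reliability is taken to be the proportion of observed variance due to true scores. Under those assumptions the observed correlation should be close to the true correlation multiplied by the square root of the product of the two reliabilities. The function names and the chosen values (a true correlation of 0.8 and reliabilities of 0.7 and 0.6) are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=20000, true_corr=0.8, rel_x=0.7, rel_y=0.6, seed=0):
    """Return (correlation of true scores, correlation of observed scores).

    Assumption: observed = true + error, with Var(true) = 1 and error
    independent of the true score, so reliability = 1 / (1 + Var(error)).
    """
    rng = random.Random(seed)
    # Error standard deviations that yield the requested reliabilities.
    sd_err_x = math.sqrt(1.0 / rel_x - 1.0)
    sd_err_y = math.sqrt(1.0 / rel_y - 1.0)
    true_x, true_y, obs_x, obs_y = [], [], [], []
    for _ in range(n):
        t1 = rng.gauss(0.0, 1.0)
        # Second true score correlates with the first at true_corr.
        t2 = true_corr * t1 + math.sqrt(1.0 - true_corr ** 2) * rng.gauss(0.0, 1.0)
        true_x.append(t1)
        true_y.append(t2)
        obs_x.append(t1 + rng.gauss(0.0, sd_err_x))
        obs_y.append(t2 + rng.gauss(0.0, sd_err_y))
    return pearson(true_x, true_y), pearson(obs_x, obs_y)


true_r, observed_r = simulate()
# Attenuation formula: r_observed ~ r_true * sqrt(rel_x * rel_y)
predicted_r = 0.8 * math.sqrt(0.7 * 0.6)
print(f"true-score correlation: {true_r:.3f}")
print(f"observed correlation:   {observed_r:.3f}")
print(f"predicted by formula:   {predicted_r:.3f}")
```

Because the simulated errors are random, the observed correlation only approximates the predicted value; the gap shrinks as the sample size grows.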

...
