
Reliability, Split-Half

Reliability refers to the ability of an instrument to measure a construct consistently, and it must be empirically demonstrated in order to make an argument for internal validity. As described by William Trochim (2000), measurement reliability refers to the consistency and stability of a measure and is estimated as the proportion of variability in the measure that is attributable to the true score. Said differently, a test, scale, or other measurement tool is considered reliable if it yields the same score on repeated administrations, assuming no change is expected. Reliability estimates fall into four general classes: interrater or interobserver reliability, test–retest reliability, parallel-forms reliability, and internal consistency reliability. One type of internal consistency reliability is split-half reliability. This entry explains split-half reliability, with special consideration of how it differs from parallel-forms reliability.

To establish split-half reliability, a researcher who aims to measure a unidimensional construct must first create a set of items that all aim to measure that construct. To evaluate split-half reliability, all items are administered to the same sample. Split-half reliability is a measure of consistency whereby the set of items that make up a measure is split in two during the data analysis stage, and the scores on one half are compared with the scores on the other half. This technique is used when a measure cannot easily be administered multiple times; instead, two randomly selected halves are compared to see whether their scores are similar. The idea is that if every item measures the same construct, then the items can be randomly divided and the scores from the two sets of items should be very similar. Note that the term items is used rather than the term questions because many measures ask participants to rate their agreement with a statement rather than to answer a question. The estimate of reliability is the correlation between the scores on the two halves that were created by splitting the original set of items. Most often, the Pearson product-moment correlation coefficient is used, and this estimate is expected to be positive and of at least moderate magnitude.
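As a rough illustration of the procedure described above, the following Python sketch randomly divides the items into two halves, sums each respondent's scores within each half, and correlates the two sets of half scores. The function names (pearson_r, split_half_correlation), the seed parameter, and the response data are hypothetical and are not taken from the entry; this is one possible way to carry out the computation, not a prescribed method.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def split_half_correlation(responses, seed=0):
    """Correlate respondents' total scores on two randomly chosen halves.

    responses: one row per respondent, each row a list of item scores.
    seed: hypothetical parameter that makes the random split repeatable.
    With an odd number of items, the second half gets one extra item.
    """
    n_items = len(responses[0])
    items = list(range(n_items))
    random.Random(seed).shuffle(items)  # random assignment of items to halves
    half_a = items[: n_items // 2]
    half_b = items[n_items // 2:]
    scores_a = [sum(row[i] for i in half_a) for row in responses]
    scores_b = [sum(row[i] for i in half_b) for row in responses]
    return pearson_r(scores_a, scores_b)


# Made-up data: five respondents rating six items on a 1-5 agreement scale.
responses = [
    [4, 5, 4, 5, 4, 4],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [5, 5, 5, 4, 5, 5],
    [1, 2, 1, 1, 2, 1],
]

r = split_half_correlation(responses)
print(f"Correlation between halves: {r:.2f}")
```

Because the split is random, a different seed can give a somewhat different coefficient for the same data.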

Whereas most published research only includes a report of the correlation coefficient discussed here, some researchers choose to apply the Spearman-Brown correction. Because each half contains only half of the items, the correlation between the halves underestimates the reliability of the full measure. To get a better estimate of the reliability of the full test (which is twice as long as either half), the Spearman-Brown correction can be applied using the following formula, where r is the Pearson product-moment correlation coefficient between the two halves of the measure: $r_{SB} = \frac{2r}{1 + r}$, where $r_{SB}$ is the corrected split-half reliability estimate. For any positive correlation below 1, the corrected estimate is higher than the original correlation coefficient. It is interpreted using the same guidelines for magnitude as Cronbach’s coefficient alpha. That is, estimates above .70 are generally considered acceptable.
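The arithmetic of the correction, 2r divided by (1 + r), can be checked with a few values. The sketch below is illustrative only: the function name spearman_brown and the example correlations are assumptions, not part of the entry.

```python
def spearman_brown(r):
    """Spearman-Brown corrected estimate for a test twice the length of each half.

    r: Pearson correlation between the two halves (must be greater than -1).
    """
    if r <= -1:
        raise ValueError("r must be greater than -1")
    return 2 * r / (1 + r)


# Hypothetical half-test correlations and their corrected estimates.
for r in (0.50, 0.60, 0.80):
    print(f"r = {r:.2f} -> corrected estimate = {spearman_brown(r):.2f}")
# r = 0.50 -> corrected estimate = 0.67
# r = 0.60 -> corrected estimate = 0.75
# r = 0.80 -> corrected estimate = 0.89
```

In this example, a half-test correlation of .60 falls below the .70 guideline, while its corrected estimate of .75 falls above it.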

Estimating split-half reliability is very similar to estimating parallel-forms reliability, but the major difference is that parallel forms are constructed so that the two measures can be used independently of each other and considered equivalent. Whereas researchers estimating split-half reliability randomly divide all items that purport to measure the same construct into two sets during analysis, researchers estimating parallel-forms reliability aim to create two equivalent measures of the same construct that can be employed independently of one another. Parallel forms can be used to avoid a testing threat to internal validity in an experimental design. For example, Form A could be employed for the pretest and Form B could be employed for the posttest. With split-half reliability, researchers have an instrument that they wish to use as a single measurement tool, and they create randomly split halves only during data analysis, for the purpose of estimating reliability based on the internal consistency of the items that make up the measure.

...
