
Split-Half Reliability

Split-half reliability is a statistical method used to measure the consistency of the scores of a test. It is a form of internal consistency reliability and was commonly used before coefficient α was introduced. Split-half reliability is a convenient alternative to other forms of reliability, including test–retest reliability and parallel forms reliability, because it requires only one administration of the test. As its name suggests, the method involves splitting a test into halves and correlating examinees’ scores on the two halves. The resulting correlation is then adjusted for test length using the Spearman-Brown prophecy formula. This entry introduces the basic principles and estimation procedures for this method and discusses its limitations.

Basic Principles and Estimation Procedures

According to classical test theory, variation in examinees’ test scores is due to (a) variation in the test takers’ true ability or trait and (b) error. Test reliability is defined as the proportion of variation in the total score that results from variation in the examinees’ true ability. Traditionally, researchers have estimated reliability by administering a test twice to the same examinees and correlating the scores obtained on the two occasions (test–retest reliability) or by administering two parallel forms of a test to test takers and correlating their scores on the two forms (parallel forms reliability). Both of these methods have limitations: it is not always feasible to administer a test multiple times, and not all tests have multiple forms. One convenient alternative is to split a test in half and use each half as a parallel form of the other. Comparing scores on the two halves is then another way to measure test reliability. This method, referred to as split-half reliability, is considered a measure of the internal reliability of a test, or how consistently the items perform within a test. The underlying assumption is that if a test measures a single construct, then individuals should perform equally well on both halves of the test.
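The definition of reliability given above can be written compactly. The sketch below uses conventional classical test theory notation that does not appear in the entry itself and is assumed here for illustration: X for the observed score, T for the true score, E for error, and ρ for reliability. It also assumes that true scores and errors are uncorrelated.

```latex
X = T + E, \qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2, \qquad
\rho_{XX'} = \frac{\sigma_T^2}{\sigma_X^2}
```

Under this notation, reliability approaches 1 as the error variance shrinks relative to the true-score variance.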

To estimate split-half reliability, the first step is to split the test in half and administer the two halves to the examinees. If there are multiple subscales or content areas assessed within a single test, split-half reliability should be calculated for each subscale or content area separately. When a test has more than two items, there are multiple ways to split it. The guiding principle is to obtain two halves that are as equivalent as possible. One could consider using the middle item (e.g., the fifth item on a 10-item scale) as the dividing point or randomly dividing the test items into two groups to represent the two halves of the test. However, because many tests organize items by difficulty level, these methods could easily lead to nonequivalent halves, which would result in an underestimation of the test reliability. Furthermore, attention and fatigue may affect individuals’ performance differently at the beginning and at the end of the test. That is, individuals may be more alert when they take the first half of a test but more tired when they take the second half. For these reasons, a commonly used approach is to split the test into odd- and even-numbered items. Another approach is to manually balance the difficulty level in the two halves of the test. Once the test is split in half and administered to the examinees, a Pearson correlation coefficient is calculated between the scores on the two halves of the test.
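The odd-even procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function name `split_half_reliability`, the layout of the score matrix (one row per examinee, one column per item, in test order), and the small data set are all assumptions made for the example. The final step applies the Spearman-Brown correction mentioned in the introduction, in its standard two-half form, 2r / (1 + r).

```python
import numpy as np


def split_half_reliability(scores):
    """Odd-even split-half reliability (hypothetical helper for illustration).

    scores: 2-D array-like, rows = examinees, columns = items in test order.
    Returns (r_half, r_full): the Pearson correlation between the two half
    scores and the Spearman-Brown corrected estimate for the full test.
    """
    scores = np.asarray(scores, dtype=float)

    # Sum item scores within each half for every examinee.
    odd_half = scores[:, 0::2].sum(axis=1)   # items 1, 3, 5, ...
    even_half = scores[:, 1::2].sum(axis=1)  # items 2, 4, 6, ...

    # Pearson correlation between the two half-test scores.
    r_half = np.corrcoef(odd_half, even_half)[0, 1]

    # Spearman-Brown correction for a test twice as long as each half.
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full


# Made-up responses (1 = correct, 0 = incorrect) for 6 examinees on 6 items.
demo_scores = np.array([
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
])

r_half, r_full = split_half_reliability(demo_scores)
print(f"half-test correlation: {r_half:.3f}")
print(f"Spearman-Brown corrected: {r_full:.3f}")
```

For a test with several subscales, the same function would be applied to the columns of each subscale separately, in line with the recommendation above.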

...
