Sampling error consists of two components: sampling variance and sampling bias. Overall sampling error is sometimes referred to as the sampling mean squared error (MSE), which can be decomposed as in the following formula:

$$\mathrm{MSE}(p) = E\left[(p - P)^2\right] = E\left[(p - p')^2\right] + (p' - P)^2 = \text{sampling variance} + (\text{sampling bias})^2$$

where P is the true population value, p is the measured sample estimate, and p' is the hypothetical mean value of realizations of p averaged across all possible replications of the sampling process producing p.
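The decomposition of MSE into variance plus squared bias can be checked numerically. The following is a minimal sketch, not a definitive implementation: the population, the flawed sampling frame, the sample size, and the helper name `decompose_sampling_error` are all hypothetical choices made for illustration, and the mean of the replicated estimates stands in for p'.

```python
import random

def decompose_sampling_error(estimates, true_value):
    """Return (mse, variance, bias) for a list of replicated estimates.

    The mean of the replicates plays the role of p'; the variance is taken
    around that mean and divided by the number of replicates.
    """
    n = len(estimates)
    mean_estimate = sum(estimates) / n                      # stands in for p'
    variance = sum((p - mean_estimate) ** 2 for p in estimates) / n
    bias = mean_estimate - true_value                       # p' - P
    mse = sum((p - true_value) ** 2 for p in estimates) / n
    return mse, variance, bias

rng = random.Random(0)

# Hypothetical population: 10,000 elements, 30% of which have the attribute.
population = [1] * 3000 + [0] * 7000
P = sum(population) / len(population)

# Hypothetical flawed frame: 1,000 elements with the attribute are missing,
# so every sample drawn from it is systematically off-target.
frame = [1] * 2000 + [0] * 7000

# Replicate the sampling process 2,000 times with samples of size 200.
estimates = [sum(rng.sample(frame, 200)) / 200 for _ in range(2000)]

mse, variance, bias = decompose_sampling_error(estimates, P)
print(f"MSE               = {mse:.6f}")
print(f"variance + bias^2 = {variance + bias ** 2:.6f}")
```

Under these assumptions the two printed quantities agree up to floating-point rounding, because the identity holds algebraically for any set of replicates once p' is taken to be their mean.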

Sampling variance is the part that can be controlled by sample design factors such as sample size, clustering strategies, stratification, and estimation procedures. It is the error that reflects the extent to which repeated replications of the sampling process result in different estimates. Sampling variance is the random component of sampling error, since it results from the "luck of the draw" and the specific population elements that happen to be included in each sample. The presence of sampling bias, on the other hand, indicates a systematic error that persists no matter how many times the sample is drawn.
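The contrast between the random and the systematic component can be illustrated with a small simulation. This is a sketch under stated assumptions: the true value of 0.30, the flawed frame, the sample sizes, and the helper names `replicate`, `mean`, and `variance` are hypothetical. It suggests that increasing the sample size shrinks sampling variance but leaves the bias introduced by the frame essentially unchanged.

```python
import random

def replicate(frame, sample_size, replications, rng):
    """Proportion estimate from each of `replications` simple random samples."""
    return [sum(rng.sample(frame, sample_size)) / sample_size
            for _ in range(replications)]

def mean(values):
    return sum(values) / len(values)

def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

rng = random.Random(1)

true_P = 0.30                              # hypothetical true population value
flawed_frame = [1] * 2000 + [0] * 7000     # hypothetical frame missing some 1s

small = replicate(flawed_frame, 50, 1000, rng)    # 1,000 samples of size 50
large = replicate(flawed_frame, 800, 1000, rng)   # 1,000 samples of size 800

var_small, var_large = variance(small), variance(large)
bias_small, bias_large = mean(small) - true_P, mean(large) - true_P

print(f"n=50 : variance={var_small:.6f}  bias={bias_small:+.4f}")
print(f"n=800: variance={var_large:.6f}  bias={bias_large:+.4f}")
```

In this setup the larger samples cluster far more tightly, yet both sets of estimates are centered on the frame's proportion rather than on the true value, so the bias does not go away with a bigger sample.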

Using an analogy with archery, when all the arrows are clustered tightly around the bull's-eye we say we have low variance and low bias. At the other extreme, if the arrows are widely scattered over the target and the midpoint of the arrows is off-center, we say we have high variance and high bias. In-between situations occur when the arrows are tightly clustered but far off-target, which is a situation of low variance and high bias. Finally, if the arrows are on-target but widely scattered, we have high variance coupled with low bias.

Efficient samples that result in estimates that are close to each other and to the corresponding population value are said to have low sampling variance, low sampling bias, and low overall sampling error. At the other extreme, samples that yield estimates that fluctuate widely and differ substantially from the corresponding population values are said to have high sampling variance, high sampling bias, and high overall sampling error. By the same token, samples can have a moderate level of overall sampling error when high sampling variance is combined with low sampling bias, or vice versa. (In this discussion it is assumed, for the sake of explanation, that the samples are drawn repeatedly and measurements are made for each drawn sample. In practice, of course, this is not feasible, but the repeated measurement scenario serves as a heuristic tool to help explain the concept of sampling variance.)

Sampling Variance

Sampling variance can be measured, and extensive theory and software exist for calculating it. All random samples are subject to sampling variance because not all elements in the population are included in the sample: each random sample consists of a different combination of population elements and thus produces a different estimate. The extent to which these estimates differ across all possible samples is known as sampling variance. Inefficient designs that employ weak or no stratification will result in samples whose estimates fluctuate widely. On the other hand, if the design incorporates effective stratification strategies and minimal clustering, the samples can yield very similar estimates, that is, low variance between estimates and therefore high sampling precision.
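The effect of stratification on sampling variance can be sketched with a simulation that compares simple random sampling against proportionate stratified sampling. Everything here is an illustrative assumption rather than a prescribed method: the two hypothetical strata, their sizes and proportions, the sample size of 100, and the helper names `srs_estimate` and `stratified_estimate`.

```python
import random

rng = random.Random(2)

# Hypothetical population with two strata that differ sharply on the attribute.
stratum_a = [1] * 600 + [0] * 5400     # 6,000 elements, 10% with the attribute
stratum_b = [1] * 3200 + [0] * 800     # 4,000 elements, 80% with the attribute
population = stratum_a + stratum_b
true_P = sum(population) / len(population)

def srs_estimate(n):
    """Proportion from a simple random sample of size n, ignoring strata."""
    return sum(rng.sample(population, n)) / n

def stratified_estimate(n):
    """Proportion from a proportionate stratified sample of total size n."""
    weight_a = len(stratum_a) / len(population)
    n_a = round(n * weight_a)
    n_b = n - n_a
    mean_a = sum(rng.sample(stratum_a, n_a)) / n_a
    mean_b = sum(rng.sample(stratum_b, n_b)) / n_b
    # Weight each stratum mean by its share of the population.
    return weight_a * mean_a + (1 - weight_a) * mean_b

def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

srs = [srs_estimate(100) for _ in range(2000)]
strat = [stratified_estimate(100) for _ in range(2000)]

var_srs, var_strat = variance(srs), variance(strat)
print(f"true P              = {true_P:.3f}")
print(f"variance, SRS       = {var_srs:.6f}")
print(f"variance, stratified = {var_strat:.6f}")
```

Because the strata in this sketch are internally homogeneous but very different from each other, the stratified estimates vary noticeably less across replications than the simple random sample estimates, while both remain centered on the true value.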

...
