
Stratified Random Sampling

Stratified random sampling is a method for sampling from a population whereby the population is divided into subgroups and units are randomly selected from the subgroups. Stratification of target populations is extremely common in survey sampling. Stratified sampling techniques are often used when designing business, government, and social science surveys; therefore, it is important for researchers to understand how to design and analyze stratified samples. To obtain a stratified sample, members of a population are first divided into nonoverlapping subgroups of units called strata. The strata must be mutually exclusive and exhaustive, and there is an assumption of homogeneity within the strata. Following stratification, a sample is selected from each stratum, often through simple random sampling.

Determining the Strata and Sample Sizes

Although target populations are almost always heterogeneous, strata are assumed to be internally homogeneous. Survey practitioners should define strata such that the survey variables or measurements of interest have small variation compared to the variation across the population as a whole. In addition, the subpopulations defining the strata may be of interest in themselves. For example, states or regions are often considered important output categories in household surveys. Common stratification variables for surveys of individuals include age, gender, socioeconomic status, and educational attainment.

Once the researcher has divided the population into strata, the researcher must select units from each stratum. Samples are often selected through simple random sampling, a method for sampling in which each unit has the same probability of being chosen, and every possible subset of k units has the same probability of being selected.
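The selection step can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function name `stratified_sample`, the stratum labels, and the unit identifiers are all hypothetical, and simple random sampling without replacement is assumed within each stratum.

```python
import random

def stratified_sample(strata, sample_sizes, seed=None):
    """Draw a simple random sample (without replacement) from each stratum.

    strata: dict mapping a stratum label to the list of units in that stratum
    sample_sizes: dict mapping a stratum label to the number of units to draw
    """
    rng = random.Random(seed)
    # random.Random.sample gives every subset of the requested size
    # the same probability of being selected.
    return {
        label: rng.sample(units, sample_sizes[label])
        for label, units in strata.items()
    }

# Hypothetical population of 10 units divided into two strata.
strata = {
    "urban": ["u1", "u2", "u3", "u4", "u5", "u6"],
    "rural": ["r1", "r2", "r3", "r4"],
}
sample = stratified_sample(strata, {"urban": 3, "rural": 2}, seed=0)
```

Because each stratum is sampled independently, every stratum with a positive sample size is represented in the result.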

There is more than one way to determine the strata sample sizes. Assume the population is of size $N$, the size of stratum $h$ is $N_h$, for $h = 1, \ldots, H$, and the desired total sample size is $n$. The sample sizes may be allocated proportionally, such that each stratum's share of the sample equals its share of the population; equivalently, the same fraction $n/N$ of units is sampled from every stratum. For instance, if the strata are defined by states and the population consists of all individuals in the United States, then the share of the sample drawn from each state equals the population of that state relative to the entire U.S. population. Under proportional allocation, $n_h = n N_h / N$, where $n_h$ is the sample size of stratum $h$.
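As a rough sketch of proportional allocation, the function below computes n N_h / N for each stratum. The formula seldom yields whole numbers, so the sketch rounds with a largest-remainder rule so that the stratum sample sizes still sum to n. That rounding rule is an assumption added here for illustration, as are the function name and the stratum sizes.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate a total sample of size n in proportion to stratum sizes.

    stratum_sizes: dict mapping a stratum label to its population size N_h
    Returns a dict mapping each label to an integer sample size n_h.
    """
    total = sum(stratum_sizes.values())  # N
    exact = {h: n * N_h / total for h, N_h in stratum_sizes.items()}
    # Round down first, then hand the leftover units to the strata with
    # the largest fractional parts (an assumed rounding convention).
    alloc = {h: int(x) for h, x in exact.items()}
    leftover = n - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda h: exact[h] - alloc[h], reverse=True)
    for h in by_remainder[:leftover]:
        alloc[h] += 1
    return alloc

# Hypothetical strata sizes: N = 10,000 and n = 100.
sizes = {"A": 5000, "B": 3000, "C": 2000}
alloc = proportional_allocation(sizes, 100)
print(alloc)  # {'A': 50, 'B': 30, 'C': 20}
```

In this example each stratum is sampled at the same rate, n/N = 1%.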

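Alternatively, one can use optimal allocation, in which the sample size for each stratum is proportional to the product of the stratum's size and the standard deviation of the variable of interest within that stratum. Larger samples are taken in the larger and more variable strata to generate the smallest possible sampling variance for a given total sample size. Under optimal allocation, $n_h = n N_h \sigma_h / \sum_{k=1}^{H} N_k \sigma_k$ for $h = 1, \ldots, H$, where $\sigma_h$ is the standard deviation of the variable of interest in stratum $h$. There are also variations on this strategy that take into account the cost of sampling from each stratum.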
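A small sketch of optimal allocation follows, taking each stratum's sample size proportional to its population size N_h times its standard deviation σ_h. The function name, the stratum sizes, and the standard deviations are hypothetical. In practice the σ_h values are rarely known and would have to be estimated, for example from a pilot study or an earlier survey. The sketch returns unrounded values and ignores sampling costs.

```python
def optimal_allocation(stratum_sizes, stratum_sds, n):
    """Allocate a total sample of size n in proportion to N_h * sigma_h.

    stratum_sizes: dict mapping a stratum label to its population size N_h
    stratum_sds: dict mapping a stratum label to its standard deviation sigma_h
    Returns a dict mapping each label to an unrounded sample size n_h.
    """
    weights = {h: stratum_sizes[h] * stratum_sds[h] for h in stratum_sizes}
    total = sum(weights.values())  # sum over all strata of N_k * sigma_k
    return {h: n * w / total for h, w in weights.items()}

# Hypothetical strata: the smallest stratum is also the most variable.
sizes = {"A": 5000, "B": 3000, "C": 2000}
sds = {"A": 10, "B": 20, "C": 40}
alloc = optimal_allocation(sizes, sds, 190)
print(alloc)  # {'A': 50.0, 'B': 60.0, 'C': 80.0}
```

Stratum C receives the largest sample despite being the smallest stratum, because its variability is highest. Under proportional allocation with the same n, it would receive only 38 units.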

Stratified sampling ensures that at least one observation is picked from each stratum, even if the proportion of population units in a particular stratum is close to 0. The statistical properties of the population may not be preserved if there are strata with very few observations; hence, such strata should be avoided. Generally, it is recommended to use 5 to 10 strata; however, the main factors that limit the number of strata are the size of the population and the sizes of the strata. If there are 50 states each containing more than 100,000 units, viewing the states as the strata and sampling 5% of the units from each stratum is appropriate, since every state then contributes more than 5,000 sampled units.

...
