- 00:04
[An Introduction to Sample Design]

- 00:11
KEN COPELAND: Hi, I'm Ken Copeland, senior vice president and director of statistics and methodology for NORC at the University of Chicago. [Kennon R. Copeland, PhD, Senior Vice President & Director, Statistics & Methodology, NORC at the University of Chicago] [Presentation Topics] This tutorial will cover an overview, talk about sample frames, then common sample designs, and finally talk about sample size determination and allocation.

- 00:33
KEN COPELAND [continued]: [Sample Design Overview] Very briefly, a sample design is the roadmap that is needed for meeting survey requirements in terms of precision, cost, estimate publication levels, and representation of the population of interest.

- 00:55
KEN COPELAND [continued]: To look at a sample design from a high level, we want to look at it in its context of an overall survey design. Sample design is composed of three primary components-- the sample frame, the sampling plan, and the sample size.

- 01:16
KEN COPELAND [continued]: It is informed by the survey requirements in terms of precision, estimates that are required for publication, as well as the data collection methodology. It is needed also for estimation methodology and data processing.

- 01:36
KEN COPELAND [continued]: [Sample Frames] Let's talk first about sample frames. The sample frame is the list of units, whether actual or conceptual, from which the sample is collected. The primary considerations in selecting a sample frame

- 01:58
KEN COPELAND [continued]: are, first of all, how current are the data within the sample frame. Second, how complete are the data, the listing of units within the sample frame. Third, what's the comprehensiveness of the information available on the sample frame. For example, what other variables are available

- 02:19
KEN COPELAND [continued]: that might aid in stratification or other mechanisms for improving the efficiency of the sample design. [Sample Frame Considerations] And finally, what's the accuracy of the information contained on this sample frame. [Sample Frame Sources for Household Surveys] For household surveys, three primary sources

- 02:41
KEN COPELAND [continued]: of sample frames-- one is telephone numbers, which are used in random digit dial sample designs. When designing a sample using telephone numbers, one must include both landline and cell phone numbers. A second is address listings. Primary source there is United States Postal Service delivery

- 03:05
KEN COPELAND [continued]: sequence file in the United States. When using the USPS listing, one should look at doing an actual listing in more rural areas where the addresses don't allow appropriate contact with the specific housing unit.

- 03:25
KEN COPELAND [continued]: And finally, a third source may be listings of populations when focused on a particular subset of the population, such as members of an organization, students at a university, and so forth. [Common Sample Designs]

- 03:45
KEN COPELAND [continued]: Now let's talk about the common sample designs that are used in household surveys. Here, we list five common sample designs-- a simple random sample, systematic random sample, sampling probability proportional to size, stratified sample, and cluster sample.

- 04:06
KEN COPELAND [continued]: I'll talk about each of those in turn. [Simple Random Sample] Simple random sample, every unit on the sample frame has the same probability of selection. A way in which this is done is to order the units, list them from 1 to capital N, where capital N is the number of units on the frame, randomly select a number between 1 and capital N,

- 04:30
KEN COPELAND [continued]: and select that unit into the sample, repeating this process until you have the full sample size of little n. [Systematic Random Sample] For ease of implementation, especially prior to the advent of computers, systematic random samples were used, and are still used in some applications.
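The simple random sample procedure described above (number the units 1 to N, draw a random number, repeat until n units are selected) might be sketched as follows. This is a minimal illustration, assuming the draw is without replacement, which the talk does not specify; the function name and the `unit-` labels are hypothetical.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units; each unit has the same chance of selection.

    Assumption (not stated in the talk): a number that was already
    drawn is ignored and another number is drawn in its place.
    """
    rng = random.Random(seed)
    N = len(frame)
    if n > N:
        raise ValueError("sample size n cannot exceed frame size N")
    selected = set()
    while len(selected) < n:
        # Randomly select a number between 1 and N, inclusive.
        selected.add(rng.randint(1, N))
    # Map the 1-based positions back to frame units, in frame order.
    return [frame[i - 1] for i in sorted(selected)]


# Hypothetical frame of N = 1,000 units and a sample of n = 100.
frame = [f"unit-{i}" for i in range(1, 1001)]
sample = simple_random_sample(frame, 100, seed=1)
print(len(sample))
```

In practice the standard library's `random.sample(frame, n)` does the same job in one call; the loop is written out only to mirror the steps as spoken.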

- 04:50
KEN COPELAND [continued]: In this case, the population is randomly ordered and then a sampling interval is determined-- sampling interval k, where k is the ratio of the total frame size to the sample size. Then after randomly ordering the units on the sample frame,

- 05:11
KEN COPELAND [continued]: a random starting point is selected. The starting point is between 1 and k, inclusive. So for example, if there were 1,000 units in the population, of which 100 were desired to be selected into the sample, the sampling interval would be 1,000 over 100, or 10.
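The interval and random start just described can be sketched in a few lines. This is an illustration only, assuming the frame size is an exact multiple of the sample size (as in the 1,000 and 100 example) and that the frame has already been randomly ordered; the function name is hypothetical.

```python
import random


def systematic_positions(N, n, start=None, seed=None):
    """Return the 1-based frame positions picked by a systematic sample."""
    if N % n != 0:
        # Assumption for this sketch: N is an exact multiple of n.
        raise ValueError("this sketch assumes N is a multiple of n")
    k = N // n  # sampling interval: frame size over sample size
    if start is None:
        # Random starting point between 1 and k, inclusive.
        start = random.Random(seed).randint(1, k)
    if not 1 <= start <= k:
        raise ValueError("start must be between 1 and k, inclusive")
    # Take the starting unit and every k-th unit after it.
    return list(range(start, N + 1, k))


positions = systematic_positions(1000, 100, seed=7)
print(len(positions), positions[1] - positions[0])  # prints "100 10"
```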

- 05:32
KEN COPELAND [continued]: We'd then select a random starting point between 1 and 10, inclusive. Assume that that was 3. Then we would select units-- the third unit, the 13th unit, the 23rd unit, and so forth, working your way all the way through the population. [Sample Probability Proportional to Size] Another sampling approach is to sample the units proportional

- 05:55
KEN COPELAND [continued]: to their size. In this case, units have unequal probabilities of selection, as opposed to the simple random sample. The probability of selection is based upon some measure of size of the unit. Probability proportional to size, or PPS, sampling is commonly used when selecting geographic areas

- 06:16
KEN COPELAND [continued]: as part of the sample design. [Stratified Sample] Another sample design approach is the use of a stratified sample. In this instance, the population is grouped into L mutually exclusive strata, numbered 1 through capital L. Then a sample is selected from each stratum, based

- 06:36
KEN COPELAND [continued]: upon some sampling scheme, such as was described earlier. [Motivation for Stratification] The motivation for stratification is multiple-- one is for variance and bias reduction. So in this case, what you attempt to do is to define strata on the basis of homogeneity, creating strata that have units that are like each other.

- 07:00
KEN COPELAND [continued]: A second is to meet survey requirements for publication. For example, estimates may be required at the state level, in which case states must each be a stratum in the sample design. Another is to control the sample distribution, such that you want to ensure that you have a sample,

- 07:21
KEN COPELAND [continued]: say, spread across all states within the US, even if states are not needed for publication. A stratified sample will be more efficient than a simple random sample of the same size. [Cluster Sample] The final sample design approach that we'll talk about is cluster sample.
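The stratified selection described earlier (group the frame into mutually exclusive strata, then sample within each stratum) might look like the sketch below. It assumes a simple random sample within each stratum, which is only one of the schemes the talk allows; the state codes, unit counts, and function name are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Select a simple random sample independently within each stratum."""
    rng = random.Random(seed)
    # Group the frame into mutually exclusive strata.
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    # Draw the requested number of units from every stratum.
    return {
        label: rng.sample(members, n_per_stratum[label])
        for label, members in strata.items()
    }


# Hypothetical frame: (state, household id) pairs with states as strata.
units = (
    [("IL", i) for i in range(500)]
    + [("WI", i) for i in range(300)]
    + [("IN", i) for i in range(200)]
)
by_state = stratified_sample(
    units,
    stratum_of=lambda unit: unit[0],
    n_per_stratum={"IL": 50, "WI": 30, "IN": 20},
    seed=3,
)
print({state: len(chosen) for state, chosen in by_state.items()})
```

Because every stratum contributes its own sample, each state is guaranteed to appear, which is the "control the sample distribution" motivation mentioned above.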

- 07:42
KEN COPELAND [continued]: In this case, the population is grouped into M mutually exclusive and exhaustive clusters. So in some sense, similar to what we talked about with stratified. However, in this case, only a sample of clusters is selected based upon some sampling scheme. For example, we may select a PPS sample of counties

- 08:05
KEN COPELAND [continued]: based upon the number of housing units in the county. The primary motivation for cluster sampling is to reduce the data collection costs associated with field interviewing. Cluster sampling is less efficient than simple random sampling of the same size.

- 08:26
KEN COPELAND [continued]: This is due to correlation between units within a cluster. [Sample Design May Integrate Types] Now an overall sample design may integrate the types of sampling. So for example, we may start and stratify by state. Within each state, we select a PPS sample of counties. Then the next stage would be to select

- 08:49
KEN COPELAND [continued]: a PPS sample of housing unit block groups within the selected counties. The next step would be then to select a systematic sample of housing units within each block group. And finally, after going to each housing unit, we may select a simple random sample of one adult within each selected housing unit.
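The PPS selection of counties used in the design above can be sketched with the cumulative-size (systematic PPS) method. The talk does not say which PPS method is used, so this is one common choice shown under that assumption; the county names, housing unit counts, and function name are hypothetical.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def pps_systematic(sizes, n, start=None, seed=None):
    """Select n units with probability proportional to a size measure.

    sizes maps each unit name to its measure of size, for example the
    number of housing units in a county.
    """
    names = list(sizes)
    cumulative = list(accumulate(sizes[name] for name in names))
    interval = cumulative[-1] / n  # total size divided by sample size
    if max(sizes.values()) > interval:
        # Assumption for this sketch: no unit is large enough to be hit twice.
        raise ValueError("a unit exceeds the interval; handle it separately")
    if start is None:
        start = random.Random(seed).uniform(0, interval)
    picks = []
    for j in range(n):
        point = start + j * interval
        # A unit is selected when the point lands in its cumulative range.
        index = min(bisect_left(cumulative, point), len(names) - 1)
        picks.append(names[index])
    return picks


# Hypothetical counties and housing unit counts.
counties = {
    "County A": 40_000,
    "County B": 10_000,
    "County C": 30_000,
    "County D": 20_000,
}
print(pps_systematic(counties, n=2, start=25_000))
# prints ['County A', 'County C']
```

With two selections out of 100,000 housing units, County A's chance of selection is 40,000 / 50,000 = 0.8, four times County B's 0.2, matching the ratio of their sizes.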

- 09:10
KEN COPELAND [continued]: [Sample Size Determination & Allocation] Next, we'll turn to sample size determination and allocation. Sample sizes need to be determined to meet one of several objectives.

- 09:32
KEN COPELAND [continued]: One is to minimize the sample size to meet survey precision requirements. More appropriately, the objective is to either minimize variance associated with the sample design for the fixed cost that's allowed for the survey, or to minimize survey costs associated

- 09:54
KEN COPELAND [continued]: with a fixed variance that's required for the survey. There are numerous references that we'll talk about later that detail exactly how to do this. [Assumptions] Sample size determination requires assumptions as to the population size and its variability,

- 10:15
KEN COPELAND [continued]: the data collection costs associated with obtaining completed interviews, the planned stratification and clustering, as well as the intracluster correlation. [Sample Allocation] Finally, we'll talk about sample allocation.
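To show how the assumptions listed above feed into a sample size, the sketch below uses standard textbook formulas that the talk itself does not state: the sample size for estimating a proportion within a margin of error, a finite population correction, and the design effect 1 + (m - 1) * rho for clusters of m units with intracluster correlation rho. All input values and function names are hypothetical.

```python
import math


def srs_size_for_proportion(p, margin, N, z=1.96):
    """Simple random sample size for a proportion, before rounding."""
    n0 = z**2 * p * (1 - p) / margin**2  # infinite-population size
    return n0 / (1 + (n0 - 1) / N)  # finite population correction


def design_effect(cluster_size, rho):
    """Variance inflation from sampling clusters instead of single units."""
    return 1 + (cluster_size - 1) * rho


# Hypothetical inputs: p = 0.5 (most variable case), a margin of
# 5 percentage points at 95% confidence, and a population of 10,000.
n_srs = srs_size_for_proportion(p=0.5, margin=0.05, N=10_000)
# Hypothetical clustering: 10 interviews per cluster, rho = 0.05.
deff = design_effect(cluster_size=10, rho=0.05)
n_clustered = n_srs * deff

print(math.ceil(n_srs), round(deff, 2), math.ceil(n_clustered))
# prints "370 1.45 537"
```

The larger clustered figure reflects the point made earlier that cluster sampling is less efficient than simple random sampling of the same size.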

- 10:36
KEN COPELAND [continued]: After determining the total sample size needed for the survey, the question becomes how do we allocate that sample across strata and clusters. [Sample Allocation Across Strata & Clusters] One approach is equal allocation, wherein we assign the same sample size across strata. This is not a common approach that's used.

- 10:59
KEN COPELAND [continued]: A second approach is proportional allocation, wherein the sample is distributed proportional to the size of the population within each stratum. And then the third is optimal allocation, wherein more sample is assigned to higher variability strata,

- 11:19
KEN COPELAND [continued]: as well as strata with lower collection costs. So this approach will minimize the sample size that's needed for a given variance or collection cost. [Conclusion] So in summary, the sample design is developed so as

- 11:41
KEN COPELAND [continued]: to yield the smallest sample size to meet the survey requirements in terms of precision, cost, and estimate publication level. For further reading, I've listed four references-- Valliant, Dever, and Kreuter, Practical Tools for Designing and Weighting Survey Samples,

- 12:02
KEN COPELAND [continued]: Lohr, Sampling: Design and Analysis; Groves, Fowler, Couper, Lepkowski, Singer, and Tourangeau, Survey Methodology; and Cochran, Sampling Techniques.
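The three allocation approaches described in the talk (equal, proportional, and optimal) can be compared side by side. The sketch below assumes the usual textbook form of optimal allocation, with each stratum's share proportional to N_h * S_h / sqrt(c_h); the talk describes the idea but does not give a formula. The stratum sizes, standard deviations, and unit costs are hypothetical, and results are left unrounded.

```python
import math


def equal_allocation(n, strata):
    """Same sample size in every stratum."""
    return {h: n / len(strata) for h in strata}


def proportional_allocation(n, sizes):
    """Sample proportional to each stratum's population size N_h."""
    total = sum(sizes.values())
    return {h: n * N_h / total for h, N_h in sizes.items()}


def optimal_allocation(n, sizes, std_devs, costs):
    """More sample where variability is high and unit cost is low.

    Assumed weight per stratum: N_h * S_h / sqrt(c_h).
    """
    weights = {
        h: sizes[h] * std_devs[h] / math.sqrt(costs[h]) for h in sizes
    }
    total = sum(weights.values())
    return {h: n * w / total for h, w in weights.items()}


# Hypothetical strata: population sizes, standard deviations, unit costs.
sizes = {"A": 5000, "B": 3000, "C": 2000}
std_devs = {"A": 10, "B": 20, "C": 40}
costs = {"A": 1, "B": 4, "C": 4}

print(equal_allocation(1200, sizes))
print(proportional_allocation(1200, sizes))
print(optimal_allocation(1200, sizes, std_devs, costs))
```

With these inputs, stratum C is the smallest but the most variable, so optimal allocation gives it 400 of the 1,200 interviews against 240 under proportional allocation.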

### Video Info

**Publisher:** SAGE Publications Ltd

**Publication Year:** 2017

**Video Type:** Tutorial

**Methods:** Sampling, Sample bias, Sample size, Random sampling, Cluster sampling

**Keywords:** postal service; practices, strategies, and tools; telephones

### Segment Info

**Segment Num.:** 1

**Persons Discussed:**

**Events Discussed:**

**Keywords:**

## Abstract

Professor Kennon Copeland outlines sampling techniques for survey data collection. He highlights the pros and cons of different methodologies and gives examples of when to use certain approaches.