INTRODUCTION TO SURVEY SAMPLING
October 8, 2014
Karen Foote Retzer
www.srl.uic.edu
General information
• Please hold questions until the end of the presentation
• Slides available at www.srl.uic.edu/SEMINARS/Fall14Seminars.htm
• Please raise your hand so that I can see that you can hear me
Outline
• Introduction
• Target Populations
• Sample Frames
• Sample Designs
• Determining Sample Sizes
• Modes of Data Collection
• Questions
Introduction
Census:
• Gathering information about every individual in a population
Sample:
• Selection of a small subset of a population
Why sample instead of taking a census?
• Less expensive
• Less time-consuming
• Can be more accurate, because effort can go into careful measurement of fewer cases
• Samples can lead to statistical inference about the entire population
Probability vs. non-probability
• Probability Sample
  • Generalize to the entire population
  • Unbiased results
  • Known, non-zero probability of selection
• Non-probability Sample
  • Exploratory research
  • Convenience
  • Probability of selection is unknown
Target population
Definition: The population to which we want to generalize our findings
• Unit of analysis: Individual/Household/City
• Geography: State of Illinois/Champaign County/City of Urbana
• Age/Gender
• Other variables
Examples of target populations
• Population of adults in Champaign County
• Faculty, staff, or students at the University of Illinois
• Youth age 5 to 18 in Champaign County
Sampling frame
• A complete list of all units, at the first stage of sampling, from which a sample is drawn
• For example, lists of . . .
  • addresses
  • landline phone numbers in specific area codes
  • blocks or census tracts in specified geographic areas
  • members of a professional organization
  • schools
  • cell phone numbers
Target populations, sample frames, and coverage
Example 1:
• Population: Adults in Champaign County, IL
• Frames: List of landline numbers, list of census blocks, list of addresses
Example 2:
• Population: Youth age 5 to 18 in Cook County
• Frame: List of schools
Example 3:
• Population: Adults age 18-34 in United States
• Frame: ??
Coverage: What part of the target population is not included in these sample frames?
Sample designs for probability samples
• Simple random samples
• Systematic samples
• Stratified samples
• Cluster samples
• Multi-stage samples
Simple random sampling
• Definition: Every element has the same probability of selection and every combination of elements has the same probability of selection.
• Probability of selection: n/N, where n = sample size; N = population size
• Use random number tables or software packages to generate random numbers
• Most standard precision estimates (such as standard errors) assume SRS
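The SRS definition above can be sketched in a few lines of Python. This is only an illustration: the function name `simple_random_sample`, the seed, and the frame of 1,000 made-up unit IDs are assumptions, not part of the seminar.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement, so each unit has probability n/N."""
    if n > len(frame):
        raise ValueError("sample size n cannot exceed population size N")
    rng = random.Random(seed)  # seeded only to make the draw repeatable
    return rng.sample(frame, n)

# Hypothetical frame: a list of 1,000 unit identifiers.
frame = [f"unit-{i}" for i in range(1, 1001)]
sample = simple_random_sample(frame, n=50, seed=2014)

print(len(sample))             # number of units drawn
print(len(sample) / len(frame))  # selection probability n/N
```

Because `random.sample` draws without replacement, no unit can appear twice, and every subset of size n is equally likely to be chosen.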
Systematic sampling
• Definition: Every element has the same probability of selection, but not every combination can be selected.
• Use when drawing SRS is difficult
  • List of elements is long & not computerized
• Procedure
  • Determine population size N and sample size n
  • Calculate sampling interval i = N/n
  • Pick random start between 1 & sampling interval
  • Take every ith case
• Problem of periodicity (a cyclical pattern in the list that lines up with the sampling interval can bias the sample)
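The systematic sampling procedure above might look like this in Python. It is a minimal sketch under one simplifying assumption: the interval is rounded down to a whole number, so when N is not a multiple of n the last few units on the list can never be selected. The function name and the example frame are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Take every i-th unit after a random start, where i = N // n."""
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and the population size N")
    interval = N // n                  # sampling interval (assumption: rounded down)
    rng = random.Random(seed)
    start = rng.randint(1, interval)   # random start between 1 and the interval
    # Convert the 1-based start to a 0-based index, then step by the interval.
    return [frame[start - 1 + k * interval] for k in range(n)]

# Hypothetical frame: units numbered 1 to 100, sample of 10.
frame = list(range(1, 101))
sample = systematic_sample(frame, n=10, seed=2014)
print(sample)
```

Only `interval` distinct samples are possible here (one per random start), which is why not every combination of elements can be selected.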
Stratified sampling: Proportionate
• To ensure sample resembles some aspect of population
• Population is divided into subgroups (strata)
  • Students by year in school
  • Faculty by gender
• Simple Random Sample (with same probability of selection) taken from each stratum.
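As a rough illustration of proportionate stratification, the sketch below allocates a total sample across strata in proportion to their sizes and then draws an SRS within each stratum. The enrollment counts, the total sample size of 400, and the function name are invented for the example; rounding may make the allocations sum to slightly more or less than n in other cases.

```python
import random

def proportionate_allocation(stratum_sizes, n):
    """Allocate n across strata in proportion to each stratum's size."""
    total = sum(stratum_sizes.values())
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}

# Hypothetical enrollment counts by year in school.
stratum_sizes = {"freshman": 3000, "sophomore": 2500, "junior": 2500, "senior": 2000}
allocation = proportionate_allocation(stratum_sizes, n=400)

# Draw an SRS of row indices within each stratum (same sampling fraction everywhere).
rng = random.Random(2014)
sample = {
    name: rng.sample(range(stratum_sizes[name]), allocation[name])
    for name in stratum_sizes
}
print(allocation)
```

Each stratum ends up with the same sampling fraction (here 400/10,000), so the sample mirrors the population on the stratifying variable.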
Stratified sampling: Disproportionate
• Major use is comparison of subgroups
• Population is divided into subgroups (strata)
  • Compare girls & boys who play Little League
  • Compare seniors & freshmen who live in dorms
• Probability of selection needs to be higher for the smaller stratum (girls & seniors) to be able to compare subgroups.
• Post-stratification weights (needed to restore population proportions in overall estimates)
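To see why weights matter after oversampling a small stratum, here is a small sketch. It assumes the weights are computed as the inverse of each stratum's selection probability (N_h / n_h), which is one common way to build them; the slide does not spell out a formula. All counts are made up.

```python
def stratum_weights(stratum_sizes, sample_sizes):
    """Weight = population size / sample size in each stratum (inverse selection probability)."""
    return {h: stratum_sizes[h] / sample_sizes[h] for h in stratum_sizes}

def weighted_proportion(yes_counts, sample_sizes, weights):
    """Overall proportion answering 'yes', weighting each stratum back to its population share."""
    numerator = sum(weights[h] * yes_counts[h] for h in yes_counts)
    denominator = sum(weights[h] * sample_sizes[h] for h in sample_sizes)
    return numerator / denominator

# Hypothetical Little League example: girls are oversampled to allow comparison.
stratum_sizes = {"boys": 900, "girls": 100}   # population counts (invented)
sample_sizes = {"boys": 100, "girls": 100}    # equal samples for subgroup comparison
yes_counts = {"boys": 60, "girls": 80}        # invented survey responses

weights = stratum_weights(stratum_sizes, sample_sizes)
unweighted = sum(yes_counts.values()) / sum(sample_sizes.values())
weighted = weighted_proportion(yes_counts, sample_sizes, weights)
print(unweighted, weighted)
```

In this made-up data the unweighted figure overstates the overall proportion because girls make up half the sample but only a tenth of the population.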
Cluster sampling
• Typically used in face-to-face surveys
• Population divided into clusters
  • Schools (earlier example)
  • Blocks
• Reasons for cluster sampling
  • Reduction in cost
  • No satisfactory sampling frame available
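A one-stage cluster sample can be sketched as follows: select whole clusters at random and include every member of the chosen clusters. The school names, student IDs, and function name are hypothetical, and real designs often add a second stage that samples within each selected cluster.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select n_clusters whole clusters at random and keep all of their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names, not individuals
    return {name: clusters[name] for name in chosen}

# Hypothetical frame: a list of schools, each with its own roster of students.
clusters = {
    "School A": ["a1", "a2", "a3"],
    "School B": ["b1", "b2"],
    "School C": ["c1", "c2", "c3", "c4"],
    "School D": ["d1", "d2"],
    "School E": ["e1", "e2", "e3"],
}
chosen = cluster_sample(clusters, n_clusters=2, seed=2014)
print(sorted(chosen))
```

Only a list of schools is needed up front; student rosters are required only for the schools that are selected, which is where the cost savings come from.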
Determining sample size: SRS
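• Need to consider
  • Precision
  • Variation in subject of interest
• Formula: $n_0 = \dfrac{\mathrm{CI}^2 \cdot pq}{\mathrm{Precision}^2}$, where CI is the critical value for the chosen confidence level (1.96 for 95%), p is the expected proportion, and q = 1 - p
• For example: $n_0 = \dfrac{1.96^2 \times (0.5 \times 0.5)}{0.05^2} \approx 384$
• In this formula, sample size does not depend on population size.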
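The sample size formula above is easy to check numerically. The sketch below uses the slide's example values as defaults; the function and parameter names are illustrative only.

```python
import math

def srs_sample_size(z=1.96, p=0.5, precision=0.05):
    """n0 = z^2 * p * q / precision^2, with q = 1 - p."""
    q = 1 - p
    return (z ** 2 * p * q) / precision ** 2

n0 = srs_sample_size()
print(round(n0, 2))   # unrounded result of the formula
print(math.ceil(n0))  # round up, since a fraction of a respondent is not possible
```

Setting p = 0.5 maximizes p * q, so it gives the most conservative (largest) sample size when the true proportion is unknown.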
Sample size: Other issues
• Finite Population Correction: $n = \dfrac{n_0}{1 + n_0/N}$, where $n_0$ is the uncorrected SRS sample size and N is the population size
• Design effects
• Analysis of subgroups
• Increase size to accommodate nonresponse
• Cost
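Two of the adjustments above can be sketched in Python: the finite population correction and inflating the sample to allow for nonresponse. The population of 2,000 and the 50% response rate are invented for illustration, and dividing by the expected response rate is one simple way to handle nonresponse, not necessarily the presenter's.

```python
import math

def finite_population_correction(n0, N):
    """n = n0 / (1 + n0 / N): shrinks the required sample when N is small."""
    return n0 / (1 + n0 / N)

def inflate_for_nonresponse(n, response_rate):
    """Number of cases to field so that about n completes are expected."""
    return math.ceil(n / response_rate)

n0 = 384.16                                     # uncorrected SRS sample size
n = finite_population_correction(n0, N=2000)    # hypothetical population of 2,000
to_field = inflate_for_nonresponse(n, response_rate=0.5)  # assumed 50% response rate
print(round(n, 1), to_field)
```

For very large populations the correction has almost no effect, which is consistent with the uncorrected formula ignoring population size.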
Modes of data collection
• Face to face
• Phone
• Web
Target population/frame/mode correspondence
• Mode needs to be consistent with information in sample frame
• Mode needs to be consistent with target population
Cell phone and landline frames
• An increasing proportion of US households are cell phone only
• Cell phone only households tend to be
  • Unrelated adults
  • Hispanic adults
  • Younger
  • Lower SES
• Landline-only sample frames can lead to coverage bias
Cell phone and landline frames, cont.
• Cell phone frames are harder to target geographically than landline frames
• Survey researchers are combining landline and cell phone frames
Address-based sampling
• Sampling addresses from a near-universal listing of residential mail delivery locations
• U.S. Postal Service Delivery Sequence File (DSF)
Address-based sampling: advantages
• Coverage of households is very high
• Can be matched to name and listed telephone numbers
• Includes non-telephone households
• More efficient than traditional block-listing
Address-based sampling: disadvantages
• Incomplete in rural areas (although improving with 9-1-1 address conversion)
• Difficulties with “multidrop” addresses
Thank you!
Future noontime webinars
• Introduction to Web Surveys, Wednesday, October 15
• Introduction to Questionnaire Design, Wednesday, October 22
• Introduction to Survey Data Analysis: Addressing Survey Design and Data Quality, Wednesday, October 29