Sampling Methods and Sampling Theory Alex Stannard


Page 1: Sampling Methods and Sampling Theory Alex Stannard

Sampling Methods and Sampling Theory

Alex Stannard

Page 2: Sampling Methods and Sampling Theory Alex Stannard

Objectives

• Types of samples

• Sampling frames

• Why samples work

Page 3: Sampling Methods and Sampling Theory Alex Stannard

Census or sample?

Edinburgh Council wants to know people's thoughts on its leisure facilities.

A decision is made to send a questionnaire to every household in Edinburgh. A 5% response rate is expected.

This is a census of all households in Edinburgh

Page 4: Sampling Methods and Sampling Theory Alex Stannard

Census or sample?

Boxes of questionnaires are placed at the door to every council leisure facility in Edinburgh. There is a large sign next to the door advertising the questionnaire.

This is a census of everyone that uses the leisure facilities

Sampling theory does not apply to census results

Page 5: Sampling Methods and Sampling Theory Alex Stannard

What is Sampling?

• Identify population

• Select members of population to sample

• Study selected members (the sample)

• Draw inferences about population from sample

Page 6: Sampling Methods and Sampling Theory Alex Stannard

Types of samples

• Non-Probability Samples

Not all members of the population have a chance to be included in the sample. The selection method is not random.

• Probability Samples

Every member of the population has a known, nonzero chance of being included in the sample. The selection method is random.

Page 7: Sampling Methods and Sampling Theory Alex Stannard

Non-Probability Samples

Convenience

• Sample selected simply for ease

Example – Edinburgh Council go to Ocean Terminal and hand questionnaires to everyone they encounter until they have 1000 complete.

• Quick and cheap

Page 8: Sampling Methods and Sampling Theory Alex Stannard

Non-Probability Samples

Convenience

Example – Want to know views on Trams project.

• Go to George St at 16:30 on Friday afternoon and ask 1000 people for their views

• Go to airport at 16:30 on Friday afternoon and ask 1000 people for their views

• Bias is a major problem – sample unlikely to be representative

Page 9: Sampling Methods and Sampling Theory Alex Stannard

Non-Probability Samples

Quota Sampling

• Subjective choice of sample based on what researcher thinks is representative.

• Example – Edinburgh Council send 3 researchers knocking on doors in Granton, Broughton and Gilmerton until they have responses broken down as:

Age        Men  Women
10 to 20   120    100
20 to 30   100    100
30 to 40    55     50
40 to 50    70     65
50 to 60    80     90
60+         60    110

Page 10: Sampling Methods and Sampling Theory Alex Stannard

Non-Probability Samples

Quota Sampling

• Quicker and cheaper than probability sampling

• Large non response

• Still may not be representative

• What do you base your quota on – age, gender, ethnicity, education, tenure, religion?

Page 11: Sampling Methods and Sampling Theory Alex Stannard

Probability Sampling

Example – 1000 households randomly selected across Edinburgh.

• More expensive and slower

• Non response a problem – but resources can be targeted and extent of non response bias can be estimated.

• Enables precision of final statistics to be assessed.

• Sample selection method is objective, specified and replicable.

Page 12: Sampling Methods and Sampling Theory Alex Stannard

Intended and achieved populations

Intended Population

Coverage Bias

Sampling Frame

Sampling Variance

Selected Sample

Non Response

Achieved Sample

Page 13: Sampling Methods and Sampling Theory Alex Stannard

Sampling Frames

• List of all units/people that could be included in sample

• Sample is only as good as the sampling frame

o Eligible units/people not on frame cannot be selected – leads to coverage error

o Listing units/people on frame more than once changes their probability of being selected

o Ineligible units/people on frame can lead to final sample being smaller than intended.

Page 14: Sampling Methods and Sampling Theory Alex Stannard

Sampling Frames

Frames can be created:

• From pre-existing lists (UK postcode address file)

• Geographically with multi-stages

• Through time

Page 15: Sampling Methods and Sampling Theory Alex Stannard

Simple random sampling

• Sampling method completely random based on random numbers.

• Easy to understand, but can be expensive

Example – every household in Edinburgh assigned a number. 1000 random numbers chosen between 1 and 204,683. These numbers identify which households are in sample.

Sources of random numbers:

• Tables of random numbers

• www.random.org

• Excel function ‘=rand()’
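The household example above can be sketched in a few lines of Python. This is only an illustration: the function name and the seed are assumptions made here, and Python's standard random module stands in for the random number sources listed on the slide.

```python
import random

N_HOUSEHOLDS = 204_683   # households on the frame, as in the example
SAMPLE_SIZE = 1_000      # number of households to select


def simple_random_sample(population_size, sample_size, seed=None):
    """Return sorted household numbers (1-based), drawn without replacement."""
    rng = random.Random(seed)  # seed is optional; fixing it makes the draw repeatable
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


selected = simple_random_sample(N_HOUSEHOLDS, SAMPLE_SIZE, seed=42)
print(len(selected), "households selected; first five:", selected[:5])
```

Drawing without replacement means no household can be selected twice, so every household has the same chance of inclusion.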

Page 16: Sampling Methods and Sampling Theory Alex Stannard

Systematic Sampling

• Uses a ‘random’ start on the sampling frame and then selects every i’th unit/person.

• Easy to understand, quick and easy to implement.

• Can lead to some stratification depending on how the list is ordered.

• Can be expensive

• Need to be careful on how list is ordered to avoid bias.

Page 17: Sampling Methods and Sampling Theory Alex Stannard

Systematic Sampling

Example

Total units on sampling frame = 60

Want sample of 10

Interval size is 60/10 = 6

Select random start between 1 and 6

Select every sixth unit

Units numbered 1 to 60, listed in four columns: 1 to 15, 16 to 30, 31 to 45, 46 to 60

Sampling Frame
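The steps of this example can be written as a short Python sketch. The function name and the seed are assumptions made for illustration, and the sketch only handles the simple case where the frame size is an exact multiple of the sample size, as it is here (60 and 10).

```python
import random


def systematic_sample(frame_size, sample_size, seed=None):
    """Pick a random start in the first interval, then take every interval-th unit."""
    if frame_size % sample_size != 0:
        raise ValueError("this sketch assumes frame size is a multiple of sample size")
    interval = frame_size // sample_size   # 60 / 10 = 6 in the example
    rng = random.Random(seed)
    start = rng.randint(1, interval)       # random start between 1 and 6, inclusive
    return list(range(start, frame_size + 1, interval))


sample = systematic_sample(60, 10, seed=3)
print(sample)
```

Only the start is random. Once it is chosen, the rest of the sample is fixed by the order of the list, which is why the ordering of the frame matters.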

Page 18: Sampling Methods and Sampling Theory Alex Stannard

Stratified Sampling

• Units/people are aggregated into subgroups called strata. A certain number of units are sampled from each stratum.

• Guards against unusual samples

• Stratification information has to be available

United Kingdom (61m)

England (51m) Northern Ireland (2m) Scotland (5m) Wales (3m)

Page 19: Sampling Methods and Sampling Theory Alex Stannard

Stratified Sampling

Proportionate Stratification

• Chance of inclusion in sample is same for all units/people regardless of strata

                   Population   Sample Size   Sampling Fraction
United Kingdom     61 000 000         6 100               0.01%
England            51 000 000         5 100               0.01%
Northern Ireland    2 000 000           200               0.01%
Scotland            5 000 000           500               0.01%
Wales               3 000 000           300               0.01%
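The proportionate sample sizes follow from applying one sampling fraction to every stratum. A minimal Python sketch, using the population figures from the slide (the function name is an assumption made here):

```python
populations = {
    "England": 51_000_000,
    "Northern Ireland": 2_000_000,
    "Scotland": 5_000_000,
    "Wales": 3_000_000,
}
TOTAL_SAMPLE = 6_100


def proportionate_allocation(populations, total_sample):
    """Give each stratum a share of the sample equal to its share of the population."""
    total_population = sum(populations.values())
    return {
        stratum: round(total_sample * size / total_population)
        for stratum, size in populations.items()
    }


allocation = proportionate_allocation(populations, TOTAL_SAMPLE)
for stratum, n in allocation.items():
    print(f"{stratum}: {n} ({n / populations[stratum]:.2%} of stratum)")
```

With these round figures the allocation comes out exact. With less convenient numbers, rounding can leave the stratum sizes summing to slightly more or less than the intended total, so a real design would need a rule for handling that.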

Page 20: Sampling Methods and Sampling Theory Alex Stannard

Stratified Sampling

Disproportionate Stratification

• Chance of a unit/person being included in the sample depends on the stratum they are in.

• Often used to target small sub groups to help analysis

                   Population   Sample Size   Sampling Fraction
United Kingdom     61 000 000         6 100               0.01%
England            51 000 000         3 100               0.01%
Northern Ireland    2 000 000         1 000               0.05%
Scotland            5 000 000         1 000               0.02%
Wales               3 000 000         1 000               0.03%
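The sampling fractions for the disproportionate design can be checked with a short Python sketch. The figures are those on the slide; the variable names are assumptions made here.

```python
populations = {
    "England": 51_000_000,
    "Northern Ireland": 2_000_000,
    "Scotland": 5_000_000,
    "Wales": 3_000_000,
}
sample_sizes = {
    "England": 3_100,
    "Northern Ireland": 1_000,
    "Scotland": 1_000,
    "Wales": 1_000,
}

# Sampling fraction = units sampled in the stratum / units in the stratum
fractions = {s: sample_sizes[s] / populations[s] for s in populations}

for stratum, fraction in fractions.items():
    print(f"{stratum}: {fraction:.4%}")
```

Printed to four decimal places, the England fraction is about 0.0061%, which appears as 0.01% once rounded to two decimal places as in the table. A person in Northern Ireland is therefore roughly eight times as likely to be selected as a person in England under this design.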

Page 21: Sampling Methods and Sampling Theory Alex Stannard

Cluster Sampling

What if there are a small number of units across a large area?

• Divide population into clusters along geographic boundaries (e.g. wards)

• Randomly sample clusters

• Measure all units within selected clusters

• Reduces costs for face to face interviews

• But bias can be a problem if what you want to measure depends on geographic location.

Page 22: Sampling Methods and Sampling Theory Alex Stannard

Cluster Sampling

Example – Select 100 random postcodes within Edinburgh and interview all households within the 100 postcodes

• Reduces costs for face to face interviews

• But bias can be a problem if what you want to measure depends on geographic location.
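A rough Python sketch of the postcode example follows. The sampling frame here is synthetic: the postcode labels, the number of postcodes (500) and the household counts are all invented for illustration, as is the function name.

```python
import random


def cluster_sample(frame, n_clusters, seed=None):
    """Randomly select clusters, then keep every unit inside each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(frame), n_clusters)
    return {cluster: list(frame[cluster]) for cluster in chosen}


# Hypothetical frame: 500 postcodes, each holding between 5 and 30 households
builder = random.Random(0)
frame = {
    f"PC{i:04d}": [f"PC{i:04d}-H{j:02d}" for j in range(builder.randint(5, 30))]
    for i in range(500)
}

sample = cluster_sample(frame, 100, seed=1)
total_households = sum(len(households) for households in sample.values())
print(len(sample), "postcodes selected,", total_households, "households to interview")
```

Because every household in a chosen postcode is interviewed, the final number of interviews is not fixed in advance. It depends on which postcodes happen to be drawn.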

Page 23: Sampling Methods and Sampling Theory Alex Stannard

Multistage Sampling

• Larger units are randomly selected

• Smaller units within the larger unit are then randomly selected

• Example – Edinburgh randomly selects 100 postcodes and then selects 10 households within each postcode
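The two stages can be sketched in Python as below. As before, the frame is synthetic: the postcode labels, the 500 postcodes and the household counts (at least 10 per postcode, so that 10 can always be drawn) are assumptions made only for illustration.

```python
import random


def multistage_sample(frame, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly select clusters. Stage 2: randomly select units within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(frame), n_clusters)                        # stage 1
    return {c: rng.sample(frame[c], n_per_cluster) for c in chosen}       # stage 2


# Hypothetical frame: 500 postcodes, each holding between 10 and 30 households
builder = random.Random(0)
frame = {
    f"PC{i:04d}": [f"PC{i:04d}-H{j:02d}" for j in range(builder.randint(10, 30))]
    for i in range(500)
}

sample = multistage_sample(frame, n_clusters=100, n_per_cluster=10, seed=1)
print(len(sample), "postcodes,", sum(len(h) for h in sample.values()), "households")
```

Unlike the cluster sample, the total number of households is fixed here at 100 x 10 = 1000. Note that a household in a small postcode has a higher chance of selection than one in a large postcode under this simple scheme.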

Page 24: Sampling Methods and Sampling Theory Alex Stannard

Multistage Sampling

• Can be useful when no sampling frame is available

• Reduces costs for face to face interviews

• Similar bias problems to clustering

Page 25: Sampling Methods and Sampling Theory Alex Stannard

Why samples work

Employee Number   1   2   3   4   5   6   7   8   9  10
Age              18  32  33  34  34  35  41  44  55  62

We want to know average age of employees

• Actual average age is 38.8

• Decide to use a simple random sample

• Sample size (n) = 4
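Drawing a few samples of four employees by hand quickly shows how much a sample mean can move around the true mean. A small Python sketch using the ages from the slide (the seed and the number of draws are arbitrary choices made here):

```python
import random
from statistics import mean

ages = [18, 32, 33, 34, 34, 35, 41, 44, 55, 62]
population_mean = mean(ages)   # the "actual average age" on the slide

rng = random.Random(2024)      # arbitrary seed so the draws can be repeated
sample_means = []
for _ in range(5):
    sample = rng.sample(ages, 4)          # simple random sample, n = 4
    sample_means.append(mean(sample))

print("Population mean:", population_mean)
print("Sample means:", sample_means)
```

Each run with a different seed gives a different set of sample means, all scattered around the population mean.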

Page 26: Sampling Methods and Sampling Theory Alex Stannard

Why samples work

Employee Number   1   2   3   4   5   6   7   8   9  10
Age              18  32  33  34  34  35  41  44  55  62

Figure: number line of sample means running from 25 to 55, with Actual Mean = 38.8 marked. Over the next slides, the means of five samples of size 4 are added to the line one at a time: 45.3, 35.3, 48.8, 33.5, 29.3.

Page 31: Sampling Methods and Sampling Theory Alex Stannard

Why samples work

Figure: histogram, "Sampling Distribution, n=4". Horizontal axis: Sample Mean, in bins from 29.0 - 29.9 to 50.0 - 50.9. Vertical axis: Number of Samples that Produce Mean, from 0 to 20. Actual Mean = 38.8 is marked.
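With only ten employees, the whole sampling distribution for n = 4 can be listed exhaustively rather than simulated. The Python sketch below enumerates every possible sample of four and groups the sample means into one-year bins like those on the histogram; the variable names are assumptions made here.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

ages = [18, 32, 33, 34, 34, 35, 41, 44, 55, 62]

# Every possible sample of 4 employees out of 10 (order does not matter)
sample_means = [mean(sample) for sample in combinations(ages, 4)]

# Group the means into bins 29.0-29.9, 30.0-30.9, ... by dropping the decimals
histogram = Counter(int(m) for m in sample_means)

print("Number of possible samples:", len(sample_means))
print("Mean of all sample means:", round(mean(sample_means), 1))
for lower in sorted(histogram):
    print(f"{lower}.0 - {lower}.9: {'#' * histogram[lower]}")
```

The mean of all the possible sample means equals the population mean, which is the sense in which a simple random sample gives an unbiased estimate even though any single sample may be off.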

Page 32: Sampling Methods and Sampling Theory Alex Stannard

Why samples work

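Central Limit Theorem

The distribution of the sample mean tends to the normal distribution as the sample size increases, whatever the shape of the population distribution (provided the population has a finite variance and the observations are sampled independently).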
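One way to see the theorem at work is to sample repeatedly from a clearly non-normal population and watch the distribution of the sample mean lose its skew as n grows. The Python sketch below is an illustration under assumptions chosen here: the population is a synthetic right-skewed (exponential) one, and skewness is used as a simple measure of how far the distribution is from the symmetric normal shape.

```python
import random


def skewness(values):
    """Population skewness: 0 for a symmetric distribution, positive for a right tail."""
    n = len(values)
    m = sum(values) / n
    variance = sum((v - m) ** 2 for v in values) / n
    return sum((v - m) ** 3 for v in values) / (n * variance ** 1.5)


def skewness_of_sample_means(n, reps=2000, seed=0):
    """Draw `reps` samples of size n from a right-skewed population; return skew of the means."""
    rng = random.Random(seed)
    means = [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]
    return skewness(means)


for n in (2, 10, 50):
    print(f"n = {n:2d}: skewness of sample means = {skewness_of_sample_means(n):.2f}")
```

The skewness should shrink towards zero as n increases, even though the population itself stays strongly skewed.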

Page 33: Sampling Methods and Sampling Theory Alex Stannard

Normal Distribution

Mean

Page 34: Sampling Methods and Sampling Theory Alex Stannard

Central Limit Theorem

Population Distribution

Annual Household Income

Population Mean

Page 35: Sampling Methods and Sampling Theory Alex Stannard

Central Limit Theorem

Sample Distribution

Annual Household Income

Sample Mean

Page 36: Sampling Methods and Sampling Theory Alex Stannard

Central Limit Theorem

Sample Distribution

Annual Household Income

Sample Mean

Page 37: Sampling Methods and Sampling Theory Alex Stannard

Central Limit Theorem

Sampling Distribution

Sample Mean

Mean of Sample Means

Page 38: Sampling Methods and Sampling Theory Alex Stannard

Central Limit Theorem

The smaller the sample size the more ‘spread out’ the sampling distribution will be.

Small sample size

Larger sample size
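The effect of sample size on spread can be checked directly with the ten employee ages used earlier. The Python sketch below enumerates every possible sample for n = 2, 3, 4 and 5 and reports the standard deviation of the resulting sample means; the function name is an assumption made here.

```python
from itertools import combinations
from statistics import mean, pstdev

ages = [18, 32, 33, 34, 34, 35, 41, 44, 55, 62]


def sampling_spread(n):
    """Standard deviation of the sample mean over every possible sample of size n."""
    sample_means = [mean(sample) for sample in combinations(ages, n)]
    return pstdev(sample_means)


spreads = {n: sampling_spread(n) for n in (2, 3, 4, 5)}
for n, spread in spreads.items():
    print(f"n = {n}: spread of sample means = {spread:.2f}")
```

The spread falls as n rises, so larger samples give sample means that sit closer to the population mean on average.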

Page 39: Sampling Methods and Sampling Theory Alex Stannard

Central Limit Theorem

Figure: Sampling Distribution of Age, n=2. Horizontal axis: Sample Mean Age. Vertical axis: 0 to 6.

Page 40: Sampling Methods and Sampling Theory Alex Stannard

Central Limit Theorem

Figure: Sampling Distribution of Age, n=2,3. Horizontal axis: Sample Mean Age. Vertical axis: 0 to 14.

Page 41: Sampling Methods and Sampling Theory Alex Stannard

Central Limit Theorem

Figure: Sampling Distribution of Age, n=2,3,4. Horizontal axis: Sample Mean Age. Vertical axis: 0 to 20.

Page 42: Sampling Methods and Sampling Theory Alex Stannard

Central Limit Theorem

Figure: Sampling Distribution of Age, n=2,3,4,5. Horizontal axis: Sample Mean Age. Vertical axis: 0 to 30.

Page 43: Sampling Methods and Sampling Theory Alex Stannard

Central Limit Theorem

Figure: Sampling Distribution of Age, n=2,3,4,5. Horizontal axis: Sample Mean Age. Vertical axis: 0 to 30, with a second scale from 0.005 to 0.075.

Page 44: Sampling Methods and Sampling Theory Alex Stannard

Census vs. Sample

• Questionnaire sent to every household in Edinburgh (204,683). 5% response rate – 10,234 responses.

• Simple random sample of 1250 people in Edinburgh. Extra resources give 80% response rate – 1000 responses.

Which method gives more accurate results?
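One way to explore the question is a small simulation. Everything about the response behaviour below is hypothetical and chosen only for illustration: half the population is assumed to be satisfied with the facilities, dissatisfied households are assumed to be almost twice as likely to return the mailed questionnaire (6.5% against 3.5%), and the followed-up sample is assumed to have only a small response difference (82% against 78%). Households and people are treated as the same units for simplicity.

```python
import random

rng = random.Random(7)   # arbitrary seed so the run can be repeated

N = 204_683              # units in the population, from the slide
population = [rng.random() < 0.5 for _ in range(N)]   # True = satisfied (assumed 50%)
true_share = sum(population) / N


def responds(satisfied, p_satisfied, p_dissatisfied):
    """Hypothetical response model: response chance depends on the person's opinion."""
    return rng.random() < (p_satisfied if satisfied else p_dissatisfied)


# Questionnaire to everyone, about 5% respond, dissatisfied more likely to reply
census_responses = [s for s in population if responds(s, 0.035, 0.065)]
census_estimate = sum(census_responses) / len(census_responses)

# Simple random sample of 1250, about 80% respond, response nearly unrelated to opinion
selected = rng.sample(population, 1250)
sample_responses = [s for s in selected if responds(s, 0.78, 0.82)]
sample_estimate = sum(sample_responses) / len(sample_responses)

print(f"True share satisfied:   {true_share:.3f}")
print(f"Census, {len(census_responses)} responses: {census_estimate:.3f}")
print(f"Sample, {len(sample_responses)} responses:  {sample_estimate:.3f}")
```

Under these assumptions the much larger set of census responses lands further from the truth, because its error comes from who chooses to respond rather than from sample size. If response were unrelated to opinion, the larger set of responses would be the more precise one.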