SCM300 Survey Design

Lecture 2: Sampling Techniques

For use in fall semester 2015. Lecture notes were originally designed by Nigel Halpern. This lecture set may be modified during the semester.

Last modified: 4-8-2015


Lecture Aim & Objectives

Aim
• To investigate issues relating to sampling techniques for survey research

Objectives
• What is a sample?
• How should the sample be obtained?
  – Sampling considerations
  – Sampling techniques
  – Sources of error & degrees of confidence
• How large should the sample be?

What is Sampling?

• Method for selecting people or things from which you plan to obtain data
• Closely associated with quantitative methods
  – e.g. surveys or experiments
• Sometimes associated with qualitative methods
  – e.g. content analysis & ethnography
• Used because it’s rarely feasible or effective to include every person or item in a survey or study

Not Feasible or Effective…..

• Travel patterns of UK adults
• Need to survey 50mn+ people!
  – The UK government conducts a Census of Population every 10 years but this costs tens of £mn’s
• Even a survey of annual cruise passengers visiting Molde would be costly & time consuming
• Sampling provides a feasible & effective solution

What is a Sample?

“A sample is a portion or sub-set of a larger group called a population” (Fink, 2003; p33)

Note: sampling isn’t necessary when you survey the entire population!

What is a Population?

• It can consist of human & non-human phenomena
  – Organisations, businesses, geographical areas, households, individuals
• Examples:
  – Hotels in Møre og Romsdal (population of hotels)
  – Beaches in Australia (population of beaches)
  – People in Norway (population of Norway)
  – Households in Molde (population of households)
  – Visitors to a resort (population of visitors)
  – Users of a ferry service (population of users)
  – Students at HiMolde (population of students)

Aims of Sampling

• Provide a small & more manageable portion or sub-set of the population
• Represent the population & be free from bias
  – Results for the sample should be similar if the survey was conducted on another sample from the same population
  – i.e. results are repeatable & reliable


The Need for Reliable Representation

Extracting a Sample

Two main sources
• From a sampling frame
  – A list of all known cases in a population from which a sample can be drawn
• Sampled at source
  – Points in time/space where a potential population is available


Typical Sampling Frames

• Electoral register – individuals over 18

• Telephone directories – households

• Royal Mail – households

• Market research companies – households / postcodes / census areas

• Businesses – customers

• Organisations / clubs / trade associations – members

• Magazines / newsletters – subscribers

• Local authorities / CCI – households / employers

• Business / trade directories – businesses

• Yellow pages – clubs / organisations / businesses

• Tourism offices – reservations / visitors’

• Hotels/accommodation – registration records / reservations

Sampling Frames

• Only available where there is a finite population
  – i.e. where the population can be clearly defined
• Potential problems
  – List not up-to-date / only updated periodically
    • Lags in registration & deregistration
  – Clusters of individuals create complexities
    • e.g. making sure you survey the correct individual in a sampling frame of households
  – Some cost money to access or are confidential

Sampling at Source

• The population is not clearly defined when sampling at source
  – e.g. shopping streets, visitor attractions, transport terminals, museums, sporting events, etc
• Problems
  – The population is fairly vague (‘hanging around’)
  – Individuals present are not listed in any form which would constitute a sampling frame
  – Sampling is more challenging

Sampling Considerations

Two key Q’s to address in any sample survey

1. How should the sample be obtained?
   a. Who or what should be sampled (eligibility criteria)?
   b. Who do you survey (profiles & individuals in clusters)?
   c. When should sampling take place (timing & timescale)?
   d. Where should the survey be administered (location)?
   e. What sampling technique do you use (probability versus non-probability)?
2. How large should the sample be?

How Should the Sample be Obtained?

a. Who or what should be sampled?
  – This means defining the eligibility criteria
b. Who do you survey?
  – Households, visitor attractions, shopping streets, etc will normally have people in clusters as opposed to individuals
  – Ensure that the survey is completed by the correct individual

How Should the Sample be Obtained?

c. When should the sampling take place?
  – Time of year, month, day, time
  – Duration of the sampling process
  – Useful to
    • Have some prior knowledge of the phenomena to be sampled as results may be biased by particular times of day or year or weekly, monthly & seasonal variations
    • Spread the sampling over different times, days, months, etc to reduce potential for bias

How Should the Sample be Obtained?

d. Where should the survey be administered?
  – This could be determined by the definition of the population
    • e.g. surveys sent to postal addresses
  – On-site surveys should consider location of interviewers
    • e.g. recreation areas or tourist attractions tend to have natural or pre-defined entry & exit points
  – If using multiple interviewers, strict instruction must be given on where to stand

How Should the Sample be Obtained?

e. What sampling technique should be used?

Two main options

Probability Techniques
1. Simple random sampling
2. Systematic random sampling
3. Stratified random sampling
4. Cluster sampling
5. Multi-stage sampling

Non-Probability Techniques
1. Haphazard sampling
2. Purposive sampling
   a. Judgement sampling
   b. Quota sampling
   c. Snowball sampling
   d. Expert choice sampling

Sampling Techniques

• Choice of technique is dependent on 2 Q’s
  – Is the population known/clearly defined?
  – Can the population be listed as a sampling frame?
• Yes to either Q
  – Allows for Probability Techniques (used with sampling frames)
• No or uncertainty
  – Sampling is complex & based on Non-Probability Techniques (used when sampling at source)

Probability Sampling Techniques

1. Simple random sampling
• Each unit has an equal chance of selection
  – e.g. lottery draw, names pulled from a list
  – Probability of selection is:
    • (sample size/total population)*100
    • e.g. (100/1,000)*100 = 10% (a 1 in 10 chance)
• Should really use a table of random numbers
  – e.g. see http://stattrek.com/Tables/Random.aspx
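As a rough illustration of simple random sampling, the sketch below draws 100 units from a hypothetical frame of 1,000 and computes the probability of selection from the formula on this slide. The function names, the numbered frame and the seed are assumptions made for the example; a pseudo-random generator stands in for a printed table of random numbers.

```python
import random


def simple_random_sample(units, sample_size, seed=None):
    """Draw sample_size units without replacement; every unit is equally likely."""
    rng = random.Random(seed)  # the seed is optional and only makes the draw repeatable
    return rng.sample(units, sample_size)


def selection_probability(sample_size, population_size):
    """Chance of any one unit being selected, as a percentage."""
    return sample_size * 100 / population_size


# Hypothetical sampling frame: units numbered 1 to 1,000
frame = list(range(1, 1001))
sample = simple_random_sample(frame, 100, seed=2015)

print(len(sample))                       # 100 units drawn, no repeats
print(selection_probability(100, 1000))  # 10.0, i.e. a 1 in 10 chance
```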

Table of Random Numbers

Create a sample of 10 from a population of Norway’s top 30 football clubs

1 7 2 5 8 9 4 0 4 6 3 8 7 0 3 3 2 1 2 7 4 3 7 9
7 1 3 5 5 3 2 2 8 1 5 3 7 9 9 6 6 0 1 7 3 5 4 9
3 1 4 9 2 4 0 9 3 5 4 2 1 9 2 1 9 3 3 6 2 5 2 7
0 3 7 8 3 1 0 6 9 1 4 6 4 2 0 4 7 6 5 3 8 6 4 2

01. Ham-Kam  02. Bodø Glimt  03. Hereford United  04. Brann  05. Bryne
06. Lillestrøm  07. Lyn  08. Molde  09. Odd Grenland  10. Stabæk
11. Start  12. Sogndal  13. Vålerenga  14. Viking  15. Aalesund
16. Haugesund  17. Rosenborg  18. Hønefoss  19. Tromsø  20. Sandefjord
21. Åsane  22. Hødd  23. Lørenskog  24. Strømsgodset  25. Fredrikstad
26. Mjøndalen  27. Ranheim  28. Tromsdalen  29. Moss  30. Træff
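One common way of using a table of random numbers can be sketched as follows. The reading rule is an assumption, since the slide does not state one: the table is treated as four rows of 24 digits, read left to right in two-digit pairs, and a pair is kept only if it falls between 01 and 30 and has not already been drawn. The function name is hypothetical.

```python
def sample_from_random_table(rows, population_size, sample_size):
    """Read two-digit numbers left to right, row by row, keeping valid new ones."""
    digits = [d for row in rows for d in row.split()]
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        number = int(digits[i] + digits[i + 1])
        # Skip numbers outside the list (00, 31-99) and numbers already drawn
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen


# The random number table from this slide, split into rows of 24 digits
rows = [
    "1 7 2 5 8 9 4 0 4 6 3 8 7 0 3 3 2 1 2 7 4 3 7 9",
    "7 1 3 5 5 3 2 2 8 1 5 3 7 9 9 6 6 0 1 7 3 5 4 9",
    "3 1 4 9 2 4 0 9 3 5 4 2 1 9 2 1 9 3 3 6 2 5 2 7",
    "0 3 7 8 3 1 0 6 9 1 4 6 4 2 0 4 7 6 5 3 8 6 4 2",
]

picked = sample_from_random_table(rows, population_size=30, sample_size=10)
print(picked)  # [17, 25, 21, 27, 22, 24, 9, 19, 3, 6]
```

Each number is then matched to the club with that position in the list. A different reading rule (e.g. reading down columns) would give a different but equally valid sample.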

Your turn…..

Create a sample of 10 from a population of England’s top 30 football clubs

7 2 5 8 9 4 0 4 6 3 8 7 0 3 3 2 1 2 7 4 3 7 9 2
2 3 5 5 3 2 2 8 1 5 3 7 9 9 6 6 0 1 7 3 5 4 9 7
6 4 9 2 4 0 9 3 5 4 2 1 9 2 1 9 3 3 6 2 5 2 7 3
3 7 8 3 1 0 6 9 1 4 6 4 2 0 4 7 6 5 3 8 6 4 2 2

01. Chelsea  02. Wigan Athletic  03. Aston Villa  04. Manchester City  05. Reading
06. Carlisle  07. Luton Town  08. Portsmouth  09. Leicester City  10. Derby County
11. Bolton  12. Hereford United  13. Cheltenham  14. Liverpool  15. Fulham
16. Sunderland  17. Middlesbrough  18. Arsenal  19. Swindon Town  20. Everton
21. West Ham  22. Millwall  23. Tottenham  24. Birmingham  25. Brighton
26. Blackburn  27. Nottingham Forest  28. Newcastle  29. Crewe  30. Manchester United

Simple Random Sampling

• Quick, cheap n’ easy…
• Each unit has an equal chance of selection…
• Need to list units of the population
  – Difficult to do with a large sampling frame…

Probability Sampling Techniques

2. Systematic random sampling
• Pull one unit from a list at regular intervals
  – e.g. every nth name from a membership list
• Commonly used by production companies to survey product quality


Procedure for Systematic Random Sampling

Example (using a small sampling frame) of 30 students

1. Andy Anderson  2. Anita Ashley  3. Ben Ball  4. Carol Crow  5. David Dent
6. Eddie East  7. Flora Field  8. Gaynor Green  9. Harold Harvey  10. Ineka Ince
11. Jai Jones  12. Keith Kent  13. Lorna Law  14. Larry Love  15. Mike Matthews
16. Nigel North  17. Oscar Oliver  18. Paul Plumber  19. Peter Parson  20. Richard Reed
21. Sarah Smith  22. Simon South  23. Tony Tapp  24. Tom Trade  25. Ursula Unger
26. Veronica Vallis  27. Vic Vaxley  28. Wayne West  29. Yen Yeah  30. Zac Zachid

• Sample 10 from a population of 30
• 30/10=3, select a number between 1 & 3 to start from (e.g. 2), then select every 3rd number
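The procedure in this example can be sketched in a few lines. The function name is hypothetical, and the frame is represented simply by the list positions 1 to 30 rather than by the students’ names.

```python
def systematic_sample(units, sample_size, start):
    """Take every k-th unit, where k = population size / sample size."""
    interval = len(units) // sample_size
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the sampling interval")
    # start is a 1-based position in the list, hence the start - 1
    return units[start - 1::interval][:sample_size]


positions = list(range(1, 31))  # positions 1-30 in the sampling frame
print(systematic_sample(positions, 10, start=2))
# [2, 5, 8, 11, 14, 17, 20, 23, 26, 29]
```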

Your turn…..

Sample 6 from the list of 30, starting at 3

1. Rafael Nadal  2. Kurt Asle Arvesen  3. Thierry Henry  4. Steffi Graf  5. John Carew
6. Bjørn Dæhlie  7. Hermann Maier  8. Roger Federer  9. Andy Murray  10. Thor Hushovd
11. Steffen Iversen  12. Alex Zülle  13. Niki Lauda  14. Steffen Kjærgaard  15. Michael Schumacher
16. Guus Hiddink  17. Jacques Villeneuve  18. Katarina Witt  19. David Beckham  20. Renate Götschl
21. Marco Van Basten  22. John Arne Riise  23. John Tavares  24. Fernando Torres  25. Boris Becker
26. Bernard Hinault  27. Emanuel Pogatetz  28. Martina Hingis  29. Arantxa S-Vicario  30. Lewis Hamilton

Probability Sampling Techniques

3. Stratified random sampling
• Simple/systematic could miss particular groups when using a small population
  – e.g. mature students
• Prior knowledge may suggest that inclusion of a group(s) is necessary
  – e.g. mature students perform better than others
• Stratified random sampling samples according to groups (strata)


Procedure for Stratified Random Sampling

Example

Survey a Sample of 400 Households in a County

Households in the county

District      Share of households   Sample
District 1    40%                   100
District 2    10%                   100
District 3    25%                   100
District 4    25%                   100

Randomly select an equal amount from each of the 4 districts in the county (e.g. 100 from each for a sample of 400)
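A minimal sketch of this stratified example is given below. The county of 10,000 households is hypothetical (only the 40/10/25/25% split comes from the slide), as are the function name and household labels; each district is treated as a stratum and 100 households are drawn at random from each.

```python
import random


def stratified_sample(strata, per_stratum, seed=None):
    """Draw the same number of units at random from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, per_stratum) for name, units in strata.items()}


# Hypothetical county of 10,000 households split 40% / 10% / 25% / 25%
sizes = {"District 1": 4000, "District 2": 1000, "District 3": 2500, "District 4": 2500}
strata = {
    name: [f"{name}, household {i}" for i in range(1, size + 1)]
    for name, size in sizes.items()
}

sample = stratified_sample(strata, per_stratum=100, seed=2015)
print({name: len(units) for name, units in sample.items()})  # 100 from each district
print(sum(len(units) for units in sample.values()))          # 400 in total
```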

Problem Associated with Multiple Variables

• The sample is representative of a single variable but not of others
  – e.g. representative of the 4 districts in the county but not necessarily of age of residents
• Where multiple variables are required, the benefits of stratified random sampling diminish in favour of simple/systematic random sampling
• This problem is less likely when creating a large sample

Problem Associated with Time & Cost

• Stratified divides into groups, then selects units using random sampling
• Random sampling may produce a sample that is geographically dispersed
  – Especially problematic for face-to-face surveys
    • e.g. the 100 units selected for the household survey in districts 1-4 may come from different parts of each district and interviewers may need to travel vast distances between each unit to conduct their surveys
• Clustering can overcome this problem

Probability Sampling Techniques

4. Cluster sampling
• Draw from mutually exclusive sub-groups
  – e.g. the 100 units selected for the household survey in districts 1-4 will be selected in clusters instead of randomly

Example: Stratified versus Cluster

Households in the county: District 1 (40%), District 2 (10%), District 3 (25%), District 4 (25%)

• Stratified takes an equal amount from each (e.g. 100 from each for a sample of 400)
• Cluster takes a proportionate amount from each & in clusters (e.g. 16 clusters of 10 from district 1, 4 clusters of 10 from district 2, 10 clusters of 10 from each of districts 3 & 4, for a sample of 400)
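The cluster allocation in this example can be checked with a short sketch. The function names and the number of clusters available in each district are assumptions (a hypothetical county of 10,000 households divided into clusters of 10); only the district shares, the cluster size and the sample size come from the slide.

```python
import random


def clusters_per_district(share_percent, sample_size, cluster_size):
    """Allocate clusters to districts in proportion to their share of households."""
    total_clusters = sample_size // cluster_size
    return {d: total_clusters * pct // 100 for d, pct in share_percent.items()}


def pick_clusters(allocation, clusters_available, seed=None):
    """Randomly choose which numbered clusters to survey in each district."""
    rng = random.Random(seed)
    return {
        d: rng.sample(range(1, clusters_available[d] + 1), k)
        for d, k in allocation.items()
    }


shares = {"District 1": 40, "District 2": 10, "District 3": 25, "District 4": 25}
allocation = clusters_per_district(shares, sample_size=400, cluster_size=10)
print(allocation)  # {'District 1': 16, 'District 2': 4, 'District 3': 10, 'District 4': 10}

# Hypothetical number of 10-household clusters in each district
available = {"District 1": 400, "District 2": 100, "District 3": 250, "District 4": 250}
chosen = pick_clusters(allocation, available, seed=2015)
print(sum(len(c) for c in chosen.values()) * 10)  # 400 households in total
```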

The Problem with Cluster Sampling

• Whilst cluster sampling provides huge time & cost savings, it is likely to have a much greater potential for sampling error
  – i.e. certain parts of each district will be excluded

Probability Sampling Techniques

5. Multi-stage sampling
• Experts increasingly use a combination of probability sampling techniques
  – e.g. sample attitudes to tourists in Norway’s towns
    • Draw up a sampling frame of towns in Norway
    • Randomly (simple, systematic or stratified) select an appropriate number of towns
    • Randomly select an appropriate number of electoral wards (geographical units from which politicians are elected) from each town
    • Randomly select an appropriate number of voters from the electoral register of each ward
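The stages above can be sketched with a small, entirely hypothetical sampling frame (6 towns, 4 wards per town, 50 voters per ward). The function name, the frame and the numbers selected at each stage are assumptions for illustration, and simple random sampling is used at every stage.

```python
import random


def multi_stage_sample(frame, n_towns, n_wards, n_voters, seed=None):
    """Select towns, then wards within each town, then voters within each ward."""
    rng = random.Random(seed)
    selected = []
    for town in rng.sample(sorted(frame), n_towns):          # stage 1: towns
        wards = frame[town]
        for ward in rng.sample(sorted(wards), n_wards):      # stage 2: wards
            for voter in rng.sample(wards[ward], n_voters):  # stage 3: voters
                selected.append((town, ward, voter))
    return selected


# Hypothetical frame: town -> ward -> electoral register (list of voters)
frame = {
    f"Town {t}": {
        f"Ward {w}": [f"T{t}-W{w}-Voter {v}" for v in range(1, 51)]
        for w in range(1, 5)
    }
    for t in range(1, 7)
}

sample = multi_stage_sample(frame, n_towns=3, n_wards=2, n_voters=10, seed=2015)
print(len(sample))  # 3 towns x 2 wards x 10 voters = 60
```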

Non-Probability Sampling Techniques

1. Haphazard sampling (accidental, convenience or availability)
  – Samples drawn at the convenience of the interviewer
    • e.g. people on a street that are available & willing to participate
  – This technique should still be systematic
    • e.g. stop 1 in every 10 passers-by
    • Don’t just stop those that you fancy.............!

Non-Probability Sampling Techniques

2. Purposive sampling
a. Judgement: samples are believed to possess the necessary attributes
  • e.g. mature students for a survey on mature students
b. Quota: selection according to a pre-specified sampling frame
  • e.g. for a sample of 100 mature students, select 75 aged 21-25 and 25 aged 26+, on the presumption that 75% of mature students are 21-25 and 25% are 26+
  • The problem is that you need to decide which specific characteristics to quota (age, gender, income?)

Non-Probability Sampling Techniques

c. Snowball: one sampling unit refers another, who refers another, etc
  • e.g. expats refer other expats for a survey on expats
  • Not particularly representative but useful when the population is hard to find or access (e.g. the homeless)
d. Expert choice: asks experts to choose typical units
  • i.e. representative individuals or cities
  • Often referred to as a ‘panel of experts’
  • This helps elicit views of persons with specific expertise
  • Also means they help to validate & ‘defend’ any results

Probability versus Non-probability Sampling Techniques

• In probability sampling
  – Representation is determined by the fact that every unit has an equal chance of being selected, based on probability theory
• In non-probability sampling
  – There is an assumption that there is an even distribution of characteristics within the population
  – BUT, the population may or may not be represented and it will be hard to know which is true


Why Might the Following Approaches to Sampling be Biased?

1. I want to survey golf club members’ attitudes to the quality of the greens and survey a sample of the top 25 players at the club

2. I want to survey people in Molde to find out what they think about my cafe so I survey every 10th customer in the cafe. Surveys are conducted every Monday morning

3. I survey 2,500 bus passengers in Ålesund, over a series of times, days and months, to ask what they think about the availability of bus services in Ålesund

Sources of Error

• Non-sampling errors (i.e. from survey design or delivery)
  – Non-observation errors: failing to obtain data from certain segments of the population due to non-response or exclusion
  – Observation errors: inaccurate information obtained from the samples or errors in data processing, analysis or reporting

Characteristic   Population   Sample (% pop)   Responses (% sample)
18-21 years      500          250 (50%)        179 (72%)
22-25 years      300          150 (50%)        96 (64%)
26+ years        200          100 (50%)        10 (10%)
Total            1,000        500 (50%)        285 (57%)
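The response rates in the table can be reproduced with the short sketch below, which also shows one way of seeing the non-observation error: the 26+ group makes up 20% of the population but a much smaller share of the responses. The variable and function names are assumptions; the figures are those in the table.

```python
# (population, sampled, responded) for each age group, taken from the table
groups = {
    "18-21 years": (500, 250, 179),
    "22-25 years": (300, 150, 96),
    "26+ years": (200, 100, 10),
}


def response_rate(sampled, responded):
    """Responses as a percentage of the units sampled, to the nearest whole %."""
    return round(100 * responded / sampled)


rates = {g: response_rate(s, r) for g, (_, s, r) in groups.items()}
total_sampled = sum(s for _, s, _ in groups.values())
total_responded = sum(r for _, _, r in groups.values())
overall = response_rate(total_sampled, total_responded)

print(rates)    # {'18-21 years': 72, '22-25 years': 64, '26+ years': 10}
print(overall)  # 57

# Share of all responses that come from the 26+ group (20% of the population)
share_26_plus = round(100 * groups["26+ years"][2] / total_responded, 1)
print(share_26_plus)  # 3.5
```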

Sources of Error

• Sampling error (i.e. from sampling)
  – Where the sample drawn may not provide the same estimates of certain characteristics as other same-size samples from the population

Example of Sampling Error

• Age of Squash club members (n=40):
  24, 21, 23, 16, 17, 56, 60, 64, 58, 57, 60, 47, 42, 41, 40, 22, 35, 38, 40, 41, 49, 19, 19, 20, 35, 27, 28, 29, 30, 71, 66, 21, 23, 26, 27, 30, 31, 45, 55
• Overall average is 37.5 years (population parameter)
• Average for 5 separate samples of 10 members
  – 35.7, 39.5, 23.1, 51.3, 30.3 (estimates)
• Accuracy (AKA standard error) of sample means can be calculated for probability samples
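Sampling error can be seen by drawing several random samples of 10 from the ages listed and comparing their means. This is only a sketch: the list as reproduced here contains 39 values rather than 40, so the overall mean it gives is slightly below the 37.5 quoted, and the random samples will not match the five estimates on the slide. The function name and seed are assumptions.

```python
import random
from statistics import mean

# Ages as listed on the slide (39 values appear in the list)
ages = [24, 21, 23, 16, 17, 56, 60, 64, 58, 57, 60, 47, 42, 41, 40, 22, 35,
        38, 40, 41, 49, 19, 19, 20, 35, 27, 28, 29, 30, 71, 66, 21, 23, 26,
        27, 30, 31, 45, 55]


def sample_means(values, sample_size, n_samples, seed=None):
    """Mean of each of n_samples simple random samples of the given size."""
    rng = random.Random(seed)
    return [round(mean(rng.sample(values, sample_size)), 1) for _ in range(n_samples)]


print(round(mean(ages), 1))                  # mean of the listed ages
print(sample_means(ages, 10, 5, seed=2015))  # five estimates that differ from it
```

Each run with a different seed gives a different set of estimates, which is exactly the variation that the standard error summarises.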

Standard Error

• Accuracy is often quoted in studies
  – “56% of customers were more than satisfied with service quality; this estimate is subject to a 2% error either way”
• The 2% error is called the standard error
• Measures statistical accuracy of the sample
• Standard error decreases as sample size increases
  – Zero error when the sample is the population

Calculating the Standard Error

• Standard error = sdev / (√n)
  – sdev: standard deviation of the sample
  – n: sample size

Example
  – Random sample of 50 customers has a mean age of 23.4 and a standard deviation of 9.7
  – Standard error = 9.7 / (√ 50) = 1.4
  – Therefore, population mean is likely to be 23.4 +/- 1.4 (i.e. range between 22.0-24.8 years)
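The calculation above can be checked with a few lines of code. The function name is an assumption; the figures are those from the example, with the standard error rounded to one decimal place as on the slide.

```python
import math


def standard_error(sdev, n):
    """Standard error of the mean: sample standard deviation / square root of n."""
    return sdev / math.sqrt(n)


se = round(standard_error(9.7, 50), 1)
print(se)  # 1.4

sample_mean = 23.4
print(round(sample_mean - se, 1), round(sample_mean + se, 1))  # 22.0 24.8
```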


Degrees of Confidence

• Standard error doesn’t say how likely it is (i.e. how confident we can be) that the estimated range is correct

• We use principles of standard deviation to determine the level of confidence in our estimated range

Standard Deviation

[Figure: normal distribution curve; horizontal axis marked -3sd, -2sd, -1sd, Mean, +1sd, +2sd, +3sd, with bands labelled 68%, 95% and 99%]

95% of responses fall within 2 sdev’s of the mean

Degrees of Confidence

• 2 sdev’s means we can be 95% confident (i.e. correct 95 times out of 100) that the sample mean will lie within 2 standard errors of the population mean
• Calculating 95% confidence for the earlier example
  – Where we said that the population mean is likely to be 23.4 +/- 1.4 (i.e. range between 22.0-24.8 years)
  – 23.4 +/- 2.8 (standard error of 1.4 x 2) provides a range of 20.6 to 26.2
• Therefore, we can be 95% confident that the population mean is between 20.6 and 26.2 years
• Do the same for the 99% level of confidence…..
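The 95% range above can be reproduced with a small helper, sketched here under the lecture’s rule of thumb that 1 standard error corresponds to 68% confidence and 2 standard errors to 95%. The function name is an assumption; the same function can be called with the multiplier for the 99% level.

```python
def confidence_range(sample_mean, standard_error, z):
    """Range of sample_mean +/- z standard errors, rounded to one decimal place."""
    margin = z * standard_error
    return round(sample_mean - margin, 1), round(sample_mean + margin, 1)


print(confidence_range(23.4, 1.4, z=1))  # (22.0, 24.8) at the 68% level
print(confidence_range(23.4, 1.4, z=2))  # (20.6, 26.2) at the 95% level
```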

Acceptable Level of Confidence?

• 68% of all sample means would fall within a range of +/- 1 standard error of the population mean
  – This means that we would be 68% confident that the population mean is between 22.0 & 24.8 years
• The 68% level of confidence means there is a 32% chance of being incorrect
• 95% is normally used as the acceptable level of confidence for statistical analysis

How Large Should the Sample be?

• Sample size is NOT relative to population size!
• Sample size is absolute
  – e.g. provided sampling procedures have been followed, a sample size of 1,000 is equally valid for a population of British adults (50mn), London residents (7mn) or Molde residents (24,000)
• Sample size is determined by
  – The availability of resources
  – The purpose of data you intend to collect
  – The required level of accuracy in the results
  – The required level of confidence

Resources & Purpose

• Availability of resources is self-explanatory
• The purpose of data you intend to collect
  – Smaller OK for descriptive info. on attitudes
  – Larger required for explanations for attitudes
    • e.g. to investigate satisfaction according to gender, you need sufficient numbers of each gender and each level of satisfaction in order to capture the variation within the population – 5 in each would result in a minimum sample size of 50 (see next slide)

Sample Size & Explanations for Attitudes

                    Male   Female   Total
Very Satisfied      5      5        10
Satisfied           5      5        10
Neither             5      5        10
Dissatisfied        5      5        10
Very Dissatisfied   5      5        10
Total               25     25       50

Optimum Size for Probability Samples

• Estimating proportions method is one of many methods used by researchers
• Assumes
  – No info. on standard error from previous studies
  – Size of population is known
  – Simple or systematic random sampling
  – Sample will be used to estimate proportions
    • e.g. the percentage of customers that are satisfied
    • e.g. the percentage of students that like to play squash
    • e.g. the percentage of voters for a particular party

Optimum Size for Probability Samples

• Sample size is determined by
  n = z² p(1-p) / H²
• Where
  – n = sample size needed to achieve the level of reliability
  – p = the population proportion (i.e. % satisfied customers)
  – H = desired level of accuracy
  – z = number of standard errors corresponding to the desired level of confidence (z = 2.0 for 95%)

Optimum Size for Probability Samples

Example: sampling levels of customer satisfaction
1. Want to estimate % satisfied customers within +/-2%
   H = 0.02 (2 / 100)
2. Estimate what proportion of the population are satisfied (50% is normal unless a pilot or previous study suggests otherwise)
   p = 0.5 (50 / 100)
3. Select the desired level of confidence
   z = 2 (z is 2 at the 95% level)
4. Calculate sample size
   n = 2² x 0.5(1-0.5) / 0.02²
   n = 10,000 x 0.25
   n = 2,500

Now select 2,500 units from the sampling frame using simple or systematic random sampling
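The estimating proportions calculation can be written as a small function, sketched below with the figures from this example. The function name is an assumption, and rounding up to a whole unit is a design choice made here because a sample cannot contain a fraction of a unit.

```python
import math


def sample_size(z, p, h):
    """Estimating proportions method: n = z^2 * p * (1 - p) / H^2."""
    n = z ** 2 * p * (1 - p) / h ** 2
    # Round to 6 decimals first to absorb floating-point noise, then round up
    return math.ceil(round(n, 6))


print(sample_size(z=2, p=0.5, h=0.02))  # 2500
```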

Optimum Sample Sizes at the 95% Level

Level of accuracy (+/- %) by sample size and expected split of the proportion

Sample size   50/50%   40/60%   30/70%   20/80%   10/90%
50            14.0     13.7     12.8     11.2     8.4
100           9.8      9.7      9.0      7.9      5.9
250           6.2      6.1      5.7      5.0      3.7
500           4.4      4.3      4.0      3.5      2.6
1,000         3.1      3.0      2.8      2.5      1.9
2,500         2.0      1.9      1.8      1.6      1.2
5,000         1.4      1.4      1.3      1.1      0.8
10,000        1.0      1.0      0.9      0.8      0.6
20,000        0.7      0.7      0.6      0.6      0.4
40,000        0.5      0.5      0.4      0.4      0.3

Could reduce sample size by reducing level of accuracy (e.g. 4.4% for just 500!)
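Turning the sample size formula around gives the level of accuracy for a given sample size. The sketch below is an approximation of how such a table could be produced: the values z = 1.96 and the divisor n - 1 are assumptions inferred from the tabulated numbers (they are not stated on the slide), and the function name is hypothetical.

```python
import math


def level_of_accuracy(n, p, z=1.96):
    """Accuracy (+/- %) for sample size n and expected proportion p."""
    return round(100 * z * math.sqrt(p * (1 - p) / (n - 1)), 1)


for n in (50, 500, 2500):
    print(n, level_of_accuracy(n, 0.5))
# 50 14.0
# 500 4.4
# 2500 2.0
```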

Page 57: SCM300 Survey Design Lecture 2 Sampling Techniques For use in fall semester 2015 Lecture notes were originally designed by Nigel Halpern. This lecture

SCM300 Survey Design

Effect of Changing the Level of Confidence

Margin of error (+/- %) for a 50/50% finding

Sample size   99% (z=2.6)   95% (z=2.0)   90% (z=1.6)
50            18.4          14.0          11.8
100           13.0          9.8           8.3
250           8.2           6.2           5.2
500           5.8           4.4           3.7
1,000         4.1           3.1           2.6
2,500         2.6           2.0           1.6
5,000         1.8           1.4           1.2
10,000        1.3           1.0           0.8
20,000        0.9           0.7           0.6
40,000        0.6           0.5           0.4
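The z values in the column headings are rounded points of the standard normal distribution. As a hedged sketch, they can be derived from the confidence level with Python's standard library. The function names are assumptions, and the unrounded z values used here (about 2.58, 1.96 and 1.64) mean the output will not always match the printed table exactly.

```python
import math
from statistics import NormalDist


def z_for_confidence(level):
    """Two-sided z value for a confidence level such as 0.95."""
    # A 95% interval leaves 2.5% in each tail, so look up the 97.5% point.
    return NormalDist().inv_cdf(0.5 + level / 2)


def margin_of_error(n, p, level):
    """Margin of error in percentage points at the given confidence level."""
    return 100 * z_for_confidence(level) * math.sqrt(p * (1 - p) / n)


# Sample of 500 with a 50/50% finding at three confidence levels
for level in (0.99, 0.95, 0.90):
    z = round(z_for_confidence(level), 2)
    moe = round(margin_of_error(500, 0.5, level), 1)
    print(f"{level:.0%}: z = {z}, margin of error = +/-{moe}%")
```

Asking for more confidence widens the interval for the same sample, so a higher confidence level has to be paid for with either a larger sample or less accuracy.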


Your turn.....

Sampling whether students like to play squash

Using the ‘estimating proportions’ method, estimate the optimum sample size for a survey on whether students like to play squash.

1. The desired level of accuracy is 5%

2. The same survey from last year found that 20% like to play

3. The desired level of confidence is 95%


Result.....

Example: sampling whether students like to play squash

1. Want to estimate % students that like to play within +/-5%
   H = 0.05 (5 / 100)

2. Estimate what proportion of the population like to play (the same survey from last year found that 20% like to play)
   p = 0.2 (20 / 100)

3. Select the desired level of confidence
   z = 2 (z is 2 at the 95% level)

4. Calculate sample size
   n = 2² × 0.2(1 − 0.2) / 0.05²
   n = 1,600 × 0.16
   n = 256

Now select a sample of 256 from the sampling frame using simple or systematic random sampling
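The final selection step can be sketched as follows. This is an illustration under stated assumptions, not the lecture's own procedure: the sampling frame is a hypothetical list of 4,000 student IDs, and the function names are invented for the example. Simple random sampling draws every unit with equal chance; systematic sampling takes every k-th unit after a random start.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame, each with equal probability."""
    rng = random.Random(seed)
    return rng.sample(frame, n)


def systematic_sample(frame, n, seed=None):
    """Take every k-th unit from the frame after a random start."""
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and the size of the frame")
    rng = random.Random(seed)
    k = len(frame) // n        # sampling interval
    start = rng.randrange(k)   # random start within the first interval
    return [frame[start + i * k] for i in range(n)]


# Hypothetical sampling frame: 4,000 student IDs
frame = [f"student-{i:04d}" for i in range(4000)]

srs = simple_random_sample(frame, 256, seed=1)
sys_sample = systematic_sample(frame, 256, seed=1)
print(len(srs), len(sys_sample))
```

The `seed` argument is there only to make a draw repeatable for checking; leave it as `None` for a real survey. Systematic sampling assumes the frame has no periodic ordering that lines up with the interval k.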


Optimum Sample Sizes at the 95% Level

Margin of error (+/- %) by sample size and survey finding

Sample size   50/50%   40/60%   30/70%   20/80%   10/90%
50            14.0     13.7     12.8     11.2     8.4
100           9.8      9.7      9.0      7.9      5.9
250           6.2      6.1      5.7      5.0      3.7
500           4.4      4.3      4.0      3.5      2.6
1,000         3.1      3.0      2.8      2.5      1.9
2,500         2.0      1.9      1.8      1.6      1.2
5,000         1.4      1.4      1.3      1.1      0.8
10,000        1.0      1.0      0.9      0.8      0.6
20,000        0.7      0.7      0.6      0.6      0.4
40,000        0.5      0.5      0.4      0.4      0.3


SUGGESTED APPENDIX

Statistical Note on Sample Size & Confidence Intervals

This survey has a sample size of 500. All samples are subject to a margin of statistical error. The margins of error, or ‘confidence intervals’, for this survey are as follows:

Finding from the survey   95% confidence interval
50/50%                    +/-4.4%
40/60%                    +/-4.3%
30/70%                    +/-4.0%
20/80%                    +/-3.5%
10/90%                    +/-2.6%
5/95%                     +/-1.9%

This means, for example, that if 20% of the sample are found to have a particular characteristic, there is an estimated 95% chance that the true population percentage lies in the range 20 +/- 3.5, i.e. between 16.5 and 23.5%. These margins of error have been taken into account in the analysis in this report.

Source: Veal (1997; p215)


Dodgy Opinion Polls…..?

The opinion poll for August was conducted by Sentio Research Norge for Tidens Krav, Romsdals Budstikke, Sunnmørsposten and NRK. 500 people in Møre og Romsdal were interviewed on 13 and 14 August.

"Senterpartiet (the Centre Party) is also on the rise with 9 percent, a gain of 2.2 since June" (Tidens Krav, 20/08/07)


Optimum Sample Sizes at the 95% Level

Margin of error (+/- %) by sample size and survey finding

Sample size   50/50%   40/60%   30/70%   20/80%   10/90%
50            14.0     13.7     12.8     11.2     8.4
100           9.8      9.7      9.0      7.9      5.9
250           6.2      6.1      5.7      5.0      3.7
500           4.4      4.3      4.0      3.5      2.6
1,000         3.1      3.0      2.8      2.5      1.9
2,500         2.0      1.9      1.8      1.6      1.2
5,000         1.4      1.4      1.3      1.1      0.8
10,000        1.0      1.0      0.9      0.8      0.6
20,000        0.7      0.7      0.6      0.6      0.4
40,000        0.5      0.5      0.4      0.4      0.3

A change of 2.2 percentage points is within the margin of error (+/-2.6% for a 10/90% finding with a sample of 500) and can therefore be ‘down to chance’
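The check behind this conclusion can be sketched as below. It follows the slide's simple comparison of the reported change against the margin of error of a single poll of 500. The function name and the default z = 1.96 are assumptions, and the June poll's sample size is not given in the slides, so it is not modelled here.

```python
import math


def margin_of_error(n, p, z=1.96):
    """Margin of error in percentage points: 100 * z * sqrt(p(1 - p) / n)."""
    return 100 * z * math.sqrt(p * (1 - p) / n)


poll_n = 500    # people interviewed, from the slide
share = 0.09    # Senterpartiet at 9 percent
change = 2.2    # reported gain since June, in percentage points

moe = margin_of_error(poll_n, share)
within = change <= moe
print(f"margin of error: +/-{moe:.1f} points")
print("change within margin of error:", within)
```

Using the exact 9% share instead of the table's nearest column (10/90%) gives a slightly smaller margin than 2.6, but the conclusion is the same.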


Optimum Size for Non-Probability Samples

• Optimum sample sizes can’t be determined for non-probability samples
– Can use optimum probability samples but levels of accuracy & confidence are relatively meaningless
– The equation is based on probabilities

• Size is simply based on pragmatic considerations
– i.e. resources & purpose of data


The Effect of Non-Response on Sample Size

• Previous studies may suggest that you can expect a certain response rate: take this into account
– e.g. if you need a sample of 200 and expect a response rate of 40%, you should consider sampling 500
– e.g. if you’re interested in opinions about a particular event and only 30% of your sample attended the event, sample size should be increased
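The adjustment amounts to dividing the number of usable responses needed by the share of the sample expected to provide them. The sketch below is an illustration only: the function name `invitations_needed` and the `eligible_share` parameter (covering the second case, where only part of the sample has attended the event) are assumptions introduced here.

```python
import math
from fractions import Fraction


def invitations_needed(responses_needed, response_rate, eligible_share=1.0):
    """Number of people to sample so that enough usable responses come back.

    response_rate: expected share of the sample that responds (0.40 for 40%)
    eligible_share: expected share of the sample that is relevant to the
                    study, e.g. attended the event (hypothetical parameter)
    """
    # Exact fractions avoid float rounding pushing 500 up to 501.
    usable = Fraction(str(response_rate)) * Fraction(str(eligible_share))
    if not 0 < usable <= 1:
        raise ValueError("rates must be proportions between 0 and 1")
    return math.ceil(responses_needed / usable)


# Need 200 responses and expect a 40% response rate
print(invitations_needed(200, 0.40))  # 500

# Need 200 attendees' opinions when only 30% of the sample attended
print(invitations_needed(200, 1.0, eligible_share=0.30))  # 667
```

The two adjustments multiply, so a 40% response rate combined with 30% eligibility would require sampling about 1,667 people to end up with 200 usable responses.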


Summary

• A sample is a small & manageable portion or sub-set
– Commonly associated with quantitative methods
– Applies to human & non-human phenomena
– Extracted from a sampling frame or at source

• 2 main sampling techniques
– Probability & non-probability sampling

• 2 main types of error
– Non-sampling & sampling errors


• Levels of accuracy & confidence
– Standard error measures accuracy in sample estimates
– Confidence determines likelihood that the estimate is correct

• Sample size is absolute
– Based on resources available & purpose of data
– Also based on desired accuracy & confidence (probability sampling)


Recommended Reading

• Chapters 1 & 2 in Fink, A. (2003). The Survey Handbook. 2nd Ed. London: Sage.


“Thank you for your attention”

Questions?