Quantitative approaches
Lesson 3:
Sampling
2
Quantitative approaches
Plan
1. Introduction to quantitative sampling
2. Sampling error and sampling bias
3. Response rate
4. Types of "probability samples"
5. The size of the sample
6. Types of "non-probability samples"
3
Quantitative approaches
1. Introduction to quantitative sampling
4
Quantitative approaches
Sampling: Definition
Sampling = choosing the units (e.g. individuals, families, countries, texts, activities) to be investigated
5
Quantitative approaches
Sampling: quantitative and qualitative
"First, the term "sampling" is problematic for qualitative research, because it implies the purpose of "representing" the population sampled.
Quantitative methods texts typically recognize only two main types of sampling: probability sampling (such as random sampling) and convenience sampling.
(...) any nonprobability sampling strategy is seen as "convenience sampling" and is strongly discouraged.
This view ignores the fact that, in qualitative research, the typical way of selecting settings and individuals is neither probability sampling nor convenience sampling.
It falls into a third category, which I will call purposeful selection; other terms are purposeful sampling and criterion-based selection.
This is a strategy in which particular settings, persons, or activities are selected deliberately in order to provide information that can't be gotten as well from other choices."
(Maxwell, Joseph A., Qualitative research design..., 2005, p. 88)
6
Quantitative approaches
Population and Sample
[Figure: a large set of units (the population), from which a smaller set (the sample) is drawn by sampling. The sample is a «miniature population».]
7
Quantitative approaches
Population, Sample, Sampling frame
Population = the set of units from which the sample is taken
Sample = the part of the population that is chosen for investigation. The choice may or may not be random.
Sampling frame = the list of all the units from which the choice is made.
8
Quantitative approaches
Representative sample, probability sample
Representative sample = a sample that reflects the population reliably: the sample is a «miniature population»
Probability sample = a sample that has been chosen randomly, so that every unit has a known probability of being chosen.
9
Quantitative approaches
Representativeness: an empirical question
The representativeness of the sample cannot be guaranteed by following a given method. If we use the correct methods (random choice, stratification etc.) we can only maximize the probability of producing a representative sample.
Whether the sample is really representative of the population is an empirical question (and should be tested).
For example: we would check whether the percentage of women in the sample is significantly different from that in the population. If it is not, the sample can be treated as representative with respect to gender.
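The gender check described above can be sketched in R as follows. The counts and the population share are hypothetical values chosen for illustration; they do not come from any study in this lesson. The sketch uses an exact binomial test, which is one reasonable choice among several.

```r
# Hypothetical numbers, for illustration only.
n_sample     <- 1000   # number of respondents in the sample
n_women      <- 530    # women observed in the sample
p_population <- 0.51   # assumed share of women in the population (e.g. from a census)

# Exact binomial test of H0: the sample share equals the population share.
test <- binom.test(n_women, n_sample, p = p_population)

p_sample <- n_women / n_sample
# If the p-value is not below 0.05, we have no evidence that the sample
# deviates from the population with respect to gender.
representative_gender <- test$p.value >= 0.05
print(test)
```

A non-significant result does not prove representativeness; it only means that the data do not contradict it for this one variable.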
10
Quantitative approaches
2. Sampling error, sampling bias
11
Quantitative approaches
Errors: different types
1. Sampling error: due to chance and the size of the sample
2. Sampling bias: not due to chance or the size of the sample, e.g. non-response linked to the specific theme of the research
3. Data collection error: e.g. bad question wording; bad interviewing
4. Data processing error: e.g. wrong coding
5. Data analysis error: e.g. wrong statistical model; erroneous data analysis
6. Data interpretation error: e.g. wrong interpretation of results
12
Quantitative approaches
Sampling error, sampling bias
Sampling error = differences between the sample and the population that are due to the random draw itself (chance). Sampling error can be reduced by increasing the size of the sample.
Sampling bias = differences between the sample and the population that are not due to chance but are systematic; sampling bias does not diminish with increased sample size.
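The difference between the two definitions can be illustrated with a small simulation. This is a minimal sketch under invented assumptions: a population of 10,000 with 50% smokers, and a biased selection in which smokers are half as likely to be selected as non-smokers. The names `population`, `weights` and `estimate` are illustrative.

```r
set.seed(1)
# Hypothetical population: 1 = smoker, 0 = non-smoker; true share P = 0.5.
population <- rep(c(1, 0), each = 5000)
# Assumed bias: smokers are half as likely to end up in the sample.
weights <- ifelse(population == 1, 0.5, 1)

# Share of smokers in one sample of size n (unbiased when prob = NULL).
estimate <- function(n, prob = NULL) mean(sample(population, n, prob = prob))

sizes <- c(30, 300, 3000)
result <- data.frame(
  n        = sizes,
  unbiased = sapply(sizes, estimate),
  biased   = sapply(sizes, estimate, prob = weights)
)
print(result)
```

With growing n the unbiased estimates settle near 0.5, while the biased estimates settle near a wrong value: a larger sample makes the biased estimate more precise, but not more correct.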
13
Quantitative approaches
Sampling error/bias: example (I)
[Figure: a population of N = 200 units, smokers and non-smokers, and a sample of N = 32 drawn from it. No error/bias: P(s) = 0.5; p(s) = 0.5, where P(s) is the share of smokers in the population and p(s) the share in the sample.]
14
Quantitative approaches
Sampling error/bias: example (II)
[Figure: population of N = 200 (smokers and non-smokers), sample of N = 32. A bit of error/bias: P(s) = 0.5; p(s) = 0.47]
[Figure: population of N = 200 (smokers and non-smokers), sample of N = 32. A lot of error/bias: P(s) = 0.5; p(s) = 0.33]
15
Quantitative approaches
Sampling error: decreases with increasing sample size
Experiment with a coin
Probability of throwing «heads»?
P «in reality» = 0.5
We run 5 trials each with N = 1, 2, 5, 20
As N grows, p approaches P
N = 1 -> p = 0, 1, 0, 1, 1
N = 2 -> p = 0, 0.5, 0.5, 1, 0
N = 5 -> p = 0.6, 0.2, 0.4, 0.8, 0.1
N = 20 -> p = 0.4, 0.35, 0.45, 0.35, 0.55
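The coin experiment can be repeated in R. This is a minimal sketch: the seed is arbitrary, so the simulated proportions will differ from the values listed above, and the last line adds the theoretical standard error of a proportion for each N as a point of comparison.

```r
set.seed(42)
sizes <- c(1, 2, 5, 20)   # number of tosses per trial

# One row per N, one column per trial: the share of heads in each trial.
p_hat <- t(sapply(sizes, function(n) replicate(5, mean(rbinom(n, 1, 0.5)))))
rownames(p_hat) <- paste0("N=", sizes)
print(p_hat)

# Theoretical standard error of p for a fair coin: sqrt(P * (1 - P) / N).
se <- sqrt(0.5 * 0.5 / sizes)
print(round(se, 3))
```

The standard error shrinks with the square root of N, which is why the rows for larger N scatter less around 0.5.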
16
Quantitative approaches
Possible reasons for sampling bias
• The sampling frame does not include all the elements of the population (example: telephone directory)
• The choice is not really random (example: open the telephone directory at a random page and choose the next 600 names)
• Certain groups of respondents have a higher (or lower) response rate (example: the very poor, the very rich, the very active, people with an active interest in the question, people critical of surveys)
17
Quantitative approaches
Sampling error vs. sampling bias: Citation
"Sampling error is random. Every time you select an individual, a text, a situation, or any "unit of observation," that unit of observation will be different from the population of such units. Hence you always have an error (we hope a small one) in generalizing to the population of units.
Unlike sampling error, "sampling bias" is systematic (nonrandom). For example, if for a focus group study you "randomly" select one of every five students who happen to be in the library on a Friday afternoon, you might have a biased sample that does not represent the views of "average" college students.
Unlike sampling error, increasing the size of the sample does not decrease the degree of bias in your sample.
Obviously, the results of a biased sample cannot be considered to be representative of the population (i.e., the findings have low transferability or external validity)."
(Tashakkori / Teddlie, Mixed Methodology. Combining Qualitative and Quantitative ..., pp. 72-73)
18
Quantitative approaches
3. Response rate
19
Quantitative approaches
Response rate
Response rate = share of individuals in the sample who have responded to the questionnaire

$$\text{Response rate} = \frac{\text{N of returned interviews} - \text{N of returned interviews that are not usable}}{\text{sample size} - \text{N of individuals who were not able to answer or could not be reached}}$$

Example:

$$\frac{652 - 8}{1212 - 66} = \frac{644}{1146} = 0.56$$
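The response-rate calculation can be written as a small helper in R. The function name `response_rate` and its argument names are illustrative; the numbers are the ones from the example above.

```r
# Usable returned interviews divided by the sample that could have answered.
response_rate <- function(returned, unusable, sample_size, unreachable) {
  (returned - unusable) / (sample_size - unreachable)
}

rr <- response_rate(returned = 652, unusable = 8,
                    sample_size = 1212, unreachable = 66)
print(round(rr, 2))
```

Keeping the four counts as separate arguments makes it explicit which cases are removed from the numerator and which from the denominator.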
20
Quantitative approaches
Response rate: example RLS
Table 1: Response rate and number of interviews used in this study

                                                       N        %
Gross sample                                        4800
Sample-neutral losses                               1712
  of which stage 1                                  1291
           stage 2                                   141
           stage 3                                   280
Net sample (gross sample - sample-neutral losses)   3088   100.0%
Refusals                                            1424    46.1%
  of which stage 1                                  1062
           stage 2                                   183
           stage 3                                   179
Completed interviews (= net response rate)          1664    53.9%
  of which German-speaking Switzerland              1054
             (of which canton of Zurich)             330
           French-speaking Switzerland               409
           Italian-speaking Switzerland              201
Adherents of non-Christian religions                  28
Interviews used in this study                       1636
  of which canton of Zurich                          325
21
Quantitative approaches
Response rate: example
Christliches Zeugnis
• The actual return was better than expected. Of the 942 persons contacted, 469 replied to the first letter (49.8%); after a reminder, a further 125 persons (13.3%) sent in valid questionnaires. The overall return rate thus amounts to about 63% (594 persons).
• These figures are net of invalid answers and of respondents who could no longer be located, were ill, or had died.
22
Quantitative approaches
4. Types of probability sample
23
Quantitative approaches
Types of probability sample
4.1. Simple random sample
4.2. Systematic random sample
4.3. Stratified random sampling
4.4. Multi-stage cluster sampling
24
Quantitative approaches
4.1 Simple random sample
Simple random sample = randomly choose a predetermined number of units from the population (sampling frame)
1. decide what population to use
2. choose the sampling frame
3. decide the sample size
4. use random numbers (e.g. with the help of a computer) to choose the units
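The four steps above can be sketched in R with the base function `sample()`. The sampling frame here is a hypothetical list of 20,000 numbered units, and the sample size of 600 is likewise only an illustration.

```r
set.seed(123)
# Steps 1-2: hypothetical sampling frame with one row per unit.
sampling_frame <- data.frame(id = 1:20000)

# Step 3: decide the sample size.
n <- 600

# Step 4: draw n row numbers at random, without replacement.
chosen_rows <- sample(nrow(sampling_frame), n)
srs <- sampling_frame[chosen_rows, , drop = FALSE]
head(srs)
```

Because `sample()` draws without replacement by default, no unit can be selected twice and every unit in the frame has the same probability of selection.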
25
Quantitative approaches
4.2 Systematic random sample
Systematic sample = choose a predetermined number of units from the population (sampling frame) at a fixed interval, starting from a randomly chosen unit
1. decide what population to use
2. choose the sampling frame
3. decide the sample size
4. begin with a random number between 1 and i; then choose every i-th unit in the sampling frame, where i = population size / sample size
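A systematic sample can be sketched in R as follows. The frame size of 20,000 and the sample size of 1,000 are illustrative values, and the interval is taken as the frame size divided by the sample size.

```r
set.seed(7)
N <- 20000          # number of units in the (hypothetical) sampling frame
n <- 1000           # desired sample size
i <- N %/% n        # sampling interval: every i-th unit is chosen

# Random start between 1 and i, then every i-th position in the frame.
start    <- sample(i, 1)
selected <- seq(from = start, by = i, length.out = n)
head(selected)
```

Only the starting point is random; all other positions follow from it. This works well as long as the order of the frame is unrelated to the variables of interest.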
26
Quantitative approaches
Systematic random sample:
Christliches Zeugnis (I)
The aim was to carry out a study that is representative of
evangelicalism in German-speaking Switzerland.
A written questionnaire was chosen as the method. As a next
step, a suitable address file of all evangelicals had to be
found in order to draw the representative sample. No such
file exists, and it is difficult, indeed almost impossible,
to construct a meaningful sample oneself. (...)
In searching for a way out of this difficulty, we came across
Campus für Christus, an evangelically oriented organization.
27
Quantitative approaches
Systematic random sample:
Christliches Zeugnis (II)
It publishes a magazine, the "Christliches Zeugnis", which is
quite widely distributed within evangelicalism and has a
circulation of about 20,000. One can hope that the file of
this magazine's recipients gives an undistorted picture of
evangelicalism in German-speaking Switzerland.
The random sample was drawn as follows: the first address
was chosen at random by means of a number between 1 and 20;
from there, the further addresses were picked out in steps
of 20. 942 addresses proved to be valid.
28
Quantitative approaches
Systematic random sample:
Study on islamophobia
The data used for this study stem from a closed-question
face-to-face survey, each interview taking from 45-60
minutes. The population consisted of inhabitants of the city
of Zurich in the age range 18 to 65 with Swiss nationality.
The survey was conducted between October 1994 and March
1995 by the Sociological Institute of the University of
Zurich. The people were chosen randomly from the official
files of the state (Einwohnerkontrolle). In all, 1,138
interviews were conducted. The response rate was 72%. The
survey can be regarded as representative of the Swiss
population of the city of Zurich (Stolz, 2000, 226).
29
Quantitative approaches
4.3 Stratified random sampling
Stratified random sampling: create strata in your sampling frame that correspond to central cleavages in your population. Within each stratum, randomly choose a predetermined number of units.
30
Quantitative approaches
Stratified random sampling: example (1)
We know that our population of 7,000,000 consists of 72% German speakers, 20% French speakers and 8% Italian speakers. Our sample size is 1000.
We therefore decide to choose randomly
from the population of German speakers: 720
from the population of French speakers: 200
from the population of Italian speakers: 80
-> With regard to language, our sample is perfectly representative.
-> If we had drawn a simple random sample, sampling error might have produced, for example, a sample with: German: 742, French: 195, Italian: 63
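The language example can be sketched in R as a proportionally allocated stratified sample. The sampling frame below is hypothetical and scaled down to 70,000 units to keep the example light; the language shares and the sample size of 1000 are those of the example.

```r
set.seed(2024)
shares <- c(german = 0.72, french = 0.20, italian = 0.08)

# Hypothetical frame: 70,000 units with the language shares of the example.
frame <- data.frame(
  id       = 1:70000,
  language = rep(names(shares), times = round(shares * 70000))
)

n <- 1000
n_per_stratum <- round(shares * n)   # 720, 200, 80

# Draw the predetermined number of units at random within each stratum.
rows_by_language <- split(seq_len(nrow(frame)), frame$language)
chosen <- unlist(lapply(names(shares), function(s) {
  sample(rows_by_language[[s]], n_per_stratum[[s]])
}))
stratified <- frame[chosen, ]
print(table(stratified$language))

# For comparison: a simple random sample of the same size from the same frame.
srs <- frame[sample(nrow(frame), n), ]
print(table(srs$language))
```

The stratified counts match the population shares exactly by construction, while the counts of the simple random sample vary from draw to draw.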
31
Quantitative approaches
Stratified random sampling: example (2)
In the NCS-CH study, we stratified for religious tradition.Furthermore, we overweighted smaller religious traditions.
32
Quantitative approaches
4.4 Multi-stage cluster sampling
Multi-stage cluster sampling = first, groups of units (clusters) are chosen randomly; then, units are chosen randomly within these groups
-> Often cheaper
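A two-stage cluster sample can be sketched in R as follows. The frame is invented for illustration: 200 churches (the clusters) with 50 members each, from which 20 churches and then 10 members per church are drawn.

```r
set.seed(99)
# Hypothetical frame: 200 churches (clusters) with 50 members each.
members <- data.frame(
  church = rep(1:200, each = 50),
  member = 1:10000
)

# Stage 1: choose 20 churches at random.
chosen_churches <- sample(unique(members$church), 20)

# Stage 2: choose 10 members at random within each chosen church.
chosen_rows <- unlist(lapply(chosen_churches, function(ch) {
  sample(which(members$church == ch), 10)
}))
cluster_sample <- members[chosen_rows, ]
print(table(cluster_sample$church))
```

Only the 20 selected churches have to be contacted, which is where the cost advantage comes from. The price is that members of the same cluster tend to resemble each other, so the sample usually carries less information than a simple random sample of the same size.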
33
Quantitative approaches
Multi-stage cluster sampling:
Study on evangelicals (Milieu) (I)
Some 1,850 questionnaires were given out and 1,100 were returned, giving a response rate of 59.4%. The response rate was 57.9% (N = 359) for the charismatic group, 54.6% (N = 377) for the moderate and 66.9% (N = 361) for the fundamentalist group. Being a mail survey, these response rates can be seen as very satisfactory. The data was collected between June 2003 and September 2003. This sample can be said to be representative of the members of evangelical free churches in Switzerland. For a number of analyses we aggregated the data sets from 1999 and 2003. One of the central features of the design of our study on evangelical free churches was to include a large number of questions that had already been used in the 1999 survey of the Swiss population, in order to be able to compare the evangelical milieu to the "societal environment".
34
Quantitative approaches
Multi-stage cluster sampling:
Study on evangelicals (Milieu) (II)
• Our data stem from two representative surveys, one conducted in 1999 covering the whole population of Switzerland, and a second survey from 2003 among the members of the evangelical free churches in Switzerland. The first data set (1999) was produced by conducting 1,562 computer-aided telephone interviews (CATI), based on a random sample of the inhabitants of Switzerland within the age range of 16 to 75. Response rate was 54%.
35
Quantitative approaches
Multi-stage cluster sampling:
Study on evangelicals (Milieu) (III)
The second data set (2003) was produced by a mail survey of
1,100 evangelicals from evangelical free churches in
Switzerland, based on a stratified cluster sample. Cluster
sampling was effectuated by randomly choosing evangelical
free churches from a list and then randomly selecting
members from these churches. Stratification was achieved by
dividing the sample into three groups: charismatic, moderate
and fundamentalist. Since the fundamentalist group in our
population only amounts to about 11%, the fundamentalist
stratum was overrepresented in the sample, in order to be
able to make a better comparison between the three groups.
5. The size of the sample
37
Quantitative approaches
Size matters!
The larger the sample, the better you fare!
With larger samples,
- your estimates of the parameters gain in precision (confidence intervals get narrower)
- the differences you find reach statistical significance more easily
- you will be able to make analyses at a more detailed level (comparing various subgroups etc.)
38
Quantitative approaches
Size: absolute and relative
It is not the relative but the absolute size of the sample that matters.
-> A random sample of 1000 has the same «value» whether the population is Switzerland or China
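One way to see why the absolute size dominates is the standard error of a proportion with the finite population correction. In the sketch below the two population sizes are rough, assumed orders of magnitude for a small and a very large country, not official figures, and the helper name `se_proportion` is illustrative.

```r
# Standard error of a sample proportion p for a simple random sample of
# size n drawn without replacement from a population of size N.
se_proportion <- function(n, N, p = 0.5) {
  sqrt(p * (1 - p) / n * (N - n) / (N - 1))
}

se_small_country <- se_proportion(n = 1000, N = 8e6)     # assumed population size
se_large_country <- se_proportion(n = 1000, N = 1.4e9)   # assumed population size
print(c(small = se_small_country, large = se_large_country))
```

The correction factor (N - n) / (N - 1) is practically 1 whenever the sample is a tiny fraction of the population, so the two standard errors are almost identical and depend essentially on n alone.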
39
Quantitative approaches
Formula
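Arithmetic mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

Standard deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

Variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

Standard error: $s_{\bar{x}} = \frac{s}{\sqrt{n}}$

95% confidence interval: $\bar{x} \pm z_{0.025} \, s_{\bar{x}}$, with $z_{0.025} = 1.96$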
Example: increasing the sample size narrows the confidence interval
What is the true mean in the population?
Mean in the sample (n = 105): 4.8
standard deviation (sample) = 1.2
standard error (mean) = 1.2 / √105 = 0.117
confidence interval: true mean = 4.8 ± 1.96 * 0.117
-> between 4.571 and 5.029
Mean in the sample (n = 1000): 4.8
standard deviation (sample) = 1.2
standard error (mean) = 1.2 / √1000 = 0.0379
confidence interval: true mean = 4.8 ± 1.96 * 0.0379
-> between 4.726 and 4.874
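The calculation in this example can be checked in R. The helper `ci_mean` is an illustrative name; it takes the sample mean, the sample standard deviation and n, and uses the normal quantile from `qnorm()` rather than the rounded value 1.96.

```r
# Standard error and normal-approximation confidence interval for a mean.
ci_mean <- function(mean, sd, n, level = 0.95) {
  se <- sd / sqrt(n)                    # standard error of the mean
  z  <- qnorm(1 - (1 - level) / 2)      # about 1.96 for a 95% interval
  c(se = se, lower = mean - z * se, upper = mean + z * se)
}

ci_105  <- ci_mean(mean = 4.8, sd = 1.2, n = 105)
ci_1000 <- ci_mean(mean = 4.8, sd = 1.2, n = 1000)
print(round(rbind(n_105 = ci_105, n_1000 = ci_1000), 4))
```

Since the standard error is proportional to 1/√n, multiplying the sample size by about ten shrinks the interval by a factor of roughly three, not ten.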
41
Quantitative approaches
Factors influencing the size of the sample
Costs: from about n = 1000 upward, the gains in precision from each additional respondent diminish
Non-response: a certain percentage of individuals will refuse to participate; we therefore have to start out with a larger sample
Heterogeneity: if the population is very heterogeneous, we need a larger sample
Type of analysis: if we want to analyze the relationship between many variables at the same time (multivariate analysis, e.g. sex * age * political preference), we need a larger sample
42
Quantitative approaches
The example of the dwarfs
43
Quantitative approaches
Sampling error: decreases with growing N
In this example, we imagine an infinite population of dwarfs. We would like to know their mean height and the variance of their height in the population.
The question: how many dwarfs do we have to draw randomly from the population and measure in order to estimate the population mean height and variance?
In the following simulation we draw 30 samples for each of several N's (N = 5, 10, 15, 20, ..., 100).
The «real» mean in the population is 10 cm. The «real» variance in the population is 4 (standard deviation = 2).
The simulation shows that for samples smaller than N = 40, the estimates of the mean and variance are very unreliable.
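Simulation with R
# Empty plot: sample size on the x-axis, estimated variance on the y-axis
plot(c(0, 100), c(0, 15), type = "n", xlab = "Sample size", ylab = "Variance", cex.lab = 1.2)
for (n in seq(5, 100, 5)) {             # sample sizes 5, 10, ..., 100
  for (i in 1:30) {                     # 30 samples per sample size
    x <- rnorm(n, mean = 10, sd = 2)    # draw n heights; true variance is 4
    points(n, var(x))                   # plot the variance estimated from this sample
  }
}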
45
Quantitative approaches
How the estimate of the variance becomes more reliable with growing N
[Figure: simulated variance estimates plotted against sample size; variance in population: 4]
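The same simulation can also be summarized in numbers instead of a plot. The sketch below reuses the population values of the dwarf example (mean 10, standard deviation 2) and reports, for each sample size, how much the 30 estimates of the mean and of the variance scatter. The seed and the column names are arbitrary choices.

```r
set.seed(10)
sizes <- seq(5, 100, 5)

# For each sample size: draw 30 samples, estimate mean and variance in each,
# and record the standard deviation of these 30 estimates.
spread <- t(sapply(sizes, function(n) {
  est <- replicate(30, {
    x <- rnorm(n, mean = 10, sd = 2)
    c(mean(x), var(x))
  })
  c(n = n, sd_of_means = sd(est[1, ]), sd_of_variances = sd(est[2, ]))
}))
print(round(spread, 2))
```

Both columns shrink as n grows, and the variance estimates scatter considerably more than the mean estimates at small n.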
46
Quantitative approaches
Weighting, change in sample size and their
effect on standard errors : example NCS-CH
47
Quantitative approaches
6. Types of non-probability samples
48
Quantitative approaches
Types of non-probability samples
6.1 Convenience sampling
6.2 Snowball sampling
6.3 Quota sampling
49
Quantitative approaches
6.1 Convenience sampling
Convenience sampling = we choose the people who are most easily available / approachable.
Problem:
We do not know which population these people are representative of / whom they stand for
50
Quantitative approaches
Convenience sampling: example
"In the lecturers' mailboxes, which exist in most universities, we placed the questionnaire, a note explaining the content of our research, and an envelope bearing our address so that they could return the duly completed questionnaire to us.
Most Parisian universities, as well as a good number of the most important research centres, are included in our survey. We left questionnaires at Paris I, Paris II, Paris III, Paris V, Paris VI, Paris VII, Dauphine, Paris X-Nanterre, Paris VIII, the Institut de Sciences Politiques, the Maison des Sciences de l'Homme, and the Ecole Normale Supérieure.
271 lecturers returned their answers to the questionnaire. However, the 271 answers do not constitute a representative sample that would allow us to describe the general characteristics of the population of lecturers. For example, it does not allow us to determine the percentage of individuals who are attracted by left-wing positions. The sample is therefore constructed only to provide a test and not to describe the population of Parisian lecturers."
(Magniberton/Rios, 2003)
51
Quantitative approaches
6.2 Snowball sampling
Snowball sampling = we ask the first participants for the addresses of other individuals who have the same characteristics. Every new participant is in turn asked for further participants.
Problem: no representativeness
52
Quantitative approaches
Snowball sampling: example
"I conducted fifty interviews with marijuana users. I had been a professional dance musician for some years when I conducted this study and my first interviews were with people I had met in the music business. I asked them to put me in contact with other users who would be willing to discuss their experiences with me... Although in the end half of the fifty interviews were conducted with musicians, the other half covered a wide range of people, including laborers, machinists, and people in the professions" (Becker 1963: 45-6)
53
Quantitative approaches
6.3 Quota sampling
Quota sampling = starting from knowledge of the population (e.g. 50% women, 20% between 18 and 30 etc.), we decide how many individuals in certain groups (quotas) the sample should contain. Example: we need 3 elderly women living in a rural area in the canton of Appenzell Innerrhoden. The interviewers then have the responsibility of finding individuals with these characteristics.
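Deriving quotas from known population shares can be sketched in R as follows. The sample size of 200, the two age groups and the assumption that sex and age are independent in the population are simplifications made for illustration; in practice the cell shares would be taken from a cross-tabulated population source.

```r
n <- 200   # hypothetical total number of interviews

# Assumed population shares (illustrative, in the spirit of the example above).
sex_share <- c(women = 0.5, men = 0.5)
age_share <- c("18-30" = 0.2, "31+" = 0.8)

# Quota per cell = n * share of the cell. The cell shares are obtained here
# by multiplying the margins, which assumes that sex and age are independent.
quotas <- round(n * outer(sex_share, age_share))
print(quotas)
```

Each cell of the table tells the interviewers how many persons with that combination of characteristics they have to find; how they find them is left to them, which is exactly where the bias can enter.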
54
Quantitative approaches
6.3 Quota sampling
Problems:
- Not really representative; bias because of the choices and the networks of the interviewers
- We cannot calculate the standard errors. Statistical inference from the sample to the population is not permitted.
Advantages:
- faster
- cheaper
Often used in market research