Sampling in Research Suter, Chapter 8. Questions about sampling Sample size – do I have enough participants? Is it the right kind of sample? Is it representative?

Sampling in Research

Suter, Chapter 8

Questions about sampling

• Sample size – do I have enough participants?

• Is it the right kind of sample?

• Is it representative?





Describing Data

• Central Tendency

– A characteristic of a distribution of scores that describes where scores tend to center. Often referred to as an average, the most common one being the mean (sum of scores divided by the number of scores)

• Dispersion

– A characteristic of a distribution of scores that describes the spread or scatter of scores around the central tendency

– Common measure of dispersion is the standard deviation

▫ The higher the standard deviation, the greater the spread of scores around the mean

The Normal Curve

Normal Curve – Bell Curve – Normal Distribution of Scores

Basic Descriptive Statistics

Describing Central Tendencies of a Sample

Mean

Median

Mode

Describing the Variation of Scores

Range

Standard Deviation

Example of the Mode, Median and Mean in a Distribution

Raw Score   Freq
98          1
97          1
91          2
85          1
80          5
77          7
72          5
65          3
64          7
62          10
58          3
45          2
33          1
11          1
5           1

N = 50

Total = 3337

Mode = 62

Median = 64.5

Mean = 66.7

Range = 93 (98 - 5)

St Dev = 17.1
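The summary values on this slide can be checked directly from the frequency table. The sketch below assumes the table reads as the (score, frequency) pairs listed in the code, and that the slide's standard deviation is the population form (dividing by n); both are readings of the slide, not something it states.

```python
from statistics import mean, median, mode, pstdev

# (raw score, frequency) pairs as read from the slide's table (an assumption)
freq_table = [(98, 1), (97, 1), (91, 2), (85, 1), (80, 5), (77, 7), (72, 5),
              (65, 3), (64, 7), (62, 10), (58, 3), (45, 2), (33, 1), (11, 1), (5, 1)]

# Expand the table into one entry per participant
scores = [score for score, freq in freq_table for _ in range(freq)]

summary = {
    "n": len(scores),
    "total": sum(scores),
    "mode": mode(scores),                 # most frequent score
    "median": median(scores),             # middle of the sorted scores
    "mean": round(mean(scores), 1),       # total divided by n
    "range": max(scores) - min(scores),
    "sd": round(pstdev(scores), 1),       # population SD (divides by n)
}
print(summary)
```

Using the sample form (dividing by n - 1) would give a slightly larger standard deviation than the one on the slide.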

Calculation of the Standard Deviation of a Distribution

Raw Score (X)   Mean (X̄)   X – X̄   (X – X̄)²
85              54          31      961
80              54          26      676
70              54          16      256
60              54          6       36
55              54          1       1
50              54          -4      16
45              54          -9      81
40              54          -14     196
30              54          -24     576
25              54          -29     841

Σ(X – X̄)² = 3640

$$\text{Variance } (SD^2) = \frac{\sum (X - \bar{X})^2}{n} = \frac{3640}{10} = 364$$

$$\text{Standard deviation } (SD) = \sqrt{\frac{\sum (X - \bar{X})^2}{n}} = \sqrt{364} \approx 19.08$$
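The same calculation can be followed step by step in code. This is a minimal sketch using the ten raw scores from the slide and the divide-by-n formula shown there; the variable names are illustrative.

```python
from math import sqrt

scores = [85, 80, 70, 60, 55, 50, 45, 40, 30, 25]
n = len(scores)

mean = sum(scores) / n                      # 540 / 10
deviations = [x - mean for x in scores]     # X minus the mean
squared = [d ** 2 for d in deviations]      # squared deviations
variance = sum(squared) / n                 # sum of squares divided by n
sd = sqrt(variance)                         # square root of the variance

print(mean, sum(squared), variance, round(sd, 2))
```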

• Percentiles

Standardized test scores are often accompanied by percentiles.

Percentiles are a comparison with the whole group – a norming (normalizing) function

Related to the normal curve: one person’s score is compared with others’ scores using standard deviations.
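If scores are assumed to follow a normal curve, a percentile can be sketched from how many standard deviations a score sits above or below the mean. The function name and the test mean of 100 with a standard deviation of 15 are hypothetical values chosen for illustration; published tests build percentiles from their own norming samples.

```python
from statistics import NormalDist

def percentile_rank(score, mean, sd):
    """Percent of a normal distribution falling below `score`."""
    z = (score - mean) / sd               # distance from the mean in SD units
    return 100 * NormalDist().cdf(z)      # area under the curve to the left of z

# Hypothetical test with mean 100 and SD 15
print(round(percentile_rank(115, 100, 15), 1))  # one SD above the mean
print(round(percentile_rank(100, 100, 15), 1))  # exactly at the mean
```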

Probabilities Under the Normal Curve (Figure 10.13)

Percentages Under the Normal Curve (Figure 10.11)

Examples of Standard Scores (Figure 10.15)

Effect Size

• An index of a treatment effect expressed as a standardized difference between two means (the mean difference divided by the standard deviation of the comparison group). It is often expressed as a percentile shift, or “boost” from a baseline at the 50th percentile.

$$d = \frac{\text{Treatment Mean} - \text{Control Mean}}{\text{Standard Deviation of Control}}$$
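A short sketch of the effect size formula and the percentile shift it implies. The group means and standard deviation are made-up numbers, and the percentile conversion assumes normally distributed scores.

```python
from statistics import NormalDist

def effect_size(treatment_mean, control_mean, control_sd):
    """Standardized mean difference, d."""
    return (treatment_mean - control_mean) / control_sd

def percentile_shift(d):
    """Percentile of the control distribution where the average treated person lands."""
    return 100 * NormalDist().cdf(d)

# Hypothetical data: treatment mean 55, control mean 50, control SD 10
d = effect_size(treatment_mean=55, control_mean=50, control_sd=10)
print(d, round(percentile_shift(d)))
```

With d = 0.5, the average treated participant moves from the 50th percentile to roughly the 69th percentile of the control group.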

Overlapping Distributions Revealing an Effect Size of 0.5

Overlapping Distributions Revealing a Large Effect Size

Rules of Thumb

• Older

– Minimum per group size of 30

• Newer

– For strong evidence

▫ “Rough rule of thumb” of 150 participants in each group

– For entire schools or classrooms

▫ Rigorous evidence is 25-30 schools or classrooms in each group

– The dropout rate should not exceed 25%

Sample Size and Precision in Scientific Surveys

• Most scientific national surveys use about 1,000 respondents in their sample

• This produces a “margin of error” of around 3%: a boundary around the sample result within which the value for the entire population would most likely fall
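The 3% figure can be reproduced with the standard margin-of-error formula for a proportion. This sketch assumes a simple random sample, a 95% confidence level (z = 1.96), and the worst case of a 50/50 split; none of these assumptions are stated on the slide.

```python
from math import sqrt

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the confidence interval for a proportion p with n respondents."""
    return z * sqrt(p * (1 - p) / n)

for n in (100, 400, 1000, 4000):
    print(n, f"{100 * margin_of_error(n):.1f}%")
```

Because the sample size sits under a square root, quadrupling the sample only halves the margin of error, which is one reason national surveys rarely go far beyond 1,000 or so respondents.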





Population

• Target population – desired population

• Accessible population – feasible/practical population

Target population: All 4th graders

Accessible population: 4th graders at St Peter’s

Sample

• Subset of a Population

• Individuals in the Sample are the participants in the study

Population (in the green)

A person

Sample (in the small circle)

Choosing the sample

• Random

– Every person in the population has an equal chance of being selected

– Best way to achieve “representative sample”

– The difficulty is achieving a truly random sample

Simple Random Sample

Sampling or selection is done by randomly selecting one member of the population, then another, then another, and so on, until the desired sample size is achieved

Simple example: 100 people in the population, put their names in a hat, and draw 20.

Simple Random Sample

• More likely methods:

– Use computers

– Use table of random numbers and a list
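The names-in-a-hat example can be done by computer in a few lines. This is a minimal sketch; the placeholder names and the seed value are hypothetical, and the seed is fixed only so the draw can be repeated.

```python
import random

# Hypothetical population list of 100 people
population = [f"Person {i}" for i in range(1, 101)]

rng = random.Random(42)                  # fixed seed so the draw is reproducible
sample = rng.sample(population, k=20)    # draw 20 without replacement

print(sample[:5])
```

`sample` draws without replacement, so nobody can be selected twice, just as a name pulled from the hat is not put back.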

Stratified Random Sampling

• Every person still has an equal chance of being selected

• Select the sample based upon one or more characteristics – strata

• Determine the stratum or strata

• Determine their proportions

• Select persons from the strata to create a sample that is consistent with the strata proportions

Stratified Random Sampling

• Strata are gender and race

– In Population, 60% women, 70% white

– Thus you have four groups

▫ White women – 42%

▫ Minority women – 18%

▫ White men – 28%

▫ Minority men – 12%

– Want 200 sample size. Select:

▫ 84 White women from all the white women in the population

▫ 36 Minority women from all the minority women in the population

▫ 56 White men from all the white men in the population

▫ 24 Minority men from all the minority men in the population
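The four group percentages and quotas follow from multiplying the two population percentages. The sketch below makes explicit an assumption the slide leaves implicit: that gender and race are independent in the population, so the shares can simply be multiplied.

```python
# Marginal percentages from the slide
women_pct, white_pct = 60, 70
sample_size = 200

# Share of each group in percent, ASSUMING gender and race are independent
shares = {
    "White women": women_pct * white_pct / 100,
    "Minority women": women_pct * (100 - white_pct) / 100,
    "White men": (100 - women_pct) * white_pct / 100,
    "Minority men": (100 - women_pct) * (100 - white_pct) / 100,
}

# Number to select from each stratum so the sample mirrors the population
quotas = {group: round(sample_size * share / 100) for group, share in shares.items()}
print(shares)
print(quotas)
```

If the two characteristics were not independent, the group shares would have to come from actual population counts rather than from this multiplication.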

Stratified Random Sampling

Another example:

• Middle school students 7th, 8th, 9th

– Population

▫ 1500 7th graders 1500/3700 = 41%

▫ 1200 8th graders 1200/3700 = 32%

▫ 1000 9th graders 1000/3700 = 27%

– Sample of 100 students

▫ 41 are 7th graders

▫ 32 are 8th graders

▫ 27 are 9th graders

Helps create a representative sample, but is more work
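The grade-level example can be sketched as a proportional stratified draw: work out each stratum's quota, then take a simple random sample of that size within the stratum. The function name and the placeholder student IDs are hypothetical.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Draw from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        quota = round(sample_size * len(members) / total)  # proportional quota
        sample[name] = rng.sample(members, quota)          # random draw within stratum
    return sample

# Hypothetical rosters matching the population counts on the slide
strata = {
    "7th": [f"7th-{i}" for i in range(1500)],
    "8th": [f"8th-{i}" for i in range(1200)],
    "9th": [f"9th-{i}" for i in range(1000)],
}
sample = stratified_sample(strata, sample_size=100, seed=1)
print({grade: len(students) for grade, students in sample.items()})
```

Simple rounding happens to add up to 100 here; with other population counts the rounded quotas can be off by one and would need a small adjustment.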

Cluster Sample

• Still random – every person in the population has an equal chance of being chosen

• But you sample groups (clusters) of people rather than individuals

Examples:

• 8 elementary schools in Quincy

– Randomly select 2

• 3 sections of Ed Psych

– Randomly select 1

Simpler than a simple random sample, but risks a non-representative sample
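A minimal sketch of the school example: the random draw is made over whole schools, and everyone in a selected school becomes part of the sample. The school names and the roster size of 50 are hypothetical.

```python
import random

# Hypothetical rosters: 8 schools with 50 students each
rosters = {f"School {i}": [f"S{i}-{j}" for j in range(1, 51)] for i in range(1, 9)}

rng = random.Random(3)                            # fixed seed for a repeatable draw
selected = rng.sample(list(rosters), k=2)         # randomly pick 2 whole schools

# Every student in a selected school is in the sample
sample = [student for school in selected for student in rosters[school]]
print(selected, len(sample))
```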

Two-Stage Random Sampling

• Combination of cluster and simple random sampling

• Conduct a cluster sampling

• Randomly select participants from the selected clusters

• Example: 8 elementary schools, cluster sampling of 3 of the schools, then randomly select 30 students from each selected school
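The two stages can be sketched as one function: first draw clusters, then draw individuals inside each chosen cluster. The function name, school names, and roster size of 200 are hypothetical, and the sketch assumes 30 students are drawn from each of the 3 selected schools.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly pick clusters. Stage 2: randomly pick people within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

# Hypothetical rosters: 8 schools with 200 students each
schools = {f"School {i}": [f"S{i}-{j}" for j in range(1, 201)] for i in range(1, 9)}

sample = two_stage_sample(schools, n_clusters=3, n_per_cluster=30, seed=1)
print({school: len(students) for school, students in sample.items()})
```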

Non-random Sampling

• Purposive

• Convenience

• Solicited (volunteers)




• Does my sample size provide me with enough information? Is it representative?

External Validity

• The degree to which research results can be generalized beyond the sample and conditions that yielded the findings

• Population generalization

– The extent to which research findings extend beyond the sample of research participants that provided data

• Ecological generalization

– The extent to which research findings extend beyond the setting which produced sampled data

Representation

• Is the sample representative of the population?

• The terms for this are:

– Generalization

– External Validity

Example:

– Results from 4th graders in Adams

▫ Generalize to all 4th graders in Quincy?

▫ Generalize to all 4th graders in Illinois?

Ecological Generalizability – next slide

Ecological Generalizability

• Degree to which the results of the study can generalize to other situations or conditions.

4th graders = 3rd graders?

Live skits = video?

Urban = rural?

Random is best

• The use of random sampling is the strongest method to provide external validity or generalization.

• True random sampling is often not possible

• When it is not, replication becomes more important

One summary

Quantitative                     Qualitative                          Action
Random                           Purposive                            Purposive
Sample size important            Not concerned with sample size       Not concerned with sample size
External Validity is a concern   External Validity can be a concern   Not concerned with external validity
Statistics                       No statistics                        Maybe

• end
