Last Time
Get lots of sleep!
Characteristics of the distribution of a quantitative variable
Shape, center, spread, outliers (in context)
“Formal” analysis for comparing two groups: statistical significance
What is the distribution of the “by chance” results?
Statistical Significance
Calculate the difference in means
Could a difference this large happen by chance?
Can use simulation to mimic the randomization process, assuming no difference between the groups
See how often you get a difference at least as large by chance alone (no treatment effect)
p-value, statistical significance
Consider study design to decide whether to draw a causal conclusion
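The simulation idea above can be sketched in Python (swapped in here for the Minitab used in class). This is a minimal sketch, not the course's procedure: the function name `randomization_p_value` is hypothetical, and the scores are made-up placeholder values, not the data from the actual study.

```python
import random

def randomization_p_value(group_a, group_b, reps=10000, seed=0):
    """Approximate one-sided p-value for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)
    observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(reps):
        # Re-randomize: if there is no treatment effect, group labels are arbitrary.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a)
        # Small tolerance guards against floating-point round-off in the comparison.
        if diff >= observed - 1e-12:
            at_least_as_large += 1
    return observed, at_least_as_large / reps

# Made-up improvement scores for illustration only (NOT the study's data).
unrestricted = [25.2, 14.5, 20.0, 12.6, 34.5, 45.6, 11.0, 18.6]
deprived = [10.0, 4.7, 21.3, -7.0, 12.1, 2.4, 9.6, 13.0]

observed_diff, p_value = randomization_p_value(unrestricted, deprived)
print(f"observed difference in means = {observed_diff:.2f}")
print(f"approximate p-value = {p_value:.4f}")
```

A small p-value says that a difference this large rarely happens by chance alone. Whether the difference can be attributed to the treatment still depends on the study design.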
Example 2 – Day 5
Actual study: $\bar{x}_{\text{unrestricted}} - \bar{x}_{\text{deprived}} = 15.92$
Hypothetical data: $\bar{x}_{\text{unrestricted}} - \bar{x}_{\text{deprived}} = 15.92$
Statistical Process
Compare results
Randomized?
Getting the observational units in the first place!
Explanatory Variable
Example 1: Sampling Words
Circle 10 representative words
Def: A parameter is a numerical characteristic of the population (pi, mu, sigma)
Def: A statistic is a numerical characteristic of the sample (x-bar, p-hat, s)
Example 1: Sampling Words
Does our sampling method generally lead to good estimates of the parameter?
Sample results vary from sample to sample!
A sampling method is unbiased if the distribution of the sample statistics is centered at the population value.
Bias
Literary Digest (p. 21)
Bad sampling frame
Voluntary response bias: those who choose to respond are most likely to feel strongly, usually negatively, on the issue.
Nonresponse bias: those who aren’t home, who don’t have listed numbers, or who refuse to participate
Convenience sample: those who are easy to get a hold of, easily remembered
Example 1: Sampling Words
Def: A simple random sample gives every word in the population an equal probability of being selected.
Every sample of n words is as likely as any other sample of n words.
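To see why the selection method matters, here is a rough Python sketch comparing simple random samples with a biased selection method. Both the population (268 placeholder word lengths, not the real text) and the biased method (a word's chance of being picked is proportional to its length) are assumptions made for illustration.

```python
import random
from statistics import mean

rng = random.Random(1)

# Hypothetical population of 268 word lengths (placeholder values, not the real text).
population = [rng.choice([1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 9, 11]) for _ in range(268)]
mu = mean(population)

def srs_mean(n):
    # Simple random sample: every set of n words is equally likely.
    return mean(rng.sample(population, n))

def size_biased_mean(n):
    # Assumed biased method: longer words are more likely to be picked
    # (selection is with replacement, weighted by word length).
    return mean(rng.choices(population, weights=population, k=n))

srs_means = [srs_mean(10) for _ in range(5000)]
biased_means = [size_biased_mean(10) for _ in range(5000)]

print(f"population mean length         = {mu:.2f}")
print(f"center of SRS sample means     = {mean(srs_means):.2f}")
print(f"center of biased sample means  = {mean(biased_means):.2f}")
```

The simple random samples should center near the population mean, while the length-weighted selection tends to overestimate it.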
Example 1: Sampling Words
Selecting a simple random sample
MTB> set c1
DATA> 1:268
DATA> end
MTB> sample 5 c1 c2
Find the corresponding ID numbers of the sampling frame (from webpage)
Determine the average length of the 5 words in your sample
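For readers without Minitab, the same steps can be approximated in Python. This is a sketch under assumptions: the helper names `draw_srs_ids` and `average_length` are hypothetical, and the example words are placeholders rather than words from the actual sampling frame.

```python
import random

def draw_srs_ids(frame_size=268, n=5, seed=None):
    """Pick n distinct ID numbers from 1..frame_size, each subset equally likely."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, frame_size + 1), n))

def average_length(words):
    """Average number of characters per word in the sample."""
    return sum(len(w) for w in words) / len(words)

ids = draw_srs_ids()
print("Selected ID numbers:", ids)

# Placeholder words standing in for the words found at those IDs in the frame.
example_words = ["alpha", "to", "sample", "of", "words"]
print("Average word length:", average_length(example_words))
```

`random.sample` draws without replacement, so no ID number can appear twice in the sample.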
Example 2: Sampling Words (cont.)
What is the long-term pattern of these sample means?
Def: A sampling distribution of a statistic is the distribution of the sample statistic for all possible samples (of the same size) from the population.
An empirical sampling distribution gives you an idea of the pattern from a large number of samples of the same size
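An empirical sampling distribution can be approximated by drawing many random samples and recording each sample mean. The Python sketch below assumes a made-up population of 268 word lengths (placeholder values, not the real text), and the function name `empirical_sampling_distribution` is hypothetical.

```python
import random
from collections import Counter
from statistics import mean, stdev

rng = random.Random(2)

# Hypothetical population of 268 word lengths (placeholder values).
population = [rng.choice([1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 9, 11]) for _ in range(268)]

def empirical_sampling_distribution(pop, n, reps):
    """Sample means from `reps` simple random samples of size n."""
    return [mean(rng.sample(pop, n)) for _ in range(reps)]

sample_means = empirical_sampling_distribution(population, n=5, reps=2000)

# Crude text histogram: one row per rounded sample mean, one star per 20 samples.
counts = Counter(round(m) for m in sample_means)
for value in sorted(counts):
    print(f"{value:>2} | {'*' * (counts[value] // 20)}")

print(f"population mean      = {mean(population):.2f}")
print(f"mean of sample means = {mean(sample_means):.2f}")
print(f"SD of sample means   = {stdev(sample_means):.2f}")
```

Here each observational unit is a sample and the variable is the sample mean, so the histogram describes how the statistic varies from sample to sample.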
Summary
Values of sample statistics vary from sample to sample – sampling variability
Random sampling error
Sampling distribution = distribution of sample statistics (from all possible random samples)
Observational units = samples
Variable = sample statistics (e.g., sample means)
Sampling method is unbiased if sampling distribution is centered at parameter of interest
Random samples are unbiased and allow us to estimate the size of the random sampling error
Sampling distribution follows a predictable pattern
Statistical Significance
This consistent pattern helps us decide when we might have a surprising value for the sample statistic.
Level of surprise depends on sample size
p-value indicates how often a random sample would lead to a value of the sample statistic at least as extreme
Is the sample statistic “significantly” different from the population parameter?
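The dependence on sample size can be illustrated with a short Python sketch. It assumes a made-up population of word lengths (placeholder values), and the function name `proportion_at_least_as_extreme` is hypothetical. For each sample size it estimates how often a random sample's mean lands at least 1.0 away from the population mean.

```python
import random
from statistics import mean

rng = random.Random(3)

# Hypothetical population of 268 word lengths (placeholder values).
population = [rng.choice([1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 9, 11]) for _ in range(268)]
mu = mean(population)

def proportion_at_least_as_extreme(n, distance, reps=4000):
    """Fraction of random samples of size n whose mean is at least `distance` from mu."""
    hits = 0
    for _ in range(reps):
        if abs(mean(rng.sample(population, n)) - mu) >= distance:
            hits += 1
    return hits / reps

results = {n: proportion_at_least_as_extreme(n, distance=1.0) for n in (5, 20, 80)}
for n, prop in results.items():
    print(f"n = {n:>2}: proportion of samples at least 1.0 from mu = {prop:.3f}")
```

The same distance from the parameter should become rarer, and so more surprising, as the sample size grows.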
Example
Lost ticket, would you buy another?
Lost $20, would you buy another?
Lives saved?
Lives lost?
Prediction: more likely if lost ticket
Prediction: Option A more likely when framed in terms of lives saved
Nonsampling Errors
March 6-8, 2004 Wall Street Journal/NBC poll of 1,018 adults
GAY MARRIAGE opinions depend on how the question is asked.
To one poll question, a 52%-43% majority opposes a constitutional amendment "making it illegal for gay couples to marry." A 54%-42% majority responds favorably to a second query that omits the word "illegal" and more benignly asks about an amendment "that defined marriage as a union only between a man and a woman."
Sources of Nonsampling Errors
Sensitive questions
Social acceptability
Wording of question
Appearance of interviewer
Order of choices
Unsure response, change mind, faulty memory
For Tuesday
Submit your tentative project proposal (see syllabus for additional guidelines)
Submit PP 6 in Blackboard
Read Sec. 4.1 and 4.2
Complete Example 3 from the Day 6 handout