Section 1.1: Introduction to Statistics
Statistics vs. Anecdotal Evidence Smoking causes cancer. Seat
belts save lives.
Slide 3
Autism and Vaccines Nelson says it wasn't long after her son
Parker's shots at 15 months that she noticed something was wrong.
"He had run a slight fever after the vaccinations, but I didn't
think anything of it," said Nelson. "You know kids run fevers all
the time, but about a week after that he just completely stopped
talking." After months of worrying, wondering, and going back and
forth with doctors, an official diagnosis was made: autism. Nelson
believes it started with the vaccines. "Gradually, I started
piecing it together. He got sick after his vaccinations and about a
week later everything changed. He was a completely different little
boy then," said Nelson.
http://www.wsaz.com/charleston/headlines/19376044.html
Slide 4
What is Statistics? Statistics is the discipline that guides us to
produce or collect data which is then analyzed in order to draw
inferences or make predictions. Numerical summaries such as means,
percentages, and standard deviations are called statistics.
Slide 5
Descriptive Statistics Descriptive Statistics refers to methods
for summarizing data. These summaries consist of graphs
(histograms, scatterplots, pie charts, etc.) and numbers (means,
standard deviations, regression equations, percentages, etc.).
Slide 6
Inferential Statistics Inferential statistics refers to methods
of making decisions or predictions about a population or a process,
based on data obtained from a sample. We will use tests of
significance and confidence intervals to achieve this.
Slide 7
This semester, we will be looking at and conducting a number of
studies
Slide 8
Statistical Process
1. Ask a research question (Research Conjecture)
2. Design a study
3. Collect data
4. Explore the data
5. Draw inferences (Logic of Inference: Significance, Estimation)
6. Formulate conclusions (Scope of Inference: Generalize, Cause/Effect)
7. Communicate findings
Slide 9
Physicians Health Study I 1. Research Question: Will taking
aspirin help reduce heart attacks? 2. Design Study: Started in 1982
with 22,071 male physicians. Half took a 325 mg aspirin every other
day; the other half took a placebo.
Slide 10
Physicians Health Study I 3. Collect Data: Intended to go until
1995, the aspirin study was stopped in 1988 after 189 heart attacks
occurred in the placebo group and 104 in the aspirin group. Beta
carotene, the other agent tested, had been hoped to be a wonder
drug, but the study found neither benefit nor harm from it. This
result allowed investigators to turn to other, more promising agents.
Slide 11
Physicians Health Study I 4. Explore Data: 1.7% in the placebo
group had heart attacks while only 0.9% in the aspirin group had
heart attacks. (45% reduction in heart attacks for the aspirin
group) 5. Draw Inferences: The likelihood of the difference between
the proportions of heart attacks in each group being as large as it
was just by chance is very, very small.
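The percentages on this slide can be checked with a few lines of arithmetic. This is only a sketch: the slides say "half" of the 22,071 physicians were in each group, so the group sizes used below (11,034 placebo and 11,037 aspirin) are an assumption, not figures taken from the slides.

```python
# Heart-attack counts come from the slides; the group sizes are an
# assumption (the slides only say "half" of 22,071 physicians per group).
placebo_attacks, placebo_n = 189, 11034
aspirin_attacks, aspirin_n = 104, 11037

placebo_rate = placebo_attacks / placebo_n
aspirin_rate = aspirin_attacks / aspirin_n

# Relative reduction: how much lower the aspirin rate is,
# expressed as a share of the placebo rate.
relative_reduction = 1 - aspirin_rate / placebo_rate

print(f"placebo: {placebo_rate:.1%}, aspirin: {aspirin_rate:.1%}")
print(f"relative reduction: {relative_reduction:.0%}")
```

Under those assumed group sizes, the rates round to 1.7% and 0.9%, and the relative reduction rounds to 45%.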
Slide 12
Physicians Health Study I 6. Formulate Conclusions: They
concluded that taking aspirin does reduce the likelihood of heart
attacks in middle-aged and older males. 7. Report Findings:
Slide 13
Terminology The individual entities on which data are recorded
are called observational units. The recorded characteristics of the
observational units are the variables of interest. What are the
observational units and variables in the Physicians Health
Study?
Slide 14
Section 1.2 Introduction to the Logic of Statistical
Inference
Slide 15
Dolphin Communication Can dolphins communicate abstract ideas?
In an experiment done in the 1960s, Doris was instructed which of
two buttons to push. She then had to communicate this to Buzz (who
could not see Doris). If he picked the correct button, both
dolphins would get a reward. What are the observational units and
variables in this study?
Slide 16
Dolphin Communication In one set of trials, Buzz chose the
correct button 15 out of 16 times. Based on these results, do you
think Buzz knew which button to push or is he just guessing? How
might we justify an answer? How might we model this situation?
Slide 17
Modeling Buzz and Doris Flip Coins Applet
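One way to model Buzz "just guessing" without an applet is to let the computer flip the coins. The sketch below is not the course applet; the function name, the number of repetitions, and the seed are all choices made for illustration. It simulates many sets of 16 fair coin flips and counts how often 15 or more come up "correct".

```python
import random
from math import comb

def simulate_correct_counts(n_trials=16, p_correct=0.5, reps=10_000, seed=1):
    """Return one count of 'correct' guesses per simulated set of trials."""
    rng = random.Random(seed)
    return [sum(rng.random() < p_correct for _ in range(n_trials))
            for _ in range(reps)]

observed = 15  # Buzz chose the correct button 15 out of 16 times
counts = simulate_correct_counts()

# Share of simulated sets that did at least as well as Buzz.
sim_proportion = sum(c >= observed for c in counts) / len(counts)

# Exact chance of 15 or 16 heads in 16 fair flips, for comparison.
exact = (comb(16, 15) + comb(16, 16)) / 2**16

print(f"simulated: {sim_proportion:.4f}, exact: {exact:.6f}")
```

The exact value is 17 out of 65,536 possible outcomes, so a result like Buzz's should show up only a handful of times in 10,000 simulated sets.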
Slide 18
Can Chimps Solve Problems? http://youtu.be/ySMh1mBi3cI
Slide 19
Exploration 1.2: Can Chimps Solve Problems? Sarah, a
30-year-old chimp, is shown videos of a person struggling with some
problem (can't reach a banana, cage door locked, record player not
working, etc.). She is then shown two pictures, one showing the
solution and one not. She then picks one of the pictures. Does Sarah
understand the solution to these problems or is she just randomly
picking a picture?
Slide 20
Exploration 1.2 (pg 15) Read the first paragraph. 1. State the
research question. (This is a broad statement.) 2. State the
research conjecture. (This is more specific to our test.) Sarah
correctly picked 7 of the 8 pictures. Is this unlikely if she is
just guessing? Continue working on the exploration.
Slide 21
Section 1.3 Statistical Significance: Other Random Choice
Models
Slide 22
Can dogs sniff out cancer? Marine sniffing samples
Slide 23
Can Dogs Sniff Out Cancer? 1. Research Question: Can dogs
detect a patient with cancer by smelling their breath? 2. Design a
study: Five breath bags were shown to Marine, one from a cancer
patient and four from non-cancer patients. 3. Collect data: Marine
completed 33 attempts at this procedure. 4. Explore the data:
Marine identified the correct bag 30 out of 33 times.
Slide 24
Can Dogs Sniff Out Cancer? How is the chance model we will use
for this situation different than our previous ones? Can we use
coins again?
Slide 25
Can Dogs Sniff Out Cancer? 5. Draw Inferences Three S Strategy
Statistic: Compute the statistic from the observed data.
Simulate: Identify a model that represents a chance explanation. Use
the model to simulate data that could have happened when the chance
model is true. Calculate the value of the statistic from the
could-have-been data. Repeat the simulation process to generate a
distribution of the could-have-been values for the statistic.
Strength of evidence: Consider whether the value of the observed
statistic is unlikely to occur when the chance model is true.
Slide 26
Can Dogs Sniff Out Cancer? We have the statistic. Marine made
the correct identification 30 out of 33 times. How could we set up
a simulation? Tactile (how could this be done?) Applet Strength of
evidence. Is 30 out of 33 very unlikely under the chance
model?
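The Three S Strategy can be sketched in code for Marine. With five bags, a random choice is correct 1 time in 5, so a fair coin no longer fits; a spinner with five equal sections would. The function name `three_s`, the repetition count, and the seed below are illustrative assumptions, and this is not the applet used in class.

```python
import random

def three_s(observed_count, n_attempts, p_chance, reps=10_000, seed=2):
    """Statistic, Simulate, Strength of evidence for a single proportion."""
    rng = random.Random(seed)
    # 1. Statistic: the observed proportion of correct picks.
    statistic = observed_count / n_attempts
    # 2. Simulate: could-have-been proportions if only chance were at work.
    null_props = []
    for _ in range(reps):
        correct = sum(rng.random() < p_chance for _ in range(n_attempts))
        null_props.append(correct / n_attempts)
    # 3. Strength of evidence: share of simulated values at least as large
    #    as the observed statistic.
    p_value = sum(p >= statistic for p in null_props) / reps
    return statistic, null_props, p_value

statistic, null_props, p_value = three_s(30, 33, p_chance=1 / 5)
print(f"statistic = {statistic:.3f}")
print(f"largest simulated proportion = {max(null_props):.3f}")
print(f"approximate p-value = {p_value}")
```

Comparing the largest simulated proportion with 30/33 shows how far Marine's result sits from anything the chance model produces.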
Slide 27
Can Dogs Sniff Out Cancer? 6: Formulate conclusions: Can we
conclude that Marine can identify cancerous breath? Can we conclude
that all dogs can do this? Some dogs? 7: Communicate findings:
Marine, the dog that can sniff out bowel cancer By Jeremy Laurance,
Health Editor A labrador retriever called Marine has been trained
to sniff out cancer with stunning accuracy, researchers report
today.
Slide 28
Terminology: Hypotheses The null hypothesis is the chance
explanation. Typically the alternative hypothesis is what the
researchers think is true. Null hypothesis: Marine is randomly
choosing which bag to sit next to. Alternative hypothesis: Marine
is not randomly choosing which bag to sit next to.
Slide 29
Terminology: Null Distribution We will refer to the
distribution of chance outcomes as the null distribution. For
Marine, we should have gotten a null distribution similar to the
following.
Slide 30
Terminology: P-value The p-value is the proportion of outcomes
in the null distribution that are at least as extreme as the value
of the statistic actually observed in the study. What was our
p-value for Marine? Were they all the same? Were they all close to
the same?
Slide 31
Guidelines for evaluating strength of evidence from p-values
p-value > 0.10: not much evidence against the null hypothesis
0.05 < p-value < 0.10: moderate evidence against the null hypothesis
0.01 < p-value < 0.05: strong evidence against the null hypothesis
0.001 < p-value < 0.01: very strong evidence against the null hypothesis
p-value < 0.001: extremely strong evidence against the null hypothesis
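These guidelines amount to a simple lookup, sketched below. The function name is hypothetical, and the guidelines do not say what to do with a p-value that lands exactly on a cutoff (such as 0.05); placing it in the stronger category is an assumption of this sketch.

```python
def strength_of_evidence(p_value):
    """Map a p-value to the wording used in the guidelines above."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    # Assumption: a value exactly on a cutoff falls in the stronger category.
    if p_value > 0.10:
        return "not much evidence against the null hypothesis"
    if p_value > 0.05:
        return "moderate evidence against the null hypothesis"
    if p_value > 0.01:
        return "strong evidence against the null hypothesis"
    if p_value > 0.001:
        return "very strong evidence against the null hypothesis"
    return "extremely strong evidence against the null hypothesis"

for p in (0.25, 0.07, 0.03, 0.004, 0.0002):
    print(p, "->", strength_of_evidence(p))
```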
Slide 32
Terminology: Statistically Significant If the observed results
provide strong evidence that the data did not arise by random
chance alone then the research result is called statistically
significant. Are Marine's results statistically significant?
Slide 33
Let's play some rock-paper-scissors. Rock smashes scissors. Paper
covers rock. Scissors cut paper. Play the novice version at least 30
times and keep track of all your choices.
Slide 34
Activity 1.4 Now work on activity 1.4.
Slide 35
Criminal Justice System vs. Significance Tests Innocent until
proven guilty. We assume a defendant is innocent and the
prosecution has to collect evidence to try to prove the defendant
is guilty. Likewise, we assume our chance model (or null
hypothesis) is true and we collect data and calculate a sample
proportion. We then show how unlikely our proportion is if the
chance model is true.
Slide 36
Criminal Justice System vs. Significance Tests If the
prosecution shows lots of evidence that goes against this assumption
of innocence (DNA, witnesses, motive, contradictory story, etc.),
then the jury concludes that the assumption of innocence is wrong.
Likewise, if after collecting data we find that the likelihood
(p-value) of such a proportion is so small that it would rarely
occur by chance if the null hypothesis is true, then we conclude
that our assumption of the chance model being true is wrong.
Slide 37
Review For Sarah the chimp, you could have gotten a null
distribution similar to the one shown here. What does a single dot
represent? What does the whole distribution represent? What is the
p-value for this simulation? What does this p-value mean?
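A null distribution like the one for Sarah can be rebuilt as a rough text dot plot. This is a sketch with made-up settings (1,000 repetitions, an arbitrary seed, one printed dot per 10 simulated results), so it will not match the picture on the slide exactly. Each simulated result is the number correct in one set of 8 random guesses, and the p-value is the share of results at 7 or more, Sarah's observed count.

```python
import random
from collections import Counter

rng = random.Random(3)          # fixed seed so the run is repeatable
n_problems, reps = 8, 1_000     # 8 problems per simulated set; 1,000 sets

# Each entry: number correct in one simulated set of 8 random guesses.
dots = [sum(rng.random() < 0.5 for _ in range(n_problems))
        for _ in range(reps)]
tally = Counter(dots)

# Crude dot plot: one printed dot stands for 10 simulated sets.
for k in range(n_problems + 1):
    print(f"{k} correct: {tally[k]:4d} {'.' * (tally[k] // 10)}")

# p-value: share of simulated sets with 7 or more correct.
p_value = sum(d >= 7 for d in dots) / reps
print("approximate p-value:", p_value)
```

For reference, the exact chance of 7 or 8 correct out of 8 fair guesses is 9/256, about 0.035, so the simulated p-value should land near that.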
Slide 38
More Review The null hypothesis is the chance explanation.
Typically the alternative hypothesis is what the researchers think
is true. Three S Strategy: Statistic, Simulate, Strength of evidence.
The p-value is the proportion of outcomes in the null distribution
that are at least as extreme as the value of the statistic actually
observed in the study.
Slide 39
Still More Review A small p-value gives evidence against the
null and for the alternative. If the observed results provide
strong evidence that the data did not arise by random chance alone
then the research result is called statistically significant.
Slide 40
Section 1.4 Other Chance Models
Slide 41
Ron Artest, choker at the line? In the 2009-10 basketball
season Ron Artest made 68.8% of his free throws, similar to his
career average. In his first 15 attempts in the playoffs, he made
only 7 free throws (46.7%). Is this evidence that he is choking and
performing significantly worse than during the regular season?
Slide 42
Ron Artest Example What are the observational units? Artest's 15
free throw attempts. What is the variable? Whether or not he makes
the free throw. What is the statistic of interest? 7/15
Slide 43
Notation
Slide 44
Hypotheses Null hypothesis: Ron Artest's performance at the free
throw line during the 2010 NBA finals is the same as his regular
season performance; his probability of making a basket in the
playoffs is 0.688. Alternative hypothesis: Ron Artest's performance
at the free throw line during the 2010 NBA finals is worse than his
regular season performance; his probability of making a basket in
the playoffs is less than 0.688.
Slide 45
Simulated Chance Model Coins, cards, dice, spinners, etc. don't
really work well here to develop a chance model of a 68.8% success
rate. But we can still use the magic of an applet. (While this will
be a different applet than the first two we used, it is essentially
the same.) applet
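A computer can stand in for a 68.8% "coin". The sketch below is not the applet; the seed and the number of repetitions are arbitrary choices. It simulates sets of 15 free throws, each made with probability 0.688, and because the alternative hypothesis says "less than 0.688", it counts the sets with 7 or fewer makes.

```python
import random

rng = random.Random(4)            # fixed seed, an arbitrary choice
p_regular_season = 0.688          # null hypothesis: same as regular season
n_attempts, observed_makes = 15, 7
reps = 10_000

# Each repetition: 15 simulated free throws under the null hypothesis.
null_makes = [sum(rng.random() < p_regular_season for _ in range(n_attempts))
              for _ in range(reps)]

# "At least as extreme" here means 7 or fewer makes.
p_value = sum(m <= observed_makes for m in null_makes) / reps
mean_makes = sum(null_makes) / reps

print(f"mean makes under the null: {mean_makes:.2f}")
print(f"approximate p-value: {p_value:.3f}")
```

The mean should sit near 15 × 0.688 ≈ 10.3 makes, which is where the null distribution is centered.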
Slide 46
Ron Artest Continued So we have moderate evidence against the
null. Let's see what would happen if we had more data. Suppose he
continued to shoot 46.7% from the free throw line so that he made 7
out of 15 of his next attempts as well for a total of 14 out of 30.
Let's return to the applet to see how our p-value would change.
Slide 47
Ron Artest Continued As the sample size increases, there is
less variability in our null distribution. It is still centered
around 0.688, but its width becomes more and more narrow. As a
result, 0.467 gets further and further out in the tail and thus the
p-value gets smaller. This should make intuitive sense in that with
a larger sample size, we have more evidence.
Slide 48
Ron Artest Continued Besides a larger sample size, how else
could we get more evidence against the null? Artest could make
fewer shots. Is that what really happened? No. Artest made 4 of his
next 5 shots for a total of 11 out of 20 (55%) for the playoffs.
Let's return to the applet and see how this changes our
p-value.
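The three scenarios (7 of 15, 14 of 30, and 11 of 20) can be compared side by side. Instead of the applet's simulation, this sketch swaps in an exact binomial calculation, which gives the value that the simulated p-values approximate. The function name `lower_tail` is hypothetical.

```python
from math import comb

def lower_tail(makes, attempts, p=0.688):
    """Exact chance of `makes` or fewer successes in `attempts` tries."""
    return sum(comb(attempts, k) * p**k * (1 - p)**(attempts - k)
               for k in range(makes + 1))

for makes, attempts in [(7, 15), (14, 30), (11, 20)]:
    print(f"{makes}/{attempts} = {makes / attempts:.3f}: "
          f"p-value ~ {lower_tail(makes, attempts):.4f}")
```

The ordering is the point: doubling the sample at the same 46.7% shrinks the p-value, while the actual 11-of-20 result is closer to 0.688 and gives a larger one.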
Slide 49
Exploration 1.4 Shaky Putting? Phil Mickelson is one of the
best golfers in the world. He's won the Masters Tournament three
times. However, 2011 was not his best year. He seemed to struggle
with his putting and switched to a belly putter late in the
year.
Slide 50
Exploration 1.4 Was Mickelson a poor putter in 2011? In this
exploration, you will compare Mickelson's 2011 record of putting
from 10 feet away from the hole with that of all other professional
golfers that year. Was he significantly worse than his peers?
Slide 51
Section 1.5 Modeling More Complex Situations
Slide 52
Infant preference for helper or hinderer?
Slide 53
Helper Toy
Slide 54
Baby chooses a toy
Slide 55
Helper or Hinderer? Sixteen babies were shown two
demonstrations, one with a helper toy and one with a hinderer toy.
Which toy was used for each role and the order of the
demonstrations were random. When presented with the two toys (with
left and right positions chosen at random), 14 of the
babies chose the helper toy. How is this experiment different than
any we have looked at so far?
Slide 56
Helper or Hinderer? The key difference is that each attempt was
made by a different baby. Our chance model implies that each baby
has the same chance of choosing the helper toy (50%). It could be
that some babies randomly choose and some do not. We will talk
about this in our conclusion. Let's run the test.
Slide 57
Helper or Hinderer? Null Hypothesis: Each baby is randomly
choosing one of two toys. (The babies choose the helper toy 50% of
the time in the long run.) Alternative Hypothesis: The babies are
not randomly choosing, but show a preference for the helper toy.
(The babies choose the helper toy more than 50% of the time in the
long run.) We can use any applet to test this. Remember that our
sample proportion is 14 out of 16.
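Because the null model is a fair 50/50 choice for each of the 16 babies, the p-value can also be counted directly rather than simulated. This is a sketch of that counting argument, not the applet's method: under the null hypothesis all 2^16 patterns of choices are equally likely, so the p-value is the share of patterns in which 14 or more babies choose the helper toy.

```python
from math import comb

n_babies, chose_helper = 16, 14

# Under the null hypothesis each baby picks the helper toy with
# probability 0.5, so every one of the 2**16 patterns is equally likely.
patterns_as_extreme = sum(comb(n_babies, k)
                          for k in range(chose_helper, n_babies + 1))
p_value = patterns_as_extreme / 2**n_babies

print(f"{patterns_as_extreme} of {2**n_babies} patterns "
      f"-> p-value = {p_value:.4f}")
```

This prints 137 of 65536 patterns, a p-value of about 0.0021, which a simulation-based applet should approximate.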
Slide 58
Helper or Hinderer? So what can we conclude? Do all the babies
prefer the helper toy? Do some of the babies prefer the helper toy?
Because we had a low p-value, we can conclude that not all the
babies are randomly choosing and that at least some of them prefer
the helper toy. Can we make conclusions beyond these 16
babies?
Slide 59
Which Tire? Two students miss a chemistry exam because of
excessive partying, but blame their absence on a flat tire. The
professor allowed them to take a make-up exam, and he sent them to
separate rooms to take it. The first question, worth 5 points, was
quite easy. The second question, worth 95 points, asked: Which tire
was flat?
Slide 60
Which Tire? How would you answer this question? Driver's side
front, Passenger's side front, Driver's side rear, Passenger's side
rear
Slide 61
Exploration 1.5: Tire Story Falls Flat We will use the data
from class to determine if students have a preference for picking
one of the four tires. This is similar to the helper-hinderer
example because our observational units are different people. Let's
work on Exploration 1.5 (page 50).