Upload
sugar
View
42
Download
0
Tags:
Embed Size (px)
DESCRIPTION
Chi-Square Tests 3/14/12. Testing the distribution of a single categorical variable : 2 goodness of fit Testing for an association between two categorical variables: 2 test for association. Section 7.1, 7.2. Professor Kari Lock Morgan Duke University. - PowerPoint PPT Presentation
Citation preview
Chi-Square Tests3/14/12
• Testing the distribution of a single categorical variable : 2 goodness of fit
• Testing for an association between two categorical variables: 2 test for association
Section 7.1, 7.2 Professor Kari Lock MorganDuke University
For more information and to sign up, go to www.stat.duke.edu/datafest
• NOTE: This week only, my Friday office hours will be from 3:30 – 5:30 pm (NOT 1-3pm as usual)
Office Hours This Week
• Homework 6 (due Monday, 3/19)
• Anonymous Midterm Evaluation (due Monday, 3/19, 5pm)
• Project 1 (due Thursday, 3/22, 5pm)
To Do
Multiple Categories
•So far, we’ve learned how to do inference for categorical variables with only two categories
•Today, we’ll learn how to do hypothesis tests for categorical variables with multiple categories
Rock-Paper-Scissors (Roshambo)
Play a game!
Can we use statistics to help us win?
Rock-Paper-Scissors
Which did you throw?
a) Rockb) Paperc) Scissors
Rock-Paper-ScissorsROCK PAPER SCISSORS
34 24 40
How would we test whether all of these categories are equally likely?
Hypothesis Testing
1.State Hypotheses
2.Calculate a statistic, based on your sample data
3.Create a distribution of this statistic, as it would be observed if the null hypothesis were true
4.Measure how extreme your test statistic from (2) is, as compared to the distribution generated in (3)
test statistic
Hypotheses
Let pi denote the proportion in the ith category.
H0 : All pi s are the sameHa : At least one pi differs from the others
OR
H0 : Every pi = 1/3Ha : At least one pi ≠ 1/3
Test Statistic
Why can’t we use the familiar formula
to get the test statistic?
•More than one sample statistic•More than one null value
We need something a bit more complicated…
sample statistic null valueSE
Observed Counts•The observed counts are the actual counts observed in the study
ROCK PAPER SCISSORSObserved 34 24 40
Expected Counts•The expected counts are the expected counts if the null hypothesis were true
•For each cell, the expected count is the sample size (n) times the null proportion, pi
expected = inp
ROCK PAPER SCISSORSObserved 34 24 40Expected 33 33 33
Chi-Square Statistic
•A test statistic is one number, computed from the data, which we can use to assess the null hypothesis
•The chi-square statistic is a test statistic for categorical variables:
22 observed - expected
expected
Rock-Paper-ScissorsROCK PAPER SCISSORS
Observed 34 24 40Expected 33 33 33
2 2 22 34 33 24 33 40 33
3.9733 33 33
What Next?
We have a test statistic. What else do we need to perform the hypothesis test?
A distribution of the test statistic assuming H0 is true
How do we get this? Two options:1) Simulation2) Distributional Theory
Simulation
Let’s see what kind of statistics we would get if rock, paper, and scissors are all equally likely.
1) Take 3 scraps of paper and label them Rock, Paper, Scissors. Fold or crumple them so they are indistinguishable. Choose one at random.
2) What did you get?(a) Rock (b) Paper (c) Scissors
3) Calculate the 2 statistic.
4) Form a distribution.
5) How extreme is the actual test statistic?
RStudio
Chi-Square Distribution
Chi-Square Distribution•If each of the expected counts are at least 5, AND if the null hypothesis is true, then the 2 statistic follows a 2 –distribution, with degrees of freedom equal to
df = number of categories – 1
• Rock-Paper-Scissors:
df = 3 – 1 = 2
Chi-Square Distribution
Chi-Square p-values1.Applet: http://surfstat.anu.edu.au/surfstat-home/tables/chi.php
2.RStudio:pchisq(ts,df,lower.tail=FALSE)
3.TI-83:2nd DISTR 7:2cdf( lower bound, upper bound, df
Because the 2 is always positive, the p-value (the area as extreme or more extreme) is always the upper tail
t-distribution p-values and t*
1.Applet: http://surfstat.anu.edu.au/surfstat-home/tables/t.php
2.RStudio:p-value: pt(ts,df) (use lower.tail=TRUE or FALSE,
and double if two-tailed) t* for 95% CI: qt(0.975,df)
3.TI-83:p-value: 2nd DISTR 5: tcdf( lower bound, upper bound, dft*: Approximate with a normal (invnorm)
Goodness of Fit
• A 2 test for goodness of fit tests whether the distribution of a categorical variable is the same as some null hypothesized distribution
• The null hypothesized proportions for each category do not have to be the same
Chi-Square Test for Goodness of Fit1.State null hypothesized proportions for each category, pi.
Alternative is that at least one of the proportions is different than specified in the null.
2.Calculate the expected counts for each cell as npi . Make
sure they are all greater than 5 to proceed.
3.Calculate the 2 statistic:
4.Compute the p-value as the area in the tail above the 2 statistic, for a 2 distribution with df = (# of categories – 1)
5.Interpret the p-value in context.
22 observed - expected
expected
Mendel’s Pea Experiment
Source: Mendel, Gregor. (1866). Versuche über Pflanzen-Hybriden. Verh. Naturforsch. Ver. Brünn 4: 3–47 (in English in 1901, Experiments in Plant Hybridization, J. R. Hortic. Soc. 26: 1–32)
• In 1866, Gregor Mendel, the “father of genetics” published the results of his experiments on peas
• He found that his experimental distribution of peas closely matched the theoretical distribution predicted by his theory of genetics (involving alleles, and dominant and recessive genes)
Mendel’s Pea ExperimentMate SSYY with ssyy:Þ 1st Generation: all Ss Yy
Mate 1st Generation:Þ 2nd Generation
Second Generation
S, Y: Dominants, y: Recessive
Phenotype Theoretical Proportion
Round, Yellow 9/16
Round, Green 3/16
Wrinkled, Yellow 3/16
Wrinkled, Green 1/16
Mendel’s Pea ExperimentPhenotype Theoretical
ProportionObserved
Counts
Round, Yellow 9/16 315
Round, Green 3/16 101
Wrinkled, Yellow 3/16 108
Wrinkled, Green 1/16 32
Let’s test this data against the null hypothesis of each pi equal to the theoretical value, based on genetics
0 1 2 3 4
0
: 9 /16, 3 /16, 3 /16, 1/16: At is not as specified least one in a i
p p p pHH
Hp
Mendel’s Pea ExperimentPhenotype Theoretical
ProportionObserved
Counts
Round, Yellow 9/16 315
Round, Green 3/16 101
Wrinkled, Yellow 3/16 108
Wrinkled, Green 1/16 32
The 2 statistic is made up of 4 components (because there are 4 categories). What is the contribution of the first cell (315)?
(a) 0.016 (b) 1.05 (c) 5.21 (d) 107.2
Mendel’s Pea ExperimentPhenotype pi Observed
CountsExpected Counts
Round, Yellow 9/16 315 556 9/16 = 312.75 Round, Green 3/16 101 556 3/16 = 104.25
Wrinkled, Yellow 3/16 108 556 3/16 = 104.25Wrinkled, Green 1/16 32 556 1/16 = 34.75
556n expected = 556
22
2 2 2 2315 312.75 101 104.25 108 104.25 32 34.75312.75 104.25 104.25 34.7
observed expectedexpecte
50.470
d
Mendel’s Pea Experiment2 0.470
• Compare to a 2 distribution with 4 – 1 = 3 degrees of freedomhttp://surfstat.anu.edu.au/surfstat-home/tables/chi.php
p-value = 0.9254
Mendel’s Pea Experimentp-value = 0.9254
Does this prove Mendel’s theory of genetics? Or at least prove that his theoretical proportions for pea phenotypes were correct?
(a) Yes(b) No
Two Categorical Variables
• The statistics behind a 2 test easily extends to two categorical variables
• A 2 test for association (often called a 2 test for independence) tests for an association between two categorical variables
Rock-Paper-Scissors
• Did reading the example on Rock-Paper-Scissors in the textbook influence the way you play the game?
• Did you read the Rock-Paper-Scissors example in Section 6.3 of the textbook?
a) Yesb) No
Rock-Paper-ScissorsRock Paper Scissors Total
Read Example 2 10 5 17
Did Not Read Example 32 12 34 80
Total 34 24 39 97
H0 : Whether or not a person read the example in the textbook is not associated with how a person plays rock-paper-scissors
Ha : Whether or not a person read the example in the textbook is associated with how a person plays rock-paper-scissors
Expected Countscorow total ex lumn totapected =
sample el
siz
Maintains row and column totals, but redistributes the counts as if there were no association between the variables
Rock Paper Scissors Total
Read Example 2 10 5 17
Did Not Read Example 32 12 34 80
Total 34 24 39 97
Chi-Square
22 observed - expected
expected
Number of rows = Number of columns
df = ( 1) ( 1)
rc
r c
Chi-Square StatisticOBSERVED Rock Paper Scissors Total
Read Example 2 10 5 17
Did Not Read Example 32 12 34 80
Total 34 24 39 97
222222
225.96104.2156.843228.041419.793432.165.964.216.8428.0419.7932.16
15.4
EXPECTED Rock Paper Scissors Total
Read Example 4.21 6.84 17
Did Not Read Example 28.04 19.79 32.16 80
Total 34 24 39 97
17 5.9697
34
Rock-Paper-Scissors
df = (2 1) (3 1) 2
The smallest expected count is 5.96, which is greater than 5, so it is okay to use the chi-square distribution to find the p-value.
p-value = 0.0005
2 15.4
We have strong evidence that there is an association between reading the RPS example in the textbook and the first RPS throw. (This means you are paying attention and using what you learn in the textbook! )
Chi-Square Test for Association1.H0 : The two variables are not associated
Ha : The two variables are associated
2.Calculate the expected counts for each cell:
Make sure they are all greater than 5 to proceed.
3.Calculate the 2 statistic:
4.Compute the p-value as the area in the tail above the 2 statistic, for a 2 distribution with df = (r – 1) (c – 1)
5.Interpret the p-value in context.
22 observed - expected
expected
corow total ex lumn totapected = sample e
l siz
Metal Tags and PenguinsLet’s return to the question of whether metal tags are detrimental to penguins.
Source: Saraux, et. al. (2011). “Reliability of flipper-banded penguins as indicators of climate change,” Nature, 469, 203-206.
Survived Died Total
Metal Tag 33 134 167
Electronic Tag 68 121 189
Total 101 255 356
Is there an association between type of tag and survival?(a) Yes (b) No
Metal Tags and Penguins
Calculate expected counts (shown in parentheses). First cell:
Calculate the chi-square statistic:
p-value:
Survived Died Total
Metal Tag 33 134 167
Electronic Tag 68 121 189
Total 101 255 356
Survived Died Total
Metal Tag 33 (47.38) 134 (119.62) 167
Electronic Tag 68 (53.62) 121 (135.38) 189
Total 101 255 356
2 2 2 22 33 47.38 134 119.62 68 53.62 121 135.38
11.4847.38 119.62 53.62 135.38
df = (2 – 1) × (2 – 1) = 1 p-value = 0.0007
There is strong evidence that metal tags are detrimental to penguins.
row total 167 47.38sam
column total 1ple size 356
01
Metal Tags and Penguins
Difference in proportions test:z = -3.387 (we got something slightly different due to rounding)
p-value = 0.0007 (two-sided)
Chi-Square test:2 = 11.47p-value = 0.0007
Note: (-3.387)2 = 11.47
Two Categorical Variables with Two Categories Each
• When testing two categorical variables with two categories each, the 2 statistic is exactly the square of the z-statistic
• The 2 distribution with 1 df is exactly the square of the normal distribution
• Doing a two-sided test for difference in proportions or doing a 2 test will produce the exact same p-value.
Summary• The 2 test for goodness of fit tests whether one categorical variable differs from a hypothesized distribution
• The 2 test for association tests whether two categorical variables are associated
• For both, you calculate expected counts for each cell, compute the 2 statistic as
and find the p-value as the area above this statistic on a 2 distribution
22 observed - expected
expected