gavin-brown
AP Statistics Chapter 26 Notes
“Chi-Squared Tests”
Chi-Squared Models
Chi-squared models are skewed to the right. They are parameterized by their degrees of freedom and become less skewed with increasing degrees of freedom. As the degrees of freedom increase, the distribution gets closer to a symmetric shape, but will never become completely symmetric. The Chi-Squared Model always has a longer tail to the right.
Graph some chi-squared models:
y = χ²pdf(x, 5)
y = χ²pdf(x, 10)
y = χ²pdf(x, 20)
y = χ²pdf(x, 40)
Window Settings:
Xmin = 0
Xmax = 60
Ymin = 0
Ymax = .2
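For readers without a graphing calculator, the same four curves can be explored with a short Python sketch. It assumes only the standard density formula for a chi-squared model; the helper name `chi2_pdf` and the grid spacing are choices made for this illustration. It locates the peak of each curve in the window above and reports the theoretical skewness, sqrt(8/df), which shrinks toward zero but never reaches it.

```python
import math

def chi2_pdf(x, df):
    """Density of the chi-squared model with df degrees of freedom at x > 0."""
    k = df / 2
    # Work on the log scale so large df values do not overflow.
    return math.exp((k - 1) * math.log(x) - x / 2 - k * math.log(2) - math.lgamma(k))

# Same window as above: x from 0 to 60 (x = 0 itself is skipped to avoid log(0)).
xs = [i / 10 for i in range(1, 601)]

peaks = {}
for df in (5, 10, 20, 40):
    # Grid point where the curve is highest.
    peaks[df] = max(xs, key=lambda x: chi2_pdf(x, df))
    skewness = math.sqrt(8 / df)  # theoretical skewness of a chi-squared model
    print(f"df={df:2d}  peak near x={peaks[df]:.1f}  skewness={skewness:.2f}")
```

Each peak lands at df - 2, so the curves slide to the right and flatten as the degrees of freedom grow, while the right tail stays longer than the left.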
The Chi-Squared Statistic
Chi-Squared analysis is based on the following calculation, called the chi-squared statistic. We use this number to find the p-value, which helps us determine whether to reject or retain the null hypothesis.
$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$
This plays the same role as the z-score in the Normal model and the t-score in the t-model.
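The calculation can be sketched in a few lines of Python. The function name `chi2_statistic` and the counts below are invented purely to show the arithmetic; they do not come from any example in these notes.

```python
def chi2_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts, invented only to illustrate the arithmetic.
observed = [18, 22, 20]
expected = [20, 20, 20]

stat = chi2_statistic(observed, expected)  # (4 + 4 + 0) / 20
print(stat)
```

A statistic near zero means the observed counts sit close to the expected counts; larger values mean a worse fit.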
Types of Chi-Squared Tests
Chi-Squared Test for Goodness of Fit: This test compares the observed sample distribution with the population distribution. Generally, we are testing how well the observations “fit” what we expect. Use L1 for observed and L2 for expected.
(df = n – 1, where n is the number of categories)
Chi-Squared Test for Homogeneity: This test compares two or more populations to look for similarities among the groups regarding a categorical variable. The data are provided in a table or matrix. Use matrix A for observed and matrix B for expected.
(df = (r – 1)(c – 1))
Chi-Squared Test for Independence: This test looks for an association or dependence between two categorical variables within one population. Use matrix A for observed and matrix B for expected.
(df = (r – 1)(c – 1))
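As a quick check on the degrees-of-freedom rules, here is a minimal sketch. The function names are hypothetical; the inputs are the category counts and table sizes of the examples later in these notes.

```python
def df_goodness_of_fit(num_categories):
    """Goodness of fit: number of categories minus 1."""
    return num_categories - 1

def df_two_way_table(num_rows, num_cols):
    """Homogeneity or independence: (r - 1)(c - 1), totals excluded."""
    return (num_rows - 1) * (num_cols - 1)

df_lobsters = df_goodness_of_fit(4)   # four genetic alterations
df_toys = df_two_way_table(2, 3)      # 2 groups by 3 toys
df_exercise = df_two_way_table(3, 3)  # 3 exercise levels by 3 education levels
print(df_lobsters, df_toys, df_exercise)
```

Note that the "Totals" row and column are never counted when finding r and c.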
Chi-Squared Test for Goodness of Fit
In one experiment, a scientist observed certain genetic alterations in offspring of lobsters in the Gulf of Maine. She found that 315 had alteration A, 108 had alteration B, 101 had alteration C, and 32 had alteration D. According to her theory, the expected frequencies should follow the ratio 9:3:3:1. Does this sample data lend confirmation to her theory?
Alteration          A        B        C        D       Total
Observed Count     315      108      101       32       556
Expected Count    312.75   104.25   104.25    34.75     556
The Solution Process
This is a Chi-Squared test for goodness of fit. We are interested in whether the observed data found in the lobsters match the theorized distribution of genetic alterations.
Step 1: What are the hypotheses?
Ho: the observed data found in the lobsters match/equal the theorized distribution of genetic alterations
Ha: the observed data found in the lobsters do not match/do not equal the theorized distribution of genetic alterations
Step 2: Check the conditions:
the data must be randomly sampled and independent
the individual expected counts in each cell of the categories must be at least 5
Step 3: Do the calculations. On the calculator: STAT, TESTS, D: χ²GOF-Test
Step 4: State the conclusion.
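The calculator steps for the lobster example can be mirrored in Python as a rough check. This is a sketch using only the standard library; `chi2_sf` is a helper written for these notes (it builds the upper-tail area from the exact df = 1 and df = 2 tails with a standard recurrence), not a calculator or library function. With SciPy installed, `scipy.stats.chisquare` would be a common alternative.

```python
import math

def chi2_sf(x, df):
    """Upper-tail area P(chi-squared >= x) for a positive statistic x."""
    z = x / 2
    if df % 2 == 0:
        area, a = math.exp(-z), 1.0             # exact tail for df = 2
    else:
        area, a = math.erfc(math.sqrt(z)), 0.5  # exact tail for df = 1
    # Each pass raises the degrees of freedom by 2.
    while a < df / 2:
        area += math.exp(a * math.log(z) - z - math.lgamma(a + 1))
        a += 1
    return area

observed = [315, 108, 101, 32]  # alterations A, B, C, D
ratio = [9, 3, 3, 1]            # theorized 9:3:3:1
total = sum(observed)
expected = [total * r / sum(ratio) for r in ratio]

stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
p_value = chi2_sf(stat, df)
print(expected, round(stat, 3), df, round(p_value, 3))
```

The statistic comes out near 0.47 with a p-value near 0.93, so we fail to reject Ho: the sample is consistent with the 9:3:3:1 theory, although a large p-value does not prove the theory.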
You try…
The State University at Center City claims that its students are accepted in the same proportions as the population of the key areas of the state. The university claims that its acceptance rate adheres to the following pattern, consistent with where the population of the state lives:
45% live in the northern section, 25% in the central section, 10% in the southwestern section, 15% live in the southeastern section, and 5% are from outside the state.
A group of students questions whether the State University really follows its stated policy. A researcher takes a random sample of 1000 incoming freshmen and finds that:
487 are from the Northern section, 218 are from the Central section, 89 are from the Southwest section, 147 are from the Southeast section, and 59 are from out of state.
Region            Northern   Central   Southwest   Southeast   Out of state   Total
Observed Count      487        218        89          147           59         1000
Expected Count
Do these data provide evidence at the 5% significance level that the students are correct in doubting the university’s claim?
The Solution Process
This is a Chi-Squared test for goodness of fit. We are interested in whether the distribution of students at State University is consistent with the distribution of the population of the state by geographical region.
Step 1: What are the hypotheses?
Ho: the distribution of students at State University is consistent with (equals) the distribution of the population of the state by geographical region
Ha: the distribution of students at State University is not consistent with (does not equal) the distribution of the population of the state by geographical region
Step 2: Check the conditions:
the data must be randomly sampled and independent
the individual expected counts in each cell of the categories must be at least 5
Step 3: Do the calculations
Step 4: State the conclusion
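One way to check your work on this exercise is the sketch below. It assumes the stated percentages are the null model and uses a hand-written tail-area helper, `chi2_sf`, that is specific to this illustration rather than a calculator or library routine.

```python
import math

def chi2_sf(x, df):
    """Upper-tail area P(chi-squared >= x) for a positive statistic x."""
    z = x / 2
    if df % 2 == 0:
        area, a = math.exp(-z), 1.0             # exact tail for df = 2
    else:
        area, a = math.erfc(math.sqrt(z)), 0.5  # exact tail for df = 1
    while a < df / 2:                           # raise df by 2 per pass
        area += math.exp(a * math.log(z) - z - math.lgamma(a + 1))
        a += 1
    return area

# Northern, Central, Southwest, Southeast, out of state
observed = [487, 218, 89, 147, 59]
percents = [45, 25, 10, 15, 5]
n = sum(observed)
expected = [n * p / 100 for p in percents]

# Step 2: every expected count should be at least 5.
counts_ok = all(e >= 5 for e in expected)

# Step 3: statistic, degrees of freedom, p-value.
stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
p_value = chi2_sf(stat, df)

# Step 4: compare with the 5% significance level.
reject = p_value < 0.05
print(expected, counts_ok, round(stat, 2), df, round(p_value, 3), reject)
```

Here the statistic is about 10.03 with 4 degrees of freedom and a p-value near 0.04, which falls just under the 5% level, so the data give some evidence against the university's claim.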
Chi-Squared Test for Homogeneity
The following table provides the responses of a group of 100 children shown three different toys and asked which one they liked the best. Based on the data, is there evidence of a difference in preference for toys between the boys and girls at a 5% significance level?
Toy A Toy B Toy C Totals
Boys 25 27 11 63
Girls 9 22 6 37
Totals 34 49 17 100
The Solution Process
This is a chi-squared test for homogeneity because we are testing for a difference in toy preference between the two groups, boys and girls.
Ho: toy preference is the same among the groups, boys and girls
Ha: toy preference is different among the groups, boys and girls
The expected frequency for each cell in the table is: $\frac{(\text{row total})(\text{column total})}{\text{grand total}}$
Toy A Toy B Toy C Totals
Boys 25 27 11 63
Girls 9 22 6 37
Totals 34 49 17 100
On the calculator: STAT, TESTS, C: χ²-Test
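The expected-count rule and the test for the toy data can be sketched in Python as follows. The helper names `expected_counts` and `chi2_sf` are made up for this illustration; with SciPy installed, `scipy.stats.chi2_contingency` is a common way to get the same quantities.

```python
import math

def chi2_sf(x, df):
    """Upper-tail area P(chi-squared >= x) for a positive statistic x."""
    z = x / 2
    if df % 2 == 0:
        area, a = math.exp(-z), 1.0             # exact tail for df = 2
    else:
        area, a = math.erfc(math.sqrt(z)), 0.5  # exact tail for df = 1
    while a < df / 2:                           # raise df by 2 per pass
        area += math.exp(a * math.log(z) - z - math.lgamma(a + 1))
        a += 1
    return area

def expected_counts(table):
    """(row total)(column total) / (grand total) for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

# Rows: boys, girls. Columns: Toy A, Toy B, Toy C (totals left out).
observed = [[25, 27, 11],
            [9, 22, 6]]
expected = expected_counts(observed)

stat = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi2_sf(stat, df)
print([[round(e, 2) for e in row] for row in expected])
print(round(stat, 2), df, round(p_value, 3))
```

The statistic is roughly 2.95 with 2 degrees of freedom and a p-value near 0.23, so at the 5% level we retain Ho: these data do not show a difference in toy preference between boys and girls.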
Homogeneity Test – You try…
Medical researchers enlisted 108 subjects for an experiment comparing treatments for depression. The subjects were randomly divided into three groups and given pills to take for a period of three months. Unknown to them, one group received a placebo, the second group the “natural” remedy St. John's wort, and the third group the prescription drug Paxil. After six months, psychologists and physicians (who did not know which treatment each person received) evaluated the subjects to see if their depression had returned. Is there evidence of a difference in the rate of recurrence among the types of treatments?
                         Placebo   St. JW   Paxil   Totals
Depression returned         24       22       14      60
No sign of depression        6        8       16      30
Totals                      30       30       30      90
The Solution Process
This is a chi-squared test for homogeneity because we are testing for a difference in rate of recurrence among the different groups.
Ho: the rate of recurrence is the same for each group
Ha: the rate of recurrence is different for each group
Calculate the expected frequencies.
                         Placebo   St. JW   Paxil   Totals
Depression returned         24       22       14      60
No sign of depression        6        8       16      30
Totals                      30       30       30      90
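To check the expected frequencies and see which cells drive the result, the sketch below also lists each cell's contribution to the statistic. It uses the 90 subjects in the table; `expected_counts` and `chi2_sf` are helper names chosen for this illustration.

```python
import math

def chi2_sf(x, df):
    """Upper-tail area P(chi-squared >= x) for a positive statistic x."""
    z = x / 2
    if df % 2 == 0:
        area, a = math.exp(-z), 1.0             # exact tail for df = 2
    else:
        area, a = math.erfc(math.sqrt(z)), 0.5  # exact tail for df = 1
    while a < df / 2:                           # raise df by 2 per pass
        area += math.exp(a * math.log(z) - z - math.lgamma(a + 1))
        a += 1
    return area

def expected_counts(table):
    """(row total)(column total) / (grand total) for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

# Rows: depression returned, no sign. Columns: placebo, St. JW, Paxil.
observed = [[24, 22, 14],
            [6, 8, 16]]
expected = expected_counts(observed)

# Contribution of each cell: (observed - expected)^2 / expected.
contributions = [[(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
                 for obs_row, exp_row in zip(observed, expected)]
stat = sum(sum(row) for row in contributions)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi2_sf(stat, df)
print(expected)
print(contributions)
print(round(stat, 2), df, round(p_value, 3))
```

The statistic is 8.4 with 2 degrees of freedom and a p-value of about 0.015, so there is evidence at the 5% level that recurrence rates differ. The largest contribution comes from the Paxil group, where more subjects than expected showed no sign of depression.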
Chi-Squared Test for Independence
In a study of exercise habits in men working in the health care profession in Chicago, researchers classified the 356 sampled employees according to the level of education they completed and their exercise habits. The researchers want to ascertain if there is an association between the level of education completed and exercise habits at the 5% significance level.
                         College   Some College   High School   Total
Exercises Regularly         51          22             43        116
Exercises Occasionally      92          21             28        141
Never Exercises             68           9             22         99
Total                      211          52             93        356
The Solution Process
This is a Chi-Squared test for independence because we are looking for an association between education level completed and exercise habits among the men working in the health care profession in Chicago.
Ho: there is no association between education level completed and exercise habits among the men working in the health care profession in Chicago (independence)
Ha: there is an association between education level completed and exercise habits among the men working in the health care profession in Chicago (dependence)
Calculate the expected frequencies.
                         College   Some College   High School   Total
Exercises Regularly         51          22             43        116
Exercises Occasionally      92          21             28        141
Never Exercises             68           9             22         99
Total                      211          52             93        356
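The arithmetic for a test of independence is the same as for homogeneity; only the hypotheses differ. A sketch for the exercise data, again with helper names (`expected_counts`, `chi2_sf`) invented for this illustration:

```python
import math

def chi2_sf(x, df):
    """Upper-tail area P(chi-squared >= x) for a positive statistic x."""
    z = x / 2
    if df % 2 == 0:
        area, a = math.exp(-z), 1.0             # exact tail for df = 2
    else:
        area, a = math.erfc(math.sqrt(z)), 0.5  # exact tail for df = 1
    while a < df / 2:                           # raise df by 2 per pass
        area += math.exp(a * math.log(z) - z - math.lgamma(a + 1))
        a += 1
    return area

def expected_counts(table):
    """(row total)(column total) / (grand total) for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

# Rows: regularly, occasionally, never.
# Columns: college, some college, high school.
observed = [[51, 22, 43],
            [92, 21, 28],
            [68, 9, 22]]
expected = expected_counts(observed)

# Condition check: every expected count should be at least 5.
smallest_expected = min(min(row) for row in expected)

stat = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi2_sf(stat, df)
reject = p_value < 0.05
print([[round(e, 2) for e in row] for row in expected])
print(round(smallest_expected, 2), round(stat, 2), df, round(p_value, 4), reject)
```

The statistic is about 18.5 with 4 degrees of freedom and a p-value near 0.001, so we reject Ho at the 5% level: there is evidence of an association between education level completed and exercise habits in this population.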