Test of Independence
Lecture 43, Section 14.5
Mon, Apr 23, 2007
Independence
Only one sample is taken. For each subject in the sample, two
observations are made (i.e., two variables are measured).
We wish to determine whether there is a relationship between the two variables.
The two variables are independent if there is no relationship between them.
Mendel’s Experiments
In Mendel’s experiments, Mendel observed 75% yellow seeds and 25% green seeds, and 75% smooth seeds and 25% wrinkled seeds.
Because color and texture were independent, he also observed 9/16 yellow and smooth, 3/16 yellow and wrinkled, 3/16 green and smooth, and 1/16 green and wrinkled.
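The sixteenths above follow from multiplying the marginal proportions, which is what independence means for proportions. A minimal sketch, assuming the 3/4 and 1/4 marginals from the slide; the dictionary names are illustrative only:

```python
from fractions import Fraction

# Marginal proportions from the slide (75% / 25% for each trait).
color = {"yellow": Fraction(3, 4), "green": Fraction(1, 4)}
texture = {"smooth": Fraction(3, 4), "wrinkled": Fraction(1, 4)}

# Under independence, each joint proportion is the product of its marginals.
joint = {
    (c, t): p_color * p_texture
    for c, p_color in color.items()
    for t, p_texture in texture.items()
}

for (c, t), p in joint.items():
    print(f"{c} and {t}: {p}")
```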
Mendel’s Experiments
         Smooth  Wrinkled
Yellow      9        3
Green       3        1
That is, he observed the same 3 : 1 ratio within each row and within each column that he observed for the totals.
Mendel’s Experiments
Had the traits not been independent, he might have observed something different.
         Smooth  Wrinkled
Yellow     10        2
Green       2        2
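One way to see the difference between the two Mendel tables: in a table that fits independence exactly, every cell equals (row total)(column total) / (grand total). The sketch below checks that condition in integer arithmetic. The function name is hypothetical, and real data almost never fit exactly, which is why a formal test is needed.

```python
def is_exactly_independent(table):
    """Return True if every cell equals (row total)(column total) / (grand total)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Compare cell * grand_total with row_total * col_total to avoid division.
    return all(
        cell * grand_total == row_totals[i] * col_totals[j]
        for i, row in enumerate(table)
        for j, cell in enumerate(row)
    )

independent_table = [[9, 3], [3, 1]]   # rows: yellow, green; columns: smooth, wrinkled
dependent_table = [[10, 2], [2, 2]]

print(is_exactly_independent(independent_table))
print(is_exactly_independent(dependent_table))
```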
Example
Suppose a university researcher suspects that a student’s SAT-M score is related to the student’s performance in Statistics.
At the end of the semester, the researcher compares each student’s grade to that student’s SAT-M score for all Statistics classes at that university.
The researcher wants to know whether the students with the higher SAT-M scores got the higher grades.
Example
Does there appear to be a difference between the rows?
Or are the rows independent?
                  Grade
SAT-M        A    B    C    D    F
400–500      7    8   16   20   21
500–600     13   28   32   22   13
600–700      8   23   22   10    9
700–800      8   13   14    8    5
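An informal way to compare the rows is to convert each row to percentages, so that rows with different totals can be read side by side. A rough sketch using the counts from the table; the variable names are illustrative:

```python
grades = ["A", "B", "C", "D", "F"]

# Observed counts from the table, keyed by SAT-M band.
observed = {
    "400-500": [7, 8, 16, 20, 21],
    "500-600": [13, 28, 32, 22, 13],
    "600-700": [8, 23, 22, 10, 9],
    "700-800": [8, 13, 14, 8, 5],
}

# Percentage of each grade within each SAT-M band.
row_percentages = {
    band: [100 * count / sum(counts) for count in counts]
    for band, counts in observed.items()
}

for band, percentages in row_percentages.items():
    cells = "  ".join(f"{g}: {p:5.1f}%" for g, p in zip(grades, percentages))
    print(f"{band}  {cells}")
```

The share of F grades falls from about 29% in the lowest band to about 10% in the highest, which suggests the rows differ. The test that follows asks whether a difference of this size could plausibly be due to chance.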
The Test of Independence
The null hypothesis is that the variables are independent.
The alternative hypothesis is that the variables are not independent.
H0: The variables are independent.
H1: The variables are not independent.
Let α = 0.05.
The Test Statistic
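The test statistic is the chi-square statistic, computed as
$$\chi^2 = \sum \frac{(O - E)^2}{E},$$
where O is the observed count and E is the expected count in each cell, and the sum runs over all cells of the table.
The question now is, how do we compute the expected counts?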
Expected Counts
Since the rows should all exhibit the same proportions, the method is the same as before.
$$\text{Expected count} = \frac{(\text{row total})(\text{column total})}{\text{grand total}}$$
Expected Counts
Observed counts, with expected counts in parentheses.
SAT-M      A            B            C            D            F
400–500    7 (8.64)     8 (17.28)    16 (20.16)   20 (14.40)   21 (11.52)
500–600    13 (12.96)   28 (25.92)   32 (30.24)   22 (21.60)   13 (17.28)
600–700    8 (8.64)     23 (17.28)   22 (20.16)   10 (14.40)   9 (11.52)
700–800    8 (5.76)     13 (11.52)   14 (13.44)   8 (9.60)     5 (7.68)
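The expected counts in the table can be reproduced from the row totals, column totals, and grand total. A minimal sketch in plain Python; the function name `expected_counts` is an illustrative choice, not something from the lecture:

```python
# Observed counts: rows are SAT-M bands (400-500 up to 700-800), columns are grades A to F.
observed = [
    [7, 8, 16, 20, 21],
    [13, 28, 32, 22, 13],
    [8, 23, 22, 10, 9],
    [8, 13, 14, 8, 5],
]

def expected_counts(table):
    """Expected count for each cell: (row total)(column total) / (grand total)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
for row in expected:
    print("  ".join(f"{e:6.2f}" for e in row))
```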
The Test Statistic
The value of $\chi^2$ is 23.7603.
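This value can be checked by summing (O − E)² / E over all 20 cells. The sketch below recomputes the expected counts inside the function so that it stands alone; the function name is hypothetical.

```python
# Observed counts: rows are SAT-M bands, columns are grades A to F.
observed = [
    [7, 8, 16, 20, 21],
    [13, 28, 32, 22, 13],
    [8, 23, 22, 10, 9],
    [8, 13, 14, 8, 5],
]

def chi_square_statistic(table):
    """Sum of (O - E)^2 / E over all cells, with E = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            total += (o - e) ** 2 / e
    return total

chi_square = chi_square_statistic(observed)
print(f"{chi_square:.4f}")
```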
Degrees of Freedom
The degrees of freedom are the same as before:
df = (no. of rows – 1)(no. of cols – 1).
In our example, df = (4 – 1)(5 – 1) = 12.
The p-value
To find the p-value, calculate
χ²cdf(23.7603, E99, 12) = 0.0219.
Since 0.0219 < 0.05, the results are significant at the 5% level.
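Without a calculator, the same upper-tail area can be sketched in plain Python. This version relies on a closed form for the chi-square upper tail that holds only when df is even, which covers df = 12 here; it is an assumption-laden shortcut, not a general-purpose replacement for the TI-83 function. The function name is illustrative.

```python
import math

def chi_square_upper_tail(x, df):
    """P(chi-square with df degrees of freedom > x), valid only for even df.

    For df = 2k the upper tail equals exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2
    k = df // 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))

df = (4 - 1) * (5 - 1)          # (rows - 1)(columns - 1)
p_value = chi_square_upper_tail(23.7603, df)
print(df, round(p_value, 4))
```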
TI-83 – Test of Independence
The test for independence on the TI-83 is identical to the test for homogeneity.
Example
Admissions figures for the School of Arts and Sciences.
              Acceptance Status
Sex        Accepted   Not Accepted
Female         50          150
Male          500         1000
Example
Admissions figures for the Business School.
              Acceptance Status
Sex        Accepted   Not Accepted
Female        850         1500
Male          150          200
Example
Admissions figures for the two schools combined.
              Acceptance Status
Sex        Accepted   Not Accepted
Female        900         1650
Male          650         1200
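Practice
This is called Simpson’s paradox. It occurs whenever the aggregate population shows a different relationship than the subpopulations.
Here, males have the higher acceptance rate within each school, yet females have the slightly higher rate when the two schools are combined.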
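The reversal can be checked by computing acceptance rates for each school and for the combined figures. A short sketch using the counts from the three admissions tables; the dictionary layout and function name are illustrative:

```python
# (accepted, not accepted) counts from the admissions tables.
schools = {
    "Arts and Sciences": {"Female": (50, 150), "Male": (500, 1000)},
    "Business": {"Female": (850, 1500), "Male": (150, 200)},
}

def acceptance_rate(accepted, not_accepted):
    """Fraction of applicants who were accepted."""
    return accepted / (accepted + not_accepted)

# Add the two schools together, group by group.
combined = {}
for groups in schools.values():
    for group, (accepted, not_accepted) in groups.items():
        prev_accepted, prev_not_accepted = combined.get(group, (0, 0))
        combined[group] = (prev_accepted + accepted, prev_not_accepted + not_accepted)

tables = dict(schools, Combined=combined)
rates = {
    name: {group: acceptance_rate(*counts) for group, counts in groups.items()}
    for name, groups in tables.items()
}

for name, by_group in rates.items():
    print(name, {group: f"{rate:.1%}" for group, rate in by_group.items()})
```

The combined rates are pulled toward whichever school received most of each group's applications: most female applicants applied to the Business School, which accepts a larger share of applicants, while most male applicants applied to Arts and Sciences.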