Test of Independence Lecture 43 Section 14.5 Mon, Apr 23, 2007

Independence

Only one sample is taken.

For each subject in the sample, two observations are made (i.e., two variables are measured).

We wish to determine whether there is a relationship between the two variables.

The two variables are independent if there is no relationship between them.

Mendel’s Experiments

In his experiments, Mendel observed
    75% yellow seeds, 25% green seeds.
    75% smooth seeds, 25% wrinkled seeds.

Because color and texture were independent, he also observed
    9/16 yellow and smooth
    3/16 yellow and wrinkled
    3/16 green and smooth
    1/16 green and wrinkled
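The 9/16, 3/16, 3/16, 1/16 pattern is what independence predicts: each joint proportion is the product of the two marginal proportions. A minimal sketch of that arithmetic, using exact fractions (the variable names are illustrative only):

```python
from fractions import Fraction

# Marginal proportions reported on the slide.
color = {"yellow": Fraction(3, 4), "green": Fraction(1, 4)}
texture = {"smooth": Fraction(3, 4), "wrinkled": Fraction(1, 4)}

# Under independence, each joint proportion is the product of its marginals.
joint = {(c, t): pc * pt
         for c, pc in color.items()
         for t, pt in texture.items()}

for (c, t), p in joint.items():
    print(f"{c} and {t}: {p}")
```

The four products are 9/16, 3/16, 3/16 and 1/16, and they sum to 1.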

Mendel’s Experiments

            Smooth   Wrinkled
Yellow         9         3
Green          3         1

3 : 1 ratio within each row and within each column.

That is, he observed the same ratios within categories that he observed for the totals.

Mendel’s Experiments

Had the traits not been independent, he might have observed something different.

            Smooth   Wrinkled
Yellow        10         2
Green          2         2
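One way to see the difference between the two tables is to compute the share of smooth seeds within each color. The sketch below does this for the 9, 3, 3, 1 table and for the 10, 2, 2, 2 table; the function name is hypothetical.

```python
from fractions import Fraction

def smooth_share_by_color(table):
    """Return the fraction of smooth seeds within each color row."""
    return {color: Fraction(smooth, smooth + wrinkled)
            for color, (smooth, wrinkled) in table.items()}

# (smooth, wrinkled) counts for each color.
independent = {"yellow": (9, 3), "green": (3, 1)}
not_independent = {"yellow": (10, 2), "green": (2, 2)}

print(smooth_share_by_color(independent))
print(smooth_share_by_color(not_independent))
```

In the first table both colors are 3/4 smooth. In the second, yellow seeds are 5/6 smooth and green seeds only 1/2, so knowing the color changes what we expect for the texture.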

Example

Suppose a university researcher suspects that a student’s SAT-M score is related to the student’s performance in Statistics.

At the end of the semester, the researcher compares each student’s grade with that student’s SAT-M score, for all Statistics classes at that university.

The researcher wants to know whether the students with the higher SAT-M scores got the higher grades.

Example

Does there appear to be a difference between the rows?

Or are the rows independent?

                      Grade
SAT-M         A     B     C     D     F
400–500       7     8    16    20    21
500–600      13    28    32    22    13
600–700       8    23    22    10     9
700–800       8    13    14     8     5
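An informal first look at the question is to convert each row to proportions and compare them. This is only a sketch for eyeballing the table, not the formal test; the helper name is hypothetical.

```python
# Observed counts by SAT-M band; columns are grades A, B, C, D, F.
observed = {
    "400-500": [7, 8, 16, 20, 21],
    "500-600": [13, 28, 32, 22, 13],
    "600-700": [8, 23, 22, 10, 9],
    "700-800": [8, 13, 14, 8, 5],
}
grades = ["A", "B", "C", "D", "F"]

def row_proportions(counts):
    """Divide each count by its row total."""
    total = sum(counts)
    return [c / total for c in counts]

row_props = {band: row_proportions(counts) for band, counts in observed.items()}

for band, props in row_props.items():
    cells = "  ".join(f"{g}: {p:.3f}" for g, p in zip(grades, props))
    print(f"{band}  {cells}")
```

For instance, the share of F grades is 21/72 (about 0.292) in the 400-500 band but 5/48 (about 0.104) in the 700-800 band. Whether differences like this are larger than chance would produce is what the test decides.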

The Test of Independence

The null hypothesis is that the variables are independent.

The alternative hypothesis is that the variables are not independent.

H0: The variables are independent.

H1: The variables are not independent.

Let α = 0.05.

The Test Statistic

The test statistic is the chi-square statistic, computed as

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

The question now is, how do we compute the expected counts?

Expected Counts

Since the rows should all exhibit the same proportions, the method is the same as before.

$$\text{Expected count} = \frac{(\text{row total})(\text{column total})}{\text{grand total}}$$
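As a sketch, the rule can be applied to the SAT-M by grade counts from the earlier example. The function name is hypothetical; the counts are the ones in the lecture's table.

```python
# Observed counts: rows are SAT-M bands, columns are grades A, B, C, D, F.
observed = [
    [7, 8, 16, 20, 21],
    [13, 28, 32, 22, 13],
    [8, 23, 22, 10, 9],
    [8, 13, 14, 8, 5],
]

def expected_counts(table):
    """Expected count = (row total)(column total) / (grand total), cell by cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
for row in expected:
    print("  ".join(f"{e:6.2f}" for e in row))
```

For the top-left cell, the row total is 72, the column total is 36 and the grand total is 300, giving 72 × 36 / 300 = 8.64.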

Expected Counts

Observed counts, with expected counts in parentheses.

SAT-M        A            B            C            D            F
400–500      7 ( 8.64)    8 (17.28)   16 (20.16)   20 (14.40)   21 (11.52)
500–600     13 (12.96)   28 (25.92)   32 (30.24)   22 (21.60)   13 (17.28)
600–700      8 ( 8.64)   23 (17.28)   22 (20.16)   10 (14.40)    9 (11.52)
700–800      8 ( 5.76)   13 (11.52)   14 (13.44)    8 ( 9.60)    5 ( 7.68)

The Test Statistic

The value of χ² is 23.7603.
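A short sketch of how this value can be reproduced: compute each expected count as (row total)(column total) / (grand total), then sum (O − E)² / E over all 20 cells. The variable names are illustrative.

```python
# Observed counts: rows are SAT-M bands, columns are grades A, B, C, D, F.
observed = [
    [7, 8, 16, 20, 21],
    [13, 28, 32, 22, 13],
    [8, 23, 22, 10, 9],
    [8, 13, 14, 8, 5],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (O - E)^2 / E over every cell of the table.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        chi_square += (o - e) ** 2 / e

print(f"chi-square = {chi_square:.4f}")
```

The result should agree with the value on the slide to four decimal places. The largest single contribution comes from the F grades in the 400–500 band (21 observed against 11.52 expected).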

Degrees of Freedom

The degrees of freedom are the same as before:

df = (no. of rows – 1)(no. of cols – 1).

In our example, df = (4 – 1)(5 – 1) = 12.

The p-value

To find the p-value, calculate

χ²cdf(23.7603, E99, 12) = 0.0219.

The results are significant at the 5% level.
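Away from the calculator, the same upper-tail area can be checked with the standard library alone. The sketch below is not the TI-83's method: it relies on the closed form that holds when df is even, P(X > x) = e^(−x/2) Σ (x/2)^k / k! for k = 0, …, df/2 − 1. The function name is hypothetical.

```python
import math

def chi_square_upper_tail(x, df):
    """P(X > x) for a chi-square variable with a positive, even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    term = 1.0   # the k = 0 term, (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k   # builds (x/2)^k / k! from the previous term
        total += term
    return math.exp(-half) * total

rows, cols = 4, 5
df = (rows - 1) * (cols - 1)
p_value = chi_square_upper_tail(23.7603, df)
print(f"df = {df}, p-value = {p_value:.4f}")
```

With df = 12 this gives a p-value of about 0.0219, which is below α = 0.05, so H0 is rejected. For odd df this shortcut does not apply and a statistics library or table would be needed.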

TI-83 – Test of Independence

The test for independence on the TI-83 is identical to the test for homogeneity.

Example

Admissions figures for the School of Arts and Sciences.

               Acceptance Status
Sex         Accepted   Not Accepted
Female          50          150
Male           500         1000

Example

Admissions figures for the Business School.

               Acceptance Status
Sex         Accepted   Not Accepted
Female         850         1500
Male           150          200

Example

Admissions figures for the two schools combined.

               Acceptance Status
Sex         Accepted   Not Accepted
Female         900         1650
Male           650         1200

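Practice

The admissions figures above illustrate Simpson’s paradox: within each school, men are accepted at a higher rate than women, yet in the combined figures the two rates are nearly equal, with women slightly ahead.

The paradox occurs whenever the aggregate population shows a different relationship than the subpopulations.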
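The reversal can be checked directly from the three admissions tables. The sketch below computes the acceptance rate for each group in each school and in the combined data; the names are illustrative.

```python
# (accepted, not accepted) counts from the admissions tables.
arts_sciences = {"Female": (50, 150), "Male": (500, 1000)}
business = {"Female": (850, 1500), "Male": (150, 200)}

def acceptance_rates(table):
    """Accepted / (accepted + not accepted) for each group."""
    return {group: acc / (acc + rej) for group, (acc, rej) in table.items()}

# Combine the two schools by adding the counts cell by cell.
combined = {
    group: (arts_sciences[group][0] + business[group][0],
            arts_sciences[group][1] + business[group][1])
    for group in arts_sciences
}

for name, table in [("Arts and Sciences", arts_sciences),
                    ("Business", business),
                    ("Combined", combined)]:
    rates = acceptance_rates(table)
    print(f"{name}: " + ", ".join(f"{g} {r:.1%}" for g, r in rates.items()))
```

Men have the higher rate in each school (500/1500 against 50/200, and 150/350 against 850/2350), but the combined rates are 900/2550 for women and 650/1850 for men, which puts women slightly ahead. Most women applied to the Business School and most men to Arts and Sciences, so the two groups are weighted differently when the schools are pooled.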