Tuesday, Dec. 2 Chi-square Goodness of Fit Chi-square Test of Independence: Two Variables.

gg    yy

yg    yg    yg    yg

yy    yg    gg    gy
25%   25%   25%   25%

Pea Color    Observed freq    Expected freq
Yellow            158              150
Green              42               50
TOTAL             200              200

Chi Square Goodness of Fit

\[ \chi^2 = \sum_{i=1}^{k} \frac{(f_o - f_e)^2}{f_e} \]

d.f. = k - 1, where k = number of categories in the variable.
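
The goodness-of-fit formula can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name chi_square_gof is made up here, and the counts are the yellow and green pea frequencies from the table.

```python
def chi_square_gof(observed, expected):
    """Return (chi-square statistic, degrees of freedom) for a goodness-of-fit test."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    # Sum (fo - fe)^2 / fe over the k categories.
    statistic = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    df = len(observed) - 1  # d.f. = k - 1
    return statistic, df


# Pea colour counts from the table: yellow, green.
pea_stat, pea_df = chi_square_gof([158, 42], [150, 50])
print(f"chi-square = {pea_stat:.2f}, d.f. = {pea_df}")  # chi-square = 1.71, d.f. = 1
```

The function only computes the statistic and the degrees of freedom; comparing the result with a critical value is a separate step.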

“… the general level of agreement between Mendel’s expectations and his reported results shows that it is closer than would be expected in the best of several thousand repetitions. The data have evidently been sophisticated systematically, and after examining various possibilities, I have no doubt that Mendel was deceived by a gardening assistant, who knew only too well what his principal expected from each trial made…”

-- R. A. Fisher

Pea Color    Observed freq    Expected freq
Yellow            151              150
Green              49               50
TOTAL             200              200
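
As a worked illustration of how close these counts are to expectation (the arithmetic is not on the original slide), the goodness-of-fit statistic for this table with k = 2 categories would be roughly:

```latex
\[
\chi^2 = \frac{(151 - 150)^2}{150} + \frac{(49 - 50)^2}{50}
       = \frac{1}{150} + \frac{1}{50}
       \approx 0.027, \qquad \text{d.f.} = 2 - 1 = 1
\]
```

A value this near zero means the observed counts sit almost exactly on the expected ones. That is the kind of unusually close agreement Fisher's remark is about.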

Peas to Kids: Another Example
Goodness of Fit

At my children’s school science fair last year, where participation was voluntary but strongly encouraged, I counted about 60 boys and 40 girls who had submitted entries. Since I would expect a ratio of 50:50 if there were no gender preference for submission, is this observation deviant, beyond chance level?

             Boys    Girls
Expected:     50      50
Observed:     60      40

\[ \chi^2 = \sum_{i=1}^{k} \frac{(f_o - f_e)^2}{f_e} \]

For each of k categories, square the difference between the observed and the expected frequency, divide by the expected frequency, and sum over all k categories.

\[ \chi^2 = \frac{(60 - 50)^2}{50} + \frac{(40 - 50)^2}{50} = 4.00 \]

This value, chi-square, will be distributed with known probability values, where the degrees of freedom is a function of the number of categories (not n). In this one-variable case, d.f. = k - 1.

The critical value of chi-square at α = .05, d.f. = 1 is 3.84. The obtained value of 4.00 exceeds it, so reject H0.
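
The critical value and the decision can be checked with the Python standard library alone. This is a sketch under one stated assumption: it covers only the d.f. = 1 case, where a chi-square variable is the square of a standard normal variable. The helper names chi2_critical_df1 and chi2_p_value_df1 are illustrative and do not come from the slides.

```python
import math
from statistics import NormalDist


def chi2_critical_df1(alpha):
    """Critical value of chi-square with 1 d.f. (square of the two-sided z cutoff)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z ** 2


def chi2_p_value_df1(statistic):
    """Upper-tail probability of a chi-square statistic with 1 d.f."""
    # P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) when d.f. = 1.
    return math.erfc(math.sqrt(statistic / 2))


obtained = 4.00  # chi-square for the science fair counts
critical = chi2_critical_df1(0.05)
p_value = chi2_p_value_df1(obtained)
print(f"critical value = {critical:.2f}")  # critical value = 3.84
print(f"p-value = {p_value:.4f}")
print("reject H0" if obtained > critical else "fail to reject H0")
```

For more than one degree of freedom this shortcut does not apply, and a chi-square table or a statistics library would be needed instead.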

Chi-square Test of Independence

Are two nominal-level variables related to, or independent of, each other?

Is race related to SES, or are they independent?

                 White   Black   Total
SES   Lo           12       3      15
      Hi           16      16      32
      Total        28      19      47

The expected frequency of any given cell is

\[ f_e = \frac{\text{Row } n \times \text{Column } n}{\text{Total } n} \]
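
A short Python sketch of this rule, applied to the SES by race counts. The function name expected_frequencies and the nested-list layout (rows are SES Lo and Hi, columns are White and Black) are choices made for this illustration, not something the slides specify.

```python
def expected_frequencies(observed):
    """Expected cell counts under independence: row n * column n / total n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


# Observed counts: rows are SES (Lo, Hi), columns are race (White, Black).
ses_by_race = [[12, 3], [16, 16]]
expected = expected_frequencies(ses_by_race)
for label, row in zip(["Lo", "Hi"], expected):
    print(label, [round(fe, 2) for fe in row])
# Lo [8.94, 6.06]
# Hi [19.06, 12.94]
```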

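\[ \chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(f_o - f_e)^2}{f_e} \]

At d.f. = (r - 1)(c - 1), where r is the number of rows and c is the number of columns.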

Expected frequency for each cell (row n x column n / total n):

          White          Black
Lo     (15x28)/47     (15x19)/47
Hi     (32x28)/47     (32x19)/47

Expected frequencies:

          White    Black
Lo         8.94     6.06
Hi        19.06    12.94

Please calculate chi-square from the observed and expected frequencies.
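
One way to check a hand calculation is the sketch below. It assumes the plain chi-square formula with no continuity correction, and the function name chi_square_independence is hypothetical, not taken from the slides.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, d.f.) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / total  # expected frequency
            statistic += (fo - fe) ** 2 / fe
    df = (len(row_totals) - 1) * (len(col_totals) - 1)  # (r - 1)(c - 1)
    return statistic, df


# Rows are SES (Lo, Hi); columns are race (White, Black).
stat, df = chi_square_independence([[12, 3], [16, 16]])
print(f"chi-square = {stat:.2f}, d.f. = {df}")  # chi-square = 3.82, d.f. = 1
```

Under these assumptions the statistic comes out just below the 3.84 critical value for d.f. = 1 at the .05 level. On this calculation the hypothesis of independence would not be rejected, though only narrowly.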

Important assumptions:

Independent observations.

Observations are mutually exclusive.

Expected frequencies should be reasonably large:
d.f. = 1: at least 5
d.f. = 2: > 2
d.f. > 3: all expected frequencies but one are greater than or equal to 5, and the one that is not is at least equal to 1.
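
The rule of thumb for expected frequencies can be written as a small check. This is only a sketch of the rule as worded on the slide: the function name is made up, and treating every d.f. of 3 or more under the last clause is an assumption, since the slide does not say what applies at exactly d.f. = 3.

```python
def expected_counts_large_enough(expected, df):
    """Apply the slide's rule of thumb to a flat list of expected frequencies."""
    if df == 1:
        return all(fe >= 5 for fe in expected)
    if df == 2:
        return all(fe > 2 for fe in expected)
    # Assumed to cover d.f. of 3 or more: at most one cell below 5,
    # and that cell must be at least 1.
    small = [fe for fe in expected if fe < 5]
    return len(small) <= 1 and all(fe >= 1 for fe in small)


# Expected frequencies from the SES by race example (d.f. = 1).
ses_expected = [8.94, 6.06, 19.06, 12.94]
ses_ok = expected_counts_large_enough(ses_expected, df=1)
print(ses_ok)  # True
```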