
The Chi-Square - Middle Tennessee State University | Middle


7/22/14


The Chi-Square: Tests for Goodness of Fit and Independence

The Chi-Square

I.  Introduction
II.  Expected versus Observed Values
III.  Distribution of Χ²
IV.  Interpreting SPSS Printouts of Chi-Square
V.  Reporting the Results of Chi-Square
VI.  Assumptions of Chi-Square

Introduction

Often when we are testing hypotheses, we only have frequency data. Our hypotheses concern the distribution of the frequencies across various categories.

Examples: Are there equal numbers of males and females in a group? Are Republicans more likely than Democrats to be Fundamentalist Christians?


Introduction

With these data we have the number of people of a certain type in each category. These are qualitative, not quantitative, data. The scale of measurement is nominal.

Compare this to age as a variable. Age is a quantitative variable, measured on a ratio scale.

Introduction

If one were to ask whether Republicans are older than Democrats, one could measure the ages of a sample of people in each group, calculate the mean of each sample, and test whether the difference between the sample means is statistically significant (i.e., whether it reflects a difference between the population means).

Introduction

Compare this to the question: “Are Republicans more likely than Democrats to be male?” Our sample would contain a number of males and a number of females in each party. We would not want to calculate a mean gender.


Introduction

Age and Party Affiliation

              Republican   Democrat
Mean age      M = 51.2     M = 47.5

Appropriate statistical test: Independent samples t test.

$$t = \frac{M_1 - M_2}{s_{M_1 - M_2}}$$

$$df = (n_1 - 1) + (n_2 - 1)$$

Introduction

Gender and Party Affiliation

              Males   Females
Republicans   58      42
Democrats     70      80

Appropriate statistical test: Chi-Square

Expected versus Observed Values

With the Chi Square, you test the observed distribution of frequencies across the groups against a hypothetical distribution (the H0, or null hypothesis).

For example, the null hypothesis might be that males and females are equally likely to be Republican or Democrat.


Expected versus Observed Values

For example, in a sample of 100 Republicans, the null hypothesis might be that there would be 50 males and 50 females.

Expected values:

              Males   Females
Republicans   50      50

Expected versus Observed Values

However, if you know that the population is 60 percent female, then the expected values should be as follows:

              Males   Females
Republicans   40      60
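The two sets of expected values above are just the sample size multiplied by the proportions assumed under the null hypothesis. A minimal Python sketch of that step follows; the function name `expected_counts` is an illustrative choice, not something from the slides.

```python
def expected_counts(n, proportions):
    """Expected frequency for each category: sample size times null proportion."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return [n * p for p in proportions]


# 100 Republicans, categories ordered as (males, females).
equal_split = expected_counts(100, [0.5, 0.5])       # null: 50/50
population_split = expected_counts(100, [0.4, 0.6])  # null: 60 percent female

print("Equal split:     ", [round(v, 1) for v in equal_split])
print("Population split:", [round(v, 1) for v in population_split])
```

The choice of proportions is part of the hypothesis, so it has to come from outside the sample (for example, from known population figures).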

Expected versus Observed Values

In any random sample of 100 people, I will not observe exactly 60 females and 40 males, any more than I would get exactly 50 heads in 100 coin tosses.

Chi Square measures the difference between the observed values and the expected values, and compares that difference to what one might expect by chance.

fo = frequency observed
fe = frequency expected

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$


Expected versus Observed Values

Republicans:

              Males   Females
Observed      58      42
Expected      40      60

$$\chi^2 = \frac{(58 - 40)^2}{40} + \frac{(42 - 60)^2}{60} = 8.1 + 5.4 = 13.5$$
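The one-row calculation above can be checked with a few lines of Python. This is a sketch of the formula as given in the slides; the function name `chi_square` is an illustrative choice.

```python
def chi_square(observed, expected):
    """Sum of (fo - fe)^2 / fe over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


observed = [58, 42]   # Republicans: males, females
expected = [40, 60]   # expected under a 60 percent female population

# Per-cell contributions, then the total.
contributions = [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]
print([round(c, 1) for c in contributions])          # [8.1, 5.4]
print(round(chi_square(observed, expected), 1))      # 13.5
```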

Distribution of Χ²

Large values of Χ² are unlikely to be observed by chance alone when the null hypothesis is true.

Distribution of Χ²

The shape of the distribution depends on the degrees of freedom.


Distribution of Χ²

The degrees of freedom are determined by the number of rows and columns in the table.

If there is only one row, df = C - 1.

With more than one row, df = (R - 1)(C - 1), where R = number of rows and C = number of columns.

In our example (one row, two columns), df = 2 - 1 = 1.
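The degrees-of-freedom rule, and the link between a Χ² value and its probability under the null hypothesis, can be sketched in Python. The function names are illustrative. The probability function covers only the df = 1 case, where the upper-tail area of the chi-square distribution equals `erfc(sqrt(x / 2))`; other df values would need a statistics library.

```python
import math


def degrees_of_freedom(rows, cols):
    """df = C - 1 for a single row, otherwise (R - 1)(C - 1)."""
    if rows == 1:
        return cols - 1
    return (rows - 1) * (cols - 1)


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square value with 1 degree of freedom.

    A chi-square variable with df = 1 is a squared standard normal, so
    P(X > x) = erfc(sqrt(x / 2)). This shortcut is valid only for df = 1.
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


print(degrees_of_freedom(1, 2))   # one row, two columns
print(degrees_of_freedom(2, 2))   # 2 x 2 table
print(p_value_df1(13.5) < 0.05)   # True: 13.5 is far into the tail
```

For df = 1, a Χ² of about 3.84 corresponds to a probability of .05, which is why values well above that are treated as unlikely under the null hypothesis.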

Distribution of Χ²

With two dimensions: 2 X 2 Chi-Square

Gender and Party Affiliation (observed values)

              Males   Females   Totals
Republicans   58      42        100
Democrats     70      80        150
Totals        128     122       250

Null hypothesis: gender and party affiliation are independent, so the counts are distributed across the cells in proportion to the row and column totals.


With two dimensions: 2 X 2 Chi-Square

Gender and Party Affiliation (expected values)

              Males                 Females               Totals
Republicans   100*128/250 = 51.2    100*122/250 = 48.8    100
Democrats     150*128/250 = 76.8    150*122/250 = 73.2    150
Totals        128                   122                   250
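Each expected value is (row total × column total) / grand total. A short Python sketch of that rule, applied to the observed 2 X 2 table, follows; the function name `expected_from_totals` is an illustrative choice.

```python
def expected_from_totals(table):
    """Expected cell counts: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Rows: Republicans, Democrats. Columns: males, females.
observed = [[58, 42],
            [70, 80]]

expected = expected_from_totals(observed)
for label, row in zip(["Republicans", "Democrats"], expected):
    print(label, [round(v, 1) for v in row])
```

Note that the expected values keep the same row and column totals as the observed table.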

Use these values to calculate Chi Square:

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

Gender and Party Affiliation: (fo - fe)²/fe for each cell

              Males                       Females
Republicans   (58 - 51.2)²/51.2 = .903    (42 - 48.8)²/48.8 = .948
Democrats     (70 - 76.8)²/76.8 = .602    (80 - 73.2)²/73.2 = .632

Χ² = .903 + .948 + .602 + .632 ≈ 3.084 (the rounded terms sum to 3.085; the unrounded total is 3.084)
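The four cell contributions and their total can be reproduced in Python. This is a sketch under the same formula; the helper names are illustrative, and the expected values are recomputed from the row and column totals rather than typed in.

```python
def expected_from_totals(table):
    """Expected cell counts: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def cell_contributions(observed, expected):
    """(fo - fe)^2 / fe for every cell, in the same layout as the table."""
    return [[(fo - fe) ** 2 / fe for fo, fe in zip(obs_row, exp_row)]
            for obs_row, exp_row in zip(observed, expected)]


# Rows: Republicans, Democrats. Columns: males, females.
observed = [[58, 42],
            [70, 80]]
expected = expected_from_totals(observed)

contributions = cell_contributions(observed, expected)
chi2 = sum(sum(row) for row in contributions)

for row in contributions:
    print([round(c, 3) for c in row])
print(round(chi2, 3))   # 3.084
```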

Interpreting SPSS printouts of Chi-Square

Data Structure:

Participant   Party   Gender
1             1       2
2             2       1
3             2       1
4             1       1
5             1       2
6             1       2
7             2       1
8             2       2
9             1       1
10            1       2
11            2       1
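Data in this one-row-per-participant layout has to be tallied into cells before a Chi-Square can be computed. The Python sketch below counts the eleven rows shown on the slide. The slide does not say which code stands for which party or gender, so the codes are left as plain integers, and the function name `crosstab` is an illustrative choice.

```python
from collections import Counter

# (participant, party, gender) rows exactly as listed on the slide.
# The meaning of codes 1 and 2 is not given in the slides.
rows = [
    (1, 1, 2), (2, 2, 1), (3, 2, 1), (4, 1, 1), (5, 1, 2), (6, 1, 2),
    (7, 2, 1), (8, 2, 2), (9, 1, 1), (10, 1, 2), (11, 2, 1),
]


def crosstab(records):
    """Count how many records fall into each (party, gender) cell."""
    return Counter((party, gender) for _participant, party, gender in records)


counts = crosstab(rows)

print("party  gender=1  gender=2")
for party in (1, 2):
    print(party, "    ", counts[(party, 1)], "       ", counts[(party, 2)])
```

These eleven rows are only the part of the data file shown on the slide, so the counts will not match the full table of 250 cases.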


Case Processing Summary

                                    Cases
                 Valid              Missing            Total
                 N      Percent     N      Percent     N      Percent
Party * Gender   250    100.0%      0      .0%         250    100.0%

Party * Gender Crosstabulation

                                        Gender
                                        male     female    Total
Party   Republican   Count              58       42        100
                     Expected Count     51.2     48.8      100.0
        Democrat     Count              70       80        150
                     Expected Count     76.8     73.2      150.0
Total                Count              128      122       250
                     Expected Count     128.0    122.0     250.0

Interpreting SPSS printouts of Chi-Square

Chi-Square Tests

                               Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                              (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             3.084a    1    .079
Continuity Correction          2.648     1    .104
Likelihood Ratio               3.094     1    .079
Fisher's Exact Test                                         .093         .052
Linear-by-Linear Association   3.072     1    .080
N of Valid Cases               250

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 48.80.
b. Computed only for a 2x2 table
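Three rows of this printout can be checked by hand from the observed and expected counts. The Python sketch below uses standard textbook formulas (Pearson, Yates' continuity correction, and the likelihood ratio); it is not SPSS's own code, and the p-value shortcut applies only to df = 1. All function and variable names are illustrative.

```python
import math

# Rows: Republicans, Democrats. Columns: males, females.
observed = [58, 42, 70, 80]
expected = [51.2, 48.8, 76.8, 73.2]
pairs = list(zip(observed, expected))

# Pearson chi-square: sum of (fo - fe)^2 / fe.
pearson = sum((fo - fe) ** 2 / fe for fo, fe in pairs)

# Yates' continuity correction: shrink each |fo - fe| by 0.5 before squaring.
continuity = sum((abs(fo - fe) - 0.5) ** 2 / fe for fo, fe in pairs)

# Likelihood ratio: 2 * sum of fo * ln(fo / fe).
likelihood_ratio = 2 * sum(fo * math.log(fo / fe) for fo, fe in pairs)


def p_value_df1(chi2):
    """Upper-tail probability for df = 1 only: erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(chi2 / 2.0))


for name, value in [("Pearson Chi-Square", pearson),
                    ("Continuity Correction", continuity),
                    ("Likelihood Ratio", likelihood_ratio)]:
    print(f"{name:<22} {value:.3f}  p = {p_value_df1(value):.3f}")
```

Fisher's Exact Test and the Linear-by-Linear Association are computed differently and are not reproduced here.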

Interpreting SPSS printouts of Chi-Square

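Compare the Asymp. Sig. (2-sided) value for the Pearson Chi-Square (.079) to alpha (.05). Because .079 is greater than .05, the result is not statistically significant.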

Reporting the Results

“A Chi Square test was performed to determine if males and females were distributed differently across the political parties. The test failed to indicate a significant difference, Χ²(1) = 3.08, p = .079 (an alpha level of .05 was adopted for this and all subsequent statistical tests).”
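The statistical part of that sentence follows a fixed pattern: the statistic, its degrees of freedom, the value to two decimals, and the p value without a leading zero. A small Python sketch of a formatter for that pattern is below; the function name and the wording of the verdict are illustrative assumptions, not a required reporting standard.

```python
def format_chi_square(chi2, df, p, alpha=0.05):
    """Return (report string, verdict) for a chi-square result."""
    p_text = f"{p:.3f}".lstrip("0")           # 0.079 -> ".079"
    report = f"Χ²({df}) = {chi2:.2f}, p = {p_text}"
    verdict = "significant" if p < alpha else "not significant"
    return report, verdict


report, verdict = format_chi_square(3.084, 1, 0.079)
print(report)    # Χ²(1) = 3.08, p = .079
print(verdict)   # not significant
```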


Assumptions of Chi-Square

1.  Independence of Observations: Each person contributes one score.

2.  Size of Expected Frequencies: Fewer than 20% of the cells should have expected frequencies less than 5.
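The second assumption can be checked mechanically once the expected frequencies are known. The Python sketch below applies the 20% rule as stated above; the function name, the returned dictionary keys, and the second example table are illustrative. The first assumption (independence of observations) depends on how the data were collected and cannot be checked from the table.

```python
def check_expected_frequencies(expected, threshold=5.0, max_fraction=0.20):
    """Report how many cells have an expected frequency below the threshold."""
    values = [v for row in expected for v in row]
    cells_below = sum(1 for v in values if v < threshold)
    fraction_below = cells_below / len(values)
    return {
        "min_expected": min(values),
        "cells_below": cells_below,
        "fraction_below": fraction_below,
        "ok": fraction_below < max_fraction,   # fewer than 20% of cells
    }


# Expected values from the gender-by-party example.
result = check_expected_frequencies([[51.2, 48.8], [76.8, 73.2]])
print(result)

# A hypothetical table with small expected counts, for contrast.
sparse = check_expected_frequencies([[3.0, 12.0], [4.0, 81.0]])
print(sparse)
```

For the gender-by-party example this agrees with footnote a of the SPSS output: no cells below 5, with a minimum expected count of 48.8.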