
Chi square hand out (1)

Chi-Square Test

What is chi-square testing?

o Identifies significant differences among the observed frequencies and the expected frequencies of a particular group

o Attempts to identify whether any differences between the expected and observed frequencies are due to chance or to some other factor that is affecting them.

o There are actually many types of Chi-square tests, but the most common one is the Pearson Chi-square Test.

Terms and Definitions

o Data - 2 types
   a. Numerical data - comes in the form of numbers. (ex. 1, 2, 3, 4)
   b. Categorical data - comes in the form of divisions. (ex. Yes or No)

o Expected Frequencies
   - values for parameters that are hypothesized to occur
   - can be determined through:
      1) Hypothesizing that the frequencies are equal for each category.
      2) Hypothesizing the values on the basis of some prior knowledge.
      3) A mathematical method (see Step 4 of the example)

Two applications of Pearson Chi-Square Test

1) Chi-square test for Independence
   - This tests whether the “category” from which the data come affects the data.
   - May also be thought of as testing whether the categories in the experiment “prefer” certain kinds of data.
   Example: Is there a difference in the car choices of males and females?

2) Chi-square test for goodness-of-fit
   - This tests whether the observed data “fit” the expected data.
   Example: Do the car sales this year match the car sales last year? (i.e. Did we still sell around 50 blue cars? 25 red cars?)
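The goodness-of-fit idea can be sketched in a few lines of Python. The expected counts (50 blue, 25 red) come from the car example above; the observed counts are hypothetical, invented only for illustration. The statistic used is the Pearson chi-square, the sum of (O − E)² / E over the categories, with degrees of freedom equal to the number of categories minus one.

```python
# Expected counts: last year's sales, taken from the handout's example.
expected = {"blue": 50, "red": 25}
# Observed counts: HYPOTHETICAL numbers, made up for this illustration only.
observed = {"blue": 44, "red": 31}


def goodness_of_fit(observed, expected):
    """Return the Pearson chi-square statistic, the sum of (O - E)^2 / E."""
    return sum((observed[k] - expected[k]) ** 2 / expected[k] for k in expected)


chi_square = goodness_of_fit(observed, expected)
df = len(expected) - 1  # number of categories minus one

print(f"chi-square = {chi_square:.2f}, df = {df}")
```

The observed and expected totals must match (75 cars in both cases here); otherwise the expected counts should first be rescaled to the observed total.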

Requirements of the Chi-squared Test

1. The variables to be compared are nominal (categorical), and the values analyzed are counts.
2. There should be two or more categories in the setup.
3. The observations should be independent of each other.
4. An adequate sample size. (At least 10)
5. Most of the time, it is the frequency of the observations that is used.

Example

A student wants to see whether the food preferences of males and females differ. Specifically, he wants to know whether males and females differ in their preference for cooked or raw foods. A survey was conducted with the following results:

Twelve males preferred Cooked foods.
Eight males preferred Raw foods.
Five females preferred Cooked foods.
Five females preferred Raw foods.

Step 1: State the null hypothesis and the alternative hypothesis.

Ho: There is no significant difference between the food preferences of males and females.
Or: Food preference is independent of gender.

Ha: There is a significant difference between the food preferences of males and females.
Or: Food preference is affected by gender.

Step 2: State the level of significance (α).

α = 0.05

0.05 is the level of significance for most scientific experiments.

Step 3: Set up a contingency table:

The contingency table summarizes the data. The categories on the rows are the “preferences” that you are checking. The categories on the columns are the “populations” whose preferences are being checked. Row totals and column totals are always included as well.

Preference        Male    Female    Total (Row)
Cooked              12         5             17
Raw                  8         5             13
Total (Column)      20        10             30
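The row, column, and grand totals can be computed rather than tallied by hand. A minimal Python sketch, assuming the survey counts are stored in a nested dictionary (the variable names are illustrative, not part of the handout):

```python
# Observed survey counts: outer keys are preferences, inner keys are genders.
observed = {
    "Cooked": {"Male": 12, "Female": 5},
    "Raw": {"Male": 8, "Female": 5},
}

# Row totals: one total per preference.
row_totals = {pref: sum(counts.values()) for pref, counts in observed.items()}

# Column totals: one total per gender.
column_totals = {}
for counts in observed.values():
    for gender, n in counts.items():
        column_totals[gender] = column_totals.get(gender, 0) + n

# The grand total is the number of people surveyed.
grand_total = sum(row_totals.values())

print(row_totals)     # {'Cooked': 17, 'Raw': 13}
print(column_totals)  # {'Male': 20, 'Female': 10}
print(grand_total)    # 30
```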

Step 4: Compute for the expected frequencies.

The chi-square test for independence usually uses the third method of getting expected frequencies.

Expected Frequency = (Row Total)(Column Total) / (Grand Total)

This expected frequency is computed for EACH cell.

Preference        Male                   Female                Total (Row)
Cooked            (20)(17)/30 = 11.33    (10)(17)/30 = 5.67    17
Raw               (13)(20)/30 = 8.67     (13)(10)/30 = 4.33    13
Total (Column)    20                     10                    30
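The same per-cell calculation, (row total)(column total) / (grand total), can be sketched in Python. The totals are taken from the contingency table in Step 3; the dictionary layout is only one possible choice.

```python
# Totals from the contingency table in Step 3.
row_totals = {"Cooked": 17, "Raw": 13}
column_totals = {"Male": 20, "Female": 10}
grand_total = 30

# Expected frequency for each cell: (row total)(column total) / (grand total).
expected = {
    pref: {gender: r * c / grand_total for gender, c in column_totals.items()}
    for pref, r in row_totals.items()
}

for pref, cells in expected.items():
    for gender, e in cells.items():
        print(f"{pref:6s} {gender:6s} {e:.2f}")
```

Each row of expected frequencies adds up to its row total, and each column to its column total, which is a quick check that no cell was miscalculated.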

The fundamental formula for the Chi-squared test is:

χ² = Σ (O − E)² / E

Where
O is the observed frequency,
E is the expected frequency,
and χ² is the chi-square value, with the sum taken over all cells.

Step 5: Rearrange the table to show the observed and expected frequencies on the columns, and the subcategories on the rows.

Preference        Observed    Expected    Chi-square
Cooked Males            12       11.33        0.0396
Cooked Females           5        5.67        0.0792
Raw Males                8        8.67        0.0518
Raw Females              5        4.33        0.1037
Total                                         0.2743
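The chi-square column can be reproduced with a short Python sketch. Each entry is (O − E)² / E for that cell. The handout's figures use expected frequencies rounded to two decimals and per-cell values rounded to four; as a comparison, the sketch also computes the total from unrounded expected frequencies, which comes out slightly smaller (about 0.2715). The difference does not change the conclusion.

```python
# Cell order: Cooked Males, Cooked Females, Raw Males, Raw Females.
observed = [12, 5, 8, 5]
expected_rounded = [11.33, 5.67, 8.67, 4.33]  # as printed in the handout
expected_exact = [17 * 20 / 30, 17 * 10 / 30, 13 * 20 / 30, 13 * 10 / 30]


def cell_terms(observed, expected):
    """Return (O - E)^2 / E for each cell."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


# Reproduce the handout: round each cell to 4 decimals, then add them up.
handout_terms = [round(t, 4) for t in cell_terms(observed, expected_rounded)]
handout_total = round(sum(handout_terms), 4)

# Same statistic without intermediate rounding.
exact_total = sum(cell_terms(observed, expected_exact))

print(handout_terms, handout_total)
print(round(exact_total, 4))
```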

 

Step 6: Determine the degrees of freedom

The degrees of freedom is: df = (Rows – 1)(Columns – 1)

df = (2 – 1)(2 – 1) = 1

Step 7: Check the tabular Chi-squared value with your df and level of significance.

Checking the table, we see that the tabular chi-squared value for df = 1, and α = 0.05 is 3.841.

Since our calculated chi-square (0.2743) is less than this, we fail to reject (informally, "accept") the null hypothesis. Hence, the data show no significant evidence that food preference depends on gender.

If it were greater, we would reject the null hypothesis.
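Steps 6 and 7 can also be checked without a printed table. The sketch below uses only the Python standard library and relies on a shortcut that holds for df = 1 only: a chi-square variable with one degree of freedom is the square of a standard normal variable, so the critical value is the squared two-sided normal quantile. For other degrees of freedom a chi-square table or a statistics package would be needed. The p-value shown is an addition for illustration and is not part of the handout's procedure.

```python
from math import erfc, sqrt
from statistics import NormalDist

alpha = 0.05
rows, columns = 2, 2
df = (rows - 1) * (columns - 1)
chi_square = 0.2743  # calculated value from Step 5

# Shortcut valid ONLY for one degree of freedom (see the note above).
assert df == 1, "this shortcut is valid only for one degree of freedom"
critical = NormalDist().inv_cdf(1 - alpha / 2) ** 2

# Probability of a chi-square value at least this large when Ho is true.
p_value = erfc(sqrt(chi_square / 2))

reject_null = chi_square > critical
print(f"df = {df}, critical value = {critical:.3f}")
print(f"p-value = {p_value:.3f}")
print("reject Ho" if reject_null else "fail to reject Ho")
```

The computed critical value agrees with the tabular value of 3.841, and the decision is the same as the one reached from the table.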