Inference for Categorical Variables Probability & Statistics L. Weinstein May 2014



Page 1: Inference for Categorical Variables

Inference for Categorical Variables

Probability & Statistics
L. Weinstein

May 2014

Page 2: Inference for Categorical Variables

Testing a Claim with Categorical Data

• Three tests:

1. Goodness of Fit Test
Does the distribution of the categorical variable fit an expected model?

2. Test for Homogeneity of Populations
Does each population have the same distribution for this variable?

3. Test for Association / Independence
Are two categorical variables associated?

Page 3: Inference for Categorical Variables

Goodness of Fit Test

State:
Is the distribution of <your variable here> different from the expected distribution of <be specific here>?

H0: The distribution is the same as expected for all categories.

Ha: The distribution is different from expected for at least one category.

Test at significance level <choose a level>.

Page 4: Inference for Categorical Variables

Goodness of Fit Test

Plan:
Use a Goodness of Fit test.
Conditions:
• Sample is randomly selected from the population
• All expected counts are at least 5
• Sample observations are independent; that is, if sampling without replacement, the sample size is not more than 10% of the population size
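The two numeric conditions above can be checked before running the test. The following is a minimal sketch in Python, assuming the expected model is given as a list of proportions; the function name `check_conditions` and the example figures are illustrative and do not come from the slides.

```python
def check_conditions(expected_props, n, population_size=None):
    """Check the expected-count and 10% conditions for a chi-square test.

    expected_props: hypothesized proportion for each category (should sum to 1)
    n: sample size
    population_size: population size, if sampling without replacement
    """
    expected_counts = [p * n for p in expected_props]
    # Every expected count must be at least 5.
    counts_ok = all(e >= 5 for e in expected_counts)
    # The 10% condition only applies when sampling without replacement.
    ten_percent_ok = population_size is None or n <= 0.10 * population_size
    return expected_counts, counts_ok, ten_percent_ok


# Hypothetical example: 4 categories, 60 observations, population of 1,000.
expected, counts_ok, ten_percent_ok = check_conditions(
    [0.4, 0.3, 0.2, 0.1], n=60, population_size=1000
)
print(expected, counts_ok, ten_percent_ok)
```

Random selection cannot be checked numerically; it has to be argued from how the data were collected.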

Page 5: Inference for Categorical Variables

Goodness of Fit Test

To conduct the test in Minitab, summarize the data by category and put the observed counts in one column. If equal counts are expected, this is enough. If anything other than equal counts is expected, also make a column of expected counts. Then run Stat>Tables>Chi-Square Goodness of Fit Test in Minitab.

Page 6: Inference for Categorical Variables

Goodness of Fit Test

Enter the column names for Observed Counts, Category names, and Proportions specified by historical counts (this is your expected counts list):

Page 7: Inference for Categorical Variables

Goodness of Fit Test

Do:
<Include Minitab results of chi-square test here>
<Indicate the value of the test statistic, χ², and the P-value of the test.>
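The statistic and P-value reported by Minitab can be cross-checked by hand. The sketch below, in plain Python, computes the chi-square statistic as the sum of (observed - expected)² / expected over the categories, with degrees of freedom equal to the number of categories minus 1. The helper `chi2_sf` is a hand-written upper-tail probability for the chi-square distribution, and the counts are hypothetical; none of these names or numbers come from the slides.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Start from a closed form (df = 2 or df = 1), then apply the recurrence
    # Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / gamma(a + 1) until a = df / 2.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


def goodness_of_fit(observed, expected):
    """Return (chi-square statistic, degrees of freedom, P-value)."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df, chi2_sf(stat, df)


# Hypothetical counts for 4 categories, with equal counts expected.
observed = [18, 22, 30, 10]
expected = [20, 20, 20, 20]
stat, df, p_value = goodness_of_fit(observed, expected)
print(f"chi-square = {stat:.3f}, df = {df}, P-value = {p_value:.4f}")
```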

Page 8: Inference for Categorical Variables

Goodness of Fit Test

Conclude:
<Compare your P-value to your significance level. Based on this comparison, either reject or fail to reject the null hypothesis. Conclude, or do NOT conclude, the alternative hypothesis in words.>
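The comparison in this step is mechanical and can be written as a short rule. This is a sketch only: the function name `conclude` is hypothetical, and it assumes the common convention of rejecting when the P-value is strictly less than the significance level.

```python
def conclude(p_value, alpha):
    """Compare the P-value to the significance level and state the decision."""
    if p_value < alpha:
        return "Reject the null hypothesis; the data support the alternative hypothesis."
    return "Fail to reject the null hypothesis; do not conclude the alternative hypothesis."


# Hypothetical values: a P-value of 0.016 tested at the 0.05 level.
print(conclude(0.016, 0.05))
```

The returned sentence is only the decision; the written conclusion should still restate the alternative hypothesis in the context of the variable being studied.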

Page 9: Inference for Categorical Variables

Test for Homogeneity

State:
Is the distribution of <your variable here> different for the populations <be specific here>?

H0: The distribution is the same for all populations.

Ha: The distribution is different for at least one population.

Test at significance level <choose a level>.

Page 10: Inference for Categorical Variables

Test for Homogeneity

Plan:
Use a Test for Homogeneity.
Conditions:
• Samples are randomly selected from each population
• All expected counts are at least 5
• Sample observations are independent; that is, if sampling without replacement, each sample size is not more than 10% of that population size

Page 11: Inference for Categorical Variables

Test for Homogeneity

To conduct the test in Minitab, make one column for each population, holding the summarized counts of the variable by category. Then run Stat>Tables>Chi-Square Test (2-way table) in Minitab.

Page 12: Inference for Categorical Variables

Test for Homogeneity

Enter the column names for each population:

Page 13: Inference for Categorical Variables

Test for Homogeneity

Do:
<Include Minitab results of chi-square test here>
<Indicate the value of the test statistic, χ², and the P-value of the test.>
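For a two-way table, each expected count is (row total × column total) / grand total, and the degrees of freedom are (rows - 1) × (columns - 1). The sketch below shows that arithmetic in plain Python so the Minitab output can be sanity-checked; the function name `two_way_chi_square` and the counts are hypothetical, not taken from the slides.

```python
def two_way_chi_square(table):
    """Return (expected counts, chi-square statistic, degrees of freedom).

    table: list of rows of observed counts, one column per population.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count = (row total * column total) / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(col_totals) - 1)
    return expected, stat, df


# Hypothetical counts: 3 response categories (rows) for 2 populations (columns).
observed = [
    [30, 20],
    [25, 25],
    [45, 55],
]
expected, stat, df = two_way_chi_square(observed)
print(expected, round(stat, 3), df)
```

The expected counts returned here are also the ones to inspect for the "at least 5" condition.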

Page 14: Inference for Categorical Variables

Test for Homogeneity

Conclude:
<Compare your P-value to your significance level. Based on this comparison, either reject or fail to reject the null hypothesis. Conclude, or do NOT conclude, the alternative hypothesis in words.>

Page 15: Inference for Categorical Variables

Test for Independence

State:
Is there an association between <categorical variable one> and <categorical variable two>?

H0: There is no association between the variables (they are independent).

Ha: There is an association between the variables (they are NOT independent).

Test at significance level <choose a level>.

Page 16: Inference for Categorical Variables

Test for Independence

Plan:
Use a Test for Independence / Association.
Conditions:
• Sample is randomly selected from the population
• All expected counts are at least 5
• Sample observations are independent; that is, if sampling without replacement, the sample size is not more than 10% of the population size

Page 17: Inference for Categorical Variables

Test for Independence

To conduct the test in Minitab, make a two-way table summarizing the observed counts for each category of the two variables. Then run Stat>Tables>Chi-Square Test (2-way table) in Minitab.
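If the data start as one record per individual, the two-way table has to be tallied first. The following is a small sketch of that tally in Python, assuming each record is a pair of category labels; the function name `two_way_table` and the example responses are made up for illustration, and the sample is far too small to meet the expected-count condition.

```python
from collections import Counter


def two_way_table(pairs):
    """Tally (variable one, variable two) pairs into a table of observed counts."""
    counts = Counter(pairs)
    row_labels = sorted({a for a, _ in pairs})
    col_labels = sorted({b for _, b in pairs})
    # One row per category of variable one, one column per category of variable two.
    table = [[counts[(a, b)] for b in col_labels] for a in row_labels]
    return row_labels, col_labels, table


# Hypothetical raw data: one (grade level, preferred snack) pair per respondent.
responses = [
    ("junior", "fruit"), ("junior", "chips"), ("senior", "chips"),
    ("senior", "chips"), ("junior", "fruit"), ("senior", "fruit"),
]
rows, cols, table = two_way_table(responses)
print(rows, cols, table)
```

Each column of the resulting table can then be typed into its own Minitab column.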

Page 18: Inference for Categorical Variables

Test for Independence

Enter the column names that contain the summarized data:

Page 19: Inference for Categorical Variables

Test for Independence

Do:
<Include Minitab results of chi-square test here>
<Indicate the value of the test statistic, χ², and the P-value of the test.>

Page 20: Inference for Categorical Variables

Test for Independence

Conclude:
<Compare your P-value to your significance level. Based on this comparison, either reject or fail to reject the null hypothesis. Conclude, or do NOT conclude, the alternative hypothesis in words.>