Inference for Categorical Variables

Probability & Statistics

L. Weinstein

May 2014

Testing a Claim with Categorical Data

Three tests:

1. Goodness of Fit Test: Does the distribution of the categorical variable fit an expected model?

2. Test for Homogeneity of Populations: Does each population have the same distribution for this variable?

3. Test for Association / Independence: Are two categorical variables associated?

Goodness of Fit Test

State: Is the distribution of <your variable here> different from the expected distribution of <be specific here>?

H0: The distribution is the same as expected for all categories.

Ha: The distribution is different from expected for at least one category.

Test at significance level <choose a level>.

Goodness of Fit Test

Plan: Use a Goodness of Fit test.

Conditions:

• Sample is randomly selected from the population
• All expected counts are at least 5
• Sample observations are independent; that is, if sampling without replacement, the sample size is not more than 10% of the population size
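The two conditions that can be checked numerically, the expected-count rule and the 10% rule, could be sketched in Python as follows. This is an illustration rather than part of the Minitab workflow: the function names, the observed counts, and the population size are all hypothetical, and random selection still has to be justified from how the data were collected.

```python
def expected_counts(observed, proportions=None):
    """Expected counts under H0; equal proportions are assumed if none are given."""
    n = sum(observed)
    k = len(observed)
    if proportions is None:
        return [n / k] * k
    return [n * p for p in proportions]


def check_conditions(observed, expected, population_size=None):
    """Check the expected-count rule and, if a population size is known, the 10% rule."""
    n = sum(observed)
    return {
        "all_expected_at_least_5": all(e >= 5 for e in expected),
        # Without a known population size the 10% rule cannot be checked here.
        "sample_at_most_10_percent": (
            None if population_size is None else n <= 0.10 * population_size
        ),
    }


# Hypothetical data: 100 observations in 5 categories, equal counts expected.
observed = [18, 22, 20, 25, 15]
expected = expected_counts(observed)
conditions = check_conditions(observed, expected, population_size=5000)
print(expected)
print(conditions)
```

If either entry comes back False, the chi-square P-value may not be trustworthy for that data set.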

Goodness of Fit Test

To conduct the test in Minitab, summarize the data by category and put the counts in one column. If equal counts are expected, this is enough. If anything other than equal counts is expected, also make a column of expected counts. Then run Stat>Tables>Chi-Square Goodness of Fit Test in Minitab.

Goodness of Fit Test

Enter the column names for Observed Counts, Category names, and Proportions specified by historical counts (this is your expected counts list).

Goodness of Fit Test

Do: <Include Minitab results of chi-square test here> <Indicate the value of the test statistic, χ², and the P-value of the test.>
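For checking the Minitab output by hand, the statistic is χ² = Σ (observed − expected)² / expected, with degrees of freedom equal to the number of categories minus 1. The Python sketch below computes it with the standard library only. The helper chi2_sf is a hypothetical name for a small series approximation of the upper-tail chi-square probability, and the counts are made-up illustration data, not results from the course.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, x/2).
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= half_x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def goodness_of_fit(observed, expected):
    """Return the chi-square statistic, degrees of freedom, and P-value."""
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return statistic, df, chi2_sf(statistic, df)


# Hypothetical data: 100 observations in 5 categories, equal counts expected.
observed = [18, 22, 20, 25, 15]
expected = [20, 20, 20, 20, 20]
statistic, df, p_value = goodness_of_fit(observed, expected)
print(f"chi-square = {statistic:.3f}, df = {df}, P-value = {p_value:.4f}")
```

The series in chi2_sf is adequate for the moderate statistic values typical of classroom data; for reported results, the Minitab output should be treated as the reference.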

Goodness of Fit Test

Conclude: <Compare your P-value to your significance level. Based on this comparison, either reject or fail to reject the null hypothesis. Conclude, or do NOT conclude, the alternative hypothesis in words.>
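The comparison in the Conclude step can be written as a one-line rule. The sketch below assumes the common convention of rejecting when the P-value is strictly less than the significance level; the function name and the wording of the returned sentences are hypothetical, and the conclusion still has to be restated in the context of your own variable.

```python
def conclude(p_value, alpha):
    """Compare the P-value to the chosen significance level and state the decision."""
    if p_value < alpha:
        return "Reject H0: the data give convincing evidence for Ha."
    return "Fail to reject H0: the data do not give convincing evidence for Ha."


# Hypothetical P-values, tested at a significance level of 0.05.
print(conclude(0.012, alpha=0.05))
print(conclude(0.575, alpha=0.05))
```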

Test for Homogeneity

State: Is the distribution of <your variable here> different for the populations <be specific here>?

H0: The distribution is the same for all populations.

Ha: The distribution is different for at least one population.

Test at significance level <choose a level>.

Test for Homogeneity

Plan: Use a Test for Homogeneity.

Conditions:

• Samples are randomly selected from each population
• All expected counts are at least 5
• Sample observations are independent; that is, if sampling without replacement, each sample size is not more than 10% of that population size

Test for Homogeneity

To conduct the test in Minitab, make one column of summarized counts of the variable for each population. Then run Stat>Tables>Chi-Square Test (2-way table) in Minitab.

Test for Homogeneity

Enter the column names for each population.

Test for Homogeneity

Do: <Include Minitab results of chi-square test here> <Indicate the value of the test statistic, χ², and the P-value of the test.>
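For a two-way table, each expected count is (row total × column total) / grand total, and the degrees of freedom are (rows − 1) × (columns − 1). The Python sketch below works through that calculation with the standard library only and applies no continuity correction. The table holds made-up counts with one column per population, mirroring the Minitab layout, and chi2_sf is a hypothetical helper name for a series approximation of the upper-tail chi-square probability.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, x/2).
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= half_x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def chi_square_two_way(table):
    """Return statistic, degrees of freedom, P-value, and expected counts for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, chi2_sf(statistic, df), expected


# Hypothetical counts: rows are categories, columns are two populations.
table = [
    [30, 20],
    [20, 30],
]
statistic, df, p_value, expected = chi_square_two_way(table)
print(f"chi-square = {statistic:.3f}, df = {df}, P-value = {p_value:.4f}")
print("all expected counts at least 5:", all(e >= 5 for row in expected for e in row))
```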

Test for Homogeneity

Conclude: <Compare your P-value to your significance level. Based on this comparison, either reject or fail to reject the null hypothesis. Conclude, or do NOT conclude, the alternative hypothesis in words.>

Test for Independence

State: Is there an association between <categorical variable one> and <categorical variable two>?

H0: There is no association between the variables (they are independent).

Ha: There is an association between the variables (they are NOT independent).

Test at significance level <choose a level>.

Test for Independence

Plan: Use a Test for Independence / Association.

Conditions:

• Sample is randomly selected from the population
• All expected counts are at least 5
• Sample observations are independent; that is, if sampling without replacement, the sample size is not more than 10% of the population size

Test for Independence

To conduct the test in Minitab, make a two-way table summarizing the observed counts for each category of the two variables. Then run Stat>Tables>Chi-Square Test (2-way table) in Minitab.
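If the data start as one record per individual rather than as summarized counts, the two-way table has to be tallied first. A minimal Python sketch of that tally is given here; the function name, the variable labels, and the eight records are all hypothetical illustration data.

```python
from collections import Counter


def two_way_table(pairs):
    """Tally (variable one, variable two) records into a table of observed counts."""
    counts = Counter(pairs)
    row_levels = sorted({first for first, _ in pairs})
    col_levels = sorted({second for _, second in pairs})
    # A Counter returns 0 for combinations that never occur, so empty cells are filled in.
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table


# Hypothetical raw data: (handedness, plays an instrument) for eight people.
records = [
    ("left", "yes"), ("left", "no"), ("right", "yes"), ("right", "yes"),
    ("right", "no"), ("right", "no"), ("left", "yes"), ("right", "no"),
]
row_levels, col_levels, table = two_way_table(records)
print("columns:", col_levels)
for level, row in zip(row_levels, table):
    print(level, row)
```

Each column of the resulting table can then be entered as one summarized-data column for the Chi-Square Test (2-way table) command. A table this small would not meet the expected-count condition; it is only meant to show the tallying step.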

Test for Independence

Enter the column names that contain the summarized data.

Test for Independence

Do: <Include Minitab results of chi-square test here> <Indicate the value of the test statistic, χ², and the P-value of the test.>

Test for Independence

Conclude: <Compare your P-value to your significance level. Based on this comparison, either reject or fail to reject the null hypothesis. Conclude, or do NOT conclude, the alternative hypothesis in words.>
