The Chi Squared Procedure MA 217 - Stephen Sawin Fairfield University August 8, 2017

Source: faculty.fairfield.edu/ssawin/217/lecturenotes217/lect27...


The Chi Squared Procedure: Introduction

The Chi Squared Procedure is named after the χ²-distribution. χ is the Greek letter written chi, pronounced “kai,” which is the ancestor of our x. It is also called the Chi Square Procedure.

The Chi Squared Procedure tests whether two categorical variables (not necessarily binary) are associated rather than independent. Alternately, it tests whether the proportions of the various values of a categorical (not necessarily binary) variable differ among two or more populations. As such it generalizes the Two Sample Proportion Procedure, which does the same thing for binary variables and exactly two populations.


The Chi Squared Procedure: Initial Example

A group project asked for evidence that what brand of designer clothes you like affects what reality TV show you like among F.U. students. The population was Fairfield U. students, the explanatory variable was brand, and the response variable was TV show (both categorical, not binary). They stopped 50 students going into the library on Wednesday evening and asked these two questions. This is a convenience sample, favoring studious students who go to the library on Wednesday evening and students more like the questioners (unconscious bias). If you can argue either of these groups is more likely to prefer one particular brand or one particular show, you have identified sampling bias. Their results are:

Brand           KUWTK   Jersey Shore   Teen Mom   Total
Louis Vuitton      13              8          6      27
Ed Hardy            1              2          1       4
A&F                 4             10          5      19
Total              18             20         12      50


Chi Squared Procedure: Review Association

Recall that variables are independent if knowing the value of one gives no information on the likelihood of the other, and associated otherwise. For categorical variables, check this with conditional proportions: the proportion of individuals with each value of the explanatory variable who have a given value of the response variable.

Brand           KUWTK   Jersey Shore   Teen Mom   Total
Louis Vuitton      13              8          6      27
Ed Hardy            1              2          1       4
A&F                 4             10          5      19
Total              18             20         12      50

Conditional Proportions

Brand           KUWTK           Jersey Shore   Teen Mom   Total
Louis Vuitton   13/27 = 48.1%   29.6%          22.2%      100%
Ed Hardy        1/4 = 25%       50%            25%        100%
A&F             4/19 = 21.1%    52.6%          26.3%      100%

48.1% of Louis Vuitton wearers watch Keeping Up with the Kardashians.

25% of Ed Hardy wearers watch Keeping Up with the Kardashians.

21.1% of Abercrombie and Fitch wearers watch Keeping Up with the Kardashians.

So the chance of watching KUWTK differs depending on what you wear. The variables are related (in the sample).

If the first column of conditional proportions were equal, knowing what you wear would tell you nothing about the chances of watching Kim.

If the conditional proportions in each column are equal, the variables are independent.
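The conditional proportions for this example can be recomputed directly from the counts. The following is a minimal Python sketch, not part of the original notes; the names `shows`, `counts`, and `conditional_proportions` are illustrative choices.

```python
# Observed counts from the survey table (totals left out).
shows = ["KUWTK", "Jersey Shore", "Teen Mom"]
counts = {
    "Louis Vuitton": [13, 8, 6],
    "Ed Hardy": [1, 2, 1],
    "A&F": [4, 10, 5],
}


def conditional_proportions(row):
    """Return each count in the row as a fraction of the row total."""
    total = sum(row)
    return [count / total for count in row]


# One list of proportions per brand (the explanatory variable).
proportions = {brand: conditional_proportions(row) for brand, row in counts.items()}

for brand, props in proportions.items():
    cells = ", ".join(f"{show}: {p:.1%}" for show, p in zip(shows, props))
    print(f"{brand}: {cells}")
```

Comparing the printed values down a single show, such as KUWTK, is the check for association described here: if they matched for every show, the variables would be independent in the sample.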


Chi Squared Procedure

Brand           KUWTK   Jersey Shore   Teen Mom   Total
Louis Vuitton      13              8          6      27
Ed Hardy            1              2          1       4
A&F                 4             10          5      19
Total              18             20         12      50

These two variables are associated in the sample. But are they associated in the population? If one more Ed Hardy wearer liked KUWTK (2 of 4, or 50%), the difference with Louis Vuitton (48.1%) would essentially disappear. Chi Squared tells you whether apparent relationships in the data are explainable by random variation or probably represent real relationships at the population level. The p-value gives the chance you’d see results at least as far from independence as the ones you got if the variables were independent. If it is small, the results are strong evidence that the variables are not independent.


Chi Squared Procedure: Process

• Enter the table of counts (not percentages) into the “data” tab of the Chi Squared Procedure template. Do not enter the totals, and rename the row and column labels if they are numerals. Delete excess rows or columns.

• Check that the number of rows and columns, the number of observations, and the row/column totals given at the top of the “expected” tab are correct.

• Read off the p-value from the top of the “calc” tab. Since there is no choice in H0 and HA, there is no need to set anything.

• Conclude this data [is/is not] significant evidence at the [α] significance level that [EXPLANATORY VARIABLE] and [RESPONSE VARIABLE] are related in [POPULATION]. Or conclude this data [is/is not] significant evidence at the [α] significance level that the proportions of [VARIABLE] are different among the populations [POPULATION 1, POPULATION 2, etc.].


Chi Squared Procedure: example again

Test at the 5% significance level whether the following sample is evidence of a relationship between favorite designer and favorite reality TV show among F.U. students.

Brand           KUWTK   Jersey Shore   Teen Mom   Total
Louis Vuitton      13              8          6      27
Ed Hardy            1              2          1       4
A&F                 4             10          5      19
Total              18             20         12      50

Enter the table (minus totals!) into the “data” tab of the template. Read off the p-value from the “calc” tab:

p-val = 0.395.

Since this is more than the significance level, this data is not significant evidence at the 5% level that favorite designer and favorite reality TV show are related in F.U. students.
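As a check on the p-value read off the template, here is a minimal Python sketch of the same calculation. It assumes the template uses the standard Pearson chi-squared statistic (the sum over all cells of (observed − expected)² / expected), which the notes do not state explicitly. It also uses a closed-form tail probability that is valid only when the degrees of freedom are even, as they are here. The function name `chi_squared_test` is hypothetical.

```python
import math

# Observed counts, totals left out (rows: brands, columns: shows).
observed = [
    [13, 8, 6],   # Louis Vuitton
    [1, 2, 1],    # Ed Hardy
    [4, 10, 5],   # A&F
]


def chi_squared_test(table):
    """Return (statistic, degrees of freedom, p-value) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Assumed Pearson statistic: sum of (observed - expected)^2 / expected,
    # with expected = row total * column total / n under independence.
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    if df % 2 != 0:
        raise ValueError("this sketch only handles an even number of degrees of freedom")

    # Upper tail of the chi-squared distribution for even df:
    # exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
    half = stat / 2
    p_value = math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))
    return stat, df, p_value


stat, df, p_value = chi_squared_test(observed)
alpha = 0.05
verdict = "is" if p_value < alpha else "is not"
print(f"chi-squared = {stat:.3f}, df = {df}, p-value = {p_value:.3f}")
print(f"This data {verdict} significant evidence at the {alpha:.0%} level.")
```

Under these assumptions the sketch reproduces the p-value of 0.395 from the template, so the conclusion at the 5% level is unchanged.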


Chi Squared Procedure: under the hood

• Chi Squared associates to your actual data a table of expected data. It is a table with the same row and column totals as your data but independent, so the conditional proportions along each column are equal. The expected table is at the bottom of the “expected” tab. It is what you would expect to get from the sample if H0 were true.

• Chi Squared measures how far the actual data is from being independent, which is how far it is from the expected data, by combining the differences into one number called the chi-squared statistic. The chi squared stat is found above the p-value on the “calc” page.

• If H0 is true, the chi squared stat approximately follows a chi squared distribution with (#rows − 1)(#columns − 1) degrees of freedom. Degrees of freedom is found on the “calc” page above the chi squared value. The p-value comes from this distribution.
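The three steps above can be sketched in a few lines of Python for the designer and reality TV example. This is a minimal sketch, not the course template: it assumes the template uses the standard formulas (expected count = row total × column total / n, and statistic = sum of (observed − expected)² / expected), and the function names are made up for illustration. The p-value helper uses a closed form that only works for an even number of degrees of freedom, which happens to cover this 3 × 3 table.

```python
import math

# Observed counts from the example table (totals left out).
# Rows: Louis Vuitton, Ed Hardy, A&F; columns: KUWTK, Jersey Shore, Teen Mom.
observed = [
    [13, 8, 6],
    [1, 2, 1],
    [4, 10, 5],
]

def expected_table(obs):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_squared_stat(obs, exp):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(obs, exp)
        for o, e in zip(obs_row, exp_row)
    )

def chi_squared_p_value(stat, df):
    """Upper-tail probability of a chi squared distribution.

    Uses a closed form that holds only for even df; this is a
    simplification for the sketch, not a general-purpose routine.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = stat / 2
    return math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )

expected = expected_table(observed)
stat = chi_squared_stat(observed, expected)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi_squared_p_value(stat, df)

print(f"chi squared stat = {stat:.3f}, df = {df}, p-val = {p_value:.3f}")
```

With the example table this gives 4 degrees of freedom and a p-value that rounds to 0.395, matching the value read off the “calc” tab.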

Chi Squared Procedure: Assumptions

1. SRS- The sample is an SRS, or if there are several separate samples, they are each an SRS of their respective populations and are independent.

2. Large Pop- The population is at least 20 times the sample size, or each population is at least 20 times its respective sample size if there are separate samples.

3. Rule of 5- At least 80% of the expected cells must have at least 5 in them. This percentage is worked out for you at the bottom of the “use” tab.

Chi Squared Procedure: example assumptions

Test at the 5% significance level whether the following sample is evidence that there is a relationship between favorite designer and favorite reality TV show among F.U. students.

Brand          KUWTK  Jersey Shore  Teen Mom  Total
Louis Vuitton     13             8         6     27
Ed Hardy           1             2         1      4
A&F                4            10         5     19
Total             18            20        12     50

1. SRS- The sample was a convenience sample. Not Met.

2. Large Pop- n = 50, so we need at least 20 × 50 = 1000 F.U. students. Met.

3. Rule of 5- The “use” tab says that only 55.6% of expected cells have at least 5. Not Met.
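The two numerical checks above (Large Pop and Rule of 5) can be reproduced by hand. The following is a rough sketch in Python, assuming the “use” tab computes expected cells as row total × column total / n; the function name `percent_expected_at_least` is hypothetical and not part of the template. The SRS assumption is about how the data was collected, so no calculation can check it.

```python
# Observed counts from the example table (totals left out).
observed = [
    [13, 8, 6],   # Louis Vuitton
    [1, 2, 1],    # Ed Hardy
    [4, 10, 5],   # A&F
]

def percent_expected_at_least(obs, threshold=5):
    """Percent of expected cells (row total * column total / n) >= threshold."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    cells = [r * c / n for r in row_totals for c in col_totals]
    return 100 * sum(e >= threshold for e in cells) / len(cells)

# Large Pop: the population must be at least 20 times the sample size.
n = sum(sum(row) for row in observed)
min_population = 20 * n

# Rule of 5: at least 80% of expected cells must be 5 or more.
rule_of_5_percent = percent_expected_at_least(observed)
rule_of_5_met = rule_of_5_percent >= 80

print(f"need at least {min_population} students")
print(f"{rule_of_5_percent:.1f}% of expected cells are at least 5; met: {rule_of_5_met}")
```

Here 5 of the 9 expected cells are at least 5, which is where the 55.6% comes from. The small Ed Hardy row is what pulls the expected counts below 5.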

Chi Squared Procedure: another example

I surveyed my class about their gender and party affiliation and put the data on my website under Gender Partisan. Use this data to test the claim at the 1% level that gender and party affiliation are related.

H0 : gender and party affiliation are independent.
HA : gender and party affiliation are related.

p-val = 0.878

This data is not significant evidence at the 1% level that gender and party affiliation are related in F.U. students.

Assumptions:

1. SRS- Not met. Convenience sample.

2. Large Pop- Met. More than 20 × 68 = 1360 students at Fairfield.

3. Rule of 5- Met. 83.3% are 5 or more.

Key Points

After watching this lecture you should be able to

• state the null (the two variables are independent) and alternate (the two variables are related) hypotheses for a Chi Squared Procedure hypothesis test.

• enter a table of data into the Chi Squared Procedure template correctly, get the p-value, and state your conclusion in an English sentence.

• use and understand the table of expected values and relate it to the actual data.

• check the assumptions.