14
Slide 17-1 2/10/2012 Chapter 17 Chi-Squared Analysis: Testing for Patterns in Qualitative Data

Statistics Chi Squared Analysis

Embed Size (px)

Citation preview

Page 1: Statistics Chi Squared Analysis

8/3/2019 Statistics Chi Squared Analysis

http://slidepdf.com/reader/full/statistics-chi-squared-analysis 1/14

Slide

17-1

2/10/2012

Chapter 17

Chi-Squared Analysis: Testing

for Patterns in Qualitative Data

Page 2: Statistics Chi Squared Analysis

8/3/2019 Statistics Chi Squared Analysis

http://slidepdf.com/reader/full/statistics-chi-squared-analysis 2/14

Slide

17-2

2/10/2012

Chi-Squared Tests

Hypothesis Tests for Qualitative Data

 ± Categories instead of numbers

 ± Based on counts

the number of sampled items falling into each category

 ± Chi-squared statistic

Measures the difference between Actual counts and Expected 

counts (as expected under the null hypothesis  H 0)

where the sum extends over all categories or combinations of 

categories

 ±  Signifi cant if the chi-squared statistic is large enough

§

!

!

i

ii

 E 

 E O22

CountExpected

)CountExpectedCountObserved( of Sumstatisticsquared-Chi

Page 3: Statistics Chi Squared Analysis

8/3/2019 Statistics Chi Squared Analysis

http://slidepdf.com/reader/full/statistics-chi-squared-analysis 3/14

Slide

17-3

2/10/2012

Summarizing Qualitative Data

Use Counts and Percentages, for Example:

 ±  How many (of people sampled) prefer your product?

 ± What percentage of sophisticated product users saidthey would like to purchase an upgrade?

 ± What percentage of products produced today are both

designed for export

and imperfect

Page 4: Statistics Chi Squared Analysis

8/3/2019 Statistics Chi Squared Analysis

http://slidepdf.com/reader/full/statistics-chi-squared-analysis 4/14

Slide

17-4

2/10/2012

Testing Population Percentages

Testing if Population Percentages are Equal to

Known Reference Values

 ± The chi-squared test for equality of percentages

 ± Could a table of observed counts have reasonably come

from a population with known percentages (the

reference values)?

 ±  The data: A table indicating the count for each

category for a single qualitative variable

 ±  The hy pot heses: H 0: The population percentages are equal to a set of known,

fixed reference values

H 1: The population percentages are not equal to a set of 

known, fixed reference values

Page 5: Statistics Chi Squared Analysis

8/3/2019 Statistics Chi Squared Analysis

http://slidepdf.com/reader/full/statistics-chi-squared-analysis 5/14

Slide

17-5

2/10/2012

Testing Percentages (continued) ±  The expected counts:

For each category, multiply the population reference

 proportion by the sample size, n

 ±  The assumpt i on s:

1.Data set is a random sample from the population of interest2.At least 5 counts are expected in each category

 ±  The chi- squared stat i  st i c:

 ±  The de g rees o f   f reedom: Number of categories minus 1

 ±  The test result :  Signifi cant if the chi-squared statistic is

larger than the critical value from the table

§

!

i

ii

 E 

 E O22

CountExpected

)CountExpectedCountObserved( of Sum

Page 6: Statistics Chi Squared Analysis

8/3/2019 Statistics Chi Squared Analysis

http://slidepdf.com/reader/full/statistics-chi-squared-analysis 6/14

Slide

17-6

2/10/2012

Independence

Two Qualitative Variables are  I ndependent if:

 ± knowledge about the value (i.e., category) of one

variable does not help you predict the other variable

For Example: Background Sales Information

 ± The two variables are

where the customer lives, and

the customer¶s favorite product

 ± Independence would say that

customers tend to have the same pattern (distribution) of 

favorite products, regardless of where they live, and

where customers live is not associated with which product is

their favorite

Page 7: Statistics Chi Squared Analysis

8/3/2019 Statistics Chi Squared Analysis

http://slidepdf.com/reader/full/statistics-chi-squared-analysis 7/14

Slide

17-7

2/10/2012

Testing for Association

The chi-squared test for independence

 ±  The data: A table indicating the counts for each

combination of categories for two qualitative variables

 ±  The hy pot heses:

 H 0: The two variables are independent of one another 

 H 1: The two variables are associ ated ; they are not independent 

 ±  The expected table:

Tells you what the counts would have been, on average, if the

variables were independent and there were no randomness

n

¹¹ º

 ¸

©©ª

¨

¹¹ º

 ¸

©©ª

¨

!

ableother varifor 

categoryfor Count

variableonefor 

categoryfor Count

countExpected

Page 8: Statistics Chi Squared Analysis

8/3/2019 Statistics Chi Squared Analysis

http://slidepdf.com/reader/full/statistics-chi-squared-analysis 8/14

Page 9: Statistics Chi Squared Analysis

8/3/2019 Statistics Chi Squared Analysis

http://slidepdf.com/reader/full/statistics-chi-squared-analysis 9/14

Slide

17-9

2/10/2012

Example: Market Segmentation

Data: Rowing Machine Purchases

 ± Which model rowing machine was purchased?

Basic, Designer, or Complete

 ± Which type of customer purchased it?

Practical or Impulsive

BasicDesigner 

Complete

Total

Practical

2213

54

89

Impulsive

2588

19

132

Total

47101

73

221

Observed Counts

Tbl 17.3.1

Page 10: Statistics Chi Squared Analysis

8/3/2019 Statistics Chi Squared Analysis

http://slidepdf.com/reader/full/statistics-chi-squared-analysis 10/14

Slide

17-10

2/10/2012

Example (continued)

Overall Percentages

 ± Divide each count by the overall total, 221

e.g., 10.0% = 22/221 were practical customers who purchased

 basic machines

 Note: impulsive customers bought most designer machines,while practical customers bought most complete machines

BasicDesigner 

Complete

Total

Practical

10.0%5.9%

24.4%

40.3%

Impulsive

11.3%39.8%

8.6%

59.7%

Total

21.3%45.7%

33.0%

100.0%

Overall Percentages

Tbl 17.3.2

Page 11: Statistics Chi Squared Analysis

8/3/2019 Statistics Chi Squared Analysis

http://slidepdf.com/reader/full/statistics-chi-squared-analysis 11/14

Slide

17-11

2/10/2012

Example (continued)

Percentages by Model

 ± Divide each count by the total for its model type

e.g., 22/47 = 46.8% of the 47 basic machines were purchased

 by practical customers

 Note: Practical customers make up 40.3% of all customers, butrepresent 74.0% of complete-machine purchasers

BasicDesigner 

Complete

Total

Practical

46.8%12.9%

74.0%

40.3%

Impulsive

53.2%87.1%

26.0%

59.7%

Total

100.0%100.0%

100.0%

100.0%

Percentages by Model

Tbl 17.3.3

Page 12: Statistics Chi Squared Analysis

8/3/2019 Statistics Chi Squared Analysis

http://slidepdf.com/reader/full/statistics-chi-squared-analysis 12/14

Slide

17-12

2/10/2012

Example (continued)

Percentages by Customer Type

 ± Divide each count by the total for its customer type

e.g., 22/89 = 24.7% of the 89 practical customers purchased

 basic machines

 Note: Of all machines, 33.0% are complete; looking only at practical-customer purchases,60.7% are complete

BasicDesigner 

Complete

Total

Practical

24.7%14.6%

60.7%

100.0%

Impulsive

18.9%66.7%

14.4%

100.0%

Total

21.3%45.7%

33.0%

100.0%

Percentages by Customer Type

Tbl 17.3.4

Page 13: Statistics Chi Squared Analysis

8/3/2019 Statistics Chi Squared Analysis

http://slidepdf.com/reader/full/statistics-chi-squared-analysis 13/14

Slide

17-13

2/10/2012

Example (continued)

Expected Counts

 ±  Multiply row total by column total, then divide by the

overall total of 221

e.g., if there were no association between type of customer and

type of machine, we would have expected to find 89v47/221 =18.93 basic machines purchased by practical customers

BasicDesigner 

Complete

Total

Practical

18.9340.67

29.40

89.00

Impulsive

28.0760.33

43.60

132.00

Total

47.00101.00

73.00

221.00

Expected Counts

Tbl 17.3.5

Page 14: Statistics Chi Squared Analysis

8/3/2019 Statistics Chi Squared Analysis

http://slidepdf.com/reader/full/statistics-chi-squared-analysis 14/14

Slide

17-14

2/10/2012

Example (continued)

Chi-Squared Statistic (do not use the t ot al row or column)

66.8

Degrees of freedom

(Rows ± 1

)(Column ± 1

) = (3

  ± 1

)(2

  ± 1

) =2

Result

 ± If there were no association, such a large chi-squared

value would be highly unlikely

The associ at i on between customer t  y pe and model 

 purchased i  s ver  y hig hl  y signifi cant (p < 0.001)