19
Section 11.2-1 Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series by Mario F. Triola

Section 11.2-1 Copyright 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Embed Size (px)

DESCRIPTION

Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Key Concept In this section, we consider sample data consisting of observed frequency counts arranged in a single row or column (called a one-way frequency table). We will use a hypothesis test for the claim that the observed frequency counts agree with some claimed distribution, so that there is a good fit of the observed data with the claimed distribution.

Citation preview

Page 1: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-1Copyright © 2014, 2012, 2010 Pearson Education, Inc.

Lecture Slides

Elementary Statistics Twelfth Edition

and the Triola Statistics Series

by Mario F. Triola

Page 2: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-2Copyright © 2014, 2012, 2010 Pearson Education, Inc.

Chapter 11Goodness-of-Fit and Contingency Tables

11-1 Review and Preview

11-2 Goodness-of-Fit11-3 Contingency Tables

Page 3: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-3Copyright © 2014, 2012, 2010 Pearson Education, Inc.

Key Concept

In this section, we consider sample data consisting of observed frequency counts arranged in a single row or column (called a one-way frequency table).

We will use a hypothesis test for the claim that the observed frequency counts agree with some claimed distribution, so that there is a good fit of the observed data with the claimed distribution.

Page 4: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-4Copyright © 2014, 2012, 2010 Pearson Education, Inc.

DefinitionA goodness-of-fit test is used to test the hypothesis that an observed frequency distribution fits (or conforms to) some claimed distribution.

Page 5: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-5Copyright © 2014, 2012, 2010 Pearson Education, Inc.

Notation

O represents the observed frequency of an outcome, found from the sample data.

E represents the expected frequency of an outcome, found by assuming that the distribution is as claimed.

k represents the number of different categories or cells.

n represents the total number of trials.

Goodness-of-Fit Test

Page 6: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-6Copyright © 2014, 2012, 2010 Pearson Education, Inc.

Goodness-of-Fit Test

1. The data have been randomly selected.2. The sample data consist of frequency counts for each

of the different categories.3. For each category, the expected frequency is at least 5.

(The expected frequency for a category is the frequency that would occur if the data actually have the distribution that is being claimed. There is no requirement that the observed frequency for each category must be at least 5.)

Requirements

Page 7: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-7Copyright © 2014, 2012, 2010 Pearson Education, Inc.

Goodness-of-Fit

Hypotheses and Test Statistic

22 ( )O Ex

E

0

1

: The frequency counts agree with the claimed distribution.: The frequency counts do not agree with the claimed distribution.

HH

Page 8: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-8Copyright © 2014, 2012, 2010 Pearson Education, Inc.

P-Values and Critical ValuesP-ValuesP-values are typically provided by technology, or a range of P-values can be found from Table A-4.

Critical Values1. Found in Table A-4 using k – 1 degrees of freedom, where k = number of categories.2. Goodness-of-fit hypothesis tests are always right-tailed.

Page 9: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-9Copyright © 2014, 2012, 2010 Pearson Education, Inc.

Finding Expected Frequencies

If all expected frequencies are assumed equal:

If all expected frequencies are assumed not equal:

nEk

for each individual categoryE np

Page 10: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-10Copyright © 2014, 2012, 2010 Pearson Education, Inc.

A close agreement between observed and expected values will lead to a small value of χ2 and a large P-value.

A large disagreement between observed and expected values will lead to a large value of χ2 and a small P-value.

A significantly large value of χ2 will cause a rejection of the null hypothesis of no difference between the observed and the expected.

Goodness-of-Fit Test

Page 11: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-11Copyright © 2014, 2012, 2010 Pearson Education, Inc.

Goodness-Of-Fit Tests

Page 12: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-12Copyright © 2014, 2012, 2010 Pearson Education, Inc...

ExampleA random sample of 100 weights of Californians is obtained, and the last digit of those weights are summarized on the next slide.

When obtaining weights, it is extremely important to actually measure the weights rather than ask people to self-report them.

By analyzing the last digit, we can verify the weights were actually measured since reported weights tend to be rounded to something ending with a 0 or a 5.

Test the claim that the sample is from a population of weights in which the last digits do not occur with the same frequency.

Page 13: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-13Copyright © 2014, 2012, 2010 Pearson Education, Inc...

Example - Continued

Page 14: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-14Copyright © 2014, 2012, 2010 Pearson Education, Inc...

Example - ContinuedRequirement Check:

1.The data come from randomly selected subjects.

2.The data do consist of counts.

3.With 100 sample values and 10 categories that are claimed to be equally likely, each expected frequency is 10, which is greater than 5.

All requirements are met to proceed.

Page 15: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-15Copyright © 2014, 2012, 2010 Pearson Education, Inc...

Example - ContinuedStep 1: The original claim is that the digits do not occur with the same frequency. That is:

Step 2: If the original claim is false, then all the probabilities are the same:

0 1 2 3 4 5 6 7 8 9p p p p p p p p p p

0 1 9at least one of the probabilities , , , is different from the others

p p p

Page 16: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-16Copyright © 2014, 2012, 2010 Pearson Education, Inc...

Example - ContinuedStep 3: The hypotheses can be written as:

Step 4: No significance level was specified, so we select α = 0.05.

Step 5: We use the goodness-of-fit test with a χ2 distribution.

0 0 1 2 3 4 5 6 7 8 9

1

:: At least one of the probabilities is different.

H p p p p p p p p p pH

Page 17: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-17Copyright © 2014, 2012, 2010 Pearson Education, Inc...

Example - ContinuedStep 6: The calculation of the test statistic is given:

Page 18: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-18Copyright © 2014, 2012, 2010 Pearson Education, Inc...

Example - ContinuedStep 6: The test statistic is χ2 = 212.800 and the critical value is χ2 = 16.919 (Table A-4). The P-value was found to be less than 0.0001 using technology.

Page 19: Section 11.2-1 Copyright  2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series

Section 11.2-19Copyright © 2014, 2012, 2010 Pearson Education, Inc...

Example - ContinuedStep 7: Reject the null hypothesis, since the P-value is small and the test statistic is in the critical region.

Step 8: We conclude there is sufficient evidence to support the claim that the last digits do not occur with the same relative frequency.

In other words, we have evidence that the weights were self-reported by the subjects, and the subjects were not actually weighed.