shoshana-klein
Analysis of Variance
Chapter 12
Chapter Goals
12 - 2
Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.
When you have completed this chapter, you will be able to:
1. Discuss the general idea of analysis of variance.
2. List the characteristics of the F distribution.
3. Conduct a test of hypothesis to determine whether the variances of two populations are equal.
4. Organize data into a one-way and a two-way ANOVA table.
12 - 3
5. Define the terms treatments and blocks.
6. Conduct a test of hypothesis to determine whether three or more treatment means are equal.
7. Develop multiple tests for difference between each pair of treatment means.
12 - 4
Characteristics of the F Distribution
• There is a “family” of F distributions. Each member of the family is determined by two parameters: the numerator degrees of freedom and the denominator degrees of freedom.
• F cannot be negative, and it is a continuous distribution.
• The F distribution is positively skewed.
• Its values range from 0 to ∞. As F → ∞, the curve approaches the X-axis.
12 - 5
Test for Equal Variances
For the two-tailed test, the test statistic is given by:
$$F = \frac{s_1^2}{s_2^2}$$
where $s_1^2$ and $s_2^2$ are the sample variances for the two samples.
The null hypothesis is rejected if the computed value of the test statistic is greater than the critical value.
12 - 6
Colin, a stockbroker at Critical Securities, reported that the mean rate of return on a sample of 10 internet stocks was 12.6 percent with a standard deviation of 3.9 percent.
The mean rate of return on a sample of 8 utility stocks was 10.9 percent with a standard deviation of 3.5 percent.
At the .05 significance level, can Colin conclude that there is more variation in the internet stocks?
12 - 7
Hypothesis Testing
Step 1: State the null and alternate hypotheses.
Step 2: Select the level of significance.
Step 3: Identify the test statistic.
Step 4: State the decision rule.
Step 5: Compute the value of the test statistic and make a decision: either do not reject H0, or reject H0 and accept H1.
12 - 8
Hypothesis Test
Step 1: State the null and alternate hypotheses.
$$H_0: \sigma_I^2 \le \sigma_U^2$$
$$H_1: \sigma_I^2 > \sigma_U^2$$
Step 2: Select the level of significance.
α = 0.05
Step 3: Identify the test statistic.
The test statistic follows the F distribution.
Step 4: State the decision rule.
Reject H0 if F > 3.68. The df are 9 in the numerator and 7 in the denominator.
Step 5: Compute the test statistic and make a decision.
$$F = \frac{s_1^2}{s_2^2} = \frac{(3.9)^2}{(3.5)^2} = 1.2416$$
Do not reject the null hypothesis; there is insufficient evidence to show more variation in the internet stocks.
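The arithmetic in Step 5 can be checked with a short sketch. The function and variable names below are illustrative, not part of the original slides, and the critical value 3.68 is simply copied from the decision rule above rather than computed.

```python
def variance_ratio(sd_numerator: float, sd_denominator: float) -> float:
    """Return F = s1^2 / s2^2 from two sample standard deviations."""
    return sd_numerator ** 2 / sd_denominator ** 2


# Values taken from the stock example above.
n_internet, sd_internet = 10, 3.9
n_utility, sd_utility = 8, 3.5

f_stat = variance_ratio(sd_internet, sd_utility)
df_num = n_internet - 1   # numerator degrees of freedom
df_den = n_utility - 1    # denominator degrees of freedom

f_critical = 3.68         # from the decision rule (alpha = .05, df 9 and 7)
reject_h0 = f_stat > f_critical

print(f"F = {f_stat:.4f} with df ({df_num}, {df_den}); reject H0: {reject_h0}")
```

Because the computed ratio is well below the critical value, the sketch reaches the same decision as the slide: H0 is not rejected.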
12 - 9
ANOVA
The F distribution is also used for testing whether two or more sample means came from the same or equal populations.
This technique is called analysis of variance, or ANOVA.
12 - 10
ANOVA requires the following conditions:
• the sampled populations follow the normal distribution
• the populations have equal standard deviations
• the samples are randomly selected and are independent
12 - 11
ANOVA Procedure
• The Null Hypothesis (H0) is that the population means are the same.
• The Alternative Hypothesis (H1) is that at least one of the means is different.
• The Test Statistic follows the F distribution.
• The Decision Rule is to reject H0 if F(computed) is greater than F(table) with the numerator and denominator df.
12 - 13
Terminology
• Total Variation is the sum of the squared differences between each observation and the overall mean.
• Treatment Variation is the sum of the squared differences between each treatment mean and the overall mean, with each squared difference weighted by the number of observations in that treatment.
• Random Variation is the sum of the squared differences between each observation and its treatment mean.
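The three definitions can be made concrete with a small sketch. The data below are hypothetical numbers chosen only for illustration (they do not come from the slides), and the variable names are assumptions. The point is that total variation splits exactly into treatment variation plus random variation.

```python
from statistics import fmean

# Hypothetical observations for three treatments (illustration only).
samples = {
    "A": [4.0, 6.0, 5.0],
    "B": [8.0, 9.0, 10.0],
    "C": [1.0, 3.0, 2.0],
}

all_obs = [x for obs in samples.values() for x in obs]
overall_mean = fmean(all_obs)

# Each observation against the overall mean.
total_variation = sum((x - overall_mean) ** 2 for x in all_obs)

# Each treatment mean against the overall mean, weighted by group size.
treatment_variation = sum(
    len(obs) * (fmean(obs) - overall_mean) ** 2 for obs in samples.values()
)

# Each observation against its own treatment mean.
random_variation = sum(
    (x - fmean(obs)) ** 2 for obs in samples.values() for x in obs
)

print(total_variation, treatment_variation, random_variation)
```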
12 - 14
The test statistic is computed by:
$$F = \frac{SST/(k-1)}{SSE/(n-k)}$$
If there are k populations being sampled, the numerator degrees of freedom is k - 1.
If there are a total of n observations, the denominator degrees of freedom is n - k.
12 - 15
• SS Total is the total sum of squares
$$SS\ Total = \sum X^2 - \frac{(\sum X)^2}{n}$$
12 - 16
• SST is the treatment sum of squares
$$SST = \sum \frac{T_c^2}{n_c} - \frac{(\sum X)^2}{n}$$
where $T_c$ is the column total, $n_c$ is the number of observations in each column, $\sum X$ is the sum of all the observations, and n is the total number of observations.
12 - 17
• SSE is the sum of squares error
$$SSE = SS\ Total - SST$$
12 - 18
Easy Meals Restaurants specialize in meals for senior citizens.
Katy Smith, President, recently developed a new meat loaf dinner. Before making it a part of the regular menu, she decides to test it in several of her restaurants.
She would like to know if there is a difference in the mean number of dinners sold per day at the Aynor, Loris, and Lander restaurants.
Use the .05 significance level.
12 - 19
…continued
      Aynor  Loris  Lander
         13     10      18
         12     12      16
         14     13      17
         12     11      17
                        17
Tc       51     46      85
nc        4      4       5
12 - 20
…continued
• SS Total is the total sum of squares
$$SS\ Total = \sum X^2 - \frac{(\sum X)^2}{n} = 2634 - \frac{(182)^2}{13} = 86$$
12 - 21
…continued
• SST is the treatment sum of squares
$$SST = \sum \frac{T_c^2}{n_c} - \frac{(\sum X)^2}{n} = \frac{51^2}{4} + \frac{46^2}{4} + \frac{85^2}{5} - \frac{(182)^2}{13} = 76.25$$
12 - 22
…continued
• SSE is the sum of squares error
$$SSE = SS\ Total - SST = 86 - 76.25 = 9.75$$
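The three sums of squares for the Easy Meals data can be reproduced with the computational formulas. This is a minimal sketch; the dictionary layout and variable names are assumptions made for illustration.

```python
# Dinners sold per day at each restaurant, from the Easy Meals table.
sales = {
    "Aynor": [13, 12, 14, 12],
    "Loris": [10, 12, 13, 11],
    "Lander": [18, 16, 17, 17, 17],
}

all_obs = [x for obs in sales.values() for x in obs]
n = len(all_obs)

# Shared term (sum of X)^2 / n.
correction = sum(all_obs) ** 2 / n

# SS Total = sum of X^2 minus the correction term.
ss_total = sum(x * x for x in all_obs) - correction

# SST = sum over columns of Tc^2 / nc, minus the correction term.
sst = sum(sum(obs) ** 2 / len(obs) for obs in sales.values()) - correction

# SSE is what remains.
sse = ss_total - sst

print(f"SS Total = {ss_total:.2f}, SST = {sst:.2f}, SSE = {sse:.2f}")
```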
12 - 23
Hypothesis Test
Step 1: State the null and alternate hypotheses.
$$H_0: \mu_1 = \mu_2 = \mu_3$$
H1: Treatment means are not all equal.
Step 2: Select the level of significance.
α = 0.05
Step 3: Identify the test statistic.
The test statistic follows the F distribution.
Step 4: State the decision rule.
Reject H0 if F > 4.10. The df are 2 in the numerator and 10 in the denominator.
Step 5: Compute the test statistic and make a decision.
$$F = \frac{SST/(k-1)}{SSE/(n-k)} = \frac{76.25/2}{9.75/10} = 39.10$$
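The F ratio in Step 5 follows directly from the two sums of squares and their degrees of freedom. The helper below is a hypothetical sketch (its name and signature are not from the slides), and the critical value 4.10 is copied from the decision rule rather than computed.

```python
def one_way_f(sst: float, sse: float, n: int, k: int) -> tuple[float, float, float]:
    """Return (MST, MSE, F) for a one-way ANOVA with k treatments and n observations."""
    mst = sst / (k - 1)   # treatment mean square
    mse = sse / (n - k)   # error mean square
    return mst, mse, mst / mse


mst, mse, f_stat = one_way_f(sst=76.25, sse=9.75, n=13, k=3)

f_critical = 4.10         # from the decision rule (alpha = .05, df 2 and 10)
reject_h0 = f_stat > f_critical

print(f"MST = {mst:.3f}, MSE = {mse:.3f}, F = {f_stat:.2f}, reject H0: {reject_h0}")
```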
12 - 24
…continued
The decision is to reject the null hypothesis.
The treatment means are not the same.
The mean number of meals sold at the three locations is not the same.
12 - 25
Analysis of Variance
Source DF SS MS F P
Factor 2 76.250 38.125 39.10 0.000
Error 10 9.750 0.975
Total 12 86.000
Individual 95% CIs For Mean Based on Pooled St.Dev
Level N Mean St.Dev ---------+---------+---------+-------
Aynor 4 12.750 0.957 (---*---)
Loris 4 11.500 1.291 (---*---)
Lander 5 17.000 0.707 (---*---)
---------+---------+---------+-------
Pooled St.Dev = 0.987 12.5 15.0 17.5
ANOVA Table
…from the Minitab system
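The per-restaurant rows and the pooled standard deviation in the output above can be recomputed from the raw data. This is a rough sketch using only Python's standard library, not a reproduction of how Minitab works internally; the names are illustrative.

```python
from math import sqrt
from statistics import fmean, stdev

# Dinners sold per day at each restaurant, from the Easy Meals table.
sales = {
    "Aynor": [13, 12, 14, 12],
    "Loris": [10, 12, 13, 11],
    "Lander": [18, 16, 17, 17, 17],
}

# (N, mean, sample standard deviation) for each level.
summary = {name: (len(obs), fmean(obs), stdev(obs)) for name, obs in sales.items()}

n = sum(len(obs) for obs in sales.values())
k = len(sales)

# SSE rebuilt from the within-group variances; pooled SD is sqrt(MSE).
sse = sum((len(obs) - 1) * stdev(obs) ** 2 for obs in sales.values())
pooled_sd = sqrt(sse / (n - k))

for name, (size, mean, sd) in summary.items():
    print(f"{name:<7}{size:>3}{mean:>9.3f}{sd:>8.3f}")
print(f"Pooled St.Dev = {pooled_sd:.3f}")
```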
12 - 26
Analysis of Variance in Excel
12 - 27
Using Excel
Click on Tools, then click on DATA ANALYSIS.
12 - 28
Using Excel
Highlight ANOVA: SINGLE FACTOR, then click OK.
12 - 29
Using Excel
Input the sample data in Columns A, B, C.
Input range: A1:C6
Click on OK.
12 - 30
Using Excel
The output shows SS Total, SST, SSE, and the F test.
12 - 31
Inferences About Treatment Means
12 - 32
Inferences About Treatment Means
When we reject the null hypothesis that the means are equal, we may want to know which treatment means differ.
One of the simplest procedures is through the use of confidence intervals.
12 - 33
Confidence Interval for the Difference Between Two Means
$$(\bar{X}_1 - \bar{X}_2) \pm t\sqrt{MSE\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where t is obtained from the t table with degrees of freedom (n - k), and MSE = [SSE/(n - k)].
12 - 34
Confidence Interval for the Difference Between Two Means
Develop a 95% confidence interval for the difference in the mean number of meat loaf dinners sold in Lander and Aynor.
Can Katy conclude that there is a difference between the two restaurants?
12 - 35
Confidence Interval for the Difference Between Two Means
$$(\bar{X}_1 - \bar{X}_2) \pm t\sqrt{MSE\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
$$(17 - 12.75) \pm 2.228\sqrt{0.975\left(\frac{1}{4} + \frac{1}{5}\right)} = 4.25 \pm 1.48 \Rightarrow (2.77,\ 5.73)$$
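The interval above can be verified numerically. The function below is a hypothetical helper written for this check; the t value 2.228 (10 degrees of freedom) and the MSE of 0.975 are taken from the slides rather than computed.

```python
from math import sqrt


def mean_difference_ci(mean1: float, mean2: float, n1: int, n2: int,
                       mse: float, t_value: float) -> tuple[float, float]:
    """Confidence interval for mean1 - mean2 using the pooled MSE from the ANOVA."""
    margin = t_value * sqrt(mse * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    return diff - margin, diff + margin


# Lander (mean 17, n = 5) versus Aynor (mean 12.75, n = 4).
low, high = mean_difference_ci(17.0, 12.75, n1=5, n2=4, mse=0.975, t_value=2.228)

print(f"95% CI: ({low:.2f}, {high:.2f})")
print("Interval excludes zero:", low > 0 or high < 0)
```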
12 - 36
Confidence Interval for the Difference Between Two Means
…continued
Because zero is not in the interval, we conclude that this pair of means differs.
The mean number of meals sold in Aynor is different from Lander.
12 - 37
ANOVA
For the two-factor ANOVA, we test whether there is a significant difference between the treatment effects and whether there is a difference in the blocking effects.
• Let $B_r$ be the block totals (r for rows).
• Let SSB represent the sum of squares for the blocks:
$$SSB = \sum \frac{B_r^2}{k} - \frac{(\sum X)^2}{n}$$
where k is the number of observations in each block (the number of treatments).
12 - 38
ANOVA
The Bieber Manufacturing Co. operates 24 hours a day, five days a week.
The workers rotate shifts each week. Todd Bieber, the owner, is interested in whether there is a difference in the number of units produced when the employees work on various shifts. A sample of five workers is selected and their output recorded on each shift.
At the .05 significance level, can we conclude there is a difference in the mean production by shift and in the mean production by employee?
12 - 39
ANOVA
…continued
Employee    Day Output  Evening Output  Night Output
McCartney           31              25            35
Neary               33              26            33
Schoen              28              24            30
Thompson            30              29            28
Wagner              28              26            27
12 - 40
Hypothesis Test
Difference between various shifts?
Step 1: State the null and alternate hypotheses.
$$H_0: \mu_1 = \mu_2 = \mu_3$$
H1: Not all means are equal.
Step 2: Select the level of significance.
α = 0.05
Step 3: Identify the test statistic.
The test statistic follows the F distribution:
$$F = \frac{SST/(k-1)}{SSE/[(k-1)(b-1)]}$$
Step 4: State the decision rule.
Reject H0 if F > 4.46. The df are 2 and 8.
Step 5: Compute the test statistic and make a decision.
12 - 41
ANOVA
…continued
Compute the various sums of squares:
SS(total) = 139.73
SST = 62.53
SSB = 33.73
SSE = 43.47
df(block) = 4, df(treatment) = 2, df(error) = 8
These results were obtained using software.
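The sums of squares listed above can be recomputed from the Bieber output table with the same computational formulas used earlier, extended with a block term. This is a sketch under the assumption that shifts are the treatments (columns) and employees are the blocks (rows); the variable names are illustrative.

```python
# Units produced by each employee on the Day, Evening, and Night shifts.
output = {
    "McCartney": [31, 25, 35],
    "Neary": [33, 26, 33],
    "Schoen": [28, 24, 30],
    "Thompson": [30, 29, 28],
    "Wagner": [28, 26, 27],
}

rows = list(output.values())
b = len(rows)        # number of blocks (employees)
k = len(rows[0])     # number of treatments (shifts)
n = b * k

all_obs = [x for row in rows for x in row]
correction = sum(all_obs) ** 2 / n           # (sum of X)^2 / n

ss_total = sum(x * x for x in all_obs) - correction
sst = sum(sum(col) ** 2 for col in zip(*rows)) / b - correction  # shift totals
ssb = sum(sum(row) ** 2 for row in rows) / k - correction        # employee totals
sse = ss_total - sst - ssb

df_block, df_treatment = b - 1, k - 1
df_error = df_block * df_treatment

print(f"SS(total) = {ss_total:.2f}, SST = {sst:.2f}, SSB = {ssb:.2f}, SSE = {sse:.2f}")
print(f"df(block) = {df_block}, df(treatment) = {df_treatment}, df(error) = {df_error}")
```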
12 - 42
ANOVA
…continued
Step 5:
$$F = \frac{SST/(k-1)}{SSE/[(k-1)(b-1)]} = \frac{62.53/(3-1)}{43.47/[(5-1)(3-1)]} = 5.754$$
Since 5.754 > 4.46, H0 is rejected.
There is a difference in the mean number of units produced on the different shifts.
12 - 43
Hypothesis Test
Difference between various employees?
Step 1: State the null and alternate hypotheses.
$$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$$
H1: Not all means are equal.
Step 2: Select the level of significance.
α = 0.05
Step 3: Identify the test statistic.
The test statistic follows the F distribution:
$$F = \frac{SSB/(b-1)}{SSE/[(k-1)(b-1)]}$$
Step 4: State the decision rule.
Reject H0 if F > 3.84. The df are 4 and 8.
Step 5: Compute the test statistic and make a decision.
12 - 44
ANOVA
…continued
Step 5:
$$F = \frac{SSB/(b-1)}{SSE/[(k-1)(b-1)]} = \frac{33.73/4}{43.47/[(4)(2)]} = 1.55$$
Since 1.55 < 3.84, H0 is not rejected.
There is no significant difference in the mean number of units produced by the various employees.
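Both F ratios for the Bieber example share the same error mean square, which a short sketch makes explicit. The sums of squares and the critical values 4.46 and 3.84 are copied from the slides; the variable names are assumptions for illustration.

```python
# Sums of squares as reported for the Bieber example.
sst, ssb, sse = 62.53, 33.73, 43.47
k, b = 3, 5                      # shifts (treatments) and employees (blocks)

df_treatment = k - 1
df_block = b - 1
df_error = df_treatment * df_block

mse = sse / df_error             # common denominator for both tests

f_shift = (sst / df_treatment) / mse
f_employee = (ssb / df_block) / mse

# Critical values copied from the decision rules (alpha = .05).
reject_shift = f_shift > 4.46        # df 2 and 8
reject_employee = f_employee > 3.84  # df 4 and 8

print(f"Shift:    F = {f_shift:.3f}, reject H0: {reject_shift}")
print(f"Employee: F = {f_employee:.2f}, reject H0: {reject_employee}")
```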
12 - 45
ANOVA
Units versus Worker, Shift
Analysis of Variance for Units
Source    DF      SS      MS     F      P
Worker     4   33.73    8.43  1.55  0.276
Shift      2   62.53   31.27  5.75  0.028
Error      8   43.47    5.43
Total     14  139.73
…from the Minitab system
12 - 46
Using Excel
Highlight ANOVA: TWO FACTOR WITHOUT REPLICATION, then click OK.
Select INPUT DATA.
12 - 47
Using Excel
The output shows SS Total, SST, SSB, SSE, F(test), and F(critical).
Since F(test) < F(critical), there is not sufficient evidence to reject H0.
There is no significant difference in the average number of units produced by the different employees.
12 - 48
Test your learning…
www.mcgrawhill.ca/college/lind
Click on Online Learning Centre for quizzes, extra content, data sets, searchable glossary, access to Statistics Canada’s E-Stat data…and much more!