ida-hasniza
7/29/2019 Topic Anova
MTE3105 Statistics
Topic 5: Analysis of Variance (ANOVA)
1.1 Synopsis
In this course, students will revisit the concepts of probability and explore inferential statistics
such as analysis of variance (ANOVA) in hypothesis testing. The importance of using the appropriate
statistical methods in solving real-life problems is emphasized.
1.2 Learning Outcomes
1. Understand the theoretical and empirical concepts of ANOVA
2. Use inferential statistics such as ANOVA in hypothesis testing
3. Calculate ANOVA by hand
4. Calculate ANOVA using Excel
1.3 Conceptual Framework
[Figure: conceptual framework. Hypothesis testing covers one-way ANOVA, two-way ANOVA, chi-square, and linear regression.]
1.4 ANOVA
Analysis of variance compares two or more populations of interval data. Specifically, we
are interested in determining whether differences exist between the population
means. The procedure works by analyzing the sample variance.
1.4.1. Definitions
F-distribution
The distribution of the ratio of two independent chi-square variables, each divided by its
degrees of freedom. If the population variances are equal, this simplifies to the ratio of the
sample variances.
Analysis of Variance (ANOVA)
A technique used to test a hypothesis concerning the means of three or more populations.
One-Way Analysis of Variance
Analysis of variance when there is only one independent variable. The null hypothesis
is that all population means are equal; the alternative hypothesis is that at least one
mean is different.
Between Group Variation
The variation due to the interaction between the samples, denoted SS(B) for Sum of
Squares Between groups. If the sample means are close to each other (and therefore to
the Grand Mean) this will be small. There are k samples involved with one data value for
each sample (the sample mean), so there are k-1 degrees of freedom.
Between Group Variance
The variance due to the interaction between the samples, denoted MS(B) for Mean
Square Between groups. This is the between group variation divided by its degrees of
freedom.
Within Group Variation
The variation due to differences within individual samples, denoted SS(W) for Sum of
Squares Within groups. Each sample is considered independently, no interaction
Interaction Effect
The effect one factor has on the other factor
Main Effect
The effects of the independent variables.
1.4.2. One-Way ANOVA
In the analysis of variance, the approach is conceptually similar to the t-test, although
the method differs. When you want to compare more than two means, the one-way
analysis of variance (ANOVA) is used. Say, for example, you conducted an experiment
in which you compared the effectiveness of three teaching methods in enhancing
reading comprehension.
A One-Way Analysis of Variance is a way to test the equality of three or more means at
one time by using variance.
Assumptions
The populations from which the samples were obtained must be normally or
approximately normally distributed.
The samples must be independent.
The variances of the populations must be equal.
1.4.3 How ANOVA works
ANOVA measures two sources of variation in the data and compares their relative sizes:
variation BETWEEN groups
for each data value, look at the difference between its group mean and
the overall mean: $(\bar{x}_i - \bar{x})^2$
variation WITHIN groups
for each data value, look at the difference between that value and the
mean of its group: $(x_{ij} - \bar{x}_i)^2$
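Summing these per-value contributions over all the data gives the sums of squares used in the rest of this topic. The block below is a sketch of that step, assuming $k$ groups with $n_i$ values in group $i$ and $N$ values in total (the symbols $n_i$ and $N$ are introduced here for illustration):

```latex
\begin{align*}
SS_B &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} (\bar{x}_i - \bar{x})^2
      = \sum_{i=1}^{k} n_i \, (\bar{x}_i - \bar{x})^2, \\
SS_W &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2, \\
SS_T &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2 = SS_B + SS_W.
\end{align*}
```

Dividing $SS_B$ by its $k-1$ degrees of freedom and $SS_W$ by its $N-k$ degrees of freedom turns each variation into a variance (a mean square).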
The ANOVA F-statistic is the ratio of the between-group variance (mean square between groups, MSG)
to the within-group variance (mean square error, MSE):
$F = \dfrac{\text{Between}}{\text{Within}} = \dfrac{MSG}{MSE}$
A large F is evidence against H0, since it indicates that there is more difference between
groups than within groups.
We want to measure the amount of variation due to BETWEEN group variation and
WITHIN group variation.
For each data value, we calculate its contribution to:
BETWEEN group variation: $(\bar{x}_i - \bar{x})^2$
WITHIN group variation: $(x_{ij} - \bar{x}_i)^2$
1.4.4. Example problem using One-way Analysis of Variance
Three groups of students, 5 in each group, were receiving therapy for severe test
anxiety. Group 1 received 5 hours of therapy, group 2 - 10 hours and group 3 - 15
hours. At the end of therapy each subject completed an evaluation of test anxiety (the
dependent variable in the study). Did the amount of therapy have an effect on the level
of test anxiety?
The three groups of students received the following scores on the Test Anxiety Index
(TAI) at the end of treatment.
TAI Scores for Three Groups of Students
Group 1 - 5 hours    Group 2 - 10 hours    Group 3 - 15 hours
48                   55                    51
50                   52                    52
53                   53                    50
52                   55                    53
50                   53                    50
The degrees of freedom between groups is:
dfB = K - 1 = 3 - 1 = 2
Where K is the number of groups.
Next we calculate SSW, the sum of squares within groups.
The degrees of freedom within groups is:
dfW = NT - K = 15 - 3 = 12
Where NT is the total number of subjects.
Finally, we will calculate SST, the total sum of squares.
As a check SST = SSB + SSW
54.4 = 25.2 + 29.2
We can now calculate MSB, the mean square between groups, MSW, the mean square
within groups, and F, the F ratio.
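The calculation can be checked with a short script. This is a minimal sketch, not the handout's own procedure: it uses the deviation-from-the-mean definitions of the sums of squares, and the function name `one_way_anova` and the dictionary keys are chosen here for illustration only.

```python
# TAI scores from the example, one list per therapy group.
tai_scores = [
    [48, 50, 53, 52, 50],  # Group 1 - 5 hours
    [55, 52, 53, 55, 53],  # Group 2 - 10 hours
    [51, 52, 50, 53, 50],  # Group 3 - 15 hours
]


def one_way_anova(samples):
    """Return the sums of squares, degrees of freedom, mean squares and F."""
    all_values = [x for sample in samples for x in sample]
    n_total = len(all_values)          # NT, total number of subjects
    k = len(samples)                   # K, number of groups
    grand_mean = sum(all_values) / n_total
    means = [sum(sample) / len(sample) for sample in samples]

    # Between groups: each value contributes (group mean - grand mean)^2.
    ssb = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Within groups: each value contributes (value - its group mean)^2.
    ssw = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    # Total: each value contributes (value - grand mean)^2.
    sst = sum((x - grand_mean) ** 2 for x in all_values)

    df_b, df_w = k - 1, n_total - k
    msb, msw = ssb / df_b, ssw / df_w
    return {
        "SSB": ssb, "SSW": ssw, "SST": sst,
        "dfB": df_b, "dfW": df_w,
        "MSB": msb, "MSW": msw, "F": msb / msw,
    }


result = one_way_anova(tai_scores)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The printed values should agree with the hand calculation: SSB = 25.2, SSW = 29.2, SST = 54.4, and an F ratio of about 5.178.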
To test the significance of the F value we obtained, we need to compare it with the
critical F value with an alpha level of .05, 2 degrees of freedom between groups (or
degrees of freedom in the numerator of the F ratio), and 12 degrees of freedom within
groups (or degrees of freedom in the denominator of the F ratio). We can look up the
critical value of F in Appendix Table D of the textbook (The 5 percent (Lightface Type)
and 1 percent (Boldface Type) points for the Distribution of F), pages 319-326. Look in
the table under column 2 (2 degrees of freedom for the numerator) and row 12 (12
degrees of freedom for the denominator) and read the non-boldfaced entry (for .05 level)
of 3.88 - this is the critical value for F.
One way of indicating this critical value of F at the .05 level, with 2 degrees of freedom
between groups and 12 degrees of freedom within groups is
F.05(2,12) = 3.88
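If the table is not at hand, the critical value can also be computed. The sketch below relies on a special case: when the numerator has exactly 2 degrees of freedom (three groups), the upper-tail probability of F has the closed form P(F > x) = (1 + 2x/dfW)^(-dfW/2), which can be solved for x. The function name is hypothetical, and the formula does not apply to other numerator degrees of freedom.

```python
def f_critical_two_numerator_df(alpha, df_within):
    """Upper-tail critical value of F(2, df_within).

    Valid ONLY when the numerator has 2 degrees of freedom, where
    P(F > x) = (1 + 2 * x / df_within) ** (-df_within / 2).
    Setting that probability equal to alpha and solving for x gives
    the expression returned below.
    """
    return (df_within / 2) * (alpha ** (-2 / df_within) - 1)


f_crit = f_critical_two_numerator_df(alpha=0.05, df_within=12)
print(round(f_crit, 2))
```

For alpha = .05 and 12 degrees of freedom within groups, this reproduces the tabled value of 3.88 after rounding to two decimals.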
When using analysis of variance, it is a common practice to present the results of the
analysis in an analysis of variance table. This table which shows the source of variation,
the sum of squares, the degrees of freedom, the mean squares, and the probability is
sometimes presented in a research article. The analysis of variance table for our
problem would appear as follows:
Analysis of Variance Table
Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square    F Ratio    p
Between Groups         25.20             2                     12.60          5.178      < .05
Within Groups          29.20             12                    2.43
Total                  54.40             14
Text Book A    Text Book B    Text Book C
54             53             49
49             56             53
52             57             47
55             51             50
48             59             54
With α = .05, test if the means of the three populations are equal.
1. State the independent variable and the dependent variable in this study
2. State the assumptions for using a one-way ANOVA
3. State the null hypothesis and the alternative hypothesis
4. Compute SSB, SSw and SST
5. Compute the between and within samples variances
6. Indicate the value of Fcritical.
7. Compute the F value
8. Create an ANOVA table and fill in the above information
9. Describe the conclusion.
Solution:
Text Book A              Text Book B              Text Book C
54                       53                       49
49                       56                       53
52                       57                       47
55                       51                       50
48                       59                       54
$T_1 = 258$              $T_2 = 276$              $T_3 = 253$
$\sum X_1^2 = 13350$     $\sum X_2^2 = 15276$     $\sum X_3^2 = 12835$
$n_1 = 5$                $n_2 = 5$                $n_3 = 5$
$\bar{X}_1 = 51.6$       $\bar{X}_2 = 55.2$       $\bar{X}_3 = 50.6$
1) Independent variable: the textbook used (three different textbooks)
Dependent variable: scores of mathematics achievement
2) The assumptions for using one-way ANOVA:
1. The populations are normally distributed
2. The variances of the populations are equal
3. Scores are independent
4. Samples are independent
5. Samples are random
3) Null hypothesis, H0: $\mu_1 = \mu_2 = \mu_3$ (the three group means are equal)
Alternative hypothesis, Ha: at least one of the means differs from the others
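4) a) Sum of Squares Between Groups (SSB)
$SS_B = \sum_j \dfrac{(T_j)^2}{n_j} - \dfrac{(\sum X)^2}{N}$
$SS_B = \dfrac{(258)^2}{5} + \dfrac{(276)^2}{5} + \dfrac{(253)^2}{5} - \dfrac{(787)^2}{15}$
= 58.5333
b) Sum of Squares Within Groups (SSw)
$SS_w = \sum X^2 - \sum_j \dfrac{(T_j)^2}{n_j}$
$= 41{,}461 - \left[ \dfrac{(258)^2}{5} + \dfrac{(276)^2}{5} + \dfrac{(253)^2}{5} \right]$
= 111.2
c) Sum of Squares Total (SST)
SST = SSB + SSw = 58.5333 + 111.2 = 169.7333
5) Between Group Variance
$MS_B = \dfrac{SS_B}{k - 1} = \dfrac{58.5333}{2} = 29.2667$
Within Group Variance
$MS_w = \dfrac{SS_w}{N - k} = \dfrac{111.2}{12} = 9.2667$
(here k = 3 is the number of groups and N = 15 is the total number of scores)
6) The value of Fcritical
Fcritical = F(0.05, 2, 12) = 3.89
Decision rule: Reject H0 if F > 3.89
7) The value of F
$F = \dfrac{MS_B}{MS_w} = \dfrac{29.2667}{9.2667} = 3.16$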
8) One-Way ANOVA Table
Sources of Variation    Degrees of Freedom (df)    Sum of Squares (SS)    Mean Square (MS)    Test Statistic Value (F)    F critical
Between                 2                          58.5333                29.2667             3.16                        3.89
Within                  12                         111.2000               9.2667
Total                   14                         169.7333
9) Conclusion
F = 3.16 and Fcritical = 3.89. Since F < Fcritical, we fail to reject H0. The data do not
provide sufficient evidence that the population means differ (F(2,12) = 3.16, α = 0.05).
The differences among the three sample means can be attributed to sampling error.
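The solution above can be reproduced with the same computational (totals-based) formulas. The following is a small sketch under that assumption; the variable names are illustrative, and the critical value 3.89 is taken from step 6 rather than computed.

```python
# Scores for the three textbooks, as given in the problem.
books = {
    "A": [54, 49, 52, 55, 48],
    "B": [53, 56, 57, 51, 59],
    "C": [49, 53, 47, 50, 54],
}

k = len(books)                                           # number of groups
n_total = sum(len(scores) for scores in books.values())  # N = 15
grand_total = sum(sum(scores) for scores in books.values())            # 787
sum_of_squares = sum(x * x for scores in books.values() for x in scores)  # 41461

# Sum over groups of (group total)^2 / (group size).
totals_term = sum(sum(scores) ** 2 / len(scores) for scores in books.values())

ssb = totals_term - grand_total ** 2 / n_total   # between groups
ssw = sum_of_squares - totals_term               # within groups
sst = ssb + ssw                                  # total

msb = ssb / (k - 1)
msw = ssw / (n_total - k)
f_value = msb / msw

f_critical = 3.89  # F(0.05, 2, 12), from step 6 of the solution
decision = "reject H0" if f_value > f_critical else "fail to reject H0"

print(f"SSB = {ssb:.4f}, SSw = {ssw:.4f}, SST = {sst:.4f}")
print(f"MSB = {msb:.4f}, MSw = {msw:.4f}, F = {f_value:.2f}")
print(decision)
```

Running it gives SSw = 111.2, which is the value the Within row of the ANOVA table should show.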
4.6 Using the Excel Spreadsheet Program to Calculate One-Way Analysis of
Variance
The Excel spreadsheet program has a tool to calculate One-Way Analysis of
Variance, which simplifies our computational task considerably. Let's use the same
research problem we already considered, but use the spreadsheet program to do the
calculations.
Research Problem:
Three groups of students, 5 in each group, were receiving therapy for severe test
anxiety. Group 1 received 5 hours of therapy, group 2 - 10 hours and group 3 - 15
hours. At the end of therapy each subject completed an evaluation of test anxiety (the
dependent variable in the study). Did the amount of therapy have an effect on the
level of test anxiety?
In the Excel Worksheet select Data Analysis under the Tools menu. If Data Analysis is
not available you must install the Data Analysis Tools.
If you need to, you can install the data analysis tools as follows:
1. Select Add-Ins from the Tools menu.
2. In the Add-Ins window click on the box next to Analysis ToolPak to select it.
3. Click OK. You have now installed the ToolPak.
With the Data Analysis Tools installed, select Data Analysis under the Tools menu.
In the Data Analysis window scroll down and select Anova: Single Factor. Complete
the Anova: Single Factor window as follows:
1. Enter $A$2:$C$7 in the Input Range: box (or you can enter that value
automatically by clicking in the box and then selecting the range of cells A2
through C7). Note that we have included the labels, Group 1, Group 2, and
Group 3, in the range of cells we selected.
2. Click the Columns button to indicate that our data is grouped by
columns.
3. Click the Labels in first row box to indicate that we are using labels (Group
1, Group 2, and Group 3).
4. Enter .05 in the Alpha: box.
5. Under Output Options click the button for Output range: and enter $A$9 in the
Output range: box (or click in the box and then click on the cell A9 to cause it to
appear in the box).
6. Click OK.
Your spreadsheet should now appear as follows:
The results of the one-way analysis of variance can be seen in the resultant tables. The
means for the three groups (as well as the count, sum, and variance for each group) can
be seen in the SUMMARY table.
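The per-group figures described for the SUMMARY table can be cross-checked outside Excel. This is only a sketch using Python's standard `statistics` module; the row layout is an assumption based on the description above (count, sum, average, variance), and the variance is taken to be the sample variance (divisor n - 1).

```python
import statistics

# TAI scores, one list per group, in the same order as the worksheet columns.
groups = {
    "Group 1": [48, 50, 53, 52, 50],
    "Group 2": [55, 52, 53, 55, 53],
    "Group 3": [51, 52, 50, 53, 50],
}

summary = {}
for name, scores in groups.items():
    summary[name] = {
        "count": len(scores),
        "sum": sum(scores),
        "average": statistics.mean(scores),
        "variance": statistics.variance(scores),  # sample variance, n - 1
    }

print(f"{'Groups':<10}{'Count':>7}{'Sum':>7}{'Average':>10}{'Variance':>10}")
for name, row in summary.items():
    print(f"{name:<10}{row['count']:>7}{row['sum']:>7}"
          f"{row['average']:>10.1f}{row['variance']:>10.1f}")
```

Each group's variance multiplied by its n - 1 = 4 gives that group's share of the within-groups sum of squares, so the three shares add up to 29.2.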
The ANOVA table shows the same results as we put in the Analysis of Variance table
when we calculated the results ourselves. The value of F is shown to be 5.178082192,
which rounded to 5.18 is the same value as we obtained when we calculated F. The
P-value is shown as .02391684, which is below the alpha level of .05 that we set, so the
result is significant at the .05 level and we will simply indicate that p < .05. There is an
additional entry in the table showing the critical value of F at the .05 level (F Crit), which
is 3.88529031 and agrees with the value (3.88) we looked up in Appendix Table D in
the textbook.
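The P-value reported by Excel can be verified by hand in this particular case. With 2 degrees of freedom in the numerator, the upper-tail probability of F has the closed form (1 + 2F/dfW)^(-dfW/2); the sketch below uses that special case only, and the function name is made up for illustration.

```python
def p_value_two_numerator_df(f_value, df_within):
    """Upper-tail probability P(F > f_value) for F(2, df_within).

    Valid ONLY when the numerator has 2 degrees of freedom.
    """
    return (1 + 2 * f_value / df_within) ** (-df_within / 2)


p = p_value_two_numerator_df(f_value=5.178082192, df_within=12)
print(f"p = {p:.6f}")
print("significant at .05" if p < 0.05 else "not significant at .05")
```

The result is about .0239, matching the P-value in the Excel output.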
Unfortunately, the spreadsheet program does not have a tool to calculate the
Scheffe test, so we will have to calculate those values the way we did before. The results of
our Scheffe tests were:
Summary of Scheffe Test Results
Group One versus Group Two 4.62
Group One versus Group Three 0.18
Group Two versus Group Three 2.96
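The Scheffe formula itself is not reproduced in this section, so the following sketch assumes a common form of the test: F for a pair of groups is the squared difference of their means divided by MSW x (1/n_a + 1/n_b) x (K - 1), and it is compared with the same critical value of F used for the overall test. The function name `scheffe_f` is hypothetical.

```python
from itertools import combinations

# TAI scores, keyed by group number.
groups = {
    1: [48, 50, 53, 52, 50],
    2: [55, 52, 53, 55, 53],
    3: [51, 52, 50, 53, 50],
}

k = len(groups)
n_total = sum(len(scores) for scores in groups.values())
means = {g: sum(scores) / len(scores) for g, scores in groups.items()}

# Mean square within groups, from the one-way ANOVA (29.2 / 12).
ssw = sum((x - means[g]) ** 2 for g, scores in groups.items() for x in scores)
msw = ssw / (n_total - k)


def scheffe_f(a, b):
    """Assumed Scheffe F for comparing the means of groups a and b."""
    n_a, n_b = len(groups[a]), len(groups[b])
    return (means[a] - means[b]) ** 2 / (msw * (1 / n_a + 1 / n_b) * (k - 1))


scheffe = {pair: scheffe_f(*pair) for pair in combinations(groups, 2)}
for (a, b), value in scheffe.items():
    print(f"Group {a} versus Group {b}: {value:.3f}")
```

With the unrounded MSW (2.4333) this gives roughly 4.623, 0.185, and 2.959. Rounding MSW to 2.43 before dividing gives 4.630 and 2.963 for the first and third comparisons, which would explain small differences in the last digit between reported values.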
We now have all the information we need to complete the six step statistical inference
process:
1. State the null hypothesis and the alternative hypothesis based on your
research question.
Note: Our null hypothesis, for the F test, states that there are no differences
among the three means. The alternate hypothesis states that there are significant
differences among some or all of the individual means. An unequivocal way of
stating this is not H0.
2. Set the alpha level.
Note: As usual we will set our alpha level at .05, so we have 5 chances in 100 of
making a type I error.
3. Calculate the value of the appropriate statistic. Also indicate the degrees of
freedom for the statistical test if necessary and the results of any post hoc
test, if they were conducted.
F(2,12) = 5.178, value of the F ratio
F.05(2,12) = 3.88, critical value of F
F12 = 4.630, Scheffe test value for comparing means 1 and 2
F13 = 0.185, Scheffe test value for comparing means 1 and 3
F23 = 2.963, Scheffe test value for comparing means 2 and 3
4. Write the decision rule for rejecting the null hypothesis.
Reject H0 if F is >= 3.88
Note: To write the decision rule we had to know the critical value for F, with an
alpha level of .05, 2 degrees of freedom in the numerator (df between groups)
and 12 degrees of freedom in the denominator (df within groups). We can do this
by looking at Appendix Table D and noting the tabled value for the .05 level in the
column for 2 df and the row for 12 df.
5. Write a summary statement based on the decision.
Reject H0, p < .05
Note: Since our calculated value of F (5.178) is greater than 3.88, we reject the
null hypothesis and accept the alternative hypothesis.
6. Write a statement of results in standard English.
There is a significant difference among the scores the three groups of students
received on the Test Anxiety Index.
Group 1 (the five hour therapy group) has a significantly lower score on the TAI
than does Group 2 (the ten hour therapy group).
We can see that the Excel spreadsheet program gives us an easy way to calculate the F
ratio. It also provides us with an analysis of variance table which shows, among other
things, the critical value of F for the alpha level we specified, and the probability level (p)
of the result.
http://f/ANOVA/ed602lesson13.htm
Question:
(1) State the assumptions of ANOVA.
(2) Describe the rationale of ANOVA, stating the ANOVA table.
(3) Solve the following problem using one-way ANOVA.
Four types of advertising displays were set up in 12 retail outlets, with three
outlets randomly assigned to each of the displays, for the purpose of studying
the point-of-sale impact of the displays. The relevant information is given in the
following table.
Type of Display    Sales
A1                 40    44    43
A2                 53    54    59
A3                 48    38    46
A4                 48    61    47
Carry out the analysis of variance to test the differences among the mean sales
values for the four types of displays, using the 5 percent level of significance.
(i) State the null hypothesis and alternative hypothesis.
Give a step-by-step solution using all the required formulas. Give the
ANOVA table and comment on the conclusion.
(ii) Use Excel to solve the above problem using the data given in the above
table and comment on the conclusion.