Upload
hoangnhan
View
226
Download
1
Embed Size (px)
Citation preview
CHAPTER 9
T-TESTS: EXAMINING DIFFERENCES BETWEEN MEANS
The previous chapter emphasized the importance of hypothesis testing within the context of
statistical research, but did not provide information about the specific research tools or statistics that
are employed in the process of hypothesis testing. Researchers have developed a variety of these
tools. The specific research tool examined in this chapter is the t-test. This statistic is used when a
researcher wishes to compare the means of two sample groups to test hypotheses related to the means
of the two overall populations represented by these samples.¹ This chapter explains the theoretical
foundations of the t-test and provides instruction in its use in the process of testing hypotheses.

¹ Comparing means of 3 or more groups requires the use of a different statistical tool, called the F-ratio. That tool is introduced in a later chapter.
The first step in the explanation of the t-test requires the introduction of a new concept. This
concept is a special form of sampling distribution referred to as the sampling distribution of mean
differences. The sampling distribution of mean differences is similar in function and importance to
the concept of the sampling distribution of means introduced in Chapter 7. Both are expected to
approximate the normal curve and are useful in providing a means of constructing confidence intervals
using the rules of probability. The sampling distribution of mean differences is a theoretical
construct representing the distribution of values that would be produced if a researcher drew
repeated samples from two groups and calculated the difference between each pair of sample means.
To illustrate, assume that a researcher selected 1000 samples of 100 teachers from region A and 1000
samples of 100 teachers from region B and recorded the salary of each teacher. Each time the
researcher selected a sample from regions A and B, means were obtained for each sample. The
difference between the two means was calculated for each sample pair. These mean differences
could then be used to construct a frequency polygon called the sampling distribution of mean
differences. This distribution should closely approximate the normal curve shown in Figure 9:1.
FIGURE 9:1
SAMPLING DISTRIBUTION OF MEAN DIFFERENCES
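The idea behind this distribution can be checked with a small simulation (an illustrative sketch, not part of the text; the population mean of 50 and standard deviation of 10 are arbitrary choices): draw many pairs of samples from two identical populations and collect the differences between the sample means.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

def mean_difference(n=100):
    """Draw one sample of size n from each of two identical populations
    (normal, mean 50, sd 10) and return the difference between sample means."""
    a = [random.gauss(50, 10) for _ in range(n)]
    b = [random.gauss(50, 10) for _ in range(n)]
    return statistics.mean(a) - statistics.mean(b)

# 1000 sample pairs -> an approximation of the sampling
# distribution of mean differences
diffs = [mean_difference() for _ in range(1000)]

# The mean of the distribution should sit very near zero,
# with relatively few extreme values in either direction
print(round(statistics.mean(diffs), 3))
```

A frequency polygon of `diffs` would closely approximate the normal curve of Figure 9:1.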
The null hypothesis begins with the presumption that there is not a statistically significant
difference between groups. Since our research tools are based on testing this presumption, the mean
of a sampling distribution of mean differences is expected to be zero because the positive and
negative mean differences in a distribution tend to cancel each other out. For every negative
difference in means observed among samples drawn from equivalent populations, there tends to be a
positive value an equal distance from the mean observed between another pair of samples. Within the
sampling distribution of differences, most of the sample mean differences in the distribution fall close
to zero. There will be relatively few mean differences having an extreme value in either direction
from the mean. This important characteristic makes the sampling distribution of mean differences a
very useful tool in testing hypotheses.
As with the sampling distribution of means, knowledge of the mean and the standard
deviation of a distribution of mean differences is rarely, if ever, available. It would be a major effort
to select a large number of sample pairs in order to calculate a standard deviation for a distribution of
mean differences; however, this standard deviation plays an important role in testing hypotheses
related to mean differences. Fortunately, there is a simple method whereby the standard deviation of
the distribution of differences can be accurately estimated. On the basis of two samples actually
drawn, the estimate of the standard deviation of a sampling distribution of differences is called the
standard error of the difference. The formulas for estimating the standard error of the difference
are as follows:
Formula 1 – Independent Samples:

    standard error of mean differences = sqrt( s1²/n1 + s2²/n2 )

where s1² and s2² are the variances of the first and second samples, and n1 and n2 are the
sample sizes of groups 1 and 2 respectively.

Formula 2 – Dependent Samples:

    standard error of mean differences = sqrt( (ΣD² − (ΣD)²/n) / (n(n − 1)) )

where D is, for each observed value, the difference between the value before and the value
after (square each of these differences and add them all together to obtain ΣD²), and n is the
number of pairs.
Independent samples are samples drawn from different groups. They are two or more
separate samples. For example, college admission test scores for men (Sample 1) and women
(Sample 2) would be two independent samples. Individuals in independent samples must be included
in one and only one group. Dependent samples are those which involve individuals or
phenomena observed at two different points in time. For example, student scores on their first and
second semester exams are dependent samples. In this example, two sets of data are used for the
same group of individuals or sample.

² When the number of degrees of freedom exceeds 30, t-distributions become almost indistinguishable from the normal curve. For that reason, the same critical value will be used with all tests that involve more than 30 degrees of freedom.
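The two standard-error formulas above can be sketched in Python (the helper names are hypothetical; the figures used in the checks come from the worked examples in this chapter):

```python
import math

def se_independent(var1, var2, n1, n2):
    """Formula 1: standard error of mean differences for independent samples."""
    return math.sqrt(var1 / n1 + var2 / n2)

def se_dependent(differences):
    """Formula 2: standard error for dependent (before/after) samples.
    `differences` holds, for each pair, the before value minus the after value."""
    n = len(differences)
    sum_d = sum(differences)
    sum_d2 = sum(d * d for d in differences)  # square each difference, then add
    return math.sqrt((sum_d2 - sum_d ** 2 / n) / (n * (n - 1)))
```

For example, two independent samples of size 5 with variances 1.00 and .80 give `se_independent(1.00, 0.80, 5, 5) = .60`, matching the regional example later in the chapter.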
The second step in the process of testing hypotheses involves the computation of a t-statistic.
The t-statistic is used in a similar way to the z-scores discussed earlier in the text. It converts the
difference in means observed between the two sample groups under examination to units based on the
standard error of the mean differences. The formula for the t-statistic used in this process is:

    t = (mean of sample 1 − mean of sample 2) / standard error of mean differences
Calculating a value for the t-statistic allows the researcher to test the validity of the null hypothesis
that has been proposed. Appendix D contains a table that provides the critical values for t at various
degrees of freedom.² Critical values represent the positive and negative points in the distribution that
mark the top and bottom end of a confidence interval for the standard error of mean differences in a
particular circumstance. For example, the critical value for t at the 95% level of confidence for a
sample with 5 degrees of freedom would be 2.571. This means that if the null hypothesis is true (the
true difference in the population means = 0) the laws of probability suggest that 95% of the time a
researcher should obtain a mean difference that was between -2.571 standard errors of the mean
difference and +2.571 standard errors of the mean difference. Once the critical values have been
obtained, they are compared to the calculated value of t. If the value of t falls outside of the range of
values marked by the positive and negative scores provided in Appendix D (-2.571 and +2.571 in our
example), the null hypothesis is rejected and the researcher must conclude that the difference in the
population means is NOT zero. If, however, the calculated value of t falls within the range of values
marked by the critical value provided in the appendix, the null hypothesis is accepted and the
researcher must conclude that there is not a statistically significant difference in the means of the two
populations.
At this point, it is useful to explain the rationale behind accepting the null hypothesis in the
above example. The null hypothesis is rejected if the probability is very small (P = .05, or only 5
chances out of 100) that the obtained sample difference is a product of sampling error. At .05, the
null hypothesis is rejected if an obtained sample difference occurs by chance only 5 times out of 100.
The 95% confidence level means that the two areas in the tails of the distribution of mean differences
are each equivalent to 2.5%. These two areas are shown graphically in
Figure 9:2.
FIGURE 9:2
NINETY-FIVE PERCENT CONFIDENCE LEVEL
Confidence levels can be set at any probability, but as a matter of convention a majority of research
applications use the .05 or the 95% level. A more stringent confidence level is the .01 level. At .01,
there is only 1 chance out of 100 that the obtained sample difference can occur by sampling error.
When a researcher selects a certain confidence level, the decision is made on the basis of the possible
errors that can be made and the nature of the research.
There are two types of error that can be made by those testing hypotheses. The first error is
known as the alpha error or Type I error. This error occurs when a researcher rejects the null
hypothesis when it was true (no difference existed). The alpha error can only be committed when a
null hypothesis is rejected. One’s likelihood of making the alpha error depends on the confidence level
selected. The above illustration represents a situation in which the probability of making the
alpha or Type I error would be P = .05, corresponding with the .05 level or 95% level of confidence. If
one rejects the null hypothesis at the .05 level of confidence and concludes that the observed
differences in two sample means are statistically significant, there are 5 chances out of 100 that the wrong
research decision has been made at this level of confidence. Simply put, the probability is P = .05 that
the alpha error has been made and the population means are the same. If the .01 level of confidence is
chosen, there is 1 chance out of 100 (P = .01) of making the wrong decision with reference to the
research conclusion. The alpha area of the curve becomes smaller when a more restrictive confidence
level is needed for specific purposes of the research.
A second kind of error is known as beta or Type II error. Accepting the null hypothesis when
it was false (when the difference was significant) is the beta error. Beta error can only be made when a
null hypothesis has been accepted. The relationship between the alpha and beta error should now be
apparent. The errors are mutually exclusive propositions and represent two separate areas of the curve.
Expanding or reducing one area of the curve likewise affects the other area of the curve. For most
research applications, the alpha error is perceived to be the more serious of the two errors. A very
simple way to describe the two errors is that when the alpha error is made, one assumes a difference
has been found that does not actually exist, and when the beta error is made, a real difference exists
but goes undetected.
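The meaning of the alpha level can also be seen by simulation (an illustrative sketch; the population values and sample sizes are arbitrary): when the null hypothesis is true, a test at the .05 level should reject it, and thus commit a Type I error, in roughly 5% of repeated experiments.

```python
import math
import random
import statistics

random.seed(7)  # reproducible sketch

def t_independent(sample1, sample2):
    """t-statistic for two independent samples (Formula 1 standard error)."""
    se = math.sqrt(statistics.variance(sample1) / len(sample1)
                   + statistics.variance(sample2) / len(sample2))
    return (statistics.mean(sample1) - statistics.mean(sample2)) / se

CRITICAL = 2.31  # Appendix D value for 8 degrees of freedom at the .05 level

rejections = 0
trials = 2000
for _ in range(trials):
    # Both samples come from the SAME population, so the null is true
    a = [random.gauss(100, 15) for _ in range(5)]
    b = [random.gauss(100, 15) for _ in range(5)]
    if abs(t_independent(a, b)) > CRITICAL:
        rejections += 1  # a Type I (alpha) error

print(round(rejections / trials, 3))  # should land near .05
```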
To illustrate how the t test is used in statistical research, suppose a researcher selected samples
from the Northeast (X1) and Midwest (X2) to compare the percentage decline (interval data) in
population between the 1980 and 1990 censuses. The results are reported in the following simple
distribution:
    X    Region
    1    NE
    2    MW
    1    NE
    2    MW
    2    NE
    4    MW
    3    NE
    2    MW
    3    NE
    2    MW

The research question could be whether or not the decline was greater for the Northeastern
states than the decline for the Midwestern states. Such a research finding would be useful in
understanding many other characteristics of these states, such as receipts of federal monies and voting
implications. A likely null hypothesis would be that there is no statistically significant difference
between the mean percentage decline in the Northeastern states and the mean percentage decline in
the Midwestern states. Begin by using the grouping variable (region) to separate the values into two
groups and then use solution matrices to calculate the t-statistic as follows:
SOLUTION MATRICES

Midwest (X2), mean = 2.4:

    X    X − mean    (X − mean)²
    2    −.4         .16
    2    −.4         .16
    4    1.6         2.56
    2    −.4         .16
    2    −.4         .16
                     Σ = 3.20

Northeast (X1), mean = 2.0:

    X    X − mean    (X − mean)²
    1    −1          1
    1    −1          1
    2     0          0
    3     1          1
    3     1          1
                     Σ = 4.00

variance of Midwest sample = 3.20 / 4 = .80
variance of Northeast sample = 4.00 / 4 = 1.00
standard error of mean differences = sqrt(1.00/5 + .80/5) = sqrt(.36) = .60
t = (2.0 − 2.4) / .60 = −.67
By consulting the table in Appendix D, at 8 degrees of freedom, the critical values for rejecting
the null hypothesis at the .05 and .01 levels of confidence are 2.31 and 3.36 respectively. The rule for
accepting and rejecting the null hypothesis is that if the obtained t value is smaller in absolute value
than the critical value, the null is accepted. Since the obtained t value of −.67 is smaller than either of
these critical values, the null hypothesis is accepted. Accepting the hypothesis means that the
researcher concludes that there is no statistically significant difference in the means of the two groups.
The calculated difference between the two means of −.40 is assumed to be the result of sampling
error. There is no difference between the two geographical areas with reference to population
declines for the years 1980 - 1990.
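The worked example can be checked in a few lines (a sketch reusing the chapter's regional data):

```python
import math
import statistics

northeast = [1, 1, 2, 3, 3]  # percentage declines, NE states
midwest = [2, 2, 4, 2, 2]    # percentage declines, MW states

mean_diff = statistics.mean(northeast) - statistics.mean(midwest)  # 2.0 - 2.4
se = math.sqrt(statistics.variance(northeast) / len(northeast)
               + statistics.variance(midwest) / len(midwest))      # sqrt(.36) = .60
t = mean_diff / se

print(round(t, 2))  # well inside the .05 critical values of ±2.31 -> accept the null
```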
The example of the t test given above is for two independent samples which are the same size.
An explanation of what constitutes two independent samples has already been given in this chapter.
However, researchers do not always generate data for analysis from two independent samples. One
very frequently used research design which does not involve two independent samples is a before and
after design, or dependent samples. For these research situations, data from one sample are collected
twice. An application of this kind of design is to determine if there are significant changes in the
behavior of a single group.
The following matrix shows data for a single group for two different times. In this case, the
researcher wants to determine whether there has been a significant change in spending increases for
the two quarters. Are the spending increases the same for both quarters?
The stated null hypothesis is that there is no significant difference in the mean value of X in the first
observation and the mean value of X in the second observation. Most of the calculations are the same as
presented earlier in the text except those associated with differences between time one and time two.
The actual calculations are as follows:
    Value of X – Before    Value of X – After
    18                     39
    15                     30
    10                     21
    12                     10
     9                     20
SOLUTION MATRIX

    Before    After    D (Before − After)    D²
    18        39       −21                   441
    15        30       −15                   225
    10        21       −11                   121
    12        10         2                     4
     9        20       −11                   121
                       ΣD = −56              ΣD² = 912

mean difference = −56 / 5 = −11.2
standard error of mean differences = sqrt((912 − (−56)²/5) / (5 × 4)) = sqrt(14.24) = 3.77
t = −11.2 / 3.77 = −2.96

It should be noted that the standard error of the difference formula in this research design is not
the same as the formula for independent samples and that the degrees of freedom is the number of
pairs minus one, not the total number of observations. After the t test has been obtained, the table in
Appendix D is consulted. The critical values for accepting and rejecting the null hypothesis for 4
degrees of freedom at .05 and .01 are 2.776 and 4.604 respectively. Since the obtained t value of
−2.96 falls outside of the critical values at the .05 level (−2.776 to +2.776), the null hypothesis is
rejected. There is a statistically significant difference between the means. The values are significantly
higher in the second observation. On the other hand, the obtained value for t falls inside the critical
values at the .01 level (−4.604 to +4.604); therefore, the null hypothesis is accepted at this level of
confidence.

A Major Concept: Every time a hypothesis is tested for a research situation, the following MUST
ALWAYS be applied:
(1) State the RESEARCH QUESTION.
(2) State the NULL HYPOTHESIS.
(3) Conduct the STATISTICAL ANALYSIS.
(4) Draw STATISTICAL CONCLUSIONS.
(5) Draw RESEARCH CONCLUSIONS.
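As a check, the dependent-samples calculation can be reproduced directly (a sketch using the before/after data above):

```python
import math
import statistics

before = [18, 15, 10, 12, 9]
after = [39, 30, 21, 10, 20]

# D = before - after for each pair
d = [b - a for b, a in zip(before, after)]   # [-21, -15, -11, 2, -11]

n = len(d)                                   # 5 pairs -> 4 degrees of freedom
mean_d = statistics.mean(d)                  # -56 / 5 = -11.2
se = math.sqrt((sum(x * x for x in d) - sum(d) ** 2 / n) / (n * (n - 1)))
t = mean_d / se

print(round(t, 2))  # outside ±2.776 (.05 level) but inside ±4.604 (.01 level)
```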
This chapter has been devoted to testing hypotheses and defining variables. The statistical tests
discussed were those used for testing differences between two means. These tests require interval
data and two independent or dependent samples drawn from a population or populations. A standard
error of the difference for the two samples is calculated, and a t test is obtained as a test of significance.
The researcher must decide what confidence level will be acceptable for the research study. A
decision of whether to accept or reject the null hypothesis is then made and research decisions drawn.
For a step-by-step explanation of the actual calculations, the student is encouraged to use the sequential
statistical steps chart at the end of the chapter. The statistical tests for three or more interval data
sample means will be explained in a later chapter.
EXERCISES - CHAPTER 9
(1) A firm used two methods of instruction for their management trainees. Test a null hypothesis
that there is no statistically significant difference in the two methods of instruction at .01 and
.05. Draw conclusions.

    Trainees' Scores for each Method

    Television-Programmed Instruction = 42, 69, 81, 52, 86, 91, 75, 87
    Lecture-Programmed Instruction = 93, 33, 67, 77, 81, 62, 75, 72
(3)
    STATE        PERCENT COLLEGE GRADS    STATE            PERCENT COLLEGE GRADS
    Alabama      12.6                     Massachusetts    25.4
    Arkansas      9.7                     Maine            21.6
    Florida      14.7                     Rhode Island     18.6
    Georgia      15.3                     New Jersey       13.6
    Louisiana    13.4                     New York         14.1

    Is there a statistically significant difference between these groupings of states with reference
    to percent college graduates? What are your research conclusions at the .01 and .05
    confidence levels?
(4) A group of twelve students participated in a speed reading course. They were given a reading
test before and after the course. Did the course make a statistically significant difference in
their reading skills? Test the null at both .01 and .05 confidence levels. Draw research
conclusions.

    Student    Words per Minute Before    Words per Minute After
    1          200                        300
    2          250                        305
    3          150                        155
    4          350                        400
    5          700                        1000
    6          200                        250
(5) Create a hypothetical research design using interval data for two groups and test a null
hypothesis at the .01 and .05 confidence levels. Draw research conclusions.

(6) Over a ten year period a group received different treatments for a medical problem. Is there a
statistically significant difference in the number of reoccurrences of the problem after each
treatment? Draw research conclusions.

    Group Member    Treatment G    Treatment H
    A               10             15
    B               12             12
    C               15             21
    D               11             16
    E                9             12