8/10/2019 Simple Comparative Experiments
19-Aug-14
Simple Comparative Experiments
Compare two conditions (sometimes called treatments)
Illustration
The tension bond strength of Portland cement mortar is an important
characteristic of the product. An engineer is interested in comparing
the strength of a modified formulation in which polymer latex
emulsions have been added during mixing to the strength of the
unmodified mortar. The experimenter has collected 10 observations
on strength for the modified formulation and another 10 observations
for the unmodified formulation. The data are shown in the table.
The two different formulations are referred to as two treatments or as
two levels of the factor formulations
Tension Bond Strength of Portland Cement
Modified Mortar    Unmodified Mortar
16.85              17.50
16.40              17.63
17.21              18.25
16.35              18.00
16.52              17.86
17.04              17.75
16.96              18.22
17.15              17.90
16.59              17.96
16.57              18.15
Basic Concepts
Each observation in the Portland cement experiment
would be called a run
The individual runs differ, so there is fluctuation, or noise,
in the results.
This noise is usually called experimental error or simply
error.
It is a statistical error, meaning that it arises from variation
that is uncontrolled and generally unavoidable.
The presence of error or noise implies that the response
variable, tension bond strength, is a random variable.
A random variable may be either discrete or continuous.
Graphical Description - Dot Plot
Mean Strength
Modified Mortar 16.764
Unmodified Mortar 17.922
Do the two samples differ by a
non-trivial amount?
Shows central tendency as
well as dispersion.
Graphical Description - Boxplot
Basic Concepts - Expected Value & Variance
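For reference, the mean and variance of a random variable y are usually defined as follows. The symbols f(y) (a probability density) and p(y_j) (a probability mass function) are notation assumed here, not taken from the slide.

```latex
\mu = E(y) =
\begin{cases}
\displaystyle\int_{-\infty}^{\infty} y\, f(y)\, dy & \text{$y$ continuous} \\[2ex]
\displaystyle\sum_{j} y_j\, p(y_j) & \text{$y$ discrete}
\end{cases}
\qquad
\sigma^2 = V(y) = E\left[(y - \mu)^2\right]
```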
Basic Concepts
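If $y_1$ and $y_2$ are independent, $V(y_1 \pm y_2) = \sigma_1^2 + \sigma_2^2$ and $E(y_1 y_2) = E(y_1)\,E(y_2) = \mu_1 \mu_2$
But, in general, $E(y_1 / y_2) \neq E(y_1) / E(y_2)$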
Sampling and Estimators
Most statistical methods assume that
random samples are used
If every element in the population has
an equal probability of being chosen,
then the procedure employed is called
random sampling
Estimators
An estimator of an unknown parameter
is a statistic that corresponds to that
parameter
Note that a point estimator is a random
variable
$\bar{y}$ is a point estimator of $\mu$, and $S^2$ is a
point estimator of $\sigma^2$
Properties of estimators
The point estimator should be unbiased.
The long-run average or expected value of
the point estimator should be the
parameter that is being estimated.
An unbiased estimator should have
minimum variance.
This property states that the minimum
variance point estimator has a variance
that is smaller than the variance of any
other estimator of that parameter.
The probability distribution of a
statistic is called a sampling
distribution
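The idea of an unbiased estimator and of a sampling distribution can be illustrated with a small simulation. This is only a sketch: the population values mu = 17.0 and sigma = 0.3, the sample size and the number of replications are hypothetical choices made for illustration, not values from the slides.

```python
import random
import statistics


def simulate_estimators(n=10, reps=20000, mu=17.0, sigma=0.3, seed=1):
    """Average three estimators over many random samples of size n.

    Returns the long-run averages of the sample mean, of the sample
    variance S^2 (divisor n - 1) and of the divisor-n variance.
    """
    rng = random.Random(seed)
    ybar_sum = s2_sum = biased_sum = 0.0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        ybar_sum += statistics.fmean(sample)
        s2 = statistics.variance(sample)      # divisor n - 1
        s2_sum += s2
        biased_sum += s2 * (n - 1) / n        # same sum of squares, divisor n
    return ybar_sum / reps, s2_sum / reps, biased_sum / reps


mean_ybar, mean_s2, mean_biased = simulate_estimators()
print(mean_ybar, mean_s2, mean_biased)
```

Under these assumptions the averages of the sample mean and of S^2 should settle near mu and sigma^2 = 0.09, while the divisor-n version averages lower, which is what "biased" means here.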
The Normal Distribution
Sample runs that differ as a result of experimental error
often are well described by the normal distribution
The Central Limit Theorem
This result states essentially that the sum of n independent and identically
distributed random variables is approximately normally distributed when n is large
We think of the error in an experiment as arising in an additive manner from
several independent sources; consequently, the normal distribution
becomes a plausible model for the combined experimental error
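A quick way to see the central limit theorem at work is to add up several non-normal terms and check how often the sum lands within 1.96 standard deviations of its mean. The sketch below assumes uniform(0, 1) error sources and 12 terms per sum; both are illustrative choices, not part of the slides.

```python
import math
import random


def coverage_of_sums(n_terms=12, reps=20000, seed=2):
    """Fraction of sums of uniform(0, 1) terms within 1.96 sd of the mean."""
    rng = random.Random(seed)
    mean = n_terms * 0.5                  # each term has mean 1/2
    sd = math.sqrt(n_terms / 12.0)        # each term has variance 1/12
    inside = 0
    for _ in range(reps):
        total = sum(rng.random() for _ in range(n_terms))
        if abs(total - mean) <= 1.96 * sd:
            inside += 1
    return inside / reps


fraction = coverage_of_sums()
print(fraction)
```

If the normal model is a reasonable description of the sum, the printed fraction should be close to 0.95.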
The chi-square distribution
If $z_1, z_2, \ldots, z_k$ are normally and independently distributed
random variables with mean 0 and variance 1, then the
random variable
$x = z_1^2 + z_2^2 + \cdots + z_k^2$
follows a chi-square distribution with k degrees of
freedom, with mean $\mu = k$ and variance $\sigma^2 = 2k$
The t distribution
If $z$ and $\chi_k^2$ are independent standard normal and chi-square
random variables, respectively, the random
variable
$t_k = \frac{z}{\sqrt{\chi_k^2 / k}}$
follows the t distribution with k degrees of freedom,
denoted $t_k$
The mean and variance of t are $\mu = 0$ and $\sigma^2 = k/(k-2)$
for k > 2
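If $y_1, y_2, \ldots, y_n$ is a random sample from a normal distribution with mean $\mu$, then $t = \frac{\bar{y} - \mu}{S / \sqrt{n}}$ is distributed as t with n - 1 degrees of freedom.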
The F distribution
If $\chi_u^2$ and $\chi_v^2$
are two independent chi-square random
variables with u and v degrees of freedom, respectively,
then the ratio
$F_{u,v} = \frac{\chi_u^2 / u}{\chi_v^2 / v}$
follows the F distribution with u numerator degrees of
freedom and v denominator degrees of freedom
Randomized Designs - Inferences About Means
Assume that a completely randomized experimental design
is used
In such a design, the data are viewed as if they were a random
sample from a normal distribution.
Recall the Portland cement data
A Model for the Data
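We often describe the results of an experiment with a model
$y_{ij} = \mu_i + \epsilon_{ij}$, i = 1, 2; j = 1, 2, ..., $n_i$
$y_{ij}$ is the jth observation from factor level i
$\mu_i$ is the mean of the response at the ith factor level
$\epsilon_{ij}$ is a normal random variable associated with the ijth
observation
We assume that the $\epsilon_{ij}$ are NID(0, $\sigma_i^2$), i = 1, 2
$\epsilon_{ij}$ is the random error component of the model
Because the means $\mu_1$ and $\mu_2$ are constants, the $y_{ij}$ are NID($\mu_i$, $\sigma_i^2$)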
Hypotheses
A statement either about the parameters of a probability distribution or the parameters of a model
Reflects some conjecture about the problem situation
In the Portland cement experiment, we may think that
The mean tension bond strength of the modified mortar formulation is equal to a certain value
The mean tension bond strengths of the two mortar formulations are equal
Null and alternative hypotheses, for example $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \neq \mu_2$
Type I error (its probability $\alpha$ is the significance level) and type II error (probability $\beta$)
Power = $1 - \beta$, the probability of rejecting the null hypothesis when it is false
The Portland Cement Example - Summary Statistics
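The summary statistics can be recomputed directly from the tension bond strength table. The sketch below uses only the Python standard library; the helper name `summarize` is hypothetical.

```python
import statistics

# Tension bond strength data from the table above
modified = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
unmodified = [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15]


def summarize(sample):
    """Sample size, mean, variance (divisor n - 1) and standard deviation."""
    return {
        "n": len(sample),
        "mean": statistics.fmean(sample),
        "variance": statistics.variance(sample),
        "stdev": statistics.stdev(sample),
    }


summary_modified = summarize(modified)
summary_unmodified = summarize(unmodified)
print(summary_modified)
print(summary_unmodified)
```

The means should agree with the 16.764 and 17.922 quoted on the dot plot slide.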
How the Two-Sample t-Test Works:
Values of t0 that are near zero are consistent with the null hypothesis
Values of t0 that are very different from zero are consistent with the
alternative hypothesis
t0 is a distance measure: how far apart the averages are, expressed
in standard deviation units
Notice the interpretation of t0 as a signal-to-noise ratio
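Assuming the statistic meant here is the usual pooled two-sample t statistic, it can be written as below, where $\bar{y}_i$, $S_i^2$ and $n_i$ are the mean, variance and size of sample i and $S_p^2$ is the pooled variance estimate. The numerator plays the role of the signal (the difference in averages) and the denominator the noise (the standard error of that difference).

```latex
t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
S_p^2 = \frac{(n_1 - 1)\, S_1^2 + (n_2 - 1)\, S_2^2}{n_1 + n_2 - 2}
```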
The Two-Sample (Pooled) t-Test
So far, we haven't really done any statistics
We need an objective basis for deciding how large the test statistic t0 really is
In 1908, W. S. Gosset derived the reference distribution for t0, called the t distribution
(Tables of the t distribution are in the textbook.)
t0 = -9.11
A value of t0 between -2.101 and +2.101 is consistent with equality of means
It is possible for the means to be equal and t0 to exceed +2.101 or fall below -2.101, but that
would be a rare event, so such a value leads to the conclusion that the means are different
Could also use the P-value approach
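The P-value is the probability, computed assuming the null hypothesis of equal means is true, of a test statistic at least as extreme as the one observed (it measures the rareness of the event); equivalently, it is the smallest significance level at which the null hypothesis would be rejected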
The P-value in our problem is P < 0.001 (Minitab reports P-Value = 0.000 for these data)
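A two-sided P-value for any observed t0 can be approximated without tables by integrating the t density numerically. The sketch below uses Simpson's rule; the function names and the number of integration steps are choices made here for illustration. It is evaluated at the critical value 2.101 with 18 degrees of freedom and at the T-Value of -9.11 that Minitab reports for the cement data.

```python
import math


def t_pdf(x, k):
    """Density of the t distribution with k degrees of freedom."""
    c = math.gamma((k + 1) / 2) / (math.sqrt(k * math.pi) * math.gamma(k / 2))
    return c * (1 + x * x / k) ** (-(k + 1) / 2)


def two_sided_p(t0, k, steps=2000):
    """Approximate P(|T| > |t0|) for T ~ t_k using Simpson's rule.

    steps must be even; the density is integrated over [0, |t0|].
    """
    b = abs(t0)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, k) + t_pdf(b, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, k)
    central = total * h / 3          # P(0 < T < |t0|)
    return 1 - 2 * central


p_at_critical = two_sided_p(2.101, 18)
p_at_minitab_t = two_sided_p(-9.11, 18)
print(p_at_critical, p_at_minitab_t)
```

The first value should sit very close to 0.05, which is what makes 2.101 the two-sided 5% cut-off for 18 degrees of freedom.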
Minitab Two-Sample t-Test Results
Two-Sample T-Test and CI: Modified Mortar, Unmodified Mortar
Two-sample T for Modified Mortar vs Unmodified Mortar
                    N    Mean  StDev  SE Mean
Modified Mortar    10  16.764  0.316     0.10
Unmodified Mortar  10  17.922  0.248    0.078
Difference = (Modified Mortar) - (Unmodified Mortar)
Estimate for difference: -1.158
95% CI for difference: (-1.426, -0.890)
T-Test of difference = 0 (vs not =): T-Value = -9.11  P-Value = 0.000  DF = 17
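The Minitab figures can be cross-checked from the raw data. The sketch below computes both the pooled statistic (18 degrees of freedom) and the unpooled one with Welch-Satterthwaite degrees of freedom; reading DF = 17 in the output as the truncated Welch value is an assumption based on the number alone. The function and key names are hypothetical.

```python
import math
import statistics

modified = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
unmodified = [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15]


def two_sample_t(y1, y2):
    """Pooled and unpooled (Welch) two-sample t statistics."""
    n1, n2 = len(y1), len(y2)
    diff = statistics.fmean(y1) - statistics.fmean(y2)
    v1, v2 = statistics.variance(y1), statistics.variance(y2)

    # Pooled: one common variance estimate, n1 + n2 - 2 degrees of freedom
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t_pooled = diff / math.sqrt(sp2 * (1 / n1 + 1 / n2))

    # Welch: separate variances, approximate degrees of freedom
    se2 = v1 / n1 + v2 / n2
    t_welch = diff / math.sqrt(se2)
    df_welch = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))

    return {
        "difference": diff,
        "t_pooled": t_pooled,
        "df_pooled": n1 + n2 - 2,
        "t_welch": t_welch,
        "df_welch": df_welch,
    }


result = two_sample_t(modified, unmodified)
print(result)
```

Because the two samples have the same size, the pooled and Welch statistics coincide; only the degrees of freedom differ.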
Assumptions of the t-test
Both samples are drawn from independent populations
Populations can be described by a normal distribution
The standard deviations (or variances) of both populations
are equal
The observations are independent random variables
The assumption of independence is critical, and if the run order
is randomized (and, if appropriate, other experimental units
and materials are selected at random) this assumption will
usually be satisfied
The equal-variance and normality assumptions are easy to
check using a normal probability plot
Probability Plot
Observations in the sample are first ranked from smallest to
largest
The ordered observations $y_{(j)}$ are then plotted against their
observed cumulative frequency (j - 0.5)/n.
If the hypothesized distribution adequately describes the data,
the plotted points will fall approximately along a straight line
Usually, the determination of whether or not the data plot as a
straight line is subjective
The assumption of equal population variances can be checked
by simply comparing the slopes of the two straight lines
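The plotting steps above can be sketched as follows for the modified mortar sample. The function name is hypothetical, and the third coordinate (the standard normal quantile of each cumulative frequency) is an addition that lets the points be drawn on ordinary linear axes instead of normal probability paper.

```python
from statistics import NormalDist


def normal_plot_points(sample):
    """Return (y_(j), (j - 0.5)/n, normal score) for each ordered observation."""
    n = len(sample)
    points = []
    for j, y in enumerate(sorted(sample), start=1):
        p = (j - 0.5) / n                  # observed cumulative frequency
        z = NormalDist().inv_cdf(p)        # matching standard normal quantile
        points.append((y, p, z))
    return points


modified = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
points = normal_plot_points(modified)
for y, p, z in points:
    print(f"{y:6.2f}  {p:5.2f}  {z:7.3f}")
```

Plotting y against z for each sample gives one line per formulation; roughly parallel lines are what the equal-variance check looks for.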
Normal Probability Plot for the Cement Example
What can you conclude?
Choice of sample size
Suppose we are testing the hypotheses $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \neq \mu_2$
and that the means are not equal, so that $\delta = \mu_1 - \mu_2 \neq 0$
The probability of type II error, $\beta$, depends on the true difference
in means $\delta$
A graph of $\beta$ versus $\delta$ for a particular sample size is called the operating characteristic curve, or O.C. curve, for the test
The $\beta$ error is also a function of sample size.
For a given value of $\delta$, the $\beta$ error decreases as the sample size increases
A specified difference in means is easier to detect for larger sample sizes than for smaller ones
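The dependence of the type II error on sample size can be illustrated with a normal approximation. This is a sketch under stated assumptions, not the O.C. curves of the t-test: it treats the common standard deviation sigma as known, uses equal sample sizes n per treatment, and the values delta = 0.25 and sigma = 0.25 are hypothetical.

```python
import math
from statistics import NormalDist


def approx_beta(delta, sigma, n, alpha=0.05):
    """Approximate type II error of the two-sided two-sample test.

    Normal (z) approximation with known common sigma and n runs per treatment.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = delta / (sigma * math.sqrt(2.0 / n))   # true difference in SE units
    return z.cdf(z_crit - shift) - z.cdf(-z_crit - shift)


sample_sizes = [5, 10, 20]
betas = [approx_beta(delta=0.25, sigma=0.25, n=n) for n in sample_sizes]
for n, beta in zip(sample_sizes, betas):
    print(n, round(beta, 3), "power:", round(1 - beta, 3))
```

For a fixed difference, the printed type II error shrinks as n grows, and with delta = 0 the function returns 1 - alpha, as expected.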
Operating characteristic curves for the two-sided t-test
with $\alpha = 0.05$
The greater the difference in means, $\mu_1 - \mu_2$, the smaller the probability of type II error for a given sample size and $\alpha$.
That is, for a specified sample size and $\alpha$, the test will detect large differences more easily than small ones.
As the sample size gets larger, the probability of type II error gets smaller for a given difference in means and $\alpha$.
That is, to detect a specified difference $\delta$, we may make the test more powerful by increasing the sample size.
$d = \frac{|\mu_1 - \mu_2|}{2\sigma} = \frac{|\delta|}{2\sigma}$
Example 1
Analysis of a random sample consisting of $n_1 = 20$ specimens
of cold rolled steel to determine yield strengths resulted in a
sample average strength of 29.8 ksi. A second random
sample of $n_2 = 25$ two-sided galvanized steel specimens gave
a sample average strength of 34.7 ksi. Assuming that the two
yield strength distributions are normal with $\sigma_1 = 4.0$ and $\sigma_2 = 5.0$, do the data indicate that the corresponding average
yield strengths $\mu_1$ and $\mu_2$ are different? Use a significance
level of $\alpha = 0.01$.
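One way to work Example 1 is a two-sample z test, treating the stated values 4.0 and 5.0 as known population standard deviations (an assumption about how the problem is meant to be read). The function name below is hypothetical.

```python
import math
from statistics import NormalDist


def two_sample_z(ybar1, ybar2, sigma1, sigma2, n1, n2, alpha):
    """Two-sided z test for a difference in means with known variances."""
    z0 = (ybar1 - ybar2) / math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    p_value = 2 * NormalDist().cdf(-abs(z0))
    return z0, z_crit, p_value


z0, z_crit, p_value = two_sample_z(29.8, 34.7, 4.0, 5.0, 20, 25, alpha=0.01)
reject = abs(z0) > z_crit
print(z0, z_crit, p_value, reject)
```

Under this reading the statistic falls well beyond the 1% critical value, so the data indicate that the two average yield strengths differ.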
Table of the t distribution
Example 2
A hardness testing machine presses a rod with a pointed tip into a metal specimen with known force. By measuring the depth of the depression caused by the tip, the hardness of the specimen is determined. Two different tips are available for the machine, and although the precision (variability) of the measurements seems to be the same, it is suspected that one tip produces different hardness readings from the other.
Ten specimens were tested and the hardness readings were
obtained as shown in the table.
What do you conclude?
Example 2 Data
Example 2 (contd.)
The metal specimens chosen for testing may come from different
bar stock produced in different heats, or may not be
exactly homogeneous in some other way that affects
hardness. This lack of homogeneity will contribute to the
variance and will inflate the experimental error.
Alternative Experimental Design for Example 2
Assume that each specimen is large enough so that two
hardness determinations may be made on it.
This alternative design would consist of dividing each
specimen into two parts, then randomly assigning one tip
to one-half of each specimen and the other tip to the
remaining half.
The order in which the tips are tested for a particular
specimen would also be randomly selected.
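The mathematical model would be $y_{ij} = \mu_i + \beta_j + \epsilon_{ij}$, i = 1, 2; j = 1, 2, ..., 10, where $\beta_j$ is the effect of the jth specimen on hardness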
The Paired Comparison Design (The Paired t-test)
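Testing $H_0: \mu_1 = \mu_2$ is equivalent to testing $H_0: \mu_d = 0$ versus $H_1: \mu_d \neq 0$, where $\mu_d = \mu_1 - \mu_2$ is the mean of the paired differences $d_j = y_{1j} - y_{2j}$
The test statistic would be $t_0 = \frac{\bar{d}}{S_d / \sqrt{n}}$, with $\bar{d}$ and $S_d$ the sample mean and standard deviation of the differences
$H_0$ would be rejected if $|t_0| > t_{\alpha/2,\,n-1}$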
Paired t-test Data
The Paired Comparison Design
Special case of a more general type of design called the randomized block design
Block refers to a relatively homogeneous experimental unit (in our case, the metal specimens
are the blocks)
The block represents a restriction on complete randomization because the treatment
combinations are only randomized within the block
Note that, although 2n = 2(10) = 20 observations have been taken, only n - 1 = 9
degrees of freedom are available for the t statistic
As the degrees of freedom for t increase, the test becomes more sensitive
By blocking or pairing, we have effectively lost n - 1 degrees of freedom, but we
hope we have gained a better knowledge of the situation by eliminating an additional source of variability (the difference between specimens).
Minitab Output
Two-Sample T-Test and CI: Tip 1, Tip 2
Two-sample T for Tip 1 vs Tip 2
        N  Mean  StDev  SE Mean
Tip 1  10  4.80   2.39     0.76
Tip 2  10  4.90   2.23     0.71
Difference = (Tip 1) - (Tip 2)
Estimate for difference: -0.10
95% CI for difference: (-2.28, 2.08)
T-Test of difference = 0 (vs not =): T-Value = -0.10  P-Value = 0.924  DF = 18
Both use Pooled StDev = 2.3154
Paired T-Test and CI: Tip 1, Tip 2
Paired T for Tip 1 - Tip 2
             N    Mean  StDev  SE Mean
Tip 1       10   4.800  2.394    0.757
Tip 2       10   4.900  2.234    0.706
Difference  10  -0.100  1.197    0.379
95% CI for mean difference: (-0.956, 0.756)
T-Test of mean difference = 0 (vs not = 0): T-Value = -0.26  P-Value = 0.798
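Both analyses can be reproduced from the summary statistics in the Minitab output, which shows why the paired interval is so much narrower. In the sketch below the critical value 2.101 (18 degrees of freedom) is the one quoted earlier for the pooled test, while 2.262 (9 degrees of freedom) is taken from a standard t table and is an assumption here; the helper name is hypothetical.

```python
import math


def t_and_ci(estimate, std_error, t_crit):
    """t statistic for H0: difference = 0 and the matching confidence interval."""
    t0 = estimate / std_error
    half_width = t_crit * std_error
    return t0, (estimate - half_width, estimate + half_width)


n = 10

# Unpaired analysis: pooled standard deviation of the 20 readings
se_two_sample = 2.3154 * math.sqrt(1 / n + 1 / n)
t_two_sample, ci_two_sample = t_and_ci(-0.10, se_two_sample, t_crit=2.101)

# Paired analysis: standard deviation of the 10 within-specimen differences
se_paired = 1.197 / math.sqrt(n)
t_paired, ci_paired = t_and_ci(-0.100, se_paired, t_crit=2.262)

print(t_two_sample, ci_two_sample)
print(t_paired, ci_paired)
```

The estimate of the difference is the same in both analyses; pairing only changes the standard error, because specimen-to-specimen variation is removed from the noise.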
Inferences About Variances
Sensitive to the normality assumption
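The single sample test of $H_0: \sigma^2 = \sigma_0^2$ versus $H_1: \sigma^2 \neq \sigma_0^2$
Test statistic: $\chi_0^2 = \frac{(n-1) S^2}{\sigma_0^2}$
The null hypothesis is rejected if
$\chi_0^2 > \chi^2_{\alpha/2,\,n-1}$ or $\chi_0^2 < \chi^2_{1-\alpha/2,\,n-1}$
... 0.01. Use an $\alpha = 0.05$
level of significance. What assumptions are required for
this test?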
Table of the Chi Sq distribution
Two Sample Test for Variance
Assumes normality of both populations
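Test statistic: $F_0 = \frac{S_1^2}{S_2^2}$
The null hypothesis $H_0: \sigma_1^2 = \sigma_2^2$ is rejected if
$F_0 > F_{\alpha/2,\,n_1-1,\,n_2-1}$ or $F_0 < F_{1-\alpha/2,\,n_1-1,\,n_2-1}$
Note that $F_{1-\alpha,\,v_1,\,v_2} = \frac{1}{F_{\alpha,\,v_2,\,v_1}}$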
Illustration
A chemical engineer is investigating the inherent
variability of two types of test equipment that can be used
to monitor the output of a production process. He
suspects that the old equipment, type 1, has a larger
variance than the new one. Two random samples of $n_1 = 12$
and $n_2 = 10$ observations are taken, and the sample variances are $S_1^2 = 14.5$ and $S_2^2 = 10.8$. What do you
conclude?
$F_{0.05,11,9} = 3.10$
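The illustration can be worked as a one-sided variance ratio test, since the engineer suspects that the type 1 variance is the larger one. The sketch below takes 14.5 and 10.8 as the two sample variances and uses the tabulated critical value 3.10 given above; the function name is hypothetical.

```python
def variance_ratio_test(s1_sq, s2_sq, f_crit):
    """One-sided F test of H0: sigma1^2 = sigma2^2 against H1: sigma1^2 > sigma2^2."""
    f0 = s1_sq / s2_sq
    return f0, f0 > f_crit


f0, reject = variance_ratio_test(14.5, 10.8, f_crit=3.10)
print(f0, reject)
```

Because the ratio stays below the critical value, these samples give no strong evidence that the old equipment has the larger variance.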