Statistics

Design of Experiment

An Introduction to Experimental Design Completely Randomized Designs Randomized Block Design Factorial Experiments

Contents

An Introduction to Experimental Design

Statistical studies can be classified as being either experimental or observational.

In an experimental study, one or more factors are controlled so that data can be obtained about how the factors influence the variables of interest.

In an observational study, no attempt is made to control the factors.

Cause-and-effect relationships are easier to establish in experimental studies than in observational studies.

A factor is a variable that the experimenter has selected for investigation.

A treatment is a level of a factor. Experimental units are the objects of

interest in the experiment.

Example:Suppose we want to examine if there is a

difference among working methods A, B, and C. 18 workers are randomly chosen.

factor is the working methods. treatment is one of A, B, and C.Experimental units are the workers chosen

A completely randomized design is an experimental design in which the treatments are randomly assigned to the experimental units.

If the experimental units are heterogeneous, blocking can be used to form homogeneous groups, resulting in a randomized block design.

Completely Randomized Design

Recall Simple Random Sampling Finite populations are often defined by lists

such as: Organization membership roster Credit card account numbers Inventory product numbers

A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected.

Replacing each sampled element before selecting subsequent elements is called sampling with replacement.

Sampling without replacement is the procedure used most often.

In large sampling projects, computer-generated random numbers are often used to automate the sample selection process.

Excel provides a function for generating random numbers in its worksheets.

Infinite populations are often defined by an ongoing process whereby the elements of the population consist of items generated as though the process would operate indefinitely.

A simple random sample from an infinite population is a sample selected such that the following conditions are satisfied. Each element selected comes from the same

population. Each element is selected independently.

Random Numbers: the numbers in the table are random, these four-digit numbers are equally likely.

Most experiments have critical error on random sampling.

Ex: sampling 8 samples from a production line in one day Wrong method:

Get one sample every 3 hours not random!

Ex: sampling 8 samples from a production line Correct method:

You can get one sample at each 3 hours interval but not every 3 hours correct but not a simple random sampling

Get 8 samples in 24 hours Maximum population is 24, getting 8 samples two digits 63, 27, 15, 99, 86, 71, 74, 45, 10, 21, 51, … Larger than 24 is discarded So eight samples are collected at:

15, 10, 21, … hour

In Completely Randomized Design, samples are randomly collected by simple random sampling method.

Only one factor is concerned in Completely Randomized Design, and k levels in this factor.

Data:Treatment Observations sample sample(level) 1 2 i … nj mean SD

1 x11 x12 x1i … s1

2 x21 x22 x1i … s2

j xj1 xj2 xji … sj

k xk1 xk2 xki … sk

1x11nx

MSTR 1

denominator is thedenominator is thedegrees of freedomdegrees of freedomassociated with SSTRassociated with SSTR

numerator is callednumerator is calledthe the sum of squares sum of squares duedueto treatmentsto treatments (SSTR) (SSTR)

Between-Treatments Estimate of Population Variance

The between-samples estimate of σ2 is referred to as the mean square due to treatments (MSTR).

Within-Treatments Estimate of Population Variance.

The second estimate of σ2, the within-samples estimate, is referred to as the mean square due to error (MSE).

denominator is denominator is thethedegrees of degrees of freedomfreedomassociated with associated with SSESSE

numerator is numerator is calledcalledthe the sum of sum of squaressquaresdue to errordue to error (SSE)(SSE)

ANOVA Table

Source ofSource ofVariationVariation

Sum ofSum ofSquaresSquares

Degrees ofDegrees ofFreedomFreedom

MeanMeanSquaresSquares FF

TreatmentsTreatments

ErrorError

TotalTotal

kk - 1 - 1

nnTT - 1 - 1

SSTRSSTR

SSESSE

SSTSST

nnT T - - kk

Completely Randomized Design--Example

Example: AutoShine, Inc. AutoShine, Inc. is considering marketing a

long-lasting car wax. Three different waxes (Type 1, Type 2, and Type 3) have been developed.

In order to test the durability of these waxes, 5 new cars were waxed with Type 1, 5 with Type 2, and 5 with Type 3. Each car was then repeatedly run through an automatic carwash until the wax coating showed signs of deterioration.

The number of times each car went through the carwash is shown on the next slide. AutoShine, Inc. must decide which wax to market. Are the three waxes equally effective?

1122334455

27273030292928283131

33332828313130303030

29292828303032323131

Sample MeanSample MeanSample VarianceSample Variance

ObservationObservationWaxWax

Type 1Type 1WaxWax

Type 2Type 2WaxWax

Type 3Type 3

2.52.5 3.3 3.3 2.5 2.529.029.0 30.4 30.4 30.030.0

Hypotheses

where: μ1 = mean number of washes for Type 1 wax

μ2 = mean number of washes for Type 2 wax

μ3 = mean number of washes for Type 3 wax

H0: 1=2=3

Ha: Not all the means are equal

Because the sample sizes are all equal:

MSE = 33.2/(15 - 3) = 2.77

MSTR = 5.2/(3 - 1) = 2.6

SSE = 4(2.5) + 4(3.3) + 4(2.5) = 33.2

SSTR = 5(29–29.8)2 + 5(30.4–29.8)2 + 5(30–29.8)2 = 5.2

Mean Square Error

Mean Square Between Treatments 1 2 3( )/ 3x x x x= (29 + 30.4 + 30)/3 = 29.8

Rejection Rule

where F.05 = 3.89 is based on an F distributionwith 2 numerator degrees of freedom and 12denominator degrees of freedom

p-Value Approach: Reject H0 if p-value < .05

Critical Value Approach: Reject H0 if F > 3.89

Test Statistic

There is insufficient evidence to conclude thatthe mean number of washes for the three waxtypes are not all the same.

Conclusion

F = MSTR/MSE = 2.6/2.77 = .939

The p-value is greater than .10, where F = 2.81. (Excel provides a p-value of .42.) Therefore, we cannot reject H0.

TreatmentsTreatmentsErrorError

TotalTotal

5.25.233.233.2

38.438.4

12122.602.602.772.77

.939.939

ANOVA Table

Minitab Outputs

One-way ANOVA: Observation versus Type

Source DF SS MS F PType 2 5.20 2.60 0.94 0.418Error 12 33.20 2.77Total 14 38.40

S = 1.663 R-Sq = 13.54% R-Sq(adj) = 0.00%

In Completely Randomized Experimental Design, the test for a difference among treatment means, we compute a F value by using the ratio

A problem can arise whenever differences due to extraneous factors (not considered in the experiment) cause the MSE become large.

Randomized Block Design

MSEMSTRF

In previous example, the time after wax is produced is not considered, but it could make differences.

To examine the effects on shift in a production line, shift is a factor but gender is not considered. It could also make differences.

To isolate the effects of a known effect caused by extraneous factors, one can “block” it by using the Randomized Block Design.

After blocking the extraneous effect, the value of MSE becomes smaller and the ability of F test increases.

ANOVA Procedure For a randomized block design the sum of squares

total (SST) is partitioned into three groups: sum of squares due to treatments, sum of squares due to blocks, and sum of squares due to error.

The total degrees of freedom, nT - 1, are partitioned such that k - 1 degrees of freedom go to treatments, b - 1 go to blocks, and (k - 1)(b - 1) go to the error term.

SST = SSTR + SSBL + SSESST = SSTR + SSBL + SSE

ErrorError

TotalTotal

kk - 1 - 1

nnTT - 1 - 1

SSTRSSTR

SSESSE

SSTSST

ANOVA Table

BlocksBlocks SSBLSSBL bb - 1 - 1

((k k – 1)(– 1)(bb – 1) – 1)

SSBLMSBL -1b

Example: Crescent Oil Co.

Crescent Oil has developed three new blends of gasoline and must decide which blend or blends to produce and distribute. A study of the miles per gallon ratings of the three blends is being conducted to determine if the mean ratings are the same for the three blends.

Five automobiles have been tested using each of the three gasoline blends and the miles per gallon ratings are shown on the next slide.

29.8 29.8 28.8 28.8 28.4 28.4TreatmentTreatmentMeansMeans

1122334455

31313030292933332626

30302929292931312525

30302929282829292626

30.33330.33329.33329.33328.66728.66731.00031.00025.66725.667

Type of Gasoline (Treatment)Type of Gasoline (Treatment) BlockBlockMeansMeansBlend XBlend X Blend YBlend Y Blend ZBlend Z

AutomobileAutomobile(Block)(Block)

Mean Square Due to Error

MSE = 5.47/[(3 - 1)(5 - 1)] = .68

SSE = 62 - 5.2 - 51.33 = 5.47MSBL = 51.33/(5 - 1) = 12.8

SSBL = 3[(30.333 - 29)2 + . . . + (25.667 - 29)2] = 51.33

MSTR = 5.2/(3 - 1) = 2.6 SSTR = 5[(29.8 - 29)2 + (28.8 - 29)2 + (28.4 - 29)2] = 5.2

The overall sample mean is 29. Thus, Mean Square Due to Treatments

Mean Square Due to Blocks

ErrorError

TotalTotal

5.205.20

5.475.47

62.0062.00

2.602.60

.68.68

3.823.82

ANOVA Table

BlocksBlocks 51.3351.33 12.8012.8044

Rejection Rule

For = .05, F.05 = 4.46 (2 d.f. numerator and 8 d.f. denominator)

p-Value Approach: Reject H0 if p-value < .05

Critical Value Approach: Reject H0 if F > 4.46

Conclusion

There is insufficient evidence to conclude thatthe miles per gallon ratings differ for the threegasoline blends.

The p-value is greater than .05 (where F = 4.46) and less than .10 (where F = 3.11). (Excel provides a p-value of .07). Therefore, we cannot reject H0.

F = MSTR/MSE = 2.6/.68 = 3.82

Test Statistic

Factorial Experiments

In some experiments we want to draw conclusions about more than one variable or factor.

Factorial experiments and their corresponding ANOVA computations are valuable designs when simultaneous conclusions about two or more factors are required.

Factorial Experiments The term factorial is used because the

experimental conditions include all possible combinations of the factors.

For example, for a levels of factor A and b levels of factor B, the experiment will involve collecting data on ab treatment combinations.

ANOVA ProcedureThe ANOVA procedure for the two-factor

factorial experiment is similar to the completely randomized experiment and the randomized block experiment

We again partition the sum of squares total (SST) into its sources.

SST = SSA + SSB + SSAB + SSESST = SSA + SSB + SSAB + SSE

Two-Factor Factorial Experiment

SST = SSA + SSB + SSAB + SSESST = SSA + SSB + SSAB + SSE

The total degrees of freedom, nT - 1, are partitioned such that (a – 1) d.f go to Factor A, (b – 1) d.f go to Factor B, (a – 1)(b – 1) d.f. go to Interaction, and ab(r – 1) go to Error.

SSAMSA -1a

MSAMSE

Factor AFactor A

ErrorError

TotalTotal

aa - 1 - 1

nnTT - 1 - 1

SSASSA

SSESSE

SSTSST

Factor BFactor B SSBSSB bb - 1 - 1

abab((rr – 1) – 1)

SSEMSE ( 1)ab r

SSBMSB -1b

InteractionInteraction SSABSSAB ((a a – 1)(– 1)(bb – 1) – 1) SSABMSAB ( 1)( 1)a b

MSBMSE

MSABMSE

Step 3 Compute the sum of squares for factor B

1 1 1SST = ( )

ijki j k

1SSA = ( . )

br x x

1SSB = ( . )

ar x x

Step 1 Compute the total sum of squares

Step 2 Compute the sum of squares for factor A

Step 4 Compute the sum of squares for interaction

1 1SSAB = ( . . )

ij i ji j

r x x x x

SSE = SST – SSA – SSB - SSABSSE = SST – SSA – SSB - SSAB

Step 5 Compute the sum of squares due to error

Example: State of Ohio Wage Survey A survey was conducted of hourly wages for a

sample of workers in two industries at three locations in Ohio. Part of the purpose of the survey was to determine if differences exist in both industry type and location. The sample data are shown on the next slide.

Example: State of Ohio Wage Survey

Industry Cincinnati Cleveland ColumbusI 12.10 11.80 12.90I 11.80 11.20 12.70I 12.10 12.00 12.20II 12.40 12.60 13.00II 12.50 12.00 12.10II 12.00 12.50 12.70

Factors Factor A: Industry Type (2 levels) Factor B: Location (3 levels)

Replications Each experimental condition is repeated 3

Factor AFactor A

ErrorError

TotalTotal

.50.50

1.431.43

3.423.42

.50.50

.12.12

4.194.19

ANOVA Table

Factor BFactor B 1.121.12 .56.5622InteractionInteraction .37.37 .19.1922

4.694.691.551.55

Conclusions Using the p-Value Approach Industries: p-value = .06 > = .05

Mean wages do not differ by industry type. Locations: p-value = .03 < = .05

Mean wages differ by location. Interaction: p-value = .25 > = .05

Interaction is not significant.(p-values were found using Excel)

Conclusions Using the Critical Value ApproachIndustries: F = 4.19 < F = 4.75

Mean wages do not differ by industry type.Locations: F = 4.69 > F = 3.89

Mean wages differ by location.Interaction: F = 1.55 < F = 3.89

Interaction is not significant.

Statistics

Documents

QBM117 Business Statistics Descriptive Statistics

Statistics Portugal Economic Statistics Department

Descriptive Statistics - StatPlus · Descriptive Statistics The DESCRIPTIVE STATISTICS procedure displays univariate summary statistics for selected variables. Descriptive statistics

Jose Miguel Esparza - eternal-todo.com · 2016. 2. 7. · Socks5 Free Formgrabber $500 Keylogger $200 TeamViewer $500. Statistics. Statistics. Statistics. Statistics. Statistics

OFFICIAL STATISTICS PEOPLE’S WELFARE STATISTICS

Classical Statistics and Quantum Statistics

Medical Statistics Part-I:Descriptive statistics

Statistics Management - Cisco · Statistics Management Thischapterincludesthefollowingsections: • StatisticsManagement,page1 Statistics Management

STATISTICS 7 Basic Statistics

Statistics Introduction to Statistics. Section 1.1 An Overview of Statistics

MSc Statistics MSc Statistics (Financial Statistics) MSc

Transport Statistics 2016 - NSO Home · PDF fileTransport Statistics 2016. – Valletta: National Statistics Ofﬁ ce, ... Transport and Agriculture Statistics National Statistics

Statistics (Medical Statistics) MSc - ReportLab · Statistics (Medical Statistics) MSc / Medical statistics is a fundamental scientific component of health research. Medical statisticians

Statistics Tutor,Statistics Help,Statistics Homework Tutoring,Statistics Homework help,Statistics Quiz help,Statistics Assignment help,Chemistry Homework help,Chemistry Tutoring,Physics

Marriage Statistics - Statistics Botswana

Statistics U.S. Department of Labor – Bureau of Labor Statistics Statistics

Maritime ports freight and passenger statistics Statistics ... · Maritime ports freight and passenger statistics Statistics Explained Source : Statistics Explained ... Antwerpen,

Statistics 222, Spatial Statistics

2010 Statistics 2010 Statistics

Water statistics Statistics Explained - European Commissionec.europa.eu/eurostat/statistics-explained/pdfscache/1182.pdf · Water statistics Statistics Explained Source : Statistics