INTRODUCTION TO EXPERIMENTAL DESIGN

TUTORIAL 3 FACTORIAL DESIGNS

1. $2^k$ Factorial Designs, k = 2
1.1 $2^k$ Factorial Designs, k = 3, 4 and 5
2. Factorial Designs in Blocks and Confounding.
3. Fractional Factorial ($2^{k-p}$) Designs.
4. Explanation of ANOVA Tables for Factorial and Fractional Factorial Experiments.
5. Visual Aids to Interpret Effects and Interactions.
5.1 Pareto Charts and Response Surface Plots.
5.2 Normal Probability Plots and Normal Probability of Residual Plots.
5.3 Interaction Plots.

1. $2^k$ Factorial designs, k = 2

The first step in any factorial design is to define the lower and upper levels of the factors to be investigated. Generally the upper level is coded as +1 while the lower level is coded as -1. A complete coded design with replicates (repeats) is given below in Table 1.

| Run | A | B | AB | Replicate 1 | Replicate 2 | … | Replicate n | Total | Average |
|---|---|---|---|---|---|---|---|---|---|
| 1 | +1 | -1 | -1 | Y11 | Y12 | … | Y1n | Y1• | ȳ1• |
| 2 | -1 | -1 | +1 | Y21 | Y22 | … | Y2n | Y2• | ȳ2• |
| 3 | +1 | +1 | +1 | Y31 | Y32 | … | Y3n | Y3• | ȳ3• |
| 4 | -1 | +1 | -1 | Y41 | Y42 | … | Y4n | Y4• | ȳ4• |

Table 1. General design matrix for a $2^2$ factorial design. The AB column is the row-wise product of the A and B columns.

In terms of analysis of the experiment, a general "algorithm" will be defined that starts with a definition of the effects. In general, an effect is defined as

$$\text{Effect} = \frac{1}{n\,2^{k-1}}\,\text{Contrast}_i \qquad \text{where} \qquad \text{Contrast}_i = \sum y(\text{high}) - \sum y(\text{low}).$$

An example of the determination of the effect of A is given below, where the responses with A at its low level are subtracted from the responses with A at its high level. The "algorithm" will be illustrated using an example that can also be found in the accompanying spreadsheet. Table 2 gives the results of a $2^2$ design that was replicated twice.

| Run | Factor A | Factor B | Replicate 1 | Replicate 2 | Total | Average |
|---|---|---|---|---|---|---|
| 1 | +1 | -1 | 831 | 824 | 1655 | 827.5 |
| 2 | -1 | -1 | 607 | 582 | 1189 | 594.5 |
| 3 | +1 | +1 | 1149 | 1129 | 2278 | 1139 |
| 4 | -1 | +1 | 854 | 794 | 1648 | 824 |

Table 2. Simple example of a replicated (columns 1 and 2) $2^2$ experimental design. Effect of oxygen flow rate on oxide thickness of silicon wafers.

This particular experiment was designed to study the effect of oxygen flow rate and oxidation time on the oxide thickness formed during oxidation of silicon wafers used in integrated circuit design. Factor A in Table 2 is the oxygen flow rate and Factor B is the oxidation time. In terms of the suggested flow of data analysis, Schematic 1 illustrates the steps to be followed.

Estimate Effects → Determine Sum of Squares → Determine Mean Sum of Squares → ANOVA → Regression Model

Schematic 1. General process to follow in DoE analysis.

The effect of a particular factor can be calculated as shown below.

$$\text{Main Effect}(A) = \frac{y_{1\bullet} + y_{3\bullet} - y_{2\bullet} - y_{4\bullet}}{2^{k-1}\,n} = \frac{1096}{4} = 274$$

$$\text{Main Effect}(B) = \frac{y_{3\bullet} + y_{4\bullet} - y_{1\bullet} - y_{2\bullet}}{2^{k-1}\,n} = \frac{1082}{4} = 270.5$$

Another way of determining the effects is to denote the high level of any treatment by the corresponding lower case letter. In the case of the $2^2$ design, a represents factor A at the high level and B at the low level, while b denotes B at the high level and A at the low level. By convention, (1) is used to denote both factors at the low level.
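The effect calculation can be sketched in a few lines of Python using the Table 2 data. This is only an illustration, not the accompanying spreadsheet: the names `observations`, `contrast` and `effect` are chosen here for the example, and the AB interaction is computed with the product of the A and B signs.

```python
# Replicate observations from Table 2, keyed by the coded levels (A, B).
observations = {
    (+1, -1): [831, 824],    # run 1
    (-1, -1): [607, 582],    # run 2
    (+1, +1): [1149, 1129],  # run 3
    (-1, +1): [854, 794],    # run 4
}

k = 2  # number of factors
n = 2  # replicates per run


def contrast(sign_of):
    """Sum of run totals, each weighted by the sign returned by sign_of(A, B)."""
    return sum(sign_of(a, b) * sum(ys) for (a, b), ys in observations.items())


def effect(sign_of):
    """Effect = Contrast / (n * 2^(k-1))."""
    return contrast(sign_of) / (n * 2 ** (k - 1))


contrast_A = contrast(lambda a, b: a)
contrast_B = contrast(lambda a, b: b)
contrast_AB = contrast(lambda a, b: a * b)

effect_A = effect(lambda a, b: a)
effect_B = effect(lambda a, b: b)
effect_AB = effect(lambda a, b: a * b)

print(contrast_A, contrast_B, contrast_AB)
print(effect_A, effect_B, effect_AB)
```

The contrasts come out as 1096, 1082 and 164, which are the values used in the sums of squares later in this section.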

[Figure 2: square with corners labelled (1), a, b and ab, and arrows along the axes A and B.]

Figure 2. Example of the factor space associated with a $2^2$ factorial design. The space can be thought of as a square in factor space.

In Figure 2 the $2^2$ design is shown in terms of lowercase letters. The arrows depict the transition from low to high for the factors A and B. In this design ab denotes the treatment combination with both A and B at their high levels. In addition, (1), a, b and ab each denote the total of all n replicates at that treatment combination. Using this convention, the effects for A, B and their interaction AB can be written as follows:

$$A = \frac{1}{2n}\left[ab + a - b - (1)\right] \tag{1}$$

$$B = \frac{1}{2n}\left[ab + b - a - (1)\right] \tag{2}$$

$$AB = \frac{1}{2n}\left[ab + (1) - a - b\right] \tag{3}$$

Note that the contrast for each factor is found inside the brackets of each equation. The derivation of these equations is illustrated below using the A effect as an example.

$$A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{ab + a}{2n} - \frac{b + (1)}{2n} = \frac{1}{2n}\left[ab + a - b - (1)\right] \tag{4}$$

Note that the contrast is again found between the square brackets. Once the effects are calculated one can proceed to the next step which is the derivation of the sums of squares. The determination of the effects can already start to give indications about the magnitude of the influence of certain factors.

Figure 3. Graphic exhibition of effects for wafer experiment.

The next step involves the derivation of the sums of squares.

$$SS_A = \frac{(\text{Contrast}_A)^2}{2^k\,n} = \frac{1096^2}{2^2 \cdot 2} = 150152 \tag{5}$$

$$SS_B = \frac{(\text{Contrast}_B)^2}{2^k\,n} = \frac{1082^2}{8} = 146340 \tag{6}$$

$$SS_{AB} = \frac{(\text{Contrast}_{AB})^2}{2^k\,n} = \frac{164^2}{8} = 3362 \tag{7}$$

$$SS_{Model} = SS_A + SS_B + SS_{AB} \tag{8}$$

$$SS_{Total} = SS_{Model} + SS_{Error} \tag{9}$$

The sum of squares has been calculated using the information in Table 2. Once the contrast has been defined for each factor, this can be used to calculate the sum of squares. Note that the error sum of squares is obtained by subtracting the model sum of squares from the total sum of squares. The total sum of squares is perhaps not immediately obvious in its derivation but the procedure to obtain it is illustrated below.

$$SS_{Total} = y_{11}^2 + y_{12}^2 + y_{21}^2 + \dots + y_{42}^2 - \frac{(y_{1\bullet} + y_{2\bullet} + y_{3\bullet} + y_{4\bullet})^2}{2^2\,n} = 831^2 + 824^2 + 607^2 + \dots + 794^2 - \frac{(1655 + 1189 + 2278 + 1648)^2}{8} = 302191$$

Observe that the square of the grand total, divided by the total number of observations, is subtracted from the sum of the squared responses. Once the sum of squares has been calculated for each factor and interaction, the most tedious calculations have been completed and one can move on to the next step, in which the mean sum of squares is calculated. The only important thing to remember here is the number of degrees of freedom by which each sum of squares is divided to obtain the mean. For the $2^2$ design described here, the following equations show how to obtain the mean sum of squares and how to calculate the degrees of freedom required.

$$MS_A = \frac{SS_A}{\text{level} - 1} \tag{10}$$

$$MS_{Error} = \frac{SS_{Error}}{2^k\,(n - 1)} \tag{11}$$

In $2^k$ designs the level is always 2. Next the analysis of variance (ANOVA) can proceed. The statistic most frequently employed is the F statistic. The $F_0$ values are simply the mean square of each factor and interaction divided by the mean square of the error. This indicates how large the variation attributed to a specific factor or interaction is relative to the error variance (the mean square of the error). If $F_0 > F_{\alpha,1,4(n-1)}$ the factor or interaction term is statistically significant. The α value denotes the significance level at which the evaluation is conducted. An α value of 0.05, for example, means that if $F_0 > F_{\alpha,1,4(n-1)}$ the effect is declared significant at the 95% confidence level, that is, an $F_0$ this large would arise from random variation alone less than 5% of the time. In addition to the F-value, the Student t-test and the p-value are often included in the ANOVA table.
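As a rough sketch of these steps for the wafer data, the sums of squares, mean squares and $F_0$ values can be computed as follows. The function and variable names are illustrative only. The critical value 7.71 is the tabulated $F_{0.05,1,4}$ and is hard-coded here as an assumption rather than computed.

```python
# Replicate observations for the four runs of Table 2: (A, B) -> [y1, y2].
observations = {
    (+1, -1): [831, 824],
    (-1, -1): [607, 582],
    (+1, +1): [1149, 1129],
    (-1, +1): [854, 794],
}
k, n = 2, 2
N = n * 2 ** k  # total number of observations


def sum_of_squares(sign_of):
    """SS = Contrast^2 / (n * 2^k) for the effect whose signs sign_of returns."""
    c = sum(sign_of(a, b) * sum(ys) for (a, b), ys in observations.items())
    return c ** 2 / N


ss = {
    "A": sum_of_squares(lambda a, b: a),
    "B": sum_of_squares(lambda a, b: b),
    "AB": sum_of_squares(lambda a, b: a * b),
}

# Total SS: sum of squared responses minus (grand total)^2 / N.
all_y = [y for ys in observations.values() for y in ys]
ss_total = sum(y ** 2 for y in all_y) - sum(all_y) ** 2 / N
ss_error = ss_total - sum(ss.values())

df_effect = 2 - 1            # levels - 1
df_error = 2 ** k * (n - 1)  # 4 for this experiment
ms_error = ss_error / df_error

F_CRITICAL = 7.71  # tabulated F(0.05; 1, 4), hard-coded assumption
for name, value in ss.items():
    f0 = (value / df_effect) / ms_error
    verdict = "significant" if f0 > F_CRITICAL else "not significant"
    print(f"{name}: SS = {value:.1f}, F0 = {f0:.2f} ({verdict} at alpha = 0.05)")
print(f"Error: SS = {ss_error:.1f}, MS = {ms_error:.2f}")
print(f"Total: SS = {ss_total:.1f}")
```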

| Source of Variation | SS | df | MS | F0 |
|---|---|---|---|---|
| Factor A | 150152 | 1 | 150152 | 257 |
| Factor B | 146340 | 1 | 146340 | 250 |
| Interaction AB | 3362 | 1 | 3362 | 5.75 |
| Error | 2337 | 4 | 584 | |
| Total | 302191 | 7 (N-1) | | |

Table 3. Analysis of variance for $2^2$ factorial experiment on wafer oxidation.

From the ANOVA table it appears that both the A and B factors are significant. The interaction of the two factors, AB, is comparatively unimportant: its $F_0$ of 5.75 is below the tabulated value $F_{0.05,1,4} = 7.71$. Once the statistical evaluation has been concluded there remains the option of constructing a model of the process. The derivation of a regression model for the oxidation experiment is given below.

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + e = \beta_0 + \frac{\text{Effect}_A}{2} x_1 + \frac{\text{Effect}_B}{2} x_2 + \frac{\text{Effect}_{AB}}{2} x_1 x_2 + e$$

$$\hat{y} = 846.25 + \frac{274}{2} x_1 + \frac{270.5}{2} x_2 = 846.25 + 137\,x_1 + 135.25\,x_2 \tag{12}$$

Here $x_1$ and $x_2$ are the coded levels of factors A and B, and 846.25 is the grand average of all eight observations. The interaction term is left out of the fitted model.

The derived regression model is then tested to see to what extent it models the observed data (Figure 4).

[Figure 4: scatter plot of calculated values and experimental observations, with trend line y = 0.9888x + 9.4882, R² = 0.9888.]

Figure 4. Regression model fit of observed data for wafer experiment.
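To see what the fitted model predicts at the four design points, a minimal sketch is given below. It assumes the model with only the two main effects (intercept 846.25 and half-effects 137 and 135.25) and the Table 2 observations. The helper name `predict` is illustrative.

```python
# Coded levels and replicate observations from Table 2.
runs = [
    {"x1": +1, "x2": -1, "y": [831, 824]},
    {"x1": -1, "x2": -1, "y": [607, 582]},
    {"x1": +1, "x2": +1, "y": [1149, 1129]},
    {"x1": -1, "x2": +1, "y": [854, 794]},
]

# Coefficients: grand mean and half of each main effect.
b0, b1, b2 = 846.25, 274 / 2, 270.5 / 2


def predict(x1, x2):
    """Fitted oxide thickness for coded levels x1 (factor A) and x2 (factor B)."""
    return b0 + b1 * x1 + b2 * x2


fitted = [predict(r["x1"], r["x2"]) for r in runs]
residuals = [[y - f for y in r["y"]] for r, f in zip(runs, fitted)]

for r, f, res in zip(runs, fitted, residuals):
    print(r["x1"], r["x2"], f, res)
```

Plotting the fitted values against the observations gives the kind of comparison shown in Figure 4.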

Figure 4 shows the correlation between the observed values and the calculated values using the regression model. The first approach is always to plot the data and to see to what extent it correlates. Another important diagnostic check for the regression model is to use the adjusted $R^2$. The derivation of the adjusted $R^2$ is given below.

$$R^2_{Adj} = 1 - \frac{SS_{Error}/df_{error}}{SS_{total}/df_{total}} = 1 - \frac{2337/4}{302191/7} = 0.9865 \tag{13}$$
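The adjusted $R^2$ can be checked with a short calculation. The sketch below takes its inputs from Table 3. The function names are illustrative, and the ordinary $R^2$ is included only for comparison.

```python
def adjusted_r_squared(ss_error, df_error, ss_total, df_total):
    """Adjusted R^2 = 1 - (SS_error / df_error) / (SS_total / df_total)."""
    return 1 - (ss_error / df_error) / (ss_total / df_total)


def r_squared(ss_error, ss_total):
    """Ordinary R^2 = 1 - SS_error / SS_total, shown for comparison."""
    return 1 - ss_error / ss_total


# Values taken from the ANOVA table (Table 3).
r2_adj = adjusted_r_squared(ss_error=2337, df_error=4, ss_total=302191, df_total=7)
r2 = r_squared(ss_error=2337, ss_total=302191)

print(round(r2_adj, 4), round(r2, 4))
```

The adjusted value is slightly lower than the ordinary $R^2$ because it penalises the degrees of freedom used by the model.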

This value suggests that the derived regression model is a good fit of the observed data.

1.2 $2^k$ Factorial Designs, k = 3, 4 and 5

The $2^2$ factorial design serves as a model for designs with more factors, and the accompanying spreadsheet shows how to obtain larger designs by iteration. Following on from the $2^2$ design, the $2^3$ design is evaluated using exactly the same process or "algorithm" as presented earlier, but including an additional factor and interaction terms. The $2^3$ design can be visualized as a cube in experimental space. Figure 5 illustrates this.

[Figure 5: cube with axes A, B and C and corners labelled (1), a, b, ab, c, ac, bc and abc.]

Figure 5. An example of a $2^3$ factorial design space. The design can be visualised as a cube in three dimensional factor space.

As in the $2^2$ design, the most important step is the estimation of the effects. The derivation for the A factor effect is given below.

$$A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{a + ab + ac + abc}{4n} - \frac{(1) + b + c + bc}{4n} = \frac{1}{4n}\left[a + ab + ac + abc - (1) - b - c - bc\right] \tag{14}$$

All the values in which A is at its low level should be subtracted from the sum of those values at which A is high. The contrast can in general be derived for a factorial experiment with any number of factors and the derivation in general is written as follows:

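$$\text{Contrast}_{AB \dots K} = (a \pm 1)(b \pm 1) \cdots (k \pm 1) \tag{15}$$

This leads to the following operational equations to calculate the contrast of each effect and interaction for a $2^3$ design. In the expanded form the 1 is replaced by (1).

$$\begin{aligned}
A &= (a - 1)(b + 1)(c + 1) = abc + ab + ac + a - bc - b - c - (1)\\
B &= (a + 1)(b - 1)(c + 1) = abc + ab + bc + b - ac - a - c - (1)\\
C &= (a + 1)(b + 1)(c - 1) = abc + ac + bc + c - ab - a - b - (1)\\
AB &= (a - 1)(b - 1)(c + 1) = abc + ab + c + (1) - ac - bc - a - b\\
AC &= (a - 1)(b + 1)(c - 1) = abc + ac + b + (1) - ab - bc - a - c\\
BC &= (a + 1)(b - 1)(c - 1) = abc + bc + a + (1) - ab - ac - b - c\\
ABC &= (a - 1)(b - 1)(c - 1) = abc + a + b + c - ab - ac - bc - (1)
\end{aligned}$$

The "trick" lies in choosing the sign in Equation 15. Consider for example a $2^3$ factorial experiment with factors A, B and C. If a factor is not included in the effect whose contrast is being determined, the sign of the 1 in its bracket is positive. If the factor is included, the sign is negative. Once the contrasts have been defined the effects and sums of squares can be determined.

$$\text{Effect}_{AB \dots K} = \frac{2}{n\,2^k}\,\text{Contrast}_{AB \dots K} \tag{16}$$

$$SS_{AB \dots K} = \frac{1}{n\,2^k}\left(\text{Contrast}_{AB \dots K}\right)^2 \tag{17}$$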
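The sign rule of Equation 15 can be expanded mechanically. The sketch below is one possible way to do this for a $2^3$ design. The helper names `contrast_terms` and `as_text` are invented for the illustration. From each bracket it picks either the letter or the constant, and flips the sign whenever it picks the -1 of a factor that belongs to the effect.

```python
from itertools import product

FACTORS = "abc"  # lowercase letters of the 2^3 design


def contrast_terms(effect):
    """Expand (a +/- 1)(b +/- 1)(c +/- 1) for an effect such as "A" or "BC".

    A factor named in the effect gets a minus sign in its bracket, any other
    factor gets a plus sign. Returns {treatment combination: +1 or -1}.
    """
    terms = {}
    # From each bracket pick either the letter (True) or the constant 1 (False).
    for picks in product([True, False], repeat=len(FACTORS)):
        label = "".join(f for f, picked in zip(FACTORS, picks) if picked) or "(1)"
        sign = 1
        for factor, picked in zip(FACTORS, picks):
            if not picked and factor.upper() in effect:
                sign = -sign  # the constant picked from this bracket is -1
        terms[label] = sign
    return terms


def as_text(terms):
    """Render a contrast as a signed sum of treatment combinations."""
    return " ".join(("+ " if s > 0 else "- ") + t for t, s in terms.items())


for effect in ["A", "B", "C", "AB", "AC", "BC", "ABC"]:
    print(f"{effect:>3} = {as_text(contrast_terms(effect))}")
```

Each printed line should agree, term by term, with the corresponding expansion for the $2^3$ design.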

Alternative methods to derive these are given next. These effects may be calculated by establishing a design matrix. The design matrix for a 23 factorial experiment is given below.

| Run | A | B | AB | C | AC | BC | ABC | Responses |
|---|---|---|---|---|---|---|---|---|
| 1 | -1 | -1 | +1 | -1 | +1 | +1 | -1 | (1) |
| 2 | +1 | -1 | -1 | -1 | -1 | +1 | +1 | a |
| 3 | -1 | +1 | -1 | -1 | +1 | -1 | +1 | b |
| 4 | +1 | +1 | +1 | -1 | -1 | -1 | -1 | ab |
| 5 | -1 | -1 | +1 | +1 | -1 | -1 | +1 | c |
| 6 | +1 | -1 | -1 | +1 | +1 | -1 | -1 | ac |
| 7 | -1 | +1 | -1 | +1 | -1 | +1 | -1 | bc |
| 8 | +1 | +1 | +1 | +1 | +1 | +1 | +1 | abc |

Table 4. Coded $2^3$ factorial design.

In this design matrix, the upper level of a particular factor is denoted by +1 and the lower level by -1. This must be compared to the previous notation in which a lower case letter was used to denote the upper level of a particular factor. This design has the property of being orthogonal, that is, it allows the estimation of the average effect of a factor without fear that the results will be distorted by the effects of other factors. To check orthogonality, the columns A and B of the design matrix above can be used. The corresponding row values of the two columns are multiplied together and the products are summed. If the sum is equal to zero, the columns are said to be orthogonal.

$$\text{Sum} = (-1)(-1) + (+1)(-1) + (-1)(+1) + (+1)(+1) + (-1)(-1) + (+1)(-1) + (-1)(+1) + (+1)(+1) = 1 - 1 - 1 + 1 + 1 - 1 - 1 + 1 = 0$$
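The orthogonality check can be repeated for every pair of columns at once. The following sketch rebuilds the coded $2^3$ design in the run order of Table 4 and tests all pairs. The names `column` and `dot` are illustrative.

```python
from itertools import combinations, product

FACTORS = ["A", "B", "C"]

# Runs in standard order: A changes fastest, as in Table 4.
runs = [dict(zip(FACTORS, reversed(levels)))
        for levels in product([-1, +1], repeat=len(FACTORS))]


def column(effect):
    """Sign column of an effect such as "A" or "BC": product of its factor columns."""
    signs = []
    for run in runs:
        s = 1
        for factor in effect:
            s *= run[factor]
        signs.append(s)
    return signs


# All main effects and interactions: A, B, C, AB, AC, BC, ABC.
effects = ["".join(c) for r in range(1, len(FACTORS) + 1)
           for c in combinations(FACTORS, r)]
columns = {e: column(e) for e in effects}


def dot(u, v):
    """Sum of element-wise products of two columns."""
    return sum(x * y for x, y in zip(u, v))


# Every pair of distinct columns should have a zero dot product.
orthogonal = all(dot(columns[e1], columns[e2]) == 0
                 for e1, e2 in combinations(effects, 2))
print(columns["A"], columns["BC"], orthogonal)
```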

To obtain the contrast for a particular factor or interaction, calculate the dot product of the column of signs under that factor or interaction with the column of responses. This is illustrated below for the factor A and the interaction BC. In EXCEL this can be accomplished using the SUMPRODUCT function.

$$\text{Contrast}_A = \begin{bmatrix} -1 & +1 & -1 & +1 & -1 & +1 & -1 & +1 \end{bmatrix} \begin{bmatrix} (1) \\ a \\ b \\ ab \\ c \\ ac \\ bc \\ abc \end{bmatrix} = -(1) + a - b + ab - c + ac - bc + abc \tag{18}$$

The reader can verify that this equation corresponds to the equations derived earlier to obtain the contrasts. In a similar way one can obtain the contrast for an interaction, in this case BC.

$$\text{Contrast}_{BC} = \begin{bmatrix} +1 & +1 & -1 & -1 & -1 & -1 & +1 & +1 \end{bmatrix} \begin{bmatrix} (1) \\ a \\ b \\ ab \\ c \\ ac \\ bc \\ abc \end{bmatrix} = (1) + a - b - ab - c - ac + bc + abc \tag{19}$$

Of course the question remains: why should one use this method to obtain the contrasts? The answer is that although it is still relatively easy to derive the contrasts for $2^2$ and $2^3$ factorial experiments, this soon becomes tedious with higher order factorial designs. The design matrix for a $2^4$ design is given below and one can already see that the number of interaction terms has increased significantly.

| Run | A | B | AB | C | AC | BC | ABC | D | AD | BD | ABD | CD | ACD | BCD | ABCD | Responses |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | -1 | -1 | 1 | -1 | 1 | 1 | -1 | -1 | 1 | 1 | -1 | 1 | -1 | -1 | 1 | (1) |
| 2 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | -1 | -1 | 1 | 1 | 1 | 1 | -1 | -1 | a |
| 3 | -1 | 1 | -1 | -1 | 1 | -1 | 1 | -1 | 1 | -1 | 1 | 1 | -1 | 1 | -1 | b |
| 4 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | ab |
| 5 | -1 | -1 | 1 | 1 | -1 | -1 | 1 | -1 | 1 | 1 | -1 | -1 | 1 | 1 | -1 | c |
| 6 | 1 | -1 | -1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | -1 | -1 | 1 | 1 | ac |
| 7 | -1 | 1 | -1 | 1 | -1 | 1 | -1 | -1 | 1 | -1 | 1 | -1 | 1 | -1 | 1 | bc |
| 8 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | abc |
| 9 | -1 | -1 | 1 | -1 | 1 | 1 | -1 | 1 | -1 | -1 | 1 | -1 | 1 | 1 | -1 | d |
| 10 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | ad |
| 11 | -1 | 1 | -1 | -1 | 1 | -1 | 1 | 1 | -1 | 1 | -1 | -1 | 1 | -1 | 1 | bd |
| 12 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | abd |
| 13 | -1 | -1 | 1 | 1 | -1 | -1 | 1 | 1 | -1 | -1 | 1 | 1 | -1 | -1 | 1 | cd |
| 14 | 1 | -1 | -1 | 1 | 1 | -1 | -1 | 1 | 1 | -1 | -1 | 1 | 1 | -1 | -1 | acd |
| 15 | -1 | 1 | -1 | 1 | -1 | 1 | -1 | 1 | -1 | 1 | -1 | 1 | -1 | 1 | -1 | bcd |
| 16 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | abcd |

Table 5. Coded design matrix for a $2^4$ factorial design.

The reader should follow how the various interaction terms were derived from the main factors (bold) rather than doing the multiplication of the values in brackets. The same procedure can be followed for higher order full factorial designs. The accompanying EXCEL spreadsheet gives examples of both $2^3$ and $2^4$ full factorial designs, with replication in some cases. Note especially how the statistical and other values are derived in the case of replication. The accompanying EXCEL spreadsheet allows for a simple iterative procedure to provide for large full factorial designs.
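One way to build such matrices iteratively is to add one factor at a time, doubling the number of runs at each step. The sketch below is an assumption about how such an iteration could look, not a description of the spreadsheet. The function names `extend_design` and `full_factorial` are invented for the example.

```python
def extend_design(columns, new_factor):
    """Add one factor to a full factorial design given as {label: sign column}.

    The existing runs are repeated twice: first with the new factor at -1,
    then at +1. Each existing column X also spawns the interaction column
    X + new_factor.
    """
    half = len(next(iter(columns.values()))) if columns else 1
    new_column = [-1] * half + [+1] * half
    extended = {}
    for label, signs in columns.items():
        extended[label] = signs + signs
    extended[new_factor] = new_column
    for label, signs in columns.items():
        doubled = signs + signs
        extended[label + new_factor] = [s * d for s, d in zip(doubled, new_column)]
    return extended


def full_factorial(factors):
    """Build the coded design matrix for the given factor letters, e.g. "ABCD"."""
    design = {}
    for factor in factors:
        design = extend_design(design, factor)
    return design


design = full_factorial("ABCD")
print(list(design))    # column labels, in the same order as Table 5
print(design["ABCD"])  # signs of the four-factor interaction
```

Calling `full_factorial("ABCDE")` would give the 32-run, 31-column matrix of a $2^5$ design in the same way.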

2. Factorial Designs in Blocks and Confounding

Many researchers are familiar with situations where there is simply not enough time or other resources to complete a factorial design. In such cases the situation calls for blocking of the design. The technique commonly used to arrange the experiments in blocks is called confounding. Using this approach means that each block contains fewer experiments than a single factorial design and inevitably some information is lost. Usually the design is arranged so that the information lost concerns the higher order interactions of the original factorial design, which become indistinguishable from the block effect. Suppose we want to confound a $2^3$ design in two blocks using the ABC interaction. How do we do this? The easiest way is to take the plus and minus signs of the ABC column in Table 4 and assign the runs with plus signs to one block and the runs with minus signs to another. We therefore have two blocks consisting of the following treatment combinations:

$$\text{Block 1: } (1),\ ab,\ ac,\ bc \qquad \text{and} \qquad \text{Block 2: } a,\ b,\ c,\ abc \tag{20}$$

Another method to use when deciding which treatment combinations to assign to individual blocks is the so-called defining contrast,

$$L = \alpha_1 x_1 + \alpha_2 x_2 + \dots + \alpha_k x_k \tag{21}$$

where $x_i$ is the level of the ith factor in a treatment combination and $\alpha_i$ is the exponent of the ith factor in the effect to be confounded. In the equation above α can have the value of 0 or 1 and x can be 0 (low level) or 1 (high level). For a $2^3$ design $x_1$ corresponds to factor A, $x_2$ corresponds to factor B and $x_3$ to factor C. The defining contrast for the interaction ABC is for example:

$$L = x_1 + x_2 + x_3 \tag{22}$$

Now by using this defining contrast with the different treatment combinations we get the following results using modulus 2 arithmetic.

$$\begin{aligned}
(1)&: \quad L = 1(0) + 1(0) + 1(0) = 0 = 0 \pmod 2\\
a&: \quad L = 1(1) + 1(0) + 1(0) = 1 = 1 \pmod 2\\
b&: \quad L = 1(0) + 1(1) + 1(0) = 1 = 1 \pmod 2\\
ab&: \quad L = 1(1) + 1(1) + 1(0) = 2 = 0 \pmod 2\\
c&: \quad L = 1(0) + 1(0) + 1(1) = 1 = 1 \pmod 2\\
ac&: \quad L = 1(1) + 1(0) + 1(1) = 2 = 0 \pmod 2\\
bc&: \quad L = 1(0) + 1(1) + 1(1) = 2 = 0 \pmod 2\\
abc&: \quad L = 1(1) + 1(1) + 1(1) = 3 = 1 \pmod 2
\end{aligned}$$

Therefore all the treatment combinations with a value of 1 are run in one block and all those with a value of 0 in another block. Another method is to use the principal block. This is the block that contains the (1) treatment combination. Since we know that the b treatment (for example) must be in the other block, we multiply each element of the principal block by b, reducing the exponents modulus 2, to obtain the other treatments.

$$(1) \cdot b = b, \qquad ab \cdot b = ab^2 = a, \qquad ac \cdot b = abc, \qquad bc \cdot b = b^2c = c \tag{23}$$
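Both blocking methods are easy to automate. The sketch below assumes a $2^3$ design confounded with ABC. It assigns treatment combinations to blocks using the defining contrast and then regenerates the second block from the principal block. The helper names are illustrative.

```python
from itertools import product

FACTORS = "abc"
CONFOUNDED = "ABC"  # effect confounded with blocks

# alpha_i is 1 when factor i appears in the confounded effect, otherwise 0.
alpha = [1 if f.upper() in CONFOUNDED else 0 for f in FACTORS]


def label(levels):
    """Treatment-combination label for 0/1 levels, e.g. (1, 0, 1) -> "ac"."""
    return "".join(f for f, x in zip(FACTORS, levels) if x) or "(1)"


def defining_contrast(levels):
    """L = alpha_1*x_1 + ... + alpha_k*x_k, reduced modulo 2."""
    return sum(a * x for a, x in zip(alpha, levels)) % 2


blocks = {0: [], 1: []}
for levels in product([0, 1], repeat=len(FACTORS)):
    blocks[defining_contrast(levels)].append(label(levels))


def multiply(t1, t2):
    """Multiply two treatment combinations, reducing squared letters to 1."""
    letters = {c for c in t1 if c in FACTORS} ^ {c for c in t2 if c in FACTORS}
    return "".join(f for f in FACTORS if f in letters) or "(1)"


principal = blocks[0]  # the block containing (1)
other = sorted(multiply(t, "b") for t in principal)
print(blocks)
print(other)
```

Changing `CONFOUNDED` to another interaction, for example "AB", gives the blocks for that choice.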

While the illustrations above seem trivial for a $2^3$ design, the procedure can become quite complex for higher order designs in which the number of factors is ≥ 5.

3. Fractional Factorial ($2^{k-p}$) Designs

Sometimes, because of external constraints such as time, money or limited availability of a particular resource, a full factorial experimental design is not feasible. We may then either block the designs or try so-called fractional factorial designs. Fractional factorial designs make the assumption that higher order interactions are negligible, so that they can be used to estimate the sum of squares for the error. Two such examples are given in the accompanying EXCEL spreadsheet for $2^4$ designs where the ABCD interaction is used to estimate the error. The construction of a fractional factorial design will be illustrated using an example from the book by Montgomery. We again use the previous examples in which filtration rate was the observed response. The fractional factorial design used here is a $2^{4-1}$ design (Table 6), in which A, B and C form the basic design and the levels of D are generated as D = ABC.

| Run | A | B | C | D = ABC | Treatment combination | Filtration rate |
|---|---|---|---|---|---|---|
| 1 | -1 | -1 | -1 | -1 | (1) | 45 |
| 2 | +1 | -1 | -1 | +1 | ad | 100 |
| 3 | -1 | +1 | -1 | +1 | bd | 45 |
| 4 | +1 | +1 | -1 | -1 | ab | 65 |
| 5 | -1 | -1 | +1 | +1 | cd | 75 |
| 6 | +1 | -1 | +1 | -1 | ac | 60 |
| 7 | -1 | +1 | +1 | -1 | bc | 80 |
| 8 | +1 | +1 | +1 | +1 | abcd | 96 |

Table 6. Example of a fractional factorial design ($2^{4-1}$).

Central to the generation of the fractional design is the use of a defining relation, which always equals the identity column I. Here I = ABCD is the defining relation for this design. Using this defining relation it can be shown that each main effect is aliased with a three-factor interaction.

$$A = A \cdot ABCD = A^2BCD = BCD, \quad B = B \cdot ABCD = AB^2CD = ACD, \quad C = C \cdot ABCD = ABC^2D = ABD, \quad D = D \cdot ABCD = ABCD^2 = ABC$$

The linear combination of observations associated with the various effects can then be calculated according to:

$$\begin{aligned}
l_A &= \tfrac{1}{4}(-45 + 100 - 45 + 65 - 75 + 60 - 80 + 96) = 19.00 \rightarrow A + BCD\\
l_B &= \tfrac{1}{4}(-45 - 100 + 45 + 65 - 75 - 60 + 80 + 96) = 1.50 \rightarrow B + ACD\\
l_C &= \tfrac{1}{4}(-45 - 100 - 45 - 65 + 75 + 60 + 80 + 96) = 14.00 \rightarrow C + ABD\\
l_D &= \tfrac{1}{4}(-45 + 100 + 45 - 65 + 75 - 60 - 80 + 96) = 16.50 \rightarrow D + ABC\\
l_{AB} &= \tfrac{1}{4}(45 - 100 - 45 + 65 + 75 - 60 - 80 + 96) = -1.00 \rightarrow AB + CD\\
l_{AC} &= \tfrac{1}{4}(45 - 100 + 45 - 65 - 75 + 60 - 80 + 96) = -18.50 \rightarrow AC + BD\\
l_{AD} &= \tfrac{1}{4}(45 + 100 - 45 - 65 - 75 - 60 + 80 + 96) = 19.00 \rightarrow AD + BC
\end{aligned}$$

From the calculated effects it seems that A, C and D are relevant. Because the alias combinations AC + BD and AD + BC are also large, it can logically be assumed that it is the interactions AC and AD, which involve only the relevant factors, that are large. Based on these assumptions a regression model can now be constructed to model the filtration rate.

$$\hat{y} = 70.75 + \frac{19.00}{2} x_1 + \frac{14.00}{2} x_3 + \frac{16.50}{2} x_4 + \left(\frac{-18.50}{2}\right) x_1 x_3 + \frac{19.00}{2} x_1 x_4$$
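The alias structure and the effect estimates for this half fraction can be reproduced with a short script. The sketch below uses the runs of Table 6 and the defining relation I = ABCD. All function names are illustrative. The regression model keeps the terms A, C, D, AC and AD, as assumed in the text.

```python
# Half fraction from Table 6: coded levels (A, B, C) and filtration rate.
# The fourth factor is generated as D = ABC, so I = ABCD.
basic_design = [
    ((-1, -1, -1), 45), ((+1, -1, -1), 100), ((-1, +1, -1), 45), ((+1, +1, -1), 65),
    ((-1, -1, +1), 75), ((+1, -1, +1), 60), ((-1, +1, +1), 80), ((+1, +1, +1), 96),
]
runs = [({"A": a, "B": b, "C": c, "D": a * b * c}, y) for (a, b, c), y in basic_design]

DEFINING_WORD = "ABCD"


def alias(effect):
    """Alias of an effect: multiply by the defining word and drop squared letters."""
    letters = set(effect) ^ set(DEFINING_WORD)
    return "".join(sorted(letters))


def estimate(effect):
    """Linear combination l = Contrast / (N / 2) for the named effect."""
    total = 0
    for levels, y in runs:
        sign = 1
        for factor in effect:
            sign *= levels[factor]
        total += sign * y
    return total / (len(runs) / 2)


estimates = {e: estimate(e) for e in ["A", "B", "C", "D", "AB", "AC", "AD"]}
for effect, value in estimates.items():
    print(f"l_{effect} = {value:6.2f}  ->  {effect} + {alias(effect)}")

grand_mean = sum(y for _, y in runs) / len(runs)


def predict(x1, x3, x4):
    """Regression model in coded units, keeping A, C, D, AC and AD."""
    return (grand_mean
            + estimates["A"] / 2 * x1
            + estimates["C"] / 2 * x3
            + estimates["D"] / 2 * x4
            + estimates["AC"] / 2 * x1 * x3
            + estimates["AD"] / 2 * x1 * x4)


print(predict(+1, -1, +1))  # run 2 (treatment combination ad)
```

For run 2 the model predicts 100.25, against an observed filtration rate of 100.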

For larger fractional factorial designs there may be a number of design generators. It is important that the generator with the highest possible resolution is chosen.

4. Explanation of ANOVA Tables for Factorial and Fractional Factorial Experiments

In this study, use has been made of the STATGRAPHICS computer software for design and analysis of experiments. Many such programs generate ANOVA tables for the responses analysed in a particular experiment. Such an ANOVA table usually contains the main and interaction effects, the sums of squares of the main effects and interactions, as well as the mean sums of squares of these effects and their individual degrees of freedom. It is not always clear how the entries in such an ANOVA table are arrived at. This section will hopefully shed some light on this. Throughout this section, it will be good to keep Table 4 in mind. The first question to be asked is how the sums of squares for main effects and interactions are calculated. A formula for this is as follows:

$$SS(A) = \frac{\left(\sum_{i=1}^{2^k} a_i \sum_{j=1}^{n} X_{ij}\right)^2}{n \sum_{i=1}^{2^k} a_i^2} \tag{24}$$

This formula shows how the sum of squares for the main effect A is calculated. In this equation, $a_i$ are the coefficients of a linear comparison, in other words the +1 and -1 signs in the column of Table 4 for the factor A, and $X_{ij}$ are the measured responses, each multiplied by the appropriate sign in the column under factor A. The lowercase n denotes the number of replicates of a particular experiment. The sums of squares of the other main effects and interactions can be calculated in the same way. This also holds for a fractional factorial design. Next we have to define the degrees of freedom for each main effect and interaction [5]. The degrees of freedom are necessary for the calculation of the Mean Sum of Squares (MSS). For a main effect this is given by the number of levels of the factor minus one (in a two level factorial experiment this is always 1). The equation for this is ($a_{levels}$ - 1) for factor A. Lowercase letters with the subscript levels may be used for the other main effects. In this case $a_{levels}$ equals two. For interactions, the procedure becomes more complicated. Take for instance the interaction AB. The degrees of freedom needed here are given by ($a_{levels}$ - 1)($b_{levels}$ - 1). Since the main effects are evaluated at two levels, the number of degrees of freedom is again 1.

Having defined the means of calculating the degrees of freedom, it still remains to calculate the MSS. This is done by dividing the sum of squares originally calculated by the degrees of freedom for each main effect and interaction. We now have the means of calculating the variance ratio or F-ratio: the mean sum of squares of each effect is divided by the mean square of the error. To do this, some estimate of the error is needed. This is usually obtained by summing the sums of squares of the higher order interaction terms and dividing by the number of higher order interactions, that is, by calculating the average of the higher order interactions. The F-ratio can now be determined using the MSS calculated for each main effect and interaction as well as the estimate of the error. Once the F-ratio has been calculated, the corresponding p-value can be generated using the degrees of freedom for each of the MSS of the main effects (usually 1) and that for the estimate of the error.

5. Visual aids to interpret effects and interactions

5.1 Pareto Charts and Response Surface Plots

The simplest way to describe a Pareto chart is as a histogram of effects. Figure 6 should serve as a good example of this.

[Figure 6: Pareto chart with horizontal axis "Standardized effects" running from 0 to 12. Bars and values as extracted: BC (-1.13), AB (-1.48), AC (1.49), C (-4.33), A (-5.21), B (10.57).]

Figure 6. Example of a Pareto chart to visually display standardized effects.

As can be seen in Figure 6, the various main effects and interactions are given as standardized values in the form of horizontal bars. This chart maps the distance of each effect from zero: the further away an effect is from zero, the more significant that effect will be. The line drawn across the bars indicates how large an effect has to be to be statistically significant. This is akin to measuring the various effects relative to the null hypothesis that all effects are normally distributed around a mean of zero. Standardisation of the effects is achieved by dividing each main effect and interaction by an estimate of its standard error.

Figure 7. Example of a response surface plot. Part a illustrates the individual responses while part b shows the response equation fitted to the data points.

Figure 7 shows an example of a response surface for exposure of test samples to specific temperatures over time (rat embryonic development). Notwithstanding the rather gross topic, the response surface shows a highly non-linear interaction between the variables, which would not have been possible to visualize with a simple two dimensional graph.

5.2 Normal Probability Plot and Normal Probability Plot of Residuals

Apart from the response surface plot, an important visual aid is the normal probability plot.

Figure 8. Normal probability plot and accompanying ANOVA table.

When the normal probability plot is interpreted along with the ANOVA table the meaning becomes clear. Those factors and interactions NOT lying on the straight line are the significant ones. This can immediately be seen in the magnitude of the F-values and related p-values.

Figure 9. Example of a normal plot of residuals taken from an example of using DESIGN-EXPERT software.

In Figure 9 a plot of the residuals in terms of normal % probability against studentized residuals is shown. If the experiment was conducted in such a way that all the runs were unbiased and no result was influenced externally, then the residuals, or deviations from the mean, should be totally random and small. Figure 9 is a graph from an example in Design Expert, a software program for DoE, and it shows that the results from the experimental design are in general randomly distributed around the mean and that no significant outlier is present. Testing for bias and other influences on the data can also be accomplished simply by calculating the mean of a specific response (from all contributing factors and interactions) and plotting the difference from the mean for each factor and interaction. Again, if the experiment was conducted correctly, a random scatter of residuals around the mean should be the result. Sometimes a specific pattern may be seen in the data when plotting the residuals. This may occur even though the experimenter took possible bias into account and even though the experiments were designed with the necessary caution. Such a pattern in the residuals over time may point to correlation between factors, that is, a change in one factor may automatically affect another factor or factors, which means that the factors chosen are not really independent of each other. This type of mistake is surprisingly easy to make. To get rid of the pattern, the data may be transformed by any number of suitable data transformations. One example is a log transformation or a square root transformation of the data to stabilize the variance. Software packages such as DESIGN EXPERT automatically calculate the best variance stabilizing transformation of the data.

5.3 Interaction Plot

Figure 10. DESIGN-EXPERT example of an interaction plot.

Another graphical representation that conveys the effect of interaction between factors is the interaction plot, as shown in Figure 10 above. While an ANOVA table may give numerical values regarding factor interaction, the visual plot of such an interaction significantly helps the experimenter to obtain a "feel" for the interaction, and it acts as a valuable tool to translate experimental findings to other observers. These visual tools are especially valuable for interpreting the ANOVA table for people without a statistical background.
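The numbers behind an interaction plot are simply the average responses at each combination of factor levels. As a small illustration, and not the DESIGN-EXPERT example of Figure 10, the sketch below uses the averages of the wafer experiment in Table 2. The function names are invented for the example.

```python
# Average oxide thickness for each (A, B) combination, from Table 2.
cell_means = {
    (-1, -1): 594.5,
    (+1, -1): 827.5,
    (-1, +1): 824.0,
    (+1, +1): 1139.0,
}


def line_for(b_level):
    """Points of one line in the interaction plot: response versus A at fixed B."""
    return [(a, cell_means[(a, b_level)]) for a in (-1, +1)]


def change_over_a(b_level):
    """Change in the average response when A goes from low to high at fixed B."""
    (_, low), (_, high) = line_for(b_level)
    return high - low


effect_of_a_at_low_b = change_over_a(-1)
effect_of_a_at_high_b = change_over_a(+1)

# Parallel lines (equal changes) mean no interaction; half the difference
# between the two changes is the AB interaction effect.
interaction_ab = (effect_of_a_at_high_b - effect_of_a_at_low_b) / 2

print(line_for(-1), line_for(+1))
print(effect_of_a_at_low_b, effect_of_a_at_high_b, interaction_ab)
```

The two lines rise by 233 and 315 units respectively, so they are not quite parallel. Half the difference, 41, is the AB interaction effect of the wafer experiment.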

Summary

In this tutorial the steps towards evaluating two-level factorial designs are set out. Two-level factorial designs are adequate for most experimental designs when the purpose of the design is to screen potential factors for importance. A general technique is illustrated that proceeds from constructing the contrasts through to establishing the sum of squares and mean sum of squares for each factor and interaction. This enables the experimenter to construct an ANOVA table from which to judge which of the factors investigated are significant. This tutorial builds on the previous one by increasing the complexity of deriving the required sums of squares and other parameters that enable statistical analysis. It quickly becomes apparent that it is better to use dedicated DoE software or spreadsheets because of the complexity involved in obtaining the contrasts and the increase in the number of calculations. One way to reduce the number of experiments is to use fractional factorial designs, which can yield equally useful data from fewer experiments. However, there is a cost involved, since single factors and their interactions may become confounded with higher order interactions. The blessing with regard to this problem is that the higher order interactions usually contribute relatively little, so it is usually (but not always) safe to assume that the estimates obtained for the individual factors and their important interactions remain statistically meaningful.


REFERENCES

1. Chemometrics: Experimental Design, Analytical Chemistry by Open Learning, Ed Morgan, John Wiley and Sons, New York, (1982)

2. Designing for Quality, An Introduction to the best of Taguchi and Western methods of Statistical Experimental Design, Robert H. Lochner, Joseph E. Matar, Quality Resources, A Division of The Kraus Organisation Limited, New York, (1990)

3. Statistics and Experimental Design in Engineering and the Physical Sciences, Volume II, Second Edition, Norman L. Johnson, Fred C. Leone, John Wiley & Sons, New York, (1977)

4. Experiments with Mixtures, Designs, Models, and the analysis of Mixture data, John A. Cornell, John Wiley & Sons, New York, (1981)

5. Introductory Statistics, Third Edition, Thomas H. Wonnacott, Ronald J. Wonnacott, John Wiley & Sons, New York, (1977)

6. Edward J. Powers, Proceedings of the Water-Borne and Higher Solids Coatings Symposium, “Handling and curing a water-borne epoxy coating”, pp. 111 – 135, (1981)

7. Chorng-Shyan Chern, Yu-Chang Chen, Polymer Journal, “Semibatch emulsion polymerization of butyl acrylate stabilised by a polymerizable surfactant”, Vol. 28, No. 7, pp. 627 – 632, (1996)

8. G.E.P. Box, J.S. Hunter, Technometrics, “The 2^(k-p) fractional factorial designs. Part I”, Vol. 3, No. 3, (1961)

9. W.J. Youden, Technometrics, “Partial confounding in fractional replication”, Vol. 3, No. 3, (1961)

10. Duane E. Long, Analytica Chimica Acta, “Simplex optimisation of the response from chemical systems”, Vol. 46, pp. 193 – 206, (1969)

11. Linear Algebra and its Applications, Gilbert Strang, Third Edition, Chapter 3, p. 153, Harcourt Brace Jovanovich, Inc., Orlando, (1988)

12. Andre I. Khuri, John A. Cornell, Response Surfaces, Designs and Analyses, Second Edition, Revised and Expanded, Chapter 2, Matrix Algebra, Least Squares, the Analysis of Variance, and Principles of Experimental Design, Marcel Dekker, Inc., New York, 1996

13. Statistics for Management and Economics, Gerald Keller, Brian Warrack, Duxbury Press, Johannesburg, 1997

14. Introduction to DOE, Veli-Matti Taavitsainen, Borealis, 2009

15. Design and Analysis of Experiments, 5th Edition, Douglas C. Montgomery, Wiley Student Edition, Wiley India, 2004

16. Design and optimization in organic synthesis, Second revised and enlarged edition, Rolf Carlson, Johan E. Carlson, Elsevier, Sweden, 2005

17. Operations Research: Applications and Algorithms, Wayne L. Winston, PWS-Kent Publishing Company, Boston, 1987

18. http://mathworld.wolfram.com/NormalDistribution.html