Transcript
Page 1: Statistical Tools for Multivariate Six Sigma

1

Statistical Tools for Multivariate Six Sigma

Dr. Neil W. Polhemus, CTO & Director of Development, StatPoint, Inc.

Page 2: Statistical Tools for Multivariate Six Sigma

2

The Challenge

The quality of an item or service usually depends on more than one characteristic.

When the characteristics are not independent, considering each characteristic separately can give a misleading estimate of overall performance.

Page 3: Statistical Tools for Multivariate Six Sigma

3

The Solution

Proper analysis of data from such processes requires the use of multivariate statistical techniques.

Page 4: Statistical Tools for Multivariate Six Sigma

4

Outline

Multivariate SPC
  Multivariate control charts
  Multivariate capability analysis

Data exploration and modeling
  Principal components analysis (PCA)
  Partial least squares (PLS)
  Neural network classifiers

Design of experiments (DOE)
  Multivariate optimization

Page 5: Statistical Tools for Multivariate Six Sigma

5

Example #1

Textile fiber

Characteristic #1: tensile strength - 115 ± 1

Characteristic #2: diameter - 1.05 ± 0.05

Page 6: Statistical Tools for Multivariate Six Sigma

6

Sample Data

n = 100

Page 7: Statistical Tools for Multivariate Six Sigma

7

Individuals Chart - strength

[Figure: X chart for strength, observations 1 to 100. CTR = 114.98, UCL = 115.69, LCL = 114.27.]
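The slides do not say how the limits on the individuals chart were computed. A common convention, assumed here, estimates sigma from the average moving range of span 2, which gives limits of CTR ± 2.66 × MR-bar (2.66 = 3/1.128). The sketch below applies that convention to a few made-up strength readings, not the data from the slides.

```python
from statistics import mean

def individuals_limits(x):
    """Return (LCL, CTR, UCL) for an individuals (X) chart.

    Sigma is estimated as MR-bar / 1.128, where MR-bar is the average
    moving range of span 2, so the 3-sigma limits are CTR +/- 2.66 * MR-bar.
    """
    if len(x) < 2:
        raise ValueError("need at least two observations")
    moving_ranges = [abs(b - a) for a, b in zip(x, x[1:])]
    center = mean(x)
    half_width = 3.0 * mean(moving_ranges) / 1.128
    return center - half_width, center, center + half_width

# Hypothetical strength readings (illustration only).
strength = [114.9, 115.2, 115.0, 114.8, 115.1, 115.3, 114.7, 115.0]
lcl, ctr, ucl = individuals_limits(strength)

# Indices of observations falling outside the control limits.
out_of_control = [i for i, v in enumerate(strength) if not lcl <= v <= ucl]
print(f"LCL={lcl:.3f} CTR={ctr:.3f} UCL={ucl:.3f} signals={out_of_control}")
```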

Page 8: Statistical Tools for Multivariate Six Sigma

8

Individuals Chart - diameter

[Figure: X chart for diameter, observations 1 to 100. CTR = 1.05, UCL = 1.06, LCL = 1.04.]

Page 9: Statistical Tools for Multivariate Six Sigma

9

Capability Analysis - strength

[Figure: Process capability histogram for strength with fitted normal curve. LSL = 114.0, Nominal = 115.0, USL = 116.0.]

Normal distribution: Mean = 114.978, Std. Dev. = 0.238937

Cp = 1.41, Pp = 1.40, Cpk = 1.38, Ppk = 1.36, K = -0.02

DPM = 30.76
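As a rough check on these figures, the long-term indices and the DPM can be recomputed from the reported mean, standard deviation, and specification limits, using Pp = (USL - LSL)/(6σ), Ppk = min(USL - mean, mean - LSL)/(3σ), and the normal tail areas beyond each limit. The sketch below assumes the reported standard deviation is the overall (long-term) one; Cp and Cpk need a short-term sigma that the slide does not give, so they are not reproduced.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def capability(mean, sd, lsl, usl):
    """Return (Pp, Ppk, DPM) for a normally distributed characteristic."""
    pp = (usl - lsl) / (6.0 * sd)
    ppk = min(usl - mean, mean - lsl) / (3.0 * sd)
    below = normal_cdf((lsl - mean) / sd)    # fraction below LSL
    above = normal_cdf(-(usl - mean) / sd)   # fraction above USL
    return pp, ppk, 1e6 * (below + above)

# Mean, standard deviation, and spec limits for strength, as reported on the slide.
pp, ppk, dpm = capability(114.978, 0.238937, 114.0, 116.0)
print(f"Pp={pp:.2f} Ppk={ppk:.2f} DPM={dpm:.1f}")
```

The results should land close to the slide's Pp, Ppk, and DPM; small differences are expected because the inputs are rounded.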

Page 10: Statistical Tools for Multivariate Six Sigma

10

Capability Analysis - diameter

[Figure: Process capability histogram for diameter with fitted normal curve. LSL = 1.04, Nominal = 1.05, USL = 1.06.]

Normal distribution: Mean = 1.04991, Std. Dev. = 0.00244799

Cp = 1.41, Pp = 1.36, Cpk = 1.39, Ppk = 1.35, K = -0.01

DPM = 44.59

Page 11: Statistical Tools for Multivariate Six Sigma

11

Scatterplot

[Figure: Scatterplot of strength vs diameter. Correlation = 0.89.]

Page 12: Statistical Tools for Multivariate Six Sigma

12

Multivariate Normal Distribution

[Figure: Multivariate normal density surface over strength (114 to 116) and diameter (1.04 to 1.06).]

Page 13: Statistical Tools for Multivariate Six Sigma

13

Control Ellipse

[Figure: Control ellipse for strength vs diameter.]

Page 14: Statistical Tools for Multivariate Six Sigma

14

Multivariate Capability

Determines joint probability of being within the specification limits on all characteristics

Variable   Observed Beyond Spec.   Estimated Beyond Spec.   Estimated DPM
strength   0.0%                    0.00307572%              30.7572
diameter   0.0%                    0.00445939%              44.5939
Joint      0.0%                    0.00703461%              70.3461
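One way to see why the joint DPM is smaller than the sum of the two univariate DPMs is to integrate a bivariate normal density over the specification rectangle. The sketch below is not the method used by the software in the slides; it assumes a bivariate normal with the means, standard deviations, and correlation (0.89) reported earlier, and uses the fact that, after standardizing, the second variable given the first is Normal(ρ·z1, 1 - ρ²).

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def prob_in_spec_bivariate(mean, sd, lsl, usl, rho, steps=4000):
    """P(both characteristics within spec) under a bivariate normal.

    Integrates over the standardized first variable with composite
    Simpson's rule (steps must be even).
    """
    a1, b1 = [(lim - mean[0]) / sd[0] for lim in (lsl[0], usl[0])]
    a2, b2 = [(lim - mean[1]) / sd[1] for lim in (lsl[1], usl[1])]
    s = math.sqrt(1.0 - rho * rho)

    def integrand(z1):
        inner = Phi((b2 - rho * z1) / s) - Phi((a2 - rho * z1) / s)
        return phi(z1) * inner

    h = (b1 - a1) / steps
    total = integrand(a1) + integrand(b1)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(a1 + k * h)
    return total * h / 3.0

# Statistics as reported on the slides: (strength, diameter).
mean = (114.978, 1.04991)
sd = (0.238937, 0.00244799)
lsl = (114.0, 1.04)
usl = (116.0, 1.06)

joint_dpm = 1e6 * (1.0 - prob_in_spec_bivariate(mean, sd, lsl, usl, rho=0.89))
print(f"joint DPM ~ {joint_dpm:.1f}")
```

Because the inputs are rounded sample statistics, the result will only approximate the joint DPM in the table.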

Page 15: Statistical Tools for Multivariate Six Sigma

15

Multivariate Capability

[Figure: Multivariate normal distribution over strength (113.5 to 116.5) and diameter (1.035 to 1.065). DPM = 70.3461.]

Page 16: Statistical Tools for Multivariate Six Sigma

16

Capability Ellipse

[Figure: 99.73% capability ellipse, diameter vs strength. MCP = 1.27.]

Page 17: Statistical Tools for Multivariate Six Sigma

17

Mult. Capability Indices

Defined to give the same DPM as in the univariate case.

Capability Indices

Index  Estimate
MCP    1.27
MCR    78.80
DPM    70.3461
Z      3.80696
SQL    5.30696

Page 18: Statistical Tools for Multivariate Six Sigma

18

Test for Normality

[Figure: Normal probability plot of strength and diameter (empirical data vs normal distribution).]

Shapiro-Wilk P-Values

strength  0.408004
diameter  0.615164

Page 19: Statistical Tools for Multivariate Six Sigma

19

More than 2 Characteristics

Calculate T-squared:

$$T_i^2 = (x_i - \bar{x})' S^{-1} (x_i - \bar{x})$$

where

S = sample covariance matrix

$\bar{x}$ = vector of sample means
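A minimal sketch of the T-squared calculation for any number of characteristics follows. It uses only the standard library, solving S y = (x_i - x̄) by Gaussian elimination rather than forming the inverse explicitly. The three-variable data set is made up for illustration, and the helper names are not from the slides.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def covariance_matrix(rows):
    """Sample covariance matrix S (divisor n - 1)."""
    n, xbar = len(rows), mean_vector(rows)
    p = len(xbar)
    return [[sum((r[a] - xbar[a]) * (r[b] - xbar[b]) for r in rows) / (n - 1)
             for b in range(p)] for a in range(p)]

def solve(A, b):
    """Solve A y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    y = [0.0] * n
    for i in reversed(range(n)):
        y[i] = (M[i][n] - sum(M[i][c] * y[c] for c in range(i + 1, n))) / M[i][i]
    return y

def t_squared(rows):
    """T-squared for each observation: (x_i - xbar)' S^-1 (x_i - xbar)."""
    xbar, S = mean_vector(rows), covariance_matrix(rows)
    out = []
    for r in rows:
        d = [v - m for v, m in zip(r, xbar)]
        out.append(sum(di * yi for di, yi in zip(d, solve(S, d))))
    return out

# Hypothetical (strength, diameter, third characteristic) observations.
rows = [
    (115.0, 1.050, 10.1), (114.8, 1.048, 9.7), (115.3, 1.053, 10.4),
    (114.9, 1.049, 10.9), (115.1, 1.052, 9.2), (114.7, 1.047, 10.0),
    (115.2, 1.051, 9.9), (115.0, 1.050, 10.6),
]
t2 = t_squared(rows)
print([round(v, 2) for v in t2])
```

A useful sanity check: when the mean and covariance are estimated from the same data, the T-squared values sum to (n - 1) × p.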

Page 20: Statistical Tools for Multivariate Six Sigma

20

T-Squared Chart

[Figure: Multivariate control chart of T-squared vs observation. UCL = 11.25.]

Page 21: Statistical Tools for Multivariate Six Sigma

21

T-Squared Decomposition

For each variable, subtracts from the full T-squared the value of T-squared obtained when that variable is removed.

Large values indicate that a variable has an important contribution.

T-Squared Decomposition: Relative Contribution to T-Squared Signal

Observation  T-Squared  diameter  strength
17           26.3659    22.9655   25.951
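With only two variables the decomposition has a simple closed form. If z1 and z2 are the standardized deviations and ρ is their correlation, then T² = (z1² - 2ρ·z1·z2 + z2²)/(1 - ρ²), and T-squared with one variable removed is just the squared z of the remaining variable. The sketch below covers this bivariate special case only; the input values are hypothetical, not observation 17 from the slide.

```python
def t2_decomposition_2d(z1, z2, rho):
    """Return (T2, contribution of variable 1, contribution of variable 2).

    Each contribution is the full T-squared minus the T-squared computed
    with that variable removed (which, for two variables, is the squared
    standardized deviation of the other variable).
    """
    t2 = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)
    return t2, t2 - z2 * z2, t2 - z1 * z1

# Hypothetical point: variable 1 is high while variable 2 is low, which
# contradicts a strong positive correlation and inflates T-squared.
t2, contrib_1, contrib_2 = t2_decomposition_2d(1.8, -0.6, 0.89)
print(f"T2={t2:.2f} var1={contrib_1:.2f} var2={contrib_2:.2f}")
```

Both contributions can be large at once, as in the table above, when the signal comes from the relationship between the variables rather than from either variable alone.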

Page 22: Statistical Tools for Multivariate Six Sigma

22

Control Ellipsoid

[Figure: Control ellipsoid for diameter, strength, and a third variable labeled rnormal(100,10,1).]

Page 23: Statistical Tools for Multivariate Six Sigma

23

Multivariate EWMA Chart

[Figure: Multivariate EWMA control chart of T-squared vs observation. UCL = 11.25, lambda = 0.2. Legend "Largest": strength, diameter.]
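The slides give only the smoothing constant (lambda = 0.2) and the UCL. As an illustration, the sketch below uses the standard MEWMA recursion z_i = λ(x_i - μ) + (1 - λ)z_{i-1} with z_0 = 0, and charts T_i² = z_i' Σ_z⁻¹ z_i, where Σ_z = [λ/(2 - λ)][1 - (1 - λ)^{2i}] Σ. Whether the software uses this exact covariance or its asymptotic form is an assumption, and the observations are made up.

```python
def mewma_t2(rows, mean, cov, lam=0.2):
    """T-squared statistics for a bivariate MEWMA chart.

    rows: list of (x1, x2) observations
    mean: in-control mean (mu1, mu2)
    cov:  2x2 in-control covariance matrix [[s11, s12], [s12, s22]]
    """
    (s11, s12), (_, s22) = cov
    det = s11 * s22 - s12 * s12
    z1 = z2 = 0.0
    out = []
    for i, (x1, x2) in enumerate(rows, start=1):
        # Exponentially weighted moving average of the centered data.
        z1 = lam * (x1 - mean[0]) + (1.0 - lam) * z1
        z2 = lam * (x2 - mean[1]) + (1.0 - lam) * z2
        # z' Sigma^-1 z using the explicit 2x2 inverse.
        quad = (s22 * z1 * z1 - 2.0 * s12 * z1 * z2 + s11 * z2 * z2) / det
        # Sigma_z is a scalar multiple of Sigma at step i.
        scale = lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * i))
        out.append(quad / scale)
    return out

# In-control parameters built from the statistics reported earlier.
s1, s2, rho = 0.238937, 0.00244799, 0.89
cov = [[s1 * s1, rho * s1 * s2], [rho * s1 * s2, s2 * s2]]

# Hypothetical observations with a small upward drift in strength.
rows = [(115.0, 1.050), (115.1, 1.051), (115.2, 1.051),
        (115.2, 1.050), (115.3, 1.051), (115.3, 1.050)]
stats = mewma_t2(rows, (114.978, 1.04991), cov, lam=0.2)
print([round(v, 2) for v in stats])
```

With λ = 1 the statistic reduces to the ordinary T-squared for each observation; smaller values of λ give more weight to past data and make the chart more sensitive to small sustained shifts.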

Page 24: Statistical Tools for Multivariate Six Sigma

24

Generalized Variance Chart

Plots the determinant of the variance-covariance matrix for data that is sampled in subgroups.

[Figure: Generalized variance chart by subgroup. UCL = 3.281E-7, CL = 7.01937E-8, LCL = 0.0.]
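The plotted statistic is straightforward to compute for two characteristics: it is the determinant of each subgroup's 2x2 sample covariance matrix. The sketch below shows only that statistic on made-up subgroups; the control limits depend on constants that the slides do not state, so they are not computed here.

```python
def generalized_variance(subgroup):
    """Determinant of the 2x2 sample covariance matrix of one subgroup."""
    n = len(subgroup)
    m1 = sum(x for x, _ in subgroup) / n
    m2 = sum(y for _, y in subgroup) / n
    s11 = sum((x - m1) ** 2 for x, _ in subgroup) / (n - 1)
    s22 = sum((y - m2) ** 2 for _, y in subgroup) / (n - 1)
    s12 = sum((x - m1) * (y - m2) for x, y in subgroup) / (n - 1)
    return s11 * s22 - s12 * s12

# Hypothetical subgroups of (strength, diameter) readings.
subgroups = [
    [(115.0, 1.050), (114.8, 1.049), (115.3, 1.052), (114.9, 1.050)],
    [(115.1, 1.051), (114.7, 1.047), (115.2, 1.050), (115.0, 1.051)],
]
gen_var = [generalized_variance(g) for g in subgroups]
print(gen_var)
```

The determinant shrinks toward zero as the variables within a subgroup become more nearly collinear, and grows when the joint scatter of the subgroup increases.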

Page 25: Statistical Tools for Multivariate Six Sigma

25

Data Exploration and Modeling

When the number of variables is large, the dimensionality of the problem often makes it difficult to determine the underlying relationships.

Reduction of dimensionality can be very helpful.

Page 26: Statistical Tools for Multivariate Six Sigma

26

Example #2

Page 27: Statistical Tools for Multivariate Six Sigma

27

Matrix Plot

[Figure: Matrix plot of MPG City, MPG Highway, Engine Size, Horsepower, Fueltank, Passengers, Length, Wheelbase, Width, U Turn Space, and Weight.]

Page 28: Statistical Tools for Multivariate Six Sigma

28

Analysis Methods

Predicting certain characteristics based on others (regression and ANOVA)

Separating items into groups (classification)

Detecting unusual items

Page 29: Statistical Tools for Multivariate Six Sigma

29

Multiple Regression

MPG City = 29.6315 + 0.28816*Engine Size - 0.00688362*Horsepower - 0.297446*Passengers - 0.0365723*Length + 0.280224*Wheelbase + 0.111526*Width - 0.139763*U Turn Space - 0.00984486*Weight

Parameter     Estimate      Standard Error  T Statistic  P-Value
CONSTANT      29.6315       12.9763         2.28351      0.0249
Engine Size   0.28816       0.722918        0.398607     0.6912
Horsepower    -0.00688362   0.0134153       -0.513119    0.6092
Passengers    -0.297446     0.54754         -0.543241    0.5884
Length        -0.0365723    0.0447211       -0.817786    0.4158
Wheelbase     0.280224      0.124837        2.24472      0.0274
Width         0.111526      0.218893        0.5095       0.6117
U Turn Space  -0.139763     0.17926         -0.779668    0.4378
Weight        -0.00984486   0.00192619      -5.11104     0.0000

R-squared = 73.544 percent
R-squared (adjusted for d.f.) = 71.0244 percent
Standard Error of Est. = 3.02509
Mean absolute error = 1.99256

Page 30: Statistical Tools for Multivariate Six Sigma

30

Principal Components

The goal of a principal components analysis (PCA) is to construct k linear combinations of the p variables X that contain the greatest variance.

$$C_1 = a_{11}X_1 + a_{12}X_2 + \dots + a_{1p}X_p$$

$$C_2 = a_{21}X_1 + a_{22}X_2 + \dots + a_{2p}X_p$$

$$\vdots$$

$$C_k = a_{k1}X_1 + a_{k2}X_2 + \dots + a_{kp}X_p$$
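To make the construction concrete, the sketch below finds the first component only: the weights a_11 ... a_1p are the leading eigenvector of the correlation matrix of the X's, obtained here by power iteration. A full PCA would use a complete eigendecomposition; power iteration is swapped in to keep the example dependency-free. The car-like data rows are hypothetical.

```python
import math

def correlation_matrix(rows):
    """Return (R, z): the correlation matrix and the standardized data."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
           for c, m in zip(cols, means)]
    z = [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]
    p = len(cols)
    R = [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
         for a in range(p)]
    return R, z

def first_component(R, iterations=500):
    """Leading eigenvalue and unit eigenvector of R by power iteration."""
    p = len(R)
    # Uneven start vector, so it is unlikely to be orthogonal to the answer.
    v = [1.0 / (k + 1) for k in range(p)]
    for _ in range(iterations):
        w = [sum(R[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Rv = [sum(R[a][b] * v[b] for b in range(p)) for a in range(p)]
    return sum(a * b for a, b in zip(v, Rv)), v

# Hypothetical (engine size, horsepower, weight) rows.
cars = [(1.8, 140, 2705), (3.2, 200, 3560), (2.8, 172, 3375), (3.5, 208, 3640),
        (2.2, 110, 2880), (3.8, 170, 3470), (5.7, 180, 4105), (1.6, 90, 2350)]
R, z = correlation_matrix(cars)
eigenvalue, weights = first_component(R)

# Component score C_1 for each car: weighted sum of standardized variables.
scores = [sum(a * zj for a, zj in zip(weights, zr)) for zr in z]
print(round(eigenvalue, 3), [round(w, 3) for w in weights])
```

The eigenvalue is the variance of C_1; dividing it by p gives the share of total variance that the first component explains.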

Page 31: Statistical Tools for Multivariate Six Sigma

31

Scree Plot

Shows the number of significant components.

[Figure: Scree plot of eigenvalue vs component number.]

Page 32: Statistical Tools for Multivariate Six Sigma

32

Percentage Explained

Principal Components Analysis

Component Number  Eigenvalue  Percent of Variance  Cumulative Percentage
1                 5.8263      72.829               72.829
2                 1.09626     13.703               86.532
3                 0.339796    4.247                90.779
4                 0.270321    3.379                94.158
5                 0.179286    2.241                96.400
6                 0.12342     1.543                97.942
7                 0.109412    1.368                99.310
8                 0.0552072   0.690                100.000
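The percentage columns follow directly from the eigenvalues: each percentage is the eigenvalue divided by the sum of all eigenvalues (8 here, one per standardized variable). The short sketch below recomputes them from the tabulated eigenvalues. The "eigenvalue greater than 1" count at the end is a common rule of thumb added for illustration; the slides do not say which rule was used to pick two components.

```python
def explained_variance(eigenvalues):
    """Return (percent, cumulative percent) lists for a set of eigenvalues."""
    total = sum(eigenvalues)
    percent = [100.0 * ev / total for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for pct in percent:
        running += pct
        cumulative.append(running)
    return percent, cumulative

# Eigenvalues as listed in the table.
eigenvalues = [5.8263, 1.09626, 0.339796, 0.270321,
               0.179286, 0.12342, 0.109412, 0.0552072]
percent, cumulative = explained_variance(eigenvalues)

# Rule-of-thumb count (an assumption, not stated in the slides).
n_retained = sum(1 for ev in eigenvalues if ev > 1.0)

for k, (ev, pct, cum) in enumerate(zip(eigenvalues, percent, cumulative), 1):
    print(f"{k}  {ev:<10} {pct:7.3f} {cum:8.3f}")
print("components with eigenvalue > 1:", n_retained)
```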

Page 33: Statistical Tools for Multivariate Six Sigma

33

Components

Table of Component Weights

              Component 1  Component 2
Engine Size   0.376856     -0.205144
Horsepower    0.292144     -0.592729
Passengers    0.239193     0.730749
Length        0.369908     0.0429221
Wheelbase     0.374826     0.259648
Width         0.38949      -0.0422083
U Turn Space  0.359702     -0.0256716
Weight        0.396236     -0.0298902

First component:
0.376856*Engine Size + 0.292144*Horsepower + 0.239193*Passengers + 0.369908*Length + 0.374826*Wheelbase + 0.38949*Width + 0.359702*U Turn Space + 0.396236*Weight

Second component:
-0.205144*Engine Size - 0.592729*Horsepower + 0.730749*Passengers + 0.0429221*Length + 0.259648*Wheelbase - 0.0422083*Width - 0.0256716*U Turn Space - 0.0298902*Weight

Page 34: Statistical Tools for Multivariate Six Sigma

34

Interpretation

[Figure: Plot of C_2 vs C_1, with points coded by Type: Compact, Large, Midsize, Small, Sporty, Van.]

Page 35: Statistical Tools for Multivariate Six Sigma

35

Principal Component Regression

MPG City = 22.3656 - 1.84685*size + 0.567176*unsportiness

Parameter     Estimate   Standard Error  T Statistic  P-Value
CONSTANT      22.3656    0.353316        63.302       0.0000
size          -1.84685   0.147168        -12.5492     0.0000
unsportiness  0.567176   0.339277        1.67172      0.0981

R-squared = 64.0399 percent
R-squared (adjusted for d.f.) = 63.2408 percent
Standard Error of Est. = 3.40726
Mean absolute error = 2.26553

Page 36: Statistical Tools for Multivariate Six Sigma

36

Partial Least Squares (PLS)

Similar to PCA, except that it finds components that explain as much of the variance as possible in both the X's and the Y's.

May be used with many X variables, even when their number exceeds the number of observations n.

Page 37: Statistical Tools for Multivariate Six Sigma

37

Component Extraction

Starts with number of components equal to the minimum of p and (n-1).

[Figure: Model comparison plot of percent variation in X and Y vs number of components (1 to 8).]

Page 38: Statistical Tools for Multivariate Six Sigma

38

Coefficient Plot

[Figure: PLS coefficient plot of standardized coefficients for MPG City, MPG Highway, and Fueltank, by predictor: Engine Size, Horsepower, Passengers, Length, Wheelbase, Width, U Turn Space, Weight.]

Page 39: Statistical Tools for Multivariate Six Sigma

39

Model in Original Units

MPG City = 50.0593 - 0.214083*Engine Size - 0.0347708*Horsepower - 0.884181*Passengers + 0.0294622*Length - 0.0362471*Wheelbase - 0.0882233*Width - 0.0282326*U Turn Space - 0.00391616*Weight

Page 40: Statistical Tools for Multivariate Six Sigma

40

Classification

Principal components can also be used to classify new observations.

A useful method for classification is a Bayesian classifier, which can be expressed as a neural network.

Page 41: Statistical Tools for Multivariate Six Sigma

41

6 Types of Automobiles

[Figure: Plot of unsportiness vs size, with points coded by Type: Compact, Large, Midsize, Small, Sporty, Van.]

Page 42: Statistical Tools for Multivariate Six Sigma

42

Neural Networks

Input layer (2 variables)

Pattern layer (93 cases)

Summation layer (6 neurons)

Output layer (6 groups)

Page 43: Statistical Tools for Multivariate Six Sigma

43

Bayesian Classifier

Begins with prior probabilities for membership in each group.

Uses a Parzen-like density estimator of the density function for each group:

$$g_j(X) = \frac{1}{n_j} \sum_{i=1}^{n_j} \exp\left( -\frac{\lVert X - X_i \rVert^2}{\sigma^2} \right)$$
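A small sketch of how such a classifier can work: for each group, average a Gaussian-type kernel exp(-‖X - X_i‖²/σ²) over that group's training cases, multiply by the group's prior probability, and assign the new case to the group with the largest product. The training points, priors, and σ below are hypothetical, and the function names are illustrative rather than taken from any particular software.

```python
import math

def group_density(x, members, sigma):
    """g_j(x): average kernel value over the training cases of one group."""
    total = 0.0
    for xi in members:
        dist2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        total += math.exp(-dist2 / sigma ** 2)
    return total / len(members)

def classify(x, training, priors, sigma):
    """Return the group label that maximizes prior * g_j(x).

    Note: with a very small sigma every kernel can underflow to 0.0 for a
    point far from all training cases, in which case the first group wins.
    """
    scores = {g: priors[g] * group_density(x, pts, sigma)
              for g, pts in training.items()}
    return max(scores, key=scores.get)

# Hypothetical training cases in (size, unsportiness) coordinates.
training = {
    "Small": [(-3.0, 0.2), (-2.5, -0.4), (-3.4, 0.6)],
    "Van": [(2.0, 2.5), (2.4, 3.0), (1.8, 2.2)],
}
priors = {"Small": 0.5, "Van": 0.5}  # equal priors, chosen for illustration

label_a = classify((-2.8, 0.0), training, priors, sigma=1.0)
label_b = classify((2.1, 2.6), training, priors, sigma=1.0)
print(label_a, label_b)
```

Smaller values of σ make each training case influence only its immediate neighborhood, which produces more fragmented classification regions; larger values smooth the regions.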

Page 44: Statistical Tools for Multivariate Six Sigma

44

Options

The prior probabilities may be determined in several ways.

A training set is usually used to find a good value for the spacing parameter σ.

Page 45: Statistical Tools for Multivariate Six Sigma

45

Output

Number of cases in training set: 93
Number of cases in validation set: 0
Spacing parameter used: 0.0109375 (optimized by jackknifing during training)

Training Set

Type     Members  Percent Correctly Classified
Compact  16       75.0
Large    11       100.0
Midsize  22       77.2727
Small    21       76.1905
Sporty   14       85.7143
Van      9        100.0
Total    93       82.7957

Page 46: Statistical Tools for Multivariate Six Sigma

46

Classification Regions

[Figure: Classification plot of unsportiness vs size showing the region assigned to each Type (Compact, Large, Midsize, Small, Sporty, Van). sigma = 0.0109375.]

Page 47: Statistical Tools for Multivariate Six Sigma

47

Changing Sigma

[Figure: Classification plot of unsportiness vs size showing the region assigned to each Type (Compact, Large, Midsize, Small, Sporty, Van). sigma = 0.3.]

Page 48: Statistical Tools for Multivariate Six Sigma

48

Overlay Plot

[Figure: Classification plot of unsportiness vs size with the observations overlaid on the classification regions for each Type (Compact, Large, Midsize, Small, Sporty, Van). sigma = 0.3.]

Page 49: Statistical Tools for Multivariate Six Sigma

49

Outlier Detection

[Figure: Control ellipse for unsportiness vs size.]

Page 50: Statistical Tools for Multivariate Six Sigma

50

Cluster Analysis

[Figure: Cluster scatterplot of unsportiness vs size. Method of k-Means, Squared Euclidean. Clusters 1 to 4 with centroids.]

Page 51: Statistical Tools for Multivariate Six Sigma

51

Design of Experiments

When more than one characteristic is important, finding the optimal operating conditions usually requires a tradeoff of one characteristic for another.

One approach to finding a single solution is to use desirability functions.

Page 52: Statistical Tools for Multivariate Six Sigma

52

Example #3

Myers and Montgomery (2002) describe an experiment on a chemical process:

Response variable      Goal
Conversion percentage  Maximize
Thermal activity       Maintain between 55 and 60

Input factor  Low        High
time          8 minutes  17 minutes
temperature   160 °C     210 °C
catalyst      1.5%       3.5%

Page 53: Statistical Tools for Multivariate Six Sigma

53

Experiment

run  time (minutes)  temperature (degrees C)  catalyst (percent)  conversion  activity
1    10.0            170.0                    2.0                 74.0        53.2
2    15.0            170.0                    2.0                 51.0        62.9
3    10.0            200.0                    2.0                 88.0        53.4
4    15.0            200.0                    2.0                 70.0        62.6
5    10.0            170.0                    3.0                 71.0        57.3
6    15.0            170.0                    3.0                 90.0        67.9
7    10.0            200.0                    3.0                 66.0        59.8
8    15.0            200.0                    3.0                 97.0        67.8
9    8.3             185.0                    2.5                 76.0        59.1
10   16.7            185.0                    2.5                 79.0        65.9
11   12.5            160.0                    2.5                 85.0        60.0
12   12.5            210.0                    2.5                 97.0        60.7
13   12.5            185.0                    1.66                55.0        57.4
14   12.5            185.0                    3.35                81.0        63.2
15   12.5            185.0                    2.5                 81.0        59.2
16   12.5            185.0                    2.5                 75.0        60.4
17   12.5            185.0                    2.5                 76.0        59.1
18   12.5            185.0                    2.5                 83.0        60.6
19   12.5            185.0                    2.5                 80.0        60.8
20   12.5            185.0                    2.5                 91.0        58.9

Page 54: Statistical Tools for Multivariate Six Sigma

54

Step #1: Model Conversion

[Figure: Standardized Pareto chart for conversion, showing the standardized effects of A:time, B:temperature, C:catalyst and the terms AA, AB, AC, BB, BC, CC, with positive and negative effects distinguished.]

Page 55: Statistical Tools for Multivariate Six Sigma

55

Step #2: Optimize Conversion

Goal: maximize conversion
Optimum value = 118.174

Factor       Low    High   Optimum
time         8.0    17.0   17.0
temperature  160.0  210.0  210.0
catalyst     1.5    3.5    3.48086

[Figure: Contours of estimated response surface for conversion (70.0 to 100.0) over time and catalyst, at temperature = 210.0.]

Page 56: Statistical Tools for Multivariate Six Sigma

56

Step #3: Model Activity

[Figure: Standardized Pareto chart for activity, showing the standardized effects of A:time, B:temperature, C:catalyst and the terms AA, AB, AC, BB, BC, CC, with positive and negative effects distinguished.]

Page 57: Statistical Tools for Multivariate Six Sigma

57

Step #4: Optimize Activity

Goal: maintain activity at 57.5
Optimum value = 57.5

Factor       Low     High    Optimum
time         8.3     16.7    10.297
temperature  209.99  210.01  210.004
catalyst     1.66    3.35    2.31021

[Figure: Contours of estimated response surface for activity (55.0 to 60.0) over time and catalyst, at temperature = 210.0.]

Page 58: Statistical Tools for Multivariate Six Sigma

58

Step #5: Select Desirability Fcns.

Maximize

[Figure: Desirability function for maximization: desirability d vs predicted response between Low and High, with curves for s = 0.2, 0.4, 1, 2, and 8.]

Page 59: Statistical Tools for Multivariate Six Sigma

59

Desirability Function

Hit Target

[Figure: Desirability function for hitting a target: desirability d vs predicted response between Low, Target, and High, with curves for s = t = 0.1, s = t = 1, and s = t = 5.]

Page 60: Statistical Tools for Multivariate Six Sigma

60

Combined Desirability

$$D = \left( d_1^{I_1} d_2^{I_2} \cdots d_m^{I_m} \right)^{1 / \sum_{j=1}^{m} I_j}$$

where m = number of responses and 0 ≤ I_j ≤ 5. D ranges from 0 to 1.

Page 61: Statistical Tools for Multivariate Six Sigma

61

Example

Optimum value = 0.949092

Factor       Low    High   Optimum
time         8.0    17.0   11.1394
temperature  160.0  210.0  210.0
catalyst     1.5    3.5    2.20119

Response    Low    High   Goal      Weights First  Weights Second  Impact
conversion  50.0   100.0  Maximize  1.0                            3.0
activity    55.0   60.0   57.5      1.0            1.0             3.0

Response    Optimum
conversion  95.0388
activity    57.5
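The optimum desirability in this example can be checked by hand. The sketch below assumes the usual piecewise power form for the individual desirability functions (0 below Low, 1 above High, and a power curve in between for maximization; two power curves meeting at the target for hit-target), then combines them as the impact-weighted geometric mean of the d's. The function names are illustrative.

```python
def d_maximize(y, low, high, s=1.0):
    """Desirability for a response to be maximized (shape weight s)."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** s

def d_target(y, low, target, high, s=1.0, t=1.0):
    """Desirability for hitting a target (shape weights s below, t above)."""
    if y < low or y > high:
        return 0.0
    if y <= target:
        return ((y - low) / (target - low)) ** s
    return ((y - high) / (target - high)) ** t

def combined_desirability(ds, impacts):
    """D = (d_1^I_1 * ... * d_m^I_m) ** (1 / sum of impacts)."""
    product = 1.0
    for d, impact in zip(ds, impacts):
        product *= d ** impact
    return product ** (1.0 / sum(impacts))

# Predicted responses at the optimum, with limits, weights, and impacts
# taken from the example tables.
d_conversion = d_maximize(95.0388, low=50.0, high=100.0, s=1.0)
d_activity = d_target(57.5, low=55.0, target=57.5, high=60.0, s=1.0, t=1.0)
D = combined_desirability([d_conversion, d_activity], impacts=[3.0, 3.0])
print(f"d_conversion={d_conversion:.6f} d_activity={d_activity:.1f} D={D:.6f}")
```

With equal impacts, D is the plain geometric mean of the two desirabilities, which is why an activity exactly on target (d = 1) leaves D equal to the square root of the conversion desirability.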

Page 62: Statistical Tools for Multivariate Six Sigma

62

Desirability Contours

[Figure: Contours of estimated response surface for desirability (0.0 to 1.0) over time and catalyst, at temperature = 210.0.]

Page 63: Statistical Tools for Multivariate Six Sigma

63

Desirability Surface

[Figure: Estimated response surface for desirability (0 to 1) over time and catalyst, at temperature = 210.0.]

Page 64: Statistical Tools for Multivariate Six Sigma

64

Overlaid Contours

[Figure: Overlay plot of conversion and activity contours over time (10 to 15) and catalyst (2 to 3), at temperature = 210.0.]

Page 65: Statistical Tools for Multivariate Six Sigma

65

References

Johnson, R. A. and Wichern, D. W. (2002). Applied Multivariate Statistical Analysis. Upper Saddle River: Prentice Hall.

Mason, R. L. and Young, J. C. (2002). Multivariate Statistical Process Control with Industrial Applications. Philadelphia: SIAM.

Montgomery, D. C. (2005). Introduction to Statistical Quality Control, 5th edition. New York: John Wiley and Sons.

Myers, R. H. and Montgomery, D. C. (2002). Response Surface Methodology: Process and Product Optimization Using Designed Experiments, 2nd edition. New York: John Wiley and Sons.

