paul geladi Oct 2007 1

Regression / Calibration

Paul Geladi, SBT, SLU

MLR, RR, PCR, PLS

Univariate regression


[Figure: straight-line fit of y against x, showing the offset a, the slope b, and the residual e]

y = a + bx + e
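As a minimal sketch of the univariate model, the offset a and slope b can be estimated by least squares. The data here are synthetic, and the names `x`, `y`, `a`, `b`, `e` only mirror the symbols on the slide; none of the numbers come from the lecture.

```python
import numpy as np

# Hypothetical data: true offset 2.0, true slope 0.5, plus noise e.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 25)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=x.size)

# Closed-form least-squares estimates for y = a + b*x + e.
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

# Residuals: the part of y the straight line does not explain.
e = y - (a + b * x)
```

With an offset in the model the residuals sum to zero, which is a quick sanity check on any implementation.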


[Figure: two x-y plots; labels: Linear fit, Underfit]


[Figure: two x-y plots; labels: Overfit, Quadratic fit]
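The underfit / overfit contrast can be illustrated with polynomial fits of increasing degree. This is only a sketch on synthetic data: the quadratic trend, the noise level, and the degrees 1, 2, and 8 are assumptions chosen for illustration.

```python
import numpy as np

# Synthetic, hypothetical data: a quadratic trend plus noise.
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 12)
y = 1.0 + 2.0 * x**2 + rng.normal(scale=0.1, size=x.size)

def ss_res(degree):
    """Residual sum of squares of a polynomial fit of the given degree."""
    coeffs = np.polyfit(x, y, degree)
    return float(np.sum((y - np.polyval(coeffs, x)) ** 2))

# Degree 1: linear fit (underfit for these data).
# Degree 2: quadratic fit.
# Degree 8: flexible enough to start following the noise (overfit).
ss = {degree: ss_res(degree) for degree in (1, 2, 8)}
```

The calibration residual keeps shrinking as the degree grows, so it cannot be used on its own to choose the model; a test set is needed for that.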


Multivariate linear regression

Things to do

• Set up the equation

• Solve the equation

• Diagnose the equation

• Visualise the results

• Use the equation


Things to do

• Check residuals

• Check for outliers

• Check for nonlinearity

• Correct for nonlinearity

• Wavelength reduction

y = f(x), one predictor x: works sometimes

y = f(x), a vector x of predictors: works only for a few variables

Measurement noise!

∞ possible functions


[Figure: data matrix X (I × K) and response vector y (I × 1)]

y = f(x)

Simplified by:

$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_K x_K + f$

Linear approximation


Nomenclature

$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_K x_K + f$

$y$ : response

$x_k$ : predictors

$b_k$ : regression coefficients

$b_0$ : offset, constant

$f$ : residual

[Figure: data matrix X (I × K) and response vector y (I × 1)]

With X and y mean-centered, $b_0$ drops out
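A short numerical check of why mean-centering removes the offset: fitting with an explicit constant column and fitting the centered data give the same slopes, and the offset can be recovered from the means. The data and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
I, K = 20, 3                                   # I samples, K predictors
X = rng.normal(size=(I, K))
y = 4.0 + X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=I)

# Fit 1: keep b0 by adding a column of ones to X.
X1 = np.column_stack([np.ones(I), X])
b_full, *_ = np.linalg.lstsq(X1, y, rcond=None)

# Fit 2: mean-center X and y, then fit without an offset.
x_mean, y_mean = X.mean(axis=0), y.mean()
b_centered, *_ = np.linalg.lstsq(X - x_mean, y - y_mean, rcond=None)

# The offset is recovered afterwards from the means.
b0 = y_mean - x_mean @ b_centered
```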


One equation for each of the I samples (i = 1, ..., I):

$y_i = b_1 x_{i1} + b_2 x_{i2} + \dots + b_K x_{iK} + f_i$


[Figure: y (I × 1) = X (I × K) · b (K × 1) + f (I × 1)]

y = Xb + f

X, y: known, measurable

b, f: unknown

No solution

f must be constrained


The MLR solution

Multiple Linear Regression

Ordinary Least Squares (OLS)

$b = (X'X)^{-1} X'y$

Problems?

Least squares
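The OLS formula can be sketched directly in NumPy. The data are synthetic and mean-centered; solving the normal equations with `np.linalg.solve` rather than forming the inverse explicitly is a choice made here, not something the slides prescribe.

```python
import numpy as np

rng = np.random.default_rng(3)
I, K = 30, 4                                   # more samples than variables
X = rng.normal(size=(I, K))
X -= X.mean(axis=0)                            # mean-center X
y = X @ np.array([2.0, 0.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=I)
y -= y.mean()                                  # mean-center y

# b = (X'X)^-1 X'y, computed by solving the normal equations X'X b = X'y.
b_normal = np.linalg.solve(X.T @ X, X.T @ y)

# The same least-squares problem solved by a library routine.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# Least squares makes the residual f orthogonal to every column of X.
f = y - X @ b_normal
```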


3b1 + 4b2 = 1

4b1 + 5b2 = 0

One solution

3b1 + 4b2 = 1

4b1 + 5b2 = 0

b1 + b2 = 4

No solution
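The two small systems above can be checked numerically. A sketch, with variable names chosen for illustration: the square system is solved exactly, and a rank comparison shows that the third equation makes the system inconsistent.

```python
import numpy as np

# Two equations, two unknowns: exactly one solution.
A2 = np.array([[3.0, 4.0],
               [4.0, 5.0]])
y2 = np.array([1.0, 0.0])
b = np.linalg.solve(A2, y2)                    # b1 = -5, b2 = 4

# Adding the third equation b1 + b2 = 4 gives three equations, two unknowns.
A3 = np.vstack([A2, [1.0, 1.0]])
y3 = np.array([1.0, 0.0, 4.0])

# An exact solution exists only if appending y does not raise the rank.
rank_A = np.linalg.matrix_rank(A3)
rank_aug = np.linalg.matrix_rank(np.column_stack([A3, y3]))
consistent = rank_A == rank_aug                # False: no exact solution
```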


3b1 + 4b2 + b3 = 1

4b1 + 5b2 + b3 = 0

∞ solutions

$b = (X'X)^{-1} X'y$

- K > I: ∞ solutions

- I > K: no solution

- error in X

- error in y

- inverse may not exist

- inverse may be unstable
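Two of the problems listed above can be made concrete with a small sketch. The first part uses the two-equation, three-unknown system from the slide; the second part uses hypothetical, nearly collinear columns to show what an unstable inverse looks like. The pseudoinverse is used here only as one way to pick a solution, not as the method of the lecture.

```python
import numpy as np

# K > I: the two equations in three unknowns from the slide.
X = np.array([[3.0, 4.0, 1.0],
              [4.0, 5.0, 1.0]])
y = np.array([1.0, 0.0])

# X'X is 3 x 3 but has rank 2, so its inverse does not exist.
rank_xtx = np.linalg.matrix_rank(X.T @ X)

# The pseudoinverse picks the minimum-norm solution among infinitely many.
b_min = np.linalg.pinv(X) @ y

# Adding any multiple of a null-space vector of X gives another exact solution.
null_vec = np.array([-1.0, 1.0, -1.0])         # X @ null_vec is zero
b_other = b_min + 2.5 * null_vec

# Unstable inverse: two almost identical columns (hypothetical data).
rng = np.random.default_rng(4)
x1 = np.linspace(0.0, 1.0, 10)
Xc = np.column_stack([x1, x1 + 1e-5 * rng.normal(size=10)])
cond_xtx = np.linalg.cond(Xc.T @ Xc)           # very large condition number
```

A large condition number means that small errors in X or y are strongly amplified in b.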


$3b_1 + 4b_2 + e_1 = 1$

$4b_1 + 5b_2 + e_2 = 0$

$b_1 + b_2 + e_3 = 4$

Solution
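Once residuals are allowed, the three equations have a least-squares solution. A minimal sketch, assuming the residuals are minimized in the sum-of-squares sense:

```python
import numpy as np

# The three equations, written as y = Xb + e.
X = np.array([[3.0, 4.0],
              [4.0, 5.0],
              [1.0, 1.0]])
y = np.array([1.0, 0.0, 4.0])

# Least-squares estimate: minimizes e'e.
b, *_ = np.linalg.lstsq(X, y, rcond=None)      # b1 = 10, b2 = -23/3

# One residual per equation.
e = y - X @ b                                  # 5/3, -5/3, 5/3
```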

Wanted solution

- I ≥ K

- No inverse

- No noise in X


Diagnostics

y = Xb + f

$SS_{tot} = SS_{mod} + SS_{res}$

$R^2 = SS_{mod} / SS_{tot} = 1 - SS_{res} / SS_{tot}$

Coefficient of determination

Diagnostics

y = Xb + f

$SS_{res} = f'f$

$RMSEC = [SS_{res} / (I - A)]^{1/2}$

Root Mean Squared Error of Calibration
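The calibration diagnostics can be computed in a few lines. This sketch uses synthetic, mean-centered data and an OLS fit. The slides do not define A at this point, so taking A equal to the number of fitted coefficients K is an assumption made here.

```python
import numpy as np

rng = np.random.default_rng(5)
I, K = 25, 3
X = rng.normal(size=(I, K))
X -= X.mean(axis=0)
y = X @ np.array([1.5, -0.5, 2.0]) + rng.normal(scale=0.2, size=I)
y -= y.mean()

b, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ b
f = y - yhat                                   # calibration residuals

ss_tot = y @ y                                 # total sum of squares (y is centered)
ss_mod = yhat @ yhat                           # sum of squares explained by the model
ss_res = f @ f                                 # residual sum of squares
r2 = 1.0 - ss_res / ss_tot                     # coefficient of determination

A = K                                          # assumption: A = number of fitted coefficients
rmsec = np.sqrt(ss_res / (I - A))
```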


Alternatives to MLR/OLS

Principal Component Regression (PCR)

- I ≥ K

- Easy inversion


Principal Component Regression (PCR)

[Figure: PCA compresses X (I × K) into the score matrix T (I × A)]

- A ≤ I

- T orthogonal

- Noise in X removed

Principal Component Regression (PCR)

y = Td + f

$d = (T'T)^{-1} T'y$
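A compact sketch of PCR, assuming mean-centered data and a PCA computed through the singular value decomposition (the slides do not say how the PCA is computed). The function name `pcr` and the back-transformation `b = P d` are illustrative choices.

```python
import numpy as np

def pcr(X, y, A):
    """Principal component regression on mean-centered X (I x K) and y (I,)."""
    # PCA via SVD: the rows of Vt are the loading vectors.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    P = Vt[:A].T                               # loadings, K x A
    T = X @ P                                  # scores, I x A, orthogonal columns
    # T'T is diagonal, so (T'T)^-1 T'y reduces to an element-wise division.
    d = (T.T @ y) / np.sum(T**2, axis=0)
    b = P @ d                                  # coefficients for the original variables
    return T, d, b

# Hypothetical data.
rng = np.random.default_rng(6)
I, K = 30, 5
X = rng.normal(size=(I, K))
X -= X.mean(axis=0)
y = X @ rng.normal(size=K) + rng.normal(scale=0.1, size=I)
y -= y.mean()

T2, d2, b_pcr2 = pcr(X, y, A=2)                # two components
T_all, d_all, b_pcr_all = pcr(X, y, A=K)       # all components
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With all K components retained, PCR reproduces the OLS coefficients; the benefit comes from keeping fewer components.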


Problem

How many components used?

Advantage

- PCA done on data

- Outliers

- Classes

- Noise in X removed


Partial Least Squares Regression

[Figure: X block with score vector t, Y block with score vector u]


[Figure: X, Y with score vectors t, u and weight vectors w', q'. Panel titles: Outer relationship, Inner relationship]


[Figure: X and Y decomposed into A components each: scores t, u, weights w', q', and loadings p']

Advantages

- X decomposed

- Y decomposed

- Noise in X left out

- Noise in Y left out


PCR, PLS are one component at a time methods

After each component, a residual is calculated

The next component is calculated on the residual
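The one-component-at-a-time idea can be sketched for PLS with a single response (often called PLS1). This is a commonly used NIPALS-style formulation, not necessarily the exact algorithm behind the slides; with one y the Y scores u are proportional to y, so they are not computed separately. All names are illustrative.

```python
import numpy as np

def pls1(X, y, A):
    """PLS regression for a single mean-centered y, one component at a time."""
    Xr, yr = X.copy(), y.copy()                # residuals start as the centered data
    W, P, T, q = [], [], [], []
    for _ in range(A):
        w = Xr.T @ yr                          # weights from the current residuals
        w /= np.linalg.norm(w)
        t = Xr @ w                             # X scores for this component
        p = Xr.T @ t / (t @ t)                 # X loadings
        qa = yr @ t / (t @ t)                  # regression of y residual on t
        Xr = Xr - np.outer(t, p)               # residual of X after this component
        yr = yr - qa * t                       # residual of y after this component
        W.append(w); P.append(p); T.append(t); q.append(qa)
    W, P, T = np.column_stack(W), np.column_stack(P), np.column_stack(T)
    b = W @ np.linalg.solve(P.T @ W, np.array(q))
    return T, b

# Hypothetical data.
rng = np.random.default_rng(7)
I, K = 30, 4
X = rng.normal(size=(I, K))
X -= X.mean(axis=0)
y = X @ np.array([1.0, -1.0, 0.5, 2.0]) + rng.normal(scale=0.1, size=I)
y -= y.mean()

T2, b_pls2 = pls1(X, y, A=2)
T_all, b_pls_all = pls1(X, y, A=K)
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Because each component is computed on the residual left by the previous ones, the score vectors come out mutually orthogonal.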

Another view

y = Xb + f

$y = X b_{RR} + f_{RR}$

$y = X b_{PCR} + f_{PCR}$

$y = X b_{PLS} + f_{PLS}$
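All of these methods end in the same linear form and differ only in how b is obtained. As one example, a sketch of ridge regression (RR) next to OLS, using the usual textbook ridge estimate; the slides do not spell this formula out, and the penalty value `lam` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(8)
I, K = 20, 4
X = rng.normal(size=(I, K))
X -= X.mean(axis=0)
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(scale=0.2, size=I)
y -= y.mean()

lam = 1.0                                      # hypothetical ridge penalty

# OLS: b = (X'X)^-1 X'y
b_ols = np.linalg.solve(X.T @ X, X.T @ y)

# Ridge: b_RR = (X'X + lam * I)^-1 X'y, which stabilizes the inversion.
b_rr = np.linalg.solve(X.T @ X + lam * np.eye(K), X.T @ y)

f_ols = y - X @ b_ols
f_rr = y - X @ b_rr
```

The ridge coefficients are shrunk relative to OLS, at the price of a slightly larger calibration residual.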


Prediction


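[Figure: calibration set Xcal (I × K) with ycal (I × 1); test set Xtest (J × K) with ytest (J × 1) and predicted values $\hat{y}$ (J × 1)]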

Prediction diagnostics

$\hat{y} = X_{test} b$

$f_{test} = y_{test} - \hat{y}$

$PRESS = f_{test}' f_{test}$

$RMSEP = [PRESS / J]^{1/2}$

Root Mean Squared Error of Prediction
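A sketch of the prediction diagnostics on a synthetic calibration / test split. The model is OLS here for simplicity, and centering the test data with the calibration means is an assumption about how the model is applied to new samples.

```python
import numpy as np

rng = np.random.default_rng(9)
I, J, K = 30, 15, 3                            # calibration samples, test samples, variables
b_true = np.array([1.0, -0.5, 2.0])
Xcal = rng.normal(size=(I, K))
ycal = 1.0 + Xcal @ b_true + rng.normal(scale=0.1, size=I)
Xtest = rng.normal(size=(J, K))
ytest = 1.0 + Xtest @ b_true + rng.normal(scale=0.1, size=J)

# Build the model on mean-centered calibration data.
x_mean, y_mean = Xcal.mean(axis=0), ycal.mean()
b, *_ = np.linalg.lstsq(Xcal - x_mean, ycal - y_mean, rcond=None)

# Predict the test set, reusing the calibration means.
yhat = y_mean + (Xtest - x_mean) @ b
ftest = ytest - yhat                           # prediction residuals

press = ftest @ ftest                          # PRESS = ftest'ftest
rmsep = np.sqrt(press / J)                     # RMSEP = [PRESS / J]^(1/2)
```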


Prediction diagnostics

$\hat{y} = X_{test} b$

$f_{test} = y_{test} - \hat{y}$

$R^2_{test} = Q^2 = 1 - f_{test}' f_{test} / y_{test}' y_{test}$

Some rules of thumb

$R^2 > 0.65$, 5 PLS comp.

$R^2_{test} > 0.5$

$R^2 - R^2_{test} < 0.2$
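The test-set R2 and the numeric rules of thumb are easy to wrap in two small helpers. Both function names are hypothetical, the formula assumes ytest and the predictions are already mean-centered (as the expression ytest'ytest suggests), and the "5 PLS comp." remark is not encoded because its exact meaning is not recoverable from the slide.

```python
import numpy as np

def q2(ytest, yhat):
    """R2test = Q2 = 1 - ftest'ftest / ytest'ytest, for mean-centered ytest."""
    ftest = ytest - yhat
    return 1.0 - (ftest @ ftest) / (ytest @ ytest)

def passes_rules_of_thumb(r2, r2test):
    """Check the three numeric rules of thumb from the slide."""
    return r2 > 0.65 and r2test > 0.5 and (r2 - r2test) < 0.2

# Tiny hypothetical example.
ytest = np.array([1.0, -2.0, 1.0])
perfect = q2(ytest, ytest)                     # perfect prediction
useless = q2(ytest, np.zeros(3))               # always predicting the mean
```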


Bias

$f = y - Xb$

always 0 bias

$f_{test} = y_{test} - \hat{y}$

$bias = (1/J) \sum f_{test}$
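The difference between calibration and test bias can be checked numerically. In this sketch (synthetic data, OLS on mean-centered calibration data) the calibration residuals average to zero by construction, while the test residuals generally do not.

```python
import numpy as np

rng = np.random.default_rng(10)
I, J, K = 30, 15, 3
b_true = np.array([0.5, 1.0, -1.5])
Xcal = rng.normal(size=(I, K))
ycal = 2.0 + Xcal @ b_true + rng.normal(scale=0.2, size=I)
Xtest = rng.normal(size=(J, K))
ytest = 2.0 + Xtest @ b_true + rng.normal(scale=0.2, size=J)

x_mean, y_mean = Xcal.mean(axis=0), ycal.mean()
b, *_ = np.linalg.lstsq(Xcal - x_mean, ycal - y_mean, rcond=None)

# Calibration residuals of the centered model: their mean is zero.
f = (ycal - y_mean) - (Xcal - x_mean) @ b
bias_cal = f.sum() / I

# Test residuals: the mean is the prediction bias.
yhat = y_mean + (Xtest - x_mean) @ b
ftest = ytest - yhat
bias_test = ftest.sum() / J
```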

Leverage - influence

$b = (X'X)^{-1} X'y$

$\hat{y} = Xb = X(X'X)^{-1} X'y = Hy$

the Hat matrix

diagonal elements of H: Leverage
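A minimal sketch of the hat matrix and the leverages, on hypothetical mean-centered data. The checks at the end are standard properties of H for a full-rank X: it is idempotent, and the leverages lie between 0 and 1 and sum to K.

```python
import numpy as np

rng = np.random.default_rng(11)
I, K = 20, 3
X = rng.normal(size=(I, K))
X -= X.mean(axis=0)
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=0.1, size=I)
y -= y.mean()

# Hat matrix H = X (X'X)^-1 X', computed without forming the inverse explicitly.
H = X @ np.linalg.solve(X.T @ X, X.T)

# Leverage of each sample: the diagonal elements of H.
leverage = np.diag(H)

# H "puts the hat on y": yhat = Hy equals Xb.
b = np.linalg.solve(X.T @ X, X.T @ y)
yhat = H @ y
```

Samples with a leverage far above the average K / I have an unusually strong influence on the fitted model.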



Residual plot

Residual

- Check histogram of f

- Check variablewise E

- Check objectwise E
