Correction of systematic disturbances in latent-variable calibration models

Paman Gujral, Michael Amrhein, Barry M. Wise, Enrique Guzman, Davyd Chivala and Dominique Bonvin

Laboratoire d’Automatique, École Polytechnique Fédérale de Lausanne

Gujral (LA-EPFL) SSC11 2009 08/06/2009 1 / 15

Agenda

Introduction

Calibration and prediction

Constituents of prediction error

Unifying framework for different correction methodologies

Illustrative examples

Simulation example

Two real data examples


Introduction

Background

Calibration: $\{X_c, y_c\} \rightarrow b$

Prediction: $\hat{y}_p = X_p b$

$x_c^T = y_c s^T + (\text{spectra from other species}) + \text{noise}_c$

but

$x_p^T = y_p s^T + (\text{spectra from other species}) + d^T + \text{noise}_p$

Introduction

Constituents of prediction error

$\hat{y}_p = x_p^T b = (y_p s^T + \text{spectra from other species})\, b + d^T b + (\text{noise}_p)\, b$

Prediction error $(\hat{y}_p - y_p)$ has three constituents:

1. due to noise (variance)

2. due to systematic disturbance (bias)

3. due to the PCR/PLSR modeling error (bias)
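The decomposition above can be checked numerically. The following is a minimal sketch, assuming a single species, a synthetic pure spectrum `s`, and a toy regression vector `b` chosen so that `s @ b == 1`; all names and magnitudes are illustrative and not taken from the slides. Because `b` is exact here, the third constituent (PCR/PLSR modeling error) is zero and only the drift and noise terms remain.

```python
import numpy as np

rng = np.random.default_rng(0)
L = 50                                   # number of wavelength channels (assumed)
s = rng.random(L)                        # pure spectrum of the analyte (synthetic)
b = s / (s @ s)                          # toy regression vector with s^T b = 1
d = 0.05 * rng.standard_normal(L)        # systematic disturbance (synthetic)
noise_p = 0.01 * rng.standard_normal(L)  # measurement noise (synthetic)

y_p = 0.7                                # true value of the predicted quantity
x_p = y_p * s + d + noise_p              # prediction spectrum, no other species

y_hat = x_p @ b                          # predicted value
signal_term = (y_p * s) @ b              # equals y_p because s^T b = 1
drift_bias = d @ b                       # bias caused by the systematic disturbance
noise_term = noise_p @ b                 # zero-mean term that contributes variance

print("prediction error:", y_hat - y_p)
print("drift + noise   :", drift_bias + noise_term)
```

The two printed numbers agree, which is the statement that the prediction error splits into `d @ b` plus `noise_p @ b` when the model itself is exact.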


Unifying framework

EXPLICIT CORRECTION METHODS USING ADDITIONAL MEASUREMENTS

CC: component correction, 2000

IIR: independent interference reduction, 2001

GLSW: generalized least squares weighting, 2003

EPO: external parameter orthogonalization, 2003

TOP: calibration transfer by orthogonal projection, 2004

DCPS: difference correction of prediction samples, 2005

DOP: dynamic orthogonal projection, 2006

EROS: error removal by orthogonal subtraction, 2008


Unifying framework

STEP 1: ESTIMATION OF DRIFT SPACE

Matched y-values: nτ replicate measurements

D approximated as

$D = \underbrace{X_{\tau,2}}_{\text{slave}} - \underbrace{X_{\tau,1}}_{\text{master}}$

Methods: GLSW & TOP (calibration transfer), EPO, DCPS & EROS (temperature changes), IIR (unknown variation), CC (unknown drift)

Non-matched y-values (e.g. uncontrolled online measurements): nτ reference measurements

D approximated as

$D = \underbrace{X_{\tau}}_{\text{slave}} - \underbrace{A X_c}_{\text{master}}$

Methods: DOP (unknown drift)
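A minimal sketch of the two drift estimates follows. The function names are hypothetical, and the matrix `A` in the non-matched case is treated as given: the slides do not say how it is built, so the row-selecting `A` in the usage lines is purely an assumption for illustration.

```python
import numpy as np

def drift_matched(X_slave, X_master):
    """D = X_tau2 - X_tau1 for the same n_tau samples measured under both conditions."""
    X_slave = np.asarray(X_slave, dtype=float)
    X_master = np.asarray(X_master, dtype=float)
    if X_slave.shape != X_master.shape:
        raise ValueError("matched replicates must have identical shapes")
    return X_slave - X_master

def drift_non_matched(X_tau, A, X_c):
    """D = X_tau - A X_c, with A (n_tau x n_c) supplied by the caller."""
    X_tau = np.asarray(X_tau, dtype=float)
    return X_tau - np.asarray(A, dtype=float) @ np.asarray(X_c, dtype=float)

# Synthetic usage: every replicate is shifted by the same offset spectrum.
rng = np.random.default_rng(1)
n_tau, n_c, L = 5, 20, 30
true_drift = np.linspace(0.0, 0.1, L)
X_master = rng.random((n_tau, L))
X_slave = X_master + true_drift
D = drift_matched(X_slave, X_master)     # each row equals true_drift
```

In the matched case each row of `D` is the difference spectrum of one replicate pair, so the rows span the estimated drift space that Step 2 then acts on.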


Unifying framework

STEP 2: DRIFT-CORRECTION

Shrinking (GLSW)

Calibrate with $\{X_c W, y_c\}$, where

$W = \left( \frac{D^T D}{n_\tau - 1} + \alpha^2 I \right)^{-1/2}$

Orthogonal projection (CC, IIR, EPO, DOP, TOP, EROS)

Calibrate with $\{X_c N, y_c\}$

$D = T P^T + E$ (with $r$ components)

$N = (I - P P^T)$

Subtraction (DCPS)

Calibrate with $\{X_c, y_c\}$

Assuming one drift factor, $d$, correct the prediction sample:

$x_p^* = x_p - \beta d$

$\beta$ optimized to minimize the 2-norm $\|x_p^*\|$.

ANALYTICAL RESULTS

(i) Equivalent for one drift component

(ii) Equivalent when $r = n_\tau$ and $\alpha \to 0$
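The three correction operators can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions: `D` is used without mean-centering, `P` is taken as the first `r` right singular vectors of `D`, and the function names are hypothetical. Reading result (i) as relating subtraction to orthogonal projection, and result (ii) as relating shrinking to orthogonal projection, is an interpretation of the slide layout. For (ii) the check compares `alpha * W` with `N`, on the grounds that a common scale factor applied to all spectra does not change PCR/PLSR predictions.

```python
import numpy as np

def shrinkage_matrix(D, alpha):
    """W = (D^T D / (n_tau - 1) + alpha^2 I)^(-1/2) via a symmetric eigendecomposition."""
    n_tau, L = D.shape
    C = D.T @ D / (n_tau - 1) + alpha**2 * np.eye(L)
    evals, evecs = np.linalg.eigh(C)          # C is symmetric positive definite for alpha > 0
    return (evecs / np.sqrt(evals)) @ evecs.T

def projection_matrix(D, r):
    """N = I - P P^T, with P the first r right singular vectors of D (uncentered)."""
    _, _, Vt = np.linalg.svd(D, full_matrices=False)
    P = Vt[:r].T
    return np.eye(D.shape[1]) - P @ P.T

def subtract_drift(x_p, d):
    """x_p* = x_p - beta d, with beta chosen to minimize the 2-norm of x_p*."""
    beta = (x_p @ d) / (d @ d)                # closed-form least-squares solution
    return x_p - beta * d

rng = np.random.default_rng(2)
n_tau, L = 4, 12
D = rng.standard_normal((n_tau, L))           # synthetic drift estimate
x_p = rng.standard_normal(L)                  # synthetic prediction spectrum

# (i) one drift component: subtraction coincides with orthogonal projection
d = D[0]
N1 = projection_matrix(d[None, :], r=1)
same_i = np.allclose(x_p @ N1, subtract_drift(x_p, d))

# (ii) r = n_tau and small alpha: alpha * W is close to N
alpha = 1e-4
W = shrinkage_matrix(D, alpha)
N = projection_matrix(D, r=n_tau)
same_ii = np.allclose(alpha * W, N, atol=1e-3)

print(same_i, same_ii)
```

For finite `alpha` the shrinking matrix only down-weights the drift directions instead of deleting them, which is the practical difference between the first two approaches.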


Step 2: Drift correction

CHOICE OF META-PARAMETERS (α, r)

More complex than determining pseudo-rank of D

Wilks’ f test, Malinowski’s F-test, Faber-Kowalski F-test

Results based on random matrix theory and perturbation theory (Nadler et al.)

Constituents of prediction error

$\hat{y}_p = x_p^T b = (y_p s^T + \text{spectra from other species})\, b + d^T b + (\text{noise}_p)\, b$

Prediction error constituents after drift correction

$b \perp$ estimated drift-space ⇒ bias due to drift $|d^T b|$ ↓

RMSECV ↑ ⇒ bias in PCR/PLSR ↑

$\|b\|_2$ ↑ ⇒ variance due to noise ↑
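The trade-off listed above can be illustrated on a toy problem. This is a minimal sketch, assuming a noise-free single-species calibration set and using the minimum-norm least-squares solution as a stand-in for PCR/PLSR; the spectra and the drift loading `p` are synthetic. Projecting out a drift loading that overlaps with the signal drives `|p @ b|` to zero but increases the norm of `b`, and for white noise of variance σ² the noise term `noise_p @ b` has variance σ²‖b‖².

```python
import numpy as np

rng = np.random.default_rng(3)
L = 40
s = rng.random(L)                         # pure analyte spectrum (synthetic)
p = s / np.linalg.norm(s) + 0.5 * rng.standard_normal(L) / np.sqrt(L)
p /= np.linalg.norm(p)                    # unit drift loading that overlaps with s
y_c = np.linspace(0.1, 1.0, 10)
X_c = np.outer(y_c, s)                    # noise-free, single-species calibration set

N = np.eye(L) - np.outer(p, p)            # orthogonal projection with r = 1

b_raw = np.linalg.pinv(X_c) @ y_c         # minimum-norm least-squares regression vector
b_op = np.linalg.pinv(X_c @ N) @ y_c      # same, after projecting out the drift loading

drift_bias_raw = abs(p @ b_raw)           # sensitivity to drift before correction
drift_bias_op = abs(p @ b_op)             # numerically zero after correction
norm_raw = np.linalg.norm(b_raw)
norm_op = np.linalg.norm(b_op)            # larger than norm_raw when p overlaps with s

print(drift_bias_raw, drift_bias_op, norm_raw, norm_op)
```

The more the drift loading overlaps with the signal, the larger the growth in the norm of `b`, which is one way to see why the choice of (α, r) is harder than a pseudo-rank decision.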


Step 2: Shrinkage vs orthogonal projection

EXAMPLE 1: SIMULATION

Data generation

Using Beer’s law and known pure component spectra of 4 species

Drift in 7-dimensional loading space S(Pd), overlapping with signal loading space S(Ps)

Estimated drift-space for one (out of 5) reference sample

$d^T = \sigma_d^T \begin{bmatrix} p_{d,1}^T \\ p_{d,2}^T \\ p_{d,3}^T \\ \vdots \\ p_{d,7}^T \end{bmatrix} + \sigma_s^T \begin{bmatrix} p_{s,1}^T \\ \vdots \\ p_{s,4}^T \end{bmatrix} + \sigma_n\, \mathrm{randn}(1, L)$

$\sigma_d^T = [\, 1 \times \mathrm{randn}(1, 2) \quad 0.1 \times \mathrm{randn}(1, 5) \,]$

$\sigma_s^T = [\, 0.3 \times \mathrm{randn}(1, 4) \,]$

$\sigma_n = 0.1$
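The generation of one estimated drift row can be sketched as follows. This is a minimal sketch: the scaling factors follow the slide, but the loading matrices `P_d` and `P_s` are random orthonormal placeholders (an assumption), whereas the slides derive them from Beer’s law and known pure component spectra. The function name and the choice of `L = 100` channels are also illustrative.

```python
import numpy as np

def simulate_drift_row(P_d, P_s, rng, sigma_n=0.1):
    """One row d^T = sigma_d^T P_d^T + sigma_s^T P_s^T + sigma_n * randn(1, L)."""
    L = P_d.shape[0]
    sigma_d = np.concatenate([1.0 * rng.standard_normal(2),
                              0.1 * rng.standard_normal(5)])   # 7 drift scores
    sigma_s = 0.3 * rng.standard_normal(4)                      # 4 signal scores
    noise = sigma_n * rng.standard_normal(L)                    # white measurement noise
    return sigma_d @ P_d.T + sigma_s @ P_s.T + noise

rng = np.random.default_rng(4)
L = 100
# Placeholder loadings: orthonormal columns from a reduced QR factorization (assumption).
P_d, _ = np.linalg.qr(rng.standard_normal((L, 7)))
P_s, _ = np.linalg.qr(rng.standard_normal((L, 4)))

# Stack 5 reference samples into the estimated drift matrix D (5 x L).
D = np.vstack([simulate_drift_row(P_d, P_s, rng) for _ in range(5)])
```

The `sigma_s` term is what places signal inside the estimated drift, so an orthogonal projection built from `D` also removes part of the signal directions.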


[Figure: RMSEP versus log(α). Curves: without correction; with shrinkage; with OP for r = 1 and for r = 2, 3, 4, 5.]

[Figure: six panels of RMSEP versus log(α), one per combination of signal level (high, σs = 3; medium, σs = 2; low, σs = 0.3) and noise level (low, σn = 0.1; high, σn = 0.3).]

Step 2: Shrinkage vs orthogonal projection

EXAMPLE 2: REAL DATA WITH TEMPERATURE EFFECTS

NIR, 3 species

11 x 2 = 22 @ 30, 40 °C

11 x 3 = 33 @ 50, 60, 70 °C

Monte Carlo

5 reference samples

Calibration

Prediction

[Figure: RMSEP versus log(α). Curves: without correction; with shrinkage; with OP for r = 1, 2, 3, 4, 5.]

Step 2: Shrinkage vs orthogonal projection

EXAMPLE 3: REAL DATA FOR CALIBRATION TRANSFER

NIR, 4 measured properties

40 on Instrument 1

40 on Instrument 2

Monte Carlo

5 reference samples

Calibration

Prediction

[Figure: RMSEP versus log(α). Curves: without correction; with shrinkage; with OP for r = 1, 2, 3, 4, 5.]

Summary

Unifying framework

Many drift-correction methods proceed in two steps: (i) drift estimation, (ii) drift correction

Illustrative examples

Shrinkage typically performs better than orthogonal projection when significant signal is present in the estimated drift

Open issues

Statistically sound approach to choose meta-parameters

Multiple shrinkage parameters $\{\alpha_1, \alpha_2, \ldots\}$
