CONSENSUS MULTIVARIATE CALIBRATION OR MAINTENANCE WITHOUT REFERENCE SAMPLES USING TIKHONOV TYPE REGULARIZATION APPROACHES

John Kalivas, Josh Ottaway, Jeremy Farrell, Parviz Shahbazikah
Department of Chemistry, Idaho State University, Pocatello, Idaho 83209 USA


Outline

• Multivariate calibration
• Tikhonov regularization (TR)
• TR calibration maintenance with reference samples to form full wavelength or sparse models
  – Selecting “a” model
  – Selecting a collection of models
  – Comparison to PLS
• TR calibration or maintenance without reference samples
  – Examples with comparison to PLS
• Summary TR variant equations

Spectral Multivariate Calibration

• y = Xb
  y = m × 1 vector of analyte reference values for m calibration samples
  X = m × n matrix of spectra for n wavelengths
  b = n × 1 regression (model) vector
• MLR solution; requires m ≥ n (wavelength selection)
  $$\hat{b} = (X^T X)^{-1} X^T y$$
• Biased regression solutions such as TR, RR (a TR variant), PLS, and PCR
  $$\hat{b} = X^{+} y$$
  – Requires meta-parameter (tuning parameter) selection
• Prediction of an unknown sample
  $$\hat{y}_{unk} = x_{unk}^T \hat{b}$$
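A minimal sketch of the calibration and prediction steps above, assuming synthetic data and using the Moore-Penrose pseudoinverse as one possible generalized inverse. The sizes, the random spectra, and the variable names are illustrative assumptions, not values from the presentation.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 20                      # hypothetical: fewer samples than wavelengths
X = rng.normal(size=(m, n))       # synthetic stand-in for calibration spectra
b_true = rng.normal(size=n)
y = X @ b_true                    # noise-free synthetic reference values

# Minimum-norm least-squares model vector, b_hat = X^+ y
b_hat = np.linalg.pinv(X) @ y

# Prediction for a new spectrum, y_unk_hat = x_unk^T b_hat
x_unk = rng.normal(size=n)
y_unk_hat = x_unk @ b_hat
```

With m < n the MLR normal equations are singular, which is why a biased (regularized or pseudoinverse) solution is used in this sketch.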

Quantitation by Tikhonov Regularization (TR)

• General TR in 2-norm
  $$\min_b \left( \|Xb - y\|_2^2 + \eta^2 \|Lb\|_2^2 \right)$$
  $$\begin{pmatrix} y \\ 0 \end{pmatrix} = \begin{pmatrix} X \\ \eta L \end{pmatrix} b, \qquad \hat{b} = (X^T X + \eta^2 L^T L)^{-1} X^T y$$
• Ridge regression (RR) when L = I
  $$\min_b \left( \|Xb - y\|_2^2 + \eta^2 \|b\|_2^2 \right)$$
  $$\begin{pmatrix} y \\ 0 \end{pmatrix} = \begin{pmatrix} X \\ \eta I \end{pmatrix} b, \qquad \hat{b} = (X^T X + \eta^2 I)^{-1} X^T y$$
  $\|\cdot\|_2$ = Euclidean vector 2-norm (vector magnitude or length)
• RR is regularized by using I and selecting η to minimize prediction errors (low bias) while simultaneously shrinking the model vector (low variance)
• Depending on the calibration goal, L can have different forms
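The augmented-matrix form of TR can be sketched as an ordinary least-squares solve on stacked arrays. This is an illustrative sketch on synthetic data; `tr_solve` is a hypothetical helper name, and the closed-form solve is included only to show that both routes give the same model vector.

```python
import numpy as np

def tr_solve(X, y, eta, L=None):
    """Solve min ||Xb - y||^2 + eta^2 ||Lb||^2 by stacking [X; eta*L] (L = I gives RR)."""
    n = X.shape[1]
    L = np.eye(n) if L is None else L
    X_aug = np.vstack([X, eta * L])
    y_aug = np.concatenate([y, np.zeros(L.shape[0])])
    b_hat, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return b_hat

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 30))     # synthetic spectra (assumption)
y = rng.normal(size=10)           # synthetic reference values (assumption)
eta = 0.5

b_aug = tr_solve(X, y, eta)
# Closed form for RR: (X^T X + eta^2 I)^-1 X^T y
b_closed = np.linalg.solve(X.T @ X + eta**2 * np.eye(30), X.T @ y)
```

Increasing `eta` shrinks the 2-norm of the model vector, which is the variance side of the tradeoff described above.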

Selecting η

• Cross-validation
• L-curve graphic (can use with RMSEC)
  – Bias/variance can be assessed
  – Useful for putting RR, PLS, etc. on one plot for objective comparison
  – C. L. Lawson et al., Solving Least-Squares Problems, Prentice-Hall (1974)
  – P. C. Hansen, Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion, SIAM Press (1998)

Figure: L-curve with axes ‖ŷ − y‖₂ and ‖Lb̂‖₂; labeled regions: underfitting, overfitting, best model.

Calibration Maintenance

• Need primary model to function over time and/or under new secondary conditions
  1. Prepare calibration samples to span all potential spectral variances
     • Not possible with seasonal or geographical effects in some data sets
  2. Preprocess primary and secondary data to be robust to new conditions
  3. Adjust spectra measured under new conditions to fit the primary model
  4. Update the primary model to predict in the new conditions

Calibration Maintenance with TR2

• Updating an RR model requires a new penalty term
  – Minimize prediction errors for a few samples from new secondary conditions
  $$\min_b \left( \|Xb - y\|_2^2 + \eta^2 \|b\|_2^2 + \lambda^2 \|Mb - y_M\|_2^2 \right)$$
  M = spectra from secondary conditions
  y_M = analyte reference values
• Avoid measuring many samples by tuning with λ
• Local centering
  – Respectively mean center X, y, M, and y_M
  – Validation spectra centered to M
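A sketch of TR2 with local centering, assuming the three terms are stacked into one least-squares problem. The function names are hypothetical, and adding the mean of y_M back to the prediction is an assumption of this sketch (the slide only states that validation spectra are centered to M).

```python
import numpy as np

def tr2_fit(X, y, M, yM, eta, lam):
    """Minimize ||Xb - y||^2 + eta^2 ||b||^2 + lam^2 ||Mb - yM||^2 after local centering."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()        # primary set centered to itself
    Mc, yMc = M - M.mean(axis=0), yM - yM.mean()     # standardization set centered to itself
    n = X.shape[1]
    A = np.vstack([Xc, eta * np.eye(n), lam * Mc])
    rhs = np.concatenate([yc, np.zeros(n), lam * yMc])
    b_hat, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return b_hat

def tr2_predict(x_new, b_hat, M, yM):
    """Center new spectra to the mean of M; shifting by mean(yM) is an assumption."""
    return (x_new - M.mean(axis=0)) @ b_hat + yM.mean()
```

A grid of (η, λ) pairs passed through `tr2_fit` would give one model per pair, from which merit landscapes such as RMSEC and RMSEM could be computed.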

Pharmaceutical Example

• M. Dyrby et al., Appl. Spectrosc. 56 (2002) 579-585
  – http://www.models.life.ku.dk/datasets; Dept. of Food Sciences, Univ. of Copenhagen
• 310 escitalopram tablets measured in NIR from 7,400-10,507 cm-1 at resolution 6 cm-1 for 404 wavelengths
• Four tablet types based on nominal weight: type 1, type 2, type 3, and type 4
• Three tablet batches (production scale): laboratory, pilot, and full
• 30 tablets for each batch tablet type combination

Figure: PCA score plot (PC1 and PC2) for lab and full batch tablets, types 1 to 4.

Objective

• Using laboratory produced tablets as the primary calibration set
  – Determine active pharmaceutical ingredient (API) concentration in new tablets produced in full production (secondary condition)
• Primary Calibration Space: 30 random lab batch samples with 15 each from types 1 and 2
• Secondary Calibration Space: 30 random full batch samples with 15 each from types 1 and 2
• Standardization Set M: 4 random full batch samples with 2 each from types 1 and 2
• Validation Space: Remaining 30 full batch types 1 and 2
• Other batch type combinations studied

Example Model Merit Landscapes

$$\min_b \left( \|Xb - y\|_2^2 + \eta^2 \|b\|_2^2 + \lambda^2 \|Mb - y_M\|_2^2 \right)$$

Figure: example model merit landscapes, including RMSEC and RMSEM, over η and λ.

Model Merit Landscapes

$$\min_b \left( \|Xb - y\|_2^2 + \eta^2 \|b\|_2^2 + \lambda^2 \|Mb - y_M\|_2^2 \right)$$

Figure: RMSEC and RMSEM landscapes over η and λ, annotated as follows.

• Convergence at small λ
  – Secondary conditions are not included in new model
  – Amounts to using primary RR with local centering where secondary validation samples are centered to the mean of M
• Best local centered models
• A tradeoff region
  – Prediction of primary degrades while the prediction of secondary improves

Model Merit Landscapes

$$\min_b \left( \|Xb - y\|_2^2 + \eta^2 \|b\|_2^2 + \lambda^2 \|Mb - y_M\|_2^2 \right)$$

Figure: RMSEC, RMSEM, and ‖b̂‖₂ landscapes over η and λ, annotated as follows.

• ‖b̂‖₂ too large
• A tradeoff region
  – Prediction of primary degrades while the prediction of secondary improves
• Further tradeoffs
  – Tradeoff region between ‖b̂‖₂ and RMSEC and RMSEM
  – Can use an L-curve at a fixed λ value

Model Merit Evaluations

• Multiple merits can be used to assess tradeoff
  – Respective RMSEC and RMSEM landscapes for R², slope, and intercept
  – L-curves at selected η and λ values
  – H values:
  $$H_i = \left( \frac{\mathrm{RMSEM}_i}{\mathrm{RMSEM}_{max}} \right)^2 + \left( \frac{\|\hat{b}_i\|_2}{\|\hat{b}\|_{2,max}} \right)^2$$

Figure: H against η at λ = 54.29, and the RMSEV landscape over η and λ with λ = 54.29 marked.

Model Updating Results

Updating Primary Lab Batch Types 1 and 2 to Predict Secondary Full Batch Types 1 and 2

Method               ‖b̂‖₂    RMSEC   RMSEM   RMSEV   R²      η        λ
TR2                  2.10    0.731   0.014   0.264   0.966   1.701    54.29
RR local centering   2.70    0.468   0.245   0.487   0.935   0.588    0
RR no update         16.40   0.096   -       0.653   0.925   0.0359   -

Lab and Full Batches Types 1 and 2 Self Predicting Using RR

Batch   ‖b̂‖₂    RMSEC   RMSEV   R²      η
Lab     16.40   0.096   0.276   0.972   0.0359
Full    1.34    0.205   0.239   0.968   0.215

• The updated primary model predicts equivalently to the secondary model predicting the secondary validation samples

Model Vectors

Figure: model vectors (b̂ᵢ against wavelength, cm-1, from 8000 to 10000) for TR2, RR lab batch, and RR full batch.

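Using PLS

• PLS (and other methods) can also be used
• With PLS, the PLS latent vectors (PLS factors) replace the η values

TR2:
$$\begin{pmatrix} y \\ 0 \\ \lambda y_M \end{pmatrix} = \begin{pmatrix} X \\ \eta I \\ \lambda M \end{pmatrix} b, \qquad \hat{b} = (X^T X + \eta^2 I + \lambda^2 M^T M)^{-1} (X^T y + \lambda^2 M^T y_M)$$

PLS:
$$y' = X'b \quad \text{with} \quad y' = \begin{pmatrix} y \\ \lambda y_M \end{pmatrix}, \; X' = \begin{pmatrix} X \\ \lambda M \end{pmatrix}, \qquad \hat{b} = X'^{+} y'$$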

PLS Model Merit Landscapes

Figure: landscapes of RMSEC, RMSEM, RMSEV, and ‖b̂‖₂ over λ and the number of PLS factors (1 to 20).

• Similar landscape trends
• The discrete factor aspect of PLS can make it difficult to capture the underlying continuity of the landscapes

PLS and TR2 Model Updating Results

Updating Primary Lab Batch Types 1 and 2 to Predict Secondary Full Batch Types 1 and 2

Method   ‖b̂‖₂   RMSEC   RMSEM   RMSEV   R²      η           λ
TR2      2.10   0.731   0.014   0.264   0.966   1.70        54.29
PLS      3.29   0.658   0.024   0.266   0.964   3 factors   19.31

• PLS prediction equivalent to TR2
• The discrete factor aspect of PLS can make it difficult to capture the underlying continuity of the landscapes

Sparse TR Calibration Maintenance

• TR2:
  $$\min_b \left( \|Xb - y\|_2^2 + \eta^2 \|b\|_2^2 + \lambda^2 \|Mb - y_M\|_2^2 \right)$$
• TR2b (sparse model):
  $$\min_b \left( \|Xb - y\|_2^2 + \eta^2 \|Lb\|_2^2 + \lambda^2 \|Mb - y_M\|_2^2 \right)$$
  – L = diagonal matrix with l_ii = 1/|b_i|
  – I. F. Gorodnitsky, B. D. Rao, IEEE Transactions on Signal Processing 45 (1997) 600-616
• TR2-1 (sparse model):
  $$\min_b \left( \|Xb - y\|_2^2 + \eta \|b\|_1 + \lambda^2 \|Mb - y_M\|_2^2 \right)$$

TR2-1 Sparse Model Merit Landscapes

$$\min_b \left( \|Xb - y\|_2^2 + \eta \|b\|_1 + \lambda^2 \|Mb - y_M\|_2^2 \right)$$

Figure: landscapes of RMSEC, RMSEM, RMSEV, and ‖b̂‖₂ over λ and the models with increasing η.

• Similar landscape trends
• For small λ values, the η values are the same across λ
• At greater λ values, the η values vary across λ

TR2 and TR2-1 Model Updating Results

Updating Primary Lab Batch Types 1 and 2 to Predict Secondary Full Batch Types 1 and 2

Method   ‖b̂‖₂    RMSEC   RMSEM   RMSEV   R²      η           λ
TR2-1    10.65   0.666   0.029   0.227   0.970   7650        2442
TR2      2.10    0.731   0.014   0.264   0.966   1.70        54.29
PLS      3.29    0.658   0.024   0.266   0.964   3 factors   19.31

• TR2-1 prediction results improve over TR2 and PLS

Figure: model vectors (b̂ᵢ against wavelength, cm-1, from 8000 to 10000) for TR2-1, TR2, and PLS.

Other TR Sparse Maintenance Methods

• TR2-1b (sparse models when L = I or L ≠ I):
  $$\min_b \left( \|Xb - y\|_2^2 + \eta \|Lb\|_1 + \lambda^2 \|Mb - y_M\|_2^2 \right)$$
• TR1-2b (full wavelength when L = I):
  $$\min_b \left( \|Xb - y\|_2^2 + \eta^2 \|Lb\|_2^2 + \lambda \|Mb - y_M\|_1 \right)$$
• TR1 (sparse models when L = I or L ≠ I):
  $$\min_b \left( \|Xb - y\|_2^2 + \eta \|Lb\|_1 + \lambda \|Mb - y_M\|_1 \right)$$

Other TR Applications

• Updating a primary model:
  – for extra virgin olive oil adulterant quantitation to a new geographical region (applicable to new seasons)
  – to a new temperature
  – formed on one instrument to work on another

Summary

• Only a few samples needed for M with appropriate weighting
• Same samples measured in primary and secondary conditions are not needed
  – Avoids long term stability issue
• PLS and other methods can be used
  – Discrete nature (PLS factors) can limit landscapes
• Need to select a pair of tuning parameters for “a” model
• Requires reference values for y_M

Consensus (Ensemble) Modeling

• Samples predicted with a collection of models
  – Composite (fused) prediction is formed
  – Simple mean prediction used here
• Typically form models by random sampling across calibration samples and/or variables
• From collection, filter for model quality
• Ideal models:
  1. High degree of prediction accuracy
  2. Small but noteworthy difference between selected models (model diversity)

Consensus TR and PLS Modeling

• Models formed from varying tuning parameter values
• Plot predicted values against reference values for X, y and M, y_M
• Use respective R², slope, and intercept model merit values
• Natural target values:
  – R² → 1
  – Slope → 1
  – Intercept → 0

Merit                 Min    Max
R² (X, y)             0.85   0.95
Slope (X, y)          0.85   0.95
Intercept (X, y)      0.30   1.00
R² (M, y_M)           0.98   0.99
Intercept (M, y_M)    0.01   0.20
Slope (M, y_M)        0.95   0.99
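The filtering and fusion steps can be sketched as follows, assuming a list of candidate model vectors is already available (for example from a grid of tuning parameter values). The function names, the dictionary layout for the limits, and the omission of centering are simplifying assumptions of this sketch.

```python
import numpy as np

def merits(y_ref, y_pred):
    """R2, slope, and intercept of predicted values regressed on reference values."""
    slope, intercept = np.polyfit(y_ref, y_pred, 1)
    r2 = np.corrcoef(y_ref, y_pred)[0, 1] ** 2
    return {"r2": r2, "slope": slope, "intercept": intercept}

def consensus_predict(models, X, y, M, yM, X_new, limits_Xy, limits_M):
    """Keep models whose merits fall inside (min, max) limits; return the mean prediction."""
    kept = []
    for b in models:
        m_xy = merits(y, X @ b)       # merits on the primary set X, y
        m_m = merits(yM, M @ b)       # merits on the standardization set M, yM
        ok_xy = all(lo <= m_xy[k] <= hi for k, (lo, hi) in limits_Xy.items())
        ok_m = all(lo <= m_m[k] <= hi for k, (lo, hi) in limits_M.items())
        if ok_xy and ok_m:
            kept.append(b)
    if not kept:
        return None, 0
    preds = np.array([X_new @ b for b in kept])
    return preds.mean(axis=0), len(kept)   # simple mean as the fused prediction
```

A call might pass `limits_Xy={"r2": (0.85, 0.95), "slope": (0.85, 0.95)}`, mirroring the kind of Min and Max limits tabulated above.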

TR and PLS Consensus Models (RMSEV)

Figure: RMSEV landscapes showing the selected models: 348 TR2 models (η and λ), 628 TR2-1 models (λ and models with increasing η), and 1 PLS model (λ and PLS factors).

• Fewer PLS models selected due to sharpness of landscapes from the discrete factor nature of PLS
• Number of “good” models can be made to increase by reducing the increment sizes of η and λ

Consensus Mean Model Updating Results

Updating Primary Lab Batch Types 1 and 2 to Predict Secondary Full Batch Types 1 and 2

Method   No. Models   ‖b̂‖₂    RMSEC   RMSEM   RMSEV   R²      η           λ
TR2      348          4.32    0.552   0.016   0.284   0.955   0.591       207
TR2-1    628          16.32   0.580   0.007   0.274   0.958   0.379       1619
PLS      1            3.29    0.658   0.024   0.266   0.964   3 factors   19.31

• The one PLS model predicts best
• PLS limited to discrete factors where TR allows 0 ≤ η < ∞ to more fully resolve the landscape

Consensus Models and Correlations

Figure: model vectors (b̂ᵢ against wavelength, cm-1) and histograms of frequency against correlation for the 348 TR2 models and the 628 TR2-1 models.

Summary

• Only a few samples needed for M with appropriate weighting
• Same samples measured in primary and secondary conditions are not needed
  – Avoids long term stability issue
• Can select “a” model or a collection of models
  – Natural target values (thresholds) with model merits R², slope, and intercept for primary and secondary standardization sets
  – Work in progress
• Requires reference values for y_M

Without Reference Samples

• Beer’s law: $x = y_a k_a + y_i k_i + \dots + m + n$
  k_a = pure component (PC) analyte spectrum
  k_i = PC spectrum of ith interferent (drift, background, etc.)
  m = rest of the sample matrix
  n = spectral noise
  $$\hat{y} = x^T \hat{b} = y_a k_a^T \hat{b} + y_i k_i^T \hat{b} + \dots + m^T \hat{b} + n^T \hat{b}$$
• Ideal situation:
  WHEN:
  1. $k_a^T \hat{b} = 1$
  2. $k_i^T \hat{b} = 0$ and $m^T \hat{b} = 0$
  3. $\|\hat{b}\|_2$ such that $n^T \hat{b} = 0$
  THEN:
  4. $\hat{y} = y_a$
• Cannot simultaneously satisfy 1, 2, and 3 to obtain 4

Compromise PCTR2 Model

$$\min_b \left( \|k_a^T b - 1\|_2^2 + \eta^2 \|b\|_2^2 + \lambda^2 \|Nb - 0\|_2^2 \right)$$

N = spectra without analyte, e.g., k_i

• Minimizing the sum requires a tradeoff between the three conditions
  – The closer the three conditions are met, the more likely $\hat{y} = y_a$
• Updating the non-matrix-affected PC k_a to predict in current conditions (spanned by N)
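A sketch of the PCTR2 compromise as a stacked least-squares problem, using two synthetic Gaussian bands as stand-ins for the analyte and interferent pure component spectra. The band shapes, the tuning values, and the name `pctr2_fit` are assumptions for illustration only.

```python
import numpy as np

def pctr2_fit(k_a, N, eta, lam):
    """Minimize (k_a^T b - 1)^2 + eta^2 ||b||^2 + lam^2 ||N b - 0||^2 without reference values."""
    n = k_a.size
    A = np.vstack([k_a.reshape(1, n), eta * np.eye(n), lam * N])
    rhs = np.concatenate([[1.0], np.zeros(n + N.shape[0])])  # target 1 for k_a, 0 elsewhere
    b_hat, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return b_hat

w = np.arange(50.0)                          # hypothetical wavelength index
k_a = np.exp(-((w - 20.0) / 4.0) ** 2)       # synthetic analyte PC spectrum
k_i = np.exp(-((w - 28.0) / 6.0) ** 2)       # synthetic interferent PC spectrum
N = np.outer([0.5, 1.0, 2.0], k_i)           # non-analyte spectra at three interferent levels

b_hat = pctr2_fit(k_a, N, eta=1e-3, lam=1e2)

x = 0.3 * k_a + 0.7 * k_i                    # noise-free mixture with analyte amount 0.3
y_hat = x @ b_hat                            # predicted analyte amount
```

In this noise-free setting a small η and a large λ bring the first two conditions close to being met; with real spectra the noise term makes the tradeoff with the size of the model vector matter.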

Sources of N

• PC interferent spectra
  – Reference values are 0
• Matrix-affected samples without the analyte
  – Reference values are 0
• Constant analyte samples
  – Reference values are 0 after spectra are mean centered
• Estimate using samples with reference values
  $$N = \left( I - \frac{y y^T}{y^T y} \right) X$$
  – Goicoechea et al., Chemom. Intell. Lab. Syst. 56 (2001) 73-81
• Samples for N need to be measured at current conditions
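The estimate of N from samples with reference values, N = (I − y yᵀ / (yᵀy)) X, can be sketched as a projection that removes the part of X correlated with y. The helper name `estimate_N` is hypothetical, and y is assumed to be nonzero.

```python
import numpy as np

def estimate_N(X, y):
    """Project the analyte direction y out of the sample space of X: N = (I - y y^T / (y^T y)) X."""
    y = np.asarray(y, dtype=float).reshape(-1, 1)           # column vector of reference values
    P = np.eye(y.shape[0]) - (y @ y.T) / (y.T @ y).item()   # projector orthogonal to y
    return P @ X
```

For spectra that follow x = y_a k_a + y_i k_i, the rows of the result contain only multiples of k_i, so they can serve as non-analyte spectra with reference values of 0.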

Extra Virgin Olive Oil Adulteration

• EVOO samples: Crete, Peloponnese, and Zakynthos
  – RR calibration: 56 samples spiked 5, 10, and 15% (wt/wt) sunflower oil
  – Primary: PC sunflower oil, 1 sample
  – Secondary: EVOO, 25 samples
  – Validation: 22 spiked samples
• Synchronous fluorescence spectra 270 to 340 nm at Δλ = 20 nm

Figure: sunflower oil and EVOO spectra, intensity against excitation wavelength (nm).

Model Merit Landscapes

Figure: landscapes of RMSEPC, RMSEN, RMSEV, and ‖b̂‖₂ over η and λ.

H Values

$$H_i = \left( \frac{\mathrm{RMSEN}_i}{\mathrm{RMSEN}_{max}} \right)^2 + \left( \frac{\|\hat{b}_i\|_2}{\|\hat{b}\|_{2,max}} \right)^2$$

$$H_i = \left( \frac{\mathrm{RMSEC}_i}{\mathrm{RMSEC}_{max}} \right)^2 + \left( \frac{\|\hat{b}_i\|_2}{\|\hat{b}\|_{2,max}} \right)^2$$

Figure: H values for PCTR2 at η = 9.1e3 (against λ), for RR with the full calibration (against η), and for RR with the PCTR2 calibration samples (against η).

Model Updating From PC Sunflower

Method (No. Samples)         ‖b̂‖₂     RMSEV   R²      η       λ
PCTR2 (26)                   2.6e-7   0.031   0.882   9.1e3   0.0036
RR (56)                      4.0e-7   0.028   0.649   1.9e5   -
RR with PCTR2 samples (26)   2.3e-7   0.077   0.787   1.6e6   -

• Updated PC predicts better than a full calibration

Figure: predicted values ŷᵢ against reference values yᵢ for RR and PCTR2, with the equality line and two least-squares (LS) lines: ŷᵢ = 0.807xᵢ - 0.0074 and ŷᵢ = 0.422xᵢ + 0.048.

Temperature Data Set

• Wülfert et al., Anal. Chem. 70 (1998) 1761-1767
• http://www.models.life.ku.dk/datasets; Dept. of Food Sciences, Univ. of Copenhagen
• Water, 2-propanol, ethanol (analyte)
• 850 to 1049 nm at 1 nm intervals at 30, 40, 50, 60, and 70°C
• Calibration: 13 mixtures from 0% to 67% at 30°C
• Validation: 6 mixtures from 16% to 66% at 70°C
• Primary: PC ethanol at 30°C
• Non-analyte matrix (standardization set) N at 70°C
  – PC interferents water and 2-propanol (2 samples)
  – Blanks (3 samples)
  – Constant analyte (CA, 5 samples)

PCTR2 Model Merit Landscapes

Figure: landscapes of RMSEPC, RMSEN, RMSEV, and ‖b̂‖₂ over η and λ.

Model Updating From PC 30°C to 70°C

Method (No. Samples)            N                    ‖b̂‖₂    RMSEV   R²     η        λ
PCTR2 (6)                       Blanks, Int PC       13.41   0.037   0.97   5.2e-5   0.021
PCTR2 (9)                       Blanks, CA           17.23   0.050   0.99   5.2e-5   0.054
PCTR2 (8)                       CA, Int PC           15.42   0.093   0.98   5.2e-5   0.013
PCTR2 (11)                      Blanks, CA, Int PC   23.32   0.069   0.99   5.2e-5   0.054
RR at 30°C, no update (13)      -                    4.52    0.258   0.66   0.043    -
RR at 70°C (13)                 -                    4.93    0.115   0.86   0.034    -

• Updated PC predicts as well as the secondary model predicting the secondary validation samples

PCTR2 and PLS Modeling Temperature

Updating analyte PC at 30°C to 70°C using interferent PC and blanks

Method   No. Models   ‖b̂‖₂    RMSEPC   RMSEN   RMSEV   R²      η λ
PCTR2    1454         13.70   0.0003   0.004   0.036   0.973   0.000444.94
PLS      84           12.33   0.0003   0.013   0.054   0.963   5 factors 0.17

• PLS and PCTR2 predict similarly

Figure: RMSEV landscapes for PCTR2 (over η and λ) and for PLS (over λ and 1 to 6 PLS factors).

PCTR2 Consensus Modeling Temperature

• On-going work
  1. Cannot use R², slope, and intercept for respective predicted values of the PC and N
     – Set thresholds for RMSEN, RMSEPC, and ‖b̂‖₂ based on preliminary inspection of landscapes
       • Tradeoff needed between ‖b̂‖₂, RMSEN, and RMSEPC
     – Can further filter based on predicted values
       • Majority vote
       • Remove outliers
  2. Combine predicted value of analyte pure component sample with predicted non-analyte samples to obtain R², slope, and intercept

PCTR Variants (Calibration or Maintenance)

1. No reference values
   $$\min_b \left( \|k_a^T b - 1\|_2^2 + \eta^2 \|b\|_2^2 + \lambda^2 \|Nb - 0\|_2^2 \right)$$
2. With current condition sample reference values
   $$\min_b \left( \|k_a^T b - 1\|_2^2 + \eta^2 \|b\|_2^2 + \lambda^2 \|Mb - y_M\|_2^2 \right)$$
3. A combination of N and M
4. Replace $\|b\|_2$ with $\|b\|_1$ or $\|Lb\|_2$ to obtain sparse models

Summary

• PCTR2 calibrates (updates) to current conditions without reference samples
• Only a few new samples needed
• Can predict better than a full calibration
  – More focused to orthogonalize to the sample matrix
• Requires PC analyte spectrum
  – Does not have to be matrix affected
• Requires non-analyte samples
  – Can be estimated with reference samples

$$\hat{y} = y_a k_a^T \hat{b} + y_i k_i^T \hat{b} + \dots + m^T \hat{b} + n^T \hat{b}$$

Terms annotated: bias, variance

Other TR Variants

Expression and comments:

• $\min_b ( \|Xb - y\|_2^2 + \eta^2 \|Lb\|_2^2 )$
  – RR when L = I; includes variable selection when L = diag and approximates a 1-norm
• $\min_b ( \|Xb - y\|_2^2 + \eta^2 \|Lb\|_2^2 + \lambda^2 \|Mb - y_M\|_2^2 )$
  – Model updating; includes variable selection
• $\min_b ( \|Xb - y\|_2^2 + \eta \|Lb\|_1 + \lambda^2 \|Mb - y_M\|_2^2 )$
  – Model updating with variable selection; approximates 0-norm when L = diag
• $\min_b ( \|Xb - y\|_2^2 + \eta^2 \|Lb\|_2^2 + \lambda \|Mb - y_M\|_1 )$
  – Model updating with robustness to the standardization set M
• $\min_b ( \|k_a^T b - 1\|_2^2 + \eta^2 \|Lb\|_2^2 + \lambda^2 \|Nb\|_2^2 )$
  – Calibration or updating without reference samples
• $\min_b ( \|Xb - y\|_2^2 + \eta^2 \|L(b - b^*)\|_2^2 )$
  – Calibration to target model b*
• $\min_b ( \|Xb - y\|_2^2 + \eta \|Lb\|_1 )$
  – Adaptive LASSO, and LASSO when L = I
  – J. F. Claerbout, F. Muir, Geophysics 38 (1973) 826-844
• $\min_b ( \|Xb - y\|_2^2 + \eta \|b\|_1 + \lambda^2 \|b\|_2^2 )$
  – Elastic net

Other On-Going Consensus Modeling

• In addition to combining a set of models, can combine TR2, PLS, PCTR2, … sets of model predictions

Figure: consensus TR2 models, consensus PLS models, consensus PCTR2 models, and consensus TR2-1 models feeding into a final prediction.