Logical Analysis Of Data (LAD) Applied To Mass Spectrometry Data To Predict Rate Of Decline Of Kidney Function

M. Lipkowitz (1), M. Subasi (2), E. Subasi (2), V. Anbalagan (1), W. Zhang (1), P.L. Hammer (2), J. Roboz (1) and the AASK Investigators

(1) Mount Sinai School of Medicine, NY, NY
(2) RUTCOR, Rutgers Center for Operations Research, Piscataway, NJ

DIMACS-RUTCOR Workshop on Boolean and Pseudo-Boolean Functions in Memory of Peter L. Hammer

January, 2009

Acknowledgements

• 1,094 participants
• Investigators and staff at 21 AASK clinical centers and the coordinating center
• Sponsors
  – NIDDK
  – NIH Office on Research in Minority Health
  – King Pharmaceuticals

Why worry about chronic kidney disease?

Prevalence of Renal Disease in US (Age > 20 yrs, NHANES III)

Stage                        Prevalence
ESRD                         300,000
Severe CKD (GFR 15-29)       400,000
Moderate CKD (GFR 30-59)     7-12 million
Mild CKD (GFR 60-89)         55 million
Normal (GFR > 90)            114 million

Creat > 1.3-1.4 (men), Creat > 1.1-1.2 (women)

Adapted from: Coresh et al, AJKD 41:1-12, 2003

Risk of Death and Cardiovascular Disease in CKD

Go et al. N Engl J Med 2004;351:1296-305.

Life Expectancy in ESRD

[Figure: bar chart of life expectancy (yrs) at age 49, on a 0-30 scale, for Total, Prostate Cancer, Colon Cancer, ESRD, and Lung Cancer]

African American Study of Kidney Disease and Hypertension (AASK)

Motivated by the high incidence of kidney disease in African Americans with hypertension

Extremely hard to recruit: 500,000 medical records screened to recruit 1,094 participants

Two Phases of AASK

• Phase 1: Randomized trial (completed Sept 2001)
  – 1,094 African-Americans with non-diabetic, hypertensive CKD (baseline GFR of 20-65 ml/min/1.73 m²)
  – Demonstrated that one class of BP medications, ACE inhibitor, slowed progression of kidney disease

• Phase 2: Observational cohort (completed June 2007)
  – One Objective: document the long-term effects of trial interventions on CKD events
  – Therapy: all participants received recommended BP therapy:
    • ACEi (or ARB)
    • BP goal < 130/80 mmHg

Main Results of Phase 1

• Trial results published in JAMA 2002
  – ACEi was more effective than CCBs and BBs in slowing progression of hypertensive renal disease
  – Largest difference seen in participants with UP/Cr > 0.22 (>300 mg/24h)
  – No difference between participants randomized to lower MAP goal <92 mmHg vs 102-107 mmHg regardless of UP/Cr

Second Phase of AASK

Cohort study (completed 6/07)
  – One Objective: document the long-term effects of trial interventions on CKD events
  – Therapy: all participants received recommended BP therapy:
    • ACEi (or ARB)
    • BP goal < 130/80 mmHg
  – Primary composite outcome:
    • doubling of serum Cr from the trial baseline, ESRD, or death across both trial and cohort phase

Event Rates - Trial and Cohort

Conclusion:

ACE inhibition does slow progression of CKD.

However, the residual progression rate on best therapy is unacceptable!


Heterogeneity of Progression of CKD

Glomerular Filtration Rate (GFR)

A measure of kidney function

Normal is 100 ml/min/1.73 m²

GFR slope

We use rate of decline of GFR as our main measure of progression
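As an illustration of the GFR slope, the sketch below fits an ordinary least-squares line to serial GFR measurements and reports its slope. The slides do not say how the slope was estimated, so the least-squares fit, the function name `gfr_slope`, and the sample measurements are assumptions made only for this example.

```python
def gfr_slope(times_years, gfr_values):
    """Least-squares slope of GFR versus time (GFR units per year).

    Assumption: a straight-line fit; the slides do not specify the estimator.
    """
    n = len(times_years)
    if n < 2 or n != len(gfr_values):
        raise ValueError("need at least two paired measurements")
    mean_t = sum(times_years) / n
    mean_g = sum(gfr_values) / n
    # Sum of squared deviations of time, and of time/GFR cross-products.
    sxx = sum((t - mean_t) ** 2 for t in times_years)
    if sxx == 0:
        raise ValueError("measurement times must not all be equal")
    sxy = sum((t - mean_t) * (g - mean_g)
              for t, g in zip(times_years, gfr_values))
    return sxy / sxx


# Hypothetical patient measured once a year, losing 5 GFR units per year.
times = [0.0, 1.0, 2.0, 3.0]
gfr = [50.0, 45.0, 40.0, 35.0]
slope = gfr_slope(times, gfr)
print(f"GFR slope: {slope:.2f} per year")
```

A negative slope means declining kidney function; the more negative the value, the faster the progression.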

Clinical Case 1

ACEi
Good BP control
1 gm proteinuria

Clinical Case 2

ACEi
Sub-optimal BP control
Uprot 1.1 g/24 h

[Figure panels: Blood Pressure; eGFR]

How do we find the “Rapid Progressors” and “Non-progressors”?

Heterogeneity in Chronic GFR Slope


Urine Protein, Our Current Best Predictor, Is Not Adequate

A Serum Proteomics Approach

• Use SELDI-TOF Mass Spectrometry to detect serum proteins

• Use Logical Analysis of Data (LAD), a special data analysis methodology which combines ideas and concepts from optimization, combinatorics, and Boolean functions

The Data Set

                 Rapid Progressors (n=60)   Slow Progressors (n=60)   p-value
Chronic Slope    -6.60 ± 1.36               +2.18 ± 1.12              <0.00001
GFR              45.05 ± 11.97              53.45 ± 11.50             <0.0001
Proteinuria      1.09 ± 1.35                0.09 ± 0.19               <0.00001
Age              50.85 ± 11.95              53.35 ± 9.51              NS
Weight           95.75 ± 22.75              86.23 ± 20.82             NS

*Matched for randomized drug class

SELDI-TOF

SELDI Data

[Figure label: insulin]

Logical Analysis of Data (LAD)

• Non-statistical method based on
  – Combinatorics
  – Optimization
  – Logic

• Initiated by Peter L. Hammer in 1988.

• Has been applied to numerous disciplines: economics and business, seismology, oil exploration, medicine.

LAD Approximation

[Figure labels: Hidden Function; LAD Approximation; Dataset]

Main Components of LAD

• Discretization
• Support set
• Pattern generation
• Model
• Prediction

Discretization

Feasible set of cut-points

Minimum set of cut-points

Set covering
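The discretization step can be sketched as follows, under stated assumptions: candidate cut-points are taken as midpoints between adjacent attribute values that involve both classes, and a greedy set-covering heuristic then keeps only enough cut-points to separate every positive/negative pair. The slides name set covering but not a solver, so the greedy heuristic, the function names, and the toy data are all hypothetical.

```python
from itertools import product


def candidate_cut_points(rows, labels):
    """Feasible cut-points as (attribute index, threshold) pairs."""
    cuts = []
    for j in range(len(rows[0])):
        by_value = {}
        for row, lab in zip(rows, labels):
            by_value.setdefault(row[j], set()).add(lab)
        vals = sorted(by_value)
        for v1, v2 in zip(vals, vals[1:]):
            # Keep a midpoint only if the two neighbours involve both classes.
            if len(by_value[v1] | by_value[v2]) > 1:
                cuts.append((j, (v1 + v2) / 2))
    return cuts


def greedy_min_cut_points(rows, labels, cuts):
    """Greedy set cover: each cut 'covers' the pos/neg pairs it separates."""
    pos = [r for r, lab in zip(rows, labels) if lab == 1]
    neg = [r for r, lab in zip(rows, labels) if lab == 0]
    uncovered = set(product(pos, neg))

    def separated_by(cut):
        j, c = cut
        return {(p, q) for p, q in uncovered if (p[j] > c) != (q[j] > c)}

    chosen = []
    while uncovered:
        best = max(cuts, key=lambda cut: len(separated_by(cut)))
        gained = separated_by(best)
        if not gained:  # remaining pairs cannot be separated by any cut
            break
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical data: attribute 0 separates the classes, attribute 1 is noisy.
rows = [(5, 1), (6, 4), (7, 2), (1, 3), (2, 5), (3, 0)]
labels = [1, 1, 1, 0, 0, 0]
feasible = candidate_cut_points(rows, labels)
minimal = greedy_min_cut_points(rows, labels, feasible)
print(feasible)
print(minimal)
```

On this toy data five candidate cut-points are generated, but a single cut on attribute 0 already separates every positive observation from every negative one.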

Support Set

• Smallest (cardinality) subset of attributes which are sufficient to distinguish between the positive and negative observations.

• Finding a support set is a set-covering problem!
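To make the support-set definition concrete, the sketch below searches for a smallest set of attributes that still tells every positive observation apart from every negative one. The binary table is the six-observation example (A to F) from the pattern-characteristics slide; the exhaustive search is an assumption that only suits tiny tables, whereas a real problem with thousands of attributes would call for a set-covering solver. The function names are hypothetical.

```python
from itertools import combinations

ATTRS = ["x1", "x2", "x3", "x4", "x5", "x6"]
# (attribute values, outcome) for observations A-F.
TABLE = {
    "A": ((1, 0, 0, 1, 1, 1), 1),
    "B": ((1, 1, 0, 1, 0, 0), 1),
    "C": ((0, 0, 1, 0, 1, 0), 1),
    "D": ((1, 0, 1, 0, 1, 1), 0),
    "E": ((0, 0, 0, 1, 1, 1), 0),
    "F": ((1, 1, 1, 0, 0, 0), 0),
}


def is_support_set(cols, table):
    """True if every positive/negative pair differs on some column in cols."""
    pos = [row for row, y in table.values() if y == 1]
    neg = [row for row, y in table.values() if y == 0]
    return all(any(p[c] != q[c] for c in cols) for p in pos for q in neg)


def smallest_support_set(table, n_attrs):
    """Exhaustive search by increasing size (fine only for small tables)."""
    for k in range(1, n_attrs + 1):
        for cols in combinations(range(n_attrs), k):
            if is_support_set(cols, table):
                return cols
    return None


support = smallest_support_set(TABLE, len(ATTRS))
print([ATTRS[c] for c in support])
```

No single attribute is enough here, because A and E differ only on x1 while A and D agree on x1; the search therefore returns a two-attribute support set.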

Patterns

[Figure labels: Positive Pattern; Negative Pattern]

Pattern Characteristics

               Attributes                   Outcome
Observation    x1   x2   x3   x4   x5   x6
A               1    0    0    1    1    1     1
B               1    1    0    1    0    0     1
C               0    0    1    0    1    0     1
D               1    0    1    0    1    1     0
E               0    0    0    1    1    1     0
F               1    1    1    0    0    0     0

$P:\ x_1 = 1 \ \text{AND}\ x_3 = 0$

Positive Pattern Covering A:
  i) Covers A
  ii) Does not cover D, E, F

Coverage(P) = Number of observations covered by P

Degree(P) = Number of conditions in P

Homogeneity(P) = Proportion of positive observations among those it covers

Prevalence(P) = Proportion of positive observations covered by P to total number of positive observations
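The four characteristics can be checked directly on the example table for the pattern P (x1 = 1 AND x3 = 0). This is a minimal sketch; the dictionary encoding of a pattern and the function names are choices made for this example, not part of any LAD software.

```python
# (attribute values x1..x6, outcome) for observations A-F.
TABLE = {
    "A": ((1, 0, 0, 1, 1, 1), 1),
    "B": ((1, 1, 0, 1, 0, 0), 1),
    "C": ((0, 0, 1, 0, 1, 0), 1),
    "D": ((1, 0, 1, 0, 1, 1), 0),
    "E": ((0, 0, 0, 1, 1, 1), 0),
    "F": ((1, 1, 1, 0, 0, 0), 0),
}

# Pattern as {attribute index: required value}: x1 = 1 AND x3 = 0.
P = {0: 1, 2: 0}


def covers(pattern, row):
    """A pattern covers an observation if all its conditions hold."""
    return all(row[i] == v for i, v in pattern.items())


def characteristics(pattern, table):
    covered = [name for name, (row, _) in table.items() if covers(pattern, row)]
    covered_pos = [name for name in covered if table[name][1] == 1]
    total_pos = sum(1 for _, y in table.values() if y == 1)
    return {
        "covered": covered,
        "coverage": len(covered),
        "degree": len(pattern),
        "homogeneity": len(covered_pos) / len(covered),
        "prevalence": len(covered_pos) / total_pos,
    }


stats = characteristics(P, TABLE)
print(stats)
```

P covers A and B and none of the negative observations, so its homogeneity is 100% and its prevalence is two of the three positive observations.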

Theory

[Figure labels: Positive Theory; Negative Theory]

LAD Model

[Figure labels: Positive area; Negative area; Unexplained Area; Discordant Area]

A good LAD Model!

• Small # of features

• High quality patterns
  – Small degree
  – High prevalence
  – High homogeneity

• Small # of patterns

LAD Prediction

• Model: $P_1, P_2, \ldots, P_p$ ; $N_1, N_2, \ldots, N_n$

• Discriminant

$$\Delta(x) = \frac{1}{p}\sum_{i=1}^{p} P_i(x) - \frac{1}{n}\sum_{j=1}^{n} N_j(x)$$

• Prediction: Based on the sign of the discriminant.

• Discriminant is not only used for prediction, but also as an effective risk score!
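A minimal sketch of sign-based prediction follows, assuming the discriminant is the share of positive patterns that cover an observation minus the share of negative patterns that cover it. Only the first positive pattern (x1 = 1 AND x3 = 0) comes from the slides; the other three patterns, the sample observations, and the function names are hypothetical.

```python
def covers(pattern, row):
    """A pattern {index: value} covers a row if all its conditions hold."""
    return all(row[i] == v for i, v in pattern.items())


def discriminant(row, positive_patterns, negative_patterns):
    """Fraction of positive patterns triggered minus fraction of negative ones."""
    pos_hits = sum(covers(p, row) for p in positive_patterns)
    neg_hits = sum(covers(n, row) for n in negative_patterns)
    return pos_hits / len(positive_patterns) - neg_hits / len(negative_patterns)


def predict(row, positive_patterns, negative_patterns):
    """Classify by the sign of the discriminant; zero is left unclassified."""
    d = discriminant(row, positive_patterns, negative_patterns)
    if d > 0:
        return "positive"
    if d < 0:
        return "negative"
    return "unclassified"


# Patterns over attributes indexed from 0 (x1 -> 0, x3 -> 2).
POSITIVE = [{0: 1, 2: 0}, {0: 0, 2: 1}]  # second pattern is hypothetical
NEGATIVE = [{0: 1, 2: 1}, {0: 0, 2: 0}]  # both patterns are hypothetical

obs_a = (1, 0, 0, 1, 1, 1)  # observation A from the example table
obs_e = (0, 0, 0, 1, 1, 1)  # observation E from the example table
print(discriminant(obs_a, POSITIVE, NEGATIVE), predict(obs_a, POSITIVE, NEGATIVE))
print(discriminant(obs_e, POSITIVE, NEGATIVE), predict(obs_e, POSITIVE, NEGATIVE))
```

Because the discriminant is a number rather than a label, it can also be used to rank observations, which is what lets it serve as a risk score.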

LAD Software

• Sorin Alexe, Datascope
  http://rutcor.rutgers.edu/~salexe/LAD_kit/SETUP-LAD-DS-SE20.zip

• Pierre Lemaire, Ladoscope
  http://www.kamick.org/lemaire/LAD

LAD Applied to AASK Data

• Generates groups of “combinatorial biomarkers”
  – Pairs of SELDI peak intensities that are either “positive” (predict rapid progression) or “negative” (predict slow progression) biomarkers
  – Groups of these “combinatorial biomarkers” are combined to create a model that predicts outcomes
  – The small number of pairs of peaks potentially provides targets for future research

The ‘Support Set’

• 5751 SELDI protein peaks
  – 7 are enough to predict outcomes

Attribute   Correlation   Correlation rank
M2018       0.039         4115
M2756       0.260         16
M2780       0.252         28
M5266       0.065         3290
M9940       0.194         348
M11274      0.133         1565
M11752      0.192         378

The LAD Model

Attributes used in the pattern-defining conditions: M2018, M2756, M2780, M5266, M9940, M11274, M11752

Pattern   Prevalence (positive)   Prevalence (negative)   Homogeneity   Hazard Ratio   Defining conditions (on two of the seven attributes)
P1        33 (57.89%)             10 (16.95%)             78.57%        2.42           < 0.575 ; > 0.055
P2        32 (56.14%)             8 (13.56%)              80%           2.43           < 3.835 ; > 2.78
P3        32 (56.14%)             9 (15.25%)              78.05%        2.34           > 0.49 ; < 0.515
N1        11 (19.30%)             39 (66.10%)             78%           2.57           > 1.705 ; > 0.465
N2        6 (10.53%)              31 (52.54%)             85.71%        2.39           > 0.235 ; < 0.115
N3        8 (14.04%)              35 (59.32%)             81.4%         2.48           > 1.295 ; > 0.515
N4        7 (12.28%)              31 (52.54%)             83.33%        2.3            > 0.425 ; < 2.78

Validation of the LAD Model

                   Accuracy   Sensitivity   Specificity   Hazard Ratio
Basic LAD model    80.6%      78.4%         78.5%         2.72

“10-folding” experiments:
• patients randomly divided into 10 equal groups
• use data from 9 groups to predict outcomes in the 10th
• repeat for each group
• randomly re-divide and repeat X 10 (100 total runs)
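The “10-folding” procedure above amounts to repeated 10-fold cross-validation. The sketch below generates only the train/test index splits; model fitting is omitted. The function name, the fixed random seed, and the sample size of 120 (taken from the two groups of 60 on the data-set slide) are assumptions for illustration.

```python
import random


def repeated_k_fold(n_samples, k=10, repeats=10, seed=0):
    """Yield (train_indices, test_indices) for every fold of every repeat."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # fresh random division for each repeat
        folds = [indices[i::k] for i in range(k)]  # k near-equal groups
        for held_out in range(k):
            test = folds[held_out]
            train = [i for f, fold in enumerate(folds)
                     if f != held_out for i in fold]
            yield train, test


runs = list(repeated_k_fold(120))
print(len(runs), "runs;", len(runs[0][0]), "train /", len(runs[0][1]), "test")
```

Each run trains on nine groups and tests on the tenth; ten random re-divisions of ten folds each give the 100 total runs mentioned on the slide.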

Discriminants as Risk Scores

Group   # of observations   Percentage of Rapid Progressors   Average Risk Score
1       23                  0%                                0.087
2       23                  26.09%                            0.275
3       23                  56.52%                            0.498
4       23                  69.57%                            0.697
5       24                  91.67%                            0.924
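A table of this kind can be produced by sorting observations by risk score, cutting them into five groups, and summarizing each group. The sketch below uses synthetic scores and outcomes, not the study data; the function name and the integer-division rule for group boundaries are assumptions, although with 116 observations that rule happens to give the same group sizes as the table (23, 23, 23, 23, 24).

```python
def quintile_summary(scores, is_rapid, groups=5):
    """Return (group size, share of rapid progressors, mean score) per group.

    Assumes at least `groups` observations so that no group is empty.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    summary = []
    for g in range(groups):
        # Integer boundaries spread any remainder over the later groups.
        members = order[g * n // groups:(g + 1) * n // groups]
        size = len(members)
        rapid_share = sum(is_rapid[i] for i in members) / size
        mean_score = sum(scores[i] for i in members) / size
        summary.append((size, rapid_share, mean_score))
    return summary


# Synthetic example: 116 evenly spaced scores; outcome is 1 above 0.5.
scores = [i / 115 for i in range(116)]
is_rapid = [int(s > 0.5) for s in scores]
summary = quintile_summary(scores, is_rapid)
for g, (size, share, mean) in enumerate(summary, start=1):
    print(g, size, f"{share:.2%}", f"{mean:.3f}")
```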

Outcomes by Quintile of “Risk Score”

[Figure: two panels (LAD; Upro/UCr) showing Proportion of Rapid Progressors and Average Risk Index against Quintiles of Risk Index (1-5); vertical axis 0.00 to 1.20]

LAD vs Proteinuria to Predict Progression

• Both work well to find rapid progressors
  – >95% of patients with high risk or high protein progress

• LAD Risk Score better defines slow progressors
  – None with lowest LAD risk score progress
  – 16% with lowest protein progress
  – In fact, the degree of proteinuria in the 3 lowest quintiles may not be distinguishable on repeated testing, so progression could be up to 40%

Future Studies

1. Expand this pilot SELDI study to the full AASK data set (800 samples).
   • If data are reproducible this could lead to a clinical test for progression rate.

2. The ultimate goal: isolate and identify components of combinatorial biomarkers
   • This will hopefully lead to new therapeutic targets for drug development
   • Identification of proteins is difficult, and LAD limits the number to identify
