Potential usefulness of a framework of 7 steps for prediction models Ewout Steyerberg Professor of Medical Decision Making Dept of Public Health, Erasmus MC, Rotterdam, the Netherlands Oberwolfach, Jan 28, 2010

  • Potential usefulness of a framework of 7 steps for prediction models

    Ewout Steyerberg

    Professor of Medical Decision Making

    Dept of Public Health, Erasmus MC, Rotterdam, the Netherlands

    Oberwolfach, Jan 28, 2010

  • Erasmus MC University Medical Center Rotterdam

  • Prediction models in many areas

    Oncology: testicular cancer, oesophageal cancer, prostate cancer, lung cancer, lymphoma

    Cardiovascular: AAA, renal artery stenosis, vascular patients

    Infections: leprosy, RSV, Chlamydia infection

    Trauma: TBI, intracranial lesions at CT

    Mutations: Lynch syndrome

    Neurology: Guillain-Barré

    Osteoporosis

    Pediatrics: seizures, triage

  • Overview

    Background: Oberwolfach in the mountains

    A framework to develop prediction models

    Potential usefulness

    Discussion: how to improve prediction research

  • Oberwolfach

  • Workshop "Statistical Issues in Prediction": personal aims

    Meet and listen to other researchers

    Go cross-country skiing

    Sell book

    Get back on track with work in TO DO box

  • Presentation options

    Theoretical challenges

    Practical challenges

  • Problems in prediction models

  • 1. Predictor selection: we all present something new

  • 2. Methodological problems

    Missing values

    Optimal recoding and dichotomization

    Stepwise selection, relatively small data sets

    Presentation

    Validation

  • Potential solutions

    Awareness and education: scientific progress required; translation to practice; epidemiologists/clinicians interested in prediction modeling; statisticians not interested in prediction modeling

    Reporting guidelines: not yet available

    Study protocol registration: possible, rare

  • http://www.clinicalpredictionmodels.org

    http://www.springer.com/978-0-387-77243-1

  • Proposed modeling framework

  • Aim: knowledge on predictors, or provide predictions?

    Predictors (prognostic factors): traditional (demographics, patient, disease related); modern (omics, biomarkers, imaging)

    Challenges: testing (independent effect, adjusted for confounders); estimation (correct functional form)

    Predictions: pragmatic combination of predictors; essentially an estimation problem

  • Prognostic modelling checklist: intended to assist in developing a valid prediction model

  • Prognostic modeling checklist: general considerations

  • Prognostic modeling checklist: 7 steps

  • Prognostic modeling checklist: validity

  • Usefulness of framework

    Checklist for model building: SMART data, survival after cardiovascular event, 2008

    Critical assessment of model building: GUSTO-I model, Lee 1995

  • Example: prediction of myocardial infarction outcome

  • Aim: predictors or predictions?

    Title: predictions vs text: prediction

    Additional publication focuses on clinicians

  • Predictors

  • General considerations in GUSTO-I model

  • 1. Data inspection, specifically: missing values

    Among the array of clinical characteristics considered potential predictor variables in the modeling analyses were occasional patients with missing values. Although a full set of analyses was performed in patients with complete data for all the important predictor variables (92% of the study patients), the subset of patients with one or more missing predictor variables had a higher mortality rate than the other patients, and excluding those patients could lead to biased estimates of risk. To circumvent this, a method for simultaneous imputation and transformation of predictor variables based on the concepts of maximum generalized variance and canonical variables was used to estimate missing predictor variables and allow analysis of all patients [33, 34]. The iterative imputation technique conceptually involved estimating a given predictor variable on the basis of multiple regression on (possibly) transformed values of all the other predictor variables. End-point data were not explicitly used in the imputation process. The computations for these analyses were performed with S-PLUS statistical software (version 3.2 for UNIX [32]), using a modification of an existing algorithm [33, 34]. The imputation software is available electronically in the public domain [33].

  • 2. Coding of predictors

    Continuous predictors: linear and restricted cubic spline functions; truncation of values (for example for systolic blood pressure)

    Categorical variables: detailed categorization for location of infarction: anterior (39%), inferior (58%), or other (3%)

    Ordinality ignored for Killip class (I-IV); class III and class IV each contained only 1% of the patients
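
The two codings for continuous predictors named on this slide can be sketched in a few lines of Python. This is an illustrative sketch, not the GUSTO-I code: the function names are hypothetical, the basis uses the common unscaled truncated-power form of a restricted cubic spline (an assumption about parameterisation), and the example knots are the height values that appear in the risk formula quoted later in these slides.

```python
def truncate_upper(x, cap):
    """Truncate a continuous predictor at an upper limit (e.g. SBP at 120)."""
    return min(x, cap)


def pos_cube(x, knot):
    """Truncated cubic term: (x - knot)^3 if x > knot, else 0."""
    d = x - knot
    return d ** 3 if d > 0 else 0.0


def rcs_basis(x, knots):
    """Restricted cubic spline basis for one value x.

    Returns [x, f_1(x), ..., f_{k-2}(x)] for k knots. Each f_j is built so
    that the fitted curve is linear below the first and above the last knot.
    Unscaled form; some software divides by a normalising constant.
    """
    if len(knots) < 3:
        raise ValueError("a restricted cubic spline needs at least 3 knots")
    t_last, t_prev = knots[-1], knots[-2]
    span = t_last - t_prev
    terms = [float(x)]
    for t_j in knots[:-2]:
        terms.append(
            pos_cube(x, t_j)
            - pos_cube(x, t_prev) * (t_last - t_j) / span
            + pos_cube(x, t_last) * (t_prev - t_j) / span
        )
    return terms


# Example knots: the height values used in the risk formula in these slides.
HEIGHT_KNOTS = [154.9, 165.1, 172.0, 177.3, 185.4]
```

With five knots the basis has four columns (one linear, three nonlinear), so a spline for height would cost four regression coefficients rather than one.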

  • 3. Model specification

    Main effects: "... which variables were most strongly related to short-term mortality": a hypothesis-testing question rather than a prediction question

    Interactions: many tested, one included: Age*Killip

    Linearity of predictors: transformations chosen at univariate analysis were also used in multivariable analysis

  • 4. Model estimation

    Standard ML

    No shrinkage / penalization

    No external information

  • 5. Model performance

    Discrimination: AUC

    Calibration: observed vs predicted; graphically, including deciles (links to Hosmer-Lemeshow goodness of fit test); specific subgroups of patients
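
As a rough illustration of the two performance aspects on this slide, the sketch below computes the AUC (c-statistic) by pairwise comparison and tabulates observed versus mean predicted risk in groups ordered by predicted risk (deciles by default). The function names are hypothetical and the pairwise loop is only practical for small data sets; it is a sketch under those assumptions, not the analysis used for GUSTO-I.

```python
def auc(y, p):
    """c-statistic: chance that a random event has a higher predicted risk
    than a random non-event; tied predictions count as 1/2."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for pe in events:
        for pn in nonevents:
            if pe > pn:
                wins += 1.0
            elif pe == pn:
                wins += 0.5
    return wins / (len(events) * len(nonevents))


def calibration_by_group(y, p, groups=10):
    """Sort patients by predicted risk, split into nearly equal-sized groups,
    and return (mean predicted risk, observed event fraction, n) per group."""
    order = sorted(range(len(p)), key=lambda i: p[i])
    n = len(order)
    rows = []
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue  # fewer patients than groups
        mean_pred = sum(p[i] for i in idx) / len(idx)
        observed = sum(y[i] for i in idx) / len(idx)
        rows.append((mean_pred, observed, len(idx)))
    return rows
```

Plotting the observed fraction against the mean predicted risk per group gives the kind of calibration graph the slide refers to; points near the diagonal indicate good calibration.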

  • Calibration

  • Calibration

  • 6. Model validation

    10-fold cross validation

    100 bootstrap samples: model refitted, tested on the original sample
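
The two internal validation schemes on this slide could be sketched as follows. The helper names, the `fit` and `score` callables, and the optimism-corrected summary are assumptions made for illustration; the slide itself only states that the model was refitted in each bootstrap sample and tested on the original sample.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def bootstrap_validate(rows, fit, score, n_boot=100, seed=0):
    """Refit the model in each bootstrap sample and test it on the original data.

    rows  : list of patient records (any structure that fit/score understand)
    fit   : callable, fit(rows) -> model            (hypothetical interface)
    score : callable, score(model, rows) -> float   (higher = better)
    """
    rng = random.Random(seed)
    apparent = score(fit(rows), rows)
    on_original, optimism = [], []
    for _ in range(n_boot):
        # Draw a sample of the same size with replacement.
        sample = [rng.choice(rows) for _ in rows]
        model = fit(sample)
        perf_boot = score(model, sample)   # performance where it was fitted
        perf_orig = score(model, rows)     # performance on the original sample
        on_original.append(perf_orig)
        optimism.append(perf_boot - perf_orig)
    mean_optimism = sum(optimism) / n_boot
    return {
        "apparent": apparent,
        "mean_on_original": sum(on_original) / n_boot,
        "mean_optimism": mean_optimism,
        "corrected": apparent - mean_optimism,
    }
```

Every modeling step that used the outcome (for example variable selection) would need to be repeated inside `fit` for the estimate to be honest.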

  • 7. Model presentation

    Predictor effects: relative importance (chi-square statistics); relative effects (odds ratios, shown graphically)

    Predictions: formula

  • Risk Model for 30-Day Mortality

    Probability of death within 30 days = 1/[1 + exp(-L)], where

    L = 3.812
        + 0.07624 age
        - 0.03976 minimum(SBP, 120)
        + 2.0796 [Killip class II] + 3.6232 [Killip class III] + 4.0392 [Killip class IV]
        - 0.02113 heart rate + 0.03936 (heart rate - 50)+
        - 0.5355 [inferior MI] - 0.2598 [other MI location]
        + 0.4115 [previous MI]
        - 0.03972 height
        + 0.0001835 (height - 154.9)+^3 - 0.0008975 (height - 165.1)+^3 + 0.001587 (height - 172.0)+^3 - 0.001068 (height - 177.3)+^3 + 0.0001943 (height - 185.4)+^3
        + 0.09299 time to treatment
        - 0.2190 [current smoker] - 0.2129 [former smoker]
        + 0.2497 [diabetes]
        - 0.007379 weight
        + 0.3524 [previous CABG]
        + 0.2142 [treatment with SK and intravenous heparin] + 0.1968 [treatment with SK and subcutaneous heparin] + 0.1399 [treatment with combination TPA and SK plus intravenous heparin]
        + 0.1645 [hx of hypertension] + 0.3412 [hx of cerebrovascular disease]
        - 0.02124 age [Killip class II] - 0.03494 age [Killip class III] - 0.03216 age [Killip class IV].

    Explanatory notes. 1. Brackets are interpreted as [c]=1 if the patient falls into category c, [c]=0 otherwise. 2. (x)+=x if x>0, (x)+=0 otherwise. 3. For systolic blood pressure (SBP), values >120 mm Hg are truncated at 120. 4. For time to treatment, values
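
To show how the formula above is applied to a single patient, here is a minimal Python sketch that transcribes the coefficients from this slide. Several things are assumptions: the function and argument names are hypothetical, the units (years, mm Hg, beats/min, cm, hours, kg) are inferred rather than stated on the slide, the unlisted treatment arm is labeled "reference", and no truncation is applied to time to treatment because that explanatory note is cut off in the source.

```python
import math


def pos(x):
    """(x)+ : x if x > 0, else 0."""
    return x if x > 0 else 0.0


KILLIP_MAIN = {1: 0.0, 2: 2.0796, 3: 3.6232, 4: 4.0392}
KILLIP_BY_AGE = {1: 0.0, 2: -0.02124, 3: -0.03494, 4: -0.03216}
MI_LOCATION = {"anterior": 0.0, "inferior": -0.5355, "other": -0.2598}
SMOKING = {"never": 0.0, "former": -0.2129, "current": -0.2190}
TREATMENT = {"reference": 0.0, "sk_iv_heparin": 0.2142,
             "sk_sc_heparin": 0.1968, "tpa_sk_iv_heparin": 0.1399}
HEIGHT_SPLINE = [(154.9, 0.0001835), (165.1, -0.0008975), (172.0, 0.001587),
                 (177.3, -0.001068), (185.4, 0.0001943)]


def linear_predictor(age, sbp, killip, heart_rate, mi_location, previous_mi,
                     height, time_to_treatment, smoking, diabetes, weight,
                     previous_cabg, treatment, hypertension, cerebrovascular):
    """L from the slide. Yes/no predictors are passed as True/False."""
    L = 3.812
    L += 0.07624 * age
    L -= 0.03976 * min(sbp, 120)              # SBP truncated at 120 mm Hg
    L += KILLIP_MAIN[killip] + KILLIP_BY_AGE[killip] * age
    L += -0.02113 * heart_rate + 0.03936 * pos(heart_rate - 50)
    L += MI_LOCATION[mi_location]
    L += 0.4115 * bool(previous_mi)
    L -= 0.03972 * height
    L += sum(coef * pos(height - knot) ** 3 for knot, coef in HEIGHT_SPLINE)
    L += 0.09299 * time_to_treatment          # no truncation applied (note cut off)
    L += SMOKING[smoking]
    L += 0.2497 * bool(diabetes)
    L -= 0.007379 * weight
    L += 0.3524 * bool(previous_cabg)
    L += TREATMENT[treatment]
    L += 0.1645 * bool(hypertension) + 0.3412 * bool(cerebrovascular)
    return L


def mortality_30d(**patient):
    """Probability of death within 30 days = 1 / (1 + exp(-L))."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(**patient)))
```

A call such as `mortality_30d(age=60, sbp=130, killip=1, heart_rate=75, mi_location="anterior", previous_mi=False, height=170, time_to_treatment=3, smoking="never", diabetes=False, weight=80, previous_cabg=False, treatment="reference", hypertension=False, cerebrovascular=False)` returns a probability between 0 and 1. A sketch like this is for checking arithmetic only, not for clinical use.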

  • Conclusion on usefulness of framework

    GUSTO-I makes for an interesting case study on: general modeling considerations; illustration of 7 modeling steps; internal vs external validity (early 1990s to 2009?)

    Debate possible on some choices

    1. Missing values: multiple imputation, including the outcome
    2. Coding: fractional polynomials? Lump categories?
    3. Selection: stepwise works because of large N
    4. Estimation: standard ML works because of large N; penalization?
    5. Performance: usefulness measures
    6. Validation: CV and bootstrap, not necessary because of large N?
    7. Presentation: predictor effects: nice! Predictions: score chart / nomogram

  • Discussion on usefulness of framework

    Checklist for model building: SMART data, survival after cardiovascular event, 2009

    Critical assessment of model building: GUSTO-I model, Lee 1995

    Basis for reporting checklist: link with REMARK / STROBE

    Basis for protocol registration: link with requirements in other protocols?

  • Challenges in developing a valid prognostic model

    Theoretical: biostatistical research

    New analysis techniques, e.g. neural networks / support vector machines / fractional polynomials / splines for continuous predictors

    Performance measures

    Simulations: what makes sense as a strategy?

    Applications: epidemiological and decision-analytic research

    Subject matter knowledge: clinical experts; literature: review / meta-analysis

    Balance research questions vs effective sample size

    Incremental value of new markers

    Transportability and external validity

    Clinical impact of using a model

  • Which performance measure when?

    Discrimination: if poor, usefulness unlikely, but >= 0

    Calibration: if poor in new setting, prediction model may harm rather than improve decision-making
