Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterprise Miner by Patricia B. Cerrito


University of Louisville

Objectives

To examine some issues with traditional statistical models and their basic assumptions

To examine the Central Limit Theorem and its necessity in statistical models

To look at the differences and similarities between clinical trials and health outcomes research

Surrogate Versus Real Endpoints

Because clinical trials tend to be short-term, they use high-risk patients and surrogate endpoints

Statins reduce cholesterol levels, but do they increase longevity and disease-free survival?

Health outcomes data can examine real endpoints from the general population

One Versus Many Endpoints

Clinical trials generally have one survival endpoint: time to recurrence, time to death, or time to disease progression

Health outcomes can examine multiple endpoints simultaneously using survival data mining

Homogeneous Versus Heterogeneous Data

Clinical trials generally use inclusion/exclusion criteria to define a homogeneous sample

Health outcomes have to rely upon heterogeneous data

Populations more often follow gamma distributions than normal distributions, and this must be taken into consideration

Large Versus Small Samples

Clinical trials tend to use the smallest sample possible to achieve the desired power

The database is designed for analysis, and the data are very clean

Health outcomes have an abundance of data and variables

Power is not an issue

Data are very messy and require considerable preprocessing

Rare Occurrences

Clinical trials are not large enough to find all potential rare occurrences

Health outcomes have enough data to find rare occurrences and to predict the probability of occurrence

This requires modifications to standard linear models

Predictive modeling is much better at actual prediction

Example 1

Ottenbacher, Kenneth J. Ottenbacher, Heather R. Tooth, Leigh. Ostir, Glenn V. A review of two journals found that articles using multivariable logistic regression frequently did not report commonly recommended assumptions. Journal of Clinical Epidemiology. 57(11):1147-52, 2004 Nov.

Statistical significance testing or confidence intervals were reported in all articles. Methods for selecting independent variables were described in 82%, and specific procedures used to generate the models were discussed in 65%.

Fewer than 50% of the articles indicated if interactions were tested or met the recommended events per independent variable ratio of 10:1.

Fewer than 20% of the articles described conformity to a linear gradient, examined collinearity, reported information on validation procedures, goodness-of-fit, discrimination statistics, or provided complete information on variable coding.

Example 2

Brown, James M. O'Brien, Sean M. Wu, Changfu. Sikora, Jo Ann H. Griffith, Bartley P. Gammie, James S.

Title: Isolated aortic valve replacement in North America comprising 108,687 patients in 10 years: changes in risks, valve types, and outcomes in the Society of Thoracic Surgeons National Database.

Source: Journal of Thoracic & Cardiovascular Surgery. 137(1):82-90, 2009 Jan.

108,687 isolated aortic valve replacements were analyzed. Time-related trends were assessed by comparing distributions of risk factors, valve types, and outcomes in 1997 versus 2006.

Differences in case mix were summarized by comparing average predicted mortality risks with a logistic regression model.

Differences across subgroups and time were assessed.

RESULTS: There was a dramatic shift toward use of bioprosthetic valves.

Aortic valve replacement recipients in 2006 were older (mean age 65.9 vs 67.9 years, P < .001) with higher predicted operative mortality risk (2.75 vs 3.25, P < .001)

Observed mortality and permanent stroke rate fell (by 24% and 27%, respectively).

Female sex, age older than 70 years, and ejection fraction less than 30% were all related to higher mortality, higher stroke rate and longer postoperative stay.

There was a 39% reduction in mortality with preoperative renal failure.

Central Limit Theorem

As the sample size increases to infinity, the distribution of the sample average approaches a normal distribution with mean μ and variance σ²/n.

As n approaches infinity, the variance approaches zero.

Therefore, the distribution of the sample average starts to look like a straight line at the point μ if n is too large.
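Stated a little more formally, in the standard textbook form and assuming independent, identically distributed observations with finite variance σ² (an assumption the slide leaves implicit):

```latex
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
\sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} \xrightarrow{\;d\;} N(0,1), \qquad
\operatorname{Var}\!\left(\bar{X}_n\right) = \frac{\sigma^{2}}{n} \longrightarrow 0
\quad \text{as } n \to \infty .
```

For large n this gives the approximation that the sample average is normal with mean μ and variance σ²/n. Its spread shrinks like σ/√n, while the spread of the population itself does not change.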

In addition, the sample mean is very susceptible to the influence of outliers.

Moreover, the confidence limits are defined based upon the assumption of normality and symmetry. Therefore, the existence of many outliers will skew the confidence interval.

Nonparametric Statistics

Nonparametric models still require symmetry.

Many populations are highly skewed, so these models also have problems

Dataset

We use data from the National Inpatient Sample from 2005

A stratified sample from 1000 hospitals from 37 states

Approximately 8 million inpatient stays

Distribution of Patient Stays

Normal Estimate

[Figure: histogram of length of stay (cleaned), 0.15 to 49.65, against Percent, 0 to 17.5]

Kernel Density Estimation

Instead of assuming that the population follows a known distribution, we can estimate it.

Kernel density estimation is an excellent method to use to do this

$$\hat{f}(x) = \frac{1}{n a_n} \sum_{j=1}^{n} K\left(\frac{x - X_j}{a_n}\right)$$

Here K is the kernel function and a_n is the bandwidth.

Proc KDE

proc kde data=nis.diabetesless50los;
   univar los / gridl=0 gridu=50 method=srot out=nis.kde50 bwm=3;
run;
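To make the estimator concrete outside SAS, here is a minimal Python sketch of the same formula with a Gaussian kernel. It is not a reimplementation of PROC KDE: the bandwidth is fixed by hand rather than chosen by the SROT rule, and the length-of-stay values are simulated from a hypothetical gamma distribution, not taken from the NIS.

```python
import math
import random

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, data, bandwidth):
    """Kernel density estimate f(x) = 1/(n*a) * sum_j K((x - X_j) / a)."""
    n = len(data)
    total = sum(gaussian_kernel((x - xj) / bandwidth) for xj in data)
    return total / (n * bandwidth)

def kde_on_grid(data, bandwidth, lower, upper, points):
    """Evaluate the estimate on an evenly spaced grid from lower to upper."""
    step = (upper - lower) / (points - 1)
    grid = [lower + i * step for i in range(points)]
    return grid, [kde(x, data, bandwidth) for x in grid]

# Hypothetical skewed "length of stay" values (assumption: gamma, mean 5 days).
rng = random.Random(2005)
los = [rng.gammavariate(2.0, 2.5) for _ in range(2000)]

grid, density = kde_on_grid(los, bandwidth=0.8, lower=0.0, upper=50.0, points=201)
area = sum(density) * (grid[1] - grid[0])   # rough check that the curve integrates to about 1
mode = grid[density.index(max(density))]    # location of the peak
mean = sum(los) / len(los)
```

In a right-skewed sample like this one the peak of the estimated density sits well below the mean, which is the kind of detail a single summary statistic hides.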

Kernel Estimate of Length of Stay

Sampling from NIS

Given that the National Inpatient Sample has 8 million records, we can consider it to be an infinite population. Therefore, we can sample from this population to see if it can be estimated by the Central Limit Theorem

We start by extracting 100 different samples of size N=5

Examine Central Limit Theorem

/* Draw 100 simple random samples (replicates) of 5 records each */
PROC SURVEYSELECT DATA=nis.nis_205 OUT=work.samples METHOD=SRS N=5 rep=100 noprint;
RUN;

/* Compute the mean length of stay within each replicate */
proc means data=work.samples noprint;
   by replicate;
   var los;
   output out=out mean=mean;
run;
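The same experiment can be sketched in Python as a sanity check on the idea. This is an illustration only: the population is a simulated gamma distribution standing in for length of stay, and the function name `replicate_means` is made up for this example.

```python
import random
import statistics

def replicate_means(population, sample_size, replicates, seed=0):
    """Draw `replicates` simple random samples and return the mean of each."""
    rng = random.Random(seed)
    means = []
    for _ in range(replicates):
        sample = rng.sample(population, sample_size)  # sampling without replacement
        means.append(statistics.fmean(sample))
    return means

# Hypothetical skewed population (assumption: gamma with mean 5).
rng = random.Random(1)
population = [rng.gammavariate(2.0, 2.5) for _ in range(100_000)]

means_n5 = replicate_means(population, sample_size=5, replicates=100)
means_n500 = replicate_means(population, sample_size=500, replicates=100)

spread_n5 = statistics.stdev(means_n5)      # wide: small samples give unstable means
spread_n500 = statistics.stdev(means_n500)  # narrow: means cluster tightly around 5
```

Comparing the two spreads shows the replicate means tightening as the sample size grows, while the underlying population stays as skewed as before.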

Sample Size=5

Confidence Limit

The confidence limit excludes much of the actual population distribution
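A rough way to see this numerically is to compute a confidence interval for the mean and then count how much of the population falls inside it. The sketch below assumes a simulated gamma population and a normal-approximation interval (z = 1.96); both are illustrative choices, not the presentation's data or method.

```python
import math
import random
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Normal-approximation confidence interval for the mean of `sample`."""
    center = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return center - z * std_error, center + z * std_error

def share_of_population_inside(population, interval):
    """Fraction of individual population values lying inside the interval."""
    low, high = interval
    return sum(low <= x <= high for x in population) / len(population)

# Hypothetical skewed population (assumption: gamma with mean 5).
rng = random.Random(42)
population = [rng.gammavariate(2.0, 2.5) for _ in range(50_000)]
sample = rng.sample(population, 1000)

ci = mean_confidence_interval(sample)
coverage = share_of_population_inside(population, ci)
```

The interval is a statement about the mean, so with a sample of 1,000 it is narrow and contains only a small share of individual patients. Reading it as a description of the population is the mistake the slide warns about.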

Confidence Limit With Larger n

Discussion

An over-reliance on the Central Limit Theorem can give a very misleading picture of the population distribution.

Kernel density estimation (PROC KDE) allows an examination of the entire population distribution instead of just using the mean to represent the population.

Without the assumption of normality, we need to use predictive modeling.

This is true for both logistic and linear regression where the assumption of normality is required.

The two regression techniques do not work well with skewed populations.

We first look at logistic regression for rare occurrences

Problems With Regression

Logistic regression is not designed to predict rare occurrences

With a rare occurrence, logistic regression will predict virtually all observations as non-occurrences

The accuracy will be high but the predictive ability of the model will be virtually nil.

Regression Equation

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_{25} X_{25}$$

Threshold Value

For logistic regression, a threshold value is defined, and regression values above the threshold are predicted as 1

Regression values below the threshold are predicted as 0

The threshold value is chosen to optimize the error rate
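The thresholding rule can be sketched in a few lines of Python. The fitted probabilities and labels below are hypothetical, chosen only to mimic a rare outcome (2 events in 100 cases), and the helper names are invented for this example.

```python
def predict(probabilities, threshold):
    """Predict 1 when the fitted probability reaches the threshold, else 0."""
    return [1 if p >= threshold else 0 for p in probabilities]

def error_rate(probabilities, labels, threshold):
    """Share of cases whose thresholded prediction differs from the label."""
    predictions = predict(probabilities, threshold)
    return sum(pred != y for pred, y in zip(predictions, labels)) / len(labels)

def sensitivity(probabilities, labels, threshold):
    """Share of actual events (label 1) that are predicted as events."""
    predictions = predict(probabilities, threshold)
    hits = sum(1 for pred, y in zip(predictions, labels) if y == 1 and pred == 1)
    return hits / sum(labels)

def best_threshold(probabilities, labels, candidates):
    """Candidate threshold with the lowest error rate."""
    return min(candidates, key=lambda t: error_rate(probabilities, labels, t))

# Hypothetical rare outcome: 2 events among 100 cases, all with low fitted probabilities.
labels = [1, 1] + [0] * 98
probabilities = [0.30, 0.08] + [0.10] * 8 + [0.02] * 90
candidates = [0.05, 0.10, 0.20, 0.50]

error_at_half = error_rate(probabilities, labels, 0.50)
sensitivity_at_half = sensitivity(probabilities, labels, 0.50)
chosen = best_threshold(probabilities, labels, candidates)
```

In this toy data a 0.50 cut-off gives a 2% error rate while finding none of the events, so a low error rate alone says little about how well a rare outcome is predicted.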

Simple Regression

Table of pneumonia by DIED
Each cell shows Frequency, Row Pct, Col Pct.

pneumonia   DIED = 0                   DIED = 1                 Total
0           7431129  98.21  94.97      135419  1.79  81.02      7566548
1           393728   92.54  5.03       31731   7.46  18.98      425459
Total       7824857                    167150                   7992007

Frequency Missing = 3041

Classification Table

Prob    Correct  Correct    Incorrect  Incorrect  Percent  Sensi-  Speci-  False  False
Level   Event    Non-Event  Event      Non-Event  Correct  tivity  ficity  POS    NEG
0.920   782E4    0          167E3      0          97.9     100.0   0.0     2.1    .
0.940   743E4    31731      135E3      394E3      93.4     95.0    19.0    1.8    92.5
0.960   743E4    31731      135E3      394E3      93.4     95.0    19.0    1.8    92.5
0.980   743E4    31731      135E3      394E3      93.4     95.0    19.0    1.8    92.5
1.000   0        167E3      0          782E4      2.1      0.0     100.0   .      97.9
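The percentages in this table can be recomputed from the pneumonia by DIED counts. The Python sketch below treats DIED = 0 as the "event", which is an inference from the counts (the Correct Event figure of 743E4 matches the 7,431,129 survivors without pneumonia) and not something the slides state. The function name is hypothetical.

```python
def classification_row(correct_event, correct_nonevent, incorrect_event, incorrect_nonevent):
    """Percentages in the layout of the classification table.

    incorrect_event:    non-events that were predicted as events
    incorrect_nonevent: events that were predicted as non-events
    """
    total = correct_event + correct_nonevent + incorrect_event + incorrect_nonevent
    events = correct_event + incorrect_nonevent
    nonevents = correct_nonevent + incorrect_event
    predicted_events = correct_event + incorrect_event
    predicted_nonevents = correct_nonevent + incorrect_nonevent

    def pct(numerator, denominator):
        # None plays the role of the "." shown when a percentage is undefined.
        return round(100.0 * numerator / denominator, 1) if denominator else None

    return {
        "correct": pct(correct_event + correct_nonevent, total),
        "sensitivity": pct(correct_event, events),
        "specificity": pct(correct_nonevent, nonevents),
        "false_pos": pct(incorrect_event, predicted_events),
        "false_neg": pct(incorrect_nonevent, predicted_nonevents),
    }

# Patients without pneumonia predicted as events, patients with pneumonia as non-events.
split_by_pneumonia = classification_row(7431129, 31731, 135419, 393728)

# Every patient predicted as an event.
everyone_event = classification_row(7824857, 0, 167150, 0)
```

Predicting every patient as an event scores 97.9% correct with 0% specificity, while splitting on pneumonia lowers the share correct to 93.4% and identifies only 19.0% of the deaths.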

Classification With 3 Variables

Classification Table

Prob    Correct  Correct    Incorrect  Incorrect  Percent  Sensi-  Speci-  False  False
Level   Event    Non-Event  Event      Non-Event  Correct  tivity  ficity  POS    NEG
0.480   782E4    0          167E3      0          97.9     100.0   0.0     2.1    .
0.500   781E4    4907       162E3      11633      97.8     99.9    2.9     2.0    70.3
0.520   781E4    4907       162E3      11633      97.8     99.9    2.9     2.0    70.3
0.540   781E4    4907       162E3      11633      97.8     99.9    2.9     2.0    70.3
0.560   781E4    4907       162E3      11633      97.8     99.9    2.9     2.0    70.3
0.580   781E4    4907       162E3      11633      97.8     99.9    2.9     2.0    70.3
0.600   781E4    4907       162E3      11633      97.8     99.9    2.9     2.0    70.3
0.620   781E4    4907       162E3      11633      97.8     99.9    2.9     2.0    70.3
0.640   781E4    4907       162E3      11633      97.8     99.9    2.9     2.0    70.3
0.660   781E4    4907       162E3      11633      97.8     99.9    2.9     2.0    70.3
0.680   781E4    4907       162E3      11633      97.8     99.9    2.9     2.0    70.3
0.700   781E4    4907       162E3      11633      97.8     99.9    2.9     2.0    70.3
0.760   775E4    26363      141E3      78618      97.3     99.0    15.8    1.8    74.9
0.780   775E4    26363      141E3      78618      97.3     99.0    15.8    1.8    74.9
0.800   775E4    26363      141E3      78618      97.3     99.0    15.8    1.8    74.9
0.820   775E4    26363      141E3      78618      97.3     99.0    15.8    1.8    74.9
0.840   775E4    26363      141E3      78618      97.3     99.0    15.8    1.8    74.9
0.860   775E4    26363      141E3      78618      97.3     99.0    15.8    1.8    74.9
0.880   775E4    26363      141E3      78618      97.3     99.0    15.8    1.8    74.9
0.900   768E4    41608      126E3      149E3      96.6     98.1    24.9    1.6    78.1
0.920   757E4    51297      116E3      258E3      95.3     96.7    30.7    1.5    83.4
0.940   757E4    51297      116E3      258E3      95.3     96.7    30.7    1.5    83.4
0.960   757E4    51297      116E3      258E3      95.3     96.7    30.7    1.5    83.4
0.980   634E4    103E3      64219      149E4      80.6     81.0    61.6    1.0    93.5
1.000   0        167E3      0          782E4      2.1      0.0     100.0   .      97.9

Models

Linear regression: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$

Logistic regression: $\log_e\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$

Poisson regression: $\log_e(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$

Poisson Distribution

The parameter of the Poisson Distribution, λ, will represent the average mortality rate, say 2%.

Then the sample size times 2% will give the estimate for the number of deaths, say 1,000,000*0.02=20,000

However, the problem still persists.

For example, septicemia has a 26% mortality rate, pneumonia has a 7.5% rate
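As a worked version of the arithmetic above, assume (for illustration) that the number of deaths Y among n stays is modeled as a Poisson count with rate λ per stay:

```latex
Y \sim \mathrm{Poisson}(n\lambda), \qquad
E[Y] = \operatorname{Var}(Y) = n\lambda = 1{,}000{,}000 \times 0.02 = 20{,}000, \qquad
\sqrt{\operatorname{Var}(Y)} = \sqrt{20{,}000} \approx 141.4 .
```

A single λ can describe the total count well, but it says nothing about which patients die. The condition-specific rates of 26% and 7.5% quoted above are far from the 2% average.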

Parameters

The three conditions include approximately 25% of total hospitalizations, leaving 75% not accounted for.

The Poisson distribution can be accurate on those patients but cannot determine anything about the remaining 75%

If more patient conditions are added, the 25% will increase but not to the point that the model will have good predictability

Predictive Modeling

Takes a different approach

Uses equal group sizes

100% of the rarest level

Equal sample size of other level

Randomizes the selection of the sampling

Uses prior probabilities to choose the optimal model
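One common way prior probabilities enter after training on a balanced sample is to rescale each predicted probability back to the population rate. The Python sketch below shows that standard Bayes reweighting; it is an illustration of the idea, not a description of what Enterprise Miner does internally, and the function name and the 2% prior are assumptions for the example.

```python
def adjust_for_priors(p_sample, sample_event_rate, prior_event_rate):
    """Rescale a probability fitted on a resampled data set to the population prior.

    p_sample:          event probability predicted by the model on the balanced sample
    sample_event_rate: share of events in the balanced training sample (e.g. 0.5)
    prior_event_rate:  share of events in the full population (e.g. 0.02)
    """
    event_weight = prior_event_rate / sample_event_rate
    nonevent_weight = (1.0 - prior_event_rate) / (1.0 - sample_event_rate)
    numerator = p_sample * event_weight
    return numerator / (numerator + (1.0 - p_sample) * nonevent_weight)

# A score of 0.5 on a 50/50 sample corresponds to the 2% base rate in the population.
base_rate_score = adjust_for_priors(0.5, sample_event_rate=0.5, prior_event_rate=0.02)

# A high score on the balanced sample is still well below 0.9 in population terms.
high_score = adjust_for_priors(0.9, sample_event_rate=0.5, prior_event_rate=0.02)
```

The ordering of patients by risk is unchanged by this adjustment; only the scale of the probabilities moves back toward the true event rate.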

50/50 Split in the Data

Target  Outcome  Target      Outcome     Count   Total
                 Percentage  Percentage          Percentage
0       0        67.8        80.8        54008   40.4
1       0        32.2        38.3        25622   19.2
0       1        23.8        19.2        12852   9.6
1       1        76.3        61.7        41237   30.8

Filter data to mortality outcome

Filter data to non-mortality outcome

Use PROC SURVEYSELECT to extract a subsample of non-mortality outcome

Append the mortality outcome data to subsample
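The four steps above can be sketched in Python as follows. The records are hypothetical (20 deaths among 1,000 stays), and `balance_by_undersampling` is a name made up for this illustration; in SAS the random subsample would come from PROC SURVEYSELECT as described.

```python
import random

def balance_by_undersampling(records, is_rare, seed=0):
    """Keep every rare-outcome record and an equal-sized random sample of the rest."""
    rng = random.Random(seed)
    rare = [r for r in records if is_rare(r)]          # filter to mortality outcome
    common = [r for r in records if not is_rare(r)]    # filter to non-mortality outcome
    subsample = rng.sample(common, len(rare))          # random subsample of non-mortality
    balanced = rare + subsample                        # append mortality records
    rng.shuffle(balanced)
    return balanced

# Hypothetical data: 1,000 stays, of which the first 20 ended in death.
records = [{"id": i, "died": 1 if i < 20 else 0} for i in range(1000)]
balanced = balance_by_undersampling(records, lambda r: r["died"] == 1)
```

The result keeps 100% of the rarest level and an equal number of the other level, giving the 50/50 split used for modeling.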


Target  Outcome  Target      Outcome     Count   Total
                 Percentage  Percentage          Percentage
0       0        80.4        96.6        10070   72.5
1       0        19.6        70.9        2462    17.7
0       1        25.6        3.3         348     2.5
1       1        74.4        29.1        1010    7.3

(Target = 1 accounts for 3,472 of the 13,890 records in this table, about 25%.)


Target  Outcome  Target      Outcome     Count   Total
                 Percentage  Percentage          Percentage
0       0        91.5        99.3        31030   89.4
1       0        8.5         83.5        2899    8.3
0       1        27.3        0.7         216     0.6
1       1        72.6        16.5        574     1.6

(Target = 1 accounts for 3,473 of the 34,719 records in this table, about 10%.)

Validation

The reduced sample is partitioned into training/validation/testing sets

Only need training/testing sets for regression models

Model is validated on the testing set
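A minimal Python sketch of such a partition is shown below. The 40/30/30 fractions and the function name are assumptions for illustration; the slides do not state which proportions were used.

```python
import random

def partition(records, fractions=(0.4, 0.3, 0.3), seed=0):
    """Randomly split records into training, validation and testing sets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * fractions[0])
    n_valid = round(len(shuffled) * fractions[1])
    training = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_valid]
    testing = shuffled[n_train + n_valid:]   # whatever remains
    return training, validation, testing

training, validation, testing = partition(range(1000))
```

Because the three sets are disjoint, performance measured on the testing set comes from records that played no part in fitting or tuning the model.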

Sampling Node

Misclassification in Regression

Target  Outcome  Target      Outcome     Count   Total
                 Percentage  Percentage          Percentage
Training Data
0       0        67.8        80.8        54008   40.4
1       0        32.2        38.3        25622   19.2
0       1        23.8        19.2        12852   9.6
1       1        76.3        61.7        41237   30.8
Validation Data
0       0        67.7        80.8        40498   40.4
1       0        32.3        38.5        19315   19.2
0       1        23.8        19.2        9646    9.6
1       1        76.2        61.5        30830   30.7

ROC Curves

Rule Induction Results

Target  Outcome  Target      Outcome     Count   Total
                 Percentage  Percentage          Percentage
Training Data
0       0        67.8        80.8        54008   40.4
1       0        32.2        38.3        25622   19.2
0       1        23.8        19.2        12852   9.6
1       1        76.3        61.7        41237   30.8
Validation Data
0       0        67.7        80.8        40498   40.4
1       0        32.3        38.5        19315   19.2
0       1        23.8        19.2        9646    9.6
1       1        76.2        61.5        30830   30.7

Variable Selection

ROC Curves

Decile

Data are sorted and divided into deciles

True positive patients with highest confidence come first

Next, positive patients with lower confidence.

True negative cases with lowest confidence come next

Next, negative cases with highest confidence.

Lift

Target density = the number of actually positive instances in that decile divided by the total number of instances in the decile.

The lift = the ratio of the target density for the decile to the target density over all the test data.

This is a way to find the patients most at risk for mortality (or infection)
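The two definitions above translate directly into code. The Python sketch below uses hypothetical scores and outcomes (100 patients, 10 deaths) and assumes at least ten cases so that every decile is non-empty; `decile_lift` is a name invented for this example.

```python
def decile_lift(scores, labels):
    """Lift per decile after sorting cases from highest to lowest score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    overall_density = sum(labels) / len(labels)
    n = len(order)
    lifts = []
    for d in range(10):
        members = order[d * n // 10:(d + 1) * n // 10]
        target_density = sum(labels[i] for i in members) / len(members)
        lifts.append(target_density / overall_density)
    return lifts

# Hypothetical test data: scores fall steadily, and 6 of the 10 deaths
# are among the 10 highest-scoring patients.
scores = [1.0 - i / 100 for i in range(100)]
labels = [0] * 100
for i in (0, 1, 2, 3, 4, 5, 12, 25, 47, 80):
    labels[i] = 1

lifts = decile_lift(scores, labels)
```

Here the top decile has a target density of 0.6 against 0.1 overall, a lift of about 6: patients in that decile are about six times as likely to have the outcome as a patient picked at random.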


Predictive modeling in Enterprise Miner has some capabilities that are possible, but extremely difficult, in SAS/STAT:

Sampling a rare occurrence to a 50/50 split

Partitioning to validate the results

Comparing multiple models to find the one that is optimal

Variable selection

Summary

Clinical trials do differ from health outcomes research, and the statistical techniques required must be adapted to outcomes research

Model assumptions are important, but too often ignored

We need to look at results in detail

Superficial consideration of results can lead to very erroneous conclusions
