
Evaluating Performance of the Spetzler-Martin Supplemented Model in Selecting Patients With Brain Arteriovenous Malformation for Surgery


Helen Kim, PhD; Tony Pourmohamad, MA; Erick M. Westbroek, BS; Charles E. McCulloch, PhD; Michael T. Lawton, MD; William L. Young, MD

Background and Purpose: Our recently proposed point scoring model includes the widely used Spetzler-Martin (SM)-5 variables, along with age, unruptured presentation, and diffuse border (SM-Supp). Here we evaluate the SM-Supp model performance compared with SM-5, SM-3, and Toronto prediction models using net reclassification index, which quantifies the correct movement in risk reclassification, and validate the model in an independent data set.

Methods: Bad outcome was defined as worsening between preoperative and final postoperative modified Rankin Scale score. Point scores for each model were used as predictors in logistic regression and predictions evaluated using net reclassification index at varying thresholds (10%–30%) and any threshold (continuous net reclassification index >0). Performance was validated in an independent data set (n=117).

Results: Net gain in risk reclassification was better using the SM-Supp model over a range of threshold values (net reclassification index=9%–25%) and significantly improved overall predictions for outcomes in the development data set, yielding a continuous net reclassification index of 64% versus SM-5, 67% versus SM-3, and 61% versus Toronto (all P<0.001). In the validation data set, the SM-Supp model again correctly reclassified a greater proportion of patients versus SM-5 (82%), SM-3 (85%), and Toronto models (69%).

Conclusions: The SM-Supp model demonstrated better discrimination and risk reclassification than several existing models and should be considered for clinical practice to estimate surgical risk in patients with brain arteriovenous malformation. (Stroke. 2012;43:2497-2499.)

Key Words: modified Rankin Scale; net reclassification; receiver operator curve; cerebral arteriovenous malformations

The Spetzler-Martin (SM) 5-point grading scale is the most widely accepted surgical risk prediction tool for brain arteriovenous malformations, although other models have been proposed.1–6 We recently developed a simple point scoring model that incorporates SM angiographic variables but supplements with additional clinical factors (SM-Supp) to improve outcome prediction and demonstrated improved discrimination over SM-5 using area under the receiver operating characteristic curve (AUROC).7 Here we extend our previous work by comparing SM-Supp performance with other models using the net reclassification index (NRI) and validating the model in an independent data set.

Methods

We included consecutive patients with brain arteriovenous malformation who underwent microsurgical resection between 2000 and 2010 with at least one postoperative visit and no missing outcome data. The development data set consisted of 300 patients with brain arteriovenous malformation treated by a single neurosurgeon (M.T.L.) between 2000 and 2007.7 The primary validation data set consisted of 117 patients (67 new M.T.L. cases between 2007 and 2010; 50 cases from other neurosurgeons between 2000 and 2010) with no missing data. We also included data from a larger validation data set (n=183) for which we multiply imputed missing angiographic data (provided in the online-only Data Supplement).

Outcome was change between preoperative and last postoperative modified Rankin Scale score8 dichotomized into >0 (bad outcome) versus ≤0 (good outcome).7 Predictors included age at surgery, sex, nonhemorrhagic presentation, arteriovenous malformation size, any deep venous drainage, eloquence, diffuse border, and time from surgery to last postoperative modified Rankin Scale assessment (days). SM-5,1 SM-3,6 Toronto,5 and SM-Supp7 scores are defined in online-only Data Supplement Table I.

Received April 20, 2012; final revision received May 25, 2012; accepted June 12, 2012.

From the Center for Cerebrovascular Research, Department of Anesthesia and Perioperative Care (H.K., T.P., E.M.W., C.E.M., W.L.Y.), and the Departments of Epidemiology and Biostatistics (H.K., C.E.M.), Neurological Surgery (M.T.L., W.L.Y.), and Neurology (W.L.Y.), University of California, San Francisco, San Francisco, CA.

The online-only Data Supplement is available with this article at http://stroke.ahajournals.org/lookup/suppl/doi:10.1161/STROKEAHA.111.661942/-/DC1.

Correspondence to Helen Kim, PhD, 1001 Potrero Avenue, Bldg 10, Rm 1206, Box 1363, San Francisco, CA 94110. E-mail [email protected]

© 2012 American Heart Association, Inc.

Stroke is available at http://stroke.ahajournals.org DOI: 10.1161/STROKEAHA.112.661942

NRI9,10 was used to evaluate model performance; it quantifies the correct movement in risk reclassification when comparing predictions between 2 models at various risk thresholds (10%–30%) or at any threshold (continuous, cNRI >0).10 NRI was compared by combining one-sided McNemar tests across outcomes using the Fisher method.11 We derived bootstrap 95% CIs for cNRI using 1000 replications.
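The reclassification measures described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, risks at or above the threshold are assumed to form the higher-risk category, the McNemar test is assumed to be the exact one-sided binomial version, and the six predicted risks are invented purely for illustration.

```python
from math import comb, exp, log


def count_moves(old_risk, new_risk, outcome, threshold=None):
    """Count upward and downward moves, split by outcome (1 = bad, 0 = good).

    With a threshold, a move means crossing it (assumption: risk >= threshold
    is the higher-risk category). With threshold=None, any change in predicted
    risk counts, which corresponds to the continuous NRI (cNRI > 0).
    """
    moves = {1: {"up": 0, "down": 0, "n": 0}, 0: {"up": 0, "down": 0, "n": 0}}
    for old, new, y in zip(old_risk, new_risk, outcome):
        group = moves[y]
        group["n"] += 1
        if threshold is None:
            group["up"] += new > old
            group["down"] += new < old
        else:
            group["up"] += old < threshold <= new
            group["down"] += new < threshold <= old
    return moves


def net_reclassification(moves):
    """Return (bad-outcome net gain, good-outcome net gain, NRI)."""
    bad, good = moves[1], moves[0]
    bad_gain = (bad["up"] - bad["down"]) / bad["n"]      # moving up is correct
    good_gain = (good["down"] - good["up"]) / good["n"]  # moving down is correct
    return bad_gain, good_gain, bad_gain + good_gain


def one_sided_mcnemar(correct, incorrect):
    """Exact one-sided p-value, P(X >= correct) with X ~ Binomial(n, 0.5)."""
    n = correct + incorrect
    return sum(comb(n, k) for k in range(correct, n + 1)) / 2 ** n


def fisher_combine(p_bad, p_good):
    """Fisher method for two p-values: -2*sum(log p) is chi-square with 4 df."""
    x = -2.0 * (log(p_bad) + log(p_good))
    return exp(-x / 2.0) * (1.0 + x / 2.0)  # closed-form upper tail for 4 df


# Invented predicted risks from an "old" and a "new" model for six patients.
old = [0.10, 0.20, 0.12, 0.25, 0.30, 0.05]
new = [0.25, 0.12, 0.18, 0.10, 0.12, 0.04]
bad_outcome = [1, 1, 1, 0, 0, 0]

moves_15 = count_moves(old, new, bad_outcome, threshold=0.15)
print(net_reclassification(moves_15))                            # NRI at 15%
print(net_reclassification(count_moves(old, new, bad_outcome)))  # cNRI
p_bad = one_sided_mcnemar(moves_15[1]["up"], moves_15[1]["down"])
p_good = one_sided_mcnemar(moves_15[0]["down"], moves_15[0]["up"])
print(fisher_combine(p_bad, p_good))
```

Keeping the move counts separate from the summary lets the same counts feed both the net gains and the per-outcome tests.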

Results

Characteristics were similar between development and validation data sets (P>0.05; Table 1; online-only Data Supplement Table II). Outcomes were bad for 73 (24%) and good for 227 (76%) patients in the development data set. In the validation data set, outcomes were bad for 39 (21%) and good for 144 (79%) patients.

In the development data set, NRI showed an improvement in reclassification of 9% to 25% with SM-Supp compared with SM-5 over all threshold values (Table 2). A greater net gain was observed at lower thresholds for good and at higher thresholds for bad outcomes. For example, at the 15% risk threshold, 85 of 300 (28%) were reclassified into different risk categories. Net gain in reclassification was −6.8% for those with bad outcomes and 27% for those with good outcomes (NRI=0.205, P<0.001). Thus, patients with good outcomes were 21% more likely to move down a risk category than up compared with patients with bad outcomes.
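Because the NRI is the sum of the net gain among patients with bad outcomes and the net gain among patients with good outcomes, the values reported in Table 2 can be checked by simple addition; at the 15% threshold, −0.068 + 0.273 = 0.205. The short Python sketch below repeats that check for every row. The dictionary and function names are hypothetical, and the negative signs on the bad-outcome net gains at the 10% and 15% thresholds are taken as given here; they are the only signs consistent with the reported NRI values.

```python
# Rows of Table 2: threshold -> (bad-outcome net gain, good-outcome net gain, NRI).
# The negative signs at 10% and 15% are an assumption supported by the arithmetic.
TABLE_2 = {
    "10%": (-0.027, 0.278, 0.250),
    "15%": (-0.068, 0.273, 0.205),
    "20%": (0.027, 0.057, 0.085),
    "25%": (0.096, 0.044, 0.140),
    "30%": (0.178, 0.000, 0.178),
    ">0": (0.260, 0.374, 0.635),
}


def rounding_gap(bad_gain, good_gain, reported_nri):
    """Absolute difference between the summed net gains and the reported NRI."""
    return abs(bad_gain + good_gain - reported_nri)


for threshold, row in TABLE_2.items():
    # Print the threshold, the recomputed sum, the reported NRI, and the gap.
    print(threshold, round(row[0] + row[1], 3), row[2], round(rounding_gap(*row), 3))
```

Every row agrees with the reported NRI to within 0.001, which is what rounding each component to three decimals would produce.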

Because risk categories for brain arteriovenous malformation surgical outcome are not well established, we also calculated the cNRI comparing SM-Supp to SM-5. The cNRI was 64% (95% CI, 39%–89%; P<0.001) with a net gain of 26% in those with good outcomes and 37% in those with bad outcomes (Table 2). Thus, 64% had predicted risks reclassified in the correct direction with SM-Supp. Results were similar when comparing SM-Supp with SM-3 (cNRI=67%; 95% CI, 41%–93%) and with Toronto (cNRI=61%; 95% CI, 37%–85%). Scatterplots of predicted probabilities (Figure) by good and bad outcomes reflected a greater proportion of patients with correct assignments using the SM-Supp model compared with SM-5 (Figure A), SM-3 (Figure B), or Toronto models (Figure C). In the validation data set, the SM-Supp model again correctly reclassified a greater proportion of patients versus SM-5 (cNRI=82%; 95% CI, 43.6%–121%), SM-3 (cNRI=85%; 95% CI, 44.7%–126%), and Toronto models (cNRI=69%; 95% CI, 26.4%–121%).
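The bootstrap intervals for the cNRI can be sketched as follows. This is a minimal illustration in Python, assuming a simple percentile bootstrap that resamples patients with replacement; the paper does not state which bootstrap variant was used, the function names are hypothetical, and the twelve predicted risks are invented. Since the cNRI is the sum of two net proportions that each lie between −1 and 1, it ranges from −2 to 2 (−200% to 200%), which is why upper confidence limits above 100% are possible.

```python
import random


def cnri(old_risk, new_risk, outcome):
    """Continuous NRI: any increase or decrease in predicted risk is a move."""
    pairs = list(zip(old_risk, new_risk, outcome))
    bad = [(o, n) for o, n, y in pairs if y == 1]
    good = [(o, n) for o, n, y in pairs if y == 0]
    bad_gain = sum((n > o) - (n < o) for o, n in bad) / len(bad)
    good_gain = sum((n < o) - (n > o) for o, n in good) / len(good)
    return bad_gain + good_gain


def bootstrap_ci(old_risk, new_risk, outcome, reps=1000, level=0.95, seed=0):
    """Percentile bootstrap interval (assumed variant) for the cNRI."""
    rng = random.Random(seed)
    size = len(outcome)
    stats = []
    while len(stats) < reps:
        idx = [rng.randrange(size) for _ in range(size)]
        ys = [outcome[i] for i in idx]
        if 0 < sum(ys) < size:  # a resample needs both outcome classes
            stats.append(cnri([old_risk[i] for i in idx],
                              [new_risk[i] for i in idx], ys))
    stats.sort()
    tail = (1.0 - level) / 2.0
    lower = stats[round(tail * reps)]
    upper = stats[min(reps - 1, round((1.0 - tail) * reps) - 1)]
    return lower, upper


# Invented predicted risks for twelve patients (1 = bad outcome, 0 = good).
old = [0.10, 0.20, 0.12, 0.25, 0.30, 0.05, 0.15, 0.22, 0.18, 0.08, 0.35, 0.28]
new = [0.25, 0.12, 0.18, 0.10, 0.12, 0.04, 0.30, 0.10, 0.25, 0.12, 0.20, 0.40]
bad_outcome = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1]

print(cnri(old, new, bad_outcome))
print(bootstrap_ci(old, new, bad_outcome))
```

Fixing the seed makes the interval reproducible between runs.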

Consistent with NRI results, the SM-Supp model yielded better discrimination and a higher AUROC than all other models (online-only Data Supplement Figure I) in development (AUROC=0.76, P<0.001) and validation (AUROC=0.77, P=0.402) data sets.

Discussion

The SM-Supp model performed equally well in predicting outcomes in an independent data set and consistently showed better risk reclassification and discrimination. For example, >60% of patients were correctly reclassified as having higher risk for those with bad outcomes and lower risk for those with good outcomes compared with each of SM-5, SM-3, or Toronto models.

Direct comparisons with other models2–5 are difficult because outcome measures and time points assessed differ among studies; for example, we examined change in outcome, which takes into account preoperative state. Only Spears et al5 compared performance of their prediction model to SM-5 using modified Rankin Scale and AUROC, showing good discrimination and performance (AUROC=0.80).5 Our model showed equally high discrimination in both development (AUROC=0.76) and validation data sets (AUROC=0.77).

Although the SM-Supp model derives from a single neurosurgeon and referral institution, we provide an independent validation using the NRI and include cases treated by other neurosurgeons in the largest series to date. However, further validation in external settings would be useful to assess generalizability and clinical use. A limitation of all scoring systems is dealing with missing data. In our full validation data set (n=183), 34% were missing angiographic data for SM-Supp, 36% for Toronto, and 13% for SM-5 and SM-3 scores. One way of accommodating missing data is through multiple imputation (see the online-only Data Supplement). Prospective studies planning to use SM-Supp should have minimal issues with missing data: all variables should be available from angiograms and MRI, which are standard for diagnostic evaluation and pretreatment planning, or from records at clinic visits.

Table 1. Preoperative Scores in the Development and Validation Cohorts

| Scores | Development Cohort (n=300) | Validation Cohort (n=117) | P Value |
|---|---|---|---|
| Spetzler-Martin (SM-5) | | | 0.379 |
| 1 | 56 (19) | 25 (21) | |
| 2 | 122 (40) | 36 (31) | |
| 3 | 91 (30) | 43 (37) | |
| 4 | 29 (10) | 12 (10) | |
| 5 | 2 (1) | 1 (1) | |
| SM-Supplemented (SM-Supp) | | | 0.304 |
| 2 | 7 (2) | 5 (4) | |
| 3 | 21 (7) | 7 (6) | |
| 4 | 55 (18) | 28 (24) | |
| 5 | 90 (30) | 30 (26) | |
| 6 | 70 (23) | 32 (27) | |
| 7 | 43 (15) | 9 (8) | |
| 8 | 9 (3) | 3 (3) | |
| 9 | 5 (2) | 2 (2) | |
| 10 | 0 (0) | 1 (1) | |
| Modified Rankin Scale | | | 0.173 |
| 0 | 85 (28) | 26 (22) | |
| 1 | 65 (22) | 33 (28) | |
| 2 | 33 (11) | 21 (18) | |
| 3 | 55 (18) | 16 (14) | |
| 4 | 33 (11) | 9 (8) | |
| 5 | 29 (10) | 12 (10) | |

Table 2. Net Reclassification Index (NRI) at Varying Risk Thresholds and Continuous NRI (>0) for Improvement Using Spetzler-Martin (SM) Supplemented Versus the SM-5 Scale in Development Cohort

| Risk Threshold | Bad Outcome Net Gain | Good Outcome Net Gain | NRI | P Value |
|---|---|---|---|---|
| 10% | −0.027 | 0.278 | 0.250 | <0.001 |
| 15% | −0.068 | 0.273 | 0.205 | <0.001 |
| 20% | 0.027 | 0.057 | 0.085 | 0.101 |
| 25% | 0.096 | 0.044 | 0.140 | 0.031 |
| 30% | 0.178 | 0 | 0.178 | 0.002 |
| >0 | 0.260 | 0.374 | 0.635 | <0.001 |

In conclusion, the SM-Supp model performs better than current prediction models and should be considered for use in clinical practice. An online calculator is provided to assist clinicians (http://avm.ucsf.edu/healthcare_pro/).

Sources of Funding

Supported by K23NS058357 (H.K.), R01NS034949 (W.L.Y.), P01NS044155 (W.L.Y.), and the Doris Duke Charitable Foundation (E.M.W.).

Disclosures

None.

References

1. Spetzler RF, Martin NA. A proposed grading system for arteriovenous malformations. J Neurosurg. 1986;65:476–483.

2. Tamaki N, Ehara K, Lin TK, Kuwamura K, Obora Y, Kanazawa Y, et al. Cerebral arteriovenous malformations: factors influencing the surgical difficulty and outcome. Neurosurgery. 1991;29:856–861.

3. Pertuiset B, Ancri D, Kinuta Y, Haisa T, Bordi L, Lin C, et al. Classification of supratentorial arteriovenous malformations. A score system for evaluation of operability and surgical strategy based on an analysis of 66 cases. Acta Neurochir (Wien). 1991;110:6–16.

4. Hollerhage HG, Dewenter KM, Dietz H. Grading of supratentorial arteriovenous malformations on the basis of multivariate analysis of prognostic factors. Acta Neurochir (Wien). 1992;117:129–134.

5. Spears J, Terbrugge KG, Moosavian M, Montanera W, Willinsky RA, Wallace MC, et al. A discriminative prediction model of neurological outcome for patients undergoing surgery of brain arteriovenous malformations. Stroke. 2006;37:1457–1464.

6. Spetzler RF, Ponce FA. A 3-tier classification of cerebral arteriovenous malformations. Clinical article. J Neurosurg. 2011;114:842–849.

7. Lawton MT, Kim H, McCulloch CE, Mikhak B, Young WL. A supplementary grading scale for selecting patients with brain arteriovenous malformations for surgery. Neurosurgery. 2010;66:702–713.

8. van Swieten JC, Koudstaal PJ, Visser MC, Schouten HJ, van Gijn J. Interobserver agreement for the assessment of handicap in stroke patients. Stroke. 1988;19:604–607.

9. Pencina MJ, D’Agostino RB Sr, D’Agostino RB Jr, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008;27:157–172.

10. Pencina MJ, D’Agostino RB Sr, Steyerberg EW. Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers. Stat Med. 2011;30:11–21.

11. Fisher RA. Statistical Methods for Research Workers. Edinburgh, UK: Oliver and Boyd; 1932.

Figure. Scatterplot of predicted risk in patients with good (black dots) and bad (gray dots) postsurgical outcomes. The 45° line indicates concordance of predicted probabilities between models. For patients with good outcomes, a greater proportion of black dots were correctly assigned below the line, indicating lower predicted risk using the SM-Supp model compared with SM-5 (A), SM-3 (B), or Toronto models (C). Conversely, in patients with bad outcomes, a greater proportion of gray dots were correctly classified above the line, indicating higher predicted risk with SM-Supp. SM indicates Spetzler-Martin.


ONLINE SUPPLEMENT

Evaluating Performance of the Spetzler-Martin Supplemented Model in Selecting Brain Arteriovenous Malformation Patients for Surgery

Supplemental Methods

We evaluated model performance in two validation cohorts. The primary validation dataset consisted of 117 patients with no missing data; details included in the paper. The larger “imputed” validation dataset consisted of 183 patients, which included the 117 patients with no missing data plus an additional 66 patients who had outcome data but were missing data for one or more angiographic predictors needed to construct the point scores (3 missing predictors for Toronto score, 40 missing predictors for both Toronto and SM-Supp scores, and 23 missing predictors for all 4 models). These were all outside surgical cases referred to our institution for other treatment or evaluation (i.e., not treated by UCSF neurosurgeon) and captured in our prospective BAVM registry, but images were not available to extract the necessary angiographic information. In order to use all the data from 183 patients, we performed multiple imputations from the entire cohort (both development and validation datasets), using the imputation by chained equations algorithm implemented in Stata SE v11.1 (StataCorp LP). We generated 20 datasets and filled in missing values by drawing from the conditional density of the missing variables given the other known variables. Validation diagnostics were performed and the pooled results were analyzed using the method described by Rubin (1987).1
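The pooling step attributed to Rubin can be sketched in a few lines of Python. This is only an illustration of the standard combining rules, not the Stata procedure the authors ran: the function name is hypothetical, the supplement does not say exactly which quantities were pooled, and the three estimates and standard errors below are invented (the analysis itself used 20 imputed data sets).

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Combine one estimate per imputed data set using Rubin's rules.

    Returns the pooled estimate and its standard error. The total variance is
    the mean within-imputation variance plus (1 + 1/m) times the
    between-imputation variance of the estimates.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(se ** 2 for se in std_errors)
    between = variance(estimates)  # sample variance, denominator m - 1
    total = within + (1.0 + 1.0 / m) * between
    return pooled, sqrt(total)


# Invented estimates from three imputed data sets, for illustration only.
estimates = [0.70, 0.72, 0.71]
std_errors = [0.05, 0.05, 0.05]
print(pool_rubin(estimates, std_errors))
```

The between-imputation term is what makes the pooled standard error larger than any single completed-data standard error when the imputations disagree.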

Supplemental Tables

| Variable | Definition | Spetzler-Martin Scale (SM) points | Toronto weighted model points | SM-Supplemental (SM-S) points |
|---|---|---|---|---|
| AVM size | <3 cm | 1 | | 1 |
| | 3-6 cm | 2 | | 2 |
| | >6 cm | 3 | | 3 |
| Deep venous drainage | No | 0 | 0 | 0 |
| | Yes | 1 | 2 | 1 |
| Eloquence | No | 0 | 0 | 0 |
| | Yes | 1 | 4 | 1 |
| Age at presentation | <20 years | | | 1 |
| | 20-40 years | | | 2 |
| | >40 years | | | 3 |
| Unruptured presentation | No | | | 0 |
| | Yes | | | 1 |
| Diffuse border | No | | 0 | 0 |
| | Yes | | 3 | 1 |
| Total score | | 1-5 | 0-9 | 2-10 |

Supplemental Table S1. Point Scoring System for the Spetzler-Martin Scale (SM), Toronto weighted model, and the Spetzler-Martin Supplemented (SM-Supp) model. SM-5 scale is the sum of points defined in SM column.2 SM-3 scale collapses SM-5 into three categories corresponding to low (SM-5: 1 or 2), intermediate (SM-5: 3), and high (SM-5: 4 or 5).3 Points for the Toronto model are derived from a weighted logistic regression model described by Spears et al.4 Points from the SM-Supp model use SM-5 scale and add points for three additional variables.5
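The point assignments in Supplemental Table S1 translate directly into a small scoring routine. The Python sketch below is an illustration only and is not the online calculator: the function and argument names are hypothetical, and treating sizes of exactly 3 or 6 cm and ages of exactly 20 or 40 years as belonging to the middle category is an assumption read from the category labels.

```python
def size_points(size_cm):
    """AVM size: <3 cm = 1, 3-6 cm = 2, >6 cm = 3 (boundaries assumed inclusive)."""
    if size_cm < 3:
        return 1
    return 2 if size_cm <= 6 else 3


def age_points(age_years):
    """Age: <20 years = 1, 20-40 years = 2, >40 years = 3."""
    if age_years < 20:
        return 1
    return 2 if age_years <= 40 else 3


def sm5(size_cm, deep_drainage, eloquent):
    """Spetzler-Martin 5-point score (range 1-5)."""
    return size_points(size_cm) + int(deep_drainage) + int(eloquent)


def sm3(sm5_score):
    """Three-tier collapse of SM-5: 1-2 low, 3 intermediate, 4-5 high."""
    if sm5_score <= 2:
        return "low"
    return "intermediate" if sm5_score == 3 else "high"


def toronto(deep_drainage, eloquent, diffuse):
    """Toronto weighted score (range 0-9)."""
    return 2 * int(deep_drainage) + 4 * int(eloquent) + 3 * int(diffuse)


def sm_supp(size_cm, deep_drainage, eloquent, age_years, unruptured, diffuse):
    """SM-Supp: SM-5 plus age, unruptured presentation, diffuse border (2-10)."""
    return (sm5(size_cm, deep_drainage, eloquent) + age_points(age_years)
            + int(unruptured) + int(diffuse))


# Hypothetical patient: 2.5 cm AVM, deep drainage, non-eloquent, age 35,
# unruptured presentation, compact (non-diffuse) border.
print(sm5(2.5, True, False), sm3(sm5(2.5, True, False)),
      toronto(True, False, False), sm_supp(2.5, True, False, 35, True, False))
```

The extreme inputs reproduce the total score ranges listed in the table (1-5, 0-9, and 2-10).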

| Characteristics | Development cohort (n = 300) | Validation cohort, no missing data (n = 117) | Validation cohort, imputed data* (n = 183) |
|---|---|---|---|
| Gender | | | |
| Male | 146 (49) | 59 (51) | 97 (53) |
| Female | 154 (51) | 58 (49) | 86 (47) |
| Age at presentation (decades) | 3.8 ± 1.7 | 3.7 ± 1.8 | 3.7 ± 1.8 |
| Unruptured presentation | | | |
| Yes | 154 (51) | 50 (43) | 100 (55) |
| No | 146 (49) | 67 (57) | 83 (45) |
| AVM Size (cm) | 2.5 ± 1.3 | 2.8 ± 1.5 | 2.8 ± 1.5 |
| Missing, n (%) | 0 | 0 | 22 (12) |
| Eloquence | | | |
| Yes | 160 (53) | 70 (60) | 103 (56) |
| No | 140 (47) | 47 (40) | 80 (44) |
| Missing | 0 | 0 | 24 (13) |
| Venous Drainage | | | |
| Any deep | 130 (43) | 48 (41) | 77 (42) |
| Superficial only | 170 (57) | 46 (59) | 106 (58) |
| Missing | 0 | 0 | 37 (20) |
| Diffuse border | | | |
| Yes | 37 (12) | 15 (13) | 24 (13) |
| No | 263 (88) | 102 (87) | 159 (87) |
| Missing | 0 | 0 | 62 (34) |
| Spetzler-Martin (SM) | | | |
| 1 | 56 (19) | 25 (21) | 30 (20) |
| 2 | 122 (40) | 36 (31) | 59 (32) |
| 3 | 91 (30) | 43 (37) | 66 (36) |
| 4 | 29 (10) | 12 (10) | 20 (11) |
| 5 | 2 (1) | 1 (1) | 1 (1) |
| Missing | 0 | 0 | 23 (13) |
| SM-Supplemented | | | |
| 2 | 7 (2) | 5 (4) | 6 (3) |
| 3 | 21 (7) | 7 (6) | 14 (8) |
| 4 | 55 (18) | 28 (24) | 39 (21) |
| 5 | 90 (30) | 30 (26) | 47 (26) |
| 6 | 70 (23) | 32 (27) | 49 (27) |
| 7 | 44 (15) | 9 (8) | 18 (10) |
| 8 | 8 (3) | 3 (3) | 6 (3) |
| 9 | 5 (2) | 2 (2) | 3 (2) |
| 10 | 0 | 1 (1) | 1 (1) |
| Baseline mRS | | | |
| 0 | 85 (28) | 26 (22) | 44 (24) |
| 1 | 65 (22) | 33 (28) | 44 (24) |
| 2 | 33 (11) | 21 (18) | 30 (16) |
| 3 | 55 (18) | 16 (14) | 25 (14) |
| 4 | 33 (11) | 9 (8) | 20 (11) |
| 5 | 29 (10) | 12 (10) | 20 (11) |

* Multiple imputations were performed to fill in missing data for predictors using data from both the development and validation cohorts.

Supplemental Table S2. Demographic and angiographic characteristics of patients included in the model development and validation cohorts.

Overall, characteristics were similar between those in the development and validation datasets (P>0.05), including distribution of SM-5 (P=0.38), SM-Supp (P=0.30), and baseline mRS scores (P=0.17). There were slight differences in characteristics between those with and without imputed data in the validation datasets, suggesting potential bias when restricting to those with complete data only. For example, unruptured presentation data were available on everyone, yet the percentage was lower when restricting the validation cohort to the 117 patients with no missing data (43%) compared to the full, imputed validation dataset (55%) and the development dataset (51%).

Supplemental Figure

Supplemental Figure S1. Area under the Receiver Operating Characteristic Curve analysis demonstrating greater predictive accuracy for the Spetzler-Martin Supplemented (SM-Supp) model compared to SM-5, SM-3, and Toronto models for A) development dataset (n=300), B) 10-fold cross-validated development dataset (n=300), C) independent validation dataset (n=117), and D) imputed validation dataset (n=183).

| Model | Obs | ROC Area | Std. Err. | Asymptotic Normal 95% CI, lower | Asymptotic Normal 95% CI, upper |
|---|---|---|---|---|---|
| mod_SMSupp | 300 | 0.7563 | 0.0309 | 0.69578 | 0.81674 |
| mod_SM5 | 300 | 0.6563 | 0.0354 | 0.58685 | 0.72569 |
| mod_SM3 | 300 | 0.6326 | 0.0383 | 0.55756 | 0.70754 |
| mod_Toronto | 300 | 0.6566 | 0.0368 | 0.58445 | 0.72875 |

Ho: area(mod_SMSupp) = area(mod_SM5) = area(mod_SM3) = area(mod_Toronto); chi2(3) = 19.61, Prob>chi2 = 0.0002

S1-A. The SM-Supp model did a better job at discriminating outcomes compared to other models in the development dataset, with the highest AUROC (0.76). There was a significant difference in AUROC (P<0.001), and SM-Supp performed significantly better than each of the other models (all pairwise P<0.001).
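The AUROC values reported for each model have a simple interpretation: the probability that a randomly chosen patient with a bad outcome has a higher predicted risk (or point score) than a randomly chosen patient with a good outcome, with ties counted as one half. A minimal Python sketch of that pairwise definition follows. It is not the Stata routine that produced the output shown here, the function name is hypothetical, and the six scores and outcomes are invented for illustration.

```python
def auroc(scores, outcomes):
    """Area under the ROC curve by direct pairwise comparison.

    Each (bad outcome, good outcome) pair contributes 1 if the bad-outcome
    patient has the higher score, 0.5 if the scores tie, and 0 otherwise.
    """
    bad = [s for s, y in zip(scores, outcomes) if y == 1]
    good = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((b > g) + 0.5 * (b == g) for b in bad for g in good)
    return wins / (len(bad) * len(good))


# Invented point scores and outcomes (1 = bad, 0 = good) for six patients.
scores = [5, 6, 6, 8, 7, 4]
outcomes = [0, 0, 1, 1, 1, 0]
print(auroc(scores, outcomes))
```

Because only the ordering of scores matters, this form works equally well on raw point scores and on predicted probabilities from a logistic model.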

| Model | Obs | ROC Area | Std. Err. | Asymptotic Normal 95% CI, lower | Asymptotic Normal 95% CI, upper |
|---|---|---|---|---|---|
| cv_SMSupp | 300 | 0.7417 | 0.0317 | 0.67961 | 0.80376 |
| cv_SM5 | 300 | 0.6326 | 0.0366 | 0.56096 | 0.70433 |
| cv_SM3 | 300 | 0.6069 | 0.0396 | 0.52932 | 0.68443 |
| cv_Toronto | 300 | 0.6369 | 0.0374 | 0.56362 | 0.71017 |

Ho: area(cv_SMSupp) = area(cv_SM5) = area(cv_SM3) = area(cv_Toronto); chi2(3) = 22.37, Prob>chi2 = 0.0001

S1-B. We performed 10-fold cross-validation in the development dataset to compare performance of each prediction model. We used 90% of the data to obtain estimates from the corresponding point score models and predicted outcomes in the remaining 10% of the sample, repeating the process 10 times so that no patient was used for both model fitting and prediction at the same time. Cross-validated AUROC results were similar to those from the observed data, indicating that the SM-Supp model was not overly optimistic; it performed significantly better than all other models and had the highest AUROC (0.74, P<0.001).
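The 10-fold procedure described in S1-B can be sketched as below. This is a minimal Python illustration under stated assumptions: the function names are hypothetical, the fold assignment by shuffling is an assumption, and the model fitted in each training fold is a deliberately simple stand-in (the observed bad-outcome rate at each point score) rather than the logistic regression used in the paper. The scores and outcomes are invented.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle patient indices 0..n-1 and deal them into k near-equal folds."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def cross_val_predict(scores, outcomes, fit, predict, k=10, seed=0):
    """Predicted risk for every patient from a model that never saw them."""
    n = len(scores)
    predicted = [None] * n
    for test_idx in k_fold_indices(n, k, seed):
        held_out = set(test_idx)
        train = [i for i in range(n) if i not in held_out]
        model = fit([scores[i] for i in train], [outcomes[i] for i in train])
        for i in test_idx:
            predicted[i] = predict(model, scores[i])
    return predicted


def fit_rate_table(scores, outcomes):
    """Stand-in model (NOT logistic regression): bad-outcome rate per score."""
    totals = {}
    for s, y in zip(scores, outcomes):
        bad, count = totals.get(s, (0, 0))
        totals[s] = (bad + y, count + 1)
    rates = {s: bad / count for s, (bad, count) in totals.items()}
    return {"rates": rates, "overall": sum(outcomes) / len(outcomes)}


def predict_rate(model, score):
    """Look up the training-fold rate; fall back to the overall rate."""
    return model["rates"].get(score, model["overall"])


# Invented data: 50 patients with point scores 2-6 and a noisy outcome.
toy_scores = [2 + (i % 5) for i in range(50)]
toy_outcomes = [1 if (i * 7) % 10 < (i % 5) + 1 else 0 for i in range(50)]
risk = cross_val_predict(toy_scores, toy_outcomes, fit_rate_table, predict_rate)
print(risk[:5])
```

Each patient appears in exactly one held-out fold, so the resulting predictions can be passed to an AUROC calculation without the optimism of scoring a model on its own training data.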

| Model | Obs | ROC Area | Std. Err. | Asymptotic Normal 95% CI, lower | Asymptotic Normal 95% CI, upper |
|---|---|---|---|---|---|
| mod_SMSupp | 117 | 0.7722 | 0.0559 | 0.66268 | 0.88167 |
| mod_SM5 | 117 | 0.7128 | 0.0605 | 0.59429 | 0.83137 |
| mod_SM3 | 117 | 0.7037 | 0.0661 | 0.57406 | 0.83333 |
| mod_Toronto | 117 | 0.7254 | 0.0561 | 0.61553 | 0.83534 |

Ho: area(mod_SMSupp) = area(mod_SM5) = area(mod_SM3) = area(mod_Toronto); chi2(3) = 2.94, Prob>chi2 = 0.4016

S1-C. All models did a better job at predicting outcomes in the independent validation dataset of 117 patients, with higher AUROC measures than the development cohort (S1-A). The SM-Supp model still had the highest AUROC (0.77), although differences between AUROC are no longer significant (P=0.40).

| Model | Obs | ROC Area | Std. Err. | Asymptotic Normal 95% CI, lower | Asymptotic Normal 95% CI, upper |
|---|---|---|---|---|---|
| mod_SMSupp | 183 | 0.7113 | 0.0092 | 0.61528 | 0.80737 |
| mod_SM5 | 183 | 0.6462 | 0.0100 | 0.54201 | 0.75035 |
| mod_SM3 | 183 | 0.6179 | 0.0092 | 0.50663 | 0.72926 |
| mod_Toronto | 183 | 0.6458 | 0.0105 | 0.54336 | 0.74817 |

S1-D. In the imputed validation dataset (n=183), the SM-Supp model predicted outcomes equally well and better than other prediction models with the highest AUROC (0.71). The AUROC for SM-5, SM-3, and Toronto models are more similar to the development cohort (S1-A and S1-B), suggesting a potentially biased subset when restricting the validation dataset to the 117 patients with no missing data (S1-C).

Supplemental References

1. Rubin DB. Multiple Imputation for Nonresponse in Surveys. New York: J. Wiley & Sons; 1987.

2. Spetzler RF, Martin NA. A proposed grading system for arteriovenous malformations. J Neurosurg. 1986;65:476-483.

3. Spetzler RF, Ponce FA. A 3-tier classification of cerebral arteriovenous malformations. Clinical article. J Neurosurg. 2011;114:842-849.

4. Spears J, Terbrugge KG, Moosavian M, Montanera W, Willinsky RA, Wallace MC, et al. A discriminative prediction model of neurological outcome for patients undergoing surgery of brain arteriovenous malformations. Stroke. 2006;37:1457-1464.

5. Lawton MT, Kim H, McCulloch CE, Mikhak B, Young WL. A supplementary grading scale for selecting patients with brain arteriovenous malformations for surgery. Neurosurgery. 2010;66:702-713.

Stroke. 2012;43:2497-2499; originally published online July 19, 2012; doi: 10.1161/STROKEAHA.112.661942

Stroke is published by the American Heart Association, 7272 Greenville Avenue, Dallas, TX 75231. Copyright © 2012 American Heart Association, Inc. All rights reserved. Print ISSN: 0039-2499. Online ISSN: 1524-4628.

The online version of this article, along with updated information and services, is located on the World Wide Web at: http://stroke.ahajournals.org/content/43/9/2497

Data Supplement (unedited) at: http://stroke.ahajournals.org/content/suppl/2012/07/19/STROKEAHA.112.661942.DC1.html