12
Concurrent Validity and Reliability of Retrospective Scoring of the Pediatric National Institutes of Health Stroke Scale Lauren A. Beslow, MD, MSCE; Scott E. Kasner, MD, MSCE; Sabrina E. Smith, MD, PhD; Michael T. Mullen, MD; Matthew P. Kirschen, MD, PhD; Rachel A. Bastian, BA; Michael M. Dowling, MD, PhD, MSCS; Warren Lo, MD; Lori C. Jordan, MD, PhD; Timothy J. Bernard, MD; Neil Friedman, MBChB; Gabrielle deVeber, MD; Adam Kirton, MD; Lisa Abraham, MD; Daniel J. Licht, MD; Abbas F. Jawad, PhD; Jonas H. Ellenberg, PhD; Ebbing Lautenbach, MD, MPH, MSCE*; Rebecca N. Ichord, MD* Background and Purpose—The Pediatric National Institutes of Health Stroke Scale (PedNIHSS), an adaptation of the adult National Institutes of Health Stroke Scale, is a quantitative measure of stroke severity shown to be reliable when scored prospectively. The ability to calculate the PedNIHSS score retrospectively would be invaluable in the conduct of observational pediatric stroke studies. The study objective was to assess the concurrent validity and reliability of estimating the PedNIHSS score retrospectively from medical records. Methods—Neurological examinations from medical records of 75 children enrolled in a prospective PedNIHSS validation study were photocopied. Four neurologists of varying training levels blinded to the prospective PedNIHSS scores reviewed the records and retrospectively assigned PedNIHSS scores. Retrospective scores were compared among raters and to the prospective scores. Results—Total retrospective PedNIHSS scores correlated highly with total prospective scores (R 2 0.76). Interrater reliability for the total scores was “excellent” (intraclass correlation coefficient, 0.95; 95% CI, 0.94 – 0.97). Interrater reliability for individual test items was “substantial” or “excellent” for 14 of 15 items. Conclusions—The PedNIHSS score can be scored retrospectively from medical records with a high degree of concurrent validity and reliability. This tool can be used to improve the quality of retrospective pediatric stroke studies. (Stroke. 2012;43:341-345.) Key Words: arterial ischemic stroke pediatric NIH Stroke Scale T he National Institutes of Health Stroke Scale (NIHSS), a reliable quantitative measure of acute stroke severity in adults, 1 predicts stroke outcome at 7 and 90 days. 2 The adult NIHSS is the primary examination used for adult stroke research and acute treatment trials. This scale has enabled multicenter studies because the clinical stroke severity of various patients can be compared easily, even by practitioners other than neurologists. 3 Furthermore, the reliability and concurrent validity of estimating the adult NIHSS score from medical records retrospectively was first demonstrated using 39 subjects 4 and has been replicated. 5–8 These studies have been extremely useful, allowing researchers to quantify stroke severity when a prospective NIHSS score was not recorded. 9,10 Children were not included in the initial validation study of the NIHSS. A study to determine the validity and utility of a Received August 10, 2011; final revision received September 20, 2011; accepted October 3, 2011. Jeffrey L. Saver, MD, was the Guest Editor for this paper. From the Division of Neurology (L.A.B., S.E.S., M.P.K., R.A.B., D.J.L., R.N.I.) and the Department of Pediatrics (A.F.J.), Children’s Hospital of Philadelphia, University of Pennsylvania School of Medicine, Colket Translational Research Building, Philadelphia, PA; the Department of Neurology (S.E.K., M.T.M.), The Hospital of the University of Pennsylvania, University of Pennsylvania School of Medicine, Philadelphia, PA; the Departments of Pediatrics and Neurology (M.M.D.), Children’s Medical Center Dallas, University of Texas Southwestern Medical Center, Dallas, TX; the Department of Neurology (W.L.), Nationwide Children’s Hospital, Ohio State University, Columbus, OH; the Department of Neurology (L.C.J.), Johns Hopkins University, Baltimore, MD; the Section of Child Neurology (T.J.B.), Denver Children’s Hospital, University of Colorado, Aurora, CO; the Center for Pediatric Neurology/Desh S60 (N.F.), Neurological Institute, Cleveland Clinic, Cleveland, OH; the Department of Pediatrics (G.d.V.), Division of Neurology, Hospital for Sick Children, Toronto, Ontario, Canada; the Department of Pediatrics and Clinical Neurosciences (A.K.), Alberta Children’s Hospital, University of Calgary, Calgary, Alberta, Canada; the Department of Pediatrics (L.A.), Schenectady Neurological, Consultants, PC, Schenectady, NY; and the Department of Biostatistics and Epidemiology (J.H.E.) and the Division of Infectious Diseases (E.L.), Department of Medicine, Department of Biostatistics and Epidemiology, Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania School of Medicine, Philadelphia, PA. The online-only Data Supplement is available at http://stroke.ahajournals.org/lookup/suppl/doi:10.1161/STROKEAHA.111.633305/-/DC1. *E.L. and R.N.I. are co-last authors. Correspondence to Lauren A. Beslow, MD, MSCE, Division of Neurology, Children’s Hospital of Philadelphia, Colket Translational Research Building, 10th Floor, 3501 Civic Center Boulevard, Philadelphia, PA 19104. E-mail [email protected] © 2011 American Heart Association, Inc. Stroke is available at http://stroke.ahajournals.org DOI: 10.1161/STROKEAHA.111.633305 341 by guest on June 28, 2018 http://stroke.ahajournals.org/ Downloaded from by guest on June 28, 2018 http://stroke.ahajournals.org/ Downloaded from by guest on June 28, 2018 http://stroke.ahajournals.org/ Downloaded from by guest on June 28, 2018 http://stroke.ahajournals.org/ Downloaded from by guest on June 28, 2018 http://stroke.ahajournals.org/ Downloaded from by guest on June 28, 2018 http://stroke.ahajournals.org/ Downloaded from by guest on June 28, 2018 http://stroke.ahajournals.org/ Downloaded from by guest on June 28, 2018 http://stroke.ahajournals.org/ Downloaded from

Concurrent Validity and Reliability of Retrospective ...stroke.ahajournals.org/content/strokeaha/43/2/341.full.pdf · Concurrent Validity and Reliability of Retrospective Scoring

Embed Size (px)

Citation preview

Concurrent Validity and Reliability of Retrospective Scoringof the Pediatric National Institutes of Health Stroke Scale

Lauren A. Beslow, MD, MSCE; Scott E. Kasner, MD, MSCE; Sabrina E. Smith, MD, PhD;Michael T. Mullen, MD; Matthew P. Kirschen, MD, PhD; Rachel A. Bastian, BA;

Michael M. Dowling, MD, PhD, MSCS; Warren Lo, MD; Lori C. Jordan, MD, PhD;Timothy J. Bernard, MD; Neil Friedman, MBChB; Gabrielle deVeber, MD; Adam Kirton, MD;

Lisa Abraham, MD; Daniel J. Licht, MD; Abbas F. Jawad, PhD; Jonas H. Ellenberg, PhD;Ebbing Lautenbach, MD, MPH, MSCE*; Rebecca N. Ichord, MD*

Background and Purpose—The Pediatric National Institutes of Health Stroke Scale (PedNIHSS), an adaptation of theadult National Institutes of Health Stroke Scale, is a quantitative measure of stroke severity shown to be reliable whenscored prospectively. The ability to calculate the PedNIHSS score retrospectively would be invaluable in the conductof observational pediatric stroke studies. The study objective was to assess the concurrent validity and reliability ofestimating the PedNIHSS score retrospectively from medical records.

Methods—Neurological examinations from medical records of 75 children enrolled in a prospective PedNIHSS validationstudy were photocopied. Four neurologists of varying training levels blinded to the prospective PedNIHSS scoresreviewed the records and retrospectively assigned PedNIHSS scores. Retrospective scores were compared among ratersand to the prospective scores.

Results—Total retrospective PedNIHSS scores correlated highly with total prospective scores (R2�0.76). Interraterreliability for the total scores was “excellent” (intraclass correlation coefficient, 0.95; 95% CI, 0.94–0.97). Interraterreliability for individual test items was “substantial” or “excellent” for 14 of 15 items.

Conclusions—The PedNIHSS score can be scored retrospectively from medical records with a high degree of concurrentvalidity and reliability. This tool can be used to improve the quality of retrospective pediatric stroke studies. (Stroke.2012;43:341-345.)

Key Words: arterial ischemic stroke � pediatric � NIH Stroke Scale

The National Institutes of Health Stroke Scale (NIHSS), areliable quantitative measure of acute stroke severity in

adults,1 predicts stroke outcome at 7 and 90 days.2 The adultNIHSS is the primary examination used for adult strokeresearch and acute treatment trials. This scale has enabledmulticenter studies because the clinical stroke severity ofvarious patients can be compared easily, even by practitionersother than neurologists.3 Furthermore, the reliability and

concurrent validity of estimating the adult NIHSS score frommedical records retrospectively was first demonstrated using39 subjects4 and has been replicated.5–8 These studies havebeen extremely useful, allowing researchers to quantifystroke severity when a prospective NIHSS score was notrecorded.9,10

Children were not included in the initial validation study ofthe NIHSS. A study to determine the validity and utility of a

Received August 10, 2011; final revision received September 20, 2011; accepted October 3, 2011.Jeffrey L. Saver, MD, was the Guest Editor for this paper.From the Division of Neurology (L.A.B., S.E.S., M.P.K., R.A.B., D.J.L., R.N.I.) and the Department of Pediatrics (A.F.J.), Children’s Hospital of

Philadelphia, University of Pennsylvania School of Medicine, Colket Translational Research Building, Philadelphia, PA; the Department of Neurology(S.E.K., M.T.M.), The Hospital of the University of Pennsylvania, University of Pennsylvania School of Medicine, Philadelphia, PA; the Departmentsof Pediatrics and Neurology (M.M.D.), Children’s Medical Center Dallas, University of Texas Southwestern Medical Center, Dallas, TX; the Departmentof Neurology (W.L.), Nationwide Children’s Hospital, Ohio State University, Columbus, OH; the Department of Neurology (L.C.J.), Johns HopkinsUniversity, Baltimore, MD; the Section of Child Neurology (T.J.B.), Denver Children’s Hospital, University of Colorado, Aurora, CO; the Center forPediatric Neurology/Desh S60 (N.F.), Neurological Institute, Cleveland Clinic, Cleveland, OH; the Department of Pediatrics (G.d.V.), Division ofNeurology, Hospital for Sick Children, Toronto, Ontario, Canada; the Department of Pediatrics and Clinical Neurosciences (A.K.), Alberta Children’sHospital, University of Calgary, Calgary, Alberta, Canada; the Department of Pediatrics (L.A.), Schenectady Neurological, Consultants, PC, Schenectady,NY; and the Department of Biostatistics and Epidemiology (J.H.E.) and the Division of Infectious Diseases (E.L.), Department of Medicine, Departmentof Biostatistics and Epidemiology, Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania School of Medicine, Philadelphia, PA.

The online-only Data Supplement is available at http://stroke.ahajournals.org/lookup/suppl/doi:10.1161/STROKEAHA.111.633305/-/DC1.*E.L. and R.N.I. are co-last authors.Correspondence to Lauren A. Beslow, MD, MSCE, Division of Neurology, Children’s Hospital of Philadelphia, Colket Translational Research

Building, 10th Floor, 3501 Civic Center Boulevard, Philadelphia, PA 19104. E-mail [email protected]© 2011 American Heart Association, Inc.

Stroke is available at http://stroke.ahajournals.org DOI: 10.1161/STROKEAHA.111.633305

341

by guest on June 28, 2018http://stroke.ahajournals.org/

Dow

nloaded from

by guest on June 28, 2018http://stroke.ahajournals.org/

Dow

nloaded from

by guest on June 28, 2018http://stroke.ahajournals.org/

Dow

nloaded from

by guest on June 28, 2018http://stroke.ahajournals.org/

Dow

nloaded from

by guest on June 28, 2018http://stroke.ahajournals.org/

Dow

nloaded from

by guest on June 28, 2018http://stroke.ahajournals.org/

Dow

nloaded from

by guest on June 28, 2018http://stroke.ahajournals.org/

Dow

nloaded from

by guest on June 28, 2018http://stroke.ahajournals.org/

Dow

nloaded from

pediatric adaptation of the NIHSS, the Pediatric NIH StrokeScale (PedNIHSS), closed its enrollment in 2009. Modifica-tion of the NIHSS for pediatric use is crucial because theneurological examination of children varies significantly byage and development. The PedNIHSS has the same exami-nation elements as the adult scale including 11 neurologicaldomains and 15 scored items. The PedNIHSS (for childrenaged 2–18 years) adapts the tasks the child performs so thatthey are appropriate for the child’s age and development. Thetotal score for the PedNIHSS ranges from 0 (least severe) to42 (most severe).11

The PedNIHSS validation study was prospective withscores determined during hospitalization for the acutestroke.11 Many studies on pediatric stroke, includingpopulation-based studies from which stroke incidences indeveloped countries have been estimated, are retrospective.Retrospective pediatric stroke studies are limited becausecomparison of children’s clinical stroke severity across cen-ters and even within centers cannot be done quantitatively.The ability to assess the PedNIHSS score retrospectivelywould allow investigators to use this quantitative measureeven when a study design is retrospective or when thePedNIHSS score is missing during prospective studies,thereby improving the validity of study findings. However,the concurrent validity and reliability of calculating thePedNIHSS score retrospectively using medical records areuntested.

Materials and MethodsStudy DesignThis was a cross-sectional study comparing PedNIHSS scoresderived retrospectively from neurological examinations documentedin the medical record with PedNIHSS scores assigned prospectivelyin the parent study.11

Inclusion CriteriaAll sites participating in the parent study were invited to participatein the current study, and all subjects enrolled in the parent study witha neurological examination documented in the medical record wereeligible for the current study. The methods of the parent study aredescribed elsewhere, and consent from the parent or legal guardianof the child was obtained.11 Enrollment in the parent study requireddefinite arterial ischemic stroke based on criteria for the InternationalPediatric Stroke Study. The criteria consist of an acute neurologicaldeficit and corresponding imaging findings of focal infarct conform-ing to an established arterial territory.12 This study was approved bythe local Institutional Review Board at each site.

Study Population and SitesThe population for the current study was comprised of 75 childrenenrolled from the following 9 sites: Children’s Medical CenterDallas (Dallas, TX, 26), The Children’s Hospital of Philadelphia(Philadelphia, PA, 16), The Hospital for Sick Children (Toronto,Ontario, Canada, 13), The Cleveland Clinic Children’s Hospital(Cleveland, OH, 5), Nationwide Children’s Hospital (Columbus,OH, 4), The Children’s Hospital (Denver, CO, 4), The Children’sHospital of Pittsburgh (Pittsburgh, PA, 4), Alberta Children’s Hos-pital (Calgary, Alberta, Canada, 2), and The Johns Hopkins Chil-dren’s Center (Baltimore, MD, 1). All sites are tertiary care centersaffiliated with universities and have established pediatric strokeprograms. All prospective PedNIHSS scores were performed by astudy pediatric stroke neurology attending physician or strokefellow. The Children’s Hospital of Philadelphia was the coordinatingcenter.

Study Procedure

Medical Record PreparationA study coordinator at each site photocopied the first neurologicalexamination documented by a neurologist or neurologist-in-training.Of the 75 subjects in this study, 36 neurological examinationsdocumented in the medical record (48%) were performed by ageneral pediatric neurologist who was not the stroke neurologistperforming the prospective PedNIHSS score, and 39 (52%) wereperformed by the pediatric stroke neurologist who also performed theprospective PedNIHSS. Of the 36 subjects in whom the neurologicalexamination was performed by a general neurologist and not thestroke neurologist who performed the prospective PedNIHSS score,22 (29% of all subjects) were performed before the subject wasenrolled in the parent study. Identifying patient information wasremoved at the enrolling site. The Children’s Hospital of Philadel-phia study coordinator (R.A.B.) read through each neurologicalexamination, removed any references to the prospectively assignedscores, and photocopied the examination for each of the 4 raters. Arandom number generator was used to order the subject neurologicalexaminations, so each rater scored the 75 examinations in the sameorder to limit variability for each subject due to rater experience. Theinitial PedNIHSS score from the parent study served as the referencecriterion for each subject in the concurrent validity analysis.

Rater TrainingFour raters from The Children’s Hospital of Philadelphia and TheUniversity of Pennsylvania Neurology Departments with varyinglevels of clinical training were blinded to the prospectively assignedPedNIHSS scores. The raters included a pediatric stroke attending(S.E.S.), an adult stroke attending (M.T.M.), a pediatric stroke fellow(L.A.B.), and a child neurology resident (M.P.K.). The ratersattended a training session with the principal investigator of theparent study (R.N.I.). Detailed instructions on how to translateinformation from the medical record into scores were provided at thetraining session. If an item was not recorded in the medical record,it was scored as 0 (normal) as was done in past adult studies (seeOnline Appendix; http://stroke.ahajournals.org).5 Every rater rec-orded whether each item was documented. Study data were enteredinto a database by L.A.B. and were reviewed for accuracy by TheChildren’s Hospital of Philadelphia study coordinator (R.A.B.).

Statistical AnalysesDescriptive statistics were performed using frequency distributionsand proportions for categorical variables and means with SDs ormedians with interquartile ranges for continuous variables. Linearregression was performed to determine the correlation of the esti-mated PedNIHSS scores with the prospectively assigned scoresclustered by subject because the 4 estimates obtained by the 4 ratersfor each subject were not independent. The parameter R2 wasreported as the measure of concurrent validity. Age at the time of thestroke and the time difference between the prospective scores andneurological examination recorded in the medical record wereexamined as covariates. Linear regression analysis was also done foreach rater. A subanalysis was performed to assess the correlation ofthe estimated PedNIHSS scores with the prospectively assignedscores for the 22 subjects in whom the neurological examination wasdocumented by a nonstroke neurologist before the subject wasenrolled in the parent study. Using a predetermined threshold basedon categorizations in the original adult retrospective NIHSS scoringstudy,4 the sensitivity and specificity for correctly scoring a totalPedNIHSS that had been prospectively scored as �5 were calculatedwith their 95% CIs using MedCalc Version 11.5.1 (MedCalcSoftware, Mariakerke, Belgium). Since the total NIHSS score isgenerally used as a continuous measure, interrater reliability of theestimated total scores was determined by the calculation of anintraclass correlation coefficient (ICC) using 1-way analysis ofvariance. A subanalysis was performed to assess the reliability of the4 raters’ retrospective scores for the 22 subjects in whom theneurological examination was documented by a nonstroke neurolo-gist before the subject was enrolled in the parent study. Subanalyses

342 Stroke February 2012

by guest on June 28, 2018http://stroke.ahajournals.org/

Dow

nloaded from

were performed on the 15 items’ scores using a weighted � statisticbecause the item scores are categorical using SAS 9.2 macro (SASInstitute Inc, Cary, NC). A � or ICC was considered moderateagreement if 0.41 to 0.60, substantial agreement if 0.61 to 0.80, andalmost perfect (excellent) if 0.81 to 1.00.13 A 2-sided probabilityvalue of �0.05 was considered significant. STATA Version 11.1(STATA Corporation, College Station, TX) was used for regressionanalysis and ICC analysis.

ResultsSubject CharacteristicsNine of 15 centers (60%) participating in the parent prospec-tive study agreed to enroll their subjects in the current study.All subjects from the 9 centers were enrolled in the currentstudy with the exception of 1 subject for whom the neuro-logical examination in the medical record could not belocated. Therefore, 75 of 113 children (66.4%) from theparent study (66.4%) were included. Forty-five subjects(60%) were male. Racial distribution was 56 white (8Hispanic), 11 black or African American, 6 Asian, 1 mixedrace, and 1 of unknown race (Hispanic). The mean andmedian ages of the participants were 9.7 years (SD, 5.5 years)and 9.2 years (interquartile range, 4.5–15.6 years), respec-tively. There were no significant differences between the 75subjects enrolled in the study and the 38 subjects who werenot enrolled on the basis of median prospective PedNIHSStotal score, age at presentation, or sex (data not shown).

Concurrent Validity of Retrospective Scores WithProspective ScoresThe mean and median total prospective and retrospectivePedNIHSS scores were not different for the 75 subjects(P�0.49 Student t test; P�0.37 Wilcoxon rank sum). Themean prospective total PedNIHSS score for the 75 participat-ing children was 8.2 (SD, 7) with a median of 6 (interquartilerange, 3–12), and the mean total retrospective PedNIHSSscore was 7.6 (SD, 7) with a median of 5 (interquartile range,3–11). In regression analysis, R2 for the retrospective estima-tions of the total scores and the prospectively assigned totalscores was 0.76, slope 0.87 (95% CI, 0.77–0.97), intercept1.62, and P�0.001. The mean and median time differencebetween the performance of the prospective PedNIHSS scoreand the neurological examination documented in the medicalrecord were 6.8 hours (SD, 12.8 hours) and 1 hour (inter-quartile range, 0.25–9 hours), respectively. Neither age at thetime of the stroke nor time difference between the prospectivescore and the neurological examination recorded in themedical record altered the observed R2 (not shown). The R2

for each of the 4 raters’ retrospectively estimated total scoresversus the prospectively assigned total scores ranged from0.73 to 0.83. Figure 1 demonstrates a scatterplot of theprospective total PedNIHSS scores versus the retrospectivetotal PedNIHSS scores by rater. Two hundred sixty-eight of300 retrospective total scores (89%) were within 5 points ofthe prospectively scored totals. The sensitivity and specificityfor the raters correctly scoring a total PedNIHSS that hadbeen prospectively scored as �5 were 87% (95% CI, 81%–92%) and 81% (95% CI, 74%–87%), respectively. In thesubanalysis on the 22 subjects whose neurological examina-tion in the chart was performed before parent study enroll-

ment and in whom the neurological examination was per-formed by a general pediatric neurologist who was not thepediatric stroke neurologist who performed the prospectivePedNIHSS, the R2 was 0.85, slope 0.88 (95% CI, 0.77–1.00),intercept 1.44, and P�0.001.

Interrater Reliability of Retrospective ScoresFor the total score, the ICC was 0.95 (95% exact CI,0.94–0.99). Figure 2 demonstrates box plots of the distribu-tion of the retrospectively estimated total PedNIHSS scoresfor each rater, indicating that the 4 raters’ total scores werecomparable. The weighted � for the item scores ranged from0.47 to 0.93 (Table). The ICC was 0.95 (95% exact CI,0.90–0.98) in the subanalysis on the 22 patients in whom theneurological examination was documented before parent

Figure 1. Scatterplot of prospective total Pediatric NationalInstitutes of Health Stroke Scale (PedNIHSS) score vs retro-spective total PedNIHSS score for the 4 raters. Line representsreference with slope of 1.

Figure 2. Box plots of distributions of retrospective total Pediat-ric National Institutes of Health Stroke Scale (PedNIHSS) scoresfor each rater. Lower and upper boundaries represent the 25thand 75th percentiles. The central line represents the median.The circles represent outliers, defined as those scores fartheraway from the box than 1.5 times the interquartile range. Thesquare is an extreme outlier, defined as a score farther awayfrom the box than 3 times the interquartile range. Whiskers rep-resent the largest and smallest nonoutlying scores.

Beslow et al Retrospective Scoring of Pediatric NIHSS 343

by guest on June 28, 2018http://stroke.ahajournals.org/

Dow

nloaded from

study enrollment by a nonstroke neurologist different fromthe pediatric stroke neurologist who performed the prospec-tive PedNIHSS.

DiscussionRetrospective studies have been critical in the field ofpediatric stroke, but a limitation has been the inability tomeasure and then adjust for initial clinical stroke severity.The results of this study demonstrate that the PedNIHSSscore can be assessed retrospectively from neurologicalexaminations found in the medical record with excellentconcurrent validity and interrater reliability. Our ICC for thetotal PedNIHSS score among the 4 raters was 0.95, whichcompares favorably to the ICC of 0.82 previously reported forretrospective assessment of the NIHSS in adults4 and to theICC of 0.99 obtained in the prospective pediatric study.11

Compared with the weighted � for the item scores from theprospective pediatric study, our weighted � for all items weresimilar except for the visual item (3).11 There was 100%agreement on the visual performance item in the prospectivestudy, whereas in the retrospective study, this item hadpoorest agreement (��0.47). On inspection of the 4 raters’scores for this item, there were 6 instances in which 1 raterrecorded a 3 (bilateral hemianopia), whereas the other ratersrecorded 0 (normal). These were subjects who had severestrokes with poor mental status. This scenario was perhapsmore difficult to score because the directions for the prospec-tive PedNIHSS score indicate that a comatose subject shouldbe scored as a 3 for visual fields, whereas our algorithmindicated that items not clearly documented should be scoredas 0.11 Although the reliability for the visual item was onlymoderate, the ICC for the overall score was excellent. We

therefore would not alter the retrospective scoring method toimprove the interrater reliability of this single item.

The current study has several potential limitations. Al-though not all subjects in the parent study were enrolled inthis ancillary study, subjects who were enrolled did not differfrom those not enrolled with respect to age, sex, and prospec-tive PedNIHSS total score. All study subjects presented totertiary care hospitals and were enrolled in a single clinicaltrial in which the PedNIHSS was tested as a research tool. Itis not possible to determine if documentation practices in themedical record were altered based on enrollment. Someneurologists may have documented examinations either moreor less carefully due to subject participation in the parentstudy. Notably, the PedNIHSS score form for the study wasnot a part of the medical record and was not documented inthe medical record in most cases. Nonetheless, if participationin the study increased the degree of documentation in themedical chart compared with documentation practices inpatients not enrolled in the study, our reliability estimates orthe concurrent validity estimate may have been increased.However, our excellent reliability estimate (ICC, 0.95) andconcurrent validity estimate (R2�0.85) from the subanalysison 22 subjects whose neurological examination was docu-mented before enrollment in the parent study and by a generalpediatric neurologist who was not the pediatric stroke neu-rologist who performed the prospective PedNIHSS scoresuggests that documentation practices in the chart maylargely have been unchanged. Furthermore, the fact that theneurological examination corresponding to 3 different Ped-NIHSS score items was not documented in the medicalrecords of �15% of the subjects suggests that documentationpractices in the medical record may not have been altered bysubject participation in the parent study. Most pediatricstrokes are mild to moderate (median prospective total score6), and only 7 subjects had prospective total PedNIHSSscores �20. Therefore, extrapolation of the concurrent valid-ity and reliability of the retrospective scoring method toscores in the upper range for the total PedNIHSS score mayneed additional study. Additionally, the study findings cannotbe extrapolated to raters who are nonneurologist physicians,nurses, and research coordinators or to neurological exami-nations documented by nonneurologists. Replication of thestudy involving such raters who have other training, inpatients whose neurological examinations were documentedby practitioners other than neurologists, and in patients whoare not enrolled in the prospective PedNIHSS validationstudy would be useful to expand the use of the retrospectiveestimation of the PedNIHSS score. Nevertheless, our studyresults are strengthened by including raters at various stagesin their neurology training and by our inclusion of subjectsfrom 9 geographically diverse centers. Both of these factorsincrease the generalizability of our results. Despite possiblelimitations, our results demonstrate that the PedNIHSS scorecan be scored retrospectively from medical records.

ConclusionsStroke in children is uncommon14,15 with diverse and oftenmultiple etiologies,16 and the predictors of outcome arepoorly understood. Although we hope that the PedNIHSS

Table. Agreement Among 4 Raters by IndividualRetrospectively Scored Pediatric National Institutes of HealthStroke Scale Item

ItemWeighted

Kappa95% CI

for KappaNo. (%) of ItemsUndocumented*

1a. LOC 0.82 0.76–0.87 1 (1.3%)

1b. LOC questions 0.77 0.69–0.83 3 (4%)

1c. LOC commands 0.71 0.62–0.79 6 (8%)

2. Best gaze 0.83 0.78–0.88 6 (8%)

3. Visual 0.47 0.35–0.59 18 (24%)

4. Facial palsy 0.84 0.79–0.89 2 (2.7%)

5a. Motor arm left 0.93 0.90–0.95 2 (2.7%)

5b. Motor arm right 0.93 0.91–0.95 0 (0%)

6a. Motor leg left 0.89 0.85–0.92 3 (4%)

6b. Motor leg right 0.91 0.87–0.94 0 (0%)

7. Limb ataxia 0.87 0.83–0.91 11 (14.7%)

8. Sensory 0.72 0.63–0.79 8 (10.7%)

9. Best language 0.83 0.77–0.88 9 (12%)

10. Dysarthria 0.86 0.80–0.90 14 (18.7%)

11. Extinction/inattention

0.77 0.69–0.83 31 (41.3%)

LOC indicates level of consciousness.*No. of charts in which item not documented (total charts�75).

344 Stroke February 2012

by guest on June 28, 2018http://stroke.ahajournals.org/

Dow

nloaded from

score will become part of the routine examination for child-hood stroke patients, scoring the PedNIHSS from medicalrecords retrospectively is valid and reliable. The ability toassess and control for initial stroke severity will facilitatepotential retrospective studies and even prospective studieswith missing data on the PedNIHSS score that could provideessential information about pathophysiology, clinical course,and outcomes of pediatric stroke.

AcknowledgmentsThis article is the Master of Science in Clinical Epidemiology thesisfor L.A.B. from The Center for Clinical Epidemiology and Biosta-tistics at The University of Pennsylvania.

Sources of FundingL.A.B. received funding from NIH-T32-NS007413 and L. MortonMorley Funds of The Philadelphia Foundation. S.E.S. receivedfunding from NIH-K12-NS049453. M.M.D. received funding fromNIH-KL2-RR024983, The First American Real Estate Services, Inc,and the Doris Duke Charitable Foundation. L.C.J. received fundingfrom NIH-K23-NS062110. T.J.B. received funding from NIH-K23-HL096895. G.d.V. received funding from NIH-R01-NS062820.A.K. received funding from Heart/Stroke Foundation Canada, Heart/Stroke Foundation Alberta, Alberta Children’s Hospital Foundationand Research Institute, and the Hotchkiss Brain Institute. D.J.L.received funding from NIH-K23-NS52380, Dana Foundation, andthe June and Steve Wolfson Family Fund for Neurological Research.A.F.J. received funding from NIH-R01-NS050488. E.L. receivedfunding from NIH-K24-AI080942. R.N.I. received funding fromNIH-R01-NS050488 and K23-NS062110.

DisclosuresR.N.I. is a consultant/advisory board member, Berlin Heart ClinicalEvent Committee. L.C.J. is a consultant/advisory board member,Berlin Heart Clinical Event Committee.

References1. Brott T, Adams HP Jr, Olinger CP, Marler JR, Barsan WG, Biller J, et al.

Measurements of acute cerebral infarction: a clinical examination scale.Stroke. 1989;20:864–870.

2. Adams HP Jr, Davis PH, Leira EC, Chang KC, Bendixen BH, ClarkeWR, et al. Baseline NIH Stroke Scale score strongly predicts outcome

after stroke: a report of the Trial of Org 10172 in Acute Stroke Treatment(TOAST). Neurology. 1999;53:126–131.

3. Goldstein LB, Samsa GP. Reliability of the National Institutes of HealthStroke Scale. Extension to non-neurologists in the context of a clinicaltrial. Stroke. 1997;28:307–310.

4. Kasner SE, Chalela JA, Luciano JM, Cucchiara BL, Raps EC, McGarveyML, et al. Reliability and validity of estimating the NIH Stroke Scalescore from medical records. Stroke. 1999;30:1534–1537.

5. Williams LS, Yilmaz EY, Lopez-Yunez AM. Retrospective assessment ofinitial stroke severity with the NIH Stroke Scale. Stroke. 2000;31:858–862.

6. Barber M, Fail M, Shields M, Stott DJ, Langhorne P. Validity andreliability of estimating the Scandinavian Stroke Scale score frommedical records. Cerebrovasc Dis. 2004;17:224–227.

7. Stavem K, Lossius M, Ronning OM. Reliability and validity of theCanadian Neurological Scale in retrospective assessment of initial strokeseverity. Cerebrovasc Dis. 2003;16:286–291.

8. Bushnell CD, Johnston DC, Goldstein LB. Retrospective assessment ofinitial stroke severity: comparison of the NIH Stroke Scale and theCanadian Neurological Scale. Stroke. 2001;32:656–660.

9. Chitravas N, Dewey HM, Nicol MB, Harding DL, Pearce DC, Thrift AG.Is prestroke use of angiotensin-converting enzyme inhibitors associatedwith better outcome? Neurology. 2007;68:1687–1693.

10. Hsia AW, Sachdev HS, Tomlinson J, Hamilton SA, Tong DC. Efficacy ofIV tissue plasminogen activator in acute stroke: does stroke subtype reallymatter? Neurology. 2003;61:71–75.

11. Ichord RN, Bastian R, Abraham L, Askalan R, Benedict S, Bernard TJ, etal. Interrater reliability of the pediatric National Institutes of HealthStroke Scale (PedNIHSS) in a multicenter study. Stroke. 2011;42:613–617.

12. Sebire G, Fullerton H, Riou E, deVeber G. Toward the definition ofcerebral arteriopathies of childhood. Curr Opin Pediatr. 2004;16:617–622.

13. Landis JR, Koch GG. The measurement of observer agreement for cate-gorical data. Biometrics. 1977;33:159–174.

14. Fullerton HJ, Wu YW, Zhao S, Johnston SC. Risk of stroke in children:ethnic and gender disparities. Neurology. 2003;61:189–194.

15. Giroud M, Lemesle M, Gouyon JB, Nivelon JL, Milan C, Dumas R.Cerebrovascular disease in children under 16 years of age in the city ofDijon, France: a study of incidence and clinical features from 1985 to1993. J Clin Epidemiol. 1995;48:1343–1348.

16. Roach ES, Golomb MR, Adams R, Biller J, Daniels S, Deveber G, et al.Management of stroke in infants and children: a scientific statement froma Special Writing Group of the American Heart Association StrokeCouncil and the Council on Cardiovascular Disease in the young. Stroke.2008;39:2644–2691.

Beslow et al Retrospective Scoring of Pediatric NIHSS 345

by guest on June 28, 2018http://stroke.ahajournals.org/

Dow

nloaded from

Jonas H. Ellenberg, Ebbing Lautenbach and Rebecca N. IchordFriedman, Gabrielle deVeber, Adam Kirton, Lisa Abraham, Daniel J. Licht, Abbas F. Jawad,

Rachel A. Bastian, Michael M. Dowling, Warren Lo, Lori C. Jordan, Timothy J. Bernard, Neil Lauren A. Beslow, Scott E. Kasner, Sabrina E. Smith, Michael T. Mullen, Matthew P. Kirschen,

Institutes of Health Stroke ScaleConcurrent Validity and Reliability of Retrospective Scoring of the Pediatric National

Print ISSN: 0039-2499. Online ISSN: 1524-4628 Copyright © 2011 American Heart Association, Inc. All rights reserved.

is published by the American Heart Association, 7272 Greenville Avenue, Dallas, TX 75231Stroke doi: 10.1161/STROKEAHA.111.633305

2012;43:341-345; originally published online November 10, 2011;Stroke. 

http://stroke.ahajournals.org/content/43/2/341World Wide Web at:

The online version of this article, along with updated information and services, is located on the

http://stroke.ahajournals.org/content/suppl/2011/11/17/STROKEAHA.111.633305.DC1Data Supplement (unedited) at:

  http://stroke.ahajournals.org//subscriptions/

is online at: Stroke Information about subscribing to Subscriptions: 

http://www.lww.com/reprints Information about reprints can be found online at: Reprints:

  document. Permissions and Rights Question and Answer process is available in the

Request Permissions in the middle column of the Web page under Services. Further information about thisOnce the online version of the published article for which permission is being requested is located, click

can be obtained via RightsLink, a service of the Copyright Clearance Center, not the Editorial Office.Strokein Requests for permissions to reproduce figures, tables, or portions of articles originally publishedPermissions:

by guest on June 28, 2018http://stroke.ahajournals.org/

Dow

nloaded from

SUPPLEMENTAL MATERIAL

Appendix. Retrospective pediatric NIH stroke scale scoring algorithm

Item Score Prospective Retrospective

All items Score 0 if item not

documented

1a Level of consciousness

0 Alert; keenly responsive Same

1 Not alert, but arousable by

minor stimulation to obey,

answer or respond

Same

2 Not alert, requires repeated

stimulation to attend, or is

obtunded and requires strong

or painful stimulation to

make movements (not

stereotyped)

Same

3 Responds only with reflex

motor or autonomic effects or

totally unresponsive, flaccid

areflexic

Same

1b LOC questions

Additional Directions:

Q1 How old are you?

Q2 Where is XX?

XX=caregiver’s name)

0 Answers both questions

correctly

Note: Credit given if child

states age or shows correct

number of fingers

Credit given if child points to

or gazes purposefully in

direction of caregiver

Oriented X 3

1 Answers one question

correctly

Oriented X < 3

2 Answers neither question

correctly

Disoriented; if receptive

language deficit present

and orientation not

reported, code as

disoriented

1c LOC commands Additional directions:

C1 Open and close your eyes.

C2 Touch your nose.

0 Performs both tasks correctly Notation of “follows

commands” without

mention of receptive

language deficit

1 Performs one task correctly Notation of “intermittently

follows commands” either

with or without mention

of receptive language

deficit

2 Performs neither task

correctly

Notation of “does not

follow commands” either

with or without mention

of receptive language

deficit

2 Best gaze

0 Normal Same

1 Partial gaze palsy. This score

is given when gaze is

abnormal in one or both eyes,

but where forced deviation or

total gaze paresis not present.

Conjugate deviation of the

eyes that can be overcome by

voluntary or reflexive activity

Isolated peripheral nerve

paresis (CN III, IV, VI)

Same

2 Forced deviation, or total

gaze paresis not overcome by

the oculocephalic maneuver.

Same

3 Visual Additional Directions: Children up to age 6, testing

done with visual threat

Children > 6 years testing

done by confrontation using

finger counting

0 No visual loss

If child looks at side of

moving fingers appropriately,

can be scored as normal

Same

1 Partial hemianopia

Clear-cut asymmetry

including quadrantanopia

Double simultaneous

stimulation is performed at

Same

this point. If extinction exists,

score as 1 and use results to

answer item 11.

2 Complete hemianopia Same

3 Bilateral hemianopia

including any cause of

blindness

Same

4 Facial Palsy

0 Normal symmetrical

movement

Same

1 Minor paralysis (flattened

nasolabial fold, asymmetry

on smiling)

Same (“mild” in

description)

2 Partial paralysis (total or near

total paralysis of lower face)

Same (“moderate” or

“severe” in description or

if no qualifier)

3 Complete paralysis of one or

both sides (absence of facial

movement in the upper and

lower face)

Same

5 Motor arm

5a left arm

5b right arm

Additional directions:

For children too immature to

follow precise directions or

uncooperative for any reason,

power in each limb should be

graded by observation of

spontaneous or elicited

movements according to the

same grading scheme,

excluding the time limits

0 No drift, limb holds 90

degrees if sitting, or 45

degrees if supine for full 10

seconds

MRC 4+ or 5 or “normal”

1 Drift, limb holds 90 (or 45)

degrees [arm] but drifts down

before 10 seconds, does not

hit bed or other support

MRC 4 or “mild”

2 Some effort against gravity,

limb cannot get to or maintain

(if cued) 90 (or 45) degrees,

drifts down to bed, but has

some effort against gravity

MRC 3 or “moderate”

3 No effort against gravity,

limb falls

MRC 2 or “severe”

4 No movement MRC 0 or 1

9 Amputation, joint fusion Same

6 Motor leg

6a left leg

6b right leg

Additional directions:

For children too immature to

follow precise directions or

uncooperative for any reason,

power in each limb should be

graded by observation of

spontaneous or elicited

movements according to the

same grading scheme,

excluding the time limits

0 No drift, leg holds 30 degrees

position for full 5 seconds

MRC 4+ or 5 or “normal”

1 Drift, leg falls by the end of

the 5 second period but does

not hit bed

MRC 4 or “mild”

2 Some effort against gravity;

leg falls to bed by 5 seconds,

but has some effort against

gravity

MRC 3 or “moderate”

3 No effort against gravity, leg

falls to bed immediately

MRC 2 or “severe”

4 No movement MRC 0 or 1

9 Amputation, joint fusion

7 Limb ataxia

Right arm 1= yes 2= no

Left arm 1= yes 2= no

Right leg 1= yes 2 = no

Left leg 1 = yes 2 = no

9 = amputation or joint

fusion

Additional directions:

In children <5 substitute

tasks on adult scale with

reaching for a toy for the

upper extremity and kicking a

toy or examiner’s hand

0 Absent or paralyzed or

patient does not understand

Same

1 Present in 1 limb Same

2 Present in 2 limbs Same

8 Sensory Additional directions: For

children too young or

otherwise uncooperative for

reporting gradations of

sensory loss, observe for any

behavioral response to pin

prick, and score it according

to same scoring scheme as a

“normal” response, “mildly

diminished,” or “severely

diminished” response

0 Normal; no sensory loss Same

1 Mild to moderate sensory

loss; patient feels pinprick is

less sharp or is dull on the

affected side; or there is a

loss of superficial pain with

pinprick but patient is aware

he/she is being touched

Same or “mildly

diminished” response

2 Severe to total sensory loss;

patient is not aware of being

touched in the face, arm and

leg

Same or “severely

diminished” response

9 Best language Additional directions: For

children 2-6 years or older

children with premorbid

language disability, item

scored based on observations

of language comprehension

and speech during preceding

examination

0 No aphasia, normal Same

1 Mild to moderate aphasia;

some obvious loss of fluency

or facility in comprehension,

without significant limitation

on ideas expressed or form of

expression, however, makes

conversation about provided

material difficult of

impossible. For example in

conversation about provided

materials examiner can

identify picture or naming

card from patient’s response

Expressive deficit only,

unless qualified as

“severe” or “no speech but

preserved

comprehension,” etc., then

score as 2

2 Severe aphasia; all

communication is through

fragmentary expression; great

need for inference,

questioning, and guessing by

the listener. Range of

Expressive plus receptive

deficit or receptive deficit

only

information that can be

exchanged is limited; listener

carries burden of

communication. Examiner

cannot identify materials

provided from patient

response

3 Mute, global aphasia; no

usable speech of auditory

comprehension. If patient in

coma and scored 3 on

question 1a

Same

10 Dysarthria

0 Normal Same

1 Mild to moderate; patient

slurs at least some words and,

at worst, can be understood

with some difficulty

Same (qualified as mild or

moderate or “dysarthria”

or “slurred speech” with

no qualifier)

2 Severe; patient’s speech is so

slurred as to be unintelligible

in the absence of or out of

proportion to any dysphasia,

or is mute/anarthric

Same ( qualified as

severe)

9 Intubated or other physical

barrier

Same

11 Extinction and

inattention (formerly

neglect)

0 No abnormality Same (cannot test a

modality due to other

deficit but all other tested

modalities are normal)

1 Visual, tactile, auditory,

spatial, or personal

inattention or extinction to

bilateral simultaneous

stimulation in one of the

sensory modalities

Same

2 Profound hemi-inattention to

more than one modality. Does

not recognize own hand or

orients to only one side of

space

Same

Abbreviations: Q, question; C, command