MOS-SF Reliability and Validity

  • 8/12/2019 MOS-SF Reliability and Validity

The MOS Short-Form General Health Survey: Reliability and Validity in a Patient Population
Author(s): Anita L. Stewart, Ron D. Hays and John E. Ware, Jr.
Source: Medical Care, Vol. 26, No. 7 (Jul., 1988), pp. 724-735
Published by: Lippincott Williams & Wilkins
Stable URL: http://www.jstor.org/stable/3765494


    This content downloaded from 134.48.160.146 on Wed, 2 Apr 2014 14:32:07 PMAll use subject to JSTOR Terms and Conditions


MEDICAL CARE
July 1988, Vol. 26, No. 7

Communication

The MOS Short-form General Health Survey
Reliability and Validity in a Patient Population

ANITA L. STEWART, PHD, RON D. HAYS, PHD, AND JOHN E. WARE, JR., PHD

From the Department of Behavioral Sciences, The RAND Corporation, Santa Monica, California.
Supported by grants for the Medical Outcomes Study from the Robert Wood Johnson Foundation, The Henry J. Kaiser Family Foundation, and the Pew Charitable Trusts.
The opinions expressed are those of the authors and do not necessarily reflect the opinions of the sponsors or The RAND Corporation.
Address correspondence to: Anita L. Stewart, The RAND Corporation, 1700 Main Street, Santa Monica, CA 90406.

There is a great demand for measures of physical and mental health, social and role functioning, and other general health concepts for use in evaluating health care.1-3 To be useful, these instruments should represent multiple health concepts and a range of health states pertaining to general functioning and well-being.4 They should adhere to conventional standards of reliability and validity.5 To be useful in clinic settings, measures must also be simple and easy to use. Patients tend to be sicker than the general population, their attention is divided, and time is limited. Therefore, measures that work well in general populations may not work well for patients.

Health surveys that are comprehensive and satisfy psychometric standards are currently available, including the McMaster Health Index Questionnaire, the Sickness Impact Profile (SIP), the Functional Status Questionnaire, the Duke-UNC Health Profile, the RAND Health Insurance Experiment (HIE) measures, the Nottingham Health Profile, and the Index of Well-being (IWB).6-12 However, these instruments are too long to be practical in most clinic settings. For example, the HIE health scales included 108 questionnaire items that took an average of 45 minutes to complete. The short form of the SIP includes 136 questions that take an average of 30 minutes.13 The IWB is interviewer-administered and takes about 18 minutes.14

The length of most available instruments has prompted investigators to adopt surveys based on a few single-item measures.15 For example, Spitzer, Dobson, Hall, et al. developed a quality of life index that aggregates five single-item measures of health and health-related concepts; the index required about a minute for completion by physicians.16 A single-item rating of health in general is one of the most commonly used measures, such as in the National Health and Nutrition Examination Survey.17 Single-item measures of subjective well-being have been proposed by several investigators.18-20 In general, single-item measures are less satisfactory than multi-item scales because single items are generally less precise, less reliable, and less valid.21,22 Multi-item scales also provide more options for estimating scores when a response to a given item is missing.

A compromise between lengthy instruments and single-item measures of health was sought. A small subset of items from long-form measures has been shown to satisfy standards of acceptability, reliability, and validity in a general population.23 Reported here are results from administering


such a survey to patients. The results are compared with those from administration of this survey to a general population.

Methods

Sampling

Patients surveyed were participating in the Medical Outcomes Study (MOS), an observational study of variations in physician practice styles and patient outcomes in different systems of care. At each of three study sites (Boston, Chicago, Los Angeles), physicians (general internists, family physicians, cardiologists, endocrinologists, diabetologists, psychiatrists), psychologists, and other mental health providers were sampled from health maintenance organizations (HMO), large multispecialty groups, and solo fee-for-service practices. Altogether, 526 health care providers age 31-55, who reported direct patient care as their primary professional activity and who had been in their current practice setting at least one year, were included in the MOS.

The information in this article is based on a sample of 11,186 adult, English-speaking patients who visited these providers during the sampling period (lasting 9 days on average). Ages ranged from 18 to 103 (mean age was 47). Thirty-eight percent were male and 87% had completed high school (average of 13.7 years of education). Fifty percent of the sample had a total household income of at least $20,000 in 1985 dollars. Seventy-nine percent were white, 11% black, 5% Latino, and 3% were Asian or Pacific Islander. The general population sample of adults representing United States households, to which we compare results, is described elsewhere.23 As expected,24 the patient sample was slightly older and overrepresented women relative to the general population sample. The patients were also slightly more educated and had slightly higher income.

Data Collection

Data from patients were collected from February through October 1986. The 20 health items were located in the middle of a 75-item self-administered questionnaire, which was completed by patients as they waited to see their doctor. In all solo practices and in some group practices, surveys were distributed by office staff. In most group practices, they were distributed by MOS field representatives. The entire questionnaire took an average of 13 minutes to complete, of which it is estimated that the health items took from 3 to 4 minutes, on average. Questionnaires were returned for about 74% of the eligible patient visits in group practices and about 65% of such visits in fee-for-service practices. These return rates underestimate patient acceptance of the questionnaire because, when practices were very busy, staff were encouraged to survey every other patient.

Health Measures

In accordance with the minimum standard of content validity for a comprehensive health measure suggested by Ware4 and consistent with previous definitions of health,25-27 20 items were selected to represent six health concepts: physical functioning, role functioning, social functioning, mental health, health perceptions, and pain. Physical functioning was assessed by limitations in a variety of physical activities, ranging from strenuous to basic, due to health. Role and social functioning were defined by limitations due to health problems. Mental health was assessed in terms of psychological distress and well-being. The measure of health perceptions tapped patients' own ratings of their current health in general. Pain was included to capture differences in physical discomfort. Definitions of physical functioning, mental health, and health perceptions tap positive as well as negative states of health. The definitions are summarized in Table 1. Questionnaire items are presented in the appendix.

Eighteen of the 20 items were adapted from longer HIE measures of these concepts and were used successfully in a general population survey.23 The two additional single-item measures (social functioning and pain) were added after that administration, based on experience with similar measures in the HIE.

TABLE 1. Definitions of Health Concepts

Measure | No. of Items | Definition | Item Numbers(a)
Physical functioning | 6 | Extent to which health interferes with a variety of activities (e.g., sports, carrying groceries, climbing stairs, and walking) | 16a-16f
Role functioning | 2 | Extent to which health interferes with usual daily activity such as work, housework, or school | 18, 19
Social functioning | 1 | Extent to which health interferes with normal social activities such as visiting with friends during past month | 20
Mental health | 5 | General mood or affect, including depression, anxiety, and psychologic well-being during the past month | 21-25
Health perceptions | 5 | Overall ratings of current health in general | 2, 26a-26d
Pain | 1 | Extent of bodily pain in past 4 weeks | 17

a See appendix.

Analysis Plan

Analyses were designed to evaluate the extent to which very short multi-item scales would satisfy traditional psychometric criteria.5 A multitrait scaling method was used to test item convergent and discriminant validity.28 This method consists of three steps designed to determine whether items have equivalent variances, whether each item in a hypothesized group is substantially related (r > 0.40) to the total score computed from other items in that group (item convergent validity criterion), and whether each item correlates significantly higher with its hypothesized scale than with other scales (item discriminant validity criterion). If these conditions are met, it is appropriate to combine items as hypothesized into simple summated ratings scales. These multitrait scaling tests were performed for the patients having complete data on all 20 items (N = 8,294, 73% of respondents).
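As an illustration of the item convergent validity criterion, the following minimal Python sketch correlates each item with the sum of the other items in its hypothesized scale (the "corrected for overlap" form) and compares the result with r > 0.40. The response data, the function names, and the three-item scale are hypothetical and are not taken from the MOS data set; the discriminant validity step (comparison with the other scales) is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_scale_correlations(items):
    """Correlate each item with the sum of the OTHER items in the scale.

    items: list of item columns, one list of responses per item,
    all in the same respondent order.
    """
    n_respondents = len(items[0])
    correlations = []
    for i, item in enumerate(items):
        # Total score computed from every item except item i.
        rest = [
            sum(col[r] for j, col in enumerate(items) if j != i)
            for r in range(n_respondents)
        ]
        correlations.append(pearson(item, rest))
    return correlations


# Hypothetical responses: 3 items, 6 respondents (illustrative only).
scale = [
    [1, 2, 3, 4, 5, 5],
    [2, 2, 3, 5, 4, 5],
    [1, 3, 3, 4, 5, 4],
]
r_values = corrected_item_scale_correlations(scale)
meets_convergent = [r > 0.40 for r in r_values]
```

Excluding the item from its own total avoids inflating the correlation, which is why the other-items sum is used rather than the full scale score.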

Cronbach's alpha,29 a measure of internal-consistency reliability, was estimated for the four multi-item scales. Reliability is considered acceptable for group comparisons when alpha is 0.50 or above.30 On the strength of experience with longer forms of these measures, the authors thought it would be possible to achieve reliability coefficients above 0.70, as recommended by Nunnally.31 Reliability was evaluated for the total sample and in two subsamples for whom data quality was hypothesized to be lower based on prior studies: those with less than a high school education and those over age 75.32 In addition, because patients with serious health problems might have trouble completing such a questionnaire, the authors tested the reliability separately for groups of patients with congestive heart failure, depressive symptoms, diabetes, and/or recent myocardial infarction. Where possible, the reliability estimates for the short-form scales were also compared with those obtained for longer versions of the same scales.
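For readers who want to see how such a coefficient is obtained, coefficient alpha for a scale of k items is commonly written as alpha = k/(k - 1) x (1 - sum of item variances / variance of the total score). The Python sketch below applies that standard formula to hypothetical responses; the data and the function name are illustrative assumptions, not the authors' computations.

```python
from statistics import pvariance


def cronbach_alpha(items):
    """Coefficient alpha for a list of item columns (one list per item).

    Uses population variances throughout; using sample variances
    consistently would give the same alpha because the ratio is unchanged.
    """
    k = len(items)
    totals = [sum(responses) for responses in zip(*items)]
    item_variance_sum = sum(pvariance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / pvariance(totals))


# Hypothetical responses: 3 items, 6 respondents (illustrative only).
scale = [
    [1, 2, 3, 4, 5, 5],
    [2, 2, 3, 5, 4, 5],
    [1, 3, 3, 4, 5, 4],
]
alpha = cronbach_alpha(scale)
acceptable_for_groups = alpha >= 0.50   # threshold cited in the text
meets_nunnally = alpha > 0.70           # target cited in the text
```

With perfectly parallel items (identical columns) the formula returns 1, which is a quick sanity check on any implementation.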

Preliminary tests of validity were also possible within the available data set, including product-moment correlations among the health measures, discrimination between patient and general population groups, and correlations with sociodemographic characteristics. All correlations were examined to determine whether the short-form measures produced the same pattern of results as observed for long-form measures in previous research. Patient and general population groups were compared by examining the magnitude of the difference in the proportion of those scoring in the poor health range (i.e., known groups validity).

The variability of score distributions was examined for all six health measures. Our goal was to achieve scores that were normally distributed or at least not highly skewed or kurtotic.

Construction of Health Measures

Consistent with previous studies, limitations in physical and role functioning were counted regardless of duration and were scored to reflect the number of limitations present.33,34 Scores were reversed so that a high value indicated better functioning. Mental health scales were scored by summing the item responses, after reversing the scoring of some items so that a high score indicated better health. Before combining items in the health perceptions scale, the authors recoded the response choices of the overall health item (item 1), to better reflect the unequal intervals of the item.* The scale was scored by summing the item responses, after recoding some items so that a high score indicated better health. The single-item measures were scored so that high scores indicated better social functioning and more pain. Finally, for all measures, scores were transformed linearly to 0-100 scales, with 0 and 100 assigned to the lowest and highest possible scores, respectively.†

* To do this, the authors calculated the mean general health score for the other four items for each response level on the overall item. These mean scores were then transposed onto a 1-5 scale to correspond to the scale used for the other 4 items. This resulted in the following transformation: 1 = 5, 2 = 4.36, 3 = 3.43, 4 = 1.99, 5 = 1.

Results

Multitrait Scaling

The item means and standard deviations were roughly equal within each scale, thus meeting the first criterion for summated ratings scales, with one minor exception in the physical functioning scale. Item-scale correlations (corrected for overlap) for the four multi-item scales indicated that our stringent criterion of convergent validity was met in all cases. Item-scale correlations for hypothesized scales ranged from 0.45 to 0.79, with a median of 0.68. All items in each hypothesized scale also exceeded the discriminant validity criterion. These results support the construction of simple summated ratings scales based on hypothesized item groupings.

Variability of Health Measures

The mean and standard deviation for each of the measures are shown in Table 2. The full range of possible scores was observed for all measures (data not reported).‡ The distributions of mental health and health perceptions scores were roughly symmetric, as desired. The distributions of physical and role functioning scores were skewed, with more people scoring along the positive end of the scale but to a lesser degree than in general populations.34 The role functioning scale had a somewhat bimodal distribution with the least prevalent category being the middle one; 72% had perfect functioning and 17% scored at the worst level. The distribution of the social functioning item was quite skewed and kurtotic, with the modal score (69% of the sample) being perfect functioning. The pain item was well distributed even though the modal score was no pain (30%), with approximately 9% reporting severe or very severe pain and about 20% reporting moderate pain.

TABLE 2. Descriptive Statistics for Health Scales and Percent of Patient (N = 11,186) and General Population (N = 2,008) Samples Scoring in the Poor Health Range

Measure | No. of Items | Mean | SD | % in Poor Health(b): Patients | % in Poor Health(b): General Population(c)
Physical functioning | 6 | 78.5 | 30.8 | 45 | 22
Role functioning | 2 | 77.5 | 38.3 | 28 | 12
Social functioning | 1 | 87.2 | 23.6 | 9 | (d)
Mental health | 5 | 72.6 | 20.2 | 31 | 19
Health perceptions | 5 | 63.0 | 26.8 | 52 | 20
Pain | 1 | 31.4 | 27.7 | 29 | (d)

a Observed range of all scores was 0-100. A high score indicates better health except for pain, where a high score indicates more pain.
b Poor health defined as: physical and role functioning = one or more limitations; social functioning = limitations a good bit of the time or more; mental health = lowest 19% of scores in general population sample (score of 67 or lower) (cutoff defined as close as possible to the bottom 20%); health perceptions = lowest 20% of scores in general population sample (score of 70 or lower); pain = moderate, severe, or very severe pain.
c T-tests of difference between proportions in patient and general population samples were statistically significant (P < 0.01) for every possible comparison.
d Not available.

† For the mental health and general health scales, a missing score was assigned only if all five items in the scale were missing. For the two-item role-functioning scale, a missing score was assigned initially if either item was missing. However, if the one nonmissing item indicated that the person was unable to work (limited on item 18), the lowest possible functioning score of 0 was assigned. If the one nonmissing item indicated that the person had no limitations in the kind or amount of work (not limited on item 19), the highest possible functioning score of 100 was assigned. For the physical functioning scale, a missing score was assigned if more than one item was missing. However, if the first item (item 16a) was answered as not limited and any remaining items were missing, a score of 100 was assigned. The authors found about 8% of respondents were assigned a missing value on physical and role functioning, [...]% on mental health, and less than 1% on general health. Missing data rates for the single-item pain and social functioning measures were both about 3%. Overall, 15% of the sample had missing data on one or more of the final scales. Missing data rates were highest among those over 75 (39%).

‡ Actual score distributions are available from the senior author upon request.
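The scores summarized in Table 2 are on the 0-100 metric described under Construction of Health Measures. As a rough illustration of that scoring, the Python sketch below recodes the overall health item using the values given in the footnote to that section, sums the five health perceptions items, and applies the linear 0-100 transformation. It assumes that the first recode value is 1 = 5, that each of the other four items is scored 1-5, and that those items have already been oriented so that a high value means better health; the function and variable names are hypothetical.

```python
# Recoding of the overall health item, per the footnote values
# (the mapping for response 1 is assumed to be 5).
OVERALL_HEALTH_RECODE = {1: 5.0, 2: 4.36, 3: 3.43, 4: 1.99, 5: 1.0}


def to_0_100(raw, lowest, highest):
    """Linear transformation: lowest possible raw score -> 0, highest -> 100."""
    return 100.0 * (raw - lowest) / (highest - lowest)


def health_perceptions_score(overall_item, other_items):
    """Score a five-item health perceptions scale on a 0-100 metric.

    overall_item: response 1-5 on the overall health item (1 = best).
    other_items: four responses, each 1-5, already oriented so that
                 high = better health (an assumption of this sketch).
    """
    raw = OVERALL_HEALTH_RECODE[overall_item] + sum(other_items)
    # Five items, each ranging 1-5 after recoding: raw range is 5-25.
    return to_0_100(raw, lowest=5.0, highest=25.0)


best = health_perceptions_score(1, [5, 5, 5, 5])
worst = health_perceptions_score(5, [1, 1, 1, 1])
middle = health_perceptions_score(3, [3, 3, 3, 3])
```

Because the recoded values for the middle response levels are unequally spaced, a respondent answering 3 on every item lands slightly above the midpoint rather than exactly at 50.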

Reliability of Health Measures

Reliability coefficients for the multi-item health scales ranged from 0.81 to 0.88 (Table 3). These estimates are nearly identical to those for the same scales in the general population sample (range was from 0.76 to 0.88). Estimates for the four multi-item scales were similar for depressed patients (0.82 to 0.87) and for other subgroups analyzed: congestive heart failure (0.77 to 0.87), diabetes (0.83 to 0.87), myocardial infarction (0.77 to 0.88), less than a high school education (0.86 to 0.88), and over age 75 (0.84 to 0.89).

The internal-consistency reliabilities of these short-form measures were lower, but not much lower, than their full-length versions. The internal-consistency reliability of the five-item health perceptions measure was 0.87, compared to 0.88 for a nine-item general health scale.35 The reliability of the five-item mental health measure was 0.88, compared with 0.96 for a 38-item version.36 The reliability of the six-item physical functioning measure was 0.86, compared with 0.90 for a 10-item similar measure.37 Finally, the reliability of the two-item role functioning was 0.81, compared with a coefficient of reproducibility of 0.92 for a three-item version.34

TABLE 3. Reliability Estimates and Correlations Among Health Measures

Measure | PF | RF | SF | MH | HP | P
Physical Functioning (PF) | (0.86) | | | | |
Role Functioning (RF) | 0.65 | (0.81) | | | |
Social Functioning (SF) | 0.47 | 0.56 | (-) | | |
Mental Health (MH) | 0.24 | 0.33 | 0.45 | (0.88) | |
Health Perceptions (HP) | 0.53 | 0.57 | 0.53 | 0.45 | (0.87) |
Pain (P) | -0.39 | -0.42 | -0.39 | -0.42 | -0.47 | (-)

Note: Ns varied from 9,729 to 10,860, due to missing data. All correlation coefficients are statistically significant (P < 0.01). Internal-consistency reliabilities are given on the diagonal for multi-item scales.

Validity of Health Measures

Results for the three types of validity analyses, introduced in the methods section, are presented below.

Correlations Among Health Measures. All correlations among the health measures were statistically significant (P < 0.01) and most were substantial in magnitude (see Table 3). This pattern of correlations corresponds well with that observed from studies of full-length versions of these measures.10,38 Because the social functioning item had the same format as the mental health items, it might be expected to correlate highest with mental health. It did not, suggesting that this method effect did not dominate the results. Consistent with previous research,35 the health perceptions scale correlated substantially with both physical and mental health; in fact, that scale correlated substantially with all of the other health scales.

Comparison of Patient and General Population Samples. The authors calculated the percent scoring in the poor health range for each of the health measures (see Table 2 for definitions of poor health). The percentage of respondents with poor health was significantly greater (P < 0.01) in the patient sample than in the general population sample on all four comparable measures, consistent with previous studies.13,16,39,40 The percentage of patients with physical or role limitations or poor health perceptions was about twice that observed for the general population sample. The percentage of respondents with poor mental health was 50% larger in the patient sample than in the general population sample. These differences could not be accounted for by differences in sociodemographic characteristics between the two samples.

Correlations Between Health Measures and Sociodemographics. Correlations between the health measures and age, sex, education, income, and race (not shown) were consistent with results using longer form measures.38 People with more education and income tended to have better health. In this study, men reported slightly better health than women on all measures except health perceptions. Older people tended to report poorer health than younger people on all measures except mental health, as expected. Nonwhites tended to report poorer health perceptions, poorer social functioning, and more pain than whites.

Discussion

The authors' goal was to develop a general health survey that is comprehensive and psychometrically sound, yet short enough to be practical for use in large-scale studies of patients in practice settings. The resulting 20-item survey assesses physical functioning, role and social functioning, mental health, health perceptions, and pain. A full range of favorable and unfavorable health levels is tapped by items in these measures. Thus, the survey achieves breadth and depth of measurement, while permitting self-administration in only 3 to 4 minutes. By virtue of its reduction of respondent burden by 80% relative to the lengthier measures from which it was derived, this short-form survey offers a more practical approach to patient health assessment.

The reliability of each of the multi-item scales is acceptable for group comparisons, even in subgroups of patients over age 75, with serious chronic conditions, with depressive symptoms, and with less than a high school education. The reliability of the mental health and the health perceptions measures, however, might be slightly inflated because these items were asked in a sequence. Future tests of the reliability of these items should split them up to minimize recall effects.

The fact that the reliabilities observed here were not substantially lower than those for long-form measures is encouraging. However, it does not necessarily follow that short-form measures will achieve equivalent precision in measuring changes in health over time. Such precision is essential for studies of health outcomes. Although some sacrifice in precision is likely with short measures, compared with lengthier ones, these short-form scales represent a gain in precision relative to single-item measures, which are typically coarse.22 Tradeoffs between short- and long-form measures in detecting changes in health over time are currently being evaluated in the MOS.

The results also offer preliminary support for the validity of the measures. First, excellent item discrimination among hypothesized scales in the multitrait scaling analyses was observed. Second, correlations among the health measures and between the measures and sociodemographic characteristics were similar to correlations observed using longer form versions of these measures. Third, substantial differences in health between the patient and general population samples were observed and the pattern of differences (across measures) was consistent with previous research.39

The validity of a health survey cannot be established in a single study. Future studies of these short-form measures should evaluate how well they discriminate among groups differing in diagnosis and disease severity. Their validity in predicting future health and utilization of health services should also be tested. Short-form measures may not do as well as long-form measures in tests of validity.22 Thus, it is important to establish the limits of the short-form measures and understand fully the tradeoffs involved in their use.

A relatively high rate of missing data for items in the physical and role functioning scales (8%) was observed. The pattern of missing data in the role functioning scale suggests that some people overlooked the second role functioning item (item 19). This item should be printed directly below the first one in future administrations of this battery. For the physical functioning items, instructions should make more clear that respondents should answer every item, not just those describing problems that apply. Missing data were more prevalent among patients older than 75, with less than a high school education, and with diabetes or heart disease. The authors suspect that this problem is not unique to this study and suggest

that special steps be taken to guard against this problem in studies of these populations. An advantage of multi-item scales in this regard is the option they provide for estimating missing item scores from other items in the same scale. Using this method, the percentage of respondents available for analysis was increased from 73% to 85%.

The results of this study offer some encouragement regarding the potential of short-form health measures in surveys of both patients and general populations. It may no longer be necessary to completely omit measures of functional status and well-being from large-scale studies because of practical constraints. Further, for some purposes the same short-form measures may be appropriate for use in both populations. The usefulness of these measures in health policy research to evaluate the effects of health care as well as in clinical trials also warrants further study.

The modified instructions for physical functioning are as follows: For how long has your health limited you in each activity listed? Please provide an answer for each of the activities listed.
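The Discussion notes that multi-item scales allow missing item scores to be estimated from other items in the same scale, but it does not spell out the estimation rule. One common approach, shown here purely as an assumption and not as the authors' procedure, is to substitute the mean of the respondent's answered items in that scale. The function name and the `min_answered` parameter are hypothetical.

```python
def scale_score_with_imputation(responses, min_answered=1):
    """Sum a scale after filling missing items with the respondent's own mean.

    responses: item responses for one respondent on one scale, with None
               marking a missing item.
    min_answered: minimum number of answered items required; with the
                  default of 1, a score is missing only if every item is
                  missing (mirroring the rule described for the five-item
                  scales).
    Returns the summed raw score, or None if too few items were answered.
    """
    answered = [r for r in responses if r is not None]
    if len(answered) < min_answered:
        return None
    mean_answered = sum(answered) / len(answered)
    # Replace each missing response with the mean of the answered ones.
    filled = [r if r is not None else mean_answered for r in responses]
    return sum(filled)


complete = scale_score_with_imputation([4, 4, 5, 3, 4])
one_missing = scale_score_with_imputation([4, None, 5, 3, 4])
all_missing = scale_score_with_imputation([None] * 5)
```

Raising `min_answered` makes the rule stricter, trading a smaller analytic sample for scores that rest on more observed items.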

(Key words: health; health assessment; functioning.)

References

1. McDermott W. Absence of indicators of the influence of its physicians on a society's health. Am J Med 1981;70:833.
2. Schroeder SA. Outcome assessment 70 years later: are we ready? N Engl J Med 1987;316:160.
3. Tarlov AR. Shattuck lecture - the increasing supply of physicians, the changing structure of the health-services system, and the future practice of medicine. N Engl J Med 1983;308:1235.
4. Ware JE. Standards for validating health measures: definition and content. J Chronic Dis 1987;40:473.
5. Ware JE. Methodological considerations in the selection of health status assessment procedures. In: Wenger NK, Mattson ME, Furberg CD, et al., eds. Assessment of Quality of Life in Clinical Trials of Cardiovascular Therapies. New York: Le Jacq Publishing, 1984.
6. Chambers LW, MacDonald LA, Tugwell P, et al. The McMaster Health Index Questionnaire as a measure of quality of life for patients with rheumatoid disease. J Rheumatol 1982;9:780.
7. Bergner M, Bobbitt RA, Carter WB, et al. The Sickness Impact Profile: development and final revision of a health status measure. Med Care 1981;19:787.
8. Jette AM, Davies AR, Cleary PD, et al. The Functional Status Questionnaire: reliability and validity when used in primary care. J Gen Intern Med 1986;1:143.
9. Parkerson GR, Gehlbach SH, Wagner EH, et al. The Duke-UNC health profile: an adult health status instrument for primary care. Med Care 1981;19:806.
10. Brook RH, Ware JE, Davies-Avery A, et al. Conceptualization and measurement of health for adults in the Health Insurance Study: vol VIII, overview. Santa Monica, CA: The RAND Corporation (publication number R-1987/8-HEW), 1979.
11. Hunt SM, McKenna SP, McEwen J, et al. The Nottingham Health Profile: Subjective health status and medical consultations. Soc Sci Med 1981;15A:221.
12. Patrick DL, Bush JW, Chen MM. Toward an operational definition of health. J Health Soc Behav 1973;14:6.
13. Bergner M. The Sickness Impact Profile (SIP). In: Wenger NK, Mattson ME, Furberg CD, et al., eds. Assessment of Quality of Life in Clinical Trials of Cardiovascular Therapies. New York: Le Jacq Publishing, 1984.
14. Read JL, Quinn RJ, Hoefer MA. Measuring overall health: an evaluation of three important approaches. J Chronic Dis 1987;40(Supp):7S.
15. Diener E. Subjective well-being. Psychol Bull 1984;95:542.
16. Spitzer WO, Dobson AJ, Hall J, et al. Measuring the quality of life of cancer patients: a concise QL-index for use by physicians. J Chronic Dis 1981;34:585.
17. Wan TTH, Livieratos B. Interpreting a general index of subjective well-being. Milbank Mem Fund Q 1978;56:531.
18. Andrews FM, Withey SB. Developing measures of perceived life quality: results from several national surveys. Social Indicators Research 1974;1:1.
19. Cantril H. The Pattern of Human Concerns. New Brunswick, NJ: Rutgers University Press, 1965.
20. Gurin G, Veroff J, Feld S. Americans View Their Mental Health. New York: Basic Books, 1960.
21. Ware JE, Karmos AH. Development and validation of scales to measure perceived health and patient role propensity: volume II of a final report. Springfield, VA: National Technical Information Services (NTIS publication no. PB 288-331), 1976.
22. Manning WG, Newhouse JP, Ware JE. The status of health in demand estimation; or, beyond excellent, good, fair, and poor. In: Fuchs VR, ed. Economic Aspects of Health. Chicago: University of Chicago Press, 1982.
23. Ware JE, Sherbourne CA, Davies AR, et al. A short-form general health survey. Santa Monica: The RAND Corporation (publication number P-7444), 1988.
24. McKinlay JB. Some approaches and problems in the study of the use of services: an overview. J Health Soc Behav 1972;13:115.
25. World Health Organization. Constitution of the World Health Organization. In: Basic Documents. Geneva: World Health Organization, 1948.
26. Bergner M. Measurement of health status. Med Care 1985;23:696.
27. Breslow L. A quantitative approach to the World Health Organization definition of health: physical, mental and social well-being. Int J Epidemiol 1972;1:347.
28. Ware JE, Brook RH, Davies-Avery A, et al. Conceptualization and measurement of health for adults in the Health Insurance Study: vol. I, model of health and methodology. Santa Monica: The RAND Corporation (publication number R-1987/1-HEW), 1980.
29. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika 1951;16:297.
30. Helmstadter GC. Principles of Psychological Measurement. New York: Appleton-Century-Crofts, 1964.

    31. Nunnally JC. Psychometric Theory, 2nd ed.New York:McGraw-Hill, 978.32. AndrewsFM.Construct alidityanderror om-ponentsof surveymeasures: structuralmodelingap-proach.PublicOpinionQuarterly 984;48:409.33. StewartAL,WareJE,BrookRH.Advances n themeasurement f functional tatus:construction f ag-gregate ndexes.Med Care1981;19:473.34. StewartAL, WareJE,BrookRH. Construction

    and scoringof aggregate unctionalstatus measures:volume I. Santa Monica: The RAND Corporation(publication o. R-2551-1-HHS),1982.35. DaviesAR, WareJE.Measuringhealthpercep-tionsin the HealthInsuranceExperiment. antaMon-ica: The RAND Corporation,(publication numberR-2711-HHS),1981.36. VeitCT,WareJE.Thestructure f psychologicaldistressand well-being n generalpopulations.JCon-sult ClinPsychol1983;51:730.37. StewartAL,WareJE,BrookRH,et al.Conceptu-alizationand measurement f health for adults n theHealthInsurance tudy:vol II,physicalhealth n termsof functioning.SantaMonica:The RANDCorporation(publication o. 1987/2-HEW),1978.38. WareJE,Davies-AveryA, BrookRH.Conceptu-alizationand measurement f healthfor adults n theHealth InsuranceStudy:vol VI, analysisof relation-ships among health status measures.Santa Monica:The RANDCorporationpublication umberR-1987/6-HEW),1980.39. Nelson E, CongerB, DouglassR, et al. Func-tional health status levels of primarycare patients.JAMA1983;249:3331.40. CassilethBR,LuskEJ,StrouseTB,et al. Psycho-social statusin chronic llness:a comparative nalysisof six diagnosticgroups.N EnglJMed 1984;311:506.


    Appendix

    Short-form Health Survey: Medical Outcomes Study

    2. In general, would you say your health is:
        1 [ ] Excellent
        2 [ ] Very Good
        3 [ ] Good
        4 [ ] Fair
        5 [ ] Poor

    17. How much bodily pain have you had during the past 4 weeks?
        1 [ ] None
        2 [ ] Very mild
        3 [ ] Mild
        4 [ ] Moderate
        5 [ ] Severe

    16. For how long (if at all) has your health limited you in each of the following activities? (Check One Box on Each Line)

        Response columns for each line:
        1 [ ] Limited for more than 3 months
        2 [ ] Limited for 3 months or less
        3 [ ] Not limited at all

        a. The kinds or amounts of vigorous activities you can do, like lifting heavy objects, running or participating in strenuous sports
        b. The kinds or amounts of moderate activities you can do, like moving a table, carrying groceries or bowling
        c. Walking uphill or climbing a few flights of stairs
        d. Bending, lifting or stooping
        e. Walking one block
        f. Eating, dressing, bathing, or using the toilet

    18. Does your health keep you from working at a job, doing work around the house or going to school?
        1 [ ] Yes, for more than 3 months
        2 [ ] Yes, for 3 months or less
        3 [ ] No

    19. Have you been unable to do certain kinds or amounts of work, housework or schoolwork because of your health?
        1 [ ] Yes, for more than 3 months
        2 [ ] Yes, for 3 months or less
        3 [ ] No


    For each of the following questions, please check the box for the one answer that comes closest to the way you have been feeling during the past month. (Check One Box on Each Line)

        Response columns for each line:
        1 [ ] All of the Time
        2 [ ] Most of the Time
        3 [ ] A Good Bit of the Time
        4 [ ] Some of the Time
        5 [ ] A Little of the Time
        6 [ ] None of the Time

    20. How much of the time, during the past month, has your health limited your social activities (like visiting with friends or close relatives)?

    21. How much of the time, during the past month, have you been a very nervous person?

    22. During the past month, how much of the time have you felt calm and peaceful?

    23. How much of the time, during the past month, have you felt downhearted and blue?

    24. During the past month, how much of the time have you been a happy person?

    25. How often, during the past month, have you felt so down in the dumps that nothing could cheer you up?


    26. Please check the box that best describes whether each of the following statements is true or false for you. (Check One Box on Each Line)

        Response columns for each line:
        1 [ ] Definitely True
        2 [ ] Mostly True
        3 [ ] Not Sure
        4 [ ] Mostly False
        5 [ ] Definitely False

        a. I am somewhat ill
        b. I am as healthy as anybody I know
        c. My health is excellent
        d. I have been feeling bad lately

    NOTE: Item numbers indicate the order in which the questions appeared in the questionnaire.
