
EVALUATION
METHODS FOR STUDYING PROGRAMS AND POLICIES

SECOND EDITION

Carol H. Weiss
Harvard University

Prentice Hall, Upper Saddle River, New Jersey 07458


12

ANALYZING AND INTERPRETING THE DATA

Unobstructed access to facts can produce unlimited good only if it is matched by the desire and ability to find out what they mean and where they lead. Facts are terrible things if left sprawling and unattended.

—Norman Cousins (1981)¹

[E]ventually, of course, our knowledge depends upon the living relationship between what we see going on and ourselves. If exposure [to knowledge] is essential, still more is the reflection. Insight doesn't happen often on the click of the moment, like a lucky snapshot, but comes in its own time and more slowly and from nowhere but within.

—Eudora Welty (1971)²

Introduction

The aim of analysis is to convert a mass of raw data into a coherent account. Whether the data are quantitative or qualitative, the task is to sort, arrange, and process them and make sense of their configuration. The intent is to produce a reading that accurately represents the raw data and blends them into a meaningful account of events.

This book is no place to learn the techniques of data analysis in either tradition. Many books exist to induct novices into the inner sanctums of analysis and to update "adepts" on the latest analytic developments. The quantitative tradition has 100 years of statistical progress to draw upon. Textbooks are available to explain whether and when to use analysis of variance, analysis of covariance, regression analysis, multi-level analysis, survival analysis, event history analysis, structural equation modeling, and other multi-syllabic analytic techniques. Books on data analysis can be supplemented by excellent journals, university courses, consultants, and conferences.

¹From Norman Cousins, Human Options: An Autobiographical Notebook. Copyright © 1981 by Norman Cousins. Reprinted by permission of W. W. Norton & Company, Inc.

²From Eudora Welty, One Time, One Place (New York: Random House, 1971). Reprinted by permission of Random House, Inc.

The qualitative approach, where texts used to concentrate almost solely on field methods for collecting data, has recently seen an explosion of books on analyzing the data collected (e.g., Maxwell, 1996; Miles & Huberman, 1994; Silverman, 1993; Strauss, 1987; Wolcott, 1994). Here, too, there are courses, journals, conferences, and workshops. Much specific advice is available, but the instruction is not (and probably never will be) as detailed or explicit as for quantitative analysis. Despite the plethora of resources, I want this book, unlike other books on evaluation, to say something about data analysis. It is too important a subject to go unremarked.

Analysis grows out of all the preceding phases of the evaluation. It is heavily dependent on the central questions that the study has chosen to address, the research design, and the nature of the measures and data collected. At the point of analysis, the evaluator brings another set of skills to bear in deriving meaning from the design and the data in order to answer the central questions.

In this chapter, I first set out the central tasks of analysis in evaluation. Next I outline a set of analytic strategies common to both qualitative and quantitative analysis, although I go on to point out key differences in the two approaches. Finally, I consider the use of information beyond that collected in the study.

This is a short chapter. It does not try to tell the reader how to analyze research data, how to use computer programs or perform statistical calculations, or how to make sense of sprawling descriptions. Libraries are full of books that do just that. At the end of the chapter, I list texts on analytic methods. What this chapter tries to do is provide the logic of data analysis in evaluation. Anyone, with or without training in research techniques, ought to be able to follow the logic of the analytic task.

Analytic Tasks in Evaluation

The basic tasks of evaluation are to provide answers to the questions listed in Figure 12-1. An evaluation study can go about answering these kinds of questions in a limited number of ways. Evaluators will follow strategies such as those listed in Figure 12-1. The list presented follows the sequence more common in quantitative than qualitative work. I discuss the qualitative spin on the analytic task in the next section.

No one study will seek to answer all the questions in Figure 12-1, but the figure sets out the range of questions that evaluative analysis can address.

Question 1 calls for descriptions of the program in operation. This is often called process evaluation. The basic task is description. The description can be either quantitative or qualitative in nature, or a mix of both. The quantitative evaluator will rely on counts of various aspects of the program. The qualitative evaluator will develop narratives. Her emphasis is likely to be on the dynamics of the operation and the environment surrounding the program (such as the structure of the program organization or the neighborhood and community environment). As question 1d suggests, she is also apt to be interested in the recipients' interpretations of the program and their participation in it: What does it mean to them in their own terms?



FIGURE 12-1 LOGIC OF ANALYSIS IN EVALUATION

1. What went on in the program over time? Describing.
   a. Actors
   b. Activities and services
   c. Conditions of operation
   d. Participants' interpretation

2. How closely did the program follow its original plan? Comparing.

3. Did recipients improve? Comparing.
   a. Differences from preprogram to postprogram
   b. (If data were collected at several time periods) Rate of change
   c. What did the improvement (or lack of improvement) mean to the recipients?

4. Did recipients do better than nonrecipients? Comparing.
   a. Checking original conditions for comparability
   b. Differences in the two groups preprogram to postprogram
   c. Differences in rates of change

5. Is observed change due to the program? Ruling out rival explanations.

6. What was the worth of the relative improvement of recipients? Cost-benefit or cost-effectiveness analysis.

7. What characteristics are associated with success? Disaggregating.
   a. Characteristics of recipients associated with success
   b. Types of services associated with success
   c. Surrounding conditions associated with success

8. What combinations of actors, services, and conditions are associated with success and failure? Profiling.

9. Through what processes did change take place over time? Modeling.
   a. Comparing events to assumptions of program theory
   b. Modifying program theory to take account of findings

10. What unexpected events and outcomes were observed? Locating unanticipated effects.

11. What are the limits to the findings? To what populations, places, and conditions do conclusions not necessarily apply? Examining deviant cases.

12. What are the implications of these findings? What do they mean in practical terms? Interpreting.

13. What recommendations do the findings imply for modifications in program and policy? Fashioning recommendations.

14. What new policies and programmatic efforts to solve social problems do the findings support? Policy analysis.


As question 2 notes, the implicit question in a process evaluation may be whether the program did what it was supposed to do. Did it continue to operate as its designers intended, or did it shift over time? The analytic strategy is comparison. The descriptive data about the program (e.g., frequency of home visits, content of curriculum) are compared to the blueprints of program designers. The answers can indicate the ways in which the program diverged and perhaps why it diverged from the original plan.

Question 3 asks whether recipients of the program prospered. Here the analytic mode is comparing the recipients' situation before and after exposure to the program. If data have been collected several times prior to the program, at several times during the program, and/or at multiple times after the conclusion of recipients' exposure, all those data are included in the analysis. With intermediate data points, the evaluator can assess or calculate the rate of change.³ For example, did recipients improve rapidly during the early part of their encounter and then level off? Did they improve all across the course of the program so that their immediate postprogram status was much better, only to see the improvement fade out in the succeeding months? Such analysis can be performed either quantitatively or qualitatively.
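To make the arithmetic concrete, here is a minimal sketch in Python of tabulating wave-to-wave change across several measurement points. The scores, recipients, and wave names are all invented for illustration.

```python
import pandas as pd

# Hypothetical scores for four recipients at four measurement waves.
scores = pd.DataFrame({
    "pre":       [52, 61, 47, 58],
    "mid":       [60, 66, 55, 63],
    "post":      [68, 70, 61, 66],
    "follow_up": [64, 69, 57, 65],
})

# Mean change between adjacent waves shows where improvement happened:
# rapid early gains that level off, or fade-out after the program ends.
waves = list(scores.columns)
for earlier, later in zip(waves, waves[1:]):
    gain = (scores[later] - scores[earlier]).mean()
    print(f"{earlier} -> {later}: mean change = {gain:+.1f}")
```

In this toy example the gains are largest early in the program and partly fade at follow-up, exactly the patterns the questions above probe.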

Question 3c asks about the meaning of program participation to its beneficiaries and how they construed their improvement or lack of improvement. For example, a youngster who barely improved his reading score on a standardized test may still be delighted that he can now read a few simple words and write his name. That may mark tremendous achievement to him. This question is the home turf of qualitative methods.

Question 4 asks how recipients compared to nonrecipients. If the groups were randomly assigned, then their status on key variables was probably very similar at the outset. Any difference at the end represents a gain due to the program. If the groups were not selected randomly, it is useful to find out how their initial status compared, particularly on the critical variable that will be used as the outcome indicator (e.g., frequency of illegal drug use, knowledge of geography). Their status on other factors that might be expected to influence outcomes should also be compared (such as the length of time that they have been using drugs, their IQ scores). If the groups differ in significant ways, this differential has to be taken into account in subsequent analyses.

Question 4b calls for comparing the amount of change in the program and nonprogram groups. If the program group went up considerably on the outcome variable while the comparison group did not change at all, the evaluator has a basis for beginning to make judgments about the value of the program. When the comparison group was chosen through randomization, the evidence seems pretty convincing that the program had some degree of success. But for nonrandom comparison groups, many other possible explanations will have to be tested before reaching that conclusion. Even with randomized controls, additional checks may be required. These kinds of concerns will take us to question 5.

Question 4c is the multi-wave analog to question 3b. It asks for comparison of the rates of change of the two groups. Answers will give an indication of the relative trajectories.
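A minimal sketch of the comparisons in questions 4b and 4c, again with invented data: the difference between the groups' mean gains is the simplest estimate of relative improvement, though it does not by itself rule out the rival explanations taken up under question 5.

```python
import pandas as pd

# Hypothetical pre/post scores for program and comparison group members.
df = pd.DataFrame({
    "group": ["program"] * 4 + ["comparison"] * 4,
    "pre":   [50, 55, 48, 53,  51, 54, 49, 52],
    "post":  [63, 66, 58, 64,  53, 55, 50, 54],
})

# Mean gain within each group, then the difference between those gains.
gains = df.assign(gain=df["post"] - df["pre"]).groupby("group")["gain"].mean()
print(gains)
print("difference in mean gains:", gains["program"] - gains["comparison"])
```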

³John Willett points out that the term rate of change seems to imply linear change, whereas much change is nonlinear. I acknowledge the point but retain the term because the alternative—growth trajectory—is less familiar to nontechnical readers. Keep in mind that change need not be linear over time.


Question 5 asks whether the program was the cause of whatever changes were observed. The analytic strategy is ruling out plausible rival explanations. As Campbell (1966) has written:

It is the absence of plausible rival hypotheses that establishes one theory as "correct." In cases where there are rival theories, it is the relative over-all goodness of fit that leads one to be preferred to another. (p. 102)

Could something else plausibly have led to the situation found at follow-up? If so, then it is worthwhile examining whether such conditions or events were operative. For example, were the program group and the comparison group different at the outset and could that have been responsible for differential success? How did the two groups differ, and is it reasonable to think that differences on those characteristics led to different outcomes?

The randomized experiment is the design best suited to ruling out competing hypotheses. Even with random assignment, it is possible that slippage occurred. The seemingly superior performance of program recipients might be due to the fact that the poorer risks dropped out of the experimental group and better risks dropped out of the control group. That would suggest that had it not been for the differential attrition, the program group would not look so good. The basic analytic task is to consider what other influences might have caused the pattern of data observed and then see whether those possibilities can be rejected.

For comparison groups constructed without randomization, many more checks will need to be made. As noted in question 4, the groups might have been dissimilar to start with on the key criterion variables and on variables that affected their ability to do well. If they were located in different communities, they might have been exposed to different influences, which affected their outcomes. In cases such as these, analysis cannot totally compensate for the original inequalities. However, additional analyses can narrow down the list of possible contaminants and help the evaluator reach an estimate of program effects—perhaps not a point estimate but an estimate of the range of effects that are probably attributable to the program. The evaluator needs to think hard about all the ways in which the program and nonprogram groups differed not only at the beginning of the program but all through its course and into its aftermath. Then she has to find data that allow her to test the extent to which each factor was implicated in the outcomes.

I remember reading some time ago about an airline crash that occurred because the runway was too short and the plane could not stop before crashing into the barrier. It turned out that the runway had been built to the appropriate length, but over the ensuing months little things happened that chipped away at its size. A new hangar was built that needed extra room. A small part of the runway was diverted to accommodate parked vehicles. The paving at one end was cracking and was removed until it could be replaced. Each of these decrements was small and by itself did not compromise safety. Cumulatively, however, they led to disaster.

That seems to me a good analogy to analysis of evaluation data, especially in cases without randomized assignment. The evaluator has to find each of the small discrepancies that might vitiate the conclusions about the effects of the program. No one of the small problems is important enough to cancel out the first-order effects, but cumulatively they may undermine the evaluation conclusions—unless taken into account.

Question 6 is the cost-benefit question. It asks for a comparison of the costs of the program with the benefits that accrue to participants and to the larger society. Costs of the program include not only out-of-pocket expenditures for running the program but also opportunity costs—that is, benefits that were forgone because of the program. For example, trainees in a job training program could have been working and earning wages. Because of their enrollment in the program, they lost those wages. That is a program cost.

Program benefits were calculated in answering questions 4 and 5. The analysis there indicated how much and what kind of improvement was the result of the program. For example, an emergency hotline to the police might have increased the number of crimes for which arrests were rapidly made and reduced residents' fears about their safety. The costs of the program can be fairly readily calculated; costs include the personnel staffing the hotline, the phone costs, and the diversion of officers from employment on other police work. What about the benefits? How can the analyst quantify the worth of higher arrest rates and reduced fear? Data are available to estimate the cost of police investigation of a crime, so a decrease in the number and length of those investigations can be costed out. If early arrests are more likely to lead to convictions, then the analyst has to seek data on the benefit to society of placing a criminal behind bars for a given period of time. (The additional costs of prosecuting the cases and of incarceration need to be deducted.) The worth of a decrease in residents' fears is harder to quantify. Analysts have developed methods to quantify the value of such intangible benefits, although as we saw in Chapter 10, much depends on the analyst's judgment.
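Once the hard valuation work is done, the final comparison is simple arithmetic. A sketch for the hotline example, with every dollar figure invented purely to show the computation:

```python
# Annual program costs, including the opportunity cost of officers
# diverted from other police work (all amounts hypothetical).
costs = {
    "hotline_staff":     240_000,
    "phone_service":      18_000,
    "officers_diverted": 150_000,
}

# Monetized benefits; "reduced_fear" is the intangible benefit whose
# dollar value rests on the analyst's judgment.
benefits = {
    "shorter_investigations": 310_000,
    "convictions_net_value":  120_000,  # net of prosecution and incarceration
    "reduced_fear":            60_000,
}

net_benefit = sum(benefits.values()) - sum(costs.values())
ratio = sum(benefits.values()) / sum(costs.values())
print(f"net benefit: ${net_benefit:,}; benefit-cost ratio: {ratio:.2f}")
```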

Question 7 asks which characteristics of the program or the people in it are associated with better or poorer performance. The analytic strategy is disaggregation. The evaluator looks at the data to see if there are apparent relationships between some characteristics of the program and the outcomes. Question 7a asks about participant characteristics (a program input). Did men or women do better? Question 7b asks about program components (program process). Did prison inmates who attended group counseling do better than those who received individual counseling? Question 7c asks about conditions in the environment. Were welfare mothers in low-unemployment states more successful at getting jobs than those in high-unemployment states? The evaluator can look at one variable at a time to see whether any of them proves to be significantly related to outcomes. With regression analysis, she can find out which of a set of variables are associated with good outcomes, when other variables are held constant.

Qualitative evaluators can also examine the relationship between characteristics of clients and the services they received and the characteristics of their progress in the program. Miles and Huberman (1994), for example, provide guidance for constructing matrices that reveal this kind of relationship.

Question 8 asks whether there are clusters of services, conditions, staff, and clients that tend to go along with better outcomes. This question goes beyond the investigation of single variables to examine the effect of combinations of variables. With the use of interaction terms in regression analysis, the analyst can discover the extent to which the effect of one variable on outcomes is increased or decreased depending on the levels of another variable. For example, single-variable analysis may show that better outcomes are obtained when patients receive service from medical specialists and also when they stay in the hospital for shorter stays. If an interaction term is entered into the regression equation, the results may show that disproportionally better results are obtained when service from medical specialists and short stays are combined. Other statistical techniques allow for investigation of complex relationships in data sets of different types. Similar analyses can be done by qualitative evaluators who systematically factor and cluster the narrative data they have. Ragin (1994) presents a procedure to facilitate such analysis.
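A minimal sketch of the interaction analysis just described, using simulated data and the statsmodels formula interface (the variable names are invented, and the simulation builds in a synergy of the kind the text describes):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "specialist": rng.integers(0, 2, n),  # 1 = service from a medical specialist
    "short_stay": rng.integers(0, 2, n),  # 1 = short hospital stay
})
# Simulated outcome: the combination is worth 3 extra points beyond
# the sum of the two main effects.
df["outcome"] = (5 + 2 * df["specialist"] + 1 * df["short_stay"]
                 + 3 * df["specialist"] * df["short_stay"]
                 + rng.normal(0, 2, n))

# In the formula, 'specialist * short_stay' expands to both main effects
# plus their interaction; the specialist:short_stay coefficient is the synergy.
model = smf.ols("outcome ~ specialist * short_stay", data=df).fit()
print(model.params)
```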

Question 9 inquires about the processes through which change takes place. This has traditionally been the domain of qualitative analysis, notably the case study. However, with the development of new methods of structural modeling and growth curve analysis, quantitative methods are catching up to the demands of the task. The evaluator who isn't familiar with these methods can consult a good applied statistician.

One relatively simple way of analyzing the processes of change is through comparison of observed events with the ways in which program actors expected change to occur—that is, comparing events with program theory. The evaluator has laid out expectations for the microsteps by which change will take shape, and she examines evolving events in relationship to that theoretical model. This gives some suggestion about whether the program works in the way that program people anticipated.

If the evaluation data show that most of the assumptions of the program theory were met (e.g., police foot patrol made the police presence more visible to gang members, gang members became more wary of apprehension and reduced the frequency of illegal activities, incidence of crime by juveniles went down), the analysis of why the program worked is fairly well specified. The study may also shed light on other mechanisms at work. For example, another component of program theory may have anticipated that police foot patrol would encourage adults in the neighborhood to take stronger protective action, such as auxiliary neighborhood policing, which would make gang members reduce their criminal activity. Did this chain of events take place? Why questions may also be addressed. For example, did gang delinquency simply shift to other neighborhoods that were not being patrolled? Of course, one study is hardly likely to find answers to all the questions that can be asked. But to the extent that data are available to explain program outcomes, they can be exploited—to give at least provisional guesses, or new hypotheses, about what processes the program sets in motion.

When the program theory with which the evaluation began proves to be a poor rendition of events, the evaluator has more work to do. Why didn't nutrition supplements to pregnant women improve the birth weight of their newborns? The theory was perfectly straightforward. Nutrition supplements would improve the women's health and nutritional status, and their babies would be born at normal weights. If the data show that this outcome was not reached for most women in the program, do data give clues to the reason? Maybe not. Then further inquiry is required. In one study further investigation showed the women did not take the supplements themselves but instead added them to the family's meals—put them in the common pot. Therefore, the supplements had little effect on the woman's health or the birth weight of the child. Unless the quantitative evaluator had foreseen some such eventuality and developed measures to capture such behavior, she might not know why things came out as they did. The qualitative evaluator would have a better chance of finding out, but even she could miss the actions of women back in the privacy of their own kitchens. It might take a follow-on study to track down the mechanisms that led to observed effects.

In using program theory as a guide to analysis, a possible limitation is that the data are not thoroughly accommodating. They may not line up in clear-cut patterns that shout yes, this is the way it is, or no, that's all wrong. The associations are usually going to be partial, and the evaluator will have to exercise careful judgment in figuring out the extent to which they support the theory. One concern that she will have to attend to is whether a plausible theory that has not been tested would fit the data even better. If time allows, she will often want to test several alternative theories to see whether the data fit one of these theories more snugly than the one she started with.

While I have suggested that basing evaluation on programmatic theories of change spins off multiple advantages, it is not a panacea. When the data spraddle in indeterminate ways, they do not allow for clear-cut testing of program theories. The evaluator has to exercise judgment in interpreting how well the program adheres to the posited theories. Statistical tests alone are not going to resolve the issue. Judgment is an indispensable component in the analysis.

Program theory has many benefits for the evaluation enterprise. Even when it does not lead to crisp conclusions about the processes and mechanisms of change, it will provide more information and more useful kinds of knowledge than are currently available. Further, its widespread use will encourage analysts to develop and apply new statistical methods to the analysis of the fit between theory and events. Qualitative analysts can hone their own brand of analytic skills on the exercise.

The tenth question focuses on unanticipated (and usually undesired) effects of the program. This question actually has more to do with data collection than analysis. For qualitative evaluators, it suggests that they remain alert to possible undesired side effects throughout the fieldwork. For quantitative evaluators, it suggests that they devise measures of possible unwanted effects early in the game and collect the requisite data. When time comes for analysis, the data of either stripe are analyzed according to the same procedures used for more desirable outcomes.

Question 11 looks outward to the generalizability of study findings. It asks what the limits of the data are, and an analytic strategy for addressing the question is the examination of deviant cases. Let's take an example. An evaluation of a home-care program for the aged finds that most of the clients were maintained in their own homes and were happier there; although their physical condition continued to deteriorate, the deterioration was not as profound as in aged patients in a nursing home. For this program, the results were positive.

The next question is how generalizable these results are to other programs. There may be several features of the study that suggest caution. For example, in one nursing home evaluation, the study population was all under the age of 80; it is not obvious that results will be the same with older populations. Any limiting condition such as this needs to be specified in the evaluation report. Second, some of the elderly people in the study did not do well. When the evaluator examines these deviant cases, she finds that they tended to be people with serious chronic health problems like diabetes and emphysema. Based on deviant-case analysis, the evaluator may want to limit her conclusions and not claim across-the-board success.

Question 12 asks about the implications of evaluation findings. Putting together everything the study has revealed, the evaluator has to interpret their meaning. She has to decide which among the many data items are important and worth highlighting, and which findings, even if statistically significant, are less noteworthy. She has to combine individual findings into coherent bundles and in some cases disaggregate findings to show the subcategories of which they were composed. For example, Friedlander and Burtless (1995) examined the impact of welfare-to-work programs on participants' earnings. Increase in total earnings was important, but the analysis was more revealing when it isolated the separate effects of a number of components: shorter initial joblessness, faster job finding, increased proportion of people finding employment, earnings on the job, and duration of employment.

Interpretation is the phase of analysis that makes sense out of the varied strands of data. It turns data into a narrative, a story. It identifies strengths and weaknesses and highlights conditions that are associated with both.

The evaluator inevitably comes to the task of interpretation with certain preconceptions. Sometimes the preconceptions take the shape of support for the program, for the general idea behind the program, or for the program's clientele. The evaluator may believe that early childhood education is a very good thing or that job development for adults with disabilities is vitally needed or that adoption of Total Quality Management will make firms better and more productive environments for their workers. Sometimes her preconceptions are more general, taking the shape of perspectives on life, people, research, intervention, justice, and equality. They come from her life experiences and stock of knowledge. At the interpretation stage, they influence the way she proceeds. In the old unlamented positivist world, it was assumed that the facts should speak for themselves and the investigator should be totally neutral. She should do everything she could to expunge her own knowledge and beliefs from the analytic task, to become a cipher while she ciphered. In these postpositivist days, we recognize the futility of such advice; we can't erase the essence of what we are and know from the research process. But we can seek to be conscious of our biases and of the way they affect interpretation, and we can even state our biases up front in evaluation reports.

Much of the knowledge that will influence interpretation is knowledge of prior evaluations of the program or programs of the same type. The evaluator can show the extent to which the current program has been similarly or divergently implemented and the extent to which findings about outcomes support or disagree with earlier findings. The analysis will also seek to answer questions left unanswered in earlier work—in effect, to fill in missing parts of the program puzzle.

Beyond this type of information, the evaluator will bring to bear knowledge from social science research and theory. Concepts and theoretical generalizations about, say, ego defenses or organizational gatekeeping, can inspire creative ways of interpreting evaluation data. Nor can (or should) the analyst ignore the whole realm of "ordinary knowledge" (Lindblom & Cohen, 1979) that tends to define what we all believe simply because we live in the United States around the turn of the millennium. As Cronbach (1982) has written:

Experience used in interpreting a summative finding comes partly from supplementary sources. Some of these sources may speak of the program itself, but observation of other programs, tangentially related social research and theory, and common sense views (community experience) all may carry weight. A study gains authority insofar as it is translated into a story that fits with other experience. (p. 303)

But it is the analyst's job not to accept unquestioningly the everyday assumptions that are part of our workaday lives. She should try to surface them, turn them around in her mind, look at them from a new angle, probe them, and see if the process generates new insights. For example, Schon and Rein (1994) discuss the tenacity of familiar metaphors in structuring how we see the world. They offer the example of fragmented services, a common diagnosis of the problems involved in delivery of social services. This metaphor limits attention to only a subset of the phenomena involved in the problem. It also implicitly contains the outline of the logical solution. If the problem is fragmentation of services, the obvious remedy is integration, making them whole. A commonsense way of looking at the problem thus forestalls other approaches that might have greater chance of success. The evaluator should not settle for facile explanations even when (especially when) they are part of the current orthodoxy.

A dilemma that evaluators may face when they reach the stage of interpretation is what to do when the findings show that the program has had little effect. Although, strictly speaking, this is a political rather than an analytic problem, findings of this sort tend to merge the political with the analytic. Especially when the evaluators are being paid by program people and/or have been working closely with them, there is a strong temptation to soft-pedal poor results and emphasize positive findings. If the aim is to encourage program managers and practitioners to listen to results rather than reject them and to modify activities rather than become defensive, such an approach is understandable. However, when it does violence to the data, it runs the risk of becoming the whitewash that Suchman warned us against in Chapter 2. It is not obvious that downplaying serious flaws will benefit program people in the long run. More helpful will be sensible and practical advice on how to cope with poor results.

Which brings us to the next phase, task 13: developing recommendations for change in program or policy. Evaluators are not always expected to develop the implications of their results into recommendations for action. Sometimes they are asked to be the purveyors of straight evidence, what Sergeant Friday on the old Dragnet program called "just the facts, ma'am," while program and policy people see it as their responsibility to review the evidence in terms of changes needed. Program people understand the program in its historical setting, its management, the political climate, and the options available, and they can best consider what adaptations should and can be made. They can use evaluation findings as one guide to change, if change seems called for, but they have much other knowledge that they can bring to bear.

In other cases, however, the evaluator is expected to offer her judgment of the types of action that should be taken. The evaluator has an intimate understanding of the details of the information in the study, and she has worked long and hard to comprehend it in all its complexity. Program and policy people often want the evaluator to take the next step and recommend the kinds of changes they should make. When she is called upon to do so, the evaluator has to take full advantage of all the evidence. For example, if she has identified conditions in the program that are associated with better outcomes, she has leads to the kind of recommendations that are well supported in the data. If she has examined the fit of the data to the program's theory of action, the data provide suggestions on steps to take next. If she has data from observation and informal interviewing, she will have ideas about the kinds of changes that make sense.

However, some evaluations are black box evaluations. They have collected data only on outcomes and do not know what went on inside the program box, treating it as though it were totally opaque. The studies may have been after-only, before-and-after, before-and-after with comparison groups, time series, or before-and-after with randomized controls, but the focus was on outcomes, with little attention to the operation of the program. When outcomes are less than satisfactory, the evaluator has little sense of the conditions that led to the outcomes. She is in a bind when it comes to recommendations. Clearly something is wrong now, but what would make it better? Some evaluators have specialized in a programmatic area for years—say physical rehabilitation, child abuse prevention, or teaching English as a second language. Out of their years of evaluating such programs and their other substantive knowledge in the field, they have developed a stock of knowledge. They have a base on which they can draw to make recommendations.

However, many evaluators are not topic-area specialists. They evaluate programs of many different kinds and thus do not have deep knowledge about any one subject area. If the study doesn't provide information about factors that influence program effectiveness, they have to make do with common sense in developing recommendations for change. They are sometimes driven to recommending the opposite of whatever the unsuccessful program is currently doing—especially if the report is almost due and they have little time left to think about the kinds of recommendations that would be sensible.

Needless to say, this is not good practice. Even in this bind, the evaluator has a responsibility to think about the recommendations she is making with the same critical faculty that she devotes to current practices. Are the recommended changes likely to achieve desired effects? Is there experience elsewhere that suggests that they will be practical, feasible, and effective? What program theory would they represent? Are they consonant with other conditions in the environment, such as level of budget or staff capabilities? To find out if the changes she is thinking of recommending are likely to work, she can turn to social science theory and research, past evaluations of similar programs, and the judgment of experts in the field. Collecting information about the likely effects of recommendations is worthwhile, even if it involves only consultation with program managers and staff. In effect, she would be doing something approaching an evaluability assessment on the prospective recommendations. If there is enough time, a wise course of action would be to collect some data on the recommended practices in action, but except on the smallest scale, this is not likely to be feasible in a time-bound study.

In all cases, however much data the study provides and however substantively expert the evaluator is, the act of making recommendations requires the evaluator to extrapolate beyond the evidence at hand. She has to think about the future world into which any recommended change will enter. If the federal government is going to give states more authority to set rules, if school districts are going to devolve more authority to schools, if budgets are going to be under continued stress, what do these conditions mean for the program in the future? And what kind of recommendations make sense under the coming conditions? If the evaluator has been foresightful, she will have data on some sites that are similar to the conditions that are likely to prevail, such as a state that has been allowed to allocate federal transportation funds across different modes of transportation or a school district that has decentralized decision making to the schools. She can examine these cases carefully as she develops ideas for recommendations.

No one expects the evaluator to be Nostradamus (and how well did all his predictions work?). But the point is that the evaluator cannot assume that tomorrow's world will be the same as today's. Training programs for displaced workers (displaced by shifts in technology) will be affected by changes in economic activity and the employment rate; court-ordered programs of community service for adjudicated delinquents will be affected by changes in laws, sentencing guidelines, and community sentiment. The possibility—even the probability—of relevant change has to be taken into account in the recommendation-making enterprise.

Evaluators rarely recommend the total eradication of a program. Even if data show that the program is totally ineffective, the usual recourse is to try to patch it up and try again. But policymakers will also welcome insights about alternative strategies, and findings from the evaluation can inform the formulation of new interventions. This is the subject of question 14: When new policies and programs are under consideration, what lessons does this evaluation offer to the venture? Here we are in the domain of policy analysis, but we take a distinctively evaluative slant.

The analysis of future policies and programs usually starts with a problem—such as the inaccessibility of public transportation to handicapped people. The policymaker, bureaucrat, or analyst interested in the problem generates a list of possible solutions, such as wheelchair-accessible elevators in train stations, kneeling buses, and subsidized vans that pick up handicapped customers on call. The analyst has the task of investigating which, if any, of these proposals would be operationally practical, conceptually logical, and empirically supported by past experience. The contribution of evaluation is to supply empirical evidence about the effectiveness of these practices.

The person undertaking analysis of this sort is likely to be a policy analyst rather than an evaluator. But he needs to know a good deal about what evaluation has to say. In analyzing the likely effectiveness of alternatives, he seeks the sum total of evaluation evidence on each topic. Meta-analysis is of particular use. By combining results of dozens or hundreds of evaluations of similar programs, meta-analysis contributes state-of-the-art knowledge to the forecast. The synthesis of evidence is applied to the analysis of future policies, and this extends the reach of evaluation beyond the individual program to the future of programming generally in an area.

Although question 14 takes us beyond the terrain of the individual evaluator, it has been addressed by evaluators. The General Accounting Office's former Program Evaluation and Methodology Division essentially reinvented aspects of policy analysis on a distinctly evaluative base. It published a paper on what it called prospective evaluation, an attempt to formalize a method for extrapolating from existing studies to future times and places (U.S. General Accounting Office, 1989). Because GAO works for the Congress, it is often confronted with requests to examine proposals and bills that would set up new programs, and the paper illustrates the ways in which the staff goes about synthesizing past evaluation findings to judge the likely efficacy of future proposals. An interesting sidelight of the GAO paper is the emphasis it places on surfacing the "theoretical underpinnings of prospective programs" (GAO, 1989, p. 60; see also pp. 30–33). Explicit program theory as well as evaluative evidence help to determine the potential of a proposal for successful operation.

General Strategies of Analysis

The tasks of analysis of quantitative and qualitative data have much in common, but evaluators will go about the tasks in different ways. There are two basic reasons for the difference in strategy: the nature of the data and the analytic intent. First, quantitative and qualitative analysts have different kinds of data in hand. The quantitative evaluator is working with measures that yield numerical values, such as years of schooling or scores on tests of knowledge. Even those data that started out as narrative responses to interviews or notes from observations have been coded into a set of categories to which numerical values are assigned. The qualitative evaluator, on the other hand, is working with sheaves of field notes—narrative accounts of conversations, interviews, observations, fieldworkers' reactions and tentative hunches, school documents, and so on. Therefore, the task that each confronts is different.

Quantitative studies usually have a large number of cases. The large size of the sample helps to ensure that oddities of individual cases cancel each other out, and the data will show the main effects of the intervention. If the sample has been chosen through random sampling, results will generalize to the population from which the sample was drawn. However, quantitative studies often have a limited number of descriptors about each case. A local evaluation can have data on 300 science teachers, and a national study can have data on 2,000, but both have relatively little information about each teacher—perhaps length of time he has been teaching, grade(s) taught, scores on capsule science tests before and after the staff development project, and perhaps one or two more items. Although there is limited information on each participant, the large number of cases makes statistical analysis appropriate.

A qualitative evaluation of the same project would have data on a small subset of the teachers in the staff development project, maybe 8 to 15. If the evaluation team has several members, they might have studied three or four times that many people, but not so many that they could not remember each individual in his own social and historical context. The study would probably have a large amount of data about each one: notes on conversations over time, observations of the person's participation in the staff development project, transcribed interviews, and so on. The information would cover scores of topics, from each person's background and views on teaching to the physical setting of the staff development project, from the actions of the staff developer to the character of the schools to which the science teachers will return. By narrowing down to a relatively small number of cases, qualitative evaluators are able to bring into focus all the details, idiosyncrasies, and complexities in the individual case. However, it is often not apparent how representative these cases are nor how generalizable they are to all science teachers in the program.

The second and more important reason that quantitative and qualitative evaluators undertake different modes of analysis is that their intentions are different. Qualitative evaluators aim to look at the program holistically, seeing each aspect in the context of the whole situation. They are concerned about the influence of the social context. They try to understand prior history as it influences current events. Their emphasis is dynamic, trying to gain a sense of development, movement, and change. In essence, they aim to provide a video (a few years ago we would have said a moving picture) rather than a series of snapshots at time 1, time 2, and time 3. In addition, they are concerned with the perspectives of those involved with the program and the meaning of the experience to them. Rather than impose the evaluator's assumptions and definitions, they seek to elicit the views of participants.

Quantitative analysts traffic largely in numbers. Much of the work that they have to do at this stage they have already done through the development and/or selection of measures and data collection. They have defined the variables of interest and the ways they are measured, and now is the time to focus on locating significant relationships among them. Through statistical techniques, they identify associations among variables and the likelihood that associations are real and not mere chance fluctuations. Their aim is to model the system of cause and effect.

To put the distinction in a more general way, quantitative evaluators tend to focus on whether and to what extent change in x causes change in y. Qualitative evaluators tend to be concerned with the process that connects x and y.

A third difference can be mentioned, the time ordering of analytic tasks. Quantitative evaluators generally wait until the data are in before they begin analyzing them. They need not wait until all the data are collected, but they usually wait until all the data from one wave are collected, coded, checked, entered into the computer, and cleaned. Qualitative evaluators, on the other hand, generally begin their analysis early in the data collection phase. As they talk to people and watch what goes on, they begin to develop hunches about what is happening and the factors in the situation that are implicated in events. They can use subsequent data collection to test their early hypotheses. Analysis is an ongoing process.

Despite the differences in the two traditions, they have common features. Quantitative analysts don't throw numbers unthinkingly into a computer and swear by the results that come out the other end. Qualitative analysts don't read the entrails of animals to figure out what the data mean. They share commitments to systematic and honest analysis. They also confront common tasks, as outlined in Figure 12-1.

Furthermore, only a limited number of strategies are available for data reduction and interpretation. Listed below is a set of these basic strategies. Most analysts make use of them, whether their proclivities are qualitative or quantitative.

Basic Analytic Strategies

Describing
Counting
Factoring (i.e., dividing into constituent parts)
Clustering
Comparing
Finding commonalities
Examining deviant cases
Finding covariation
Ruling out rival explanations
Modeling
Telling the story

Describing

All evaluators use description to evoke the nature of the program. They describe the program, its setting, staff, structure, sponsorship, and activities. The qualitative analyst is likely to provide narrative chronicles; the quantitative evaluator will provide summary statistics on a series of measures. But in order to give a fair accounting, they all have to engage in description.

Counting

Counting is a part of description and of most other analytic strategies. Counting helps to show what is typical and what is aberrant, what is part of a cluster and what is unique. Even the hardiest qualitative analyst will talk about "more" and "a few," and in order to be sure that she is right, judicious counting is in order. Counts also help to show the extent to which program participants are representative of the larger population to which results might be relevant.

Factoring

I use factoring in the algebraic sense of breaking down aggregates into constituent parts. Quantitative evaluators have already factored aggregates into components in the process of measurement. They have constructed measures to capture different facets of inputs, process, and outcomes. Qualitative evaluators prefer not to abstract people and events into parts but to deal with them holistically. Their emphasis is on gestalt and the natural context. Nevertheless, the human mind, for all its grandeur, is too limited to hold dozens of thoughts simultaneously. Even devout qualitative evaluators break out separate aspects of experience and deal with them in sequence—while trying to keep the larger context in mind.

Coding is an example of factoring. The coding of narrative material into categories is a common practice in evaluation. It slices up a narrative into a set of categories that have common substantive content. The procedure allows the analyst to examine the content and frequency of each category, and the covariation among categories.

Clustering

Clustering is the procedure of putting like things together. It identifies characteristics or processes that seem to group, aggregate, or sidle along together. Factor analysis, principal components analysis, cluster analysis, and multi-dimensional scaling are among the statistical techniques for accomplishing this kind of aggregation. Qualitative evaluators tend to cluster cases through iterative reading, grouping, coding, or data displays (Miles & Huberman, 1994). Glaser and Strauss (1967) suggest a systematic procedure called analytic induction for developing categories and models. (I describe it in the section Examining Deviant Cases.) Ragin (1994) describes an explicit way to array qualitative data along a set of dimensions to figure out which cases belong together, what he calls the comparative method. Many qualitative analysts who deal with a small sample of cases rely on their immersion in the materials and repeated rereading of their data to arrive at appropriate clusters.
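As a sketch of the statistical side of this aggregation, here is a k-means run on invented site-level measures (scikit-learn; the sites, variables, and values are all hypothetical):

```python
import numpy as np
from sklearn.cluster import KMeans

# Each row is a program site; columns are standardized measures of
# staff experience, service intensity, and client engagement.
sites = np.array([
    [0.9, 0.8, 0.7],
    [0.8, 0.9, 0.8],
    [0.2, 0.1, 0.3],
    [0.1, 0.2, 0.2],
    [0.7, 0.9, 0.6],
])

# Ask for two clusters and see which sites "sidle along together."
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(sites)
print(labels)
```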

A concomitant task is deciding what each cluster is a case of. As the analyst develops clusters, she uses intuitive or explicit concepts to group cases together; each case exemplifies the concept that unites them into a category. But as she goes along, she refines and redefines the concepts. Some cases will not fit her original definition of the concept and the concept must be altered or narrowed down, or the case must be dropped from the category and another category found for it. Some concepts will turn out to be useless for making sense of the data. New classifications will be needed. Conceptual clarification is an ongoing task.

Comparing

Comparison is the heart of the evaluation enterprise. The analyst compares the situation before, during, and after the program. She compares program participants with those who did not receive the program. She compares participants with one another to see whether there is a great deal of variability within the program group. Sometimes there is as much variability among the program participants as there is between them and the members of the comparison or control group. In that case, the difference between program and control groups may not be significant, either statistically or practically. Analysis of variance and regression analysis are statistical techniques for finding this out.
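A minimal sketch of that check with invented scores: a one-way analysis of variance asks whether the between-group difference is large relative to the variability within each group.

```python
from scipy import stats

program    = [64, 70, 58, 66, 61, 69]
comparison = [57, 59, 55, 63, 52, 60]

# A small F statistic (large p) means within-group variability swamps
# the program-versus-comparison difference.
f_stat, p_value = stats.f_oneway(program, comparison)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
```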

Qualitative evaluators, who usually analyze a small number of cases, can compare individuals directly. When they have comparison groups, they, too, have to guard against the possibility that differences within each group are as large as differences between the groups. Rather than make a series of comparisons on single characteristics, they craft multi-dimensional portraits that show similarities and differences.


Finding Commonalities

A basic strategy of the analytic task is to locate common trends and main effects. What are the common experiences that groups of program participants have? What are the common elements that characterize successful participants? Much of the statistical repertoire is geared toward identifying such similarities.

Qualitative analysts who face the same task are often advised to find patterns in the data. Without more specific guidance, I've always thought that that advice is as useful as Michelangelo's advice on how to carve a marble statue: Chip away all the marble that is not part of the figure. One of the specific ways of finding patterns is to look at the frequency of clusters of events, behaviors, processes, structures, or meanings, and find the common elements. Brilliant insights arise from finding commonalities among phenomena that were originally viewed as totally unconnected.

Examining Deviant Cases

In statistical analysis, the deviant case is far from the body of the distribution. Analysts may sometimes drop such cases to concentrate on the central tendency and simplify the story. But at some point it is often worthwhile to take a hard look at them and what makes them so unusual. Qualitative analysts, too, may ignore aberrant cases, but important information may lurk at the extreme ends of a distribution. Qualitative analysts can use systematic techniques to incorporate the information into their analysis. One such technique is analytic induction (Bogden & Biklen, 1992; Glaser & Strauss, 1967), which requires analysts to pay careful attention to evidence that challenges whatever constructs and models they are developing. With analytic induction, they keep modifying their concepts and introducing qualifications and limits to their generalizations until the conclusions cover all the data. Cases that are different from the mainstream, outliers in statistical terms, can also give important clues to unanticipated phenomena and may give rise to new insights into what is happening and why.
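A sketch of the statistical first step, with invented outcome scores: flag the cases far from the body of the distribution, then examine them closely rather than discard them.

```python
import numpy as np

outcomes = np.array([61, 64, 59, 62, 66, 63, 31, 60, 65, 94])
z = (outcomes - outcomes.mean()) / outcomes.std()

# List the outliers for a closer (often qualitative) look instead of
# quietly dropping them from the analysis.
for i, (score, dev) in enumerate(zip(outcomes, z)):
    if abs(dev) > 1.5:
        print(f"case {i}: score {score} (z = {dev:+.1f}) -- examine closely")
```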

Finding Covariation

It is one thing to find out whether the values of one phenomenon are related to the values of another. This is the strategy that I have called comparison. Another question is whether changes in a phenomenon are related to changes in other phenomena. Does increasing the number of hours of classroom training for disturbed adolescents over the course of the school year lead to better adjustment? Does improving the social relationships among staff in the program lead to improving relationships among participants? Again, statistics has tried and true procedures for making estimates of covariation. Qualitative evaluation has to handcraft such techniques on a study-by-study basis.
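On the quantitative side, a minimal sketch for the classroom-training question, with hypothetical paired change scores: the correlation between the two sets of changes estimates covariation, though covariation alone is not proof of causation.

```python
import numpy as np
from scipy import stats

# Change in training hours and change in adjustment rating over the
# school year, one pair per adolescent (values invented).
hours_change      = np.array([10, 25, 5, 40, 15, 30, 0, 20])
adjustment_change = np.array([ 2,  5, 1,  7,  2,  6, 0,  4])

r, p = stats.pearsonr(hours_change, adjustment_change)
print(f"r = {r:.2f}, p = {p:.3f}")
```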

Ruling out Rival Explanations

Both quantitative and qualitative evaluators want to be able to ascribe results to the program intervention. They need to separate those changes that the program brings about from changes caused by extraneous circumstances. Even the so-called Cadillac of designs, the randomized experiment, may leave loopholes and uncertainties about causation. The analyst must rule out the possibility that something other than the program was responsible for observed effects.

To go about this step, the analyst identifies plausible alternative explanations. Were successful outcomes brought about by the new math curriculum or by new computing facilities in the schools or the school restructuring that empowered teachers and raised their morale and enthusiasm? Were unsuccessful outcomes the result of the curriculum or were they due to the poor quality of curriculum implementation, insufficient class time devoted to it, or competing demands on students' attention? Whatever factors might plausibly be held responsible for observed results—noncomparability of program and comparison groups, outside events, the pattern of dropouts from the program, inappropriate measures, changes in the conduct of the program over time—may need to be examined.

If the evaluator locates a potential explanation that seems reasonable, she has several options. She can collect available data to see whether the rival explanation can be ruled out; she can collect new data to try to disprove it; she can mount a small substudy to investigate its cogency. If the data do not contradict it, she should acknowledge it as a possible explanation for the results and perhaps tag it for further investigation.

Modeling

Modeling is a procedure for putting all relevant information together to create explanations of important outcomes. Here is where all the prior steps come to fruition. The analyst characterizes the nature of program effectiveness. She identifies the elements that were implicated in program effectiveness and assesses the relative weight of their contribution. She notes which elements worked together synergistically to enhance success and which elements were incompatible with each other. To the extent possible, she explains how the program achieved the observed results. She notes which program components and theories were supported in the analysis and which were not. At the end she has a rounded account of what happened and, at least to the limit of her data, how it happened.

Telling the Story

Come the end of the analytic phase, all evaluators have to communicate their findings to readers. Communication requires putting the data into understandable form. Nobody has a monopoly on techniques for making evaluation results clear, putting them in context, pointing out the limits of their applicability, and interpreting them in light of likely developments in program and policy. Writing is where the rubber hits the road. Good writing is not only a matter of style and grace; it is actually a test of the extent to which the writer fully understands what she is talking about. If she can't communicate it well, perhaps she hasn't fully grasped it herself.

An Example of Program Theory as a Guide to Analysis

Program theory can provide a structure for analysis of evaluation data. The analysis can follow the series of steps that have been posited as the course that leads from the intervention through to the achievement of goals.


Let's take an example. A group at Stanford evaluated an education program for asthma patients to enable them to gain increased control over their symptoms and reduce their suffering from asthma (Wilson et al., 1993). The underlying theory was that education would increase their understanding of their condition and therefore their confidence that they could control it, which in turn would lead to their adherence to a treatment regimen, and thus to better control of symptoms. Because the theory guided the evaluation design, the evaluators had measures on each of the steps posited in the theory. The study included 323 adult patients with moderate to severe asthma. There were two treatment groups, one receiving group education and one receiving individualized education, and a control group.

The evaluators monitored participation and found that 88% of the patients assigned to group education attended all four sessions. Observation of the sessions indicated that "educators adhered closely to the program outlines and script, and used the handouts and homework assignments as intended" (p. 569). Notice the importance of this component of the evaluation. Without the monitoring of content, there would be no way of knowing what actually went on in the group education sessions, that is, whether the program in action was the same as the program design.

The evaluators compared the treatment and control groups on before-and-after measures to see if there were any changes. If not, the thing to explain would be where the theory stalled out and why the program failed. If there were changes, and there were, then the analysis would concentrate on how much change there was, how the program worked, who benefited the most, and which components of the treatment were most effective.
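To fix the logic of that comparison, here is a generic sketch that computes each patient's change score and tests whether changes differ between treatment and control. It is not a reconstruction of the Wilson et al. analysis; all numbers are invented, and the scipy library is assumed.

    # Compare before-and-after change between treatment and control
    # groups. Illustrative only; the data are not from the study.
    import numpy as np
    from scipy.stats import ttest_ind

    pre_treat = np.array([40, 35, 50, 45, 38])
    post_treat = np.array([55, 52, 60, 58, 50])
    pre_ctrl = np.array([42, 37, 48, 44, 40])
    post_ctrl = np.array([44, 40, 50, 46, 41])

    change_treat = post_treat - pre_treat
    change_ctrl = post_ctrl - pre_ctrl

    t_stat, p_value = ttest_ind(change_treat, change_ctrl)
    print(f"Mean change: treatment {change_treat.mean():.1f}, "
          f"control {change_ctrl.mean():.1f}; t = {t_stat:.2f}, p = {p_value:.3f}")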

Following the training, patients who received education showed improvements in their test scores on understanding and knowledge about asthma.4 They also adhered to the regimen of care, which included both cleaning their living quarters and self-care. They showed significantly greater improvement in their bedroom environment (absence of allergenic furnishings, better dust control and cleaning practices) and much better technique in use of a metered-dose inhaler. However, the program group did not significantly improve in getting rid of pets to whom they were allergic or (for the 10% who smoked) giving up smoking, but the number of patients in these categories was small.

Final outcomes for the group-education group were positive. They were less likely to report being bothered by asthma than they had been originally and less likely than the controls; they had fewer days with symptoms than controls; physicians were more likely to judge them improved; their visits to clinics for acute care declined significantly.

However, the article reports the outcomes for the treatment and control groups as a whole and does not link outcomes to earlier stages in the program, such as attendance at group education sessions or adherence to regimen. It is likely that sample sizes would get too small to find significant differences if the group were partitioned along the lines of the theory. Still, it would be informative to know whether those who did everything right (i.e., attended sessions, cleaned up their home environments, and used the inhaler properly) turned out to have better outcomes than others, or whether there were some features that didn't link to outcomes. One statement in the article notes that improvement in inhaler technique was associated with reduction in patients' bother with symptoms of asthma, but improvement in inhaler technique "could not account for all of the observed improvement in symptoms" (p. 575). Thus, most of the how and why questions about the program were answered through statistical analysis because the evaluation included appropriate measures to track the unfolding of outcomes.

4 The data about test scores on knowledge about asthma do not appear in the article mentioned. I talked to Sandra Wilson in 1994 and obtained this information.

One rival hypothesis that the evaluators explored was that physicians might have changed patients' medications over the course of the program, which could account for some of the positive results. However, they found that people in the education and control groups did not differ at the outset or at follow-up in intensity of medication (typically four or five daily drugs), and there were no significant changes in the proportions receiving specific types of medications. So medication was not the explanation for change in condition.

A finding about the superiority of group education over individual education led to further exploration of why the group program was more successful. Evaluators conducted interviews and learned that patients valued and benefited from the opportunity for peer support and interaction, and educators believed that the group setting encouraged patients to express fears and concerns, which the educator could then address. Also, educators were typically more comfortable in group than individual teaching. So here, the why questions were answered through reports from participants.

This report is a fine example of analysis that follows the logical progression of expected effects. It shows that qualitative evaluation doesn't have a monopoly on insight into the processes of program operation. The quantitative evaluator can do just as well if she has savvy about the program, good program theory, measures of inputs and outcomes, and appropriate measures of intervening processes along the lines of the program theory, plus good statistical know-how and the opportunity to add on further inquiry when results are puzzling.

Books on Analytic Methods

It is not possible to shoehorn a description of analytic methods into this book without terminal oversimplification. Many good books already provide the help in this arena that evaluators can use with profit. Here are a few titles.

For Quantitative Evaluation

Linear Models (Regression/ANOVA)

Afifi, A. A., & V. Clark. (1990). Computer-aided multivariate analysis (2nd ed.). New York: Chapman and Hall.

Lunneborg, C. (1994). Modeling experimental and observational data. Boston: PWS-Kent.

Neter, J., M. H. Kutner, C. J. Nachtsheim, & W. Wasserman. (1996). Applied linear statistical models (4th ed.). Homewood, IL: Irwin.


Multi-Level Modeling

Bryk, A., & S. Raudenbush. (1992). Hierarchical linear modeling. Beverly Hills, CA: Sage.

Goldstein, H. (1995). Multilevel statistical models (2nd ed.). London: Edward Arnold.

Structural Equation Modeling

Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.

Loehlin, J. C. (1992). Latent variable models: An introduction to factor, path, and structural analysis. Hillsdale, NJ: Erlbaum.

Event History Analysis

Allison, P. D. (1984). Event history analysis. Quantitative Applications in the Social Sciences, no. 46. Beverly Hills, CA: Sage.

Blossfeld, H. P., A. Hamerle, & K. U. Mayer. (1989). Event history analysis. Hillsdale, NJ: Erlbaum.

Yamaguchi, K. (1991). Event history analysis. Beverly Hills, CA: Sage.

Categorical Data Analysis, Log-linear Modeling, Logistic Regression

Agresti, A. (1990). Categorical data analysis. New York: Wiley.

Hosmer, D. W., & S. Lemeshow. (1989). Applied logistic regression. New York: Wiley.

Meta-Analysis

Cooper, H., & L. V. Hedges. (1994). The handbook of research synthesis. New York: Russell Sage Foundation.

Cook, T. D., H. Cooper, D. S. Cordray, H. Hartmann, L. V. Hedges, R. J. Light, T. A. Louis, & F. Mosteller. (1992). Meta-analysis for explanation: A casebook. New York: Russell Sage Foundation.

For Qualitative Evaluation

Strauss, Anselm L. (1987). Qualitative analysis for social scientists. Cambridge: Cambridge University Press.

Silverman, David. (1993). Interpreting qualitative data: Methods for analyzing talk, text, and interaction. London: Sage.

Wolcott, Harry F. (1994). Transforming qualitative data: Description, analysis, and interpretation. Thousand Oaks, CA: Sage.

Miles, Matthew B., & A. Michael Huberman. (1994). Qualitative data analysis: An expanded sourcebook (2nd ed.). Thousand Oaks, CA: Sage.

Maxwell, Joseph A. (1996). Qualitative research design: An interactive approach. Thousand Oaks, CA: Sage.


Ethical Issues

Two new ethical issues can arise somewhere around here. One has to do with ownership of the data. The evaluator often thinks that the raw data and analyses belong to her. The study sponsor may believe that his agency owns the data. The disagreement doesn't become contentious until one or the other wishes to release the data to other investigators for possible reanalysis. If a federal agency is the sponsor of the study, it is almost always willing to release data. In some large studies, the government agency will even fund the preparation of public use disks and documentation to make further analysis of the raw data easy. Other sponsors, such as foundations, national associations, and service agencies, can be less disposed to release. The best treatment for this problem is prevention. In the early phases of the study, perhaps even before the contract is signed, the evaluator and sponsor should draw up guidelines about ownership and release (and publication) of the data.

The other problem is safeguarding the confidentiality of the raw data. Even with all the protections that the evaluator has introduced (removal of names and other identifiers), computerized databases are becoming less secure. New technologies are making it easy to access databases, which can lead to the identification of sites and individuals. When databases reach the World Wide Web, the possibilities for data misuse will multiply. The problems are new, and creative people are working on solutions.

Summary

This chapter has discussed analytic strategies for making sense of data. Not seeking to duplicate textbooks devoted to methods of quantitative and qualitative analysis, it tries to reveal the logic of analysis. It sets out a range of questions that an evaluation may be called upon to address and suggests the basic technique that analysis of each question requires.

The basic questions that can be answered in analysis are: (1) what happened in the program; (2) how faithfully did the program adhere to its original plans; (3) did recipients improve; (4) did recipients fare better than nonrecipients of program service; (5) was observed change due to the program; (6) did benefits outweigh costs; (7) what characteristics of persons, services, and context were associated with program success; (8) what combinations or bundles of characteristics were associated with success; (9) through what mechanisms did success take place; (10) what were the unanticipated effects; (11) what limits are there to the applicability of the findings; (12) what are their implications for future programming; (13) what recommendations can be based on the findings; (14) what new policies and programs do the findings support.

In setting out the logic of analysis, I call attention to the commonalities between quantitative and qualitative approaches to evaluation. Both approaches make use of such basic strategies as describing, comparing, disaggregating, combining and clustering, looking at covariation, examining deviant cases, ruling out rival explanations, modeling, and interpreting and telling the story.

The evaluator's analytic priorities are determined by the purpose of the study. The purpose of the study determines the questions asked, the design used, and the analysis conducted. Take a project that offers clean needles to drug users in order to prevent their contracting AIDS. If the issue in contention is whether drug users will come to the needle exchange project and accept needles from a city agency, then the analysis concentrates on counting and describing those who come to the project and comparing those who attend with the number and characteristics of drug users in the community (from external data sources). If the burning issue is whether rates of new HIV-positive cases go down, then the analysis compares the incidence of cases before and after the initiation of the project.
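For the second version of the needle exchange question, the core computation is a comparison of incidence before and after the project began. A minimal sketch, with invented counts and assuming the statsmodels library:

    # Compare new HIV-positive cases per population at risk before and
    # after the project began. Counts are invented for illustration.
    from statsmodels.stats.proportion import proportions_ztest

    new_cases = [120, 85]       # before, after initiation of the project
    at_risk = [10000, 10000]    # estimated drug users in the community

    z_stat, p_value = proportions_ztest(new_cases, at_risk)
    print(f"z = {z_stat:.2f}, p = {p_value:.3f}")

A before-and-after contrast of this kind cannot by itself rule out other influences on incidence; the earlier cautions about rival explanations apply with full force.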

The chapter also suggests that the evaluator make use of data from sources outside the project to supplement the data collected explicitly for evaluation. Such sources can include other evaluations, social science research, city records or agency files, and even new investigations to follow up anomalies or puzzles that the evaluation reveals. By the end of the analysis, the evaluator should have answers to the questions with which she began.

The chapter concludes with the example of an analysis of a program to educate asthma patients in techniques of self-care.
