Evaluability Assessments and Choice of Evaluation Methods
Richard Longhurst, IDS. Discussant: Sarah Mistry, BOND
Centre for Development Impact Seminar, 19th February 2015


Page 1

Evaluability Assessments and Choice of Evaluation Methods

Richard Longhurst, IDS
Discussant: Sarah Mistry, BOND

Centre for Development Impact Seminar, 19th February 2015

Page 2

Introduction and some health warnings

• Some acknowledgements and thanks
• How this work came about: multilateral agency experience as well as some review of the literature
• Evaluability assessments (EAs) are not new; they go back 25 years
• Will try to avoid getting bound up in the technical aspects; some of this will seem common sense, but what matters is trying to make explicit the basis on which decisions are made, and how they relate to the culture of the organisation
• It is important to make judgements about the choice of evaluation methods (as this is a CDI event) and what drives those choices. The EA literature is beginning to enter the debate on choice of methods
• In the scope of this seminar, will not be covering every evaluation method

Page 3

Context of this work with the International Programme on the Elimination of Child Labour (ILO-IPEC)

• Large technical cooperation programme (since 1992), largely funded by the US Dept. of Labor
• Causes of child labour are multi-faceted; approaches to eliminating it are equally various
• Main programme tool is the Programme of Support to the national Time-Bound Programme (TBP) to reduce the worst forms of child labour
• The TBP involved 'upstream' enabling-environment support and 'downstream' action to reduce child labour, therefore a mix of interventions
• Also project and global interventions: at its peak IPEC was carrying out 25 evaluations per year
• See: Perrin and Wichmand (2011), 'Evaluating Complex Strategic Interventions: The Challenge of Child Labour', in Forss, Marra and Schwartz (eds), Transaction Publishers

Page 4

Context: IPEC Evaluation approaches and Information Sources

• National Household Surveys
• Baseline Surveys
• Rapid Assessment Surveys
• Child Labour Monitoring Systems and programme monitoring
• Tracking and Tracer studies
• One-on-one interviews; Focus groups
• Document Analysis, Observation, Case studies
• Impact and outcome evaluations, expanded final evaluations
• Success case method and most significant change
• Use of SPIF: strategic planning and impact framework

Page 5

Context: My baseline at Commonwealth Secretariat (1995-2002)

• Starting up an expanded evaluation function
• A conservative, diplomatically based organisation
• An organisation with many small (<£50K) projects
• About 4-5 project evaluations, plus one strategic review of the political function
• Evaluation worked with the planning function and reported direct to the CEO, with oversight from the GB
• Many projects were hard to evaluate because of their design
• Evaluability regarded as achieved through adherence to the 2-year strategic plan

Page 6

Current Use of EAs

• Use of EAs is growing:
• After their popularity in the US in the 1980s, EA guidance has been developed by ILO, CDA, IDRC, EBRD and UNODC, and more recently by DFID, AusAID, UNFPA, WFP, IADB, UNIFEM and HELP (a German NGO)
• Encouraged by the International Financial Institutions (IFIs)
• Over half of EAs were for individual projects (the balance were country strategies, strategic plans, work plans and partnerships)

Page 7

Some definitions of EA from multilaterals

• OECD-DAC: ‘the feasibility of an evaluation is assessed … it should be determined whether or not the development intervention is adequately defined and its results verifiable, and if evaluation is the best way to answer questions posed by policy makers or stakeholders’. (broad)

• Evaluation Cooperation Group of the IFIs: ‘The extent to which the value generated or the expected results of a project are verifiable in a reliable and credible fashion’ (narrow but useful)

• World Bank: ‘A brief preliminary study undertaken to determine whether an evaluation would be useful and feasible …. It may also define the purpose of the evaluation and methods for conducting it’. (says something about methods)

Page 8

Process for EAs (i)

• Common steps include (Davies):
– Identification of project boundaries
– Identification of resources available for the EA
– Review of documentation
– Engage with stakeholders, then feedback findings
– Recommendations to cover: project logic and design, M&E systems, evaluation questions of concern to stakeholders, and possible evaluation designs

Page 9

Process for EAs (ii) – Incorporating approaches for methods

• Mapping and analysis of existing information
• Developing the theory of change to identify evaluation questions, noting linkages to changes attributable to the intervention
• Setting out priorities, key assumptions and time frames
• Choosing appropriate methods and tools
• Ensuring resources are available for implementation
• Outline reporting and communication of the results of the evaluation

Page 10

Issues for an EA

• Review of the guidance documents of international agencies suggests EAs should address three broad issues:
– Programme design
– Availability of information
– Institutional context (including breadth of stakeholders)

Page 11

EA Tools (i)

• Checklists are normally used. The ILO checklist covers five main areas:
– Internal logic and assumptions
– Quality of indicators, baselines, targets and milestones
– Means of verification, measurement and methodologies
– Human and financial resources, and
– Partners’ participation and use of information
(ILO uses a rating system for this; a rough illustrative sketch is given below.)
• Don’t knock checklists: there is always a theory of change embodied in them
• An independent consultant is usually employed
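A minimal sketch of such a rated checklist, in Python, is shown here. It is a hypothetical illustration, not the ILO instrument: it assumes a 1-4 rating for each of the five areas and simply averages them into an overall evaluability score, flagging weaker areas for the EA's recommendations.

# A hypothetical, simplified sketch of a rated evaluability checklist (not the actual ILO tool).
# Assumes each of the five areas is scored from 1 (weak) to 4 (strong) by the assessor.

CHECKLIST_AREAS = [
    "Internal logic and assumptions",
    "Indicators, baselines, targets and milestones",
    "Means of verification, measurement and methodologies",
    "Human and financial resources",
    "Partners' participation and use of information",
]

def evaluability_score(ratings):
    """Average the 1-4 ratings across all checklist areas."""
    missing = [area for area in CHECKLIST_AREAS if area not in ratings]
    if missing:
        raise ValueError("Missing ratings for: " + ", ".join(missing))
    return sum(ratings[area] for area in CHECKLIST_AREAS) / len(CHECKLIST_AREAS)

# Example use: areas rated below 3 are flagged for the EA's recommendations.
ratings = {
    "Internal logic and assumptions": 2,
    "Indicators, baselines, targets and milestones": 3,
    "Means of verification, measurement and methodologies": 2,
    "Human and financial resources": 4,
    "Partners' participation and use of information": 3,
}
print("Overall evaluability score: {:.1f}".format(evaluability_score(ratings)))
print("Areas needing attention:", [a for a, r in ratings.items() if r < 3])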

Page 12

EA tools (ii) to lead to choice of methods

• EA can be the focus for a modified design workshop that brings together staff and participants involved in all stages of the intervention (e.g. use of SPIF)

• Helps develop a stronger theory of change
• Can strengthen monitoring and identify needs for other information
• Can defuse suspicions about evaluations
• Can be very useful when a Phase I has been completed and a Phase II has been proposed, building on an evaluation
• Allows ‘lessons learned’ from Phase I to be properly addressed

Page 13

Experience from using EAs (i)

• Generally EAs have been a good thing:
– Improved usefulness and quality of evaluations: an advance on the evaluator arriving at the end of the project and finding no means to evaluate
– Early EAs were dependent on logic models and linearity; now there are some signs they are being broadened
– An opportunity for early engagement with stakeholders, i.e. more participation
– Some evidence of improvements in project outcomes as well as design
– More resources applied up front help to address later problems

Page 14

Experience from using EAs (ii)

• Some of the difficulties:
– Clash of work cultures between design and evaluation professionals, working to different incentives and time scales
– Issues of how far the evaluation ‘tail’ wags the design ‘dog’, leading to some ethical issues
– Have to be prepared for ‘cats’ put among ‘pigeons’ if there are significant gaps in design; does it mean the intervention is stopped?
– Evaluators must not get too seduced by what EAs can achieve, especially if the original intervention design is weak
– EAs will not work everywhere and must always be light touch; there will be a budget constraint
– Other techniques may be more appropriate (e.g. DFID approach papers)

Page 15

Linking to Evaluation Methods

• Using the starting point of Stern et al (2012), Broadening the Range of Designs and Methods for Impact Evaluations, DFID Working Paper No. 38:
– Selection of appropriate evaluation designs has to satisfy three constraints or demands:
– Evaluation questions
– Programme attributes
– Available evaluation designs

Page 16

Some criteria for choice of methods based on the results of the EAs (criteria will interact)

• Purpose of the evaluation
• Level of credibility required: what sort of decisions will be made on the basis of the evaluation?
• What does the agency know already, i.e. the nature of existing information and evidence
• Nature of intervention and level of complexity
• The volume of resources and nature of capacity available to carry out the evaluation
• Governance structure of the implementing agency and relationship with partners

Page 17

Purpose of the evaluation

• This is the overarching framing question (so EA can make this clear)

• Relates to the position of the intervention in the agency’s planning structure and how evaluation has been initiated

• Any special role for stakeholders?
• Is the evaluation being implemented for accountability, learning or ownership purposes, or for wider process objectives?
• Nature of topic: project, country, thematic, global, programme
• To set up an extension of an intervention

Page 18

Level of credibility of evaluation results and decisions to be made

• How does the decision maker need to be convinced? Independence of the process?
• How will the evaluation be used? What sort of evaluation information convinces policy makers?
• What is the nature of the linkages between results and the intervention:
– Attribution
– Contribution
– Plausible attribution
• If attribution is required, with a need for a ‘yes/no, it works or not’ decision, then an impact evaluation has to be chosen
• If contribution is required, then contribution analysis can be used
• If ‘plausible attribution’ is required, then an outcome summative method can be used (a rough sketch of this mapping follows below)
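A rough sketch, in Python, of the mapping just described. The claim labels and method families are simplified illustrations of the logic on this slide, not a prescriptive selection rule; in practice the other criteria (purpose, resources, complexity) interact with this choice.

# An illustrative sketch only: the claim labels and method families below simplify
# the slide's logic and are not a prescriptive selection algorithm.

METHOD_BY_CLAIM = {
    "attribution": "impact evaluation (experimental or quasi-experimental design)",
    "contribution": "contribution analysis (theory-based evaluation)",
    "plausible attribution": "outcome summative evaluation",
}

def suggest_method(causal_claim):
    """Return the method family suggested for the kind of causal claim the decision needs."""
    key = causal_claim.strip().lower()
    if key not in METHOD_BY_CLAIM:
        raise ValueError("Unknown claim %r; expected one of %s"
                         % (causal_claim, sorted(METHOD_BY_CLAIM)))
    return METHOD_BY_CLAIM[key]

# Example: a funder needs a yes/no 'does it work?' answer, i.e. attribution.
print(suggest_method("attribution"))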

Page 19

Other common observations on method choice (relates to the criterion of credibility)

• Experimental: demonstrates counterfactual, strong on independence, requires both treatment and control

• Qualitative: strong on understanding, answers ‘why?’, difficult to scale up findings

• Theory based and realistic evaluation: compatible with programme planning, strong emphasis on context, requires strong ToC

• Participatory: provides for credibility and legitimacy, enhances relevance and use, can be time consuming

• Longitudinal tracking: tracks changes over time and can provide reasons for change, can be resource intensive

Page 20

What do the agency and its partners already know?

• No need to repeat evaluations if they do not add to the agency’s ability to take decisions (value of DFID writing approach papers)

• Role of information banks outside the agency (e.g. systematic reviews, research studies); external validity

• Have all stakeholders been involved with information gathering at the design stage?
• How strong is the M&E? Will the ‘M’ be useful for the ‘E’?
• Have worthwhile decisions been made in the past on existing information, good enough for sound design?
• Is some form of comparison group required?

Page 21

Nature of the intervention and level of complexity

• Key question on complexity is: what is the level of complexity/reductionism at which an intervention is implemented and an evaluation can be carried out?
• Do the findings of the evaluation provide the basis for going ahead to make a decision?
• If complexity is addressed in design through multiple intervention components, some where n=1 (addressed to governments), some where n=thousands (addressed to children), then different evaluation methods can handle this.
• But what do we know already that allows the evaluator to compromise on complexity?

Page 22

Resources and capacity

• Much choice comes down to the budget line, what the evaluation staff know, how much they are willing to take risks on unfamiliar methods (e.g. realist evaluation), and the time lines they work to

• There are opportunities for methods to be applied differently based on criteria already mentioned.

• Some agency staff describe the ‘20 day’, ‘30 day’ etc. evaluation method, defined by the resources they have

• This is why the ‘outcome summative’ method is so popular and why efforts should be made to improve it.

Page 23

Governance Structure of the Agency

• Always remains a key issue, as structures often inhibit risk-taking by the evaluators

• Role of the governing body and executive varies in terms of what evaluators can do.

Page 24

Importance of strengthening the ‘outcome summative’ evaluation

• Still remains the most common evaluation method (over 75% of evaluations?) but not much covered in recent literature

• Large element of evaluator’s judgement involved, familiar, convenient, inexpensive

• But considering other factors for choice, it can become the best choice: plausible attribution, aligned closely with other information sources, acknowledges deficiencies in addressing complexity, borrows ideas from other more rigorous techniques such as some form of comparison group or retrospective baseline.

Page 25

Thank you!