A Simulation and Evaluation of Earned Value Metrics to Forecast the Project Duration
Author(s): M. Vanhoucke and S. Vandevoorde
Source: The Journal of the Operational Research Society, Vol. 58, No. 10 (Oct., 2007), pp. 1361-1374
Published by: Palgrave Macmillan Journals on behalf of the Operational Research Society
Stable URL: http://www.jstor.org/stable/4622825
Accessed: 12/05/2011 17:28

A Simulation and Evaluation of Earned Value Metrics to Forecast the Project Duration




Journal of the Operational Research Society (2007) 58, 1361-1374 © 2007 Operational Research Society Ltd. All rights reserved. 0160-5682/07 $30.00

www.palgravejournals.com/jors

A simulation and evaluation of earned value metrics to forecast the project duration

M Vanhoucke1,2* and S Vandevoorde3

1Ghent University, Ghent, Belgium; 2Vlerick Leuven Ghent Management School, Ghent, Belgium; and 3Fabricom Airport Systems, Brussels, Belgium

In this paper, we extensively review and evaluate earned value (EV)-based methods to forecast the total project duration. EV systems have been set up to deal with the complex task of controlling and adjusting the baseline project schedule during execution, taking into account project scope, timed delivery and total project budget. Although EV systems have been proven to provide reliable estimates for the follow-up of cost performance within our project assumptions, they often fail to predict the total duration of the project. We present an extensive simulation study where we carefully control the level of uncertainty in the project, the influence of the project network structure on the accuracy of the forecasts and the time horizon where the EV-based measures provide accurate and reliable results. We assume a project setting where project activities and precedence relations are known in advance and do not consider fundamentally unforeseeable events and/or unknown interactions among various actions that might cause entirely unexpected effects in different project parts. This is the first study that investigates the potential of a recently developed method, the earned schedule method, which improves the connection between EV metrics and the project duration forecasts. Journal of the Operational Research Society (2007) 58, 1361-1374. doi: 10.1057/palgrave.jors.2602296 Published online 13 September 2006

Keywords: project management; simulation; forecasting

1. Introduction

Earned value management (EVM) is a methodology used to measure and communicate the real physical progress of a project and to integrate the three critical elements of project management (scope, time and cost management). It takes into account the work completed, the time taken and the costs incurred to complete the project, and it helps to evaluate and control project risk by measuring project progress in monetary terms. The basic principles and the use in practice have been comprehensively described in many sources (for an overview, see eg Anbari (2003) or Fleming and Koppelman (2005)). Although EVM has been set up to follow up both time and cost, the majority of the research has been focused on the cost aspect (see eg the paper written by Fleming and Koppelman (2003) who discuss EVM from a price-tag point-of-view). In this paper, we elaborate on the recent research focused on the time aspect of EVM and compare a newly developed method, called earned schedule (ES) (Lipke, 2003a), with the more traditional approach of forecasting a project's duration.

In this paper, we extensively simulate project execution based on a large set of very diverse project networks and a wide set of uncertainty scenarios. This approach allows an objective and extensive comparison between various EVM methods to forecast the project duration. We generate numerous networks based on a network generator from the literature and build a baseline plan with randomly generated activity durations and costs. We simulate project execution by means of Monte-Carlo simulations under different controlled scenarios and measure project performance, based on the actual (ie simulated) activity durations and costs that will differ from their corresponding planned values (PVs). Based on these differences, we calculate the earned value (EV) metrics and report our forecasting indicators. Thanks to this specific approach, we aim at drawing general conclusions that hold for a wide variety of projects and uncertainty scenarios, rather than drawing case-specific conclusions based on one or a few real-life projects. The purpose of this research is threefold. First, we evaluate the forecast accuracy of the three methods that aim to predict the total project duration based on EV metrics. Second, we check their robustness for different network types, based on a pre-defined topological structure of each network. Last, we study the behaviour of the different forecasting methods with respect to the stage of completion of the project and the level of uncertainty for each (critical and non-critical) activity.

The outline of the paper is as follows. In Section 2, we review the current EV key parameters, the corresponding performance measures and their use as indicators to predict a project's final duration and cost. In Section 3, we propose the settings of our simulation model to test the accuracy of the three methods to predict a project's final duration. In Section 4, we discuss our results for the three methods based on different levels of uncertainty and a known structure for each project. We conclude in Section 5 with some overall conclusions and suggestions for future research.

*Correspondence: M Vanhoucke, Faculty of Economics and Business Administration, Ghent University, Hoveniersberg 24, Ghent, Belgium. E-mail: [email protected]

It is crucial to note that this research paper rests on a fundamental assumption that project activities and precedence relations are known in advance. Hence, we assume a project setting where estimates can be given within a certain range, even though we may not be able to predict every source of unexpected events with certainty. However, projects often do not fulfil these assumptions but, on the contrary, are commonly plagued by fundamentally unforeseeable events and/or unknown interactions among various actions and project parts (Loch et al, 2006). The straight application of the metrics and methods evaluated in this paper is certainly insufficient for projects where our assumption does not hold, but this is certainly outside the scope of this paper. We refer to the book by Loch et al (2006) for a more general framework where the authors position and classify the sources of risk by the foreseeability of the underlying influence factors and by the complexity of the project. We also cite the letter to the editor of Harvard Business Review from Cooper (2003) as a response to the article written by Fleming and Koppelman (2003). In this letter, the author argues that the use of EVM can be questioned when it is applied in highly complex projects. Due to the cycles of rework, the accuracy of the EVM metrics can be biased, leading to incorrect management decisions.

2. EV metrics revisited

In this section, we review the different metrics of an EVM system, as used in our simulation approach. In Section 2.1, we briefly review the EVM key parameters that serve as an input for the performance measures and the forecasting indicators. In Section 2.2, we briefly review the existing performance measures and in Section 2.3, we discuss the use of these performance measures to forecast the future performance of the project. Figure 1 serves as a guideline to Sections 2.1-2.3.

2.1. EVM key parameters

Project performance should be measured throughout the life of the project and hence requires a fixed time frame (ie a baseline schedule) for the project. A project schedule defines starting times (and finishing times) for each project activity and hence, a PV for each activity, both in terms of duration and costs. The planned duration (PD) equals the total project duration as a result of the constructed critical path method (CPM) schedule and is often referred to as schedule at completion (SAC, Anbari, 2003). The actual time (AT) defines the project progress and reporting periods for performance measurement. The actual duration (AD) defines the real activity or project duration. The budget at completion (BAC) is the sum of all budgeted costs for the individual activities.

EVM requires three key parameters to measure project performance, that is, the PV, the actual cost (AC) and the EV.

Figure 1 gives an overview of the EVM framework: the key parameters PV, AC and EV (from which the ES is derived); the performance measures CPI and CV for cost, SPI and SV for schedule, their time-based counterparts SPI(t) and SV(t), and the translations to time units TV and ED; and the forecasting indicators EAC (cost estimate at completion) and EAC(t) (duration estimate at completion, one ED-based and one ES-based).

Figure 1 EVM: key parameters, performance measures and forecasting indicators.

The PV is the time-phased budget baseline as an immediate result of the CPM schedule constructed from the project network. The PV is often called budgeted cost of work scheduled. The AC is often referred to as the actual cost of work performed and is the cumulative AC spent at a given point AT in time. The EV represents the amount budgeted for performing the work that was accomplished by a given point AT in time. It is often called the budgeted cost of work performed and equals the total activity (or project) BAC multiplied by the percentage activity (or project) completion at the particular point in time (PC x BAC). Figure 2 displays the three EVM key parameters for a fictitious project under the four different possible time/cost scenarios.

2.2. EVM performance measures

Project performance, both in terms of time and costs, is determined by comparing the three key parameters PV, AC and EV, resulting in four well-known performance measures:

SV   schedule variance (SV = EV - PV)
SPI  schedule performance index (SPI = EV/PV)
CV   cost variance (CV = EV - AC)
CPI  cost performance index (CPI = EV/AC)
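As a minimal sketch (with hypothetical input values), the four measures follow directly from the cumulative key parameters:

```python
def performance_measures(pv: float, ac: float, ev: float) -> dict:
    """SV, SPI, CV and CPI from the cumulative key parameters (monetary units)."""
    return {
        "SV": ev - pv,     # schedule variance
        "SPI": ev / pv,    # schedule performance index
        "CV": ev - ac,     # cost variance
        "CPI": ev / ac,    # cost performance index
    }

# Scenario 1 of Figure 2 (late project, over budget):
m = performance_measures(pv=100.0, ac=90.0, ev=80.0)
print(m["SV"], m["SPI"] < 1, m["CV"], m["CPI"] < 1)  # -20.0 True -10.0 True
```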

Figure 2 The EVM key parameters PV, AC and EV for a project under four scenarios. Scenario 1: late project, over budget; Scenario 2: late project, under budget; Scenario 3: early project, over budget; and Scenario 4: early project, under budget.

In our specific simulation approach, we calculate these performance measures on the project level, and not on the level of each individual activity. Jacob and Kane (2004) criticize this approach and argue that the well-known performance measures are true indicators for project performance as long as they are used on the activity level, and not on the control account level or higher work breakdown structure (WBS) levels. They illustrate their statement with a simple example with two activities, leading to wrong and misleading results. As an example, a delay in a non-critical activity might give a warning signal that the project is in danger, while there is no problem at all since the activity only consumes part of its slack. Since we calculate the performance measures on the project level, this will lead to a false warning signal and hence, wrong corrective actions can be taken. We recognize that effects (delays) of non-performing activities can be neutralized by well-performing activities (ahead of schedule) at higher WBS levels, which might result in masking potential problems, but we believe that this is the only approach that can be taken by practitioners. The EV metrics are set up as early warning signals to detect problems in an easy and efficient way (ie at the cost account level, or even higher), rather than as a simple replacement of the critical path-based scheduling tools. This early warning signal, if analysed properly, defines the need to eventually 'drill down' into lower WBS levels. In conjunction with the project schedule, it allows taking corrective actions on those activities which are in trouble (especially those tasks which are on the critical path). As a result, we calculate the performance measures (SPI and SV) on the level of the project based on the three key indicators (PV, AC and EV) that are calculated per reporting period as the sum over all the individual activities (which can be easily done since they are expressed in monetary units).

The cost performance indicators and their predictive power to forecast the final project cost (see next section) have been discussed extensively in the literature and will not be repeated here. However, in order to track project time performance, the schedule performance measures need to be translated from monetary units to time units. In the literature, three methods have been proposed to measure schedule performance: the PV method (Anbari, 2003) and the ED method (Jacob and Kane, 2004) translate the well-known SV and SPI indicators from monetary units to time units. The ES method has been recently introduced by Lipke (2003a) and calculates two alternative schedule performance measures (referred to as SV(t) and SPI(t)) that are directly expressed in time units.

The PV method of Anbari (2003) relies on the well-known EV metrics to forecast a project's duration using the following metrics:

PVR  planned value rate (or planned accomplishment rate) = BAC/PD

TV   time variance = SV/PVR

The average PV per time period, the PVR, is defined as the baseline BAC divided by the PD. This measure can be used to translate the SV into time units, denoted by the TV.
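A small sketch of this translation, with hypothetical numbers (a project with BAC = 120 and PD = 12 periods has PVR = 10 per period):

```python
def time_variance(sv: float, bac: float, pd: float) -> float:
    """TV = SV / PVR, with PVR = BAC / PD (average planned value per period)."""
    pvr = bac / pd            # planned value rate
    return sv / pvr           # equivalently SV * PD / BAC

# An SV of -20 monetary units then corresponds to a delay of 2 time periods:
tv = time_variance(sv=-20.0, bac=120.0, pd=12.0)
print(tv)  # -2.0
```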

Figure 3 The ES metric for a late (left) and early (right) project.

Figure 4 The SPI and SV versus SPI(t) and SV(t) performance measures.

Jacob and Kane (2004) introduced a new term, ED, as the product of the AD and the SPI. Jacob (2003) and Jacob and Kane (2004) introduced the ED method as a reliable methodology for forecasting duration using the SPI:

ED earned duration =AD*SPI

Lipke (2003a) criticized the use of the classic SV and SPI metrics since they give false and unreliable time forecasts near the end of the project. Instead, he provided a time-based measure to overcome the quirky behaviour of the SV and SPI indicators. This ES method relies on similar principles of the EV method, and uses the concept of ES as follows:

Find t such that EV ≥ PVt and EV < PVt+1

ES = t + (EV - PVt)/(PVt+1 - PVt)

with

ES   earned schedule
EV   earned value at the actual time
PVt  planned value at time instance t

The cumulative value for the ES is found by using the EV to identify in which time increment t of PV the cost value for EV occurs. ES is then equal to the cumulative time t to the beginning of that time increment, plus a fraction (EV - PVt)/(PVt+1 - PVt) of it. The fraction equals the portion of EV extending into the incomplete time increment divided by the total PV planned for that same time period, which is simply calculated as a linear interpolation over the time-span between time increments t and t + 1. Figure 3 illustrates the translation of the EV into the ES metric to clearly show whether a project is behind (left) or ahead of (right) schedule.
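The interpolation above can be sketched as follows; representing the cumulative PV curve as a per-period list is our own illustrative assumption:

```python
from bisect import bisect_right

def earned_schedule(ev: float, pv_cum: list) -> float:
    """ES: find t with PVt <= EV < PVt+1, then interpolate linearly.
    pv_cum[t] holds the cumulative planned value at the end of period t
    (pv_cum[0] = 0 at the project start)."""
    t = bisect_right(pv_cum, ev) - 1          # largest t with pv_cum[t] <= ev
    if t >= len(pv_cum) - 1:                  # EV has reached the end of the PV curve
        return float(len(pv_cum) - 1)
    return t + (ev - pv_cum[t]) / (pv_cum[t + 1] - pv_cum[t])

# Hypothetical PV curve over 4 periods; EV = 45 falls between PV2 = 30 and PV3 = 60:
es = earned_schedule(45.0, [0.0, 10.0, 30.0, 60.0, 100.0])
print(es)  # 2.5
```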

Using the ES concept, two indicators can be constructed which serve as good and reliable alternatives to the SV and SPI indicators, as follows:

SV(t)   schedule variance with earned schedule (SV(t) = ES - AD)

SPI(t)  schedule performance index with earned schedule (SPI(t) = ES/AD)

Figure 4 clearly displays the unreliable behaviour of the SV and SPI metrics for a project that finishes later than planned (PD = 9 weeks while AD = 12 weeks). The last review periods of the project are unreliable since both the SV and SPI metrics clearly show an improving trend. At the end of the project, both metrics signal that the project finishes on time (SV = 0 and SPI = 1 at the end of the project), although it is 3 weeks late. The SV(t) and SPI(t) metrics give a correct signal throughout the whole life of the project. The SV(t) equals -3 at the end of the project, which reflects the 3 weeks delay.


Situation 1: PDWR according to plan
  Anbari (2003): EAC(t)PV1; Jacob (2003): EAC(t)ED1; Lipke (2003a): EAC(t)ES1
  Comments: past performance is not a good predictor of future performance. Problems/opportunities of the past will not affect the future, and the remaining work will be done according to plan.

Situation 2: PDWR will follow the current SPI trend
  Anbari (2003): EAC(t)PV2; Jacob (2003): EAC(t)ED2; Lipke (2003a): EAC(t)ES2
  Comments: past performance is a good predictor of future performance (realistic!). Problems/opportunities of the past will affect future performance, and the remaining work will be corrected for the observed efficiencies or inefficiencies.

Situation 3: PDWR will follow the current SCI trend
  Anbari (2003): EAC(t)PV3; Jacob (2003): EAC(t)ED3(a); Lipke (2003a): EAC(t)ES3(b)
  Comments: past cost and schedule problems are good indicators for future performance (ie cost and schedule management are inseparable). The SCI = SPI * CPI (schedule cost ratio) is often called the critical ratio index.

The three methods are based on the planned value rate (Anbari), the earned duration (Jacob) and the earned schedule (Lipke), respectively.

(a) This forecasting formula does not appear in Jacob (2003). It is based on Anbari (2003) and has been added by Vandevoorde and Vanhoucke (2006).
(b) This forecasting formula does not appear in Lipke (2003a). It is based on Anbari (2003) and has been added by Vandevoorde and Vanhoucke (2006).

Figure 5 The estimated PDWR depending on the project situation. (Source: Vandevoorde and Vanhoucke (2006), and based on Anbari (2003))

Since the introduction of the ES concept by Lipke (2003a), other authors have investigated the potential of the new method in various ways. Henderson (2003) has shown the validity of the ES concepts on a portfolio of six projects. In another paper, he extended this novel approach (Henderson, 2004) and used it on a small-scale but time-critical information technology software development project (Henderson, 2005). Vandevoorde and Vanhoucke (2006) were the first authors to extensively compare the three methods and test them on a simple one-activity project and a real-life data set. Moreover, they summarize the often confusing terminology used in the ES literature, which will be used throughout the current paper. In the current paper, we wish to elaborate on the ES concept and simulate the three methods on a large (generated) data set containing projects of moderate size.

2.3. EVM forecasting indicators

One of the primary tasks of a project manager is making decisions about the future. EVM systems are designed to follow up the performance of a project and to act as a warning signal to take corrective actions in the (near) future. Forecasting total project cost and the time to completion is crucial to take corrective actions when problems or opportunities arise (and hence, the performance measures are translated into early warning signals). EVM metrics are designed to forecast these two important performance measures (time and cost) based on the actual performance up to date and the assumptions about future performance. In this section, we review some generally accepted and newly developed forecasting measures that will be used in our simulation study.

The general formula for predicting a project's final cost is given by the estimated cost at completion (EAC), as follows:

EAC = AC + PCWR

with
EAC   estimated cost at completion
AC    actual cost
PCWR  planned cost of work remaining

The general and similar formula for predicting a project's total duration is given by the estimated duration at completion (EAC(t)), as follows:

EAC(t) = AD + PDWR

with
EAC(t)  estimated duration at completion
AD      actual duration
PDWR    planned duration of work remaining

Note that we use EAC for cost forecasting and we add a t between brackets (ie EAC(t)) for time forecasting. Cost performance and forecasting have been widely investigated by numerous researchers and are outside the scope of this paper. For an overview, we refer to Christensen (1993) who reviews different EAC formulas and several studies that examine their accuracy.

In this paper, we compare and validate the different methods to forecast a project's final duration. Note that EAC(t) is often referred to as the time estimate at completion (TEAC), the estimate of duration at completion (EDAC) or the independent EDAC. In the current paper, we rely on the terminology of Vandevoorde and Vanhoucke (2006) who compared three methods to estimate the PDWR based on research done by Anbari (2003), Jacob (2003) and Lipke (2003a). Each method has three different versions to predict a project's final duration, depending on the characteristics and performance of the project in the past. Figure 5 summarizes the nine forecasting metrics used in our simulation study. Note that Anbari (2003) maintains that the TEAC can be calculated based on the SAC or PD, actual performance up to any given point AT in the project, and analysis of expected future performance. He specifies that 'when current analysis shows that past schedule performance is a good predictor of future schedule performance', then the TEAC 'is the sum of the cumulative AT plus the original scheduled time for the remaining work, modified by the cumulative SPI'.

The PV method of Anbari (2003) does not directly give an estimate for the PDWR but, instead, is based on the TV = SV/PVR = SV*PD/BAC. Anbari (2003) proposes three different time forecasting measures reflecting the three situations of Figure 5, as follows:

EAC(t)PV1 = PD - TV   when the duration of remaining work is as planned
EAC(t)PV2 = PD/SPI    when the duration of remaining work follows the current SPI trend
EAC(t)PV3 = PD/SCI    when the duration of remaining work follows the current SCI trend
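A sketch of the three variants (input values hypothetical):

```python
def eac_t_pv(pd: float, bac: float, sv: float, spi: float, cpi: float) -> dict:
    """Anbari's planned value method: three EAC(t) variants."""
    pvr = bac / pd                 # planned value rate
    tv = sv / pvr                  # time variance
    return {
        "PV1": pd - tv,            # remaining work as planned
        "PV2": pd / spi,           # remaining work follows the SPI trend
        "PV3": pd / (spi * cpi),   # remaining work follows the SCI = SPI*CPI trend
    }

# Hypothetical project: PD = 10, BAC = 100, SV = -10, SPI = 0.8, CPI = 0.9
f = eac_t_pv(10.0, 100.0, -10.0, 0.8, 0.9)
print(f["PV1"], f["PV2"])  # 11.0 12.5
```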

Jacob and Kane (2004) use the ED method to forecast the total project duration, as follows: EAC(t) = AD + unearned remaining duration/PF = AD + (PD - ED)/PF = AD + (PD - AD*SPI)/PF. The performance factor (PF) is used to correct the estimate towards previous performance (see Figure 5). Indeed, a PF = 1 denotes a future performance at the efficiency rate of the original plan (100% efficiency). However, the future performance can be corrected towards the current SPI trend or the current schedule cost index SCI = CPI*SPI trend. Hence, the three forecasting methods to predict the total project duration are:

EAC(t)ED1 = PD + AD*(1 - SPI)

EAC(t)ED2 = PD/SPI

EAC(t)ED3 = AD + (PD - ED)/(CPI*SPI)

In situations where the project duration exceeds the PD and the work is not yet completed (ie when AD > PD and SPI < 1), the PD will be substituted by the AD in the above-mentioned formulas. In these cases, the formulas are

EAC(t)ED1 = AD + (AD - ED)/1 = AD*(2 - SPI)

EAC(t)ED2 = AD + (AD - ED)/SPI

EAC(t)ED3 = AD + (AD - ED)/(CPI*SPI)
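Using the max(PD, AD) form implied by this substitution rule, the three ED forecasts can be sketched in one helper (values hypothetical):

```python
def eac_t_ed(pd: float, ad: float, spi: float, cpi: float) -> dict:
    """Jacob's earned duration method: EAC(t) = AD + (max(PD, AD) - ED) / PF."""
    ed = ad * spi                  # earned duration
    base = max(pd, ad)             # PD is substituted by AD when AD > PD
    return {
        "ED1": ad + (base - ed) / 1.0,          # PF = 1
        "ED2": ad + (base - ed) / spi,          # PF = SPI (equals PD/SPI while AD <= PD)
        "ED3": ad + (base - ed) / (cpi * spi),  # PF = SCI
    }

# Hypothetical project in progress: PD = 10, AD = 8, SPI = 0.8, CPI = 0.9
f = eac_t_ed(10.0, 8.0, 0.8, 0.9)
print(f["ED1"], f["ED2"])  # 11.6 12.5
```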

The ES method of Lipke (2003a) calculates the EAC(t) as AD + unearned remaining duration/PF = AD + (PD - ES)/PF. The three forecasting metrics, based on the scenarios presented in Figure 5, are:

EAC(t)ES1 = EDAC1 = AD + (PD - ES)

EAC(t)ES2 = EDAC2 = AD + (PD - ES)/SPI(t)

EAC(t)ES3 = EDAC3 = AD + (PD - ES)/(CPI*SPI(t))
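A corresponding sketch for the ES method (values hypothetical):

```python
def eac_t_es(pd: float, ad: float, es: float, cpi: float) -> dict:
    """Lipke's earned schedule method: EAC(t) = AD + (PD - ES) / PF."""
    spi_t = es / ad                # SPI(t) = ES / AD
    return {
        "ES1": ad + (pd - es),                   # PF = 1
        "ES2": ad + (pd - es) / spi_t,           # PF = SPI(t)
        "ES3": ad + (pd - es) / (cpi * spi_t),   # PF = CPI * SPI(t)
    }

# Hypothetical project: PD = 10, AD = 8, ES = 6.5, CPI = 0.9
f = eac_t_es(10.0, 8.0, 6.5, 0.9)
print(f["ES1"])  # 11.5
```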

3. Research methodology and test design parameters

In this section, we discuss our specific methodological approach to measure the forecast accuracy of the different forecasting measures described earlier. A summary picture is given in Figure 6, and details are explained in the following sub-sections. In a first step, we generate 3100 diverse project networks under a controlled design. More precisely, we vary and control the topological structure of each network as a measure of diversity. (Several sources in the literature (eg Herroelen and De Reyck (1999), Demeulemeester et al (2003) and Vanhoucke et al (2004)) refer to 'the exact shape of a network' as its 'topological structure'. Other authors (eg Tavares et al (1999, 2002)) refer to it as the 'morphological structure'.) This is the topic of Section 3.1. Each network is transformed into an earliest start Gantt chart, based on the well-known critical-path-based forward pass calculations. The resulting schedule gives the periodic PVs for each activity. The project is then subject to uncertainty during execution, which is simulated according to the Monte-Carlo principles. More precisely, we run nine different scenarios reflecting all possible outcomes a project can have, as discussed in Section 3.2. The resulting AC and EV metrics are compared with the PV, resulting in the forecasting measures described earlier. The results of the forecast accuracy are reported in Section 4.

We mentioned before that our unique approach aims at the generation of a wide variety of diverse projects and the simulation of nine uncertainty scenarios, resulting in 27 900 different possible project executions (moreover, each possible execution has been simulated for 100 Monte-Carlo runs). It allows drawing objective and extensive conclusions that are not based on the execution of one or a few real-life projects but can be generalized to a wide set of possible real-life scenarios, as follows:

* The diverse set of networks: the generation of the 3100 project networks is based on and measured by their topological structure (Section 3.1). Consequently, the specific topological structure of each project network is not linked to any practical characteristic of a particular real-life project but is used as a diversity measure.

* The nine uncertainty scenarios: we simulate random variations in both activity durations and costs under a controlled design (Section 3.2) by means of triangular distributions tailed to the right or left. We do not intend to link the generated variability to practical uncertainties or systematic variations such as changes in client requirements, re-allocations of staff, over-ambition of technical people, etc.

Moreover, throughout our experiments, we assume that:

* Managers do not change their focus on control. We simulate project execution under different controlled scenarios, but do not take any possible corrective actions into account. Hence, once the project has started, it will be executed according to the different execution scenarios.

* Resources are available at all times and can be passed on to succeeding activities at no cost. This implies that activities can be started earlier and/or later than scheduled, depending on the performance of the previous activities. In doing so, we avoid that earliness in activity durations will be cancelled out later in the project due to resource inflexibility.

Figure 6 summarizes the methodological approach of our research experiment: a network generator produces project networks with a pre-defined structure; an earliest start schedule (ESS) constructed with forward calculations yields the planned value (PV); Monte-Carlo simulation of each activity's duration and cost under nine scenarios yields the actual cost (AC) and earned value (EV); and the forecasting accuracy is measured at each review period.
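As an illustration only (the distribution bounds below are our own assumptions, not the experiment's actual settings), a right- or left-tailed triangular draw around a planned value might look like:

```python
import random

def simulated_actual(planned: float, tail: str, rng: random.Random) -> float:
    """Draw an actual duration or cost from a triangular distribution whose
    mode is the planned value and whose long tail points right (overruns)
    or left (underruns). The bounds are illustrative assumptions."""
    if tail == "right":
        return rng.triangular(0.9 * planned, 1.5 * planned, planned)
    return rng.triangular(0.5 * planned, 1.1 * planned, planned)

rng = random.Random(42)
samples = [simulated_actual(10.0, "right", rng) for _ in range(10_000)]
avg = sum(samples) / len(samples)
# The right-tailed draws overrun the plan of 10 on average
# (the distribution mean is (9 + 15 + 10) / 3, roughly 11.33):
print(avg > 10.0)
```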

3.1. Diversity of project networks

In this section, we describe our generation process in order to construct a set of project networks that differ from each other in terms of their topological structure. Rather than drawing conclusions for a (limited) set of real-life projects, we aim at generating a large set of project networks that spans the full range of complexity (Elmaghraby and Herroelen, 1980). In doing so, we guarantee that we have a very large and diverse set of generated networks that can and might occur in practice, and hence, our results can be generalized. We rely on the project network generator developed by Vanhoucke et al (2004) to generate activity-on-the-node project networks, where the set of nodes represents network activities and the set of arcs represents the technological precedence relations between the activities. These authors have proposed a network generator that allows generating networks with a controlled topological structure. They have proven that their generator is able to generate a set of very diverse networks that differ substantially from each other from a topological point of view. Moreover, it has been shown in the literature that the structure of a network heavily influences the constructed schedule (Patterson, 1976), the risk of delays (Tavares et al, 1999), the criticality of a network (Tavares et al, 2004) and the computational effort an algorithm needs to schedule a project (Elmaghraby and Herroelen, 1980). In our experiment, we carefully vary

[Figure 7 is a 3 x 3 matrix of scenarios. The columns denote the simulated change in the durations of the critical activities (-, 0, +), which fixes the actual project outcome (AD < PD, AD = PD and AD > PD, respectively); the rows denote the change in the durations of the non-critical activities (-, 0, +). The nine cells read:

Scenario 1: SPI(t) > 1, AD < PD   Scenario 4: SPI(t) > 1, AD = PD   Scenario 7: SPI(t) > 1, AD > PD
Scenario 2: SPI(t) > 1, AD < PD   Scenario 5: SPI(t) = 1, AD = PD   Scenario 8: SPI(t) < 1, AD > PD
Scenario 3: SPI(t) < 1, AD < PD   Scenario 6: SPI(t) < 1, AD = PD   Scenario 9: SPI(t) < 1, AD > PD]

Figure 7 Nine simulation scenarios for our computational tests.

and control the design and structure of our generated networks, resulting in 3100 diverse networks with 30 activities. For more information about the specific topological structures and the generation process, we refer to Vanhoucke et al (2004) or to www.projectmanagement.ugent.be/rangen.php. The constructed data set can be downloaded from www.projectmanagement.ugent.be/evms.php.

3.2. The nine simulation scenarios

Figure 7 displays the nine simulation scenarios of our computational experiment on all 3100 generated networks, and reads as follows:

Critical versus non-critical activities: A distinction has been made between critical and non-critical activities. Each (critical and non-critical) activity can have an actual (ie simulated) duration which is smaller than (-), equal to (0) or larger than (+) its corresponding PD.

Actual project performance: The actual (ie simulated) project performance at completion is measured by comparing the actual project duration AD with the PD. Hence, each


column reflects a known schedule condition as follows: the first column denotes early project completion (AD < PD), the second column on-schedule completion (AD = PD) and the last column late project completion (AD > PD).

Measured project performance: The SPI(t) index is used to calculate the average measured performance, measured as the average of all SPI(t) values calculated over all review periods during the entire project execution.

As an example, scenario 3 measures, on average over all reporting periods, a project delay (SPI(t) < 1), although the project finishes earlier than expected (AD < PD). Scenario 8 measures an average project delay, which is a correct warning signal since AD > PD.

Our simulations follow the design of Figure 7 and allow us to validate the three predictive techniques and to compare their relative performance. The SPI(t) indicator, which is used as an average predictor of the overall project performance throughout the entire project life cycle, plays a central role and might act as a (good or bad) warning signal of project performance. More precisely, the nine scenarios can be interpreted as follows:

Scenario 1: We measure a project ahead of schedule, and it is true.
Scenario 2: We measure a project ahead of schedule, and it is true.
Scenario 3: We measure a project delay, but we are ahead of schedule.
Scenario 4: We measure a project ahead of schedule, but there is none.
Scenario 5: Everything is according to schedule.
Scenario 6: We measure a project delay, but there is none.
Scenario 7: We measure a project ahead of schedule, but the project is behind schedule.
Scenario 8: We measure a project delay, and it is true.
Scenario 9: We measure a project delay, and it is true.

We would like to stress that all EV-based metrics discussed in this paper make no distinction between critical and non-critical activities and suffer from the fact that all activities have an equal weight in the total EV calculations. Therefore, we make a distinction between critical and non-critical activities throughout our simulation runs and test the potential false warning signals EVM might produce when predicting the final project duration (see Section 4.3). Note that four scenarios (1, 2, 8 and 9) give a correct warning signal during the execution of the project, and four scenarios (3, 4, 6 and 7) give a false warning signal. One scenario assumes no uncertainty at all, that is, all PDs equal the ADs.

Note that, throughout our simulations, the random variations in both (critical and non-critical) activity durations and costs are based on triangular distributions tailed to the right or left. However, the choice of the specific tails depends on the scenario to simulate. As an example, scenario 1 assumes activities finishing earlier than planned, which has been

simulated by means of a triangular distribution tailed to the left to model the earliness. Scenario 9 assumes that all activities finish late, which has been simulated by a triangular distribution tailed to the right (lateness). All intermediate scenarios contain some activities with left-tailed triangular distributions and some activities with right-tailed triangular distributions, depending on the settings of the scenario.
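These tailed draws can be illustrated with the standard library's triangular sampler; the concrete low/mode/high parameters below are illustrative assumptions of ours, not the values used in the paper:

```python
import random

PD = 10  # planned duration of some activity

# Earliness (eg scenario 1): a triangular distribution tailed to the left,
# ie most mass near the planned duration and a tail towards shorter durations.
early = random.triangular(0.5 * PD, PD, 0.9 * PD)

# Lateness (eg scenario 9): a triangular distribution tailed to the right,
# ie most mass near the planned duration and a tail towards longer durations.
late = random.triangular(PD, 1.5 * PD, 1.1 * PD)

print(early <= PD <= late)  # True: early draws never exceed PD, late never undercut it
```

Note that `random.triangular(low, high, mode)` places the peak of the density at `mode`; moving the mode towards one endpoint stretches the tail towards the other.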

3.3. The completion stage of work

The accuracy of forecasts depends on the completion stage of the project. Christensen et al (1995) have shown that no index-based formula always outperforms all others when forecasting the cost of a project. In this section, we intend to perform a similar study and measure the accuracy of index-based time forecasts as a function of the completion stage of the project. Lipke (2003a) has shown that the classic schedule indicators (SV and SPI) are unreliable as project duration forecasting indicators, since they show a strange behaviour over the final third of the project. This problem is overcome by the ES concept, which behaves correctly over the complete project horizon (see Figure 4). In order to investigate the effect of the behaviour of SPI and SPI(t) on the forecasting measures, we have measured and forecast the overall (duration) performance along the completion stage of the

project (expressed as the percentage completed, EV/BAC). We divide the project horizon into three stages (early, middle and late) and simulate 3, 9 and 3 settings, respectively, as follows:

Percentage completed

1. Early stage: 0-20%, 0-30%, 0-40%
2. Middle stage: 20-60%, 30-60%, 40-60%, 20-70%, 30-70%, 40-70%, 20-80%, 30-80%, 40-80%
3. Late stage: 60-100%, 70-100%, 80-100%
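As an illustration, the fifteen window settings above can be written down as data; the structure and helper function below are our own sketch:

```python
# The three completion stages and their measurement windows, expressed as
# (from%, to%) pairs, exactly as listed above.
STAGES = {
    "early":  [(0, 20), (0, 30), (0, 40)],
    "middle": [(20, 60), (30, 60), (40, 60),
               (20, 70), (30, 70), (40, 70),
               (20, 80), (30, 80), (40, 80)],
    "late":   [(60, 100), (70, 100), (80, 100)],
}

def windows_containing(pct):
    """Return every (stage, window) whose range covers a given percentage
    completed (100 * EV / BAC)."""
    return [(stage, w) for stage, ws in STAGES.items()
            for w in ws if w[0] <= pct <= w[1]]

print(len(windows_containing(50)))  # 9: all nine middle-stage windows
```

A review period at 50% completion thus contributes to all nine middle-stage settings and to none of the early or late ones.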

4. Computational tests and results

In this section, we display extensive results for all our simulations. This section has been divided into three sub-sections, inspired by the three criteria proposed by Covach et al (1981) for evaluating the performance of EAC methods, that is, accuracy, timeliness and stability.

Forecast accuracy: In Section 4.1, we evaluate the overall forecast accuracy of the three methods (PV, ED and ES) for the nine proposed scenarios of Section 3.2.

Timeliness: Section 4.2 analyses the behaviour of the forecasts along the completion stage of the generated projects, and hence, measures whether the forecasting methods are capable of producing reliable results as early as possible in the life of the project.

Stability: In Section 4.3, we discuss the influence of the network structure on the forecast accuracy, based on an


Table 1 The forecast accuracy (MAPE) of the three methods for the nine scenarios

            PV1    PV2    PV3    ED1    ED2    ED3    ES1    ES2    ES3
Scenario 1  0.106  0.128  0.481  0.112  0.128  0.249  0.076  0.099  0.270
Scenario 2  0.114  0.095  0.101  0.121  0.095  0.087  0.094  0.036  0.054
Scenario 3  0.067  0.080  0.254  0.066  0.080  0.175  0.055  0.064  0.164
Scenario 4  0.035  0.071  0.426  0.023  0.071  0.229  0.033  0.092  0.237
Scenario 5  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
Scenario 6  0.024  0.051  0.416  0.021  0.051  0.242  0.019  0.063  0.273
Scenario 7  0.042  0.077  0.409  0.032  0.077  0.222  0.034  0.093  0.227
Scenario 8  0.100  0.090  0.119  0.102  0.090  0.102  0.076  0.031  0.067
Scenario 9  0.061  0.064  0.232  0.064  0.064  0.132  0.046  0.032  0.142

indicator that measures the closeness of each network to a serial or parallel network. In doing so, we measure the robustness or stability of the forecasting measures for different network structures. Note that we deviate from the original definition of stability of Covach et al (1981), which refers to the stability of forecasting methods over the different review periods, and not over the different structures of the network.

In order to evaluate the forecasting measures and to determine the forecast accuracy of each technique, we calculate:

The mean percentage error (MPE):

MPE = \frac{1}{T} \sum_{rp=1}^{T} \frac{RD - EAC(t)_{rp}}{RD}

The mean absolute percentage error (MAPE):

MAPE = \frac{1}{T} \sum_{rp=1}^{T} \frac{|RD - EAC(t)_{rp}|}{RD}

where T denotes the total number of reporting periods over the complete project horizon, RD the real (actual) project duration, and EAC(t)_{rp} the estimated duration at completion in reporting period rp (rp = 1, 2, ..., T); that is, at each reporting period we calculate a corresponding duration forecast EAC(t)_{rp}.
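Both error measures translate directly into code. The sketch below mirrors the paper's symbols (RD, EAC(t)_rp); the example inputs are illustrative values of our own:

```python
def mpe(rd, eac_t):
    """Mean percentage error over T reporting periods: positive values
    indicate underestimation of the real duration, negative values
    overestimation."""
    return sum((rd - e) / rd for e in eac_t) / len(eac_t)

def mape(rd, eac_t):
    """Mean absolute percentage error over T reporting periods."""
    return sum(abs(rd - e) / rd for e in eac_t) / len(eac_t)

# Example: a real duration of 20 periods and three successive duration
# forecasts made during execution.
forecasts = [25.0, 21.0, 19.0]
print(mpe(20, forecasts))   # about -0.083: on average an overestimation
print(mape(20, forecasts))  # about 0.117
```

Note that the signed MPE lets over- and underestimations cancel, which is why both measures are reported side by side.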

4.1. Forecast accuracy for the nine scenarios

In order to test our nine scenarios of Section 3.2, we need to adapt the uncertainty in activity durations according to the specifications of each scenario. As an example, the simulation of scenario 9 is straightforward and involves randomness in activity durations resulting in actual durations that exceed the PD of each activity. However, the simulation of scenario 3 (ie simulate a project ahead of schedule while we measure the opposite) depends heavily on the structure of the network. Indeed, the duration of the critical activities needs to be decreased, but the non-critical activities need to be increased to guarantee that we measure an average SPI(t) < 1 along the life of the project, although the project finishes ahead of schedule. To obtain this scenario, the duration of the non-critical activities needs to be increased as much as possible within their

activity slack (resulting in SPI(t) < 1). As an example, a project network consisting of many serial activities has only a few non-critical activities, and hence, a carefully selected simulation is stringent. More precisely, only a few critical activities will be decreased to a very small extent (such that AD < PD), while the few non-critical activities need to be increased as much as possible within their slack (such that the SPI(t) value is, on average, smaller than 1). Therefore, each scenario has been simulated under strict conditions, and hence, comparison between scenarios is of little value for the MPE and the MAPE. However, we use these two error measures to compare and evaluate the overall forecast accuracy of the methods within each scenario. The cost deviations are assumed to deviate from the original budget (BAC per activity) in correlation with the duration deviation. In doing so, we assume that the cost is expressed per man-hour, and hence, deviations in activity duration have an immediate effect on the cost, due to an increase or a decrease in the total amount of man-hours needed to finish the particular activity. Although this reflects many real-life situations in project management, one can consider other settings where the cost deviation has another relation (or no relation) to the duration of an activity. However, the focus of this paper is on the prediction of a project's final duration, and not on cost. The SCI and SCI(t) metrics are only used to forecast the duration, which makes sense only when the cost is correlated with duration performance.
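The assumed man-hour cost model, under which an activity's cost deviates proportionally with its duration, can be sketched as follows (the function name and figures are ours; only the proportionality itself comes from the text):

```python
def actual_cost(bac, planned_duration, actual_duration):
    """Cost expressed per man-hour: a relative deviation in duration
    translates one-for-one into a deviation of the cost from its BAC."""
    return bac * actual_duration / planned_duration

# An activity budgeted at 1000 for 10 periods that takes 12 periods
# costs 1200 under this assumption.
print(actual_cost(1000, 10, 12))  # 1200.0
```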

Table 1 displays the MAPE for all the scenarios for the three proposed methods (note that we have abbreviated each forecasting method, for example, EAC(t)PV1 = PV1). The table reveals that the ES method often outperforms the other methods, but not in all scenarios. The results can be summarized as follows.

* Projects that finish ahead of schedule (scenarios 1, 2 and 3): the ES method outperforms both the PV and the ED method. This is in line with the example project 3 of Vandevoorde and Vanhoucke (2006). In these cases, the ES method can be used as a reliable indicator to detect opportunities in the project.

* Projects that finish on schedule (scenarios 4, 5 and 6): The ED method outperforms the other methods (note


Table 2 The MAPE for projects with AD < PD (ahead of schedule)

      Early               Middle                                                         Late
      0-20  0-30  0-40  20-60 20-70 20-80 30-60 30-70 30-80 40-60 40-70 40-80  60-100 70-100 80-100

PV1   0.89  0.83  0.78  0.64  0.60  0.56  0.60  0.56  0.52  0.56  0.52  0.47   0.31   0.28   0.25
PV2   0.34  0.32  0.31  0.28  0.27  0.27  0.26  0.26  0.25  0.25  0.24  0.24   0.23   0.23   0.23
PV3   0.43  0.42  0.41  0.39  0.39  0.39  0.39  0.38  0.38  0.38  0.39  0.39   0.39   0.39   0.39
ED1   0.94  0.89  0.84  0.72  0.69  0.65  0.69  0.66  0.62  0.66  0.62  0.58   0.44   0.41   0.40
ED2   0.34  0.32  0.31  0.28  0.27  0.27  0.26  0.26  0.25  0.25  0.24  0.24   0.23   0.23   0.23
ED3   0.41  0.37  0.35  0.28  0.25  0.23  0.25  0.23  0.21  0.23  0.21  0.19   0.14   0.12   0.12
ES1   0.91  0.86  0.80  0.67  0.62  0.57  0.63  0.58  0.53  0.58  0.53  0.48   0.24   0.19   0.13
ES2   0.24  0.22  0.20  0.16  0.14  0.13  0.14  0.13  0.12  0.13  0.12  0.11   0.06   0.05   0.04
ES3   0.47  0.44  0.41  0.33  0.31  0.28  0.31  0.28  0.26  0.28  0.26  0.23   0.12   0.09   0.07

Table 3 The MPE for projects with AD < PD (ahead of schedule)

      Early                  Middle                                                               Late
      0-20   0-30   0-40   20-60  20-70  20-80  30-60  30-70  30-80  40-60  40-70  40-80   60-100 70-100 80-100

PV1  -0.89  -0.83  -0.78  -0.64  -0.60  -0.56  -0.60  -0.56  -0.52  -0.56  -0.52  -0.47   -0.29  -0.25  -0.21
PV2  -0.16  -0.15  -0.16  -0.14  -0.15  -0.15  -0.14  -0.15  -0.15  -0.14  -0.14  -0.14   -0.14  -0.14  -0.14
PV3   0.35   0.37   0.37   0.37   0.37   0.37   0.37   0.37   0.37   0.37   0.38   0.38    0.38   0.39   0.39
ED1  -0.94  -0.89  -0.84  -0.72  -0.68  -0.64  -0.68  -0.64  -0.60  -0.64  -0.60  -0.55   -0.35  -0.31  -0.27
ED2  -0.16  -0.15  -0.16  -0.14  -0.15  -0.15  -0.14  -0.15  -0.15  -0.14  -0.14  -0.14   -0.14  -0.14  -0.14
ED3   0.32   0.31   0.29   0.24   0.21   0.19   0.22   0.19   0.17   0.19   0.17   0.15    0.04   0.01  -0.01
ES1  -0.91  -0.86  -0.80  -0.67  -0.62  -0.57  -0.63  -0.58  -0.53  -0.58  -0.53  -0.48   -0.23  -0.18  -0.13
ES2   0.05   0.05   0.04   0.03   0.02   0.02   0.02   0.02   0.01   0.02   0.01   0.01    0.00   0.00   0.00
ES3   0.44   0.42   0.40   0.33   0.30   0.28   0.30   0.28   0.25   0.28   0.26   0.23    0.11   0.09   0.07

that ED2 and PV2 are exactly the same predictors) for scenarios 4 and 6. However, these scenarios are especially built to generate an SPI(t) indicator that gives a false warning signal. Hence, the forecasting metrics will be influenced by this false indicator, resulting in wrong forecasts. The ED method does not suffer from this error, since the SPI indicator tends to go to 1 at the end of the project, decreasing the error of the false warning signal.

* Projects that finish behind schedule (scenarios 7, 8 and 9): the ES method outperforms the other methods, which means it can be used to detect problems in projects. However, in scenario 7, the ED method has the best performance. This, too, is a scenario for which the SPI(t) indicator gives a false warning signal.

The table also reveals that the forecasting metrics do not perform very well when the SCI is used as a performance measure. Hence, correcting the forecasting metrics with cost information (the CPI is used in the denominator) does not lead to reliable results for the three methods, and this correction should be excluded. Note that we have tested the significance of all differences with a non-parametric test in SPSS. All differences as indicated in the table (the best performing method has been indicated in bold) were statistically significant at the 5% level.

4.2. Forecast accuracy and the completion of work

In this section, we analyse the behaviour of the three schedule forecasting methods along the completion stage of the project, as discussed in Section 3.3. We divide our computational tests into two sub-tests. In a first simulation run (see Tables 2 and 3), we analyse the MAPE and MPE of the forecasting techniques under the assumption that the project will end sooner than expected (AD < PD). In a second simulation run (see Tables 4 and 5), we assume that the project is behind schedule, that is, AD > PD. To that purpose, we have influenced the AD of each activity by randomly generating a number from a triangular distribution, with a tail to the left (ahead of schedule) or to the right (behind schedule).

Table 2 (MAPE) reveals that the ES method almost always outperforms all other methods, regardless of the stage of completion. Only ES1 (ES3) shows a slightly worse performance than PV1 (PV3) in the early and middle (early) stages, but performs better in the late stages. Moreover, the forecast accuracy increases along the stage of completion, resulting in very low absolute percentage errors in the late stages. Table 3 shows an overall overestimation for all forecasting methods with a PF = 1 (ie PV1, ED1 and ES1). This behaviour is intuitively clear: since the PF equals 1, outstanding performance in the past (SPI > 1 and SPI(t) > 1) will not


Table 4 The MAPE for projects with AD > PD (project delay)

      Early               Middle                                                         Late
      0-20  0-30  0-40  20-60 20-70 20-80 30-60 30-70 30-80 40-60 40-70 40-80  60-100 70-100 80-100

PV1   0.11  0.10  0.10  0.08  0.08  0.08  0.08  0.08  0.08  0.08  0.07  0.07   0.08   0.08   0.08
PV2   0.13  0.13  0.12  0.10  0.09  0.09  0.09  0.09  0.08  0.09  0.08  0.08   0.08   0.08   0.08
PV3   0.32  0.29  0.27  0.19  0.18  0.17  0.17  0.16  0.15  0.16  0.15  0.14   0.09   0.09   0.08
ED1   0.11  0.11  0.10  0.09  0.09  0.08  0.09  0.08  0.08  0.08  0.08  0.08   0.08   0.08   0.08
ED2   0.13  0.13  0.12  0.10  0.09  0.09  0.09  0.09  0.08  0.09  0.08  0.08   0.08   0.08   0.08
ED3   0.31  0.27  0.24  0.15  0.14  0.12  0.13  0.12  0.11  0.12  0.11  0.09   0.08   0.08   0.08
ES1   0.11  0.11  0.10  0.08  0.08  0.07  0.08  0.07  0.07  0.08  0.07  0.07   0.04   0.03   0.03
ES2   0.13  0.12  0.11  0.08  0.08  0.07  0.07  0.07  0.06  0.07  0.06  0.06   0.03   0.03   0.02
ES3   0.33  0.29  0.26  0.17  0.15  0.14  0.14  0.13  0.11  0.12  0.11  0.10   0.05   0.04   0.03

Table 5 The MPE for projects with AD > PD (project delay)

      Early                  Middle                                                               Late
      0-20   0-30   0-40   20-60  20-70  20-80  30-60  30-70  30-80  40-60  40-70  40-80   60-100 70-100 80-100

PV1   0.11   0.10   0.09   0.07   0.07   0.07   0.07   0.06   0.06   0.06   0.06   0.06    0.07   0.07   0.08
PV2  -0.03  -0.03  -0.02  -0.01   0.00   0.00   0.00   0.00   0.01   0.00   0.00   0.01    0.05   0.06   0.07
PV3  -0.26  -0.24  -0.22  -0.16  -0.15  -0.14  -0.14  -0.13  -0.12  -0.14  -0.13  -0.12   -0.07  -0.06  -0.04
ED1   0.11   0.11   0.10   0.08   0.08   0.08   0.08   0.07   0.07   0.07   0.07   0.07    0.07   0.07   0.08
ED2  -0.03  -0.03  -0.02  -0.01   0.00   0.00   0.00   0.00   0.01   0.00   0.00   0.01    0.05   0.06   0.07
ED3  -0.24  -0.22  -0.19  -0.12  -0.10  -0.08  -0.09  -0.08  -0.06  -0.08  -0.07  -0.05    0.03   0.04   0.06
ES1   0.11   0.10   0.10   0.08   0.07   0.07   0.07   0.07   0.06   0.07   0.07   0.06    0.03   0.03   0.02
ES2  -0.04  -0.05  -0.04  -0.03  -0.03  -0.02  -0.02  -0.02  -0.01  -0.01  -0.01  -0.01    0.00   0.00   0.00
ES3  -0.26  -0.24  -0.22  -0.14  -0.13  -0.11  -0.12  -0.10  -0.09  -0.10  -0.09  -0.08   -0.03  -0.02  -0.02

be accounted for in the estimate of the remaining work. However, all other methods take a corrective factor into account (either SPI, SPI(t) or these factors multiplied by the CPI) in order to predict the remaining work based on the excellent performance of the past. However, the SPI (used for the PV2 and ED2 methods) fails to incorporate the excellent performance in its estimates and hence leads to an overestimation along all stages of the project. The use of the SPI(t) metric (ES2), however, shows an excellent and improving performance along the completion stages (from the middle stages on, the MPE is almost always close to zero). As mentioned previously, the extra correction factor CPI (PF = SCI) biases the results and leads to over-optimistic results for all forecasting methods. Anbari (2003) points out that the TEAC adjusted for cost performance 'may provide a better indication of estimated time at completion, when adherence to budget is critical to the organization'. He points out that additional time may be needed to bring the project back on budget (by reducing resources applied to the project, taking additional time to find better prices for equipment and material, and similar actions). In our paper, we did not consider the impact of cost performance on the schedule.

Similar observations have been found, although less extreme, when the project finishes later than expected. In this case, SPI < 1 and SPI(t) < 1 are used as corrective factors

to increase the remaining work estimate for the poor performance in the past. However, poor performance results in an SPI < 1 and SPI(t) < 1 in the early and middle stages, but the SPI tends to go to 1 in the late stages even though the project is late (and hence, SPI(t) < SPI in the late stages). Since the SPI and SPI(t) metrics are used in the denominator of the formulas PV2 and ED2, the duration forecast will be lower than the ES2 forecast (based on the reliable SPI(t) metric). This explains the underestimation for the PV2 and ED2 methods during the late stages, and the improving forecast accuracy behaviour for the ES methods.

4.3. Influence of the network structure on the forecast accuracy

Jacob and Kane (2004) argue that EV metrics and the corresponding forecasting indicators can only be used at the level of an individual activity. Indeed, a delay in a non-critical activity might give a false warning signal to the project manager, and hence, wrong corrective actions can be taken. However, we mentioned earlier that we deliberately ignored this remark in our simulation approach for practical reasons (the project manager is usually interested in the status of the overall project and has no time to calculate every metric on the activity level) and calculated these performance


[Nine panels, one per scenario, plotting the MPE of the nine forecasting variants PV1, PV2, PV3, ED1, ED2, ED3, ES1, ES2 and ES3 against the SP indicator.]

Figure 8 The influence (MPE) of the serial or parallel networks for the nine scenarios.

measures on the project level, and not on the level of each individual activity. The possible bias of our approach (project level) compared to the ideal approach (activity level) is influenced by the structure of the network, and more precisely by the number of critical activities in the network.

Therefore, we have calculated for each network from our data set how close it is to a serial or parallel network. We measure this closeness by a serial/parallel indicator, SP ∈ [0, 1]. More precisely, when SP = 0 all activities are in parallel, and when SP = 1 we have a completely serial network. Between these two extreme values, networks can be closer to a serial or to a parallel network. This allows us to investigate the influence of the closeness of each network to a serial or parallel network (SP) on the accuracy (measured by the MPE) of the forecasting measures. Since the SP indicator is directly linked with the number of possible critical activities in a network (the closer the SP to 1, the more critical activities in the network), this indicator might serve as an aid to detect in which cases our project-level approach (compared to the activity-level approach of Jacob and Kane (2004)) leads to wrong project duration forecasts.
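The text only fixes the endpoints of the SP indicator; a common operationalization from the network-generation literature, which we adopt here purely as an assumption, is SP = (m - 1)/(n - 1), where n is the number of activities and m the maximal progressive level (the length, in activities, of the longest chain):

```python
def sp_indicator(predecessors):
    """Serial/parallel indicator SP = (m - 1) / (n - 1) for an acyclic
    activity-on-the-node network given as {activity: [predecessors]}.
    This operationalization is an assumption on our part, consistent
    with the endpoints described in the text."""
    n = len(predecessors)
    level = {}
    while len(level) < n:  # progressive-level computation by relaxation
        for act, preds in predecessors.items():
            if act not in level and all(p in level for p in preds):
                level[act] = 1 + max((level[p] for p in preds), default=0)
    m = max(level.values())
    return (m - 1) / (n - 1) if n > 1 else 1.0

serial = {'A': [], 'B': ['A'], 'C': ['B']}   # a chain of three activities
parallel = {'A': [], 'B': [], 'C': []}       # three parallel activities
print(sp_indicator(serial), sp_indicator(parallel))  # 1.0 0.0
```

A mixed network, for example a two-activity chain with one parallel activity, falls between the two extremes (SP = 0.5 for three activities).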

Figure 8 displays the MPE for the networks with varying values of the SP indicator, ranging from 0.1 (close to a parallel network) to 0.9 (close to a serial network), in steps of 0.1. These graphs are ranked in the same way as the nine scenarios of Figure 7 and read as follows: the first graph is

scenario 1, the second graph is scenario 4, the third graph is scenario 7, the fourth graph (on the second line) is scenario 2, etc. MPE values larger (lower) than zero indicate an under-estimation (over-estimation) of the AD by the forecasting metrics.

The results of Figure 8 can be summarized as follows.

First, the graphs reveal that the network structure clearly influences the forecast accuracy. Indeed, almost all graphs show an improving forecast performance (closer to zero) for all forecasting methods for increasing values of SP (ie more serial networks). The main reason is that the number of non-critical activities decreases for increasing SP values, and hence, the probability of drawing wrong conclusions decreases (delays in non-critical activities were the cause of the mis-interpretations shown by Jacob and Kane (2004)). Only the graphs for scenarios 2 and 8 show a deteriorating trend for increasing values of SP. Again, this observation confirms the possible mis-interpretations that can be made due to effects in non-critical activities. Both scenarios have no change (neither a delay, nor a duration decrease) in the non-critical activities (see Figure 7), and hence, the aforementioned mis-interpretations are completely excluded. Secondly, most graphs reveal that the SPI or SPI(t) indicator is an important factor in the forecasting formulas and hence might give a false warning signal. As an example, the graphs for scenarios 1, 4 and 7 all show an underestimation of the final project duration. This


is the result of SPI(t) > 1, which measures a project ahead of schedule. However, the project of scenario 4 (scenario 7) finishes on time (later than expected), which explains the underestimations for all the forecasting methods. A similar, but opposite, reasoning holds for scenarios 3 and 6, which both measure a project delay (SPI(t) < 1) although the project finishes early or on time (AD ≤ PD), resulting in a clear over-estimation of the project's duration. Last, almost all scenarios reveal that the extra correction factor SCI leads to a low forecast accuracy when used to forecast a project's duration. As an example, the PV3, ED3 and ES3 metrics clearly have a very low accuracy for scenarios 1, 3, 4, 6 and 7 when SP values are low. Note that scenario 5 is an ideal scenario, with no deviations whatsoever, resulting in a 0% MPE value.

5. Conclusions

In this paper, we have tested the forecast accuracy of three different project duration forecasting methods, the PV method, the ED method and the ES method, based on extensive simulations on a large set of generated networks. First, we controlled the topological structure of the generated networks using an existing and reliable network generator, in order to make the link between the project network and the forecast accuracy. Secondly, we split the network structure into critical and non-critical activities to measure their influence on the forecast accuracy. Last, we have influenced the behaviour of the SPI(t) indicator as a correct or false warning signal, and hence, the type of uncertainty (risk) has been carefully controlled, resulting in nine different test scenarios. The results reveal that the ES method outperforms, on average, all other forecasting methods. The closeness of a network to a serial or parallel network directly influences the activity slack and has an impact on the accuracy of the forecasts.

This research is highly relevant to both academics and practitioners. From an academic point of view, it is interesting to measure the influence of the network structure on the behaviour of both scheduling and monitoring tools, and hence, this research serves as a call to researchers to focus further research attention on specific problem or project instances. Indeed, rather than developing tools and techniques for general problems that have an excellent behaviour on average, one can develop an excellent method for a set of specific problem instances belonging to a certain class or set of network structures. This research area is closely related to the phase transition research that has been described in many research papers (see eg Herroelen and De Reyck (1999), who describe the concept of phase transitions from a project scheduling point of view). Research evaluating the forecast accuracy of project duration forecasting methods is, to the best of our knowledge, completely void. Hence, this study can be used by practitioners who use these metrics and principles on a daily basis, but are unaware of the merits and the pitfalls of each individual method. Moreover, we believe that this research paper also

provides a framework for future research purposes and hence, can be used as a guide for our future research. Our future research ideas are threefold. Firstly, we want to continue our research in order to improve the forecast accuracy of the EV metrics. More precisely, the ES method can be extended and refined by taking network-specific information into account, in order to improve the early-warning potential of the metrics. Secondly, we want to use network-specific information to select a priori the best forecasting method for the particular project in execution. Based on the topological network structure and simple rules of thumb, software should be able to calculate the project duration forecast and give an indication of the reliability of the forecast. Last, we want to extend our simulation framework and investigate the best set of corrective actions and the required performance needed to execute these actions, given the network structure, the forecasting method and the values of the EV metrics. In doing so, the project manager gains insight into the different possible corrective actions and their link with the project, the network, the selected method, etc. Interesting references for this research track can be found in Lipke (2003b, 2004) and Cioffi (2006). Anbari (2003) highlights the importance of such corrective and preventive actions and points out that such forecasts help 'focus management's interest on projects or work packages that need the most attention'.

References

Anbari FT (2003). Earned value project management methods and extensions. Project Mngt J 34: 12-23.

Christensen DS (1993). The estimate at completion problem: A review of three studies. Project Mngt J 24: 37-42.

Christensen DS, Antolini RC and McKinney JW (1995). A review of EAC research. J Cost Anal Mngt (Spring):41-62.

Cioffi DF (2006). Completing projects according to plans: An earned- value improvement index. J Opl Res Soc 57: 290-295.

Cooper KG (2003). Your project's real price tag-letters to the editor. Harvard Business Review 81: 122.

Covach J, Haydon JJ and Riether RO (1981). A Study to Determine Indicators and Methods to Compute Estimate at Completion (EAC). ManTech International Corporation: Virginia.

Demeulemeester E, Vanhoucke M and Herroelen W (2003). RanGen: A random network generator for activity-on-the-node networks. J Scheduling 6: 13-34.

Elmaghraby SE and Herroelen W (1980). On the measurement of complexity in activity networks. Eur J Opl Res 5: 223-234.

Fleming Q and Koppelman J (2003). What's your project's real price tag? Harvard Business Review 81: 20-21.

Fleming Q and Koppelman J (2005). Earned Value Project Management, 3rd ed. Project Management Institute: Newtowns Square, PA.

Henderson K (2003). Earned schedule: A breakthrough extension to earned value theory? A retrospective analysis of real project data. The Measurable News (Summer):13-17, 21-23.

Henderson K (2004). Further developments in earned schedule. The Measurable News (Spring):15-16, 20-22.

Henderson K (2005). Earned schedule in action. The Measurable News (Spring):23-28, 30.



Herroelen W and De Reyck B (1999). Phase transitions in project scheduling. J Opl Res Soc 50: 148-156.

Jacob D (2003). Forecasting project schedule completion with earned value metrics. The Measurable News (March): 1, 7-9.

Jacob DS and Kane M (2004). Forecasting schedule completion using earned value metrics ... revisited. The Measurable News (Summer): 1, 11-17.

Lipke W (2003a). Schedule is different. The Measurable News (Summer): 31-34.

Lipke W (2003b). Deciding to act. Crosstalk 12: 22-24.

Lipke W (2004). Connecting earned value to the schedule. The Measurable News (Winter): 1, 6-16.

Loch CH, DeMeyer A and Pich MT (2006). Managing the Unknown: A New Approach to Managing High Uncertainty and Risk in Projects. John Wiley and Sons, Inc.: New Jersey.

Patterson JH (1976). Project scheduling: The effects of problem structure on heuristic scheduling. Nav Res Log 23: 95-123.

Tavares LV, Ferreira JA and Coelho JS (1999). The risk of delay of a project in terms of the morphology of its network. Eur J Opl Res 119: 510-537.

Tavares LV, Ferreira JA and Coelho JS (2002). A comparative morphologic analysis of benchmark sets of project networks. Int J Project Mngt 20: 475-485.

Tavares LV, Ferreira JA and Coelho JS (2004). A surrogate indicator of criticality for stochastic networks. Int Transac Opl Res 11: 193-202.

Vandevoorde S and Vanhoucke M (2006). A comparison of different project duration forecasting methods using earned value metrics. Int J Project Mngt, 24: 289-302.

Vanhoucke M, Coelho JS, Tavares LV and Debels D (2004). On the topological structure of a network. Working Paper 04/272, Ghent University.

Received July 2005; accepted July 2006