

European Journal of Operational Research 214 (2011) 308–316


journal homepage: www.elsevier.com/locate/ejor

Stochastics and Statistics

Proactive policies for the stochastic resource-constrained project scheduling problem

Filip Deblaere, Erik Demeulemeester *, Willy Herroelen

Research Center for Operations Management, Faculty of Business and Economics, Katholieke Universiteit Leuven, Naamsestraat 69, 3000 Leuven, Belgium

Article info

Article history:
Received 26 April 2010
Accepted 20 April 2011
Available online 28 April 2011

Keywords:
Project scheduling
Proactive scheduling
Execution policies
Stochastic RCPSP

0377-2217/$ - see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.ejor.2011.04.019

* Corresponding author. Tel.: +32 16 326972; fax: +32 16 326624.
E-mail addresses: [email protected] (F. Deblaere), [email protected] (E. Demeulemeester), [email protected] (W. Herroelen).

Abstract

The resource-constrained project scheduling problem involves the determination of a schedule of the project activities, satisfying the precedence and resource constraints while minimizing the project duration. In practice, activity durations may be subject to variability. We propose a stochastic methodology for the determination of a project execution policy and a vector of predictive activity starting times with the objective of minimizing a cost function that consists of the weighted expected activity starting time deviations and the penalties or bonuses associated with late or early project completion. In a computational experiment, we show that our procedure greatly outperforms existing algorithms described in the literature.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

Most of the literature in resource-constrained project scheduling involves the determination of a deterministic schedule (with fixed, deterministic activity starting times and activity durations) that can serve as a guideline for the actual execution of the project. During the execution of a project, unexpected events can occur that cause deviations from this schedule. Examples of such events are equipment failure, weather delay, under- or overestimation of the work content, etc. The majority of these types of events can be modelled as an activity duration increase or decrease.

In the literature, two important research tracks can be identified in the field of project scheduling under uncertainty. A first research track is concerned with the use of a deterministic baseline schedule for the resource-constrained project scheduling problem (or RCPSP) when statistical information about possible disruptions is known. Proactive scheduling procedures can then be used to construct a deterministic schedule that incorporates some protection against possible disruptions (e.g. through time buffering or resource buffering), and reactive scheduling procedures can be invoked during the execution of the project, when disruptions render this schedule infeasible. Typically, the objective of the reactive scheduling procedures is to generate a deterministic reactive schedule that is feasible with respect to the newly available information, and deviates as little as possible from the original baseline schedule. The objective of minimizing deviations from the baseline schedule is sometimes referred to as the stability objective (Van de Vonder, 2006). In the field of proactive/reactive scheduling under the stability objective, the problem of coping with activity duration variability has been tackled in Van de Vonder (2006) and Van de Vonder et al. (2007, 2008). The problem of uncertainty with respect to resource availability has been addressed by Lambrechts et al. (2008a,b) and Lambrechts (2007). In Yu and Qi (2004), Zhu et al. (2005) and Deblaere et al. (2011), some procedures are presented for reactive scheduling in the multi-mode RCPSP.

A second research track deals with the stochastic resource-constrained project scheduling problem (stochastic RCPSP or SRCPSP), problem m,1|cpm,dj|E(Cmax) in the classification of Herroelen et al. (2000). In the SRCPSP, project activities have a known activity duration distribution, and the objective is typically to minimize the expected makespan E(Cmax). Instead of using a deterministic schedule, the execution of an SRCPSP instance is determined through a so-called scheduling policy (Möhring et al., 1984, 1985). The execution of the project is then treated as a multi-stage decision process where at each decision point the policy acts as a scheduling rule, determining which activities are to be started next. The vast majority of the research in the area of the SRCPSP is concerned with the expected makespan objective only. To the best of our knowledge, no procedures exist that generate SRCPSP policies with some sort of stability objective.

In this paper, we describe a novel proactive project execution approach for single-mode resource-constrained projects with stochastic activity durations. We assume that statistical information is known about the activity durations, and we use this information to generate a dynamic execution policy as well as a vector of predictive activity starting times such that the sum of the weighted expected activity starting time deviations and the (expected)


bonuses or penalties associated with early or late project completion are minimized. We show that the proposed approach is both more general and more effective than other state-of-the-art proactive scheduling procedures for the RCPSP with the stability objective described in the literature.

The remainder of this paper is organized as follows. In the next section, we give a formal problem description. Section 3 presents a way to derive the optimal predictive starting times for inflexible activities, given a project execution policy. In Section 4, we describe and motivate our choice to work with the class of resource-based policies with release times. Section 5 is devoted to our main procedure, in which an initial resource-based policy with release times is optimized for our objective function. The entire solution approach is subsequently illustrated on a simple example in Section 6. In Section 7 we discuss the results of a number of computational experiments and, finally, we present our conclusions.

2. Problem description

We assume that the activity network of a resource-constrained project in activity-on-the-node representation is given by a graph G(N, A), where the nodes N = {0, 1, …, n + 1} represent the activities and the arcs (i, j) ∈ A represent the zero-lag finish-start precedence relations. The activities 0 and n + 1 are the dummy start and the dummy end activity, respectively. We assume the presence of a set of renewable resource types K with a per-period availability ak, ∀k ∈ K. The activities i require a per-period amount rik of resource type k, with i ∈ N and k ∈ K. We assume that the duration of an activity i is represented by a stochastic variable di that follows a known distribution. The dummy start and dummy end activity have a deterministic duration equal to zero.
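To make these data concrete, the following is a minimal sketch of such an instance; the encoding (plain lists and dictionaries), the network, and the discrete uniform duration distributions are our own illustrative assumptions, not taken from the paper.

```python
import random

# A small self-invented instance G(N, A): activities 0 and 4 are the dummy
# start and end, arcs are zero-lag finish-start precedence relations, and a
# single renewable resource type "k1" with per-period availability a_k = 2.
N = [0, 1, 2, 3, 4]
A = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
availability = {"k1": 2}                         # a_k
r = {1: {"k1": 1}, 2: {"k1": 2}, 3: {"k1": 1}}   # r_ik (dummies need nothing)

def sample_durations(rng):
    """One realization of the stochastic durations d_i; the nondummy
    durations are (arbitrarily) discrete uniform on {2, ..., 6} and the
    dummies are deterministically zero."""
    return [0] + [rng.randint(2, 6) for _ in (1, 2, 3)] + [0]

d = sample_durations(random.Random(0))
```

A scheduling procedure would repeatedly call `sample_durations` to obtain scenarios of the random duration vector.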

The execution of a project is performed according to a certain execution policy P. Such a policy can be interpreted as a function P: R+^{n+2} → R+^{n+2} that maps a vector d = (d0, d1, …, dn+1) of activity durations to a vector s = (s0, s1, …, sn+1) of resource and precedence feasible starting times (Stork, 2001). In the research domain of the SRCPSP, one is interested in finding a policy P that minimizes the expected makespan E(Cmax) of the project, given the probability distributions of the stochastic activity durations di.

In this paper, we combine the deployment of a scheduling policy P with the generation of a vector of predictive starting times s̄ = (s̄1, s̄2, …, s̄n) that fulfill the role of estimates for the actually realized starting times s. These predictive starting times can then serve as a basis for certain coordination purposes involved in the successful execution of the project, such as material procurement, resource planning, fixing agreements with subcontractors, etcetera. During the execution of the project, the realized starting time si of activity i may deviate from its predicted starting time s̄i by amounts Δi+ or Δi−, where:

Δi+ = si − s̄i if si > s̄i, and 0 otherwise,   i = 1, …, n   (1)

Δi− = s̄i − si if si < s̄i, and 0 otherwise,   i = 1, …, n   (2)

With a positive deviation Δi+ we can now associate a non-negative cost ci+ per time unit. In a construction project, for example, an activity suffering from a positive deviation may require a crane (i.e., an inflexible resource that needs to be reserved in advance) and construction materials. The cost ci+ could reflect the cost of the prolonged need for the crane as well as additional storage costs for the required materials. With a negative deviation Δi− (i.e., starting an activity earlier than its predictive starting time s̄i), we can associate a non-negative cost ci− per time unit. These costs are perhaps less important or frequent, but may involve e.g. the costs of acquiring the necessary staff and equipment earlier than planned. Activities with nonzero deviation costs are called inflexible activities. Flexible activities have ci+ = ci− = 0.

For the execution of a project, we assume that a due date δ is given. The actual project duration sn+1 may deviate from the due date δ by amounts Δn+1+ or Δn+1−, where:

Δn+1+ = sn+1 − δ if sn+1 > δ, and 0 otherwise,   (3)

Δn+1− = δ − sn+1 if sn+1 < δ, and 0 otherwise.   (4)

Rewarding early project completions Δn+1− with a penalty cost cn+1− ≤ 0 (i.e., an early completion bonus) and penalizing due date violations by a penalty cost cn+1+ ≥ 0, the policy execution cost C of a policy P and its associated vector of predictive starting times s̄ can be written as:

C(P, s̄) = Σ_{i=1}^{n+1} E(Δi+ ci+ + Δi− ci−).   (5)

We aim to construct a project execution policy P and a vector of predictive starting times s̄ that attempt to minimize the policy execution cost (5) for a given resource-constrained project with known activity duration distributions. We show in the next section that the problem of generating the predictive starting time vector that minimizes objective (5) can be solved to optimality. The heuristics used for generating a suitable execution policy P are discussed in Sections 4 and 5.
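Objective (5) is estimated by simulation later in the paper; the following sketch shows how one such estimate could be computed. All numbers (costs, due date, predictive times) and the stand-in "policy" that merely jitters a baseline schedule are our own toy assumptions for illustration.

```python
import random

# Toy data: activities 1-3, with i = 4 the dummy end; activity 2 is
# flexible (zero deviation costs); c4+ is the lateness penalty and
# c4- <= 0 is the early completion bonus.
c_plus  = {1: 4.0, 2: 0.0, 3: 2.0, 4: 10.0}
c_minus = {1: 1.0, 2: 0.0, 3: 1.0, 4: -3.0}
s_bar   = {1: 0, 2: 0, 3: 5}     # predictive starting times
due_date = 9                     # the due date delta

def policy_cost_one_run(s):
    """Realized cost of one execution: s maps activities to realized
    starting times, with s[4] the project completion time."""
    cost = 0.0
    for i in (1, 2, 3):
        dev = s[i] - s_bar[i]                       # Delta+ if > 0, Delta- if < 0
        cost += c_plus[i] * max(dev, 0) + c_minus[i] * max(-dev, 0)
    dev = s[4] - due_date
    cost += c_plus[4] * max(dev, 0) + c_minus[4] * max(-dev, 0)
    return cost

# Monte Carlo estimate of (5): average the realized cost over simulated
# executions; the "policy" here just jitters a baseline schedule.
rng = random.Random(42)
runs = [{i: max(0, t + rng.randint(-1, 2))
         for i, t in {1: 0, 2: 0, 3: 5, 4: 9}.items()} for _ in range(1000)]
estimate = sum(policy_cost_one_run(s) for s in runs) / len(runs)
```

Replacing the jitter by an actual policy execution (Section 4) yields the evaluation used throughout Section 5.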

3. Generating optimal predictive starting times

Given an execution policy P, we can easily determine a vector of predictive starting times s̄ that minimizes the objective value (5) using Algorithm 1.

Algorithm 1. Determine optimal predictive starting times

1. Simulate the execution of the project according to P
2. ∀i ∈ {1, 2, …, n}:
3.   use the simulation results to derive the probability distribution of si
4.   determine si_min and si_max, the minimum and maximum value for si observed during the simulation
5.   if ((ci+ ≠ 0) or (ci− ≠ 0)) set p = ci+ / (ci+ + ci−)
6.   else set p = 0.5
7.   determine t ∈ [si_min, si_max] such that P(si ≤ t − 1) < p ≤ P(si ≤ t)
8.   set s̄i = t

For every inflexible activity i we need to determine the value for s̄i that minimizes the policy execution cost Ci(P, s̄i) = E(Δi+ ci+ + Δi− ci−) with respect to activity i. This value for s̄i can be obtained through the formula shown in step 7 of Algorithm 1, which uses the p-value that is calculated in step 5. The formula presented in step 5 of the algorithm is nothing else than the classical fractional formula encountered in the newsvendor problem (Arrow et al., 1951). In the newsvendor problem, one wants to predict the optimal volume of newspapers to have in stock given the demand distribution. Per unit that the demand is overestimated, an overage cost will be incurred corresponding to the cost of an unsold newspaper. Per unit that the demand is underestimated, an underage cost will be incurred due to a lost sale. Clearly, the formula used in the newsvendor problem can be directly applied to our problem, where the overage costs are set equal to ci− and the underage costs are set equal to ci+.

For the flexible activities, the value of the predictive starting time s̄i has no influence on the objective function. Therefore, we set it equal to the median of the stochastic variable si, as this value minimizes the expected absolute deviation of si from s̄i.

Algorithm 1 uses simulation to determine the vector s̄. Since this is a necessary step for the evaluation of the objective function (5), the number of samples of the random vector d required for a sufficiently accurate calculation will be motivated by the resulting error in the objective function. As will be pointed out in Section 5.1, 1000 samples suffice for that purpose. The computational complexity of the underlying problem forces us to use simulation instead of an analytical evaluation of the objective function. Hagstrom (1988) has shown that in PERT networks, the calculation of the probability of a single point in the distribution of sn+1 is #P-complete when the activities have two-point distributions. Although this particular result only covers the case of two-point distributions, it is generally assumed that the result can be extended to more complex distributions, which renders the analytical determination of activity starting time distributions very cumbersome. Moreover, in our setting the problem becomes even more complex because the resource constraints need to be taken into account.

4. Policy class selection

Before we can generate a policy with the objective of minimizing (5), we must first decide upon a suitable policy class. For the basic SRCPSP, a number of policy classes have been proposed in the literature (see e.g. Stork (2001), Ashtiani et al. (2011)). In this paper, we will use the class of resource-based policies with release times. Resource-based policies (see Ashtiani et al. (2011)) use the same scheduling logic as the well-known parallel schedule generation scheme, but the decision times are now determined by the realized activity durations, so that the schedule generation scheme must be used as a dynamic scheduling rule.

We represent a policy P by a pair P = (p, r), where p = (p1, p2, …, pn) represents a vector of priorities and r = (r1, r2, …, rn) represents a vector of release times for the nondummy activities. The execution of a project according to P then proceeds as follows. At every decision time t, a parallel schedule generation scheme is invoked that uses the priority list implied by the priorities pi and only starts a nondummy activity i if the additional condition t ≥ ri is satisfied. The use of release times ri will enable us to control the variability of the starting times of the inflexible activities. For the sake of completeness, we mention that the starting time of the dummy start activity is always equal to zero, while the starting time of the dummy end activity always equals the finish time of the latest finishing nondummy activity. As a consequence, no priorities or release times are defined for the two dummy activities.
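The execution logic described above can be sketched as follows for a single renewable resource type; the function name, data layout, and the toy instance at the bottom are our own assumptions, and the realized durations are passed in as if already observed.

```python
import heapq

def execute_policy(arcs, req, avail, priority, release, durations):
    """Sketch of a resource-based policy with release times: at each
    decision time t, start eligible activities in decreasing-priority
    order, subject to precedence, capacity, and the condition t >= r_i."""
    n1 = len(durations)                      # activities 0 .. n1-1, dummies at the ends
    preds = {i: set() for i in range(n1)}
    for a, b in arcs:
        preds[b].add(a)
    start, finish = {}, {}
    running = []                             # heap of (finish_time, activity, units)
    unstarted, t, cap = set(range(n1)), 0, avail
    while unstarted:
        for i in sorted(unstarted, key=lambda a: -priority.get(a, 0)):
            if (preds[i] <= set(finish) and t >= release.get(i, 0)
                    and req.get(i, 0) <= cap):
                start[i] = t
                heapq.heappush(running, (t + durations[i], i, req.get(i, 0)))
                cap -= req.get(i, 0)
                unstarted.discard(i)
        if not unstarted:
            break
        # next decision point: earliest completion, or earliest pending
        # release of a precedence-eligible activity that would fit
        pending = [release.get(i, 0) for i in unstarted
                   if preds[i] <= set(finish) and release.get(i, 0) > t
                   and req.get(i, 0) <= cap]
        nxt = min(pending) if pending else None
        if running and (nxt is None or running[0][0] <= nxt):
            ft, i, units = heapq.heappop(running)
            finish[i] = ft
            cap += units
            t = max(t, ft)
        elif nxt is not None:
            t = nxt
        else:
            raise ValueError("deadlock: instance or policy is infeasible")
    while running:
        ft, i, units = heapq.heappop(running)
        finish[i] = ft
    return start, finish

# toy instance: dummies 0 and 4, capacity 2, one duration realization
arcs = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
start, finish = execute_policy(arcs, {1: 1, 2: 2, 3: 1}, 2,
                               {1: 3, 2: 2, 3: 1}, {}, [0, 3, 2, 4, 0])
print(start)   # {0: 0, 1: 0, 2: 3, 3: 5, 4: 9}; start[4] is the makespan
```

Running this function over many sampled duration vectors, with fixed (p, r), is exactly the kind of simulation the evaluation of (5) relies on.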

Because the class of resource-based policies with release times represents a very large search space, it is important to find good starting values for the vectors p and r. We do this by heuristically solving the RCPSP that corresponds to the given project instance, using median activity durations as point estimates for the stochastic activity durations di. The schedule s = (s0, s1, …, sn+1) so obtained is then transformed into a resource-based policy with release times by using a priority vector p that corresponds to a smallest-si-first ordering of the activities, and by using the starting times si as the release times ri for the nondummy activities. We will refer to this special policy as P0. In Deblaere et al. (2010b) it is thoroughly argued why P0 is a good starting point if we aim to minimize (5). In short, P0 appears to have good properties with respect to makespan performance (which is important if we aim to minimize (5)), and the release times will enable us to control the variability in the starting times of the inflexible activities.

5. Policy generation

In this section, we describe our main procedure that uses a heuristic schedule s (and the policy P0 derived from it) as a starting point to generate a policy P that minimizes (5). The detailed steps of the procedure for generating P and a set of predictive starting times s̄ are shown in Algorithm 2. We call this procedure Simulation-based Descent (SBD), as the procedure is basically a combination of four descent procedures that use simulation to evaluate the objective function. The SBD procedure consists of three parts. First, using a deterministic input schedule, an initial execution policy P1 is determined (see Section 5.1) by means of a descent procedure applied to P0. We obtain the final policy P by means of an improvement heuristic (see Section 5.2) that uses P1 as a starting point, and finally a set of predictive starting times s̄ is determined using Algorithm 1.

Algorithm 2. Simulation-based Descent (SBD)

Input: deterministic schedule s
Output: execution policy P = (p, r) and predictive starting times s̄

1. P1 ← generate an initial policy using s (see Algorithm 3)
2. P ← improve the initial policy P1 (see Algorithm 4)
3. determine s̄ using Algorithm 1, given P

5.1. Generating an initial policy

In a first step, an initial policy P1 is determined using the deterministic input schedule s. The logic of this procedure is shown in Algorithm 3. Algorithm 3 is essentially a descent procedure that uses the policy P0 as a starting point and improves this policy by increasing the release time ri of certain activities, thereby reducing their starting time variability. First, the policy P0 is determined by setting the release times equal to the starting times in the input schedule s, and by ordering the activity priority list in increasing order of the activity starting times derived from the input schedule s that was obtained by (heuristically) solving an RCPSP instance with median activity durations M(di). This priority list is not changed during the execution of Algorithm 3. In Van de Vonder et al. (2007) it is shown that among a large number of priority lists, this ordering performs very well in minimizing the expected deviations from the input schedule.

During the descent procedure, the set of release times ri is kept resource and precedence feasible (in the sense that the ri would represent a set of feasible activity starting times with respect to the median activity durations M(di)). We have empirically established that, in a descent procedure as described in Algorithm 3, this gives much better results than the alternative (i.e. not imposing any constraints on the ri). A possible explanation is that local search procedures are more effective if the initial search space is reduced. Indeed, by restricting the vector r to the set of feasible starting time vectors, we will be able to move across larger "distances" in the search space, whereas the unrestricted variant will possibly only observe a very local sample of solution candidates. Also, restricting the initial search space better conserves the structure of the input schedule s, which could prove beneficial for the expected makespan of the resulting policy. This, in turn, will have a positive effect on the objective function (5).

In order to maintain resource feasibility of the vector r, we use a resource flow network (Artigues et al., 2003; Leus, 2003; Leus and Herroelen, 2004). This network reflects the way in which renewable resources are passed on between the various project activities in a given schedule. It has the same set of nodes N as the original project network G(N, A), but resource arcs AR connect two nodes i and j if there is a strictly positive resource flow fijk of any resource type k from activity i (when it finishes) to activity j (when it starts). A project network G(N, A) augmented with a set of resource arcs AR that also act as additional precedence relations has the property that any early start schedule is resource feasible, regardless of the actual activity durations di that may differ from their median value M(di). In other words, there are no forbidden sets in the project network G(N, A ∪ AR). In Algorithm 3, we use the single-pass algorithm by Artigues et al. (2003) to generate a resource flow network.
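The idea of a resource flow can be illustrated with a greedy sketch in the same spirit; note that this is our own simplified version for a single resource type, not the cited single-pass algorithm itself, and the schedule data below are toy values.

```python
# Scan activities in nondecreasing start time and let each activity draw
# its required units from activities that are already finished when it
# starts (the dummy start, listed first, carries the full availability).
def build_resource_flows(start, dur, req, avail):
    """Return f[(i, j)] = units of the resource passed from i to j."""
    acts = sorted(start, key=lambda i: start[i])
    carry = {acts[0]: avail}          # units an activity can still pass on
    flows = {}
    for j in acts[1:]:
        need = req.get(j, 0)
        for i in acts:
            if need == 0:
                break
            if i == j or carry.get(i, 0) == 0:
                continue
            if start[i] + dur[i] <= start[j]:   # i has finished when j starts
                take = min(need, carry[i])
                flows[(i, j)] = take
                carry[i] -= take
                need -= take
        if need > 0:
            raise ValueError("schedule is not resource feasible")
        carry[j] = req.get(j, 0)      # j can pass its units on once it finishes
    return flows

flows = build_resource_flows({0: 0, 1: 0, 2: 3, 3: 5, 4: 9},
                             [0, 3, 2, 4, 0], {1: 1, 2: 2, 3: 1}, 2)
print(flows)   # e.g. {(0, 1): 1, (0, 2): 1, (1, 2): 1, (2, 3): 1}
```

The keys of `flows` that are not already precedence arcs form the resource arc set AR; adding them to A makes every early start schedule resource feasible.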

Algorithm 3. Generate initial policy P1

Input: deterministic schedule s = (s0, s1, …, sn+1)
Output: execution policy P1 = (p, r)

1. Determine P0 as follows:
   ∀i ∈ N\{0, n + 1}, set ri = si;
   sort the activities i ∈ N\{0, n + 1} according to nonincreasing si; this yields an activity list a = (a1, a2, …, an); set p_{aj} = j, ∀j ∈ N\{0, n + 1};
   set P0 = (p, r)
2. P1 ← P0
3. generate a resource flow network G'(N, AR)
4. generate a topological ordering (l1, l2, …, ln) of the activities i ∈ N\{0, n + 1} such that li < lj ∀(li, lj) ∈ (A ∪ AR)
5. C ← C(P1, s̄)
6. repeat:
7.   Cold ← C
8.   L ← N\{0, n + 1}
9.   save r
10.  while L ≠ ∅:
11.    i ← arg max_{i∈L} Ci(P1, s̄i)
12.    ri ← ri + 1
13.    find the index m such that lm = i
14.    for j = lm+1, lm+2, …, ln:
15.      if (i, j) ∈ (A ∪ AR) and ri + di > rj:
16.        rj ← ri + di
17.    C' ← C(P1, s̄)
18.    if C' < C:
19.      set C ← C' and go to step 9
20.    else:
21.      restore r, set L ← L\{i}
22. until Cold = C

In steps 5, 11 and 17 of Algorithm 3, the objective function (5) must be evaluated. This is done through simulation using 1000 samples of the random vector d. We found that, in 120-activity networks, a sample size of 1000 results in an average relative error of 0.95%, which is sufficiently accurate for optimization purposes. Note that the evaluation of Eq. (5) requires us to use Algorithm 1 to determine the optimal s̄ vector.

In Algorithm 3, a list L is maintained of nondummy activities that are candidates for having their release time ri increased. The activity i ∈ L with the highest contribution to the objective function (5) is considered first. We increase its release time (step 12) and, if necessary, the release times of its successors in the network G(N, A ∪ AR) (steps 13–16). This new vector r is evaluated and, if it results in a decrease of the objective function, we keep the new vector of release times. If this is not the case, we restore the vector r as saved before the move, remove the activity from the list, and continue the search for an improving move in the reduced activity list L. Removing activities from consideration during the while-loop of step 10 not only results in a speed increase, it also results in better solutions with respect to the objective function (5). Apparently, moves that have been rejected once are not the best candidates to reconsider in the near future. Only if the list L is empty and improving moves were made do we revoke the tabu status of all activities and restart the search for improving moves. Doing so ensures that we finish in a local optimum w.r.t. the proposed neighborhood.
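The propagation in steps 12–16 of Algorithm 3 can be sketched as follows. This is a slight generalization of those steps (every arc into a later activity is re-checked, so increases cascade through the network); the function name and the toy data are our own assumptions.

```python
# After increasing release time r_i by one, push the release times of the
# successors in G(N, A ∪ A_R) later, so that the release-time vector stays
# feasible for the median durations d_med; `topo` is a topological ordering.
def increase_release(i, release, d_med, arcs_all, topo):
    release = dict(release)          # work on a copy: the descent may undo the move
    release[i] += 1
    for j in topo[topo.index(i) + 1:]:
        for (a, b) in arcs_all:
            if b == j and release[a] + d_med[a] > release[j]:
                release[j] = release[a] + d_med[a]
    return release

arcs_all = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4)]   # A ∪ A_R
r_new = increase_release(1, {0: 0, 1: 0, 2: 3, 3: 5, 4: 9},
                         [0, 3, 2, 4, 0], arcs_all, [0, 1, 2, 3, 4])
print(r_new)   # {0: 0, 1: 1, 2: 4, 3: 6, 4: 10}
```

Note how the single unit of delay on activity 1 ripples through the resource arc (1, 2) and onwards to the dummy end, which is exactly why such a move can hurt the due-date component of (5) even as it stabilizes one activity.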

5.2. Improving the initial policy

In the second step of the SBD procedure (see Algorithm 2), we subject the initial policy P1 to an improvement procedure. The main differences between the improvement procedure and the initial policy generation procedure are that we now remove the precedence and resource feasibility constraints on the vector r and that we allow the priority vector p to change. The improvement procedure is outlined in Algorithm 4. It consists of three major steps that are repeated as long as one of the steps yields an improvement.

Algorithm 4. Improve the policy P1 = (p, r)

Input: initial policy P1 = (p, r)
Output: improved policy P = (p, r)

1. set P ← P1
2. repeat:
3.   r ← perform a broad neighborhood search on r
4.   p ← perform a neighborhood search on p
5.   r ← perform a narrow neighborhood search on r
6. until none of steps 3, 4 or 5 yielded an improvement

5.2.1. Broad neighborhood search on r

As was the case in Algorithm 3, a list L of nondummy activities is maintained and the activity j ∈ L with the highest contribution to the objective function is considered first. For an activity j, we consider all alternative release times except for the nearest alternatives rj − 1 and rj + 1, because these cases are treated by the narrow descent procedure discussed below. Considered activities are always removed from the list, since for a given activity (almost) all values for rj that could lead to an improvement of the objective function are evaluated. Therefore, a recently considered activity is not likely to lead to another improvement in the first subsequent steps of the procedure. Alternative candidate release times rj are evaluated through simulation using 100 scenarios. We keep track of the best new objective value Cmin with respect to this approximate evaluation of Eq. (5) and of the corresponding release time t*. When t* has been identified, we check whether the corresponding objective value Cmin is an improvement compared to the previous objective value C. We then perform a more accurate evaluation of the objective function through 1000 simulation steps and accept the move if there is indeed an improvement. In the other case, rj is restored to its former value. If the list L is empty and an improvement was obtained, the list is refilled and the improvement loop restarts.


5.2.2. Neighborhood search on p

The second step of Algorithm 4 is concerned with improving the priority vector p. Recall that pi ∈ {1, …, n} represents the priority of activity i. For i ≠ j we have pi ≠ pj, such that the vector p is actually a permutation of the set {1, …, n}. If during the execution of the policy P two activities i1 and i2 become eligible, the activity with the highest pi value is started first. The working logic of the algorithm is best explained through a small example. Suppose we have n = 7 and a priority vector p = (3, 5, 4, 7, 6, 1, 2). Translating this vector into a priority list where the activities are ordered according to decreasing priority yields the list l(p) = (4, 5, 2, 3, 1, 7, 6).

The procedure iterates over all nondummy activities i and verifies whether priority decreases or increases of an activity yield better results with respect to the objective function (5). Let us investigate what happens when the algorithm assesses the priority of activity 3. Increasing p3 will result in the evaluation of the lists l = (4, 5, 3, 2, 1, 7, 6), l = (4, 3, 5, 2, 1, 7, 6) and l = (3, 4, 5, 2, 1, 7, 6). Decreasing the priority will result in the evaluation of the lists l = (4, 5, 2, 1, 3, 7, 6), l = (4, 5, 2, 1, 7, 3, 6) and l = (4, 5, 2, 1, 7, 6, 3).

We call a priority increase or decrease of an activity i effective if the relative ordering of activity i changes in relation to some activity j with ({(i, j), (j, i)} ∩ T(A)) = ∅, where T(A) denotes the transitive closure of the arc set A, defined as T(A) = {(i, j) | there is a path from i to j in the directed graph A}. Only effective priority changes can give rise to a difference in the objective value. Indeed, if i and j are precedence related, we need not evaluate the new priority list as the precedence relation will always be respected regardless of the priority values pi and pj. For a given activity, we will consider kmax effective priority decreases and kmax − 1 effective priority increases. We may consider one less increase because the first effective increase of pi is equivalent to the first effective decrease of the priority of some activity j with pj > pi. Translated to the lists example, we do not need to evaluate the priority increase l = (4, 5, 3, 2, 1, 7, 6) of activity 3 if the priority decrease l = (4, 5, 3, 2, 1, 7, 6) of activity 2 has been evaluated earlier. For a given activity, the first improving priority change will always be accepted, after which the search proceeds with the next activity. The procedure terminates when a full scan of the activity list does not result in an improving move.
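The effectiveness test only needs the transitive closure T(A), which can be precomputed once. A sketch under the paper's definition (function names are ours):

```python
def transitive_closure(n, arcs):
    """T(A) = {(i, j) | there is a path from i to j in the directed graph A}."""
    reach = {i: set() for i in range(1, n + 1)}
    for i, j in arcs:
        reach[i].add(j)
    # Warshall-style closure: route paths through each intermediate node k in turn
    for k in range(1, n + 1):
        for i in range(1, n + 1):
            if k in reach[i]:
                reach[i] |= reach[k]
    return {(i, j) for i in reach for j in reach[i]}

def is_effective(i, j, closure):
    """A reordering of i relative to j matters only if they are not precedence related."""
    return (i, j) not in closure and (j, i) not in closure
```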

5.2.3. Narrow neighborhood search on ~s

In the final step of the improvement procedure (see Algorithm 4), we perform a narrow neighborhood search on the release time vector. A list L contains, as before, the inflexible nondummy activities. Activities in this list are considered in decreasing order of the contribution to the objective function. For an activity i, we consider the alternative release times si + 1 and si − 1. These two alternatives are evaluated using simulation and an improving move is immediately executed. Once a move has been made, we restart the search for an improving move with a full list L. The search terminates as soon as no activity in L has an improving move. The use of this procedure guarantees that we finish in a local optimum with respect to the narrow neighborhood.

Fig. 1. Example instance.
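The ±1 first-improvement descent with full-list restarts can be sketched as follows (an illustrative helper of our own, assuming a generic simulation-based cost oracle `sim_cost`):

```python
def narrow_descent(release, inflexible, contribution, cost, sim_cost):
    """First-improvement descent over the release times s_i - 1 and s_i + 1.
    After every improving move the scan restarts with a full list, so the
    procedure terminates in a local optimum of this narrow neighborhood."""
    release = dict(release)
    improved = True
    while improved:
        improved = False
        # scan activities in decreasing order of objective contribution
        for i in sorted(inflexible, key=contribution, reverse=True):
            for delta in (1, -1):
                trial = dict(release)
                trial[i] = release[i] + delta
                c = sim_cost(trial)
                if c < cost:                  # improving move: execute immediately
                    release, cost = trial, c
                    improved = True
                    break
            if improved:
                break                         # restart with a full list
    return release, cost
```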

6. Example

To show some of the benefits of the SBD procedure, we illustrate it on an example instance. Fig. 1(a) shows a project network and Fig. 1(b) shows the resource profile for the corresponding makespan minimizing RCPSP schedule, with the resource utilization shown on the Y-axis and the time indicated on the X-axis. Every nondummy activity i has a stochastic activity duration di that follows a discretized beta distribution. The characteristics of this distribution will be described in Section 7. The nondummy activities in this particular example all have a median duration M(di) equal to the expected duration E(di). In Fig. 1(a), the activity number is shown inside the node, the expected duration E(di) is shown above the node and the requirement ri1 for the single renewable resource type with per-period availability a1 = 12 is shown below the node. A due date d = 15 is imposed on the project. The per-period penalty of exceeding this due date equals c+7 = 20. There is no earliness bonus. There are two inflexible activities, namely activities 1 and 4, for which we have (c−1, c+1) = (2, 7) and (c−4, c+4) = (4, 8).

The execution of the project is simulated according to three different policies: policy P0, the SBD policy described in the previous section, and a policy generated by applying the STC + D heuristic developed by Van de Vonder et al. (2008). The STC + D heuristic ranks as one of the best proactive time buffering procedures developed in the literature. The STC + D heuristic uses the expected values as point estimates for the stochastic activity durations and takes as input a precedence and resource feasible deterministic schedule. The algorithm then iteratively inserts time buffers in the schedule, while keeping the schedule resource and precedence feasible using a resource flow network. The output of the STC + D procedure is a schedule with buffered activity starting times sbuf_i. These buffered starting times will act as both predictive starting times and as release times. The execution policy proposed by the authors is the same resource-based policy with release times that is used in the policy P0.

Fig. 2 shows a breakdown (with respect to the source of the costs) of the objective function value for the P0 policy, the STC + D policy and the SBD policy. The X-axis corresponds to the policy execution costs. In the horizontal bars, the white zone denotes



Fig. 2. Policy execution costs of P0, STC + D and SBD.


the policy execution cost contribution of activity 1, the grey zone denotes the cost contribution of activity 4 and the black zone denotes the cost contribution due to the due date violation. Both STC + D and SBD succeed in improving the initial policy P0, but the SBD procedure does a much better job than the STC + D procedure. In order to understand what has happened, we need to take a closer look at the behavior of the three different policies.

In Fig. 3, we show a graphical illustration of the different policies in a stochastic Gantt chart. This concept was first introduced by Baker and Trietsch (2009), who refer to it as a predictive Gantt chart. Similar to a regular Gantt chart, the execution intervals of the activities are indicated by a horizontal bar per activity. A detail of the stochastic Gantt chart of Fig. 3(a) is displayed in Fig. 4. Fig. 4 corresponds to the fourth horizontal bar shown in Fig. 3(a). The solid line and the dashed line represent the cumulative probability distribution of the start time and the finish time of activity 4, respectively. The left edge of the grey bars in Fig. 3 represents the minimum value for the start time of an activity. The right edge represents the maximum value for the finish time of the activity. The top edge of a grey bar corresponds to a cumulative probability equal to one, while the bottom edge corresponds to a cumulative probability equal to zero. The activity number is shown to the left of the corresponding horizontal bar. Note that activity 7 is the dummy end activity, and its (start or finish time) distribution is actually the project makespan distribution.

We can now visually compare the three different policies. The stochastic Gantt chart corresponding to the STC + D policy is shown in Fig. 3(b). When we compare this policy to P0 (see Fig. 3(a)), we see that not much has changed. The variability of s1 remains the same while the variability of s4 is reduced to the detriment of the makespan performance. Nevertheless, the increase of the expected penalty due to the violation of the due date is smaller than the reduction of the expected costs with respect to deviations from s4, such that the total costs are reduced. This can also be verified in Fig. 2. The visualization of the SBD policy shows a different picture (see Fig. 3(c)). The algorithm has shifted activity 1 to the very start of the project, eliminating all variation (and associated costs) of s1. This results in a high variability of s2, which is no problem since activity 2 is characterized as flexible. This beneficial switch has as a second consequence that the variability in s4 is easier to control, while the STC + D policy suffers from the precedence constraint 1 → 4. Finally, the SBD policy allows more variability in the starting time of the flexible activity 6, which is beneficial for the makespan performance. The STC + D procedure cannot make a similar decision, because activities 1 and 6 constitute a forbidden set, forcing the release time sbuf_6 to be greater than or equal to sbuf_1 + E(d1). For a similar reason, the STC + D procedure is unable to schedule activity 1 before activity 2.

This example shows that by making the policy generation procedure independent from deterministic schedules or resource constraints, the SBD procedure is able to generate execution policies that are superior to those generated by the STC + D procedure in terms of the policy execution cost (5).

7. Computational results

7.1. Computational setup

In this section we verify whether the favorable result of the example in the previous section can be generalized. For our computational experiments we used the well-known J30, J60 and J120 instance sets of PSPLIB (Kolisch and Sprecher, 1997). Both the STC + D procedure and the SBD procedure require a deterministic RCPSP schedule as part of the input. For the problems of the J30 instance set, we have used the makespan minimizing branch-and-bound algorithm of Demeulemeester and Herroelen (1992, 1997) for that purpose. Heuristic schedules for the J60 and J120 instance sets have been obtained by the combined crossover algorithm by Debels and Vanhoucke (2006), using 25,000 schedule generations as a stopping criterion. This algorithm is an improved version of the decomposition-based GA described in Debels and Vanhoucke (2007) that has been shown to be among the best performing metaheuristic RCPSP procedures for both small and larger data sets. For every activity i, we generated a discretized beta distribution with shape parameters 2 and 5, and an expected value E(di) equal to the deterministic PSPLIB activity duration. The activity duration distribution has either a small, medium or large variability, all with equal probability. In the case of a small variability, the minimum and maximum values of the stochastic variable di are equal to 0.75 · E(di) and 1.625 · E(di), respectively. In the case of a medium and a large variability, the corresponding intervals are [0.5 · E(di), 2.25 · E(di)] and [0.25 · E(di), 2.875 · E(di)], respectively. The probability distributions corresponding to the three possible variability levels are illustrated in Fig. 5 for an activity with E(di) = 8.
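As a consistency check, a beta(2, 5) variable has mean 2/7, so scaling it to each of the three intervals places the continuous mean exactly at E(di); for example, 0.75 · E(di) + (2/7) · (1.625 − 0.75) · E(di) = E(di). A minimal sampler along these lines (rounding to the nearest integer is our assumption; the paper's exact discretization scheme is not specified in this excerpt):

```python
import random

# (min, max) multipliers of E(d_i) for the three variability levels
LEVELS = {"small": (0.75, 1.625), "medium": (0.5, 2.25), "large": (0.25, 2.875)}

def sample_duration(expected, level, rng=random):
    """Draw a discretized beta(2, 5) activity duration scaled to the level's
    interval; the continuous mean of the draw equals `expected`."""
    lo, hi = (m * expected for m in LEVELS[level])
    return round(lo + rng.betavariate(2, 5) * (hi - lo))
```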

Every nondummy activity i has a 50% chance of being inflexible. In that case, the costs c+i and c−i are drawn from a discrete triangular distribution with P(c+i = q) = P(c−i = q) = (21 − 2q)% for q ∈ {1, 2, . . ., 10} (this is the same distribution as used in Van de Vonder et al. (2008)). This distribution results in a higher occurrence probability for low costs and in an average cost cavg = 3.85.
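This distribution is easy to verify: the weights (21 − 2q)% sum to 100% and the mean works out to exactly 3.85. A small sketch (helper names are ours):

```python
import random

# P(c = q) = (21 - 2q)% for q in {1, ..., 10}: low deviation costs are more likely
COSTS = list(range(1, 11))
WEIGHTS = [21 - 2 * q for q in COSTS]           # 19, 17, ..., 1

def sample_cost(rng=random):
    """Draw one activity deviation cost from the discrete triangular distribution."""
    return rng.choices(COSTS, weights=WEIGHTS, k=1)[0]

assert sum(WEIGHTS) == 100                                         # proper distribution
assert sum(q * w for q, w in zip(COSTS, WEIGHTS)) / 100 == 3.85    # c_avg from the text
```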


Fig. 3. Stochastic Gantt charts.

Fig. 4. Stochastic Gantt chart (detail).


Flexible activities have c+i = c−i = 0. All algorithms have been coded in Visual C++, and the results have been obtained on a personal computer equipped with an Intel® Xeon® 2.33 GHz processor.

7.2. Performance of the algorithm

In a first set of experiments, we compare the performance of the SBD procedure to the STC + D procedure. Recall that the STC + D procedure determines a vector of starting times sbuf_i that act as both release times and predictive starting times. Because the STC + D procedure was developed for a less general problem setting than the one described in Section 2, we first restrict ourselves to the problem setting proposed in Van de Vonder et al. (2008). More precisely, we assume symmetric costs, no bonus for the early completion of the project and a marginal cost cn+1 = ⌊10cavg⌋ = 38 per time unit violation of the project due date d. Also, contrary to what is proposed in Van de Vonder et al. (2008), we use the optimal predictive starting times si instead of the buffered starting times sbuf_i for calculating the objective function, resulting in the following formula:

~C = Σ_{i=1}^{n} c_i E|s_i − s̄_i| + c_{n+1} E(max(0, s_{n+1} − d)).   (6)

This allows for a fairer comparison, as both STC + D and the SBD procedure then use Algorithm 1 for the determination of the predictive starting times.
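Given a set of simulated executions of the policy, an objective of the form (6) can be estimated directly. A sketch with our own function and argument names (`scenarios` holds one realized start-time vector per simulation run, with the last entry the start of the dummy end activity n + 1):

```python
def policy_cost(scenarios, predicted, c, c_due, due_date):
    """Monte Carlo estimate of an objective of form (6): expected weighted
    absolute deviations of the realized starts from the predictive starts,
    plus the expected due-date penalty c_{n+1} * E(max(0, s_{n+1} - d))."""
    total = 0.0
    for s in scenarios:
        # zip stops at the n nondummy activities; s[-1] is the realized makespan
        total += sum(ci * abs(si - pi) for ci, si, pi in zip(c, s, predicted))
        total += c_due * max(0.0, s[-1] - due_date)
    return total / len(scenarios)
```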

We have tested the STC + D procedure and the SBD procedure on the J30, J60 and J120 sets of PSPLIB, using a due date d equal to 1.01 · E(sn+1), 1.05 · E(sn+1) and 1.1 · E(sn+1), where E(sn+1) is calculated as the expected makespan when we execute the project using the policy P0. The higher the value of d, the more room the procedures will have to improve the objective function. A low value of d implies that makespan performance is more important than the protection of inflexible activities. In Table 1, we show the average relative improvement of the SBD procedure over the STC + D procedure, calculated as (~C(STC + D) − ~C(SBD)) / ~C(STC + D) × 100%. We see that the policies generated by the SBD procedure are substantially better than the ones generated by the STC + D procedure. The differences become bigger when the number of activities and/or the due date increases.

We have also evaluated both procedures in a more general problem setting. We now allow the costs to be asymmetric (i.e. for a given activity i the deviation costs c+i and c−i are independent draws from the triangular distribution described in the previous section) and we assume the presence of an earliness bonus c−n+1 = −⌊5cavg⌋ = −19. The earliness bonus is (in absolute value) half as large as the tardiness penalty. We want to stress that the STC + D procedure is disadvantaged here because it was not designed to work with asymmetric costs or bonuses for early project completion. The results of this experiment are shown in Table 2. The important observation here is that the differences in performance between the two algorithms can grow very large if the earliness costs or bonuses are ignored, as is the case with the STC + D



Fig. 5. Three levels of activity duration variability.

Table 1. Improvement of the SBD procedure over STC + D.

d                  J30      J60      J120
1.01 · E(sn+1)     18.7%    22.7%    32.8%
1.05 · E(sn+1)     19.9%    24.1%    34.7%
1.10 · E(sn+1)     21.0%    25.4%    35.2%

Table 2. Improvement of the SBD procedure over STC + D in a general problem setting.

d                  J30      J60      J120
1.01 · E(sn+1)     37%      32%      33%
1.05 · E(sn+1)     104%     67%      53%
1.10 · E(sn+1)     166%     515%     378%


algorithm. As before, the relative performance of the SBD procedure increases as the due date increases. The reader may notice a seemingly strange trend in the results when looking at different project network sizes for a given due date. This is because two opposing forces are in effect. On the one hand, the SBD procedure has a better relative performance than the STC + D procedure as the number of activities increases (see the results in Table 1). On the other hand, when the number of (nondummy) activities increases, the relative importance (in the objective function) of bonuses/penalties related to the due date decreases. In other words, an increase in the number of nondummy activities makes the "handicap" of the STC + D procedure (ignoring earliness bonuses) less severe. Some of the percentages in Table 2 are larger than 100% because the corresponding policy execution costs have negative values.

The average computation times (in seconds) of the second experiment are shown in Table 3. The computation times of the SBD procedure are substantially higher than the computation times of the STC + D procedure. This is no surprise, because all the components of the SBD procedure rely on simulation to evaluate the objective function. For the largest instance set, the average

Table 3. Computation times (in seconds).

                   SBD                        STC + D
d                  J30     J60     J120       J30     J60     J120
1.01 · E(sn+1)     8.4     94      1,631      0.1     1.4     12.9
1.05 · E(sn+1)     7.8     84      1,475      0.1     1.7     14.8
1.10 · E(sn+1)     7.1     77      1,371      0.2     2.3     19.1

computation time lies around 25 minutes, which is still acceptable. Furthermore, the imposed due date does not have a very significant effect on the required computation time of the SBD procedure.

The reader will notice that the computation times of the SBD procedure seem to increase with decreasing due date, while the opposite holds for the STC + D procedure. For the SBD procedure, the computation time increase can be attributed to an increase in the computation time of the improvement procedure (Algorithm 4). This is because during the initial policy generation procedure (Algorithm 3) moves only result in release time increases. When we remove the resource constraints during the improvement procedure (Algorithm 4), many release time decreases will result in an improvement of the objective function, because of the small due date. This results in more moves being made in total, and hence a slightly larger computation time.

The computation time of the STC + D procedure increases with the due date because the descent procedure in STC + D only considers resource feasible starting times. When the due date is small, the schedule will have little slack, and only a small number of moves must be considered. A larger due date, on the other hand, implies more slack in the schedule and thus a greater number of moves to be evaluated, hence the larger computation time. In Algorithm 4 there is no notion of slack, and as a consequence the due date does not influence the running time of the algorithm.

8. Conclusions

In this paper, we studied the problem of the execution of a resource-constrained project when activity durations are uncertain. We formulated a problem where one tries to determine a project execution policy and a vector of predictive activity starting times such that the policy execution costs are minimized. These policy execution costs consist of costs related to positive and negative deviations from the predictive activity starting times of inflexible activities, as well as the costs associated with exceeding the project due date and the bonus opportunities related to early project completion. We showed that this problem setting is more general than related problem descriptions found in the literature.

For this general problem setting, we have proposed a solution procedure that is essentially a combination of four descent procedures that heavily rely on simulation for the evaluation of the objective function. The result of this procedure is a vector of release times and a priority list that are to be used in a simple modified parallel schedule generation scheme. Given this execution policy, we proposed a methodology based on the newsvendor problem to derive the corresponding vector of optimal predictive starting times. We have graphically and numerically illustrated



the benefits of our procedure on an example instance. In computational experiments, our procedure has been compared to the state-of-the-art STC + D procedure, and we found that our procedure is vastly superior in performance, at the cost of a larger average computation time.

References

Arrow, K., Harris, T., Marshack, J., 1951. Optimal inventory policy. Econometrica 19, 250–272.

Artigues, C., Michelon, P., Reusser, S., 2003. Insertion techniques for static and dynamic resource-constrained project scheduling. European Journal of Operational Research 149 (2), 249–267.

Ashtiani, B., Leus, R., Aryanezhad, M., 2011. New competitive results for the stochastic resource-constrained project scheduling problem: Exploring the benefits of pre-processing. Journal of Scheduling 14 (2), 151–171.

Baker, K., Trietsch, D., 2009. Principles of Sequencing and Scheduling. Wiley, Hoboken, New Jersey.

Debels, D., Vanhoucke, M., 2006. Future research avenues for resource constrained project scheduling: Search space restriction or neighbourhood search extension. Research report, Ghent University, Belgium.

Debels, D., Vanhoucke, M., 2007. A decomposition-based genetic algorithm for the resource-constrained project scheduling problem. Operations Research 55 (3), 457–469.

Deblaere, F., Demeulemeester, E., Herroelen, W., 2011. Reactive scheduling in the multi-mode RCPSP. Computers and Operations Research 38 (1), 63–74.

Deblaere, F., Demeulemeester, E., Herroelen, W., 2010b. Generating proactive execution policies for resource-constrained projects with uncertain activity durations. Technical report KBI_1006, FEB, KULeuven, Belgium.

Demeulemeester, E., Herroelen, W., 1992. A branch-and-bound procedure for the multiple resource-constrained project scheduling problem. Management Science 38, 1803–1818.

Demeulemeester, E., Herroelen, W., 1997. New benchmark results for the resource-constrained project scheduling problem. Management Science 43, 1485–1492.

Hagstrom, J., 1988. Computational complexity of PERT problems. Computers and Operations Research 18, 139–147.

Herroelen, W., De Reyck, B., Demeulemeester, E., 2000. On the paper "Resource-constrained project scheduling: Notation, classification, models and methods" by Brucker et al. European Journal of Operational Research 128 (3), 221–230.

Kolisch, R., Sprecher, A., 1997. PSPLIB – A project scheduling library. European Journal of Operational Research 96, 205–216.

Lambrechts, O., 2007. Robust project scheduling subject to resource breakdowns. Ph.D. thesis, Department of Decision Sciences and Information Management (KBI), K.U. Leuven.

Lambrechts, O., Demeulemeester, E., Herroelen, W., 2008a. A tabu search procedure for developing robust predictive project schedules. International Journal of Production Economics 111 (2), 493–508.

Lambrechts, O., Demeulemeester, E., Herroelen, W., 2008b. Proactive and reactive strategies for resource-constrained project scheduling with uncertain resource availabilities. Journal of Scheduling 11 (2), 121–136.

Leus, R., 2003. The generation of stable project plans. Ph.D. thesis, Department of Applied Economics, Katholieke Universiteit Leuven, Belgium.

Leus, R., Herroelen, W., 2004. Stability and resource allocation in project planning. IIE Transactions 36 (7), 1–16.

Möhring, R., Radermacher, F., Weiss, G., 1984. Stochastic scheduling problems I – Set strategies. Zeitschrift für Operations Research 28, 193–260.

Möhring, R., Radermacher, F., Weiss, G., 1985. Stochastic scheduling problems II – General strategies. Zeitschrift für Operations Research 29, 65–104.

Stork, F., 2001. Stochastic resource-constrained project scheduling. Ph.D. thesis, Technical University of Berlin, School of Mathematics and Natural Sciences.

Van de Vonder, S., 2006. Proactive-reactive procedures for robust project scheduling. Ph.D. thesis, Katholieke Universiteit Leuven, Leuven, Belgium.

Van de Vonder, S., Ballestín, F., Demeulemeester, E., Herroelen, W., 2007. Heuristic procedures for reactive project scheduling. Computers & Industrial Engineering 52 (1), 11–28.

Van de Vonder, S., Demeulemeester, E., Herroelen, W., 2008. Proactive heuristic procedures for robust project scheduling: An experimental analysis. European Journal of Operational Research 189 (3), 723–733.

Yu, G., Qi, X., 2004. Disruption Management – Framework, Models and Applications. World Scientific, New Jersey.

Zhu, G., Bard, J., Yu, G., 2005. Disruption management for resource-constrained project scheduling. Journal of the Operational Research Society 56 (4), 365–381.