
Omega 30 (2002) 127–135
www.elsevier.com/locate/dsw

Integrating management judgment and statistical methods to improve short-term forecasts

Paul Goodwin

The Management School, University of Bath, Claverton Down, Bath BA2 7AY, UK

Received 16 October 2001

Abstract

The complementary strengths that management judgment and statistical methods can bring to the forecasting process have been widely discussed. This paper reviews research on the effectiveness of methods that are designed to allow judgment and statistical methods to be integrated when short-term point forecasts are required. The application of both voluntary and mechanical integration methods is considered, and conditions are identified where the use of particular methods is appropriate, according to current research. While acknowledging the power of mechanical integration methods that exclude the judgmental forecaster from the integration process, the paper suggests that future research effort should focus on the design of forecasting support systems that facilitate voluntary integration. © 2002 Elsevier Science Ltd. All rights reserved.

Keywords: Forecasting; Judgment; Bias; Time series; Decision support

1. Introduction

All forecasts involve judgment. Even if a forecast emanates from a highly sophisticated statistical method, human judgment will have been involved in the choice of method, model form, predictor variables or data set. Indeed, this role of judgment has recently been formalised in expert systems such as Collopy and Armstrong’s rule-based forecasting system [1]. However, surveys of business and government forecasting methods have repeatedly shown that the role of judgment usually goes much further in that it is directly applied to the quantity to be forecast. In most organisations, forecasts are either entirely based on judgment or judgmental adjustments are applied to the forecasts of a statistical method [2–7].

There appear to be a large number of reasons for the predominance of “direct” judgment in forecasting. These

An earlier version of this paper was presented at the OR42 conference in Swansea, UK, as the keynote paper in forecasting.

Tel.: +44-122-532-3594; fax: +44-122-582-6473. E-mail address: [email protected] (P. Goodwin).

relate to the nature of statistical forecasting methods and the nature and attitude of the personnel involved in producing and using the forecasts. In a dynamic environment statistical methods that extrapolate past patterns into the future are perceived as being slow to react to change [3,4,6,8]. Alternatively, there may be little or no past data that are relevant to the current forecasting problem [3,4]. When it is available, past data may contain the effects of unusual events, like strikes, so that data have to be massaged to remove these effects before statistical methods can be applied. This can be a laborious process where a large number of forecasts need to be made [9]. Moreover, statistical methods may have difficulty in taking into account special events that are known to be occurring in the future [10], although some statistical software packages (e.g., [11–13]) now include a facility for modelling events like sales promotions.

Judgment may also predominate because many organisations lack personnel who are skilled in the use of statistical methods [6,14], but even if these staff are available, a number of behavioural factors favour judgment. Managers who are well regarded for their knowledge of their products or markets may feel a loss of control and ownership if forecasts are delegated to a statistical model. Indeed, the variable to be forecast may be partly controllable by the manager who

0305-0483/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S0305-0483(01)00062-7


128 P. Goodwin / Omega 30 (2002) 127–135

is making the forecasts [15]. In addition, forecasting meetings, where managers combine their judgments, may be seen as useful events in their own right, so that there would be opposition to any replacement of these by statistical methods [16]. Finally, the processes underlying complex statistical models may not be transparent to forecast users and the outputs of these methods may therefore attract scepticism [6,17].

Although “direct” judgment is very widely used in forecasting, it has a number of serious disadvantages when compared to statistical methods. The human mind has limited information processing capacity [18] and the prevailing view is that people use simplifying mental strategies, called heuristics, to cope with the complexities of the forecasting task [19,20]. A typical heuristic will be to use this week’s sales figure as a starting point for estimating next week’s sales [21]. This is referred to as the anchor and adjust heuristic. Similarly, people may judge the amount of systematic variation there is in a time series by assessing how representative it appears to be of their stereotypical view of a random pattern. In this case they are using what is known as the representativeness heuristic [22]. While heuristics like these sometimes provide people with efficient ways of tackling problems they can also lead to systematic biases in judgment.

A large number of biases have been documented in the literature [23,24]. When judgment is used to extrapolate time-series patterns people tend to underestimate the amount of growth or decay that is present in the series. They also tend to see systematic patterns in randomness, and possibly as a consequence of this, they tend to overreact to the most recent observation. When judgmental forecasters have access to non-time-series information (e.g. information about promotion campaigns) they tend to use it inconsistently [25] or see pre-conceived relationships that do not exist [26].

Special problems can arise when groups of people meet to agree on a set of forecasts. Experiments by Asch [27] and the work of Janis and Mann [28] have shown the surprising extent to which group processes can distort the judgments of individual group members. In Asch’s experiments, individuals even misjudged the relative lengths of manifestly different straight lines after several other group members (stooges planted by the experimenter) had volunteered deliberately incorrect answers. The presence of dominating individuals in the group and pressures to conform can all lead to judgments being formed without sufficient exchange of information and views. Problems like these have led to the development of structured group techniques, like the Delphi method [29], but the success achieved by these methods, and the extent of their use in business and other organisations, is unclear.

In view of all of these problems, why is it worth considering integrating judgmental forecasts with statistical methods? First, maintaining a role for judgment in forecasting should help to mitigate some of the behavioural objections to “pure” statistical forecasting, while also reducing the effects of judgmental biases. For example, the manager’s esteem may not be threatened because he or she still has an input to the forecasting process. However, there is a more compelling reason for considering integration: judgment and statistical methods have complementary strengths and weaknesses [8]. For example, human judges are adaptable and can take into account one-off events, but they are inconsistent, can only take into account small amounts of data and suffer from cognitive biases. In contrast, statistical methods are rigid, but consistent and can make optimal use of large volumes of data. In the light of this it is not surprising that there is much evidence to show that more accurate forecasts can be obtained by integrating judgment and statistical methods in appropriate ways. The rest of this paper will outline methods that can be used to facilitate integration. The relative advantages and disadvantages of these methods will also be identified, as will areas where further research is needed. Because most research to date has focussed on the production of point forecasts, this will be the emphasis here, so that methods associated with the derivation of forecasts represented by probability distributions (e.g. Bayesian methods [30]) or prediction intervals [31] will not be discussed.

2. Methods of integration

When point forecasts are to be produced, two approaches to the integration of judgment and statistical methods can be identified. In voluntary integration the judgmental forecaster is supplied with details of the statistical forecast and decides how to use this in forming his or her judgment. The forecaster is therefore free to completely ignore or completely accept the statistical forecast or merely to take some account of it in making the judgmental forecast. Usually, voluntary integration involves the application of judgmental adjustments to statistical forecasts, but it might also involve the forecaster modifying “prior” judgmental forecasts in the light of a newly arrived statistical forecast. In mechanical integration the “integrated” forecast is obtained through the application of a statistical method to the judgmental forecast. Combining, correction for bias and bootstrapping are typical methods used here (these will be discussed later).

2.1. Voluntary integration

When only time-series information is available to both the statistical method and the judgmental forecaster, most studies suggest that judgmental adjustments of statistical forecasts will lead to reduced accuracy [10,32–34], though Willemain’s study [35] found that “when statistical forecasts were nearly optimal, adjustment had little effect [but] when the forecasts were less accurate, adjustment improved accuracy”. It seems that people read system into the noise associated with a time series and so make damaging adjustments to statistical forecasts in an attempt to forecast this noise. In contrast, when the judgmental forecaster has exclusive


access to contextual information (e.g. information that there will be a rise in the tax on the company’s product), there is plenty of evidence to show that judgmental adjustment can improve the accuracy of statistical forecasts [33]. For example, studies of macro-economic forecasts made by government departments have shown that judgmental adjustments to econometric forecasts, which are made in order to take into account special economic conditions, tend to be beneficial [36–38]. A series of company-based studies by Mathews and Diamantopoulos [39] led to similar conclusions, as did the results of laboratory-based studies by Wolfe and Flores [40] and Goodwin and Fildes [10].

Despite these reported benefits of voluntary integration when contextual information is available, there are a number of problems associated with the approach. First, there is the danger of double counting bias. This can arise when a regression model is being used to produce the forecasts and a variable has been omitted from the model. If this missing variable is collinear with a variable that is in the model, the included variable will act as a proxy for the excluded variable. This means that some of the effects of the excluded variable will already be taken into account by the statistical method and any judgmental forecaster who makes an adjustment for the full effects of the missing variable will be double counting some of these effects.

Secondly, when the statistical method is reliably forecasting part of the time-series pattern and hence forming an ideal baseline for adjustment, people apparently ignore the statistical forecast completely [10]. For example, a statistical forecast may be reliably forecasting the “normal” pattern in a sales time series excluding the effects of special promotions. Rather than simply adjusting this forecast for the effect of the promotion, people apparently tend to base the entire forecast on judgment. Finally, in many environments, judgmental adjustments are made on an ad hoc basis, without adequate documentation or a defensible rationale, so that the credibility of the forecasts to users may be damaged, and the opportunity to learn about and improve the role of judgmental intervention is lost.

How then can voluntary integration be improved? A number of approaches have had limited success. Advertising the relative accuracy of the statistical forecasts did not impress the judgmental forecasters in Lim and O’Connor’s study [32]. They continued to rely on their own forecasts even when they were forced to see a pop-up window on their monitor that carried a message like “Please be aware that you are 18.1% LESS ACCURATE than the statistical forecast provided to you.” Providing support that removed the need to carry out mental calculations in obtaining a weighted average of judgmental and statistical forecasts also met with no success. There is also, as yet, no evidence that improving the forecaster’s technical knowledge, both of statistical forecasting methods and potential judgmental biases, will lead to improvements in the use of judgment [41–43].

However, some simple methods have led to improved integration. Willemain [44] found that applying the rule “adjust only in cases when the forecast error from a statistical forecast exceeds the naive forecast” led to improved accuracy, but of course the forecaster would still need to be persuaded to adhere to this rule voluntarily. In a study by Goodwin [45] making “no adjustment” the default action, so that the judgmental forecaster has explicitly to make a request to adjust the statistical forecast, significantly reduced the number of harmful adjustments without reducing the propensity to make adjustments when they were appropriate. Harmful adjustments were reduced further, in this study, when forecasters were also required to indicate a reason for requesting the adjustment. However, even under these circumstances, 35% of forecasts were still adjusted when simply accepting the statistical forecast would have led to greater accuracy.

There is a substantial literature which suggests that involving users in the design and operation of decision support systems leads to an increased willingness to accept the advice of the system [46,47]. A recent study by Lawrence et al. [48] found that this also applied to forecasting support systems, but the increased acceptance of the statistical forecast, resulting from participation, came at a price. User participation in the selection of the statistical forecasting method led to less accurate methods being selected, even though the forecasts of these methods were more acceptable to the forecaster.

The underlying rationale of methods like decision analysis is that judgmental tasks can be simplified by decomposing them into a series of smaller tasks, enabling the judge to concentrate on particular aspects of the problem separately. This suggests that using a formal decomposition model to structure judgmental adjustments to statistical forecasts may lead to improvements over informal adjustment. Surprisingly, little research has been carried out to date to examine this possibility. Of the few studies that have been conducted, most have used the analytic hierarchy process (AHP) [40,49], but the validity of employing the AHP in the forecasting context has been questioned by Salo and Bunn [50] and Belton and Goodwin [51]. One alternative, proposed by Ghalia and Wang [52], uses a fuzzy logic system to approximate the reasoning of managers when they wish to make adjustments.

2.2. Mechanical integration

2.2.1. Combining

Of the approaches available for the mechanical integration of judgmental forecasts with statistical methods, the most widely discussed is combination. In practice, combination often implies taking a simple average of independent judgmental and statistical forecasts. The alternative of attaching weights to the constituent forecasts and taking a weighted average can be problematical. Mathematically optimised weights require unbiased constituent forecasts, a stationary pattern of forecast errors over time and sufficient past data to estimate the optimum weights reliably, conditions which often do not apply in practical settings.


The simple average has performed well in a number of studies [8,41,53], though it performed poorly in a study by Fildes [54] who pointed out that “combining inadequate forecasts (however optimally) still produces inadequate forecasts”. One factor that influences the value of combining forecasts is the correlation between the errors of the forecasts in the combination. If the constituent forecasts are unbiased, and if the simple average is used, it can be shown [55] that the mean squared error (MSE) of the combined forecasts will only be lower than that of the judgmental forecasts when

σ_j / σ_s > (r + (r² + 3)^0.5) / 3 = φ,

where σ_s² and σ_j² are, respectively, the variances of the statistical and judgmental forecast errors and r is the correlation between these errors.

If it is also the case that

σ_j / σ_s < 1 / φ,

then the MSE of the combined forecast will be less than that of both the constituent forecasts. These formulae imply that combination is likely to be less effective when the correlation between the forecast errors is high because the second forecast is bringing little new information to the combination. Indeed, the ideal situation is to have strong negative correlations between the forecast errors, but this is rarely found in practice [56].
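These thresholds are easy to check numerically. The following sketch (a minimal Python illustration, with purely hypothetical values for σ_j, σ_s and r) computes the MSE of a simple average of two unbiased forecasts directly from the error variances and compares the result with the threshold on the ratio σ_j/σ_s described above:

```python
from math import sqrt

def combined_mse(sigma_s, sigma_j, r):
    """MSE of a simple average of two unbiased forecasts with error
    variances sigma_s**2 and sigma_j**2 and error correlation r."""
    return 0.25 * (sigma_s**2 + sigma_j**2 + 2 * r * sigma_s * sigma_j)

def phi(r):
    """Threshold from the text: the combination beats the judgmental
    forecast when sigma_j / sigma_s exceeds this value."""
    return (r + sqrt(r**2 + 3)) / 3

# Hypothetical example: judgmental errors twice as variable as statistical.
sigma_j, sigma_s, r = 2.0, 1.0, 0.2
mse_c = combined_mse(sigma_s, sigma_j, r)

beats_judgment = sigma_j / sigma_s > phi(r)
beats_both = beats_judgment and sigma_j / sigma_s < 1 / phi(r)

# Direct MSE comparison agrees with the thresholds: here the combination
# improves on the judgmental forecast (MSE 1.45 < 4.0) but not on the
# statistical one (1.45 > 1.0).
print(mse_c, beats_judgment, beats_both)
```

Note that with a modest positive error correlation the combination here halves the judgmental MSE yet cannot beat the better constituent, which is exactly the pattern the two inequalities predict.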

While combination may improve the accuracy of forecasts, it is worth noting that the combined forecast is not underpinned by a single coherent theory that seeks to explain the behaviour of the time series. This may reduce the chances of gaining insights into the behaviour of series and also diminish the credibility of the forecasts in the eyes of users.

2.2.2. Bootstrapping

Psychological bootstrapping models arguably offer a more elegant method for combining the benefits of judgment and statistical methods. Bootstrapping (which is to be distinguished from the use of the term in statistics, where it describes a resampling procedure) normally involves using multiple linear regression to build a model of a judge’s decisions or forecasts. The usual form of the model is

F_t = a + b_1 x_{1,t} + b_2 x_{2,t} + … + b_n x_{n,t},

where F_t is the judge’s forecast for period t, x_{i,t} are the values of the cues available to the judge at t, b_i is the weight that the judge implicitly attaches to cue i and a is a constant. For example, suppose that it is known that a judge is using advertising expenditure and a competitor’s price to forecast sales, and data on the past values of these variables are available, together with the judge’s forecasts based on these variables. If this is the case, a regression model can be constructed to estimate the weights that the judge attaches to the variables. The model can then be used to replace the original judgmental forecaster in the production of forecasts.

It has been repeatedly found in areas as diverse as bankruptcy prediction, the prediction of student performance and psychiatric diagnosis [25] that bootstrap models of judges outperform the judges themselves because the models average out the inconsistencies of human judgment. (More surprisingly, models employing weights that were random, apart from their sign, also outperformed the judges.) However, despite the large literature showing the benefits of bootstrapping, there is as yet no evidence that it brings benefits in judgmental time-series forecasting [54,57]. This appears to be because of (i) the large number of possible cues that might be available to the judge (e.g. the last observation, the difference between the last two observations, the mean of the last three observations), which makes the identification of cues for the model difficult; (ii) the possibility of judges using configural cues like the “shape” of the time-series graph and (iii) the possibility that the judge has exclusive access to contextual information that cannot, by definition, be included in the model.

2.2.3. Correction for bias

Correcting judgmental forecasts for bias is an under-explored method for improving judgment through the application of statistical methods. Theil [58] showed how the MSE of a set of forecasts can be decomposed to reveal two types of bias, mean and regression bias:

MSE = (Ā − F̄)²  +  (S_F − r S_A)²  +  (1 − r²) S_A²
        Term 1        Term 2            Term 3

where Ā and F̄ are the means of the actuals (A_t) and forecasts (F_t), S_A and S_F are the standard deviations of the actuals and forecasts, and r is the correlation between the actuals and forecasts.

In the decomposition, Term 1 represents mean bias, that is, the tendency of the forecasts to be too high or too low. Term 2 represents regression bias. This is manifested in a systematic failure of the forecasts to track the pattern in the actuals. For example, there may be a tendency for high forecasts to be too low and low forecasts to be too high. The third term represents the so-called random error in the forecasts (i.e., the variation in the actuals that is not explained by the forecasts).

Theil went on to show that by regressing the actuals on to the forecasts one obtains

Â_t = a + b F_t,

where Â_t is the estimated actual at time t, and that by using Â_t as the corrected forecast, both mean and regression bias are removed from “past” forecasts (i.e., forecasts where the actual has already been realised). Assuming that past biases will continue to be unchanged, the correction can be expected to remove systematic bias from future forecasts. However, it should be noted that, when the forecast horizon is greater than one period, serial correlation in the residuals


of the regression equation will increase the variance of the ordinary least-squares estimates, a and b [59].

A number of successful applications of Theil’s correction have been reported. For example, Elgers et al. [60] applied it to analysts’ company earnings forecasts and found that it reduced the MSEs emanating from systematic bias by about 91%. In an analysis of two UK-based manufacturing companies, Goodwin [55] found that the median absolute percentage error of managers’ forecasts was reduced, for out-of-sample periods, by an average of 12% (of its original value) for the first company and 23% for the second.

Fildes [54] has successfully demonstrated another form of correction that can be applied when information on the predictor variables (cues) used by the forecaster is available. This involves regressing the forecaster’s errors on to the predictor variables to obtain a model of the form

e_t = a + b_1 x_{1,t} + b_2 x_{2,t} + … + b_n x_{n,t} + ε_t,

where e_t is the judge’s forecast error for period t, x_{i,t} are the values of the cues available to the judge at t, ε_t is the residual at period t and a is a constant. Future forecasts are corrected by the predicted error. It can be seen that Fildes’s method allows forecasts to be corrected for bias in the use of available information. By improving the extent to which variation in the forecast variable is explained by this information, the correction attacks Term 3 of Theil’s decomposition. This suggests that both Theil’s and Fildes’s corrections might be usefully combined in a model of the form

e_t = a + b_0 F_t + b_1 x_{1,t} + b_2 x_{2,t} + … + b_n x_{n,t} + ε_t.

Although multicollinearity may be a problem here, because of the relationship between the forecasts (F_t) and the cues (x_{i,t}), Elgers et al. [60] have demonstrated that this approach can be effective.

Clearly, there are potential problems with correction when the nature of the biases changes over time, possibly as a result of learning by the forecaster or changes in forecasting personnel. Where biases change gradually, Goodwin [61] has shown that discounted weighted regression can be used in combination with Theil’s method so that the estimated coefficients in the regression model have a greater tendency to take into account the performance of recent forecasts. However, in some circumstances biases may vary sporadically between different types of period. For example, in sales forecasting, biases in “normal” periods may differ from those that apply in promotion periods. Although one study [55] found that Theil’s correction was still useful under these conditions, its effectiveness was blunted and the combined Theil–Fildes method may be more appropriate in these circumstances, if data on cues are available.

2.2.4. Correcting and combining

When the constituent forecasts in a simple average combination suffer from mean bias, the benefits of combination will depend on the relative size and sign of the forecasts’ mean errors (i.e., Ā − F̄). If the mean errors of the judgmental and statistical forecasts are given, respectively, by v and w, then the MSE of the combined forecast will be

MSE = 0.25[(σ_s² + σ_j² + 2r σ_s σ_j) + (v + w)²].

Recall that σ_s² and σ_j² are, respectively, the variances of the statistical and judgmental forecast errors and r is the correlation between these errors.

Thus if v = −w, the mean error of one forecast will cancel out that of the other. However, if the statistical forecasts are unbiased but the judgmental forecasts have a mean bias of v units, the combined forecast still carries a bias contribution of v²/4 in its MSE, so combination removes only 75% of the judgmental mean-bias contribution. Given the propensity of judgmental forecasts to suffer from biases, it may be beneficial to apply correction to them before combining them with the statistical forecasts, that is, a correct-then-combine strategy. Indeed, in their seminal paper on combination, Bates and Granger [62] argued that forecasts should be corrected for bias before being combined, although their suggested correction only involved the removal of mean bias. Since Bates and Granger’s paper, much of the published theory on combination has been based on the presumption that the constituent forecasts are unbiased [63].

There are, however, a number of reasons why applying a correct-then-combine strategy involving Theil’s correction might diminish the potential gains of combination. First, the correction might be so successful that subsequent combination cannot lead to further improvements and it may even reduce accuracy. For example, if Theil’s correction successfully removes mean bias from future forecasts, then it will also remove the potential benefits of mean errors of opposite signs tending to cancel each other out in the combination. Secondly, it is possible that the smoothing effect that Theil’s method has on the judgmental forecasts [61] will increase the correlation of their errors with those of the statistical forecasts. This would again reduce the potential benefits of combination. Fildes [54] has said that combination is “already becoming a catch-all recommendation to applied forecasters” and there is a danger that it will be applied unnecessarily when a simple correction of judgmental forecasts is all that is required.
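The bias arithmetic behind these arguments can be sketched directly from the MSE formula above (the values of v and w below are hypothetical):

```python
# Arithmetic of mean bias under a simple average: the bias term of the
# combined MSE is ((v + w) / 2) ** 2, where v and w are the mean errors
# of the judgmental and statistical forecasts.

def combined_bias_term(v, w):
    return ((v + w) / 2) ** 2

# Opposite biases cancel completely:
print(combined_bias_term(5.0, -5.0))            # → 0.0

# If the statistical forecast is unbiased (w = 0), the judgmental bias
# contribution shrinks from v**2 to v**2 / 4, i.e. only 75% is removed:
v = 5.0
print(combined_bias_term(v, 0.0) / v**2)        # → 0.25
```

This makes the trade-off concrete: if Theil's correction has already driven v to zero, the (v + w)²/2² term offers no further cancellation for combination to exploit.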

3. Comparing integration methods

Clearly, the choice of the most appropriate integration method will be dependent on the specific conditions that apply in a given context. Fig. 1 shows a tentative decision tree, based on our current state of knowledge, which might be used to choose the most accurate integration method, based on these conditions.

Regular patterns, or relationships, favour the exclusive use of statistical forecasting [10]. Irregular patterns, where each new observation in a series is a result of a combination of particular circumstances that apply only in that period, some of which the forecaster will have prior knowledge of (e.g., promotion campaigns), favour the use of judgment, albeit statistically corrected to remove biases [55].

Fig. 1. Choosing a forecast integration method. [Decision tree, reconstructed from the figure:
Series type?
- Regular: statistical forecasting.
- Irregular: corrected judgmental forecasts.
- Regular with special events: type of period?
  - Normal: statistical forecasting.
  - Special: availability of past 'hard' data on special circumstances?
    - High: statistical forecasting.
    - Low: availability of data on past judgmental forecasts and outcomes for special periods?
      - High: correct and combine.
      - Low: likely impact of special events?
        - High: voluntary integration.
        - Low: statistical forecasting.]

Most interesting are series that are a combination of regular patterns overlaid with the effects of foreseeable special events. In periods when these events do not apply (“normal” periods), statistical forecasting is likely to be most accurate, as judgmental forecasters will be over-influenced by the noise in the data. In “special” periods, where there is a dearth of hard data on the special event, but ample past data on the performance of judgmental forecasts, a correct and combine strategy is likely to be most effective [55]. In the absence of sufficient data to detect judgmental biases, voluntary integration should only be considered where the special event is likely to have a major effect on the variable to be forecast [10,33]. Judgmental forecasters are not likely to be skilled in adjusting for minor effects as their judgments will again be distorted by the noise in the data. In this case statistical methods are likely to be most accurate even though they fail to take into account these relatively minor effects.
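The logic of Fig. 1 can be transcribed as a small rule-based function. This is only a restatement of the tentative tree (the function and argument names below are illustrative inventions, not from the paper), and it inherits all of the tree’s caveats.

```python
def choose_integration_method(series_type, period=None, hard_data=None,
                              past_forecast_data=None, event_impact=None):
    """Sketch of the Fig. 1 decision tree.

    series_type: 'regular', 'irregular', or 'regular_with_special_events'
    period: 'normal' or 'special' (only relevant for series with special events)
    hard_data, past_forecast_data, event_impact: 'high' or 'low'
    """
    if series_type == 'regular':
        return 'statistical forecasting'
    if series_type == 'irregular':
        return 'corrected judgmental forecasts'
    # Regular series overlaid with foreseeable special events:
    if period == 'normal':
        return 'statistical forecasting'
    if hard_data == 'high':
        return 'statistical forecasting'
    if past_forecast_data == 'high':
        return 'correct and combine'
    if event_impact == 'high':
        return 'voluntary integration'
    return 'statistical forecasting'

print(choose_integration_method('regular_with_special_events',
                                period='special', hard_data='low',
                                past_forecast_data='low',
                                event_impact='high'))
# → voluntary integration
```

Note how statistical forecasting is the default at several branches, reflecting the point above that judgment is only worth introducing when the special event’s impact is large.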

The tree in Fig. 1 is only tentative because much research remains to be done. For example, to the author’s knowledge, there have been no studies making a direct comparison between judgmental forecasts, where special events disturb otherwise regular series, and statistical methods designed to


handle these events, such as intervention analysis [64] or special event indices (e.g. [11–13]). There have also been relatively few studies looking at the effectiveness of judgmental forecasts under conditions where a series is subject to non-reversionary changes, despite this being a situation where the judgmental forecaster may be expected to have an advantage [52,65]. In fact, the evidence from the studies that do exist in this area is that statistical forecasting methods outperform judgmental forecasters, because the latter over-react to discontinuities by continuing to anticipate change even when a series has settled into a new but stable pattern [65,66].

Fig. 1 also ignores what is perhaps the most important factor in the choice of forecasting methods: human behaviour. Ultimately, human beings decide which forecasts to use in their decision making. Highly accurate statistical forecasting methods are of no use if decision makers doubt the credibility of the resulting forecasts and choose to ignore them [17]. Correcting judgmental forecasts for bias may improve them in the short term, but the long-term effect may be to cause forecasters to devote less effort to producing forecasts that they know will be corrected, or to cause them to make pre-emptive adjustments to their forecasts in an attempt to negate the correction (though, in the absence of research evidence, these arguments are speculative). Indeed, the choice of forecasting method may often have little to do with the accuracy associated with the method.

Research suggests that decision makers are more likely to accept forecasts if they have a sense of ownership of the forecasts, because they have contributed to the process that derived them [48]. All of this implies that the greatest practical improvements in forecasting may be obtained through improved methods of voluntary integration. Forecasting support systems, which allow and encourage judges to interact with statistical methods so that insights are gained and judgment is used effectively and appropriately, would appear to offer the most promising way forward. Routines allowing the updating of estimates of bias and bias-correction formulae could be easily incorporated, as could analysis of the correlation between statistical and judgmental forecast errors, in order to indicate whether combination is advisable. However, this information would be used to advise and enlighten the forecaster, rather than to produce the “final” forecasts without his or her approval. Designers of such systems will need to build in flexible and user-friendly routines that allow judges to access, assimilate and analyse relevant data in different forms, to decompose their judgments and to understand the rationale and implications of any statistical methods that are employed.
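One such advisory routine might look like the minimal sketch below (the function name, the threshold and the output format are assumptions of this sketch, not a published design): given histories of judgmental forecasts, statistical forecasts and outcomes, it reports each method’s mean bias and the correlation between their errors, flagging combination as unpromising when the errors are highly correlated.

```python
import numpy as np

def combination_advice(judgmental, statistical, actual, corr_threshold=0.8):
    """Report each method's mean bias and the correlation between their
    errors; when errors are highly correlated, combining adds little."""
    j_err = np.asarray(judgmental, dtype=float) - np.asarray(actual, dtype=float)
    s_err = np.asarray(statistical, dtype=float) - np.asarray(actual, dtype=float)
    corr = float(np.corrcoef(j_err, s_err)[0, 1])
    return {
        "judgmental_mean_bias": float(j_err.mean()),
        "statistical_mean_bias": float(s_err.mean()),
        "error_correlation": corr,
        "combination_advisable": corr < corr_threshold,
    }

# Example: the two methods' errors move together, so combination is flagged
# as unpromising.
advice = combination_advice([105, 98, 112], [103, 96, 110], [100, 95, 105])
print(advice["combination_advisable"])
# → False
```

In the spirit of the paragraph above, such output would be shown to the forecaster as advice, not applied automatically to produce the final forecast.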

4. Summary

Statistical methods and human judges can make valuable and complementary contributions to short-term forecasts. The key question is how these two inputs to the forecasting process can best be integrated. Voluntary integration has the advantage that the resulting forecasts are more likely to be acceptable to the forecaster and decision makers. However, it is usually carried out inefficiently and can lead to the degradation of relatively accurate statistical forecasts. Mechanical integration can be effective in improving forecast accuracy in many circumstances, but unless there is a separation between forecasters and forecast users, behavioural factors may undermine the benefits of any improved accuracy that is achieved. This suggests that future research should focus on the problems of designing methods and support systems that work with judgmental forecasters, enabling them to use their judgments effectively, but also encouraging them to make full use of statistical methods, where this is appropriate.

Acknowledgements

The author would like to thank two anonymous referees. One referee, in particular, made a large number of detailed and insightful comments, and the paper was much improved as a result.

References

[1] Collopy F, Armstrong JS. Rule-based forecasting: development and validation of an expert systems approach to combining time series extrapolations. Management Science 1992;38:1394–414.

[2] Dalrymple DJ. Sales forecasting practices: results from a United States survey. International Journal of Forecasting 1987;3:379–91.

[3] Duran JA, Flores BE. Forecasting practices in Mexican companies. Interfaces 1998;28:56–62.

[4] Hughes MC. Forecasting practice: organisational issues. Journal of the Operational Research Society 2001;52:143–9.

[5] Klassen RD, Flores BE. Forecasting practices of Canadian firms: survey results and comparisons. International Journal of Production Economics 2001;70:163–74.

[6] Mady MT. Sales forecasting practices of Egyptian public enterprises: survey evidence. International Journal of Forecasting 2000;16:359–68.

[7] Sanders NR, Manrodt KB. Forecasting practices in US corporations: survey results. Interfaces 1994;24:92–100.

[8] Blattberg RC, Hoch SJ. Database models and managerial intuition: 50% model + 50% manager. Management Science 1990;36:887–99.

[9] Lawrence MJ. An integrated inventory control system. Interfaces 1977;7:55–62.

[10] Goodwin P, Fildes R. Judgmental forecasts of time series affected by special events: does providing a statistical forecast improve accuracy? Journal of Behavioral Decision Making 1999;12:37–53.

[11] Stellwagen EA, Goodrich RL. Forecast Pro for Windows. Belmont, MA: Business Forecast Systems Inc., 1994.

[12] Smart CN, Willemain TR, Stein M. SmartForecasts for Windows users’ guide. Belmont, MA: Smart Software, Inc., 2001.


[13] Holt Wilson J, Keating B. Business forecasting, 4th ed. with accompanying Excel-based ForecastX™ software. New York: McGraw-Hill, 2001.

[14] Fildes R, Hastings R. The organisation and improvement of market forecasting. Journal of the Operational Research Society 1994;45:1–16.

[15] Brown LD. Editorial: Comparing judgmental to extrapolative forecasts: it’s time to ask why and when. International Journal of Forecasting 1988;4:171–3.

[16] O’Connor M, Lawrence M. Judgmental forecasting and the use of available information. In: Wright G, Goodwin P, editors. Forecasting with judgment. Chichester: Wiley, 1998. p. 65–90.

[17] Taylor PF, Thomas ME. Short term forecasting: horses for courses. Journal of the Operational Research Society 1982;33:685–94.

[18] Hogarth RM. Judgement and choice, 2nd ed. Chichester: Wiley, 1987. p. 4–7.

[19] Tversky A, Kahneman D. Judgment under uncertainty: heuristics and biases. Science 1974;185:1124–31.

[20] Bolger F, Harvey N. Context-sensitive heuristics in statistical reasoning. Quarterly Journal of Experimental Psychology 1993;46A:779–811.

[21] Lawrence MJ, O’Connor MJ. Exploring judgemental forecasting. International Journal of Forecasting 1992;8:15–26.

[22] Eggleton IRC. Intuitive time-series extrapolation. Journal of Accounting Research 1982;20:68–102.

[23] Hogarth RM, Makridakis S. Forecasting and planning: an evaluation. Management Science 1981;27:115–38.

[24] Goodwin P, Wright G. Improving judgmental time series forecasting: a review of the guidance provided by research. International Journal of Forecasting 1993;9:147–61.

[25] Dawes RM, Faust D, Meehl PE. Clinical versus actuarial judgment. Science 1989;243:1668–73.

[26] Chapman LJ, Chapman JP. Illusory correlation as an obstacle to the use of valid psychodiagnostic signs. Journal of Abnormal Psychology 1969;74:271–80.

[27] Asch SE. Effects of group pressure upon the modification and distortion of judgments. In: Proshansky H, Seidenberg B, editors. Basic studies in psychology. New York: Holt, Rinehart and Winston, 1965.

[28] Janis IL, Mann L. Decision making. New York: Free Press, 1979.

[29] Rowe G, Wright G. The impact of task characteristics on the performance of structured group forecasting techniques. International Journal of Forecasting 1996;12:73–89.

[30] Pole A, West M, Harrison J. Applied Bayesian forecasting and time series analysis. New York: Chapman and Hall, 1994.

[31] Chatfield C. Prediction intervals for time-series forecasting. In: Armstrong JS, editor. Principles of forecasting. Norwell, MA: Kluwer Academic Publishers, 2001. p. 475–94.

[32] Lim JS, O’Connor M. Judgemental adjustment of initial forecasts: its effectiveness and biases. Journal of Behavioral Decision Making 1995;8:149–68.

[33] Sanders NR, Ritzman LP. Judgmental adjustment of statistical forecasts. In: Armstrong JS, editor. Principles of forecasting. Norwell, MA: Kluwer Academic Publishers, 2001. p. 405–16.

[34] Harvey N. Why are judgments less consistent in less predictable task situations? Organizational Behavior and Human Decision Processes 1995;63:247–63.

[35] Willemain TR. Graphical adjustment of statistical forecasts. International Journal of Forecasting 1989;5:179–85.

[36] Turner DS. The role of judgment in macroeconomic forecasting. Journal of Forecasting 1990;9:315–45.

[37] McNees SK. The role of judgment in macroeconomic forecasting accuracy. International Journal of Forecasting 1990;6:287–99.

[38] Donihue MR. Evaluating the role judgment plays in forecast accuracy. Journal of Forecasting 1993;12:81–92.

[39] Mathews BP, Diamantopoulos A. Judgemental revision of sales forecasts: effectiveness of forecast selection. Journal of Forecasting 1990;9:407–15.

[40] Wolfe C, Flores B. Judgmental adjustment of earnings forecasts. Journal of Forecasting 1990;9:389–405.

[41] Sanders NR, Ritzman LP. The need for contextual and technical knowledge in judgmental forecasting. Journal of Behavioral Decision Making 1992;5:39–52.

[42] Wagenaar WA, Sagaria SD. Misperception of exponential growth. Perception and Psychophysics 1975;18:416–22.

[43] Edmundson R. Decomposition: a strategy for judgemental forecasting. Journal of Forecasting 1990;9:305–14.

[44] Willemain TR. The effect of graphical adjustment on forecast accuracy. International Journal of Forecasting 1991;7:151–4.

[45] Goodwin P. Improving the voluntary integration of statistical forecasts and judgement. International Journal of Forecasting 2000;16:85–99.

[46] Turban E. Decision support and expert systems: management support systems, 4th ed. Englewood Cliffs, NJ: Prentice-Hall, 1995.

[47] Schultz RL. The implementation of forecasting models. Journal of Forecasting 1984;3:43–55.

[48] Lawrence M, Goodwin P, Fildes R. Increasing the use of forecasting software by greater user involvement. Twentieth Annual International Symposium on Forecasting, Lisbon, 2000.

[49] Saaty TL. The analytic hierarchy process. Pittsburgh: RWS Publications, 1990.

[50] Salo AA, Bunn DW. Decomposition in the assessment of judgemental probability forecasts. Technological Forecasting and Social Change 1995;49:13–25.

[51] Belton V, Goodwin P. Remarks on the application of the analytic hierarchy process to judgmental forecasting. International Journal of Forecasting 1996;12:155–61.

[52] Ghalia MB, Wang PP. Intelligent system to support judgmental business forecasting: the case of estimating hotel room demand. IEEE Transactions on Fuzzy Systems 2000;8:380–97.

[53] Lawrence MJ, Edmundson RH, O’Connor MJ. The accuracy of combining judgemental and statistical forecasts. Management Science 1986;32:1521–32.

[54] Fildes R. Efficient use of information in the formation of subjective industry forecasts. Journal of Forecasting 1991;10:597–617.

[55] Goodwin P. Correct or combine? Mechanically integrating judgmental forecasts with statistical methods. International Journal of Forecasting 2000;16:261–75.

[56] Armstrong JS. Combining forecasts. In: Armstrong JS, editor. Principles of forecasting. Norwell, MA: Kluwer Academic Publishers, 2001. p. 417–39.

[57] Lawrence M, O’Connor M. Judgement or models: the importance of task differences. Omega: International Journal of Management Science 1996;24:245–54.


[58] Theil H. Applied economic forecasting. Amsterdam: North-Holland Publishing Company, 1971.

[59] Holden K, Peel DA, Thompson JL. Expectations: theory and evidence. London: Macmillan, 1985. p. 68–9.

[60] Elgers PT, May HL, Murray D. Note on adjustments to analysts’ earnings forecasts based upon systematic cross-sectional components of prior-period errors. Management Science 1995;41:1392–6.

[61] Goodwin P. Adjusting judgemental extrapolations using Theil’s method and discounted weighted regression. Journal of Forecasting 1997;16:37–46.

[62] Bates JM, Granger CWJ. The combination of forecasts. Operational Research Quarterly 1969;20:451–68.

[63] Bunn D. Synthesis of expert judgment and statistical forecasting models for decision support. In: Wright G, Bolger F, editors. Expertise and decision support. New York: Plenum, 1992. p. 251–68.

[64] Makridakis S, Wheelwright SC, Hyndman RJ. Forecasting: methods and applications. New York: Wiley, 1998. p. 418–23.

[65] O’Connor M, Remus W, Griggs K. Judgemental forecasting in times of change. International Journal of Forecasting 1993;9:163–72.

[66] Remus WE, Carter PL, Jenicke LO. Regression models of decision rules in unstable environments. Journal of Business Research 1979;7:187–96.