
The triangular distribution as a proxy for the beta distribution in risk analysis

By DAVID JOHNSON†

Loughborough University, UK

[Received March 1994. Final revision February 1997]

SUMMARY

The beta distribution is seen as a suitable model in risk analysis because it provides a wide variety of distributional shapes over a finite interval. Unfortunately, the beta distribution is not easily understood and its parameters are not easily estimated. This paper explores the possibility of using the more intuitively obvious triangular distribution as a proxy for the beta distribution. It is shown that the differences between the two distribution functions are seldom significant and a procedure is proposed for estimating the parameters of a triangular distribution on the basis of two extreme percentiles and the median.

Keywords: Beta distribution; Risk; Triangular distribution

1. Introduction

Many forms of risk analysis incorporate appropriate probability distributions to represent the uncertainty surrounding key parameters in the analysis. As many uncertain quantities have identifiable minimum and maximum values, a distribution with finite limits is intuitively plausible to non-statistically minded decision makers. The beta distribution is often seen as a suitable model in this context as it provides a wide variety of distributional shapes over a finite interval. It is therefore one of the most versatile of all the standard distributions used in risk analysis. The only other commonly used distribution with similar properties is the triangular distribution which, in the case of right-triangular distributions, is a special case of the beta distribution.

The general form of the probability density function of the beta distribution is

$$f(x) = \frac{\Gamma(p+q)}{\Gamma(p)\,\Gamma(q)}\,\frac{(x-a)^{p-1}(b-x)^{q-1}}{(b-a)^{p+q-1}}, \qquad a \le x \le b, \quad p, q > 0, \tag{1}$$

but the simple linear transformation

$$y = (x-a)/(b-a) \tag{2}$$

reduces the density function to its more usual, simpler form

$$f(y) = \frac{\Gamma(p+q)}{\Gamma(p)\,\Gamma(q)}\,y^{p-1}(1-y)^{q-1}, \qquad 0 \le y \le 1, \quad p, q > 0, \tag{3}$$

which, without loss of generality, we shall use here.

As well as its more obvious applications, such as a model for the value of an uncertain proportion, the beta distribution has also been suggested (e.g. Law and Kelton (1982)) as ‘a rough model in the absence of data’. In this context, probably the best described (and most extensively criticized!) application in a risk situation is as a model for activity durations in program evaluation and review technique (PERT) analysis of project networks. As has been widely reported (e.g. Clark (1962), Grubbs (1962) and MacCrimmon and Ryavec (1964)), the theoretical link between the beta distribution and the usual PERT estimates of the mean and variance of the activity time is somewhat tenuous. Various attempts have been made (e.g. Farnham and Stanton (1987) and Golenko-Ginzburg (1988)) to refine the PERT estimates based more soundly on the properties of the beta distribution, but there have been no suggestions that the beta distribution itself is inappropriate.

© 1997 Royal Statistical Society 0039–0526/97/46387

The Statistician (1997) 46, No. 3, pp. 387–398

†Address for correspondence: Management Development Centre, Business School, Loughborough University, Loughborough, Leics., LE11 3TU, UK. E-mail: [email protected]

Berny (1989) has proposed a more general distribution of which the beta distribution is a special case. Berny claims that his distribution is more realistic in that it involves four parameters which are ‘amenable to practical estimation’, unlike the beta distribution where the shape parameters p and q must be estimated directly (which would be extremely difficult) or indirectly from estimates of two of the main distribution measures such as the mean, mode or standard deviation. Although Berny's parameters are almost certainly more intuitively obvious than the alternative beta parameters, there is still some question whether the decision maker could reliably estimate (say) the probability of exceeding the mode, which is one of Berny's four parameters. Instead of using a more complicated distribution than the beta, albeit with somewhat more meaningful parameters, this paper explores the alternative approach of using a simpler distribution, the triangular distribution, which requires estimates of just three ‘position’ parameters, namely the minimum, maximum and most likely values.

2. Triangular distribution

In its most general form, the triangular density function is

$$g(x) = \begin{cases} \dfrac{2}{b-a}\,\dfrac{x-a}{m-a}, & a \le x \le m, \\[2ex] \dfrac{2}{b-a}\,\dfrac{b-x}{b-m}, & m \le x \le b. \end{cases} \tag{4}$$

It is again possible to make the simplifying linear transformation (2) which gives

$$g(y) = \begin{cases} 2y/m', & 0 \le y \le m', \\ 2(1-y)/(1-m'), & m' \le y \le 1, \end{cases} \tag{5}$$

where the transformed mode is given by $m' = (m-a)/(b-a)$. This transformation is, however, too restrictive for our purposes as it would obviously imply that the triangular limits must be 0 and 1. Despite the fact that the beta distribution being approximated is restricted to the interval (0, 1), an approximating triangular distribution will need more degrees of freedom to achieve the best possible fit.
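Because the fits discussed later are judged on distribution functions rather than densities, it may help to have the triangular distribution function written out. Integrating density (4) gives the form below. This is the standard result and is not stated explicitly in the original text; the symbol $G$ is the one the paper uses later for the triangular distribution function.

$$G(x) = \begin{cases} 0, & x \le a, \\[1ex] \dfrac{(x-a)^2}{(b-a)(m-a)}, & a \le x \le m, \\[2ex] 1 - \dfrac{(b-x)^2}{(b-a)(b-m)}, & m \le x \le b, \\[2ex] 1, & x \ge b. \end{cases}$$

In particular $G(m) = (m-a)/(b-a)$, so the probability lying below the mode is simply the relative position of the mode within the range.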

Support for the use of the triangular distribution is provided by Williams (1992) who commented that ‘the beta distribution is not easily understood, nor are its parameters easily estimated’, whereas ‘the triangular distribution offers comprehensibility to the project planner’. Williams also reported that project planners are happy to accept the triangular distribution, using 10% and 90% points along with an estimated ‘uncertainty level’.

Like the beta distribution, the triangular distribution can be positively or negatively skewed (or symmetrical) but cannot be other than unimodal. In particular, it would be impossible to produce a triangular distribution shape which reasonably approximates to a uniform, J-shaped or U-shaped distribution. Such distributions arise in the beta family when either $p \le 1$ or $q \le 1$, and we shall therefore restrict our consideration to beta distributions which have a unique mode with $p, q > 1$. In practice this would not usually be too restrictive an assumption as relatively few uncertainty profiles used in risk analysis are likely to be U or J shaped.


3. Fitting a triangular distribution

We wish to examine how well a triangular distribution function will fit a beta distribution function in general. A typical fit is shown in Figs 1 and 2 for the beta(4, 6) distribution. The main point to note is that the very obvious difference in the density functions in Fig. 1 is almost imperceptible in the distribution functions of Fig. 2.

To investigate the goodness of fit, we shall consider a variety of beta distribution shapes determined by the parameters p and q, restricting our attention to unimodal, positively skewed distributions satisfying the condition

$$q \ge p > 1. \tag{6}$$

Rather than systematically varying the shape parameters p and q subject to condition (6), a range of beta distributions was defined by the coefficients of variation v and skewness k given by

$$v = \sqrt{\frac{q}{p(p+q+1)}}, \tag{7}$$

$$k = \frac{2(q-p)}{p+q+2}\sqrt{\frac{p+q+1}{pq}}. \tag{8}$$

Solving equations (7) and (8) to express p and q in terms of v and k gives

$$p = \frac{2(1 + kv - v^2)}{v(kv^2 + 4v - k)}, \tag{9}$$

$$q = \frac{pv^2(p+1)}{1 - pv^2}. \tag{10}$$

From the characteristics of the beta distribution generally, it can be shown that

$$v - 1/v < k < 2v, \qquad v > 0. \tag{11}$$
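The mapping between the shape parameters and the pair (v, k) can be checked numerically. The following is a minimal sketch of equations (7)–(10); the function names are illustrative and do not come from the paper.

```python
import math


def beta_cv_skew(p, q):
    """Coefficient of variation v and skewness k of beta(p, q), equations (7) and (8)."""
    v = math.sqrt(q / (p * (p + q + 1)))
    k = 2 * (q - p) / (p + q + 2) * math.sqrt((p + q + 1) / (p * q))
    return v, k


def beta_shape_from_cv_skew(v, k):
    """Shape parameters p and q recovered from v and k, equations (9) and (10)."""
    p = 2 * (1 + k * v - v**2) / (v * (k * v**2 + 4 * v - k))
    q = p * v**2 * (p + 1) / (1 - p * v**2)
    return p, q


# Round trip for the beta(4, 6) distribution used as the running example.
v, k = beta_cv_skew(4, 6)
p, q = beta_shape_from_cv_skew(v, k)
print(f"v = {v:.4f}, k = {k:.4f}, recovered p = {p:.4f}, q = {q:.4f}")
```

The round trip returns the original shape parameters, which is a quick way of confirming that equations (9) and (10) invert equations (7) and (8).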

Fig. 1. beta(4, 6) (solid line) and optimum triangular (broken line) density functions


In addition, condition (6) further restricts v and k as follows:

$$k \ge 0, \tag{12}$$

$$k \ge \frac{2(3v^2 - 1)}{v(3 - v^2)}. \tag{13}$$

From expressions (11)–(13), the allowable region of values of k and v is as shown in Fig. 3. The portion in which we are particularly interested is the approximate ‘triangle’ with vertices (0, 0), $(1/\sqrt{3}, 0)$ and (1, 2).

Fig. 2. beta(4, 6) (solid line) and optimum triangular (broken line) distribution functions

Fig. 3. Feasible region for coefficients of variation and skewness


Within the triangle of interest, a range of values of v and k was selected at intervals of 0.05 representing a total of 233 different beta distributions covering the full range of feasible values of the coefficients of variation and skewness.

4. Estimating triangular parameters

The optimum triangular parameters a, m and b were determined to minimize three separate metrics, namely the mean-square deviation

$$S = \sum_{i=1}^{n} \{F(x_i) - G(x_i)\}^2 / n,$$

the mean absolute deviation

$$A = \sum_{i=1}^{n} |F(x_i) - G(x_i)| / n$$

and the maximum deviation

$$D = \max_i |F(x_i) - G(x_i)|,$$

where F(x) and G(x) are the beta and triangular distribution functions respectively. In each case 1000 values of $x_i$ were taken at intervals of 0.001 between 0 and 1.
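The three metrics can be evaluated with a few lines of code. The sketch below is illustrative only and rests on several assumptions that are not in the paper: the beta distribution function is computed from the binomial-sum identity, which is valid only for integer p and q; the grid is taken as 0.001, 0.002, ..., 1.000; and the function names are hypothetical. The triangular parameters are the optimum values that the paper reports for the beta(4, 6) distribution.

```python
from math import comb


def beta_cdf(x, p, q):
    """Beta(p, q) distribution function F(x); integer p and q only (binomial-sum identity)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    n = p + q - 1
    return sum(comb(n, j) * x**j * (1.0 - x) ** (n - j) for j in range(p, n + 1))


def triangular_cdf(x, a, m, b):
    """Triangular distribution function G(x) with minimum a, mode m and maximum b (a < m < b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= m:
        return (x - a) ** 2 / ((b - a) * (m - a))
    return 1.0 - (b - x) ** 2 / ((b - a) * (b - m))


def fit_metrics(p, q, a, m, b, n=1000):
    """Mean-square (S), mean absolute (A) and maximum (D) deviation on an n-point grid."""
    xs = [i / n for i in range(1, n + 1)]  # assumed grid: 0.001, 0.002, ..., 1.000
    devs = [beta_cdf(x, p, q) - triangular_cdf(x, a, m, b) for x in xs]
    S = sum(d * d for d in devs) / n
    A = sum(abs(d) for d in devs) / n
    D = max(abs(d) for d in devs)
    return S, A, D


# Optimum triangular parameters reported in the paper for beta(4, 6).
S, A, D = fit_metrics(4, 6, a=0.0612, m=0.3584, b=0.7782)
print(f"S = {S:.2e}, A = {A:.4f}, D = {D:.4f}")
```

With the rounded parameters the computed D should be close to the value of 0.0074 quoted in the paper, although small differences in the last digit are to be expected.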

The results obtained show very little difference in the optimum values of a, m and b under each of the three best fit criteria, with seldom a difference of more than 0.01 in the values of either a, m or b. It was felt that the third criterion, that of minimizing the maximum absolute deviation D between the two distribution functions, was intuitively the most easily understood measure, and so all the results which follow have been derived using this criterion.

An indication of the sort of values that can be expected for D can be obtained by considering probably the only non-triangular beta distribution where the best fit can be determined theoretically, that being the uniform distribution obtained from beta parameters $p = q = 1$. In this case it can be shown that the optimum values of a, m and b are

$$a = (1 - \sqrt{2})/2 = -0.2071,$$

$$m = 0.5,$$

$$b = (1 + \sqrt{2})/2 = 1.2071,$$

which gives a D-value of $(3 - 2\sqrt{2})/4 = 0.0429$. This particular beta distribution corresponds to the lower right-hand corner of the feasible triangle in Fig. 3 and represents one of the worst cases to which to fit a triangular distribution. For most of the range of beta distributions in Fig. 3 the D-value should therefore be better than 0.0429. For example, the beta(4, 6) distribution shown in Fig. 2 gives a D-value of only 0.0074.
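The quoted D-value for the uniform case can be verified directly. What follows is a check of the stated values rather than the paper's own derivation, and it uses the standard triangular distribution function $G(x) = (x-a)^2/\{(b-a)(m-a)\}$ for $a \le x \le m$. With the values above, $(b-a)(m-a) = \sqrt{2} \cdot \sqrt{2}/2 = 1$, so $G(x) = (x-a)^2$ on $0 \le x \le 1/2$, while $F(x) = x$. At the lower end of the interval,

$$G(0) - F(0) = a^2 = \frac{(1-\sqrt{2})^2}{4} = \frac{3 - 2\sqrt{2}}{4},$$

and in the interior $F(x) - G(x) = x - (x-a)^2$ is largest where $2(x-a) = 1$, i.e. at $x = a + \tfrac{1}{2}$, giving

$$F(x) - G(x) = a + \tfrac{1}{2} - \tfrac{1}{4} = \frac{1-\sqrt{2}}{2} + \frac{1}{4} = \frac{3 - 2\sqrt{2}}{4}.$$

The two extreme deviations are equal in size and opposite in sign, and by symmetry the same holds on $1/2 \le x \le 1$. This balancing of the largest positive and negative deviations is what one would expect of a fit that minimizes the maximum deviation.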

For the range of feasible coefficients of variation and skewness identified in Fig. 3 the D-value is generally less than 0.02 and only becomes ‘large’ where the coefficient of variation is above 50% and/or the coefficient of skewness is above 90%. In particular, distributions where the fit is worse are usually characterized by having a very low mode, typically less than 0.05.

There is some evidence (see for example Golenko-Ginzburg (1988)) that the uncertainty in PERT activity durations is typified by a most likely value which is close to the point $(2a + b)/3$ relative to the optimistic a- and pessimistic b-values which, in our case, would suggest a mode of about 0.33. It would be unreasonable to presume that in all risk analysis situations the mode will be quite so restricted, but we can probably exclude extremely low values of the mode as being unrealistic. If we confine our attention to beta distributions with a mode of at least 0.15, the coefficients of variation and skewness will be less than 0.6 and 0.7 respectively. Within this region, there were only three cases with D-values above 0.03 and these correspond to distributions which are very close to a uniform shape.

Alternatively, those beta distributions where the fit is not so good generally lie along the right-hand side of the feasible triangle and can be characterized by a ‘low’ value of p. If we confine our attention to beta distributions with a p-value of at least 2, all the D-values are less than 0.03. These distributions will be bell shaped at both ends and are probably a more realistic representation of the uncertainty in most risk situations. Furthermore, earlier investigations of the accuracy of different beta approximations by Perry and Greig (1975) and Keefer and Bodily (1983) both used a family of test beta distributions with $p \ge 2$. To permit a comparison with these earlier results, we shall likewise consider only the 125 distributions lying to the left of the broken line in Fig. 3 where $p \ge 2$.

In general, we are interested in the extent to which the optimum triangular parameters (a, m and b) can be determined from the characteristics of the beta distribution being approximated. For example, the particular case considered above ($p = 4$ and $q = 6$) has optimum triangular parameters

$$a = 0.0612,$$

$$m = 0.3584,$$

$$b = 0.7782.$$

The beta distribution being approximated has a mode of $(4-1)/(4+6-2) = 0.375$ and

$$P(x < a = 0.0612) = 0.0014,$$

$$P(x < b = 0.7782) = 0.9947.$$

In this case, therefore, the minimum and maximum values of the best fitting triangular distribution correspond approximately to the 0.1% and 99.5% points of the beta distribution, and the triangular mode is about 95.6% of the beta mode. We are interested in the extent to which relationships of this nature hold for different beta distributions.

Not surprisingly, the ratio of the optimum triangular mode to the beta mode is not the same for all distributions, and the triangular extremes do not correspond to fixed beta percentiles. The cumulative beta probability associated with the lower limit a ranges from 0 to 0.0093, whereas that for the upper limit b ranges from 0.986 to 1. Furthermore, 47 (i.e. 40%) of the a-values are negative, and 22 (i.e. 19%) of the b-values are greater than 1.

If the optimum triangular parameters are regressed against either the 5% or the 10% fractiles and the median, some very strong relationships emerge. In particular, using the 5% fractiles:

$$a = 1.71\beta_{0.05} - 0.85\beta_{0.5} + 0.16\beta_{0.95}, \qquad r^2 = 0.998,$$

$$m = -0.85\beta_{0.05} + 2.72\beta_{0.5} - 0.87\beta_{0.95}, \qquad r^2 = 0.999,$$

$$b = -0.39\beta_{0.05} + 0.15\beta_{0.5} + 1.20\beta_{0.95}, \qquad r^2 = 1.000.$$

Likewise, using the 10% fractiles:

$$a = 2.24\beta_{0.1} - 1.63\beta_{0.5} + 0.39\beta_{0.9}, \qquad r^2 = 0.998,$$

$$m = -1.38\beta_{0.1} + 3.78\beta_{0.5} - 1.40\beta_{0.9}, \qquad r^2 = 0.999,$$

$$b = -0.08\beta_{0.1} - 0.72\beta_{0.5} + 1.80\beta_{0.9}, \qquad r^2 = 1.000.$$

It appears, therefore, that the optimum triangular parameters can be very accurately estimated from a combination of either the 5% or 10% fractiles and the median. Use of the 5%, 50% and 95% fractiles is in line with the mean approximation suggested by Pearson and Tukey (1965), whereas the 10%, 50% and 90% fractiles are used in the Swanson and Megill estimate (attributed to Swanson in Megill (1977)). There is some evidence (e.g. Davidson and Cooper (1980)) that asking managers to estimate fractiles as extreme as 5% and 95% can produce unreliable results, in which case it is probably safer to use the 10% fractiles rather than the 5% fractiles.

If the equations for estimating the triangular parameters are represented by

$$a = \sum_{i=1}^{3} c_{ai}\beta_i,$$

$$m = \sum_{i=1}^{3} c_{mi}\beta_i,$$

$$b = \sum_{i=1}^{3} c_{bi}\beta_i,$$

where $\beta_1$, $\beta_2$ and $\beta_3$ represent the lower fractile, the median and the upper fractile respectively, limiting considerations as $v, k \to 0$ imply that the coefficients in these equations should satisfy the conditions

$$\sum c_{ai} = \sum c_{mi} = \sum c_{bi} = 1, \tag{14}$$

$$c_{m1} = c_{m3}. \tag{15}$$

In addition, the triangular mean is given by

$$\mu = \frac{a + m + b}{3} = \sum_{i=1}^{3} c_{\cdot i}\beta_i$$

where $c_{\cdot i} = (c_{ai} + c_{mi} + c_{bi})/3$. For an unbiased estimate, the coefficients in any such three-point estimate should be symmetrical, in which case

$$c_{\cdot 1} = c_{\cdot 3} = (1 - c_{\cdot 2})/2. \tag{16}$$

Conditions (14)–(16) imply that the nine coefficients $c_{ij}$ ($i = a, m, b$; $j = 1, 2, 3$) can be simply determined from the four ‘key’ values $c_{a1}$, $c_{m2}$, $c_{b3}$ and $c_{\cdot 2}$.
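One way of seeing how the four key values fix the remaining five coefficients is to solve conditions (14)–(16) in sequence. The sketch below is an illustration of that reasoning, not code from the paper; the function name and the dictionary layout are hypothetical. It is exercised with the 10% fractile key values that appear in Table 1.

```python
def coefficients_from_key_values(c_a1, c_m2, c_b3, c_dot2):
    """Recover all nine coefficients from the four key values using conditions (14)-(16)."""
    # (16): columns 1 and 3 each average (1 - c_dot2) / 2, so each column sums to three times that.
    column_sum = 1.5 * (1.0 - c_dot2)
    # (14) and (15): the m row sums to 1 and is symmetric.
    c_m1 = c_m3 = (1.0 - c_m2) / 2.0
    # Column sums give the remaining corner coefficients.
    c_b1 = column_sum - c_a1 - c_m1
    c_a3 = column_sum - c_m3 - c_b3
    # (14): the a and b rows each sum to 1.
    c_a2 = 1.0 - c_a1 - c_a3
    c_b2 = 1.0 - c_b1 - c_b3
    return {
        "a": (c_a1, c_a2, c_a3),
        "m": (c_m1, c_m2, c_m3),
        "b": (c_b1, c_b2, c_b3),
    }


# Key values for the 10% fractiles, taken from Table 1.
coeffs = coefficients_from_key_values(c_a1=2.29, c_m2=3.9, c_b3=1.85, c_dot2=0.45)
for name, row in coeffs.items():
    print(name, [round(c, 3) for c in row])
```

The rows produced in this way agree with the 10% fractile columns of Table 1, which suggests that the tabulated coefficients were indeed constrained by conditions (14)–(16).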

It is reassuring to note that conditions (14)–(16) are approximately satisfied by the above equations, particularly when using the 10% fractiles. If, however, we adopt a more constrained approach to the estimation of the optimum triangular parameters whereby the sum of squared deviations in a, m and b is minimized subject to conditions (14)–(16), the modified equations given in Table 1 are obtained. The formulae for the means in Table 1 correspond reasonably well with the fractile-based approximations proposed by Pearson and Tukey (1965) and Swanson and Megill (1977), namely

$$\mu_{PT} = 0.185\,X(0.05) + 0.63\,X(0.5) + 0.185\,X(0.95), \tag{17}$$

$$\mu_{SM} = 0.3\,X(0.1) + 0.4\,X(0.5) + 0.3\,X(0.9). \tag{18}$$

TABLE 1. Fractile estimates of the triangular parameters

| Parameter | 10% fractiles: $\beta_1$ | 10% fractiles: $\beta_2$ | 10% fractiles: $\beta_3$ | 5% fractiles: $\beta_1$ | 5% fractiles: $\beta_2$ | 5% fractiles: $\beta_3$ |
|---|---|---|---|---|---|---|
| a | 2.29 | −1.715 | 0.425 | 1.7 | −0.885 | 0.185 |
| m | −1.45 | 3.9 | −1.45 | −0.9 | 2.8 | −0.9 |
| b | −0.015 | −0.835 | 1.85 | −0.275 | 0.035 | 1.24 |
| $\mu$ | 0.275 | 0.45 | 0.275 | 0.175 | 0.65 | 0.175 |

$r^2$ = 0.9995 (10% fractiles), 0.9992 (5% fractiles)

A more detailed comparison of the relative accuracy of the triangular estimates against the Swanson–Megill and Pearson–Tukey estimates is given in the following section.

As an illustration, consider again the beta(4, 6) distribution. The optimum values of a, m and b given above would be estimated by using the 10% fractiles as follows. Given that

$$\beta_{0.1} = 0.2104,$$

$$\beta_{0.5} = 0.3931,$$

$$\beta_{0.9} = 0.5994,$$

then

$$a = 2.29 \times 0.2104 - 1.715 \times 0.3931 + 0.425 \times 0.5994 = 0.0624,$$

$$m = -1.45 \times 0.2104 + 3.90 \times 0.3931 - 1.45 \times 0.5994 = 0.3589,$$

$$b = -0.015 \times 0.2104 - 0.835 \times 0.3931 + 1.85 \times 0.5994 = 0.7775.$$

Using a triangular distribution function based on these estimated parameters gives a maximum deviation of $D = 0.0075$ as opposed to the best possible value of 0.0074. In addition, the estimated mean and variance of the underlying beta distribution are

$$\mu = (0.0624 + 0.3589 + 0.7775)/3 = 0.3996,$$

$$\sigma^2 = (0.0624^2 + 0.3589^2 + 0.7775^2 - 0.0624 \times 0.3589 - 0.0624 \times 0.7775 - 0.3589 \times 0.7775)/18 = 0.0215.$$

The true mean and variance are 0.4 and 0.0218, giving percentage errors of −0.1% and −1.4%.
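The arithmetic of this illustration is easy to reproduce. The sketch below applies the 10% fractile coefficients of Table 1 to the three fractiles quoted above and then uses the standard triangular mean and variance formulae; the function and variable names are illustrative and are not taken from the paper.

```python
# Coefficients for the 10% fractiles, as given in Table 1
# (order: lower fractile, median, upper fractile).
TABLE1_10PCT = {
    "a": (2.29, -1.715, 0.425),
    "m": (-1.45, 3.90, -1.45),
    "b": (-0.015, -0.835, 1.85),
}


def triangular_from_fractiles(lower, median, upper, coeffs=TABLE1_10PCT):
    """Estimate (a, m, b) as linear combinations of three fractiles."""
    fractiles = (lower, median, upper)
    return tuple(
        sum(c * f for c, f in zip(coeffs[name], fractiles)) for name in ("a", "m", "b")
    )


def triangular_mean_variance(a, m, b):
    """Mean and variance of a triangular distribution with minimum a, mode m, maximum b."""
    mean = (a + m + b) / 3.0
    variance = (a * a + m * m + b * b - a * m - a * b - m * b) / 18.0
    return mean, variance


# 10%, 50% and 90% fractiles of the beta(4, 6) distribution quoted in the text.
a, m, b = triangular_from_fractiles(0.2104, 0.3931, 0.5994)
mean, variance = triangular_mean_variance(a, m, b)
print(f"a = {a:.4f}, m = {m:.4f}, b = {b:.4f}")
print(f"mean = {mean:.4f}, variance = {variance:.4f}")
```

Swapping in the 5% fractile coefficients only requires a second coefficient dictionary with the same layout.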

5. Accuracy of triangular estimates

A number of previous studies (e.g. Perry and Greig (1975), Keefer and Bodily (1983) and Keefer and Verdini (1993)) have reported the results of various beta approximations, mainly in terms of the errors that arise in estimating the mean and variance. The consensus is that simple three-point distributions derived from the Pearson–Tukey and the Swanson–Megill estimates given in equations (17) and (18) provide clearly the most accurate estimates of the mean and variance. As would be expected, the fit of a three-point discrete distribution function to that of the beta distribution is not particularly good, the two D-values being 0.315 and 0.2 respectively, but in some circumstances this may not be a major concern.
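The two D-values quoted here follow directly from the weights in equations (17) and (18), on the assumption that each three-point distribution places those weights at the corresponding fractiles of the beta distribution. Writing $w$ for the outer weight, the discrete distribution function jumps from $w$ to $1 - w$ at the median, where the beta distribution function equals 0.5, so the largest deviation occurs on either side of that jump:

$$D_{PT} = 0.5 - 0.185 = 0.315, \qquad D_{SM} = 0.5 - 0.3 = 0.2.$$

The deviations at the outer fractiles are smaller ($0.185 - 0.05 = 0.135$ and $0.3 - 0.1 = 0.2$ respectively), so they do not increase either value.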

The most detailed evaluation is that by Keefer and Bodily (1983) who, similarly to Perry and Greig (1975), used a set of 78 trial beta distributions involving all combinations of $p, q = 2, 3, 4, 5, 6, 8, 10, 12, 15, 20, 30, 60$ where $p \le q$. These 78 distributions are represented by the points to the left of the broken line in Fig. 3 and, as can be seen, cover most of that portion of the feasible ‘triangle’. A summary of the error and percentage error in the Pearson–Tukey and Swanson–Megill estimates of the mean and variance of the 78 distributions is given in Table 2.

Keefer and Bodily (1983) also considered four estimates based on the triangular distribution. Two of these use the 5% fractiles and the mode, with the fractiles being used either as the triangular extremes or as the corresponding triangular fractiles. Of these two, the former gives a reasonably good estimate of the beta mean (with an average error of 0.27%) but is much worse in estimating the variance, where the average error is 54.2%. The latter is much better in estimating the beta variance (average error 3.7%) but is worse for the mean (average error 2.4%). A third triangular estimate is derived from a bitriangular distribution with two right-angled triangles placed back to back at the median and with matching 5% fractiles. This distribution is probably the best of the four considered as it performs reasonably well for both the mean and the variance (with average errors of 1.16% and 4.1% respectively) but has the obvious drawback for skewed distributions of a discontinuity at the median. The fourth triangular distribution uses the beta mode along with the beta extremes of 0 and 1 as the corresponding triangular parameters. Not surprisingly, this produces excessively large errors for both the mean and the variance and is not therefore a feasible proposition.

On the very limited evidence of the one example quoted above, the triangular estimate developed here performs much better than any of the triangular estimates considered by Keefer and Bodily (1983) and gives results which are comparable with those from the Pearson–Tukey or Swanson–Megill estimates. To provide a more thorough comparison, triangular estimates based on the equations of Table 1 were obtained for both the mean and the variance of all 78 distributions used by Keefer and Bodily. The results for the same error measures as in Table 2 are given in Table 3 and are a clear improvement on the triangular distributions used by Keefer and Bodily. In particular, errors in estimates based on the 5% fractiles are smaller than the corresponding figure for any of the Keefer and Bodily triangular distributions. The estimates based on the 10% fractiles are slightly less accurate, but nevertheless there is only one of the Keefer and Bodily triangular distributions which gives marginally more accurate estimates of the variance, but at the expense of markedly less accurate estimates of the mean.

It must be remembered, however, that the estimated triangular parameters given by the equations in Table 1 were derived from a least squares fit to the optimum triangular parameters based on a Kolmogorov–Smirnov criterion. In particular, although these estimates give a good fit to the beta distribution function, they are not likely to be optimal when estimating the mean and variance. If more importance attaches to obtaining accurate estimates of the mean and variance, alternative equations are required for a, m and b, subject still to conditions (14)–(16).

TABLE 2. Errors in Pearson–Tukey and Swanson–Megill estimates

| Approximation | Mean: maximum error | Mean: maximum % error | Mean: average absolute error | Mean: average absolute % error | Variance: maximum error | Variance: maximum % error | Variance: average absolute error | Variance: average absolute % error |
|---|---|---|---|---|---|---|---|---|
| Pearson–Tukey | 0.00015 | 0.07 | 0.00004 | 0.02 | −0.00080 | −1.6 | 0.00006 | 0.46 |
| Swanson–Megill | 0.00103 | 0.33 | 0.00012 | 0.05 | 0.00552 | 11.1 | 0.00042 | 2.7 |

TABLE 3. Errors in proposed triangular estimates

| Triangular approximation using | Mean: maximum error | Mean: maximum % error | Mean: average absolute error | Mean: average absolute % error | Variance: maximum error | Variance: maximum % error | Variance: average absolute error | Variance: average absolute % error |
|---|---|---|---|---|---|---|---|---|
| 5% fractiles | 0.0010 | 0.77 | 0.0003 | 0.18 | 0.0013 | 6.63 | 0.0003 | 3.21 |
| 10% fractiles | 0.0012 | 1.28 | 0.0004 | 0.26 | −0.0036 | −8.62 | 0.0004 | 4.50 |

Consider first the estimation of the beta mean. As noted previously,

$$\mu = \sum_{i=1}^{3} c_{\cdot i}\beta_i = \tfrac{1}{2}(1 - c_{\cdot 2})(\beta_1 + \beta_3) + c_{\cdot 2}\beta_2,$$

i.e.

$$2\mu - \beta_1 - \beta_3 = c_{\cdot 2}(2\beta_2 - \beta_1 - \beta_3).$$

Regressing $2\mu - \beta_1 - \beta_3$ against $2\beta_2 - \beta_1 - \beta_3$ gives least squares estimates of $c_{\cdot 2}$ as 0.63 when based on the 5% fractiles and 0.415 when based on the 10% fractiles. The correlation is virtually perfect in both cases (with $r^2$-values of 1.0000 and 0.9986). It is comforting that these values confirm exactly the Pearson–Tukey coefficients and are reasonably close to the Swanson–Megill coefficients, where the $\beta_2$-weights are 0.63 and 0.4 respectively.
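The regression described here can be sketched with the standard library alone. Several choices below are assumptions rather than details taken from the paper: the regression is run through the origin, it uses the 78 test distributions with integer shape parameters and p not exceeding q, the beta distribution function is evaluated with the binomial-sum identity (valid for integer p and q only), and the fractiles are found by bisection. All function names are hypothetical.

```python
from math import comb


def make_beta_cdf(p, q):
    """Return F(x) for beta(p, q) with integer p and q (binomial-sum identity)."""
    n = p + q - 1
    terms = [(comb(n, j), j) for j in range(p, n + 1)]

    def cdf(x):
        return sum(c * x**j * (1.0 - x) ** (n - j) for c, j in terms)

    return cdf


def fractile(cdf, prob, tol=1e-12):
    """Invert a distribution function on (0, 1) by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def median_weight(tail, shapes=(2, 3, 4, 5, 6, 8, 10, 12, 15, 20, 30, 60)):
    """Least squares slope (through the origin) of 2*mu - b1 - b3 on 2*b2 - b1 - b3."""
    sxy = sxx = 0.0
    for p in shapes:
        for q in shapes:
            if p > q:
                continue
            cdf = make_beta_cdf(p, q)
            b1, b2, b3 = (fractile(cdf, pr) for pr in (tail, 0.5, 1.0 - tail))
            mu = p / (p + q)
            x = 2.0 * b2 - b1 - b3
            y = 2.0 * mu - b1 - b3
            sxy += x * y
            sxx += x * x
    return sxy / sxx


print(f"5% fractiles:  c.2 = {median_weight(0.05):.3f}")
print(f"10% fractiles: c.2 = {median_weight(0.10):.3f}")
```

If the assumptions above are close to what was actually done, the two slopes should land near the 0.63 and 0.415 reported in the text; a different set of distributions or an intercept term would shift them slightly.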

The remaining three key values ($c_{a1}$, $c_{m2}$ and $c_{b3}$) can now be determined to minimize the estimation errors in the variance. It is not possible to minimize simultaneously all four of the variance errors shown in Tables 2 and 3, and the approach taken was firstly to determine the minimum possible value of each of the four types of error and then to search for a solution which is reasonably close (in percentage terms) to the minimum values of both the average absolute error and the average absolute percentage error while not producing too large values for the maximum absolute error and the maximum absolute percentage error. The results of this are shown in Table 4 which gives an alternative set of coefficients to those in Table 1 where the objective is now to produce as accurate as possible estimates of the beta mean and variance.

Inevitably the fit to the optimum triangular parameters is less good than before but still gives $r^2$-values that are well in excess of 99%. As a result, the fit of the triangular distribution function to that of the beta distribution has deteriorated slightly, but not by an amount that would be of any real consequence in practice. For example, the fit of a triangular distribution derived from the equations of Table 4 to the beta(4, 6) distribution produces D-values of 0.0086 and 0.0090 compared with the ‘optimum’ value of 0.0074.

To examine the performance of the above triangular estimates more thoroughly, the mean and variance of the 78 distributions of Keefer and Bodily (1983) were again estimated. The errors arising from this are given in Table 5. Comparing the triangular estimate based on the 5% fractiles with the Pearson–Tukey estimates in Table 2, it can be seen that the errors in estimating the mean are identical (as would be expected because $c_{\cdot 2} = 0.63$ as in equation (17)), but the accuracy of the variance estimates has improved marginally. In particular, there has been approximately a 40% decrease in the average errors at the expense of approximately a 20% increase in the maximum errors. This trade-off between the average and maximum errors could of course be adjusted if required by modifying slightly the estimating equations. It is certainly possible to improve on both the average and the maximum errors in the Pearson–Tukey estimates by adjusting the criterion used to obtain the equations in Table 4 to give less weight to the average errors and more to the maximum errors.

TABLE 4. Alternative fractile estimates of the triangular parameters

| Parameter | 10% fractiles: $\beta_1$ | 10% fractiles: $\beta_2$ | 10% fractiles: $\beta_3$ | 5% fractiles: $\beta_1$ | 5% fractiles: $\beta_2$ | 5% fractiles: $\beta_3$ |
|---|---|---|---|---|---|---|
| a | 1.970 | −1.088 | 0.118 | 1.315 | −0.143 | −0.173 |
| m | −1.233 | 3.465 | −1.233 | −0.565 | 2.130 | −0.565 |
| b | 0.140 | −1.133 | 1.993 | −0.195 | −0.097 | 1.293 |
| $\mu$ | 0.293 | 0.415 | 0.293 | 0.185 | 0.63 | 0.185 |

$r^2$ = 0.9983 (10% fractiles), 0.9944 (5% fractiles)

Comparing the triangular estimate based on the 10% fractiles with the Swanson–Megill estimates, it is clear that the triangular estimates again give similar accuracy when estimating the mean and are somewhat better at estimating the variance. The similarity of the results for the mean is not surprising as the value of $c_{\cdot 2}$ is again close to the coefficient of the median in equation (18). The average errors in estimating the variance are marginally better than the Swanson–Megill estimates, whereas the maximum errors show a marked improvement and are about half those of the Swanson–Megill estimates.

6. Conclusion

The foregoing analysis has shown the usefulness of the triangular distribution as a proxy for the beta distribution. For those beta distributions which are bell shaped at both sides with $p, q \ge 2$, the triangular distribution can be expected to produce a distribution function which is very similar to that of the beta distribution, with a D-value which is everywhere less than 0.03 and only greater than 0.02 in cases of extreme skewness.

The optimum triangular parameters can be very reliably estimated from a simple linear combination of two extreme fractiles and the median. Most of the estimates traditionally used in risk analysis are based on either the 5% or 10% fractiles to indicate the limits of the uncertainty, and either the median or the mode as a measure of location. In keeping with this, this paper has examined the accuracy of estimates based on both the 5% and the 10% fractiles, from which it appears that the 5% fractiles provide appreciably more accurate estimates of both the mean and the variance. This result confirms the earlier work of Keefer and Bodily (1983) who examined several three-point approximations and concluded that a simple extension of the Pearson–Tukey approximation, using the 5% fractiles, was the most accurate. Some researchers have questioned the reliability with which 5% fractiles can be estimated and have claimed that the 10% points are a more practical proposition. If this view is taken, the triangular parameters can still be estimated with a high degree of accuracy based on the 10% points in line with an alternative three-point approximation suggested by Swanson and Megill. In essence, the procedure proposed can be regarded as a simple triangulation of the basic Pearson–Tukey and Swanson–Megill estimates as opposed to extending them into simple three-point discrete distributions.

It has been shown that different estimating equations can be derived corresponding to alternative ‘best fit’ criteria. In particular, the equations which give a best fit to the beta distribution function do not generally lead to the most accurate estimates of the mean and variance.

TABLE 5. Errors in alternative triangular estimates

| Triangular approximation using | Mean: maximum error | Mean: maximum % error | Mean: average absolute error | Mean: average absolute % error | Variance: maximum error | Variance: maximum % error | Variance: average absolute error | Variance: average absolute % error |
|---|---|---|---|---|---|---|---|---|
| 5% fractiles | 0.00015 | 0.073 | 0.00004 | 0.018 | 0.00096 | 1.93 | 0.00004 | 0.27 |
| 10% fractiles | 0.00054 | 0.396 | 0.00011 | 0.065 | −0.00293 | −5.86 | 0.00029 | 2.70 |


An alternative set of equations has been derived which generally minimizes the average errors in estimating the mean and variance, and which leads to slight improvements over the corresponding Pearson-Tukey and Swanson-Megill estimates. Specifically, using triangular formulae based on the 5% fractiles allows the mean and variance of the beta distribution to be estimated with average errors of about 0.02% and 0.3% respectively, whereas the 10% fractiles yield average errors of 0.06% and 2.7%.

The decision maker is therefore presented with the relatively straightforward task of estimating two extreme percentiles as well as the median. For most decision makers, this should be appreciably easier than estimating the parameters of the underlying beta distribution, or using most of the alternatives to a beta distribution so far proposed. Furthermore, the process of sampling from a triangular distribution is much simpler than that for a beta distribution. A simple one-to-one transformation of a uniform random deviate will produce the required triangular sample much more efficiently than the more complicated procedure involved in generating a random sample from a beta distribution. This may not be a major consideration when using standard risk analysis computer software, as most packages would include beta random number generators. However, in situations involving spreadsheet-based models, beta random numbers would not normally be available, in which case the relatively simple triangular formulae may provide a straightforward alternative.
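The one-to-one transformation referred to above is the inverse of the triangular distribution function. A minimal Python sketch follows, assuming a triangular distribution with minimum a, mode c and maximum b; these symbols and the function name are illustrative and are not taken from the paper's notation.

```python
import math
import random


def sample_triangular(a, c, b, u=None):
    """Draw one triangular deviate by inverting the distribution function.

    a, c, b are the minimum, mode and maximum (a <= c <= b, a < b).
    u is a uniform deviate on [0, 1); one is drawn if none is supplied.
    """
    if u is None:
        u = random.random()
    cut = (c - a) / (b - a)  # value of the distribution function at the mode
    if u < cut:
        # Rising limb: solve (x - a)^2 / ((b - a)(c - a)) = u for x.
        return a + math.sqrt(u * (b - a) * (c - a))
    # Falling limb: solve 1 - (b - x)^2 / ((b - a)(b - c)) = u for x.
    return b - math.sqrt((1.0 - u) * (b - a) * (b - c))


# Hypothetical use: 10,000 deviates from a triangle on (0, 0.2, 1).
draws = [sample_triangular(0.0, 0.2, 1.0) for _ in range(10000)]
print(sum(draws) / len(draws))  # should be close to (a + b + c) / 3
```

Python's standard library also provides `random.triangular(low, high, mode)`, so the explicit formula matters mainly where no such routine exists, for example in a spreadsheet cell, where the same two square-root expressions can be entered directly.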

References

Berny, J. (1989) A new distribution function for risk analysis. J. Ops Res. Soc., 40, 1121–1127.

Clark, C. E. (1962) The PERT model for the distribution of an activity. Ops Res., 10, 405–406.

Davidson, L. B. and Cooper, D. O. (1980) Implementing effective risk analysis at Getty Oil. Interfaces, 10, 62–75.

Farnum, N. R. and Stanton, L. W. (1987) Some results concerning the estimation of beta distribution parameters in PERT. J. Ops Res. Soc., 38, 287–290.

Golenko-Ginzburg, D. (1988) On the distribution of activity time in PERT. J. Ops Res. Soc., 39, 767–771.

Grubbs, F. E. (1962) Attempts to validate certain PERT statistics or ‘Picking on PERT’. Ops Res., 10, 912–915.

Keefer, D. L. and Bodily, S. E. (1983) Three-point approximations for continuous random variables. Mangmnt Sci., 29, 595–609.

Keefer, D. L. and Verdini, W. A. (1993) Better estimation of PERT activity time. Mangmnt Sci., 39, 1086–1091.

Law, A. M. and Kelton, W. D. (1982) Simulation Modelling and Analysis. New York: McGraw-Hill.

MacCrimmon, K. R. and Ryavec, C. A. (1964) An analytical study of the PERT assumptions. Ops Res., 12, 16–37.

Megill, R. E. (1977) An Introduction to Risk Analysis. Tulsa: Petroleum Publishing Company.

Pearson, E. S. and Tukey, J. W. (1965) Approximate means and standard deviations based on distances between percentage points of frequency curves. Biometrika, 52, 533–546.

Perry, C. and Greig, I. D. (1975) Estimating the mean and variance of subjective distributions in PERT and decision analysis. Mangmnt Sci., 21, 1477–1480.

Williams, T. M. (1992) Practical use of distributions in network analysis. J. Ops Res. Soc., 43, 265–270.
