A Bivariate Distribution Family with Specified Marginals
Author(s): Mark E. Johnson and Aaron Tenenbein
Source: Journal of the American Statistical Association, Vol. 76, No. 373 (Mar., 1981), pp. 198-201
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2287067


A Bivariate Distribution Family With Specified Marginals

MARK E. JOHNSON and AARON TENENBEIN*

A systematic approach is given for constructing continuous bivariate distributions with specified marginals and fixed dependence measures. This approach is based on linear combinations of independent random variables and results in bivariate distributions that can attain the Fréchet bounds. The dependence measures considered are Spearman's rho and Kendall's tau. Applications to testing for sensitivity in simulation models are discussed.

KEY WORDS: Bivariate Distributions; Spearman's rho; Kendall's tau; Simulation Models; Fréchet bounds.

1. INTRODUCTION

In many simulation applications, it is required to generate dependent pairs of continuous random variables for which there is limited information on the joint distribution. For example, in a portfolio analysis simulation application, a joint distribution of stock and bond returns may have to be specified. Because of lack of data, it may be difficult to specify completely the joint distribution of stock and bond returns. However, it may be realistic to specify the marginal distributions and some measures of dependence between the random variables. A risk simulation application was discussed by Eilen and Fowkes (1973) and Hall (1977).

* Mark E. Johnson is a Staff Member with Statistics Group S-1 at Los Alamos National Laboratory, Los Alamos, NM 87545. Aaron Tenenbein is Professor of Statistics, Quantitative Analysis Area, New York University Graduate School of Business Administration, New York, NY 10006. This research was carried out while Aaron Tenenbein was a visiting scientist at Los Alamos Scientific Laboratory. The authors wish to thank the Laboratory for Computer Science, Mathlab Group, at the Massachusetts Institute of Technology for the use of their computer system, MACSYMA, which is supported in part by the United States Energy Research and Development Administration under contract E(11-1)-3070 and the National Aeronautics and Space Administration under Grant NSG 1323. The authors also wish to acknowledge the comments of a referee and the editors, whose suggestions greatly improved the exposition of the material.

The problem of constructing bivariate distributions with specified marginals has been discussed by Morgenstern (1956), Farlie (1960), Nataf (1962), Fréchet (1951), Plackett (1965), Mardia (1967), and Kimeldorf and Sampson (1975a, 1975b); this literature is reviewed by Conway (1979) and discussed by Barnett (1980). These papers describe various dependence-parameterized families of bivariate distributions with given marginals. In this paper we present the weighted linear combination (WLC) method for constructing families of bivariate distributions $F(x, y)$ with specified marginals $F_1(x)$ and $F_2(y)$, and whose measures of dependence, Kendall's tau ($\tau$) or Spearman's rho ($\rho_s$), can be specified (Kendall 1962; Kruskal 1958). This construction scheme is based on the method of translation suggested by Nataf (1962). To generate a pair of random variables $(X, Y)$ with given $F_1(x)$ and $F_2(y)$, we proceed as follows:

Let

$$U = U' \tag{1.1}$$

and

$$V = cU' + (1 - c)V', \tag{1.2}$$

where $U'$ and $V'$ are independent and identically distributed random variables with common probability density function $g(t)$ and $c$ is a constant in the interval $(0, 1)$. (The choices of $g(t)$ and $c$ are specified later; they determine $\tau$, $\rho_s$, and $F(x, y)$.)

Let $X' = H_1(U)$ and $Y' = H_2(V)$, where $H_1(u)$ and $H_2(v)$ are the distribution functions of $U$ and $V$, respectively.

Define

$$X = F_1^{-1}(X') = F_1^{-1}(H_1(U)) \tag{1.3}$$

and

$$Y = F_2^{-1}(Y') = F_2^{-1}(H_2(V)) \tag{1.4a}$$

or

$$Y = F_2^{-1}(1 - Y') = F_2^{-1}(1 - H_2(V)). \tag{1.4b}$$

Since $X'$, $Y'$, and $1 - Y'$ are uniformly distributed over the interval $(0, 1)$, $X$ defined by (1.3) and $Y$ defined by either (1.4a) or (1.4b) will have a joint distribution with marginals $F_1(x)$ and $F_2(y)$, respectively. Positive and negative values of $\rho_s$ and $\tau$ result when $Y$ is defined by (1.4a) and (1.4b), respectively.
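Steps (1.1) through (1.4b) can be sketched in Python for the uniform choice of $g(t)$. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the exponential marginals, the sample size, and the seed are all hypothetical choices made for the example. The function `h2_uniform` is the distribution function of $cU' + (1 - c)V'$ for independent uniform $U'$ and $V'$.

```python
import math
import random


def h2_uniform(v, c):
    """CDF of V = c*U' + (1 - c)*V' when U', V' are independent Uniform(0, 1)."""
    b = min(c, 1.0 - c)
    if v <= 0.0:
        return 0.0
    if v >= 1.0:
        return 1.0
    if v < b:
        return v * v / (2.0 * b * (1.0 - b))
    if v <= 1.0 - b:
        return (2.0 * v - b) / (2.0 * (1.0 - b))
    return 1.0 - (1.0 - v) ** 2 / (2.0 * b * (1.0 - b))


def exponential_inverse_cdf(p, mean=1.0):
    """Hypothetical marginal for the example: inverse CDF of an exponential."""
    p = min(max(p, 0.0), 1.0 - 1e-12)  # guard against log(0)
    return -mean * math.log1p(-p)


def wlc_pair(c, f1_inv, f2_inv, rng, negative=False):
    """Generate one (X, Y) pair by steps (1.1)-(1.4), with g(t) uniform on (0, 1)."""
    u_prime, v_prime = rng.random(), rng.random()
    u = u_prime                              # (1.1)
    v = c * u_prime + (1.0 - c) * v_prime    # (1.2)
    x_prime = u                              # H1(u) = u in the uniform case
    y_prime = h2_uniform(v, c)
    if negative:
        y_prime = 1.0 - y_prime              # (1.4b): negative dependence
    return f1_inv(x_prime), f2_inv(y_prime)  # (1.3) and (1.4a) or (1.4b)


def spearman_rho(xs, ys):
    """Sample Spearman rank correlation (assumes no ties)."""
    def ranks(values):
        order = sorted(range(len(values)), key=values.__getitem__)
        result = [0] * len(values)
        for rank, index in enumerate(order):
            result[index] = rank
        return result
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


rng = random.Random(12345)
pairs = [wlc_pair(0.5, exponential_inverse_cdf, exponential_inverse_cdf, rng)
         for _ in range(20000)]
xs, ys = zip(*pairs)
# Table 3 pairs c = .500 with rho_s = .7 for the uniform case,
# so the sample value should land close to 0.7.
print(round(spearman_rho(xs, ys), 3))
```

Passing `negative=True` switches from (1.4a) to (1.4b), which flips the sign of the dependence while leaving both marginals unchanged.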

The advantages of the WLC family of bivariate distributions over the single-parameter families discussed previously are as follows:

1. The WLC family contains as members the Fréchet upper and lower bounds, given by Mardia (1970). The upper and lower bounds are attained when $c = 1$ and $Y$ is defined by (1.4a) and (1.4b), respectively. It follows from a general result proved by Tchen (1976) that this family has members for which $\rho_s$ and $\tau$ take on any given value in the interval $(-1, 1)$.

2. The WLC family of joint distributions readily lends itself to the development of random variate generation for use in simulation models.

3. There are two degrees of freedom in constructing $F(x, y)$: the weighting factor, $c$, and the underlying density function, $g(t)$. The weighting factor affects the measure of dependence. By varying $g(t)$ while holding $c$ constant, we can assess the sensitivity of the simulation model's final results to the specification of $F(x, y)$.

© Journal of the American Statistical Association, March 1981, Volume 76, Number 373, Theory and Methods Section

Table 1. $H_1(u)$ and $H_2(v)$ for the Weighted Linear Combination Procedure

Uniform:

$$H_1(u) = u, \qquad 0 \le u \le 1$$

$$H_2(v) = \begin{cases} \dfrac{v^2}{2b(1 - b)}, & 0 \le v \le b \\[1ex] \dfrac{2v - b}{2(1 - b)}, & b \le v \le 1 - b \\[1ex] 1 - \dfrac{(1 - v)^2}{2b(1 - b)}, & 1 - b \le v \le 1 \end{cases} \qquad \text{where } b = \min(c, 1 - c)$$

Standard Normal:

$$H_1(u) = \Phi(u), \qquad H_2(v) = \Phi\!\left(\frac{v}{\sqrt{c^2 + (1 - c)^2}}\right)$$

Double Exponential:

$$H_1(u) = \begin{cases} \tfrac{1}{2}\exp(u), & u < 0 \\ 1 - \tfrac{1}{2}\exp(-u), & u > 0 \end{cases}$$

Case 1, $c \ne \tfrac{1}{2}$:

$$H_2(v) = \begin{cases} \dfrac{(1 - c)^2 \exp(v/(1 - c)) - c^2 \exp(v/c)}{2(1 - 2c)}, & v < 0 \\[1ex] 1 - \dfrac{(1 - c)^2 \exp(-v/(1 - c)) - c^2 \exp(-v/c)}{2(1 - 2c)}, & v > 0 \end{cases}$$

Case 2, $c = \tfrac{1}{2}$:

$$H_2(v) = \begin{cases} (1 - v)\exp(2v)/2, & v < 0 \\ 1 - (v + 1)\exp(-2v)/2, & v > 0 \end{cases}$$

Exponential:

$$H_1(u) = 1 - \exp(-u), \qquad u > 0$$

$$H_2(v) = \begin{cases} 1 - \dfrac{(1 - c)\exp(-v/(1 - c)) - c\exp(-v/c)}{1 - 2c}, & v > 0 \text{ and } c \ne \tfrac{1}{2} \\[1ex] 1 - (2v + 1)\exp(-2v), & v > 0 \text{ and } c = \tfrac{1}{2} \end{cases}$$
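The $H_2(v)$ entries of Table 1 for the normal, exponential, and double exponential cases can be written as short functions and compared with a Monte Carlo estimate of $P(cU' + (1 - c)V' \le v)$. The sketch below is illustrative only: the function names, sample size, and seed are assumptions, and it requires $0 < c < 1$.

```python
import math
import random


def h2_normal(v, c):
    """H2 when g is standard normal: V ~ N(0, c^2 + (1 - c)^2)."""
    sd = math.sqrt(c * c + (1.0 - c) ** 2)
    return 0.5 * (1.0 + math.erf(v / (sd * math.sqrt(2.0))))


def h2_exponential(v, c):
    """H2 when g(t) = exp(-t) for t > 0; requires 0 < c < 1."""
    if v <= 0.0:
        return 0.0
    if math.isclose(c, 0.5):
        return 1.0 - (2.0 * v + 1.0) * math.exp(-2.0 * v)
    a = 1.0 - c
    return 1.0 - (a * math.exp(-v / a) - c * math.exp(-v / c)) / (1.0 - 2.0 * c)


def h2_double_exponential(v, c):
    """H2 when g(t) = exp(-|t|)/2; requires 0 < c < 1."""
    t = abs(v)
    if math.isclose(c, 0.5):
        tail = (1.0 + t) * math.exp(-2.0 * t) / 2.0
    else:
        a = 1.0 - c
        tail = (a * a * math.exp(-t / a) - c * c * math.exp(-t / c)) / (2.0 * (1.0 - 2.0 * c))
    # By symmetry, P(V <= v) equals the tail mass when v < 0.
    return tail if v < 0.0 else 1.0 - tail


def empirical_h2(v, c, draw, rng, n=20000):
    """Monte Carlo estimate of P(c*U' + (1 - c)*V' <= v) for i.i.d. draws from g."""
    hits = sum(c * draw(rng) + (1.0 - c) * draw(rng) <= v for _ in range(n))
    return hits / n


def draw_exponential(rng):
    return rng.expovariate(1.0)


def draw_laplace(rng):
    # A standard double exponential draw: exponential magnitude, random sign.
    return rng.expovariate(1.0) * rng.choice((-1.0, 1.0))


rng = random.Random(2024)
print(h2_exponential(0.8, 0.3), empirical_h2(0.8, 0.3, draw_exponential, rng))
print(h2_double_exponential(0.4, 0.3), empirical_h2(0.4, 0.3, draw_laplace, rng))
```

The closed form and the simulated value on each printed line should agree to about two decimal places at this sample size.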

Throughout this paper we use $\tau$ and $\rho_s$ as measures of dependence rather than the Pearson product moment correlation coefficient, $\rho$. There are three reasons for this approach. First, $\rho$ is not defined if either $F_1(x)$ or $F_2(y)$ has an infinite variance; however, $\rho_s$ and $\tau$ are always defined. Second, the invariance property of $|\rho_s|$ and $|\tau|$ under strictly monotone transformations of the random variables implies, from equations (1.3) and either (1.4a) or (1.4b), that

$$|\rho_s(X, Y)| = |\rho_s(U, V)|, \qquad |\tau(X, Y)| = |\tau(U, V)|.$$

These results make the use of $\rho_s$ and $\tau$ more convenient as measures of dependence for the WLC family, since the values of these measures are not functions of the marginal distribution functions, $F_1(x)$ and $F_2(y)$. This property is also discussed by Schucany, Parr, and Boyer (1978) in relation to other bivariate distribution systems. Third, the value of $\rho$ does not cover the entire range, $-1 < \rho < 1$, for this family of joint distributions, as discussed by Conway (1979).

Table 2. Formulas for $\rho_s$ and $\tau$ as Functions of $c$ in the Weighted Linear Combination Procedure

Uniform:

$$\rho_s(U, V) = \begin{cases} \dfrac{c(10 - 13c)}{10(1 - c)^2}, & 0 < c < .5 \\[1ex] \dfrac{3c^3 + 16c^2 - 11c + 2}{10c^3}, & .5 < c < 1 \end{cases} \qquad \tau(U, V) = \begin{cases} \dfrac{4c - 5c^2}{6(1 - c)^2}, & 0 < c < .5 \\[1ex] \dfrac{11c^2 - 6c + 1}{6c^2}, & .5 < c < 1 \end{cases}$$

Normal:

$$\rho_s(U, V) = \frac{6}{\pi} \arcsin\!\left(\frac{c}{2\sqrt{c^2 + (1 - c)^2}}\right), \qquad \tau(U, V) = \frac{2}{\pi} \arcsin\!\left(\frac{c}{\sqrt{c^2 + (1 - c)^2}}\right)$$

Double Exponential:

$$\rho_s(U, V) = \frac{c(9 - 18c^2 + 14c^3 - 3c^4)}{2(2 - c)^2}, \qquad \tau(U, V) = \frac{c(3 + 3c - 2c^2)}{4}$$

Exponential:

$$\rho_s(U, V) = \frac{c(3 - 2c)}{2 - c}, \qquad \tau(U, V) = c$$

Table 3. Values of $c$ Required as a Function of $\rho_s$ or $\tau$

            Required value of c                             Required value of c
$\rho_s$  Uniform  Normal  Double Exp.  Exponential    $\tau$  Uniform  Normal  Double Exp.  Exponential
0         0        0       0            0              0       0        0       0            0
.1        .0935    .0952   .0828        .0675          .1      .135     .137    .120         .100
.2        .176     .176    .158         .137           .2      .246     .245    .224         .200
.3        .250     .248    .229         .208           .3      .341     .338    .320         .300
.4        .317     .314    .299         .282           .4      .424     .421    .411         .400
.5        .380     .377    .370         .360           .5      .500     .500    .500         .500
.6        .440     .440    .443         .442           .6      .576     .579    .589         .600
.7        .500     .507    .522         .531           .7      .659     .662    .680         .700
.8        .569     .583    .611         .630           .8      .754     .755    .776         .800
.9        .667     .684    .722         .750           .9      .865     .863    .880         .900
1.0       1.000    1.000   1.000        1.000          1.0     1.000    1.000   1.000        1.000
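The relationship between Tables 2 and 3 can be illustrated by coding the Table 2 formulas and inverting them numerically for $c$. The sketch below uses plain bisection, which is an assumption of this example (the paper does not say how Table 3 was computed); the function names and string labels are likewise hypothetical.

```python
import math


def rho_s(c, dist):
    """Spearman's rho of (U, V) as a function of c, following Table 2."""
    if dist == "uniform":
        if c < 0.5:
            return c * (10.0 - 13.0 * c) / (10.0 * (1.0 - c) ** 2)
        return (3.0 * c**3 + 16.0 * c**2 - 11.0 * c + 2.0) / (10.0 * c**3)
    if dist == "normal":
        return 6.0 / math.pi * math.asin(c / (2.0 * math.sqrt(c * c + (1.0 - c) ** 2)))
    if dist == "double_exponential":
        return c * (9.0 - 18.0 * c**2 + 14.0 * c**3 - 3.0 * c**4) / (2.0 * (2.0 - c) ** 2)
    if dist == "exponential":
        return c * (3.0 - 2.0 * c) / (2.0 - c)
    raise ValueError(f"unknown distribution: {dist}")


def tau(c, dist):
    """Kendall's tau of (U, V) as a function of c, following Table 2."""
    if dist == "uniform":
        if c < 0.5:
            return (4.0 * c - 5.0 * c * c) / (6.0 * (1.0 - c) ** 2)
        return (11.0 * c * c - 6.0 * c + 1.0) / (6.0 * c * c)
    if dist == "normal":
        return 2.0 / math.pi * math.asin(c / math.sqrt(c * c + (1.0 - c) ** 2))
    if dist == "double_exponential":
        return c * (3.0 + 3.0 * c - 2.0 * c * c) / 4.0
    if dist == "exponential":
        return c
    raise ValueError(f"unknown distribution: {dist}")


def solve_c(target, measure, dist, tol=1e-10):
    """Find c in (0, 1) with measure(c, dist) == target by bisection.

    Relies on the measure increasing from 0 to 1 as c goes from 0 to 1.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if measure(mid, dist) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for name in ("uniform", "normal", "double_exponential", "exponential"):
    print(name, round(solve_c(0.6, rho_s, name), 3), round(solve_c(0.6, tau, name), 3))
```

The printed values can be compared with the $\rho_s = .6$ and $\tau = .6$ rows of Table 3.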

Section 2 of this paper discusses specific results concerning the WLC procedure. Section 3 discusses the applications of this family to Monte Carlo simulation.

2. THE WEIGHTED LINEAR COMBINATION PROCEDURE (WLC)

To apply the WLC procedure, expressions are needed for $H_1(u)$, $H_2(v)$, $\rho_s(U, V)$, and $\tau(U, V)$ in terms of $c$ and $g(t)$. The expressions for $H_1(u)$ and $H_2(v)$ allow us to apply the WLC procedure for given choices of $c$ and $g(t)$. The expressions for $\tau(U, V)$ and $\rho_s(U, V)$ allow us to specify $c$ for a given choice of $g(t)$ in terms of the required value of either $\tau$ or $\rho_s$. These expressions are derived by Johnson and Tenenbein (1979), and computations are carried out for the case of the uniform, normal, exponential, and double exponential distributions. The results are given in Tables 1 through 4.
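As an illustration of how one such expression can be checked, the exponential formula for $\rho_s$ can be recovered directly from $H_1$ and $H_2$. The derivation below is a sketch written for this exposition, not a reproduction of the technical report. It writes $W$ for $V'$ and assumes the standard identity $\rho_s = 12\,E[H_1(U)H_2(V)] - 3$ for continuous random variables.

```latex
\begin{align*}
\rho_s(U,V) &= 12\,E[H_1(U)\,H_2(V)] - 3,
  \qquad U \sim \mathrm{Exp}(1),\; V = cU + (1-c)W,\; W \sim \mathrm{Exp}(1) \\
1 - H_1(u) &= e^{-u}, \qquad
1 - H_2(v) = \frac{(1-c)\,e^{-v/(1-c)} - c\,e^{-v/c}}{1-2c}
  \quad (c \ne \tfrac12) \\
E[H_1(U)H_2(V)] &= E\big[e^{-U}\,(1 - H_2(V))\big]
  \quad \text{since } E[e^{-U}] = E[1 - H_2(V)] = \tfrac12 \\
E\big[e^{-U} e^{-V/(1-c)}\big] &= E\big[e^{-U/(1-c)}\big]\,E\big[e^{-W}\big]
  = \frac{1-c}{2-c}\cdot\frac12 \\
E\big[e^{-U} e^{-V/c}\big] &= E\big[e^{-2U}\big]\,E\big[e^{-(1-c)W/c}\big]
  = \frac13\cdot c \\
E[H_1(U)H_2(V)] &= \frac{1}{1-2c}\left[\frac{(1-c)^2}{2(2-c)} - \frac{c^2}{3}\right]
  = \frac{3 - c^2}{6(2-c)} \\
\rho_s(U,V) &= \frac{2(3 - c^2)}{2-c} - 3 = \frac{c(3-2c)}{2-c}
\end{align*}
```

At $c = .75$ this gives $\rho_s = .9$, which agrees with the exponential entry of Table 3 for $\rho_s = .9$.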

To show the application of these tables, suppose it is desired to generate random variables whose $\rho_s = -.6$ by using the double exponential distribution for $g(t)$. Table 1 gives us the corresponding expressions for $H_1(u)$ and $H_2(v)$, and Table 3 gives us the required value of $c = .443$. We would then use these values in (1.1), (1.2), (1.3), and (1.4b); this would result in the generation of a random variable pair $(X, Y)$ whose marginals are $F_1(x)$ and $F_2(y)$, and whose $\rho_s$ value is $-.6$.
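This worked example can be sketched in code. The marginals below (a standard normal for $X$ and an exponential with mean 2 for $Y$) are hypothetical choices made only to have something concrete to invert, and the helper names, sample size, and seed are likewise assumptions.

```python
import math
import random
from statistics import NormalDist


def h1_double_exponential(u):
    """H1 from Table 1: CDF of the standard double exponential."""
    return 0.5 * math.exp(u) if u < 0.0 else 1.0 - 0.5 * math.exp(-u)


def h2_double_exponential(v, c):
    """H2 from Table 1, Case 1 (c != 1/2)."""
    a, t = 1.0 - c, abs(v)
    tail = (a * a * math.exp(-t / a) - c * c * math.exp(-t / c)) / (2.0 * (1.0 - 2.0 * c))
    return tail if v < 0.0 else 1.0 - tail


def clamp(p, eps=1e-12):
    """Keep probabilities strictly inside (0, 1) before inverting a CDF."""
    return min(max(p, eps), 1.0 - eps)


def draw_laplace(rng):
    return rng.expovariate(1.0) * rng.choice((-1.0, 1.0))


def generate_pair(c, f1_inv, f2_inv, rng):
    u_prime, v_prime = draw_laplace(rng), draw_laplace(rng)
    u = u_prime                                        # (1.1)
    v = c * u_prime + (1.0 - c) * v_prime              # (1.2)
    x = f1_inv(clamp(h1_double_exponential(u)))        # (1.3)
    y = f2_inv(clamp(1.0 - h2_double_exponential(v, c)))  # (1.4b)
    return x, y


def spearman_rho(xs, ys):
    """Sample Spearman rank correlation (assumes no ties)."""
    def ranks(values):
        order = sorted(range(len(values)), key=values.__getitem__)
        result = [0] * len(values)
        for rank, index in enumerate(order):
            result[index] = rank
        return result
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


normal_inv = NormalDist(0.0, 1.0).inv_cdf          # hypothetical F1^{-1}
def exponential_inv(p, mean=2.0):                  # hypothetical F2^{-1}
    return -mean * math.log1p(-p)

rng = random.Random(443)
sample = [generate_pair(0.443, normal_inv, exponential_inv, rng) for _ in range(20000)]
xs, ys = zip(*sample)
print(round(spearman_rho(xs, ys), 3))  # expected to fall near -0.6
```

Because $\rho_s$ depends only on ranks, swapping in other continuous marginals should leave the sample value essentially unchanged.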

3. APPLICATIONS TO MONTE CARLO SIMULATION

Many simulation applications require the specification of a joint continuous bivariate distribution as input. If there is an adequate theory or sufficient data on which to base a specific bivariate distribution, the problem is well defined. Johnson and Kotz (1972) compiled a large selection of bivariate and multivariate distributions, and Fishman (1973) discussed methods of generating random variates from given bivariate distributions. If there is insufficient data available to specify a unique bivariate distribution, it may still be realistic to specify the marginal distributions of the random variables and a measure of dependence between them. In this case, the problem is not well defined because there are many bivariate distributions that have these properties. A variety of assumed joint distributions must then be used to assess the sensitivity of the simulation model's results to the choice of joint distribution.

Table 4. The Joint Density Function $h(u, v)$ for the Weighted Linear Combination Procedure

Uniform, $g(t) = 1$ $(0 < t < 1)$:

$$h(u, v) = \frac{1}{1 - c}, \qquad cu < v < cu + 1 - c, \quad 0 < u < 1$$

Normal, $g(t)$ the $N(0, 1)$ density:

$$h(u, v) \text{ is bivariate normal with } E(U) = E(V) = 0, \quad \sigma_U^2 = 1, \quad \sigma_V^2 = c^2 + (1 - c)^2, \quad \rho = \frac{c}{\sqrt{c^2 + (1 - c)^2}}, \qquad -\infty < u < \infty, \quad -\infty < v < \infty$$

Double Exponential, $g(t) = \tfrac{1}{2}\exp(-|t|)$ $(-\infty < t < \infty)$:

$$h(u, v) = \frac{\exp\!\left[-|u| - |v - cu|/(1 - c)\right]}{4(1 - c)}, \qquad -\infty < u < \infty, \quad -\infty < v < \infty$$

Exponential, $g(t) = \exp(-t)$ $(0 < t < \infty)$:

$$h(u, v) = \frac{\exp\!\left[(-v - u + 2cu)/(1 - c)\right]}{1 - c}, \qquad cu < v < \infty, \quad 0 < u < \infty$$
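The entries of Table 4 all follow from one change of variables applied to (1.1) and (1.2). The short derivation below is a sketch added for illustration; it is not quoted from the paper.

```latex
\begin{align*}
(U, V) &= \big(U',\; cU' + (1-c)V'\big)
  \;\Longrightarrow\; U' = U, \quad V' = \frac{V - cU}{1 - c},
  \quad \left|\frac{\partial(u', v')}{\partial(u, v)}\right| = \frac{1}{1-c} \\
h(u, v) &= \frac{1}{1-c}\; g(u)\; g\!\left(\frac{v - cu}{1 - c}\right) \\
\text{Uniform: } h(u, v) &= \frac{1}{1-c}
  \quad \text{on } 0 < u < 1,\; 0 < \frac{v - cu}{1-c} < 1
  \;\Longleftrightarrow\; cu < v < cu + 1 - c \\
\text{Exponential: } h(u, v) &= \frac{1}{1-c}\exp\!\left[-u - \frac{v - cu}{1-c}\right]
  = \frac{1}{1-c}\exp\!\left[\frac{-v - u + 2cu}{1-c}\right]
  \quad \text{on } u > 0,\; v > cu
\end{align*}
```

The double exponential entry follows the same way, with $g(t) = \tfrac12 \exp(-|t|)$ contributing the factor $\tfrac14$.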

The WLC procedure produces a family of distributions that permits the simulation user to assess the sensitivity of the results of the simulation application. This family has two parameters: the weighting factor $c$ and the underlying density $g(t)$. By considering the three members of this family corresponding to $g(t)$ as double exponential, normal, and uniform, and with the weighting factor adjusted to give the same $\rho_s$ or $\tau$, we can assess the effect of tail weight in $g(t)$ on the final results of the simulation model. By considering the two members of this family corresponding to $g(t)$ as exponential and normal, we can assess the effect of skewness in $g(t)$ on the final results of the simulation model.

One specific application is in portfolio analysis, where the random variables $(X, Y)$ are the rates of return on two fixed assets. The objective is to compute the probability that the rate of return on a portfolio, consisting of the assets in fixed proportions, will exceed a given rate of return. By specifying the marginal distributions of these rates of return and the value of $\rho_s$ or $\tau$, the WLC family can be used to test the sensitivity of this probability to the specification of the joint distribution by using different choices for $g(t)$.
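A rough sketch of such a sensitivity check follows. Everything numerical in it is a hypothetical assumption made for illustration: the normal marginals for the two returns, the equal portfolio weights, the 10% threshold, the sample size, and the seed. The two values of $c$ are the Table 3 entries for $\rho_s = .5$ under uniform and exponential $g(t)$.

```python
import math
import random
from statistics import NormalDist


def h2_uniform(v, c):
    """H2 from Table 1 for uniform g(t); v is assumed to lie in [0, 1]."""
    b = min(c, 1.0 - c)
    if v < b:
        return v * v / (2.0 * b * (1.0 - b))
    if v <= 1.0 - b:
        return (2.0 * v - b) / (2.0 * (1.0 - b))
    return 1.0 - (1.0 - v) ** 2 / (2.0 * b * (1.0 - b))


def h2_exponential(v, c):
    """H2 from Table 1 for exponential g(t), valid for v >= 0 and c != 1/2."""
    a = 1.0 - c
    return 1.0 - (a * math.exp(-v / a) - c * math.exp(-v / c)) / (1.0 - 2.0 * c)


def draw_uniform(c, rng):
    """Return (X', Y') = (H1(U), H2(V)) with g(t) uniform on (0, 1)."""
    up, vp = rng.random(), rng.random()
    return up, h2_uniform(c * up + (1.0 - c) * vp, c)


def draw_exponential(c, rng):
    """Return (X', Y') = (H1(U), H2(V)) with g(t) = exp(-t)."""
    up, vp = rng.expovariate(1.0), rng.expovariate(1.0)
    return 1.0 - math.exp(-up), h2_exponential(c * up + (1.0 - c) * vp, c)


def exceed_probability(draw, c, weight, threshold, n=20000, seed=1):
    """Estimate P(weight*X + (1 - weight)*Y > threshold) by simulation."""
    rng = random.Random(seed)
    stock = NormalDist(0.08, 0.15)  # hypothetical marginal F1
    bond = NormalDist(0.04, 0.05)   # hypothetical marginal F2
    eps = 1e-12                     # keeps inv_cdf arguments inside (0, 1)
    hits = 0
    for _ in range(n):
        x_prime, y_prime = draw(c, rng)
        x = stock.inv_cdf(min(max(x_prime, eps), 1.0 - eps))  # (1.3)
        y = bond.inv_cdf(min(max(y_prime, eps), 1.0 - eps))   # (1.4a)
        if weight * x + (1.0 - weight) * y > threshold:
            hits += 1
    return hits / n


p_uniform = exceed_probability(draw_uniform, 0.380, weight=0.5, threshold=0.10)
p_exponential = exceed_probability(draw_exponential, 0.360, weight=0.5, threshold=0.10)
print(p_uniform, p_exponential)
```

Both runs share the same marginals and the same target $\rho_s$, so any gap between the two printed probabilities beyond simulation noise reflects the choice of $g(t)$ alone.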

Another application is in the area of robustness of multivariate tests of hypothesis. For example, the Hotelling $T^2$ test assumes that the joint distribution of $K$ random variables is multivariate normal. By using the WLC procedure in the special bivariate case, $K = 2$, we can assess the sensitivity of the level of significance to the specification of the joint distribution. The following question could be investigated in such a study: Is the Hotelling $T^2$ test relatively robust so long as the marginal distributions are normal? This question could be evaluated by comparing the level of significance values resulting from the three WLC members where $F_1(x)$ and $F_2(y)$ are normal and $g(t)$ is successively uniform, double exponential, and exponential. When $g(t)$ is normal, the joint distribution of $(X, Y)$ is bivariate normal.

[Received October 1977. Revised May 1980.]

REFERENCES

BARNETT, VIC (1980), "Some Bivariate Uniform Distributions," Communications in Statistics, A9, 453-461.

CONWAY, DELORES A. (1979), "Multivariate Distributions With Specified Marginals," Technical Report No. 145, Stanford University, Dept. of Statistics.

EILEN, S., and FOWKES, T.P. (1973), "Sampling Procedures for Risk Simulation," Operational Research Quarterly, 24, 241-252.

FARLIE, D.J.G. (1960), "The Performance of Some Correlation Coefficients for a General Bivariate Distribution," Biometrika, 47, 307-323.

FISHMAN, GEORGE E. (1973), Concepts and Methods in Discrete Event Digital Simulation, New York: John Wiley.

FRÉCHET, M. (1951), "Sur les Tableaux de Corrélation dont les Marges sont Données," Ann. University of Lyon, Sec. A, Ser. 3, 14, 53-77.

HALL, J.C. (1977), "Dealing With Dependence in Risk Simulation," Operational Research Quarterly, 29, 202-213.

JOHNSON, MARK E., and TENENBEIN, AARON (1979), "Bivariate Distributions With Given Marginals and Fixed Measures of Dependence," Technical Report No. LA-7700-MS, Los Alamos Scientific Laboratory.

JOHNSON, NORMAN L., and KOTZ, SAMUEL (1972), Distributions in Statistics: Continuous Multivariate Distributions, New York: John Wiley.

KENDALL, MAURICE G. (1962), Rank Correlation Methods, London: Charles Griffin.

KIMELDORF, GEORGE, and SAMPSON, ALLAN R. (1975a), "One-Parameter Families of Bivariate Distributions with Fixed Marginals," Communications in Statistics, 4, 293-301.

KIMELDORF, GEORGE, and SAMPSON, ALLAN R. (1975b), "Uniform Representations of Bivariate Distributions," Communications in Statistics, 4, 617-627.

KRUSKAL, WILLIAM H. (1958), "Ordinal Measures of Association," Journal of the American Statistical Association, 53, 814-859.

MARDIA, KANTILAL V. (1967), "Some Contributions to Contingency-Type Bivariate Distributions," Biometrika, 54, 235-249.

MARDIA, KANTILAL V. (1970), Families of Bivariate Distributions, Darien, Conn.: Hafner Publishing.

MORGENSTERN, D. (1956), "Einfache Beispiele Zweidimensionaler Verteilungen," Mitteilungsblatt für Mathematische Statistik, 8, 234-235.

NATAF, A. (1962), "Détermination des Distributions de Probabilités dont les Marges sont Données," C.R. Academy of Sciences, 255, 42-43.

PLACKETT, R.L. (1965), "A Class of Bivariate Distributions," Journal of the American Statistical Association, 60, 516-522.

SCHUCANY, WILLIAM R., PARR, WILLIAM C., and BOYER, JOHN E. (1978), "Correlation Structure in Farlie-Gumbel-Morgenstern Distributions," Biometrika, 65, 650-653.

TCHEN, ANDRE (1976), "Inequalities for Distributions With Given Marginals," Technical Report No. 19, Stanford University, Dept. of Statistics.
