
Stat Papers (2013) 54:177–192
DOI 10.1007/s00362-011-0417-y

REGULAR ARTICLE

The Kumaraswamy distribution: median-dispersion re-parameterizations for regression modeling and simulation-based estimation

Pablo A. Mitnik · Sunyoung Baek

Received: 25 February 2009 / Revised: 15 November 2011 / Published online: 13 January 2012
© Springer-Verlag 2012

Abstract The Kumaraswamy distribution is very similar to the Beta distribution, but has the important advantage of an invertible closed-form cumulative distribution function. The parameterization of the distribution in terms of shape parameters and the lack of simple expressions for its mean and variance hinder, however, its use for modeling purposes. The paper presents two median-dispersion re-parameterizations of the Kumaraswamy distribution aimed at facilitating its use in regression models in which both the location and the dispersion parameters are functions of their own distinct sets of covariates, and in latent-variable and other models estimated through simulation-based methods. In both re-parameterizations the dispersion parameter establishes a quantile-spread order among Kumaraswamy distributions with the same median and support. The study also describes the behavior of the re-parameterized distributions, determines some of their limiting distributions, and discusses the potential comparative advantages of using them in the context of regression modeling and simulation-based estimation.

Keywords Kumaraswamy distribution · Beta distribution · Median-dispersion parameterization · Quantile-spread order · Limiting distributions · Regression modeling · Generalized linear models · Latent-variable models · Simulation-based estimation methods

Mathematics Subject Classifications (2000) 62E99 · 62F99 · 62J12

Electronic supplementary material The online version of this article (doi:10.1007/s00362-011-0417-y) contains supplementary material, which is available to authorized users.

P. A. Mitnik (B) · S. Baek
Center for the Study of Poverty and Inequality, Stanford University, 450 Serra Mall, Building 370, Room 212, Stanford, CA 94305-2029, USA
e-mail: [email protected]


1 Introduction

The Kumaraswamy distribution is a continuous probability distribution with double-bounded support. It is very similar, in many respects, to the Beta distribution. The behavior of both distributions is governed by two shape and two boundary parameters. The relationships between the distributions' possible shapes and the values of their shape parameters are qualitatively identical, and both distributions are special cases of McDonald's (1984) generalized Beta of the first kind. Most importantly, these two distributions are very flexible and can take approximately the same shapes; therefore, they can be used to model the same (large variety of) random processes and uncertainties (see Garg 2008; Jones 2009; Kumaraswamy 1980; Mitnik Forthcoming and Nadarajah 2008 for the Kumaraswamy distribution and for the relationships between the two distributions; see Johnson et al. 1995, Chap. 25, for the Beta distribution).

There are, however, important pragmatic differences between these two distributions. On the one hand, the availability for the Kumaraswamy, but not for the Beta distribution, of an invertible closed-form cumulative distribution function makes the former distribution much better suited than the latter for activities that require the generation of random variates (Jones 2009), in particular simulation modeling and simulation-based model estimation (Mitnik Forthcoming). On the other hand, the availability of simple closed-form expressions for the mean and the variance of the Beta distribution in terms of its shape parameters has made it straightforward to derive location-dispersion re-parameterizations of this distribution, which in turn has facilitated its use in modeling. Indeed, several authors have employed re-parameterizations of the Beta distribution in terms of its mean and either a dispersion or a precision parameter to conduct likelihood ratio tests aimed at comparing location and scale differences across data sets (Mielke 1975), to specify prior parameters via quantile estimates in the context of the Bayesian approach (van Dorp and Mazzuchi 2004), and to develop and estimate regression models (Cribari-Neto and Souza Forthcoming; Cribari-Neto and Zeileis 2010; Espinheira et al. 2008a,b; Ferrari and Cribari-Neto 2004; Ferrari et al. 2011; Kieschnick and McCullough 2003; Ospina et al. 2006; Paolino 2001; Rocha and Simas 2011; Simas et al. 2010; Smithson and Verkuilen 2006; Vasconcellos and Cribari-Neto 2005). In contrast, the lack of tractable-enough expressions for the mean and variance of the Kumaraswamy distribution has hindered its use for modeling purposes; in spite of the advantages that the availability of an invertible closed-form cumulative distribution function entails, the Kumaraswamy distribution has been employed rather sparingly in the modeling of stochastic phenomena and processes (for examples, see Courard-Hauri 2007; Fletcher and Ponnambalam 1996; Ganji et al. 2006; Sanchez et al. 2007; Seifi et al. 2000; Sundar and Subbiah 1989) and, to the best of our knowledge, it has never been used for regression modeling or in simulation-based model estimation.

In this article we address this issue by presenting two median-based location-dispersion re-parameterizations of the Kumaraswamy distribution, one of which the first author is currently employing in on-going empirical research on labor markets. The article is organized as follows. In Sect. 2 we summarize the features of the Kumaraswamy distribution, in its standard parameterization, relevant for the rest of the article. In Sect. 3 we present the median-dispersion re-parameterizations and prove


that, in both re-parameterizations, the dispersion parameter establishes a quantile-spread order among Kumaraswamy distributions with the same median and support. In Sect. 4 we describe the relationships between the shapes of the re-parameterized Kumaraswamy distributions and the values of their parameters, and identify some of their limiting distributions. In Sect. 5 we discuss the contexts in which using models based on the re-parameterized Kumaraswamy distributions should be more advantageous than employing models based on the standard version of the distribution, models based on the re-parameterized Beta distribution, and semi-parametric models of the median of the dependent variable. In Sect. 6 we present brief concluding remarks.

2 Standard parameterization of the Kumaraswamy distribution

The probability density, cumulative distribution, and quantile functions of the general form of the Kumaraswamy distribution—that is, the form of the distribution with support in any open interval of the real line with upper bound b and lower bound c—are the following:

f(x) = (b - c)^{-1}\, p\, q \left(\frac{x - c}{b - c}\right)^{p-1} \left[1 - \left(\frac{x - c}{b - c}\right)^{p}\right]^{q-1}   (1)

F(x) = 1 - \left[1 - \left(\frac{x - c}{b - c}\right)^{p}\right]^{q}   (2)

x = F^{-1}(u) = c + (b - c)\left[1 - (1 - u)^{1/q}\right]^{1/p},   (3)

where c < x < b, 0 < u < 1, and p > 0 and q > 0 are shape parameters. We will denote this general form of the distribution by K(p, q, c, b). The standard form of the distribution obtains when c = 0 and b = 1.
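A minimal Python sketch of Eqs. (1)–(3), with function names of our own choosing:

import numpy as np

def kuma_pdf(x, p, q, c=0.0, b=1.0):
    """Density (1) of the general Kumaraswamy distribution K(p, q, c, b)."""
    z = (x - c) / (b - c)                      # standardize to (0, 1)
    return (p * q / (b - c)) * z**(p - 1) * (1 - z**p)**(q - 1)

def kuma_cdf(x, p, q, c=0.0, b=1.0):
    """Cumulative distribution function (2)."""
    z = (x - c) / (b - c)
    return 1 - (1 - z**p)**q

def kuma_ppf(u, p, q, c=0.0, b=1.0):
    """Quantile function (3), the closed-form inverse of (2)."""
    return c + (b - c) * (1 - (1 - u)**(1 / q))**(1 / p)

# round-trip check: F(F^{-1}(u)) = u on an arbitrary support (c, b) = (-1, 4)
u = np.linspace(0.01, 0.99, 9)
assert np.allclose(kuma_cdf(kuma_ppf(u, 2.0, 3.0, -1.0, 4.0), 2.0, 3.0, -1.0, 4.0), u)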

The rth moment around zero of X ∼ K(p, q, c, b) (Mitnik Forthcoming) is

\mu'_r(X) = (b - c)^{r} \sum_{j=0}^{r} \binom{r}{j}\, \mu'_{r-j}(Y) \left(\frac{c}{b - c}\right)^{j},   (4)

where \binom{r}{j} = \frac{r!}{j!\,(r-j)!} is a binomial coefficient, \mu'_n(Y) = q\,B\!\left(1 + \frac{n}{p},\, q\right) is the nth moment around zero of Y ∼ K(p, q) ≡ K(p, q, 0, 1), B(\alpha, \beta) = \int_0^1 s^{\alpha-1}(1-s)^{\beta-1}\,ds = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)} is the Beta function, and \Gamma(\upsilon) = \int_0^{\infty} t^{\upsilon-1} e^{-t}\,dt is the Gamma function. From (4), the expectation and variance of the general form of the distribution are

E(X) = c + (b - c)\, q\, B\!\left(1 + \tfrac{1}{p},\, q\right)

and

Var(X) = (b - c)^{2} \left\{ q\, B\!\left(1 + \tfrac{2}{p},\, q\right) - \left[ q\, B\!\left(1 + \tfrac{1}{p},\, q\right) \right]^{2} \right\}.

As anticipated in the introduction, the available expression for E(X) makes a mean-based re-parameterization unfeasible.
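As a quick sanity check, a sketch of ours comparing these moment expressions against Monte Carlo draws generated with the quantile function (3):

import numpy as np
from scipy.special import beta as B

def kuma_mean_var(p, q, c=0.0, b=1.0):
    """E(X) and Var(X) for X ~ K(p, q, c, b), using mu'_n(Y) = q B(1 + n/p, q)."""
    m1 = q * B(1 + 1 / p, q)                   # E(Y)
    m2 = q * B(1 + 2 / p, q)                   # E(Y^2)
    return c + (b - c) * m1, (b - c)**2 * (m2 - m1**2)

rng = np.random.default_rng(0)
p, q, c, b = 2.0, 3.0, 1.0, 5.0
y = (1 - (1 - rng.uniform(size=1_000_000))**(1 / q))**(1 / p)   # inverse-CDF draws of Y
x = c + (b - c) * y
print(kuma_mean_var(p, q, c, b))               # exact values
print(x.mean(), x.var())                       # simulated counterparts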


From (3), however, we immediately obtain a simple expression for the median,

md(X) = \omega = c + (b - c)\left(1 - 0.5^{1/q}\right)^{1/p},   (5)

which provides the basis for the median-dispersion re-parameterizations we introduce in the next section. Relevant for Sect. 4, Eq. 3 also allows us to express the inter-quartile range as

IQR(X) = (b - c)\left[\left(1 - 0.25^{1/q}\right)^{1/p} - \left(1 - 0.75^{1/q}\right)^{1/p}\right],

while Mitnik (Forthcoming) has shown that the mean absolute deviation around the median, \delta_2(X) = \int_c^b |x - \omega|\, f(x)\,dx, can be expressed as

\delta_2(X) = (b - c)\left[2 q\, B\!\left(2^{-1/q},\, q,\, 1 + \tfrac{1}{p}\right) - q\, B\!\left(q,\, 1 + \tfrac{1}{p}\right)\right],

where B(z, \alpha, \beta) = \int_0^z s^{\alpha-1}(1-s)^{\beta-1}\,ds is the incomplete Beta function.
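These closed forms are easy to verify numerically; a sketch of ours, using SciPy, checks the median, the inter-quartile range, and δ_2 against the quantile function and direct numerical integration:

import numpy as np
from scipy.integrate import quad
from scipy.special import beta as B, betainc

p, q, c, b = 2.0, 3.0, 0.0, 1.0
ppf = lambda u: c + (b - c) * (1 - (1 - u)**(1 / q))**(1 / p)               # Eq. (3)
pdf = lambda x: (p * q / (b - c)) * ((x - c) / (b - c))**(p - 1) \
                * (1 - ((x - c) / (b - c))**p)**(q - 1)                     # Eq. (1)

omega = c + (b - c) * (1 - 0.5**(1 / q))**(1 / p)                           # Eq. (5)
iqr = (b - c) * ((1 - 0.25**(1 / q))**(1 / p) - (1 - 0.75**(1 / q))**(1 / p))

# incomplete Beta with the paper's argument order: B(z, alpha, beta)
inc_B = lambda z, a, bb: betainc(a, bb, z) * B(a, bb)
mad = (b - c) * (2 * q * inc_B(2.0**(-1 / q), q, 1 + 1 / p) - q * B(q, 1 + 1 / p))

print(np.isclose(omega, ppf(0.5)))                                          # median
print(np.isclose(iqr, ppf(0.75) - ppf(0.25)))                               # IQR
print(np.isclose(mad, quad(lambda x: abs(x - omega) * pdf(x), c, b, points=[omega])[0]))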

3 Median-dispersion re-parameterizations

From (5), the shape parameters q and p can be expressed as

q = \frac{\ln 0.5}{\ln\!\left(1 - \bar{\omega}^{\,d_p^{-1}}\right)}   (6)

and

p = \frac{\ln\!\left(1 - 0.5^{\,d_q}\right)}{\ln \bar{\omega}},   (7)

where \bar{\omega} = \frac{\omega - c}{b - c}, d_p = p^{-1} and d_q = q^{-1}. Successively substituting (6) and (7) in (1), (2), and (3), we obtain two possible re-parameterizations of the Kumaraswamy distribution (similar re-parameterizations based on quantiles other than the median are of course also possible). With \bar{x} = \frac{x - c}{b - c}, the first re-parameterization, denoted by K_p(\omega, d_p, c, b), is:

f_p(x) = (b - c)^{-1}\, \frac{\ln 0.5}{d_p \ln\!\left(1 - \bar{\omega}^{\,d_p^{-1}}\right)}\; \bar{x}^{\,d_p^{-1} - 1} \left(1 - \bar{x}^{\,d_p^{-1}}\right)^{\frac{\ln 0.5}{\ln\left(1 - \bar{\omega}^{\,d_p^{-1}}\right)} - 1}

F_p(x) = 1 - \left(1 - \bar{x}^{\,d_p^{-1}}\right)^{\frac{\ln 0.5}{\ln\left(1 - \bar{\omega}^{\,d_p^{-1}}\right)}}

F_p^{-1}(u) = x = c + (b - c)\left[1 - (1 - u)^{\frac{\ln\left(1 - \bar{\omega}^{\,d_p^{-1}}\right)}{\ln 0.5}}\right]^{d_p}.

The second re-parameterization, denoted by K_q(\omega, d_q, c, b), is:


f_q(x) = (b - c)^{-1}\, \frac{\ln\!\left(1 - 0.5^{\,d_q}\right)}{d_q \ln \bar{\omega}}\; \bar{x}^{\,\frac{\ln\left(1 - 0.5^{\,d_q}\right)}{\ln \bar{\omega}} - 1} \left[1 - \bar{x}^{\,\frac{\ln\left(1 - 0.5^{\,d_q}\right)}{\ln \bar{\omega}}}\right]^{d_q^{-1} - 1}

F_q(x) = 1 - \left[1 - \bar{x}^{\,\frac{\ln\left(1 - 0.5^{\,d_q}\right)}{\ln \bar{\omega}}}\right]^{d_q^{-1}}

F_q^{-1}(u) = x = c + (b - c)\left[1 - (1 - u)^{d_q}\right]^{\frac{\ln \bar{\omega}}{\ln\left(1 - 0.5^{\,d_q}\right)}}.
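In practice the two re-parameterizations can be implemented simply by mapping (ω, d_p) or (ω, d_q) back to the shape parameters via (6) and (7) and then reusing (1)–(3); a brief Python sketch of ours:

import numpy as np

def pq_from_median_dispersion(omega, d, c=0.0, b=1.0, variant="p"):
    """Recover (p, q) from the median omega and the dispersion d_p or d_q."""
    w = (omega - c) / (b - c)                          # omega-bar
    if variant == "p":                                 # K_p(omega, d_p, c, b)
        p = 1.0 / d
        q = np.log(0.5) / np.log(1.0 - w**(1.0 / d))   # Eq. (6)
    else:                                              # K_q(omega, d_q, c, b)
        q = 1.0 / d
        p = np.log(1.0 - 0.5**d) / np.log(w)           # Eq. (7)
    return p, q

def kuma_ppf(u, p, q, c=0.0, b=1.0):
    return c + (b - c) * (1.0 - (1.0 - u)**(1.0 / q))**(1.0 / p)

# whichever variant is used, the median of the implied distribution is omega
for variant in ("p", "q"):
    p, q = pq_from_median_dispersion(0.25, 1.6, variant=variant)
    assert np.isclose(kuma_ppf(0.5, p, q), 0.25)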

The main goal of the rest of this section is to show that the parameters d_p and d_q are dispersion parameters.

Showing that a parameter in a distribution is a dispersion parameter requires reference to a well-defined dispersion order. The well-known dispersion order due to Bickel and Lehmann (1979) and Lewis and Thompson (1981)—which Shaked and Shanthikumar (2007, Chap. 3) call the "dispersive" order—appears as most attractive. It stipulates that given two random variables X and Y with arbitrary distribution functions F and G and quantile functions F^{-1} and G^{-1}, X is more disperse than Y if F^{-1}(υ) − F^{-1}(u) ≥ G^{-1}(υ) − G^{-1}(u) for all 0 < u < υ < 1. This order provides a very strong criterion of ordering in terms of dispersion, because it requires that the more disperse distribution has any two quantiles at least as far apart as the corresponding quantiles of the less disperse distribution. As a result—and this is what makes the order so attractive—distributions ordered according to this criterion are also ordered in terms of most common measures of dispersion, and in terms of many other dispersion orders. Indeed, this is manifestly the case for the interquartile range, the mean and the median absolute deviation around the median, and the mean deviation (defined as \int_0^1 \int_0^1 |F^{-1}(s) - F^{-1}(t)|\,ds\,dt, 0 < s, t < 1). In addition, it can be shown that the dispersive order entails the "dilation order" (Shaked and Shanthikumar 2007, Chap. 3, and see online Appendix), which in turn entails orders in terms of the two dispersion measures most commonly employed in the social sciences, the variance and the standard deviation (Shaked and Shanthikumar 2007, pp. 154–155, 166).

Unfortunately, distributions with the same finite support are not related in terms of the dispersive order unless they are identical (Hickey 1986, p. 916; Shaked and Shanthikumar 2007, Theorem 3.B.14). Hence, the case at hand requires employing weaker notions of dispersion order. For reasons explained in the online Appendix, two natural candidates, the "right-spread order" (Fernandez-Ponce et al. 1998) and the already-mentioned dilation order, are not relevant either. However, a dispersion order that may be used for this purpose and has attractive properties is the recently introduced "quantile-spread order." Townsend and Colonius (2005, see also Shaked and Shanthikumar 2007, p. 151) have characterized this order as follows. If H^{-1} is the quantile function of the random variable Z, the quantile spread of Z is stipulated to be the function QS_H(u) = H^{-1}(1 - u) - H^{-1}(u), defined for 0 < u < 0.5 (this function is equivalent to the "spread function" of Balanda and MacGillivray 1990). Now let X, Y, F, F^{-1}, G and G^{-1} be the same as above. X is called smaller than Y in the quantile-spread order, or X ≤_QS Y, if QS_F(u) ≤ QS_G(u) for 0 < u < 0.5. It is conceptually clear that this is a dispersion order because it requires that the distance between any two "symmetric quantiles" of Y (e.g., the 30th and 70th quantiles, or the 20th and the 80th) be at least as large as the distance for the corresponding quantiles of


X. Moreover, to the best of our knowledge this is the order conceptually most similar to the dispersive order among those that are relevant for double-bounded distributions.

Relevant for what follows, X ≤_QS Y implies both IQR(X) ≤ IQR(Y) and \delta_2(X) ≤ \delta_2(Y). The first implication is obvious, while the second is easy to prove. Indeed, the mean absolute deviation around the median of a random variable Z with quantile function H^{-1} can be written as \delta_2(Z) = \int_0^{1/2} \left[H^{-1}(1 - u) - H^{-1}(u)\right] du (Pham-Gia and Hung 2001, p. 924), and from this the result follows immediately.

In both re-parameterizations of the Kumaraswamy distribution introduced above, distributions with a common median and support are indexed by the second parameter in terms of the quantile-spread order. The following proposition formulates this idea more precisely.

Proposition 3.1 If X ∼ K_r(\omega, d_{rx}, c, b) and Y ∼ K_r(\omega, d_{ry}, c, b), then X is smaller than Y in the quantile-spread order if and only if d_{rx} ≤ d_{ry}, with r = p, q.

Corollary If X ∼ K_r(\omega, d_{rx}, c, b), Y ∼ K_r(\omega, d_{ry}, c, b) and d_{rx} ≤ d_{ry}, then IQR(X) ≤ IQR(Y) and \delta_2(X) ≤ \delta_2(Y), with r = p, q.
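Before turning to the proof, the ordering is easy to check numerically; a small sketch of ours, for the first re-parameterization on (0, 1) with a common median:

import numpy as np

def kp_ppf(u, omega, dp):
    """Quantile function of K_p(omega, d_p, 0, 1)."""
    expo = np.log1p(-omega**(1.0 / dp)) / np.log(0.5)
    return (1.0 - (1.0 - u)**expo)**dp

u = np.linspace(0.01, 0.49, 49)
omega = 0.25
qs = {dp: kp_ppf(1 - u, omega, dp) - kp_ppf(u, omega, dp)   # quantile spreads QS_F(u)
      for dp in (0.5, 1.0, 1.6, 4.0)}
# the spreads are pointwise ordered in d_p, as Proposition 3.1 states
assert np.all(qs[0.5] <= qs[1.0]) and np.all(qs[1.0] <= qs[1.6]) and np.all(qs[1.6] <= qs[4.0])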

The proof of this proposition makes use of the following lemma.

Lemma 3.1 The function f(x) = \frac{x \ln x}{(1 - x)\ln(1 - x)} is decreasing on (0, 1).

We will prove the lemma by proving that

\frac{df(x)}{dx} = \frac{x \ln x + (1 - x)\ln(1 - x) + \ln(1 - x)\ln x}{(1 - x)^{2}\,[\ln(1 - x)]^{2}} < 0

on (0, 1). To this end it is sufficient to show that

x \ln x + (1 - x)\ln(1 - x) + \ln(1 - x)\ln x < 0

on (0, 1). Dividing both sides by (1 - x)x, this condition becomes

p(x)\,p(1 - x) - p(x) - p(1 - x) < 0,

where p(x) = -\frac{\ln x}{1 - x} and 0 < x < 1. After adding 1 to both sides, the inequality can be rewritten as

[p(x) - 1] \cdot [p(1 - x) - 1] < 1.

Now, it is the case that x < -\ln(1 - x) < \frac{x}{1 - x} for x < 1 and x ≠ 0 (Abramowitz and Stegun 1972, p. 68). Dividing by x and then subtracting 1, we have:

0 < -\frac{\ln(1 - x)}{x} - 1 < \frac{1}{1 - x} - 1 = \frac{x}{1 - x}

and, given that in the case at hand 0 < x < 1, also

0 < -\frac{\ln x}{1 - x} - 1 < \frac{1}{x} - 1 = \frac{1 - x}{x}.


Hence,

[p(x) - 1] \cdot [p(1 - x) - 1] < \frac{1 - x}{x}\,\frac{x}{1 - x} = 1,

which concludes the proof of the lemma.

We can now proceed with the proof of Proposition 3.1.

First re-parameterization. To simplify notation we will conduct the proof in terms of p = d_p^{-1} instead of doing it in terms of d_p itself. A sufficient condition for Proposition 3.1 in this case is that dQS_F(u)/dp < 0 for 0 < u < 0.5. Making c = 0 and b = 1, without loss of generality, this derivative is:

\frac{dQS_F(u)}{dp} = t(u, p, \omega)^{1/p}\, T(u, p, \omega) - t(1 - u, p, \omega)^{1/p}\, T(1 - u, p, \omega),

where:

t(u, p, \omega) = 1 - u^{\ln(1 - \omega^{p})/\ln 0.5}

T(u, p, \omega) = \frac{\ln u}{p \ln 0.5}\,\frac{\omega^{p} \ln \omega}{1 - \omega^{p}}\,\frac{1 - t(u, p, \omega)}{t(u, p, \omega)} - \frac{\ln t(u, p, \omega)}{p^{2}}.

Given that p > 0, 0 < ω < 1, and 0 < u < 0.5, we have 0 < ω^p < 1, 0 < t(u, p, ω) < 1, and 0 < t(1 - u, p, ω) < 1. Therefore, a sufficient condition for dQS_F(u)/dp < 0 is that T(u, p, ω) < 0 and T(1 - u, p, ω) > 0. The first of these two conditions can be expressed as

\frac{\ln u}{p \ln 0.5}\,\frac{\omega^{p} \ln \omega}{1 - \omega^{p}}\,\frac{1 - t(u, p, \omega)}{t(u, p, \omega)} < \frac{\ln t(u, p, \omega)}{p^{2}}

\frac{\omega^{p} \ln \omega^{p}}{(1 - \omega^{p})\ln(1 - \omega^{p})} > \frac{t(u, p, \omega)\,\ln t(u, p, \omega)}{\left(1 - t(u, p, \omega)\right)\frac{\ln(1 - \omega^{p})}{\ln 0.5}\,\ln u}

\frac{\omega^{p} \ln \omega^{p}}{(1 - \omega^{p})\ln(1 - \omega^{p})} > \frac{\left(1 - u^{\ln(1 - \omega^{p})/\ln 0.5}\right)\ln\!\left(1 - u^{\ln(1 - \omega^{p})/\ln 0.5}\right)}{u^{\ln(1 - \omega^{p})/\ln 0.5}\,\ln u^{\ln(1 - \omega^{p})/\ln 0.5}}.

Likewise, the second condition can be expressed as

\frac{\ln(1 - u)}{p \ln 0.5}\,\frac{\omega^{p} \ln \omega}{1 - \omega^{p}}\,\frac{1 - t(1 - u, p, \omega)}{t(1 - u, p, \omega)} > \frac{\ln t(1 - u, p, \omega)}{p^{2}}

\frac{\omega^{p} \ln \omega^{p}}{(1 - \omega^{p})\ln(1 - \omega^{p})} < \frac{t(1 - u, p, \omega)\,\ln t(1 - u, p, \omega)}{\left(1 - t(1 - u, p, \omega)\right)\frac{\ln(1 - \omega^{p})}{\ln 0.5}\,\ln(1 - u)}

\frac{\omega^{p} \ln \omega^{p}}{(1 - \omega^{p})\ln(1 - \omega^{p})} < \frac{\left(1 - (1 - u)^{\ln(1 - \omega^{p})/\ln 0.5}\right)\ln\!\left(1 - (1 - u)^{\ln(1 - \omega^{p})/\ln 0.5}\right)}{(1 - u)^{\ln(1 - \omega^{p})/\ln 0.5}\,\ln (1 - u)^{\ln(1 - \omega^{p})/\ln 0.5}}.


Consider now the function f(x) = \frac{x \ln x}{(1 - x)\ln(1 - x)}. Using this function we can rewrite the two inequalities we just derived as:

f\!\left(1 - u^{\ln(1 - \omega^{p})/\ln 0.5}\right) < f(\omega^{p}) < f\!\left(1 - (1 - u)^{\ln(1 - \omega^{p})/\ln 0.5}\right).

Showing that this inequality obtains is sufficient to prove that T(u, p, ω) < 0 and T(1 - u, p, ω) > 0.

Since, given 0 < z < 1, 1 - z^{\ln(1 - \omega^{p})/\ln 0.5} > \omega^{p} if and only if z < 0.5, and 1 - z^{\ln(1 - \omega^{p})/\ln 0.5} < \omega^{p} if and only if z > 0.5, it follows from 0 < u < 0.5 that

1 - u^{\ln(1 - \omega^{p})/\ln 0.5} > \omega^{p} > 1 - (1 - u)^{\ln(1 - \omega^{p})/\ln 0.5}.

In turn, the last inequality and the fact that f(x) is a decreasing function on (0, 1) (by Lemma 3.1) entail that

f\!\left(1 - u^{\ln(1 - \omega^{p})/\ln 0.5}\right) < f(\omega^{p}) < f\!\left(1 - (1 - u)^{\ln(1 - \omega^{p})/\ln 0.5}\right),

thus proving that dQS_F(u)/dp < 0.

Second re-parameterization. As before, we make c = 0 and b = 1, without loss of generality. A sufficient condition for Proposition 3.1 is now that dQS_F(u)/dd_q > 0, for 0 < u < 0.5. Now, given any particular distribution, it can be parameterized both in terms of ω and d_q and in terms of ω and d_p. This defines a one-to-one mapping between the values of d_p and d_q, d_p = D(d_q; ω). Hence, we can express the above derivative as follows:

\frac{dQS_F(u)}{dd_q} = \frac{dQS_F(u)}{dD(d_q; \omega)}\,\frac{dD(d_q; \omega)}{dd_q}.

As we have already proved that dQS_F(u)/dD(d_q; ω) > 0, we only need to prove that dD(d_q; ω)/dd_q > 0. Using (7), this derivative is:

\frac{dD(d_q; \omega)}{dd_q} = \frac{d\!\left[\ln \omega / \ln\!\left(1 - 0.5^{d_q}\right)\right]}{dd_q} = \frac{\ln 0.5\; 0.5^{d_q} \ln \omega}{\left(1 - 0.5^{d_q}\right)\left[\ln\!\left(1 - 0.5^{d_q}\right)\right]^{2}} > 0,

which concludes the proof of Proposition 3.1.

4 Behavior of the re-parameterized distributions

Figure 1 shows the density of re-parameterized Kumaraswamy distributions with support in (0, 1), for several values of their parameters. The figure illustrates well how flexible and versatile the Kumaraswamy distribution is. It also illustrates a series of properties of the re-parameterized versions of the distribution that can be derived rather


directly from known properties of the distribution in its standard parameterization (for these latter properties, see Jones 2009, Sect. 3; Kumaraswamy 1980, pp. 81–83; Mitnik Forthcoming, Sect. 2). Table 1 summarizes the properties in question.

Fig. 1 Some Kumaraswamy densities, both re-parameterizations (panels for ω = 0.25, 0.50, 0.75, each showing f_p(x) or f_q(x) on (0, 1) for several values of d_p or d_q)

Table 1 Characteristics of the re-parameterized Kumaraswamy distributions for different parameter values
The equations that appear first, at the left of each row and at the top of each column, correspond to the first re-parameterization. The equations that appear second, at the left of each row and at the top of each column, correspond to the second re-parameterization

Jones (2009) and Mitnik (Forthcoming) have studied several limiting distributions of the standard Kumaraswamy distribution, in its standard parameterization. The following propositions specify limiting distributions of the general form of its median-dispersion counterparts.

Proposition 4.1 When ω → c (ω → b), the re-parameterized Kumaraswamy distributions tend to the degenerate distribution with parameter ρ = c (ρ = b).

Proposition 4.2 When d_r → 0, the Kumaraswamy distribution tends to the degenerate distribution with parameter ρ = ω, for r = p, q.

Proposition 4.3 When d_r → ∞, the Kumaraswamy distribution tends to the discrete uniform distribution with possible values c and b, for r = p, q.

Corollary When c = 0, b = 1 and d_r → ∞, the Kumaraswamy distribution tends to the Bernoulli distribution with parameter s = 0.5, for r = p, q.

The proofs of these propositions are in the online Appendix.
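The limiting behavior described in Propositions 4.2 and 4.3 can also be seen numerically; a sketch of ours tracks the inter-quartile range of K_p(ω, d_p, 0, 1) as d_p varies:

import numpy as np

def kp_ppf(u, omega, dp):
    expo = np.log1p(-omega**(1.0 / dp)) / np.log(0.5)
    return (1.0 - (1.0 - u)**expo)**dp

omega = 0.25
for dp in (0.05, 0.5, 5.0, 50.0):
    iqr = kp_ppf(0.75, omega, dp) - kp_ppf(0.25, omega, dp)
    print(dp, round(iqr, 3))   # shrinks toward 0 as d_p -> 0, approaches b - c = 1 as d_p grows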

5 Comparative advantages of the re-parameterized Kumaraswamy distributions

The re-parameterizations we have introduced make it possible to use the Kumaraswamy distribution to develop median-dispersion models similar to existing regression


models of Beta-distributed dependent variables with known bounds (see references in the introduction) and, more generally, similar to the class of extended generalized linear models in which the mean and the dispersion of the dependent variable are jointly modeled (McCullagh and Nelder 1989, Chap. 10; Smyth 1989). The models in this class include two submodels aimed at modeling the conditional location and dispersion parameters separately; each of these submodels may have its own link function and set of covariates. With a Kumaraswamy-distributed dependent variable—and using here the first re-parameterization—a model analogous to those discussed by McCullagh and Nelder is the following:

Y_i = \left[1 - (1 - u_i)^{\ln\left(1 - \omega_i^{1/\delta_i}\right)/\ln 0.5}\right]^{\delta_i}

\omega_i = \left[1 + e^{-\alpha_1' X_{1i}}\right]^{-1}

\delta_i = e^{\alpha_2' X_{2i}},

where i indexes observations, Y is the dependent variable, X_1 and X_2 are vectors of independent variables (usually including a 1 in the first position), u ∼ U(0, 1) is the


stochastic component of the model, 0 < δ ≡ d_p < ∞ is the dispersion parameter, 0 < ω < 1 is the median, α_1 and α_2 are the parameter vectors to be estimated, and we have assumed b = 1, c = 0 and a logit and a logarithmic link function, respectively, for the location and dispersion submodels (several other link functions can also be employed).
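The model is straightforward to simulate from; a sketch of ours, with hypothetical covariates and parameter values, generates responses under the links assumed above:

import numpy as np

rng = np.random.default_rng(2)
n = 1000
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])   # covariates, location submodel
X2 = np.column_stack([np.ones(n), rng.normal(size=n)])   # covariates, dispersion submodel
alpha1 = np.array([0.5, -1.0])                           # assumed "true" parameter values
alpha2 = np.array([-0.2, 0.3])

omega = 1.0 / (1.0 + np.exp(-X1 @ alpha1))               # conditional median (logit link)
delta = np.exp(X2 @ alpha2)                              # conditional dispersion d_p (log link)
u = rng.uniform(size=n)                                  # stochastic component
expo = np.log1p(-omega**(1.0 / delta)) / np.log(0.5)
y = (1.0 - (1.0 - u)**expo)**delta                       # Kumaraswamy-distributed responses

Maximizing the sum of ln f_p(y_i; ω_i, δ_i) over (α_1, α_2) would then give the maximum likelihood estimates.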

As in models based on mean-based re-parameterizations of the Beta distribution—where the variance of the dependent variable is a function of both its mean and a dispersion parameter—the dispersion parameter is not the only determinant of dispersion in models based on either of the two re-parameterizations of the Kumaraswamy distribution. This is clear in Fig. 2, where we have plotted the mean absolute deviation around the median, the standard deviation, and the inter-quartile range for different values of the median and the dispersion parameter, for both re-parameterizations. However, as Smithson and Verkuilen have correctly pointed out, the fact that dispersion depends partly on location when bounded random variables are involved is not a downside in the context of regression modeling as long as the location and the dispersion parameters place no restriction on each other and may therefore be modeled separately (Smithson and Verkuilen 2006, p. 58). This is clearly the case here.

Fig. 2 Dispersion measures as functions of parameters, both re-parameterizations (δ_2, σ, and IQR plotted against log(d_p) and log(d_q))

Figure 2 also suggests that using the first re-parameterization for regression modeling may be slightly more advantageous than using the second. In effect, if there is an approximate linear relationship between a scalar measure of dispersion D and ln δ, then ∂D/∂X_2 ≅ α_2. This means that each coefficient in α_2 is approximately equal to the effect of a unit change in the corresponding independent variable on D, which facilitates interpretation. As the relationship between the three scalar measures of dispersion included in the figure and the logarithm of the dispersion parameter is


closer to linear for a wider range of values of the dispersion parameter in the first re-parameterization than in the second, the former may be preferred on interpretability grounds, at least when a logarithmic link function is employed.

As an alternative to using the re-parameterizations introduced in this paper, it would be possible to use the Kumaraswamy distribution in its original parameterization to estimate regression models in which the shape parameters are functions of covariates (for an example of this approach in the context of models using the Beta distribution, see Brehm and Gates 1993). This is not, however, an attractive strategy. First, a parameterization in terms of shape parameters makes it quite difficult to evaluate the direction, magnitude, and statistical significance of the effects of covariates on the location and dispersion of the dependent variable. Indeed, this cannot be done without performing post-estimation covariate- and covariate-value-specific simulations (Paolino 2001, pp. 335–336), along the lines suggested by King et al. (2000, pp. 349–351). In particular, observe that information on whether the effects of any covariate on the shape parameters are statistically significant or not is mostly irrelevant to whether its effects on the location or dispersion of the dependent variable are significant. For instance, a covariate may have no statistically significant effect on either shape parameter but significant effects on the median of the dependent variable if its effects on the shape parameters "move" the median in the same direction; or it may have significant positive effects on both shape parameters but no significant effect on the median if the effects on the shape parameters compensate each other in terms of their effects on the median. In addition, from Eq. 5 it is easy to see that when the shape parameters are functions of covariates, not just the magnitude but even the direction of the effects of changes in any covariate on the median of the dependent variable may vary across initial values of the covariate, that is, they may be positive for some values and negative or zero for others.

Second, a parameterization in terms of shape parameters makes it impossible to specify models in which different sets of variables affect the location and the dispersion of the dependent variable—all relevant variables need to be entered into the model as covariates of both shape parameters. As a consequence, it is not possible to test hypotheses about the determinants of location and dispersion separately (Smithson and Verkuilen 2006, p. 69); and there is an unnecessary loss of degrees of freedom, which may be particularly damaging if sample sizes are small (Paolino 2001, p. 336).

The methodological arguments that have been advanced for using the Beta distribution in regression models in a range of contexts and disciplines (Ferrari and Cribari-Neto 2004; Kieschnick and McCullough 2003; Paolino 2001; Smithson and Verkuilen 2006) apply unchanged if we substitute the Kumaraswamy for the Beta distribution. However, employing the Kumaraswamy instead of the Beta distribution in regression models may be preferable in at least three cases. First, the median of the dependent variable may be more interesting or relevant than its mean on theoretical grounds, for instance if there are good reasons to prefer an absolute loss function to a quadratic loss function (e.g., Cameron and Trivedi 2005, Chap. 4; Manski 1991). Second, if the conditional distribution of the dependent variable is skewed, the median may be a more appropriate measure of central tendency than the mean; in this context, "conditional-median regression, rather than conditional-mean regression, should be considered for the purpose of modeling location shifts" (Hao and Naiman 2007,


p. 29). Third, by using the median as location parameter, "Kumaraswamy regressions" are likely to be much more robust to outliers than "Beta regressions," which model the conditional mean and thus have estimators with unbounded influence functions (e.g., Koenker 2005, pp. 42–47). This may make Kumaraswamy regressions preferable even if the researcher is theoretically indifferent between modeling the conditional mean or the conditional median of the dependent variable, and it is not the case that the conditional distribution of the dependent variable is significantly skewed. This advantage should be particularly important when the data set under analysis is small.

In addition, given the Kumaraswamy distribution's great flexibility, median-dispersion linear models based on this distribution can be used to model conditional quantiles parametrically. This may prove more convenient than the prevalent semi-parametric approach in three situations: when data sets are small, when—regardless of their size—they have scarce observations in particular regions of the sampling space, and when the available data are censored. Indeed, as the semi-parametric approach involves independently fitting a family of conditional quantile functions, it necessarily requires the estimation of a much larger number of parameters and is sometimes—when data are sparse in some regions—subject to the "quantile crossing" problem (see He 1997 for the quantile-crossing problem; see Koenker 2005 for a comprehensive monograph on the semi-parametric approach to quantile regression). In addition, while censored data pose no problem whatsoever for regression models based on the Kumaraswamy distribution—these models can still be easily estimated by maximum likelihood using well understood procedures (e.g., Green 2007)—the semi-parametric approach has much more difficulty, in particular when the censored data are also bounded (for an attempt to deal with the problem of estimating quantile regressions with bounded and censored dependent variables semi-parametrically, see Machado and Santos Silva 2010). Employing median-dispersion linear models based on the Kumaraswamy distribution to model conditional quantiles parametrically would also be more convenient than employing models based on the Beta distribution with the same purpose—although these two distributions seem to be equally versatile, the latter's lack of a closed-form quantile function would make its use in this context less practical.
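For instance, under the first re-parameterization with b = 1 and c = 0 as above, once the conditional median ω_i and dispersion δ_i have been obtained, every conditional quantile is available in closed form from F_p^{-1}:

Q_{\tau}(Y_i \mid X_{1i}, X_{2i}) = \left[1 - (1 - \tau)^{\ln\left(1 - \omega_i^{1/\delta_i}\right)/\ln 0.5}\right]^{\delta_i}, \qquad 0 < \tau < 1,

so a single fitted median-dispersion model yields a whole family of conditional quantile functions that cannot cross.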

The re-parameterized Kumaraswamy distribution also offers an important advantage in a context different from regression models with fully-observed independent and dependent variables. Indeed, there are types of models for which the criterion functions usually employed in estimation—for instance, the likelihood function—are analytically intractable or very difficult to evaluate. These models include, among others, models with latent random variables, models with "non-ignorable" (Little and Rubin 2002, Chap. 1) missing data, and nonlinear dynamic models; these models can often be estimated with the help of simulation-based methods like the method of simulated moments, maximum simulated likelihood, and indirect inference (e.g., Gallant and Tauchen 1996; Gouriéroux and Monfort 1996; Roberto et al. 2000). Simulation-based estimation requires the generation of a very large number of random variates. If these variates follow a Kumaraswamy distribution, the distribution's quantile function can be used to generate them through direct application of the inversion principle. If they follow a Beta distribution, however, much less efficient approaches


need to be employed—for instance, numerical implementations of the inversion principle or rejection algorithms (see, e.g., Devroye 1986). Therefore, whenever simulation-based methods of estimation are employed, using the re-parameterized Kumaraswamy instead of the re-parameterized Beta distribution to model conditional distributions should be substantially more efficient from a computational point of view. This is of paramount importance, because in this context computer-power constraints are often binding (e.g., Nagypál 2007; Yamaguchi 2007). The existence of an alternative to the Beta distribution better suited to simulation-based estimation methods should not only facilitate the estimation of existing models but also promote the development of new latent-variable and similar models.
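A sketch of ours of the kind of generator involved: with the Kumaraswamy distribution all required variates can be drawn with a single vectorized application of the inversion principle to Eq. (3), whereas Beta variates would typically come from rejection samplers or numerical inversion routines.

import numpy as np

def rkumaraswamy(n, p, q, c=0.0, b=1.0, rng=None):
    """Kumaraswamy variates by direct inversion of the closed-form quantile function (3)."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(size=n)
    return c + (b - c) * (1.0 - (1.0 - u)**(1.0 / q))**(1.0 / p)

draws = rkumaraswamy(1_000_000, p=2.0, q=3.0)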

6 Concluding remarks

The parameterization of the Kumaraswamy distribution in terms of shape parameters and the lack of simple expressions for its mean and variance have made its use for modeling purposes difficult. Using the quantile-spread order as dispersion criterion, we have introduced two median-dispersion re-parameterizations aimed at addressing this issue. We have also examined the behavior and some limiting distributions of the re-parameterized Kumaraswamy distributions.

The re-parameterizations make possible the development of models for double-bounded dependent variables in which conditional median and dispersion are modeled jointly, interpretation of both statistical and substantive significance is quite straightforward, and different sets of covariates may enter the location and dispersion submodels. The great versatility of the distribution makes it attractive for the modeling of many double-bounded dependent variables, including proportions, percentages, and fractions. Models based on the re-parameterized Kumaraswamy distributions should be preferable to models based on the re-parameterized Beta distribution, and to semi-parametric approaches, in a variety of contexts.

Acknowledgements Comments from two referees led to extensive improvements to the paper. Moshe Shaked provided useful advice regarding the dilation-order analysis in the online Appendix.

References

Abramowitz M, Stegun IA (eds) (1972) Handbook of mathematical functions with formulas, graphs, and mathematical tables (9th edn). Dover, New York
Balanda K, MacGillivray H (1990) Can J Stat 18(1):17–30
Bickel PJ, Lehmann EL (1979) Descriptive statistics for nonparametric models IV. Spread. In: Jurecková J (ed) Contributions to statistics. D. Reidel, Dordrecht and Boston
Brehm J, Gates S (1993) Donut shops and speed traps: evaluating models of supervision on police behavior. Am J Political Sci 37(2):555–581
Cameron A, Trivedi P (2005) Microeconometrics. Cambridge University Press, Cambridge
Courard-Hauri D (2007) Using Monte Carlo analysis to investigate the relationship between overconsumption and uncertain access to one's personal utility function. Ecol Econ 64(1):152–162
Cribari-Neto F, Souza T (Forthcoming) Testing inference in variable dispersion beta regressions. J Stat Comput Simul
Cribari-Neto F, Zeileis A (2010) Beta regression in R. J Stat Software 34(2):1–24
Devroye L (1986) Non-uniform random variate generation. Springer-Verlag, New York


Espinheira PL, Ferrari SLP, Cribari-Neto F (2008a) Influence diagnostics in beta regression. Comput Stat Data Anal 52(9):4417–4431
Espinheira PL, Ferrari SLP, Cribari-Neto F (2008b) On beta regression residuals. J Appl Stat 35(4):407–419
Fernandez-Ponce JM, Kochar SC, Muñoz-Perez J (1998) Partial orderings of distributions based on right-spread functions. J Appl Prob 35:221–228
Ferrari SLP, Cribari-Neto F (2004) Beta regression for modeling rates and proportions. J Appl Stat 31(7):799–815
Ferrari SLP, Espinheira PL, Cribari-Neto F (2011) Diagnostic tools in beta regression with varying dispersion. Statistica Neerlandica 65(3):337–351
Fletcher SC, Ponnambalam K (1996) Estimation of reservoir yield and storage distribution using moments analysis. J Hydrol 182(1–4):259–275
Gallant AR, Tauchen G (1996) Which moments to match? Econom Theory 12:657–681
Ganji A, Ponnambalam K, Khalili D, Karamouz M (2006) Grain yield reliability analysis with crop water demand uncertainty. Stoch Environ Res Risk Assess 20(4):259–277
Garg M (2008) On distribution of order statistics from Kumaraswamy distribution. Kyunpook Math J 48:411–417
Gouriéroux C, Monfort A (1996) Simulation-based econometric methods. Oxford University Press, Oxford
Green WH (2007) Censored data and truncated distributions. In: Mills T, Patterson K (eds) The handbook of econometrics, Vol 1. Palgrave, London
Hao L, Naiman D (2007) Quantile regression. Sage, Thousand Oaks
He X (1997) Quantile curves without crossing. Am Stat 51(2):186–192
Hickey R (1986) Concepts of dispersion in distributions: a comparative note. J Appl Prob 23(4):914–921
Johnson N, Kotz S, Balakrishnan N (1995) Continuous univariate distributions. Wiley, New York
Jones MC (2009) Kumaraswamy's distribution: a beta-type distribution with some tractability advantages. Stat Methodol 6(1):70–81
Kieschnick R, McCullough B (2003) Regression analysis of variates observed on (0, 1): percentages, proportions and fractions. Stat Model 3:193–213
King G, Tomz M, Wittenberg J (2000) Making the most of statistical analyses: improving interpretation and presentation. Am J Political Sci 44(2):341–355
Koenker R (2005) Quantile regression. Cambridge University Press, Cambridge
Kumaraswamy P (1980) A generalized probability density function for double-bounded random processes. J Hydrol 46:79–88
Lewis T, Thompson JW (1981) Dispersive distributions and the connections between dispersivity and strong unimodality. J Appl Prob 18(1):76–90
Little R, Rubin D (2002) Statistical analysis with missing data (2nd edn). Wiley, Hoboken
Machado J, Santos Silva JMC (2010) Quantiles with corners. Unpublished
Manski C (1991) Regression. J Econ Lit 29(1):34–50
McCullagh P, Nelder J (1989) Generalized linear models. Chapman & Hall, London
McDonald J (1984) Some generalized functions for the size distribution of income. Econometrica 52(3):647–665
Mielke PW Jr (1975) Convenient beta distribution likelihood techniques for describing and comparing meteorological data. J Appl Meteorol 14:985–990
Mitnik P (Forthcoming) New properties of the Kumaraswamy distribution. Commun Stat: Theory Method
Nadarajah S (2008) On the distribution of Kumaraswamy. J Hydrol 348:568–569
Nagypál É (2007) Learning by doing vs. learning about match quality: can we tell them apart? Rev Econ Stud 74:537–566
Ospina R, Cribari-Neto F, Vasconcellos K (2006) Improved point and interval estimation for a beta regression model. Comput Stat Data Anal 51:960–981
Paolino P (2001) Maximum likelihood estimation of models with beta-distributed dependent variables. Political Anal 9(4):325–346
Pham-Gia T, Hung TL (2001) The mean and median absolute deviations. Math Comput Model 34:921–936
Roberto M, Schuermann T, Weeks M (2000) Simulation-based inference in econometrics: methods and applications. Cambridge University Press, Cambridge
Rocha AV, Simas AB (2011) Influence diagnostics in a general class of beta regression models. Test 20:95–119


Sanchez S, Ancheyta J, McCaffrey WC (2007) Comparison of probability distribution functions for fitting distillation curves of petroleum. Energy & Fuels 21(5):2955–2963
Seifi A, Ponnambalam K, Vlach J (2000) Maximization of manufacturing yield of systems with arbitrary distributions of component values. Ann Oper Res 99:373–383
Shaked M, Shanthikumar G (2007) Stochastic orders. Springer, New York
Simas AB, Barreto-Souza W, Rocha AV (2010) Improved estimators for a general class of beta regression models. Comput Stat Data Anal 54(2):348–366
Smithson M, Verkuilen J (2006) A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables. Psychol Method 11(1):54–71
Smyth G (1989) Generalized linear models with varying dispersion. J R Stat Soc, Ser B 51(1):47–60
Sundar V, Subbiah K (1989) Application of double bounded probability density-function for analysis of ocean waves. Ocean Eng 16(2):193–200
Townsend J, Colonius H (2005) Variability of the max and min statistic: a theory of the quantile spread as a function of sample size. Psychometrika 70(4):759–772
van Dorp JR, Mazzuchi T (2004) Parameter specification of the beta distribution and its Dirichlet extensions utilizing quantiles. In: Gupta A, Nadarajah S (eds) Handbook of beta distribution and its applications. Marcel Dekker, New York
Vasconcellos K, Cribari-Neto F (2005) Improved maximum likelihood estimation in a new class of beta regression models. Braz J Prob Stat 19:13–31
Yamaguchi S (2007) Job search, bargaining, and wage dynamics, ISER Discussion Paper # 658. Institute of Social and Economic Research, Osaka University, Osaka
