Tail Risk of Multivariate Regular Variation
Harry Joe∗ Haijun Li†
Third Revision, May 2010
Abstract
Tail risk refers to the risk associated with extreme values and is often affected by extremal
dependence among multivariate extremes. Multivariate tail risk, as measured by a coherent risk
measure of tail conditional expectation, is analyzed for multivariate regularly varying distribu-
tions. Asymptotic expressions for tail risk are established in terms of the intensity measure that
characterizes multivariate regular variation. Tractable bounds for tail risk are derived in terms
of the tail dependence function that describes extremal dependence. Various examples involving
Archimedean copulas are presented to illustrate the results and quality of the bounds.
Key words and phrases: Coherent risk, tail conditional expectation, regularly varying, cop-
ula, tail dependence.
MSC2000 classification: 62H20, 91B30.
1 Introduction
The performance (gain or loss, etc.) of a financial portfolio at the end of a given period is often
evaluated by a real-valued random variable X. A risk measure ρ is defined as a measurable mapping,
with some coherency principles, from the space of all the performance variables into R [28], and
these coherency principles provide a set of operational axioms that ρ should satisfy in order to
accurately characterize risky behaviors of portfolios. The coherent risk measure, introduced in [5]
for analyzing economic risk of financial portfolios, is an example of such an axiomatic approach.
Let L be the convex cone¹ consisting of all the performance variables which represent losses of
financial portfolios at the end of a given period. Note that −X, where X ∈ L, represents the net
worth of a financial position. A mapping ρ : L → R is called a coherent risk measure if ρ satisfies
the following four economically coherent axioms:
∗ [email protected], Department of Statistics, University of British Columbia, Vancouver, BC, V6T 1Z2, Canada. This author is supported by NSERC Discovery Grant.
† [email protected], Department of Mathematics, Washington State University, Pullman, WA 99164, U.S.A. This author is supported in part by NSF grant CMMI 0825960.
¹ A subset L of a linear space is a convex cone if x1 ∈ L and x2 ∈ L imply that λ1x1 + λ2x2 ∈ L for any λ1 > 0 and λ2 > 0. A convex cone is called salient if it does not contain both x and −x for any non-zero vector x.
1. (monotonicity) For X1, X2 ∈ L with X1 ≤ X2 almost surely, ρ(X1) ≤ ρ(X2).
2. (subadditivity) For all X1, X2 ∈ L, ρ(X1 + X2) ≤ ρ(X1) + ρ(X2).
3. (positive homogeneity) For all X ∈ L and every λ > 0, ρ(λX) = λρ(X).
4. (translation invariance) For all X ∈ L and every l ∈ R, ρ(X + l) = ρ(X) + l.
The interpretations of these axioms have been well documented in the literature (see, e.g., [28] for
details), and risk ρ(X) for loss X corresponds to the amount of extra capital requirement that has
to be invested in some secure instrument so that the resulting position ρ(X) − X is acceptable to
regulators/supervisors. The general theory of coherent risk measures was developed for arbitrary
real random variables in [12], and the convex measures that combine subadditivity and positive
homogeneity into the convexity property were extended to cadlag processes in [9], and to abstract
spaces in [14] that include deterministic, stochastic, single or multi-period cash-stream structures.
It follows from the duality theory that any coherent risk measure ρ(X) arises as the supremum
of expected values of X, taken with respect to a convex set of probability measures on environmental
states, all of them being absolutely continuous with respect to the underlying physical
measure. If the set is taken to be the set of all conditional probability measures conditioning on
events with probability greater than or equal to 1 − p, 0 < p < 1, then the corresponding coherent
risk measure is known as the worst conditional expectation WCEp(X), which, in the case that loss
variable X is continuous, equals the tail conditional expectation (TCE) defined as follows,
\[ \mathrm{TCE}_p(X) := E(X \mid X > \mathrm{VaR}_p(X)), \tag{1.1} \]
where $\mathrm{VaR}_p(X) := \inf\{x \in \mathbb{R} : \Pr\{X > x\} \le 1 - p\}$ is known as the Value-at-Risk (VaR) with
confidence level p (i.e., p-quantile). The VaR has been widely used in risk management, but it
violates the subadditivity of coherency on convex cone L and often underestimates risks. Although
VaR is coherent on a much smaller convex cone consisting of only linearized portfolio losses from
elliptically distributed risk factors, the non-subadditivity of VaR can occur in the situations where
portfolio losses are skewed or heavy-tailed with asymmetric dependence structures [28]. It can be
shown that for continuous losses, TCE is the average of VaR over all confidence levels greater than
p, focusing more than VaR does on extremal losses. Thus, TCE is more conservative than VaR at
the same level of confidence (i.e., TCEp(X) ≥ VaRp(X)) and provides an effective tool for analyzing
tail risks. The TCE is also related to the expected residual lifetime, a performance measure widely
used in reliability theory and survival analysis.
For light-tailed loss distributions, such as normal distributions, TCE and VaR at the same
level p of confidence are asymptotically equal as p → 1. Another example of light-tailed losses is
the phase-type distribution². The explicit relation between TCE and VaR for the phase-type loss
distributions was obtained in [8], from which asymptotic equivalence of TCE and VaR as p → 1
is evident.
² That is, the hitting time distribution of a finite-state Markov chain.
It is precisely the heavy tails of loss distributions that make TCE more effective in
analyzing tail risks. Formally, a non-negative loss variable X with distribution function (df) F has
a heavy or regularly varying right tail at ∞ with heavy-tail index α if its survival function is of the
following form (see, e.g., [7] for detail),
\[ \bar F(r) := 1 - F(r) = r^{-\alpha} L(r), \quad r > 0,\ \alpha > 0, \tag{1.2} \]
where L is a slowly varying function; that is, L is a positive function on (0,∞) with the property
\[ \lim_{r\to\infty} \frac{L(cr)}{L(r)} = 1, \quad \text{for every } c > 0. \tag{1.3} \]
For example, the Pareto distribution with survival function $\bar F(r) = (1+r)^{-\alpha}$, r ≥ 0, has a regularly
varying tail. It can be easily verified that if α > 1 for Pareto loss variable X, then
\[ \mathrm{TCE}_p(X) \approx \frac{\alpha}{\alpha-1}\,\mathrm{VaR}_p(X), \quad \text{as } p \to 1. \tag{1.4} \]
In fact, (1.4) holds for any loss distribution (1.2) with heavy-tail index α > 1. Observe that
\[ \mathrm{TCE}_p(X) = \frac{E\big(X\, I\{X > \mathrm{VaR}_p(X)\}\big)}{\Pr\{X > \mathrm{VaR}_p(X)\}} = \frac{1}{\Pr\{X > \mathrm{VaR}_p(X)\}} \Big( \mathrm{VaR}_p(X)\,\Pr\{X > \mathrm{VaR}_p(X)\} + \int_{\mathrm{VaR}_p(X)}^{\infty} \Pr\{X > x\}\,dx \Big), \tag{1.5} \]
where I(A) hereafter denotes the indicator function of set A. By the Karamata theorem (see, e.g.,
[31]), we have
\[ \int_{\mathrm{VaR}_p(X)}^{\infty} \Pr\{X > x\}\,dx \approx \frac{1}{\alpha-1}\,\mathrm{VaR}_p(X)\,\Pr\{X > \mathrm{VaR}_p(X)\}, \quad \text{as } p \to 1. \tag{1.6} \]
Plugging this estimate into (1.5), we obtain (1.4) for any regularly varying distribution with α > 1.
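As a quick numerical sanity check of (1.4), the following sketch evaluates VaR and TCE in closed form for the Pareto example above and compares their ratio with α/(α − 1). The function names and the parametrization by the tail probability q = 1 − p are illustrative choices, not part of the paper; the TCE expression is obtained by evaluating the integral in (1.5) for the survival function (1 + r)^(−α).

```python
def pareto_var(q, alpha):
    """VaR at level p = 1 - q for the survival function (1 + r)**(-alpha)."""
    return q ** (-1.0 / alpha) - 1.0


def pareto_tce(q, alpha):
    """TCE at level p = 1 - q, from (1.5) with the tail integral in closed form.

    The integral of (1 + x)**(-alpha) over (v, inf) is (1 + v)**(1 - alpha) / (alpha - 1);
    dividing by Pr{X > v} = (1 + v)**(-alpha) gives the excess (1 + v) / (alpha - 1).
    """
    if alpha <= 1.0:
        raise ValueError("alpha must exceed 1 for the expectation to exist")
    v = pareto_var(q, alpha)
    return v + (1.0 + v) / (alpha - 1.0)


def tce_var_ratio(q, alpha):
    """Ratio TCE / VaR, which should approach alpha / (alpha - 1) as q -> 0."""
    return pareto_tce(q, alpha) / pareto_var(q, alpha)


if __name__ == "__main__":
    alpha = 3.0
    for q in (1e-2, 1e-4, 1e-6, 1e-8):
        print(q, tce_var_ratio(q, alpha), alpha / (alpha - 1.0))
```

For this distribution the ratio exceeds α/(α − 1) at every level and decreases towards it as the tail probability shrinks, which is the behaviour (1.4) describes.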
The asymptotic formula (1.4) of TCE for univariate tail risks is fairly straightforward, but
the multivariate case remains unsettled and is the focus of this paper. Consider a random vector
X = (X1, . . . , Xd) from a multi-asset portfolio at the end of a given period, where the i-th
component Xi corresponds to the loss of the financial position on the i-th market. A risk measure
R(X) for loss vector X corresponds to a subset of $\mathbb{R}^d$ consisting of all the deterministic portfolios
x such that the modified position x − X is acceptable to regulators/supervisors. The coherency
principles that are similar to the univariate case were formulated in [20] for multivariate risk measure
R(X), and it was further shown in [6] that for continuous loss vectors, multivariate TCEs are
coherent in the sense of [20]. Note, however, that multivariate TCEs, to be formally defined in
Section 2, are subsets of Rd, which lack tractable expressions even for some widely used multivariate
distributions, such as multivariate normals. The effect of dependence among losses X1, . . . , Xd in
different assets on the multivariate TCE also remains difficult to understand. In this paper, we
study asymptotic behaviors of multivariate TCEs for multivariate regularly varying distributions.
Our method, based on tail dependence functions developed in [29, 18], not only yields explicit
asymptotic expressions of multivariate TCEs for various multivariate distributions, but also leads
to better insights into how the dependence among extreme losses would affect analysis on tail risks.
The rest of the paper is organized as follows. In Section 2, we briefly discuss the multivariate
coherent risk measures, and then obtain the tail estimates of TCEs for multivariate regular variation
in terms of intensity measures and their asymptotic bounds in terms of tail dependence functions.
In Section 3, we present several examples to examine the quality of the bounds. Section 4 concludes
the paper with some remarks, and the Appendix in Section 5 details two lengthy proofs. Throughout
this paper, measurability of functions and sets is often assumed without explicit mention, and
the maximum operator is denoted by ∨.
2 Tail Risks of Multivariate Regular Variation
To explain the vector-valued coherent risk measures, we use the notation from [20]. Let K be a
closed, salient convex cone¹ of $\mathbb{R}^d$ such that $\mathbb{R}^d_+ \subseteq K$. The convex cone K induces a partial order
on $\mathbb{R}^d$: x ≤K y if and only if y ∈ x + K. Note that a convex cone K must be an upper set³
with respect to partial order ≤K induced by itself. Moreover, if A is an upper set with respect to
partial order ≤K , then for any x ∈ A and k ∈ K, x + k ≥K x, leading to x + k ∈ A and thus
A+K ⊆ A. Observe that we always have A+K ⊇ A due to the fact that any closed convex cone
must contain the origin. Conversely, if A + K = A for some subset A, then for any y ≥K x with
x ∈ A, y ∈ x+K ⊆ A+K = A, implying that A must be upper with respect to partial order ≤K .
Hence, A is an upper set with respect to partial order ≤K if and only if A+K = A.
If K = Rd+, then the ≤K-order becomes the usual component-wise order. For any two loss
random vectors X and Y on the probability space (Ω,F ,P), define X ≤K Y if and only if Y − X ∈ K, P-almost surely. Using the partial order ≥K rather than the usual component-wise partial order
can account for some financial market frictions such as transaction costs.
Definition 2.1. Consider random loss vectors on a probability space (Ω,F ,P). A vector-valued
coherent risk measure R(·) is a measurable set-valued map satisfying that R(X) ⊂ Rd is closed for
any loss random vector X and $0 \in R(0) \ne \mathbb{R}^d$, as well as the following axioms:
1. (Monotonicity) For any X and Y , X ≤K Y implies that R(X) ⊇ R(Y ).
2. (Subadditivity) For any X and Y , R(X + Y ) ⊇ R(X) +R(Y ).
3. (Positive Homogeneity) For any X and positive s, R(sX) = sR(X).
4. (Translation Invariance) For any X and any deterministic vector l, R(X + l) = R(X) + l.
³ A set S is called upper (lower) with respect to partial order ≤K if s ≤K (≥K) s′ and s ∈ S imply that s′ ∈ S.
Note that the risk set R(X) consists of all the deterministic portfolios x such that the multi-
variate portfolio x−X is acceptable to the regulator/supervisor. The motivation for set-valued
risk measures is that investors are sometimes not able to aggregate their multivariate portfolios
on various security markets because of liquidity problems and/or transaction costs between the
different security markets (e.g., having assets in several currencies at the same time). See [20] for
details.
When d = 1, ρ(X) := inf{r : r ∈ R(X)} is a univariate coherent risk measure satisfying the
four axioms discussed in Section 1, and thus R(X) = [ρ(X),∞). It was shown in [20] that the
worst conditional expectation for random vector X, defined as
\[ \mathrm{WCE}_p(X) := \{x \in \mathbb{R}^d : E(x - X \mid B) \ge_K 0,\ \forall B \in \mathcal{F} \text{ with } P(B) \ge 1 - p\}, \quad 0 < p < 1, \]
is a vector-valued coherent risk measure. Since $\mathrm{WCE}_p(X) = \bigcap_{B \in \mathcal{F},\, P(B) \ge 1-p} (E(X \mid B) + K)$
and K is an upper set, WCEp(X) is also an upper set. For any continuous random vector X,
WCEp(X) equals the tail conditional expectation (TCE) for X, defined as in [6] by
\[ \mathrm{TCE}_p(X) := \{x \in \mathbb{R}^d : E(x - X \mid X \in A) \ge_K 0,\ \forall A \in \mathcal{Q}_p(X)\} = \bigcap_{A \in \mathcal{Q}_p(X)} (E(X \mid X \in A) + K), \quad 0 < p < 1, \tag{2.1} \]
where $\mathcal{Q}_p(X) = \{A \subseteq \mathbb{R}^d : A \text{ is Borel-measurable and } A + K = A,\ \Pr\{X \in A\} \ge 1 - p\}$ is the set
of all the upper sets (with respect to ≤K) with probability mass greater than or equal to 1 − p.
Observe that TCEp(X) is a convex and upper set that consists of all the portfolios x of capital
reserves that can be used to cover the expected losses E(X | X ∈ A) in the events that X ∈ A.
Note that multivariate coherent risk measures discussed in [20, 6] are defined for essentially
bounded random vectors. To discuss asymptotic properties, these measures have to be extended to
the set of all random vectors on $\overline{\mathbb{R}}^d = [-\infty,\infty]^d$. This can be done using the idea in [12] that allows
vectors in R(X) to have components taking the value of ∞; that is, the positions corresponding to
these components are so risky, whatever that means, that no matter what the capital added, the
positions will remain unacceptable. We also need to exclude the situations where components of
the vectors in R(X) take the value of −∞, which would mean that arbitrary amounts of capital
could be withdrawn without endangering the portfolios (see [12] for details). As a matter of fact,
it can be easily verified that TCEp(X) is coherent in the sense of Definition 2.1 if X, which may
not be bounded, has a continuous density function.
The extreme value analysis of TCEp(X) as p → 1 boils down to analyzing asymptotic
behaviors of E(X | X ∈ rB) as r → ∞ for various upper sets B, for which multivariate regular
variation suits well. A non-negative random vector X with joint df F is said to have a multivariate
regularly varying (MRV, see [30]) distribution F if there exists a Radon measure µ (i.e., finite on
compact sets), called the intensity measure, on $\overline{\mathbb{R}}^d_+ \setminus \{0\}$ such that
\[ \lim_{r\to\infty} \frac{\Pr\{X \in rB\}}{\Pr\{\|X\| > r\}} = \mu(B), \tag{2.2} \]
for any relatively compact set $B \subset \overline{\mathbb{R}}^d_+ \setminus \{0\}$⁴ with µ(∂B) = 0, where || · || denotes a norm on $\mathbb{R}^d$.
Any MRV df F with support in $\mathbb{R}^d_+$ admits the following spectral representation: for all continuity
points x of µ,
\[ \lim_{r\to\infty} \frac{1 - F(rx)}{1 - F(r\mathbf{1})} = \lim_{r\to\infty} \frac{\Pr\{X/r \in [0,x]^c\}}{\Pr\{X/r \in [0,\mathbf{1}]^c\}} = k\,\mu([0,x]^c), \tag{2.3} \]
where k > 0 is a constant and $\mu([0,x]^c) = \int_{S^{d-1}_+} \max_{1\le j\le d} (u_j/x_j)^{\alpha}\, S(du)$ for a finite measure S on
$S^{d-1}_+ := \{x \in \mathbb{R}^d_+ : \|x\| = 1\}$. Non-degenerate margins Fj, 1 ≤ j ≤ d, of an MRV df F are regularly
varying in the sense of (1.2). Since F1, . . . , Fd are usually assumed to be tail equivalent [31], we
have that $\bar F_j(x) = L_j(x)/x^{\alpha}$, 1 ≤ j ≤ d, where $L_i(x)/L_j(x) \to c_{ij}$ as x → ∞, $0 < c_{ij} < \infty$. We
assume hereafter that $c_{ij} = 1$ for notational convenience. If $c_{ij} \ne 1$ for some i ≠ j, we can properly
rescale the margins and the results still follow. We also assume that the heavy-tail index α > 1 to
ensure the existence of expectations. The examples and properties of MRV distributions, including
the relation between MRV distributions and multivariate extreme value distributions with identical
Fréchet margins, can be found in [30, 31].
The asymptotic relation between TCEp(X) and intensity measure µ is given below and its
proof is detailed in Appendix in Section 5.
Theorem 2.2. Let X be a non-negative loss vector that has an MRV df with intensity measure µ.
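1. Let B be an upper set bounded away from 0. Then
\[ \lim_{r\to\infty} r^{-1} E(X_j \mid X \in rB) = \int_0^{\infty} \frac{\mu(A_j(w) \cap B)}{\mu(B)}\,dw =: u_j(B;\mu), \]
where $A_j(w) := \{(x_1,\dots,x_d) \in \mathbb{R}^d : x_j > w\}$, 1 ≤ j ≤ d.
2. Let $\mathcal{Q}_{\|\cdot\|} := \{B \subseteq \mathbb{R}^d : B + K = B,\ B \cap S^{d-1}_+ \ne \emptyset,\ B \subseteq (\mathbb{B}_d)^c\}$, where $\mathbb{B}_d := \{x \in \mathbb{R}^d : \|x\| < 1\}$
denotes the open unit ball in $\mathbb{R}^d$ with respect to the norm || · ||. As p → 1,
\[ \mathrm{TCE}_p(X) \approx \bigcap_{B \in \mathcal{Q}_{\|\cdot\|}} \mathrm{VaR}_{1-(1-p)/\mu(B)}(\|X\|)\, \big((u_1(B;\mu),\dots,u_d(B;\mu)) + K\big). \]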
Remark 2.3. 1. Theorem 2.2 provides the multivariate extension of (1.4) and shows how extremal
dependence, as described by the intensity measure, would quantitatively affect tail
risks. It also provides a unified tool to analyze the structural properties of tail asymptotics of
TCEs for various portfolio and risk aggregations of loss vector (X1, . . . , Xd). For example, the
tail asymptotics of TCEs of the portfolio aggregation $\sum_{i=1}^d X_i$ can be obtained from Theorem
2.2 (1) by taking $B = \{x : \sum_{i=1}^d x_i > 1\}$ (also see [3]). The tail estimate obtained in Theorem
2.2 (2) can also be applied to analyzing coherent aggregations [20] of extremal risks.
2. Theorem 2.2 (1) can be used in analyzing portfolio tail risk decomposition. For example, for
any 1 ≤ j ≤ d,
\[ E\Big(X_j \,\Big|\, \sum_{i=1}^d X_i > \mathrm{VaR}_p\Big(\sum_{i=1}^d X_i\Big)\Big) \approx \mathrm{VaR}_p\Big(\sum_{i=1}^d X_i\Big)\, u_j(B;\mu), \quad \text{as } p \to 1, \]
where $B = \{x : \sum_{i=1}^d x_i > 1\}$. The tail estimate of $E\big(X_j \mid \sum_{i=1}^d X_i > \mathrm{VaR}_p(\sum_{i=1}^d X_i)\big)$
provides the contribution to the total tail risk attributable to risk j, as measured by TCEs.
The risk allocation/decomposition with TCE for elliptically distributed loss vectors can be
found in [24].
3. The computation of VaR for the norm ||X|| is difficult in general, but the tail estimate of
VaRp(||X||), when p → 1, is relatively simple in light of (2.2). The tail estimates of VaR of the
sum are obtained in [2, 1, 4, 22, 13] in a similar spirit. For the maximum norm of loss vector
(X1, . . . , Xd) with identical margins, the VaR can be estimated from the asymptotic relation
$\Pr\{\max_{1\le i\le d} X_i > r\} \approx \Pr\{X_1 > r\}/\mu(B)$ for sufficiently large r, where $B = (1,\infty) \times \mathbb{R}^{d-1}$.
⁴ Here $\overline{\mathbb{R}}^d_+ = [0,\infty]^d$ is compact and the punctured version $\overline{\mathbb{R}}^d_+ \setminus \{0\}$ is modified via the one-point uncompactification (see, e.g., [31]).
In the situations that the asymptotic expression obtained in Theorem 2.2 may be intractable,
we can utilize the method of tail dependence functions introduced in [29, 18] to derive tractable
bounds for TCE. For notational convenience, we only consider the case where K = Rd+ in the
remainder of this paper.
The idea is to separate the margins from the dependence structure of df F, so that TCEs
can be expressed asymptotically in terms of the marginal heavy-tail index and tail dependence of
the copula of F. Assume that df F of random vector X = (X1, . . . , Xd) has continuous margins
F1, . . . , Fd, and then from [32], the copula C of F can be uniquely expressed as
\[ C(u_1,\dots,u_d) = F\big(F_1^{-1}(u_1),\dots,F_d^{-1}(u_d)\big), \quad (u_1,\dots,u_d) \in [0,1]^d, \]
where $F_j^{-1}$, 1 ≤ j ≤ d, are the quantile functions of the margins. The extremal dependence
of a df F can be described by various tail dependence parameters of its copula C. The upper
tail dependence parameters, for example, are the conditional probabilities that random vector
(U1, . . . , Ud) := (F1(X1), . . . , Fd(Xd)) with standard uniform margins belongs to upper tail orthants
given that a univariate margin takes extreme values:
\[ \lambda_U = \lim_{u\downarrow 0} \Pr\{U_1 > 1-u, \dots, U_d > 1-u \mid U_d > 1-u\} = \lim_{u\downarrow 0} \frac{\bar C(1-u,\dots,1-u)}{u}, \tag{2.4} \]
where $\bar C$ denotes the survival function of C. Bivariate tail dependence has been widely studied
[16], and various multivariate versions of tail dependence parameters have also been introduced
and studied in [21, 25]. In fact, various upper tail dependence parameters can be represented by
the upper tail dependence function [21, 29, 18], defined as follows,
\[ b^*(w) := \lim_{u\downarrow 0} \frac{\bar C(1 - u w_j,\ 1 \le j \le d)}{u}, \quad \forall w = (w_1,\dots,w_d) \in \mathbb{R}^d_+. \tag{2.5} \]
The lower tail dependence can be similarly studied but we focus only on upper tail dependence in
this paper. It was shown in [18] that b∗(w) > 0 for all w ∈ Rd+ if and only if λU > 0. Unlike λU ,
however, the tail dependence function provides all the extremal dependence information [29, 18, 26].
Using the inclusion-exclusion principle, we define the upper exponent function of C as follows
\[ a^*(w) := \sum_{S \subseteq \{1,\dots,d\},\, S \ne \emptyset} (-1)^{|S|-1}\, b^*_S(w_i, i \in S; C_S), \tag{2.6} \]
where $b^*_S(w_i, i \in S; C_S)$ denotes the upper tail dependence function of the margin $C_S$ of C with
component indexes in S.
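The inclusion-exclusion sum in (2.6) can be sketched in a few lines. The example below assumes, purely for illustration, the MTCJ-type tail dependence function b∗(w) = (Σ w_i^(−δ))^(−1/δ) that appears in Examples 3.2 and 3.3, whose margins have the same form on the retained indexes; the function names are hypothetical.

```python
from itertools import combinations
from math import comb


def b_star_margin(w_sub, delta):
    """Assumed MTCJ-type tail dependence function of a margin.

    For a single index this reduces to w itself, as it should for a
    univariate margin.
    """
    return sum(wi ** (-delta) for wi in w_sub) ** (-1.0 / delta)


def a_star(w, delta):
    """Upper exponent function via the inclusion-exclusion sum (2.6)."""
    d = len(w)
    total = 0.0
    for size in range(1, d + 1):
        sign = (-1.0) ** (size - 1)
        for subset in combinations(range(d), size):
            total += sign * b_star_margin([w[i] for i in subset], delta)
    return total


def a_star_at_ones(d, delta):
    """Closed form at w = (1, ..., 1): all subsets of size k contribute k**(-1/delta)."""
    return sum(
        (-1.0) ** (k - 1) * comb(d, k) * k ** (-1.0 / delta)
        for k in range(1, d + 1)
    )


if __name__ == "__main__":
    print(a_star((1.0, 1.0), 1.0))        # bivariate case, delta = 1
    print(a_star((1.0, 1.0, 1.0), 1.0))   # trivariate case, delta = 1
```

The subset enumeration costs 2^d − 1 evaluations, so for exchangeable models at w = (1, . . . , 1) the grouped form in `a_star_at_ones` is the cheaper route.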
The intensity measure µ and tail dependence function b∗ of an MRV distribution F are uniquely
determined from each other and their detailed relations can be found in [26]. In particular,
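\[ b^*(w) = \frac{\mu\big(\prod_{i=1}^d [w_i^{-1/\alpha}, \infty]\big)}{\mu\big([1,\infty] \times \overline{\mathbb{R}}^{d-1}_+\big)}, \quad \text{and} \quad \frac{\mu([w,\infty])}{\mu([0,\mathbf{1}]^c)} = \frac{b^*(w_1^{-\alpha},\dots,w_d^{-\alpha})}{a^*(1,\dots,1)}. \tag{2.7} \]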
Using this equivalence and Theorem 2.2 (1), E(X | X ∈ rB) can be asymptotically expressed in
terms of the tail dependence function b∗ for sufficiently large r. But the asymptotic estimation of
TCEp(X) via Theorem 2.2 (2) is still cumbersome because B ∈ Q||·|| can be quite arbitrary. More
tractable bounds for TCEp(X) can be established directly using the tail dependence, as shown in
the next theorem whose proof is detailed in Appendix in Section 5.
Theorem 2.4. Let X be a non-negative loss vector with an MRV df F and heavy-tail index α > 1.
Assume that the copula C of F has a positive upper tail dependence function b∗(w) > 0. Let ||·||max
denote the maximum norm.
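1. For 1 ≤ j ≤ d,
\[ \lim_{r\to\infty} \frac{1}{r} E(X_j \mid X \in r(x,\infty]) = \int_0^{\infty} \frac{b^*(x_1^{-\alpha},\dots,(w_j \vee x_j)^{-\alpha},\dots,x_d^{-\alpha})}{b^*(x_1^{-\alpha},\dots,x_d^{-\alpha})}\,dw_j. \]
2. Let $S_j(b^*,\alpha) := \int_0^{\infty} \frac{b^*(1,\dots,1,(w_j \vee 1)^{-\alpha},1,\dots,1)}{b^*(1,\dots,1)}\,dw_j$, 1 ≤ j ≤ d. For sufficiently small 1 − p,
\[ \mathrm{TCE}_p(X) \subseteq \mathrm{VaR}_{1-(1-p)a^*(1,\dots,1)/b^*(1,\dots,1)}(\|X\|_{\max})\, \big((S_1(b^*,\alpha),\dots,S_d(b^*,\alpha)) + \mathbb{R}^d_+\big). \]
3. For sufficiently small 1 − p,
\[ \mathrm{VaR}_p(\|X\|_{\max})\, \big((s_1(b^*,\alpha),\dots,s_d(b^*,\alpha)) + \mathbb{R}^d_+\big) \subseteq \mathrm{TCE}_p(X), \]
where, for 1 ≤ j ≤ d,
\[ s_j(b^*,\alpha) := \frac{\alpha}{\alpha-1}\,\frac{1}{b^*(1,\dots,1)} + \sum_{\emptyset \ne S \subseteq \{i : i \ne j\}} (-1)^{|S|}\, \frac{\int_0^1 w_j \, d\, b^*_{\{j\}\cup S}(w_j^{-\alpha},1,\dots,1; C_{\{j\}\cup S})}{b^*(1,\dots,1)}, \]
and $b^*_{\{j\}\cup S}(w_j^{-\alpha},1,\dots,1; C_{\{j\}\cup S})$ denotes the upper tail dependence function of the multivariate
margin $C_{\{j\}\cup S}$ evaluated with the j-th argument being $w_j^{-\alpha}$ and the others being one.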
Observe that if d = 1, then Theorem 2.4 (2) and (3) reduce to (1.4). In multivariate risk
management, the upper (subset) bound presented in Theorem 2.4 (3) is more important, because it
provides a set of portfolios of conservative reserves so that even in worst case scenarios the resulting
positions are still acceptable to regulators/supervisors.
3 Illustrative Examples of Bounds for Tail Risks
We present some examples to examine the quality of the results in Theorem 2.4 when used as
approximations. The examples show that the approximations improve with more tail dependence and a larger ζ, where
ζ is in the exponent of the second order expansion
\[ \bar C(1 - u w_j,\ 1 \le j \le d) \approx u\, b^*(w) + u^{1+\zeta}\, b_2^*(w), \quad u \to 0. \tag{3.1} \]
It is intuitive that if ζ is larger (especially if ζ ≥ 1), then the second order term is less important.
Note that for the Fréchet upper bound copula, $\bar C_U(1 - uw) = u \min\{w_1,\dots,w_d\}$, and there is no
second order term.
Example 3.1. (a) Analysis of complete dependence (the Fréchet upper bound). Let $C_U$ be the
Fréchet upper bound copula of dimension d. Then $b^*(w) = \min\{w_1,\dots,w_d\}$ and b∗(1) = 1,
a∗(1) = 1. In part (2) of Theorem 2.4, 1 − (1 − p)a∗/b∗ = p, and for α > 1,
$S_j(b^*,\alpha) = 1 + \int_1^{\infty} \min\{1, w^{-\alpha}\}\,dw = 1 + (\alpha-1)^{-1} = \alpha/(\alpha-1)$. In part (3) of Theorem 2.4, for α > 1,
$s_j(b^*,\alpha) = \alpha/(\alpha-1) + \sum_{\emptyset \ne S \subseteq \{i : i \ne j\}} (-1)^{|S|} \cdot 0 = \alpha/(\alpha-1)$. That is, the expressions in parts
(2) and (3) coincide.
(b) Analysis of near independence. As the d-variate copula C (with tail dependence) moves
towards independence, b∗(1) → 0 and a∗(1) → d and 1 − (1 − p)a∗(1)/b∗(1) > 0 only if
p > 1−b∗(1)/a∗(1) so that for small b∗(1), the result in part (2) of Theorem 2.4 is non-trivial
only for large p near 1. This is a hint that all of the limiting results of Theorem 2.4 are
worse for weak tail dependence. In this case, one has to use Theorem 2.2 to approximate the
multivariate TCE.
Example 3.2. We show some details for two copula families to illustrate Theorem 2.4. The first
copula is the exchangeable MTCJ copula (or Mardia-Takahasi-Cook-Johnson copula, see [27, 33,
11]), and the second is a mixture of the MTCJ copula and the independence copula. Second order
expansions of the tail dependence functions are obtained and the approximation from part (1) of
Theorem 2.4 is summarized in Tables 1 and 2 for some special cases.
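(a) The MTCJ copula in dimension d, with dependence increasing in δ, is:
\[ C(u;\delta) = \big[u_1^{-\delta} + \cdots + u_d^{-\delta} - (d-1)\big]^{-1/\delta}, \quad \delta > 0. \tag{3.2} \]
Let wj > 0 for j = 1, . . . , d, and let $W := w_1^{-\delta} + \cdots + w_d^{-\delta}$. Then
\[ \begin{aligned} C(uw;\delta) &= u\big[w_1^{-\delta} + \cdots + w_d^{-\delta} - (d-1)u^{\delta}\big]^{-1/\delta} = u W^{-1/\delta}\big[1 - (d-1)u^{\delta}/W\big]^{-1/\delta} \\ &\approx u W^{-1/\delta}\big[1 + (d-1)\delta^{-1}u^{\delta}/W\big] = u\, b^*(w;\delta) + u^{1+\delta}\, b_2^*(w;\delta), \quad \text{as } u \to 0, \end{aligned} \]
where $b^*(w;\delta) = W^{-1/\delta} = (w_1^{-\delta} + \cdots + w_d^{-\delta})^{-1/\delta}$ and $b_2^*(w;\delta) = (d-1)\delta^{-1}(w_1^{-\delta} + \cdots + w_d^{-\delta})^{-1/\delta-1}$.
The second order term of C(uw; δ) is $O(u^{1+\zeta})$, where ζ = δ increases with more dependence.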
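The expansion above is easy to check numerically. The sketch below evaluates the MTCJ copula (3.2) at uw and compares it with the first and second order approximations built from b∗ and b∗2; the function names and the test point are illustrative assumptions, not taken from the paper.

```python
def mtcj_copula(u, delta):
    """MTCJ copula (3.2) evaluated at a point u in (0, 1]^d."""
    d = len(u)
    return (sum(ui ** (-delta) for ui in u) - (d - 1)) ** (-1.0 / delta)


def b_star(w, delta):
    """First order coefficient: W**(-1/delta) with W = sum of w_j**(-delta)."""
    return sum(wi ** (-delta) for wi in w) ** (-1.0 / delta)


def b_star_2(w, delta):
    """Second order coefficient: (d - 1) / delta * W**(-1/delta - 1)."""
    d = len(w)
    big_w = sum(wi ** (-delta) for wi in w)
    return (d - 1) / delta * big_w ** (-1.0 / delta - 1.0)


def expansion_errors(u, w, delta):
    """Return the exact value and the absolute errors of both approximations."""
    exact = mtcj_copula([u * wi for wi in w], delta)
    first = u * b_star(w, delta)
    second = first + u ** (1.0 + delta) * b_star_2(w, delta)
    return exact, abs(exact - first), abs(exact - second)


if __name__ == "__main__":
    for u in (1e-2, 1e-3, 1e-4):
        print(u, expansion_errors(u, (1.0, 2.0), 0.5))
```

For small δ the factor u^δ decays slowly, so the gap between the two approximations closes only for very small u, in line with the remark that ζ = δ grows with dependence.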
Suppose (X1, . . . , Xd) is multivariate Pareto of the form used in [27]; the univariate survival
function is $x^{-\alpha}$ for x > 1 for all d margins and the survival copula is given in (3.2). That is,
\[ \bar F(x) = C(x_1^{-\alpha},\dots,x_d^{-\alpha};\delta) = \big[x_1^{\delta\alpha} + \cdots + x_d^{\delta\alpha} - (d-1)\big]^{-1/\delta}, \quad x_j > 1,\ j = 1,\dots,d. \tag{3.3} \]
An expression for the conditional expectation (given for the first component only because of
symmetry) is:
\[ E[X_1 \mid X_1 > x_1,\dots,X_d > x_d] = x_1 + \frac{\int_0^{\infty} \bar F(x_1 + z_1, x_2,\dots,x_d)\,dz_1}{\bar F(x_1,\dots,x_d)}, \]
leading to TCE
\[ r^{-1} E[X_1 \mid X_1 > r x_1,\dots,X_d > r x_d] = x_1 + \frac{\int_0^{\infty} \bar F(r x_1 + r w_1, r x_2,\dots,r x_d)\,dw_1}{\bar F(rx)}. \tag{3.4} \]
The above expectations exist for α > 1.
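• Exact calculation of the last summand in (3.4):
\[ \frac{\int_0^{\infty} C\big((r[x_1+w_1])^{-\alpha}, (r x_2)^{-\alpha},\dots,(r x_d)^{-\alpha};\delta\big)\,dw_1}{C\big((r x_1)^{-\alpha},\dots,(r x_d)^{-\alpha};\delta\big)} = \frac{\int_0^{\infty} \big[(r[x_1+w_1])^{\alpha\delta} + (r x_2)^{\alpha\delta} + \cdots + (r x_d)^{\alpha\delta} - (d-1)\big]^{-1/\delta}\,dw_1}{\big[(r x_1)^{\alpha\delta} + \cdots + (r x_d)^{\alpha\delta} - (d-1)\big]^{-1/\delta}}. \]
• First order approximation of the last summand in (3.4):
\[ \frac{\int_0^{\infty} b^*\big((x_1+w_1)^{-\alpha}, x_2^{-\alpha},\dots,x_d^{-\alpha};\delta\big)\,dw_1}{b^*\big(x_1^{-\alpha},\dots,x_d^{-\alpha};\delta\big)} = \frac{\int_0^{\infty} \big((x_1+w_1)^{\alpha\delta} + x_2^{\alpha\delta} + \cdots + x_d^{\alpha\delta}\big)^{-1/\delta}\,dw_1}{\big(x_1^{\alpha\delta} + \cdots + x_d^{\alpha\delta}\big)^{-1/\delta}}. \]
This can be computed via numerical integration. Let the numerator and denominator
of the above be denoted as $N_1 := N_1(x;\alpha,\delta)$ and $D_1 := D_1(x;\alpha,\delta)$.
• Second order approximation of the last summand in (3.4):
\[ \frac{r^{-\alpha} N_1 + r^{-\alpha(1+\delta)} \int_0^{\infty} b_2^*\big((x_1+w_1)^{-\alpha}, x_2^{-\alpha},\dots,x_d^{-\alpha};\delta\big)\,dw_1}{r^{-\alpha} D_1 + r^{-\alpha(1+\delta)}\, b_2^*\big(x_1^{-\alpha},\dots,x_d^{-\alpha};\delta\big)} = \frac{N_1 + (d-1) r^{-\alpha\delta}\delta^{-1} \int_0^{\infty} \big((x_1+w_1)^{\alpha\delta} + x_2^{\alpha\delta} + \cdots + x_d^{\alpha\delta}\big)^{-1/\delta-1}\,dw_1}{D_1 + (d-1) r^{-\alpha\delta}\delta^{-1} \big(x_1^{\alpha\delta} + \cdots + x_d^{\alpha\delta}\big)^{-1/\delta-1}}. \]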
Table 1 has some (representative) results to show how the approximations compare; we take
$r = (1-p)^{-1/\alpha}$, d = 2, x1 = x2 = 1, p = 0.999, α = 2 and 5, and δ ∈ [0.1, 1.9]. The table
shows that the first order approximation is worse only when the dependence is weak and the
exponent ζ of the second order term is much less than 1; in these cases, the second order
term of the expansion is useful.
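The three expressions for the last summand in (3.4) can be evaluated with any quadrature routine. The sketch below uses a plain midpoint rule after mapping (0, ∞) to (0, 1), which is an assumed and deliberately simple choice (the paper only says "numerical integration"), and it is restricted to equal thresholds x1 = · · · = xd = x. Function names are hypothetical.

```python
def integrate_0_inf(f, n=100_000):
    """Midpoint rule on (0, inf) via the substitution w = t / (1 - t).

    Adequate for integrands decaying like w**(-alpha) with alpha >= 2;
    slower convergence should be expected for 1 < alpha < 2.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        w = t / (1.0 - t)
        total += f(w) / (1.0 - t) ** 2
    return total * h


def mtcj_last_summand(alpha, delta, p, d=2, x=1.0):
    """Exact, first order and second order values of the last summand in (3.4)."""
    r = (1.0 - p) ** (-1.0 / alpha)
    ad = alpha * delta
    rest = (d - 1) * x ** ad  # contribution of the d - 1 components held at x

    num_exact = integrate_0_inf(
        lambda w: ((r * (x + w)) ** ad + r ** ad * rest - (d - 1)) ** (-1.0 / delta)
    )
    den_exact = (d * (r * x) ** ad - (d - 1)) ** (-1.0 / delta)

    n1 = integrate_0_inf(lambda w: ((x + w) ** ad + rest) ** (-1.0 / delta))
    d1 = (d * x ** ad) ** (-1.0 / delta)

    c = (d - 1) * r ** (-ad) / delta  # weight of the second order terms
    n2 = integrate_0_inf(lambda w: ((x + w) ** ad + rest) ** (-1.0 / delta - 1.0))
    d2 = (d * x ** ad) ** (-1.0 / delta - 1.0)

    return num_exact / den_exact, n1 / d1, (n1 + c * n2) / (d1 + c * d2)


if __name__ == "__main__":
    print(mtcj_last_summand(alpha=2.0, delta=0.5, p=0.999))
```

With α = 2, δ = 0.5 and p = 0.999 the three values come out near 1.968, 2.000 and 1.969, matching the corresponding row of Table 1 to the digits shown there.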
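(b) Mixture model with MTCJ and independence copulas. Now, the second order term is between
O(u) and $O(u^2)$, depending on the amount of dependence in the copula. Let
\[ C(u;\delta,\beta) = (1-\beta)\prod_{j=1}^d u_j + \beta\big[u_1^{-\delta} + \cdots + u_d^{-\delta} - (d-1)\big]^{-1/\delta}, \quad \delta > 0,\ 0 < \beta < 1, \]
so that dependence increases as δ and β increase. Let $W := w_1^{-\delta} + \cdots + w_d^{-\delta}$. Then
\[ C(uw;\delta,\beta) \approx (1-\beta)u^d \prod_{j=1}^d w_j + \beta u W^{-1/\delta}\big[1 + (d-1)\delta^{-1}u^{\delta}/W\big] = u\, b^*(w;\delta,\beta) + u^{1+\zeta}\, b_2^*(w;\delta,\beta), \]
where
\[ b^*(w;\delta,\beta) = \beta W^{-1/\delta} = \beta(w_1^{-\delta} + \cdots + w_d^{-\delta})^{-1/\delta}, \]
\[ b_2^*(w;\delta,\beta) = \begin{cases} (d-1)\beta\delta^{-1}(w_1^{-\delta} + \cdots + w_d^{-\delta})^{-1/\delta-1} & \text{if } \delta < d-1, \\ (1-\beta)\prod_{j=1}^d w_j + (d-1)\beta\delta^{-1}(w_1^{-\delta} + \cdots + w_d^{-\delta})^{-1/\delta-1} & \text{if } \delta = d-1, \\ (1-\beta)\prod_{j=1}^d w_j & \text{if } \delta > d-1, \end{cases} \]
and ζ = δ if δ < d − 1 and ζ = d − 1 if δ ≥ d − 1. The second order term is not far from the
first order term if δ is near 0 (i.e., weak dependence). Similar to part (a), we list the exact
TCE and the first/second order approximations for the last summand in (3.4).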
• Exact (assuming α > 1 as before): with $P_x = \prod_{j=1}^d x_j^{-\alpha}$,
\[ \frac{\beta \int_0^{\infty} \big\{(r[x_1+w_1])^{\alpha\delta} + (r x_2)^{\alpha\delta} + \cdots + (r x_d)^{\alpha\delta} - (d-1)\big\}^{-1/\delta}\,dw_1 + (1-\beta) r^{-d\alpha} P_x\, x_1/(\alpha-1)}{\beta \big\{(r x_1)^{\alpha\delta} + \cdots + (r x_d)^{\alpha\delta} - (d-1)\big\}^{-1/\delta} + (1-\beta) r^{-d\alpha} P_x}, \]
since $\int_0^{\infty} (x_1+w)^{-\alpha}\,dw = x_1^{-\alpha+1}/(\alpha-1)$.
• First order approximation: this is the same as in part (a) because β cancels from the
numerator and denominator.
• Second order approximation: this is the same as in part (a) for δ < d − 1. For δ ≥ d − 1,
one gets
\[ \frac{\int_0^{\infty} b^*\big((x_1+w_1)^{-\alpha}, x_2^{-\alpha},\dots,x_d^{-\alpha};\delta,\beta\big)\,dw_1 + r^{-\alpha(d-1)} \int_0^{\infty} b_2^*\big((x_1+w_1)^{-\alpha}, x_2^{-\alpha},\dots,x_d^{-\alpha};\delta,\beta\big)\,dw_1}{b^*\big(x_1^{-\alpha},\dots,x_d^{-\alpha};\delta,\beta\big) + r^{-\alpha(d-1)}\, b_2^*\big(x_1^{-\alpha},\dots,x_d^{-\alpha};\delta,\beta\big)}. \]
Table 2 has some (representative) results to show how the approximations compare; we take
$r = (1-p)^{-1/\alpha}$, d = 2, x1 = x2 = 1, p = 0.999, β = 0.25, α = 2 and 5, δ ∈ [0.1, 1.9].
The conclusions are similar to Table 1, except the first and second order approximations
are slightly off in the last decimal place shown, even for δ > 1. The accuracy is of order
$O(u^d) = O(u^2)$ for δ > 1 rather than the order $O(u^{1+\delta})$ in part (a).
Table 1: Values of exact TCE minus x1, together with first/second order approximations for the
bivariate MTCJ copula with Pareto survival margins; r = (1 − p)^(−1/α), x1 = x2 = 1, p = 0.999.
       |         α = 2          |           α = 5
  δ    | exact   appr1   appr2  | exact    appr1    appr2
  0.1  | 2.114   4.063   3.349  | 0.3955   0.5556   0.5079
  0.3  | 2.257   2.464   2.290  | 0.4382   0.4639   0.4428
  0.5  | 1.968   2.000   1.969  | 0.4133   0.4180   0.4134
  0.7  | 1.761   1.766   1.761  | 0.3883   0.3892   0.3883
  0.9  | 1.622   1.624   1.622  | 0.3690   0.3692   0.3690
  1.1  | 1.526   1.526   1.526  | 0.3543   0.3543   0.3543
  1.3  | 1.456   1.456   1.456  | 0.3429   0.3429   0.3429
  1.5  | 1.402   1.402   1.402  | 0.3338   0.3338   0.3338
  1.7  | 1.360   1.360   1.360  | 0.3263   0.3263   0.3263
  1.9  | 1.326   1.326   1.326  | 0.3200   0.3200   0.3200
Table 2: Values of exact TCE minus x1, together with first/second order approximations for the
bivariate mixture of independence and MTCJ copulas, with Pareto survival margins; r = (1 − p)^(−1/α), x1 = x2 = 1, p = 0.999, β = 0.25.
       |         α = 2          |           α = 5
  δ    | exact   appr1   appr2  | exact    appr1    appr2
  0.1  | 1.951   4.063   3.349  | 0.3742   0.5556   0.5079
  0.3  | 2.227   2.464   2.290  | 0.4338   0.4639   0.4428
  0.5  | 1.957   2.000   1.969  | 0.4114   0.4180   0.4134
  0.7  | 1.755   1.766   1.761  | 0.3872   0.3892   0.3883
  0.9  | 1.622   1.624   1.622  | 0.3683   0.3692   0.3690
  1.1  | 1.523   1.526   1.526  | 0.3538   0.3544   0.3542
  1.3  | 1.453   1.456   1.455  | 0.3424   0.3429   0.3428
  1.5  | 1.400   1.402   1.402  | 0.3334   0.3338   0.3337
  1.7  | 1.358   1.360   1.360  | 0.3259   0.3263   0.3262
  1.9  | 1.324   1.326   1.325  | 0.3197   0.3200   0.3199
Table 3: Bounds for parts (2) and (3) of Theorem 2.4 for the MTCJ copula, with Pareto survival
margins; p = 0.999, (1 − p)^(−1/α) α/(α − 1) = 63.25 and 4.98 provide intermediate values for α = 2
and 5, respectively.
       |              α = 2               |              α = 5
  δ    | LB2     UB2     LB3     UB3      | LB2     UB2     LB3     UB3
  0.2  | 21.46   2908.   11.53   31340.   | 2.954   211.4   2.133   2208.
  0.5  | 47.21   375.8   41.61   1175.    | 4.270   30.05   3.967   105.2
  0.8  | 55.07   216.3   51.97   488.9    | 4.613   17.33   4.456   43.01
  1.0  | 57.48   177.0   55.23   353.8    | 4.718   14.16   4.605   30.76
  1.5  | 60.29   132.4   59.10   220.2    | 4.841   10.52   4.782   18.74
  2.0  | 61.45   112.9   60.72   169.2    | 4.893   8.944   4.857   14.21
  3.0  | 62.38   94.96   62.02   126.8    | 4.935   7.500   4.918   10.47
  4.0  | 62.74   86.54   62.53   108.4    | 4.952   6.826   4.942   8.872
  5.0  | 62.91   81.66   62.77   98.22    | 4.960   6.435   4.953   7.988
  8.0  | 63.11   74.54   63.05   84.07    | 4.970   5.869   4.967   6.764
Example 3.3. We show the quality of the approximations in parts (2) and (3) of Theorem 2.4 for
(3.3) with survival copula (3.2). Since $b^*(w) = (w_1^{-\delta} + \cdots + w_d^{-\delta})^{-1/\delta}$, the margins are given by
$b^*_S(w_j : j \in S) = (\sum_{j\in S} w_j^{-\delta})^{-1/\delta}$, and these can be used to compute $s_j(b^*,\alpha)$ and $S_j(b^*,\alpha)$ via
numerical integrations. The exponent function a∗ is in (2.6). If (X1, . . . , Xd) has the distribution
in (3.3), the distribution of $X_{\max} = \max\{X_1,\dots,X_d\}$ is
\[ F_{X_{\max}}(x) = F(x,\dots,x) = 1 + \sum_{j=1}^d (-1)^j \binom{d}{j} (j x^{\alpha\delta} - j + 1)^{-1/\delta}, \quad x > 0. \]
Based on this distribution, expressions of the form $\mathrm{VaR}_{g(p)}(\|X\|_{\max})$ can be computed numerically.
Because of exchangeability, parts (2) and (3) have the form
\[ \mathrm{UB}_d\,[\mathbf{1}_d,\infty] \subseteq \mathrm{TCE}_p(X) \subseteq \mathrm{LB}_d\,[\mathbf{1}_d,\infty]. \]
Table 3 lists the values of LBd and UBd for d = 2, 3 with α = 2 and 5. As might be expected, the
ratio UBd/LBd decreases as δ and α increase, and increases as d increases.
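The VaR of the maximum norm in Example 3.3 can be obtained by inverting the displayed distribution function. The sketch below does this by bisection, which is an assumed choice of root finder; the function names are hypothetical. For the illustration it uses d = 2, α = 2, δ = 1, where a∗(1,1) = 3/2 and b∗(1,1) = 1/2, and where evaluating the integrals of Theorem 2.4 by hand gives S_j = 1 + π/2 and s_j = 3 + π/2 (values derived for this sketch, not quoted from the paper).

```python
from math import comb, pi


def max_norm_cdf(x, d, alpha, delta):
    """Pr{max_i X_i <= x} for the multivariate Pareto (3.3), for x >= 1."""
    ad = alpha * delta
    return 1.0 + sum(
        (-1.0) ** j * comb(d, j) * (j * x ** ad - j + 1.0) ** (-1.0 / delta)
        for j in range(1, d + 1)
    )


def max_norm_var(level, d, alpha, delta, tol=1e-12):
    """Quantile of max_i X_i at the given level, found by bisection."""
    lo, hi = 1.0, 2.0
    while max_norm_cdf(hi, d, alpha, delta) < level:
        hi *= 2.0  # expand the bracket until it contains the quantile
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if max_norm_cdf(mid, d, alpha, delta) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    p, d, alpha, delta = 0.999, 2, 2.0, 1.0
    a_over_b = 1.5 / 0.5  # a*(1,1) / b*(1,1) for d = 2, delta = 1
    lb = max_norm_var(1.0 - (1.0 - p) * a_over_b, d, alpha, delta) * (1.0 + pi / 2)
    ub = max_norm_var(p, d, alpha, delta) * (3.0 + pi / 2)
    print(lb, ub)
```

Under these assumptions the two products come out close to 57.48 and 177.0, in line with the LB2 and UB2 entries of Table 3 for δ = 1.0 and α = 2.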
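Example 3.4. We consider general Archimedean copulas which satisfy a regular variation condition.
Consider a loss vector $(X_1, \ldots, X_d)$ that has regularly varying margins with heavy-tail
index $\alpha > 1$, and the Archimedean survival copula $C(u; \phi) = \phi(\sum_{i=1}^{d} \phi^{-1}(u_i))$ where the Laplace
transform $\phi$ is regularly varying at $\infty$ in the sense of (1.2) with tail index $\beta > 0$. It follows from
Proposition 2.8 of [18] that $b^*(w_1, \ldots, w_d) = (w_1^{-1/\beta} + \cdots + w_d^{-1/\beta})^{-\beta}$. Observe that $(X_1, \ldots, X_d)$
is more tail dependent as $\beta$ decreases. Thus, for $1 \le j \le d$,
$$
S_j(b^*, \alpha) = 1 + d^{\beta} \int_1^\infty \left(w^{\alpha/\beta} + d - 1\right)^{-\beta} dw,
$$
$$
s_j(b^*, \alpha) = \frac{\alpha}{\alpha - 1}\, d^{\beta} + d^{\beta} \sum_{\emptyset \ne S \subseteq \{i : i \ne j\}} (-1)^{|S|} \left[ (|S| + 1)^{-\beta} - \int_0^1 \left(w^{\alpha/\beta} + |S|\right)^{-\beta} dw \right].
$$
It follows from Theorem 2.4 that computable asymptotic bounds are given by
$$
(S_1(b^*, \alpha), \ldots, S_d(b^*, \alpha)) + \mathbb{R}^d_+ \supseteq \lim_{p \to 1} \frac{\mathrm{TCE}_p(X)}{\mathrm{VaR}_p(\|X\|_{\max})} \supseteq (s_1(b^*, \alpha), \ldots, s_d(b^*, \alpha)) + \mathbb{R}^d_+.
$$
Since
$$
\lim_{\beta \to 0} \int_1^\infty \left(w^{\alpha/\beta} + d - 1\right)^{-\beta} dw = \int_1^\infty w^{-\alpha}\, dw = \frac{1}{\alpha - 1}, \quad \text{and} \quad \lim_{\beta \to 0} \int_0^1 \left(w^{\alpha/\beta} + |S|\right)^{-\beta} dw = 1,
$$
we obtain that for fixed $\alpha > 1$, $\lim_{\beta \to 0} s_j(b^*, \alpha)/S_j(b^*, \alpha) = 1$, for $1 \le j \le d$. That is, asymptotic
subset and superset bounds are approximately identical for small $\beta$.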
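The two constants of Example 3.4 can be evaluated with elementary quadrature. The sketch below is an illustration only: the names `midpoint`, `S_bound` and `s_bound` are hypothetical, a plain composite midpoint rule is assumed to be accurate enough, and the sum over non-empty subsets $S$ is collapsed by exchangeability into a sum over the subset size $k = |S|$ with multiplicity $\binom{d-1}{k}$. The infinite integral is mapped to $[0,1]$ by the substitution $v = w^{-(\alpha-1)}$.

```python
from math import comb

def midpoint(f, n=20000):
    """Composite midpoint rule for the integral of f over [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

def S_bound(d, alpha, beta):
    """S_j(b*, alpha) = 1 + d^beta * int_1^inf (w^(alpha/beta) + d - 1)^(-beta) dw."""
    # With v = w^(-(alpha - 1)) the integral over [1, inf) becomes
    # (1/(alpha - 1)) * int_0^1 (1 + (d - 1) v^(alpha/(beta (alpha - 1))))^(-beta) dv.
    e = alpha / (beta * (alpha - 1.0))
    tail = midpoint(lambda v: (1.0 + (d - 1) * v ** e) ** (-beta)) / (alpha - 1.0)
    return 1.0 + d ** beta * tail

def s_bound(d, alpha, beta):
    """s_j(b*, alpha), summing over subset sizes k = |S| = 1, ..., d - 1."""
    total = alpha / (alpha - 1.0)
    for k in range(1, d):
        inner = midpoint(lambda w: (w ** (alpha / beta) + k) ** (-beta))
        total += (-1) ** k * comb(d - 1, k) * ((k + 1) ** (-beta) - inner)
    return d ** beta * total
```

For $d = 2$, $\alpha = 2$, $\beta = 1$ both integrals reduce to arctangent values, giving $S_j = 1 + \pi/2$ and $s_j = 3 + \pi/2$, and the ratio $s_j/S_j$ computed this way moves toward 1 as $\beta$ is decreased, in line with the limit stated above.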
Remark 3.5. With the point process approach applied to data in the tails, the tail dependence
function b∗ can be estimated, and then the results in the theorems can be used. An outline of the
steps is as follows.
1. For risk variable $j$, a heavy-tail index estimation method, such as the Hill estimator, can be
applied to data values above a threshold to get an estimated univariate tail index $\alpha_j$. If any
risk variable shows a thin tail (i.e., exponentially decaying), it can be removed from calculations
of multivariate extremal risk. For the remaining risks with possibly different heavy-tail
indexes, make appropriate power transforms and rescale the data so that exceedances above
the threshold have a Pareto distribution with tail index $\alpha$ in the middle of the range of the
$\alpha_j$'s (see page 310 of [31]).
2. Transform to Fréchet margins and use the point process likelihood approach for the joint tails
of the risk variables [10, 19, 16]. The exponent function a∗ corresponds to the intensity mea-
sure of the point process. For example, the simplest exchangeable Gumbel model discussed
in Example 3.2 (a), nested Gumbel models [15], scale mixture models [17, 25], and several
other parametric models of a∗ all have tractable forms so that the point process likelihood
can be easily optimized numerically; included are models with flexible dependence (labeled as
MM1, MM2, MM3 in [16]) which have a parameter denoting a minimal dependence level and
additional parameters for each pair that add onto the minimal dependence. An estimated b∗
can be obtained from a∗ using the inclusion-exclusion relation.
3. Combining $\alpha$ in step 1 and $b^*$ in step 2, Theorem 2.4 can be used to obtain bounds on the
scaled risks, with one-dimensional numerical integration. With the rescaled risk variables
$X_1, \ldots, X_d$, let the thresholds $T_j$ satisfy $F_{X_j}(T_j) = q$ for all $j$, where $q$ might be in the range
[0.5, 0.8]. Let $F_{X_j}(x_j) = q + (1 - q)[1 - (1 + x_j - T_j)^{-\alpha}]$ for $x_j > T_j$ with the estimated
common $\alpha$. With a parametric $a^*$, we use the copula $C(u_1, \ldots, u_d) = e^{-a^*(-\log u_1, \ldots, -\log u_d)}$
in the tail region. For $x_1 > T_1, \ldots, x_d > T_d$, the estimated tail distribution is
$$
F_{X_1, \ldots, X_d}(x_1, \ldots, x_d) = C(F_{X_1}(x_1), \ldots, F_{X_d}(x_d)). \tag{3.5}
$$
The tail conditional expectation $E(X_j \mid X \in r(x, \infty])$ can be evaluated using the estimated
$\alpha$ and $b^*$ or using (3.5) through one-dimensional numerical integration, as in Example 3.2,
provided $r x_j > T_j$ for all $j$.
4. For non-rectangular upper sets $B$ such that $rB \subset \prod_{j=1}^{d} [T_j, \infty)$, $E(X_j \mid X \in rB)$ can be
evaluated by simulation of the tail of (3.5). This is better than fitting a distribution to the
entire data for simulation: it avoids extrapolation from a fit that is dominated by the middle
of the data, and it reduces the simulation sample size.
5. Parts (2) and (3) of Theorem 2.4 are useful as a way to more quickly give insight on the effect
of the univariate tail index α and the amount of tail dependence (represented by b∗) on the
size of TCEp(X) for p near 1. The pattern in the limit should carry over to the non-limit in
the tail region.
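Steps 1 and 3 of the outline above can be illustrated with a small sketch. It is not the authors' implementation: the function names `hill_tail_index` and `tail_cdf` and the choice of the number `k` of upper order statistics are hypothetical, the Hill estimator is written in its standard form (reciprocal of the mean log-excess over the $(k+1)$-th largest observation), and the marginal tail is the formula $q + (1-q)[1 - (1 + x - T)^{-\alpha}]$ from step 3.

```python
from math import log

def hill_tail_index(data, k):
    """Hill estimate of the tail index alpha based on the k largest observations."""
    xs = sorted(data, reverse=True)
    if not 1 <= k < len(xs) or xs[k] <= 0:
        raise ValueError("need 1 <= k < n and a positive threshold")
    threshold = xs[k]  # the (k+1)-th largest value acts as the threshold
    mean_log_excess = sum(log(x / threshold) for x in xs[:k]) / k
    return 1.0 / mean_log_excess

def tail_cdf(x, T, q, alpha):
    """Marginal distribution above the threshold T, as in step 3:
    q + (1 - q) * [1 - (1 + x - T)^(-alpha)] for x >= T."""
    if x < T:
        raise ValueError("the formula is only specified above the threshold T")
    return q + (1.0 - q) * (1.0 - (1.0 + x - T) ** (-alpha))
```

On exact Pareto quantiles the Hill estimate should be close to the true index, and `tail_cdf` equals `q` at the threshold and increases to 1, which are the two properties the outline relies on. The choice of `k` trades bias against variance and is left open here.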
4 Concluding Remarks
Our results illustrate how tail risk is quantitatively affected by extremal dependence and also show
how the tool of tail dependence functions can be used to estimate such an asymptotic relation.
Similar to the univariate case (1.4), the multivariate tail conditional expectation TCEp(X) as
p → 1 is essentially linearly related to the value-at-risk of an aggregated norm of X. In contrast
to the univariate case where the asymptotic proportionality constant is related to the heavy-tail
index α, the asymptotic proportionality constants in the multivariate case depend not only on the
heavy-tail index α but also on the tail dependence structure.
As illustrated in the paper, the lower and upper bounds for multivariate TCEs become approx-
imately equal for highly tail dependent distributions, and thus our method is especially effective
for analyzing extremal risks for loss variables with significant tail dependence. For example, non-
overlapping aggregations of large numbers of loss variables in high-dimensional portfolios can have
strong tail dependence even though loss variables themselves only demonstrate weak tail depen-
dence; see [23]. When the lower and upper bounds are far apart, reducing the class of relevant
upper sets is suggested.
The quality of the bounds presented in Theorem 2.4 might be poor for the distributions with
weaker tail dependence. In this situation, one may aggregate loss variables with weak tail
dependence, which also corresponds to choosing some reduced class of specific upper sets B in Theorem
2.2, so that better bounds can be obtained. One can also use the higher order expansions such
as (3.1) to reveal the dependence structure at sub-extreme levels so that more accurate, tractable
bounds can be developed. Our numerical examples via the second order expansion show some
significant improvements in the presence of weak tail dependence, but more theoretical studies are
indeed needed in this area.
Acknowledgments. The authors would like to thank two referees, an Editor and the Editor-in-Chief
for their comments that led to an improvement of the presentation of this paper.
5 Appendix: Proofs
5.1 Proof of Theorem 2.2
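Proof. To estimate $E(X \mid X \in rB)$ for any upper set $B$ bounded away from 0, consider
$$
E(X_j \mid X \in rB) = \int_0^\infty \Pr\{X_j > x \mid X \in rB\}\, dx = r \int_0^\infty \frac{\Pr\{X_j > rw, X \in rB\}}{\Pr\{X \in rB\}}\, dw \tag{5.1}
$$
for any $1 \le j \le d$. We first argue that we can pass the limit through the integration (5.1). Since
$$
\Pr\{X_j > rw, X \in rB\} \le \Pr\{X_j > rw\}, \tag{5.2}
$$
it follows from the Karamata theorem (1.6) that for any fixed $c > 0$,
$$
\lim_{r \to \infty} \int_c^\infty \frac{\Pr\{X_j > rw\}}{\Pr\{X \in rB\}}\, dw = \lim_{r \to \infty} \int_{rc}^\infty \frac{\Pr\{X_j > x\}}{r \Pr\{X \in rB\}}\, dx = \frac{c}{\alpha - 1} \lim_{r \to \infty} \frac{\Pr\{X_j > rc\}}{\Pr\{X \in rB\}}.
$$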
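Let $A_j(w) := \{(x_1, \ldots, x_d) \in \mathbb{R}^d : x_j > w\}$; then via (2.2), we have
$$
\lim_{r \to \infty} \int_c^\infty \frac{\Pr\{X_j > rw\}}{\Pr\{X \in rB\}}\, dw = \frac{c}{\alpha - 1}\, \frac{\mu(A_j(c))}{\mu(B)} = \int_c^\infty \frac{\mu(A_j(w))}{\mu(B)}\, dw, \tag{5.3}
$$
where the last equality follows from the direct calculation via (2.3). Because of (5.2), (5.3) and the
generalized dominated convergence theorem, we have from (2.2) that for any $c > 0$,
$$
\lim_{r \to \infty} \int_c^\infty \frac{\Pr\{X_j > rw, X \in rB\}}{\Pr\{X \in rB\}}\, dw = \int_c^\infty \lim_{r \to \infty} \frac{\Pr\{X_j > rw, X \in rB\}}{\Pr\{X \in rB\}}\, dw = \int_c^\infty \frac{\mu(A_j(w) \cap B)}{\mu(B)}\, dw,
$$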
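which implies that for any small $\varepsilon > 0$, there exists $r_\varepsilon$ such that for all $r \ge r_\varepsilon$,
$$
\begin{aligned}
&\left| \int_0^\infty \frac{\Pr\{X_j > rw, X \in rB\}}{\Pr\{X \in rB\}}\, dw - \int_0^\infty \frac{\mu(A_j(w) \cap B)}{\mu(B)}\, dw \right| \\
&\quad \le \int_0^{\varepsilon/3} \frac{\Pr\{X_j > rw, X \in rB\}}{\Pr\{X \in rB\}}\, dw
 + \left| \int_{\varepsilon/3}^\infty \frac{\Pr\{X_j > rw, X \in rB\}}{\Pr\{X \in rB\}}\, dw - \int_{\varepsilon/3}^\infty \frac{\mu(A_j(w) \cap B)}{\mu(B)}\, dw \right|
 + \int_0^{\varepsilon/3} \frac{\mu(A_j(w) \cap B)}{\mu(B)}\, dw \\
&\quad \le \int_0^{\varepsilon/3} \frac{\Pr\{X_j > rw, X \in rB\}}{\Pr\{X \in rB\}}\, dw + \frac{\varepsilon}{3} + \int_0^{\varepsilon/3} \frac{\mu(A_j(w) \cap B)}{\mu(B)}\, dw
 \le \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon,
\end{aligned}
$$
where the last inequality follows due to the fact that $\Pr\{X_j > rw, X \in rB\} \le \Pr\{X \in rB\}$ and
$\mu(A_j(w) \cap B) \le \mu(B)$. Therefore, we have from (5.1) that
$$
\lim_{r \to \infty} \frac{1}{r} E(X_j \mid X \in rB) = \lim_{r \to \infty} \int_0^\infty \frac{\Pr\{X_j > rw, X \in rB\}}{\Pr\{X \in rB\}}\, dw = \int_0^\infty \frac{\mu(A_j(w) \cap B)}{\mu(B)}\, dw. \tag{5.4}
$$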
This concludes the proof of statement (1).
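The last equality in (5.3), described in the proof as a direct calculation, can be written out as follows. The sketch assumes that the property used from (2.3) is the usual scaling relation $\mu(tB) = t^{-\alpha}\mu(B)$, $t > 0$, of the intensity measure, together with the observation that $A_j(w) = w A_j(1)$ for $w > 0$.

```latex
\begin{aligned}
\int_c^\infty \mu(A_j(w))\, dw
  &= \mu(A_j(1)) \int_c^\infty w^{-\alpha}\, dw
   = \mu(A_j(1))\, \frac{c^{1-\alpha}}{\alpha - 1} \\
  &= \frac{c}{\alpha - 1}\, c^{-\alpha} \mu(A_j(1))
   = \frac{c}{\alpha - 1}\, \mu(A_j(c)), \qquad \alpha > 1 .
\end{aligned}
```

Dividing both sides by $\mu(B)$ gives the stated identity; the condition $\alpha > 1$ is what makes the integral finite.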
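For statement (2), we simplify (2.1) asymptotically. For any upper set $A \in Q_p(X)$, there
exists an upper set $B$ with $B \cap S^{d-1}_+ \ne \emptyset$ and a positive number $r_B$ such that $A = r_B B$. Since
$\Pr\{X \in rB\}$ is decreasing in $r$, we can find $r_{B,p} \ge r_B$ for any $A = r_B B$ such that
$\Pr\{X \in A\} \ge \Pr\{X \in r_{B,p}B\} = 1 - p$, as $p \to 1$. It follows from (5.4) that $E(X_j \mid X \in r_{B,p}B)$ is asymptotically
increasing for sufficiently small $1 - p$ and goes to $+\infty$ as $p \to 1$, and thus we have
$E(X \mid X \in A) \le E(X \mid X \in r_{B,p}B)$ for sufficiently small $1 - p$. Since $E(X \mid X \in A) + K \supseteq E(X \mid X \in r_{B,p}B) + K$
for sufficiently small $1 - p$, and $r_{B,p}B \in Q_p(X)$, we have
$$
\lim_{p \to 1} \Big[ \Big( \bigcap_{B \in Q} \big( E(X \mid X \in r_{B,p}B) + K \big) \Big) \setminus \mathrm{TCE}_p(X) \Big]
= \lim_{p \to 1} \Big[ \bigcap_{A \in Q_p(X)} \big( E(X \mid X \in A) + K \big) \setminus \mathrm{TCE}_p(X) \Big]
= \emptyset,
$$
where $Q := \{B \subseteq \mathbb{R}^d : B + K = B,\ B \cap S^{d-1}_+ \ne \emptyset,\ B \text{ is bounded away from } 0 \text{ and } \Pr\{X \in r_{B,p}B\} = 1 - p\}$.
That is, (2.1) can be rewritten as follows, for sufficiently small $1 - p$,
$$
\mathrm{TCE}_p(X) \approx \bigcap_{B \in Q} \big( E(X \mid X \in r_{B,p}B) + K \big). \tag{5.5}
$$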
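For any $B \in Q$, there exists a real number $r_B$ with $r_B \ge 1$ such that
$r_B B \in Q_{\|\cdot\|} = \{B \subseteq \mathbb{R}^d : B + K = B,\ B \cap S^{d-1}_+ \ne \emptyset,\ B \subseteq (\mathbb{B}^d)^c\}$.
That is, for any $B \in Q$ with $\Pr\{X \in r_{B,p}B\} = 1 - p$, we
can find a $B' \in Q_{\|\cdot\|}$ and a real number $r_{B',p}$ (e.g., $r_{B',p} = r_{B,p}/r_B$) such that $r_{B,p}B = r_{B',p}B'$.
Thus (5.5) can be rewritten further as
$$
\mathrm{TCE}_p(X) \approx \bigcap_{B \in Q_{\|\cdot\|},\ \Pr\{X \in r_{B,p}B\} = 1 - p} \big( E(X \mid X \in r_{B,p}B) + K \big), \tag{5.6}
$$
for sufficiently small $1 - p$. Observe that as $p \to 1$, $r_{B,p} \to \infty$, and thus it follows from (2.2) that for
sufficiently small $1 - p$, $\mu(B) \Pr\{\|X\| > r_{B,p}\} \approx 1 - p$, implying that $r_{B,p} \approx \mathrm{VaR}_{1-(1-p)/\mu(B)}(\|X\|)$
as $p \to 1$. Therefore, (5.4) and (5.6) imply that
$$
\mathrm{TCE}_p(X) \approx \bigcap_{B \in Q_{\|\cdot\|}} \mathrm{VaR}_{1-(1-p)/\mu(B)}(\|X\|) \big( (u_1(B; \mu), \ldots, u_d(B; \mu)) + K \big)
$$
as $p \to 1$, where $u_j(B; \mu) = \int_0^\infty \frac{\mu(A_j(w) \cap B)}{\mu(B)}\, dw$, $1 \le j \le d$.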
5.2 Proof of Theorem 2.4
Proof. Since margins $F_1, \ldots, F_d$ of $F$ are tail equivalent [31], we have that $\bar{F}_j(x) = L_j(x)/x^{\alpha}$,
$1 \le j \le d$, where $L_i(x)/L_j(x) \to 1$ as $x \to \infty$.
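(1) Without loss of generality, let $j = 1$. The straightforward calculation shows
$$
\begin{aligned}
E(X_1 \mid X > rx)
&= \int_0^\infty \frac{\Pr\{X_1 > x, X_1 > rx_1, \ldots, X_d > rx_d\}}{\Pr\{X_1 > rx_1, \ldots, X_d > rx_d\}}\, dx \\
&= rx_1 + \int_{rx_1}^\infty \frac{\Pr\{X_1 > x, X_2 > rx_2, \ldots, X_d > rx_d\}}{\Pr\{X_1 > rx_1, \ldots, X_d > rx_d\}}\, dx \\
&= r \left( x_1 + \int_{x_1}^\infty \frac{\Pr\{X_1 > rw, X_2 > rx_2, \ldots, X_d > rx_d\}}{\Pr\{X_1 > rx_1, \ldots, X_d > rx_d\}}\, dw \right) \\
&= r \left( x_1 + \int_{x_1}^\infty \frac{\Pr\{U_1 > F_1(rw), U_2 > F_2(rx_2), \ldots, U_d > F_d(rx_d)\}}{\Pr\{U_1 > F_1(rx_1), \ldots, U_d > F_d(rx_d)\}}\, dw \right).
\end{aligned}
$$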
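Applying the Karamata theorem and generalized dominated convergence theorem, we are allowed
to pass the limit through the integral. Since $L_j$, $1 \le j \le d$, are slowly varying and the margins are
tail equivalent, we have
$$
\begin{aligned}
\lim_{r \to \infty} \frac{1}{r} E(X_1 \mid X > rx)
&= x_1 + \lim_{r \to \infty} \int_{x_1}^\infty \frac{\Pr\{U_1 > 1 - L_1(rw)/(rw)^{\alpha}, \ldots, U_d > 1 - L_d(rx_d)/(rx_d)^{\alpha}\}}{\Pr\{U_1 > 1 - L_1(rx_1)/(rx_1)^{\alpha}, \ldots, U_d > 1 - L_d(rx_d)/(rx_d)^{\alpha}\}}\, dw \\
&= x_1 + \int_{x_1}^\infty \lim_{r \to \infty} \frac{\Pr\{U_1 > 1 - w^{-\alpha} L_1(r) r^{-\alpha}, U_2 > 1 - x_2^{-\alpha} L_1(r) r^{-\alpha}, \ldots, U_d > 1 - x_d^{-\alpha} L_1(r) r^{-\alpha}\}}{\Pr\{U_1 > 1 - x_1^{-\alpha} L_1(r) r^{-\alpha}, \ldots, U_d > 1 - x_d^{-\alpha} L_1(r) r^{-\alpha}\}}\, dw \\
&= x_1 + \int_{x_1}^\infty \lim_{u \to 0} \frac{\Pr\{U_1 > 1 - w^{-\alpha} u, U_2 > 1 - x_2^{-\alpha} u, \ldots, U_d > 1 - x_d^{-\alpha} u\}}{\Pr\{U_1 > 1 - x_1^{-\alpha} u, \ldots, U_d > 1 - x_d^{-\alpha} u\}}\, dw \\
&= x_1 + \int_{x_1}^\infty \frac{b^*(w^{-\alpha}, x_2^{-\alpha}, \ldots, x_d^{-\alpha})}{b^*(x_1^{-\alpha}, x_2^{-\alpha}, \ldots, x_d^{-\alpha})}\, dw
 = \int_0^\infty \frac{b^*((w_1 \vee x_1)^{-\alpha}, x_2^{-\alpha}, \ldots, x_d^{-\alpha})}{b^*(x_1^{-\alpha}, \ldots, x_d^{-\alpha})}\, dw_1.
\end{aligned}
$$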
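(2) It follows from (5.6) that as $p \to 1$,
$$
\mathrm{TCE}_p(X) \subseteq \bigcap_{x \in S^{d-1}_+} \big( E(X \mid X \in r_{x,p}(x, \infty]) + \mathbb{R}^d_+ \big)
$$
where $r_{x,p}$ satisfies $\Pr\{X \in r_{x,p}(x, \infty]\} = 1 - p$. Since $b^*(\mathbf{1}) > 0$, it follows from Theorem 2.4
of [25] that $\mu((\mathbf{1}, \infty]) > 0$. Since $\|X\|_{\max}$ is regularly varying at $\infty$, we have for sufficiently
small $1 - p$, there exists $r_{\mathbf{1},p}$ such that $\mu((\mathbf{1}, \infty]) \Pr\{\|X\|_{\max} > r_{\mathbf{1},p}\} = 1 - p$, which implies that
$r_{\mathbf{1},p} \approx \mathrm{VaR}_{1-(1-p)/\mu((\mathbf{1},\infty])}(\|X\|_{\max})$ as $p \to 1$. Observe that as $p \to 1$, $r_{\mathbf{1},p} \to \infty$, and thus it
follows from (2.2) that for sufficiently small $1 - p$,
$$
\Pr\{X \in r_{\mathbf{1},p}(\mathbf{1}, \infty]\} \approx \mu((\mathbf{1}, \infty]) \Pr\{\|X\|_{\max} > r_{\mathbf{1},p}\} = 1 - p.
$$
Therefore, as $p \to 1$,
$$
\mathrm{TCE}_p(X) \subseteq \bigcap_{x \in S^{d-1}_+} \big( E(X \mid X \in r_{x,p}(x, \infty]) + \mathbb{R}^d_+ \big) \subseteq E(X \mid X \in r_{\mathbf{1},p}(\mathbf{1}, \infty]) + \mathbb{R}^d_+. \tag{5.7}
$$
Since $\|X\|_{\max} > r_{\mathbf{1},p}$ if and only if $X \in r_{\mathbf{1},p}[\mathbf{0}, \mathbf{1}]^c$, the constant $k$ in (2.3) equals 1 and $\mu([\mathbf{0}, \mathbf{1}]^c) = 1$.
It then follows from (2.7) that $\mu((\mathbf{1}, \infty]) = b^*(1, \ldots, 1)/a^*(1, \ldots, 1)$, and thus from (1) that as $p \to 1$,
$$
E(X \mid X \in r_{\mathbf{1},p}(\mathbf{1}, \infty]) \approx \mathrm{VaR}_{1 - (1-p)\, a^*(1,\ldots,1)/b^*(1,\ldots,1)}(\|X\|_{\max})\, (S_1(b^*, \alpha), \ldots, S_d(b^*, \alpha))
$$
where $S_j(b^*, \alpha) = \int_0^\infty \frac{b^*(1, \ldots, 1, (w_j \vee 1)^{-\alpha}, 1, \ldots, 1)}{b^*(1, \ldots, 1)}\, dw_j$, $1 \le j \le d$. Plugging this into (5.7), we obtain (2).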
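(3) In light of (5.6), consider, for any $B \in Q_{\|\cdot\|_{\max}}$ with $\Pr\{X \in r_{B,p}B\} = 1 - p$,
$$
E(X_j \mid X \in r_{B,p}B) = \frac{E(X_j I\{X \in r_{B,p}B\})}{\Pr\{X \in r_{B,p}B\}}.
$$
Since $(1, \infty]^d \subseteq B \subseteq [\mathbf{0}, \mathbf{1}]^c$ for any $B \in Q_{\|\cdot\|_{\max}}$, we have
$$
E(X_j \mid X \in r_{B,p}B) \le \frac{E(X_j I\{X \in r_{B,p}[\mathbf{0}, \mathbf{1}]^c\})}{\Pr\{X \in r_{B,p}(1, \infty]^d\}}
= \int_0^\infty \frac{\Pr\{\{X_j > x\} \cap \{X \in r_{B,p}[\mathbf{0}, \mathbf{1}]^c\}\}}{\Pr\{X \in r_{B,p}(1, \infty]^d\}}\, dx. \tag{5.8}
$$
If $x > r_{B,p}$ then
$$
\Pr\{\{X_j > x\} \cap \{X \in r_{B,p}[\mathbf{0}, \mathbf{1}]^c\}\} = \Pr\{X_j > x\}.
$$
If $x \le r_{B,p}$ then
$$
\begin{aligned}
&\Pr\{\{X_j > x\} \cap \{X \in r_{B,p}[\mathbf{0}, \mathbf{1}]^c\}\} = \Pr\{\{X_j > x\} \cap (\cup_{i=1}^{d} \{X_i > r_{B,p}\})\} \\
&\quad = \Pr\{\cup_{i=1}^{d} (\{X_j > x\} \cap \{X_i > r_{B,p}\})\} = \Pr\{(\cup_{i \ne j} \{X_j > x, X_i > r_{B,p}\}) \cup \{X_j > r_{B,p}\}\} \\
&\quad = \sum_{S \subseteq \{i : i \ne j\}} (-1)^{|S|} \Pr\{X_j > r_{B,p}, X_i > r_{B,p}, i \in S\} - \sum_{\emptyset \ne S \subseteq \{i : i \ne j\}} (-1)^{|S|} \Pr\{X_j > x, X_i > r_{B,p}, i \in S\} \\
&\quad = \Pr\{X_j > r_{B,p}\} + \sum_{\emptyset \ne S \subseteq \{i : i \ne j\}} (-1)^{|S|} \big( \Pr\{X_j > r_{B,p}, X_i > r_{B,p}, i \in S\} - \Pr\{X_j > x, X_i > r_{B,p}, i \in S\} \big).
\end{aligned} \tag{5.9}
$$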
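The inclusion-exclusion identity (5.9) holds for any probability measure, so it can be checked exactly on an empirical measure by counting sample points. The sketch below does this; the function names `lhs_count`, `joint_count` and `rhs_count` are hypothetical, indices are 0-based, and the identity is only claimed for $x \le r$, as in the proof.

```python
from itertools import combinations

def lhs_count(points, j, x, r):
    """Number of points with X_j > x and X_i > r for at least one coordinate i."""
    return sum(1 for pt in points if pt[j] > x and max(pt) > r)

def joint_count(points, j, level, S, r):
    """Number of points with X_j > level and X_i > r for every i in S."""
    return sum(1 for pt in points
               if pt[j] > level and all(pt[i] > r for i in S))

def rhs_count(points, j, x, r):
    """Right-hand side of (5.9) in counts; intended for x <= r."""
    d = len(points[0])
    others = [i for i in range(d) if i != j]
    total = joint_count(points, j, r, (), r)        # term Pr{X_j > r}
    for size in range(1, d):                        # non-empty subsets S of {i : i != j}
        for S in combinations(others, size):
            diff = (joint_count(points, j, r, S, r)
                    - joint_count(points, j, x, S, r))
            total += (-1) ** size * diff
    return total
```

Because both sides are integer counts over the same sample, they should agree exactly for every coordinate `j` and every pair of levels with `x <= r`; dividing by the sample size gives the probabilities in (5.9).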
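Since the margins are tail equivalent and the functions $L_j$ are slowly varying, we have, for any $0 \le w_j \le 1$, and any
$\emptyset \ne S \subseteq \{i : i \ne j\}$,
$$
\begin{aligned}
&\lim_{p \to 1} \frac{\Pr\{X_j > r_{B,p} w_j, X_i > r_{B,p}, i \in S\}}{\Pr\{X \in r_{B,p}(1, \infty]^d\}} \\
&\quad = \lim_{p \to 1} \frac{\Pr\{U_j > 1 - w_j^{-\alpha} r_{B,p}^{-\alpha} L_j(r_{B,p} w_j), U_i > 1 - r_{B,p}^{-\alpha} L_i(r_{B,p}), i \in S\}}{\Pr\{U_i > 1 - r_{B,p}^{-\alpha} L_i(r_{B,p}), 1 \le i \le d\}} \\
&\quad = \lim_{r_{B,p} \to \infty} \frac{\Pr\{U_j > 1 - w_j^{-\alpha} r_{B,p}^{-\alpha} L_1(r_{B,p}), U_i > 1 - r_{B,p}^{-\alpha} L_1(r_{B,p}), i \in S\}}{\Pr\{U_i > 1 - r_{B,p}^{-\alpha} L_1(r_{B,p}), 1 \le i \le d\}} \\
&\quad = b^*_{\{j\} \cup S}(w_j^{-\alpha}, 1, \ldots, 1; C_{\{j\} \cup S}) / b^*(1, \ldots, 1),
\end{aligned}
$$
where $b^*_{\{j\} \cup S}(w_j^{-\alpha}, 1, \ldots, 1; C_{\{j\} \cup S})$ denotes the upper tail dependence function of the multivariate
margin $C_{\{j\} \cup S}$ evaluated with the $j$-th argument being $w_j^{-\alpha}$ and others being one. Similarly,
$$
\lim_{p \to 1} \frac{\Pr\{X_j > r_{B,p}, X_i > r_{B,p}, i \in S\}}{\Pr\{X \in r_{B,p}(1, \infty]^d\}} = \frac{b^*_{\{j\} \cup S}(1, \ldots, 1; C_{\{j\} \cup S})}{b^*(1, \ldots, 1)},
\qquad
\lim_{p \to 1} \frac{\Pr\{X_j > r_{B,p}\}}{\Pr\{X \in r_{B,p}(1, \infty]^d\}} = \frac{1}{b^*(1, \ldots, 1)}. \tag{5.10}
$$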
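Using the bounded convergence theorem, we then have, for sufficiently small $1 - p$,
$$
\begin{aligned}
&\int_0^1 \sum_{\emptyset \ne S \subseteq \{i : i \ne j\}} (-1)^{|S|} \frac{\Pr\{X_j > r_{B,p}, X_i > r_{B,p}, i \in S\} - \Pr\{X_j > r_{B,p} w_j, X_i > r_{B,p}, i \in S\}}{\Pr\{X \in r_{B,p}(1, \infty]^d\}}\, dw_j \\
&\quad \approx \sum_{\emptyset \ne S \subseteq \{i : i \ne j\}} (-1)^{|S|} \frac{b^*_{\{j\} \cup S}(1, \ldots, 1; C_{\{j\} \cup S}) - \int_0^1 b^*_{\{j\} \cup S}(w_j^{-\alpha}, 1, \ldots, 1; C_{\{j\} \cup S})\, dw_j}{b^*(1, \ldots, 1)}.
\end{aligned} \tag{5.11}
$$
Plugging (5.10) and (5.11) into (5.9), we have, for sufficiently small $1 - p$,
$$
\begin{aligned}
&\int_0^{r_{B,p}} \frac{\Pr\{\{X_j > x\} \cap \{X \in r_{B,p}[\mathbf{0}, \mathbf{1}]^c\}\}}{\Pr\{X \in r_{B,p}(1, \infty]^d\}}\, dx \\
&\quad \approx \frac{r_{B,p}}{b^*(1, \ldots, 1)} + r_{B,p} \sum_{\emptyset \ne S \subseteq \{i : i \ne j\}} (-1)^{|S|} \frac{b^*_{\{j\} \cup S}(1, \ldots, 1; C_{\{j\} \cup S}) - \int_0^1 b^*_{\{j\} \cup S}(w_j^{-\alpha}, 1, \ldots, 1; C_{\{j\} \cup S})\, dw_j}{b^*(1, \ldots, 1)}.
\end{aligned} \tag{5.12}
$$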
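On the other hand, using the Karamata theorem (1.6), we have, for sufficiently small $1 - p$,
$$
\begin{aligned}
\int_{r_{B,p}}^\infty \frac{\Pr\{\{X_j > x\} \cap \{X \in r_{B,p}[\mathbf{0}, \mathbf{1}]^c\}\}}{\Pr\{X \in r_{B,p}(1, \infty]^d\}}\, dx
&= \int_{r_{B,p}}^\infty \frac{\Pr\{X_j > x\}}{\Pr\{X \in r_{B,p}(1, \infty]^d\}}\, dx \\
&\approx r_{B,p}\, \frac{1}{\alpha - 1}\, \frac{\Pr\{X_j > r_{B,p}\}}{\Pr\{X \in r_{B,p}(1, \infty]^d\}}
 \approx \frac{r_{B,p}}{(\alpha - 1)\, b^*(1, \ldots, 1)}.
\end{aligned} \tag{5.13}
$$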
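Combining (5.12) and (5.13) into (5.8), we have, for sufficiently small $1 - p$,
$$
E(X_j \mid X \in r_{B,p}B) \le \frac{\alpha}{\alpha - 1}\, \frac{r_{B,p}}{b^*(1, \ldots, 1)}
+ r_{B,p} \sum_{\emptyset \ne S \subseteq \{i : i \ne j\}} (-1)^{|S|} \frac{b^*_{\{j\} \cup S}(1, \ldots, 1; C_{\{j\} \cup S}) - \int_0^1 b^*_{\{j\} \cup S}(w_j^{-\alpha}, 1, \ldots, 1; C_{\{j\} \cup S})\, dw_j}{b^*(1, \ldots, 1)}.
$$
As $p \to 1$, $r_{B,p} \approx \mathrm{VaR}_{1-(1-p)/\mu(B)}(\|X\|_{\max}) \le \mathrm{VaR}_{1-(1-p)/\mu([\mathbf{0},\mathbf{1}]^c)}(\|X\|_{\max}) = \mathrm{VaR}_p(\|X\|_{\max})$
due to the fact that $\mu([\mathbf{0}, \mathbf{1}]^c) = 1$. Thus, for sufficiently small $1 - p$,
$$
\frac{E(X_j \mid X \in r_{B,p}B)}{\mathrm{VaR}_p(\|X\|_{\max})} \le \frac{\alpha}{\alpha - 1}\, \frac{1}{b^*(1, \ldots, 1)}
+ \sum_{\emptyset \ne S \subseteq \{i : i \ne j\}} (-1)^{|S|} \frac{b^*_{\{j\} \cup S}(1, \ldots, 1; C_{\{j\} \cup S}) - \int_0^1 b^*_{\{j\} \cup S}(w_j^{-\alpha}, 1, \ldots, 1; C_{\{j\} \cup S})\, dw_j}{b^*(1, \ldots, 1)} = s_j(b^*, \alpha),
$$
for any $B \in Q_{\|\cdot\|_{\max}}$, where the equality follows from the integration by parts. Therefore,
$$
\mathrm{TCE}_p(X) \supseteq \mathrm{VaR}_p(\|X\|_{\max}) \big( (s_1(b^*, \alpha), \ldots, s_d(b^*, \alpha)) + \mathbb{R}^d_+ \big),
$$
for sufficiently small $1 - p$.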
References
[1] Albrecher, H., Asmussen, S. and Kortschak, D. (2006). Tail asymptotics for the sum of two
heavy-tailed dependent risks. Extremes, 9:107–130.
[2] Alink, S., Löwe, M. and Wüthrich, M. V. (2004). Diversification of aggregate dependent risks.
Insurance: Math. Econom., 35:77–95.
[3] Alink, S., Löwe, M. and Wüthrich, M. V. (2005). Analysis of the expected shortfall of aggregate
dependent risks. ASTIN Bulletin, 35(1):25–43.
[4] Alink, S., Löwe, M. and Wüthrich, M. V. (2007). Diversification for general copula dependence.
Statistica Neerlandica, 61:446–465.
[5] Artzner, P., Delbaen, F., Eber, J.M. and Heath, D. (1999). Coherent measures of risk.
Mathematical Finance, 9:203–228.
[6] Bentahar, I. (2006). Tail conditional expectation for vector-valued risks. Discussion paper
2006-029, http://sfb649.wiwi.hu-berlin.de, Technische Universität Berlin, Germany.
[7] Bingham, N. H., Goldie, C. M. and Teugels, J. L. (1987). Regular Variation. Cambridge
University Press, Cambridge, UK.
[8] Cai, J. and Li, H. (2005). Conditional tail expectations for multivariate phase-type distribu-
tions. J. Appl. Prob. 42:810–825.
[9] Cheridito, P., Delbaen, F. and Klüppelberg, C. (2004). Coherent and convex monetary risk
measures for bounded càdlàg processes. Stochastic Processes and their Applications, 112:1–22.
[10] Coles, S. G. and Tawn, J. A. (1991). Modelling extreme multivariate events. J. R. Statist. Soc.,
B, 53:377–392.
[11] Cook, R.D. and Johnson, M.E. (1981). A family of distributions for modelling non-elliptically
symmetric multivariate data. J. Roy. Statist. Soc. B, 43:210–218.
[12] Delbaen, F. (2002). Coherent risk measures on general probability spaces. Advances in Finance
and Stochastics-Essays in Honour of Dieter Sondermann, Eds. K. Sandmann, P. J.
Schönbucher, Springer-Verlag, Berlin, 1–37.
[13] Embrechts, P., Nešlehová, J. and Wüthrich, M. V. (2009). Additivity properties for value-
at-risk under Archimedean dependence and heavy-tailedness. Insurance: Mathematics and
Economics, 44(2):164–169.
[14] Föllmer, H. and Schied, A. (2002). Convex measures of risk and trading constraints. Finance
and Stochastics, 6:426–447.
[15] Joe, H. (1993). Parametric family of multivariate distributions with given margins. J. Multi-
variate Anal., 46:262–282.
[16] Joe, H. (1997). Multivariate Models and Dependence Concepts. Chapman & Hall, London.
[17] Joe, H. and Hu, T. (1996). Multivariate distributions from mixtures of max-infinitely divisible
distributions. J. Multivariate Anal., 57:240–265.
[18] Joe, H., Li, H. and Nikoloulopoulos, A.K. (2010). Tail dependence functions and vine copulas.
Journal of Multivariate Analysis, 101:252–270.
[19] Joe, H., Smith, R. L. and Weissman, I. (1992). Bivariate threshold methods for extremes. J.
R. Statist. Soc. B, 54:171–183.
[20] Jouini, E., Meddeb, M. and Touzi, N. (2004). Vector-valued coherent risk measures. Finance
and Stochastics 8:531–552.
[21] Klüppelberg, C., Kuhn, G. and Peng, L. (2008). Semi-parametric models for the multivariate
tail dependence function – the asymptotically dependent. Scandinavian Journal of Statistics,
35(4):701–718.
[22] Kortschak, D. and Albrecher, H. (2009). Asymptotic results for the sum of dependent non-
identically distributed random variables. Methodol. Comput. Appl. Probab. 11:279–306.
[23] Kousky, C. and Cooke, R. M. (2009). Climate Change and Risk Management: Challenges for
insurance, adaptation and loss estimation. Discussion paper RFF DP 09-03-Rev, Resources
For the Future (http://www.rff.org/RFF/Documents/).
[24] Landsman Z. and Valdez, E. (2003). Tail conditional expectations for elliptical distributions.
North American Actuarial Journal, 7:55–71.
[25] Li, H. (2009). Orthant tail dependence of multivariate extreme value distributions. Journal of
Multivariate Analysis, 100:243–256.
[26] Li, H. and Sun, Y. (2009). Tail dependence for heavy-tailed scale mixtures of multivariate
distributions. J. Appl. Prob. 46 (4):925–937.
[27] Mardia, K.V. (1962). Multivariate Pareto distributions. Ann. Math. Statist., 33:1008–1015.
[28] McNeil, A. J., Frey, R., Embrechts, P. (2005). Quantitative Risk Management: Concepts,
Techniques, and Tools. Princeton University Press, Princeton, New Jersey.
[29] Nikoloulopoulos, A.K., Joe, H. and Li, H. (2009). Extreme value properties of multivariate t
copulas. Extremes, 12:129–148.
[30] Resnick, S. (1987). Extreme Values, Regular Variation, and Point Processes, Springer, New
York.
[31] Resnick, S. (2007). Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer,
New York.
[32] Sklar, A. (1959). Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statist.
Univ. Paris, 8:229–231.
[33] Takahasi, K. (1965). Note on the multivariate Burr’s distribution. Ann. Inst. Statist. Math.,
17:257–260.