
Journal of Theoretical Probability
https://doi.org/10.1007/s10959-020-01019-8

Tail indices for AX + B Recursion with Triangular Matrices

Muneya Matsui¹ · Witold Świątkowski²

Received: 18 June 2019 / Revised: 10 June 2020
© The Author(s) 2020

Abstract
We study multivariate stochastic recurrence equations (SREs) with triangular matrices. If coefficient matrices of SREs have strictly positive entries, the classical Kesten result says that the stationary solution is regularly varying and the tail indices are the same in all directions. This framework, however, is too restrictive for applications. In order to widen the applicability of SREs, we consider SREs with triangular matrices and we prove that their stationary solutions are regularly varying with component-wise different tail exponents. Several applications to GARCH models are suggested.

Keywords Stochastic recurrence equation · Kesten’s theorem · Regular variation · Multivariate GARCH(1, 1) processes · Triangular matrices

Mathematics Subject Classification (2010) Primary 60G10 · 60G70, Secondary 62M10 · 60H25

1 Introduction

1.1 Results and Motivation

A multivariate stochastic recurrence equation (SRE)

\[ \mathbf{W}_t = \mathbf{A}_t \mathbf{W}_{t-1} + \mathbf{B}_t, \qquad t \in \mathbb{N}, \tag{1.1} \]

Muneya Matsui is partly supported by the JSPS Grant-in-Aid for Young Scientists B (16K16023) and for Scientific Research C (19K11868) and the Nanzan University Pache Research Subsidy I-A-2 for the 2020 academic year. Witold Świątkowski is partly supported by the NCN Grant UMO-2014/15/B/ST1/00060.

Witold Świątkowski (corresponding author)
[email protected]

Muneya Matsui
[email protected]

1 Department of Business Administration, Nanzan University, 18 Yamazato-cho, Showa-ku, Nagoya 466-8673, Japan

2 Institute of Mathematics, University of Wrocław, Pl. Grunwaldzki 2/4, 50-384 Wrocław, Poland


is studied, where A_t is a random d × d matrix with nonnegative entries, B_t is a random vector in ℝ^d with nonnegative components, and (A_t, B_t)_{t∈ℤ} are i.i.d. The sequence (W_t)_{t∈ℕ} generated by iterations of SRE (1.1) is a Markov chain; however, it is not necessarily stationary. Under some mild contractivity and integrability conditions (see e.g., [5,6]), W_t converges in distribution to a random variable W which is the unique solution to the stochastic equation:

\[ \mathbf{W} \stackrel{d}{=} \mathbf{A}\mathbf{W} + \mathbf{B}. \tag{1.2} \]

Here, (A, B) denotes a generic element of the sequence (A_t, B_t)_{t∈ℤ}, which is independent of W, and the equality is meant in distribution. If we put W_0 =_d W, then the sequence (W_t) of (1.1) is stationary. Moreover, under suitable conditions a strictly stationary causal solution (W_t) to (1.1) can be written by the formula:

\[ \mathbf{W}_t = \mathbf{B}_t + \sum_{i=-\infty}^{t} \mathbf{A}_t \cdots \mathbf{A}_i \mathbf{B}_{i-1}, \]

and W_t =_d W for all t ∈ ℤ.

Stochastic iteration (1.1) has already been studied for almost half a century, and it has found numerous applications to financial models (see e.g., Section 4 of [9,12]). Various properties have been investigated. Our particular interest here is the tail behavior of the stationary solution W. The topic is not only interesting in its own right, but it also has applications to, for example, risk management [12,22, Sec. 7.3].

The condition for the stationary solution W to have power-decaying tails dates back to Kesten [19]. Since then, the Kesten theorem and its extensions have been used to characterize tails in various situations. An essential feature of Kesten-type results is that the tail behavior is the same in all coordinates. We are going to call this the Kesten property. However, this property is not necessarily shared by all interesting models; several empirical findings support this fact (see e.g., [17,23,29] for economic data). Therefore, SREs with solutions having more flexible tails are both challenging and desirable in view of applications.

The key assumption implying the Kesten property is an irreducibility condition, and it refers to the law of A. The simplest example where it fails is an SRE with diagonal matrices A = diag(A11, ..., Add). In this case, the solution can exhibit different tail behavior across coordinates, which is not difficult to see from the univariate Kesten result. In the present paper, we consider the particular case of triangular matrices A, which is much more complicated and much more applicable as well. We derive the precise tail asymptotics of the solution in all coordinates. In particular, we show that they may vary across coordinates, though it is also possible that some coordinates share the same tail asymptotics.

More precisely, let A be a nonnegative upper triangular matrix such that Aii > 0 and Aij = 0 for i > j a.s. Suppose that E[Aii^{αi}] = 1 holds for some αi > 0 in each coordinate i, and that αi ≠ αj for i ≠ j. We prove that, as x → ∞,

\[ P(W_i > x) \sim c_i x^{-\tilde\alpha_i}, \qquad c_i > 0, \ i = 1, \ldots, d, \tag{1.3} \]

for exponents α̃i > 0 depending on αi, ..., αd. Here and in what follows, the notation ‘∼’ means that the quotient of the left- and right-hand sides tends to 1 as x → ∞. For d = 2, the result (1.3) was proved in [10] with indices α̃1 = min(α1, α2) and α̃2 = α2. The dependence of α̃1 on α1, α2 comes from the SRE: W_{1,t} = A_{11,t}W_{1,t−1} + A_{12,t}W_{2,t−1} + B_{1,t}, where W_{1,t} is influenced by W_{2,t}.¹ The same dependence holds in any dimension, since W_i depends on W_{i+1}, ..., W_d in a similar but more complicated manner. In order to prove (1.3), we clarify this dependence with new notions and apply an induction scheme effectively.
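For intuition, the d = 2 case is easy to simulate. The following minimal sketch (ours, not from the paper; the lognormal diagonal entries, the uniform A12 and B, and all parameter values are illustrative assumptions) iterates the SRE and applies a Hill estimator to each coordinate. With α1 = 4 and α2 = 2, one expects estimates near α̃1 = min(α1, α2) = 2 and α̃2 = 2.

import numpy as np

rng = np.random.default_rng(0)

def simulate_sre(T=200_000, burn=1_000):
    # Iterate W_t = A_t W_{t-1} + B_t with upper triangular 2x2 A_t.
    # Lognormal(mu, sigma) diagonal with mu = -sigma^2 * alpha / 2 gives
    # E[A_ii^alpha] = 1; here alpha_1 = 4 and alpha_2 = 2 (illustrative).
    W = np.zeros(2)
    out = np.empty((T, 2))
    for t in range(T + burn):
        a11 = rng.lognormal(mean=-0.5, sigma=0.5)   # E[A11^4] = 1
        a22 = rng.lognormal(mean=-0.25, sigma=0.5)  # E[A22^2] = 1
        a12 = rng.uniform(0.0, 0.5)
        b = rng.uniform(0.0, 1.0, size=2)
        W = np.array([a11 * W[0] + a12 * W[1] + b[0], a22 * W[1] + b[1]])
        if t >= burn:
            out[t - burn] = W
    return out

def hill(x, k=1000):
    # Hill estimator of the tail index from the k largest observations.
    xs = np.sort(x)
    return k / np.sum(np.log(xs[-k:] / xs[-k - 1]))

W = simulate_sre()
print("coordinate 1:", hill(W[:, 0]))  # expect ~ min(4, 2) = 2
print("coordinate 2:", hill(W[:, 1]))  # expect ~ 2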

The structure of the paper is as follows. In Sect. 2, we state preliminary assumptions and we show the existence of stationary solutions to SRE (1.1). Section 3 contains the main theorem together with its proof. The proof follows by induction, and it requires several preliminary results; these are described in Sects. 4 and 5. Applications to GARCH models are suggested in Sect. 6. Section 7 contains a discussion of the tail constants and of open problems related to the subject of the paper.

1.2 Kesten’s Condition and Previous Results

We state Kesten’s result and briefly review the literature. We prepare the function

\[ h(s) = \inf_{n \in \mathbb{N}} \left( \mathbb{E}\|\Pi_n\|^s \right)^{1/n} \quad \text{with } \Pi_n = \mathbf{A}_1 \cdots \mathbf{A}_n, \]

where ‖·‖ is a matrix norm. The tail behavior of W is determined by h. We assume that there exists a unique α > 0 such that h(α) = 1. The crucial assumption of Kesten is irreducibility. It can be described in several ways, none of which seems simple and intuitive. We are going to state a weaker property, which is much easier to understand. For any matrix Y, we write Y > 0 if all its entries are positive. The irreducibility assumption yields that for some n ∈ ℕ

\[ P(\Pi_n > 0) > 0. \tag{1.4} \]

Then, the tail of W is essentially heavier than that of ‖A‖: it decays polynomially even if ‖A‖ is bounded. Indeed, there exists a function e_α on the unit sphere S^{d−1} such that

\[ \lim_{x\to\infty} x^{\alpha} P(\mathbf{y}'\mathbf{W} > x) = e_\alpha(\mathbf{y}), \qquad \mathbf{y} \in S^{d-1}, \tag{1.5} \]

and e_α(y) > 0 for y ∈ S^{d−1} ∩ [0,∞)^d, where y′W = Σ_{j=1}^d y_j W_j denotes the inner product. If α ∉ ℕ, then (1.5) implies multivariate regular variation of W, while if α ∈ ℕ, the same holds under some additional conditions (see [9, Appendix C]). Here,

¹ In the case α1 ≠ α2, the possibility of different tail indices was already suggested in Matsui and Mikosch [23].


we say that a d-dimensional r.v. X is multivariate regularly varying with index α if

\[ \frac{P(|\mathbf{X}| > ux, \ \mathbf{X}/|\mathbf{X}| \in \cdot)}{P(|\mathbf{X}| > x)} \xrightarrow{v} u^{-\alpha} P(\Theta \in \cdot), \qquad u > 0, \tag{1.6} \]

where →_v denotes vague convergence and Θ is a random vector on the unit sphere S^{d−1} = {x ∈ ℝ^d : |x| = 1}. This is a common tool for describing multivariate power tails; see [3,26,27] or [9, p. 279].

We proceed to the literature after Kesten. Later on, an analogous result was proved by Alsmeyer and Mentemeier [1] for invertible matrices A with some irreducibility and density conditions (see also [20]). The density assumption was removed by Guivarc’h and Le Page [16], who developed the most general approach to (1.1) with signed A having a possibly singular law. Moreover, their conclusion was stronger; namely, they obtained the existence of a measure μ on ℝ^d being the weak limit of

\[ x^{\alpha} P(x^{-1}\mathbf{W} \in \cdot) \quad \text{as } x \to \infty, \tag{1.7} \]

which is equivalent to (1.6). As a related work, in [8] the existence of the limit (1.7) was proved under the assumption that A is a similarity.² See [9] for the details of Kesten’s result and other related results.

For all the matrices A considered above, we have the same tail behavior in all directions, one of the reasons being a certain irreducibility or homogeneity of the operations generated by the support of the law of A. This is not the case for triangular matrices, since (1.4) does not hold then. Research in such directions would be a next natural step.³ In this paper, we work out the case when the indices α1, ..., αd > 0 satisfying E[Aii^{αi}] = 1 are all different. However, there remain quite a few problems to be settled, e.g., what happens when they are not necessarily different. This is not clear even in the case of 2 × 2 matrices. The particular case A11 = A22 > 0 and A12 ∈ ℝ was studied in [11], where the result is

\[ P(W_1 > x) \sim \begin{cases} C x^{-\alpha_1} (\log x)^{\alpha_1} & \text{if } \mathbb{E}[A_{12} A_{11}^{\alpha_1 - 1}] \neq 0, \\ C x^{-\alpha_1} (\log x)^{\alpha_1/2} & \text{if } \mathbb{E}[A_{12} A_{11}^{\alpha_1 - 1}] = 0, \end{cases} \]

where C > 0 is a constant which may differ from line to line. Our conjecture in the d (> 1)-dimensional case is that

\[ P(W_i > x) \sim C x^{-\tilde\alpha_i} \ell_i(x), \qquad i = 1, \ldots, d, \]

for some slowly varying functions ℓ_i, and obtaining the optimal ℓ_i’s would be a real future challenge.

² A is a similarity if for every x ∈ ℝ^d, |Ax| = ‖A‖ |x|.

³ See also [7,8] and [9, Appendix D] for diagonal matrices.


2 Preliminaries and Stationarity

We consider d × d random matrices A = [Aij]_{i,j=1}^d and d-dimensional random vectors B that satisfy a set of assumptions, condition (T) = (T-1)–(T-7).

Note that (T-1) and (T-4) imply that

\[ P(A_{ii} > 0) > 0 \tag{2.1} \]

for i = 1, ..., d. For further convenience, any r.v. satisfying inequality (2.1) will be called positive. Most of the conditions are similar to those needed for applying the Kesten–Goldie result (see [9]).

Let (A_t, B_t)_{t∈ℤ} be an i.i.d. sequence with generic element (A, B). We define the products

\[ \Pi_{t,s} = \begin{cases} \mathbf{A}_t \cdots \mathbf{A}_s, & t \geq s, \\ \mathrm{Id}, & t < s, \end{cases} \]

where Id denotes the d × d identity matrix. In the case t = 0, we are going to use the simplified notation

\[ \Pi_n := \Pi_{0,-n+1}, \qquad n \in \mathbb{N}, \]

so that Π_0 = Id. For any t ∈ ℤ and n ∈ ℕ, the products Π_{t,t−n+1} and Π_n have the same distribution.

Let ‖·‖ be the operator norm of a matrix: ‖M‖ = sup_{|x|=1} |Mx|, where |·| is the Euclidean norm of a vector. The top Lyapunov exponent associated with A is defined by

\[ \gamma_{\mathbf{A}} = \inf_{n \geq 1} \frac{1}{n} \mathbb{E}[\log \|\Pi_n\|]. \]

Notice that in the univariate case A = A ∈ ℝ and γ_A = E[log |A|]. If γ_A < 0, then the equation

\[ \mathbf{W} \stackrel{d}{=} \mathbf{A}\mathbf{W} + \mathbf{B} \tag{2.2} \]


has a unique solution W which is independent of (A, B). Equivalently, the stochastic recurrence equation

\[ \mathbf{W}_t = \mathbf{A}_t \mathbf{W}_{t-1} + \mathbf{B}_t \tag{2.3} \]

has a unique stationary solution with W_t =_d W. Then we can write the stationary solution as a series:

\[ \mathbf{W}_t = \sum_{n=0}^{\infty} \Pi_{t,t-n+1} \mathbf{B}_{t-n}. \tag{2.4} \]

Indeed, it is easily checked that the process (W_t)_{t=0}^∞ defined by (2.4) is stationary and solves (2.3), provided the series on the right-hand side of (2.4) converges for every t ∈ ℕ. The convergence is ensured if the top Lyapunov exponent is negative (for the proof, see [5]). Negativity of γ_A follows from condition (T); we provide a proof in Appendix A.

From now on, W = (W_1, ..., W_d) will always denote the solution to (2.2), and by W_t = (W_{1,t}, ..., W_{d,t}) we mean the stationary solution to (2.3). Since they are equal in distribution, when we consider distributional properties we sometimes switch between the two notations, but this causes no confusion.
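As a numerical companion to this section (our own sketch, not part of the paper), the top Lyapunov exponent can be approximated by Monte Carlo: simulate the products Π_n, average log‖Π_n‖/n, and check that the estimate is negative, which is the condition guaranteeing convergence of (2.4). The matrix distribution below is the same illustrative assumption as in the earlier sketch.

import numpy as np

rng = np.random.default_rng(1)

def draw_A():
    # One illustrative upper triangular coefficient matrix (our assumption).
    return np.array([[rng.lognormal(-0.5, 0.5), rng.uniform(0.0, 0.5)],
                     [0.0, rng.lognormal(-0.25, 0.5)]])

def lyapunov_estimate(n=2000, reps=20):
    # gamma_A = inf_n E[log ||Pi_n||]/n, approximated at a single large n;
    # the running product is renormalized to avoid over/underflow.
    vals = []
    for _ in range(reps):
        log_norm, M = 0.0, np.eye(2)
        for _ in range(n):
            M = draw_A() @ M
            s = np.linalg.norm(M, 2)  # spectral norm
            log_norm += np.log(s)
            M /= s
        vals.append(log_norm / n)
    return float(np.mean(vals))

print("estimated gamma_A:", lyapunov_estimate())  # negative => (2.4) converges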

3 The Main Result and the Proof of the Main Part

3.1 The Main Result

We are going to determine the component-wise tail indices of the solution W to the stochastic equation (2.2). Since A is upper triangular, the tail behavior of W_i is affected only by the W_j for j > i, but not necessarily all of them: we allow some entries of A above the diagonal to vanish a.s. In the extreme case of a diagonal matrix, the tail of each coordinate may be determined independently of the others. On the other hand, if A has no zeros above the diagonal, every coordinate is affected by all the subsequent ones. In order to describe this phenomenon precisely, we define a partial order relation on the set of coordinates. It clarifies the interactions between them.

Definition 3.1 For i, j ∈ {1, ..., d}, we say that i directly depends on j, and write i ⪯ j, if Aij is positive (in the sense of (2.1)). We further write i ≺ j if i ⪯ j and i ≠ j.

Observe that i ⪯ j implies i ≤ j since A is upper triangular, while i ⪯ i follows from the positivity of the diagonal entries. We extract each component of SRE (2.3) and may write

\[ W_{i,t} = \sum_{j=1}^{d} A_{ij,t} W_{j,t-1} + B_{i,t} = \sum_{j :\, j \succeq i} A_{ij,t} W_{j,t-1} + B_{i,t}, \tag{3.1} \]


where in the latter sum all coefficients A_{ij,t} are positive. From this, we obtain the component-wise SRE in the form

\[ W_{i,t} = A_{ii,t} W_{i,t-1} + \sum_{j :\, j \succ i} A_{ij,t} W_{j,t-1} + B_{i,t} = A_{ii,t} W_{i,t-1} + D_{i,t}, \tag{3.2} \]

where

\[ D_{i,t} = \sum_{j :\, j \succ i} A_{ij,t} W_{j,t-1} + B_{i,t}. \tag{3.3} \]

Therefore, the tail of W_{i,t} is determined by comparing the autoregressive behavior, characterized by the index αi, with the tail behavior of D_{i,t}, which depends on the indices αj, j ≻ i. To clarify this, we define recursively new exponents α̃i, where i decreases from d to 1:

\[ \tilde\alpha_i = \alpha_i \wedge \min\{\tilde\alpha_j : i \prec j\}. \tag{3.4} \]

If there is no j such that i ≺ j, then we set α̃i = αi. In particular, α̃d = αd. Notice that in general α̃i ≠ min{αj : i ≤ j}. Depending on the zeros of A, both α̃i = min{αj : i ≤ j} and α̃i > min{αj : i ≤ j} are possible for any i < d; see Example 3.4.

For further convenience, we also introduce a modified, transitive version of the relation ⪯.

Definition 3.2 We say that i depends on j, and write i ⊴ j, if there exist m ∈ ℕ and a sequence (i(k))_{0≤k≤m} such that i = i(0) ≤ ⋯ ≤ i(m) = j and A_{i(k)i(k+1)} is positive for k = 0, ..., m − 1. We write i ⊲ j if i ⊴ j and i ≠ j.

Equivalently, the condition on the sequence (i(k))_{0≤k≤m} can be presented in the form i ⪯ i(1) ⪯ ⋯ ⪯ i(m) = j. In particular, i ⪯ j implies i ⊴ j. Now, we can write

\[ \tilde\alpha_i = \min\{\alpha_j : i \trianglelefteq j\}, \tag{3.5} \]

which is equivalent to (3.4), though it has a more convenient form.

Although Definitions 3.1 and 3.2 are quite similar, there is a significant difference between them. To illustrate the difference, we introduce the following notation for the entries of Π_{t,s}:

\[ (\Pi_{t,s})_{ij} =: \pi_{ij}(t,s). \]

If t = 0, we use the simplified form

\[ \pi_{ij}(n) := \pi_{ij}(0, -n+1), \qquad n \in \mathbb{N}. \]

By i ⪯ j, we mean that the entry Aij of the matrix A is positive, while i ⊴ j means that for some n ∈ ℕ the corresponding entry πij(n) of the matrix Π_n is positive.


The former relation gives a stronger condition. On the other hand, the latter is more convenient when products of matrices are considered, especially when transitivity plays a role. Example 3.4 gives a deeper insight into the difference between the two relations. Throughout the paper, both of them are exploited.

Now, we are ready to formulate the main theorem.

Theorem 3.3 Suppose that condition (T) is satisfied for a random matrix A. Let W be the solution to (2.2). Then, there exist strictly positive constants C1, ..., Cd such that

\[ \lim_{x\to\infty} x^{\tilde\alpha_i} P(W_i > x) = C_i, \qquad i = 1, \ldots, d. \tag{3.6} \]

The following example gives some intuition of what the statement of the theorem means in practice.

Example 3.4 Let

\[ \mathbf{A} = \begin{pmatrix} A_{11} & A_{12} & 0 & 0 & A_{15} \\ 0 & A_{22} & 0 & A_{24} & 0 \\ 0 & 0 & A_{33} & 0 & A_{35} \\ 0 & 0 & 0 & A_{44} & 0 \\ 0 & 0 & 0 & 0 & A_{55} \end{pmatrix}, \]

where P(Aij > 0) > 0 for the entries displayed and all other entries are zero a.s. Suppose that α4 < α3 < α2 < α5 < α1 and that A satisfies the assumptions of Theorem 3.3. Then, α̃1 = α4, α̃2 = α4, α̃3 = α3, α̃4 = α4 and α̃5 = α5.

We explain the example step by step. The last coordinate W5 solves a one-dimensional SRE, so its tail index is α5. Since there is no j such that 4 ≺ j, α4 is the tail index of W4. For the third coordinate, the situation is different: we have 3 ≺ 5, so the tail of W3 depends on W5. But α3 < α5, so the influence of W5 is negligible and we obtain the tail index α̃3 = α3. Conversely, the relations 2 ≺ 4 and α2 > α4 imply α̃2 = α4. The first coordinate directly depends on the second and fifth, but recall that the second one also depends on the fourth. Hence, we have to compare α1 with α2, α4 and α5. The smallest one is α4; hence, α̃1 = α4 is the tail index. Although the dependence of W1 on W4 is indirect, we see it in the relation 1 ⊲ 4.
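The recursion (3.4)/(3.5) is easy to mechanize: α̃i is the minimum of αj over all j reachable from i along positive entries. The following sketch (ours; any concrete αi values consistent with the example’s ordering are assumptions) reproduces Example 3.4 by a backward pass over the dependency pattern.

def tilde_alpha(pattern, alpha):
    # tilde_alpha_i = min{alpha_j : j reachable from i}, computed by the
    # backward pass of recursion (3.4): process i = d, ..., 1 and take the
    # minimum over the direct dependencies i -> j (pattern[i][j] true).
    ta = list(alpha)
    for i in reversed(range(len(alpha))):
        for j in range(i + 1, len(alpha)):
            if pattern[i][j]:
                ta[i] = min(ta[i], ta[j])
    return ta

# Positive-entry pattern of Example 3.4 (True where P(A_ij > 0) > 0).
pattern = [
    [True,  True,  False, False, True ],
    [False, True,  False, True,  False],
    [False, False, True,  False, True ],
    [False, False, False, True,  False],
    [False, False, False, False, True ],
]
alpha = [5.0, 3.0, 2.0, 1.0, 4.0]  # any values with a4 < a3 < a2 < a5 < a1
print(tilde_alpha(pattern, alpha))  # [1.0, 1.0, 2.0, 1.0, 4.0], as in the example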

3.2 Proof of the Main Result

Since the proof includes several preliminary results which are long and technical, they are postponed to Sects. 4 and 5. To make the argument more readable, we provide an outline of the proof here, referring to those auxiliary results. In the proof of Theorem 3.3, we fix the coordinate number k and consider two cases: α̃k = αk and α̃k < αk.

In the first case, α̃k = αk, the proof is based on Goldie’s result [15, Theorem 2.3] and is contained in Lemmas 4.2 and 4.3.

If α̃k < αk, the proof is more complicated and requires several auxiliary results. We proceed by induction. Let j0 be the maximal coordinate among the j ⊵ k such that the modified tail indices α̃j and α̃k are equal. Then clearly, α̃k = α̃_{j0} = α̃ℓ for any ℓ such that k ⊴ ℓ ⊴ j0. We start with the maximal among such coordinates (in the standard order on ℕ) and prove (3.6) for it. Inductively, we reduce ℓ using results for larger tail indices, and finally we reach ℓ = k.

We develop a component-wise series representation (5.4) for W_0. In Lemma 5.1, we prove that it indeed coincides a.s. with (2.4).

The series (5.4) is decomposed into parts step by step (Lemmas 5.4, 5.9 and 5.10). Finally, we obtain the following expression:

\[ W_{\ell,0} = \pi_{\ell j_0}(s) W_{j_0,-s} + R_{\ell j_0}(s). \tag{3.7} \]

Our goal is to prove that, as s → ∞, the tail asymptotics of the first term π_{ℓj0}(s)W_{j0,−s} approaches that of (3.6), while the second term R_{ℓj0}(s) becomes negligible. The quantity R_{ℓj0}(s) is defined inductively by (5.31) and (5.39) as a finite collection of negligible parts, each estimated in a different way. In the process of identifying the dominant term, we simultaneously settle the upper bounds for the negligible parts. This is done through (5.15) in Lemmas 5.4 and 5.3, and through (5.29) under the conditions of Lemmas 5.9 and 5.10. In the end, R_{ℓj0}(s) is proved to be negligible as s → ∞.

The final step relies on Breiman’s lemma applied to π_{ℓj0}(s)W_{j0,−s}. By independence of π_{ℓj0}(s) and W_{j0,−s}, we obtain the asymptotic equality

\[ P(\pi_{\ell j_0}(s) W_{j_0,-s} > x) \sim \mathbb{E}[\pi_{\ell j_0}(s)^{\alpha_{j_0}}] \, P(W_{j_0,-s} > x) \]

for fixed s. Recall that we need to let s → ∞ for R_{ℓj0}(s) to be negligible. The existence of lim_{s→∞} E[π_{ℓj0}(s)^{α_{j0}}] is assured by Lemma 5.7. Eventually, we get P(W_{ℓ,0} > x) ∼ C · P(W_{j0,−s} > x) for a positive constant C, which can be explicitly computed. Now, (3.6) follows from stationarity and Lemma 4.3.

We move to the proof of Theorem 3.3.

Proof of Theorem 3.3 Fix k ≤ d. If α̃k = αk, then the statement of the theorem directly follows from Lemma 4.3. If α̃k < αk, then there is a unique j0 ⊵ k such that α̃k = α_{j0} = α̃_{j0}. Another application of Lemma 4.3 proves that

\[ \lim_{x\to\infty} x^{\alpha_{j_0}} P(W_{j_0} > x) = C_{j_0} \tag{3.8} \]

for some constant C_{j0} > 0. By stationarity, we set t = 0 without loss of generality. The proof follows by induction with respect to the number j in the backward direction; namely, we start with ℓ = j1 := max{j : k ⊴ j ⊲ j0} and reduce ℓ until ℓ = k. Notice that j1 ≺ j0, and thus ℓ = j1 satisfies the conditions of Lemma 5.10. Thus, there exists s0 such that for any s > s0 we have

\[ W_{\ell,0} = \pi_{\ell j_0}(s) W_{j_0,-s} + R_{\ell j_0}(s), \tag{3.9} \]

where π_{ℓj0}(s) is independent of W_{j0,−s}, 0 < E[π_{ℓj0}(s)^{α_{j0}}] < ∞, and R_{ℓj0}(s) satisfies (5.29). We are going to estimate P(W_{ℓ,0} > x) from below and above. Since R_{ℓj0}(s) ≥ 0 a.s.,

\[ P(W_{\ell,0} > x) \geq P(\pi_{\ell j_0}(s) W_{j_0,-s} > x) \]


holds, and by Breiman’s lemma [9, Lemma B.5.1], for fixed s > s0,

\[ \lim_{x\to\infty} \frac{P(\pi_{\ell j_0}(s) W_{j_0,-s} > x)}{P(W_{j_0,-s} > x)} = \mathbb{E}[\pi_{\ell j_0}(s)^{\alpha_{j_0}}] =: u_\ell(s). \]

Combining these with (3.8), we obtain the lower estimate

\[ \liminf_{x\to\infty} x^{\alpha_{j_0}} P(W_{\ell,0} > x) \geq u_\ell(s) \cdot C_{j_0}. \tag{3.10} \]

Now, we pass to the upper estimate. Recall (5.29) in Lemma 5.10, which implies that for any δ ∈ (0, 1) and ε > 0 there exists s1 (> s0) such that for s > s1

\[ \limsup_{x\to\infty} x^{\alpha_{j_0}} P\big( R_{\ell j_0}(s) > \delta x \big) = \delta^{-\alpha_{j_0}} \limsup_{x\to\infty} (\delta x)^{\alpha_{j_0}} P\big( R_{\ell j_0}(s) > \delta x \big) < \varepsilon. \]

Then, for fixed s ≥ s1, we apply Lemma B.1, which is a version of Breiman’s lemma, to (3.9) and obtain

\[
\begin{aligned}
\limsup_{x\to\infty} x^{\alpha_{j_0}} P(W_{\ell,0} > x) &= \limsup_{x\to\infty} x^{\alpha_{j_0}} P\big( \pi_{\ell j_0}(s) W_{j_0,-s} + R_{\ell j_0}(s) > x \big) \\
&\leq \limsup_{x\to\infty} x^{\alpha_{j_0}} P\big( \pi_{\ell j_0}(s) W_{j_0,-s} > x(1-\delta) \big) + \limsup_{x\to\infty} x^{\alpha_{j_0}} P\big( R_{\ell j_0}(s) > \delta x \big) \\
&< \limsup_{x\to\infty} x^{\alpha_{j_0}} P\Big( \frac{\pi_{\ell j_0}(s)}{1-\delta} W_{j_0,-s} > x \Big) + \varepsilon \\
&= \mathbb{E}[\pi_{\ell j_0}(s)^{\alpha_{j_0}}] (1-\delta)^{-\alpha_{j_0}} \lim_{x\to\infty} x^{\alpha_{j_0}} P(W_{j_0,-s} > x) + \varepsilon \\
&= (1-\delta)^{-\alpha_{j_0}} u_\ell(s) \cdot C_{j_0} + \varepsilon, \tag{3.11}
\end{aligned}
\]

where we also use (3.8). We may let ε → 0 and δ → 0 together with s → ∞ here and in (3.10). The existence and positivity of the limit u_ℓ = lim_{s→∞} u_ℓ(s) is assured by Lemma 5.7. Thus, from (3.10) and (3.11) we have

\[ u_\ell \cdot C_{j_0} = \lim_{s\to\infty} u_\ell(s) \cdot C_{j_0} \leq \liminf_{x\to\infty} x^{\alpha_{j_0}} P(W_{\ell,0} > x) \leq \limsup_{x\to\infty} x^{\alpha_{j_0}} P(W_{\ell,0} > x) \leq \lim_{s\to\infty} u_\ell(s) \cdot C_{j_0} = u_\ell \cdot C_{j_0}. \]

This implies that

\[ \lim_{x\to\infty} x^{\tilde\alpha_\ell} P(W_{\ell,0} > x) = u_\ell \cdot C_{j_0} =: C_\ell. \tag{3.12} \]

Now, we go back to the induction process. If j1 = k, then the proof is over; if j1 ≠ k, we set j2 = max{j : k ⊴ j ⊲ j0, j ≠ j1}. Then, there are two possibilities, depending on whether j2 ≺ j1 or not. If j2 ⊀ j1, then the assumptions of Lemma 5.10 are satisfied with ℓ = j2, and we repeat the argument that we used for j1. If j2 ≺ j1, the assumptions of Lemma 5.9 are fulfilled with ℓ = j2. Since the assertion of Lemma 5.9 is the same as that of Lemma 5.10, we can again repeat the argument that we used for j1.

In general, we define j_{m+1} = max({j : k ⊴ j ⊲ j0} \ {j1, ..., jm}). If j_{m+1} ≺ j for some j ∈ {j1, ..., jm}, then we use Lemma 5.9 with ℓ = j_{m+1}. Otherwise, we use Lemma 5.10 with the same ℓ. Then, by induction we prove (3.12) for every ℓ : k ⊴ ℓ ⊴ j0; in particular, in the end we also obtain

\[ C_k = \lim_{x\to\infty} x^{\alpha_{j_0}} P(W_k > x). \qquad \square \]

Notice that we have two limit operations, with respect to x and s, and the limit with respect to x always comes first. We cannot exchange the limits; namely, we have to let s → ∞ with s depending on x.

4 Case α̃k = αk

The assumption α̃k = αk means that the tail behavior of Wk is determined by its autoregressive property; namely, the tail index is the same as that of the solution V of the stochastic equation V =_d A_kk V + B_k. The tails of the other coordinates on which k depends are of smaller order, which is rigorously shown in Lemma 4.2. In Lemma 4.3, we obtain the tail index of Wk by applying Goldie’s result. By Lemma 4.2, we observe that the perturbation induced by random elements other than those of the kth coordinate (A_kk, B_k) is negligible.

In what follows, we work on a partial sum of the stationary solution (2.4) component-wise. To write the coordinates of the partial sum directly, the following definition is useful.

Definition 4.1 For m ∈ ℕ and 1 ≤ i, j ≤ d, let H_m(i, j) be the set of all sequences of m + 1 indices (h(k))_{0≤k≤m} such that h(0) = i, h(m) = j and h(k) ⪯ h(k+1) for k = 0, ..., m − 1. For convenience, the elements of H_m(i, j) will be denoted by h.

Notice that each such sequence is non-decreasing since A is upper triangular. Moreover, H_m(i, j) is nonempty if and only if i ⊴ j and m is large enough. Now, we can write

\[ \pi_{ij}(s) = \sum_{h \in H_s(i,j)} \prod_{p=0}^{s-1} A_{h(p)h(p+1),-p}. \tag{4.1} \]

A similar expression for πij(t, s) can be obtained by shifting the time indices:

\[ \pi_{ij}(t,s) = \sum_{h \in H_{t-s+1}(i,j)} \prod_{p=0}^{t-s} A_{h(p)h(p+1),t-p}, \qquad t \geq s, \]

which will be used later. Since the sum is finite, it follows from condition (T-5) that E[πij(s)^{αi}] < ∞ for any i, j. Moreover, if i ⊴ j, then P(πij(s) > 0) > 0 for s large enough (in particular, s ≥ j − i is sufficient). By definition, πij(s) is independent of W_{j,−s}. Notice that when i = ℓ, j = j0 and ℓ ⊴ j0, it is the coefficient in our target representation (3.7).
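Identity (4.1) is nothing but the expansion of the (i, j) entry of Π_s = A_0 A_{−1} ⋯ A_{−s+1} over admissible paths; the zero pattern of the triangular matrices kills every non-monotone path, leaving exactly H_s(i, j). A small sketch (ours; the dimension, s and the entry distributions are arbitrary choices) checks the path expansion against the direct matrix product.

import itertools
import numpy as np

rng = np.random.default_rng(3)
d, s = 3, 4

# A_0, A_{-1}, ..., A_{-s+1}: independent upper triangular samples.
As = [np.triu(rng.uniform(0.1, 1.0, (d, d))) for _ in range(s)]

def pi_entry_by_paths(i, j):
    # Sum over index paths h(0) = i, ..., h(s) = j of prod_p A_{h(p)h(p+1),-p}.
    # Terms with a non-monotone path vanish since the matrices are triangular,
    # so effectively only h in H_s(i, j) contribute -- identity (4.1).
    total = 0.0
    for mid in itertools.product(range(d), repeat=s - 1):
        h = (i,) + mid + (j,)
        term = 1.0
        for p in range(s):
            term *= As[p][h[p], h[p + 1]]
        total += term
    return total

Pi_s = np.linalg.multi_dot(As)  # Pi_s = A_0 A_{-1} ... A_{-s+1}
print(Pi_s[0, 2], pi_entry_by_paths(0, 2))  # agree up to rounding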

Lemma 4.2 For any coordinate i ∈ {1, ..., d}, E[W_i^α] < ∞ if 0 < α < α̃i.

Proof For fixed i, let us approximate W_0 by partial sums of the series (2.4). We denote S_n = Σ_{m=0}^n Π_m B_{−m} and S_n = (S_{1,n}, ..., S_{d,n}). We then have

\[ S_{i,n} = \sum_{m=0}^{n} \sum_{j :\, j \trianglerighteq i} \pi_{ij}(m) B_{j,-m} = \sum_{m=1}^{n} \sum_{j :\, j \trianglerighteq i} \sum_{h \in H_m(i,j)} \Big( \prod_{p=0}^{m-1} A_{h(p)h(p+1),-p} \Big) B_{j,-m} + B_{i,0}. \tag{4.2} \]

Suppose that α ≤ 1. Then, by subadditivity of x ↦ x^α and independence of A_{h(p)h(p+1),−p}, p = 0, ..., m − 1, and B_{j,−m},

\[ \mathbb{E} S_{i,n}^{\alpha} \leq \sum_{m=1}^{n} \sum_{j :\, j \trianglerighteq i} \sum_{h \in H_m(i,j)} \Big( \prod_{p=0}^{m-1} \mathbb{E} A_{h(p)h(p+1),-p}^{\alpha} \Big) \mathbb{E} B_{j,-m}^{\alpha} + \mathbb{E} B_{i,0}^{\alpha}. \]

To estimate the right-hand side, we need to estimate the number of elements of H_m(i, j). To see the convergence of the series (4.2), it suffices to consider m > 2d. Recall that the sequences in H_m(i, j) are non-decreasing; thus, for a fixed j there are at most j − i non-diagonal terms in each product on the right-hand side of (4.2). The non-diagonal terms in the product coincide with the time indices t for which the values h(t − 1) and h(t) differ. If there are exactly j − i such time indices, then the values are uniquely determined. There are \binom{m}{j-i} possibilities of placing these moments among the m terms of the sequence, and \binom{m}{j-i} < \binom{m}{d} since m > 2d and j − i < d. If we have l < j − i non-diagonal elements in the product, then there are \binom{m}{l} possibilities of placing them among the other terms, and there are fewer than \binom{j-i}{l} possible sets of l values. Hence, we have at most \binom{m}{l}\binom{j-i}{l} < \binom{m}{d}\, d! sequences for a fixed l. Moreover, there are j − i < d possible values of l, and hence there are at most d · d! · \binom{m}{d} sequences in H_m(i, j). Since \binom{m}{d} < m^d/d!, the number of sequences in H_m(i, j) is further bounded by d m^d.

Now, recall that there is ρ < 1 such that E[A_jj^α] ≤ ρ for each j ⊵ i, and that there is a uniform bound M such that E[A_jl^α] < M and E[B_j^α] < M whenever j ⊵ i, for any l.


It follows that

\[
\begin{aligned}
\mathbb{E} S_{i,n}^{\alpha} &\leq C + \sum_{m=2d}^{n} \sum_{j :\, j \trianglerighteq i} \sum_{h \in H_m(i,j)} M^{d+1} \rho^{m-d} \\
&\leq C + \sum_{m=2d}^{n} \sum_{j :\, j \trianglerighteq i} d M^{d+1} \rho^{-d} \cdot m^d \rho^m \\
&\leq C + \sum_{m=2d}^{n} d^2 M^{d+1} \rho^{-d} \cdot m^d \rho^m \\
&= C + d^2 M^{d+1} \rho^{-d} \sum_{m=2d}^{n} m^d \rho^m < \infty \tag{4.3}
\end{aligned}
\]

uniformly in n, with C > 0 bounded from above. Hence, the limit S_i = lim_{n→∞} S_{i,n} exists a.s. and E[S_i^α] < ∞. By (2.4), we have W_0 = lim_{n→∞} S_n a.s., and we conclude that E[W_{i,0}^α] < ∞.

If α > 1, then by Minkowski’s inequality we obtain

\[ \left( \mathbb{E} S_{i,n}^{\alpha} \right)^{1/\alpha} \leq C' + \sum_{m=2d}^{n} \sum_{j :\, j \trianglerighteq i} \sum_{h \in H_m(i,j)} \Big( \prod_{p=0}^{m-1} \big( \mathbb{E} A_{h(p)h(p+1),-p}^{\alpha} \big)^{1/\alpha} \Big) \big( \mathbb{E} B_{j,-m}^{\alpha} \big)^{1/\alpha} \]

with C′ > 0. The same argument as above shows the uniform convergence. Thus, the conclusion follows. □

Suppose that we have α̃k = αk. This implies that αk < α̃j for each j with k ⊲ j and hence, by Lemma 4.2, E[W_j^{αk}] < ∞. The next lemma proves the assertion of the main theorem in the case α̃k = αk.

Lemma 4.3 Suppose the assumptions of Theorem 3.3 are satisfied and let k ≤ d. Provided that α̃k = αk, there exists a positive constant Ck such that

\[ \lim_{x\to\infty} x^{\alpha_k} P(W_k > x) = C_k. \tag{4.4} \]

Proof We are going to use Theorem 2.3 of Goldie [15], which asserts that if

\[ \int_0^{\infty} \big| P(W_k > x) - P(A_{kk} W_k > x) \big| x^{\alpha_k - 1} \, dx < \infty, \tag{4.5} \]

then

\[ \lim_{x\to\infty} x^{\alpha_k} P(W_k > x) = C_k, \tag{4.6} \]

where

\[ C_k = \frac{1}{\mathbb{E}[A_{kk}^{\alpha_k} \log A_{kk}]} \int_0^{\infty} \big( P(W_k > x) - P(A_{kk} W_k > x) \big) x^{\alpha_k - 1} \, dx. \tag{4.7} \]


To prove (4.5), we are going to use Lemma 9.4 from [15], which yields the equality

\[ \int_0^{\infty} \big| P(W_{k,0} > x) - P(A_{kk,0} W_{k,-1} > x) \big| x^{\alpha_k - 1} \, dx = \frac{1}{\alpha_k} \mathbb{E}\big| W_{k,0}^{\alpha_k} - (A_{kk,0} W_{k,-1})^{\alpha_k} \big| =: I. \tag{4.8} \]

From (3.1), we deduce that W_{k,0} ≥ A_{kk,0}W_{k,−1} a.s. Hence, the absolute value may be omitted on both sides of (4.8). We consider two cases, depending on the value of αk.

Case 1: αk < 1.

Due to (3.2) and Lemma 4.2, we obtain

\[ \alpha_k I \leq \mathbb{E}\big[ (W_{k,0} - A_{kk,0} W_{k,-1})^{\alpha_k} \big] = \mathbb{E} D_{k,0}^{\alpha_k} < \infty. \]

Case 2: αk ≥ 1. For any x ≥ y ≥ 0 and α ≥ 1, the following inequality holds:

\[ x^{\alpha} - y^{\alpha} = \alpha \int_y^x t^{\alpha-1} \, dt \leq \alpha x^{\alpha-1} (x - y). \]

Since W_{k,0} ≥ A_{kk,0}W_{k,−1} ≥ 0 a.s., we can estimate

\[
\begin{aligned}
W_{k,0}^{\alpha_k} - (A_{kk,0} W_{k,-1})^{\alpha_k} &\leq \alpha_k W_{k,0}^{\alpha_k - 1} (W_{k,0} - A_{kk,0} W_{k,-1}) \\
&= \alpha_k D_{k,0} W_{k,0}^{\alpha_k - 1} = \alpha_k D_{k,0} (A_{kk,0} W_{k,-1} + D_{k,0})^{\alpha_k - 1}.
\end{aligned}
\]

Since the inequality

\[ (x+y)^{\alpha} \leq \max\{1, 2^{\alpha-1}\}(x^{\alpha} + y^{\alpha}) \]

holds for any x, y > 0 and each α > 0, putting α = αk − 1 we obtain

\[
\begin{aligned}
I &\leq \mathbb{E}\big[ D_{k,0} (A_{kk,0} W_{k,-1} + D_{k,0})^{\alpha_k - 1} \big] \\
&\leq \max\{1, 2^{\alpha_k - 2}\} \big( \mathbb{E} D_{k,0}^{\alpha_k} + \mathbb{E}[ D_{k,0} (A_{kk,0} W_{k,-1})^{\alpha_k - 1} ] \big).
\end{aligned}
\]

Since E[D_{k,0}^{αk}] < ∞ by Lemma 4.2, it remains to prove the finiteness of the second expectation. In view of (3.2),

\[ \mathbb{E}[ D_{k,0} (A_{kk,0} W_{k,-1})^{\alpha_k - 1} ] = \mathbb{E}\big[ A_{kk,0}^{\alpha_k - 1} B_{k,0} \big] \, \mathbb{E} W_{k,-1}^{\alpha_k - 1} + \sum_{j :\, j \succ k} \mathbb{E}\big[ A_{kk,0}^{\alpha_k - 1} A_{kj,0} \big] \, \mathbb{E}\big[ W_{k,-1}^{\alpha_k - 1} W_{j,-1} \big], \]


where we use the independence of (A··,0, B·,0) and W·,−1. Since E[W_{k,−1}^{αk−1}] < ∞ by Lemma 4.2, we focus on the remaining terms. Take p = (αk + ε)/(αk + ε − 1) and q = αk + ε with 0 < ε < min{α̃j : k ≺ j} − αk. Then, since

\[ p(\alpha_k - 1) = \frac{\alpha_k + \varepsilon}{\alpha_k + \varepsilon - 1}(\alpha_k - 1) = \Big( 1 + \frac{1}{\alpha_k + \varepsilon - 1} \Big)(\alpha_k - 1) < \Big( 1 + \frac{1}{\alpha_k - 1} \Big)(\alpha_k - 1) = \alpha_k, \]

Hölder’s inequality together with Lemma 4.2 yields

\[ \mathbb{E}\big[ W_{k,-1}^{\alpha_k - 1} W_{j,-1} \big] \leq \big( \mathbb{E} W_{k,-1}^{p(\alpha_k - 1)} \big)^{1/p} \cdot \big( \mathbb{E} W_{j,-1}^{\alpha_k + \varepsilon} \big)^{1/q} < \infty. \]

Similarly, E[A_{kk,0}^{αk−1} B_{k,0}] < ∞ and E[A_{kk,0}^{αk−1} A_{kj,0}] < ∞ hold, and hence I < ∞. By [15, Theorem 2.3], (4.6) holds, and it remains to prove that Ck > 0. Since W_{k,0} ≥ A_{kk,0}W_{k,−1} + B_{k,0} holds by (3.1), W_{k,0}^{αk} − (A_{kk,0}W_{k,−1})^{αk} is strictly positive on the set {B_{k,0} > 0}, which has positive probability in view of condition (T-2). Therefore, in (4.8) we obtain I > 0. Condition (T-7) implies that 0 < E[A_{kk}^{αk} log A_{kk}] < ∞. Hence, we see from (4.7) that Ck > 0. □

5 Case α̃k < αk

The situation is now quite the opposite. The autoregressive behavior of Wk does not play any role, since k depends (in terms of Definition 3.2) on coordinates which admit dominant tails. More precisely, we prove that Wk has a regularly varying tail and its tail index is smaller than αk. It is equal to the tail index α_{j0} of the unique component W_{j0} such that α̃k = α_{j0} (= α̃_{j0}). The latter is due to the formula

\[ W_{\ell,0} = \pi_{\ell j_0}(s) W_{j_0,-s} + R_{\ell j_0}(s), \tag{5.1} \]

which is proved inductively for all ℓ such that k ⊴ ℓ ⊴ j0. The decomposition (5.1) is the main goal of this section. We show that R_{ℓj0}(s) is negligible as s → ∞, so that the tail of W_{ℓ,0} comes from π_{ℓj0}(s)W_{j0,−s}. Moreover, we apply Breiman’s lemma to π_{ℓj0}(s)W_{j0,−s}. Then, we also need to describe the limit behavior of E[π_{ℓj0}(s)^{α_{j0}}] as s → ∞.

To reach (5.1), we first prove that W_{ℓ,0} may be represented as in (5.4). Then, we divide the series (5.4) into two parts: the finite sum Q_F(s) and the tail Q_T(s) of (5.6). Next, we decompose Q_F(s) into two parts: Q_W(s), containing W-terms, and Q_B(s), containing B-terms; see (5.13) and (5.14). Finally, Q_W(s) is split into Q′_W(s) and Q″_W(s), where the former contains the components with dominating tails, while the latter gathers those with lower-order tails. This decomposition suffices to settle the induction basis in Lemma 5.10, since (5.28) is satisfied if we set R_{ℓj0}(s) = Q″_W(s) + Q_B(s) + Q_T(s). The three parts Q_T(s), Q_B(s) and Q″_W(s) are estimated in Lemmas 5.3 and 5.4.


For the induction step in Lemma 5.9, we find s1, s2 ∈ ℕ such that s = s1 + s2 and extract another term Q*_W(s1, s2) from Q′_W(s1), so that Q′_W(s1) = π_{ℓj0}(s)W_{j0,−s} + Q*_W(s1, s2). Then, we set

\[ R_{\ell j_0}(s) = Q''_W(s_1) + Q_B(s_1) + Q_T(s_1) + Q^*_W(s_1, s_2). \]

Notice that the two definitions of R_{ℓj0}(s) above coincide, since Q*_W(s, 0) = 0 in the framework of Lemma 5.10. The term Q*_W(s1, s2) is estimated in the induction step in Lemma 5.9, where both s1 and s2 are required to be sufficiently large. The limit behavior of E[π_{ℓj0}(s)^{α_{j0}}] as s → ∞ is given in Lemma 5.7.

5.1 Decomposition of the Stationary Solution

We need to introduce some notation. For the products of the diagonal entries of the matrix A, we write

\[ \Pi^{(i)}_{t,s} = \begin{cases} A_{ii,t} \cdots A_{ii,s}, & t \geq s, \\ 1, & t < s, \end{cases} \qquad i = 1, \ldots, d; \ t, s \in \mathbb{Z}, \]

with the simplified form for t = 0

\[ \Pi^{(i)}_n := \Pi^{(i)}_{0,-n+1}, \qquad n \in \mathbb{N}. \]

We also define the subset H′_m(i, j) ⊂ H_m(i, j) as

\[ H'_m(i,j) = \{ h \in H_m(i,j) : h(1) \neq i \}, \tag{5.2} \]

namely, H′_m(i, j) is the subset of H_m(i, j) whose elements have unequal first two terms: i = h(0) ≠ h(1). The latter naturally yields another definition:

\[ \pi'_{ij}(t, t-s) := \sum_{h \in H'_{s+1}(i,j)} \prod_{p=0}^{s} A_{h(p)h(p+1),t-p}; \qquad \pi'_{ij}(s) := \pi'_{ij}(0, -s+1), \tag{5.3} \]

which looks similar to πij(t, t−s) and can be interpreted as an entry of the product A⁰_t Π_{t−1,t−s}, where A⁰ stands for the matrix that has the same entries as A outside the diagonal and zeros on the diagonal.

Now, we are going to formulate and prove the auxiliary results of this section. We start by representing W_{ℓ,t} as a series.

Lemma 5.1 Let W_t = (W_{1,t}, ..., W_{d,t}) be the stationary solution of (2.3). Then

\[ W_{i,t} = \sum_{n=0}^{\infty} \Pi^{(i)}_{t,t-n+1} D_{i,t-n} \quad \text{a.s.,} \qquad i = 1, \ldots, d, \tag{5.4} \]


where D_{i,t−n} is as in (3.3).

Remark 5.2 Representation (5.4) allows us to exploit the fast decay of Π^{(i)}_{t,t−n+1} as n → ∞.

Proof First, we show that for fixed i the series on the right-hand side of (5.4) converges. For this, we evaluate the moment of some order α in the range 0 < α < α̃i. Without loss of generality, we let α < 1 and t = 0. By Fubini’s theorem,

\[ \mathbb{E}\Big( \sum_{n=0}^{\infty} \Pi^{(i)}_n D_{i,-n} \Big)^{\alpha} \leq \sum_{n=0}^{\infty} \rho^n (dML + M) < \infty, \]

where L = max{E[W_j^α] : j ≻ i} and M = E[B_i^α] ∨ max{E[A_ij^α] : j ≻ i} are finite constants, and ρ = E[A_ii^α] < 1. Hence, the series defined in (5.4) converges.

Denote Y_{i,t} := Σ_{n=0}^∞ Π^{(i)}_{t,t−n+1} D_{i,t−n}. We are going to show that the vector Y_t = (Y_{1,t}, ..., Y_{d,t}) is a stationary solution to (2.3). Since stationarity is obvious, without loss of generality set t = 0. For i = 1, ..., d, we have

\[
\begin{aligned}
Y_{i,0} &= D_{i,0} + \sum_{n=1}^{\infty} \Pi^{(i)}_n D_{i,-n} \\
&= D_{i,0} + A_{ii,0} \sum_{n=0}^{\infty} \Pi^{(i)}_{-1,-n} D_{i,-n-1} \\
&= A_{ii,0} Y_{i,-1} + D_{i,0}. \tag{5.5}
\end{aligned}
\]

The d equations of the form (5.5) (one for each i) can be written together as a matrix equation:

\[ \mathbf{Y}_0 = \mathbf{A}_0 \mathbf{Y}_{-1} + \mathbf{B}_0, \]

which is a special case of (2.3). It remains to prove that this implies Y_{i,0} = W_{i,0} a.s.

The series representation (2.4) allows us to write D_{i,t} as a measurable function of {(A_s, B_s) : s ≤ t}. Since the sequence (A_t, B_t)_{t∈ℤ} is i.i.d. and hence ergodic, it follows from Proposition 4.3 of [21] that (D_{i,t})_{t∈ℤ} is also ergodic. Brandt [6] proved that then Y_{i,t}, defined in the paragraph above (5.5), is the only proper solution to (5.4) [6, Theorem 1]. Therefore, Y_{i,t} = W_{i,t} a.s. □

Now, we divide the series (5.4) into two parts: the finite sum Q_F(s) of the first s elements and the corresponding tail Q_T(s):

\[ W_{\ell,0} =: \underbrace{\sum_{n=0}^{s-1} \Pi^{(\ell)}_n D_{\ell,-n}}_{Q_F(s)} + \underbrace{\sum_{n=s}^{\infty} \Pi^{(\ell)}_n D_{\ell,-n}}_{Q_T(s)}. \tag{5.6} \]


For notational simplicity, we suppress the coordinate number ℓ in Q_F(s) and Q_T(s). In the paper, the appropriate coordinate is always denoted by ℓ. The same applies to the other parts Q_W(s), Q_B(s), Q′_W(s), Q″_W(s) and Q*_W(s1, s2) of W_{ℓ,0}, which will be defined in the sequel.

Both Q_F(s) and Q_T(s) have the same tail order for any s. However, if s is large enough, one can prove that Q_T(s) becomes negligible in the sense that the ratio of Q_T(s) and Q_F(s) tends to zero. The following lemma describes that phenomenon precisely. Recall that j0 is the unique index with the property α̃ℓ = α_{j0}, and so α̃j = α_{j0} for all j such that ℓ ⊴ j ⊴ j0.

Lemma 5.3 Assume that for any j satisfying ℓ ⊲ j ⊴ j0,

\[ \lim_{x\to\infty} x^{\alpha_{j_0}} P(W_j > x) = C_j > 0. \tag{5.7} \]

Then, for every ε > 0 there exists s0 = s0(ℓ, ε) such that

\[ \limsup_{x\to\infty} x^{\alpha_{j_0}} P(Q_T(s) > x) < \varepsilon \tag{5.8} \]

for any s ≥ s0.

Proof For ρ := E[A_{ℓℓ}^{α_{j0}}] < 1, we choose a constant γ ∈ (0, 1) such that ργ^{−α_{j0}} =: δ < 1. Notice that E[(Π^{(ℓ)}_n)^{α_{j0}}] decays to 0 exponentially fast, which is crucial in the following argument. We have

\[
\begin{aligned}
P(Q_T(s) > x) &= P\Big( \sum_{n=s}^{\infty} \Pi^{(\ell)}_n D_{\ell,-n} > x \Big) \\
&\leq \sum_{n=s}^{\infty} P\big( \Pi^{(\ell)}_n D_{\ell,-n} > x (1-\gamma)\gamma^{n-s} \big) \\
&= \sum_{n=0}^{\infty} P\big( \Pi^{(\ell)}_{n+s} D_{\ell,-n-s} > x (1-\gamma)\gamma^{n} \big) \\
&\leq \sum_{n=0}^{\infty} P\Big( \Pi^{(\ell)}_{n+s} D'_{\ell,-n-s} > \frac{x(1-\gamma)\gamma^{n}}{2} \Big) \ (=: \mathrm{I}) \\
&\quad + \sum_{n=0}^{\infty} P\Big( \Pi^{(\ell)}_{n+s} D''_{\ell,-n-s} > \frac{x(1-\gamma)\gamma^{n}}{2} \Big) \ (=: \mathrm{II}),
\end{aligned}
\]

where we divided D_{ℓ,−n−s} = B_{ℓ,−n−s} + Σ_{j : j ≻ ℓ} A_{ℓj,−n−s}W_{j,−n−s−1} into two parts:

\[ D_{\ell,-n-s} = \underbrace{\sum_{j :\, \ell \prec j \trianglelefteq j_0} A_{\ell j,-n-s} W_{j,-n-s-1}}_{D'_{\ell,-n-s}} + \underbrace{\sum_{j :\, \ell \prec j,\ j \ntrianglelefteq j_0} A_{\ell j,-n-s} W_{j,-n-s-1} + B_{\ell,-n-s}}_{D''_{\ell,-n-s}}. \tag{5.9} \]


The first part D′_{ℓ,−n−s} contains those components of W_{−n−s−1} which are dominant, while D″_{ℓ,−n−s} gathers the negligible parts: the components with lower-order tails and the B-term. For the second sum II, we have E[(D″_{ℓ,−n−s})^{α_{j0}}] < ∞ by Lemma 4.2 and condition (T), so Markov’s inequality yields

\[ x^{\alpha_{j_0}} \mathrm{II} \leq \sum_{n=0}^{\infty} 2^{\alpha_{j_0}} (1-\gamma)^{-\alpha_{j_0}} \gamma^{-n\alpha_{j_0}} \rho^{n+s} \, \mathbb{E}(D''_{\ell,-n-s})^{\alpha_{j_0}} \leq c\, \rho^s \sum_{n=0}^{\infty} \delta^n < \infty. \tag{5.10} \]

For the first sum I, we use conditioning in the following way:

\[
\begin{aligned}
x^{\alpha_{j_0}} \mathrm{I} &\leq \sum_{n=0}^{\infty} \sum_{j :\, \ell \prec j \trianglelefteq j_0} x^{\alpha_{j_0}} P\Big( \Pi^{(\ell)}_{n+s} A_{\ell j,-n-s} W_{j,-n-s-1} > \frac{x(1-\gamma)\gamma^n}{2d} \Big) \\
&= \sum_{n=0}^{\infty} \sum_{j :\, \ell \prec j \trianglelefteq j_0} \mathbb{E}\big[ x^{\alpha_{j_0}} P( G_n W_{j,-n-s-1} > x \mid G_n ) \big],
\end{aligned}
\]

where G_n = Π^{(ℓ)}_{n+s} A_{ℓj,−n−s}(1−γ)^{−1}γ^{−n} · 2d. Notice that G_n and W_{j,−n−s−1} are independent and

\[ \mathbb{E} G_n^{\alpha_{j_0}} \leq c_\ell \cdot \rho^{n+s} \big( (1-\gamma)\gamma^n \big)^{-\alpha_{j_0}} (2d)^{\alpha_{j_0}}, \]

where c_ℓ = max_j {E[A_{ℓj}^{α_{j0}}] : ℓ ≺ j ⊴ j0}. Recall that by assumption (5.7), there is a constant c_j such that for every x > 0,

\[ P(W_j > x) \leq c_j x^{-\alpha_{j_0}}. \]

Therefore, recalling ργ^{−α_{j0}} =: δ, we further obtain

\[ x^{\alpha_{j_0}} \mathrm{I} \leq \sum_{n=0}^{\infty} \sum_{j :\, \ell \prec j \trianglelefteq j_0} c_j \cdot c_\ell \cdot \rho^{n+s} \big( (1-\gamma)\gamma^n \big)^{-\alpha_{j_0}} (2d)^{\alpha_{j_0}} \leq c' \cdot \rho^s \sum_{n=0}^{\infty} \delta^n \tag{5.11} \]

with c′ = d · c_ℓ · max{c_j : ℓ ≺ j ⊴ j0} · (1−γ)^{−α_{j0}}(2d)^{α_{j0}}. Now, in view of (5.10) and (5.11), since ρ < 1, there is s0 such that

\[ \limsup_{x\to\infty} x^{\alpha_{j_0}} P(Q_T(s) > x) < \varepsilon \]

for s > s0. □

The dominating term Q_F(s) of (5.6) can be further decomposed.

Lemma 5.4 For any ℓ ≤ d and s ∈ ℕ, the sum Q_F(s) admits the decomposition

\[ Q_F(s) = Q_W(s) + Q_B(s) \quad \text{a.s.,} \tag{5.12} \]


where

\[ Q_W(s) = \sum_{j :\, j \trianglerighteq \ell} \pi_{\ell j}(s) W_{j,-s}, \tag{5.13} \]

\[ Q_B(s) = \sum_{n=0}^{s-1} \Pi^{(\ell)}_n \Big( B_{\ell,-n} + \sum_{j :\, j \vartriangleright \ell} \sum_{m=1}^{s-n-1} \pi'_{\ell j}(-n, -n-m+1) B_{j,-n-m} \Big). \tag{5.14} \]

Moreover,

\[ \mathbb{E} Q_B(s)^{\tilde\alpha_\ell} < \infty. \tag{5.15} \]

Remark 5.5 Each ingredient D_{ℓ,−n} = Σ_{j : j ≻ ℓ} A_{ℓj,−n}W_{j,−n−1} + B_{ℓ,−n} of Q_F(s) in (5.6) contains several W-terms with time index −n − 1 and a B-term with time index −n (see (3.3)). The idea is to change the W·,−n−1 terms into W·,−s by iterative use of the recursion (2.3) component-wise. Meanwhile, some additional B-terms, with time indices between −n and −s + 1, are produced. When all W-terms have the same time index −s, we gather all ingredients containing a W-term in Q_W(s), while the remaining part Q_B(s) consists of all ingredients containing a B-term. The quantity Q_W(s) is easily treatable, and Q_B(s) is proved to be negligible in the tail (compared with Q_W(s), for which the lower bound is shown in the proof of Theorem 3.3).

Proof Step 1: Decomposition of Q_F(s). In view of (5.6) and (3.2), we have

\[ Q_F(s) = \sum_{n=0}^{s-1} \Pi^{(\ell)}_n D_{\ell,-n} = \sum_{n=0}^{s-1} \underbrace{\Pi^{(\ell)}_n \Big( \sum_{i :\, i \succ \ell} A_{\ell i,-n} W_{i,-n-1} + B_{\ell,-n} \Big)}_{\xi_n} =: \sum_{n=0}^{s-1} \xi_n. \tag{5.16} \]

We will analyze each ingredient ξ_n of the last sum. We divide it into two parts, η_n and ζ_n, starting from n = s − 1:

\[ \xi_{s-1} = \underbrace{\Pi^{(\ell)}_{s-1} \sum_{j :\, j \succ \ell} A_{\ell j,-s+1} W_{j,-s}}_{\eta_{s-1}} + \underbrace{\Pi^{(\ell)}_{s-1} B_{\ell,-s+1}}_{\zeta_{s-1}} = \eta_{s-1} + \zeta_{s-1}. \]


Next, for n = s − 2, applying the component-wise SRE (3.1), we obtain

\[
\begin{aligned}
\xi_{s-2} &= \Pi^{(\ell)}_{s-2} \sum_{j :\, j \succ \ell} A_{\ell j,-s+2} \Big( \sum_{i :\, i \succeq j} A_{ji,-s+1} W_{i,-s} + B_{j,-s+1} \Big) + \Pi^{(\ell)}_{s-2} B_{\ell,-s+2} \\
&= \underbrace{\Pi^{(\ell)}_{s-2} \sum_{j :\, j \succ \ell} A_{\ell j,-s+2} \sum_{i :\, i \succeq j} A_{ji,-s+1} W_{i,-s}}_{\eta_{s-2}} + \underbrace{\Pi^{(\ell)}_{s-2} \Big( B_{\ell,-s+2} + \sum_{j :\, j \succ \ell} A_{\ell j,-s+2} B_{j,-s+1} \Big)}_{\zeta_{s-2}} \\
&= \eta_{s-2} + \zeta_{s-2}.
\end{aligned}
\]

In this way, for n < s we define η_n, which consists of the terms including (W_{j,−s}), and ζ_n, which contains the terms (B_{j,−i}). Observe that in most η’s and ζ’s a multiple summation appears, which is not convenient. To write them in simpler form, we are going to use the notation (5.2). This yields

\[
\begin{aligned}
\eta_{s-2} &= \Pi^{(\ell)}_{s-2} \sum_{i :\, i \vartriangleright \ell} \sum_{j :\, \ell \prec j \preceq i} A_{\ell j,-s+2} A_{ji,-s+1} W_{i,-s} \\
&= \Pi^{(\ell)}_{s-2} \sum_{i :\, i \vartriangleright \ell} \sum_{h \in H'_2(\ell,i)} \Big( \prod_{p=0}^{1} A_{h(p)h(p+1),-s+2-p} \Big) W_{i,-s} \\
&= \Pi^{(\ell)}_{s-2} \sum_{i :\, i \vartriangleright \ell} \pi'_{\ell i}(-s+2, -s+1) W_{i,-s}. \tag{5.17}
\end{aligned}
\]

For each η_n, we obtain an expression of a simple form similar to (5.17). To confirm this, let us return to the decomposition of ξ and carry out one more step of the iteration for η. For n = s − 3, we have

\[ \xi_{s-3} = \Pi^{(\ell)}_{s-3} \bigg[ \sum_{j :\, j \succ \ell} A_{\ell j,-s+3} \bigg\{ \sum_{i :\, i \succeq j} A_{ji,-s+2} \Big( \sum_{u :\, u \succeq i} A_{iu,-s+1} W_{u,-s} + B_{i,-s+1} \Big) + B_{j,-s+2} \bigg\} + B_{\ell,-s+3} \bigg], \tag{5.18} \]

so that, similarly to the case n = s − 2, we obtain the expression

\[
\begin{aligned}
\eta_{s-3} &= \Pi^{(\ell)}_{s-3} \sum_{j :\, j \succ \ell} A_{\ell j,-s+3} \sum_{i :\, i \succeq j} A_{ji,-s+2} \sum_{u :\, u \succeq i} A_{iu,-s+1} W_{u,-s} \\
&= \Pi^{(\ell)}_{s-3} \sum_{u :\, u \vartriangleright \ell} \sum_{i,j :\, \ell \prec j \preceq i \preceq u} A_{\ell j,-s+3} A_{ji,-s+2} A_{iu,-s+1} W_{u,-s} \\
&= \Pi^{(\ell)}_{s-3} \sum_{u :\, u \vartriangleright \ell} \sum_{h \in H'_3(\ell,u)} \Big( \prod_{p=0}^{2} A_{h(p)h(p+1),-s+3-p} \Big) W_{u,-s} \\
&= \Pi^{(\ell)}_{s-3} \sum_{u :\, u \vartriangleright \ell} \pi'_{\ell u}(-s+3, -s+1) W_{u,-s}. \tag{5.19}
\end{aligned}
\]

Similarly, for any 0 ≤ n ≤ s − 1, we obtain inductively

\[ \eta_n = \Pi^{(\ell)}_n \sum_{u :\, u \vartriangleright \ell} \pi'_{\ell u}(-n, -s+1) W_{u,-s}. \tag{5.20} \]

The expression for ζ_n is similar but slightly different. Let us write it for n = s − 3. From (5.18), we infer

\[
\begin{aligned}
\zeta_{s-3} &= \Pi^{(\ell)}_{s-3} \Big( B_{\ell,-s+3} + \sum_{j :\, j \succ \ell} A_{\ell j,-s+3} B_{j,-s+2} + \sum_{j :\, j \succ \ell} A_{\ell j,-s+3} \sum_{i :\, i \succeq j} A_{ji,-s+2} B_{i,-s+1} \Big) \\
&= \Pi^{(\ell)}_{s-3} \Big( B_{\ell,-s+3} + \sum_{j :\, j \vartriangleright \ell} \pi'_{\ell j}(-s+3, -s+3) B_{j,-s+2} + \sum_{i :\, i \vartriangleright \ell} \pi'_{\ell i}(-s+3, -s+2) B_{i,-s+1} \Big) \tag{5.21}
\end{aligned}
\]

and the general formula for ζ_n is:

\[ \zeta_n = \Pi^{(\ell)}_n \Big( B_{\ell,-n} + \sum_{j :\, j \vartriangleright \ell} \sum_{m=1}^{s-1-n} \pi'_{\ell j}(-n, -n-m+1) B_{j,-n-m} \Big). \tag{5.22} \]

Therefore,

\[ Q_F(s) = \sum_{n=0}^{s-1} \xi_n = \underbrace{\sum_{n=0}^{s-1} \eta_n}_{Q_W(s)} + \underbrace{\sum_{n=0}^{s-1} \zeta_n}_{Q_B(s)}, \]

where

\[ Q_W(s) = \sum_{n=0}^{s-1} \Pi^{(\ell)}_n \sum_{j :\, j \vartriangleright \ell} \pi'_{\ell j}(-n, -s+1) W_{j,-s}. \]

Finally, to obtain (5.13), we use (5.3) and the identity

\[ \Pi^{(\ell)}_n \prod_{p=0}^{s-1-n} A_{h'(p)h'(p+1),-n-p} = \prod_{p=0}^{s-1} A_{h(p)h(p+1),-p} \]


for n < s, h′ ∈ H′_{s−n}(ℓ, j) and h ∈ H_s(ℓ, j) such that h(0) = h(1) = ⋯ = h(n) = ℓ and h(n+p) = h′(p) for p = 0, ..., s−1−n. Each element h ∈ H_s(ℓ, j) is uniquely represented in this way.

Step 2: Estimation of Q_B(s).

Recall that α̃ℓ ≤ α̃j for j ⊵ ℓ. Hence, E[A_{ij}^{α̃ℓ}] < ∞ and E[B_i^{α̃ℓ}] < ∞ for any i, j ⊵ ℓ. We have

\[ \mathbb{E} Q_B(s)^{\tilde\alpha_\ell} = \mathbb{E} \bigg\{ \sum_{n=0}^{s-1} \Pi^{(\ell)}_n \Big( B_{\ell,-n} + \sum_{j :\, j \vartriangleright \ell} \sum_{m=1}^{s-1-n} \pi'_{\ell j}(-n, -n-m+1) B_{j,-n-m} \Big) \bigg\}^{\tilde\alpha_\ell}. \tag{5.23} \]

Due to Minkowski’s inequality for α̃ℓ > 1 and Jensen’s inequality for α̃ℓ ≤ 1, (5.23) is bounded by finite combinations of the quantities

\[ \mathbb{E}\pi'_{\ell j}(-n, -n-m+1)^{\tilde\alpha_\ell} < \infty \quad \text{and} \quad \mathbb{E} B_{j,-n-m}^{\tilde\alpha_\ell} < \infty \]

for 0 ≤ n ≤ s − 1, j ⊵ ℓ and 1 ≤ m ≤ s − 1 − n. Here, we recall that π′_{ℓj}(−n, −n−m+1) is a polynomial in the A_{ij}, i, j ⊵ ℓ (see (5.3)). Thus, E[Q_B(s)^{α̃ℓ}] < ∞. □

Example 5.6 In order to grasp the intuition of decomposition (5.12), we consider the case d = 2 and let

\[ \mathbf{A}_t = \begin{pmatrix} A_{11,t} & A_{12,t} \\ 0 & A_{22,t} \end{pmatrix}. \]

Then, applying the recursions W_{2,r} = A_{22,r}W_{2,r−1} + B_{2,r}, −s+1 ≤ r ≤ −1, to the quantity Q_F(s) of (5.16), we obtain

\[
\begin{aligned}
Q_F(s) &= \sum_{n=0}^{s-1} \Pi^{(1)}_n D_{1,-n} \\
&= \sum_{n=0}^{s-1} \Pi^{(1)}_n (A_{12,-n} W_{2,-n-1} + B_{1,-n}) \\
&= \sum_{n=0}^{s-1} \Pi^{(1)}_n A_{12,-n} \Pi^{(2)}_{-n-1,-s+1} W_{2,-s} + \sum_{n=0}^{s-1} \Pi^{(1)}_n \Big( B_{1,-n} + \sum_{m=1}^{s-1-n} A_{12,-n} \Pi^{(2)}_{-n-1,-n+1-m} B_{2,-n-m} \Big),
\end{aligned}
\]

where we recall the convention Π^{(·)}_{−n−1,−n} = 1.

Let us focus on the first sum in the last expression, which is equal to Q_W(s). Each term of this sum contains a product of s factors of the form A11,·, A12,· or A22,·.


Each product is completely characterized by the non-decreasing sequence (h(i))_{i=0}^s of natural numbers with h(0) = 1 and h(s) = 2. If h(i) = 1 for i ≤ q and h(i) = 2 for i > q, then there are q factors of the form A11,· in front of A_{h(q)h(q+1),−q} = A_{12,−q}, and s − 1 − q factors of the form A22,· behind. All such sequences constitute H_s(1, 2) of Definition 4.1. Thus, we can write

\[ Q_W(s) = \underbrace{\sum_{h \in H_s(1,2)} \prod_{p=0}^{s-1} A_{h(p)h(p+1),-p}}_{\pi_{12}(s)} W_{2,-s}. \]

The second sum, which corresponds to Q_B(s), has another sum of products in the nth term. All terms in these secondary sums over m start with A_{12,−n} and then have a product of A22,· factors until we reach B·,·. Each of these products is again characterized by a non-decreasing sequence (h(i))_{i=0}^m such that h(0) = 1, h(m) = 2 and h(1) ≠ 1, and there is just one such sequence for each m. Thus, we can use H′_m(1, 2) of (5.2) and write

\[ Q_B(s) = \sum_{n=0}^{s-1} \Pi^{(1)}_n \Big( B_{1,-n} + \sum_{m=1}^{s-1-n} \underbrace{\sum_{h \in H'_m(1,2)} \prod_{p=0}^{m-1} A_{h(p)h(p+1),-n-p}}_{\pi'_{12}(-n,-n-m+1)} B_{2,-n-m} \Big). \]

5.2 The Dominant Term

Lemmas 5.4 and 5.3 imply that the tail of W_{ℓ,0} is determined by

\[ Q_W(s) = \sum_{j :\, j \trianglerighteq \ell} \pi_{\ell j}(s) W_{j,-s} \]

in (5.13). In the subsequent lemmas (Lemmas 5.9 and 5.10), we apply the recurrence to W_{j,−s}, j ⊴ j0, until they reach W_{j0,t} for some earlier time t. Those W_{j,−s} could survive as the dominant terms, while the terms W_{j,−s}, j ⋬ j0, are proved to be negligible. In these steps, the behavior of the coefficients of all the W_{j,−s} is unavoidable.

The following property is crucial in subsequent steps, particularly in the main proof.

Lemma 5.7 Let αi > α̃i = α_{j0} for some j0 ⊵ i and u_i(s) := E[π_{ij0}(s)^{α_{j0}}]. Then, the limit u_i = lim_{s→∞} u_i(s) exists, and it is finite and strictly positive.

Proof First notice that u_i(s) > 0 for s large enough, since π_{ij}(s) is positive whenever i ⊴ j and s ≥ j − i. We are going to prove that the sequence u_i(s) is non-decreasing


in s. Observe that

\[ \pi_{ij_0}(s+1) = \sum_{j :\, i \trianglelefteq j \preceq j_0} \pi_{ij}(s) A_{jj_0,-s} \geq \pi_{ij_0}(s) \cdot A_{j_0j_0,-s}, \]

and therefore, by independence,

\[ u_i(s+1) = \mathbb{E}\pi_{ij_0}(s+1)^{\alpha_{j_0}} \geq \mathbb{E}\pi_{ij_0}(s)^{\alpha_{j_0}} \cdot \mathbb{E} A_{j_0j_0,-s}^{\alpha_{j_0}} = \mathbb{E}\pi_{ij_0}(s)^{\alpha_{j_0}} = u_i(s). \]

Since u_i(s) is non-decreasing in s, if (u_i(s))_{s∈ℕ} is bounded uniformly in s, then the limit exists. For the upper bound on u_i(s), notice that each product in π_{ij0}(s) (see (4.1)) can be divided into two parts:

\[ \prod_{p=0}^{s-1} A_{h(p)h(p+1),-p} = \underbrace{\Big( \prod_{p=0}^{s-m-1} A_{h(p)h(p+1),-p} \Big)}_{\text{first part}} \cdot \underbrace{\Pi^{(j_0)}_{-s+m,-s+1}}_{\text{second part}}, \]

where m denotes the length of the product of A_{j0j0} terms. In the first part, all h(·) but the last one, h(s−m) = j0, are strictly smaller than j0. Since E[A_{j0j0}^{α_{j0}}] = 1, clearly

\[ \mathbb{E}\Big( \prod_{p=0}^{s-1} A_{h(p)h(p+1),-p} \Big)^{\alpha_{j_0}} = \mathbb{E}\Big( \prod_{p=0}^{s-m-1} A_{h(p)h(p+1),-p} \Big)^{\alpha_{j_0}}. \]

Now, let H″_n(j, j0) denote the set of all sequences h ∈ H_n(j, j0) which have only one j0, at the end. Suppose first that α_{j0} < 1. Then, by Jensen’s inequality we obtain

\[
\begin{aligned}
\mathbb{E}\pi_{ij_0}(s)^{\alpha_{j_0}} &= \mathbb{E}\Big( \sum_{h \in H_s(i,j_0)} \prod_{p=0}^{s-1} A_{h(p)h(p+1),-p} \Big)^{\alpha_{j_0}} \leq \sum_{h \in H_s(i,j_0)} \prod_{p=0}^{s-1} \mathbb{E} A_{h(p)h(p+1),-p}^{\alpha_{j_0}} \\
&= \sum_{m=0}^{s-1} \sum_{h \in H''_{s-m}(i,j_0)} \prod_{p=0}^{s-m-1} \mathbb{E} A_{h(p)h(p+1),-p}^{\alpha_{j_0}} \\
&\leq \sum_{m=0}^{s-1} d(s-m)^d \cdot M^{j_0-i} \cdot \rho^{s-m-(j_0-i)} \tag{5.24} \\
&\leq d \rho^{i-j_0} M^{j_0-i} \sum_{l=1}^{\infty} l^d \rho^l < \infty. \tag{5.25}
\end{aligned}
\]


Here, we put

\[ \rho = \max\{\mathbb{E} A_{jj}^{\alpha_{j_0}} : j \vartriangleleft j_0\} < 1; \qquad M = \max\{\mathbb{E} A_{uv}^{\alpha_{j_0}} : u \trianglelefteq j_0, \ v \trianglelefteq j_0\} < \infty. \tag{5.26} \]

The term d(s−m)^d in (5.24) is an upper bound on the number of elements of H″_{s−m}(i, j0). The term M^{j0−i} bounds the contribution of the non-diagonal elements in the product, since any sequence h ∈ H_{s−m}(i, j0) generates at most j0 − i such elements. The last term ρ^{s−m−(j0−i)} in (5.24) is an estimate of the contribution of the diagonal elements, since there are at least s − m − (j0 − i) of them.

We obtain (5.25) from (5.24) by substituting l = s − m and extending the finite sum to the infinite series. Since the series converges and its sum does not depend on s, the expectation E[π_{ij0}(s)^{α_{j0}}] is bounded from above uniformly in s.

Similarly, if α_{j0} ≥ 1, then by Minkowski’s inequality we obtain

\[
\begin{aligned}
\big( \mathbb{E}\pi_{ij_0}(s)^{\alpha_{j_0}} \big)^{1/\alpha_{j_0}} &= \bigg( \mathbb{E}\Big( \sum_{h \in H_s(i,j_0)} \prod_{p=0}^{s-1} A_{h(p)h(p+1),-p} \Big)^{\alpha_{j_0}} \bigg)^{1/\alpha_{j_0}} \\
&\leq \sum_{h \in H_s(i,j_0)} \prod_{p=0}^{s-1} \big( \mathbb{E} A_{h(p)h(p+1),-p}^{\alpha_{j_0}} \big)^{1/\alpha_{j_0}} \\
&= \sum_{m=0}^{s-1} \sum_{h \in H''_{s-m}(i,j_0)} \prod_{p=0}^{s-m-1} \big( \mathbb{E} A_{h(p)h(p+1),-p}^{\alpha_{j_0}} \big)^{1/\alpha_{j_0}},
\end{aligned}
\]

which is again bounded from above, uniformly in s, by the same argument. □

The number u_i depends only on i, since the index j0 = j0(i) is uniquely determined for each i. We are going to use u_i as an upper bound for E[π_{ij}(s)^{α_{j0}}] for any j such that i ⊴ j ⊲ j0. This is justified by the following lemma.

Lemma 5.8 For any j such that i ⊴ j ⊲ j0, there is s_j such that

\[ \mathbb{E}\pi_{ij}(s)^{\alpha_{j_0}} < u_i \quad \text{for } s > s_j. \tag{5.27} \]

Proof The argument is similar to that in the proof of Lemma 4.2. Indeed, one finds π_{ij}(s) in S_{i,n} of (4.2) by setting m = s. We briefly recall the argument for α_{j0} ≤ 1; the case α_{j0} > 1 is similar. Without loss of generality, we set s ≥ 2d − 1. The number of sequences in H_s(i, j) is less than ds^d. Taking ρ and M as in (5.26), from (4.1) and Jensen’s inequality we infer


\[ \mathbb{E}\pi_{ij}(s)^{\alpha_{j_0}} = \mathbb{E}\Big( \sum_{h \in H_s(i,j)} \prod_{p=0}^{s-1} A_{h(p)h(p+1),-p} \Big)^{\alpha_{j_0}} \leq \sum_{h \in H_s(i,j)} \prod_{p=0}^{s-1} \mathbb{E} A_{h(p)h(p+1)}^{\alpha_{j_0}} \leq d M^d s^d \rho^{s-d} \xrightarrow{s\to\infty} 0, \]

which implies (5.27). □

We have now completed all the preliminaries, and we are ready for the goal of this section, i.e., to establish the expression (3.7):

\[ W_{\ell,0} = \pi_{\ell j_0}(s) W_{j_0,-s} + R_{\ell j_0}(s), \]

and to prove the negligibility of the term R_{ℓj0}(s) in the tail as s → ∞.

Lemma 5.9 Suppose that α̃ℓ < αℓ and j0 ⊵ ℓ is the unique number with the property α̃ℓ = α_{j0}. Assume that

\[ \lim_{x\to\infty} x^{\alpha_{j_0}} P(W_j > x) = C_j > 0 \]

whenever ℓ ⊲ j ⊴ j0. Then, for any ε > 0 there exists s_ℓ = s_ℓ(ε) such that if s > s_ℓ, W_{ℓ,0} has a representation

\[ W_{\ell,0} = \pi_{\ell j_0}(s) W_{j_0,-s} + R_{\ell j_0}(s), \tag{5.28} \]

where R_{ℓj0}(s) satisfies

\[ \limsup_{x\to\infty} x^{\alpha_{j_0}} P(|R_{\ell j_0}(s)| > x) < \varepsilon. \tag{5.29} \]

The proof is given by induction, and first we consider the case when indeed ℓ ≺ j0, not only ℓ ⊴ j0. More precisely, we prove the following lemma, which serves as a basic tool in each inductive step.

Lemma 5.10 Suppose that αℓ > α̃ℓ = α_{j0} for some j0 ≻ ℓ and α̃j > α_{j0} for every j ⊳ ℓ, j ≠ j0. Then, for any ε > 0 there exists s_ℓ = s_ℓ(ε) such that for any s > s_ℓ, W_{ℓ,0} has a representation (5.28) which satisfies (5.29).

The condition in the lemma says that j0 is the unique coordinate with the heaviest tail among {j : j ⊵ ℓ}; all the other coordinates that determine ℓ do not depend on j0, and they have lighter tails.

Notice that as long as we only represent W_{ℓ,0} as π_{ℓj0}(s)W_{j0,−s} plus some r.v., we need not take a large s. Indeed, s = 1 is already enough (π_{ℓj0}(1) = A_{ℓj0,0}) if ℓ ≺ j0. Thus, the number s_ℓ is specific to (5.29).


Proof In view of (5.13), we may write

\[ Q_W(s) = \pi_{\ell j_0}(s) W_{j_0,-s} + \underbrace{\sum_{\substack{j :\, j \trianglerighteq \ell \\ j \neq j_0}} \pi_{\ell j}(s) W_{j,-s}}_{\overline{Q}_W(s)} = \pi_{\ell j_0}(s) W_{j_0,-s} + \overline{Q}_W(s), \tag{5.30} \]

where E[Q̄_W(s)^{α_{j0}}] < ∞ by two facts: (1) E[π_{ℓj}(s)^{α_{j0}}] < ∞ for any j (see the explanation after (4.1)); (2) E[W_j^{α_{j0}}] < ∞ for j ⊵ ℓ, j ≠ j0, which is due to Lemma 4.2.

Thus, by Lemma 5.4 we have

\[ W_{\ell,0} = Q_W(s) + Q_B(s) + Q_T(s) = \pi_{\ell j_0}(s) W_{j_0,-s} + \overline{Q}_W(s) + Q_B(s) + Q_T(s), \]

where Q̄_W(s) and Q_B(s) have finite moments of order α_{j0} and Q_T(s) satisfies (5.8). Now, putting

\[ R_{\ell j_0}(s) = \overline{Q}_W(s) + Q_B(s) + Q_T(s), \tag{5.31} \]

and setting s_ℓ(ε) = s0(ℓ, ε) as in Lemma 5.3, we obtain the result. □

Proof of Lemma 5.9 Here, we may allow the existence of j with ℓ ⊲ j ⊲ j0, so that there exist sequences (j_i)_{0≤i≤n} such that j0 ⊳ j1 ⊳ ⋯ ⊳ j_n = ℓ. Since these sequences are strictly decreasing, their lengths are at most j0 − ℓ + 1, i.e., possibly smaller than j0 − ℓ + 1. Let n0 = n0(ℓ) denote the maximal length of such a sequence, so that (j_i)_{1≤i≤n0} satisfies j0 ⊳ j1 ⊳ ⋯ ⊳ j_{n0} = ℓ. Then clearly j0 ≻ j1 ≻ ⋯ ≻ j_{n0} = ℓ. In the same way, we define n0(j) for any j in the range ℓ ⊴ j ⊴ j0. We sometimes abbreviate to n0 when we mean n0(ℓ). We use induction with respect to this maximal number n0 to prove (5.28) and (5.29). First, we directly prove these two properties in the cases n0 = 0, 1, which serve as the induction basis. For n0 > 1, the proof relies on the assumption that analogues of (5.28) and (5.29) hold for any j with n0(j) < n0(ℓ). The proof is divided into four steps.

Step 1: Scheme of the induction and the basis.

Consider an arbitrary coordinate j0 satisfying α̃_{j0} = α_{j0}. We prove that for any ℓ ⊴ j0 satisfying α̃ℓ = α_{j0}, the properties (5.28) and (5.29) hold. The proof follows by induction with respect to n0(ℓ). Namely, we assume that for any j ⊴ j0 satisfying α̃j = α_{j0} and n0(j) < n0(ℓ), and for any ε > 0, there is s_j = s_j(ε) such that for any s > s_j we have

\[ W_{j,0} = \pi_{jj_0}(s) W_{j_0,-s} + R_{jj_0}(s), \tag{5.32} \]

where R j j0(s) satisfies

123

Page 29: link.springer.com · JournalofTheoreticalProbability  TailindicesforAX +BRecursionwithTriangularMatrices Muneya Matsui1 ·Witold Swia

Journal of Theoretical Probability

limx→∞ xα j0 P(|R j j0(s)| > x) < ε. (5.33)

Then, we prove (5.28) and (5.29).

First, we settle the induction basis. Here, we consider the cases $n_0 = 0$ and $n_0 = 1$. The former case is equivalent to $\ell = j_0$. In that setting, Theorem 3.3 was already proved in Lemma 4.3; nevertheless, (5.28) and (5.29) need to be shown separately. The iteration of (3.2) then yields

$$W_{j_0,0} = A_{j_0 j_0,0} W_{j_0,-1} + D_{j_0,0} = \cdots = \underbrace{\Pi^{(j_0)}_s}_{\pi_{j_0 j_0}(s)} W_{j_0,-s} + \underbrace{\sum_{n=0}^{s-1} \Pi^{(j_0)}_n D_{j_0,-n}}_{R_{j_0 j_0}(s)}, \qquad (5.34)$$

where we recall that $\Pi^{(j_0)}_0 = 1$ and $\Pi^{(j_0)}_1 = A_{j_0 j_0,0}$. From the definition of $j_0$ and Lemma 4.2, it follows that $E D_{j_0}^{\alpha_{j_0}} < \infty$. Since $R_{j_0 j_0}(s)$ is constituted by a finite sum of ingredients which have the $\alpha_{j_0}$th moment, we conclude that $E R_{j_0 j_0}(s)^{\alpha_{j_0}} < \infty$ and (5.29) holds for any $s$. Thus, we may let $s^*(\varepsilon) = 1$ for any $\varepsilon > 0$. The case $n_0 = 1$ is precisely the setting of Lemma 5.10, in which we have already proved (5.28) and (5.29).

If $n_0 > 1$, then there is at least one coordinate $j$ satisfying $j_0 \succ j \succ \ell$. For any such $j$, it follows that $n_0(j) < n_0(\ell)$; hence, we are allowed to use the induction assumptions (5.32) and (5.33). In the next step, we prove that this range of $j$ is essential, while for any other $j \succ \ell$ the induction is not necessary.

Step 2. Decomposition of $Q_W(s)$ and estimation of the negligible term.
The first term in (5.28) comes from the part $Q_W(s)$ of $W_{\ell,0}$ in Lemma 5.4, and we further write

$$Q_W(s) = \underbrace{\sum_{j \,:\, j_0 \succeq j \succ \ell} \pi_{\ell j}(s) W_{j,-s}}_{Q'_W(s)} + \underbrace{\sum_{j \,:\, j_0 \not\succeq j \succ \ell} \pi_{\ell j}(s) W_{j,-s}}_{Q''_W(s)}. \qquad (5.35)$$

Recall that the relation '$\succeq$' describes dependence between the components of the solution $W_t$ after a finite number of iterations of (1.1). Therefore, the range of summation $j_0 \not\succeq j \succ \ell$ in $Q''_W(s)$ means that $\ell$ ($\ne j$) depends on both $j$ and $j_0$ (by the definition of $j_0$), but $j$ does not depend on $j_0$. The relation $j \succ \ell$ implies that $\alpha_j \ge \widetilde\alpha_\ell$, while $j_0 \not\succeq j$ yields that $\widetilde\alpha_j \ne \alpha_{j_0}$. Recalling that $\widetilde\alpha_\ell = \alpha_{j_0}$, we can say that in $Q'_W(s)$ we gather all coordinates $j : j \succ \ell$ such that $\widetilde\alpha_j = \alpha_{j_0}$, and $Q''_W(s)$ consists of $j : j \succ \ell$ such that $\widetilde\alpha_j > \alpha_{j_0}$. Hence, by Lemma 4.2, each $W_j$ appearing in $Q''_W(s)$ has a tail of lower order than the tail of $W_{j_0}$.

Notice that $\overline{Q}_W(s)$ defined in (5.30) is the form that $Q''_W(s)$ takes under the assumptions of Lemma 5.10. In this special case, we also have $Q'_W(s) = \pi_{\ell j_0}(s) W_{j_0,-s}$. We are going to study this expression in the more general setting of Lemma 5.9.

Step 3. The induction step: decomposition of $Q'_W(s)$.


To investigate $Q'_W(s)$ in more detail, we will introduce a shifted version of $R_{ij}(s)$. First, recall that the r.v.'s $\pi_{ij}(t, s)$ and $\pi_{ij}(t - s + 1)$ have the same distribution, and $\pi_{ij}(t, s)$ can be understood as the result of applying $t$ times a shift to all time indices in $\pi_{ij}(t - s + 1)$. In the same way, we define $R_{ij}(t, s)$ as the result of applying $t$ times a shift to all time indices in $R_{ij}(t - s + 1)$. In particular, we have $R_{ij}(0, -s + 1) = R_{ij}(s)$. By the shift invariance of the stationary solution, (5.32) and (5.33) are equivalent to their time-shifted versions:

$$W_{j,t} = \pi_{j j_0}(t, t - s + 1) W_{j_0,t-s} + R_{j j_0}(t, t - s + 1), \quad t \in \mathbb{Z}, \qquad (5.36)$$

and

$$\limsup_{x\to\infty} x^{\alpha_{j_0}}\, P(|R_{j j_0}(t, t - s + 1)| > x) < \varepsilon, \quad t \in \mathbb{Z}, \qquad (5.37)$$

respectively. Fix arbitrary numbers $s_1, s_2 \in \mathbb{N}$. Letting $t = -s_1$ in (5.36), we obtain

$$\begin{aligned}
Q'_W(s_1) &= \sum_{j \,:\, j_0 \succeq j \succ \ell} \pi_{\ell j}(s_1) W_{j,-s_1} \\
&= \sum_{j \,:\, j_0 \succeq j \succ \ell} \pi_{\ell j}(s_1)\bigl\{\pi_{j j_0}(-s_1, -s_1 - s_2 + 1) W_{j_0,-s_1-s_2} + R_{j j_0}(-s_1, -s_1 - s_2 + 1)\bigr\} \\
&= \sum_{j \,:\, j_0 \succeq j \succ \ell} \pi_{\ell j}(s_1)\pi_{j j_0}(-s_1, -s_1 - s_2 + 1) W_{j_0,-s_1-s_2} + \sum_{j \,:\, j_0 \succeq j \succ \ell} \pi_{\ell j}(s_1) R_{j j_0}(-s_1, -s_1 - s_2 + 1) \\
&= \sum_{j \,:\, j_0 \succeq j \succeq \ell} \pi_{\ell j}(s_1)\pi_{j j_0}(-s_1, -s_1 - s_2 + 1) W_{j_0,-s_1-s_2} \\
&\quad + \underbrace{\Bigl\{-\pi_{\ell\ell}(s_1)\pi_{\ell j_0}(-s_1, -s_1 - s_2 + 1) W_{j_0,-s_1-s_2} + \sum_{j \,:\, j_0 \succeq j \succ \ell} \pi_{\ell j}(s_1) R_{j j_0}(-s_1, -s_1 - s_2 + 1)\Bigr\}}_{=:\, Q^*_W(s_1, s_2)} \\
&= \pi_{\ell j_0}(s_1 + s_2) W_{j_0,-s_1-s_2} + Q^*_W(s_1, s_2), 
\end{aligned} \qquad (5.38)$$

where $\pi_{\ell j_0}(s_1 + s_2)$ consists of all combinations $\pi_{\ell j}(s_1)\pi_{j j_0}(-s_1, -s_1 - s_2 + 1)$ over $j : j_0 \succeq j \succeq \ell$. This is clear when we recall that $\pi_{\ell j}(s_1)$, $\pi_{j j_0}(-s_1, -s_1 - s_2 + 1)$ and $\pi_{\ell j_0}(s_1 + s_2)$ are appropriate entries of the matrices $\Pi_{s_1}$, $\Pi_{-s_1,-s_1-s_2+1}$ and $\Pi_{s_1+s_2} = \Pi_{s_1}\Pi_{-s_1,-s_1-s_2+1}$, respectively.

Now, a combination of (5.12), (5.35) and (5.38) yields

$$\begin{aligned}
W_{\ell,0} &= Q_W(s_1) + Q_B(s_1) + Q_T(s_1) \\
&= Q'_W(s_1) + Q''_W(s_1) + Q_B(s_1) + Q_T(s_1) \\
&= \pi_{\ell j_0}(s_1 + s_2) W_{j_0,-s_1-s_2} + Q''_W(s_1) + Q^*_W(s_1, s_2) + Q_B(s_1) + Q_T(s_1) \\
&= \pi_{\ell j_0}(s_1 + s_2) W_{j_0,-s_1-s_2} + R_{\ell j_0}(s_1 + s_2). 
\end{aligned} \qquad (5.39)$$


Our goal will be achieved by putting $s = s_1 + s_2$ in (5.39) and showing (5.29) for $R_{\ell j_0}(s)$ with $s = s_1 + s_2$.

Step 4. The induction step: estimation of the negligible terms.
To obtain (5.29), we evaluate the four ingredients of $R_{\ell j_0}(s_1 + s_2)$ of (5.39), where the second induction hypothesis (5.33) is used. Three of them, $Q''_W(s_1)$, $Q_B(s_1)$ and $Q_T(s_1)$, are nonnegative; hence, it is sufficient to establish an upper bound for each of them. The fourth term, $Q^*_W(s_1, s_2)$, may attain both positive and negative values; thus, we are going to establish an upper bound for its absolute value.

First, since the terms with $j : j_0 \not\succeq j \succ \ell$ in $Q''_W(s_1)$ of (5.35) satisfy the same condition as those of the sum $\overline{Q}_W(s)$ of the previous lemma,

$$E Q''_W(s_1)^{\alpha_{j_0}} < \infty \qquad (E.1)$$

holds. Secondly,

$$E Q_B(s_1)^{\alpha_{j_0}} < \infty \qquad (E.2)$$

holds in view of Lemma 5.4. Moreover, by Lemma 5.3, for any $\varepsilon' > 0$ there is $s_0 = s_0(\ell, \varepsilon')$ such that for $s_1 > s_0$

$$\limsup_{x\to\infty} x^{\alpha_{j_0}}\, P(Q_T(s_1) > x) < \varepsilon'. \qquad (E.3)$$

For the evaluation of $Q^*_W(s_1, s_2)$, we will use (5.27), and therefore we have to assume that $s_1 > \max\{s_j : j_0 \succeq j \succ \ell\}$, where the $s_j$ are defined in (5.27). Furthermore, we fix an arbitrary $\varepsilon'' > 0$ and assume that $s_2 > \max\{s_j(\varepsilon'') : j_0 \succeq j \succ \ell\}$, where the $s_j(\varepsilon'')$ are defined right before (5.32). Recall that $\pi_{\ell j}(s_1)$, $j : j_0 \succeq j \succ \ell$, is independent of $R_{j j_0}(-s_1, -s_1 - s_2 + 1)$ and, for $j \ne j_0$, has a finite moment of order $\alpha_{j_0} + \delta$ with some $\delta > 0$. We use (5.27), (5.37) and Lemma B.1 to obtain, for $j \ne j_0$,

$$\limsup_{x\to\infty} x^{\alpha_{j_0}}\, P\bigl(\pi_{\ell j}(s_1)\,|R_{j j_0}(-s_1, -s_1 - s_2 + 1)| > x\bigr) \le E\pi_{\ell j}(s_1)^{\alpha_{j_0}} \cdot \varepsilon'' \le u_\ell \cdot \varepsilon''.$$

The situation is different for $j = j_0$. We cannot use Lemma B.1, because we do not know whether $E\pi_{\ell j_0}(s_1)^{\alpha_{j_0}+\delta} < \infty$ for some $\delta > 0$. Indeed, it is possible that $E A_{j_0 j_0}^{\alpha_{j_0}+\delta} = \infty$ for all $\delta > 0$, and then clearly $\pi_{\ell j_0}(s_1)$ also does not have any moment of order greater than $\alpha_{j_0}$. However, $E\pi_{\ell j_0}(s_1)^{\alpha_{j_0}} < \infty$ holds for any $s_1$, and this is enough to obtain the desired bound. Since the term $R_{j_0 j_0}(s_2)$ is nonnegative (see (5.34)) and it was already proved to have a finite moment of order $\alpha_{j_0}$, we obtain for any $s_1, s_2$

$$E\bigl\{\pi_{\ell j_0}(s_1)\,|R_{j_0 j_0}(-s_1, -s_1 - s_2 + 1)|\bigr\}^{\alpha_{j_0}} = E\pi_{\ell j_0}(s_1)^{\alpha_{j_0}} \cdot E R_{j_0 j_0}(s_2)^{\alpha_{j_0}} < \infty,$$

and thus

$$\lim_{x\to\infty} x^{\alpha_{j_0}}\, P\bigl(\pi_{\ell j_0}(s_1) R_{j_0 j_0}(-s_1, -s_1 - s_2 + 1) > x\bigr) = 0.$$


Hence, setting $N = \#\{j : j_0 \succeq j \succ \ell\}$, we obtain

$$\begin{aligned}
&\limsup_{x\to\infty} x^{\alpha_{j_0}}\, P(|Q^*_W(s_1, s_2)| > x) \\
&\quad\le \limsup_{x\to\infty} x^{\alpha_{j_0}}\, P\Bigl(\sum_{j \,:\, j_0 \succeq j \succ \ell} \pi_{\ell j}(s_1)|R_{j j_0}(-s_1, -s_1 - s_2 + 1)| + \pi_{\ell\ell}(s_1)\pi_{\ell j_0}(-s_1, -s_1 - s_2 + 1) W_{j_0,-s_1-s_2} > x\Bigr) \\
&\quad\le \limsup_{x\to\infty} x^{\alpha_{j_0}} \Bigl(\sum_{j \,:\, j_0 \succeq j \succ \ell} P\Bigl(\pi_{\ell j}(s_1)|R_{j j_0}(-s_1, -s_1 - s_2 + 1)| > \frac{x}{N+1}\Bigr) + P\Bigl(\pi_{\ell\ell}(s_1)\pi_{\ell j_0}(-s_1, -s_1 - s_2 + 1) W_{j_0,-s_1-s_2} > \frac{x}{N+1}\Bigr)\Bigr) \\
&\quad\le \sum_{j \,:\, j_0 \succeq j \succ \ell} \limsup_{x\to\infty} x^{\alpha_{j_0}}\, P\bigl((N+1)\,\pi_{\ell j}(s_1)|R_{j j_0}(-s_1, -s_1 - s_2 + 1)| > x\bigr) \\
&\qquad + \limsup_{x\to\infty} x^{\alpha_{j_0}}\, P\bigl((N+1)\,\pi_{\ell\ell}(s_1)\pi_{\ell j_0}(-s_1, -s_1 - s_2 + 1) W_{j_0,-s_1-s_2} > x\bigr) \\
&\quad\le \sum_{j \,:\, j_0 \succeq j \succ \ell} (N+1)^{\alpha_{j_0}} u_\ell \cdot \varepsilon'' + (N+1)^{\alpha_{j_0}} \rho^{s_1} u_\ell \cdot C_{j_0}. 
\end{aligned} \qquad (5.40)$$

For the last inequality, we used Lemma B.1 and the fact that $\pi_{\ell\ell}(s_1) = \Pi^{(\ell)}_{0,-s_1+1}$. Since $\rho = E A_{\ell\ell}^{\alpha_{j_0}} < 1$, there is $s' = s'(\varepsilon'')$ such that $\rho^{s_1} u_\ell \cdot C_{j_0} < (N+1)^{\alpha_{j_0}} u_\ell \cdot \varepsilon''$ for all $s_1 > s'$. Then, recalling that the sum in the last expression contains at most $N - 1$ nonzero terms, the final estimate is

$$\limsup_{x\to\infty} x^{\alpha_{j_0}}\, P(|Q^*_W(s_1, s_2)| > x) < (N+1)^{\alpha_{j_0}+1} u_\ell \cdot \varepsilon''. \qquad (E.4)$$

Now, we are going to evaluate $R_{\ell j_0}(s)$ of (5.29). The desired estimate can be obtained only if $s$ is chosen properly.

For convenience, we briefly recall the conditions on $s_1$ and $s_2$ that were necessary to obtain the estimates (E.1)–(E.4). The inequalities (E.1) and (E.2) do not rely on any assumption on $s_1$ or $s_2$. The other relations are the following. Firstly, to obtain the inequality (E.3) we need to assume that $s_1 > s_0(\ell, \varepsilon')$. Secondly, the estimates $s_1 > \max\{s_j : j_0 \succeq j \succ \ell\}$ and $s_2 > \max\{s_j(\varepsilon'') : j_0 \succeq j \succ \ell\}$ are used to prove (5.40). Passing from (5.40) to (E.4) relies on the condition $s_1 > s'$.

Now, let

$$s > s_0(\ell, \varepsilon') \vee \max\{s_j : j_0 \succeq j \succ \ell\} \vee s' + \max\{s_j(\varepsilon'') : j_0 \succeq j \succ \ell\} + 1,$$


where $\cdot \vee \cdot = \max\{\cdot, \cdot\}$. Then, there are $s_1 > s_0(\ell, \varepsilon') \vee \max\{s_j : j_0 \succeq j \succ \ell\} \vee s'$ and $s_2 > \max\{s_j(\varepsilon'') : j_0 \succeq j \succ \ell\}$ such that $s = s_1 + s_2$. Then,

$$R_{\ell j_0}(s) = R_{\ell j_0}(s_1 + s_2) = Q''_W(s_1) + Q^*_W(s_1, s_2) + Q_B(s_1) + Q_T(s_1).$$

The numbers $s_1$ and $s_2$ were chosen in such a way that (E.3) and (E.4) hold. The terms $Q''_W(s_1)$ and $Q_B(s_1)$ are negligible in the asymptotics. Therefore, it follows that

$$\begin{aligned}
\limsup_{x\to\infty} x^{\alpha_{j_0}}\, P(|R_{\ell j_0}(s)| > x) &= \limsup_{x\to\infty} x^{\alpha_{j_0}}\, P(|Q^*_W(s_1, s_2) + Q_T(s_1)| > x) \\
&\le \limsup_{x\to\infty} x^{\alpha_{j_0}}\, P(|Q^*_W(s_1, s_2)| > x/2) + \limsup_{x\to\infty} x^{\alpha_{j_0}}\, P(Q_T(s_1) > x/2) \\
&\le 2^{\alpha_{j_0}}\bigl((N+1)^{\alpha_{j_0}+1} u_\ell \cdot \varepsilon'' + \varepsilon'\bigr).
\end{aligned}$$

Since $\varepsilon'$ and $\varepsilon''$ are arbitrary, we obtain (5.29). □
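Before turning to applications, it may help to see the mechanism of Lemmas 5.9 and 5.10 numerically. The following minimal sketch (our illustration, not part of the paper; the lognormal diagonal laws, the uniform off-diagonal entry and all parameter values are assumptions) simulates a two-dimensional upper triangular SRE in which $\alpha_1 = 4$ and $\alpha_2 = 2$, so that $j_0 = 2$ carries the heaviest tail; Hill estimates of the tail indices of both components should then be close to $\widetilde\alpha_1 = \widetilde\alpha_2 = 2$.

```python
import numpy as np

rng = np.random.default_rng(4)
T, burn = 500_000, 1_000

# Diagonal entries A_ii = exp(N(mu_i, 1)) calibrated so that E A_ii^{alpha_i} = 1:
# mu_i = -alpha_i / 2 gives alpha_1 = 4, alpha_2 = 2 (assumed values).
mu1, mu2 = -2.0, -1.0

W = np.ones(2)
out = np.empty((T, 2))
for t in range(T + burn):
    A11 = np.exp(rng.normal(mu1, 1.0))
    A22 = np.exp(rng.normal(mu2, 1.0))
    A12 = rng.uniform(0.0, 1.0)                    # off-diagonal: W_1 depends on W_2
    W = np.array([A11 * W[0] + A12 * W[1] + 1.0,   # upper triangular SRE,
                  A22 * W[1] + 1.0])               # with B_t = (1, 1)'
    if t >= burn:
        out[t - burn] = W

def hill(x, k=1_000):
    # Hill estimator of the tail index based on the k upper order statistics.
    xs = np.sort(x)[-k:]
    return 1.0 / np.mean(np.log(xs / xs[0]))

print(hill(out[:, 0]), hill(out[:, 1]))  # both estimates should be near 2
```

The first coordinate inherits the heavy tail of $W_2$ through the term $\pi_{12}(s) W_{2,-s}$, which is exactly what the representation (5.28) formalizes.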

6 Applications

Although there must be several applications, we focus on multivariate GARCH(1, 1) processes, which are our main motivation. In particular, we consider the constant conditional correlation model of [4] and [18], which is the most fundamental multivariate GARCH process. Related results are the following. The tail of multivariate GARCH(p, q) has been investigated in [13], but in the setting of Goldie's condition. A bivariate GARCH(1, 1) series with a triangular setting has been studied in [23] and [10]. In particular, [10] presented a detailed analysis, including the exact tail behaviors of both the price and volatility processes. Since the details of the application are analogous to the bivariate GARCH(1, 1) case, we only show how the upper triangular SREs are constructed from multivariate GARCH processes.

Let $\alpha_0$ be a $d$-dimensional vector with positive elements, and let $\alpha$ and $\beta$ be $d \times d$ upper triangular matrices whose nonzero elements are strictly positive. For a vector $x = (x_1, \ldots, x_d)$, write $x^\gamma = (x_1^\gamma, \ldots, x_d^\gamma)$ for $\gamma > 0$. Then, we say that the $d$-dimensional series $X_t = (X_{1,t}, \ldots, X_{d,t})'$, $t \in \mathbb{Z}$, has a GARCH(1, 1) structure if it satisfies

$$X_t = \Sigma_t Z_t,$$

where the $Z_t = (Z_{1,t}, \ldots, Z_{d,t})'$ constitute i.i.d. $d$-variate random vectors and the matrix $\Sigma_t$ is

$$\Sigma_t = \mathrm{diag}(\sigma_{1,t}, \ldots, \sigma_{d,t}).$$

Moreover, the system of the volatility vector $(\sigma_{1,t}, \ldots, \sigma_{d,t})'$ is given by that of the squared process $W_t = (\sigma_{1,t}^2, \ldots, \sigma_{d,t}^2)'$. Observe that $X_t = \Sigma_t Z_t = \mathrm{diag}(Z_t) W_t^{1/2}$, so that $X_t^2 = \mathrm{diag}(Z_t^2) W_t$. Then, $W_t$ is given by the following autoregressive model:

$$W_t = \alpha_0 + \alpha X_{t-1}^2 + \beta W_{t-1} = \alpha_0 + (\alpha\, \mathrm{diag}(Z_{t-1}^2) + \beta) W_{t-1}.$$

Now, putting $B_t := \alpha_0$ and $A_t := \alpha\, \mathrm{diag}(Z_{t-1}^2) + \beta$, we obtain the SRE $W_t = A_t W_{t-1} + B_t$ with $A_t$ upper triangular with probability one. Each component of $A_t$ is written as

$$A_{ij,t} = \alpha_{ij} Z_{j,t-1}^2 + \beta_{ij}, \quad i \le j, \qquad \text{and} \qquad A_{ij,t} = 0, \quad i > j \quad \text{a.s.}$$

Thus, we can apply our main theorem to the squared volatility process $W_t$ and obtain the tail indices for $W_t$. From this, one can derive the tail behavior of $X_t$, as done in [10].
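To make the construction concrete, here is a minimal sketch (ours, not from the paper; the coefficient values and Gaussian innovations are assumptions) that builds $A_t = \alpha\, \mathrm{diag}(Z_{t-1}^2) + \beta$ and $B_t = \alpha_0$ for a bivariate model, iterates the SRE for the squared volatilities $W_t$, and recovers the observations $X_t = \mathrm{diag}(Z_t) W_t^{1/2}$:

```python
import numpy as np

rng = np.random.default_rng(5)
d, T = 2, 10_000

# Assumed GARCH(1,1) coefficients: alpha0 > 0, alpha and beta upper triangular.
alpha0 = np.array([0.10, 0.10])
alpha = np.array([[0.10, 0.05],
                  [0.00, 0.30]])
beta = np.array([[0.50, 0.00],
                 [0.00, 0.40]])

W = np.ones(d)                    # W_t = (sigma_{1,t}^2, ..., sigma_{d,t}^2)'
X = np.empty((T, d))
Z_prev = rng.standard_normal(d)
for t in range(T):
    A = alpha * Z_prev[None, :]**2 + beta   # A_t = alpha diag(Z_{t-1}^2) + beta
    W = A @ W + alpha0                      # SRE: W_t = A_t W_{t-1} + B_t
    Z = rng.standard_normal(d)
    X[t] = Z * np.sqrt(W)                   # X_t = diag(Z_t) W_t^{1/2}
    Z_prev = Z

print(X[-3:])                     # a few simulated observations
```

The matrix $A_t$ produced this way is upper triangular with probability one, so the triangular theory applies to $W_t$ provided the moment conditions on the diagonal entries $A_{ii,t} = \alpha_{ii} Z_{i,t-1}^2 + \beta_{ii}$ hold.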

Note that there are further applications to GARCH-type models. Indeed, we are considering applications to BEKK-ARCH models, whose tail behavior has been investigated in the diagonal setting (see [25]). There, we would need to widen our results to the case where the corresponding SRE takes values on the whole real line. The extension is possible if we assume certain restrictions and consider positive and negative extremes separately. Since the BEKK-ARCH model is another basic model in financial econometrics, the analysis in the triangular setting would provide more flexible tools for empirical analysis.

7 Conclusions and Further Comments

7.1 Constants

In the bivariate case, we can obtain the exact form of the constants for regularly varying tails (see [10]). The natural question is whether we can obtain the form of the constants also in the $d$-dimensional case. The answer is positive. We provide an example which illustrates the method of finding these constants when $d = 4$. Let

$$A = \begin{pmatrix} A_{11} & A_{12} & A_{13} & A_{14} \\ 0 & A_{22} & A_{23} & A_{24} \\ 0 & 0 & A_{33} & A_{34} \\ 0 & 0 & 0 & A_{44} \end{pmatrix}$$

and suppose that $\alpha_3 < \alpha_4 < \alpha_2 < \alpha_1$.

For coordinates $k = 3, 4$, we have

$$C_k = \frac{E[(A_{kk}W_k + B_k)^{\alpha_k} - (A_{kk}W_k)^{\alpha_k}]}{\alpha_k E[A_{kk}^{\alpha_k} \log A_{kk}]},$$

where $W_k$ is independent of $A_{kk}$ and $B_k$. This is the Kesten–Goldie constant (see [15, Theorem 4.1]). Indeed, since $W_4$ is a solution to a univariate SRE, we immediately obtain the constant. Since the tail index of $W_3$ is equal to $\widetilde\alpha_3 = \alpha_3$, the constant follows by (4.6) in Lemma 4.3. For the second coordinate, we have the equation $W_2 \stackrel{d}{=} A_{22}W_2 + A_{23}W_3 + A_{24}W_4$. Since $\widetilde\alpha_2 = \alpha_3$ and $\alpha_4 > \alpha_3$, the term $A_{23}W_3$ dominates all the others in the asymptotics. In view of (3.12), we obtain

$$C_2 = u_2 \cdot C_3,$$

where the quantity $u_2$ is given in Lemma 5.7.

The situation seems more complicated for the first coordinate, because we have the condition $\widetilde\alpha_1 = \widetilde\alpha_2 = \alpha_3$ on the SRE $W_1 \stackrel{d}{=} A_{11}W_1 + A_{12}W_2 + A_{13}W_3 + A_{14}W_4$. This means that the tail of $W_1$ comes from $W_2$ and $W_3$, both of which have dominating tails, and we cannot single out one dominant term. However, by Lemmas 5.7 and 5.9 again we obtain a simple formula:

$$C_1 = u_1 \cdot C_3.$$

We can write the general recursive formula for the constants in any dimension:

$$C_k = \begin{cases} \dfrac{E[(A_{kk}W_k + B_k)^{\alpha_k} - (A_{kk}W_k)^{\alpha_k}]}{\alpha_k E[A_{kk}^{\alpha_k} \log A_{kk}]} & \text{if } \alpha_k = \widetilde\alpha_k, \\[2ex] u_k \cdot C_{j_0} & \text{if } \widetilde\alpha_k = \alpha_{j_0} < \alpha_k. \end{cases}$$

Finally, we notice that the $u_k$ have only closed forms involving infinite sums. Exact values of the $u_k$ seem impossible to obtain, and the only method to calculate them would be numerical approximation. The situation is similar to that of the Kesten–Goldie constant (see [24]).
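To illustrate the numerical route in the simplest case, the sketch below (our illustration; the lognormal law of $A$, $B \equiv 1$ and $\alpha = 2$ are assumptions) approximates the Kesten–Goldie constant of a univariate SRE by Monte Carlo, using the first branch of the formula above with $W$ sampled from a long trajectory:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, sigma = 2.0, 1.0
mu = -alpha * sigma**2 / 2          # ensures E A^alpha = 1, i.e., tail index alpha

n, burn = 500_000, 1_000
W = np.empty(n)
w = 1.0
for i in range(burn + n):           # iterate W_t = A_t W_{t-1} + 1 to near-stationarity
    w = np.exp(rng.normal(mu, sigma)) * w + 1.0
    if i >= burn:
        W[i - burn] = w

A = np.exp(rng.normal(mu, sigma, n))                   # fresh draws, independent of W
num = np.mean((A * W + 1.0)**alpha - (A * W)**alpha)   # for alpha = 2 this is E[2AW + 1]
A2 = np.exp(rng.normal(mu, sigma, n))
den = alpha * np.mean(A2**alpha * np.log(A2))          # alpha * E[A^alpha log A]
print(num / den)                    # Monte Carlo estimate of the constant C
```

Convergence is slow because the summands in the numerator are themselves heavy-tailed; in higher dimensions the same idea applies to the first branch, while the constants $u_k$ in the second branch would additionally require truncating their infinite series.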

7.2 Open Questions

In order to obtain the tail asymptotics of SREs such as (1.1), Kesten's theorem has been the key tool (see [9]). However, when the coefficients of the SRE are upper triangular matrices, as in our case, the assumptions of the theorem are not satisfied, so we could not rely on it. Fortunately, in our setting we can obtain the exact tail asymptotics of each coordinate, namely $P(W_k > x) \sim C_k x^{-\widetilde\alpha_k}$. However, in a general setting one does not necessarily obtain such asymptotics, even in the upper triangular case.

An example is given in [11], which we briefly recall. Let $A$ be an upper triangular matrix with $A_{11} = A_{22}$ having the index $\alpha > 0$. Then, depending on additional assumptions, it can be either $P(W_1 > x) \sim C x^{-\alpha}(\log x)^{\alpha/2}$ or $P(W_1 > x) \sim C' x^{-\alpha}(\log x)^{\alpha}$ for some constants $C, C' > 0$.

There are many natural further questions to ask. What happens to the solution if the indices $\alpha_i$ of some different coordinates are equal? How could we find the tail asymptotics when the coefficient matrix is neither within Kesten's framework nor upper triangular? Moreover, if $A$ includes negative entries, can we derive the tail asymptotics? These are all open questions.


Acknowledgements The authors are grateful to Ewa Damek and Dariusz Buraczewski for valuable discussions on the subject of the paper.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Appendix A. Negativity of Top Lyapunov Exponent

We provide the proof of the negativity of $\gamma_A$ stated in Sect. 2. It can also be deduced from more general results of Straumann [28] or Gerencsér, Michaletzky and Orlovits [14] on the top Lyapunov exponent for block triangular matrices. However, in our case there is a direct elementary approach based on the equivalence of norms and Gelfand's formula.

Since all matrix norms are equivalent, we can use the norm

$$\|A\|_1 = \sum_{i=1}^d \sum_{j=1}^d |A_{ij}|,$$

which is submultiplicative: $\|AB\|_1 \le \|A\|_1 \|B\|_1$. Then, since $A$ has nonnegative entries, we have $E\|A\|_1 = \|EA\|_1$. Moreover, for any $\varepsilon \in (0, 1)$, $\|A\|_1^\varepsilon \le \|A^\varepsilon\|_1$, where $A^\varepsilon$ denotes the matrix $A$ with each entry raised to the power $\varepsilon$. We are going to apply these facts to the formula for the top Lyapunov exponent $\gamma_A$. For any $\varepsilon > 0$, by Jensen's inequality we have

$$\gamma_A = \inf_{n \ge 1} (n\varepsilon)^{-1} E \log \|\Pi_n\|_1^\varepsilon \le \inf_{n \ge 1} (n\varepsilon)^{-1} \log E\|\Pi_n\|_1^\varepsilon.$$

Then, from the properties of $\|\cdot\|_1$ above, we infer that

$$E\|\Pi_n\|_1^\varepsilon \le E\|\Pi_n^\varepsilon\|_1 = \|E\Pi_n^\varepsilon\|_1 \le \|E\Pi_n^{(\varepsilon)}\|_1, \qquad (A.1)$$

where $\Pi_n^{(\varepsilon)} = A_0^\varepsilon \cdots A_{-n+1}^\varepsilon$. The last inequality follows from the subadditivity of the function $f(x) = x^\varepsilon$ on $[0, \infty)$. Since the matrices $A_i$ are i.i.d., we have $E\Pi_n^{(\varepsilon)} = (EA^\varepsilon)^n$. Here, we take the $n$th power in terms of matrix multiplication. Hence,

$$\gamma_A \le \inf_{n \ge 1} (n\varepsilon)^{-1} \log \|(EA^\varepsilon)^n\|_1 = \frac{1}{\varepsilon} \inf_{n \ge 1} \log \|(EA^\varepsilon)^n\|_1^{1/n}.$$


From Gelfand's formula (e.g., [2, (1.3.3)]), for any matrix norm $\|\cdot\|$ we can write

$$\lim_{n\to\infty} \|(EA^\varepsilon)^n\|^{1/n} = \rho(EA^\varepsilon).$$

Taking the norm $\|\cdot\|_1$, we obtain

$$\gamma_A \le \frac{1}{\varepsilon} \inf_{n \ge 1} \log \|(EA^\varepsilon)^n\|_1^{1/n} \le \frac{1}{\varepsilon} \lim_{n\to\infty} \log \|(EA^\varepsilon)^n\|_1^{1/n} = \frac{1}{\varepsilon} \log \rho(EA^\varepsilon).$$

Hence, it suffices to show that $\rho(EA^\varepsilon) < 1$. If $0 < \varepsilon < \min\{\alpha_1, \ldots, \alpha_d\}$, then from condition (T-4)

$$EA_{ii}^\varepsilon < 1, \quad i = 1, \ldots, d. \qquad (A.2)$$

Since $EA^\varepsilon$ is upper triangular, its eigenvalues are the diagonal entries $EA_{ii}^\varepsilon$, so the spectral radius $\rho(EA^\varepsilon)$ is the maximal of them; hence $\rho(EA^\varepsilon) < 1$ follows from (A.2), which gives $\gamma_A < 0$ and thus stationarity.
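For a quick numerical check of this criterion (our sketch; the coefficient values are assumptions), one can estimate $EA^\varepsilon$ entrywise by Monte Carlo and inspect its spectral radius, which for a triangular matrix is just the maximal diagonal entry:

```python
import numpy as np

rng = np.random.default_rng(2)
n, eps = 200_000, 0.5

# Assumed GARCH-type random matrix A_t = alpha diag(Z_t^2) + beta.
alpha = np.array([[0.10, 0.05],
                  [0.00, 0.30]])
beta = np.array([[0.50, 0.00],
                 [0.00, 0.40]])

Z2 = rng.standard_normal((n, 1, 2))**2
A = alpha * Z2 + beta                    # n samples of the 2x2 random matrix
EAeps = (A**eps).mean(axis=0)            # entrywise Monte Carlo estimate of E A^eps

rho = np.abs(np.linalg.eigvals(EAeps)).max()
print(rho, rho < 1)                      # rho < 1 certifies gamma_A < 0
```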

Remark A.1 (i) By the equivalence of matrix norms, the argument above works for any norm. In order to see this, take a norm $\|\cdot\|$ and apply the inequality $\|A\| \le c\|A\|_1$. Then, by (A.1) we obtain $E\|\Pi_n\|^\varepsilon \le c^\varepsilon \|E\Pi_n^{(\varepsilon)}\|_1$. Since $\lim_{n\to\infty} c^{\varepsilon/n} = 1$ for any constant $c > 0$, the whole argument holds.

(ii) By the submultiplicativity of $\|\cdot\|_1$, it is immediate to see that

$$\gamma_A \le \frac{1}{\varepsilon} \log \|EA^\varepsilon\|_1.$$

However, we do not have any control on the norm $\|EA^\varepsilon\|_1$; in particular, it can be greater than 1 for every $\varepsilon$. It is essential in our situation that $\rho(EA^\varepsilon) \le \|EA^\varepsilon\|_1$, and invoking Gelfand's formula is necessary to obtain the desired bound.

Appendix B. Version of Breiman’s Lemma

We provide a slightly modified version of the classical Breiman lemma (e.g., [9, Lemma B.5.1]), since it is needed in the proof of (5.29) in Lemma 5.9. In Breiman's lemma, one usually assumes regular variation of the dominant r.v. of the two, which we could not apply in our situation. Instead, we require only an upper estimate of the tail. The price of the weaker assumptions is a weaker result: instead of the exact asymptotics of the product, we obtain just an estimate from above. The generalization is rather standard, but we include it for completeness.

Lemma B.1 Assume that $X$ and $Y$ are independent nonnegative r.v.'s and that, for some $\alpha > 0$, the following conditions hold:

$$\limsup_{x\to\infty} x^\alpha\, P(Y > x) < M \quad \text{for a constant } M > 0; \qquad (B.1)$$

$$EX^{\alpha+\varepsilon} < \infty \quad \text{for some } \varepsilon > 0. \qquad (B.2)$$

Then,

$$\limsup_{x\to\infty} x^\alpha\, P(XY > x) \le M \cdot EX^\alpha.$$

Proof The idea is the same as that in the original proof of Breiman's lemma; see, e.g., [9, Lemma B.5.1]. Let $P_X$ denote the law of $X$. Then, for any fixed $m > 0$ we can write

$$P(XY > x) = \int_{(0,\infty)} P(Y > x/z)\, P_X(dz) = \Bigl(\int_{(0,m]} + \int_{(m,\infty),\ x/z > x_0} + \int_{(m,\infty),\ x/z \le x_0}\Bigr) P(Y > x/z)\, P_X(dz).$$

By (B.1), there is $x_0$ such that $x^\alpha P(Y > x) \le M$ uniformly in $x \ge x_0$. For the first integral, since $x/z \ge x_0$ for $x \ge m x_0$ and $z \in (0, m]$, by Fatou's lemma

$$\limsup_{x\to\infty} x^\alpha \int_{(0,m]} P(Y > x/z)\, P_X(dz) \le \int_{(0,m]} z^\alpha \limsup_{x\to\infty} (x/z)^\alpha P(Y > x/z)\, P_X(dz) \le M \int_{(0,m]} z^\alpha\, P_X(dz) \xrightarrow{m\to\infty} M \cdot EX^\alpha. \qquad (B.3)$$

Since $x/z > x_0$, the same argument as above is applicable to the second integral:

$$\begin{aligned}
\limsup_{x\to\infty} x^\alpha \int_{(m,\infty),\ x/z > x_0} P(Y > x/z)\, P_X(dz) &\le \int_{z > m} z^\alpha \limsup_{x\to\infty} (x/z)^\alpha P(Y > x/z)\, P_X(dz) \qquad (B.4) \\
&\le M \int_{z > m} z^\alpha\, P_X(dz) \xrightarrow{m\to\infty} 0. \qquad (B.5)
\end{aligned}$$

The assumption (B.2) allows us to use Markov's inequality to estimate the last integral:

$$\int_{(m,\infty),\ x/z \le x_0} P(Y > x/z)\, P_X(dz) \le \int_{x/z \le x_0} P_X(dz) = P(X \ge x/x_0) \le (x/x_0)^{-(\alpha+\varepsilon)} EX^{\alpha+\varepsilon}, \qquad (B.6)$$

and therefore the last integral, multiplied by $x^\alpha$, is negligible as $x \to \infty$ regardless of $m$. Now, in view of (B.3)–(B.6), letting $m$ tend to infinity, we obtain the result. □
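As a sanity check of the lemma (our illustration; the Pareto and uniform laws are assumptions), one can compare the empirical value of $x^\alpha P(XY > x)$ with the bound $M \cdot EX^\alpha$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, a = 2_000_000, 2.0

Y = rng.pareto(a, n) + 1.0     # P(Y > y) = y^{-a} for y >= 1, so (B.1) holds with M = 1
X = rng.uniform(0.0, 2.0, n)   # E X^{a+eps} < infinity for every eps, so (B.2) holds
bound = np.mean(X**a)          # M * E X^a with M = 1

for x in (5.0, 20.0, 50.0):
    print(x, x**a * np.mean(X * Y > x), "<=", bound)
```

Here $Y$ is exactly Pareto, so the empirical values approach the bound, showing that it is tight; the lemma itself only guarantees the one-sided estimate.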

References

1. Alsmeyer, G., Mentemeier, S.: Tail behavior of stationary solutions of random difference equations: the case of regular matrices. J. Differ. Equ. Appl. 18, 1305–1332 (2012)
2. Belitskii, G.R., Lyubich, Yu.I.: Matrix Norms and Their Applications. Birkhäuser, Basel (1988)
3. Bingham, N.H., Goldie, C.M., Teugels, J.L.: Regular Variation. Cambridge University Press, Cambridge (1987)
4. Bollerslev, T.: Modelling the coherence in short-run nominal exchange rates: a multivariate generalised ARCH model. Rev. Econ. Stat. 72, 498–505 (1990)
5. Bougerol, P., Picard, N.: Strict stationarity of generalized autoregressive processes. Ann. Probab. 20, 1714–1730 (1992)
6. Brandt, A.: The stochastic equation Y_{n+1} = A_n Y_n + B_n with stationary coefficients. Adv. Appl. Probab. 18, 211–220 (1986)
7. Buraczewski, D., Damek, E.: Regular behavior at infinity of stationary measures of stochastic recursion on NA groups. Colloq. Math. 118, 499–523 (2010)
8. Buraczewski, D., Damek, E., Guivarc'h, Y., Hulanicki, A., Urban, R.: Tail-homogeneity of stationary measures for some multidimensional stochastic recursions. Probab. Theory Relat. Fields 145, 385–420 (2009)
9. Buraczewski, D., Damek, E., Mikosch, T.: Stochastic Models with Power-Law Tails. The Equation X = AX + B. Springer Int. Pub., Switzerland (2016)
10. Damek, E., Matsui, M., Świątkowski, W.: Componentwise different tail solutions for bivariate stochastic recurrence equations with application to GARCH(1, 1) processes. Colloq. Math. 155, 227–254 (2019)
11. Damek, E., Zienkiewicz, J.: Affine stochastic equation with triangular matrices. J. Differ. Equ. Appl. 24, 520–542 (2018)
12. Embrechts, P., Klüppelberg, C., Mikosch, T.: Modelling Extremal Events for Insurance and Finance. Springer, Berlin (1997)
13. Fernández, B., Muriel, N.: Regular variation and related results for the multivariate GARCH(p, q) model with constant conditional correlations. J. Multivariate Anal. 100, 1538–1550 (2009)
14. Gerencsér, L., Michaletzky, G., Orlovits, Z.: Stability of block-triangular stationary random matrices. Syst. Control Lett. 57, 620–625 (2008)
15. Goldie, C.M.: Implicit renewal theory and tails of solutions of random equations. Ann. Appl. Probab. 1, 126–166 (1991)
16. Guivarc'h, Y., Le Page, É.: Spectral gap properties for linear random walks and Pareto's asymptotics for affine stochastic recursions. Ann. Inst. H. Poincaré Probab. Statist. 52, 503–574 (2016)
17. Horváth, R., Boril, S.: GARCH models, tail indexes and error distributions: an empirical investigation. N. Am. J. Econ. Finance 37, 1–15 (2016)
18. Jeantheau, T.: Strong consistency of estimators for multivariate ARCH models. Econ. Theory 14, 70–86 (1998)
19. Kesten, H.: Random difference equations and renewal theory for products of random matrices. Acta Math. 131, 207–248 (1973)
20. Klüppelberg, C., Pergamenchtchikov, S.: The tail of the stationary distribution of a random coefficient AR(q) model. Ann. Appl. Probab. 14, 971–1005 (2004)
21. Krengel, U.: Ergodic Theorems. De Gruyter, Berlin (1985)
22. McNeil, A.J., Frey, R., Embrechts, P.: Quantitative Risk Management: Concepts, Techniques and Tools. Princeton Series in Finance. Princeton University Press, Princeton (2015)
23. Matsui, M., Mikosch, T.: The extremogram and the cross-extremogram for a bivariate GARCH(1, 1) process. Adv. Appl. Probab. 48A, 217–233 (2016)
24. Mikosch, T., Samorodnitsky, G., Tafakori, L.: Fractional moments of solutions to stochastic recurrence equations. J. Appl. Probab. 50, 969–982 (2013)
25. Pedersen, R.S., Wintenberger, O.: On the tail behavior of a class of multivariate conditionally heteroskedastic processes. Extremes 21, 261–284 (2018)
26. Resnick, S.I.: Extreme Values, Regular Variation, and Point Processes. Springer, New York (1987)
27. Resnick, S.I.: Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer, New York (2007)
28. Straumann, D.: Estimation in Conditionally Heteroscedastic Time Series Models. Lecture Notes in Statistics 181. Springer, Berlin (2005)
29. Sun, P., Zhou, C.: Diagnosing the distribution of GARCH innovations. J. Empir. Finance 29, 287–303 (2014)

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
