Journal of Theoretical Probability, Vol. 16, No. 4, October 2003 (© 2004)
Large Deviations of Empirical Measures under Symmetric Interaction
Włodzimierz Bryc$^1$

$^1$ Department of Mathematics, University of Cincinnati, P.O. Box 210025, Cincinnati, Ohio 45221-0025. Email: [email protected]
Received May 22, 2002; revised June 30, 2003
We prove the large deviation principle for the joint empirical measure of pairs of random variables which are coupled by a "totally symmetric" interaction. The rate function is given by an explicit bilinear expression, which is finite only on product measures and hence is non-convex.

KEY WORDS: Large deviations; symmetric interaction; non-convex rate function.
1. INTRODUCTION
1.1. Large deviations of empirical measures have been widely studied in the literature since the celebrated Sanov's theorem, which gives the large deviation principle in the scale of $n$ for the empirical measures of i.i.d. random variables with the relative entropy $H(\mu\,|\,\nu)=\int\log\frac{d\mu}{d\nu}\,d\mu$ as the rate function. Another entropy, Voiculescu's non-commutative entropy $S(\mu)=\iint\log|x-y|\,\mu(dx)\,\mu(dy)$, arises in the study of fluctuations of eigenvalues of random matrices, see Hiai and Petz$^{(8)}$ and the references therein. Chan$^{(4)}$ interprets empirical measures of eigenvalues of random matrices as a system of interacting diffusions with singular interactions.
1.2. In this paper we study empirical measures which can be thought of as a decoupled version of the empirical measures generated by random matrices. We are interested in empirical measures on $\mathbb{R}^2$ generated by pairs of random variables that are tied together by a totally symmetric, and hence non-local, interaction, see formula (1) for the (unnormalized) joint density. Under certain assumptions, we prove that the large deviation principle in the scale $n^2$ holds for the joint empirical measures, and the rate function is non-convex. As a corollary, we derive a large deviation principle for the univariate average empirical measures with a rate function that superficially resembles the rate function of random matrices, see Corollary 1; an interesting feature here is the emergence of concave rate functions, see Remark 3. (Eigenvalues of random matrices are exchangeable and the large deviation rate function for their empirical measures is convex; infinite exchangeable sequences often lead to non-convex rate functions, see Dinwoodie and Zabell$^{(7)}$ and Ref. 3, Example 3.)
1.3. Let $g:\mathbb{R}^2\to\mathbb{R}$ be a continuous function which satisfies the following conditions.
Assumption 1. $g(x, y)\ge 0$ for all $x, y\in\mathbb{R}$.
Assumption 2. For every $0<a\le 1$, $M_a:=\iint g^a(x, y)\,dx\,dy<\infty$.
Assumption 3. $g(x, y)$ is bounded, $g(x, y)\le e^C$.
In the following statements we use the convention that $-\log 0=\infty$.
Assumption 4. The function $k(x, y):=-\log g(x, y)$ has compact level sets: for every $a>0$ the set $\{(x, y): g(x, y)\ge e^{-a}\}\subset\mathbb{R}^2$ is compact.
The purpose of the next assumption is to allow singular interactions, where $g(x, x)=0$; this assumption is automatically satisfied with $b=0$ if $g(x, y)>0$ for all $x, y$.

Assumption 5. There is a $b\ge 0$ such that $(x, y)\mapsto b\log|x-y|-\log g(x, y)$ extends from $\{(x, y): x\ne y\}$ to a continuous function on $\mathbb{R}^2$.
Examples of functions that satisfy these assumptions are: the Gaussian kernel
\[
g(x, y)=e^{-x^2-y^2+2\theta xy}
\]
for $|\theta|<1$, see the proof of Proposition 1; and a singular kernel
\[
g(x, y)=|x-y|^b\, e^{-x^2-y^2}
\]
for $b\ge 0$, see the proof of Proposition 2.
Define
\[
f(x_1,\dots,x_n,y_1,\dots,y_n)=\prod_{i,j=1}^n g(x_i, y_j). \tag{1}
\]
Clearly, $f$ depends on $n$; we will suppress this dependence in our notation and we will further write $f(x, y)$ as a convenient shorthand for $f(x_1,\dots,x_n,y_1,\dots,y_n)$.
Assumptions 1, 2, and 3 imply that $f$ is integrable. Indeed, since $g(x, y)\le e^C$,
\[
Z_n:=\int_{\mathbb{R}^{2n}} f(x_1,\dots,x_n,y_1,\dots,y_n)\,dx_1\cdots dx_n\,dy_1\cdots dy_n
\le \int_{\mathbb{R}^{2n}}\prod_{i=1}^n\bigl(g(x_i, y_i)\,e^{C(n-1)}\bigr)\,dx_1\cdots dx_n\,dy_1\cdots dy_n
=e^{C(n^2-n)} M_1^n<\infty.
\]
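As a sanity check (not from the paper), the inequality behind the bound on $Z_n$ can be tested numerically for the Gaussian kernel of Section 2.1: for $|\theta|\le 1$ we have $-x^2-y^2+2\theta xy\le 0$, so $g\le 1$ and Assumption 3 holds with $C=0$; the estimate then reduces to $f\le\prod_i g(x_i, y_i)$. A minimal Python sketch with hypothetical sample values:

```python
import math
import random

def g(x, y, theta=0.5):
    # Gaussian kernel of Section 2.1; for |theta| <= 1 the exponent
    # -x^2 - y^2 + 2*theta*x*y is <= 0, so g <= 1 (Assumption 3 with C = 0).
    return math.exp(-x * x - y * y + 2 * theta * x * y)

def log_f(xs, ys, theta=0.5):
    # log of the unnormalized joint density (1): f = prod_{i,j} g(x_i, y_j)
    return sum(math.log(g(x, y, theta)) for x in xs for y in ys)

random.seed(0)
n = 6
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [random.gauss(0, 1) for _ in range(n)]

# With C = 0 the estimate used for Z_n reduces to f <= prod_i g(x_i, y_i),
# i.e. log f is at most the sum of the diagonal terms.
diagonal = sum(math.log(g(x, y)) for x, y in zip(xs, ys))
assert log_f(xs, ys) <= diagonal + 1e-12
```

Each off-diagonal factor satisfies $g(x_i, y_j)\le e^C$, which is exactly how the $e^{C(n^2-n)}$ term arises in the display above.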
We are interested in joint empirical measures
\[
\mu_n=\frac{1}{n^2}\sum_{i,j=1}^n \delta_{(x_i, y_j)}, \tag{2}
\]
considered as random variables with values in the Polish space of probability measures $P(\mathbb{R}^2)$ (equipped with the topology of weak convergence), with the distribution induced on $P(\mathbb{R}^2)$ by the probability measure $\Pr=\Pr_n\in P(\mathbb{R}^{2n})$ defined by
\[
\Pr(dx, dy):=\frac{1}{Z_n}\, f(x, y)\,dx\,dy. \tag{3}
\]
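A small illustration, not from the paper, of a structural fact used implicitly below (for instance in the proof of Lemma 1): the joint empirical measure (2) always factors as the product of its two marginal empirical measures. The toy data are hypothetical:

```python
from collections import Counter
from fractions import Fraction

def joint_empirical(xs, ys):
    # mu_n = (1/n^2) sum_{i,j} delta_{(x_i, y_j)} -- formula (2)
    n = len(xs)
    mu = Counter()
    for x in xs:
        for y in ys:
            mu[(x, y)] += Fraction(1, n * n)
    return mu

def marginal(points):
    # univariate empirical measure (1/n) sum_i delta_{p_i}
    n = len(points)
    nu = Counter()
    for p in points:
        nu[p] += Fraction(1, n)
    return nu

xs, ys = [0, 1, 1], [2, 3, 3]
mu = joint_empirical(xs, ys)
nu_x, nu_y = marginal(xs), marginal(ys)
# mu_n equals nu_x (tensor) nu_y exactly, which is consistent with the
# rate function (4) being finite only on product measures.
for (x, y), w in mu.items():
    assert w == nu_x[x] * nu_y[y]
```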
Theorem 1. If $g(x, y)$ satisfies Assumptions 1, 2, 3, 4, and 5, then the joint empirical measures $\{\mu_n\}$ satisfy the large deviation principle in the scale $n^2$ with the rate function $I: P(\mathbb{R}^2)\to[0,\infty]$ given by
\[
I(\mu)=\begin{cases}
\iint k(x, y)\,\nu_1(dx)\,\nu_2(dy)-I_0 & \text{if } \mu=\nu_1\otimes\nu_2 \text{ is a product measure and } k \text{ is } \mu\text{-integrable;}\\
\infty & \text{otherwise,}
\end{cases} \tag{4}
\]
where $k(x, y)=-\log g(x, y)$ and $I_0=\inf_{x, y\in\mathbb{R}} k(x, y)$.
Definition 1 (Ref. 2, Chapter 3). We say that $k:\mathbb{R}^2\to\mathbb{R}$ is a negative definite kernel if $k(x, y)=k(y, x)$ and
\[
\sum k(x_i, x_j)\,c_i c_j\le 0 \tag{5}
\]
for all $x_i, c_i\in\mathbb{R}$ such that $\sum c_i=0$.

Condition (5) is satisfied for $k(x, y)=V(x)+W(y)-\omega(x, y)$, where $\omega(x, y)$ is positive-definite.
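For instance, the kernel $k(x, y)=x^2+y^2-2\theta xy$ of Section 2.1 is negative definite for $\theta\ge 0$: under $\sum_i c_i=0$ the terms $x_i^2$ and $x_j^2$ drop out of the double sum, leaving $-2\theta\bigl(\sum_i x_i c_i\bigr)^2\le 0$. A randomized numerical check of condition (5) (a sketch; the value $\theta=0.7$ is arbitrary):

```python
import random

def quad_form(k, xs, cs):
    # left-hand side of condition (5): sum_{i,j} k(x_i, x_j) c_i c_j
    return sum(k(xi, xj) * ci * cj
               for xi, ci in zip(xs, cs)
               for xj, cj in zip(xs, cs))

theta = 0.7  # arbitrary, 0 <= theta < 1 as in Proposition 1(ii)
k = lambda x, y: x * x + y * y - 2 * theta * x * y

random.seed(1)
for _ in range(100):
    xs = [random.uniform(-5, 5) for _ in range(6)]
    cs = [random.uniform(-1, 1) for _ in range(5)]
    cs.append(-sum(cs))  # enforce the constraint sum_i c_i = 0
    assert quad_form(k, xs, cs) <= 1e-9
```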
Consider the average empirical measures
\[
\sigma_n:=\frac{1}{2n}\sum_{i=1}^n\bigl(\delta_{x_i}+\delta_{y_i}\bigr).
\]
Corollary 1. Suppose that the assumptions of Theorem 1 hold true, and in addition $k(x, y)$ is continuous and negative-definite. Then the average empirical measures $\{\sigma_n\}$ satisfy the large deviation principle in the scale $n^2$ with the rate function
\[
I(\nu)=\iint k(x, y)\,\nu(dx)\,\nu(dy)-I_0,
\]
and $I_0=\inf_x k(x, x)$.
Proof. This follows from the contraction principle. The mapping $\mu(\cdot)\mapsto \frac12\int\mu(\cdot, dy)+\frac12\int\mu(dx, \cdot)$ is continuous in the weak topology. The rate function is $I(\nu)=\inf\{I(\nu_1\otimes\nu_2):\ \nu=\frac12\nu_1+\frac12\nu_2\}$.

Write $K(\mu)=\iint k(x, y)\,\mu(dx, dy)$. If $\nu=\frac12\nu_1+\frac12\nu_2$ then Ref. 10, Theorem 3 implies that $K(\nu_1\otimes\nu_2)\ge K(\nu\otimes\nu)$. Thus $I(\nu)=K(\nu\otimes\nu)-I_0$.

Another form of the cited inequality is that for any two probability measures $\nu_1, \nu_2$ we have
\[
2K(\nu_1\otimes\nu_2)\ge K(\nu_1\otimes\nu_1)+K(\nu_2\otimes\nu_2). \tag{6}
\]
In particular, $2k(x, y)\ge k(x, x)+k(y, y)$, which implies that $I_0=\inf_{x, y} k(x, y)=\inf_x k(x, x)$. ∎
Remark 1. Inequality (6) implies that the rate function satisfies
\[
I\bigl(\tfrac12\nu_1+\tfrac12\nu_2\bigr)\ge \tfrac12 I(\nu_1)+\tfrac12 I(\nu_2).
\]
2. APPLICATIONS
2.1. Let $g(x, y)=e^{-x^2-y^2+2\theta xy}$. Then
\[
f(x, y)=\exp\Bigl(-n\sum_{i=1}^n x_i^2-n\sum_{j=1}^n y_j^2+2\theta\sum_{i,j=1}^n x_i y_j\Bigr)
\]
and $k(x, y)=x^2+y^2-2\theta xy$.

Denote by $m_r(\nu)=\int x^r\,\nu(dx)$ the $r$-th moment of a measure $\nu$.
Proposition 1.

(i) If $|\theta|<1$ then the empirical measures
\[
\nu_n:=\frac{1}{n}\sum_{i=1}^n \delta_{x_i}
\]
satisfy the large deviation principle in the scale $n^2$ with the rate function
\[
I(\nu)=m_2(\nu)-\theta^2 m_1^2(\nu).
\]

(ii) If $0\le\theta<1$ then the average empirical measures
\[
\sigma_n:=\frac{1}{2n}\sum_{i=1}^n\bigl(\delta_{x_i}+\delta_{y_i}\bigr)
\]
satisfy the large deviation principle in the scale $n^2$ with the rate function
\[
I(\nu)=2\bigl(m_2(\nu)-\theta\, m_1^2(\nu)\bigr).
\]
(In the formulas above, use $I(\nu)=\infty$ if $m_2(\nu)=\infty$.)
Remark 2. The marginal density relevant in Proposition 1(i) is
\[
f_1(x)=C(n,\theta)\exp\Bigl(-n^2\Bigl(\frac{1}{n}\sum_{i=1}^n x_i^2-\theta^2\Bigl(\frac{1}{n}\sum_{i=1}^n x_i\Bigr)^2\Bigr)\Bigr).
\]
Remark 3. Both rate functions in Proposition 1 are concave.
Proof. (i) It is easy to see that the assumptions of Theorem 1 are satisfied. Indeed, $k(x, y)=(x-\theta y)^2+(1-\theta^2) y^2$ is continuous and bounded from below. Furthermore,
\[
\{k(x, y)\le a^2\}\subset\{|y|\le |a|/(1-\theta^2)\}\cap\{|x|\le |a|/(1-\theta^2)\},
\]
so $k(x, y)$ has compact level sets. Finally, for $a>0$, by a change of variables we see that $\iint e^{-a k(x, y)}\,dx\,dy=\frac{1}{a}\iint e^{-k(x, y)}\,dx\,dy<\infty$, so Assumption 2 is satisfied, too.

The result follows by the contraction principle: taking a marginal of a measure in $P(\mathbb{R}^2)$ is a continuous mapping. The rate function is $\inf\{I(\mu):\ \nu(A)=\mu(A\times\mathbb{R})\}$. But since $I$ is infinite on non-product measures, this is the same as $\inf_{\nu_2}\{\iint k(x, y)\,\nu(dx)\,\nu_2(dy)-I_0\}$. Since $I_0=0$ here, it remains to notice that
\[
\inf_{\nu_2}\Bigl\{\iint k(x, y)\,\nu(dx)\,\nu_2(dy)\Bigr\}=\inf_y\Bigl\{\int k(x, y)\,\nu(dx)\Bigr\}
=\inf_y\bigl\{m_2(\nu)+y^2-2\theta y\, m_1(\nu)\bigr\}=m_2(\nu)-\theta^2 m_1^2(\nu).
\]

(ii) This follows from Corollary 1: if $\theta\ge 0$ then $2\theta xy$ is positive-definite. Thus $k(x, y)=x^2+y^2-2\theta xy$ is a negative definite kernel. ∎
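The inner minimization at the end of part (i) can be verified numerically; the moment values below are hypothetical (any pair with $m_2\ge m_1^2$ corresponds to a genuine measure):

```python
theta, m1, m2 = 0.6, 0.8, 1.5  # hypothetical values of theta, m_1(nu), m_2(nu)

def objective(y):
    # the inner infimum from the proof: m_2(nu) + y^2 - 2*theta*y*m_1(nu)
    return m2 + y * y - 2 * theta * y * m1

# brute-force grid search; the analytic minimizer is y* = theta*m_1(nu)
grid_min = min(objective(-5.0 + 1e-4 * i) for i in range(100001))
analytic = m2 - theta ** 2 * m1 ** 2
assert abs(grid_min - analytic) < 1e-6
```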
2.2. Next, we consider a model which can be interpreted as a "decoupled" version of a model studied in relation to eigenvalue fluctuations of random matrices, where one encounters $x_j$ instead of our $y_j$, compare Ref. 1, Section 5 and Ref. 9, formula (1.9). We consider here a slightly more general situation when
\[
g(x, y)=|x-y|^b\, e^{-V(x)-W(y)}.
\]
Then
\[
f(x, y)=\prod_{i,j=1}^n |x_i-y_j|^b \prod_{i=1}^n e^{-nV(x_i)} \prod_{j=1}^n e^{-nW(y_j)},
\]
and $k(x, y)=V(x)+W(y)-b\log|x-y|$. We assume that the functions $V(x), W(y)$ are continuous, $b\ge 0$, and that
\[
\lim_{|x|\to\infty}\frac{V(|x|)}{\log\sqrt{1+x^2}}=\lim_{|y|\to\infty}\frac{W(|y|)}{\log\sqrt{1+y^2}}=\infty. \tag{7}
\]
Proposition 2. The bivariate empirical measures $\mu_n$ defined by (2) satisfy the large deviation principle in the scale $n^2$ with the rate function $I$ given by (4).

In particular, if $V(u)=W(u)=u^2$, then the rate function is
\[
I(\nu_1\otimes\nu_2)=m_2(\nu_1)+m_2(\nu_2)-b\iint\log|x-y|\,\nu_1(dx)\,\nu_2(dy)+\tfrac12 b(\log b-1).
\]
Proof. We verify that the hypotheses of Theorem 1 are satisfied. Assumption 1 holds trivially. Assumption 5 holds trivially since $V(x)+W(y)$ is continuous.

To verify Assumption 3 notice that
\[
k(x, y)\ge V(x)+W(y)-b\log\sqrt{1+x^2}-b\log\sqrt{1+y^2}. \tag{8}
\]
Since $V(x)-b\log\sqrt{1+x^2}$ is a continuous function which by (7) tends to infinity as $x\to\pm\infty$, it is bounded from below: $V(x)-b\log\sqrt{1+x^2}\ge -c$ for some $c$. Similarly, $W(y)-b\log\sqrt{1+y^2}\ge -c$.

We now verify Assumption 4. The set $K_a:=\{k(x, y)\le a\}$ is closed since $k$ is lower semicontinuous. Furthermore, (8) implies that $K_a$ is contained in a level set of the continuous function $V(x)+W(y)-b\log\sqrt{1+x^2}-b\log\sqrt{1+y^2}$. The latter set is bounded since $V(x)-b\log\sqrt{1+x^2}>a+c$ for all large enough $|x|$ and similarly $W(y)-b\log\sqrt{1+y^2}>a+c$ for all large enough $|y|$.

To verify Assumption 2 we use inequality (8) again. It implies
\[
\iint g(x, y)^a\,dx\,dy\le \int e^{-a(V(x)-b\log\sqrt{1+x^2})}\,dx \int e^{-a(W(y)-b\log\sqrt{1+y^2})}\,dy.
\]
By assumption (7), there is $N>0$ such that for $|x|>N$ we have $V(x)>(b+2/a)\log\sqrt{1+x^2}$. By the previous argument the integrand is bounded; thus
\[
\int e^{-a(V(x)-b\log\sqrt{1+x^2})}\,dx
\le \int_{-N}^N e^{-a(V(x)-b\log\sqrt{1+x^2})}\,dx+\int_{|x|>N} e^{-a(2/a)\log\sqrt{1+x^2}}\,dx
\le 2N e^{ac}+\int_{|x|>N}\frac{1}{1+x^2}\,dx<\infty.
\]
Therefore, by Theorem 1 the empirical measures $\mu_n$ satisfy the large deviation principle with the rate function $I(\nu_1\otimes\nu_2)=\int V(x)\,\nu_1(dx)+\int W(y)\,\nu_2(dy)-b\iint\log|x-y|\,\nu_1(dx)\,\nu_2(dy)-I_0$. If $W(u)=V(u)=u^2$ then $I_0=\inf_{x, y}\{x^2+y^2-b\log|x-y|\}=\frac{b}{2}(1-\log b)$ by calculus. ∎
3. AUXILIARY RESULTS AND PROOF OF THEOREM 1

The proof relies on Varadhan's functional method (see Ref. 3, Theorem T.1.3 and Ref. 6, Theorem 4.4.10). It consists of two steps: verification that the Varadhan functional
\[
F\mapsto L(F):=\lim_{n\to\infty}\frac{1}{n^2}\log E\exp(n^2 F(\mu_n))
\]
is well defined for a large enough class of bounded continuous functions $F: P\to\mathbb{R}$, and the proof of exponential tightness of $\{\mu_n\}$.

3.1. Varadhan Functional

Let $F_1,\dots,F_m:\mathbb{R}^2\to\mathbb{R}$ be bounded continuous functions. Consider the bounded continuous function $F: P(\mathbb{R}^2)\to\mathbb{R}$ given by
\[
F(\mu):=\min_{1\le r\le m}\int F_r\,d\mu. \tag{9}
\]
We will show the following.
Theorem 2. Under the assumptions of Theorem 1,
\[
\lim_{n\to\infty}\frac{1}{n^2}\log\int\exp(n^2 F(\mu_n))\, f(x, y)\,dx\,dy
=\sup\Bigl\{F(\mu)-\int k(x, y)\,d\mu:\ \mu=\nu_1\otimes\nu_2\in P(\mathbb{R}^2)\Bigr\}. \tag{10}
\]

Denote $K(\mu)=\int k(x, y)\,d\mu$. Notice that by Assumption 3 we have $F(\mu)-K(\mu)\le\max_r \|F_r\|_\infty+C$. In particular,
\[
\sup\{F(\mu)-K(\mu):\ \mu\in P(\mathbb{R}^2)\}<\infty. \tag{11}
\]
We prove (10) as two separate inequalities. It will be convenient toprove the upper bound for a larger class of functions F.
Lemma 1. If Assumptions 2 and 3 hold true, then for every bounded continuous function $F: P(\mathbb{R}^2)\to\mathbb{R}$ we have
\[
\limsup_{n\to\infty}\frac{1}{n^2}\log\int\exp(n^2 F(\mu_n))\, f(x, y)\,dx\,dy
\le \sup\{F(\mu)-K(\mu):\ \mu=\nu_1\otimes\nu_2\in P(\mathbb{R}^2)\}. \tag{12}
\]

Proof. Notice that for $0<\theta<1$
\[
\int\exp(n^2 F(\mu_n))\, f(x, y)\,dx\,dy
=\int\exp\Bigl(n^2\bigl(F(\mu_n)-\theta K(\mu_n)\bigr)-(1-\theta)\sum_{i,j=1}^n k(x_i, y_j)\Bigr)\,dx\,dy
\le \exp\Bigl(n^2\sup_{\nu_1,\nu_2}\bigl(F(\nu_1\otimes\nu_2)-\theta K(\nu_1\otimes\nu_2)\bigr)\Bigr)
\times\int\exp\Bigl(-(1-\theta)\sum_{i,j=1}^n k(x_i, y_j)\Bigr)\,dx\,dy.
\]
Since
\[
\sum_{i,j=1}^n k(x_i, y_j)\ge -n^2 C+\sum_{j=1}^n k(x_j, y_j),
\]
therefore
\[
\frac{1}{n^2}\log\int\exp(n^2 F(\mu_n))\, f(x, y)\,dx\,dy
\le \sup_{\nu_1,\nu_2}\{F(\nu_1\otimes\nu_2)-\theta K(\nu_1\otimes\nu_2)\}+(1-\theta) C+\frac{1}{n}\log M_{1-\theta}.
\]
Thus
\[
\limsup_{n\to\infty}\frac{1}{n^2}\log\int\exp(n^2 F(\mu_n))\, f(x, y)\,dx\,dy
\le \sup_{\nu_1,\nu_2}\bigl\{\theta\bigl(F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2)\bigr)+(1-\theta) F(\nu_1\otimes\nu_2)\bigr\}+2(1-\theta) C
\le \theta\sup_{\nu_1,\nu_2}\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2)\}+(1-\theta)\|F\|_\infty+2(1-\theta) C.
\]
Passing to the limit as $\theta\to 1$ we get (12). ∎
The proof of the lower bound is a combination of the discretization argument in Ref. 1, pp. 532–535 with the entropy estimate from Ref. 9, pp. 191–192.

Denote by $P_0$ the set of absolutely continuous probability measures $\nu(dx)=f(x)\,dx$ on $\mathbb{R}$ with compact support $\mathrm{supp}(\nu)$ and continuous density $f$. Let us first record the following well-known fact.

Lemma 2. If $\nu\in P_0$ then $\nu$ has finite entropy
\[
H_f:=\int\log f(x)\,\nu(dx)<\infty.
\]
We first establish a weaker version of the lower bound.

Lemma 3. If $F$ is given by (9), then
\[
\liminf_{n\to\infty}\frac{1}{n^2}\log\int\exp(n^2 F(\mu_n))\, f(x, y)\,dx\,dy
\ge \sup\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2):\ \nu_1,\nu_2\in P_0\}. \tag{13}
\]
Proof. Fix $\nu_1,\nu_2\in P_0$. Since $k(x, y)\ge -C$ is bounded from below, $K(\nu_1\otimes\nu_2)\in(-\infty,\infty]$, so without loss of generality we may assume that $k(x, y)$ is $\nu_1\otimes\nu_2$-integrable.

Since the measures $\nu_1,\nu_2$ are absolutely continuous and have compact supports, for every integer $n>0$ we can find partitions $P_1(n)=\{a_0<a_1<\cdots<a_n\}$ and $P_2(n)=\{b_0<b_1<\cdots<b_n\}$ of $\mathrm{supp}(\nu_1)$, $\mathrm{supp}(\nu_2)$ respectively such that
\[
\nu_1(a_{i-1}, a_i)=\nu_2(b_{j-1}, b_j)=\frac{1}{n}\quad\text{for } i, j=1, 2,\dots, n.
\]
Then, denoting $A=[a_0, a_1]\times[a_1, a_2]\times\cdots\times[a_{n-1}, a_n]$ and $B=[b_0, b_1]\times[b_1, b_2]\times\cdots\times[b_{n-1}, b_n]$, we have
\[
\int\exp(n^2 F(\mu_n))\, f(x, y)\,dx\,dy
\ge \int_{A\times B}\exp\Bigl(\min_r\sum_{i,j=1}^n F_r(x_i, y_j)-\sum_{i,j=1}^n k(x_i, y_j)\Bigr)\,dx\,dy. \tag{14}
\]
Write $\nu_1=f(x)\,dx$, $\nu_2=g(y)\,dy$. By our choice of the partitions, the functions
\[
f_i(x):=n f(x)\, I_{[a_{i-1}, a_i]}(x)
\quad\text{and}\quad
g_j(y):=n g(y)\, I_{[b_{j-1}, b_j]}(y)
\]
are probability densities. Let
\[
S(x, y)=\min_r\sum_{i,j=1}^n F_r(x_i, y_j)-\sum_{i,j=1}^n k(x_i, y_j)-\sum_{i=1}^n\log f(x_i)-\sum_{j=1}^n\log g(y_j).
\]
Integrating over the smaller set $\{f_1(x_1)>0,\dots,f_n(x_n)>0,\ g_1(y_1)>0,\dots,g_n(y_n)>0\}$ on the right-hand side of (14) we get
\[
\int\exp(n^2 F(\mu_n))\, f(x, y)\,dx\,dy\ge \frac{1}{n^{2n}}\int\exp(S(x, y))\prod_{i=1}^n f_i(x_i)\prod_{j=1}^n g_j(y_j)\,dx\,dy.
\]
Using Jensen's inequality, applied to the convex exponential function in the last integral, we get
\[
\int\exp(n^2 F(\mu_n))\, f(x, y)\,dx\,dy\ge \frac{1}{n^{2n}}\exp(S_1-S_2-S_3-S_4),
\]
where
\[
S_1=\int_{\mathbb{R}^{2n}}\Bigl(\min_r\sum_{i,j=1}^n F_r(x_i, y_j)\Bigr)\prod_{i=1}^n f_i(x_i)\prod_{j=1}^n g_j(y_j)\,dx\,dy,
\]
\[
S_2=\int_{\mathbb{R}^{2n}}\sum_{i,j=1}^n k(x_i, y_j)\prod_{i=1}^n f_i(x_i)\prod_{j=1}^n g_j(y_j)\,dx\,dy,
\]
\[
S_3=\int_{\mathbb{R}^n}\sum_{i=1}^n\log f(x_i)\prod_{i=1}^n f_i(x_i)\,dx,
\qquad
S_4=\int_{\mathbb{R}^n}\sum_{j=1}^n\log g(y_j)\prod_{j=1}^n g_j(y_j)\,dy.
\]
We need the following identities. (Proofs of all Claims are postponed until the end of this proof.)

Claim 1. For a $\nu_1\otimes\nu_2$-integrable function $h$, we have
\[
\int_{\mathbb{R}^n}\sum_{j=1}^n h(y_j)\prod_{j=1}^n g_j(y_j)\,dy=n\int_{\mathbb{R}} h(y)\, g(y)\,dy,
\]
\[
\int_{\mathbb{R}^n}\sum_{i=1}^n h(x_i)\prod_{i=1}^n f_i(x_i)\,dx=n\int_{\mathbb{R}} h(x)\, f(x)\,dx,
\]
\[
\int_{\mathbb{R}^{2n}}\sum_{i,j=1}^n h(x_i, y_j)\prod_{i=1}^n f_i(x_i)\prod_{j=1}^n g_j(y_j)\,dx\,dy=n^2\iint h(x, y)\, f(x)\, g(y)\,dx\,dy.
\]
Lemma 2 says that the entropies $H_f=\int\log f(x)\, f(x)\,dx$, $H_g=\int\log g(y)\, g(y)\,dy$ are finite. Thus the functions $k(x, y)$, $\log f(x)$, and $\log g(y)$ are $\nu_1\otimes\nu_2$-integrable. Applying Claim 1, we get $S_2=n^2 K(\nu_1\otimes\nu_2)$, $S_3=n H_f$, and $S_4=n H_g$. Therefore,
\[
\int\exp(n^2 F(\mu_n))\, f(x, y)\,dx\,dy\ge \frac{1}{n^{2n}}\exp\bigl(S_1-n^2 K(\nu_1\otimes\nu_2)-n H_f-n H_g\bigr). \tag{15}
\]
We need the following lower bound for $S_1$.

Claim 2.
\[
\int\Bigl(\min_r\sum_{i,j=1}^n F_r(x_i, y_j)\Bigr)\prod_i f_i(x_i)\prod_j g_j(y_j)\,dx\,dy\ge \min_r\sum_{i,j=1}^n F_{r,(i,j)}, \tag{16}
\]
where
\[
F_{r,(i,j)}=\min\{F_r(x, y):\ a_{i-1}\le x\le a_i,\ b_{j-1}\le y\le b_j\}.
\]
Combining inequalities (15) and (16), we get
\[
\frac{1}{n^2}\log\int\exp(n^2 F(\mu_n))\, f(x, y)\,dx\,dy
\ge \frac{1}{n^2}\min_r\sum_{i,j=1}^n F_{r,(i,j)}-K(\nu_1\otimes\nu_2)-\frac{1}{n} H_f-\frac{1}{n} H_g-\frac{2}{n}\log n. \tag{17}
\]
Since the functions $F_r(x, y)$ are continuous and $\nu_1,\nu_2$ have compact support, $\nu_1\otimes\nu_2$-almost surely $\sum_{i,j=1}^n F_{r,(i,j)}\, I_{(a_{i-1}, a_i)}(x)\, I_{(b_{j-1}, b_j)}(y)\to F_r(x, y)$, and the functions are bounded. Since $1\le r\le m$ ranges over a finite set of values only, we have
\[
\lim_{n\to\infty}\frac{1}{n^2}\min_r\sum_{i,j=1}^n F_{r,(i,j)}
=\min_r\lim_{n\to\infty}\sum_{i,j=1}^n F_{r,(i,j)}\,\nu_1(a_{i-1}, a_i)\,\nu_2(b_{j-1}, b_j)=F(\nu_1\otimes\nu_2).
\]
Letting $n\to\infty$ in (17) we obtain (13).

To conclude the proof, it remains to prove Claims 1 and 2.
Proof of Claim 1. Switching the order of integration and summation, we get
\[
\int\sum_{j=1}^n h(y_j)\prod_j g_j(y_j)\,dy
=\sum_{j=1}^n\int h(y_j)\, g_j(y_j)\,dy_j\prod_{i\ne j}\int g_i(y_i)\,dy_i
=\sum_{j=1}^n\int h(y_j)\, g_j(y_j)\,dy_j
=n\sum_{j=1}^n\int_{b_{j-1}}^{b_j} h(y)\, g(y)\,dy=n\int_{b_0}^{b_n} h(y)\, g(y)\,dy.
\]
The other two identities follow by a similar argument. ∎
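The first identity of Claim 1 can be illustrated with a concrete hypothetical choice: $\nu_2$ uniform on $[0, 1]$, partitioned at its quantiles $b_j=j/n$, so that $g_j(y)=n\, I_{[b_{j-1}, b_j]}(y)$. A midpoint-rule check in Python:

```python
import math

n = 4
h = math.exp  # any integrable test function

def midpoint(f, a, b, steps=50000):
    # composite midpoint rule for int_a^b f
    dx = (b - a) / steps
    return sum(f(a + (i + 0.5) * dx) for i in range(steps)) * dx

# LHS of the identity: the multiple integral collapses, as in the display
# above, to sum_j int h(y) g_j(y) dy with g_j(y) = n on [(j-1)/n, j/n]
lhs = sum(n * midpoint(h, (j - 1) / n, j / n) for j in range(1, n + 1))
# RHS: n * int_0^1 h(y) g(y) dy with density g = 1 on [0, 1]
rhs = n * midpoint(h, 0.0, 1.0)
assert abs(lhs - rhs) < 1e-6
```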
Proof of Claim 2. Fix $0\le k\le n$, $x_1,\dots,x_k\in\mathbb{R}$ and $y_1,\dots,y_n\in\mathbb{R}$. Let
\[
G_{r,k}(x_1,\dots,x_k):=\sum_{i=1}^k\sum_{j=1}^n F_r(x_i, y_j)+\sum_{i=k+1}^n\sum_{j=1}^n\min_{a_{i-1}\le x\le a_i} F_r(x, y_j).
\]
If $a_{k-1}<x_k<a_k$, we have
\[
\min_r G_{r,k}(x_1,\dots,x_k)
=\min_r\Bigl(\sum_{i=1}^{k-1}\sum_j F_r(x_i, y_j)+\sum_j F_r(x_k, y_j)+\sum_{i=k+1}^n\sum_j\min_{a_{i-1}\le x\le a_i} F_r(x, y_j)\Bigr)
\ge \min_r G_{r,k-1}(x_1,\dots,x_{k-1}).
\]
Therefore,
\[
\int_{a_{k-1}}^{a_k}\min_r G_{r,k}(x)\, f_k(x_k)\,dx_k\ge \min_r G_{r,k-1}(x).
\]
Recurrently,
\[
\int\min_r\Bigl(\sum_{i=1}^n\sum_{j=1}^n F_r(x_i, y_j)\Bigr)\prod f_i(x_i)\,dx
=\int\min_r G_{r,n}(x)\prod f_i(x_i)\,dx\ge \min_r G_{r,0}
=\min_r\Bigl(\sum_{i=1}^n\sum_{j=1}^n\min_{a_{i-1}\le x\le a_i} F_r(x, y_j)\Bigr).
\]
Applying the same reasoning to the variables $y_1,\dots,y_n$ and
\[
G_{r,k}(y_1,\dots,y_k):=\sum_{j=1}^k\sum_{i=1}^n\min_{a_{i-1}\le x\le a_i} F_r(x, y_j)+\sum_{j=k+1}^n\sum_{i=1}^n F_{r,(i,j)},
\]
we get (16). ∎
This concludes the proof of Lemma 3. ∎
The next lemmas show that the right-hand sides of (12) and (13) coincide.

Let $P_c$ denote the compactly supported probability measures.

Lemma 4. If Assumption 3 holds true, then
\[
\sup\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2):\ \nu_1,\nu_2\in P_c\}
=\sup\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2):\ \nu_1,\nu_2\in P\}. \tag{18}
\]
Proof. Clearly the left-hand side of (18) cannot exceed the right-hand side. To show the converse inequality, fix $\eta>0$ and $\nu_1,\nu_2\in P$ such that
\[
F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2)\ge \sup_{\nu_1,\nu_2\in P}\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2)\}-\eta. \tag{19}
\]
Since the supremum is finite, see (11), and $k(x, y)$ is bounded from below, we have $\iint|k(x, y)|\,d\nu_1\,d\nu_2<\infty$.

For $L>0$ large enough, define probability measures $\nu_{j,L}$ by
\[
\nu_{j,L}(A):=\frac{\nu_j(A\cap[-L, L])}{\nu_j([-L, L])},\qquad j=1, 2.
\]
By definition, the measures $\nu_{j,L}\in P_c$ have compact support. Since $-C\le k(x, y)\, I_{|x|<L,\,|y|<L}\le |k(x, y)|$ and $k$ is $\nu_1\otimes\nu_2$-integrable, by Lebesgue's dominated convergence theorem
\[
\lim_{L\to\infty} K(\nu_{1,L}\otimes\nu_{2,L})
=\lim_{L\to\infty}\frac{\int_{-L}^L\int_{-L}^L k(x, y)\,\nu_1(dx)\,\nu_2(dy)}{\nu_1([-L, L])\,\nu_2([-L, L])}=K(\nu_1\otimes\nu_2).
\]
Similarly,
\[
\lim_{L\to\infty} F(\nu_{1,L}\otimes\nu_{2,L})=F(\nu_1\otimes\nu_2).
\]
Thus (18) follows. ∎
Lemma 5. If Assumptions 3 and 5 hold true, then
\[
\sup\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2):\ \nu_1,\nu_2\in P_0\}
=\sup\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2):\ \nu_1,\nu_2\in P\}. \tag{20}
\]

Proof. Trivially,
\[
\sup_{\nu_1,\nu_2\in P_0}\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2)\}\le \sup_{\nu_1,\nu_2\in P}\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2)\}.
\]
To show the converse inequality, fix $\eta>0$ and compactly supported $\nu_1,\nu_2\in P_c$ such that
\[
F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2)\ge \sup_{\nu_1,\nu_2\in P}\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2)\}-\eta, \tag{21}
\]
see Lemma 4. As previously, $k(x, y)$ is $\nu_1\otimes\nu_2$-integrable, see (11).
Consider the convolution $\nu_j^\varepsilon(A):=\frac{1}{2\varepsilon}\int_{-\varepsilon}^{\varepsilon}\nu_j(A-x)\,dx$, where $j=1, 2$ and $0<\varepsilon\le 1$. The measures $\nu_1^\varepsilon, \nu_2^\varepsilon$ have continuous densities, and since $\nu_1,\nu_2$ have compact supports, $\nu_1^\varepsilon, \nu_2^\varepsilon$ also have compact support. Thus $\nu_1^\varepsilon, \nu_2^\varepsilon\in P_0$ and
\[
\sup_{\nu_1,\nu_2\in P_0}\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2)\}\ge F(\nu_1^\varepsilon\otimes\nu_2^\varepsilon)-K(\nu_1^\varepsilon\otimes\nu_2^\varepsilon). \tag{22}
\]
As $\varepsilon\to 0$ the measure $\nu_j^\varepsilon$ converges weakly to $\nu_j$. Hence
\[
\lim_{\varepsilon\to 0} F(\nu_1^\varepsilon\otimes\nu_2^\varepsilon)=F(\nu_1\otimes\nu_2). \tag{23}
\]
Assumption 5 asserts that $V(x, y):=b\log|x-y|+k(x, y)$ is a continuous function. Thus $|V(x, y)|$ is bounded on the compact set $\mathrm{supp}(\nu_1^1\otimes\nu_2^1)$. Since the supports of $\nu_1^\varepsilon\otimes\nu_2^\varepsilon$ are contained in $\mathrm{supp}(\nu_1^1\otimes\nu_2^1)$, and $\nu_j^\varepsilon\to\nu_j$, we get
\[
\iint V(x, y)\,\nu_1^\varepsilon(dx)\,\nu_2^\varepsilon(dy)\to \iint V(x, y)\,\nu_1(dx)\,\nu_2(dy). \tag{24}
\]
This concludes the proof if $b=0$. If $b>0$, then $\log|x-y|$ is $\nu_1\otimes\nu_2$-integrable as a linear combination of integrable functions, $\log|x-y|=(V(x, y)-k(x, y))/b$. Therefore we have
\[
F(\nu_1^\varepsilon\otimes\nu_2^\varepsilon)-K(\nu_1^\varepsilon\otimes\nu_2^\varepsilon)
=F(\nu_1^\varepsilon\otimes\nu_2^\varepsilon)-\iint V(x, y)\,\nu_1^\varepsilon(dx)\,\nu_2^\varepsilon(dy)
+b\iint\log|x-y|\,\nu_1(dx)\,\nu_2(dy)
-b\Bigl(\iint\log|x-y|\,\nu_1(dx)\,\nu_2(dy)-\iint\log|x-y|\,\nu_1^\varepsilon(dx)\,\nu_2^\varepsilon(dy)\Bigr).
\]
Taking the lim sup as $\varepsilon\to 0$, from (22), (23), (24), and (21) we get
\[
\sup_{\nu_1,\nu_2\in P_0}\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2)\}
\ge \sup_{\nu_1,\nu_2\in P}\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2)\}-\eta
-\limsup_{\varepsilon\to 0}\Bigl(\iint\log|x-y|\,\nu_1(dx)\,\nu_2(dy)-\iint\log|x-y|\,\nu_1^\varepsilon(dx)\,\nu_2^\varepsilon(dy)\Bigr).
\]
Since $\eta>0$ is arbitrary, to end the proof we use the following.

Claim 3.
\[
\limsup_{\varepsilon\to 0}\Bigl(\iint\log|x-y|\,\nu_1(dx)\,\nu_2(dy)-\iint\log|x-y|\,\nu_1^\varepsilon(dx)\,\nu_2^\varepsilon(dy)\Bigr)\le 0. \qquad\blacksquare
\]
Proof of Claim 3. Claim 3 is established by the argument in Ref. 9, pp. 192–193. For completeness, we repeat it here. Let $X, Y$ be independent random variables with distributions $\nu_1, \nu_2$ respectively and let $Z=X-Y$. Since $\log|Z|$ is integrable, $\Pr(Z=0)=0$. Let $U\in[-2, 2]$ be a random variable independent of $Z$ with the density $f(u)=(2-|u|)/4$. It is easy to see that $\iint\log|x-y|\,\nu_1^\varepsilon(dx)\,\nu_2^\varepsilon(dy)=E\log|Z+\varepsilon U|$, and the inequality to prove reads
\[
\limsup_{\varepsilon\to 0} E\Bigl(\log^+\frac{1}{|1+\varepsilon(U/Z)|}\Bigr)\le 0.
\]
For fixed $z\ne 0$ we have
\[
E\Bigl(\log^+\frac{1}{|1+\varepsilon(U/z)|}\Bigr)\le \frac{1}{\log 2}\log\Bigl(1+\frac{2\varepsilon}{|z|}\Bigr). \tag{25}
\]
Indeed, since $(2-|u|)/4\le 1/2$ we get
\[
E\Bigl(\log^+\frac{1}{|1+\varepsilon(U/z)|}\Bigr)\le \frac{|z|}{4\varepsilon}\int_{1-2\varepsilon/|z|}^{1+2\varepsilon/|z|}\log^+\frac{1}{|x|}\,dx.
\]
Therefore,
\[
E\Bigl(\log^+\frac{1}{|1+\varepsilon(U/z)|}\Bigr)\le
\begin{cases}
\dfrac{|z|}{4\varepsilon}\displaystyle\int_{1-2\varepsilon/|z|}^1\log\frac{1}{x}\,dx & \text{if } |z|>2\varepsilon,\\[2ex]
\dfrac{|z|}{4\varepsilon}\Bigl(\displaystyle\int_0^1\log\frac{1}{x}\,dx+\int_0^{2\varepsilon/|z|-1}\log^+\frac{1}{x}\,dx\Bigr) & \text{if } |z|\le 2\varepsilon.
\end{cases}
\]
If $|z|>2\varepsilon$ we get $E\bigl(\log^+\frac{1}{|1+\varepsilon(U/z)|}\bigr)\le \frac12\log\frac{1}{1-2\varepsilon/|z|}\le \frac{1}{\log 2}\log\bigl(1+\frac{2\varepsilon}{|z|}\bigr)$.

If $|z|\le 2\varepsilon$, then $E\bigl(\log^+\frac{1}{|1+\varepsilon(U/z)|}\bigr)\le \frac{|z|}{2\varepsilon}\int_0^1\log\frac{1}{x}\,dx\le 1\le \frac{1}{\log 2}\log\bigl(1+\frac{2\varepsilon}{|z|}\bigr)$. Thus in both cases, (25) holds true.

To finish the proof we integrate inequality (25) and get
\[
\limsup_{\varepsilon\to 0} E\Bigl(\log^+\frac{1}{|1+\varepsilon(U/Z)|}\Bigr)\le \frac{1}{\log 2}\limsup_{\varepsilon\to 0} E\bigl(\log(1+2\varepsilon/|Z|)\bigr).
\]
For $\varepsilon<1/2$ we have $\log(1+2\varepsilon/|Z|)\le \log 2+\log^+\frac{1}{|Z|}$ and $\log^+\frac{1}{|Z|}$ is integrable. Lebesgue's dominated convergence theorem yields
\[
\limsup_{\varepsilon\to 0} E\bigl(\log(1+2\varepsilon/|Z|)\bigr)=0. \qquad\blacksquare
\]
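Inequality (25) can be spot-checked by numerically integrating against the triangular density of $U$ (an illustrative sketch with arbitrary values of $z$ and $\varepsilon$; it is not part of the argument):

```python
import math

def lhs(z, eps, steps=40000):
    # E log^+ (1/|1 + eps*U/z|) for U with density (2 - |u|)/4 on [-2, 2],
    # computed with a midpoint rule
    du = 4.0 / steps
    total = 0.0
    for i in range(steps):
        u = -2.0 + (i + 0.5) * du
        w = abs(1.0 + eps * u / z)
        if 0.0 < w < 1.0:  # log^+(1/w) vanishes for w >= 1
            total += (2.0 - abs(u)) / 4.0 * math.log(1.0 / w) * du
    return total

def rhs(z, eps):
    # right-hand side of (25)
    return math.log(1.0 + 2.0 * eps / abs(z)) / math.log(2.0)

for z in (0.3, 1.0, -2.5):
    for eps in (0.01, 0.1, 0.45):
        assert lhs(z, eps) <= rhs(z, eps)
```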
Proof of Theorem 2. Combining Lemmas 1 and 3 we have
\[
\sup\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2):\ \nu_1,\nu_2\in P_0\}
\le \liminf_{n\to\infty}\frac{1}{n^2}\log\int\exp(n^2 F(\mu_n))\, f(x, y)\,dx\,dy
\le \limsup_{n\to\infty}\frac{1}{n^2}\log\int\exp(n^2 F(\mu_n))\, f(x, y)\,dx\,dy
\le \sup\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2):\ \nu_1,\nu_2\in P\}.
\]
By (20), all of the above inequalities are in fact equalities. Thus (10) holds true. ∎
3.2. Exponential Tightness
Recall that $\{\mu_n\}$ is exponentially tight if for every $m>0$ there is a compact subset $K\subset P$ such that
\[
\sup_n\frac{1}{n^2}\log\Pr(\mu_n\notin K)<-m.
\]
Our proof of exponential tightness is a concrete implementation of de Acosta$^{(5)}$.

Assumption 3 implies that $k(x, y)+C\ge 0$. Let $q: P(\mathbb{R}^2)\to[0,\infty]$ be given by
\[
q(\mu)=\int_{\mathbb{R}^2}(k(x, y)+C)\,d\mu.
\]
Lemma 6. If Assumptions 3 and 4 hold true, then $q$ has pre-compact level sets: for every $t>0$, $q^{-1}[0, t]$ is a pre-compact set in $P$.

Proof. Fix $t>0$ and denote $\mathcal{K}:=\{\mu\in P(\mathbb{R}^2): q(\mu)\le t\}$. We will show that $\mathcal{K}$ is pre-compact. Assumption 4 says that for every $\varepsilon>0$ the set $K_\varepsilon:=\{(x, y): C+k(x, y)\le t/\varepsilon\}$ is a compact subset of $\mathbb{R}^2$. For every $\mu\in\mathcal{K}$, by Chebyshev's inequality we have
\[
\mu(K_\varepsilon^c)\le \mu(\{(x, y): C+k(x, y)>t/\varepsilon\})\le \frac{\varepsilon\, q(\mu)}{t}\le \varepsilon.
\]
Thus $\mathcal{K}$ is pre-compact, and its weak closure $\overline{\mathcal{K}}$ is compact. ∎
Lemma 7. If Assumptions 1, 2, and 3 hold true, then
\[
\sup_n\frac{1}{n^2}\log\int\exp\Bigl(\frac12 n^2 q(\mu_n)\Bigr) f(x, y)\,dx\,dy<\infty. \tag{26}
\]

Proof. We have
\[
\int\exp\Bigl(\frac12 n^2 q(\mu_n)\Bigr) f(x, y)\,dx\,dy
=\int\exp\Bigl(\frac12 n^2 C-\frac12\sum_{i,j=1}^n k(x_i, y_j)\Bigr)\,dx\,dy
\le e^{\frac12 n^2 C}\int\prod_{i,j=1}^n\sqrt{g(x_i, y_j)}\,dx\,dy\le e^{n^2 C} M_{1/2}^n.
\]
Therefore the left-hand side of (26) is at most $C+\log^+ M_{1/2}<\infty$. ∎
Theorem 3. Under the assumptions of Theorem 1, the sequence $\{\mu_n\}$ is exponentially tight.

Proof. Notice that by (10) used with $F(\mu):=0$ we have $\frac{1}{n^2}\log Z_n\to L_0:=-\inf_\mu\int k(x, y)\,d\mu=-\inf_{x, y} k(x, y)$. Since $L_0$ is finite, see (11), by Lemma 7 we have
\[
\sup_n\frac{1}{n^2}\log\int\exp\Bigl(\frac12 n^2 q(\mu_n)\Bigr)\frac{1}{Z_n} f(x, y)\,dx\,dy=C_1<\infty.
\]
Fix $m>0$. Let $\mathcal{K}\subset P$ be the pre-compact set from Lemma 6 corresponding to $t=2m+2C_1$.

Applying Chebyshev's inequality to the probability measure (3) we get
\[
\Pr(\mu_n\notin\overline{\mathcal{K}})\le \Pr(\mu_n\notin\mathcal{K})=\Pr(q(\mu_n)>t)\le e^{-\frac12 n^2 t}\int\exp\Bigl(\frac12 n^2 q(\mu_n)\Bigr)\,d\Pr.
\]
Therefore
\[
\Pr(\mu_n\notin\overline{\mathcal{K}})\le e^{-\frac12 n^2 t}\, e^{n^2 C_1},
\]
and
\[
\frac{1}{n^2}\log\Pr(\mu_n\notin\overline{\mathcal{K}})\le -t/2+C_1=-m
\]
for all $n$. ∎
Proof of Theorem 1. Recall that the space $P=P(\mathbb{R}^2)$ of probability measures on $\mathbb{R}^2$ with the topology of weak convergence is a Polish space. By Theorem 3, $\{\mu_n\}$ is exponentially tight. Theorem 2 says that the Varadhan functional
\[
L(F):=\lim_{n\to\infty}\frac{1}{n^2}\log E\bigl(\exp n^2 F(\mu_n)\bigr)
\]
is defined on all functions $F$ given by (9), and
\[
L(F)=\sup_{\nu_1,\nu_2}\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2)\}-\lim_{n\to\infty}\frac{1}{n^2}\log Z_n
=\sup_{\nu_1,\nu_2}\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2)\}+\inf_{x, y} k(x, y).
\]
Thus
\[
L(F)=\sup_{\nu_1,\nu_2}\{F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2)\}+I_0. \tag{27}
\]
Functions $F$ defined by (9) form a subset of $C_b(P(\mathbb{R}^2))$ which separates points of $P(\mathbb{R}^2)$ and is closed under the operation of taking pointwise minima. Thus by Ref. 3, Theorem T.1.3 or Ref. 6, Theorem 4.4.10, the empirical measures $\{\mu_n\}$ satisfy the large deviation principle with the rate function
\[
I(\mu):=\sup\{F(\mu)-L(F)\}; \tag{28}
\]
here, the supremum is taken over all $F_1,\dots,F_m\in C_b(\mathbb{R}^2)$ and $F(\mu)$ is defined by (9).

It remains to prove formula (4). Fix $\nu_1,\nu_2\in P$. From (27), for $F$ given by (9) we have $L(F)\ge F(\nu_1\otimes\nu_2)-K(\nu_1\otimes\nu_2)+I_0$. Thus formula (28) implies that
\[
I(\nu_1\otimes\nu_2)\le K(\nu_1\otimes\nu_2)-I_0. \tag{29}
\]
To prove the converse inequality we use the fact that we already know that the large deviation principle holds. The large deviation principle implies that
\[
I(\nu_1\otimes\nu_2)=\sup_{F\in C_b(P)}\{F(\nu_1\otimes\nu_2)-L(F)\}. \tag{30}
\]
Now consider $F_M(\mu)=\int(M\wedge k(x, y))\,d\mu$. Assumptions 3 and 5 imply that $(x, y)\mapsto M\wedge k(x, y)$ is a bounded continuous function for every real $M$. Thus $F_M$ is given by (9). Since $M\wedge k(x, y)\le k(x, y)$, from (27) we get $L(F_M)\le I_0$. Thus
\[
I(\nu_1\otimes\nu_2)\ge \sup_M\{F_M(\nu_1\otimes\nu_2)-L(F_M)\}\ge \limsup_{M\to\infty}\int M\wedge k(x, y)\,d(\nu_1\otimes\nu_2)-I_0.
\]
This together with (29) proves (4) for product measures.

It remains to verify that if $\mu_0$ is not a product measure, then $I(\mu_0)=\infty$. To this end, take bounded continuous functions $F(x), G(y)$ such that
\[
\delta:=\int F(x)\, G(y)\,\mu_0(dx, dy)-\int F(x)\,\mu_0(dx, dy)\int G(y)\,\mu_0(dx, dy)>0.
\]
For $b>0$, let
\[
F_b(\mu):=b\Bigl(\int F(x)\, G(y)\,\mu(dx, dy)-\int F(x)\,\mu(dx, dy)\int G(y)\,\mu(dx, dy)\Bigr).
\]
Clearly, $F_b: P\to\mathbb{R}$ is a bounded continuous function, which vanishes on product measures. By the upper bound (12) we therefore have $L(F_b)\le I_0$. So $I(\mu_0)\ge F_b(\mu_0)-L(F_b)\ge b\delta-I_0$. Since $b$ can be arbitrarily large, $I(\mu_0)=\infty$. ∎
ACKNOWLEDGMENTS
I would like to thank P. Dupuis for a conversation on non-convex rate functions.
REFERENCES
1. Ben Arous, G., and Guionnet, A. (1997). Large deviations for Wigner's law and Voiculescu's non-commutative entropy. Probab. Theory Related Fields 108(4), 517–542.
2. Berg, C., Christensen, J. P. R., and Ressel, P. (1984). Harmonic Analysis on Semigroups, Springer-Verlag, New York.
3. Bryc, W. (1990). On the large deviation principle by the asymptotic value method. In Pinsky, M. (ed.), Diffusion Processes and Related Problems in Analysis, Birkhäuser, Vol. I, pp. 447–472.
4. Chan, T. (1993). Large deviations for empirical measures with degenerate limiting distribution. Probab. Theory Related Fields 97(1–2), 179–193.
5. de Acosta, A. (1985). Upper bounds for large deviations of dependent random vectors. Z. Wahrsch. Verw. Gebiete 69(4), 551–565.
6. Dembo, A., and Zeitouni, O. (1998). Large Deviations Techniques and Applications (2nd ed.), Springer-Verlag, New York.
7. Dinwoodie, I. H., and Zabell, S. L. (1992). Large deviations for exchangeable random vectors. Ann. Probab. 20(3), 1147–1166.
8. Hiai, F., and Petz, D. (2000). The Semicircle Law, Free Random Variables and Entropy, American Mathematical Society, Providence.
9. Johansson, K. (1998). On fluctuations of eigenvalues of random Hermitian matrices. Duke Math. J. 91(1), 151–204.
10. Ressel, P. (1982). A general Hoeffding type inequality. Z. Wahrsch. Verw. Gebiete 61(2), 223–235.