13
Carnegie Mellon University Research Showcase @ CMU Department of Statistics Dietrich College of Humanities and Social Sciences 9-1990 Bayes' eorem for Choquet Capacities Larry A. Wasserman Carnegie Mellon University Joseph B. Kadane Carnegie Mellon University, [email protected] Follow this and additional works at: hp://repository.cmu.edu/statistics is Article is brought to you for free and open access by the Dietrich College of Humanities and Social Sciences at Research Showcase @ CMU. It has been accepted for inclusion in Department of Statistics by an authorized administrator of Research Showcase @ CMU. For more information, please contact [email protected]. Published In e Annals of Statistics, 18, 3, 1328-1339.

Bayes' Theorem for Choquet Capacities

Embed Size (px)

Citation preview

Carnegie Mellon UniversityResearch Showcase @ CMU

Department of Statistics Dietrich College of Humanities and Social Sciences

9-1990

Bayes' Theorem for Choquet CapacitiesLarry A. WassermanCarnegie Mellon University

Joseph B. KadaneCarnegie Mellon University, [email protected]

Follow this and additional works at: http://repository.cmu.edu/statistics

This Article is brought to you for free and open access by the Dietrich College of Humanities and Social Sciences at Research Showcase @ CMU. It hasbeen accepted for inclusion in Department of Statistics by an authorized administrator of Research Showcase @ CMU. For more information, pleasecontact [email protected].

Published InThe Annals of Statistics, 18, 3, 1328-1339.

The Annals of Statistics1990, Vol. 18, No.3, 1328-1339

BAYES' THEOREM FOR CHOQUET CAPACITIES

By LARRY A. WASSERMAN! AND JOSEPH B. KADANE 2

Carnegie Mellon University

We give an upper bound for the posterior probability of a measurableset A when the prior lies in a class of probability measures 9. The boundis a rational function of two Choquet integrals. If g; is weakly compact andis closed with respect to majorization, then the bound is sharp if and only ifthe upper prior probability is 2-alternating. The result is used to compute-bounds for several sets of priors used in robust Bayesian inference. Theresult may be regarded as a characterization of 2-alternating Choquetcapacities.

1. Introduction. Sets of probability measures arise naturally in classicalrobustness, Bayesian robustness and group decision making. Some sets ofprobability measures give rise to upper probabilities that are 2-alternatingChoquet capacities. These upper probabilities are pervasive in the robustnessliterature; see Huber (1973), Huber and Strassen (1973), Buja (1984, 1985,1986), Bednarski (1982) and Rieder (1977), for example. The purpose of thispaper is to prove a version of Bayes' theorem for sets of prior probabilities thathave the 2-alternating property and to see how these sets of probabilities maybe exploited in Bayesian robustness. This result was proved in Wasserman(1988, 1990) for infinitely alternating capacities (also known as belief func­tions; Shafer, 1976). Since infinitely alternating capacities are also 2-alternat­ing, the present result generalizes that theorem. However, the proof in theinfinitely alternating case uses an argument that depends on properties thatinfinitely alternating capacities possess that are not shared by 2-alternatingcapacities in general. Also, in this paper, the conditions given are both neces­sary and sufficient. A proof of sufficiency when the parameter space is finite isgiven in Walley (1981).

Section 2 of this paper states and proves the main result. In Section 3, weapply the result to derive explicit bounds for the posterior probability of ameasurable set using various classes of priors. Section 4 contains a discussion.

2. Main result. Let 0 be a Polish space, that is, the topology for 0 iscomplete, separable and metrizable, and let ~ be the Borel u-algebra for 0.Let PI be the set of all bounded, nonnegative, real-valued, measurable func­tions defined on 0. Let {Po; () E 0} be a set of probability measures on asample space Y with u-algebra N. Assume that each Po has a density {(YI())

Received October 1988; revised May 1989.1Research supported by a post-doctoral fellowship from the Natural Sciences and Engineering

Research Council of Canada.2Research supported by NSF grant DMS-87-05646.AMS 1980 subject classifications. Primary 62F15; secondary 62F35.Key words and phrases. 2-alternating Choquet capacities, robust Bayesian inference, posterior

bounds.

1328

BAYES' THEOREM AND CAPACITIES 1329

with respect to some u-finite dominating measure and let L(()) = f(YI()) be thelikelihood function for () having observed Y E Y. We assume that LEge foreach Y E Y.

Let 9 be a nonempty set of prior probability measures on ~ and define theupper and lower prior probability functions by

P( A) = sup P( A) and E( A) = inf P( A).PE9 PE9

Since 15 and E are related by the equation peA) = 1 - E(AC), it suffices toconcentrate on p. In what follows, there is no loss in generality by assumingthat 9 is convex. For each X E gc, define the upper expectation of X byE(X) = supp E g;P(X), where P(X) = jX(())P(d()). The lower expectation ofX, denoted by ~(X), is defined in an analogous way. Let g;, be the class ofposteriors arrived at by applying Bayes' theorem to each prior P in 9 forwhich peL) > o. To avoid triviality, we assume there is at least one such P.Let Py be the upper probability generated by g;,. The posterior correspondingto P is denoted by Py • __

We say that 9 is 2-alternating if for each A, B E ~, peA u B) ~ peA) +PCB) - p(A n B). We say that 9 generates a Choquetcapacity if p(Fn)~ p(F)for each sequence of closed sets Fn ~ F. It can be shown that 9 generates aChoquet capacity if and only if {P; P ~ p} is weakly compact. Here, P ~ 15means that peA) ~ peA) for every A E ~. See Choquet (1953) and Huberand Strassen (1973) for details on capacities. We say that 9 is closed withrespect to majorization, or is m-closed, if P ~ 15 implies that P E 9. It iscommon practice in the literature to assume that 9 is m-closed. We will show(Section 3, Example 1) that replacing 9 with its m-closure has nontrivialconsequences.

For each X E gc, the upper Choquet integral of X is defined by

C(X) = [OP(Xt ) dt,o

where X t == {() E 0; X(()) > t}. Similarly, the lower Choquet integral Q(X) isdefined by

It follows easily that Q(X) ~ E(X) ~ E(X) ~ C(X). If 9 consists of a singleprobability measure P, then C(X) = Q(X) = E(X) = E(X) = P(X); if thisprobability measure is our prior probability, then the posterior probability of asubset A, after observing y and applying Bayes' theorem, may be expressed asP(LA)/(P(LA) + P(LAc)), where L A(()) = L(())IA(()) and IA(()) is the indicatorfunction for A. Note that C and Q may be extended to the set of all bounded,measurable functions by adding and subtracting a sufficiently large positiveconstant.

1330 L. A. WASSERMAN AND J. B. KADANE

THEOREM. Let g; be a nonempty set ofprior probabilities on ~ and let g;,be the corresponding class ofposterior probabilities. Then, for each A E ~,

when the ratios are well defined. If g; generates a Choquet capacity and if g;is m-closed, then the following three statements are equivalent:

(i) P is 2-alternating.(ii) The first inequality is an equality for each A E ~ and each LEge.

(iii) The second inequality is an equality for each A E ~ and each LEge.

Before proving the theorem, we shall state a few lemmas. The first lemma isfrom Huber and Strassen (1973).

LEMMA 1. If g; is m-closed and generates a 2-alternating Choquet capac­ity, then for each upper semicontinuous X E ge, there exists aPE g; such thatP(X) = E(X) and P(Xt ) = P(Xt ), for each real number t.

The next two lemmas are from Bednarski (1982).

LEMMA 2. If <3 is finite, then the following two conditions are equivalent:

(i) peA u B) ~ peA) + PCB) - peA n B) for each A, B c <3.(ii) For every sequence of sets Al c A 2 C ... C An' there exists P ~ P such

that P(A i ) = P(A i ) for i = 1,2, ... ,n.

LEMMA 3. Let g; be weakly compact and suppose that g; is m-closed. Theng; is 2-alternating if and only if, for each finite field N, g;IN is 2-alternat­ing where g;IN is the restriction of g; to Nand M is the closure of M withrespect to pointwise convergence on atoms.

The last two lemmas may be found in Buja (1984).

LEMMA 4. Let g; generate a 2-alternating Choquet capacity. For eachX E ge and each E > 0, there exists a nonnegative, upper semicontinuousfunction h ~ X such that C(X) - E < C(h) ~ C(X). Furthermore, if g; ism-closed, then for each X E ge, E(X) ~ C(X).

LEMMA 5. g; is 2-alternating if and only if C(X + y) ~ C(X) + C(Y) foreach X, Y E ge.

PROOF OF THEOREM. Note that for every PEg; for which peL) > 0,

P(LA )

PiA) = P(LA

) + P(LAC

) •

BAYES' THEOREM AND CAPACITIES

Now, Py(A) is of the form

1331

1h(P) = (1 + g(P)/f(P)) ,

where rep) = P(LA) and g(P) = P(LAc). h(P) is maximized where g(p)jr(p)is minimized. But

which proves the first inequality. The second inequality follows from that factthat E(X) ~ C(X) and ~(X) ~ Q(X).

Now suppose g; is m-closed and generates a Choquet capacity. We begin byshowing that (i) implies (ii). Assume that P is 2-alternating. We need to showthat the first inequality is an equality for each A E ~ and for each L E fJC. IfE(LA) = 0, this follows easily. Now suppose E(LA) > 0. Let b = SUPOEACL(O).Define a function a on e by a(O) = L(O) + b on A and a(O) = - L(O) + b onAC. For notational convenience, define

E(LA )B= -=------

E( LA) + ~(LAc) .

We will show that for each e > 0, there exists P ~ P such that

P(LA )B-e< <B.

P( LA) + P( LAc) -

For every 8 > 0, there exists, by Lemma 4, a nonnegative, bounded, uppersemicontinuous function h such that h ~ a and E(a) - 8 ~ E(h) ~ E(a). Setk = E(LA) + ~(LAc). Note that k > 0. Choose 8 < Eke By Lemma 1, thereexists PEg; such that P(h) = E(h) and P(h > t) = P(h > t), for each t. Itis easily verified that P(LA) ~ E(LA) - 8 and P(LAc) ~ ~(LAc) + 8. Thus,

P(LA) E(LA)-8Py ( A) = ~ -=---------

P(LA) + P(LAc) E(LA) - 8 + E(LAc) + 8

8 8= B - E(LA) + ~(LAC) = B - k > B - e.

Since this holds for each e > 0, supP E(gPy(A) = B so that (ii) holds. Further,(i) implies (iii) because of Lemma 4.

To prove that (ii) implies (i), first assume that e is finite. Suppose thatPy(A) = E(LA)j(E(LA) + ~(LAc)) for every subset A and every L E fJC. g; isclosed, with respect to the Euclidean topology, so there exists, for each A, aPEg; such that Py(A) = Py(A). Therefore, .

E(LA) P(LA)

E(LA) + ~(LAc) P(LA) + P(LAc) ,

1332 L. A. WASSERMAN AND J. B. KADANE

from which it follows that

(1)

This holds for each A and each L E [!C. Let Al C A 2 C ... C An be anincreasing sequence of subsets. For i = 1,2, ... ,n - 1, define L i on ® to beequal to 1 on Ai and Ai+l and 0 otherwise. From (1) we deduce that thereexists Pi E 9 such that Pi(A i ) = P(A i ) and Pi(A i+l ) = P(Ai+l ) for i =1,2, ... , n - 1. Define A o = 0 and let gi = Ai - Ai-I' Let ~ be the fieldgenerated by {gl' ... ,gn}' Define an additive set function S on ~ by

S(gl) = Pl(Al ),

S(g2) =P2(A2) -Pl(Al ),

S(gn-l) = Pn- l( An-I) - Pn- 2( A n- 2),

S(gn) = Pn-l(An) - Pn-l(A n- l )

and S(A~) = 1 - SCAn)' Then, S(Ai ) = P(Ai ) for i = 1, 2, ... , n. Also, S(gl) ~o and S(gi) = Pi(A i ) - Pi-l(A i - l ) = P(A i ) - Pi-l(A i - l ) = Pi-l(A i ) ­Pi-l(A i - l ) ~ 0, so that S is a probability measure on ~ which maximizes theprobability of the Ai'S, and since- S is dominated by P on ~, S may beextended by the Hahn-Banach theorem to 2°. By Lemma 2, P satisfies the2-alternating condition. Therefore 9 is 2-alternating.

Now we show that (iii) implies (i) and we continue with the assumption that® is finite. Let the second inequality be an equality for each A E ~ and eachL E [!C. Then E(X) = C(X) for each bounded function X. For suppose E(X) <C(X) for some X not identically equal to zero. Let a = min{X(O); 0 E ®} andset A = {O, X(O) > a}. We claim that E(XA) < C(XA). To see this, note that ifa = .Q, t.!!en E(X~ = E(X) < C(X)_= C(XA). If a > 0, then-py the ~finitionof C, C(XA) = C(X) - aE(AC) > E(XA) + Ll, where Ll = E(X) - E(XA) ­aE(AC). Now, for each P E 9,

P(X) = P(XA) + P(XAc) = P(XA) + aP(AC) ~ P(XA) + aE(AC),

so that E(X) ~ E(XA) + aE(AC). Hence, Ll ~ 0 which implies that E(XA) <C(XA). But this, together with the fact that E(LAc) = aE(AC) = Q(LAc) leadsto the second inequality being strict. This contradicts our assumption. There­fore, E(X) = C(X). It follows that C is subadditive and from Lemma 5 wededuce that P is 2-alternating. (

Now remove the restriction that ® is finite. Suppose that either (ii) or (iii)holds. Let ~ be a finite subfield and without loss of generality, assume ~ isgenerated by a finite partition H = {h l' h 2' ... , h k}' It follows easily that (ii)and (iii) hold on 91~ . By the previous argument, if either (ii) or (jii) holds,then 91~ is 2-alternating. By Lemma 3, 9 is 2-alternating. D

We close this section with a few remarks. First, note that if 9 is notm-closed, we still have that (ii) implies (i) and also (iii) implies (i), since

BAYES' THEOREM AND CAPACITIES 1333

m-closure was not used in those parts of the proof. Second, if 9 is not convex,the theorem is still true since replacing 9 with its convex closure does notchange the upper and lower probabilities or the upper and lower expectations.Third, it may be shown that if the prior is 2-alternating, then the posterior is2-alternating. This is proved for finite E> in Walley (1981). The proof is easilyextended to the case where E> is a Polish space. However, if the set of priors 9is m-closed, it does not follow in general that the set of posteriors is m-closed.Consider the following example.

Let E> = {I, 2, 3, 4} and let 9 be the convex hull of P and Q which havecoordinat€s (t, t, O,~) and (t, t, ~,O), respectively. Set B = {I, 2, 3}. Now 9 ism-closed but the set of probabilities 9 B obtained by conditioning each P E 9is not. For, R ~ PB but R $. 9 B , where R has coordinates (0.3, 0.4, 0.3, 0)and PB is the upper probability generated by 9 B . Thus, m-closure is notpreserved by conditioning and hence is not, in general, preserved by Bayes'theorem. An interesting problem is to determine when an m-closed class ofpriors remains m-closed under the action of Bayes' theorem.

The implication of the example above is that the Choquet integral of abounded random variable X, based on the posterior upper probability, will not,in general, equal the upper posterior expectation of X. Nonetheless, theChoquet integral still provides a conservative bound for the upper posteriorexpectation of X. Thus, denoting the Choquet integral of X with respect tothe upper posterior probability by Cy , we have that

E(XIY = y) ~ Cy(X) = jOOPiXt) dt = JOO t~ f( ) ,o ol+gt/ t

where

f(t) = jOOp(Xt n L u ) du and g(t) = l°Op(X~ n LJ duoo 0

Of course, Qy(X) may be calculated in a similar ~y. Note that if Cy(X) ­Qy(X) is small, then it is unnecessary to compute E(XIY = y).

3. Examples.

EXAMPLE 1. A set that is not m-closed. To see that closure with respect tomajorization is important, here is an example of a set of probabilities thatgenerates a 2-alternating upper probability but that is not m-closed and forwhich the equalities in the theorem are strict. Let E> = {I, 2, 3}, let the vectorof likelihood values be (1,6,2) and let 9 be the set of probability measureswhose coordinates on E> are given by (p/2, p/2, (1 - p)), where p E [0,1]. Theupper probability is 2-alternating but 9 is not m-closed. For example, if Phas coordinates (0.2,0.3,0.5), then P ~ P but P is not in 9. Let A = {I}.Then

1 t(LA )- <~------5 C( LA) + Q( LAc)

1

3

1334 L. A. WASSERMAN AND J. B. KADANE

This shows that if 9 is not m-closed, then the 2-alternating condition is notstrong enough to imply the result. Note that enlarging 9 to be m-closed has anontrivial effect on the upper posterior probability.

EXAMPLE 2. A set that is not 2-alternating. Now we give an example of aset of probabilities that is m-closed but is not 2-alternating and we show thatthe inequalities are strict. A similar example was considered in Wasserman(1990). Let e = {I, 2, 3, 4} and let the vector of likelihood values on e be(a, b, c, d). Suppose that a > b > c > d > °and b > c + d. Now let P be aprobability measure with values (~,~, 0, 0) on e and let Q have values(0, o,~, ~). Finally, let 9 be the convex hull of P and Q. It is easily verifiedthat 9 is m-closed and that P is not 2-alternating. Let A = {I}. Then

_ a E(LA )

PiA) = a + b < E(LA

) + g(LAC

)

a C(LA )= < -=-------

a + c + d C(LA) + Q(LAc)

so that the inequalities in the theorem are strict.

a

a + c'

The following are sets of probability measures that arise naturally in thestudy of Bayesian robustness. For a general overview of Bayesian robustness,see Berger (1984, 1987).

EXAMPLE 3. e-Contaminated priors. Let P be a fixed probability measureon ~, let e E [0,1] be fixed and assume that e is compact. Define 9 by

9= {Q; Q = (1 - e) P + eR, where R is any probability measure on ~}.

This is the class of e-contaminated priors that have been studied by Huber(1973), Berger (1985) and Berger and Berliner (1986). It follows easily thatpeA) = (1 - e)P(A) + e, A =1= 0 and E(A) = (1 - e)P(A) for A =1= e. 9 ism-closed and generates a 2-alternating Choquet capacity. [Buja (1986), showsthat the compactness assumption may be dropped.] For every X E PI,

C(X) = (1 - e)P(X) + esupX(O)(JEE>

and

Q(X) = (1 - e)P(X) + e inf X(O).(JEE>

Applying the theorem, we have

P (A) = (1 - e)Pi A ) + ea/P(L)y (l-e)+eajP(L)

where a = sUPo E A L(O). This result was stated in Huber (1973).

BAYES' THEOREM AND CAPACITIES 1335

EXAMPLE 4. Total variation neighbourhoods. As in Example 3, fix P and eand let <3 be compact. Define 9 by

9= {Q;Ll(P,Q) ~ e},

where Ll(P, Q) = SUPAE.98IP(A) - Q(A)I. This is the e total-variation classgenerated by P. Let x V y denote the maximum of x and y and x 1\ y theminimum of x and y. Then peA) = (P(A) + e) 1\ 1, A =1= 0, and E(A) =(P(A) - e) V 0, A =1= <3. Again, 9 is m-closed and generates a 2-alternatingChoquet capacity. For every X E f!,C, define to(X) = sup{t; P(Xt) ~ 1 - e} andso(X) = inf{t; P(Xt) ~ e}. Also, let

Then

C(X) =P(X) +esupX+ (l-e)to(X) -J(X,O,to(X))

and

Q(X) = P(X) + e inf X - eso(X) - J(X, so(X), sup X).

Let to = to(LA) and So = so(LAc). Then,

_ P(LA) + (1 - e)to + ea - J(LA, 0, to)Py { A) = ) ( ) ( ) ,P(L)+(I-e to+ea-J LA,O,to -eso-J LAc,so,b

where a = SUPOEE> LA(O) and b = SUPOEE> LAc(O).

Bednarski (1981) and Buja (1986) considered an upper probability thatcontains e-contamination and total variation neighbourhoods as special cases.It may be shown that this generalized class is an m-closed, 2-alternatingChoquet capacity as well.

EXAMPLE 5. Density bounded classes. Let Land U be two u-finite mea­sures on ~ such that L ~ U; that is, L(A) ~ U(A) for each A E ~. Lavine(1987) considers classes of probability measures of the form g; = {P; L ~ P ~

U}. He calls this a density bounded class. We shall assume that L(<3) ~ 1 ~

U(<3) < 00. Without loss of generality, we also assume that Land U havedensities land u with respect to some u-finite dominating measure v. rrhenP E 9 if and only if P has density p with respect to v and for v-almost all 0,leO) ~ p(O) ~ u(O). DeRobertis and Hartigan (1981) consider the class DH ofall u-finite measures between Land U. The class DH is also discussed inBerger (1987) and DasGupta and Studden (1988a, 1988b).

For the class 9, it follows that peA) = min{U(A), 1 - L(AC)} and E(A) =max{L(A),1 - U(AC)}. Further, it may be shown that 9 generates a 2-alter­nating, m-closed Choquet capacity. For X E f!,C, set to(X) =inf{t; U(Xt) + L(X%) ~ I} and so(X) = sup{t; U(XtC) + L(Xt) ~ I}. LetJ(Xt,c,d)=!cdD(Xt)dt, where D=U-L. Finally, let d L =I-L(<3) andd u = U(<3) - 1. Then, C(X) = U(X) + to(X) d L - J(X, 0, to(X)) and C(X) =

1336 L. A. WASSERMAN AND J. B. KADANE

L(X) - SO(X) d U + J(X, 0, so(X)). Here, U(X) = jX(O)U(dO) and L(X) =jX(O)L(dO). Therefore,

- U(LA) + tod L - J(LA, 0, to)Py ( A) = ( ( ( ('U LA) + tod L - J LA' 0, to) + L LAc) - So d u + J LAc, 0, so)

where to == to(LA) and So == so(LAc).

EXAMPLE 6. Probabilities with partition restrictions. Moreno and Cano(1988) consider the following set of priors. Let P be a fixed probabilitymeasure on yg, e a fixed number in [0,1] and H = {hI' , h k} a measurablepartition of 8. Assume that 8 is compact. Let P = (PI' ,Pk) be such thateach Pi ~ °and LPi = 1. Then define

9= {Q;Q = (1 - e)P + e8, 8 E J}.

Here J is the set of all probability measures that satisfy 8(h i) = Pi' i =1,2, ... ,k. The interpretation is the same as the e-contaminated class ofExample 3, except that we are putting constraints on the possible contamina­tions by forcing them to have fixed probabilities on some partition. This is asimple, reasonable way to restrict the size of the possible contaminatingdistributions.

This class contains two important special cases. By taking H = 8, we getback the general e-contaminated model of Example 3. By taking e = 1, we getthe set of all probability measures with fixed probabilities on a partition. Thislatter class has been studied by Berliner and Goel (1986).

Define I A = {i; hi c A} and JA = {i; hi n A =1= 0}. Then

f(A) = (1 - e)P(A) + e L PiiEIA

and

P(A) = (1 - e)P(A) + e L Pi.iEJA

It may be shown that P is n-alternating for every n and hence is 2-alternat­ing. Further, 9 generates an m-closed, Choquet capacity. It follows that

Q(X) = (1 - e)P(X) + eL KiPi

and

where Ki = infOEht X(O) and Xi = sUPOEht

X(O). Therefore,

- (1 - e)P(LA) + eLPiLA.

PiA) = (1 - e)P(L) + eLPi[ L Ai + iAd ·EXAMPLE 7. Symmetric, unimodal contaminations. Sivaganesan and Berger

(1989) consider the following modification of the e-contaminated class for 8

BAYES' THEOREM AND CAPACITIES

equal to the real line. For fixed P and £, set

g; = {Q = (1 - £) P + £S , S E J},

1337

where J is the set of all probability measures that possess densities that areunimodal and symmetric about a fixed point (Jo. Without loss of generality, wetake 00 = 0. Now, g; generates a 2-alternating upper probability if and only ifJ generates a 2-alternating upper probability. Hence, we confine attention tothe class J. As Sivaganesan and Berger (1989) remark, when computingextrema, one need only consider the smaller class J' consisting of uniformprobability measures that are centered at the origin. A point mass at the originmust be included as well. Let S be the upper probability generated by J. Weclaim that S is not 2-alternating and that J is not m-closed.

To see that S is not 2-alternating, set A = [- 3, - 1] U [1,2] and B =[-2, - 1] U [1,3]. Then, i = S(A U B) > S(A) + S(B) - S(A n B) = ~ +~ - ~. To see that J is not m-closed, let R = ~Do + ~u, where Do is a pointmass at °and u is the uniform [0, 1] distribution. Now, if°E A, R(A) ~ 1 =S(A). Otherwise, R(A) = R(A n [0,1]) = u(A n [0,1])/2 = u'(A n [0, 1]) ~u'(A) ~ S(A), where u' is uniform [-1, 1]. Thus, R ~ S but R $. J, so J isnot m-closed.

As a result, the Choquet integral fails miserably as an approximation to E.For example, let X n be defined by

(

0,X n ( 0) = n,

-n,

if-n~O~n,

if 0 > n,ifO<-n.

Then E(Xn ) = E(Xn ) = 0, but it is easily shown that C(Xn ) = n/2. Hence,limn -+oo(C(Xn) - E(Xn )) = 00. This is an extreme example that shows howimportant the 2-alternating and m-closure conditions are. It would be interest­ing to investigate how much of the difference between E and C is accountedfor by the failure of the 2-alternating condition and how much is because ofthe lack of m-closure.

4. Discussion. The main result of this paper helps to unify the process offinding bounds for posterior probabilities in robust Bayesian inference. Also,this theorem may be thought of as a characterization of 2-alternating Choquetcapacities. Other authors have given different characterizations. Huber andStrassen (1973) showed that the 2-alt

/ernating structure is necessary and

sufficient for generalizing the Neyman-Pearson lemma to sets of probabilities.Buja (1984, 1985) generalized this result from the case of two hypotheses to afinite number of hypotheses. Similarly, Bednarski (1982) showed that thiscondition is necessary and sufficient for the construction of least in(ormativebinary experiments in the sense of Blackwell (1951), Le Cam (1964, 1969,1972) and Torgersen (1970). These authors were concerned with uncertaintywith respect to sampling distributions, as represented by sets of probabilitymeasures dominated by 2-alternating Choquet capacities. We have focused on

1338 L. A. WASSERMAN AND J. B. KADANE

uncertainty in the prior distribution. Other properties of 2-alternating capaci­ties may be found in Walley (1981).

Acknowledgments. We are grateful to the referees for their insightfulcomments and suggestions.

REFERENCESBEDNARSKI, T. (1981). On solutions of minimax test problems for special capacities. Z. Wahrsch.

Verw. Gebiete 58 397-405.BEDNARSKI, T. (1982). Binary experiments, minimax tests and 2-alternating capacities. Ann.

Statist. 10 226-232.BERGER, J. O. (1984). The robust Bayesian viewpoint (with discussion). In Robustness in Bayesian

Statistics (J. Kadane, ed.) 63-144. North-Holland, Amsterdam.BERGER, J. O. (1985). Statistical Decision Theory, 2nd ed. Springer, New York.BERGER, J. O. (1987). Robust Bayesian analysis: sensitivity to the prior. Technical Report No.

87-10. Dept. Statist., Purdue Univ.BERGER, J. O. and BERLINER, L. M. (1986). Robust Bayes and empirical Bayes analysis with

e-contaminated priors. Ann. Statist. 14 461-486.BERLINER, L. M. and GOEL, P. K. (1986). Incorporating partial prior information: ranges of

posterior probabilities. Technical Report No. 357, Dept. Statist., Ohio State Univ.BLACKWELL, D. (1951). Comparison of experiments. Proc. Second Berkeley Symp. Math. Statist.

Probab. Univ. California, Berkeley.BUJA, A. (1984). Simultaneously least favourable experiments. I: Upper standard functionals and

sufficiency. Z. Wahrsch. Verw. Gebiete 65 367-384.BUJA, A. (1985). Simultaneously least favourable experiments. II: Upper standard loss functions

and their applications. Z. Wahrsch. Verw. Gebiete 69 387-420.BUJA, A. (1986). On the Huber-Strassen theorem. Probab. Theory Related Fields 73 367-384.CROQUET, G. (1953). Theory of capacities. Ann. Inst. Fourier (Grenoble) 5 131-295.DASGuPTA, A. and STUDDEN, W. J. (1988a). Robust Bayesian analysis and optimal designs in

normal linear models with many parameters. I. Technical Report No. 88-14, Dept.Statist., Purdue Univ.

DASGuPTA, A. and STUDDEN, W. J. (1988b). Robust Bayesian analysis and optimal designs innormal linear models with many parameters. II. Technical Report No. 88-34C, Dept.Statist., Purdue Univ.

DEROBERTIS, L. and HARTIGAN, J. A. (1981). Bayesian inference using intervals of measures. Ann.Statist. 9 235-244.

HUBER, P. J. (1973). The use of Choquet capacities in statistics. Bull. Inst. Internat. Statist. 45181-191.

HUBER, P. J. and STRASSEN, V. (1973). Minimax tests and the Neyman-Pearson lemma for'capacities. Ann. Statist. 1 251-263.

LAVINE, M. L. (1987). Prior influence in Bayesian statistics. Ph.D. thesis, Univ. Minnesota.LE CAM, L. (1964). Sufficiency and approximate sufficiency. Ann. Math. Statist. 35 1419-1455.LE CAM, L. (1969). Theorie Asymptotique de La Decision Statistique. Univ. Montreal Press,

Montreal.LE CAM, L. (1972). Limits of experiments. Proc. Sixth Berkeley Symp. Math. Statist. Probab. 1

245-261. Univ. California Press, Berkeley.MORENO, E. and CANO, J. A. (1988). Robust Bayesian analysis with e-contaminations partially'

known. Report, Univ. de Granada, Spain.RIEDER, H. (1977). Least favourable pairs for special capacities. Ann. Statist. 5 909-921.SHAFER, G. (1976). A Mathematical Theory of Evidence. Princeton Univ. Press, Princeton, N.J.SIVAGANESAN, S. and BERGER, J. O. (1989). Ranges of posterior measures for priors with unimodal.

contaminations. Ann. Statist. 17 868-889.

BAYES' THEOREM AND CAPACITIES 1339

TORGERSEN, E. N. (1970). Comparison of experiments when the parameter space is finite.Z. Wahrsch. Verw. Gebiete 16 219-249.

WALLEY (1981). Coherent lower (and upper) probabilities. Statistics research report, Univ. War­wick, Coventry.

WASSERMAN, L. A. (1988). Some applications of belief functions to statistical inference. Ph.D.thesis, Univ. Toronto.

WASSERMAN, L. A. (1990). Prior envelopes based on belief functions. Ann. Statist. 18 454-464.

DEPARTMENT OF STATISTICS232 BAKER HALLCARNEGIE-MELLON UNIVERSITYPITTSBURGH, PENNSYLVANIA 15213-3890