
Assumptions, Beliefs and Probabilities

Kathryn Blackmond Laskey
Decision Science Consortium, Inc., 1895 Preston White Drive, Suite 300, Reston, VA 22091, U.S.A.

Paul E. Lehner
George Mason University, 4400 University Drive, Fairfax, VA 22030, U.S.A.; and Decision Science Consortium, Inc.

Artificial Intelligence 41 (1989/90) 65-77

ABSTRACT

A formal equivalence is demonstrated between Shafer-Dempster belief theory and assumption-based truth maintenance with a probability calculus on the assumptions. This equivalence means that any Shafer-Dempster inference network can be represented as a set of ATMS justifications with probabilities attached to assumptions. A proposition's belief is equal to the probability of its label conditioned on label consistency. An algorithm is given for computing these beliefs. When the ATMS is used to manage beliefs, non-independencies between nodes are automatically and correctly accounted for. The approach described here unifies symbolic and numeric approaches to uncertainty management, thus facilitating dynamic construction of quantitative belief arguments, explanation of beliefs, and resolution of conflicts.

1. Introduction

Most inferences are made in the context of uncertainty. That is, at the time an inference is made, the inference maker is usually uncertain about a number of relevant facts and issues. Two general approaches to uncertainty management have emerged as popular: numerical calculi for propagating degrees of belief, and extended logics for reasoning with defaults and assumptions.

A variety of uncertainty calculi have been proposed based on the mathematics of probability theory, belief theory [9, 19] and fuzzy set theory [23]. While these systems differ in their proposed mathematics, they all use numerical parameters to represent degrees of uncertainty about the connection between rule antecedents and consequents.

A number of extended logics have also been proposed that allow an inference system to incorporate assumptions and defaults into the reasoning process, thereby allowing the system to make conclusions in the context of incomplete or uncertain information [e.g., 13, 17]. Doyle's [10] truth maintenance system (TMS) has its theoretical base in McDermott and Doyle's nonmonotonic logic. In the TMS, a set of assumptions is provisionally adopted until an inconsistency is derived. When this happens, the system checks its space of assumptions to identify one or more "culprits" to reject. Alternatively, in assumption-based truth maintenance [6], propositions are labeled with the assumptions under which they can be proven. Conflicting deductions cause the conjunction of the assumption sets proving the conflicting propositions to be added to a database of "nogood" assumption sets. These nogood sets are removed from any labels in which they appear. In this way, an assumption-based truth maintenance system (ATMS) maintains for each proposition a list of those sets of assumptions under which it can be proven.

This paper shows that an ATMS can be used to represent Shafer-Dempster and Bayesian models in a manner that automatically accounts for non-independencies between nodes in an inference network. In particular, any Shafer-Dempster inference system can be represented in the ATMS by attaching probabilities to assumptions that represent hypotheses in a background frame [20]. A proposition's Shafer-Dempster belief is the probability of the label of the corresponding ATMS node. The independence assumption underlying Dempster's Rule corresponds to probabilistic independence of the assumption sets carrying the beliefs to be combined.

Although our work was done independently, other researchers have explored the idea of adding probabilistic information to the ATMS. D'Ambrosio [5] uses the ATMS to compute beliefs for a special case [1] of the Shafer-Dempster model. De Kleer and Williams [8] also assign probabilities to assumptions in an ATMS, but do not discuss the relationship to Shafer-Dempster theory. The contribution of this paper is twofold: to demonstrate that the ATMS can be used to represent any Shafer-Dempster model, and to establish a correct algorithm for computing beliefs as label probabilities.

2. Assumption-Based Truth Maintenance

An assumption-based truth maintenance system (ATMS) [6, 7] is a sophisticated mechanism for keeping track of the dependence of propositions¹ on other propositions, particularly propositions that represent assumptions. A brief description of the ATMS is given below; details can be found in de Kleer's papers.

An ATMS node represents a statement or hypothesis about whose truth or falsity the system reasons. The ATMS maintains a record of the dependence of each node on distinguished nodes called assumptions. (We follow de Kleer's convention of using uppercase letters to denote assumptions.) Dependence between nodes is derived from justifications supplied by the problem solver. A justification for a node represents the antecedent nodes from which it is directly derivable. For example, the problem solver fires the rule x ∧ A → y by passing the justification x, A ⇒ y to the ATMS. This justification, interpreted as a material implication, is stored with the node. A node may have several justifications, representing different arguments from which it can be derived.

¹We use the term proposition to denote a propositional symbol or a Boolean expression containing propositional symbols. We do not consider first-order models.

The label for a node represents the environments (sets of assumptions) in which the node can be proven using all justifications known to the ATMS. For example, the label {A} for node a means that a is true when assumption A holds. Assumptions propagate across inferences. Thus, if node a has label {A}, node b has label {B}, and A and B are consistent, then a, b ⇒ c adds {A, B} to the label of c; c ⇒ d then adds {A, B} to the label of d. The ATMS also maintains the set of inconsistent, or nogood, environments. Any environment in which falsity is derivable is marked nogood.
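To make this propagation concrete, here is a minimal Python sketch of the mechanism (the class structure and names are ours for illustration, not de Kleer's actual interface): labels are sets of environments, a justification fires whenever all of its antecedents are labeled, and any environment subsumed by a nogood is discarded.

    from itertools import product

    FALSITY = "⊥"

    class ATMS:
        def __init__(self):
            self.labels = {}          # node -> set of environments (frozensets)
            self.justifications = []  # (antecedent nodes, consequent node)
            self.nogoods = set()      # inconsistent environments

        def assume(self, a):
            # An assumption is a node labeled with the environment {a} itself.
            self.labels.setdefault(a, set()).add(frozenset([a]))

        def justify(self, antecedents, consequent):
            self.justifications.append((tuple(antecedents), consequent))
            self._propagate()

        def _propagate(self):
            changed = True
            while changed:
                changed = False
                for ants, cons in self.justifications:
                    labels = [self.labels.get(a, set()) for a in ants]
                    for combo in product(*labels):
                        env = frozenset().union(*combo)
                        if any(ng <= env for ng in self.nogoods):
                            continue          # subsumed by a nogood
                        if cons == FALSITY:   # falsity derived: record a nogood
                            if env not in self.nogoods:
                                self.nogoods.add(env)
                                for label in self.labels.values():
                                    label -= {e for e in label if env <= e}
                                changed = True
                        else:
                            label = self.labels.setdefault(cons, set())
                            if not any(e <= env for e in label):
                                label -= {e for e in label if env < e}  # keep minimal
                                label.add(env)
                                changed = True

Running the example from the text through this sketch:

    atms = ATMS()
    atms.assume("A"); atms.assume("B")
    atms.justify(["A"], "a")       # a holds under {A}
    atms.justify(["B"], "b")       # b holds under {B}
    atms.justify(["a", "b"], "c")  # c acquires the environment {A, B}
    atms.justify(["c"], "d")       # ... which propagates on to d
    print(atms.labels["d"])        # {frozenset({'A', 'B'})}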

Possible values for a variable must each be represented by a distinct node. For instance, x and ¬x are represented by two different nodes grouped as a class, with x, ¬x ⇒ ⊥, where the symbol ⊥ represents falsity.

When a justification is passed to the ATMS, it is recorded in the list of justifications for the consequent node. Then the ATMS updates all nodes to guarantee that each node label is sound, minimal, complete and consistent. That is, each node label (combined with the nogoods) precisely defines the space of alternative assumption sets under which the node can be derived, given the justifications generated so far.

In addition to the basic ATMS capabilities described above, de Kleer [7] extends the ATMS to allow it to encode propositional statements containing complex disjunctions. In particular, choose{A₁, …, Aₙ} tells the ATMS that at least one of the Aᵢ must be true. This extension is necessary to represent the exhaustivity of a hypothesis set over which beliefs are defined. The extended ATMS includes two hyperresolution rules which are necessary for proper label updating with choose sets.

3. Shafer-Dempster Theory and the ATMS

A proposition's Shafer-Dempster belief may be thought of as the probability that the evidence is sufficient to prove the proposition [20]. For example, imagine that a source reports the truth of a proposition h. Imagine also that the source reports reliably 80% of the time, and simply reports at random the other 20% of the time. We might encode this as an ATMS justification r, A ⇒ h, where r stands for the report and A stands for the assumption that the source is reliable. If A is assigned probability 0.8 and this is our only justification, the label {A} of h represents its Shafer-Dempster belief. (If the source is assumed to lie when not reporting reliably, we could add the justification r, ¬A ⇒ ¬h. Under this model, 0.8 is the conditional probability of h given r.)


More generally, Shafer-Dempster beliefs are defined on a set of mutually exclusive and exhaustive hypotheses h = {h₁, …, hₙ}.² A belief function over these hypotheses arises from a probability distribution over a set of auxiliary hypotheses {A₁, …, Aₖ}. Each auxiliary hypothesis Aᵢ is related to a subset hᵢ of h in the following manner: if Aᵢ is known to be true, then the true hypothesis must be one of the hypotheses in hᵢ [20]. Thus, Aᵢ is sufficient to prove the disjunction of the elements of hᵢ. The subset hᵢ is called a focal element of the belief function; the probability of the assumption Aᵢ is called the basic probability of hᵢ. The belief Bel(h′) assigned to any subset h′ of h (not necessarily a focal element) is the sum of the basic probabilities of the focal elements contained in h′. This belief can be thought of as the probability that an assumption proving h′ holds.

²We adopt the convention of using bold letters to denote sets and ordinary letters to denote elements of sets.
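As a quick illustration (a sketch in our own notation, with focal elements represented as frozensets of hypothesis names), Bel can be computed directly from the basic probabilities:

    def bel(subset, masses):
        # Sum the basic probabilities of the focal elements contained in subset.
        return sum(m for focal, m in masses.items() if focal <= subset)

    # The source-reliability example above: mass 0.8 on {h} (the source is
    # reliable), mass 0.2 on the whole frame (no commitment otherwise).
    masses = {frozenset({"h"}): 0.8, frozenset({"h", "not_h"}): 0.2}
    print(bel(frozenset({"h"}), masses))           # 0.8
    print(bel(frozenset({"h", "not_h"}), masses))  # 1.0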

When two belief functions over the same hypothesis space are based on independent evidence, they may be combined by Dempster's Rule [19]. Consider two belief functions over h with auxiliary hypothesis sets {A₁, …} and {B₁, …} and basic probability assignments p(Aᵢ) and q(Bⱼ), respectively. If Aᵢ proves hᵢ and Bⱼ proves hⱼ, then Aᵢ ∧ Bⱼ proves their intersection hᵢ ∩ hⱼ. Assuming independence of the two distributions, the probability that any subset of h can be proven is the sum of those p(Aᵢ)q(Bⱼ) for which the associated hᵢ ∩ hⱼ is contained in the subset. When the two belief functions give support to incompatible sets, this procedure assigns some probability to the empty set. Dempster's Rule normalizes the above probabilities so that all belief is assigned to non-null sets.
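In the same representation, Dempster's Rule is a few lines (a sketch assuming, as the text requires, that the two mass functions come from independent distributions):

    def dempster(m1, m2):
        # Intersect focal elements pairwise and multiply their masses.
        combined = {}
        for f1, p in m1.items():
            for f2, q in m2.items():
                inter = f1 & f2
                if inter:  # mass falling on the empty set is dropped here
                    combined[inter] = combined.get(inter, 0.0) + p * q
        # Renormalize so that all belief is assigned to non-null sets.
        total = sum(combined.values())
        return {f: m / total for f, m in combined.items()}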

As noted above, a proposition's Shafer-Dempster belief can be interpreted as the probability that the proposition is implied by the evidence. The ATMS label propagation algorithm ensures that the label of a proposition tells us exactly those environments in which it is implied by the premises (assuming that the problem solver has supplied the necessary justifications and the propositions have been encoded correctly). With the addition of a probability calculus on ATMS assumptions, a straightforward mapping exists from ATMS label probabilities to Shafer-Dempster beliefs.

A belief function is encoded in the ATMS as follows. First, the hypothesis space is represented as a oneof disjunction h₁ ⊕ ⋯ ⊕ hₙ of ATMS nodes. That is, a node class is created consisting of a node hᵢ for each hypothesis, and the justifications hᵢ, hⱼ ⇒ ⊥ (j ≠ i) are created. Following de Kleer's encoding for disjunctions, a "hidden" assumption (i.e., invisible to both the problem solver and the probability handler) justifies each hypothesis: Hᵢ ⇒ hᵢ. Then choose{H₁, …, Hₙ} is asserted. Second, an auxiliary hypothesis Aᵢ is created for each focal element hᵢ of the belief function, and these are declared mutually inconsistent: Aᵢ, Aⱼ ⇒ ⊥. (There is no need for a choose here because the belief computer is interested only in environments in which one of the Aᵢ holds.) Third, an ATMS node is created for each subset hᵢ for which a basic probability assignment is specified. This node is justified by the hidden assumptions Hᵢ corresponding to its elements, and declared inconsistent with the other Hᵢ. Next, an ATMS justification is created linking each auxiliary hypothesis to its associated subset of the hypothesis space: Aᵢ ⇒ hᵢ. Finally, probabilities are assigned to the assumptions in the auxiliary hypothesis set.

The probability of the label of an ATMS node can be thought of as the probability that an environment holds in which the associated proposition can be proven. To ensure that zero probability is assigned to ⊥, this label probability is conditioned on the consistent assumption sets. Assuming that different auxiliary hypothesis sets are probabilistically independent, beliefs computed by Dempster's Rule are the same as label probabilities conditioned on consistency. An algorithm for computing label probabilities is given in the next section.

An essential feature of AI systems is the ability to link hypotheses together in chains of inference. Although the results are not as well known as for probability theory, models exist for propagating Shafer-Dempster beliefs through networks of inference rules [3, 21]. Before turning to our example, we give a brief overview of how Shafer-Dempster theory supports inference rules.

If the hypotheses in h provide information about the truth of hypotheses in some other set g, the link can be expressed by defining a belief function over the cross-product space h × g. Commonly, the dependence of g on h is expressed by rules of the form hᵢ → Bel_g(·), in which a single hypothesis from the set h implies a belief function over g. Each such rule can be regarded as a conditional (on hᵢ) belief function over g, and can be encoded by an extension of the above ATMS encoding of unconditional belief functions.

To encode the uncertain inference rule hᵢ → Bel_g(·), we begin by creating an auxiliary hypothesis set represented by assumptions Aᵢ₁, …, Aᵢₙ, one for each focal element of the conditional belief function. With each Aᵢⱼ is associated the basic probability of the focal element gⱼ given hᵢ. Next the justifications hᵢ, Aᵢⱼ ⇒ gⱼ are passed to the ATMS. Any assumptions impacting hypotheses in h automatically propagate to the labels of the hypotheses in g via the ATMS label propagation mechanism. Rules may have more than one antecedent (e.g., hᵢ, fⱼ, Aᵢⱼₖ ⇒ gₖ). The auxiliary hypothesis set for each conditional belief function is assumed to be probabilistically independent of the other auxiliary hypothesis sets.
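Using the ATMS sketch from Section 2, such a rule can be encoded as ordinary justifications. A sketch (the node names, masses, and the premise treatment of h1 are illustrative assumptions, not part of the formal encoding):

    atms = ATMS()
    prob = {}                              # assumption -> basic probability

    atms.justify([], "h1")                 # take h1 as a premise for illustration
    atms.assume("A11"); prob["A11"] = 0.7  # A11 proves g1 given h1
    atms.assume("A12"); prob["A12"] = 0.3  # A12 proves g1-or-g2 given h1
    atms.justify(["A11", "A12"], FALSITY)  # the auxiliary hypotheses are exclusive

    atms.justify(["h1", "A11"], "g1")
    atms.justify(["h1", "A12"], "g1_or_g2")
    print(atms.labels["g1"])               # {frozenset({'A11'})}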

Unlike probabilities, beliefs on h × g are not uniquely derivable from conditional beliefs on g given hᵢ and marginal beliefs on h. The above model implicitly defines a joint belief function that turns out to be equivalent to the hierarchical model in [3]. Other models for the joint belief function are appropriate when the belief functions conditional on different elements of h are judged not to be based on independent evidence [2]. These too can be encoded in the ATMS by having non-independent rules share assumptions.

We have shown that any Shafer-Dempster model can be represented as a set of ATMS justifications with probabilities on ATMS assumptions. Conversely, for any set of ATMS justifications and probability assignment to assumptions, there is an equivalent Shafer-Dempster model. The only restriction on the probabilities is that they must be assigned to a set of pairwise disjoint assumptions whose probabilities sum to 1. Note that unlike the assumptions carrying basic probabilities, the set over which focal elements are defined need not consist of exclusive and exhaustive elements. When it does not, a reformulation is always possible,³ but explicitly encoding it is not necessary. (Of course, the system designer should be aware that beliefs will not be computed correctly unless the ATMS is told when nodes are exclusive and exhaustive.)

³E.g., any two propositions a₁ and a₂ may be reexpressed as the oneof disjunction b₁ ⊕ b₂ ⊕ b₃, where b₁ = a₁, b₂ = a₂ ∧ ¬a₁, b₃ = ¬a₁ ∧ ¬a₂.

Because probability models can be regarded as special cases of belief function models, they can be encoded by ensuring that all belief is allocated to single hypotheses. For example, a conditional probability distribution over g given h is encoded as a set of justifications hᵢ, Aᵢⱼ ⇒ gⱼ with appropriate probabilities assigned to the Aᵢⱼ. (Assumption sets for the conditional distributions given different values of hᵢ may be assumed independent.) A prior distribution over h is encoded by a set of justifications Bᵢ ⇒ hᵢ and appropriate probability assignments to the Bᵢ.

Both Bayesian and Shaferian inference structures have been plagued by difficulties with correct handling of non-independencies. It is important to note that non-independencies are automatically and correctly incorporated by ATMS label propagation. Suppose, for example, that the hypothesis sets f and h both impact on the hypothesis set g. If f and h are in turn both affected by some common antecedent e, the belief propagation algorithms in [21] and [3] give incorrect results. Input from e will be counted twice, once through its impact on f and once through its impact on h. But in an ATMS-based system, the common antecedent will be reflected by assumptions that appear in the labels of both f and h. ATMS label propagation ensures that the shared assumptions will appear correctly in the labels of the gᵢ with no double counting. The example in Section 5 illustrates the treatment of non-independencies.

4. Computing Beliefs

The Shafer-Dempster belief for a node is the probability of its label given the complement of the nogoods:

    Bel(node) = Pr(label ∧ ¬nogood)/Pr(¬nogood)
              = Pr(label ∧ ¬nogood)/(1 − Pr(nogood)) .    (1)

Nogood environments consisting of pairs of auxiliary hypotheses from the same probability distribution have zero probability by definition, and are excluded from (1). Often the set of nogood environments can be partitioned into nogood₁ and nogood₂, where the environments in nogood₂ have no overlap with environments in nogood₁ or the label. Environments in nogood₂ can be ignored in (1) because their probabilities factor out of both numerator and denominator. To construct nogood₁, first select all nogoods that overlap the label, i.e., that contain assumptions from the same probability distribution as an assumption in a label environment. Next, select all nogoods that overlap nogood₁. Continue until no more nogoods are added to nogood₁.
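A sketch of this selection, assuming binary probability distributions {A, ¬A} so that two assumptions belong to the same distribution exactly when they share a name; environments here are sets of (name, sign) literals, the representation used in the sketch after Step 3 below:

    def interacting_nogoods(label_envs, nogood_envs):
        # Drop distribution-internal nogoods such as {A, ~A}: probability 0.
        rest = [ng for ng in nogood_envs
                if len({name for name, _ in ng}) == len(ng)]
        names = {name for env in label_envs for name, _ in env}
        selected = []
        changed = True
        while changed:  # close under overlap with the label and with nogood_1
            changed = False
            for ng in rest[:]:
                if {name for name, _ in ng} & names:
                    selected.append(ng)
                    rest.remove(ng)
                    names |= {name for name, _ in ng}
                    changed = True
        return selected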

Because ATMS labels and nogoods are maintained in disjunctive normal form, a node's belief is equal to the probability of a disjunction of conjunctions. If the disjuncts are disjoint, their probabilities may simply be added. But this is not usually the case. Fortunately, a simple technique exists to transform any disjunction α₁ ∨ ⋯ ∨ αₙ into a logically equivalent disjunction β₁ ∨ ⋯ ∨ βₙ in which the βᵢ are themselves expressions in disjunctive normal form, and all disjuncts in the entire expression are disjoint. The βᵢ are constructed as follows:

    β₁ = α₁ ,    βᵢ = αᵢ ∧ ¬α₁ ∧ ⋯ ∧ ¬αᵢ₋₁ .    (2)

The βᵢ have been constructed to be disjoint from each other. To finish the job, we express each βᵢ in disjunctive normal form with disjuncts that are disjoint. First, note that the environments αᵢ and αⱼ are disjoint if each contains an assumption from the same probability distribution (e.g., {A, B} and {¬A, C} are disjoint). All such αⱼ may be removed from the conjunction in (2). Next, repeatedly apply the relation

    α ∧ ¬(A₁ ∧ ⋯ ∧ Aₖ)
        = (α ∧ ¬A₁) ∨ (α ∧ A₁ ∧ ¬A₂)
          ∨ ⋯ ∨ (α ∧ A₁ ∧ A₂ ∧ ⋯ ∧ ¬Aₖ)    (3)

to remove the remaining αⱼ in turn. If Aᵢ ∈ α, the disjunct containing ¬Aᵢ may be dropped from (3) because its probability is zero. If all probability distributions are binary, (2) and (3) are sufficient to transform the expression to disjoint disjunctive normal form. Otherwise, we reexpress ¬Aᵢ as ∨_{j≠i} Aⱼ.

Note that this transformation automatically removes all earlier αⱼ from each αᵢ to create βᵢ. This suggests the following algorithm for computing Pr(nogood) and Pr(label ∧ ¬nogood) as needed in (1):

Step 1. Compile the label and interacting nogoods into a single disjunction α₁ ∨ ⋯ ∨ αₙ, where the first k disjuncts are nogoods and the remaining n − k constitute the label environments. (Efficiency is gained if smaller environments are processed first.)

Step 2. Reexpress as β₁ ∨ ⋯ ∨ βₙ using (2) and (3). The first k terms β₁ ∨ ⋯ ∨ βₖ represent the interacting nogoods in disjoint disjunctive normal form. The remaining terms βₖ₊₁ ∨ ⋯ ∨ βₙ represent the label with the nogoods removed.

Step 3. Compute Pr(nogood) and Pr(label ∧ ¬nogood). To do this, add together the products of the probabilities of the assumptions in the conjunctions making up nogood and label ∧ ¬nogood, respectively.
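The following Python sketch implements Steps 1-3 for binary probability distributions pdist{A, ¬A; p, 1 − p}, given the interacting nogoods (e.g., as selected by interacting_nogoods above). Environments are sets of literals: ("A", True) for A and ("A", False) for ¬A. The function names and representation are our own; the logic follows equations (1)-(3).

    def negate(lit):
        name, positive = lit
        return (name, not positive)

    def contradicts(env, other):
        # True if the two environments assign opposite values to some assumption.
        return any(negate(lit) in other for lit in env)

    def env_prob(env, p):
        # Probability of a conjunction of independent assumption literals.
        prob = 1.0
        for name, positive in env:
            prob *= p[name] if positive else 1.0 - p[name]
        return prob

    def disjointify(alphas):
        # Equations (2) and (3): one list of mutually disjoint conjunctions
        # (the terms of beta_i) per input environment alpha_i.
        betas = []
        for i, alpha in enumerate(alphas):
            terms = [frozenset(alpha)]
            for prior in alphas[:i]:
                if contradicts(prior, alpha):
                    continue              # alpha_j already disjoint from alpha_i
                new_terms = []
                for term in terms:
                    if contradicts(prior, term):
                        new_terms.append(term)
                        continue
                    prefix = set()        # equation (3), literal by literal
                    for lit in sorted(prior):
                        if lit not in term:   # drop zero-probability disjuncts
                            new_terms.append(term | prefix | {negate(lit)})
                        prefix.add(lit)
                terms = new_terms
            betas.append(terms)
        return betas

    def belief(label_envs, nogood_envs, p):
        # Equation (1); nogoods go first, per Step 1.
        alphas = ([frozenset(e) for e in nogood_envs] +
                  [frozenset(e) for e in label_envs])
        betas = disjointify(alphas)
        k = len(nogood_envs)
        pr_nogood = sum(env_prob(t, p) for ts in betas[:k] for t in ts)
        pr_label = sum(env_prob(t, p) for ts in betas[k:] for t in ts)
        return pr_label / (1.0 - pr_nogood)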

D'Ambrosio [5] independently developed an approach very similar to ours. Whereas he considers only the special-case Shafer-Dempster model implemented in support logic programming [1], we demonstrate a formal equivalence between ATMS plus probabilities and general Shafer-Dempster theory. D'Ambrosio's belief computation algorithm gives different answers from ours (and hence from Dempster's Rule) when there are nogood environments containing assumptions not appearing in the labels of any hypotheses in the set over which beliefs are computed. An example of this difference is given in the next section.

5. An Example

Relations between Depravia and Rechtia have been plagued by recurrent border disputes. Because these countries are of such strategic importance, concern has arisen over a recent report of increased activity in Depravia near the border area, which may be indicative of an impending attack.

Figure 1 illustrates some hypothetical inference rules for reasoning about Depravia's attack plans. Firing each rule results in passing the associated justification to the ATMS. Probability distributions on the auxiliary hypotheses are created by the statement pdist{A₁, A₂, …; p₁, p₂, …}. This statement results in the following actions:

    Aᵢ, Aⱼ ⇒ ⊥ (i ≠ j) ,    Pr(Aᵢ) ← pᵢ .

Figure 2 shows the states of the ATMS nodes after the rules of Fig. 1 have fired. Because all nogoods consist of members of the same pdist, the denominator of (1) is equal to 1.

Belief in the node a (Depravia plans an attack) is thus equal to

    Bel({a}) = Pr((W ∧ Y ∧ Z) ∨ (V ∧ X ∧ Z)) .    (4)

Applying the method of the last section, the label may be reexpressed as follows (probabilities of the environments are included):

    β₁:  W, Y, Z           0.8 × 0.75 × 0.8 = 0.48 ,
    β₂:  V, X, Z, ¬W       0.7 × 0.6 × 0.8 × 0.2 = 0.067 ,
         V, X, Z, W, ¬Y    0.7 × 0.6 × 0.8 × 0.8 × 0.25 = 0.067 .


Inference Rules and Corresponding ATMS Justifications

    Inference Rule                                           ATMS Justification
    Moving_Troops_To_Border → Attack_Planned (.7)            b, V ⇒ a
    Readying_Supply_Lines → Attack_Planned (.8)              c, W ⇒ a
    Increased_Activity → Moving_Troops_To_Border (.6)        d, X ⇒ b
    Increased_Activity → Readying_Supply_Lines (.75)         d, Y ⇒ c
    (Report Increased_Activity) → Increased_Activity (.8)    e, Z ⇒ d

Auxiliary Hypothesis Definitions

    pdist{V, ¬V; .7, .3}
    pdist{W, ¬W; .8, .2}
    pdist{X, ¬X; .6, .4}
    pdist{Y, ¬Y; .75, .25}
    pdist{Z, ¬Z; .8, .2}

Additional ATMS Justifications

    b, ¬b ⇒ ⊥
    d, ¬d ⇒ ⊥
    e, ¬e ⇒ ⊥

Diagram of the Inference Network

    e → d → b → a
        d → c → a

Fig. 1. Intelligence analysis example.

Bel({a}) is equal to the sum of these probabilities, or 0.614. Now suppose we encounter conflicting evidence, learning the antecedent of the following rule:

    (Report (Not Readying_Supply_Lines)) → (Not Readying_Supply_Lines) (.8)

Figure 3 shows the modified inference network. The ATMS node f and the assumption U are created to represent, respectively, the proposition on the left-hand side of this rule and the auxiliary hypothesis that the source is reporting reliably in this instance.


ATMS node labels (not including hidden nodes)

    a: {{W, Y, Z}, {V, X, Z}}
    b: {{X, Z}}
    c: {{Y, Z}}
    d: {{Z}}
    e: {{ }}

Nogood environments

    {V, ¬V}, {W, ¬W}, {X, ¬X}, {Y, ¬Y}, {Z, ¬Z}

Fig. 2. ATMS nodes and nogood environments.

The ATMS is passed the justification f, U ⇒ ¬c, which creates a nonempty label, {{U}}, for ¬c. Because {Y, Z} is in the label of c, the environment {U, Y, Z} is declared nogood. This nogood environment interacts with the label of a, so Bel({a}) must be recomputed.

The nogood and label environments are again expressed as disjoint disjuncts:

    β₁:  U, Y, Z               0.8 × 0.75 × 0.8 = 0.48 ,
    β₂:  W, Y, Z, ¬U           0.8 × 0.75 × 0.8 × 0.2 = 0.096 ,
    β₃:  V, X, Z, ¬U, ¬W       0.7 × 0.6 × 0.8 × 0.2 × 0.2 = 0.013 ,
         V, X, Z, ¬U, W, ¬Y    0.7 × 0.6 × 0.8 × 0.2 × 0.8 × 0.25 = 0.013 ,
         V, X, Z, U, ¬Y        0.7 × 0.6 × 0.8 × 0.8 × 0.25 = 0.067 .

Therefore,

    Bel({a}) = (0.096 + 0.013 + 0.013 + 0.067)/(1 − 0.48) = 0.366 .
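Running the sketch from Section 4 on this example reproduces both numbers (assumption names as in Fig. 1; the helper code is ours):

    p = {"U": 0.8, "V": 0.7, "W": 0.8, "X": 0.6, "Y": 0.75, "Z": 0.8}
    T = True
    label_a = [{("W", T), ("Y", T), ("Z", T)},   # label of node a (Fig. 2)
               {("V", T), ("X", T), ("Z", T)}]

    print(belief(label_a, [], p))                # 0.6144, i.e. 0.614 as in (4)

    nogood1 = [{("U", T), ("Y", T), ("Z", T)}]   # after the conflicting report
    print(belief(label_a, nogood1, p))           # 0.3655..., i.e. 0.366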

    [Diagram: the inference network of Fig. 1 extended with a node f justifying ¬c under assumption U]

Fig. 3. Inference net with conflicting evidence.

D'Ambrosio's algorithm [5] only computes probabilities for interpretations containing assumptions appearing in the label of a or ¬a. Using his algorithm, the environment {U, Y, Z} would not be removed from the label before computing the numerator of (1), nor would its probability appear in the denominator (because U does not appear in the label of either a or ¬a). Thus, the evidence against c would not change belief in a at all, despite its contradicting the earlier evidence for one of a's antecedents.

D'Ambrosio also constructs all interpretations containing the selected assumptions. In this problem, there are 2⁶ = 64 interpretations (all possible truth values for the 6 assumptions), 17 of which appear in the label or are nogood. This means that d'Ambrosio's algorithm would construct 17 interpretations and compute a probability, requiring six multiplications, for each. Our algorithm only computes probabilities for five environments, and the probabilities of most require fewer than six multiplications. Moreover, each term of βᵢ contains the assumptions in αᵢ, so their probabilities need only be multiplied once.

The probability 0.48 of {U, Y, Z} represents the prior belief we had assigned to an environment that was later found to be inconsistent. As such, it represents the degree of conflict associated with the current belief structure. Such a high degree of conflict may prompt us to reexamine the original belief assignments. For example, the problem solver might have assigned probability 0.8 to both U and Z as a default assignment for rules of the form (report q) → q. Conflict might prompt the problem solver to examine the characteristics of the source of each report, searching for stronger evidence about the reliability of the source [3].

6. Discussion

We have demonstrated the formal equivalence of Shafer-Dempster belief theory to an ATMS with a probability model on assumptions. Based on this equivalence, we have provided an algorithm for computing beliefs using an ATMS. In establishing the equivalence, assumptions were treated simply as symbolic tokens for ATMS propagation. Of more interest is the case where assumptions are themselves meaningful. For example, assumptions might represent rule endorsements [4] which have meaning to the problem solver. Or an assumption attached to a rule might represent the negation of an unless censor [14]. The ATMS-based probability calculus is a theoretically justified way of combining and propagating endorsements or censors. When assumptions and their probabilities are meaningful, one can interpret the probability of a datum's label, derived from (1), as a measure of the robustness of a conclusion over alternative possible worlds (see also [12]).

A natural question is why an ATMS-based algorithm should be preferred to an algorithm specially designed for efficient propagation of numerical uncertainty values in an inference network (e.g., [15, 21]). These algorithms are designed for efficient calculation of probabilities and belief values on stable predefined symbolic reasoning chains (inference networks). For problems where an inference network is not defined a priori, applying one of these algorithms requires the system to (1) generate the relevant reasoning chain, (2) attach the appropriate probability values, and then (3) execute the numerical propagation algorithm. The ATMS approach, on the other hand, can be easily applied to problems where the reasoning chain is constructed rather than predefined. A symbolic problem solver with an ATMS provides an efficient mechanism for doing (1), while our approach to belief computation provides an efficient mechanism for doing (2) and (3), given that (1) was achieved using an ATMS. We therefore suggest that, except for the special case of a predefined inference network, the ATMS approach may turn out to be at least as efficient as numerical propagation algorithms. (Methods like those of [15] and [21] will be more efficient when beliefs of many propositions are required between updates of the inference network.)

In addition, there are several other advantages to the ATMS-based approach. First, our representation handles non-independencies correctly. We know of no other method for propagating Shafer-Dempster beliefs that does this.⁴ Second, the ATMS approach is consistent with the constructive probability approach advocated by Shafer and Tversky. In [22] it is suggested that, at least for people, probability assessments are not so much retrieved from memory as they are constructed as needed. In the ATMS approach, a database of relevant probability values is not presumed. Rather, ATMS symbolic processing is used to identify required values, which can then be provided either through database access or via some construction algorithm. Finally, belief computation can easily be combined with other truth maintenance tasks like default reasoning. Certain assumptions can be designated by the problem solver as defaults. Treating defaults separately (leaving them out of the probability calculation but assigning zero probability to environments inconsistent with them) can lead to increased efficiency.

Finally, we note that a probability calculus on assumptions also raises interesting possibilities for problem-solving control. For example, probable contexts may be explored first; probabilities may be used as guidance on which data to collect; or probabilities may inform the selection of default contexts. An interesting approach to conflict resolution is suggested by noting that Pr(nogood) in equation (1) is equal to the prior probability assigned to environments that turned out, after accounting for the justifications, to be inconsistent. This quantity serves as a natural measure of the degree of conflict associated with the belief model. The problem solver may use conflict as a signal to reexamine beliefs assigned by default [3]. In this way, the ATMS approach makes it possible to reason both symbolically and numerically about conflict. Numerical processing measures the degree of conflict, while ATMS processing makes it possible to explicitly reason about the reasoning chains and assumptions that led to the conflict.

⁴Shachter [18] solves Bayesian networks with non-independence (but his method is likely to be no more efficient than an ATMS). Non-independencies have also been handled with Monte Carlo methods [11, 16].

REFERENCES

1. Baldwin, J., Support logic programming, Tech. Rept. ITRC-65, Information Technology Research Center, University of Bristol (1985).
2. Black, P.K. and Laskey, K.B., Hierarchical evidence and belief functions, in: Proceedings Fourth Workshop on Uncertainty in Artificial Intelligence, St. Paul, MN (1988).
3. Cohen, M.S., Laskey, K.B. and Ulvila, J.W., The management of uncertainty in intelligence data: A self-reconciling evidential database, Tech. Rept. 87-8, Decision Science Consortium, Inc., Reston, VA (1987).
4. Cohen, P.R., Heuristic Reasoning about Uncertainty: An Artificial Intelligence Approach (Pitman, Boston, MA, 1985).
5. d'Ambrosio, B., A hybrid approach to uncertainty, Int. J. Approximate Reasoning (to appear).
6. de Kleer, J., An assumption-based TMS, Artificial Intelligence 28 (1986) 127-162.
7. de Kleer, J., Extending the ATMS, Artificial Intelligence 28 (1986) 163-196.
8. de Kleer, J. and Williams, B.C., Diagnosing multiple faults, Artificial Intelligence 32 (1987) 97-130.
9. Dempster, A.P., Upper and lower probabilities induced by a multivalued mapping, Ann. Math. Statist. 38 (1967) 325-339.
10. Doyle, J., A truth maintenance system, Artificial Intelligence 12 (1979) 231-272.
11. Henrion, M., Propagating uncertainty by logic sampling in Bayes' networks, in: Proceedings Second Workshop on Uncertainty in AI, Philadelphia, PA (1986).
12. Lehner, P.E., Probabilities and possible worlds reasoning, Artificial Intelligence (submitted).
13. McDermott, D. and Doyle, J., Non-monotonic logic I, Artificial Intelligence 13 (1980) 41-72.
14. Michalski, R.S. and Winston, P.H., Variable precision logic, Artificial Intelligence 29 (1986) 121-146.
15. Pearl, J., Fusion, propagation and structuring in belief networks, Artificial Intelligence 29 (1986) 241-288.
16. Pearl, J., Evidential reasoning using stochastic simulation of causal models, Artificial Intelligence 32 (1987) 245-257.
17. Reiter, R., A logic for default reasoning, Artificial Intelligence 13 (1980) 81-132.
18. Shachter, R.D., Evaluating influence diagrams, Oper. Res. 34 (1986) 871-882.
19. Shafer, G., A Mathematical Theory of Evidence (Princeton University Press, Princeton, NJ, 1976).
20. Shafer, G., Belief functions and possibility measures, in: J.C. Bezdek (Ed.), The Analysis of Fuzzy Information (CRC Press, Boca Raton, FL, 1987).
21. Shafer, G., Shenoy, P.P. and Mellouli, K., Propagating belief functions in qualitative Markov trees, Working Paper No. 186, School of Business, University of Kansas, Lawrence, KS (1986).
22. Shafer, G. and Tversky, A., Languages and designs for probability judgment, Cognitive Sci. 9 (1985) 309-339.
23. Zadeh, L.A., Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets Syst. 1 (1978) 3-28.

Received October 1987; revised version received July 1988