


408 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 28, NO. 4, JULY 1998

A Unified Model for Abduction-Based Reasoning
Bechir Ayeb, Member, IEEE, Shengrui Wang, and Jifeng Ge

Abstract— In the last decade, abduction has been a very active research area. This has resulted in a variety of models mechanizing abduction, namely within a probabilistic or logical framework. Recently, a few abductive models have been proposed within a neural framework. Unfortunately, these neural-/probabilistic-/logical-based models cannot address complex abduction problems. In this paper, we propose a new extended neural-based model to deal with abduction problems which could be monotonic, open, and incompatible.

Index Terms— Competition model, distributed processing, monotonic/open/incompatibility abduction problems, neural networks.

I. PROLOGUE

THE intuition behind abduction can be stated as follows [5], [31]. Given an observation o (e.g., a manifestation or a symptom in medicine), a hypothesis h (e.g., a disorder or a disease in medicine), and the knowledge that h causes o, it is an abduction to hypothesize that h occurred. The main issue of abduction is to synthesize a composite hypothesis explaining the entire observation from elementary hypotheses. In [5], deep insights into the mechanization of abduction were pointed out. The computational complexity of each class of abduction problems has been provided.

In fact, the ubiquity of abduction is remarkable. Once one learns it, one discovers it virtually everywhere. This includes diverse areas such as diagnosis [34], planning [17], natural language processing [7], machine learning [28], legal reasoning [41], and text processing [27]. Unfortunately, the applicability of abduction suffers from its computational complexity. Although abduction is in general NP-hard [5], [18], some models (e.g., [1], [2], [4], [9], [25], [29], [32], [35]) aiming at the mechanization of abductive reasoning (also called hypothetical reasoning) have been provided. All of these models make several assumptions to address only a simple case, called the class of independent abduction problems.

The main objective of this paper is to address complex classes of abduction. Specifically, we propose a unified neural-based model dealing with the monotonic class, the open class, and the incompatibility class of abduction problems. The strengths of this work are two-fold. First, it is innovative.

Manuscript received November 7, 1995; revised October 15, 1996 and October 22, 1997. This research was supported in part by the NdL project under FRAI Grants 0238A and 0291A and NSERC Grants OGP0121468/OGP0121680 and EQP0121469/EQP0139397. Project dL/NdL is an experimental diagnosis laboratory which aims at adapting and mechanizing model-based diagnosis techniques depending on the class of diagnostic problems.

The authors are with the Faculty of Sciences/DMI, University of Sherbrooke, Sherbrooke, P.Q., Canada J1K 2R1 (e-mail: [email protected]; [email protected]; [email protected]).

Publisher Item Identifier S 1083-4427(98)04360-4.

To our knowledge, this is the first model which tackles complex abductive problems (e.g., the incompatibility class) and provides an effective algorithm to generate explanations for this class. Second, it is instructive in itself. Starting from a simple model addressing the class of independent abduction problems, we progressively extend this model to address a variety of abduction problems regardless of their class. Consequently, we show how neural networks can be very effective in addressing complex problems. We also open the door for a wide applicability of abduction to real-world problems.

The rest of this paper is organized as follows. Section II gives preliminary material on abduction. Section III presents our initial model for the independent class. Section IV provides, in a stepwise refinement way, our unified model. Section V summarizes our model and theoretical foundations. Section VI discusses related work and experimental results. Section VII includes concluding remarks and further research.

II. CONCISE SUMMARY ON ABDUCTION CLASSES

The main goal of this section is to provide preliminary material on abduction, facilitate the introduction of neural-based abduction, and characterize in a concise way different classes of abduction problems.¹ Depending on the context, an abduction problem can be formalized in several ways. Let us adopt the following definition.

Definition 1: An abduction problem AP is a tuple ⟨Hy, Ob, R⟩, where Hy and Ob are two disjoint finite sets of symbols and R is a finite set of causal relationships between Hy and Ob.

Intuitively, Hy denotes the set of elementary hypotheses, Ob denotes the set of observable facts, whereas R denotes a mapping from Hy to Ob. Following the form of the causal relationships embedded in R, we can introduce several classes of abduction problems. For convenience, let us first introduce the following notation.

Notation 1: Consider Sh ⊆ Hy, a subset of elementary hypotheses, and o ∈ Ob, an observable fact. We write Sh →w o to express a causal relationship between the hypotheses in Sh and the observable fact o with a given weight w ∈ [0, 1]. In the following, we say that o is covered by Sh with respect to the weight w, or that Sh may cause o with the weight w.
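
To make the preceding definitions concrete, here is a minimal Python sketch of this representation; the class names, field names, and toy relationships are illustrative choices, not notation taken from the paper or from Fig. 1.

from dataclasses import dataclass

@dataclass(frozen=True)
class CausalRelationship:
    hypotheses: frozenset     # Sh: subset of elementary hypotheses
    effect: str               # o: the observable fact caused by Sh
    weight: float             # w: causal weight in [0, 1]

@dataclass
class AbductionProblem:
    hypotheses: set           # Hy: elementary hypotheses
    observables: set          # Ob: observable facts
    relationships: list       # R: weighted causal relationships

# A toy independent problem: every relationship involves exactly one hypothesis.
ap = AbductionProblem(
    hypotheses={"h1", "h2"},
    observables={"o1", "o2"},
    relationships=[
        CausalRelationship(frozenset({"h1"}), "o1", 1.0),   # h1 must cause o1
        CausalRelationship(frozenset({"h2"}), "o2", 0.75),  # h2 may cause o2
    ],
)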

A. Independent and Monotonic Abduction Problems

Using Notation 1, we now define straightforwardly the class of independent and monotonic abduction problems.

¹For a thorough and general introduction to abduction, see [5], [33], and [37].

1083–4427/98$10.00 © 1998 IEEE


Fig. 1. A general abduction problem.

Definition 2: Let AP = ⟨Hy, Ob, R⟩ be an abduction problem; then we have the following.

AP belongs to the independent class if each causal relationship in R can be expressed as {h} →w o, where h ∈ Hy and o ∈ Ob. In other words, each causal relationship involves exactly one elementary hypothesis and one observable fact. Hence, there is no interaction between the elementary hypotheses in Hy.

AP belongs to the monotonic class if there is at least one causal relationship in R which is expressed as Sh →w o, where #(Sh) ≥ 2 and no proper subset of Sh alone causes o. This time, an additive interaction between the hypotheses in Sh is required to cause the observable fact o.

Fig. 1 introduces a general abduction problem², where parenthesized identifiers refer to abbreviated names. Using it, let us define an independent abduction problem and APMN, a monotonic abduction problem, as follows.

Example 1: , where, and .

Example 2: where, and .

²This is drawn from mechanical engineering, namely an oversimplified model for automobile troubleshooting [8]. We do not intend that our illustrative examples correspond to actual cases; real-world cases are reported in Section V. Note also that the weights of causal relationships are split into several categories, e.g., weight 1.0 (resp., 0.75, 0.5) expresses a must cause (resp., may cause, leads to) relationship. Other categories could be introduced depending on the application domain.

Given an abduction problem, we introduce the concept of a simple observation as follows.

Definition 3: Let AP = ⟨Hy, Ob, R⟩ be an independent or a monotonic abduction problem; a simple observation SO is a given subset of observed facts, SO ⊆ Ob.

As an example, let us consider the following simple observations.

Example 1 (Continued):SO.

Example 2 (Continued):SO.

The following notation, inspired by [5], [33], is useful to introduce the concept of explanation for an abduction problem.

Notation 2: Let h be an elementary hypothesis and suppose that h causes o, i.e., {h} →w o. We define the following "function": EXPLAIN(h) = {o ∈ Ob | {h} →w o ∈ R}. If Sh denotes a subset of elementary hypotheses, then EXPLAIN(Sh) = ∪h∈Sh EXPLAIN(h).

Definition 4: Let AP = ⟨Hy, Ob, R⟩ be an independent or a monotonic abduction problem. Suppose that SO is a given simple observation for AP; then an explanation E for AP with respect to SO is such that: 1) E ⊆ Hy; 2) SO ⊆ EXPLAIN(E); and 3) E is minimal with respect to set inclusion/cardinality. If, among all explanations, E has the maximal plausibility (i.e., with respect to a given belief function), then E is said to be the best explanation.
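
As one concrete reading of Notation 2 and Definition 4, the sketch below (a naive enumeration under the illustrative representation given earlier, not the paper's algorithm) computes EXPLAIN for a set of hypotheses and returns the cardinality-minimal explanations of a simple observation.

from itertools import combinations

def explain(hypotheses, relationships):
    # EXPLAIN(Sh): observable facts covered when all hypotheses in Sh are assumed.
    assumed = set(hypotheses)
    return {r.effect for r in relationships if r.hypotheses <= assumed}

def minimal_explanations(ap, simple_observation):
    # Enumerate hypothesis subsets by increasing cardinality and return the
    # smallest ones whose EXPLAIN set covers the whole simple observation.
    for size in range(1, len(ap.hypotheses) + 1):
        found = [set(subset)
                 for subset in combinations(sorted(ap.hypotheses), size)
                 if simple_observation <= explain(subset, ap.relationships)]
        if found:
            return found
    return []

# With the toy problem above, minimal_explanations(ap, {"o1", "o2"})
# returns [{"h1", "h2"}]: both hypotheses are needed to cover SO.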

For our illustrative examples, we have the following explanations.

Example 1 (Continued): ,and for with respectto SO .

Example 2 (Continued): ,and for with respectto SO .

Notice that the above explanations take the minimality criterion with respect to set cardinality. Moreover, they appear to be the best explanations if plausibility is a simple normalized sum of the weights attached to causal relationships.

B. Open and Incompatibility Abduction Problems

Unlike the independent and monotonic classes, the open and incompatibility classes require the introduction of the concept of a multiple observation, as follows.

Definition 5: Let AP = ⟨Hy, Ob, R⟩ be an abduction problem; a multiple observation MO is a triple ⟨O+, O−, O?⟩ where O+ denotes the facts which are known to be present, O− denotes the facts which are known to be absent, and O? denotes the unknown facts. Naturally, we have O+ ∪ O− ∪ O? = Ob.

Now, we introduce a final convention followed by two definitions related to the classes of open and incompatibility abduction problems.

Notation 3: We make use of the distinguished symbol ⊥ to denote a fact which can never be observed.

Definition 6: Let AP = ⟨Hy, Ob, R⟩ be an abduction problem. Suppose that every observation to AP is necessarily multiple; then we have the following.


AP belongs to the open class if for every multiple observation, say MO = ⟨O+, O−, O?⟩, we have O− ≠ ∅ or O? ≠ ∅.

AP belongs to the incompatibility class if: 1) there is at least one causal relationship which is expressed as Sh →w ⊥, and 2) #(Sh) ≥ 2. Here, we have a nonempty set of elementary hypotheses which cannot coexist together.

Now we introduce the concept of explanation as follows—compare with Definition 4.

Definition 7: Let AP = ⟨Hy, Ob, R⟩ be an incompatibility or an open abduction problem. Suppose that MO = ⟨O+, O−, O?⟩ is a given multiple observation for AP; then an explanation E for AP with respect to MO is such that 1) E ⊆ Hy; 2) O+ ⊆ EXPLAIN(E); 3) E is minimal with respect to set inclusion/cardinality; and 4) EXPLAIN(E) ∩ O− is minimal with respect to set inclusion/cardinality. If, among all explanations, E has the maximal plausibility (i.e., with respect to a given belief function), then E is said to be the best explanation.
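
Definition 7 can be read operationally as the sketch below: among hypothesis subsets that cover every present fact, prefer those that are small and that cover as few absent facts as possible. The lexicographic ordering used here (cardinality first, then absent coverage) is one reasonable reading of the minimality criteria, not necessarily the paper's exact precedence; the helper explain() is the illustrative one defined above.

def candidate_explanations(ap, present, absent):
    # Return covering subsets ordered by (size, number of absent facts covered).
    from itertools import combinations
    candidates = []
    for size in range(1, len(ap.hypotheses) + 1):
        for subset in combinations(sorted(ap.hypotheses), size):
            covered = explain(subset, ap.relationships)
            if present <= covered:
                candidates.append((size, len(covered & absent), set(subset)))
    candidates.sort(key=lambda c: (c[0], c[1]))
    return candidates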

Considering the relationships of the general abduction problem in Fig. 1, let us define APOP, an open abduction problem, and APIC, an incompatible abduction problem.

Example 3: APOP, an open abduction problem built from the relationships of Fig. 1.

Example 4: APIC, an incompatible abduction problem built from the relationships of Fig. 1.

Consider the following multiple observations for our illustrative examples.

Example 3 (Continued): a multiple observation MO for APOP.

Example 4 (Continued): a multiple observation MO for APIC.

Then, we obtain the following explanations.

Example 3 (Continued): the explanation for APOP with respect to MO.

Example 4 (Continued): the explanations for APIC with respect to MO.

Observe that one potential explanation for APOP is removed because it covers a fact considered as absent. In the same way, one explanation for APIC is also rejected due to the incompatibility relationship.

C. Related Issues

From the definitional viewpoint, we notice that we deliberately create a set of primitive classes. The incompatibility class is in fact a combination of the open and monotonic classes of abduction. Unfortunately, these primitive classes are not "complete," since some classes of abduction problems cannot be built from them. The cancellation abduction problems introduced in [5] are an example of such an abduction class. Is there a formulation which allows us to define a minimal, however complete, set of classes? Further research should shed some light on this issue.

From the semantic viewpoint, we should pay attention to the minimality and plausibility criteria, which are subject to several implicit assumptions and are case sensitive [38]. Nonetheless, these criteria are used in virtually all abduction approaches, and a precedence relation between them is introduced to resolve the potential conflict. Plausibility is often considered after the minimality criterion. In this work, we adopt these conventional criteria, hence their implicit assumptions.

From the operational viewpoint, mechanizing abduction is actually complex. Indeed, determining the number of explanations for an independent abduction problem is just as hard as determining the number of solutions to an NP-complete problem [5, Theorem 4.1]. In the same way, determining whether an additional explanation exists for a monotonic abduction problem, given a set of explanations, is NP-complete [5, Theorem 4.3]. As for an incompatibility abduction problem, determining whether an explanation exists for it is NP-complete [5, Theorem 4.5]. Finally, determining the best explanation for an incompatibility abduction problem is NP-hard [5, Corollary 4.6].

III. INITIAL MODEL FOR NEURAL-BASED ABDUCTION

In contrast to the variety of proposals (e.g., [9], [29], [37]) provided within a logical/probabilistic framework, few proposals (i.e., [20], [34], [41]) have been made to mechanize abduction within a neural framework. In fact, the mechanization of abduction becomes very complex because it involves multiple and bidirectional inferences. Ultimately, neither Perceptron-like associative neural networks nor more flexible models like ART [6] and bidirectional associative networks [26] can easily be used to model the abductive reasoning process.

Like other proposals [20], [34], [41], our initial neural model addresses only the class of independent abduction problems [3]. The goal of this section is two-fold: first, to discuss the reasoning process of this initial model in order to facilitate the introduction of the unified model in the next section; second, to bridge the gap between the conventional framework of abduction (i.e., Section II) and the neural framework of abduction (i.e., Section III).

A. Neural Architecture

Given an independent abduction problem, we start by designing the architecture of the neural network as depicted in Fig. 2.

Excitatory and inhibitory connections could be initialized to any small real value. However, each excitatory connection will be initialized by w, the weight given in R for the corresponding causal relationship. In turn, each inhibitory connection will be initialized by a normalized fraction of the number of common observable facts of the two hypotheses it connects and the total number of observable facts.

There is a cell for each elementary hypothesis and a cell for each observable fact. One layer includes the cells of the observable facts in Ob, the other the cells of the elementary hypotheses in Hy. The first type of connection is between an elementary hypothesis cell and an observable cell: there is a connection, called an excitatory connection, whenever there is a causal relationship between them in R. The second type of connection is between some elementary hypothesis cells and is called an


Fig. 2. The architecture for neural-based abduction.

inhibitory connection. Two elementary hypothesis cells are connected if they can explain at least one common observable fact.

B. Competition Process

The activity of a given observable fact cell is set to a given constant if the corresponding fact is in the given observation; otherwise it is set to zero. Moreover, these activities stay unchanged throughout the reasoning process. On the other hand, the activity x_i of the cell of an elementary hypothesis h_i is governed by the shunting competitive equation [21]

(1)

where the decay rate of the cell is a small positive real number (i.e., 0.1).

Formula (1) is in fact used as a theoretical model combining positive and negative influences of elementary hypotheses. Let us rewrite (1) to obtain an equivalent form involving the sigmoid function. Here, the total inhibitory input to the cell comes from all other cells in the hypothesis layer. Its action on the activity of the cell is modulated by the activation level of the cell itself. This guarantees a smooth evolution of cell activity and a long transient period during which the network is devoted to the reasoning process. Conversely, the total excitatory input to the cell comes from cells in the observable layer. Note that it is not modulated by the cell's own activity, since it is supporting evidence for the hypothesis. Note also that in the absence of excitatory inputs, the activity will converge to zero due to the decay parameter.
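
The body of (1) is not legible in this copy. A standard shunting form consistent with the description above (a decay term, excitation that is not modulated by the cell's activity, and inhibition that is) would be dx/dt = -delta*x + E - x*I; the Python sketch below integrates that assumed form with a small Euler step, and delta, the inputs, and the step size are illustrative values only.

def shunting_step(x, excitation, inhibition, delta=0.1, dt=0.1):
    # One Euler step of the assumed shunting equation
    #   dx/dt = -delta * x + E - x * I,
    # where E is the total excitatory input and I the total inhibitory input.
    dx = -delta * x + excitation - x * inhibition
    return max(0.0, x + dt * dx)   # keep the activity nonnegative

# With no excitation the activity decays toward zero, as noted in the text.
x = 0.5
for _ in range(50):
    x = shunting_step(x, excitation=0.0, inhibition=0.2)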

Fig. 3. The initial neural-based abduction algorithm—Steps.

Suppose that we have designed the network for the given independent abduction problem, and let SO be a simple observation for it. Without loss of generality, assume also that a real number in [0, 1] is attached to each observed fact in SO—such a real can be interpreted as a certainty degree of the user's observation. Figs. 3 and 4 summarize our initial competition process [3] to generate a best explanation with respect to SO.

The algorithm given in Fig. 3 consists of two major stages. The first stage includes the first two steps. It aims at initializing the activities of the hypothesis and observable cells. The initialization³ of these activities proceeds as follows:

x_j = the certainty degree attached to the observed fact, if it belongs to SO; 0, otherwise    (2)

(3)

Formula (2) in the first step assumes that the certainty values are positive and normalized (i.e., bounded above). Formula (3) is derived from (1) by assuming that, when the observable cells

³Recall that excitatory and inhibitory connections have already been initialized during the design step.


Fig. 4. The initial neural-based abduction algorithm—Notations.

turn on, the hypothesis cells enter immediately into an equilibrium state before the competition process begins. This means dx_i/dt = 0 while x_j is held constant for all j.

The second stage (i.e., steps 3–6) engages the network in a competition process. The third step aims at updating the activities of the hypothesis cells as follows:

(4)

In fact, this step consists of resolving the system of nonlinear differential equations in its equilibrium state, assuming that the states of the cells vary smoothly from one iteration to another and that the output of each cell is approximately constant within a short time interval. An important property is that the activity of a hypothesis cell increases with its total excitatory input (depending on the excitatory connection weights) and decreases with its total inhibitory input (depending to a great extent on the inhibitory connection weights). Furthermore, when the inhibitory input vanishes, the equilibrium activity depends only on the excitatory input and the decay rate.

The fourth step updates the connection weights to reflect the current variations of the network's belief in the hypotheses' plausibility as follows:

(5)

(6)

(7)

where the Hebbian constant is small, i.e., 0.1. Formula (5), together with the weight normalization (6), consists of a redistribution of belief strengths. Hypotheses gaining greater belief are normally those which better resist the competition. On the other hand, the normalization ensures that an exclusive hypothesis always gains the maximal belief, which is 1, no matter how strong the other competing cells are.

such that

(8)

where the constant is small, i.e., 0.1. Formula (8) is applied only to connections linking two hypotheses that have at least one common observable fact. It is designed to increase the inhibitory connection weight between two hypotheses when their activities evolve in opposite directions. Hence, if one is gaining support while the other is losing, mutual inhibition is strengthened. Conversely, if both evolve in the same direction, mutual inhibition is weakened.
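
The bodies of (5)-(8) are not legible in this copy; the Python sketch below illustrates the kind of update they describe—excitatory weights reinforced in a Hebbian manner and then normalized per observable fact (so an exclusive hypothesis ends up with weight 1 for that fact), and inhibitory weights strengthened only when the two connected activities evolve in opposite directions. The exact rules in the paper may differ; eta and mu stand in for the small 0.1 constants mentioned in the text.

def update_excitatory(w_ex, x_h, x_o, eta=0.1):
    # w_ex[(i, j)]: excitatory weight between hypothesis i and observable fact j.
    for (i, j) in w_ex:
        w_ex[(i, j)] += eta * x_h[i] * x_o[j]          # Hebbian reinforcement
    for j in {j for (_, j) in w_ex}:                   # normalize per fact j
        total = sum(w for (i2, j2), w in w_ex.items() if j2 == j)
        if total > 0:
            for (i2, j2) in list(w_ex):
                if j2 == j:
                    w_ex[(i2, j2)] /= total

def update_inhibitory(w_in, dx, mu=0.1):
    # w_in[(i, k)]: inhibition between hypotheses i and k sharing a fact.
    # Opposite activity variations (dx[i]*dx[k] < 0) strengthen the inhibition,
    # variations in the same direction weaken it.
    for (i, k) in w_in:
        w_in[(i, k)] = max(0.0, w_in[(i, k)] - mu * dx[i] * dx[k])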

The fifth step is simply for pruning hypotheses. It eliminates causal relationships which are no longer significant for the subsequent reasoning. Ultimately, we have the following rule:

(9)

The sixth step concerns the termination test. The reasoning process is terminated when each of the remaining elementary hypotheses is temporarily exclusive. An elementary hypothesis is temporarily exclusive if there is an observed fact such that it is the only remaining elementary hypothesis covering that fact. Exclusiveness is thus related to the temporary belief of the network. The goal of updating the network is to eliminate those elementary hypotheses which are not considered to be essential and to make every remaining elementary hypothesis exclusive for at least one observed fact.
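
A hedged sketch of steps 5 and 6 under the dictionary-based representation used above: pruning drops excitatory connections whose weight has fallen below a threshold (the actual rule (9) is not recoverable here, so the threshold is an assumption), and the termination test checks that every remaining hypothesis is temporarily exclusive, i.e., the sole remaining cover of some observed fact.

def prune(w_ex, threshold=0.05):
    # Step 5 (sketch): discard causal links that are no longer significant.
    return {edge: w for edge, w in w_ex.items() if w >= threshold}

def temporarily_exclusive(i, w_ex, observed):
    # Hypothesis i is exclusive if some observed fact j is covered only by i.
    return any(all(i2 == i for (i2, j2) in w_ex if j2 == j)
               for j in observed if (i, j) in w_ex)

def terminated(active_hypotheses, w_ex, observed):
    # Step 6: stop when every remaining hypothesis is temporarily exclusive.
    return all(temporarily_exclusive(i, w_ex, observed) for i in active_hypotheses)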

C. Illustrative Example

For the sake of illustration, let us consider the independent abduction problem introduced in Section II. Suppose that we have already designed the two-layer neural network corresponding to it. As before, let us take SO as a simple observation and assume that the user provides certainty degrees for the observed facts.

The first stage of algorithm GOBE is merely for initialization and is obvious. Let us then focus on the second stage, which engages the competition process. Figs. 5 and 6 summarize the dynamics of the network through the evolution of cell activities and connection weights.

The initial activities of the hypothesis cells are 0.50 for carburetor and 0.1, 0.46, and 0.3 for the three other hypotheses. Due to the competition and mutual inhibition, two hypotheses are quickly eliminated, as depicted in the evolution of the activities in Fig. 5 as well as in the evolution of the corresponding excitation weights shown in Fig. 6.

The two remaining hypotheses are the winners of the competition process. Fig. 6 shows that for a surviving


Fig. 5. Activities evolution.

Fig. 6. Weights evolution.

hypothesis there is at least one corresponding weight that gradually converges to 1. This means that the hypothesis becomes temporarily exclusive. Here, one winning hypothesis is exclusive to (ogl) and the other to (oae)—hence, the corresponding weights converge to 1. At the end of the competition, the algorithm collects the winning hypotheses; this set is the explanation with respect to SO.

IV. UNIFIED MODEL FOR NEURAL-BASED ABDUCTION

We propose several extensions to the initial model to deal with complex abduction problems, namely the monotonic, open, and incompatibility problems. Fundamental to these extensions is that they are monotonic; once one extension is done, it will never be undone by another extension. Consequently, the new unified model will solve not only abduction problems of the monotonic, open, or incompatibility classes but also any abduction problem resulting from a combination of these classes. For the sake of readability, we present the unified model progressively, in a modular way. Moreover, we adopt an intuitive viewpoint and use illustrative examples systematically. Analytical materials are postponed to Section V.

A. Monotonic Class

Clearly, abduction problems of this class cannot be solved by using the initial two-layer network architecture of Fig. 2. In fact, solving them requires the introduction of an intermediate layer. Without loss of generality, let us consider a causal relationship of the form Sh →w o with #(Sh) ≥ 2. Fig. 7 depicts the new three-layer architecture introducing the intermediate layer.

An intermediate cell is associated with each causal relationship having this form. The intermediate cell models the additive interaction between the subset of elementary hypotheses Sh and the observable fact o. The role of the intermediate cell is mainly information flow; that is, it acts as an information transmitter between the hypothesis layer and the observable layer.

Notice that, for the observable fact, the intermediate cell plays the role of one macro-hypothesis representing several elementary hypotheses. Hence the connection between the intermediate cell and the observable cell is similar to a regular excitatory connection in the initial model. Moreover, it is initialized by w (i.e., the weight given in R) and straightforwardly updated by (6).

Let us now discuss the additional connections between the hypothesis cells and the intermediate cell. These connections represent in fact a mere redistribution of the weight w, whose total must always be preserved. It follows that

(10)

where the individual weights are nonnegative. Intuitively, each such weight measures the individual contribution of one hypothesis among Sh to causing the observable fact.

Finally, let us turn to the activities of the cells. According to (4), the activity of a hypothesis cell is driven by its excitatory and inhibitory inputs. Taking into account the additive interaction between hypotheses, the excitatory term now becomes

(11)

The first term represents the direct excitatory input to the hypothesis cell from the observable layer. The second term represents its indirect excitatory input, also from the observable layer but via the intermediate layer.

Note that the additive interaction between hypotheses does not affect the inhibitory term. Indeed, recall that an inhibitory connection between two hypothesis cells exists iff they share at least one common observable fact. Summing up, we obtain

(12)

Since the intermediate cell stands for a macro-hypothesis representing the conjunction of the hypotheses in Sh, it follows that its activity must be governed by the activities of those hypothesis cells. This also means that Sh can explain the observable fact only if each of its hypothesis cells is active. Hence, we obtain

(13)
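
Formulas (11)-(13) are not legible in this copy. The sketch below shows one way the three-layer excitation could be computed: the intermediate cell's activity is taken as the minimum of its member hypotheses' activities, which is one simple realization of "each hypothesis cell must be active"—the paper's formula (13) may combine the activities differently—and the direct and indirect contributions are simply summed.

def intermediate_activity(members, x_h):
    # The macro-hypothesis is only as active as its least active member (assumption).
    return min(x_h[i] for i in members)

def excitation_monotonic(i, x_o, x_int, w_direct, w_indirect):
    # Direct excitation from observable cells plus indirect excitation relayed
    # through the intermediate cells that hypothesis i belongs to.
    direct = sum(w * x_o[j] for (i2, j), w in w_direct.items() if i2 == i)
    indirect = sum(w * x_int[l] for (i2, l), w in w_indirect.items() if i2 == i)
    return direct + indirect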


Fig. 7. The new three-layer architecture.

Fig. 8. Activities evolution with respect to APMN.

For the sake of illustration, let us consider APMN, the monotonic abduction problem introduced in Section II. As before, we take SO as our simple observation and suppose that certainty degrees are attached to the observed facts. Figs. 8 and 9 summarize the competition process through the evolution of cell activities and connection weights.

According to the graph of Fig. 8, two hypotheses win the competition. Fig. 9 shows that the strong activity of one cell and the additive interaction between cells have greatly contributed to winning the competition.

B. Open Class

Fig. 9. Weights evolution with respect to APMN.

According to Definitions 6 and 7, an abduction problem of the open class has two distinctive features. First, it requires a multiple observation (e.g., MO = ⟨O+, O−, O?⟩) specifying not only the observable facts which are present (i.e., O+), but also those which are absent (i.e., O−) and unknown (i.e., O?). Second, an explanation for a problem of this class must not only include as few as possible hypotheses covering the observable facts in O+, but must also cover as few as possible observable facts in O−, i.e., EXPLAIN(E) ∩ O− has to be minimal.

The first feature leads to a new initialization scheme for the activities of the observable cells. Ultimately, we adapt (2) of the initial model as follows:

x_j = the certainty degree attached to the fact, if it belongs to O+ ∪ O−; 0, otherwise    (14)

where we assume this time that the certainty degree measures the user's belief that the corresponding fact is present or absent.

The second feature concerns the activities of the hypothesis cells and directly affects the competition process. The activity of a hypothesis cell in (4) depends on a total inhibitory input and a total excitatory input. The inhibitory input comes from all the other hypothesis cells which compete with the cell. On the other hand, the excitatory input is calculated from the activities of the cells representing observable facts. Let us then split the excitatory term in order to take into account both categories of observable facts in a given multiple observation. We have

(15)


Fig. 10. Activities evolution with respect to APOP.

Fig. 11. Weights evolution with respect to APOP.

where and.

Clearly, the contribution of present facts should play the role of excitation for the activity of the hypothesis cell, while the contribution of absent facts has to inhibit it. Nonetheless, we should control this inhibition effect to prevent the activity from becoming negative, namely when the inhibition exceeds the excitation. Let us rewrite (15) by introducing a control factor. We have

(16)

where . Summing up, we obtain

(17)

Observe that the control factor keeps the activity nonnegative. In particular, if the inhibition from absent facts is null, the excitation is left unchanged; conversely, if it is large, the excitation is almost reduced to zero. Therefore, the inhibition effect resulting from absent facts is modeled through a division instead of a subtraction in (15).
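
The control factor in (16)-(17) is not legible in this copy. The sketch below uses one form with the stated behavior—the excitation gathered from present facts is divided by a term that grows with the inhibition from absent facts, so it stays nonnegative, is unchanged when that inhibition is zero, and is driven toward zero when it is large. The exact expression is an assumption.

def excitation_open(i, x_present, x_absent, w_ex):
    # Excitation from present facts, attenuated by absent facts via a division.
    e_plus = sum(w * x_present.get(j, 0.0)
                 for (i2, j), w in w_ex.items() if i2 == i)
    e_minus = sum(w * x_absent.get(j, 0.0)
                  for (i2, j), w in w_ex.items() if i2 == i)
    return e_plus / (1.0 + e_minus)
# e_minus == 0 leaves e_plus unchanged; a large e_minus drives the result to zero.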

As an illustrative example, we take APOP again, together with the multiple observation MO introduced in Section II and its certainty degrees. Figs. 10 and 11 summarize the competition process through the evolution of cell activities and connection weights.

Several explanations are possible for APOP. The absence of one observable fact has strongly penalized some of them, in such a way that the winning explanation has won the competition easily, in less than 20 iterations. If that fact had been considered unknown, the same explanation could still have won, but the competition would have taken many more iterations. This confirms that an unknown fact has little impact on the activity of hypothesis cells compared to a present or absent fact.

C. Incompatibility Class

Without loss of generality, consider h_i and h_k, a pair of incompatible hypotheses. Clearly, h_i and h_k are by definition competitors of each other. That is why we create two "incompatibility" connections, one linking h_i to h_k and one linking h_k to h_i; the role of each is to inhibit the activity of the cell it points to. Let us rewrite (1) (i.e., the initial shunting competition equation) to take these new connections into account. We have

(18)

Assuming that the activity of the hypothesis cells varies smoothly within a short interval, we obtain

(19)

where , ,and .

Let us now focus on the initialization and updating of the incompatibility connections. The initialization is straightforward: given two incompatible elementary hypotheses h_i and h_k, both incompatibility connections are initialized by the weight given in R. Updating is somewhat more subtle, as shown below. At first sight, inhibitory and incompatibility connections are similar, since both aim at weakening competitors' activities. More precisely, we have

(20)

(21)

where the indicator function equals 1 if its condition holds and 0 otherwise. With respect to (8), used for updating inhibitory connections, incompatibility connections are updated in an asymmetric way thanks to this function. Ultimately, we ensure that activity inhibition affects one and only one cell, namely the cell whose activity is already weakened.

Formulas (20) and (21) work well except that they may weaken the activity of an exclusive hypothesis. By definition, an exclusive hypothesis must remain and ultimately win the competition, since it is the only one covering an observed fact—see the formal definition in Fig. 4. Taking exclusiveness into


Fig. 12. Activities evolution with respect to APIC.

account, we rewrite (20) and (21) as follows:

(22)

(23)

where the new function equals 1 if EXCLUSIVE holds and 0 otherwise. This new function ensures that the activities of incompatible hypotheses cannot be greatly weakened if they are exclusive. Ultimately, they remain in the competition and necessarily survive the competition process.
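
The bodies of (20)-(23) are not legible in this copy; the sketch below captures the behavior they describe—the incompatibility connection toward a hypothesis grows only while that hypothesis is already losing ground relative to its incompatible partner, and the growth is suppressed when the target hypothesis is exclusive. The gating factors stand in for the indicator functions of the text, and the exact update formula is an assumption.

def update_incompatibility(w_ic, x, dx, exclusive, nu=0.1):
    # w_ic[(i, k)]: incompatibility connection used (here, by assumption) to
    # inhibit hypothesis k on behalf of hypothesis i (i and k cannot coexist).
    for (i, k) in w_ic:
        losing = 1.0 if dx[k] < dx[i] else 0.0     # asymmetric gate: k is weakening
        shielded = 0.0 if exclusive(k) else 1.0    # never pile onto an exclusive cell
        w_ic[(i, k)] += nu * losing * shielded * x[i]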

As an illustration, consider APIC together with the multiple observation MO introduced in Section II, with default certainty degrees. Figs. 12–14 analyze the competition process through the evolution of cell activities and connection weights.

It is easy to understand why the two winning hypotheses win the competition. As a matter of fact, one of them is exclusive with respect to an observed fact, so its survival is guaranteed. This, in turn, contributes to the quick elimination of the hypothesis that is incompatible with it. The other winner wins the competition basically because it covers two observed present facts without covering absent ones. As for the other losing hypotheses, each of them covers one observed fact but none is exclusive; furthermore, each competes with other cells to explain its fact, and incompatibility relationships penalize them as well. All these factors play in favor of the winners at the expense of the losers. As depicted in Figs. 12 and 13, the winning hypotheses also compete with each other to explain a common fact. This competition becomes more dominant when the other competitors become weaker. However, neither of them succeeds in eliminating the other, since both are necessary to explain all of the observed facts.

The evolution of the weights of the incompatibility connections deserves some remarks. Observe in Fig. 14 that some of these weights remain unchanged, whereas others increase because the hypotheses they act on behalf of are considered more relevant than the ones they inhibit. Taking the evolution of cell activities into account, the weaker hypotheses receive stronger and stronger inhibition and are eliminated from the competition.

Fig. 13. Weights evolution with respect to APIC.

Fig. 14. Weights evolution with respect to APIC.

Conversely, the weak inhibition received by the surviving hypotheses helps them to win the competition.

V. SYNTHESIS AND THEORETICAL ANALYSIS

Two basic methods could be used to construct a neural model for abduction. The conventional method, adopted by the other neural abduction models, consists of: 1) proposing a target/energy function modeling the adopted criteria; 2) designing an architecture of the network to optimize the target function. The second method, adopted here, consists of: 1) designing the architecture of the network for the class of abduction problems; 2) analyzing the network dynamics and characterizing the generated explanations.

The main advantage of the conventional method is to make the target function available at an earlier stage. Unfortunately, such a function may, if not properly designed, impose severe constraints on the network design and the optimization mechanisms to be used. In turn, the second method offers the possibility of building various mechanisms into the network architecture to take into account the characteristics of the diagnostic problems. Yet, such an advantage depends on the availability of an energy function to analyze the network dynamics.

In what follows, we present a formal analysis of the dynamics of the proposed network model. We show that the network algorithm performs a gradient-descent type of optimization process on an energy function. Furthermore, this energy function bears a significant relationship to the characteristics of the explanation generated by the network.


Fig. 15. UNIFY: the unifying abduction algorithm.

A. Unifying Algorithm

Prior to the formal analysis of our neural model, let us summarize all extensions proposed in Section IV.

Fig. 15 depicts the UNIFY algorithm, which takes five parameters: the three subsets O+, O−, and O? included in a given multiple observation (for simple observations, O− and O? are empty); a vector denoting the certainty degrees of the elements in the observation; and, finally, the network of the abduction problem at hand.⁴
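
Putting the sketches of Sections III and IV together, a skeleton of a competition loop with UNIFY's five inputs might look as follows. The stage structure only loosely mirrors Fig. 15, and the competition primitives are injected as callables (e.g., the hedged sketches given earlier), since the paper's exact formulas are not reproduced here.

def unify(present, absent, unknown, certainty, network,
          step, update_weights, is_exclusive, max_iters=200):
    # present/absent/unknown: the three subsets of a multiple observation.
    # certainty: user certainty degree for each fact known to be present or absent.
    # network: container of cells and excitatory/inhibitory/incompatibility weights.
    x_o = {j: certainty.get(j, 0.0) for j in (present | absent)}  # observable activities
    x_h = {i: 0.5 for i in network["hypotheses"]}                 # hypothesis activities
    for _ in range(max_iters):
        x_h = step(x_h, x_o, network)          # activity update (competition)
        update_weights(x_h, x_o, network)      # weight updates and pruning
        survivors = {i for i, xi in x_h.items() if xi > 0.0}
        if all(is_exclusive(i, network, present) for i in survivors):
            break                              # termination: all survivors exclusive
    return survivors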

B. Synthesizing Example

Here is a real case drawn from the domain of legal reasoning. Cara Knott was killed on Dec. 27, 1986, and Craig Peyer, a veteran California Highway Patrolman, was accused. Twenty-

⁴In addition to the notations and conventions used in Section IV and Fig. 4, we adopt the following ones. Subscripts i and k index quantities attached to hypotheses, e.g., x_i, W^ex_{i,k}, W^ic_{k,i}; subscript j indexes quantities attached to an observable fact, e.g., x_j, W^ex_{i,j}; subscript l indexes quantities attached to an intermediate cell, e.g., x_l, W^ex_{k,l}; AX_i = Σ_l W^ex_{i,l} x_l; O_{i,k} = {o_j such that W^ex_{i,j} W^ex_{k,j} ≠ 0}; and #(set) denotes set cardinality.

Fig. 16. Synthesizing example: a monotonic, open, and incompatibility abduction problem.

two women, young and attractive like the victim, testified that they had been pulled over by Peyer for extended personal conversations near the stretch of road where Knott's body was found. On Dec. 28, 1988, the San Diego Union and San Diego Tribune covered the trial, which ended at San Diego on February 27, 1988. Based on these extensive news reports, this case was formulated as an abduction problem in [41]. The following formalization in Fig. 16 is adapted from [41]; the legend is depicted in Fig. 17.

The purpose of this example is twofold. First, we show how our model is able to deal effectively with complex abduction problems, namely open, monotonic, and incompatibility problems. Second, we proceed to a sensitivity analysis of our algorithm by progressively modifying the input data of the synthesizing example. Of course, the sensitivity analysis could be enhanced by comparing the behavior of our algorithm with that of other proposals. Unfortunately, this is not yet possible, since the literature does not offer abduction algorithms that deal with complex problems—more details are given in Section VI.

Figs. 18–20 summarize the evolution of the activity of the hypothesis cells in three experiments. These experiments were selected to underline the role and the influence of each observation on the competition process. Starting from the initial input data (i.e., Fig. 18 and OBS1 in Table I), our UNIFY algorithm concludes that several hypotheses survive the competition. Taking into account the final activity of each of these surviving hypotheses


Fig. 17. Synthesizing example legend.

depicted in Fig. 18, the generated explanation is: it is most probable that Peyer is guilty; however, it is not excluded that Peyer is innocent. Indeed, there is a thin difference between the final activity of the guiltiness hypothesis, which is 0.98, and that of the innocence hypothesis, which is 0.77. Nonetheless, this difference between final activities depends on the value of

Fig. 18. Experimenting set: OBS1 in Table I.

Fig. 19. Experimenting set: OBS2 in Table I.

Fig. 20. Experimenting set: OBS3 in Table I.

the Hebbian constants, which were set to 0.1 in our experiments. More precisely, the inhibition effect resulting from incompatibility relationships is under the control of these constants.

The innocence of Peyer becomes much clearer when considering OBS2 of Fig. 19. In this observation, two facts were dropped and three others were considered unknown. That is why the UNIFY algorithm


TABLE I
EXPERIMENTING SET

generates the final explanation depicted in Fig. 19. Hence, it is most probable that Peyer is innocent; however, some doubts about his guilt still exist.

One could ask why the final explanation includes both innocence and guilt hypotheses. The answer is simply that UNIFY tries to explain all of the observed facts in O+. Ultimately, each hypothesis included in the final explanation contributes to explaining at least one fact in O+. OBS3 of Fig. 20 gives us a simple illustration: by considering two additional facts as unknown, the competition eliminates the hypotheses which are no longer required.

C. Theoretical Analysis

Let us now analyze the dynamics of the network, where we assume that the connection weights vary slowly from one iteration to the next. Note that this assumption actually holds, since both constants involved are very small and the activities of the hypothesis cells vary smoothly except for the first iteration.⁵ For the sake of readability, we split our analysis into two steps. First, we consider the classes of monotonic and open abduction problems. Then we generalize our analysis to the classes of monotonic and incompatibility abduction problems.

Considering monotonic and open abduction problems, the activity of a hypothesis cell is calculated as follows:

(24)

⁵The large variations of cell activity at the first iteration are due to weight normalization and activity initialization.

where t denotes the t-th iteration. Let us now focus on the variation of activity between two succeeding iterations; we have

(25)

Since , we obtain

(26)

where the following quantity is defined as:

(27)

Given the preceding conditions, it follows that⁶

(28)

Taking into account this approximation by (28), we rewrite (26) as follows:

(29)

Formula (29) points out that the variation of activity is influenced by three factors. Since the dynamics of the system mainly depend on the evolution of the active hypothesis cells, let us focus our analysis on the determinant factor among them. By definition, we have

(30)

where is defined as follows:

(31)

⁶For the sake of readability, we will omit the parameter (t) whenever there is no ambiguity.


Now, observe that and . Consequently, we have

(32)

Using the facts that and , we rewrite (32) to obtain

(33)

Since when , we can further simplify (33) as follows:

(34)

Summing up, we have

(35)

Now let us define the overall (weighted-sum) explanation that the active hypothesis cells provide to the observed fact associated with a given observable cell. The corresponding ratio then measures the strength of a hypothesis cell with respect to this explanation, i.e., relative to the weighted sum. Ultimately, if this ratio is high, then the cell is likely a better hypothesis than most others to explain the observed cell; conversely, if it is low, then there should exist another cell which could better explain the observed cell. Based on these remarks and substituting into (35), we obtain

(36)

which is actually a combination trading off the evidence for and against the hypothesis cell.

Keeping in mind the definition in (33), let us define the following energy function:

(37)

Focusing on , we obtain

(38)

Hence, if the hypothesis is not exclusive, the evolution of its activity follows the gradient descent of the energy function. Let us now generalize this energy function, which is defined on all competing and nonexclusive hypotheses. Observe that the relevant term converges to 0 as the corresponding activity tends to 1. Thus, if we let that term vanish whenever the hypothesis is exclusive, the energy function defined by (37) becomes

(39)

The function reaches its minimum (i.e., zero) when every surviving hypothesis is exclusive for an observable fact in the given observation. The algorithm UNIFY given in Fig. 15 effectively allows the network to reach this state of exclusiveness. However, it is worth noting that exclusiveness reflects a temporary belief (or preference) of the network. In fact, at the end of the competition it does not mean that every surviving hypothesis is actually the only remaining hypothesis covering some observable facts in the given observation. Ultimately, a hypothesis can survive the competition only if it is actually exclusive or it is one of those which explain as many as possible of the observable facts and may explain as few as possible of the observable facts known to be absent.

Consequently, the energy function not only allows us to analyze the convergence of the algorithm but also to evaluate the plausibility of the generated explanation. To this end, we should replace all temporary weights with their initial normalized values. The value of the function is then a weighted sum of the squared differences between the strength of each surviving hypothesis and the strength of an overall explanation of each observable fact. The value of the function ranges over an interval bounded by a positive value; however, a simple transformation can map this range so that the maximum value (i.e., 1) corresponds to the minimum of the initial energy function (i.e., 0).
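
One way to read this plausibility evaluation in code: recompute the energy with the initial normalized weights and squash it into a bounded score where 1 corresponds to zero energy. Both the quadratic energy below and the 1/(1 + F) squashing are assumptions standing in for the formulas (37)-(39), which are not legible in this copy.

def plausibility(survivors, observed, w_init, x_h):
    # For each observed fact, compare the strength each surviving hypothesis
    # lends to it against the overall (weighted-sum) explanation provided by
    # all survivors, and penalize the squared gap.
    energy = 0.0
    for j in observed:
        overall = sum(w_init.get((i, j), 0.0) * x_h[i] for i in survivors)
        for i in survivors:
            strength = w_init.get((i, j), 0.0) * x_h[i]
            energy += (overall - strength) ** 2
    return 1.0 / (1.0 + energy)   # maps [0, inf) into (0, 1]; 1 <=> zero energy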

Let us now address the class of incompatibility abduction problems. Taking the following additional assumption and proceeding in a similar way as before, we obtain

(40)

where, , and. Let us rewrite (40) as follows:

(41)

where the new term accounts for the incompatibility connections. Considering (41), we obtain the following energy function:

(42)

which is exactly the same energy function given by (39) for the class of independent and open abduction problems. In fact, the distinctive feature of incompatibility abduction problems is captured through the new term. A detailed analysis reveals that the term


in the denominator penalizes mainly weak incompatible hypotheses. The experimental analysis in Section VI-C also confirms that incompatible hypotheses may never survive if they are not exclusive.

The extension of the above energy function to the class of monotonic abduction problems requires more computational effort. Let us first use the following approximation:

(43)

We then focus on the excitatory input, which splits into a direct part generated from all cells of the layer of observable facts and an indirect part generated from all cells of the intermediate layer. Observe that in the absence of intermediate cells, the indirect part vanishes. Using (36), the direct part can be approximated as follows:

(44)

Similarly, the indirect part can be derived as

(45)

Summing up, we obtain the activity variation and the corresponding energy function as follows:

(46)

(47)

This new energy function differs from the energy function defined in (36) and (42) by the introduction of a new second term. Clearly, this term models the additive interaction of hypotheses in the class of monotonic abduction problems. Moreover, the energy functions defined in (36) and thereby in (42) are special cases of the new energy function defined in (47).

VI. COMPARISON AND CONNECTION TO OTHER WORK

This section primarily aims at a functional comparison pointing out the capabilities of different proposals. Mechanizing abduction has been approached in two frameworks. The first is the logical/probabilistic framework, including a variety of proposals, e.g., the Assumption Truth Maintenance System [10], Belief Revision Networks [29], and Probabilistic Networks [24], [35], [36]. The second framework is the connectionist/neural framework and includes very few proposals, namely [20], [34].

A. Logical/Probabilistic Framework

Assumption-Based Truth Maintenance (ATMS) [11] is built within a logical framework and uses propositional calculus. From a functional view, an ATMS⁷ can be viewed as a builder of explanations. While monotonic and incompatibility abduction problems could be encoded as required by an ATMS, there is no simple way to encode open abduction problems without radically altering the very mechanism of an ATMS. Moreover, an ATMS handles only logical formulas and does not accept fuzzy relationships. As a matter of fact, an ATMS cannot generate an explanation for our synthesizing example. Indeed, when considering pure logical relationships instead of fuzzy relationships, this example is already inconsistent for an ATMS—Peyer cannot be both guilty and innocent.

Most importantly, an ATMS aims first at generating all consistent explanations, which raises a potential combinatorial explosion. Several authors (e.g., [12], [14], [15], [39]) proposed to focus an ATMS on the generation of the most probable explanations. Typically, this consists in attaching a probability to each hypothesis in such a way that the ATMS considers the most probable hypotheses, e.g., Sherlock, HTMS [22]. Unfortunately, the combinatorial explosion cannot be avoided, namely when all hypotheses have similar⁸ probabilities [13].

Table II summarizes the capabilities of an ATMS versus those of the proposed algorithm UNIFY. It is worth noting that Table II only underlines the capabilities of the different proposals to handle abduction problems. A full comparison of proposals requires a variety of criteria and could be hazardous, e.g., how does one compare the efficiency of an ATMS versus UNIFY, since the former generates several explanations while the latter generates only one?

Within a probabilistic framework, Bayesian (or Bayes) networks such as [24], [29], [30], [36] are the most familiar. We first note that incompatibility abduction problems are not addressed in this framework. Interesting, however, is the model of Pearl [29], which is able to address open abduction problems. In fact, there are some similarities between our proposal and Pearl's proposal.⁹ First of all, both proposals generate only best (parsimonious and plausible) explanations. Both proposals can address independent and open abduction problems. However, incompatibility abduction problems cannot be addressed by Pearl's network. From a macroscopic viewpoint, the positive influences and the mechanism of belief distribution in

⁷We use "an ATMS" instead of "the ATMS" to make our analysis general. In fact, after 1986 (the publication date of [11]), a variety of ATMS-like systems have been proposed [16], [22]. Our analysis, however, holds, since these extensions did not change the intrinsic function of an ATMS.

⁸While assigning different probabilities to causal relationships between hypotheses and observables seems to be the case, it is rare that hypotheses have different probabilities.

⁹Note that Bayes networks are sometimes called causal nets, Bayesian belief networks, or probabilistic causal networks [23].


TABLE II
CAPABILITIES OF ATMS/BAYES NETWORK/UNIFY

Pearl's proposal play somewhat the same role as our excitatory connections and updating rules (i.e., the formulas of Stage 4 in Fig. 15). On the other hand, there are no "negative" influences in Pearl's proposal which could play the role of the inhibitory/incompatibility connections and updating rules (i.e., the formulas of Stages 6 and 7 in Fig. 15) in our proposal. This limits the ability of Pearl's model and of Bayes nets, since the inhibitory/incompatibility connections in our model (i.e., "negative" influences) are crucial to sustain and handle the competition between incompatible hypotheses. Is there a way to introduce "negative" influences in Bayes networks without altering the fundamentals of the probabilistic framework in which they are built?

The above analysis shows that, with respect to abduction and from a functional viewpoint, our proposal is the most appropriate, since it can generate best explanations for monotonic, open, and/or incompatibility abduction problems. This does not mean that our proposal beats the others, but it offers several capabilities to address complex abduction problems (see Table II). In this vein, let us mention that Bayes networks and ATMS could be more appropriate in several cases. For example, an ATMS is appropriate namely when we are interested in generating all explanations; neither our proposal nor Bayes networks can do this. Causal networks spanning several layers can be addressed by an ATMS as well as by Bayes networks. Although the extension of our proposal should be straightforward, this paper focused on two-layer causal networks.

B. Connectionist/Neural Framework

The proposal of Peng and Reggia [34] is based on a probabilistic model [33] where the maximal plausibility and minimal cardinality criteria are cast into a single function. The value of this function is proportional to the posterior probability of the set of explanations (called diagnoses in [34]) for the observed facts (called effects in [34]). This target function10 is to be maximized in order to generate the best explanation (diagnosis in [34]). In Peng and Reggia’s model, the target function is decomposed into different components, each of which forms a local optimality criterion for a single cause. Each component is associated with a unit in the connectionist architecture. The activation rules for these units are derived from those local criteria. In this way, the global optimization problem is decomposed into a set of concurrent local problems for all elementary hypotheses (diseases in [34]). The design of this connectionist model has greatly benefited from the theoretical framework put forward by the same authors in [33]. From a formal point of view, the model captures all major aspects of the framework. Furthermore, when the size of the abduction problem is small, the model does calculate the best, or one of the best, explanations (diagnoses in [34]) according to the framework. In another paper [42], the convergence problem in dealing with large-size abduction problems has been approached by using a pseudo simulated-annealing process. The approach has been successful to some extent, although it is not clear how to choose the initial temperature and what the best way to decrease it is.

10 In fact, such a function plays the role of an energy function, where the minimization task of the energy function is reduced to the maximization task of the target function.
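For readers unfamiliar with this kind of scheme, the sketch below shows the general shape of a pseudo simulated-annealing activation loop over cause units. It is a generic sketch only: the sigmoid rule, the “support” and “rivalry” terms, and the parameter names are illustrative assumptions and do not reproduce the formulas of [33], [34], or [42]; only the schedule shape (an initial temperature with multiplicative decay) follows the idea described above.

    import numpy as np

    def annealed_activation(W, observed, t0=2.0, decay=0.95, iters=80, seed=0):
        # W:        hypothetical causal strengths, shape (n_causes, n_effects)
        # observed: 0/1 vector over the effects
        # t0/decay: a generic annealing schedule (the concrete rule of [42] differs)
        rng = np.random.default_rng(seed)
        act = rng.uniform(0.0, 1.0, W.shape[0])   # random initial cause activations
        temp = t0
        for _ in range(iters):
            support = W @ observed                          # evidence for each cause
            rivalry = W @ (W.T @ act) - (W * W).sum(axis=1) * act  # overlap with rivals
            act = 1.0 / (1.0 + np.exp(-(support - rivalry) / temp))
            temp *= decay                                   # cool the schedule down
        return act

A high initial temperature keeps the sigmoid shallow, so early updates explore many candidate covers; as the temperature decays, the units commit to the causes with the strongest net support, which is the effect the annealing process in [42] is meant to achieve.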

To our knowledge, the proposal of Reggia et al. is the first one aiming at mechanizing abductive reasoning within a neural/connectionist framework. Nonetheless, it is unclear whether this proposal can be extended to account for other types of diagnostic problems (e.g., open or incompatibility diagnostic problems), since it is strictly dependent on a probabilistic framework allowing little flexibility. In the same way, the adaptation of this model to situations where causal relationships cannot be completely specified in terms of probabilities does not seem straightforward.

In [20], Goel et al. suggest two proposals. The first one is based on an energy function that seems capable of accounting for the maximal plausibility and minimal cardinality criteria. The main problem, as pointed out by Goel et al., is that such an energy function imposes high-order neural networks. The second proposal of Goel et al. [20] consists of a functional architecture emphasizing a modular representation of problem constraints and functionality. Of course, this representation may in principle result in a reduced complexity of the local neural architectures. However, the mechanisms within each module, as well as the mechanisms for interactions between modules, remain to be specified. The strong feature of the work of Goel et al. consists of providing several useful guidelines to model and to mechanize abductive reasoning within a neural framework. It is difficult, however, to assess the utility of these proposals, namely because of their lack of precision.

In [41], a new computational theory of explanatory coherence is proposed. In his paper, Thagard provides a neural-based algorithm to solve a variety of problems formulated in this theory. At first sight, there are several similarities between our proposal and this model. In fact, both proposals use excitatory/inhibitory connections and are based on competition-based reasoning. Nonetheless, there are several fundamental differences between the two models.



First, there is a difference from the architectural viewpoint. We model the additive interaction between hypotheses explicitly by the intermediate layer, while this remains implicit in Thagard’s architecture. Second, there is a fundamental difference from the definitional viewpoint. Thagard’s explanation concept is defined within a theory of explanatory coherence, while we use abductive theory to define what an explanation is. Hence, the explanations generated by Thagard’s model are based on hypothesis coherence, while the explanations of our model are based on two conventional criteria of abduction, i.e., minimal hypotheses and maximal plausibility.

These differences prevent us from making a fair theoretical or experimental comparison between the two models. Ultimately, it is easy to build several counter-examples leading Thagard’s model to generate a nonminimal explanation. Conversely, our model could generate explanations which do not take coherence criteria into account.

C. Experimental Results

The data used in the following experiments are the same as in [42]. There are 26 causes (elementary hypotheses in our terminology), 56 associated effects (observable facts in our terminology), and about 400 causal connections.

For the sake of readability, we will only present some numerical results from a comparison between our model and the connectionist model of Peng and Reggia. This comparison aims at demonstrating the major differences between the two models rather than showing which one is better than the other. Indeed, it is unfair to do so since the two models have different objective functions and use different updating rules.

The connectionist model of Peng and Reggia [34] has been implemented according to the specifications given in [42]. These specifications are designed for large-size diagnostic problems. According to these specifications, the initial temperature is chosen to be 2.0 and the temperature decay rate to be 0.95. The number of iterations is 80 for all tests. In our model, the two parameters have been chosen to be 0.1 and 0.05, respectively. When an effect is observed, the activity of the corresponding cell is set to 1.0; otherwise, it is set to 0.0. Finally, we fixed the number of iterations to 300 for all tests.
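For replication purposes, the settings reported above can be grouped into two small configuration records. The Python sketch below is a convenience only; all field names are our own labels and are not identifiers from [42] or from our implementation.

    # Illustrative grouping of the experimental settings reported in the text.
    peng_reggia_setup = {
        "initial_temperature": 2.0,   # annealing start, as specified in [42]
        "temperature_decay": 0.95,    # multiplicative decay rate
        "iterations": 80,             # iterations per test case
    }

    unify_setup = {
        "param_a": 0.1,               # first model parameter (label is ours)
        "param_b": 0.05,              # second model parameter (label is ours)
        "iterations": 300,            # iterations per test case
        "observed_cell_activity": 1.0,
        "unobserved_cell_activity": 0.0,
    }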

In order to evaluate Peng–Reggia’s plausibility function and our proposed energy function, a cause is considered to be present if its output is greater than 0.01. Such a cause has a nominal value of 1 in Peng–Reggia’s plausibility function, while its value in our energy function is its output. The comparison between our model and Peng–Reggia’s model is based on these two functions. For each test, consider the diagnosis (explanation in our terminology) generated by Peng–Reggia’s model and the one generated by our model, and define the ratios LR and FR by comparing, respectively, the value of the plausibility function and the value of the energy function for these two diagnoses.

Since the objective is to maximize the plausibility function (resp., minimize the energy function), these ratios indicate, for each test case, whether the explanation generated by our model is more plausible (or better) than the one generated by Peng–Reggia’s model with respect to each criterion.

Fig. 21. Comparison with respect to the LR ratio.

Fig. 22. Comparison with respect to the FR ratio.

Based on these rules, we ran both models on a battery of one hundred tests to compare their behavior. For each test, a set of effects is selected randomly: each effect has a 25% chance of being considered as observed. Figs. 21 and 22 summarize our comparison with respect to the above ratios.11
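The random selection of observed effects is easy to reproduce. The sketch below generates a battery of one hundred cases in which each of the 56 effects is marked as observed with probability 0.25, as described above; the function and variable names are illustrative.

    import numpy as np

    def make_test_battery(n_effects=56, n_cases=100, p_observed=0.25, seed=0):
        # Each test case is a 0/1 vector over the effects: an effect is marked
        # as observed with probability p_observed, independently of the others.
        rng = np.random.default_rng(seed)
        return (rng.random((n_cases, n_effects)) < p_observed).astype(float)

    battery = make_test_battery()
    # Each row of `battery` is then fed as the observation vector of one test.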

From left to right, Fig. 21 should be read as follows. There are five cases where the LR ratio is zero (i.e., no cover is generated), one case for each of the next two values of the ratio, three cases for the following value, and so on. Recall that the LR ratio tells us whether the diagnosis generated by our model is less plausible than the one generated by Peng–Reggia’s model.

11 The list of all detailed values is available from the authors upon request.



Hence, this graph summarizes which of the two models generates, in general, the better explanation with respect to the plausibility function.

From left to right, Fig. 22 should be read as follows. There are three cases for the first value of the FR ratio, twelve cases for the next value, seven cases for the following one, eleven cases for the one after, and so on. Recall that the FR ratio tells us whether the diagnosis generated by our model is more plausible than the one generated by Peng–Reggia’s model. Hence, this graph summarizes which of the two models generates, in general, the better explanation with respect to the energy function.

Besides this comparison, we noticed that in most cases our model generates more causes than Peng–Reggia’s, because our reasoning process is more in favor of those causes that have strong relationships with the observed effects. On the other hand, Peng–Reggia’s model severely penalizes the causes which are either nonessential with respect to parsimonious covering or replaceable by a smaller number of other causes (see details in [42]).

VII. EPILOGUE

The contributions of this paper can be summarized as follows. We provide a concise summary of abduction classes and a competition-based model yielding a general algorithm addressing complex abduction problems. We also approximate the energy function of our model to provide theoretical foundations for this work. Finally, we tested the algorithm on a battery of 100 abduction problems to assess the effectiveness of our algorithm. Deliberately, in this paper we presented our model in a progressive and ultimately constructive way. That is, starting from our initial model for the independent class of abduction problems, we refine and extend it to deal with the monotonic, open, and incompatibility classes of abduction. This motivates our initial claim: one unified model/algorithm rather than a variety of models/algorithms for each abduction class.

As one could expect, there are also some shortcomings in this work which should be addressed in further research. We should first investigate why the algorithm fails to solve abduction problems when incompatibility is modeled in an implicit way. Indeed, notice that Notation 3 as well as Definitions 6 and 7 have been arranged to allow an implicit modeling of an incompatibility abduction problem as a combination of an open and a monotonic problem. Unfortunately, it is easy to build an ad hoc counter-example leading to a very slow convergence of our algorithm when dealing with implicitly modeled incompatibility problems.12 Hence, we should find an explanation of this failure in order to fix it. A preliminary investigation has pointed to the definition of the control factor, but we have not found formal evidence.

There is another important issue related to the experimental viewpoint. We tested our algorithm on a database kindly provided by Reggia [40]. However, the hundred test cases in this database are in fact limited to independent/open problems.

12 This explains why we decided to model incompatible hypotheses explicitly in Section IV-C. In fact, it is even possible to build counter-examples leading the algorithm to generate a nonminimal and unsatisfactory explanation.

To test problems including incompatibility relationships, we proceeded randomly by generating typical incompatibility relationships. Our analysis of the generated explanations and of the dynamics of the network was also done case by case; analysis details can be found in [19]. Of course, proceeding in such a way raises a risk of misinterpretation. Our current concern is to build a battery of actual monotonic/open/incompatibility abduction problems.13 Most importantly, we also need the explanation for each of these abduction problems. Obviously, such explanations should be generated by another algorithm in order to analyze the performance of ours.

Finally, there is a promising research direction: extending the proposed unified model to the cancellation class [5]. This class raises several difficulties and exciting challenges.

ACKNOWLEDGMENT

The authors would like to thank Prof. J. A. Reggia, who provided them with the battery of experiments used in [42] to compare both models.

REFERENCES

[1] B. Ayeb, “Toward systematic construction of diagnostic systems for large industrial plants: Methods, languages and tools,” IEEE Trans. Knowl. Data Eng., vol. 6, pp. 698–712, Oct. 1994.

[2] B. Ayeb, P. Marquis, and M. Rusinowitch, “Preferring diagnoses by abduction,” IEEE Trans. Syst., Man, Cybern., vol. 23, pp. 792–808, May 1993.

[3] B. Ayeb and S. Wang, “Abduction-based diagnosis: A competition-based neural model simulating abductive reasoning,” J. Parallel Distrib. Comput., vol. 24, pp. 202–212, 1995.

[4] T. Bylander, “The monotonic abduction problem: A functional characterization on the edge of tractability,” in Proc. AAAI’90, Boston, MA, 1990, pp. 70–77.

[5] T. Bylander, D. Allemang, M. C. Tanner, and J. R. Josephson, “The computational complexity of abduction,” Artif. Intell., vol. 49, pp. 25–60, 1991.

[6] G. A. Carpenter and S. Grossberg, “A massively parallel architecture for a self-organizing neural pattern recognition machine,” Comput. Vision, Graph., Image Processing, vol. 37, pp. 54–115, 1987.

[7] E. Charniak, “Connectionism and explanation,” in Theoretical Issues in Natural Language Processing, Las Cruces, NM, 1987, p. 3.

[8] L. Console, D. T. Dupre, and P. Torasso, “A theory of diagnosis for incomplete causal models,” in Proc. Int. Joint Conf. Artificial Intelligence, Detroit, MI, 1989, pp. 1311–1317.

[9] P. T. Cox and T. Pietrzykowski, “Causes for events: Their computation and applications,” in 8th Conf. Automated Deduction, LNCS 230. Berlin, Germany: Springer-Verlag, 1986, pp. 608–621.

[10] J. de Kleer, “An assumption-based TMS,” Artif. Intell., vol. 28, pp. 127–162, 1986.

[11] J. de Kleer, “Problem solving with the ATMS,” Artif. Intell., vol. 28, pp. 197–224, 1986.

[12] J. de Kleer, “Optimizing focusing model-based diagnosis,” in Proc. Int. Workshop Diagnosis DX’92, Orcas, WA, 1992, pp. 26–29.

[13] J. de Kleer and O. Raiman, “Trading off costs of inference vs. probing in diagnosis,” in Proc. Int. Joint Conf. Artificial Intelligence, Montreal, P.Q., Canada, 1995, pp. 1736–1741.

[14] O. Dressler, “Problem solving with the NM-ATMS,” in 9th Eur. Conf. Artificial Intelligence, Stockholm, Sweden, 1990, pp. 253–258.

[15] O. Dressler and P. Struss, “Back to defaults: Characterizing and computing diagnoses as coherent sets,” in 10th Eur. Conf. Artificial Intelligence, Vienna, Austria, 1992, pp. 719–723.

[16] M. L. Ginsberg, Ed., Readings in Nonmonotonic Reasoning. New York: Morgan Kaufmann, 1987.

[17] K. Eshghi, “Abductive planning with event calculus,” in Proc. 4th Int. Conf. Symp. Logic Programming, 1988, pp. 563–579.

[18] M. R. Garey and D. S. Johnson, Computers and Intractability. New York: Freeman, 1979.

13 Unfortunately, it seems difficult to gain access to such a battery.



[19] J. Ge, “Raisonnement hypothétique par les réseaux de neurones,” M.Sc. thesis, Université de Sherbrooke, Sherbrooke, P.Q., Canada, Tech. Rep. (III)-906, 1994.

[20] A. Goel, J. Ramanujam, and P. Sadayappan, “Toward a ‘neural’ architecture of abductive reasoning,” in Proc. 2nd Int. Conf. Neural Networks, 1988, pp. I-681–I-688.

[21] S. Grossberg, “Nonlinear neural networks: Principles, mechanisms, and architectures,” Neural Networks, vol. 1, no. 1, pp. 17–60, 1988.

[22] W. C. Hamscher, J. de Kleer, and L. Console, Eds., Readings in Model-Based Diagnosis. New York: Morgan Kaufmann, 1992.

[23] D. E. Heckerman and B. N. Nathwani, “Probability-based representations for diagnosis,” in Proc. Int. Workshop Diagnosis DX’92, Orcas, WA, 1992, pp. 169–178.

[24] D. Heckerman and E. Horvitz, “Problem formulation as the reduction of a decision model,” in Proc. Conf. Uncertainty Artificial Intelligence, 1990, pp. 82–89.

[25] J. R. Josephson, B. Chandrasekaran, J. W. Smith, and M. C. Tanner, “A mechanism for forming composite explanatory hypotheses,” IEEE Trans. Syst., Man, Cybern., vol. SMC-17, pp. 445–454, 1987.

[26] B. Kosko, “Bidirectional associative memories,” IEEE Trans. Syst., Man, Cybern., vol. 18, pp. 49–60, 1988.

[27] D. L. Long, J. M. Golding, A. C. Graesser, and L. F. C. Clark, “Goal, event, and state inferences: An investigation of inference generation during story comprehension,” in The Psychology of Learning and Motivation, A. C. Graesser and G. H. Bower, Eds. New York: Academic, 1990, vol. 25.

[28] T. M. Mitchell, R. M. Keller, and S. T. Kedar-Cabelli, “Explanation-based generalization: A unifying view,” Mach. Learn., vol. 1, pp. 47–80, 1986.

[29] J. Pearl, “Distributed revision of composed beliefs,” Artif. Intell., vol. 33, pp. 173–215, 1987.

[30] J. Pearl, Probabilistic Reasoning in Intelligent Systems. San Mateo, CA: Morgan Kaufmann, 1988.

[31] C. S. Peirce, “Abduction and induction,” in Philosophical Writings of Peirce, J. Buchler, Ed. New York: Dover, 1955, ch. 11, pp. 150–156.

[32] Y. Peng and J. A. Reggia, “A probabilistic causal model for diagnostic problem solving: Part I,” IEEE Trans. Syst., Man, Cybern., vol. SMC-17, pp. 146–162, 1987.

[33] Y. Peng and J. A. Reggia, “A probabilistic causal model for diagnostic problem solving: Part II,” IEEE Trans. Syst., Man, Cybern., vol. SMC-17, pp. 395–406, 1987.

[34] Y. Peng and J. A. Reggia, “A connectionist model for diagnostic problem solving,” IEEE Trans. Syst., Man, Cybern., vol. 19, pp. 285–289, 1989.

[35] D. Poole, “Representing diagnostic knowledge for probabilistic Horn abduction,” in Proc. Int. Joint Conf. Artificial Intelligence, Sydney, Australia, 1991, pp. 1129–1135.

[36] G. M. Provan and J. R. Clarke, “Dynamic network construction and updating techniques for the diagnosis of acute abdominal pain,” IEEE Trans. Pattern Anal. Machine Intell., pp. 299–307, Mar. 1993.

[37] J. A. Reggia, D. S. Nau, and P. Y. Wang, “A formal model of diagnostic inference,” Inf. Sci., vol. 37, pp. 227–285, 1985.

[38] R. Reiter, “A theory of diagnosis from first principles,” Artif. Intell., vol. 32, pp. 57–95, 1987.

[39] P. Struss and O. Dressler, “Physical negation: Integrating fault models into the general diagnostic engine,” in Proc. Int. Joint Conf. Artificial Intelligence, Detroit, MI, 1989, pp. 1318–1323.

[40] M. A. Tagamets and J. A. Reggia, “A data flow implementation of a competition-based connectionist model,” J. Parallel Distrib. Comput., vol. 6, pp. 704–714, 1989.

[41] P. Thagard, “Explanatory coherence,” Behav. Brain Sci., vol. 12, pp. 435–502, 1989.

[42] J. Wald, M. Farach, M. Tagamets, and J. A. Reggia, “Generating plausible diagnostic hypotheses with self-processing causal networks,” J. Exper. Theoret. Artif. Intell., vol. 2, pp. 91–112, 1989.

Bechir Ayeb (M’92) is an Associate Professor with the Faculty of Sciences, University of Sherbrooke, Sherbrooke, P.Q., Canada. He heads the research laboratory RIFA, focusing on diagnosis, fault-tolerant computing, and distributed systems. He has been interested in diagnosis since 1985, when he explored different aspects of diagnosis, including theoretical, methodological, and experimental issues. He has published several papers on these subjects. His current research interests are distributed systems, fault-tolerant computing, and diagnosis.

Dr. Ayeb is a member of the IEEE Computer Society.

Shengrui Wang received the B.S. degree in mathematics from Hebei University, China, in 1982, the M.S. degree in applied mathematics from the Université de Grenoble, Grenoble, France, in 1986, and the Ph.D. degree in computer science from the Institut Polytechnique de Grenoble in 1989.

In 1990, he was a Post-Doctoral Fellow with the Department of Electrical Engineering, Laval University, Quebec, P.Q., Canada. He is now an Associate Professor at the University of Sherbrooke, Sherbrooke, P.Q. His research interests include neural networks, expert systems, image processing, parallel computing, and numerical optimization.

Dr. Wang is a member of INNS.

Jifeng Ge received the B.S. degree from Zhejiang University, China, in 1985 and the M.S. degree from the University of Sherbrooke, Sherbrooke, P.Q., Canada, in 1994, both in computer science.

He is with Micro-Intel Inc., Montreal, P.Q. His interests include artificial intelligence and neural network modeling.