

International Journal of Parallel Programming, Vol. 28, No. 3, 2000

Parallel Logic Programming for Problem Solving^1

Ramiro Varela Arias,^2 Camino Rodríguez Vela,^2 Jorge Puente Peinador,^2 and César Alonso González^2

Received June 1998; revised March 2000

We present a new model for parallel evaluation of logic programs. This model can exploit the main sources of parallelism that the language of logic expresses, independent and parallelism and or parallelism, together with a secondary source emerging as a consequence of independent and parallelism: producer-consumer parallelism. The efficiency is derived from the use of ordered structures for managing the information generated throughout the search process. The model is suitable for evaluating programs with a high degree of non-determinism because it never generates two processes for solving the same subgoal, and hence it can exploit the same real parallelism while generating a lower number of processes than other models. As an application example, we consider the Job Shop Scheduling problem. We report experimental results showing that logic programs can be designed that exhibit parallelism, and that the use of heuristic information translates into speedup in obtaining answers.

KEY WORDS: Parallel logic programming; ordered structures; heuristics; problem solving.

1. INTRODUCTION

Rule based deduction is a classic technique in Artificial Intelligence with a wide range of applications such as Expert Systems and Problem Solving.


^1 This work has been supported by the FICYT of the Principado de Asturias under Project PB-TIC-9703.

^2 Centro de Inteligencia Artificial, Universidad de Oviedo en Gijón, Campus de Viesques, E-33271 Gijón, Spain. E-mail: [ramiro,camino,puente,calonso]@aic.uniovi.es or http://www.aic.uniovi.es/PlyP/presentacion.htm.


In these domains, the complexity of the computations can be very high and many problems tend to be combinatorially explosive. Hence, it is to be expected that their computation time will be drastically reduced with a parallel execution scheme. On the other hand, logic programming languages have become a powerful tool to express the symbolic computations which often appear in this context. The most widespread implementation of these languages is Prolog. The procedural semantics of this language involves a sequential control strategy with a classic depth-first and backtracking scheme, and so the execution time often becomes unacceptable when solving some instances of these problems.

These facts, together with the ability of the language of logic to express parallel computations, have made parallel schemes for the evaluation of logic programs an interesting field of research for improving the efficiency of deductive systems. Accordingly, several models for parallel interpretation of logic programs have been proposed over recent years. Most of them exploit one or both of the two main sources of parallelism of logic programs: and parallelism and or parallelism.(1-10) and parallelism consists of the simultaneous evaluation of several literals of a query, whereas or parallelism consists of exploiting several rules with the same conclusion at the same time. These kinds of parallelism can be clearly represented by and/or trees. Hence, computations in parallel models are usually represented by means of these trees, every model having its own variant of the and/or tree.

In this work, we present an abstract model for parallel interpretation of logic programs. The model was designed to exploit the two main sources of parallelism of logic programming, as well as some secondary sources. The model is abstract in the sense of being independent of any target machine on which it might subsequently be implemented.

In developing a parallel model for logic programming, several problems arise that do not affect sequential approaches. For example, when exploiting or parallelism we have to deal with multiple bindings to the variables of the literals. Moreover, if we exploit and parallelism at the same time, we have to join the multiple bindings from various literals of a given query. What frequently occurs is that most of the computed solutions to a literal are not compatible with the solutions to the remaining literals, and then there are lots of processes whose work turns out to be useless. On the other hand, it is common for the same subgoal to be solved several times when a query is evaluated with respect to a program; this occurs not only with parallel models but also with sequential SLD resolution. Therefore, when designing a parallel model for evaluating logic programs, we have to consider whether or not the management cost that parallelism introduces outweighs the speedup obtained by means of parallelism.


In our model, these problems are faced by means of ordered structures with some lattice properties that represent the information used during the evaluations. First, the I-and/or tree is defined in order to carry out a decomposition of the whole task of evaluating a query into a set of independent subtasks that can be solved in parallel. Then, in order to achieve this decomposition in an efficient way, the Data Flow Lattice (DFL) is introduced to represent a partial ordering for the evaluation of the literals of the queries. Finally, the Process and Solutions Net (PSN) is defined in order for the partial solutions to the literals to be efficiently managed. The idea of utilizing ordered structures as a formal framework in the fields of logic programming and artificial intelligence is not new; it has, for example, been widely used for and parallelism management.(2, 3, 11)

In order to clarify the power of our model for problem solving, we consider the Job Shop Scheduling (JSS) constraint satisfaction problem. This is an NP-hard problem that has given rise to a great research effort, so a large number of solutions have been proposed. Among these solutions, we can find the application of almost every artificial intelligence technique, for instance constraint logic programming,(12) neural nets,(13) machine learning,(14) genetic algorithms,(15) and heuristic search,(16) the last two perhaps being the most frequently used. In this work, we first show how this problem may be solved by means of logic programming, and furthermore, we show how logic programs can be heuristically designed in order to improve the efficiency in extracting answers. In order to do that, we consider the variable and value ordering heuristics proposed by Sadeh and Fox(16) for JSS problems. These heuristics will be used to guide the construction of logic programs for solving JSS problems, in order for these programs to be efficiently evaluated in parallel. As we will see, these programs exhibit all three types of parallelism described above, and can be improved with the assistance of heuristics.

The remainder of the paper is organized as follows: in Section 2 we introduce parallel logic programming and review the main sources of parallelism that can be exploited, together with the main models that have been proposed for this purpose. In Section 3 we present the basis of our interpretation process model and point out the main problems that its implementation involves. Here, we first describe by means of examples the main features of the interpretation model, and then we formally define the main ordered structures utilized: the I-and/or tree, the DFL, and the PSN. In Section 4 we address the problem of joining the multiple bindings to the literals of a query, and prove that it can be efficiently solved from some properties that we require of the structure of the DFLs. In Section 5, we describe the management of nonground instantiations. This is a key issue in logic programming and it is especially critical to handle in the presence of parallelism.


In Section 6, we present the application of the interpretation model to Job Shop Scheduling problems. Finally, in Section 7 we summarize the work.

2. PARALLEL LOGIC PROGRAMMING

In this section we present some fundamentals of parallel logic programming through a simple model that can be used to evaluate logic programs in parallel. This model can exploit the main sources of parallelism that logic programming offers, but it presents some drawbacks that a real model should overcome in order to be efficient.

First, the terminology used in this paper with regard to logic programming is briefly introduced. A logic program consists of a set of Horn clauses. A Horn clause has the form p :- q1, ..., qn, where p and every qi are literals. A literal is a predicate symbol followed by a parenthesized list of terms. A term can be a constant, a variable, or a function followed by a parenthesized list of terms. Predicates, functions, and constants are represented by identifiers starting with a lowercase letter, whereas variables are symbols starting with an uppercase letter. In this clause, p is the head and q1, ..., qn is the body of the clause. If the body is empty the clause is a fact and it is represented simply as p. If the body is not empty the clause is a rule. A query is a possibly empty conjunction of literals.

The clause p :- q1, ..., qn can be understood to mean ``one way to solve p is to solve all the subproblems q1, ..., qn.'' A fact represents a solved problem. A literal r is solved if it unifies with the head of a clause with most general unifier (mgu) σ, and all the literals in the body of the clause instantiated with the substitution σ are solved. A unifier of a set of literals is a substitution that, when applied to every literal of the set, makes all of them match. Given a set of variables {X1, ..., Xn}, a substitution for these variables is a set {X1/t1, ..., Xn/tn} where every ti is a term that does not contain the variable Xi. A solution of a query is a substitution ρ such that every literal in the query instantiated with ρ is solved. A partial solution is a solution to one or more subgoals of a query.

The most widespread implementation of logic programming languages is Prolog, whose procedural semantics is SLD-Resolution.(17) In order to obtain solutions to a query with respect to a logic program, SLD-Resolution expands the so-called SLD tree by means of a sequential depth-first search using backtracking. Every node of the SLD tree is labeled by a query. The root is the top level query and each of the remaining nodes is obtained from a resolution step in which the first literal of the father's query is unified with the conclusion of a clause of the logic program. The leaves of the SLD tree are either a query whose first literal unifies with no conclusion of the clauses, or an empty query.


Fig. 1. (a) A simple logic program defined by five facts; and (b) The full SLD tree expanded to evaluate the query q(X, Y), p(X) with respect to the logic program of (a).

A path from the root of the tree to an empty leaf represents a solution to the top level query. Figure 1 shows a logic program composed of five facts and the SLD tree generated to evaluate the query q(X, Y), p(X) with respect to the program. SLD trees are suitable for sequential search, but they cannot express all the parallelism of logic programs. As pointed out by Kale,(18) SLD trees do not express and parallelism.
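To make the sequential procedure concrete, the following Python sketch enumerates the solutions of a conjunctive query over a base of ground facts by a depth-first, left-to-right search, in the manner of the SLD tree of Fig. 1b. The five facts are hypothetical, since the figure itself is not reproduced here; they are chosen only to be consistent with the description: three solutions to q(X, Y), two to p(X), and exactly two answers to the query.

    # Hypothetical five-fact program in the spirit of Fig. 1a.
    FACTS = [
        ("q", ("b", "1")), ("q", ("b", "2")), ("q", ("c", "3")),
        ("p", ("b",)), ("p", ("a",)),
    ]

    def solve(query, binding=None):
        """Depth-first enumeration of the solutions to a conjunctive query.

        `query` is a list of (predicate, args) pairs; strings starting with
        an uppercase letter are variables. Yields substitutions as dicts,
        mimicking the left-to-right expansion of an SLD tree."""
        binding = binding or {}
        if not query:
            yield dict(binding)
            return
        (pred, args), rest = query[0], query[1:]
        for fpred, fargs in FACTS:
            if fpred != pred or len(fargs) != len(args):
                continue
            new, ok = dict(binding), True
            for a, f in zip(args, fargs):
                if a[0].isupper():                 # variable: bind or check
                    ok = new.setdefault(a, f) == f
                else:                              # constant: must match the fact
                    ok = a == f
                if not ok:
                    break
            if ok:
                yield from solve(rest, new)        # backtrack on exhaustion

    # The query q(X, Y), p(X) yields two answers, as in Fig. 1b.
    for s in solve([("q", ("X", "Y")), ("p", ("X",))]):
        print(s)       # {'X': 'b', 'Y': '1'} and {'X': 'b', 'Y': '2'}

The generator backtracks implicitly: when a branch exhausts the facts for a literal, control returns to the previous literal, exactly as in the depth-first traversal of the SLD tree.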

and/or trees provide an alternative to the classic SLD trees which is more adequate for representing parallel evaluation of logic programs. They have two types of nodes: and nodes, labeled with queries, and or nodes, labeled with single subgoals. In its raw version, given a query Q with respect to a logic program, the and/or tree is expanded through the following rules:

• The root is an and node labeled with Q.

• An and node labeled with [p1, ..., pn] has n children labeled with p1, ..., pn respectively. This relation expresses the notion that to solve the query p1, ..., pn, every pi has to be solved.

• An or node labeled with p has one and child for every clause of the program whose head unifies with p with mgu σ. The child is labeled with the body of the clause composed with σ (if the clause is a fact, the and child has a null label and is a leaf of the tree). In each case the associated arc is labeled with the mgu σ. This relation expresses the fact that the query of at least one child must be solved in order to solve the literal of the or node.

Solutions in a node are stored in a Partial Solution Set (PSS). The PSS of a node is computed only from the PSSs of its children nodes. In the case of an or node, every solution in the PSS of an and child, composed with the label of the corresponding arc, is a solution of the node.


Fig. 2. (a) The and/or tree expanded to evaluate the query q(X, Y), p(X) with respect to the logic program of Fig. 1a; and (b) Joins made by the root and process of the and/or tree of (a). Only two out of the six do not fail.

However, in and nodes we must join one solution from each child node in all possible ways. Figure 2a shows the and/or tree expanded for solving the query q(X, Y), p(X) with respect to the program of Fig. 1a, and Fig. 2b shows all of the six combinations computed at the root node. As we can see, only two of these provide solutions for the query, while the others fail because they combine incompatible solutions. Here, ``⋆'' denotes the usual join of substitutions, defined for a pair of substitutions S1 = {X11/t11, ..., X1n/t1n} and S2 = {X21/t21, ..., X2m/t2m} as follows (we have to take into account that the identity substitution is represented by ∅ and that its meaning is the value True):

S1 ⋆ S2 =

(a) False, if one of the substitutions is False or there is a variable X ∈ var(S1) ∩ var(S2) such that X/t1 ∈ S1 and X/t2 ∈ S2 and the terms t1 and t2 do not unify;

(b) S1\2 ∪ S2\1 ∪ {X/t1σ : σ = mgu(t1, t2), X ∈ var(S1) ∩ var(S2), X/t1 ∈ S1, X/t2 ∈ S2}, otherwise.

where var(S1) = {X11, ..., X1n}, var(S2) = {X21, ..., X2m}, S1\2 = S1 ↓ (var(S1)\var(S2)) and S2\1 = S2 ↓ (var(S2)\var(S1)). The operators ``\'' and ``↓'' denote the usual set difference and the projection of a substitution over a set of variables, respectively.
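As a concrete illustration, the following Python sketch implements the join defined above, restricted to the ground case, in which mgu(t1, t2) reduces to an equality test; a general implementation would call a first-order unifier at that point. The two solution sets reuse the hypothetical facts of the previous sketch, giving the six combinations of Fig. 2b.

    def join(s1, s2):
        """Join of two substitutions (dicts); False models the failed
        substitution and {} the identity substitution (True)."""
        if s1 is False or s2 is False:
            return False                    # case (a): a failed operand
        out = dict(s1)
        for var, t2 in s2.items():
            if out.setdefault(var, t2) != t2:
                return False                # case (a): shared variable, terms differ
            # case (b): union, agreeing on the shared variables
        return out

    # The six combinations computed at the root and node (Fig. 2b).
    q_solutions = [{"X": "b", "Y": "1"}, {"X": "b", "Y": "2"}, {"X": "c", "Y": "3"}]
    p_solutions = [{"X": "b"}, {"X": "a"}]
    joins = [join(sq, sp) for sq in q_solutions for sp in p_solutions]
    print(sum(j is not False for j in joins))    # 2: only two joins do not fail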

and/or trees suggest a way of decomposing the problem of computing answers for a given query into parallel subtasks. Hence, from this representation, we can establish a process model that associates one independent process with each node for solving the problem represented by that node.


Therefore, we have two types of processes: and processes to solve queries and or processes to solve single literals. In order to solve a query, an and process generates an or process to solve each one of the literals of the query. When all of these processes are generated in parallel, we obtain a full and parallel schema. At the same time, to solve a single literal p, an or process generates an and process to solve the body of every clause whose conclusion unifies with the literal p. As before, when all of these and processes are started in parallel, we have a full or parallel schema.

As pointed out by several authors,(2, 3, 5) the main objection that can be raised to this simple model is the cost of joining partial solutions. This cost may be unacceptable when there are many variables shared among the literals in programs with a large number of partial solutions. This problem is usually confronted by means of a different form of parallelism: the so-called Independent and Parallelism (IAP), proposed by Conery(2) and accepted by most researchers. IAP consists of the ordered evaluation of literals in such a way that two literals are not evaluated in parallel if they share some free variable. When a variable is shared by several literals, one of them is evaluated first, as a producer, and then the others can be evaluated as consumers with the shared variable bound to the value computed by the producer. In order to do this, it is necessary to determine a partial ordering for the evaluation of the literals of a query. This ordering is usually represented by means of ordered structures such as, for example, the Data Flow Graph (DFG),(2) the Data Join Graph (DJG),(11) and Conditional Graph Expressions (CGE).(3)

One advantage of IAP is that, in general, it reduces the cost of joining partial solutions with respect to a full and-parallel schema, because it may avoid the generation of many partial solutions to a literal which turn out to be incompatible with every solution to some other literal. Over the last decade, several models for exploiting IAP together with or parallelism have been proposed.(2, 5-8)

3. THE DFL-PSN PROCESS MODEL

In this section we introduce our proposed model to evaluate logic programs. The main objective of this model was to exploit simultaneously or parallelism and independent and parallelism, together with another secondary source of parallelism: producer-consumer (or consumer-instance) parallelism. The efficiency of the model is based on the use of ordered structures to represent computations. At the top level, computations are represented by means of a variant of the and/or tree called the I-and/or tree (I standing for Independent). Given a logic program and a query, the I-and/or tree represents a decomposition of the evaluation of the query into a set of independent subtasks, such that many of these subtasks can be carried out in parallel.


Because I-and/or trees are expanded at runtime, they represent nondeterminate computations. This class of computation is still an open area of research, as pointed out, for example, by Dennis.(19)

In order to represent IAP we use the Data Flow Lattice (DFL). This is a graph that represents a partial ordering for the evaluation of the literals of a query. It is assumed that the DFLs for the top level query, as well as for the body of every rule of the program, are computed at compile time. Finally, to manage the partial solutions to the subgoals of a query we define the Process and Solutions Net (PSN). This is a dynamic structure which is built by an and process as it generates or processes to solve subgoals and receives solutions from them. To name the model we chose the names of these two last structures, that is, DFL-PSN. Before formally defining the three structures, we introduce their main features by means of two examples and give a number of hints about the dynamic aspects of the model. In order to simplify the description of the model, for the moment we assume that when a literal is evaluated all of its free variables become instantiated to ground terms. Of course this is a strong simplification in the logic programming field. Nevertheless, even with this simplification, the model can be successfully applied to a variety of combinatorial search problems like the JSS problem that we address in Section 6. Later, in Section 5, we will indicate how to generalize the model in order to deal with partially instantiated terms.

Example 1. Consider again the query q(X, Y), p(X) and the program of Fig. 1a. Figure 3 shows the I-and/or tree expanded by our model when it evaluates this query. The literals within a query are evaluated according to a partial ordering; we assume that for this query the ordering is first q(X, Y) and then p(X), as the DFL at the root node of Fig. 3 shows. Therefore, one and process is created first for solving the query. This process first generates one or process for solving q(X, Y). The arc from an and process to a descendant or process is labeled by a substitution θ called the context of the or process. The context of an or process is an instantiation of those variables of the literal that appear in some of its predecessors in the partial ordering. In the case of the or process generated to solve q(X, Y), the context is the empty substitution because this literal does not have any predecessor.

This or process expands in parallel three and processes, associated respectively with the three clauses of the program that might solve the literal. In this case, the corresponding edge is labeled with a substitution σ. This substitution is the mgu of the literal at the or node and the conclusion of the associated clause, as we will see in Section 3.1.


Fig. 3. An I-and/or tree generated to solve the query q(X, Y), p(X) with respect to the logic program of Fig. 1a. It is assumed that the partial ordering among the literals of the query for evaluation under IAP is first q(X, Y) and then p(X), as shown at the root node.

Moreover, the descendant and process is labeled with the premise of the clause instantiated with that mgu. Because the former three clauses are facts, all three and processes are leaves of the tree and hence they are labeled with a null query and produce the empty substitution. This solution, previously composed with the label σ of the edge, is sent by a message to the father or node. This node sends, via messages, the three solutions to the root and process, previously composed with the corresponding substitution θ. From these three solutions the root and process should identify two different instantiations of the variable X, which is shared by the literal q(X, Y) and its successor p(X); these instantiations define two contexts, (X/b) and (X/c), to evaluate p(X), and hence the and process will generate two new or processes to solve this literal under these contexts. As shown in Fig. 3, one of them completes with a positive answer whereas the other fails. Finally, at the root node we have two answers to the query.

One of the main features of the DFL-PSN model is that it avoids the duplication of a number of processes, in contrast to other models such as those proposed by Kale(11) and Gupta,(8) which would generate two processes for solving p(b) in this example, one from each solution of the literal q(X, Y) with the variable X instantiated to the value b. [Note: There are many interesting problems where this ability of nonduplication is worthwhile. But in a number of situations, for example when side effect literals or dynamic databases are used, this ability can be inhibited by user annotations or automatically if necessary.] In order to do this, we have developed a strategy for partial solution management based on the PSN.


This is a hierarchical structure that represents all the partial solutions obtained for the subgoals of the query and their relations. We clarify this representation in the next example.

As we have pointed out, the DFL-PSN model can exploit producer-consumer (or consumer-instance) parallelism. This is a secondary source of parallelism that appears in association with IAP when the literals have multiple answers. Given two literals with common variables, it consists of starting the evaluation of one instance of the second literal (consumer) to compute solutions compatible with one solution of the first (producer) as soon as this solution appears. In our example, if producer-consumer parallelism is exploited, we can start exploring p(b) as soon as we have the first solution of q(X, Y) containing (X/b). When the second solution with (X/b) appears, this solution should be properly related with the already existing process that solves p(b), in order for the solutions from this process to be joined with both of the solutions of q(X, Y) containing (X/b).

Producer-consumer parallelism is interesting mainly in nondeterministic programs because it allows the solutions to a subgoal to be exploited as soon as they are obtained. That is, a solution to a subgoal can be used to generate processes for other literals while the remaining solutions to the subgoal are still being calculated. In this way a uniform expansion of the I-and/or tree is achieved that allows combining the early solutions of every literal before the whole set of solutions of any of them is computed. Furthermore, in the presence of infinite relations it maintains the completeness of the system. As pointed out by Kale,(8, 11) exploitation of this parallelism was the point of departure for the REDUCE-OR Process Model (ROPM).
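The following sketch illustrates this pipelining with a thread and a queue, reusing the hypothetical solutions of the earlier sketches; the process bookkeeping is illustrative only. The consumer launches an or process for an instance of p(X) as soon as a new binding of the shared variable X arrives, instead of waiting for the producer's full solution set, and it avoids launching a duplicate process for a context already seen.

    import queue
    import threading

    def producer(out):
        # Streams the solutions of q(X, Y) as they are computed.
        for s in [{"X": "b", "Y": "1"}, {"X": "b", "Y": "2"}, {"X": "c", "Y": "3"}]:
            out.put(s)                     # consumable as soon as it arrives
        out.put(None)                      # end-of-stream marker

    def consumer(inp):
        started = set()                    # contexts already under evaluation
        while (s := inp.get()) is not None:
            if s["X"] not in started:      # a new context: spawn one or process
                started.add(s["X"])
                print(f"start or process for p({s['X']})")

    ch = queue.Queue()
    threading.Thread(target=producer, args=(ch,)).start()
    consumer(ch)                           # starts processes for p(b) and p(c) only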

Example 2. We now elaborate another example in order to describe how an and process constructs a PSN during the evaluation of a query. Let us consider the logic program of Fig. 4a and the query q(X, Y), p(X, Z), r(Y, T), s(Z, T), whose literals are partially ordered for evaluation under IAP as the DFL of Fig. 4b displays. In this case we are interested in the main steps followed by the root process of the I-and/or tree in building the PSN (Fig. 4c). In order to do that, we assume a number of race conditions that allow us to clarify the interaction among IAP, or parallelism and producer-consumer parallelism.

The root and process is started to solve the query codified into the DFL; at the beginning its PSN is null. The first main step of an and process consists of generating an or process for solving each one of the literals without predecessors in the DFL, in this case q(X, Y). Therefore an or process is generated to solve this literal and the corresponding identifier is inserted into the PSN, as shown in Fig. 5a. Then, the and process waits for the first answer of the descendant or process.


Fig. 4. (a) A logic program; (b) A DFL for the query q(X, Y), p(X, Z), r(Y, T), s(Z, T); and (c) The PSN generated by the root and process to evaluate the query with respect to the logic program.

As we can see from the program of Fig. 4a, this or process will return four different answers, in an order that depends on the race conditions. Let us assume that the first answer is (X/d, Y/b); this solution is inserted into the PSN as a solution node linked to the corresponding process node, as Fig. 5b shows. Now, while waiting for the remaining solutions to the literal q(X, Y), the and process generates or processes to solve subgoal instances of the successor literals of q(X, Y) in the DFL: p(X, Z) and r(Y, T). These subgoals are obtained by instantiating the literals p(X, Z) and r(Y, T) according to the solution of q(X, Y). Consequently, two or processes are generated in parallel to solve p(d, Z) and r(b, T) respectively. These processes evaluate the literals p(X, Z) and r(Y, T) under the context defined by the solution of the ancestor literal q(X, Y) and hence, as shown in Fig. 5c, their process identifiers are linked to the solution node (X/d, Y/b). This relation expresses the notion that this solution is compatible with every solution eventually produced by these processes and hence they can be joined without any compatibility checking. This is the main advantage of IAP. At the same time, we can appreciate the utility of producer-consumer parallelism.


Fig. 5. (a) The PSN after creating an or process to solve the first subgoal; (b) The PSN after the arrival of the first answer from the first or descendant; and (c) The PSN after the generation of the or processes for solving the first instances of literals p(X, Z) and r(Y, T) respectively.

In the situation of Fig. 5c, the first solution produced by q(X, Y) has been consumed by the literals p(X, Z) and r(Y, T), while the remaining three solutions of q(X, Y) are still being calculated. It is clear that, in general, this type of parallelism will accelerate the calculation of the first answer, while every potential solution can still be calculated if it is safely managed.

The three or processes of Fig. 5c are asynchronous and hence they can send answers to the father and process in any order. If the answer (Z/j) comes from the process procp(d, Z), we have the situation of Fig. 6a, where a new solution node is inserted and linked to this process. This new solution node represents not only a solution to the literal p(X, Z), but also a solution to the conjunction of the two literals q(X, Y) and p(X, Z). This solution is made explicit by means of the inference function INF, as shown in Fig. 6c. This function is in charge of joining the partial solutions spread over the PSN, thus obtaining solutions to the query or to a subset of its literals.

Fig. 6. (a) The PSN after the answer of the process procp(d, Z); (b) The PSN after the answer of the process procs(j, k); and (c) Result of the inference function from the solution nodes of the PSN of (b).



Now assume that the process procr(b, T) sends the answer (T/k). In this situation we have solutions to the literals of the first two steps of the DFL, each solution being compatible with the others. Therefore, as the DFL of Fig. 4b expresses, a new or process to solve the literal s(Z, T) can be generated. The situation after generating this process is shown in Fig. 6b. As we can see, the new process procs(j, k) is not linked to a single solution node but to a couple of solution nodes, each one belonging to one of the predecessor literals in the DFL. This hyperlink expresses the notion that the solutions computed by this process are compatible with those solutions represented by the two solution nodes. This pair of solutions establishes the context to evaluate the process procs(j, k). This process does not produce values for the variables of the literal s(Z, T); it is only a consumer of values for these variables. Therefore, the only answers it can return are true or false. As we can see from the logic program of Fig. 4a, it will return the answer true, which is represented by the empty set, as shown in Fig. 6b. Since this is a solution to the last literal in the partial order defined by the DFL, the inference function returns a solution to the query when it is applied to the corresponding solution node, as shown in Fig. 6c. At this moment, the root and process produces the first answer to the query, while the remaining answers are still being calculated.

Now let us assume that the second solution to the literal q(X, Y) appears. As before, this new solution should be inserted into the PSN as a new solution node linked to the process procq(X, Y), and, in principle, two new processes to solve the literals p(X, Z) and r(Y, T) under the new context should be generated. In this case, however, one of these processes, procp(d, Z), has already been created from the first solution of q(X, Y). Hence, by making use of the relations among processes and solutions represented within the PSN, we can avoid the duplication of this process and simply introduce a new link from the new solution to the existing process. The new situation is shown in Fig. 7a. Now the context of the process procp(d, Z) is defined by two solutions of the literal q(X, Y), both of which have the same instantiation for the variable X consumed by the literal p(X, Z). When the new or process procr(c, T) sends the answer (T/k), this solution should be combined with the solution (Z/j) to obtain a new context to evaluate s(Z, T), although this context is not really new and hence a new process is not generated. What should be done is simply to add a new link from the couple of solution nodes to the process procs(j, k), as shown in Fig. 7b. After this new link is added, we can observe that a new solution to the query is codified in the PSN. Of course, the model should ensure that this solution is not lost. In order to do that without obtaining repeated solutions, a variant of the inference function is used, the so-called solution restricted inference, INF_S_R, which allows one to select one or more literals and restrict the joining to a given solution node of each of these literals.


Fig. 7. (a) The PSN after the second answer of the process procq(X, Y). Only a process to solve r(Y, T) should be generated; and (b) The PSN after the answer of the process procr(c, T). A new process is not generated, but a new solution to the query is codified in the PSN.

Figure 7b shows the result of applying the inference from the solution node labeled ∅ restricted to the solution node (T/k)2 of the literal r(Y, T). INF_S_R is a generalization of the function INF, since both of them return the same value when they are applied to a solution node, provided that INF_S_R is restricted to none of the solution nodes. For example, for the solution node ∅ of the literal s(Z, T) of Fig. 7b we have

INF_S_R(∅, [ ]) = INF(∅) = {(X/d, Y/b, Z/j, T/k), (X/d, Y/c, Z/j, T/k)}

Now assume that the third solution of the literal q(X, Y) appears. What happens is represented in Fig. 8a. As we can see, from this solution two new or processes should be generated: procp(a, Z) and procr(e, T). The second fails and the former returns the answer (Z/i), which is not compatible with any of the solutions of the literal r(Y, T); hence, for the moment, this solution cannot contribute to any context for the literal s(Z, T).

Finally, when the root and process receives the last solution of the literal q(X, Y) from the or process procq(X, Y), it turns out that this solution does not establish a new context for the literals p(X, Z) and r(Y, T).


Fig. 8. (a) The PSN after the third answer of the process procq(X, Y). Two new or processes are generated, one of which fails; and (b) The PSN after the last solution of the process procq(X, Y). The solutions (Z/i) and (T/k)2 become compatible with each other and then the process procs(i, k) is generated.

As we can see in Fig. 8b, this solution is adjoined to the contexts of two already existing processes. In this situation, unlike the situation of Fig. 7b where a new solution to the query had appeared, no new solution to the query appears. Nevertheless, something more should be done in order for a number of solutions not to be lost. As a consequence of the solution (X/a, Y/c) to the literal q(X, Y), the solutions (Z/i) of process procp(a, Z) and (T/k) of process procr(c, T) become compatible and hence they establish a new context to evaluate the literal s(Z, T). Therefore, under this new context the process procs(i, k) should be generated. This process completes successfully, as shown in Fig. 8b, and hence a new solution to the query is obtained. At this moment every descendant or process of the root and process has completed, and consequently the and process completes too.

As we can observe from the former example, during the PSN management there are a number of actions that might be critical and hence require an efficient implementation: first, the inference function that joins the partial solutions spread over the PSN to obtain solutions to the query; and then the search, from a given solution node, for compatible solution nodes of any other literal. For example, when the process procr(c, T) returns the answer (T/k) (Fig. 7b), in order to obtain the compatible solutions of the literal p(X, Z), the first step is to search upwards for compatible solutions of the literal q(X, Y), thus obtaining the solution node (X/d, Y/c). Then from this node a downwards search finds the node (Z/j) of the literal p(X, Z). Similar upward and downward searches are necessary to manage the other situations we have commented on in this example.


The efficiency of all of these operations is based on a number of properties we require of the structure of the DFL, as we will see in the next sections.

In the next three subsections, we formally describe the three main structures that define the procedural semantics of the DFL-PSN model.

3.1. The I-AND/OR Tree

In this section we formally define the I-and/or tree. We denote an or node by O(θ, p, PSSp), where θ is a substitution that represents the context of the node, p is a literal and PSSp is a set of solutions to the subgoal pθ. On the other hand, an and node is represented by A(σ, Q, PSSQ), where σ is a substitution, Q is a query whose literals are assumed to be partially ordered according to a DFL, and PSSQ is a set of solutions to Q.

Given a query Q with respect to a logic program, the expansion of the I-and/or tree is carried out by means of the following five rules. Rules 1-3 specify how to build the tree and Rules 4 and 5 how to collect the PSSs.

Rule 1. The root node is A(∅, Q, PSSQ).

Rule 2. The children of a node A(σ, [q1, ..., qn], PSS) are computed as follows: if n = 0 it has no children; otherwise it has zero or more children associated with each qk, 1 ≤ k ≤ n. Let {qk1, ..., qkm} ⊂ {q1, ..., qn} be the ancestors of qk in the DFL. If m = 0, there is only the node O(∅, qk, PSSqk) associated with qk. If m ≥ 1 and there are no children corresponding to some qki, 1 ≤ i ≤ m, then there are no children for qk either. Otherwise, let O(θj, qki, PSSij), 1 ≤ j ≤ hi, be the children corresponding to each qki, 1 ≤ i ≤ m. Now, let us consider the set S given by the consistent joinings of solutions from the PSSs of the predecessor or nodes, that is,

S = {s : s = s1 ⋆ ··· ⋆ sm, s ≠ False, si = θjσ, σ ∈ PSSij, 1 ≤ j ≤ hi, 1 ≤ i ≤ m}

Then, for every subset of S given by the solutions of S with the same bindings in the variables of qk, that is, {s ∈ S : s ↓ var(qk) = θ}, there is a child process O(θ, qk, PSS). This process is generated to solve the subgoal qkθ.

In other words, given the set of solutions obtained for the conjunction of the literals {qk1, ..., qkm}, from each subset of solutions with the same binding for each of the variables consumed by qk, only one context for this literal is established, and then only one process is generated to solve the literal under this context. This is the key to avoiding the duplication of processes.
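The following Python sketch illustrates this grouping; the helper name `contexts` is illustrative, not taken from the paper. The solutions to the ancestor literals are partitioned by their projection onto var_cons(qk), and one or process would be generated per distinct projection.

    def contexts(solutions, var_cons):
        """Partition solutions by their bindings of the consumed variables."""
        groups = {}
        for s in solutions:
            theta = tuple(sorted((v, s[v]) for v in var_cons))  # s projected on var_cons(qk)
            groups.setdefault(theta, []).append(s)
        return groups

    # The three solutions to q(X, Y) of Example 1 collapse into two contexts
    # for p(X), which consumes only X; hence only two processes are spawned.
    S = [{"X": "b", "Y": "1"}, {"X": "b", "Y": "2"}, {"X": "c", "Y": "3"}]
    for theta, group in contexts(S, ["X"]).items():
        print(dict(theta), "serves", len(group), "compatible solution(s)")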


Rule 3. The children of a node O(θ, p, PSSp) are established as follows: for each clause of the program, H :- Q, such that mgu(H, pθ) = σ, there is an and child A(σ, Qσ, PSS). That is, for each clause of the program whose conclusion unifies with the subgoal of the or node, an and child labeled with the body of the clause instantiated with the corresponding mgu is generated. When the clause is a fact, as its body is empty, the label of the child and process is empty too.

Rule 4. The solutions to a node O(θ, p, PSSp) whose children are A(σi, qi, PSSqi), 1 ≤ i ≤ h, are computed as PSSp = ∪_{1 ≤ i ≤ h} {δ : δ = σiα ↓ var(pθ), α ∈ PSSqi}. That is, every solution to an and node produces a solution to its parent or node.

Rule 5. The set of solutions to a node A(σ, [q1, ..., qn], PSS) with children O(θj, qi, PSSij), 1 ≤ j ≤ hi, 1 ≤ i ≤ n, is PSS = {δ : δ = μ1 ⋆ ··· ⋆ μn, δ ≠ False, μi = θjα, α ∈ PSSij, 1 ≤ j ≤ hi, 1 ≤ i ≤ n} (α is a solution to the subgoal qiθj and hence μi is a solution to the literal qi). That is, every consistent composition of solutions, one from each literal of the body of a clause, is also a solution to the head. In the particular case of n = 0, as there are no or children, PSS = {∅}.

3.2. The Data Flow Lattice

As we pointed out in the previous section, to exploit IAP we have to determine a partial ordering among the literals of a query so that these literals may be evaluated without producing variable binding conflicts. In our model this partial ordering is codified into a directed graph called the Data Flow Lattice (DFL).

Given a query Q composed of the set of literals {p1, ..., pn}, a DFL for Q is a simple, acyclic, directed graph that represents a partial ordering among the literals of the query. Every literal of the query labels one of the nodes of the DFL, and the edge (pi, pj) states that the literal pj should be evaluated after the literal pi. We denote this precedence relation by >DFL and its reflexive and transitive closure by ≥*DFL. If q >DFL p, we say that q is a predecessor of p, and if q >*DFL p, we say that q is an ancestor of p. Given a literal p of the query Q, we denote the set of variables appearing in the arguments of p as var(p). And, considering the position of p in the DFL associated with the query, we establish the partition var(p) = var_cons(p) ∪ var_prod(p), where var_cons(p) = {X ∈ var(p) : ∃q >*DFL p, X ∈ var(q)} are the variables consumed by the literal p, and var_prod(p) = var(p) \ var_cons(p) are the variables produced by p.


Hence, when a process is generated for solving the literal p, every variable consumed by this literal is bound to some term. Moreover, multiple bindings to the consumed variables might exist. These bindings come from the solutions computed by a process previously generated for solving some literal q >*DFL p. Thus, every process generated for solving the literal p is created from one set of bindings to the variables consumed by p. This set of bindings actually establishes the context of the process. On the other hand, the variables produced by p are free and will be bound by the process generated to solve p. Therefore, a solution calculated by this process will actually represent one or more solutions to the conjunction of literals of the set {q : q ≥*DFL p}. From now on, this set will be denoted by P*p, and the difference set P*p \ P*q will be denoted by P*p-q.
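The partition of the variables can be computed directly from the graph. The following sketch does so for the DFL of Fig. 4b, assuming a representation of the DFL as an edge list; the names EDGES, VARS and ancestors are illustrative.

    # The DFL of Fig. 4b for the query q(X, Y), p(X, Z), r(Y, T), s(Z, T).
    EDGES = [("q", "p"), ("q", "r"), ("p", "s"), ("r", "s")]
    VARS = {"q": {"X", "Y"}, "p": {"X", "Z"}, "r": {"Y", "T"}, "s": {"Z", "T"}}

    def ancestors(lit):
        """Transitive closure of the predecessor relation >DFL."""
        preds = {a for a, b in EDGES if b == lit}
        return preds.union(*(ancestors(a) for a in preds)) if preds else set()

    def var_cons(lit):
        return {x for x in VARS[lit] if any(x in VARS[a] for a in ancestors(lit))}

    def var_prod(lit):
        return VARS[lit] - var_cons(lit)

    for lit in VARS:
        print(lit, "consumes", var_cons(lit), "produces", var_prod(lit))
    # q consumes nothing and produces {X, Y}; s consumes {Z, T} and produces nothing.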

As we have pointed out, to simplify the description of the process model, for the moment we consider that every process binds the produced variables to ground terms. Hence, the DFL represents an unconditional order for evaluating the literals under IAP, and it can be easily determined at compile time. Otherwise, a number of conditions concerning the dynamic instantiation of the variables should be tested at runtime in order to determine whether or not two or more literals can be evaluated in parallel. These conditions can be detected at compile time by means of abstract interpretation techniques, as we will indicate in Section 5.

In order to achieve an efficient strategy for joining partial solutions, we impose some additional properties on the DFL. These properties are not present in the structures used in other models for representing IAP. As is easy to envisage, some of them might limit the amount of parallelism that the DFL is able to represent. But, on the other hand, as we will see, these properties permit an efficient search over the PSN. The properties of the DFL are the following:

P1. It has degree two: every node has at most two incoming and two outgoing edges.

P2. There is a dependence of consumed variables: every variable consumed by a literal appears in the variables of some of its predecessors.

P3. It is a semilattice with respect to the infimum: it is an ordered set in which every pair of elements has an infimum. Hence, it has a minimum and, as it is a finite set, it can be proved that every pair of nodes with common ancestors has a supremum.

P4. It is a structured graph: if s is the supremum of two nodes p1 and p2 with a common successor, and the node q is such that q ≥*DFL s, then every path from q to p1 (or p2) includes the node s.


Fig. 9. Three different DFLs for the same query. Each one represents a different amount of parallelism from the others, as indicated by the evaluation functions: (a) ns = 2, wf = 0.5; (b) ns = 3, wf = 0.83; and (c) ns = 4, wf = 1.

Figure 9 shows three examples of DFLs. The label ``b'' denotes the null literal, and the corresponding nodes are null nodes. These nodes need to be included in order to guarantee the former properties. However, the null nodes do not influence the amount of parallelism expressed by the DFL because it is not necessary to generate processes to solve them. From the declarative point of view, a null literal can be considered a literal without any produced variables, whose consumed variables are those appearing in some of its immediate predecessors, and which is true for every instantiation of its variables.

To measure the amount of parallelism expressed by a partial ordering, several evaluation functions are commonly used: for instance, the number of steps (ns), a step being a subset of literals that can be evaluated in parallel after the evaluation of the literals in the previous steps. We have proposed the waiting factor (wf),(20) defined as

wf = Σ_{i=1..n} ancestors_of(pi) / Σ_{i=1..n-1} i

where n is the number of literals of the query, and ancestors_of(pi) is the number of non-null literals that are ancestors of the literal pi. The function wf measures how much the literals have to wait before being evaluated. Provided that we are interested in obtaining the maximum parallelism during the evaluation of a query, both functions, ns and wf, are good indicators of the quality of an ordering. As we can see, the value of wf is 1 when the DFL expresses that the literals have to be evaluated sequentially, and 0 when all of them are evaluated at the same time. Figure 9 shows the values of the evaluation functions for each of the DFLs. It is clear that the smaller the value of both functions, the better the ordering.
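Assuming the same edge-list representation of a DFL as in the previous sketch (here the DFL of Fig. 4b, which contains no null nodes; null nodes would in any case not count as ancestors), both evaluation functions can be computed as follows.

    EDGES = [("q", "p"), ("q", "r"), ("p", "s"), ("r", "s")]
    LITS = ["q", "p", "r", "s"]

    def ancestors(lit):
        preds = {a for a, b in EDGES if b == lit}
        return preds.union(*(ancestors(a) for a in preds)) if preds else set()

    def step(lit):
        """0-based index of the evaluation step a literal belongs to."""
        preds = [a for a, b in EDGES if b == lit]
        return 1 + max(map(step, preds)) if preds else 0

    ns = 1 + max(map(step, LITS))                 # number of steps
    wf = sum(len(ancestors(l)) for l in LITS) / sum(range(1, len(LITS)))
    print(ns, wf)                                 # 3 steps and wf = 5/6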


Fig. 10. (a) A DFL that does not satisfy property P2; (b) The solution node (Z/i) does not establish a consistent context for the literal r(X, Z); and (c) The inference from the node (Z/i) produces two solutions with different instantiations of the variable X.

The computation of the best ordering is an NP-hard problem, as proven by Delcher and Kasif,(21) so a heuristic strategy is necessary. Varela(20) proposed a strategy for determining a DFL from a query that uses some heuristics in order to minimize the values of the evaluation functions ns and wf. Moreover, Varela et al.(22) developed a generic algorithm that performs one part of the work: the computation of the literals belonging to each of the steps.

3.3. The Process and Solutions Net

We shall now formally define, by means of four rules, the structure of the PSN and the inference function. The first three rules specify how to assign links between solution nodes of a literal and process identifiers of a successor, and the fourth gives the definition of the inference function. This is a static description of the structure of the PSN as it results at the completion of the corresponding and process. To clarify how the PSN is built at run time, we show in Fig. 11 the code of both and and or processes.

Let q be a literal of the DFL whose consumed variables are {X1, ..., Xk}, and procq a process created from the context S = {X1/c1, ..., Xk/ck}. Then the process procq has one link to every element of a set of solution nodes denoted by C(procq), as indicated by the following rules. Given the relation between the substitution S and the set C(procq), we also call C(procq) the context of the process. We denote by SN(p) the set of solution nodes of the literal p.

Rule 1'. If q has no predecessors in the DFL (var_cons(q) = ∅), then procq is the only process associated with q. It has no links to any solution node (C(procq) = ∅). We say that the process has a null context.


Fig. 11. (a) A high level description of an and process; and (b) A high level description of an or process.


Rule 2'. If q has only one predecessor p in the DFL, then several processes associated with q might exist. For every process procq, C(procq) is the subset of SN(p) defined as

C(procq) = {s : s ∈ SN(p), S' ∈ INF(s), S' ↓ var_cons(q) = S}   (4.1)

In this case, the process has a single context.

Rule 3'. If q has two predecessors p1 and p2 in the DFL, there can also be one or more processes associated with q. Now, C(procq) is the subset of the cross product of SN(p1) and SN(p2) defined as

C(procq) = {(s1, s2) : s1 ∈ SN(p1), s2 ∈ SN(p2), S' ∈ INF(s1) ⋆ INF(s2), S' ↓ var_cons(q) = S}   (4.2)

In this case, the process has a double context.

Rule 4'. The inference function INF can be defined as follows, s being a solution node of a literal q computed by the process procq:

INF(s) = h(s), if C(procq) is null;

INF(s) = h(s) ⋆ ∪_{s' ∈ C(procq)} INF(s'), if C(procq) is single;

INF(s) = h(s) ⋆ ∪_{(s1, s2) ∈ C(procq)} [INF(s1) ⋆ INF(s2)], if C(procq) is double;

where h is the solution function: when applied to a solution node, it returns the corresponding partial solution. Here, the join of two sets of solutions, and the join of a solution and a set, are defined as natural extensions of the join of a pair of single solutions.
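A minimal sketch of INF follows, assuming a PSN encoded as solution nodes that carry their partial solution h(s) and the context of the process that produced them: None for a null context, a list of solution nodes for a single context, and a list of node pairs for a double context. The PSN encoded below is the state of Fig. 6b, at the moment the first answer to the query of Example 2 appears; the class and variable names are illustrative.

    def join(s1, s2):
        out = dict(s1)
        for v, t in s2.items():
            if out.setdefault(v, t) != t:
                return None                       # incompatible bindings (False)
        return out

    def join_sets(a, b):
        return [s for s1 in a for s2 in b if (s := join(s1, s2)) is not None]

    class Node:
        def __init__(self, h, context=None):
            self.h, self.context = h, context     # context of the producing process

    def INF(s):
        if s.context is None:                     # null context
            return [s.h]
        if isinstance(s.context[0], Node):        # single context
            below = [t for s1 in s.context for t in INF(s1)]
        else:                                     # double context (node pairs)
            below = [t for s1, s2 in s.context for t in join_sets(INF(s1), INF(s2))]
        return join_sets([s.h], below)

    sq = Node({"X": "d", "Y": "b"})               # from procq(X, Y), null context
    sp = Node({"Z": "j"}, context=[sq])           # from procp(d, Z)
    sr = Node({"T": "k"}, context=[sq])           # from procr(b, T)
    ss = Node({}, context=[(sp, sr)])             # from procs(j, k): the answer true
    print(INF(ss))        # one solution: X/d, Y/b, Z/j, T/k, as in Fig. 6c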

In order to guarantee the consistency of the former rules, we have to prove that in expressions (4.1) and (4.2) the projection S' ↓ var_cons(q) has the same value for every substitution S' from INF(s) in (4.1) and from INF(s1) ⋆ INF(s2) in (4.2). Otherwise, it might not be possible to generate a process to compute solutions compatible with those of INF(s) in (4.1) or INF(s1) ⋆ INF(s2) in (4.2). To clarify this situation, let us consider the example of Fig. 10b, where it is not possible to determine the process that has to be generated to compute solutions to r(X, Z) compatible with the solution node (Z/i). The reason is that the inference from (Z/i) produces two solutions with different instantiations of the variable X, which is consumed by r(X, Z), as we can see in Fig. 10c.


This problem appears because the graph of Fig. 10a does not have the property P2 of the DFLs: the variable X is consumed by the literal r(X, Z) and does not appear in its predecessor q(Y, Z). The proof of the consistency of both expressions (4.1) and (4.2) is given by Varela.(20)

3.4. The Dynamic Scenario

Now we are ready to describe the dynamic aspects of the interpretation model, that is, how the former structures are built in order to solve a query with respect to a given program. First, the DFL of the body of every clause is computed at compile time. When the restriction to ground bindings is assumed, the independence conditions can be obtained at compile time from the variable symbols alone, and these conditions can be expressed by means of an unconditional partial ordering represented by a DFL, as shown in Section 3.2. In Alonso et al.(23) we propose an effective algorithm to compute a DFL from a given query under the former restriction. On the other hand, if nonground bindings might be obtained, the independence conditions are more complex to compute and express at compile time, and hence some of this work should be delayed until runtime. In Section 5 we will discuss the way to confront this issue.

The remainder of the work is done at runtime. First, the top level query is codified into a DFL and an and process, the root of the I-and/or tree, is generated to solve this query. From then on, the remainder of the tree is expanded in such a way that and processes generate or processes to solve a subgoal of the query as soon as this subgoal is detected, that is, when a new context is detected for the corresponding literal by joining the partial solutions obtained so far for the remaining literals of the query. At the same time, or processes generate and processes to solve the bodies of the clauses whose heads unify with the subgoal of the or process. All of these and processes can be started in parallel, hence obtaining a full or parallel schema. Every process sends a new solution to its father process by means of a message as soon as this solution is obtained, and every incoming solution to a process is stored in a queue and processed as soon as possible. To manage a new solution, an and process should insert the solution into the PSN and then join it to the remaining partial solutions in order to obtain new solutions to the query and new contexts for the literals. An or process sends to the father and process a solution from each solution received from a descendant and process. Figure 11 shows the structure of both and and or processes.

The former process description is aimed at clarifying the process model. However, a number of improvements can be considered. Firstly, we can observe that an or process, once its and children have been generated, is merely a sender of the received messages.


It only reduces the size of a received solution, by projecting it on the variables of the subgoal, before sending the same solution to its father and process. Hence, the number of messages can be reduced if communications are rearranged so that an and process sends every solution to its grandparent and process instead of to its father or process. In this way, an or process completes just after generating its descendant and processes, thus reducing the number of active processes at a given time.

Another improvement can be achieved if we avoid the generation of and processes to solve the empty query. That is, when the subgoal of an or process unifies with a clause that is a single fact, the or process may send the corresponding mgu as a new solution to the father and process, instead of generating an and process labeled with the empty query, which would complete just after sending the empty solution, which in its turn should be sent to the father and process after being composed with the former mgu. These two optimizations will be taken into account in the design of a real implementation of the process model.

4. INFERENCE OPTIMIZATION

It is clear that the definition proposed in the previous section for the inference function does not suggest an efficient algorithm for joining the solutions spread over the PSN. This is because the evaluation of INF(s1) V INF(s2) requires a redundant search over the region of the PSN which is reachable from both solution nodes s1 and s2. Moreover, the substitutions of INF(s1) might have variables shared with those of INF(s2). Therefore, when computing the join INF(s1) V INF(s2), a consistency check among instantiations of the same variable from different substitutions is required. At first glance this seems paradoxical because, as we pointed out, this consistency checking is just what we tried to avoid by means of IAP. Hence, we clearly have two sources of inefficiency. On the other hand, we have to remark that the proposed definition of the inference function is consistent assuming only the properties P1 and P2 of the DFL. The actual reason to introduce the former definition is just to clarify what the inference has to return.

In this section, we will make use of the whole set of properties of the DFL in order to design an efficient algorithm for inference. The underlying idea will be to avoid a redundant search over the PSN and thus, at the same time, to avoid the complexity of joining partial solutions.

Let us consider the situation of Fig. 12, which represents the subgraph of a DFL including the subset P*p of nodes and the connections between them. The literal q is the supremum of the predecessors, p1 and p2, of the literal p.




Fig. 12. A fraction of an abstract DFL representing the node p and its ancestors, that is, P*p. The node p has two predecessors, p1 and p2. These nodes have common ancestors and hence they have a supremum. This supremum is the node q.

Recall that property P3 of the DFL ensures that every subset of nodes with common ancestors has a supremum, and that from property P4 of the DFL it follows that the only connection from P*q to P*p1-q (analogously P*p2-q) is the edge (q, q1) (analogously (q, q2)); hence the subsets P*p1-q, P*p2-q and P*q are pairwise disjoint. In this situation, if we have a solution Sq to the literals of P*q and a solution Sp1-q to the literals of P*p1-q (an analogous reasoning applies to a solution Sp2-q of P*p2-q), then these solutions will be compatible with each other if there are two solution nodes sq1 of SN(q1) and sq of SN(q) such that h(sq1) ⊆ Sp1-q, h(sq) ⊆ Sq, and the solution node sq belongs to the context of the process that computed the solution of the node sq1. Hence, in order to compute INF(sp1) V INF(sp2), with sp1 and sp2 belonging to SN(p1) and SN(p2) respectively, the partial solutions can be independently joined within each of the subsets P*p1-q, P*p2-q and P*q; and then the compatibility among them can be restricted to checks of solution nodes of the literal q against nodes of q1 and nodes of q2. This is formally proven from the whole set of properties required of the DFL.(20)

In order to make this new type of compatibility checking possible, we introduce a set of marks into the solutions computed in the inference process. These marks are ``produced'' by those solution nodes of the literals that are suprema of a pair of literals with a common successor in the DFL.




In the example of Fig. 12, a mark Msq is introduced into a solution if the solution node sq is reached when searching from the solution node sp1. Then, two solutions will be compatible when their respective sets of marks are. This new compatibility checking schema requires a new solution format, as well as a new join operator. Now, a solution is not only a substitution, but a pair (set_of_marks; substitution). And the new join, which we denote by ``VV,'' is defined as

(s_m_1; subs_1) VV (s_m_2; subs_2) = (s_m_1 ∪ s_m_2; subs_1 ∪ subs_2)

if the set of marks s_m_1 ∪ s_m_2 does not contain marks from two different solution nodes of the same literal. Otherwise, the join produces the value False.
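A minimal sketch of the VV operator, under the assumption (ours) that the set of marks is represented as a mapping from each marked literal to the solution node the mark comes from:

def join_vv(sol1, sol2):
    # sol = (marks, subst); marks maps a literal to the solution node of
    # that literal from which the solution descends. The join fails when
    # the solutions carry marks from two different solution nodes of the
    # same literal.
    marks1, subst1 = sol1
    marks2, subst2 = sol2
    for lit in marks1.keys() & marks2.keys():
        if marks1[lit] != marks2[lit]:
            return None                      # plays the role of False
    return ({**marks1, **marks2}, {**subst1, **subst2})

# Both solutions descend from the same solution node sq1 of q: compatible.
assert join_vv(({'q': 'sq1'}, {'X': 'a'}), ({'q': 'sq1'}, {'Y': 'b'})) == \
       ({'q': 'sq1'}, {'X': 'a', 'Y': 'b'})
# Marks from two different solution nodes of the same literal: incompatible.
assert join_vv(({'q': 'sq1'}, {}), ({'q': 'sq2'}, {})) is None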

From the new model for joining partial solutions, we define the new inference function, called predicate restricted inference and denoted by INF_P_R, as shown in Fig. 13. In this case, the qualification of restricted is due to the fact that the search is stopped when a solution node of a selected literal is reached, in order to avoid repetitive search. These selected literals are included in a set CP that is passed to the function as its second argument. Hence, the function is invoked as INF_P_R(sp, CP) to compute the set of solutions to the conjunction of the literals of the set ∪q∈CP P*p-q that can be obtained from the solution node sp. Every solution will include a mark from a solution node of each literal within the set

Fig. 13. Definition of the restricted inference function INF_P_R. sp is a solution node, CP is a set of literals, and C represents the context of the process that computed the solution represented by sp. A similar definition can be given for the function INF_S_R.



CP, indicating that the solution is compatible with the corresponding solution node. This new function is expected to be more efficient for joining partial solutions: as we can see from Fig. 13, no solution node of a solution graph is visited more than once during the inference process. Of course, we have to consider whether this improvement is outweighed by the cost of managing the marks introduced in the solutions. Varela et al.(24) and Varela(20) presented a number of experimental results showing that the INF_P_R function is more efficient than the function INF.

5. PARTIAL INSTANTIATIONS MANAGEMENT

In this section we indicate how to remove the simplification introduced in Section 3 about the groundness of the terms that a produced variable is instantiated to as a result of solving a subgoal. When a variable is instantiated to a nonground term, we say that the variable suffers a partial instantiation. Partial instantiations are a key issue in logic programming because in many applications complex structures are incrementally built by means of successive resolution steps. But, at the same time, partial instantiations complicate the management of IAP, because the partial ordering of the literals of a query can no longer be calculated at compile time. Now, a number of runtime checks are necessary in order to ensure that two or more subgoals are independent of each other and hence can be evaluated in parallel. These checks can be established at compile time by means of abstract interpretation techniques(25-31) and, as proven by Muthukumar and Hermenegildo,(29) they can be expressed by means of conditions involving the following two predicates (both are sketched in code after the list):

• ground(X): returns the value true when the variable X is bound to a ground term, and false otherwise,

• indep(X, Y): returns the value true when the variables X and Y are bound to terms that do not share any free variable, thus expressing that these terms are independent of each other, and false otherwise.
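Both predicates are straightforward once the terms bound to the variables are available at runtime. A minimal sketch, under a term representation of our choosing (variables as strings beginning with an uppercase letter, Prolog-style; compound terms as tuples):

def free_vars(term):
    # Collect the free variables of a term; variables are uppercase
    # strings, compound terms and argument lists are tuples.
    if isinstance(term, str):
        return {term} if term[:1].isupper() else set()
    return set().union(set(), *(free_vars(t) for t in term))

def ground(term):
    # ground(X): the term bound to X contains no free variable.
    return not free_vars(term)

def indep(t1, t2):
    # indep(X, Y): the terms bound to X and Y share no free variable.
    return free_vars(t1).isdisjoint(free_vars(t2))

# f(a, Z) is nonground, but f(a, Z) and g(Y) are independent.
assert not ground(('f', 'a', 'Z'))
assert indep(('f', 'a', 'Z'), ('g', 'Y'))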

We clarify how partial instantiations can be managed in our model by means of an example. Consider the logic program of Fig. 14a. If the top level query is h(X, Y, Z), provided that all three arguments are free variables distinct from each other, a compile time analysis can establish that the following two conditions hold:

• At the moment of evaluating the body of the first clause, all of the variables are free and independent of each other.




Fig. 14. (a) A logic program in which a variable can be partially instantiated; (b) a conditional DFL (CDFL) codifying the body of the first clause of the program; and (c) the PSN built from the former CDFL by the and process that solves the body of the first clause.

• After the evaluation of every literal of the body, every variable will be bound to a ground term, except, maybe, the variables of the literal b(Y, Z), which might be bound to nonground and dependent terms.

From the former information a conditional partial ordering can be established, as represented by the CDFL (Conditional DFL) of Fig. 14b. The CDFL is an extension of the DFL that actually codifies various partial orderings together with a number of conditions which allow the correct ordering for evaluating subgoals to be selected at runtime. It is similar to the structures used in other models, for example the CDJG(11) and the CGE.(3)

The CDFL of Fig. 14b expresses that the subgoals a(X) and b(Y, Z) can be unconditionally evaluated in parallel. But the order of evaluation of the literals p(X, Y) and q(Y, Z) must be established at runtime from the groundness of the variable Y in the solutions of the literal b(Y, Z). When the variable Y is bound to a ground term, the subgoals are independent of each other and hence they can be evaluated in parallel; but if the variable Y is bound to a nonground term, the subgoals are mutually dependent and therefore they must be evaluated sequentially.
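The runtime test attached to this conditional ordering is a direct application of groundness checking. A sketch, reusing the term representation of the code in Section 5 (the groundness test is inlined so the fragment stands alone; the scheduling callbacks are hypothetical):

def is_ground(term):
    # Minimal groundness test: variables are uppercase strings (Prolog
    # convention), compound terms are tuples.
    if isinstance(term, str):
        return not term[:1].isupper()
    return all(is_ground(t) for t in term)

def evaluate_p_and_q(y_term, run_parallel, run_sequential):
    # The condition of the CDFL of Fig. 14b: p(X, Y) and q(Y, Z) may be
    # launched in parallel only when Y is bound to a ground term.
    if is_ground(y_term):
        run_parallel('p(X, Y)', 'q(Y, Z)')
    else:
        run_sequential('p(X, Y)', 'q(Y, Z)')

evaluate_p_and_q('c1', print, print)          # Y/c1 is ground: parallel
evaluate_p_and_q(('f', 'Z'), print, print)    # Y/f(Z) is not: sequential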



On the other hand, the PSN results as shown in Fig. 14c. As we can see, the subgoals p(C1, C2) and q(C2, C3) generated from the solution (Y/C1, Z/C3) of the literal b(Y, Z) were evaluated in parallel, whereas the subgoals generated from the solution (Y/f(Z)) were evaluated sequentially to guarantee the independence conditions.

6. AN APPLICATION: THE JOB SHOP SCHEDULING CONSTRAINT SATISFACTION PROBLEM

First, we introduce the JSS problem considered in this paper. The job shop requires scheduling a set of jobs {J1,..., Jn} on a set of physical resources {R1,..., Rq}. Each job Ji consists of a set of tasks {ti1,..., timi} to be sequentially scheduled, and each task has a single resource requirement. We assume that there are a release date for all jobs and a due date between which all the tasks have to be performed. Each task has a fixed duration duij and a start time stij whose value has to be selected. The domain of possible start times of the tasks is initially constrained by the release and due dates.

Therefore, there are two classes of nonunary constraints in the problem: precedence constraints and capacity constraints. Precedence constraints, defined by the sequential routings of the tasks within a job, translate into linear inequalities of the type stil + duil ≤ stik (i.e., stil before stik). Capacity constraints, which restrict the use of each resource to only one task at a time, translate into disjunctive constraints of the form stil + duil ≤ stjk ∨ stjk + dujk ≤ stil (i.e., two tasks that use the same resource cannot overlap).

The objective is to come up with a feasible solution as fast as possible, a solution being a vector of start times, one for each task, such that, starting at these times, all the tasks end without exceeding the due date and all the constraints are satisfied.
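As a concrete reading of these definitions, the following sketch (data layout ours) checks a vector of start times against the precedence, capacity and due date constraints; the asserted instance is the first answer reported later in Table IV for the problem of Fig. 15.

def feasible(starts, dur, jobs, resource, due_date):
    # jobs: task lists in routing order; resource: task -> resource name.
    for job in jobs:                              # precedence constraints
        for t1, t2 in zip(job, job[1:]):
            if starts[t1] + dur[t1] > starts[t2]:
                return False
    tasks = list(starts)
    for i, t1 in enumerate(tasks):                # capacity constraints
        for t2 in tasks[i + 1:]:
            if resource[t1] == resource[t2]:
                if not (starts[t1] + dur[t1] <= starts[t2] or
                        starts[t2] + dur[t2] <= starts[t1]):
                    return False                  # overlap on a resource
    return all(starts[t] + dur[t] <= due_date for t in tasks)

starts = {'t11': 4, 't12': 6, 't13': 8, 't21': 0, 't22': 2,
          't23': 6, 't31': 0, 't32': 3}           # first answer of Table IV
dur = {t: 3 if t in ('t31', 't32') else 2 for t in starts}
jobs = [['t11', 't12', 't13'], ['t21', 't22', 't23'], ['t31', 't32']]
resource = {'t11': 'R1', 't31': 'R1', 't12': 'R2', 't21': 'R2',
            't13': 'R3', 't23': 'R3', 't32': 'R3', 't22': 'R4'}
assert feasible(starts, dur, jobs, resource, 10)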

None of these simplifying assumptions is required by the approach that will be discussed: jobs may have different release and due dates, and tasks within a job can have different durations, several resource requirements, and several alternatives for each of these requirements.

Figure 15 depicts an example with three jobs {J1, J2, J3} and four physical resources {R1, R2, R3, R4}. It is assumed that the tasks of the first two jobs have durations of two time units, whereas the tasks of the third one have durations of three time units. The release time is 0 and the due date is 10 for every job. Label Pi represents a precedence constraint and label Cj represents a capacity constraint. Start time values constrained by the release and due dates and by the durations of the tasks are represented as intervals. For instance, [0, 4] represents all start times between time 0 and time 4, as allowed by the time granularity, namely {0, 1, 2, 3, 4}.




Fig. 15. An instance of a JSS problem. Each box represents a task together with its resource requirement and its duration. Arcs represent sequential constraints and edges represent capacity constraints. Lower and upper bounds of the start times are also represented for every task in between brackets.

6.1. The Logic Programming Approach to JSS Problems

The JSS problem, as well as many other CSPs with finite domains, has been formulated in a variety of logical settings: First Order Predicate Calculus, Prolog, Datalog and Constraint Logic Programming (CLP) among others, CLP perhaps being the most common approach.(6, 12, 32-37)

Here we propose a representation of the problem in a naive logic language and show how the logic programs can be obtained in order to exploit parallelism.

We define for each task a single literal having a solution for each of the possible start times. So, for instance, the start time domain of the task t11 will be defined by the ground literals t11(0), t11(1), t11(2), t11(3), t11(4). Moreover, every constraint will be defined by means of binary relations: one binary relation for every precedence constraint and two binary relations for every capacity constraint, for instance

P1(X11, X12) :- t11(X11), t12(X12), X12 ≥ X11 + du11

C1(X11, X31) :- t11(X11), t31(X31), X31 ≥ X11 + du11

C1(X11, X31) :- t11(X11), t31(X31), X11 ≥ X31 + du31

It is clear that an instantiation of the whole set of variables appearing in the constraints that makes every constraint true represents a solution of the problem. It is also clear that, in general, there is a large number of variables




shared by two or more literals, and that every constraint literal has a lot of solutions, so it makes sense to organize the evaluation of the constraint literals under IAP. In order to do that, in this work we propose a strategy consisting of two steps: firstly, an independent constraint tree is computed from the constraint dependency graph; and then, from the independent constraint tree, a logic program that can be annotated for evaluation under IAP is determined. An independent constraint tree is a tree representing a partial ordering for the evaluation of the whole set of constraint literals under IAP; and the constraint dependency graph is an undirected graph representing the variable dependencies among the constraint literals. Figure 16a depicts the constraint dependency graph for the constraints of the problem of Fig. 15. The independent constraint tree is not unique for a given constraint dependency graph; Fig. 16b shows a possible independent constraint tree determined from the graph of Fig. 16a.

In order to compute an independent constraint tree from a constraint dependency graph, Puente et al.(38) proposed an algorithm with several nondeterministic actions that have to be resolved by means of heuristics. The algorithm first tries to obtain a set, in principle as large as possible, of independent constraints; these constraints are the leaves of the tree. Then, a search for constraints depending on only one or two already computed nodes is repeated in order to determine the remaining nodes of the tree.

Then, from the independent constraint tree a logic program is determined. This program is not unique for a given tree: distinct programs can be derived with different clause sizes, thus giving rise to different granularity levels of the processes generated during program evaluation. For instance, from the tree depicted in Fig. 16b at least two such programs can be determined (the facts and the constraint relations are not represented).
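As an indication of the granularity trade-off, Program-2, whose evaluation is analyzed in Section 6.4, contains clauses that group several constraint literals into a single body, such as

p1p3c2(X11, X12, X21, X22) :- p1(X11, X12), p3(X21, X22), c2(X21, X12).

so that a single and process manages the three subgoals of the body at once.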

Fig. 16. (a) The constraint graph for the problem of Fig. 15; and (b) a constraint tree built from the constraint graph of Fig. 16a.




6.2. Variable and Value Ordering Heuristics for the JSS Problem

As we have pointed out in the Introduction, one of the original contributions of this work is the utilization of heuristic information in the construction stage of logic programs for solving JSS problems. Our purpose is to incorporate the variable and value ordering heuristics proposed by Sadeh and Fox.(16) These heuristics are based on a probabilistic model of the search space. A probabilistic framework is introduced that accounts for the chance that a given value will be assigned to a variable and the chances that values assigned to different variables conflict with each other.

The heuristics are evaluated from the profile demands of the tasks for the resources. In particular, the individual demand and the aggregate demand values are considered. The individual demand Dij(Rp, T) of a task tij for a resource Rp at time T is computed simply by adding the probabilities σij(τ) that the resource Rp is demanded by the task tij at some time τ within the interval [T - duij + 1, T]. The individual demand is an estimation of the reliance of a task on the availability of a resource. Consider, for example, the initial search state of the problem depicted in Fig. 15. As the task t12 has five possible start times or reservations, and assuming that there is no reason to believe that one reservation is more likely to be selected than another, each reservation is assigned an equal probability of being selected, in this case 1/5. Given that the task t12 has a duration of 2 time units, this task will demand the resource R2 at time 4 if its start time is either 3 or 4. So, the individual demand of the task t12 for resource R2 in the interval 4 ≤ t < 5 is estimated as D12(R2, t) = σ12(3) + σ12(4) = 2/5. On the other hand, the aggregate demand Daggr(R, τ) for a resource is obtained by adding the individual demands of all tasks over time. Table I shows the individual demands of all eight tasks of the problem, as well as the aggregate demands for all four resources.

From the aggregate demand of a resource a contention peak is identified. This is an interval of the aggregate demand, of duration equal to the average duration of all the tasks, with the highest demand. Table I shows

Table I. Individual and Aggregate Demands over the Time Intervals of the Tasks and Resources^a

Interv.        0    1    2    3    4    5    6    7    8    9    10

D11(R1, T)    0.2  0.4  0.4  0.4  0.4  0.2
D31(R1, T)    0.2  0.4  0.6  0.6  0.4  0.2  0.2
Daggr(R1, T)  0.4  0.8  1    1    0.8  0.4  0.2
D12(R2, T)              0.2  0.4  0.4  0.4  0.4  0.2
D21(R2, T)    0.2  0.4  0.4  0.4  0.4  0.2
Daggr(R2, T)  0.2  0.4  0.6  0.8  0.8  0.6  0.4  0.2
D13(R3, T)                        0.2  0.4  0.4  0.4  0.4  0.2
D23(R3, T)                        0.2  0.4  0.4  0.4  0.4  0.2
D32(R3, T)                   0.2  0.4  0.6  0.6  0.6  0.4  0.2
Daggr(R3, T)                 0.2  0.8  1.4  1.4  1.4  1.2  0.6
D22(R4, T)              0.2  0.4  0.4  0.4  0.2
Daggr(R4, T)            0.2  0.4  0.4  0.4  0.2

^a The problem is given in Fig. 15.



the contention peaks of all four resources. Then, the task with the largest contribution to the contention peak of a resource is determined to be the most critical, and therefore it is selected first for reservation. This is the variable ordering heuristic referred to as ORR (Operation Resource Reliance).(16) This heuristic can be introduced in the construction of the logic programs by forcing those constraints involving tasks with a large contribution to the corresponding contention peaks to be evaluated first, as we will see in Section 6.4.

On the other hand, the value ordering heuristic proposed by Sadeh and Fox(16) is also computed from the profile demands for the resources. Given a task tij that demands the resource Rp, the heuristic consists of estimating the survivability of the reservations. The survivability of a reservation (stij = T) is the probability that the reservation will not conflict with the resource requirements of other tasks, that is, the probability that none of the other tasks requires the resource during the interval [T, T + duij - 1]. When the task demands only one resource, this probability can be estimated as(16)

(1 - AVG(Daggr(Rp, τ) - Dij(Rp, τ)) / AVG(np(τ) - 1)) ^ (AVG(np(τ) - 1) × duij / AVG(du))

where du stands for the average duration of the tasks, np(τ) is the number of tasks that can demand the resource Rp at time τ, and AVG(f(τ)) represents the average value of the function f(τ) over the interval [T, T + duij - 1]. Table II shows the survivabilities of all the possible reservations for all eight tasks of the problem.
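Under this reading of the formula (the exponent AVG(np(τ) - 1) × duij / AVG(du) is our reconstruction of the printed expression), the estimate can be coded directly. With the demands of Table I and the average task duration (6 × 2 + 2 × 3)/8 = 2.25, the sketch below reproduces the survivability 0.73 of the reservation st11 = 0 in Table II.

def survivability(T, du_ij, d_aggr, d_ij, n_p, avg_du):
    # Survivability of the reservation st_ij = T. d_aggr, d_ij and n_p
    # map a time t to the aggregate demand, the task's own demand, and
    # the number of tasks that can demand the resource at t; averages
    # are taken over [T, T + du_ij - 1].
    ts = range(T, T + du_ij)
    avg = lambda f: sum(f(t) for t in ts) / len(ts)
    others = avg(lambda t: d_aggr(t) - d_ij(t))   # demand by other tasks
    rivals = avg(lambda t: n_p(t) - 1)            # competing tasks
    if rivals == 0:
        return 1.0                                # sole user of the resource
    return (1 - others / rivals) ** (rivals * du_ij / avg_du)

# Reservation st11 = 0 on R1: t11 and t31 compete (n_p = 2), demands
# taken from Table I; the result rounds to the 0.73 of Table II.
s = survivability(0, 2, {0: 0.4, 1: 0.8}.get, {0: 0.2, 1: 0.4}.get,
                  lambda t: 2, 2.25)
assert round(s, 2) == 0.73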

As seems clear, the value ordering heuristic consists of trying first the reservations with large survivability values. This heuristic information can easily be introduced in the construction of the logic programs for the JSS

Table II. Survivabilities of the Reservations of All Eight Tasks^a

Interv.   0     1     2     3     4     5     6     7     8     9    10

t11      0.73  0.54  0.44  0.54  0.73
t12                  0.63  0.63  0.73  0.95  1
t13                              0.41  0.3   0.3   0.35  0.53
t21      1     0.95  0.73  0.63  0.63
t22                  1     1     1     1     1
t23                              0.41  0.3   0.3   0.35  0.53
t31      0.59  0.51  0.51  0.59  0.73
t32                        0.71  0.35  0.26  0.26  0.35

^a In the initial state of the problem given in Fig. 15.



problem by declaring the start time ground instances of every task ordered from larger to smaller survivability values, these values being calculated in the initial state of the search, that is, when no task has been assigned a reservation yet. Hence, during the program evaluation, reservations with high survivability will be tried first, under the assumption that they are more likely to be present in a solution of the whole problem.

For an in depth study of these heuristics, as well as of further refinements, we refer the interested reader to Sadeh and Fox.(16)

6.3. The Simulator Tool

In order to evaluate logic programs and to study their performance, we have developed a simulator of our interpretation model that emulates the evolution of the set of processes generated on an arbitrary number of processors. After evaluating a query with respect to a logic program, the simulator can lay out the process tree and the Gantt chart of the processes generated for solving the query. This information permits studying the amount of parallelism that is exploited during the evaluation of the programs, and so it allows us to evaluate the quality of the logic programs from the point of view of their parallel evaluation.

The current release of the simulator is built on the KappaPC 2.3 object oriented environment, and so it has some limitations, mainly due to the limited number of active instances that the tool can manage at a given time. As a consequence, the simulator often spends an unacceptable amount of time when it evaluates very big programs. So, in order to reduce the number of instances generated during a simulation session, we introduce the following transformation in logic programs for solving JSS problems. Instead of defining each of the tasks by means of a relation with as many ground instances as possible reservations, we define a relation for each of the constraints having one ground instance for each compatible join of solutions of the pair of tasks involved. For example, as we had the clause P1(X11, X12) :- t11(X11), t12(X12), X12 ≥ X11 + du11, and the solutions t11(1) and t12(3) are compatible with each other, we declare the ground instance P1(1, 3). The ordering of these ground instances will be established from the value ordering heuristics, as shown in the next section.
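Precomputing these ground instances amounts to enumerating the compatible pairs of reservations; a minimal sketch (names ours):

def constraint_facts(dom1, dom2, compatible):
    # One ground instance per compatible pair of reservations of the two
    # tasks involved in a binary constraint.
    return [(v1, v2) for v1 in dom1 for v2 in dom2 if compatible(v1, v2)]

# P1(X11, X12) :- t11(X11), t12(X12), X12 >= X11 + du11, with du11 = 2:
# t11(1) and t12(3) are compatible, hence the ground instance P1(1, 3).
p1 = constraint_facts(range(0, 5), range(2, 7), lambda a, b: b >= a + 2)
assert (1, 3) in p1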

6.4. Heuristic Programs

Now we show how the information provided by the variable and valueordering heuristics can be used to design the logic programs. Firstly, thevalue ordering heuristic is used to declare the ground instances of every



constraint in an order according to the product of the survivabilities of the tasks involved. For example, the ground instance P1(1, 3), which involves the solutions t11(1) and t12(3), having survivabilities 0.54 and 0.63 respectively, as shown in Table II, is declared with a survivability of 0.54 × 0.63 = 0.34. Therefore, as we pointed out in Section 6.3, ground instances with high survivabilities are tried first under the assumption that they are likely to contribute to a solution of the whole problem. Table III shows the ground instances corresponding to all ten constraints, which are declared in the order established by their survivabilities. As we can see in Table III, the heuristic strategy is successful for a number of constraints, in particular for P5, C4, P2, C2, and P3, because it places ground instances that take part in a solution to the problem at the beginning of the relation; whereas in other cases it clearly fails, as for the constraints C1 and P4, where the best real values are displaced towards the end of the relation.

At the same time, the information from the variable ordering heuristic is exploited as follows. As we pointed out in Section 6.2, what this heuristic suggests is to select a reservation for the most critical task before the others, the most critical task being the task with the largest contribution to the contention peak of the most contended resource. The underlying idea is that in this way the start time domain of this task is not pruned by the reservations of other tasks, and hence there is a chance to select the best reservation for the critical task. We have translated this idea into our approach in such a way that, during the construction of a constraint tree from the constraint graph of the problem, we try to place constraints involving highly critical tasks as leaves of the tree. We have done that because, as we can observe, the constraints at the leaves are evaluated before the others. In fact, these constraints are the producers of values for their variables, whereas, in general, the remaining constraints are consumers and hence act merely as filters of the values produced at the leaves. And this happens independently of the logic program codified from the constraint tree.

In order to establish a ranking among the constraints we propose the following strategy. The idea is to assign to every constraint a priority that is proportional to the contention peaks of the resources involved as well as to the contribution of the tasks to these peaks. For example, constraint P5 involves tasks t31 and t32, which demand the resources R1 and R3 respectively, and the largest values of the aggregate demands at the contention peaks of these resources are 1 and 1.4 respectively, as shown in Table I. Taking into account that the contribution of task t31 to the contention peak of resource R1 is 0.6, and that the contribution of task t32 to the corresponding contention peak is also 0.6, we associate with the



Table III. Survivabilities of the Constraint Ground Terms, Computed as the Product of the Survivabilities of the Reservations Involved^a

P5(X31, X32), priority 1.44:
p5(0, 3). 0.42   p5(4, 7). 0.26   p5(0, 7). 0.21   p5(3, 7). 0.21   p5(0, 4). 0.21
p5(2, 7). 0.18   p5(1, 4). 0.18   p5(1, 7). 0.18   p5(3, 6). 0.15   p5(0, 5). 0.15
p5(0, 6). 0.15   p5(1, 6). 0.13   p5(1, 5). 0.13   p5(2, 5). 0.13   p5(2, 6). 0.13

C4(X13, X32), priority 1.4:
c4(8, 3). 0.38   c4(7, 3). 0.25   c4(6, 3). 0.21   c4(8, 4). 0.19   c4(4, 7). 0.14
c4(8, 5). 0.14   c4(7, 4). 0.12   c4(4, 6). 0.11   c4(5, 7). 0.11

C5(X32, X23), priority 1.4:
c5(3, 8). 0.38   c5(3, 7). 0.25   c5(3, 6). 0.21   c5(4, 8). 0.19   c5(7, 4). 0.14
c5(5, 8). 0.14   c5(4, 7). 0.12   c5(6, 4). 0.11   c5(7, 5). 0.11

C3(X13, X23), priority 1.12:
c3(4, 8). 0.22   c3(8, 4). 0.22   c3(6, 8). 0.16   c3(5, 8). 0.16   c3(8, 5). 0.16
c3(8, 6). 0.16   c3(7, 4). 0.14   c3(4, 7). 0.14   c3(6, 4). 0.12   c3(4, 6). 0.12
c3(7, 5). 0.11   c3(5, 7). 0.11

C1(X11, X31), priority 0.96:
c1(0, 4). 0.53   c1(0, 3). 0.43   c1(1, 4). 0.39   c1(0, 2). 0.37   c1(2, 4). 0.32
c1(1, 3). 0.32   c1(4, 0). 0.32   c1(4, 1). 0.28   c1(3, 0). 0.26

P2(X12, X13), priority 0.88:
p2(6, 8). 0.53   p2(5, 8). 0.50   p2(4, 8). 0.39   p2(3, 8). 0.33   p2(2, 8). 0.33
p2(5, 7). 0.33   p2(2, 4). 0.26   p2(4, 7). 0.26   p2(2, 7). 0.22   p2(3, 7). 0.22
p2(4, 6). 0.22   p2(2, 5). 0.19   p2(2, 6). 0.19   p2(3, 5). 0.19   p2(3, 6). 0.19

P1(X11, X12), priority 0.72:
p1(0, 6). 0.73   p1(0, 5). 0.69   p1(1, 6). 0.54   p1(4, 6). 0.54   p1(0, 4). 0.53
p1(1, 5). 0.51   p1(0, 2). 0.46   p1(0, 3). 0.46   p1(3, 6). 0.44   p1(2, 6). 0.44
p1(2, 5). 0.42   p1(3, 5). 0.42   p1(1, 4). 0.39   p1(1, 3). 0.34   p1(2, 4). 0.32

P4(X22, X23), priority 0.72:
p4(2, 8). 0.53   p4(3, 8). 0.53   p4(4, 8). 0.53   p4(5, 8). 0.53   p4(6, 8). 0.53
p4(2, 4). 0.41   p4(3, 7). 0.35   p4(5, 7). 0.35   p4(2, 7). 0.35   p4(4, 7). 0.35
p4(3, 5). 0.30   p4(3, 6). 0.30   p4(2, 5). 0.30   p4(2, 6). 0.30   p4(4, 6). 0.30

C2(X21, X12), priority 0.64:
c2(0, 6). 1.00   c2(0, 5). 0.95   c2(1, 6). 0.95   c2(1, 5). 0.90   c2(0, 4). 0.73
c2(2, 6). 0.73   c2(2, 5). 0.69   c2(1, 4). 0.69   c2(0, 2). 0.63   c2(0, 3). 0.63
c2(3, 6). 0.63   c2(4, 6). 0.63   c2(1, 3). 0.60   c2(3, 5). 0.60   c2(2, 4). 0.53
c2(4, 2). 0.40

P3(X21, X22), priority 0.48:
p3(0, 2). 1.00   p3(0, 3). 1.00   p3(0, 4). 1.00   p3(0, 5). 1.00   p3(0, 6). 1.00
p3(1, 3). 0.95   p3(1, 4). 0.95   p3(1, 5). 0.95   p3(1, 6). 0.95   p3(2, 4). 0.73
p3(2, 5). 0.73   p3(2, 6). 0.73   p3(3, 5). 0.63   p3(3, 6). 0.63   p3(4, 6). 0.63

^a The boldface ground instances correspond to the values involved in a solution. The constraint literals are ordered according to their priorities, as established from the variable ordering heuristic. The boldface literals are those selected as leaves of the constraint tree, as shown in Fig. 16b.



constraint P5 the priority 0.6 × 1 + 0.6 × 1.4 = 1.44. Table III shows the priorities of all ten constraints of our problem.

Finally, the former priorities are used together with the dependency relations among the constraints expressed by the constraint graph in order to establish the constraints to be placed as leaves of the constraint tree. The idea is, on one hand, to select as large a number of independent constraints as possible in order to obtain the maximum possible parallelism (this is an NP-hard problem); and, on the other hand, to select critical constraints, as we have pointed out in previous paragraphs. Hence we propose the following strategy:

• the constraints are visited from high to low priority and each one is selected to label a leaf if and only if it is independent of every previously selected constraint.

When this strategy is applied to the constraint graph of Fig. 16a, taking into account the constraint priorities shown in Table III, the constraints selected are those appearing as leaves in the constraint tree of Fig. 16b. These constraints also appear boldfaced in Table III. A sketch of the ranking and the greedy selection follows.
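The following sketch covers both steps; the data layout and names are ours, and the dependency test is passed in as a function.

def priority(contributions):
    # Priority of a constraint: for each of its tasks, multiply the
    # task's contribution to the contention peak by the height of the
    # peak of the demanded resource, and add up. P5: 0.6*1 + 0.6*1.4.
    return sum(contrib * peak for contrib, peak in contributions)

def select_leaves(prioritized, dependent):
    # Greedy strategy of the text: visit the constraints from high to
    # low priority; keep one as a leaf iff it is independent of (shares
    # no variable with) every constraint already selected.
    leaves = []
    for c, _ in sorted(prioritized, key=lambda cp: cp[1], reverse=True):
        if all(not dependent(c, chosen) for chosen in leaves):
            leaves.append(c)
    return leaves

assert abs(priority([(0.6, 1.0), (0.6, 1.4)]) - 1.44) < 1e-9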

The logic program constructed in this way ensures, if the number of processors is big enough, that the constraints at the leaves of the tree are all evaluated in parallel and that their values are exploited in the order in which they are declared. As the former strategy places as many of the most critical constraints as possible at the leaves of the constraint tree, the system first tries the most promising values of a number of the most critical constraints. These values are further filtered by the remaining constraints according to the order in which the constraints appear in the tree. Therefore, it is expected that the system returns the first solution early, while the remaining solutions can still be calculated at a later time. Now, let us assume that Program-2 is codified from the constraint tree of Fig. 16b and consider how computations are carried out to evaluate the query all(X11, X12, X13, X21, X22, X23, X31, X32). For the sake of simplicity we concentrate only on the constraints P1, P3, and C2, involved in the clause p1p3c2(X11, X12, X21, X22) :- p1(X11, X12), p3(X21, X22), c2(X21, X12), and assume that the DFL generated to codify the body of this clause is the one of Fig. 17a. Hence, the evaluation of the body of the former clause proceeds as follows: an and process is generated to solve this query, which in principle generates two or processes to evaluate the constraints p1(X11, X12) and p3(X21, X22) respectively. These or processes will send to the and process the solutions of the constraints in the order in which the corresponding ground instances are declared. Therefore, the first answers received at the and process are (X11/0, X12/6) and (X21/0, X22/2), coming from the processes procp1(X11, X12) and procp3(X21, X22) respectively. Then the and




Fig. 17. (a) A DFL for the query p1(X11, X12), p3(X21, X22), c2(X21, X12); (b) the PSN after receiving and joining the first solutions of the constraints p1(X11, X12) and p3(X21, X22); and (c) the PSN after the reception of the second answer from the constraint p1(X11, X12). A new process is not necessary to search for compatible solutions of the constraint c2(X21, X12).

process joins these solutions and generates the or process procc2(0, 6). This is the situation depicted in the PSN of Fig. 17b, in which all three or processes run in parallel due to the exploitation of producer-consumer parallelism. That is, while the processes procp1(X11, X12) and procp3(X21, X22) are still working to send their second answers, the subgoal c2(0, 6) is already being evaluated. Moreover, when the second solution from the process procp3(X21, X22) arrives, this solution is joined to the solution of the constraint p1(X11, X12); but in this case a new process to search for compatible solutions of c2(X21, X12) is not required, because this joining does not define a new context for this literal, as shown in Fig. 17c.

As we have shown, it is possible to codify logic programs to solve JSS problems that express all three classes of parallelism our model can exploit. Moreover, when these programs are evaluated, in general many or processes are prevented from duplication due to the ability of our model to collect multiple partial solutions into the context of the same process. On the other hand, the logic programs can be improved with the help of heuristics so that a number of the most critical constraints are evaluated early and their most promising solutions are likely to be tried first.

6.5. Experimental Results

In this section we include a number of experimental results showing, on one hand, the amount of parallelism that can be exploited in solving JSS problems; and, on the other hand, the speedup that the variable and value ordering heuristics produce in obtaining answers.

As we have shown in the previous section, a logic program can be automatically codified to solve a JSS problem that exhibits all three




Fig. 18. Gantt charts from two simulations solving the same problem on three processors: (a) with producer-consumer parallelism (the time of answer is 855 units); and (b) without producer-consumer parallelism (the time of answer is 6244 units).

classes of parallelism our model is able to exploit, namely or parallelism, independent and parallelism, and producer-consumer parallelism. The importance of the first two was widely proclaimed in the literature, hence we start by showing an example to make clear the improvement introduced by producer-consumer parallelism. To do so, we consider a simple example: Program-2, but including only four ground instances for each of the ten constraints of Table III, so that only one solution to the query exists from these ground instances. Figure 18 depicts the Gantt charts of the processes generated to solve the query all(X11, X12, X13, X21, X22, X23, X31, X32) when the producer-consumer parallelism is exploited (Fig. 18a) and when it is not (Fig. 18b). As we can observe, in the first case there is a more uniform use of the processors than in the second; as a consequence, the time of answer is lower in the first case.

Now, in order to clarify the importance of the ordering among the ground instances of the constraint literals, we consider the whole program, that is, Program-2 and all the ground instances of Table III. We first simulate the evaluation of the program keeping the ordering among the ground instances shown in Table III, and then with the inverse ordering. Table IV shows the arrival times of all 18 answers, as well as the mean values. As we can see, when the ground instances are ordered according to the value ordering heuristic, the time of the first answers is much lower than the corresponding time when the inverse ordering is considered, as was expected.

Finally, in Fig. 19 we present some results showing the speedup obtained as the number of processors grows. In every case, we consider Program-2 with all ground instances ordered by the heuristic, as shown in Table III. Here, we have to take into account that the speedup measures refer to the time of the first answer as well as to the mean values over all 18 answers; this justifies speedup values greater than 1. In all of the experiments we have simulated a processor scheduling policy based on process priorities that are computed only from the waiting time of processes in the ready to run state.




Table IV. Results of the Simulation of Program-2 on 4 Processors

Answers to the query all(X11, X12, X13, X21, X22, X23, X31, X32)   Heuristic ordering   Inverse ordering

[X11/4, X12/6, X13/8, X21/0, X22/2, X23/6, X31/0, X32/3]              4542                279873
[X11/4, X12/6, X13/8, X21/0, X22/3, X23/6, X31/0, X32/3]              5358                232000
[X11/3, X12/6, X13/8, X21/0, X22/2, X23/6, X31/0, X32/3]              7998                263182
[X11/3, X12/6, X13/8, X21/0, X22/3, X23/6, X31/0, X32/3]              8775                218288
[X11/3, X12/5, X13/8, X21/0, X22/2, X23/6, X31/0, X32/3]             12922                253609
[X11/3, X12/5, X13/8, X21/0, X22/3, X23/6, X31/0, X32/3]             13871                210598
[X11/4, X12/6, X13/8, X21/0, X22/4, X23/6, X31/0, X32/3]             25272                195292
[X11/3, X12/6, X13/8, X21/0, X22/4, X23/6, X31/0, X32/3]             32384                184719
[X11/3, X12/5, X13/8, X21/0, X22/4, X23/6, X31/0, X32/3]             37106                179484
[X11/4, X12/6, X13/8, X21/1, X22/3, X23/6, X31/0, X32/3]            110452                121825
[X11/3, X12/6, X13/8, X21/1, X22/3, X23/6, X31/0, X32/3]            111531                112500
[X11/3, X12/5, X13/8, X21/1, X22/3, X23/6, X31/0, X32/3]            116724                105895
[X11/4, X12/6, X13/8, X21/1, X22/4, X23/6, X31/0, X32/3]            129241                 92451
[X11/3, X12/6, X13/8, X21/1, X22/4, X23/6, X31/0, X32/3]            136279                 86145
[X11/3, X12/5, X13/8, X21/1, X22/4, X23/6, X31/0, X32/3]            142002                 81989
[X11/4, X12/6, X13/8, X21/2, X22/4, X23/6, X31/0, X32/3]            227164                 29625
[X11/3, X12/6, X13/8, X21/2, X22/4, X23/6, X31/0, X32/3]            229245                 29602
[X11/3, X12/6, X13/8, X21/2, X22/4, X23/6, X31/0, X32/3]            239543                 29578

Average time                                                         88356                150370

Fig. 19. Speedup, defined as (time with 1 processor / time with n processors)/n.



7. CONCLUSIONS

In this paper we have presented an abstract model for the parallel interpretation of logic programs. The model can exploit the two main sources of parallelism of logic programming, or parallelism and IAP, as well as producer-consumer parallelism. The most interesting characteristic of the model is that it avoids the duplication of processes for solving the same subgoal. This duplication occurs in other models that exploit producer-consumer parallelism, for example in ROPM.(11) The efficiency of the model is founded on the use of ordered structures for managing the information generated during the search process. First, an I-and/or tree is defined that represents the decomposition of the whole task of evaluating a query into a set of subtasks with a high degree of independence; therefore many of these subtasks can be evaluated in parallel. Then, a DFL is introduced to represent a partial ordering among the literals of a query, in order for these literals to be evaluated under IAP. Finally, a PSN is designed to display and join the partial solutions to the subgoals of a query. The PSN maintains every partial solution independently, though related to the compatible partial solutions of the remaining subgoals, and, when necessary, the partial solutions are joined by means of the inference algorithm in order to obtain solutions to the query. The efficiency of this algorithm is critical for the model performance, so we have required some properties of the DFL in order to improve inference, although property P4 might restrict the amount of parallelism that the DFLs express. From these properties we have designed an efficient inference algorithm as well as some auxiliary operations for exploiting producer-consumer parallelism.

As an application example, we have proposed a strategy to solve JSS problems that combines our parallel logic programming interpretation model with heuristics to guide the search. Experimental results obtained through simulation show that logic programs can express the parallelism these problems exhibit, and that heuristics can help us design these programs so as to improve performance.

Our recent and current work is aimed in three complementary directions. First, we are developing more sophisticated simulations of the interpretation model in order to carry out experimental studies of certain aspects. The idea is to build on previous works, in which we made a comparative study of various algorithms used to perform inference(24) and presented a preliminary study of the process scheduling policy.(39) Of course, the objective of these simulations is to work towards an implementation on a real parallel machine; Puente et al.(40) reported a number of experimental results from a prototype multiprocessor implementation. It is worth remarking here that, in spite of the research effort devoted to parallel logic



programming, as far as we know there are no commercial tools available for standard machines that exploit IAP, or parallelism, and producer-consumer parallelism at the same time. There are only a number of prototype tools that exploit some of the former sources, for example the ACE(41) and DASWAM(42) systems; a comparison of these implementations is presented by Gupta and Pontelli.(43) The second direction of research is the automatic determination of CDFLs (Conditional DFLs) by means of abstract interpretation techniques. And the third one is addressed to finding families of problems for which our model performs well, such as the JSS problem described in this work and the map coloring problem confronted by Varela and Vela.(44)

ACKNOWLEDGMENTS

We are grateful to our colleague Antonio Bahamonde for his encouragement and ideas in the first stage of our research. We would also like to thank the anonymous reviewers for their comments and suggestions.

REFERENCES

1. K. A. M. Ali and R. Karlsson, Full Prolog and scheduling or-parallelism in Muse, IJPP, 19(6):445-475 (1991).

2. J. S. Conery, The and/or process model for parallel interpretation of logic programs, Ph.D. thesis, Dept. of Information and Computer Science, University of California, Irvine (1983).

3. D. DeGroot, Restricted and-parallelism, Proc. Int'l. Conf. Fifth Generation Comp. Systems, North-Holland, pp. 471-478 (1984).

4. J. Chang, A. M. Despain, and D. DeGroot, And-parallelism of logic programs based on static data dependency analysis, COMPCON, pp. 218-225 (1985).

5. L. V. Kale, Parallel architectures for problem solving, Ph.D. thesis, Dept. of Computer Science, SUNY, Stony Brook (1985).

6. P. P. Li and A. J. Martin, The SYNC model: A parallel execution method for logic programming, Proc. Symp. on Logic Progr., Salt Lake City (September 1986).

7. H. Westphal, P. Robert, J. Chassin, and J. Syre, The PEPSys model: Combining backtracking, and- and or-parallelism, IEEE Int'l. Symp. Logic Progr., San Francisco, California, pp. 436-448 (1987).

8. G. Gupta, Parallel execution of logic programs on shared memory multiprocessors, Ph.D. thesis, Dept. of Computer Science, University of North Carolina at Chapel Hill (1992).

9. M. V. Hermenegildo, An abstract machine-based execution model for computer architecture design and efficient implementation of logic programs in parallel, Ph.D. thesis, University of Texas at Austin (1986).

10. E. Lusk et al., The Aurora or-parallel Prolog system, New Generation Computing, 7(2-3) (1990).

11. L. V. Kale, The reduce-or process model for parallel interpretation of logic programs, J. Logic Progr., 11:55-84 (1991).

12. P. V. Hentenryck, H. Simonis, and M. Dincbas, Constraint satisfaction using constraint logic programming, Artificial Intelligence, 58:113-159 (1992).



13. H. M. Adorf and M. D. Johnston, A discrete stochastic neural network algorithm for constraint satisfaction problems, Proc. Int'l. Joint Conf. Neural Networks, San Diego, California (1990).

14. M. Zweben, E. Davis, B. Daun, E. Drascher, M. Deale, and M. Eskey, Learning to improve constraint-based scheduling, Artificial Intelligence, 58:271-296 (1992).

15. D. Corne and P. Ross, Practical issues and recent advances in job- and open-shop scheduling, D. Dasgupta and Z. Michalewicz (eds.), Springer-Verlag.

16. N. Sadeh and M. S. Fox, Variable and value ordering heuristics for the job shop scheduling constraint satisfaction problem, Artificial Intelligence, 86:1-41 (1996).

17. U. Nilsson and J. Maluszynski, Logic, Programming and Prolog, John Wiley (1990).

18. L. V. Kale, A tree representation for parallel problem solving, Proc. Int'l. Conf. Parallel Processing, St. Charles, pp. 677-681 (August 1988).

19. J. B. Dennis, Machines and models for parallel computers, IJPP, 22(1):47-77 (February 1994).

20. R. Varela, Un Modelo para el Cálculo Paralelo de Deducciones en Lógica de Predicados, Tesis doctoral, Departamento de Matemáticas, Universidad de Oviedo (1995).

21. A. Delcher and S. Kasif, Some results on the complexity of exploiting data dependency in parallel logic programs, J. Logic Progr., 6:229-241 (1989).

22. C. R. Vela, C. L. Alonso, R. Varela, and J. Puente, A genetic approach to computing independent and parallelism in logic programming, in Biological and Artificial Computation: From Neuroscience to Technology, IWANN'97, J. Mira, R. Moreno, and J. Cabestany (eds.), Lecture Notes in Computer Science, Springer-Verlag, pp. 566-575 (1997).

23. C. L. Alonso, C. R. Vela, R. Varela, and J. Puente, Ordered structures for parallel rule-based computations, Technical Report, Centro de Inteligencia Artificial, Univ. de Oviedo (1999).

24. R. Varela, E. Sierra, L. Jiménez, and C. R. Vela, Combinación de Soluciones Parciales en Programación Lógica Paralela, C-AEPIA, Alicante (1995).

25. A. K. Mackworth, The logic of constraint satisfaction, Artificial Intelligence, 58(1-3):3-20 (December 1992).

26. M. V. Hermenegildo and F. Rossi, Strict and nonstrict and-parallelism in logic programs: Correctness, efficiency, and compile-time conditions, J. Logic Progr., 22(1):1-45 (January 1995).

27. S. K. Debray, Efficient dataflow analysis of logic programs, J. ACM, 39(4):949-984 (1992).

28. D. Jacobs and A. Langen, Static analysis of logic programs for independent and parallelism, J. Logic Progr., 13:291-314 (1992).

29. K. Muthukumar and M. Hermenegildo, Compile-time derivation of variable dependency using abstract interpretation, J. Logic Progr., 13:315-347 (1992).

30. K. Muthukumar, F. Bueno, M. García de la Banda, and M. Hermenegildo, Automatic compile-time parallelization of logic programs for restricted, goal level, independent and parallelism, J. Logic Progr., 38(2):165-218 (February 1999).

31. B. Schend, A methodology for detecting shared variable dependencies in logic programs, J. Symbolic Computation, 12:275-298 (1991).

32. R. Cucchiara, E. Lamma, P. Mello, M. Milano, and M. Piccardi, Interactive constraint satisfaction and its application to visual object recognition, Proc. APPIA-GULP-PRODE, La Coruña, Spain, pp. 57-69 (July 1998).

33. M. Dincbas, H. Simonis, and P. V. Hentenryck, Solving large combinatorial problems in logic programming, J. Logic Progr., 8(1-2):75-93 (1990).

34. F. Fages, J. Fowler, and T. Sola, Experiments in reactive constraint logic programming, J. Logic Progr., 37(1-3):185-212 (1998).



35. P. V. Hentenryck, V. Saraswat, and Y. Deville, Design, implementation, and evaluation of the constraint language cc(FD), J. Logic Progr., 37(1-3):139-164 (1998).

36. J. Jaffar and M. J. Maher, Constraint logic programming: A survey, J. Logic Progr., 19-20:503-581 (1994).

37. E. Lamma, M. Milano, and P. Mello, Reasoning on constraints in CLP(FD), J. Logic Progr., 38(1):93-110 (January 1999).

38. J. Puente, R. Varela, C. R. Vela, and C. Alonso, A parallel logic programming approach to job shop constraint satisfaction problems, Proc. APPIA-GULP-PRODE, La Coruña, Spain, pp. 29-41 (July 1998).

39. R. Varela, J. Puente, C. R. Vela, and C. Alonso, Planificación Heurística de Procesos and/or Paralelos, CAEPIA, Málaga (1997).

40. J. Puente, R. Varela, and C. R. Vela, Cálculo paralelo de deducciones en un sistema multiprocesador, Proc. CAEPIA, Murcia, Spain, Vol. 1, pp. 9-16 (November 1999).

41. E. Pontelli, G. Gupta, and M. Hermenegildo, &ACE: A high performance parallel Prolog system, Proc. Ninth Int'l. Parallel Processing Symp., IEEE Press, pp. 564-571 (1995).

42. K. Shen, Initial results from the parallel implementation DASWAM, Proc. Joint Int'l. Conf. Symp. Logic Progr., MIT Press (1996).

43. G. Gupta and E. Pontelli, High performance parallel logic programming: The ACE parallel Prolog system, APPIA-GULP-PRODE, Grado, Italy, pp. 25-31 (1997).

44. R. Varela and C. R. Vela, And/or trees for parallel deductions, ITHURS, León, Spain (July 1996).

45. C. S. Mellish, The automatic generation of mode declarations for Prolog programs, Dept. of Artificial Intelligence, Research Paper No. 163, University of Edinburgh (1981).

46. R. Varela, El Modelo RPS para la Gestión del Paralelismo and Independiente en Programas Lógicos, Proc. Joint Conf. Declarative Progr. GULP-PRODE, pp. 251-265 (1994).

47. R. Varela, C. R. Vela, and J. Puente, Efficient Producer-Consumer Parallelism in Logic Programming, APPIA-GULP-PRODE, San Sebastián (July 1996).

