Interpreters, Compilers, Optimisers and Analysers in Prolog



DRAFT

Interpreters, Compilers, Optimisers and Analysers in Prolog

    Michael Leuschel + ...

Lehrstuhl für Softwaretechnik und Programmiersprachen

Institut für Informatik

Universität Düsseldorf

Universitätsstr. 1, D-40225 Düsseldorf

    {leuschel}@cs.uni-duesseldorf.de


    Chapter 1

    Introduction

Goal: implement, optimise, analyse, and verify programming languages and specification languages.

    Tools:

    Prolog

    DCGs (or even cheaper: operator declarations) for parsing

support for non-determinism makes it possible to model a wide variety of specification formalisms, and also enables analysis, running programs backwards, ...

operational semantics and denotational semantics rules can often be translated easily into Prolog

unification: good for type checking and inference

new technologies:

    tabling: good for analysis (lfp) and verification (model checking)

coroutining: allows more natural translation of advanced formalisms (CSP, B, lazy functional languages)

constraints: provide convenient and powerful analysis and verification technology

    Partial Evaluation

    optimise an interpreter for a particular program to be interpreted

    translate an interpreter into a compiler

can sometimes be used for slicing and analysis

    Abstract Interpretation


    Chapter 2

    Background


    Chapter 3

Logic and Logic Programming

In this chapter we summarise some essential background in first-order logic and logic programming, required for the proper comprehension of this thesis. The exposition is mainly inspired by [1] and [30] and in general adheres to the same terminology. The reader is referred to these works for a more detailed presentation, comprising motivations, examples and proofs. Some other good introductions to logic programming can also be found in [37], [16, 2] and [35], while a good introduction to first-order logic and automated theorem proving can be found in [19].

3.1 First-order logic and syntax of logic programs

    We start with a brief presentation of first-order logic.

Definition 3.1.1 (alphabet) An alphabet consists of the following classes of symbols:

    1. variables;

    2. function symbols;

    3. predicate symbols;

4. connectives, which are negation (¬), conjunction (∧), disjunction (∨), implication (→), and equivalence (↔);

5. quantifiers, which are the existential quantifier ∃ and the universal quantifier ∀;

    6. punctuation symbols, which are (, ) and ,.


Function and predicate symbols have an associated arity, a natural number indicating how many arguments they take in the definitions following below.

Constants are function symbols of arity 0, while propositions are predicate symbols of arity 0.

The classes 4 to 6 are the same for all alphabets. In the remainder of this thesis we suppose the set of variables is countably infinite. In addition, alphabets with a finite set of function and predicate symbols will simply be called finite. An infinite alphabet is one in which the number of function and/or predicate symbols is not finite but countably infinite.

We will try to adhere as much as possible to the following syntactical conventions throughout the thesis:

Variables will be denoted by upper-case letters like X, Y, Z, usually taken from the later part of the (Latin) alphabet.

Constants will be denoted by lower-case letters like a, b, c, usually taken from the beginning of the (Latin) alphabet.

The other function symbols will be denoted by lower-case letters like f, g, h.

Predicate symbols will be denoted by lower-case letters like p, q, r.

Definition 3.1.2 (terms, atoms) The set of terms (over some given alphabet) is inductively defined as follows:

    a variable is a term

    a constant is a term and

a function symbol f of arity n > 0 applied to a sequence t1, . . . , tn of n terms, denoted by f(t1, . . . , tn), is also a term.

    The set of atoms (over some given alphabet) is defined in the following way:

    a proposition is an atom and

a predicate symbol p of arity n > 0 applied to a sequence t1, . . . , tn of n terms, denoted by p(t1, . . . , tn), is an atom.

We will also allow the notations f(t1, . . . , tn) and p(t1, . . . , tn) in case n = 0. f(t1, . . . , tn) then simply represents the term f and p(t1, . . . , tn) represents the atom p. For terms representing lists we will use the usual Prolog [14, 42, 8] notation: e.g. [ ] denotes the empty list, [H|T] denotes a non-empty list with first element H and tail T.

Definition 3.1.3 (formula) A (well-formed) formula (over some given alphabet) is inductively defined as follows:

    An atom is a formula.

If F and G are formulas then so are (¬F), (F ∧ G), (F ∨ G), (F → G), (F ↔ G).


If X is a variable and F is a formula then (∃X F) and (∀X F) are also formulas.

To avoid formulas cluttered with punctuation symbols we give the connectives and quantifiers the following precedence, from highest to lowest:

1. ¬, ∃, ∀   2. ∧   3. ∨   4. →, ↔.

For instance, we will write ∀X(p(X) ∧ ¬q(X) → r(X)) instead of the less readable (∀X((p(X) ∧ (¬(q(X)))) → r(X))).

The set of all formulas constructed using a given alphabet A is called the first-order language given by A.

First-order logic assigns meanings to formulas in the form of interpretations over some domain D:

Each function symbol of arity n is assigned an n-ary function D^n → D. This part, along with the choice of the domain D, is referred to as a pre-interpretation.

Each predicate symbol of arity n is assigned an n-ary relation, i.e. a subset of D^n (or equivalently an n-ary function D^n → {true, false}).

Each formula is given a truth value, true or false, depending on the truth values of the sub-formulas. (For more details see e.g. [19] or [30].)

A model of a formula is simply an interpretation in which the formula has the value true assigned to it. Similarly, a model of a set S of formulas is an interpretation which is a model of every F ∈ S.

For example, let I be an interpretation whose domain D is the set of natural numbers ℕ and which maps the constant a to 1, the constant b to 2 and the unary predicate p to the unary relation {(1)}. Then the truth value of p(a) under I is true and the truth value of p(b) under I is false. So I is a model of p(a) but not of p(b). I is also a model of ∃X p(X) but not of ∀X p(X).

We say that two formulas are logically equivalent iff they have the same models. A formula F is said to be a logical consequence of a set of formulas S, denoted by S |= F, iff F is assigned the truth value true in all models of S. A set of formulas S is said to be inconsistent iff it has no model. It can be easily shown that S |= F holds iff S ∪ {¬F} is inconsistent. This observation lies at the basis of what is called a proof by refutation: to show that F is a logical consequence of S we show that S ∪ {¬F} leads to an inconsistency.

From now on we will also use true (resp. false) to denote some arbitrary formula which is assigned the truth value true (resp. false) in every interpretation. If there exists a proposition p in the underlying alphabet then true could e.g. stand for p ∨ ¬p and false could stand for p ∧ ¬p.1 We also introduce the following shorthands for formulas:

1 In some texts on logic (e.g. [19]) true and false are simply added to the alphabet and treated in a special manner by interpretations. The only difference is then that true and false can be considered as atoms, which can be convenient in some places.


if F is a formula, then (F ←) denotes the formula (F ← true) and (← F) denotes the formula (false ← F).

(←) denotes the formula (false ← true).

    In the following we define some other frequently occurring kinds of formulas.

Definition 3.1.4 (literal) If A is an atom then the formulas A and ¬A are called literals. Furthermore, A is called a positive literal and ¬A a negative literal.

Definition 3.1.5 (conjunction, disjunction) Let A1, . . . , An be literals, where n > 0. Then A1 ∧ . . . ∧ An is a conjunction and A1 ∨ . . . ∨ An is a disjunction.

Usually we will assume ∧ (respectively ∨) to be associative, in the sense that we do not distinguish between the logically equivalent, but syntactically different, formulas p ∧ (q ∧ r) and (p ∧ q) ∧ r.

Definition 3.1.6 (scope) Given a formula (∃X F) (resp. (∀X F)) the scope of ∃X (resp. ∀X) is F. A bound occurrence of a variable X inside a formula F is any occurrence immediately following a quantifier or an occurrence within the scope of a quantifier ∃X or ∀X. Any other occurrence of X inside F is said to be free.

Definition 3.1.7 (universal and existential closure) Given a formula F, the universal closure of F, denoted by ∀(F), is a formula of the form (∀X1 . . . (∀Xm F) . . .) where X1, . . . , Xm are all the variables having a free occurrence inside F (in some arbitrary order). Similarly the existential closure of F, denoted by ∃(F), is the formula (∃X1 . . . (∃Xm F) . . .).

    The following class of formulas plays a central role in logic programming.

Definition 3.1.8 (clause) A clause is a formula of the form ∀(H1 ∨ . . . ∨ Hm ← B1 ∧ . . . ∧ Bn), where m ≥ 0, n ≥ 0 and H1, . . . , Hm, B1, . . . , Bn are all literals. H1 ∨ . . . ∨ Hm is called the head of the clause and B1 ∧ . . . ∧ Bn is called the body. A (normal) program clause is a clause where m = 1 and H1 is an atom. A definite program clause is a normal program clause in which B1, . . . , Bn are atoms. A fact is a program clause with n = 0. A query or goal is a clause with m = 0 and n > 0. A definite goal is a goal in which B1, . . . , Bn are atoms. The empty clause is a clause with n = m = 0. As we have seen earlier, this corresponds to the formula false ← true, i.e. a contradiction. We also use □ to denote the empty clause.

In logic programming notation one usually omits the universal quantifiers encapsulating the clause, and one also often uses the comma (,) instead of the conjunction ∧ in the body; e.g. one writes p(s(X)) ← q(X), p(X) instead of ∀X(p(s(X)) ← (q(X) ∧ p(X))). We will adhere to this convention.


Definition 3.1.9 (program) A (normal) program is a set of program clauses. A definite program is a set of definite program clauses.

In order to be able to express a given program P in a first-order language L given by some alphabet A, the alphabet A must of course contain the function and predicate symbols occurring within P. The alphabet might however contain additional function and predicate symbols which do not occur inside the program. We therefore denote the underlying first-order language of a given program P by LP and the underlying alphabet by AP. For technical reasons related to definitions below, we suppose that there is at least one constant symbol in AP.

    3.2 Semantics of logic programs

Given that a program P is just a set of formulas, which happen to be clauses, the logical meaning of P might simply be seen as all the formulas F for which P |= F. For normal programs this approach will turn out to be insufficient, but for definite programs it provides a good starting point.

    3.2.1 Definite programs

To determine whether a formula F is a logical consequence of another formula G, we have to examine whether F is true in all models of G. One big advantage of clauses is that it is sufficient to look just at certain canonical models, called the Herbrand models.

In the following we will define these canonical models. Any term, atom, literal, or clause will be called ground iff it contains no variables.

Definition 3.2.1 Let P be a program written in the underlying first-order language LP given by the alphabet AP. Then the Herbrand universe UP is the set of all ground terms over AP.2 The Herbrand base BP is the set of all ground atoms in LP.

A Herbrand interpretation is simply an interpretation whose domain is the Herbrand universe UP and which maps every term to itself. A Herbrand model of a set of formulas S is a Herbrand interpretation which is a model of S.

The interest of Herbrand models for logic programs derives from the following proposition (the proposition does not hold for arbitrary formulas).

    Proposition 3.2.2 A set of clauses has a model iff it has a Herbrand model.

This means that a formula F which is true in all Herbrand models of a set of clauses C is a logical consequence of C. Indeed, if F is true in all Herbrand models

2 It is here that the requirement that AP contains at least one constant symbol comes into play. It ensures that the Herbrand universe is never empty.


then ¬F is false in all Herbrand models and therefore, by Proposition 3.2.2, C ∪ {¬F} is inconsistent and C |= F.

Note that a Herbrand interpretation or model can be identified with a subset H of the Herbrand base BP (i.e. H ∈ 2^BP): the interpretation of p(d1, . . . , dn) is true iff p(d1, . . . , dn) ∈ H and the interpretation of p(d1, . . . , dn) is false iff p(d1, . . . , dn) ∉ H. This means that we can use the standard set order ⊆ on Herbrand models and define minimal Herbrand models as follows.

Definition 3.2.3 A Herbrand model H ⊆ BP for a given program P is a minimal Herbrand model iff there exists no H′ ⊂ H which is also a Herbrand model of P.

For definite programs there exists a unique minimal Herbrand model, called the least Herbrand model, denoted by HP. Indeed it can be easily shown that the intersection of two Herbrand models for a definite program P is still a Herbrand model of P. Furthermore, the entire Herbrand base BP is always a model of a definite program and one can thus obtain the least Herbrand model by taking the intersection of all Herbrand models.

The least Herbrand model HP can be seen as capturing the intended meaning of a given definite program P, as it is sufficient to infer all the logical consequences of P. Indeed, a formula which is true in the least Herbrand model HP is true in all Herbrand models and is therefore a logical consequence of the program.

    Example 3.2.4 Take for instance the following program P:

int(0) ←
int(s(X)) ← int(X)

Then the least Herbrand model of P is HP = {int(0), int(s(0)), . . .} and indeed P |= int(0), P |= int(s(0)), . . . . But also note that for definite programs the entire Herbrand base BP is also a model. Given a suitable alphabet AP, we might have BP = {int(a), int(0), int(s(a)), int(s(0)), . . .}. This means that the atom int(a) is consistent with the program P (i.e. P ⊭ ¬int(a)), but is not implied either (i.e. P ⊭ int(a)).

It is here that logic programming goes beyond classical first-order logic. In logic programming one (usually) assumes that the program gives a complete description of the intended interpretation, i.e. anything which cannot be inferred from the program is assumed to be false. For example, one would say that ¬int(a) is a consequence of the above program P because int(a) ∉ HP. This means that, from a logic programming perspective, the above program captures exactly the natural numbers, something which is impossible to accomplish within first-order logic (for a formal proof see e.g. Corollary 4.10.1 in [13]).

A possible inference scheme, capturing this aspect of logic programming, was introduced in [40] and is referred to as the closed world assumption (CWA). The CWA cannot be expressed in first-order logic (a second-order logic axiom has to be used to that effect, see e.g. the approach adopted in [28]). Note that using the CWA leads to non-monotonic inferences, because the addition of new


information can remove certain, previously valid, consequences. For instance, by adding the clause int(a) ← to the above program the literal ¬int(a) is no longer a consequence of the logic program.

    3.2.2 Fixpoint characterisation of HP

We now present a more constructive characterisation of the least Herbrand model, as well as the associated set of consequences, using fixpoint concepts. We first need the following definitions:

Definition 3.2.5 (substitution) A substitution θ is a finite set of the form θ = {X1/t1, . . . , Xn/tn} where X1, . . . , Xn are distinct variables and t1, . . . , tn are terms such that ti ≠ Xi. Each element Xi/ti of θ is called a binding.

Alternative definitions of substitutions exist in the literature (e.g. in [17, 15]; see also the discussion in [24]), but the above is the most common one in the logic programming context.

We also define an expression to be either a term, an atom, a literal, a conjunction, a disjunction or a program clause.

Definition 3.2.6 (instance) Let θ = {X1/t1, . . . , Xn/tn} be a substitution and E an expression. Then the instance of E by θ, denoted by Eθ, is the expression obtained by simultaneously replacing each occurrence of a variable Xi in E by the term ti.

We can now define the following operator mapping Herbrand interpretations to Herbrand interpretations.

Definition 3.2.7 (TP) Let P be a program. We then define the (ground) immediate consequence operator TP : 2^BP → 2^BP by:

TP(I) = {A ∈ BP | A ← A1, . . . , An is a ground instance of a clause in P and {A1, . . . , An} ⊆ I}

Every pre-fixpoint I of TP, i.e. TP(I) ⊆ I, corresponds to a Herbrand model of P and vice versa. This means that to study the Herbrand models of P one can also investigate the pre-fixpoints of the operator TP. For definite programs, one can prove that TP is a continuous mapping and that it has a least fixpoint lfp(TP) which is also its least pre-fixpoint.

    The following definition will provide a way to calculate the least fixpoint:

Definition 3.2.8 Let T be a mapping 2^D → 2^D. We then define T ↑ 0 = ∅ and T ↑ (i + 1) = T(T ↑ i). We also define T ↑ ω to stand for the union of all the T ↑ i, i.e. T ↑ ω = ∪i<ω T ↑ i.
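The iteration in Definition 3.2.8 can be sketched in Prolog itself. The following is a minimal illustration (not part of the original text): it assumes a hypothetical encoding in which a ground program is a list of clause(Head, BodyAtoms) terms and a Herbrand interpretation is a sorted list of ground atoms; tp/3 computes one application of TP, and tp_lfp/2 iterates from ∅ until a fixpoint is reached (terminating for finite ground programs).

```prolog
% Hypothetical encoding: a ground program is a list of clause(Head, Body)
% terms (Body a list of atoms); an interpretation is a sorted atom list.
tp(Program, I, Next) :-
    findall(A,
            ( member(clause(A, Body), Program),
              subset_of(Body, I) ),
            As),
    sort(As, Next).

subset_of([], _).
subset_of([A|As], I) :- memberchk(A, I), subset_of(As, I).

% Iterate T_P from the empty interpretation up to the least fixpoint.
tp_lfp(Program, Lfp) :-
    tp_iterate(Program, [], Lfp).

tp_iterate(Program, I, Lfp) :-
    tp(Program, I, Next),
    (   Next == I
    ->  Lfp = I
    ;   tp_iterate(Program, Next, Lfp)
    ).
```

For the ground program {p ←, q ← p, r ← p ∧ q}, encoded as [clause(p,[]), clause(q,[p]), clause(r,[p,q])], tp_lfp/2 computes [p, q, r], i.e. T ↑ 3 = lfp(TP).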


3.3 Proof theory of logic programs [not relevant for the PS2 course?]

We start with some additional useful terminology related to substitutions. If F = Eθ for some substitution θ then E is said to be more general than F. If E is more general than F and F is more general than E then E and F are called variants (of each other). If Eθ is a variant of E then θ is called a renaming substitution for E. Because a substitution is a set of bindings we will denote, in contrast to e.g. [30], the empty or identity substitution by ∅ and not by the empty sequence ε. Substitutions can also be applied to sets of expressions by defining {E1, . . . , En}θ = {E1θ, . . . , Enθ}.

    Substitutions can also be composed in the following way:

Definition 3.3.1 (composition of substitutions) Let θ = {X1/s1, . . . , Xn/sn} and σ = {Y1/t1, . . . , Yk/tk} be substitutions. Then the composition of θ and σ, denoted by θσ, is defined to be the substitution {Xi/siσ | 1 ≤ i ≤ n ∧ siσ ≠ Xi} ∪ {Yi/ti | 1 ≤ i ≤ k ∧ Yi ∉ {X1, . . . , Xn}}.3

When viewing substitutions as functions from expressions to expressions, the above definition behaves just like ordinary function composition, i.e. E(θσ) = (Eθ)σ. We also have that (for proofs see [30]) the identity substitution ∅ acts as a left and right identity for composition, i.e. ∅θ = θ∅ = θ, and that composition is associative, i.e. (θσ)γ = θ(σγ).
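Definition 3.3.1 can be executed directly once object-level variables are made explicit. The sketch below is my own illustration (not from the text) and assumes a hypothetical ground encoding: object variables are terms v(Name), and a substitution is a list of v(Name)/Term bindings.

```prolog
% apply_subst(+Term, +Subst, -Instance): apply a substitution to a term
% in the v(Name) encoding of object variables.
apply_subst(v(X), Sigma, T) :- memberchk(v(X)/T, Sigma), !.
apply_subst(v(X), _, v(X)) :- !.
apply_subst(T, Sigma, T2) :-
    T =.. [F|Args],
    maplist(subst_arg(Sigma), Args, Args2),
    T2 =.. [F|Args2].

subst_arg(Sigma, A, B) :- apply_subst(A, Sigma, B).

% compose(+Theta, +Sigma, -ThetaSigma): composition per Definition 3.3.1.
compose(Theta, Sigma, Composition) :-
    findall(X/S2,                       % {Xi/si sigma | si sigma \== Xi}
            ( member(X/S, Theta),
              apply_subst(S, Sigma, S2),
              S2 \== X ),
            Part1),
    findall(Y/T,                        % {Yi/ti | Yi not in dom(Theta)}
            ( member(Y/T, Sigma),
              \+ member(Y/_, Theta) ),
            Part2),
    append(Part1, Part2, Composition).
```

With this encoding, compose([v(x)/f(v(y))], [v(y)/a], R) gives R = [v(x)/f(a), v(y)/a], and compose([v(x)/a], [v(x)/a], R) gives [v(x)/a] rather than the empty substitution, matching the remark in footnote 3.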

We call a substitution θ idempotent iff θθ = θ. We also define the following notations: the set of variables occurring inside an expression E is denoted by vars(E), the domain of a substitution θ is defined as dom(θ) = {X | X/t ∈ θ} and the range of θ is defined as ran(θ) = {Y | X/t ∈ θ ∧ Y ∈ vars(t)}. Finally, we also define vars(θ) = dom(θ) ∪ ran(θ) as well as the restriction θ|V of a substitution θ to a set of variables V by θ|V = {X/t | X/t ∈ θ ∧ X ∈ V}.

The following concept will form the link between the model-theoretic semantics and the procedural semantics of logic programs.

    tics and the procedural semantics of logic programs.

Definition 3.3.2 (answer) Let P be a program and G = ← L1, . . . , Ln a goal. An answer θ for P ∪ {G} is a substitution such that dom(θ) ⊆ vars(G).

    3.3.1 Definite programs

    We first define correct answers in the context of definite programs and goals.

Definition 3.3.3 (correct answer) Let P be a definite program and G = ← A1, . . . , An a definite goal. An answer θ for P ∪ {G} is called a correct answer for P ∪ {G} iff P |= ∀((A1 ∧ . . . ∧ An)θ).

3 This definition deviates slightly from the one in [1, 30, 37]. Indeed, taking the definition in [1, 30, 37] literally we would have that {X/a}{X/a} = ∅ and not the desired {X/a} (the definition in [1, 30, 37] says to delete any binding Yi/ti with Yi ∈ {X1, . . . , Xn} from the set of bindings). Our definition does not share this problem.


Take for instance the program P = {p(a) ←} and the goal G = ← p(X). Then {X/a} is a correct answer for P ∪ {G} while {X/c} and ∅ are not.

We now present a way to calculate correct answers, based on the concepts of resolution and unification.

Definition 3.3.4 (mgu) Let S be a finite set of expressions. A substitution θ is called a unifier of S iff the set Sθ is a singleton. θ is called relevant iff its variables vars(θ) all occur in S. θ is called a most general unifier or mgu iff for each unifier σ of S there exists a substitution γ such that σ = θγ.

The concept of unification dates back to [20] and has been rediscovered in [41]. A survey on unification, also treating other application domains, can be found in [23].

If a unifier for a finite set S of expressions exists then there exists an idempotent and relevant most general unifier which is unique modulo variable renaming (see [1, 30]). Unifiability of a set of expressions is decidable and there are efficient algorithms for calculating an idempotent and relevant mgu. See for instance the unification algorithms in [1, 30] or the more complicated but linear ones in [31, 38]. From now on we denote, for a unifiable set S of expressions, by mgu(S) an idempotent and relevant unifier of S. If we just want to unify two terms t1, t2 then we will also sometimes write mgu(t1, t2) instead of mgu({t1, t2}).
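In Prolog itself an idempotent and relevant mgu of two terms is computed by the built-in unification =/2; note that, for efficiency, standard Prolog unification omits the occurs check, while the ISO built-in unify_with_occurs_check/2 performs sound unification. A small illustrative session (my example, in the transcript style used elsewhere in this text):

```prolog
| ?- p(X, f(Y)) = p(a, Z).
X = a,
Z = f(Y) ?
yes
| ?- unify_with_occurs_check(X, f(X)).
no
```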

We define the most general instance of a finite set S to be the only element of Sθ where θ = mgu(S). The opposite of the most general instance is the most specific generalisation of a finite set of expressions S, also denoted by msg(S), which is the most specific expression M such that all expressions in S are instances of M. Algorithms for calculating the msg exist [27], and this process is also referred to as anti-unification or least general generalisation.

We can now define SLD-resolution, which is based on the resolution principle [41] and which is a special case of SL-resolution [26]. Its use for a programming language was first described in [25] and the name SLD (which stands for Selection rule-driven Linear resolution for Definite clauses) was coined in [4]. For more details about the history see e.g. [1, 30].

Definition 3.3.5 (SLD-derivation step) Let G = ← L1, . . . , Lm, . . . , Lk be a goal and C = A ← B1, . . . , Bn a program clause such that k ≥ 1 and n ≥ 0. Then G′ is derived from G and C using θ (and Lm) iff the following conditions hold:

1. Lm is an atom, called the selected atom (at position m), in G.

2. θ is a relevant and idempotent mgu of Lm and A.

3. G′ is the goal ← (L1, . . . , Lm−1, B1, . . . , Bn, Lm+1, . . . , Lk)θ.

G′ is also called a resolvent of G and C.
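Under Prolog's leftmost selection rule, repeated SLD-derivation steps can be sketched with the classic "vanilla" meta-interpreter. This is a standard illustration, not part of the original text; it assumes the object program is stored as cl(Head, Body) facts, with true representing the empty body.

```prolog
% solve/1 performs LD-resolution: select the leftmost atom, resolve it
% against a clause of the object program, and solve the resolvent.
solve(true).
solve((A, B)) :- solve(A), solve(B).
solve(A) :- A \= true, A \= (_, _), cl(A, Body), solve(Body).

% Object program of Example 3.2.4 in the assumed cl/2 encoding:
cl(int(0), true).
cl(int(s(X)), int(X)).
```

The query ?- solve(int(s(0))). succeeds, and ?- solve(int(N)). enumerates N = 0, N = s(0), . . . ; standardising apart and mgu computation are inherited from Prolog's own clause renaming and unification.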

In the following we define the concept of a complete SLD-derivation (we will define incomplete ones later on).


Definition 3.3.6 (complete SLD-derivation) Let P be a normal program and G a normal goal. A complete SLD+-derivation of P ∪ {G} is a tuple (G, L, C, S) consisting of a sequence of goals G = G0, G1, . . ., a sequence L = L0, L1, . . . of selected literals,4 a sequence C = C1, C2, . . . of variants of program clauses of P and a sequence S = θ1, θ2, . . . of mgus such that:

for i > 0, vars(Ci) ∩ vars(G0) = ∅; for i > j, vars(Ci) ∩ vars(Cj) = ∅; for i ≥ 0, Li is a positive literal in Gi and Gi+1 is derived from Gi and Ci+1 using θi+1 and Li; the sequences G, C, S are maximal given L.

A complete SLD-derivation is just a complete SLD+-derivation of a definite program and goal.

The process of producing variants of program clauses of P which do not share any variable with the derivation sequence so far is called standardising apart. Some care has to be taken to avoid variable clashes and the ensuing technical problems; see the discussions in [24] or [15].

We now come back to the idea of a proof by refutation and its relation to SLD-resolution. In a proof by refutation one adds the negation of what is to be proven and then tries to arrive at an inconsistency. The former corresponds to adding a goal G = ← A1, . . . , An to a program P and the latter corresponds to searching for an SLD-derivation of P ∪ {G} which leads to □. This justifies the following definition.

Definition 3.3.7 (SLD-refutation) An SLD-refutation of P ∪ {G} is a finite complete SLD-derivation of P ∪ {G} which has the empty clause □ as the last goal of the derivation.

In addition to refutations there are (only) two other kinds of complete derivations:

    Finite derivations which do not have the empty clause as the last goal.These derivations will be called (finitely) failed.

    Infinite derivations. These will be called infinitely failed.

We can now define computed answers, which correspond to the output calculated by a logic program.

Definition 3.3.8 (computed answer) Let P be a definite program, G a definite goal and D an SLD-refutation for P ∪ {G} with the sequence θ1, . . . , θn of mgus. The substitution (θ1 · · · θn)|vars(G) is then called a computed answer for P ∪ {G} (via D).

If θ is a computed (respectively correct) answer for P ∪ {G} then Gθ is called a computed (respectively correct) instance for P ∪ {G}.

4 Again we slightly deviate from [1, 30]: the inclusion of L avoids some minor technical problems wrt the maximality condition.


Theorem 3.3.9 (soundness of SLD) Let P be a definite program and G a definite goal. Every computed answer for P ∪ {G} is a correct answer for P ∪ {G}.

Theorem 3.3.10 (completeness of SLD) Let P be a definite program and G a definite goal. For every correct answer θ for P ∪ {G} there exists a computed answer σ for P ∪ {G} and a substitution γ such that Gθ = Gσγ.

    A proof of the previous theorem can be found in [1].5

    We will now examine systematic ways to search for SLD-refutations.

Definition 3.3.11 (complete SLD-tree) A complete SLD-tree for P ∪ {G} is a labelled tree satisfying the following:

1. Each node of the tree is labelled with a definite goal along with an indication of the selected atom.

    2. The root node is labelled with G.

3. Let ← A1, . . . , Am, . . . , Ak be the label of a node in the tree and suppose that Am is the selected atom. Then for each clause A ← B1, . . . , Bq in P such that Am and A are unifiable the node has one child labelled with

← (A1, . . . , Am−1, B1, . . . , Bq, Am+1, . . . , Ak)θ,

where θ is an idempotent and relevant mgu of Am and A.

    4. Nodes labelled with the empty goal have no children.

To every branch of a complete SLD-tree corresponds a complete SLD-derivation. The choice of the selected atom is performed by what is called a selection rule. Maybe the most well known selection rule is the left-to-right selection rule of Prolog [14, 42, 8], which always selects the leftmost literal in a goal. The complete SLD-derivations and SLD-trees constructed via this selection rule are called LD-derivations and LD-trees.

Usually one confounds goals and nodes (e.g. in [1, 30, 37]), although this is strictly speaking not correct because the same goal can occur several times inside the same SLD-tree.

We will often use a graphical representation of SLD-trees in which the selected atoms are identified by underlining. For instance, Figure 3.1 contains a graphical representation of a complete SLD-tree for P ∪ {← int(s(0))}, where P is the program of Example 3.2.4.

5 The corresponding theorem in [30] states that θ = σγ, which is known to be false. Indeed, take for example the program P = {p(f(X, Y)) ←} and the goal G = ← p(Z). Then θ = {Z/f(a, a)} is a correct answer (because p(f(a, a)) is a consequence of P), but the only computed answers are of the form σ = {Z/f(X, Y)} (where either X or Y must be different from Z); composing such a σ with γ = {X/a, Y/a} will give {Z/f(a, a), X/a, Y/a} (or {Z/f(a, a), Y/a} if X = Z, or {Z/f(a, a), X/a} if Y = Z), which is different from θ = {Z/f(a, a)}.


[Tree: the root ← int(s(0)) has the single child ← int(0), which in turn has the single child □.]

Figure 3.1: Complete SLD-tree for Example 3.2.4

    3.3.2 Programs with built-ins

Most practical logic programs make (heavy) usage of built-ins. Although a lot of these built-ins, like e.g. assert/1 and retract/1, are extra-logical and ruin the declarative nature of the underlying program, a reasonable number of them can actually be seen as syntactic sugar. Take for example the following program which uses the Prolog [14, 42, 8] built-ins =../2 and call/1.

map(P, [], []) ←
map(P, [X|T], [PX|PT]) ← C =.. [P, X, PX], call(C), map(P, T, PT)
inv(0, 1) ←
inv(1, 0) ←

For this program the query ← map(inv, [0, 1, 0], R) will succeed with the computed answer {R/[1, 0, 1]}. Given that query, the Prolog program can be seen as a pure definite logic program by simply adding the following definitions (where we use the prefix notation for the predicate =../2):

=..(inv(X, Y), [inv, X, Y]) ←
call(inv(X, Y)) ← inv(X, Y)

The so obtained pure logic program will succeed for ← map(inv, [0, 1, 0], R) with the same computed answer {R/[1, 0, 1]}.

This means that some predicates like map/3, which are usually taken to be higher-order, can simply be mapped to pure definite (first-order) logic programs ([44, 36]). Some built-ins, like for instance is/2, have to be defined by infinite relations. Usually this poses no problems as long as, when selecting such a built-in, only a finite number of cases apply (Prolog will report a run-time error if more than one case applies, while the programming language Gödel [22] will delay the selection until only one case applies).

In the remainder of this thesis we will usually restrict our attention to those built-ins that can be given a logical meaning by such a mapping.


    Chapter 4

    Parsing

    4.1 Operator Declarations

    [TO DO: INSERT: material on op declarations]

    4.2 DCGs

    [TO DO: INSERT: Some DCG background.]

4.3 Treating Precedences and Associativities with DCGs

Scheme to treat operator precedences and associativity: one layer per operator precedence; every layer calls the layer below, and the bottom layer can call the top layer via parentheses.

    Program 4.3.1

    plus(R) --> exp(A), plusc(A,R).

    plusc(Acc,Res) --> "+",!, exp(B), plusc(plus(Acc,B),Res).

    /* left associative: B associates to + on left */

    plusc(A,A) --> [].

    exp(R) --> num(A), expc(A,R).

    expc(Acc,exp(Acc,R2)) --> "**",!, exp(R2). /* right associative */

    expc(A,A) --> [].

    num(X) --> "(",!,plus(X), ")".


Figure 4.1: Illustrating the parse hierarchy and tree for "x**y+x**y**z+y"

    num(id(x)) --> "x".

    num(id(y)) --> "y".

    num(id(z)) --> "z".

    parse(S,T) :- plus(T,S,[]).

    This parser can be called as follows:

    | ?- parse("x+y+z",R).

    R = plus(plus(id(x),id(y)),id(z)) ?

    yes

    | ?- parse("x**y**z",R).

    R = exp(id(x),exp(id(y),id(z))) ?

    yes

    | ?- parse("x**y+x**y**z+y",R).

R = plus(plus(exp(id(x),id(y)),exp(id(x),exp(id(y),id(z)))),id(y)) ?
yes

Exercise 4.3.2 Extend the parser from Program 4.3.1 to handle the operators -, *, /.
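A possible solution sketch for this exercise: insert a multiplicative layer between plus and exp, and extend plusc with subtraction. The term names minus/2, times/2 and div/2 are our choice, not fixed by the text; these rules replace plus//1 and plusc//2 of Program 4.3.1.

```prolog
/* plus layer now calls the new times layer */
plus(R) --> times(A), plusc(A,R).
plusc(Acc,Res) --> "+",!, times(B), plusc(plus(Acc,B),Res).
plusc(Acc,Res) --> "-",!, times(B), plusc(minus(Acc,B),Res). /* left associative */
plusc(A,A) --> [].

/* new layer for * and /, sitting between plus and exp */
times(R) --> exp(A), timesc(A,R).
timesc(Acc,Res) --> "*",!, exp(B), timesc(times(Acc,B),Res).
timesc(Acc,Res) --> "/",!, exp(B), timesc(div(Acc,B),Res).
timesc(A,A) --> [].
```

Note that "**" is still parsed correctly: the exp layer below consumes any "**" chain before timesc gets to see a single "*".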

    4.4 Tokenisers

Exercise 4.4.1 Extend the parser from Program 4.3.1 and Exercise 4.3.2 to link up with a tokeniser that recognises numbers and identifiers.
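A possible sketch of such a tokeniser. The token names num/1 and id/1 and the overall design are our choice; a real tokeniser would also handle upper-case letters, further multi-character operators and error reporting:

```prolog
tokenise([]) --> [].
tokenise(Ts) --> " ", !, tokenise(Ts).          /* skip blanks */
tokenise([T|Ts]) --> token(T), tokenise(Ts).

token('**')   --> "**", !.                      /* before the single-character rule */
token(num(N)) --> digit(D), !, digits(Ds), {number_codes(N,[D|Ds])}.
token(id(A))  --> letter(L), !, letters(Ls), {atom_codes(A,[L|Ls])}.
token(C)      --> [Code], {atom_codes(C,[Code])}.  /* +, -, *, /, (, ) */

digit(D) --> [D], {D >= 0'0, D =< 0'9}.
digits([D|Ds]) --> digit(D), !, digits(Ds).
digits([]) --> [].

letter(L) --> [L], {L >= 0'a, L =< 0'z}.
letters([L|Ls]) --> letter(L), !, letters(Ls).
letters([]) --> [].
```

The grammar rules of Program 4.3.1 would then work on token lists instead of character lists, e.g. num(num(N)) --> [num(N)]. and num(id(I)) --> [id(I)].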


    Chapter 5

    Simple Interpreters

    5.1 NFA

Our first task is to write an interpreter for non-deterministic finite automata (NFA). [REF; Ullman?]

One question we must ask ourselves is how to represent the NFA to be interpreted. One solution is to represent the NFA by Prolog facts. E.g., we could use the Prolog predicate init/1 to represent the initial states, final/1 to represent the final states, and trans/3 to represent the transitions between the states. To represent the NFA from Figure 5.1 we would thus write:

    Program 5.1.1

    init(1).

    trans(1,a,2).

    trans(2,b,3).

    trans(2,b,2).

    final(3).

Let us now write an interpreter checking whether a string can be generated by the automaton. The interpreter needs to know the current state St as well as the string to be checked. The empty string can only be generated in final states.

Figure 5.1: A simple NFA, with initial state 1, transitions 1 -a-> 2, 2 -b-> 2, 2 -b-> 3, and final state 3


A non-empty string can be generated by taking an outgoing transition from the current state to another state S2, thereby generating the first character H of the string; from S2 we must be able to generate the rest T of the string.

    accept(S,[]) :- final(S).

    accept(S,[H|T]) :- trans(S,H,S2), accept(S2,T).

We have not yet encoded that initially we must start from a valid initial state. This can be encoded by the following:

    check(Trace) :- init(S), accept(S,Trace).

We can now use our interpreter to check whether the NFA can generate various strings:

| ?- check([a,b]).
yes

    | ?- check([a]).

    no

    We can even only provide a partially instantiated string and ask for solutions:

    | ?- check([X,Y,Z]).

    X = a ,

    Y = b ,

    Z = b ? ;

    no

    Finally, we can generate strings accepted by the NFA:

    | ?- check(X).

    X = [a,b] ? ;

    X = [a,b,b] ? ;

    X = [a,b,b,b] ? ;

Exercise 5.1.2 Extend the above interpreter to handle epsilon transitions.
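A sketch of a solution, assuming epsilon transitions are stored as additional facts eps(From,To) (our representation choice, not fixed by the text). Note that this naive version loops if the automaton contains a cycle of epsilon transitions:

```prolog
eps(1,3).   /* an example epsilon transition */

accept(S,[]) :- final(S).
accept(S,[H|T]) :- trans(S,H,S2), accept(S2,T).
accept(S,Str) :- eps(S,S2), accept(S2,Str).   /* consume no character */
```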

    5.2 Regular Expressions

Our next task is to write an interpreter for regular expressions. [REF] We will first use operator declarations so that we can use the standard regular expression operators '.' for concatenation, '+' for alternation, and '*' for repetition. For this we write the following operator declarations [REF].

:- op(450,xfy,'.'). /* + already defined; has 500 as priority */

:- op(400,xf,*).


    With append:

Program 5.2.1

:- use_module(library(lists)).

    gen(X,[X]) :- atomic(X).

    gen(X+Y,S) :- gen(X,S) ; gen(Y,S).

    gen(X.Y,S) :- gen(X,SX), append(SX,SY,S), gen(Y,SY).

    gen(*(_X),[]).

    gen(*(X),S) :- gen(X,SX), append(SX,SSX,S),gen(*(X),SSX).

Observe how, to compute the meaning/effect of X.Y, we compute the effect of X and Y by recursive calls to gen. This is a common pattern that will appear time and time again in interpreters. We call this the compositional style (or denotational style).

    Usage:

    | ?- gen(a*,[X,Y,Z]).

    X = a ,

    Y = a ,

    Z = a ?

    yes

    | ?- gen((a+b)*,[X,Y]).

    X = a ,

    Y = a ? ;

    X = a ,

    Y = b ? ;

    X = b ,

Y = a ? ;

X = b ,

    Y = b ? ;

    no

Problem: efficiency, as SX is traversed a second time by append. More problematic is the following, however:

    | ?- gen((a*).b,[c]).

    ! Resource error: insufficient memory

Solution: difference lists. There is no more need to do append (concatenation in constant time using difference lists: diff_append(X-Y,Z-V,R) :- R = X-V, Y=Z).
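To see the constant-time concatenation at work, the diff_append above can be tried on its own (a hypothetical session; the answer below follows directly from the two unifications):

```prolog
diff_append(X-Y,Z-V,R) :- R = X-V, Y = Z.

/* | ?- diff_append([a,b|T1]-T1, [c,d|T2]-T2, R).
   R = [a,b,c,d|T2]-T2,
   T1 = [c,d|T2]                                  */
```

No list is traversed: the open tail T1 of the first difference list is simply unified with the second list.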

    Program 5.2.2 Version with difference lists:

    generate(X,[X|T],T) :- atomic(X).

    generate(X +_Y,H,T) :- generate(X,H,T).

    generate(_X + Y,H,T) :- generate(Y,H,T).


    generate(X.Y,H,T) :- generate(X,H,T1), generate(Y,T1,T).

    generate(*(_),T,T).

    generate(*(X),H,T) :- generate(X,H,T1), generate(*(X),T1,T).

    gen(RE,S) :- generate(RE,S,[]).

    We can now call:

    | ?- gen((a*).b,[c]).

    no

[Insert Figure showing call graph for both versions of gen.]

Explain alternate reading: generate(RE, InEnv, OutEnv): InEnv = characters still to be consumed overall; OutEnv = characters remaining after RE has been matched.

Common theme: sometimes called threading. Can be written in DCG style notation:

    Program 5.2.3

    generate(X) --> [X], {atomic(X)}.

    generate(X +_Y) --> generate(X).

    generate(_X + Y) --> generate(Y).

    generate(.(X,Y)) --> generate(X), generate(Y).

    generate(*(_)) --> [].

    generate(*(X)) --> generate(X), generate(*(X)).

    5.2.1 Representation: Prolog terms vs facts

In the NFA program we have represented the automaton by a set of Prolog facts. In the Regular Expression program we have encoded the regular expression to be interpreted as a Prolog term. Which approach is better? It depends. Advantages of the first approach with Prolog facts are:

we can use Prolog queries to search the fact database, e.g., trans(Pre,_,2) to find predecessors of 2 or trans(X,_,X) to find states with self-loops;

Prolog provides automatic first argument indexing; e.g., looking for trans(X,A,Y) with X ground will be very efficient;

the Prolog facts do not have to be passed as arguments to the interpreter.

Disadvantages are:

the interpreter cannot easily modify or decompose the program to be interpreted (contrast this with the term representation, where, e.g., in the regular expression example, sub-expressions of X + Y are simply passed to the recursive calls);

modifications via assert/retract are expensive and have to be handled with extreme care inside an interpreter.


    5.3 Propositional Logic

    Let us try to write an interpreter for simple propositional logic formulas, usingthe constants true and false, as well as the two connectives or and and.

    Program 5.3.1

    int(const(true)).

    int(const(false)) :- fail.

    int(and(X,Y)) :- int(X), int(Y).

    int(or(X,Y)) :- int(X) ; int(Y).

Note that for the logical connectives we again use the compositional style, whereby to compute the effect/meaning of a formula, we first compute the effect/meaning of the direct subformulas and then derive the overall effect/meaning. The interpreter can be used as follows.

| ?- int(and(const(true),const(true))).

    yes

    | ?- int(and(const(true),const(false))).

    no

One may wonder why we have wrapped true and false into const/1. The reason is to separate constants from the other formulas and to be able to ask queries of the following kind:

    | ?- int(and(const(Y),const(X))).

    X = true,

Y = true ? ;

    no

Without the const/1 wrapper we would have had to ask the query ?- int(and(Y,X)), which has infinitely many solutions... [Explain a bit better]

Let us now try to add negation to our interpreter. The obvious solution is to directly employ Prolog negation and to add the following clause to the above program:

    Program 5.3.2

    int(not(X)) :- \+ int(X).

    At first sight this seems to work:

| ?- int(not(const(true))).
no

    | ?- int(not(const(false))).

    yes

    However, take a look at the following two queries:


    | ?- int(not(const(X))).

    no

| ?- int(const(X)).
X = true ? ;

    no

What is going on here? Why does Prolog fail to find the solution X = false for the first query, while it is able to find the solution X = true for the second one?

The reason is that Prolog's built-in negation is not logical negation but so-called negation-as-failure. This negation coincides with logical negation only when its argument contains no variables at the moment it is called. To understand this issue, let us examine a simpler program:

    Program 5.3.3

    int(0).

    int(s(X)) :- int(X).

    nia :- \+ int(a). /* succeeds */

    nia2 :- \+ int(X), X=a. /* fails */

    nia3 :- X=a, \+ int(X). /* succeeds */

Logically the program is int(0) ∧ ∀X.(int(s(X)) ← int(X)).

Prolog (and logic programming) uses the negation as failure rule: if we cannot prove p then assume that ¬p is true. Note that in classical logic one can have that neither p nor ¬p is a logical consequence. There is, however, a way to view (in some circumstances) Prolog negation as classical negation on a transformed program. This concept is called the Clark completion. Idea: view a predicate definition not as a series of implications but as one big if-and-only-if. Technique: introduce an equality predicate and translate all clauses so that the head contains only variables (the same variables for all clauses of the same predicate).

The first clause can be translated into:

∀Y. int(Y) ← Y = 0

The second clause can be transformed as follows:

∀X. int(s(X)) ← int(X)

∀X,Y. int(Y) ← Y = s(X) ∧ int(X)

∀Y. int(Y) ← ∃X.(Y = s(X) ∧ int(X))

We can now combine both into IFF(P): ∀Y. int(Y) ↔ (Y = 0 ∨ ∃X.(Y = s(X) ∧ int(X))).

Free-equality theory FET(P):

    0 = 0


∀X,Y. s(X) = s(Y) → X = Y

∀X. ¬(s(X) = 0)

∀X. ¬(0 = s(X))

We have that FET(P) ∪ IFF(P) |= ¬int(a). However, Prolog's implementation of negation is only sound when there are no free variables inside the negated goal!

Hence to use negation (inside an interpreter) there are basically three solutions. The first is to always ensure that there are no variables when we call the negation. This may be difficult to achieve in some circumstances, and generally means we can use the interpreter in only one specific way. But it is still often a practical solution.

The second solution is to delay negated goals until they become ground. This can be achieved using the built-in when/2 of Prolog. The call when(Cond,Call) waits until the condition Cond becomes true, at which point Call is executed. While Call can be any Prolog goal, Cond can only use nonvar(X), ground(X), ?=(X,Y), as well as combinations thereof built with conjunction (,) and disjunction (;). With this built-in we can implement a safe version of negation, which will ensure that the Prolog negation is only called when no variables are left inside the negated call:

    Program 5.3.4

    safe_not(P) :- when(ground(P), \+(P)).
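Assuming the int/1 program from Program 5.3.3 is loaded and the system supports when/2, both goal orders of the earlier nia2/nia3 example now behave the same. The predicate names nia2b/nia3b below are our choice for illustration:

```prolog
nia2b :- safe_not(int(X)), X = a.  /* now succeeds: \+ int(X) is delayed
                                      until X=a grounds the goal */
nia3b :- X = a, safe_not(int(X)).  /* succeeds as before */
```

In other words, safe_not restores binding-insensitivity for negation, at the price of possibly leaving goals suspended.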

Disadvantage: this does not work with all Prolog systems (though most now support co-routining). One can also get a floundering goal, where all goals suspend. The Gödel programming language supported such a safe version of negation.

Another common technique to circumvent this problem is to get rid of the need for the built-in negation, by explicitly writing a predicate for the negation:

    Program 5.3.5

    int(const(true)).

    int(const(false)) :- fail.

    int(and(X,Y)) :- int(X), int(Y).

    int(or(X,Y)) :- int(X) ; int(Y).

    int(not(X)) :- nint(X).

    nint(const(false)).

    nint(const(true)) :- fail.

    nint(and(X,Y)) :- nint(X) ; nint(Y).

    nint(or(X,Y)) :- nint(X),nint(Y).

    nint(not(X)) :- int(X).


This interpreter now works as expected for negation and partially instantiated queries:

    | ?- int(not(const(X))).

    X = false ? ;

    no

    Exercise 5.3.6 Extend the above interpreter for implication imp/2.
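A possible solution sketch for this exercise, reading imp(X,Y) as (not X) or Y:

```prolog
int(imp(X,Y)) :- nint(X) ; int(Y).
nint(imp(X,Y)) :- int(X), nint(Y).   /* not(X -> Y)  =  X and not(Y) */
```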

Literature: Clark completion and equality theory, SLDNF, constructive negation, the Gödel programming language, ...

    5.4 A First Simple Imperative Language

Let us first choose an extremely simple language with three constructs:

variable definition def, defining a single variable. Example: def x

assignment := to assign a value to a variable. Example: x := 3

sequential composition ; to compose two constructs. Example: def x; x := 3

Imperative programs access and modify a global state. When writing an interpreter we thus need to model this environment. Thus, we first provide auxiliary predicates to store and retrieve variable values in an environment:

    Program 5.4.1

    /* def(OldEnv, VariableName, NewEnv) */

    def(Env,Key,[Key/undefined|Env]).

    /* store(OldEnv, VariableName, NewValue, NewEnv) */

    store([],Key,Value,[exception(store(Key,Value))]).

    store([Key/_Value2|T],Key,Value,[Key/Value|T]).

    store([Key2/Value2|T],Key,Value,[Key2/Value2|BT]) :-

    Key \== Key2, store(T,Key,Value,BT).

    /* lookup(VariableName, Env, CurrentValue) */

    lookup(Key,[],_) :- print(lookup_var_not_found_error(Key)),nl,fail.

    lookup(Key,[Key/Value|_T],Value).

    lookup(Key,[Key2/_Value2|T],Value) :-

    Key \== Key2,lookup(Key,T,Value).

We suppose that store and lookup will be called with all but the last argument bound:

    | ?- store(Old,x,X,[x/3,y/2]).

    X = 3 ,

    Old = [x/_A,y/2] ? ;

    no


    Program 5.4.2

int(X:=V,In,Out) :- store(In,X,V,Out).
int(def(X),In,Out) :- def(In,X,Out).

    int(X;Y,In,Out) :- int(X,In,I2), int(Y,I2,Out).

    test(R) :- int( def x; x:= 5; def z; x:= 3; z:= 2 , [],R).

    | ?- int( def x; x:= 5; def z; x:= 3; z:= 2 , [],R).

    R = [z/2,x/3]

    | ?- int( def x; x:= 5; def z; x:= 3; z:= X , In, [z/2,x/RX]).

    X = 2 ,

    In = [],

    RX = 3 ? ;

    no

Now let us allow expressions to be evaluated on the right-hand side of an assignment. To make our interpreter a little bit simpler, we suppose that variable uses are preceded by a $. E.g., x := $x+1.

We need a separate predicate to evaluate expressions. This is again a common pattern in interpreter development: we have one predicate to execute statements, modifying an environment, and one predicate to evaluate expressions, returning a value. If the expressions in the language to be interpreted are side-effect free, the eval predicate need not return a new environment. This is what we have done below:

    :- op(200,fx,$).

    /* eval(Expression, Env, ExprValue) */

    eval(X,_Env,Res) :- number(X),Res=X.

    eval($(X),Env,Res) :- lookup(X,Env,Res).

    eval(+(X,Y),Env,Res) :- eval(X,Env,RX), eval(Y,Env,RY), Res is RX+RY.

    eint(X:=V,In,Out) :- eval(V,In,Res),store(In,X,Res,Out).

    eint(def(X),In,Out) :- def(In,X,Out).

    eint(X;Y,In,Out) :- eint(X,In,I2), eint(Y,I2,Out).

    | ?- eint( def x; x:= 5; def z; x:= $x+1; z:= $x+($x+2) , [],R).

    R = [z/14,x/6] ?

    Exercise 5.4.3 Extend the interpreter for other operators, like multiplication* and exponentiation **.
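A sketch for these two operators (note that in most Prolog systems the arithmetic ** of is/2 returns a float; whether to truncate to an integer is a design choice we leave open):

```prolog
eval(*(X,Y),Env,Res) :- eval(X,Env,RX), eval(Y,Env,RY), Res is RX*RY.
eval(**(X,Y),Env,Res) :- eval(X,Env,RX), eval(Y,Env,RY), Res is RX**RY.
```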

    Exercise 5.4.4 Rewrite the interpreter so that one does not need to use the$. Make sure that your interpreter only has a single (and correct) solution.


    5.4.1 Summary

The general scheme for interpreting statements is:

int( Statement[X1,...,Xn], I1, OutEnv) :-
    int(X1,I1,I2),      /* or: eval(X1,I1,R1), I2=I1 */
    ...,
    int(Xn,In,OutEnv).

    For evaluating expressions it is:

    eval(Expr [X1,...XN], Env, Res) :-

    eval(X1,Env,R1), ...,

    eval(Xn,Env,Rn),

    compute(R1,...,Rn,Res).

5.5 An imperative language with conditionals

Two choices:

treat a boolean expression 1=x like an expression returning a boolean value;

introduce a separate class of boolean expressions with a separate predicate. Here there are again two sub-choices:

Write a predicate test_be(BE,Env) which succeeds if the boolean expression succeeds in the environment Env.

Write a predicate eval_be(BE,Env,BoolRes).

First, though, we need to examine how to write an if-then-else in Prolog itself. Below we present 4 ways to do this.

    Program 5.5.1

    ift1(Test,Then,Else) :- Test,!,Then.

    ift1(Test,Then,Else) :- Else.

    ift2(Test,Then,Else) :- Test, Then.

    ift2(Test,Then,Else) :- Not_Test, Else.

    % see how to implement not earlier in section

    ift3(Test,Then,Else) :- (Test -> Then ; Else).

    ift4(Test,Then,Else) :- if(Test,Then,Else). % SICStus Prolog

The first version is not declarative. A program is declarative if its meaning is independent of the actual execution (left-to-right), i.e., the program can be seen as a logical theory declaring the problem to be solved. Sometimes such a program is also called a pure logic program.

Sometimes we can adopt a more precise definition. We say that a predicate p of arity n is declarative iff for all terms a1, . . . , an:


for every binding (unification) θ: p(a1,...,an), θ ≈ θ, p(a1,...,an) (binding-insensitive)

p(a1,...,an), fail ≈ fail, p(a1,...,an) (side-effect free)

Version 4 is declarative; Version 2 is declarative if we use a declarative encoding of the not (see earlier). Versions 1 and 3 are not declarative. The use of the -> construct is, however, better in the sense that the cut is local and more predictable than the full-blown cut. However, if the Test contains no free variables (or only existential output variables) then it will behave like versions 2 and 4, while being more efficient. Below we use version 3.

    Program 5.5.2

    eint(if(BE,S1,S2),In,Out) :-

    eval_be(BE,In,Res),

    ((Res=true -> eint(S1,In,Out)) ; eint(S2,In,Out)).

eval_be(=(X,Y),Env,Res) :- eval(X,Env,RX), eval(Y,Env,RY),
    ((RX=RY -> Res=true) ; (Res=false)).

5.6 An imperative language with loops

eint(while(BE,S),In,Out) :- eval_be(BE,In,Res),
    ((Res=true -> eint(;(S,while(BE,S)),In,Out)) ; In=Out).


Alternate solution:

eint(skip,E,E).
eint(while(BE,S),In,Out) :- eint(if(BE,;(S,while(BE,S)),skip),In,Out).

    5.7 Simple Three Address Code

Let us take some sample three-address code, in the format of Chapters 9 and 12 of the new Dragon book:

    (1) i = 0

    (2) x = 2

    (3) if i>1 goto 7

    (4) x = x*x

    (5) i = i + 1

(6) goto 3
(7) res = x

    Basically, we have:

assignments with three operands x,y,z: x = y OP z

copy assignments with two operands x,y: x = y

unconditional jumps to a label L: goto L

conditional jumps with two operands x,y and a label L: if x OP y goto L

The difference to the previous interpreter is that instructions now have labels, and that we have the possibility to jump straight to the labels.

    How to represent the Three-address-code program:

Clausal Representation: Like in Section 5.1, use clauses to represent the object program.

Term Representation (Reified representation): Like in Sections 5.2, 5.3 and 5.4-5.6, use a Prolog term to represent the object program to be interpreted.

In principle both techniques can be used. The clausal representation has the advantage that we can use Prolog queries to search and traverse the object program. However, modifying the object program can be expensive. The term representation has the advantage that we can easily generate new terms at runtime (see our second implementation of the while loop in Section 5.6), and that it is efficient to obtain direct sub-terms of expressions by simple unification, without having to search the Prolog clause database.

While for the imperative language without labels in Sections 5.4-5.6 it was more elegant to use a term representation, the situation here is different, as we need to be able to jump to arbitrary labels in the object program.

A possible encoding using Prolog clauses for a predicate inst/5 is as follows:


    inst(1,assign,i,0,_).

    inst(2,assign,x,2,_).

inst(3,if(>),i,1,7).
inst(4,assign(*),x,x,x).

    inst(5,assign(+),i,i,1).

    inst(6,goto,_,_,3).

    inst(7,assign,res,x,_).

    inst(8,halt,_,_,_).

The general format for an individual instruction fact is:

inst(Label, OPCODE, Op1, Op2, Op3).

    We want to achieve this:

    | ?- run(Op).

    Op = [i/2,x/16,res/16] ? ;

    no

In contrast to the previous interpreter, we now also need to keep track of a program counter, in addition to the current environment.

We reuse lookup from the previous sections, but extend store so that, in the case it reaches the empty environment, it adds a new entry into the environment rather than printing an error. This accounts for the fact that in our three-address code variables do not have to be defined; they are simply assigned a value the first time.
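The extended store is not shown in the text; a version consistent with the description would replace the exception clause of Program 5.4.1 by one that simply adds the binding:

```prolog
store([],Key,Value,[Key/Value]).              /* unknown variable: create it */
store([Key/_Old|T],Key,Value,[Key/Value|T]).
store([Key2/Value2|T],Key,Value,[Key2/Value2|BT]) :-
    Key \== Key2, store(T,Key,Value,BT).
```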

    run(Out) :- tint(1,[],Out).

    tint(PC,In,Out) :-

inst(PC,Opcode,A1,A2,A3),
    NextPC is PC+1,

    ex_opcode(Opcode,A1,A2,A3,NextPC,In,Out).

    ex_opcode(halt,_,_,_,_,Env,Env).

    ex_opcode(goto,_,_,Label,_,In,Out) :-

    tint(Label,In,Out).

    ex_opcode(assign,Var,RHS,_,NextPC,In,Out) :-

    load(RHS,In,RHSVAL),

    store(In,Var,RHSVAL,Out2),

    tint(NextPC,Out2,Out).

    ex_opcode(if(OP),RHS1,RHS2,Label,NextPC,In,Out) :-

    load(RHS1,In,RHSVAL1),

load(RHS2,In,RHSVAL2),
    ((test_op(OP,RHSVAL1,RHSVAL2) -> tint(Label,In,Out)) ; tint(NextPC,In,Out)).

    ex_opcode(assign(OP),Var,RHS1,RHS2,NextPC,In,Out) :-

    load(RHS1,In,RHSVAL1),

    load(RHS2,In,RHSVAL2),

    ex_op(OP,RHSVAL1,RHSVAL2,Res),


    store(In,Var,Res,Out2),

    tint(NextPC,Out2,Out).

    ex_op(*,A1,A2,R) :- R is A1 * A2.

    ex_op(+,A1,A2,R) :- R is A1 + A2.

    ex_op(-,A1,A2,R) :- R is A1 - A2.

test_op(>,A1,A2) :- A1 > A2.

test_op(>=,A1,A2) :- A1 >= A2.

    load(Key,Env,Val) :-

    (number(Key) -> Val=Key ; lookup(Key,Env,Val)).


    Chapter 6

Writing a Prolog Interpreter in Prolog

Def: object program

How to represent the Prolog program:

Clausal Representation: Like in Section 5.1, use clauses to represent the object program.

Term Representation (Reified representation): Like in Sections 5.2 and 5.3, use a Prolog term to represent the object program.

    [ vanilla meta-interpreter (see, e.g., [21, 3]). ]

Program 6.0.1

:- dynamic app/3.

    app([],L,L).

    app([H|X],Y,[H|Z]) :- app(X,Y,Z).

    solve(true).

solve(','(A,B)) :- solve(A), solve(B).

solve(Goal) :- Goal \= ','(_,_), Goal \= true,

    clause(Goal,Body), solve(Body).

    Debugging version:

    Program 6.0.2

    isolve(Goal) :- isolve(Goal,0).

    indent(0).

indent(s(X)) :- print('+-'), indent(X).


    isolve(true,_).

isolve(','(A,B),IL) :- isolve(A,IL), isolve(B,IL).

isolve(Goal,IL) :- Goal \= ','(_,_), Goal \= true,
    print('|'), indent(IL), print('> enter '), print(Goal), nl,

    backtrack_message(IL,fail(Goal)),

    clause(Goal,Body), isolve(Body,s(IL)),

    backtrack_message(IL,redo(Goal)),

print('|'), indent(IL), print('> exit '), print(Goal), nl.

    backtrack_message(_IL,_).

    backtrack_message(IL,Msg) :-

print('|'), indent(IL), print('> '), print(Msg), nl, fail.

6.0.1 Towards a reified version

    First (wrong) attempt:

    Program 6.0.3

    :- use_module(library(lists),[member/2]).

    rsolve(true,_Prog).

    rsolve(,(A,B),Prog) :- rsolve(A,Prog),rsolve(B,Prog).

    rsolve(Goal,Prog) :- member(:-(Goal,Body),Prog), rsolve(Body,Prog).

    At first sight seems to work:

    | ?- rsolve(p,[ (p:-q,r), (q :- true), (r :- true)]).

yes
| ?- rsolve(p(X), [ (p(X) :- q(X)) , (q(a) :- true), (q(b) :- true) ]).

    X = a ? ;

    X = b ? ;

    no

But here it doesn't:

    | ?- rsolve(i(s(s(0))), [ (i(0) :- true) , (i(s(X)) :- i(X)) ]).

    no

What is the problem?

    | ?- rsolve(i(s(0)), [ (i(0) :- true) , (i(s(X)) :- i(X)) ]) .

    X = 0

We need standardising apart, i.e., to generate a fresh copy of the variables inside the clauses being used:

    Program 6.0.4


    rsolve(true,_Prog).

    rsolve(,(A,B),Prog) :- rsolve(A,Prog),rsolve(B,Prog).

    rsolve(Goal,Prog) :- get_clause(Prog,Goal,Body), rsolve(Body,Prog).

    get_clause([ Clause | _], Head, Body) :-

    copy_term(Clause,CClause),

    CClause = :-(Head,Body).

    get_clause([ _ | Rest], Head, Body) :- get_clause(Rest,Head,Body).

Exercise 6.0.5 Improve the data structure for the above interpreter by grouping clauses according to predicate name and arity.
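A possible sketch, using a representation pred(Name/Arity, Clauses) of our own choosing; member/2 is from library(lists). Note that this assumes the goal's predicate symbol is known at call time (otherwise functor/3 cannot be used this way):

```prolog
get_clause(Prog,Head,Body) :-
    functor(Head,F,N),
    member(pred(F/N,Clauses),Prog),     /* find the right predicate group */
    member(Clause,Clauses),
    copy_term(Clause,':-'(Head,Body)).  /* standardise apart */
```

rsolve would then be called with, e.g., [pred(i/1, [ (i(0) :- true), (i(s(X)) :- i(X)) ])] as the program.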

    6.0.2 Towards a declarative Interpreter

The first vanilla interpreter can be given a declarative reading (cite De Schreye, Martens). However, the reified one certainly cannot.

Recall what we mean by declarative: we say that a predicate p of arity n is declarative iff

for every binding (unification) θ: p(a1,...,an), θ ≈ θ, p(a1,...,an) (binding-insensitive)

p(a1,...,an), fail ≈ fail, p(a1,...,an) (side-effect free)

copy_term is not declarative: copy_term(p(X),Q), X=a is different from X=a, copy_term(p(X),Q).

    Exercise 6.0.6 Find other predicates in Prolog which are not declarative andexplain why.

    Why is it a good idea to be declarative:

    clear logical semantics; independent of operational reading

gives optimisers more liberty to reorder goals: parallelisation, optimisation, analysis, specialisation, ...

Solution: the ground representation. Going from [ (i(0) :- true) , (i(s(X)) :- i(X)) ] with variables to [ (i(0) :- true) , (i(s(var(x))) :- i(var(x))) ]. What if the program itself uses var/1? Then use [ (i(term(0,[])) :- true) , (i(term(s,[var(x)])) :- i(var(x))) ].

But now things get more complicated: we need to write an explicit unification procedure! We need to treat substitutions and apply bindings!

InstanceDemo or the Lifting Interpreter

Trick: lift var(1) to variables; this can be done declaratively! Try to write lift(p(var(1),var(1)), X), which gives the answer X = p(_1,_1). Lift can be used to generate fresh copies of clauses!


    Program 6.0.7

/* --------------------- */
/* solve(NgGoal,GrRules) */
/* --------------------- */

    solve([],_GrRules).

    solve([NgH|NgT],GrRules) :-

    fresh_member(term(clause,[NgH|NgBody]),GrRules),

    solve(NgBody,GrRules),

    solve(NgT,GrRules).

    /* --------------------------------- */

    /* fresh_member(NgExpr,GrListOfExpr) */

    /* --------------------------------- */

    fresh_member(NgX,[GrH|_GrT]) :- lift(GrH,NgX).

    fresh_member(NgX,[_GrH|GrT]) :- fresh_member(NgX,GrT).

    /* ---------------------------------------- */

    /* lift(GroundRepOfExpr,NonGroundRepOfExpr) */

    /* ---------------------------------------- */

    lift(G,NG) :- mkng(G,NG,[],_Sub).

    mkng(var(N),X,[],[sub(N,X)]).

    mkng(var(N),X,[sub(N,X)|T],[sub(N,X)|T]).

    mkng(var(N),X,[sub(M,Y)|T],[sub(M,Y)|T1]) :-

    N \== M, mkng(var(N),X,T,T1).

    mkng(term(F,Args),term(F,IArgs),InSub,OutSub) :-

    l_mkng(Args,IArgs,InSub,OutSub).

    l_mkng([],[],Sub,Sub).

    l_mkng([H|T],[IH|IT],InSub,OutSub) :-

    mkng(H,IH,InSub,IntSub),

    l_mkng(T,IT,IntSub,OutSub).

    test(X,Y,Z) :- solve([term(app,[X,Y,Z])], [

    term(clause,[term(app,[term([],[]),var(l),var(l)]) ]),

term(clause,[term(app,[term('.',[var(h),var(x)]),var(y),

term('.',[var(h),var(z)])]),

    term(app,[var(x),var(y),var(z)]) ])

    ]).

    6.0.3 Meta-interpreters and pre-compilation

[TO DO: Integrate material below into main text]

A meta-program is a program which takes another program, the object program, as input, manipulating it in some way. Usually the object and meta-program are supposed to be written in (almost) the same language. Meta-programming can be used for (see e.g. [21, 5]) extending the programming language, modifying the control [9], debugging, program analysis, program transformation and, as we will see, specialised integrity checking.

6.1 The power of the ground vs. the non-ground representation

    6.1.1 The ground vs. the non-ground representation [Rewrite]

In logic programming, there are basically two (fundamentally) different approaches to representing an object level expression, say the atom p(X, a), at the meta-level. In the first approach one uses the term p(X, a) as the meta-level representation. This is called a non-ground representation, because it represents an object level variable by a meta-level variable. In the second approach one would use something like the term struct(p, [var(1), struct(a, [])]) to represent the object level atom p(X, a). This is called a ground representation, as it represents an object level variable by a ground term. Figure 6.1 contains some further examples of the particular ground representation which we will use throughout this thesis. From now on, we use T to denote the ground representation of a term T. Also, to simplify notations, we will sometimes use p(t1, . . . , tn) as a shorthand for struct(p, [t1, . . . , tn]).

Object level    Ground representation
X               var(1)
c               struct(c, [])
f(X, a)         struct(f, [var(1), struct(a, [])])
p ← q           struct(clause, [struct(p, []), struct(q, [])])

Figure 6.1: A ground representation

The ground representation has the advantage that it can be treated in a purely declarative manner, while for many applications the non-ground representation requires the use of extra-logical built-ins (like var/1 or copy/2). The non-ground representation also has semantical problems (although they were solved to some extent in [10, 32, 33]). The main advantage of the non-ground representation is that the meta-program can use the underlying¹ unification mechanism, while for the ground representation an explicit unification algorithm is required. This (currently) induces a difference in speed reaching several orders of magnitude. The current consensus in the logic programming community is that both representations have their merits and the actual choice depends

¹The term underlying refers to the system in which the meta-interpreter itself is written.


    40 CHAPTER 6. WRITING A PROLOG INTERPRETER IN PROLOG

on the particular application. In the following subsection we discuss the differences between the ground and the non-ground representation in more detail. For further discussion we refer the reader to [21], [22, 6], and the conclusion of [32].

    Unification and collecting behaviour

As already mentioned, meta-interpreters for the non-ground representation can simply use the underlying unification. For instance, to unify the object level atoms p(X, a) and p(Y, Y) one simply calls p(X, a) = p(Y, Y). This is very efficient, but after the call both atoms will have become instantiated to p(a, a). This means that the original atoms p(X, a) and p(Y, Y) are no longer accessible (in Prolog, for instance, the only way to undo these instantiations is via failing and backtracking), i.e. we cannot test in the same derivation whether the atom p(X, a) unifies with another atom, say p(b, a). This in turn means that it is impossible to write a breadth-first like or a collecting (i.e. performing something like findall/3) meta-interpreter declaratively for the non-ground representation (it is possible to do this non-declaratively by using for instance Prolog's extra-logical copy/2 built-in).

In the ground representation on the other hand, we cannot use the underlying unification (for instance p(var(1),a) = p(var(2),var(2)) will fail). The only declarative solution is to use an explicit unification algorithm. Such an algorithm, taken from [12], is included in Appendix ??. (For the non-ground representation such an algorithm cannot be written declaratively; non-declarative features, like var/1 and =../2, have to be used.) For instance, unify(p(var(1),a),p(var(2),var(2)),Sub) yields an explicit representation of the unifier in Sub, which can then be applied to other expressions. In contrast to the non-ground representation, the original atoms p(var(1),a) and p(var(2),var(2)) remain accessible in their original form and can thus be used again to unify with other atoms. Writing a declarative breadth-first like or a collecting meta-interpreter poses no problems.
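To give a flavour of what such an explicit unification algorithm looks like, here is a minimal sketch of our own (simplified with respect to the algorithm of [12] referred to above: there is no occurs-check, and the computed substitution is a plain list of Index-Term bindings which are not fully dereferenced):

```prolog
% Explicit unification on the ground representation.
% The substitution is a list of bindings N-T, meaning var(N) -> T.
unify(T1, T2, Sub) :- unify(T1, T2, [], Sub).

unify(var(N), T, SIn, SOut) :- !, unify_var(N, T, SIn, SOut).
unify(T, var(N), SIn, SOut) :- !, unify_var(N, T, SIn, SOut).
unify(struct(F, As), struct(F, Bs), SIn, SOut) :-
    unify_args(As, Bs, SIn, SOut).

unify_args([], [], S, S).
unify_args([A|As], [B|Bs], SIn, SOut) :-
    unify(A, B, SIn, S1),
    unify_args(As, Bs, S1, SOut).

unify_var(N, T, SIn, SOut) :-
    (  member(N-Bound, SIn)              % var(N) already bound
    -> unify(Bound, T, SIn, SOut)
    ;  T = var(M), member(M-Bound2, SIn) % T is a bound variable
    -> unify(var(N), Bound2, SIn, SOut)
    ;  T == var(N)                       % trivial binding
    -> SOut = SIn
    ;  SOut = [N-T|SIn]
    ).
```

A query such as unify(struct(p,[var(1),struct(a,[])]), struct(p,[var(2),var(2)]), Sub) then succeeds with a substitution binding variables 1 and 2, while both input atoms remain accessible in their original form.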

    Standardising apart and dynamic meta-programming

To standardise apart object program clauses in the non-ground representation, we can again use the underlying mechanism. For this we simply have to store the object program explicitly in meta-program clauses. For instance, if we represent the object level clause

anc(X, Y) ← parent(X, Y)

    by the meta-level fact

    clause(1, anc(X, Y), [parent(X, Y)])

2Note that the findall/3 built-in is non-declarative, in the sense that the meaning of programs using it may depend on the selection rule. For example, given a program containing just the fact p(a), we have that findall(X, p(X), [A]), X = b succeeds (with the answer {A/a, X/b}) when executed left-to-right but fails when executed right-to-left.


we can obtain a standardised apart version of the clause simply by calling clause(1, Hd, Bdy). Similarly, we can resolve this clause with the atom anc(a, B) by calling clause(C, anc(a, B), Bdy).

The disadvantage of this method, however, is that the object program is fixed, making it impossible to do dynamic meta-programming (i.e. dynamically change the object program, see [21]); this can be remedied by using a mixed meta-interpreter, as we will explain in Subsection ?? below. So, unless we resort to such extra-logical built-ins as assert and retract, the object program has to be represented by a term in order to do dynamic meta-programming. This in turn means that non-logical built-ins like copy/2 have to be used to perform the standardising apart. Figure 6.2 illustrates these two possibilities. Note that without the copy in Figure 6.2, the second meta-interpreter would incorrectly fail for the given query. For our application this means that, on the one hand, using the non-logical copying approach unduly complicates the specialisation task while at the same time leading to a serious efficiency bottleneck. On the other hand, using the clause representation implies that representing updates to a database becomes much more cumbersome. Basically, we also have to encode the updates explicitly as meta-program clauses, thereby making dynamic meta-programming impossible.

1. Using a Clause Representation:

   solve([]) ←
   solve([H|T]) ← clause(H,B), solve(B), solve(T)
   clause(p(X), []) ←

   ← solve([p(a), p(b)])

2. Using a Term Representation:

   solve(P, []) ←
   solve(P, [H|T]) ← member(Cl,P), copy(Cl,cl(H,B)), solve(P,B), solve(P,T)

   ← solve([cl(p(X), [])], [p(a), p(b)])

Figure 6.2: Two non-ground meta-interpreters with {p(X) ←} as object program

For the ground representation, it is again easy to write an explicit standardising apart operator in a fully declarative manner. For instance, in the programming language Gödel [22] the predicate RenameFormulas/3 serves this purpose.
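For illustration, a possible renaming operator for the ground representation can be sketched as follows (this is our own sketch, simpler than Gödel's RenameFormulas/3: it just shifts every variable index past the maximal index of the expression one wants to rename apart from):

```prolog
% Shift every variable index in a ground-represented expression by Max.
rename_apart(var(N), Max, var(M)) :- M is N + Max.
rename_apart(struct(F, Args), Max, struct(F, RArgs)) :-
    rename_args(Args, Max, RArgs).

rename_args([], _, []).
rename_args([A|As], Max, [R|Rs]) :-
    rename_apart(A, Max, R),
    rename_args(As, Max, Rs).

% Maximal variable index occurring in a ground-represented expression.
max_var(var(N), N).
max_var(struct(_, Args), M) :- max_var_list(Args, 0, M).

max_var_list([], M, M).
max_var_list([A|As], Acc, M) :-
    max_var(A, MA), NewAcc is max(Acc, MA),
    max_var_list(As, NewAcc, M).
```

Before resolving a goal with a clause, one computes the maximal index Max of the goal with max_var/2 and then renames the clause with rename_apart/3; all of this is fully declarative.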

    Testing for variants or instances

In the non-ground representation we cannot test in a declarative manner whether two atoms are variants or instances of each other, and non-declarative built-ins, like var/1 and =../2, have to be used to that end. Indeed, suppose that we have implemented a predicate variant/2 which succeeds if its two arguments represent two atoms which are variants of each other and fails otherwise. Then variant(p(X), p(a)) must fail and variant(p(a), p(a)) must succeed. This, however, means that the query variant(p(X), p(a)), X = a fails when using

3However, we cannot generate a renamed apart version of anc(a,B). The copy/2 built-in has to be used for that purpose.


a left to right computation rule and succeeds when using a right to left computation rule. Hence variant/2 cannot be declarative (the exact same reasoning holds for the predicate instance/2). Thus it is not possible to write declarative meta-interpreters which perform e.g. tabling, loop checks or subsumption checks.

Again, for the ground representation there is no problem whatsoever to write declarative predicates which perform variant or instance checks.
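A declarative variant check on the ground representation can be sketched as follows (our own sketch: two expressions are variants iff their variable indices can be put into a one-to-one correspondence, which we accumulate as a bijection):

```prolog
% variant(T1,T2): T1 and T2 are variants of each other.
variant(T1, T2) :- variant(T1, T2, [], _).

variant(var(N), var(M), MapIn, MapOut) :-
    match(N, M, MapIn, MapOut).
variant(struct(F, As), struct(F, Bs), MapIn, MapOut) :-
    variant_args(As, Bs, MapIn, MapOut).

variant_args([], [], Map, Map).
variant_args([A|As], [B|Bs], MapIn, MapOut) :-
    variant(A, B, MapIn, Map1),
    variant_args(As, Bs, Map1, MapOut).

% match/4 maintains a bijection N <-> M between variable indices.
match(N, M, Map, Map) :- member(N-M, Map), !.
match(N, M, Map, [N-M|Map]) :-
    \+ member(N-_, Map), \+ member(_-M, Map).
```

For example, variant(struct(p,[var(1),var(1)]), struct(p,[var(2),var(3)])) fails, while variant(struct(p,[var(1),var(2)]), struct(p,[var(2),var(1)])) succeeds. Since all arguments here are ground, the uses of \+ are unproblematic.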

    Specifying partial knowledge

One additional disadvantage of the non-ground representation is that it is more difficult to specify partial knowledge for partial evaluation. Suppose, for instance, that we know that a given atom (for instance the head of a fact that will be added to a deductive database) will be of the form man(T), where T is a constant, but we don't know yet at partial evaluation time which particular constant T stands for. In the ground representation this knowledge can be expressed as struct(man, [struct(C, [])]). However, in the non-ground representation we have to write this as man(X), which is unfortunately less precise, as the variable X now no longer represents only constants but stands for any term.

4A possible solution is to use the =../2 built-in to constrain X and represent the above atom by the conjunction man(X), X =.. [C]. This requires that the partial evaluator provides non-trivial support for the built-in =../2 and is able to specialise conjunctions instead of simply atoms, see Chapter ??.


    Chapter 7

    Verification Tools

7.1 Kripke Structures and Labeled Transition Systems

A Kripke structure is a graph with labels on the nodes:

- each node is a state of the system to be verified,
- the labels indicate which properties hold in that state.

A labeled transition system is typically a graph with labels on the edges:

- each node is again a state of the system to be verified,
- the labels indicate which particular action/operation of the system can trigger a state change.

    In this chapter we combine these two formalisms as follows:

Definition 7.1.1 A Labeled Transition System (LTS) is a tuple M = ⟨S, T, L, I, s0⟩ where

- S is a finite set of states,
- T ⊆ S × L × S is a transition relation ((s, a, t) ∈ T is also written as s →a t),
- L is a finite set of actions,
- P is a set of basic properties,
- I : S → 2^P is the node labeling function, indicating which properties hold in a particular state,
- s0 ∈ S is the initial state.

    Encoding this as a logic program:

    Program 7.1.2


[Diagram: three states s0 (open), s1 (closed) and s2 (closed, heat); transitions: s0 --close_door--> s1, s1 --open_door--> s0, s1 --start--> s2, s2 --open_door--> s0, s2 --stop--> s1.]

Figure 7.1: A simple LTS for a microwave oven

    start(s0).

    prop(s0,open).

    prop(s1,closed).

    prop(s2,closed).

    prop(s2,heat).

    trans(s0,close_door,s1).

    trans(s1,open_door,s0).

    trans(s1,start,s2).

    trans(s2,open_door,s0).

    trans(s2,stop,s1).

Note that trans does not have to be defined by facts alone: we can plug in an interpreter!
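For instance, trans/3 could itself be computed by rules, e.g. by an interpreter for some modelling language. A small hypothetical sketch (the counter system and the bound 3 are our own illustration, not part of the microwave example):

```prolog
% trans/3 computed by rules rather than enumerated as facts:
% a bounded counter that can be incremented, decremented or reset.
trans(counter(X), inc,   counter(Y)) :- X < 3, Y is X + 1.
trans(counter(X), dec,   counter(Y)) :- X > 0, Y is X - 1.
trans(counter(X), reset, counter(0)) :- X > 0.

start(counter(0)).
prop(counter(0), zero).
```

The reachability programs of this section work unchanged over such rule-defined transition relations.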

We now try to compute reach(M), the set of reachable states of M, which is important for checking so-called safety properties of a system:

    Program 7.1.3

    reach(X) :- start(X).

    reach(X) :- reach(Y), trans(Y,_,X).

This is elegant and logically correct, but does not really work with classical Prolog:

| ?- reach(X).
X = s0 ? ;
X = s1 ? ;
X = s0 ? ;
X = s2 ? ;
X = s1 ? ;
X = s0 ? ;
X = s1 ? ;
X = s0 ?
...

If we try to check a simple safety property of our system, namely that the door cannot be open while the heat is on, then we get stuck in an infinite loop:

    | ?- reach(X), prop(X,open), prop(X,heat).

    ..... infinite loop ....

The same problem persists if we change the order of the literals in the second clause:

    reach(X) :- trans(Y,_,X), reach(Y).

    7.2 Bottom-Up Interpreter

Top-down interpretation: start from the goal and try to find a refutation. Used by classical Prolog (SLD-resolution).

Bottom-up: start from the facts and try to reach the goal. Used by relational and deductive databases. In logic programming this corresponds to the TP operator of Definition 3.2.7.

Below we adapt our vanilla interpreter from [REF] to work backwards (i.e., bottom-up) and construct a table of solutions:

    Program 7.2.1

    :- dynamic start/1, prop/2, trans/3, reach/1.

    pred(start(_)). pred(prop(_,_)). pred(trans(_,_,_)). pred(reach(_)).

    :- dynamic table/1.

    :- dynamic change/0.

bup(Call) :- pred(Call), clause(Call,Body), solve(Body).

solve(true).
solve(','(A,B)) :- solve(A), solve(B).
solve(Goal) :- Goal \= ','(_,_), Goal \= true, table(Goal).

    run :- retractall(table(_)), bup, print_table.

    print_table :- table(X), print(X),nl,fail.


    print_table :- nl.

bup :- retractall(change),
       bup(Sol),
       \+ table(Sol),
       assert(table(Sol)),
       assert(change),
       fail.

bup :- print('.'), flush_output, (change -> bup ; nl).

    7.3 Tabling and XSB Prolog

    Tabling combines bottom-up with top-down evaluation.

    Whenever a predicate call is treated:

- check if a variant of the call already has an entry in the table;
- if there is no entry, then add a new entry and proceed as usual; whenever a solution is found for this call it is entered into the table (unless the solution is already in the table);
- if there is an entry, then look up the solutions in the table; do not evaluate the goal; if the table is currently empty then delay and try another branch first...

A system which implements tabling is XSB Prolog, available freely at: http://xsb.sourceforge.net/.

    Declaring a predicate reach with 1 argument as tabled:

    :- table reach/1.

    (There also exist mechanisms for XSB to decide automatically what to table.)

One can switch between variant tabling (:- use_variant_tabling p/n, the default) and subsumptive tabling (:- use_subsumptive_tabling p/n).

Other useful predicates: abolish_all_tables/0, get_calls_for_table(+PredSpec, ?Call).

This system has a very efficient tabling mechanism: an efficient way to check if a call has already been encountered, and an efficient way to propagate answers and check whether a table is complete.
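A classic illustration of why tabling terminates where plain Prolog loops is left-recursive transitive closure over a cyclic graph (a standard textbook example, not from the text above):

```prolog
:- table path/2.

% Left-recursive reachability over a cyclic graph. Plain Prolog loops
% on path(a,X); under tabling the repeated call path(X,Z) is answered
% from the table, so the query terminates with finitely many answers.
path(X, Y) :- edge(X, Y).
path(X, Y) :- path(X, Z), edge(Z, Y).

edge(a, b).
edge(b, a).
edge(b, c).
```

Under XSB, ?- path(a, X). enumerates the answers X = b, X = a and X = c (in some order) and then fails finitely.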

    Program 7.3.1

    :- table reach/1.

    reach(X) :- start(X).

    reach(X) :- reach(Y), trans(Y,_,X).


[Diagram: the SLD-tree for the tabled call reach(X). The left branch resolves via start(X), giving the answer X=s0; the right branch reach(Y), trans(Y,_,X) consumes the answers s0, s1, s2 from the table for reach/1 and derives X=s1 (via trans(s0,_,X)), X=s0 and X=s2 (via trans(s1,_,X)), and X=s0 and X=s1 (via trans(s2,_,X)). The table for reach/1 collects the answers s0, s1, s2.]

Figure 7.2: Tabled execution of reach(X)

    | ?- reach(X).

    X = s2;

    X = s1;

    X = s0;

    no

    | ?- reach(X), prop(X,open), prop(X,heat).

    no

7.4 Temporal Model Checking

[Copied from LOPSTR paper; needs to be rewritten]

    7.4.1 CTL syntax and semantics

The temporal logic CTL (Computation Tree Logic), introduced by Clarke and Emerson in [18], allows one to specify properties of specifications generally described as Kripke structures. The syntax and semantics of CTL are given below.

Given Prop, the set of propositions, the set of CTL formulae is inductively defined by the following grammar (where p ∈ Prop):

φ ::= true | p | ¬φ | φ ∧ φ | ∃◯φ | ∀◯φ | ∃(φ U φ) | ∀(φ U φ)

A CTL formula can be either true or false in a given state. For example, true is true in all states, ¬true is false in all states, and p is true in all states which contain the elementary proposition p. The symbol ◯ is the nexttime operator and U stands for until. ∀◯φ (resp. ∃◯φ) intuitively means that φ holds in every (resp. some) immediate successor of the current program state.
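In the interpreter presented later in this chapter (Fig. 7.4), CTL formulae are encoded as ordinary Prolog terms; the correspondence below is our summary of the operator names used there:

```prolog
% CTL formulae as Prolog terms, as used by sat/2:
%   p            ~ p(P)         ¬φ           ~ not(F)
%   φ1 ∧ φ2      ~ and(F,G)     φ1 ∨ φ2      ~ or(F,G)
%   ∃◯φ          ~ en(F)        ∀◯φ          ~ an(F)
%   ∃(φ1 U φ2)   ~ eu(F,G)      ∀(φ1 U φ2)   ~ au(F,G)
%   ∃◇φ          ~ ef(F)        ∀□φ          ~ ag(F)

% E.g. "the door is never open while the heat is on" for the microwave:
% ?- sat(ag(not(and(p(open), p(heat))))).
```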


[Diagram: (a) a Kripke structure with three states s0 (labelled p), s1 (labelled q) and s2 (labelled p, r), with s0 the initial state; (b) a prefix of the corresponding infinite transition tree rooted at s0.]

Figure 7.3: Example of a Kripke structure.

The formula ∀(φ1 U φ2) (resp. ∃(φ1 U φ2)) intuitively means that for every (resp. some) computation path, there exists an initial prefix of the path such that φ2 holds at the last state of the prefix and φ1 holds at all other states along the prefix.

The semantics of CTL formulae is defined with respect to a Kripke structure ⟨S, R, ℓ, s0⟩, with S being the set of states, R (⊆ S × S) the transition relation, ℓ (S → 2^Prop) giving the propositions which are true in each state, and s0 being the initial state. Figure 7.3a gives a graphical representation of a Kripke structure with 3 states s0, s1, s2, where s0 is the initial state. The propositions p, q, r label the states.

Generally it is required that any state has at least one outgoing edge. From a Kripke structure, we can define an infinite transition tree as follows: the root of the tree is labelled by s0. Any vertex labelled by s has one son labelled by s′ for each vertex s′ with a transition s → s′ in the Kripke structure. For instance, the Kripke structure of Fig. 7.3a gives the prefix of the transition tree starting from s0 shown in Fig. 7.3b.

If the system is not directly specified by a Kripke structure but e.g. by a Petri net, the markings and the transitions between them give resp. the states and the transition relation of the Kripke structure.

The CTL semantics is defined on states s by:

s |= true
s |= p iff p ∈ ℓ(s)
s |= ¬φ iff not(s |= φ)


s |= φ1 ∧ φ2 iff s |= φ1 and s |= φ2
s |= ∀◯φ iff for all states t such that (s, t) ∈ R, t |= φ
s |= ∃◯φ iff for some state t such that (s, t) ∈ R, t |= φ
s |= ∀(φ1 U φ2) iff for all paths (s0, s1, . . .) of the system with s = s0, ∃i(i ≥ 0 ∧ si |= φ2 ∧ ∀j(0 ≤ j < i → sj |= φ1))
s |= ∃(φ1 U φ2) iff there exists a path (s0, s1, . . .) with s = s0 such that ∃i(i ≥ 0 ∧ si |= φ2 ∧ ∀j(0 ≤ j < i → sj |= φ1))

If we look at the unfolding of the Kripke structure, starting from s0 (Fig. 7.3b), we can see, for instance, that

- s0 |= ∃◯p, since there exists a successor of s0 (i.e. s2) where p is true (p ∈ ℓ(s2)),
- s0 |= ∃◯¬p, since in some successor of s0, p does not hold,
- s0 |= ∀(p U q), since the paths from s0 either go to s1 where q is true or to s2 and then directly to s1. In both cases, the paths go through states where p is true before reaching one state where q holds.

Since they are often used, the following abbreviations are defined:

- ∀◇φ ≡ ∀(true U φ), i.e. for all paths, φ eventually holds,
- ∃◇φ ≡ ∃(true U φ), i.e. there exists a path where φ eventually holds,
- ∃□φ ≡ ¬∀◇(¬φ), i.e. there exists a path where φ always holds,
- ∀□φ ≡ ¬∃◇(¬φ), i.e. for all paths φ always holds.

E.g. ∀□φ states that φ is an invariant of the system, ∀◇φ states that φ is unavoidable and ∃◇φ states that φ may occur.

7.4.2 CTL as fixed point and its translation to logic programming

One starting point of this paper is that both the Kripke structure defining the system under consideration and the CTL specification can be easily defined using logic programs. This is obvious for the standard logic operators as well as for the CTL operators ∃◯ and ∀◯. The following equality can be easily proved: ∀◯φ ≡ ¬∃◯¬φ. Moreover, the operators ∃U and ∀U can be defined as fixpoint solutions [34]. Indeed, if you take the view that φ represents the set of states where φ is valid, it can be proved that ∃(p U q) ≡ μY = (q ∨ (p ∧ ∃◯Y)), where μ stands for the least fixpoint of the equation. Intuitively this equation tells us that the set of states satisfying ∃(p U q) is the least one satisfying q or having a successor which satisfies this property.

∃◇ and ∀□ can be derived from what precedes. ∀◯φ can be derived using the equivalence ∀◯φ ≡ ¬∃◯(¬φ). Slightly more involved is the definition of the set of states which satisfy ∃□φ. These states can be expressed as the solution


of νX = (φ ∧ ∃◯X), where ν stands for the greatest fixpoint of the equation. Greatest fixpoints cannot be directly computed by logic programming systems, but the equation can be translated into ∃□φ ≡ ¬Y, where the set Y of states which satisfy ¬∃□φ is defined by the least fixpoint μY = (¬φ ∨ ∀◯Y). Finally, ∀◇φ ≡ ¬∃□¬φ and ∀(φ1 U φ2) ≡ ¬∃(¬φ2 U (¬φ1 ∧ ¬φ2)) ∧ ¬∃□¬φ2. This last equality says that a state satisfies ∀(φ1 U φ2) if it has no path where φ2 is continuously false until a state where both φ1 and φ2 are false, nor a path where φ2 is continuously false.

We can see that we only use least fixpoints of monotonic equations. Moreover, the use of table-based Prolog (XSB Prolog) ensures proper handling of cycles.

For infinite state systems, the derivation of the fixpoints cannot generally be completed. However, by the monotonicity of all the used fixpoints, all derived implications belong indeed to the solution (no overapproximation).

Our CTL specification is independent of any model. It only supposes that the successors of a state s can be computed (through the predicate trans) and that the elementary propositions of any state s can be determined (through prop). In Fig. 7.4 we present a particular implementation of CTL as a (tabled) logic program. For example, it can be used to verify the mutual exclusion example from [7], simply by running it in XSB-Prolog. This follows similar lines as [39, 29], where tabled logic programming is used as an (efficient) means of finite model checking. Nonetheless, our translation of CTL in this paper is expressed (more clearly) as a meta-interpreter and will be the starting point for the model checking of infinite state systems using program specialisation and analysis techniques.

    First version:

    sat(Formula) :- start(S), sat(S,Formula).

sat(_E,true).
sat(E,p(P)) :- prop(E,P). /* elementary proposition */

    sat(E,and(F,G)) :- sat(E,F), sat(E,G).

    sat(E,or(F,_G)) :- sat(E,F).

    sat(E,or(_F,G)) :- sat(E,G).

    sat(E,en(F)) :- trans(E,_Act,E2),sat(E2,F). /* exists next */

    sat(E,eu(F,G)) :- sat_eu(E,F,G). /* exists until */

    sat(E,ef(F)) :- sat(E,eu(true,F)). /* exists future */

    sat(E,not(F)) :- not(sat(E,F)).

    sat(E,ag(F)) :- sat(E,not(ef(not(F)))). /* always global */

:- table sat_eu/3. /* table to compute least fixed point using XSB */
sat_eu(E,_F,G) :- sat(E,G). /* exists until */

    sat_eu(E,F,G) :- sat(E,F), trans(E,_Act,E2), sat_eu(E2,F,G).

    | ?- [ctl].

    [Compiling ./ctl]


    % Specialising partially instantiated calls to sat/2

    [ctl compiled, cpu time used: 0.0200 seconds]

    [ctl loaded]

    yes

    | ?- sat(ef(p(closed))).

    yes

    | ?- sat(ef(and(p(open),p(heat)))).

    no

    | ?- sat(ef(and(p(closed),p(heat)))).

/* A Model Checker for CTL formulas written for XSB-Prolog */

    sat(_E,true).

sat(_E,false) :- fail.
sat(E,p(P)) :- prop(E,P). /* elementary proposition */

    sat(E,and(F,G)) :- sat(E,F), sat(E,G).

    sat(E,or(F,_G)) :- sat(E,F).

    sat(E,or(_F,G)) :- sat(E,G).

    sat(E,not(F)) :- not(sat(E,F)).

    sat(E,en(F)) :- trans(E,_Act,E2),sat(E2,F). /* exists next */

    sat(E,an(F)) :- not(sat(E,en(not(F)))). /* always next */

    sat(E,eu(F,G)) :- sat_eu(E,F,G). /* exists until */

    sat(E,au(F,G)) :- sat(E,not(eu(not(G),and(not(F),not(G))))),

    sat_noteg(E,not(G)). /* always until */

    sat(E,ef(F)) :- sat(E,eu(true,F)). /* exists future */

    sat(E,af(F)) :- sat_noteg(E,not(F)). /* always future */

    sat(E,eg(F)) :- not(sat_noteg(E,F)). /* exists global */

    /* we want gfp -> negate lfp of negation */

    sat(E,ag(F)) :- sat(E,not(ef(not(F)))). /* always global */

:- table sat_eu/3. /* table to compute least fixed point using XSB */
sat_eu(E,_F,G) :- sat(E,G). /* exists until */

    sat_eu(E,F,G) :- sat(E,F), trans(E,_Act,E2), sat_eu(E2,F,G).

    :- table sat_noteg/2. /* table to compute least-fixed point using XSB */

    sat_noteg(E,F) :- sat(E,not(F)).

    sat_noteg(E,F) :- not((trans(E,_Act,E2),not(sat_noteg(E2,F)))).

    Figure 7.4: CTL interpreter

    7.5 Refinement Checking

The below requires XSB to be configured with ./configure --enable-batched-scheduling and ./makexsb --config-tag=btc for batched scheduling (to provide answers before tables are complete; by default XSB will first complete the table and then print the answers).

    From XSB Manual:

Local scheduling then fully evaluates each maximal SCC (an SCC that does not depend on another SCC) before returning answers to any


subgoal outside of the SCC. ... Unlike Local Scheduling, Batched Scheduling allows answers to be returned outside of a maximal SCC as they are derived, and thus resembles Prolog's tuple at a time scheduling.

    :- table enabled/2.

    enabled(A,X) :- trans(X,A,_).

    :- table not