45
1 CS 2710, ISSP 2160 Chapter 9 Inference in First-Order Logic

CS 2710, ISSP 2160

  • Upload
    matty

  • View
    39

  • Download
    0

Embed Size (px)

DESCRIPTION

CS 2710, ISSP 2160. Chapter 9 Inference in First-Order Logic. Pages to skim. Storage and Retrieval (p. starts bottom 328) Efficient forward chaining (starts p. 333) through Irrelevant facts (ends top 337) - PowerPoint PPT Presentation

Citation preview

Page 1: CS 2710, ISSP 2160

1

CS 2710, ISSP 2160

Chapter 9Inference in First-Order Logic

Page 2: CS 2710, ISSP 2160

2

Pages to skim

• Storage and Retrieval (p. starts bottom 328)• Efficient forward chaining (starts p. 333) through

Irrelevant facts (ends top 337)• Efficient implementation of logic programs (starts p.

340) through Constraint logic programming (ends p. 345)

• Completeness of resolution (starts p. 350) (though see notes in slides)

Page 3: CS 2710, ISSP 2160

3

Inference with Quantifiers• Universal Instantiation:

– Given X (person(X) likes(X, sun))– Infer person(john) likes(john,sun)

• Existential Instantiation:– Given x likes(x, chocolate)– Infer: likes(S1, chocolate)– S1 is a “Skolem Constant” that is not found

anywhere else in the KB and refers to (one of) the individuals that likes sun.

Page 4: CS 2710, ISSP 2160

4

Reduction to Propositional Inference• Simple form (pp. 324-325) not efficient. Useful

conceptually.• Replace each universally quantified sentence by all

possible instantiations– All X (man(X) mortal(X)) replaced by– man(tom) mortal(tom)– man(chocolate) mortal(chocolate) – …

• Now, we have propositional logic. • Use propositional reasoning algorithms from Ch 7

Page 5: CS 2710, ISSP 2160

5

Reduction to Propositional Inference

• Problem: when the KB includes a function symbol, the set of term substitutions is infinite. father(father(father(tom))) …

• Herbrand 1930: if a sentence is entailed by the original FO KB, then there is a proof using a finite subset of the propositionalized KB

• Since any subset has a maximum depth of nesting in terms, we can find the subset by generating all instantiations with constant symbols, then all with depth 1, and so on

Page 6: CS 2710, ISSP 2160

6

First-Order Inference

• We have an approach to FO inference via propositionalization that is complete: any entailed sentence can be proved

• Entailment for FOPC is semi-decidable: algorithms exist that say yes to every entailed sentence, but no algorithm exists that also says no to every nonentailed sentence.

• Our proof procedure could go on and on, generating more and more deeply nested terms, but we will not know whether it is stuck in a loop, or whether the proof is just about to pop out

Page 7: CS 2710, ISSP 2160

7

Generalized Modus Ponens

• This is a general inference rule for FOPC that does not require universal instantiation first

• Given:

– p1’, p2’ … pn’, (p1 … pn) q– Subst(theta, pi’) = subst(theta, pi) for all i

• Conclude:– Subst(theta, q)

Page 8: CS 2710, ISSP 2160

8

GMP is a lifted version of MP

• GMP “lifts” MP from propositional to first-order logic• Key advantage of lifted inference rules over

propositionalization is that they make only substitutions which are required to allow particular inferences to proceed

Page 9: CS 2710, ISSP 2160

9

GMP Example

x,y,z ((parent(x,y) parent(y,z)) grandparent(x,z))• parent(james, john), parent(james, richard),

parent(harry, james)• We can derive:

– Grandparent(harry, john), bindings:{x/harry,y/james,z/john}

– Grandparent(harry, richard), bindings: {x/harry,y/james,z/richard}

Page 10: CS 2710, ISSP 2160

10

Unification

• Process of finding all legal substitutions• Key component of all FO inference algorithms • Unify(p,q) = theta, where Subst(theta,p) ==

Subst(theta,q)Assuming all variables universally quantified

Page 11: CS 2710, ISSP 2160

11

Standardizing apart

• All X knows(john,X).• All X knows(X,elizabeth).• These ought to unify, since john knows everyone, and

everyone knows elizabeth. • Rename variables to avoid such name clashes

Note:all X p(X) == all Y p(Y)

All X (p(X) ^ q(X)) == All X p(X) ^ All Y q(Y)

Page 12: CS 2710, ISSP 2160

12

def Unify (p, q, bdgs): d = disagreement(p, q) # If there is no disagreement, then success. if not d: return bdgs elif not isVar(d[0]) and not isVar(d[1]): return 'fail' else: if isVar(d[0]): var = d[0] ; other = d[1] else: var = d[1] ; other = d[0] if occursp (var,other): return ‘fail’ # Make appropriate substitutions and recurse on the result. else: pp = replaceAll(var,other,p) qq = replaceAll(var,other,q) return Unify (pp,qq, bdgs + [[var,other]])

For code, see “resources” on the webpage

Page 13: CS 2710, ISSP 2160

13

================================ unify: ['loves', ['dog', 'var_x'], ['dog', 'fred']] ['loves', 'var_z', 'var_z'] subs: [['var_z', ['dog', 'var_x']], ['var_x', 'fred']] result: ['loves', ['dog', 'fred'], ['dog', 'fred']] ================================ unify: ['loves', ['dog', 'fred'], 'fred'] ['loves', 'var_x', 'var_y'] subs: [['var_x', ['dog', 'fred']], ['var_y', 'fred']] result: ['loves', ['dog', 'fred'], 'fred'] ================================ unify: ['loves', ['dog', 'fred'], 'mary'] ['loves', ['dog', 'var_x'], 'var_y'] subs: [['var_x', 'fred'], ['var_y', 'mary']] result: ['loves', ['dog', 'fred'], 'mary'] ================================

Page 14: CS 2710, ISSP 2160

14

unify: ['loves', ['dog', 'fred'], 'mary'] ['loves', ['dog', 'var_x'], 'var_y'] subs: [['var_x', 'fred'], ['var_y', 'mary']] result: ['loves', ['dog', 'fred'], 'mary'] ================================ unify: ['loves', ['dog', 'fred'], 'fred'] ['loves', 'var_x', 'var_x'] failure ================================ unify: ['loves', ['dog', 'fred'], 'mary'] ['loves', ['dog', 'var_x'], 'var_x']failure ================================ unify: ['loves', 'var_x', 'fred'] ['loves', ['dog', 'var_x'], 'fred'] var_x occurs in ['dog', 'var_x'] failure

Page 15: CS 2710, ISSP 2160

15

unify: ['loves', 'var_x', ['dog', 'var_x']] ['loves', 'var_y', 'var_y'] var_y occurs in ['dog', 'var_y'] failure ================================ unify: ['loves', 'var_y', 'var_y'] ['loves', 'var_x', ['dog', 'var_x']] var_x occurs in ['dog', 'var_x'] failure ================================ unify: (fails because vars not standardized apart) ['hates', 'agatha', 'var_x'] ['hates', 'var_x', ['f1', 'var_x']] failure ================================ unify: ['hates', 'agatha', 'var_x'] ['hates', 'var_y', ['f1', 'var_y']] subs: [['var_y', 'agatha'], ['var_x', ['f1', 'agatha']]] result: ['hates', 'agatha', ['f1', 'agatha']]

Page 16: CS 2710, ISSP 2160

16

Most General Unifier• The Unify algorithm returns a MGU

L1 = p(X,f(Y),b)L2 = p(X,f(b),b) Subst1 = {X\a, Y\b} Result1 = p(a,f(b),b) Subst2 = {Y\b} Result2 = p(X,f(b),b) Subst1 is more restrictive than Subst2. In fact,Subst2 is a MGU of L1 and L2.

Page 17: CS 2710, ISSP 2160

17

Storage and retrieval

• Hash statements by predicate for quick retrieval (predicate indexing), e.g., of all sentences that unify with tall(X)

• Why attempt to unify – tall(X) and silly(dog(Y))

• Instead– Predicates[tall] = {all tall facts} – Unify(tall(X),s) for s in Predicates[tall]

• Subsumption lattice for efficiency (see p. 329 for your interest)

Page 18: CS 2710, ISSP 2160

18

Inference Methods

• Unification (prerequisite)• Forward Chaining

– Production Systems– RETE Method (OPS)

• Backward Chaining– Logic Programming (Prolog)

• Resolution– Transform to CNF– Generalization of Prop. Logic resolution

Page 19: CS 2710, ISSP 2160

19

Resolution Theorem Proving (FOL)

• Convert everything to CNF• Resolve, with unification

– Save bindings as you go!• If resolution is successful, proof succeeds• If there was a variable in the item to prove, return

variable’s value from unification bindings

Page 20: CS 2710, ISSP 2160

20

Converting to CNF

Page 21: CS 2710, ISSP 2160

21

Converting sentences to CNF1. Eliminate all ↔ connectives

(P ↔ Q) ((P Q) ^ (Q P)) 2. Eliminate all connectives

(P Q) (P Q) 3. Reduce the scope of each negation symbol to a single predicate

P P(P Q) P Q(P Q) P Q(x)P (x)P(x)P (x)P

4. Standardize variables: rename all variables so that each quantifier has its own unique variable name

Page 22: CS 2710, ISSP 2160

22

Converting sentences to clausal form: Skolem constants and functions

5. Eliminate existential quantification by introducing Skolem constants/functions(x)P(x) P(c)

c is a Skolem constant (a brand-new constant symbol that is not used in any other sentence)

(x)(y)P(x,y) becomes (x)P(x, F(x))since is within the scope of a universally quantified variable, use a Skolem function F to construct a new value that depends on the universally quantified variable

f must be a brand-new function name not occurring in any other sentence in the KB.

E.g., (x)(y)loves(x,y) becomes (x)loves(x,F(x)) In this case, F(x) specifies the person that x loves

E.g., x1 x2 x3 y P(… y …) becomes x1 x2 x3 P(… FF(x1,x2,x3) …) (FF is a new name)

Page 23: CS 2710, ISSP 2160

23

Converting sentences to clausal form6. Remove universal quantifiers by (1) moving them all to

the left end; (2) making the scope of each the entire sentence; and (3) dropping the “prefix” partEx: (x)P(x) P(x)

7. Put into conjunctive normal form (conjunction of disjunctions) using distributive and associative laws(P Q) R (P R) (Q R)(P Q) R (P Q R)

8. Split conjuncts into separate clauses9. Standardize variables so each clause contains only

variable names that do not occur in any other clause

Page 24: CS 2710, ISSP 2160

24

An example

(x)(P(x) ((y)(P(y) P(F(x,y))) (y)(Q(x,y) P(y)))) 2. Eliminate

(x)(P(x) ((y)(P(y) P(F(x,y))) (y)(Q(x,y) P(y)))) 3. Reduce scope of negation

(x)(P(x) ((y)(P(y) P(F(x,y))) (y)(Q(x,y) P(y)))) 4. Standardize variables

(x)(P(x) ((y)(P(y) P(F(x,y))) (z)(Q(x,z) P(z)))) 5. Eliminate existential quantification

(x)(P(x) ((y)(P(y) P(F(x,y))) (Q(x,G(x)) P(G(x))))) 6. Drop universal quantification symbols

(P(x) ((P(y) P(F(x,y))) (Q(x,G(x)) P(G(x)))))

Page 25: CS 2710, ISSP 2160

25

An Example7. Convert to conjunction of disjunctions

(P(x) P(y) P(F(x,y))) (P(x) Q(x,G(x))) (P(x) P(G(x)))

8. Create separate clausesP(x) P(y) P(F(x,y)) P(x) Q(x,G(x)) P(x) P(G(x))

9. Standardize variablesP(x) P(y) P(F(x,y)) P(z) Q(z,G(z)) P(w) P(G(w))

Note: Now that quantifiers are gone, we do need the upper/lower-case distinction

Page 26: CS 2710, ISSP 2160

26

1. all X (read (X) --> literate (X)) 2. all X (dolphin (X) --> ~literate (X))3. exists X (dolphin (X) ^ intelligent (X))(a translation of ``Some dolphins are intelligent'') ``Are there some who are intelligent but cannot read?''4. exists X (intelligent(X) ^ ~read (X)) Set of clauses (1-3):1. ~read(X) v literate(X)2. ~dolphin(Y) v ~literate(Y)3a. dolphin (a)3b. intelligent (a) Negation of 4:~(exists Z (intelligent(Z) ^ ~read (Z)))In Clausal form:~intelligent(Z) v read(Z)Resolution proof: in lecture.

Page 27: CS 2710, ISSP 2160

27

More complicated example Did Curiosity kill the cat

• Jack owns a dog. Every dog owner is an animal lover. No animal lover kills an animal. Either Jack or Curiosity killed the cat, who is named Tuna. Did Curiosity kill the cat?

• These can be represented as follows:A. (x) (Dog(x) Owns(Jack,x))B. (x) (((y) (Dog(y) Owns(x, y))) AnimalLover(x))C. (x) (AnimalLover(x) ((y) Animal(y) Kills(x,y)))D. Kills(Jack,Tuna) Kills(Curiosity,Tuna)E. Cat(Tuna)F. (x) (Cat(x) Animal(x) )G. Kills(Curiosity, Tuna) GOAL

Page 28: CS 2710, ISSP 2160

28

• Convert to clause formA1. (Dog(D)) A2. (Owns(Jack,D))B. (Dog(y), Owns(x, y), AnimalLover(x))C. (AnimalLover(a), Animal(b), Kills(a,b))D. (Kills(Jack,Tuna), Kills(Curiosity,Tuna))E. Cat(Tuna)F. (Cat(z), Animal(z))

• Add the negation of query: G: (Kills(Curiosity, Tuna))

D is a skolem constant

Page 29: CS 2710, ISSP 2160

29

• The resolution refutation proof R1: G, D, {} (Kills(Jack, Tuna))R2: R1, C, {a/Jack, b/Tuna} (~AnimalLover(Jack),

~Animal(Tuna))R3: R2, B, {x/Jack} (~Dog(y), ~Owns(Jack, y),

~Animal(Tuna))R4: R3, A1, {y/D} (~Owns(Jack, D),

~Animal(Tuna))R5: R4, A2, {} (~Animal(Tuna))R6: R5, F, {z/Tuna} (~Cat(Tuna))R7: R6, E, {} FALSE

Page 30: CS 2710, ISSP 2160

30

• The proof tree

G D

C

B

A1

A2

F

A

R1: K(J,T)

R2: AL(J) A(T)

R3: D(y) O(J,y) A(T)

R4: O(J,D), A(T)

R5: A(T)

R6: C(T)

R7: FALSE

{}

{a/J,b/T}

{x/J}

{y/D}

{}

{z/T}

{}

Page 31: CS 2710, ISSP 2160

31

Decidability and Completeness

• Resolution is a refutation complete inference procedure for First-Order Logic– If a set of sentences contains a contradiction, then a

finite sequence of resolutions will prove this.– If not, resolution may loop forever (“semi-decidable”)

• Here are notes by Charles Elkan that go into this more deeply

Page 32: CS 2710, ISSP 2160

32

Decidability and Completeness

• Refutation Completeness: If KB |= A then KB |- A– If it’s entailed, then there’s a proof

• Semi-decidable: – If there’s a proof, we’ll halt with it. – If not, maybe halt, maybe not

• Logical entailment in FOL is semi-decidable: if the desired conclusion follows from the premises, then eventually resolution refutation will find a contradiction

Page 33: CS 2710, ISSP 2160

33

Decidability and Completeness

• Propositional logic– logical entailment is decidable– There exists a complete inference procedure

• First-Order logic– logical entailment is semi-decidable– Resolution procedure is refutation complete

Page 34: CS 2710, ISSP 2160

34

• Strategies (heuristics) for efficient resolution include– Unit preference. If a clause has only one literal, use it

first.– Set of support. Identify “useful” rules and ignore the

rest. (p. 305)– Input resolution. Intermediately generated sentences

can only be combined with original inputs or original rules.

– Subsumption. Prune unnecessary facts from the database.

Page 35: CS 2710, ISSP 2160

35

Horn Clauses

• A Horn Clause is a CNF clause with at most one positive literal

• Horn Clauses form the basis of forward and backward chaining

• The Prolog language is based on Horn Clauses• Deciding entailment with Horn Clauses is linear in the

size of the knowledge base

Page 36: CS 2710, ISSP 2160

36

Reasoning with Horn Clauses

• Forward Chaining– For each new piece of data, generate all new facts,

until the desired fact is generated– Data-directed reasoning

• Backward Chaining– To prove the goal, find a clause that contains the

goal as its head, and prove the body recursively– Goal-directed reasoning

• The state space is an AND-OR graph; see 7.5.4

Page 37: CS 2710, ISSP 2160

37

Forward Chaining over FO Definite (Horn) Clauses

• Clauses (disjunctions) with at most one positive literal• First-order literals can include variables, which are

assumed to be universally quantified• Use GMP to perform forward chaining

(Semi-decidable as for full FOPC)

Page 38: CS 2710, ISSP 2160

38

Def FOL-FC-Ask(KB,A) returns subst or false KB: set of FO definite clauses with variables standardized apart A: the query, an atomic sentenceRepeat until new is empty new {} for each implication (p1 ^ … ^ pn q) in KB: for each T such that SUBST(T,p1^…^pn) = SUBST(T,p1’^…^pn’) for some p1’,…,pn’ in KB q’ SUBST(T,q) if q’ is not a renaming of a sentence already in KB or new: add q’ to new S Unify(q’,A) if S is not fail then return S add new to KBReturn false

Process can be made more efficient; read on your own, for interest

Page 39: CS 2710, ISSP 2160

39

Backward Chaining over Definite (Horn) Clauses

• Logic programming• Prolog is most popular form• Depth-first search, so space requirements are lower, but

suffers from problems from repeated states

Page 40: CS 2710, ISSP 2160

40

american(X) ^ weapon(Y) ^ sells(X,Y,Z) ^ hostile(Z) criminal(X).owns(nono,m1). missile(m1).missile(X1) ^ owns(nono,X1) sells(west,X1,nono).missile(X2) weapon(X2).enemy(X3,america) hostile(X3).american(west).enemy(nono,america).

Goal: criminal(west).

Backward chaining proof: in lecture

In Prolog:criminal(X) :- american(X), weapon(Y), sells(X,Y,Z), hostile(Z).

Page 41: CS 2710, ISSP 2160

41

Horn clauses are all of the form: L1 ^ L2 ^ ... ^ Ln -> Ln+1 Or, equivalently, in clausal form: ~L1 v ~L2 v ... v ~Ln v Ln+1 Prolog (like databases) makes the "closed world assumption": if P cannot be proved, infer not P Think of the system as an arrogant know-it-all: "If it were true, I would know it. Since I can't prove it, it must not be true" Thus, it uses "negation as failure".

Page 42: CS 2710, ISSP 2160

42

neighbor(canada,us) neighbor(mexico,us)neighbor(pakistan,india) ?- neighbor(canada,india). no In full first-order logic, you would have to be able toinfer “~neighbor(canada,india)" for"neighbor(canada,india)" to be false.

Be careful! “~neighbor(canada,india) is not entailed by theSentences above!

Page 43: CS 2710, ISSP 2160

43

bachelor(X) :- male(X), \+ married(X).male(bill).male(jim).married(bill).married(mary). An individual is a bachelor if it is male and it isnot married. \+ is the negation-as-failure operator in Prolog. | ?- bachelor(bill). no| ?- bachelor(jim).yes| ?- bachelor(mary).no| ?- bachelor(X).X = jim;no| ?-

Page 44: CS 2710, ISSP 2160

44

Comparing backward chaining in prolog with

resolution• [In lecture]

Page 45: CS 2710, ISSP 2160

45

WrapUp

• You are responsible for everything in Chapter 9 except the following (though you are encouraged to read them):– Storage and Retrieval (p. starts bottom 328)– Efficient forward chaining (starts p. 333) through

Irrelevant facts (ends top 337)– Efficient implementation of logic programs (starts p.

340) through Constraint logic programming (ends p. 345)

– Completeness of resolution (starts p. 350) (though see notes in slides)

• Also, see files posted on the schedule (clausal form conversion, resolution, etc.)