
Neural-Symbolic Integration
A self-contained introduction

Sebastian Bader Pascal Hitzler

ICCL, Technische Universität Dresden, Germany

AIFB, Universität Karlsruhe, Germany

Outline of the Course

• Introduction and Motivation
• The History of Neural-Symbolic Integration
• The Core Method for Propositional Logic
• The Core Method for First-Order Logic

Part I

Introduction and Motivation


Why Neural-Symbolic Integration?

As we will see, connectionist systems and symbolic AI systems have quite contrasting advantages and disadvantages. We try to integrate both paradigms while keeping the advantages.


The Neural-Symbolic Cycle

[Diagram: the neural-symbolic cycle. A symbolic system (readable and writable) is translated into a connectionist system (trainable) via an embedding; symbolic knowledge is recovered from the connectionist system via extraction.]


Connectionist Systems

• Inspired by nature.
• Massively parallel computational model.
• A connectionist system consists of ...
  • a set U of units (input, hidden and output),
  • a set of connections C ⊆ U × U, each labelled with a weight w ∈ R.

[Figure: an example network with input units x and y, two hidden units, and an output unit z; connection weights -1.5, 0.3, -1.3, -0.7, 1.8 and -1.6.]


Units of Connectionist Systems

A unit is characterised by ...
• an activation function, mapping the inputs ~i to the potential p:

  p = ∑_n i_n · w_n    or    p = ∑_n (i_n − w_n)²

• an output function, mapping the potential p to the output o:

  threshold, ramp, sigmoidal, tanh, Gaussian
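A minimal sketch of these functions in Python (our own illustration, not part of the original slides; function names are ours):

import numpy as np

def weighted_sum(inputs, weights):        # p = sum_n i_n * w_n
    return float(np.dot(inputs, weights))

def squared_distance(inputs, weights):    # p = sum_n (i_n - w_n)^2
    d = np.asarray(inputs) - np.asarray(weights)
    return float(np.dot(d, d))

def threshold(p, theta=0.0):              # binary threshold output
    return 1.0 if p >= theta else 0.0

def gaussian(p):                          # Gaussian output, o = exp(-p^2)
    return float(np.exp(-p ** 2))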


Dynamics of a Network

The example network from the previous slide: input units x and y, two hidden units, and an output unit z.

Unit type | Activation f.            | Output f.
input     | p set from outside       | o = p
hidden    | p = ∑_n (i_n − w_n)²     | o = e^(−p²)
output    | p = ∑_n i_n · w_n        | o = p

A sample run:
t=0: x = −1.0, y = 1.0
t=1: hidden potentials 0.74 and 2.98, outputs 0.5783 and 0.0001
t=2: z = 1.039
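The run can be reproduced with a short Python sketch (ours, not from the slides), assuming the hidden units carry the weight pairs (-1.5, 0.3) and (-1.3, -0.7) and the output unit the pair (1.8, -1.6), which reproduces the numbers above:

import math

def hidden(inputs, weights):
    p = sum((i - w) ** 2 for i, w in zip(inputs, weights))  # distance activation
    return math.exp(-p ** 2)                                # Gaussian output

def output(inputs, weights):
    return sum(i * w for i, w in zip(inputs, weights))      # linear output unit

x, y = -1.0, 1.0                          # t=0: inputs set from outside
h1 = hidden((x, y), (-1.5, 0.3))          # t=1: p = 0.74, o ~ 0.5783
h2 = hidden((x, y), (-1.3, -0.7))         # t=1: p = 2.98, o ~ 0.0001
z = output((h1, h2), (1.8, -1.6))         # t=2: ~ 1.04 (the slide reports 1.039)
print(h1, h2, z)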


Dynamics of a Network

[Figure: for each of the two activation functions, p = ∑_n (i_n − w_n)² and p = ∑_n i_n · w_n, the slide plots the activation function, an output function, and the resulting response surface of the unit.]


Training of Connectionist Systems

? How can we train a network to represent a function given as a set of samples {(i1, o1), . . . , (in, on)}?

[Figure: sample points in the x-y-plane, and a function fitted through them by a trained network.]

• Learning as generalization.


Backpropagation

• Let a set of samples {(i1, o1), . . . , (in, on)} be given.
• Error of the network: E = ∑_i (N(i_i) − o_i)².
• Idea: minimise E by gradient descent.

[Figure: an error surface over two weights; gradient descent follows the slope downhill to a minimum.]


Backpropagation in Detail

1. Present a training sample to the network.
2. Compare the output of the network with the desired output.
3. Calculate the error in each output unit.
4. Modify the weights to the output layer such that the error decreases.
5. Propagate the error back to the last hidden units.
6. Compute the part of the error caused by the hidden units.
7. Modify the weights to the hidden units using this local error.
8. Continue until input units are reached.
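A compact sketch of these steps for a 3-layer sigmoidal network (our own minimal Python, with a toy training set; not the original implementation):

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 3))                 # input-to-hidden weights
W2 = rng.normal(size=(3, 1))                 # hidden-to-output weights

def sigma(x):
    return 1.0 / (1.0 + np.exp(-x))

samples = [(np.array([0.0, 1.0]), np.array([1.0])),
           (np.array([1.0, 0.0]), np.array([0.0]))]

eta = 0.5                                    # learning rate
for _ in range(1000):
    for i, o in samples:                     # 1. present a sample
        h = sigma(i @ W1)                    # forward pass, hidden layer
        y = sigma(h @ W2)                    # 2. output of the network
        d2 = (y - o) * y * (1 - y)           # 3. error signal at the output
        d1 = (d2 @ W2.T) * h * (1 - h)       # 5./6. error at the hidden units
        W2 -= eta * np.outer(h, d2)          # 4. update output-layer weights
        W1 -= eta * np.outer(i, d1)          # 7./8. update hidden-layer weights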


A sample run ...

Prior training: [Figure: a network with its initial weights; the function it computes misses the sample points.]

After training: [Figure: the same network with adapted weights; the computed function now fits the sample points.]


Funahashi’s Theorem

Theorem (Ken-Ichi Funahashi, 1989)
Every continuous function f : K → R (with K ⊂ R^n compact) can be approximated arbitrarily well using 3-layer feed-forward networks with sigmoidal units.

[Figure: a target function and a network approximation of it.]


History of Connectionist Systems

1943 Warren Sturgis McCulloch and Walter Pitts publish "A logical calculus of the ideas immanent in nervous activity".
1969 Marvin Minsky and Seymour Papert publish "Perceptrons".
1974 Paul Werbos (1974), David Parker (1984) and David Rumelhart & Ronald Williams (1985) invent backpropagation.
1989 Ken-Ichi Funahashi publishes "On the Approximate Realisation of Continuous Mappings by Neural Networks".


NETtalk

• Terrence J. Sejnowski and Charles R. Rosenberg, 1987.
• Network learns the map between letters and phonemes.
• 3-layer feed-forward network with sigmoidal units:
  • 203 input units: encoding a window of 7 letters
  • 80 hidden units
  • 26 output units: representing phonemes, punctuation ...
• Trained using samples of the form:

  Word       Phonemes   Stress and syllable
  logic      laJIk      > 1 < 0 <
  programme  progr@m--  >> 1 >> 2 <<<
  neural     nU-r-L     > 1 << 0 <
  network    nEtw-Rk    > 1 <> 2 <<


ALVINN & MANIAC

ALVINN (Autonomous Land Vehicle In a Neural Network):
• Pomerleau, 1993.
• Learns to control NAVLAB vehicles by watching a person.
• 3-layer feed-forward network with sigmoidal units:
  • 960 input units: 30×32 units serve as a two-dimensional retina
  • 5 hidden units
  • 30 output units: representing the steering direction

MANIAC (Multiple ALVINN Networks In Autonomous Control):
• Jochem et al., 1993.
• Multiple ALVINN networks, each for a certain type of road.


ALVINN, MANIAC & RALPH

The road for ALVINN, MANIAC & RALPH:

RALPH (Rapidly Adapting Lateral Position Handler)
• Pomerleau, 1995.
• Drove in 9 days from Pittsburgh to San Diego (2850 miles, all but about 50 of them autonomously).


TD-Gammon

Artificial Backgammon player (Gerry Tesauro, 1995):
• Based on a standard neural network.
• Learns by playing against itself.
• Reaches championship level.

Btw.:
• We play Backgammon symbolically.
• Can we learn from the network to play better?


Properties of Connectionist Systems

+ Massively parallel paradigm.
+ Continuous functions can be approximated arbitrarily well.
+ Well suited to learn and to adapt to new environments:
  • Hardly any knowledge about the function is needed.
  • Training from raw, incomplete and noisy data.
+ Degrades gracefully: even if parts of the system fail, the overall behaviour is ok.
+ Successfully used in many application areas.

- Declarative semantics is not available.
- Structured objects are hard to represent.
- Background knowledge cannot be used.


Symbolic AI

• Inspired by human thinking.
• Aristotle tried to formalize human thinking by means of logic (4th century BC).
• First "is-a" hierarchy by Porphyry of Tyre (around 260 AD).
• Use of logic-like languages for knowledge representation and reasoning.


SHRDLU
• Terry Winograd, MIT, 1968-70.
• SHRDLU carried on a simple dialog about a small world.
• A sample dialog:

Person:   Pick up a red block.
Computer: OK. (does it)
Person:   Grasp the pyramid.
Computer: I don't understand which pyramid you mean.
Person:   Find a block which is taller than the one you are holding
          and put it into the box.
Computer: By "it", I assume you mean the block which is taller than
          the one I am holding.
Computer: OK. (does it)
Person:   What does the box contain?
Computer: The blue pyramid and the blue block.


SHRDLU - Internals

• Written in LISP.
• Equipped with some knowledge about the world:

((#IS :B1 #BLOCK))        ((#IS #RED #COLOR))
((#IS :B2 #PYRAMID))      ((#IS #GREEN #COLOR))
((#IS :B3 #BLOCK))        ((#IS #BLACK #COLOR))
((#COLOR :B1 #RED))       ((#CONTAIN :BOX :B4))
((#COLOR :B2 #GREEN))     ((#SHAPE :B1 #RECTANGULAR))
((#COLOR :TABLE #BLACK))  ((#SHAPE :B3 #RECTANGULAR))

(DEFPROP TA-AT
  (THANTE (X Y) (#AT $?X $?Y)
    (THRPLACA (CDR (ATAB $?X)) $?Y))
  THEOREM)

(DEFPROP TA-EXISTS
  (THANTE (X) (#EXISTS $?X) (THSUCCEED))
  THEOREM)

• Can be downloaded from http://hci.stanford.edu/winograd/shrdlu


ProLog (Programming In Logic)

• Designed as a tool for man-machine communication in natural language.
• Philippe Roussel and Alain Colmerauer, 1972.
• The first Prolog application:

  Every psychiatrist is a person.
  Every person he analyzes is sick.
  Jacques is a psychiatrist in Marseille.

  Is Jacques a person? Yes.
  Where is Jacques? In Marseille.
  Is Jacques sick? I don't know.

• Consisted of 610 clauses.


Applications Involving Prolog

Nowadays:
• Turing-complete programming language.
• Usually with additional (non-logical) features.

Some application areas:
• Expert and rule systems.
• Computational linguistics (e.g. representation of grammars).
• Planning in AI.
• Cognitive robotics.
• Semantic web.


Deterministic Finite Automata

A Moore machine consists of:
• Q - set of states with an initial state q0 ∈ Q
• Σ - set of input symbols
• ∆ - set of output symbols
• δ - state transition function δ : Q × Σ → Q
• λ - state output function λ : Q → ∆

Example
Q = {q0, q1}, Σ = {a, b}, ∆ = {0, 1}
δ = {(q0, a) ↦ q0, (q0, b) ↦ q1, (q1, a) ↦ q1, (q1, b) ↦ q0}
λ = {q0 ↦ 1, q1 ↦ 0}

[Figure: the two-state automaton; q0 outputs 1 and q1 outputs 0; input b switches the state, input a keeps it.]
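As a small illustration (our own Python, not from the slides), the example machine can be run directly; the trace matches the run shown on the "Moore Machines" slide in the next part:

delta = {('q0', 'a'): 'q0', ('q0', 'b'): 'q1',
         ('q1', 'a'): 'q1', ('q1', 'b'): 'q0'}
lam = {'q0': '1', 'q1': '0'}

def run(word, state='q0'):
    outputs = []
    for symbol in word:
        state = delta[(state, symbol)]    # state transition
        outputs.append(lam[state])        # output of the reached state
    return ' '.join(outputs)

print(run('abba'))                        # -> 1 0 1 1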


DFAs are everywhere

• Beverage vending machines.
• Elevators.
• Mobile phone menus.
• etc.


Properties of Symbolic Systems

+ Human readable and writable, i.e. background knowledge is directly integrable.
+ Declarative semantics is available.
+ Recursive structures can easily be represented and manipulated.
+ Successfully used in many application areas.

- Hard to learn and to adapt to new environments.
- If a part of the system breaks, the whole system fails.
- Reasoning can be very hard.


Why Neural-Symbolic Integration?

• Connectionist systems and symbolic knowledge representation are two major approaches in AI.
• Both have complementary advantages and disadvantages.
• We try to integrate both while keeping the advantages:
  + Human readable and writable.
  + Declarative semantics is available.
  + Recursive structures can easily be represented and manipulated.
  + Massively parallel paradigm.
  + Well suited to learn and to adapt to new environments.
  + Graceful degradation.


Major Problems in Neural-Symbolic Integration

• How can symbolic knowledge be represented within connectionist systems?
• How can symbolic knowledge be extracted from connectionist systems?
• How can symbolic knowledge be learned using connectionist systems?
• How can connectionist learning be guided by symbolic background knowledge?


Part II

The History of Neural-Symbolic Integration


A Joint Start - McCulloch and Pitts

? Can the activities of a neural system be modelled by a logical calculus?

• W. S. McCulloch and W. Pitts, 1943:
  "A logical calculus of the ideas immanent in nervous activity"
• S. C. Kleene, 1956:
  "Representation of events in nerve nets and finite automata"


Logical Connectives

? Can we model logical connectives using simple units?
• Using binary threshold units and the activations 1 for "true" and 0 for "false", we obtain:

Disjunction: a unit x ∨ y with threshold 0.5 and weights 1.0 from x and y.
Conjunction: a unit x ∧ y with threshold 1.5 and weights 1.0 from x and y.
Negation: a unit ¬x with threshold -0.5 and weight -1.0 from x.
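These three units can be checked directly; a minimal Python sketch (ours) using exactly the weights and thresholds above:

def unit(inputs, weights, theta):
    # binary threshold unit: fires iff the potential reaches theta
    return 1 if sum(i * w for i, w in zip(inputs, weights)) >= theta else 0

def OR(x, y):  return unit((x, y), (1.0, 1.0), 0.5)
def AND(x, y): return unit((x, y), (1.0, 1.0), 1.5)
def NOT(x):    return unit((x,), (-1.0,), -0.5)

assert OR(0, 1) == 1 and AND(0, 1) == 0 and NOT(1) == 0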


McCulloch-Pitts Networks

A McCulloch-Pitts network consists of ...
• a set I of input units,
• a set U of binary threshold units,
• a subset O ⊆ U of output units.

Example
I = {x, y}, U = {h, o}, O = {o}
[Figure: a network over the inputs x and y with units of thresholds 0.5 and 1.5 and connection weights ±1.0.]


Moore Machines

• A Moore machine consists of:
  • set of states with an initial state
  • set of input symbols
  • set of output symbols
  • state transition function
  • state output function
• Using the Moore machine:

  Input:  a b b a
  Output: 1 0 1 1

[Figure: the two-state automaton from Part I, traversed step by step for the input a b b a.]


From Moore Machines to McCulloch-Pitts Networks

[Figure: the automaton translated into a network: input units for the symbols a and b, disjunctive units for the current states q0 and q1, disjunctive units for the successor states q′0 and q′1, and units for the outputs 0 and 1.]


From Moore Machines to McCulloch-Pitts Networks

[Figure: the constructed network processing an input sequence step by step.]


From McCulloch-Pitts Networks to Moore Machines

• A sample network: [Figure: the McCulloch-Pitts example network from above, with thresholds 0.5 and 1.5 and weights ±1.0.]
• The corresponding Moore machine:
  • Q - the set of states: the possible activation patterns of the network's units
  • Σ - the input symbols: the possible activation patterns of the input units
  • ∆ - the output symbols: the possible activation patterns of the output units
  • δ : Q × Σ → Q - given by the update dynamics of the network
  • λ : Q → ∆ - reading off the output units


Conclusions

• McCulloch-Pitts networks are finite automata, and vice versa.
• The paper "A logical calculus of the ideas immanent in nervous activity" started the research on artificial neural networks and on finite automata.
• Similar constructions work for other types of automata.


Recursive Autoassociative Memory (RAAM)

• Designed to encode structured data, e.g. trees:
  [Figure: a binary tree with leaves A, B, C, D.]
• Terminals are mapped to vectors:
  A ↦ (1, 0, 0, 0)   B ↦ (0, 1, 0, 0)   C ↦ (0, 0, 1, 0)   D ↦ (0, 0, 0, 1)
• Representations of nonterminals are learned.

[Figure: the tree encoded bottom-up: A and B are compressed into [AB], C and D into [CD], and [AB] and [CD] into [ABCD].]


RAAM - Usage

Due to the auto-associative structure, a single network can be used for encoding and decoding.

• Encoding: the hidden layer compresses the vectors for A and B into [AB].
• Decoding: from [AB], the output layer reconstructs the vectors for A and B.

[Figure: the auto-associative network, with the leaf vectors at the input and output layers and the compressed representation [AB] at the hidden layer.]
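A toy sketch of the idea in our own Python (dimensions, weights and names are our assumptions, not Pollack's architecture): an encoder compresses the concatenation of two 4-dimensional child vectors into one 4-dimensional vector, and a decoder reconstructs both children.

import numpy as np

rng = np.random.default_rng(1)
We = rng.normal(scale=0.5, size=(8, 4))    # encoder weights (8 -> 4)
Wd = rng.normal(scale=0.5, size=(4, 8))    # decoder weights (4 -> 8)

def sigma(x):
    return 1.0 / (1.0 + np.exp(-x))

def encode(left, right):                   # compress a pair into one vector
    return sigma(np.concatenate([left, right]) @ We)

def decode(code):                          # reconstruct both children
    out = sigma(code @ Wd)
    return out[:4], out[4:]

A = np.array([1.0, 0.0, 0.0, 0.0])         # terminal vectors from the slide
B = np.array([0.0, 1.0, 0.0, 0.0])
AB = encode(A, B)                           # representation of the subtree [AB]
# Training (not shown) adjusts We and Wd until decode(encode(l, r))
# reproduces (l, r); a code like AB can then itself be fed back in as a child.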


RAAM - Conclusions

• For details, see (Pollack, 199x).
+ Efficiently implementable.
+ Use of powerful gradient-based learning techniques.
+ System degrades gracefully.

- Difficulties distinguishing terminals and non-terminals for terms with depth ≥ 5.
- Capacity limit ≈ depth 5.
- System needs an external controller.

+ Demonstration that structured data can be represented within a connectionist system.


SHRUTI - A System for Reflexive Reasoning

• Humans can handle certain problems very easily and fast.
• Humans' approximate knowledge base: 10^8 rules and facts; i.e., we perform reflexive reasoning in sublinear time.
• The SHRUTI system (Shastri & Ajjanagadde, 1993) is a connectionist architecture for this type of reasoning.
• Variable binding by synchronization of neurons.


SHRUTI - A Sample Knowledge Base

• Rules:
  owns(Y, Z) ← gives(X, Y, Z)
  owns(X, Y) ← buys(X, Y)
  can-sell(X, Y) ← owns(X, Y)

• Facts:
  gives(john, josephine, book)
  (∃X) buys(john, X)
  owns(josephine, ball)

• Questions:
  can-sell(josephine, book)?  yes
  (∃X) owns(josephine, X)?  yes (X ↦ book, X ↦ ball)


SHRUTI - A Sample Network

[Figure: the SHRUTI network for the knowledge base above: predicate clusters for gives, buys, owns and can-sell, entity nodes for book, john, ball and josephine, and connections propagating the bindings (from john, from josephine, from book).]


SHRUTI - A Sample Network Run

[Figure: an activation trace of a network run; synchronous firing of nodes represents the variable bindings.]


SHRUTI - Conclusions

• Answers are derived in a time proportional to the depth of the search space (reflexive reasoning).
• Network size is linear in the size of the knowledge base.
• A rule can be used only a fixed number of times.
• Biologically plausible.


SHRUTI - Extensions

• Support of negation and inconsistency (Shastri & Wendelken, 1999).
• Simple learning using Hebbian learning (Wendelken & Shastri, 2003).
• Multiple instantiation of a single rule (Wendelken & Shastri, 2004).


KBANN - Knowledge-Based Artificial Neural Networks

? Can simple "if-then" rules be represented and learned using a connectionist architecture?
• Geoffrey G. Towell and Jude W. Shavlik, 1994.


KBANN - The Construction

The rules:

A ← B ∧ C ∧ ¬D.
A ← D ∧ ¬E.
H ← F ∧ G.
K ← A ∧ ¬H.

are rewritten so that each consequent has a single defining clause:

A ← A′ ∨ A′′.
A′ ← B ∧ C ∧ ¬D.
A′′ ← D ∧ ¬E.
H ← F ∧ G.
K ← A ∧ ¬H.

[Figure: the rewritten rule hierarchy mapped to a network: input units B, C, D, E, F, G; conjunctive units A′, A′′, H and K; a disjunctive unit A. Positive antecedents are connected with weight w, negated antecedents with weight -w.]
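A minimal sketch (our own Python; function names hypothetical) of how each clause could be mapped to a threshold unit, using the weight and threshold scheme stated on the "KBANN - A Problem" slide below:

# Each clause becomes one conjunctive threshold unit; clauses sharing a
# head feed a disjunctive unit. theta = (P - 0.5) * w for conjunctions
# (P = number of positive antecedents), theta = w/2 for disjunctions.
w = 4.0

def conjunctive_unit(literals):
    # literals: list of (atom, is_positive) pairs, e.g. A' <- B, C, not D
    weights = {atom: (w if pos else -w) for atom, pos in literals}
    theta = (sum(1 for _, pos in literals if pos) - 0.5) * w
    return weights, theta

def disjunctive_unit(antecedents):
    return {a: w for a in antecedents}, w / 2

print(conjunctive_unit([('B', True), ('C', True), ('D', False)]))
print(disjunctive_unit(["A'", "A''"]))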


KBANN - Training

Training:
• Add hidden units.
• Fully connect the layers.
• Add small random numbers to weights and thresholds.
• Apply backpropagation.

[Figure: the KBANN network from the previous slide, augmented with extra units and low-weight connections before training.]


KBANN - A Problem

+ Works well if rules have only few conditions and there are only few rules with the same consequence, but:
• Towell and Shavlik used sigmoidal output functions:

  o = 1 / (1 + e^(-(p-θ)))

• The threshold of a conjunctive unit is computed as

  θ = (P - 0.5) · w

  where P is the number of positive antecedents.
• The threshold of a disjunctive unit is always set to θ = w/2.


KBANN - A Problem ctd.

• The rules:
  C ← A1 ∨ . . . ∨ An.            θ_C = w/2
  Ai ← Ai1 ∧ . . . ∧ AiP.         θ_Ai = (P - 0.5) · w
• Let all but one Aij be true for each clause, i.e. p_Ai = (P - 1) · w. Then

  o_Ai = 1 / (1 + e^(-(p-θ))) = 1 / (1 + e^(-((P-1)w - (P-0.5)w))) = 1 / (1 + e^(w/2))

  p_C = n · w / (1 + e^(w/2))

- For any value of w we can compute an n such that o_C exceeds any threshold to be considered active.
+ Can be solved using bipolar output functions.
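A quick numeric check of this blow-up (our own Python, with an arbitrary choice of w):

import math

w = 8.0
leak = 1 / (1 + math.exp(w / 2))           # output of one all-but-one-true clause
n = math.ceil((w / 2) / (w * leak)) + 1    # enough such clauses to cross theta_C
p_C = n * w * leak                         # potential of the unit for C
print(leak, n, p_C > w / 2)                # True: C wrongly considered active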


KBANN - Conclusions

• Mapping of hierarchical domain knowledge into a connectionist system.
• Refinement using standard backpropagation.
• Successfully applied to a number of problems (e.g. DNA sequence analysis).
• Outperforms purely empirical and purely hand-built classifiers.


Symmetric Networks

A symmetric network consists of ...
• a set U of binary threshold units,
• a set W of symmetric connections W ⊆ U × U, i.e. w_ij = w_ji.

The units are updated asynchronously until a stable state is reached.


Symmetric Networks - A Simple Example

[Figure: a small symmetric network (weights 5, 2, 2, 2 and -1; thresholds 0) shown in two successive states of the asynchronous update.]


Symmetric Networks and Logic Formulae

• It is possible to associate an energy function E(t) describing the state of the network at time t.
• The energy is monotonically decreasing, i.e. E(t) ≥ E(t + 1).

? Is there a link between propositional logic formulae and symmetric networks (Pinkas, 1991)?

• For each propositional logic formula we can define a function τ which is "compatible" with the energy function.
• We can construct a symmetric network such that the activations of the network at the energy minima coincide with the models of the formula.


Symmetric Networks - An Example

Example

F = (¬o ∨ m) ∧ (¬s ∨ ¬m) ∧ (¬c ∨ m) ∧ (¬c ∨ s) ∧ (¬v ∨ ¬m)
τ(F) = vm − cm − cs + sm − om + 2c + o

[Figure: the corresponding symmetric network over the units v, c, m, s and o, with pairwise weights ±1 and the thresholds derived from τ(F).]


Symmetric Networks - Conclusions

• Strong link between propositional logic formulae and symmetric networks.
• Further extensions to non-monotonic logics and inconsistency.
• Add penalties to clauses which define a preference.
• The network settles down to the most preferable interpretation.


The Core Method

• Relate logic programs and connectionist systems.
• Embed interpretations into (vectors of) real numbers.
• Hence, obtain an embedded version f_P of the T_P-operator.
• Construct a network computing one application of f_P.
• Add recurrent connections from the output to the input layer.

[Diagram: a commuting square. T_P maps I_L to I_L; the embedding ι maps I_L into R^m (and ι⁻¹ back); f_P maps R^m to R^m, sending ~x to f_P(~x).]


Major Problems in Neural-Symbolic Integration

• How can symbolic knowledge be represented within connectionist systems? (What is ι?)
• How can symbolic knowledge be extracted from connectionist systems? (What is ι⁻¹?)
• How can symbolic knowledge be learned using connectionist systems?
• How can connectionist learning be guided by symbolic background knowledge?


Part III

The Core Method for Propositional Logic


Propositional Logic Programs – An Example

A ← ¬B.        % A is true, if B is false.
B ← A ∧ ¬B.    % B is true, if A is true and B is false.
B ← B.         % B is true, if B is true.


Propositional Logic Programs – The Syntax

Definition (Propositional Variables & Connectives)
A, B, C, D, . . .    ∧ = "and"    ← = "if-then"    ¬ = "not"

Definition (Clause)
H ← L1 ∧ L2 ∧ . . . ∧ Ln.
with head H and a body of literals Li, each either X or ¬X.

Definition (Propositional Logic Program)
A propositional logic program is a finite set of clauses.


Propositional Logic Programs – The Semantics

Definition (Herbrand Base B_L)
The Herbrand base is the set of all variables occurring in P.

Example (B_L for the running example)
B_L = {A, B}

Definition (Interpretation)
An interpretation is a subset of the Herbrand base.

Example (Interpretations for the running example)
I1 = ∅    I2 = {A}    I3 = {B}    I4 = {A, B}


Propositional Logic Programs – The Semantics Ctd.

Example (For I2 = {A})
(A)^I2 = true            (¬A)^I2 = false
(B)^I2 = false           (¬B)^I2 = true
(A ← ¬B)^I2 = true       (B ← B)^I2 = true
(A ∧ ¬B)^I2 = true       (B ← A ∧ ¬B)^I2 = false


Propositional Logic Programs – The Semantics Ctd.

Definition (Model)
An interpretation M satisfying every clause of a program P is called a model of P (in symbols M |= P).

Example (Models of the running example)
For the program
A ← ¬B.    B ← A ∧ ¬B.    B ← B.
we have
∅ ⊭ P    {A} ⊭ P    {B} |= P    {A, B} |= P


The Immediate Consequence Operator TP

Definition (T_P)
T_P(I) = {A | there is a clause A ← body in P and I |= body}

• The T_P-operator propagates truth along the clauses.

Example (T_P for our running example)
{} ↦ {A}
{A} ↦ {A, B}
{B} ↦ {B}
{A, B} ↦ {B}

• For definite programs, T_P converges to the least model.
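The operator is easy to compute directly; a minimal sketch in our own Python (the encoding of programs as (head, body) pairs is our assumption):

P = [('A', [('B', False)]),            # A <- not B.
     ('B', [('A', True), ('B', False)]),  # B <- A and not B.
     ('B', [('B', True)])]             # B <- B.

def T_P(I):
    # an atom is derived iff some clause for it has a satisfied body
    return {head for head, body in P
            if all((atom in I) == positive for atom, positive in body)}

for I in [set(), {'A'}, {'B'}, {'A', 'B'}]:
    print(sorted(I), '->', sorted(T_P(I)))   # matches the table above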


Constructing the Core-Network

1. For each element of B_L, add an input unit and an output unit with threshold 0.5.
2. For each clause H ← L1 ∧ . . . ∧ Ln do the following:
   2.1 Add a hidden unit c and a connection to H′ (w = 1.0).
   2.2 Connect every Li and c with
       w = +1.0 if Li is positive, w = -1.0 if Li is negated.
   2.3 Set the threshold of c to ("number of positive Li") - 0.5.

Example
A ← ¬B.    B ← A ∧ ¬B.    B ← B.
[Figure: the resulting core network: input units A and B, one hidden unit per clause, and output units A′ and B′, connected with weights ±1.0.]
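A sketch (ours) of one pass through such a network, computing the thresholds exactly as in step 2.3:

P = [('A', [('B', False)]),
     ('B', [('A', True), ('B', False)]),
     ('B', [('B', True)])]

def core_step(I):
    fired = set()
    for head, body in P:
        weights = {atom: (1.0 if pos else -1.0) for atom, pos in body}
        theta = sum(1 for _, pos in body if pos) - 0.5      # step 2.3
        p = sum(weights[atom] for atom in weights if atom in I)
        if p >= theta:                                      # clause unit fires,
            fired.add(head)                                 # output unit follows
    return fired

print(core_step({'A'}))    # {'A', 'B'}, i.e. one application of T_P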


One Application of TP

[Figure: the core network for the running example, with output thresholds 0.5 and clause-unit thresholds -0.5, 0.5 and 0.5.]

Feeding each interpretation through the network computes one application of T_P:
{} ↦ {A}
{A} ↦ {A, B}
{B} ↦ {B}
{A, B} ↦ {B}


Repetitive Application of TP

[Figure: the same network with recurrent connections from the output back to the input layer.]

Iterating the network from the empty interpretation yields
{} ↦ {A} ↦ {A, B} ↦ {B} ↦ {B} ↦ . . .
i.e. the network settles in the model {B}.


Main Results (Hölldobler & Kalinke, 1994)

• 2-layer networks cannot compute T_P.
• For each program P there exists a 3-layer kernel computing T_P.


Space and Time Complexity

Let n be the number of clauses and m the number of propositional variables:
• 2m + n units, 2mn connections in the kernel.
• T_P(I) is computed in 2 steps.
• The parallel model to compute T_P is optimal.
• The recurrent network settles down in at most 3n steps.


Extraction Methods

• Single units do not necessarily correspond to single rules.
• In general: it is NP-complete to find the minimal logical description for a trained network (Golea, 1996).
• There is not always a single minimal program (Lehmann, Bader & Hitzler, 2005).

[Diagram: two extraction styles. Decompositional: rules (rule1 . . . rule4) are read off unit by unit. Pedagogical: the network is treated as a black box mapping ~x to f_P(~x), and rules are induced from its input-output behaviour.]


Extraction – A Pedagogical Approach

[Figure: a trained core network with real-valued weights and thresholds.]

Querying the network on all possible inputs; each entry shows potential / output:

A B | c1          | c2          | c3          | A′        | B′
0 0 |  0.0 / 0.0  |  0.0 / 1.0  |  0.0 / 0.0  |  0.0 / 1  | -1.0 / 0
0 1 |  1.5 / 1.0  |  0.3 / 1.0  |  0.8 / 1.0  |  1.8 / 1  |  0.7 / 1
1 0 |  1.0 / 1.0  | -2.0 / 0.0  | -0.5 / 0.0  |  2.0 / 1  |  0.7 / 1
1 1 |  2.5 / 1.0  | -1.7 / 0.0  |  0.3 / 0.0  |  2.0 / 1  |  0.7 / 1


Extraction – A Pedagogical Approach

The resulting input-output table and the rules read off from it:

A B | A′ B′
0 0 |  1  0
0 1 |  1  1
1 0 |  1  1
1 1 |  1  1

A ← ¬A ∧ ¬B.      B ← ¬A ∧ B.
A ← ¬A ∧ B.       B ← A ∧ ¬B.
A ← A ∧ ¬B.       B ← A ∧ B.
A ← A ∧ B.


Extraction – A Pedagogical Approach

Simplifying the extracted rules:

A ← ¬A ∧ ¬B.
A ← ¬A ∧ B.                   A.
A ← A ∧ ¬B.          ⇒        B ← ¬A ∧ B.
A ← A ∧ B.                    B ← A.
B ← ¬A ∧ B.
B ← A ∧ ¬B.
B ← A ∧ B.


Extraction – A Pedagogical Approach

+ Sound, i.e. every extracted rule is a rule implemented by the network.
+ Complete, i.e. every rule implemented by the network will be extracted.

- Bad time complexity, due to the exponential blow-up.
- Does not create the smallest program automatically.


Extraction – A Decompositional Approach

We can do much better (Mayer-Eichberger, 2006):
• Decompositional approach.
+ Implementable (the implementation is under way).
+ Sound.
+ Complete.
+ Creates very small programs automatically.


Main Results (Hölldobler & Kalinke, 1994)

• 2-layer networks cannot compute T_P.
• For each program P there exists a 3-layer kernel computing T_P.
• For each 3-layer kernel K there exists a program P such that K computes T_P.
• Let n be the number of clauses and m the number of propositional variables:
  • 2m + n units, 2mn connections in the kernel.
  • T_P(I) is computed in 2 steps.
  • The parallel model to compute T_P is optimal.
  • The recurrent network settles down in at most 3n steps.


The CILLP-System

? Can the learning capabilities of KBANN be combined with the Core Method (Garcez & Zaverucha, 1999)?
• Using sigmoidal output functions, we obtain a standard 3-layer feed-forward neural network.
• This network is trainable using backpropagation.


CILLP - The Construction

• Define ranges for "true" and "false": activations ≥ a count as "true", activations ≤ -a count as "false", and values in between are undefined.
• Compute a, the weights and the thresholds such that the sigmoidal kernel computes T_P (Garcez & Zaverucha, 1999).


CILLP - Extracting a Learned Program

• The pedagogical approach would work, but ...
• the decompositional approach mentioned above does not work for sigmoidal units.
• Garcez, Broda & Gabbay (2001) proposed a suitable method, which ...
  + is sound,
  + is computationally feasible due to a clever restriction of the search space,
  - is not necessarily complete,
  - does not necessarily create the smallest programs.


CILLP - The MONK’s Problems

• Robots are described by 6 properties, e.g. head-shape ∈ {round, square, octagon}, ...
• Classification task: "Recognise robots with (body-shape = head-shape) or (jacket-color = red)".
• Network architecture:
  • 17 input units: one for each attribute.
  • 3 hidden-layer units.
  • 1 output unit: indicating answer "yes" or "no".
• 100% performance of the network and of the extracted rules.
• Pruning: of 131072 possible inputs for some hidden unit, only 18724 were queried.


CILLP - Conclusions

• Successfully used for ...
  • classification tasks like the MONK's problems.
  • DNA sequence analysis (promoter recognition, splice junction determination).
  • Power system fault diagnosis.
• Extensions of the CILLP system:
  • Metalevel priorities between rules (Garcez, Broda & Gabbay, 2000).
  • Intuitionistic logic (Garcez, Lamb & Gabbay, 2003).
  • Modal logic (Garcez, Lamb, Broda & Gabbay, 2004).


The Core Method

• Relate logic programs and connectionist systems.
• Embed interpretations into (vectors of) real numbers.
• Hence, obtain an embedded version f_P of the T_P-operator.
• Construct a network computing one application of f_P.
• Add recurrent connections from the output to the input layer.

[Diagram: a commuting square. T_P maps I_L to I_L; the embedding ι maps I_L into R^m (and ι⁻¹ back); f_P maps R^m to R^m, sending ~x to f_P(~x).]


Major Problems in Neural-Symbolic Integration

• How can symbolic knowledge be represented within connectionist systems? (What is ι?)
• How can symbolic knowledge be extracted from connectionist systems? (What is ι⁻¹?)
• How can symbolic knowledge be learned using connectionist systems?
• How can connectionist learning be guided by symbolic background knowledge?


Conclusions

We have a complete system implementing the NeSy cycle for propositional logic programs.

[Diagram: the neural-symbolic cycle from Part I: a symbolic system (readable and writable) is embedded into a connectionist system (trainable), and knowledge is recovered by extraction.]


Main Results

• 3-layer feed-forward networks can compute T_P.
• Using sigmoidal units, the network is trainable using backpropagation.
• Extraction is sound (and complete).
• Successfully applied to real-world problems.


Part IV

The Core Method for First-Order Logic


First Order Logic Programs – Two Examples

nat(0).                   % 0 is a natural number.
nat(succ(X)) ← nat(X).    % The successor succ(X) is a natural
                          % number if X is a natural number.

even(0).                  % 0 is an even number.
even(succ(X)) ← odd(X).   % The successor of an odd X is even.
odd(X) ← ¬even(X).        % If X is not even then it is odd.


First Order Logic Programs – The Syntax

Functions, Variables and Terms
F = {0/0, succ/1}
V = {X}
T = {0, succ(0), succ(X), succ(succ(0)), . . .}

Predicate Symbols and Atoms
P = {even/1, odd/1}
A = {even(succ(X)), odd(succ(0)), odd(0), odd(X), . . .}

Connectives, Clauses and Programs
As in propositional logic. (Below we abbreviate even, odd and succ by e, o and s.)


First Order Logic Programs – The Semantics

Herbrand base B_L = set of ground atoms:
B_L = {even(0), even(succ(0)), . . . , odd(0), odd(succ(0)), . . .}

Interpretations = subsets of the Herbrand base:
I1 = {even(succ^(2n)(0)) | n ≥ 1}     I2 = {}
I3 = {odd(succ^(2n+1)(0)) | n ≥ 0}    I4 = I2 ∪ I3


TP for our running examples

Definition (T_P)
T_P(I) = {A | there is A ← body in ground(P) and I |= body}

Example (Natural numbers)
n(0).                {} ↦ {n(0)}
n(s(X)) ← n(X).      {n(0)} ↦ {n(0), n(s(0))}
                     {n(0), n(s(0))} ↦ {n(0), n(s(0)), n(s(s(0)))}
                     {n(X) | X ∈ T} ↦ {n(X) | X ∈ T}

Example (Even and odd numbers)
e(0).                {} ↦ {e(0), o(X) | X ∈ T}
e(s(X)) ← o(X).      {o(X) | X ∈ T} ↦ {e(0), e(s(X)), o(X) | X ∈ T}
o(X) ← ¬e(X).        {e(s^(2n)(0)) | n ≥ 0} ↦ {e(0), o(s^(2n+1)(0)) | n ≥ 0}
                     {o(s^(2n+1)(0)) | n ≥ 0} ↦ {e(0), e(s^(2n)(0)) | n ≥ 0}
                     B_L ↦ {e(0), e(s(X)) | X ∈ T}
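Since ground(P) is infinite, the operator can only be simulated up to a chosen term depth; a rough, depth-bounded sketch for the even/odd program in our own Python (the encoding of atoms as strings is our assumption):

def terms(depth):                # ground terms 0, s(0), ..., up to the depth
    return ['s(' * n + '0' + ')' * n for n in range(depth + 1)]

def T_P(I, depth=5):
    out = {'e(0)'}                                         # e(0).
    out |= {f'e(s({t}))' for t in terms(depth - 1)         # e(s(X)) <- o(X).
            if f'o({t})' in I}
    out |= {f'o({t})' for t in terms(depth)                # o(X) <- not e(X).
            if f'e({t})' not in I}
    return out

I = set()
for _ in range(3):               # iterate T_P from the empty interpretation
    I = T_P(I)
print(sorted(I))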


Problems

• B_L is usually infinite, and therefore the propositional approach does not work.
• How can we bridge the gap?
  • How can first-order terms be represented?
  • How can first-order rules be represented?
  • How can the variable binding be solved?


Level Mappings

• A level mapping | · | assigns a (unique) natural number to each ground atom ...

Example (Even and odd numbers)
|e(s^n(0))| = 2n + 1    |o(s^n(0))| = 2n + 2

• ... and hence enumerates the Herbrand base:

Example (Even and odd numbers)
[e(0), o(0), e(s(0)), o(s(0)), e(s(s(0))), . . .]
with levels 1, 2, 3, 4, 5, . . .


Embedding First-Order Terms into the Real Numbers

Using an injective level mapping, we can assign a unique real number to each interpretation:

ι(I) = ∑_{A ∈ I} 4^(−|A|)

This coincides with a "binary" representation, one base-4 digit per atom (writing e(n) for e(s^n(0))):

B_L = [e(0), o(0), e(1), o(1), e(2), . . .]
ι({e(0)}) = 0.10000 (base 4) = 0.25 (base 10)
ι({e(0), e(1), e(2)}) = 0.10101 (base 4) ≈ 0.27 (base 10)


The Graph of the Natural Numbers

[Figure: the graph of ι(T_P(I)) against ι(I) for the program below; both axes range over [0, 0.25 . . .].]

n(0).
n(s(X)) ← n(X).
|n(s^n(0))| = n + 1

{} (ι = 0.0) ↦ {n(0)} (ι = 0.25)
{n(0)} (ι = 0.25) ↦ {n(0), n(s(0))} (ι = 0.3125)


The Graph of the Even and Odd Numbers

[Figure: the graph of ι(T_P(I)) against ι(I) for the even/odd program; both axes range over [0, 0.25 . . .].]

e(0).
e(s(X)) ← o(X).
o(X) ← ¬e(X).


Some Results

Theorem (Hölldobler, Kalinke & Störr, 1999)
The T_P-operator associated with an acyclic (wrt. an injective level mapping) first-order logic program can be approximated arbitrarily well using standard sigmoidal networks.

Some conclusions and limitations:
+ The Core Method can be applied to first-order logic.
+ First treatment of first-order logic with function symbols in a connectionist setting.

- No algorithm to construct the network.
- Very limited class of logic programs.


Approximating the Embedded TP-Operator

[Figure: a piecewise-constant approximation (accuracy ε = 0.05) of the embedded T_P-operator of the even/odd program

e(0).
e(s(X)) ← o(X).
o(X) ← ¬e(X).]

Constructions using sigmoidal and RBF units are given in (Bader, Hitzler & Witzel, 2005).


A Problem ...

• The accuracy of this approach is very limited.
• E.g., on a 32-bit computer, only 16 atoms can be represented.
• Therefore, we need to use real vectors instead of a single real number to represent interpretations.


Multi-dimensional Level Mappings

• A multi-dimensional level mapping ‖ · ‖ assigns to each ground atom a level l ∈ N+ and a dimension d ∈ {1, . . . , m}:

Example (Even and odd numbers)
‖e(s^n(0))‖ = (n + 1, 1)    ‖o(s^n(0))‖ = (n + 1, 2)

• ... and still "enumerates" the Herbrand base:

Example (Even and odd numbers)
          1     2        3           4
dim 1:  e(0)  e(s(0))  e(s(s(0)))  e(s(s(s(0))))
dim 2:  o(0)  o(s(0))  o(s(s(0)))  o(s(s(s(0))))


Embedding First-Order Terms into the Real Numbers

Using an injective m-dimensional level mapping, we can assign a unique m-dimensional vector to each interpretation:

~ι(I) = ∑_{A ∈ I} ~ι(A)
~ι(A) = (ι_1(A), . . . , ι_m(A)) with
ι_i(A) = 4^(−l) if ‖A‖ = (l, d) and i = d, and 0 otherwise.
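The 2-dimensional case for the even/odd program, as a small sketch in our own Python (atom encoding again our assumption):

def iota_vec(I, m=2):
    # ||e(s^n(0))|| = (n+1, 1), ||o(s^n(0))|| = (n+1, 2)
    v = [0.0] * m
    for pred, n in I:                       # atoms as ('e', n) or ('o', n)
        l, d = n + 1, (1 if pred == 'e' else 2)
        v[d - 1] += 4.0 ** -l
    return v

print(iota_vec({('e', 0)}))                 # [0.25, 0.0]
print(iota_vec({('e', 0), ('o', 0)}))       # [0.25, 0.25]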


Cm- The Set of all embedded Interpretations

C_m = {~ι(I) | I ∈ I_L}, shown here for the 2-dimensional case:

{} ↦ (0, 0)
{e(0)} ↦ (0.25, 0)
{o(0)} ↦ (0, 0.25)
{e(0), o(0)} ↦ (0.25, 0.25)

[Figure: C_2 plotted in the plane; both axes range over [0, 0.3].]


Cm- The Set of all embedded Interpretations

Another construction:

[Figure: C_m obtained as the limit of an iterated refinement of the unit square, yielding a fractal set.]


Approximating the Embedded TP-Operator

[Figure: the embedded T_P-operator over C_2, plotted separately for the dimensions d1 and d2; all axes range over [0, 0.3].]


Implementation

A first prototype implemented by Andreas Witzel (Witzel, 2006):
• Merging of the techniques described above and Supervised Growing Neural Gas (SGNG) (Fritzke, 1998).
• Radial basis function network approximating T_P.
• Very robust with respect to noise and damage.
• Trainable using a version of backpropagation together with techniques from SGNG.


Approximating the Embedded TP-Operator

[Figure: the approximation of the embedded T_P-operator computed by the trained prototype, plotted per dimension.]


Statistics - FineBlend vs SGNG

[Figure: log-scale error and number of units over the number of training examples (0 to 14000), comparing FineBlend 1 with SGNG.]


Statistics - Unit Failure

[Figure: log-scale error and number of units over the number of training examples (0 to 16000) for FineBlend 1 under unit failure.]


Statistics - Iteration of Random Inputs

[Figure: iterating the network on random inputs; the resulting points are plotted with dimension 1 (even) against dimension 2 (odd), both axes ranging over [0, 0.3].]


Conclusions

+ Prototypical implementation.
+ Very robust with respect to noise and damage.
+ Trainable using more or less standard algorithms.
+ The system outperforms other architectures (at least on the tested examples).

- The system requires many parameters.
- There is no first-order extraction technique yet.


First-order by propositional approximation

Let P be definite and I be its least Herbrand model (Seda & Lane, 2004):
• Choose some error ε.
• There exists a finite ground subprogram Pn (with least model In) such that d(I, In) < ε.
• Use the propositional approach to encode Pn.
• Increasing n yields better approximations of T_P (if T_P is continuous wrt. d).
• The approach works similarly for other (many-valued) logics.


Comparison of the approaches

• Seda & Lane:
  • For definite programs under a continuity constraint.
  • Treatment of acyclic programs should be ok.
  • Better approximation increases all layers of the network.
  • Step functions only.
  • Sigmoidal approach (learning) to be investigated.
• Bader, Hitzler & Witzel:
  • For acyclic normal programs.
  • Treatment of definite (continuous) programs should be ok.
  • Better approximation increases only the hidden layer.
  • Variety of activation functions.
  • Standard learning possible.


Iterated Function Symbols

• The Sierpinski triangle:

[Figure: successive iterations of an iterated function system converging to the Sierpinski triangle.]


From Logic Programs to Iterated Function Systems

• For some logic programs we can explicitly construct an IFS such that the attractor coincides with the graph of the embedded T_P-operator.
• Let P be a program such that f_P is Lipschitz-continuous. Then there exists an IFS such that the attractor is the graph of f_P.
• For a finite set of points taken from a T_P-operator, we can construct an interpolating IFS.
• The sequence of attractors of interpolating IFSs for acyclic programs converges to the graph of the program.
• IFSs can be encoded using RBF networks.


Extraction of First-Order Logic Programs

• Very little work has been done on this.
• A general idea:
  • Use any initialization method as a base.
  • Neural networks are points in R^n, where n is the number of weights.
  • Define conditions on the programs which may be extracted (e.g. a maximum number of atoms or of term nesting depth).
  • Via the initialization method, these programs yield discrete points in R^n.
  • The program which lies closest to the network in R^n is the extracted program.
? Could this work?


Conclusions

• 3-layer feed-forward networks can approximate T_P for certain programs.
• Using sigmoidal units, the network is trainable using backpropagation.


Open Problems

? How can first-order descriptions be extracted from a connectionist system?
? Can a first-order neural-symbolic system be applied to real-world problems, outperforming conventional approaches?
? How does the Core Method relate to reasoning approaches from Cognitive Science?
? . . . (many more) . . .

www.neural-symbolic.org
