Neural-Symbolic Integration: A Self-Contained Introduction
Sebastian Bader Pascal Hitzler
ICCL, Technische Universität Dresden, Germany
AIFB, Universität Karlsruhe, Germany
Outline of the Course
I Introduction and Motivation
I The History of Neural-Symbolic Integration
I The Core Method for Propositional Logic
I The Core Method for First-Order Logic
Part I
Introduction and Motivation
Outline: Motivation, Connectionist Systems, Symbolic AI, Neural-Symbolic Integration
Why Neural-Symbolic Integration?
As we will see, connectionist systems and symbolic AI systems have quite contrasting advantages and disadvantages. We try to integrate both paradigms while keeping the advantages.
The Neural-Symbolic Cycle
[Figure: the neural-symbolic cycle — a symbolic system (readable and writable) is embedded into a connectionist system (trainable), and knowledge is extracted back from it.]
Connectionist Systems
I Inspired by nature.
I Massively parallel computational model.
I A connectionist system consists of ...
• a set U of units (input, hidden and output),
• a set of connections C ⊆ U × U, each labelled with a weight w ∈ R.
[Figure: a small example network with units x, y, z and connection weights −1.5, 0.3, −1.3, −0.7, 1.8, −1.6.]
Units of Connectionist Systems
A unit is characterised by ...
I Activation function, mapping the inputs ~i to the potential p:
p = ∑_n i_n · w_n   or   p = ∑_n (i_n − w_n)²
I Output function, mapping the potential p to the output o:
[Figure: common output functions — threshold, ramp, sigmoidal, tanh, Gaussian.]
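To make the two function types concrete, here is a minimal Python sketch of a single unit; the helper name and the pairing of a sigmoidal output with either activation are our own illustration, not part of the slides:

import math

# A unit: the activation function maps the inputs to the potential p,
# the output function maps p to the output o (here: sigmoidal).
def unit_output(inputs, weights, activation="weighted_sum"):
    if activation == "weighted_sum":              # p = sum_n i_n * w_n
        p = sum(i * w for i, w in zip(inputs, weights))
    else:                                         # p = sum_n (i_n - w_n)^2
        p = sum((i - w) ** 2 for i, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-p))             # sigmoidal output function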
Dynamics of a Network
[Figure: the example network with units x, y, z and weights −1.5, 0.3, −1.3, −0.7, 1.8, −1.6.]
Unit     Activation function        Output function
input    p set from outside         o = p
hidden   p = ∑_n (i_n − w_n)²       o = e^{−p²}
output   p = ∑_n i_n · w_n          o = p
[Figure: a sample run for the inputs −1.0 and 1.0 — at t=1 the hidden units yield potentials/outputs 0.74/0.5783 and 2.98/0.0001; at t=2 the output unit yields 1.039.]
Dynamics of a Network
[Plots: the resulting input-output behaviour for the two activation functions p = ∑_n (i_n − w_n)² and p = ∑_n i_n · w_n, each combined with its output function.]
Training of Connectionist Systems
? How can we train a network to represent a function given as a set of samples {(i_1, o_1), . . . , (i_n, o_n)}?
[Figure: sample points over (x, y), and the function a trained network fits through them.]
I Learning as generalization.
Backpropagation
I Let a set of samples {(i_1, o_1), . . . , (i_n, o_n)} be given.
I Error of the network: E = ∑_i (N(i_i) − o_i)².
I Idea: minimise E by gradient descent.
[Plot: the error E as a surface over the weight space; gradient descent follows the slope downhill.]
Backpropagation in Detail
1. Present a training sample to the network.
2. Compare the output of the network with the desired output.
3. Calculate the error in each output unit.
4. Modify the weights to the output layer such that the error decreases.
5. Propagate the error back to the last hidden units.
6. Compute the part of the error caused by the hidden units.
7. Modify the weights to the hidden units using this local error.
8. Continue until the input units are reached.
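The eight steps translate directly into code. A self-contained NumPy sketch on a toy 2-2-1 sigmoidal network; the XOR task, learning rate and initialisation are our own choices, and convergence depends on the random seed:

import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # sample inputs
Y = np.array([[0], [1], [1], [0]], dtype=float)              # desired outputs (XOR)

W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)   # input -> hidden weights
W2, b2 = rng.normal(size=(2, 1)), np.zeros(1)   # hidden -> output weights
sig = lambda p: 1.0 / (1.0 + np.exp(-p))

for _ in range(20000):
    H = sig(X @ W1 + b1)                 # step 1: present the samples (forward pass)
    O = sig(H @ W2 + b2)
    dO = (O - Y) * O * (1 - O)           # steps 2-3: error at each output unit
    dH = (dO @ W2.T) * H * (1 - H)       # steps 5-6: error propagated to hidden units
    W2 -= 0.5 * H.T @ dO; b2 -= 0.5 * dO.sum(axis=0)  # step 4: update output weights
    W1 -= 0.5 * X.T @ dH; b1 -= 0.5 * dH.sum(axis=0)  # step 7: update hidden weights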
A sample run ...
[Figure: the network before training (left, with initial weights) and after training (right, with learned weights), together with the functions over (x, y) they compute.]
Funahashi’s Theorem
Theorem (Ken-Ichi Funahashi, 1989)
Every continuous function f : K → R (with K ⊂ R^n compact) can be approximated arbitrarily well using 3-layer feed-forward networks with sigmoidal units.
[Figure: a sigmoidal unit and a function over (x, y) approximated by such a network.]
History of Connectionist Systems
1943 Warren Sturgis McCulloch and Walter Pitts publish "A logical calculus of the ideas immanent in nervous activity".
1969 Marvin Minsky and Seymour Papert publish "Perceptrons".
1974 Paul Werbos (1974), David Parker (1984) and David Rumelhart & Ronald Williams (1985) invent backpropagation.
1989 Ken-Ichi Funahashi publishes "On the approximate realization of continuous mappings by neural networks".
NETtalk
I Terrence J. Sejnowski and Charles R. Rosenberg, 1987
I Network learns the map between letters and phonemes.
I 3-layer feed-forward network with sigmoidal units:
• 203 input units: encoding a window of 7 letters
• 80 hidden units
• 26 output units: representing phonemes, punctuation ...
I Trained using samples of the form:
Word        phonemes    stress and syllable
logic       laJIk       > 1 < 0 <
programme   progr@m--   >> 1 >> 2 <<<
neural      nU-r-L      > 1 << 0 <
network     nEtw-Rk     > 1 <> 2 <<
ALVINN & MANIAC
ALVINN (Autonomous Land Vehicle In a Neural Network)
I Pomerleau, 1993
I Learns to control NAVLAB vehicles by watching a person.
I 3-layer feed-forward network with sigmoidal units:
• 960 input units: 30x32 units serve as a two-dimensional retina
• 5 hidden units
• 30 output units: representing the steering direction
MANIAC (Multiple ALVINN Networks In Autonomous Control)
I Jochem et al., 1993
I Multiple ALVINN networks, each for a certain type of road.
ALVINN, MANIAC & RALPH
The road for ALVINN, MANIAC & RALPH:
RALPH (Rapidly Adapting Lateral Position Handler)
I Pomerleau, 1995
I Drove in 9 days from Pittsburgh to San Diego (all but 50 of the 2850 miles).
TD-Gammon
Artificial Backgammon player (Gerry Tesauro, 1995):
I Based on a standard neural network.
I Learns by playing against itself.
I Reaches championship level.
Btw.:
I We play Backgammon symbolically.
I Can we learn from the network to play better?
Properties of Connectionist Systems
+ Massively parallel paradigm.
+ Continuous functions can be approximated arbitrarily well.
+ Well suited to learn and to adapt to new environments.
• Hardly any knowledge about the function is needed.
• Training from raw, incomplete and noisy data.
+ Degrades gracefully: even if parts of the system fail, the overall behaviour is ok.
+ Successfully used in many application areas.
− Declarative semantics is not available.
− Structured objects are hard to represent.
− Background knowledge cannot be used.
Symbolic AI
I Inspired by human thinking.
I Aristotle tried to formalize human thinking by means of logic (4th century BC).
I First "is-a" hierarchy by Porphyry of Tyros (260 AD).
I Use of logic-like languages for knowledge representation and reasoning.
SHRDLU
I Terry Winograd, MIT, 1968-70
I SHRDLU carried on a simple dialog about a small world.
I A sample dialog:
Person: Pick up a red block.
Computer: Ok. (does it)
Person: Grasp the pyramid.
Computer: I don't understand which pyramid you mean.
Person: Find a block which is taller than the one you are holding and put it into the box.
Computer: By "it", I assume you mean the block which is taller than the one I am holding.
Computer: Ok. (does it)
Person: What does the box contain?
Computer: The blue pyramid and the blue block.
SHRDLU - Internals
I Written in LISP.
I Equipped with some knowledge about the world:
((#IS :B1 #BLOCK))          ((#IS #RED #COLOR))
((#IS :B2 #PYRAMID))        ((#IS #GREEN #COLOR))
((#IS :B3 #BLOCK))          ((#IS #BLACK #COLOR))
((#COLOR :B1 #RED))         ((#CONTAIN :BOX :B4))
((#COLOR :B2 #GREEN))       ((#SHAPE :B1 #RECTANGULAR))
((#COLOR :TABLE #BLACK))
((#SHAPE :B3 #RECTANGULAR))
(DEFPROP TA-AT
  (THANTE (X Y) (#AT $?X $?Y)
    (THRPLACA (CDR (ATAB $?X)) $?Y))
  THEOREM)

(DEFPROP TA-EXISTS
  (THANTE (X) (#EXISTS $?X) (THSUCCEED))
  THEOREM)
I Can be downloaded from http://hci.stanford.edu/winograd/shrdlu
Prolog (Programming in Logic)
I Designed as a tool for man-machine communication in natural language.
I Philippe Roussel and Alain Colmerauer, 1972
I The first Prolog application:
Every psychiatrist is a person.
Every person he analyzes is sick.
Jacques is a psychiatrist in Marseille.
Is Jacques a person? Yes.
Where is Jacques? In Marseille.
Is Jacques sick? I don't know.
I Consisted of 610 clauses.
Applications Involving Prolog
Nowadays:
I Turing-complete programming language.
I Usually with additional (non-logical) features.
Some application areas:
I Expert and rule systems.
I Computational linguistics (e.g. representation of grammars).
I Planning in AI.
I Cognitive robotics.
I Semantic web.
Deterministic Finite Automata
A Moore machine consists of:
I Q – a set of states with an initial state q0 ∈ Q
I Σ – a set of input symbols
I ∆ – a set of output symbols
I δ – a state transition function δ : Q × Σ → Q
I λ – a state output function λ : Q → ∆
Example
Q = {q0, q1}    Σ = {a, b}    ∆ = {0, 1}
δ = {(q0, a) ↦ q0, (q0, b) ↦ q1, (q1, a) ↦ q1, (q1, b) ↦ q0}
λ = {q0 ↦ 1, q1 ↦ 0}
[Figure: the corresponding state diagram — q0 (output 1) and q1 (output 0); a loops on each state, b switches between them.]
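This example machine is a two-line dictionary in Python; a minimal sketch with names of our own choosing:

delta = {('q0', 'a'): 'q0', ('q0', 'b'): 'q1',   # state transition function
         ('q1', 'a'): 'q1', ('q1', 'b'): 'q0'}
lam = {'q0': 1, 'q1': 0}                         # state output function

def run(word, state='q0'):
    # Return the output emitted after each input symbol.
    out = []
    for symbol in word:
        state = delta[(state, symbol)]
        out.append(lam[state])
    return out

# run('abba') returns [1, 0, 1, 1].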
DFAs are everywhere
I Beverage vending machines.
I Elevators.
I Mobile phone menus.
I etc.
Properties of Symbolic Systems
+ Human readable and writable, i.e. background knowledge is directly integrable.
+ Declarative semantics is available.
+ Recursive structures can easily be represented and manipulated.
+ Successfully used in many application areas.
− Hard to learn and to adapt to new environments.
− If parts of the system break, the whole system fails.
− Reasoning can be very hard.
Why Neural-Symbolic Integration?
I Connectionist systems and symbolic knowledge representation are two major approaches in AI.
I Both have complementary advantages and disadvantages.
I We try to integrate both by keeping the advantages:
+ Human readable and writable.
+ Declarative semantics is available.
+ Recursive structures can easily be represented and manipulated.
+ Massively parallel paradigm.
+ Well suited to learn and to adapt to new environments.
+ Graceful degradation.
Major Problems in Neural-Symbolic Integration
I How can symbolic knowledge be represented within connectionist systems?
I How can symbolic knowledge be extracted from connectionist systems?
I How can symbolic knowledge be learned using connectionist systems?
I How can connectionist learning be guided by symbolic background knowledge?
Part II
The History of Neural-Symbolic Integration
Outline: A Joint Start, RAAM, SHRUTI, KBANN, Symmetric Networks, The Core Method
A Joint Start - McCulloch and Pitts
? Can the activities of a neural system be modelled by a logical calculus?
I W. S. McCulloch and W. Pitts, 1943: A logical calculus of the ideas immanent in nervous activity
I S. C. Kleene, 1956: Representation of events in nerve nets and finite automata
Logical Connectives
? Can we model logical connectives using simple units?
I Using binary threshold units and the activations 1 for "true" and 0 for "false", we obtain:
Disjunction x ∨ y: weights 1.0 and 1.0, threshold 0.5.
Conjunction x ∧ y: weights 1.0 and 1.0, threshold 1.5.
Negation ¬x: weight −1.0, threshold −0.5.
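These three units can be checked directly; a minimal Python sketch (the helper name threshold_unit is ours):

def threshold_unit(inputs, weights, theta):
    # Binary threshold unit: output 1 iff the weighted input sum reaches theta.
    p = sum(i * w for i, w in zip(inputs, weights))
    return 1 if p >= theta else 0

def OR(x, y):  return threshold_unit([x, y], [1.0, 1.0], 0.5)
def AND(x, y): return threshold_unit([x, y], [1.0, 1.0], 1.5)
def NOT(x):    return threshold_unit([x], [-1.0], -0.5)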
McCulloch-Pitts Networks
A McCulloch-Pitts network consists of ...
I a set I of input units,
I a set U of binary threshold units,
I a subset O ⊆ U of output units.
Example
I = {x, y}    U = {h, o}    O = {o}
[Figure: the network — inputs x, y; a hidden unit h with threshold 1.5; an output unit o with threshold 0.5; connection weights ±1.0.]
Moore Machines
I A Moore machine consists of:
• a set of states with an initial state
• a set of input symbols
• a set of output symbols
• a state transition function
• a state output function
I Using the Moore machine on the input word a b b a:
Input:   a    b    b    a
State:  q0 → q0 → q1 → q0 → q0
Output:      1    0    1    1
From Moore Machines to McCulloch-Pitts Networks
[Figure: the construction — input units for a and b; ∨-units representing the current state (q0, q1) and the output (0, 1); one ∧-unit per transition, feeding the next-state units q′0, q′1.]
From Moore Machines to McCulloch-Pitts Networks
[Figure: the constructed network processing the input word from before, step by step.]
From McCulloch-Pitts Networks to Moore Machines
I A sample network: the two-input network from above (inputs x, y; hidden unit h with threshold 1.5; output unit o with threshold 0.5; weights ±1.0).
I Read it as a Moore machine:
• the states Q are the possible activation patterns of the network,
• the input symbols Σ are the possible input patterns,
• the output symbols ∆ are the possible output patterns,
• the state transition function δ : Q × Σ → Q is given by the network dynamics,
• the state output function λ : Q → ∆ reads off the output units.
Conclusions
I McCulloch-Pitts networks are finite automata and vice versa.
I The paper ("A logical calculus of the ideas immanent in nervous activity") started the research on artificial neural networks and on finite automata.
I Similar constructions work for other types of automata.
Recursive Autoassociative Memory (RAAM)
I Designed to encode structured data, e.g. trees:
[Figure: a binary tree with leaves A, B, C, D.]
I Terminals are mapped to vectors:
A ↦ (1, 0, 0, 0)    B ↦ (0, 1, 0, 0)
C ↦ (0, 0, 1, 0)    D ↦ (0, 0, 0, 1)
I Nonterminals are learned.
[Figure: the RAAM network encodes (A, B) as [AB] and (C, D) as [CD], then ([AB], [CD]) as [ABCD]; decoding reverses the process.]
RAAM - Usage
Due to the auto-associative structure, a single network can be used for encoding and decoding.
I Encoding: [Figure: the vectors for A and B are presented at the input layer; the hidden layer yields the compressed representation [AB].]
I Decoding: [Figure: [AB] is presented to the hidden layer; the output layer reconstructs the vectors for A and B.]
RAAM - Conclusions
I For details, see (Pollack, 199x).
+ Efficiently implementable.
+ Use of powerful gradient-based learning techniques.
+ System degrades gracefully.
− Difficulties distinguishing terminals and non-terminals for terms with depth ≥ 5.
− Capacity limit ≈ depth 5.
− System needs an external controller.
+ Demonstration that structured data can be represented within a connectionist system.
SHRUTI - A System for Reflexive Reasoning
I Humans can handle certain problems very easily and fast.
I Humans' approximate knowledge base: 10^8 rules and facts, i.e. we perform reflexive reasoning in sublinear time.
I The SHRUTI system (Shastri & Ajjanagadde, 1993) is a connectionist architecture for this type of reasoning.
I Variable binding by synchronization of neurons.
SHRUTI - A Sample Knowledge Base
I Rules:
owns(Y, Z) ← gives(X, Y, Z)
owns(X, Y) ← buys(X, Y)
can-sell(X, Y) ← owns(X, Y)
I Facts:
gives(john, josephine, book)
(∃X) buys(john, X)
owns(josephine, ball)
I Question:
can-sell(josephine, book)? yes
(∃X) owns(josephine, X)? yes (X ↦ book, X ↦ ball)
SHRUTI - A Sample Network
[Figure: the SHRUTI network for this knowledge base — predicate assemblies for Can-sell, Owns, Gives and Buys, connected to the entities book, john, ball and josephine; argument links such as "from john" and "from josephine" realise the variable bindings.]
SHRUTI - A Sample Network Run
SHRUTI - Conclusions
I Answers are derived in time proportional to the depth of the search space (reflexive reasoning).
I Network size is linear in the size of the knowledge base.
I A rule can be used only a fixed number of times.
I Biologically plausible.
SHRUTI - Extensions
I Support of negation and inconsistency (Shastri & Wendelken, 1999).
I Simple learning using Hebbian learning (Wendelken & Shastri, 2003).
I Multiple instantiation of a single rule (Wendelken & Shastri, 2004).
KBANN - Knowledge-Based Artificial Neural Networks
? Can simple "if-then" rules be represented and learned using a connectionist architecture?
I Geoffrey G. Towell and Jude W. Shavlik, 1994
KBANN - The Construction
Original rules:
A ← B ∧ C ∧ ¬D.
A ← D ∧ ¬E.
H ← F ∧ G.
K ← A ∧ ¬H.
Rewritten (at most one clause per head, via new units A′ and A″):
A ← A′ ∨ A″.
A′ ← B ∧ C ∧ ¬D.
A″ ← D ∧ ¬E.
H ← F ∧ G.
K ← A ∧ ¬H.
[Figure: the KBANN network — input units B, C, D, E, F, G; conjunctive units A′, A″, H, K and a disjunctive unit A; the rules are encoded by connections with weights w (positive antecedents) and −w (negated antecedents).]
KBANN - Training
Training
I Add hidden units.
I Fully connect the layers.
I Add small random numbers to the weights and thresholds.
I Apply backpropagation.
[Figure: the KBANN network from the previous slide, prepared for training as described above.]
KBANN - A Problem
+ Works well if rules have only few conditions and there are only few rules with the same consequence, but:
I Towell and Shavlik used sigmoidal output functions:
o = 1 / (1 + e^{−(p−θ)})
I The threshold of conjunctive units is computed as
θ = (P − 0.5) · w
where P is the number of positive antecedents.
I The threshold of disjunctive units is always set to θ = w/2.
KBANN - A Problem ctd.
I The rules:
C ← A_1 ∨ . . . ∨ A_n.           θ_C = w/2
A_i ← A_i1 ∧ . . . ∧ A_iP.       θ_{A_i} = (P − 0.5) · w
I Let all but one A_ij be true for each clause, i.e. p_{A_i} = (P − 1)w. Then:
o_{A_i} = 1 / (1 + e^{−(p−θ)}) = 1 / (1 + e^{−((P−1)w − (P−0.5)w)}) = 1 / (1 + e^{w/2})
p_C = n · w / (1 + e^{w/2})
− For any value of w we can compute an n such that o_C exceeds any threshold to be considered active.
+ Can be solved using bipolar output functions.
KBANN - Conclusions
I Mapping of hierarchical domain knowledge into a connectionist system.
I Refinement using standard backpropagation.
I Successfully applied to a number of problems (e.g. DNA sequence analysis).
I Outperforms purely empirical and purely hand-built classifiers.
Symmetric Networks
A symmetric network consists of ...
I a set U of binary threshold units,
I a set W of symmetric connections W ⊆ U × U, i.e. w_ij = w_ji.
The units are updated asynchronously until a stable state is reached.
Symmetric Networks - A Simple Example
[Figure: a simple symmetric network (thresholds 0, 0, 5, 0; connection weights 2, −1, 2, 2), shown in two states of its asynchronous update.]
Symmetric Networks and Logic Formulae
I It is possible to associate an energy function E(t) describing the state of the network at time t.
I The energy is monotonically decreasing, i.e. E(t) ≥ E(t + 1).
? Is there a link between propositional logic formulae and symmetric networks (Pinkas, 1991)?
I For each propositional logic formula we can define a function τ which is "compatible" with the energy function.
I We can construct a symmetric network such that the activations of the network at the energy minima coincide with the models of the formula.
Symmetric Networks - An Example
Example
F = (¬o ∨ m) ∧ (¬s ∨ ¬m) ∧ (¬c ∨ m) ∧ (¬c ∨ s) ∧ (¬v ∨ ¬m)
τ(F) = vm − cm − cs + sm − om + 2c + o
[Figure: the corresponding symmetric network over the units m, s, o, v, c, with thresholds 0, 1, 2 and connection weights ±1.]
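For this small example the claimed correspondence can be verified by brute force; a Python sketch (the check itself is ours, assuming that the models of F are exactly the global minima of τ):

from itertools import product

def F(o, m, s, c, v):    # the formula, clause by clause
    return ((not o or m) and (not s or not m) and (not c or m)
            and (not c or s) and (not v or not m))

def tau(o, m, s, c, v):  # the associated energy function
    return v*m - c*m - c*s + s*m - o*m + 2*c + o

assignments = list(product([0, 1], repeat=5))
e_min = min(tau(*a) for a in assignments)
# Every global minimum of tau is a model of F, and vice versa:
assert all((tau(*a) == e_min) == F(*a) for a in assignments)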
Symmetric Networks - Conclusions
I Strong link between propositional logic formulae and symmetric networks.
I Further extensions to non-monotonic logics and inconsistency.
I Add penalties to clauses, which define a preference.
I The network settles down to the most preferable interpretation.
The Core Method
I Relate logic programs and connectionist systems.
I Embed interpretations into (vectors of) real numbers.
I Hence, obtain an embedded version of the TP-operator.
I Construct a network computing one application of fP.
I Add recurrent connections from the output to the input layer.
[Diagram: TP : I_L → I_L and fP : R^m → R^m commute via the embedding ι and its inverse ι⁻¹; the network maps ~x to fP(~x).]
Major Problems in Neural-Symbolic Integration
I How can symbolic knowledge be represented within connectionist systems? (What is ι?)
I How can symbolic knowledge be extracted from connectionist systems? (What is ι⁻¹?)
I How can symbolic knowledge be learned using connectionist systems?
I How can connectionist learning be guided by symbolic background knowledge?
Part III
The Core Method for Propositional Logic
Outline: Propositional Logic Programs, The Core Method for Propositional Logic, CILLP and some Derivatives, Conclusions
Propositional Logic Programs – An Example
A ← ¬B.        % A is true if B is false.
B ← A ∧ ¬B.    % B is true if A is true and B is false.
B ← B.         % B is true if B is true.
Propositional Logic Programs – The Syntax
Definition (Propositional Variables & Connectives)
A, B, C, D, . . .    ∧ = "and"    ← = "if-then"    ¬ = "not"

Definition (Clause)
H ← L_1 ∧ L_2 ∧ . . . ∧ L_n.    (H is the head; the body literals L_i are either X or ¬X)

Definition (Propositional Logic Program)
A propositional logic program is a finite set of clauses.
Propositional Logic Programs – The Semantics
Definition (Herbrand Base B_L)
The Herbrand base is the set of all variables occurring in P.

Example (B_L for the running example)
B_L = {A, B}

Definition (Interpretation)
An interpretation is a subset of the Herbrand base.

Example (Interpretations for the running example)
I_1 = ∅    I_2 = {A}    I_3 = {B}    I_4 = {A, B}
Propositional Logic Programs – The Semantics Ctd.
Example (For I_2 = {A})
(A)^{I_2} = true          (¬A)^{I_2} = false
(B)^{I_2} = false         (¬B)^{I_2} = true
(A ← ¬B)^{I_2} = true     (B ← B)^{I_2} = true
(A ∧ ¬B)^{I_2} = true     (B ← A ∧ ¬B)^{I_2} = false
Propositional Logic Programs – The Semantics Ctd.
Definition (Model)
An interpretation M satisfying every clause of a program P is called a model of P (in symbols M |= P).
Example (Models of the running example)
A← ¬B.
B ← A ∧ ¬B.
B ← B.
∅ ⊭ P
{A} ⊭ P
{B} |= P
{A, B} |= P
The Immediate Consequence Operator TP
Definition (TP)
TP(I) = {A | there is a clause A ← body in P and I |= body}
I The TP-operator propagates truth along the clauses.
Example (TP for our running example)
A← ¬B.
B ← A ∧ ¬B.
B ← B.
{} ↦ {A}
{A} ↦ {A, B}
{B} ↦ {B}
{A, B} ↦ {B}
I For definite programs, TP converges to the least model.
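The operator is easy to implement; a Python sketch for the running example (the clause encoding as (head, positive body, negated body) is our own):

P = [('A', [], ['B']),      # A <- not B.
     ('B', ['A'], ['B']),   # B <- A and not B.
     ('B', ['B'], [])]      # B <- B.

def T_P(I):
    return {head for head, pos, neg in P
            if all(a in I for a in pos) and all(a not in I for a in neg)}

# T_P(set()) == {'A'};  T_P({'A'}) == {'A', 'B'};  T_P({'B'}) == {'B'}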
Constructing the Core-Network
1. For each element of B_L, add an input unit and an output unit with threshold 0.5.
2. For each clause H ← L_1 ∧ . . . ∧ L_n do the following:
2.1 Add a hidden unit c and a connection from c to H′ (w = 1.0).
2.2 Connect every L_i and c with w = +1.0 if L_i is positive, and w = −1.0 if L_i is negated.
2.3 Set the threshold of c to "number of positive L_i" − 0.5.
Example
A← ¬B.
B ← A ∧ ¬B.
B ← B.
[Figure: the resulting core network — input units A, B; one ∧-unit per clause with the weights ±1.0 from the construction; ∨-output units A′, B′.]
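Carrying out the construction for the running example yields the following weights; the NumPy sketch (array layout ours) computes one application of TP with threshold units:

import numpy as np

# Input/output order [A, B]; one hidden unit per clause.
W_in = np.array([[0.0, 1.0, 0.0],      # weights from input A to the clause units
                 [-1.0, -1.0, 1.0]])   # weights from input B
theta_h = np.array([-0.5, 0.5, 0.5])   # "number of positive literals" - 0.5
W_out = np.array([[1.0, 0.0],          # clause 1 feeds A'
                  [0.0, 1.0],          # clause 2 feeds B'
                  [0.0, 1.0]])         # clause 3 feeds B'
theta_out = np.array([0.5, 0.5])

def tp_step(I):                        # I is a 0/1 vector for [A, B]
    hidden = (I @ W_in >= theta_h).astype(float)
    return (hidden @ W_out >= theta_out).astype(float)

# tp_step(np.array([0.0, 0.0])) gives [1., 0.], i.e. {} maps to {A}.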
One Application of TP
A← ¬B.
B ← A ∧ ¬B.
B ← B.
[Figure: the core network with output thresholds 0.5 and hidden thresholds −0.5, 0.5, 0.5.]
One pass through the network computes TP:
{} ↦ {A}
{A} ↦ {A, B}
{B} ↦ {B}
{A, B} ↦ {B}
Repetitive Application of TP
A← ¬B.
B ← A ∧ ¬B.
B ← B.
[Figure: the same network with recurrent connections from the output back to the input layer.]
Iterating the network yields {} ↦ {A} ↦ {A, B} ↦ {B} ↦ {B} ↦ . . ., i.e. it settles on the model {B}.
Main Results (Hölldobler & Kalinke, 1994)
I 2-layer networks cannot compute TP.
I For each program P there exists a 3-layer kernel computing TP.
Space and Time Complexity
Let n be the number of clauses and m the number of propositional variables:
I 2m + n units and 2mn connections in the kernel.
I TP(I) is computed in 2 steps.
I The parallel model to compute TP is optimal.
I The recurrent network settles down in at most 3n steps.
Extraction Methods
I Single units do not necessarily correspond to single rules.
I In general, it is NP-complete to find the minimal logical description for a trained network (Golea, 1996).
I There is not always a single minimal program (Lehmann, Bader & Hitzler, 2005).
[Figure: two extraction strategies — decompositional extraction derives individual rules (rule1, . . . , rule4) from parts of the network; pedagogical extraction treats the network as a black box computing ~x ↦ fP(~x) and derives rules from its input-output behaviour.]
Extraction – A Pedagogical Approach
[Figure: a trained network with input units A, B, hidden units c1, c2, c3 and output units A′, B′, with learned weights and thresholds.]
Querying it on all inputs (each entry: potential / output):
A B | c1         | c2          | c3          | A′      | B′
0 0 | 0.0 / 0.0  | 0.0 / 1.0   | 0.0 / 0.0   | 0.0 / 1 | −1.0 / 0
0 1 | 1.5 / 1.0  | 0.3 / 1.0   | 0.8 / 1.0   | 1.8 / 1 | 0.7 / 1
1 0 | 1.0 / 1.0  | −2.0 / 0.0  | −0.5 / 0.0  | 2.0 / 1 | 0.7 / 1
1 1 | 2.5 / 1.0  | −1.7 / 0.0  | 0.3 / 0.0   | 2.0 / 1 | 0.7 / 1
Extraction – A Pedagogical Approach
A B | A′ B′
0 0 | 1  0
0 1 | 1  1
1 0 | 1  1
1 1 | 1  1
A← ¬A ∧ ¬B.
A← ¬A ∧ B.
A← A ∧ ¬B.
A← A ∧ B.
B ← ¬A ∧ B.
B ← A ∧ ¬B.
B ← A ∧ B.
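In code, the pedagogical approach is a plain exhaustive enumeration; a Python sketch, where net is an assumed black-box function from 0/1 input tuples to 0/1 output tuples:

from itertools import product

def pedagogical_extract(net, atoms=('A', 'B')):
    # Query the network on all 2^n inputs; emit one clause per active output.
    rules = []
    for bits in product([0, 1], repeat=len(atoms)):
        body = ' ∧ '.join(a if b else '¬' + a for a, b in zip(atoms, bits))
        for atom, active in zip(atoms, net(bits)):
            if active:
                rules.append(atom + ' ← ' + body + '.')
    return rules

The loop over all 2^n inputs is exactly the exponential blow-up criticised below.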
Extraction – A Pedagogical Approach
A← ¬A ∧ ¬B.
A← ¬A ∧ B.
A← A ∧ ¬B.
A← A ∧ B.
B ← ¬A ∧ B.
B ← A ∧ ¬B.
B ← A ∧ B.
A.
B ← ¬A ∧ B.
B ← A.
Extraction – A Pedagogical Approach
+ Sound, i.e. every extracted rule is a rule implemented by the network.
+ Complete, i.e. every rule implemented by the network will be extracted.
− Bad time complexity, due to the exponential blow-up.
− Does not create the smallest program automatically.
Extraction – A Decompositional Approach
We can do much better (Mayer-Eichberger, 2006):
I Decompositional approach.
+ Implementable (the implementation is under way).
+ Sound.
+ Complete.
+ Creates very small programs automatically.
Main Results (Hölldobler & Kalinke, 1994)
I 2-layer networks cannot compute TP.
I For each program P there exists a 3-layer kernel computing TP.
I For each 3-layer kernel K there exists a program P such that K computes TP.
I Let n be the number of clauses and m the number of propositional variables:
• 2m + n units, 2mn connections in the kernel.
• TP(I) is computed in 2 steps.
• The parallel model to compute TP is optimal.
• The recurrent network settles down in at most 3n steps.
The CILLP-System
? Can the learning capabilities of KBANN be combined with the Core Method (Garcez & Zaverucha, 1999)?
I Using sigmoidal functions, we obtain a standard 3-layer feed-forward neural network.
I This network is trainable using backpropagation.
CILLP - The Construction
I Define ranges for "true" and "false":
[Figure: the unit's output range — values above a count as "true", values below −a as "false", values in between are undefined.]
I Compute a, the weights and the thresholds such that the sigmoidal kernel computes TP (Garcez & Zaverucha, 1999).
CILLP - Extracting a Learned Program
I The pedagogical approach would work, but ...
I the decompositional approach mentioned above does not work for sigmoidal units.
I Garcez, Broda & Gabbay (2001) proposed a suitable method, which ...
+ is sound.
+ is computationally feasible, due to a clever restriction of the search space.
− is not necessarily complete.
− does not necessarily create the smallest programs.
CILLP - The MONK’s Problems
I Robots are described by 6 properties, e.g. head-shape ∈ {round, square, octagon}, ...
I Classification task: "Recognise robots with (body-shape = head-shape) or (jacket-color = red)".
I Network architecture:
• 17 input units: one for each attribute.
• 3 hidden layer units.
• 1 output unit: indicating answer "yes" or "no".
I 100% performance of the network and extracted rules.
I Pruning: of 131072 possible inputs for some hidden unit, only 18724 were queried.
CILLP - Conclusions
I Successfully used for ...
• classification tasks like the MONK's problems.
• DNA sequence analysis (promoter recognition, splice junction determination).
• power system fault diagnosis.
I Extensions of the CILLP-System:
• Metalevel priorities between rules (Garcez, Broda & Gabbay, 2000).
• Intuitionistic logic (Garcez, Lamb & Gabbay, 2003).
• Modal logic (Garcez, Lamb, Broda & Gabbay, 2004).
The Core Method
I Relate logic programs and connectionist systems.
I Embed interpretations into (vectors of) real numbers.
I Hence, obtain an embedded version of the TP-operator.
I Construct a network computing one application of fP.
I Add recurrent connections from the output to the input layer.
[Diagram: TP : I_L → I_L and fP : R^m → R^m commute via the embedding ι and its inverse ι⁻¹; the network maps ~x to fP(~x).]
Major Problems in Neural-Symbolic Integration
I How can symbolic knowledge be represented within connectionist systems? (What is ι?)
I How can symbolic knowledge be extracted from connectionist systems? (What is ι⁻¹?)
I How can symbolic knowledge be learned using connectionist systems?
I How can connectionist learning be guided by symbolic background knowledge?
Conclusions
We have a complete system implementing the NeSy cycle for propositional logic programs.
[Figure: the neural-symbolic cycle — embedding from the symbolic (readable/writable) system into the connectionist (trainable) system, and extraction back.]
Main Results
I 3-layer feedforward networks can compute TP.
I Using sigmoidal units, the network is trainable using backpropagation.
I Extraction is sound (and complete).
I Successfully applied to real-world problems.
Part IV
The Core Method for First-Order Logic
Outline: FOL Programs, Bridging the Gap, FineBlend, Further Topics, Extraction, Conclusions
First Order Logic Programs – Two Examples
nat(0).                    % 0 is a natural number.
nat(succ(X)) ← nat(X).     % The successor succ(X) is a natural number if X is a natural number.

even(0).                   % 0 is an even number.
even(succ(X)) ← odd(X).    % The successor of an odd X is even.
odd(X) ← ¬even(X).         % If X is not even then it is odd.
First Order Logic Programs – The Syntax
Functions, Variables and Terms
F = {0/0, succ/1}    V = {X}
T = {0, succ(0), succ(X), succ(succ(0)), . . .}

Predicate Symbols and Atoms
P = {even/1, odd/1}
A = {even(succ(X)), odd(succ(0)), odd(0), odd(X), . . .}

Connectives, Clauses and Programs: as in propositional logic.
(Abbreviated running example: e(0).  e(s(X)) ← o(X).  o(X) ← ¬e(X).)
First Order Logic Programs – The Semantics
Herbrand Base B_L = set of ground atoms:
B_L = {even(0), even(succ(0)), . . . , odd(0), odd(succ(0)), . . .}

Interpretations = subsets of the Herbrand base:
I_1 = {even(succ^{2n}(0)) | n ≥ 1}     I_2 = {}
I_3 = {odd(succ^{2n+1}(0)) | n ≥ 0}    I_4 = I_2 ∪ I_3
TP for our running examples
Definition (TP)
TP(I) = {A | there is A ← body in ground(P) and I |= body}
Example (Natural numbers)
n(0).               {} ↦ {n(0)}
n(s(X)) ← n(X).     {n(0)} ↦ {n(0), n(s(0))}
                    {n(0), n(s(0))} ↦ {n(0), n(s(0)), n(s(s(0)))}
                    {n(X) | X ∈ T} ↦ {n(X) | X ∈ T}

Example (Even and odd numbers)
e(0).               {} ↦ {e(0), o(X) | X ∈ T}
e(s(X)) ← o(X).     {o(X) | X ∈ T} ↦ {e(0), e(s(X)), o(X) | X ∈ T}
o(X) ← ¬e(X).       {e(s^{2n}(0)) | n ≥ 0} ↦ {e(0), o(s^{2n+1}(0)) | n ≥ 0}
                    {o(s^{2n+1}(0)) | n ≥ 0} ↦ {e(s^{2n}(0)), o(X) | n ≥ 0, X ∈ T}
                    B_L ↦ {e(0), e(s(X)) | X ∈ T}
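For intuition, TP for the even/odd program can be simulated on ground atoms up to a fixed term depth; a Python sketch (the depth bound N and the encoding ('e', n) for e(s^n(0)), ('o', n) for o(s^n(0)) are our own):

N = 5  # consider ground terms up to s^N(0) only

def T_P(I):
    J = {('e', 0)}                                              # e(0).
    J |= {('e', n + 1) for p, n in I if p == 'o' and n < N}     # e(s(X)) <- o(X).
    J |= {('o', n) for n in range(N + 1) if ('e', n) not in I}  # o(X) <- not e(X).
    return J

# T_P(set()) contains e(0) and every o(t), as in the first line of the table.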
Problems
I B_L is usually infinite, and therefore the propositional approach does not work.
I How can we bridge the gap?
• How can first-order terms be represented?
• How can first-order rules be represented?
• How can the variable binding be solved?
Level Mappings
I A level mapping | · | assigns a (unique) natural number to each ground atom ...

Example (Even and odd numbers)
|e(s^n(0))| = 2n + 1    |o(s^n(0))| = 2n + 2

I ... hence, it enumerates the Herbrand base:

Example (Even and odd numbers)
[e(0), o(0), e(s(0)), o(s(0)), e(s(s(0))), . . .] with levels [1, 2, 3, 4, 5, . . .]
Embedding First-Order Terms into the Real Numbers
Using an injective level mapping, we can assign a unique real number to each interpretation:

ι(I) = ∑_{A∈I} 4^{−|A|}

This coincides with a "binary" representation in base 4:

B_L = [e(0), o(0), e(s(0)), o(s(0)), e(s(s(0))), . . .]
ι({e(0)}) = 0.10000_4 = 0.25_10
ι({e(0), e(s(0)), e(s(s(0)))}) = 0.10101_4 ≈ 0.27_10
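A Python sketch of this embedding, using the even/odd level mapping and the atom encoding from above (('e', n) for e(s^n(0)), ('o', n) for o(s^n(0))):

def level(atom):                 # |e(s^n(0))| = 2n+1, |o(s^n(0))| = 2n+2
    pred, n = atom
    return 2 * n + 1 if pred == 'e' else 2 * n + 2

def iota(I):
    return sum(4.0 ** -level(a) for a in I)

# iota({('e', 0)}) == 0.25
# iota({('e', 0), ('e', 1), ('e', 2)}) == 0.2666..., i.e. about 0.27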
The Graph of the Natural Numbers
[Plot: the graph of ι(TP(I)) over ι(I) for the natural-numbers program.]
n(0).
n(s(X)) ← n(X).
|n(s^n(0))| = n + 1
{} (ι = 0.0) ↦ {n(0)} (ι = 0.25)
{n(0)} (ι = 0.25) ↦ {n(0), n(s(0))} (ι = 0.3125)
The Graph of the Even and Odd Numbers
[Plot: the graph of ι(TP(I)) over ι(I) for the even/odd program e(0).  e(s(X)) ← o(X).  o(X) ← ¬e(X).]
Some Results
Theorem (Hölldobler, Kalinke & Störr, 1999)
The TP-operator associated with an acyclic (wrt. an injective level mapping) first-order logic program can be approximated arbitrarily well using standard sigmoidal networks.

Some conclusions and limitations:
+ The Core Method can be applied to first-order logic.
+ First treatment of first-order logic with function symbols in a connectionist setting.
− No algorithm to construct the network.
− Very limited class of logic programs.
Approximating the Embedded TP-Operator
[Plot: a piecewise constant approximation of the embedded TP-operator for the even/odd program with accuracy ε = 0.05.]
Constructions using sigmoidal and RBF units are given in (Bader, Hitzler & Witzel, 2005).
A Problem ...
I The accuracy of this approach is very limited.
I E.g., on a 32-bit computer, only 16 atoms can be represented.
I Therefore, we need to use real vectors instead of a single real number to represent interpretations.
Multi-dimensional Level Mappings
I A multi-dimensional level mapping ‖ · ‖ assigns to each ground atom a level l ∈ N⁺ and a dimension d ∈ {1, . . . , m}:

Example (Even and odd numbers)
‖e(s^n(0))‖ = (n + 1, 1)    ‖o(s^n(0))‖ = (n + 1, 2)

I ... and still "enumerates" the Herbrand base:

Example (Even and odd numbers)
level:  1      2        3            4
dim 1:  e(0)   e(s(0))  e(s(s(0)))   e(s(s(s(0))))
dim 2:  o(0)   o(s(0))  o(s(s(0)))   o(s(s(s(0))))
Embedding First-Order Terms into the Real Numbers
Using an injective m-dimensional level mapping, we can assign a unique m-dimensional vector to each interpretation:

~ι(I) = ∑_{A∈I} ~ι(A)

~ι(A) = (ι_1(A), . . . , ι_m(A)) with ι_i(A) = 4^{−l} if ‖A‖ = (l, d) and i = d, and ι_i(A) = 0 otherwise.
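The vector-valued embedding in the same style; a Python sketch for the 2-dimensional even/odd level mapping:

def vec_iota(I, m=2):
    # Embed an interpretation as an m-dimensional vector.
    v = [0.0] * m
    for pred, n in I:            # ||e(s^n(0))|| = (n+1, 1), ||o(s^n(0))|| = (n+1, 2)
        l, d = n + 1, (1 if pred == 'e' else 2)
        v[d - 1] += 4.0 ** -l
    return v

# vec_iota({('e', 0), ('o', 0)}) == [0.25, 0.25]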
C_m – The Set of all Embedded Interpretations
C_m for the 2-dimensional case: C_m = {~ι(I) | I ∈ I_L}
[Plot: C_m as a set of points in [0, 0.3]², e.g.:]
{} ↦ (0, 0)
{e(0)} ↦ (0.25, 0)
{o(0)} ↦ (0, 0.25)
{e(0), o(0)} ↦ (0.25, 0.25)
C_m – The Set of all Embedded Interpretations
Another construction:
[Figure: C_m obtained as the limit of an iterative construction on the unit square, in the style of a Cantor set.]
Approximating the Embedded TP-Operator
[Plots: the embedded TP-operator over C_m (dimensions d1, d2) and its approximation.]
Implementation
A first prototype implemented by Andreas Witzel (Witzel, 2006):
I Merging of the techniques described above and Supervised Growing Neural Gas (SGNG) (Fritzke, 1998).
I Radial basis function network approximating TP.
I Very robust with respect to noise and damage.
I Trainable using a version of backpropagation together with techniques from SGNG.
Approximating the Embedded TP-Operator
[Plot: the prototype's approximation of the embedded TP-operator over dimensions d1, d2.]
Statistics - FineBlend vs SGNG
[Chart: approximation error and number of units against the number of training examples, for FineBlend 1 and SGNG.]
Statistics - Unit Failure
[Chart: approximation error and number of units against the number of training examples for FineBlend 1 under unit failure.]
Statistics - Iteration of Random Inputs
[Plot: trajectories of iterated network applications on random inputs, over dimension 1 (even) and dimension 2 (odd).]
Conclusions
+ Prototypical implementation.
+ Very robust with respect to noise and damage.
+ Trainable using more or less standard algorithms.
+ System outperforms other architectures (at least for the tested examples).
− System requires many parameters.
− There is no first-order extraction technique yet.
First-order by propositional approximation
Let P be definite and I be its least Herbrand model (Seda & Lane, 2004):
I Choose some error ε.
I There exists a finite ground subprogram Pn (with least model In) such that d(I, In) < ε.
I Use the propositional approach to encode Pn.
I Increasing n yields better approximations of TP (if TP is continuous wrt. d).
I The approach works for other (many-valued) logics similarly.
Comparison of the approaches
I Seda & Lane:
• For definite programs under a continuity constraint.
• Treatment of acyclic programs should be ok.
• Better approximation increases all layers of the network.
• Step functions only.
• Sigmoidal approach (learning) to be investigated.
I Bader, Hitzler & Witzel:
• For acyclic normal programs.
• Treatment of definite (continuous) programs should be ok.
• Better approximation increases only the hidden layer.
• Variety of activation functions.
• Standard learning possible.
Iterated Function Systems
I The Sierpinski triangle:
[Figure: repeatedly applying three contractions of the unit square converges to the Sierpinski triangle.]
From Logic Programs to Iterated Function Systems
I For some logic programs we can explicitly construct an IFS such that the attractor coincides with the graph of the embedded TP-operator.
I Let P be a program such that fP is Lipschitz-continuous. Then there exists an IFS such that the attractor is the graph of fP.
I For a finite set of points taken from a TP-operator, we can construct an interpolating IFS.
I The sequence of attractors of interpolating IFSs for acyclic programs converges to the graph of the program.
I IFSs can be encoded using RBF networks.
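As a concrete IFS, here is the standard three-map system for the Sierpinski triangle (the textbook example, not the authors' construction for TP); its attractor can be approximated with the chaos game:

import random

# Three contractions of the unit square; their attractor is the Sierpinski triangle.
MAPS = [lambda x, y: (0.5 * x, 0.5 * y),
        lambda x, y: (0.5 * x + 0.5, 0.5 * y),
        lambda x, y: (0.5 * x + 0.25, 0.5 * y + 0.5)]

def chaos_game(n=10000):
    points, (x, y) = [], (0.0, 0.0)
    for _ in range(n):
        x, y = random.choice(MAPS)(x, y)   # apply a randomly chosen map
        points.append((x, y))
    return points                          # the points accumulate on the attractor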
Extraction of First-Order Logic Programs
I Very little work has been done on this.
I A general idea:
• Use any initialization method as a base.
• Neural networks are points in R^n, where n is the number of weights.
• Define conditions on programs which may be extracted (e.g. a maximum number of atoms or of term nesting depth).
• ↝ discrete points in R^n via the initialization method.
• The program which lies closest to the network in R^n is the extracted program.
? Could this work?
Conclusions
I 3-layer feedforward networks can approximate TP for certain programs.
I Using sigmoidal units, the network is trainable using backpropagation.
Open Problems
? How can first-order descriptions be extracted from a connectionist system?
? Can a first-order neural-symbolic system be applied to real-world problems, outperforming conventional approaches?
? How does the Core Method relate to reasoning approaches from Cognitive Science?
? ... (many more) ...
www.neural-symbolic.org