Learning in neural networks
Chapter 19
DRAFT
Biological neuron
• Neuron
• Soma
• Dendrites
• Axon
• Synapse
• Action potential
Perceptron
[Figure: a perceptron unit with inputs x0, …, xn, weights wi, and activation g, computing y = g(Σi=1..n wi xi)]
[Figure: a plot of + and − examples separated by a straight line]
Perceptron
• Synonym for single-layer, feed-forward network
• First studied in the 1950s
• Other networks were known at the time, but the perceptron was the only one capable of learning, so research concentrated on it
Perceptron
• A single weight only affects one output, so we can restrict our investigation to a single-output model
• Notation can then be simpler, i.e. O = Step0(Σj Wj Ij) — see the sketch below
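A minimal sketch of this notation in Python (the function names are illustrative, not from the chapter):

```python
# Sketch of the perceptron output O = Step0(sum_j Wj * Ij).
def step0(u):
    """Threshold activation: 1 if u >= 0, else 0."""
    return 1 if u >= 0 else 0

def perceptron_output(weights, inputs):
    """O = Step0(Σj Wj Ij): weighted sum of the inputs, then threshold."""
    return step0(sum(w * i for w, i in zip(weights, inputs)))
```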
What can perceptrons represent?
           AND           XOR
Input 1:   0 0 1 1   |   0 0 1 1
Input 2:   0 1 0 1   |   0 1 0 1
Output:    0 0 0 1   |   0 1 1 0
What can perceptrons represent?
[Figure: the points (0,0), (0,1), (1,0), (1,1) plotted for AND and for XOR; a single straight line separates the 0-outputs from the 1-outputs for AND, but no straight line can do so for XOR]
• Functions which can be separated in this way are called linearly separable
• Only linearly separable functions can be represented by a perceptron
What can perceptrons represent?
Linear separability also extends beyond three dimensions, where the separating boundary is a hyperplane, but it is harder to visualise
Training a perceptron
           AND
Input 1:   0 0 1 1
Input 2:   0 1 0 1
Output:    0 0 0 1
Aim: find weights for which the perceptron outputs the AND function
Training a perceptron
[Figure: a perceptron with threshold t = 0.0, a fixed bias input −1 weighted W = 0.3, and inputs x and y weighted W = 0.5 and W = −0.4]
I1  I2  I3  Summation                                Output
-1   0   0  (-1*0.3) + (0*0.5) + (0*-0.4) = -0.3     0
-1   0   1  (-1*0.3) + (0*0.5) + (1*-0.4) = -0.7     0
-1   1   0  (-1*0.3) + (1*0.5) + (0*-0.4) =  0.2     1
-1   1   1  (-1*0.3) + (1*0.5) + (1*-0.4) = -0.2     0
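The same computation as a short Python sketch (variable names are assumptions). Note that with these initial weights the outputs (0, 0, 1, 0) do not yet match AND (0, 0, 0, 1); adjusting the weights to fix this is exactly what training must do:

```python
# Evaluate the perceptron of the figure above on all four input pairs.
weights = [0.3, 0.5, -0.4]       # [bias weight, weight of I2, weight of I3]

for i2 in (0, 1):
    for i3 in (0, 1):
        inputs = [-1, i2, i3]    # I1 is the fixed bias input -1
        s = sum(w * i for w, i in zip(weights, inputs))
        print(inputs, round(s, 2), 1 if s >= 0 else 0)
```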
Function-Learning Formulation
• Goal function f
• Training set: (xi, f(xi)), i = 1, …, n
• Inductive inference: find a function h that fits the points well
• Same Keep-It-Simple bias
Perceptron
[Figure: the perceptron unit y = g(Σi=1..n wi xi) again, shown with a plot of + and − examples that no straight line can separate, marked with a "?"]
Unit (Neuron)
[Figure: a unit with inputs x0, …, xn, weights wi, and activation g, computing y = g(Σi=1..n wi xi)]
g(u) = 1/[1 + exp(-u)]
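A sketch of a single sigmoid unit in Python (names are illustrative):

```python
import math

def g(u):
    """Sigmoid activation g(u) = 1/[1 + exp(-u)]."""
    return 1.0 / (1.0 + math.exp(-u))

def unit_output(w, x):
    """y = g(Σi wi xi); x[0] plays the role of the fixed input x0."""
    return g(sum(wi * xi for wi, xi in zip(w, x)))
```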
Neural Network
Network of interconnected neurons
[Figure: two units of the form y = g(Σi wi xi) wired together, the output of one feeding an input of the other]
Acyclic (feed-forward) vs. recurrent networks
Two-Layer Feed-Forward Neural Network
[Figure: input nodes feeding a hidden layer of units, which feeds an output layer]
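A forward pass through such a network might look like this sketch (representing each layer as a list of per-unit weight vectors is an assumption, not the slides' notation):

```python
import math

def g(u):
    return 1.0 / (1.0 + math.exp(-u))

def layer(weights, inputs):
    """Apply one layer of sigmoid units: one weight vector per unit."""
    return [g(sum(w * x for w, x in zip(unit, inputs)))
            for unit in weights]

def forward(w_hidden, w_output, inputs):
    """Inputs -> hidden layer -> output layer."""
    return layer(w_output, layer(w_hidden, inputs))
```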
Backpropagation (Principle)
• New example: yk = f(xk)
• φk = outcome of the NN with weights w for inputs xk
• Error function: E(w) = ||φk − yk||²
• Gradient-descent update: wij(k) = wij(k−1) − ε ∂E/∂wij
Backpropagation: Update the weights of the inputs to the last layer, then the weights of the inputs to the previous layer, etc.
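A minimal sketch of one backpropagation step for a network with one hidden sigmoid layer, using the update above with E(w) = ||φk − yk||² (all names and the list-of-weight-vectors layout are assumptions):

```python
import math

def g(u):
    return 1.0 / (1.0 + math.exp(-u))

def backprop_step(w_hid, w_out, x, y, eps=0.1):
    """One gradient-descent update wij <- wij - eps * dE/dwij."""
    # Forward pass, keeping the activations for the backward pass.
    hidden = [g(sum(w * xi for w, xi in zip(unit, x))) for unit in w_hid]
    output = [g(sum(w * h for w, h in zip(unit, hidden))) for unit in w_out]

    # Output-layer deltas: dE/din_j = 2(φj − yj) g'(in_j),
    # and for the sigmoid g'(in) = out * (1 - out).
    d_out = [2 * (o - t) * o * (1 - o) for o, t in zip(output, y)]

    # Hidden-layer deltas: propagate the output deltas backwards.
    d_hid = [h * (1 - h) * sum(w_out[j][i] * d_out[j]
                               for j in range(len(w_out)))
             for i, h in enumerate(hidden)]

    # Weight updates, last layer first (the deltas above were already
    # computed from the old weights, so the update order is safe).
    for j, unit in enumerate(w_out):
        for i in range(len(unit)):
            unit[i] -= eps * d_out[j] * hidden[i]
    for i, unit in enumerate(w_hid):
        for k in range(len(unit)):
            unit[k] -= eps * d_hid[i] * x[k]
```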
Comments and Issues
• How to choose the size and structure of networks?
  – If the network is too large, risk of over-fitting (data caching)
  – If the network is too small, the representation may not be rich enough
• Role of representation: e.g., learning the concept of an odd number
• Incremental learning