NEURAL NETWORKS (ELEC 5240 and ELEC 6240)
Single Neuron Training
Bogdan M. Wilamowski
Area with 4 partitions: input-output mapping

Assuming a linear activation function for the output neuron, a relatively complex nonlinear mapping is obtained.

Question: How to design a system for an arbitrary nonlinear mapping?
How to design? What is given?

(a) Mathematical function
No need for design - just use a microcomputer for calculations.

(b) Experimental data
- Find an analytical function describing the process ???
- Use the data to train neural networks.
Hamming code example

Let us consider binary signals and weights such as

x = [+1 -1 -1 +1 -1 +1 +1 -1]

If the weights are w = x:

w = [+1 -1 -1 +1 -1 +1 +1 -1]

then

net = \sum_{i=1}^{n} w_i x_i = 8

This is the maximum value net can have; for any other combination net would be smaller.
Hamming code example

For the same pattern

x = [+1 -1 -1 +1 -1 +1 +1 -1]

and slightly different weights

w = [+1 +1 -1 +1 -1 -1 +1 -1]

net = \sum_{i=1}^{n} w_i x_i = 4

In general

net = \sum_{i=1}^{n} w_i x_i = n - 2\,HD

where HD is the Hamming distance between x and w.
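The relation above can be checked in a couple of MATLAB lines (a minimal sketch; the variable names are mine, not from the slides):

% net = x*w' equals n - 2*HD for bipolar vectors of length n
x  = [+1 -1 -1 +1 -1 +1 +1 -1];   % stored pattern (n = 8)
w1 = x;                           % matched weights, HD = 0
w2 = [+1 +1 -1 +1 -1 -1 +1 -1];   % two bits differ, HD = 2
net1 = x*w1'                      % prints 8 = n - 2*0
net2 = x*w2'                      % prints 4 = n - 2*2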
Unsupervised learning rules for a single neuron

\Delta w_i = c\, x_i

where c is the learning constant.

Hebb rule:

\Delta w_i = c\, o\, x_i

Pattern normalization is required.
Supervised learning rules for a single neuron

correlation rule (supervised): \Delta w_i = c\, d\, x_i

perceptron fixed rule: \Delta w_i = c\,(d - o)\, x_i

perceptron adjustable rule - as above, but the learning constant is modified to:

\alpha^{*} = \alpha\,\frac{\mathbf{x}^{T}\mathbf{w}}{\mathbf{x}^{T}\mathbf{x}} = \alpha\,\frac{net}{\|\mathbf{x}\|^{2}}

LMS (Widrow-Hoff) rule: \Delta w_i = c\,(d - net)\, x_i

delta rule: \Delta w_i = c\,(d - o)\, f'\, x_i

pseudoinverse rule (the same as LMS): \mathbf{w} = (\mathbf{x}^{T}\mathbf{x})^{-1}\mathbf{x}^{T}\mathbf{d}
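As a hedged illustration of how these rules differ in practice, the sketch below performs one weight update per rule for a single augmented pattern (all names and values here are my own choices, not from the slides):

% one update step of each supervised rule for pattern x, target d
x = [1 2 1]; d = -1; w = [1 3 -3]; c = 0.3; k = 1;
net = w*x';
o   = tanh(0.5*k*net);                 % soft bipolar output
fp  = 0.5*k*(1 - o^2);                 % its derivative f'
dw_correlation = c*d*x;                % correlation rule
dw_perceptron  = c*(d - sign(net))*x;  % perceptron fixed rule
dw_LMS         = c*(d - net)*x;        % LMS (Widrow-Hoff) rule
dw_delta       = c*(d - o)*fp*x;       % delta rule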
Training Neurons

Perceptron learning rule:

\Delta w_i = c\,\delta\, x_i, \qquad \delta = d - o = d - \mathrm{sign}(net)

Assuming bipolar neurons (output = ±1), the correction is

\Delta w_i = \pm 2\, c\, x_i
Simple example of training one neuron

[Figure: a neuron with inputs x, y, weights 1, 3, -3 and its decision line; pattern (1,2) => -1, pattern (2,1) => +1]

Initial setting gives wrong answers: both patterns are classified into the same (+1) category.
Simple example of training one neuron

Weights: [1 3 -3]

Pattern    Inputs (augmented)    Desired output
1          1  2  +1              -1
2          2  1  +1              +1

Actual output: net = \sum_{i=1}^{n} w_i x_i
for pattern 1: net = 1*1 + 3*2 - 3*1 = 4 => +1
for pattern 2: net = 1*2 + 3*1 - 3*1 = 2 => +1
Simple example of training one neuron

Applying the first pattern the first time.
Weights: w = [1 3 -3]; assuming a constant learning rate α = 0.3; pattern 1: x = [1 2 1], d = -1.

net = 1*1 + 2*3 + 1*(-3) = 4 => o = +1
Δw = α(d - o)x = 0.3*(-1 - 1)*[1 2 1] = [-0.6 -1.2 -0.6]
w = [0.4 1.8 -3.6]
Simple example of training one neuron

After applying the first pattern the first time: w = [0.4 1.8 -3.6]

[Figure: patterns (1,2) => -1 and (2,1) => +1 with the decision line 0.4x + 1.8y - 3.6 = 0, y-axis intercept 3.6/1.8 = 2, x-axis intercept 3.6/0.4 = 9]
Simple example of training one neuron

Applying the second pattern the first time.
Weights: w = [0.4 1.8 -3.6]; pattern 2: x = [2 1 1], d = +1.

net = 2*0.4 + 1*1.8 + 1*(-3.6) = -1 => o = -1
Δw = 0.3*(1 - (-1))*[2 1 1] = [1.2 0.6 0.6]
w = [1.6 2.4 -3.0]
Simple example of training one neuron

After applying the second pattern the first time: w = [1.6 2.4 -3]

[Figure: patterns (1,2) => -1 and (2,1) => +1 with the decision line 1.6x + 2.4y - 3 = 0, x-axis intercept 3/1.6 = 1.87, y-axis intercept 3/2.4 = 1.25]
Simple example of training one neuron

Applying the first pattern the second time.
Weights: w = [1.6 2.4 -3]; pattern 1: x = [1 2 1], d = -1.

net = 1*1.6 + 2*2.4 + 1*(-3) = 3.4 => o = +1
Δw = 0.3*(-1 - 1)*[1 2 1] = [-0.6 -1.2 -0.6]
w = [1 1.2 -3.6]
Simple example of training one neuron

After applying the first pattern the second time: w = [1 1.2 -3.6]

[Figure: decision line x + 1.2y - 3.6 = 0, x-axis intercept 3.6, y-axis intercept 3.6/1.2 = 3]
Simple example of training one neuron

Applying the second pattern the second time.
Weights: w = [1 1.2 -3.6]; pattern 2: x = [2 1 1], d = +1.

net = 2*1 + 1*1.2 + 1*(-3.6) = -0.4 => o = -1
Δw = 0.3*(1 - (-1))*[2 1 1] = [1.2 0.6 0.6]
w = [2.2 1.8 -3.0]
Simple example of training one neuron

After applying the second pattern the second time: w = [2.2 1.8 -3]

[Figure: decision line 2.2x + 1.8y - 3 = 0, x-axis intercept 3/2.2 = 1.36, y-axis intercept 3/1.8 = 1.67]
Simple example of training one neuron

Applying the first pattern the third time.
Weights: w = [2.2 1.8 -3]; pattern 1: x = [1 2 1], d = -1.

net = 1*2.2 + 2*1.8 + 1*(-3) = 2.8 => o = +1
Δw = 0.3*(-1 - 1)*[1 2 1] = [-0.6 -1.2 -0.6]
w = [1.6 0.6 -3.6]
Simple example of training one neuron

After applying the first pattern the third time: w = [1.6 0.6 -3.6]

[Figure: decision line 1.6x + 0.6y - 3.6 = 0, x-axis intercept 3.6/1.6 = 2.25, y-axis intercept 3.6/0.6 = 6]
Simple example of training one neuron

Applying the second pattern the third time.
Weights: w = [1.6 0.6 -3.6]; pattern 2: x = [2 1 1], d = +1.

net = 2*1.6 + 1*0.6 + 1*(-3.6) = 0.2 => o = +1 = d
Δw = 0.3*(1 - 1)*[2 1 1] = [0 0 0]
w = [1.6 0.6 -3.6] (unchanged)
Simple example of training one neuron

After applying the second pattern the third time: w = [1.6 0.6 -3.6]

[Figure: decision line unchanged, x-axis intercept 2.25, y-axis intercept 6; both patterns now lie on the correct sides of the line]
Simple example of training one neuron

Applying the first pattern the 4th time.
Weights: w = [1.6 0.6 -3.6]; pattern 1: x = [1 2 1], d = -1.

net = 1*1.6 + 2*0.6 + 1*(-3.6) = -0.8 => o = -1 = d
Δw = [0 0 0]
w = [1.6 0.6 -3.6] (unchanged)

Both patterns are now classified correctly, so training stops.
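The whole hand trace above can be replayed with a short script (a sketch under the same α = 0.3 and pattern order; the loop bound and printing are my own additions):

% replay the worked example: perceptron rule, bipolar hard threshold
ip = [1 2 1; 2 1 1]; dp = [-1 1]; w = [1 3 -3]; c = 0.3;
for ite = 1:4
  for p = 1:2
    net = w*ip(p,:)';
    w = w + c*(dp(p) - sign(net))*ip(p,:);   % no change when correct
  end
  disp(w)    % settles at [1.6 0.6 -3.6], as computed by hand
end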
Supervised learning rules for a single neuron

correlation rule (supervised): \Delta w_i = c\, d\, x_i

perceptron fixed rule: \Delta w_i = c\,(d - o)\, x_i

perceptron adjustable rule - as above, but the learning constant is modified to:

\alpha^{*} = \alpha\,\frac{\mathbf{x}^{T}\mathbf{w}}{\mathbf{x}^{T}\mathbf{x}} = \alpha\,\frac{net}{\|\mathbf{x}\|^{2}}

LMS (Widrow-Hoff) rule: \Delta w_i = c\,(d - net)\, x_i

delta rule: \Delta w_i = c\,(d - o)\, f'\, x_i

pseudoinverse rule (the same as LMS): \mathbf{w} = (\mathbf{x}^{T}\mathbf{x})^{-1}\mathbf{x}^{T}\mathbf{d}
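For the two-pattern example, the pseudoinverse rule gives the weights in one step. A minimal sketch (pinv is my choice: with only two patterns and three weights, x'x is singular, so the minimum-norm pseudoinverse replaces the (x'x)^-1 x'd form shown above):

% pseudoinverse rule: one-shot least-squares weights
x = [1 2 1; 2 1 1];   % augmented patterns, one per row
d = [-1; 1];          % desired outputs
w = (pinv(x)*d)'      % satisfies x*w' = d exactly here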
Training one neuron using the perceptron rule

Pattern    Inputs (augmented)    Desired output
1          1  2  +1              -1
2          2  1  +1              +1

Initial weights: [1 3 -3]; learning constant α = 0.3

\Delta w_i = \alpha\,(d - \mathrm{sign}(net))\, x_i = \pm 2\,\alpha\, x_i
Training one neuron using the perceptron rule

Final weights: w = [1.6 0.6 -3.6]

[Figure: patterns (1,2) => -1 and (2,1) => +1 with the final decision line 1.6x + 0.6y - 3.6 = 0, x-axis intercept 2.25, y-axis intercept 6]
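A two-line check (my own) that the final weights classify both patterns as desired:

% verify the final weights on both augmented patterns
sign([1 2 1; 2 1 1]*[1.6 0.6 -3.6]')   % prints [-1; 1], matching d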
Soft activation functions

Hard threshold, bipolar:

o = \mathrm{sign}(net) = \begin{cases} +1 & \text{if } net \ge 0 \\ -1 & \text{if } net < 0 \end{cases}

Hard threshold, unipolar:

o = f(net) = \mathrm{sgn}(net) = \begin{cases} 1 & \text{if } net \ge 0 \\ 0 & \text{if } net < 0 \end{cases}

Soft bipolar:

o = f(net) = \tanh(0.5\,net) = \frac{2}{1 + \exp(-net)} - 1

Soft unipolar:

o = f(net) = \frac{1}{1 + \exp(-net)}

Derivatives expressed through the output:

f' = o\,(1 - o) \ \text{(unipolar)}, \qquad f' = 0.5\,(1 - o^2) \ \text{(bipolar)}
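A small sketch visualizing both soft functions and their derivatives; the gain k and the plotting code are my own additions (note tanh(0.5*k*net) = 2/(1+exp(-k*net)) - 1, the form used in the programs that follow):

% soft activation functions and their derivatives, gain k
k = 1; net = -5:0.01:5;
obip = tanh(0.5*k*net);        % bipolar, in (-1, 1)
ouni = 1./(1 + exp(-k*net));   % unipolar, in (0, 1)
fbip = 0.5*k*(1 - obip.^2);    % f' expressed through the output o
funi = k*ouni.*(1 - ouni);
plot(net, obip, net, ouni, net, fbip, net, funi);
legend('bipolar o', 'unipolar o', 'bipolar f''', 'unipolar f''');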
Program in MATLAB (with graphics)

%single neuron perceptron training with soft activation function
format compact;
ip=[1 2; 2 1]; dp=[-1,1]'; ww=[1 3 -3]; c=0.3; k=1;
figure(1); clf; hold on
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1) % augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:5,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww'; op(p)=sign(k*0.5*net(p));
    er(p)=dp(p)-op(p); ww=ww+c*er(p)*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4; y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er')
  tter(ite)=ter;
  if ter <0.001, break; end;
end;
hold off; ite,
figure(2); clf; semilogy(tter)
MATLAB training results

c=0.3: iter=4, error=0     c=0.1: iter=9, error=0
c=1: iter=4, error=0       c=0.01: iter=66, error=0
Program in MATLAB (perceptron - hard)

%single neuron perceptron training with hard activation function
format compact;
ip=[1 2; 2 1]; dp=[-1,1]'; ww=[1 3 -3]; c=0.3; k=1;
figure(1); clf; hold on
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1) % augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:5,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww'; op(p)=sign(net(p));
    er(p)=dp(p)-op(p); ww=ww+c*er(p)*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4; y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er')
  tter(ite)=ter;
  if ter <0.001, break; end;
end;
hold off; ite,
figure(2); clf; semilogy(tter)
MATLAB training results (perceptron - hard)

c=0.3: iter=4, error=0     c=0.1: iter=9, error=0
c=1: iter=4, error=0       c=0.01: iter=66, error=0
Program in MATLAB (perceptron - soft)

%single neuron perceptron training with soft activation function
format compact; clear all;
ip=[1 2; 2 1]; dp=[-1,1]'; ww=[1 3 -3]; c=0.3; k=1;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1) %augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:20,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww';
    op(p)=tanh(k*0.5*net(p)); %hyperbolic function
    er(p)=dp(p)-op(p); ww=ww+c*er(p)*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4; y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er'), tter(ite)=ter;
  if ter <0.0001, break; end;
end;
hold off; ite
figure(2); clf; semilogy(tter); xlabel('iterations'); ylabel('error');

[Figure: X input vs. Y input, axes 0-4, with the training patterns and successive decision lines]
[Figure: error vs. iterations, semilog scale, 0-20 iterations]
Program in MATLAB (perceptron - soft)

%single neuron perceptron training with soft activation function
format compact; clear all;
ip=[1 2; 2 1]; dp=[-1,1]'; ww=[1 3 -3]; c=3; k=0.3;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1) %augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:20,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww';
    op(p)=tanh(k*0.5*net(p)); %hyperbolic function
    er(p)=dp(p)-op(p); ww=ww+c*er(p)*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4; y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er'), tter(ite)=ter;
  if ter <0.0001, break; end;
end;
hold off; ite
figure(2); clf; semilogy(tter); xlabel('iterations'); ylabel('error');

[Figure: X input vs. Y input, axes 0-4, with the training patterns and successive decision lines]
[Figure: error vs. iterations, semilog scale from about 10^1 down to 10^-2, 0-20 iterations]
LMS learning rule

\frac{d\,TE}{d w_i} = -2 \sum_{p=1}^{np} (d_p - o_p)\, f'\, x_{ip}

In the LMS rule (Widrow-Hoff, 1962) it was assumed that f' = 1
(they worked with hard-threshold neurons, so f' was not defined).

Therefore:

\frac{d\,TE}{d w_i} = -2 \sum_{p=1}^{np} (d_p - o_p)\, x_{ip}

i.e., the LMS rule minimizes

TE = \sum_{p=1}^{np} (d_p - net_p)^2
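The practical difference between LMS and the delta rule is a single line in the weight update; a hedged sketch with illustrative values (all names and constants here are mine):

% LMS drives the linear error (d - net); delta drives (d - o)*f'
x = [1 2 1]; d = -1; w = [1 3 -3]; c = 0.1; k = 1;
net = w*x'; o = tanh(0.5*k*net); fp = 0.5*k*(1 - o^2);
w_lms   = w + c*(d - net)*x    % f' taken as 1
w_delta = w + c*(d - o)*fp*x   % true gradient of (d - o)^2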
Program in MATLAB (LMS)

%single neuron LMS training with soft activation function
format compact; clear all;
ip=[1 2; 2 1], dp=[-1,1]', ww=[1 3 -3], c=0.1; k=1;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1) %augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:100,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww';
    op(p)=tanh(0.5*k*net(p)); %hyperbolic function
    er(p)=dp(p)-op(p); ww=ww+c*(dp(p)-net(p))*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4; y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er'), tter(ite)=ter;
  if ter <0.0001, break; end;
end;
hold off; ite
figure(2); clf; semilogy(tter); xlabel('iterations'); ylabel('error');

[Figure: X input vs. Y input, axes 0-4, with decision lines]
[Figure: error vs. iterations, semilog scale, 0-100 iterations]
Delta learning rule

Errors:

Err_1 = (d_1 - o_1)^2
Err_2 = (d_2 - o_2)^2
...
Err_{np} = (d_{np} - o_{np})^2

Total error:

TE = \sum_{p=1}^{np} (d_p - o_p)^2
DELTA learning rule (1)

TE = \sum_{p=1}^{np} (d_p - o_p)^2, \qquad o_p = f(w_1 x_1 + w_2 x_2 + \cdots + w_n x_n)

The gradient of TE along w_i:

\frac{d\,TE}{d w_i} = -2 \sum_{p=1}^{np} (d_p - o_p)\,\frac{d o_p}{d w_i}

\frac{d o_p}{d w_i} = \frac{d o_p}{d\,net_p}\,\frac{d\,net_p}{d w_i} = f'\, x_{ip}
Delta learning rule (2)

\frac{d\,TE}{d w_i} = -2 \sum_{p=1}^{np} (d_p - o_p)\,\frac{d o_p}{d w_i}, \qquad \frac{d o_p}{d w_i} = \frac{d o_p}{d\,net_p}\,\frac{d\,net_p}{d w_i} = f_p'\, x_{ip}

so

\frac{d\,TE}{d w_i} = -2 \sum_{p=1}^{np} (d_p - o_p)\, f_p'\, x_{ip}

and the weight correction is

\Delta w_i = 2\,c \sum_{p=1}^{np} (d_p - o_p)\, f_p'\, x_{ip}
Program in MATLAB (Delta)

%single neuron delta training with soft activation function
format compact; clear all;
ip=[1 2; 2 1], dp=[-1,1]', ww=[1 3 -3], c=2; k=0.5;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1) %augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:250,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww';
    op(p)=tanh(0.5*k*net(p)); %hyperbolic function
    fp(p)=k*(1-op(p)*op(p));
    er(p)=dp(p)-op(p); ww=ww+c*fp(p)*er(p)*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4; y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er'), tter(ite)=ter;
  if ter <0.001, break; end;
end;
hold off; ite
figure(2); clf; semilogy(tter); xlabel('iterations'); ylabel('error');

[Figure: X input vs. Y input, axes 0-4, with decision lines]
[Figure: error vs. iterations, semilog scale from about 10^1 down to 10^-2, 0-250 iterations]
Program in MATLAB (Delta) - Batch training

%single neuron delta training with soft activation function
% BATCH training
format compact; clear all;
ip=[1 2; 2 1], dp=[-1,1]', ww=[1 3 -3], c=2; k=0.5;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1) %augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:125,
  if ite>1, plot(x,y,'g'); end;
  net=ip*ww'; op=tanh(0.5.*k.*net);
  fp=k.*(1-op.*op); er=dp-op;
  dw=(c*er.*fp)'*ip; ww=ww+dw;
  x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
  x(2)=4; y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
  plot(x,y,'r');
  % pause;
  ter=er'*er, tter(ite)=ter;
  if ter <0.001, break; end;
end;
hold off; ite
figure(2); clf; semilogy(tter); xlabel('iterations'); ylabel('error');

[Figure: X input vs. Y input, axes 0-4, with decision lines]
[Figure: error vs. iterations, semilog scale from about 10^1 down to 10^-3, 0-140 iterations]
Delta learning for multiple patterns (c=3, k=1: derr=0.01, ite=576)

%single neuron delta training with soft activation function
% BATCH training with several patterns
format compact; clear all;
ip= [-1,-1; 2,2; 0,0; 1,1; -0.5,0; 2,1; 0,1; 3,1; 1,1.5; 2.5,1.5]
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1) %augmenting input
dp=[-1, 1,-1, 1, -1, 1, -1, 1, -1, 1]', ww=[-1 3 -3], c=1.8; k=1;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
a=axis; a=[-2 4 -2 4]; axis(a); j=0;
for ite=1:10000,
  if ite>1, plot(x,y,'g'); end;
  net=ip*ww'; op=tanh(0.5.*k.*net);
  fp=k.*(1-op.*op); er=dp-op;
  dw=(c*er.*fp)'*ip; ww=ww+dw;
  x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
  x(2)=4; y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
  plot(x,y,'r');
  % pause;
  ter=er'*er, tter(ite)=ter;
  if ter <0.01, break; end;
end;
for p=1:np,
  if dp(p)>0, plot(ip(p,1),ip(p,2),'ro');
  else plot(ip(p,1),ip(p,2),'bx'); end;
end;
hold off; ite
figure(2); clf; semilogy(tter); xlabel('iterations'); ylabel('error');

[Figure: ten training patterns in the plane, axes -2 to 4, with successive decision lines]
[Figure: error vs. iterations, semilog scale from about 10^1 down to 10^-2, stopping near 576 iterations]