Multilayer Perceptrons Lecture 11

Source: labs.seas.wustl.edu/bme/...MultiLayerPerceptrons.pdf


Page 1: Multilayer Perceptrons, Lecture 11

Page 2: Chain Rule

• Let's say we have two functions f(x) and g(x).
• What is the derivative of f(g(x))?

$f(x) = x^5, \qquad g(x) = x^2 + 1, \qquad f(g(x)) = (x^2 + 1)^5$

$\frac{df(x)}{dx} = 5x^4, \qquad \frac{df(g)}{dg} = 5g^4, \qquad \frac{dg(x)}{dx} = 2x$

$\frac{df(g(x))}{dx} = \frac{df(g)}{dg} \cdot \frac{dg(x)}{dx} = 5(x^2 + 1)^4 \cdot 2x$

Page 3: Chain Rule

$y = e^{\sin x^2}$

Decompose: $v = h(x) = x^2$, $u = g(v) = \sin(v)$, $y = f(u) = e^u$.

$\frac{dy}{du} = e^{\sin x^2}, \qquad \frac{du}{dv} = \cos x^2, \qquad \frac{dv}{dx} = 2x$

$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dv} \cdot \frac{dv}{dx} = e^{\sin x^2} \cdot \cos x^2 \cdot 2x$
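A quick numerical check of this chain-rule result can be reassuring. The short Python sketch below is not part of the original slides; it compares the analytic derivative of $y = e^{\sin x^2}$ with a central finite difference at an arbitrarily chosen point x0 = 0.7.

import math

def y(x):
    # y = exp(sin(x^2))
    return math.exp(math.sin(x ** 2))

def dy_dx(x):
    # chain rule: dy/dx = exp(sin(x^2)) * cos(x^2) * 2x
    return math.exp(math.sin(x ** 2)) * math.cos(x ** 2) * 2 * x

x0 = 0.7
h = 1e-6
numeric = (y(x0 + h) - y(x0 - h)) / (2 * h)   # central finite difference
print(dy_dx(x0), numeric)                     # the two values should agree closely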

Page 4: Multilayer Perceptron

• Graph of a multilayer perceptron with two hidden layers.

[Figure: network graph with an Input Layer, a First Hidden Layer, a Second Hidden Layer, and an Output Layer.]

Page 5: Signal Flow Graph (Output Neuron j)

[Figure: signal-flow graph of output neuron j. The inputs $y_0(n) = +1$ (bias term) through $y_M(n)$ are weighted by $w_{j0}(n), \ldots, w_{jM}(n)$ and summed to give $v_j(n)$; the activation $\varphi_j(v_j(n))$ produces $y_j(n)$, which is compared with the desired output $d_j(n)$ (via a $-1$ branch) to give the error $e_j(n)$ at node j.]

$v_j(n) = \sum_{i=0}^{M} w_{ji}(n)\, y_i(n)$

Page 6: Signal Flow Graph (Output Neuron j)

[Figure: same signal-flow graph as on Page 5.]

$v_j(n) = \sum_{i=0}^{M} w_{ji}(n)\, y_i(n)$
$y_j(n) = \varphi_j(v_j(n))$

Page 7: Signal Flow Graph (Output Neuron j)

[Figure: same signal-flow graph as on Page 5.]

$v_j(n) = \sum_{i=0}^{M} w_{ji}(n)\, y_i(n)$
$y_j(n) = \varphi_j(v_j(n))$
$e_j(n) = d_j(n) - y_j(n)$

Page 8: Goal: Minimize Total Error at the O/P

[Figure: same signal-flow graph as on Page 5.]

$v_j(n) = \sum_{i=0}^{M} w_{ji}(n)\, y_i(n), \qquad y_j(n) = \varphi_j(v_j(n)), \qquad e_j(n) = d_j(n) - y_j(n)$

$E(n) = \frac{1}{2} \sum_{j \in O/P} e_j^2(n)$

Page 9: Update Weights

[Figure: same signal-flow graph as on Page 5.]

$v_j(n) = \sum_{i=0}^{M} w_{ji}(n)\, y_i(n), \qquad y_j(n) = \varphi_j(v_j(n)), \qquad e_j(n) = d_j(n) - y_j(n)$

$w_{ji}(n+1) = w_{ji}(n) - \eta \frac{\partial E(n)}{\partial w_{ji}(n)}$

Take a small step against the direction of the gradient to minimize the error.
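As a minimal sketch of this update rule (not from the slides), the loop below repeatedly applies w ← w − η·dE/dw to a one-dimensional quadratic error E(w) = (w − 3)²; the starting weight, learning rate, and number of steps are arbitrary choices.

# Minimize E(w) = (w - 3)^2 by stepping against the gradient dE/dw = 2(w - 3).
eta = 0.1          # learning rate
w = 0.0            # arbitrary starting weight
for _ in range(50):
    grad = 2.0 * (w - 3.0)
    w = w - eta * grad      # small step against the direction of the gradient
print(w)           # approaches 3.0, the minimizer of E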

Page 10: Update for Weights

$w_{ji}(n+1) = w_{ji}(n) - \eta \frac{\partial E(n)}{\partial w_{ji}(n)}$

• where

$E(n) = \frac{1}{2} \sum_{j \in O/P} e_j^2(n)$
$e_j(n) = d_j(n) - y_j(n)$
$y_j(n) = \varphi_j(v_j(n))$
$v_j(n) = \sum_{i=0}^{M} w_{ji}(n)\, y_i(n)$

Page 11: Update for Weights

• Applying the chain rule we get:

$w_{ji}(n+1) = w_{ji}(n) - \eta \frac{\partial E(n)}{\partial w_{ji}(n)}$

$\frac{\partial E(n)}{\partial w_{ji}(n)} = \frac{\partial E(n)}{\partial e_j(n)} \cdot \frac{\partial e_j(n)}{\partial y_j(n)} \cdot \frac{\partial y_j(n)}{\partial v_j(n)} \cdot \frac{\partial v_j(n)}{\partial w_{ji}(n)}$

(definitions of $E$, $e_j$, $y_j$, $v_j$ as on Page 10)

Page 12: Update for Weights

• Applying the chain rule we get:

$\frac{\partial E(n)}{\partial w_{ji}(n)} = \frac{\partial E(n)}{\partial e_j(n)} \cdot \frac{\partial e_j(n)}{\partial y_j(n)} \cdot \frac{\partial y_j(n)}{\partial v_j(n)} \cdot \frac{\partial v_j(n)}{\partial w_{ji}(n)}$

$\frac{\partial E(n)}{\partial e_j(n)} = \frac{\partial}{\partial e_j(n)} \left( \frac{1}{2} \sum_{j \in O/P} e_j^2(n) \right) = e_j(n)$

(definitions of $E$, $e_j$, $y_j$, $v_j$ as on Page 10)

Page 13: Update for Weights

• Applying the chain rule we get:

$\frac{\partial E(n)}{\partial w_{ji}(n)} = \frac{\partial E(n)}{\partial e_j(n)} \cdot \frac{\partial e_j(n)}{\partial y_j(n)} \cdot \frac{\partial y_j(n)}{\partial v_j(n)} \cdot \frac{\partial v_j(n)}{\partial w_{ji}(n)}$

$\frac{\partial E(n)}{\partial e_j(n)} = \frac{\partial}{\partial e_j(n)} \left( \frac{1}{2} \sum_{j \in O/P} e_j^2(n) \right) = e_j(n)$

$\frac{\partial e_j(n)}{\partial y_j(n)} = \frac{\partial}{\partial y_j(n)} \big( d_j(n) - y_j(n) \big) = -1$

(definitions of $E$, $e_j$, $y_j$, $v_j$ as on Page 10)

Page 14: Update for Weights

• Applying the chain rule we get:

$\frac{\partial E(n)}{\partial w_{ji}(n)} = \frac{\partial E(n)}{\partial e_j(n)} \cdot \frac{\partial e_j(n)}{\partial y_j(n)} \cdot \frac{\partial y_j(n)}{\partial v_j(n)} \cdot \frac{\partial v_j(n)}{\partial w_{ji}(n)}$

$\frac{\partial E(n)}{\partial e_j(n)} = e_j(n), \qquad \frac{\partial e_j(n)}{\partial y_j(n)} = -1$

$\frac{\partial y_j(n)}{\partial v_j(n)} = \frac{\partial}{\partial v_j(n)} \big( \varphi_j(v_j(n)) \big) = \varphi_j'(v_j(n))$

(definitions of $E$, $e_j$, $y_j$, $v_j$ as on Page 10)

Page 15: Update for Weights

• Applying the chain rule we get:

$\frac{\partial E(n)}{\partial w_{ji}(n)} = \frac{\partial E(n)}{\partial e_j(n)} \cdot \frac{\partial e_j(n)}{\partial y_j(n)} \cdot \frac{\partial y_j(n)}{\partial v_j(n)} \cdot \frac{\partial v_j(n)}{\partial w_{ji}(n)}$

$\frac{\partial E(n)}{\partial e_j(n)} = e_j(n), \qquad \frac{\partial e_j(n)}{\partial y_j(n)} = -1, \qquad \frac{\partial y_j(n)}{\partial v_j(n)} = \varphi_j'(v_j(n))$

$\frac{\partial v_j(n)}{\partial w_{ji}(n)} = \frac{\partial}{\partial w_{ji}(n)} \left( \sum_{i=0}^{M} w_{ji}(n)\, y_i(n) \right) = y_i(n)$

(definitions of $E$, $e_j$, $y_j$, $v_j$ as on Page 10)

Page 16: Update for Weights

• Applying the chain rule we get:

$\frac{\partial E(n)}{\partial w_{ji}(n)} = \frac{\partial E(n)}{\partial e_j(n)} \cdot \frac{\partial e_j(n)}{\partial y_j(n)} \cdot \frac{\partial y_j(n)}{\partial v_j(n)} \cdot \frac{\partial v_j(n)}{\partial w_{ji}(n)}$

$\frac{\partial E(n)}{\partial w_{ji}(n)} = e_j(n) \cdot (-1) \cdot \varphi_j'(v_j(n)) \cdot y_i(n)$
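To sanity-check this expression, the sketch below (not part of the slides) compares the analytic gradient $e_j(n) \cdot (-1) \cdot \varphi_j'(v_j(n)) \cdot y_i(n)$ with a finite-difference estimate of $\partial E / \partial w_{ji}$ for a single sigmoid output neuron; the weights, inputs, and target used here are arbitrary test values.

import math

def phi(v):        return 1.0 / (1.0 + math.exp(-v))   # sigmoid activation
def phi_prime(v):  return phi(v) * (1.0 - phi(v))      # its derivative

y_in = [1.0, 0.5, -0.3]   # inputs to the neuron (the first entry is the bias input y_0 = 1)
w = [0.2, -0.4, 0.7]      # weights w_j0, w_j1, w_j2
d = 1.0                   # desired output

def error(weights):
    v = sum(wi * yi for wi, yi in zip(weights, y_in))
    e = d - phi(v)
    return 0.5 * e * e    # E = 1/2 e^2 for this single output neuron

# analytic gradient: dE/dw_ji = e * (-1) * phi'(v) * y_i
v = sum(wi * yi for wi, yi in zip(w, y_in))
e = d - phi(v)
analytic = [e * (-1.0) * phi_prime(v) * yi for yi in y_in]

# finite-difference estimate of the same gradient
h = 1e-6
numeric = []
for i in range(len(w)):
    w_plus = list(w);  w_plus[i] += h
    w_minus = list(w); w_minus[i] -= h
    numeric.append((error(w_plus) - error(w_minus)) / (2 * h))

print(analytic)
print(numeric)   # should match the analytic values closely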

Page 17: Update for Weights

• Applying the chain rule we get:

$\frac{\partial E(n)}{\partial w_{ji}(n)} = \frac{\partial E(n)}{\partial e_j(n)} \cdot \frac{\partial e_j(n)}{\partial y_j(n)} \cdot \frac{\partial y_j(n)}{\partial v_j(n)} \cdot \frac{\partial v_j(n)}{\partial w_{ji}(n)} = e_j(n) \cdot (-1) \cdot \varphi_j'(v_j(n)) \cdot y_i(n)$

• Delta rule for updating weights:

$w_{ji}(n+1) = w_{ji}(n) - \eta \frac{\partial E(n)}{\partial w_{ji}(n)} \;\Rightarrow\; w_{ji}(n+1) = w_{ji}(n) + \underbrace{\eta}_{\text{learning rate}}\,\underbrace{e_j(n)\,\varphi_j'(v_j(n))}_{\text{local gradient}}\,\underbrace{y_i(n)}_{\text{input}}$

Page 18: Update for Weights

• Delta rule for updating weights.
• For output neurons this rule can be directly applied:

$w_{ji}(n+1) = w_{ji}(n) - \eta \frac{\partial E(n)}{\partial w_{ji}(n)} \;\Rightarrow\; w_{ji}(n+1) = w_{ji}(n) + \underbrace{\eta}_{\text{learning rate}}\,\underbrace{\delta_j(n)}_{\text{local gradient}}\,\underbrace{y_i(n)}_{\text{input}}$

$\delta_j(n) = -\frac{\partial E(n)}{\partial v_j(n)} = e_j(n)\,\varphi_j'(v_j(n))$

Page 19: How to update weights for hidden layers?

• Delta rule for updating weights:

$w_{ji}(n+1) = w_{ji}(n) - \eta \frac{\partial E(n)}{\partial w_{ji}(n)} \;\Rightarrow\; w_{ji}(n+1) = w_{ji}(n) + \underbrace{\eta}_{\text{learning rate}}\,\underbrace{\delta_j(n)}_{\text{local gradient}}\,\underbrace{y_i(n)}_{\text{input}}$

• Credit-assignment problem:
  - Even though the hidden neurons are not directly accessible, they share responsibility for the error.
  - How to penalize or reward hidden neurons?

Page 20: Signal-flow at hidden node h

[Figure: two-stage signal-flow graph. Inputs $y_i(n)$ (with bias $y_0 = +1$, up to $y_{M_2}(n)$) are weighted by $w_{hi}(n)$ (bias weight $w_{h0}(n)$) and summed to $v_h(n)$; the activation $\varphi_h(v_h(n))$ gives $y_h(n)$. The hidden outputs $y_h(n)$ (with bias $y_0 = +1$, up to $y_M(n)$) are in turn weighted by $w_{jh}(n)$ (bias weight $w_{j0}(n)$) and summed to $v_j(n)$; $\varphi_j(v_j(n))$ gives $y_j(n)$, which is compared with the desired output $d_j(n)$ to produce the error $e_j(n)$.]

Page 21: How to update weights for hidden layers?

• Local gradient of hidden neuron h:

$\delta_h(n) = -\frac{\partial E(n)}{\partial v_h(n)} = -\frac{\partial E(n)}{\partial y_h(n)} \cdot \frac{\partial y_h(n)}{\partial v_h(n)}$

Page 22: How to update weights for hidden layers?

• Local gradient of hidden neuron h:

$\delta_h(n) = -\frac{\partial E(n)}{\partial v_h(n)} = -\frac{\partial E(n)}{\partial y_h(n)} \cdot \frac{\partial y_h(n)}{\partial v_h(n)}$

$\delta_h(n) = -\frac{\partial E(n)}{\partial y_h(n)} \cdot \frac{\partial}{\partial v_h(n)} \big( \varphi_h(v_h(n)) \big)$

Page 23: How to update weights for hidden layers?

• Local gradient of hidden neuron h:

$\delta_h(n) = -\frac{\partial E(n)}{\partial v_h(n)} = -\frac{\partial E(n)}{\partial y_h(n)} \cdot \frac{\partial y_h(n)}{\partial v_h(n)}$

$\delta_h(n) = -\frac{\partial E(n)}{\partial y_h(n)} \cdot \frac{\partial}{\partial v_h(n)} \big( \varphi_h(v_h(n)) \big)$

$\delta_h(n) = -\frac{\partial E(n)}{\partial y_h(n)} \cdot \varphi_h'(v_h(n))$

Page 24: How to update weights for hidden layers?

• Local gradient of hidden neuron h:

$\delta_h(n) = -\frac{\partial E(n)}{\partial y_h(n)} \cdot \varphi_h'(v_h(n))$

• We know:

$E(n) = \frac{1}{2} \sum_{j \in O/P} e_j^2(n)$

$\frac{\partial E(n)}{\partial y_h(n)} = \sum_{j \in O/P} e_j(n)\, \frac{\partial e_j(n)}{\partial y_h(n)}$

Page 25: How to update weights for hidden layers?

• Local gradient of hidden neuron h:

$\delta_h(n) = -\frac{\partial E(n)}{\partial y_h(n)} \cdot \varphi_h'(v_h(n))$

• We know:

$\frac{\partial E(n)}{\partial y_h(n)} = \sum_{j \in O/P} e_j(n)\, \frac{\partial e_j(n)}{\partial y_h(n)}$

• Again applying the chain rule:

$\frac{\partial E(n)}{\partial y_h(n)} = \sum_{j \in O/P} e_j(n)\, \frac{\partial e_j(n)}{\partial v_j(n)} \cdot \frac{\partial v_j(n)}{\partial y_h(n)}$

Page 26: How to update weights for hidden layers?

• Local gradient of hidden neuron h:

$\delta_h(n) = -\frac{\partial E(n)}{\partial y_h(n)} \cdot \varphi_h'(v_h(n))$

$\frac{\partial E(n)}{\partial y_h(n)} = \sum_{j \in O/P} e_j(n)\, \frac{\partial e_j(n)}{\partial v_j(n)} \cdot \frac{\partial v_j(n)}{\partial y_h(n)}$

Page 27: How to update weights for hidden layers?

• Local gradient of hidden neuron h:

$\delta_h(n) = -\frac{\partial E(n)}{\partial y_h(n)} \cdot \varphi_h'(v_h(n))$

$\frac{\partial E(n)}{\partial y_h(n)} = \sum_{j \in O/P} e_j(n)\, \frac{\partial e_j(n)}{\partial v_j(n)} \cdot \frac{\partial v_j(n)}{\partial y_h(n)}$

$\frac{\partial e_j(n)}{\partial v_j(n)} = \frac{\partial}{\partial v_j(n)} \big( d_j(n) - y_j(n) \big)$

Page 28: How to update weights for hidden layers?

• Local gradient of hidden neuron h:

$\delta_h(n) = -\frac{\partial E(n)}{\partial y_h(n)} \cdot \varphi_h'(v_h(n))$

$\frac{\partial E(n)}{\partial y_h(n)} = \sum_{j \in O/P} e_j(n)\, \frac{\partial e_j(n)}{\partial v_j(n)} \cdot \frac{\partial v_j(n)}{\partial y_h(n)}$

$\frac{\partial e_j(n)}{\partial v_j(n)} = \frac{\partial}{\partial v_j(n)} \big( d_j(n) - y_j(n) \big) = \frac{\partial}{\partial v_j(n)} \big( d_j(n) - \varphi_j(v_j(n)) \big) = -\varphi_j'(v_j(n))$

Page 29: How to update weights for hidden layers?

• Local gradient of hidden neuron h:

$\delta_h(n) = -\frac{\partial E(n)}{\partial y_h(n)} \cdot \varphi_h'(v_h(n))$

$\frac{\partial E(n)}{\partial y_h(n)} = \sum_{j \in O/P} e_j(n)\, \frac{\partial e_j(n)}{\partial v_j(n)} \cdot \frac{\partial v_j(n)}{\partial y_h(n)}$

$\frac{\partial e_j(n)}{\partial v_j(n)} = \frac{\partial}{\partial v_j(n)} \big( d_j(n) - y_j(n) \big) = \frac{\partial}{\partial v_j(n)} \big( d_j(n) - \varphi_j(v_j(n)) \big) = -\varphi_j'(v_j(n))$

$\frac{\partial v_j(n)}{\partial y_h(n)} = \frac{\partial}{\partial y_h(n)} \left( \sum_{h=0}^{M} w_{jh}(n)\, y_h(n) \right) = w_{jh}(n)$

Page 30: How to update weights for hidden layers?

• Local gradient of hidden neuron h:

$\delta_h(n) = -\frac{\partial E(n)}{\partial y_h(n)} \cdot \varphi_h'(v_h(n))$

$\delta_h(n) = \varphi_h'(v_h(n)) \sum_{j \in O/P} \underbrace{e_j(n)\,\varphi_j'(v_j(n))}_{\text{local gradient } \delta_j(n)}\, w_{jh}(n)$

$\delta_h(n) = \varphi_h'(v_h(n)) \sum_{j \in O/P} \delta_j(n)\, w_{jh}(n)$

Page 31: How to update weights for hidden layers?

• Local gradient of hidden neuron h:

$\delta_h(n) = \varphi_h'(v_h(n)) \sum_{j \in O/P} \delta_j(n)\, w_{jh}(n)$

$w_{hi}(n+1) = w_{hi}(n) + \underbrace{\eta}_{\text{learning rate}}\,\underbrace{\delta_h(n)}_{\text{local gradient}}\,\underbrace{y_i(n)}_{\text{input}}$

Page 32: Back-propagation of errors

[Figure: signal-flow graph of the backward pass. Each output error $e_1(n), \ldots, e_M(n)$ is multiplied by $\varphi_1'(v_1(n)), \ldots, \varphi_M'(v_M(n))$ to give the output local gradients $\delta_1(n), \ldots, \delta_M(n)$, which are propagated back through the weights $w_{jh}(n)$ to form the hidden local gradient $\delta_h(n)$.]

$\delta_h(n) = \varphi_h'(v_h(n)) \sum_{j \in O/P} \delta_j(n)\, w_{jh}(n)$

Intuition: weight the error at each output node by the connection weight from the hidden node to that output node, and assign the resulting sum as the error caused by the hidden node.

Page 33: Back Propagation Algorithm

• Output node:

$w_{ji}(n+1) = w_{ji}(n) + \underbrace{\eta}_{\text{learning rate}}\,\underbrace{\delta_j(n)}_{\text{local gradient}}\,\underbrace{y_i(n)}_{\text{input}}, \qquad \delta_j(n) = e_j(n)\,\varphi_j'(v_j(n))$

• Hidden node:

$w_{hi}(n+1) = w_{hi}(n) + \underbrace{\eta}_{\text{learning rate}}\,\underbrace{\delta_h(n)}_{\text{local gradient}}\,\underbrace{y_i(n)}_{\text{input}}, \qquad \delta_h(n) = \varphi_h'(v_h(n)) \sum_{j \in O/P} \delta_j(n)\, w_{jh}(n)$
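A compact implementation of these two rules for a network with one hidden layer might look like the sketch below. It is not from the slides: the helper name backprop_step, the use of plain Python lists, and the identity activation (matching the example that follows) are all assumptions.

# One training step of back-propagation for a single-hidden-layer MLP.
# u[h][i] are the input-to-hidden weights (index i = 0 is the bias input x0 = 1),
# w[j][h] are the hidden-to-output weights (index h = 0 is the bias input y0 = 1).

def phi(v):        return v      # identity activation, as in the example slides
def phi_prime(v):  return 1.0

def backprop_step(x, d, u, w, eta):
    x = [1.0] + list(x)                                           # prepend bias x0 = 1
    # forward pass
    v_hidden = [sum(u_h[i] * x[i] for i in range(len(x))) for u_h in u]
    y = [1.0] + [phi(v) for v in v_hidden]                        # prepend bias y0 = 1
    v_out = [sum(w_j[h] * y[h] for h in range(len(y))) for w_j in w]
    o = [phi(v) for v in v_out]
    # backward pass: local gradients
    delta_out = [(d[j] - o[j]) * phi_prime(v_out[j]) for j in range(len(o))]
    delta_hid = [phi_prime(v_hidden[h]) *
                 sum(delta_out[j] * w[j][h + 1] for j in range(len(o)))
                 for h in range(len(v_hidden))]
    # delta-rule weight updates (in place)
    for j in range(len(w)):
        for h in range(len(y)):
            w[j][h] += eta * delta_out[j] * y[h]
    for h in range(len(u)):
        for i in range(len(x)):
            u[h][i] += eta * delta_hid[h] * x[i]
    return o

Called as backprop_step([0, 1], [1, 0], u, w, 0.1) with suitable weight lists, this performs one forward pass, one backward pass, and the two delta-rule updates of the example worked out on the following slides.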

Page 34: An Example

• Let's assume a simple MLP with one hidden layer.

[Figure: network with inputs $x_1$, $x_2$ and bias $x_0 = 1$ in the Input Layer, two hidden neurons with bias $y_0 = 1$ in the Hidden Layer, and outputs $o_1$, $o_2$ in the Output Layer.]

Page 35: An Example

• Begin with a random assignment of weights.

[Figure: same network, with some of the weights labeled: $u_{11} = -1$, $u_{22} = 1$, $w_{11} = 1$, $w_{20} = 1$; bias inputs $x_0 = 1$ and $y_0 = 1$.]

Page 36: An Example

• Let the input be x = [0, 1] and the desired output be d = [1, 0]; learning rate $\eta = 0.1$.

[Figure: same network, with $x_1 = 0$, $x_2 = 1$, bias inputs $x_0 = 1$ and $y_0 = 1$, and weights $u_{11} = -1$, $u_{22} = 1$, $w_{11} = 1$, $w_{20} = 1$.]

Page 37: An Example

• Forward pass: hidden layer.

$v_1 = u_{10} x_0 + u_{11} x_1 + u_{12} x_2 = ?, \qquad v_2 = u_{20} x_0 + u_{21} x_1 + u_{22} x_2 = ?$

[Figure: same network as on Page 36.]

Page 38: An Example

• Forward pass.

$v_1 = u_{10} x_0 + u_{11} x_1 + u_{12} x_2 = 1, \qquad v_2 = u_{20} x_0 + u_{21} x_1 + u_{22} x_2 = 2$

[Figure: same network as on Page 36.]

Page 39: An Example

• Forward pass: let's assume the identity activation function $\varphi_j(x) = x$.

$y_1 = \varphi(v_1), \qquad y_2 = \varphi(v_2)$

with $v_1 = 1$, $v_2 = 2$.

[Figure: same network as on Page 36.]

Page 40: An Example

• Forward pass: with the identity activation function $\varphi_j(x) = x$ (so $\varphi'(x) = 1$):

$y_1 = \varphi(v_1) = 1, \qquad y_2 = \varphi(v_2) = 2$

[Figure: same network as on Page 36.]

Page 41: An Example

• Forward pass: output layer.

$ov_1 = w_{10} y_0 + w_{11} y_1 + w_{12} y_2 = ?, \qquad ov_2 = w_{20} y_0 + w_{21} y_1 + w_{22} y_2 = ?$

[Figure: same network as on Page 36, now with hidden outputs $y_1 = 1$, $y_2 = 2$.]

Page 42: An Example

• Forward pass: output layer.

$ov_1 = w_{10} y_0 + w_{11} y_1 + w_{12} y_2 = 2, \qquad ov_2 = w_{20} y_0 + w_{21} y_1 + w_{22} y_2 = 2$

[Figure: same network as on Page 41.]

Page 43: An Example

• Forward pass: output layer.

$o_1 = \varphi(ov_1) = ?, \qquad o_2 = \varphi(ov_2) = ?$

with $ov_1 = 2$, $ov_2 = 2$.

[Figure: same network as on Page 41.]

Page 44: An Example

• Forward pass: output layer.

$o_1 = \varphi(ov_1) = 2, \qquad o_2 = \varphi(ov_2) = 2$

Desired output: $d_1 = 1$, $d_2 = 0$.

[Figure: same network as on Page 41.]

Page 45: An Example

• Forward pass: output layer.

$e_1 = d_1 - o_1, \qquad e_2 = d_2 - o_2$

with $o_1 = 2$, $o_2 = 2$ and desired output $d_1 = 1$, $d_2 = 0$.

[Figure: same network as on Page 41.]

Page 46: An Example

• Backward pass: output layer.

$e_1 = d_1 - o_1 = -1, \qquad e_2 = d_2 - o_2 = -2$

With the identity activation, $\varphi' = 1$, so the delta-rule updates reduce to:

$w_{ji}(n+1) = w_{ji}(n) + \eta\, e_j(n)\, y_i(n) \quad \text{[output]}$
$u_{hi}(n+1) = u_{hi}(n) + \eta\, x_i(n) \sum_{j \in O/P} e_j(n)\, w_{jh}(n) \quad \text{[hidden]}$

[Figure: same network as on Page 41; desired output $d_1 = 1$, $d_2 = 0$.]

Page 47: An Example

• Backward pass.

$e_1 = -1, \qquad e_2 = -2$

$\delta_j(n) = e_j(n)\,\varphi_j'(v_j(n)), \qquad \delta_h(n) = \varphi_h'(v_h(n)) \sum_{j \in O/P} \delta_j(n)\, w_{jh}(n)$

$w_{ji}(n+1) = w_{ji}(n) + \eta\, e_j(n)\, y_i(n) \quad \text{[output]}$
$u_{hi}(n+1) = u_{hi}(n) + \eta\, x_i(n) \sum_{j \in O/P} e_j(n)\, w_{jh}(n) \quad \text{[hidden]}$

[Figure: same network as on Page 41.]

Page 48: An Example

• Backward pass.

$\delta_1 = -1, \qquad \delta_2 = -2$

(same local-gradient and update equations as on Page 47)

[Figure: same network as on Page 41.]

Page 49: An Example

• Backward pass.

$\delta_1 = -1, \quad \delta_2 = -2, \quad \delta_{1h} = ?, \quad \delta_{2h} = ?$

(same local-gradient and update equations as on Page 47)

[Figure: same network as on Page 41.]

Page 50: An Example

• Backward pass.

$\delta_1 = -1, \quad \delta_2 = -2, \quad \delta_{1h} = ?, \quad \delta_{2h} = ?$

(same local-gradient and update equations as on Page 47)

[Figure: same network, with terms such as $\delta_1 w_{11}$ and $\delta_2 w_{20}$ shown flowing back along the hidden-to-output connections.]

Page 51: An Example

• Backward pass.

$\delta_1 = -1, \quad \delta_2 = -2, \quad \delta_{1h} = 1, \quad \delta_{2h} = -2$

(same local-gradient and update equations as on Page 47)

[Figure: as on Page 50.]

Page 52: An Example

• Backward pass.

$\delta_1 = -1, \quad \delta_2 = -2, \quad \delta_{1h} = 1, \quad \delta_{2h} = -2$

(same local-gradient and update equations as on Page 47)

[Figure: same network as on Page 41, weights not yet updated: $w_{11} = 1$, $w_{20} = 1$.]

Page 53: An Example

• Backward pass: the output-layer weights are updated.

$\delta_1 = -1, \quad \delta_2 = -2, \quad \delta_{1h} = 1, \quad \delta_{2h} = -2$

(same local-gradient and update equations as on Page 47)

[Figure: same network, now with $w_{11} = 0.9$ and $w_{20} = 0.8$.]

Page 54: An Example

• Backward pass (unchanged from Page 53).

[Figure: as on Page 53.]

Page 55: An Example

• Backward pass: the hidden-layer weights are updated.

$\delta_1 = -1, \quad \delta_2 = -2, \quad \delta_{1h} = 1, \quad \delta_{2h} = -2$

(same local-gradient and update equations as on Page 47)

[Figure: same network, now with $u_{22} = 0.8$ alongside $w_{11} = 0.9$ and $w_{20} = 0.8$.]

Page 56: An Example

• Again the forward pass, now with the updated weights:

$y_1 = 1.2, \qquad y_2 = 1.6, \qquad o_1 = 1.66, \qquad o_2 = 0.32$

Desired output: $d_1 = 1$, $d_2 = 0$.

[Figure: same network with $u_{11} = -1$, $u_{22} = 0.8$, $w_{11} = 0.9$, $w_{20} = 0.8$, inputs $x_1 = 0$, $x_2 = 1$.]

Notice that the error has been reduced.
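The slides label only a few of the weights (u11, u22, w11, w20), so the arithmetic above cannot be reproduced from them alone. The Python sketch below is one possible reconstruction, not the slides' own code: the remaining weights (u10, u12, u20, u21, w10, w12, w21, w22) are assumptions chosen to be consistent with every intermediate value the slides do show (v1 = 1, v2 = 2, ov1 = ov2 = 2, δ1h = 1, δ2h = -2, and the final outputs 1.66 and 0.32).

eta = 0.1
x0, x1, x2 = 1.0, 0.0, 1.0            # input with bias x0 = 1
d1, d2 = 1.0, 0.0                     # desired outputs

# Weights given on the slides: u11 = -1, u22 = 1, w11 = 1, w20 = 1.
# All other values are assumptions consistent with the slides' numbers.
u10, u11, u12 = 1.0, -1.0, 0.0        # hidden neuron 1 (u10, u12 assumed)
u20, u21, u22 = 1.0, 0.0, 1.0         # hidden neuron 2 (u20, u21 assumed)
w10, w11, w12 = 1.0, 1.0, 0.0         # output neuron 1 (w10, w12 assumed)
w20, w21, w22 = 1.0, -1.0, 1.0        # output neuron 2 (w21, w22 assumed)

# Forward pass (identity activation, so phi(v) = v and phi'(v) = 1).
y0 = 1.0
y1 = u10*x0 + u11*x1 + u12*x2         # = 1
y2 = u20*x0 + u21*x1 + u22*x2         # = 2
o1 = w10*y0 + w11*y1 + w12*y2         # = 2
o2 = w20*y0 + w21*y1 + w22*y2         # = 2

# Backward pass.
e1, e2 = d1 - o1, d2 - o2             # errors: -1 and -2
delta1, delta2 = e1, e2               # output local gradients (phi' = 1)
delta1h = delta1*w11 + delta2*w21     # = 1
delta2h = delta1*w12 + delta2*w22     # = -2

# Delta-rule weight updates.
w10 += eta*delta1*y0; w11 += eta*delta1*y1; w12 += eta*delta1*y2     # w11 -> 0.9
w20 += eta*delta2*y0; w21 += eta*delta2*y1; w22 += eta*delta2*y2     # w20 -> 0.8
u10 += eta*delta1h*x0; u11 += eta*delta1h*x1; u12 += eta*delta1h*x2
u20 += eta*delta2h*x0; u21 += eta*delta2h*x1; u22 += eta*delta2h*x2  # u22 -> 0.8

# Second forward pass with the updated weights.
y1 = u10*x0 + u11*x1 + u12*x2         # = 1.2
y2 = u20*x0 + u21*x1 + u22*x2         # = 1.6
o1 = w10*y0 + w11*y1 + w12*y2         # = 1.66
o2 = w20*y0 + w21*y1 + w22*y2         # = 0.32
print(o1, o2)                         # closer to the targets (1, 0) than before

Running this reproduces (up to floating-point rounding) the updated weights on Pages 53-55 and the outputs 1.66 and 0.32 reported above.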

Page 57: What does each layer do?

• The 1st layer draws linear boundaries.
• The 2nd layer combines the boundaries.
• The 3rd layer can generate arbitrarily complex boundaries.