
Neural Networks
Lecture 16: Counterpropagation
November 9, 2010

    Unsupervised Learning

So far, we have only looked at supervised learning, in which an external teacher improves network performance by comparing desired and actual outputs and modifying the synaptic weights accordingly.

However, most of the learning that takes place in our brains is completely unsupervised.

This type of learning is aimed at achieving the most efficient representation of the input space, regardless of any output space.

Unsupervised learning can also be useful in artificial neural networks.


    Unsupervised Learning

Applications of unsupervised learning include:

Clustering
Vector quantization
Data compression
Feature extraction

Unsupervised learning methods can also be combined with supervised ones to enable learning through input-output pairs like in the BPN.

One such hybrid approach is the counterpropagation network.


Unsupervised/Supervised Learning: The Counterpropagation Network

The counterpropagation network (CPN) is a fast-learning combination of unsupervised and supervised learning.

Although this network uses linear neurons, it can learn nonlinear functions by means of a hidden layer of competitive units.

Moreover, the network is able to learn a function and its inverse at the same time.

However, to simplify things, we will only consider the feedforward mechanism of the CPN.


    Distance/Similarity Functions

In the hidden layer, the neuron whose weight vector is most similar to the current input vector is the winner.

There are different ways of defining such maximal similarity, for example:

(1) Maximal cosine similarity (same as net input):

$s(\mathbf{w}, \mathbf{x}) = \mathbf{w} \cdot \mathbf{x} = \sum_i w_i x_i$

(2) Minimal Euclidean distance:

$d(\mathbf{w}, \mathbf{x}) = \sum_i (w_i - x_i)^2$

(no square root is necessary for determining the winner)
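Both criteria can be sketched in a few lines of NumPy. This is only an illustration; the helper names (winner_by_cosine, winner_by_distance) and the example weight matrix are assumptions, not part of the lecture:

```python
import numpy as np

def winner_by_cosine(W, x):
    """Winner: hidden unit with the largest net input w . x.
    Assumes x (and ideally each row of W) is normalized to length 1."""
    return int(np.argmax(W @ x))

def winner_by_distance(W, x):
    """Winner: hidden unit with the smallest squared Euclidean distance to x.
    The square root is omitted because it does not change the argmin."""
    return int(np.argmin(np.sum((W - x) ** 2, axis=1)))

# Example: three hidden units with 2D weight vectors, as in the sample network below
W_hidden = np.array([[1.0, 0.0],
                     [0.6, 0.8],
                     [0.0, 1.0]])
x = np.array([0.7, 0.7])
x = x / np.linalg.norm(x)                # normalize for the cosine criterion
print(winner_by_cosine(W_hidden, x))     # -> 1
print(winner_by_distance(W_hidden, x))   # -> 1
```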


    The Counterpropagation Network

A simple CPN with two input neurons, three hidden neurons, and two output neurons can be described as follows:

[Figure: network diagram. Input layer with units X1 and X2; hidden layer with units H1, H2, and H3, connected to the inputs by weights w^H_11, w^H_12, w^H_21, w^H_22, w^H_31, w^H_32; output layer with units Y1 and Y2, connected to the hidden units by weights w^O_11, w^O_12, w^O_13, w^O_21, w^O_22, w^O_23.]
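For reference in the sketches below, the two weight matrices of this small network could be set up as follows (a minimal sketch; the shapes, the random initialization, and the variable names W_hidden and W_out are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_hidden, n_outputs = 2, 3, 2

# One row of W_hidden per hidden unit: (w^H_i1, w^H_i2)
W_hidden = rng.uniform(0.0, 1.0, size=(n_hidden, n_inputs))

# One column of W_out per hidden unit: the output the network produces if that unit wins
W_out = rng.uniform(0.0, 1.0, size=(n_outputs, n_hidden))
```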


    The Counterpropagation Network

The CPN learning process (general form for n input units and m output units):

1. Randomly select a vector pair (x, y) from the training set.

2. If you use the cosine similarity function, normalize (shrink/expand to length 1) the input vector x by dividing every component of x by the magnitude ||x||, where

$\|\mathbf{x}\| = \sqrt{\sum_{j=1}^{n} x_j^2}$
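A one-line sketch of this normalization step (x is a NumPy array here; this is an illustration, not code from the lecture):

```python
import numpy as np

x = np.array([3.0, 4.0])             # example input vector
x = x / np.sqrt(np.sum(x ** 2))      # divide by ||x||; same as x / np.linalg.norm(x)
# x now has magnitude 1, as required for the cosine similarity criterion
```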


    The Counterpropagation Network

3. Initialize the input neurons with the resulting vector and compute the activation of the hidden-layer units according to the chosen similarity measure.

4. In the hidden (competitive) layer, determine the unit W with the largest activation (the winner).

5. Adjust the connection weights between W and all N input-layer units according to the formula:

$w^H_{Wn}(t+1) = w^H_{Wn}(t) + \alpha \, \bigl(x_n - w^H_{Wn}(t)\bigr)$

6. Repeat steps 1 to 5 until all training patterns have been processed once.
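Steps 1 through 6 (and the repetition in step 7 below) can be sketched as follows, assuming Euclidean-distance competition; the function name train_hidden_layer, the in-order traversal of the training set, and the fixed number of epochs standing in for the "until consistent" condition are illustrative assumptions:

```python
import numpy as np

def train_hidden_layer(W_hidden, inputs, alpha=0.1, epochs=20):
    """Phase 1 (unsupervised): only the winning hidden unit's weight
    vector is moved a fraction alpha toward the current input."""
    for _ in range(epochs):                    # step 7, simplified to a fixed number of passes
        for x in inputs:                       # steps 1-3 (no normalization needed for distances)
            dists = np.sum((W_hidden - x) ** 2, axis=1)
            winner = np.argmin(dists)          # step 4: competition
            # step 5: w_Wn(t+1) = w_Wn(t) + alpha * (x_n - w_Wn(t))
            W_hidden[winner] += alpha * (x - W_hidden[winner])
    return W_hidden
```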


    The Counterpropagation Network

7. Repeat step 6 until each input pattern is consistently associated with the same competitive unit.

8. Select the first vector pair in the training set (the current pattern).

9. Repeat steps 2 to 4 (normalization, competition) for the current pattern.

10. Adjust the connection weights between the winning hidden-layer unit and all M output-layer units according to the equation:

$w^O_{mW}(t+1) = w^O_{mW}(t) + \beta \, \bigl(y_m - w^O_{mW}(t)\bigr)$


    The Counterpropagation Network

11. Repeat steps 9 and 10 for each vector pair in the training set.

    12. Repeat steps 8 through 11 for several epochs.
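The second phase (steps 8 through 12) can be sketched analogously; again, the names train_output_layer and beta are illustrative assumptions, not notation from the lecture:

```python
import numpy as np

def train_output_layer(W_hidden, W_out, inputs, targets, beta=0.1, epochs=20):
    """Phase 2 (supervised): only the output weights attached to the
    winning hidden unit are moved toward the desired output vector."""
    for _ in range(epochs):                                           # step 12
        for x, y in zip(inputs, targets):                             # steps 8 and 11
            winner = np.argmin(np.sum((W_hidden - x) ** 2, axis=1))   # step 9
            # step 10: w_mW(t+1) = w_mW(t) + beta * (y_m - w_mW(t))
            W_out[:, winner] += beta * (y - W_out[:, winner])
    return W_out
```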


    The Counterpropagation Network

Because in our example network the input is two-dimensional, each unit in the hidden layer has two weights (one for each input connection).

Therefore, the input to the network as well as the weights of the hidden-layer units can be represented and visualized by two-dimensional vectors.

For the current network, all weights in the hidden layer can be completely described by three 2D vectors.


Counterpropagation: Cosine Similarity

This diagram shows a sample state of the hidden layer and a sample input to the network:

[Figure: the three hidden-layer weight vectors (w^H_11, w^H_12), (w^H_21, w^H_22), and (w^H_31, w^H_32) and the input vector (x_1, x_2), drawn as 2D vectors.]


Counterpropagation: Cosine Similarity

In this example, hidden-layer neuron H2 wins and, according to the learning rule, is moved closer towards the current input vector.

[Figure: the winning weight vector (w^H_21, w^H_22) is rotated toward the input vector (x_1, x_2), yielding the new weight vector (w^H_new,21, w^H_new,22).]


Counterpropagation: Cosine Similarity

After doing this through many epochs and slowly reducing the adaptation step size, each hidden-layer unit will win for a subset of inputs, and the angle of its weight vector will be in the center of gravity of the angles of these inputs.

[Figure: all input vectors in the training set together with the three weight vectors (w^H_11, w^H_12), (w^H_21, w^H_22), and (w^H_31, w^H_32), each weight vector pointing toward the center of the subset of inputs for which it wins.]


Counterpropagation: Euclidean Distance

Example of competitive learning with three hidden neurons:

[Figure sequence: three clusters of training points, marked x, +, and o, together with the positions of the three hidden-layer weight vectors, labeled 1, 2, and 3. Over successive winner updates the weight vectors move step by step toward the clusters.]



... and so on, possibly with reduction of the learning rate.


[Figure: a later state of the example, with each of the three weight vectors now positioned within one of the clusters of x, +, and o points.]


The Counterpropagation Network

After the first phase of the training, each hidden-layer neuron is associated with a subset of input vectors.

The training process minimized the average angle difference or Euclidean distance between the weight vectors and their associated input vectors.

In the second phase of the training, we adjust the weights in the network's output layer in such a way that, for any winning hidden-layer unit, the network's output is as close as possible to the desired output for the winning unit's associated input vectors.

The idea is that when we later use the network to compute functions, the output of the winning hidden-layer unit is 1, and the output of all other hidden-layer units is 0.

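Under this winner-take-all assumption, the recall (forward) pass of a trained CPN reduces to a table lookup. A minimal sketch, with cpn_forward as an illustrative name and Euclidean-distance competition assumed:

```python
import numpy as np

def cpn_forward(W_hidden, W_out, x):
    """The winning hidden unit outputs 1 and all others output 0, so the
    network output is simply the winner's column of output weights."""
    winner = np.argmin(np.sum((W_hidden - x) ** 2, axis=1))
    return W_out[:, winner].copy()
```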


Counterpropagation: Cosine Similarity

Because there are two output neurons, the weights in the output layer that receive input from the same hidden-layer unit can also be described by 2D vectors. These weight vectors are the only possible output vectors of our network.

[Figure: the three output-layer weight vectors (w^O_11, w^O_21), (w^O_12, w^O_22), and (w^O_13, w^O_23), i.e., the network output if H1, H2, or H3 wins, respectively.]


Counterpropagation: Cosine Similarity

For each input vector, the output-layer weights that are connected to the winning hidden-layer unit are made more similar to the desired output vector:

[Figure: the output weight vector (w^O_11, w^O_21) of the winning unit is moved toward the desired output (y_1, y_2), yielding the new vector (w^O_new,11, w^O_new,21).]


Counterpropagation: Cosine Similarity

The training proceeds with decreasing step size, and after its termination, the weight vectors are in the center of gravity of their associated output vectors:

[Figure: the output weight vectors (w^O_11, w^O_21), (w^O_12, w^O_22), and (w^O_13, w^O_23), each located at the center of gravity of the desired outputs associated with H1, H2, and H3, respectively.]


Counterpropagation: Euclidean Distance

At the end of the output-layer learning process, the outputs of the network are at the center of gravity of the desired outputs associated with the respective winner neuron.

[Figure: three clusters of desired outputs, marked x, +, and o; the network outputs, labeled 1, 2, and 3, each lie at the center of gravity of one cluster.]


The Counterpropagation Network

Notice:

In the first training phase, if a hidden-layer unit does not win for a long period of time, its weights should be set to random values to give that unit a chance to win subsequently (see the sketch after this list).

It is useful to reduce the learning rates during training.

There is no need for normalizing the training output vectors.

After the training has finished, the network maps the training inputs onto output vectors that are close to the desired ones.

The more hidden units, the better the mapping; however, the generalization ability may decrease.

Thanks to the competitive neurons in the hidden layer, even linear neurons can realize nonlinear mappings.
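The first remark can be illustrated with a small hypothetical helper; the name reset_dead_units and the win-count bookkeeping are assumptions about one possible implementation, not part of the lecture:

```python
import numpy as np

def reset_dead_units(W_hidden, win_counts, rng, low=0.0, high=1.0):
    """Re-randomize the weights of hidden units that never won during
    the last epoch, giving them a chance to win subsequently."""
    for i, wins in enumerate(win_counts):
        if wins == 0:
            W_hidden[i] = rng.uniform(low, high, size=W_hidden.shape[1])
    return W_hidden
```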