Associative Networks

Neural Networks Lecture 20: Interpolative Associative Memory (November 30, 2010)

  • Associative networks are able to store a set of patterns and output the one that is associated with the current input.

  • The patterns are stored in the interconnections between neurons, similar to how memory in our brain is assumed to work.

  • Hetero-association: mapping input vectors to output vectors in a different vector space (e.g., an English-to-German translator).

  • Auto-association: input and output vectors are in the same vector space (e.g., a spelling corrector).

Interpolative Associative Memory

  • For hetero-association, we can use a simple two-layer network of the following form:

    [Figure: input neurons I1, I2, …, IN, each connected to every output neuron O1, O2, …, OM]

  • Sometimes it is possible to obtain a training set with orthonormal (that is, normalized and pairwise orthogonal) input vectors.

  • In that case, our two-layer network with linear neurons can solve its task perfectly and does not even require training.

  • We call such a network an interpolative associative memory.

  • You may ask: how does it work?

  • Well, if you look at the network's output function

        Om = Σn wmn·In    (for m = 1, …, M)

    you will find that this is just a matrix multiplication: the output vector is the input vector multiplied by the M×N weight matrix W = (wmn).
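
As a minimal sketch (the matrix and input below are made up for illustration), the entire forward pass of such a linear two-layer network is a single matrix-vector product, e.g. in NumPy:

```python
import numpy as np

# Hypothetical dimensions: N = 3 input neurons, M = 2 output neurons.
W = np.array([[0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])   # w[m][n] connects input n to output m

i = np.array([1.0, 0.0, 1.0])     # input vector (I1, I2, I3)

o = W @ i                         # linear neurons: O_m = sum_n w_mn * I_n
print(o)                          # -> [1. 0.]
```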

  • With an orthonormal set of exemplar input vectors (and arbitrary associated output vectors), we can simply calculate a weight matrix that realizes the desired function and does not need any training procedure.

  • For exemplars (x1, y1), (x2, y2), …, (xP, yP) we obtain the following weight matrix W:

        W = Σp yp·xpT

  • This works because W·xq = Σp yp·(xpT·xq) = yq: the inner product xpT·xq is 1 for p = q and 0 otherwise.

  • Note that an N-dimensional vector space cannot have a set of more than N orthonormal vectors!
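
A minimal sketch of this construction in NumPy (the exemplar vectors below are illustrative, not taken from the lecture):

```python
import numpy as np

# Hypothetical exemplars: orthonormal inputs in R^3, arbitrary outputs in R^2.
X = [np.array([1.0, 0.0, 0.0]),
     np.array([0.0, 1.0, 0.0]),
     np.array([0.0, 0.0, 1.0])]
Y = [np.array([ 1.0, -1.0]),
     np.array([ 0.0,  2.0]),
     np.array([-1.0,  1.0])]

# W = sum_p y_p x_p^T: the sum of the outer products, no training needed.
W = sum(np.outer(y, x) for x, y in zip(X, Y))

# Perfect recall on the exemplars, because the x_p are orthonormal.
for x, y in zip(X, Y):
    assert np.allclose(W @ x, y)
```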

  • Example: assume that we want to build an interpolative memory with three input neurons and three output neurons.

  • We have the following three exemplars (desired input-output pairs):

    [The three exemplar pairs (x1, y1), (x2, y2), (x3, y3) were given on the slide.]

  • Then the formula above yields the weight matrix:

        W = y1·x1T + y2·x2T + y3·x3T    [the numeric matrix was shown on the slide]

  • If you set the weights wmn to these values, the network will realize the desired function.

  • So if you want to implement a linear function f: RN → RM and can provide exemplars with orthonormal input vectors, then an interpolative associative memory is the best solution.

  • It does not require any training procedure, realizes perfect matching of the exemplars, and performs plausible interpolation for new input vectors.

  • Of course, this interpolation is linear.
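
To see the (linear) interpolation concretely, here is a small self-contained sketch with made-up exemplars: an input halfway between two stored inputs yields the output halfway between the two stored outputs.

```python
import numpy as np

# Two hypothetical orthonormal exemplar inputs in R^2 with chosen outputs.
x1, y1 = np.array([1.0, 0.0]), np.array([2.0,  0.0])
x2, y2 = np.array([0.0, 1.0]), np.array([0.0, -2.0])
W = np.outer(y1, x1) + np.outer(y2, x2)

# A new input between x1 and x2 produces the same blend of y1 and y2,
# since the network computes a linear map.
x_new = 0.5 * x1 + 0.5 * x2
assert np.allclose(W @ x_new, 0.5 * y1 + 0.5 * y2)
```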

The Hopfield Network

  • The Hopfield model is a single-layered recurrent network.

  • Like the associative memory, it is usually initialized with appropriate weights instead of being trained.

  • The network structure looks as follows:

    [Figure: neurons X1, X2, …, XN, each connected to every other neuron]

  • We will first look at the discrete Hopfield model, because its mathematical description is more straightforward.

  • In the discrete model, the output of each neuron is either 1 or -1.

  • In its simplest form, the output function is the sign function, which yields 1 for arguments ≥ 0 and -1 otherwise.
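
A one-line sketch of that output function (note the convention that sgn(0) = 1, as stated above):

```python
import numpy as np

def sgn(a):
    """Discrete Hopfield output function: 1 for a >= 0, -1 otherwise."""
    return np.where(np.asarray(a) >= 0, 1, -1)
```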

  • For input-output pairs (x1, y1), (x2, y2), …, (xP, yP), we can initialize the weights in the same way as we did with the associative memory:

        W = Σp yp·xpT

  • This is identical to the following component-wise formula:

        wij = Σp yp(i)·xp(j),

    where xp(j) is the j-th component of vector xp, and yp(i) is the i-th component of vector yp.

  • In the discrete version of the model, each component of an input or output vector can only assume the values 1 or -1.

  • The output of neuron i at time t is then computed according to the following formula:

        xi(t) = sgn( Σj wij·xj(t-1) )

  • This recursion can be performed over and over again.

  • In some network variants, external input is added to the internal, recurrent one.
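
A sketch of one step of this recursion, assuming all neurons are updated synchronously (the lecture does not fix an update schedule here):

```python
import numpy as np

def hopfield_step(W, x):
    # One synchronous update: x_i(t) = sgn(sum_j w_ij * x_j(t-1)),
    # using the convention sgn(0) = 1.
    return np.where(W @ x >= 0, 1, -1)
```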

  • Usually, the vectors xp are not orthonormal, so it is not guaranteed that inputting some pattern xp yields exactly yp as output; the output will only be a pattern similar to yp.

  • Since the Hopfield network is recurrent, its behavior depends on its previous state and in the general case is difficult to predict.

  • However, what happens if we initialize the weights with a set of patterns so that each pattern is associated with itself: (x1, x1), (x2, x2), …, (xP, xP)?

  • This initialization is performed according to the following equation:

        wij = Σp xp(i)·xp(j)

  • You see that the weight matrix is symmetrical, i.e., wij = wji.

  • We also demand that wii = 0, in which case the network shows an interesting behavior.

  • It can be mathematically proven that under these conditions the network will reach a stable activation state within a finite number of iterations.
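
A sketch of this auto-associative initialization, assuming the stored patterns are the rows of a ±1 matrix:

```python
import numpy as np

def hopfield_weights(patterns):
    """W = sum_p x_p x_p^T with zero diagonal; symmetric by construction."""
    P = np.asarray(patterns)      # shape (P, N), entries +1 / -1
    W = P.T @ P                   # sum of the outer products x_p x_p^T
    np.fill_diagonal(W, 0)        # enforce w_ii = 0
    return W
```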

  • And what does such a stable state look like?

  • The network associates input patterns with themselves, which means that in each iteration, the activation pattern will be drawn towards one of those patterns.

  • After converging, the network will most likely present one of the patterns that it was initialized with.

  • Therefore, Hopfield networks can be used to restore incomplete or noisy input patterns.
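
Putting the pieces together, here is an end-to-end sketch of pattern restoration; the patterns, dimensions, and noise level are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 100, 3
patterns = rng.choice([-1, 1], size=(P, N))   # hypothetical stored patterns

W = patterns.T @ patterns                     # w_ij = sum_p x_p(i) x_p(j)
np.fill_diagonal(W, 0)                        # w_ii = 0

# Corrupt the first pattern: flip each component with probability 0.2.
x = patterns[0] * np.where(rng.random(N) < 0.2, -1, 1)

# Iterate x_i(t) = sgn(sum_j w_ij x_j(t-1)) until the state stops changing.
for _ in range(100):
    x_new = np.where(W @ x >= 0, 1, -1)
    if np.array_equal(x_new, x):
        break
    x = x_new

print(np.array_equal(x, patterns[0]))         # typically True: pattern restored
```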

  • Example: image reconstruction

  • A 20×20 discrete Hopfield network was trained with 20 input patterns, including the face image shown in the left figure and 19 random patterns like the one shown on the right.

    [Figure: the stored face image (left) and one of the 19 random patterns (right)]

  • After providing only one fourth of the face image as initial input, the network is able to perfectly reconstruct that image within only two iterations.

  • Adding noise by changing each pixel with a probability of p = 0.3 does not impair the network's performance.

  • After two steps, the image is perfectly reconstructed.