
Pattern Recognition Letters 25 (2004) 341–352

www.elsevier.com/locate/patrec

Matching of medical images by self-organizing neural networks

Giuseppe Coppini a,*, Stefano Diciotti b, Guido Valli b

a Consiglio Nazionale delle Ricerche (CNR), Institute of Clinical Physiology, Via Moruzzi 1, 56124 Pisa, Italy
b Department of Electronics and Telecommunications, University of Florence, Italy

Received 11 March 2003; received in revised form 21 October 2003

Abstract

A general approach to the problem of image matching which exploits a multi-scale representation of local image structure and the principles of self-organizing neural networks is introduced. The problem considered is relevant in many imaging applications and has been largely investigated in medical imagery, especially as regards the integration of different imaging procedures.

A given pair of images to be matched, named target and stimulus respectively, are represented by Gabor Wavelets. Correspondence is computed by exploiting the learning procedure of a neural network derived from Kohonen's SOM. The SOM units coincide with the pixels of the target image, and their weights are pointers to the pixels of the stimulus image. The standard SOM rule is modified so as to account for image features. The properties of our method are tested by experiments performed on synthetic images. The considered implementation has shown that it is able to recover a wide range of transformations, including global affine transformations and local distortions. Tests in the presence of additive noise indicate considerable robustness against statistical variability. Applications to clinical images are presented.

© 2003 Elsevier B.V. All rights reserved.

Keywords: Image matching; Medical imaging; SOM neural network; Gabor wavelets

1. Introduction

Finding correspondences between images is a central problem in many vision activities, from preattentive processes to attentive tasks. A wide category of image matching problems arises from the need to integrate images generated by different modalities. The importance of integration tasks has rapidly grown in recent years and has become of paramount importance in modern medical imaging. In the latter case, it is necessary to build comprehensive functional and anatomical representations of the observed biological structures by utilizing the partial views offered by basic imaging procedures such as MRI, PET, CT (van den Elsen et al., 1993). Several correspondence problems have been faced, which include mono-modal or

Partially funded by the Italian Ministry for Education, University and Research (MIUR).
* Corresponding author. Tel.: +39-503-153-480; fax: +39-503-152-166.
E-mail addresses: [email protected] (G. Coppini), [email protected] (S. Diciotti), [email protected] (G. Valli).
0167-8655/$ - see front matter © 2003 Elsevier B.V. All rights reserved.
doi:10.1016/j.patrec.2003.10.012

Fig. 1. A general matching procedure is expected to preserve pictorial features and their neighborhood relationship.

¹ One-to-one mapping can be desirable in some cases.


multi-modal image correspondence and intra- or inter-subject matching. In addition, matching of 3D images has become an important issue, along with the mapping of imaged structures to reference representations such as anatomical atlases. Much effort has been devoted to finding global deformations, which has led to efficient algorithms able to recover linear transformations. A popular method is based on the maximization of the mutual information of the two images (Wells III et al., 1996). More recently, the use of deformable models has been widely investigated so as to recover local non-linear deformations. A number of references on the subject can be found in (Maintz and Viergever, 1998). A thermodynamical analogy with Maxwell's demons has also been investigated (Thirion, 1998).

In general, specific solutions have been proposed for many matching problems. The related algorithms usually exploit some form of prior knowledge, and are often based on strong assumptions. On the other hand, many "correspondence finding" problems share important common facets, such as the statement of image-similarity criteria or the definition of adequate computational schemes. In this view, the importance of general solutions for image matching problems has been considered by several researchers (Haralick and Shapiro, 1993). In particular, we believe that a self-organizing process is a natural and very promising setting. As reported by Bellando and Kotari (1996), the computation of topology-preserving maps using Kohonen neural networks (Kohonen, 2001) can provide valid solutions to establishing image correspondence. Wurtz et al. (1999) compare the behavior of Kohonen's SOM with the Dynamic Link Architecture (Konen et al., 1994), paying particular attention to the computational efficiency of self-organizing processes. It is worth mentioning that the application of a SOM network to image registration is described by Huwer et al. (1996).

In this work we will take the following general problem into account. Let $I_t(\mathbf{r})$ and $I_s(\mathbf{r}')$ be two images (target and stimulus image, respectively), with $\mathbf{r} = (x, y)$ and $\mathbf{r}' = (x', y')$ co-ordinate vectors defined in proper subsets $T$ and $S$ (image planes), respectively, of $\mathbb{R}^2$. We assume that a feature vector $\mathbf{f}_t(\mathbf{r}) = \{f_t^i(\mathbf{r})\}$ describing relevant properties of $I_t(\mathbf{r})$ can be computed for each point $\mathbf{r}$ in $T$. Similarly, $\mathbf{f}_s(\mathbf{r}') = \{f_s^i(\mathbf{r}')\}$ will indicate the feature vector of $I_s(\mathbf{r}')$ for each $\mathbf{r}' \in S$. We search for a correspondence rule $M : T \mapsto S$ that maps points in $T$ to points in $S$ which have similar morphological properties, as described by the feature vectors (see Fig. 1). Stated so, this is an ill-posed problem, and further constraints must be considered to compute a useful solution. We believe that powerful constraints can be derived from the regularity of the observed world, even if the use of specific knowledge available for each matching problem considered can lead to adequate solutions. For example, images of normal anatomical structures exhibit a large inter-subject variability; nevertheless, the observed structures keep some general properties such as connectedness and proximity among sub-structures. In general, it seems reasonable to search for a correspondence law which is unique from $T$ to $S$,¹ and which maps features in the $T$ plane to similar features in the $S$ plane by matching spatially contiguous features in $T$ into spatially contiguous features in $S$. These conditions can be restated more concisely by saying that the desired mapping is a feature- and topology-preserving transformation from $T$ into $S$. The importance of topology preservation in image matching has been


discussed by Musse et al. (2001), who propose the use of hierarchical parametric models. A formal definition of the meaning of "topology-preserving transformation" is useful for the following. To this end we assume that proper metrics $d_I$ and $d_F$ are given in the image planes $I = \{T, S\}$ and the feature space $F$, respectively. Let us assume that for any given $\mathbf{r}_1, \mathbf{r}_2 \in T$, $\mathbf{r}'_1, \mathbf{r}'_2 \in S$ are the corresponding transformed points, i.e. $\mathbf{r}'_1 = M(\mathbf{r}_1)$, $\mathbf{r}'_2 = M(\mathbf{r}_2)$. We will say that the transformation $M$ preserves image topology if the following conditions are met:

(1) Preservation of image plane topology:

$$\lim_{d_I(\mathbf{r}_1, \mathbf{r}_2) \to 0} d_I(\mathbf{r}'_1, \mathbf{r}'_2) = 0$$

(2) Feature preservation:

$$\lim_{d_I(\mathbf{r}_1, \mathbf{r}'_1) \to 0} d_F(\mathbf{f}_t(\mathbf{r}_1), \mathbf{f}_s(\mathbf{r}'_1)) = 0$$

Though further constraints can be taken into account (such as the high-order smoothness of the mapping), our aim is the investigation of the preservation of the neighborhood relationship. To this end we consider the computation of image matching by using a topology-preserving neural network derived from Kohonen's SOM. The resulting computation is typically data-driven, and no specific prior model is expected to constrain the obtained solution. In this sense our approach is typically non-parametric. In order to investigate the related capabilities, we outline the basic scheme of the adopted computational paradigm in the next section. Subsequently we present and analyze the performances observed when a set of known mathematical transformations was applied to phantom images in the presence of additive noise. Applications to computing correspondence between biomedical images are also described.

² Without loss of generality we will assume that each unit is identified by a predefined index.

2. Computational paradigm

Following the previous considerations, a general framework for image matching problems requires: (a) a proper computational architecture to implement a topology-preserving map, and (b) a flexible image description to compute a useful set of pictorial features $\mathbf{f} = \{f^i\}$. This led us to investigate a self-organizing neural network coupled with image features computed by the Gabor Wavelet Transform.

2.1. Matching through self-organization

The computation of topology-preserving maps between vector spaces can be carried out by using Kohonen neural networks (Kohonen, 2001). Basically, they include a grid of units, typically 2D, each of which receives the same input vector (stimulus) and operates on a winner-takes-all basis. For a given input, the network weights are changed according to the rule: (i) find a winner unit, and (ii) change the weights of the units in a neighborhood of the winner unit; the size of the neighborhood (defining the extent of lateral plasticity) is a decreasing function of time. The relaxation process obtained by iteratively applying the above rule for a sequence of input stimuli is the basis of the training of ordinary Kohonen networks. In this work, the same process, properly adapted, is exploited to compute the desired correspondence $M$. In our computational scheme, the pixels of one of the two images, $I_t(\mathbf{r})$ (called the target image), correspond to the units of the Kohonen network. In general we assume the image to be sub-sampled by using an $N \times N$ grid of pixels, which provides an $N \times N$ SOM. The weights of each SOM unit can be interpreted as pointers to the second image $I_s(\mathbf{r}')$ (called the stimulus image), whose coordinates are the input of the network (Fig. 2). In this way, the weight vectors directly give the displacement needed to match point $\mathbf{r}$ with the corresponding point $\mathbf{r}'$. For this reason, in the following the network weight of the $i$-th unit² is indicated by $M(\mathbf{r}_i)$, while $\mathbf{r}'$ is the input (or stimulus) vector. The standard Kohonen's algorithm that governs the updating of network weights in response to a stimulus $\mathbf{r}'$ presented at time $t$ is as follows:

Fig. 2. Image matching by a 2D self-organizing network: SOM units are superimposed on the pixels of the target image and the network is made to relax under the action of the pixels of the stimulus image.


(i) locate the winner unit $c$ by:

$$c = \arg\min_k \|\mathbf{r}' - M(\mathbf{r}_k)\|$$

(ii) change the network weights according to the equation:

$$M^{t+1}_i(\mathbf{r}_i) = M^t_i(\mathbf{r}_i) + \alpha(t)\, h_{ic}(t)\, [\mathbf{r}' - M^t_i(\mathbf{r}_i)]$$

The term $\alpha(t)$ is the learning rate, which is often given by:

$$\alpha(t) = \alpha_0 \, \frac{t_{\max} - t}{t_{\max}}$$

with $\alpha_0 \le 1$ and $t_{\max}$ the (prefixed) number of iterations of the relaxation process. The function $h_{ic}$ accounts for lateral plasticity: its value is one for $i = c$ and falls off with the distance $\|\mathbf{r}_i - \mathbf{r}_c\|$ from the winner unit. A widely used neighborhood function is the Gaussian:

$$h_{ic}(t) = \exp\left[-\frac{\|\mathbf{r}_i - \mathbf{r}_c\|^2}{2\sigma(t)^2}\right]$$

The parameter $\sigma(t)$ determines the width of the neighborhood and controls the topology-preservation process. As in the case of the learning rate, $\sigma(t)$ is decreased as the relaxation progresses.

Unfortunately, the previous self-adaptation equation is too simple to produce the desired results. Only condition 1 of our definition (i.e. preservation of image plane topology) is satisfied by the obtained mapping. On the other hand, image features are not taken into account, and a unitary mapping is produced independently of the image content. In order to achieve a meaningful behavior, i.e. to ensure that feature preservation is satisfied (condition 2), we modified the standard SOM rule so as to include image features. To be more precise, we defined the feature similarity $\mathcal{S}$ by:

$$\mathcal{S}(\mathbf{f}_s(\mathbf{r}'), \mathbf{f}_t(\mathbf{r})) = \frac{1}{1 + \|\mathbf{f}_s(\mathbf{r}') - \mathbf{f}_t(\mathbf{r})\|}$$

where $\|\cdot\|$ is the ordinary Euclidean metric. The function $\mathcal{S}$ assumes a unit value when $\mathbf{f}_s(\mathbf{r}') = \mathbf{f}_t(\mathbf{r})$ and drops to zero as $\|\mathbf{f}_s(\mathbf{r}') - \mathbf{f}_t(\mathbf{r})\|$ increases. Having defined $\mathcal{S}$, we replaced the SOM weight update rule with the equation:

$$M^{t+1}_i(\mathbf{r}_i) = M^t_i(\mathbf{r}_i) + \alpha(t)\, h_{ic}(t)\, [\mathbf{r}' - M^t_i(\mathbf{r}_i)]\, \mathcal{S}(\mathbf{f}_s(\mathbf{r}'), \mathbf{f}_t(\mathbf{r})) \qquad (1)$$

With this choice, maximal weight changes are allowed when the similarity between corresponding features is high. On the other hand, weight changes are weakened when different image features are coupled. Otherwise stated, image features influence the computed map by simply modulating the size of the weight update. One can get an idea of the effect of such a process by considering the extreme case of zero-valued similarity: this prevents any weight change, so that the competition mechanism has no actual effect. In brief, competition among network units involves spatial coordinates only, while changes to the mapping are controlled by image-feature similarity.
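For concreteness, the similarity-modulated relaxation step of Eq. (1) can be sketched in NumPy (a simplified illustration, not the authors' C++ implementation; the function name, array shapes and parameter values are our own assumptions):

```python
import numpy as np

def som_match_step(M, f_t, f_s, r_prime, alpha, sigma):
    """One relaxation step of the feature-modulated SOM rule of Eq. (1).

    M       : (N, N, 2) weight array; M[i, j] points into the stimulus image.
    f_t     : (N, N, d) feature vectors of the target image.
    f_s     : (d,)      feature vector of the stimulus image at r_prime.
    r_prime : (2,)      coordinates of the presented stimulus pixel.
    """
    N = M.shape[0]
    # (i) winner search: the unit whose pointer is closest to the stimulus.
    d2 = np.sum((M - r_prime) ** 2, axis=-1)
    ci, cj = np.unravel_index(np.argmin(d2), d2.shape)
    # Gaussian lateral-plasticity term h_ic, evaluated on the unit grid.
    ii, jj = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    h = np.exp(-((ii - ci) ** 2 + (jj - cj) ** 2) / (2.0 * sigma ** 2))
    # Feature similarity S = 1 / (1 + ||f_s - f_t||), one value per unit.
    S = 1.0 / (1.0 + np.linalg.norm(f_t - f_s, axis=-1))
    # (ii) similarity-modulated weight update.
    return M + alpha * (h * S)[..., None] * (r_prime - M)
```

Iterating this step over a random sequence of stimulus pixels, while shrinking `alpha` and `sigma`, reproduces the relaxation process described above.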

2.2. Image representation by Gabor Wavelets

In general, the matching process can be based on pictorial features which are optimized with respect to the problem to be faced. Nevertheless, especially in the absence of explicit knowledge, several low-level descriptions of images are known (such as derivatives of Gaussians, and several classes of wavelets) which are well suited to capture image structures. In this view, we took Gabor Wavelets into account. They provide a multi-scale image representation which can be rigorously framed in the theory of the wavelet transform (Lee, 1996) while keeping an intriguing biological inspiration (Daugman, 1988). The Gabor Wavelets for an image $I(\mathbf{r})$ can be obtained by a set of filters:


$$I(\mathbf{r}) * \Psi(\mathbf{r}, \mathbf{k})$$

with:

$$\Psi(\mathbf{r}, \mathbf{k}) = 4k^2 \exp(-2k^2 r^2) \left[\exp(i 2\pi \mathbf{k} \cdot \mathbf{r}) - \exp\left(-\frac{\pi^2}{2}\right)\right]$$

where $\mathbf{k} = (k_x, k_y)$ is the spatial frequency vector, which is usually more conveniently expressed in a polar reference $\mathbf{k} = (k, \varphi)$, where the modulus $k$ acts as a scale-selection parameter. The gaborian kernel, for a fixed $\mathbf{k}$ vector, is a Gaussian which is modulated by a complex sinusoid. Such a kernel can be decomposed into a pair of quadrature filters:

$$c(x, y; k, \varphi) = 4k^2 \exp[-2k^2(x^2 + y^2)] \left(\cos(2\pi k (x \cos\varphi + y \sin\varphi)) - \exp(-\pi^2/2)\right)$$

$$s(x, y; k, \varphi) = 4k^2 \exp[-2k^2(x^2 + y^2)] \, \sin(2\pi k (x \cos\varphi + y \sin\varphi))$$

These are two Gaussians with a standard deviation $\frac{1}{2k}$, each Gaussian being modulated by a sinusoid with a frequency $k$ in the $\varphi$ direction: they respond maximally to $\varphi$-oriented features. It is worth noting that the frequency response is a Gaussian centered at $k$, which has a standard deviation $2k$. We expect specific oriented features to be strongly enhanced when a proper set of oriented kernels is used. At the same time, by filtering the image at

Fig. 3. Gabor representation of an MRI image: the scales (a) 1 and (b) 2 for $K = 2$, and four equispaced orientations, $M = 4$.

different scales, image textures with different coarseness are enhanced. In addition, Gabor kernels are well suited to model the receptive fields of simple cells in biological vision systems. Moreover, the modulus $m(\mathbf{r}, \mathbf{k})$ of the output of a pair of Gabor filters:

$$m(\mathbf{r}, k, \varphi) = m(\mathbf{r}, \mathbf{k}) = \sqrt{(c(\mathbf{r}, \mathbf{k}) * I(\mathbf{r}))^2 + (s(\mathbf{r}, \mathbf{k}) * I(\mathbf{r}))^2}$$

is usually considered a good approximation of the output of complex cells (Spitzer and Hochstein, 1985). The non-linearly filtered images obtained by $m(\mathbf{r}, \mathbf{k}) = |I(\mathbf{r}) * \Psi(\mathbf{r}, \mathbf{k})|$ are used in this work. We chose to use the widely accepted set of Gabor Wavelets with the spectral set of parameters distributed in a 2D log-polar lattice (as described in (Daugman, 1988)). For each image we computed a feature vector $\mathbf{f}(\mathbf{r})$ at $K$ scales $k \in \{2^{-(l+1)} : l = 1, \ldots, K\}$ and $M$ angles $\varphi \in \{i \frac{\pi}{M} : i = 0, 1, \ldots, M-1\} = \{\varphi_0, \ldots, \varphi_{M-1}\}$:

$$\mathbf{f}(\mathbf{r}) = (m(\mathbf{r}, 1, \varphi_0), \ldots, m(\mathbf{r}, 1, \varphi_{M-1}), \ldots, m(\mathbf{r}, K, \varphi_0), \ldots, m(\mathbf{r}, K, \varphi_{M-1}))$$
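The feature extraction above can be sketched as follows (a minimal NumPy illustration, not the authors' code; the helper names, the kernel support and the use of a circular FFT convolution are our own choices):

```python
import numpy as np

def gabor_pair(k, phi, half=8):
    """Quadrature pair c, s of Gabor kernels at scale k, orientation phi."""
    x, y = np.meshgrid(np.arange(-half, half + 1), np.arange(-half, half + 1))
    g = 4 * k**2 * np.exp(-2 * k**2 * (x**2 + y**2))  # envelope, std 1/(2k)
    u = 2 * np.pi * k * (x * np.cos(phi) + y * np.sin(phi))
    # DC-corrected even (cosine) filter and odd (sine) filter.
    return g * (np.cos(u) - np.exp(-np.pi**2 / 2)), g * np.sin(u)

def _conv_same(img, ker):
    """Circular 'same' convolution via FFT (adequate for a sketch)."""
    K = np.zeros(img.shape)
    kh, kw = ker.shape
    K[:kh, :kw] = ker
    K = np.roll(K, (-(kh // 2), -(kw // 2)), axis=(0, 1))  # center the kernel
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(K)))

def gabor_features(img, scales, n_orient, half=8):
    """Stack of moduli m(r, k, phi): one channel per (scale, orientation)."""
    chans = []
    for k in scales:
        for i in range(n_orient):
            c, s = gabor_pair(k, i * np.pi / n_orient, half)
            chans.append(np.hypot(_conv_same(img, c), _conv_same(img, s)))
    return np.stack(chans, axis=-1)
```

With $K = 2$ scales and $M = 4$ orientations, as in Fig. 3, each pixel is described by an 8-dimensional feature vector.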

In Fig. 3 the above representation is illustrated for an MRI image with $K = 2$ and $M = 4$. A key feature of multi-scale representations is their intrinsic capability to implement coarse-to-fine processing. In this way, one can immediately focus on large-scale image features and disregard fine details, which may often be misleading in preliminary



processing steps. Small details can usually be handled more efficiently in the following phases: an approach which usually leads to improved overall performance and a reduced computational load. It must be pointed out that the coarse-to-fine strategy matches well with SOM relaxation: SOMs are usually built according to a two-stage scheme. In the first stage, commonly termed the "ordering" phase, adaptation parameters are selected so as to capture the global structure of the map (in our case the overall transformation between the two images). In the second stage, the "refinement" phase, local details of the map emerge which provide a high-resolution estimate of the transformation. We exploited these properties of SOM networks by adopting the following computational scheme.

During the "ordering" phase, we used a coarse scale $k$ of the Gabor Wavelet Transform with $M$ orientations, thus using the feature vectors:

$$\mathbf{f}_t(\mathbf{r}) = (m_t(\mathbf{r}, k, \varphi_0), \ldots, m_t(\mathbf{r}, k, \varphi_{M-1}))$$

$$\mathbf{f}_s(\mathbf{r}') = (m_s(\mathbf{r}', k, \varphi_0), \ldots, m_s(\mathbf{r}', k, \varphi_{M-1}))$$

for $I_t(\mathbf{r})$ and $I_s(\mathbf{r}')$, respectively. In this phase, images can be safely down-sampled, leading to an $N_k \times N_k$ matrix.

During the "refinement" phase, feature vectors computed at a finer scale $h$ (keeping the number of orientations) are used to build an $N_h \times N_h$ map, with $N_h \ge N_k$. The implemented algorithm is schematized in Fig. 4 and details are given in Section 2.3.

Fig. 4. Schema of the algorithm to compute the mapping $M$ by relaxation of our SOM network.

2.3. Implementation issues

Original images were represented by 256 × 256 matrices. In the ordering phase we used Gabor features computed at scale 2 with 8 orientations ($\varphi = 0, \frac{\pi}{8}, \ldots, \frac{7\pi}{8}$) and re-sampled the images to 64 × 64, obtaining a 64 × 64 unit SOM. A number of $5 \times 10^5$ iterations was used, varying $\sigma(t)$ linearly from 10 to 1. This means that the neighborhood of each winning unit covers, on average, almost the entire map in the initial iterations, while at the end only a few units are included in the neighborhood of the winner. Weights were initialized with random values uniformly distributed in $[0, 63]$. Identity mapping is another possible initialization: several experiments performed in this way showed no noticeable difference in the results, nor any significant changes in the execution time. The starting value of the learning rate $\alpha_0$ was set to one.

In the refinement phase we used Gabor features computed at scale 1 (with the same orientations adopted in the ordering phase) and re-sampled the images (by bi-cubic spline interpolation) to 128 × 128, obtaining a 128 × 128 unit SOM. In this case $10^6$ iterations were used, varying $\sigma(t)$ linearly from 2 to 0.3. In the initial iterations the neighborhood of each winning unit includes a few neighbors, and it eventually shrinks to the winner alone. Weights were initialized by using the map from the ordering phase and by computing the missing values by bi-cubic spline interpolation. The starting learning rate $\alpha_0$ was again set to one. Once convergence is reached, the map is re-sampled to a 256 × 256 matrix by means of bi-cubic spline interpolation.

All the procedures were implemented in the C++ language, under a Linux operating system running on a PC equipped with an AMD Athlon processor clocked at 2 GHz. An overall processing time of about 11 min was observed: 25 s for the computation of the Gabor features, 4 min for the ordering phase and 6 min for the refinement phase. It is worth noting that the multi-resolution approach provides a noticeable saving of computation time as compared to a single step carried out at the finest spatial scale: the latter needs about 57 min, that is, 51 min for the ordering phase and 6 min for the refinement phase.

3. Experiments

In order to evaluate the effectiveness of SOM-based image matching we performed a series of experiments using the well-known Shepp and Logan head phantom (Shepp and Logan, 1974) undergoing different types of deformation in the presence of additive uniform noise. At first we applied to the phantom a set of global affine transformations:

$$\mathbf{r}' = A\mathbf{r} + \mathbf{b} \qquad (2)$$

where, in our experiments, the $A$ matrix accounts for both image rotation about the image center and uniform scaling along the co-ordinate axes, while $\mathbf{b} = (b_x, b_y)$ is a rigid displacement. In this context, it is worthwhile writing Eq. (2) as:

$$\mathbf{r}' = RD\mathbf{r} + \mathbf{b} \qquad (3)$$

where

$$R = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$$

is a rotation matrix, and

$$D = \begin{pmatrix} s & 0 \\ 0 & s \end{pmatrix}$$

is a uniform scaling matrix.
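Eq. (3) is a direct composition of the two matrices above with the displacement $\mathbf{b}$; as a sketch (the function name is ours):

```python
import numpy as np

def affine_map(r, theta, s, b):
    """r' = R D r + b of Eq. (3): rotation by theta, uniform scaling by s,
    rigid displacement b; r is a length-2 coordinate vector."""
    R = np.array([[np.cos(theta),  np.sin(theta)],
                  [-np.sin(theta), np.cos(theta)]])
    D = np.array([[s, 0.0],
                  [0.0, s]])
    return R @ D @ np.asarray(r, dtype=float) + np.asarray(b, dtype=float)
```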

In addition, we have considered local non-linear transformations belonging to the class of barrel/pincushion deformations. These are best described in a polar frame. The transformation is defined with respect to a fixed point (positioned at the center of the image, in our case) by the equation:

$$\rho_r = \rho_{r'} (1 + c \rho_{r'}^2) \qquad (4)$$

where $\rho_r$ is the radial position of a generic point in the original picture and $\rho_{r'}$ is its radial position in the transformed picture. The coefficient $c$ controls the extent of the transformation. For $c < 0$ one has a pincushion-like distortion, while for $c > 0$ the deformation is barrel-like.

Besides phantom images, we examined the behavior of our method for MRI images in the case of transformations computed according to Eqs. (3) and (4).
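Eq. (4) can be sketched as follows (the function names are ours; following the text, the equation gives the radius in the original picture as a function of the radius in the transformed picture, and angles about the fixed center are preserved):

```python
import numpy as np

def radial_source(rho_prime, c):
    """Eq. (4): radial position in the original picture corresponding to
    radius rho_prime in the transformed picture."""
    return rho_prime * (1.0 + c * rho_prime**2)

def warp_point(p, c, center):
    """Map a point p of the transformed picture back to the original
    picture by rescaling its radius about the fixed center."""
    v = np.asarray(p, dtype=float) - center
    rho = np.hypot(v[0], v[1])
    if rho == 0.0:
        return np.asarray(center, dtype=float)  # the fixed point is invariant
    return center + v * (radial_source(rho, c) / rho)
```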

As described in the following, experiments were performed by using several sets of values which determined the transformations considered. Uniform uncorrelated noise was added to the phantom images to mimic the effect of statistical fluctuations. The noise level was controlled by fixing the interval $[-a, a]$ of the uniform deviate. Assuming image gray levels in $[0, 1]$, the noise level was controlled by varying $a \in [0, 0.2]$.

To quantify the quality of the obtained matching, the normalized image correlation $r_n$ was used as a figure of merit:

Table 1
Results for affine transformations without added noise

Translation
b_x    −25     −20     −15     −10     −5      0       5       10      15      20      25
r_n    0.981   0.998   0.981   0.998   0.985   1.000   0.987   0.999   0.985   0.999   0.982
r'_n   0.628   0.661   0.698   0.742   0.809   1.000   0.809   0.742   0.698   0.661   0.628

Rotation
θ      −25     −20     −15     −10     −5      0       5       10      15      20      25
r_n    0.979   0.979   0.982   0.983   0.985   1.000   0.984   0.983   0.982   0.980   0.979
r'_n   0.730   0.753   0.783   0.833   0.913   1.000   0.913   0.833   0.783   0.753   0.730

Scale
s      0.8     0.85    0.9     0.95    1.0     1.05
r_n    0.943   0.962   0.977   0.985   1.000   0.973
r'_n   0.601   0.644   0.687   0.750   1.000   0.757


$$r_n = \frac{\sum_{x,y} I_t(x, y)\, I_r(x, y)}{\sqrt{\sum_{x,y} I_t(x, y)^2 \, \sum_{x,y} I_r(x, y)^2}}$$

where $I_r(x, y)$ is the image registered according to the computed transformation $M : (x, y) \mapsto (x', y')$. It is worth noting that the correlation $r_n$ is widely used in the literature; nevertheless, its integral nature limits its value. For this reason we also evaluated the quality of the results by visually inspecting the difference image $I_t(x, y) - I_s(x, y)$.
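The figure of merit above is straightforward to compute (a sketch; the function name is ours):

```python
import numpy as np

def normalized_correlation(a, b):
    """Normalized image correlation r_n between two images of equal shape."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sum(a * b) / np.sqrt(np.sum(a**2) * np.sum(b**2)))
```

Note that $r_n = 1$ whenever the two images are proportional, which is one reason the text complements it with visual inspection of the difference image.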

3.1. Affine transformation

We considered three different affine transformations: translation, scaling and rotation. In Table 1 we give $r_n$ for different values of $b_x$ (translation), $\theta$ (rotation), and $s$ (uniform scaling),

Table 2
Results for affine transformations in the presence of additive noise

a                          0.0     0.05    0.1     0.15    0.2
Translation (b_x = 10)
r_n                        0.999   0.994   0.988   0.981   0.974
r'_n                       1.000   0.763   0.784   0.803   0.820
Rotation (θ = 10°)
r_n                        0.983   0.983   0.980   0.975   0.970
r'_n                       1.000   0.847   0.858   0.869   0.878
Scale (s = 0.90)
r_n                        0.977   0.978   0.975   0.972   0.967
r'_n                       1.000   0.713   0.740   0.765   0.786

Fig. 5. Some examples of estimated matchings for the Shepp and Logan phantom undergoing affine transformations. Column (i) refers to a rigid translation with $b_x = 10$, column (ii) illustrates a uniform scaling ($s = 0.90$), and (iii) is a rotation with $\theta = 10°$. A uniform noise with $a = 0.15$ was added to the phantom images. Transformed images are shown in row (a), and in row (b) the recovered images are reported. In (c) one can see the difference between the original and the transformed image, and in (d) the difference between the original and the recovered image.

Table 3
Results for pincushion/barrel deformations without noise

Pincushion
c      0.000    1·10⁻⁵   2·10⁻⁵   3·10⁻⁵
r_n    1.000    0.976    0.968    0.955
r'_n   1.000    0.760    0.736    0.727

Barrel
c      −5·10⁻⁶   −4·10⁻⁶   −3·10⁻⁶   0.000
r_n    0.979     0.990     0.985     1.000
r'_n   0.669     0.692     0.737     1.000

Table 4
Results for pincushion/barrel deformations with noise

a                          0.0     0.05    0.1     0.15    0.2
Barrel (c = −4·10⁻⁶)
r_n                        0.990   0.987   0.982   0.972   0.962
r'_n                       1.000   0.707   0.719   0.729   0.735
Pincushion (c = 2·10⁻⁵)
r_n                        0.968   0.960   0.952   0.939   0.925
r'_n                       1.000   0.743   0.748   0.754   0.757

Fig. 6. (a) The Shepp and Logan phantom undergoing pincushion and barrel distortions, (b) corresponding recovered images, (c) difference images between original and distorted images, (d) difference images between original and recovered images. The values $c = \{-3 \times 10^{-6}, -5 \times 10^{-6}, 1 \times 10^{-5}, 3 \times 10^{-5}\}$ are used for (i), (ii), (iii), and (iv), respectively.

respectively, in the absence of noise. As a reference, we provide the normalized cross-correlation $r'_n$ between the original picture and the transformed one:

$$r'_n = \frac{\sum_{x,y} I_t(x, y)\, I_s(x, y)}{\sqrt{\sum_{x,y} I_t(x, y)^2 \, \sum_{x,y} I_s(x, y)^2}}$$

In Table 2, the values of $r_n$ (along with $r'_n$) are reported by varying the noise level, for $b_x = 10$ pixels, $\theta = 10°$, and $s = 0.90$, respectively. The values of the transformation parameters $b_x$, $\theta$, $s$ (and the noise level $a$) were selected so as to obtain a set of transformations which correspond to common cases.

In order to make the evaluation of the experimental results easier, in the panels of Fig. 5 we show some examples of transformed images and estimated matchings for $a = 0.15$.

3.2. Pincushion and barrel deformation

The results for the transformation in Eq. (4) are shown in Table 3. We considered both negative and positive values of the parameter $c$, with $c$ drawn from the interval $[-5 \times 10^{-6}, 3 \times 10^{-5}]$.

Results of simulations in the presence of noise are given in Table 4, for both the cases $c < 0$ and $c > 0$.

In Fig. 6 some examples of barrel and pincushion deformations are shown.

3.3. Examples of MRI images with synthetic transformations

Experiments on real medical images were carried out in the case of mono-modal intra-subject MRI. Comparing images taken at different times is crucial in treatment monitoring (e.g. before and after a surgical intervention), as well as to study the growth of lesions (Maintz and Viergever, 1998; Thirion and Calmon, 1999). In addition, breast MRI is usually based on image subtraction following the injection of a contrast agent, and requires careful image registration to avoid motion artifacts (Schnabel et al., 2003).

We have tested our method by applying the image transformations in Eqs. (3) and (4) to MRI images. In Fig. 7 we show some examples of the results obtained for an axial slice. The figure shows examples of translation, scaling, rotation, pincushion, and barrel deformations, respectively. Further examples, obtained for an image of a different anatomical section, are provided in Fig. 8. The values of the correlations $r_n$ and $r'_n$ are given for the two images in Tables 5 and 6, respectively.

Fig. 7. An MRI slice undergoing different transformations. The leftmost column illustrates the case of a rigid translation with $b_x = b_y = 20$. Column (ii) refers to a scaling with $s = 0.80$, column (iii) is a rotation with $\theta = 10°$, and distortion with (iv) $c = -5 \times 10^{-6}$ and (v) $c = 3 \times 10^{-5}$.

Fig. 8. Results for an MRI slice undergoing translation, rotation, scaling, pincushion and barrel transformations, using the same parameters as in Fig. 7, with the exception of rotation, which corresponds to $\theta = -10°$.

Table 5
Correlation values for the transformations of the image in Fig. 7

       b_x = 20, b_y = 20   s = 0.8   θ = 10°   c = −5·10⁻⁶   c = 3·10⁻⁵
r_n    1.000                0.967     0.988     0.921         0.903
r'_n   0.627                0.694     0.825     0.734         0.719

Table 6
Correlation values for the transformations of the image in Fig. 8

       b_x = 20, b_y = 20   s = 0.8   θ = −10°   c = −5·10⁻⁶   c = 3·10⁻⁵
r_n    1.000                0.961     0.980      0.981         0.966
r'_n   0.709                0.761     0.867      0.768         0.789


4. Conclusion

We have described a general procedure to solve matching problems for medical image co-registration and fusion. The basic Kohonen's SOM neural network has been properly adapted, and its relaxation mechanism is exploited in conjunction with a multi-scale image representation based on Gabor Wavelets.

The proposed method is essentially data-driven and non-parametric. Consequently, no specific hypothesis is necessary about the mathematical form of the correspondence law. On the other hand, the most frequently used co-registration methods (Maintz and Viergever, 1998) strongly rely on prior knowledge about the type of the involved transformation (e.g. rigid, affine, polynomial warping), which, in general, is only approximately known.

In order to evaluate the power of the approach, we have performed extensive computer simulations on phantom images undergoing known transformations in the presence of additive noise. Similar tests have been performed using clinical MRI images. The results support the idea that a topology-preserving paradigm can offer consistent estimates of both global and local transformations. Though further experimentation in clinical settings is necessary, we believe that the method is adequate for intra-subject image alignment in mono-modality imaging. In particular, this is useful when images acquired at different times must be compared. Other image co-registration problems (such as intra-subject multi-modality imaging, registration of patient data against an anatomical atlas, or inter-subject comparison) may need further development of the method, as they usually involve more complex and extended transformations.

It is worth noting that the data in the previous section indicate a slight degradation when the extent of the transformation increases. On the other hand, we based our similarity criterion on local morphological features. Consequently, if the images to be matched exhibit noticeable differences, one expects the computed mapping to become less accurate. Nevertheless, the method works rather well for large changes of shape, and it is able to deal with both global linear transformations and local non-linear deformations.

We wish to point out that the adopted computational strategy naturally leads to the implementation of the matching algorithm for 3D images, as is often the case in medical imagery: this requires the use of a 3D SOM along with a proper set of 3D image features. A final remark concerns the inherent sequentiality of SOM learning: as this may limit the computation speed, some optimization may be desirable, e.g. via parallel implementation (Hamalainen, 2001).

References

Bellando, J., Kotari, R., 1996. On image correspondence

using topology preserving mappings. In: IEEE Interna-

tional Conference on Neural Networks, pp. 1784–

1789.

Daugman, J., 1988. Complete discrete 2D Gabor transforms

by neural networks for image analysis and compression.

IEEE Trans. Acoust. Speech Signal Process. 36, 1169–

1179.

Hamalainen, T., 2001. Parallel implementations of self-orga-

nizing maps. In: Seifert, U., Jain, L. (Eds.), Self-organizing

Neural Networks. Recent Advances and Applications.

Springer, Heidelberg, pp. 245–278.

Haralick, R., Shapiro, L., 1993. Computer and Robot Vision. Addison-Wesley, Reading, MA.

Huwer, S., Rahmal, J., Wangenheim, A.V., 1996. Data-driven

registration for local deformations. Pattern Recognition

Lett. 17, 951–957.

Kohonen, T., 2001. Self-organizing Maps. Springer, Berlin.

Konen, W., Maurer, T., von der Malsburg, C., 1994. A fast

dynamic link algorithm for invariant pattern recognition.

Neural Networks 7, 1019–1039.

Lee, T.S., 1996. Image representation using 2D Gabor wave-

lets. IEEE Trans. Pattern Anal. Machine Intell. 18, 959–

971.

Maintz, J., Viergever, M., 1998. A survey of medical image

registration. Med. Image Anal. 2, 1–36.

Musse, O., Heitz, F., Armspach, J., 2001. Topology preserving

deformable image matching using constrained hierarchical

parametric models. IEEE Trans. Image Process. 10, 1081–

1093.

Schnabel, J.A., Tanner, C., Castellano-Smith, A., Degenhard, A., Leach, M., Hose, D., Hill, D., Hawkes, D., 2003. Validation of nonrigid image registration using finite-element methods: application to breast MR images. IEEE Trans. Med. Imaging 22, 238–247.

Shepp, L., Logan, B., 1974. The Fourier reconstruction of a

head section. IEEE Trans. Nucl. Sci. 21, 692–702.

Spitzer, H., Hochstein, S., 1985. A complex–cell receptive

model. J. Neurosci. 5, 1266–1286.

Thirion, J., 1998. Image matching as a diffusion process: an analogy with Maxwell's demons. Med. Image Anal. 2, 243–260.

Thirion, J.P., Calmon, G., 1999. Deformation analysis to

detect and quantify active lesions in three-dimensional

medical image sequences. IEEE Trans. Med. Imaging 18,

429–441.

van den Elsen, P., Pol, E., Viergever, M., 1993. Medical image

matching: a review with classification. IEEE Eng. Med.

Biol. 12, 26–29.

Wells III, W., Viola, P., Atsumi, H., Nakajima, S., Kikinis, R.,

1996. Multi-modal volume registration by maximization of

mutual information. Med. Image Anal. 1, 35–51.

Wurtz, R., Konen, W., Behrmann, K., 1999. On the perfor-

mance of neuronal matching algorithms. Neural Networks

12, 127–134.

