Pattern Recognition Letters 25 (2004) 341–352
www.elsevier.com/locate/patrec
Matching of medical images by self-organizing neural networks
Giuseppe Coppini a,*, Stefano Diciotti b, Guido Valli b
a Consiglio Nazionale delle Ricerche (CNR), Institute of Clinical Physiology, Via Moruzzi 1, 56124 Pisa, Italy
b Department of Electronics and Telecommunications, University of Florence, Italy
Received 11 March 2003; received in revised form 21 October 2003
Abstract
A general approach to the problem of image matching which exploits a multi-scale representation of local image
structure and the principles of self-organizing neural networks is introduced. The problem considered is relevant in
many imaging applications and has been largely investigated in medical imagery, especially as regards the integration of
different imaging procedures.
A given pair of images to be matched, named target and stimulus respectively, are represented by Gabor Wavelets.
Correspondence is computed by exploiting the learning procedure of a neural network derived from Kohonen's SOM. The
SOM units coincide with the pixels of the target image, and their weights are pointers to the pixels of the stimulus image.
The standard SOM rule is modified so as to account for image features. The properties of our method are tested by
experiments performed on synthetic images. The considered implementation has shown that it is able to recover a wide
range of transformations, including global affine transformations and local distortions. Tests in the presence of additive
noise indicate considerable robustness against statistical variability. Applications to clinical images are presented.
© 2003 Elsevier B.V. All rights reserved.
Keywords: Image matching; Medical imaging; SOM neural network; Gabor wavelets
1. Introduction
Finding correspondences between images is a
central problem in many vision activities, from
preattentive processes to attentive tasks. A wide
Partially funded by the Italian Ministry for Education, University and Research (MIUR).
* Corresponding author. Tel.: +39-503-153-480; fax: +39-503-152-166.
E-mail addresses: [email protected] (G. Coppini), [email protected] (S. Diciotti), [email protected] (G. Valli).
0167-8655/$ - see front matter © 2003 Elsevier B.V. All rights reserved.
doi:10.1016/j.patrec.2003.10.012
category of image matching problems arises from
the need to integrate images generated by different
modalities. The importance of integration tasks
has rapidly grown in recent years and has become
of paramount importance in modern medical
imaging. In the latter case, it is necessary to build comprehensive functional and anatomical representations of the observed biological structures by
utilizing the partial views offered by basic imaging
procedures such as MRI, PET, CT (van den Elsen
et al., 1993). Several correspondence problems
have been faced which include mono-modal or
Fig. 1. A general matching procedure is expected to preserve pictorial features and their neighborhood relationship (target image $I_t(\mathbf{r})$, point $\mathbf{r} \in T$; stimulus image $I_s(\mathbf{r}')$, point $\mathbf{r}' \in S$).

1 One-to-one mapping can be desirable in some cases.
multi-modal image correspondence and intra- or inter-subject matching. In addition, matching of
3D images has become an important issue along
with the mapping of imaged structures to reference
representations such as anatomical atlases. Much
effort has been devoted to finding global deformations, which has led to efficient algorithms able to recover linear transformations. A
popular method is based on the maximization of
the mutual information of the two images (Wells
III et al., 1996). More recently, the use of deformable models has been widely investigated so as to
recover local non-linear deformations. A number
of references on the subject can be found in (Maintz and Viergever, 1998). The thermodynamical analogy with Maxwell's demons has also been investigated (Thirion, 1998).
In general, specific solutions have been pro-
posed for many matching problems. The related
algorithms usually exploit some form of prior
knowledge, and are often based on strong
assumptions. On the other hand, many "correspondence finding" problems share important common facets, such as the statement of image-
similarity criteria or the definition of adequate
computational schemes. In this view, the impor-
tance of general solutions for image matching
problems has been considered by several
researchers (Haralick and Shapiro, 1993). In par-
ticular, we believe that a self-organizing process is a natural and very promising setting. As reported by Bellando and Kotari (1996), the computation of
topology preserving maps using Kohonen neural
networks (Kohonen, 2001) can provide valid
solutions to establishing image correspondence.
Wurtz et al. (1999) compare the behavior of Kohonen's SOM with the Dynamic Link Architecture (Konen et al., 1994), paying particular attention to the computational efficiency of self-organizing
processes. It is worth mentioning that the appli-
cation of a SOM network to image registration is
described by Huwer et al. (1996).
In this work we will take the following general problem into account. Let $I_t(\mathbf{r})$ and $I_s(\mathbf{r}')$ be two images (target and stimulus image, respectively), with $\mathbf{r} = (x, y)$ and $\mathbf{r}' = (x', y')$ co-ordinate vectors defined in proper subsets $T$ and $S$, respectively (image planes), of $\mathbb{R}^2$. We assume that a feature vector $\mathbf{f}_t(\mathbf{r}) = \{f_t^i(\mathbf{r})\}$ describing relevant properties of $I_t(\mathbf{r})$ can be computed for each point $\mathbf{r}$ in $T$. Similarly, $\mathbf{f}_s(\mathbf{r}') = \{f_s^i(\mathbf{r}')\}$ will indicate the feature vector of $I_s(\mathbf{r}')$ for each $\mathbf{r}' \in S$. We search for a correspondence rule $M : T \mapsto S$ that maps points in $T$ to points in $S$ which have similar morphological properties, as described by the feature vectors (see Fig. 1). Stated so, this is an ill-posed problem, and further constraints must be considered to compute a useful solution. We believe that powerful constraints can be derived from the regularity of the observed world, even if the use of specific knowledge available for each matching problem considered can lead to adequate solutions. For example, images of normal anatomical structures exhibit a large inter-subject variability; nevertheless, the observed structures keep some general properties, such as connectedness and proximity among sub-structures. In general, it seems reasonable to search for a correspondence law which is unique from $T$ to $S$, 1 and which maps features in the $T$ plane to similar features in the $S$ plane by matching spatially contiguous features in $T$ into spatially contiguous features in $S$. These conditions can be restated more concisely by saying that the desired mapping is a feature- and topology-preserving transformation from $T$ into $S$. The importance of topology preservation in image matching has been
discussed by Musse et al. (2001) who propose the
use of hierarchical parametric models. A formal
definition of the meaning of "topology preserving transformation" is useful for the following. To this end we assume that proper metrics $d_I$ and $d_F$ are given in the image planes $I = \{T, S\}$ and the feature space $F$, respectively. Let us assume that for any given $\mathbf{r}_1, \mathbf{r}_2 \in T$, $\mathbf{r}'_1, \mathbf{r}'_2 \in S$ are the corresponding transformed points, i.e. $\mathbf{r}'_1 = M(\mathbf{r}_1)$, $\mathbf{r}'_2 = M(\mathbf{r}_2)$. We will say that the transformation $M$ preserves image topology if the following conditions are met:

(1) Preservation of image plane topology:
$$\lim_{d_I(\mathbf{r}_1, \mathbf{r}_2) \to 0} d_I(\mathbf{r}'_1, \mathbf{r}'_2) = 0$$

(2) Feature preservation:
$$\lim_{d_I(\mathbf{r}_1, \mathbf{r}'_1) \to 0} d_F(\mathbf{f}_t(\mathbf{r}_1), \mathbf{f}_s(\mathbf{r}'_1)) = 0$$
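As an illustration, condition 1 can be probed numerically for a candidate mapping by checking that the image-plane distance of the transformed points vanishes together with the distance of their pre-images. The smooth test map below (a small rotation plus a translation) is a hypothetical example, not one of the paper's computed mappings:

```python
import numpy as np

def topology_gap(M, r1, r2):
    """d_I(M(r1), M(r2)): image-plane distance of the transformed points."""
    return float(np.linalg.norm(M(r1) - M(r2)))

# Hypothetical smooth test map: a 5-degree rotation plus a translation.
theta = np.deg2rad(5.0)
R = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])
M = lambda r: R @ np.asarray(r, dtype=float) + np.array([2.0, -1.0])

# Condition 1: as d_I(r1, r2) -> 0, d_I(M(r1), M(r2)) -> 0 as well.
r1 = np.array([10.0, 20.0])
for eps in (1.0, 0.1, 0.01):
    print(eps, topology_gap(M, r1, r1 + np.array([eps, 0.0])))
```

For this rigid map the gap equals the pre-image distance exactly; a discontinuous map would leave a finite gap as eps shrinks.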
Though further constraints can be taken into
account (such as the high order smoothness of
the mapping), our aim is the investigation of the
preservation of the neighborhood relationship.
To this end we consider the computation of
image matching by using a topology preserving
neural network derived from Kohonen's SOM. The resulting computation is typically data-driven and no specific prior model is expected to constrain the obtained solution. In this sense our
approach is typically non-parametric. In order to
investigate the related capabilities, we outline the
basic scheme of the adopted computational para-
digm in the next section. Subsequently we present
and analyze the performances observed when a
set of known mathematical transformations was applied to phantom images in the presence of
additive noise. Applications to computing corre-
spondence between biomedical images are also
described.
2 Without loss of generality we will assume that each unit is
identified by a predefined index.
2. Computational paradigm
Following the previous considerations, a general framework for image matching problems requires: (a) a proper computational architecture to implement a topology preserving map, and (b) a flexible image description to compute a useful set of pictorial features $\mathbf{f} = \{f^i\}$. This led us to investigate a self-organizing neural network coupled with
image features computed by the Gabor Wavelet
Transform.
2.1. Matching through self-organization
The computation of topology preserving maps
between vector spaces can be carried out by using
Kohonen neural networks (Kohonen, 2001).
Basically, they include a grid of units, typically 2D, each of which receives the same input vector (stimulus) and operates on a winner-takes-all basis. For a given input, the network weights are changed according to the rule: (i) find a winner unit, and (ii) change the weights of the units in a neighborhood of the winner unit; the size of the neighborhood (defining the extent of lateral plasticity) is a decreasing function of time. The relaxation process obtained by iteratively applying the above rule for a sequence of input stimuli is the basis of the training of ordinary Kohonen networks. In this work, the same process, properly adapted, is exploited to compute the desired correspondence $M$. In our computational scheme, the pixels of one of the two images $I_t(\mathbf{r})$ (called the target image) correspond to the units of the Kohonen network. In general we assume the image to be sub-sampled by using an $N \times N$ grid of pixels, which provides an $N \times N$ SOM. The weights of each SOM unit can be interpreted as pointers to the second image $I_s(\mathbf{r}')$ (called the stimulus image), whose coordinates are the input of the network (Fig. 2). In this way, the weight vectors directly give the displacement needed to match point $\mathbf{r}$ with the corresponding point $\mathbf{r}'$. For this reason, in the following the network weight of the $i$th unit 2 is indicated by $M(\mathbf{r}_i)$, while $\mathbf{r}'$ is the input (or stimulus) vector. The standard Kohonen algorithm that governs the updating of network weights in response to a stimulus $\mathbf{r}'$ presented at time $t$ is as follows:
Fig. 2. Image matching by a 2D self-organizing network: SOM units are superimposed on the pixels of the target image and the network is made to relax under the action of the pixels of the stimulus image.
(i) locate the winner unit $c$ by:
$$c = \arg\min_k \|\mathbf{r}' - M(\mathbf{r}_k)\|$$

(ii) change the network weights according to the equation:
$$M^{t+1}(\mathbf{r}_i) = M^t(\mathbf{r}_i) + \alpha(t)\, h_{ic}(t)\, [\mathbf{r}' - M^t(\mathbf{r}_i)]$$

The term $\alpha(t)$ is the learning rate, which is often given by:
$$\alpha(t) = \alpha_0 \frac{t_{\max} - t}{t_{\max}}$$
where $\alpha_0 \le 1$ and $t_{\max}$ is the (prefixed) number of iterations of the relaxation process. The function $h_{ic}$ accounts for lateral plasticity: its value is one for $i = c$ and falls off with the distance $\|\mathbf{r}_i - \mathbf{r}_c\|$ from the winner unit. A widely used neighborhood function is the Gaussian:
$$h_{ic}(t) = \exp\left[-\frac{\|\mathbf{r}_i - \mathbf{r}_c\|^2}{2\sigma(t)^2}\right]$$
The parameter $\sigma(t)$ determines the width of the neighborhood and controls the topology preservation process. As in the case of the learning rate, $\sigma(t)$ is decreased as the relaxation progresses.

Unfortunately, the previous self-adaptation
equation is too simple to produce the desired results. Only condition 1 of our definition (i.e. preservation of image plane topology) is satisfied by the obtained mapping. On the other hand, image features are not taken into account and a unitary mapping is produced independently of the image content. In order to achieve a meaningful behavior, i.e. to ensure that feature preservation is satisfied (condition 2), we modified the standard SOM rule so as to include image features. To be more precise, we defined the feature similarity $\mathcal{S}$ by:
$$\mathcal{S}(\mathbf{f}_s(\mathbf{r}'), \mathbf{f}_t(\mathbf{r})) = \frac{1}{1 + \|\mathbf{f}_s(\mathbf{r}') - \mathbf{f}_t(\mathbf{r})\|}$$
where $\|\cdot\|$ is the ordinary Euclidean metric. The function $\mathcal{S}$ assumes a unit value when $\mathbf{f}_s(\mathbf{r}') = \mathbf{f}_t(\mathbf{r})$ and drops to zero as $\|\mathbf{f}_s(\mathbf{r}') - \mathbf{f}_t(\mathbf{r})\|$ increases. Having defined $\mathcal{S}$, we replaced the SOM weight update rule with the equation:
$$M^{t+1}(\mathbf{r}_i) = M^t(\mathbf{r}_i) + \alpha(t)\, h_{ic}(t)\, [\mathbf{r}' - M^t(\mathbf{r}_i)]\, \mathcal{S}(\mathbf{f}_s(\mathbf{r}'), \mathbf{f}_t(\mathbf{r})) \quad (1)$$
With this choice, maximal weight changes are allowed when the similarity between corresponding features is high. On the other hand, weight changes are weakened when different image features are coupled. Otherwise stated, image features influence the computed map by simply modulating the magnitude of the weight update. One can get an idea of the effect of such a process by considering the extreme case of zero-valued similarity: this prevents any weight change, so that the competition mechanism has no actual effect. In brief, competition among network units involves spatial coordinates only, while changes to the mapping are controlled by image-feature similarity.
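A single relaxation step of the feature-modulated update in Eq. (1) can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the linear schedules, the grid indexing, and the in-place update are our own choices.

```python
import numpy as np

def som_step(M, ft, fs, r_prime, t, t_max, alpha0=1.0, sigma0=10.0):
    """One relaxation step of the feature-modulated SOM update (Eq. (1)).

    M:  map weights, shape (N, N, 2) -- pointers into the stimulus image.
    ft: target features, shape (N, N, F); fs: function r' -> length-F vector.
    """
    N = M.shape[0]
    # (i) winner unit: the unit whose pointer is closest to the stimulus r'
    d = np.linalg.norm(M - r_prime, axis=2)
    c = np.unravel_index(np.argmin(d), d.shape)

    # linearly decreasing learning rate and neighborhood width (one choice)
    frac = (t_max - t) / t_max
    alpha = alpha0 * frac
    sigma = max(sigma0 * frac, 1e-3)

    # Gaussian lateral-plasticity term h_ic over grid positions r_i
    ii, jj = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    h = np.exp(-((ii - c[0]) ** 2 + (jj - c[1]) ** 2) / (2.0 * sigma**2))

    # feature similarity S = 1 / (1 + ||fs(r') - ft(r)||)
    S = 1.0 / (1.0 + np.linalg.norm(ft - fs(r_prime), axis=2))

    # (ii) modulated update: M <- M + alpha * h * S * (r' - M)
    M += (alpha * h * S)[..., None] * (r_prime - M)
    return M
```

In a full run, stimuli $\mathbf{r}'$ would be drawn from the stimulus image plane and the step iterated $t_{\max}$ times while the schedules decay.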
2.2. Image representation by Gabor Wavelets
In general, the matching process can be based on pictorial features which are optimized with respect to the problem being faced. Nevertheless, especially in the absence of explicit knowledge, several low-level descriptions of images are known (such as derivatives of Gaussians, and several classes of wavelets) which are well suited to capture image structures. In this view, we took Gabor Wavelets into account. They provide a multi-scale image representation which can be rigorously framed in the theory of the wavelet transform (Lee, 1996) while keeping an intriguing biological inspiration (Daugman, 1988). Gabor Wavelets for an image $I(\mathbf{r})$ can be obtained by a set of filters:
$$I(\mathbf{r}) * \Psi(\mathbf{r}, \mathbf{k})$$
with:
$$\Psi(\mathbf{r}, \mathbf{k}) = 4k^2 \exp(-2k^2 r^2) \left[ \exp(i 2\pi \mathbf{k} \cdot \mathbf{r}) - \exp\left(-\frac{\pi^2}{2}\right) \right]$$
where $\mathbf{k} = (k_x, k_y)$ is the spatial frequency vector, which is usually more conveniently expressed in a polar reference $\mathbf{k} = (k, \varphi)$, where the modulus $k$ acts as a scale-selection parameter. The Gabor kernel, for a fixed $\mathbf{k}$ vector, is a Gaussian modulated by a complex sinusoid. Such a kernel can be decomposed into a pair of quadrature filters:
$$c(x, y; k, \varphi) = 4k^2 \exp[-2k^2(x^2 + y^2)] \left( \cos(2\pi k (x \cos\varphi + y \sin\varphi)) - \exp(-\pi^2/2) \right)$$
$$s(x, y; k, \varphi) = 4k^2 \exp[-2k^2(x^2 + y^2)] \sin(2\pi k (x \cos\varphi + y \sin\varphi))$$
These are two Gaussians with standard deviation $\frac{1}{2k}$, each Gaussian being modulated by a sinusoid with frequency $k$ in the $\varphi$ direction: they respond maximally to $\varphi$-oriented features. It is worth noting that the frequency response is a Gaussian centered at $k$, which has a standard deviation $2k$. We expect specific oriented features to be strongly
enhanced when a proper set of oriented kernels is
used. At the same time, by filtering the image at
Fig. 3. Gabor representation of an MRI image: the scales (a) 1 and (b) 2 for $K = 2$, and four equispaced orientations, $M = 4$.
different scales, image textures with different
coarseness are enhanced. In addition, Gabor kernels are well suited to model the receptive fields of
simple cells in biological vision systems. Moreover,
the modulus $m(\mathbf{r}, \mathbf{k})$ of the output of a pair of Gabor filters:
$$m(\mathbf{r}, k, \varphi) = m(\mathbf{r}, \mathbf{k}) = \sqrt{\left(c(\mathbf{r}, \mathbf{k}) * I(\mathbf{r})\right)^2 + \left(s(\mathbf{r}, \mathbf{k}) * I(\mathbf{r})\right)^2}$$
is usually considered a good approximation of the output of complex cells (Spitzer and Hochstein, 1985). The non-linearly filtered images obtained by $m(\mathbf{r}, \mathbf{k}) = |I(\mathbf{r}) * \Psi(\mathbf{r}, \mathbf{k})|$ are used in this work.
We chose to use the widely accepted set of Gabor Wavelets with the spectral set of parameters distributed in a 2D log-polar lattice (as described in (Daugman, 1988)). For each image we computed a feature vector $\mathbf{f}(\mathbf{r})$ at $K$ scales, $k \in \{2^{-(l+1)} : l = 1, \ldots, K\}$, and $M$ angles, $\varphi \in \{i\frac{\pi}{M} : i = 0, 1, \ldots, M-1\} = \{\varphi_0, \ldots, \varphi_{M-1}\}$:
$$\mathbf{f}(\mathbf{r}) = (m(\mathbf{r}, 1, \varphi_0), \ldots, m(\mathbf{r}, 1, \varphi_{M-1}), \ldots, m(\mathbf{r}, K, \varphi_0), \ldots, m(\mathbf{r}, K, \varphi_{M-1}))$$
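A possible NumPy sketch of the modulus computation follows; the kernel truncation radius and the FFT-based "same" convolution are implementation choices of ours, not details given by the authors:

```python
import numpy as np

def conv_same(image, kernel):
    """Linear 'same' convolution via zero-padded FFTs (odd-sized kernel)."""
    H, W = image.shape
    h, w = kernel.shape
    shape = (H + h - 1, W + w - 1)
    full = np.real(np.fft.ifft2(np.fft.fft2(image, s=shape) *
                                np.fft.fft2(kernel, s=shape)))
    return full[(h - 1) // 2:(h - 1) // 2 + H, (w - 1) // 2:(w - 1) // 2 + W]

def gabor_modulus(image, k, phi, radius=16):
    """m(r; k, phi) = sqrt((c * I)^2 + (s * I)^2) at one scale/orientation."""
    x = np.arange(-radius, radius + 1)
    X, Y = np.meshgrid(x, x)
    gauss = 4 * k**2 * np.exp(-2 * k**2 * (X**2 + Y**2))
    arg = 2 * np.pi * k * (X * np.cos(phi) + Y * np.sin(phi))
    c_ker = gauss * (np.cos(arg) - np.exp(-np.pi**2 / 2))  # even, DC-corrected
    s_ker = gauss * np.sin(arg)                            # odd
    return np.hypot(conv_same(image, c_ker), conv_same(image, s_ker))

def gabor_features(image, K=2, M=4):
    """Stack of K*M modulus maps: scales k = 2^-(l+1), l = 1..K; M angles."""
    maps = [gabor_modulus(image, 2.0 ** -(l + 1), i * np.pi / M)
            for l in range(1, K + 1) for i in range(M)]
    return np.stack(maps, axis=-1)
```

With `K = 2` and `M = 4`, as in Fig. 3, each pixel receives an 8-component feature vector.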
In Fig. 3 the above representation is illustrated for an MRI image with $K = 2$ and $M = 4$. A key feature of multi-scale representations is their intrinsic capability to implement coarse-to-fine processing. In this way, one can immediately focus on large-scale image features and disregard fine details which may often be misleading in preliminary
processing steps. Small details can usually be handled more efficiently in the following phases: an approach which usually leads to improved overall performance and a reduced computational load. It must be pointed out that the coarse-to-fine strategy matches well with SOM relaxation: SOMs are usually built according to a two-stage scheme. In the first stage, commonly termed the "ordering" phase, adaptation parameters are selected so as to capture the global structure of the map (in our case the overall transformation between the two images). In the second stage, the "refinement" phase, local details of the map emerge which provide a high-resolution estimate of the transformation. We exploited these properties of SOM networks by adopting the following computational scheme.
During the "ordering" phase, we used a coarse scale $k$ of the Gabor Wavelet Transform with $M$ orientations, thus using the feature vectors:
$$\mathbf{f}_t(\mathbf{r}) = (m_t(\mathbf{r}, k, \varphi_0), \ldots, m_t(\mathbf{r}, k, \varphi_{M-1}))$$
$$\mathbf{f}_s(\mathbf{r}') = (m_s(\mathbf{r}', k, \varphi_0), \ldots, m_s(\mathbf{r}', k, \varphi_{M-1}))$$
Fig. 4. Schema of the algorithm to compute the mapping $M$ by relaxation of our SOM network.
for $I_t(\mathbf{r})$ and $I_s(\mathbf{r}')$, respectively. In this phase, images can be safely down-sampled, leading to an $N_k \times N_k$ matrix.
During the "refinement" phase, feature vectors computed at a finer scale $h$ (keeping the number of orientations) are used to build an $N_h \times N_h$ map, with $N_h \ge N_k$. The implemented algorithm is schematized in Fig. 4 and details are given in Section 2.3.
2.3. Implementation issues
Original images were represented by 256 × 256 matrices. In the ordering phase we used Gabor features computed at scale 2 with 8 orientations ($\varphi = 0, \frac{\pi}{8}, \ldots, \frac{7\pi}{8}$) and re-sampled the images to 64 × 64, obtaining a 64 × 64 unit SOM. A number of $5 \times 10^5$ iterations was used, varying $\sigma(t)$ linearly from 10 to 1. This means that the neighborhood of each winning unit covers, on average, almost the entire map in the initial iterations, while at the end only a few units are included in the neighborhood of the winner. Weights were initialized with random values uniformly distributed in $[0, 63]$. Identity mapping is another possible initialization: several experiments performed in this way showed no noticeable difference in the results, nor any significant changes in the execution time. The starting value of the learning rate $\alpha_0$ was set to one.
In the refinement phase we used Gabor features computed at scale 1 (with the same orientations adopted in the ordering phase) and re-sampled (by bi-cubic spline interpolation) the images to 128 × 128, obtaining a 128 × 128 unit SOM. In this case $10^6$ iterations were used, varying $\sigma(t)$ linearly from 2 to 0.3. In the initial iterations the neighborhood of each winning unit includes a few neighbors, and eventually it shrinks to the winner alone. Weights were initialized by using the map from the ordering phase and by computing the missing values by bi-cubic spline interpolation. The starting learning rate $\alpha_0$ was again set to one. Once convergence is reached, the map is re-sampled to a 256 × 256 matrix by means of bi-cubic spline interpolation.
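The two-phase schedule described above can be summarized in code. The linear schedule helper and the dictionary layout are illustrative conveniences of ours; the paper's exact update loop is not reproduced:

```python
import numpy as np

def linear_schedule(v0, v1, t_max):
    """Value decreased linearly from v0 (at t = 0) to v1 (at t = t_max - 1)."""
    return lambda t: v0 + (v1 - v0) * t / (t_max - 1)

# Two-phase coarse-to-fine settings as reported in Section 2.3.
phases = [
    {"name": "ordering",   "grid": 64,  "gabor_scale": 2, "iters": 500_000,
     "sigma": linear_schedule(10.0, 1.0, 500_000)},
    {"name": "refinement", "grid": 128, "gabor_scale": 1, "iters": 1_000_000,
     "sigma": linear_schedule(2.0, 0.3, 1_000_000)},
]
# Between the phases, the 64 x 64 map is upsampled to 128 x 128
# (bi-cubic spline interpolation in the paper) to initialize refinement.
```

The same helper can serve for the learning rate $\alpha(t)$, which starts at $\alpha_0 = 1$ in both phases.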
All the procedures were implemented in the C++ language, under a Linux operating system running on a PC equipped with an AMD Athlon processor clocked at 2 GHz. An overall processing time of about 11 min was observed: 25 s for the computation of the Gabor features, 4 min for the ordering phase and 6 min for the refinement phase. It is worth noting that the multi-resolution approach provides a noticeable saving of computation time as compared to a single step carried out at the finest spatial scale: the latter needs about 57 min, that is 51 min for the ordering phase and 6 min for the refinement phase.
3. Experiments
In order to evaluate the effectiveness of SOM-
based image matching we performed a series of
experiments by using the well-known Shepp and Logan head phantom (Shepp and Logan, 1974) undergoing different types of deformation in the presence of additive uniform noise. At first we applied to the phantom a set of global affine transformations:
$$\mathbf{r}' = A\mathbf{r} + \mathbf{b} \quad (2)$$
where, in our experiments, the $A$ matrix accounts for both image rotation about the image center and uniform scaling along the co-ordinate axes, while $\mathbf{b} = (b_x, b_y)$ is a rigid displacement. In this context, it is worthwhile writing Eq. (2) as:
$$\mathbf{r}' = RD\mathbf{r} + \mathbf{b} \quad (3)$$
where
$$R = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$$
is a rotation matrix, and
$$D = \begin{pmatrix} s & 0 \\ 0 & s \end{pmatrix}$$
is a uniform scaling matrix.
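Eq. (3) can be applied to a coordinate vector as follows; the parameter values in the comments are examples, not the full test grid used in the experiments:

```python
import numpy as np

def affine(r, theta_deg, s, b):
    """r' = R D r + b (Eq. (3)): rotation by theta, uniform scaling by s,
    and translation b. Rotation is about the origin; the paper rotates
    about the image center, which amounts to an offset handled by the caller."""
    th = np.deg2rad(theta_deg)
    R = np.array([[np.cos(th), np.sin(th)],
                  [-np.sin(th), np.cos(th)]])
    D = np.diag([s, s])
    return R @ (D @ np.asarray(r, dtype=float)) + np.asarray(b, dtype=float)

# e.g. theta = 10 deg, s = 0.90, b = (10, 0), as in Table 2
```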
In addition, we have considered local non-linear transformations belonging to the class of barrel/pincushion deformations. These are best described in a polar frame. The transformation is defined with respect to a fixed point (positioned at the center of the image, in our case) by the equation:
$$\rho_r = \rho_{r'} (1 + c\rho_{r'}^2) \quad (4)$$
where $\rho_r$ is the radial position of a generic point in the original picture and $\rho_{r'}$ is its radial position in the transformed picture. The coefficient $c$ controls the extent of the transformation. For $c > 0$ one has a pincushion-like distortion, while for $c < 0$ the deformation is barrel-like.
Besides phantom images, we examined the behavior of our method on MRI images in the case of transformations computed according to Eqs. (3) and (4).
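A distortion of this kind can be applied to an image by backward mapping: for each output pixel, Eq. (4) gives the radius at which to sample the original picture. The nearest-neighbour resampler and the centering convention below are our simplifications:

```python
import numpy as np

def distort_radius(rho_out, c):
    """Eq. (4): rho_r = rho_r' (1 + c rho_r'^2) -- radius in the original
    picture for a point at radius rho_out in the transformed picture."""
    return rho_out * (1.0 + c * rho_out**2)

def radial_warp(image, c):
    """Apply a pincushion (c > 0) or barrel (c < 0) distortion about the
    image center by backward mapping with nearest-neighbour sampling."""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    dy, dx = yy - cy, xx - cx
    rho = np.hypot(dx, dy)
    scale = np.ones_like(rho)
    mask = rho > 0
    scale[mask] = distort_radius(rho[mask], c) / rho[mask]
    src_y = np.clip(np.rint(cy + dy * scale), 0, h - 1).astype(int)
    src_x = np.clip(np.rint(cx + dx * scale), 0, w - 1).astype(int)
    return image[src_y, src_x]
```

With $c = 0$ the warp reduces to the identity; the experiments use $|c|$ on the order of $10^{-6}$ to $10^{-5}$ for 256 × 256 images.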
As described in the following, experiments were performed by using several sets of values which determined the transformations considered. Uniform uncorrelated noise was added to the phantom images to mimic the effect of statistical fluctuations. The noise level was controlled by fixing the interval $[-a, a]$ of the uniform deviate; assuming image gray levels in $[0, 1]$, $a$ was varied in $[0, 0.2]$.
To quantify the quality of the obtained matching, the normalized image correlation $r_n$ was used as a figure of merit:
Table 1
Results for affine transformations without added noise

Translation
b_x    −25     −20     −15     −10     −5      0       5       10      15      20      25
r_n    0.981   0.998   0.981   0.998   0.985   1.000   0.987   0.999   0.985   0.999   0.982
r'_n   0.628   0.661   0.698   0.742   0.809   1.000   0.809   0.742   0.698   0.661   0.628

Rotation
θ      −25     −20     −15     −10     −5      0       5       10      15      20      25
r_n    0.979   0.979   0.982   0.983   0.985   1.000   0.984   0.983   0.982   0.980   0.979
r'_n   0.730   0.753   0.783   0.833   0.913   1.000   0.913   0.833   0.783   0.753   0.730

Scale
s      0.8     0.85    0.9     0.95    1.0     1.05
r_n    0.943   0.962   0.977   0.985   1.000   0.973
r'_n   0.601   0.644   0.687   0.750   1.000   0.757
$$r_n = \frac{\sum_{x,y} I_t(x, y)\, I_r(x, y)}{\sqrt{\sum_{x,y} I_t(x, y)^2 \sum_{x,y} I_r(x, y)^2}}$$
where $I_r(x, y)$ is the image registered according to the computed transformation $M : (x, y) \mapsto (x', y')$. It is worth noting that the correlation $r_n$ is widely used in the literature; nevertheless, its integral nature limits its value. For this reason we also evaluated the quality of the results by visually inspecting the difference image $I_t(x, y) - I_s(x, y)$.
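The figure of merit and the uniform-noise perturbation used in the experiments can be sketched as follows (the function names are ours):

```python
import numpy as np

def norm_corr(a, b):
    """Normalized image correlation: sum(a b) / sqrt(sum(a^2) sum(b^2))."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float((a * b).sum() / np.sqrt((a**2).sum() * (b**2).sum()))

def add_uniform_noise(image, a, rng=None):
    """Additive uncorrelated uniform noise in [-a, a] (gray levels in [0, 1])."""
    rng = np.random.default_rng(0) if rng is None else rng
    return image + rng.uniform(-a, a, size=image.shape)
```

`norm_corr` equals 1 only when the two images are proportional, which is why the difference image is also inspected visually.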
3.1. Affine transformation
We considered three different affine transformations: translation, scaling and rotation. In Table 1 we give $r_n$ for different values of $b_x$ (translation), $\theta$ (rotation), and $s$ (uniform scaling),
Table 2
Results for affine transformations in the presence of additive noise

a       0.0     0.05    0.1     0.15    0.2

Translation (b_x = 10)
r_n     0.999   0.994   0.988   0.981   0.974
r'_n    1.000   0.763   0.784   0.803   0.820

Rotation (θ = 10°)
r_n     0.983   0.983   0.980   0.975   0.970
r'_n    1.000   0.847   0.858   0.869   0.878

Scale (s = 0.90)
r_n     0.977   0.978   0.975   0.972   0.967
r'_n    1.000   0.713   0.740   0.765   0.786
Fig. 5. Some examples of estimated matchings for the Shepp and Logan phantom undergoing affine transformations. Column (i) refers to a rigid translation with $b_x = 10$, column (ii) illustrates a uniform scaling ($s = 0.90$), and (iii) is a rotation with $\theta = 10°$. A uniform noise with $a = 0.15$ was added to the phantom images. Transformed images are shown in row (a), and in row (b) the recovered images are reported. In (c) one can see the difference between the original and the transformed image, and in (d) the difference between the original and the recovered image.
Table 3
Results for pincushion/barrel deformations without noise

Pincushion
c       0.000   1 × 10^−5   2 × 10^−5   3 × 10^−5
r_n     1.000   0.976       0.968       0.955
r'_n    1.000   0.760       0.736       0.727

Barrel
c       −5 × 10^−6   −4 × 10^−6   −3 × 10^−6   0.000
r_n     0.979        0.990        0.985        1.000
r'_n    0.669        0.692        0.737        1.000
Table 4
Results for pincushion/barrel deformations with noise

a       0.0     0.05    0.1     0.15    0.2

Barrel (c = −4 × 10^−6)
r_n     0.990   0.987   0.982   0.972   0.962
r'_n    1.000   0.707   0.719   0.729   0.735

Pincushion (c = 2 × 10^−5)
r_n     0.968   0.960   0.952   0.939   0.925
r'_n    1.000   0.743   0.748   0.754   0.757
Fig. 6. (a) The Shepp and Logan phantom undergoing pincushion and barrel distortions is shown, (b) corresponding recovered images, (c) difference images between original and distorted images, (d) difference images between original and recovered images. The values of $c = \{-3 \times 10^{-6}, -5 \times 10^{-6}, 1 \times 10^{-5}, 3 \times 10^{-5}\}$ are used for (i), (ii), (iii), and (iv) respectively.
respectively, in the absence of noise. As a reference, we provide the normalized cross-correlation $r'_n$ between the original picture and the transformed one:
$$r'_n = \frac{\sum_{x,y} I_t(x, y)\, I_s(x, y)}{\sqrt{\sum_{x,y} I_t(x, y)^2 \sum_{x,y} I_s(x, y)^2}}$$
In Table 2, the values of $r_n$ (along with $r'_n$) are reported by varying the noise level, for $b_x = 10$ pixels, $\theta = 10°$, and $s = 0.90$, respectively. The values of the transformation parameters $b_x$, $\theta$, $s$ (and the noise level $a$) were selected so as to obtain a set of transformations which correspond to common cases.
In order to make the evaluation of the experimental results easier, in the panels of Fig. 5 we show some examples of transformed images and estimated matchings for $a = 0.15$.
3.2. Pincushion and barrel deformation
The results for the transformation in Eq. (4) are shown in Table 3. We considered both negative and positive values of the parameter $c$, with $c$ drawn from the interval $[-5 \times 10^{-6}, 3 \times 10^{-5}]$. Results of simulations in the presence of noise are given in Table 4, for both the cases $c < 0$ and $c > 0$. In Fig. 6 some examples of barrel and pincushion deformations are shown.
3.3. Examples of MRI images with synthetic
transformations
Experiments on real medical images were carried out in the case of mono-modal intra-subject MRI. Comparing images taken at different times is
Fig. 7. An MRI slice undergoing different transformations. The leftmost column illustrates the case of a rigid translation with $b_x = b_y = 20$. Column (ii) refers to a scaling with $s = 0.80$, column (iii) is a rotation with $\theta = 10°$, and the distortions have (iv) $c = -5 \times 10^{-6}$ and (v) $c = 3 \times 10^{-5}$.
crucial in treatment monitoring (e.g. before and after a surgical intervention), as well as to study the growth of lesions (Maintz and Viergever, 1998; Thirion and Calmon, 1999). In addition, breast MRI is usually based on image subtraction following the injection of a contrast agent, and requires a careful image registration to avoid motion artifacts (Schnabel et al., 2003).
We have tested our method by applying the image transformations in Eqs. (3) and (4) to MRI images. In Fig. 7 we show some examples of the results obtained for an axial slice. The figure shows examples of translation, scaling, rotation, pincushion, and barrel deformations, respectively. Further examples obtained for an image of a different anatomical section are provided in Fig. 8. The values of the correlations $r_n$ and $r'_n$ are given for the two images in Tables 5 and 6, respectively.
Fig. 8. Results for an MRI slice undergoing translation, rotation, scaling, pincushion and barrel transformations using the same parameters as in Fig. 7, with the exception of rotation, which corresponds to $\theta = -10°$.
Table 5
Correlation values for the transformations of the image in Fig. 7

        b_x = b_y = 20   s = 0.8   θ = 10°   c = −5 × 10^−6   c = 3 × 10^−5
r_n     1.000            0.967     0.988     0.921            0.903
r'_n    0.627            0.694     0.825     0.734            0.719
Table 6
Correlation values for the transformations of the image in Fig. 8

        b_x = b_y = 20   s = 0.8   θ = −10°   c = −5 × 10^−6   c = 3 × 10^−5
r_n     1.000            0.961     0.980      0.981            0.966
r'_n    0.709            0.761     0.867      0.768            0.789
4. Conclusion
We have described a general procedure to solve
matching problems for medical image co-registration and fusion. The basic Kohonen SOM neural network has been properly adapted, and its relaxation mechanism is exploited in conjunction with a multi-scale image representation based on Gabor Wavelets.
The proposed method is essentially data-driven
and non-parametric. Consequently, no specific
hypothesis is necessary about the mathematical form of the correspondence law. On the other
hand, the most frequently used co-registration
methods (Maintz and Viergever, 1998) strongly
rely on prior knowledge about the type of the
involved transformation (e.g. rigid, affine, poly-
nomial warping), which, in general, is only approximately known.
In order to evaluate the power of the approach,
we have performed extensive computer simula-
tions on phantom images undergoing known
transformations in the presence of additive noise.
Similar tests have been performed using clinical
MRI images. The results support the idea that a
topology-preserving paradigm can offer consistent estimates of both global and local transformations. Though further experimentation in clinical
settings is necessary, we believe that the method is
adequate for intra-subject image alignment in
mono-modality imaging. In particular, this is
useful when images acquired at different times
must be compared. Other image co-registration
problems (such as in the cases of intra-subject multi-modality imaging, registration of patient data against an anatomical atlas, or inter-subject
comparison) may need further development of the
method as they usually involve more complex and
extended transformations.
It is worth noting that the data in the previous section indicate a slight degradation as the magnitude of the transformation increases. On the other hand, we based our similarity criterion on local morphological features. Consequently, if the images to be matched exhibit noticeable differences, one expects the computed mapping to become less accurate. Nevertheless, the method works rather well for large changes of shape and it is able to deal with both global linear transformations and local non-linear deformations.
We wish to point out that the adopted computational strategy naturally leads to the implementation of the matching algorithm for 3D images, as is often the case in medical imagery: this requires the use of a 3D SOM along with a proper set of 3D image features. A final remark concerns the inherent sequentiality of SOM learning: as this may limit the computation speed, some optimization may be desirable, e.g. via parallel implementation (Hamalainen, 2001).
References
Bellando, J., Kotari, R., 1996. On image correspondence
using topology preserving mappings. In: IEEE Interna-
tional Conference on Neural Networks, pp. 1784–
1789.
Daugman, J., 1988. Complete discrete 2D Gabor transforms
by neural networks for image analysis and compression.
IEEE Trans. Acoust. Speech Signal Process. 36, 1169–
1179.
Hamalainen, T., 2001. Parallel implementations of self-orga-
nizing maps. In: Seifert, U., Jain, L. (Eds.), Self-organizing
Neural Networks. Recent Advances and Applications.
Springer, Heidelberg, pp. 245–278.
Haralick, R., Shapiro, L., 1993. Computer and Robot Vision.
Addison-Wesley, Reading, MA.
Huwer, S., Rahmal, J., Wangenheim, A.V., 1996. Data-driven
registration for local deformations. Pattern Recognition
Lett. 17, 951–957.
Kohonen, T., 2001. Self-organizing Maps. Springer, Berlin.
Konen, W., Maurer, T., von der Malsburg, C., 1994. A fast
dynamic link algorithm for invariant pattern recognition.
Neural Networks 7, 1019–1039.
Lee, T.S., 1996. Image representation using 2D Gabor wave-
lets. IEEE Trans. Pattern Anal. Machine Intell. 18, 959–
971.
Maintz, J., Viergever, M., 1998. A survey of medical image
registration. Med. Image Anal. 2, 1–36.
Musse, O., Heitz, F., Armspach, J., 2001. Topology preserving
deformable image matching using constrained hierarchical
parametric models. IEEE Trans. Image Process. 10, 1081–
1093.
Schnabel, J.A., Tanner, C., Castellano-Smith, A., Degenhard,
A., Leach, M., Hose, D., Hill, D., Hawkes, D., 2003.
Validation of nonrigid image registration using finite-
element methods: application to breast MR images. IEEE
Trans. Med. Imaging 22, 238–247.
Shepp, L., Logan, B., 1974. The Fourier reconstruction of a
head section. IEEE Trans. Nucl. Sci. 21, 692–702.
Spitzer, H., Hochstein, S., 1985. A complex-cell receptive-field
model. J. Neurosci. 5, 1266–1286.
Thirion, J., 1998. Image matching as a diffusion process: an
analogy with Maxwell's demons. Med. Image Anal. 2, 243–260.
Thirion, J.P., Calmon, G., 1999. Deformation analysis to
detect and quantify active lesions in three-dimensional
medical image sequences. IEEE Trans. Med. Imaging 18,
429–441.
van den Elsen, P., Pol, E., Viergever, M., 1993. Medical image
matching: a review with classification. IEEE Eng. Med.
Biol. 12, 26–29.
Wells III, W., Viola, P., Atsumi, H., Nakajima, S., Kikinis, R.,
1996. Multi-modal volume registration by maximization of
mutual information. Med. Image Anal. 1, 35–51.
Wurtz, R., Konen, W., Behrmann, K., 1999. On the perfor-
mance of neuronal matching algorithms. Neural Networks
12, 127–134.