

INSTITUTE OF PHYSICS PUBLISHING PHYSICS IN MEDICINE AND BIOLOGY

Phys. Med. Biol. 46 (2001) R1–R45 www.iop.org/Journals/pb PII: S0031-9155(01)96876-9

TOPICAL REVIEW

Medical image registration

Derek L G Hill, Philipp G Batchelor, Mark Holden and David J Hawkes

Radiological Sciences, King's College London, Guy's Hospital, 5th Floor Thomas Guy House, St Thomas' Street, London SE1 9RT, UK

E-mail: [email protected]

Received 12 June 2000

Abstract
Radiological images are increasingly being used in healthcare and medical research. There is, consequently, widespread interest in accurately relating information in the different images for diagnosis, treatment and basic science. This article reviews registration techniques used to solve this problem, and describes the wide variety of applications to which these techniques are applied. Applications of image registration include combining images of the same subject from different modalities, aligning temporal sequences of images to compensate for motion of the subject between scans, image guidance during interventions and aligning images from multiple subjects in cohort studies. Current registration algorithms can, in many cases, automatically register images that are related by a rigid body transformation (i.e. where tissue deformation can be ignored). There has also been substantial progress in non-rigid registration algorithms that can compensate for tissue deformation, or align images from different subjects. Nevertheless many registration problems remain unsolved, and this is likely to continue to be an active field of research in the future.

Summary of notation

A    An image (reference or target)
B    A second image to be aligned with A
x    A point
{x_i}    A set of points
x_A    A point in image A
A(x_A)    The value of image A at position x_A
T    The spatial transformation (mapping) from B to A
L    A linear transformation
𝒯    The transformation that maps both position and intensity (i.e. takes account of interpolation and sampling)
B^𝒯    Image B transformed into the coordinate space of image A
Ω̃_A    The imaged volume or field of view, i.e. the continuous domain of image A
Ω_A    The discrete domain of image A

0031-9155/01/030001+45$30.00 © 2001 IOP Publishing Ltd Printed in the UK R1


Ω_B    The discrete domain of image B
Ω^T_{A,B}    The overlap domain of images A and B for a given transformation estimate T
N = Σ_{Ω^T_{A,B}} 1    The number of pixels or voxels in the overlap domain of images A and B
Φ_a    An isointensity set (level set) in image A with intensity a
Φ^T_a    The isointensity set of A in the overlap domain
Φ^T_{a,b}    Overlapping isointensity sets
p_i, q_i    Points p, q
p    A probability value (between 0 and 1)
p^T_{AB}(a, b)    Joint probability density function: the probability that a voxel in the overlap domain has intensity a in A and b in B^𝒯
F    Intensity mapping function (makes image A look like image B)
A|_{Ω^T_{A,B}}    Image A in the overlap domain
B^𝒯|_{Ω^T_{A,B}}    Image B in the overlap domain
μ    Mean
σ    Standard deviation
ς    Sample spacing in a discrete image
Γ^ς_A    Sampling grid for image A
E    Euclidean space
ℝ    Real numbers (1.2, π, √2, etc)
∈    In (element of)
∀    For all

1. Introduction

Medical images are increasingly being used within healthcare for diagnosis, planning treatment, guiding treatment and monitoring disease progression. Within medical research (especially neuroscience research) they are used to investigate disease processes and understand normal development and ageing. In many of these studies, multiple images are acquired from subjects at different times, and often with different imaging modalities. In research studies, it is sometimes desirable to compare images obtained from patient cohorts rather than just single subjects imaged multiple times. Furthermore, the amount of data produced by each successive generation of imaging system is greater than the previous generation. This trend is set to continue with the introduction of multislice helical CT scanning and MR imaging systems with higher gradient strengths. There are, therefore, potential benefits in improving the way in which these images are compared and combined. Current clinical practice normally involves printing the images onto radiographic film and viewing them on a light box. Computerized approaches offer potential benefits, particularly by accurately aligning the information in the different images, and providing tools for visualizing the combined images. A critical stage in this process is the alignment or registration of the images, which is the topic of this review article. There have been previous surveys of the medical image registration literature (e.g. Maurer and Fitzpatrick 1993, van den Elsen et al 1993, Maintz and Viergever 1998). This article aims to complement them both by describing some more recent literature and by using notation that makes clear some of the practical difficulties in implementing robust registration techniques. Furthermore, this article focuses discussion on some of the most widely used registration algorithms rather than attempting to provide a comprehensive survey of all the literature in this field. For this reason, some very recently devised algorithms are not described, as it is not yet clear whether they will become widely used.


In this article we describe the main approaches used for the registration of radiological images. The most widely used application of medical image registration is aligning tomographic images, that is, aligning images that sample three-dimensional space with reasonably isotropic resolution. Furthermore, it is often assumed that between image acquisitions, the anatomical and pathological structures of interest do not deform or distort. This 'rigid body' assumption simplifies the registration process, but techniques that make this assumption have quite limited applicability. Many organs do deform substantially, for example with the cardiac or respiratory cycles or as a result of change in position. The brain within the skull is reasonably non-deformable provided the skull remains closed between imaging, and that there is no substantial change in anatomy and pathology, such as growth in a lesion, between scans. Imaging equipment is imperfect, so regardless of the organ being imaged, the rigid body assumption can be violated as a result of scanner-induced geometrical distortions that differ between images. Although the majority of the registration approaches reviewed here have been applied to the rigid body registration of head images acquired using tomographic modalities, there is now considerable research activity aimed at tackling the more challenging problems of aligning images that have different dimensionality (for example projection images with tomographic images), aligning images of organs that deform, aligning images from different subjects, or aligning images in ways that can correct for scanner-induced geometric distortion. This is quite a rapidly moving research field, so the work reviewed in these areas is more preliminary.

For all types of image registration, the assessment of registration accuracy is very important. The required accuracy will vary between applications, but for all applications it is desirable to know both the expected accuracy of a technique and also the registration accuracy achieved on each individual set of images. For one type of registration algorithm, point-landmark registration, the error propagation is well understood. For other approaches, however, the algorithms themselves provide no useful indication of accuracy. The most promising approach to ensuring acceptable accuracy is visual assessment of the registered images before they are used for the desired clinical or research application.

Although this review concentrates on registration of radiological images of the same subject (intrasubject registration), there is some discussion of the closely related topics of intersubject registration, including registration of images of an individual to an atlas, and image-to-physical space registration.

1.1. Types of images

The term 'medical image' covers a wide variety of types of images, with very different underlying physical principles and very different applications. The sorts of images used in healthcare and medical research vary from microscopic images of histological sections to video images used for remote consultation, and from images of the eye taken with a fundus camera to whole body radioisotope images. In principle, medical image registration could involve bringing all the information from a given patient, whatever the form, together into a single representation of the individual that acted like a multimedia electronic patient record with implicit information about the spatial and temporal relationship between all the image information. The huge variety of spatial and temporal resolutions and fields of view of the different images makes this difficult, and the clinical benefit of such an approach has not yet been demonstrated. Recent developments in medical image registration have been driven less by this dream of unifying image information than by the practical desire to make better use of certain types of image information for specific clinical applications or in medical research.


In this article we primarily consider the main radiological imaging modalities. These include traditional projection radiographs, with or without contrast and subtraction, nuclear medicine projection images, ultrasound images and the cross-sectional modalities of x-ray computed tomography (CT), magnetic resonance imaging (MRI), single photon emission computed tomography (SPECT) and positron emission tomography (PET). We refer to these last four modalities (CT, MRI, SPECT and PET) as the tomographic modalities. In many ways these are the easiest modalities from the point of view of image registration. They provide voxel datasets in which the sampling is normally uniform along each axis, though the voxels themselves tend to have anisotropic resolution. In a projection x-ray, each pixel represents the integral of attenuation along one of a set of converging lines through the patient, and the pixels represent a superposition of structures in the patient with varying magnification. Many nuclear medicine images acquired with gamma cameras are parallel projections, in which each pixel represents the integral along one of a set of parallel lines through the patient; each image is therefore a superposition of structures in the patient all at the same magnification (though at different resolutions). The majority of ultrasound images are acquired with a free-hand transducer. Each image is a two-dimensional slice through part of the patient. The images are acquired at a high frame rate, but the spatial relationship between the frames is not recorded. If the transducer is moved in a controlled way or tracked, the relative positions of the frames can be recorded, and a three-dimensional dataset obtained, provided the structures being imaged are not moving or deforming during the acquisition.

Video images are often acquired during surgery, for example using endoscopes or microscopes. For the purpose of image guidance, it can be useful to relate the video images to preoperatively acquired diagnostic images. Video images are, like radiographs and many nuclear medicine images, projections. They differ, however, in that they normally only contain information about the surface of structures in the field of view, rather than a superposition of overlying structures. There is a huge computer vision literature on estimating three-dimensional shapes from one or more video camera views. Video images can be aligned with tomographic images either by first extracting surface structures, or directly.

2. Definitions, notation and terminology

In this article we use the term 'registration' to mean determining the spatial alignment between images of the same or different subjects, acquired with the same or different modalities, and also the registration of images with the coordinate system of a treatment device or tracked localizer. Other authors distinguish between different categories of alignment using the words registration, co-registration and normalization. The term normalization is usually restricted to the intersubject registration situation, and registration and co-registration are often used interchangeably. The algorithms used for all these applications have many features in common, so we prefer to use the term registration for all cases.

The word registration is used with two slightly different meanings. The first meaning is determining a transformation that can relate the position of features in one image or coordinate space with the position of the corresponding feature in another image or coordinate space. We use the symbol T to represent this type of positional registration transformation. The second meaning of registration both relates the position of corresponding features and enables us to compare the intensity at those corresponding positions (e.g. to subtract image intensity values). We use the symbol 𝒯 to describe this second meaning of registration, which incorporates the concepts of resampling and interpolation.

Using the language of geometry, the registration transformation is referred to as a mapping. We can consider the mapping T that transforms a position x from one image to another, or


from one image to the coordinate system of a treatment device (image to physical registration)

T : x_A → x_B  ⇔  T(x_A) = x_B.    (1)

Using this notation, T is a spatial mapping. We also need to consider the more complete mapping 𝒯 that maps both position and associated intensity value from image A to image B. 𝒯 therefore maps an image to an image, whereas T maps between coordinates. If we wish to overlay two images that have been registered, or to subtract one from another, then we need to know 𝒯, not just T. 𝒯 is only defined in the region of overlap of the image fields of view, and has to take account of image sampling and resolution. A(x_A) is the intensity value at the location x_A, and similarly for image B¹. It is important to remember that the medical images A and B are derived from a real object, i.e. the patient. The images have a limited field of view that does not normally cover the entire patient. Furthermore, this field of view is likely to be different for the two images.

We can usefully think of the two images themselves as being mappings of points in the patient within their field of view (or domain Ω) to intensity values

A : x_A ∈ Ω_A → A(x_A)
B : x_B ∈ Ω_B → B(x_B).

Because the images are likely to have different fields of view, the domains Ω_A and Ω_B will be different. This is a very important factor, which accounts for a good deal of the difficulty in devising accurate and reliable registration algorithms. We will return to this issue in section 2.1.

To compare images A and B, we would like them both to be defined at the location x_A, so that we can write B(x_A). This is wrong, however, as B is not defined at location x_A. We therefore introduce the notation B^𝒯 for the image B transformed with a given mapping 𝒯. If 𝒯 accurately registers the images, then A(x_A) and B^𝒯(x_A) will represent the same location in the object to within some error depending on 𝒯. If we did not have to worry about interpolation, we could write B^T(x_A), using only the spatial mapping T, instead of B^𝒯(x_A), but due to the discrete nature of medical images, discussed further in section 2.2, interpolation is necessary for any realistic transformation.
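The role of interpolation in forming the transformed image can be sketched in a few lines of NumPy. This is an illustrative toy, not drawn from the paper: bilinear interpolation in 2D is just one simple choice of interpolant, and the function name `bilinear` is ours.

```python
import numpy as np

def bilinear(img, y, x):
    """Evaluate an image at a non-integer position by bilinear interpolation,
    one simple choice for the interpolation hidden inside the full mapping."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    dy, dx = y - y0, x - x0
    return ((1 - dy) * (1 - dx) * img[y0, x0]
            + (1 - dy) * dx * img[y0, x0 + 1]
            + dy * (1 - dx) * img[y0 + 1, x0]
            + dy * dx * img[y0 + 1, x0 + 1])

# A toy 4x4 image and a half-voxel translation: the transformed sample point
# (1.5, 1.5) falls between B's grid points, so its value must be interpolated.
B = np.arange(16, dtype=float).reshape(4, 4)
val = bilinear(B, 1.5, 1.5)   # average of B[1,1], B[1,2], B[2,1], B[2,2] = 7.5
```

For a realistic 3D transformation the same idea extends to trilinear interpolation over the eight neighbouring voxels.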

As the images A and B represent one object X, imaged with the same or different modalities, there is a relation between the spatial locations in A and B. Modality A is such that position x ∈ X is mapped to x_A, and modality B maps x to x_B. The registration process involves recovering the spatial transformation T which maps x_A to x_B over the entire domain of interest, i.e. which maps from Ω_A to Ω_B within the overlapping portion of the domains. We refer to this overlap domain as Ω^T_{A,B}. This notation makes it clear that the overlap domain depends on the domains of the original images A and B, and also on the spatial transformation T. The overlap domain is the set of positions in the domain of image A that are also in the domain of image B after transformation, and can be defined as:

Ω^T_{A,B} = {x_A ∈ Ω_A | T⁻¹(x_A) ∈ Ω_B}.

As stated earlier, the transformation 𝒯 maps both position, and intensity at that position, from one image to another, taking account of issues to do with sampling and interpolation. It is important to emphasize, however, that 𝒯 is not an intensity mapping: it does not make image B look like image A by giving a position in image B^𝒯(x_A) the same intensity as A(x_A). We use the symbol F to represent that mapping of intensities from one image to another. For two images differing only by noise, F will be the identity. In general F will be a spatially varying (non-stationary) function that is not monotonic.
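The definition of the overlap domain can be made concrete with a small NumPy sketch. This is a toy assuming a pure 2D translation, so that T⁻¹(x_A) = x_A − t; the function name `overlap_domain` is ours, not from the paper.

```python
import numpy as np

def overlap_domain(shape_a, shape_b, t_inv):
    """Boolean mask over image A's grid marking the overlap domain: the
    points x_A whose pre-image under T falls inside image B's domain."""
    ys, xs = np.mgrid[0:shape_a[0], 0:shape_a[1]]
    pts = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    back = pts + t_inv   # T is a pure translation by t here, so T^{-1} adds -t
    inside = ((back >= 0) & (back <= np.array(shape_b) - 1)).all(axis=1)
    return inside.reshape(shape_a)

# A and B are both 5x5; T shifts B by 2 rows relative to A, so the overlap
# domain is the last three rows of A's grid (15 of the 25 voxels).
mask = overlap_domain((5, 5), (5, 5), np.array([-2.0, 0.0]))
N = int(mask.sum())   # number of voxels in the overlap domain
```

Running the toy gives N = 15, and shrinking or growing the translation changes N, which is exactly the T-dependence of Ω^T_{A,B} discussed above.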

¹ A(x_A) and B(x_B) are normally scalars, but in some circumstances can be vectors (e.g. flow) or tensors (e.g. diffusion). Non-scalar voxel values can add further complications to image transformation which are not addressed in this article.


Registration algorithms that make use of geometrical features in the images, such as points, lines and surfaces, determine the transformation T by identifying features such as sets of image points {x_A} and {x_B} that correspond to the same physical entity visible in both images, and calculating T for these features. When these algorithms are iterative, they iteratively determine the spatial transformation T, and then infer the full mapping 𝒯 from T when the algorithm has converged.

Registration algorithms that are based on image intensity values work differently. They iteratively determine the image transformation 𝒯 that optimizes a voxel similarity measure. Many of these voxel similarity measures involve analysing isointensity sets (or level sets) within the images. For a single image A, an isointensity set with intensity value a is the set of voxels within a subdomain of A, such that

Φ_a = {x_A ∈ Ω_A | A(x_A) = a}.    (2)

This equation, put into words, states 'all locations x_A in the field of view of image A for which the intensity value is a'.

Some algorithms do not work on isointensity sets corresponding to a single intensity value, but on isointensity sets corresponding to small groups, or bins, of intensities. For example, the 4096 intensity values of a 12-bit image may be grouped into 256 bins of 16 adjacent values each. We use a to mean either individual intensities or intensity bins, as appropriate.
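As a toy illustration of this binning (plain NumPy, with a randomly generated stand-in for a 12-bit image):

```python
import numpy as np

# Hypothetical example: a random "12-bit" image with intensities 0..4095.
rng = np.random.default_rng(0)
img = rng.integers(0, 4096, size=(64, 64))

# Group the 4096 possible intensities into 256 bins of 16 adjacent values:
# the bin index a = intensity // 16 then plays the role of the intensity.
binned = img // 16

# The binned isointensity set for a given a is the set of voxel positions
# whose intensity falls in bin a.
a = 100
phi_a = np.argwhere(binned == a)
```

Binning in this way trades intensity resolution for better-populated sets, which matters when similarity measures are estimated from histograms of a finite overlap domain.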

It is important to remember that Φ_a is the isointensity set within all of image A, that is, within the domain Ω_A. As has been stated above, for registration using voxel similarity measures, we work within the overlap domain Ω^T_{A,B}. The isointensity set within this overlap domain is, of course, a function of T. To emphasize this T dependence, we define the isointensity set of image A with value a, within Ω^T_{A,B}, as

Φ^T_a = {x_A ∈ Ω^T_{A,B} | A(x_A) = a}.    (3)

Similarly, we can consider an isointensity set in image B. Image B, of course, is always the image that we consider transformed, so the definition is slightly different from that for image A. We consider the isointensity set to be the set of voxels in the space of A that have intensity b in image B^𝒯

Φ^T_b = {x_A ∈ Ω^T_{A,B} | B^𝒯(x_A) = b}.    (4)

2.1. Consequences of different image fields of view

For the 3D intrasubject registration task that we are focusing on here, the object being imaged is the same for images A and B, and, while the domains Ω_A and Ω_B are different in extent and position, they are both three dimensional. The overlap domain Ω^T_{A,B} is, in general, smaller than either Ω_A or Ω_B, and is also a function of the transformation T. The latter point is important and sometimes overlooked. For registration algorithms that make use of corresponding geometrical features, the difference in field of view of images A and B can cause difficulties, as features identified in one image may not be present in the second. The dependence of Ω^T_{A,B} on T as the algorithm iterates is, however, less important. Greater difficulties arise for registration algorithms that make use of image intensity values to iteratively determine T. The isointensity sets used by these algorithms are the sets of fixed intensities within Ω^T_{A,B}. Since Ω^T_{A,B} changes with T, algorithms that are too sensitive to changes in Ω^T_{A,B} may be unreliable.

The difficulty caused by the different fields of view of images A and B is further illustrated by considering an approach to rigid body registration called the method of moments (Faber and Stokely 1988). When applying this to images of a part of the body, e.g. the head, that part of the body is first delineated from images A and B using a segmentation algorithm, giving the binary volumes O_A and O_B. The images can then be registered by first aligning the


centroids (from the first-order moment) of O_A and O_B, and then aligning the principal axes of O_A and O_B (from the second-order moment). This approach is, however, unsatisfactory for most medical image registration applications because the first- and second-order moments are highly sensitive to change in the image field of view. In order for this method to work accurately, the object used for the calculations must be entirely within Ω^T_{A,B}, and it is frequently difficult to delineate appropriate structures with this property from clinical images.
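A minimal sketch of the moment calculations (NumPy, using a 2D binary object for brevity; the helper name `centroid_and_axes` is ours, not from Faber and Stokely):

```python
import numpy as np

def centroid_and_axes(obj):
    """First-order moment (centroid) and principal axes (eigenvectors of the
    second-order central moments, i.e. the covariance of voxel positions)
    of a binary object."""
    pts = np.argwhere(obj).astype(float)
    c = pts.mean(axis=0)                  # centroid
    cov = np.cov((pts - c).T)             # second-order central moments
    evals, evecs = np.linalg.eigh(cov)    # principal axes = eigenvectors
    order = np.argsort(evals)[::-1]       # largest-variance axis first
    return c, evecs[:, order]

# Toy example: a 3x7 rectangle of ones. Its centroid is at (5, 5) and its
# long axis (the first principal axis) runs along the column direction.
obj = np.zeros((11, 11))
obj[4:7, 2:9] = 1
c, axes = centroid_and_axes(obj)
```

Registering two such objects then amounts to translating one centroid onto the other and rotating the principal axes into alignment; the sketch also makes the field-of-view sensitivity obvious, since cropping `obj` changes both moments immediately.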

To emphasize the importance of this overlap domain in image registration, we introduce notation to make this clear. A|_{Ω^T_{A,B}} and B^𝒯|_{Ω^T_{A,B}} are the portions of images A and B^𝒯 respectively in the overlap domain. The use of the A|_{Ω^T_{A,B}} and B^𝒯|_{Ω^T_{A,B}} notation is equivalent to A(x_A) ∀ x_A ∈ Ω^T_{A,B} and B^𝒯(x_A) ∀ x_A ∈ Ω^T_{A,B}.

2.2. The discrete nature of the images

A further important property of the medical images with which we work is that they are discrete. That is, they sample the object at a finite number of voxels in three dimensions or pixels in two dimensions. In general, this sampling is different for images A and B; also, while the sampling is commonly uniform in a given direction, it is anisotropic, that is, it varies with direction in the image. Discretization has important consequences for image registration, so it is useful to build this concept into our notational framework.

We can define our domain Ω as

Ω = Ω̃ ∩ Γ^ς

where Ω̃ is a bounded continuous set, and could be called the volume or field of view of the image, and Γ is an infinite discrete grid. Γ^ς is our sampling grid, which is characterized by the anisotropic sample spacing ς = (ς_x, ς_y, ς_z). The sampling is normally different for the images A and B being registered, and we denote this by introducing sampling grids Γ^{ς_A} and Γ^{ς_B} for the domains Ω_A and Ω_B. For any given T, the intersection of the discrete domains Ω_A and T(Ω_B) is likely to be the empty set, because no sample points will exactly overlap. In order, therefore, to compare the images A and B for any estimate of T, it is necessary to interpolate between sample positions and to take account of the difference in sample spacings ς_A and ς_B. This introduces two problems. Firstly, fast interpolation algorithms are imperfect, introducing blurring or ringing into the image (discussed further in section 9). This changes the image histograms, and hence alters the isointensity sets discussed above. Secondly, we must be careful when the image B being transformed has higher-resolution sampling than image A (i.e. one or more of the elements in ς_B is less than the corresponding element in ς_A). In this case, we risk aliasing when we resample B from Ω_B to Ω^T_{A,B}, so we should first blur B with a filter of resolution ς_A or lower before resampling.
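The blur-then-resample recipe can be illustrated in 1D. This is a deliberately crude sketch: a moving-average filter stands in for a proper low-pass filter matched to ς_A, and the factor-of-two subsampling stands in for resampling onto A's coarser grid.

```python
import numpy as np

def box_blur_1d(signal, width):
    """Crude low-pass filter (moving average) standing in for the blur of B
    to image A's resolution before resampling onto the coarser grid."""
    kernel = np.ones(width) / width
    return np.convolve(signal, kernel, mode="same")

# B sampled twice as finely as A: blur first, then keep every second sample
# to land on A's coarser sampling grid, reducing the risk of aliasing.
fine = np.sin(np.linspace(0.0, 8.0 * np.pi, 200))
blurred = box_blur_1d(fine, 2)   # suppress detail finer than A can represent
coarse = blurred[::2]            # naive resampling after the pre-blur
```

Subsampling `fine` directly, without the pre-blur, would fold any detail above the coarser grid's Nyquist frequency back into the result as aliasing artefacts.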

As stated earlier, the transformation 𝒯 maps both positions and intensities at these positions. 𝒯, therefore, has to take account of the discrete sampling. The spatial mapping T, in contrast, does not take account of this effect. When we use 𝒯 as a superscript of a symbol, as in B^𝒯, we are making it clear that the quantity represented by this symbol is dependent on both the spatial mapping T and the interpolation or blurring used during resampling. Figure 1 illustrates the relationship between field of view, domain of the image and the registration transformation.


Figure 1. Two images A and B illustrated in the top panel of this figure have different fields of view Ω̃_A and Ω̃_B. For a registration transformation T there will be a region of overlap between the images, as illustrated in the third panel. It is important to remember that the images are discrete. The discretization is determined by the sampling grids Γ^{ς_A} and Γ^{ς_B} shown in the fourth panel. Even if images A and B have exactly the same sampling grid, the grid points will not normally coincide in the volume of overlap, as shown in the fifth panel. Interpolation is therefore necessary. For iterative registration algorithms, this interpolation is necessary at each iteration. The image transformation 𝒯 described in the text maps both position and the associated intensity in the images within the region of overlap, incorporating the concepts of field of view and discrete sampling.

3. Types of transformation

We have not, so far, discussed the nature of the transformation T. For most current applications of medical image registration, both Ω_A and Ω_B are three dimensional, so T transforms from 3D space to 3D space. In some circumstances, such as the registration of a radiograph to a CT scan, it is useful to consider transformations from 3D space to 2D space (or vice versa). Where both images being registered are two dimensional (e.g. two radiographs before and after contrast is


injected), the appropriate transformation might be from 2D space to 2D space². Most medical image registration algorithms additionally assume that the transformation is 'rigid body', i.e. there are six degrees of freedom (or unknowns) in the transformation: three translations and three rotations. The key characteristic of a rigid body transformation is that all distances are preserved.

Some registration algorithms increase the number of degrees of freedom by allowing for anisotropic scaling (giving nine degrees of freedom) and skews (giving 12 degrees of freedom). A transformation that includes scaling and skews as well as the rigid body parameters is referred to as affine, and has the important characteristics that it can be described in matrix form and that all parallel lines are preserved. A rigid body transformation can usefully be considered as a special case of affine, in which the scaling values are all unity and the skews all zero.
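These transformation classes are easy to check numerically. The sketch below (2D for brevity, so the rigid case has three degrees of freedom rather than six; the helper `rigid_2d` and the test points are ours) confirms that a rigid transformation preserves distances while an anisotropic scaling does not:

```python
import numpy as np

def rigid_2d(theta, tx, ty):
    """Homogeneous matrix for a 2D rigid-body transformation: one rotation
    plus two translations (3 DOF in 2D; the 3D case has 6 DOF)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0.0, 0.0, 1.0]])

# The defining property of a rigid transformation: all distances are preserved.
T = rigid_2d(0.3, 5.0, -2.0)
p = np.array([1.0, 2.0, 1.0])   # homogeneous coordinates
q = np.array([4.0, 6.0, 1.0])
d_before = np.linalg.norm((p - q)[:2])          # 5.0
d_after = np.linalg.norm((T @ p - T @ q)[:2])   # still 5.0 (up to rounding)

# An affine matrix with anisotropic scaling keeps parallel lines parallel
# but does not preserve distances:
S = np.diag([2.0, 0.5, 1.0])
d_scaled = np.linalg.norm((S @ p - S @ q)[:2])
```

Appending a skew term to `S` would give the full 12-DOF affine case in 3D, still expressible as a single homogeneous matrix.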

Individual bones are rigid at the resolution of radiological imaging modalities, so rigid body registration is widely used in medical applications where the structures of interest are either bone or are enclosed in bone. By far the most important part of the body registered in this way is the head, and in particular the brain. Rigid body registration is used for other regions of the body in the vicinity of bone (e.g. the neck, pelvis, leg or spine), but the errors are likely to be larger.

The use of an affine transformation rather than a rigid body transformation does not greatly increase the applicability of image registration, as there are not many organs that only stretch or shear. Tissues usually deform in more complicated ways. There are, however, several scanner-introduced errors that can result in scaling or skew terms, and affine transformations are sometimes used to overcome these problems (see section 4).

For most organs in the body, many more degrees of freedom are necessary to describe the tissue deformation with adequate accuracy. Even in the brain, development in children, lesion growth or resection can make affine transformations inadequate. Also, it is common in neuroscience research to align images from different subjects. This is called intersubject registration and is discussed in section 10.4.3. While an affine transformation is widely used to provide an approximate alignment of different subjects, additional degrees of freedom are needed for more accurate registration. These non-affine registration transformations are not the main focus of this article, but some approaches for both intrasubject and intersubject registration are described in section 10.4.

3.1. Linear transformations

Many authors refer to affine transformations as linear. This is not strictly true, as a linear map is a special map L which satisfies

L(αx + βx′) = αL(x) + βL(x′)    ∀ x, x′ ∈ R^D.

The translational parts of affine transformations violate this. An affine map is more correctly described as the composition of a linear transformation with a translation.

Furthermore, reflections are linear transformations, but they are normally undesirable in medical image registration. For example, if a registration algorithm used in image-guided neurosurgery calculated a reflection as part of the transformation it might result in a patient having a craniotomy on the wrong side of his or her head. If there is any doubt about whether an algorithm might have calculated a reflection, this should be checked prior to use. Since affine transformations can be represented in matrix form, a reflection can be detected simply from

2 Although the images being registered may be 2D, the structures being imaged are often free to translate, rotate and deform in three dimensions.



a negative value of the determinant of this matrix. When we refer to affine transformations elsewhere in this article we exclude reflections, as they are undesirable in this application.
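The determinant check just described can be sketched in a few lines; this is an illustrative fragment, not code from the article:

```python
import numpy as np

# Sketch: detecting a reflection in a computed affine or rigid matrix.
# A negative determinant of the 3x3 linear part of the homogeneous
# matrix indicates a reflection, which should be rejected before use.

def contains_reflection(matrix):
    """True if the linear part of a homogeneous transform reflects."""
    return np.linalg.det(np.asarray(matrix)[:3, :3]) < 0

proper_rotation = np.eye(4)                 # identity: no reflection
mirror = np.diag([-1.0, 1.0, 1.0, 1.0])     # flips the x axis

assert not contains_reflection(proper_rotation)
assert contains_reflection(mirror)
```

Such a guard would sit between the optimizer and any clinical use of the transformation, e.g. in an image-guided surgery system.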

3.2. One-to-one transformations

For intrasubject registration, the object being imaged with the same or different modalities is the same patient. It would at first seem likely that the desired transformation T should be one-to-one (bijective). This means that each location in image B gets transformed to a single point in image A and vice versa. There are several situations in which this does not happen. Firstly, if the dimensionality of the images is different, such as in the registration of a radiograph to a CT scan, a one-to-one transformation is impossible. The differences in image field of view and sampling discussed in sections 2.1 and 2.2 also complicate this issue.

For various types of non-affine registration, a one-to-one transformation is not desirable. For example, in registration of images from different subjects, or of the same subject before and after surgery, there may be structures in image A that are absent from image B.

4. Image distortion

4.1. Geometric distortion

It is important to realize that even when the object being imaged two or more times is rigid, there may be distortions in the imaging process that mean that the correct transformation between the coordinate spaces Ω_A of image A and Ω_B of image B requires additional degrees of freedom. There are many different causes of geometrical distortion in radiological imaging. The type of distortion in the images depends on the underlying physics of the acquisition, so will vary between the different modalities. It is theoretically possible to correct geometrical distortion. In practice, however, it may be difficult to get the information needed to do this routinely.

The simplest forms of geometric distortion result in scaling and skewing of the coordinate space, and can therefore be described using an affine transformation. Scaling distortion can arise as a result of miscalibration of the voxel dimensions of the images being registered. Skew distortion is most common in CT images, where a tilted gantry rotates the image plane with respect to the axis of the bed, introducing a skew into the images. The gantry tilt angle may not be accurately known, so the skew error, like the scaling error, could be an unknown degree of freedom in the registration transformation T.

Pin-cushion and barrel distortion lead to a smoothly varying geometrical distortion across the field of view of the images, which is largest at the periphery.

Magnetic resonance images are especially liable to distortion. If the gradients are miscalibrated, then the voxel dimensions will be incorrect, and a scaling error will be present in the data. This can be corrected using phantom measurements (e.g. Hill et al 1998, Lemieux and Barker 1998). If the static magnetic field B0 is not uniform over the subject, then the magnetic field gradients applied during imaging will not result in a linear variation in resonant frequency with position, leading to distortion. The main field inhomogeneity is dependent on the object being imaged, and in particular on local susceptibility changes. The spatial error due to this inhomogeneity is dependent on the field inhomogeneity as a proportion of the imaging gradient strength. The distortion is reduced by increasing the gradient strength, which is equivalent to increasing the bandwidth per pixel. For conventional spin-warp MR imaging (2D multislice or 3D), this distortion is greatest in the read-out direction, and absent in the phase-encode direction(s), where the bandwidth per pixel is essentially infinite because the lines are acquired from different excitations (Chang and Fitzpatrick 1992, Sumanaweera et al 1994). For echo



planar imaging, the bandwidth per pixel is lowest in the phase-encode (or blip) direction, so the distortion is highest in that direction (Jezzard and Balaban 1995). For two-dimensional images, the field inhomogeneity results in the excitation of slices that are curved rather than planar. The magnet-dependent inhomogeneity can be measured with a phantom experiment, but to correct for the object-dependent field inhomogeneity it is necessary to make additional measurements during imaging. For spin-echo images, this can be done by making two measurements with different readout gradient strengths (or opposite gradient directions) (Chang and Fitzpatrick 1992). For gradient echo images (Sumanaweera et al 1994) and echo planar images (Jezzard and Balaban 1995), distortion can be inferred from a map of field inhomogeneity.
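The dependence on bandwidth per pixel can be illustrated with a back-of-the-envelope calculation. The numbers below are assumptions chosen for illustration, not values from the text: the spatial shift, in pixels, is approximately the local off-resonance frequency divided by the bandwidth per pixel.

```python
# Illustrative arithmetic: the B0-inhomogeneity shift in pixels is
# roughly the local off-resonance frequency (Hz) divided by the
# bandwidth per pixel (Hz). All numbers here are assumed examples.

def shift_in_pixels(off_resonance_hz, bandwidth_per_pixel_hz):
    return off_resonance_hz / bandwidth_per_pixel_hz

# Spin-warp read-out direction: high bandwidth, small distortion.
spin_warp = shift_in_pixels(100.0, 200.0)   # 0.5 pixel
# EPI phase-encode (blip) direction: low bandwidth, large distortion.
epi = shift_in_pixels(100.0, 20.0)          # 5 pixels

assert spin_warp < epi
```

This is why the same field inhomogeneity that is negligible in a conventional read-out can displace structures by several pixels in echo planar images.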

Patient motion during either CT or multislice MR imaging can result in a variation in slice orientation relative to the patient across the scan. This is currently an unsolved problem, and makes registration of such datasets very difficult. In the rigid body case, a different transformation is needed for each slice (or, in some MR images, for groups of slice interleaves), and this transform is hard to find.

Motion during MR imaging within the acquisition of a single slice or an entire 3D volume results in a different problem, namely ghost artefacts. This can result in one or more ‘ghosts’ of the object appearing in the image along with the main image of the object. These ghosts normally have higher spatial frequency content than the main image, but there is a different registration transformation needed for each ghost. Just as the geometric distortion described above can be corrected, these ghost artefacts can be removed (e.g. Atkinson et al 1999, McGee et al 2000), but this is not routinely performed.

With emission tomography (PET and SPECT), an important cause of distortion errors can be poor alignment of the detector heads in multidetector systems, or uncertainty in the centre of rotation. These lead to a halo artefact that distorts the true distribution. Also, in PET imaging it is important to calibrate the voxel dimensions, as some reconstruction algorithms assume that all photons are detected at the face of the scintillator crystals, but the mean free path of photons in these materials can be 1 cm or so, giving scaling errors in the data.

4.2. Intensity distortion

In addition to the geometric distortion described above, intensity distortion is common in images. This results in the same tissue in different places appearing in the image with varying intensity. Since the intensity distortion is likely to be different between images (even images of the same modality acquired at different times), this effect results in the intensity mapping F being non-stationary (i.e. changing over the image). The effect is common in MR imaging, where the shading is caused primarily by inhomogeneity in the RF (or B1) field, and is also present in radiographs, where the heel effect results in a slow variation in intensity across the image. The MR intensity distortion can sometimes be corrected if appropriate assumptions are made about the image (e.g. Guillemaud and Brady 1997, Sled et al 1998).

5. Rigid body registration algorithms using geometric features

In this section we describe registration algorithms that determine T using image features that have been extracted from the images either interactively or automatically. The most widely used of these features in this application are points and surfaces, though crest lines and extremal points identified using differential geometric operators are also used. An alternative approach to registration using geometrical features is to generate derived images from A and B that represent the strength of a feature (such as a ridge) at each voxel and then to register these derived images using a voxel similarity measure. Such approaches are described in section 7.1.



5.1. Points and the Procrustes problem

Point-based registration involves identifying corresponding three-dimensional points in the images to be aligned, registering the points and inferring the image transformation from the transformation determined from the points. Using the notation introduced in section 2, we want to find points {x_A^i}, i = 1 ... N, in image A and {x_B^i}, i = 1 ... N, in image B corresponding to the set of features {x^i}, i = 1 ... N, in the object, and find the transformation that aligns them in a least squares sense. The corresponding points are sometimes called homologous landmarks to emphasize that they should represent the same feature in the different images.

5.1.1. The orthogonal Procrustes problem. The orthogonal Procrustes problem draws its name from the Procrustes area of statistics. Procrustes was a robber in Greek mythology. He would offer travellers hospitality in his road-side house, and the opportunity to stay the night in his bed that would perfectly fit each visitor. As the visitors discovered to their cost, however, it was the guest who was altered to fit the bed, rather than the bed that was altered to fit the guest. Short visitors were stretched to fit, and tall visitors had suitable parts of their body cut off so they would fit. The result, it seems, was invariably fatal. The hero Theseus put a stop to this unpleasant practice by subjecting Procrustes to his own method.

The term ‘Procrustes’ became a criticism for the practice of unjustifiably forcing data to look like they fit another set. More recently, Procrustes statistics has lost its negative associations and is used in shape analysis. The mathematical problem has relevance in many domains including statistics (Green 1952, Hurley and Cattell 1962, Koschat and Swayne 1991, Schonemann 1966, Rao 1980), gene recognition (Gelfand et al 1996), satellite positioning and robotics (Kanatani 1994, Umeyama 1991) in addition to its interest in numerical mathematics as a least squares problem (Golub and van Loan 1996, Edelman et al 1998, Soderkvist 1993, Stewart 1993). The Procrustes problem is an optimal fitting problem of least squares type: given two configurations of N non-coplanar points P = {p_i} and Q = {q_i}, one seeks the transformation T which minimizes G(T) = ||T(P) − Q||². The notation is: P, Q are the N-by-D matrices whose rows are the coordinates of the points p_i, q_i; T(P) is the corresponding matrix of transformed points; and || · || is a matrix norm, the simplest being the Frobenius norm, for which ||T(P) − Q|| = (∑_i ||T(p_i) − q_i||²)^{1/2}. The standard case is when T is a rigid body transformation (Dryden and Mardia 1998, Fitzpatrick et al 1998b). One can additionally consider scaling (Dryden and Mardia 1998). If T is affine, we are faced with a standard least squares problem (Golub and van Loan 1996).

5.1.2. Solutions. The classical Procrustes problem, i.e. T ∈ {rigid body transformations}, has known solutions. A matrix representation of the rotational part can be computed using singular value decomposition (SVD) (Dryden and Mardia 1998, Golub and van Loan 1996, Kanatani 1994, Schonemann 1966, Umeyama 1991).

First replace P and Q by their demeaned versions, as the optimal transformation maps centroid to centroid:

p_i → p_i − p̄        q_i → q_i − q̄.

This reduces the problem to the orthogonal Procrustes problem in which we wish to determine the orthogonal rotation R. Central to the problem is the D-by-D correlation matrix K := P^t Q, as this matrix quantifies how much the points in Q are ‘predicted’ by points in



P. If P = [p_1^t, ..., p_N^t]^t is a matrix of row vectors (and the same for Q), then K = ∑_i K_i where K_i := p_i q_i^t, and

K = U D V^t  ⇒  R = V Δ U^t        Δ := diag(1, 1, det(V U^t))

where K = U D V^t is the SVD of K. It is essential for most medical registration applications that R does not include any reflections. This can be detected from the determinant of V U^t, which should be +1 for a rotation with no reflection, and will be −1 if there is a reflection. In the above equation, Δ takes this into account.

Finally, the translation t is given by t = q̄ − R p̄. This approach has been widely used in medical image registration, first for multimodality registration (e.g. Evans et al 1988, Hill et al 1991) and more recently in image-guided surgery (e.g. Maurer et al 1997). The points can either be anatomical features that can be identified in 3D, or markers attached to the patient. The theory of errors has been advanced in the medical application domain through the work of Fitzpatrick et al (1998b).
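The SVD solution above can be sketched directly. This is an illustrative implementation following the demeaning, K = P^t Q, and Δ = diag(1, 1, det(VU^t)) steps described in the text:

```python
import numpy as np

# Sketch of the orthogonal Procrustes solution: demean both point sets,
# form K = P^t Q, take its SVD, and build the rotation with det(VU^t)
# folded in so that no reflection can appear in R.

def procrustes_rigid(P, Q):
    """Least-squares rigid transform mapping points P (N x D) onto Q."""
    p_bar, q_bar = P.mean(axis=0), Q.mean(axis=0)
    Pc, Qc = P - p_bar, Q - q_bar
    K = Pc.T @ Qc                       # D x D correlation matrix
    U, _, Vt = np.linalg.svd(K)
    V = Vt.T
    Delta = np.diag([1.0] * (P.shape[1] - 1) + [np.linalg.det(V @ U.T)])
    R = V @ Delta @ U.T                 # guaranteed det(R) = +1
    t = q_bar - R @ p_bar
    return R, t

# Recover a known rotation and translation from noiseless points.
rng = np.random.default_rng(0)
P = rng.standard_normal((10, 3))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
Q = P @ R_true.T + np.array([1.0, -2.0, 0.5])
R, t = procrustes_rigid(P, Q)
assert np.allclose(R, R_true)
```

The Δ term is exactly the reflection guard discussed in section 3.1: without it, the SVD can return an orthogonal matrix with determinant −1.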

5.2. Surface matching

Boundaries, or surfaces, in medical images tend to be more distinct than landmarks, and various segmentation algorithms can successfully locate such high-contrast surfaces. This is especially true of the skin surface—the boundary between tissue and air—which is high contrast in most imaging modalities, with the important exceptions of certain tracers in nuclear medicine emission tomography and some echo planar MR images. If equivalent surfaces can be automatically segmented from two images to be combined, then rigid body registration can be achieved by fitting the surfaces together. The surface matching algorithms described below are normally only used for rigid body registration.

5.2.1. The head-and-hat algorithm. Pelizzari and colleagues (Pelizzari et al 1989, Levin et al 1988) proposed a surface fitting technique for intermodality registration of images of the head that became known as the ‘head-and-hat’ algorithm. Two equivalent surfaces are identified in the images. The first, from the higher-resolution modality, is represented as a stack of discs, and is referred to as the head. The second surface, the hat, is represented as a list of unconnected 3D points. The registration transformation is determined by iteratively transforming the (rigid) hat surface with respect to the head surface, until the closest fit of the hat onto the head is found. The measure of closeness of fit used is the square of the distance between a point on the hat and the nearest point on the head, in the direction of the centroid of the head. The iterative optimization technique used is the Powell method (Press et al 1992). The Powell optimization algorithm performs a succession of one-dimensional optimizations, finding in turn the best solution along each of the six degrees of freedom, and then returning to the first degree of freedom. The algorithm stops when it is unable to find a new solution with a significantly lower cost (as defined by a tolerance factor) than the current best solution. This algorithm has been used with considerable success for registering images of the head (Levin et al 1988), and has also been applied to the heart (Faber et al 1991). The surfaces most commonly used are the skin surface delineated from both MR images and PET transmission images, or the brain surface delineated from both MR images and PET emission images. The measure of goodness of fit can be prone to error for convoluted surfaces: the distance metric used by the head-and-hat algorithm does not always measure the distance between a hat point and the closest point on the head surface, because the nearest head point will not always lie in the direction of the head centroid, especially if the surface is convoluted.



5.2.2. Distance transforms. A modification to the head-and-hat algorithm is to use a distance transform to preprocess the head images. A distance transform is applied to a binary image in which object pixels (or voxels) have the value 1 and other voxels have the value 0. It labels all voxels in this image with their distance from the surface of the object. By prelabelling all image voxels in this way, the cost per iteration can be substantially reduced (potentially to a single address manipulation and accumulation for each transformed hat surface point). A widely used distance transform is the chamfer filter proposed by Borgefors (1984). This approach was used for rigid body medical image registration (e.g. Jiang et al 1992, van den Elsen 1992, van Herk and Kooy 1994). More recently, exact Euclidean distance transforms have been used in place of the chamfer transform (Huang and Mitchell 1994).

Given estimates for the six degrees of freedom of the rigid body transformation, the hat points are transformed and their distances from the head surface are calculated from the values in the relevant voxels in the distance transform. These values are squared and summed to calculate the cost associated with the current transformation estimate. There remains a risk of finding local minima, so a pyramidal multiscale representation and outlier rejection are desirable (Jiang et al 1992).
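The cost evaluation can be sketched as follows. For clarity this toy example computes the distance map by brute force on a tiny 2D image rather than with a chamfer or exact Euclidean transform, and the ‘hat’ points and translation are illustrative assumptions:

```python
import numpy as np

# Sketch of distance-transform-based cost evaluation: precompute, for a
# small binary "head" image, each pixel's distance to the object; the
# cost of a candidate transformation is then a sum of squared look-ups.
# A real implementation would use a chamfer (Borgefors 1984) or exact
# Euclidean transform; brute force is used here only for clarity.

head = np.zeros((16, 16), dtype=bool)
head[4:12, 4:12] = True                 # toy square "head" object

surface = np.argwhere(head)             # crude: all object pixels
grid = np.argwhere(np.ones_like(head))
# Brute-force distance of every pixel to the nearest object pixel.
d = np.sqrt(((grid[:, None, :] - surface[None, :, :]) ** 2).sum(-1))
distance_map = d.min(axis=1).reshape(head.shape)

def cost(hat_points, translation):
    """Sum of squared distance look-ups for translated hat points."""
    moved = np.round(hat_points + translation).astype(int)
    moved = np.clip(moved, 0, np.array(head.shape) - 1)
    return float((distance_map[moved[:, 0], moved[:, 1]] ** 2).sum())

hat = np.array([[4, 4], [4, 11], [11, 4], [11, 11]])  # aligned corners
assert cost(hat, np.array([0, 0])) == 0.0
assert cost(hat, np.array([3, 0])) > 0.0
```

Each cost evaluation is just one array look-up per hat point, which is the speed advantage over recomputing nearest-surface distances at every iteration.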

5.2.3. Iterative closest point. The iterative closest point (ICP) algorithm was proposed by Besl and McKay (1992) for the registration of 3D shapes. It was not designed with medical images in mind, but has subsequently been applied to medical images with considerable success, and is now probably the most widely used surface matching algorithm in medical imaging applications (e.g. Cuchet et al 1995, Declerck et al 1997, Maurer et al 1998). The original article is written in terms of registration of collected data to a model. The collected data, P, could come from any sensor that provides three-dimensional surface information, including laser scanners, stereo video and so forth. The model data, X, could come from a computer-aided design model. In medical imaging applications, both sets of surface data might be delineated from radiological images, or the model might be derived from a radiological image and the data from stereo video acquired during an operation. The algorithm is designed to work with seven different representations of surface data: point sets, line segment sets (polylines), implicit curves, parametric curves, triangle sets, implicit surfaces and parametric surfaces. For medical image registration the most relevant representations are likely to be point sets and triangle sets, as algorithms for delineating these from medical images are widely available.

The algorithm has two stages and iterates. The first stage involves identifying the closest model point for each data point, and the second stage involves finding the least squares rigid body transformation relating these point sets. The algorithm then redetermines the closest point set and continues until it finds the local minimum match between the two surfaces, as determined by some tolerance threshold.

Whatever the original representation of the data surface P, it is first converted to a set of points {p_i}. The model data remain in their original representation. The first stage involves identifying, for each point p_i on the data surface P, the closest point on the model surface X. This is the point x in X for which the distance d between p_i and x is minimum:

d(p_i, X) = min_{x ∈ X} ||x − p_i||.

The resulting set of closest points (one for each p_i) is {q_i}. For a triangulated surface, which is the most likely model representation from medical image data, the model X comprises a set of triangles {t_i}. The closest model point to each data point is found by linearly interpolating across the facets. If triangle t_i has vertices r_1, r_2 and r_3, then the distance between the point p_i and the triangle t_i is

d(p_i, t_i) = min_{u+v+w=1} ||u r_1 + v r_2 + w r_3 − p_i||



where u ∈ [0, 1], v ∈ [0, 1] and w ∈ [0, 1]. The closest model point to the data point p_i is, therefore, q_i = u r_1 + v r_2 + w r_3.

A least squares registration between the points {p_i} and {q_i} can then be carried out using the method described in section 5.1³. The set of data points {p_i} is then transformed to {p_i′} using the calculated rigid body transformation, and the closest points are once again identified. The algorithm terminates when the change in mean square error between iterations falls below a threshold.

The optimization can be accelerated by keeping track of the solutions at each iteration. If there is good alignment between the solutions (to within some tolerance), then both a parabola and a straight line are fitted through the solutions, and the registration estimate is updated using one or the other of these estimates based on a slightly ad hoc method to ‘be on the safe side’.

As the algorithm iterates to the local minimum closest to the starting position, it may not find the correct match. The solution proposed by Besl and McKay is to start the algorithm multiple times, each with a different estimate of the rotation alignment, and choose the minimum of the minima obtained.
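The two-stage iteration can be sketched for the simplest case of two point sets. This is an illustrative implementation in which the rigid fit is the SVD/Procrustes solution of section 5.1; the acceleration step and the multiple rotation starts are omitted:

```python
import numpy as np

# Sketch of ICP for two point sets: alternate closest-point matching
# (stage 1) with a least-squares rigid fit (stage 2), terminating when
# the change in mean square error falls below a tolerance.

def rigid_fit(P, Q):
    """SVD-based rigid fit of section 5.1, reflection-free."""
    p_bar, q_bar = P.mean(0), Q.mean(0)
    U, _, Vt = np.linalg.svd((P - p_bar).T @ (Q - q_bar))
    Delta = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])
    R = Vt.T @ Delta @ U.T
    return R, q_bar - R @ p_bar

def icp(data, model, tol=1e-8, max_iter=50):
    P = data.copy()
    prev = np.inf
    for _ in range(max_iter):
        # Stage 1: closest model point for each data point.
        d2 = ((P[:, None, :] - model[None, :, :]) ** 2).sum(-1)
        Q = model[d2.argmin(axis=1)]
        # Stage 2: rigid fit to the matches, then re-transform the data.
        R, t = rigid_fit(P, Q)
        P = P @ R.T + t
        err = ((P - Q) ** 2).mean()
        if prev - err < tol:            # tolerance-based termination
            break
        prev = err
    return P, err

rng = np.random.default_rng(1)
model = rng.standard_normal((40, 3))
data = model + np.array([0.001, -0.0005, 0.0002])  # small misalignment
aligned, err = icp(data, model)
assert err < 1e-9
```

With a small starting misalignment the closest-point matches are correct and the algorithm converges in one or two iterations; a large misalignment can trap it in the wrong local minimum, which is exactly why Besl and McKay propose multiple rotation starts.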

5.3. Registration using crest lines

An alternative to the idea of using surface matching for registration is to use distinctive features on those surfaces instead. For this to work the surfaces must be smooth, that is, differentiable up to third order. Using the tools of differential geometry, it is possible to define two principal curvatures k1 and k2 at each point on a surface (where the magnitude of k1 is greater than that of k2), with associated principal directions. Crest lines are the loci of the surface where the value of k1 is locally maximal in its principal direction (Monga and Benayoun 1995). For medical images in which intensity threshold values can determine isosurfaces that delineate structures of interest (e.g. bone from CT scans), these crest lines can be identified directly from the image voxel arrays. For other applications, such as registration involving MRI where intensity shading makes isosurfaces less meaningful, prior surface segmentation is needed. In both cases, smoothing of the data is needed along with differentiation in order to reduce sensitivity to noise.

Images can be registered by aligning the crest lines identified in the images. This has greatest applicability when the images are very similar, in which case there will be good correspondence between crest lines (Gueziec and Ayache 1994, Pennec and Thirion 1997). Alternative approaches include using hash tables of geometrical invariants for each curve (Gueziec et al 1997) together with the Hough transform, and using a modification of the iterative closest point algorithm described above (Gueziec and Ayache 1994, Pennec and Thirion 1997). The technique has been applied to registration of CT skull images and T1-weighted MR image volumes.

6. Intramodality registration using voxel similarity measures

Registration using voxel similarity measures involves calculating the registration transformation T by optimizing some measure calculated directly from the voxel values (or pixel values) in the images, rather than from geometrical structures such as points or surfaces derived from the images. As was stated in section 2, with voxel similarity measures we are almost invariably iteratively determining T, whereas in the case of point registration or surface matching we first identify corresponding features, determine T directly or iteratively from these, and finally infer T.

3 The authors of the iterative closest point algorithm in fact use the equivalent quaternion method.



In sections 5.1 and 5.2 above we did not distinguish between registration where images A and B are of the same modality and registration of A and B when they are of different modalities. For registration using voxel similarity measures this is an important distinction, as will be seen from the following example. A common reason for carrying out same modality, or intramodality, registration is to compare images from a subject taken at slightly different times in order to ascertain whether there have been any subtle changes in anatomy or pathology. If there has been no change in the subject, we might expect that, after registration and subtraction, there will be no structure in the difference image, just noise. Where there is a small amount of change in the structure, we would expect to see noise in most places in the images, with a few regions visible in which there has been some change. If there were a registration error, we would expect to see artefactual structure in the difference image resulting from the poor alignment. In this application, various voxel similarity measures suggest themselves. We could, for example, iteratively calculate T while minimizing the structure in the difference image, on the grounds that at correct registration there will be either no structure or a very small amount of structure in the difference image, whereas with increasing misregistration the amount of structure would increase. The structure could be quantified, for example, by the sum of squares of difference values, the sum of absolute difference values, or the entropy of the difference image. An alternative intuitive approach (at least for those familiar with signal processing techniques) would be to find T by cross correlation of images A and B. In this section, we describe these intramodality techniques in more detail.

6.1. Minimizing intensity difference

One of the simplest voxel similarity measures is the sum of squared intensity differences between images, SSD, which is minimized during registration. For N voxels in the overlap domain Ω^T_{A,B}:

SSD = (1/N) ∑_{x_A ∈ Ω^T_{A,B}} |A(x_A) − B^T(x_A)|².    (5)

It is necessary to divide by the number of voxels N in the overlap domain (the cardinality of Ω^T_{A,B}, i.e. N = ∑_{x_A ∈ Ω^T_{A,B}} 1) because N may vary with each estimate of T. It can be shown that this is the optimum measure when two images only differ by Gaussian noise (Viola 1995). It is quite obvious that for intermodality registration this will never be the case. This strict requirement is seldom true in intramodality registration either, as noise in medical images such as modulus MRI scans is frequently not Gaussian, and also because there is likely to have been change in the object being imaged between acquisitions, or there would be little purpose in registering the images!

The SSD measure is widely used for serial MR registration, for example by Hajnal et al (1995a, b). It is also used in a slightly different framework by the registration algorithm within Friston’s statistical parametric mapping (SPM) software (Friston et al 1995, Ashburner and Friston 1999), as described in more detail in section 7.1 below.

The SSD measure is very sensitive to a small number of voxels that have very large intensity differences between images A and B. This might arise, for example, if contrast material is injected into the patient between the acquisition of images A and B, or if the images are acquired during an intervention and instruments are in different positions relative to the subject in the two acquisitions. The effect of these ‘outlier’ voxels can be reduced by using the sum of absolute differences, SAD, rather than SSD:

SAD = (1/N) ∑_{x_A ∈ Ω^T_{A,B}} |A(x_A) − B^T(x_A)|.    (6)
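Equations (5) and (6) can be sketched directly; the images, overlap mask and intensity values below are illustrative:

```python
import numpy as np

# Sketch of equations (5) and (6): SSD and SAD computed over the voxels
# of the overlap domain, normalized by the number N of overlap voxels.

def ssd(a, b, overlap):
    n = overlap.sum()
    return float((np.abs(a[overlap] - b[overlap]) ** 2).sum() / n)

def sad(a, b, overlap):
    n = overlap.sum()
    return float(np.abs(a[overlap] - b[overlap]).sum() / n)

A = np.array([[10.0, 20.0], [30.0, 40.0]])
B = np.array([[11.0, 19.0], [30.0, 400.0]])   # one large outlier voxel
mask = np.ones_like(A, dtype=bool)            # full overlap here

# The outlier voxel dominates SSD far more strongly than SAD.
assert ssd(A, B, mask) > 100 * sad(A, B, mask)
```

The example illustrates the outlier sensitivity described above: squaring the differences lets a single contrast-enhanced or instrument voxel dominate the SSD cost, while SAD degrades more gracefully.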



6.2. Correlation techniques

The SSD measure makes the implicit assumption that after registration the images differ only by Gaussian noise. A slightly less strict assumption would be that, at registration, there is a linear relationship between the intensity values in the images. In this case, the optimum similarity measure is the correlation coefficient, CC:

CC = ∑_{x_A ∈ Ω^T_{A,B}} (A(x_A) − Ā)(B^T(x_A) − B̄) / {∑_{x_A ∈ Ω^T_{A,B}} (A(x_A) − Ā)² ∑_{x_A ∈ Ω^T_{A,B}} (B^T(x_A) − B̄)²}^{1/2}    (7)

where Ā is the mean voxel value of A within Ω^T_{A,B} and B̄ is the mean of B^T within Ω^T_{A,B}. This similarity measure has been used for intramodality registration (e.g. Lemieux et al 1998). The correlation coefficient can be thought of as a normalized version of the widely used cross correlation measure C:

C = ∑_{x_A ∈ Ω^T_{A,B}} A(x_A) B^T(x_A).    (8)
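Equation (7) can be sketched as follows (illustrative arrays; the restriction to the overlap domain is assumed to have been applied already):

```python
import numpy as np

# Sketch of equation (7): correlation coefficient over the overlap
# voxels, which is maximal when the intensities are linearly related.

def correlation_coefficient(a, b):
    a_c = a - a.mean()
    b_c = b - b.mean()
    return (a_c * b_c).sum() / np.sqrt((a_c ** 2).sum() * (b_c ** 2).sum())

a = np.array([1.0, 2.0, 3.0, 4.0])
b = 2.5 * a + 7.0          # exact linear relationship between images
assert np.isclose(correlation_coefficient(a, b), 1.0)
```

An exact linear intensity relationship gives CC = 1, which is why this measure is the optimum under the linear-relationship assumption stated above.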

6.3. Ratio image uniformity

The ratio image uniformity (RIU) algorithm was originally introduced by Woods et al (1992) for the registration of serial PET studies, but has more recently been widely used for serial MR registration (Woods et al 1998), and is available in the AIR registration package from UCLA. The algorithm can be thought of as working with a derived ratio image calculated from images A and B. An iterative technique is used to find the transformation T that maximizes the uniformity of this ratio image, which is quantified as the normalized standard deviation of the voxels in the ratio image. The RIU acronym was not introduced when the algorithm was first published, and it is also frequently referred to as the variance of intensity ratios (VIR) algorithm. The RIU algorithm is most easily thought of in terms of an intermediate ratio image R comprising N voxels within Ω^T_{A,B} (N = ∑_{x_A ∈ Ω^T_{A,B}} 1):

R(x_A) = A(x_A) / B^T(x_A)    for x_A ∈ Ω^T_{A,B},        R̄ = (1/N) ∑_{x_A ∈ Ω^T_{A,B}} R(x_A)    (9)

RIU = √[(1/N) ∑_{x_A ∈ Ω^T_{A,B}} (R(x_A) − R̄)²] / R̄.    (10)
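Equations (9) and (10) can be sketched as follows (illustrative arrays, with the overlap restriction assumed already applied):

```python
import numpy as np

# Sketch of equations (9) and (10): form the ratio image over the
# overlap voxels and compute its normalized standard deviation, which
# registration then minimizes.

def ratio_image_uniformity(a, b):
    r = a / b                       # intermediate ratio image R
    r_bar = r.mean()
    return np.sqrt(((r - r_bar) ** 2).mean()) / r_bar

a = np.array([10.0, 20.0, 30.0])
# A globally rescaled copy gives a perfectly uniform ratio image.
assert ratio_image_uniformity(a, 2.0 * a) == 0.0
# Misaligned (here: reversed) intensities give a non-uniform ratio.
assert ratio_image_uniformity(a, a[::-1]) > 0.0
```

Note that a pure global intensity rescaling leaves RIU at zero, so the measure is insensitive to overall brightness differences between the two acquisitions.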

7. Intermodality registration using voxel similarity measures

In section 6 we described algorithms that register images of the same modality by optimizing a voxel similarity measure. Because of the similarity of the intensities in the images being registered, the subtraction, correlation and ratio techniques described have an intuitive basis. With intermodality registration, the situation is quite different. There is, in general, no simple relationship between the intensities in the images A and B. Using the notation of section 2, the intensity mapping function F is a complicated function, and no simple arithmetic operation on the voxel values is, therefore, going to produce a single derived image from which we can quantify misregistration.

In section 7.1 we describe some interesting approaches to applying the intramodality measures of subtraction and correlation to images of different modalities. In the remainder of this section we then describe similarity measures designed to work directly on intermodality images.


R18 D L G Hill et al

In theory these measures could be used to calculate an arbitrary mapping T. In practice, current intermodality registration techniques almost always involve finding rigid body or affine transformations.

7.1. Using intramodality similarity measures for intermodality registration

It is possible to use image subtraction or correlation techniques for intermodality registration by estimating an intensity mapping function F and applying this to one image (say A) to create a 'virtual image' Av that has similar intensity characteristics to image B, of a different modality. If the estimate of F is reasonably good, then the virtual image Av is sufficiently similar to image B that subtraction or correlation algorithms can be used for registration.

7.1.1. Intensity re-mapping for MR–CT registration. One approach, which works well in MR–CT registration, is to transform the CT image intensities such that high intensities are re-mapped to low intensities. This creates a virtual image from the CT images that has an intensity distribution more like an MR image (in which bone is dark) (van den Elsen et al 1994). The MR image and virtual MR image created from CT are then registered by cross correlation.
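The idea can be sketched in a few lines of numpy. The linear intensity inversion below is an illustrative assumption (van den Elsen et al used their own re-mapping function), and `cross_correlation` is a plain normalized cross correlation:

```python
import numpy as np

def remap_ct_intensities(ct):
    """Invert CT intensities so bright bone becomes dark, yielding a
    'virtual MR' image. A linear inversion is used here purely for
    illustration; it is not the published mapping."""
    ct = np.asarray(ct, dtype=float)
    return ct.max() - ct

def cross_correlation(a, b):
    """Normalized cross correlation between two same-shaped images."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float(np.dot(a, b) / np.sqrt(np.dot(a, a) * np.dot(b, b)))
```

A registration loop would maximize `cross_correlation(mr, remap_ct_intensities(ct_transformed))` over trial transformations.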

7.1.2. Registration of intensity ridges. A slightly different approach is to generate virtual images Av and Bv from A and B, such that, although A and B are very different from each other, Av and Bv are quite similar. This can be done by applying scale-space derivatives to both images A and B in order to generate virtual images that comprise intensity ridges (van den Elsen et al 1995, Maintz et al 1996). The operator used by these authors to quantify 'ridgeness' in images is Lvv. In their terminology L is the intensity distribution (either 3D or 2D) and a gradient-based local coordinate system is then defined in which w is the direction of maximum gradient and v is the normal to that direction. The direction v will change relative to the Cartesian axes of the image depending on the local intensity landscape. Lvv is therefore the second derivative of the intensity landscape at a local position and a selected scale. It is a line in 2D and a surface in 3D (Maintz et al 1996). Applying this operator to both images to be registered produces two derived images of ridgeness that can be registered by cross correlation.

7.1.3. The registration algorithm used by the statistical parametric mapping (SPM) software. Registration involves finding the optimal transformation T. In order to run optimization algorithms, these transformations are assumed to be parametrized; for example the rotations are typically parametrized by Euler angles. We call the collection of parameters θ = (θ_0, . . . , θ_{K−1}), and we indicate the parametrization using the notation T = T_θ. For non-rigid transformations, the number of parameters K can be very large. In this sense, a general similarity measure S_{A,B}(T) becomes a function of these parameters θ: S_{A,B}(θ).

Friston et al (1995) proposed an alternative to running an optimization algorithm on a similarity measure. In the section on similarity measures, it was mentioned that different relationships between the image intensities are possible. At registration, the intensity mapping could be the identity, i.e. B(T(x_A)) = A(x_A) + ε(x_A). Alternatively the mapping could be linear, in which case B^T(x_A) = αA(x_A) + β + ε(x_A). More generally, we could have some global functional relation B^T(x_A) = F(A(x_A)) + ε(x_A) or a local functional (non-stationary) relationship: B^T(x_A) = F(A(x_A), x_A) + ε(x_A). In all these equations, ε(x_A) is some error term, which is discarded. Friston et al note that this provides one equation for every x_A. Just as



the transformation can be parametrized, they assume that the unknown F can be parametrized too, say by u = (u_0, . . . , u_{L−1}), with L parameters. If we rewrite the N equations in the form

\Phi_{x_A}(\theta, u) = B^{T_\theta}(x_A) - F_u(A(x_A), x_A) = 0   (11)

we get N implicit equations in K + L unknowns. If N ≫ K + L such a system can be formally solved. This is a very general approach, and Friston et al propose various specific versions for different applications. The general approach is analogous to the sum of squares of difference (SSD) algorithm described in section 6.1, in which registration is accomplished by minimizing the sum of squares of differences in voxel intensities between a virtual image derived from image A and the corresponding locations in image B

SSD = \| F(A(x_A)) - B^T(x_A) \|^2.   (12)

The assumptions in Friston et al's algorithm, however, mean that their solution is likely to be different from the solution obtained by iteratively minimizing SSD.

In order to solve equation (11) explicitly, Friston et al make a series of assumptions, which actually have the effect of ensuring that K and L are not too big. If \Phi is smooth, and we know that the solution is close to our starting estimate \theta_0, u_0, equation (11) can be linearized by taking the first two terms of the Taylor expansion of \Phi:

\Phi_{x_A}(\theta, u) = \Phi_{x_A}(\theta_0, u_0) + [\partial_\theta \Phi_{x_A}(\theta_0, u_0), \partial_u \Phi_{x_A}(\theta_0, u_0)] [\theta, u]^t.

If we call A the matrix of partial derivatives, then equation (11) takes the simple form A[\theta, u]^t = -\Phi_{x_A}(\theta_0, u_0), which can be solved by standard least squares techniques. This is not iterative, but obviously, due to the assumptions which have been made, iterative improvement might be required. Good starting estimates are essential for this technique.
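The single linearized solve described above can be sketched generically with a standard least-squares routine. This is not SPM's code; the function and argument names are ours:

```python
import numpy as np

def linearized_update(residual, jacobian, params):
    """One solve of the linearized equation (11): given the residual
    vector Phi(theta_0, u_0) and the matrix of partial derivatives,
    return the updated parameter vector [theta, u]^t by least squares.
    Generic sketch under the smoothness and good-starting-estimate
    assumptions stated in the text."""
    delta, *_ = np.linalg.lstsq(jacobian, -residual, rcond=None)
    return params + delta
```

For a problem that is exactly linear in the parameters, one call recovers the solution; otherwise the result is the starting estimate for the iterative improvement mentioned above.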

7.1.4. Example: MR–PET registration. Here A = MR, B = PET. Friston et al's choice was: T is a rigid body transformation, and the parametrization of F(MR(x_MR), x_MR) was obtained under the following assumptions: the signal in many PET images comes predominantly from grey matter, so a segmentation of the grey matter in MR should give an approximation to a virtual PET image (Friston et al 1995). This image might still have a non-stationary scaling, which is called u_0(x_MR). This segmented and scaled MR still has the wrong resolution, which is corrected by low-pass filtering by convolution with a Gaussian kernel. If v_g is the estimated grey matter intensity, the segmentation is done by the transformation

F(MR(x_MR), x_MR) = u_0(x_MR) e^{-(MR(x_MR) - v_g)^2 / 2\sigma^2}.

The real grey-value intensity is going to vary, so v_g is replaced by v_g + v(x_MR). The parameters of (we do not write the arguments, for clarity)

F(MR(\cdot), \cdot) = u_0(\cdot) e^{-(MR(\cdot) - (v_g + v(\cdot)))^2 / 2\sigma^2}

are then u_0 and v. Friston et al make the parameter transformation u_1 = u_0 v and assume smooth variation over the image to ensure the system is overdetermined.

7.2. Partitioned intensity uniformity

The first widely used intermodality registration algorithm that used a voxel similarity measure was proposed by Woods and colleagues for MR–PET registration soon after they proposed their RIU algorithm (Woods et al 1993). Here, we refer to this intermodality algorithm as partitioned intensity uniformity (PIU). This algorithm involved a fairly minor change to the source code of their previously published RIU technique (see section 6.3), but transformed its functionality. This algorithm makes an idealized assumption that 'all pixels with a particular MR pixel value represent the same tissue type so that values of corresponding PET pixels should also be similar to each other'. The algorithm therefore partitions the MR image into 256 separate bins based on the value of the MR voxels, then seeks to maximize the uniformity



of the PET voxel values within each bin. Once again, uniformity within each bin is maximized by minimizing the normalized standard deviation.

It is possible to consider this algorithm also using the concept of ratio images, as in the intramodality RIU algorithm. In the intermodality case, however, instead of generating a single ratio image, one 'ratio image' is generated for each of the MR intensity bins. This produces 256 sparse images, one for each isointensity set \Omega_a. No explicit division is needed to generate these 'ratio images', however, because the denominator image corresponds to a single MR bin. The normalized standard deviation of each of these sparse 'ratio images' is then calculated, and the overall similarity measure calculated from a weighted sum of the normalized standard deviations.

In the discussion above, we have described the algorithm in terms of MR and PET registration only. We can now formulate the algorithm more generally in terms of images A and B. It is important to note that the two images are treated differently, so there are two different versions of the algorithm, depending on whether image A or image B is partitioned.

For registration of the images A and B, the partitioned intensity uniformity measure (PIU) can be calculated in two ways: either as the sum of the normalized standard deviation of intensities in B for each intensity a in A (PIU_B), or the sum of the normalized standard deviation of intensities in A for each intensity b in B (PIU_A)

PIU_B = \sum_a \frac{n_a}{N} \frac{\sigma_B(a)}{\mu_B(a)} \qquad and \qquad PIU_A = \sum_b \frac{n_b}{N} \frac{\sigma_A(b)}{\mu_A(b)}   (13)

where

n_a = \sum_{\Omega^T_a} 1 \qquad n_b = \sum_{\Omega^T_b} 1

\mu_B(a) = \frac{1}{n_a} \sum_{\Omega^T_a} B^T(x_A) \qquad \mu_A(b) = \frac{1}{n_b} \sum_{\Omega^T_b} A(x_A)

\sigma^2_B(a) = \frac{1}{n_a} \sum_{\Omega^T_a} (B^T(x_A) - \mu_B(a))^2 \qquad \sigma^2_A(b) = \frac{1}{n_b} \sum_{\Omega^T_b} (A(x_A) - \mu_A(b))^2.

In words, we can say that n_a is the number of voxels of the isointensity set a in A|_{\Omega^T_{A,B}}, and \mu_B(a) and \sigma_B(a) are the mean and standard deviation of the voxels in B^T|_{\Omega^T_{A,B}} that co-occur with this set. The PIU algorithm is widely used for MR–PET registration, requiring that the scalp is first removed from the MR image to avoid a breakdown of the idealized assumption described above. The technique has never been widely used for registration of other modalities, but its success has inspired considerable research activity aimed at identifying alternative voxel similarity measures for intermodality registration.

7.3. Information theoretic techniques

It can be useful to think of image registration as trying to maximize the amount of shared information in two images. In a very qualitative sense, we might say that if two images of the head are correctly aligned then corresponding structures will overlap, so we will have two ears, two eyes, one nose and so forth. When the images are out of alignment, however, we will have duplicate versions of these structures from A and B.

Using this concept, registration can be thought of as reducing the amount of information in the combined image, which suggests the use of a measure of information as a registration metric. The most commonly used measure of information in signal and image processing is the



Shannon–Wiener entropy measure H, originally developed as part of communication theory in the 1940s (Shannon 1948, 1949)

H = -\sum_i p_i \log p_i.   (14)

H is the average information supplied by a set of n symbols whose probabilities are given by p_1, p_2, p_3, . . . , p_n.

This formula, save for a multiplicative constant, is derived from three conditions that a measure of choice or uncertainty in a communication channel should satisfy. These are:

(a) The functional should be continuous in p_i.
(b) If all p_i equal 1/n, where n is the number of symbols, then H should be monotonically increasing in n.
(c) If a choice is broken down into a sequence of choices then the original value of H should be the weighted sum of the constituent H. That is, H(p_1, p_2, p_3) = H(p_1, p_2 + p_3) + (p_2 + p_3) H\left(\frac{p_2}{p_2 + p_3}, \frac{p_3}{p_2 + p_3}\right).

Shannon proved that the -\sum p_i \log p_i form was the only functional form satisfying all three conditions.

Entropy will have a maximum value if all symbols have an equal probability of occurring (i.e. p_i = 1/n ∀i), and a minimum value of zero if the probability of one symbol occurring is 1 and the probability of all the others occurring is zero.

An important observation made by Shannon is that any change in the data that tends to equalize the probabilities of the symbols {p_1, p_2, p_3, . . . , p_n} increases the entropy. Blurring the symbols is one such operation. For a single image, the entropy is normally calculated from the image intensity histogram, in which the probabilities p_1 . . . p_n are the histogram entries4. If all voxels in an image have the same intensity a, the histogram contains a single non-zero element with probability 1, indicating that A(x_A) = a for all x_A. The entropy of this image is -1 log 1 = 0. If this uniform image were to include some noise, then the histogram will contain a cluster of non-zero entries around a peak at the average (mode) intensity value, which will be approximately a. The addition of noise to the image, therefore, tends to equalize the probabilities by 'blurring' the histogram, which increases the entropy. The dependence of entropy on noise is important. One consequence is that interpolation of an image may smooth the image (see section 9 for more detail), which can reduce the noise and consequently 'sharpen' the histogram. This sharpening of the histogram reduces entropy.

An application of entropy for intramodality image registration is to calculate the entropy of a difference image. If two identical images, perfectly aligned, are subtracted, the result is an entirely uniform image that has zero entropy (as stated above). For two images that differ by noise, the histogram will be 'blurred', giving higher entropy. Any misregistration, however, will lead to edge artefacts that further increase the entropy. Very similar images can, therefore, be registered by iteratively minimizing the entropy of the difference image (Buzug and Weese 1998).
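Entropy calculated from the intensity histogram, as used throughout this section, can be sketched as follows (the function name and bin count are our choices):

```python
import numpy as np

def histogram_entropy(image, bins=64):
    """Shannon entropy (equation (14)) of an image, computed from its
    intensity histogram (not the per-voxel variant of footnote 4).
    Empty bins are dropped, using the convention 0 log 0 = 0."""
    counts, _ = np.histogram(np.asarray(image).ravel(), bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))
```

A difference-image registration criterion in the spirit of Buzug and Weese (1998) would then iteratively minimize `histogram_entropy(a - b_transformed)` over trial transformations; a perfectly uniform difference image gives zero.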

7.3.1. Joint entropy. In image registration we have two images A and B to align. We therefore have two values at each voxel location for any estimate of the transformation T. Joint entropy measures the amount of information we have in the two images combined (Shannon 1948).

4 In some applications, image entropy is calculated by treating the value of each pixel or voxel x_A ∈ \Omega_A as a probability, so entropy is given by \sum_{x_A \in \Omega_A} A(x_A) \log A(x_A), with suitable normalization such that all probabilities sum to 1. In this article, we consider entropy calculated from the histogram of the image, not directly from the voxels in the image.



If A and B are totally unrelated, then the joint entropy will be the sum of the entropies of the individual images. The more similar (i.e. less independent) the images are, the lower the joint entropy compared with the sum of the individual entropies

H(A,B) ⩽ H(A) + H(B).   (15)

The concept of joint entropy can be visualized using a joint histogram calculated from images A and B^T (Hill et al 1994). For all voxels in the overlapping regions of the images (x_A ∈ \Omega^T_{A,B}), we plot the intensity of this voxel in image A, A(x_A), against the intensity of the corresponding voxel in image B^T. The joint histogram can be normalized by dividing by the total number of voxels N in \Omega^T_{A,B}, and regarded as a joint probability density function (PDF) p^T_{AB} of images A and B. We use the superscript T to emphasize that p^T_{AB} changes with T. Due to the quantization of image intensity values, the PDF is discrete, and the values in each element represent the probability of pairs of image values occurring together. The joint entropy H(A,B) is therefore given by

H(A,B) = -\sum_a \sum_b p^T_{AB}(a, b) \log p^T_{AB}(a, b).   (16)

The number of elements in the PDF can either be determined by the range of intensity values in the two images, or from a partitioning of the intensity space into 'bins'. For example, MR and CT images being registered could have up to 4096 (12 bit) intensity values, leading to a very sparse PDF with 4096 by 4096 elements. The use of between 64 and 256 bins is more common. In the above equation a and b either represent the original image intensities or the selected intensity bins. Joint entropy was simultaneously proposed for intermodality image registration by Studholme et al (1995) and Collignon et al (1995) at the 1995 Information Processing in Medical Imaging Conference.
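Equation (16) maps directly onto a 2D histogram computation. A minimal sketch, assuming the two arrays already sample the same overlap domain (function name and bin count are ours):

```python
import numpy as np

def joint_entropy(a, b, bins=64):
    """Joint entropy H(A,B) of equation (16), computed from the
    normalized joint histogram (discrete joint PDF) of co-occurring
    intensities. Empty cells are dropped (0 log 0 = 0)."""
    counts, _, _ = np.histogram2d(np.ravel(a), np.ravel(b), bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))
```

For identical, aligned images the joint histogram is diagonal and H(A,A) = H(A); dispersing the histogram (as misregistration does in figure 2) increases the value.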

As can be seen from figure 2, the joint histograms disperse or 'blur' with increasing misregistration, such that the brightest regions of the histogram get less bright and the number of dark regions is reduced. This arises because misregistration leads to joint histogram entries that correspond to different tissue types in the two images. This increases the entropy. Conversely, when registering images we want to find a transformation that will produce a small number of histogram elements with high probabilities, and give us as many zero-probability elements in the histogram as possible, which will minimize the joint entropy. Registration can, therefore, be thought of as trying to find the transformation that maximizes the 'sharpness' of the histogram, thereby minimizing the joint entropy.

The simple form of the equation for joint entropy (equation (16)) can hide an important limitation of this measure. As we have emphasized with the T superscript on the joint probabilities, joint entropy is dependent on T. In particular, p^T_{AB} is very dependent on the overlap \Omega^T_{A,B}, which is undesirable. For example, a change in T may alter the amount of air surrounding the patient overlapping in the images A and B. Since the air region contains noise that will tend to occupy the lowest value intensity bins (e.g. a = 0, b = 0), changing this overlap will alter the joint probability p^T_{AB}(0, 0). If the overlap of air increases, p^T_{AB}(0, 0) will increase, reducing the joint entropy H(A,B). If the overlap of air decreases, p^T_{AB}(0, 0) will reduce, increasing H(A,B). A registration algorithm that seeks to minimize joint entropy will tend, therefore, to maximize the amount of air in \Omega^T_{A,B}, which may result in an incorrect solution. More subtly, interpolation, which is needed for both subvoxel translation and any rotation, will blur the image, altering the PDF values p^T_{AB}.

7.3.2. Mutual information. A solution to the overlap problem from which joint entropy suffers is to consider the information contributed to the overlapping volume by each image



Figure 2. Example 2D histograms from Hill et al (1994) (with permission) for (a) identical MR images of the head, (b) MR and CT images of the head and (c) MR and PET images of the head. For all modality combinations, the left panel is generated from the images when aligned, the middle panel when translated by 2 mm, and the right panel when translated by 5 mm. Note that while the histograms are quite different for the different modality combinations, misregistration results in a dispersion or blurring of the signal. Although these histograms are generated by lateral translational misregistration, misregistration in other translation or rotation directions has a similar effect.

being registered together with the joint information. The information contributed by the individual images is simply the entropy of the portion of the image that overlaps with the other image volume:

H(A) = -\sum_a p^T_A(a) \log p^T_A(a) \qquad \forall\, A(x_A) = a \,|\, x_A \in \Omega^T_{A,B}   (17)



H(B) = -\sum_b p^T_B(b) \log p^T_B(b) \qquad \forall\, B^T(x_A) = b \,|\, x_A \in \Omega^T_{A,B}   (18)

where p^T_A and p^T_B are the marginal probability distributions, which can be thought of as the projections of the joint PDF onto the axes corresponding to intensities in images A and B respectively. It is important to remember that the marginal entropies are not constant during the registration process. Although the information content of the images being registered is constant (subject to slight changes caused by interpolation during transformation), the information content of the portion of each image that overlaps with the other image will change with each change in the estimated registration transformation. The superscript on p^T_B once again emphasizes the dependence of the probabilities on the full transformation \mathcal{T}, which maps both position and intensity (see the notation summary). The probabilities p^T_A carry a superscript T rather than \mathcal{T} because image A is the target image, which is not interpolated during registration, but the overlap domain nevertheless changes with T.

Communication theory provides a technique for measuring the joint entropy with respect to the marginal entropies. This measure, introduced by Shannon (1948) as the 'rate of transmission of information' in his article that founded information theory, has become known as mutual information I(A,B). It was independently and simultaneously proposed for intermodality medical image registration by researchers in Leuven, Belgium (Collignon et al 1995, Maes et al 1997) and at MIT in the USA (Viola 1995, Wells et al 1996). In maximizing mutual information, we seek solutions that have a low joint entropy together with high marginal entropies

I(A,B) = H(A) + H(B) - H(A,B) = \sum_a \sum_b p^T_{AB}(a, b) \log \frac{p^T_{AB}(a, b)}{p^T_A(a)\, p^T_B(b)}.   (19)

The difference between joint entropy and mutual information is illustrated for serial MR images in figure 3. These plots were obtained from MR images that were acquired perfectly aligned5. The correct registration transformation should correspond to zero rotation and zero translation, so we want the optimum value of the similarity measure to be at this position. Figure 3 plots the value of joint entropy, marginal entropies and mutual information for misalignments of between 0 and 6 mm in 0.2 mm increments. Subvoxel translation was achieved using trilinear interpolation, which can introduce interpolation errors. The plots therefore show the change in entropies with translation both using the original data (full curve) and using the data pre-filtered with a Gaussian of variance 0.5 voxels to smooth the images and reduce interpolation artefacts (broken curve). These plots demonstrate three important points. Firstly, the marginal entropies change with translation due to the change in the overlap domain \Omega^T_{A,B}. Secondly, mutual information I(A,B) varies more smoothly with misregistration than joint entropy H(A,B). Thirdly, subvoxel interpolation can blur the images, resulting in reduced entropy that introduces local extrema into the parameter space; the consequences of this are greatly reduced by preblurring the data.
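Equation (19) is computed from the same joint histogram as equation (16), with the marginals obtained by summing along each axis. A sketch (names and bin count are ours):

```python
import numpy as np

def mutual_information(a, b, bins=64):
    """Mutual information I(A,B) = H(A) + H(B) - H(A,B) of equation (19),
    with all three entropies computed over a common joint histogram.
    Sketch only; assumes a and b sample the same overlap domain."""
    counts, _, _ = np.histogram2d(np.ravel(a), np.ravel(b), bins=bins)
    p_ab = counts / counts.sum()
    p_a = p_ab.sum(axis=1)                # marginal for image A
    p_b = p_ab.sum(axis=0)                # marginal for image B
    def h(p):
        p = p[p > 0]                      # 0 log 0 = 0
        return -np.sum(p * np.log(p))
    return float(h(p_a) + h(p_b) - h(p_ab))
```

An image explains itself perfectly, so I(A,A) = H(A); for intensities with no shared structure the measure falls towards zero.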

Mutual information can qualitatively be thought of as a measure of how well one image explains the other, and is maximized at the optimal alignment. We can make our description more rigorous if we think more about probabilities. The conditional probability p(b|a) is the probability that B will take the value b given that A has the value a. The conditional entropy is, therefore, the average of the entropy of B for each intensity in A, weighted according to

5 A two-average gradient echo volume sequence was acquired with isotropic voxels of dimension 2 mm. The raw data were exported from the scanner, and echoes contributing to the two averages were separated and reconstructed as two different images. Because the echoes contributing to the two images are interleaved, the acquisitions are essentially simultaneous, so are registered. However, as the echoes in the two images are obtained from different excitations, random noise and artefact should be different in the two images.



Figure 3. A comparison of the change in marginal and joint entropies with cranial–caudal translation for registration of two MR images that differ only by noise. Top left H(A), top right H(B), bottom left H(A,B) and bottom right I(A,B). Each plot has two traces. The full curve is calculated with the images at original resolution, with subvoxel translation achieved using linear interpolation. Note the local extrema of the measures with the period of the voxel separation (2 mm). The broken curve is obtained from images that have been preblurred with a Gaussian kernel of variance σ² = 0.5 voxels. This has no effect on H(A), as image A is not interpolated. For H(B), the preblurring reduces the interpolation artefacts, and results in a smooth trace for mutual information.

the probability of getting that intensity in A

H(B|A) = -\sum_{a,b} p(a, b) \log p(b|a) = H(A,B) - H(A).   (20)

Using equation (20), we can rewrite our previous expression for mutual information (equation (19))

I(A,B) = H(B) - H(B|A) = H(A) - H(A|B).   (21)

The maximization of mutual information, therefore, involves minimizing the conditional entropy with respect to the entropy of one of the images. The conditional entropy will be zero if, knowing the intensity value A(x_A), we can exactly predict the intensity value in B^T. The conditional entropy will be high if a given intensity value A(x_A) in image A is a very poor predictor of the intensity value B^T(x_A) at the corresponding location in image B.

Given that the images are of the same object, then when they are correctly registered, corresponding voxels in the two images will be of the same anatomical or pathological structure. Knowing the value of a voxel in image A should, therefore, reduce the uncertainty (and hence entropy) for the value of the corresponding location in image B. This can be thought of as



a generalization of the assumption made by Woods in his PIU measure. The PIU measure assumes that at registration the uniformity of values in B corresponding to a given value a in A should be minimal. The information theoretic approaches assume that, at alignment, the value of a voxel in A is a good predictor of the value at the corresponding location in B. As misregistration increases, one image becomes a less good predictor of the second. In practical terms, the advantage of mutual information over PIU is that two structures which have the same intensity in image A may have very different intensities in image B. For example, in an MR image, cortical bone and air will both have very low intensities, whereas in CT, air will have a very low intensity but cortical bone a high intensity. If we have a low-intensity voxel in image A, then at correct alignment we know that this should be either air or bone in the CT image6. The histogram of CT intensity values corresponding to low intensities in MR will, therefore, have sharp peaks at both low-intensity values and high-intensity values, and the sharp peaks in the histogram give us low entropy. Even though the MR intensity is a good predictor of the CT intensity, however, the PIU would give a very low uniformity value. Mutual information has been compared with other voxel similarity measures for MR–CT and MR–PET registration (e.g. Studholme et al 1996, 1997).

7.3.3. Normalized mutual information. Mutual information does not entirely solve the overlap problem described above. In particular, changes in the overlap of very low-intensity regions of the image (especially air around the patient) can disproportionately contribute to the mutual information. As was stated earlier, Shannon (1948) was the first to present the functional form of mutual information, calling it the 'rate of transmission of information' in a noisy communication channel between source and receiver. In his application in telecommunications, the time over which the different measurements of the source and receiver are made is constant, by definition. In image registration, however, the quantity analogous to time is the total number of image data in the overlap domain \Omega^T_{A,B}, and this changes with the transformation estimate T. To remove this dependence on the volume of overlap we should normalize to the combined information in the overlapping volume.

Three normalization schemes have so far been proposed in journal articles to address this problem. Equations (22) and (23) below were mentioned in passing in the discussion section of Maes et al (1997), though no results were presented comparing them with standard mutual information (equation (19))

I_1(A,B) = \frac{2 I(A,B)}{H(A) + H(B)}   (22)

I_2(A,B) = H(A,B) - I(A,B).   (23)

Studholme et al (1999) have proposed an alternative normalization devised to overcome the sensitivity of mutual information to change in image overlap. This measure involves normalizing mutual information with respect to the joint entropy of the overlap volume

I_3(A,B) = \frac{H(A) + H(B)}{H(A,B)} = \frac{I(A,B)}{H(A,B)} + 1 = \frac{2}{2 - I_1(A,B)}.   (24)

This third version of normalized mutual information has been shown to be considerably more robust than standard mutual information for intermodality registration in which the overlap volume changes substantially (Studholme et al 1999). For serial MR registration, when images A and B have virtually identical fields of view, however, mutual information

6 We are assuming in this example that the MR sequence being used will give us no other structures with very low intensity.



and normalized mutual information (equation (24)) have been shown to perform equivalently (Holden et al 2000).
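The normalization of equation (24) reuses the quantities already computed for mutual information. A sketch, with the joint histogram construction and bin count as our assumptions:

```python
import numpy as np

def normalized_mutual_information(a, b, bins=64):
    """Overlap-invariant normalization of Studholme et al (1999),
    equation (24): (H(A) + H(B)) / H(A,B), computed from a joint
    histogram. Sketch only; assumes a common overlap domain."""
    counts, _, _ = np.histogram2d(np.ravel(a), np.ravel(b), bins=bins)
    p_ab = counts / counts.sum()
    def h(p):
        p = p[p > 0]                      # 0 log 0 = 0
        return -np.sum(p * np.log(p))
    return float((h(p_ab.sum(axis=1)) + h(p_ab.sum(axis=0))) / h(p_ab))
```

The measure is 2 for identical aligned images and falls towards 1 as the intensities become unrelated, independently of how much of each image lies in the overlap.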

8. Optimization and capture ranges

With the exception of registration using the Procrustes technique described in section 5.1, and in certain circumstances the registration algorithm in SPM described in section 7.1, all the registration algorithms reviewed in this article require an iterative approach, in which an initial estimate of the transformation is gradually refined by trial and error. In each iteration, the current estimate of the transformation is used to calculate a similarity measure. The optimization algorithm then makes another (hopefully better) estimate of the transformation, evaluates the similarity measure again, and continues until the algorithm converges, at which point no transformation can be found that results in a better value of the similarity measure, to within a preset tolerance. A review of optimization algorithms can be found in Press et al (1992).

One of the difficulties with optimization algorithms is that they can converge to an incorrect solution called a 'local optimum'. It is sometimes useful to consider the parameter space of values of the similarity measure. For rigid body registration there are six degrees of freedom, giving a six-dimensional parameter space. Each point in the parameter space corresponds to a different estimate of the transformation. Non-rigid registration algorithms have more degrees of freedom (often many thousands), in which case the parameter space has correspondingly more dimensions. The parameter space can be thought of as a high-dimensionality image in which the intensity at each location corresponds to the value of the similarity measure for that transformation estimate. If we consider dark intensities as good values of similarity, and high intensities as poor ones, an ideal parameter space image would contain a sharp low-intensity optimum with monotonically increasing intensity with distance away from the optimum position. The job of the optimization algorithm would then be to find the optimum location given any possible starting estimate.

Unfortunately, parameter spaces for image registration are frequently not this simple. There are often multiple optima within the parameter space, and registration can fail if the optimization algorithm converges to the wrong optimum. Some of these optima may be very small, caused either by interpolation artefacts (discussed further in section 9) or by a local good match between features or intensities. These small optima can often be removed from the parameter space by blurring the images prior to registration. In fact, a hierarchical approach to registration is common, in which the images are first registered at low resolution, then the transformation solution obtained at this resolution is used as the starting estimate for registration at a higher resolution, and so on.
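The claim that blurring removes small optima can be illustrated numerically. In this toy example (our own construction, not taken from any cited work), a signal contains both coarse and fine structure; the SSD curve against translation is rippled by the fine structure, and most of the spurious minima disappear when both signals are smoothed before comparison:

```python
import numpy as np

def shift(img, t):
    """Translate a 1D signal by t samples using linear interpolation."""
    x = np.arange(img.size)
    return np.interp(x - t, x, img)

def smooth(img, width):
    """Crude low-pass filter: a moving average of the given odd width."""
    return np.convolve(img, np.ones(width) / width, mode="same")

def ssd_curve(a, b, shifts):
    """SSD similarity as a function of candidate translation."""
    return np.array([np.sum((a - shift(b, t)) ** 2) for t in shifts])

def count_local_minima(c):
    """Strict interior local minima of a sampled 1D curve."""
    return int(np.sum((c[1:-1] < c[:-2]) & (c[1:-1] < c[2:])))

x = np.arange(256.0)
a = np.sin(2 * np.pi * x / 64) + np.sin(2 * np.pi * x / 5)  # coarse + fine structure
b = shift(a, 3.0)
shifts = np.arange(-10.0, 10.0, 0.25)

raw = count_local_minima(ssd_curve(a, b, shifts))
blurred = count_local_minima(ssd_curve(smooth(a, 9), smooth(b, 9), shifts))
# blurring strongly attenuates the period-5 ripple, leaving close to a
# single optimum near the true translation
```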

Multiresolution approaches do not entirely solve the problem of multiple optima in the parameter space. It might be thought that the optimization problem involves finding the globally optimal solution within the parameter space, and that a solution to the problem of multiple optima is to start the optimization algorithm with multiple starting estimates, resulting in multiple solutions, and choose the solution which has the lowest value of the similarity measure. This sort of approach, called ‘multistart’ optimization, can be effective for surface matching algorithms. For voxel similarity measures, however, the problem is more complicated. The desired optimum when registering images using voxel similarity measures is frequently not the global optimum, but is one of the local optima. The following example serves to illustrate this point. When registering images using joint entropy or mutual information, an extremely good value of the similarity measure can be found by transforming the images such that only air in the images overlaps. This will give a few pixels in the joint histogram with very high probabilities, surrounded by pixels with zero probability. This is a very low entropy situation, and will tend


R28 D L G Hill et al

to have lower entropy than the correct alignment. The global optimum in parameter space will, therefore, tend to correspond to an obviously incorrect transformation. The solution to this problem is to start the algorithm within the ‘capture range’ of the correct optimum, that is, within the portion of the parameter space in which the algorithm is more likely to converge to the correct optimum than to the incorrect global one. In practical terms, this requires that the starting estimate of the registration transformation is reasonably close to the correct solution. The size of the capture range depends on the features in the images, and cannot be known a priori, so it is difficult to know in advance whether the starting estimate is sufficiently good. This is not, however, a very serious problem, as visual inspection of the registered images, described further in section 11, can easily detect convergence outside the capture range. In this case, the solution is clearly and obviously wrong (e.g. relevant features in the image do not overlap at all). If this sort of failure of the algorithm is detected, the registration can be re-started with a better starting estimate obtained, for example, by interactively transforming one image until it is approximately aligned with the other.

9. Image transformation

Image registration using voxel similarity measures involves determining the transformation T that relates the domain of image A to image B. This transformation can then be used to transform one image into the coordinates of the second within the region of overlap of the two domains Ω^T_{A,B}. As discussed in section 2 above, this process involves interpolation, and needs to take account of the difference in sample spacing in images A and B.

9.1. A consideration of sampling and interpolation theory

Shannon (1949) showed that a bandlimited signal sampled with an infinite periodic sampling function can be perfectly interpolated using the sinc function interpolant previously proposed by the mathematician Whittaker (1935). Many medical images are, however, not bandlimited. For example, multislice datasets are not bandlimited in the through-slice direction, as the field of view is truncated with a top-hat function. Even in MR image volumes reconstructed using a 3D Fourier transform, the condition is not usually satisfied because the image data provided by the scanner are often truncated to remove slices at the periphery of the field of view. Also, the data provided are usually the modulus of the signal, and taking the modulus is a nonlinear operation that can increase the spatial frequency content.
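Shannon's result is easy to state in code: a signal sampled at the integers is reconstructed at any point by a sinc-weighted sum of all its samples. The following is a direct transcription of the Whittaker–Shannon interpolation formula (the function name is ours):

```python
import numpy as np

def sinc_interp(samples, t):
    """Whittaker-Shannon interpolation of a signal sampled at the integers:
    x(t) = sum_n x[n] * sinc(t - n), where sinc(u) = sin(pi*u)/(pi*u)."""
    n = np.arange(samples.size)
    return float(np.sum(samples * np.sinc(t - n)))
```

For an effectively bandlimited signal such as a broad Gaussian, the reconstruction at off-grid points is accurate to many decimal places; for truncated or non-bandlimited data, as discussed above, it is not exact.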

Even if the images being transformed were strictly bandlimited, it would not be possible to carry out perfect interpolation using a sinc function, because a sinc function is infinite in extent.

For many purposes, this problem is entirely ignored during medical image analysis. The most widely used image interpolation function is probably trilinear interpolation, in which a voxel value in the transformed coordinates is estimated by taking a weighted average of the nearest eight neighbours in the original dataset. The weightings, which add up to one, decrease with the distance of each neighbour from the new sample point. For the accurate comparison of registered images, for example by subtracting one image from another, the errors introduced by trilinear interpolation become important. It can be shown that trilinear interpolation applies a low-pass filter to the image and introduces aliasing (Parker et al 1983). For transformations that contain rotations, the amount of low-pass filtering varies with position in the image. If subtracting one image from another to detect small change, for example in serial MR imaging, the low-pass filtering in this process can lead to substantial artefacts. Subtracting a low-pass filtered version of an image from the original is a well known



edge enhancement method, so even in the case of identical images differing only by a rigid body transformation, using linear interpolation followed by subtraction does not result in the expected null result, but instead produces an edge-enhanced version of the original.
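A minimal sketch of trilinear interpolation, together with a 1D illustration of the subtraction artefact just described (both are our own toy examples): shifting a step edge by half a voxel and back with linear interpolation leaves an edge-shaped residual, whereas whole-voxel shifts round-trip exactly away from the boundary.

```python
import numpy as np

def trilinear(vol, x, y, z):
    """Estimate a value at (x, y, z) as a weighted average of the eight
    nearest voxels; the weights are products of the complementary
    fractional distances along each axis, and sum to one."""
    x0, y0, z0 = int(np.floor(x)), int(np.floor(y)), int(np.floor(z))
    fx, fy, fz = x - x0, y - y0, z - z0
    val = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((fx if dx else 1.0 - fx) *
                     (fy if dy else 1.0 - fy) *
                     (fz if dz else 1.0 - fz))
                val += w * vol[x0 + dx, y0 + dy, z0 + dz]
    return val

def lin_shift(img, t):
    """1D linear-interpolation translation."""
    x = np.arange(img.size)
    return np.interp(x - t, x, img)

# Shifting by half a voxel and back does not return the original image:
edge = np.where(np.arange(64) < 32, 0.0, 1.0)
residual = edge - lin_shift(lin_shift(edge, 0.5), -0.5)
# residual is an edge-enhancement pattern concentrated at the step,
# while a whole-voxel shift round-trips exactly:
exact = edge - lin_shift(lin_shift(edge, 1.0), -1.0)
```

The half-voxel round trip is equivalent to convolving with the kernel [0.25, 0.5, 0.25], so the residual is exactly the edge-enhancement pattern the text predicts.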

Hajnal et al (1995a) recently brought this issue to the attention of MR image analysts and proposed that the solution is to interpolate using a sinc function truncated with a suitable window function, such as a Hamming window. Care must be taken when truncating the interpolation kernel to ensure that the integral of the weights of the truncated kernel is unity, or an artefactual intensity modulation can result (Thacker et al 1999).
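A Hamming-windowed sinc kernel with the weight normalization recommended by Thacker et al can be sketched as follows (the kernel radius and the exact window are illustrative choices, not prescribed by the cited papers):

```python
import numpy as np

def windowed_sinc_weights(frac, radius=4):
    """Interpolation weights at position n0 + frac (0 <= frac < 1) using a
    sinc kernel truncated to 2*radius taps and tapered by a Hamming window.
    The weights are renormalized so that they sum exactly to one; without
    this, the truncated kernel modulates the mean image intensity."""
    offsets = np.arange(-radius + 1, radius + 1)   # the nearest 2*radius samples
    t = frac - offsets                             # distance to each sample
    window = 0.54 + 0.46 * np.cos(np.pi * t / radius)
    w = np.sinc(t) * window
    return w / w.sum()
```

At a fractional offset of zero the weights reduce to a delta function (the original sample is returned unchanged), and for smooth signals an eight-tap kernel of this form interpolates far more accurately than linear interpolation.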

Various modifications to sinc interpolation have recently been proposed. These fall into three categories. Firstly, the use of sinc functions of various radii truncated with various window functions (Lehmann et al 1999). Secondly, approximations to windowed sinc functions such as cubic or B-spline interpolants (Lehmann et al 1999, Unser 1999). Thirdly, the shear transform, which involves transforming the image using a combination of shears (Eddy et al 1996, Cox and Jesmanowicz 1999). This third approach is fast, though it does result in artefacts in the corners of the image which must be treated with caution.

An assumption implicit in the discussion above is that the original data being interpolated are uniformly sampled. This is not always the case in medical images. MR physics researchers are used to the problem of non-uniform sampling in the acquisition, or k-space, domain (Robson et al 1997, Atkinson et al 2000), but this problem is less often considered in the spatial domain. The most common circumstances in which non-uniform sampling arises are free-hand 3D ultrasound acquisition and certain types of CT acquisition where the slice spacing changes during the acquisition. The correct way of interpolating from non-uniformly sampled data onto a uniform grid is the reverse of sinc interpolation. This methodology, sometimes used in k-space regridding (Robson et al 1997, Atkinson et al 2000), involves calculating the sinc coefficients to go from the desired uniform sampling points to the non-uniform locations acquired, and inverting the matrix of coefficients in order to do the correct interpolation. In the cases of 3D ultrasound and CT variable slice sample spacing, the data are a long way from being bandlimited, so the benefits of inverse sinc interpolation may be small in any case.
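The matrix formulation of this reverse interpolation can be sketched as follows. For simplicity we assume exactly as many acquired samples as uniform grid points, so the system is square; in practice a least-squares solve would be used, and the function name is ours:

```python
import numpy as np

def inverse_sinc_regrid(acquired_t, acquired_v, grid_t):
    """Recover uniform-grid samples whose sinc interpolant passes through
    non-uniformly acquired data: form the matrix of sinc coefficients from
    each uniform grid point to each acquired location, then solve the
    resulting linear system."""
    S = np.sinc(acquired_t[:, None] - grid_t[None, :])
    return np.linalg.solve(S, acquired_v)   # assumes a square, well-conditioned system
```

If the underlying signal really is bandlimited, the recovery is exact up to the conditioning of the coefficient matrix, which degrades as the sample locations become more irregular.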

9.2. Interpolation during registration

Many registration algorithms involve iteratively transforming image A with respect to image B while optimizing a similarity measure calculated from the voxel values. Interpolation errors can introduce modulations in the similarity measure with T. This is most obvious for transformations involving pure translations of datasets with equal sample spacing, where the period of the modulation is the same as the sample spacing (Pluim et al 2000). Figure 3 illustrates this effect. This periodic modulation of the similarity measure introduces local minima that can lead to the incorrect registration solution being determined.

The computational cost of ‘correct’ interpolation is normally too high for this approach to be used in each iteration, so lower-cost interpolation techniques must be used. There are several possible approaches. The first is to use low-cost interpolation, such as trilinear or nearest neighbour, until the transformation is close to the desired solution, then carry out the final few iterations using more expensive interpolation. An alternative strategy is to take advantage of the spatial-frequency dependence of interpolation errors. Trilinear interpolation low-pass filters the data, and therefore if the images are blurred prior to registration (high spatial frequency components are removed), the interpolation errors are smaller, so errors in the registration are reduced. Although the loss of resolution that results from blurring is a disadvantage, the registration errors caused by the interpolation errors can be greater than the loss of precision resulting from blurring.



9.3. Transformation for intermodality image registration

It should be emphasized that these interpolation issues are more critical for intramodality registration, where accuracy considerably better than a voxel is frequently desired, than for intermodality image registration. In intermodality registration, one image is frequently of substantially lower resolution than the other, and the desired accuracy is of the order of a single voxel at the higher resolution. Furthermore, it is common for the final registration solution to be used to transform the lower resolution image to the sample spacing of the higher resolution modality. Interpolation errors are still likely to be present if trilinear interpolation is used without care, and may slightly reduce the registration accuracy, or degrade the quality of the transformed images.

10. Registration applications

The most widely used applications of image registration involve determining the rigid body transformation that aligns 3D tomographic images of the same subject acquired using different modalities (intermodality registration), or the same subject imaged with a single modality at different times (intramodality registration). There is increasing interest in non-rigid registration of the same or different subjects, registration of 2D images with 3D images, and registration of images with the physical coordinates of a treatment system. In this section we give examples of some applications of these approaches.

10.1. Rigid-body and affine intermodality registration

The motivation behind the majority of work in medical image registration during the 1980s was the desire to combine complementary information about the same patient from different imaging modalities (e.g. Maguire et al 1986, Chen et al 1985, Peters et al 1986, Schad et al 1987, Levin et al 1988, Faber and Stokely 1988, Evans et al 1988, Meltzer et al 1990, Hill et al 1991). With the introduction of MR imaging into many hospitals, and increasing use of CT and tomographic nuclear medicine modalities, it was becoming common for patients to be imaged with more than one tomographic modality during diagnosis or planning of treatment. Intermodality image registration provided a solution to the problem of relating images from different modalities that differ in field of view, resolution and slice orientation. The head, and the brain in particular, has always been the major area of application for these techniques, partly because the skull makes the rigid body assumption valid. During the 1990s, automatic algorithms that optimize a voxel similarity measure were devised for these applications, and these have been shown to be more accurate and robust than feature-based registration algorithms for rigid body MR–CT registration and, to a lesser extent, also for rigid body MR–PET registration (West et al 1999). The information theoretic approaches described in section 7.3 have been shown to be the most generally applicable and robust (Studholme et al 1996, 1997). Figure 4 shows example MR and CT images before and after registration using an affine transformation. An affine transformation was used in this case to compensate for scaling and skew errors in the data, as well as finding the translations and rotations needed to align the images.

10.2. Rigid-body intramodality registration

It is becoming common for a subject to be imaged two or more times with the same modality for diagnosis, treatment monitoring or research. The images may be separated in time by seconds (e.g. in functional experiments), minutes (e.g. pre- and postcontrast images), or



Figure 4. Top row: unregistered MR (left) and CT (right) images. The MR images are shown in the original sagittal plane and a reformatted coronal plane, the CT images in the original oblique plane and a reformatted sagittal plane. Note the different fields of view of the images. Bottom panel: MR images in sagittal, coronal and axial planes with the outline of bone, thresholded from the registered CT scan, overlaid. The registration transformation has 10 degrees of freedom, to correct for errors in the machine-supplied voxel dimensions and gantry tilt angle.

weeks (e.g. monitoring tumour growth). In all these applications it is desirable to have high sensitivity to small changes in the images. In functional experiments the signal in some voxels may change by a few per cent between the resting and activated states. In contrast or perfusion studies, it is desirable to identify regions that enhance, or to quantify intensity change in a region of interest. In longer-term studies, it is desirable to detect small changes in lesion volume or degenerative change in order to plan treatment or monitor response to therapy. Visual inspection of images on a light-box has been shown to be less sensitive to these changes than looking at difference images (Denton et al 2000). Since patients often move during examinations, and cannot be repositioned perfectly on subsequent visits, registration is an essential prerequisite for subsequent analysis. Techniques to register intramodality images of the brain using a rigid-body transformation were an area of active research during the 1990s (e.g. Woods et al 1992, 1998, Hajnal et al 1995a,b, Freeborough et al 1996, Lemieux et al 1998, Holden et al 2000). It might at first seem that intramodality registration is a much simpler problem than intermodality registration because the images being aligned are very similar to one another. It turns out, however, that registration accuracy of much less than a voxel is necessary, so great care must be taken to handle the image transformation issues discussed in section 9 (Hajnal et al 1995a). While similarity measures such as SSD, RIU and CC described in section 6 are successfully used for intramodality registration of brain images, care must be taken in the optimization and interpolation to ensure high-quality results. Furthermore, although the brain can be accurately aligned using a rigid body transformation, deformable regions such as the scalp and neck, or regions that change substantially between acquisitions (e.g. due to contrast uptake), can bias the result, leading to significant errors. Presegmentation of the images to exclude these regions can be necessary (Hajnal et al 1995a). Alternatively, it has been shown



that the information theoretic similarity measures discussed in section 7.3 can be less sensitive to these changes, and may have an advantage over the more obvious intramodality measures described in section 6 for this application (Holden et al 2000).

10.3. 2D–3D registration

There is increasing interest in the registration of 2D projection images with 3D tomographic images. One example of a 2D–3D registration problem is aligning video images with tomographic modalities. This can be achieved using surface reconstruction from video followed by surface matching (e.g. Colchester et al 1996, Grimson et al 1996). Registration of video images with rendered views calculated from tomographic images has also been proposed (e.g. Viola 1995, Clarkson et al 1999, 2000), but these approaches have not yet been shown to work reliably on clinical data.

The predominant 2D–3D registration application is registration of fluoroscopy images and CT images in order to align preinterventional CT scans with interventional x-ray images (Lemieux et al 1994a, Weese et al 1997, Penney et al 1998, Lavallee and Szeliski 1995). The most widely used approach to this problem is to simulate projection x-rays from the CT scans to produce digitally reconstructed radiographs (DRRs), and to iteratively estimate the unknown pose of the x-ray set relative to the CT volume by optimizing a voxel similarity measure calculated from the fluoroscopy image and DRR. If the DRR is a good approximation to the fluoroscopy image, then cross correlation is an appropriate measure (Lemieux et al 1994b). A particular problem with interventional fluoroscopy images is that they often contain high-contrast objects, such as instruments and prostheses, that have been added to the field of view. While bone and soft tissue in accurately simulated DRRs may have similar intensity values to the corresponding pixels in aligned fluoroscopy images, the instruments and other added objects are, of course, absent in the preinterventional CT scan. Many voxel similarity measures are extremely sensitive to a small number of pixels that have large differences between the modalities, and both cross correlation and mutual information have been found to be unreliable for this task (Penney et al 1998).

10.3.1. Pattern intensity. Pattern intensity is a similarity measure proposed by Weese et al (1997) to register fluoroscopy and CT images. It differs from other measures in that it calculates the value of the similarity measure at each pixel using a neighbourhood of other pixels, rather than just the corresponding pixels themselves. Considering image A to be the x-ray fluoroscopy image, and image B^T to be the DRR generated from the CT scan B, pattern intensity is defined in terms of a difference image D. We describe the pattern intensity measure below in 2D, as originally proposed, although it can also be extended to 3D:

D(x_A) = A(x_A) − λ B^T(x_A)

where λ is an intensity scaling factor. The pattern intensity, PI, is then calculated from the difference image over a neighbourhood of N pixels within a radius r of each pixel x_A:

PI_{r,σ} = Σ_{x_A ∈ Ω^T_{A,B}} Σ_{(u,v)} σ² / (σ² + (D(x_A) − D(u, v))²)

where {u, v} are pixels in the neighbourhood of x_A such that |x_A − (u, v)| ≤ r. There are two parameters required by this similarity measure. The first is the radius of the neighbourhood, r. Increasing r can improve the reliability of the algorithm, but increases the computational requirement. Values of between 3 and 5 pixels are proposed. The parameter σ controls the sensitivity



of the measure to image features. Weese suggests it should be larger than the standard deviation of the noise in the fluoroscopy image, but smaller than the contrast of structures of interest.
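The measure can be transcribed almost directly (loop form for clarity, not speed; as assumptions of ours, the centre pixel is excluded from the neighbourhood, the sum is restricted to interior pixels, and the parameter values below are merely illustrative):

```python
import numpy as np

def pattern_intensity(A, B_T, lam=1.0, r=3, sigma=10.0):
    """Pattern intensity (after Weese et al 1997): for each pixel of the
    difference image D = A - lam*B_T, sum sigma^2 / (sigma^2 + (D(x) - D(u,v))^2)
    over the neighbours (u, v) within radius r of x."""
    D = A - lam * B_T
    rows, cols = D.shape
    # offsets within a disc of radius r, excluding the centre pixel
    offs = [(du, dv) for du in range(-r, r + 1) for dv in range(-r, r + 1)
            if 0 < du * du + dv * dv <= r * r]
    total = 0.0
    for i in range(r, rows - r):          # interior pixels only, for brevity
        for j in range(r, cols - r):
            for du, dv in offs:
                d = D[i, j] - D[i + du, j + dv]
                total += sigma ** 2 / (sigma ** 2 + d * d)
    return total
```

On identical uniform images every term equals one, giving the maximum value; introducing a single pixel with a very large intensity difference removes only the handful of terms that involve it, which is precisely the insensitivity to instruments discussed below.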

To understand how pattern intensity works, it is useful to compare it with the sum of squares of intensity difference measure introduced in section 6.1 above. If pixels A(x_A) and B^T(x_A) are identical, and part of uniform patches of radius R > r, then pixel x_A will contribute 0 to the sum of squares of difference measure and N σ²/(σ² + 0) = N to the pattern intensity measure. The corresponding contribution of pixel x_A if it is part of a uniform region in the presence of noise δ would be δ² for the sum of squares of difference measure, and approximately N σ²/(σ² + δ²) ≈ N to pattern intensity. For a uniform region of very large difference in intensity Δ ≫ σ, the pixel x_A will contribute the large value Δ² to the sum of squares of difference measure, and approximately N σ²/(σ² + Δ²) ≈ 0 to the pattern intensity measure. As Δ increases, the sum of squares of intensity measure will increase as Δ², but pattern intensity will asymptotically approach zero, with the consequence that regions of very great difference in intensity contribute proportionately less to pattern intensity than to the sum of squares of intensity difference. Furthermore, if x_A is a single pixel with a large difference Δ between modalities, surrounded by a uniform patch that is the same in both modalities, then this pixel will still give a contribution Δ² to the sum of squares of difference measure, but a contribution of N − 1 ≈ N to pattern intensity. Pattern intensity is, therefore, almost totally insensitive to individual pixels, or small numbers of pixels n ≪ N, where there is a very large difference between modalities. Since high-contrast instruments in the fluoroscopy image have exactly this characteristic, pattern intensity is much less sensitive to these objects than the sum of squares of intensity differences.

10.4. Non-rigid registration: beyond the affine transformation

As was stated in the introduction, aligning images using registration transformations with more degrees of freedom than the affine transformation is an active area of research in which rapid progress is being made. This review does not, therefore, attempt a comprehensive examination of the area (which would inevitably become out of date extremely rapidly). Instead we give a brief introduction to the field.

Non-affine registration algorithms normally either include an initial rigid body or affine transformation, or are run after a rigid-body or affine algorithm has provided a starting estimate. Many different approaches are used. When point landmarks are available, thin-plate splines are often used to determine the transformation (Bookstein 1989). Using intensity-based algorithms (with the measures described in sections 6 and 7), the non-rigid component of the transformation can be determined using a linear combination of polynomial terms (Woods et al 1998), basis functions (Friston et al 1995) or B-spline surfaces defined by a regular grid of control points (Rueckert et al 1999). Alternatively, pseudophysical models can be used, in which the deformation of one image with respect to another is modelled as a physical process such as elastic deformation or fluid flow. It is also possible to mix rigid and non-rigid transformations in the same image (Little et al 1997).

The transformation T determined by a non-affine registration algorithm can be regarded as a deformation field that records the displacement vector at each voxel in image B needed to align it with the corresponding location in image A. For some non-affine registration applications, T is the most useful outcome of the registration. In a recent study of brain development, serially acquired MR images of children were registered using a non-affine algorithm in order to gain a better understanding of the rate of growth of different structures in the brain (Thompson et al 2000). Also, T can be used to propagate segmented structures between images that have been registered using a non-affine algorithm, for example to measure change in volume of structures over time (Dawant et al 2000).



10.4.1. Thin plate spline warps from point landmarks. The point registration technique described in section 5.1 above can be used to determine the rigid body or affine mapping that aligns the points P and Q in a least squares sense. If the structures being aligned are deformable, then it may be more appropriate to warp one of the images so that the landmarks are aligned. In the absence of any other information (for example about the mechanical properties of the tissue), the most appropriate transformation might be the one that matches the landmarks exactly, and bends the rest of space as little as possible. This requires an appropriate definition of bending, or bending energy. Intuitively, the squared norm of the second derivative is a good choice of energy. This can also be justified more rigorously from plate theory (Marsden and Hughes 1994). Also, by analogy with differential geometry, the curvature is related to second derivatives of the metric tensor. In more than one dimension, this is to be interpreted as the sum of squares of all second order derivatives of all components of the mapping T.

Variational problems of this type are usually solved by finding a corresponding partial differential equation (PDE) whose solutions are minimizers. Here the PDE operator is the Laplacian of the Laplacian, ΔΔ (Harder and Desmarais 1972, Goshtasby 1988, Bookstein 1989). Solutions are built by superposition of fundamental solutions, called thin plate splines, again by analogy with plate theory. The reader might be more familiar with fundamental solutions of the ordinary Laplacian Δ = Σ_i ∂²/∂x_i². In both cases, it is important to notice that these solutions have a different form in different dimensions: in one dimension, the thin plate splines are cubic splines (Dryden and Mardia 1998); in higher dimensions they are functions of the distance to the landmarks. In two dimensions there is a logarithmic term

x′ = Lx + t + Σ_{i=1}^{N} c_i r_i² ln r_i²    (25)

whereas in three dimensions it takes the simpler form

x′ = Lx + t + Σ_{i=1}^{N} c_i r_i    (26)

where L is a linear transformation, t is a translation, the c_i are coefficient vectors and r_i is the distance from x to the ith landmark.
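In 2D the coefficients c_i and the affine part can be found by solving a small linear system that enforces exact interpolation of the landmarks; this is the standard thin plate spline construction (Bookstein 1989), sketched here with our own function names:

```python
import numpy as np

def tps_kernel(r2):
    """U(r) = r^2 ln r^2 evaluated from squared distances, with U(0) = 0."""
    out = np.zeros_like(r2)
    pos = r2 > 0
    out[pos] = r2[pos] * np.log(r2[pos])
    return out

def tps_fit(src, dst):
    """Solve for a 2D thin plate spline mapping src landmarks (n x 2)
    exactly onto dst landmarks: [[K P], [P^T 0]] [c; a] = [dst; 0]."""
    n = src.shape[0]
    d2 = np.sum((src[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    K = tps_kernel(d2)
    P = np.hstack([np.ones((n, 1)), src])        # affine part: 1, x, y
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T
    rhs = np.vstack([dst, np.zeros((3, 2))])
    coef = np.linalg.solve(A, rhs)
    return coef[:n], coef[n:]                    # c (n x 2), affine (3 x 2)

def tps_apply(pts, src, c, a):
    """Evaluate the warp at arbitrary points (m x 2)."""
    d2 = np.sum((pts[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    U = tps_kernel(d2)
    P = np.hstack([np.ones((pts.shape[0], 1)), pts])
    return U @ c + P @ a
```

The side conditions in the bottom rows ensure that when the landmark displacements are purely affine, the bending coefficients c vanish and the warp reduces to the affine map, so only genuinely non-affine displacement bends space.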

10.4.2. Non-affine registration by optimising a similarity measure. The similarity measures described in sections 6 and 7 can be used for affine or non-affine registration. When calculating non-affine transformations, it is normal to optimize a cost function that is the combination of a term due to the similarity measure and a regularization term. The regularization term assigns high cost to undesirable transformations, such as high local stretching or bending, or folding of the transformation (Rueckert et al 1999, Ashburner and Friston 1999). In this section we describe some examples of non-affine registration algorithms.

The approach used in the AIR software package (Woods et al 1998) is to initially calculate an affine transformation, then refine this to calculate a non-affine transformation in which the position of each voxel in image B^T is a polynomial expansion of its location in image B. For a second-order polynomial, there are 10 degrees of freedom for each axis (30 in total), rising to 60 degrees of freedom for third order and 105 degrees of freedom for fourth order.
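These counts follow from the number of 3D monomials of total degree at most n, which is C(n+3, 3) per output axis; note that by this count 105 parameters corresponds to a fourth-order polynomial:

```python
from math import comb

def polynomial_warp_dof(order, dims=3):
    """Parameter count of a polynomial warp: one coefficient per monomial
    of total degree <= order, for each of the dims output coordinates."""
    return dims * comb(order + dims, dims)
```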

An alternative to using polynomials is to break the image up into regions that are able to move independently of one another. Following an approximate alignment of images A and B using a rigid or affine transformation, the images can be split up into an array of nodes. These nodes may be the control points of a spline such as a B-spline (Rueckert et al 1999), or may be regarded as the centre of a local region of interest (ROI). Each node in




Figure 5. Example slice from (a) pre-contrast and (b) post-contrast MR mammograms. Without registration, the difference image contains distracting artefacts (c). Affine registration results in some improvement (d). Non-rigid registration (e), using a 10 mm grid of B-spline control points (Rueckert et al 1999), results in reduced artefact and improved diagnostic value (Denton et al 2000).

image B is iteratively translated, or a local affine transformation is iteratively determined for each ROI, in order to optimize the chosen cost function. The similarity measures SSD and CC can be evaluated locally on each ROI. The information theoretic measures such as MI and NMI, however, are statistical in nature, and the estimate determined from a small number of voxels is poor. In this case, the similarity measure should be calculated from the entire image for each node displacement. B-splines have the useful characteristic of ‘local support’, which means that only voxels within a radius of about two node spacings are affected by translation of a node, so when optimizing information theoretic measures on an array of B-splines, only a small portion of the image needs to be transformed with each iteration, even though the similarity measure is evaluated over the whole image.
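The local support property follows from the cubic B-spline basis functions, which are nonzero over only four knot spans and sum to one. A 1D sketch (our own variable names; Rueckert et al use the 3D tensor-product version):

```python
import numpy as np

def cubic_bspline_basis(u):
    """The four cubic B-spline basis functions at parameter u in [0, 1)."""
    return np.array([
        (1 - u) ** 3 / 6.0,
        (3 * u ** 3 - 6 * u ** 2 + 4) / 6.0,
        (-3 * u ** 3 + 3 * u ** 2 + 3 * u + 1) / 6.0,
        u ** 3 / 6.0,
    ])

def ffd_displacement(x, phi, spacing=10.0):
    """Free-form deformation displacement at position x, from a 1D array of
    control-point displacements phi on a grid with the given spacing.
    Only the four nearest control points influence any given x."""
    i = int(np.floor(x / spacing)) - 1       # index of first influencing node
    u = x / spacing - np.floor(x / spacing)  # fractional position in the span
    b = cubic_bspline_basis(u)
    return float(np.sum(b * phi[i: i + 4]))
```

Because moving one control point changes the displacement field only within about two node spacings, an optimizer can re-transform just that small region per iteration while still evaluating the information theoretic measure globally.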

Figure 5 shows an example non-affine registration of pre- and postcontrast MR mammograms obtained by optimizing normalized mutual information (NMI) on a 10 mm array of B-spline control points (Rueckert et al 1999).

For some regions of the body, it is important to distinguish between different tissues that have very different mechanical properties, such as bone and soft tissue. The algorithm of Little et al (1997) provides a technique to treat multiple rigid bodies within a deformable matrix.




Figure 6. Axial (top) and coronal (bottom) slices from average images produced from MR scans of seven different normal controls, registered using (a) rigid registration, (b) affine registration and (c) non-rigid registration using a 10 mm grid of B-spline control points (Rueckert et al 1999). In all cases the registration was achieved by maximizing normalized mutual information, as described in section 7.3.3. Note that after rigid registration the average image is quite blurred, indicating that rigid registration does not line up the different brain scans very well. After affine registration, the images are a lot less blurred, especially around basal structures. The non-rigid registration, however, produces the sharpest image, indicating that this sort of algorithm is better at lining up brain features between subjects.

10.4.3. Intersubject registration. In neuroscience research it is standard practice to register images from different individuals into a standard space, in order to study variability between subjects, or to compare normal subjects with patients. This type of registration is frequently called intersubject normalization. The most commonly used standard space is the Talairach stereotactic space. Aligning images with this space involves registering them with an electronic version of the Talairach atlas (Talairach and Tournoux 1988), generated from one post-mortem female brain. The alignment is most commonly carried out with an affine transformation (e.g. Collins et al 1994), but more degrees of freedom are sometimes used (Friston et al 1995, Ashburner and Friston 1999, Collins et al 1995, Woods et al 1998, Christensen et al 1996). This is normally treated as an intramodality registration problem, as it is usual to register the images of a particular subject to images of the same modality from another subject that have already been put into Talairach space. The approaches described in section 10.4.2 above are, therefore, appropriate for this application.

Figure 6 compares the mean image obtained after registering seven MR images of normal subjects using rigid body, affine and non-affine transformations by optimizing normalized mutual information using the algorithm of Rueckert et al (1999). Note the increasing sharpness of the average image as the number of degrees of freedom increases.


10.5. Image-to-physical space registration

Image-to-physical space registration is important when it is desirable to align images with a patient in the treatment position. The main applications are in image-guided neurosurgery and radiation therapy. The earliest approach was to use a stereotactic frame, rigidly attached to the patient. This frame could be fitted with imaging markers visible in MR, CT or x-ray angiography, and these markers could be used to relate image coordinates to treatment coordinates defined by the frame (Peters et al 1986, Kelly 1991). More recently, frameless techniques have been devised. These use point landmarks or surface features for registration. For neurosurgical applications, markers rigidly attached to the skull are quite invasive, but can provide high accuracy (Maurer et al 1997, Edwards et al 2000). Skin-affixed markers have less accuracy, but are also much less invasive (Roberts et al 1986). Bite blocks or dental stents (Howard et al 1995, Hauser et al 1997, Fenlon et al 2000) rigidly attached to the teeth provide a minimally invasive and relocatable system for registration of images to physical space. The patient is usually scanned wearing the stent with the imaging markers attached. During the intervention, LEDs can be attached to the stent to track head movement. These devices have been shown to provide relocation accuracy of better than 2 mm and tracking accuracy of less than 1 mm if carefully constructed (Fenlon et al 2000). An even less invasive approach is to use surface anatomical points, but this is very user dependent. Whichever points are used, registration can be achieved using the point registration algorithm described in section 5.1.
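The point registration step referred to here is commonly solved in closed form with an SVD-based Procrustes fit. The following is a minimal sketch, assuming paired 3D marker positions in image and physical space (NumPy), and is not the specific implementation used in any of the cited systems:

```python
import numpy as np

def rigid_point_register(src, dst):
    """Least-squares rigid (Procrustes) fit mapping points `src` onto `dst`.

    Returns rotation R and translation t such that R @ src[i] + t ~ dst[i].
    Closed-form SVD solution, a sketch of the point-based registration the
    text refers to (section 5.1).
    """
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    u, _, vt = np.linalg.svd(src_c.T @ dst_c)
    d = np.sign(np.linalg.det(vt.T @ u.T))        # guard against reflections
    R = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t
```

The reflection guard matters in practice: with noisy fiducials the unconstrained least-squares solution can otherwise return an improper rotation.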

An alternative approach is to use surfaces for registration. The skin or exposed bone surface can be recorded in physical space using optical methods (Colchester et al 1996, Grimson et al 1996) or by manual tracking of a localizer over the skin surface (Tan et al 1993). These techniques do not require any markers to be attached to the patient prior to imaging, though points can also be used to improve robustness (Maurer et al 1998). In these cases, registration is achieved using one of the surface matching algorithms described in section 5.2.
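The surface matching family can be illustrated with a minimal iterative closest point loop, which alternates nearest-neighbour matching with a closed-form rigid fit. This is a generic sketch (brute-force matching, no outlier handling) of the kind of algorithm section 5.2 covers, not any of the cited systems:

```python
import numpy as np

def icp(src, dst, iters=20):
    """Minimal iterative closest point sketch.

    Alternately matches each source point to its nearest destination point
    (brute force) and solves the rigid Procrustes fit for the matches.
    """
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        moved = src @ R.T + t
        # nearest destination point for each transformed source point
        d2 = ((moved[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        matched = dst[np.argmin(d2, axis=1)]
        # closed-form rigid fit of src onto the current matches
        sc, mc = src.mean(axis=0), matched.mean(axis=0)
        u, _, vt = np.linalg.svd((src - sc).T @ (matched - mc))
        dsign = np.sign(np.linalg.det(vt.T @ u.T))
        R = vt.T @ np.diag([1.0, 1.0, dsign]) @ u.T
        t = mc - R @ sc
    return R, t
```

As with all ICP variants, convergence is only local: the initial pose must be close enough that most nearest-neighbour matches are correct.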

11. Quantifying registration accuracy

There are two main reasons why it is desirable to quantify registration accuracy. The first is to calculate the expected accuracy of an algorithm, in order to ascertain whether it is good enough for a particular clinical application, or in order to compare one algorithm with another. The second is to assess the accuracy of registration for a particular subject, for example prior to using the registered images to make a decision about patient management.

The accuracy of a registration transformation T cannot easily be summarized by a single number, as it varies spatially over the image. If T is the calculated transformation, and Tg is the true 'gold standard' transformation, then the registration error will vary with position xA in the image. If we think of each point in the image as a potential target of some treatment, then we can define the error at this point as the target registration error (TRE) as follows:

TRE(xA) = |T(xA) − Tg(xA)|. (27)

The TRE will normally vary with position. For example, a rigid body transformation will typically have rotational components. It may be that at some position in the image, by chance, the rotational component of the transformation cancels out the error in the translation component, giving TRE = 0. Elsewhere, however, the TRE will be greater. In the extremely unlikely case that the only error in a transformation is a translation error, the TRE will be the same everywhere. If Tg is known, then TRE can be calculated everywhere in the image, and an image of TRE could be produced. In practice, it is more common to summarize the TRE distribution, for example by considering the mean or maximum value. For most practical situations, however, Tg is not known, so TRE cannot be calculated.
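Equation (27) is straightforward to evaluate when both transformations are available in matrix form. A sketch, assuming 4 × 4 homogeneous matrices (a representation chosen here for illustration; the text does not fix one):

```python
import numpy as np

def tre_map(points, T, T_gold):
    """Target registration error |T(x) - Tg(x)| at each point, equation (27).

    `T` and `T_gold` are 4x4 homogeneous matrices; `points` is (N, 3).
    """
    h = np.c_[points, np.ones(len(points))]            # homogeneous coords
    return np.linalg.norm((h @ T.T - h @ T_gold.T)[:, :3], axis=1)
```

For a pure translation error the returned values are identical at every point, while any rotational error component makes them vary with position, as described above.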


11.1. Point landmark registration

11.1.1. Errors in rigid body point registration. For rigid body registration using fiducial points, a theory of errors has been devised that can predict the expected error distribution based on errors in localizing individual points, and on the distribution of the points.

It is clear that the localization of the fiducial points is never perfect. Fitzpatrick et al (1998b) call this error the fiducial localization error (FLE). The residual error in fitting of point sets is called the fiducial registration error (FRE). The distribution of FRE given the distribution of FLE has been described by Sibson (1978). Fitzpatrick et al (1998b) stress that what really matters is not the FRE, as the fiducials are not located at the places of interest in the subject, but the target registration error (TRE) we introduced earlier, i.e. the error induced by FLE at a given target. Explicitly, if the mean FLE is ε, the rotation and translation Tε := (Rε, tε) which solve the Procrustes problem will contain an error. The target registration error at the target x is then |Tε(x) − T(x)|. Fitzpatrick et al (1998b) provide formulae to first order in ε for TRE as a function of FLE, explaining a decrease proportional to 1/√N, where N is the number of points. Summarized: the squared expectation value of TRE at position x, with coordinates (x1, . . . , xK), is, to first order,

⟨TRE(x)^2⟩ ≅ ⟨FLE⟩^2 ( 1/N + (1/K) Σ_i Σ_{j≠i} x_i^2/(Λ_i^2 + Λ_j^2) )

where K is the number of spatial dimensions (normally three for medical applications) and Λ_i are the singular values of the configuration matrix of the markers.
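The first-order prediction of Fitzpatrick et al (1998b) can be evaluated numerically. In the sketch below, the target is expressed in the principal-axis frame of the demeaned fiducial configuration, and the Λ_i are taken to be the singular values of that centred configuration matrix; this interpretation of the formula's terms is our assumption for illustration, not a statement of the authors' derivation:

```python
import numpy as np

def expected_tre_sq(fiducials, target, fle_sq):
    """First-order estimate of <TRE^2> at `target`.

    `fiducials` is (N, K); `fle_sq` is the expected squared fiducial
    localization error. The target is expressed in the principal-axis
    frame of the demeaned fiducial configuration, whose singular values
    enter the denominator.
    """
    N, K = fiducials.shape
    centred = fiducials - fiducials.mean(axis=0)
    _, lam, vt = np.linalg.svd(centred, full_matrices=False)
    x = vt @ (target - fiducials.mean(axis=0))   # principal-axis coordinates
    s = sum(x[i] ** 2 / (lam[i] ** 2 + lam[j] ** 2)
            for i in range(K) for j in range(K) if j != i)
    return fle_sq * (1.0 / N + s / K)
```

At the fiducial centroid the prediction reduces to the 1/N term alone, and it grows as the target moves away from the marker centroid, which is why fiducials should surround the surgical target where possible.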

11.2. Determining registration accuracy using a gold standard

If images are available for which the correct registration transformation is known, then registration accuracy can be calculated by comparing the transformation calculated by any registration algorithm with the known, 'gold standard' solution.

In the case of simulated data, the gold standard can have arbitrary accuracy, but the images are often not very realistic. An alternative is to use a gold standard method to register the images very accurately, and then compare any other algorithm with the solution obtained with the gold standard algorithm. This is a satisfactory approach if a gold standard algorithm can be applied to a group of test images, but it would not be appropriate for other registration tasks. For intermodality registration, this can be achieved by using invasive markers to provide the gold standard. These invasive markers can be attached to cadavers (Hemler et al 1995), or to surgical patients (West et al 1997), and the markers removed from the images prior to using these images as test data for algorithms that could be used when no invasive markers are available. Because the invasive markers can determine the transformation with a TRE at locations of interest of about 0.5 mm (Maurer et al 1997), the resulting images are useful for testing other algorithms. This approach has so far only been shown to be effective for rigid body intermodality registration. For non-rigid registration, or intramodality registration where much higher accuracy is required, the approach is not satisfactory.

11.3. Determining registration accuracy using consistency measurements

Because there is no realistic gold standard for intermodality registration, authors frequently test their algorithms by measuring the consistency of transformations (Freeborough et al 1996, Woods et al 1998, Holden et al 2000). Given three images of the same subject, A, B and C, there are three transformations that can be compared: TA→B, TB→C and TC→A. Applying all


three transformations in turn completes a circuit, which should give the identity transformation for a perfect algorithm:

Tc = TA→B TB→C TC→A.

For any real algorithm, of course, Tc will not be the identity; its deviation from the identity provides an estimate of the errors. If the registration errors in the process are uncorrelated, then the RMS registration error for one application of the algorithm will be 1/√3 times the error for the whole circuit. Because one image is common to each of TA→B, TB→C and TC→A, however, the errors are not uncorrelated, so errors estimated in this way will tend to underestimate the true error of the algorithm. As an extreme example, an algorithm that always produces an erroneous transformation close to the identity would incorrectly be found to perform well according to this measure.

11.4. Visual assessment

In many situations, the only practical means of estimating registration accuracy for a single patient is to visually inspect the images. An observer looks at the registered images, using various tools such as colour overlays and difference images, and classifies the registration solution as either a 'success' or a 'failure' depending on whether they judge the registration accuracy to be above or below the required application accuracy. This is very effective if an algorithm normally registers images well within the required accuracy, but occasionally fails in an obvious way. For algorithms that produce a fairly uniform range of errors on either side of the required accuracy, however, there is a risk that an observer will generate too many false negatives (images that are sufficiently well registered but are classified as failures), or false positives (images that are not well enough registered, but are classified as successes). This approach has been carefully studied for rigid body registration of MR and CT images (Fitzpatrick et al 1998a). Preliminary work in serial MR registration has suggested that observers have high sensitivity to errors greater than 0.2 mm when viewing difference images (Holden et al 2000).

12. Conclusions

In this review we have introduced the topic of medical image registration and discussed the main approaches described in the literature. The main emphasis of this article is intrasubject registration of tomographic modalities, which is predominantly used to find the rigid-body or affine transformation needed to align images of the head. We have also considered in less detail the closely related topics of non-affine registration (for intrasubject registration of deformable regions, and intersubject registration), 2D–3D registration, and image-to-physical space registration. Non-affine registration, in particular, is a rapidly developing area with many potential applications in healthcare and medical research.

Because most current algorithms for medical image registration calculate a rigid body or affine transformation, their applicability is restricted to parts of the body where tissue deformation is small compared with the desired registration accuracy, and in practice they are used most commonly for registration of images of the head. The most accurate algorithms for intermodality registration of the head are based on optimizing a voxel similarity measure. The most generally applicable of these algorithms are currently the ones based on information theory. These algorithms can be applied automatically to a variety of modality combinations for intermodality and intramodality registration, without the need for presegmentation of the images. They can also be extended to non-affine transformations.


One further appeal of these information theoretic approaches is the mystique that surrounds the word entropy. An interesting anecdote to emphasize this point comes from a conversation between Shannon and Von Neumann (quoted in Applebaum (1996)). Apparently, Shannon had asked Von Neumann which name he should give to his measure of uncertainty. Von Neumann answered: 'You should call it "entropy", and for two reasons: first, the function is already in use in thermodynamics under that name; second, and more importantly, most people don't know what entropy really is, and if you use the word "entropy" in an argument, you will win every time!' There is, as yet, no proof that the information theory measures are in any way optimal for image registration, and better measures are likely to be devised in due course.

The first algorithms for medical image registration were devised in the early 1980s, and fully automatic algorithms have been available for many intermodality and intramodality applications since the mid 1990s. Despite this, at the time of writing, image registration is still seldom carried out on a routine clinical basis. The most widely used registration applications are probably image-to-physical space registration in neurosurgery and registration of functional MR images to correct for interscan patient motion. Intermodality registration, which accounts for the majority of the literature in this area, is still unusual in the clinical setting.

Image registration is, however, being widely used in medical research, especially in neuroscience, where it is used in functional studies, in cohort studies and to quantify changes in structure during development and ageing.

One barrier to the routine clinical use of image registration may be the logistical difficulties in getting the images to be registered onto the same computer. Medical research labs doing a lot of imaging tend to have a more integrated infrastructure than hospitals, so do not suffer from this problem. The healthcare sector is now moving towards integrating text-based and image information about the patient to produce multimedia electronic patient records, and when this infrastructure is in place, the logistics of image registration will be much easier. Another reason for the lack of clinical use of image registration might be that traditional radiological practice can provide all the necessary information for patient management, making registration unnecessary. Even if this second argument is currently valid, the increasing data generated by successive generations of scanners (including new multislice helical CT scanners) will steadily increase the need for registration to assist the radiologist in carrying out his or her task.

It is worthwhile briefly considering how image registration is likely to evolve over the next few years. Increasing volumes of data and multimedia electronic patient records have already been referred to, and these practical developments may see registration entering routine clinical use at many centres. Also, increasing use of dynamic acquisitions such as perfusion MRI will necessitate the use of registration algorithms to correct for patient motion. In addition, non-affine registration is likely to find increasing application in the study of development and ageing, and in monitoring changes due to disease progression and response to treatment. In these latter applications, the transformation itself may have more clinical benefit than the transformed images, as it quantifies the changes in structure in a given patient. New developments in imaging technology may open up new applications of image registration. It has recently been shown that very high field whole-body MR scanners can produce high signal to noise ratio images of the brain with 100 µm resolution (Robitaille et al 2000). Intramodality registration of these images may open up new applications such as monitoring change in small blood vessels. Also, while ultrasound images have been largely ignored by image registration researchers up until now, the increasing quality and low cost of ultrasound imaging make this a fertile area for both intramodality and intermodality applications (e.g. Roche et al 2000, King et al 2000).

In some ways, medical image registration is a mature technology that has been around for nearly two decades and has attracted considerable research activity in devising and validating


algorithms, and in demonstrating clinical application. We believe, however, that there will be substantial additional innovation in this area over the next few years, especially in non-affine registration and in applications outside the brain. In the next decade, registration algorithms are likely to enter routine clinical use, and research applications will play a key role in improving our understanding of physiology and disease processes, especially in neuroscience.

Acknowledgments

We are grateful to colleagues in the Computational Imaging Science group at King's College London for useful discussions and assistance, especially Dr Julia Schnabel for figure 6 and Christine Tanner for figure 5.

References

Applebaum D 1996 Probability and Information: An Integrated Approach (Cambridge: Cambridge University Press)Ashburner J and Friston K J 1999 Nonlinear spatial normalization using basis functions Human Brain Mapping 7

254–66Atkinson D, Hill D L G, Stoyle P N R, Summers P E, Clare S, Bowtell R and Keevil S F 1999 Automatic compensation

of motion artefacts in MRI Magn. Reson. Med. 41 163–70Atkinson D, Porter D A, Hill D L G, Calamante F and Connelly A 2000 Sampling and reconstruction effects due to

motion in diffusion-weighted interleaved echo planar imaging Magn. Reson. Med. 44 101–9Besl P J and McKay N D 1992 A method for registration of 3-D shapes IEEE Trans. Pattern Anal. Mach. Intell. 14

239–56Bookstein F L 1989 Principal warps: thin-plate splines and the decomposition of deformations IEEE Trans. Pattern

Anal. Mach. Intell. 11 567–85Borgefors G 1984 Distance transformations in arbitrary dimensions Comput. Vision Graph. Image Process. 27 321–45Buzug T M and Weese J 1998 Image registration for DSA quality enhancement Computerized Imaging Graphics 22

103Chang H and Fitzpatrick J M 1992 A technique for accurate magnetic resonance imaging in the presence of field

inhomogeneities IEEE Trans. Med. Imaging 11 319–29Chen G T Y, Kessler M and Pitluck S 1985 Structure transfer between sets of three dimensional medical imaging data

Computer Graphics 1985 (Dallas: National Computer Graphics Association) pp 171–5Christensen G E, Rabbitt R D and Miller M I 1996 Deformable templates using large deformation kinematics IEEE

Trans. Med. Imaging 5 1435–47Clarkson M J, Rueckert D, Hill D L G and Hawkes D J 1999 Registration of multiple video images to pre-operative

CT for image guided surgery Proc. SPIE 3661 14–23——2000 A multiple 2D video–3D medical image registration algorithm Proc. SPIE 3979 342–52Colchester A C F, Zhao J, Holton-Tainter K S, Henri C J, Maitland N, Roberts P T E, Harris C G and Evans R J

1996 Development and preliminary evaluation of VISLAN, a surgical planning and guidance system usingintra-operative video imaging Med. Image Anal. 1 73–90

Collignon A, Maes F, Delaere D, Vandermeulen D, Suetens P and Marchal G 1995 Automated multi-modality imageregistration based on information theory Information Processing in Medical Imaging 1995 ed Y Bizais, C Barillotand R Di Paola (Dordrecht: Kluwer Academic) pp 263–74

Collins D L, Holmes C J, Peters T M and Evans A C 1995 Automatic 3D model-based neuroanatomical segmentationHuman Brain Mapping 3 190–208

Collins D L, Neelin P, Peters T M and Evans A C 1994 Automatic 3D intersubject registration of MR volumetric datain standardized Talairach space J. Comput. Assist. Tomogr. 18 192–205

Cox R W and Jesmanowicz A 1999 Real-time 3D image registration for functional MRI Magn. Reson. Med. 421014–18

Cuchet E, Knoplioch J, Dormont D and Marsault C 1995 Registration in neurosurgery and neuroradiotherapyapplications J. Image Guided Surg. 1 198–207

Dawant B M, Hartmann S L, Thirion J-P, Maes F, Vandermeulen D and Demaerel P 2000 Automatic 3D segmentationof internal structures of the head in MR images using a combination of similarity and free-form transformations:part 1. Methodology and validation on normal subjects IEEE Trans. Med. Imaging 18 909–16

Page 42: Medical image registration - Sharif University of Technologysbisc.sharif.edu/~miap/Files/Medical Image Registration.pdf · Medical image registration ... whatever the form, together

R42 D L G Hill et al

Declerck J, Feldmar J, Goris M L and Betting F 1997 Automatic registration and alignment on a template of cardiacstress and rest reoriented SPECT images IEEE Trans. Med. Imaging 16 727–37

Denton E R E, Holden M, Christ E, Jarosz J M, Russell-Jones D, Goodey J, Cox T C S and Hill D L G 2000 Theidentification of cerebral volume changes in treated growth hormone deficient adults using serial 3D MR imageprocessing J. Comput. Assist. Tomogr. 24 139–45

Dryden I and Mardia K 1998 Statistical Shape Analysis (New York: Wiley)Eddy W F, Fitzgerald M and Noll D C 1996 Improved image registration by using Fourier interpolation Magn. Reson.

Med. 36 923–31Edelman A, Arias T and Smith S 1998 The geometry of algorithms with orthogonality constraints SIAM J. Matrix

Anal. Appl. 20 303–53Edwards P J et al 2000 Design and evaluation of a system for microscope-assisted guided interventions (MAGI) IEEE

Trans. Med. Imaging 19 at pressEvans A C, Beil C, Marrett S, Thompson C J and Hakim A 1988 Anatomical–functional correlation using an adjustable

MRI-based region of interest atlas with positron emission tomography J. Cereb. Blood Flow Metab. 8 513–30Faber T L, McColl R W, Opperman R M, Corbett J R and Peshock R M 1991 Spatial and temporal registration of

cardiac SPECT and MR images: methods and evaluation Radiology 179 857–61Faber T L and Stokely E M 1988 Orientation of 3-D structures in medical images IEEE Trans. Pattern Anal. Mach.

Intell. 10 626–33Fenlon M R, Jusczyzck A S, Edwards P J and King A P 2000 Locking acrylic resin dental stent for image guided

surgery J. Prosthetic Dentistry 83 482–5Fitzpatrick J M, Hill D L G, Shyr Y, West J, Studholme C and Maurer C R Jr 1998a Visual assessment of the accuracy

of retrospective registration of MR and CT images of the brain IEEE Trans. Med. Imaging 17 571–85Fitzpatrick J, West J and Maurer C Jr 1998b Predicting error in rigid-body, point-based registration IEEE Trans.

Medical Imaging 17 694–702Freeborough P A, Woods R P and Fox N C 1996 Accurate registration of serial 3D MR brain images and its application

to visualizing change in neurodegenerative disorders J. Comput. Assist. Tomogr. 20 1012–22Friston K J, Ashburner J, Poline J B, Frith C D, Heather J D and Frackowiak R 1995 Spatial registration and

normalisation of images Human Brain Mapping 2 165–89Gelfand M, Mironov A and Pevzner P 1996 Gene recognition via spliced sequence alignment Proc. Natl. Acad. Sci.

USA 93 9061–6Golub G H and van Loan C F 1996 Matrix Computations 3rd edn (Baltimore, MD: Johns Hopkins University Press)Goshtasby A 1988 Registration of images with geometric distortions IEEE Trans. Geosci. Remote Sensing 26

60–4Green B F 1952 The orthogonal approximation of an oblique structure in factor analysis Psychometrika 17 429–40Grimson W E L, Ettinger G J, White S J, Lozano-Perez T, Wells W M III and Kikinis R 1996 An automatic registration

method for frameless stereotaxy, image guided surgery, and enhanced reality visualization IEEE Trans. Med.Imaging 15 129–40

Gueziec A and Ayache N 1994 Smoothing and matching of 3-D space curves Int. J. Comput. Vision 12 79–104Gueziec A, Pennec X and Ayache N 1997 Medical image registration using geometric hashing IEEE Comput. Sci.

Eng. 4 29–41Guillemaud R and Brady M 1997 Estimating the bias field of MR images IEEE Trans. Med. Imaging 16 238–51Hajnal J B, Saeed N, Oatridge A, Williams E J, Young I R and Bydder G M 1995b Detection of subtle brain changes

using subvoxel registration and subtraction of serial MR images J. Comput. Assist. Tomogr. 19 677–91Hajnal J B, Saeed N, Soar E J, Oatridge A, Young I R and Bydder G M 1995 A registration and interpolation procedure

for subvoxel matching of serially acquired MR images J. Comput. Assist. Tomogr. 19 289–96Harder R L and Desmarais R N 1972 Interpolation using surface splines J. Aircraft 9 189–91Hauser R, Westermann B and Probst R 1997 Non-invasive tracking of patients’ head movements during computer-

assisted intranasal microscopic surgery Laryngoscope 211 491–9Hemler P F, van den Elsen P A, Sumanaweera T S, Napel S, Drace J and Adler J R 1995 A quantitative comparison

of residual error for three different multimodality registration techniques Information Processing in MedicalImaging 1995 ed Y Bizais, C Barillot and R Di Paola (Dordrecht: Kluwer Academic) pp 251–62

Hill D L G, Hawkes D J, Crossman J E, Gleeson M J, Cox T C S, Bracey E E C M L, Strong A J and Graves P 1991Registration of MR and CT images for skull base surgery using point-like anatomical features Br. J. Radiol. 641030–5

Hill D L G, Maurer C R Jr, Studholme C, Fitzpatrick J M and Hawkes D J 1998 Correcting scaling errors in tomographicimages using a nine degree of freedom registration algorithm J. Comput. Assist. Tomogr. 22 317–23

Hill D L G, Studholme C and Hawkes D J 1994 Voxel similarity measures for automated image registration Proc.SPIE 2359 205–16

Page 43: Medical image registration - Sharif University of Technologysbisc.sharif.edu/~miap/Files/Medical Image Registration.pdf · Medical image registration ... whatever the form, together

Medical image registration R43

Holden M, Hill D L G, Denton E R E, Jarosz J M, Cox T C S, Goodey J, Rohlfing T and Hawkes D J 2000 Voxelsimilarity measures for 3D serial MR image registration IEEE Trans. Med. Imaging 19 94–102

Howard M A, Dobbs M B, Siminson T M, LaVelle W E and Granner M A 1995 A non-invasive reattachable skullfinducial marker system J. Neurosurg. 83 372–6

Huang C T and Mitchell O R 1994 A Euclidean distance transform using grayscale morphology decomposition IEEETrans. Pattern Anal. Mach. Intell. 16 443–8

Hurley J R and Cattell R B 1962 The Procrustes program: producing direct rotation to test a hypothesized factorstructure Behav. Sci. 7 258–62

Jezzard P and Balaban R S 1995 Correction for geometric distortion in echo planar images from B0 field variationsMagn. Reson. Med. 34 65–73

Jiang H, Robb R A and Holton K S 1992 A new approach to 3-D registration of multimodality medical images bysurface matching Proc. SPIE 1808 196–213

Kanatani K 1994 Analysis of 3-D rotation fitting IEEE Trans. Pattern Anal. Mach. Intell. 16 543–9Kelly P J 1991 Tumor Stereotaxis (Philadelphia: Saunders)King A P, Blackall J M, Penny G P, Edwards P J, Hill D L G and Hawkes D J 2000 Bayesian estimation of intra-operative

deformation for image-guided surgery using 3D ultrasound Medical Image Computing and Computer-AssistedIntervention (MICCAI) 2000 (Lecture Notes in Computer Science 1935) (Berlin: Springer) pp 588–97

Koschat M A and Swayne D F 1991 A weighted Procrustes criterion Psychometrika 56 229–39Lavallee S and Szeliski R 1995 Recovering the position and orientation of free-form objects from image contours

using 3-D distance maps IEEE Trans. Pattern Anal. Mach. Intell. 17 378–90Lehmann T M, Gonner C and Spitzer K 1999 Survey: interpolation methods in medical image processing IEEE Trans.

Med. Imaging 18 1049–75Lemieux L and Barker G J 1998 Measurement of small inter-scan fluctuations in voxel dimensions in magnetic

resonance images using registration Med. Phys. 25 1049–54Lemieux L, Jagoe R, Fish D R, Kitchen N D and Thomas D G 1994 A patient-to-computed-tomography image

registration method based on digitally reconstructed radiographs Med. Phys. 21 1749–60Lemieux L, Kitchen N D, Hughes S W and Thomas D G T 1994 Voxel-based localization in frame-based and frameless

stereotaxy and its accuracy Med. Phys. 21 1301–10Lemieux L, Wieshmann U C, Moran N F, Fish D R and Shorvon S D 1998 The detection and significance of subtle

changes in mixed-signal brain lesions by serial MRI scan matching and spatial normalization Med. Image Anal.2 227–42

Levin D N, Pelizzari C A, Chen G T Y, Chen C-T and Cooper M D 1988 Retrospective geometric correlation of MR,CT, and PET images Radiology 169 817–23

Little J A, Hill D L G and Hawkes D J 1997 Deformations incorporating rigid structures Comput. Vision ImageUnderstanding 66 223–32

Maes F, Collignon A, Vandermeulen D, Marchal G and Suetens P 1997 Multimodality image registration bymaximization of mutual information IEEE Trans. Med. Imaging 16 187–98

Maguire G Q Jr, Noz M E, Lee E M and Schimpf J H 1986 Correlation methods for tomographic images using two andthree dimensional techniques Information Processing in Medical Imaging 1985 ed S L Bacharach (Dordrecht:Martinus Nijhoff) pp 266–79

Maintz J B A, van den Elsen P A and Viergever M A 1996 Evaluation of ridge seeking operators for multimodalitymedical image matching IEEE Trans. Pattern Anal. Mach. Intell. 18 353–65

Maintz J B A and Viergever M A 1998 A survey of medical image registration Med. Image Anal. 2 1–36

Marsden J and Hughes T 1994 Mathematical Foundations of Elasticity (New York: Dover)

Maurer C R Jr and Fitzpatrick J M 1993 A review of medical image registration Interactive Image-Guided Neurosurgery ed R J Maciunas (Park Ridge, IL: American Association of Neurological Surgeons) pp 17–44

Maurer C R Jr, Fitzpatrick J M, Wang M Y, Galloway R L Jr, Maciunas R J and Allen G S 1997 Registration of head volume images using implantable fiducial markers IEEE Trans. Med. Imaging 16 447–62

Maurer C R Jr, Maciunas R J and Fitzpatrick J M 1998 Registration of head CT images to physical space using a weighted combination of points and surfaces IEEE Trans. Med. Imaging 17 753–61

McGee K P, Felmlee J P, Manduca A, Riederer S J and Ehman R L 2000 Rapid autocorrection using prescan navigator echoes Magn. Reson. Med. 43 583–8

Meltzer C C, Bryan R N, Holcomb H H, Kimball A W, Mayberg H S, Sadzot B, Leal J P, Wagner H N Jr and Frost J J 1990 Anatomical localization for PET using MR imaging J. Comput. Assist. Tomogr. 14 418–26

Monga O and Benayoun S 1995 Using partial derivatives of 3D images to extract typical surface features Comput. Vision Image Understanding 61 171–89

Parker J, Kenyon R V and Troxel D 1983 Comparison of interpolating methods for image resampling IEEE Trans. Med. Imaging 2 31–9

Pelizzari C A, Chen G T Y, Spelbring D R, Weichselbaum R R and Chen C-T 1989 Accurate three-dimensional registration of CT, PET, and/or MR images of the brain J. Comput. Assist. Tomogr. 13 20–6

Pennec X and Thirion J-P 1997 A framework for uncertainty and validation of 3-D registration methods based on points and frames Int. J. Comput. Vision 25 203–29

Penney G, Weese J, Little J A, Desmedt P, Hill D L G and Hawkes D J 1998 A comparison of similarity measures for use in 2D–3D medical image registration IEEE Trans. Med. Imaging 17 586–95

Peters T M, Clark J A, Olivier A, Marchand E P, Mawko G, Dieumegarde M, Muresan L and Ethier R 1986 Integrated stereotaxic imaging with CT, MR imaging, and digital subtraction angiography Radiology 161 821–6

Pluim J, Maintz J B A and Viergever M A 2000 Interpolation artefacts in mutual information-based image registration Comput. Vision Image Understanding 77 211–32

Press W H, Teukolsky S A, Vetterling W T and Flannery B P 1992 Numerical Recipes in C: The Art of Scientific Computing 2nd edn (Cambridge: Cambridge University Press)

Rao C 1980 Matrix approximations and reduction of dimensionality in multivariate statistical analysis Multivariate Analysis (Amsterdam: North Holland)

Roberts D W, Strohbehn J W, Hatch J F, Murray W and Kettenberger H 1986 A frameless stereotaxic integration of computerized tomographic imaging and the operating microscope J. Neurosurg. 65 545–9

Robitaille P M, Abduljalil A M and Kangarlu A 2000 Ultra high resolution imaging of the human head at 8 Tesla: 2K x 2K for Y2K J. Comput. Assist. Tomogr. 24 2–8

Robson M D, Anderson A W and Gore J C 1997 Diffusion-weighted multiple shot echo planar imaging of humans without navigation Magn. Reson. Med. 38 82–8

Roche A, Pennec X, Rudolph M, Auer D P, Malandain G, Ourselin S, Auer L M and Ayache N 2000 Generalised correlation ratio for rigid registration of 3D ultrasound with MR images Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2000 (Lecture Notes in Computer Science 1935) (Berlin: Springer) pp 567–77

Rueckert D, Sonoda L, Hayes C, Hill D, Leach M and Hawkes D 1999 Non-rigid registration using free-form deformations: application to breast MR images IEEE Trans. Med. Imaging 18 712–21

Schad L R, Boesecke R, Schlegel W, Hartmann G H, Sturm V, Strauss L G and Lorenz W J 1987 Three dimensional image correlation of CT, MR, and PET studies in radiotherapy treatment planning of brain tumors J. Comput. Assist. Tomogr. 11 948–54

Schonemann P H 1966 A generalized solution of the orthogonal Procrustes problem Psychometrika 31 1–10

Shannon C E 1948 The mathematical theory of communication (parts 1 and 2) Bell Syst. Tech. J. 27 379–423, 623–56 (reprint available from http://www.lucent.com)

——1949 Communication in the presence of noise Proc. IRE 37 10–21 (reprinted in Proc. IEEE 86 447–57)

Sibson R 1978 Studies in the robustness of multidimensional scaling: Procrustes statistics J. R. Statist. Soc. B 40 234–8

Sled J, Zijdenbos A and Evans A 1998 A nonparametric method for automatic correction of intensity nonuniformity in MRI data IEEE Trans. Med. Imaging 17 87–97

Soderkvist I 1993 Perturbation analysis of the orthogonal Procrustes problem II: Numerical analysis BIT 33 687–95

Stewart G 1993 On the early history of the singular value decomposition SIAM Rev. 35 551–66

Studholme C, Hill D L G and Hawkes D J 1995 Multiresolution voxel similarity measures for MR–PET registration Information Processing in Medical Imaging 1995 ed Y Bizais, C Barillot and R Di Paola (Dordrecht: Kluwer Academic) pp 287–98

——1996 Automated 3D registration of MR and CT images of the head Med. Image Anal. 1 163–75

——1997 Automated 3D registration of MR and PET brain images by multi-resolution optimization of voxel similarity measures Med. Phys. 24 25–35

——1999 An overlap invariant entropy measure of 3D medical image alignment Pattern Recognition 32 71–86

Sumanaweera T S, Glover G H, Song S M, Adler J R and Napel S 1994 Quantifying MRI geometric distortion in tissue Magn. Reson. Med. 31 40–7

Talairach J and Tournoux P 1988 Co-Planar Stereotactic Atlas of the Human Brain (Stuttgart: George Thieme)

Tan K K, Grzeszczuk R, Levin D N, Pelizzari C A, Chen G T, Erickson R K, Johnson D and Dohrmann G J 1993 A frameless stereotactic approach to neurosurgical planning based on retrospective patient-image registration J. Neurosurg. 79 296–303

Thacker N A, Jackson A, Moriarty D and Vokurka E 1999 Improved quality of re-sliced MR images using re-normalized sinc interpolation J. Magn. Reson. Imaging 10 582–8

Thompson P M, Giedd J N, Woods R P, MacDonald D, Evans A C and Toga A W 2000 Growth patterns in the developing brain detected by using continuum mechanics tensor maps Nature 404 190–3

Umeyama S 1991 Least-squares estimation of transformation parameters between two point patterns IEEE Trans. Pattern Anal. Mach. Intell. 13 376–80

Unser M 1999 Splines: a perfect fit for signal and image processing IEEE Signal Process. Mag. 16 22–38

van den Elsen P A, Maintz J B A, Pol E-J D and Viergever M A 1995 Automatic registration of CT and MR brain images using correlation of geometrical features IEEE Trans. Med. Imaging 14 384–96

van den Elsen P A, Maintz J B A and Viergever M A 1992 Geometry driven multimodality image matching Brain Topogr. 5 153–8

van den Elsen P A, Pol E-J D, Sumanaweera T S, Hemler P F, Napel S and Adler J R 1994 Grey value correlation techniques used for automatic matching of CT and MR brain and spine images Proc. SPIE 2359 227–37

van den Elsen P A, Pol E-J D and Viergever M A 1993 Medical image matching: a review with classification IEEE Eng. Med. Biol. 12 26–39

Van Herk M and Kooy H M 1994 Automated three-dimensional correlation of CT–CT, CT–MRI and CT–SPECT using chamfer matching Med. Phys. 21 1163–78

Viola P A 1995 Alignment by maximization of mutual information PhD Thesis Massachusetts Institute of Technology

Weese J, Penney G P, Desmedt P, Buzug T M, Hill D L G and Hawkes D J 1997 Voxel-based 2-D/3-D registration of fluoroscopy images and CT scans for image-guided surgery IEEE Trans. Inform. Technol. Biomed. 1 284–93

Wells W M III, Viola P, Atsumi H, Nakajima S and Kikinis R 1996 Multi-modal volume registration by maximization of mutual information Med. Image Anal. 1 35–51

West J B, Fitzpatrick J M, Wang M Y, Dawant B M, Maurer C R Jr, Kessler R M and Maciunas R J 1999 Retrospective intermodality techniques for images of the head: surface-based versus volume-based IEEE Trans. Med. Imaging 18 144–50

West J B et al 1997 Comparison and evaluation of retrospective intermodality brain image registration techniques J. Comput. Assist. Tomogr. 21 554–66

Whittaker J M 1935 Interpolatory function theory Cambridge Tracts in Mathematics and Mathematical Physics no 33 (Cambridge: Cambridge University Press) ch 4

Woods R P, Cherry S R and Mazziotta J C 1992 Rapid automated algorithm for aligning and reslicing PET images J. Comput. Assist. Tomogr. 16 620–33

Woods R P, Grafton S T, Holmes C J, Cherry S R and Mazziotta J C 1998 Automated image registration: I. General methods and intrasubject, intramodality validation J. Comput. Assist. Tomogr. 22 139–52

Woods R P, Grafton S T, Watson J D G, Sicotte N L and Mazziotta J C 1998 Automated image registration: II. Intersubject validation of linear and nonlinear models J. Comput. Assist. Tomogr. 22 153–65

Woods R P, Mazziotta J C and Cherry S R 1993 MRI-PET registration with automated algorithm J. Comput. Assist. Tomogr. 17 536–46