

642 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 17, NO. 4, AUGUST 1998

An Objective Comparison of 3-D Image Interpolation Methods

George J. Grevera,* Member, IEEE, and Jayaram K. Udupa, Senior Member, IEEE

Abstract—To aid in the display, manipulation, and analysis of biomedical image data, they usually need to be converted to data of isotropic discretization through the process of interpolation. Traditional techniques consist of direct interpolation of the grey values [1]. When user interaction is called for in image segmentation, as a consequence of these interpolation methods, the user needs to segment a much greater (typically 4–10×) amount of data. To mitigate this problem, a method called shape-based interpolation of binary data was developed [2]. Besides significantly reducing user time, this method has been shown to provide more accurate results than grey-level interpolation [2]–[5]. We proposed [6] an approach for the interpolation of grey data of arbitrary dimensionality that generalized the shape-based method from binary to grey data. This method has characteristics similar to those of the binary shape-based method. In particular, we showed preliminary evidence [6], [7] that it produced more accurate results than conventional grey-level interpolation methods. In this paper, concentrating on the three-dimensional (3-D) interpolation problem, we compare statistically the accuracy of eight different methods: nearest-neighbor, linear grey-level, grey-level cubic spline [8], grey-level modified cubic spline [9], Goshtasby et al. [10], and three methods from the grey-level shape-based class [6]. A population of patient magnetic resonance and computed tomography images, corresponding to different parts of the human anatomy, coming from different 3-D imaging applications, is utilized for comparison. Each slice in these data sets is estimated by each interpolation method and compared to the original slice at the same location using three measures: mean-squared difference, number of sites of disagreement, and largest difference. The methods are statistically compared pairwise based on these measures. The shape-based methods statistically significantly outperformed all other methods in all measures in all applications considered here, with a statistical relevance ranging from 10% to 32% (mean 15%) for mean-squared difference.

Index Terms—Image interpolation, shape-based interpolation, three-dimensional (3-D) imaging, visualization.

I. INTRODUCTION

A. Background

Interpolation is required in imaging, in general, whenever the acquired image data are not at the same level of discretization as the level that is desired. Biomedical imaging

Manuscript received November 4, 1997; revised July 22, 1998. The Associate Editor responsible for coordinating the review of this paper and recommending its publication was W. Higgins. Asterisk indicates corresponding author.

*G. J. Grevera is with the Medical Image Processing Group, Department of Radiology, Suite 370 Science Center, University of Pennsylvania, 3600 Market Street, Philadelphia, PA 19104-6021 USA (e-mail: [email protected]).

J. K. Udupa is with the Medical Image Processing Group, Department of Radiology, University of Pennsylvania, Philadelphia, PA 19104-6021 USA.

Publisher Item Identifier S 0278-0062(98)08576-0.

systems, for example, collect data typically in a slice-by-slice fashion. Typically, the distance between adjacent image elements within a slice is different from the spacing between adjacent image elements in two neighboring slices. In addition, often the spacing between slices may not be the same for all slices. For visualization, manipulation, and analysis of such anisotropic data [1], they often need to be converted into data of isotropic discretization or of desired level of discretization in any of the dimensions. In many clinical imaging tasks, often it is useful to visualize the imaged property of the structure of interest within selected planes of arbitrary orientation.

If data are acquired for the same object of study, say a patient's brain, from two modalities such as magnetic resonance (MR) and positron emission tomography (PET), or from the same modality at two separate time instances, upon registering the two data sets, one of them needs to be rediscretized. In addition, the data sets may differ in their resolution, whence the level of discretization of one of them needs to be converted to that of the other, so that they can be analyzed compositely. Therefore, even with the continued improvement in scanner resolution, these problems will continue to exist and interpolation will always remain indispensable.

For further reference, we will use the term scene to refer to an acquired three-dimensional (3-D) image data set (a 3-D rectangular array of numbers). We will refer to the cells of the array as points or voxels and the actual values as scene intensities.

Broadly, interpolation techniques can be divided into two groups: scene-based and object-based. In scene-based methods [8], [9], [11], [12], interpolated scene intensity values are determined directly from the intensity values of the given scene. In object-based methods [2]–[5], [10], [13]–[15], some object information extracted from the given scene is used in guiding the interpolation process. Shape-based interpolation [2] is an example of object-based methods. Its motivation came from applications which required slice-by-slice help from a user for the difficult segmentation task. It is a method for interpolating binary scenes that result from segmentation. Thus, by doing interpolation after, rather than before, segmentation, the time to be spent by a user in the segmentation task is significantly (4–10×) reduced in such applications. Although saving user time was the original motivation, it was shown that, in applications where this is not a consideration, segmentation followed by shape-based interpolation gives more accurate results than the conventionally used linear grey-level interpolation followed by segmentation [2]–[4]. We described [6] a generalization of

0278–0062/98$10.00 © 1998 IEEE


GREVERA AND UDUPA: COMPARISON OF 3-D IMAGE INTERPOLATION METHODS 643

the binary shape-based method to grey-level scenes. In this generalization, the scene itself is treated as an object in a higher-dimensional space, the shape-based (binary) method is then applied to this object to interpolate its shape, and the result is expressed in the lower-dimensional space as the interpolated scene. We observed [6] that the grey-level shape-based method had characteristics similar to those of the binary shape-based method.

The objectives of the present paper are twofold: to present a method of objectively comparing scene interpolation techniques, and to apply this method to compare the accuracy of the grey-level shape-based method with that of several scene-based and object-based methods.

B. Evaluation of Interpolation Techniques

Interpolation is usually applied at an early stage of the sequence of operations that are used for the visualization, manipulation, and analysis of multidimensional scenes [16]. While it is important to comparatively study the effects of interpolation methods on the end results in visualization, manipulation, and analysis, it is even more important to objectively compare the methods in terms of the immediate interpolated results (scenes) they produce. Both of these issues have received little attention in the literature. Often, only visual examples are presented to illustrate the behavior of a new interpolation method. When quantitative measures are utilized for comparison, typically a few sample slices from a particular modality or a particular application or a particular part of the anatomy of a particular subject are analyzed. Such comparative studies do not convey a sense of the behavior of the interpolation method on a population of scene data even for a fixed modality, application, and body part. Additionally, new methods are usually compared to very few other methods. As a result, it becomes difficult to judge from the literature as to how the methods perform relatively. A full-fledged comparison taking into account these factors is indeed a daunting task and perhaps should not be a part of the responsibility of papers reporting new methods. Nevertheless, such comparisons are clearly essential.

C. Choosing Interpolation Methods for Comparison

To determine which interpolation methods should be used for comparison, we performed a literature search. We scanned papers in the conference proceedings Visualization in Biomedical Computing and the journals IEEE TRANSACTIONS ON MEDICAL IMAGING, IEEE Computer Graphics and Applications, and Computer Vision, Graphics, and Image Processing: Graphical Models and Image Processing published during the past five years. We recorded the interpolation methods that were mentioned at least once in each article. Linear grey-level interpolation was the most frequently used method (100 mentions). The spline family (B-spline, cubic spline, thin-plate spline, etc.) was the second most frequently mentioned (51 mentions). A variety of other methods such as nearest-neighbor, (truncated) sinc, Lagrange polynomial, Bezier, etc., were mentioned a total of 81 times, with the number of mentions for the four methods just listed being 10, 8, 6, and 5, respectively.

As a result of this survey, we concluded that linear grey-level interpolation is the most commonly used method. Therefore, it should be included in the chosen methods. We also felt that it is important to include cubic spline interpolation as a representative of the spline family since this family was the second most frequently cited technique. We also chose the modified cubic spline method [9] as a representative of the truncated sinc family. We included the nearest-neighbor method since it is commonly employed and because it also represents one extreme in interpolation method behavior. These are then the four scene-based methods chosen for comparison.

The grey-level shape-based approach [6] represents within its framework several scene-based, object-based, and hybrid strategies. We chose three object-based representatives in this family. We also felt that it was necessary to include other object-based methods. To keep our task manageable, we chose one such, namely the method of Goshtasby et al. [10]. This gave us four object-based methods in addition to the four scene-based methods mentioned above.

II. A BRIEF DESCRIPTION OF THE METHODS CHOSEN FOR COMPARISON

We refer to the given 3-D image data set that we wish to interpolate as a scene (C, f), where C is a (finite) set of discrete locations that comprise the scene, and each element of C is a triple of integers representing a location in the scene. Each location is viewed as a 3-D spatial element called a voxel rather than being viewed as a point, but we use voxel and point interchangeably. We think of a voxel as a 3-D cuboid whose center is the point under consideration. Each voxel has a value associated with it which is typically scalar. We define f: C → I to be the scene intensity function, which assigns a scalar in some set I to each voxel in C. If |I| (the cardinality of I) is two, then we refer to (C, f) as a binary scene with I = {0, 1}. Otherwise, we refer to (C, f) as a grey scene. In binary scenes, we refer to those voxels v in C such that f(v) = 0 as 0-voxels, and similarly for 1-voxels. Interpolation transforms a scene (C, f) into another scene (C', f'). Typically, we use interpolation to increase the level of discretization of the scene; therefore, |C'| is typically greater than |C|. We assume that voxel intensities are integers and that C is the set of voxels from a 3-D rectangular array. We determine a fixed scene coordinate system such that the coordinates of a voxel are simply its indexes in the array, and assume that

C = { (c1, c2, c3) | 1 ≤ cj ≤ nj for j = 1, 2, 3 }    (1)

where nj is the size of the array in direction j. Although most of the interpolation methods chosen here for comparison have n-dimensional generalizations, to make our task of comparison manageable, we will utilize them in their 3-D form. Further, rather than taking up 3-D interpolation in its most general form, we will restrict our study to the most commonly encountered "slice interpolation" problem, wherein the task of interpolation is to estimate new intervening slices without changing the level of discretization within the slice itself. For any scene (C, f), we will refer to the subscene (Ck, fk), where

Ck = { (c1, c2, c3) ∈ C | c3 = k }    (2)

and fk is the restriction of f to Ck, as the kth slice of (C, f). Generally, our approach will be to estimate the kth slice of (C, f) and then to compare the estimated slice to the actual kth slice of (C, f). All methods considered here do this estimation from a few neighboring slices, as described below. The method of comparing the estimated and the given slices is described in Section III.

In the rest of this section, we will give a brief description of each chosen method, first the scene-based and then the object-based methods. The description is included for completeness; please refer to the original papers for details. Here (C, f) and (C', f') represent the input and output scenes, respectively, and we use the notation v = (v1, v2, k) to denote a voxel in the kth slice of a scene.

A. Nearest-Neighbor

This method simply assigns to each voxel in the output scene the grey value of the closest voxel in the input scene.
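As a minimal sketch (in Python with NumPy; the array layout and function names are ours, not the paper's), nearest-neighbor slice interpolation can be written as:

```python
import numpy as np

def nearest_neighbor_slices(volume, z_in, z_out):
    """Estimate output slices by copying, for each requested z
    position, the input slice whose z coordinate is closest.

    volume : (nz, ny, nx) array of scene intensities
    z_in   : physical z coordinate of each input slice
    z_out  : z coordinates at which slices are wanted
    """
    z_in = np.asarray(z_in, dtype=float)
    idx = [int(np.argmin(np.abs(z_in - z))) for z in z_out]
    return volume[idx]
```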

B. Linear Grey-Level

The intensity of a voxel v in (C', f') is estimated from the intensities of the voxels v1 and v2 in the kth and (k + 1)th slices of (C, f) that lie directly above and below v. For such a voxel v,

f'(v) = f(v1) + [f(v2) − f(v1)] d(v, v1) / d(v1, v2)    (3)

where d denotes the Euclidean distance between voxels. For our specific problem of intervening slice interpolation, d(v1, v2) is simply the spacing between the two input slices.
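Because the two bounding voxels lie directly above and below the estimated one, the distance-weighted combination in (3) reduces to a per-slice weighted average. A sketch (Python/NumPy; names are ours):

```python
import numpy as np

def linear_slice(volume, k, t):
    """Estimate the slice at fractional position k + t (0 <= t <= 1)
    between input slices k and k+1: each voxel receives a
    distance-weighted average of the voxels directly above and
    below it, with t = d(v, v1) / d(v1, v2)."""
    return (1.0 - t) * volume[k] + t * volume[k + 1]
```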

C. Cubic Spline [8]

Cheney and Kincaid [8] define a natural spline as a function consisting of polynomial pieces on subintervals joined together with certain continuity conditions. The polynomial pieces are calculated from known sample points and are of low order (typically cubic). These known sample points are referred to as knots. In contrast with Bezier curves [17], the polynomials are required to pass through these knots. For our specific case of interpolating slices, the slice intensities at a given in-slice location serve as the knots along the slice direction, and the estimated intensity is the cubic polynomial piece between slices k and k + 1 evaluated at the position of v:

f'(v) = ak + bk t + ck t^2 + dk t^3    (4)

where t is the distance of v from slice k and the coefficients ak, bk, ck, dk are determined by the natural spline conditions at the knots.
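Per-voxel-column spline interpolation across slices can be sketched with SciPy's natural cubic spline; this is an illustration of the technique, not necessarily the paper's exact formulation:

```python
import numpy as np
from scipy.interpolate import CubicSpline

def cubic_spline_slice(volume, z_knots, z):
    """Fit, along the slice axis, a natural cubic spline through the
    known slices (the knots) and evaluate it at height z.

    volume  : (nz, ny, nx) array of known slices
    z_knots : z coordinate of each known slice
    """
    spline = CubicSpline(z_knots, volume, axis=0, bc_type='natural')
    return spline(z)
```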

D. Modified Cubic Spline [9]

This function was devised and studied in the context of image reconstruction from projections. It is based on a finite impulse response filter: a cubic function that is essentially a truncated sinc function in the spatial domain with a low-pass frequency response. (We chose this truncated form over the (untruncated) sinc function because the latter requires an infinite number of sample points and the number of samples available to us is finite [27].) For our particular case, the estimate is a weighted sum of the four nearest slices,

f'(v) = Σ_{i=−1}^{2} h(t − i) f(v_{k+i})    (5)

where h is the cubic kernel of [9], v_{k+i} is the voxel directly above or below v in the (k + i)th slice, and t is the fractional position of v between slices k and k + 1.
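The kernel of [9] defines its own coefficients; as an illustration of the same truncated-sinc idea only, here is the widely used 4-tap cubic-convolution kernel (with the common choice a = −0.5) applied to four neighboring slices:

```python
import numpy as np

def cubic_kernel(x, a=-0.5):
    """4-tap FIR kernel with a low-pass, truncated-sinc-like
    response (cubic convolution; a = -0.5 is a common choice)."""
    x = np.abs(np.asarray(x, dtype=float))
    out = np.zeros_like(x)
    near, far = x < 1, (x >= 1) & (x < 2)
    out[near] = (a + 2) * x[near]**3 - (a + 3) * x[near]**2 + 1
    out[far] = a * (x[far]**3 - 5 * x[far]**2 + 8 * x[far] - 4)
    return out

def modified_cubic_slice(volume, k, t):
    """Estimate the slice at position k + t (0 <= t < 1) as a
    kernel-weighted sum of the four nearest slices k-1 .. k+2."""
    offsets = [-1, 0, 1, 2]
    weights = cubic_kernel([t - o for o in offsets])
    return sum(w * volume[k + o] for w, o in zip(weights, offsets))
```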

E. Goshtasby et al. [10]

This object-based method is essentially a slice interpolation technique. It uses features called "correspondence points" to direct the interpolation process. Input consists of the kth and (k + 1)th slices of (C, f), which are referred to as the target and reference slices, respectively. Output is an estimated intervening slice of (C', f'). The algorithm for estimating such a slice can be summarized as follows.

1) Normalize the input grey values in both input slices by mapping them to the range [0, 255].

2) Calculate the grey-level gradient magnitude and direction at each point in both the target and the reference slices.

3) Assign an initial correspondence between points in the target to points in the reference making use of the gradient information.

4) Correct mismatches in correspondence by assigning the median displacement of neighboring correspondence points (in a window about the correspondence point under consideration) to the correspondence point under consideration.

5) Correct inconsistent matches by repeating Steps 2) and 3) above with the target and reference images reversed. Then discard any matches from the reference to the target that are not matched in the target-to-reference set of matches.

6) Interpolate using the correspondences determined in the previous steps as follows. a) If the point (in the reference slice) under consideration is a correspondence point, project along the correspondence line, determine the intersection of the line with the slice to be interpolated, and linearly assign the grey value to this point of intersection. b) Otherwise, determine the four closest correspondence points surrounding the given point (in the reference slice). Using these four correspondence points, their pairs in the target slice, and the given point, determine the point in the target slice for the given point. Then project along the line between these two points, determine the intersection of the line with the slice to be interpolated, and linearly assign the grey value to this point of intersection. c) For those points that are not interpolated by either a) or b) above, simply perform bilinear interpolation.

7) Reverse normalization of grey values done in Step 1).
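Step 4) amounts to a componentwise median filter over the correspondence displacements. As an illustrative sketch only, assuming a dense displacement field stored as an (H, W, 2) array (our representation; the paper operates on the sparser set of correspondence points):

```python
import numpy as np
from scipy.ndimage import median_filter

def median_correct(displacements, window=5):
    """Replace each displacement vector by the componentwise median
    of the displacements in a window about it, suppressing isolated
    mismatches in the correspondence field."""
    return np.stack(
        [median_filter(displacements[..., c], size=window)
         for c in range(displacements.shape[-1])], axis=-1)
```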

F. Grey-Level Shape-Based [6]

This approach defines a family of methods; here we outline a particular version of it. See [6] for the general treatment. The following steps are applied to determine each estimated slice of (C', f').


1) Convert the kth slice and the (k + 1)th slice into 3-D binary scenes by a lifting operation. Lifting consists of creating a column of 1-voxels for each voxel in the two grey slices. The height of the column is simply the grey value of the voxel.

2) Convert the two binary scenes into 3-D grey scenes by applying a distance transform to them. The grey value in the output scenes represents the shortest distance to the boundary in the binary scenes. Distances are considered to be positive for 1-voxels and negative for 0-voxels. (In this study, the distance transform calculated the Euclidean distance based upon the actual slice spacing of the data.)

3) Estimate an intervening 3-D grey scene by interpolating between the two distance scenes. Here we use linear interpolation.

4) Convert the interpolated grey scene into a binary scene by making a voxel a 1-voxel if the corresponding voxel has a positive value, and making the voxel a 0-voxel otherwise. Note that voxels with a value of exactly 0 can occur in the interpolated grey scene. We may make such voxels either 0-voxels or 1-voxels. We will include both these approaches for comparison and refer to them as the shape-based zero-to-0 method (zero distance maps to 0 in the binary scene) and the shape-based zero-to-1 method (zero distance maps to 1), respectively.

5) Collapse the binary scene into a 2-D grey slice by reversing the lifting operation.

6) Better results are obtained [6] by creating a 2-D slice that is the average of the two slices created in Step 5) by the zero-to-0 and zero-to-1 methods. We refer to this method as shape-based averaging and include it in our evaluation.
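The lifting/distance/collapse pipeline of Steps 1)–5) can be sketched for a pair of 2-D slices as follows (Python with SciPy's Euclidean distance transform; unit isotropic spacing and non-negative integer grey values below `levels` are assumed here, unlike the anisotropic distances used in the study):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def lift(slice2d, levels):
    """Step 1: lift a 2-D grey slice into a 3-D binary scene, one
    column of 1-voxels per pixel, of height equal to the grey value."""
    h = np.arange(levels).reshape(-1, 1, 1)
    return (h < slice2d[None]).astype(np.uint8)

def signed_distance(binary):
    """Step 2: signed distance, positive on 1-voxels, negative on
    0-voxels (isotropic unit spacing assumed in this sketch)."""
    inside = distance_transform_edt(binary)
    outside = distance_transform_edt(1 - binary)
    return inside - outside

def shape_based_slice(a, b, t=0.5, levels=256, zero_to_one=False):
    """Steps 3-5: linearly interpolate the two distance scenes,
    threshold back to binary, and collapse by summing each column."""
    d = (1 - t) * signed_distance(lift(a, levels)) \
        + t * signed_distance(lift(b, levels))
    binary = (d >= 0) if zero_to_one else (d > 0)
    return binary.sum(axis=0)
```

Step 6)'s averaging variant would average the slices obtained with `zero_to_one=False` and `zero_to_one=True`.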

Note that, in Step 3) of the shape-based methods, any scene-based interpolation technique can be employed, although we utilized linear interpolation in this investigation. In the two spline methods, a question arises as to which slices should serve as knots in determining the interpolating functions: the slices immediately adjacent on each side of the slice being estimated, or every other slice around it. Since we effectively subsample the given scenes, the latter strategy reflects the correct approach. Clearly there are many possibilities that we did not explore. For example, we could have taken the former slice selection strategy and utilized spline functions in Step 3) of the shape-based methods.

We note also that, because of end conditions, a few slices at each end of the scene (three slices for the spline methods and one slice for the other methods) cannot be estimated by interpolation. This does not matter for our evaluation since we consider the figures of merit extracted from only the estimated slices for statistical analysis.

III. METHODS OF EVALUATION

In this section, we first describe the patient scene data sets that were selected, then the figures of merit used for comparison, and finally the method of statistical analysis.

A. Selection of Scene Data

Five scene data sets from each of four on-going clinical research projects were randomly selected. A brief description of these data sets is given below.

1) MR Brain Data: Fast spin-echo proton density weighted images of multiple sclerosis patients were acquired for the purpose of detecting, quantifying, and monitoring white matter lesions in these patients [18]. The size of the scene domain and of the voxels are 256 × 256 × 50 and 0.9 mm × 0.9 mm × 3.0 mm for all scenes.

2) MR Foot Data: T1-weighted gradient echo images of human feet were acquired for quantifying, animating, and classifying kinematics of the tarsal joints [19]. The sizes of the scene domain and of the voxels are 256 × 256 × 60 and 0.6 mm × 0.6 mm × 1.5 mm for all scenes.

3) CT Head Data: X-ray CT data of craniofacial patients were acquired for surgery planning and studying the efficacy of a 3-D soft tissue display method [20]. The scene domain and the voxel size are 512 × 512 × 60 and 0.4 mm × 0.4 mm × 1.5 mm for all scenes.

4) MR Angiographic Data: Angiographic data of patients with suspected lesions in major vessels in the abdomen were acquired for studying the efficacy of a method of displaying MRA data [21]. The scene domain and voxel size are 256 × 256 × 80 and 1 mm × 1 mm × 1.5 mm for all scenes.

We emphasize that these data were not acquired specially for this study. They were taken from large databases of real clinical studies. They cover a broad spectrum in terms of noise characteristics, resolution, and object detail and definition. No special processing has been done on them prior to testing interpolation.

B. Figures of Merit

We will refer to the eight methods chosen for comparison as follows: nearest-neighbor, linear grey-level, cubic spline, modified cubic spline, Goshtasby et al., shape-based zero-to-0, shape-based zero-to-1, and shape-based averaging.

Our approach will be to estimate each kth slice that can be estimated (keeping in mind the end conditions) for each scene (C, f) (pretending that the kth slice of (C, f) is not known), to compare the estimated kth slice to the actual kth slice in (C, f) using the following three figures of merit (FOM's), and to compare the methods pairwise using these FOM's. In the following expressions of the FOM's, m denotes a method from the set of eight methods listed above, (C', f') represents the output scene produced by method m for an input scene (C, f), and N indicates the total number of voxels in (C, f) that are involved in the comparison. Because of the end conditions, this number is less than the actual total number of voxels in the given scene. The index k is considered over all slices of (C, f) that are actually estimated by interpolation. fk and f'k represent the kth slice of (C, f) and (C', f'), respectively. For any method m and scene (C, f)


TABLE I
STATISTICAL RELEVANCE r̄_i(m1, m2) OF THE DIFFERENCE IN PERFORMANCE OF THE INTERPOLATION METHODS FOR ALL PAIRS OF METHODS FOR THE MR BRAIN DATA SETS. A "+" INDICATES THAT THE FIRST METHOD IN THE PAIR PERFORMED BETTER THAN THE SECOND AND A "−" INDICATES THAT THE SECOND METHOD PERFORMED BETTER. ALL ENTRIES SHOWN ARE STATISTICALLY SIGNIFICANT (p-VALUE ≤ 0.05) EXCEPT THOSE MARKED AS NOT SIGNIFICANT.

, we denote the values of the three FOM's by FOM_i, i = 1, 2, 3.

1) Mean-squared difference (also referred to as mean squared error in the context of optimal linear filters [22]):

FOM_1 = (1/N) Σ_k Σ_{v ∈ Ck} [fk(v) − f'k(v)]^2    (6)

2) Number of sites of disagreement:

FOM_2 = Σ_k Σ_{v ∈ Ck} δ(v)    (7)

where

δ(v) = 1 if |fk(v) − f'k(v)| > t, and δ(v) = 0 otherwise.    (8)

This measure indicates the number of voxels at which the values within the kth slices of the interpolated scene differ by more than a small fixed value t from those of the given scene.

3) Largest difference:

FOM_3 = max_k max_{v ∈ Ck} |fk(v) − f'k(v)|    (9)

This measure indicates the largest deviation in voxel intensity within the kth slices of the interpolated scene from that of the given scene.
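The three FOM's, computed over the estimated slices of a scene, can be sketched as follows (Python/NumPy; `t` is the disagreement threshold, and the arrays stack the compared slices):

```python
import numpy as np

def figures_of_merit(original, estimated, t=0):
    """Return (mean-squared difference, number of sites of
    disagreement with |diff| > t, largest absolute difference)
    between corresponding original and estimated slices."""
    diff = original.astype(np.int64) - estimated.astype(np.int64)
    fom1 = float(np.mean(diff ** 2))
    fom2 = int(np.count_nonzero(np.abs(diff) > t))
    fom3 = int(np.abs(diff).max())
    return fom1, fom2, fom3
```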

C. Statistical Analysis

All FOM's are computed for all scenes and for all methods. We analyze the results separately for each of the four groups of scenes described in Section III-A. For each group of scenes, methods are compared pairwise for all possible pairs (28 in all) based on each FOM using a paired Student's t-test [23] under the null hypothesis that the two methods in the pair do not differ as per the FOM. In addition to this comparison by group, we conducted the paired t-test pooling all groups of scenes together, giving us a total of 20 scenes for the comparative study. The t-test was performed by using the Microsoft Excel spreadsheet software [24].
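The same paired test (done in the study with Excel) can be reproduced with SciPy; the per-scene FOM values below are made-up numbers for illustration only:

```python
import numpy as np
from scipy.stats import ttest_rel

# Hypothetical FOM_1 values for two methods over the same 5 scenes.
fom_method_a = np.array([12.1, 9.8, 15.3, 11.0, 13.7])
fom_method_b = np.array([10.4, 8.9, 13.1, 10.2, 12.0])

# Null hypothesis: the two methods do not differ as per this FOM.
t_stat, p_value = ttest_rel(fom_method_a, fom_method_b)
```

A p-value at or below 0.05 would, as in the study, be taken to indicate a statistically significant difference between the paired methods.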


TABLE II
STATISTICAL RELEVANCE r̄_i(m1, m2) OF THE DIFFERENCE IN PERFORMANCE OF THE INTERPOLATION METHODS FOR ALL PAIRS OF METHODS FOR THE MR FOOT DATA SETS. A "+" INDICATES THAT THE FIRST METHOD IN THE PAIR PERFORMED BETTER THAN THE SECOND AND A "−" INDICATES THAT THE SECOND METHOD PERFORMED BETTER. ALL ENTRIES SHOWN ARE STATISTICALLY SIGNIFICANT (p-VALUE ≤ 0.05) EXCEPT THOSE MARKED AS NOT SIGNIFICANT.

To express the degree of importance of the observed difference between methods as measured by the FOM's, we use a measure called statistical relevance, previously devised for comparing image reconstruction methods [25]. We define the statistical relevance of the improved performance of a method m1 over a method m2 for a scene, as per the ith FOM, as

r_i(m1, m2) = [FOM_i(m2) − FOM_i(m1)] / FOM_i(m2)    (10)

In this definition, we assume that FOM_i(m1) < FOM_i(m2), that is, that m1 is strictly a better method than m2. The greater the value of r_i(m1, m2), the more substantial is the observed improved performance of m1 over m2 as per the ith FOM.

FOM_1 indicates the average squared per-voxel difference between the actual, scanned data and the interpolated data. A value of zero indicates perfect performance of an interpolation method. Unfortunately, the choice of an upper bound on poor performance is not clear. Consider a hypothetical interpolation function that estimates intensities that have the following relationship to the original intensities f(v):

f̂(v) = 2^b − 1,  if f(v) < 2^(b−1),
f̂(v) = 0,        otherwise,    (11)

where b indicates the number of bits per voxel for scene intensity. This interpolation method always grossly overestimates small values and underestimates large values and may be considered an upper bound on poor estimation. Therefore, any other interpolation method would appear to be a dramatic improvement. So as not to bias the results in this manner by the choice of some theoretical bound, we argue that the nearest-neighbor method be used as the worst interpolation method and that one may assess the relative improvement in performance by considering FOM_i(m) in conjunction with r_{i,m,nn}, where nn is nearest-neighbor interpolation and m is any of the remaining methods.
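As a sketch, the relevance of (10) and the hypothetical worst-case estimator of (11) can be written as follows. The function names are ours, and the normalization in `statistical_relevance` follows our reading of (10), with lower FOM values being better:

```python
def statistical_relevance(fom_m, fom_m_prime):
    """Percent relevance of the improvement of method m over method m',
    per (10); assumes fom_m < fom_m_prime (lower is better, 0 perfect)."""
    return 100.0 * (fom_m_prime - fom_m) / fom_m_prime

def worst_case_estimate(intensity, bits):
    """Hypothetical worst interpolator of (11): maps small intensities
    to the maximum representable value and large intensities to zero."""
    return (1 << bits) - 1 if intensity < (1 << (bits - 1)) else 0
```

For 8-bit data, for example, `worst_case_estimate` sends every intensity below 128 to 255 and every intensity of 128 or above to 0, which is the gross over/underestimation described above.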


TABLE III
STATISTICAL RELEVANCE r̄_{i,m,m'} OF THE DIFFERENCE IN PERFORMANCE OF THE INTERPOLATION METHODS FOR ALL PAIRS OF METHODS FOR THE CT HEAD DATA SETS. A “+” INDICATES THAT THE FIRST METHOD IN THE PAIR PERFORMED BETTER THAN THE SECOND AND A “−” INDICATES THAT THE SECOND METHOD PERFORMED BETTER. ALL ENTRIES SHOWN ARE STATISTICALLY SIGNIFICANT (p-VALUE ≤ 0.05) EXCEPT THOSE MARKED AS NOT SIGNIFICANT

IV. RESULTS AND DISCUSSION

In Tables I–V, we present the results of the statistical analysis for the four groups of scene data, and in Tables VI and VII we summarize the results for all 20 scenes from the four groups. For each group of scenes, for each pair of (distinct) methods m and m', and for each of the three FOM's, we compute an average statistical relevance r̄_{i,m,m'}, which is simply the average of r_{i,m,m'}(C) over all scenes C in the group. This computation is done only if the t-test indicates a statistically significant difference between the methods m and m' in the pair. Each table lists the values of r̄_{i,m,m'} for all pairs of methods and for i = 1, 2, 3. An entry marked as not significant indicates that the difference is not statistically significant; a p-value ≤ 0.05 is considered to indicate a statistically significant difference. A “+” next to the numbers indicates that the first method in the pair performed better than the second method, and vice versa when a “−” appears.
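The per-group table entries just described can be sketched as: average the per-scene relevances only when the paired test finds a significant difference. This is an illustrative helper of ours, not the authors' code:

```python
def table_entry(relevances, p_value, alpha=0.05):
    """Average statistical relevance over the scenes in a group,
    reported only when the paired t-test finds a significant
    difference; None corresponds to a 'not significant' entry."""
    if p_value > alpha:
        return None
    return sum(relevances) / len(relevances)
```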

In Figs. 1–4, we show one sample slice (randomly selected) from each of the four groups of data estimated by using each of the eight methods. The original slice is shown in the center for reference. These figures are included only to illustrate the nature of the images and not for making any particular point. We previously demonstrated [6] some superior qualitative characteristics of the shape-based method (such as not introducing artificial edges) in comparison with linear grey-level interpolation, based on mathematical phantoms and on acquired patient data. It is indeed a daunting task to try to capture objectively the qualitative differences among methods and to demonstrate them within a reasonable space through subjective image displays. This is exactly the reason why objective methods are needed. Quick subjective impressions gained by looking at slice displays can be very misleading. To illustrate this point, examine the slices corresponding to the nearest-neighbor (nn) method in Figs. 1–4. Undoubtedly, in terms of sharpness, they appear the closest to the original slices and are perhaps the most pleasing to the eye. However, we know that nn is the least accurate method. The reason for the sharpness of the nn-estimated slices is that the method simply picks the nearest original slice!

Several interesting observations can be made from the results in Tables I–IV regarding the performance of the shape-based methods relative to other methods and of other methods among themselves.

TABLE IV
STATISTICAL RELEVANCE r̄_{i,m,m'} OF THE DIFFERENCE IN PERFORMANCE OF THE INTERPOLATION METHODS FOR ALL PAIRS OF METHODS FOR THE MR ANGIOGRAPHIC DATA SETS. A “+” INDICATES THAT THE FIRST METHOD IN THE PAIR PERFORMED BETTER THAN THE SECOND AND A “−” INDICATES THAT THE SECOND METHOD PERFORMED BETTER. ALL ENTRIES SHOWN ARE STATISTICALLY SIGNIFICANT (p-VALUE ≤ 0.05) EXCEPT THOSE MARKED AS NOT SIGNIFICANT

TABLE V
AMONG ALL 28 POSSIBLE COMPARISONS FOR METHOD sba WITH ALL OTHER METHODS AND ALL GROUPS OF DATA, THE NUMBER OF TIMES sba DID BETTER THAN, SAME AS, AND WORSE THAN OTHER METHODS FOR EACH FOM

Clearly, the shape-based averaging (sba) method outperforms the other methods. To further understand the performance of this method, we summarize in Table V some additional statistics. There are seven pairs in which any given method occurs in each of Tables I–IV, and hence 28 pairs in total in which a method occurs when all tables are considered. In Table V, we note, out of these 28 occurrences, how many times method sba performed better than, the same as, and worse than the other methods. Mean-squared difference is perhaps the most expressive among the three FOM's. As per this FOM, sba performed better than the other methods in 27 of the 28 cases and tied once. As per the second FOM (number of sites of disagreement), sba performed better than the other methods in 21 cases and tied with other (mostly other shape-based) methods in the remaining seven cases. As per the third FOM (largest difference), the performance of sba was better than or the same as the other methods in a majority of cases. Among the five cases of worse performance, two cases are with respect to one method and the remaining three cases are with respect to other methods. Clearly, this type of behavior must occur at very few voxels; otherwise, the superior performance evidenced by the first two FOM's would be violated.

Table VI reconfirms the superior performance of the shape-based method over the other methods. It is readily seen that sba outperforms (statistically significantly) all other methods in both FOM_1 (mean-squared difference) and FOM_2 (number of sites of disagreement). For FOM_3 (largest difference), sba was either better than or not statistically significantly different from all other methods except one method, which performed better than sba. Again, as pointed out earlier, this type of behavior must occur infrequently; otherwise the superior performance evidenced by the first two FOM's would be violated. Analogous to Table V, we summarize in Table VII the results observed in Table VI for sba. The results in Table VII are at least as supportive of method sba as those in Table V.

TABLE VI
STATISTICAL RELEVANCE r̄_{i,m,m'} OF THE DIFFERENCE IN PERFORMANCE OF THE INTERPOLATION METHODS FOR ALL PAIRS OF METHODS FOR ALL DATA SETS. A “+” INDICATES THAT THE FIRST METHOD IN THE PAIR PERFORMED BETTER THAN THE SECOND AND A “−” INDICATES THAT THE SECOND METHOD PERFORMED BETTER. ALL ENTRIES SHOWN ARE STATISTICALLY SIGNIFICANT (p-VALUE ≤ 0.05) EXCEPT THOSE MARKED AS NOT SIGNIFICANT

TABLE VII
GROUPING ALL DATA SETS INTO A SINGLE CATEGORY, METHOD sba CAN BE COMPARED WITH ALL OTHER SEVEN METHODS. THE NUMBER OF TIMES sba DID BETTER THAN, SAME AS, AND WORSE THAN OTHER METHODS FOR EACH FOM IS REPORTED

In addition to the paired t-tests, we also conducted the Wilcoxon signed-rank test [26] for the 20-scene pooled data. The results are in complete agreement with those from the paired t-test. Furthermore, we also performed an analysis of variance (ANOVA) [26] so as to avoid committing a Type I statistical error (declaring something to be significant when in fact it is not). The three figures of merit are the responses, and the interpolation method and patient data set are the effects. For all figures of merit, both effects were highly significant, and the excellent fit of this model is summarized in Table VIII.
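The Wilcoxon signed-rank statistic can be sketched as follows. This is a minimal implementation of the standard test statistic (our own, not the authors' code); zero differences are dropped and tied absolute differences receive averaged ranks:

```python
def wilcoxon_w(x, y):
    """Wilcoxon signed-rank statistic W = min(W+, W-) for paired samples."""
    d = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(d):
        # find the run of tied absolute differences starting at i
        j = i
        while j + 1 < len(d) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    w_minus = sum(r for r, di in zip(ranks, d) if di < 0)
    return min(w_plus, w_minus)
```

The statistic is then compared against the critical value for the number of nonzero differences, or converted to a normal approximation for larger samples.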

From the point of view of challenge for interpolation, the four groups of scenes cover a reasonably broad spectrum. The CT data have relatively smooth regions, although very sharp edges, and define one end of the spectrum. The MR brain and foot data are more textured (and less smooth) than the CT data and fall in the middle of the spectrum, whereas the highly textured and noisy MR angiographic data (with occasional sharp edges) constitute the other extreme. This division is also correct from the point of view of the resolution of the original images. Often the performance of methods changes from situation to situation. For example, the superior performance of the linear method over certain other methods as per FOM_1 (mean-squared difference) for the MR brain, MR foot, and CT head data seems to be lost or to become insignificant for the noisy MR angiographic data. Interestingly, the superior performance of the shape-based methods is not only consistent in all situations but is actually better for the noisy data (as per FOM_1 and FOM_2).

Fig. 1. A sample slice estimated by each of the eight methods for one of the MR brain data sets. The original slice is in the center. From left to right, row 1: nearest-neighbor, linear, cubic spline; row 2: shape-based averaging, original data, modified cubic spline; row 3: shape-based zero to 1, shape-based zero to 0, and Goshtasby et al.

Fig. 2. A sample slice estimated by each of the eight methods for one of the MR foot data sets. The original slice is in the center. From left to right, row 1: nearest-neighbor, linear, cubic spline; row 2: shape-based averaging, original data, modified cubic spline; row 3: shape-based zero to 1, shape-based zero to 0, and Goshtasby et al.

Fig. 3. A sample slice estimated by each of the eight methods for one of the CT head data sets. The original slice is in the center. From left to right, row 1: nearest-neighbor, linear, cubic spline; row 2: shape-based averaging, original data, modified cubic spline; row 3: shape-based zero to 1, shape-based zero to 0, and Goshtasby et al.

Fig. 4. A sample slice estimated by each of the eight methods for one of the MR angiographic data sets. The original slice is in the center. From left to right, row 1: nearest-neighbor, linear, cubic spline; row 2: shape-based averaging, original data, modified cubic spline; row 3: shape-based zero to 1, shape-based zero to 0, and Goshtasby et al.

Among the remaining (nonshape-based) methods, excluding method nn, which is clearly the worst performer, there is no clear mandate over all groups of scenes. Although statistically significant differences have been observed between methods, they do not seem to be consistent over all groups of scenes. For example, one method fares better than another based on FOM_1 for the MR brain data sets and the CT head data sets, while the reverse is true for the other two groups of data sets. There seems to be a similar dichotomy in the relative behavior of the algorithms for the CT head/MR brain data sets versus the other groups of data. Interestingly, the remaining shape-based methods performed consistently across all groups of scenes. The differences among some of the nonshape-based methods are subtle and may require the analysis of larger populations of scenes to gain a clear understanding of their behavior.

TABLE VIII
ANOVA RESULTS

V. CONCLUDING REMARKS

In this paper, we had two aims: 1) to present a method of objectively comparing 3-D image interpolation methods and 2) to utilize this method to evaluate eight methods chosen from the literature (nearest-neighbor, linear, cubic spline, modified cubic spline [9], Goshtasby et al. [10], and shape-based [2], [6], [7]). The method of comparison was as follows. We randomly selected five patient 3-D scenes from each of four clinical projects. The four groups of scenes were: MR brain data, MR foot data, CT head data, and MR angiographic data. Each slice of each scene in these groups was estimated by using each interpolation method and then compared to the actual slice in the scene using three figures of merit: mean-squared difference, number of sites of disagreement, and largest difference. The eight methods were statistically compared pairwise using these three FOM's, and the difference in each FOM was expressed as a statistical relevance. The grey-level shape-based methods [6], [7] (in particular, the method referred to as shape-based averaging) clearly showed an improved performance that was statistically significant over the other methods and over all groups of scenes. We conclude that there is strong evidence that the shape-based averaging method is the most accurate slice interpolation method among the methods compared, based on the FOM's employed.

To gain further insight into the behavior of shape-based and other methods of interpolation, one may consider more general interpolation tasks in three and four dimensions than the "slice interpolation" task considered in this work and carry out similar evaluation studies. FOM's may be devised to address goal-oriented questions, such as how well edges are preserved, how accurately surface and volume renditions are created, or how accurately subtle objects can be detected, to determine the effect of interpolation on such tasks.

ACKNOWLEDGMENT

The authors are grateful to Dr. R. I. Grossman, Dr. B. E. Hirsch, Dr. D. C. Hemmy, and Dr. G. Holland for making the patient data sets utilized in this investigation available to them.

REFERENCES

[1] J. K. Udupa and G. T. Herman, Eds., 3-D Imaging in Medicine. Boca Raton, FL: CRC, 1991.

[2] S. P. Raya and J. K. Udupa, "Shape-based interpolation of multidimensional objects," IEEE Trans. Med. Imag., vol. 9, pp. 32–42, Feb. 1990.

[3] G. T. Herman, J. Zheng, and C. A. Bucholtz, "Shape-based interpolation," IEEE Comput. Graphics Applicat., vol. 12, no. 3, pp. 69–79, 1992.

[4] W. E. Higgins, C. Morice, and E. L. Ritman, "Shape-based interpolation of tree-like structures in three-dimensional images," IEEE Trans. Med. Imag., vol. 12, pp. 439–450, Jun. 1993.

[5] W. E. Higgins, C. J. Orlick, and B. E. Ledell, "Nonlinear filtering approach to 3-D gray-scale image interpolation," IEEE Trans. Med. Imag., vol. 15, pp. 580–587, Aug. 1996.

[6] G. J. Grevera and J. K. Udupa, "Shape-based interpolation of multidimensional grey-level images," IEEE Trans. Med. Imag., vol. 15, pp. 881–892, Dec. 1996.

[7] ——, "Shape-based interpolation of nD grey scenes," in SPIE Proc. 2707, Medical Imaging, 1996, pp. 106–116.

[8] W. Cheney and D. Kincaid, Numerical Mathematics and Computing. Monterey, CA: Brooks/Cole, 1980.

[9] G. T. Herman, S. W. Rowland, and M.-M. Yau, "A comparative study of the use of linear and modified cubic spline interpolation for image reconstruction," IEEE Trans. Nucl. Sci., vol. NS-26, pp. 2879–2894, 1979.

[10] A. Goshtasby, D. A. Turner, and L. V. Ackerman, "Matching tomographic slices for interpolation," IEEE Trans. Med. Imag., vol. 11, pp. 507–516, Aug. 1992.

[11] W. K. Pratt, Digital Image Processing. New York, NY: Wiley, 1991.

[12] M. R. Stytz and R. W. Parrott, "Using Kriging for 3-D medical imaging," Computerized Med. Imag., Graphics, vol. 17, no. 6, pp. 421–442, 1993.

[13] R. A. Lotufo, G. T. Herman, and J. K. Udupa, "Combining shape-based and grey-level interpolations," in SPIE Proc. Visualization in Biomedical Computing, 1992, pp. 289–298.

[14] W. A. Barrett and R. R. Stringham, "Shape-based interpolation of greyscale serial slice images," SPIE Proc., vol. 1898, pp. 105–115, 1993.

[15] D. T. Puff, D. Eberly, and S. M. Pizer, "Object-based interpolation via cores," in SPIE Proc. 2167, Medical Imaging '94, Image Processing, 1994, pp. 143–150.

[16] J. K. Udupa and R. J. Goncalves, "Imaging transforms for visualizing surfaces and volumes," J. Digital Imag., vol. 6, no. 4, pp. 213–236, 1993.

[17] J. D. Foley and A. Van Dam, Fundamentals of Interactive Computer Graphics. Menlo Park, CA: Addison-Wesley, 1984.

[18] J. K. Udupa, L. Wei, S. Samarasekera, Y. Miki, M. A. van Buchem, and R. I. Grossman, "Multiple sclerosis lesion quantification using fuzzy connectedness principles," IEEE Trans. Med. Imag., vol. 16, pp. 598–609, Jun. 1997.

[19] B. E. Hirsch, J. K. Udupa, and S. Samarasekera, "A new method of studying joint kinematics from 3-D reconstructions of MRI data," J. Amer. Podiatric Med. Assoc., vol. 86, pp. 4–15, 1996.

[20] J. K. Udupa, J. Tian, D. C. Hemmy, and P. Tessier, "A Pentium-based craniofacial 3-D imaging and analysis system," J. Craniofacial Surg., vol. 8, no. 5, pp. 333–339, 1997.

[21] J. K. Udupa, T. J. Odhner, G. Holland, and L. Axel, "Automatic clutter-free volume rendering for MR angiography using fuzzy connectedness," SPIE Proc., vol. 3034, pp. 114–119, 1997.

[22] E. J. Coyle and J.-H. Lin, "Stack filters and the mean absolute error criterion," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, no. 8, pp. 1244–1254, 1988.

[23] J. L. Devore, Probability and Statistics for Engineering and the Sciences. Pacific Grove, CA: Brooks/Cole, 1991.

[24] Microsoft Excel User's Guide, Microsoft Corp., Redmond, WA, 1993.

[25] S. Matej, S. Furuie, and G. T. Herman, "Relevance of statistically significant differences between reconstruction algorithms," IEEE Trans. Image Processing, vol. 5, pp. 554–556, Mar. 1996.

[26] J. Sall and A. Lehman, JMP Start Statistics. Belmont, CA: Duxbury, 1996.

[27] G. Wolberg, Digital Image Warping. Los Alamitos, CA: IEEE Computer Society Press, 1990.