Upload
others
View
5
Download
0
Embed Size (px)
Citation preview
4Effects of image characteristics onperformance of tumor delineationmethods: a test-retest assessment
Patsuree Cheebsumon, Floris H.P. van Velden, Maqsood Yaqub, Virginie Frings, Adrianus J.
de Langen, Otto S. Hoekstra, Adriaan A. Lammertsma, and Ronald Boellaard
Published in:
Journal of Nuclear Medicine 52 (2011), pp. 1550–1558
Chapter 4. Effects of image characteristics on performance of tumor delineation methods
Abstract
Positron emission tomography (PET) can be used to monitor response duringchemotherapy and assess biologic target volumes for radiotherapy. Previoussimulation studies have shown that the performance of various automatic orsemiautomatic tumor delineation methods depends on image characteristics.The purpose of this study was to assess test-retest variability of tumor delin-eation methods, with emphasis on the effects of several image characteristics(e.g., resolution and contrast). Methods: Baseline test-retest data from 19non-small cell lung cancer patients were obtained using [18F]-FDG (n=10) and3′-deoxy-3′-18F-fluorothymidine (18F-FLT) (n=9). Images were reconstructedwith varying spatial resolution and contrast. Six different types of tumor delin-eation methods, based on various thresholds or on a gradient, were applied to alldatasets. Test-retest variability of metabolic volume and standardized uptakevalue (SUV) was determined. Results: For both tracers, size of metabolic vol-ume, and test-retest variability of both metabolic volume and SUV were affectedby image characteristics and tumor delineation method used. The median vol-ume test-retest variability ranged from 8.3 to 23% and from 7.4 to 29% for 18F-FDG and 18F-FLT, respectively. For all image characteristics studied, largerdifferences (≤10-fold higher) were seen in test-retest variability of metabolicvolume than in SUV. Conclusion: Test-retest variability of both metabolicvolume and SUV varied with tumor delineation method, radiotracer and imagecharacteristics. The results indicate that a careful optimization of imaging anddelineation method parameters is needed when metabolic volume is used, forexample, as a response assessment parameter.
Reprinted by permission of the Society of Nuclear Medicine from:
Patsuree Cheebsumon, Floris H.P. van Velden, Maqsood Yaqub, Virginie Frings, Adrianus J.de Langen, Otto S. Hoekstra, Adriaan A. Lammertsma, and Ronald Boellaard. Effects ofImage Characteristics on Performance of Tumor Delineation Methods: A Test-RetestAssessment. J Nucl Med. 2011;52(10): 1550–1558.
50
4.1. Introduction
4.1 Introduction
Positron emission tomography (PET) is a functional imaging modality thatprovides information about the metabolism, physiology, or molecular biologyof tumor tissue. There is growing evidence that PET can be used to monitorresponse during chemotherapy and to assess biologic target volumes for radio-therapy [7, 26, 31, 32]. For response monitoring studies, it is important to knowwhether a difference between tumor volumes in successive scans represents a trueresponse or methodology-related variability. In addition, for radiation treat-ment planning, accurate definition of tumor volume is important for focusingthe dose to the tumor and sparing surrounding normal tissue. Various PETtracers have been developed to visualize and quantify biologic characteristics oftumors, that is, metabolism, proliferation, hypoxia and apoptosis. The mostwidely used PET tracer, 18F-FDG, is increasingly applied to define gross tumorvolume in radiotherapy. Evidence is accumulating that 18F-FDG could improvethe accuracy with which tumor boundaries are defined [7,31,32]. 18F-FDG up-take reflects glucose metabolism, and tumors can be identified on the basis oftheir increased rate of glycolysis. However, increased glucose metabolism is notspecific to tumors, and increased 18F-FDG uptake is also seen in, for example,inflammatory tissue [14].
Proliferation of tumor cells is directly related to DNA synthesis, which canbe measured using radiolabeled thymidine or thymidine derivatives. The 18F-labeled thymidine analog 3′-deoxy-3′-18F-fluorothymidine (18F-FLT) has showna high correlation with thymidine kinase-1 and tissue markers of proliferation,that is, proliferating cell nuclear antigen (Ki-67), in pulmonary nodules [16].Moreover, 18F-FLT showed high sensitivity and specificity, comparable with18F-FDG [65]. Therefore, 18F-FLT is increasingly being used as a specific tracerfor noninvasive assessment of tumor cell proliferation.
In this paper, we will use the term metabolic volume to indicate tumorvolumes that are derived directly from PET. This term may be justified, as 18F-FLT and 18F-FDG are trapped in tissue by metabolic (kinase) activity. However,for volume assessments with other tracers, that is, those that measure perfusionor bind to receptors, the term functional volume may be more appropriate.
Various techniques for determining the boundaries of the gross tumor volumebased on PET images have been reported [7, 29–32], ranging from visual inter-pretation to automatic or semiautomatic methods. In the simplest case (visual),tumor boundaries are outlined manually by a nuclear medicine physician, radi-ologist, or radiation oncologist. Manual outlining may lead to a large variationin gross tumor volume delineation, as boundary definition depends on both theexperience of the physician and the contouring protocol used [10]. Automatic orsemiautomatic delineation methods, methods that automatically delineate a tu-mor after user input, have been proposed to reduce this variability. So far, to ourknowledge, only 2 studies have reported the test-retest variability of metabolicvolumes [66, 67]. However, in the study of Frings et al. [66], metabolic volumetest-retest variability was evaluated for a few percentage threshold-based auto-
51
Chapter 4. Effects of image characteristics on performance of tumor delineation methods
mated tumor delineation methods, and in both studies metabolic volume test-retest variability was assessed using constant imaging parameters only. Thereare, however, many factors that could affect the accuracy of PET-based auto-matic or semiautomatic delineation methods, that is, image resolution, recon-struction settings, image noise, and tumor characteristics [7, 31, 63]. Assessingthe effects of these different image characteristics on metabolic volume test-retest variability is of the utmost importance to understand the need to optimizeimage quality [19]. Moreover, there are several types of PET-based automatedtumor delineation methods for which test-retest performance may or may notbe sensitive to the image characteristics.
The aim of this study was to further evaluate both the test-retest variabilityand differences in metabolic volumes derived from PET studies using varioustypes of automatic or semiautomatic delineation methods, with emphasis on theeffects of image characteristics (i.e., resolution and contrast) and for 2 differenttracers.
4.2 Materials and methods
4.2.1 Patients and Radiotracers
Retrospective data from patients with stage IIIB or IV non-small cell lung can-cer for 2 radioactive PET tracers were used. All patients gave written informedconsent, and both studies were approved by the Medical Ethics Review Com-mittee of the VU University Medical Center.
Ten patients (3 women and 7 men; mean age ± SD, 51 ± 5 y; range, 45-63 y;mean weight, 76 ± 10 kg; range, 56-94 kg) were included in a dynamic baseline18F-FDG study. Blood glucose levels were obtained for each patient and werewithin the reference range (mean, 5.5 ± 0.6 mmol⋅L−1; range, 4.4-7.0 mmol⋅L−1).All patients fasted for at least 6 h before scanning. In all patients, 2 dynamic18F-FDG studies were acquired on consecutive days.
Nine patients (2 women and 7 men; mean age, 66 ± 11 y; range, 45-78 y;mean weight, 72 ± 8 kg; range, 61-87 kg) were included in a dynamic baseline18F-FLT study. All patients were scanned twice within an interval of 1 wk.
4.2.2 PET Protocol
Patients were prepared in accordance with recently published guidelines forquantitative PET studies [23,30]. All patients were scanned in the supine posi-tion and received an intravenous catheter for tracer administration. All scans,performed using an ECAT EXACT HR+ scanner (Siemens/CTI) [54], startedwith a 10-min transmission scan. Afterward, a tracer bolus was administratedintravenously (18F-FDG: 388 ± 71 MBq; 18F-FLT: 350 ± 47 MBq) while dynamicemission scanning began in 2-dimensional acquisition mode. Each dynamic scanconsisted of 40 frames with the following lengths: 1 × 30, 6 × 5, 6 × 10, 3 × 20,5 × 30, 5 × 60, 8 × 150, and 6 × 300 s.
52
4.2. Materials and methods
Both the last 3 frames (45-60 min after injection) and the last 6 frames (30-60 min after injection) were summed to obtain various image contrasts, and theresulting sinograms were reconstructed using normalization and attenuation-weighted ordered-subsets expectation maximization with 2 iterations and 16subsets, followed by postsmoothing using a Hanning filter at 0.5 of the Nyquistfrequency [6]. An image matrix size of 256 × 256 × 63 was used, corresponding toa pixel size of 2.57 × 2.57 × 2.43 mm. Additional smoothing was applied to theimages using various gaussian kernels, thereby reducing both image resolutionand noise. The kernels used resulted in final spatial resolutions of 6.5, 8.3, and10.2 mm in full width at half maximum (FWHM). Using each combination ofimage contrast and noise (i.e., sum of last 3 or 6 frames), spatial resolution(i.e., 6.5, 8.3, and 10.2 mm FWHM) and tracer (i.e., 18F-FDG and 18F-FLT),test-retest variability of both metabolic volume and corresponding standardizeduptake value (SUV) was determined for all automatic or semiautomatic tumordelineation methods.
4.2.3 Data Analysis
Test-retest variability of both observed metabolic volumes and volumetricaverage SUVs was assessed for the following 6 different types of automatic orsemiautomatic tumor delineation methods:
1. Fixed threshold range of 50% and 70% of maximum voxel value withintumor (VOI50, VOI70). This method applies a threshold based on thepercentage of the maximum voxel intensity within the tumor [30]. Next,this threshold is used to delineate the tumor.
2. Adaptive threshold range of 41%-70% of maximum voxel value withintumor (VOIA41, VOIA50, VOIA70). This method is similar to the fixedthreshold method, except that it adapts the threshold relative to the localaverage background, thereby correcting for the contrast between tumorand local background [30].
3. Contrast-oriented method (VOISchaefer). This method uses a correctionby measuring the mean of 70% maximal SUV and background activity forvarious sphere sizes. Regression coefficients are calculated, which representthe relationship between optimal threshold and image contrast for varioussphere sizes [31]. This threshold equation is given by:
Thresholdoptimal = A ×meanSUV70% +B × background�� ��4.1
where A and B were fitted using phantom studies [31]. In general, differentvalues are applied for sphere diameters smaller and larger than 3 cm. Inour paper, we recalibrated this method; that is, we determined the A andB values that are specific for the PET system and image characteristicsused. Ideally, the diameter could be derived from CT images. However, asno CT images were available for the studies used, we obtained the diameter
53
Chapter 4. Effects of image characteristics on performance of tumor delineation methods
from 2 different delineation methods, VOIA41 and VOIA50 (multiplied by aconstant factor), and show them as VOISchaefer−A41 and VOISchaefer−A50,respectively.
4. Background-subtracted relative-threshold level (RTL) method (VOIRTL).This method is an iterative method based on a convolution of the point-spread function that takes into account the differences between varioussphere sizes and the scanner resolution [32].
5. Gradient-based watershed segmentation method (GradWT ). This methoduses 2 steps before calculating the volume of interest. First, this methodcalculates a gradient image on which a seed is placed in the tumor and an-other in the background. Next, a watershed algorithm is used to grow theseeds in the gradient basins, thereby creating boundaries on the gradientedges. In our presentation, the watershed continues to grow the gradientbasins until all voxels are classified as either tumor or nontumor (back-ground). The voxel is assigned to tumor if 2 watersheds are competing forthe same voxel.
6. Absolute SUV (SUV2.5). Normalised (SUV) voxel intensities at a chosenabsolute threshold are used to delineate tumor. An SUV of 2.5 was used, asit might properly differentiate between benign and malignant lesions [29].
For all delineation methods, the maximum voxel value was obtained byapplying a cross-shaped pattern that could be less sensitive to noise. Thismethod searches for the region with the (local) average maximum intensity,based on the average of 7 neighbouring voxels, which was then used as maxi-mum or peak value.
The volume measured by VOIA41 using both sum of last 3 frames and 6.5mm FWHM was used as the defined reference standard. The volumes obtainedby all tumor delineation methods using various image characteristics were com-pared with this defined reference standard. To assess accuracy, the mean ratio(of all methods compared with the reference dataset) and precision, that is, SD,for each tumor delineation method were calculated across all studies for a giventracer. Percentage test-retest variability was defined as:
∣Xtest −Xretest
Xmean of test and retest∣ ×100%
�� ��4.2
where X is either VOI size or SUV. For test-retest variability, we calculated me-dian, first quartile, third quartile, minimum and maximum values, and coefficientof determination (R2) between test and retest studies. All automated methodswere supervised to identify outliers. Outliers were removed from all analysesand were defined as either a small tumor (i.e., a node) that visually showed anunrealistically large measured metabolic tumor volume or a large tumor (>100mL) that had test-retest variability larger than 100% due to a clearly visuallyunderestimated metabolic volume in either the test or the retest baseline study.
54
4.3. Results
Figure 4.1. Mean ratio of tumor volume obtained with various tumor delineation methodsagainst defined reference standard (sum of last 3 frames and 6.5 mm FWHM) as functionof image resolution for 18F-FDG (A) and 18F-FLT (B). All bars cut off at 4 (indicated byabsence of SD bars) were higher than 20. Error bars represent SD
A 2-tailed paired Wilcoxon signed-rank test was used to indicate a statisti-cally significant difference between volume, SUV, and test-retest variability ofvolume and SUV obtained from images with various image characteristics andthose obtained from the defined reference standard. P values of less than 0.05were considered significantly different, and P values of between 0.1 and 0.05were considered to indicate a trend.
4.3 Results
4.3.1 Precision of Tumor Delineation Methods
Table 4.1 shows the number of outliers and detectable lesions for all tumor de-lineation methods in both test and retest studies. For 18F-FDG, identificationof several lesions was independent of contrast and resolution. Most methods didnot show a large difference (>3) in the number of outliers when image charac-teristics were varied, except for VOI50, VOIA41, both variants of VOISchaefer,and SUV2.5, which showed up to a 23% increase of the number of identifiedoutliers. Similarly, trends were observed for 18F-FLT. For this tracer, however,the number of lesions that could be detected depended moderately on imageresolution.
4.3.2 Accuracy of Tumor Delineation Methods
Figure 4.1 shows the effects of spatial resolution on the change in metabolicvolume for various tumor delineation methods and for both 18F-FDG and 18F-FLT. In general, there was variability (≤94%) in measured tumor volume when
55
Chapter 4. Effects of image characteristics on performance of tumor delineation methods
Table
4.1.
Nu
mber
of
ou
tliersw
hen
determ
inin
gtu
mo
rvo
lum
efo
ra
llsca
ns
(testa
nd
retest)fo
rd
ifferen
tim
age
cha
racteristics
an
dra
dio
tracers
18F
-FD
G18F
-FLT
Sum
of
last
Sum
of
last
Sum
of
last
Sum
of
last
6fra
mes
3fra
mes
6fra
mes
3fra
mes
Tum
or
delin
eatio
nm
eth
od
6.5
*8.3
*10.2
*6.5
8.3
10.2
6.5
8.3
10.2
6.5
8.3
10.2
VO
I50
12
14
15
813
14
74
57
54
VO
I70
02
30
02
00
01
00
VO
IA
41
610
12
66
11
33
34
54
VO
IA
50
23
30
02
10
02
00
VO
IA
70
00
00
00
10
01
00
VO
ISchaefer−A
41
63
66
34
21
34
23
VO
ISchaefer−A
50
55
44
34
21
14
22
VO
IR
TL
03
20
01
00
01
00
Gra
dW
T0
02
00
20
01
01
1
SU
V2.5
12
11
11
53
63
01
21
0
Tota
ldete
cta
ble
lesio
ns
60
60
60
60
60
60
35
33
32
36
34
31
*im
age
reso
lutio
n(m
m)
56
4.3. Results
Figure 4.2. Mean ratio of tumor volume obtained with various tumor delineation methodsagainst defined reference standard (sum of last 3 frames and 6.5 mm FWHM) as functionof image contrasts for 18F-FDG (A) and 18F-FLT (B). All bars cut off at 4 (indicated byabsence of SD bars) were higher than 20. Error bars represent SD
image resolution was changed. For almost all methods, except for VOIA70
and SUV2.5, the mean ratio obtained with low resolution (10.2 mm FWHM)was higher than that obtained with high resolution (6.5 mm FWHM). Com-pared with VOIA41 at 6.5 mm FWHM data, VOI50, VOISchaefer−A41, andGradWT provided similar volumes at high resolution. However, only GradWT
provided volumes independent of resolution. In contrast, VOI70, VOIA50, andVOIA70 gave lower volumes (>26%). Similar trends were observed between the2 tracers. However, for 18F-FLT, only a moderate overestimation of metabolicvolume (>15%) was observed for SUV2.5, compared with the reference value(Fig. 4.1B).
Figure 4.2 shows the effects of image contrast on the change in metabolicvolume for various tumor delineation methods and for both 18F-FDG and 18F-FLT. In general, the trends observed were similar to those when image resolutionwas changed; that is, results for lower contrast (6 frames or 30-60 min afterinjection) corresponded to those for lower resolution (10.2 mm FWHM).
4.3.3 Test-Retest Variability of VOI Size
Slope and R2 (intercept set to 0) between measured tumor volumes of test andretest studies obtained using different tumor delineation methods and tracersare shown in Table 4.2 for the defined reference standard. VOIA41, VOIA50,both variants of VOISchaefer, VOIRTL, and SUV2.5 showed good correlationbetween test and retest scans (R2 >0.90, slopes between 0.76 and 1.06) for bothtracers. For 18F-FDG, VOISchaefer−A41 showed the best correlation (R2, 1.00;slope, 1.01). Good correlation with respect to volume size (i.e., R2 >0.79, slopesbetween 0.71 and 1.11) was found for all tumor delineation methods, except for
57
Chapter 4. Effects of image characteristics on performance of tumor delineation methods
Table 4.2. Slope (with intercept fixed to 0) and coefficient of determination between tumorvolume size measured for test and retest studies
Tumor delineation method 18F-FDG 18F-FLT
R2 Slope R2 Slope
VOI50 0.79 0.71 0.93 0.83
VOI70 0.89 0.99 0.52 1.52
VOI70 (reduced dataset) — — 0.81* 1.21*
VOIA41 0.91 0.90 0.91 0.82
VOIA50 0.94 0.93 0.91 1.06
VOIA70 0.88 1.11 0.72 0.91
VOISchaefer−A41 1.00 1.01 0.96 0.79
VOISchaefer−A50 0.92 0.83 0.95 0.76
VOIRTL 0.95 0.93 0.90 1.04
GradWT 0.58 1.34 0.41 1.02
GradWT (reduced dataset) 0.86� 0.94� 0.70� 1.10�
SUV2.5 0.97 1.11 0.99 0.86
*After removing 2 outliers
�After removing 5 outliers
�After removing 3 outliers
GradWT , which showed a correlation of only 0.58. However, 5 lesions were clearoutliers for this method. These outliers were found in cases of heterogeneouslesions or a low tumor-to-background ratio. After these outliers were removed,a good correlation (R2, 0.86; slope, 0.94) was observed for this method as well.A similar result was observed in the case of 18F-FLT, for which the correlationfor GradWT improved from 0.41 to 0.70 when 3 outliers were excluded. Inaddition, VOI70 showed 2 outliers that provided a much smaller volume inthe test scan than in the retest scan. After these outliers were removed, thecorrelation improved from 0.52 (slope, 1.52) to 0.81 (slope, 1.21). In all cases,these outliers were found for tumors with very heterogeneous uptake or lesionsthat were close to high-uptake structures. For 18F-FLT, VOIA50 and VOIRTL
showed the best correlation (R2 >0.90; slope, ∼1.05).
Figure 4.3 shows the test-retest variability of metabolic volume as a functionof image resolution for high image contrast or noise (45-60 min after injection).Overall, volume test-retest variability depended mainly on image resolution forall tumor delineation methods and for both tracers. Median test-retest vari-ability of tumor volume ranged from 8.3% to 23% and from 7.4% to 29% for18F-FDG and 18F-FLT, respectively. For 18F-FDG (Fig. 4.3A), fixed, adap-tive percentage threshold, both variants of VOISchaefer and VOIRTL methodsshowed deteriorating median test-retest variability (≤11% difference) for lowerresolution. Both variants of VOISchaefer showed good performance, having alow median volume test-retest value (<13%) and a low number of changes inmedian test-retest values (<3.7% difference) when resolutions were varied. In
58
4.3. Results
Figure 4.3. Box-and-whisker plots of percentage test-retest (TRT) variability in tumorvolume obtained using various tumor delineation methods at high image contrast and varyingimage resolutions for 18F-FDG (A) and 18F -FLT (B). Median is horizontal line betweenlower (first) and upper (third) quartiles. Upper whisker represents upper quartile to maximumvalue, corrected for outliers (not exceeding 1.5 times interquartile range)
addition, VOIA41 and VOIA50 showed relatively low median volume test-retestvalues (14% and 17%, respectively) and a low number of changes in mediantest-retest values (<6.0% and 1.4% difference, respectively) when resolutionswere varied. Interestingly, for 18F-FLT (Fig. 4.3B), most methods showed anopposite trend in median test-retest variability when resolution was changed,with better performance at lower resolution. VOI70 and SUV2.5 were relativelyindependent of changes in resolution (<0.5% difference), having a low mediantest-retest variability (<15%). All other delineation methods gave a moderatevariation in test-retest variability (<9.5% difference) and reasonable mediantest-retest values (<29%) when resolutions were changed.
Figure 4.4 illustrates the effects of image contrast on volume test-retestvariability for a fixed resolution of 6.5 mm FWHM. Figure 4.4A shows that,for 18F-FDG, most methods were nearly independent of a change in contrast(<6.7% difference), except for GradWT (>8.3% difference). In contrast, for18F-FLT (Fig. 4.4B), reducing the contrast showed—likely because of an im-provement in noise levels by summing over more frames—an improvement inmedian test-retest variability for all methods (<12% lower difference), exceptfor GradWT (2.6% higher difference). SUV2.5 was the method that showed thelowest dependence on contrast (<1% difference).
4.3.4 Test-Retest Variability of SUV
Figure 4.5 illustrates the test-retest variability of SUV at high image contrastand for various image resolutions. Overall, changes in test-retest variability ofSUV were much lower than those seen for VOI size (median test-retest variabilityof SUV ranged from 4.5% to 11% and from 1.8% to 8.5% for 18F-FDG and18F-FLT, respectively). For all tumor delineation methods and both tracers,
59
Chapter 4. Effects of image characteristics on performance of tumor delineation methods
Figure 4.4. Box-and-whisker plots of percentage test-retest (TRT) variability of tumorvolume obtained by various tumor delineation methods when using different image contrastsfor 18F-FDG (A) and 18F-FLT (B). Median is horizontal line between lower (first) andupper (third) quartiles. Upper whisker represents upper quartile to maximum value, correctedfor outliers (not exceeding 1.5 times interquartile range)
Figure 4.5. Box plots of percentage test-retest (TRT) variability of SUV obtained by varioustumor delineation methods at high image contrast when image resolutions were varied for 18F-FDG (A) and 18F-FLT (B). Median is horizontal line between lower (first) and upper (third)quartiles. Upper whisker represents upper quartile to maximum value, corrected for outliers(not exceeding 1.5 times interquartile range). Note that scale differs from Figure 4.3
the effect of image resolution on test-retest variability of SUV was small (<4%difference). Figure 4.6 illustrates the effects of image contrast on test-retestvariability of SUV for a fixed resolution of 6.5 mm FWHM. Trends were similarto those seen for changes in image resolution.
60
4.4. Discussion
Figure 4.6. Box plots of percentage test-retest (TRT) variability of SUV obtained by vari-ous tumor delineation methods when different image contrasts were used for 18F-FDG (A)and 18F-FLT (B). Median is horizontal line between lower (first) and upper (third) quar-tiles. Upper whisker represents upper quartile to maximum value, corrected for outliers (notexceeding 1.5 times interquartile range). Note that scale differs from Figure 4.4
4.3.5 Statistics
Tables 4.3 and 4.4 (mean ± SD provided in Tables 4.5 – 4.8) indicate that formost tumor delineation methods a change in resolution has a more significantimpact on SUV and volume than their corresponding test-retest variability. Thesame trend was observed for a change in contrast, except for volumes obtainedon 18F-FLT images, where for most tumor delineation methods a change incontrast has a more significant impact on volume test-retest variability than onvolume itself.
4.4 Discussion
The aim of this study was to further investigate metabolic volume test-retestvariability beyond those findings published recently [66,67], not only by includ-ing various types of tumor delineation methods for 2 different tracers but alsoby studying the impact of image characteristics.
In theory, estimating metabolic tumor volume accuracy and reproducibilityis important for a curative outcome of radiation treatment planning. Tumordelineation methods may show metabolic tumor volumes that are too smallfor radiation treatment planning purposes, leading to local recurrences. Forresponse monitoring, however, consistent underestimations of metabolic tumorvolumes are less important, as only relative changes in tumor volume duringtherapy may be relevant. In general, all tumor delineation methods showedmuch larger variations in measured metabolic tumor volume (<29%) than inSUV (<11%), when image characteristics and radiotracers were varied. Thisfinding corresponds with the results of a previous report [30] showing that, in
61
Chapter 4. Effects of image characteristics on performance of tumor delineation methods
Table
4.3.
Pva
lues
for
actu
al
volu
me,
SU
Va
nd
test-retestva
riability
of
volu
me
an
dS
UV
obta
ined
from
18F
-FD
Gim
ages
with
vario
us
ima
gech
ara
cteristicsco
mpa
redto
tho
seo
btain
edfro
mth
ed
efin
edreferen
cesta
nd
ard
(sum
of
last
3fra
mes
an
d6
.5m
mF
WH
M)
Volu
me
Volu
me
test-re
test
varia
bility
SU
VSU
Vte
st-rete
stvaria
bility
Tum
or
delin
eatio
nR
eso
lutio
nC
ontra
stR
eso
lutio
nC
ontra
stR
eso
lutio
nC
ontra
stR
eso
lutio
nC
ontra
st
meth
od
(FW
HM
,in
mm
)(fra
mes)
(FW
HM
,in
mm
)(fra
mes)
(FW
HM
,in
mm
)(fra
mes)
(FW
HM
,in
mm
)(fra
mes)
8.3
10.2
68.3
10.2
68.3
10.2
68.3
10.2
6
VO
I50
<0.0
01
<0.0
01
<0.0
01
0.2
22
0.4
95
0.5
62
<0.0
01
<0.0
01
<0.0
01
0.3
05
0.3
93
0.8
38
VO
I70
<0.0
01
<0.0
01
0.0
01
0.4
52
0.9
32
0.8
08
<0.0
01
<0.0
01
<0.0
01
0.2
21
0.5
94
0.4
90
VO
IA
41
<0.0
01
<0.0
01
<0.0
01
0.1
26
0.1
79
0.1
01
<0.0
01
<0.0
01
<0.0
01
0.0
30
0.9
46
0.9
66
VO
IA
50
<0.0
01
<0.0
01
0.0
23
0.6
26
0.8
15
0.5
79
<0.0
01
<0.0
01
<0.0
01
0.2
53
0.5
65
0.5
50
VO
IA
70
0.2
61
0.4
38
0.2
21
0.9
83
0.9
71
0.2
29
<0.0
01
<0.0
01
<0.0
01
0.1
35
0.5
00
0.6
55
VO
ISchaefer−A
41
0.0
88
<0.0
01
<0.0
01
0.2
20
0.0
98
0.0
42
<0.0
01
<0.0
01
<0.0
01
0.1
20
0.8
23
0.9
88
VO
ISchaefer−A
50
0.0
12
<0.0
01
<0.0
01
0.2
37
0.0
45
0.1
14
<0.0
01
<0.0
01
<0.0
01
0.0
46
0.8
33
0.5
84
VO
IR
TL
<0.0
01
<0.0
01
<0.0
01
0.5
03
0.1
76
1.0
00
<0.0
01
<0.0
01
<0.0
01
0.3
39
0.0
72
0.8
55
Gra
dW
T0.1
10
0.0
02
0.5
12
0.1
00
0.4
17
0.2
29
<0.0
01
<0.0
01
<0.0
01
0.0
26
0.9
15
0.9
19
SU
V2.5
0.2
07
0.1
52
<0.0
01
0.4
83
0.5
65
0.3
93
<0.0
01
<0.0
01
<0.0
01
0.1
81
0.4
39
0.0
55
62
4.4. Discussion
Table
4.4.
Pva
lues
for
act
ua
lvo
lum
e,S
UV
an
dte
st-r
etes
tva
ria
bili
tyo
fvo
lum
ea
nd
SU
Vo
bta
ined
fro
m18F
-FL
Tim
age
sw
ith
vari
ou
sim
age
cha
ract
eris
tics
com
pare
dto
tho
seo
bta
ined
fro
mth
ed
efin
edre
fere
nce
sta
nd
ard
(su
mo
fla
st3
fra
mes
an
d6
.5m
mF
WH
M)
Volu
me
Volu
me
test
-rete
stvari
abilit
ySU
VSU
Vte
st-r
ete
stvari
abilit
y
Tum
or
delineati
on
Reso
luti
on
Contr
ast
Reso
luti
on
Contr
ast
Reso
luti
on
Contr
ast
Reso
luti
on
Contr
ast
meth
od
(FW
HM
,in
mm
)(f
ram
es)
(FW
HM
,in
mm
)(f
ram
es)
(FW
HM
,in
mm
)(f
ram
es)
(FW
HM
,in
mm
)(f
ram
es)
8.3
10.2
68.3
10.2
68.3
10.2
68.3
10.2
6
VO
I50
<0.0
01
0.0
01
0.2
72
0.0
24
0.1
05
0.0
67
<0.0
01
<0.0
01
0.0
43
0.0
83
0.7
70
0.0
43
VO
I70
<0.0
01
<0.0
01
0.0
12
0.2
77
0.5
61
0.1
17
<0.0
01
<0.0
01
<0.0
01
0.0
64
0.8
04
<0.0
01
VO
IA41
<0.0
01
<0.0
01
0.3
94
0.0
27
0.0
54
0.0
92
<0.0
01
<0.0
01
0.1
43
0.1
51
0.9
66
0.1
43
VO
IA50
<0.0
01
<0.0
01
0.0
73
0.3
91
0.2
17
0.0
09
<0.0
01
<0.0
01
0.0
03
0.4
63
0.2
41
0.0
03
VO
IA70
0.4
90
0.2
60
0.5
07
0.3
89
1.0
00
0.9
00
<0.0
01
<0.0
01
0.0
03
0.3
03
0.5
61
0.0
03
VO
ISchaefer−A
41
0.1
49
0.0
02
0.1
85
0.7
91
0.2
40
0.0
52
<0.0
01
<0.0
01
0.2
64
0.6
77
0.9
66
0.2
64
VO
ISchaefer−A
50
0.0
83
0.0
02
0.1
85
0.9
10
0.1
75
0.0
21
<0.0
01
<0.0
01
0.2
64
0.6
22
0.7
00
0.2
64
VO
IRT
L0.0
01
0.0
01
0.0
08
0.5
42
0.9
52
0.0
04
<0.0
01
<0.0
01
0.0
02
0.1
35
0.2
17
0.0
02
Gra
dW
T0.8
07
0.4
26
0.7
88
0.8
39
0.6
85
0.3
30
0.0
01
<0.0
01
0.0
96
0.7
87
0.1
68
0.0
96
SU
V2.5
<0.0
01
<0.0
01
<0.0
01
0.7
61
0.3
40
0.4
97
<0.0
01
<0.0
01
<0.0
01
0.2
41
0.4
14
<0.0
01
63
Chapter 4. Effects of image characteristics on performance of tumor delineation methodsTable
4.5.
Mea
n(S
D)
for
actu
al
volu
me
an
dtest-retest
varia
bilityo
fvo
lum
eo
btain
edfro
m18F
-FD
Gim
ages
with
vario
us
ima
gech
ara
cteristics
Volu
me
Volu
me
test-re
test
varia
bility
Tum
or
delin
eatio
nR
eso
lutio
nC
ontra
stR
eso
lutio
nC
ontra
st
meth
od
(FW
HM
,in
mm
)(fra
mes)
(FW
HM
,in
mm
)(fra
mes)
6.5
8.3
10.2
66.5
8.3
10.2
6
VO
I50
26.0
(48.0
)34.0
(60.1
)67.8
(138.1
)51.0
(94.1
)23.6
(34.9
)21.5
(22.0
)25.6
(22.9
)15.8
(16.7
)
VO
I70
5.2
(9.1
)6.2
(11.0
)7.0
(12.3
)6.6
(11.7
)31.4
(40.0
)29.0
(35.4
)26.6
(26.3
)34.9
(47.8
)
VO
IA
41
15.4
(25.3
)20.3
(29.6
)55.7
(94.9
)38.0
(65.2
)16.0
(22.2
)16.1
(15.7
)23.3
(23.3
)19.3
(21.2
)
VO
IA
50
9.9
(18.2
)11.0
(19.3
)12.3
(20.7
)10.7
(19.3
)29.8
(39.3
)28.2
(32.8
)25.5
(24.6
)33.6
(40.8
)
VO
IA
70
2.1
(4.2
)2.8
(5.7
)6.0
(16.0
)2.2
(4.6
)32.7
(32.1
)38.0
(46.4
)33.1
(40.9
)27.2
(35.5
)
VO
ISchaefer−A
41
14.7
(25.7
)16.2
(26.4
)46.5
(81.2
)29.9
(50.7
)13.8
(12.3
)19.8
(23.1
)23.4
(24.4
)23.2
(24.7
)
VO
ISchaefer−A
50
19.5
(32.0
)16.2
(26.4
)35.2
(64.8
)34.1
(62.5
)17.1
(19.6
)19.8
(20.9
)24.9
(25.4
)21.0
(22.7
)
VO
IR
TL
11.5
(19.3
)13.5
(21.4
)21.2
(38.1
)13.0
(21.0
)27.1
(31.3
)33.8
(37.6
)37.1
(38.7
)29.7
(35.0
)
Gra
dW
T13.7
(23.3
)16.2
(29.8
)16.8
(30.6
)15.4
(29.7
)30.4
(29.9
)18.0
(11.1
)21.1
(16.6
)38.7
(33.7
)
SU
V2.5
308.6
(410.0
)276.9
(411.4
)290.9
(435.2
)392.4
(429.7
)31.5
(35.5
)32.2
(36.9
)23.9
(19.1
)24.1
(21.6
)
Table
4.6.
Mea
n(S
D)
for
actu
al
SU
Va
nd
test-retestva
riability
of
SU
Vo
btain
edfro
m18F
-FD
Gim
ages
with
vario
us
ima
gech
ara
cteristics
SU
VSU
Vte
st-rete
stvaria
bility
Tum
or
delin
eatio
nR
eso
lutio
nC
ontra
stR
eso
lutio
nC
ontra
st
meth
od
(FW
HM
,in
mm
)(fra
mes)
(FW
HM
,in
mm
)(fra
mes)
6.5
8.3
10.2
66.5
8.3
10.2
6
VO
I50
5.7
(2.4
)5.3
(2.3
)5.3
(2.3
)5.5
(2.0
)8.6
(4.6
)8.5
(10.4
)10.0
(7.6
)9.6
(6.9
)
VO
I70
6.5
(3.0
)6.2
(2.8
)5.7
(2.8
)6.0
(2.6
)9.1
(9.3
)8.5
(9.2
)10.2
(10.0
)8.0
(5.3
)
VO
IA
41
5.3
(2.2
)5.1
(2.1
)5.1
(2.1
)5.0
(2.0
)7.5
(4.3
)7.5
(8.5
)11.3
(10.0
)8.6
(6.0
)
VO
IA
50
6.0
(2.6
)5.7
(2.5
)5.3
(2.4
)5.6
(2.3
)8.2
(7.4
)7.1
(8.4
)9.4
(9.3
)7.8
(6.1
)
VO
IA
70
7.3
(3.0
)6.9
(2.9
)6.7
(2.8
)6.7
(2.6
)8.8
(8.4
)7.2
(6.7
)9.3
(7.7
)7.7
(4.7
)
VO
ISchaefer−A
41
5.3
(2.0
)5.0
(2.1
)4.9
(2.0
)5.0
(1.9
)6.9
(4.0
)7.5
(9.2
)10.5
(10.5
)7.3
(4.9
)
VO
ISchaefer−A
50
5.3
(2.1
)5.0
(2.0
)4.9
(2.0
)5.0
(1.9
)7.2
(4.2
)7.3
(9.2
)9.7
(10.4
)8.1
(5.6
)
VO
IR
TL
5.8
(2.4
)5.5
(2.3
)5.0
(2.1
)5.3
(2.1
)7.7
(7.7
)9.2
(14.6
)12.4
(16.1
)9.0
(8.2
)
Gra
dW
T4.8
(2.0
)4.6
(2.0
)4.5
(2.0
)4.6
(1.9
)8.2
(5.3
)6.2
(4.9
)9.0
(11.2
)9.2
(7.4
)
SU
V2.5
3.9
(0.7
)3.8
(0.7
)3.7
(0.7
)3.8
(0.6
)6.9
(6.0
)6.1
(5.5
)6.0
(4.7
)5.3
(6.3
)
64
4.4. Discussion
Table
4.7.
Mea
n(S
D)
for
act
ua
lvo
lum
ea
nd
test
-ret
est
vari
abi
lity
of
volu
me
obt
ain
edfr
om
18F
-FL
Tim
age
sw
ith
vari
ou
sim
age
cha
ract
eris
tics
Volu
me
Volu
me
test
-rete
stvari
abilit
y
Tum
or
delineati
on
Reso
luti
on
Contr
ast
Reso
luti
on
Contr
ast
meth
od
(FW
HM
,in
mm
)(f
ram
es)
(FW
HM
,in
mm
)(f
ram
es)
6.5
8.3
10.2
66.5
8.3
10.2
6
VO
I50
54.8
(111.3
)134.3
(268.3
)149.0
(246.3
)107.5
(182.8
)26.3
(22.2
)21.4
(18.1
)26.7
(26.8
)22.9
(28.5
)
VO
I70
5.5
(6.7
)6.5
(7.5
)7.1
(7.9
)5.8
(6.5
)30.8
(40.1
)32.2
(36.9
)20.5
(19.8
)26.0
(46.1
)
VO
IA41
53.1
(107.3
)26.9
(36.0
)68.0
(149.7
)58.1
(106.5
)22.5
(23.3
)17.7
(17.1
)14.4
(11.8
)13.0
(18.0
)
VO
IA50
9.1
(12.0
)13.1
(17.7
)12.4
(13.5
)25.4
(64.9
)20.4
(18.2
)28.9
(32.8
)16.7
(12.2
)15.2
(15.1
)
VO
IA70
1.3
(1.9
)1.3
(1.7
)1.7
(2.5
)1.3
(1.6
)29.9
(31.7
)44.1
(41.4
)34.3
(42.2
)30.5
(29.6
)
VO
ISchaefer−A
41
53.4
(108.8
)21.9
(27.7
)26.1
(27.4
)54.3
(105.7
)19.7
(18.0
)19.6
(17.4
)12.6
(10.2
)8.6
(8.8
)
VO
ISchaefer−A
50
54.3
(112.0
)21.9
(27.7
)24.2
(27.1
)54.3
(105.6
)19.7
(18.5
)19.7
(16.6
)12.8
(9.9
)7.9
(8.4
)
VO
IRT
L11.0
(13.6
)17.6
(21.8
)17.5
(18.8
)11.9
(13.5
)38.2
(42.4
)24.1
(19.3
)23.0
(18.1
)14.4
(15.0
)
Gra
dW
T12.0
(11.2
)17.0
(23.1
)19.2
(27.3
)16.3
(18.5
)39.8
(34.5
)28.2
(34.6
)24.3
(25.5
)48.0
(47.8
)
SU
V2.5
126.9
(228.1
)79.7
(151.4
)82.2
(148.8
)88.0
(176.2
)16.1
(13.8
)14.2
(11.2
)19.7
(24.7
)23.5
(24.2
)
65
Chapter 4. Effects of image characteristics on performance of tumor delineation methods
Table
4.8.
Mea
n(S
D)
for
actu
al
SU
Va
nd
test-retestva
riability
of
SU
Vo
btain
edfro
m18F
-FL
Tim
ages
with
vario
us
ima
gech
ara
cteristics
SU
VSU
Vte
st-rete
stvaria
bility
Tum
or
delin
eatio
nR
eso
lutio
nC
ontra
stR
eso
lutio
nC
ontra
st
meth
od
(FW
HM
,in
mm
)(fra
mes)
(FW
HM
,in
mm
)(fra
mes)
6.5
8.3
10.2
66.5
8.3
10.2
6
VO
I50
3.7
(1.3
)3.2
(1.1
)3.1
(1.0
)3.7
(1.3
)9.9
(9.4
)7.5
(8.1
)7.9
(5.7
)8.8
(6.8
)
VO
I70
4.2
(1.3
)3.8
(1.3
)3.4
(1.3
)4.1
(1.4
)8.1
(7.6
)7.5
(7.3
)7.9
(6.2
)8.8
(7.9
)
VO
IA
41
3.7
(1.3
)3.2
(1.1
)3.0
(1.1
)3.6
(1.2
)7.9
(7.9
)6.8
(7.0
)7.3
(6.1
)7.1
(5.8
)
VO
IA
50
3.7
(1.2
)3.5
(1.1
)3.2
(1.1
)3.9
(1.3
)7.4
(7.2
)6.8
(6.2
)6.8
(5.6
)6.5
(5.0
)
VO
IA
70
4.7
(1.4
)4.3
(1.3
)3.9
(1.4
)4.7
(1.5
)7.8
(7.1
)6.9
(6.6
)7.6
(5.7
)6.3
(4.9
)
VO
ISchaefer−A
41
3.6
(1.2
)3.2
(1.1
)3.0
(1.1
)3.5
(1.2
)7.2
(7.1
)6.2
(6.9
)7.0
(5.3
)7.0
(5.4
)
VO
ISchaefer−A
50
3.6
(1.2
)3.2
(1.1
)2.9
(1.1
)3.5
(1.2
)7.2
(7.0
)6.5
(6.9
)6.8
(5.1
)7.1
(5.3
)
VO
IR
TL
3.8
(1.3
)3.4
(1.1
)3.1
(1.0
)3.7
(1.2
)11.3
(13.3
)5.9
(7.4
)6.8
(7.3
)8.3
(9.2
)
Gra
dW
T2.9
(0.7
)3.0
(0.9
)2.8
(0.9
)3.1
(0.9
)7.5
(5.2
)5.9
(6.6
)7.7
(5.9
)9.8
(8.4
)
SU
V2.5
3.5
(0.6
)3.3
(0.4
)3.2
(0.4
)3.5
(0.6
)2.7
(3.2
)2.7
(2.5
)2.5
(2.1
)3.1
(2.0
)
66
4.4. Discussion
response studies, there was only a small dependency of SUV ratios on VOI defi-nition and image parameters. Therefore, this discussion will focus on metabolicvolumes.
In our study, volumes determined by different tumor delineation methodswere affected by imaging parameters (resolution and noise or contrast) andtracers being used (Fig. 4.1– 4.6). This finding is in line with a previousstudy [19], showing that measured tumor volumes were affected by severalfactors, that is, image reconstruction settings, smoothing filters, and measuredmaximal SUV within a lesion. Moreover, the performance of several automaticor semiautomatic tumor delineation methods as a function of PET image charac-teristics agreed with results obtained from simulation and phantom studies [68].
Differences in tumor volumes generated with different methods have beenreported previously [7,31,32]. Substantially different results could be obtained incomparison with other image modalities or pathologic data. Few clinical studieshave shown the potential of threshold-based methods for different tracers [66,67,69]. Two articles [66,67] showed that different metabolic tumor volume test-retest repeatabilities were obtained when different tumor delineation methodswere used. Moreover, similarly to this study, they showed that volume test-retest variability obtained from 18F-FLT was larger than that from 18F-FDG.To date, however, no gold standard exists for accurately defining tumor volumeson various image modalities, with the possible exception of pathologic findings.
A previous study [70] reported excellent reproducibility, with an intraclasscorrelation coefficient of 0.98 and an SD of 7% for quantitative 18F-FLT mea-surements with high image contrast, a resolution of about 7 mm FWHM,VOIA41. In addition, this study showed that there was no significant corre-lation between absolute 18F-FLT uptake and lesion size in either lung or head-and-neck cancers, indicating that this threshold-based delineation method wasreliable for defining tumor boundaries of all lesion sizes. For this reason, in ourstudy, VOIA41 was used to compare measured volumes obtained by all other tu-mor delineation methods. However, our study shows that VOIA41 seemed to besensitive to a change in image characteristics for both tracers. For 18F-FLT, ahigh SD was observed when we summed over more frames, caused by 1 primarylesion with heterogeneous uptake. Furthermore, a relatively high number ofoutliers (≤20%) was found for both tracers when different image characteristicswere used (Table 4.1). Therefore, VOIA41 seems to be reliable only for highimage resolution and lesions with high contrast to background.
Two versions of VOISchaefer were investigated in this study. The perfor-mances between the 2 versions were similar (Fig. 4.2). However, for 18F-FDGat high resolution, a high SD of VOISchaefer−A50 was observed, caused by 1lesion with heterogeneous uptake that was near the spine. Therefore, the ver-sion in which diameter is obtained using VOIA41 is preferred in this dataset.
For radiation treatment planning, VOI50, VOIA41, both versions ofVOISchaefer, and GradWT provided, for both tracers, volumes that wererelatively independent of image contrast. However, VOI50, VOIA41, and bothvariants of VOISchaefer showed poorer performance at low image resolution(Fig. 4.1). As GradWT was relatively independent of image characteristics and
67
Chapter 4. Effects of image characteristics on performance of tumor delineation methods
showed a low number of outliers when image characteristics were varied (<4%),GradWT seems to be a good possible candidate for radiation treatment planning.However, validation of the various tumor delineation methods against a goldstandard, for example, pathology-determined tumor sizes, is still warranted.
Test-retest variability is important for assessing differences between suc-cessive scans beyond methodology-related variability. Clearly, for monitoringresponse, test-retest variability needs to be as low as possible. Large differencesin test-retest variability of tumor volume estimates (≤ 94%) were obtained fordifferent tumor delineation methods when different tracers or image character-istics were used, especially in cases of low image resolution. For both tracers,GradWT showed small test-retest variability (<17%) when image resolution wasvaried but resulted in larger test-retest variability when contrast was varied.One limitation of GradWT is that delineation of the tumor boundaries by thegradient algorithm depends on the tumor-to-background ratio, that is, con-trast, showing better performance for higher image contrast. For both tracers,VOI70 and SUV2.5 gave low changes in test-retest variability (<5.3% difference)when image characteristics were varied. Measured volumes obtained by VOI70,however, were too small to cover the whole lesion. SUV2.5 showed large overesti-mations of volume. In addition, SUV2.5 generated a large number of outliers fordifferent contrasts, especially for 18F-FDG (Table 4.1). In general, VOIA50 andVOIRTL showed reasonable test-retest variability and a small number of outliersfor both 18F-FDG and 18F-FLT (Fig. 4.3- 4.6). In general, VOIA50 showeda slightly smaller coefficient of variation (calculated as mean divided by SD)(<21%) than did VOIRTL (>27%) when resolution was changed. Therefore, asalso reported previously [66], VOIA50 seems to be a good possible candidate forresponse monitoring purposes.
For all image characteristics investigated, there was poor agreement betweenmedian test-retest variability of tumor volume and SUV (R2 <0.3, data notshown). In addition, there were large differences in median test-retest vari-ability between the 2 parameters for all image characteristics; that is, me-dian differences between test-retest variability of tumor volume and SUV wereapproximately 2.3-fold (range, 1.1–4.5) and 3.7-fold (range, 1.0–11) for 18F-FDGand 18F-FLT, respectively. The implication is that tumor volume and its test-retest variability are more sensitive to changes in image characteristics than areSUV and its test-retest variability, as also confirmed by the statistical post hocanalysis.
For most methods, higher values of median SUV test-retest variability wereobtained for lower contrast, with the exception of both variants of VOISchaefer
and SUV2.5 for 18F-FDG (Fig. 4.6A) and VOIA70, VOIRTL, and GradWT for18F-FLT (Fig. 4.6B). The likely explanation for this poorer percentage re-producibility is the lower average SUV caused by summing over more frames.When imaging parameters are varied, larger differences in SUV test-retest vari-ability were seen for 18F-FLT than for 18F-FDG, probably because of the lowerSUV for 18F-FLT. Previously, in a comparative study, it was shown that meanmaximal SUV in all lesions was lower for 18F-FLT than for 18F-FDG [70].
There were several limitations in determining the test-retest variability of
68
4.5. Conclusion
metabolic tumor volume and SUV using the various methods. Although 2 dif-ferent types of tracers were used in this study, both tracers have the same kind ofkinetic model. Therefore, the impact of various image characteristics on tracerswith other kinetic behaviors should be further investigated. In addition, becausethis was a clinical study, the exact lesion volumes clearly were not known. Thisissue needs to be addressed in future studies by comparing VOI measurementswith independent measurements based on other (anatomic) image modalitiesor pathologic specimens. Furthermore, visual inspection of outliers may haveaffected the performance evaluations to some extent. However, these visualinspections were required, as unrealistically large tumor segmentations mightoccur during segmentation because of noise, surroundings, or uptake hetero-geneity. Finally, the sum of the last 3 frames not only shows higher contrastbut also more noise. However, data were summed over 30 or 15 min, providingimages with good statistical quality. In this way, we attempted to reduce theeffect of a difference in noise between the 2 datasets.
4.5 Conclusion
For all automatic or semiautomatic tumor delineation methods tested, derivedmetabolic tumor volumes themselves and test-retest variability of both metabolictumor volume and SUV depended on image characteristics. Differences intest-retest variability of SUV were much smaller than those of tumor volume.These findings underline the need for a careful optimization of both the tumordelineation method used and the imaging parameters to obtain accurate andreproducible delineations of tumors or metabolic volume assessments.
69