11
c o m p u t e r m e t h o d s a n d p r o g r a m s i n b i o m e d i c i n e 1 1 6 ( 2 0 1 4 ) 105–115 jo ur nal ho me p ag e: www.intl.elsevierhealt h.com/journals/cmpb Semi-automated and fully automated mammographic density measurement and breast cancer risk prediction Rafael Llobet a,, Marina Pollán b,c , Joaquín Antón a , Josefa Miranda-García d,e , María Casals d,e , Inmaculada Martínez d,e , Francisco Ruiz-Perales d,e , Beatriz Pérez-Gómez a,b , Dolores Salas-Trejo d,e , Juan-Carlos Pérez-Cortés a a Institute of Computer Technology, Universitat Politècnica de València, Camino de Vera s/n, 46022 Valencia, Spain b National Center for Epidemiology, Carlos III Institute of Health, Monforte de Lemos 5, Madrid 28029, Spain c Consortium for Biomedical Research in Epidemiology and Public Health (CIBER en Epidemiología y Salud Pública CIBERESP), Carlos III Institute of Health, Monforte de Lemos 5, Madrid 28029, Spain d Valencian Breast Cancer Screening Program, General Directorate of Public Health, Valencia, Spain e Centro Superior de Investigación en Salud Pública CSISP, FISABIO, Valencia, Spain a r t i c l e i n f o Article history: Received 3 July 2013 Received in revised form 6 October 2013 Accepted 27 January 2014 Keywords: Mammographic density Automated density assessment Computer-aided diagnosis Computer image analysis Breast cancer risk a b s t r a c t The task of breast density quantification is becoming increasingly relevant due to its asso- ciation with breast cancer risk. In this work, a semi-automated and a fully automated tools to assess breast density from full-field digitized mammograms are presented. The first tool is based on a supervised interactive thresholding procedure for segmenting dense from fatty tissue and is used with a twofold goal: for assessing mammographic density (MD) in a more objective and accurate way than via visual-based methods and for label- ing the mammograms that are later employed to train the fully automated tool. Although most automated methods rely on supervised approaches based on a global labeling of the mammogram, the proposed method relies on pixel-level labeling, allowing better tissue classification and density measurement on a continuous scale. The fully automated method presented combines a classification scheme based on local features and thresholding oper- ations that improve the performance of the classifier. A dataset of 655 mammograms was used to test the concordance of both approaches in measuring MD. Three expert radiolo- gists measured MD in each of the mammograms using the semi-automated tool (DM-Scan). It was then measured by the fully automated system and the correlation between both methods was computed. The relation between MD and breast cancer was then analyzed using a case–control dataset consisting of 230 mammograms. The Intraclass Correlation Coefficient (ICC) was used to compute reliability among raters and between techniques. Corresponding author. Tel.: +34 96 3877069; fax: +34 96 3877239. E-mail addresses: [email protected] (R. Llobet), [email protected] (M. Pollán), [email protected] (J. Antón), miranda [email protected] (J. Miranda-García), casals [email protected] (M. Casals), martinez [email protected] (I. Martínez), [email protected] (F. Ruiz-Perales), [email protected] (B. Pérez-Gómez), salas [email protected] (D. Salas-Trejo), [email protected] (J.-C. Pérez-Cortés). http://dx.doi.org/10.1016/j.cmpb.2014.01.021 0169-2607/© 2014 Elsevier Ireland Ltd. All rights reserved.

Semi-automated and fully automated mammographic density ...106 computer methods and programs in biomedicine 116 (2014) 105–115 The results obtained showed an average ICC=0.922 among

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Semi-automated and fully automated mammographic density ...106 computer methods and programs in biomedicine 116 (2014) 105–115 The results obtained showed an average ICC=0.922 among

Smc

RJFJa

b

c

Cd

e

a

A

R

R

6

A

K

M

A

C

C

B

((h0

c o m p u t e r m e t h o d s a n d p r o g r a m s i n b i o m e d i c i n e 1 1 6 ( 2 0 1 4 ) 105–115

jo ur nal ho me p ag e: www.int l .e lsev ierhea l t h.com/ journa ls /cmpb

emi-automated and fully automatedammographic density measurement and breast

ancer risk prediction

afael Llobeta,∗, Marina Pollánb,c, Joaquín Antóna,osefa Miranda-Garcíad,e, María Casalsd,e, Inmaculada Martínezd,e,rancisco Ruiz-Peralesd,e, Beatriz Pérez-Gómeza,b, Dolores Salas-Trejod,e,uan-Carlos Pérez-Cortésa

Institute of Computer Technology, Universitat Politècnica de València, Camino de Vera s/n, 46022 Valencia, SpainNational Center for Epidemiology, Carlos III Institute of Health, Monforte de Lemos 5, Madrid 28029, SpainConsortium for Biomedical Research in Epidemiology and Public Health (CIBER en Epidemiología y Salud Pública –IBERESP), Carlos III Institute of Health, Monforte de Lemos 5, Madrid 28029, SpainValencian Breast Cancer Screening Program, General Directorate of Public Health, Valencia, SpainCentro Superior de Investigación en Salud Pública CSISP, FISABIO, Valencia, Spain

r t i c l e i n f o

rticle history:

eceived 3 July 2013

eceived in revised form

October 2013

ccepted 27 January 2014

eywords:

ammographic density

utomated density assessment

omputer-aided diagnosis

omputer image analysis

reast cancer risk

a b s t r a c t

The task of breast density quantification is becoming increasingly relevant due to its asso-

ciation with breast cancer risk. In this work, a semi-automated and a fully automated

tools to assess breast density from full-field digitized mammograms are presented. The

first tool is based on a supervised interactive thresholding procedure for segmenting dense

from fatty tissue and is used with a twofold goal: for assessing mammographic density

(MD) in a more objective and accurate way than via visual-based methods and for label-

ing the mammograms that are later employed to train the fully automated tool. Although

most automated methods rely on supervised approaches based on a global labeling of the

mammogram, the proposed method relies on pixel-level labeling, allowing better tissue

classification and density measurement on a continuous scale. The fully automated method

presented combines a classification scheme based on local features and thresholding oper-

ations that improve the performance of the classifier. A dataset of 655 mammograms was

used to test the concordance of both approaches in measuring MD. Three expert radiolo-

gists measured MD in each of the mammograms using the semi-automated tool (DM-Scan).

It was then measured by the fully automated system and the correlation between both

ted. The relation between MD and breast cancer was then analyzed

methods was compu

using a case–control dataset consisting of 230 mammograms. The Intraclass Correlation

Coefficient (ICC) was used to compute reliability among raters and between techniques.

∗ Corresponding author. Tel.: +34 96 3877069; fax: +34 96 3877239.E-mail addresses: [email protected] (R. Llobet), [email protected] (M. Pollán), [email protected] (J. Antón), miranda [email protected]

J. Miranda-García), casals [email protected] (M. Casals), martinez [email protected] (I. Martínez), [email protected] (F. Ruiz-Perales), [email protected]. Pérez-Gómez), salas [email protected] (D. Salas-Trejo), [email protected] (J.-C. Pérez-Cortés).ttp://dx.doi.org/10.1016/j.cmpb.2014.01.021169-2607/© 2014 Elsevier Ireland Ltd. All rights reserved.

Page 2: Semi-automated and fully automated mammographic density ...106 computer methods and programs in biomedicine 116 (2014) 105–115 The results obtained showed an average ICC=0.922 among

106 c o m p u t e r m e t h o d s a n d p r o g r a m s i n b i o m e d i c i n e 1 1 6 ( 2 0 1 4 ) 105–115

The results obtained showed an average ICC = 0.922 among raters when using the semi-

automated tool, whilst the average correlation between the semi-automated and automated

measures was ICC = 0.838. In the case–control study, the results obtained showed Odds Ratios

(OR) of 1.38 and 1.50 per 10% increase in MD when using the semi-automated and fully

automated approaches respectively. It can therefore be concluded that the automated and

semi-automated MD assessments present a good correlation. Both the methods also found

an association between MD and breast cancer risk, which warrants the proposed tools for

breast cancer risk prediction and clinical decision making. A full version of the DM-Scan is

freely available.

© 2014 Elsevier Ireland Ltd. All rights reserved.

Fig. 1 – Breast composed of fatty tissue (top), dense andfatty tissue (middle) and mainly dense tissue (bottom).

1. Introduction

Breasts are composed of fibroglandular or dense tissue (FGT)and fatty tissue. On a conventional mammography, FGTappears brighter than fatty tissue, due to the higher X-rayattenuation of the former. Mammographic density (MD) iscomputed as the proportion of FGT in the breast. Some exam-ples of mammograms of different densities can be seen in Fig.1.

MD is associated with an increased risk of developingbreast cancer [1,2]. This association is more important thanalmost all other risk factors for the disease. Women with densebreasts are at four-to-six-fold higher risk than those with pri-marily fatty breasts [2–4].

In addition to the increased risk of developing breast can-cer in women with high MD, high breast density impedes itsdiagnosis due to the fact that the high X-ray attenuation ofdense tissue can obscure a tumor. This means the sensitivityof mammography for detecting breast cancer can be signifi-cantly reduced in the case of dense breasts [5]. Retrospectivestudies have shown that in current breast cancer screening10–25% of tumors are missed by radiologists [6,7] and some ofthese false negative results can be explained by higher breastdensity.

Since the association between mammographic density andan increased risk of breast cancer was first discovered, sev-eral metrics have been proposed to classify this parameter:Wolfe’s four parenchymal patterns [1,8], Tabar’s five patterns[9] and, more recently, Boyd and BI-RADS categories. TheBoyd scale [10] classifies MD in six categories: A:0%, B:1–10%,C:10–25%, D:25–50%, E:50–75% and F:75–100%, while BI-RADS[11] divides density into four categories: A:0–25%, B:25–50%,C:50–75% and D:75–100%. However, all these methods arebased on visual analysis of the mammogram and present twomajor drawbacks: the subjectiveness of the categorization andthe difficulty of assigning a category when the mammogramis near the boundary between two categories.

Computer-assisted measurement of breast density hasbeen studied in the last few years in an attempt to obtain moreobjective risk assessments. Several semi-automated methodsbased on interactive thresholding techniques that computethe percentage of the dense tissue over the segmented breastarea have been proposed [10,12], as well as some fully auto-

mated methods. Karssemeijer [13] developed an automatedmethod in which features are calculated from gray level his-tograms computed in different regions of approximately equal

distances from the skin line, and then classified using the k-nearest neighbor (k-NN) rule. Saha et al. [14] describe a methodusing a scale-based fuzzy connectivity approach. Klifa [15]et al. present a segmentation technique based on fuzzy clus-tering to quantify breast density from MRI data. Oliver et al.[16] suggest an approach based on gross segmentation andthe extraction of texture features of pixels with similar tis-sue appearance. This work is continued in [17] in which aFuzzy C-Means clustering approach is used for gross segmen-tation. Muhimmah et al. [18] use a feature extraction schemebased on a multiresolution histogram. Heine et al. [19] first per-

form FGT segmentation using a wavelet high pass filter. Thesame group later propose a new measure called variation mea-sure, which is calculated as the standard deviation of the pixel
Page 3: Semi-automated and fully automated mammographic density ...106 computer methods and programs in biomedicine 116 (2014) 105–115 The results obtained showed an average ICC=0.922 among

i n b i

vaMtmmttoIadthvtvsosa

mitbncr

aScvoTlo

cubmc

2

2

Tmucvnwct

c o m p u t e r m e t h o d s a n d p r o g r a m s

alues within a specific region of the breast, and find an associ-tion between this measure and the risk of breast cancer [20].anduca et al. [21] analyze several textural features of breast

issue and find that they predict breast cancer risk at the sameagnitude as MD. Li et al. [22] propose a method based on aachine learning approach in which the MD obtained using

he semi-automated tool presented in [10] is used as groundruth. A variety of measurements obtained under 15 thresh-lding methods are then used as features to learn the model.n general, segmentation of non-fatty tissue in mammogramsppears to be more difficult than one might think, due to largeifferences in appearance between different parenchymalypes [13]. 3D approximations to breast density assessmentave also been addressed [23–25], but there is no evidence thatolumetric density measurements are more strongly relatedo breast cancer risk than 2D measurements [26]. Moreover,ery high correlation has been found between breast den-ity measured from the cranio-caudal (CC) and mediolateralblique (MLO) views [27], which suggests that measuring den-ity from a mammography (a 2D projection of the breast) is ancceptable simplification.

Regardless of the method employed to classify and esti-ate mammographic density, this measure is of major

mportance as it could influence the choice of alterna-ive screening paradigms, such as shortening the intervalsetween mammograms, using other methods such as mag-etic resonance imaging (MRI), or to signal the need for moreareful interpretation of the mammogram, such as double-eading.

This work presents a computer-assisted (semi-automated)nd a fully automated tool for the assessment of MD (DM-can). The aim of the semi-automated tool is twofold. Firstly, itan be used to assess MD more objectively and accurately thanisually based methods. Secondly, it is used to label each pixelf the mammogram in one of two possible classes: dense/fatty.his process, together with a feature extraction scheme, is

ater used to train a model to estimate MD automatically, with-ut human intervention.

In a previous work [28], MD estimates using DM-Scan wereompared with those obtained by visual inspection and bysing Cumulus [10,29]. In this work, a comparison is madeetween the concordance of semi-automated and fully auto-ated assessment tools in calculating MD, after which the

orrelation between MD and breast cancer risk is analyzed.

. Materials and methods

.1. Data sets

wo different sets of mammograms, known as DDM (655ammograms) and case–control (230 mammograms), were

sed in this study. The DDM set was used to analyze theoncordance of each of the three methods in assessing MD:isual, semi-automated and fully automated. Unfortunately,

o prospective information related to cancer developmentas available in this set of mammograms, so that breast can-

er risk could not be analyzed in this case. For this purpose,he case–control set was used instead.

o m e d i c i n e 1 1 6 ( 2 0 1 4 ) 105–115 107

The DDM set consists of 655 mammograms, a subsetof those used in the DDM-Spain project [30,31]. This wasa cross-sectional study which recruited 3584 women aged45–68 years at 7 screening centers within the Spanish breastcancer screening network. Informed consent was obtainedfrom all participants, who agreed to their left cranio-caudalmammograms (single view) being used for study purposes.Mammograms from women participants at two screening cen-ters equipped with full-field digital mammography machines,i.e. a total of 655 mammograms, were used in the study. Thedevices used in these two centers were a Hologic-Lorad M-IVand a Siemmens MAMMOMAT NovationDR.

The case–control set was formed from all breast cancercases diagnosed in women attending the Burjasot screeningcenter in Valencia, where full-field digital images had been inuse for more than 4 years (Senographe 2000D Full Field Digi-tal Mammography System). Breast cancer cases diagnosed inwomen screened at this center between the years 2007 and2010 and who had attended screening in the previous round,were included in the study. For each case, a matched con-trol was randomly chosen among women of similar age (±2years) screened the same year. Cases and controls with breastimplants, surgical reduction or poor quality mammogramswere excluded, so that the final sample consisted of 112 casesand 119 controls. The study was approved by the CSISP (CentroSuperior de Investigación en Salud Pública) Ethics Committee.

2.2. MD assessment concordance

A computer-aided tool (DM-Scan) which computes MD bysegmenting dense and fatty tissue in the mammogram wasspecifically developed for this project. The MD of each of the655 mammograms in the DDM dataset was then assessedby three raters using this tool. In addition, a second read-ing of a subset of 150 randomly selected mammograms wasperformed two months later and both inter and intra-raterconcordance was computed.

The three raters who participated in this study were expe-rienced radiologists. Raters R1 and R2 had been readingscreening mammograms from more than 10 years, with 2years experience of full digital mammography in the formercase and 6 years of indirect digital mammography in the lat-ter. R3 had been reading mammograms for 34 years, including2 years of indirect digital mammographs and 6 years of fulldigital mammograms.

The goals in this phase were (1) to provide a tool to reducethe subjectiveness of the classification process, (2) to obtainmammogram density on a continuous scale instead of in cat-egories (as in manual classification) and (3) to obtain local(pixel-level) labeling of the mammograms. (1) and (2) wereexpected to improve agreement achieved in manual classi-fication, while (3) was to be used to train a fully automatedclassifier.

In the second phase, the fully automated classifier wasimplemented and the MD of each mammogram in the DDM

dataset was estimated. The reproducibility of the differentmethods was analyzed by computing the correlation of theMDs obtained by the semi-automated and automated meth-ods.
Page 4: Semi-automated and fully automated mammographic density ...106 computer methods and programs in biomedicine 116 (2014) 105–115 The results obtained showed an average ICC=0.922 among

108 c o m p u t e r m e t h o d s a n d p r o g r a m s i n b i o m e d i c i n e 1 1 6 ( 2 0 1 4 ) 105–115

segm

(

Fig. 2 – Example of FGT

2.3. Breast cancer discrimination

A case–control study was conducted to analyze the ability ofthe proposed techniques to predict breast cancer. MD wasassessed both by a rater using the semi-automated tool and bythe automated system. The area under the receiver operatingcharacteristic curve (AUC) and odds ratios were computed.

2.4. Semi-automated classification

Classification based on visual inspection is susceptible tohuman error and subjectivity. Even for experienced radiolo-gists, it is hard to assess mammographic density objectivelyand accurately. Some studies show that intraobserver agree-ment is about 80%, whilst inter-observer agreement is below70% when using this technique [32,33].

As computer-assisted breast density measurement canhelp to reduce subjectiveness, we developed a computer-aided diagnostic tool for MD quantification (DM-Scan).This tool is freely available for noncommercial use athttp://dmscan.iti.upv.es. As opposed to visual inspectionmethods, in which MD is usually specified as a category, itprovides quantitative MD measurement on a continuous scale.Although qualitative methods have been used in other studiesto measure mammographic density, these techniques involve

greater subjectivity and therefore lower reproducibility. Quan-titative methods can minimize such drawbacks. Fig. 2 showsan example of a digital mammogram viewed in the DM-Scanscreen.

(

entation in DM-Scan.

The basic idea of the proposed tool is to identify pixelsbelonging to background, fat tissue (FT) and fibroglandular ordense tissue (FGT) by establishing two thresholds: T1 and T2.MD is then measured as the amount of FGT in relation to breastsize, i.e., MD = FGT/(FGT + FT)100.

Firstly, a pre-process is applied to condition the imagebefore tissue segmentation is performed. Three main opera-tions are carried out in this phase: (a) contrast and brightnessnormalization, (b) brightness correction according to breastthickness and (c) segmentation of the breast and removal ofregions of no-interest.

a) Contrast and brightness normalization: Contrast can varysignificantly from one image to another due to differ-ences in the acquisition process (mainly acquisition deviceand radiation dose). To ensure that the brightness valuesdepend as much as possible on tissue density and noton other factors related to the acquisition process, con-trast and brightness normalization is desirable. Assumingthat minimum and maximum tissue densities are alwayspresent in a mammography (subcutaneous fat and con-nective tissue respectively), minimum and maximumgray-level values should also appear in the histogram.Based on this idea, a histogram stretching operation can beset to normalize brightness and contrast. Options to man-

ually modify brightness and contrast are also available.

b) Brightness correction: X-ray attenuation depends not onlyon the density of the irradiated tissue, but also on itsthickness. The thicker the tissue irradiated, the greater

Page 5: Semi-automated and fully automated mammographic density ...106 computer methods and programs in biomedicine 116 (2014) 105–115 The results obtained showed an average ICC=0.922 among

i n b i o m e d i c i n e 1 1 6 ( 2 0 1 4 ) 105–115 109

(

bbtFtsm

2

Tcnbi(fIamwoi

Training images

Labeling Feature extraction

Prototypes TFG/Fatty

Test image Preprocess Feature

extraction KNN

classification Hypothesis

map

Thresholding

MD

Fig. 3 – Scheme of the automated process.

c o m p u t e r m e t h o d s a n d p r o g r a m s

the attenuation and, consequently, the brighter the image.When the mammogram is taken, the breast is compressedbetween two parallel flat plates, (b) which causes thebreast to have a uniform thickness between the plates.However, toward the edge of the breast, the thickness grad-ually decreases. This is a drawback when the goal is tosegment dense tissue, since thicker regions may look likedense tissue and vice versa. In order to avoid this problem,a brightness correction coefficient ki,j was applied to eachpixel pi,j according to a user-defined parameter ∈ [0 : 1] asspecified below:

ki,j = + (1 − ˛)di,j

where di,j is the horizontal distance from pi,j to either theinternal border of the image or the pectoral muscle ifpresent divided by the total distance between this borderand the breast edge at row i, i.e., di,j = 0 when pi,j coincideswith the border of the image, and di,j = 1 when pi,j coincideswith the edge of the breast. A value of = 1 leaves the imageunchanged, while values of 0 ≤ < 1 attenuate brightnessas we approach the internal part of the mammogram. Thelower is the greater the attenuation.

c) Breast segmentation and removal of unwanted regions:Mammograms usually contain other extraneous objects,such as labels and/or the pectoral muscle. Breast seg-mentation is semi-automatically performed by findinga threshold value T1 that discriminates between back-ground and object pixels. The biggest object found isconsidered to be the breast, while the remainder are con-sidered regions of no interest and, therefore, removed.Nevertheless, this process cannot discard objects con-nected to the breast. To fix this problem, the user canmodify the proposed T1 threshold and also manuallyinvalidate other regions/objects not detected in the pre-vious process.

Once the image has been preprocessed and the breast haseen segmented by means of T1, a second threshold T2 muste manually set to separate dense and fat tissue, which meanshe dense and non-dense or fatty tissue can be measured.inally, MD = FGT/(FGT + FT)100 is computed. The segmenta-ion obtained in this process is also used as a method ofupervised pixel labeling, which is used to train a fully auto-ated classifier, as explained in the next section.

.5. Fully automated classification

he approach used for automated classification is based on alassical supervised machine-learning scheme. This approacheeds a set of labeled samples to train a model, which wille used later in the classification stage. When manual (visual

nspection) classification is performed, a global class labelsemi-quantitative classification usually including betweenour and six categories) is assigned to each mammogram.n this case, ground truth at pixel level (local level) is notvailable, which prevents the implementation of a supervised

ethod based on local features to train the classifier. However,hen addressing MD assessment tasks, methods based onlyn global features tend to fail due to high intra-class variabil-

ty. Our approach takes advantage of the FGT segmentation

performed with DM-Scan to label each breast pixel into oneof the two possible classes (FGT/fatty). This makes it possibleto train a model that can discriminate between both types oftissue. Fig. 3 shows a diagram of the proposed scheme.

In the training phase, local features are extracted from eachimage in the training set. For this purpose, a local window ofN × N pixels is shifted along the breast region and the gray levelvalues are extracted, producing a local feature vector for eachlocal window. Then, the extracted local feature vectors are pro-jected into a lower dimensional space by using PCA analysis,and labeled with the class pertaning to the central pixel ofthe local window. The optimal N value for images measuring1024 × 1024 was empirically found to be 31.

Once the system has been trained, the proposed classi-fication method consists of three stages. Firstly, the breastis automatically segmented from the background of themammogram and contrast normalization and brightness cor-rection is performed. Secondly, each pixel of the mammogram(or a subsampling if the resolution is too high) is classifiedusing the model learned in the previous training stage. Thisyields a hypothesis map (FGT/fatty) for each pixel, which couldbe used to compute MD. However, a third final step based ona thresholding operation is performed to improve the classifi-cation.

Breast segmentation is automated by combining automaticthresholding based on the histogram of the image, followed bya connected-component analysis algorithm that separates thebreast structure from background noise. In this work we usedthe cranio-caudal (CC) view of the mammogram, in whichthe pectoral muscle is either not present or is negligible. Thismeans the pectoral muscle is not automatically segmented.When using the mediolateral oblique (MLO) view, manual orautomatic segmentation of the pectoral muscle should be car-ried out.

Thresholding is performed to remove the dark backgroundby looking at the position of the highest peak in the histogram.Connected-component labeling is then applied for blob detec-tion. Finally, the breast region is selected as the blob with thebiggest area.

Contrast normalization and brightness correction is done

in the same way as in the semi-automated approach,described in Section 2.4. In this case, the parameter is setto a fixed, empirically calculated value.
Page 6: Semi-automated and fully automated mammographic density ...106 computer methods and programs in biomedicine 116 (2014) 105–115 The results obtained showed an average ICC=0.922 among

s i n b i o m e d i c i n e 1 1 6 ( 2 0 1 4 ) 105–115

Table 1 – Inter-rater ICC with their 95% confidenceintervals for semi-automated estimation.

Raters ICC [CI 95%]

R1–R2 0.922 [0.910–0.933]R1–R3 0.928 [0.916–0.938]

Table 3 shows the ICC comparing the fully automated andthe semi-automated (DM-Scan) methods for each rater. Fig. 5

Table 2 – Intra-rater ICC with their 95% confidenceintervals for semi-automated estimation.

Rater ICC [CI 95%]

110 c o m p u t e r m e t h o d s a n d p r o g r a m

In the classification stage, the approach used to extractlocal feature vectors from training images is applied to the testimage. Each of the feature vectors extracted are then classi-fied by the k-NN algorithm, which produces a hypothesis labelfor each pixel, using the fast approximate nearest neighborsearch, based on kd trees [34]. The result of the classifica-tion stage is a binary map representing tissue types. MD couldbe computed from this map, but a thresholding operation isfinally performed to improve the results.

Different thresholding operations are performed on theoriginal image at all the existing gray level values t. False posi-tives (FPt) and false negatives (FNt) are computed, in which FPt

are pixels classified as FGT with brightness lower or equal tot, and FNt are pixels classified as fatty with brightness higherthan t. Finally, the optimal threshold t is calculated as

t = argmint(FPt + FNt) (1)

and every point in the image is relabeled as FGT if its gray valueis higher than t orotherwise as fatty, leading to the final MDvalue as the ratio of FGT over total breast pixels.

Instead of using the hard classification scheme whichlabels each pixel in one of two possible classes (FGT/fatty), asoft or probabilistic classification can be performed. In thiscase, the k-NN classifier assigns to each pixel xi,j a probabilityP(xi,j) of belonging to the FGT class (and P(fatty|xi,j) is com-puted as 1 − P(FGT|xi,j)). The classification stage thus producesa probability map that represents the likelihood of each pixelbelonging to the FGT class. The optimal threshold t can thenbe computed using a probabilistic estimation of the conceptsof FPt and FNt as follows:

t = argmint

⎛⎝∑

i,j

(P(fatty|xi,j)ˇi,j,t + P(TFG|xi,j)(1 − ˇi,j,t)

⎞⎠ (2)

where ˇi,j,t = 1 if the brightness of xi,j is greater than t and 0otherwise. In short, the thresholding operation minimizes thesum of P(FGT|xi,j) for those pixels with brightness lower thant plus the sum of P(fatty|xi,j) for pixels with brightness greaterthan t.

3. Experiments and results

Two sets of experiments were carried out to test the perfor-mance of the proposed system. The first was aimed at testingthe agreement level in the MD calculation when using thesemi-automated and fully automated systems, using the DDMdataset described in Section 2.1. In the second set, the rela-tion between MD and the risk of developing breast cancer wasanalyzed by the case–control dataset (see Section 2.1).

3.1. Semi-automated and fully automated correlation

In this task, a real ground truth (gold standard) as defined inother classification problems does not exist, since differentexperts may differ in their assessment about what is consid-ered dense and fat tissue.

R2–R3 0.916 [0.902–0.927]Mean 0.922 [0.909–0.933]

In these cases, the performance is not derived from theerror rate, but from the concordance correlation among raters,which determines the degree of reproducibility of the mea-surement technique. With this purpose, intra and inter raterscorrelation using computer-aided (semi-automated) assess-ment, as well as correlation between fully automated andsemi-automated assessment has been computed using theIntraclass Correlation Coefficient (ICC). The oneway approxi-mation, as described in [35], has been employed.

In these cases, the performance is not derived from theerror rate, but from the concordance correlation amongraters, which determines the degree of reproducibility ofthe measurement technique. With this purpose, the intra-and inter-rater correlation was calculated using computer-aided (semi-automated) assessment, as well as the correlationbetween fully automated and semi-automated assessment,using the Intraclass Correlation Coefficient (ICC). The onewayapproximation, as described in [35], was employed. Prelim-inary experiments were carried out to test the behavior ofEqs. (1) and (2) in the proposed classification scheme. The bestresults were obtained with Eq. (1) (hard classification), and theresults shown below correspond to this approach.

Tables 1 and 2 show, respectively, the inter and intra-raterICC obtained when using DM-Scan in semi-automated mode.The extent of dispersion between raters is shown in Fig. 4by means of Bland–Altman plots. Horizontal lines are plottedindicating the limits of agreement (mean and mean ±1.96 SD).

These correlations are very good and significantly outper-form the concordance obtained in visual inspection reportedin previous works [31,36,37].

Once MD assessment by means of DM-Scan had beenperformed and analyzed, a fully automated classifier wasimplemented as described in Section 2.5. The FGT segmenta-tion of rater R2 was used as ground truth to train the system.The optimal values of the different parameters were empir-ically found to be: local window size = 31, number of principalcomponents = 8, = 0.7 and K = 11. A 10-fold cross-validationapproach was employed to assess the MD of the 655 mam-mograms.

R1 0.935 [0.911–0.952]R2 0.938 [0.915–0.955]R3 0.900 [0.863–0.926]Mean 0.924 [0.896–0.944]

Page 7: Semi-automated and fully automated mammographic density ...106 computer methods and programs in biomedicine 116 (2014) 105–115 The results obtained showed an average ICC=0.922 among

c o m p u t e r m e t h o d s a n d p r o g r a m s i n b i o m e d i c i n e 1 1 6 ( 2 0 1 4 ) 105–115 111

R1–R

sc

dibe

Fig. 4 – Bland–Altman plot for raters R1–R2 (top),

hows the Bland–Altman plot between R2 and the automatedlassification.

As can be seen in Tables 1 and 2, radiologists are reported toisagree on classifications, even if intra-observer agreement

s analyzed (average ICC = 0.92), so that a higher agreementetween a human and an automated classifier cannot bexpected. The reported ICC = 0.838 between R2 (used to train

Table 3 – ICC with their 95% confidence intervals for fullyautomated estimation.

Rater ICC [CI 95%]

R1 0.800 [0.771–0.826]R2 0.838 [0.814–0.860]R3 0.785 [0.754–0.813]Mean 0.794 [0.764–0.821]

3 (medium) and R2–R3 (bottom) using DM-Scan.

the system) and the automatic classifier can therefore be con-sidered a good result.

3.2. Breast cancer prediction

In the second set of experiments, MD was used as a test vari-able to determine the risk of developing breast cancer. Both therelation of semi-automated and fully automated MD assess-ment with breast cancer risk were analyzed. The case–controldataset was used to test both approaches, whilst R2’s DDMdataset was employed in these experiments to train the auto-mated system. Firstly, the MD mean values for Cases and

Controls were computed in both methods. Then, the areaunder the Receiver Operating Characteristic curve (AUC) andthe Odds Ratios (OR), adjusted by age, were used to comparethe semi-automated and the fully automated approaches. OR
Page 8: Semi-automated and fully automated mammographic density ...106 computer methods and programs in biomedicine 116 (2014) 105–115 The results obtained showed an average ICC=0.922 among

112 c o m p u t e r m e t h o d s a n d p r o g r a m s i n b i o m e d i c i n e 1 1 6 ( 2 0 1 4 ) 105–115

Fig. 5 – Bland–Altman plot for automated classification andsemi-automated classification (rater R1) with case–control

Table 6 – Odds ratios obtained at different cutpoints.

Cutpoint Control/case OR 95% CI P-value

Semi-automated<10 37/21 1.0010–19.9 31/26 1.89 0.94–3.82 0.07420–29.9 26/32 2.56 1.15–5.69 0.021≥30 25/32 2.87 1.29–6.39 0.010Per 10% increase 1.38 1.11–1.71 0.003

Automated<10 36/22 1.0010–19.9 32/25 2.56 1.19–5.49 0.016

dataset.

and its 95% confidence interval were estimated using uncon-ditional logistic models.

The mean MD obtained for controls and cases is shownin Table 4. As expected, MD is higher in Cases, which corrobo-rates the existence of a relation between MD and breast cancer.However, a slight underestimation and a lower MD range isdetected in the automated mode. This is due to some low con-trast images present in the case–control dataset, which meansMD is generally underrated in the automated system.

Table 5 shows the AUC obtained for semi-automated andfully automated MD quantification. Both methods show a sub-tle but significant correlation with breast cancer risk, whichsuggests that MD can help to predict this disease. The resultsare very similar for both approaches, which demonstrates thatthe proposed automated method could be used instead of thetraditional manual or semi-automated methods to estimateMD and predict breast cancer.

Table 6 shows the OR obtained at different cutpoints, with

the first quartile used as reference. It can be observed thatcancer risk increases with MD (average OR increase of 1.38 and1.50 per 10% increase in MD for the semiautomated and fully

Table 4 – Mean MD for controls and cases insemi-automated and fully automated methods.

Mean controls Mean cases P-value

Semi-automated 16.8 [14.4–19.2] 20.9 [18.1–23.6] 0.026Automated 16.6 [15.05–18.1] 18.6 [17.1–20.1] 0.061

Table 5 – AUC for the different markers analyzed

Marker AUC [CI 95%]

MDsemi 0.613 [0.539–0.687]MDauto 0.602 [0.530–0.672]

20–29.9 25/33 3.50 1.45–8.45 0.005≥30 26/31 2.55 0.81–8.03 0.109Per 10% increase 1.50 1.07–2.10 0.019

automated methods, respectively). In general, as in the AUCstudy, the behavior of both methods is very similar, althoughthe automated approach gives wider confidence intervals inthe last quartile.

Although the automated method presents a slightly lowerpredictive value than the semi-automated one at the highestcut point, the OR per 10% increase is higher in the former,which suggests that the automated method could also be usedto estimate breast cancer risk and consequently could be usedin clinical decision making.

4. Discussion

A semi-automated and a fully automated method of measur-ing breast density are proposed here. In the first case, MD isbased on the selection of an experienced radiologist’s thresh-old, which segments dense from fatty tissue. This allows MDmeasurement to be made on a continuous scale, in contrastwith traditional visual-based methods, which divide MD intocategories. Moreover, semi-automated segmentation of densetissue is employed to label the mammograms used later totrain the automatic system. A computer tool (DM-Scan) wasdeveloped for this purpose and is now freely available for non-commercial use at http://dmscan.iti.upv.es.

Using a dataset of 655 mammograms, we found a highinter-rater correlation in semi-automated MD assessment(average ICC = 0.922). Furthermore, the correlation betweenrater R2, using the semi-automated system, and the auto-mated system (trained by R2) was ICC = 0.838. Althoughthis correlation is slightly lower than intra-R2 correlation(ICC = 0.938), it is still high, which suggests that the proposedautomated method could be a valid option for measuring MD,particularly when a large number of mammograms must beprocessed.

In a second set of experiments carried out on 230 mam-mograms (a case–control dataset), we found that both theautomated and the semi-automated estimate of MD wereassociated to a similar extent with breast cancer (OR per 10%increase in MD: 1.38 and 1.50 for semi-automated and auto-mated modes, respectively). These results confirm that riskassessment protocols can take advantage of the proposed MDmeasurement methods to improve the estimation of breast

cancer risk. More accurate risk prediction would help in bet-ter clinical decision making as regards screening frequencyand screening test selection.
Page 9: Semi-automated and fully automated mammographic density ...106 computer methods and programs in biomedicine 116 (2014) 105–115 The results obtained showed an average ICC=0.922 among

i n b i

piscmiiwpAwc

bsluwabTbs

wmao

alsoa

tmitbmbp

wtcaetwtbc

iHom

r

c o m p u t e r m e t h o d s a n d p r o g r a m s

Although the automated method achieves a slightly bettererformance in average cases, there is a notable drop in OR

n patients with high breast density (greater than 30%), due toome misclassifications in this group. Some of these errorsould be due to the automated method being trained withammograms from the DDM set, while the test images used

n this experiment were from the case–control set. The imagesn the latter set were acquired by a Senographe 2000D (whichas not used to acquire the DDM set) and its mammogramsresent bright levels very different from those in the DDM set.lthough brightness and contrast normalization algorithmsere applied, some images had not been correctly prepro-

essed, which adversely affected the classification stage.The behavior of both approaches is in general compara-

le. The semi-automated method is suitable for analyzingmall numbers of mammograms. However, when processingarge numbers, as in screening programs, the advantages ofsing an automated system are obvious. Radiologists soon tirehen analyzing large numbers of images and the accuracy

nd objectivity of the measurement process can be affected,esides which using human processors is more expensive.he evidence we found that automated MD is associated withreast cancer risk could warrant the use of this tool in a clinicaletting for risk prediction and clinical decision making.

However, a number of issues should be addressed in futureork in order to improve the accuracy of automated densityeasurement and cancer risk prediction, including brightness

nd contrast normalization issues by more accurate methodsf finding histogram limits.

It has also been mentioned that the agreement between human expert (rater) and the automated system is slightlyower than that among individual raters. This could be due toome manual processes (brightness correction and removalf pectoral muscle) not being addressed in the automatedpproach.

Brightness correction, (parameter explained in Sec-ion 2.4) was fixed at a constant value of 0.9 in the automated

ethod. This value is adequate in most cases, but couldntroduce a significant error in MD measurement. Duringhe acquisition process, breast compression and the distanceetween the plates containing the breast can be stored asetadata in the image file and could be used to estimate

reast thickness [38] and, consequently, an adaptive value ofarameter ˛.

It should also be noted that a hard classification schemeas used, assuming that each pixel can only belong to one of

he two possible classes. In preliminary experiments, betteroncordance was found between semi-automated and fullyutomated assessment when using hard classification. How-ver the relation between MD and breast cancer risk was notested with a soft or probabilistic classification scheme, inhich each pixel has an associated probability of belonging

o each class. Future experiments should take these proba-ilities into account to test whether a probabilistic approachould improve breast cancer risk prediction.

The influence of the pectoral muscle was considered to be

nsignificant in the cranio-caudal (CC) view used in this study.owever, here again more accurate measurements could bebtained if it were to be supressed using algorithms for auto-atic pectoral muscle removall [39,40]. If mediolateral oblique

o m e d i c i n e 1 1 6 ( 2 0 1 4 ) 105–115 113

(MLO) views are used instead of CC, then automatic pec-toral muscle supression is mandatory. Also, improvements incontrast normalization should be addressed to avoid underes-timating MD in low contrast images. Finally, other parameterscould be explored that take into account not only the relativedensity but also the shape and the distribution of FGT in thebreast [20,22] or texture features from the mammogram [21].

In conclusion, our work shows that the semi-automatedand fully automated methods presented here showed a rea-sonable agreement and had substantial discriminative powerto predict subsequent breast cancer development. The fullyautomated method will facilitate the incorporation of mam-mographic density assessment in clinical and screeningpractices. It can also be expected to facilitate the study of theevolution of breast density with time, as recent studies sug-gest that this evolution is even more informative as regardsthe risk of breast cancer [41,42].

Authors’ contributions

JA & RLL developed DM-Scan. MP, RLL, JMG, BPG, and DST con-ceived the concordance study and participated in its design.JMG, RLL & JA gathered and organized the set of mammogramsfrom the concordance study. MC, IM, & FRP were responsiblefor MD assessment by DM-Scan. JA, RLL & JCPC conceived anddeveloped the algorithms for automated classification. JA &RLL carried out the experiments on automated classification.JMG & DST performed the matching process and gathered theinformation needed for the case–control study. RLL and MPperformed the statistical analysis and the results were circu-lated and commented on by all the authors. RLL drafted thefirst version of the manuscript and it was critically reviewedby the rest of the authors. JCPC corrected the final version. Allthe authors have read and approved the final manuscript.

Conflict of interest statement

The authors declare that they have no conflict of interest.

Acknowledgments

This work was supported by research grants from Gent perGent Fund (EDEMAC Project); Spain’s Health Research Fund(Fondo de Investigación Santiaria) (PI060386 & FIS PS09/00790);Spanish MICINN grants TIN2009-14205-C04-02 and ConsoliderIngenio 2010: MIPRCV (CSD2007-00018); Spanish Federationof Breast Cancer Patients (Federación Espanola de Cáncer deMama) (FECMA 485 EPY 1170-10). The English revision of thispaper was funded by the Universitat Politècnica de València,Spain.

e f e r e n c e s

[1] J.N. Wolfe, Breast pattern as an index of risk for developingbreast cancer, Am. J. Roentgenol. 126 (June (6)) (1976)1130–1137.

Page 10: Semi-automated and fully automated mammographic density ...106 computer methods and programs in biomedicine 116 (2014) 105–115 The results obtained showed an average ICC=0.922 among

s i n

114 c o m p u t e r m e t h o d s a n d p r o g r a m

[2] V.A. McCormack, I. dos Santos Silva, Breast density andparenchymal patterns as markers of breast cancer risk: ameta-analysis, Cancer Epidemiol. Biomarkers Prev. 15 (June(6)) (2006) 1159–1169.

[3] N.F. Boyd, J.M. Rommens, K. Vogt, V. Lee, J.L. Hopper, M.J.Yaffe, et al., Mammographic breast density as anintermediate phenotype for breast cancer, Lancet Oncol. 6(October (10)) (2005) 798–808.

[4] N.F. Boyd, L.J. Martin, M.J. Yaffe, S. Minkin, Mammographicdensity and breast cancer risk: current understanding andfuture prospects, Breast Cancer Res. 13 (6) (2011) 223.

[5] D.S. Buist, P.L. Porter, C. Lehman, S.H. Taplin, E. White,Factors contributing to mammography failure in womenaged 40–49 years, J Natl Cancer Inst. 96 (October (19)) (2004)1432–1440.

[6] L. Domingo, M. Sala, S. Servitja, J.M. Corominas, F. Ferrer, J.Martínez, et al., Phenotypic characterization and risk factorsfor interval breast cancers in a population-based breastcancer screening program in Barcelona, Spain, CancerCauses Control. 21 (August (8)) (2010) 1155–1164.

[7] G.M. te Brake, N. Karssemeijer, J.H. Hendriks, Automateddetection of breast carcinomas that were not detected in ascreening program, Radiology 207 (May (2)) (1998) 465–471.

[8] J.N. Wolfe, Risk for breast cancer development determinedby mammographic parenchymal pattern, Cancer. 37 (May(5)) (1976) 2486–2492.

[9] I.T. Gram, E. Funkhouser, L. Tabar, The Tabar classification ofmammographic parenchymal patterns, Eur. J. Radiol. 24(February (2)) (1997) 131–136.

[10] N.F. Boyd, J.W. Byng, R.A. Jong, E.K. Fishell, L.E. Little, A.B.Miller, et al., Quantitative classification of mammographicdensities and breast cancer risk: results from the Canadiannational breast screening study, J. Natl. Cancer Inst. 87 (May)(1995) 670–675.

[11] American College of Radiology (ACR), Illustrated BreastImaging Reporting and Data System (BI-RADS), 3rd ed.,American College of Radiology, Reston, VA, 1998, pp.167–181.

[12] N. Jamal, K.H. Ng, L.M. Looi, D. McLean, A. Zulfiqar, S.P. Tan,et al., Quantitative assessment of breast density fromdigitized mammograms into Tabar’s patterns, Phys. Med.Biol. 51 (November (22)) (2006) 5843–5857.

[13] N. Karssemeijer, Automated classification of parenchymalpatterns in mammograms, Phys. Med. Biol. 43 (1998)365–378.

[14] P.K. Saha, J.K. Udupa, E.F. Conant, D.P. Chakraborty, D.Sullivan, Breast tissue density quantification via digitizedmammograms, IEEE Trans. Med. Imaging 20 (August (8))(2001) 792–803.

[15] C. Klifa, J. Carballido-Gamio, L. Wilmes, A. Laprie, C. Lobo, E.Demicco, et al., Quantification of breast tissue index fromMR data using fuzzy clustering, Conf. Proc. IEEE Eng. Med.Biol. Soc. 3 (2004) 1667–1670.

[16] A. Oliver, J. Freixenet, A. Bosch, D. Raba, R. Zwiggelaar,Automatic classification of breast tissue, Lect. NotesComput. Sci. 3523 (2005) 431–438.

[17] A. Oliver, J. Freixenet, R. Martí, J. Pont, E. Pérez, E.R. Denton,R. Zwiggelaar, A novel breast tissue density classificationmethodology, IEEE Trans. Inf. Technol. Biomed. 12 (January(1)) (2008) 55–65.

[18] I. Muhimmah, R. Zwiggelaar, Mammographic densityclassification using multiresolution histogram information,in: Proceedings of the International Special Topic Conferenceon Information Technology in Biomedicine, October, 2006.

[19] J.J. Heine, M.J. Carston, C.G. Scott, K.R. Brandt, F.F. Wu, V.S.

Pankratz, et al., An automated approach for estimation ofbreast density, Cancer Epidemiol. Biomarkers Prev. 17(November (11)) (2008) 3090–3097.

b i o m e d i c i n e 1 1 6 ( 2 0 1 4 ) 105–115

[20] J.J. Heine, C.G. Scott, T.A. Sellers, et al., A novel automatedmammographic density measure and breast cancer risk, J.Natl. Cancer Inst. 104 (July (13)) (2012) 1028–1037.

[21] A. Manduca, M.J. Carston, J.J. Heine, C.G. Scott, V.S. Pankratz,K.R. Brandt, T.A. Sellers, C.M. Vachon, J.R. Cerhan, Texturefeatures from mammographic images and risk of breastcancer, Cancer Epidemiol. Biomarkers Prev. 18 (March (3))(2009) 837–845.

[22] J. Li, L. Szekely, L. Eriksson, B. Heddson, A. Sundbom, K.Czene, et al., High-throughput mammographic-densitymeasurement: a tool for risk prediction of breast cancer,Breast Cancer Res. 14 (July (4)) (2012) R114.

[23] H.M. Gweon, J.H. Youk, J.A. Kim, E.J. Son, Radiologistassessment of breast density by BI-RADS categories versusfully automated volumetric assessment, Am. J. Roentgenol.201 (September (3)) (2013) 692–697.

[24] A. Tagliafico, G. Tagliafico, D. Astengo, F. Cavagnetto, R.Rosasco, G. Rescinito, et al., Mammographic densityestimation: one-to-one comparison of digitalmammography and digital breast tomosynthesis using fullyautomated software, Eur. Radiol. 22 (2012) 1265–1270.

[25] M. Jeffreys, R. Warren, R. Highnam, G.D. Smith, Initialexperiences of using an automated volumetric measure ofbreast density: the standard mammogram form, Br. J. Radiol.79 (May (941)) (2006) 378–382.

[26] M. Lokate, M.G. Kallenberg, N. Karssemeijer, M.A. Van denBosch, P.H. Peeters, C.H. Van Gils, Volumetric breast densityfrom full-field digital mammograms and its association withbreast cancer risk factors: a comparison with a thresholdmethod, Cancer Epidemiol. Biomarkers Prev. 19 (December(12)) (2010) 3096–3105.

[27] M. Pollán, N. Ascunce, M. Ederra, A. Murillo, N. Erdozáin, J.E.Alés-Martínez, et al., Mammographic density and risk ofbreast cancer according to tumor characteristics and modeof detection: a Spanish population-based case–control study,Breast Cancer Res. 15 (January (1)) (2013) R9.

[28] M. Pollán, R. Llobet, J. Miranda-García, J. Antón, M. Casals, I.Martínez, et al., Validation of DM-Scan, a computer-assistedtool to assess mammographic density in full-field digitalmammograms, SpringerPlus 2 (2013) 242.

[29] J.W. Byng, M.J. Yaffe, R.A. Jong, R.S. Shumak, G.A. Lockwood,D.L. Tritchler, N.F. Boyd, Analysis of mammographic densityand breast cancer risk from digitized mammograms,Radiographics 18 (1998) 1587–1598.

[30] M. Garrido-Estepa, F. Ruiz-Perales, J. Miranda, N. Ascunce, I.González-Román, C. Sánchez-Contador, et al., Evaluation ofmammographic density patterns: reproducibility andconcordance among scales, BMC Cancer 10 (September)(2010) 485.

[31] B. Pérez-Gómez, F. Ruiz, I. Martínez, M. Casals, J. Miranda, C.Sánchez-Contador, et al., Women’s features andinter-/intra-rater agreement on mammographic densityassessment in full-field digital mammograms (DDM-SPAIN),Breast Cancer Res. Treat. 132 (February (1)) (2012) 287–295.

[32] S. Ciatto, N. Houssami, A. Apruzzese, E. Bassetti, B. Brancato,F. Carozzi, et al., Categorizing breast mammographicdensity: intra and interobserver reproducibility of biradsdensity categories, Breast 14 (August (4)) (2005) 269–275.

[33] C. Zhou, H.P. Chan, N. Petrick, M.A. Helvie, M.M. Goodsitt, B.Sahiner, et al., Computerized image analysis: estimation ofbreast density in mammograms, Med. Phys. 28 (2001)1056–1069.

[34] S. Arya, D.M. Mount, N.S. Netanyahu, et al., An optimalalgorithm for approximate nearest neighbor searching, J.ACM 45 (6) (1998 Nov) 891–923.

[35] P.E. Shrout, J.L. Fleiss, Intraclass correlation: uses inassessing rater reliability, Psychol. Bull. 86 (March (2)) (1979)420–428.

Page 11: Semi-automated and fully automated mammographic density ...106 computer methods and programs in biomedicine 116 (2014) 105–115 The results obtained showed an average ICC=0.922 among

i n b i

Vacek, R. Smith-Bindman, et al., Longitudinal measurement

c o m p u t e r m e t h o d s a n d p r o g r a m s

[36] M.B. Lobbes, J.P. Cleutjens, V. Lima Passos, C. Frotscher, M.J.Lahaye, K.B. Keymeulen, et al., Density is in the eye of thebeholder: visual versus semi-automated assessment ofbreast density on standard mammograms, Insights Imaging3 (February (1)) (2012) 91–99.

[37] K.E. Martin, M.A. Helvie, C. Zhou, M.A. Roubidoux, J.E. Bailey,C. Paramagul, et al., Mammographic density measured withquantitative computer-aided method: comparison withradiologists’ estimates and BI-RADS categories, Radiology240 (September (3)) (2006) 656–665.

[38] R.P. Highnam, J.M. Brady, B.J. Shepstone, Estimation of

compressed breast thickness during mammography, Br. J.Radiol. 71 (June (846)) (1998) 646–653.

[39] D. Raba, A. Oliver, J. Martí, M. Peracaula, J. Espunya, BreastSegmentation with Pectoral Muscle Suppression on Digital

o m e d i c i n e 1 1 6 ( 2 0 1 4 ) 105–115 115

Mammograms, in: Proc. of IbPRIA, 2005, pp.471–478.

[40] M. Mustra, M. Grgic, Robust automatic breast and pectoralmuscle segmentation from scanned mammograms, SignalProcess. 93 (10) (2013) 2817–2827.

[41] C.H. van Gils, J.H. Hendriks, R. Holland, N. Karssemeijer, J.D.Otten, H. Straatman, et al., Changes in mammographicbreast density and concomitant changes in breast cancerrisk, Eur. J. Cancer Prev. 8 (December (6)) (1999) 509–515.

[42] K. Kerlikowske, L. Ichikawa, D.L. Miglioretti, D.S. Buist, P.M.

of clinical mammographic breast density to improveestimation of breast cancer risk, J. Natl. Cancer Inst. 99(March (5)) (2007) 386–395.