COMPUTER AIDED BREAST CANCER DETECTION USING … · 2019-05-15 · Master Thesis Title: Computer Aided Breast Cancer Detection Using Temporal Mammograms The present Thesis Dissertation

DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING

COMPUTER AIDED BREAST CANCER DETECTION USING TEMPORAL MAMMOGRAMS

KOSMIA LOIZIDOU

A Dissertation Submitted to the University of Cyprus in Partial Fulfillment of the MSc Requirements

May 2018


Dr. Costas Pitris
Assistant Professor, Department of Electrical & Computer Engineering, Advisor

Dr. Christos Nicolaou
Radiologist, Co-advisor

©Kosmia Loizidou, 2018


VALIDATION PAGE

MSc Student: Kosmia Loizidou

Master Thesis Title: Computer Aided Breast Cancer Detection Using Temporal Mammograms

The present Thesis Dissertation was submitted in partial fulfillment of the requirements for the Degree of Master of Science at the Department of Electrical and Computer Engineering and was approved on 24/05/2018 by the members of the Examination Committee.

Examination Committee:

Research Supervisor: Constantinos Pitris

Assistant Professor, Department of Electrical and Computer Engineering

Committee Member: Theocharis Theocharides

Assistant Professor, Department of Electrical and Computer Engineering

Committee Member: Constantinos Pattichis

Professor, Department of Computer Science


DECLARATION

The present Master dissertation was submitted in partial fulfillment of the requirements for the degree of Master of Science of the University of Cyprus. It is a product of original work of my own, unless otherwise mentioned through references, notes, or any other statements.

Kosmia Loizidou

…………………………………..


Dedicated to Petros,

Thank you for believing in me


ABSTRACT

Breast cancer remains, to this day, one of the deadliest cancers worldwide for women over 40 years old. Early detection is crucial in order to minimize damage and discomfort and to provide a potential cure. Mammography is the most reliable screening tool for the identification of any signs of malignancy or abnormality in general. Computer Aided Diagnosis (CAD) systems are dynamic tools that can assist radiologists in detecting and classifying mammographic abnormalities. Those systems are needed due to the radiologists' high error rate in spotting cancer. In the literature, various algorithms have been proposed, which evaluate mammograms and identify subtle and complicated cancers that could otherwise be missed by human observers.

In this work, a novel technique for the diagnosis of breast micro-calcifications on a temporal pair of mammograms is introduced. Micro-calcifications are microscopic calcifications that appear in clusters with a higher intensity level than their surroundings, and they are difficult to recognise. The goal of this work is to develop a CAD system for the detection and classification of micro-calcifications.

The proposed approach began with the creation of the dataset, which contained eight temporal pairs. This was followed by normalization of the mammograms and a pre-processing step, which removed the irrelevant regions of each mammogram without eliminating other important details of the image. The second step consisted of the registration of the prior and current mammograms for an efficient subtraction. We tested two algorithms: Affine and Demons. Our experimental results demonstrated that the mean residual with Affine was approximately 10% higher than the residual with Demons, for both dense and fatty mammograms, so Demons registration was selected. The third step involved the subtraction of the current and registered images and the post-processing steps. The subtracted image was filtered with various filters in order to find the best one, then thresholded and further processed with morphological operations.

The fourth step consisted of the removal of the periphery regions that did not correspond to micro-calcifications. The range filter was chosen because it erased the high-intensity background while the micro-calcifications remained. Then, we eliminated the old micro-calcifications that had not been removed by the previous subtraction of the mammograms and created the ground truth images, based on the radiologist's observations, for the evaluation part. We evaluated the proposed methodology and found 280 false positives among the 379 detected ROIs. The F1-score was limited to 0.301, and for that reason we introduced machine learning into our study for the elimination of false positives.

For the classification part, 13 FOS and shape features were extracted from both the current and the subtracted images for every ROI, and the best features were selected with statistical analysis and multivariate analysis of variance. The features that were extracted from the current image had smaller p-values, and the best combination of features contained only 10 of them. Using discriminant analysis with leave-one-patient-out validation, we divided the dataset into a training set and a test set. The training set was used to train the classifier and the test set to evaluate it. The accuracy was 83.245%, the sensitivity 96.9% and the specificity 78.6%. Even though the results indicate that our proposed algorithm is an effective tool for the detection of breast micro-calcifications using temporal mammograms, additional studies must be carried out to improve the diagnostic value of the algorithm.
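As a reading aid for the scores quoted above, the following minimal Python sketch shows how accuracy, sensitivity, specificity, precision and F1-score are commonly derived from confusion-matrix counts. The function name and the example counts are hypothetical and are not taken from the thesis code; the last line only reproduces the detection-stage precision implied by the 379 ROIs and 280 false positives reported above, assuming the remaining 99 ROIs are true positives.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return standard scores computed from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)          # also called recall
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
scores = confusion_metrics(tp=8, fp=2, tn=6, fn=4)

# Detection-stage precision implied by the abstract (assumption: every
# detected ROI that is not a false positive is a true positive).
detection_precision = (379 - 280) / 379   # about 0.261
```

The low detection-stage precision is consistent with the motivation stated above for adding a classification step that removes false positives.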


ACKNOWLEDGMENTS

I take this opportunity to thank all those who have contributed to this work.

First, I would like to express my sincere gratitude to my thesis advisor, Assistant Professor Constantinos Pitris of the ECE Department at the University of Cyprus. He consistently allowed this thesis to be my own work but steered me in the right direction whenever he thought I needed it. He provided me with the knowledge and motivation to handle difficult situations with confidence and courage.

I would also like to thank my co-advisor, Dr. Christos Nicolaou, who was the radiologist of this work. Dr. Nicolaou allowed me to be part of this project and helped me with the medical terms and the creation of the dataset. Without his guidance and input, this work could not have been successfully conducted.

Besides my advisors, I would like to thank the rest of my thesis committee for their insightful comments and their encouragement to broaden my work.

Last, but not least, I would like to thank my people: my parents, Stavros and Melani, my brother Kleanthis, and my friend and colleague Rafaella, for supporting me spiritually throughout the writing of this thesis and in my life in general.


Table of Contents

ABSTRACT

ACKNOWLEDGMENTS

1 INTRODUCTION
  1.1 RESEARCH PROBLEM
  1.2 BREAST CANCER
  1.3 CAD MAMMOGRAPHY

2 REVIEW OF THE LITERATURE
  2.1 PROCESSING OF MAMMOGRAMS
  2.2 BILATERAL SUBTRACTION
  2.3 TEMPORAL ANALYSIS
  2.4 DETECTION AND CLASSIFICATION OF MICRO-CALCIFICATIONS
  2.5 SCOPE

3 METHODOLOGY OF THE PROPOSED ALGORITHM
  3.1 DETECTION OF ABNORMAL ROIS
    3.1.1 Computer-Aided Diagnosis System Pipeline
    3.1.2 Dataset
    3.1.3 Normalization
    3.1.4 Pre-processing
    3.1.5 Registration
    3.1.6 Temporal Subtraction
    3.1.7 Post-processing
    3.1.8 Removal of the periphery pixels
    3.1.9 Removal of the old micro-calcifications
    3.1.10 Evaluation of the proposed algorithm
  3.2 ELIMINATION OF FALSE POSITIVES
    3.2.1 Feature Extraction and Selection
    3.2.2 Classification
    3.2.3 Evaluation of the classification

4 RESULTS
  4.1 DETECTION OF ABNORMAL ROIS
    4.1.1 Pre-processing
    4.1.2 Registration
    4.1.3 Temporal Subtraction
    4.1.4 Post-processing
    4.1.5 Removal of the Periphery Pixels
    4.1.6 Removal of the old micro-calcifications
    4.1.7 Evaluation of the proposed algorithm
  4.2 ELIMINATION OF FALSE POSITIVES
    4.2.1 Feature Extraction and Selection
    4.2.2 Classification
    4.2.3 Evaluation of the classification

5 DISCUSSION

6 CONCLUSION AND FUTURE WORK

REFERENCES


FIGURES

Figure 1.1: Structure of the breast (Beura, 2016)

Figure 1.2: The breast mass in mammogram (a) benign mass with smooth shape (b) malignant mass with irregular shape (Li et al., 2017)

Figure 1.3: Two types of view of the breast imaging (a) Left CC view (b) Left MLO view (Nicosia Diagnostic Centre)

Figure 1.4: Two types of mammograms (a) Fatty mammogram (b) Dense mammogram

Figure 1.5: Evaluation methodology of the CAD algorithm (Oliver et al., 2010 p. 100)

Figure 2.1: Features extracted from mammogram images (a) normal, (b) benign and (c) malignant (Ganesan et al., 2013 p. 85)

Figure 2.2: Three areas of the breast region (Mendez et al., 1998 p. 958)

Figure 2.3: Thresholded images of the right [(a),(c)] and left breasts [(b),(d)] with cut off values 25% [(a),(b)] and 35% [(c),(d)] (Yin, 1991 p. 957)

Figure 2.4: Comparison of the detection performances obtained with the nonlinear and linear subtraction methods (Yin, 1991 p. 962)

Figure 2.5: Manually registered borders of the right and left breasts (Yin, 1991 p. 956)

Figure 2.6: Processing steps and information flow between processing steps used to identify potential control points and establish their correspondence (Vujovic & Brzakovic, 1997 p. 1388)

Figure 2.7: Regional registration technique (Sanjay-Gopal et al., 1999 p. 2671)

Figure 2.8: Consistent landmarks in the CC and ML ‘idealized’ outlines (Marias et al., 2005 p. 3)

Figure 2.9: Types of micro-calcifications’ distribution (Smithuis, R. and Pijnappel, 2008)

Figure 3.1: Computer-aided diagnosis system pipeline

Figure 3.2: Example of temporal pairs of mammograms (a) current (b) prior

Figure 3.3: Detailed representation of the proposed algorithm

Figure 4.1: Example of the clear border removal in two cases (a) normalized mammograms with red circle in the malignancy (b) border removal

Figure 4.2: Example of the Gamma correction in two cases (a) border removal (b) Gamma correction

Figure 4.3: Affine registration in two cases (a) current mammogram (b) prior mammogram (c) subtracted image

Figure 4.4: Box plot for dense mammograms

Figure 4.5: Box plot for fatty mammograms


Figure 4.6: Displacement field for Demons registration in the 1st example

Figure 4.7: Displacement field for Demons registration in the 2nd example

Figure 4.8: Demons registration in two cases (a) current mammogram (b) prior mammogram (c) subtracted image

Figure 4.9: Box plot for dense and fatty mammograms

Figure 4.10: Comparison of dense mammograms

Figure 4.11: Comparison of fatty mammograms

Figure 4.12: Box plot of the contrast ratio

Figure 4.13: STD filter (a) subtracted image (b) filtered image

Figure 4.14: CLAHE (a) subtracted image (b) filtered image

Figure 4.15: Range filter (a) subtracted image (b) filtered image

Figure 4.16: Thresholding (a) filtered image (b) thresholded

Figure 4.17: Morphological operations (a) thresholded image (b) new image

Figure 4.18: Resulted image for example 1

Figure 4.19: Resulted image for example 2

Figure 4.20: Images with removed periphery pixels

Figure 4.21: Ground truth images

Figure 4.22: New binary image

Figure 4.23: True micro-calcifications

Figure 4.24: False negative areas






















TABLES

Table 1.1: Model-based mammographic detection and/or segmentation techniques (Oliver et al., 2010 p. 100)

Table 2.1: Listing of popular feature extraction and classification methods (Ganesan et al., 2013 p. 90)

Table 2.2: Listing of popular feature extraction and classification methods (Ganesan et al., 2013 p. 90)

Table 2.3: Comparison of Bilateral Subtraction methods

Table 2.4: Comparison of registration techniques in temporal mammograms

Table 2.5: Single and temporal features (Ma et al., 2014 p. 1264)

Table 2.6: Comparison of Temporal Analysis techniques in mammograms

Table 2.7: Comparison of detection and classification methods of micro-calcifications

Table 3.1: Distribution of our testing dataset

Table 3.2: Features extracted from both subtracted and current image

Table 3.3: Confusion matrix

Table 4.1: Evaluation of the algorithm

Table 4.2: Features extracted from current image

Table 4.3: Features extracted from subtracted image

Table 4.4: T-test results for current image

Table 4.5: T-test results for subtracted image

Table 4.6: MANOVA results for current image

Table 4.7: MANOVA results for subtracted image

Table 4.8: Selected features for the classification step

Table 4.9: Classification results

Table 4.10: Evaluation of the classifier

Table 5.1: Performance of different methods





1 Introduction

1.1 Research Problem

At present, there are no effective techniques to prevent breast cancer, because its cause remains unknown. Although radiologists can offer women better chances through early-stage diagnosis from mammograms, wrong assessments are inevitable (Tang et al., 2009). With this in mind, computer-aided diagnosis (CAD) systems, which use computer technologies to recognize abnormalities, have been implemented to support and assist radiologists.

1.2 Breast Cancer

Breast cancer is the most common cancer in women in the European Union and the United States (Oliver et al., 2010). In 2007, the American Cancer Society published a study suggesting that between one in eight and one in twelve women will develop breast cancer at least once during their lifetime. Breast cancer remains the number one cause of death in women older than 40 years of age (Oliver et al., 2010).

The breast is composed of lobules, which are the glands that produce milk, ducts, fatty and connective tissue, and blood and lymphatic vessels (Beura, 2016). In the case of breast cancer, carcinogenesis leads to an uncontrolled growth of breast cells in a lump, usually forming a tumor (Ganesan et al., 2013). The process by which the cancer starts and later develops can vary between patients. A large percentage of breast cancers begin in the ducts and are called ‘ductal cancers’, while others start in the glands and are named ‘lobular cancers’. It is worth mentioning that when the cancer has spread to the fatty tissue of the breast and to other organs of the body, it is called invasive, and that type of cancer is the riskiest for the patient’s health (Beura, 2016). Other types of breast cancer appear more rarely (Cancer.org, 2016).

However, it should be noted that the mortality of breast cancer has followed a downward trend over the past decade among women of all ages, due to the introduction of mammography screening and the discovery of alternative and more efficient treatments (Oliver et al., 2010). As mentioned before, a tumor is a mass of abnormal tissue, and in breast cancer there are two major categories of tumors: the non-cancerous, called ‘benign’, and the cancerous, called ‘malignant’. The first category contains the harmless tumors. Most of the time, when a patient is diagnosed with this kind of tumor, the doctor will prefer to leave it alone instead of removing it, because it will not expand to other areas. From time to time, however, some of those tumors spread to the surrounding tissue, causing pain to the patient. In that case, the doctor is going to remove the tumor to avoid further expansion. The second category refers to dangerous and unstable tumors, which can infect and damage the surrounding tissue and organs. In those cases, the doctor is going to perform a biopsy on the patient to find out how aggressive the tumor is (www.nationalbreastcancer.org, 2017). When the breast contains a lot of fibrous or glandular tissue and not much fatty tissue, it is called dense. On the other hand, a breast composed entirely of fatty tissue is called fatty. Dense mammograms are more difficult to evaluate because dense tissue looks white, the same as the tumors and abnormalities inside a mammogram (Cancer.org, 2016).

Figure 1.1: Structure of the breast (Beura, 2016)

Numerous types of abnormalities can be found in a mammogram, and they are usually associated with asymmetries between the left and right breasts, distortion of the normal architecture of the tissue, the appearance of micro-calcifications in the breast, and masses of various sizes and shapes. In most cases, the left and right breasts are almost symmetrical; for this reason, any asymmetric area can reveal a developing mass or a variation of normal breast tissue. A distortion in the normal architecture disrupts the normal tissue, resulting in abnormal regions. Micro-calcifications are microscopic calcifications that commonly show up in clusters and are evaluated according to their characteristics, such as shape and size. In like manner, a breast mass is a localized swelling in the breast, which is also characterized by its specifications (Oliver, 2007).

1.3 CAD Mammography

For the detection of breast cancer, as well as other kinds of abnormalities of the breasts, radiologists worldwide use mammography as the key screening tool (Oliver et al., 2010). This method involves X-ray imaging of the two breasts, with the images stored on film or in digital format (Ganesan et al., 2013). Mammography is the process of applying low-energy X-rays to image the breast in order to discover abnormal areas. A beam of X-rays passes through the breast and, depending on the density of the tissue, a percentage of the rays is absorbed. The remaining rays then pass through a detector to reach the photographic film, constructing a gray-scale image, which is generally known as a film-based mammogram. From this image, a digital mammogram is reconstructed. The mammograms are taken from two different projections for each breast, the Cranio-Caudal (CC) and the Medio-Lateral Oblique (MLO); two examples are shown in Figure 1.3. In the CC projection, the view is taken from the head down and the pectoral muscle is usually not visible, while in the MLO projection, the view is taken at an oblique angle from the side and can show the pectoral muscle (Beura, 2016).

Figure 1.2: The breast mass in mammogram (a) benign mass with smooth shape (b) malignant mass with irregular shape (Li et al., 2017)
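The absorption step described above can be illustrated with the Beer-Lambert law, the standard exponential model for the attenuation of a narrow X-ray beam in a uniform material. The following Python sketch is only an illustration under stated assumptions: the attenuation coefficients and the breast thickness are hypothetical placeholder values, not measurements, and the function name is not part of this work.

```python
import math


def transmitted_fraction(mu_per_cm, thickness_cm):
    """Fraction of the incident X-ray intensity that exits the tissue.

    Beer-Lambert law: I / I0 = exp(-mu * x), where mu is the linear
    attenuation coefficient (1/cm) and x the tissue thickness (cm).
    """
    return math.exp(-mu_per_cm * thickness_cm)


# Hypothetical attenuation coefficients (1/cm), for illustration only.
MU_FATTY = 0.5
MU_DENSE = 0.8
THICKNESS_CM = 4.0  # hypothetical compressed breast thickness

fatty = transmitted_fraction(MU_FATTY, THICKNESS_CM)
dense = transmitted_fraction(MU_DENSE, THICKNESS_CM)

# Fraction of the beam attenuated by each tissue type.
attenuated_fatty = 1.0 - fatty
attenuated_dense = 1.0 - dense
```

With these placeholder values, the denser tissue transmits a smaller fraction of the beam, which is consistent with dense tissue appearing white in a mammogram.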

Figure 1.3: Two types of view of the breast imaging (a) Left CC view (b) Left MLO view (Nicosia Diagnostic Centre)

Figure 1.4: Two types of mammograms (a) Fatty mammogram (b) Dense mammogram

Digital mammography, which is extensively used today, provides an electronic image of the breast stored as a computer file. Thus, the image can be enhanced, magnified or processed easily, and the radiologist can adapt, store and retrieve the digital images electronically (National Cancer Institute, 2016). After the mammograms are acquired, an expert radiologist reviews and examines the scans or the files to determine whether the patient has cancer, followed by appropriate disease management depending on the case. Unfortunately, even an experienced and well-trained radiologist can make a wrong assessment or miss a variety of abnormalities (Oliver et al., 2010). It is worth mentioning that recent studies have shown that radiologists have an error rate between 10% and 30% in spotting cancer. Consequently, these significantly high error rates result in unwarranted procedures, patient discomfort and unnecessary expenditure for hospitals and medical centres. As a result, the idea of developing computer systems that could assist radiologists in detecting and classifying breast cancers has been promoted (Ganesan et al., 2013).

A Computer-Aided Detection (CAD) system is a radiological device, which includes a set of automatic or semi-automatic tools that aid radiologists in the detection and classification of mammographic abnormalities (Oliver et al., 2010). The main objective of such a system is to identify an increased number of subtle and complicated cancers that might otherwise be missed by radiologists due to a lack of expertise (Lehman et al., 2015). CAD systems based on computational intelligence (CI) techniques are dedicated to detecting breast cancer from mammograms. To date, most CAD systems are not designed to decide independently about the abnormalities related to the tumors, and therefore require human intervention to finalise the results and identify the problem. The main components of CAD systems, such as pre-processing, breast region segmentation, feature extraction and classification, employ a variety of CI techniques (Ganesan et al., 2013). Although the USA Food and Drug Administration (FDA) approved CAD for mammography in 1998, three years later only a small percentage (5%) of clinics and hospitals in the USA used it. Nevertheless, by 2008 this percentage had increased dramatically (74%), with almost all of the screening mammograms in the USA now involving CAD (Lehman et al., 2015).

Figure 1.5: Evaluation methodology of the CAD algorithm (Oliver et al., 2010 p. 100)

It is generally accepted that radiologists can miss important signs in a mammogram if they do not analyse it correctly and with the right tools. With this in mind, the evaluation of mammograms is often performed by two independent radiologists, a procedure called ‘double reading’, which provides greater sensitivity than a single reading. At the same time, double reading does not increase the recall rates. However, double reading of mammograms is a challenging, almost impossible task, especially in settings with limited human resources. For this reason, CAD is often utilized as a second reader (Ganesan et al., 2013). The idea of breast cancer detection using both a CAD system and a radiologist is straightforward. Initially, after the mammogram is acquired, the radiologist examines the images, finds any suspicious clues and then completes an assessment. Next, the CAD system identifies potential abnormalities by marking specific regions on the image, to assist the radiologist with the final decision (Fenton et al., 2011). This process acts as a second reading. Finally, the radiologist reviews the areas identified by the CAD system and determines whether additional evaluation is warranted (Lehman et al., 2015). With the use of CAD, the number of radiologists needed for double reading of mammograms has decreased significantly, which can be extremely useful in less developed countries.

The challenge of CAD system research is to automate the process of mammographic screening so as to detect and categorise tumors automatically, without intervention from the radiologists. Nonetheless, CAD is only as effective as its computer software (Ganesan et al., 2013; Berry, 2011). Computational time improves with increasing computer speed, but the ultimate goal is to create CAD systems that learn from previous decisions and correctly distinguish malignant from non-malignant lesions. However, such algorithms are hindered by numerous challenges. Currently, there are so many different approaches that if two CAD systems analyse the same mammogram, their responses are likely to be completely different. Ultimately, CAD systems should be able to find a tumor before it even becomes symptomatic in the mammogram (Berry, 2011). The success of automated mammography analysis depends on detection techniques (discovering potential lesions within the background) and segmentation techniques (defining very accurately the outline of a potential lesion). Mass detection can be achieved either using a single view, relying on the differences between the pixels in each area, or using multiple views, utilizing more than one mammographic image of the same person for comparison. Segmentation techniques are categorised into supervised and unsupervised methods. Supervised methods use a priori information and/or user intervention to decide the boundaries of specific regions in the image. Unsupervised methods divide the image into segments and then classify each one based on specific properties, such as texture or intensity variations. These approaches are further divided into three groups: region-based methods, where the image is separated into homogeneous and spatially connected regions; contour-based methods, which manipulate the boundary of the regions; and clustering methods, which organise pixels into groups that have the same properties. More information will be presented in the following sections.

Clinical evaluation of CAD algorithms is necessary to establish whether those systems are accurate in
detecting mass abnormalities with minimum error. For that purpose, the results obtained by the CAD
detection and segmentation algorithms are compared with the ‘gold standard’, which is the analysis of
the radiologists. Both the radiologists and the CAD system label each mammogram as normal or
abnormal. The radiologists provide binary images, which include both the outline of a mass and its
details. In contrast, the automatic algorithms give a probability image identifying the different sections
as having a high or low probability of being a mass compared to normal tissue (Oliver et al., 2010).
Previous literature suggests that the performance of CAD is not conclusive enough to warrant clinical
use, but the results are encouraging enough to warrant further research (Ganesan et al., 2013). In the
United States, CAD is applied to a large percentage of screening mammograms, with an annual cost of
approximately 30 million dollars (Fenton et al., 2011). The main problem is the significant number of
false positive detections of masses, detected only in one view (Oliver et al., 2010). It is important to
note, though, that with the use of CAD, the accuracy of cancer detection has slightly increased. The
popularity of these systems remains high, but they cannot replace human radiologists (Ganesan et al.,
2013). In film-screen mammography, CAD methods can improve the specificity and the positive
predictive value (PPV), but there is no evidence that the detection rate of invasive breast cancer and the
sensitivity can be elevated. In addition, the impact on breast cancer mortality remains almost the same,
with or without CAD (Fenton et al., 2011). This is because algorithms aiming to improve the mortality
rate require many decades to optimize (Fenton et al., 2013). In some cases, biopsy recommendations
declined. Clearly, the aim is the early detection of high-risk cancers by CAD through sensitivity
improvement (Fenton et al., 2011).
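The evaluation measures named above (sensitivity, specificity and PPV) can be illustrated with a minimal Python sketch. The function name `screening_metrics` and the counts are hypothetical and chosen only for illustration; they are not taken from any of the cited studies.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, PPV) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of cancers that were flagged
    specificity = tn / (tn + fp)  # fraction of normal cases left unflagged
    ppv = tp / (tp + fp)          # fraction of flagged cases that are cancers
    return sensitivity, specificity, ppv


# Hypothetical counts for 1000 screened women (illustration only).
sens, spec, ppv = screening_metrics(tp=8, fp=90, tn=900, fn=2)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} PPV={ppv:.3f}")
```

With a low disease prevalence, as in screening, even a modest false positive rate leads to a low PPV, which is why the number of false positive marks matters so much in the discussion above.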

The limitations of the state-of-the-art algorithms should be taken into consideration. These include the
small numbers of women who took part in some studies and their ages, which do not form a
representative sample. Furthermore, out-dated film-screen mammograms were used and the radiologists
were not comfortable with this approach in the early studies (Lehman et al., 2015). Film-screen
mammograms were digitized before the CAD analysis, but this process probably introduced noise and
affected the performance. However, some studies have shown that digital and film-screen environments
have a similar response with the CAD system (Fenton et al., 2011). One way to improve the
performance of CAD systems is to combine different classifiers. Every classifier has its own unique
properties and a window over which it performs best (Ganesan et al., 2013). In the literature there are
studies where the classifiers were combined in parallel, serial or cascading schemes, with the output
probabilities multiplied or summed, and these indicated the possibility of improvement in further
developments (Oliver et al., 2010). A major concern in pattern recognition is the fact that a classifier
which is designed for a specific set of data may not be useful for another set, due to ‘over-fitting’. The
combination of classifiers could provide a better generalization capability for the recognition system.
In addition, even senior and junior radiologists in the same workplace have completely different
responses when they use CAD approaches. Thus, the radiologist’s experience with CAD plays a crucial
role and the overall performance can vary (Fenton et al., 2011).
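As an illustration of combining classifier outputs by multiplying or summing probabilities, the following Python sketch fuses the class probabilities of several classifiers in parallel. The function name `combine_posteriors` and the three sets of probabilities are hypothetical; the cited studies may weight or arrange their classifiers differently.

```python
def combine_posteriors(posteriors, rule="sum"):
    """Fuse per-classifier class probabilities with the sum or product rule.

    posteriors: list of rows, one row per classifier, one column per class.
    """
    n_classes = len(posteriors[0])
    fused = []
    for c in range(n_classes):
        column = [row[c] for row in posteriors]
        if rule == "sum":
            score = sum(column)
        elif rule == "product":
            score = 1.0
            for p in column:
                score *= p
        else:
            raise ValueError("rule must be 'sum' or 'product'")
        fused.append(score)
    total = sum(fused)
    return [s / total for s in fused]  # renormalise so the scores sum to 1


# Hypothetical outputs of three classifiers for the classes (normal, mass).
outputs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
fused_sum = combine_posteriors(outputs, "sum")
fused_product = combine_posteriors(outputs, "product")
```

The product rule punishes a class heavily when any single classifier assigns it a low probability, while the sum rule is more tolerant of one dissenting classifier.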

In summary, the main purpose of CAD systems is to serve as double reading machines. At the same
time, CAD appears to be a necessity in less developed countries, which do not have many expert
radiologists. In those cases, CAD systems can be very helpful and improve the diagnostic accuracy
(Ganesan et al., 2013). The long-term effect of CAD on the detection of breast cancer in screening
mammography demands further investigation (Fenton et al., 2013). The results from past reviews
indicate that there are still unanswered questions related to detection, segmentation, sensitivity,
specificity, mortality rate etc., which make automatic mass detection using CAD an active research field
(Oliver et al., 2010). CAD techniques are very popular in the United States for several reasons.
Firstly, the most apparent reason is that CAD is built into digital mammography equipment, which is
common in the USA. The second reason is financial, since the equivalent return for CAD in 2008 was
16.50 dollars. Finally, the readers are comfortable with the CAD system, even though that does not
guarantee that the system is going to perform without any errors (Berry, 2011). Currently, CAD systems
are not ready to be used as independent machines which recognize and detect mass abnormalities in
mammograms. With a deeper understanding of this field, it is likely that they will one day assume a
bigger role. More research on CAD systems is needed in order to ensure that the advantages outweigh
the disadvantages (Fenton, 2015).


Table 1.1: Model-based mammographic detection and/or segmentation techniques (Oliver et al., 2010 p. 100)

Model-based detection and/or segmentation techniques

Author Year

Lai et al. (1989) 1989

Kegelmeyer (1992) and Kegelmeyer et al. (1994) 1992

Ng and Bischof (1992) 1992

Karssemeijer (1994,1999) and Karssemeijer and te Brake (1996) 1994

Stathaki and Constantinides (1994) 1994

Tarassenko et al. (1995) 1995

Calder et al. (1996) 1996

Chang et al. (1996) 1996

Che et al. (1996) 1996

Diahi et al. (1996) 1996

Li et al. (1996,2001) 1996

Kalman et al. (1997) 1997

Jiang et al. (1998) 1998

Te Brake and Karssemeijer (1998, 1999) 1999

Zwiggelaar et al. (1998,1999) 1998

Constantinides et al. (1999,2000,2001) 1999

Morrison and Linnett (1999) 1999

Christoyianni et al. (2000) 2000

Hatanaka et al. (2001) 2001

Liu et al. (2001) 2001

Lo et al. (2002) 2002

Youssry et al. (2003) 2003

Campanini et al. (2004) 2004

Cheng and Cui (2004) 2004

Hassanien et al. (2004) 2004

Oktem and Jouny (2004) and Ali and Hassanien (2006) 2004

Mousa et al. (2005) 2005

Oliver et al. (2006) and Freixenet et al. (2008) 2006

Sakellaropoulos et al. (2006) 2006

Szekely et al. (2006) 2006


2 Review of the Literature

At present, the literature focuses on the mathematical techniques used in breast cancer detection with
CAD, and more specifically on algorithms that include pre-processing approaches, feature extraction
and selection, and classification methodologies. This section is devoted to these aspects due to their
importance (Ganesan et al., 2013).

2.1 Processing of Mammograms

The contrast of the mammograms is very important for mass detection. For this reason, pre-processing
of the mammograms is performed first, in order to improve it. Other key issues for breast cancer
detection from mammograms are the removal of noise from the images, the segmentation of the breast
region from the muscles and the extraction of the suspicious regions. Denoising and enhancement of
mammograms affect both basic stages in mass detection: the manual analysis by the radiologist and
the second reading stage by the CAD system (Giger, Karssemeijer and Armato, 2001; Hackshaw
and Paul, 2003; Cady and Chung, 2005).

In general, the contrast in mammograms varies between normal and malignant tissues. For small
lesions and tumors especially, it becomes difficult for the radiologists to clearly visualize and compare
the normal and cancerous tissues. From a mathematical point of view, this can be explained by the
linear absorption coefficients, which define the image contrast through the Beer-Lambert law,
$I = I_0 e^{-ax}$. The law relates the incident intensity $I_0$ of the electromagnetic wave to the
transmitted intensity $I$, where $a$ is the linear absorption coefficient and $x$ is the thickness of the
tissue traversed. For small tumors (small $x$), the difference in intensity is very small, which makes
such tumors hard to detect (Scharcanski and Jung, 2006). One approach is to find areas in the images
where the local contrast varies (Karssemeijer, 1993). This technique improves the detection in
mammograms by using a neighbourhood of an image location and then calculating the local contrast as

$c(x, y) = f(x, y) - \mathrm{median}(x, y)$

where $c(x, y)$ is the local contrast, $f(x, y)$ is the image gray level and $\mathrm{median}(x, y)$ is
the median gray level inside the neighbourhood of $(x, y)$. This equation can also be considered as a
high-pass spatial filter. In addition, linear stretching has been presented for image enhancement, with
linear or nonlinear mapping of wavelet coefficients (Mallat and Zhong, 1992). Based on linear
stretching, another approach, called local range modification, processes the image twice, first to find
the local parameters and then to enhance the contrast (Fahnestock and Schowengerdt, 1983).
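The local contrast measure (gray level minus neighbourhood median) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the 3x3 neighbourhood, the clipping of neighbourhoods at the image border and the toy image are choices made here, not details given in the cited work.

```python
from statistics import median


def local_contrast(image, radius=1):
    """Return c(x, y) = f(x, y) - median of the (2*radius+1)^2 neighbourhood.

    image: list of rows of gray levels. Neighbourhoods are clipped at the
    image border (an assumption; the source does not specify border handling).
    """
    rows, cols = len(image), len(image[0])
    contrast = []
    for y in range(rows):
        out_row = []
        for x in range(cols):
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(rows, y + radius + 1))
                for i in range(max(0, x - radius), min(cols, x + radius + 1))
            ]
            out_row.append(image[y][x] - median(window))
        contrast.append(out_row)
    return contrast


# Toy 5x5 image: flat background of 10 with one brighter pixel in the centre.
toy = [[10] * 5 for _ in range(5)]
toy[2][2] = 50
c = local_contrast(toy)
```

The flat background maps to zero while the isolated bright pixel keeps its excess over the surrounding median, which is the high-pass behaviour described in the text.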

For denoising, besides filtering, another method is the Bayesian estimator-based discriminator, which
separates the image from the noise by assuming a priori additive Gaussian noise (Simoncelli and
Adelson, 1996). Furthermore, wavelet-based techniques are used for denoising and for extracting
important features from mammograms. These methods are based on both Discrete Wavelet Transforms
and Continuous Wavelet Transforms, and they achieve high rates of success (Heinlein, Drexl and
Schneider, 2003; Gonzalez and Woods, 1992). Iris filters, which are adaptive filters used mainly for
enhancing rounded opacities regardless of their contrast, have also been used. These filters use an
orientation map of gradient vectors (Kobatake et al., 1999). Xie, Li and Ma (2016) proposed a new
method for the pre-processing of mammograms, which included the removal of the pectoral muscle
with a combination of the circular and linear Hough transforms. A logarithmic transformation is applied
in some cases, to raise the dynamic range in the dark areas of the mammogram and correct the low
contrast regions inside the image (Arfan, 2017).
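The effect of a logarithmic transformation on dark regions can be illustrated with the common textbook form $s = c \log(1 + r)$. This form, the 8-bit gray level range and the function name `log_transform` are assumptions made for this sketch; the cited work may use a different variant.

```python
import math


def log_transform(image, max_level=255):
    """Map gray level r to s = c * log(1 + r), with c chosen so that
    max_level maps to itself."""
    scale = max_level / math.log(1 + max_level)
    return [[scale * math.log(1 + r) for r in row] for row in image]


# Toy image with a dark row and a bright row (hypothetical gray levels).
dark_and_bright = [[0, 10, 20], [200, 230, 255]]
stretched = log_transform(dark_and_bright)
```

In this toy case the difference between the dark levels 10 and 20 grows after the mapping, while the difference between the bright levels 230 and 255 shrinks, so detail in dark areas is spread over a wider range.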


Undoubtedly, there are many more techniques and algorithms which can be used for image enhancement.
Pre-processing is a very important step and the choice of methods depends on a large variety of
parameters.

The purpose of statistical pattern recognition is to group the features into categories and classify them
correctly (Jain, Duin and Jianchang Mao, 2000). The effectiveness of the approach depends on how
well the patterns from diverse classes are separated into different classes and on the boundaries
established from the probability distribution of these patterns. As a consequence, the patterns and
features that are produced must be effective as well as accurate, in order to classify the data without
errors (Sickles, 1986; Burrell et al., 1996).

The most important goal is to represent the data by a reduced number of dimensions, which both
maintains all the relevant information and rejects the variables that are not going to contribute to the
effectiveness of the classification. There are two main methods to achieve that. The first one is feature
selection, which is used to select features out of the available measurements. In contrast, the second
approach is called feature extraction and identifies features via a transformation of the measurements
to a lower dimensional feature space (Webb, 2002). In either case, the number of features must be
minimized to allow efficient and generalizable classification. Usually, these techniques include a
pre-processing step, outlier removal, data normalization and the handling of missing data. There are
several techniques which can be used to estimate the unknown parameters with the help of known
feature vectors, such as Bayesian inference, maximum likelihood estimation, maximum entropy
estimation and others. For the problem of missing data, the probability density function (pdf) can be
estimated with the expectation maximization algorithm (Theodoridis and Koutroumbas, 2006). If the
probability density functions of the data are available, then the extraction of important features from
them is possible.

It is important to note that the key features that represent mammographic images can be spectral,
textural and contextual. Spectral features describe the average tonal variations in various bands of the
intensity spectrum. Likewise, textural features contain information related to the spatial distribution of
tonal variations within a band, and contextual features include information obtained from blocks of
pictorial data surrounding the area being analysed. In mammographic images, the textural features are
the hardest to find. However, they are very important for the analysis of the image, since mammograms
are obtained using a single medium of acquisition. With this in mind, the textural features mentioned
above play an essential role in extracting critical and meaningful clues from the mammograms
(Haralick, Shanmugam and Dinstein, 1973). Wavelet features also contain useful data extracted from
the mammograms (Mousa, Munib and Moussa, 2005; Dhawan, Chitre and Kaiser-Bonasso, 1996).
Another approach for feature generation is based on geometric moments. An image can be represented
by its equivalent moments (Papoulis, 1965). The literature shows that gradient-based measures, such as
directional gradient features, can be very efficient in breast cancer detection (Desautels, Rangayyan and
Mudigonda, 2000). Furthermore, spiculation and morphological features can identify malignant tissues
or lesions. Spicules are radiating patterns of linear spikes surrounding the irregularly shaped malignant
densities (Karssemeijer and Brake, 1996). Morphological features are related to the physical
characteristics of the lesions. Additionally, difference and similarity techniques, which depend on the
availability of prior mammograms, can provide additional clues. Specifically, the most recent
mammographic images are compared with the prior ones in order to find differences and similarities
regarding the size or other characteristics of tumors (Timp, Varela and Karssemeijer, 2007). Arfan
(2017) extracted the texture features of a mammogram with Gabor filters, which are Gaussian kernel
functions modulated by a sinusoid and can be applied at different scales, frequencies and orientations.
Finally, Galván-Tejada et al. (2017) extracted 37 features from every mammographic image and used
a genetic algorithm to analyse which features were the most important. Of these, eight were clinical and
general data, eight were intensity based and calculated from the gray levels of the pixels, thirteen were
texture descriptors and eight were shape and location descriptors.
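To make the idea of moment-based features concrete, the sketch below computes raw geometric moments $m_{pq} = \sum_x \sum_y x^p y^q f(x, y)$ of a small image and derives the intensity-weighted centroid from them. The function names and the toy image are hypothetical and serve only as an illustration of the definition.

```python
def raw_moment(image, p, q):
    """Geometric moment m_pq = sum over pixels of x^p * y^q * f(x, y)."""
    return sum(
        (x ** p) * (y ** q) * value
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    )


def centroid(image):
    """Intensity-weighted centre of mass (m10 / m00, m01 / m00)."""
    m00 = raw_moment(image, 0, 0)
    return raw_moment(image, 1, 0) / m00, raw_moment(image, 0, 1) / m00


# Toy region: a 2x2 blob of gray level 2 on a zero background.
blob = [
    [0, 0, 0, 0],
    [0, 2, 2, 0],
    [0, 2, 2, 0],
    [0, 0, 0, 0],
]
area = raw_moment(blob, 0, 0)
cx, cy = centroid(blob)
```

Higher-order moments taken about this centroid describe the spread and shape of a region and can be used as location and shape descriptors.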

To sum up, there are several feature generators which can be used for breast cancer detection. The
difficulty with these methods is that the image properties are unique and every generator provides only
specific information, which can be used by the classifiers (Ganesan et al., 2013).

The next step in breast cancer detection using CAD is the development of a classifier. There are
numerous classifiers which can be used to categorise the data correctly, and each one has its specific
characteristics and decision accuracy. Consequently, the choice of the right classifier is difficult and
requires careful consideration. Conditional probabilities are used to evaluate the efficiency of the
classifier. The classification of an object $x$ to the class $\omega_i$ is governed by the posterior
probability $p(\omega_i|x)$, which must be the highest among all classes in order for the classification
to have the minimum error (Ganesan et al., 2013). Several techniques have been tried for the
calculation of the probability densities, divided into parametric and nonparametric estimations. The
first category includes Mixture Modeling, Bayesian Inference and Maximum Likelihood Estimation,
while the second one involves Histogram Approximation (Devroye, Gyorfi and Lugosi, 1996).
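The maximum posterior decision rule can be sketched as follows. The sketch assumes a single feature with Gaussian class-conditional densities; the priors, means and standard deviations in `model` are hypothetical values chosen for illustration, not estimates from mammographic data.

```python
import math


def gaussian_pdf(x, mean, std):
    """Univariate normal density, used here as the class-conditional p(x | w_i)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posteriors(x, classes):
    """Return p(w_i | x) for every class via Bayes' rule.

    classes: dict mapping class name -> (prior, mean, std).
    """
    joint = {name: prior * gaussian_pdf(x, mean, std)
             for name, (prior, mean, std) in classes.items()}
    evidence = sum(joint.values())
    return {name: value / evidence for name, value in joint.items()}


def classify(x, classes):
    """Pick the class with the highest posterior (minimum-error decision)."""
    post = posteriors(x, classes)
    return max(post, key=post.get)


# Hypothetical one-dimensional feature (e.g. a contrast score) for two classes.
model = {"normal": (0.9, 0.0, 1.0), "mass": (0.1, 3.0, 1.0)}
label_low = classify(0.5, model)
label_high = classify(3.5, model)
```

Because the prior of the mass class is small, the decision boundary lies closer to the mass mean than to the midpoint between the two means, which shows how the priors enter the minimum-error rule.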

The techniques that have been used for mass detection from mammograms are the parametric
multivariate method, which is based on linear classifier theory (Ganesan et al., 2013); the hybrid
classifier, based on Adaptive Resonance Theory (ART); Linear Discriminant Analysis (LDA)
(Hadjiiski et al., 1999; Ganesan et al., 2013); and Support Vector Machines (SVM), which create a
hyperplane to classify linearly separable data of two or more classes (Cherkassky and Mulier, 1998).
Neural networks (Rumelhart, Hinton and Williams, 1986) with back-propagation have also been used
for mass detection in breast cancer (Dhawan, Chitre and Kaiser-Bonasso, 1996).

One more approach, related to probabilistic logic, is Fuzzy Logic. This technique is slightly different
from the others since the truth-values are not strictly binary in nature (Bhattacharya and Das, 2007).
This approach is successful when the truth-values do not have a definite description, which can be very
helpful in breast cancer detection when noise occurs in mammograms (Kovalerchuk et al., 1997).
Similarly, Bayesian networks are probabilistic classifiers which can be applied to the classification
problem in order to find the most effective solutions. In CAD of breast cancer using mammograms,
they produce encouraging results (Wang et al., 1999). To develop a Bayesian network, it is necessary
to learn the network topology and then to estimate the marginal and conditional probabilities (Viton,
1996).

Figure 2.1: Features extracted from mammogram images (a) normal, (b) benign and (c) malignant (Ganesan et al., 2013 p. 85)

Decision trees are nonlinear classifiers which categorize the data into classes. Specifically, they use
subsets of the feature set at different levels of the tree to classify the data, as an alternative to using the
whole feature set at once (Lei Zhen and Chan, 2001). Another approach is k-Means Clustering, which
divides the data into k clusters so that the sum of squared differences is minimized. Then, each data
point is assigned to the class with the shortest Euclidean distance (Patel and Sinha, 2010). Since 2015,
important studies have appeared on the classification of malignant and benign masses in breast cancer
with the Extreme Learning Machine. This algorithm is a feed-forward neural network that randomly
generates the connection weights between input and hidden units and has an exceptional performance
in terms of generalization (Xie, Li and Ma, 2016). Moreover, the introduction of adaptive boosting
brought an improvement over the simple classifiers introduced earlier in the literature. This method
adopts an iterative procedure in which, during every iteration, the weights of the misclassified data are
boosted. Because of this, weak classifiers are forced to learn more by practising on difficult samples
(Arfan, 2017). Galván-Tejada et al. (2017) assessed three different classification techniques: Random
Forest (RF), Nearest Centroid (NC) and K-Nearest Neighbours (K-NN). While Random Forest is a
non-linear supervised sparse regression-based method, K-NN is a supervised instance-based method.
In contrast, NC is a hybrid approach, which combines an instance-based method with a statistical one.
The results from the RF classifier were more reliable and showed stability.
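The k-means idea (minimise the sum of squared distances, then assign each sample to the nearest centre) can be sketched with plain Lloyd iterations. The two-dimensional feature vectors and the initial centres below are hypothetical, and the initial centres are supplied by hand to keep the example deterministic; practical implementations usually choose them randomly or with a seeding heuristic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid with the shortest Euclidean distance to point."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(point, centroids[k]))


def k_means(points, centroids, iterations=10):
    """Plain Lloyd iterations; initial centroids are supplied by the caller."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else old  # an empty cluster keeps its old centroid
            for members, old in zip(clusters, centroids)
        ]
    return centroids


# Hypothetical 2-D feature vectors forming two loose groups.
features = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
            (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centres = k_means(features, centroids=[(0.0, 0.0), (10.0, 10.0)])
labels = [nearest(p, centres) for p in features]
```

The final assignment step is the same nearest-centre rule that a Nearest Centroid classifier applies, except that in the supervised case the centres are computed from labelled training samples.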

As can be seen, every classifier has its own characteristics and can be applied to different problems to
provide specific results. In most cases, though, it is beneficial to use more than one classifier, so
combinations have been proposed to improve the accuracy. This is still the subject of further
investigation.

To summarise, pre-processing, feature generation and extraction and classification of mammograms are

key steps in mass detection in breast cancer. Every step can include a large variety of techniques, the

choice of which depends on the goal and the data set (Ganesan et al., 2013).


2.2 Bilateral Subtraction

Bilateral subtraction can be used in breast cancer detection with CAD systems. The main assumption of
this method is that the two breasts are almost identical and symmetric. Because of that, when the
mammograms are matched with the appropriate processing techniques and then subtracted, the
remaining asymmetries point to possible masses (Yin, 1999). The CAD system, which expects
symmetry in the paired images, selects only the asymmetrical areas as candidate malignant mass
regions. In the literature, this approach seems to improve the system’s performance in specific
conditions and within the limited databases used (Zheng, Chang and Gur, 1995). The challenge of
bilateral subtraction is to find all the suspicious regions with asymmetries while rejecting all of the
areas that are not masses. True masses have particular characteristics: in brief, they have a convex
contour, their density is equal at the centre and the periphery, and they are displayed on at least two
different projections. Non-malignant asymmetries curve inwards and include dense elements at
different places.

Feature extraction and classification methods

Author | Methods | Accuracy [%]
Kimme et al. | Normalized statistics and texture features | 74
Petrosian et al. | Spatial Gray Level Dependence and textural features with a decision tree classifier | 76-89
Kinoshita et al. | Shape and texture features with a three layer feed-forward neural network | 81
Rangayyan et al. | Region based edge-profile acutance measure | 92
Polakowski et al. | Model based vision algorithm. Difference of Gaussians and texture features | 92
Priebe et al. | Fractal texture measures | 88
Sameti et al. | Optical density, photometric and textural features | 72
Chitre et al. | Texture measures with artificial neural network | 87
Mudigonda et al. | Gray level co-occurrence matrices, polygonal modelling with jack-knife classification | 83
Brijesh et al. | Statistical features with fuzzy neural network | 83
Yoshida et al. | Wavelet features in combination with a difference image technique | 90
Liyang Wei et al. | Statistical features in a multiple view mammogram with SVM and KFD | 85
Oliver A et al. | Eigen faces approach | 82-90
Szekeley et al. | Texture features and a combining classifier of decision trees and multiresolution Markov random models | 88-94
Alolfe et al. | Forward stepwise linear regression method with a combined classifier of SVM and LDA | 82.5-90

Table 2.1: Listing of popular feature extraction and classification methods (Ganesan et al., 2013 p. 90)

Table 2.2: Listing of popular feature extraction and classification methods (Ganesan et al., 2013 p. 90)


The computerized scheme of bilateral subtraction contains seven steps:

1. If the mammograms were on film, each pair of mammograms was digitized with a laser scanner to
obtain an image of specific pixel size.

2. The next step involved the segmentation of the mammograms to detect the breast border and nipple.
Two images were used to identify the breast border: the first one was the thresholded image and the
second a smoothed version of the original image. After that, five points, which divide the whole
image into three areas (Figure 2.2), were automatically selected and a tracking algorithm was applied
to detect the border based on the gray levels outside and inside the breast. It is worth mentioning that
there is a relationship between the tracking procedure and the area of the breast. For the nipple
detection, two methods were combined to provide higher accuracy. In the first method, the maximum
height of the breast was taken as the position of the nipple, while in the second a reference point
inside the breast was included.

3. The mammograms were aligned to allow direct comparison between the two images. Reference
points were chosen to transform the coordinates of one image so that they correspond to those of the
other. In the literature, the left mammogram was usually the one that was displaced and rotated. The
coordinates of the detected nipples of both images determined the displacement and the angle of
rotation.

4. Next, normalization of the images took place to correct the difference in brightness between the
right and left mammograms due to the recording procedure.

5. Bilateral subtraction followed the pre-processing and alignment of the images. In most cases, the
left breast image was subtracted from the right. Hence, masses located in the left breast had negative
pixel values in the subtracted image, while masses in the right breast had positive pixel values. The
main goal was to produce two new images, one with the positive and one with the negative pixel
values, while all the common areas in those images lie at the zero gray level, indicating no
remarkable difference between the images. Simple linear stretching made the new images occupy
the complete available range of pixel values, and a threshold value was then determined to extract
the possible malignant regions.

Figure 2.2: Three areas of the breast region (Mendez et al., 1998 p. 958)

6. The next step was the analysis of the suspicious areas, which were revealed when the threshold
value was applied to the new images (Yin, 1999). Many of the areas identified by the bilateral
subtraction technique were not masses (Mendez et al., 1998). Several techniques were tested for the
reduction of false positives and, usually, the size and eccentricity of the suspected regions were
evaluated (Yin, 1999). Based on the size test, regions with values smaller than the cut-off were
ignored (Mendez et al., 1998). Additional texture tests and the absolute values of the gray levels
further decreased the percentage of false positives. With linear discriminant analysis, an analytical
model was developed to classify each area as normal tissue or mass (Yin, 1999). Other techniques
include pre-processing by morphological filtering (Mendez et al., 1998).

7. The last step was the classification of the database to evaluate the performance of the CAD system.
The ground truth was a two-fold classification by the radiologists. First, they defined five main levels
of the mammographic appearance of the masses, related to the visibility of the mass. Level 1 was a
visible mass, easy to identify even for an inexperienced observer. Level 2 was a reasonably clear
mass, detectable even by an inexperienced observer. Level 3 was a hard to notice mass, which can
be recognized by observers with some mammographic experience. Level 4 was a very subtle mass,
which demands more skill and knowledge, and Level 5 was an extremely subtle mass, discoverable
only by a skilled and experienced radiologist. The second classification was based on the
radiographic contrast and the size of the masses, which were manually measured (Yin, 1999).
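The subtraction, stretching and thresholding described in step 5 can be sketched on toy data as follows. This is a minimal illustration, assuming the two images are already mirrored, aligned and normalised; the function names, the 3x3 images and the cut-off of 128 are hypothetical and are not taken from the cited papers.

```python
def subtract(right, left):
    """Pixel-wise right minus left for two aligned, equally sized images."""
    return [[r - l for r, l in zip(r_row, l_row)]
            for r_row, l_row in zip(right, left)]


def split_by_sign(diff):
    """Positive part (right-breast candidates) and negated negative part
    (left-breast candidates) of a difference image."""
    positive = [[max(v, 0) for v in row] for row in diff]
    negative = [[max(-v, 0) for v in row] for row in diff]
    return positive, negative


def stretch(image, max_level=255):
    """Linear stretching so that the largest value maps to max_level."""
    peak = max(max(row) for row in image)
    if peak == 0:
        return [[0 for _ in row] for row in image]
    return [[v * max_level / peak for v in row] for row in image]


def threshold(image, cutoff):
    """Binary mask of pixels at or above the cut-off."""
    return [[1 if v >= cutoff else 0 for v in row] for row in image]


# Toy 3x3 images: one bright spot in the right breast, one in the left.
right = [[10, 10, 10], [10, 60, 10], [10, 10, 10]]
left = [[10, 10, 10], [10, 10, 10], [10, 10, 40]]
pos, neg = split_by_sign(subtract(right, left))
right_mask = threshold(stretch(pos), cutoff=128)
left_mask = threshold(stretch(neg), cutoff=128)
```

Regions that are common to both images cancel to zero, so only the asymmetric spots survive the threshold; the false positive reduction of step 6 would then operate on the connected regions of these masks.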

Figure 2.3: Thresholded images of the right [(a),(c)] and left breasts [(b),(d)] with cut off values 25% [(a),(b)] and 35% [(c),(d)] (Yin, 1991 p. 957)

The image subtraction can be linear or nonlinear. In the linear bilateral subtraction method, as described
above, the right breast image is subtracted directly from the corresponding left breast image, or vice
versa. Subsequent gray level thresholding produces two binary images. The cut-off gray values are
derived from the gray level histograms of those images. After the thresholding, a large proportion of
the remaining features correlate with locations of potential abnormalities. On the other hand, non-linear
subtraction is based on thresholding with various cut-off gray levels before the bilateral subtraction,
thus resulting in various subtraction images from a single pair of right and left mammograms.
Thresholding excludes some normal anatomic background from the subsequent analysis. The cut-off
gray values for thresholding were directly related to equivalent percentages of the areas beneath the
gray level histograms of the corresponding right and left breast images, inside the breast borders. The
pixels with a gray level above the cut-off preserve their gray level, while pixels below the cut-off value
are assigned a constant value. As can be seen from Figure 2.4, the nonlinear subtraction method had a
better performance than the linear subtraction method, with a true positive rate of about 95% at an
average of three false positive detections per image. With the linear approach, the true positive rate was
11% lower.

Effective alignment is crucial for the performance of bilateral subtraction, since any misalignment
between the paired breast images can cause artefacts and affect the evaluation of abnormalities and the
detection performance. However, misalignment is sometimes unavoidable due to physical differences
between the two breast images. In general, it can be caused by variations in breast size, breast
compression, patient positioning, acquisition and computer registration (Zheng, Chang and Gur, 1995).
Despite the detrimental effects of misalignment on the detection of breast cancer, it is very challenging
to model and account for misalignments because of the complexity of the imaging procedure and the
fact that the breast is a soft-tissue organ. Only the nipple position and the breast border can be located
accurately in the alignment process and, sometimes, it is impossible to locate the position of the nipple,
so the skin line is the only reference. This may not have a major impact on the detection of large mass
areas, but it can alter the results near the skin boundary and masses can be missed. Fortunately, several
studies have shown that the detection performance is not sensitive to minor misalignments (Yin, 1991).
Nevertheless, bilateral subtraction misses some true positives for two main reasons. The first is that
masses which are small and have low contrast cannot be detected easily despite being present in the
mammogram. The second is the position of the abnormalities, since the detection efficiency decreases
for masses that are close to the skin boundary.

Figure 2.4: Comparison of the detection performances obtained with the nonlinear and linear subtraction methods (Yin, 1991 p. 962)

It is important to note that after 2013, numerous studies appeared in the literature. Sun et al. (2014)
presented a new study, which aimed to identify the proportion of women at high risk of having or
developing an observable breast cancer. They used three types of features to build a new CAD model
for breast cancer detection. First, they did not use any registration or alignment methods, due to the
limitations of the methods described above. Their approach was based on asymmetry feature extraction
from bilateral mammograms. More specifically, they analysed and matched the differences of image
features that were computed independently from the two bilateral images. Previous research had used
only spatial features, but Sun et al. (2014) added morphological and texture ones, aiming not only to
detect the abnormal regions but also to predict the near-term risk of the negative cases, with a 73.5%
correct rate. In this study, they used the assumption that the two breasts of negative cases have
relatively symmetrical areas. Moreover, it was shown that the asymmetry of the tissue indicated the
risk of a woman having or later developing breast cancer. Despite the encouraging results, this
approach is not ready to be applied to the general population, due to the limited testing dataset.

Kelder, et al. (2015) investigated an advanced CAD scheme without lesion segmentation, which was based on the identification and analysis of region of interest (ROI)-based bilateral mammographic tissue asymmetry. Their algorithm included bilateral image registration, image feature selection and a naive Bayes linear classifier. The first step was the automated identification of abnormal areas of non-fixed size in every mammogram, followed by registration of the mammograms to find ROI-based bilateral mammographic feature asymmetries. Finally, a machine-learning classifier was applied to combine the ROI-based features and to compute a likelihood score for each ROI. It was already known that bilateral tissue asymmetries can indicate the risk of developing breast cancer and that they are among the main features that a radiologist examines to make a diagnosis. Those asymmetries were global or local differences in tissue density between the right and left mammograms, or between matching areas in the two bilateral mammograms. Kelder, et al. (2015) showed that the asymmetry scheme had superior performance and that the asymmetry features can help radiologists in the identification of high-risk patients. While this study produced very promising results, the database that was used was not large enough for a satisfactory performance evaluation, and more work is needed on this matter.

Figure 2.5: Manually registered borders of the right and left breasts (Yin, 1991, p. 956)

A basic problem in mass detection from mammograms using CAD systems was the relatively high number of false positive (FP) detections. Li, et al. (2014) suggested a two-step technique that could decrease this number and thereby improve the identification process. The first step was similar to that of Kelder, et al. (2015), while in the second step they examined the bilateral similarity to reject the FP detections. Li, et al. (2014) found that the calculation of a matching cost was necessary. Global and local image features were used to detect the similarities between mass-to-normal and normal-to-normal pairs and to determine the ROIs. In general, this method showed promising results in decreasing the FP detections for breast cancer detection. Its limitations included the small-scale database, the inability of the algorithm to identify a mass that was similar to its matched region, and the fact that low-density masses were missed. Likewise, Casti, et al. (2015) developed a new approach for the analysis of structural similarity between right and left mammograms using landmarking, automatic bilateral masking procedures, multi-directional Gabor filters, modelling of spherical semivariograms and extraction of similarity features. They noted that the central problem in the identification of asymmetries is that the bilateral anomalies caused by a developing or underlying pathological process must be discriminated from the physiological variations between the two breasts. They implemented landmarking and bilateral masking approaches to segment paired mammographic areas. Then, with the use of spherical semivariogram descriptors, they extracted useful information related to the spatial dependency of pixels inside the mammographic area. With Gabor filters combined with the original gray scale values, they characterized the structural information present in the oriented patterns of the breast. Their final step was the introduction of correlation-based structural algorithms to compare the paired regions. This study showed that this approach had better performance, with the only limitation being the relatively small number of asymmetric cases available in public databases.

Celaya-Padilla, et al. (2015) developed a CAD method that can triage mammogram sets automatically. This technique co-registered the left and right mammograms, extracted image features and divided the patients into three basic categories: at risk of having malignant masses, malignant masses and healthy subjects. Like the studies presented above, it was based on asymmetry analysis. The approach was related to earlier studies, but it was extended by the extraction of hundreds of asymmetry features and by the use of an automated registration algorithm, which was more robust and simpler. This technique could be used to prioritize the cases most likely to contain malignant findings in developing countries where there are few radiologists.

Celaya-Padilla, et al. (2015) and Tan, et al. (2016) introduced a new model to predict near-term breast cancer risk based on quantitative assessment of bilateral mammographic image feature variations in a series of negative full-field digital mammography (FFDM) images. The database included four sequential FFDM examinations for every patient, with the first three examinations considered priors and the last one the recent examination. They also studied a possible link between the model-generated risk scores and the time lag between negative and positive screenings. It is worth mentioning that Tan, et al. (2016) adopted a large number of mammographic density, texture and structure based features used in the literature and formulated several new ones. They determined the most effective ones to be the WLD similarity features and the RLS, texture and gray level magnitude based features. To eliminate possible training and testing bias, they adopted a leave-one-case-out (LOCO) cross-validation method with feature selection. Their observations indicated a non-linear relationship between the diverse mammographic tissue patterns from the negative to the positive screening, and that the near-term cancer risk factor, estimated from bilateral mammographic image feature asymmetry, did not depend on age. In addition, they combined the negative and recalled benign cases into one cancer-free group to ease subsequent processing.
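
The leave-one-case-out splitting mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of Tan, et al. (2016): the function name and the example case identifiers are hypothetical. The general reason for pairing such a scheme with feature selection is that selection can be repeated on the training indices of each fold, so that the held-out case never influences which features are kept.

```python
from collections import defaultdict

def leave_one_case_out(case_ids):
    """Yield (held_out_case, train_indices, test_indices) for every case.

    case_ids[i] is the (hypothetical) patient/case label of sample i, so all
    samples of one case are always held out together.
    """
    by_case = defaultdict(list)
    for idx, case in enumerate(case_ids):
        by_case[case].append(idx)
    for case, test_idx in by_case.items():
        train_idx = [i for i, c in enumerate(case_ids) if c != case]
        yield case, train_idx, test_idx

# Hypothetical example: five images belonging to three cases.
case_ids = ["A", "A", "B", "C", "C"]
folds = list(leave_one_case_out(case_ids))
for case, train_idx, test_idx in folds:
    # Feature selection and classifier training would use train_idx only.
    print(case, train_idx, test_idx)
```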

In 2016, Casti, et al. (2017) proposed an algorithm for the automatic localization of malignant sites of asymmetry in mammograms. First, the left and right mammograms were pre-processed to find anatomical landmarks for bilateral matching and then filtered with Gabor filters at different orientations to enhance the directional components of the breast. After that, the segmentation procedure took place with two masks: one horizontal and one vertical. The horizontal mask was built from rectangular areas of the same height, spanning from the minimum to the maximum pixel inside the breast area. The vertical one included eight more rectangular regions across the chest area. The paired regions were characterized with correlation-based similarity measures. Classification was achieved with Bayesian models, which were trained using structural similarity features. With this study, they derived a useful system with satisfactory results that can localize sites of malignant asymmetry; their classifier was database-independent and ensured an unbiased outcome.
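
To illustrate what filtering "at different orientations" involves, the sketch below builds a small bank of real Gabor kernels from the standard Gabor formula (a Gaussian envelope multiplied by a cosine carrier). The kernel size, wavelength, sigma and aspect ratio are illustrative assumptions and are not the parameters used by Casti, et al.; plain Python lists are used only to keep the example dependency-free.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Return a size x size real Gabor kernel (list of rows); size must be odd.

    theta is the orientation in radians, gamma the spatial aspect ratio,
    psi the phase offset of the cosine carrier.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates into the filter's own frame.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

# A bank of four orientations: 0, 45, 90 and 135 degrees (assumed values).
bank = [gabor_kernel(9, wavelength=4.0, theta=k * math.pi / 4, sigma=2.0)
        for k in range(4)]
```

Each kernel would then be convolved with the mammogram, and the responses across orientations combined to emphasise oriented tissue patterns.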

One year later, Yan, et al. (2017) tested a new algorithm for breast cancer risk prediction. They used a single mutual threshold, rather than two different thresholds, on bilateral mammograms to segment the breast areas. The threshold value was chosen as the median grayscale value of the entire pixel range in both the left and right mammograms. Next, they estimated three types of image features: asymmetry features, defined by the absolute difference between the two bilateral mammograms; mean features, estimated by taking the average value of the two registered mammograms; and maximum features, given by the higher value of the two matched bilateral mammograms. With a two-stage classification model, they combined the three diverse types of features, and the risk prediction performance was assessed with a leave-one-out cross-validation method. Their results were promising, as they achieved higher prediction accuracy than other studies in the literature.
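
The three feature types described above reduce to simple element-wise operations on the feature values computed for each breast. The sketch below shows them, together with one possible reading of the shared threshold (the median of the pooled gray values of both images). The function names and example numbers are hypothetical, and the pooling interpretation is an assumption rather than a detail confirmed by the source.

```python
from statistics import median

def mutual_threshold(left_pixels, right_pixels):
    """One threshold shared by both images: median of the pooled gray values.

    Pooling both images is an assumed interpretation of the description.
    """
    return median(list(left_pixels) + list(right_pixels))

def bilateral_features(left_feats, right_feats):
    """Combine per-breast feature values into asymmetry, mean and maximum features."""
    asym = [abs(l - r) for l, r in zip(left_feats, right_feats)]
    mean = [(l + r) / 2.0 for l, r in zip(left_feats, right_feats)]
    maxi = [max(l, r) for l, r in zip(left_feats, right_feats)]
    return asym, mean, maxi

# Hypothetical values for two features measured on the left and right breast.
asym, mean, maxi = bilateral_features([2.0, 5.0], [6.0, 3.0])
threshold = mutual_threshold([10, 20, 30], [20, 40, 60])
```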

In conclusion, the bilateral subtraction approach is an effective technique for breast cancer detection,

despite the limitations mentioned above, because of the low average number of false positives. Although

the true positive rates still need improvement, radiologists can use these systems as second readers to

separate normal and malignant masses and to identify normal areas as suspicious (Yin, 1999). Table 2.3

summarizes the approaches of bilateral subtraction techniques.

Table 2.3: Comparison of Bilateral Subtraction methods

Bilateral Subtraction methods
Sun, et al. (2014)               𝐴𝑍 = 0.754 ± 0.024
Kelder, et al. (2015)            𝐴𝑍 = 0.87 on the ROI-based evaluation
                                 𝐴𝑍 = 0.72 on the case-based evaluation
Li, et al. (2014)                Sensitivity = 85%
                                 34% reduction of FP
Casti, et al. (2015)             𝐴𝑍 = 0.83 with linear discriminant analysis
                                 𝐴𝑍 = 0.77 with Bayesian classifier
                                 𝐴𝑍 = 0.87 with ANN
Celaya-Padilla, et al. (2015)    p-rank: 𝐴𝑍 = 0.882 for calcifications vs. healthy cases
                                 p-rank: 𝐴𝑍 = 0.842 for masses vs. healthy cases
                                 z-normalization: 𝐴𝑍 = 0.882 for calcifications vs. healthy cases
                                 z-normalization: 𝐴𝑍 = 0.807 for masses vs. healthy cases
Tan, et al. (2016)               𝐴𝑍 = 0.730 ± 0.027
Casti, et al. (2017)             1st database: 𝐴𝑍 = 0.79
                                 2nd database: 𝐴𝑍 = 0.75
Yan, et al. (2017)               𝐴𝑍 = 0.830 ± 0.033

2.3 Temporal Analysis

In temporal analysis, mammograms from multiple examinations of the same breast, acquired at earlier times, are available to the CAD algorithm to achieve higher accuracy in detecting abnormalities (Timp and Karssemeijer, 2006). The past mammograms can convey significant information about the likelihood of breast cancer. If a suspicious area in the current mammogram closely matches, in location and appearance, a suspicious region in the past mammogram, then the anomaly is probably not associated with cancer. Conversely, if there is no corresponding anomaly in the past mammogram or there is significant change, then the area is considered a malignant lesion. By inspecting previous mammographic images, the number of false positives is decreased and malignant masses that would otherwise be missed are identified (Ma et al., 2010; Ma et al., 2015).

There are some significant advantages in using prior mammograms. First, when the current mammogram is compared to the prior one, subtle signs of malignancy, like small masses or recent calcifications, are more obvious. These differences might be overlooked if the past mammogram were not available for comparison, so radiologists use this approach frequently to identify developing anomalies and boost the true positive rate. Second, the suspicious areas of the present mammogram can be identified more accurately when each area is compared with the corresponding area of the past mammogram. For instance, if a radiologist detects a mass on the current mammogram, he/she can use the past one to determine whether this is a new or an existing density. If the mass was apparent on the prior mammogram, the radiologist can analyse the size and the contrast of both lesions. A third benefit of using previous mammograms is that the additional data can be used to eliminate a number of false positive detections. If the areas related to a mass in both sequential mammograms are alike, those areas probably represent false positives or gradually growing benign masses (Timp and Karssemeijer, 2006).

In order to detect breast cancer with temporal analysis, the prior and current mammograms must be aligned. Registration is the key to the alignment of mammograms and is increasingly important for the early detection of abnormalities. However, registration techniques face many challenges due to changes that occur in the breasts over time and differences in the way the mammography is performed, including variations in breast compression and imaging parameters, and changes in the shape and the amount of pectoral muscle that is visible in the medio-lateral projection (Marias et al., 2005). Because of these challenges, registration of mammographic images is still an ongoing research topic for the improvement of CAD systems using temporal analysis (Ma et al., 2010).

There are two major categories of registration techniques for mammograms acquired at different times in the literature: global registration techniques and regional registration techniques. The first approach compares the entire current mammogram with the previous one to identify corresponding regions in the breast (Timp and Karssemeijer, 2006). Vujovic and Brzakovic (1997) applied this method by separating the current and past mammograms into various areas with the use of internal control points. Depending on the location of the control points, both mammograms were partitioned into statistically homogeneous areas. Textural and contrast features from circular regions centred on the control points were compared, providing valuable information to distinguish normal from abnormal tissue. Figure 2.6 outlines the scheme that Vujovic and Brzakovic (1997) used to identify the control points in the mammograms. Richard and Cohen (2003) used a variational formulation to find the smooth function that best describes the deformation necessary to map one mammogram onto another, but they applied that technique to bilateral pairs. While this method is mathematically very efficient, in practice the true map between two images of the same breast is not smooth, so the method cannot be applied effectively to temporal pairs. Hasegawa, et al. (2008) started with a rigid body alignment and then registered the dense areas of the mammogram with the use of a B-spline control point grid.

The regional registration approach correlates suspicious areas in current mammograms with corresponding areas in the past ones (Timp and Karssemeijer, 2006). In other words, this algorithm searches locally in one mammogram to find regions of interest that match those of the paired mammogram (Ma et al., 2015). Sanjay-Gopal, et al. (1999) defined a fan-shaped search area in past mammograms, using the nipple and the centroid of the breast as references. For every abnormal area on the current mammogram, they constructed the fan-shaped search region on the previous mammogram, and for every location inside the search region a correlation measure was computed. The location with the highest correlation value was selected as the location of the corresponding area in the prior mammogram. Figure 2.7 presents the procedure used by Sanjay-Gopal, et al. (1999). In the same way, Hadjiiski, et al. (2001) classified the masses as malignant or benign based on the comparison of the current and prior mammograms. In that case, radiologists defined the region of interest and then the algorithm found the associated anomalies inside the mammograms.
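
The correlation search described above can be sketched as a template match restricted to a list of candidate locations. This is a minimal illustration, assuming normalized cross-correlation as the correlation measure and representing the search region simply as a hypothetical list of candidate top-left corners; the construction of the fan-shaped region itself is not reproduced here.

```python
import math

def ncc(a, b):
    """Normalized cross-correlation of two equal-size patches (lists of rows)."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - ma) ** 2 for x in fa) * sum((y - mb) ** 2 for y in fb))
    return num / den if den > 0 else 0.0  # flat patches carry no information

def best_match(prior, template, candidates):
    """Return ((row, col), score) of the candidate patch in `prior` with highest NCC.

    `candidates` holds non-negative top-left corners inside the search region.
    """
    h, w = len(template), len(template[0])
    best, best_score = None, -2.0
    for r, c in candidates:
        patch = [row[c:c + w] for row in prior[r:r + h]]
        if len(patch) != h or any(len(p) != w for p in patch):
            continue  # candidate falls partly outside the image
        score = ncc(patch, template)
        if score > best_score:
            best, best_score = (r, c), score
    return best, best_score

# Hypothetical 5x5 prior image containing the 2x2 pattern of the current lesion.
prior = [[0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 9, 1, 0],
         [0, 0, 1, 5, 0],
         [0, 0, 0, 0, 0]]
template = [[9, 1], [1, 5]]
search_region = [(r, c) for r in range(4) for c in range(4)]
location, score = best_match(prior, template, search_region)
```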

Figure 2.6: Processing steps and information flow between processing steps used to identify potential control points and establish their correspondence (Vujovic & Brzakovic, 1997, p. 1388)

Figure 2.7: Regional registration technique (Sanjay-Gopal et al., 1999, p. 2671)

A natural extension was to combine global and regional registration for improved results. Marias, et al. (2005) combined several techniques:

1. Detection of the breast outline, which was important in the registration process to ensure that the breast boundary was segmented. This step was complicated by labels that overlapped or were close to the breast region, noise and/or skinfolds. The boundary extraction was based on a combination of the Hough transform, image gradient operators and morphology to isolate the background and mark points along the boundary.

2. Curvature analysis of the breast outline. This method included the detection of invariant points along the breast boundary that are characteristic of the curvature. The point with the maximum curvature corresponded, most of the time, to the nipple and, even if the nipple was not noticeable, the algorithm still used the maximum negative curvature point as this landmark. In general, the three detected landmarks usually corresponded to the anatomical location of the rib (point 1 in Figure 2.8), the nipple (point 2 in Figure 2.8) and the axilla (point 3 in Figure 2.8). Nevertheless, the breast outline could be notched, which complicated the computation of the curvature with second-order derivatives and resulted in a noisy curvature profile. This problem was overcome with Gaussian multiscale analysis of features in 2-D curves.

3. Image transformation. With at least 5 points along the breast boundary, a satisfactory initial alignment can be achieved for temporal mammogram registration. With 7 points between landmarks three and two and another 7 between landmarks two and one, greater accuracy can be reached. Based on these points, the transformation that aligned the boundaries was computed with the use of thin-plate spline interpolation, and warped images were created by assigning to every point inside a mammogram the intensity value of the point of the past mammogram onto which the interpolating function maps it. Intensities at positions that fall off the pixel grid can be estimated by bilinear interpolation. This approach reduced the primary differences between the images and corrected for scaling, translation and limited rotations caused by breast positioning and orientation.

4. Definition of internal correspondences in the mammogram pair. Different breast compressions tend to make denser structures move more than the less dense ones, which can result in an interruption of an otherwise generally smooth motion field. The performance of the transformation can be improved by choosing internal landmarks. Isolated areas of dense tissue, which are often the brightest regions in a mammogram, are the preferred choice for internal landmarks. Calcifications, however, cannot be used as possible matches due to their restricted spatial extent, which can result in numerous but erroneous matches. The distinct areas of dense tissue that move closer together when the primary boundary-based algorithm is applied were considered preferentially, since they decreased the value of the internal matches. A nonlinear Coiflet wavelet scale-space technique was used to analyse the mammogram pair in order to identify important regions of interest. Additionally, for each region in the first image a search window was implemented in the second, with a match rejection filter to guarantee that spatially localized features were not matched to bigger ones. Finally, a limited number of landmarks were determined as the centroids of the matched areas.

5. Finally, the boundary points and the internal landmarks were included in a thin-plate spline approximation approach that allows both smoothness control and individual weighting for every landmark, to match the images accurately.
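
For reference, the thin-plate spline interpolation of step 3 and the weighted approximation of step 5 can be written in their standard textbook form as below. The notation (landmarks p_i, target values v_i, weights w_i, affine coefficients a_0 to a_2, smoothing parameter λ and per-landmark variances σ_i²) is generic and is introduced here only for illustration; it is not taken from Marias, et al. (2005).

```latex
% Thin-plate spline for one output coordinate, given landmarks p_i = (x_i, y_i)
f(x, y) = a_0 + a_1 x + a_2 y + \sum_{i=1}^{n} w_i \, U\!\left(\lVert (x, y) - p_i \rVert\right),
\qquad U(r) = r^2 \log r

% Interpolation: f(p_i) = v_i for every landmark.
% Approximation with smoothness control (\lambda) and individual weights 1/\sigma_i^2:
\min_f \; \sum_{i=1}^{n} \frac{\left(v_i - f(p_i)\right)^2}{\sigma_i^2}
  + \lambda \iint \left( f_{xx}^2 + 2 f_{xy}^2 + f_{yy}^2 \right) \, dx \, dy
```

A larger λ yields a smoother, more affine-like mapping, while a small σ_i forces the spline to pass close to landmark i.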

One year later, Timp and Karssemeijer (2006) designed another combined global and regional registration method. The method started with the two temporal mammograms, the prior and the current, in which the breast region and the pectoral muscle were segmented. A global registration method based on centre of mass alignment registered the current and past mammograms, and a pixel-level mass identification algorithm assigned a measure of suspiciousness to every pixel inside the breast region. This measure indicates the likelihood that a cancerous mass is present. Then, the most suspicious locations on the current image were chosen and each was linked to a corresponding location on the past image. Both locations were segmented into distinct areas, and features were computed for every area and used to classify each region.
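
Centre of mass alignment, as used for the global registration step above, amounts to translating the prior image so that the centre of mass of its breast region coincides with that of the current image. The sketch below is a minimal illustration under stated assumptions: the breast regions are given as binary masks, the shift is rounded to whole pixels, and all function names are hypothetical.

```python
def centre_of_mass(mask):
    """Centre of mass (row, col) of a binary mask given as a list of rows."""
    total = rsum = csum = 0
    for r, row in enumerate(mask):
        for c, v in enumerate(row):
            if v:
                total += 1
                rsum += r
                csum += c
    if total == 0:
        raise ValueError("empty mask")
    return rsum / total, csum / total

def translate(image, dr, dc, fill=0):
    """Shift `image` by integer offsets (dr, dc); uncovered pixels get `fill`."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w:
                out[rr][cc] = image[r][c]
    return out

def align_by_centre_of_mass(prior, prior_mask, current_mask):
    """Translate `prior` so that both breast-region centres of mass coincide."""
    pr, pc = centre_of_mass(prior_mask)
    cr, cc = centre_of_mass(current_mask)
    dr, dc = round(cr - pr), round(cc - pc)
    return translate(prior, dr, dc), (dr, dc)

# Hypothetical 4x4 masks: the breast region sits top-left in the prior view
# and lower and further right in the current view.
prior_mask = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
current_mask = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0]]
aligned, shift = align_by_centre_of_mass(prior_mask, prior_mask, current_mask)
```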

Figure 2.8: Consistent landmarks in the CC and ML ‘idealized’ outlines (Marias et al., 2005, p. 3)

For a more accurate comparison, van Engeland, et al. (2003) evaluated four different approaches for temporal mammogram registration. The first two methods were the simplest alignment procedures, based on nipple location and on the centre of mass of the breast region. In the first method, the nipple was detected automatically and the nipples in the past and current views were aligned by translation of the past view. In the second method, the mammograms were segmented into three areas: the breast, pectoral and background areas. The centres of mass of the breast area in the past and current views were placed at the same location by translation of the past view. The third method used mutual information, which is calculated from the joint probability distribution of the images’ intensities. The two images were registered by maximizing the mutual information using rotation, scaling and shearing. Finally, the fourth method was a warping approach, which used a set of automatically determined control points placed on the breast contour and the pectoral muscle. The control points in both views were aligned with each other and the warped image was formed by interpolation between the control points, using a thin-plate spline surface approach. Van Engeland, et al. (2003) determined that the mutual information method provided the best results. Table 2.4 compares the previous studies related to the registration of temporal mammograms.
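
As an illustration of the mutual information criterion used in the third method, the sketch below estimates mutual information from the joint histogram of two equal-size images. It is a minimal example under stated assumptions: the number of bins, the 8-bit intensity range and the function name are hypothetical choices, and the optimisation over rotation, scaling and shearing that a registration method would wrap around this measure is omitted.

```python
import math
from collections import Counter

def mutual_information(img_a, img_b, bins=8, max_val=255):
    """Mutual information (in bits) of two equal-size gray-level images."""
    fa = [v for row in img_a for v in row]
    fb = [v for row in img_b for v in row]
    if len(fa) != len(fb):
        raise ValueError("images must have the same size")

    def quantize(v):
        # Map an intensity in [0, max_val] to one of `bins` histogram bins.
        return min(int(v * bins / (max_val + 1)), bins - 1)

    pairs = [(quantize(a), quantize(b)) for a, b in zip(fa, fb)]
    n = len(pairs)
    joint = Counter(pairs)               # joint histogram
    pa = Counter(a for a, _ in pairs)    # marginal histogram of image A
    pb = Counter(b for _, b in pairs)    # marginal histogram of image B
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with counts to avoid extra divisions
        mi += p_ab * math.log2(count * n / (pa[a] * pb[b]))
    return mi

# Identical images share all their information; these two-level images give 1 bit.
img = [[0, 255], [0, 255]]
mi_same = mutual_information(img, img)
# Statistically independent patterns give zero mutual information.
mi_indep = mutual_information([[0, 0], [255, 255]], [[0, 255], [0, 255]])
```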

Table 2.4: Comparison of registration techniques in temporal mammograms

Registration methods in temporal mammograms
Vujovic and Brzakovic (1997)     86% of points in agreement with ground truth
Sanjay-Gopal, et al. (1999)      85% correctly identified with Global Breast Alignment
                                 77% correctly identified without Global Breast Alignment
Hadjiiski, et al. (1999)         92% classification accuracy using temporal mammograms
                                 90% classification accuracy using current mammograms
                                 78% classification accuracy using prior mammograms
Marias, et al. (2005)            85% accuracy for the identification of the rib, 75% of the nipple and 80% of the axilla
Timp and Karssemeijer (2006)     72% of the cases were correctly identified
                                 69% of the cases were correctly linked based on correlation
                                 200 FP and 389 TP
van Engeland, et al. (2003)      Mutual Information outperformed all the techniques
                                 Mean 7.9 and Median 6.1

After 2010, a significant number of studies on breast cancer detection with temporal analysis appeared in the literature. A large number of those studies dealt with the registration process, while new approaches were established for the identification of abnormalities.

Rangayyan, Banik and Desautels (2010) developed a method for the identification of architectural distortion in mammograms, with the use of Gabor filters, phase portrait analysis, fractal analysis and Haralick’s texture features. Architectural distortion is the distortion of the architecture of a breast region without an associated elevated density or mass. Such areas, due to their subtlety and diverse appearance, are missed most of the time. For the feature extraction, they used Haralick’s texture features, which are based on the moments of a joint probability density function that is calculated using the joint occurrence, or co-occurrence, of gray levels. Feature selection was accomplished by evaluating the performance of every feature or of a combination of significant features. Rangayyan, Banik and Desautels (2010) used various image processing and pattern classification methods, such as artificial neural networks and support vector machines. From their results, they found that Gabor filters, phase portraits, fractal analysis and textural features could be combined to achieve early detection of subtle evidence of breast cancer in mammograms, especially as related to architectural distortion. The main disadvantage of their method was the high number of false positives, which could be reduced with the identification and removal of the pectoral muscle.
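
The co-occurrence of gray levels on which Haralick's features are based can be illustrated with a short sketch. It builds a normalized (non-symmetric) gray-level co-occurrence matrix for one pixel offset and computes three of the classical features: angular second moment (energy), contrast and inverse difference moment. The offset, the number of gray levels and the tiny example image are assumptions for illustration and do not reproduce the settings of Rangayyan, Banik and Desautels (2010).

```python
def glcm(image, levels, dr=0, dc=1):
    """Normalized gray-level co-occurrence matrix for the pixel offset (dr, dc).

    `image` is a list of rows of integer gray levels in range(levels) and must
    contain at least one valid pixel pair for the chosen offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(image), len(image[0])
    total = 0
    for r in range(h):
        for c in range(w):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w:
                counts[image[r][c]][image[rr][cc]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]

def haralick_subset(p):
    """Energy (angular second moment), contrast and inverse difference moment."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    energy = sum(p[i][j] ** 2 for i, j in cells)
    contrast = sum((i - j) ** 2 * p[i][j] for i, j in cells)
    idm = sum(p[i][j] / (1 + (i - j) ** 2) for i, j in cells)
    return energy, contrast, idm

# Hypothetical two-level image with one vertical edge, horizontal neighbour offset.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1]]
p = glcm(image, levels=2)
energy, contrast, idm = haralick_subset(p)
```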

Ma, et al. (2010) recognized the difficulty of registering temporal mammograms and replaced image registration with graph matching. The promising advantage of this method was that registration was restricted to image features that were consistent with breast cancer, which reduced the chance of error. Their procedure included 4 steps. First came the segmentation of the current and prior mammograms, followed by the assignment of mass-like scores to components to show the likelihood that a component corresponded to a malignant mass. After that came the implementation of the graph matching algorithm to pair candidate masses in the current and prior images, and finally the adjustment of the mass-like scores to reflect the information from both mammograms. The graph-theory-based method detected components that were associated with cancerous masses but did not provide the precise mass boundaries that are needed to estimate specific characteristics. The performance of their identification scheme, when applied only to the current images, was comparable to the best detection results presented in the literature. The main drawback of this method was the increased number of false positive detections per image, although this number changed slightly when the past mammogram was compared to the current one. Moreover, the graph matching process was unsuccessful when the two temporal mammograms had significant differences in appearance, usually due to the presence of dense tissue.

Diez, et al. (2011) proposed a quantitative evaluation of state-of-the-art intensity-based image registration approaches applied to temporal mammograms, which included global and rigid transformations, local deformable paradigms using different metrics and multi-resolution techniques. They assessed their results on temporal cases with a quantitative analysis and a multi-observer study, and quantitatively compared rigid and non-rigid intensity-based registration methods. The eight state-of-the-art algorithms that were used were Rigid, Affine, B-Spline Free-Form Deformations, Polyrigid, Demons and combinations of them. While the B-Spline Free-Form Deformation (BSP) method showed the best results from the numerical as well as the subjective point of view, it produced registration artefacts. To solve this problem and improve the results, Diez, et al. (2011) combined the BSP method with Affine registration or multi-resolution techniques. Soon after, Diez, et al. (2014) published a new study related to three other registration algorithms (Affine, SyN and Demons) tested on DCE-MRI images, which were collected from clinical practice. The methodology included segmentation approaches, which can focus on the region of the breast, anatomical landmarks and an image metric in order to evaluate the quality of registration. These three registration methods were applied separately to each breast using automatic breast segmentation masks. With temporal registration, they compared current and prior exams to reduce the false positives and to identify abnormal areas. The conclusion was that the SyN registration technique provided the best results.

Martí, et al. (2014) evaluated the use of image and deformation features to identify abnormal cases with cancerous masses. More specifically, they investigated whether the image registration results could be used for the identification of malignant masses from temporal mammograms of the same patient and for the categorization of cases as normal or abnormal. Their approach included an Affine transformation, which maximised a mutual information similarity measure, together with a non-rigid point correspondence method based on a robust point matching algorithm. The intensity, deformation and various similarity-based features obtained from the registration method were subsequently used in a machine-learning framework to identify cancerous cases. Martí, et al. (2014) found that the combination of features could improve the results in breast cancer detection, but that supplementary registration algorithms and a multi-centre dataset should be added in future work.

Bozek, et al. (2014) performed temporal comparison of lesions in full-field digital mammograms (FFDM) and extracted temporal features that describe the change in the lesion between the two temporal mammograms. The main advantages of this study were the exclusive use of FFDM images and the introduction of the volumetric change of a lesion, which was determined using dense tissue thickness maps. More specifically, they adopted four volumetric features that include information related to, among other things, the size of the lesion. Equally important, the volumetric features could overcome the constraints encountered when calculating the size of a lesion. The results showed that volume might be a much more relevant feature than area for the analysis of the temporal change in lesion size. Furthermore, the classification performance could be improved if the temporal volumetric features were added to a set of features collected only from the current exam. However, the study had a drawback: the small sample size did not permit Bozek, et al. (2014) to establish whether there was an advantage in using volume features.

Ma, et al. (2014) presented a temporal mammogram registration framework, based on spatial relations between ROIs and graph matching, which was used to build correspondences between ROIs of current and past mammograms. They employed 18 image features (Table 2.5) to capture the dissimilarities between the matched areas. To evaluate the contribution of temporal change information to the identification of abnormalities, 5 techniques were implemented to combine mass classification based on image features measured on single areas with mass classification based on temporal features. The framework of this study contained both preprocessing of the current and prior mammograms (gamma correction, anisotropic filtering and extraction of both the breast and pectoral muscle boundaries) and segmentation (adaptive pyramid based segmentation and sublevel set analysis). Their results demonstrated that including the temporal features in the mammogram mass identification and combining them with the single classification features, linearly or by taking the minimum value of the two classifications, increased the general performance of the algorithm. The main limitation of this study was the relatively small set of mammograms that was used. A larger set of mammograms, or supplementary mammogram pairs from diverse sources and databases, would allow a better determination of whether the use of temporal change information improves cancer identification. Additionally, Ma, et al. (2014) used only five basic algorithms for the implementation of temporal image features in the detection.
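
The combination of the single-area and temporal classifications mentioned above (linearly or by taking the minimum) can be sketched as a simple score fusion. This is an illustrative sketch only: it assumes both classifiers output a score in [0, 1] for the same region, the weight is a hypothetical parameter, and the function name is not taken from Ma, et al. The Max rule is included because it is named for the follow-up study.

```python
def combine_scores(single, temporal, rule="min", weight=0.5):
    """Fuse two classifier scores in [0, 1] that refer to the same region.

    `weight` is the (assumed) contribution of the single-area score in the
    linear rule; the temporal score receives 1 - weight.
    """
    if rule == "linear":
        return weight * single + (1.0 - weight) * temporal
    if rule == "min":
        return min(single, temporal)
    if rule == "max":
        return max(single, temporal)
    raise ValueError(f"unknown rule: {rule}")

# Hypothetical scores: the region looks mass-like on its own (0.8) but shows
# little temporal change (0.25), so the Min rule lowers its final score.
fused_min = combine_scores(0.8, 0.25, rule="min")
fused_max = combine_scores(0.8, 0.25, rule="max")
fused_lin = combine_scores(0.8, 0.25, rule="linear", weight=0.5)
```

With the Min rule a region is only scored highly when both classifiers agree, which is one plausible reason such a rule can suppress false positives.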

Table 2.5: Single and temporal features (Ma et al., 2014 p. 1264)

Single features       Matched features
solidity              solidity
axis ratio            solidity2
std ratio             axis ratio
iv                    circularity
c2                    int
c3                    relint
int entropy           std radi
energy                radi
inertial momentum     c2
anisotropy            c3
m1-m7                 int entropy
                      energy
                      inertial momentum
                      anisotropy
                      m2,m3,m7
                      mass like number

Soon after, Ma, et al. (2015) designed an innovative approach that incorporated 17 fuzzy sets based on spatial relations to register temporal mammogram pairs. They used four spatial relations: to the right of, to the left of, below and above. The histogram of the angles between all pairs of points in a pair of ROIs was regarded as a fuzzy set, and the spatial relation between the pair of ROIs was characterized by determining to what degree this fuzzy set approached each of the four spatial relations. Based on the spatial relations, the association of ROIs in temporal mammogram pairs was treated as a graph-matching problem, and the registration of temporal mammograms was performed by finding the shared sub-graph between the two graphs representing a pair of temporal mammograms. Their processing and segmentation scheme was similar to the one in their prior research. Their experiments showed that this algorithm can cope with mammograms with variations in position or size, but they did not find an actual enhancement in performance, since the increase in the ROC value was only slight. Despite this, even a small increase in performance has the potential to positively affect the results for breast cancer detection, especially in developed countries. Ma, et al. (2015) found that their algorithm could identify changes over time, so it worked well on dense breasts, but for these methods they had to extract the breast boundary accurately and then find the reference points. The breast boundary was used as a component of the registration method to supply a global reference. When they combined the classifications, a minor increase in performance was observed compared to single detection. Determining the statistical significance of such a limited improvement requires larger datasets and better ROI boundaries.

Subsequently, Ma, et al. (2015) expanded their research to investigate the combination of image features measured from single areas and image features measured from the matched areas of temporal mammograms, based on fuzzy spatial relation representation and graph matching. They used three Support Vector Machine (SVM) kernels, namely the multi-layer perceptron kernel, the polynomial kernel and the Gaussian radial basis kernel, and also combined those kernels and applied them to the mammograms from the two time points for mass classification. To connect the two types of features from the single and

temporal mammograms, they used three combination rules: Linear combination, the Max rule and the

Min rule. Their results showed that this Multiple Kernel Learning (MKL) method provided the best

performance on both single and temporal feature sets, using the Min combination rule for the most

effective classification. The major drawback of this study was the use of only heuristic searching to

reduce computational time.
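The three combination rules can be illustrated with a minimal Python sketch. It assumes that the single-image classifier and the temporal classifier each output a malignancy score in [0, 1] for the same region; the function name `combine_scores` and the weight `w` are hypothetical and are not taken from Ma, et al. (2015).

```python
def combine_scores(single, temporal, rule="min", w=0.5):
    """Fuse a single-image score with a temporal score for one region.

    `w` is an assumed weight for the linear rule; the source does not state it.
    """
    if rule == "linear":
        return w * single + (1.0 - w) * temporal
    if rule == "max":
        return max(single, temporal)
    if rule == "min":
        return min(single, temporal)
    raise ValueError(f"unknown rule: {rule}")


# Compare the three rules on one illustrative pair of scores.
scores = {rule: combine_scores(0.8, 0.6, rule) for rule in ("linear", "max", "min")}
```

Under the Min rule a region receives a high combined score only when both classifiers agree that it is suspicious, which makes it the most conservative of the three rules.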

More recently, Abdel-Nasser, Moreno and Puig, (2016) proposed a temporal mammogram registration method based on curvilinear coordinates, which were constructed to cope with both global and local deformations in the breast region. This method was fully automated and could be applied to both CC-CC and MLO-MLO mammographic pairs. In the curvilinear mapping, a coordinate pair (s,t) is assigned to each pixel in Cartesian coordinates (x,y). The construction of the curvilinear coordinates does not demand any information about the structures inside the breast region. To build the curvilinear coordinates, they used the breast boundary and a landmark point placed on it. Hence, the resulting representation of a given mammogram was invariant to differences in the size, position and orientation of the internal structures of the breast. They applied the curvilinear coordinates to manage both global and local deformations inside the breast area and to compensate for the deformations between the mammographic images. By using curvilinear instead of Cartesian coordinates in mammogram registration, they created a model which mimics the anatomy of the breast and does not require control points or a correspondence algorithm. They also integrated the segmentation algorithm within the registration framework. After the registration, the similarity between the two temporal mammograms increased and the distance between manually defined landmarks decreased. A careful comparison with state-of-the-art mammogram registration methods showed that this technique provided the best results and the smallest landmark errors compared to Demons, DRAMMS and Brandt’s method.

A new Fractal Dimension-based diagnosis approach was implemented by Shanmugavadivu, Sivakumar and Sudhir, (2016) for the change detection and time-series analysis of masses in temporal mammograms. Fractal geometry is an effective mathematical technique that handles self-similar and irregular geometrical objects known as fractals. Using the Fractal Hurst bound for enhancement and Fractal Thresholding for segmentation, they tried to detect spatial masses in temporal mammograms. Furthermore, Shanmugavadivu, Sivakumar and Sudhir, (2016) performed a temporal analysis of mass lesions by applying the Fractal dimension. Their results indicate that the Fractal dimension of temporal mammograms can provide valuable information to the decision support expert systems of radiologists.

Finally, Kooi and Karssemeijer, (2017) examined the use of deep convolutional neural networks with the intention of discovering abnormalities in mammograms. They applied a linear mapping that took the area of a mass and mapped it to the prior mammogram. Then, they examined two different architectures. The first one relied on a fusion model, which made use of two data streams where both ROIs were delivered to the network at the time of training and testing. The second one was a stage-wise algorithm, where a network trained on ROIs from the primary mammogram was used as a feature extractor for the primary and prior mammograms. For the classification of features, they used a gradient boosted tree classifier. Their results demonstrated an improvement in performance and were promising for further research.

All of the above studies indicate that temporal analysis can enhance the detection accuracy of mammographic algorithms for breast cancer identification (Timp and Karssemeijer, 2006). However, despite the increasing number of studies in the literature, automatic comparison between temporal mammograms is still a challenging task due to the complexity of temporal mammogram registration (Ma et al., 2015). Most of the studies in the literature focus on registration-based techniques, with advanced methods implemented for the alignment of the two temporal mammograms in order to decrease registration errors. The main drawbacks of the current methods are the limited datasets and the scarcity of studies examining the performance of CAD systems in clinical settings. Table 2.6 presents a general comparison of all the approaches mentioned above for breast cancer detection using temporal analysis.

Table 2.6: Comparison of Temporal Analysis techniques in mammograms

Temporal Analysis methods

Study                                         Results
Desautels, Rangayyan and Mudigonda, (2000)    𝐴𝑍 = 0.76 Bayesian classifier
                                              𝐴𝑍 = 0.73 Fisher analysis
                                              𝐴𝑍 = 0.77 Neural Networks
                                              𝐴𝑍 = 0.77 SVM
Ma, et al. (2010)                             80% true detection rate with 1.02 FP per image at single scheme and 0.96 FP at temporal scheme
                                              90% true detection rate with 1.84 FP per image at single scheme and 1.63 FP at temporal scheme
Diez, et al. (2011)                           BSP best method with Mean=2.73 and Variance=1.42
Diez, et al. (2014)                           SyN best algorithm with RMS=72.89±36.77 mm
                                              NMI=1.21±0.04
Martí, et al. (2014)                          no registration 𝐴𝑍 = 0.76
                                              with RPM 𝐴𝑍 = 0.88
                                              with Affine 𝐴𝑍 = 0.84
Bozek, et al. (2014)                          with all current features 𝐴𝑍 = 0.77
                                              with all current features and all nine temporal features 𝐴𝑍 = 0.86
                                              with all current features and four temporal features 𝐴𝑍 = 0.90
Ma, et al. (2015)                             with linear combination 𝐴𝑍 = 0.8989
                                              by taking minimum value 𝐴𝑍 = 0.8863
                                              with Fisher analysis 𝐴𝑍 = 0.8855
                                              with SVM 𝐴𝑍 = 0.6028
Ma, et al. (2015)                             𝐴𝑍 = 0.852
Ma, et al. (2015)                             Min combination rule best results with 𝐴𝑍 = 0.8532 on the single feature set
                                              MKL best results with 𝐴𝑍 = 0.7987 on the temporal feature set
Abdel-Nasser, Moreno and Puig, (2016)         SSIM=0.903±0.142
                                              MI=1.232±0.108
                                              LE=5.23±2.11mm
Kooi and Karssemeijer (2017)                  𝐴𝑍 = 0.87
                                              1st architecture 𝐴𝑍 = 0.895
                                              2nd architecture 𝐴𝑍 = 0.88
                                              same architecture for temporal analysis 𝐴𝑍 = 0.884 1st
                                              𝐴𝑍 = 0.879 2nd

2.4 Detection and Classification of Micro-calcifications

A significant percentage of cancers are detectable due to the appearance of Micro-Calcification Clusters (MCCs) (Cheng et al., 2003). The morphology of the calcifications is the most crucial parameter for distinguishing benign from malignant tumors. Suspicious or malignant calcifications have either an amorphous or a coarse heterogeneous form. On the other hand, benign calcifications are uniform and smooth. The distribution of calcifications is also important and can be divided into five categories: diffuse, regional, linear, clustered and segmental. Diffuse calcifications are similar calcifications that appear throughout the whole breast. Regional calcifications resemble diffuse ones but are scattered over a large region rather than the whole breast. Linear distribution is typically seen when ductal carcinoma fills the whole duct. Clustered is the distribution in which at least five calcifications are grouped together and, finally, segmental distribution contains calcium deposits in ducts. Usually, the first two categories are associated with benign tumors and the other three with malignant ones (Smithuis, R. and Pijnappel, 2008).

The high correlation between the appearance of MCCs and abnormality demonstrates that Computer Aided Diagnosis systems will be beneficial for the automated detection and classification of MCCs. However, automated analysis of micro-calcifications is a complicated procedure due to a series of problems. First, micro-calcifications are relatively small and their size varies from 0.1 to 1 mm. Second, they can appear in diverse shapes and distributions, which makes template matching impossible. Third, micro-calcifications may have low contrast, so the difference between abnormal regions and normal tissue inside the mammograms can be unnoticeable. Likewise, they can be connected to the surrounding healthy tissue, which defeats segmentation approaches, or, in some cases, the abnormal areas are hidden because the tissue is too dense or the skin is thicker than usual. A large number of studies in the literature introduce algorithms for the detection and subsequent classification of micro-calcifications (Cheng et al., 2003).

Figure 2.9: Types of micro-calcifications’ distribution (Smithuis, R. and Pijnappel, 2008)

Wang and Karayiannis, (1998) presented an approach based on wavelet image decomposition. Micro-calcifications appear as small clusters of pixels with high intensity values relative to their neighbouring pixels. They introduced a detection system that preserved these features by applying an image transformation in which the signal characteristics are localized in both the original and the transform domain. Micro-calcifications correlate with high-frequency components of the image spectrum, and their detection was accomplished by decomposing the mammograms into several frequency sub-bands, suppressing the low-frequency sub-band and reconstructing the mammogram from the sub-bands containing only high frequencies. This technique is motivated by the capability of wavelets to differentiate various frequencies and to retain signal details at diverse resolutions. New studies are needed in order to examine how the performance of this approach changes with different choices of the properties of the wavelet filters.
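The decompose, suppress and reconstruct idea can be sketched in one dimension with the Haar wavelet. This is only an illustration under stated assumptions: the Haar filter, the number of levels and the function names are chosen here for simplicity and are not the filters used by Wang and Karayiannis, (1998), whose method operates on two-dimensional mammograms.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_decompose(signal, levels):
    """Return (approximation, [detail_level_1, ..., detail_level_n]).

    The signal length must be divisible by 2 ** levels.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / SQRT2 for a, b in pairs])  # high-frequency band
        approx = [(a + b) / SQRT2 for a, b in pairs]         # low-frequency band
    return approx, details


def haar_reconstruct(approx, details):
    """Invert haar_decompose, starting from the coarsest level."""
    current = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(current, detail):
            rebuilt.extend([(a + d) / SQRT2, (a - d) / SQRT2])
        current = rebuilt
    return current


def suppress_low_frequencies(signal, levels=2):
    """Zero the low-frequency sub-band and rebuild from the detail bands only."""
    approx, details = haar_decompose(signal, levels)
    return haar_reconstruct([0.0] * len(approx), details)


# One image row: flat background with a single bright spot at index 5.
row = [10, 10, 10, 10, 10, 30, 10, 10]
high_pass = suppress_low_frequencies(row, levels=2)
```

In the reconstructed row the flat background is removed while the bright spot remains the largest value, which is the behaviour the approach relies on for small, bright micro-calcifications.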

Papadopoulos, Fotiadis and Likas, (2005) developed a computer-based, fully automated approach for the identification and characterization of micro-calcification clusters in digital mammograms. Their method was performed in three stages: the cluster identification stage, the feature extraction stage and the classification stage. For the last stage, they used a rule-based system, an ANN and an SVM. At the beginning, they applied a pre-processing step to eliminate the unusable radiological marks and the background of the image. Later, they tried to reveal hidden micro-calcifications with background correction and contrast enhancement. For all the objects and clusters, they estimated various discriminative morphological and textural features to use as input to the false positive reduction system, and they added four new rule-based features. Then, with feature extraction they found the vital features of each cluster, and with their classification algorithms they characterized the abnormal regions as benign or malignant. Even though their method showed satisfying results compared with the existing automated methods in the literature, further studies with bigger datasets are required.

Some years later, Suhail, Sarwar and Murtaza, (2015), based on the observation that calcifications look like small bright spots on a mammogram, built a new scale-specific blob detection approach in which the scale was selected through supervised learning. They introduced a new feature called ‘Ratio Energy’ (RE) for effective blob detection, which was calculated from the energy of a pixel at two different scales. After the maximum RE was acquired, they compared the energy of each pixel with a thresholded maximum to decide whether the pixel corresponded to a calcification. Moreover, they examined some region-based properties of normal mammograms that differed from those of abnormal ones and used them as filtering procedures to bypass additional processing. Their results were reliable and good enough to help radiologists in the early diagnosis of breast cancer.

Later, Boulehmi, Mahersia and Hamrouni, (2016) tried to diagnose and classify breast micro-calcifications on mammograms with a three-step system. The method started with the pre-processing of mammograms and, more specifically, with the use of morphological operators to eliminate the pectoral muscle and all the irrelevant elements in the mammograms. Furthermore, a new mammographic image enhancement technique was introduced, which consisted of the application of the top hat transform followed by wavelet contrast enhancement and galactophorous tree interpolation. The second step consisted of the segmentation of the micro-calcification clusters with the use of Generalized Gaussian Density (GGD) estimation and a Bayesian back-propagation neural network. The last step involved the characterization of the clusters as benign or malignant with a neuro-fuzzy system. Even though Boulehmi, Mahersia and Hamrouni, (2016) achieved acceptable results with this system, not only for the detection of micro-calcifications but also for the diagnosis of breast masses, they should improve the classification accuracy using more cases.

More recently, Ciecholewski, (2016) introduced a cutting-edge approach for the segmentation of micro-calcifications in mammograms using morphological transformations. First, the calcifications were detected morphologically, which allowed the region of their occurrence to be determined, the contrast to be enhanced and the noise to be removed. Then, a segmentation procedure extracted the shape of the micro-calcifications. The method used gradient transformations and fewer interim steps in the extraction of the final shape of the micro-calcifications. It was practical, fully automated and did not need to combine different regions by maximizing average contrast, unlike the other available publications in the literature. For additional research, diverse categories of micro-calcifications must be used, radiologists should assess the segmentation results and newer-generation mammograms are needed.

A series of studies in the literature focus only on the classification of micro-calcifications into two classes, benign and malignant. Khehra and Pharwaha, (2016) examined a Multilayer Feed-Forward Backpropagation Artificial Neural Network (MFFB-ANN) and a Support Vector Machine (SVM) as classifiers. They extracted the needed features from the mammograms and chose the most relevant ones with Particle Swarm Optimization. To compare the two classifiers, they applied confusion matrix and ROC analysis. From their outcome, they noticed that the MFFB-ANN performed as a good classifier but the SVM classifier behaved as an excellent one. In fact, the SVM classifier identified the cases with greater accuracy within experimental errors. In the future, metaheuristic techniques can be implemented to discover the optimal hyperplane with diverse kernel functions in the SVM.

At the same time, Bekker, et al. (2016) proposed a two-stage classification scheme, which imitated the biopsy decision. Their method was based on view-level decisions and a logistic regression classifier, with the two view-level indications combined stochastically into a single benign or malignant decision. In other words, their algorithm automatically learned how to combine the information obtained from the CC and MLO views. In the first part, different classifiers were tested on multiple CC and MLO views to decide whether, based on the particular view, the abnormal region was malignant. The image-level decision was modelled as a hidden random variable that was observed neither during training nor during testing. In the second part, the final biopsy-level decision was obtained by integrating the decisions of both views. In addition, they derived a rotation-invariant feature set based on the Curvelet transformation. Because they targeted only texture features, a segmentation stage was not necessary. Although their approach achieved better performance than various other schemes for combining view-level information, more powerful classifiers, such as ANNs, have yet to be explored.

Up to the present time, most studies in the literature that used the Wavelet Transform to classify masses as benign or malignant were limited and ignored the correlation among wavelet scales. Hu, Yang and Gao, (2017) created an improved algorithm based on the Wavelet transform, which adopted the Hidden Markov Tree model of the Dual-Tree Complex Wavelet Transform (DTCWT-HMT) for micro-calcification diagnosis. This algorithm could find the correlation between various wavelet coefficients and model the statistical dependencies. To classify the abnormalities as malignant or benign, non-Gaussian statistics of real signals were used, and the combined features (DTCWT-HMT and DTCWT) were optimized by a Genetic Algorithm and an Extreme Learning Machine to enhance diagnostic accuracy. They compared their method with state-of-the-art diagnosis methods, and their results demonstrated the high performance of the recommended method in terms of accuracy and stability. Likewise, the results from the two features combined were better than those obtained with either the DTCWT-HMT or the DTCWT alone. Nevertheless, future work is needed to evaluate this method on a bigger dataset that covers a larger variety of micro-calcification types.

To summarise, detection and classification algorithms for micro-calcifications in mammography can improve the accuracy of breast cancer detection and classify tumors as benign or malignant with good results, although they still need improvement due to the lack of datasets and the high FP values. Table 2.7 presents a comparison of all the mentioned approaches for the detection and classification of micro-calcifications.

Table 2.7: Comparison of detection and classification methods of micro-calcifications

Micro-calcifications detection and classification methods

Study                                         Results
Papadopoulos, Fotiadis and Likas, (2005)      Nijmegen set: with SVM 𝐴𝑍 = 0.79 original feature set, 𝐴𝑍 = 0.77 enhanced feature set
                                              Nijmegen set: with ANN 𝐴𝑍 = 0.70 original feature set, 𝐴𝑍 = 0.76 enhanced feature set
                                              MIAS set: with SVM 𝐴𝑍 = 0.81 original feature set, 𝐴𝑍 = 0.80 enhanced feature set
                                              MIAS set: with ANN 𝐴𝑍 = 0.73 original feature set, 𝐴𝑍 = 0.78 enhanced feature set
Suhail, Sarwar and Murtaza, (2015)            Sensitivity = 91%
                                              Specificity = 97%
                                              Precision = 85%
                                              Accuracy = 93%
Boulehmi, Mahersia and Hamrouni, (2016)       80% accuracy
                                              76% sensitivity
                                              81.25% specificity with SVM
Ciecholewski, (2016)                          80.5% similarity index
                                              75.7% similarity fraction
                                              70.8% overlap value
                                              19.8% extra fraction
                                              0.83 average executing time
Khehra and Pharwaha, (2016)                   0.8651 overall accuracy with MFFB-ANN
                                              0.9016 overall accuracy with SVM
Bekker, et al. (2016)                         classification accuracy = 69.5%
                                              Sensitivity = 68.1%
                                              Specificity = 69.7%
                                              𝐴𝑍 = 0.75
Hu, Yang and Gao, (2017)                      Nijmegen set: 𝐴𝑍 = 0.9856
                                              MIAS set: 𝐴𝑍 = 0.9941
                                              DDSM set: 𝐴𝑍 = 0.9168

2.5 Scope

The aim of this work is to develop a new and improved Computer-Aided Diagnosis system that could assist radiologists in detecting and classifying micro-calcifications using temporal digital mammograms. Moreover, our algorithm will have valuable advantages compared to the current algorithms already described in the literature. First, it will be completely automatic, requiring no manual input from the radiologists other than the prior and current mammograms. Second, it will use machine-learning algorithms to eliminate false positives, which are the main drawback of all of the existing algorithms.

3 Methodology of the Proposed Algorithm

3.1 Detection of abnormal ROIs

3.1.1 Computer-Aided Diagnosis System Pipeline

The pipeline of our proposed system is presented in Figure 3.1 and contains the following stages. First, the prior and current mammograms were normalized and then underwent a pre-processing step with two different filters, for the enhancement of abnormal areas and the removal of unnecessary regions. After that, each temporal image pair was matched by applying the Demons registration algorithm to the prior mammogram. Next, the registered mammogram was subtracted from the current one, and the difference image went through a series of post-processing techniques, such as filtering, thresholding, erosion and dilation, to remove the unnecessary regions. Then, we eliminated the remaining areas that corresponded to the periphery and to old micro-calcifications. Finally, for each ROI, basic intensity and pixel-based features were extracted for classification as micro-calcification or normal with the use of machine-learning techniques.

3.1.2 Dataset

In this research project, we used 8 pairs of full-field digital temporal mammograms from The Breast Center of Cyprus and tested 32 pairs of Cranio-Caudal (CC) view and 13 pairs of Mediolateral-Oblique (MLO) view cases, from 43 women who underwent routine screening mammography examinations. A breast radiologist with 9 years of experience selected the mammograms. The women’s ages varied from 58 to 73, with a mean age of 65.25 years and a median age of 66. Of the eight pairs of mammograms that were used in the project, three belonged to healthy subjects who did not present malignant micro-calcifications in either the prior or the current mammograms. In contrast, in the remaining five pairs, malignant micro-calcifications appeared in the next sequential screening examination. For every subject, four standard mammograms were taken: left and right MLO views and left and right CC views. The dimensions of the mammograms were 4096x3328 pixels, in an 8-bit format.

Table 3.1: Distribution of our testing dataset

               Results in the first          Results in the second
               mammography exam (Prior)      mammography exam (Current)
View / Case    Normal                        Normal        Abnormal
CC             5                             3             2
MLO            3                             0             3
Total          8                             3             5

Figure 3.1: Computer-aided diagnosis system pipeline

3.1.3 Normalization

Our algorithm began with the normalization of the current and prior mammograms. In image processing, normalization adjusts the range of pixel intensity values. More specifically, normalization brings the image into a range that is more familiar to the senses and, once the mammograms are normalized, the same filters and algorithms can be applied to them. In our case, we normalized both mammograms by dividing them by 4096, which was the number of rows in the original images. The size of the mammograms remained the same and the visual appearance of the images did not change.

3.1.4 Pre-processing

The pre-processing step was crucial because we wanted to discard the mammogram’s background, including the pectoral muscle in the MLO view, without removing any relevant details of the image that corresponded to micro-calcifications. Therefore, we applied Matlab’s border removal, which suppresses light structures connected to the border (Soille, 1999). On MLO mammograms, the pectoral muscle is found in the top left or right corner, depending on the orientation of the image. Its regions are always brighter than the normal areas of a mammogram, but its intensity difference from the abnormal regions is minimal (Shanmugavadivu and Sivakumar, 2013). With this filter, the brighter areas that were linked to the border were removed without making any changes to other regions in the mammograms (Soille, 1999).

Figure 3.2: Example of temporal pairs of mammograms (a) current (b) prior

The next phase in the pre-processing step was the elimination of the high-intensity background that did not correspond to abnormal areas, in order to detect micro-calcifications more efficiently. Therefore, we used contrast adjustment with Gamma correction, because the mapping between the input and output images is nonlinear (Ma et al., 2014). The Gamma factor takes values from zero to infinity. If it is equal to one, the mapping is linear; if it is less than one, the mapping is weighted toward brighter output values; and if it is greater than one, the mapping is weighted toward darker output values. Thus, to suppress the bright background, we set the Gamma factor to two.
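The effect of the Gamma factor can be seen in a minimal Python sketch. It assumes intensities already normalized to [0, 1] and the plain power-law mapping out = in ** gamma; the function name `gamma_correct` and the nested-list image are illustrative choices, not the Matlab routine used in this work.

```python
def gamma_correct(image, gamma=2.0):
    """Map each intensity v in [0, 1] to v ** gamma (power-law mapping)."""
    return [[pixel ** gamma for pixel in row] for row in image]


# A tiny 2x2 "image" with intensities in [0, 1].
toy = [[0.0, 0.25], [0.5, 1.0]]
darker = gamma_correct(toy, gamma=2.0)   # gamma > 1 pushes mid-tones darker
linear = gamma_correct(toy, gamma=1.0)   # gamma = 1 leaves the image unchanged
```

With a factor of two, mid-range background values such as 0.25 and 0.5 drop to 0.0625 and 0.25, while the brightest pixels stay at 1.0, so bright structures stand out more against the background.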

3.1.5 Registration

After the pre-processing step, the current and prior mammograms were registered. As mentioned in the literature, alignment is important for an effective subtraction between the temporal pair of mammograms, so two well-known algorithms, Affine and Demons, were tested to determine which one had better accuracy based on the residuals (Marias et al., 2005). The residual is a measure of the effectiveness of the subtraction, defined as the sum of the remaining pixels after the subtraction of the current and registered images.
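A residual of this kind can be sketched in a few lines of Python. The sketch assumes that "the sum of the remaining pixels" means the sum of the absolute intensity differences between the two images; that reading, the function name `residual` and the toy images are assumptions made for illustration.

```python
def residual(current, registered):
    """Sum of the absolute pixel differences left after subtraction."""
    return sum(
        abs(c - r)
        for row_c, row_r in zip(current, registered)
        for c, r in zip(row_c, row_r)
    )


current = [[0.0, 0.5], [0.25, 1.0]]
good_alignment = [[0.0, 0.5], [0.25, 0.5]]   # differs only where tissue changed
poor_alignment = [[0.5, 0.0], [1.0, 0.25]]   # same content, shifted by one column

best = min(
    ("good", good_alignment), ("poor", poor_alignment),
    key=lambda item: residual(current, item[1]),
)[0]
```

A lower residual indicates that less of the difference image is caused by misalignment, which is the criterion used to compare the two registration algorithms.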

3.1.5.1 Affine Registration Algorithm

The first registration algorithm that we tried was Affine. Affine is a linear mapping technique that preserves points, straight lines and planes and consists of scaling, rotation and translation. It is a global method, in which all pixels go through the same transformation (Diez et al., 2011). Usually, Affine is used to correct geometric distortions, such as differences in image size, that arise from non-ideal camera angles. In our case, the differences appeared because of the way the mammograms were acquired over the years (Hadjiiski et al., 2001). This registration technique is intensity-based, which means that it maps the pixels of the prior mammogram to the current one based on relative intensity patterns. Our technique was iterative, and the iteration number indicates how many times the registration process takes place. For this reason, we tested Affine registration on 20 dense and 20 fatty mammograms, all normal without malignant micro-calcifications, for five iteration numbers: 100, 200, 300, 400 and 500. We examined the results to determine the appropriate iteration number. In all cases, the prior mammogram was the one that was registered in order to be compared with the current one.

3.1.5.2 Demons Registration Algorithm

Demons was the second registration technique that was tested. Demons is a local method, which transforms the image’s pixels locally, so that each pixel may undergo a different transformation depending on its regional similarity and location. In contrast to global methods, local techniques can handle more complicated deformations. Additionally, this algorithm treats registration as a diffusion process based on an optical flow formulation, and it can include regularization to ensure smoothness and continuity (Diez et al., 2011). With Demons, we estimated the displacement field that aligns the prior image with the reference (current) one. We created a new registered image, which was a warped version of the moving (prior) image, obtained by applying the displacement field with linear interpolation. As before, the prior mammogram was registered and compared with the current one. Likewise, we examined Demons registration for the same cases as with Affine, analyzed the results and compared the two registration methods to find the best technique.

3.1.6 Temporal Subtraction

Once the registration was completed, the next step was the temporal subtraction. We used simple image subtraction to subtract the enhanced and registered mammogram from the current one. Likewise, we set all the pixels that corresponded to the registered (prior) image to zero, because we did not want our algorithm to ‘see’ old suspicious areas that had disappeared over the years. With this technique, we kept only the areas that were not found in the previous examination and could point to possible micro-calcifications (Yin, 1999). To assess the effectiveness of the subtraction, we measured the contrast ratio of the subtracted image and compared it with the contrast ratio of the current one (without processing) for the eight temporal pairs of our dataset. The contrast ratio of an image is defined as the ratio of the maximum pixel value to the mean pixel value of the entire image. The goal was to increase the contrast, in order to help the radiologists with better visualization.
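The subtraction and the contrast ratio can be sketched as follows. The sketch interprets "setting the pixels that corresponded to the registered (prior) image to zero" as clipping negative differences to zero, which is an assumption; the function names and the toy images are also illustrative.

```python
def subtract_clipped(current, registered):
    """Current minus registered prior, with negative differences set to zero."""
    return [
        [max(c - r, 0.0) for c, r in zip(row_c, row_r)]
        for row_c, row_r in zip(current, registered)
    ]


def contrast_ratio(image):
    """Maximum pixel value divided by the mean pixel value of the image."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return max(pixels) / mean if mean > 0 else 0.0  # all-zero image: return 0


# A uniform prior and a current image with one new bright spot.
current = [[0.25, 0.25], [0.25, 1.0]]
registered = [[0.25, 0.25], [0.25, 0.25]]
difference = subtract_clipped(current, registered)
```

In this toy case the contrast ratio rises from about 2.29 for the current image to 4.0 for the difference image, because the unchanged background is removed while the new bright spot is kept.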

3.1.7 Post-processing

3.1.7.1 Filtering

Post-processing was then applied to the difference image acquired in the previous stage. Specifically, an effective filter was needed to enhance the micro-calcifications and separate them from the background without causing distortions in other areas. From the literature, we saw that micro-calcifications, and abnormalities in general, have higher intensity values and appear brighter than other areas in the mammograms. Hence, for this stage we tested a variety of filters to find the best one. First, we used a standard deviation filter, which calculates the standard deviation of a 3x3 neighborhood around each pixel in the input image. Second, we tried contrast-limited adaptive histogram equalization (CLAHE), which enhances the contrast of the input image by changing the intensity values of the pixels (Zuiderveld, 1994). Third, we tried a range filter, a local pixel-based filter in which every output pixel contains the range value, i.e. the difference between the maximum and the minimum value, of the 3-by-3 neighborhood around the corresponding pixel in the difference image (Bailey and Hodgson, 1985). Finally, after the implementation of those filters, we chose the one with the best performance.
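The two neighborhood filters can be sketched in plain Python. The sketch makes simplifying assumptions: windows are truncated at the image border instead of padded, and the population standard deviation is used; the function names are hypothetical and this is not the Matlab implementation used in this work.

```python
import statistics


def neighborhood(image, r, c):
    """Values in the 3x3 window around (r, c), truncated at the image border."""
    rows, cols = len(image), len(image[0])
    return [
        image[i][j]
        for i in range(max(r - 1, 0), min(r + 2, rows))
        for j in range(max(c - 1, 0), min(c + 2, cols))
    ]


def local_filter(image, reducer):
    """Apply `reducer` to the 3x3 neighborhood of every pixel."""
    return [
        [reducer(neighborhood(image, r, c)) for c in range(len(image[0]))]
        for r in range(len(image))
    ]


def range_filter(image):
    """Each output pixel is max - min of its 3x3 neighborhood."""
    return local_filter(image, lambda values: max(values) - min(values))


def std_filter(image):
    """Each output pixel is the (population) standard deviation of its window."""
    return local_filter(image, statistics.pstdev)


# Difference image with one bright pixel on a dark background.
diff = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
ranged = range_filter(diff)
spread = std_filter(diff)
```

Both filters respond only where the neighborhood contains an intensity change, so a single bright pixel produces a 3x3 response while flat background stays at zero.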

3.1.7.2 Thresholding

Afterwards, the grayscale image obtained earlier was converted to binary. The main idea was to separate the high-intensity pixels that could point to micro-calcifications from the low-risk areas that we did not consider essential. With thresholding, we created a binary image by replacing all the pixel values equal to or higher than the threshold value with ones (white) and all the remaining pixels with zeroes (black). The micro-calcifications and all the ROIs appeared as white pixels, while the background was erased, so only the regions of interest remained. The threshold value was set to 0.08 by trial and error. It was set to a relatively small value to meet our goal, which was to remove only the areas that did not belong to micro-calcifications (Ma et al., 2015).
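The thresholding rule can be written as a short Python sketch. The 0.08 level comes from the text above; the function name `threshold` and the toy values are illustrative.

```python
def threshold(image, level=0.08):
    """Binary image: 1 (white) where pixel >= level, 0 (black) elsewhere."""
    return [[1 if pixel >= level else 0 for pixel in row] for row in image]


# Filtered image values on the [0, 1] scale.
filtered = [[0.02, 0.08], [0.50, 0.07]]
binary = threshold(filtered)
```

Pixels exactly at the threshold are kept as white, matching the "equal to or higher than" rule described above.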

3.1.7.3 Morphological Operations

Next, we further processed the binary image with morphological operations. Morphological image processing comprises a group of non-linear operations that are associated with the shape and morphology of features in an image and depend on the relative ordering of pixels rather than on their intensity values. The first operation was erosion, which removed isolated pixels that did not relate to micro-calcifications from the binary image but, at the same time, decreased the size of the ROIs. Dilation followed, which connected the grouped pixels together in order to reveal the clustered micro-calcifications (Efford, 2002).
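The effect of erosion followed by dilation can be seen on a small binary mask. The sketch below assumes a 3x3 square structuring element and treats pixels outside the image as background; the thesis does not state which structuring element was used, so both choices are assumptions for illustration.

```python
def neighborhood(mask, r, c):
    """Values in the 3x3 window around (r, c); pixels outside the image count as 0."""
    rows, cols = len(mask), len(mask[0])
    return [
        mask[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
        for rr in range(r - 1, r + 2)
        for cc in range(c - 1, c + 2)
    ]


def erode(mask):
    """Keep a pixel only if its whole 3x3 window is foreground."""
    return [
        [1 if all(neighborhood(mask, r, c)) else 0 for c in range(len(mask[0]))]
        for r in range(len(mask))
    ]


def dilate(mask):
    """Set a pixel if any pixel in its 3x3 window is foreground."""
    return [
        [1 if any(neighborhood(mask, r, c)) else 0 for c in range(len(mask[0]))]
        for r in range(len(mask))
    ]


# Toy mask: one 3x3 cluster and one isolated pixel in the corner.
binary = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
cleaned = dilate(erode(binary))
```

In this toy case the isolated pixel disappears during erosion, and dilation restores the cluster to its original 3x3 size.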

3.1.8 Removal of the periphery pixels

The next step was the removal of the periphery regions that remained after the processing of the images. The registration procedure aligned the prior and the current mammograms, but in most cases some minor misalignments persisted in the periphery and produced false indications of abnormalities in those areas. For that reason, we excluded all the periphery pixels that our algorithm had incorrectly detected, to minimize the error of the proposed algorithm.

3.1.9 Removal of the old micro-calcifications

After the detection of the suspected ROIs, we removed the micro-calcifications of the previous mammogram that had disappeared by the next screening round (current mammogram) but had not been removed by the temporal subtraction. We did so because only the new micro-calcifications were of interest. The radiologist marked the current and the previous mammograms, and we created an algorithm that compared the two images and removed the areas that already existed in the previous mammogram. For more accurate results, we dilated the marked regions in the registered image, because that image was distorted by the Demons registration and we wanted to map the micro-calcifications correctly to their corresponding locations in the current image.
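One possible reading of this comparison step is a pixel-wise mask operation: dilate the marks of the registered prior image, then keep only the current marks that do not overlap them. The sketch below follows that reading. The function names, the dilation radius and the pixel-wise (rather than region-wise) comparison are assumptions for illustration and may differ from the thesis implementation.

```python
def dilate(mask, radius=1):
    """Grow every foreground pixel into a (2*radius+1)^2 square."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                        out[rr][cc] = 1
    return out


def keep_new_marks(current_marks, prior_marks_registered, radius=1):
    """Keep current marks that do not overlap the dilated prior marks."""
    grown = dilate(prior_marks_registered, radius)
    return [
        [1 if cur and not old else 0 for cur, old in zip(cur_row, old_row)]
        for cur_row, old_row in zip(current_marks, grown)
    ]


# Toy marks: the prior mark at (1, 2) is slightly displaced from the
# current mark at (1, 1); the current mark at (3, 3) is new.
current = [[0] * 5 for _ in range(5)]
current[1][1] = 1
current[3][3] = 1
prior = [[0] * 5 for _ in range(5)]
prior[1][2] = 1
new_marks = keep_new_marks(current, prior)
```

The dilation is what lets the slightly displaced prior mark still cancel its counterpart in the current image, which is the motivation given in the text.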

With the new ground truth image obtained after the removal of the periphery pixels, we characterized the ROIs detected by the proposed algorithm as micro-calcifications or normal. In this way, we constructed the ground truth images for the next steps.

3.1.10 Evaluation of the proposed algorithm

Once the proposed algorithm had found the possible regions of interest, we evaluated the results to examine the efficiency of our method. We determined the true positives, false positives, false negatives and the F1-score. An image region was identified as a micro-calcification (positive) or normal (negative), and the decision for a detection result could be either correct (true) or incorrect (false). Hence, each detection result fell into one of four possible categories: true positive (TP), false positive (FP), true negative (TN) or false negative (FN). FN and FP were the wrong assessments of the algorithm: a false negative indicated that a true micro-calcification was not detected, and a false positive that a normal area was characterized as abnormal. Similarly, a true positive decision correctly identified a micro-calcification and a true negative decision correctly identified a normal ROI. Because the TN could not be counted at this stage (every remaining region that was not detected was normal), we measured the F1-score. The F1-score is related to accuracy and is used when the TN values are not available. Its formula is given below:

$$F_1\text{-score} = \frac{2\,TP}{2\,TP + FP + FN}$$
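The following short sketch evaluates this formula and checks that it equals the harmonic mean of precision and recall, which is the usual interpretation of the F1-score. The counts are illustrative values chosen for the example and are not results from the thesis.

```python
def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); returns 0.0 when nothing was detected or missed."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Illustrative counts (not taken from the thesis).
tp, fp, fn = 30, 20, 10

f1 = f1_score(tp, fp, fn)

# Equivalent form: harmonic mean of precision and recall.
precision = tp / (tp + fp)
recall = tp / (tp + fn)
harmonic_mean = 2 * precision * recall / (precision + recall)
```

Neither form needs the number of true negatives, which is why the measure suits the detection stage.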

3.2 Elimination of False Positives

The following steps covered the elimination of regions that our algorithm had falsely detected as micro-calcifications. False positive regions alert the radiologists without reason and lead to wasted time and false alarms. Removing those areas is therefore an important and necessary part of CAD algorithms.

3.2.1 Feature Extraction and Selection

To date, CAD systems have improved the radiologist’s performance by up to 15%, but the main weakness of those systems, besides the relatively small datasets, has been the high number of false positives. In the literature, diverse approaches have been proposed for the reduction of false positives, with a large number of studies focusing on machine-learning applications. In detail, basic features (textural, intensity-based, geometric, shape, etc.) were extracted for the detected ROIs and fed to a classifier that categorized the ROIs as micro-calcifications (true positives) or false positives. The selected features play a key role in the precise and accurate classification of the ROIs, and combining different feature types does not always guarantee better classification results. For that reason, a feature selection approach was needed to find the appropriate features for the classification step (Nguyen et al., 2015). The ROIs were characterized as TP or FP based on the previous step.

3.2.1.1 Feature Extraction

In our proposed algorithm, thirteen image features were used to characterize the intensity, texture and geometry of the regions of interest. This set of features was preferred for its capability to categorize the regions as micro-calcifications or not, based on the characteristics of micro-calcifications. The features were computed on every previously detected ROI, after the areas had been characterized as TP or FP based on the ground truth. We extracted the features from both the subtracted image and the current image after the pre-processing step, in order to find which of the two gave the best results. Each ROI was associated with a 13-dimensional feature vector, one dimension for every feature calculated (Ma et al., 2015). The features were extracted only within the ROIs and not within the bounding box, which is the smallest rectangle that includes the ROI (Nguyen et al., 2015).

First Order Statistics (FOS) Features

Seven FOS features were computed on every detected ROI: the mean gray level, the maximum gray level, the standard deviation, the variance, the entropy, the skewness and the kurtosis. They were determined by the standard mathematical equations.
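The seven first-order statistics can be sketched for a flat list of ROI pixel values as follows. Several details are assumptions made for this example, since the text only refers to "the standard mathematical equations": the sample (N-1) normalization of variance and standard deviation, the non-excess definition of kurtosis, and a 256-bin histogram entropy on intensities assumed to lie in [0, 1].

```python
import math
from collections import Counter


def fos_features(pixels, bins=256):
    """First-order statistics of a flat list of ROI intensities in [0, 1]."""
    n = len(pixels)
    mean = sum(pixels) / n
    centered = [p - mean for p in pixels]
    m2 = sum(d ** 2 for d in centered) / n  # second central moment
    m3 = sum(d ** 3 for d in centered) / n  # third central moment
    m4 = sum(d ** 4 for d in centered) / n  # fourth central moment
    variance = m2 * n / (n - 1) if n > 1 else 0.0  # sample variance (assumption)
    # Shannon entropy of the intensity histogram, in bits.
    counts = Counter(min(int(p * bins), bins - 1) for p in pixels)
    entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
    return {
        "mean": mean,
        "max": max(pixels),
        "std": math.sqrt(variance),
        "variance": variance,
        "entropy": entropy,
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,  # equals 3 for a normal distribution
    }


# Toy ROI: half of the pixels dark, half bright.
features = fos_features([0.0, 0.0, 1.0, 1.0])
```

For this symmetric two-level toy ROI the skewness is zero, the kurtosis is one and the entropy is one bit.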

Shape Features

Six shape measurements were extracted for the regions of interest previously found by the proposed methodology: area, eccentricity, convex area, filled area, solidity and extent. Area is the actual number of pixels in the ROI. Eccentricity is the ratio of the distance between the foci of the ellipse that has the same second moments as the region to its major axis length. The eccentricity value lies between zero and one; an ellipse with eccentricity equal to zero is a circle, while an ellipse with eccentricity equal to one is a line segment. Convex area is the number of pixels in the convex image, which is the image of the convex hull of the region. Likewise, filled area is the number of pixels in the filled image, which is a binary image of the same size as the bounding box of the region, with all the holes filled in. Solidity is the proportion of the pixels in the convex hull that also belong to the region. Finally, extent is the ratio of the pixels in the region to the pixels in the total bounding box (MATLAB-Image Processing Toolbox). Those features were computed with Matlab’s regionprops.
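To make three of these definitions concrete, the sketch below computes area, extent and a moment-based eccentricity for a single region given as a binary mask. It is an independent pure-Python illustration, not the regionprops implementation used in the thesis. The 1/12 term (the second moment of a unit-square pixel) is an assumption about how pixels are modelled, and convex area, filled area and solidity are omitted for brevity.

```python
import math


def shape_features(mask):
    """Area, extent and eccentricity of the foreground pixels of a binary mask."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    area = len(coords)
    rs = [r for r, _ in coords]
    cs = [c for _, c in coords]

    # Extent: region pixels divided by bounding-box pixels.
    height = max(rs) - min(rs) + 1
    width = max(cs) - min(cs) + 1
    extent = area / (height * width)

    # Normalized second central moments of the region.
    mean_r = sum(rs) / area
    mean_c = sum(cs) / area
    u_rr = sum((r - mean_r) ** 2 for r in rs) / area + 1 / 12
    u_cc = sum((c - mean_c) ** 2 for c in cs) / area + 1 / 12
    u_rc = sum((r - mean_r) * (c - mean_c) for r, c in coords) / area

    # Squared semi-axes (up to a common factor) of the equivalent ellipse.
    common = math.sqrt((u_rr - u_cc) ** 2 + 4 * u_rc ** 2)
    major_sq = u_rr + u_cc + common
    minor_sq = max(u_rr + u_cc - common, 0.0)
    eccentricity = math.sqrt(1 - minor_sq / major_sq)

    return {"area": area, "extent": extent, "eccentricity": eccentricity}


square = shape_features([[1, 1, 1], [1, 1, 1], [1, 1, 1]])
line = shape_features([[1, 1, 1, 1, 1]])
corner = shape_features([[1, 0], [1, 1]])
```

The square gives an eccentricity of zero, the one-pixel-wide line gives a value close to one, and the L-shaped corner fills three of the four pixels of its bounding box.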

Table 3.2: Features extracted from both subtracted and current image

3.2.1.2 Feature Selection

Feature selection was a necessary and valuable step before the implementation of the machine-learning algorithms. Removing insignificant and unnecessary features can increase the performance of the classifier. Of the thirteen features previously computed from the subtracted and the current images, only the best were passed to the classifier for the classification step. The feature selection process took place for both images, in order to choose the best way to eliminate FP (Nguyen et al., 2015). To that end, a hypothesis test and a multivariate analysis of variance were required to find the features that contributed the most to separating TP and FP areas.

First, we conducted a paired t-test for all the features that were extracted from the two images. The t-test is used to compare two population means, where the samples in the first population can be paired with the corresponding samples in the second population. This method returns a test decision for the null hypothesis that the data in the two populations come from independent random samples from normal distributions with equal means and equal but unknown variances. The alternative is that the data come from populations with unequal means, so that they are statistically different and can be separated. The variable h is equal to 1 when the test rejects the null hypothesis at the 5% significance level and 0 otherwise. The p-value is compared with the significance level: if it is less than 0.05 the null hypothesis is rejected and the distributions can be set apart. Conversely, if the p-value is larger than 0.05, the null hypothesis cannot be rejected and the two groups cannot be distinguished (Hsu and Lachenbruch, 2008).
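For reference, the null hypothesis as worded above (independent samples, equal means, equal but unknown variances) corresponds to the standard pooled two-sample statistic shown below. The symbols are notation introduced here for illustration: $\bar{x}_1, \bar{x}_2$ are the sample means of one feature in the two groups, $s_1^2, s_2^2$ the sample variances and $n_1, n_2$ the group sizes. A paired test would instead be computed on the per-pair differences, so this block should be read as a sketch of the hypothesis described, not as a statement of the exact computation performed in the thesis.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}
```

Under the null hypothesis, $t$ follows a Student distribution with $n_1 + n_2 - 2$ degrees of freedom, and the p-value is the two-sided tail probability of the observed $|t|$.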

For the purpose of this study, the false positives were eliminated in order to improve the accuracy of the proposed algorithm. The two populations for the t-test analysis were the features extracted from the TP regions and those extracted from the FP regions. In this way, we identified the features that are statistically different and can distinguish the two populations, TP and FP, and we compared the features extracted from the subtracted image with those extracted from the current image.

Features

| FOS features | Shape features |
|---|---|
| Mean Intensity | Area |
| Max Intensity | Eccentricity |
| Skewness | Convex Area |
| Kurtosis | Filled Area |
| STD | Solidity |
| Variance | Extent |
| Entropy | |

After identifying the features that contribute the most to the classification of ROIs as TP or FP, it was essential to examine whether combining them would result in a better p-value than the single features. Thus, multivariate analysis of variance (MANOVA) was required to select the best feature subset among the available features. The MANOVA test was used to examine this hypothesis and, rather than a single p-value for each feature, a multivariate p-value was obtained, based on the comparison of the error variance and covariance matrices. The covariance matrix was needed because the features are related to each other and their correlation should be taken into account. Variables that maximize the differences between the groups were constructed to analyze the dependent features; those variables were linear combinations of the calculated dependent variables (French et al., n.d.). MANOVA returns the variable d, which is an estimate of the dimension of the space containing the group means. When d equals 0 the hypothesis is not rejected at the 5% significance level, whereas when d equals 1 the hypothesis is rejected. As with the t-test, the p-value is compared with the significance level. The MANOVA test makes three assumptions about the input data: the populations for all the groups are normally distributed, the variance-covariance matrix is identical for every population, and all the observations are independent (Krzanowski, 2008).

In this case, we constructed an algorithm that automatically selected, among the features already retained by the paired t-test, the combination that achieved the lowest p-value, in order to find the appropriate features for the classification part. The remaining features were included as well, because they sometimes help the classification despite their high p-value.

3.2.2 Classification

For the classification of the ROIs, we applied Discriminant Analysis (DA) with Matlab’s ‘classify’. With discriminant analysis, we found the function that best separated the two classes of our problem (TP and FP). More generally, the purpose of discriminant analysis is to find a combination of features that characterizes or separates two or more classes, objects or events. This combination can be used as a linear or non-linear classifier for new data. A region is assigned to class one or class two based on the classification boundary. If the region is assigned to the wrong class, an error occurs (Mika et al., 1999). Discriminant Analysis involves the determination of a linear equation, similar to regression, that predicts to which class each ROI belongs:

$$D = v_1 X_1 + v_2 X_2 + v_3 X_3 + \cdots + v_i X_i + a$$

In this equation, D is the discriminant function, v the discriminant coefficient, X the score of the corresponding predictor variable, a a constant and i the number of predictor variables. This function must maximize the distance between the classes, so that the discriminant function separates those classes and any new regions. The assumptions of DA are:

the observations are random;

every predictor variable is normally distributed;

every allocation to the dependent classes in the initial classification is made accurately;

there must be at least two classes, and each region can belong to only one class, because the classes are mutually exclusive;

every class should be precisely defined and its differences from the other classes must be clear (Krzanowski, 2008; Seber, 1984).
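The linear discriminant function above can be illustrated with a two-class, two-feature sketch. It uses the classical linear rule with a pooled covariance matrix and equal prior probabilities, which is an assumption; it is not the Matlab ‘classify’ call used in the thesis. The feature values (nominally max intensity and entropy) are invented toy numbers.

```python
def mean_vector(rows):
    """Column-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def fit_lda_2d(class_a, class_b):
    """Return (v, a) so that D(x) = v[0]*x[0] + v[1]*x[1] + a is positive for class A."""
    ma, mb = mean_vector(class_a), mean_vector(class_b)
    # Pooled within-class 2x2 covariance matrix.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((class_a, ma), (class_b, mb)):
        for x in rows:
            d = [x[0] - m[0], x[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(class_a) + len(class_b) - 2
    s = [[value / dof for value in row] for row in s]
    # Coefficients v = S^-1 (ma - mb), using the closed-form 2x2 inverse.
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    v = [
        (s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
        (-s[1][0] * diff[0] + s[0][0] * diff[1]) / det,
    ]
    # Constant a places the boundary midway between the class means (equal priors).
    a = -(v[0] * (ma[0] + mb[0]) / 2 + v[1] * (ma[1] + mb[1]) / 2)
    return v, a


def discriminant(x, v, a):
    return v[0] * x[0] + v[1] * x[1] + a


# Invented toy features per ROI: (max intensity, entropy).
tp_rois = [(0.30, 5.0), (0.28, 4.5), (0.35, 5.2), (0.31, 4.8)]
fp_rois = [(0.08, 1.5), (0.10, 2.0), (0.06, 1.0), (0.12, 2.4)]

v, a = fit_lda_2d(tp_rois, fp_rois)
new_roi_is_tp = discriminant((0.33, 5.0), v, a) > 0
```

A positive value of D assigns the region to the first class (here TP) and a negative value to the second (FP); with more than two features the same rule needs a general matrix inverse instead of the 2x2 closed form.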

In our case, we created a training set with the features that were extracted earlier to train our classifier, and a test set to evaluate it. To train the proposed classifier, the 8 mammograms were used in a leave-one-patient-out procedure. During each round, one mammographic image was used as the test sample and the remaining images as the training set, until all the cases in our dataset had been classified. It is important to state that, to avoid any bias, the test mammogram was excluded from the training stage during the leave-one-patient-out procedure. This achieved complete separation of the test sample from the training sets.
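The leave-one-patient-out split can be sketched as a generator that holds out all ROIs of one patient per round. The tuple layout `(patient_id, features, label)` and the toy samples are hypothetical and serve only to show that no patient appears in both sets.

```python
def leave_one_patient_out(samples):
    """Yield (held_out_id, train, test) with all ROIs of one patient held out per round.

    Each sample is a tuple (patient_id, features, label).
    """
    patient_ids = sorted({patient_id for patient_id, _, _ in samples})
    for held_out in patient_ids:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test


# Hypothetical ROIs from three patients.
samples = [
    (1, (0.30, 5.0), "TP"),
    (1, (0.08, 1.5), "FP"),
    (2, (0.28, 4.5), "TP"),
    (3, (0.10, 2.0), "FP"),
    (3, (0.35, 5.2), "TP"),
]
folds = list(leave_one_patient_out(samples))
```

Splitting by patient rather than by ROI is what keeps regions from the test mammogram out of the training set.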

3.2.3 Evaluation of the classification

As before, we calculated the TP, FP and FN for the evaluation of the classification step; this time we also included the TN, since we classified only the previously detected regions. With those indices, we created a confusion matrix (Table 3.3) for the total values of all eight cases.

Table 3.3: Confusion matrix

| Predicted Class | True Class: Positive | True Class: Negative |
|---|---|---|
| Positive | True Positive | False Positive |
| Negative | False Negative | True Negative |

Next, the performance of the diagnostic system was measured with the accuracy, which is the percentage of diagnostic decisions that were correct (Cheng et al., 2003). We also computed the sensitivity, which is the true positive rate, and the specificity, which is the true negative rate. The three formulas are shown below. We compared the results with the earlier detection results to see whether the false positives had been successfully eliminated and the algorithm improved.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{Sensitivity (True Positive Rate)} = \frac{TP}{TP + FN}$$

$$\text{Specificity (True Negative Rate)} = \frac{TN}{TN + FP}$$
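These three measures follow directly from the four confusion-matrix counts, as the short sketch below shows. The counts are illustrative values chosen for the example and are not results from the thesis.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Illustrative counts (not taken from the thesis).
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

Unlike the F1-score used at the detection stage, accuracy and specificity require the true negative count, which becomes available once only the detected regions are classified.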

Figure 3.3: Detailed representation of the proposed algorithm

4 Results

4.1 Detection of abnormal ROIs

4.1.1 Pre-processing

The pre-processing step consisted of the border removal filter, which suppresses high-intensity structures connected to the image border. All pairs of mammograms (normal and abnormal cases) from our dataset were tested, and the results are shown in Figure 4.1 for two normal-to-abnormal cases. The outcome shows that this step was effective: it discarded a large number of unnecessary pixels from the mammograms without removing any important regions. In addition, in the MLO views this filter eliminated the pectoral muscle, which was redundant information for this study.

Figure 4.1: Example of the clear border removal in two cases (a) normalized mammograms with red circle in the malignancy (b) border removal

Next, Gamma correction was applied and further cleared the irrelevant pixels from the processed image. As before, the Gamma filter was applied to all the mammograms (current and prior) from our dataset, and Figure 4.2 shows the results for the same two cases. The resulting images show that the remaining background was eliminated without removing any regions of interest. Border removal combined with Gamma correction formed an efficient pre-processing step that removed the unnecessary information from the mammograms.

Figure 4.2: Example of the Gamma correction in two cases (a) border removal (b) Gamma correction

4.1.2 Registration

When the pre-processing step was finished, registration took place. We evaluated the registration performance of two different registration methods, Affine and Demons, on a subset of 40 dense and fatty mammographic temporal pairs from our dataset. We determined the most suitable settings for the two algorithms and the best method for the proposed algorithm. Then, we computed the difference between the current and the registered images. As the evaluation measure, the residual was determined on the subtracted image.

4.1.2.1 Affine Registration Algorithm

We ran 40 temporal pairs, 20 dense and 20 fatty, through Affine registration for five iteration numbers: 100, 200, 300, 400 and 500. As an example, Figure 4.3 shows, for the same two cases as above and for 200 iterations, the current image, the prior image and the subtracted image between the current and the newly registered prior image. It is noticeable that the prior image was altered so as to resemble the current one.

The next step was to decide on the most suitable iteration number. For that, two box plots were made, one for dense and one for fatty mammograms, showing the percentage residuals of the subtracted images against the iteration number. Figure 4.4 displays the percentage residuals for the dense mammograms and Figure 4.5 those for the fatty mammograms. Even though fatty and dense mammograms differ in tissue composition and dense mammograms appear more complicated, the two box plots show that both needed the same iteration number for an effective registration. The iteration number that minimized the residual for both categories was 200, and we chose this number for our algorithm. However, the mean residuals of the fatty mammograms over all iteration numbers (approximately 33%) were slightly lower than the corresponding ones of the dense mammograms (approximately 38%).

Figure 4.3: Affine registration in two cases (a) current mammogram (b) prior mammogram (c) subtracted image

Figure 4.4: Box plot for dense mammograms

Figure 4.5: Box plot for fatty mammograms

4.1.2.2 Demons Registration Algorithm

After completing the Affine registration, we evaluated Demons registration on the same 20 dense and 20 fatty temporal pairs that we had used with Affine. Demons calculated the displacement field that aligns the prior mammogram with the current one. Figures 4.6 and 4.7 show, for the same two examples as above, an image pair with the current (fixed) and prior (moving) mammograms and how the latter has to change (yellow arrows) according to the displacement field found by Demons.

Figure 4.6: Displacement field for Demons registration in the 1st example

Figure 4.7: Displacement field for Demons registration in the 2nd example

Figure 4.8 shows the current mammogram, the prior one and the subtracted image between the current and the newly registered prior image. With Demons, the prior mammogram was deformed and adjusted to the current mammogram. Next, box plots of the percentage residuals of the subtracted images were created for the dense and for the fatty mammograms (Figure 4.9). As with Affine, the mean residual of the fatty mammograms (approximately 24%) was 3 percentage points lower than the corresponding mean of the dense mammograms (approximately 27%); as already explained, this was because dense mammograms were harder to deal with.

Figure 4.8: Demons registration in two cases (a) current mammogram (b) prior mammogram (c) subtracted image

4.1.2.3 Comparison

Finally, we compared the two registration techniques based on our previous results and chose the best method for our algorithm. Two box plots were created with the percentage residuals of the 20 temporal cases for both Affine and Demons. Figure 4.10 shows the box plot for the dense mammograms and Figure 4.11 the box plot for the fatty mammograms.

Figure 4.9: Box plot for dense and fatty mammograms

Figure 4.10: Comparison of dense mammograms

For dense mammograms, the mean residual with Affine was 37%, approximately 10 percentage points higher than the residual with Demons. Likewise, for fatty mammograms the mean residual with Demons was 24%, 11 percentage points lower than with Affine. Again, the residuals for dense mammograms were larger than the corresponding ones for fatty mammograms. Consequently, the Demons registration technique was selected for the algorithm because of its better results compared with Affine.

4.1.3 Temporal Subtraction

After the registration, we subtracted the temporal pairs and calculated the contrast ratio of the subtracted image compared with the current one. Figure 4.12 shows a box plot of the contrast ratio of the two images; we created one plot for both dense and fatty mammograms. The contrast ratio of the subtracted image is considerably greater than that of the current image. This indicates that the image was cleared of unnecessary regions and irrelevant details that often distract the radiologist. We removed all the areas that appeared in the previous screening round, including the old micro-calcifications, and in this way we already help the radiologists to better visualize and assess the mammographic images.

Figure 4.11: Comparison of fatty mammograms

4.1.4 Post-processing

4.1.4.1 Filtering

Next, several filters were tested for the enhancement of the subtracted image from the previous stage, namely the standard deviation filter, CLAHE and the range filter. We used the same examples; Figure 4.13 shows the ‘raw’ subtracted image and the image after the standard deviation filter. The filtered image showed that this filter was unsuitable, because it smoothed the background and all the bright areas, so that the brighter micro-calcifications could not be found. The second technique that we tried was CLAHE, which enhanced the contrast by changing the intensity values of the pixels. Figure 4.14 presents the subtracted and the filtered image. This filter enhanced the high-intensity background instead of removing it and was not appropriate for our goal. Finally, the range filter was used, which returned the range value of the neighborhood of each pixel in the input image. Figure 4.15 shows the result of the range filter. With this filter, the high-intensity background was erased without any changes to the other bright areas. For our algorithm, we therefore selected the range filter for the post-processing step.

Figure 4.12: Box plot of the contrast ratio

Figure 4.13: STD filter (a) subtracted image (b) filtered image

Figure 4.14: CLAHE (a) subtracted image (b) filtered image

4.1.4.2 Thresholding

Afterwards, the image was converted to binary; Figure 4.16 shows the filtered and the binary images. With the threshold value set to 0.08, we removed the regions that did not belong to abnormalities and kept only the ROIs.

Figure 4.15: Range filter (a) subtracted image (b) filtered image

Figure 4.16: Thresholding (a) filtered image (b) thresholded

4.1.4.3 Morphological Operations

The final step of the post-processing was the morphological operations. Erosion followed by dilation was applied to discard the isolated pixels and to group pixels that were close to each other. The resulting images are shown in Figure 4.17 for the two known examples.

Now that the post-processing step was finished, Figure 4.18 shows the current mammogram of the first example, with red circles around all the regions detected by the proposed algorithm, and Figure 4.19 shows the same for the second example.

Figure 4.17: Morphological operations (a) thresholded image (b) new image

Figure 4.18: Resulted image for example 1

4.1.5 Removal of the Periphery Pixels

For further processing, we removed the periphery pixels that corresponded to misalignments and had remained after the above procedure, to improve the accuracy of the algorithm. Figure 4.20 shows the two new images for the above examples.

Figure 4.19: Resulted image for example 2

Figure 4.20: Images with removed periphery pixels

4.1.6 Removal of the old micro-calcifications

For the removal of the old micro-calcifications, we took the registered and the current images marked by the radiologist and compared them. Figure 4.21 displays the ground truth images (current and registered prior) for the first example, which contain the micro-calcifications (white pixels), and Figure 4.22 shows the resulting image after applying the algorithm that erased the old regions that appeared in both images. This step was crucial, since the algorithm rejected all the micro-calcifications that had been erased in the current mammogram, so that the remaining areas were the new micro-calcifications.

Figure 4.21: Ground truth images

Figure 4.22: New binary image

Next, we created the ground truth images for the following steps. Figure 4.23 shows the true micro-calcifications, marked with red circles, in the two previous examples. Figure 4.24 shows, with green circles, the normal regions that our algorithm incorrectly detected as abnormal in the same cases.

Figure 4.23: True micro-calcifications

Figure 4.24: False positive areas

4.1.7 Evaluation of the proposed algorithm

For the evaluation of our algorithm, we used the ground truth images from the previous step to classify the previously detected regions as true micro-calcifications or normal. Once we had categorized all the regions as true or false, we created a table with all the evaluation measurements for all the cases.

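Table 4.1: Evaluation of the algorithm

| Patient | Detected ROIs | TP | FP | FN (not detected ROIs) | F1-score |
|---|---|---|---|---|---|
| 1 | 46 | 14 | 32 | 0 | 0.466 |
| 2 | 73 | 8 | 65 | 0 | 0.197 |
| 3 | 58 | 6 | 52 | 0 | 0.188 |
| 4 | 18 | 6 | 12 | 0 | 0.5 |
| 5 | 27 | 2 | 25 | 0 | 0.137 |
| 6 | 30 | 2 | 28 | 0 | 0.125 |
| 7 | 82 | 54 | 28 | 36 | 0.627 |
| 8 | 42 | 4 | 38 | 1 | 0.170 |
| Total | 376 | 96 | 280 | 37 | 0.301 |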
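The F1-scores in Table 4.1 can be recomputed from the TP, FP and FN columns, as in the sketch below. It also contrasts the mean of the per-patient scores with the score obtained by pooling the counts of all patients. The pooled value is an additional calculation made here for comparison and is not reported in the thesis.

```python
def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


# (TP, FP, FN) per patient, copied from Table 4.1.
counts = [
    (14, 32, 0),
    (8, 65, 0),
    (6, 52, 0),
    (6, 12, 0),
    (2, 25, 0),
    (2, 28, 0),
    (54, 28, 36),
    (4, 38, 1),
]

per_patient_f1 = [f1_score(tp, fp, fn) for tp, fp, fn in counts]

# Mean of the per-patient scores: about 0.30, matching the "Total" row.
mean_f1 = sum(per_patient_f1) / len(per_patient_f1)

# F1 on the pooled counts (TP=96, FP=280, FN=37): about 0.38.
total_tp = sum(tp for tp, _, _ in counts)
total_fp = sum(fp for _, fp, _ in counts)
total_fn = sum(fn for _, _, fn in counts)
pooled_f1 = f1_score(total_tp, total_fp, total_fn)
```

The value in the "Total" row is therefore the average of the eight per-patient F1-scores rather than the F1-score of the summed counts.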

From the table above, we see that the proposed algorithm correctly identified 96 of the 133 micro-calcifications. It is worth noting, though, that none of the 37 false negatives that the algorithm failed to detect were malignant. This technique also produced a large number of false positives: of the 376 ROIs that our algorithm flagged as important, 280 were false positives. False positives can mislead the radiologist and, instead of helping, the system would confuse the radiologists, so we needed to eliminate the false positives by upgrading our methodology. Moreover, the average F1-score was only 0.301, which showed that we had to increase the accuracy of the algorithm by removing the irrelevant regions. Nevertheless, the proposed methodology is valuable despite those limitations and can help the radiologists with the detection of micro-calcifications.

4.2 Elimination of False Positives

The evaluation of the algorithm revealed a considerable number of falsely detected regions. With the following methodology we removed those regions to increase the performance of the algorithm.

4.2.1 Feature Extraction and Selection

4.2.1.1 Feature Extraction

From the final images produced earlier by the proposed methodology, 13 FOS and shape features were extracted from each region for the automatic classification of false positive and true positive areas. We evaluated the features from both the subtracted and the current images, and the results are presented in the two tables below. Table 4.2 lists the value of each feature extracted from the current mammogram for 10 randomly selected ROIs, including both TP and FP. Similarly, Table 4.3 lists the value of each feature extracted from the subtracted image for the same ROIs. The characterization as TP or FP was based on the previous procedure. The two tables show that the shape features were identical for the current and the subtracted images, because those features are calculated from the shape of the region and are therefore independent of the pixel intensities. Conversely, the FOS features, which are intensity and pixel based, differed between the current and the subtracted image.

4.2.1.2 Feature Selection

With feature selection, the best features were found for the next step. The 13 features were compared with the paired t-test, and Tables 4.4 and 4.5 show the resulting h and p values for the current and the subtracted images, respectively. The best features are highlighted in red in both tables. One should notice the differences between the features from the two images. In the current image, the features that rejected the hypothesis that TP and FP are statistically equal were the max intensity, standard deviation, entropy, area, eccentricity, convex area, filled area and extent. In the subtracted image, the features with h equal to 1 were the max intensity, variance, entropy, skewness, kurtosis, area, eccentricity, convex area, filled area and extent. For the features extracted from the subtracted image, the lowest p-value belonged to entropy, whereas for the current image the max intensity gave the minimum p-value. Equally important, the p-values of the best features of the subtracted image were slightly smaller than those of the current image.

Next, the best features found by the t-test were passed to the multivariate analysis of variance to find out whether combining them helped. We decided to include all of the features in MANOVA, because occasionally some features improve the p-value even if their response to the t-test was not satisfactory. Table 4.6 contains the results for the current image and Table 4.7 those for the subtracted one. In the two tables we see four different combinations of features. The highlighted step was the most efficient one and gave the lowest p-value. For the current image, the best features were the mean intensity, max intensity, variance, entropy, skewness, kurtosis, area, eccentricity, solidity and extent. In the subtracted image, the most efficient combination of features contained the max intensity, standard deviation, variance, entropy, skewness, kurtosis, eccentricity, filled area and extent. In general, the FOS features contributed more to the separation of the two populations than the shape features, because they are related to the intensity value of each pixel and not to the shape of the region. It was clear that the current image carried more valuable information regarding the features, since its p-value was smaller. Additionally, the combination of features improved the p-value compared with the use of single features. The features with the smallest p-value, namely the mean intensity, max intensity, variance, entropy, skewness, kurtosis, area, eccentricity, solidity and extent extracted from the current image, were selected for the classification step.

Table 4.2: Features extracted from current image

ROI  Class  Mean Intensity  Max Intensity  STD      Variance    Entropy  Skewness  Kurtosis  Area  Eccentricity  Convex Area  Filled Area  Solidity  Extent
1    TP     7.08E-05        0.004          361.780  130884.940  0        -0.144    2.256     1732  0.176         3481         3479         0.498     0.423
2    TP     2.72E-05        0.004          137.543  18918.158   0        0.310     1.897     2300  0.209         5673         5650         0.405     0.342
3    TP     1.03E-01        0.312          559.091  312583.277  0        -0.708    2.461     2602  0.287         6970         6922         0.373     0.304
4    TP     9.89E-02        0.266          783.983  614629.225  0        -0.768    2.029     1774  0.229         3619         3617         0.490     0.414
5    TP     2.15E-04        0.043          250.541  62770.737   0        1.989     9.101     1864  0.403         3940         3936         0.473     0.404
6    FP     1.29E-03        0.069          369.577  136586.851  0        2.069     6.954     1950  0.313         4228         4214         0.461     0.381
7    FP     6.61E-03        0.114          740.430  548236.096  0        1.221     3.310     2642  0.502         7128         6991         0.371     0.318
8    FP     7.62E-03        0.087          571.897  327066.722  0        -0.462    2.156     1586  0.164         2996         2996         0.529     0.448
9    FP     3.88E-02        0.148          880.791  775792.459  0        -0.552    1.554     1586  0.164         2996         2996         0.529     0.448
10   FP     2.86E-03        0.064          345.451  119336.504  0        0.254     2.502     1982  0.461         4313         4288         0.460     0.377
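The shape columns of Table 4.2 are related to one another. As a minimal sketch, assuming that solidity is defined as the ratio of the region area to its convex area (the usual region-property definition; this section does not restate it), the tabulated solidity can be recomputed from the Area and Convex Area columns. The snippet is written in Python for illustration only, and the names `ROWS` and `solidity` are hypothetical.

```python
# (ROI, area, convex_area, tabulated_solidity) copied from Table 4.2.
ROWS = [
    (1, 1732, 3481, 0.498),
    (2, 2300, 5673, 0.405),
    (3, 2602, 6970, 0.373),
    (4, 1774, 3619, 0.490),
    (5, 1864, 3940, 0.473),
    (6, 1950, 4228, 0.461),
    (7, 2642, 7128, 0.371),
    (8, 1586, 2996, 0.529),
    (9, 1586, 2996, 0.529),
    (10, 1982, 4313, 0.460),
]


def solidity(area: int, convex_area: int) -> float:
    """Assumed definition: fraction of the convex hull covered by the region."""
    return area / convex_area


# A row "matches" when the recomputed value agrees with the table
# to within half a unit of the third decimal place.
TOLERANCE = 0.0005 + 1e-9
mismatches = [
    roi
    for roi, area, convex_area, tabulated in ROWS
    if abs(solidity(area, convex_area) - tabulated) > TOLERANCE
]
print("ROIs whose tabulated solidity differs from area / convex area:", mismatches)
```

Under this assumed definition every row of the table is reproduced to three decimals, which suggests the shape features are not independent of each other.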

Table 4.3: Features extracted from subtracted image

ROI  Class  Mean Intensity  Max Intensity  STD        Variance  Entropy  Skewness  Kurtosis  Area  Eccentricity  Convex Area  Filled Area  Solidity  Extent
1    TP     5.69E-05        0.004          2.462E-04  6.06E-08  0.03     8.106     94.856    1732  0.176         3481         3479         0.498     0.423
2    TP     5.19E-08        2.88E-05       9.77E-07   9.54E-13  0        23.945    637.385   2300  0.209         5673         5650         0.405     0.342
3    TP     0.0601          0.312          0.069      0.0048    4.75     0.857     2.510     2602  0.287         6970         6922         0.373     0.304
4    TP     0.0980          0.266          0.074      0.0055    5.03     -0.148    1.601     1774  0.229         3619         3617         0.490     0.414
5    TP     0.0001          0.043          0.001      2.22E-06  0.10     19.553    465.099   1864  0.403         3940         3936         0.473     0.404
6    FP     0.0012          0.069          0.006      3.23E-05  0.59     5.533     37.390    1950  0.313         4228         4214         0.461     0.381
7    FP     0.0066          0.114          0.019      0.0004    1.51     3.370     14.108    2642  0.502         7128         6991         0.371     0.318
8    FP     0.0076          0.087          0.014      0.0002    2.47     2.633     10.551    1586  0.164         2996         2996         0.529     0.448
9    FP     0.0344          0.148          0.034      0.0012    3.87     0.504     2.035     1586  0.164         2996         2996         0.529     0.448
10   FP     0.0011          0.033          0.003      1.17E-05  0.84     4.888     32.250    1982  0.461         4313         4288         0.460     0.377

Table 4.4: T-test results for current image

   Mean Intensity  Max Intensity  STD    Variance  Entropy  Skewness  Kurtosis  Area   Eccentricity  Convex Area  Filled Area  Solidity  Extent
h  0               1              1      0         1        0         0         1      1             1            1            0         1
p  0.354           6.22E-08       0.038  0.544     1.9E-04  0.724     0.654     0.004  0.002         0.017        0.007        0.095     0.012

Table 4.5: T-test results for subtracted image

   Mean Intensity  Max Intensity  STD    Variance  Entropy   Skewness  Kurtosis  Area   Eccentricity  Convex Area  Filled Area  Solidity  Extent
h  0               1              0      1         1         1         1         1      1             1            1            0         1
p  0.767           3.75E-06       0.068  0.03      9.44E-10  3.47E-07  4.92E-05  0.004  0.002         0.017        0.007        0.095     0.012
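In the t-test tables, the h row is the test decision and the p row the corresponding p-value. The following minimal sketch, assuming the conventional 5% significance level (an assumption; the level is not restated in this section), shows how h follows from p for the current-image values. It is written in Python for illustration, and the names `ALPHA`, `decide` and `significant` are hypothetical.

```python
ALPHA = 0.05  # assumed significance level

FEATURES = [
    "Mean Intensity", "Max Intensity", "STD", "Variance", "Entropy",
    "Skewness", "Kurtosis", "Area", "Eccentricity", "Convex Area",
    "Filled Area", "Solidity", "Extent",
]

# p and h rows of the current-image t-test table.
P_CURRENT = [0.354, 6.22e-08, 0.038, 0.544, 1.9e-04, 0.724, 0.654,
             0.004, 0.002, 0.017, 0.007, 0.095, 0.012]
H_CURRENT = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1]


def decide(p_value: float, alpha: float = ALPHA) -> int:
    """Return 1 when the null hypothesis of equal means is rejected."""
    return int(p_value < alpha)


decisions = [decide(p) for p in P_CURRENT]
significant = [name for name, h in zip(FEATURES, decisions) if h == 1]
print("decisions reproduce the h row:", decisions == H_CURRENT)
print("individually significant features:", significant)
```

Note that solidity is not individually significant under this threshold, yet it is part of the combination selected by MANOVA, which is consistent with the remark that a feature can help in combination even when its own t-test is unsatisfactory.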

Table 4.6: MANOVA results for current image

Combination  Features included  d  p
1            13 (all features)  1  4.6E-03
2            8                  1  3.23E-04
3            9                  1  2.98E-04
4            10                 1  8.67E-09

The combination with the lowest p-value (combination 4) comprised the mean intensity, max intensity, variance, entropy, skewness, kurtosis, area, eccentricity, solidity and extent.

Table 4.7: MANOVA results for subtracted image

Combination  Features included  d  p
1            13 (all features)  1  0.004
2            10                 1  3.12E-04
3            10                 1  6.71E-05
4            9                  1  7.35E-06

The combination with the lowest p-value (combination 4) comprised the max intensity, standard deviation, variance, entropy, skewness, kurtosis, eccentricity, filled area and extent.

Table 4.8: Selected features for the classification step

4.2.2 Classification

Having identified the most valuable features, we used Discriminant Analysis to classify the suspected ROIs as TP or FP. The classification was repeated 8 times; each time, one patient was left out of the training process and used for validation. Table 4.9 shows the overall results of the classification step for all the cases in our dataset.
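The leave-one-patient-out procedure described above can be sketched as follows. This is an illustrative Python outline, not the implementation used in this work: it only builds the eight training and validation splits from the per-patient ROI counts reported in Table 4.10, and the names `ROIS_PER_PATIENT`, `build_roi_index` and `leave_one_patient_out` are hypothetical.

```python
from typing import Dict, Iterator, List, Tuple

Roi = Tuple[int, int]  # (patient id, ROI number within that patient)

# Number of detected ROIs per patient, taken from Table 4.10.
ROIS_PER_PATIENT: Dict[int, int] = {
    1: 46, 2: 73, 3: 58, 4: 18, 5: 27, 6: 30, 7: 82, 8: 42,
}


def build_roi_index(rois_per_patient: Dict[int, int]) -> List[Roi]:
    """Return one (patient, roi_number) pair per detected ROI."""
    return [(pid, k) for pid, n in rois_per_patient.items() for k in range(n)]


def leave_one_patient_out(rois: List[Roi]) -> Iterator[Tuple[int, List[Roi], List[Roi]]]:
    """Yield (held_out_patient, training_rois, validation_rois) per patient."""
    for held_out in sorted({pid for pid, _ in rois}):
        training = [roi for roi in rois if roi[0] != held_out]
        validation = [roi for roi in rois if roi[0] == held_out]
        yield held_out, training, validation


rois = build_roi_index(ROIS_PER_PATIENT)
folds = list(leave_one_patient_out(rois))
for held_out, training, validation in folds:
    print(f"patient {held_out}: train on {len(training)} ROIs, "
          f"validate on {len(validation)} ROIs")
```

In each fold a classifier would be fitted on `training` and applied to `validation`; the point of the design is that no ROI of the held-out patient is ever seen during training.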

Table 4.9: Classification results

                      Predicted Class
True Class            Micro-calcifications  Normal
Micro-calcifications  121                   12
Normal                60                    220

At first glance, the above table shows that the classification step significantly improved the algorithm by reducing the number of false positives and false negatives.

4.2.3 Evaluation of the classification

To evaluate the performance of the classification step in our algorithm, we measured the accuracy, sensitivity and specificity using the formulas described above. Table 4.10 presents the overall results.

Selected Features
Mean Intensity  Area
Max Intensity   Eccentricity
Skewness        Solidity
Kurtosis        Extent
Variance
Entropy

Table 4.10: Evaluation of the classifier

Patient  ROIs  TP  FP  FN  TN
1        46    14  12  0   20
2        73    8   9   0   56
3        58    6   11  0   41
4        18    5   0   1   12
5        27    2   9   0   16
6        30    2   6   0   22
7        82    53  4   1   24
8        42    3   9   1   29
Total    376   93  60  3   220

Accuracy     83.245%
Sensitivity  0.969
Specificity  0.786
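The summary figures of Table 4.10 can be reproduced from the per-patient counts. The sketch below is an illustrative Python check, assuming the standard definitions of accuracy, sensitivity and specificity and reading the third and fourth count columns as false negatives (3 in total) and true negatives (220 in total), as the accompanying text describes. The name `PER_PATIENT` is hypothetical.

```python
# Per-patient counts from Table 4.10, ordered as (TP, FP, FN, TN).
PER_PATIENT = {
    1: (14, 12, 0, 20),
    2: (8, 9, 0, 56),
    3: (6, 11, 0, 41),
    4: (5, 0, 1, 12),
    5: (2, 9, 0, 16),
    6: (2, 6, 0, 22),
    7: (53, 4, 1, 24),
    8: (3, 9, 1, 29),
}

# Column-wise totals over all eight patients.
tp, fp, fn, tn = (sum(column) for column in zip(*PER_PATIENT.values()))

accuracy = (tp + tn) / (tp + fp + fn + tn)  # correctly classified ROIs
sensitivity = tp / (tp + fn)                # micro-calcifications that were kept
specificity = tn / (tn + fp)                # normal regions that were rejected

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"accuracy={accuracy:.5f} sensitivity={sensitivity:.3f} "
      f"specificity={specificity:.3f}")
```

With these totals, accuracy is 313/376, sensitivity is 93/96 and specificity is 220/280, which agree with the reported 83.245%, 0.969 and 0.786.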

After the machine-learning implementation, the performance of the proposed algorithm improved and the algorithm became more efficient. Of the 96 micro-calcifications, we correctly recognized 93 and missed only 3, which again were not malignant. This is important for the CAD system, since we wanted to exclude only the areas that did not correspond to abnormalities and to identify as many micro-calcifications as possible. The main problem had been the high number of false positives: before classification, our algorithm categorised a large number of normal areas as abnormal, and specifically 280 of the 376 detected ROIs were false positives. With the classification step, 220 falsely detected regions were eliminated and only 60 regions were misclassified as abnormal. Therefore, the classification step was very promising and increased the accuracy of the algorithm.

The accuracy of the classifier was 83.245%, which is promising given that we used only 8 temporal pairs of mammograms. In the following chapter we compare it with the state-of-the-art techniques. Sensitivity was 0.969, which means that almost all the micro-calcifications were found and that very few abnormal areas were missed or overlooked. High sensitivity is essential in medical applications and in our case was the most important measurement, because we wanted to exclude the areas that were not micro-calcifications while keeping false negatives to a minimum. It was necessary to be sure that all the patients found to be healthy were indeed healthy.

Specificity, though, was lower due to the false positives. Specificity measures the proportion of normal regions that were correctly classified as normal rather than falsely detected as abnormal. In our study, specificity was less important than sensitivity. Our goal was to eliminate false positives, in order to assist the radiologists rather than confuse them with falsely detected regions, but the crucial part was to exclude only the truly healthy patients and not to characterize a cancerous region as a normal one. In general, with the classification step we improved the proposed methodology and constructed a valuable CAD system for the detection and classification of micro-calcifications in temporal mammographic pairs.

5 Discussion

In this work, we presented an automated CAD algorithm for the diagnosis of breast micro-calcifications on temporal pairs of mammograms. We used 8 pairs of full-field digital temporal mammograms from The Breast Center of Cyprus, with 5 pairs of Cranio-Caudal (CC) view and 3 pairs of Mediolateral-Oblique (MLO) view, from 8 women. First, we normalized the images, and then the pre-processing took place. Gamma correction and border removal were chosen to eliminate irrelevant regions and showed satisfactory results compared to the state-of-the-art techniques. We did not find any algorithm in the literature that combined these two filters.

After that came the registration step, which mapped the current and prior mammograms to each other; the registered and the current mammograms were then subtracted. Affine and Demons registration techniques were tested on dense and fatty mammograms. The mean residual with Affine was 37% and is 10% greater than the residual with Demons. In fatty mammograms, the mean value of the residual with Demons was 24% and is 11% worse than Affine's. Based on the results, the Demons registration technique was selected over Affine and implemented in the algorithm. In the literature, a large number of studies used Demons for the registration of mammographic images, and our results compared well with them. Nonetheless, those studies used different measures of the effectiveness of the registration method, which makes a direct comparison with our study difficult. With the previous steps, we had already improved the contrast of the subtracted image compared to the current one without pre-processing. Specifically, the mean contrast ratio was 2 for the current image and 6 for the subtracted image. The insignificant regions were cleared, so the radiologist can identify the abnormalities more easily, without seeing the background and the old areas that were excluded in the second screening round.

Next came the post-processing step, which involved filtering the subtracted image with various filters to determine the most suitable one, thresholding, and processing with morphological operations. The range filter was preferred because it erased the high-intensity background without damaging important regions. With the thresholding and the morphological processing, the subtracted image became binary and unnecessary regions were removed. The fourth step comprised the removal of the periphery areas that arose from false assessments by the algorithm and the removal of the regions that belonged to old micro-calcifications.

Later, we constructed the ground truth images to evaluate the performance of the proposed algorithm. The results showed that the algorithm identified the true micro-calcifications with sufficient accuracy but also marked many normal areas as abnormal. With a total of 376 detected areas, it falsely characterized 280 areas as micro-calcifications and 37 as normal. The F1-score was 0.301, which indicated that the algorithm needed improvement, since the large number of false positive regions created a critical problem for this study. The literature shows that this is the most significant problem and that all the studies struggled with it.

For this reason, we used machine learning to eliminate the false positives and improve our method. We extracted 13 FOS and shape features from the current and subtracted images for every previously detected ROI. Those features were the mean gray level, the maximum gray level, standard deviation, variance, entropy, skewness, kurtosis, area, eccentricity, convex area, filled area, solidity and extent. Statistical analysis and multivariate analysis of variance were used to find the best combination of features. The features extracted from the current image were more valuable, since their p-value was lower, and the features finally selected for the classification step were the mean intensity, max intensity, variance, entropy, skewness, kurtosis, area, eccentricity, solidity and extent. With discriminant analysis, we used leave-one-patient-out validation to split the dataset into a training set and a test set, and we computed the accuracy, sensitivity and specificity.
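To make the first-order statistics (FOS) features concrete, the following Python sketch computes them for the gray levels of a single ROI. It is an illustration under stated assumptions, not the implementation used in this work: it assumes population (1/N) moments, non-excess kurtosis, and Shannon entropy in bits over the distinct gray levels present in the ROI. The function name `first_order_statistics` is hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def first_order_statistics(pixels: Sequence[float]) -> Dict[str, float]:
    """First-order statistics of the gray levels inside one ROI.

    Assumed conventions: 1/N moments, kurtosis of 3 for a Gaussian,
    entropy in bits over the gray levels that occur in the ROI.
    """
    n = len(pixels)
    mean = sum(pixels) / n
    centred = [p - mean for p in pixels]
    variance = sum(c ** 2 for c in centred) / n
    std = math.sqrt(variance)
    # Standardised third and fourth moments; set to 0 for a flat ROI.
    skewness = sum(c ** 3 for c in centred) / n / std ** 3 if std else 0.0
    kurtosis = sum(c ** 4 for c in centred) / n / std ** 4 if std else 0.0
    probabilities = [count / n for count in Counter(pixels).values()]
    entropy = -sum(p * math.log2(p) for p in probabilities)
    return {
        "mean_intensity": mean,
        "max_intensity": max(pixels),
        "std": std,
        "variance": variance,
        "entropy": entropy,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


features = first_order_statistics([1, 2, 3, 4, 5])
for name, value in features.items():
    print(f"{name}: {value:.4f}")
```

The shape features (area, eccentricity, convex area, filled area, solidity and extent) depend on the binary region mask rather than on the gray levels, so they are not covered by this sketch.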

Table 5.1 presents a comparison between our proposed method and state-of-the-art classification techniques that used machine learning with leave-one-patient-out analysis to eliminate false positives. These three studies, however, did not detect micro-calcifications from temporal pairs, but only from a single screening round. We went one step further and eliminated the old micro-calcifications based on the previous mammograms, to improve the results and help the radiologists. The comparison of our methodology with the other methods in the literature is therefore not straightforward, because the experiments were conducted on different datasets of varying size. Additionally, the authors of each method chose different evaluation techniques for their algorithms.

In 1998, Nagel, et al. constructed a computerized scheme for the identification of micro-calcifications and, more specifically, the removal of false computer detections. They examined 3 different methods for the feature analysis: a rule-based method, an artificial neural network and a combined method. The combined method achieved the best results, with 83% sensitivity and 0.8 false positive detections per image. Next, Diaz-Huerta, Felipe-Riveron and Montaño-Zetina (2014) used morphological approaches and contrast enhancement techniques to detect the micro-calcifications. Then, to eliminate the false positives, they extracted 65 spatial, texture and spectral features and fed them into a support vector machine classifier to discriminate between micro-calcifications and normal tissue. The overall sensitivity was 85.9%, and in normal images they obtained 13 false positives per image. Lu, et al. (2016), after detecting the micro-calcifications, applied a classification step to remove the false regions that had been identified as micro-calcifications. They extracted 11 shape and 27 appearance features and used RUSBoost classifiers. Their method achieved 80% accuracy with 10 false positives per image. In our algorithm, the accuracy was 83.245%, the sensitivity 96.9% and the specificity 78.6%. Compared to the above methods, these measurements suggest that our results are very promising and that the proposed algorithm is effective.

It is worth mentioning that a considerable number of the studies that we found in the literature did not use leave-one-patient-out validation but cross-validation over the micro-calcifications. With that method, they did not hold out all the ROIs corresponding to a single patient; instead, they divided the training set and the test set using randomly selected micro-calcifications, or they used part of the same areas as both a test set and a training set. We did not choose this approach, because we wanted to detect all of the micro-calcifications inside a mammogram. With this technique, if we put a percentage of the ROIs in the training set and used the rest of them for testing, the algorithm would categorize the remaining areas correctly due to bias, since ROIs from the same patient would appear in both sets. Songyang Yu and Ling Guan (2000) proposed an automated CAD system for the detection and classification of clustered micro-calcifications in digitized mammograms. First, they detected the candidate regions. Then, they extracted 31 gray-level statistical and wavelet features and fed them into a neural network for the classification step. However, their training samples were also used in the testing of the classifier. Their results showed 90% accuracy with only 0.5 false positives per image.
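The source of the bias discussed above can be illustrated with the per-patient ROI counts of Table 4.10. The Python sketch below is hypothetical: it uses a deterministic every-third-ROI split as a stand-in for a random ROI-level split, and counts how many patients contribute ROIs to both the training and the test set, compared with a patient-level split. The helper name `shared_patients` is an assumption introduced for this example.

```python
from typing import List, Set, Tuple

Roi = Tuple[int, int]  # (patient id, ROI number within that patient)

# Number of detected ROIs per patient, taken from Table 4.10.
ROIS_PER_PATIENT = {1: 46, 2: 73, 3: 58, 4: 18, 5: 27, 6: 30, 7: 82, 8: 42}
rois: List[Roi] = [(pid, k) for pid, n in ROIS_PER_PATIENT.items() for k in range(n)]


def shared_patients(training: List[Roi], test: List[Roi]) -> Set[int]:
    """Patients that contribute ROIs to both the training and the test set."""
    return {pid for pid, _ in training} & {pid for pid, _ in test}


# ROI-level split: every third ROI goes to the test set, regardless of patient.
roi_level_test = [roi for i, roi in enumerate(rois) if i % 3 == 0]
roi_level_train = [roi for i, roi in enumerate(rois) if i % 3 != 0]
leaky = shared_patients(roi_level_train, roi_level_test)

# Patient-level split: all ROIs of one patient (here patient 1) are held out.
patient_test = [roi for roi in rois if roi[0] == 1]
patient_train = [roi for roi in rois if roi[0] != 1]
clean = shared_patients(patient_train, patient_test)

print("patients in both sets, ROI-level split:", sorted(leaky))
print("patients in both sets, patient-level split:", sorted(clean))
```

In the ROI-level split every patient appears on both sides, so the classifier is tested on tissue it has effectively already seen, whereas the patient-level split shares no patient between the two sets.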

With this in mind, our approach was more general and did not contain any bias, which is the reason that our results were slightly lower than the state-of-the-art. For comparison, we also evaluated our algorithm with that approach and found an accuracy of 98%, a sensitivity of 0.985 and a specificity of 0.876. This suggests that the proposed algorithm is as good as, or even better than, the state-of-the-art algorithms.

The main drawback of the proposed method was the relatively small dataset, which can alter the results of the classification step. It is reasonable to expect that with bigger datasets the classification step would be more accurate. Moreover, the ages of the patients who participated in the study were not a representative sample. At the registration step, Demons was slow to execute, which consequently slowed down the whole process. Also, we used only Discriminant Analysis for the separation between TP and FP and did not test other machine-learning algorithms. Nevertheless, our results show that the proposed algorithm is efficient and can be used with relatively high accuracy for breast micro-calcifications detection using temporal mammograms.

Table 5.1: Performance of different methods

Classification as micro-calcification or normal using a classifier and leave-one-patient-out procedure

Method                                                 Performance
Nagel, et al. (1998)                                   83% sensitivity with 0.8 false positives per image
Diaz-Huerta, Felipe-Riveron and Montaño-Zetina (2014)  85.9% sensitivity with 13 false positives per image
Lu, et al. (2016)                                      80% accuracy with 10 false positives per image
Proposed method                                        83.245% accuracy, 96.9% sensitivity, 78.6% specificity

6 Conclusion and Future Work

We performed a study for the development of a CAD system for the detection of micro-calcifications on temporal pairs of mammograms. We combined a series of pre-processing, registration and post-processing techniques in order to efficiently subtract the mammographic pair and enhance the micro-calcifications. Additionally, we removed the old micro-calcifications and the periphery regions that were unnecessary to our algorithm. We extracted 13 FOS and shape features from the current and subtracted images for the classification step, to eliminate the false positive regions. With statistical analysis and multivariate analysis of variance we selected the best features, and with discriminant analysis and leave-one-patient-out validation we classified the detected regions and computed the accuracy, the sensitivity and the specificity as evaluation measures. From our results, we conclude that the achieved performance is sufficient and that our algorithm can assist radiologists in breast micro-calcifications detection using temporal mammograms.

Encouraged by this initial success, we plan to use a larger dataset, with more representative age samples, in order to test our algorithm further. Likewise, more features are needed for the classification of regions as TP or FP, together with more complex classifiers, which can improve the results and increase the percentage of correct classifications. Finally, we can generalize our algorithm to other kinds of abnormalities in mammograms besides micro-calcifications, and predict the development of new abnormalities in the next screening rounds.

References

Abdel-Nasser, M., Moreno, A. and Puig, D. (2016). Temporal mammogram image registration using

optimized curvilinear coordinates. Computer Methods and Programs in Biomedicine, 127, pp.1-14.

Arfan, M. (2017). Wavelets Texture based Classification of Breast Mammograms using Adaboost

Classifier. International Journal of Advanced Computer Science and Applications, 8(5).

Bailey, D. and Hodgson, R. (1985). Range filters: Local intensity subrange filters and their

properties. Image and Vision Computing, 3(3), pp.99-110.

Bekker, A., Shalhon, M., Greenspan, H. and Goldberger, J. (2016). Multi-View Probabilistic

Classification of Breast Microcalcifications. IEEE Transactions on Medical Imaging, 35(2), pp.645-

653.

Berry, D. (2011). Computer-Assisted Detection and Screening Mammography: Where's the Beef?

JNCI Journal of the National Cancer Institute, 103(15), pp.1139-1141.

Beura, S. (2016). Development of Features and Feature Reduction Techniques for Mammogram

Classification. Ph.D. National Institute of Technology Rourkela.

Bhattacharya, M. and Das, A. (2007). Fuzzy Logic Based Segmentation of Microcalcification in Breast

Using Digital Mammograms Considering Multiresolution. In: Machine Vision and Image Processing

Conference.

Boulehmi, H., Mahersia, H. and Hamrouni, K. (2016). A New CAD System for Breast

Microcalcifications Diagnosis. International Journal of Advanced Computer Science and Applications,

7(4).

Bozek, J., Kallenberg, M., Grgic, M. and Karssemeijer, N. (2014). Use of volumetric features for

temporal comparison of mass lesions in full field digital mammograms. Medical Physics, 41(2),

p.021902.

Burrell, H., Sibbering, D., Wilson, A., Pinder, S., Evans, A., Yeoman, L., Elston, C., Ellis, I., Blamey,

R. and Robertson, J. (1996). Screening interval breast cancers: mammographic features and prognosis

factors. Radiology, 199(3), pp.811-817.

Cady, B. and Chung, M. (2005). Mammographic Screening: No Longer Controversial. American

Journal of Clinical Oncology, 28(1), pp.1-4.

Cancer.org. (2016). What is breast cancer? [online] Available at:

http://www.cancer.org/cancer/breastcancer/detailedguide/breast-cancer-what-is-breast-cancer

[Accessed 3 Oct. 2016].

Casti, P., Mencattini, A., Salmeri, M. and Rangayyan, R. (2015). Analysis of Structural Similarity in

Mammograms for Detection of Bilateral Asymmetry. IEEE Transactions on Medical Imaging, 34(2),

pp.662-671.

Casti, P., Mencattini, A., Salmeri, M., Ancona, A., Lorusso, M., Pepe, M., Natale, C. and Martinelli, E.

(2017). Towards localization of malignant sites of asymmetry across bilateral mammograms. Computer

Methods and Programs in Biomedicine, 140, pp.11-18.

Celaya-Padilla, J., Martinez-Torteya, A., Rodriguez-Rojas, J., Galvan-Tejada, J., Treviño, V. and

Tamez-Peña, J. (2015). Bilateral Image Subtraction and Multivariate Models for the Automated

Triaging of Screening Mammograms. BioMed Research International, 2015, pp.1-12.

Cheng, H., Cai, X., Chen, X., Hu, L. and Lou, X. (2003). Computer-aided detection and classification

of microcalcifications in mammograms: a survey. Pattern Recognition, 36(12), pp.2967-2991.

Cherkassky, V. and Mulier, F. (1998). Learning from data. New York: Wiley.

Ciecholewski, M. (2016). Microcalcification Segmentation from Mammograms: A Morphological

Approach. Journal of Digital Imaging, 30(2), pp.172-184.

Desautels, J., Rangayyan, R. and Mudigonda, N. (2000). Gradient and texture analysis for the

classification of mammographic masses. IEEE Transactions on Medical Imaging, 19(10), pp.1032-

1043.

Devroye, L., Gyorfi, L. and Lugosi, G. (1996). A probabilistic theory of pattern recognition. New York:

Springer.

Dhawan, A., Chitre, Y. and Kaiser-Bonasso, C. (1996). Analysis of mammographic microcalcifications

using gray-level image structure features. IEEE Transactions on Medical Imaging, 15(3), pp.246-259.

Diaz-Huerta, C., Felipe-Riveron, E. and Montaño-Zetina, L. (2014). Quantitative analysis of

morphological techniques for automatic classification of micro-calcifications in digitized

mammograms. Expert Systems with Applications, 41(16), pp.7361-7369.

Diéz, Y., Gubern-Mérida, A., Wang, L., Diekmann, S., Martí, J., Platel, B., Kramme, J. and Martí,

R. (2014). Comparison of Methods for Current-to-Prior Registration of Breast DCE-MRI. In: H. Fujita,

T. Hara and C. Muramatsu, ed., Breast Imaging: 12th International Workshop, IWDM 2014, Gifu City,

Japan, June 29 - July 2, 2014. Proceedings, 1st ed. Gifu City: Springer, pp.689-695.

Diéz, Y., Oliver, A., Llado, X., Freixenet, J., Marti, J., Vilanova, J. and Marti, R. (2011). Revisiting

Intensity-Based Image Registration Applied to Mammography. IEEE Transactions on Information

Technology in Biomedicine, 15(5), pp.716-725.

Efford, N. (2002). Digital image processing. Harlow, Essex [u.a.]: Addison-Wesley.

El-Naqa, I., Yongyi Yang, Wernick, M., Galatsanos, N. and Nishikawa, R. (2002). A support vector

machine approach for detection of microcalcifications. IEEE Transactions on Medical Imaging, 21(12),

pp.1552-1563.

Fahnestock, J. and Schowengerdt, R. (1983). Spatially Variant Contrast Enhancement Using Local

Range Modification. Optical Engineering, 22(3).

Fenton, J. (2015). Is It Time to Stop Paying for Computer-Aided Mammography? JAMA Internal

Medicine, 175(11), p.1837.

Fenton, J., Abraham, L., Taplin, S., Geller, B., Carney, P., D'Orsi, C., Elmore, J. and Barlow, W. (2011).

Effectiveness of Computer-Aided Detection in Community Mammography Practice. JNCI Journal of

the National Cancer Institute, 103(15), pp.1152-1161.

Fenton, J., Xing, G., Elmore, J., Bang, H., Chen, S., Lindfors, K. and Baldwin, L. (2013). Short-Term

Outcomes of Screening Mammography Using Computer-Aided Detection. Annals of Internal

Medicine, 158(8), p.580.

French, A., Macedo, M., Poulsen, J., Waterson, T. and Yu, A. (n.d.). Multivariate Analysis of Variance

(MANOVA). http://libraryguides.neomed.edu/c.php?g=324183&p=2172315.

Galván-Tejada, C., Zanella-Calzada, L., Galván-Tejada, J., Celaya-Padilla, J., Gamboa-Rosales, H.,

Garza-Veloz, I. and Martinez-Fierro, M. (2017). Multivariate Feature Selection of Image Descriptors

Data for Breast Cancer with Computer-Assisted Diagnosis. Diagnostics, 7(1), p.9.

Ganesan, K., Acharya, U., Chua, C., Min, L., Abraham, K. and Ng, K. (2013). Computer-Aided Breast

Cancer Detection Using Mammograms: A Review. IEEE Reviews in Biomedical Engineering, 6, pp.77-

98.

Giger, M., Karssemeijer, N. and Armato, S. (2001). Guest editorial computer-aided diagnosis in medical

imaging. IEEE Transactions on Medical Imaging, 20(12), pp.1205-1208.

Gonzalez, R. and Woods, R. (1992). Instructor's manual for digital image processing. Reading [etc.]:

Addison-Wesley.

Hackshaw, A. and Paul, E. (2003). Breast self-examination and death from breast cancer: a meta-

analysis. British Journal of Cancer, 88(7), pp.1047-1053.

Hadjiiski, L., Sahiner, B., Heang-Ping Chan, Petrick, N. and Helvie, M. (1999). Classification of

malignant and benign masses based on hybrid ART2LDA approach. IEEE Transactions on Medical

Imaging, 18(12), pp.1178-1187.

Hadjiiski, L., Sahiner, B., Chan, H., Petrick, N., Helvie, M. and Gurcan, M. (2001). Analysis of temporal

changes of mammographic features: Computer-aided classification of malignant and benign breast

masses. Medical Physics, 28(11), p.2309.

Haralick, R., Shanmugam, K. and Dinstein, I. (1973). Textural Features for Image Classification. IEEE

Transactions on Systems, Man, and Cybernetics, 3(6), pp.610-621.

Hasegawa, A., Neemuchwala, H., Tsunoda-Shimizu, H., Honda, S., Shimura, K., Sato, M., Koyama,

T., Kikuchi, M. and Hiramatsu, S. (2008). A Tool for Temporal Comparison of Mammograms: Image

Toggling and Dense-Tissue-Preserving Registration. In: E. Krupinski, ed., Digital Mammography. 9th

International Workshop, IWDM 2008 Tucson, AZ, USA, July 20-23, 2008 Proceedings, 1st ed. Berlin:

Springer, pp.447-454.

Heinlein, P., Drexl, J. and Schneider, W. (2003). Integrated wavelets for enhancement of

microcalcifications in digital mammography. IEEE Transactions on Medical Imaging, 22(3), pp.402-

413.

Hsu, H. and Lachenbruch, P. (2008). Paired t Test. Wiley Encyclopedia of Clinical Trials.

Hu, K., Yang, W. and Gao, X. (2017). Microcalcification diagnosis in digital mammography using

extreme learning machine based on hidden Markov tree model of dual-tree complex wavelet transform.

Expert Systems with Applications, 86, pp.135-144.

Jain, A., Duin, P. and Jianchang Mao, (2000). Statistical pattern recognition: a review. IEEE

Transactions on Pattern Analysis and Machine Intelligence, 22(1), pp.4-37.

Karssemeijer, N. (1993). Adaptive noise equalization and recognition of microcalcification clusters in

mammograms. International Journal of Pattern Recognition and Artificial Intelligence, 07(06),

pp.1357-1376.

Karssemeijer, N. and te Brake, G. (1996). Detection of stellate distortions in mammograms. IEEE

Transactions on Medical Imaging, 15(5), pp.611-619.

Kelder, A., Zigel, Y., Lederman, D. and Zheng, B. (2015). A new computer-aided detection scheme

based on assessment of local bilateral mammographic feature asymmetry - A preliminary evaluation.

In: 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society.

IEEE.

Khehra, B. and Pharwaha, A. (2016). Classification of Clustered Microcalcifications using MLFFBP-

ANN and SVM. Egyptian Informatics Journal, 17(1), pp.11-20.

Kobatake, H., Murakami, M., Takeo, H. and Nawano, S. (1999). Computerized detection of malignant

tumors on digital mammograms. IEEE Transactions on Medical Imaging, 18(5), pp.369-378.

Kooi, T. and Karssemeijer, N. (2017). Classifying Symmetrical Differences and Temporal Change for

the Detection of Malignant Masses in Mammography Using Deep Neural Networks. [online] Available

at:

https://www.researchgate.net/publication/315520619_Classifying_Symmetrical_Differences_and_Te

mporal_Change_in_Mammography_Using_Deep_Neural_Networks [Accessed 1 Jul. 2017].

Kovalerchuk, B., Triantaphyllou, E., Ruiz, J. and Clayton, J. (1997). Fuzzy logic in computer-aided

breast cancer diagnosis: analysis of lobulation. Artificial Intelligence in Medicine, 11(1), pp.75-85.

Krzanowski, W. (2000). Principles of Multivariate Analysis. Oxford: Oxford University Press.

Lehman, C., Wellman, R., Buist, D., Kerlikowske, K., Tosteson, A. and Miglioretti, D. (2015).

Diagnostic Accuracy of Digital Screening Mammography With and Without Computer-Aided

Detection. JAMA Internal Medicine, 175(11), p.1828.

Lei Zhen, and Chan, A. (2001). An artificial intelligent algorithm for tumor detection in screening

mammogram. IEEE Transactions on Medical Imaging, 20(7), pp.559-567.

Li, Y., Chen, H., Yang, Y., Cheng, L. and Cao, L. (2015). A bilateral analysis scheme for false positive

reduction in mammogram mass detection. Computers in Biology and Medicine, 57, pp.84-95.

Li, H., Meng, X., Wang, T., Tang, Y. and Yin, Y. (2017). Breast masses in mammography classification

with local contour features. BioMedical Engineering OnLine, 16(1).

Lu, Z., Carneiro, G., Dhungel, N. and P. Bradley, N. (2016). Automated Detection of Individual Micro-

calcifications From Mammograms Using a Multi-Stage Cascade Approach. [ebook] Available at:

http://1610.02251.pdf [Accessed 8 Feb. 2018].

Ma, F., Bajger, M., Williams, S. and Bottema, M. (2010). Improved Detection of Cancer in Screening

Mammograms by Temporal Comparison. In: J. Martí, A. Oliver, J. Freixenet and R. Martí, ed., Digital

Mammography 10th International Workshop, IWDM 2010, Girona, Catalonia, Spain, June 16-18, 2010.

Proceedings, 1st ed. Girona: Springer, pp.752-759.

Ma, F., Yu, L., Liu, G. and Niu, Q. (2014). Temporal change analysis for computer aided mass detection

in mammography. 2014 7th International Conference on Biomedical Engineering and Informatics.

Ma, F., Yu, L., Bajger, M. and Bottema, M. (2015). Incorporation of fuzzy spatial relation in temporal

mammogram registration. Fuzzy Sets and Systems, 279, pp.87-100.

Ma, F., Yu, L., Bajger, M. and Bottema, M. (2015). Mammogram Mass Classification with Temporal

Features and Multiple Kernel Learning. In: International Conference on Digital Image Computing:

Techniques and Applications (DICTA). Adelaide: IEEE eXpress Conference Publishing, pp.505-511.

Mallat, S. and Zhong, S. (1992). Characterization of signals from multiscale edges. IEEE Transactions

on Pattern Analysis and Machine Intelligence, 14(7), pp.710-732.

Marias, K., Behrenbruch, C., Parbhoo, S., Seifalian, A. and Brady, M. (2005). A registration framework

for the comparison of mammogram sequences. IEEE Transactions on Medical Imaging, 24(6), pp.782-

790.

Martí, R., Díez, Y., Oliver, A., Tortajada, M., Zwiggelaar, R. and Lladó, X. (2014). Detecting Abnormal

Mammographic Cases in Temporal Studies Using Image Registration Features. In: H. Fujita, T. Hara

and C. Muramatsu, ed., Breast Imaging: 12th International Workshop, IWDM 2014, Gifu City, Japan,

June 29 - July 2, 2014. Proceedings, 1st ed. Gifu City: Springer, pp.612-619.

MATLAB and Image Processing Toolbox. Release 2006b, The MathWorks, Inc., Natick,

Massachusetts, United States.

Mendez, A., Tahoces, P., Lado, M., Souto, M. and Vidal, J. (1998). Computer-aided diagnosis:

Automatic detection of malignant masses in digitized mammograms. Medical Physics, 25(6), p.957.

Mika, S., Ratsch, G., Weston, J., Scholkopf, B. and Mullers, K. (1999). Fisher discriminant analysis

with kernels. In: Neural Networks for Signal Processing IX.

Mousa, R., Munib, Q. and Moussa, A. (2005). Breast cancer diagnosis system based on wavelet analysis

and fuzzy-neural. Expert Systems with Applications, 28(4), pp.713-723.

Nagel, R., Nishikawa, R., Papaioannou, J. and Doi, K. (1998). Analysis of methods for reducing false

positives in the automated detection of clustered microcalcifications in mammograms. Medical Physics,

25(8), pp.1502-1506.

National Cancer Institute. (2016). Mammograms. [online] Available at:

https://www.cancer.gov/types/breast/mammograms-fact-sheet#q11 [Accessed 1 Nov. 2016].

Nguyen, V., Nguyen, D., Nguyen, T., Phan, V. and Truong, Q. (2015). Filter-based feature selection

and support vector machine for false positive reduction in computer-aided mass detection in

mammograms. Seventh International Conference on Machine Vision (ICMV 2014).

Oliver, A. (2007). Automatic Mass Segmentation in Mammographic Images. Ph.D. University of

Girona.

Oliver, A., Freixenet, J., Martí, J., Pérez, E., Pont, J., Denton, E. and Zwiggelaar, R. (2010). A review

of automatic mass detection and segmentation in mammographic images. Medical Image Analysis,

14(2), pp.87-110.

Papadopoulos, A., Fotiadis, D. and Likas, A. (2005). Characterization of clustered microcalcifications

in digitized mammograms using neural networks and support vector machines. Artificial Intelligence

in Medicine, 34(2), pp.141-150.

Papoulis, A. (1965). Probability, random variables, and stochastic processes. New York: McGraw-Hill.

Patel, B. and Sinha, G. (2010). An Adaptive K-means Clustering Algorithm for Breast Image

Segmentation. International Journal of Computer Applications, 10(4), pp.35-38.

Rangayyan, R., Banik, S. and Desautels, J. (2010). Computer-Aided Detection of Architectural

Distortion in Prior Mammograms of Interval Cancer. Journal of Digital Imaging, 23(5), pp.611-631.

Richard, F. and Cohen, L. (2003). A new Image Registration technique with free boundary constraints:

application to mammography. Computer Vision and Image Understanding, 89(2-3), pp.166-196.

Rumelhart, D., Hinton, G. and Williams, R. (1986). Learning internal representations by error

propagation. In: D. Rumelhart, J. McClelland and PDP Research Group, ed., Parallel distributed

processing: explorations in the microstructure of cognition, 1st ed. Cambridge: MIT Press, pp.318-362.

Sanjay-Gopal, S., Chan, H., Wilson, T., Helvie, M., Petrick, N. and Sahiner, B. (1999). A regional

registration technique for automated interval change analysis of breast lesions on mammograms.

Medical Physics, 26(12), p.2669.

Scharcanski, J. and Jung, C. (2006). Denoising and enhancing digital mammographic images for visual

screening. Computerized Medical Imaging and Graphics, 30(4), pp.243-254.

Seber, G. (1984). Multivariate Observations. Wiley Series in Probability and Statistics.

Shanmugavadivu, P. and Sivakumar, V. (2013). Segmentation of pectoral muscle in mammograms

using fractal method. 2013 International Conference on Computer Communication and Informatics.

Shanmugavadivu, P., Sivakumar, V. and Sudhir, R. (2016). Fractal dimension-bound spatio-temporal

analysis of digital mammograms. The European Physical Journal Special Topics, 225(1), pp.137-146.

Sickles, E. (1986). Mammographic features of 300 consecutive nonpalpable breast cancers. American

Journal of Roentgenology, 146(4), pp.661-663.

Simoncelli, E. and Adelson, E. (1996). Noise removal via Bayesian wavelet coring. Proceedings of 3rd

IEEE International Conference on Image Processing, 1, pp.379-382.

Smithuis, R. and Pijnappel, R. (2008). Breast - Calcifications Differential Diagnosis. [ebook]

Available at: http://www.radiologyassistant.nl/en/p4793bfde0ed53/breast-calcifications-differential-diagnosis.html [Accessed 19 Aug. 2017].

Soille, P. (1999). Morphological Image Analysis: Principles and Applications. Berlin: Springer, pp.164-165.

Suhail, Z., Sarwar, M. and Murtaza, K. (2015). Automatic detection of abnormalities in mammograms.

BMC Medical Imaging, 15(1).

Sun, W., Zheng, B., Lure, F., Wu, T., Zhang, J., Wang, B., Saltzstein, E. and Qian, W. (2014). Prediction

of near-term risk of developing breast cancer using computerized features from bilateral mammograms.

Computerized Medical Imaging and Graphics, 38(5), pp.348-357.

Tan, M., Zheng, B., Leader, J. and Gur, D. (2016). Association Between Changes in Mammographic

Image Features and Risk for Near-Term Breast Cancer Development. IEEE Transactions on Medical

Imaging, 35(7), pp.1719-1728.

Tang, J., Rangayyan, R., Xu, J., El Naqa, I. and Yang, Y. (2009). Computer-Aided Detection and

Diagnosis of Breast Cancer With Mammography: Recent Advances. IEEE Transactions on Information

Technology in Biomedicine, 13(2), pp.236-251.

Theodoridis, S. and Koutroumbas, K. (2006). Pattern recognition. Amsterdam: Elsevier/Academic

Press.

Timp, S. and Karssemeijer, N. (2006). Interval change analysis to improve computer aided detection in

mammography. Medical Image Analysis, 10(1), pp.82-95.

Timp, S., Varela, C. and Karssemeijer, N. (2007). Temporal Change Analysis for Characterization of

Mass Lesions in Mammography. IEEE Transactions on Medical Imaging, 26(7), pp.945-953.

van Engeland, S., Snoeren, P., Hendriks, J. and Karssemeijer, N. (2003). A comparison of methods

for mammogram registration. IEEE Transactions on Medical Imaging, 22(11), pp.1436-1444.

Viton, J. (1996). Method for characterizing masses in digital mammograms. Optical Engineering,

35(12), p.3453.

Vujovic, N. and Brzakovic, D. (1997). Establishing the correspondence between control points in pairs

of mammographic images. IEEE Transactions on Image Processing, 6(10), pp.1388-1399.

Wang, T. and Karayiannis, N. (1998). Detection of microcalcifications in digital mammograms using

wavelets. IEEE Transactions on Medical Imaging, 17(4), pp.498-509.

Wang, X., Zheng, B., Good, W., King, J. and Chang, Y. (1999). Computer-assisted diagnosis of breast

cancer using a data-driven Bayesian belief network. International Journal of Medical Informatics, 54(2),

pp.115-126.

Webb, A. (2002). Statistical pattern recognition. West Sussex, England: Wiley.

www.nationalbreastcancer.org. (2017). Breast Tumors :: The National Breast Cancer Foundation.

[online] Available at: http://www.nationalbreastcancer.org/breast-tumors [Accessed 3 Sep. 2017].

Xie, W., Li, Y. and Ma, Y. (2016). Breast mass classification in digital mammography based on extreme

learning machine. Neurocomputing, 173, pp.930-941.

Yan, S., Wang, Y., Aghaei, F., Qiu, Y. and Zheng, B. (2017). Applying a new bilateral mammographic

density segmentation method to improve accuracy of breast cancer risk prediction. International Journal

of Computer Assisted Radiology and Surgery, 12(10), pp.1819-1828.

Yin, F. (1991). Computerized detection of masses in digital mammograms: Analysis of bilateral

subtraction images. Medical Physics, 18(5), p.955.

Yin, F. (1999). Computerized detection of masses in digital mammograms: Automated alignment of

breast images and its effect on bilateral-subtraction technique. Medical Physics, 21(3), p.445.

Yu, S. and Guan, L. (2000). A CAD system for the automatic detection of clustered microcalcifications

in digitized mammogram films. IEEE Transactions on Medical Imaging, 19(2), pp.115-126.

Zheng, B., Chang, Y. and Gur, D. (1995). Computerized detection of masses from digitized

mammograms: Comparison of single-image segmentation and bilateral-image subtraction. Academic

Radiology, 2(12), pp.1056-1061.

Zuiderveld, K. (1994). Contrast limited adaptive histogram equalization. In: Graphics Gems IV. San

Diego: Academic Press Professional, pp.474-485.