DEPARTMENT OF ELECTRICAL AND COMPUTER
ENGINEERING
COMPUTER AIDED BREAST CANCER
DETECTION USING TEMPORAL
MAMMOGRAMS
KOSMIA LOIZIDOU
A Dissertation Submitted to the University of Cyprus in Partial
Fulfillment of the MSc Requirements
May 2018
Dr. Costas Pitris
Assistant Professor, Department of Electrical & Computer
Engineering, Advisor
Dr. Christos Nicolaou
Radiologist, Co-advisor
©Kosmia Loizidou, 2018
VALIDATION PAGE
MSc Student: Kosmia Loizidou
Master Thesis Title: Computer Aided Breast Cancer Detection Using Temporal Mammograms
The present Thesis Dissertation was submitted in partial fulfillment of the requirements for the Degree
of Master of Science at the Department of Electrical and Computer Engineering and was approved
on the 24/05/2018 by the members of the Examination Committee.
Examination Committee:
Research Supervisor: Constantinos Pitris
Assistant Professor, Department of Electrical and Computer Engineering
Committee Member: Theocharis Theocharides
Assistant Professor, Department of Electrical and Computer Engineering
Committee Member: Constantinos Pattichis
Professor, Department of Computer Science
DECLARATION
The present Master dissertation was submitted in partial fulfillment of the requirements for the degree of
Master of Science of the University of Cyprus. It is a product of my own original work, unless
otherwise mentioned through references, notes, or any other statements.
Kosmia Loizidou
…………………………………..
Dedicated to Petros,
Thank you for believing in me
ABSTRACT
Breast cancer remains one of the deadliest cancers worldwide for women over 40 years old. Early
detection is crucial in order to minimize damage and discomfort and to provide a potential cure.
Mammography is the most reliable screening tool for identifying signs of malignancy or abnormality in
general. Computer Aided Diagnosis (CAD) systems are tools that can assist radiologists in detecting and
classifying mammographic abnormalities. Such systems are needed because of the radiologists’ high
error rate in spotting cancer. In the literature, various algorithms have been proposed that evaluate
mammograms and identify subtle and complicated cancers that could otherwise be missed by human
observers.
In this work, a novel technique for the diagnosis of breast micro-calcifications on a temporal pair of
mammograms is introduced. Micro-calcifications are microscopic calcifications that appear in clusters
with a higher intensity level than their surroundings and are difficult to recognise. The goal of this
work is to develop an innovative CAD system for the detection and classification of
micro-calcifications.
The proposed approach began with the creation of the dataset, which contained eight temporal pairs.
This was followed by the normalization of the mammograms and the pre-processing step, which removed
the irrelevant regions of each mammogram without eradicating other important details of the image. The
second step consisted of the registration of the prior and current mammograms for an efficient
subtraction. We tested two algorithms: Affine and Demons. Our experimental results demonstrated that
the mean residual with Affine was approximately 10% higher than the residual with Demons, for both
dense and fatty mammograms, so Demons registration was selected. The third step involved the
subtraction of the current and registered images and the post-processing steps. The subtracted image
was filtered with various filters in order to find the best one, then thresholded and further processed
with morphological operations.
The fourth step was the removal of the periphery regions that did not correspond to
micro-calcifications. The range filter was chosen because it erased the high-intensity background while
the micro-calcifications remained. Then, we eliminated the old micro-calcifications that had not been
removed by the preceding subtraction of the mammograms and created the ground truth images, based on
the radiologist’s observations, for the evaluation part. We evaluated the proposed methodology and
found 280 false positives among the 379 detected ROIs. The F1-score was limited to 0.301, and for that
reason we introduced machine learning into our study to eliminate false positives.
For the classification part, 13 first-order statistics (FOS) and shape features were extracted from both
the current and the subtracted images for every ROI, and the best features were selected using
statistical analysis and multivariate analysis of variance. The features extracted from the current
image had smaller p-values, and the best combination of features contained only 10 of them. With
discriminant analysis, we used leave-one-patient-out validation to divide the dataset into a training
set and a test set. The training set was used to train the classifier and the test set to evaluate it.
The accuracy was 83.245%, the sensitivity 96.9% and the specificity 78.6%. Even though the results
indicate that the proposed algorithm is an effective tool for the detection of breast
micro-calcifications using temporal mammograms, additional studies must be carried out to improve the
diagnostic value of the algorithm.
ACKNOWLEDGMENTS
I take this opportunity to thank all those who have contributed to this work.
First, I would like to express sincere gratitude to my thesis advisor, Assistant Professor Constantinos
Pitris of the ECE Department at the University of Cyprus. He consistently allowed this thesis to be my
own work but steered me in the right direction whenever he thought I needed it. He provided me with
the knowledge and motivation to handle difficult situations with confidence and courage.
I would also like to thank my co-advisor, Dr. Christos Nicolaou, who was the radiologist for this work.
Dr. Nicolaou allowed me to be part of this project and helped me with the medical terms and the creation
of the dataset. Without his guidance and input, this work could not have been successfully conducted.
Besides my advisors, I would like to thank the rest of my thesis committee for their insightful comments
and for their encouragement to broaden my work.
Last, but not least, I would like to thank my people: my parents, Stavros and Melani, my brother
Kleanthis, and my friend and colleague Rafaella, for supporting me spiritually throughout the writing of
this thesis and in my life in general.
Table of Contents
ABSTRACT
ACKNOWLEDGMENTS
1 INTRODUCTION
  1.1 RESEARCH PROBLEM
  1.2 BREAST CANCER
  1.3 CAD MAMMOGRAPHY
2 REVIEW OF THE LITERATURE
  2.1 PROCESSING OF MAMMOGRAMS
  2.2 BILATERAL SUBTRACTION
  2.3 TEMPORAL ANALYSIS
  2.4 DETECTION AND CLASSIFICATION OF MICRO-CALCIFICATIONS
  2.5 SCOPE
3 METHODOLOGY OF THE PROPOSED ALGORITHM
  3.1 DETECTION OF ABNORMAL ROIS
    3.1.1 Computer-Aided Diagnosis System Pipeline
    3.1.2 Dataset
    3.1.3 Normalization
    3.1.4 Pre-processing
    3.1.5 Registration
    3.1.6 Temporal Subtraction
    3.1.7 Post-processing
    3.1.8 Removal of the periphery pixels
    3.1.9 Removal of the old micro-calcifications
    3.1.10 Evaluation of the proposed algorithm
  3.2 ELIMINATION OF FALSE POSITIVES
    3.2.1 Feature Extraction and Selection
    3.2.2 Classification
    3.2.3 Evaluation of the classification
4 RESULTS
  4.1 DETECTION OF ABNORMAL ROIS
    4.1.1 Pre-processing
    4.1.2 Registration
    4.1.3 Temporal Subtraction
    4.1.4 Post-processing
    4.1.5 Removal of the Periphery Pixels
    4.1.6 Removal of the old micro-calcifications
    4.1.7 Evaluation of the proposed algorithm
  4.2 ELIMINATION OF FALSE POSITIVES
    4.2.1 Feature Extraction and Selection
    4.2.2 Classification
    4.2.3 Evaluation of the classification
5 DISCUSSION
6 CONCLUSION AND FUTURE WORK
REFERENCES
FIGURES
Figure 1.1: Structure of the breast (Beura, 2016)
Figure 1.2: The breast mass in mammogram (a) benign mass with smooth shape (b) malignant mass with irregular shape (Li et al., 2017)
Figure 1.3: Two types of view of the breast imaging (a) Left CC view (b) Left MLO view (Nicosia Diagnostic Centre)
Figure 1.4: Two types of mammograms (a) Fatty mammogram (b) Dense mammogram
Figure 1.5: Evaluation methodology of the CAD algorithm (Oliver et al., 2010 p. 100)
Figure 2.1: Features extracted from mammogram images (a) normal, (b) benign and (c) malignant (Ganesan et al., 2013 p. 85)
Figure 2.2: Three areas of the breast region (Mendez et al., 1998 p. 958)
Figure 2.3: Thresholded images of the right [(a),(c)] and left breasts [(b),(d)] with cut off values 25% [(a),(b)] and 35% [(c),(d)] (Yin, 1991 p. 957)
Figure 2.4: Comparison of the detection performances obtained with the nonlinear and linear subtraction methods (Yin, 1991 p. 962)
Figure 2.5: Manually registered borders of the right and left breasts (Yin, 1991 p. 956)
Figure 2.6: Processing steps and information flow between processing steps used to identify potential control points and establish their correspondence (Vujovic & Brzakovic, 1997 p. 1388)
Figure 2.7: Regional registration technique (Sanjay-Gopal et al., 1999 p. 2671)
Figure 2.8: Consistent landmarks in the CC and ML ‘idealized’ outlines (Marias et al., 2005 p. 3)
Figure 2.9: Types of micro-calcifications’ distribution (Smithuis, R. and Pijnappel, 2008)
Figure 3.1: Computer-aided diagnosis system pipeline
Figure 3.2: Example of temporal pairs of mammograms (a) current (b) prior
Figure 3.3: Detailed representation of the proposed algorithm
Figure 4.1: Example of the clear border removal in two cases (a) normalized mammograms with red circle in the malignancy (b) border removal
Figure 4.2: Example of the Gamma correction in two cases (a) border removal (b) Gamma correction
Figure 4.3: Affine registration in two cases (a) current mammogram (b) prior mammogram (c) subtracted image
Figure 4.4: Box plot for dense mammograms
Figure 4.5: Box plot for fatty mammograms
Figure 4.6: Displacement field for Demons registration in the 1st example
Figure 4.7: Displacement field for Demons registration in the 2nd example
Figure 4.8: Demons registration in two cases (a) current mammogram (b) prior mammogram (c) subtracted image
Figure 4.9: Box plot for dense and fatty mammograms
Figure 4.10: Comparison of dense mammograms
Figure 4.11: Comparison of fatty mammograms
Figure 4.12: Box plot of the contrast ratio
Figure 4.13: STD filter (a) subtracted image (b) filtered image
Figure 4.14: CLAHE (a) subtracted image (b) filtered image
Figure 4.15: Range filter (a) subtracted image (b) filtered image
Figure 4.16: Thresholding (a) filtered image (b) thresholded
Figure 4.17: Morphological operations (a) thresholded image (b) new image
Figure 4.18: Resulted image for example 1
Figure 4.19: Resulted image for example 2
Figure 4.20: Images with removed periphery pixels
Figure 4.21: Ground truth images
Figure 4.22: New binary image
Figure 4.23: True micro-calcifications
Figure 4.24: False negative areas
TABLES
Table 1.1: Model-based mammographic detection and/or segmentation techniques (Oliver et al., 2010 p. 100)
Table 2.1: Listing of popular feature extraction and classification methods (Ganesan et al., 2013 p. 90)
Table 2.2: Listing of popular feature extraction and classification methods (Ganesan et al., 2013 p. 90)
Table 2.3: Comparison of Bilateral Subtraction methods
Table 2.4: Comparison of registration techniques in temporal mammograms
Table 2.5: Single and temporal features (Ma et al., 2014 p. 1264)
Table 2.6: Comparison of Temporal Analysis techniques in mammograms
Table 2.7: Comparison of detection and classification methods of micro-calcifications
Table 3.1: Distribution of our testing dataset
Table 3.2: Features extracted from both subtracted and current image
Table 3.3: Confusion matrix
Table 4.1: Evaluation of the algorithm
Table 4.2: Features extracted from current image
Table 4.3: Features extracted from subtracted image
Table 4.4: T-test results for current image
Table 4.5: T-test results for subtracted image
Table 4.6: MANOVA results for current image
Table 4.7: MANOVA results for subtracted image
Table 4.8: Selected features for the classification step
Table 4.9: Classification results
Table 4.10: Evaluation of the classifier
Table 5.1: Performance of different methods
1 Introduction
1.1 Research Problem
At present, there are no effective techniques to prevent breast cancer because its cause remains
unknown. Although radiologists can improve women’s chances through early-stage diagnosis from
mammograms, wrong assessments are inevitable (Tang et al., 2009). With this in mind, computer-aided
diagnosis (CAD) systems, which use computer technologies to recognize abnormalities, have been
implemented to support and assist radiologists.
1.2 Breast Cancer
Breast cancer is the most common cancer in women in the European Union and the United States (Oliver
et al., 2010). In 2007, the American Cancer Society published a study suggesting that between one in
eight and one in twelve women will develop breast cancer at least once during their lifetime. Breast
cancer remains the number one cause of death in women older than 40 years of age (Oliver et al., 2010).
The breast is composed of lobules, which are the glands that produce milk, ducts, fatty and connective
tissue, and blood and lymphatic vessels (Beura, 2016). In breast cancer, carcinogenesis leads to an
uncontrolled growth of breast cells in a lump, usually forming a tumor (Ganesan et al., 2013). The
process by which the cancer starts and later develops can vary between patients. A large percentage of
breast cancers begin in the ducts and are called ‘ductal cancers’, while others start in the glands and
are named ‘lobular cancers’. When the cancer has spread to the fatty tissue of the breast and to other
organs of the body, it is called invasive, and this type of cancer is the riskiest for the patient’s
health (Beura, 2016). Other types of breast cancer appear more rarely (Cancer.org, 2016).
Figure 1.1: Structure of the breast (Beura, 2016)
However, it should be noted that the mortality of breast cancer has shown a downward trend over the past
decade among women of all ages, due to the introduction of mammography screening and the discovery
of alternative and more efficient treatments (Oliver et al., 2010). As mentioned above, a tumor is a
mass of abnormal tissue, and in breast cancer there are two major categories of tumors: the
non-cancerous, called ‘benign’, and the cancerous, called ‘malignant’. The first category contains the
harmless tumors. In most cases when a patient is diagnosed with this kind of tumor, the doctor will
prefer to leave it alone instead of removing it, because it will not spread to other areas.
Occasionally, however, some of these tumors grow into the surrounding tissue, causing pain to the
patient, and the doctor will then remove the tumor to avoid further expansion. The second category
refers to dangerous and unstable tumors, which can invade and damage the surrounding tissue and organs.
In those cases, the doctor will perform a biopsy to determine how aggressive the tumor is
(www.nationalbreastcancer.org, 2017). A breast that contains a lot of fibrous or glandular tissue and
not much fatty tissue is called dense. On the other hand, a breast composed entirely of fatty tissue is
called fatty. Dense mammograms are more difficult to evaluate because dense tissue looks white, the
same as the tumors and abnormalities in a mammogram (Cancer.org, 2016).
Numerous types of abnormalities can be found in a mammogram, and they are usually associated with
asymmetries between the left and right breasts, distortion of the normal architecture of the tissue,
the appearance of micro-calcifications in the breast, and masses of various sizes and shapes. In most
cases, the left and right breasts are almost symmetrical; for this reason, any asymmetric area can
reveal a developing mass or a variation of normal breast tissue. An architectural distortion disrupts
the normal tissue pattern, resulting in abnormal regions. Micro-calcifications are microscopic
calcifications that commonly show up in clusters and are evaluated according to their characteristics,
such as shape and size. Similarly, a breast mass is a localized swelling in the breast, which is also
characterized by these properties (Oliver, 2007).
1.3 CAD Mammography
For the detection of breast cancer, as well as other kinds of breast abnormalities, radiologists
worldwide use mammography as the key screening tool (Oliver et al., 2010). This method involves X-ray
imaging of the two breasts, with the images stored on film or in digital format (Ganesan et al., 2013).
Mammography is the process of applying low-energy X-rays to image the breast in order to discover
abnormal areas. A beam of X-rays passes through the breast and, depending on the density of the tissue,
a percentage of the rays is absorbed. The remaining rays pass through a detector to the photographic
film and form a gray-scale image, which is generally known as a film-based mammogram. From this image,
a digital mammogram is reconstructed. Mammograms are taken from two different projections for each
breast: the Cranio-Caudal (CC) and the Medio-Lateral Oblique (MLO); two examples are shown in Figure
1.3. The CC view is taken from the head down and the pectoral muscle is usually not visible, while the
MLO view is taken from the side at an oblique angle and can show the pectoral muscle (Beura, 2016).
Figure 1.2: The breast mass in mammogram (a) benign mass with smooth shape (b) malignant mass with
irregular shape (Li et al., 2017)
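The absorption step described above can be illustrated with the Beer-Lambert law, which gives the fraction of a monoenergetic beam that passes through a uniform slab of tissue. The following Python lines are only a minimal sketch under simplifying assumptions: the function name `transmitted_fraction` is hypothetical, and the attenuation coefficients and thickness are illustrative placeholders, not measured values from this work.

```python
import math


def transmitted_fraction(mu_per_cm, thickness_cm):
    """Fraction of a monoenergetic X-ray beam that passes through a uniform
    slab, following the Beer-Lambert law I / I0 = exp(-mu * x)."""
    return math.exp(-mu_per_cm * thickness_cm)


# Hypothetical attenuation coefficients (illustrative only, not measured values).
MU_FATTY = 0.5   # 1/cm, assumed for fatty tissue
MU_DENSE = 0.8   # 1/cm, assumed for dense (fibroglandular) tissue
THICKNESS = 4.0  # cm, assumed thickness of the compressed breast

fatty = transmitted_fraction(MU_FATTY, THICKNESS)
dense = transmitted_fraction(MU_DENSE, THICKNESS)
# Less radiation gets through the denser tissue, so it appears brighter on the image.
```

Under these assumed values, the dense slab transmits a smaller fraction of the beam than the fatty slab, which is consistent with dense tissue appearing white on a mammogram.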
Figure 1.3: Two types of view of the breast imaging (a) Left CC view (b) Left MLO view (Nicosia
Diagnostic Centre)
Figure 1.4: Two types of mammograms (a) Fatty mammogram (b) Dense mammogram
Digital mammography, which is extensively used today, provides an electronic image of the breast stored
as a computer file. Thus, the information can be enhanced, magnified or processed easily, and the
radiologist can adapt, store and retrieve the digital images electronically (National Cancer Institute,
2016). After the mammograms are acquired, an expert radiologist reviews the scans or files to determine
whether the patient has cancer, followed by appropriate disease management depending on the case.
Unfortunately, even an experienced and well-trained radiologist can make a wrong assessment or miss a
variety of abnormalities (Oliver et al., 2010). Recent studies have shown that radiologists have an
error rate between 10% and 30% in spotting cancer. These high error rates result in unwarranted
procedures, patient discomfort and unnecessary expenditure for hospitals and medical centres. As a
result, the idea of developing computer systems that could assist radiologists in detecting and
classifying breast cancers has been promoted (Ganesan et al., 2013).
A Computer-Aided Detection (CAD) system is a radiological device that includes a set of automatic
or semi-automatic tools to aid radiologists in the detection and classification of mammographic
abnormalities (Oliver et al., 2010). The main objective of such a system is to identify a larger
number of subtle and complicated cancers that might otherwise be missed by radiologists due to a
lack of expertise (Lehman et al., 2015). CAD systems are based on computational intelligence (CI)
techniques dedicated to detecting breast cancer from mammograms. To date, most CAD systems are
not designed to decide independently about abnormalities related to tumors and therefore require
human intervention to finalise the results and identify the problem. The main components of CAD
systems, such as pre-processing, breast region segmentation, feature extraction and classification,
employ a variety of CI techniques (Ganesan et al., 2013). Although the USA Food and Drug
Administration (FDA) approved CAD for mammography in 1998, three years later only a small percentage
(5%) of clinics and hospitals in the USA used it. Nevertheless, by 2008 this percentage had increased
dramatically (74%), with almost all of the screening mammograms in the USA now involving CAD
(Lehman et al., 2015).
Figure 1.5: Evaluation methodology of the CAD algorithm (Oliver et al., 2010 p. 100)
It is generally accepted that radiologists can miss important signs in a mammogram if they do not
analyse it correctly and with the right tools. With this in mind, mammograms are often evaluated by
two independent radiologists, a procedure called ‘double reading’, which provides greater
sensitivity than a single reading. At the same time, double reading does not increase the recall rates.
However, double reading of mammograms is a challenging, almost impossible task, especially in
settings with limited human resources. For this reason, CAD is often used as a second reader (Ganesan
et al., 2013). The idea of breast cancer detection using both a CAD system and a radiologist is
straightforward. Initially, after the mammogram is acquired, the radiologist examines the images, looks
for any suspicious signs and then completes an assessment. Next, the CAD system identifies potential
abnormalities by marking specific regions on the image, to assist the radiologist with the final
decision (Fenton et al., 2011). This process acts as a second reading. Finally, the radiologist reviews
the areas identified by the CAD system and determines whether additional evaluation is warranted
(Lehman et al., 2015). With the use of CAD, the number of radiologists needed for double reading of
mammograms decreases significantly, which can be extremely useful in less developed countries.
The challenge of CAD system research is to automate the process of mammographic screening so that
tumors are detected and categorised automatically, without intervention from the radiologists.
Nonetheless, CAD is only as effective as its computer software (Ganesan et al., 2013, Berry, 2011).
Computational time improves with increasing computer speed, but the ultimate goal is to create CAD
systems that learn from previous decisions and correctly distinguish malignant from non-malignant
lesions. However, such algorithms are hindered by numerous challenges. Currently, there are so many
different approaches that if two CAD systems analyse the same mammogram, their responses may differ
completely. Ultimately, CAD systems should be able to find a tumor before it even becomes symptomatic
in the mammogram (Berry, 2011). The success of automated mammography analysis depends on
detection (discovering potential lesions within the background) and segmentation (accurately defining
the outline of a potential lesion) techniques. Mass detection can be achieved either using a
single view and relying on the differences between the pixels in each area, or using multiple views and
comparing more than one mammographic image of the same person. Segmentation techniques are
categorised into supervised and unsupervised methods. Supervised methods use a priori information
and/or user intervention to decide the boundaries of specific regions in the image. Unsupervised
methods divide the image into segments and then classify each segment based on specific properties such
as texture or intensity variations. These approaches are further divided into three groups:
region-based methods, where the image is separated into homogeneous and spatially connected regions,
contour-based methods, which manipulate the boundary of the regions, and clustering methods, which
organise pixels into groups that have the same properties. More information will be presented in the
following sections.
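As a rough illustration of the clustering idea described above, the following minimal Python sketch groups pixels by gray level with a plain k-means loop. It is not the method used in this work: the function name `kmeans_intensity_segmentation`, the number of clusters and the synthetic toy image are assumptions made only for the example.

```python
import numpy as np


def kmeans_intensity_segmentation(image, n_clusters=2, n_iter=20):
    """Cluster pixels by gray level with a plain 1-D k-means.

    Returns a label image (same shape as `image`) and the cluster centres.
    """
    pixels = image.astype(float).ravel()
    # Initialise the centres evenly between the darkest and brightest pixel.
    centres = np.linspace(pixels.min(), pixels.max(), n_clusters)
    for _ in range(n_iter):
        # Assign every pixel to its nearest centre.
        labels = np.argmin(np.abs(pixels[:, None] - centres[None, :]), axis=1)
        # Move each centre to the mean of its pixels (keep it if the cluster is empty).
        new_centres = np.array([
            pixels[labels == k].mean() if np.any(labels == k) else centres[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels.reshape(image.shape), centres


# Toy 6x6 "image": dark background with a brighter 2x2 patch (synthetic values).
toy = np.full((6, 6), 40.0)
toy[2:4, 2:4] = 200.0
labels, centres = kmeans_intensity_segmentation(toy, n_clusters=2)
```

On this toy image the bright patch ends up in one cluster and the background in the other. Real mammograms would need pre-processing and usually more than intensity alone (for example texture features) before clustering gives useful regions.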
Clinical evaluation of CAD algorithms is necessary to establish whether those systems are accurate in
detecting mass abnormalities with minimum error. For that purpose, the results obtained by the
CAD detection and segmentation algorithms are compared with the ‘gold standard’, which is the
analysis of the radiologists. Both the radiologists and the CAD system label each mammogram as
normal or abnormal. The radiologists provide binary images, which include both the outline of a mass
and its details. In contrast, the automatic algorithms give a probability image identifying the different
sections as having a high or low probability of being a mass rather than normal tissue (Oliver et al.,
2010). Previous literature suggests that the performance of CAD is not conclusive enough to warrant
clinical use, but the results are encouraging enough to warrant further research (Ganesan et al., 2013).
In the United States, CAD is applied to a large percentage of screening mammograms, with an annual
cost of approximately 30 million dollars (Fenton et al., 2011). The main problem is the significant
number of false positive detections of masses, detected only in one view (Oliver et al., 2010). It is important to note, though, that
with the use of CAD, the accuracy of cancer detection has slightly increased. The popularity of these
systems remains high, but they cannot replace human radiologists (Ganesan et al., 2013). In film-screen
mammography, CAD methods can improve the specificity and the positive predictive value (PPV), but
there is no evidence that the detection rate of invasive breast cancer and the sensitivity can be elevated.
In addition, the impact on breast cancer mortality remains almost the same, with or without CAD
(Fenton et al., 2011). This is because algorithms aiming to improve the mortality rate require many
decades to optimize (Fenton et al., 2013). In some cases, biopsy recommendations declined. Clearly, the
aim is the early detection of high-risk cancers by CAD through sensitivity improvement (Fenton et al., 2011).
The limitations of the state-of-the-art algorithms should be taken into consideration. These include the
small numbers of women who took part in some studies and their ages, which do not form a representative
sample. Furthermore, out-dated film-screen mammograms were used and the radiologists were not
comfortable with this approach in the early studies (Lehman et al., 2015). Film-screen mammograms were
digitized before the CAD analysis, but this process probably introduced noise and affected the
performance. However, some studies have shown that digital and film-screen environments respond
similarly to the CAD system (Fenton et al., 2011). One way to improve the performance of the CAD systems is to combine
different classifiers. Every classifier has its own unique properties and a window over which it performs
best (Ganesan et al., 2013). In the literature, there are studies where the classifiers were combined
in parallel, serial or cascading schemes, with their output probabilities multiplied or summed, and
which indicated the possibility of improvement in further developments (Oliver et al., 2010). A major
concern in pattern recognition is the fact that a classifier which is designed for a specific set of
data may not be useful for another set, due to ‘over-fitting’. The combination of classifiers could
provide a better generalization capability for the recognition system. In addition, even senior and
junior radiologists in the same workplace respond very differently when they use CAD approaches. Thus,
the radiologist’s experience with CAD plays a crucial role and the overall performance can vary (Fenton et al., 2011).
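The combination of classifier outputs by multiplication or summation, mentioned above, can be illustrated with a minimal sketch. The function name, the two-class layout (normal, mass) and the example probabilities are hypothetical and are not taken from the cited studies; the sketch only shows how the product and sum rules fuse posterior estimates.

```python
import numpy as np

def combine_posteriors(posteriors, rule="sum"):
    """Fuse posteriors of shape (n_classifiers, n_classes) into one estimate."""
    p = np.asarray(posteriors, dtype=float)
    if rule == "sum":
        fused = p.sum(axis=0)      # sum rule: add the class probabilities
    elif rule == "product":
        fused = p.prod(axis=0)     # product rule: multiply the class probabilities
    else:
        raise ValueError("rule must be 'sum' or 'product'")
    return fused / fused.sum()     # renormalise so the fused scores sum to 1

# Hypothetical outputs of three classifiers for the classes (normal, mass).
outputs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
fused_sum = combine_posteriors(outputs, "sum")
fused_prod = combine_posteriors(outputs, "product")
```

In this illustrative case both rules favour the mass class, although the first classifier alone would have chosen the normal class; the product rule gives the more extreme fused score.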
In summary, the main purpose of CAD systems is to serve as double reading machines. At the same
time, CAD appears to be a necessity in less developed countries, which do not have many expert
radiologists. In those cases, CAD systems can be very helpful and improve the diagnostic accuracy
(Ganesan et al., 2013). The long-term effect of CAD on the detection of breast cancer in screening
mammography demands further investigation (Fenton et al., 2013). The results from past reviews
indicate that there are still unanswered questions related to detection, segmentation, sensitivity,
specificity, mortality rate etc., which make automatic mass detection using CAD an active research field
(Oliver et al., 2010). CAD techniques are very popular in the United States for several reasons.
Firstly, the most apparent reason is that CAD is built into digital mammography equipment, which is
common in the USA. The second reason is financial, since the equivalent return for CAD in 2008 was
16.50 dollars. Finally, the readers are comfortable with the CAD system, even though that does not
guarantee that the system is going to perform without any errors (Berry, 2011). Currently, CAD systems
are not ready to be used as independent machines which recognize and detect mass abnormalities in
mammograms. With a deeper understanding of this field, it is almost certain that they will one day
assume a bigger role. More research in CAD systems is needed in order to ensure that the advantages
outweigh the disadvantages (Fenton, 2015).
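Since sensitivity, specificity and the positive predictive value (PPV) recur throughout this discussion, the following minimal sketch shows how they are obtained from the counts of true and false detections. The counts used in the example are hypothetical and do not come from any of the cited studies.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, PPV) from detection counts."""
    sensitivity = tp / (tp + fn)   # fraction of cancers that were flagged
    specificity = tn / (tn + fp)   # fraction of normal cases left unflagged
    ppv = tp / (tp + fp)           # fraction of flagged cases that were cancers
    return sensitivity, specificity, ppv

# Hypothetical screening population: 10 cancers among 1000 mammograms.
sens, spec, ppv = screening_metrics(tp=8, fp=90, tn=900, fn=2)
```

With these illustrative counts the sensitivity is 0.8 while the PPV is below 0.1, which shows how a modest false positive rate can dominate the results when the disease is rare in the screened population.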
Table 1.1: Model-based mammographic detection and/or segmentation techniques (Oliver et al., 2010 p. 100)
Model-based detection and/or segmentation techniques
Author Year
Lai et al. (1989) 1989
Kegelmeyer (1992) and Kegelmeyer et al. (1994) 1992
Ng and Bischof (1992) 1992
Karssemeijer (1994,1999) and Karssemeijer and te Brake (1996) 1994
Stathaki and Constantinides (1994) 1994
Tarassenko et al. (1995) 1995
Calder et al. (1996) 1996
Chang et al. (1996) 1996
Che et al. (1996) 1996
Diahi et al. (1996) 1996
Li et al. (1996,2001) 1996
Kalman et al. (1997) 1997
Jiang et al. (1998) 1998
Te Brake and Karssemeijer (1998, 1999) 1999
Zwiggelaar et al. (1998,1999) 1998
Constantinides et al. (1999,2000,2001) 1999
Morrison and Linnett (1999) 1999
Christoyianni et al. (2000) 2000
Hatanaka et al. (2001) 2001
Liu et al. (2001) 2001
Lo et al. (2002) 2002
Youssry et al. (2003) 2003
Campanini et al. (2004) 2004
Cheng and Cui (2004) 2004
Hassanien et al. (2004) 2004
Oktem and Jouny (2004) and Ali and Hassanien (2006) 2004
Mousa et al. (2005) 2005
Oliver et al. (2006) and Freixenet et al. (2008) 2006
Sakellaropoulos et al. (2006) 2006
Szekely et al. (2006) 2006
2 Review of the Literature
At the present time, the literature is focused on the mathematical techniques used in breast cancer
detection with CAD and, more specifically, on algorithms that include pre-processing approaches,
feature extraction and selection, and classification methodologies. This section is devoted to these
aspects due to their importance (Ganesan et al., 2013).
2.1 Processing of Mammograms
The contrast of the mammograms is very important for mass detection. For this reason, pre-processing
of the mammograms is performed first, in order to improve it. Other key issues for breast cancer
detection from mammograms are the removal of noise from the images, the segmentation of the breast
region from the muscles and the extraction of the suspicious regions. Denoising and enhancement of
mammograms affect both basic stages in mass detection: the manual analysis by the radiologist and
the second reading stage by the CAD system (Giger, Karssemeijer and Armato, 2001; Hackshaw
and Paul, 2003; Cady and Chung, 2005).
In general, the contrast in mammograms varies between the normal tissues and the malignant ones.
In small lesions and tumors especially, it becomes difficult for the radiologists to clearly visualize and
compare the normal and cancerous tissues. From a mathematical point of view, this can be explained by
the linear absorption coefficients, which define the image’s contrast through the Beer-Lambert law,
$I = I_0 e^{-ax}$. The law relates $I_0$, the intensity of the incident electromagnetic wave, to $I$,
the intensity of the transmitted electromagnetic wave, where $a$ is the linear absorption coefficient
and $x$ is the thickness traversed. For small tumors (small $x$) the difference in the intensity is very
small, which makes such tumors hard to detect (Scharcanski and Jung, 2006). One approach is to find
areas in the images where the local contrast varies (Karssemeijer, 1993). This technique improves the
detection in mammograms by taking a neighbourhood of an image location and then calculating the local
contrast as
$c(x, y) = f(x, y) - \mathrm{median}(x, y)$
where $c(x, y)$ is the local contrast, $f(x, y)$ is the image gray level and $\mathrm{median}(x, y)$ is
the median gray level inside the neighbourhood of $(x, y)$. This equation can also be considered as a high-pass spatial filter.
In addition, linear stretching has been presented for image enhancement, with linear or nonlinear
mapping of wavelet coefficients (Mallat and Zhong, 1992). Based on linear stretching, another approach,
called local range modification, processes the image twice: first to find the local parameters and then
to enhance the contrast (Fahnestock and Schowengerdt, 1983).
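The local contrast measure defined above, the gray level minus the median of its neighbourhood, can be sketched as follows. The square neighbourhood, its size and the edge-replication padding at the image border are assumptions made for illustration; the cited work may define the neighbourhood differently.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_contrast(image, size=3):
    """c(x, y) = f(x, y) - median over a size x size neighbourhood (size odd)."""
    f = np.asarray(image, dtype=float)
    pad = size // 2
    # Border handling by edge replication is an assumption of this sketch.
    padded = np.pad(f, pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    return f - np.median(windows, axis=(-2, -1))

image = np.full((5, 5), 10.0)
image[2, 2] = 50.0                 # one bright pixel on a flat background
contrast = local_contrast(image)
```

On the flat background the contrast is zero, while the isolated bright pixel keeps its full excess over the background, which is the high-pass behaviour noted in the text.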
For denoising, besides filtering, another method is the Bayesian estimator-based discriminator, which
separates the image and noise by assuming a priori additive Gaussian noise (Simoncelli and Adelson,
1996). Furthermore, wavelet-based techniques are used for denoising and extracting important features
from mammograms. These methods are based on both Discrete Wavelet Transforms and Continuous
Wavelet Transforms and they result in high rates of success (Heinlein, Drexl and Schneider, 2003;
Gonzalez and Woods, 1992). Iris filters, which are adaptive filters used mainly for enhancing rounded
opacities regardless of their contrast, have also been used. These filters use an orientation map with
gradient vectors (Kobatake et al., 1999). Xie, Li and Ma (2016) proposed a new method for the
pre-processing of mammograms, which included the removal of the pectoral muscle with a combination of
the circular and linear Hough transform. A logarithmic transformation is applied in some cases, to raise
the dynamic range in the dark areas of the mammogram and correct the low contrast regions inside the
image (Arfan, 2017).
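The logarithmic transformation mentioned above can be sketched as a simple point operation. The scaling constant, which maps the maximum gray level onto itself, and the 8-bit range are assumptions of this sketch rather than details reported in the cited work.

```python
import numpy as np

def log_transform(image, max_value=255.0):
    """Expand dark gray levels with s = c * log(1 + r), keeping the range [0, max_value]."""
    f = np.asarray(image, dtype=float)
    scale = max_value / np.log1p(max_value)   # chosen so that max_value maps to itself
    return scale * np.log1p(f)

pixels = np.array([0.0, 10.0, 50.0, 255.0])   # hypothetical gray levels
enhanced = log_transform(pixels)
```

Dark values are raised strongly while the end points of the range are preserved, so differences between dark pixels occupy a larger share of the output range.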
Undoubtedly, there are many more techniques and algorithms which can be used for image enhancement.
Pre-processing is a very important step and the choice of methods depends on a large variety of
parameters.
The purpose of statistical pattern recognition is to group the features into categories and classify them
correctly (Jain, Duin and Jianchang Mao, 2000). The effectiveness of the approach depends on how
well the patterns from diverse classes can be separated and on the decision boundaries established
from the probability distributions of these patterns. As a consequence, the patterns and features that
are produced must be effective as well as accurate, in order to classify the data without errors
(Sickles, 1986; Burrell et al., 1996).
The most important goal is to represent the data with a reduced number of dimensions, which
maintain all the relevant information while rejecting the variables that do not contribute to the
effectiveness of the classification. There are two main methods to achieve that. The first one is
feature selection, which is used to choose features out of the available measurements. In contrast, the
second approach is called feature extraction and identifies features via a transformation of the
measurements to a lower dimensional feature space (Webb, 2002). In either case, the number of features
must be minimized to allow efficient and generalizable classification. Usually, these techniques include
a pre-processing step, outlier removal, data normalization and the handling of missing data. There are
several techniques which can be used to estimate the unknown parameters with the help of known feature
vectors, such as Bayesian inference, maximum likelihood estimation, maximum entropy estimation
and others. For the problem of missing data, the probability density function (pdf) can be estimated with
the expectation maximization algorithm (Theodoridis and Koutroumbas, 2006). If the probability
density functions of the data are available, then the extraction of important features from them is
possible.
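As an illustration of feature extraction by a transformation to a lower dimensional space, the sketch below uses principal component analysis (PCA). PCA is chosen here only as a familiar example of such a transformation; it is not named in the text above, and the synthetic measurements are hypothetical.

```python
import numpy as np

def pca_extract(features, n_components):
    """Project mean-centred measurements onto their leading principal axes."""
    x = np.asarray(features, dtype=float)
    centred = x - x.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(centred, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return centred @ vt[:n_components].T, explained[:n_components]

# Three strongly correlated synthetic measurements driven by one latent factor.
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 1))
measurements = np.hstack([base, 2 * base, -base]) + 0.01 * rng.normal(size=(100, 3))
reduced, explained = pca_extract(measurements, n_components=1)
```

Because the three measurements are nearly redundant, a single extracted feature retains almost all of the variance, which is the situation in which dimensionality reduction loses little relevant information.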
It is important to note that the key features that represent mammographic images can be spectral, textural
and contextual. Spectral features describe the average tonal variations in various bands of the
intensity spectrum. Likewise, textural features contain information related to the spatial distribution of
tonal variations within a band, and contextual features include information obtained from blocks of
pictorial data surrounding the area being analysed. In mammographic images, the textural features are
the hardest to find. However, they are very important for the analysis of the image, since mammograms
are obtained using a single medium of acquisition. With this in mind, the textural features mentioned
above play an essential role in extracting critical and meaningful clues from the mammograms
(Haralick, Shanmugam and Dinstein, 1973). Wavelet features also contain useful data extracted from
the mammograms (Mousa, Munib and Moussa, 2005; Dhawan, Chitre and Kaiser-Bonasso, 1996).
Another approach for feature generation is based on geometric moments. An image can be represented
by its equivalent moments (Papoulis, 1965). The literature shows that gradient-based
measures, such as directional gradient features, can be very efficient in breast cancer detection
(Desautels, Rangayyan and Mudigonda, 2000). Furthermore, spiculation and morphological features
can identify malignant tissues or lesions. Spicules are radiating patterns of linear spikes surrounding the
irregularly shaped malignant densities (Karssemeijer and Brake, 1996). Morphological features are
related to the physical characteristics of the lesions. Additionally, difference and similarity
techniques, which depend on the availability of prior mammograms, can provide additional clues.
Specifically, the most recent mammographic images are compared with the prior ones in order to find
differences and similarities regarding the size or other characteristics of tumors (Timp, Varela and Karssemeijer, 2007). Arfan
(2017) extracted texture features from mammograms with Gabor filters, which are Gaussian
kernel functions modulated by a sinusoidal wave and can be applied at different scales, frequencies and
orientations. Finally, Galván-Tejada et al. (2017) extracted 37 features from every mammographic image
and used a genetic algorithm to analyse which features were the most important. Of these features,
eight were clinical and general data, eight were intensity based and calculated from the gray levels of
the pixels, thirteen were texture descriptors and eight were shape and location descriptors.
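A Gabor filter of the kind mentioned above can be sketched as a Gaussian envelope multiplied by a cosine carrier. The isotropic envelope, the use of the real part only and all parameter values are assumptions made for illustration; they are not the settings reported by Arfan (2017).

```python
import numpy as np

def gabor_kernel(size, sigma, wavelength, theta):
    """Real part of a Gabor filter: Gaussian envelope times a cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)      # coordinate along the carrier
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * x_rot / wavelength)
    return envelope * carrier

# A small bank with four orientations; scale and frequency are fixed here.
bank = [gabor_kernel(size=9, sigma=2.0, wavelength=4.0, theta=t)
        for t in np.linspace(0.0, np.pi, 4, endpoint=False)]
```

Texture features would then typically be obtained by convolving the image with every kernel of the bank and summarising each response, for example by its mean and variance.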
To sum up, there are several feature generators which can be used for breast cancer detection. The
difficulty with these methods is that the image properties are unique and that every generator provides
only specific information, which can be used by the classifiers (Ganesan et al., 2013).
The next step in breast cancer detection using CAD is the development of a classifier. There are
numerous classifiers which can be used to categorise the data correctly and each one has its specific
characteristics and decision accuracy. Consequently, the choice of the right classifier is difficult and
requires careful consideration. Conditional probabilities are used to evaluate the efficiency of the
classifier. The classification of an object $x$ to the class $\omega_i$ implies that the posterior
probability $p(\omega_i|x)$ must be the highest among all classes in order for the classification to
have the minimum error (Ganesan et al., 2013). Several techniques have been tried for the estimation of
the probability densities, divided into parametric and nonparametric estimations. The first category
includes Mixture Modeling, Bayesian Inference and Maximum Likelihood Estimation, while the second one
involves Histogram Approximation (Devroye, Gyorfi and Lugosi, 1996).
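The minimum-error rule described above, assigning $x$ to the class with the highest posterior $p(\omega_i|x)$, can be sketched for a single feature with Gaussian class-conditional densities. The Gaussian model, the class means, standard deviations and priors are hypothetical; in a parametric approach they would be estimated from training data, for example by maximum likelihood.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Univariate normal density."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def posteriors(x, means, stds, priors):
    """Bayes rule: p(w_i | x) is proportional to p(x | w_i) * P(w_i)."""
    likelihoods = np.array([gaussian_pdf(x, m, s) for m, s in zip(means, stds)])
    joint = likelihoods * np.asarray(priors, dtype=float)
    return joint / joint.sum()

# Hypothetical classes: 0 = normal tissue, 1 = mass, with one scalar feature.
post = posteriors(x=0.7, means=[0.3, 0.8], stds=[0.1, 0.1], priors=[0.9, 0.1])
decision = int(np.argmax(post))
```

Although the prior strongly favours normal tissue in this example, the observed feature value lies so close to the mass class that its posterior dominates and the sample is assigned to class 1.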
The techniques that have been used for mass detection from mammograms are the parametric multivariate
method, which is based on linear classifier theory (Ganesan et al., 2013); the hybrid classifier, based
on Adaptive Resonance Theory (ART); Linear Discriminant Analysis (LDA) (Hadjiiski et al., 1999;
Ganesan et al., 2013); and Support Vector Machines (SVM), which create a hyperplane to classify
linearly separable data of two or more classes (Cherkassky and Mulier, 1998). Neural networks
(Rumelhart, Hinton and Williams, 1986) with back-propagation have also been used for mass detection in
breast cancer (Dhawan, Chitre and Kaiser-Bonasso, 1996).
One more approach, related to probabilistic logic, is Fuzzy Logic. This technique is slightly different
from the others, since the truth-values are not strictly binary in nature (Bhattacharya and Das, 2007).
This specific approach is successful when the truth-values do not have a definite description. This can
be very helpful in breast cancer detection when noise occurs in mammograms (Kovalerchuk et al.,
1997). Similarly, Bayesian networks are probabilistic classifiers which can be applied to the
classification problem in order to find the most effective solutions. In CAD of breast cancer using
mammograms, they produce encouraging results (Wang et al., 1999). To develop a Bayesian network, it is
necessary to learn the network topology and then to estimate the marginal and conditional probabilities
(Viton, 1996).
Figure 2.1: Features extracted from mammogram images: (a) normal, (b) benign and (c)
malignant (Ganesan et al., 2013 p. 85)
Decision trees are nonlinear classifiers which categorize the data into classes. Specifically, they use
subsets of the data at different levels to classify them into classes, as an alternative to the whole
feature set (Lei Zhen and Chan, 2001). Another approach is k-Means Clustering, which divides the data
into k clusters so that the sum of squared differences is minimized. Then, each data point is assigned
to the class with the shortest Euclidean distance (Patel and Sinha, 2010). Since 2015, important studies
have appeared on the classification of malignant and benign masses in breast cancer with the Extreme
Learning Machine. This algorithm is a feed-forward neural network that randomly generates the connection
weights between input and hidden units and has an exceptional performance in terms of generalization (Xie, Li and Ma, 2016).
Moreover, the introduction of adaptive boosting brought an improvement to the simple classifiers
presented in the literature. This approach adopts an iterative procedure, in which every iteration
increases the weight of the misclassified data. Because of this, weak classifiers are forced to learn
more by training on difficult samples (Arfan, 2017). Galván-Tejada et al. (2017) assessed three
different classification techniques: Random Forest (RF), Nearest Centroid (NC) and K-Nearest
Neighbours (K-NN). While Random Forest is a non-linear supervised sparse regression-based
method, K-NN is a supervised instance-based method. In contrast, NC is a hybrid approach, which
combines an instance-based method with a statistical one. The results from the RF classifier
were more reliable and showed stability.
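The k-Means Clustering procedure described above, alternating between assignment to the nearest centre by Euclidean distance and re-estimation of the centres, can be sketched as follows. The fixed number of iterations, the explicit initial centres and the two-dimensional example points are assumptions made so that the sketch is deterministic.

```python
import numpy as np

def k_means(points, initial_centres, n_iter=20):
    """Plain k-means: assign each point to its nearest centre, then update centres."""
    x = np.asarray(points, dtype=float)
    centres = np.asarray(initial_centres, dtype=float).copy()
    for _ in range(n_iter):
        # Euclidean distance from every point to every centre.
        distances = np.linalg.norm(x[:, None, :] - centres[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        for k in range(len(centres)):
            if np.any(labels == k):            # keep the old centre if a cluster is empty
                centres[k] = x[labels == k].mean(axis=0)
    return labels, centres

# Hypothetical feature vectors forming two well separated groups.
points = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
labels, centres = k_means(points, initial_centres=[[0.0, 0.0], [5.0, 5.0]])
```

Each update of a centre to the mean of its assigned points cannot increase the within-cluster sum of squared distances, which is why the iteration settles on a local minimum of that criterion.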
As can be seen, every classifier has its own characteristics and can be applied to different problems to
provide specific results. In most cases, though, it is beneficial to use more than one classifier, so
combinations have been proposed to improve the accuracy. This is still the subject of further investigation.
To summarise, pre-processing, feature generation and extraction, and classification of mammograms are
key steps in mass detection in breast cancer. Every step can include a large variety of techniques, the
choice of which depends on the goal and the data set (Ganesan et al., 2013).
2.2 Bilateral Subtraction
Bilateral subtraction can be used in breast cancer detection with CAD systems. The main assumption of
this method is that the two breasts are almost identical and symmetric. Because of that, when
the mammograms are matched with the appropriate processing techniques and then subtracted, the
remaining asymmetries will point to possible masses (Yin, 1999). The CAD system, which expects
symmetry in the paired images, selects only the asymmetrical areas as hypothetical malignant mass
regions. In the literature, this approach seems to improve the system’s performance in specific
conditions and within the limited databases used (Zheng, Chang and Gur, 1995). The challenge of bilateral
subtraction is to find all the suspicious regions with asymmetries but to reject all of the areas
that are not masses. True masses have particular characteristics, which, in brief, are that their
contour is convex, their density is equal at the centre and the periphery, and they are displayed on at
least two different projections. Non-malignant asymmetries curve inwards and include dense elements at
different places.
Feature extraction and classification methods
Author | Methods | Accuracy [%]
Kimme et al. | Normalized statistics and texture features | 74
Petrosian et al. | Spatial Gray Level Dependence and textural features with a decision tree classifier | 76-89
Kinoshita et al. | Shape and texture features with a three layer feed-forward neural network | 81
Rangayyan et al. | Region based edge-profile acutance measure | 92
Polakowski et al. | Model based vision algorithm. Difference of Gaussians and texture features | 92
Priebe et al. | Fractal texture measures | 88
Sameti et al. | Optical density, photometric and textural features | 72
Chitre et al. | Texture measures with artificial neural network | 87
Mudigonda et al. | Gray level co-occurrence matrices, polygonal modelling with jack-knife classification | 83
Brijesh et al. | Statistical features with fuzzy neural network | 83
Yoshida et al. | Wavelet features in combination with a difference image technique | 90
Liyang Wei et al. | Statistical features in a multiple view mammogram with SVM and KFD | 85
Oliver A et al. | Eigenfaces approach | 82-90
Szekely et al. | Texture features and a combining classifier of decision trees and multiresolution Markov random models | 88-94
Alolfe et al. | Forward stepwise linear regression method with a combined classifier of SVM and LDA | 82.5-90
Table 2.1: Listing of popular feature extraction and classification methods (Ganesan et al., 2013
p. 90)
The computerized scheme of bilateral subtraction contains seven steps:
1. If the mammograms were on film, each pair of mammograms was digitized with a laser scanner to obtain
an image of specific pixel size.
2. The next step involved the segmentation of the mammograms to detect the breast border and nipple.
Two images were used to identify the breast border: the first one was the thresholded image and the
second a smoothed version of the original image. After that, five points, which divide the whole
image into three areas (Fig. 2.2), were automatically selected and a tracking algorithm was applied to
detect the border based on the gray levels outside and inside the breast. It is worth mentioning that
the tracking procedure is related to the area of the breast. For the nipple
detection, two methods were combined to provide higher accuracy. In the first method, the maximum
height of the breast was taken as the position of the nipple, while in the second a reference point
inside the breast was included.
3. The mammograms were aligned to allow direct comparison between the two images. Reference points
were chosen to transform the coordinates of one image so that they correspond to the other. In the
literature, usually the left mammogram was the one that was displaced and rotated. The coordinates of
the detected nipples of both images determined the displacement and the angle of rotation.
4. Next, normalization of the images took place to correct the difference in brightness between the
right and left mammograms due to the recording procedure.
5. Bilateral subtraction followed the pre-processing and alignment of the images. In most cases, the
left breast image was subtracted from the right. Hence, masses located in the left breast had negative
pixel values in the subtracted image, while masses in the right breast had positive pixel
values. The main goal was to produce two new images, with positive and negative pixel values
respectively, while all the common areas in those images were at the zero gray level, indicating no
remarkable difference between the images. After simple linear stretching, which made the new images
occupy the complete available range of pixel values, a threshold value was determined to extract the
possible malignant regions.
Figure 2.2: Three areas of the breast region (Mendez et al., 1998 p. 958)
6. The next step was the analysis of the suspicious areas, which were revealed when the threshold value
was applied to the new images (Yin, 1999). Many of the areas identified by the bilateral subtraction
technique were not masses (Mendez et al., 1998). Several techniques were tested for the reduction
of false positives and, usually, the size and eccentricity of the suspected regions were evaluated (Yin,
1999). Based on the size test, regions smaller than the cut-off were ignored (Mendez
et al., 1998). Additional texture tests and the absolute values of the gray levels further decreased
the percentage of false positives. With linear discriminant analysis, an analytical model was
developed to classify the area as normal tissue or mass (Yin, 1999). Other techniques include
pre-processing by morphological filtering (Mendez et al., 1998).
7. Last was the classification of the database to evaluate the performance of the CAD system. The
ground truth was a two-fold classification by the radiologists. First, they defined five main levels
of the mammographic appearance of the masses, related to the quality of the mass. Level 1 was a
visible mass, easy to identify even for an inexperienced observer. Level 2 was a reasonably
clear mass, detectable even by an inexperienced observer. Level 3 was a hard to notice mass, which
can be recognized by observers with some mammographic experience. Level 4 was a very subtle mass,
which demands more skills and knowledge, and Level 5 was an extremely subtle mass,
discoverable only by a skilled and experienced radiologist. The second classification was based on the
radiographic contrast and the size of the masses, which were manually measured (Yin, 1999).
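The core of the scheme above (steps 4 and 5) can be sketched in a few lines. The sketch assumes that the two images are already segmented and aligned, so that mirroring the left image is sufficient; the global mean-based brightness normalisation, the function name and the synthetic images are likewise assumptions and not the procedure of the cited works.

```python
import numpy as np

def bilateral_subtraction(right, left, threshold):
    """Mirror the left image, equalise brightness, subtract and threshold."""
    r = np.asarray(right, dtype=float)
    mirrored = np.fliplr(np.asarray(left, dtype=float))   # left view in the right frame
    mirrored = mirrored * (r.mean() / mirrored.mean())    # crude global normalisation
    diff = r - mirrored                                   # right minus left
    right_mask = diff > threshold     # positive values: candidate regions, right breast
    left_mask = diff < -threshold     # negative values: candidate regions, left breast
    return right_mask, left_mask

# Synthetic, already aligned pair with one bright spot in each breast.
right = np.full((4, 4), 100.0)
left = np.full((4, 4), 100.0)
right[1, 1] = 160.0
left[2, 0] = 170.0
right_mask, left_mask = bilateral_subtraction(right, left, threshold=30.0)
```

The left-breast mask is expressed in the mirrored coordinate frame, so the spot placed at column 0 of the left image appears at column 3. The false positive reduction of step 6 would then operate on the connected regions of these two masks.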
The image subtraction can be linear or nonlinear. In the linear bilateral subtraction method, as described
above, the right breast image is subtracted directly from the corresponding left breast image, or vice
versa. Subsequent gray level thresholding produces two binary images. The cut-off gray values are
derived from the gray level histograms of those images. After the thresholding, a large proportion of
the remaining features correlate with locations of potential abnormalities. On the other hand, nonlinear
subtraction is based on thresholding with various cut-off gray levels before the bilateral subtraction,
thus resulting in various subtraction images from a single pair of right and left mammograms.
Thresholding excludes some normal anatomic background from the subsequent analysis. The cut-off gray
values for thresholding correspond to given percentages of the areas beneath the gray level histograms
of the corresponding right and left breast images, inside the breast borders. The pixels with a gray
level above the cut-off preserve their gray level, while pixels below the cut-off value are set to a
constant value. As can be seen from Figure 2.4, the nonlinear subtraction method had a better
performance than the linear subtraction method, with a true positive rate of about 95% at an average of
three false positive detections per image. With the linear approach, the true positive rate was 11% lower.
Figure 2.3: Thresholded images of the right [(a),(c)] and left breasts [(b),(d)] with cut-off values of
25% [(a),(b)] and 35% [(c),(d)] (Yin, 1991 p. 957)
Effective alignment is crucial for the performance of bilateral subtraction, since any misalignment
between the paired breast images can cause artefacts and affect the evaluation of abnormalities and the
detection performance. However, misalignment is sometimes unavoidable due to physical differences
between the two breast images. In general, it can be caused by variations in breast size, breast
compression, patient positioning, acquisition, and computer registration (Zheng, Chang and Gur, 1995).
Despite the detrimental effects of misalignment on the detection of breast cancer, it is very challenging
to understand and model misalignments because of the complexities of the imaging
procedure and the fact that the breast is a soft-tissue organ. Only the nipple position and the breast
border can be located accurately in the alignment process and, sometimes, it is impossible to locate the
position of the nipple, so the skin line is the only reference. This may not have a major impact on the
detection of large mass areas, but it can alter the results near the skin boundary and masses can be
missed. Fortunately, several studies have shown that the detection performance is not
sensitive to minor misalignments (Yin, 1991). Nevertheless, bilateral subtraction
misses some true positives for two main reasons. The first one is that masses which are small
and have low contrast cannot be detected easily despite being present in the mammogram. The second is
the position of the abnormalities, since the detection efficiency decreases for masses that are close
to the skin boundary.
Figure 2.4: Comparison of the detection performances
obtained with the nonlinear and linear subtraction methods
(Yin, 1991 p. 962)
It is important to note that after 2013, numerous studies appeared in the literature. Sun et al. (2014)
presented a new study, which aimed to identify the women with a high risk of
having or developing an observable breast cancer. They used three types of features to build a
new CAD model for breast cancer detection. First, they did not use any registration or
alignment methods, due to the limitations of the methods described above. Their approach was based on
asymmetry feature extraction from bilateral mammograms. More specifically, they analysed and
matched the differences of image features that were computed independently from the two bilateral images.
Previous studies had used only spatial features, but Sun et al. (2014) added morphological
and texture ones, aiming not only to detect the abnormal regions but also to predict the near-term risk
of the negative cases, with a 73.5% correct rate. In this study, they used the assumption that the two
breasts of negative cases have relatively symmetrical areas. Moreover, it was shown that the asymmetry
of the tissue was indicative of the risk of a woman having or later developing breast cancer. Despite
the encouraging results, this approach is not ready to be applied to the general population, due to the limited testing dataset.
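The idea of computing features independently from each image and then comparing them, without any registration, can be sketched as follows. The three global statistics used here are hypothetical placeholders and are not the spatial, morphological and texture features of Sun et al. (2014).

```python
import numpy as np

def region_features(image):
    """Hypothetical global features: mean, standard deviation, dense-pixel fraction."""
    f = np.asarray(image, dtype=float)
    return np.array([f.mean(), f.std(), (f > f.mean()).mean()])

def asymmetry_features(right, left):
    """Absolute differences of features computed separately from each image."""
    return np.abs(region_features(right) - region_features(left))

# Synthetic pair: the right image contains one dense pixel, the left is uniform.
right = np.array([[10.0, 10.0], [10.0, 50.0]])
left = np.array([[10.0, 10.0], [10.0, 10.0]])
asym = asymmetry_features(right, left)
```

Because every feature summarises a whole image, the comparison does not require pixel-wise correspondence, which is why such a scheme can avoid the alignment step. The resulting asymmetry vector would then be passed to a classifier.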
Kelder et al. (2015) investigated an advanced CAD scheme without lesion segmentation, which was based on
the identification and analysis of region of interest (ROI)-based bilateral mammographic tissue asymmetry.
Their algorithm included bilateral image registration, image feature selection and a naive Bayes linear
classifier. The first step was the automated identification of abnormal areas, of non-fixed size, in
every mammogram and the registration of the mammograms to find ROI-based bilateral mammographic
feature asymmetries. Finally, a machine-learning classifier was applied to combine the ROI-based features
and to compute the likelihood of the ROI. It was already known that bilateral tissue asymmetries can
indicate the risk of developing breast cancer and are among the main features that a radiologist
examines to make a diagnosis. Those asymmetries were global or local differences in tissue density
between the right and left mammograms, or between matching areas in the two bilateral mammograms.
Kelder et al. (2015) showed that the asymmetry scheme had superior performance and that the asymmetry
features can help the radiologists in the identification of high-risk patients. While this study produced
very promising results, the database that was used was not large enough for a satisfactory performance
evaluation and more work is needed in this matter.
Figure 2.5: Manually registered borders of the
right and left breasts (Yin, 1991 p. 956)
A basic problem in mass detection from mammograms using CAD systems is the relatively high
number of false positives (FPs). Li, et al. (2014) suggested the use of a two-step technique, which
could decrease the number of FPs and thus improve the identification process. The first step was similar
to that of Kelder, et al. (2015), while in the second step they examined the bilateral similarity to
reject the FPs in the detection. Li, et al. (2014) found that the calculation of a matching cost was
necessary. Global and local image features were used to detect the similarities between mass-to-normal
and normal-to-normal pairs and to determine the ROIs. In general, this method showed promising results,
decreasing the FPs for breast cancer detection. Limitations included the small-scale database, the
similarity of a mass to its matched region, which resulted in an inability of the algorithm to identify
the mass, and low-density masses being missed. Likewise, Casti, et al. (2015) developed a new approach
for the analysis of structural similarity between right and left mammograms using landmarking, automatic
bilateral masking procedures, multi-directional Gabor filters, modelling of spherical semivariograms and
extraction of similarity features. They noted that the central problem in the identification of
asymmetries is that the bilateral anomalies caused by a developing or underlying
pathological process must be discriminated from the physiological variations between the two breasts.
They implemented landmarking and bilateral masking approaches to segment paired mammographic
areas. Then, with the use of spherical semivariogram descriptors, they extracted useful information
related to the spatial dependency of pixels inside the mammographic area. With Gabor filters combined
with the original gray scale values, they characterized the structural information present in the
oriented patterns of the breast. Their final step was the introduction of correlation-based structural
algorithms to compare the diverse regions. The study showed that this approach improved performance,
with the only limitation being the relatively small number of asymmetric cases available
in public databases.
Celaya-Padilla, et al. (2015) developed a CAD method which can triage mammogram sets
automatically. This technique co-registered the left and right mammograms, extracted image features
and divided the patients into three basic categories: at risk of having malignant masses, with malignant
masses and healthy subjects. Like the studies presented above, this study was based on asymmetry
analysis. The approach was related to earlier studies, but was improved by the extraction of hundreds of
asymmetry features and by the use of an automated registration algorithm, which was more robust and
simpler. This technique could be used to queue cases with a high likelihood of malignant findings in
developing countries, where there are few radiologists.
Celaya-Padilla, et al. (2015) and Tan, et al. (2016) introduced a new model to predict near-term breast
cancer risk based on quantitative assessment of bilateral mammographic image feature variations in a
series of negative full-field digital mammography (FFDM) images. The database included four
sequential FFDM examinations for every patient, with the first three examinations considered priors
and the last one considered the recent examination. They also studied a possible link between the
model-generated risk scores and the time lag between negative and positive screenings. It is worth
mentioning that Tan, et al. (2016) adopted a large number of mammographic density, texture and
structural based features used in the literature and formulated several new ones. They determined the
most effective ones to be the WLD similarity features, RLS, texture and gray level magnitude based
features. To eliminate potential training and testing bias, they adopted a LOCO cross-validation method
with feature selection. Their observations indicated a non-linear relationship between the diverse
mammographic tissue patterns from the negative to the positive screening, and showed that the near-term
cancer risk factor, estimated from bilateral mammographic image feature asymmetry, did not depend on
age. In addition, they combined the negative and recalled benign cases into one cancer-free group, to
ease subsequent processing.
In 2016, Casti, et al. (2017) proposed an algorithm for the automatic localization of malignant sites of
asymmetry in mammograms. At first, the left and right mammograms were pre-processed to locate
anatomical landmarks for bilateral matching and were then filtered with Gabor filters at different
orientations to enhance the directional components of the breast. After that, segmentation was performed
with two masks: one horizontal and one vertical. The horizontal mask was built from rectangular areas of
the same height, from the minimum to the maximum pixel inside the breast area. The vertical one
included eight more rectangular regions across the chest area. The paired regions were characterized
with correlation-based similarity measures. Classification was achieved with Bayesian models,
which were trained using structural similarity features. With this study, they derived a useful system
with satisfactory results, which can localize sites of malignant asymmetry. Their classifier was
database-independent and ensured an unbiased outcome.
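The directional filtering step mentioned above can be illustrated with a minimal sketch of a Gabor filter bank. This is not the implementation of Casti, et al. (2017): the function names, the kernel size and the parameter values (wavelength, sigma, gamma) are assumptions chosen only for illustration, and only the real (cosine) part of the filter is used.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Return a size x size real-valued Gabor kernel as a list of rows (size must be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates so that the cosine carrier runs along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def gabor_bank(n_orientations, size=9, wavelength=4.0, sigma=2.0):
    """Kernels at n_orientations equally spaced angles in [0, pi) (illustrative defaults)."""
    return [gabor_kernel(size, wavelength, k * math.pi / n_orientations, sigma)
            for k in range(n_orientations)]

def filter_response(patch, kernel):
    """Inner product of one image patch with one kernel of the same size."""
    return sum(p * k for prow, krow in zip(patch, kernel) for p, k in zip(prow, krow))
```

A patch whose intensity oscillates along one direction gives a strong response only for the kernel with the matching orientation, which is why a bank of such filters emphasizes the oriented patterns of the breast tissue.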
One year later, Yan, et al. (2017) tested a new algorithm for breast cancer risk prediction. They
used a single mutual threshold, rather than two different thresholds, on the bilateral mammograms to
segment the breast areas. The threshold value was set to the median grayscale value of the entire
pixel range in both the left and right mammograms. Subsequently, they estimated three types of image
features: asymmetry features, defined by the absolute difference between the two bilateral mammograms,
mean features, estimated by taking the average value of the two registered mammograms, and maximum
features, given by the higher value of the two matched bilateral mammograms. With a two-stage
classification model, they combined the three types of features, and the risk prediction
performance was assessed with a leave-one-out cross-validation method. Their results were promising, as
they managed to achieve higher prediction accuracy than other studies in the literature.
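The three feature types described above can be sketched as follows. This is an illustrative reading of the description, not the code of Yan, et al. (2017): the function names are hypothetical, the features are assumed to be given as one numeric vector per breast, and the threshold is assumed to be the median of the pooled pixel values of both views.

```python
import statistics

def mutual_threshold(left_pixels, right_pixels):
    """Single threshold shared by both views: median of all pixel values pooled together."""
    return statistics.median(list(left_pixels) + list(right_pixels))

def bilateral_features(left_features, right_features):
    """Combine per-breast feature vectors into asymmetry, mean and maximum features."""
    if len(left_features) != len(right_features):
        raise ValueError("feature vectors must have the same length")
    pairs = list(zip(left_features, right_features))
    return {
        "asymmetry": [abs(l - r) for l, r in pairs],   # absolute bilateral difference
        "mean": [(l + r) / 2.0 for l, r in pairs],     # average of the two views
        "maximum": [max(l, r) for l, r in pairs],      # higher of the two values
    }
```

For example, a feature measured as 2.0 on the left breast and 5.0 on the right breast yields an asymmetry of 3.0, a mean of 3.5 and a maximum of 5.0.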
In conclusion, the bilateral subtraction approach is an effective technique for breast cancer detection,
despite the limitations mentioned above, because of its low average number of false positives. Although
the true positive rates still need improvement, radiologists can use these systems as second readers to
separate normal and malignant masses and to identify normal areas as suspicious (Yin, 1999). Table 2.3
summarizes the results of the bilateral subtraction methods.
Table 2.3: Comparison of Bilateral Subtraction methods

Study                           Reported performance
------------------------------  ----------------------------------------------------------------
Sun, et al. (2014)              𝐴𝑍 = 0.754 ± 0.024
Kelder, et al. (2015)           𝐴𝑍 = 0.87 on the ROI-based evaluation
                                𝐴𝑍 = 0.72 on the case-based evaluation
Li, et al. (2014)               Sensitivity = 85%
                                34% reduction of FP
Casti, et al. (2015)            𝐴𝑍 = 0.83 with linear discriminant analysis
                                𝐴𝑍 = 0.77 with Bayesian classifier
                                𝐴𝑍 = 0.87 with ANN
Celaya-Padilla, et al. (2015)   p-rank: 𝐴𝑍 = 0.882 for calcifications vs healthy cases
                                p-rank: 𝐴𝑍 = 0.842 for masses vs healthy cases
                                z-normalization: 𝐴𝑍 = 0.882 for calcifications vs healthy cases
                                z-normalization: 𝐴𝑍 = 0.807 for masses vs healthy cases
Tan, et al. (2016)              𝐴𝑍 = 0.730 ± 0.027
Casti, et al. (2017)            1st database: 𝐴𝑍 = 0.79
                                2nd database: 𝐴𝑍 = 0.75
Yan, et al. (2017)              𝐴𝑍 = 0.830 ± 0.033

2.3 Temporal Analysis
In temporal analysis, mammograms from multiple examinations of the same breast, acquired at prior times,
are available to the CAD algorithm to achieve higher accuracy in detecting abnormalities (Timp and
Karssemeijer, 2006). The past mammograms can convey significant information about the likelihood
of breast cancer. If a suspicious area in the current mammogram closely matches, in location and
appearance, a suspicious region in the past mammogram, then the anomaly is probably not associated
with cancer. Conversely, if there is no corresponding anomaly in the past mammogram, or there is
significant change, then it is considered a malignant lesion. With the inspection of previous
mammographic images, the number of false positives is decreased and malignant masses that would
otherwise be missed can be identified (Ma et al., 2010; Ma et al., 2015).
There are some significant advantages in using prior mammograms. First, when the current
mammogram is compared to the prior one, subtle signs of malignancy, like small masses or recent
calcifications, are more obvious. These differences might be overlooked if the past mammogram were
not available for comparison, so radiologists use this approach frequently to identify developing
anomalies and increase the number of true positives. Second, the suspicious areas of the present
mammogram can be identified more accurately when each area is compared with the corresponding area of
the past mammogram. For instance, if a radiologist detects a mass on the current mammogram, he/she can
use the past one to determine if this is a new or existing density. If the mass was apparent on the prior
mammogram, the radiologist can analyse the size and the contrast of both lesions. A third benefit of
using previous mammograms is that the additional data can be used to eliminate a number of false
positive detections. If the areas related to a mass in both sequential mammograms are alike, those areas
probably represent false positives or gradually growing benign masses (Timp and Karssemeijer,
2006).
In order to detect breast cancer with temporal analysis, the prior and current mammograms must be
aligned. Registration is the key to the alignment of mammograms and is increasingly important for the
early detection of abnormalities. However, registration techniques face many challenges, due to changes
that occur in the breasts over time and differences in the way the mammography is performed,
including variations in breast compression and imaging parameters and changes in the shape and the
amount of pectoral muscle that is visible in the medio-lateral projection (Marias et al., 2005). Because
of these challenges, registration of mammographic images is still an on-going research topic for the
improvement of CAD systems using temporal analysis (Ma et al., 2010).
There are two major categories of registration techniques for temporal mammograms in the
literature: global registration techniques and regional registration techniques. The first
approach compares the entire current mammogram with the previous one to identify corresponding
regions in the breast (Timp and Karssemeijer, 2006). Vujovic and Brzakovic, (1997) applied this
method by separating the current and past mammograms into various areas with the use of internal
control points. Depending on the location of the control points, both mammograms were partitioned into
statistically homogeneous areas. Textural and contrast features from circular regions, whose centres
were the control points, were compared, providing valuable information to distinguish normal and
abnormal tissues. Figure 2.6 outlines the scheme that Vujovic and Brzakovic, (1997) used to identify the
control points in the mammograms. Richard and Cohen, (2003) used a variational formulation to find the
smooth function that best describes the deformation necessary to map one mammogram to
another, but they applied that technique to bilateral pairs. While this method is mathematically
efficient, in practice the true mapping between two images of the same breast is not smooth, so the
method cannot be applied effectively to temporal pairs. Hasegawa, et al. (2008) started with a rigid
body alignment and then registered the dense areas of the mammogram with the use of a B-spline control
point grid.
The regional registration approach correlates suspicious areas in current mammograms with
corresponding areas in the past ones (Timp and Karssemeijer, 2006). In other words, this algorithm
searches locally in one mammogram to find regions that match the regions of interest of the paired
mammogram (Ma et al., 2015). Sanjay-Gopal, et al. (1999) designed a fan-shaped search area in past
mammograms, using the nipple and the centroid of the breast axis. For every abnormal area on the current
mammogram, they constructed a fan-shaped search region on the previous mammogram and, for every
location inside the search region, a correlation measure was computed. The location with the highest
correlation value was selected as the location of the corresponding region on the prior mammogram.
Figure 2.7 presents the procedure used by Sanjay-Gopal, et al. (1999). In the same way, Hadjiiski, et al.
(2001) classified the masses as malignant or benign based on the comparison of the current and prior
mammograms. In that case, radiologists defined the region of interest and then the algorithm found the
associated anomalies inside the mammograms.
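The regional search described above, in which a correlation measure is computed for every location of a search region and the location with the highest value is retained, can be sketched as below. This is a simplified illustration and not the method of Sanjay-Gopal, et al. (1999): the search region is assumed to be given as a plain list of candidate positions instead of a fan-shaped area, the Pearson correlation is assumed as the correlation measure, and all function names are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length sequences of pixel values."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # flat patch: correlation is undefined, treat it as no match
    return cov / math.sqrt(var_a * var_b)

def crop(image, top, left, height, width):
    """Flatten the height x width patch whose top-left corner is (top, left)."""
    return [image[r][c] for r in range(top, top + height) for c in range(left, left + width)]

def best_match(template, prior, candidates):
    """Return (score, (top, left)) of the candidate patch most correlated with the template."""
    h, w = len(template), len(template[0])
    flat_template = [v for row in template for v in row]
    best = None
    for top, left in candidates:
        if top < 0 or left < 0 or top + h > len(prior) or left + w > len(prior[0]):
            continue  # the patch would fall outside the prior image
        score = pearson(flat_template, crop(prior, top, left, h, w))
        if best is None or score > best[0]:
            best = (score, (top, left))
    return best
```

Here `template` stands for the suspicious area cut from the current mammogram and `prior` for the past mammogram. The function returns `None` when no candidate position lies inside the prior image.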
Figure 2.6: Processing steps and information flow between processing steps used to identify
potential control points and establish their correspondence (Vujovic & Brzakovic, 1997 p. 1388)
Figure 2.7: Regional registration technique (Sanjay-Gopal et al., 1999 p. 2671)
A natural extension was to combine global and regional registration for improved results. Marias, et al.
(2005) combined several techniques:
1. Detection of the breast outline, which was important in the registration process to ensure that
the breast boundary was segmented correctly. This step was complicated by labels that overlapped or
were close to the breast region, noise and/or skinfolds. The boundary extraction was based on a
combination of the Hough transform, image gradient operators and morphology to isolate the background
and mark points along the boundary.
2. Curvature analysis of the breast outline. This method included the detection of invariant
points along the breast boundary that are characteristic of the curvature. The point with the maximum
curvature corresponded, most of the time, to the nipple and, even if the nipple was not noticeable,
the algorithm still used the point of maximum negative curvature as this landmark. In general, the three
detected landmarks usually corresponded to the anatomical location of the rib (point 1 in Fig. 2.8), the
nipple (point 2 in Fig. 2.8) and the axilla (point 3 in Fig. 2.8). Nevertheless, the breast outline
could be notched, which complicated the computation of the curvature with second order derivatives
and resulted in a noisy curvature profile. This problem was overcome with Gaussian multiscale analysis
of features in 2-D curves.
3. Image transformation. With at least 5 points along the breast boundary, a satisfactory initial
alignment can be achieved for temporal mammogram registration. With 7 points between
landmarks three and two and another 7 between landmarks two and one, greater accuracy
can be reached. Based on these points, the transformation that aligned the boundaries was computed
with the use of thin-plate spline interpolation, and warped images were created by forcing all the
points inside a mammogram to take the intensity values of the point where the interpolating function
maps the point of the past mammogram. Intensities at positions that do not fall on the pixel grid can
be estimated by bilinear interpolation. This approach reduced the primary differences among the images
and corrected for scaling, translation and limited rotations caused by the breast positioning and
orientation.
4. Definition of internal correspondences in the mammogram pair. Different breast compressions tend to
make denser structures move more than the less dense ones, which can result in an
interruption of an otherwise generally smooth motion field. The performance of the transformation
can be improved by choosing internal landmarks. Isolated areas of dense tissue, which are often the
brightest regions in a mammogram, are the preferred choice for internal landmarks. Yet,
calcifications cannot be used as possible matches due to their restricted spatial extent, which can
result in numerous but erroneous matches. The distinct areas of dense tissue that moved closer when the
primary boundary-based algorithm was applied were preferentially considered, since they decreased the
value of the internal matches. A nonlinear Coiflet wavelet scale-space technique was used to analyse the
mammogram pair in order to identify important regions of interest. Additionally, for each region in
the first image a search window was implemented in the second, with a match rejection filter to
guarantee that spatially localized features were not matched to bigger ones. Finally, a limited number
of landmarks were determined as the centroids of the matched areas.
5. Finally, the boundary points and the internal landmarks were included in a thin-plate spline
approximation approach that allows both smoothness control and individual weighting for every
landmark, to match the images accurately.
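The curvature analysis of step 2 can be illustrated with a minimal sketch that smooths a sampled outline with a Gaussian kernel before estimating the curvature with finite differences. It is a single-scale simplification under stated assumptions, not the multiscale procedure of Marias, et al. (2005): the outline is assumed to be an open curve given as two coordinate lists, the points near the ends are skipped because the smoothing is unreliable there, and all function names are hypothetical.

```python
import math

def gaussian_weights(sigma):
    """Normalized 1-D Gaussian weights covering +/- 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(values, sigma):
    """Gaussian smoothing of a 1-D sequence, repeating the end values at the borders."""
    weights = gaussian_weights(sigma)
    radius = len(weights) // 2
    n = len(values)
    return [sum(w * values[min(max(i + k - radius, 0), n - 1)] for k, w in enumerate(weights))
            for i in range(n)]

def curvature_profile(xs, ys, sigma):
    """Return (index, signed curvature) pairs for outline points away from the two ends."""
    sx, sy = smooth(xs, sigma), smooth(ys, sigma)
    margin = len(gaussian_weights(sigma)) // 2 + 1  # skip points affected by border padding
    profile = []
    for i in range(margin, len(sx) - margin):
        dx, dy = (sx[i + 1] - sx[i - 1]) / 2.0, (sy[i + 1] - sy[i - 1]) / 2.0
        ddx = sx[i + 1] - 2.0 * sx[i] + sx[i - 1]
        ddy = sy[i + 1] - 2.0 * sy[i] + sy[i - 1]
        speed = (dx * dx + dy * dy) ** 1.5
        profile.append((i, (dx * ddy - dy * ddx) / speed if speed > 0 else 0.0))
    return profile

def max_curvature_index(xs, ys, sigma):
    """Index of the outline point with the largest absolute curvature (ValueError if too short)."""
    return max(curvature_profile(xs, ys, sigma), key=lambda item: abs(item[1]))[0]
```

A larger `sigma` suppresses the notches of the outline at the cost of flattening genuine high-curvature points, which is the trade-off that a multiscale analysis is meant to address.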
One year later, Timp and Karssemeijer, (2006) designed another combined global and regional registration
method. This method started with the two temporal mammograms, the prior and the current, from which
the breast region and the pectoral muscle were subsequently segmented. A global registration method
based on centre of mass alignment registered the current and past mammograms, and a pixel-level mass
identification algorithm assigned a measure of suspiciousness to every pixel inside the breast region.
This measure shows the likelihood that a cancerous mass is present. Then, the most suspicious locations
on the current image were chosen and linked to a corresponding location on the past image. Both locations
were segmented into distinct areas, and features were computed for every area and used to classify each
region.
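Global registration by centre of mass alignment amounts to translating the prior view so that the centroid of its breast region coincides with that of the current view. The sketch below illustrates the idea under simplifying assumptions (binary breast masks of equal size, integer pixel shifts); the function names are hypothetical and it is not the implementation of Timp and Karssemeijer, (2006).

```python
def centre_of_mass(mask):
    """Centroid (row, col) of the non-zero pixels of a binary breast mask."""
    total = row_sum = col_sum = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                total += 1
                row_sum += r
                col_sum += c
    if total == 0:
        raise ValueError("mask is empty")
    return row_sum / total, col_sum / total

def alignment_shift(current_mask, prior_mask):
    """Integer (d_row, d_col) translation that moves the prior centroid onto the current one."""
    cr, cc = centre_of_mass(current_mask)
    pr, pc = centre_of_mass(prior_mask)
    return round(cr - pr), round(cc - pc)

def translate(image, shift, fill=0):
    """Shift an image by (d_row, d_col), filling uncovered pixels with `fill`."""
    d_row, d_col = shift
    rows, cols = len(image), len(image[0])
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + d_row, c + d_col
            if 0 <= nr < rows and 0 <= nc < cols:
                out[nr][nc] = image[r][c]
    return out
```

Because only a translation is estimated, this alignment cannot compensate for differences in compression or rotation, which is why it is typically followed by a regional matching step.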
Figure 2.8: Consistent landmarks in the CC and ML ‘idealized’ outlines (Marias et al., 2005 p. 3)
For a more accurate comparison, van Engeland, et al. (2003) evaluated four different approaches for
temporal mammogram registration. The first two methods were the simplest alignment procedures,
based on the nipple location and the centre of mass of the breast region. In the first method, the
detection of the nipple was automated and the nipples in the past and current views were aligned at the
same spot by translation of the past view. In the second method, the mammograms were segmented into
three areas: the breast, pectoral and background areas. The centres of mass of the breast area in the
past and current views were placed in the same location by translation of the past view. The third
method utilized a mutual information measure, which was calculated from the joint probability
distribution of the images’ intensities. The two images were registered by maximizing the mutual
information using rotation, scaling and shearing. Finally, the fourth method was a warping approach,
which used a set of automatically determined control points placed on the breast contour and the
pectoral muscle. The control points in both views were aligned with each other and the warped image was
formed by interpolation between the control points, using a thin-plate spline surface approach. Van
Engeland, et al. (2003) determined that the mutual information method provided the best results.
Table 2.4 contains a comparison of the previous studies related to registration of temporal mammograms.
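The mutual information measure used by the third method is computed from the joint probability distribution of the intensities of the two images. A minimal sketch is given below; it assumes that both images have the same size and that the raw intensity values can serve directly as histogram bins (real mammograms would first be quantized into a small number of bins). The function name is hypothetical.

```python
import math
from collections import Counter

def mutual_information(image_a, image_b):
    """Mutual information (in bits) between the intensities of two same-sized images."""
    a = [v for row in image_a for v in row]
    b = [v for row in image_b for v in row]
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and have the same number of pixels")
    n = len(a)
    joint = Counter(zip(a, b))            # joint histogram of intensity pairs
    marg_a, marg_b = Counter(a), Counter(b)
    mi = 0.0
    for (va, vb), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with the raw histogram counts
        mi += p_ab * math.log2(p_ab * n * n / (marg_a[va] * marg_b[vb]))
    return mi
```

A registration procedure would evaluate this measure for each candidate transformation of the past view and keep the transformation with the highest value.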
Table 2.4: Comparison of registration techniques in temporal mammograms

Study                             Reported performance
--------------------------------  ------------------------------------------------------------
Vujovic and Brzakovic, (1997)     86% of points in agreement with ground truth
Sanjay-Gopal, et al. (1999)       85% correctly identified with Global Breast Alignment
                                  77% correctly identified without Global Breast Alignment
Hadjiiski, et al. (1999)          92% classification accuracy using temporal mammograms
                                  90% classification accuracy using current mammograms
                                  78% classification accuracy using prior mammograms
Marias, et al. (2005)             85% accuracy for the identification of rib, 75% of the nipple
                                  and 80% of the axilla
Timp and Karssemeijer, (2006)     72% of the cases were correctly identified
                                  69% of the cases were correctly linked based on correlation
                                  200 FP and 389 TP
van Engeland, et al. (2003)       Mutual Information outperformed all the techniques
                                  Mean 7.9 and Median 6.1

After 2010, a significant number of studies on breast cancer detection with temporal analysis appeared
in the literature. A large number of those studies dealt with the registration process, while new
approaches were established for the identification of abnormalities.
Rangayyan, Banik and Desautels, (2010) developed a method for the identification of architectural
distortion in mammograms, with the use of Gabor filters, phase portrait analysis, fractal analysis and
Haralick’s texture features. Architectural distortion is the distortion of the architecture of a breast
region without an associated elevated density or mass. Such areas, due to their subtlety and diverse
appearance, are missed most of the time. For the feature extraction, they used Haralick’s texture
features, which are based on the moments of a joint probability density function that is calculated
using the joint occurrence, or co-occurrence, of gray levels. Feature selection was accomplished by
evaluating the performance of every feature or of a combination of significant features. Rangayyan,
Banik and Desautels, (2010) used various image processing and pattern classification methods, such as
artificial neural networks and support vector machines. From their results, they found that Gabor
filters, phase portraits, fractal analysis and textural features could be combined to achieve early
detection of subtle evidence of breast cancer in mammograms, especially as related to architectural
distortion. The main disadvantage of their method was the high number of false positives, which could
be reduced with the identification and removal of the pectoral muscle.
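The co-occurrence of gray levels on which Haralick’s texture features are based can be illustrated with a short sketch. It assumes an image that has already been quantized to integer gray levels in the range 0 to levels - 1 and a single pixel offset; only two of the Haralick features (contrast and energy) are shown, and the function names are hypothetical rather than taken from Rangayyan, Banik and Desautels, (2010).

```python
def cooccurrence_matrix(image, levels, offset=(0, 1)):
    """Normalized gray-level co-occurrence matrix for one pixel offset (d_row, d_col)."""
    d_row, d_col = offset
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + d_row, c + d_col
            if 0 <= nr < rows and 0 <= nc < cols:
                counts[image[r][c]][image[nr][nc]] += 1
                total += 1
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[count / total for count in row] for row in counts]

def contrast(p):
    """Haralick contrast: sum of p(i, j) * (i - j)^2."""
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))

def energy(p):
    """Angular second moment: sum of p(i, j)^2."""
    return sum(v * v for row in p for v in row)
```

A uniform region gives a contrast of 0 and an energy of 1, whereas regions with frequent gray-level transitions give a higher contrast and a lower energy.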
Ma, et al. (2010) recognized the difficulty of registering temporal mammograms and replaced image
registration with graph matching. The main advantage of this method was that registration was
restricted to image features that were consistent with breast cancer, which reduced the chance of
error. Their procedure included 4 steps. The first two were the segmentation of the current and prior
mammograms and the assignment of mass-like scores to the resulting components, to show the likelihood
that a component corresponded to a malignant mass. The third was the application of the graph matching
algorithm to pair candidate masses in the current and prior images, and the last was the adjustment of
the mass-like scores to reflect the information from both mammograms. The graph theory
based method detected components which were associated with cancerous masses, but did not provide
precise boundaries of the masses, which are needed to estimate specific characteristics. The performance
of their identification scheme when applied only to the current images was comparable to the best
detection results presented in the literature. The main drawback of this method was the increased number
of false positive detections per image, which changed only slightly when the past mammogram was compared
to the current one. Moreover, the graph matching process was unsuccessful when the two temporal
mammograms had significant differences in appearance, usually due to the presence of dense tissue.
Diez, et al. (2011) proposed a quantitative evaluation of state-of-the-art intensity-based image
registration approaches applied to temporal mammograms, which included global and rigid
transformations, local deformable paradigms using different metrics and multi-resolution techniques.
They assessed their results on temporal cases with a quantitative analysis and a multi-observer
study, and quantitatively compared rigid and non-rigid intensity-based registration methods.
The eight state-of-the-art algorithms that were used were: Rigid, Affine, B-Spline Free-Form
Deformations, Polyrigid, Demons and combinations of them. While the B-Spline Free-Form Deformation (BSP)
method showed the best results from the numeric as well as the subjective point of view, it produced
registration artefacts. To solve this problem and improve the results, Diez, et al. (2011) combined the
BSP method with Affine registration or multi-resolution techniques. Soon after, Diéz, et al. (2014)
published a new study on three other registration algorithms (Affine, SyN and Demons), tested
on DCE-MRI images collected from clinical practice. The methodology included segmentation
approaches, which can focus on the region of the breast, and anatomical landmarks and image metrics,
in order to evaluate the quality of registration. These three registration methods were applied
separately to each breast using automatic breast segmentation masks. With temporal registration, they
compared current and prior exams to reduce the false positives and to identify abnormal areas. The
conclusion was that the SyN registration technique provided the best results.
Martí, et al. (2014) evaluated the use of image and deformation features to identify abnormal cases with
cancerous masses. More specifically, they investigated whether the image registration results could be
used for the identification of malignant masses from temporal mammograms of the same patient and for the
categorization of cases as normal or abnormal. Their approach combined an Affine transformation, which
maximised a mutual information similarity measure, with a non-rigid point correspondence method based
on a robust point matching algorithm. The intensity, deformation and similarity based features,
which were obtained from the registration method, were subsequently used in a machine-learning framework
to identify cancerous cases. Martí, et al. (2014) found that the combination of features could improve
the results in breast cancer detection, but that additional registration algorithms and a multi-centre
dataset should be included in future work.
Bozek, et al. (2014) performed temporal comparison of lesions in full-field digital mammograms
(FFDM) and extracted temporal features that describe the change in the lesion between the two temporal
mammograms. The main advantages of this study were the exclusive use of FFDM images
and the introduction of the volumetric change of a lesion, which was determined using dense tissue
thickness maps. More specifically, they adopted four volumetric features that include information
related to the size of the lesion, among other properties. Equally important, the volumetric features
could overcome the restrictions encountered when calculating the size of a lesion. The results showed
that volume might be a much more relevant feature than area for the analysis of the temporal change in
lesion size. Furthermore, the classification performance could be improved if the temporal volumetric
features were added to a set of features collected only from the current exam. However, the study had a
drawback: the small sample size did not permit Bozek, et al. (2014) to establish whether there was an
advantage in using volume features.
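The difference between area-based and volume-based measures of lesion change can be illustrated with a short sketch. It does not reproduce the four volumetric features of Bozek, et al. (2014), which are not specified here; it only assumes that a dense tissue thickness map (in mm) and a binary lesion mask are available for each exam, and all names and units are hypothetical.

```python
def lesion_volume(thickness_map, lesion_mask, pixel_area_mm2):
    """Approximate lesion volume (mm^3) as the sum of dense-tissue thickness over the lesion."""
    volume = 0.0
    for thickness_row, mask_row in zip(thickness_map, lesion_mask):
        for thickness_mm, inside in zip(thickness_row, mask_row):
            if inside:
                volume += thickness_mm * pixel_area_mm2
    return volume

def lesion_area(lesion_mask, pixel_area_mm2):
    """Projected lesion area (mm^2)."""
    return sum(1 for row in lesion_mask for inside in row if inside) * pixel_area_mm2

def relative_change(prior_value, current_value):
    """Relative change between exams, e.g. 0.5 means a 50% increase."""
    if prior_value == 0:
        raise ValueError("prior value must be non-zero")
    return (current_value - prior_value) / prior_value
```

A lesion that becomes thicker between two exams without changing its projected outline shows no change in area but a clear change in volume, which is the situation in which a volumetric feature would be more informative.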
Ma, et al. (2014) presented a temporal mammogram registration framework, based on spatial
relations between ROIs and graph matching, used to build correspondences between the ROIs of the current
and past mammograms. They employed 18 image features (Table 2.5) to capture the
dissimilarities between the matched areas. To evaluate the contribution of temporal change information
to the identification of abnormalities, 5 techniques were implemented to combine mass classification
based on image features measured on single areas with mass classification based on temporal features,
in order to enhance mass classification. The framework of this study contained both preprocessing of the
current and prior mammograms (gamma correction, anisotropic filtering and extraction of both the breast
and the pectoral muscle boundaries) and segmentation (adaptive pyramid based segmentation and sublevel
set analysis). Their results demonstrated that including the temporal features in the mammogram mass
identification and combining them with the single classification features, linearly or by taking the
minimum value of the two classifications, increased the general performance of the algorithm. The main
limitation of this study was the relatively small set of mammograms that was used. A larger set of
mammograms, or supplementary mammogram pairs from diverse sources and databases, could allow a
better determination of whether the use of temporal change information improves
cancer identification. Additionally, Ma, et al. (2014) used only five basic algorithms for
incorporating the temporal image features in the detection.
Table 2.5: Single and temporal features (Ma et al., 2014 p. 1264)
Soon after, Ma, et al. (2015) designed an innovative approach of incorporating 17 fuzzy sets based on
spatial relations, to register temporal mammogram pairs. They used four spatial relations: to the right
of, to the left of, below and above. The histogram of the entire considerable angles between all pairs of
points in a pair of ROI regarded as a fuzzy set and spatial relations between the pair of ROIs
characterized by determine to what degree this fuzzy set came nearer to the four spatial relations. Based
on the spatial relations, association of ROIs of temporal mammogram pairs evaluated as a graph-
matching problem and registration of temporal mammograms performed by discover the shared sub-
graph between two graphs illustrating a pair of temporal mammograms. Their processing and
segmentation scheme was similar to the one in their prior research. Their experiments showed that this
algorithm can cope with mammograms with variations in position or size but they did not found an
actual enhancement in performance, since the slight increase in the ROC number. Despite this, even a
small increase in the performance has the potential to affect positively the results for breast cancer
detection especially in developed countries. Ma, et al. (2015) found that their algorithm could identify
changes over time so it worked well on dense breasts but for these methods, they had to extract
accurately the breast boundary and then to find the reference points. The breast boundary used as a
component and involved in the registration method to supply global reference. When they combined
classification, a minor increase experienced in the performance compared to single detection. To
determine statistical significance of such limited improvement must use larger datasets and better
boundaries of the ROI.
Single features       Matched features
solidity              solidity
axis ratio            solidity2
std ratio             axis ratio
iv                    circularity
c2                    int
c3                    relint
int entropy           std radi
energy                radi
inertial momentum     c2
anisotropy            c3
m1-m7                 int entropy
                      energy
                      inertial momentum
                      anisotropy
                      m2,m3,m7
                      mass like number

Subsequently, Ma et al. (2015) expanded their research to investigate the combination of image features
measured from single areas and image features measured from the matched areas of temporal
mammograms, based on fuzzy spatial relation representation and graph matching. They used three
Support Vector Machine (SVM) kernels: the multi-layer perceptron kernel, the polynomial kernel and
the Gaussian radial basis kernel. They also combined those kernels and applied them to the mammograms
from the two time points for mass classification. To connect the two types of features from the single
and temporal mammograms, they used three combination rules: linear combination, the Max rule and the
Min rule. Their results showed that this Multiple Kernel Learning (MKL) method provided the best
performance on both the single and temporal feature sets, with the Min combination rule giving the most
effective classification. The major drawback of this study was the use of only heuristic searching to
reduce computational time.
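The three combination rules mentioned above can be illustrated with a minimal sketch. It assumes that
each classifier returns a score in [0, 1]; the function name and the weight used for the linear rule
are hypothetical, since the weighting used in the original study is not stated here.

```python
def combine_scores(single_score, temporal_score, rule="min", weight=0.5):
    """Fuse two classifier scores with one of three simple rules.

    `weight` is an illustrative assumption used only by the linear rule.
    """
    if rule == "linear":
        # Weighted sum of the single-image and temporal scores.
        return weight * single_score + (1.0 - weight) * temporal_score
    if rule == "max":
        # Keep the more confident (larger) of the two scores.
        return max(single_score, temporal_score)
    if rule == "min":
        # Keep the more conservative (smaller) of the two scores.
        return min(single_score, temporal_score)
    raise ValueError(f"unknown combination rule: {rule}")
```

With the Min rule, a region is scored as suspicious only when both the single and the temporal
classifier agree, which may explain why it tends to suppress false positives.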
More recently, Abdel-Nasser, Moreno and Puig (2016) proposed a temporal mammogram registration
method based on curvilinear coordinates, which account for both global and local deformations
in the breast region. This method was fully automated and could be applied to both CC-CC and
MLO-MLO mammographic pairs. In the curvilinear mapping, a coordinate pair (s,t) is assigned to each
pixel with Cartesian coordinates (x,y). The construction of the curvilinear coordinates does not
require any information related to the structures inside the breast region. To build the curvilinear
coordinates, they used the breast boundary and a landmark point placed on it. Hence, the resulting
representation of a given mammogram was invariant to differences in the size, position and orientation
of the internal structures of the breast. They applied the curvilinear coordinates to manage both
global and local deformations inside the breast area and compensated for the deformations between the
mammographic images. By using curvilinear coordinates instead of Cartesian coordinates in
mammogram registration, they created a model which mimics the anatomy of the breast and does not
require control points or a correspondence algorithm. They also integrated the segmentation
algorithm within the registration framework. After the registration, the similarity between the
two temporal mammograms was maximized and the distance between manually defined landmarks was
decreased. A careful comparison with state-of-the-art mammogram registration methods
showed that this technique provided the best results and the smallest landmark errors compared to
Demons, DRAMMS and Brandt's method.
A new fractal dimension-based diagnosis approach was implemented by Shanmugavadivu, Sivakumar and
Sudhir (2016) for change detection and time-series analysis of masses in temporal mammograms.
Fractal geometry is an effective mathematical technique that handles self-similar and irregular
geometrical objects known as fractals. Using the Fractal Hurst bound for enhancement and Fractal
Thresholding for segmentation, they tried to detect masses in temporal mammograms. Furthermore,
Shanmugavadivu, Sivakumar and Sudhir (2016) performed temporal analysis of mass lesions using the
fractal dimension. Their results indicate that the fractal dimension of temporal mammograms can provide
valuable information to decision support expert systems for radiologists.
Finally, Kooi and Karssemeijer (2017) examined the use of deep convolutional neural networks
for discovering abnormalities in mammograms. They applied a linear mapping that took
the area of a mass and mapped it to the prior mammogram. Then, they examined two different
architectures. The first relied on a fusion model with two data streams, where both ROIs were
delivered to the network at training and test time. The second was a stage-wise algorithm, where a
network was trained on ROIs from the primary mammogram and then used as a feature extractor for the
primary and prior mammograms. For the classification of features, they used a gradient boosted tree
classifier. Their results demonstrated an improvement in performance and were promising for further
research.
All of the above studies indicate that temporal analysis can enhance the accuracy of breast
cancer detection with mammographic algorithms (Timp and Karssemeijer, 2006). However,
despite the increasing number of studies in the literature, automatic comparison between temporal
mammograms is still a challenging task due to the complexity of temporal mammogram registration
(Ma et al., 2015). Most of the studies in the literature focus on registration-based techniques, with
advanced methods implemented for the alignment of the two temporal mammograms in order to
decrease the registration errors. The main drawbacks of the current methods are the limited datasets
and the scarcity of studies examining the performance of CAD systems in clinical settings. The table
below presents a general comparison of all the approaches mentioned above for breast cancer detection
using temporal analysis.
Table 2.6: Comparison of Temporal Analysis techniques in mammograms

Temporal Analysis methods

Desautels, Rangayyan and Mudigonda (2000):
    𝐴𝑍 = 0.76 Bayesian classifier
    𝐴𝑍 = 0.73 Fisher analysis
    𝐴𝑍 = 0.77 Neural Networks
    𝐴𝑍 = 0.77 SVM
Ma et al. (2010):
    80% true detection rate with 1.02 FP per image at single scheme and 0.96 FP at temporal scheme
    90% true detection rate with 1.84 FP per image at single scheme and 1.63 FP at temporal scheme
Diez et al. (2011):
    BSP best method with Mean=2.73 and Variance=1.42
Diez et al. (2014):
    SyN best algorithm with RMS=72.89±36.77 mm
    NMI=1.21±0.04
Martí et al. (2014):
    no registration 𝐴𝑍 = 0.76
    with RPM 𝐴𝑍 = 0.88
    with Affine 𝐴𝑍 = 0.84
Bozek et al. (2014):
    with all current features 𝐴𝑍 = 0.77
    with all current features and all nine temporal features 𝐴𝑍 = 0.86
    with all current features and four temporal features 𝐴𝑍 = 0.90
Ma et al. (2015):
    with linear combination 𝐴𝑍 = 0.8989
    by taking minimum value 𝐴𝑍 = 0.8863
    with Fisher analysis 𝐴𝑍 = 0.8855
    with SVM 𝐴𝑍 = 0.6028
Ma et al. (2015):
    𝐴𝑍 = 0.852
Ma et al. (2015):
    Min combination rule best results with 𝐴𝑍 = 0.8532 on the single feature set
    MKL best results with 𝐴𝑍 = 0.7987 on the temporal feature set
Abdel-Nasser, Moreno and Puig (2016):
    SSIM=0.903±0.142
    MI=1.232±0.108
    LE=5.23±2.11 mm
Kooi and Karssemeijer (2017):
    𝐴𝑍 = 0.87
    1st architecture 𝐴𝑍 = 0.895
    2nd architecture 𝐴𝑍 = 0.88
    same architecture for temporal analysis: 𝐴𝑍 = 0.884 (1st), 𝐴𝑍 = 0.879 (2nd)
2.4 Detection and Classification of Micro-calcifications
A significant percentage of cancers are detectable due to the appearance of Micro-Calcification Clusters
(MCCs) (Cheng et al., 2003). The morphology of the calcifications is the most crucial parameter in the
classification between benign and malignant tumors. Suspicious or malignant calcifications have either
an amorphous or a coarse heterogeneous form. On the other hand, benign calcifications are uniform and
smooth. The distribution of calcifications is also important and can be divided into five categories:
diffuse, regional, linear, clustered and segmental. Diffuse calcifications are similar calcifications
that appear throughout the whole breast. Regional calcifications are scattered over a large region of
the breast. A linear distribution is typically seen when ductal carcinoma fills the whole duct. A
clustered distribution is one in which at least five calcifications are grouped together and, finally,
a segmental distribution contains calcium deposits in ducts. Usually, the first two categories are
associated with benign tumors and the other three with malignant ones (Smithuis and Pijnappel, 2008).
The high correlation between the presence of MCCs and abnormality suggests that
Computer-Aided Diagnosis systems will be beneficial for the automated detection and classification
of MCCs. However, automated analysis of micro-calcifications is a complicated procedure due
to a series of problems. First, micro-calcifications are relatively small, with sizes ranging from 0.1
to 1 mm. Second, they can appear in diverse shapes and distributions, which makes template matching
impossible. Third, micro-calcifications may have low contrast, so the difference between abnormal
regions and normal tissue in the mammograms may be unnoticeable. In addition, they can be
connected to the surrounding healthy tissue, which makes segmentation approaches ineffective, or, in
some cases, the abnormal areas are hidden because the tissue is too dense or the skin is thicker than
usual. A large number of studies in the literature introduce algorithms for the detection and the
subsequent classification of micro-calcifications (Cheng et al., 2003).
Figure 2.9: Types of micro-calcification distribution (Smithuis and Pijnappel, 2008)

Wang and Karayiannis (1998) presented an approach based on wavelet image decomposition.
Micro-calcifications appear as small clusters of pixels with high intensity values relative to their
neighbouring pixels. They introduced a detection system that preserved these features and applied
an image transformation in which the signal characteristics are localized in both the original and the
transform domain. Micro-calcifications correlate with high-frequency components of the image spectrum,
and their detection was accomplished by decomposing the mammograms into several frequency sub-bands,
suppressing the low-frequency sub-band and reconstructing the mammogram from the sub-bands containing
only high frequencies. This technique is driven by the capability of wavelets to differentiate various
frequencies and to retain signal details at diverse resolutions. New studies are needed to examine
how the performance of this approach changes with alterations of the properties of the
wavelet filters.
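The idea of suppressing the low-frequency sub-band and reconstructing from the high-frequency one can
be sketched on a one-dimensional signal. This is only an illustration under simplifying assumptions: it
uses a single-level Haar wavelet on an even-length list, whereas the cited work decomposes
two-dimensional mammograms into several sub-bands with its own filters. All function names are
hypothetical.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling constant


def haar_step(signal):
    """One Haar level: return (approximation, detail) for an even-length list."""
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_step, rebuilding the samples pair by pair."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * _S)
        out.append((a - d) * _S)
    return out


def keep_high_frequencies(signal):
    """Zero the low-frequency (approximation) band, then reconstruct."""
    approx, detail = haar_step(signal)
    return haar_inverse([0.0] * len(approx), detail)
```

A flat region produces an output of zeros, while an abrupt bright sample, which is how a
micro-calcification would appear along an image row, survives the reconstruction.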
Papadopoulos, Fotiadis and Likas (2005) developed a fully automated computer-based approach for
the identification and characterization of micro-calcification clusters in digital mammograms. Their
method was performed in three stages: the cluster identification stage, the feature extraction stage
and the classification stage. For the last stage, they used a rule-based system, an ANN and an SVM. At
the beginning, they applied a pre-processing step to eliminate the unusable radiological marks and the
background of the image. Later, they tried to reveal hidden micro-calcifications with background
correction and contrast enhancement. For all the objects and clusters, they estimated various
discriminative morphological and textural features to use as input to the false positive reduction
system, and they added four new rule-based features. Then, with feature extraction they found the most
important features of each cluster, and with their classification algorithms they characterized the
abnormal regions as benign or malignant. Even though their method showed satisfactory results compared
with the existing automated methods in the literature, further studies with bigger datasets are required.
Some years later, Suhail, Sarwar and Murtaza (2015), based on the observation that calcifications
look like small bright spots on a mammogram, built a new scale-specific blob detection approach
in which the scale was selected through supervised learning. They introduced a new feature called
'Ratio Energy' (RE) for effective blob detection, which was calculated from the energy of a pixel at
two different scales. Once the maximum RE was obtained, they compared the energy of each pixel with a
thresholded maximum to decide whether the pixel corresponded to a calcification or not. Moreover, they
examined some region-based properties of the normal mammograms that differed from those of the abnormal
ones and used them as filtering procedures to avoid additional processing. Their results were reliable
and good enough to help radiologists in the early diagnosis of breast cancer.
Later, Boulehmi, Mahersia and Hamrouni (2016) tried to diagnose and classify breast
micro-calcifications on mammograms with a three-step system. The method started with the pre-processing
of the mammograms, and more specifically with the use of morphological operators to eliminate the
pectoral muscle and all the irrelevant elements in the mammograms. Furthermore, a new mammographic
image enhancement technique was introduced, which consisted of the top-hat transform followed by
wavelet contrast enhancement and galactophorous tree interpolation. The second step consisted of the
segmentation of the micro-calcification clusters with the use of Generalized Gaussian Density (GGD)
estimation and a Bayesian back-propagation neural network. The last step involved the characterization
of the clusters as benign or malignant with a neuro-fuzzy system. Even though with this system
Boulehmi, Mahersia and Hamrouni (2016) achieved acceptable results not only for micro-calcification
detection but also for the diagnosis of breast masses, they should improve the classification accuracy
using more cases.
More recently, Ciecholewski (2016) introduced a new approach for the segmentation of
micro-calcifications in mammograms using morphological transformations. First, the micro-calcifications
were detected morphologically, which allowed the region of their occurrence to be determined, the
contrast to be enhanced and the noise to be removed. Then, a segmentation procedure extracted
the shape of the micro-calcifications. Gradient transformations and fewer intermediate steps were used
during the extraction of the final shape of the micro-calcifications. The method was practical, fully
automated and did not need to combine different regions by maximizing average contrast, unlike other
available publications in the literature. For further research, diverse categories of
micro-calcifications must be used, radiologists should assess the segmentation results and
new-generation mammograms are needed.
Several studies in the literature focus only on the classification of micro-calcifications into two
classes, benign and malignant. Khehra and Pharwaha (2016) examined a Multilayer Feed-Forward
Backpropagation Artificial Neural Network (MFFB-ANN) and a Support Vector Machine (SVM) as
classifiers. They extracted the required features from the mammograms and chose the most relevant
ones with Particle Swarm Optimization. To compare the two classifiers, they applied
confusion matrix and ROC analysis. From their outcome, they noticed that MFFB-ANN performed as
a good classifier but the SVM classifier behaved as an excellent one. In fact, the SVM classifier
identified the cases with greater accuracy within experimental errors. In the future, metaheuristic
techniques could be used to find the optimal hyperplane with different kernel functions in the SVM.
At the same time, Bekker et al. (2016) proposed a two-stage classification scheme, which imitated the
biopsy decision. Their method was based on view-level decisions and a logistic regression classifier,
with a stochastic combination of the two view-level indications into a single benign or
malignant decision. In other words, their algorithm automatically learned how to combine the
information obtained from the CC and MLO views. In the first part, different classifiers were tested
on multiple CC and MLO views to decide whether, based on the particular view, the abnormal region was
malignant. The image-level decision was modelled as a hidden random variable that was observed neither
during training nor during testing. In the second part, the final biopsy-level decision was found by
integrating the decisions of both views. In addition, they derived a rotation-invariant feature set
based on the Curvelet transformation. Because they targeted only texture features, a segmentation stage
was not necessary. Although their approach achieved better performance than various other schemes for
combining the view-level information, more powerful classifiers, such as ANNs, have to be explored.
To date, most studies in the literature that used the wavelet transform to classify masses as
benign or malignant were limited and ignored the correlation among wavelet
scales. Hu, Yang and Gao (2017) created an improved wavelet-based algorithm, which adopted the
Hidden Markov Tree model of the Dual-Tree Complex Wavelet Transform (DTCWT-HMT)
for micro-calcification diagnosis. This algorithm could capture the correlation between various
wavelet coefficients and model the statistical dependencies. To classify the abnormalities as malignant
or benign, non-Gaussian statistics of real signals were used, and the combined features (DTCWT-HMT
and DTCWT) were optimized by a Genetic Algorithm and an Extreme Learning Machine to enhance diagnostic
accuracy. They compared their method with state-of-the-art diagnosis methods, and their results
demonstrated the high performance of the proposed method in terms of accuracy and stability.
Likewise, the results from the two feature sets combined were better than those from either
DTCWT-HMT or DTCWT alone. Nevertheless, future work is needed to evaluate this method on a bigger
dataset that covers a larger variety of micro-calcification types.
To summarise, detection and classification algorithms for micro-calcifications in mammography can
improve the accuracy of breast cancer detection and classify tumors as benign or malignant with good
results, although they still need improvement due to the lack of datasets and the high FP values.
Table 2.7 presents a comparison of all the mentioned approaches for the detection and classification
of micro-calcifications.
Table 2.7: Comparison of detection and classification methods of micro-calcifications

Micro-calcifications detection and classification methods

Papadopoulos, Fotiadis and Likas (2005):
    Nijmegen set, with SVM: 𝐴𝑍 = 0.79 original feature set, 𝐴𝑍 = 0.77 enhanced feature set
    Nijmegen set, with ANN: 𝐴𝑍 = 0.70 original feature set, 𝐴𝑍 = 0.76 enhanced feature set
    MIAS set, with SVM: 𝐴𝑍 = 0.81 original feature set, 𝐴𝑍 = 0.80 enhanced feature set
    MIAS set, with ANN: 𝐴𝑍 = 0.73 original feature set, 𝐴𝑍 = 0.78 enhanced feature set
Suhail, Sarwar and Murtaza (2015):
    Sensitivity = 91%
    Specificity = 97%
    Precision = 85%
    Accuracy = 93%
Boulehmi, Mahersia and Hamrouni (2016):
    80% accuracy
    76% sensitivity
    81.25% specificity with SVM
Ciecholewski (2016):
    80.5% similarity index
    75.7% similarity fraction
    70.8% overlap value
    19.8% extra fraction
    0.83 average executing time
Khehra and Pharwaha (2016):
    0.8651 overall accuracy with MFFB-ANN
    0.9016 overall accuracy with SVM
Bekker et al. (2016):
    Classification accuracy = 69.5%
    Sensitivity = 68.1%
    Specificity = 69.7%
    𝐴𝑍 = 0.75
Hu, Yang and Gao (2017):
    Nijmegen set: 𝐴𝑍 = 0.9856
    MIAS set: 𝐴𝑍 = 0.9941
    DDSM set: 𝐴𝑍 = 0.9168
2.5 Scope
The aim of this work is to develop a new and improved Computer-Aided Diagnosis system that could
assist radiologists in detecting and classifying micro-calcifications using temporal digital
mammograms. Moreover, our algorithm will have valuable advantages compared to the current algorithms
already described in the literature. First, it will be completely automatic, requiring no manual input
from the radiologists other than the prior and current mammograms. Second, we will eliminate false
positives, which are the main drawback of the existing algorithms, using machine-learning
algorithms.
3 Methodology of the Proposed Algorithm
3.1 Detection of abnormal ROIs
3.1.1 Computer-Aided Diagnosis System Pipeline
The pipeline of our proposed system is presented in Figure 3.1 and contains the following stages. First,
the prior and current mammograms were normalized and then underwent a pre-processing step with two
different filters for the enhancement of abnormal areas and the removal of unnecessary regions. After
that, each temporal image pair was aligned by applying the Demons registration algorithm to the prior
mammogram. Next, the registered mammogram was subtracted from the current one, and the
difference image went through a series of post-processing techniques, such as filtering, thresholding,
erosion and dilation, to remove the unnecessary regions. Then, we eliminated the remaining areas that
corresponded to the periphery and the old micro-calcifications. Finally, for each ROI, basic intensity
and pixel-based features were acquired for classification as micro-calcification or normal using
machine-learning techniques.
3.1.2 Dataset
In this research project, we used 8 pairs of full-field digital temporal mammograms from The Breast
Center of Cyprus and tested 32 pairs of Cranio-Caudal (CC) view and 13 pairs of Mediolateral-Oblique
(MLO) view cases, from 43 women who underwent routine screening mammography examinations. A
breast radiologist with 9 years of experience selected the mammograms. The women's ages ranged from
58 to 73, with a mean age of 65.25 years and a median age of 66. Of the eight pairs of mammograms
used in the project, three belonged to healthy subjects who presented malignant
micro-calcifications in neither the prior nor the current mammograms. In contrast, in the remaining
five pairs malignant micro-calcifications appeared in the next sequential screening examination. For
every subject, four standard mammograms were taken: left and right MLO views and left and right CC
views. The dimensions of the mammograms were 4096x3328 pixels, in an 8-bit format.
Table 3.1: Distribution of our testing dataset

View     First exam (Prior): Normal    Second exam (Current): Normal    Second exam (Current): Abnormal
CC       5                             3                                2
MLO      3                             0                                3
Total    8                             3                                5
Figure 3.1: Computer-aided diagnosis system pipeline
3.1.3 Normalization
Our algorithm began with the normalization of the current and prior mammograms. In image processing,
normalization is used to adjust the range of pixel intensity values. More specifically,
normalization can bring the image into a range that is more familiar to the senses, and once the
mammograms are normalized, the same filters and algorithms can be applied to them. In our
case, we normalized both mammograms by dividing them by 4096, which was the number of rows
in the original images. The size of the mammograms remained the same and the visualization of
the image did not change.
3.1.4 Pre-processing
The pre-processing step was crucial because we wanted to discard the background of the mammogram,
including the pectoral muscle in the MLO view, without removing any other relevant details of the image
that corresponded to micro-calcifications. Consequently, we applied Matlab's border removal,
which suppresses light structures connected to the border (Soille, 1999). On MLO mammograms, the
pectoral muscle can be found in the top left or right corner, depending on the orientation of the
image. Its regions are always brighter than the normal areas on a mammogram, but its intensity differs
only minimally from that of the abnormal regions (Shanmugavadivu and Sivakumar, 2013). With this filter,
the brighter areas that were linked to the border were removed without making any changes to other
regions in the mammograms (Soille, 1999).

Figure 3.2: Example of a temporal pair of mammograms: (a) current, (b) prior
The next phase in the pre-processing step was the elimination of the high-intensity background that did
not correspond to abnormal areas, in order to detect micro-calcifications more efficiently. Therefore,
we used contrast adjustment with Gamma correction, because the mapping between the input and output
images was nonlinear (Ma et al., 2014). The Gamma factor takes values from zero to infinity. If it is
equal to one, the mapping is linear; if it is less than one, the mapping is weighted toward brighter
output values; and if it is greater than one, the mapping is weighted toward darker output values.
Thus, to suppress the bright background, we set the Gamma factor to two.
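The effect of the Gamma factor can be illustrated with a minimal sketch. It assumes the usual
power-law form, output = input^gamma, on intensities already normalized to [0, 1]; the function name
is hypothetical and the exact Matlab call used in this work is not reproduced here.

```python
def gamma_correct(image, gamma=2.0):
    """Apply a power-law mapping to an image given as a list of rows in [0, 1].

    gamma > 1 pushes mid-range intensities toward darker values,
    gamma < 1 pushes them toward brighter values, gamma == 1 is the identity.
    """
    return [[pixel ** gamma for pixel in row] for row in image]
```

For example, with a Gamma factor of two a mid-grey pixel of 0.5 is mapped to 0.25, while pixels at 0 and
1 are left unchanged, so the brightest structures keep their intensity and the background is darkened.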
3.1.5 Registration
After the pre-processing step, the current and prior mammograms were registered. As
mentioned in the literature, alignment is important for an effective subtraction between the temporal
pair of mammograms, so two well-known algorithms, Affine and Demons, were tested to determine which
one had better accuracy based on the residuals (Marias et al., 2005). The residual is a measure of the
effectiveness of the subtraction and is the sum of the remaining pixels after the subtraction of the
current and registered images.
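The residual described above can be sketched as follows. The text does not state whether signed or
absolute differences are summed, so this sketch assumes the sum of absolute pixel differences; the
function name is hypothetical.

```python
def residual(current, registered):
    """Sum of absolute pixel differences between two equally sized images.

    Assumption: absolute differences are used, so a perfect alignment
    of identical images gives a residual of zero.
    """
    return sum(
        abs(c - r)
        for row_c, row_r in zip(current, registered)
        for c, r in zip(row_c, row_r)
    )
```

Under this assumption, the registration method that yields the smaller residual on normal image pairs
is the one that aligns the prior mammogram more closely with the current one.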
3.1.5.1 Affine Registration Algorithm
The first registration algorithm that we tried was Affine. Affine is a linear mapping technique that
preserves points, straight lines and planes and consists of scaling, rotation and translation. Affine
is a global method, in which all pixels go through the same transformation (Diez et al., 2011).
Usually, Affine is used to correct geometric distortions, such as differences in image size caused by
wrong camera angles. In our case, the differences appeared because of the way that the mammography was
taken over the years (Hadjiiski et al., 2001). This registration technique is intensity-based, which
means that it maps the pixels from the prior mammogram to the current one based on relative intensity
patterns. Our technique was iterative, and the iteration number indicates how many times the
registration process takes place. For this reason, we tested Affine registration on 20 dense and 20
fatty mammograms, all normal without malignant micro-calcifications, for five iteration numbers: 100,
200, 300, 400 and 500. We examined the results to find the appropriate iteration number. In all cases,
the prior mammogram was the one that was registered in order to be compared with the current one.
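The global nature of the Affine mapping can be illustrated with a small sketch that applies the same
scaling, rotation and translation to any pixel coordinate. The parameter names are hypothetical, and
the sketch does not show the intensity-based, iterative search used to estimate these parameters.

```python
import math


def affine_point(x, y, scale=1.0, angle_deg=0.0, tx=0.0, ty=0.0):
    """Map (x, y) by a uniform scaling, a rotation about the origin and a translation."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    new_x = scale * (cos_t * x - sin_t * y) + tx
    new_y = scale * (sin_t * x + cos_t * y) + ty
    return new_x, new_y
```

Because every pixel is mapped with the same handful of parameters, such a transformation cannot
represent deformations that differ from one breast region to another.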
3.1.5.2 Demons Registration Algorithm
Demons was the second registration technique that was tested. Demons is a local method, which
transforms the image pixels locally, applying a different transformation to each depending on their
regional similarity and location. In contrast to global methods, local techniques can handle more
complicated deformations. Additionally, this algorithm treats registration as a diffusion process
influenced by the optical flow formulation, and it can sometimes include regularization to ensure
smoothness and continuity (Diez et al., 2011). With Demons, we estimated the displacement field that
aligns the prior image with the reference (current) one. We created a new registered image, which was
a warped version of the moving (prior) image, changed according to the displacement field and by
applying linear interpolation. As before, the prior mammogram was registered and compared with the
current one. Likewise, we examined Demons registration on the same cases as Affine,
analyzed the results and compared the two registration methods to find the best technique.
3.1.6 Temporal Subtraction
Once the registration was completed, the next step was the temporal subtraction. We used simple image
subtraction to subtract the pre-processed and registered (prior) mammogram from the current one.
Likewise, we set all the pixels that corresponded to the registered (prior) image to zero, because we
did not need our algorithm to 'see' old and suspicious areas that were removed over the years. With
this technique, we kept only the areas that were not found in the previous examinations and can point
to possible micro-calcifications (Yin, 1999). To assess the effectiveness of the subtraction, we
measured the contrast ratio of the subtracted image and compared it with the contrast ratio of the
current one (without processing) for the eight temporal pairs of our dataset. The contrast ratio of an
image is defined as the ratio of the maximum pixel value to the mean pixel value of the entire image.
The goal was to increase the contrast, in order to help the radiologists with better visualization.
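The subtraction and the contrast ratio can be sketched as follows. The sketch assumes that setting the
pixels of the prior image to zero amounts to clipping negative differences to zero; this
interpretation and the function names are assumptions made for illustration.

```python
def subtract_and_clip(current, registered_prior):
    """Subtract the registered prior image from the current one, clipping negatives to zero."""
    return [
        [max(c - p, 0.0) for c, p in zip(row_c, row_p)]
        for row_c, row_p in zip(current, registered_prior)
    ]


def contrast_ratio(image):
    """Ratio of the maximum pixel value to the mean pixel value of the whole image."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    if mean == 0:
        raise ValueError("contrast ratio is undefined for an all-zero image")
    return max(pixels) / mean
```

A difference image in which only a few bright pixels remain on a dark background has a low mean and
therefore a high contrast ratio, which is the behaviour the subtraction is meant to produce.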
3.1.7 Post-processing
3.1.7.1 Filtering
Post-processing was applied to the difference image acquired in the previous stage. In detail, an
effective filter was needed in order to enhance the micro-calcifications and separate them from the
background without causing distortions in other areas. From the literature, we saw that
micro-calcifications, and abnormalities in general, have higher intensity values and appear brighter
than other areas in the mammograms. Hence, for this stage we tested a variety of filters to find the
best one. First, we used a standard deviation filter, which calculates the standard deviation of a 3x3
neighborhood around each pixel in the input mammogram. Second, we tried contrast-limited adaptive
histogram equalization (CLAHE), which enhances the contrast of the input image by changing the
intensity values of the pixels (Zuiderveld, 1994). Third, we tried a range filter, which is a local
pixel-based filter where every output pixel contains the range value, that is, the difference between
the maximum and the minimum value, of the 3x3 neighborhood around the corresponding pixel in the
difference image (Bailey and Hodgson, 1985). Finally, after the implementation of those filters, we
chose the filter that had the best performance.
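The range and standard deviation filters can be sketched in a few lines. The border handling is an
assumption: here the 3x3 window is simply truncated at the image edges, which may differ from the
padding used by the Matlab functions. The function names are hypothetical.

```python
import statistics


def _neighborhood(image, r, c):
    """Values in the 3x3 window centred on (r, c), truncated at the image borders."""
    rows, cols = len(image), len(image[0])
    return [
        image[i][j]
        for i in range(max(0, r - 1), min(rows, r + 2))
        for j in range(max(0, c - 1), min(cols, c + 2))
    ]


def range_filter(image):
    """Each output pixel is max - min of its 3x3 neighborhood."""
    return [
        [max(_neighborhood(image, r, c)) - min(_neighborhood(image, r, c))
         for c in range(len(image[0]))]
        for r in range(len(image))
    ]


def std_filter(image):
    """Each output pixel is the sample standard deviation of its 3x3 neighborhood."""
    return [
        [statistics.stdev(_neighborhood(image, r, c)) for c in range(len(image[0]))]
        for r in range(len(image))
    ]
```

Both filters respond strongly where a small bright spot sits on a darker background and return zero
in uniform regions, which is why they are candidates for highlighting micro-calcifications.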
3.1.7.2 Thresholding
Afterwards, the grayscale image obtained earlier was converted to binary. The main idea was to separate
the high-intensity pixels that could point to micro-calcifications from the low-risk areas that we did
not consider essential. With thresholding, we created a binary image by replacing all the pixel values
equal to or higher than the threshold value with ones (white) and all the remaining pixels with zeros
(black). The micro-calcifications and all the ROIs appeared as white pixels, while the background was
erased, so we had only the regions of interest. The threshold value was set to 0.08 by trial and error.
The threshold value was set to a relatively small value to meet the needs of our goal, which was to
remove only the areas that did not belong to micro-calcifications (Ma et al., 2015).
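The thresholding rule described above can be written as a minimal sketch, using the reported threshold
of 0.08 as the default; the function name is hypothetical.

```python
def threshold(image, level=0.08):
    """Binarize an image: 1 where the pixel is >= level, 0 elsewhere."""
    return [[1 if pixel >= level else 0 for pixel in row] for row in image]
```

Pixels exactly equal to the threshold are kept as white, matching the "equal to or higher" rule.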
3.1.7.3 Morphological Operations
Later, we decided to further process the binary image with morphological operations. Morphological
image processing comprises a group of non-linear procedures that are associated with the shape and
morphology of features in an image and depend on the relative ordering of pixels instead of on their
intensity values. The first operation was erosion, which removed from the binary image isolated pixels
that did not relate to micro-calcifications but at the same time decreased the size of the ROIs. Then
dilation took place, which connected all the grouped pixels together to reveal the clustered
micro-calcifications (Efford, 2002).
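Erosion followed by dilation can be sketched on a binary image as below. The structuring element is
not specified in the text, so a 3x3 square is assumed, and the window is truncated at the image
borders; the function names are hypothetical.

```python
def _window(image, r, c):
    """Values in the 3x3 window centred on (r, c), truncated at the image borders."""
    rows, cols = len(image), len(image[0])
    return [
        image[i][j]
        for i in range(max(0, r - 1), min(rows, r + 2))
        for j in range(max(0, c - 1), min(cols, c + 2))
    ]


def erode(image):
    """A pixel stays 1 only if every pixel in its window is 1."""
    return [[min(_window(image, r, c)) for c in range(len(image[0]))]
            for r in range(len(image))]


def dilate(image):
    """A pixel becomes 1 if any pixel in its window is 1."""
    return [[max(_window(image, r, c)) for c in range(len(image[0]))]
            for r in range(len(image))]


def erode_then_dilate(image):
    """Remove isolated pixels with erosion, then restore the surviving regions."""
    return dilate(erode(image))
```

With this structuring element, a single isolated white pixel disappears, while a compact 3x3 group
shrinks to its centre during erosion and is grown back by the dilation.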
3.1.8 Removal of the periphery pixels
Next was the removal of the periphery regions that remained after the processing of the images. The
registration procedure aligned the prior and the current mammograms, but in most cases some
minor misalignments remained in the periphery, which produced false indications of abnormalities in
those areas. For that reason, we excluded all the periphery pixels that our algorithm had detected
incorrectly, to minimize the error of the proposed algorithm.
3.1.9 Removal of the old micro-calcifications
After the detection of the suspected ROIs, we decided to remove the micro-calcifications from the
previous mammograms that had somehow been eliminated by the next screening round (current mammogram)
and had not been removed by the temporal subtraction. We did that because we needed only the new
micro-calcifications. The radiologist marked the current and the previous mammograms, and we created an
algorithm that compared the two images and removed the areas that existed in the previous
mammograms. For more accurate results, we dilated the marked regions in the registered image,
because that image was distorted after Demons registration and we wanted to map the
micro-calcifications correctly to their corresponding location in the current image.
With the new ground truth image obtained after the removal of the periphery pixels, we
characterized the ROIs detected by the proposed algorithm as micro-calcifications or
normal. As a result, we constructed the ground truth images for the next steps.
3.1.10 Evaluation of the proposed algorithm
Once the proposed algorithm had discovered the possible regions of interest, we evaluated the results to examine the efficiency of our method. We determined the true positives, false positives, false negatives and the F1-score. An image region was identified as micro-calcification (positive) or normal (negative), and the decision could be either correct (true) or incorrect (false). Hence, each detection result fell into one of four possible categories: true positive (TP), false positive (FP), true negative (TN) or false negative (FN). The FN and FP were the wrong assessments of the algorithm: a false negative indicated that a true micro-calcification was not discovered, and a false positive that a normal area was characterized as abnormal. Similarly, a true positive decision correctly identified a micro-calcification and a true negative decision correctly identified a normal ROI. Because the TN could not be counted, since all the remaining regions that were not identified were normal, we measured the F1-score. The F1-score is the harmonic mean of precision and recall and can be used when the TN values are not available. Its formula is given below:
$$F_1\text{-score} = \frac{2TP}{2TP + FP + FN}$$
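As a worked illustration of the F1-score formula, the following short Python sketch evaluates it for two rows of Table 4.1 (patients 1 and 4). The function name and the zero-denominator guard are assumptions added for illustration.

```python
def f1_score(tp, fp, fn):
    """F1-score from detection counts; returns 0.0 when there are no counts at all."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Counts taken from Table 4.1 of this thesis.
patient_1 = f1_score(tp=14, fp=32, fn=0)   # 28 / 60
patient_4 = f1_score(tp=6, fp=12, fn=0)    # 12 / 24
```

Because the true negatives do not appear in the formula, the score can be computed even though the normal regions that were never detected are not counted.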
3.2 Elimination of False Positives
The following steps concerned the elimination of regions that our algorithm had falsely identified as micro-calcifications. False positive regions alert the radiologists without reason and lead to wasted time and false alarms. The removal of those areas is important and necessary for CAD algorithms.
3.2.1 Feature Extraction and Selection
To date, CAD systems have improved the radiologist’s performance by up to 15%, but the main weakness of those systems, besides the relatively small datasets, has been the high number of false positives. In the literature, diverse approaches have been proposed for the reduction of false positives, with a large number of studies focused on machine-learning applications. In detail, basic features such as textural, intensity-based, geometric and shape features were extracted for the detected ROIs and passed to a classifier that categorized the ROIs as micro-calcifications (true positives) or false positives. The selected features play a key role in the precise and accurate classification of the ROIs, and the combination of different feature types does not always guarantee better classification results. Given that, a feature selection approach was needed to find the appropriate features for the classification step (Nguyen et al., 2015). The ROIs were characterized as TP or FP based on the previous step.
3.2.1.1 Feature Extraction
In our proposed algorithm, thirteen image features were used to characterize the intensity, texture and geometry of the regions of interest. This set of features was preferred for its capability to categorize the regions as micro-calcifications or not, based on the characteristics of micro-calcifications. The features were computed on every ROI previously found, after the characterization of the areas as TP or FP based on the ground truth. We extracted the features from the subtracted image and from the current image after the pre-processing step, in order to find which features gave the best results. Each ROI was associated with a 13-dimensional feature vector, with one dimension for every feature calculated (Ma et al., 2015). The features were extracted only within the ROIs, not within the bounding box, which was the smallest rectangle that included the ROI (Nguyen et al., 2015).
First Order Statistics (FOS) Features
Seven FOS features were computed on every detected ROI: the average gray level, the maximum gray-level intensity, standard deviation, coefficient of variance, entropy, skewness and kurtosis. They were determined by the standard mathematical equations.
Shape Features
Six shape measurements were extracted for the regions of interest previously found by the proposed methodology: area, eccentricity, convex area, filled area, solidity and extent. Area measures the actual number of pixels in the ROI. Eccentricity is the ratio of the distance between the foci of the ellipse that has the same second moments as the region to its major axis length. The eccentricity value ranges from zero to one; an ellipse with eccentricity zero is a circle, while an ellipse with eccentricity one is a line segment. Convex area is the number of pixels in the convex image, which is the image that specifies the convex hull. Similarly, filled area is the number of pixels in the filled image, which is a binary image of the same size as the bounding box of the region, with all the holes filled in. Solidity is the proportion of pixels in the convex hull that are also in the region. Finally, extent is the ratio of pixels in the region to pixels in the total bounding box (MATLAB-Image Processing Toolbox). Those features were computed with Matlab’s regionprops.
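To make the feature definitions above concrete, the sketch below computes the FOS features and two of the shape features in plain Python. It is not the MATLAB code used in the thesis: the gray levels are assumed to lie in [0, 1], population moments (division by n) are assumed, the entropy uses an assumed 256-bin histogram, and the ROI is represented as a set of (row, column) pixels.

```python
import math


def fos_features(values, bins=256):
    """First-order statistics of the ROI gray levels (assumed to lie in [0, 1])."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n  # population variance (assumption)
    std = math.sqrt(variance)
    third = sum((v - mean) ** 3 for v in values) / n
    fourth = sum((v - mean) ** 4 for v in values) / n
    # Histogram of gray levels, used for the Shannon entropy in bits.
    counts = {}
    for v in values:
        b = min(int(v * bins), bins - 1)
        counts[b] = counts.get(b, 0) + 1
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {
        "mean": mean,
        "max": max(values),
        "std": std,
        "coefficient_of_variation": std / mean if mean else 0.0,
        "entropy": entropy,
        "skewness": third / std ** 3 if std else 0.0,
        "kurtosis": fourth / std ** 4 if std else 0.0,
    }


def shape_features(pixels):
    """Area and extent of an ROI given as a set of (row, col) pixel coordinates."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    area = len(pixels)
    bbox_area = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    return {"area": area, "extent": area / bbox_area}


# Hypothetical ROI: four gray levels and an L-shaped region of three pixels.
fos = fos_features([0.0, 0.5, 0.5, 1.0])
shape = shape_features({(0, 0), (0, 1), (1, 0)})
```

Eccentricity, convex area, filled area and solidity need an ellipse fit, a convex hull and hole filling, which are left out of this sketch.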
Table 3.2: Features extracted from both subtracted and current image
FOS features      Shape features
Mean Intensity    Area
Max Intensity     Eccentricity
Skewness          Convex Area
Kurtosis          Filled Area
STD               Solidity
Variance          Extent
Entropy
3.2.1.2 Feature Selection
Feature selection was a necessary and valuable step before the implementation of machine-learning algorithms. The removal of insignificant and unnecessary features can increase the performance of the classifier. Of the thirteen features previously computed on the subtracted and the current images, only the best were passed to the classifier for the classification step. The feature selection process took place for both images, in order to choose the best way to eliminate FP (Nguyen et al., 2015). For this purpose, a hypothesis test and a multivariate analysis of variance were required, to discover the features with the largest contribution to the separation of TP and FP areas.
First, we conducted a paired t-test for all the features that were extracted from the two images. The t-test is used to compare two population means, where the samples in the first population can be paired with the corresponding samples in the second population. The test returns a decision for the null hypothesis that the data in the two populations come from independent random samples from normal distributions with equal means and equal but unknown variances. The alternative is that the data come from populations with unequal means, so they are statistically different and can be separated. The variable h is equal to 1 when the test rejects the null hypothesis at the 5% significance level and 0 otherwise. The p-value is compared with the significance level: if it is less than 0.05, the null hypothesis is rejected and the distributions can be set apart. On the contrary, if the p-value is greater than 0.05, the null hypothesis is not rejected and the two groups cannot be distinguished (Hsu and Lachenbruch, 2008).
For the purpose of this study, the false positives were eliminated in order to improve the accuracy of the suggested algorithm. The two populations for the t-test analysis were the features that were extracted from the TP and FP regions. In this way, the features that are statistically different and can distinguish the two populations as TP and FP were found, and the features extracted from the subtracted image were compared with those extracted from the current image.
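The test statistic behind the decision described in this section can be sketched in Python. The sketch assumes the equal-variance two-sample form, which matches the null hypothesis stated above (independent samples, equal but unknown variances); whether the thesis code used exactly this variant is an assumption, and the feature values are made up. The Python standard library has no Student t distribution, so only the statistic and the degrees of freedom are computed, and the p-value would come from a statistics package.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(sample_a, sample_b):
    """Two-sample t statistic with pooled variance, plus its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    dof = n_a + n_b - 2
    # Pooled estimate of the common variance (statistics.variance divides by n - 1).
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / dof
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, dof


# Hypothetical values of one feature in TP and FP regions (not thesis data).
tp_values = [1.0, 2.0, 3.0]
fp_values = [2.0, 3.0, 4.0]
t_stat, dof = pooled_t_statistic(tp_values, fp_values)
```

A large absolute value of the statistic relative to the t distribution with `dof` degrees of freedom corresponds to a small p-value and therefore to h = 1.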
After the detection of the features that contribute the most to the classification of ROIs as TP and FP, it was essential to examine whether combining them would result in a better p-value than single features. Thus, multivariate analysis of variance (MANOVA) was required to select the best feature subset among the available features. The MANOVA test was used to examine this hypothesis and, rather than a single p-value for each feature, a multivariate p-value was obtained based on the comparison of the error variance and covariance matrices. The covariance matrix was needed because the features were related to each other and their correlation had to be considered. Variables that maximize the differences between the groups were constructed to analyze the dependent features, and those variables were linear combinations of the measured dependent variables (French et al., n.d.). MANOVA returns the variable d, which is an estimate of the dimension of the space containing the group means. When d equals 0 the null hypothesis is not rejected at the 5% significance level, but if d equals 1 the hypothesis is rejected. As with the t-test, the p-value is compared with the significance level. The MANOVA test makes three assumptions about the input data: the populations for all the groups are normally distributed, the variance-covariance matrix is identical for every population, and all the observations are independent (Krzanowski, 2008).
In this case, an algorithm was constructed that automatically selected, from the features already retained by the paired t-test, the combination that achieved the lowest p-value and was therefore appropriate for the classification part. The remaining features were included as well, because they sometimes help the classification despite their high p-value.
3.2.2 Classification
For the classification of the ROIs, we decided to apply Discriminant Analysis (DA) with Matlab’s ‘classify’. With discriminant analysis, we found the function that best separated the two classes in our problem (TP and FP). More generally, the purpose of discriminant analysis is to find a combination of features that characterizes or separates two or more classes of objects or events. This combination can be used as a linear or non-linear classifier for new data. A region is assigned to class one or class two based on the classification boundary. If the region is assigned to the wrong class, an error occurs (Mika et al., 1999). Discriminant Analysis involves the determination of a linear equation, similar to regression, that is given below and predicts the class to which each ROI belongs.
$$D = v_1 X_1 + v_2 X_2 + v_3 X_3 + \cdots + v_i X_i + a$$
In this equation, D is the discriminant function, v the discriminant coefficient, X the score for the corresponding variable, a is a constant and i the number of predictor variables. This function must maximize the distance between the classes and separate those classes and any new regions. The assumptions of DA are:
- the observations are random;
- every ROI is normally distributed;
- every allocation to the dependent classes in the initial classification is accurate;
- there must be at least two classes, and each region can be part of only one class, because the classes are mutually exclusive;
- every class should be precisely defined and its differences from the other classes must be clear (Krzanowski, 2008; Seber, 1984).
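The linear discriminant function given in this section can be evaluated with a few lines of Python. The coefficients, the constant and the decision threshold at zero are hypothetical values chosen only to show the mechanics; in the thesis the fitted function comes from Matlab’s ‘classify’.

```python
def discriminant_score(features, coefficients, constant):
    """D = v_1 X_1 + ... + v_i X_i + a for one ROI."""
    if len(features) != len(coefficients):
        raise ValueError("one coefficient is needed per feature")
    return sum(v * x for v, x in zip(coefficients, features)) + constant


def classify_roi(features, coefficients, constant):
    """Assign the ROI to TP when D is positive, otherwise to FP (threshold assumed at 0)."""
    return "TP" if discriminant_score(features, coefficients, constant) > 0 else "FP"


# Hypothetical two-feature example (not fitted to thesis data).
coefficients = [0.5, -1.0]
constant = 0.25
score = discriminant_score([1.0, 2.0], coefficients, constant)   # 0.5 - 2.0 + 0.25
label = classify_roi([1.0, 2.0], coefficients, constant)
```

Fitting the coefficients requires the class means and covariance matrices of the training features, which this sketch leaves to the classifier library.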
In our case, we created a training set with the features that were extracted earlier, to train our classifier, and a test set to evaluate it. To train the proposed classifier, the 8 mammograms were used in a leave-one-patient-out procedure. During each round, one mammographic image was used as the test sample and the remaining images as the training set, until all the cases in our dataset were classified. It is important to state that, to avoid any bias, during the leave-one-patient-out procedure the test mammogram was removed from the training stage. This achieved complete separation of the test sample from all of the training sets.
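The leave-one-patient-out procedure can be sketched as a simple generator. The tuple layout `(patient_id, features, label)` and the sample values are assumptions for illustration; the point is that every ROI of the held-out patient is excluded from training.

```python
def leave_one_patient_out(samples):
    """Yield (held_out_patient, training_samples, test_samples) for every patient.

    samples: list of (patient_id, features, label) tuples, one per ROI.
    """
    patients = sorted({patient_id for patient_id, _, _ in samples})
    for held_out in patients:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test


# Hypothetical ROIs from three patients (feature values are made up).
samples = [
    (1, [0.2, 0.7], "TP"),
    (1, [0.1, 0.3], "FP"),
    (2, [0.4, 0.9], "TP"),
    (3, [0.3, 0.2], "FP"),
    (3, [0.6, 0.8], "TP"),
]
folds = list(leave_one_patient_out(samples))
```

With the eight cases of the dataset, this loop would run eight rounds, one per patient.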
3.2.3 Evaluation of the classification
As before, we calculated the TP, FP and FN, and now we also included the TN, since we classified only the previously detected regions, for the evaluation of the classification step. With those indices, we created a confusion matrix (Table 3.3) for the total values of all eight cases.
Table 3.3: Confusion matrix
                   True Class
Predicted Class    Positive          Negative
Positive           True Positive     False Positive
Negative           False Negative    True Negative
Next, the performance of the diagnostic system was measured with the accuracy. The classification accuracy shows the percentage of diagnostic decisions that were correct, and the formula is given below (Cheng et al., 2003). We also computed the sensitivity, which is the true positive rate, and the specificity, which corresponds to the true negative rate; their formulas are also shown below. We compared the results with the previous detection results to see whether the false positives were successfully eliminated and the algorithm was improved.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\text{Sensitivity (True Positive Rate)} = \frac{TP}{TP + FN}$$
$$\text{Specificity (True Negative Rate)} = \frac{TN}{TN + FP}$$
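The three evaluation measures can be computed directly from the cells of the confusion matrix, as in the following Python sketch. The counts in the example are made-up numbers used only to exercise the formulas; they are not results of this thesis.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Illustrative counts (assumed, not thesis data).
metrics = classification_metrics(tp=8, tn=6, fp=2, fn=4)
```

For these counts the accuracy is 14/20, the sensitivity 8/12 and the specificity 6/8.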
Figure 3.3: Detailed representation of the proposed algorithm
4 Results
4.1 Detection of abnormal ROIs
4.1.1 Pre-processing
The pre-processing step involved the border removal filter, which suppresses high-intensity structures connected to the border. All pairs of mammograms (normal and abnormal cases) from our dataset were tested, and the results are shown in Figure 4.1 for two normal-to-abnormal cases. The outcome shows that this step was effective and discarded a large number of unnecessary pixels from the mammograms without removing any important regions in the images. Also, in the MLO views this filter eliminated the pectoral muscle, which was redundant information in the mammograms for this study.
Figure 4.1: Example of the clear border removal in two cases (a) normalized mammograms with red circle in the malignancy (b) border removal
Later, Gamma correction took place and further cleared the irrelevant pixels from the processed image. As before, the Gamma filter was applied to all the mammograms (current and prior) from our dataset, and Figure 4.2 shows the results for the same two cases. It is clear from the resulting images that the remaining background was eliminated without removing any regions of interest. Border removal combined with Gamma correction formed an efficient pre-processing step, which removed the unnecessary information in the mammograms.
Figure 4.2: Example of the Gamma correction in two cases (a) border removal (b) Gamma correction
4.1.2 Registration
When the pre-processing step was finished, registration took place. We evaluated the registration performance of two different registration methods, Affine and Demons, on a subset of 40 dense and fatty mammographic temporal pairs from our dataset. We determined the most suitable settings for the two algorithms and the best method for the proposed algorithm. Then, we subtracted the current and the registered images. As an evaluation measure, the residual was determined on the subtracted image.
4.1.2.1 Affine Registration Algorithm
We ran 40 temporal pairs, 20 dense and 20 fatty, with Affine registration for five iteration numbers: 100, 200, 300, 400 and 500. As an example, Figure 4.3 shows, for the same two cases as above and for 200 iterations, the current and prior mammograms and the subtracted image between the current and the newly registered prior image. It is noticeable that the prior image was altered in order to look like the current one.
Figure 4.3: Affine registration in two cases (a) current mammogram (b) prior mammogram (c) subtracted image
The next step was to decide on the most suitable iteration number. For that, two box plots were made, one for dense and one for fatty mammograms, showing the percentage of the residuals of the subtracted images against the iteration number. Figure 4.4 displays the residuals in percentage for the dense mammograms and Figure 4.5 for the fatty mammograms. Even though fatty and dense mammograms differ in tissue composition and dense mammograms seem more complicated, we observed from the two box plots that they needed the same iteration number for an efficient registration. The iteration number that minimized the residual for both categories was 200, and we chose this number for our algorithm. However, the mean residuals of the fatty mammograms over all the iteration numbers were slightly lower (approximately 33%) than the corresponding ones of the dense mammograms (approximately 38%).
Figure 4.4: Box plot for dense mammograms
Figure 4.5: Box plot for fatty mammograms
4.1.2.2 Demons Registration Algorithm
After the completion of the Affine registration method, we evaluated Demons registration on the same temporal pairs that we used with Affine. Demons calculated the displacement field to align the prior mammogram with the current one. Figures 4.6 and 4.7 show, for the same two examples as above, an image pair with the current (fixed) and prior (moving) mammograms and how the second one has to change (yellow arrows) based on the displacement field found by Demons.
Figure 4.6: Displacement field for Demons registration in the 1st example
Figure 4.7: Displacement field for Demons registration in the 2nd example
Figure 4.8 presents the current mammogram, the prior one and the subtracted image between the current and the newly registered prior image. With Demons, the prior mammogram was changed and adjusted to the current mammogram. Next, two box plots were created, one for dense and one for fatty mammograms, with the percentage of the residuals of the subtracted images; they are shown in Figure 4.9. From these box plots we noticed that, as with Affine, the mean residual percentage of the fatty mammograms (approximately 24%) was 3 percentage points lower than the corresponding mean value of the dense mammograms (approximately 27%), and this, as already explained, was because dense mammograms were harder to deal with.
Figure 4.8: Demons registration in two cases (a) current mammogram (b) prior mammogram (c) subtracted image
4.1.2.3 Comparison
Finally, we compared the two registration techniques based on our previous results and chose the best method for our algorithm. Two box plots were created with the residuals in percentage for the 20 temporal cases for both Affine and Demons. Figure 4.10 shows the box plot for the dense mammograms and Figure 4.11 the box plot for the fatty mammograms.
Figure 4.9: Box plot for dense and fatty mammograms
Figure 4.10: Comparison of dense mammograms
For dense mammograms, the mean residual with Affine was 37%, approximately 10 percentage points higher than the residual with Demons. Likewise, in fatty mammograms the average residual with Demons was 24%, 11 percentage points lower than that of Affine. Again, the residuals in dense mammograms were larger than the corresponding ones for the fatty mammograms. Consequently, the Demons registration technique was selected, due to its unique characteristics and its better results over Affine, and was applied in the algorithm.
4.1.3 Temporal Subtraction
After the registration, we subtracted the temporal pairs and calculated the contrast ratio of the subtracted image compared to the current one. Figure 4.12 shows a box plot of the contrast ratio of the two images. We created one plot for both dense and fatty mammograms. The contrast ratio of the subtracted image is significantly greater than that of the current image. This indicates that the image was cleared of unnecessary regions and irrelevant details that often distract the radiologist. We removed all the areas that appeared in the previous screening round, including the old micro-calcifications, and in this way we already help the radiologists to better visualize and assess the mammographic images.
Figure 4.11: Comparison of fatty mammograms
4.1.4 Post-processing
4.1.4.1 Filtering
Later, several filters were tested for the enhancement of the subtracted image from the previous stage, including the standard deviation filter, CLAHE and the range filter. We used the same examples, and Figure 4.13 shows the ‘raw’ subtracted image and the image after the standard deviation filter. From the filtered image we understood that this filter was insufficient, because it made the background and all the bright areas smoother, so the micro-calcifications, which were brighter, could not be found. The second technique that we tried was CLAHE, which enhanced the contrast by changing the intensity values of the pixels. Figure 4.14 presents the subtracted and the filtered image. This filter enhanced the high-intensity background instead of removing it and was not appropriate for our goal. Finally, the range filter was used, which returns the range (maximum minus minimum) of the neighborhood of each pixel in the input image. Figure 4.15 shows the result of the range filter. With this filter, the high-intensity background was erased without any change to the other bright areas. For our algorithm, we selected the range filter for the post-processing step.
Figure 4.12: Box plot of the contrast ratio
Figure 4.13: STD filter (a) subtracted image (b) filtered image
Figure 4.14: CLAHE (a) subtracted image (b) filtered image
4.1.4.2 Thresholding
Afterwards, the image was converted to binary; Figure 4.16 shows the filtered and the binary images. With the threshold set to 0.08, we removed the regions that did not belong to abnormalities and kept only the ROIs.
Figure 4.15: Range filter (a) subtracted image (b) filtered image
Figure 4.16: Thresholding (a) filtered image (b) thresholded
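The range filtering and thresholding steps can be sketched in plain Python as follows. This is an illustration rather than the thesis code: the 3x3 neighbourhood, the clipping of the window at the image border and the toy image are assumptions, while the threshold of 0.08 is the value reported above.

```python
def range_filter(img, k=1):
    """Replace every pixel by max - min over its (2k+1)x(2k+1) neighbourhood (clipped at borders)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [
                img[rr][cc]
                for rr in range(max(0, r - k), min(h, r + k + 1))
                for cc in range(max(0, c - k), min(w, c + k + 1))
            ]
            out[r][c] = max(window) - min(window)
    return out


def threshold(img, level=0.08):
    """Binary image: 1 where the filtered value exceeds the threshold, 0 elsewhere."""
    return [[1 if value > level else 0 for value in row] for row in img]


# Toy subtracted image (assumed): flat background with one bright pixel at the top-left corner.
subtracted = [[0.0] * 4 for _ in range(4)]
subtracted[0][0] = 0.5
binary = threshold(range_filter(subtracted))
```

A uniform background, whatever its brightness, has zero local range and is removed by the threshold, while pixels next to an abrupt intensity change are kept.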
4.1.4.3 Morphological Operations
The final step of the post-processing was the morphological operations. Erosion followed by dilation took place to discard the isolated pixels and to group pixels that were close to each other. The resulting images are shown in Figure 4.17 for the two known examples.
Now that the post-processing step was finished, Figure 4.18 shows the current mammogram of the first example, with red circles around all the regions detected by the proposed algorithm, and Figure 4.19 shows the same for the second example.
Figure 4.17: Morphological operations (a) thresholded image (b) new image
Figure 4.18: Resulted image for example 1
4.1.5 Removal of the Periphery Pixels
For further processing, we removed the periphery pixels that corresponded to misalignments and remained after the above procedure, to improve the accuracy of the algorithm. Figure 4.20 shows the two new images for the above examples.
Figure 4.19: Resulted image for example 2
Figure 4.20: Images with removed periphery pixels
4.1.6 Removal of the old micro-calcifications
For the removal of the old micro-calcifications, we took the registered and current images marked by the radiologist and compared them. Figure 4.21 displays the ground truth images, current and registered prior, for the first example; they contain the micro-calcifications (white pixels). Figure 4.22 shows the resulting image after the application of the algorithm that erased the old regions that appeared in both images. This step was crucial, since the algorithm rejected all the micro-calcifications that had disappeared in the current mammogram, and the remaining areas were the new micro-calcifications.
Figure 4.21: Ground truth images
Figure 4.22: New binary image
Next, we created the ground truth images for the next steps. Figure 4.23 shows the true micro-calcifications with red circles in the two previous examples. Figure 4.24 displays, with green circles, the normal regions that our algorithm incorrectly found as abnormal for the same cases.
Figure 4.23: True micro-calcifications
Figure 4.24: False positive areas
4.1.7 Evaluation of the proposed algorithm
For the evaluation of our algorithm, we used the ground truth images from the previous step to classify the regions previously found as true micro-calcifications or normal. Once we had categorized all the regions as true or false, we created a table with all the evaluation measurements for all the cases.
Table 4.1: Evaluation of the algorithm
Patient  Detected ROIs  TP  FP   FN (not detected ROIs)  F1-score
1        46             14  32   0                       0.466
2        73             8   65   0                       0.197
3        58             6   52   0                       0.188
4        18             6   12   0                       0.5
5        27             2   25   0                       0.137
6        30             2   28   0                       0.125
7        82             54  28   36                      0.627
8        42             4   38   1                       0.170
Total    376            96  280  37                      0.301
From the table above, we noticed that the proposed algorithm correctly identified 96 of the 133 micro-calcifications. It is worth noting, though, that none of the 37 false negative cases that the algorithm missed were malignant. Also, with this technique a large number of false positives were found: of the total 376 ROIs that our algorithm identified as important, 280 were false positives. False positives can mislead the radiologist; instead of being helped by the system, the radiologists would be confused, so we needed to eliminate the false positives by upgrading our methodology. Moreover, the average F1-score was only 0.301, which showed that we had to increase the accuracy of the algorithm by removing the irrelevant regions. However, the proposed methodology is valuable despite those limitations and can help the radiologists with the detection of micro-calcifications.
4.2 Elimination of False Positives
The evaluation of the algorithm revealed a considerable number of falsely detected regions. With the following methodology we removed those regions, to increase the performance of the algorithm.
4.2.1 Feature Extraction and Selection
4.2.1.1 Feature Extraction
From the final images obtained earlier with the proposed methodology, 13 FOS and shape features were extracted from each region for the automatic classification of false positive and true positive areas. We evaluated the features from both the subtracted and the current images, and the results are presented in the two tables below. Table 4.2 shows the value of each feature extracted from the current mammogram for 10 randomly selected ROIs, including TP and FP. Similarly, Table 4.3 displays the value of each feature extracted from the subtracted image for the same ROIs. The characterization as TP or FP was based on the previous procedure. The two tables show that the shape features were identical for the current and subtracted images, because those features are calculated from the shape of the region and are therefore independent of the pixel intensities. Conversely, the FOS features, which are intensity-based, differed between the current and the subtracted image.
4.2.1.2 Feature Selection
With feature selection, the best features were found for the next step. In the paired t-test the 13 features were compared, and Tables 4.4 and 4.5 show the resulting h and p values for the current and subtracted images, respectively. The best features are highlighted in red in both tables. Here one should notice the differences between the features from the two images. In the current image, the features that rejected the hypothesis that TP and FP are statistically equal were the max intensity, standard deviation, entropy, area, eccentricity, convex area, filled area and extent. In the subtracted image, however, the features with h equal to 1 were the max intensity, variance, entropy, skewness, kurtosis, area, eccentricity, convex area, filled area and extent. For the features extracted from the subtracted image, the lowest p-value belonged to entropy, whereas for the current image the maximum intensity gave the minimum p-value. Equally important was the fact that the p-values of the best features of the subtracted image were slightly smaller than those of the current image.
Next, the best features discovered by the t-test were passed to the multivariate analysis of variance to find out whether their combination helped. We decided to include all of the features in MANOVA, because occasionally some features improve the p-value even if their response in the t-test was not satisfactory. Table 4.6 contains the results for the current image and Table 4.7 for the subtracted one. Each table shows four different combinations of features. The highlighted combination was the most efficient one and gave the lowest p-value. For the current image, the best features were the mean intensity, max intensity, variance, entropy, skewness, kurtosis, area, eccentricity, solidity and extent. In the subtracted image, the most efficient combination contained the max intensity, standard deviation, variance, entropy, skewness, kurtosis, eccentricity, filled area and extent. In general, the FOS features contributed more to the classification of the two populations than the shape features, because they are related to the intensity value of each pixel and not to the shape of the region. From these results, it was clear that the current image carried more valuable information regarding the features, since its p-value was smaller. Additionally, the combination of features improved the p-value compared to the use of single features. The combination with the smallest p-value, namely the mean intensity, max intensity, variance, entropy, skewness, kurtosis, area, eccentricity, solidity and extent extracted from the current image, was selected for the classification step.
Table 4.2: Features extracted from current image
(FOS features: Mean Intensity to Kurtosis; Shape features: Area to Extent)
ROI  Class  Mean Intensity  Max Intensity  STD  Variance  Entropy  Skewness  Kurtosis  Area  Eccentricity  Convex Area  Filled Area  Solidity  Extent
1 TP 7.08E-05 0.004 361.780 130884.940 0 -0.144 2.256 1732 0.176 3481 3479 0.498 0.423
2 TP 2.72E-05 0.004 137.543 18918.158 0 0.310 1.897 2300 0.209 5673 5650 0.405 0.342
3 TP 1.03E-01 0.312 559.091 312583.277 0 -0.708 2.461 2602 0.287 6970 6922 0.373 0.304
4 TP 9.89E-02 0.266 783.983 614629.225 0 -0.768 2.029 1774 0.229 3619 3617 0.490 0.414
5 TP 2.15E-04 0.043 250.541 62770.737 0 1.989 9.101 1864 0.403 3940 3936 0.473 0.404
6 FP 1.29E-03 0.069 369.577 136586.851 0 2.069 6.954 1950 0.313 4228 4214 0.461 0.381
7 FP 6.61E-03 0.114 740.430 548236.096 0 1.221 3.310 2642 0.502 7128 6991 0.371 0.318
8 FP 7.62E-03 0.087 571.897 327066.722 0 -0.462 2.156 1586 0.164 2996 2996 0.529 0.448
9 FP 3.88E-02 0.148 880.791 775792.459 0 -0.552 1.554 1586 0.164 2996 2996 0.529 0.448
10 FP 2.86E-03 0.064 345.451 119336.504 0 0.254 2.502 1982 0.461 4313 4288 0.460 0.377
Table 4.3: Features extracted from subtracted image
(FOS features: Mean Intensity to Kurtosis; Shape features: Area to Extent)
ROI  Class  Mean Intensity  Max Intensity  STD  Variance  Entropy  Skewness  Kurtosis  Area  Eccentricity  Convex Area  Filled Area  Solidity  Extent
1 TP 5.69E-05 0.004 2.462E-04 6.06E-08 0.03 8.106 94.856 1732 0.176 3481 3479 0.498 0.423
2 TP 5.19E-08 2.88E-05 9.77E-07 9.54E-13 0 23.945 637.385 2300 0.209 5673 5650 0.405 0.342
3 TP 0.0601 0.312 0.069 0.0048 4.75 0.857 2.510 2602 0.287 6970 6922 0.373 0.304
4 TP 0.0980 0.266 0.074 0.0055 5.03 -0.148 1.601 1774 0.229 3619 3617 0.490 0.414
5 TP 0.0001 0.043 0.001 2.22E-06 0.10 19.553 465.099 1864 0.403 3940 3936 0.473 0.404
6 FP 0.0012 0.069 0.006 3.23E-05 0.59 5.533 37.390 1950 0.313 4228 4214 0.461 0.381
7 FP 0.0066 0.114 0.019 0.0004 1.51 3.370 14.108 2642 0.502 7128 6991 0.371 0.318
8 FP 0.0076 0.087 0.014 0.0002 2.47 2.633 10.551 1586 0.164 2996 2996 0.529 0.448
9 FP 0.0344 0.148 0.034 0.0012 3.87 0.504 2.035 1586 0.164 2996 2996 0.529 0.448
10 FP 0.0011 0.033 0.003 1.17E-05 0.84 4.888 32.250 1982 0.461 4313 4288 0.460 0.377
64
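Table 4.4: T-test results for current image
Current image. FOS features: Mean Intensity, Max Intensity, STD, Variance, Entropy, Skewness, Kurtosis. Shape features: Area, Eccentricity, Convex Area, Filled Area, Solidity, Extent.
Result | Mean Intensity | Max Intensity | STD | Variance | Entropy | Skewness | Kurtosis | Area | Eccentricity | Convex Area | Filled Area | Solidity | Extent
h | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1
p | 0.354 | 6.22E-08 | 0.038 | 0.544 | 1.9E-04 | 0.724 | 0.654 | 0.004 | 0.002 | 0.017 | 0.007 | 0.095 | 0.012
Table 4.5: T-test results for subtracted image
Subtracted image. FOS features: Mean Intensity, Max Intensity, STD, Variance, Entropy, Skewness, Kurtosis. Shape features: Area, Eccentricity, Convex Area, Filled Area, Solidity, Extent.
Result | Mean Intensity | Max Intensity | STD | Variance | Entropy | Skewness | Kurtosis | Area | Eccentricity | Convex Area | Filled Area | Solidity | Extent
h | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1
p | 0.767 | 3.75E-06 | 0.068 | 0.03 | 9.44E-10 | 3.47E-07 | 4.92E-05 | 0.004 | 0.002 | 0.017 | 0.007 | 0.095 | 0.012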
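To illustrate how a per-feature test between the TP and FP groups can be computed, the following is a minimal sketch of a two-sample t statistic. It assumes an unpaired test with unequal variances (Welch's form); the exact test variant used for Tables 4.4 and 4.5 is not specified in this section. The sample values are the Entropy entries of the ten ROIs listed in Table 4.3, which are only an excerpt of the dataset, so the result is not expected to reproduce the tabulated p-values. Converting the statistic to a p-value requires the Student t distribution, which is outside the Python standard library and is therefore omitted.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    va = variance(a) / len(a)  # sample variance of the mean, group a
    vb = variance(b) / len(b)  # sample variance of the mean, group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Entropy of the subtracted image for the ROIs in Table 4.3 (excerpt only).
entropy_tp = [0.03, 0.0, 4.75, 5.03, 0.10]
entropy_fp = [0.59, 1.51, 2.47, 3.87, 0.84]

t_stat, dof = welch_t(entropy_tp, entropy_fp)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```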
Table 4.6: MANOVA results for current image
Current image. FOS features: Mean Intensity, Max Intensity, STD, Variance, Entropy, Skewness, Kurtosis. Shape features: Area, Eccentricity, Convex Area, Filled Area, Solidity, Extent.
Features included (x) | d | p
x x x x x x x x x x x x x | 1 | 4.6E-03
x x x x x x x x | 1 | 3.23E-04
x x x x x x x x x | 1 | 2.98E-04
x x x x x x x x x x | 1 | 8.67E-09
Table 4.7: MANOVA results for subtracted image
Subtracted image. FOS features: Mean Intensity, Max Intensity, STD, Variance, Entropy, Skewness, Kurtosis. Shape features: Area, Eccentricity, Convex Area, Filled Area, Solidity, Extent.
Features included (x) | d | p
x x x x x x x x x x x x x | 1 | 0.004
x x x x x x x x x x | 1 | 3.12E-04
x x x x x x x x x x | 1 | 6.71E-05
x x x x x x x x x | 1 | 7.35E-06
Table 4.8: Selected features for the classification step
FOS features | Shape features
Mean Intensity | Area
Max Intensity | Eccentricity
Skewness | Solidity
Kurtosis | Extent
Variance |
Entropy |
4.2.2 Classification
Having identified the most valuable features, we used Discriminant Analysis to classify the suspected
ROIs as TP or FP. The classification step was repeated 8 times, and each time one patient was left out
of the training process and used for validation. Table 4.9 shows the total results of the classification
step for all the cases in our dataset.
Table 4.9: Classification results
True Class / Predicted Class | Micro-calcifications | Normal
Micro-calcifications | 121 | 12
Normal | 60 | 220
At first glance, the above table shows that the classification step significantly improved the
algorithm by decreasing the number of false positives and false negatives.
4.2.3 Evaluation of the classification
To evaluate the performance of the classification step in our algorithm, we measured the accuracy,
sensitivity and specificity with the formulas described above. Table 4.10 presents the total results.
Table 4.10: Evaluation of the classifier
Patient | ROIs | TP | FP | FN | TN
1 | 46 | 14 | 12 | 0 | 20
2 | 73 | 8 | 9 | 0 | 56
3 | 58 | 6 | 11 | 0 | 41
4 | 18 | 5 | 0 | 1 | 12
5 | 27 | 2 | 9 | 0 | 16
6 | 30 | 2 | 6 | 0 | 22
7 | 82 | 53 | 4 | 1 | 24
8 | 42 | 3 | 9 | 1 | 29
Total | 376 | 93 | 60 | 3 | 220
Accuracy (percent) | 83.245%
Sensitivity | 0.969
Specificity | 0.786
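The summary figures of Table 4.10 can be reproduced from its totals with the standard definitions of accuracy, sensitivity and specificity. The sketch below takes the 3 missed micro-calcifications as false negatives and the 220 eliminated regions as true negatives, as the accompanying text describes; the function name `classifier_metrics` is illustrative.

```python
def classifier_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,      # correct decisions over all ROIs
        "sensitivity": tp / (tp + fn),      # detected micro-calcifications
        "specificity": tn / (tn + fp),      # normal regions correctly rejected
    }


# Totals from Table 4.10: 93 TP, 60 FP, 3 FN, 220 TN (376 ROIs overall).
metrics = classifier_metrics(tp=93, fp=60, fn=3, tn=220)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Up to rounding, these values agree with the reported 83.245% accuracy, 0.969 sensitivity and 0.786 specificity.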
After the machine-learning step, the performance of the proposed algorithm improved. Of the 96
micro-calcifications, we correctly recognized 93 and missed only 3, which again were not malignant.
This is important for the CAD system, since we wanted to exclude only the areas that did not correspond
to abnormalities and to identify as many micro-calcifications as possible. The main problem before
classification was the high number of false positives: our algorithm categorised a large number of
normal areas as abnormal, and specifically, of the 376 detected ROIs, 280 were false positives. With
the classification step, 220 falsely detected regions were eliminated and only 60 regions were
misclassified as abnormal. Therefore, the classification step was very promising and increased the
accuracy of the algorithm.
The accuracy of the classifier was 83.245%, which is promising given that we used only 8 temporal pairs
of mammograms. In the following chapter we compare it with the state-of-the-art techniques.
Sensitivity was 0.969, which shows that almost all the micro-calcifications were found and that
abnormal areas were rarely missed. High sensitivity is essential in medical applications, and in our
case it was the most important measurement, because we wanted to exclude only the areas that were not
micro-calcifications and to minimise false negatives. It was necessary to be sure that all the patients
found to be healthy were indeed healthy.
Specificity, though, was lower due to the false positives. Specificity measures the proportion of
normal regions that were correctly classified as normal rather than falsely detected as abnormal. In
our study, specificity was not as important as sensitivity. Our goal was to eliminate false positives,
so as to assist the radiologists rather than confuse them with falsely detected regions, but the
crucial part was to exclude only the truly healthy patients and never characterize a cancerous region
as a normal one. In general, with the classification step we upgraded the proposed methodology and
constructed a valuable CAD system for the detection and classification of micro-calcifications in
temporal mammographic pairs.
5 Discussion
In this work, we presented an automated CAD algorithm for the diagnosis of breast micro-calcifications
in temporal pairs of mammograms. We used 8 pairs of full-field digital temporal mammograms from The
Breast Center of Cyprus, with 5 pairs of Cranio-Caudal (CC) view and 3 pairs of Mediolateral-Oblique
(MLO) view, from 8 women. First, we normalized the images, and then the pre-processing took place.
Gamma correction and border removal were chosen for the elimination of irrelevant regions and showed
satisfactory results compared to the state-of-the-art techniques. We did not find any algorithms in
the literature that combined those two filters.
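For illustration, gamma correction can be sketched as follows. This is a minimal example assuming intensities normalized to [0, 1] and the standard power-law form (output = input raised to the power gamma); the gamma value used in this work is not stated in this section, so the value below is purely illustrative, as is the function name `gamma_correct`.

```python
def gamma_correct(image, gamma):
    """Apply power-law (gamma) correction to an image with values in [0, 1]."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return [[pixel ** gamma for pixel in row] for row in image]


# One-row toy image, already normalized to [0, 1].
normalized = [[0.0, 0.25, 1.0]]

# gamma < 1 brightens mid-tones; gamma > 1 darkens them (illustrative value).
corrected = gamma_correct(normalized, gamma=0.5)
print(corrected)
```

With normalized intensities, the end points 0 and 1 are left unchanged, so the correction redistributes contrast without changing the overall range.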
Next came the registration step, which mapped the current and prior mammograms, after which the
registered and the current mammograms were subtracted. Affine and Demons registration techniques were
tested for dense and fatty mammograms. The mean residual with Affine was 37%, which is 10% greater
than the residual with Demons. In fatty mammograms, the mean residual with Demons was 24%, which is
11% worse than Affine's. Based on the results, the Demons registration technique was selected over
Affine and implemented in the algorithm. In the literature, a large number of studies used Demons for
the registration of mammographic images, and our results compared well with them. Nonetheless, those
studies used different measures of registration effectiveness, so they are difficult to compare with
our study. With the previous steps, we had already improved the contrast of the subtracted image
compared to the current one without pre-processing. Specifically, the mean contrast ratio was 2 for
the current image and 6 for the subtracted image. The insignificant regions were cleared, so the
radiologist can identify the abnormalities more easily without seeing the background and the old
areas that were excluded in the second screening round.
Next was the post-processing step, which involved filtering the subtracted image with various filters
to determine the most suitable one, thresholding, and processing with morphological operations. The
range filter was preferred because it erased the high-intensity background without damaging important
regions. With the thresholding and the morphological processing, the subtracted image became binary
and unnecessary regions were removed. The fourth step consisted of removing the periphery areas that
resulted from false assessments by the algorithm and the regions that belonged to old
micro-calcifications.
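The range filtering and thresholding described above can be sketched as follows. A range filter replaces each pixel with the difference between the maximum and minimum intensity in its neighbourhood, so a uniform background, however bright, maps to zero while small bright spots produce a strong response. The 3x3 window, the clipping of the window at the image border and the threshold level are assumptions made for this example, not values taken from this work.

```python
def range_filter(image, radius=1):
    """Local maximum minus local minimum over a (2*radius+1)^2 window.

    The window is clipped at the image border (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = max(window) - min(window)
    return out


def threshold(image, level):
    """Binarize: 1 where the filtered response exceeds `level`, else 0."""
    return [[1 if value > level else 0 for value in row] for row in image]


# Toy subtracted image: uniform background with one bright spot in the centre.
subtracted = [
    [0.10, 0.10, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.90, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.10, 0.10],
]

filtered = range_filter(subtracted)
binary = threshold(filtered, level=0.5)  # illustrative threshold
for row in binary:
    print(row)
```

In this toy case the background away from the spot is suppressed to zero and only the 3x3 neighbourhood of the spot survives thresholding; morphological operations would then be applied to the binary image to clean up the remaining regions.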
Later, we constructed the ground truth images to evaluate the performance of the proposed algorithm.
The results established that the algorithm identified the true micro-calcifications with sufficient
accuracy but also incorrectly flagged many normal areas as abnormal. Of the 376 detected areas in
total, it falsely characterized 280 as micro-calcifications, and it falsely characterized 37 areas as
normal. The F1-score was 0.301, which indicated that the algorithm needed improvement, since the large
number of false positive regions created a critical problem for this study. The literature shows that
this is the most significant problem and that all studies struggle with it.
For this reason, we used machine learning to eliminate the false positives and upgrade our method. We
extracted 13 FOS and shape features from the current and subtracted images, for every previously
detected ROI. Those features were the mean gray-level intensity, the maximum gray-level intensity,
standard deviation, variance, entropy, skewness, kurtosis, area, eccentricity, convex area, filled
area, solidity and extent. Statistical analysis and multivariate analysis of variance were used to
find the best combination of features. The features extracted from the current image were more
valuable, since their p-value was slightly lower, and the features finally selected for the
classification step were the mean intensity, max intensity, variance, entropy, skewness, kurtosis,
area, eccentricity, solidity and extent. With discriminant analysis, we used leave-one-patient-out
validation to split the dataset into a training set and a test set, and the accuracy, sensitivity and
specificity were computed.
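The first-order statistics (FOS) features listed above can be computed from the pixel intensities of an ROI along the following lines. This is a sketch under stated assumptions rather than the exact implementation used in this work: intensities are taken to lie in [0, 1], moments use the population (divide-by-n) normalization, kurtosis is the non-excess form (a normal distribution gives 3), and entropy is computed from a 256-bin histogram. The function name `fos_features` is illustrative.

```python
from collections import Counter
from math import log2, sqrt


def fos_features(pixels, bins=256):
    """First-order statistics of a flat list of intensities in [0, 1]."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n  # population variance
    std = sqrt(var)

    # Shannon entropy (bits) of the quantized intensity histogram.
    counts = Counter(min(int(p * bins), bins - 1) for p in pixels)
    entropy = -sum((c / n) * log2(c / n) for c in counts.values())

    if std > 0:
        skewness = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3)
        kurtosis = sum((p - mean) ** 4 for p in pixels) / (n * std ** 4)
    else:
        # Higher standardized moments are undefined for a constant region.
        skewness = kurtosis = float("nan")

    return {
        "mean": mean,
        "max": max(pixels),
        "std": std,
        "variance": var,
        "entropy": entropy,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Toy ROI: half of the pixels dark, half bright.
roi = [0.0, 0.0, 1.0, 1.0]
features = fos_features(roi)
for name, value in features.items():
    print(f"{name}: {value}")
```

The shape features (area, eccentricity, convex area, filled area, solidity and extent) are computed from the binary region mask rather than from intensities, so they are not covered by this sketch.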
Table 5.1 presents a comparison between our proposed method and the state-of-the-art classification
techniques that used machine-learning applications with leave-one-patient-out analysis for the
elimination of false positives. These three studies, however, did not detect micro-calcifications
from temporal pairs, but only from one screening round. We went one step further and eliminated the
old micro-calcifications based on the previous mammograms to improve the results and help the
radiologists. That said, the comparison of our methodology with the other methods mentioned in the
literature is not straightforward, because the experiments were conducted on different datasets and
the size of each dataset varies. Additionally, in each method the authors chose different evaluation
techniques for their algorithms.
In 1998, Nagel, et al. constructed a computerized scheme for the identification of micro-calcifications,
and more specifically for the removal of false computer detections. They examined 3 different methods
for the feature analysis: the rule-based method, the artificial neural network and a combined method.
The combined method achieved the best results, with 83% sensitivity and 0.8 false positive detections
per image. Next, Diaz-Huerta, Felipe-Riveron and Montaño-Zetina (2014) used morphological approaches
and contrast enhancement techniques to detect the micro-calcifications. Then, for the elimination of
false positives, they extracted 65 spatial, texture and spectral features and fed them into a support
vector machine classifier to discriminate between micro-calcifications and normal tissue. The overall
sensitivity was 85.9%, and in normal images they obtained 13 false positives per image. Lu, et al.
(2016), after the detection of the micro-calcifications, applied a classification step to remove the
false regions that had been identified as micro-calcifications. They extracted 11 shape and 27
appearance features and used RUSBoost classifiers. Their method achieved 80% accuracy with 10 false
positives per image. In our algorithm, the accuracy was 83.245%, the sensitivity 96.9% and the
specificity 78.6%. Compared to the above methods, these measurements indicate that our results are
very promising and that the proposed algorithm is effective.
It is worth mentioning that a considerable number of the studies we found in the literature did not
use leave-one-patient-out validation but cross-validation over the micro-calcifications. With that
method, they did not hold out all the ROIs corresponding to a single patient; instead, they divided
the training set and the test set using randomly selected micro-calcifications, or they used part of
the same areas as both a test set and a training set. We did not choose this approach, because we
wanted to detect all of the micro-calcifications inside a mammogram. With that technique, if we put a
percentage of the ROIs in the training set and used the rest for testing, the algorithm categorized
the remaining areas correctly due to bias. Songyang Yu and Ling Guan (2000) proposed an automated CAD
system for the detection and classification of clustered micro-calcifications in digitized
mammograms. First, they identified the possible regions. Then, they extracted 31 gray-level
statistical and wavelet features and fed them into a neural network for the classification step.
However, their training samples were also used in the testing of the classifier. Their results showed
90% accuracy with only 0.5 false positives per image.
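The difference between the two validation schemes comes down to how the ROIs are split. A minimal sketch of a leave-one-patient-out split is given below: in every fold, all ROIs of one patient form the test set, so no patient contributes to both training and testing. The per-patient ROI counts are taken from Table 4.10; the function name `leave_one_patient_out` is illustrative, and the classifier itself is omitted.

```python
def leave_one_patient_out(patient_ids):
    """Yield (held_out, train_idx, test_idx), one fold per patient.

    `patient_ids[i]` is the patient that ROI i belongs to.
    """
    for held_out in sorted(set(patient_ids)):
        train = [i for i, p in enumerate(patient_ids) if p != held_out]
        test = [i for i, p in enumerate(patient_ids) if p == held_out]
        yield held_out, train, test


# Number of detected ROIs per patient, from Table 4.10.
roi_counts = {1: 46, 2: 73, 3: 58, 4: 18, 5: 27, 6: 30, 7: 82, 8: 42}
patient_ids = [pid for pid, count in roi_counts.items() for _ in range(count)]

folds = list(leave_one_patient_out(patient_ids))
for held_out, train, test in folds:
    print(f"patient {held_out}: {len(train)} training ROIs, {len(test)} test ROIs")
```

A random split over ROIs, by contrast, would typically place ROIs from the same mammogram in both sets, which is the source of the optimistic bias discussed above.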
With this in mind, our approach was more general and did not contain this bias, which was the reason
that our results were slightly lower than the state-of-the-art. For comparison, we evaluated our
algorithm with that approach and found an accuracy of 98%, a sensitivity of 0.985 and a specificity
of 0.876. This suggests that the proposed algorithm is as good as, or even better than, the
state-of-the-art algorithms.
The main drawback of the proposed method was the relatively small dataset, which can affect the
results of the classification step. It is reasonable to expect that with bigger datasets the
classification step can be more accurate. Moreover, the ages of the patients who participated in the
study were not a representative sample. At the registration step, Demons was slow to execute, and
consequently the whole process was affected. Also, we used only Discriminant Analysis for the
separation between TP and FP, and we did not test other machine-learning algorithms. Despite these
limitations, our results show that the proposed algorithm is efficient and can be used with
relatively high accuracy for breast micro-calcification detection using temporal mammograms.
Table 5.1: Performance of different methods
Classification as micro-calcification or Normal using a classifier and Leave-one-patient-out procedure
Method | Performance
Nagel, et al. (1998) | 83% sensitivity with 0.8 false positives per image
Diaz-Huerta, Felipe-Riveron and Montaño-Zetina, (2014) | 85.9% sensitivity with 13 false positives per image
Lu, et al. (2016) | 80% accuracy with 10 false positives per image
Proposed method | 83.245% accuracy, 96.9% sensitivity, 78.6% specificity
6 Conclusion and Future Work
We developed a CAD system for the detection of micro-calcifications in temporal pairs of mammograms.
We combined a series of pre-processing, registration and post-processing techniques in order to
efficiently subtract the mammographic pair and enhance the micro-calcifications. Additionally, we
removed the periphery regions, which were unnecessary to our algorithm, and the old
micro-calcifications. We extracted 13 FOS and shape features from the subtracted image for the
classification step to eliminate the false positive regions. With statistical analysis and
multivariate analysis of variance, we selected the best features, and with discriminant analysis and
leave-one-patient-out validation, we classified our results and computed the accuracy, sensitivity
and specificity as evaluation measurements. From our results, we conclude that the achieved
performance is sufficient and that our algorithm can assist radiologists in breast
micro-calcification detection using temporal mammograms.
Encouraged by this initial success, we plan to use a larger dataset, with more representative age
samples, in order to test our algorithm further. Likewise, more features are needed for the
classification of regions as TP or FP, along with more complex classifiers, which can improve the
results and increase the percentage of correct classifications. Finally, we can generalize our
algorithm to other kinds of abnormalities in mammograms besides micro-calcifications, and give
predictions for the development of new abnormalities in the next screening rounds.
References
Abdel-Nasser, M., Moreno, A. and Puig, D. (2016). Temporal mammogram image registration using
optimized curvilinear coordinates. Computer Methods and Programs in Biomedicine, 127, pp.1-14.
Arfan, M. (2017). Wavelets Texture based Classification of Breast Mammograms using Adaboost
Classifier. International Journal of Advanced Computer Science and Applications, 8(5).
Bailey, D. and Hodgson, R. (1985). Range filters: Local intensity subrange filters and their
properties. Image and Vision Computing, 3(3), pp.99-110.
Bekker, A., Shalhon, M., Greenspan, H. and Goldberger, J. (2016). Multi-View Probabilistic
Classification of Breast Microcalcifications. IEEE Transactions on Medical Imaging, 35(2), pp.645-
653.
Berry, D. (2011). Computer-Assisted Detection and Screening Mammography: Where's the Beef?.
JNCI Journal of the National Cancer Institute, 103(15), pp.1139-1141.
Beura, S. (2016). Development of Features and Feature Reduction Techniques for Mammogram
Classification. Ph.D. National Institute of Technology Rourkela.
Bhattacharya, M. and Das, A. (2007). Fuzzy Logic Based Segmentation of Microcalcification in Breast
Using Digital Mammograms Considering Multiresolution. In: Machine Vision and Image Processing
Conference.
Boulehmi, H., Mahersia, H. and Hamrouni, K. (2016). A New CAD System for Breast
Microcalcifications Diagnosis. International Journal of Advanced Computer Science and Applications,
7(4).
Bozek, J., Kallenberg, M., Grgic, M. and Karssemeijer, N. (2014). Use of volumetric features for
temporal comparison of mass lesions in full field digital mammograms. Medical Physics, 41(2),
p.021902.
Burrell, H., Sibbering, D., Wilson, A., Pinder, S., Evans, A., Yeoman, L., Elston, C., Ellis, I., Blamey,
R. and Robertson, J. (1996). Screening interval breast cancers: mammographic features and prognosis
factors. Radiology, 199(3), pp.811-817.
Cady, B. and Chung, M. (2005). Mammographic Screening: No Longer Controversial. American
Journal of Clinical Oncology, 28(1), pp.1-4.
Cancer.org. (2016). What is breast cancer?. [online] Available at:
http://www.cancer.org/cancer/breastcancer/detailedguide/breast-cancer-what-is-breast-cancer
[Accessed 3 Oct. 2016].
Casti, P., Mencattini, A., Salmeri, M. and Rangayyan, R. (2015). Analysis of Structural Similarity in
Mammograms for Detection of Bilateral Asymmetry. IEEE Transactions on Medical Imaging, 34(2),
pp.662-671.
Casti, P., Mencattini, A., Salmeri, M., Ancona, A., Lorusso, M., Pepe, M., Natale, C. and Martinelli, E.
(2017). Towards localization of malignant sites of asymmetry across bilateral mammograms. Computer
Methods and Programs in Biomedicine, 140, pp.11-18.
Celaya-Padilla, J., Martinez-Torteya, A., Rodriguez-Rojas, J., Galvan-Tejada, J., Treviño, V. and
Tamez-Peña, J. (2015). Bilateral Image Subtraction and Multivariate Models for the Automated
Triaging of Screening Mammograms. BioMed Research International, 2015, pp.1-12.
Cheng, H., Cai, X., Chen, X., Hu, L. and Lou, X. (2003). Computer-aided detection and classification
of microcalcifications in mammograms: a survey. Pattern Recognition, 36(12), pp.2967-2991.
Cherkassky, V. and Mulier, F. (1998). Learning from data. New York: Wiley.
Ciecholewski, M. (2016). Microcalcification Segmentation from Mammograms: A Morphological
Approach. Journal of Digital Imaging, 30(2), pp.172-184.
Desautels, J., Rangayyan, R. and Mudigonda, N. (2000). Gradient and texture analysis for the
classification of mammographic masses. IEEE Transactions on Medical Imaging, 19(10), pp.1032-
1043.
Devroye, L., Gyorfi, L. and Lugosi, G. (1996). A probabilistic theory of pattern recognition. New York:
Springer.
Dhawan, A., Chitre, Y. and Kaiser-Bonasso, C. (1996). Analysis of mammographic microcalcifications
using gray-level image structure features. IEEE Transactions on Medical Imaging, 15(3), pp.246-259.
Diaz-Huerta, C., Felipe-Riveron, E. and Montaño-Zetina, L. (2014). Quantitative analysis of
morphological techniques for automatic classification of micro-calcifications in digitized
mammograms. Expert Systems with Applications, 41(16), pp.7361-7369.
Díez, Y., Gubern-Mérida, A., Wang, L., Diekmann, S., Martí, J., Platel, B., Kramme, J. and Martí,
R. (2014). Comparison of Methods for Current-to-Prior Registration of Breast DCE-MRI. In: H. Fujita,
T. Hara and C. Muramatsu, ed., Breast Imaging: 12th International Workshop, IWDM 2014, Gifu City,
Japan, June 29 - July 2, 2014. Proceedings, 1st ed. Gifu City: Springer, pp.689-695.
Díez, Y., Oliver, A., Llado, X., Freixenet, J., Marti, J., Vilanova, J. and Marti, R. (2011). Revisiting
Intensity-Based Image Registration Applied to Mammography. IEEE Transactions on Information
Technology in Biomedicine, 15(5), pp.716-725.
Efford, N. (2002). Digital image processing. Harlow, Essex [u.a.]: Addison-Wesley.
El-Naqa, I., Yongyi Yang, Wernick, M., Galatsanos, N. and Nishikawa, R. (2002). A support vector
machine approach for detection of microcalcifications. IEEE Transactions on Medical Imaging, 21(12),
pp.1552-1563.
Fahnestock, J. and Schowengerdt, R. (1983). Spatially Variant Contrast Enhancement Using Local
Range Modification. Optical Engineering, 22(3).
Fenton, J. (2015). Is It Time to Stop Paying for Computer-Aided Mammography?. JAMA Internal
Medicine, 175(11), p.1837.
Fenton, J., Abraham, L., Taplin, S., Geller, B., Carney, P., D'Orsi, C., Elmore, J. and Barlow, W. (2011).
Effectiveness of Computer-Aided Detection in Community Mammography Practice. JNCI Journal of
the National Cancer Institute, 103(15), pp.1152-1161.
Fenton, J., Xing, G., Elmore, J., Bang, H., Chen, S., Lindfors, K. and Baldwin, L. (2013). Short-Term
Outcomes of Screening Mammography Using Computer-Aided Detection. Annals of Internal
Medicine, 158(8), p.580.
French, A., Macedo, M., Poulsen, J., Waterson, T. and Yu, A. (n.d.). Multivariate Analysis of Variance
(MANOVA). http://libraryguides.neomed.edu/c.php?g=324183&p=2172315.
Galván-Tejada, C., Zanella-Calzada, L., Galván-Tejada, J., Celaya-Padilla, J., Gamboa-Rosales, H.,
Garza-Veloz, I. and Martinez-Fierro, M. (2017). Multivariate Feature Selection of Image Descriptors
Data for Breast Cancer with Computer-Assisted Diagnosis. Diagnostics, 7(1), p.9.
Ganesan, K., Acharya, U., Chua, C., Min, L., Abraham, K. and Ng, K. (2013). Computer-Aided Breast
Cancer Detection Using Mammograms: A Review. IEEE Reviews in Biomedical Engineering, 6, pp.77-
98.
Giger, M., Karssemeijer, N. and Armato, S. (2001). Guest editorial computer-aided diagnosis in medical
imaging. IEEE Transactions on Medical Imaging, 20(12), pp.1205-1208.
Gonzalez, R. and Woods, R. (1992). Instructor's manual for digital image processing. Reading [etc.]:
Addison-Wesley.
Hackshaw, A. and Paul, E. (2003). Breast self-examination and death from breast cancer: a meta-
analysis. British Journal of Cancer, 88(7), pp.1047-1053.
Hadjiiski, L., Sahiner, B., Heang-Ping Chan, Petrick, N. and Helvie, M. (1999). Classification of
malignant and benign masses based on hybrid ART2LDA approach. IEEE Transactions on Medical
Imaging, 18(12), pp.1178-1187.
Hadjiiski, L., Sahiner, B., Chan, H., Petrick, N., Helvie, M. and Gurcan, M. (2001). Analysis of temporal
changes of mammographic features: Computer-aided classification of malignant and benign breast
masses. Medical Physics, 28(11), p.2309.
Haralick, R., Shanmugam, K. and Dinstein, I. (1973). Textural Features for Image Classification. IEEE
Transactions on Systems, Man, and Cybernetics, 3(6), pp.610-621.
Hasegawa, A., Neemuchwala, H., Tsunoda-Shimizu, H., Honda, S., Shimura, K., Sato, M., Koyama,
T., Kikuchi, M. and Hiramatsu, S. (2008). A Tool for Temporal Comparison of Mammograms: Image
Toggling and Dense-Tissue-Preserving Registration. In: E. Krupinski, ed., Digital Mammography. 9th
International Workshop, IWDM 2008 Tucson, AZ, USA, July 20-23, 2008 Proceedings, 1st ed. Berlin:
Springer, pp.447-454.
Heinlein, P., Drexl, J. and Schneider, W. (2003). Integrated wavelets for enhancement of
microcalcifications in digital mammography. IEEE Transactions on Medical Imaging, 22(3), pp.402-
413.
Hsu, H. and Lachenbruch, P. (2008). Paired t Test. Wiley Encyclopedia of Clinical Trials.
Hu, K., Yang, W. and Gao, X. (2017). Microcalcification diagnosis in digital mammography using
extreme learning machine based on hidden Markov tree model of dual-tree complex wavelet transform.
Expert Systems with Applications, 86, pp.135-144.
Jain, A., Duin, P. and Jianchang Mao, (2000). Statistical pattern recognition: a review. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 22(1), pp.4-37.
Karssemeijer, N. (1993). Adaptive noise equalization and recognition of microcalcification clusters in
mammograms. International Journal of Pattern Recognition and Artificial Intelligence, 07(06),
pp.1357-1376.
Karssemeijer, N. and te Brake, G. (1996). Detection of stellate distortions in mammograms. IEEE
Transactions on Medical Imaging, 15(5), pp.611-619.
Kelder, A., Zigel, Y., Lederman, D. and Zheng, B. (2015). A new computer-aided detection scheme
based on assessment of local bilateral mammographic feature asymmetry - A preliminary evaluation.
In: 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society.
IEEE.
Khehra, B. and Pharwaha, A. (2016). Classification of Clustered Microcalcifications using MLFFBP-
ANN and SVM. Egyptian Informatics Journal, 17(1), pp.11-20.
Kobatake, H., Murakami, M., Takeo, H. and Nawano, S. (1999). Computerized detection of malignant
tumors on digital mammograms. IEEE Transactions on Medical Imaging, 18(5), pp.369-378.
Kooi, T. and Karssemeijer, N. (2017). Classifying Symmetrical Differences and Temporal Change for
the Detection of Malignant Masses in Mammography Using Deep Neural Networks. [online] Available
at:
https://www.researchgate.net/publication/315520619_Classifying_Symmetrical_Differences_and_Te
mporal_Change_in_Mammography_Using_Deep_Neural_Networks [Accessed 1 Jul. 2017].
Kovalerchuk, B., Triantaphyllou, E., Ruiz, J. and Clayton, J. (1997). Fuzzy logic in computer-aided
breast cancer diagnosis: analysis of lobulation. Artificial Intelligence in Medicine, 11(1), pp.75-85.
Krzanowski, W. (2000). Principles of Multivariate Analysis. Oxford: Oxford University Press.
Lehman, C., Wellman, R., Buist, D., Kerlikowske, K., Tosteson, A. and Miglioretti, D. (2015).
Diagnostic Accuracy of Digital Screening Mammography With and Without Computer-Aided
Detection. JAMA Internal Medicine, 175(11), p.1828.
Lei Zhen, and Chan, A. (2001). An artificial intelligent algorithm for tumor detection in screening
mammogram. IEEE Transactions on Medical Imaging, 20(7), pp.559-567.
Li, Y., Chen, H., Yang, Y., Cheng, L. and Cao, L. (2015). A bilateral analysis scheme for false positive
reduction in mammogram mass detection. Computers in Biology and Medicine, 57, pp.84-95.
Li, H., Meng, X., Wang, T., Tang, Y. and Yin, Y. (2017). Breast masses in mammography classification
with local contour features. BioMedical Engineering OnLine, 16(1).
Lu, Z., Carneiro, G., Dhungel, N. and P. Bradley, N. (2016). Automated Detection of Individual Micro-
calcifications From Mammograms Using a Multi-Stage Cascade Approach. [ebook] Available at:
http://1610.02251.pdf [Accessed 8 Feb. 2018].
Ma, F., Bajger, M., Williams, S. and Bottema, M. (2010). Improved Detection of Cancer in Screening
Mammograms by Temporal Comparison. In: J. Martí, A. Oliver, J. Freixenet and R. Martí, ed., Digital
Mammography 10th International Workshop, IWDM 2010, Girona, Catalonia, Spain, June 16-18, 2010.
Proceedings, 1st ed. Girona: Springer, pp.752-759.
Ma, F., Yu, L., Liu, G. and Niu, Q. (2014). Temporal change analysis for computer aided mass detection
in mammography. 2014 7th International Conference on Biomedical Engineering and Informatics.
Ma, F., Yu, L., Bajger, M. and Bottema, M. (2015). Incorporation of fuzzy spatial relation in temporal
mammogram registration. Fuzzy Sets and Systems, 279, pp.87-100.
Ma, F., Yu, L., Bajger, M. and Bottema, M. (2015). Mammogram Mass Classification with Temporal
Features and Multiple Kernel Learning. In: International Conference on Digital Image Computing:
Techniques and Applications (DICTA). Adelaide: IEEE eXpress Conference Publishing, pp.505-511.
Mallat, S. and Zhong, S. (1992). Characterization of signals from multiscale edges. IEEE Transactions
on Pattern Analysis and Machine Intelligence, 14(7), pp.710-732.
Marias, K., Behrenbruch, C., Parbhoo, S., Seifalian, A. and Brady, M. (2005). A registration framework
for the comparison of mammogram sequences. IEEE Transactions on Medical Imaging, 24(6), pp.782-
790.
Martí, R., Díez, Y., Oliver, A., Tortajada, M., Zwiggelaar, R. and Lladó, X. (2014). Detecting Abnormal
Mammographic Cases in Temporal Studies Using Image Registration Features. In: H. Fujita, T. Hara
and C. Muramatsu, ed., Breast Imaging: 12th International Workshop, IWDM 2014, Gifu City, Japan,
June 29 - July 2, 2014. Proceedings, 1st ed. Gifu City: Springer, pp.612-619.
MATLAB and Image Processing Toolbox. Release 2006b, The MathWorks, Inc., Natick,
Massachusetts, United States.
Mendez, A., Tahoces, P., Lado, M., Souto, M. and Vidal, J. (1998). Computer-aided diagnosis:
Automatic detection of malignant masses in digitized mammograms. Medical Physics, 25(6), p.957.
Mika, S., Ratsch, G., Weston, J., Scholkopf, B. and Mullers, K. (1999). Fisher discriminant analysis
with kernels. In: Neural Networks for Signal Processing IX.
Mousa, R., Munib, Q. and Moussa, A. (2005). Breast cancer diagnosis system based on wavelet analysis
and fuzzy-neural. Expert Systems with Applications, 28(4), pp.713-723.
Nagel, R., Nishikawa, R., Papaioannou, J. and Doi, K. (1998). Analysis of methods for reducing false
positives in the automated detection of clustered microcalcifications in mammograms. Medical Physics,
25(8), pp.1502-1506.
National Cancer Institute. (2016). Mammograms. [online] Available at:
https://www.cancer.gov/types/breast/mammograms-fact-sheet#q11 [Accessed 1 Nov. 2016].
Nguyen, V., Nguyen, D., Nguyen, T., Phan, V. and Truong, Q. (2015). Filter-based feature selection
and support vector machine for false positive reduction in computer-aided mass detection in
mammograms. Seventh International Conference on Machine Vision (ICMV 2014).
Oliver, A. (2007). Automatic Mass Segmentation in Mammographic Images. Ph.D. University of
Girona.
Oliver, A., Freixenet, J., Martí, J., Pérez, E., Pont, J., Denton, E. and Zwiggelaar, R. (2010). A review
of automatic mass detection and segmentation in mammographic images. Medical Image Analysis,
14(2), pp.87-110.
Papadopoulos, A., Fotiadis, D. and Likas, A. (2005). Characterization of clustered microcalcifications
in digitized mammograms using neural networks and support vector machines. Artificial Intelligence
in Medicine, 34(2), pp.141-150.
Papoulis, A. (1965). Probability, random variables, and stochastic processes. New York: McGraw-Hill.
Patel, B. and Sinha, G. (2010). An Adaptive K-means Clustering Algorithm for Breast Image
Segmentation. International Journal of Computer Applications, 10(4), pp.35-38.
Rangayyan, R., Banik, S. and Desautels, J. (2010). Computer-Aided Detection of Architectural
Distortion in Prior Mammograms of Interval Cancer. Journal of Digital Imaging, 23(5), pp.611-631.
Richard, F. and Cohen, L. (2003). A new Image Registration technique with free boundary constraints:
application to mammography. Computer Vision and Image Understanding, 89(2-3), pp.166-196.
Rumelhart, D., Hinton, G. and Williams, R. (1986). Learning internal representation by error
propagation. In: D. Rumelhart, J. McClelland and PDP Research Group, ed., Parallel distributed
processing: explorations in the microstructure of cognition, 1st ed. Cambridge: MIT Press, pp.318-362.
Sanjay-Gopal, S., Chan, H., Wilson, T., Helvie, M., Petrick, N. and Sahiner, B. (1999). A regional
registration technique for automated interval change analysis of breast lesions on mammograms.
Medical Physics, 26(12), p.2669.
Scharcanski, J. and Jung, C. (2006). Denoising and enhancing digital mammographic images for visual
screening. Computerized Medical Imaging and Graphics, 30(4), pp.243-254.
Seber, G. (1984). Multivariate Observations. Wiley Series in Probability and Statistics.
Shanmugavadivu, P. and Sivakumar, V. (2013). Segmentation of pectoral muscle in mammograms
using fractal method. 2013 International Conference on Computer Communication and Informatics.
Shanmugavadivu, P., Sivakumar, V. and Sudhir, R. (2016). Fractal dimension-bound spatio-temporal
analysis of digital mammograms. The European Physical Journal Special Topics, 225(1), pp.137-146.
Sickles, E. (1986). Mammographic features of 300 consecutive nonpalpable breast cancers. American
Journal of Roentgenology, 146(4), pp.661-663.
Simoncelli, E. and Adelson, E. (1996). Noise removal via Bayesian wavelet coring. Proceedings of 3rd
IEEE International Conference on Image Processing, 1, pp.379-382.
Smithuis, R. and Pijnappel, R. (2008). Breast - Calcifications Differential Diagnosis. [ebook]
Available at: http://www.radiologyassistant.nl/en/p4793bfde0ed53/breast-calcifications-differential-diagnosis.html
[Accessed 19 Aug. 2017].
Soille, P. (1999). Morphological Image Analysis: Principles and Applications. Berlin: Springer, pp.164-165.
Songyang Yu and Ling Guan (2000). A CAD system for the automatic detection of clustered
microcalcifications in digitized mammogram films. IEEE Transactions on Medical Imaging, 19(2),
pp.115-126.
Suhail, Z., Sarwar, M. and Murtaza, K. (2015). Automatic detection of abnormalities in mammograms.
BMC Medical Imaging, 15(1).
Sun, W., Zheng, B., Lure, F., Wu, T., Zhang, J., Wang, B., Saltzstein, E. and Qian, W. (2014). Prediction
of near-term risk of developing breast cancer using computerized features from bilateral mammograms.
Computerized Medical Imaging and Graphics, 38(5), pp.348-357.
Tan, M., Zheng, B., Leader, J. and Gur, D. (2016). Association Between Changes in Mammographic
Image Features and Risk for Near-Term Breast Cancer Development. IEEE Transactions on Medical
Imaging, 35(7), pp.1719-1728.
Tang, J., Rangayyan, R., Xu, J., El Naqa, I. and Yang, Y. (2009). Computer-Aided Detection and
Diagnosis of Breast Cancer With Mammography: Recent Advances. IEEE Transactions on Information
Technology in Biomedicine, 13(2), pp.236-251.
Theodoridis, S. and Koutroumbas, K. (2006). Pattern recognition. Amsterdam: Elsevier/Academic
Press.
Timp, S. and Karssemeijer, N. (2006). Interval change analysis to improve computer aided detection in
mammography. Medical Image Analysis, 10(1), pp.82-95.
Timp, S., Varela, C. and Karssemeijer, N. (2007). Temporal Change Analysis for Characterization of
Mass Lesions in Mammography. IEEE Transactions on Medical Imaging, 26(7), pp.945-953.
van Engeland, S., Snoeren, P., Hendriks, J. and Karssemeijer, N. (2003). A comparison of methods
for mammogram registration. IEEE Transactions on Medical Imaging, 22(11), pp.1436-1444.
Viton, J. (1996). Method for characterizing masses in digital mammograms. Optical Engineering,
35(12), p.3453.
Vujovic, N. and Brzakovic, D. (1997). Establishing the correspondence between control points in pairs
of mammographic images. IEEE Transactions on Image Processing, 6(10), pp.1388-1399.
Wang, T. and Karayiannis, N. (1998). Detection of microcalcifications in digital mammograms using
wavelets. IEEE Transactions on Medical Imaging, 17(4), pp.498-509.
Wang, X., Zheng, B., Good, W., King, J. and Chang, Y. (1999). Computer-assisted diagnosis of breast
cancer using a data-driven Bayesian belief network. International Journal of Medical Informatics, 54(2),
pp.115-126.
Webb, A. (2002). Statistical pattern recognition. West Sussex, England: Wiley.
www.nationalbreastcancer.org. (2017). Breast Tumors :: The National Breast Cancer Foundation.
[online] Available at: http://www.nationalbreastcancer.org/breast-tumors [Accessed 3 Sep. 2017].
Xie, W., Li, Y. and Ma, Y. (2016). Breast mass classification in digital mammography based on extreme
learning machine. Neurocomputing, 173, pp.930-941.
Yan, S., Wang, Y., Aghaei, F., Qiu, Y. and Zheng, B. (2017). Applying a new bilateral mammographic
density segmentation method to improve accuracy of breast cancer risk prediction. International Journal
of Computer Assisted Radiology and Surgery, 12(10), pp.1819-1828.
Yin, F. (1991). Computerized detection of masses in digital mammograms: Analysis of bilateral
subtraction images. Medical Physics, 18(5), p.955.
Yin, F. (1999). Computerized detection of masses in digital mammograms: Automated alignment of
breast images and its effect on bilateral-subtraction technique. Medical Physics, 21(3), p.445.
Yu, S. and Guan, L. (2000). A CAD system for the automatic detection of clustered microcalcifications
in digitized mammogram films. IEEE Transactions on Medical Imaging, 19(2), pp.115-126.
Zheng, B., Chang, Y. and Gur, D. (1995). Computerized detection of masses from digitized
mammograms: Comparison of single-image segmentation and bilateral-image subtraction. Academic
Radiology, 2(12), pp.1056-1061.
Zuiderveld, K. (1994). Contrast limited adaptive histogram equalization. In: Graphics Gems IV. San
Diego: Academic Press Professional, pp.474-485.