Bayesian Network for Predicting Invasive and In-situ Breast Cancer using Mammographic Findings

Jagpreet Chhatwal (1), O. Alagoz (1), E.S. Burnside (1), H. Nassif (1), E.A. Sickles (2)

(1) University of Wisconsin-Madison
(2) University of California, San Francisco

Outline

• Introduction
• Model Formulation
• Results
• Conclusions

Background: Facts

• Breast cancer is the most common non-skin cancer affecting women in the U.S.

• Every two minutes a woman is diagnosed with breast cancer

• Estimated number of deaths from breast cancer in 2007: 40,460

Mammography

A low-dose X-ray examination of the breasts

• Mammography has been shown to be the most cost-effective diagnostic procedure for early diagnosis of breast cancer
– The American Cancer Society recommends that women above the age of 40 have a screening mammogram every year
– More than 20 million mammograms are performed in the US annually

Breast Biopsy

Tissue-sampling procedure to confirm the presence of cancer

• Types of biopsies: needle aspiration and surgical

• Estimated number of biopsies performed annually: 700,000

• 55−85% of breast biopsies result in benign (non-cancerous) findings

• Estimated overspending on benign biopsies: $250 million

• Significant anxiety associated with biopsy

Invasive and In-situ Cancer

• Nearly all breast cancer arises in the milk ducts of the breast
• Invasive cancer
• Ductal carcinoma in-situ (DCIS)
• DCIS lesions contain cells that appear to be cancer, but not all such lesions behave as cancer

Invasive and In-situ Cancer (Cont.)

• DCIS is a non-invasive malignant condition with a very favorable prognosis.
– Depending on the grade of the DCIS and the expected life span of older women, DCIS often will not cause morbidity or mortality for many years, if ever.

• Invasive breast cancer has an increased risk of axillary node metastasis or distant disease.
– Quickly results in morbidity and mortality (also in older women).

Diagnosis or Over-diagnosis

• Only some DCIS lesions will eventually become invasive cancer.
– What percent will become invasive cancer is not known
– Which DCIS will become invasive is not known

• Detecting DCIS on mammograms may benefit those women whose DCIS would become invasive cancer.

• Detecting DCIS may potentially harm those women who have breast surgery but whose DCIS would never become invasive cancer.

• Incidence of DCIS has increased significantly, with the same predominance in older women as invasive breast cancer.

Objective

To build a quantitative model to predict the risk of DCIS and invasive breast cancer using patient demographic factors and mammography findings.

Data Source

• Mammography data from the University of California San Francisco Medical Center, from 1997 to 2007

• Combination of structured data and variables extracted from dictated text reports
– Patient demographic factors
– Imaging features according to the standardized Breast Imaging Reporting and Data System (BI-RADS) lexicon

• 2,211 malignant biopsy records – 1,544 invasive cancer and 667 DCIS.

Sample Text-report

• “Possible clustered microcalcifications, right breast. FINDINGS: Spot compression magnification mammography of the right breast was performed. In the right upper outer breast, there is a cluster of amorphous appearing microcalcifications. These are slightly suspicious for malignancy and therefore biopsy is recommended. No other suspicious clusters of microcalcifications are present. There are few scattered microcalcifications elsewhere in the right breast. Recommend needle localization followed by surgical biopsy…”

MBNi

• Developed a Mammography Bayesian Network for Invasive and In-situ cancer risk prediction (MBNi)

• Structural Training
– NP-hard problem
– Using Tree Augmented Naïve Bayes (TAN) algorithm
– WEKA (Waikato Environment for Knowledge Analysis)
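The slides name the TAN algorithm but do not show how it picks a structure. The sketch below illustrates the commonly described TAN procedure: weight every pair of features by their conditional mutual information given the class, build a maximum-weight spanning tree over the features, and direct its edges away from a root (the class node is then a parent of every feature). It is not WEKA's implementation or the authors' code; the feature names (`shape`, `margin`, `density`), the toy data, and the function names are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def conditional_mutual_information(xs, ys, cs):
    """Estimate I(X; Y | C) from empirical counts (natural log)."""
    n = len(cs)
    n_xyc = Counter(zip(xs, ys, cs))
    n_xc = Counter(zip(xs, cs))
    n_yc = Counter(zip(ys, cs))
    n_c = Counter(cs)
    return sum(
        (count / n) * math.log(count * n_c[c] / (n_xc[x, c] * n_yc[y, c]))
        for (x, y, c), count in n_xyc.items()
    )

def tan_structure(columns, class_column):
    """Return {feature: parent feature or None}.

    The class variable is an implicit extra parent of every feature.
    """
    names = list(columns)
    weight = {
        frozenset((u, v)): conditional_mutual_information(
            columns[u], columns[v], class_column)
        for u, v in combinations(names, 2)
    }
    # Prim's algorithm for a maximum-weight spanning tree,
    # rooted (arbitrarily) at the first feature.
    parent = {names[0]: None}
    while len(parent) < len(names):
        candidates = [(u, v) for u in parent for v in names if v not in parent]
        u, v = max(candidates, key=lambda edge: weight[frozenset(edge)])
        parent[v] = u
    return parent

# Hypothetical toy data (not from the study): 8 records, binary class.
diagnosis = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "shape":   [0, 0, 1, 1, 0, 0, 1, 1],
    "margin":  [0, 0, 1, 1, 0, 0, 1, 1],  # copies "shape": strongly dependent
    "density": [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the others given class
}
parents = tan_structure(columns, diagnosis)
print(parents)
```

In this toy data `margin` duplicates `shape`, so the tree links them directly, while `density` carries no pairwise information and attaches through a zero-weight edge.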

MBNi

Performance Measures

Sensitivity and Specificity

          Patients with disease    Patients without disease
Test +    a                        b
Test -    c                        d

Sensitivity = a/(a+c)

Specificity = d/(b+d)
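As a small worked example, the two formulas can be evaluated directly from the four cells of the table. The counts below are hypothetical and chosen only to make the arithmetic easy to follow; they are not results from the study.

```python
def sensitivity_specificity(a, b, c, d):
    """Cells of the 2x2 table: a = test+ with disease, b = test+ without,
    c = test- with disease, d = test- without disease."""
    sensitivity = a / (a + c)
    specificity = d / (b + d)
    return sensitivity, specificity

# Hypothetical counts for illustration only.
sens, spec = sensitivity_specificity(a=80, b=30, c=20, d=70)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
# prints "sensitivity=0.80, specificity=0.70"
```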

Receiver Operating Characteristic (ROC) Curve
• Graphical plot of sensitivity versus 1 - specificity for varying cut-off points (thresholds)
• Area under the ROC curve (Az)
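One way to see how the curve arises is to sweep the threshold over the predicted scores and record one (1 - specificity, sensitivity) point per threshold. The sketch below does this in plain Python and estimates the area with the trapezoidal rule; that estimator is an assumption here, since the slides do not say how Az was computed. The scores and labels are made up for illustration.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per threshold.

    scores: predicted probability of the positive class
    labels: 1 for positive (disease), 0 for negative
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    # Lowering the threshold past each distinct score adds one ROC point.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def area_under_curve(points):
    """Trapezoidal area under a curve given as x-sorted (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical scores and true labels, for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
auc = area_under_curve(roc_points(scores, labels))
print(round(auc, 4))  # 8/9, i.e. 0.8889
```

The value 8/9 matches the pairwise reading of the area: of the 9 positive-negative pairs, 8 are ranked in the right order.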

Performance Measures (Cont.)

Precision and Recall

          Patients with disease    Patients without disease
Test +    a                        b
Test -    c                        d

Recall = a/(a+c)

Precision = a/(a+b)

Precision-Recall (PR) Curve
• Graphical plot of precision versus recall for varying cut-off points (thresholds)
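The PR curve is built the same way as the ROC curve, by sweeping the threshold, but each point is (recall, precision) computed from cells a, b and c of the table. The following sketch uses made-up scores and labels; the function name `pr_points` is hypothetical.

```python
def pr_points(scores, labels):
    """Return (recall, precision) pairs, one per distinct threshold."""
    pos = sum(labels)
    points = []
    for t in sorted(set(scores), reverse=True):
        a = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)  # test+, disease
        b = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)  # test+, no disease
        # a + b >= 1 because the record whose score equals t is always test+.
        points.append((a / pos, a / (a + b)))
    return points

# Hypothetical scores and true labels, for illustration only.
pr_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
pr_labels = [1, 1, 0, 1, 0, 0]
for recall, precision in pr_points(pr_scores, pr_labels):
    print(f"recall={recall:.2f} precision={precision:.2f}")
```

Unlike specificity, precision depends on the proportion of positives in the data, which is one reason to report a PR curve alongside the ROC curve.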

Validation Technique

10-fold cross-validation

• The data set is split into 10 folds (1, 2, 3, …, 10)
• Each fold serves in turn as the test fold; the remaining folds are the training folds
• Merge tested folds for performance analysis
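The procedure on this slide can be sketched as follows: every record receives a score from a model trained without it, and the held-out scores from all 10 folds are merged into one list before any ROC or PR analysis. The classifier here is a deliberately trivial frequency model standing in for the Bayesian network; it, the toy data, the label coding (1 = invasive, 0 = DCIS) and the function names are assumptions made for illustration.

```python
import random

def cross_validated_scores(features, labels, fit, k=10, seed=0):
    """Return one out-of-fold score per record, merged across all k test folds."""
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint, near-equal folds
    merged = [None] * len(labels)
    for test_fold in folds:
        held_out = set(test_fold)
        train = [i for i in indices if i not in held_out]
        predict = fit([features[i] for i in train], [labels[i] for i in train])
        for i in test_fold:
            merged[i] = predict(features[i])  # model never saw record i
    return merged

def fit_frequency_model(train_features, train_labels):
    """Toy stand-in classifier: Laplace-smoothed P(label = 1 | feature value)."""
    counts = {}
    for x, y in zip(train_features, train_labels):
        n_pos, n = counts.get(x, (0, 0))
        counts[x] = (n_pos + y, n + 1)

    def predict(x):
        n_pos, n = counts.get(x, (0, 0))
        return (n_pos + 1) / (n + 2)

    return predict

# Hypothetical toy data: one categorical finding per record.
cv_features = ["mass"] * 30 + ["calcification"] * 30
cv_labels = [1] * 24 + [0] * 6 + [1] * 9 + [0] * 21
cv_scores = cross_validated_scores(cv_features, cv_labels, fit_frequency_model)
print(len(cv_scores), min(cv_scores), max(cv_scores))
```

Merging the held-out scores yields a single ROC or PR curve over the whole data set rather than ten small, noisy per-fold curves.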

Performance: ROC Curve

Performance: PR Curve

Older versus Younger Women

• Mammography is known to perform better in older women

• We stratified our data set into two parts as follows:
– Mammography data of women younger than age 50 (177 DCIS and 361 invasive cancers)
– Mammography data of women older than age 65 (219 DCIS and 600 invasive cancers)

Performance: ROC

P=0.039

Performance: PR Curve

P=0.038

Conclusions

• Our MBNi can predict the risk of DCIS versus invasive cancer and may perform better in older women.

• Our MBNi has the potential to aid clinical management decisions such as the need for increased sampling at biopsy and the appropriate selection of surgical interventions.

• Our MBNi is a step towards shared decision-making and may empower older women to better manage their health in the context of their co-morbidities and life expectancy.

Ongoing and Future Research

• Validation of “text extraction” features
• Three-class prediction model – Benign, DCIS and Invasive cancer
• Predict the risk of breast disease types
• Ensemble learning:
– Logistic Regression
– Artificial Neural Networks
– Bayesian Networks
– Support Vector Machines

Thank You!
