69
Medical Imaging with Deep Learning Tutorial Chapter 1 - Radiology and Multi-View Chapter 2 - Histology and Segmentation Chapter 3 - Cell Counting Chapter 4 - Incorrect Feature Attribution Chapter 5 - GANs in Medical Imaging Slides by Joseph Paul Cohen 2020 Email: [email protected] License: Creative Commons Attribution-Sharealike Watch online

Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Medical Imaging with Deep Learning Tutorial

Chapter 1 - Radiology and Multi-View

Chapter 2 - Histology and Segmentation

Chapter 3 - Cell Counting

Chapter 4 - Incorrect Feature Attribution

Chapter 5 - GANs in Medical Imaging

Slides by Joseph Paul Cohen 2020Email: [email protected]

License: Creative Commons Attribution-Sharealike

Watch online

Page 2: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

2

Chapter 1

Radiology and Multi-View

Page 3: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Common X-ray projections/views

PA = PosteroAnterior = BackFront

Image: [Bustos, “PadChest: A Large Chest x-Ray Image Dataset with Multi-Label Annotated Reports.” 2019]

LR

Most common

Page 4: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Ronald SummersNIH Clinical Center

Released 2017, first large scale chest X-ray dataset

>100k frontal images released as public domain

Enabled the deep learning radiology revolution

Chest X-ray14 Dataset

Page 5: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Stanford Pneumonia study

https://stanfordmlgroup.github.io/projects/chexnet/

In 2017 Pranav Rajpurkar and Jeremy Irvin trained a DenseNet on NIH data scaled to 224x224 pixels

Set the benchmark performance which has not been significantly improved.

They evaluated pneumonia predictions against 4 radiologists.

"We find that the model exceeds the average radiologist performance on the pneumonia detection task."

Page 6: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Criticism of the Chest X-ray14 Dataset

https://lukeoakdenrayner.wordpress.com/2017/12/18/the-chestxray14-dataset-problems/

In 2017 Luke Oakden-Rayner published a blog post discussing issues with the labels in the NIH data.

This led to more work on automatic label extraction.

In a sample of images red are said to be wrong

Page 7: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

2019: the year of chest X-ray data

PADCHEST160k imagesMultiple views

Almost 200 labels

27% hand labelled, others using an RNN.

License:Creative Commons Attribution-ShareAlike

CheXpert224k images

PA and L views13 labels.

Automated rule-based labeler

Non-commercial research purposes only

MIMIC-CXR377k images

PA and L views13 labels.

Automated rule-based labeler. NIH (NegBio) and CheX

labelers ran.

Non-commercial research purposes only. Confidentially

training required.

Page 8: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

8/28

PADCHEST, ~200 labels

27% hand labelled, others using an RNN.

CheXpert, 13 labels

Custom rule-based labeler.

MIMIC-CXR, 13 labels

Automated rule-based labeler. NIH (NegBio) and

CheX labelers used.

NIH chest X-ray1414 labels

Automated rule-based labeler (NegBio)

RSNA Pneumonia KaggleRelabelled NIH data

A group at Google relabelled a subset of NIH images

MeSH automatic labeller

Many datasets exist with different methods of obtaining labels. Automatic or hand labelled

Page 9: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

PALateral

Flattened diaphragm

Pleural effusion

Multi-modal/view inference (X-ray use case)

Here saliency maps are from models trained on single views.

These two tasks perform better when using lateral views.

[Bertrand, 2019]

Page 10: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Also: Multi-modal/view inference (MRI use case)

T1 T2

T1C Flair

Ischemic stroke lesion segmentation

(ISLES dataset)

Stroke perfusion estimation

Brain tumor segmentation

(BraTS dataset)

Image Credit: Mohammad Havaei

Page 11: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Patient 1

Patient 2

Patient 3

Incomplete

Input!

Expected:

Given:

Challenge: missing modalities/views

Page 12: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Integrating multiple views

Combine images right at the input

Take mean of activations in the

middle of the network

Concat output features of two

models with single prediction

Three losses. A network for each

modality with losses that regularize each

network.

Image: [Hashir, Quantifying the Value of Lateral Views in Deep Learning for Chest X-rays, 2020]

Page 13: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Integrating multiple views (X-ray images)

All models are about equal in performance given the right hyperparameters. Hyperparameter tuning is easier on some models but not others

Image: [Hashir, Quantifying the Value of Lateral Views in Deep Learning for Chest X-rays, 2020]

Page 14: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Chapter 1 - ReferencesBustos, A., Pertusa, A., Salinas, J.-M., & de la Iglesia-Vayá, M. (2019). PadChest: A large chest x-ray image dataset with multi-label

annotated reports. ArXiv Preprint. http://arxiv.org/abs/1901.07441Wang, X., Peng, Y., Lu, L., Lu, Z., Bagheri, M., & Summers, R. M. (2017). ChestX-ray8: Hospital-scale Chest X-ray Database and

Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases. Computer Vision and Pattern Recognition. https://doi.org/10.1109/CVPR.2017.369

Rajpurkar, P., Irvin, J., Zhu, K., Yang, B., Mehta, H., Duan, T., Ding, D., Bagul, A., Langlotz, C., Shpanskaya, K., Lungren, M. P., & Ng, A. Y. (2017). CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning. Arxiv. http://arxiv.org/abs/1711.05225

Viviano, J. D., Simpson, B., Dutil, F., Bengio, Y., & Cohen, J. P. (2019). Underwhelming Generalization Improvements From Controlling Feature Attribution. Arxiv:1910.00199. http://arxiv.org/abs/1910.00199

Hashir, M., Bertrand, H., & Cohen, J. P. (2020). Quantifying the Value of Lateral Views in Deep Learning for Chest X-rays. Medical Imaging with Deep Learning.

Havaei, M., Guizard, N., Chapados, N., & Bengio, Y. (2016). HeMIS: Hetero-modal image segmentation. Medical Image Computing and Computer Assisted Intervention, 9901 LNCS. https://doi.org/10.1007/978-3-319-46723-8_54

Rubin, J., Sanghavi, D., Zhao, C., Lee, K., Qadir, A., & Xu-Wilson, M. (2018, April 20). Large Scale Automated Reading of Frontal and Lateral Chest X-Rays using Dual Convolutional Neural Networks. Machine Intelligence in Medical Imaging.

Cohen, J. P., Hashir, M., Brooks, R., & Bertrand, H. (2020). On the limits of cross-domain generalization in automated X-ray prediction. Medical Imaging with Deep Learning. https://arxiv.org/abs/2002.02497

Cohen, J. P., Viviano, J., Hashir, M., & Bertrand, H. (2020). TorchXRayVision: A library of chest X-ray datasets and models. https://github.com/mlmed/torchxrayvision

Page 15: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

15

Chapter 2

Histology and Segmentation

Page 16: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Peter Bandi, et al. From detection of individual metastases to classification of lymph node status at the patient level: the CAMELYON17 challenge. IEEE-TMI 2018

CAMELYON17: A large high resolution open histology dataset for cancer detection

CAMELYON17 Dataset1000 whole-slide images (WSIs) of sentinel lymph node. (~3GB each!)5 medical centers. 40 patients from each center. 5 whole-slide images per patient.

Page 17: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

https://colab.research.google.com/drive/13T9s3weexAw6YskKoY6c-VvoUgUvWsqf

Starting with a full slide image of breast tissue. Image is labelled as IDC or not Image is chopped into patches

and labelled as IDC or not

Patch wise segmentationUse case: Invasive Ductal Carcinoma (most common subtype of all breast cancers)

Page 18: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Slide design: Fei-Fei Li & Andrej Karpathy & Justin Johnson

Extract patch

Run througha CNN

Classify center pixel

CNN p(cancer)

Class imbalance is an issue. Patch wise training allows easy balancing of classes using standard methods.

Patch wise segmentationUse case: Invasive Ductal Carcinoma (most common subtype of all breast cancers)

Page 19: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Fully convolutional processing

Kernel size 3

Kernel size 2

Input size 4

Output size 1

Page 20: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Fully convolutional processing

Kernel size 3

Kernel size 2

Input size 5

Output size 2

Page 21: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Fully convolutional processing

Kernel size 3

Kernel size 2

Input size 5

Output size 2

Input size 4Input size 4

Page 22: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Fully convolutional processing

Kernel size 3

Kernel size 2

Input size 5

Output size 2

Model's receptive field = 4 nodes

Multiplications saved = 4

Allows for very fast inference.

However, training this way requires a lot of memory. Need to save past outputs.

Patch wise training together with FCN inference is a good balance.

Input size 4Input size 4

Page 23: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

https://colab.research.google.com/drive/13T9s3weexAw6YskKoY6c-VvoUgUvWsqf

Input image Output class 0 Output class 1 class 1 > class 0

Page 24: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Noh et al, “Learning Deconvolution Network for Semantic Segmentation”, ICCV 2015Slide design: Fei-Fei Li & Andrej Karpathy & Justin Johnson

Normal VGG “Upside down” VGG

Recap: Segmentation using a bottleneck

Upsampling possible with ● Unpooling● Transposed convolutions

Page 25: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Recap: U-NET

Difference: Skip connections (like resnet)

Dogma: skips carry spatial information, bottleneck carries

high level structure.

Page 26: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Segmentation metricsgt pred

True PositiveTrue Negative

False Negative

False Positive

IoU=0.4 IoU=0.7 IoU=0.9

Page 27: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Training with dice

Using the dot product to compute the intersection allows for a differentiable loss.

For multiple classes a basic approach is to average over all

classes

Exercise: What p maximizes this?

More reading: https://arxiv.org/abs/1707.00478

Use a sigmoid or a softmax to restrict output.

Page 28: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Images provided by Konrad Wagstyl (University College London) 2020

input gt seg

Predicted p(cortex) Edge prediction pred seg

Baseline

With edge prediction

Tricks: Improving edges in segmentations by predicting edges

Brain histology image

More reading about idea: [Polzounov, WordFence: Text Detection in Natural Images with Border Awareness, 2017]

Ground truthp(cortex)

Task: segment cortical layers in brain histology

Model output

None

Page 29: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Challenge: extreme class imbalance (e.g. lung nodule)

Background classes can dominates the loss and cause learning instability do to large

gradients.

Balanced sampling may not work as well because patches which could yield false

positives are rarely seen to train on.

Page 30: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

CASED importance sampling for large images

[Jesson, https://arxiv.org/abs/1807.10819 ]

General Idea:

Store a probability for each patch.

Generate patches based on this probability.

Probability is inverse of how well your model performs on that patch.

Samples are stratified by class.

Page 31: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Chapter 2 - References

Fidon, L., Li, W., Garcia-Peraza-Herrera, L. C., Ekanayake, J., Kitchen, N., Ourselin, S., & Vercauteren, T. (2017). Generalised Wasserstein Dice Score for Imbalanced Multi-class Segmentation using Holistic Convolutional Networks. http://arxiv.org/abs/1707.00478

Litjens, G., Bandi, P., Bejnordi, B. E., et al (2018). 1399 H&E-stained sentinel lymph node sections of breast cancer patients: The CAMELYON dataset. In GigaScience (Vol. 7, Issue 6). https://doi.org/10.1093/gigascience/giy065

Noh, H., Hong, S., & Han, B. (2015). Learning Deconvolution Network for Semantic Segmentation. International Conference on Computer Vision. https://arxiv.org/abs/1505.04366

Polzounov, A., Ablavatski, A., Escalera, S., Lu, S., & Cai, J. (2018). Wordfence: Text detection in natural images with border awareness. International Conference on Image Processing, 2017-Septe, 1222–1226. https://doi.org/10.1109/ICIP.2017.8296476

Jesson, A., Guizard, N., Ghalehjegh, S. H., Goblot, D., Soudan, F., & Chapados, N. (2017, September 10). CASED: Curriculum Adaptive Sampling for Extreme Data Imbalance. Medical Image Computing and Computer Assisted Intervention. https://doi.org/10.1007/978-3-319-66179-7_73

Janowczyk, A., & Madabhushi, A. (2016). Deep learning for digital pathology image analysis: A comprehensive tutorial with selected use cases. Journal of Pathology Informatics, 7(1), 29. https://doi.org/10.4103/2153-3539.186902

Page 32: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

32

Chapter 3

Cell Counting

Page 33: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Use case: Proliferation/Cell growth studies

Treat cells with different compounds and observe proliferation over time

Standard 96-well plate

Page 34: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Bachstetter, MW151 Inhibited IL-1? Levels after Traumatic Brain Injury with No Effect on Microglia Physiological Responses, PLOS ONE, 2017

Use case: Proliferation/Cell growth studies

Page 35: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Complicated cell structure

Use case: Counting in histology slides

Page 36: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

1. Create binary segmentation image

2. Watershed segmentation

3. Isolate and count

Cell counting (classic CV)

Page 37: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

1. Create binary segmentation image

2. Watershed segmentation

3. Isolate and count

Cell counting (classic CV)

Page 38: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

1. Create binary segmentation image

2. Watershed segmentation

3. Isolate and count

Cell counting (classic CV)

Page 39: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

1. Create binary segmentation image

2. Watershed segmentation

3. Isolate and count

Cell counting (classic CV)

Page 40: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

1. Create binary segmentation image

2. Watershed segmentation

3. Isolate and count

Cell counting (classic CV)

This works well on easy tasks but doesn't scale.

"Pipelines" end up breaking on new images with different lighting or stain.

Page 41: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

How to get labels?

Page 42: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

V. Lempitsky and A. Zisserman, “Learning To Count Objects in Images,” 2010.

Counting via Segmentation

Targets for regressionSigma is typically small

like a few pixels

Page 43: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

V. Lempitsky and A. Zisserman, “Learning To Count Objects in Images,” 2010.

Counting via Segmentation

Targets for regressionSigma is typically small

like a few pixels

Train model to regress

Page 44: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

V. Lempitsky and A. Zisserman, “Learning To Count Objects in Images,” 2010.

Counting via Segmentation

To recover count:

Targets for regressionSigma is typically small

like a few pixels

Train model to regress

Page 45: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

V. Lempitsky and A. Zisserman, “Learning To Count Objects in Images,” 2010.

Counting via Segmentation

To recover count:

Targets for regressionSigma is typically small

like a few pixels

Train model to regress

Note: Square kernels for redundant countingwork better [Cohen 2017]

Page 46: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Multiple output classes

Count and classify different cell types [Bidart 2018]

Counting and classifying also possible using multiple output channels.

Combine losses together

Max prediction over output channels for each cell identified

Page 47: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Chapter 3 - References

Lempitsky, V., & Zisserman, A. (2010). Learning To Count Objects in Images. Neural Information Processing Systems (NeurIPS).

Cohen, J. P., Boucher, G., Glastonbury, C. A., Lo, H. Z., & Bengio, Y. (2017). Count-ception: Counting by Fully Convolutional Redundant Counting. International Conference on Computer Vision Workshop on BioImage Computing. http://arxiv.org/abs/1703.08710

Xie, W., Noble, J. A., & Zisserman, A. (2016). Microscopy cell counting and detection with fully convolutional regression networks. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization. https://doi.org/10.1080/21681163.2016.1149104

Gangeh, M. J., Bidart, R., Peikari, M., Martel, A. L., Ghodsi, A., Salama, S., & Nofech-Mozes, S. (2018). Localization and classification of cell nuclei in post-neoadjuvant breast cancer surgical specimen using fully convolutional networks. In M. N. Gurcan & J. E. Tomaszewski (Eds.), Medical Imaging 2018: Digital Pathology (Vol. 10581, p. 23). SPIE. https://doi.org/10.1117/12.2292815

BBBC021 - Human MCF7 cells – compound-profiling

RxRx1 - CellSignal: Disentangling biological signal from experimental noise

MBM - Modified Bone Marrow cell counting dataset

Page 48: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

48

Chapter 4

Incorrect Feature Attribution

Page 49: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Incorrect feature attribution

[Ross, Right for the Right Reasons, 2017][Viviano, Underwhelming Generalization Improvements From Controlling Feature Attribution, 2019]

Goal: predict if there are two plus signs anywhere

However, an easy to spot confounder exists!

The confounding variable distracts the model causing it to fail to generalize.

Page 50: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Incorrect feature attribution

[Ross, Right for the Right Reasons, 2017][Viviano, Underwhelming Generalization Improvements From Controlling Feature Attribution, 2019]

Goal: predict if there are two plus signs anywhere

However, an easy to spot confounder exists!

The confounding variable distracts the model causing it to fail to generalize.

Page 51: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Incorrect feature attribution

[Ross, Right for the Right Reasons, 2017][Viviano, Underwhelming Generalization Improvements From Controlling Feature Attribution, 2019]

Goal: predict if there are two plus signs anywhere

However, an easy to spot confounder exists!

The confounding variable distracts the model causing it to fail to generalize.

We can observe this by looking at the saliency map

Page 52: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Incorrect feature attribution

Models can overfit to confounding variables in the data.

● Merging datasets with different class imbalance (confounding artifacts from each hospital)

● Labels confounding with each other

● Demographics confounding with labels

[Ross, Right for the Right Reasons, 2017][Zeck, Confounding variables can degrade generalization performance of radiological ..., 2018]

[Viviano, Underwhelming Generalization Improvements From Controlling Feature Attribution, 2019][Simpson, GradMask: Reduce Overfitting by Regularizing Saliency, 2019]

Page 53: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Incorrect feature attribution

Models can overfit to confounding variables in the data.

● Merging datasets with different class imbalance (confounding artifacts from each hospital)

● Labels confounding with each other

● Demographics confounding with labels

[Ross, Right for the Right Reasons, 2017][Zeck, Confounding variables can degrade generalization performance of radiological ..., 2018]

[Viviano, Underwhelming Generalization Improvements From Controlling Feature Attribution, 2019][Simpson, GradMask: Reduce Overfitting by Regularizing Saliency, 2019]

(10k images)

Example:Systematic discrepancy between average image in datasets

Page 54: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Incorrect feature attribution

Recall:NIH/PADCHEST Diff

[Viviano, Underwhelming Generalization Improvements From Controlling Feature Attribution, 2019]

Page 55: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Mitigation approachesFeature engineering

● Range normalization ( /max)● Subspace alignment (align data using their eigenbasis based on a feature) [Fernando 2014]● Removing the largest principle component (joint PCA and reconstruct without largest eigenvector)

55

Page 56: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Mitigation approachesFeature engineering

● Range normalization ( /max)● Subspace alignment (align data using their eigenbasis based on a feature) [Fernando 2014]● Removing the largest principle component (joint PCA and reconstruct without largest eigenvector)

During training

● Reverse gradient (make intermediate layer invariant to a label) [Ganin & Lempitsky, 2014]● Right for the Right Reasons (regularize saliency map) [Ross, Hughes, & Finale Doshi-Velez, 2017]● GradMask (regularize contrast saliency map between classes) [Simpson, 2019]● ActivDiff (regularize representation to focus on pathology) [Viviano, 2019]

56

What if feature artifact is correlated with target label?Is the reason that should be used for prediction known?What if it is not known?

Page 57: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Chapter 4 - References

Ganin, Y., & Lempitsky, V. (2015, September 26). Unsupervised Domain Adaptation by Backpropagation. Proceedings of the International Conference on Machine Learning (ICML). http://jmlr.org/proceedings/papers/v37/ganin15.html

Ross, A., Hughes, M. C., & Doshi-Velez, F. (2017). Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations. International Joint Conference on Artificial Intelligence. https://github.com/dtak/rrr.

Viviano, J. D., Simpson, B., Dutil, F., Bengio, Y., & Cohen, J. P. (2019). Underwhelming Generalization Improvements From Controlling Feature Attribution. Arxiv:1910.00199. http://arxiv.org/abs/1910.00199

Simpson, B., Dutil, F., Bengio, Y., & Cohen, J. P. (2019, April 16). GradMask: Reduce Overfitting by Regularizing Saliency. Medical Imaging with Deep Learning Workshop. http://arxiv.org/abs/1904.07478

Fernando, B., Habrard, A., Sebban, M., & Tuytelaars, T. (2014). Subspace Alignment For Domain Adaptation. http://arxiv.org/abs/1409.5241

Zech, J. R., Badgeley, M. A., Liu, M., Costa, A. B., Titano, J. J., & Oermann, E. K. (2018). Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: A cross-sectional study. PLoS Medicine, 15(11). https://doi.org/10.1371/journal.pmed.1002683

Seyyed-Kalantari, L., Liu, G., McDermott, M., & Ghassemi, M. (2020). CheXclusion: Fairness gaps in deep chest X-ray classifiers. http://arxiv.org/abs/2003.00827

Page 58: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

58

Chapter 5

GANs in Medical Imaging

Page 59: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Medical image-to-image translation considered harmful

MR -> CT CT -> PETSynthesized H&E staining

Adversarial losses are very good at distribution matching

(e.g. CycleGAN).But artifacts could be introduced

and then used in diagnosis which can be dangerous.

Many papers have proposed methods that can "translate between modalities"

Page 60: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

But a bias in training data can lead to incorrect translation

T1 Transformed

Image Translation/Synthesis

Undersampled raw MRI

source data

60

Use case: MRI modality transformation

Cohen, Distribution Matching Losses Can Hallucinate Features in Medical Image Translation, 2018

Everyone is so healthy!

Page 61: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

But a bias in training data can lead to incorrect translation

T1 Transformed

Everyone is so healthy!

T1 Real

Real Image

Image Translation/Synthesis

Source Image

61

Use case: MRI modality transformation

Cohen, Distribution Matching Losses Can Hallucinate Features in Medical Image Translation, 2018

Undersampled raw MRI

source data

Page 62: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Tumors here are a proxy to illustrate the impact of an unaccounted pathology

Cohen, Distribution Matching Losses Can Hallucinate Features in Medical Image Translation, 201862

Page 63: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

63[Goldsborough, CytoGAN: Generative Modeling of Cell Images, 2017]

Latent space interpretation

Vector algebra:

Real Real

Example: CytoGAN learning a self-supervised representation for cell images.● Encoder can be useful for semi-supervised learning ● Exploring representations to understand the cell biology

Adversarial losses are useful for representation learning

Page 64: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Semi-supervised Segmentation with GANs

Images with segmentation labels

Images without segmentation labels

Page 65: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Predicted segmentations from unlabelled images

Semi-supervised Segmentation with GANs

Predicted segmentations from images that were trained on

Match distributions

Luc et al. "Semantic Segmentation using Adversarial Networks" 2016Zhang et al., "Deep Adversarial Networks for Biomedical Image Segmentation Utilizing Unannotated Images," 2017

Page 66: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Semi-supervised Segmentation with GANs

Segmentation Loss

E should predict 1 for labelled examples

Luc et al. "Semantic Segmentation using Adversarial Networks" 2016Zhang et al., "Deep Adversarial Networks for Biomedical Image Segmentation Utilizing Unannotated Images," 2017

Update discriminator

E

Update segmenter

S

E should predict 0 for unlabelled examples

Segmentation output should not make E predict 0

Page 67: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Explanation by Progressive Exaggeration

Train a classifier and generative model jointly while maintaining consistency between them.

[Singla et al. Explanation by Progressive Exaggeration. ICLR 2020]

Explainer function:(cf outputs a one hot)

Page 68: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Explanation by Progressive Exaggeration

[Singla et al. Explanation by Progressive Exaggeration. ICLR 2020]

Generating images conditioned on an over and under prediction of the model helps explain what aspects of the image were important in prediction.

Here we can see the heart enlarge or shrink.

Prediction (normalized heart size)

Page 69: Watch Medical Imaging with Deep Learning Tutorial€¦ · Ronald Summers NIH Clinical Center Released 2017, first large scale chest X-ray dataset >100k frontal images released as

Chapter 5 - References

Cohen, J. P., Luck, M., & Honari, S. (2018). Distribution Matching Losses Can Hallucinate Features in Medical Image Translation. Medical Image Computing & Computer Assisted Intervention (MICCAI).

Goldsborough, P., Pawlowski, N., Caicedo, J. C., Singh, S., & Carpenter, A. E. (2017). CytoGAN: Generative Modeling of Cell Images. Workshop On Machine Learning In Computational Biology, Neural Information Processing Systems. https://doi.org/10.1101/227645

Luc, P., Couprie, C., Chintala, S., & Verbeek, J. (n.d.). Semantic Segmentation using Adversarial Networks. Retrieved December 5, 2017, from https://arxiv.org/pdf/1611.08408.pdf

Zhang, Y., Lin, Y., Chen, J., Fredericksen, M., Hughes, D. P., Chen, D. Z., Yang, L., Chen, J., Fredericksen, M., Hughes, D. P., & Chen, D. Z. (2017, September 10). Deep Adversarial Networks for Biomedical Image Segmentation Utilizing Unannotated Images. Medical Image Computing and Computer-Assisted Intervention. https://doi.org/10.1007/978-3-319-66179-7

Singla, S., Pollack, B., Chen, J., & Batmanghelich, K. (2020, November 1). Explanation by Progressive Exaggeration. International Conference on Learning Representations. http://arxiv.org/abs/1911.00483