Semantic Image Segmentation with Deep Learning

Sadeep Jayasumana
Torr Vision Group, Engineering Department
07/10/2015

Collaborators: Bernardino Romera-Paredes, Shuai Zheng, Philip Torr


Live Demo - http://crfasrnn.torr.vision/


Outline

Semantic segmentation

Why?

CNNs for Pixelwise prediction

CRFs

CRF as RNN

Conclusion


Semantic Segmentation

• Recognizing and delineating objects in an image, i.e. classifying each pixel in the image
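As a minimal sketch of what "classifying each pixel" means in practice: given one score map per class, the predicted segmentation is the per-pixel arg-max over classes. The label set, the array shapes, and the score values below are made up for illustration and do not come from the talk.

```python
import numpy as np

def label_pixels(scores):
    """Pick the highest-scoring class for every pixel.

    scores: array of shape (num_classes, height, width).
    Returns an integer label map of shape (height, width).
    """
    return np.argmax(scores, axis=0)

# Hypothetical label set and made-up scores for a 2x3 image.
LABELS = ["bg", "cat", "person"]
scores = np.array([
    [[0.90, 0.2, 0.1], [0.8, 0.1, 0.3]],  # bg
    [[0.05, 0.7, 0.8], [0.1, 0.8, 0.1]],  # cat
    [[0.05, 0.1, 0.1], [0.1, 0.1, 0.6]],  # person
])
label_map = label_pixels(scores)
named = [[LABELS[k] for k in row] for row in label_map]
```

Each entry of `label_map` indexes into `LABELS`, so `named` is the same segmentation written with class names.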


Why Semantic Segmentation?

• To help partially sighted people by highlighting important objects in their glasses


• To let robots segment objects so that they can grasp them

• Road scene understanding

• Useful for autonomous navigation of cars and drones

Image taken from the Cityscapes dataset.


• Useful tool for editing images



• Medical purposes: e.g. segmenting tumours, dental cavities, ...

Image taken from Mauricio Reyes

ISBI Challenge 2015, dental x-ray images



But How?

• Deep convolutional neural networks are successful at learning a good representation of the visual inputs.

• However, here we have a structured output.

CNN for Pixel-wise Labelling

• Usual convolutional networks

• Fully convolutional networks

Long et al., Fully Convolutional Networks for Semantic Segmentation, CVPR 2015.
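One way to see the step from a usual network to a fully convolutional one is that a fully connected classifier applied at a single location gives the same scores as a 1x1 convolution applied at every location. The sketch below illustrates only that equivalence. The shapes, the random weights, and the function names are assumptions for the example, not the architecture used in the talk.

```python
import numpy as np

def dense_classifier(features, weights, bias):
    """Fully connected layer on one feature vector: (channels,) -> (classes,)."""
    return weights @ features + bias

def conv1x1(feature_map, weights, bias):
    """The same weights applied at every location: (channels, H, W) -> (classes, H, W)."""
    return np.einsum("kc,chw->khw", weights, feature_map) + bias[:, None, None]

rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 4))         # 3 classes, 4 feature channels
bias = rng.normal(size=3)
feature_map = rng.normal(size=(4, 5, 6))  # a 5x6 grid of feature vectors

score_map = conv1x1(feature_map, weights, bias)
one_pixel = dense_classifier(feature_map[:, 2, 3], weights, bias)
```

`score_map[:, 2, 3]` matches `one_pixel`, so the convolutional form yields a whole grid of class scores in one pass instead of a single prediction per image.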

Fully Convolutional Networks [Long et al., CVPR 2015]

+ Significantly improved the state of the art in semantic segmentation.

- Poor object delineation: e.g. spatial consistency neglected.

Figure columns: Image, FCN results, Ground truth.

Conditional Random Fields (CRFs)

• A CRF can account for contextual information in the image

Figure: coarse output from the pixel-wise classifier → MRF/CRF modelling → output after the CRF inference.

$X_i \in \{\text{bg}, \text{cat}, \text{tree}, \text{person}, \dots\}$

• Define a discrete random variable $X_i$ for each pixel $i$.

• Each $X_i$ can take a value from the label set.

• Connect the random variables to form a random field (MRF).

• Most probable assignment given the image → segmentation.

Figure: example pixels labelled cat and bg.

Finding the Best Assignment

$$\Pr(X_1 = x_1, X_2 = x_2, \dots, X_N = x_N \mid \mathbf{I}) = \Pr(\mathbf{X} = \mathbf{x} \mid \mathbf{I})$$

$$\Pr(\mathbf{X} = \mathbf{x} \mid \mathbf{I}) \propto \exp\big(-E(\mathbf{x} \mid \mathbf{I})\big)$$

• Maximize $\Pr(\mathbf{X} = \mathbf{x})$ → Minimize $E(\mathbf{x})$

• So we have formulated the problem as an energy minimization.
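The equivalence between the most probable assignment and the lowest-energy assignment can be checked by brute force on a toy problem. The sketch below assumes a 1-D chain of three pixels, two labels, made-up unary costs, and a constant penalty for neighbouring pixels with different labels. None of these numbers come from the talk.

```python
import itertools
import numpy as np

def energy(x, unary, pairwise_weight):
    """Toy energy: unary costs plus a penalty per pair of differing neighbours."""
    cost = sum(unary[i][label] for i, label in enumerate(x))
    cost += sum(pairwise_weight for a, b in zip(x, x[1:]) if a != b)
    return cost

# unary[i][l] is the (made-up) cost of giving pixel i the label l.
unary = [[0.2, 1.5], [0.9, 0.8], [1.4, 0.3]]

# Enumerate all 2^3 assignments, which is only feasible for tiny problems.
assignments = list(itertools.product(range(2), repeat=len(unary)))
energies = np.array([energy(x, unary, 0.5) for x in assignments])

# Gibbs distribution: Pr(x) = exp(-E(x)) / Z.
probs = np.exp(-energies)
probs /= probs.sum()

best_by_prob = assignments[int(np.argmax(probs))]
best_by_energy = assignments[int(np.argmin(energies))]
```

Because `exp(-E)` is strictly decreasing in `E` and the normaliser is shared by all assignments, `best_by_prob` and `best_by_energy` always coincide.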

$$E(\mathbf{x} \mid \mathbf{I}) = \text{unary\_cost} + \text{pairwise\_cost}$$

Unary energy

$\text{cost}(X_i = x_i) = \;?$

Your label doesn’t agree with the initial classifier → you pay a penalty.

Pairwise energy

$\text{cost}(X_i = x_i, X_j = x_j) = \;?$

You assign different labels to two very similar pixels → you pay a penalty.

How do you measure similarity?
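The slides leave both costs open. One common choice, used in the dense CRF of Krähenbühl & Koltun, is a negative log-probability for the unary cost and a Potts penalty scaled by a Gaussian similarity in pixel position and colour for the pairwise cost. The sketch below follows that choice with a single kernel. The bandwidths `theta_pos` and `theta_col`, the `weight`, and the sample pixels are hypothetical values picked for the example.

```python
import numpy as np

def unary_cost(class_probs, eps=1e-12):
    """Negative log-probability: disagreeing with a confident classifier is expensive."""
    return -np.log(np.asarray(class_probs, dtype=float) + eps)

def similarity(pos_i, pos_j, col_i, col_j, theta_pos=3.0, theta_col=10.0):
    """Gaussian similarity in position and colour (hypothetical bandwidths)."""
    d_pos = np.sum((np.asarray(pos_i, float) - np.asarray(pos_j, float)) ** 2)
    d_col = np.sum((np.asarray(col_i, float) - np.asarray(col_j, float)) ** 2)
    return float(np.exp(-d_pos / (2 * theta_pos ** 2) - d_col / (2 * theta_col ** 2)))

def pairwise_cost(label_i, label_j, sim, weight=1.0):
    """Potts-style penalty: only differing labels pay, scaled by similarity."""
    return weight * sim if label_i != label_j else 0.0

# Two neighbouring pixels with nearly the same colour, and two with very different colours.
near_same_colour = similarity((0, 0), (0, 1), (200, 30, 30), (198, 32, 29))
near_diff_colour = similarity((0, 0), (0, 1), (200, 30, 30), (20, 180, 90))
```

With these definitions, giving different labels to the two similar pixels costs close to `weight`, while doing so across the colour edge costs almost nothing, which is what lets label boundaries follow image edges.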

Dense CRF Formulation

• Pairwise energies are defined for every pixel pair in the image.

$$E(\mathbf{x}) = \sum_i \psi_{\text{unary}}(x_i) + \sum_{i,j} \psi_{\text{pairwise}}(x_i, x_j)$$

• Exact inference is not feasible.

• Use approximate mean field inference.

[Krähenbühl & Koltun, NIPS 2011]

$$P(\mathbf{x}) \propto \exp\big(-E(\mathbf{x})\big), \qquad P(\mathbf{x}) \approx Q(\mathbf{x}) = \prod_i Q_i(x_i)$$
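A naive version of the mean-field update can be written with an explicit N x N similarity matrix, which is only practical for a handful of pixels. Efficient dense CRF inference replaces this matrix product with fast filtering. The sketch below assumes a Potts compatibility, a single Gaussian kernel on intensity, and a made-up four-pixel signal in which the classifier is slightly wrong on the second pixel.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mean_field(unary, kernel, compat, num_iters=5):
    """Naive mean-field inference for a fully connected CRF.

    unary:  (N, L) unary energies.
    kernel: (N, N) pairwise similarities with a zero diagonal.
    compat: (L, L) cost of each label pair.
    Returns Q of shape (N, L); row i is the marginal Q_i.
    """
    q = softmax(-unary)
    for _ in range(num_iters):
        messages = kernel @ q            # sum_j k(i, j) * Q_j(l')
        pairwise = messages @ compat.T   # sum_l' compat(l, l') * messages(l')
        q = softmax(-unary - pairwise)
    return q

# Four pixels on a line: two dark, two bright (made-up intensities).
intensity = np.array([0.0, 0.1, 0.9, 1.0])
probs = np.array([[0.9, 0.1], [0.45, 0.55], [0.1, 0.9], [0.1, 0.9]])
unary = -np.log(probs)

diff = intensity[:, None] - intensity[None, :]
kernel = np.exp(-diff ** 2 / (2 * 0.2 ** 2))
np.fill_diagonal(kernel, 0.0)            # a pixel sends no message to itself
compat = 1.0 - np.eye(2)                 # Potts model

before = probs.argmax(axis=1)
after = mean_field(unary, kernel, compat).argmax(axis=1)
```

In this toy case the second pixel starts with the wrong label and is pulled to the label of its very similar neighbour, while the two bright pixels keep theirs.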

Fully Connected CRFs as a CNN

Diagram (built up one block per slide): Bilateral → Conv → Conv → + → SoftMax, with inputs Q, I and U.

CRF as a Recurrent Neural Network

• Each of these blocks is differentiable → we can backprop.

Diagram 1 (mean-field iteration): Bilateral → Conv → Conv → + → SoftMax, with inputs Q, I and U.

Diagram 2 (CRF as RNN): blocks SoftMax and CRF Iteration, with inputs Image and Unaries, producing Output.
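The block structure on these slides can be sketched as one function per mean-field iteration, applied repeatedly with the same parameters, which is what makes the unrolled computation look like a recurrent network. This is a NumPy illustration only: the dense kernel matrices stand in for the bilateral filtering, the parameter names `filter_weights` and `compat` are assumptions, and no gradients are computed here.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def crf_iteration(q, unaries, kernels, filter_weights, compat):
    """One mean-field iteration written as the five blocks.

    q, unaries: (N, L); kernels: (M, N, N); filter_weights: (M,); compat: (L, L).
    `unaries` are scores here (higher is better), i.e. negative unary energies.
    """
    filtered = np.stack([k @ q for k in kernels])              # "Bilateral": message passing
    weighted = np.tensordot(filter_weights, filtered, axes=1)  # "Conv": weight the filter outputs
    pairwise = weighted @ compat.T                             # "Conv": label compatibility
    combined = unaries - pairwise                              # "+": combine with the unaries
    return softmax(combined)                                   # "SoftMax": normalise

def crf_rnn(unaries, kernels, filter_weights, compat, num_iterations=5):
    """Unrolled recurrence: the same parameters are reused in every iteration."""
    q = softmax(unaries)                                       # initial estimate from the unaries
    for _ in range(num_iterations):
        q = crf_iteration(q, unaries, kernels, filter_weights, compat)
    return q

rng = np.random.default_rng(0)
unaries = rng.normal(size=(6, 3))          # 6 pixels, 3 labels (made-up scores)
kernels = rng.uniform(size=(2, 6, 6))      # 2 stand-in similarity kernels
for k in kernels:
    np.fill_diagonal(k, 0.0)
filter_weights = np.array([1.0, 0.5])
compat = 1.0 - np.eye(3)

q_out = crf_rnn(unaries, kernels, filter_weights, compat)
```

Every step is a matrix product, a sum, or a softmax, so in a framework with automatic differentiation the error signal could flow back through all iterations to `filter_weights`, `compat`, and the network that produced `unaries`.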

Putting Things Together

FCN → CRF-RNN

Experiments

FCN [Long et al, 2014]: 68.3
FCN + CRF [Chen et al, 2015]: 69.5
FCN + CRF-RNN (ours): 72.9

Try our demo: http://crfasrnn.torr.vision
Code & model: https://github.com/torrvision/crfasrnn

Shuai Zheng

Bernardino Romera-Paredes

Philip Torr


Examples

http://pp.vk.me/c622119/v622119584/20dc3/7lS5BU2Bp_k.jpg


http://media1.fdncms.com/boiseweekly/imager/mountain-bikers-are-advised-to-dism/u/original/3446917/walk_thru_sheep_1_.jpg


http://img.rtvslo.si/_up/upload/2014/07/22/65129194_tour-3.jpg


http://www.toxel.com/wp-content/uploads/2010/11/bike05.jpg


Not-so-good examples

http://www.independent.co.uk/incoming/article10335615.ece/alternates/w620/planecat.jpg


http://i1.wp.com/theverybesttop10.files.wordpress.com/2013/02/the-world_s-top-10-best-images-of-camouflage-cats-5.jpg?resize=375,500



Tricky examples

http://se-preparer-aux-crises.fr/wp-content/uploads/2013/10/Golum.png


https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcRf4J7Hszkc8Wf6riVUX-cV_K-un8LJy5dYIBW1KDIn6i7UCzGHpg



http://i.huffpost.com/gen/1478236/thumbs/s-DIRD6-large640.jpg



Conclusion

• CNNs yield a coarse prediction on pixel-labelling tasks.

• CRFs improve the result by accounting for the contextual information in the image.

• Learning the whole pipeline end-to-end significantly improves the results.

CNN CRF

Thank You!