R-CNN - TAU web.eng.tau.ac.il/deep_learn/wp-content/uploads/2017/01/RCNN.pdf

R-CNN

Over 2180 citations!

Method → $B_p$ - Predicted Bounding Box

Confidence score per detection

Ground Truth → $B_{gt}$ - Actual Bounding Box

Area Overlap ≜ IoU:

$$\mathrm{IoU} \triangleq \frac{\mathrm{Area}(B_p \cap B_{gt})}{\mathrm{Area}(B_p \cup B_{gt})}$$

Correct Detection: $\mathrm{IoU} > \frac{1}{2}$
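
As a minimal sketch of the overlap criterion above, the following Python computes IoU for two axis-aligned boxes and applies the 1/2 threshold. The `(x1, y1, x2, y2)` corner format and the names `iou` and `is_correct_detection` are assumptions made for illustration; the slides do not fix a box encoding.

```python
def iou(box_p, box_gt):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_p[0], box_gt[0])
    iy1 = max(box_p[1], box_gt[1])
    ix2 = min(box_p[2], box_gt[2])
    iy2 = min(box_p[3], box_gt[3])
    # Width and height clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_gt = (box_gt[2] - box_gt[0]) * (box_gt[3] - box_gt[1])
    union = area_p + area_gt - inter
    return inter / union if union > 0 else 0.0


def is_correct_detection(box_p, box_gt, threshold=0.5):
    """A detection counts as correct when IoU is strictly above the threshold."""
    return iou(box_p, box_gt) > threshold
```

For example, two 2x2 boxes offset by one unit in each direction share an area of 1 out of a union of 7, so they fall well below the threshold.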

Average Precision ≜ AP

Mean Average Precision ≜ mAP ≜ Mean(AP over all classes)
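
The slides name AP and mAP without spelling out how AP is computed. The sketch below uses one common variant, the area under the interpolated precision-recall curve for detections ranked by confidence; benchmarks differ in how they interpolate, so treat this as an assumption rather than the exact evaluation protocol. The function names and the `(confidence, is_correct)` input format are hypothetical.

```python
def average_precision(scored_hits, num_gt):
    """AP for one class.

    scored_hits: list of (confidence, is_correct) pairs, one per detection,
                 where is_correct comes from the IoU test.
    num_gt: number of ground-truth boxes of this class.
    """
    if num_gt == 0:
        return 0.0
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_correct) in enumerate(ranked, start=1):
        true_positives += 1 if is_correct else 0
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_gt)
    # Interpolate: precision at each rank becomes the best precision to its right.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum rectangle areas under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for prec, rec in zip(precisions, recalls):
        ap += (rec - prev_recall) * prec
        prev_recall = rec
    return ap


def mean_average_precision(ap_per_class):
    """mAP: the mean of the per-class AP values (dict: class name -> AP)."""
    return sum(ap_per_class.values()) / len(ap_per_class)
```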

Input image

Regions of interest (ROI) from a proposal method (~2k)

Warped image regions

Forward each region through ConvNet

Classify each region with SVMs
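
The stages on this slide (propose regions, warp each one, run it through the ConvNet, score it with the SVMs) can be sketched as a simple loop. Everything model-specific is passed in as a callable: `propose`, `extract_features` and `score_classes` are hypothetical placeholders for the proposal method, the ConvNet and the per-class SVMs. The 227-pixel warp size, the nearest-neighbour resampling and the image-as-nested-lists format are also assumptions made to keep the example self-contained.

```python
def warp_region(image, box, size=227):
    """Crop box = (x1, y1, x2, y2) from image (a list of pixel rows) and
    resize it to size x size with nearest-neighbour sampling."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    warped = []
    for r in range(size):
        src_y = y1 + min(h - 1, r * h // size)
        row = [image[src_y][x1 + min(w - 1, c * w // size)] for c in range(size)]
        warped.append(row)
    return warped


def rcnn_detect(image, propose, extract_features, score_classes, size=227):
    """Run the per-region pipeline and return (box, class, score) triples."""
    detections = []
    for box in propose(image):                 # ~2k regions of interest
        patch = warp_region(image, box, size)  # warped image region
        features = extract_features(patch)     # forward through the ConvNet
        scores = score_classes(features)       # dict: class name -> SVM score
        # Simplification: keep only the top-scoring class for each region.
        best = max(scores, key=scores.get)
        detections.append((box, best, scores[best]))
    return detections
```

Because every region is warped and forwarded separately, the cost of this loop grows linearly with the number of proposals.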

Mini-batch size of 128

Better mAP of 3-5%

Apply bounding box regressors
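
A minimal sketch of what applying a bounding box regressor to a scored region could look like. It assumes the regressor outputs four offsets `(dx, dy, dw, dh)` in the usual centre/size form, with the centre shifted in units of the box size and the size scaled in log space; the function name and the corner box format are hypothetical.

```python
import math


def apply_box_deltas(box, deltas):
    """Refine box = (x1, y1, x2, y2) with deltas = (dx, dy, dw, dh)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    dx, dy, dw, dh = deltas
    # Shift the centre proportionally to the box size; scale the size in log space.
    new_cx, new_cy = cx + dx * w, cy + dy * h
    new_w, new_h = w * math.exp(dw), h * math.exp(dh)
    return (new_cx - 0.5 * new_w, new_cy - 0.5 * new_h,
            new_cx + 0.5 * new_w, new_cy + 0.5 * new_h)
```

With all offsets at zero the box is returned unchanged, which is a convenient sanity check for any implementation.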

Fast R-CNN, arXiv: 1504.08083 (2015)

By: Ross Girshick, Microsoft Research

$$L(p, u, t^u, v) = L_{cls}(p, u) + \lambda \cdot [u \ge 1] \cdot L_{loc}(t^u, v)$$

$p = (p_0, p_1, \ldots, p_K)$ over K + 1 categories

$t^k = (t^k_x, t^k_y, t^k_w, t^k_h)$ for each of the K object classes, indexed by k

$u$ is the ground truth class of the RoI

$v$ is the ground truth bounding box

$L_{cls}(p, u) = -\log p_u$

$\lambda$ - regularization parameter that balances the two losses

$[u \ge 1]$ - foreground activation: 1 when $u \ge 1$, 0 for the background class $u = 0$

$$L_{loc}(t^u, v) = \sum_{i \in \{x, y, w, h\}} \mathrm{smooth}_{L_1}(t^u_i - v_i)$$

$$\mathrm{smooth}_{L_1}(x) = \begin{cases} 0.5 \cdot x^2, & |x| < 1 \\ |x| - 0.5, & |x| \ge 1 \end{cases}$$
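
The multi-task loss for a single RoI can be written out directly from the definitions above. This is a sketch under stated assumptions: class index 0 is taken to be background, `lam=1.0` is an illustrative default for $\lambda$, and the function names are hypothetical.

```python
import math


def smooth_l1(x):
    """Quadratic near zero, linear for |x| >= 1."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5


def multitask_loss(p, u, t_u, v, lam=1.0):
    """L(p, u, t^u, v) for one RoI.

    p:   class probabilities over K + 1 classes (index 0 = background)
    u:   ground-truth class index
    t_u: predicted (t_x, t_y, t_w, t_h) for class u
    v:   ground-truth regression target
    lam: the weight lambda (value assumed for illustration)
    """
    l_cls = -math.log(p[u])
    l_loc = sum(smooth_l1(t_i - v_i) for t_i, v_i in zip(t_u, v))
    foreground = 1.0 if u >= 1 else 0.0  # Iverson bracket [u >= 1]
    return l_cls + lam * foreground * l_loc
```

For a background RoI the localization term is switched off, so only the classification loss contributes.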

$$y_{rj} = x_{i^*(r,j)}$$

$$i^*(r, j) = \operatorname{argmax}_{i' \in \mathcal{R}(r,j)} x_{i'}$$

$$\frac{\partial L}{\partial x_i} = \sum_r \sum_j [i = i^*(r, j)] \frac{\partial L}{\partial y_{rj}}$$
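
The three equations above describe max pooling over RoI sub-windows and how gradients are routed back to the winning inputs. A small pure-Python sketch, assuming the input activations are a flat list and each sub-window $\mathcal{R}(r, j)$ is given as a list of input indices (both representations are assumptions chosen for brevity):

```python
def roi_pool_forward(x, regions):
    """x: flat list of input activations.
    regions[r][j]: list of input indices in sub-window R(r, j).
    Returns the outputs y[r][j] and the argmax indices i*(r, j)."""
    y, argmax = [], []
    for roi in regions:
        y_row, a_row = [], []
        for window in roi:
            i_star = max(window, key=lambda i: x[i])
            a_row.append(i_star)
            y_row.append(x[i_star])
        y.append(y_row)
        argmax.append(a_row)
    return y, argmax


def roi_pool_backward(num_inputs, argmax, grad_y):
    """Accumulate dL/dx_i over every (r, j) whose max was taken at input i."""
    grad_x = [0.0] * num_inputs
    for r, a_row in enumerate(argmax):
        for j, i_star in enumerate(a_row):
            grad_x[i_star] += grad_y[r][j]
    return grad_x
```

An input that wins the max in several overlapping RoIs receives the sum of their gradients, which is what the double sum over $r$ and $j$ expresses.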

Faster R-CNN, Neural Information Processing Systems (NIPS), 2015

By: S. Ren, K. He, R. Girshick, J. Sun, Microsoft Research

$$L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda \cdot \frac{1}{N_{reg}} \sum_i p_i^* \cdot L_{reg}(t_i, t_i^*)$$

$i$ - anchor index

$p_i$ - predicted probability of anchor $i$ being an object

$$p_i^* = \begin{cases} 1, & \text{if anchor } i \text{ is positive} \\ 0, & \text{if anchor } i \text{ is negative} \end{cases}$$

$L_{cls}(p_i, p_i^*)$ - log loss over two classes

$N_{cls}$ - the mini-batch size (256)
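
The classification term of the loss, normalized by $N_{cls}$, can be sketched as follows. It assumes `p` holds one objectness probability per sampled anchor and `p_star` the matching 0/1 labels; the function name is hypothetical.

```python
import math


def rpn_cls_term(p, p_star, n_cls=256):
    """(1 / N_cls) * sum_i L_cls(p_i, p_i*), with a two-class log loss.

    p:      predicted objectness probabilities, one per anchor
    p_star: anchor labels, 1 for positive and 0 for negative
    """
    total = 0.0
    for p_i, label in zip(p, p_star):
        # Log loss of the probability assigned to the true class.
        total += -math.log(p_i) if label == 1 else -math.log(1.0 - p_i)
    return total / n_cls
```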

$$L_{reg}(t_i, t_i^*) = \mathrm{smooth}_{L_1}(t_i - t_i^*)$$

$N_{reg}$ - the number of anchor locations (~2,400)

Parameterizations of all the $t_i$ using the anchors, with $t = (t_x, t_y, t_w, t_h)$:

$$t_x = (x - x_a)/w_a, \quad t_x^* = (x^* - x_a)/w_a$$

$$t_y = (y - y_a)/h_a, \quad t_y^* = (y^* - y_a)/h_a$$

$$t_w = \log(w/w_a), \quad t_w^* = \log(w^*/w_a)$$

$$t_h = \log(h/h_a), \quad t_h^* = \log(h^*/h_a)$$

$x$ - the predicted position

$x_a$ - the anchor position

$x^*$ - the GT position
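
The anchor parameterization and the regression term can be sketched together. Boxes are assumed to be `(x, y, w, h)` tuples with `(x, y)` the box centre, the smooth L1 loss is summed over the four coordinates, and `lam=10.0` is only an illustrative default for $\lambda$; all function names are hypothetical.

```python
import math


def encode(box, anchor):
    """Return (t_x, t_y, t_w, t_h) for box relative to anchor.
    Both are (x, y, w, h) with (x, y) the box centre."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return ((x - xa) / wa, (y - ya) / ha, math.log(w / wa), math.log(h / ha))


def smooth_l1(x):
    """Quadratic near zero, linear for |x| >= 1."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5


def rpn_reg_term(t, t_star, p_star, n_reg=2400, lam=10.0):
    """lam * (1 / N_reg) * sum_i p_i* * L_reg(t_i, t_i*).

    t, t_star: lists of 4-tuples (predicted and ground-truth parameterizations)
    p_star:    0/1 anchor labels; negative anchors contribute nothing
    """
    total = 0.0
    for t_i, t_star_i, label in zip(t, t_star, p_star):
        l_reg = sum(smooth_l1(a - b) for a, b in zip(t_i, t_star_i))
        total += label * l_reg
    return lam * total / n_reg
```

Applying `encode` to the predicted box gives $t$, and applying it to the ground-truth box with the same anchor gives $t^*$.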

Method          Detection mAP on PASCAL VOC (%)    Test time per image using VGG-16
                2007     2010     2012
R-CNN           58.5     53.7     62.4             47 sec
Fast R-CNN      70       68.8     68.4             300 msec (excluding object proposal time for 2K proposals)
Faster R-CNN    73.2     ---      70.4             200 msec (overall time)

Thank You For Listening

Any Questions?