hiroshi-fukui
Histograms of Oriented Gradients (HOG) --
Deep Learning
Machine Perception Robotics Group
Histograms of Oriented Gradients (HOG) + Support Vector Machine (SVM) [Dalal 2005]
[Dalal 2005] N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection", CVPR, 2005.
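A minimal sketch of the cell-histogram step of HOG, to make the feature concrete. It is a simplification and not the implementation of [Dalal 2005]: it uses hard binning, unsigned gradients, and no block normalization. The function name `hog_cell_histograms` and the defaults (8x8 cells, 9 bins) are assumptions chosen for illustration. In a detector, the flattened histograms of a window would then be scored by a linear SVM.

```python
import numpy as np

def hog_cell_histograms(image, cell_size=8, n_bins=9):
    """Magnitude-weighted orientation histograms per cell (simplified HOG)."""
    image = np.asarray(image, dtype=np.float64)
    gy, gx = np.gradient(image)              # derivatives along rows, then columns
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees
    orientation = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    bin_width = 180.0 / n_bins
    bin_index = np.minimum((orientation // bin_width).astype(int), n_bins - 1)

    n_rows = image.shape[0] // cell_size
    n_cols = image.shape[1] // cell_size
    hist = np.zeros((n_rows, n_cols, n_bins))
    for r in range(n_rows):
        for c in range(n_cols):
            rows = slice(r * cell_size, (r + 1) * cell_size)
            cols = slice(c * cell_size, (c + 1) * cell_size)
            # Each pixel votes for its orientation bin, weighted by gradient magnitude
            hist[r, c] = np.bincount(bin_index[rows, cols].ravel(),
                                     weights=magnitude[rows, cols].ravel(),
                                     minlength=n_bins)
    return hist
```

For a 16x16 image this returns an array of shape (2, 2, 9), one 9-bin histogram per 8x8 cell.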
Timeline: 2004-2005, 2009-2010, 2013-2014, 2015-2016
2009: RGB, LIDAR
Deep Convolutional Neural Network
INRIA Dataset [Dalal 2005]
Caltech Pedestrian Dataset [Dollár 2009]
KITTI Dataset [Andreas 2012]
1,568 - 1,20856633,171 - 192,0004,02480,000 - 25,000 - LIDAR, GPS
[Andreas 2012] A. Geiger, P. Lenz and R. Urtasun, "Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite", CVPR, 2012.
[Dalal 2005] N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection", CVPR, 2005.
[Dollár 2009] P. Dollár, C. Wojek, B. Schiele and P. Perona, "Pedestrian Detection: A Benchmark", CVPR, 2009.
Toronto City Dataset, KITTI Dataset
RGB, LIDAR, GPS
712 km², 8,439 km, 400,000
Deep Convolutional Neural Network
2012: AlexNet [Krizhevsky 2012], 1000 classes
[Krizhevsky 2012] A. Krizhevsky, I. Sutskever and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks", NIPS, 2012.
Region proposal
CNN
~2013: Color Self Similarity, HOG + SVM
2014~
Filtered Channel Feature
LUV, HOG
ICF [Dollár 2009]
VeryFast [Benenson 2012]
ACF [Dollár 2014]
LDCF [Nam 2014]
Checkerboard [Zhang 2015]
SquaresChnFtrs [Benenson 2013]
HOG+SVM & DPM
[Benenson 2013] R. Benenson, M. Mathias, T. Tuytelaars and L. Van Gool, "Seeking the strongest rigid detector", CVPR, 2013.
[Dollár 2009] P. Dollár, Z. Tu, P. Perona and S. Belongie, "Integral Channel Features", BMVC, 2009.
[Benenson 2012] R. Benenson, M. Mathias, R. Timofte and L. Van Gool, "Pedestrian detection at 100 frames per second", CVPR, 2012.
[Nam 2014] W. Nam, P. Dollár and J. H. Han, "Local Decorrelation For Improved Pedestrian Detection", NIPS, 2014.
[Zhang 2015] S. Zhang, R. Benenson and B. Schiele, "Filtered Channel Features for Pedestrian Detection", CVPR, 2015.
[Dollár 2014] P. Dollár, R. Appel, S. Belongie and P. Perona, "Fast feature pyramids for object detection", PAMI, 2014.
Integral Channel Feature [Dollár 2009]
HOG, Boosted tree
[Dollár 2009] P. Dollár, Z. Tu, P. Perona and S. Belongie, "Integral Channel Features", BMVC, 2009.
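The core trick behind integral channel features can be sketched as follows: once a channel (for example a gradient-magnitude or LUV channel) has been turned into an integral image, the sum over any rectangle costs four lookups. This is a generic summed-area-table sketch, not code from [Dollár 2009]; the function names `integral_image` and `box_sum` are hypothetical.

```python
import numpy as np

def integral_image(channel):
    """Summed-area table with a zero row and column prepended."""
    channel = np.asarray(channel, dtype=np.float64)
    ii = np.zeros((channel.shape[0] + 1, channel.shape[1] + 1))
    ii[1:, 1:] = channel.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, left, height, width):
    """Sum of channel[top:top+height, left:left+width] using four lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]
```

Such rectangle sums over several channels are the kind of feature a boosted tree classifier can select from.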
VeryFast [Benenson 2012]
Feature pyramid, Fast feature pyramid
1 model, N scale images
N/K models, 1 scale image
[Benenson 2012] R. Benenson, M. Mathias, R. Timofte and L. Van Gool, "Pedestrian detection at 100 frames per second", CVPR, 2012.
Aggregate Channel Feature [Dollár 2014]
ICF, VeryFast
[Dollár 2014] P. Dollár, R. Appel, S. Belongie and P. Perona, "Fast feature pyramids for object detection", PAMI, 2014.
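A rough sketch of the aggregation idea in aggregate channel features: each channel is summed over small non-overlapping blocks, and single entries of the shrunken channel then serve as features. The function name `aggregate_channel` and the 4x4 block size are assumptions for illustration; smoothing and the exact channel set are omitted.

```python
import numpy as np

def aggregate_channel(channel, block=4):
    """Sum a 2-D channel over non-overlapping block x block cells."""
    channel = np.asarray(channel, dtype=np.float64)
    h = (channel.shape[0] // block) * block   # crop to a multiple of the block size
    w = (channel.shape[1] // block) * block
    cropped = channel[:h, :w]
    return cropped.reshape(h // block, block, w // block, block).sum(axis=(1, 3))
```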
Filtered Channel Feature [Nam 2014] [Zhang 2015]
LDCF, Checkerboard
[Nam 2014] W. Nam, P. Dollár and J. H. Han, "Local Decorrelation For Improved Pedestrian Detection", NIPS, 2014.
[Zhang 2015] S. Zhang, R. Benenson and B. Schiele, "Filtered Channel Features for Pedestrian Detection", CVPR, 2015.
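The filtering step can be sketched as correlating each feature channel with a small filter bank before the classifier sees it. The bank below (uniform, horizontal step, vertical step, checkerboard) is a hypothetical toy bank in the spirit of the Checkerboard filters, not the bank used in [Zhang 2015]; the function names are also assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def checkerboard_bank(size=2):
    """Toy filter bank: uniform, horizontal step, vertical step, checkerboard."""
    rows, cols = np.indices((size, size))
    uniform = np.ones((size, size))
    horizontal = np.where(cols < size / 2, 1.0, -1.0)
    vertical = np.where(rows < size / 2, 1.0, -1.0)
    checker = np.where((rows + cols) % 2 == 0, 1.0, -1.0)
    return np.stack([uniform, horizontal, vertical, checker])

def filter_channel(channel, bank):
    """Correlate one channel with every filter (valid positions only)."""
    size = bank.shape[-1]
    windows = sliding_window_view(np.asarray(channel, dtype=np.float64), (size, size))
    # windows: (H-size+1, W-size+1, size, size); bank: (F, size, size)
    return np.einsum("ijkl,fkl->fij", windows, bank)
```

On a constant channel only the uniform filter responds, which is the expected behaviour for difference-type filters.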
CNN
Method               Year  Miss rate (%)  Speed (fps)  Network
Joint Deep Learning  2013  39.32          --           CNN + RBM
SDN                  2014  37.87          0.7          CNN + RBM
EIN                  2015  37.77          1            CNN
TACNN                2015  34.99          --           AlexNet
CCF                  2015  17.32          --           VGG
Deep Cascade         2015  26.21          15           VGG
DeepParts            2015  11.89          --           GoogLeNet
CompACT              2015  11.75          2            CNN, VGGNet
1. CNN: Joint Deep Learning, Switchable Deep Network, DeepParts
Joint Deep Learning [Ouyang 2013] - Level - RBM
Switchable Deep Network [Luo 2013] 3 - RBM
DeepParts [Tian 2015] - Caltech -
[Luo 2013] P. Luo, Y. Tian, X. Wang and X. Tang, "Switchable Deep Network for Pedestrian Detection", CVPR, 2014.
[Tian 2015] Y. Tian, P. Luo, X. Wang and X. Tang, "Deep Learning Strong Parts for Pedestrian Detection", ICCV, 2015.
[Ouyang 2013] W. Ouyang and X. Wang, "Joint deep learning for pedestrian detection", ICCV, 2013.
2. CNN: Convolutional Channel Feature
CNN, Boosted tree, ACF
[Yang 2015] B. Yang, J. Yan, Z. Lei and S. Z. Li, "Convolutional Channel Features: Tailoring CNN to Diverse Tasks", ICCV, 2015.
3. Deep Cascade, Complexity-Aware Cascade Training
CNN
[Cai 2015] Z. Cai, M. Saberian and N. Vasconcelos, "Learning Complexity-Aware Cascades for Deep Pedestrian Detection", ICCV, 2015.
[Angelova 2015] A. Angelova, A. Krizhevsky, V. Vanhoucke, A. Ogale and D. Ferguson, "Real-Time Pedestrian Detection With Deep Network Cascades", BMVC, 2015.
Deep Cascade [Angelova 2015]
VeryFast, CNN, Deep Learning
VeryFast -> Tiny CNN -> Baseline CNN
[Angelova 2015] A. Angelova, A. Krizhevsky, V. Vanhoucke, A. Ogale and D. Ferguson, "Real-Time Pedestrian Detection With Deep Network Cascades", BMVC, 2015.
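The cascade idea can be sketched generically: cheap stages run first and reject most windows, so the expensive network only sees the few survivors. The sketch below assumes each stage is a scoring function with a rejection threshold; `run_cascade`, the `Stage` type, and the toy scorers in the usage lines are hypothetical and stand in for VeryFast, the tiny CNN, and the baseline CNN.

```python
from typing import Callable, List, Sequence, Tuple

# A stage is (scoring function, rejection threshold); both are placeholders here.
Stage = Tuple[Callable[[object], float], float]

def run_cascade(windows: Sequence[object], stages: List[Stage]) -> List[Tuple[object, float]]:
    """Pass windows through the stages in order, dropping any scored below threshold."""
    survivors = [(w, 0.0) for w in windows]
    for score_fn, threshold in stages:
        next_survivors = []
        for window, _ in survivors:
            score = score_fn(window)
            if score >= threshold:          # keep only windows this stage accepts
                next_survivors.append((window, score))
        survivors = next_survivors
    return survivors

# Toy usage: numbers stand in for image windows, lambdas for detectors.
toy_stages = [(lambda w: w, 0.3), (lambda w: w * w, 0.5)]
toy_result = run_cascade([0.1, 0.5, 0.9], toy_stages)
```

Ordering the stages from cheapest to most expensive is what lets a cascade of this kind trade little accuracy for a large speed-up.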
Region proposal
Fast R-CNN [Girshick 2015]
Faster R-CNN [Ren 2015]
You Only Look Once [Redmon 2016]
Single Shot Multi-box Detector [Liu 2016]
[Redmon 2016] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You only look once: Unified, real-time object detection", CVPR, 2016.
[Girshick 2015] R. Girshick, "Fast R-CNN", ICCV, 2015.
[Ren 2015] S. Ren, K. He, R. Girshick and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", NIPS, 2015.
[Liu 2016] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C. Fu and A. C. Berg, "SSD: Single Shot MultiBox Detector", ECCV, 2016.
Region proposal
Method          Year  Miss rate (%)  Speed (fps)  Network
Fast R-CNN      2015  12.86          3            Fast R-CNN
SA-FAST R-CNN   2015  9.68           2.5          Fast R-CNN
Faster R-CNN    2015  18.02          2            RPN
MS-CNN          2016  10             2.5          RPN
RPN+BF          2016  9.6            2            RPN
SSD             2016  13.06          10           SSD
Fused DNN       2016  8.2            0.5          SSD + FCN
R-CNN
Selective search, CNN, SVM
[Girshick 2014] R. Girshick, J. Donahue, T. Darrell and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation", CVPR, 2014.
Selective search
[Jasper 2013] J. R. R. Uijlings, K. E. A. van de Sande, T. Gevers and A. W. M. Smeulders, "Selective Search for Object Recognition", International Journal of Computer Vision, 2013.
R-CNN (1 / 2)
Selective search, CNN
CNN
- 12000 CNN
- CNN
- Selective search, etc.
R-CNN (2 / 2)
Selective search, CNN
Selective search
- CNN
R-CNN
Selective search, CNN
Faster R-CNN
Fast R-CNN
Fast R-CNN [Girshick 2015] & Faster R-CNN [Ren 2015]
Fast R-CNN
Faster R-CNN: CNN, Region Proposal Network (RPN)
[Girshick 2015] R. Girshick, "Fast R-CNN", ICCV, 2015.
[Ren 2015] S. Ren, K. He, R. Girshick and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", NIPS, 2015.
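To illustrate what a Region Proposal Network scores, here is a sketch of anchor generation: at every feature-map position a fixed set of reference boxes is placed, and the RPN predicts objectness and box offsets for each. The stride of 16 and the 3 scales x 3 aspect ratios are commonly quoted Faster R-CNN defaults and are treated as assumptions here; the function name `make_anchors` is hypothetical.

```python
import numpy as np

def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (feat_h * feat_w * A, 4) anchors as (x1, y1, x2, y2) in image pixels."""
    base = []
    for scale in scales:
        for ratio in ratios:                 # ratio = height / width
            w = scale / np.sqrt(ratio)
            h = scale * np.sqrt(ratio)       # keeps area equal to scale ** 2
            base.append([-w / 2, -h / 2, w / 2, h / 2])
    base = np.array(base)                    # (A, 4), centred at the origin
    cx = (np.arange(feat_w) + 0.5) * stride  # anchor centres in image coordinates
    cy = (np.arange(feat_h) + 0.5) * stride
    cx, cy = np.meshgrid(cx, cy)             # each (feat_h, feat_w)
    centres = np.stack([cx, cy, cx, cy], axis=-1).reshape(-1, 1, 4)
    return (centres + base[None, :, :]).reshape(-1, 4)
```

With the defaults, a 2x3 feature map yields 2 * 3 * 9 = 54 anchors.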
Fast R-CNN: Scale Aware Fast R-CNN
[Li 2016] J. Li, X. Liang, S. Shen, T. Xu and S. Yan, "Scale-aware Fast R-CNN for Pedestrian Detection", ECCV, 2015.
Faster R-CNN (RPN): Multi Scale CNN
[Cai 2016] Z. Cai, Q. Fan, R. S. Feris and N. Vasconcelos, "A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection", ECCV, 2016.
Faster R-CNN (RPN): RPN + Boosted Forest
RPN, Boosted Forest
[Zhang 2016] L. Zhang, L. Lin, X. Liang and K. He, "Is Faster R-CNN Doing Well for Pedestrian Detection?", arXiv:1607.07032, 2016.
Single shot
CNN, Faster R-CNN
You Only Look Once, Single Shot Multi-box Detector
Faster R-CNN vs. YOLO, SSD
[Redmon 2016] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You only look once: Unified, real-time object detection", CVPR, 2016.
[Liu 2016] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C. Fu and A. C. Berg, "SSD: Single Shot MultiBox Detector", ECCV, 2016.
You Only Look Once
( + ) x 2
Faster R-CNN
[Redmon 2016] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You only look once: Unified, real-time object detection", CVPR, 2016.
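A sketch of how a YOLO-style grid output can be unpacked, assuming each of the S x S cells predicts B = 2 boxes (4 coordinates plus 1 confidence each) and C class probabilities shared by the cell, giving a tensor of shape (S, S, B*5 + C). The channel ordering used below (boxes first, then classes) and the function name `split_yolo_output` are assumptions for illustration and may differ from the reference implementation.

```python
import numpy as np

def split_yolo_output(output, num_boxes=2, num_classes=20):
    """Split an (S, S, B*5 + C) prediction into boxes, confidences, class scores."""
    s = output.shape[0]
    assert output.shape == (s, s, num_boxes * 5 + num_classes)
    box_part = output[..., :num_boxes * 5].reshape(s, s, num_boxes, 5)
    boxes = box_part[..., :4]                    # (x, y, w, h) per predicted box
    confidence = box_part[..., 4]                # objectness per predicted box
    class_probs = output[..., num_boxes * 5:]    # one distribution per grid cell
    # Class-specific score = box confidence * conditional class probability
    scores = confidence[..., None] * class_probs[..., None, :]
    return boxes, confidence, scores
```

With S = 7, B = 2 and C = 20 the input has shape (7, 7, 30) and the scores have shape (7, 7, 2, 20).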
Single Shot Multibox Detector
13%
[Liu 2016] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C. Fu and A. C. Berg, "SSD: Single Shot MultiBox Detector", ECCV, 2016.
Fused DNN
SSD, Soft-rejection based Network Fusion
Caltech Pedestrian Dataset
[Du 2016] X. Du, M. El-Khamy, J. Lee and L. S. Davis, "Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection", arXiv:1610.03466, 2016.
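A minimal sketch of the soft-rejection idea: rather than discarding an SSD candidate that a verification network scores poorly, the SSD score is scaled by a factor derived from that network's confidence. The scaling rule max(p / a_c, b_c), the constants a_c = 0.7 and b_c = 0.1, and the function name `soft_rejection_fusion` are assumptions for illustration and are not taken from the slides.

```python
def soft_rejection_fusion(ssd_score, classifier_probs, a_c=0.7, b_c=0.1):
    """Scale an SSD candidate score by one factor per verification network."""
    fused = ssd_score
    for p in classifier_probs:
        # p above a_c boosts the score, p below a_c lowers it,
        # and the floor b_c keeps the candidate from being zeroed out
        fused *= max(p / a_c, b_c)
    return fused
```

A candidate with SSD score 0.8 keeps its score when the verifier outputs 0.7, and is reduced to a tenth of it (not removed) when the verifier outputs 0.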
2016
Faster R-CNN, SSD
Region proposal
Deep Learning, RGB
Toronto City Dataset
RGB, LIDAR, GPS
712 km², 8,439 km, 400,000
Region Proposal Network, SSD
2016: RPN, SSD
2016.12.12
CG
Fused DNN [Du 2016]
[Du 2016] X. Du, M. El-Khamy, J. Lee and L. S. Davis, "Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection", arXiv:1610.03466, 2016.
[Dalal 2005] N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection", CVPR, 2005.
[Andreas 2012] A. Geiger, P. Lenz and R. Urtasun, "Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite", CVPR, 2012.
[Dollár 2009] P. Dollár, C. Wojek, B. Schiele and P. Perona, "Pedestrian Detection: A Benchmark", CVPR, 2009.
[Krizhevsky 2012] A. Krizhevsky, I. Sutskever and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks", NIPS, 2012.
[Benenson 2013] R. Benenson, M. Mathias, T. Tuytelaars and L. Van Gool, "Seeking the strongest rigid detector", CVPR, 2013.
[Dollár 2009] P. Dollár, Z. Tu, P. Perona and S. Belongie, "Integral Channel Features", BMVC, 2009.
[Benenson 2012] R. Benenson, M. Mathias, R. Timofte and L. Van Gool, "Pedestrian detection at 100 frames per second", CVPR, 2012.
[Nam 2014] W. Nam, P. Dollár and J. H. Han, "Local Decorrelation For Improved Pedestrian Detection", NIPS, 2014.
[Zhang 2015] S. Zhang, R. Benenson and B. Schiele, "Filtered Channel Features for Pedestrian Detection", CVPR, 2015.
[Dollár 2014] P. Dollár, R. Appel, S. Belongie and P. Perona, "Fast feature pyramids for object detection", PAMI, 2014.
[Luo 2013] P. Luo, Y. Tian, X. Wang and X. Tang, "Switchable Deep Network for Pedestrian Detection", CVPR, 2014.
[Tian 2015] Y. Tian, P. Luo, X. Wang and X. Tang, "Deep Learning Strong Parts for Pedestrian Detection", ICCV, 2015.
[Ouyang 2013] W. Ouyang and X. Wang, "Joint deep learning for pedestrian detection", ICCV, 2013.
[Yang 2015] B. Yang, J. Yan, Z. Lei and S. Z. Li, "Convolutional Channel Features: Tailoring CNN to Diverse Tasks", ICCV, 2015.
[Cai 2015] Z. Cai, M. Saberian and N. Vasconcelos, "Learning Complexity-Aware Cascades for Deep Pedestrian Detection", ICCV, 2015.
[Angelova 2015] A. Angelova, A. Krizhevsky, V. Vanhoucke, A. Ogale and D. Ferguson, "Real-Time Pedestrian Detection With Deep Network Cascades", BMVC, 2015.
[Girshick 2014] R. Girshick, J. Donahue, T. Darrell and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation", CVPR, 2014.
[Girshick 2015] R. Girshick, "Fast R-CNN", ICCV, 2015.
[Ren 2015] S. Ren, K. He, R. Girshick and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", NIPS, 2015.
[Zhang 2016] L. Zhang, L. Lin, X. Liang and K. He, "Is Faster R-CNN Doing Well for Pedestrian Detection?", arXiv:1607.07032, 2016.
[Redmon 2016] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You only look once: Unified, real-time object detection", CVPR, 2016.
[Liu 2016] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C. Fu and A. C. Berg, "SSD: Single Shot MultiBox Detector", ECCV, 2016.
[Du 2016] X. Du, M. El-Khamy, J. Lee and L. S. Davis, "Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection", arXiv:1610.03466, 2016.