1
Real-time Object Detection Zibo Gong, Tianchang He, Ziyi Yang Department of Electrical Engineering, Stanford University Introduction Object Detection Faster R-CNN Matching Illustration of Fast-RCNN Training Set Example Object detection: one of the classic problems in computer vision. Convolution Neural Networks (CNN): one of the most powerful tools now for Artificial Intelligence and Machine Learning problem. We used Faster Region-based Convolutional Neural Network method (Faster R-CNN) to detect certain objects in photos and matched with one object in our database. The matching of objects adopts features learned by Faster R-CNN or conventional CV features, for instance, histogram of oriented gradients (HOG). As a result, we can realized the detection and matching of objects (e.g. cars) in pictures. Training the Faster R-CNN. Train the region proposal network (RPN) end-to-end by back- propagation and stochastic gradient descent. Train a separate detection network by Fast R-CNN using the proposals generated by the step-1 RPN for 20k mini-batches. Use the detector network to initialized RPN training for 40k mini- batches. Keeping the shared convolutional layers fixed, fine-tune the unique layers of Fast R-CNN for 20k mini-batches. R-CNN output: bounding box with confidence level. Perspective Average Precision Front 81.57% Back 86.40% Left 87.56% Right 79.31% detection with p(object | box) >=0.8 Detection of car in different perspective Features Color Keypoint regions (SIFT) Histogram of oriented Gradients (HoG) Algorithm K-nearest neighbor (K-NN) Distance: Euclidean/Cosine Application Pipeline Color detected Reference [1] Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015. [2] Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014. [3] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012. Closest Match With d=0.315 Input image patch d=0.364 d=0.366 Input video/image Detection using neural network Feature Extraction Output matches from database

Real-time Object Detection - Machine learningcs229.stanford.edu/proj2016/poster/GongYangHe-Real Time... · 2017-09-23 · Real-time Object Detection Zibo Gong, Tianchang He, Ziyi

  • Upload
    others

  • View
    8

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Real-time Object Detection - Machine learningcs229.stanford.edu/proj2016/poster/GongYangHe-Real Time... · 2017-09-23 · Real-time Object Detection Zibo Gong, Tianchang He, Ziyi

Real-time Object DetectionZibo Gong, Tianchang He, Ziyi Yang

Department of Electrical Engineering, Stanford University

Introduction Object Detection

Faster R-CNN

Matching

Illustration of Fast-RCNN Training Set Example

• Object detection: one of the classic problems in computer vision.• Convolution Neural Networks (CNN): one of the most powerful tools

now for Artificial Intelligence and Machine Learning problem. • We used Faster Region-based Convolutional Neural Network method

(Faster R-CNN) to detect certain objects in photos and matched with one object in our database. The matching of objects adopts featureslearned by Faster R-CNN or conventional CV features, for instance, histogram of oriented gradients (HOG). As a result, we can realized the detection and matching of objects (e.g. cars) in pictures.

Training the Faster R-CNN.• Train the region proposal network (RPN) end-to-end by back-

propagation and stochastic gradient descent. • Train a separate detection network by Fast R-CNN using the proposals

generated by the step-1 RPN for 20k mini-batches. • Use the detector network to initialized RPN training for 40k mini-

batches. • Keeping the shared convolutional layers fixed, fine-tune the unique

layers of Fast R-CNN for 20k mini-batches.R-CNN output: bounding box with confidence level.

Perspective AveragePrecision

Front 81.57%

Back 86.40%

Left 87.56%

Right 79.31%

detection with p(object | box) >=0.8

Detection of car in different perspective

Features• Color• Keypoint regions (SIFT)• Histogram of oriented

Gradients (HoG)• …

Algorithm• K-nearest neighbor (K-NN)• Distance: Euclidean/Cosine

Application Pipeline

Color detected

Reference[1] Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015.[2] Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014.[3] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.

Closest MatchWith d=0.315

Input image patch

d=0.364 d=0.366

Input video/image Detection usingneural network

Feature ExtractionOutput matchesfrom database