Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
Modeling Traffic Scenes for Intelligent Vehicles using
CNN-based Detection and Orientation Estimation
Carlos Guindel, David Martín and José María Armingol
Intelligent Systems Laboratory (LSI) · Universidad Carlos III de Madrid
Sevilla · 23 November 2017
Agenda
Introduction
Obstacle detection
Scene modeling
Results
Conclusion
2
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Agenda
Introduction
Obstacle detection
Scene modeling
Results
Conclusion
3
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Introduction4
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
• A basic requirement for driving tasks
Obstacle detection
• An accurate estimation of the class is essential
Classification• Close-to-market
assemblies
• Rich data source
Vision-based approaches
• Feature learning
• The new paradigm in computer vision
Convolutional Neural Networks
Automated vehicles
• Highly dynamic, semi-structured environments
• They have to handle complex situations
IVVI 2.0 project5
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
INTELLIGENT VEHICLE BASED ON VISUAL INFORMATION 2.0
Trinocular stereo cam.
Multi-layer lidar scanner
Side-looking cameras
Computer with GPU
+info:uc3m.es/islab
System overview6
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
• Two main branches intended to run in parallel
• Obstacle detection
• Features are extracted exclusively from the left stereo image
• Scene modeling
• Stereo-based 3D reconstruction & flat-ground assumption
Agenda
Introduction
Obstacle detection
Scene modeling
Results
Conclusion
7
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Faster R-CNN framework
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 6, pp. 1137–1149, 2016.
Convolutional features computed only once per image
A RPN generates proposals wrt. a fixed set of anchors
Conv. features in these regionsare pooled for classification
Parameters are learned through a multi-task loss
8
Viewpoint estimation
• Faster R-CNN framework was modified to introduce viewpoint inference
9
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
C. Guindel, D. Martin, and J. M. Armingol, “Joint object detection and viewpoint estimation using CNN features,” in Proc. of the IEEE International Conference on Vehicular Electronics and Safety (ICVES), 2017, pp. 145–150.
Final estimation: Θ𝑖∗ → መ𝜃
Discrete viewpoint inference
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
𝑁𝑏 angle bins Θ𝑖 …Θ𝑁𝑏
𝑁𝑏 = 8Training: 𝜃𝑖0→ Θ𝑖
Inference output: r ∈ Δ𝑁𝑏−1
𝑟
Elements of 𝑟
10
• Every object is assigned a bin
• Inference gives a categorial distribution
Joint detection and viewpoint estimation
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
11
Feature map
Proposal
Fixed sizefeat. vector
Fully connected (FC) layers
FC layer FC layer FC layer
B. Box regression
Softmax Softmax Softmax
Class Viewpoint
Only 𝑁𝑏 · 𝐾 · 4096new weights
…
𝑁𝑏
· 𝐾
𝑁𝑏 · 𝐾
elementsangle bins
classes
Fast
er R
-CN
N lo
ss
Loss function and training
• Unweighted muli-task loss with five components
Joint Object Detection and Viewpoint Estimation using CNN featuresCarlos Guindel · ICVES 2017
Logistic lossfor RPN objectness
Smooth-L1 lossfor RPN b.box regression
Logistic lossfor class
Smooth-L1 lossfor b.box regression
Logistic lossfor viewpoint
estimation
12
…
𝑁𝑏 angle bins of the ground truth class
Ground-truth angle bin
Normalized on the batch
Agenda
Introduction
Obstacle detection
Scene modeling
Results
Conclusion
13
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Scene modeling14
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Left image
Right image
DisparityP.C.
generatorDisparity
map
3D point cloud
3D RECONSTRUCTION
Scene modeling15
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Left image
Right image
DisparityP.C.
generatorDisparity
map
3D point cloud
3D RECONSTRUCTION
Scene modeling16
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Left image
Right image
DisparityP.C.
generatorDisparity
map
3D point cloud
3D RECONSTRUCTION
SGM stereo matchingSuitable for environments with lack of texture, illumination changes, etc.
Scene modeling17
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Left image
Right image
DisparityP.C.
generatorDisparity
map
3D point cloud
3D RECONSTRUCTION
Pin-hole + disparityWe build a XYZRGB cloud
from the left image and the disparity map
Scene modeling18
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Left image
Right image
DisparityP.C.
generatorDisparity
map
3D point cloud
3D point cloud
Plane
segmentation
Plane model
Calibration from plane
Camera-to-world
calibration
3D RECONSTRUCTION
EXTRINSIC PARAMETERS AUTO-CALIBRATION
Scene modeling19
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
3D point cloud
Plane
segmentation
Plane model
Calibration from plane
Camera-to-world
calibration
EXTRINSIC PARAMETERS AUTO-CALIBRATION
Voxel grid dowsamplingThe cloud from the 3D reconstruction pipeline is downsampled (grid size: 20 cm)
Scene Modeling20
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
3D point cloud
Plane
segmentation
Plane model
Calibration from plane
Camera-to-world
calibration
EXTRINSIC PARAMETERS AUTO-CALIBRATION
Planar segmentationUsing RANSAC with a 10 cm threshold, and a small angular tolerance.
…Pass through filtersVertical axis: 0-2 mDepth axis: 0-20 m
Scene modeling21
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
3D point cloud
Plane
segmentation
Plane model
Calibration from plane
Camera-to-world
calibration
EXTRINSIC PARAMETERS AUTO-CALIBRATION
roll
pitchPlane:
Object localization22
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Object localization23
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Object ROI for localization
11 central rows of the object’s bounding box
Filtered cloudWithout points belonging to the ground or too close to the camera
Organized pointcloud
Coordinates on the world’sXY plane for every point
𝒙 = (𝑥, 𝑦, 𝑧)
𝒙𝑜𝑏𝑗 = 𝑚𝑒𝑑𝑖𝑎𝑛 𝒙
𝜃𝑜𝑏𝑗 = 𝛼 − atan2(𝑦𝑜𝑏𝑗 − 𝑥𝑜𝑏𝑗)
x
𝑦𝑜𝑏𝑗
world
y
𝑥𝑜𝑏𝑗
𝜃
Top-down view
yaw
Agenda
Introduction
Obstacle detection
Scene modeling
Results
Conclusion
24
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Results: Detection and viewpoint estimation
• KITTI Object Detection Benchmark
• 5,576 images for training and 2,065 for validation
• Labels for class and orientation available
• Evaluation metric
• Average Orientation Similarity (AOS)
25
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Results: Detection and viewpoint estimation
• Two different architectures:
• ZF (lightweight) and VGG 16-layer (more complex)
• Three different scales (height in pixels):
• 375, 500, 625
26
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
88,43 66,28 63,41 N.A. N.A. N.A. 2 sec.Top-performing
comparable method in the KITTI ranking
(ms)
Results: Scene modeling27
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Results: Scene modeling28
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Results: Scene modeling29
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Agenda
Introduction
Obstacle detection
Scene modeling
Results
Conclusion
30
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Conclusion
• Towards a full object-based scene understanding
• CNN-based detection and viewpoint inference
• Efficient approach: the same set of features is used for all tasks
• Stereo-vision 3D information is included for situation assessment
• Results validate our approach
31
Modeling Traffic Scenes for Intelligent Vehicles using CNN-based Detection and Orientation Estimation
C. Guindel et al. · ROBOT 2017
Code for CNN detection & viewpoints available athttps://github.com/cguindel/lsi-faster-rcnn
Future work
• New categories of traffic elements
• Extension to the time domain
• Tracking, filtering, etc.
• Including information from other perception modules
• E.g., semantic segmentation
THANKS FOR
YOUR ATTENTION
23 November 2017
ROBOT'2017 - Third Iberian Robotics Conference
Carlos Guindel · cguindel.github.io · [email protected]