
Detection and Segmentation of Road Images with Deep Learning

GTC Europe, October 2017, Talk #23304

Frank Geujen – Senior Product Manager

William Raveane – Computer Vision Engineer

Mapscape, a NavInfo company

CONTENTS

Who is NavInfo

SD & HD Map Making

Road Feature Extraction

Traffic Sign Detection

Looking Ahead

Who is NavInfo

NavInfo is the leading map provider in China, with a focus on location big data platforms, HD / SD maps, telematics, and comprehensive ADAS solutions.

• Established in 2002 in Beijing, China

• More than 4,500 employees globally

NavInfo Introduction

Automated Driving

Connected Car Navigation

Branches

• America: International business expansion; advanced technology research
• Netherlands:
  • Mapscape: Compilation technology (NDS)
  • EU Technology Centre: Computer vision, deep learning

• China:
  • 31 localization bases for data collection and technology services
  • 4 R&D centers (Shanghai, Xi'an, Shenyang, Wuhan)
  • Beijing headquarters

SD & HD Map Making

Challenge of Map Making

Data Source, Ingestion / Extraction

• 100+ collection vehicles
• 600+ dispersed field staff in China
• 360+ cities
• Big data mining
• 99% highway, 80% main road
• 22M+ community contributions
• 3.2+ million signs processed per year
• 20+ million POI updated per year
• 4+ million road distance updated per year

Map Creation

• More than 500 production staff
• 4,000+ pages of specifications

Delivery

• 6.16+ million kilometers
• 24.95+ million core POI
• 260+ attributes in the portfolio
• 60+ cities of ground truth testing in 2017 Q1
• 80% / 70%: 80% of POI and 70% of road links in China updated per year
• 31 local field offices
• Map data: Hong Kong, Laos, Macao, Cambodia

SD Map

POI

A Standard Definition (SD) map is primarily for A-to-B routing and guidance, and is a simplified representation of the road as links and nodes.
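To make the links-and-nodes idea concrete, here is a minimal sketch of an SD-style road graph with A-to-B routing. The node names, link lengths, and the helpers `build_graph` and `shortest_route` are illustrative assumptions and do not reflect NavInfo's actual data model.

```python
import heapq

# Hypothetical toy network: nodes are junctions, links are road segments (length in km).
LINKS = [
    ("A", "B", 2.0),
    ("B", "C", 2.5),
    ("A", "D", 1.0),
    ("D", "C", 1.5),
    ("C", "E", 3.0),
]

def build_graph(links):
    """Turn a list of (from_node, to_node, length) links into an adjacency map."""
    graph = {}
    for start, end, length in links:
        # Every link is treated as two-way here, which is a simplifying assumption.
        graph.setdefault(start, []).append((end, length))
        graph.setdefault(end, []).append((start, length))
    return graph

def shortest_route(graph, origin, destination):
    """Dijkstra's algorithm: return (total_length, [nodes]) or (inf, []) if unreachable."""
    queue = [(0.0, origin, [origin])]
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == destination:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, length in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (dist + length, neighbour, path + [neighbour]))
    return float("inf"), []

GRAPH = build_graph(LINKS)
```

Calling `shortest_route(GRAPH, "A", "E")` on this toy network returns the route through `D` and `C`. An HD map would add lane-level geometry and road features on top of such a topology.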

HD Map

A High Definition (HD) map is used for automated/autonomous driving and includes highly accurate lane and road features.

The Field drive use cases

• On-board SD map: Live feed from mono camera → Real-time sign extraction → (Semi-)automated core map updating
• Off-board SD map: Mono camera imagery stored at 2 FPS → Offline feature extraction → (Semi-)automated core map updating
• Off-board HD map: Panoramic imagery → Road feature extraction → Projection on LIDAR data → Feature extraction from LIDAR → HD map creation / updating

Traffic Sign Detection

Technical Deep Dive:

Real Time Traffic Sign Detection

Use case: In-car data collection on an NVIDIA Jetson TX2

• Over 180 traffic sign classes supported today

• Up to 32 fps at 1920x1080 in 15W MAX-N mode

• Detection based on Single Shot Detector (SSD)

• Training on a Titan X GPU server

• Inference through TensorRT

Supported Features

Supported today:
• Speed Limits
• Warning Signs
• Information Signs
• Prohibition Signs
• Directional Signs

In development:
• Gantry Sign Boards
• Traffic Lights
• Digital Traffic Signs

Sign Detection Demo Watch Online: https://goo.gl/KBgGo8

Performance Highlights

Multiple Simultaneous Detections


Distant Traffic Signs


Bad Lighting Conditions

Optimization of SSD on Jetson TX2

• With TensorRT:
  • 6x speedup in inference performance
  • 3x reduction in memory consumption

• And with our in-house CUDA kernels:
  • Additional 3x speedup in inference performance
  • Allows full utilization of GPU resources

Implementation of SSD

Two-stage system:
• ResNet-based SSD for detection
• ResNet for fine classification

Custom Layer API:
• Bridges both TensorRT stages

Detector: SSD

Classifier: ResNet
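The two-stage idea (a detector proposes sign boxes, then a classifier assigns a fine class to each crop) can be sketched as follows. This is only an illustration of the control flow: `detect_and_classify`, `crop`, the dictionary layout, and the dummy models are assumptions made for this example, standing in for the SSD and ResNet networks.

```python
def crop(image, box):
    """Cut an (x0, y0, x1, y1) box out of an image stored as a list of pixel rows."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]

def detect_and_classify(image, detector, classifier, min_score=0.5):
    """Stage 1 proposes coarse sign boxes; stage 2 assigns a fine class to each crop."""
    detections = [d for d in detector(image) if d["score"] >= min_score]
    # Batch all crops so the classifier runs once per frame, not once per sign.
    crops = [crop(image, d["box"]) for d in detections]
    labels = classifier(crops) if crops else []
    return [
        {"box": d["box"], "score": d["score"], "label": label}
        for d, label in zip(detections, labels)
    ]

# Stand-in models so the sketch runs; they return fixed, made-up outputs.
def dummy_detector(image):
    return [
        {"box": (0, 0, 2, 2), "score": 0.9},
        {"box": (1, 1, 3, 3), "score": 0.2},  # dropped by the score threshold
    ]

def dummy_classifier(crops):
    return ["speed_limit_50" for _ in crops]
```

Keeping the detector coarse and pushing the 180+ class decision into a separate classifier is one plausible reason for such a split, since each network can then be tuned for its own task.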

SSD Custom Layers

• Implementation of SSD layers as custom CUDA kernels:
  • Executed by Custom Layer API
  • Priors replaced by on-demand calculations
  • Softmax calculated only when required
  • Non-maximum suppression replaced by a batched data feeder for the classifier
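Two of these optimizations can be illustrated in plain Python. The sketch below is not the in-house CUDA implementation: it assumes the commonly used SSD prior-box and decoding conventions (cell-centred priors, variances of 0.1 and 0.2) and class 0 as background, and the function names are made up for this example.

```python
import math

def prior_box(row, col, fmap_size, scale, aspect_ratio):
    """Compute one prior (cx, cy, w, h) from its grid index instead of reading a stored table."""
    cx = (col + 0.5) / fmap_size
    cy = (row + 0.5) / fmap_size
    w = scale * math.sqrt(aspect_ratio)
    h = scale / math.sqrt(aspect_ratio)
    return cx, cy, w, h

def decode_box(prior, offsets, variances=(0.1, 0.2)):
    """Apply predicted offsets (dx, dy, dw, dh) to a prior, using the usual SSD encoding."""
    cx, cy, w, h = prior
    dx, dy, dw, dh = offsets
    return (
        cx + dx * variances[0] * w,
        cy + dy * variances[0] * h,
        w * math.exp(dw * variances[1]),
        h * math.exp(dh * variances[1]),
    )

def best_class_if_foreground(logits, background=0):
    """Return (class_id, probability), or None when background has the largest logit."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    if best == background:
        return None  # softmax is never evaluated for this anchor
    # Softmax of the winning class only: 1 / sum(exp(l_i - l_max)).
    denom = sum(math.exp(l - logits[best]) for l in logits)
    return best, 1.0 / denom
```

Because softmax preserves the ordering of the logits, the winning class can be found from the raw scores, and the exponentials are only needed for the small fraction of anchors that are not background.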

SSD on the Jetson TX2

Profile visualization of SSD inference:
• SSD in Caffe: 510 ms
• TensorRT + our CUDA kernels: 31 ms
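As a quick sanity check, the two profile timings can be compared against the speedups and frame rate quoted earlier. The arithmetic below assumes that 510 ms is the Caffe baseline and 31 ms is the TensorRT version with custom CUDA kernels, both per frame.

```python
caffe_ms = 510.0      # assumed: SSD in Caffe, per frame
optimized_ms = 31.0   # assumed: TensorRT + custom CUDA kernels, per frame

speedup = caffe_ms / optimized_ms          # measured end-to-end speedup
frames_per_second = 1000.0 / optimized_ms  # throughput implied by 31 ms per frame
nominal_speedup = 6 * 3                    # 6x from TensorRT times 3x from the custom kernels

print(f"speedup: {speedup:.1f}x (nominal {nominal_speedup}x)")
print(f"throughput: {frames_per_second:.1f} fps")
```

The measured ratio of roughly 16.5x is close to the nominal 18x, and 31 ms per frame corresponds to about 32 fps, consistent with the "up to 32 fps" figure.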

Detection Accuracy

• Single Image:
  • Precision: 92.5%
  • Recall: 98%

• Tracking over Time:
  • Precision: 96.0%
  • Recall: 98.5%
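To compare the two operating points with a single number, precision and recall can be combined into an F1 score (their harmonic mean). The F1 values are not reported in the talk; they are derived here from the figures above.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

single_image_f1 = f1_score(0.925, 0.98)   # single-image detection
tracking_f1 = f1_score(0.96, 0.985)       # detection with tracking over time

print(f"single image F1: {single_image_f1:.3f}")
print(f"tracking F1:     {tracking_f1:.3f}")
```

The derived scores are about 0.952 for single images and 0.972 with tracking, so most of the gain from tracking comes from the higher precision.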

[Figure: Single Image Per-Class Detection Accuracy (accuracy from 0 to 100, by class ID)]

[Figure: Single Image Per-Class Classification Accuracy (accuracy from 0 to 100, by class ID)]

[Figure: Single Image Detection PR Curve]


Technical Deep Dive: Road Feature Extraction


• Road feature and object extraction

• Semantic segmentation network architecture

• Automatic lane grouping

• Training & inference on NVIDIA Titan X GPU server

[Figure panels: Lane numbering, Road features, Gantry sign boards]
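The talk does not describe how automatic lane grouping is implemented. As one hedged illustration, lane-marking points in a top view could be clustered by lateral position and the lanes numbered from left to right. The gap threshold, the units (metres), and both function names are assumptions for this sketch.

```python
def group_markings(x_positions, max_gap=0.5):
    """Cluster lateral positions; a gap larger than max_gap starts a new marking."""
    groups = []
    for x in sorted(x_positions):
        if groups and x - groups[-1][-1] <= max_gap:
            groups[-1].append(x)
        else:
            groups.append([x])
    # Represent each marking by the mean lateral position of its points.
    return [sum(g) / len(g) for g in groups]

def number_lanes(marking_centres):
    """Lane k lies between two adjacent markings, counted from the left, starting at 1."""
    return [
        (index + 1, left, right)
        for index, (left, right) in enumerate(zip(marking_centres, marking_centres[1:]))
    ]

# Made-up lateral positions of segmented lane-marking points on one scan row.
points = [0.0, 0.1, 3.5, 3.6, 7.0]
lanes = number_lanes(group_markings(points))
```

With these made-up points, three markings are found and two lanes are numbered 1 and 2. A real system would also have to handle curved roads, dashed markings, and missing detections.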

Supported today:
• Surface level:
  • Lane markings
  • Text, numbers, speed limits
  • Arrows
• Road objects:
  • Gantry sign boards
  • Guard rails
  • Curbs

In development:
• Poles
• Traffic lights
• Tunnels

Road Features

The On-road feature extraction process

• Crop from panoramic image
• Camera calibration
• Transformation to top view
• Segmentation network
• Transformation to front view
• Semantic segmentation deep neural network
• Lane number grouping
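The transformations between front view and top view in this process can be modelled, for a flat road, as a planar homography obtained from camera calibration. The sketch below shows only the point mapping and its inverse. The matrix `FRONT_TO_TOP` contains made-up values, and the talk does not state which model was actually used.

```python
def apply_homography(H, point):
    """Map an (x, y) pixel through a 3x3 homography using homogeneous coordinates."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w

def invert_3x3(M):
    """Invert a 3x3 matrix via its adjugate; used to go from top view back to front view."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]

# Hypothetical calibration result; real values would come from the camera calibration step.
FRONT_TO_TOP = [
    [1.0, 0.2, 0.0],
    [0.0, 1.5, 0.0],
    [0.0, 0.002, 1.0],
]
TOP_TO_FRONT = invert_3x3(FRONT_TO_TOP)

top_point = apply_homography(FRONT_TO_TOP, (100.0, 200.0))
front_point = apply_homography(TOP_TO_FRONT, top_point)  # round trip back to the input
```

Segmenting in the top view removes perspective foreshortening, so lane markings keep a roughly constant width, and the inverse mapping lets the results be drawn back onto the original front-view image.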

Lane Segmentation Demo Watch Online: https://goo.gl/4CXTD5

Semantic Segmentation Performance

• Inference at 5 images per second using an NVIDIA Titan X GPU

• Common lane marking classes:
  • Recall: 92.8%
  • Precision: 82%

• Common road arrow marking classes:
  • Recall: 85.6%
  • Precision: 72.8%

[Figure: Confusion Matrix]

Performance of the system is expected to improve further as we continue development.
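Per-class precision and recall figures such as those quoted for lane and arrow markings can be read directly off a confusion matrix. The sketch below shows the calculation on a made-up 2x2 matrix. The counts in `EXAMPLE` are invented for illustration and are not the results from the talk.

```python
def per_class_precision_recall(confusion):
    """confusion[t][p] counts samples of true class t that were predicted as class p."""
    n = len(confusion)
    results = []
    for k in range(n):
        true_positive = confusion[k][k]
        predicted_k = sum(confusion[t][k] for t in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        results.append((precision, recall))
    return results

# Invented counts: rows are true classes, columns are predicted classes.
EXAMPLE = [
    [8, 2],
    [1, 9],
]
metrics = per_class_precision_recall(EXAMPLE)
```

For segmentation, the counts would typically be pixels rather than images, so large background regions can dominate the matrix unless they are excluded or reported separately.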

Looking Ahead


• Deep learning continues to be integrated more deeply into our:
  • Field collection
  • Map creation
  • Distribution processes

• Ongoing developments:
  • Real-time semantic segmentation system on board vehicles
  • Crowdsourced data processing supporting self-healing maps
  • Applications for crowdsourcing
