APPLICATIONS OF DEEP LEARNING TO COMPUTER VISION AND COMPUTER … · 2015. 8. 10. · APPLICATIONS...

APPLICATIONS OF DEEP LEARNING TO COMPUTER VISION AND COMPUTER GRAPHICS Mike Houston

Practical DEEP LEARNING Examples

Image Classification, Object Detection, Localization, Action Recognition, Scene Understanding

Speech Recognition, Speech Translation, Natural Language Processing

Pedestrian Detection, Traffic Sign Recognition

Breast Cancer Cell Mitosis Detection, Volumetric Brain Image Segmentation

What is DEEP LEARNING?

Input Result

Tree

Cat

Dog

Deep Learning Framework

“turtle”

Forward Propagation

Compute weight update to nudge from “turtle” towards “dog”

Backward Propagation

Trained Neural Net Model

“cat”

Repeat

Training

Inference
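The training and inference flow on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the method of any framework named in this deck: it assumes a single-layer softmax classifier, and the class names, the toy one-hot "images", the learning rate and the function names (forward, backward, train, infer) are all hypothetical.

```python
import math
import random

LABELS = ["cat", "dog", "turtle"]  # hypothetical classes, echoing the slide


def forward(weights, features):
    """Forward propagation: class scores, then softmax probabilities."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def backward(weights, features, probs, target, lr=0.5):
    """Backward propagation: nudge the weights toward the correct label."""
    for k, row in enumerate(weights):
        # Gradient of the cross-entropy loss with respect to class k's score.
        grad = probs[k] - (1.0 if k == target else 0.0)
        for j, x in enumerate(features):
            row[j] -= lr * grad * x


def train(samples, n_features, epochs=200, seed=0):
    """Repeat forward and backward passes over the labeled samples."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
               for _ in LABELS]
    for _ in range(epochs):  # the "Repeat" arrow on the slide
        for features, target in samples:
            probs = forward(weights, features)
            backward(weights, features, probs, target)
    return weights


def infer(weights, features):
    """Inference: forward pass only, return the most probable label."""
    probs = forward(weights, features)
    return LABELS[probs.index(max(probs))]


# Made-up training set: (feature vector, index into LABELS).
SAMPLES = [([1.0, 0.0, 0.0], 0), ([0.0, 1.0, 0.0], 1), ([0.0, 0.0, 1.0], 2)]
model = train(SAMPLES, n_features=3)
print(infer(model, [0.0, 1.0, 0.0]))
```

Training touches the weights on every sample, while inference only runs the forward pass on the trained model, which is why the two phases have such different compute needs.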

Making a vehicle classifier

PICKUP

SUV

SUV

The “Big Bang” In Deep Learning

Algorithms Data Compute Capability

Medical Research

Detecting Mitosis in Breast Cancer Cells (IDSIA)

Predicting the Toxicity of New Drugs (Johannes Kepler University)

Understanding Gene Mutation to Prevent Disease (University of Toronto)

“Automated Image Captioning with ConvNets and Recurrent Nets”

—Andrej Karpathy, Fei-Fei Li

Captioning

Why Are GPUs Good for Deep Learning?

GPUs deliver:

same or better prediction accuracy

faster results

smaller footprint

lower power

Neural Networks GPUs

Inherently Parallel

Matrix Operations

FLOPS
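The three shared properties above can be made concrete with a small sketch. It assumes a fully connected layer written as a plain matrix product; the helper names (matmul, matmul_flops) and the example numbers are made up for illustration, and the code runs on the CPU purely to show the structure of the work.

```python
def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # out[i][j] depends on no other output element, so on a GPU
            # each (i, j) pair could be assigned to its own thread.
            out[i][j] = sum(a[i][p] * b[p][j] for p in range(k))
    return out


def matmul_flops(n, k, m):
    """Approximate FLOP count: one multiply and one add per inner step."""
    return 2 * n * k * m


# Hypothetical layer: a batch of 2 inputs with 3 features, 2 output units.
batch = [[1.0, 2.0, 3.0],
         [0.0, 1.0, 0.0]]
layer_weights = [[1.0, 0.0],
                 [0.0, 1.0],
                 [1.0, 1.0]]
activations = matmul(batch, layer_weights)
print(activations)            # [[4.0, 5.0], [0.0, 1.0]]
print(matmul_flops(2, 3, 2))  # 24
```

The FLOP count grows with the product of all three dimensions, while every output element can be computed independently, which is the combination the slide attributes to both neural networks and GPUs.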

[Chart, 2010 to 2014. First series: 0, 0, 4, 60, 110. Second series: 28%, 26%, 16%, 12%, 7%.]

bird

frog

person

dog

chair

GPU-Accelerated Deep Learning

START-UPS

GPU-Accelerated Deep Learning Frameworks

               | CAFFE                                       | TORCH                          | THEANO                   | CUDA-CONVNET2             | KALDI
Domain         | Deep Learning Framework                     | Scientific Computing Framework | Math Expression Compiler | Deep Learning Application | Speech Recognition Toolkit
cuDNN          | R2                                          | R2                             | R2                       | --                        | --
Multi-GPU      | In Progress                                 | In Progress                    | In Progress              |                           | (nnet2)
Multi-CPU      |                                             |                                |                          |                           | (nnet2)
License        | BSD-2                                       | GPL                            | BSD                      | Apache 2.0                | Apache 2.0
Interface(s)   | Text-based definition files, Python, MATLAB | Python, Lua, MATLAB            | Python                   | C++                       | C++, Shell scripts
Embedded (TK1) |                                             |                                |                          |                           |

http://developer.nvidia.com/deeplearning

DIGITS

DIGITS: DEEP GPU TRAINING SYSTEM FOR DATA SCIENTISTS

Design DNNs

Visualize activations

Manage multiple trainings

GPU, GPU HW, Cloud, GPU Cluster, Multi-GPU

USER INTERFACE

Visualize Layers

Configure DNN

Process Data

Monitor Progress

Theano, Torch, Caffe

cuDNN, cuBLAS

CUDA

DIGITS

Test Image

Monitor Progress, Configure DNN, Process Data, Visualize Layers

DIGITS DEVBOX

World’s fastest GPU

Max GPU out of a plug

Multi-GPU training & inference

Production Automotive Pipeline

TEGRA X1 CLASSIFICATION Performance

[Bar chart: AlexNet, images/second (axis 0 to 100), Tegra K1 vs. Tegra X1]

Project DAVE: DARPA Autonomous Vehicle

DNN-based self-driving robot

Training data by human driver

No hand-coded CV algorithms

IMAGENET CHALLENGE Accuracy %

[Chart: 2010 to 2014, DNN vs. CV; labeled values 74%, 84%, 72%]

TRAINING DATA 225K Images

DAVE IN ACTION

Data Scientist Vehicle

Active Learning

Drive PX - Deploy

Model

Classification

Detection

Segmentation

DIGITS - Train

Network

Solver

Dashboard
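The "Active Learning" step in this pipeline can be illustrated with a short sketch. The slide does not say which selection strategy is used, so this assumes plain uncertainty sampling: pick the frames whose top predicted class probability is lowest and send those for labeling. The function name least_confident, the frame ids and the probabilities are all made up.

```python
def least_confident(predictions, budget):
    """Return ids of the `budget` samples with the lowest top-class probability.

    predictions maps a sample id to its list of class probabilities.
    """
    ranked = sorted(predictions, key=lambda sid: max(predictions[sid]))
    return ranked[:budget]


# Made-up model outputs for four unlabeled frames (hypothetical ids).
predictions = {
    "frame_001": [0.98, 0.01, 0.01],
    "frame_002": [0.40, 0.35, 0.25],
    "frame_003": [0.70, 0.20, 0.10],
    "frame_004": [0.34, 0.33, 0.33],
}
to_label = least_confident(predictions, budget=2)
print(to_label)  # ['frame_004', 'frame_002']
```

Frames the model is already sure about add little to the next training round, so a labeling budget goes further when spent on the uncertain ones.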

Deep Learning and Vision/Graphics

Street Number Detection

[Goodfellow 2014]

Object Classification

[Krizhevsky 2012]

Image Retrieval

[Krizhevsky 2012]

Pose Estimation

[Toshev, Szegedy 2014]

Object Detection

[Huval et al. 2015]

Face Recognition

[Taigman et al. 2014]

Action Recognition

[Simonyan et al. 2014]

Playing Games

[Mnih et al. 2013]

Semantic Segmentation

[Farabet et al. 2013]

Super Resolution

[Dong et al. 2014]

Ray Tracing – Monte Carlo Denoising

[Kalantari et al. 2015]

“Dreams”

[Mordvintsev et al. 2015]
