GTCx 2016 — Seoul
THE DEEP LEARNING AI REVOLUTION
2
NEW ERA OF COMPUTING
PC INTERNET | MOBILE-CLOUD | AI & INTELLIGENT DEVICES
3
GPU DEEP LEARNING BIG BANG
Deep Learning | NVIDIA GPU
NIPS (2012)
ImageNet Classification with Deep Convolutional Neural Networks
Alex Krizhevsky, University of Toronto
Ilya Sutskever, University of Toronto
Geoffrey E. Hinton, University of Toronto
4
THE STAGE IS SET FOR THE AI REVOLUTION
2012: Deep Learning researchers worldwide discover GPUs
2015: Deep Learning achieves superhuman image recognition on ImageNet
2016: Microsoft’s Deep Learning system achieves new milestone in speech recognition
[Chart: accuracy by year, 2010 to 2015, with marks at 74% and 96%. Labels: Hand-coded CV, Deep Learning, Human. Annotations: Microsoft, Google: 3.5% error rate; Microsoft: 09/13/16.]
“The Microsoft 2016 Conversational Speech Recognition System.” W. Xiong, J. Droppo, X. Huang, F. Seide, M. Seltzer, A. Stolcke, D. Yu, G. Zweig. 2016
5
NVIDIA — “THE AI COMPUTING COMPANY”
GPU Computing | Computer Graphics | Artificial Intelligence
6
SCI-FI NO LONGER
The Near Future with VR, AR and AI
7
GTC — 25X GROWTH IN GPU DL DEVELOPERS
Attendees (4X): 3,700 in 2014, 16,000 in 2016
GPU Developers (3X): 120,000 in 2014, 400,000 in 2016
Deep Learning Developers (25X): 2,200 in 2014, 55,000 in 2016
Regions, 2014: Japan, United States
Regions, 2016: Australia, China, Europe, India, Japan, Korea, United States (Silicon Valley, D.C.)
Industries: Higher Ed 35%, Software 19%, Internet 15%, Auto 10%, Government 5%, Medical 4%, Finance 4%, Manufacturing 4%
8
WHY DID AI RESEARCHERS ADOPT GPUs FOR DEEP LEARNING?
9
BRAIN IS LIKE A GPU
BRAIN CREATES MENTAL IMAGES WHEN WE THINK
10
GPU IS LIKE A BRAIN
11
GPU DEEP LEARNING IS A NEW COMPUTING MODEL
Training
Intelligent Devices
Datacenter
12
GPU DEEP LEARNING IS A NEW COMPUTING MODEL
TRAINING
Billions of Trillions of Operations
GPUs train larger models, accelerate time to market
13
GPU DEEP LEARNING IS A NEW COMPUTING MODEL
DEEP NEURAL NETWORK
Modern neural network with hundreds of hidden layers
Generalize representations by learning hierarchy of features
Billions of operations
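To give a feel for why a deep network costs billions of operations, here is a minimal sketch that counts the arithmetic in one forward pass through a stack of fully connected layers. The layer widths are hypothetical and chosen only for illustration; they are not taken from the slides, and convolutional networks are counted differently.

```python
def dense_forward_ops(layer_widths):
    """Count operations for one forward pass through fully connected layers.

    A layer with n_in inputs and n_out outputs needs n_in * n_out
    multiply-accumulates, counted here as 2 operations each.
    Biases and activation functions are ignored in this rough count.
    """
    total = 0
    for n_in, n_out in zip(layer_widths, layer_widths[1:]):
        total += 2 * n_in * n_out
    return total


# Hypothetical network: 4,096 inputs followed by 100 layers of width 4,096.
widths = [4096] * 101
ops = dense_forward_ops(widths)
print(f"{ops / 1e9:.2f} billion operations per input")
```

Even this modest hypothetical configuration lands in the billions of operations per input, and training repeats that work over many inputs and many passes.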
14
GPU DEEP LEARNING IS A NEW COMPUTING MODEL
DATACENTER INFERENCING
10s of billions of image, voice, video queries per day
GPU inference for fast response, maximize datacenter throughput
15
GPU DEEP LEARNING IS A NEW COMPUTING MODEL
DEVICE INFERENCING
Billions of intelligent devices
GPU for real-time accurate response
16
AI — THE ULTIMATE COMPUTING CHALLENGE
Important Property of Neural Networks
Results get better with more data + bigger models + more computation
(Better algorithms, new insights and improved techniques always help, too!)
IMAGE RECOGNITION
2012 AlexNet: 8 layers, 1.4 GFLOP, ~16% error
2015 ResNet: 152 layers, 22.6 GFLOP, ~3.5% error
16X Model
SPEECH RECOGNITION
2014 Deep Speech 1: 80 GFLOP, 7,000 hrs of data, ~8% error
2015 Deep Speech 2: 465 GFLOP, 12,000 hrs of data, ~5% error
10X Training Ops
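The 16X and 10X multipliers on this slide can be checked from the quoted figures. The sketch below assumes that "training ops" scales as GFLOP times hours of training data; the slide does not say how the 10X was derived, so this is only one reading that happens to reproduce it.

```python
# Figures as quoted on the slide.
alexnet_gflop = 1.4                    # 2012 AlexNet, 8 layers
resnet_gflop = 22.6                    # 2015 ResNet, 152 layers

ds1_gflop, ds1_hours = 80, 7_000       # 2014 Deep Speech 1
ds2_gflop, ds2_hours = 465, 12_000     # 2015 Deep Speech 2

# Model growth: ratio of GFLOP between the two image models.
model_ratio = resnet_gflop / alexnet_gflop

# Assumption (not stated on the slide): training ops ~ GFLOP x hours of data.
training_ratio = (ds2_gflop * ds2_hours) / (ds1_gflop * ds1_hours)

print(f"Model growth: {model_ratio:.1f}x")
print(f"Training ops growth: {training_ratio:.1f}x")
```

Under that assumption both ratios round to the multipliers shown on the slide.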
17
PASCAL “5 MIRACLES” BOOST DEEP LEARNING 65X
Pascal: 5 Miracles | NVIDIA DGX-1 Supercomputer | 65X in 4 yrs | Accelerate Every Framework
5 Miracles: Pascal, 16nm FinFET, CoWoS HBM2, NVLink, cuDNN
Frameworks: PaddlePaddle (Baidu Deep Learning)
[Chart: Relative speed-up of images/sec vs K40 in 2013, for Kepler, Maxwell and Pascal, 2013 to 2016, on an axis running up to 70X.]
Chart notes: AlexNet training throughput based on 20 iterations. CPU: 1x E5-2680v3 12 Core 2.5GHz, 128GB System Memory, Ubuntu 14.04. M40 datapoint: 8x M40 GPUs in a node. P100: 8x P100 NVLink-enabled.
18
NEW IBM SERVER FOR THE AI ENTERPRISE
POWER8 + NVIDIA TESLA P100
“Putting NVIDIA’s technology into the IBM system will
speed up performance for such emerging workloads as AI,
deep learning and data analytics.” — eWeek
19
20
Training
Intelligent Devices
Datacenter
21
TESLA P4 & P40 INFERENCING ACCELERATORS
Pascal Architecture | INT8
P4: 50W | 40X Energy Efficient versus CPU
P40: 250W | 40X Performance versus CPU
22
TensorRT
PERFORMANCE OPTIMIZING INFERENCING ENGINE
FP32, FP16, INT8 | Vertical & Horizontal Fusion | Auto-Tuning
VGG, GoogLeNet, ResNet, AlexNet & Custom Layers
Available Today: developer.nvidia.com/tensorrt
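To illustrate what running inference in INT8 rather than FP32 involves, here is a minimal sketch of symmetric linear quantization in plain Python. It is a generic textbook scheme, not TensorRT's calibration method or API, and the weights and function names are hypothetical.

```python
def quantize_int8(values):
    """Symmetric linear quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map INT8 codes back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.42, -1.27, 0.05, 0.90]   # hypothetical FP32 weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print("INT8 codes:", q)
print("Max round-trip error:", max(abs(a - b) for a, b in zip(weights, restored)))
```

Each value is stored in one byte instead of four, at the cost of a rounding error of at most half the scale per value. Production engines choose scales more carefully than this per-tensor maximum.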
23
24
25
NVIDIA AI COMPUTING ECOSYSTEM
AI-powered Consumer Services | AI-as-a-Service | AI for Enterprise | GPU Server Builders
iQIYI, JD.com, Google, Flickr, Amazon, Facebook, eBay, Baidu, Shazam, Qihoo 360, Skype, Sogou, Periscope, Pinterest, Netflix, Microsoft, Twitter, Tencent, Yandex, Yelp
26
>1,500 AI STARTUPS AROUND THE WORLD
Deep Learning for Cybersecurity
Deep Learning for Genomics
Deep Learning for Self-Driving Cars
Deep Learning for Art
27
Training Datacenter
Intelligent Devices
28
“BILLIONS OF INTELLIGENT DEVICES”
“Billions of intelligent devices will take advantage of DNNs to provide personalization and localization as GPUs become faster and faster over the next several years.”
— Tractica
29
JETSON TX1 EMBEDDED AI SUPERCOMPUTER
10W | 1 TF FP16 | >20 images/sec/W
30
NVIDIA EMBEDDED AI COMPUTING PLATFORM
Best AI Development Environment | New Training Courses for AI Developers | Worldwide Partnerships
31
AI TRANSPORTATION — $10T INDUSTRY
PERCEPTION AI | PERCEPTION AI | LOCALIZATION | DRIVING AI
DEEP LEARNING
32
NVIDIA DRIVE PX 2
AutoCruise to Full Autonomy: One Architecture
Full Autonomy
AutoChauffeur
AutoCruise
AUTONOMOUS DRIVING
Perception, Reasoning, Driving
AI Supercomputing, AI Algorithms, Software
Scalable Architecture
33
NVIDIA DRIVE PX 2 AUTOCRUISE
10W AI Car Computer | Passive Cooling | Automotive IO
AI Highway Driving | Localization & Mapping
34
3D CAR DETECTION | FREE SPACE DETECTION
35
NVIDIA BB8 AI CAR
36
UDACITY SELECTS DRIVE PX 2 FOR SELF-DRIVING CAR
37
ANNOUNCING TOMTOM SELECTS DRIVE PX 2 FOR SELF-DRIVING CAR
38
NVIDIA & TOMTOM SELF-DRIVING CAR
NVIDIA DRIVEWORKS OS
MAPPING AI | PERCEPTION AI | LOCALIZATION CV | DRIVING AI
NVIDIA DRIVE PX 2 AUTOCRUISE
TESLA FOR CLOUD HD MAP PROCESSING
DRIVE PX 2 FOR IN-CAR HD MAP PROCESSING
OPEN “CLOUD-TO-CAR” SDC PLATFORM: HD MAP, AI ALGORITHMS, AI SUPERCOMPUTER
39
ANNOUNCING DRIVEWORKS ALPHA 1
OS FOR SELF-DRIVING CARS
DRIVEWORKS
PilotNet
OpenRoadNet
DriveNet
Localization
Path Planning
Traffic Prediction
Action Engine
Occupancy Grid
40
NVIDIA AI SELF-DRIVING CARS IN DEVELOPMENT
Baidu, nuTonomy, Volvo, WEpods, TomTom
41
VISUAL COMPUTING
AI
HPC
AI COMPUTING FOR INTELLIGENT MACHINES
42
INTRODUCING XAVIER
AI SUPERCOMPUTER SOC
7 Billion Transistors | 16nm FF
8 Core Custom ARM64 CPU
512 Core Volta GPU
New Computer Vision Accelerator
Dual 8K HDR Video Processors
Designed for ASIL C Functional Safety
43
DRIVE PX 2
2 PARKER + 2 PASCAL GPU | 20 TOPS DL | 120 SPECINT | 80W
XAVIER
20 TOPS DL | 160 SPECINT | 20W
INTRODUCING XAVIER
AI SUPERCOMPUTER SOC
ONE ARCHITECTURE
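The comparison above comes down to deep learning throughput per watt. The short sketch below works that out from the figures quoted on the slide; the dictionary layout and variable names are illustrative only.

```python
# Figures as quoted on the slide: (deep learning TOPS, watts).
platforms = {
    "DRIVE PX 2": (20, 80),
    "XAVIER": (20, 20),
}

# Throughput per watt for each platform.
tops_per_watt = {name: tops / watts for name, (tops, watts) in platforms.items()}
for name, eff in tops_per_watt.items():
    print(f"{name}: {eff:.2f} TOPS/W")

# Relative efficiency of XAVIER over DRIVE PX 2.
gain = tops_per_watt["XAVIER"] / tops_per_watt["DRIVE PX 2"]
print(f"Efficiency gain: {gain:.0f}x")
```

On these numbers the same 20 TOPS is delivered at a quarter of the power, which is a 4x gain in TOPS per watt.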
44
AI FOR EVERYONE
AI will Revolutionize Transportation | AI will Revolutionize Healthcare | AI will Revolutionize Society