GTCx 2016 — Seoul THE DEEP LEARNING AI REVOLUTION

Page 1

GTCx 2016 — Seoul

THE DEEP LEARNING AI REVOLUTION

Page 2

NEW ERA OF COMPUTING

PC INTERNET

MOBILE-CLOUD

AI & INTELLIGENT DEVICES

Page 3

GPU DEEP LEARNING BIG BANG

Deep Learning NVIDIA GPU

NIPS (2012)

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, University of Toronto

Ilya Sutskever, University of Toronto

Geoffrey E. Hinton, University of Toronto

Page 4

THE STAGE IS SET FOR THE AI REVOLUTION

2012: Deep Learning researchers worldwide discover GPUs

2015: ImageNet — Deep Learning achieves superhuman image recognition

2016: Microsoft’s Deep Learning system achieves new milestone in speech recognition

Chart: image recognition accuracy by year, 2010 to 2015, with data labels 74% and 96%. Series: Hand-coded CV, Deep Learning, Human. Annotations: Microsoft, Google, 3.5% error rate; Microsoft, 09/13/16.

“The Microsoft 2016 Conversational Speech Recognition System.” W. Xiong, J. Droppo, X. Huang, F. Seide, M. Seltzer, A. Stolcke, D. Yu, G. Zweig. 2016

Page 5

NVIDIA — “THE AI COMPUTING COMPANY”

GPU Computing | Computer Graphics | Artificial Intelligence

Page 6

SCI-FI NO LONGER

The Near Future with VR, AR and AI

Page 7

GTC — 25X GROWTH IN GPU DL DEVELOPERS

Attendees: 3,700 (2014) to 16,000 (2016), 4X
GPU Developers: 120,000 (2014) to 400,000 (2016), 3X
Deep Learning Developers: 2,200 (2014) to 55,000 (2016), 25X

2014:
• Japan
• United States

2016:
• Australia
• China
• Europe
• India
• Japan
• Korea
• United States (Silicon Valley, D.C.)

• Higher Ed 35%
• Software 19%
• Internet 15%
• Auto 10%
• Government 5%
• Medical 4%
• Finance 4%
• Manufacturing 4%
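The growth multiples in the headline can be checked with simple arithmetic. The sketch below assumes the 2014 and 2016 figures pair up as shown in the dictionary; that pairing is inferred from the stated 4X, 3X and 25X multiples, and the names `GTC_FIGURES` and `growth_multiple` are illustrative only.

```python
# (2014 value, 2016 value) per category. The pairing of numbers to
# categories is an assumption inferred from the multiples on the slide.
GTC_FIGURES = {
    "Attendees": (3_700, 16_000),
    "GPU Developers": (120_000, 400_000),
    "Deep Learning Developers": (2_200, 55_000),
}


def growth_multiple(before, after):
    """Return how many times larger `after` is than `before`."""
    return after / before


for name, (y2014, y2016) in GTC_FIGURES.items():
    print(f"{name}: {growth_multiple(y2014, y2016):.1f}X")
```

Rounded to whole numbers, the three ratios match the 4X, 3X and 25X labels.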

Page 8

WHY DID AI RESEARCHERS ADOPT GPUs FOR DEEP LEARNING?

Page 9

BRAIN IS LIKE A GPU

BRAIN CREATES MENTAL IMAGES WHEN WE THINK

Page 10

GPU IS LIKE A BRAIN

Page 11

GPU DEEP LEARNING IS A NEW COMPUTING MODEL

Training

Intelligent Devices

Datacenter

Page 12

GPU DEEP LEARNING IS A NEW COMPUTING MODEL

TRAINING

Billions of Trillions of Operations

GPUs train larger models, accelerate time to market

Training

Intelligent Devices

Datacenter

Page 13

GPU DEEP LEARNING IS A NEW COMPUTING MODEL

DEEP NEURAL NETWORK

Modern neural network with hundreds of hidden layers

Generalize representations by learning hierarchy of features

Billions of operations
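As a rough illustration of how layer count drives the operation count, here is a minimal sketch of a forward pass through a stack of fully connected layers with ReLU activations. The layer sizes, the random initialization, and the function names are assumptions made for the example; real image networks use convolutional layers and are far larger.

```python
import math
import random


def init_layers(sizes, seed=0):
    """Build (weights, biases) for each pair of consecutive layer sizes."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Run x through every layer; return the output and the MAC count."""
    macs = 0
    for weights, biases in layers:
        # Each layer builds new features from the previous layer's features,
        # which is where the "hierarchy of features" comes from.
        x = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
        macs += len(weights) * len(weights[0])  # n_out * n_in multiply-adds
    return x, macs


sizes = [8, 16, 16, 16, 16, 3]  # 8 inputs, four hidden layers, 3 outputs
layers = init_layers(sizes)
output, macs = forward(layers, [1.0] * 8)
print(len(output), macs)
```

For these toy sizes the count is 944 multiply-accumulates. Widening the layers and adding more of them scales that number quickly, which is the point the slide makes about operation counts.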

Training

Intelligent Devices

Datacenter

Page 14

GPU DEEP LEARNING IS A NEW COMPUTING MODEL

DATACENTER INFERENCING

10s of billions of image, voice, video queries per day

GPU inference for fast response, maximize datacenter throughput
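To put the daily query volume in per-second terms, a small calculation helps. The sketch takes 10 billion queries per day as the low end of "10s of billions"; the per-server throughput passed to `servers_needed` is a purely hypothetical figure, not one from the talk.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def queries_per_second(queries_per_day):
    """Average query rate, assuming load is spread evenly over the day."""
    return queries_per_day / SECONDS_PER_DAY


def servers_needed(queries_per_day, per_server_qps):
    """Servers required at a given (hypothetical) per-server throughput."""
    return math.ceil(queries_per_second(queries_per_day) / per_server_qps)


daily = 10_000_000_000  # low end of "10s of billions"
print(f"{queries_per_second(daily):,.0f} queries/sec on average")
print(servers_needed(daily, per_server_qps=1_000), "servers at 1,000 queries/sec each")
```

Real traffic is bursty rather than even, so peak load would be higher than this average.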

Training

Intelligent Devices

Datacenter

Page 15

GPU DEEP LEARNING IS A NEW COMPUTING MODEL

DEVICE INFERENCING

Billions of intelligent devices

GPU for real-time accurate response

Training

Intelligent Devices

Datacenter

Page 16

AI — THE ULTIMATE COMPUTING CHALLENGE

IMAGE RECOGNITION

2012 AlexNet: 8 layers, 1.4 GFLOP, ~16% error
2015 ResNet: 152 layers, 22.6 GFLOP, ~3.5% error
16X Model

SPEECH RECOGNITION

2014 Deep Speech 1: 80 GFLOP, 7,000 hrs of data, ~8% error
2015 Deep Speech 2: 465 GFLOP, 12,000 hrs of data, ~5% error
10X Training Ops

Important Property of Neural Networks

Results get better with more data + bigger models + more computation

(Better algorithms, new insights and improved techniques always help, too!)
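The "16X Model" and "10X Training Ops" labels can be reproduced from the numbers on this slide. The sketch below assumes that "16X" is the ratio of per-image GFLOP between ResNet and AlexNet, and that "10X" is the ratio of GFLOP multiplied by hours of training data between Deep Speech 2 and Deep Speech 1. The slide does not state how the labels were derived, so both readings are assumptions.

```python
# Per-inference compute from the slide, in GFLOP.
ALEXNET_GFLOP = 1.4
RESNET_GFLOP = 22.6

# (GFLOP, hours of training data) from the slide.
DEEP_SPEECH_1 = (80, 7_000)
DEEP_SPEECH_2 = (465, 12_000)

# Assumed reading of "16X Model": growth in per-image compute.
model_ratio = RESNET_GFLOP / ALEXNET_GFLOP

# Assumed reading of "10X Training Ops": compute per pass times data volume.
training_ratio = (DEEP_SPEECH_2[0] * DEEP_SPEECH_2[1]) / (
    DEEP_SPEECH_1[0] * DEEP_SPEECH_1[1]
)

print(f"model ratio: {model_ratio:.1f}X")
print(f"training ratio: {training_ratio:.1f}X")
```

Under these assumptions both ratios round to the values printed on the slide.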

Page 17

PASCAL “5 MIRACLES” BOOST DEEP LEARNING 65X

Pascal — 5 Miracles | NVIDIA DGX-1 Supercomputer | 65X in 4 yrs | Accelerate Every Framework

• Pascal
• 16nm FinFET
• CoWoS HBM2
• NVLink
• cuDNN

PaddlePaddle: Baidu Deep Learning

Chart: Relative speed-up of images/sec vs K40 in 2013, by GPU generation (Kepler, Maxwell, Pascal), 2013 to 2016; vertical axis up to 70X. AlexNet training throughput based on 20 iterations. CPU: 1x E5-2680v3 12 Core 2.5GHz, 128GB System Memory, Ubuntu 14.04. M40 datapoint: 8x M40 GPUs in a node. P100: 8x P100 NVLink-enabled.

Page 18

NEW IBM SERVER FOR THE AI ENTERPRISE

POWER8 + NVIDIA TESLA P100

“Putting NVIDIA’s technology into the IBM system will speed up performance for such emerging workloads as AI, deep learning and data analytics.” — eWeek

Page 19

Page 20

Training

Intelligent Devices

Datacenter

Page 21

TESLA P4 & P40 INFERENCING ACCELERATORS

Pascal Architecture | INT8

P4: 50W | 40X Energy Efficient versus CPU

P40: 250W | 40X Performance versus CPU
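INT8 inference stores weights and activations as 8-bit integers plus a scale factor instead of 32-bit floats. The following is a minimal sketch of symmetric per-tensor quantization, one common scheme; it is an illustration under that assumption and not a description of how the P4, the P40, or any NVIDIA library calibrates INT8. The function names are hypothetical.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integers and the scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q, scale)
print(restored)
```

Each restored value differs from the original by at most half the scale, which is the rounding error this scheme trades for the smaller data type.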

Page 22

TensorRT

PERFORMANCE OPTIMIZING INFERENCING ENGINE

FP32, FP16, INT8 | Vertical & Horizontal Fusion | Auto-Tuning

VGG, GoogLeNet, ResNet, AlexNet & Custom Layers

Available Today: developer.nvidia.com/tensorrt
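The idea behind vertical fusion can be sketched in a few lines: consecutive element-wise steps (here a scale, a bias add and a ReLU) are combined so the data is traversed once instead of three times. This is a conceptual illustration in plain Python with hypothetical function names; it does not use or reproduce the TensorRT API.

```python
def scale_bias_relu_unfused(xs, w, b):
    """Three separate passes, each producing an intermediate list."""
    scaled = [w * x for x in xs]            # pass 1: scale
    biased = [s + b for s in scaled]        # pass 2: add bias
    return [max(0.0, v) for v in biased]    # pass 3: ReLU


def scale_bias_relu_fused(xs, w, b):
    """One pass: the same arithmetic with no intermediate lists."""
    return [max(0.0, w * x + b) for x in xs]


xs = [-2.0, -0.5, 0.0, 1.5]
print(scale_bias_relu_unfused(xs, w=2.0, b=1.0))
print(scale_bias_relu_fused(xs, w=2.0, b=1.0))
```

Both versions return the same values; the fused one simply avoids writing and re-reading intermediate results, which on a GPU corresponds to fewer kernel launches and less memory traffic.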

Page 23

Page 24

Page 25

NVIDIA AI COMPUTING ECOSYSTEM

AI-powered Consumer Services | AI-as-a-Service | AI for Enterprise | GPU Server Builders

iQIYI, JD.com, Google, Flickr, Amazon, Facebook, eBay, Baidu, Shazam, Qihoo 360, Skype, Sogou, Periscope, Pinterest, Netflix, Microsoft, Twitter, Tencent, Yandex, Yelp

Page 26

>1,500 AI STARTUPS AROUND THE WORLD

Deep Learning for Cybersecurity

Deep Learning for Genomics

Deep Learning for Self-Driving Cars

Deep Learning for Art

Page 27

Training Datacenter

Intelligent Devices

Page 28

“BILLIONS OF INTELLIGENT DEVICES”

“Billions of intelligent devices will take advantage of DNNs to provide personalization and localization as GPUs become faster and faster over the next several years.”

— Tractica

Page 29

JETSON TX1 EMBEDDED AI SUPERCOMPUTER

10W | 1 TF FP16 | >20 images/sec/W
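The three figures on this slide combine into two derived numbers. The sketch below treats ">20 images/sec/W" as exactly 20, so the throughput it prints is a lower bound; the constant names are illustrative.

```python
POWER_W = 10.0                 # module power from the slide
PEAK_FP16_TFLOPS = 1.0         # peak FP16 throughput from the slide
IMAGES_PER_SEC_PER_W = 20.0    # slide says ">20", so this is a lower bound

# Whole-module throughput implied by the efficiency figure.
images_per_sec = POWER_W * IMAGES_PER_SEC_PER_W

# Peak arithmetic efficiency in GFLOPS per watt.
gflops_per_watt = PEAK_FP16_TFLOPS * 1000.0 / POWER_W

print(f"at least {images_per_sec:.0f} images/sec")
print(f"{gflops_per_watt:.0f} GFLOPS/W peak FP16")
```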

Page 30

NVIDIA EMBEDDED AI COMPUTING PLATFORM

Best AI Development Environment | New Training Courses for AI Developers | Worldwide Partnerships

Page 31

AI TRANSPORTATION — $10T INDUSTRY

PERCEPTION AI PERCEPTION AI LOCALIZATION DRIVING AI

DEEP LEARNING

Page 32

NVIDIA DRIVE PX 2

AutoCruise to Full Autonomy — One Architecture

Full Autonomy

AutoChauffeur

AutoCruise

AUTONOMOUS DRIVING

Perception, Reasoning, Driving

AI Supercomputing, AI Algorithms, Software

Scalable Architecture

Page 33

NVIDIA DRIVE PX 2 AUTOCRUISE

10W AI Car Computer | Passive Cooling | Automotive IO

AI Highway Driving | Localization & Mapping

Page 34

3D CAR DETECTION | FREE SPACE DETECTION

Page 35

NVIDIA BB8 AI CAR

Page 36

UDACITY SELECTS DRIVE PX 2 FOR SELF-DRIVING CAR

Page 37

ANNOUNCING TOMTOM SELECTS DRIVE PX 2 FOR SELF-DRIVING CAR

Page 38

NVIDIA & TOMTOM SELF-DRIVING CAR

NVIDIA DRIVEWORKS OS

MAPPING

AI

PERCEPTION

AI

LOCALIZATION

CV

DRIVING

AI

NVIDIA DRIVE PX 2 AUTOCRUISE

TESLA FOR CLOUD HD MAP PROCESSING

DRIVE PX 2 FOR IN-CAR HD MAP PROCESSING

OPEN “CLOUD-TO-CAR” SDC PLATFORM — HD MAP, AI ALGORITHMS, AI SUPERCOMPUTER

Page 39

ANNOUNCING DRIVEWORKS ALPHA 1

OS FOR SELF-DRIVING CARS

DRIVEWORKS

PilotNet

OpenRoadNet

DriveNet

Localization

Path Planning

Traffic Prediction

Action Engine

Occupancy Grid

Page 40

NVIDIA AI SELF-DRIVING CARS IN DEVELOPMENT

Baidu | nuTonomy | Volvo | WEpods | TomTom

Page 41

VISUAL COMPUTING

AI

HPC

AI COMPUTING FOR INTELLIGENT MACHINES

Page 42

INTRODUCING XAVIER

AI SUPERCOMPUTER SOC

7 Billion Transistors 16nm FF

8 Core Custom ARM64 CPU

512 Core Volta GPU

New Computer Vision Accelerator

Dual 8K HDR Video Processors

Designed for ASIL C Functional Safety

Page 43

INTRODUCING XAVIER

AI SUPERCOMPUTER SOC

DRIVE PX 2: 2 PARKER + 2 PASCAL GPU | 20 TOPS DL | 120 SPECINT | 80W

XAVIER: 20 TOPS DL | 160 SPECINT | 20W

ONE ARCHITECTURE
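The comparison on this slide comes down to performance per watt. The sketch below uses only the figures shown for DRIVE PX 2 and Xavier; the dictionary layout and function name are illustrative.

```python
# Figures from the slide: deep learning TOPS, SPECint score, power in watts.
DRIVE_PX2 = {"tops": 20, "specint": 120, "watts": 80}
XAVIER = {"tops": 20, "specint": 160, "watts": 20}


def tops_per_watt(chip):
    """Deep learning throughput per watt of power."""
    return chip["tops"] / chip["watts"]


efficiency_gain = tops_per_watt(XAVIER) / tops_per_watt(DRIVE_PX2)
specint_gain = XAVIER["specint"] / DRIVE_PX2["specint"]

print(f"DRIVE PX 2: {tops_per_watt(DRIVE_PX2):.2f} TOPS/W")
print(f"Xavier:     {tops_per_watt(XAVIER):.2f} TOPS/W")
print(f"efficiency gain: {efficiency_gain:.0f}X, SPECint gain: {specint_gain:.2f}X")
```

On these numbers Xavier delivers the same deep learning throughput at a quarter of the power, which works out to a 4X gain in TOPS per watt.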

Page 44

AI FOR EVERYONE

AI will Revolutionize Transportation | AI will Revolutionize Healthcare | AI will Revolutionize Society

Page 45