
AI AND DATA SCIENCE FOR THE ENTERPRISE

Jennifer St. John-Foster – NVIDIA Business Manager, Financial Services



AI & DATA SCIENCE ARE CHANGING THE WORLD

Robotics: Manufacturing, construction, navigation

Healthcare: Cancer detection, drug discovery, genomics

Internet Services: Image classification, speech recognition, NLP

Finance: Trading strategy, fraud detection

Media & Entertainment: Digital content creation, game development

Autonomous Vehicles: Pedestrian & traffic sign detection, lane tracking


DATA SCIENCE – A NEW PILLAR OF DISCOVERY

[Diagram: DATA, FEATURES, PREDICTIVE MODEL, PREDICTION. Stages and fields shown: DATA ANALYTICS, ML, DL, CV, NLU, AI, INFERENCE]



[Diagram: DATA, FEATURES, PREDICTIVE MODEL, PREDICTION, annotated with commonly used tools]

Data: CSV, PARQ, HDFS

Data analytics: ETL, Pandas, Spark, Graph

ML / DL: TensorFlow, PyTorch, MXNet, Scikit-Learn, XGBoost

Inference: TensorFlow Serving, ONNX, SageMaker NEO



[Diagram: DATA, FEATURES, PREDICTIVE MODEL, PREDICTION, annotated with NVIDIA GPU libraries]

Data analytics: cuIO, cuDF, cuGraph

ML / DL: cuDNN, cuML

Inference: TensorRT, TRTIS


SMALL CHANGES, BIG SPEED-UP

[Diagram: Application Code split between GPU and CPU]

GPU: Compute-Intensive Functions (5% of Code)

CPU: Rest of Sequential CPU Code
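
As a rough illustration of the idea on this slide, the sketch below moves a single compute-intensive step (a group-by aggregation) to the GPU and leaves the rest of the program as ordinary sequential code. It assumes the RAPIDS cuDF library is installed on a machine with an NVIDIA GPU and falls back to pandas otherwise. The data, the column names, and the helper total_by_store are hypothetical and are not taken from the presentation.

```python
# Minimal sketch: only the import changes between the CPU and GPU versions.
# The data and column names below are made up for illustration.
try:
    import cudf as xdf  # RAPIDS GPU DataFrame library (assumed installed)
    BACKEND = "cudf"
except Exception:  # cuDF missing, or no usable GPU on this machine
    import pandas as xdf  # CPU fallback so the sketch still runs
    BACKEND = "pandas"


def total_by_store(frame):
    """Compute-intensive step: sum sales per store."""
    return frame.groupby("store")["sales"].sum().sort_index()


# The rest of the application stays ordinary sequential Python.
df = xdf.DataFrame(
    {"store": [1, 1, 2, 2, 2], "sales": [10.0, 5.0, 3.0, 4.0, 1.0]}
)
totals = total_by_store(df)

# cuDF results are copied back to the host before conversion to a dict.
if BACKEND == "cudf":
    totals = totals.to_pandas()
totals_dict = totals.to_dict()
print(BACKEND, totals_dict)
```

The aggregation code is identical for both backends; whether it gives a worthwhile speed-up depends on data size and on the cost of moving data to and from the GPU.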


NVIDIA CUDA-X AI ECOSYSTEM

[Diagram: layered software stack]

ML Services: Amazon SageMaker, Azure Machine Learning, Google Cloud ML

Frameworks

Deployment: Serving, Amazon SageMaker Neo

CUDA-X AI: DA, Graph, ML, DL Train, DL Inference

CUDA


CONTAINERS: SIMPLIFYING WORKFLOWS

Simplifies Deployments

- Eliminates complex, time-consuming builds and installs

Get started in minutes

- Simply Pull & Run the app

- https://ngc.nvidia.com

Portable

- Deploy across various environments, from test to production with minimal changes


DESIGNING INFRASTRUCTURE THAT SCALES

Insights gained from deep learning data centers

Rack Design

• DL drives close to operational limits

• Similarities to HPC best practices

Networking

• IB or Ethernet based fabric

• 100Gbps interconnect

• High-bandwidth, ultra-low latency

Storage

• Datasets range from tens of thousands to millions of objects

• Terabyte levels of storage and up

• High IOPS, low latency

Facilities

• Assume higher watts per rack

• Higher FLOPS/watt = less DC floorspace required

Software

• Scale requires “cluster-aware” software

Example:

• Autonomous vehicle = 1TB / hr

• Training sets up to 500 PB

• RN50: 113 days to train

• Objective: 7 days

• 6 simultaneous developers = 97 node cluster
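
The 97-node figure in the example can be reproduced with simple arithmetic. This is a sketch under two assumptions the slide does not state: that the 113-day figure refers to a single node, and that training time scales linearly with node count. The variable names are illustrative.

```python
import math

# Figures from the slide.
single_node_days = 113  # RN50 training time (assumed: on one node)
target_days = 7         # desired training time per job
developers = 6          # simultaneous training jobs

# Under linear scaling, each job needs about 113 / 7 = 16.1 nodes.
nodes_per_job = single_node_days / target_days

# Total for all developers, rounded up to whole nodes.
cluster_nodes = math.ceil(nodes_per_job * developers)
print(cluster_nodes)  # 97
```

Rounding each job up to 17 whole nodes first would give 102 nodes, so the slide's 97 corresponds to rounding only the cluster total.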


DATACENTER BECOMES A COMPUTE ENGINE


PREDICT CUSTOMER INTENT TO PURCHASE

There’s an increasing need to accurately predict customers’ intent to buy from extensive product portfolios, which helps companies’ bottom lines.

Cisco uses Driverless AI powered by NVIDIA GPUs to provide pre-built, ready-to-use algorithms and models, reducing the processing time from 1 month to 2 days with much larger data sets.

This resulted in a more comprehensive view of customer behavior.


IMPROVING DEMAND FORECASTS

With >100,000 different products in its 4,700 U.S. stores, the Walmart Labs data science team predicts demand for 500 million item-by-store combinations every week.

By performing forecasting with the open-source RAPIDS data processing and machine learning libraries built on CUDA-X AI on NVIDIA GPUs, Walmart speeds up feature engineering 100x and trains machine learning algorithms 20x faster, resulting in faster delivery of products, real-time reaction to shopper trends, and inventory cost savings at scale.


AI HELPS DOCTORS DIAGNOSE BREAST CANCER

Every day, pathologists are tasked with providing cancer diagnoses to guide patient treatment. However, sifting through millions of normal cells to identify a few malignant cells is extremely laborious using conventional methods. PathAI combines GPU deep learning with traditional pathology to improve accuracy, speed diagnosis, and reduce error rates by 85%.


REAL-TIME FRAUD DETECTION

Recently, PayPal was looking to deploy a new fraud detection system. The team working on it set a high bar: this system had to operate worldwide 24/7, and work in real-time to protect customer transactions from potential fraud. In spec’ing the system, it became evident that CPU-only servers couldn’t meet these requirements.

Using NVIDIA T4 GPUs, PayPal delivered a new level of service, using GPU inference to improve real-time fraud detection by 10% while lowering server capacity by nearly 8x.

