Adaptable Computing: The Future of FPGA Acceleration
Dan Gibbons, VP Software Development
June 6, 2018
© Copyright 2018 Xilinx
Adaptable Accelerated Computing
Three Big Trends
Trend to Heterogeneous Architectures with Acceleration of New Workloads
Mainframe Era
PC Era
Mobile Era
Pervasive Intelligence Era
The Evolution of Computing
Everything Intelligent & Connected
Dynamic Needs & Rapid Innovation
Deployed at Global Scale
The intelligent connected world needs adaptable accelerated computing.
The Need for Adaptable Intelligence
Whole genome diagnosis to treat critically ill newborns
Analysis reduced from 1 day to 20 minutes
Patient-specific genomics dynamically optimized
Medical data and research need to be securely accessible across the globe
Why it Matters – Personalized Medicine Example
The FPGA Advantage
The FPGA Advantage for Machine Learning Inference
Figure: FPGA vs. GPU processing of Layer 1, Layer 2, Layer 3
Adaptive Architecture
> Custom dataflow, precision, optimizations
Custom Memory Hierarchy
> Keeps data inside vs. external memory bottleneck
Workload + ML Inference
> Unleashes the power of on-chip system dataflow
Powerful FPGA Optimizations: Precision
Similar accuracy
10+ years of research
Active research area
(binary, variable, bit serial…)
Chart: Impact of Precision on Performance (int1, int4, int8, TPU, GPU, CPU)
…
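As a rough illustration of the reduced-precision idea on this slide, the sketch below quantizes a handful of floating-point weights to int8 codes and measures the rounding error. It is a minimal example of symmetric per-tensor quantization, a common scheme chosen here as an assumption; the function names and weight values are hypothetical and are not taken from any Xilinx tool.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integer codes in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One shared scale for the whole tensor (assumed scheme, for illustration only).
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights, used only to exercise the functions.
weights = [0.42, -1.27, 0.003, 0.9, -0.55]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The rounding error per weight is bounded by half the scale, which is one intuition for why narrow integer types can keep accuracy close to the floating-point baseline when the value range is well behaved.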
Powerful FPGA Optimizations: Compression
30x to 50x compression rate without impacting accuracy (AlexNet)
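One ingredient commonly used in network compression is magnitude pruning followed by sparse storage. The sketch below shows that single step under stated assumptions: the helper names, the weight values and the 30% keep fraction are hypothetical, and the example does not reproduce the compression rate quoted on the slide.

```python
def prune_by_magnitude(weights, keep_fraction):
    """Zero every weight whose magnitude falls below the keep threshold."""
    keep = max(1, int(len(weights) * keep_fraction))
    # Magnitude of the smallest weight that is still kept.
    threshold = sorted((abs(w) for w in weights), reverse=True)[keep - 1]
    # Ties at the threshold are all kept, so slightly more than `keep` may survive.
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def to_sparse(weights):
    """Store only the non-zero weights as (index, value) pairs."""
    return [(i, w) for i, w in enumerate(weights) if w != 0.0]


# Hypothetical dense weight vector, for illustration only.
dense = [0.01, -0.8, 0.02, 0.0, 0.5, -0.03, 0.04, 0.9, -0.01, 0.02]
pruned = prune_by_magnitude(dense, keep_fraction=0.3)
sparse = to_sparse(pruned)
```

Here ten dense values shrink to three stored pairs. A real pipeline would normally retrain after pruning to recover accuracy, which this sketch does not attempt.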
FPGA Advantage: Deterministic Latency
Figure: GPU DNN takes Inputs 1-4 as a batch and returns Results 1-4 with Latency 1-4; FPGA DNN takes Inputs 1-4 individually and returns Results 1-4 with Latency 1-4
“Batch” Inference
> Parallel batch of data to feed SIMD
> High batch => higher latency, higher throughput
> Lower compute efficiency at low batch
“Batch-less” Inference
> Low and deterministic latency
> High throughput regardless of batch size
> Consistent compute efficiency
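The contrast between the two inference styles above can be sketched with a toy timing model. It assumes a batched device that waits for a full batch before starting compute and is idle at each launch, and a batch-less device that starts each input as soon as it arrives and is free. All timings are hypothetical numbers in milliseconds, not measurements.

```python
def batched_latencies(arrival_times, batch_size, batch_compute_time):
    """Per-request latency when compute starts only once a full batch has arrived."""
    latencies = []
    for start in range(0, len(arrival_times), batch_size):
        batch = arrival_times[start:start + batch_size]
        launch = batch[-1]  # the batch launches when its last request arrives
        finish = launch + batch_compute_time
        latencies.extend(finish - t for t in batch)
    return latencies


def batchless_latencies(arrival_times, per_input_time):
    """Per-request latency when each input is processed as soon as the device is free."""
    latencies = []
    free_at = 0
    for t in arrival_times:
        start = max(t, free_at)
        free_at = start + per_input_time
        latencies.append(free_at - t)
    return latencies


# Hypothetical workload: one request per millisecond.
arrivals = [0, 1, 2, 3]
batched = batched_latencies(arrivals, batch_size=4, batch_compute_time=4)
streamed = batchless_latencies(arrivals, per_input_time=1)
```

Under these assumptions the batched latencies are 7, 6, 5 and 4 ms, because early requests wait for the batch to fill, while the batch-less latencies stay at 1 ms for every request.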
ML Inference Integrated with Other Workloads
Figure: FPGA attached over PCIe, containing Multi-format Video Decoder, Scaler, Color Space Converter, Convolutional Net, RNN / LSTM
Example output: "Large Rabbit choking squirrel in forest"
Live video summary using CNN & RNN
Adaptable Compute Use Cases Across the Datacenter
Compute: ML Inference, Database / Big Data Analytics, Video Transcoding, Financial Services Analytics, Genomics
Storage: Compression, Encryption, Key-Value Store, ML Inference, Database / Big Data Analytics
Networking: IPSec/SSL, OVS Offload, Bare Metal Services, Security, Monitoring
Zynq SoCs: Adaptable Computing on the Edge
4 CNN Models
3 Live Inputs + File IO
Under 10 Watts!
HDMI
USB 3
SD Card
Face Detect
Ped SSD
Traffic SSD
Joint Detect
MIPI
ZCU102 Development Platform
Xilinx Enables Adaptable Accelerated Computing
Xilinx 'FPGA as a Service' goes wide
Launched Nov 2016
Launched Oct 2017
Launched Sep 2017
Launched Jul 2017
Launched Aug 2017
Launched Nov 2016
Towards Software as a Service (SaaS)
Enterprise SaaS
Accelerated SW
FaaS
SDAccel
SW API
Optimal acceleration results require platform performance, compiler efficiency, and programming proficiency
Breakout in Programming for Acceleration
High-Performance Platform
Advanced Compiler
Productive IDE & Optimized Libraries
User Onboarding
Rich Stack Integrated with Frameworks
Open Frameworks
Accelerated Libraries
Development Environment
Machine Learning, Video Transcoding, Data Analytics
Platforms
Database
Analytics
On Premise Boards
Transformation Through Innovation
Timeline (1980 to 2020): World's First FPGA, First Virtex FPGA, Virtex-2 Pro, First 3D FPGA & HW/SW Programmable SoC, First MPSoC & RFSoC, ACAP
The Era of Heterogeneous Computing Architectures is Here
FPGAs are uniquely suited for adaptable accelerated computing
Xilinx is leading the way with platforms, tools, applications and FaaS
Now is the opportunity for application development and deployment