
Video Acceleration in the Cloud Using FPGAs

Presented by:
Johan Janssen, Chief Video Architect
Sean Gardner, Marketing Mgr., Cloud Video

© Copyright 2018 Xilinx


The market is recognizing us and our technology…

Dan Rayburn is a key industry analyst and veteran

https://www.streamingmediablog.com/2018/07/fixing-the-ott-problem.html


Xilinx enabling all live applications


Live streaming seeing explosive growth

[Figure: Live Streaming Market Size ($B), annual revenue in USD, 2016 vs. 2021. Source: Markets & Markets report.]


Live video demands a new approach

• Video will be 82% of all internet traffic by 2021 - Cisco VNI report

• Video will be 73% of all wireless traffic by 2023 - Ericsson Mobility report

[Figure: Increase in pixels across SD (480), HD (720), FHD (1080), UHD (4k), and 8k.]


New codecs help on one hand…

[Figure: 1080p60 bitrate (Mbps) and codec complexity for MPEG2, H.264, HEVC / VP9, and AV1. Annotation: 8x less bandwidth.]


CPUs are too slow for live video…

[Figure labels: "Compression efficiency customers would like" versus "What they live with for real-time applications".]


Intel performance scaling not keeping up

Slide taken from Intel XDF presentation

Description                   Growth in Compute
Live video growth             2x
Resolution growth             96x
Growth in codec complexity    125x
Total growth in compute       223x
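The total in the table can be checked with a few lines of arithmetic. The sketch below uses only the three factors listed on the slide. It shows that 223x matches their sum and, for comparison only, what the figure would be under the assumption (not stated on the slide) that the three factors compound multiplicatively.

```python
from math import prod

# Growth factors exactly as listed in the table above.
growth_factors = {
    "Live video growth": 2,
    "Resolution growth": 96,
    "Growth in codec complexity": 125,
}

# The slide's "Total growth in compute" equals the sum of the factors.
additive_total = sum(growth_factors.values())

# Assumption for comparison: independent factors that multiply.
multiplicative_total = prod(growth_factors.values())

print(f"Sum of factors:     {additive_total}x")
print(f"Product of factors: {multiplicative_total}x")
```

Either way the conclusion of the slide is unchanged: the compute demand of live video grows far faster than a factor of two or three.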


The “Pareto Principle” of live video distribution

20% of video streams generate 80% of streaming traffic

Less popular streams (80% of streams but 20% of bandwidth)

[Figure labels: Most Popular/Viewed Streams; OPEX HEAVY; CAPEX HEAVY; Lowest Bandwidth; Highest Density.]


Together Xilinx & NGCodec can save on streaming costs

[Figure: PSNR (dB) versus bitrate (Mbps) for x264 medium and N265. Annotation: 45%.]


The difference compression performance makes


Hardened solutions come at a cost

Encoder saves significant OPEX costs for bandwidth on egress traffic

[Figure: PSNR versus bitrate (kbps) for NVenc HEVC and NGcodec HEVC. Label: "Competitor". Annotation: 35% lower bitrate for same quality. Measured June 2018 SDK.]
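To make the egress claim concrete, the following sketch converts a bitrate into data volume per stream-hour and applies the 35% reduction quoted on the slide. The 4000 kbps reference bitrate is an assumption chosen for illustration; no pricing is included because the slide gives none.

```python
def egress_gb_per_hour(bitrate_kbps: float) -> float:
    """Data volume in gigabytes (10^9 bytes) sent in one hour at a constant bitrate."""
    bits_per_hour = bitrate_kbps * 1000 * 3600
    return bits_per_hour / 8 / 1e9

# Assumption: a 4000 kbps reference stream (illustrative, not from the chart).
reference_kbps = 4000.0
# From the slide: 35% lower bitrate for the same quality.
bitrate_reduction = 0.35

reduced_kbps = reference_kbps * (1 - bitrate_reduction)
reference_gb = egress_gb_per_hour(reference_kbps)
reduced_gb = egress_gb_per_hour(reduced_kbps)

print(f"Reference: {reference_kbps:.0f} kbps, {reference_gb:.2f} GB per stream-hour")
print(f"Reduced:   {reduced_kbps:.0f} kbps, {reduced_gb:.2f} GB per stream-hour")
```

Because egress is billed per byte delivered, the saving scales with the number of viewers, which is why it matters most for the popular streams.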


Overview of FPGA Video Acceleration

˃ Johan Janssen

˃ Chief Video Architect


What to Accelerate?


FPGA HEVC encode vs. x265 encoding configurations

http://x265.ru/wp-content/uploads/2014/12/diff-presets.png

[Figure: x265 encoding configurations; axis: Frames per second. Annotation: Commonly used x265 preset for encoder benchmarking.]

HEVC encode comparison           1080p fps @ x265-slow preset   Device Power   performance/W   performance/W Improvement   Available
Dual Socket E5-2680 v3 2.5GHz    10 fps                         240 W          0.042 fps/W     -                           -
Single socket VU9p               120 fps                        <40 W          3.0 fps/W       72x                         Dec 2018
Dual socket VU9p                 240 fps                        <80 W          3.0 fps/W       72x                         Dec 2019
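The performance/W and improvement columns follow directly from the fps and power columns. The sketch below recomputes them from the table values. Since the VU9p power figures are given as upper bounds ("<40 W", "<80 W"), the bound itself is used here, which makes the computed fps/W a lower bound.

```python
# (device, 1080p fps at the x265-slow preset, power in watts), from the table.
rows = [
    ("Dual Socket E5-2680 v3 2.5GHz", 10, 240),
    ("Single socket VU9p", 120, 40),   # "<40 W": upper bound used
    ("Dual socket VU9p", 240, 80),     # "<80 W": upper bound used
]

def fps_per_watt(fps: float, watts: float) -> float:
    """Encoding throughput delivered per watt of device power."""
    return fps / watts

results = {name: fps_per_watt(fps, watts) for name, fps, watts in rows}
baseline = results["Dual Socket E5-2680 v3 2.5GHz"]
improvement = {name: value / baseline for name, value in results.items()}

for name, value in results.items():
    print(f"{name}: {value:.3f} fps/W, {improvement[name]:.0f}x vs. CPU baseline")
```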


About FFmpeg

˃ Source: https://www.ffmpeg.org/about.html

˃ FFmpeg is the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play pretty much anything that humans and machines have created

It supports the most obscure ancient formats up to the cutting edge

Highly portable: FFmpeg compiles and runs across Linux, Mac OS X, Microsoft Windows, under a wide variety of build environments, machine architectures, and configurations


Integration into FFmpeg framework: building the ecosystem

[Diagram labels]

Customer Application
FFmpeg (video codecs, scalers, compositing etc.)
XMA (XRT, Partial Reconfiguration, Video IP)
x86 Server
SDAccel Target Board
FPGA

FFmpeg plugins shown:
FPGA HEVC encode plugin
FPGA h.264 encode plugin
FPGA VP9 encode plugin
Xilinx h.264 decode plugin
Xilinx ABR Scaler plugin
Xilinx Yolo plugin


Video Transcoding ABR Example (Single VU9p)

[Block diagram. Labels: FFmpeg; Xilinx Alveo PCIe Card; RTSP/RTMP Live Stream; RTSP Client; Demux Aud/Vid; Decode; ABR Scaler; 2 x HEVC or VP9 1080p60 Encoder; Encoder (low frame rate); Optional; Mux Aud/Vid; RTSP/RTMP to Media Server; RTSP/RTMP Stream to Inference.]


Video Transcoding ABR Example (Single VU9p)

[Block diagram. Labels: FFmpeg; Xilinx Alveo PCIe Card; RTSP/RTMP Live Stream; RTSP Client; Demux Aud/Vid; Decode; ABR Scaler; HEVC or VP9 Encoder (shown five times); 10 x H.264 1080p60 Encoder; Encoder (low frame rate); Optional; Mux Aud/Vid; RTSP/RTMP to Media Server; RTSP/RTMP Stream to Inference.]


“Edge-to-Cloud Video Analytics”

[Block diagram. Labels: cloud; Xilinx Alveo PCIe card; Video Inference Streaming Server; RTSP/RTMP Live Stream; RTSP/RTMP Client; Demux Aud/Vid; h.264; Decode; 1080p30; Inference xDNN; Post-Process; Overlay; Metadata; Encode (Optional; h.264 or HEVC or VP9); RTSP/RTMP to Media Server; Database (mysql/postgresql); HTTP Server; File.]


Integrating accelerators into FFmpeg framework

$ ffmpeg \
    -f rawvideo -pix_fmt yuv420p -s:v 1920x1080 -r 30 -an -i /home/ffmpeg/VU9P/TestSequences/Kimono1_1920x1080_24.yuv \
    -frames 240 -b:v 4000k -g 30 -c:v xlnx_HEVC_enc -f h265 -y ./hw_outdir/out1_br4000k.h264

$ ffmpeg \
    -f rawvideo -pix_fmt yuv420p -s:v 1920x1080 -r 30 -an -i /home/ffmpeg/VU9P/TestSequences/Kimono1_1920x1080_24.yuv \
    -frames 240 -c:v libx264 -preset medium -profile:v high -crf 23 -bf 4 -refs 3 -g 30 -b:v 4000k -maxrate 4000k -bufsize 8000k -f h264 -r 30 -y ./sw_outdir/x264_medium_out0_br4000k.h264

https://trac.ffmpeg.org/wiki/EncodingForStreamingSites

$ ffmpeg \
    -f rawvideo -pix_fmt yuv420p -s:v 1920x1080 -r 30 -an -i /home/ffmpeg/VU9P/TestSequences/Kimono1_1920x1080_24.yuv \
    -frames 240 -b:v 4000k -g 30 -c:v xlnx_h264_enc-hq -f h264 -y ./hw_outdir/out0_br4000k.h264

Change 20 characters to get acceleration
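As a rough illustration of how small the switch is from an application's point of view, the sketch below builds the H.264 command line from a single helper in which only the encoder name and output path vary. The helper name `build_h264_encode_cmd` is hypothetical. The encoder name `xlnx_h264_enc-hq` and the paths are copied from the slide and assume an FFmpeg build that includes the Xilinx plugins. The libx264 tuning flags from the slide (`-preset`, `-crf`, and so on) are left out to keep the comparison minimal. The script only prints the commands; it does not run FFmpeg.

```python
import shlex

def build_h264_encode_cmd(encoder: str, output_path: str) -> list[str]:
    """Return an ffmpeg argument list; only the encoder and output path vary."""
    return [
        "ffmpeg",
        "-f", "rawvideo", "-pix_fmt", "yuv420p", "-s:v", "1920x1080",
        "-r", "30", "-an",
        "-i", "/home/ffmpeg/VU9P/TestSequences/Kimono1_1920x1080_24.yuv",
        "-frames", "240", "-b:v", "4000k", "-g", "30",
        "-c:v", encoder,
        "-f", "h264", "-y", output_path,
    ]

# Software encode versus FPGA encode (encoder name as shown on the slide).
software = build_h264_encode_cmd(
    "libx264", "./sw_outdir/x264_medium_out0_br4000k.h264")
hardware = build_h264_encode_cmd(
    "xlnx_h264_enc-hq", "./hw_outdir/out0_br4000k.h264")

# List the arguments that differ between the two command lines.
changed = [(s, h) for s, h in zip(software, hardware) if s != h]
for sw_arg, hw_arg in changed:
    print(f"{sw_arg} -> {hw_arg}")

print(shlex.join(hardware))
```

Keeping the encoder as a parameter lets the same pipeline fall back to software encoding on hosts without an accelerator card.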


XMA Architecture

˃ Key features

Video domain specific interfaces with seamless integration with FFmpeg

Low-level plugin can be reused with any media framework

Supports multiple processes sharing different kernels on the same device

Supports multiple channels on a single kernel

Ensures a kernel resource is reserved for the lifetime of a video session

[Architecture diagram. Labels: Host; Accelerator; Custom Applications; Media Framework (FFmpeg, GStreamer, Other); Xilinx Media Accelerator (XMA); Xilinx Runtime (XRT); OpenCL; Decoder; ABR Scaler; Encoder Session-1; Encoder Session-2; Encoder Session-N.]
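Two of the key features listed for XMA, multiple channels on a single kernel and a kernel resource reserved for the lifetime of a video session, can be pictured with a toy model. The sketch below is not the XMA API: the `KernelPool` class, its methods, and the channel counts are all hypothetical and exist only to illustrate the reservation idea.

```python
from contextlib import contextmanager

class KernelPool:
    """Toy model (not XMA): each kernel offers a fixed number of channels."""

    def __init__(self, channels_per_kernel: dict[str, int]) -> None:
        self._free = dict(channels_per_kernel)

    def free_channels(self, kernel: str) -> int:
        return self._free[kernel]

    @contextmanager
    def session(self, kernel: str):
        """Reserve one channel for as long as the session is open."""
        if self._free[kernel] == 0:
            raise RuntimeError(f"no free channel on {kernel}")
        self._free[kernel] -= 1
        try:
            yield kernel
        finally:
            # The channel is released only when the session ends.
            self._free[kernel] += 1

# Hypothetical device with one encoder kernel that serves two channels.
pool = KernelPool({"encoder": 2})

with pool.session("encoder"), pool.session("encoder"):
    channels_during = pool.free_channels("encoder")

channels_after = pool.free_channels("encoder")
print(channels_during, channels_after)
```

A third session opened while both channels are held would raise an error in this model, which mirrors the guarantee that a running session never loses its kernel to a newcomer.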


Video IP Offering (each IP has FFmpeg plugin)

All throughput numbers are based on VU9P


Xilinx Alveo ABR video transcoding solution

: https://www.xilinx.com/products/boards-and-kits/alveo/applications/.html


Xilinx has a solution to address all workload needs


Video Optimized platforms