Real Time SHVC Decoder: Implementation and Complexity Analysis Wassim Hamidouche Research Engineer E-mail: [email protected] Phone : +33 6 19 14 16 38 October 2014

Page 1: Real time SHVC decoder

Real Time SHVC Decoder: Implementation and Complexity Analysis

Wassim Hamidouche

Research Engineer

E-mail: [email protected]

Phone : +33 6 19 14 16 38

October 2014

Page 2: Real time SHVC decoder

Outline

Introduction

Real time SHVC decoder

Results and analysis

Conclusion

Page 4: Real time SHVC decoder

Introduction (1/3)

Scalable High Efficiency Video Coding (SHVC) standard:

Enables higher temporal, spatial, quality and bit-depth resolutions, and a wider color gamut.

Uses all the powerful tools of the HEVC standard: quadtree-based block partitioning, large transform and prediction blocks, accurate intra/inter prediction and the in-loop filters.

Page 5: Real time SHVC decoder

Introduction (2/3)

Block diagram of the SHVC encoder with two spatial scalability layers:

[Figure: block diagram. The original picture (4K) is the input of the SHVC EL encoder; after downsampling (HD) it is the input of the HEVC BL encoder. Each encoder contains T/Q, T^-1/Q^-1, loop filter, MC/inter-layer prediction, intra prediction, picture buffer and entropy coding blocks. An "Upsampling & scaling MV" block links the two encoders. The BL bitstream and the EL bitstream form the SHVC bitstream.]

Page 6: Real time SHVC decoder

Introduction (3/3)

Scalable High Efficiency Video Coding (SHVC) standard:

- Uses inter-layer prediction to improve coding efficiency (15%-30% gain vs. independent coding) [1]

- Legacy compliant: supports a base layer compliant with the AVC standard

- Multi-loop coding structure: all intermediate layers need to be decoded

Objective: assess the complexity of a real-time SHVC decoder with respect to a simulcast HEVC decoder

[1] V. Seregin, T. D. Chuang, Y. He, D. K. Kwon, and F. Le Leannec, "AHG Report: SHVC software," document JCTVC-L0011, Geneva, Switzerland, January 2013.

Page 7: Real time SHVC decoder

Outline

Introduction

Real time SHVC decoder

Results and analysis

Conclusion

Page 8: Real time SHVC decoder

Real time SHVC decoder (1/6)

SHVC decoder:

- Based on the OpenHEVC decoder [2]

- Open source implementation of the HEVC decoder, developed in C on top of the FFmpeg library

- Supports the decoding of all HEVC profiles and of the HEVC conformance bitstreams

- The main decoding operations are heavily optimized with SSE instructions for the x86 architecture

- Enables all parallel processing solutions adopted in the HEVC standard: tile, slice, wavefront and frame-based

[2] "Open source HEVC decoder (OpenHEVC)," https://github.com/OpenHEVC

Page 9: Real time SHVC decoder

Real time SHVC decoder (2/6)

Architecture of the OpenHEVC software:

[Figure: flowchart of slice decoding in OpenHEVC. Functions: hls_decode_slice, hls_coding_tree, hls_coding_unit, hls_prediction_unit, hls_transform_tree, hls_transform_unit, hls_deblocking_filter_ctb, hls_sao_filter_ctb. Decision nodes: "Is a CTU?", "Is a TTU?", "Is the CTU decoded?", "Is deblocking filter enabled?", "Is SAO filter enabled?", "More CTU in the slice?". The flow starts at "Decode a slice" and ends at "Slice decoded".]

Page 10: Real time SHVC decoder

Real time SHVC decoder (3/6)

Extend the OpenHEVC decoder to decode SHVC enhancement layers (EL):

- Parse the multi-layer syntax elements at the enhancement layers

- New functions to upsample the inter-layer reference picture and upscale its MVs (8-tap filter for luma and 4-tap filter for chroma)

- Manage the reference lists by including the new inter-layer reference frames

Low-level optimizations in the SHVC decoder:

- Optimize the upsampling filters with Single Instruction Multiple Data (SIMD) operations (SSE instructions for x86 processors)
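As a rough illustration of what an 8-tap luma upsampling step involves, here is a minimal scalar sketch of 2x upsampling of one pixel row. It is not taken from the OpenHEVC code. It assumes the well-known HEVC half-sample luma interpolation taps for the in-between samples and simple edge clamping; the normative SHVC resampling process defines a full set of filter phases that is not reproduced here. The names `HALF_PEL_TAPS` and `upsample_2x_row` are hypothetical.

```python
# Assumption: HEVC half-sample luma interpolation taps (they sum to 64).
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def upsample_2x_row(row, bit_depth=8):
    """Upsample one row of luma samples by 2x (illustrative sketch only).

    Even output samples copy the input sample; odd output samples are
    interpolated halfway between input samples x and x + 1 using the
    8-tap filter. Samples outside the row are replaced by the nearest
    edge sample.
    """
    n = len(row)
    max_val = (1 << bit_depth) - 1
    out = []
    for x in range(n):
        out.append(row[x])  # sample aligned with an input position
        acc = 0
        for k, tap in enumerate(HALF_PEL_TAPS):
            idx = min(max(x + k - 3, 0), n - 1)  # taps cover x-3 .. x+4
            acc += tap * row[idx]
        value = (acc + 32) >> 6  # round and divide by 64
        out.append(min(max(value, 0), max_val))  # clip to the sample range
    return out


if __name__ == "__main__":
    print(upsample_2x_row([0, 10, 20, 30, 40, 50, 60, 70, 80, 90]))
```

The inner loop is a fixed-length multiply-accumulate over neighbouring samples, which is the kind of operation that maps naturally onto SIMD instructions.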

Page 11: Real time SHVC decoder

Real time SHVC decoder (4/6)

Multiple instances of the modified OpenHEVC decoder, one instance to decode each layer:

[Figure: SHVC decoder. A demultiplexer splits the SHVC bitstream into the BL bitstream and the EL bitstream, each decoded by its own OpenHEVC decoder instance. The EL instance has access to the decoded BL and its MVs.]
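The slide does not say how the demultiplexer separates the layers. One plausible approach, sketched below as an assumption, is to route each NAL unit according to the `nuh_layer_id` field of the two-byte HEVC NAL unit header. The function names are hypothetical, and the input is assumed to be a list of NAL units already stripped of their start codes.

```python
from collections import defaultdict


def parse_nal_header(nal):
    """Decode the two-byte HEVC NAL unit header.

    Layout: forbidden_zero_bit (1 bit), nal_unit_type (6 bits),
    nuh_layer_id (6 bits), nuh_temporal_id_plus1 (3 bits).
    """
    if len(nal) < 2:
        raise ValueError("NAL unit shorter than its 2-byte header")
    b0, b1 = nal[0], nal[1]
    nal_unit_type = (b0 >> 1) & 0x3F
    nuh_layer_id = ((b0 & 0x01) << 5) | (b1 >> 3)
    temporal_id = (b1 & 0x07) - 1
    return nal_unit_type, nuh_layer_id, temporal_id


def demultiplex(nal_units):
    """Group NAL units by layer id (0 = base layer, 1 = first EL, ...)."""
    layers = defaultdict(list)
    for nal in nal_units:
        _, layer_id, _ = parse_nal_header(nal)
        layers[layer_id].append(nal)
    return dict(layers)


if __name__ == "__main__":
    vps = bytes([0x40, 0x01])        # type 32, layer 0
    bl = bytes([0x02, 0x01, 0xAA])   # type 1, layer 0
    el = bytes([0x02, 0x09, 0xBB])   # type 1, layer 1
    for layer, units in demultiplex([vps, bl, el]).items():
        print(layer, len(units))
```

A real demultiplexer may also need to forward parameter sets carried in layer 0 to the enhancement-layer decoder; that is left out of this sketch.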

Page 12: Real time SHVC decoder

Real time SHVC decoder (5/6)

Parallelism in the SHVC decoder:

- The SHVC decoder supports three levels of parallelism:

  - Wavefront parallelism: the CTB rows of each layer are decoded in parallel, for low-latency applications

  - Temporal frame-based parallelism: successive temporal frames are decoded in parallel

  - Layer parallelism: frames of the SHVC layers are decoded in parallel

- A communication control mechanism is implemented to manage wavefront dependencies as well as inter and inter-layer prediction dependencies
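The wavefront dependency can be made concrete with a small scheduling model. The sketch below is an idealised illustration, not the decoder's synchronisation code: it assumes one thread per CTB row, one time unit per CTB, and the usual HEVC wavefront rule that a CTB can start once its left neighbour and the top-right CTB of the row above are done. Function names are hypothetical.

```python
def wavefront_schedule(rows, cols):
    """Return the earliest start step of every CTB under wavefront decoding.

    Assumptions (illustrative): unit decoding time per CTB, one thread per
    row, and each CTB waits for its left neighbour and for the top-right
    CTB of the previous row (clamped to the last column).
    """
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ready = 0
            if c > 0:
                ready = max(ready, start[r][c - 1] + 1)
            if r > 0:
                dep = min(c + 1, cols - 1)
                ready = max(ready, start[r - 1][dep] + 1)
            start[r][c] = ready
    return start


def total_steps(rows, cols):
    """Number of time steps needed to decode the whole picture."""
    return wavefront_schedule(rows, cols)[rows - 1][cols - 1] + 1


if __name__ == "__main__":
    for row in wavefront_schedule(3, 5):
        print(row)
    print("wavefront steps:", total_steps(3, 5), "sequential steps:", 3 * 5)
```

In this model each row starts two CTBs behind the row above it, which is why the benefit of wavefront parallelism depends on the number of CTB rows and columns in the picture.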

Page 13: Real time SHVC decoder

Real time SHVC decoder (6/6)

Hybrid parallel architecture for the SHVC decoder with two layers:

[Figure: base layer frames (I and B frames) and enhancement layer frames (P and B frames) decoded in parallel, with each frame assigned a pair of threads (labels Thread 0 to Thread 6 in the figure). Arrows show inter prediction within a layer and inter-layer prediction from the base layer to the enhancement layer. Legend: decoded CTBs; CTBs not yet decoded; CTB being decoded; wait for the decoding of the referenced PU.]

Page 14: Real time SHVC decoder

Outline

Introduction

Real time SHVC decoder

Results and analysis

Conclusion

Page 15: Real time SHVC decoder

Results and analysis (1/5)

Experimental configuration:

- 4-core Intel i7 processor running at 2.8 GHz

- Common SHVC test conditions: video sequences from classes A and B, all QP values, and 3 scalability configurations: 2x, 1.5x and SNR

- The Scalable HEVC reference software Model (SHM) 4.1 is used to encode the test video sequences

- Three parallelism configurations (n, m) = {(4, 1), (1, 4), (2, 2)}, with n the number of threads for wavefront parallelism and m the number of frames decoded in parallel

Page 16: Real time SHVC decoder

Results and analysis (2/5)

Decoding time performance of the SHVC decoder (single core):

Decoding time (seconds)

Sequence          HEVC (SSE)  HEVC (no SSE)  SHVC SNR  SHVC 2x  SHVC 1.5x
Kimono                  3.53          12.35      5.78     5.06       5.86
ParkScene               4.03          12.08      7.24     5.86       6.92
Cactus                  6.52          18.04     12.29     9.51      12.41
BasketBallDrive         8.02          26.56     14.74    11.26      14.27
BQTerrace               9.60          29.53     18.02    12.92      16.56
Traffic                 4.14          11.27      7.83     6.22          -
PeopleOnStreet          6.70          15.90     10.42     9.73          -
Complexity (%)             0            195        80       43         77
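The slide does not state how the "Complexity (%)" row is averaged. As a sanity check, the sketch below assumes it is the overhead of each configuration relative to SSE-optimized HEVC decoding, computed as a ratio of total decoding times over the sequences available for that configuration. Under that assumption the results land within about one percentage point of the row above. The names `TIMES` and `overhead_percent` are hypothetical.

```python
# Decoding times in seconds, copied from the table:
# (HEVC SSE, HEVC no SSE, SHVC SNR, SHVC 2x, SHVC 1.5x); None = not reported.
TIMES = {
    "Kimono": (3.53, 12.35, 5.78, 5.06, 5.86),
    "ParkScene": (4.03, 12.08, 7.24, 5.86, 6.92),
    "Cactus": (6.52, 18.04, 12.29, 9.51, 12.41),
    "BasketBallDrive": (8.02, 26.56, 14.74, 11.26, 14.27),
    "BQTerrace": (9.60, 29.53, 18.02, 12.92, 16.56),
    "Traffic": (4.14, 11.27, 7.83, 6.22, None),
    "PeopleOnStreet": (6.70, 15.90, 10.42, 9.73, None),
}
COLUMNS = ("HEVC SSE", "HEVC no SSE", "SHVC SNR", "SHVC 2x", "SHVC 1.5x")


def overhead_percent(col):
    """Extra decoding time of column `col` relative to HEVC SSE, in percent.

    Assumption: ratio of summed times, restricted to the sequences that
    have a measurement in column `col`.
    """
    rows = [t for t in TIMES.values() if t[col] is not None]
    baseline = sum(t[0] for t in rows)
    total = sum(t[col] for t in rows)
    return 100.0 * (total / baseline - 1.0)


if __name__ == "__main__":
    for i, name in enumerate(COLUMNS):
        print(f"{name}: {overhead_percent(i):.1f} %")
```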

Page 18: Real time SHVC decoder

Results and analysis (3/5)

Time distribution in the SHVC decoder (single core):

Share of decoding time (%)

Stage                    (a) 2x   (b) 1.5x   (c) SNR   (d) HEVC
Transform                  2.98       3.35      4.00       4.98
Motion compensation       33.75      32.45     45.55      45.38
Inter-layer prediction    28.75      29.55      7.95       0
Intra prediction           0.98       1.08      1.45       3.45
In-loop filters           16.60      15.75     22.20      22.58
Entropy decoding          10.80      11.65     11.05      15.93
Rest                       6.15       6.18      7.80       7.70
Total time (seconds)       15.1       16.5      16.3        8.5

Figure: Average time distribution (%) in the SHVC decoder: BasketBallDrive, Random Access, all QP values.

Page 19: Real time SHVC decoder

Results and analysis (4/5)

Parallelism performance in the SHVC decoder, Class B (1080p):

                                    Decoding configurations
Metric                       Config.  (1, 1)  (4, 1)  (1, 4)  (2, 2)
Speedup                      ×2         1       2.99    2.02    3.24
                             ×1.5       1       3.13    2.21    3.36
                             SNR        1       2.90    2.50    3.34
                             HEVC       1       3.10    2.64    3.05
Decoding frame rate (fps)    ×2        52     156     107     169
                             ×1.5      44     138      99     149
                             SNR       41     124     108     138
                             HEVC      72     226     195     224
Decoding time per frame (ms) ×2        20.54    6.88   24.04   16.08
                             ×1.5      24.11    7.76   27.82   19.27
                             SNR       29.47   10.28   33.36   24.81
                             HEVC      16.07    5.20   24.05   10.66

Page 20: Real time SHVC decoder

Results and analysis (5/5)

Parallelism performance in the SHVC decoder, Class A (1600p):

                                    Decoding configurations
Metric                       Config.  (1, 1)  (4, 1)  (1, 4)  (2, 2)
Speedup                      ×2         1       3.6     2.07    3.81
                             SNR        1       3.49    2.47    3.67
                             HEVC       1       3.49    2.83    3.22
Decoding frame rate (fps)    ×2        23      84      47      88
                             SNR       17      62      43      65
                             HEVC      33     120      96     110
Decoding time per frame (ms) ×2        47.85   13.65   55      32.65
                             SNR       63.92   18.76   73.67   50.95
                             HEVC      34.56   10.25   48.65   21.90
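One way to read the Class A figures is through parallel efficiency (speedup divided by the number of threads) and a frame-rate threshold. The sketch below is illustrative only: it assumes each (n, m) configuration uses n × m threads, and the 60 fps target is an arbitrary choice for the example, not a requirement stated in the results. The names are hypothetical.

```python
# Class A frame rates (fps) copied from the table, keyed by (n, m).
CLASS_A_FPS = {
    "x2": {(1, 1): 23, (4, 1): 84, (1, 4): 47, (2, 2): 88},
    "SNR": {(1, 1): 17, (4, 1): 62, (1, 4): 43, (2, 2): 65},
    "HEVC": {(1, 1): 33, (4, 1): 120, (1, 4): 96, (2, 2): 110},
}


def efficiency(speedup, n, m):
    """Parallel efficiency, assuming the configuration runs n * m threads."""
    return speedup / (n * m)


def real_time_configs(fps_by_config, target_fps=60):
    """Return the (n, m) configurations whose frame rate meets the target."""
    return sorted(cfg for cfg, fps in fps_by_config.items() if fps >= target_fps)


if __name__ == "__main__":
    # Speedup 3.81 is the x2 value reported for configuration (2, 2).
    print("x2 (2, 2) efficiency:", efficiency(3.81, 2, 2))
    for name, fps in CLASS_A_FPS.items():
        print(name, real_time_configs(fps))
```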

Page 21: Real time SHVC decoder

Outline

Introduction

Real time SHVC decoder

Results and analysis

Conclusion

Page 22: Real time SHVC decoder

Conclusion and Perspectives (1/2)

Conclusion:

- First real-time, parallel, open source software implementation of the SHVC decoder

- Decoding two layers with the SHVC decoder introduces 43% and 77% additional complexity in the 2x and 1.5x spatial scalability configurations, respectively

- Low-level optimizations and parallelism are required to reach real-time decoding of a 4Kp60 enhancement layer

Page 23: Real time SHVC decoder

Conclusion and Perspectives (2/2)

Perspectives:

- Support the decoding of more than two layers

- Support the decoding of a base layer coded with the legacy AVC standard

- Unified software for all HEVC extensions: design a software decoder that supports the decoding of all HEVC extensions, including the MV-HEVC, RExt, screen content coding and 3D-HEVC extensions

Page 24: Real time SHVC decoder

Thank you for your attention