Real Time SHVC Decoder: Implementation and Complexity Analysis
Wassim Hamidouche
Research Engineer
E-mail: [email protected]
Phone : +33 6 19 14 16 38
October 2014
Outline
Introduction
Real time SHVC decoder
Results and analysis
Conclusion
Introduction (1/3)
Scalable High Efficiency Video Coding (SHVC) standard:
Enables higher temporal, spatial and quality resolutions, higher bit depth and a wider color gamut.
Reuses all the powerful tools of the HEVC standard: quadtree-based block partitioning, large transform and prediction blocks, accurate intra/inter prediction and in-loop filters.
Introduction (2/3)
Block diagram of the SHVC encoder with two spatial scalability layers:
[Figure: the original picture (4K) feeds the SHVC EL encoder and, after downsampling, the HEVC BL encoder (HD). Each encoder contains T/Q, inverse T/Q (T⁻¹/Q⁻¹), loop filter, picture buffer, intra prediction, MC/inter-layer prediction and entropy coding. An "upsampling & scaling MV" block links the BL encoder to the EL encoder. The BL bitstream and EL bitstream are combined into the SHVC bitstream.]
Introduction (3/3)
Scalable High Efficiency Video Coding (SHVC) standard:
- Uses inter-layer prediction to improve coding efficiency (15%-30% gain vs. independent coding) [1]
- Legacy compliant: supports a base layer compliant with the AVC standard
- Multi-loop coding structure: all intermediate layers need to be decoded
Assess the complexity of a real-time SHVC decoder with respect to a simulcast HEVC decoder
[1] V. Seregin, T. D. Chuang, Y. He, D. K. Kwon, and F. Le Leannec, "AHG Report: SHVC software," document JCTVC-L0011, Geneva, Switzerland, January 2013.
Real time SHVC decoder (1/6)
SHVC decoder:
- Based on the OpenHEVC decoder [2]
- Open source implementation of the HEVC decoder, developed in C on top of the FFmpeg library
- Supports the decoding of all HEVC profiles and the HEVC conformance bitstreams
- The main decoding operations are heavily optimized with SSE instructions for the x86 architecture
- Enables all parallel processing solutions adopted in the HEVC standard: tile, slice, wavefront and frame-based
[2] "Open source HEVC decoder (OpenHEVC)," https://github.com/OpenHEVC
Real time SHVC decoder (2/6)
Architecture of the OpenHEVC software:
[Figure: flowchart of slice decoding, from "Decode a slice" to "Slice decoded". Functions shown: hls_decode_slice, hls_coding_tree, hls_coding_unit, hls_prediction_unit, hls_transform_tree, hls_transform_unit, hls_deblocking_filter_ctb, hls_sao_filter_ctb. Decision nodes (yes/no): "Is a CTU?", "Is a TTU?", "Is the CTU decoded?", "Is deblocking filter enabled?", "Is SAO filter enabled?", "More CTU in the slice?".]
Real time SHVC decoder (3/6)
Extend the OpenHEVC decoder to decode SHVC enhancement layers (EL):
- Parse multi-layer syntax elements at the enhancement layers
- New functions upsample the inter-layer reference picture (8-tap filter for luma, 4-tap filter for chroma) and upscale its MVs
- Manage reference lists by including the new inter-layer reference frames
Low-level optimizations in the SHVC decoder:
- Optimize the upsampling filters with Single Instruction Multiple Data (SIMD) operations (SSE instructions for x86 processors)
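The upsampling and MV upscaling steps above can be sketched in one dimension. This is only an illustration, not the decoder's code and not the normative SHVC resampling process: the slides do not list filter coefficients, so the sketch assumes the 8-tap HEVC half-sample luma interpolation taps, handles only a 2x ratio with edge replication, and uses hypothetical helper names (`upsample_row_2x`, `scale_mv`).

```python
# Illustrative 1-D sketch only; not the normative SHVC resampling process.
# Assumption: the 8-tap HEVC half-sample luma taps (they sum to 64).
HALF_SAMPLE_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def clip_pixel(value, bit_depth=8):
    """Clamp a filtered value to the valid sample range."""
    return max(0, min((1 << bit_depth) - 1, value))


def upsample_row_2x(row):
    """Double the length of a luma row: copy the integer positions and
    interpolate the half-sample positions with the 8-tap filter."""
    n = len(row)
    out = []
    for i in range(n):
        out.append(row[i])  # phase 0: existing sample is kept
        acc = 0
        for k, tap in enumerate(HALF_SAMPLE_TAPS):
            j = min(max(i + k - 3, 0), n - 1)  # replicate edge samples
            acc += tap * row[j]
        out.append(clip_pixel((acc + 32) >> 6))  # round, then divide by 64
    return out


def scale_mv(mv, num, den):
    """Scale a base-layer MV by num/den, rounding half away from zero."""
    def scale(v):
        sign = -1 if v < 0 else 1
        return sign * ((abs(v) * num + den // 2) // den)
    return (scale(mv[0]), scale(mv[1]))


if __name__ == "__main__":
    print(upsample_row_2x([0, 10, 20, 30, 40, 50, 60, 70]))
    print(scale_mv((4, -6), 2, 1))   # 2x spatial scalability
    print(scale_mv((4, -6), 3, 2))   # 1.5x spatial scalability
```

The inner multiply-accumulate loop is the kind of kernel that maps naturally onto SIMD instructions, which is why the slides single out the upsampling filters for SSE optimization.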
Real time SHVC decoder (4/6)
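Multiple instances of the modified OpenHEVC decoder, one instance to decode each layer:
[Figure: a demultiplexer splits the SHVC bitstream into a BL bitstream and an EL bitstream, each fed to its own OpenHEVC decoder instance. The EL instance has access to the decoded BL and its MVs. The two instances together form the SHVC decoder.]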
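A minimal sketch of the demultiplexing step, assuming the split is keyed on the `nuh_layer_id` field of the two-byte HEVC NAL unit header (layer 0 being the base layer). The slides do not describe how the demultiplexer is implemented, so the function names and the list-of-bytes input are hypothetical, and start-code parsing is left out.

```python
def parse_nal_header(b0, b1):
    """Decode the two-byte HEVC NAL unit header.

    Layout: forbidden_zero_bit (1 bit), nal_unit_type (6 bits),
    nuh_layer_id (6 bits), nuh_temporal_id_plus1 (3 bits).
    """
    nal_unit_type = (b0 >> 1) & 0x3F
    nuh_layer_id = ((b0 & 0x01) << 5) | (b1 >> 3)
    temporal_id = (b1 & 0x07) - 1
    return nal_unit_type, nuh_layer_id, temporal_id


def demux_by_layer(nal_units):
    """Group NAL units (bytes objects without start codes) by layer id."""
    layers = {}
    for nal in nal_units:
        _, layer_id, _ = parse_nal_header(nal[0], nal[1])
        layers.setdefault(layer_id, []).append(nal)
    return layers


if __name__ == "__main__":
    stream = [
        bytes([0x40, 0x01, 0xAA]),  # type 32 (VPS), layer 0
        bytes([0x02, 0x01, 0xBB]),  # type 1, layer 0 (base layer slice)
        bytes([0x02, 0x09, 0xCC]),  # type 1, layer 1 (enhancement layer slice)
    ]
    for layer_id, units in sorted(demux_by_layer(stream).items()):
        print(layer_id, len(units))
```

Each per-layer list would then be handed to its own decoder instance, with the EL instance reading the decoded BL pictures and MVs as the figure indicates.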
Real time SHVC decoder (5/6)
Parallelism in the SHVC decoder:
- The SHVC decoder supports three levels of parallelism:
  - Wavefront parallelism: the CTB rows of each layer are decoded in parallel, for low-latency applications
  - Temporal frame-based parallelism: successive frames are decoded in parallel
  - Layer parallelism: frames of the SHVC layers are decoded in parallel
- A communication control mechanism is implemented to manage wavefront dependencies as well as inter and inter-layer prediction dependencies
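The wavefront dependencies mentioned above can be illustrated with a small scheduling sketch. It assumes the usual HEVC wavefront rule, under which a CTB can start once its left neighbor and the top-right CTB of the row above are done, and it assumes one thread per CTB row and one time step per CTB. It ignores inter and inter-layer dependencies, and `wavefront_schedule` is a hypothetical helper, not part of OpenHEVC.

```python
def wavefront_schedule(rows, cols):
    """Return the earliest time step at which each CTB can be decoded."""
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            deps = []
            if c > 0:
                deps.append(start[r][c - 1])  # left neighbor in the same row
            if r > 0:
                # top-right CTB of the row above (clamped at the row end)
                deps.append(start[r - 1][min(c + 1, cols - 1)])
            start[r][c] = 1 + max(deps) if deps else 0
    return start


if __name__ == "__main__":
    schedule = wavefront_schedule(rows=3, cols=6)
    for row in schedule:
        print(" ".join(f"{t:2d}" for t in row))
    total_steps = schedule[-1][-1] + 1
    print("steps with wavefront:", total_steps, "vs sequential:", 3 * 6)
```

Each row starts two steps after the row above it, so the gain grows with the picture width and is limited by the ramp-up and ramp-down at the start and end of each frame.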
Real time SHVC decoder (6/6)
Hybrid parallel architecture for the SHVC decoder with two layers:
[Figure: base layer and enhancement layer frames (I, P and B frames) decoded concurrently, with groups of threads (labeled Thread 0 to Thread 6) assigned to the frames. Arrows mark inter prediction within a layer and inter-layer prediction between the layers. Legend: decoded CTBs, not yet decoded CTBs, CTB being decoded, wait for the decoding of the referenced PU.]
Results and analysis (1/5)
Experimental configuration:
- 4-core Intel i7 processor running at 2.8 GHz
- Common SHVC test conditions: video sequences from classes A and B at all QP values, with 3 scalability configurations: 2x, 1.5x and SNR
- Reference Scalable Software Model (SHM) 4.1 is used to encode the test video sequences
- Three parallelism configurations (n, m) = {(4, 1), (1, 4), (2, 2)}, where n is the number of threads for wavefront parallelism and m is the number of frames decoded in parallel
Results and analysis (2/5)
Decoding time performance of the SHVC decoder (single core), in seconds:
                    HEVC               SHVC
Sequence            SSE     no SSE     SNR     2x      1.5x
Kimono              3.53    12.35      5.78    5.06    5.86
ParkScene           4.03    12.08      7.24    5.86    6.92
Cactus              6.52    18.04      12.29   9.51    12.41
BasketBallDrive     8.02    26.56      14.74   11.26   14.27
BQTerrace           9.60    29.53      18.02   12.92   16.56
Traffic             4.14    11.27      7.83    6.22    -
PeopleonStreet      6.70    15.90      10.42   9.73    -
Complexity (%)      0       195        80      43      77
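The slides do not state how the "Complexity (%)" row is derived. As a rough cross-check, the sketch below assumes it is the extra decoding time relative to the SSE-optimized HEVC decoder, computed from the summed times of the sequences available in each column. Under that assumption the results land within about one percentage point of the reported figures. The names `TIMES`, `COLUMNS` and `overhead_percent` are illustrative.

```python
# Decoding times in seconds, copied from the slide.
# Column order: HEVC SSE, HEVC no SSE, SHVC SNR, SHVC 2x, SHVC 1.5x.
COLUMNS = ("HEVC SSE", "HEVC no SSE", "SNR", "2x", "1.5x")
TIMES = {
    "Kimono":          (3.53, 12.35, 5.78, 5.06, 5.86),
    "ParkScene":       (4.03, 12.08, 7.24, 5.86, 6.92),
    "Cactus":          (6.52, 18.04, 12.29, 9.51, 12.41),
    "BasketBallDrive": (8.02, 26.56, 14.74, 11.26, 14.27),
    "BQTerrace":       (9.60, 29.53, 18.02, 12.92, 16.56),
    "Traffic":         (4.14, 11.27, 7.83, 6.22, None),
    "PeopleonStreet":  (6.70, 15.90, 10.42, 9.73, None),
}


def overhead_percent(column):
    """Extra decoding time (%) of a column relative to HEVC with SSE,
    using only the sequences for which that column has a value."""
    idx = COLUMNS.index(column)
    rows = [t for t in TIMES.values() if t[idx] is not None]
    total = sum(t[idx] for t in rows)
    reference = sum(t[0] for t in rows)
    return 100.0 * (total / reference - 1.0)


if __name__ == "__main__":
    for name in COLUMNS:
        print(f"{name:12s} {overhead_percent(name):6.1f} %")
```

The 1.5x column is compared only against the five sequences that have a 1.5x measurement, which is why its reference total differs from the other columns.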
Results and analysis (3/5)
Time distribution in the SHVC decoder (single core), in % of the total decoding time:
Component                 (a) 2x   (b) 1.5x   (c) SNR   (d) HEVC
Transform                 2.98     3.35       4         4.98
Motion compensation       33.75    32.45      45.55     45.38
Inter-layer prediction    28.75    29.55      7.95      0
Intra prediction          0.98     1.08       1.45      3.45
In-loop filters           16.6     15.75      22.2      22.58
Entropy decoding          10.8     11.65      11.05     15.93
Rest                      6.15     6.18       7.8       7.7
Total time (seconds)      15.1     16.5       16.3      8.5
Figure: Average time distribution (%) in the SHVC decoder: BasketBallDrive, Random Access, all QP values.
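To compare the configurations in absolute terms, the percentages in the figure can be converted to seconds using the total time reported for each panel. The sketch below does this for two components. It is plain arithmetic on the slide's numbers, and the dictionary and function names are illustrative.

```python
# Share of total decoding time (%) and total time (s), copied from the figure.
TOTAL_SECONDS = {"2x": 15.1, "1.5x": 16.5, "SNR": 16.3, "HEVC": 8.5}
SHARE_PERCENT = {
    "Motion compensation":    {"2x": 33.75, "1.5x": 32.45, "SNR": 45.55, "HEVC": 45.38},
    "Inter-layer prediction": {"2x": 28.75, "1.5x": 29.55, "SNR": 7.95, "HEVC": 0.0},
}


def component_seconds(component, config):
    """Absolute time spent in a component for one configuration."""
    return SHARE_PERCENT[component][config] / 100.0 * TOTAL_SECONDS[config]


if __name__ == "__main__":
    for component in SHARE_PERCENT:
        for config in TOTAL_SECONDS:
            secs = component_seconds(component, config)
            print(f"{component:24s} {config:5s} {secs:5.2f} s")
```

For the 2x case, inter-layer prediction accounts for roughly 4.3 of the 15.1 seconds, a large share of the 6.6 seconds added on top of single-layer HEVC decoding.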
Results and analysis (4/5)
Parallelism performance in the SHVC decoder, Class B (1080p):
Decoding configuration (n, m)          (1, 1)   (4, 1)   (1, 4)   (2, 2)
Speedup                        ×2      1        2.99     2.02     3.24
                               ×1.5    1        3.13     2.21     3.36
                               SNR     1        2.90     2.50     3.34
                               HEVC    1        3.10     2.64     3.05
Decoding frame rate (fps)      ×2      52       156      107      169
                               ×1.5    44       138      99       149
                               SNR     41       124      108      138
                               HEVC    72       226      195      224
Decoding time per frame (ms)   ×2      20.54    6.88     24.04    16.08
                               ×1.5    24.11    7.76     27.82    19.27
                               SNR     29.47    10.28    33.36    24.81
                               HEVC    16.07    5.20     24.05    10.66
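The Class B figures can be read together with a short calculation. The sketch below takes the reported speedups and single-thread frame rates for the ×2 and SNR cases, estimates the frame rate as baseline times speedup (which only approximately matches the reported frame rates), and derives a parallel efficiency. It assumes each configuration (n, m) uses n × m threads, which the slides do not state explicitly, and the 60 fps real-time target is likewise an assumption for illustration.

```python
# Values copied from the Class B (1080p) results; keys are (n, m).
BASELINE_FPS = {"x2": 52, "SNR": 41}
SPEEDUP = {
    "x2":  {(4, 1): 2.99, (1, 4): 2.02, (2, 2): 3.24},
    "SNR": {(4, 1): 2.90, (1, 4): 2.50, (2, 2): 3.34},
}


def estimated_fps(stream, config):
    """Frame rate estimated as single-thread frame rate times speedup."""
    return BASELINE_FPS[stream] * SPEEDUP[stream][config]


def efficiency(stream, config):
    """Speedup per thread, assuming n * m threads in total."""
    n, m = config
    return SPEEDUP[stream][config] / (n * m)


def is_real_time(stream, config, target_fps=60):
    """True if the estimated frame rate meets the assumed target."""
    return estimated_fps(stream, config) >= target_fps


if __name__ == "__main__":
    for stream, configs in SPEEDUP.items():
        for config in configs:
            print(stream, config,
                  f"{estimated_fps(stream, config):6.1f} fps",
                  f"efficiency {efficiency(stream, config):.2f}",
                  "real time" if is_real_time(stream, config) else "below target")
```

With these numbers the hybrid (2, 2) configuration gives the highest speedup for both streams, while (4, 1) has the lowest reported decoding time per frame.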
Results and analysis (5/5)
Parallelism performance in the SHVC decoder, Class A (1600p):
Decoding configuration (n, m)          (1, 1)   (4, 1)   (1, 4)   (2, 2)
Speedup                        ×2      1        3.6      2.07     3.81
                               SNR     1        3.49     2.47     3.67
                               HEVC    1        3.49     2.83     3.22
Decoding frame rate (fps)      ×2      23       84       47       88
                               SNR     17       62       43       65
                               HEVC    33       120      96       110
Decoding time per frame (ms)   ×2      47.85    13.65    55       32.65
                               SNR     63.92    18.76    73.67    50.95
                               HEVC    34.56    10.25    48.65    21.90
Conclusion and Perspectives (1/2)
Conclusion:
- First real-time, parallel, open source software implementation of the SHVC decoder
- Decoding two layers with the SHVC decoder introduces 43% and 77% additional complexity in the 2x and 1.5x spatial scalability configurations, respectively, compared with single-layer HEVC decoding
- Low-level optimizations and parallelism are required to reach real-time decoding of a 4Kp60 enhancement layer
Conclusion and Perspectives (2/2)
Perspectives:
- Support the decoding of more than two layers
- Support the decoding of a base layer in the legacy AVC standard
- Unified software for all HEVC extensions: design a software decoder that supports the decoding of all HEVC extensions, including MV-HEVC, RExt, screen content coding and 3D-HEVC
Thank you for your attention