
Overview of the High Efficiency Video Coding (HEVC) Standard

Guided by: Prof. Tanish Zaveri

Dharmendra Savaliya (14MECE05)

Hitesh Patel (14MECE11)

HEVC Introduction

Typical HEVC Encoder

HEVC Video Coding Layer

HEVC Video Coding Technique

HEVC Application

Joint project between ISO/IEC/MPEG and ITU-T/VCEG

• ISO/IEC: MPEG-H Part 2 (23008-2)

• ITU-T: H.265

• The JCT-VC committee

• Joint Collaborative Team on Video Coding

• Target: roughly half the bit-rate at the same subjective quality compared to H.264/AVC

• Requirements:

• Progressive scan required for all profiles and levels

• Interlaced content supported via field-indication SEI messages

• Video resolution: sub-QVGA up to 8K×4K, with more focus on higher-resolution video content (1080p and up)

• Color space and chroma sampling: YUV420, YUV422, YUV444, RGB444

• Bit-depth: 8-14 bits

Typical HEVC video encoder

A picture is partitioned into CTUs. The CTU is the basic processing unit.

It contains one luma CTB and two chroma CTBs. A luma CTB covers L × L samples.

Each of the two chroma CTBs covers L/2 × L/2 samples (for 4:2:0 sampling).

HEVC supports variable-size CTBs. The value of L may be equal to 16, 32, or 64.

It is selected according to the needs of the encoder in terms of memory and computational requirements.

A larger CTB is beneficial when encoding high-resolution video content.

Coding unit (CU) and coding block (CB): The root of the quadtree is the CTU.

The CTU is recursively partitioned into CUs.

A CU consists of one luma CB and two chroma CBs.

Each CU has an associated partitioning into prediction units (PUs) and a tree of transform units (TUs)

CTBs can be used as CBs or can be partitioned into multiple CBs using quadtree structures.

The quadtree splitting process can be iterated until the size for a luma CB reaches a minimum allowed luma CB size (8 ×8 or larger).
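The recursive splitting can be pictured with a small Python sketch. Only the quadtree recursion itself mirrors the standard; split_decision is a hypothetical stand-in for the encoder's rate-distortion choice.

MIN_CB_SIZE = 8   # minimum allowed luma CB size (8x8)

def partition_ctb(x, y, size, split_decision):
    """Return a list of (x, y, size) luma CBs covering the CTB at (x, y)."""
    if size == MIN_CB_SIZE or not split_decision(x, y, size):
        return [(x, y, size)]          # leaf of the quadtree: one CB
    half = size // 2
    cbs = []
    for dy in (0, half):               # visit the four quadrants in z-order
        for dx in (0, half):
            cbs += partition_ctb(x + dx, y + dy, half, split_decision)
    return cbs

# Example: split a 64x64 CTB whenever the block is larger than 32x32.
print(partition_ctb(0, 0, 64, lambda x, y, s: s > 32))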

Prediction unit (PU) and prediction block (PB)

A PU partitioning structure has its root at the CU level

PB size can be from 64×64 down to 4×4

The prediction mode for a CU is signaled as being intra or inter.

When it is signaled as intra, the PB (prediction block) size is the same as the CB size for all blocks, with one exception:

- The CB can be split into four PB quadrants when the CB size equals the smallest allowed CB size.

- This allows mode selection for blocks as small as 4×4.

Intrapicture prediction

Interpicture prediction

Asymmetric Motion Partitioning (AMP)
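As an illustration of the PU partitionings above (including the AMP shapes), the following Python sketch maps a CB size and a partition mode to the resulting PB sizes. The mode names follow the HEVC convention; the helper function itself is only illustrative.

def pb_sizes(cb, mode):
    n, q = cb // 2, cb // 4
    return {
        "PART_2Nx2N": [(cb, cb)],
        "PART_2NxN":  [(cb, n)] * 2,
        "PART_Nx2N":  [(n, cb)] * 2,
        "PART_NxN":   [(n, n)] * 4,
        # asymmetric motion partitioning (AMP), inter prediction only
        "PART_2NxnU": [(cb, q), (cb, cb - q)],
        "PART_2NxnD": [(cb, cb - q), (cb, q)],
        "PART_nLx2N": [(q, cb), (cb - q, cb)],
        "PART_nRx2N": [(cb - q, cb), (q, cb)],
    }[mode]

print(pb_sizes(32, "PART_2NxnU"))   # [(32, 8), (32, 24)]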

Transform unit (TU) and transform block (TB): A TU tree structure has its root at the CU level.

A luma CB may be identical to the luma TB or may be split into smaller luma TBs.

TB size can be 4×4, 8×8, 16×16, and 32×32

Motion compensation: Quarter-sample precision is used for the MVs.

7-tap or 8-tap filters are used for interpolation of fractional-sample positions.

Intrapicture prediction: 33 directional modes, plus planar (surface fitting) and DC (flat) modes

Modes are encoded by deriving most probable modes (MPMs) based on those of previously decoded neighboring PBs.

Quantization control: Uniform reconstruction quantization (URQ)

Entropy coding: Context-adaptive binary arithmetic coding (CABAC)

In-loop deblocking filtering: Similar to the one in H.264/AVC, but more friendly to parallel processing

Sample adaptive offset (SAO): Nonlinear amplitude mapping for better reconstruction of amplitudes, driven by histogram analysis at the encoder

HEVC: block-based hybrid video coding

① Interpicture prediction: exploits temporal statistical dependences

② Intrapicture prediction: exploits spatial statistical dependences

③ Transform coding: exploits spatial statistical dependences

Planar prediction (Intra_Planar): Amplitude surface with horizontal and vertical slopes derived from the boundary samples

DC prediction (Intra_DC): Flat surface with a value matching the mean of the boundary samples

Directional prediction (Intra_Angular): 33 different prediction directions are defined for square TB sizes from 4×4 up to 32×32
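A minimal Python sketch of the DC and planar predictors described above. The planar expression is the usual HEVC bilinear form; reference-sample substitution and boundary smoothing are omitted, and the array-based helpers are illustrative only.

import numpy as np

def intra_dc(top, left):
    """DC mode: flat block set to the mean of the boundary samples."""
    n = len(top)
    dc = (int(np.sum(top)) + int(np.sum(left)) + n) >> int(np.log2(2 * n))
    return np.full((n, n), dc, dtype=np.int32)

def intra_planar(top, left, top_right, bottom_left):
    """Planar mode: bilinear surface fitted to the block boundaries."""
    n = len(top)
    shift = int(np.log2(n)) + 1
    pred = np.empty((n, n), dtype=np.int32)
    for y in range(n):
        for x in range(n):
            pred[y, x] = ((n - 1 - x) * left[y] + (x + 1) * top_right +
                          (n - 1 - y) * top[x] + (y + 1) * bottom_left + n) >> shift
    return pred

# 4x4 example with constant boundaries: both modes reproduce a flat block.
t = np.full(4, 100); l = np.full(4, 100)
print(intra_dc(t, l)[0, 0], intra_planar(t, l, 100, 100)[0, 0])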

Fig. 6. Modes and directional orientations for intrapicture prediction

• HEVC supports motion vectors with units of one quarter of the distance between luma samples.

Fractional sample interpolation: It is used to generate the prediction samples for non-integer sample positions.
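A one-dimensional Python sketch of the interpolation. The coefficients below are the 8-tap half-sample luma filter (normalization by 64); the 7-tap quarter-sample filters and the two-dimensional separable filtering are omitted, and the helper name is illustrative.

HALF_PEL = [-1, 4, -11, 40, 40, -11, 4, -1]

def interp_half(samples, i):
    """Half-sample value between samples[i] and samples[i+1] (1-D case)."""
    acc = sum(c * samples[i - 3 + k] for k, c in enumerate(HALF_PEL))
    return (acc + 32) >> 6            # round and divide by 64

row = [100, 102, 104, 110, 120, 130, 128, 126, 124, 122]
print(interp_half(row, 4))            # value halfway between row[4] and row[5]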

For residual coding, a CB can be recursively partitioned into transform blocks.

The partitioning is signaled by a residual quadtree.

• Subdivision of a CTB into CBs and TBs.

• Solid lines: CB boundaries, dotted lines: TB boundaries

Wavefront parallel processing (WPP): A slice is divided into rows of CTUs.

This supports parallel processing of CTU rows by using several processing threads in the encoder or decoder.
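A toy Python sketch of the wavefront dependency pattern: a CTU can start once its left neighbor and the CTU above and to the right are finished, which is what lets each CTU row run on its own thread with a two-CTU lag. The scheduler below only prints the resulting waves; the function and its parameters are illustrative.

def wavefront_order(cols, rows):
    # Group CTUs into "waves" that could run concurrently under WPP dependencies.
    waves, done = [], set()
    while len(done) < cols * rows:
        ready = [(x, y) for y in range(rows) for x in range(cols)
                 if (x, y) not in done
                 and (x == 0 or (x - 1, y) in done)                       # left neighbor
                 and (y == 0 or (min(x + 1, cols - 1), y - 1) in done)]   # above-right
        waves.append(ready)
        done.update(ready)
    return waves

for i, wave in enumerate(wavefront_order(cols=5, rows=3)):
    print("wave", i, ":", wave)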

HEVC uses transform coding of the prediction error residual. The residual block is partitioned into multiple square TBs.

The supported transform block sizes are 4×4, 8×8, 16×16, and 32×32.

1. Core transform

2. Alternative integer transform

3. Scaling and quantization (sketched below)
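A minimal Python sketch of uniform reconstruction quantization: in HEVC the quantization step size roughly doubles for every increase of 6 in QP, which the helper below models directly. The integer scaling tables, transform-dependent shifts, and rate-distortion-optimized rounding are all omitted.

def q_step(qp):
    return 2.0 ** ((qp - 4) / 6.0)     # step size doubles every 6 QP steps

def quantize(coeff, qp, rounding=0.5):
    step = q_step(qp)
    sign = -1 if coeff < 0 else 1
    return sign * int(abs(coeff) / step + rounding)

def dequantize(level, qp):
    return level * q_step(qp)          # uniform reconstruction: level * step

c = 200.0
for qp in (22, 27, 32, 37):
    lvl = quantize(c, qp)
    print(qp, lvl, round(dequantize(lvl, qp), 1))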

HEVC uses only CABAC for entropy coding.

Context modeling: The number of contexts used in HEVC is smaller than in H.264/MPEG-4 AVC, yet the entropy coding design actually provides better compression.

Adaptive coefficient scanning: Coefficient scanning is performed in 4×4 subblocks for all TB sizes.

The selection of the coefficient scanning order depends on the directionality of the intrapicture prediction.

The horizontal scan is used when the prediction direction is close to vertical.

The vertical scan is used when the prediction direction is close to horizontal.

For other prediction directions, the diagonal up-right scan is used.
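The mode-dependent scan choice can be sketched as below (Python). Mode 10 is the horizontal direction and mode 26 the vertical one; the mode ranges used here follow that convention but should be treated as illustrative rather than a copy of the normative tables.

def scan_order(intra_mode, tb_size):
    if tb_size in (4, 8):                 # mode-dependent scan only for small TBs
        if 22 <= intra_mode <= 30:        # prediction close to vertical
            return "horizontal"
        if 6 <= intra_mode <= 14:         # prediction close to horizontal
            return "vertical"
    return "diagonal_up_right"            # default for all other cases

print(scan_order(26, 4), scan_order(10, 4), scan_order(2, 16))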

Two processing steps, a deblocking filter (DBF) followed by a sample adaptive offset (SAO) filter, are applied to the reconstructed samples. The DBF is intended to reduce the blocking artifacts due to block-based coding.

The DBF is only applied to the samples located at block boundaries.

The SAO filter is applied adaptively to all samples satisfying certain conditions, e.g., based on local gradients.

Deblocking filter: It is applied to all samples adjacent to a PU or TU boundary.

HEVC only applies the deblocking filter to edges that are aligned on an 8×8 sample grid.

This restriction reduces the worst-case computational complexity without noticeable degradation of the visual quality.

It also improves parallel-processing operation.

The processing order of the deblocking filter is defined as horizontal filtering for vertical edges for the entire picture first, followed by vertical filtering for horizontal edges.
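The two-pass order can be sketched in Python as follows. Only the pass structure reflects HEVC (all vertical edges of the picture filtered horizontally first, then all horizontal edges filtered vertically, on the 8×8 grid); the smoothing function is a made-up stand-in with none of the boundary-strength or activity decisions of the real filter.

import numpy as np

def smooth_edge(a, b):
    avg = (a + b) // 2
    return (a + avg) // 2, (b + avg) // 2   # placeholder smoothing, not normative

def deblock(pic):
    h, w = pic.shape
    for x in range(8, w, 8):              # pass 1: vertical edges, horizontal filtering
        for y in range(h):
            pic[y, x - 1], pic[y, x] = smooth_edge(pic[y, x - 1], pic[y, x])
    for y in range(8, h, 8):              # pass 2: horizontal edges, vertical filtering
        for x in range(w):
            pic[y - 1, x], pic[y, x] = smooth_edge(pic[y - 1, x], pic[y, x])
    return pic

pic = np.repeat(np.repeat(np.array([[100, 140], [140, 100]]), 8, 0), 8, 1)
print(deblock(pic)[7:9, 7:9])             # samples straddling the 8x8 block edge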

SAO (sample adaptive offset): It is a process that modifies the decoded samples by conditionally adding an offset value to each sample after the application of the deblocking filter, based on values in look-up tables transmitted by the encoder.

It is performed on a region basis, with the filtering type selected per CTB.

sao_type_idx = 0: SAO is not applied to the CTB.

sao_type_idx = 1: band offset filtering.

sao_type_idx = 2: edge offset filtering.

SAO band offset mode: The selected offset value directly depends on the sample amplitude.

The full sample amplitude range is uniformly split into 32 segments called bands.

The sample values belonging to four of these bands are modified by adding transmitted values.

The main reason for using four consecutive bands is that in smooth areas, where banding artifacts can appear, the sample amplitudes tend to be concentrated in only a few bands.
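A minimal Python sketch of band offset for 8-bit samples: the amplitude range is split into 32 bands of width 8, and transmitted offsets are added to samples whose band falls in the four consecutive bands starting at band_pos. The band position and offset values in the example are made up.

def sao_band_offset(sample, band_pos, offsets):      # offsets: 4 signaled values
    band = sample >> 3                               # 256 / 32 = 8 values per band
    if band_pos <= band < band_pos + 4:
        sample += offsets[band - band_pos]
    return max(0, min(255, sample))

# e.g. nudge samples in bands 12..15 (amplitudes 96..127) by small offsets
print([sao_band_offset(s, 12, [2, 1, -1, -2]) for s in (90, 100, 120, 130)])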

SAO edge offset mode: A horizontal, vertical, or one of two diagonal gradient directions is used for the edge offset classification in the CTB.

Each sample in the CTB is classified into one of five EdgeIdx categories.
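The classification can be sketched as a comparison of each sample with its two neighbors along the chosen direction (Python below); the category numbering follows the usual EdgeIdx table, with 0 meaning no offset is applied.

def edge_idx(p, n0, n1):
    if p < n0 and p < n1:
        return 1                          # local minimum (valley)
    if (p < n0 and p == n1) or (p < n1 and p == n0):
        return 2                          # concave corner
    if (p > n0 and p == n1) or (p > n1 and p == n0):
        return 3                          # convex corner
    if p > n0 and p > n1:
        return 4                          # local maximum (peak)
    return 0                              # monotonic: no offset applied

print([edge_idx(*t) for t in [(5, 8, 9), (8, 9, 8), (9, 8, 9), (9, 8, 7), (8, 7, 9)]])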

I_PCM mode: The prediction, transform, quantization, and entropy coding are bypassed.

The samples are directly represented by a pre-defined number of bits.

Its main purpose is to avoid excessive consumption of bits when the signal characteristics are extremely unusual.

Lossless mode: The transform, quantization, and other processing that affects the decoded picture are bypassed.

The residual signal from inter- or intrapicture prediction is directly fed into the entropy coder.

It allows mathematically lossless reconstruction.

SAO and deblocking filtering are not applied to these regions.

Transform skipping mode: Only the transform is bypassed.

It improves compression for certain types of video content such as computer-generated images or graphics mixed with camera-view content.

It can be applied to TBs of 4×4 size only.

Harmonic has announced support for HEVC within its market-leading ProMedia solution. ProMedia Live supports live streaming from mobile up to HD resolutions, delivering real-time video to software clients on smartphones, tablets, computers, and gaming and streaming-player consoles. This capability will be extended to allow ProMedia to support resolutions up to and including Ultra HD. ProMedia Xpress brings Harmonic's quality/performance leadership to file transcoding by adding HEVC and allowing efficient handling of all types of content, including Ultra HD for on-demand streaming, again to software decode clients.

A multitude of tech companies are starting to adopt the new standard and offer support. Companies such as Harmonic have announced support for the standard. Many others such as LG, Panasonic, Sony, Toshiba, Philips, Sharp, ARM, Intel, Nvidia, Qualcomm, Realtek Semiconductor and Mozilla have backed the codec, and even Google will support it in Chrome and hasn't ruled out YouTube support. Apple has also moved to support HEVC on its iPads; those sold in 2012 are HEVC compliant.

Why the initial lag in adoption of the new codec?

One explanation was uncertainty over licensing and what it would cost to use HEVC. Another was the question of how HEVC playback would be incorporated into the iOS and Android platforms, whether via an app or an OS upgrade.


Reference: G. J. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand, "Overview of the High Efficiency Video Coding (HEVC) Standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, Dec. 2012.