Standardization Trends in Video Coding Technologies

FEATURE


The JPEG format for encoding still images was standardized during the 1980s and 1990s. It uses the Discrete Cosine Transform (DCT), which was developed at the end of the 1970s, and Huffman encoding. JPEG technology was later extended to work with motion video signals, which are sequences of still images, and it became the basis for the H.261 and MPEG-1 digital video coding formats that were internationally standardized in 1990 and 1991, respectively. These formats targeted teleconferencing systems using Integrated Services Digital Network (ISDN) circuits and Compact Disc (CD) storage media, so their performance was suited to video signals of standard TV resolution compressed to a quality comparable to Video Home System (VHS) recordings. Later, with the spread of high-definition television (HDTV), the MPEG-2/H.262 and MPEG-4 Advanced Video Coding (AVC)/H.264 video compression methods were standardized, with compression suitable for maintaining the quality of broadcast high-definition video signals. Currently, they are widely used for viewing HDTV programming on digital television broadcasts and video recorded on Digital Versatile Discs (DVDs) and Blu-ray Discs (BDs). In 2013, a new video compression method supporting 4K/8K ultra-high-definition video, called MPEG-H/H.265 High Efficiency Video Coding (HEVC), was standardized, and preparations are underway to begin 4K/8K broadcasting services using this coding method. This article describes the standardization activities relating to video coding methods that have evolved as video media has developed, as well as operational guidelines for video coding in broadcast services. It also discusses trends in the development of next-generation coding methods.

1. Introduction

NHK is conducting R&D on 8K Super Hi-Vision (8K), which will have 33-megapixel video and 22.2 multi-channel sound, as a next-generation broadcast service. This video format is capable of conveying much higher realism than the conventional HDTV of current digital broadcasting services. It has 16 times the resolution of HDTV, frame rates up to 120 Hz, progressive scan, and a signal bit depth of up to 12 bits 1). NHK is working on international standardization of the parameters of this 8K system and is developing devices with a view to practical implementation. We have collaborated with other organizations around the globe, conducting international 8K transmission trials using the Internet and broadcast satellites to verify the technology 2)-6). Based on the results of these efforts, the Ministry of Internal Affairs and Communications drew up a roadmap for introducing a next-generation broadcasting service using 4K/8K technology. Plans have been made to begin deploying this service using broadcast satellites by 2016 and to start regular service by 2018 7). Currently, government and private enterprise are working together to realize this 4K/8K service.
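The figures quoted above can be checked with a little arithmetic. The following is a minimal sketch, assuming the 1,920×1,080 and 7,680×4,320 frame sizes listed later in Table 4; the function names are hypothetical, and the uncompressed bit rate further assumes 4:4:4 sampling (three full-resolution samples per pixel), which is an illustrative choice rather than something stated in the text.

```python
# Frame sizes (width, height) as listed in Table 4.
HDTV = (1920, 1080)
UHD_8K = (7680, 4320)


def pixels(size):
    """Number of pixels in one frame."""
    width, height = size
    return width * height


def raw_bit_rate(size, frame_rate_hz, bit_depth, samples_per_pixel):
    """Uncompressed bit rate in bits per second."""
    return pixels(size) * frame_rate_hz * bit_depth * samples_per_pixel


hdtv_pixels = pixels(HDTV)          # 2,073,600
uhd_pixels = pixels(UHD_8K)         # 33,177,600, i.e. about 33 megapixels
ratio = uhd_pixels / hdtv_pixels    # 16.0

# Assumption: 4:4:4 sampling, so three samples per pixel.
full_spec = raw_bit_rate(UHD_8K, 120, 12, 3)

print(f"8K pixels: {uhd_pixels:,} ({ratio:.0f}x HDTV)")
print(f"Raw 8K, 120 Hz, 12-bit: {full_spec / 1e9:.1f} Gbit/s")
```

Under these assumptions the uncompressed signal is on the order of 143 Gbit/s, which gives a sense of why efficient video coding is a precondition for 8K broadcasting.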

2. History of video coding formats

Table 1 lists major events in the development of video media, along with the progress of video coding standardization. In the past, the appearance of new video formats has always been driven by the development of technologies to transmit or record video signal data. In 1964, NHK began R&D on HDTV, with six times the data of standard television, and in 1984, it developed the Multiple Sub-Nyquist Sampling Encoding (MUSE) video compression method for transmitting HDTV by broadcast satellite. Trial satellite broadcasts using MUSE began in 1989. The success of these trials increased interest in HDTV around the world, and consequently, interest in analog and digital formats for HDTV signal compression technology also increased.

Atsuro Ichigaya, Advanced Television Systems Research Division

In 1988, the International Organization for Standardization / International Electrotechnical Commission, Joint Technical Committee 1, Sub-committee 2, Working Group 8 (ISO/IEC JTC1, SC2 WG8) departed from the standardization activities on image coding that it had been participating in under the Joint Photographic Experts Group (JPEG)*1 to form the Moving Picture Experts Group (MPEG)*2. The initial goal of MPEG was to record video on CD-ROM, the high-capacity recording medium at the time, and it began standardization of digital coding methods to this end. At the same time, the Video Coding Experts Group (VCEG) of the International Telecommunication Union Telecommunication Standardization Sector, Study Group 16, Working Party 3, Question 6 (ITU-T SG16 WP3 Q.6) was studying video coding methods for use in ISDN teleconferencing systems. VCEG and MPEG soon began a technical correspondence. VCEG standardized the H.261 8) video coding method (for p × 64 kbps) in 1990, and MPEG standardized the MPEG-1 9) video coding format for data storage (at up to 1.5 Mbps) in 1991.

MPEG-1 was for data storage; it assumed an error-free environment, and its video quality was comparable to VHS. In 1995, MPEG issued a newer video coding method named MPEG-2 10), a general-purpose, high-quality video coding method that improved on the features of MPEG-1 and supported HDTV. At the same time, VCEG standardized the specification at ITU-T under the name H.262 (hereinafter MPEG-2) 11). Use of MPEG-2 spread rapidly due to its high performance and general utility in broadcasting, communications, and storage media such as DVDs. It has since been adopted for digital television broadcasting in Japan and many other countries.

*1 Currently ISO/IEC JTC1/SC29/WG1.
*2 Currently ISO/IEC JTC1/SC29/WG11.

Table 1: Major developments in video media and progress of video coding standardization

Years covered: 1983, 1984, 1988, 1990, 1991, 1995, 1996, 1998, 1999, 2000, 2001, 2003, 2005, 2006, 2009, 2010, 2013

Major developments: CDs enter market; Hi-Vision test satellite broadcasting starts; ISDN commercial services start; 3rd Generation mobile communications systems (3G) standardized; BS digital broadcasting begins; 3G services start; Digital terrestrial television broadcasting starts, BDs enter market; YouTube service starts; UHDTV standardized; LTE standardized; DVDs enter market

ISO/IEC standardization: MPEG inaugurated; MPEG-1 standardized; MPEG-2 standardized; MPEG-4 standardized; MPEG-4 AVC standardized; MPEG-H standardized; JCT-VC established; JVT established

ITU-T standardization: VCEG inaugurated; H.261 standardized; H.262 standardized; H.263 standardized; H.264 standardized

In 1996, VCEG standardized H.263 as a coding method for narrow-band applications with higher efficiency than that of H.261. MPEG added several improvements based on H.263 to its standard for strengthening its error resistance, and this resulted in the MPEG-4 method being standardized in 1998.

Later, as HDTV spread, a new high-capacity storage medium, the BD, appeared; furthermore, in 1999, international standards for a third generation of mobile communications systems (3G) were established that made video transmission over mobile telephone digital communications networks possible. These developments in turn prompted studies aimed at standardizing new coding methods. In 2001, MPEG and VCEG established the Joint Video Team (JVT) to begin standardization of a coding method for everything from video distribution services for mobile devices to 4K-resolution video signals. This method was standardized in 2003 as MPEG-4 AVC | ITU-T H.264 (hereinafter, AVC), and it had twice the coding efficiency of MPEG-2 12) 13). AVC has excellent coding performance at low bit rates, and it has been used in Japan in the One-Seg broadcasting service for mobile receivers. As described above, video signal coding methods continue to go hand in hand with advances in video media and distribution networks.

In 2006, the new 4K/8K high-resolution video formats were internationally standardized 14) under the name Ultra High Definition Television (UHDTV). At the same time, with the growth of streaming*3 services such as YouTube and other new video distribution services, stress on line capacities started to become an issue. As a result, in January 2010, MPEG and VCEG together decided to begin standardization of a new video coding format called HEVC, with the goal of improving coding efficiency by a factor of two compared with AVC.

Standardization of HEVC is being done by a working party established by MPEG and VCEG, called the Joint Collaborative Team on Video Coding (JCT-VC). The draft standard set out by JCT-VC was approved by both organizations and standardized in 2013 as the first edition of ISO/IEC 23008-2 and Rec. ITU-T H.265 15) 16). Note that at MPEG, it is called MPEG-H Part 2 because the ISO/IEC 23008 standard is named MPEG-H and video coding is Part 2, while at ITU-T, it is called H.265. Despite this, it is commonly referred to as HEVC, the name used in the titles of the standard documents.

3. HEVC standardization activities

3.1 HEVC standardization procedure

The first JCT-VC meeting was held in April, 2010. The first version of HEVC was standardized in the 12th meeting in January 2013. The draft standards issued after the 12th meeting were submitted to both ISO/IEC and ITU-T, which then voted upon and approved them. The first editions were issued as Rec. ITU-T H.265 and ISO/IEC 23008-2 in April and December, 2013, respectively.

The first version standardizes the most generally used coding tool sets*4 (called “profiles”). It handles coding of signals with a luminance and color-difference signal pixel structure of 4:2:0*5 and a signal bit depth of 8 or 10 bits. It specifies three profiles, called the Main Still Picture Profile, the Main Profile, and the Main10 Profile, for encoding signals that are, respectively, still images with 8-bit signals, moving images with 8-bit signals, and moving images with 10-bit signals.
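To make the pixel structures concrete, the following minimal sketch counts the samples in one frame for 4:4:4, 4:2:2, and 4:2:0, using the usual definitions (full-resolution color difference, half horizontally, and half both horizontally and vertically). The function names and the 3,840×2,160 example frame are illustrative assumptions, not part of the standard text.

```python
# (horizontal divisor, vertical divisor) applied to each color-difference plane.
CHROMA_DIVISORS = {
    "4:4:4": (1, 1),   # same number of samples as luminance
    "4:2:2": (2, 1),   # half the samples horizontally
    "4:2:0": (2, 2),   # half the samples horizontally and vertically
}


def samples_per_frame(width, height, chroma_format):
    """Total luminance plus color-difference samples in one frame."""
    h_div, v_div = CHROMA_DIVISORS[chroma_format]
    luma = width * height
    # Two color-difference planes (C'B and C'R).
    chroma = 2 * (width // h_div) * (height // v_div)
    return luma + chroma


def frame_bits(width, height, chroma_format, bit_depth):
    """Uncompressed size of one frame in bits."""
    return samples_per_frame(width, height, chroma_format) * bit_depth


for fmt in CHROMA_DIVISORS:
    total = samples_per_frame(3840, 2160, fmt)
    print(f"{fmt}: {total:,} samples per 4K frame")
```

For the same frame size, 4:2:0 carries half as many samples as 4:4:4, which is one reason it is the structure used by the general-purpose profiles of the first version.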

The main topics in the standardization of HEVC are shown in Table 2. JCT-VC called for proposals of methods to serve as the basis of the new coding method. At the first JCT-VC meeting, there were proposals from 27 organizations, including NHK. Based on five of these methods, a set of powerful coding tools, the HEVC Test Model (HM), was settled upon as the basis for the coding method, together with a working draft of the standard. At each meeting, organizations proposing new technologies were required to show quantitative improvements relative to the HM in terms of both processing and performance, and to prepare revised documentation for the proposed method showing the differences from the preceding draft standard. The proposed technologies were evaluated on the basis of an analysis of these two points and on the correctness and clarity of the revisions to the draft standard, and discussions were held on whether or not to adopt them. Adopted technologies were combined, implemented in the HM, and integrated into the standard document. At the following meeting, the revised HM and draft standard were used as a basis for proposing further improvements. Through these arrangements, the coding performance improved and the draft specification became more complete with each meeting. The same process is still being used to develop extensions to the standard.

*3 Services that receive video and audio data through a network and simultaneously play it back.

*4 Which define decoding functions that must be provided by the decoder for various uses.

*5 4:4:4 has color difference signals with the same number of pixels as luminance horizontally and vertically, 4:2:2 has half the number of pixels horizontally, and 4:2:0 has half the number of pixels horizontally and vertically.

NHK contributed to HEVC standardization from the first JCT-VC meeting onwards, and submitted technology proposals focused on intra-prediction and transform technologies. NHK also proposed support for the 8K video format in HEVC and provided 8K test sequences.

3.2 Extensions to the HEVC standard

Work has continued on extensions since the first edition of the HEVC standard was completed in 2013. In 2014, the 2nd edition was standardized; it included extensions for professional-use signal formats, scalable coding, multi-viewpoint video, and 3D video coding. The signal format extensions are very important for broadcasting; they could also be called broadcasting extensions. As mentioned above, the profiles in the first edition encode signals with a 4:2:0 pixel structure and a bit depth of 8 or 10 bits. They cannot support the signals widely used in current video production, which have 4:2:2 or 4:4:4 structures and quantization of 12 bits or more. The extensions in the second edition support such professional-use signals. The edition specifies 22 profiles with different combinations of pixel structure, bit depth, and coding control for various uses. These profiles are expected to be used for transmitting raw footage, for video editing, and for other applications in broadcasting.

Other extensions were also developed, some for graphical user interfaces (GUI), called “screen content”, and some specialized for computer graphics (CG), such as game screens. These are expected to be used in scenarios such as wireless displays. The 3rd edition was finalized at the February 2016 meeting.

Table 2: Major developments in standardization of HEVC

Date        Meeting        Development
Apr. 2010   1st Meeting    27 coding formats proposed from 27 organizations. Standard software decided based on 5 formats.
July 2010   2nd Meeting    Group established to evaluate performance of the standard software.
Oct. 2010   3rd Meeting    Standard software HM 1.0 and Specification Document 1.0 released.
Feb. 2012   8th Meeting    Main profile/level draft decided; committee draft issued.
July 2012   10th Meeting   Draft international standard issued.
Oct. 2012   11th Meeting   Main 10 profile/Main still picture profile settled.
Jan. 2013   12th Meeting   International standard final draft issued.

4. UHDTV broadcasting

4.1 UHDTV broadcasting video coding format standardization

In 2013, the Ministry of Internal Affairs and Communications published a roadmap for promoting 4K/8K, and in response, the Next-Generation Television Forum (NexTV-F) was formed to promote 4K/8K throughout Japan. Since then, government and industry have been working with NexTV-F to draw up a specification for the broadcasting service, develop devices, and perform other tasks. The Video Coding working group in the Digital Broadcasting System Development section of the Association of Radio Industries and Businesses (ARIB) is also working to standardize the video coding for broadcasting. It has created the ARIB STD-B32 standard, which adopts HEVC as the video coding method and also covers multiplexing 17).

Table 3 compares HEVC with the coding methods currently used for digital broadcasting. Like MPEG-2 and AVC, HEVC uses a hybrid coding scheme that divides the video signal into blocks, which are encoded using intra and inter prediction, orthogonal transformation, quantization, and entropy coding*6. In HEVC, these coding functions have been extended and a variety of coding modes have been added. By appropriately choosing the combination of coding modes, HEVC can achieve twice the compression ratio of AVC. On the other hand, encoding requires searching for a combination of coding modes optimized for the image, so as image resolution and the amount of data increase, it becomes increasingly difficult to do such processing in real time.

*6 A technique that reduces the average code length by assigning shorter codes to symbols that occur more frequently.

Table 3: Differences among coding formats

Supported format†1 (largest)
  MPEG-2: 1,080/60/P (HDTV)
  MPEG-4 AVC/H.264: 2,160/60/P (4K†2)
  MPEG-H HEVC: 4,320/120/P (8K)

Coded blocks
  MPEG-2: 16×16
  MPEG-4 AVC/H.264: 16×16
  MPEG-H HEVC: 8×8 to 64×64

Motion compensation prediction
  MPEG-2: 16×16, 16×8; 1/2 pixel-accuracy estimation; no motion vector estimation
  MPEG-4 AVC/H.264: 4×4 to 16×16; 1/4 pixel-accuracy estimation; estimation of motion vector by median of neighboring blocks
  MPEG-H HEVC: 8×4, 4×8 to 64×64; 1/4 pixel-accuracy estimation; optimized motion vector estimation from neighboring blocks and vector merge†5

Orthogonal transform
  MPEG-2: Real DCT (8×8)
  MPEG-4 AVC/H.264: Accurate integer DCT (8×8, 4×4)
  MPEG-H HEVC: Accurate integer DCT/DST, transform skip†3 (4×4 to 32×32)

Intra-image prediction
  MPEG-2: None
  MPEG-4 AVC/H.264: 9 modes for 4×4, 8×8†4; 4 modes for 16×16
  MPEG-H HEVC: 35 modes for 4×4 to 64×64

Entropy coding
  MPEG-2: 2D VLC†9
  MPEG-4 AVC/H.264: CAVLC†10 or CABAC†11
  MPEG-H HEVC: CABAC

In-loop filter†6
  MPEG-2: None
  MPEG-4 AVC/H.264: Deblocking filter†7
  MPEG-H HEVC: Deblocking filter, pixel adaptive offset†8

†1 Samples in vertical direction/frame rate/scan method (interlaced scan (I) or progressive scan (P)).
†2 Extension to 8K is planned.
†3 A mode in which a transform is not used.
†4 Prediction based on vertical, horizontal, diagonal, and average values.
†5 Motion compensation estimation that reuses motion information from the neighboring blocks.
†6 A filter that is incorporated into the coding loop.
†7 A filter that reduces block-shaped coding distortion caused by encoding in block units.
†8 A filter technology that increases image quality by performing offset processing for each pixel in a block.
†9 Variable Length Code (VLC): A type of variable-length coding technology that reduces the amount of code by assigning shorter codes to symbols that occur more often, based on a pre-computed statistical model.
†10 Context-Adaptive Variable Length Coding (CAVLC): A type of variable-length coding that achieves higher coding efficiency than VLC by assigning codes optimized based on symbol occurrence rates in the input signal.
†11 Context Adaptive Binary Arithmetic Coding (CABAC): A type of arithmetic coding. In contrast to CAVLC, occurrence rates of symbols expressed in binary are measured per bit to implement an optimal coding. Performance is better than CAVLC, but more processing is required.
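The search for a combination of coding modes described in this section can be illustrated with a deliberately small example. The sketch below tries three intra-prediction modes (DC, horizontal, vertical) on a single 4×4 block and keeps the one with the smallest sum of absolute differences (SAD). The function names and sample values are hypothetical; an actual HEVC encoder evaluates many more modes and block sizes, and typically weighs distortion against the number of bits produced rather than using SAD alone.

```python
def predict(mode, top, left):
    """Build an n x n prediction from the row above (top) and the column to the left (left)."""
    n = len(top)
    if mode == "vertical":      # copy the row above downwards
        return [list(top) for _ in range(n)]
    if mode == "horizontal":    # copy the left column across
        return [[left[y]] * n for y in range(n)]
    if mode == "dc":            # flat block at the rounded mean of the neighbors
        dc = (sum(top) + sum(left) + n) // (2 * n)
        return [[dc] * n for _ in range(n)]
    raise ValueError(f"unknown mode: {mode}")


def sad(block, pred):
    """Sum of absolute differences between the block and its prediction."""
    return sum(abs(b - p)
               for block_row, pred_row in zip(block, pred)
               for b, p in zip(block_row, pred_row))


def best_mode(block, top, left, modes=("dc", "horizontal", "vertical")):
    """Exhaustively try every mode and return the cheapest one with all costs."""
    costs = {m: sad(block, predict(m, top, left)) for m in modes}
    return min(costs, key=costs.get), costs


# A block of vertical stripes that continues the row above it.
top = [10, 50, 90, 130]
left = [10, 10, 10, 10]
block = [list(top) for _ in range(4)]

mode, costs = best_mode(block, top, left)
print(mode, costs)   # vertical prediction matches the block exactly
```

Even in this toy case the cost grows with the number of modes and blocks tried, which is the reason the search becomes hard to run in real time as resolution increases.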

HEVC encoder and decoder devices for 4K broadcasting were relatively easy to develop, and consumer devices have already been developed. In fact, 4K broadcasting services using this standard began in June 2014 using a Communications Satellite (CS). Commercial broadcasts began in March 2015. Test satellite broadcasting of a 4K/8K service using a Broadcast Satellite (BS) is planned to start in 2016. Accordingly, several restrictions on the use of HEVC in implementations have been specified in the operational guidelines (Annex of ARIB STD-B32).

4.2 Broadcasting services and video coding formats

In addition to 4K/8K content, UHDTV broadcasting service will reuse a large quantity of earlier video resources; as such, it must be able to broadcast HDTV content of current digital television broadcasting.

Table 4 lists the video formats that can be broadcast and the profiles and levels*7 of each format. The UHDTV broadcasting service targets higher quality than the current digital television service and therefore employs the Main 10 profile. However, to maintain compatibility with earlier broadcasts, the Main profile is permitted for HDTV broadcasting service only (1,080/60/I and 1,080/60/P).

Hierarchical coding*8 has been adopted because conventional HDTV broadcast content is produced at a 60 Hz frame rate and the affordable 4K/8K displays are initially only expected to be capable of displaying at 60 Hz. In this way, 60 Hz receivers can receive services when broadcasting with 120 Hz frame rates begins.
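The idea of hierarchical coding in the time direction can be sketched as follows. This is a simplified illustration assuming a two-layer split in which even-numbered frames form the 60 Hz lower layer and odd-numbered frames form the upper layer, and in which lower-layer frames never depend on upper-layer frames; the function names are hypothetical and the actual layer structure is defined by the standard and the operational guidelines.

```python
def temporal_layer(frame_index):
    """Layer 0 (lower, 60 Hz) for even frames, layer 1 (upper) for odd frames."""
    return frame_index % 2


def decodable_frames(frame_indices, max_layer):
    """Frames a receiver decodes if it handles layers up to max_layer."""
    return [i for i in frame_indices if temporal_layer(i) <= max_layer]


frames = list(range(8))                 # eight frames captured at 120 Hz
base = decodable_frames(frames, 0)      # 60 Hz receiver: lower layer only
full = decodable_frames(frames, 1)      # 120 Hz receiver: both layers

print("60 Hz receiver decodes:", base)
print("120 Hz receiver decodes:", full)
```

Because the lower layer is a complete 60 Hz sequence on its own, a single 120 Hz stream can serve both kinds of receiver.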

The ARIB video coding working group has conducted subjective evaluation tests to estimate the bit rates required for the new broadcasting service. The maximum bit rates shown in Table 4 were set based on the bit rate estimates from these tests.

Table 4: Ultra-high-definition television broadcast video formats, profiles and levels

Parameter                               1,080/60/I     1,080/60/P     2,160/60/P     2,160/120/P    4,320/60/P     4,320/120/P
Effective samples                       1,920×1,080    1,920×1,080    3,840×2,160    3,840×2,160    7,680×4,320    7,680×4,320
                                                                      (4K)           (4K)           (8K)           (8K)
Frame rate (Hz)                         29.97, 30      59.94, 60      59.94, 60      119.88, 120    59.94, 60      119.88, 120
Pixel bit depth                         8 bit, 10 bit  8 bit, 10 bit  10 bit         10 bit         10 bit         10 bit
Profile                                 Main/Main 10   Main/Main 10   Main 10        Main 10        Main 10        Main 10
Level                                   4.1            4.1            5.1            5.2            6.1            6.2
Max. bit rate                           20 Mbps        20 Mbps        40 Mbps        50 Mbps        120 Mbps       150 Mbps
Hierarchical coding in time direction   -              -              -              Yes            -              Yes

Color difference format: Y'C'BC'R 4:2:0
Tier†2: Main Tier
Colorimetry†1: Rec. ITU-R BT.709, IEC 61966-2-4 (xvYCC); Rec. ITU-R BT.2020

†1 Specification for expressing color signals.
†2 A concept used to differentiate bit rates required by application. Two types, Main and High, are defined and the Main Tier is used for broadcasting.

*7 Definitions of the upper limits of parameters such as video resolution and bit rate.

*8 A coding format in which the encoded signal has a hierarchical structure, which allows a 60 Hz signal to be played back using only the lower layers of the signal, and a 120 Hz signal to be generated by decoding both the lower and upper layers.
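The maximum bit rates in Table 4 imply how strongly each format has to be compressed. The following minimal sketch estimates that ratio under stated assumptions: 4:2:0 sampling at 10 bits, nominal integer frame rates, and the pairing of formats with maximum bit rates as read from Table 4. The dictionary and function names are hypothetical.

```python
# (width, height, frames per second, maximum bit rate in Mbit/s), as read from Table 4.
FORMATS = {
    "1,080/60/P": (1920, 1080, 60, 20),
    "2,160/60/P": (3840, 2160, 60, 40),
    "2,160/120/P": (3840, 2160, 120, 50),
    "4,320/60/P": (7680, 4320, 60, 120),
    "4,320/120/P": (7680, 4320, 120, 150),
}


def raw_bit_rate(width, height, fps, bit_depth=10):
    """Uncompressed bit rate in bits per second for 4:2:0 video."""
    samples = width * height * 3 // 2   # luminance plus two quarter-size color planes
    return samples * bit_depth * fps


def required_compression_ratio(name):
    """Ratio of the uncompressed bit rate to the maximum broadcast bit rate."""
    width, height, fps, max_mbps = FORMATS[name]
    return raw_bit_rate(width, height, fps) / (max_mbps * 1_000_000)


for name in FORMATS:
    print(f"{name}: about {required_compression_ratio(name):.0f}:1")
```

Under these assumptions the required ratio ranges from roughly 90:1 for HDTV to roughly 400:1 for 8K at 120 Hz.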


5. Trends in next-generation coding

HEVC standardization work is nearly finalized, and both MPEG and VCEG have begun exploring various possibilities for a next-generation video coding method. MPEG held a workshop in October 2014 in Strasbourg to discuss the time frame, functionality, and performance required of a next-generation video coding standard.

This workshop focused on changes in viewing styles. Watching video content on smartphones has become especially popular. In Japan, environments for viewing regular digital television broadcasts on smartphones are becoming available, and the uploading and downloading of video by ordinary users has increased dramatically. Thanks to improvements in smartphone performance, anyone can now easily record high-resolution video, upload it to a server, and distribute it to viewers around the world. This creates strong demand for ever greater storage capacity on video servers, which shows that compression performance for storage media is still an important issue for industry. Against this background, the following targets in terms of time frame, functionality, and performance are currently being explored for the next-generation coding standard.

(1) Standardization time frame

The fifth-generation mobile network (5G) is being standardized with a target of 2020. That will also be approximately seven years after the first edition of HEVC was published, and it is expected that 4K/8K will have become popular by that time; hence, there will be demand for a new coding format. As such, a next-generation coding method is expected to be standardized around 2020.

(2) Video format

Since the resolution of mobile devices is increasing, it may be necessary to reconsider the necessity to support low-resolution formats. Currently, HDTV, UHDTV, and High Dynamic Range (HDR) video*9 are thought to be the most important.

(3) Real-time capability

Streaming services are becoming increasingly important, and real-time encoding is not necessarily required for these services. As such, a next-generation coding standard should make it possible to choose between a method with real-time capability and a higher-performance method with non-real-time encoding.

(4) Coding performance

For real-time applications, a coding performance improvement of approximately 30% over HEVC is being studied. For non-real-time applications, an improvement of approximately 50% is expected.
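To put these targets in context, the sketch below chains together the efficiency gains mentioned in this article. It assumes that "twice the coding efficiency" means half the bit rate at the same quality, and that a 30% or 50% improvement means a 30% or 50% bit rate reduction relative to HEVC; both are interpretations made for illustration, and the names are hypothetical.

```python
def relative_bit_rate(reference, improvement):
    """Bit rate for the same quality, reading `improvement` as a fractional bit-rate reduction."""
    return reference * (1.0 - improvement)


mpeg2 = 1.0                  # MPEG-2 taken as the reference
avc = mpeg2 / 2              # AVC: twice the coding efficiency of MPEG-2
hevc = avc / 2               # HEVC: goal of a factor of two over AVC

next_gen_real_time = relative_bit_rate(hevc, 0.30)       # about 30% better than HEVC
next_gen_non_real_time = relative_bit_rate(hevc, 0.50)   # about 50% better than HEVC

print(f"AVC: {avc:.3f}, HEVC: {hevc:.3f}")
print(f"Next generation, real-time: {next_gen_real_time:.3f}")
print(f"Next generation, non-real-time: {next_gen_non_real_time:.3f}")
```

Read this way, the non-real-time target would bring the bit rate to about one eighth of that of MPEG-2 for comparable quality.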

6. Future activities

In response to the proposed performance requirements, several extensions to HEVC implemented on top of the HM have been reported. They consist mainly of tools that are known to be effective but complex. As a result, they showed a little over 20% improvement in coding performance, but with about ten times the complexity of the HM. This continues to be discussed at subsequent meetings as a future issue. We also plan to continue to study next-generation coding techniques and search for ways to improve coding rates.

The Tokyo Olympic and Paralympic Games will be held in 2020, and we expect 4K/8K satellite broadcasting services to have been deployed by that time as well. After that, we expect to work on 4K/8K digital terrestrial broadcasting services, and we foresee that a coding method surpassing HEVC will be needed. NHK will continue to research ever more efficient coding techniques and 4K/8K terrestrial broadcasting services.

*9 A video signal with a broader range of brightness, which enables brighter and darker areas to be reproduced more accurately.

References

1) Rec. ITU-R BT.2020, “Parameter Values for Ultra-high Definition Television Systems for Production and International Programme Exchange” (2012)

2) Sujikai, Suzuki, Iguchi, Shimizu, Kamimura, Ichigaya: “Super Hi-Vision International Transmission Trials at IBC2008,” Broadcast Technology, Vol. 62, No. 3, pp. 90-97 (2009) (in Japanese)

3) Sujikai, Suzuki, Kojima, Shimizu, Hashimoto, Tanaka, Kimura, Toyoda, Matsuo, Nakajima, Iguchi, Okumura, Nakayama, Masuda, Osawa, Takahashi, Okawa, Shogen: “Live, multi-channel Super Hi-Vision Trial Broadcasts with the Kizuna Ultra-high-speed Internet Satellite,” Broadcast Technology, Vol. 62, No. 9, pp. 91-98 (2009) (in Japanese)

4) Nojiri, Iguchi, Noguchi, Fujii, Ogasawara: “Super Hi-Vision International Transmission Trials using the Global Research and Education IP Network,” Broadcast Technology, Vol. 64, No. 6, pp. 135-141 (2011) (in Japanese)

5) Kubota: “Super Hi-Vision: Public viewing initiatives for the London Olympics,” ITU Journal, Vol. 42, No. 11, pp. 52-56 (2012) (in Japanese)

6) Izumoto: “NHK 8K Super Hi-Vision Public Viewings of the World Cup,” ITU Journal, Vol. 44, No. 11, pp. 27-29 (2014) (in Japanese)

7) http://www.soumu.go.jp/johotsusintokei/whitepaper/ja/h25/html/nc112320.html

8) Rec. ITU-T H.261, “Video Codec for Audio Visual Services at p × 64 kbit/s” (1990)

9) ISO/IEC 11172-2, “Information Technology - Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to About 1.5 Mbit/s - Part 2: Video” (1991)

10) ISO/IEC 13818-2, “Information Technology - Generic Coding of Moving Pictures and Associated Audio Information - Part 2: Video” (1995)

11) Rec. ITU-T H.262, “Information Technology - Generic Coding of Moving Pictures and Associated Audio Information: Video” (1995)

12) ISO/IEC 14496-10, “Information Technology - Coding of Audio-visual Objects - Part 10: Advanced Video Coding” (2003)

13) Rec. ITU-T H.264, “Advanced Video Coding for Generic Audiovisual Services” (2003)

14) Rec. ITU-R BT.1769, “Parameter Values for an Expanded Hierarchy of LSDI Image Formats for Production and International Programme Exchange” (2006)

15) Rec. ITU-T H.265, “High Efficiency Video Coding” (2013)

16) ISO/IEC 23008-2, “Information Technology - High Efficiency Coding and Media Delivery in Heterogeneous Environments - Part 2: High Efficiency Video Coding” (2013)

17) ARIB: “Video coding, audio coding and multiplexing specifications for digital broadcasting,” ARIB STD-B32 (2015)
