FEATURE
Standardization Trends in Video Coding Technologies
Atsuro Ichigaya, Advanced Television Systems Research Division

The JPEG format for encoding still images was standardized during the 1980s and 1990s. It uses the Discrete Cosine Transform (DCT), which was developed at the end of the 1970s, and Huffman encoding. JPEG technology was later extended to work with motion video signals, which are sequences of still images, and it became the basis for the H.261 and MPEG-1 digital video coding formats that were internationally standardized in 1990 and 1991, respectively. These formats targeted teleconferencing systems using Integrated Services Digital Network (ISDN) circuits and Compact Disc (CD) storage media, so their performance was suitable for video signals of standard TV resolution compressed to a quality comparable to Video Home System (VHS) recordings. Later, with the spread of high-definition television (HDTV), the MPEG-2/H.262 and MPEG-4 Advanced Video Coding (AVC)/H.264 video compression methods were standardized with compression performance suitable for maintaining the quality of broadcast high-definition video signals. Currently, they are widely used to view HDTV programming of digital television broadcasts and video recorded on Digital Versatile Discs (DVDs) and Blu-ray Discs (BDs). In 2013, a new video compression method supporting 4K/8K ultra-high-definition video, called MPEG-H/H.265 High Efficiency Video Coding (HEVC), was standardized, and preparations are underway to begin 4K/8K broadcasting services using this coding method. This article describes the standardization activities relating to video coding methods that have evolved as video media has developed, as well as operational guidelines for video coding in broadcast services. It also discusses the trend in the development of next-generation coding methods.
1. Introduction
NHK is conducting R&D on 8K Super Hi-Vision (8K), which will have 33-megapixel video and 22.2 multi-channel sound, as a next-generation broadcast service. This video format is capable of conveying much higher realism than the conventional HDTV of current digital broadcasting services. It has 16 times as many pixels as HDTV, frame rates up to 120 Hz, progressive scan, and a signal bit depth of up to 12 bits 1). NHK is working on international standardization of the parameters of this 8K system and is developing devices with a view to practical implementation. We have collaborated with other organizations around the globe, conducting international 8K transmission trials using the Internet and broadcast satellites to verify the technology 2)-6). Based on the results of these efforts, the Ministry of Internal Affairs and Communications drew up a roadmap for introducing a next-generation broadcasting service using 4K/8K technology. Plans have been made to begin test broadcasting of this service using broadcast satellites by 2016 and regular service by 2018 7). Currently, government and private enterprise are working together to realize this 4K/8K service.
2. History of video coding formats
Table 1 lists major events in the development of video media, along with the progress of video coding standardization. In the past, the appearance of new video formats has always been driven by the development of technologies to transmit or record video signal data. In 1964, NHK began R&D on HDTV, which has six times the data of standard television, and in 1984, it developed the Multiple Sub-Nyquist Sampling Encoding (MUSE) compression method for transmitting HDTV by broadcast satellite. Trial satellite broadcasts using MUSE began in 1989. The success of these trials increased interest in HDTV around the world, and consequently, interest in analog and digital formats for HDTV signal compression technology also increased.
In 1988, the International Organization for Standardization / International Electrotechnical Commission, Joint Technical Committee 1, Sub-committee 2, Working Group 8 (ISO/IEC JTC1/SC2/WG8) branched off from the standardization activities on image coding that it had been participating in under the Joint Photographic Experts Group (JPEG)*1 to form the Moving Picture Experts Group (MPEG)*2. The initial goal of MPEG was to record video on CD-ROM, the high-capacity recording medium of the time, and it began standardization of digital coding methods to this end. At the same time, the Video Coding Experts Group (VCEG) of the International Telecommunication Union, Telecommunication Standardization Sector, Study Group 16, Working Party 3, Question 6 (ITU-T SG16 WP3 Q.6) was studying video coding methods for use in ISDN teleconferencing systems. VCEG and MPEG soon began exchanging technical information. VCEG standardized the H.261 8) video coding method (for 64 kbps) in 1990, and MPEG standardized the MPEG-1 9) video coding format for data storage (at up to 1.5 Mbps) in 1991.
MPEG-1 was designed for data storage; it assumed an error-free environment, and its video quality was comparable to VHS. In 1995, MPEG issued a newer video coding method named MPEG-2 10), a general-purpose, high-quality video coding method that improved on the features of MPEG-1 and supported HDTV. At the same time, VCEG standardized the same specification at ITU-T under the name H.262 (hereinafter MPEG-2) 11). Use of MPEG-2 spread rapidly due to its high performance and general utility in broadcasting, communications, and storage media such as DVDs. It has since been adopted for digital television broadcasting in Japan and many other countries.

*1 Currently ISO/IEC JTC1/SC29/WG1.
*2 Currently ISO/IEC JTC1/SC29/WG11.

Table 1: Major developments in video media and progress of video coding standardization
Years shown: 1983, 1984, 1988, 1990, 1991, 1995, 1996, 1998, 1999, 2000, 2001, 2003, 2005, 2006, 2009, 2010, 2013
Major developments: CDs enter market; Hi-Vision test satellite broadcasting starts; ISDN commercial services start; DVDs enter market; 3rd Generation mobile communications systems (3G) standardized; BS digital broadcasting begins; 3G services start; Digital terrestrial television broadcasting starts, BDs enter market; YouTube service starts; UHDTV standardized; LTE standardized
ISO/IEC standardization: MPEG inaugurated; MPEG-1 standardized; MPEG-2 standardized; MPEG-4 standardized; MPEG-4 AVC standardized; MPEG-H standardized
ITU-T standardization: VCEG inaugurated; H.261 standardized; H.262 standardized; H.263 standardized; H.264 standardized
Joint ISO/IEC and ITU-T teams: JVT established; JCT-VC established
In 1996, VCEG standardized H.263, a coding method for narrow-band applications with higher efficiency than H.261. Building on H.263, MPEG added several improvements to strengthen error resilience, and the resulting method was standardized as MPEG-4 in 1998.
Later, as HDTV spread, a new high-capacity storage medium, the BD, appeared. Furthermore, in 1999, international standards for third-generation mobile communications systems (3G) were established, making video transmission over mobile telephone networks possible. These developments in turn prompted studies aimed at standardizing new coding methods. In 2001, MPEG and VCEG established the Joint Video Team (JVT) to begin standardization of a coding method for applications ranging from video distribution services for mobile devices to 4K-resolution video signals. This method was standardized in 2003 as MPEG-4 AVC | ITU-T H.264 (hereinafter AVC), and it had twice the coding efficiency of MPEG-2 12)13). AVC has excellent coding performance at low bit rates, and it has been used in Japan in the One-Seg broadcasting service for mobile receivers. As described above, video coding methods have advanced hand in hand with video media and distribution networks.
In 2006, the new 4K/8K high-resolution video formats were internationally standardized 14) under the name Ultra High Definition Television (UHDTV). At the same time, with the growth of streaming*3 services such as YouTube and other new video distribution services, the strain on network capacity started to become an issue. As a result, in January 2010, MPEG and VCEG jointly decided to begin standardization of a new video coding format called HEVC, with the goal of doubling coding efficiency compared with AVC.
Standardization of HEVC is carried out by a joint working party of MPEG and VCEG called the Joint Collaborative Team on Video Coding (JCT-VC). The draft standard prepared by JCT-VC was approved by both organizations and standardized in 2013 as the first edition of ISO/IEC 23008-2 and Rec. ITU-T H.265 15)16). At MPEG, it is called MPEG-H Part 2 because ISO/IEC 23008 is named MPEG-H and video coding is Part 2, while at ITU-T it is called H.265. It is nevertheless commonly referred to as HEVC, the name used in the titles of both standard documents.
3. HEVC standardization activities
3.1 HEVC standardization procedure
The first JCT-VC meeting was held in April 2010. The first version of HEVC was finalized at the 12th meeting in January 2013. The draft standard issued after the 12th meeting was submitted to both ISO/IEC and ITU-T, which voted on and approved it. The first editions were issued as Rec. ITU-T H.265 and ISO/IEC 23008-2 in April and December 2013, respectively.
The first version standardizes the most generally used coding tool sets*4 (called “profiles”). It handles coding of signals with a 4:2:0*5 luminance and color-difference pixel structure and a signal bit depth of 8 or 10 bits. It specifies three profiles, called the Main Still Picture profile, the Main profile, and the Main 10 profile, for encoding, respectively, still images with 8-bit signals, moving images with 8-bit signals, and moving images with 10-bit signals.
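To make the difference between the 4:4:4, 4:2:2, and 4:2:0 pixel structures concrete, the following Python sketch counts the samples in one frame for each structure. It is only an illustration: the function names and the divisor table are assumptions made here, not anything defined in the HEVC standard.

```python
# Divisors applied to the width and height of each colour-difference plane.
# 4:4:4 keeps full resolution, 4:2:2 halves it horizontally,
# and 4:2:0 halves it both horizontally and vertically.
CHROMA_DIVISORS = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
}


def samples_per_frame(width, height, chroma_format):
    """Return the total number of samples (luminance plus two colour-difference planes)."""
    div_h, div_v = CHROMA_DIVISORS[chroma_format]
    luma = width * height
    chroma = 2 * (width // div_h) * (height // div_v)
    return luma + chroma


def raw_bits_per_frame(width, height, chroma_format, bit_depth):
    """Return the uncompressed size of one frame in bits."""
    return samples_per_frame(width, height, chroma_format) * bit_depth


if __name__ == "__main__":
    for fmt in CHROMA_DIVISORS:
        print(fmt, samples_per_frame(1920, 1080, fmt))
```

For an HDTV frame, 4:2:0 carries half as many samples as 4:4:4, which is one reason it is the structure chosen for the first-version profiles aimed at distribution.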
The main milestones in the standardization of HEVC are shown in Table 2. JCT-VC issued a call for methods to serve as the basis of the new coding method. At the first JCT-VC meeting, there were proposals from 27 organizations, including NHK. Based on five of these methods, the meeting settled on a set of powerful coding tools, the HEVC Test Model (HM), as the basis for the coding method, together with a working draft of the standard document. At each subsequent meeting, organizations proposing new technologies were required to show quantitative improvements relative to the HM in terms of both processing cost and coding performance, and to prepare revised documentation for the proposed method showing the differences from the preceding draft standard. The proposed technologies were evaluated on the basis of these two points and on the correctness and clarity of the revisions to the draft standard, and discussions were held on whether to adopt them. Adopted technologies were combined, implemented in the HM, and integrated into the standard document. At the following meeting, the revised HM and draft standard were used as the basis for proposing further improvements. Through these arrangements, the coding performance improved and the draft specification became more complete with each meeting. The same process is still being used to develop extensions to the standard.

*3 Services that receive video and audio data through a network and simultaneously play it back.
*4 Which define decoding functions that must be provided by the decoder for various uses.
*5 4:4:4 has color difference signals with the same number of pixels as luminance horizontally and vertically, 4:2:2 has half the number of pixels horizontally, and 4:2:0 has half the number of pixels horizontally and vertically.
NHK contributed to HEVC standardization from the first JCT-VC meeting onwards and submitted technology proposals focused on intra-prediction and transform technologies. NHK also proposed support for the 8K video format in HEVC and provided 8K test sequences.
3.2 Extensions to the HEVC standard
Work has continued on extensions since the first edition of the HEVC standard was completed in 2013. In 2014, the 2nd edition was standardized; it included extensions for professional-use signal formats, scalable coding, multi-viewpoint video, and 3D video coding. The signal format extensions are very important for broadcasting; they could also be called broadcasting extensions. As mentioned above, the profiles in the first edition encode signals with a 4:2:0 pixel structure and a bit depth of 8 or 10 bits. They cannot support the signals widely used in current video production, which have 4:2:2 or 4:4:4 structures and bit depths of 12 bits or more. The extensions in the second edition support such professional-use signals. They specify 22 profiles with different combinations of pixel structure, bit depth, and coding control for various uses. These profiles are expected to be used for transmitting raw footage, for video editing, and for other applications in broadcasting.
Other extensions were also included, some for graphical user interface (GUI) content, called “screen content”, and some specialized for computer graphics (CG), such as game screens. These are expected to be used in scenarios such as wireless displays. The 3rd edition was finalized at the February 2016 meeting.
Table 2: Major developments in standardization of HEVC
Date | Meeting | Main outcome
Apr. 2010 | 1st Meeting | 27 coding formats proposed by 27 organizations. Standard software decided based on 5 formats.
July 2010 | 2nd Meeting | Group established to evaluate performance of the standard software.
Oct. 2010 | 3rd Meeting | Standard software HM 1.0 and Specification Document 1.0 released.
Feb. 2012 | 8th Meeting | Committee draft issued with the Main profile and levels.
July 2012 | 10th Meeting | Draft international standard issued.
Oct. 2012 | 11th Meeting | Main 10 profile and Main Still Picture profile settled.
Jan. 2013 | 12th Meeting | Final draft international standard issued.
4. UHDTV broadcasting
4.1 UHDTV broadcasting video coding format standardization
In 2013, the Ministry of Internal Affairs and Communications published a roadmap for promoting 4K/8K, and in response, the Next-Generation Television Forum (NexTV-F) was formed to promote 4K/8K throughout Japan. Since then, government and industry have been working with NexTV-F to draw up a specification for the broadcasting service, develop devices, and perform other tasks. The Video Coding working group in the Digital Broadcasting System Development section of the Association of Radio Industries and Businesses (ARIB) is also working to standardize the video coding for broadcasting. It has created the ARIB STD-B32 standard, which specifies video coding, audio coding, and multiplexing and has adopted HEVC as the video coding method 17).
Table 3: Differences among coding formats
Item | MPEG-2 | MPEG-4 AVC/H.264 | MPEG-H HEVC
Supported format†1 (largest) | 1,080/60/P (HDTV) | 2,160/60/P (4K†2) | 4,320/120/P (8K)
Coded blocks | 16×16 | 16×16 | 8×8 to 64×64
Motion compensation prediction | 16×16, 16×8; 1/2 pixel-accuracy estimation; no motion vector estimation | 4×4 to 16×16; 1/4 pixel-accuracy estimation; estimation of motion vector by median of neighboring blocks | 8×4, 4×8 to 64×64; 1/4 pixel-accuracy estimation; optimized motion vector estimation from neighboring blocks and vector merge†5
Orthogonal transform | Real DCT (8×8) | Accurate integer DCT (8×8, 4×4) | Accurate integer DCT/DST, transform skip†3 (4×4 to 32×32)
Intra-image prediction | None | 9 modes for 4×4, 8×8†4; 4 modes for 16×16 | 35 modes for 4×4 to 64×64
Entropy coding | 2D VLC†9 | CAVLC†10 or CABAC†11 | CABAC
In-loop filter†6 | None | Deblocking filter†7 | Deblocking filter, pixel adaptive offset†8

†1 Samples in vertical direction/frame rate/scan method (interlaced scan (I) or progressive scan (P)).
†2 Extension to 8K is planned.
†3 A mode in which a transform is not used.
†4 Prediction based on vertical, horizontal, diagonal, and average values.
†5 Motion compensation estimation that reuses motion information from the neighboring blocks.
†6 A filter that is incorporated into the control loop when encoding.
†7 A filter that reduces block-shaped coding distortion caused by encoding in block units.
†8 A filter technology that increases image quality by performing offset processing for each pixel in a block.
†9 Variable Length Code (VLC): A type of variable-length coding technology that reduces the amount of code by assigning shorter codes to symbols that occur more often, based on a pre-computed statistical model.
†10 Context-Adaptive Variable Length Coding (CAVLC): A type of variable-length coding that achieves higher coding efficiency than VLC by assigning codes optimized on the basis of symbol occurrence rates in the input signal.
†11 Context-Adaptive Binary Arithmetic Coding (CABAC): A type of arithmetic coding. In contrast to CAVLC, occurrence rates of symbols expressed in binary are measured per bit to implement an optimal coding. Performance is better than CAVLC, but more processing is required.

Table 3 compares HEVC with the coding methods currently used for digital broadcasting. Like MPEG-2 and AVC, HEVC uses a hybrid coding scheme that divides the video signal into blocks, which are encoded using intra and inter prediction, orthogonal transformation, quantization, and entropy coding*6. In HEVC, these coding functions were extended and a variety of coding modes were added. By appropriately choosing the combination of coding modes, HEVC can achieve twice the compression ratio of AVC. On the other hand, encoding requires searching for a combination of coding modes optimized for the image, so as image resolution and data volume increase, it becomes increasingly difficult to do such processing in real time.

*6 A technique that reduces the average code length by assigning shorter codes to symbols that occur more frequently.
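The transform, quantization, and entropy coding stages of a hybrid coder can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how HEVC itself works: it uses a floating-point DCT and a Huffman code as stand-ins, whereas HEVC uses integer transforms and CABAC, and all function names and the quantization step are hypothetical.

```python
import heapq
import math
from collections import Counter


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block (floating point, illustration only)."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Uniform quantization: divide by the step size and round to an integer level."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code built from symbol counts."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries are (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


if __name__ == "__main__":
    # A hypothetical 4x4 prediction residual (original block minus its prediction).
    residual = [[12, 10, 8, 6], [11, 9, 7, 5], [10, 8, 6, 4], [9, 7, 5, 3]]
    levels = quantize(dct_2d(residual), step=4)
    flat = [v for row in levels for v in row]
    print(levels)
    print(huffman_code_lengths(flat))
```

After the transform, most of the energy of a smooth residual sits in a few low-frequency coefficients, so quantization produces many zeros and the entropy coder can give the zero level a very short code.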
HEVC encoder and decoder devices for 4K broadcasting were relatively easy to develop, and consumer devices have already been developed. In fact, 4K broadcasting services using this standard began in June 2014 using a Communications Satellite (CS). Commercial broadcasts began in March 2015. Test satellite broadcasting of a 4K/8K service using a Broadcast Satellite (BS) is planned to start in 2016. Accordingly, several constraints on HEVC for implementation have been specified in the operational guidelines (Annex of ARIB STD-B32).
4.2 Broadcasting services and video coding formats
In addition to 4K/8K content, UHDTV broadcasting service will reuse a large quantity of earlier video resources; as such, it must be able to broadcast HDTV content of current digital television broadcasting.
Table 4 lists the video formats that can be broadcast and the profiles and levels*7 of each format. The UHDTV broadcasting service targets higher quality than the current digital television service and employs the Main 10 profile. However, to maintain compatibility with earlier broadcasts, the Main profile is permitted for HDTV broadcasting service only (1,080/60/I and 1,080/60/P).
Hierarchical coding*8 has been adopted because conventional HDTV broadcast content is produced at a 60 Hz frame rate and the affordable 4K/8K displays are initially only expected to be capable of displaying at 60 Hz. In this way, 60 Hz receivers can receive services when broadcasting with 120 Hz frame rates begins.
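The idea of hierarchical coding in the time direction can be sketched as follows. This is a minimal illustration, assuming the simplest possible arrangement in which alternate frames of a 120 Hz sequence form the lower and upper layers; the function names are hypothetical, and the actual layer structure is defined by the standard and the operational guidelines.

```python
def split_temporal_layers(frames_120hz):
    """Split a 120 Hz frame sequence into a 60 Hz lower layer and an upper layer.

    In a real encoder, lower-layer frames must not use upper-layer frames
    as prediction references, so the lower layer is decodable by itself.
    """
    lower = frames_120hz[0::2]  # every other frame: playable as 60 Hz
    upper = frames_120hz[1::2]  # the frames in between
    return lower, upper


def merge_temporal_layers(lower, upper):
    """Interleave both layers again to rebuild the 120 Hz sequence."""
    merged = []
    for i, frame in enumerate(lower):
        merged.append(frame)
        if i < len(upper):
            merged.append(upper[i])
    return merged


if __name__ == "__main__":
    frames = list(range(8))  # frame indices standing in for pictures
    lower, upper = split_temporal_layers(frames)
    print("60 Hz receiver decodes:", lower)
    print("120 Hz receiver decodes:", merge_temporal_layers(lower, upper))
```

A 60 Hz receiver simply discards the upper layer, while a 120 Hz receiver decodes both, so a single broadcast stream serves both kinds of display.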
The ARIB video coding working group has conducted subjective evaluation tests to estimate the bit rates required for the new broadcasting service. The maximum bit rates shown in Table 4 were set on the basis of the bit rate estimates from these tests.
Table 4: Ultra-high-definition television broadcast video formats, profiles and levels
Parameter | 1,080/60/I | 1,080/60/P | 2,160/60/P | 2,160/120/P | 4,320/60/P | 4,320/120/P
Effective samples | 1,920×1,080 | 1,920×1,080 | 3,840×2,160 (4K) | 3,840×2,160 (4K) | 7,680×4,320 (8K) | 7,680×4,320 (8K)
Frame rate (Hz) | 29.97, 30 | 59.94, 60 | 59.94, 60 | 119.88, 120 | 59.94, 60 | 119.88, 120
Colorimetry†1 | Rec. ITU-R BT.709, IEC 61966-2-4 (xvYCC); Rec. ITU-R BT.2020
Color difference format | Y'C'BC'R 4:2:0 (all formats)
Pixel bit depth | 8 bit, 10 bit | 8 bit, 10 bit | 10 bit | 10 bit | 10 bit | 10 bit
Profile | Main/Main 10 | Main/Main 10 | Main 10 | Main 10 | Main 10 | Main 10
Level | 4.1 | 4.1 | 5.1 | 5.2 | 6.1 | 6.2
Tier†2 | Main Tier (all formats)
Max. bit rate | 20 Mbps | 20 Mbps | 40 Mbps | 50 Mbps | 120 Mbps | 150 Mbps
Hierarchical coding in time direction | - | - | - | Yes | - | Yes

†1 Specification for expressing color signals.
†2 A concept used to differentiate bit rates required by application. Two types, Main and High, are defined and the Main Tier is used for broadcasting.

*7 Definitions of the upper limits of parameters such as video resolution and bit rate.
*8 A coding format in which the encoded signal has a hierarchical structure, which allows a 60 Hz signal to be played back using only the lower layers of the signal, and a 120 Hz signal to be generated by decoding both the lower and upper layers.
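As a rough check on what the maximum bit rates in Table 4 imply, the following Python sketch computes the uncompressed bit rate of each UHDTV format and the compression ratio needed to fit within the maximum. It is an approximation under stated assumptions: nominal integer frame rates, 10-bit 4:2:0 sampling, active picture samples only, and 1 Mbps taken as 10^6 bit/s. The function names and the pairing of formats with bit rates as read from Table 4 are illustrative.

```python
# (format, width, height, frame rate in Hz, bit depth, max. bit rate in Mbps)
FORMATS = [
    ("2,160/60/P", 3840, 2160, 60, 10, 40),
    ("2,160/120/P", 3840, 2160, 120, 10, 50),
    ("4,320/60/P", 7680, 4320, 60, 10, 120),
    ("4,320/120/P", 7680, 4320, 120, 10, 150),
]


def raw_bit_rate_mbps(width, height, frame_rate, bit_depth):
    """Uncompressed bit rate of a 4:2:0 signal in Mbps."""
    # Luminance plus two colour-difference planes of one quarter the size each.
    samples = width * height * 3 // 2
    return samples * bit_depth * frame_rate / 1e6


def compression_ratio(width, height, frame_rate, bit_depth, max_mbps):
    """Ratio between the uncompressed bit rate and the broadcast maximum."""
    return raw_bit_rate_mbps(width, height, frame_rate, bit_depth) / max_mbps


if __name__ == "__main__":
    for name, w, h, fps, depth, max_mbps in FORMATS:
        raw = raw_bit_rate_mbps(w, h, fps, depth)
        ratio = compression_ratio(w, h, fps, depth, max_mbps)
        print(f"{name}: raw {raw:.0f} Mbps, ratio about {ratio:.0f}:1")
```

Under these assumptions, an 8K 60 Hz signal of roughly 30 Gbps has to be compressed by a factor of about 250 to fit in 120 Mbps, which illustrates why the coding efficiency of HEVC is essential for 8K broadcasting.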
5. Trends in next-generation coding
HEVC standardization work is nearly finalized, and both MPEG and VCEG have begun exploring various possibilities for a next-generation video coding method. MPEG held a workshop in October 2014 in Strasbourg to discuss the time frame, functionality, and performance required of a next-generation video coding standard.
This workshop focused on changes in viewing style. Watching video content on smartphones is especially popular. In Japan, it is becoming possible to view regular digital television broadcasts on smartphones, and at the same time, uploading and downloading of video by ordinary users has increased dramatically as a way of enjoying video content. Thanks to improvements in smartphone performance, anyone can now easily record high-resolution video, upload it to a server, and distribute it to viewers around the world. This creates strong demand for ever-increasing storage capacity on video servers, and it shows that compression performance for storage media is still an important issue for industry. Against this background, the following targets in terms of time frame, functionality, and performance are currently being explored for the next-generation coding standard.
(1) Standardization time frame
The fifth-generation mobile network (5G) is being standardized with a target of 2020. That year will also be approximately seven years after the first edition of HEVC was published, and 4K/8K is expected to have become popular by that time; hence, there will be demand for a new coding format. As such, a next-generation coding method is expected to be standardized around 2020.
(2) Video format
Since the resolution of mobile devices is increasing, it may be necessary to reconsider whether low-resolution formats still need to be supported. Currently, HDTV, UHDTV, and High Dynamic Range (HDR) video*9 are thought to be the most important.
(3) Real-time capability
Streaming services are becoming increasingly important, and real-time encoding is not necessarily required for these services. As such, a next-generation coding standard should allow a choice between a method with real-time capability and a higher-performance method with non-real-time encoding.
(4) Coding performance
For real-time applications, an improvement in coding performance of approximately 30% over HEVC is being studied. For non-real-time applications, an improvement of approximately 50% is expected.
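The performance targets in item (4) can be turned into bit rates with simple arithmetic. The sketch below assumes that a coding performance improvement of N% means an N% lower bit rate at equal picture quality, which is a common reading but is not stated explicitly in the text; the function name and the 120 Mbps reference value (the 4,320/60/P maximum in Table 4) are used only for illustration.

```python
def bit_rate_at_equal_quality(reference_mbps, reduction_percent):
    """Bit rate needed after a given percentage reduction relative to a reference codec."""
    return reference_mbps * (100 - reduction_percent) / 100


if __name__ == "__main__":
    hevc_mbps = 120  # illustrative reference bit rate for HEVC
    for label, gain in (("real-time target", 30), ("non-real-time target", 50)):
        print(label, bit_rate_at_equal_quality(hevc_mbps, gain), "Mbps")
```

Under this reading, the 50% target corresponds to the same doubling of coding efficiency that each previous generation aimed for, while the 30% target reflects the tighter complexity budget of real-time encoders.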
6. Future activities
In response to the proposed performance requirements, several extensions to HEVC implemented on top of the HM have been reported. They mainly consist of tools that are known to be effective but complex. As a result, they showed a little over 20% improvement in coding performance, but at ten times the complexity of the HM. This trade-off will continue to be discussed at subsequent meetings as a future issue. We also plan to continue studying next-generation coding techniques and searching for ways to improve coding efficiency.
The Tokyo Olympic and Paralympic Games will be held in 2020, and we expect 4K/8K satellite broadcasting services to have been deployed by that time as well. After that, we expect to work on 4K/8K digital terrestrial broadcasting services, and we foresee that a coding method that outperforms HEVC will be needed. NHK will therefore continue to research ever more efficient coding techniques and 4K/8K terrestrial broadcasting services.
*9 A video signal with a broader range of brightness, which makes it possible to reproduce brighter and darker areas more accurately.

References
1) Rec. ITU-R BT.2020, “Parameter Values for Ultra-high Definition Television Systems for Production and International Programme Exchange,” (2012)
2) Sujikai, Suzuki, Iguchi, Shimizu, Kamimura, Ichigaya: “Super Hi-Vision International Transmission Trials at IBC2008,” Broadcast Technology, Vol. 62, No. 3, pp. 90-97 (2009) (in Japanese)
3) Sujikai, Suzuki, Kojima, Shimizu, Hashimoto, Tanaka, Kimura, Toyoda, Matsuo, Nakajima, Iguchi, Okumura, Nakayama, Masuda, Osawa, Takahashi, Okawa, Shogen: “Live, multi-channel Super Hi-Vision Trial Broadcasts with the Kizuna Ultra-high-speed Internet Satellite,” Broadcast Technology, Vol. 62, No. 9, pp. 91-98 (2009) (in Japanese)
4) Nojiri, Iguchi, Noguchi, Fujii, Ogasawara: “Super Hi-Vision International Transmission Trials using the Global Research and Education IP Network,” Broadcast Technology, Vol. 64, No. 6, pp. 135-141 (2011) (in Japanese)
5) Kubota: “Super Hi-Vision: Public viewing initiatives for the London Olympics,” ITU Journal, Vol. 42, No. 11, pp. 52-56 (2012) (in Japanese)
6) Izumoto: “NHK 8K Super Hi-Vision Public Viewings of the World Cup,” ITU Journal, Vol. 44, No. 11, pp. 27-29 (2014) (in Japanese)
7) http://www.soumu.go.jp/johotsusintokei/whitepaper/ja/h25/html/nc112320.html
8) Rec. ITU-T H.261, “Video Codec for Audio Visual Services at p × 64 kbit/s” (1990)
9) ISO/IEC 11172-2, “Information Technology - Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to About 1.5 Mbit/s - Part 2: Video” (1991)
10) ISO/IEC 13818-2, “Information Technology - Generic Coding of Moving Pictures and Associated Audio Information - Part 2: Video” (1995)
11) Rec. ITU-T H.262, “Information Technology - Generic Coding of Moving Pictures and Associated Audio Information: Video” (1995)
12) ISO/IEC 14496-10, “Information Technology - Coding of Audio-visual Objects - Part 10: Advanced Video Coding” (2003)
13) Rec. ITU-T H.264, “Advanced Video Coding for Generic Audiovisual Services” (2003)
14) Rec. ITU-R BT.1769, “Parameter Values for an Expanded Hierarchy of LSDI Image Formats for Production and International Programme Exchange” (2006)
15) Rec. ITU-T H.265, “High Efficiency Video Coding” (2013)
16) ISO/IEC 23008-2, “Information Technology - High Efficiency Coding and Media Delivery in Heterogeneous Environments - Part 2: High Efficiency Video Coding” (2013)
17) ARIB: “Video coding, audio coding and multiplexing specifications for digital broadcasting,” ARIB STD-B32 (2015)