
Figure 3. Coding structure for H.264/AVC encoder for a macroblock [7].


Figure 1. Optimizing the codec in terms of complexity and robustness.

Figure 2. Profiles in H.264/AVC [1].

Current Video Coding Standards: H.264/AVC, Dirac, AVS China and VC-1

K. R. Rao, IEEE Fellow, Dept. of Electrical Engineering, University of Texas at Arlington, Arlington, Texas, USA

Do Nyeon Kim, Barun Technologies, Corp., Seoul, South Korea

[email protected] [email protected]

Abstract—The video coding standards H.264/AVC, Dirac, AVS China and VC-1 are presented. These are the latest standards, adopted by ITU-T/ISO-IEC, the BBC, the China standards organization and SMPTE respectively. Besides presenting these standards, research potential and projects (at both undergraduate and graduate levels) are emphasized. These are available in the database for research and projects in [18]. Web/ftp sites for accessing standards documents, software, test sequences, conformance bit streams, industry activities, etc. are provided.

Keywords- H.264/AVC; Dirac; AVS China; VC-1

I. INTRODUCTION

Residual image data is obtained by taking the pixel-by-pixel differences between the original data and the image reconstructed after lossy compression. For lossless compression, the residual is compressed separately using an appropriate lossless compression approach [72]. Work has been done on optimizing the codec by reducing the complexity or encoding time, improving the quality, or improving the robustness of the standard using algorithms for error concealment and error correction (Fig. 1).

MPEG-4 AVC/H.264 was developed for multimedia applications [1, 3, 5-13, 19]. It adopted advanced coding techniques such as multiple-reference frame prediction and context-based adaptive binary arithmetic coding (CABAC), and it provides high compression efficiency. It can thus compress video to 1.5~2 Mbps for standard definition (SD) and 6~8 Mbps for high definition (HD), saving storage space, channel bandwidth and frequency spectrum.

II. H.264/AVC

A. H.264 intra-frame encoding


Figure 4. H.264/MPEG-4 AVC decoder block diagram [1].

Figure 5. MB and sub-MB partitions for adaptive ME/MC prediction (seven block sizes). The MB partitions are 16×16, 16×8, 8×16 and 8×8; the sub-MB partitions are 8×8, 8×4, 4×8 and 4×4. The coded blocks with motion vectors are ordered in a raster-scan order.

H.264 (Figs. 2, 3 and 4) uses adaptive prediction of intra-coded macroblocks to reduce the large number of bits that coding the original input signal directly would require. To encode a block or macroblock in intra-coded mode, a prediction block is formed based on previously reconstructed blocks. For the luma samples, the prediction block may be formed for each 4 × 4 subblock, each 8 × 8 block, or for a 16 × 16 macroblock. One mode is selected from a total of 9 prediction modes for each 4 × 4 (similar to Fig. 7) and 8 × 8 luma block, 4 modes for a 16 × 16 luma block, and 4 modes for each chroma block. The residuals, i.e. the differences between the current block and its prediction under the best mode, are processed by the transform and quantization unit and then reconstructed by the inverse operations to serve as the reference for the next macroblock. The quantized coefficients are entropy coded to form the final bit stream.
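As a rough illustration of how a prediction block is formed from previously reconstructed neighbours, the sketch below builds three of the nine 4 × 4 luma candidates (vertical, horizontal and DC) and picks one. It is a minimal sketch under assumptions: the function names are hypothetical, only three modes are shown, and the selection uses the sum of absolute differences (SAD) as a simple stand-in rather than the R-D cost the encoder actually minimizes.

```python
from typing import Dict, List

Block = List[List[int]]


def predict_4x4(top: List[int], left: List[int]) -> Dict[str, Block]:
    """Build three candidate 4x4 prediction blocks from reconstructed neighbours.

    top  -- the 4 reconstructed samples directly above the block
    left -- the 4 reconstructed samples directly to the left of the block
    """
    vertical = [list(top) for _ in range(4)]        # copy the row above downwards
    horizontal = [[left[r]] * 4 for r in range(4)]  # copy the left column rightwards
    dc_value = (sum(top) + sum(left) + 4) >> 3      # rounded mean of the 8 neighbours
    dc = [[dc_value] * 4 for _ in range(4)]
    return {"vertical": vertical, "horizontal": horizontal, "dc": dc}


def sad(a: Block, b: Block) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_mode(block: Block, top: List[int], left: List[int]) -> str:
    """Return the name of the candidate mode with the smallest SAD (simplified criterion)."""
    candidates = predict_4x4(top, left)
    return min(candidates, key=lambda name: sad(block, candidates[name]))


if __name__ == "__main__":
    top_row = [10, 20, 30, 40]
    left_col = [12, 12, 12, 12]
    current = [[10, 20, 30, 40] for _ in range(4)]  # vertical stripes
    print(best_mode(current, top_row, left_col))
```

For a block of vertical stripes that continues the row above it, the vertical candidate reproduces the block exactly, so its residual is zero and it is the mode selected.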

The best prediction mode(s) are chosen using rate-distortion (R-D) optimization, which minimizes the cost

$$J(s, c, \mathrm{MODE} \mid QP) = D(s, c, \mathrm{MODE} \mid QP) + \lambda_{\mathrm{MODE}} \, R(s, c, \mathrm{MODE} \mid QP) \qquad (1)$$

The distortion D(s, c, MODE|QP) is measured as the sum of squared differences (SSD) between the original block s and the reconstructed block c, QP is the quantization parameter, MODE is the prediction mode, and λ_MODE is the Lagrange multiplier that weights the rate against the distortion. R(s, c, MODE|QP) is the number of bits for coding the block. The mode(s) with the minimum J(s, c, MODE|QP) are chosen as the prediction mode(s) of the macroblock.
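The mode decision of Eq. (1) can be sketched in a few lines. This is an illustrative sketch only, not the reference encoder: the function names are hypothetical, the factor multiplying the rate term is treated as a Lagrange multiplier passed in by the caller (`lagrange_multiplier`, an assumed name), and the reconstructed blocks and bit counts are supplied as inputs instead of being produced by a real transform, quantization and entropy coding chain.

```python
from typing import Dict, List, Tuple

Block = List[List[int]]


def ssd(original: Block, reconstructed: Block) -> int:
    """Distortion D: sum of squared differences between original s and reconstruction c."""
    return sum((x - y) ** 2
               for row_s, row_c in zip(original, reconstructed)
               for x, y in zip(row_s, row_c))


def rd_cost(distortion: float, rate_bits: float, lagrange_multiplier: float) -> float:
    """Cost J = D + multiplier * R for one candidate mode."""
    return distortion + lagrange_multiplier * rate_bits


def choose_mode(original: Block,
                candidates: Dict[str, Tuple[Block, int]],
                lagrange_multiplier: float) -> str:
    """Return the mode with minimum J.

    candidates maps a mode name to (reconstructed block, bits needed to code it).
    """
    return min(
        candidates,
        key=lambda mode: rd_cost(ssd(original, candidates[mode][0]),
                                 candidates[mode][1],
                                 lagrange_multiplier),
    )


if __name__ == "__main__":
    s = [[10, 10], [10, 10]]
    modes = {
        "A": ([[10, 10], [10, 10]], 40),  # exact reconstruction, expensive to code
        "B": ([[11, 10], [10, 9]], 10),   # small error, cheap to code
    }
    print(choose_mode(s, modes, 0.01), choose_mode(s, modes, 1.0))
```

A small multiplier favours the low-distortion mode and a large multiplier favours the low-rate mode, which is the trade-off the cost function is designed to express.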

Adaptive ME/MC prediction with seven block sizes for inter-frame prediction is shown in Fig. 5.

III. AVS CHINA [47-53]

A. Standards

AVS Part 1 (System) comprises a set of standards that convert single- or multi-channel audio and video bit streams into a single multiplexed stream for transmission and storage. It also defines the encoding syntax necessary for synchronous de-multiplexing of the audio and video bit streams.

The AVS System comprises two data streams, the program stream and the transport stream, each with its own applications. AVS Part 1 takes AVS Part 2 or AVS Part 7 video and AVS Part 3 audio as its elementary bit streams [46].

While H.264 specifies only video, it is meaningful to encode audio and multiplex it with the video bit stream. Hence this is a viable research area in which the best audio codec can be multiplexed with the latest video codecs such as AVS China, H.264/AVC, VC-1 and Dirac (Fig. 6). The ten parts of AVS China are listed in Table I.

B. Profiles

The Jizhun Profile (base profile or main profile) is defined in AVS Part 2 and is targeted mainly at digital video applications such as commercial broadcasting and storage media. It has moderate computational complexity. The Jiben Profile (basic profile or baseline profile) is defined in AVS Part 7 for mobile applications. The Shenzhan and Jiaqiang profiles are defined in AVS Part 2 for video surveillance and multimedia entertainment respectively.


Figure 7. Nine adaptive directional intra prediction modes including the DC mode for luminance in AVS-China Part 7 [53].

The nine adaptive directional intra prediction modes, including the DC mode, for luminance in AVS-China Part 7 are shown in Fig. 7 [53].

C. Inter-frame prediction (Part 7)

Similar to Fig. 5, the seven block sizes in inter-frame adaptive ME/MC prediction are 16×16, 16×8, 8×16, 8×8, 8×4, 4×8 and 4×4, chosen depending on the amount of information present within the macroblock. Motion is predicted with up to ¼-pixel accuracy. If the half_pixel_mv_flag is 1, the accuracy is limited to ½ pixel.

An eight-tap filter F1 = (−1, 4, −12, 41, 41, −12, 4, −1) and a four-tap filter F2 = (−1, 5, 5, −1) are used for horizontal and vertical interpolation respectively in the ½-pixel MV search, and averaging (linear interpolation) is used for ¼-pixel accuracy.
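To make the role of the two filters concrete, the following sketch applies F1 and F2 to a one-dimensional run of integer-pel samples and then averages for a quarter-pel value. It is a minimal sketch under stated assumptions: the function names are hypothetical, and the normalisation by the tap sum (64 for F1, 8 for F2), the rounding, the 8-bit clipping and the edge-sample repetition are illustrative choices, not taken from the AVS Part 7 text.

```python
from typing import Sequence

F1 = (-1, 4, -12, 41, 41, -12, 4, -1)  # eight-tap filter from the text, taps sum to 64
F2 = (-1, 5, 5, -1)                     # four-tap filter from the text, taps sum to 8


def half_pel(samples: Sequence[int], pos: int, taps: Sequence[int]) -> int:
    """Interpolate the half-pel value between samples[pos] and samples[pos + 1].

    Edge samples are repeated when the filter window runs past the ends
    (assumption); the result is divided by the tap sum with rounding and
    clipped to the 8-bit range (assumption).
    """
    half = len(taps) // 2
    acc = 0
    for k, tap in enumerate(taps):
        idx = min(max(pos - half + 1 + k, 0), len(samples) - 1)
        acc += tap * samples[idx]
    norm = sum(taps)
    value = (acc + norm // 2) // norm
    return min(max(value, 0), 255)


def quarter_pel(a: int, b: int) -> int:
    """Quarter-pel value as the rounded average of two neighbouring values."""
    return (a + b + 1) >> 1


if __name__ == "__main__":
    ramp = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
    h = half_pel(ramp, 4, F1)           # between the samples 40 and 50
    print(h, half_pel(ramp, 4, F2), quarter_pel(ramp[4], h))
```

On a linear ramp both filters return the midpoint of the two central samples, which is a quick sanity check that the taps are centred correctly.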

Figure 6. Multiplexing of audio/video and lip sync.

TABLE I. TEN PARTS OF AVS CHINA STANDARD FAMILY [46]

AVS       Contents
Part 1    System for broadcasting
Part 2    SD/HD video
Part 3    Audio
Part 4    Conformance test
Part 5    Reference software
Part 6    Digital rights management
Part 7    Mobility video
Part 8    System over IP
Part 9    File format
Part 10   Mobile speech and audio coding

IV. SMPTE VC-1 (WINDOWS MEDIA VIDEO 9)

VC-1 [24-27] is an informal name for the SMPTE 421M video codec. The standard was initially developed by Microsoft as Windows Media Video 9 (WMV-9). WMV-9 supports progressive video and is mainly used for online video services. VC-1 extends WMV-9 and adds features necessary for broadcast services, such as interlace support. It is a supported standard for Blu-ray Discs and Windows Media Video; the high-definition Blu-ray Disc format has mandated MPEG-2, H.264 and VC-1 as its video compression formats. VC-1 is compared with H.264 in Fig. 8.

V. DIRAC

Dirac [28-45] is a family of video codecs spanning applications from mobile to UHDTV and film and video post-production. For low-bit-rate applications such as the Internet, Dirac can be thought of as functionally similar to H.264 (Fig. 8), offering similar compression performance. For high-quality compression in production, Dirac is functionally similar to JPEG2000 [54-69]. Dirac is a royalty-free, open technology that is simple and low cost. Dirac is a hybrid motion-compensated video codec, whereas Dirac Pro (standardized as SMPTE VC-2) uses only intra-frame coding for professional or production applications.

In the Dirac codec, image motion is tracked and the motion information is used to make a prediction of a later frame. A transform is applied to the prediction error between the current frame and the motion-compensated previous frame, and the transform coefficients are quantized and entropy coded (Figs. 9 and 10). Temporal redundancy is removed by motion estimation and motion compensation, and spatial redundancy by the discrete wavelet transform. Dirac uses a more flexible and efficient form of entropy coding called arithmetic coding, which packs the bits efficiently into the bit stream [28, 44]. The two-dimensional discrete wavelet transform gives Dirac the flexibility to operate at a range of resolutions, because wavelets operate on the entire picture at once rather than on small areas at a time. In Dirac, the discrete wavelet transform plays the same role as the DCT in MPEG-2, de-correlating data in a roughly frequency-sensitive way, while preserving fine details better than block-based transforms [42]. An experiment compared the encoding time taken by Dirac and H.264/MPEG-4 for QCIF, CIF and SD sequences. The simplicity of the Dirac encoder is evident, as its encoding speed was much higher than that of H.264/AVC [42].

[Figure 8. VC-1 compared with H.264.
VC-1 only: 8x8, 4x8, 8x4, 4x4 adaptive block size; frequency-independent dequantization scaling; VLC-based entropy coding; 4-tap bicubic filters for MC; relatively simple loop filter; overlap intra filtering; range reduction/expansion; resolution reduction/expansion.
H.264 only: 8x8 and 4x4 adaptive block size; frequency-dependent dequantization matrix; CABAC or VLC; long filters for MC; complex loop filter; spatial intra prediction; multi-picture arbitrary-order referencing; intra PCM.
Common to both: block motion; 16-bit integer transforms; bit-exact spec; fading prediction; loop filter.]

Figure 9. Dirac encoder architecture [44, 45].

Figure 10. Dirac decoder architecture [42].

[Plot for Fig. 11: Compression ratio vs Bitrate at CBR (QCIF). x-axis: Bitrate (k bits per second); y-axis: Compression ratio; curves: H.264, Dirac.]

[Plot for Fig. 12: SSIM vs Bitrate at CBR (QCIF). x-axis: Bitrate (k bits per second); y-axis: SSIM; curves: H.264, Dirac.]
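The whole-picture, multi-resolution behaviour of the two-dimensional discrete wavelet transform described in this section can be seen in a single decomposition level. The sketch below is only illustrative: it swaps in the Haar wavelet because it is the simplest possible filter pair, whereas Dirac uses other wavelet filters, and all function names are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def haar_1d(signal: List[float]) -> Tuple[List[float], List[float]]:
    """Split an even-length signal into low-pass averages and high-pass differences."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def transpose(m: Matrix) -> Matrix:
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def filter_columns(half: Matrix) -> Tuple[Matrix, Matrix]:
    """Apply haar_1d down every column; return the vertically low and high halves."""
    lows, highs = [], []
    for col in transpose(half):
        low, high = haar_1d(col)
        lows.append(low)
        highs.append(high)
    return transpose(lows), transpose(highs)


def haar_2d(image: Matrix) -> Tuple[Matrix, Matrix, Matrix, Matrix]:
    """One decomposition level of an image with even width and height.

    Returns four quarter-size sub-bands, named by (horizontal, vertical) filtering:
    ll = low/low, lh = low/high, hl = high/low, hh = high/high.
    """
    row_low, row_high = [], []
    for row in image:  # horizontal filtering of every row
        low, high = haar_1d(row)
        row_low.append(low)
        row_high.append(high)
    ll, lh = filter_columns(row_low)
    hl, hh = filter_columns(row_high)
    return ll, lh, hl, hh


if __name__ == "__main__":
    flat = [[8.0] * 4 for _ in range(4)]
    ll, lh, hl, hh = haar_2d(flat)
    print(ll, hh)
```

The ll sub-band is a half-resolution version of the picture, so repeating the decomposition on it yields the range of resolutions mentioned above; for a flat picture every detail sub-band is zero, which is where the compression gain comes from.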

VI. SIMULATION RESULTS

The performance of H.264 and AVS China was compared by encoding several test sequences at different bit rates; the results are shown in Figs. 14 through 17. Test sequences in HD (1280×720) and standard definition (SD) (720×480) are used for evaluation. The two methods are very close and comparable in peak signal-to-noise ratio (PSNR).
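The two objective measures plotted for this comparison, MSE and PSNR, are related by a simple formula. The sketch below assumes 8-bit samples (peak value 255) and frames stored as lists of rows; the function names are hypothetical and the code is an illustration, not the measurement tool used for the reported results.

```python
import math
from typing import List

Frame = List[List[int]]


def mse(reference: Frame, decoded: Frame) -> float:
    """Mean squared error between two equally sized frames."""
    total = 0
    count = 0
    for ref_row, dec_row in zip(reference, decoded):
        for r, d in zip(ref_row, dec_row):
            total += (r - d) ** 2
            count += 1
    return total / count


def psnr(reference: Frame, decoded: Frame, peak: int = 255) -> float:
    """PSNR in dB, 10*log10(peak^2 / MSE); infinite when the frames are identical."""
    error = mse(reference, decoded)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / error)


if __name__ == "__main__":
    ref = [[100, 100], [100, 100]]
    dec = [[101, 99], [100, 100]]
    print(mse(ref, dec), round(psnr(ref, dec), 2))
```

Because PSNR is a logarithmic function of MSE, halving the MSE raises the PSNR by about 3 dB, which is why the PSNR and MSE curves for the same sequence carry the same information on different scales.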

Objective test methods attempt to quantify the error between a reference sequence and the sequence decoded from an encoded bit stream. To keep the comparison fair, each test sequence must be encoded by both codecs at the same bit rate. Since the latest version of Dirac includes a constant bit rate (CBR) mode, the performance of Dirac and H.264 was compared by encoding several test sequences at different bit rates. By using the CBR mode within H.264, we can ensure that H.264 encodes at the same bit rate as Dirac.

Objective tests are divided into three sections, namely (i) compression, (ii) structural similarity index (SSIM), and (iii) peak signal-to-noise ratio (PSNR). The test sequences “Miss-America” QCIF (176×144), “Stefan” CIF (352×288) and “Susie” standard-definition (SD) (720×480) are used for evaluation (Figs. 11, 12 and 13). The two methods are very close and comparable in compression, PSNR and SSIM. Also, Dirac achieves a significant improvement in encoding time compared to H.264 for all the test sequences [42].
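The compression measure in (i) can be read as the ratio of the uncompressed sequence size to the size of the encoded bit stream. The sketch below works this out for a QCIF sequence; it assumes 8-bit YUV 4:2:0 sampling (1.5 bytes per pixel), and the frame count, frame rate and compressed size in the example are made-up illustration values, not measurements from [42]. The function names are hypothetical.

```python
def raw_frame_bytes(width: int, height: int) -> int:
    """Bytes in one uncompressed 8-bit YUV 4:2:0 frame (luma plus two quarter-size chroma planes)."""
    return width * height * 3 // 2


def compression_ratio(width: int, height: int, frames: int, compressed_bytes: int) -> float:
    """Uncompressed size of the sequence divided by its compressed size."""
    return raw_frame_bytes(width, height) * frames / compressed_bytes


def average_bitrate_kbps(compressed_bytes: int, frames: int, frame_rate: float) -> float:
    """Average bit rate in kbit/s of `frames` frames played at `frame_rate` frames per second."""
    duration_s = frames / frame_rate
    return compressed_bytes * 8 / duration_s / 1000


if __name__ == "__main__":
    # Illustration values only: 150 QCIF frames at 30 frames/s, 570240 compressed bytes.
    print(raw_frame_bytes(176, 144))
    print(compression_ratio(176, 144, 150, 570240))
    print(average_bitrate_kbps(570240, 150, 30.0))
```

At a fixed frame size and frame rate the compression ratio is inversely proportional to the bit rate, so two codecs run in CBR mode at the same target rate should show nearly the same ratio, with differences reflecting how closely each rate control hits its target.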

Figure 11. Compression ratio comparison of Dirac and H.264 for “Miss-America” QCIF sequence [42].

Figure 12. SSIM comparison of Dirac and H.264 for “Miss-America” QCIF sequence [42].

VII. CONCLUSIONS

Video coding standards: H.264/AVC, DIRAC, AVS China and VC-1 are presented. Performance comparison of these standards using different test sequences is presented. Their functionalities are summarized in Tables II and III. In general H.264 performs better compared to Dirac, AVS China and VC-1, but at the cost of additional complexity.

REFERENCES

H.264/AVC

[1] A. Puri, X. Chen and A. Luthra, “Video coding using the H.264/MPEG-4 AVC compression standard”, Signal Processing: Image Communication, vol. 19, pp. 793-849, Oct. 2004.


[Plot for Fig. 13: PSNR vs Bitrate at CBR (QCIF). x-axis: Bitrate (k bits per second); y-axis: PSNR (in dB); curves: H.264, Dirac.]

[Plot for Fig. 14: PSNR vs Bitrate for a HDTV sequence. x-axis: Bitrate (M bits per second); y-axis: PSNR (in dB); curves: H.264-High, AVS-Jizhun, Dirac.]

[Plot for Fig. 15: MSE vs Bitrate for a HDTV sequence. x-axis: Bitrate (M bits per second); y-axis: MSE; curves: H.264-High, AVS-Jizhun, Dirac.]

[Plot for Fig. 16: PSNR vs Bitrate for a SDTV sequence. x-axis: Bitrate (M bits per second); y-axis: PSNR (in dB); curves: H.264-High, AVS-Jizhun.]

[Plot for Fig. 17: MSE vs Bitrate for a SDTV sequence. x-axis: Bitrate (M bits per second); y-axis: MSE; curves: H.264-High, AVS-Jizhun.]

Figure 13. PSNR (peak signal-to-noise ratio) comparison of Dirac and H.264 for “Miss-America” QCIF sequence [42].

Figure 14. Bitrate vs. PSNR for Harbour – HDTV sequence (1280×720p). AVS Jizhun Profile is a main profile.

Figure 15. Bitrate vs. MSE for Harbour – HDTV sequence (1280×720p).

Figure 16. Bitrate vs. PSNR for Bus – SDTV sequence (720×480i).

Figure 17. Bitrate vs. MSE for Bus – SDTV sequence (720×480i).

[2] H.264 AVC JM software: http://iphome.hhi.de/suehring/tml/

[3] D. Kumar, P. Shastry and A. Basu, “Overview of the H.264 / AVC”, 8th Texas Instruments Developer Conference India, 30 Nov – 1 Dec 2005, Bangalore.

[4] H.264 encoder and decoder: http://www.adalta.it/Pages/407/266881_266881.jpg

[5] “H.264 video compression standard”, White paper, Axis communications.

[6] R. Schäfer, T. Wiegand and H. Schwarz, “The emerging H.264/AVC standard”, EBU Technical Review, Jan. 2003.

[7] T.Wiegand, et al “Overview of the H.264/AVC video coding standard”, IEEE Trans. CSVT, vol.13, pp 560–576, July 2003.

[8] M.Fieldler, “Implementation of basic H.264/AVC decoder”, seminar paper at Chemnitz University of Technology, June 2004

[9] MPEG-4: ISO/IEC JTC1/SC29 14496-10: Information technology – Coding of audio-visual objects - Part 10: Advanced Video Coding, ISO/IEC, 2005.

[10] Advanced Video Coding for Generic Audiovisual Services, ITU-T Rec. H.264/ISO/IEC 14496-10, Mar.2005.

[11] S.K.Kwon, A.Tamhankar and K.R.Rao, “Overview of H.264 / MPEG-4 Part 10” J. Visual Communication and Image Representation, vol. 17, pp.186–216, April 2006.

[12] D. Marpe, T. Wiegand and G. J. Sullivan, “The H.264/MPEG-4 AVC standard and its applications”, IEEE Communications Magazine, vol. 44, pp. 134–143, Aug. 2006.


[13] T. Wiegand and G. J. Sullivan, “The H.264 video coding standard”, IEEE Signal Processing Magazine, vol. 24, pp. 148–153, March 2007.

[14] Z. Wang, et al “Image quality assessment: From error visibility to structural similarity,” IEEE Trans. on Image Processing, vol. 13, pp. 600-612, Apr. 2004. http://www.ece.uwaterloo.ca/~z70wang/

[15] H. Jia and L. Zhang, “Directional diamond search pattern for fast block motion estimation”, IEE Electronics Letters, vol. 39, No. 22, pp. 1581-1583, 30th Oct. 2003.

[16] Video test sequences (YUV 4:2:0): http://trace.eas.asu.edu/yuv/index.html

[17] Video test sequences ITU601: http://www.cipr.rpi.edu/resource/sequences/itu601.html

[18] K.R. Rao, Multimedia Processing, Course Website, UT Arlington: http://ee.uta.edu/Dip/Courses/EE5359/index.html

[19] I. Richardson, H.264 Advanced Video Compression Standard, II Edition, Hoboken, NJ: Wiley, 2010.

[20] Y.Q. Shi and H. Sun, “Image and video compression for multimedia engineering”, Boca Raton: CRC Press, II Edition, (Chapter on H. 264), 2008.

[21] B. Furht and S.A. Ahson, “Handbook of mobile broadcasting, DVB-H, DMB, ISDB-T and MEDIAFLO,” Boca Raton, FL: CRC Press, 2008 (H.264 related chapters).

MPEG AND H.26X SERIES

[22] <http://en.wikipedia.org/wiki/MPEG>

[22a] V. Vijaykumar and K.R. Rao, “Low complexity H.264 to VC-1 transcoder”, J. of Real Time Image Processing (under review).

[22b] V.S. Kolkeri, J. H. Lee and K. R. Rao, “Error concealment techniques in H.264/AVC for wireless video transmission in mobile networks”, International Journal in Image Processing (under review).

[22c] K.R. Rao, A. Urs and S. Patil, “Comparison of 8 × 8 integer DCTs used in H.264, AVS-CHINA and VC-1 video codecs”,CMIC 2011, 4-7 Jan. 2011, Chiang Mai, Thailand.

[22d] D. Han et al, “ Low complexity H.264 encoder using machine learning”, IEEE SPA 2010, PP. 40-43, Poznan, Poland, Sept. 2010.

[22e] K.V.S Swaroop and K.R. Rao, “ Performance analysis and comparison of JM 15.1 and Intel . IPP H.264 encoder and decoder”, 42nd South Eastern Symp. on System Theory, pp. 371-375, Tyler, TX, March 2010.

[22f] S.-W. Lee and C.-C.J. Kuo, “ H.264/AVC entropy decoder complexity analysis and its applications”, J. VCIR, vol.22, pp. 61-72, Jan. 2011.

[22g] T. Wiegand and G.J. Sullivan, “The picturephone is here. Really,” IEEE Spectrum, vol. 48, pp. 50-54, Sept. 2011.

HEVC

[23] G.J. Sullivan and J.-R. Ohm, “Recent developments in standardization of high efficiency video coding (HEVC),” SPIE Optics + Photonics, Applications of Digital Image Processing XXXIII, vol. 7798, paper 7798-3, San Diego, CA, Aug. 2010.

[23a] IEEE Trans. on CSVT, vol. 20, Special section on high efficiency video coding (several papers), Dec. 2010.

[23b] M. Karczewicz et al, “A hybrid video coder based on extended macroblock sizes, improved interpolation and flexible motion representation”, IEEE Trans. CSVT, vol. 20, pp. 1698-1708, Dec. 2010.

[23c] S. Jeong et al, “High efficiency video coding for entertainment quality”, ETRI Journal, vol. 33, pp. 145-154, April 2011.

[23d] Asai et al, “New video coding scheme optimized for high-resolution video sources”, IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 7, pp. 1290-1297, Nov. 2011.

[23e] http://www.h265.net/ has info on developments in HEVC NGVC – Next generation video coding.

[23f] JVT KTA Reference Software: http://iphome.hhi.de/suehring/tml/download/KTA

[23g] IEEE Trans. on CSVT, vol. 20, Special section on high efficiency video coding (several papers), Dec. 2010.

[23h] Z. Ma and A. Segall, “Low resolution decoding for high-efficiency video coding”, IASTED SIP-2011, Dallas, TX, Dec. 2011.

[23i] T. Wiegand, B. Bross, W.-J. Han, J.-R. Ohm, and G. J. Sullivan, WD3: Working Draft 3 of High-Efficiency Video Coding, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T VCEG and ISO/IEC MPEG, Doc. JCTVC-E603, Geneva, CH, March 2011.

[23j] Y. Ye and M. Karczewicz, “Improved H.264 intra coding based on bi-directional intra prediction, directional transform, and adaptive coefficient scanning,” IEEE Int’l Conf. Image Process.’08 (ICIP08), San Diego, U.S.A., Oct. 2008.

[23k] IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 7, Nov. 2011 (several papers on HEVC), Introduction to the Issue on Emerging Technologies for Video Compression.

[23l] R. Joshi, Y.A. Reznik and M. Karczewicz, “Efficient large size transforms for high performance video coding”, Proc. SPIE, vol. 7798, pp. , San Diego, CA, Aug. 2010.

[23m] “Special issue on emerging research and standards in next generation video coding”, IEEE Trans. CSVT, tentative publication date Dec. 2012.

VC-1

[24] VC-1 Compressed Video Bitstream Format and Decoding Process, SMPTE 421M-2006, SMPTE Standard, 2006.

[25] S. Srinivasan and S. L. Regunathan, “An overview of VC-1,” Proc. SPIE, vol. 5950, pp. 720–728, 2005.

[26] Microsoft Windows Media: http://www.microsoft.com/windows/windowsmedia

[27] H. Kalva and J.-B. Lee, The VC-1 and H.264 video compression standards for broadband video services, Springer, 2008.

DIRAC

[28] K. Onthriar, K. K. Loo and Z. Xue, “Performance comparison of emerging Dirac video codec with H.264/AVC,” IEEE Int’l Conf. on Digital Telecommunications, ICDT 2006, vol. 6, Page: 22, Issue: 29-31, Aug. 2006.

[29] T. Davies, “The Dirac Algorithm”: http://dirac.sourceforge.net/documentation/algorithm/, 2008.

[30] M. Tun and W. A. C. Fernando, “An error-resilient algorithm based on partitioning of the wavelet transform coefficients for a Dirac video codec,” IEEE Tenth International Conf. on Information Visualization, IV’06, pp.615–620, July 2006.

[31] Daubechies wavelet: http://en.wikipedia.org/wiki/Daubechies_wavelet

[32] Daubechies wavelet filter design: http://cnx.org/content/m11159/latest/

[33] Vorbis: http://www.vorbis.com/

[34] T. Borer, “Dirac coding: Tutorial & Implementation”, EBU Networked Media Exchange seminar, June 2009.

[35] Dirac software and source code: http://diracvideo.org/download/dirac-research/

[36] Dirac video codec – A programmer's guide: http://dirac.sourceforge.net/documentation/code/programmers_guide/toc.htm

[37] Dirac Pro: http://www.bbc.co.uk/rd/projects/dirac/diracpro.shtml

[38] T. Davies, “A modified rate-distortion optimization strategy for hybrid wavelet video coding,” ICASSP 2006, vol. 2, pp. 909–912, May 2006.

[39] M. Tun, K.K. Loo and J. Cosmas, “Semi-hierarchical motion estimation for the Dirac video codec,” 2008 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting, pp. 1–6, March 31-April 2, 2008.


[40] H. Eeckhaut et al., “Speeding up Dirac’s entropy coder”, Proc. 5th WSEAS Intl. Conf. on Multimedia, Internet and Video Technologies, pp. 120–125, Greece, Aug. 2005.

[41] The Dirac web page and developer support: http://dirac.sourceforge.net

[42] A. Ravi and K.R. Rao, “Performance analysis and comparison of the Dirac video codec with H.264 / MPEG-4 Part 10 AVC,” IJWMIP, vol. 4, no. 4, pp. 635-654, 2011.

[43] BBC Research on Dirac: http://www.bbc.co.uk/rd/projects/dirac/index.shtml

[44] T. Borer and T. Davies, “Dirac video compression using open technology,” BBC EBU Technical Review, July 2005.

[45] C. Gargour et al., “A short introduction to wavelets and their applications,” IEEE Circuits and Systems Magazine, vol. 9, pp. 57–68, II Quarter, 2009.

[45a] A. Ravi and K.R. Rao, “Performance analysis and comparison of the Dirac video codec with H.264/ MPEG- 4, Part 10,” for the book Advances in reasoning-based image processing, analysis and intelligent systems: Conventional and intelligent paradigms,” 2011.

[45b] A. Urs and K.R. Rao “Multiplexing/de-multiplexing Dirac video with AAC audio bit stream”, TELSIKS 2011, Nis, Serbia, 5-8 Oct. 2011.

AVS CHINA

[46] GB/T 20090.1 Information technology - Advanced coding of audio and video – Part 1: System, Chinese AVS standard.

[47] L. Yu et al., “An Overview of AVS-Video: tools, performance and complexity”, Visual Communications and Image Processing 2005, Proc. of SPIE, vol. 5960, pp.596021, July 31, 2006.

[48] L. Yu et al., “An area-efficient VLSI architecture for AVS intra frame encoder” Visual Communications and Image Processing 2007, Proc. of SPIE-IS & T Electronic Imaging, SPIE vol. 6508, pp. 650822, Jan. 29, 2007.

[49] W. Gao et al., “AVS – The Chinese next-generation video coding standard” NAB, Las Vegas, 2004.

[50] J. Wang et al., “An AVS-to-MPEG2 transcoding system” Proceedings of 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, Hong Kong, pp. 302-305, Oct. 20-22, 2004.

[51] X. Wang et al., “Performance comparison of AVS and H.264/AVC video coding standards” J. Comput. Sci. & Technol., vol.21, No.3, pp.310-314, May 2006.

[52] B. Tang et al., “AVS encoder performance and complexity analysis based on mobile video communication”,WRI International conference on Communications and Mobile Computing, CMC‘09, vol. 3, pp. 102–107, 6-8 Jan. 2009.

[53] L. Yu et al., “Overview of AVS video coding standards,” Signal Processing: Image Communication, vol. 24, pp. 263–276, April 2009.

[53a] D. Sahana and K.R. Rao, “A study on AVS-M standard”, Advanced Computational Technologies published by the Romanian Academy Publishing House, 2011.

[53b] S. Swaminathan and K.R. Rao, “Multiplexing and demultiplexing of AVS CHINA video with AAC audio,” TELSIKS 2011, Nis, Serbia, 5-8 Oct. 2011.

JPEG2000

[54] D. T. Lee, “JPEG 2000: Retrospective and new developments,” Proc. IEEE, vol. 93, pp. 32–41, Jan. 2005.

[55] P. Schelkens, A. Skodras and T. Ebrahimi, “The JPEG 2000 suite”, Hoboken, NJ: Wiley, 2009.

[56] M. S. Zhong and Z. M. Ma, “JPEG 2000 based scalable reconstruction of image local regions”, IEEE ISIMP 2001, Hong Kong, May 2001.

[57] C. Christopoulous, A. Skodras and T. Ebrahimi, “The JPEG 2000 still image coding system: An overview,” IEEE Trans. on Consumer Electronics, vol. 46, pp. 1103–1127, Nov. 2000.

[58] M. D. Adams, “The JPEG-2000 still image compression standard,” JPEG Tutorial download from http://www.ece.uvic.ca/~mdadams/jasper (also software)

[59] A. Skodras, C. Christopoulous, and T. Ebrahimi, “JPEG-2000: The upcoming still image compression standard,” Pattern Recognition Letters, vol. 25, pp. 1337–1345, 2001.

[60] T. Fukuhara et al, “Motion-JPEG2000 standardization and target market,” IEEE ICIP, vol. 2, pp. 57–60, 2000.


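TABLE II. COMPARISON OF VARIOUS VIDEO COMPRESSION STANDARDS

Standards compared: MPEG-4 AVC (H.264); SMPTE VC-1 (Windows Media Video 9); Dirac; DiracPRO (SMPTE VC-2); AVS China Part 2; AVS China Part 7 (AVS-Mobile).

Intra prediction
– MPEG-4 AVC (H.264): 4×4 spatial, 16×16 spatial, I-PCM
– SMPTE VC-1: Frequency domain coefficient
– Dirac: 4×4 spatial
– DiracPRO: 4×4 spatial (forward, backward)
– AVS China Part 2: 8×8 block based intra prediction
– AVS China Part 7: Intra_4×4 (4×4 spatial), direct intra prediction

Picture coding type
– MPEG-4 AVC (H.264): Frame, Field, Picture AFF, MB AFF
– SMPTE VC-1: Frame, Field, Picture AFF, MB AFF
– Dirac: Frame
– DiracPRO: Intra – Frame, Field (Interlace, Progressive)
– AVS China Part 2: Frame
– AVS China Part 7: Frame

Motion compensation block size
– MPEG-4 AVC (H.264): 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, 4×4 (seven variable sizes)
– SMPTE VC-1: 16×16, 8×8
– Dirac: 4×4
– DiracPRO: N/A
– AVS China Part 2: 16×16, 16×8, 8×16, 8×8
– AVS China Part 7: 16×16, 16×8, 8×16, 8×8, 8×4, 4×8

Motion vector precision
– MPEG-4 AVC (H.264): Full pel, half pel, quarter pel
– SMPTE VC-1: Full pel, half pel, quarter pel
– Dirac: 1/8 pel
– DiracPRO: N/A
– AVS China Part 2: 1/4 pel
– AVS China Part 7: 1/4 pel

P frame type
– MPEG-4 AVC (H.264): Single reference, multiple reference
– SMPTE VC-1: Single reference, intensity compensation
– Dirac: Single reference, multiple reference
– DiracPRO: No P frames
– AVS China Part 2: Single and multiple reference (maximum of 2 reference frames)
– AVS China Part 7: Single and multiple reference (maximum of 2 reference frames)

B frame type
– MPEG-4 AVC (H.264): One reference each way, multiple reference, direct & spatial direct weighted prediction
– SMPTE VC-1: One reference each way
– Dirac: One reference each way, multiple reference
– DiracPRO: No B frames
– AVS China Part 2: One reference each way, multiple reference, direct and symmetrical mode
– AVS China Part 7: No B frames

In-loop filters
– MPEG-4 AVC (H.264): De-blocking
– SMPTE VC-1: De-blocking, overlap transform
– Dirac: None
– DiracPRO: None
– AVS China Part 2: De-blocking filter
– AVS China Part 7: De-blocking filter

Entropy coding
– MPEG-4 AVC (H.264): CAVLC, CABAC
– SMPTE VC-1: Adaptive VLC
– Dirac: Arithmetic coding
– DiracPRO: Context based adaptive binary arithmetic coding, exponential Golomb coding
– AVS China Part 2: 2D variable length coding
– AVS China Part 7: Context based adaptive 2D variable length coding

Transform
– MPEG-4 AVC (H.264): Main: 4×4 integer DCT; High: 4×4 & 8×8 integer DCTs
– SMPTE VC-1: 4×4, 8×8, 8×4 & 4×8 integer DCTs
– Dirac: 4×4 wavelet transform
– DiracPRO: 4×4 wavelet transform
– AVS China Part 2: 8×8 integer DCT
– AVS China Part 7: 4×4 integer DCT

Other
– MPEG-4 AVC (H.264): Quantization scaling matrices
– SMPTE VC-1: Range reduction, in-stream post-processing control
– Dirac: Quantization scaling matrices
– DiracPRO: Quantization scaling matrices
– AVS China Part 2: Quantization scaling matrices
– AVS China Part 7: Quantization scaling matrices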

[61] M. Rabbani and R. Joshi, “ An overview of the JPEG 2000 still image compression standard,” Signal Processing: Image Communication, vol. 17, pp. 3–48, Jan. 2002.

[62] J. Hunter and M. Wylie, “JPEG2000 Image Compression: A real time processing challenge,” Advanced Imaging, vol. 18, pp.14–17, April 2003.

[63] D.S. Taubman and M.W. Marcellin, “JPEG 2000: Image compression fundamentals, standards and practice,” Kluwer, 2001.

[64] D. Marpe, V. George, and T. Wiegand, “Performance comparison of intra-only H.264/AVC HP and JPEG 2000 for a set of monochrome ISO/IEC test images,” JVT-M014, pp.18–22, Oct. 2004.

[65] D. Marpe et al, “Performance evaluation of motion JPEG2000 in comparison with H.264 / operated in intra-coding mode,” Proc. SPIE, vol. 5266, pp. 129–137, Feb. 2004.


TABLE III. STANDARD

H.264/MPEG-4 Part 10
Standardization body: JVT (ISO/IEC & ITU-T)
Main target bitrate: 8 kb/s up to about 150 Mb/s
Main compression technologies:
– Integer DCT
– Adaptive quantization
– Zigzag reordering
– Alternate scan ordering
– Predictive motion compensation
– Bi-directional motion compensation
– Variable block size motion compensation with small block sizes
– Quarter pixel motion compensation
– Motion vector over picture boundaries
– Multiple reference picture motion compensation
– Adaptive intra directional prediction
– In-loop deblocking filter
– Arithmetic coding
– Variable length coding
– Error resilient coding
Main target applications:
– Broadcast over cable, terrestrial and satellite
– Interactive or serial storage on optical and magnetic devices, DVD, etc.
– Conversational services
– Video on demand
– MMS over ISDN, DSL, Ethernet, LAN, wireless and mobile networks
– HDTV
– Digital camera

HEVC/NGVC
Standardization body: JVT (ISO/IEC & ITU-T)
Main compression technologies (besides those listed under H.264/MPEG-4 Part 10):
(1) RD Picture Decision
(2) RDO_Q
(3) New Offset
(4) Adaptive Interpolation Filter
(5) Block Adaptive Loop Filter (BALF)
(6) Bigger blocks and bigger transform (32x32 and 64x64)
(7) Multiple Angular Direction Intra Adaptive Prediction
(8) Inter prediction (multiple ref. pictures, bi-prediction, weighted prediction)
(9) New MV competition
Transform unit block size 4X4 to 64X64 (mode dependent directional transform, MDDT, and rotational transforms)
Main target applications:
– Same as in H.264/MPEG-4 Part 10 but at lower bit rate and higher compression efficiency

AVS Part 2
Standardization body: AVS workgroup
Main target bitrate: 1 Mb/s up to about 20 Mb/s
Main compression technologies:
– Interlace handling: Picture-level adaptive frame/field coding (PAFF), Macroblock-level adaptive frame/field coding (MBAFF)
– Intra prediction: 5 modes for luma and 4 modes for chroma
– Motion compensation: 16×16, 16×8, 8×16, 8×8 block size
– Resolution of MV: 1/4-pel, 4-tap interpolation filter
– Transform: 16bit-implemented 8×8 integer cosine transform
– Quantization and scaling: scaling only in
Main target applications:
– HD broadcasting
– High density storage media
– Video surveillances
– Video on demand

[66] P. Topiwala, “Comparative study of JPEG2000 and H.264/AVC FRExt I-frame coding on high definition video sequences,” Proc. SPIE Int’l Symposium, Digital Image Processing, San Diego, Aug. 2005.

[67] P. Topiwala, T. Tran and W. Dai, “Performance comparison of JPEG2000 and H.264/AVC high profile intra-frame coding on HD video sequences,” Proc. SPIE Int’l Symposium, Digital Image Processing, Applications of Digital Image Processing XXIX, vol. 6312, San Diego, Aug. 2006.

JPEG XR (HD photo of Microsoft)


TABLE III. STANDARD (CONTINUED)

Dirac
Standardization body: BBC R&D, Mozilla Public License (MPL)
Main target bitrate: Few hundred kbps up to about 15 Mbps
Main compression technologies:
– 4×4 wavelet transform
– Dead-zone quantization and scaling
– Entropy coding: Arithmetic coding
– Hierarchical motion estimation
– Intra, Inter prediction
– Single and multiple reference P, B frames
– 1/8 pel motion vector precision
– 4×4 overlapped block based motion compensation (OBMC)
– Daubechies wavelet filters
Main target applications:
– Broadcasting
– Live streaming video
– Pod casting
– Peer to peer transfers
– HDTV with SD (standard definition) simulcast capability
– Desktop production
– News links
– Archive storage
– PVRs (personal video recorders)
– Multilevel Mezzanine coding

DiracPRO (SMPTE VC-2)
Standardization body: BBC R&D, SMPTE
Main target bitrate: Lossless HD to < 50 Mb/s
Compression ratio: 20:1
Main compression technologies:
– 4×4 wavelet transform
– Dead-zone quantization and scaling
– Entropy coding: Context based adaptive binary arithmetic coding (CABAC), exponential Golomb coding
– Intra-frame only (forward, backward prediction modes also available)
– Frame, Field coding (Interlaced and progressive)
– Daubechies wavelet filters
Main target applications:
– Professional (high quality, low latency) applications (not for end user distribution)
– Lossless or visually lossless compression for archives
– Mezzanine compression for re-use of existing equipment
– Low delay compression for live video links

SMPTE VC-1 (WMV-9)
Standardization body: SMPTE 421M
Main target bitrate: 10 kbps – 8 Mbps
Main compression technologies:
– Integer DCT
– Adaptive block size transform: (8×8), (8×4), (4×8) and (4×4)
– Motion estimation for (16×16) and (8×8) blocks
– ½ pixel and ¼ pixel motion vector resolution
– Dead zone and uniform quantization
– Multiple VLCs
– In-loop deblock filtering, fading compensation
Main target applications:
– Media delivery over the Internet
– Broadcast TV
– HD DVD
– Digital projection in theaters, mobile phones
– DVB-T, DVB-S

[68] T. Tran, L. Liu and P. Topiwala, “Performance comparison of leading image codecs: H.264/AVC intra, JPEG 2000, and Microsoft HD photo,” Proc.SPIE Int’l Symposium, Applications of Digital Image Processing XXX, vol. 6696, San Diego, Sept. 2007.

[69] JPEG-2000 open source software: http://www.ece.uvic.ca/~mdadams/jasper/

[69a] Z. Liu, L.J. Karam and A.B. Watson, “JPEG2000 Encoding with Perceptual Distortion Control,” IEEE Transactions on Image Processing, vol. 15, no. 7, pp. 1763-1778, July 2006.

[70] JPEG <http://en.wikipedia.org/wiki/JPEG>

[71] MJPEG <http://en.wikipedia.org/wiki/MJPEG>

GENERAL

[72] D.A. Novik, J.C. Tilton and M. Manohar, “Compression through decomposition into browse and residual images”, Space and Earth Science Data Compression Workshop, NASA CP-3191, edited by James C. Tilton, Washington D.C., 1993.

JPEG

[73] D. Santa-Cruz and T. Ebrahimi, “A study of JPEG 2000 still image coding versus other standards”, Proc. X EUSIPCO, vol. 2, pp. 673-676, Sept. 2000.

[74] E.L. Tan and W.S. Gan, “Perceptually tuned subband coder for JPEG,” J. Real Time Image Process., vol. 6, pp. 101-115, 2011.


JPEG-XR

[75] S. Srinivasan, C. Tu, S. L. Regunathan, G. J. Sullivan, and R. A. Rossi, “HD Photo: A New Image Coding Technology for Digital Photography,” Proc. SPIE, vol. 6696 (2007).

[76] Microsoft HD Photo specification: http://www.microsoft.com/whdc/xps/wmphotoeula.mspx

DIGITAL VIDEO

[77] DV <http://en.wikipedia.org/wiki/DV>

[78] Y. Gao, D. Chan and J. Liang, “JPEG-XR optimization with graph-based SOFT quantization”, IEEE ICIP 2011, Brussels, Aug. 2011.
