H.264
Subhrendu Sarkar
Computer Science, Columbia University
COMS W4995 - VOIP Security
9/29/2008
• Introduction
• Video Formats and Quality
• Video Coding and H.264
• Performance
• Conclusion
• References
Introduction
• What is H.264?
– H.264 is a video coding standard, also known as MPEG-4 Part 10 (AVC).
• H.261, H.263, MPEG-1 and MPEG-2 are some predecessors of H.264.
• Purpose of a standard
– Define a coded representation (or syntax) that describes visual data in a compressed form, and a method of decoding the syntax to reconstruct visual information.
– Ensure that compliant encoders and decoders can successfully interoperate with each other.
• A video standard specifically does not define an encoder; rather, it defines the output that an encoder should produce.
• A decoding method is defined in each standard.
Video
from Latin “I see”
• Introduction
• Video Formats and Quality
• Pixel (Picture Element)
• Interlaced Video (Frames and Fields)
• Bitrate and Frame rate
– Typically, 30 fps is good for the human visual system.
– The higher the bitrate, the better the quality of the video (for a given codec and content).
– The higher the frame rate, the better the quality of the video (smoother motion).
• Video Formats
– NTSC, PAL, SECAM (analogue video)
  • PAL (Europe, Asia, Australia, etc.) 25 frames/sec
  • SECAM (France, Russia, parts of Africa, etc.) 25 frames/sec
  • NTSC (USA, Canada, Japan, etc.) 29.97 frames/sec
– According to resolution
  • VGA 640x480 (Video Graphics Array), QVGA
  • CIF 352x288 (Common Intermediate Format)
    – QCIF, SQCIF
  • SDTV (e.g. 720x480)
  • HDTV (e.g. 1920x1080)
• Color Spaces
– RGB (Red, Green, Blue)
– YUV, also known as YCbCr (luminance, chroma)
  • Y = k_r R + k_g G + k_b B
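The luminance formula can be tried out with a small sketch. The slide does not give values for the weights, so the ITU-R BT.601 values (k_r = 0.299, k_b = 0.114) are assumed here, and the function names are illustrative only. The chroma components are computed as scaled colour differences (B - Y and R - Y).

```python
# Assumed ITU-R BT.601 luma weights; the slide leaves k_r, k_g, k_b unspecified.
K_R = 0.299
K_B = 0.114
K_G = 1.0 - K_R - K_B  # the three weights sum to 1


def rgb_to_ycbcr(r, g, b):
    """Convert normalized RGB (each in 0..1) to (Y, Cb, Cr).

    Y is in 0..1; Cb and Cr are in -0.5..0.5.
    """
    y = K_R * r + K_G * g + K_B * b
    cb = 0.5 * (b - y) / (1.0 - K_B)
    cr = 0.5 * (r - y) / (1.0 - K_R)
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr."""
    r = y + 2.0 * (1.0 - K_R) * cr
    b = y + 2.0 * (1.0 - K_B) * cb
    g = (y - K_R * r - K_B * b) / K_G
    return r, g, b
```

A grey or white input has zero chroma, which is why the chroma planes can be stored at lower resolution with little visible loss.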
• YUV Sampling Formats
– YUV 444 (4:4:4)
– YUY2 (4:2:2)
– YV12 or YUV420 (4:2:0)
Courtesy : Images from H.264 and MPEG-4 Compression – Ian Richardson
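To see what the sampling formats mean for storage and raw bitrate, here is a minimal sketch. It assumes 8 bits per sample by default and planar storage with no padding; the helper names are hypothetical.

```python
# Chroma samples per luma sample, counting both chroma planes together.
CHROMA_FACTOR = {"4:4:4": 2.0, "4:2:2": 1.0, "4:2:0": 0.5}


def frame_bytes(width, height, sampling, bits_per_sample=8):
    """Size of one uncompressed frame in bytes."""
    luma_samples = width * height
    total_samples = luma_samples * (1.0 + CHROMA_FACTOR[sampling])
    return int(total_samples * bits_per_sample / 8)


def raw_bitrate_bps(width, height, sampling, fps, bits_per_sample=8):
    """Uncompressed bitrate in bits per second."""
    return frame_bytes(width, height, sampling, bits_per_sample) * 8 * fps


# CIF (352x288) in 4:2:0 at 30 frames/sec.
print(frame_bytes(352, 288, "4:2:0"))          # 152064
print(raw_bitrate_bps(352, 288, "4:2:0", 30))  # 36495360
```

Even at CIF resolution, uncompressed 4:2:0 video needs about 36.5 Mbit/s, which is the motivation for video compression.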
Video Quality
• Subjective Video Quality
• Objective Video Quality
– PSNR (Peak Signal to Noise Ratio): measured on a logarithmic scale. It depends on the mean squared error (MSE) between an original and an impaired image or video frame, relative to (2^n − 1)^2 (the square of the highest possible signal value in the image, where n is the number of bits per image sample).
– PSNR (dB) = 10 log10((2^n − 1)^2 / MSE)
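The PSNR definition can be sketched in a few lines. This assumes frames are given as flat sequences of integer samples of equal length; the function names are illustrative.

```python
import math


def mse(original, impaired):
    """Mean squared error between two equal-length sample sequences."""
    if len(original) != len(impaired) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(original, impaired)) / len(original)


def psnr_db(original, impaired, bits=8):
    """PSNR in dB: 10 * log10((2^n - 1)^2 / MSE), with n = bits per sample."""
    err = mse(original, impaired)
    if err == 0:
        return math.inf  # identical frames
    peak_squared = (2 ** bits - 1) ** 2
    return 10.0 * math.log10(peak_squared / err)
```

A higher PSNR means a smaller error relative to the peak signal value; identical frames give an infinite PSNR.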
• Introduction
• Video Formats and Quality
• Video Coding and H.264
• A video CODEC encodes a source image or video sequence into a compressed form and decodes this to produce a copy or approximation of the source sequence.
Three main functional units of a Video Encoder
• Spatial Model
• Temporal Model
• Entropy Encoder
Courtesy : Images from H.264 and MPEG-4 Compression – Ian Richardson
• Macroblock, Block and Sub-Block
– 16x16 Macroblocks
– 16x8, 8x16, 8x8, 8x4, 4x8, 4x4 Blocks
• Temporal Model
– Prediction from the Previous Video Frame
  • Optical Flow
  • Block-based Motion Estimation and Motion Compensation
[Figure: the current macroblock and candidate reference blocks]
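Block-based motion estimation can be sketched as an exhaustive search that minimises the sum of absolute differences (SAD). This is only one possible matching criterion and search strategy, chosen here for clarity; encoders are free to use others. Frames are assumed to be lists of rows of luma samples, and all names are hypothetical.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between two n x n blocks.

    (cx, cy) is the top-left corner in the current frame,
    (rx, ry) the top-left corner in the reference frame.
    """
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, cx, cy, n=16, search=7):
    """Return (mvx, mvy, best_sad) for the n x n block at (cx, cy) in cur.

    Every offset within +/- search samples is tried; candidates that fall
    outside the reference frame are skipped.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

The cost of this search grows with the square of the search range, which is one reason practical encoders use faster, non-exhaustive search patterns.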
Motion vectors
• Search region: the encoder searches it for the best-matching MB.
• mv : mvx and mvy (the displacement between the current block and the reference macroblock)
• mvp - motion vector predictor
• mvd - motion vector difference
• mvd = mv - mvp
[Figure: neighbouring blocks A, B, C and D, each with its mvp, and the resulting mvd]
Courtesy : Diagram from http://wiki.multimedia.cx/index.php?title=Motion_Prediction
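The relationship between mv, mvp and mvd can be sketched as follows. The predictor is assumed here to be the component-wise median of three neighbouring motion vectors (A, B, C); the actual standard adds special cases for partition shapes and unavailable neighbours, which this sketch ignores. Function names are illustrative.

```python
def median_mvp(mv_a, mv_b, mv_c):
    """Component-wise median of three neighbouring motion vectors."""
    xs = sorted(v[0] for v in (mv_a, mv_b, mv_c))
    ys = sorted(v[1] for v in (mv_a, mv_b, mv_c))
    return (xs[1], ys[1])


def mv_difference(mv, mvp):
    """Encoder side: only mvd = mv - mvp is written to the bitstream."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def reconstruct_mv(mvd, mvp):
    """Decoder side: mv = mvd + mvp, using the same predictor."""
    return (mvd[0] + mvp[0], mvd[1] + mvp[1])


mvp = median_mvp((4, -2), (6, 0), (5, 3))
mvd = mv_difference((7, 1), mvp)
print(mvp, mvd)  # (5, 0) (2, 1)
```

Because neighbouring blocks tend to move together, mvd is usually small and therefore cheap to entropy code.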
[Figure panels: Frame Fn; Frame Fn-1; 16x16 Motion Vectors; Residual Fn - Fn-1; Motion Compensated Reference; Motion Compensated Residual]
Courtesy : Images from H.264 and MPEG-4 Compression – Ian Richardson
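The benefit of motion compensation shown in these panels can be illustrated with a toy example. The sketch assumes a single global motion vector for the whole frame (real encoders use one or more vectors per macroblock) and clamps reads at the frame edge; the names are hypothetical.

```python
def residual_energy(cur, pred):
    """Sum of squared differences between a frame and its prediction."""
    return sum(
        (c - p) ** 2
        for cur_row, pred_row in zip(cur, pred)
        for c, p in zip(cur_row, pred_row)
    )


def shift_frame(ref, mvx, mvy):
    """Motion-compensated prediction using one global motion vector.

    Sample (x, y) is predicted from ref at (x + mvx, y + mvy),
    clamped to the frame boundaries.
    """
    h, w = len(ref), len(ref[0])
    return [
        [ref[min(max(y + mvy, 0), h - 1)][min(max(x + mvx, 0), w - 1)]
         for x in range(w)]
        for y in range(h)
    ]


def frame_with_square(left, top, size=16, side=4, value=200):
    """A black frame containing one bright square (toy test content)."""
    return [
        [value if left <= x < left + side and top <= y < top + side else 0
         for x in range(size)]
        for y in range(size)
    ]


ref = frame_with_square(4, 4)  # Frame Fn-1
cur = frame_with_square(6, 5)  # Frame Fn: the square moved right 2, down 1
print(residual_energy(cur, ref))                       # 800000
print(residual_energy(cur, shift_frame(ref, -2, -1)))  # 0
```

With the correct motion vector the residual vanishes in this toy case; in real video it is merely much smaller, which is what makes it cheaper to code.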
• I Frames, P Frames, B Frames
– I Frames: spatial prediction only, for all MBs.
– P Frames: contain I (intra) spatially predicted MBs and P (inter) temporally predicted MBs.
– B Frames: bi-directionally predicted frames.
• Group of Pictures (GOP)
– Display Order
  Frame No   : 0 1 2 3 4 5 6 7 8 9
  Frame Type : I B B P B B P B B I ...
– Encoding or Decoding Order (each B frame follows both of the frames it is predicted from)
  Frame No   : 0 3 1 2 6 4 5 9 7 8
  Frame Type : I P B B P B B I B B ...
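The reordering from display order to decoding order can be sketched as below. It assumes every B frame is predicted from the nearest I or P frame on each side, so the later of those two frames must be decoded before the B frames that precede it in display order. The function name is illustrative.

```python
def decode_order(display_types):
    """Map display-order frame types (e.g. "IBBPBBPBBI") to decode order.

    Returns the display indices in the order they must be decoded.
    """
    order = []
    pending_b = []
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append(index)  # wait for the next I or P frame
        else:
            order.append(index)      # the I or P frame is decoded first
            order.extend(pending_b)  # then the B frames that depend on it
            pending_b = []
    order.extend(pending_b)  # trailing B frames with no later I/P frame
    return order


print(decode_order("IBBPBBPBBI"))  # [0, 3, 1, 2, 6, 4, 5, 9, 7, 8]
```

This reordering is why B frames add latency: the decoder has to hold frames back until the later reference has arrived.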
• Spatial Model
– Intra macroblocks
– Spatial correlation between macroblocks
• Temporal Model
– Inter macroblocks
– Temporal correlation between macroblocks
– Searching for similar macroblocks from reference frames
• Transform
– Spatial to frequency domain
– Discrete Cosine Transform is used (H.264 uses an integer approximation of it)
– Theoretically not lossy
Courtesy : Diagrams from H.264 and MPEG-4 Compression – Ian Richardson
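The claim that the transform by itself is not lossy can be checked with a floating-point 2-D DCT sketch. This uses the orthonormal DCT-II on a square block, not the integer transform that H.264 actually specifies, and the helper names are illustrative.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix A: row k is the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([
            scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for i in range(n)
        ])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2(block):
    """Forward 2-D DCT of a square block: Y = A X A^T."""
    a = dct_matrix(len(block))
    return matmul(matmul(a, block), transpose(a))


def idct2(coeffs):
    """Inverse 2-D DCT: X = A^T Y A."""
    a = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(a), coeffs), a)
```

Applying idct2 to the output of dct2 returns the original block up to floating-point rounding; the loss in a codec comes from the quantization step that follows the transform.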
• Quantization
– Basically, dividing the transformed coefficients by a quant value (and rounding) in the encoder, and multiplying by the same quant value in the decoder
– Lossy
– Helps meet bitrate constraints
– The human eye is less sensitive to higher-frequency transform coefficients
• Entropy Coder
– Reorder (zig-zag scan order)
– Variable length coding
  • Run Length Coding
  • Huffman Coding
  • Arithmetic Coding
  • H.264
    – Context Adaptive Variable Length Coding (CAVLC)
    – Context Adaptive Binary Arithmetic Coding (CABAC)
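The quantization, reordering and run-length steps can be sketched together on a 4x4 block. A single uniform quant step is assumed here rather than the QP-dependent scaling of H.264, and the (run, level) pairs stand in for the input to a variable length coder; CAVLC and CABAC themselves are not implemented. All names are illustrative.

```python
# Zig-zag scan order for a 4x4 block, as (row, col) pairs.
ZIGZAG_4X4 = [
    (0, 0), (0, 1), (1, 0), (2, 0),
    (1, 1), (0, 2), (0, 3), (1, 2),
    (2, 1), (3, 0), (3, 1), (2, 2),
    (1, 3), (2, 3), (3, 2), (3, 3),
]


def quantize(coeffs, qstep):
    """Encoder: divide by qstep and round to nearest (ties away from zero)."""
    def q(c):
        sign = -1 if c < 0 else 1
        return sign * int(abs(c) / qstep + 0.5)
    return [[q(c) for c in row] for row in coeffs]


def rescale(levels, qstep):
    """Decoder: multiply the quantized levels by the same qstep."""
    return [[level * qstep for level in row] for row in levels]


def zigzag(block):
    """Reorder a 4x4 block so low-frequency coefficients come first."""
    return [block[r][c] for r, c in ZIGZAG_4X4]


def run_level(scanned):
    """(zero_run, level) pairs for each nonzero value; trailing zeros dropped."""
    pairs, run = [], 0
    for value in scanned:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs


coeffs = [[52, -11, 3, 0], [9, 4, 0, 0], [-2, 0, 0, 0], [0, 0, 0, 0]]
levels = quantize(coeffs, 8)
print(run_level(zigzag(levels)))  # [(0, 7), (0, -1), (0, 1), (1, 1)]
print(rescale(levels, 8)[0])      # [56, -8, 0, 0]
```

The rescaled values (56, -8, ...) differ from the originals (52, -11, ...), which is the lossy part; the small high-frequency coefficients become zero, and the zig-zag scan groups those zeros so that run-length coding is effective.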
Video Encoder
In Loop Deblocking Filter
Courtesy : Diagrams from H.264 and MPEG-4 Compression – Ian Richardson
Video Decoder
Deblocking Filter
Courtesy : Diagrams from H.264 and MPEG-4 Compression – Ian Richardson
Deblocking Filter
[Figure: Non Deblocked Image (left), Deblocked Image (right)]
Courtesy : Images from http://compression.ru/video/deblocking/
H.264 Profiles
Courtesy : Diagram from H.264 and MPEG-4 Compression – Ian Richardson
• Introduction
• Video Formats and Quality
• Video Coding and H.264
• Performance
Sample Videos
• H.263
• MPEG-4 Basic
• MPEG-4 Improved
• MPEG-4 Part 10 (AVC) or H.264
[Figure labels: Quality, Compression, Complexity]
Courtesy : Videos from http://mac.sillydog.org/qt/compare.php
Applications
Courtesy : Table from H.264 and MPEG-4 Compression – Ian Richardson
Conclusion
• H.264 is a digital video coding standard.
• H.264 gives the best quality and compression when compared to earlier codecs.
• H.264 is more complex and thus requires more processing.
• Different profiles cater to different kinds of applications.
• There is a tradeoff between quality of video, compression achieved and complexity of the codec.
References
• H.264 and MPEG-4 Video Compression, Ian Richardson
• ISO/IEC 14496-10 and ITU-T Rec. H.264, Advanced Video Coding, 2003.
• H.264 Reference Software Version
– http://iphome.hhi.de/suehring/tml/
– Current software version: JM 14.2
Appendix - H.264 Features
• Weighted Prediction
• Entropy Coding
– CAVLC (Context Adaptive Variable Length Coding)
– CABAC (Main profile): Context Adaptive Binary Arithmetic Coding
Some H.264 Features
• Reference Pictures
• Slices
• Integer and Sub-sample prediction
Courtesy : Images from H.264 and MPEG-4 Compression – Ian Richardson
Appendix - Rate Control
Courtesy : Images from H.264 and MPEG-4 Compression – Ian Richardson
Courtesy : Images from H.264 and MPEG-4 Compression – Ian Richardson