Video Compression Technology
Course Outline
• Introduction
• Video Coding
• Motion Compensated Prediction & Color Format
• JPEG/JPEG2000
• H.261, H.263, H.263+, H.264
• MPEG-1, -2, -4
• Error Concealment
• Rate Control
Standards Comparison
JPEG
- still image coding
- DCT + VLC
- simple hardware, low cost
H.261
- video-conferencing (64 kb/s - 1.92 Mb/s)
- DCT + VLC + optional integer-pixel MC
- progressive video
H.263
- improved H.261 with four optional modes
H.263+
- improved H.263 with 12 new optional modes
- error resilience coding
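Video Coding: MPEG-1, MPEG-2 & MPEG-4
• MPEG stands for Moving Picture Experts Group. It was set up jointly in 1988 by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) to develop coding standards for digital moving pictures and their associated audio.
• Three MPEG standards are in common use for digital video today: MPEG-1, MPEG-2, and MPEG-4. MPEG-7 and MPEG-21 are still under development. Digital audio-video systems such as VCD and DVD use MPEG-1 and MPEG-2 coding. MPEG-4 is applied mainly in low-bandwidth settings such as wireless links and the Internet. MPEG-7 and MPEG-21 strengthen audio-visual content database and management functions, and are standards for the future.
MPEG-1, MPEG-2, AND MPEG-4 (Moving Picture Experts Group)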
• MPEG-1 (1992):
– 1.5 Mb/s
– CD-ROM
• MPEG-2 (1994):
– 4 Mb/s to 80 Mb/s
– DVD, Digital TV, HDTV
• MPEG-4 (1999):
– 5 kb/s - 50 Mb/s
– Flexible networked multimedia applications
MPEG Video
MPEG-1
*1.2 to 1.5 Mbps (for digital storage media)
MPEG-2
*Wider range of bit rates, optimized for 4 to 15 Mbps
*Supports interlaced video
*Supports scalable coding
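MPEG-1
• MPEG-1 is the international standard for coding moving pictures and their associated audio for digital storage media at data rates up to about 1.5 Mbps. It is used to store synchronized color motion video on CD-ROM. It is optimized for medium resolution and, in its optimized mode, uses the so-called Standard Interchange Format (SIF). MPEG-1 subsamples the color-difference components (written as 4:1:1 in the source; with chrominance halved both horizontally and vertically, as in SIF, this is what is now usually called 4:2:0). MPEG-1 aims at VCR quality, with a video compression ratio of about 26:1.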
MPEG-1 Video Coding Standard
[Figure: video encoder and audio encoder, followed by storage, editing, and transport stages, followed by video decoder and audio decoder]
Important Features for MPEG-1 Applications
• Normal playback
• Random access
• Reverse playback
• Fast forward / reverse searches
• Audio-visual synchronization
• Robustness to errors
• Editability
• Format flexibility
• Cost tradeoffs
Typical MPEG-1 Video Source Format
Format: Standard Interchange Format (SIF), 525-line
Signal component | Pixels/Line | Lines/Frame
Luminance (Y) | 352 | 240
Chrominance (Cb) | 176 | 120
Chrominance (Cr) | 176 | 120
• Uncompressed bit-rate for transmitting SIF at 30 fps is 30.4 Mb/s
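The 30.4 Mb/s figure can be checked with a few lines of arithmetic. This is a minimal sketch that assumes 8 bits per sample and chrominance planes at half the luminance resolution in each direction, as in the SIF table; the function name is only illustrative.

```python
def uncompressed_bitrate(width, height, fps, bits_per_sample=8):
    """Raw bit rate in bits/s for one luma plane plus two half-size chroma planes."""
    luma_samples = width * height
    chroma_samples = 2 * (width // 2) * (height // 2)  # Cb and Cr
    return (luma_samples + chroma_samples) * bits_per_sample * fps


rate = uncompressed_bitrate(352, 240, 30)
print(f"{rate / 1e6:.1f} Mb/s")  # SIF at 30 fps
```

Compared with the roughly 1.5 Mb/s MPEG-1 target, this raw rate implies a compression ratio on the order of 20:1 or more.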
Six Hierarchical Layers of MPEG
• Sequence: Random access unit for context
• GOP (Group of Pictures): Random access unit for video. Smallest independent coding unit in sequence.
• Picture: Primary coding unit.
* Intra-Frame (I)
* Predicted-Frame (P)
* Bidirectional-Predicted-Frame (B)
• Slice: Resynchronization unit.
• MB (Macroblock) (16x16): Motion compensation unit.
• Block (8x8): DCT unit.
Group of Pictures (GOP)
*Contains I, P, and B pictures
*N = number of pictures in a GOP
*M = prediction distance
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
I B B P B B P B B P B B P B B I
Forward Prediction
Backward Prediction
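The picture pattern in the example follows directly from N and M. The sketch below assumes the common regular layout in which every M-th picture after the I picture is a P picture and the rest are B pictures; `gop_pattern` is an illustrative helper, not part of any standard.

```python
def gop_pattern(n, m):
    """Display-order picture types for one GOP of n pictures with anchor spacing m."""
    types = []
    for i in range(n):
        if i == 0:
            types.append("I")       # GOP starts with an intra picture
        elif i % m == 0:
            types.append("P")       # anchors every m pictures
        else:
            types.append("B")       # pictures between anchors
    return "".join(types)


print(gop_pattern(15, 3))  # one GOP of the example above (pictures 0 to 14)
```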
Example of Temporal Picture Structure
[Figure: forward prediction for P pictures, bidirectional prediction for B pictures]
Display Order
0 1 2 3 4 5 6 7 8 9 10 11 12
I B B P B B P B B P B B I
Coding Order
0 3 1 2 6 4 5 9 7 8 12 10 11
I P B B P B B P B B I B B
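The reordering from display order to coding order can be sketched as follows: each B picture is held back until the next anchor (I or P) picture, which it needs as a reference, has been emitted. The function name and the string representation of picture types are assumptions for illustration.

```python
def coding_order(display_types):
    """Return display indices in coding order for a display-order string of I/P/B."""
    order = []
    pending_b = []
    for index, picture_type in enumerate(display_types):
        if picture_type == "B":
            pending_b.append(index)   # wait for the future reference picture
        else:
            order.append(index)       # anchor picture is coded first
            order.extend(pending_b)   # then the B pictures that depend on it
            pending_b = []
    order.extend(pending_b)           # trailing B pictures, if any
    return order


print(coding_order("IBBPBBPBBPBBI"))
```

For the 13-picture example this reproduces the coding order listed above: 0 3 1 2 6 4 5 9 7 8 12 10 11.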
Coding Structure of GOP
[Figure: a sequence of I, B, and P pictures divided into groups of pictures]
I = Intra-picture coding; allows random access; used for reference
P = Predictive coding; causal prediction only; can be referenced
B = Bi-directional coding; never referenced
Frame Reordering
Encoder Input:
1I 2B 3B 4P 5B 6B 7P 8B 9B 10I 11B 12B 13P 14B 15B 16P
Decoder Input:
1I 4P 2B 3B 7P 5B 6B 10I 8B 9B 13P 11B 12B 16P 14B 15B
[Figure: the pictures are grouped into GOP1 (closed) and GOP2 (open) in both orders]
Bi-directional Prediction
• The prediction from the previous frame, the prediction from the future frame, or an average of both can be used as the final prediction
• The prediction error is then coded and transmitted
[Figure: the current B-frame with its best match in the previous P-frame and its best match in the future P-frame]
BI-Directional Motion Estimation
[Figure: Frame N-1 (past reference picture), Frame N (current B-picture), Frame N+1 (future reference picture); the forward and backward motion vectors point to the best matching macroblocks]
• Forward, backward, or average prediction: one or two motion vectors per 16x16 block
Forward/Backward/Interpolative Decision
In a B frame, for each input macroblock, compute the reconstruction:
*With forward motion vectors
*With backward motion vectors
*As a weighted average of the forward and backward predictions
The selection is based on minimizing an error measure (MSE, MAE, etc.)
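The decision can be sketched in a few lines. This is a minimal illustration, assuming the macroblock and its two motion-compensated predictions are given as flat lists of pixel values, that the "weighted average" is an equal-weight average rounded to nearest, and that MAE is the error measure; `choose_b_mode` is a hypothetical name.

```python
def mae(block_a, block_b):
    """Mean absolute error between two equally sized pixel lists."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b)) / len(block_a)


def choose_b_mode(current, forward_pred, backward_pred):
    """Pick the prediction mode with the smallest MAE for one macroblock."""
    # Equal-weight average of both predictions (rounding rule is an assumption).
    interpolated = [(f + b + 1) // 2 for f, b in zip(forward_pred, backward_pred)]
    candidates = {
        "forward": forward_pred,
        "backward": backward_pred,
        "interpolated": interpolated,
    }
    errors = {mode: mae(current, pred) for mode, pred in candidates.items()}
    best_mode = min(errors, key=errors.get)
    return best_mode, errors


current = [100, 102, 104, 106]
forward = [90, 92, 94, 96]        # prediction from the past reference
backward = [110, 112, 114, 116]   # prediction from the future reference
mode, errors = choose_b_mode(current, forward, backward)
print(mode, errors)
```

In this toy case the two one-sided predictions err in opposite directions, so their average matches the current block and the interpolated mode is selected.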
MPEG Encoder
[Block diagram: source input picture, frame re-order, motion estimation, predictor, DCT, quantizer, inverse quantizer, IDCT, variable length coding, multiplexing (with vectors and modes), buffer, regulator]
MPEG Decoder
[Block diagram: buffer, variable length decoding, IDCT, adder, display buffer, forward / interpolated / backward motion compensation, previous picture store, future picture store, decoded video]
MPEG Picture Types
• Generated number of bits: I > P > B. For example, I~300 kbits, P~100-65 kbits (fast/slow motion), B~18-7 kbits (fast/slow motion), per frame.
• B Pictures:
– Best prediction and compression, object occlusion and entrance into scene, noise averaging.
– Encoder delay, high complexity, large encoder buffers required.
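To see what these per-picture sizes mean for the overall bit rate, the sketch below averages them over one GOP. The GOP pattern (N = 15, M = 3) and the 30 pictures/s rate are assumptions taken from the earlier example, not values tied to the figures above.

```python
def gop_bitrate(pattern, bits_per_type, fps):
    """Average bit rate in bits/s for a repeating GOP pattern."""
    total_bits = sum(bits_per_type[t] for t in pattern)
    return total_bits * fps / len(pattern)


pattern = "IBBPBBPBBPBBPBB"  # 1 I, 4 P, 10 B pictures
fast = gop_bitrate(pattern, {"I": 300_000, "P": 100_000, "B": 18_000}, 30)
slow = gop_bitrate(pattern, {"I": 300_000, "P": 65_000, "B": 7_000}, 30)
print(f"fast motion: {fast / 1e6:.2f} Mb/s, slow motion: {slow / 1e6:.2f} Mb/s")
```

Under these assumptions the averages come to about 1.76 Mb/s and 1.26 Mb/s, the same order as the MPEG-1 target rate.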
MPEG-1 Picture
[Figure: the Y luminance, Cb chrominance, and Cr chrominance planes, each divided into slices 1 to 15; one macroblock is highlighted in slice 2]
Motion Estimation/Compensation
• Motion estimation on 16x16 luminance blocks
*Chrominance motion vectors are obtained by dividing the luminance motion vectors by two and truncating
• Half-pel update on integer motion vectors for improved performance
• Supports a maximum motion vector range of
*-512 to +511.5 pels for half-pel motion vectors
*-1024 to +1023 pels for full-pel motion vectors
MPEG-1 Constraint Parameter Set
• Horizontal size <= 720 pels
• Vertical size <= 576 pels
• Total number of Macroblocks/picture <= 396
• Total number of Macroblocks/second <= 396x25 = 330x30
• Picture rate <= 30 frames/second
• Bit rate <= 1.86 Mbits/second
• Decoder Buffer <= 376832 bits
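A quick way to read these limits is to test a few formats against them. The sketch below uses the limits exactly as listed on the slide (the decoder buffer limit is left out because it does not depend on the picture format); the function names are illustrative.

```python
def macroblocks_per_picture(width, height):
    """Number of 16x16 macroblocks, rounding partial macroblocks up."""
    return ((width + 15) // 16) * ((height + 15) // 16)


def within_constraint_set(width, height, fps, bitrate):
    """Check a format against the constraint parameter limits listed above."""
    mbs = macroblocks_per_picture(width, height)
    return (
        width <= 720
        and height <= 576
        and mbs <= 396
        and mbs * fps <= 396 * 25
        and fps <= 30
        and bitrate <= 1.86e6
    )


print(within_constraint_set(352, 240, 30, 1.5e6))  # SIF, 330 macroblocks
print(within_constraint_set(352, 288, 25, 1.5e6))  # 396 macroblocks at 25 Hz
print(within_constraint_set(720, 576, 25, 1.5e6))  # too many macroblocks
```

The macroblock-rate limit is why 396 macroblocks are allowed at 25 pictures/s but only 330 at 30 pictures/s: both give 9900 macroblocks per second.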
MPEG-1 VIDEO BIT-STREAM
Sequence layer (video sequence): Sequence header | GOP | GOP | GOP | GOP | ..... | Sequence end code
GOP layer: GOP header | Picture | Picture | Picture | .....
Picture layer (picture): Picture header | Slice | Slice | Slice | .....
Slice layer (slice): Slice header | Macroblock | Macroblock | Macroblock | .....
Macroblock layer (macroblock): Macroblock header | Block 0 | Block 1 | Block 2 | Block 3 | Block 4 | Block 5
Block layer: Differential DC coefficient | AC coefficient | AC coefficient | AC coefficient | ..... | End-Of-Block
MPEG-1 Compared to H.261
• Bi-directional motion compensation with half-pixel accuracy
• Visually weighted quantization
• In Intra-mode, the DC-coefficient is encoded similarly to JPEG
• I, P, and B picture types organized as a flexible Group of Pictures (GOP)
• Slice structure instead of Group of Blocks (GOB)
• Supports a maximum motion vector range of -512 to +511.5 pixels for half-pixel motion vectors, and -1024 to +1023 for full-pixel
• Flexible format: picture sizes up to 4k x 4k, 360 x 240 (SIF) normally used. Variety of picture rates: 23.98, 24, 25, 29.97, 30, 50, 59.94, 60
Simulation Model 3 (SM3)
• A specific reference implementation of an MPEG-1 encoder, including details which were not specified in the standard
• Motion estimation: one forward and/or one backward vector per MB with half-pixel resolution; 2-step search: (1) full search in the range of +/- 7 pixels, (2) search of the 8 neighboring half-pel positions
• Methods for MC / No MC and Intra / Inter decision
• Quantizer, rate control
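The 2-step search can be sketched as below. This is a simplified illustration, not the SM3 code: it assumes frames are lists of pixel rows, uses the sum of absolute differences (SAD) as the matching cost, forms half-pel samples by bilinear averaging, and skips candidates that fall outside the reference frame. Vectors are kept in half-pel units internally.

```python
import random


def inside(ref, bx, by, dx2, dy2, size):
    """True if the displaced block (vector in half-pel units) lies inside ref."""
    h, w = len(ref), len(ref[0])
    x_min, y_min = 2 * bx + dx2, 2 * by + dy2
    x_max, y_max = x_min + 2 * (size - 1), y_min + 2 * (size - 1)
    return x_min >= 0 and y_min >= 0 and x_max <= 2 * (w - 1) and y_max <= 2 * (h - 1)


def sad(cur, ref, bx, by, dx2, dy2, size):
    """SAD between the current block and the reference block displaced by (dx2/2, dy2/2)."""
    total = 0
    for y in range(size):
        for x in range(size):
            px2, py2 = 2 * (bx + x) + dx2, 2 * (by + y) + dy2
            x0, y0 = px2 // 2, py2 // 2
            x1, y1 = x0 + px2 % 2, y0 + py2 % 2  # equal to x0, y0 at integer positions
            pred = (ref[y0][x0] + ref[y0][x1] + ref[y1][x0] + ref[y1][x1] + 2) // 4
            total += abs(cur[by + y][bx + x] - pred)
    return total


def two_step_search(cur, ref, bx, by, size=16, search=7):
    """Step 1: integer full search in +/- search pixels. Step 2: 8 half-pel neighbors."""
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if inside(ref, bx, by, 2 * dx, 2 * dy, size):
                cost = sad(cur, ref, bx, by, 2 * dx, 2 * dy, size)
                if best is None or cost < best[0]:
                    best = (cost, 2 * dx, 2 * dy)
    _, cx2, cy2 = best
    for oy in (-1, 0, 1):
        for ox in (-1, 0, 1):
            if (ox, oy) != (0, 0) and inside(ref, bx, by, cx2 + ox, cy2 + oy, size):
                cost = sad(cur, ref, bx, by, cx2 + ox, cy2 + oy, size)
                if cost < best[0]:
                    best = (cost, cx2 + ox, cy2 + oy)
    return best[1] / 2, best[2] / 2, best[0]  # (dx, dy) in pixels, and the SAD


# Toy data: the current frame is the reference shifted by (3, -2) pixels.
rng = random.Random(0)
ref = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
cur = [[ref[(y - 2) % 32][(x + 3) % 32] for x in range(32)] for y in range(32)]
result = two_step_search(cur, ref, bx=12, by=12, size=8)
print(result)
```

The full search dominates the cost (225 candidate positions against 8 half-pel refinements), which is one reason the encoder is much more expensive than the decoder.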
SUMMARY
• MPEG-1 is mainly for storage media and broadcasting applications
• Due to the use of B-pictures, it may result in long end-to-end delay
• The MPEG encoder is much more expensive than the decoder due to the motion estimation, which has a large search range and may have half-pel accuracy
• MPEG-1 syntax can support a variety of rates and formats for storage media applications
• Pre-processing, encoding, and post-processing are open to improvement
• Extensions to include added features are possible
MPEG-2 Video Standard
• Standardization established in 1995
• A generic video codec to address a wide variety of applications
• Bit rates from 4 Mbit/s to 80 Mbit/s
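MPEG-2
• MPEG-2 is currently applied mainly to audio-video data at 3 to 10 Mbps. It offers a wide range of compression ratios to suit different requirements for picture quality, storage capacity, and bandwidth. MPEG-2 can compress a 120-minute film to 4 to 8 GB so that it fits on a DVD disc. MPEG-2 audio coding can provide left, right, center, and two surround channels, one low-frequency enhancement channel, and up to 7 accompanying audio channels, so a DVD can carry dubbing in 8 languages. Besides being the designated standard for DVD, MPEG-2 can also deliver broadcast-quality digital video for broadcasting, cable TV networks, and cable networks. Because the resolution of current TV sets varies, viewers playing a DVD may not fully appreciate the high-definition picture quality that MPEG-2 offers, but they will notice its outstanding audio features, such as multichannel effects.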
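The storage figures and the bit-rate range are consistent with each other, as a short calculation shows. The sketch assumes 1 GB = 10^9 bytes and treats the whole disc capacity as one multiplexed stream (video plus audio); the function name is illustrative.

```python
def average_bitrate_mbps(size_gb, minutes):
    """Average stream bit rate in Mb/s for a file of size_gb lasting the given minutes."""
    total_bits = size_gb * 8e9       # 1 GB taken as 10**9 bytes
    seconds = minutes * 60
    return total_bits / seconds / 1e6


low = average_bitrate_mbps(4, 120)
high = average_bitrate_mbps(8, 120)
print(f"{low:.2f} to {high:.2f} Mb/s")
```

A 120-minute film stored in 4 to 8 GB therefore averages roughly 4.4 to 8.9 Mb/s, inside the 3 to 10 Mbps range quoted above.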
MPEG-2 Video Standard
• Primarily for coding interlaced video at 4 - 15 Mb/s for digital broadcast TV and high quality Digital Storage Media; also for HDTV, Cable/Satellite TV, video services over networks (e.g., ATM), and 2-way communications
• Started late 1990 after completion of technical work of MPEG-1
• Competitive tests of video algorithms held in Nov. ‘91
• Collaborative phase for developing video coding algorithm
• Committee Draft for video part achieved Nov. ‘93
• Standard specifies only bitstream syntax and decoding process
MPEG2 Standards
ISO-IEC/JTC1/SC29/WG11
ITU-T ATM Video Coding Experts Group
ISO/IEC 13818
1) Systems (11/93)
2) Video (11/93)
3) Audio (11/93)
4) Conformance Testing (11/94)
5) Simulation Software Technical Report (11/94)
6) Digital Storage Media Control Commands (03/95)
7) Non-Backward Compatible Audio (07/96)
8) 10-bit Video (10/95)
9) Real-Time Interface (03/95)
...
ITU-T H.262: MPEG-2 Video
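• MPEG-2 application environment
[Block diagram. Sending side: video/audio input, MPEG-2 encoder, disk, compressed MPEG data stream, modulation subsystem. Channel: terrestrial / satellite broadcast or cable TV. Receiving side: demodulation subsystem, compressed MPEG data stream, MPEG-2 transport demultiplexer (with DRAM), MPEG-2 video decoder (with DRAM), NTSC/PAL encoder, composite video output, MPEG-2 audio decoder, audio interface, stereo audio output]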
MPEG2 STANDARDS
Features
• Picture quality - good quality NTSC (4-6 Mb/s), excellent quality NTSC (8-10 Mb/s)
• Random access/channel switching in limited time - intra-pictures
• Trick modes - basic VCR functions
• Delay - low delay mode using Simple Profile for visual communications
• Error resilience - intra-mv, data-partitioning, priority assignment to video layers
• Allow higher chroma resolution - e.g. 4:2:2 and 4:4:4
• Scalability - Data partition, SNR scalability, Spatial Scalability, Temporal scalability, Hybrid scalability (up to 3 layers)
• Compatibility - decodes MPEG-1 bit-stream, base layer may be decoded by MPEG-1 decoder
• Support multiple video formats and frame rates
• Subsets of the standard permit real-time encoders of reasonable complexity
Applications
[Figure: profiles SP, MP, SNRP, SSP, HP]
• Each profile supports groups of features for an application area
• Simple Profile: low-delay videoconferencing
• Main Profile: most important, for general applications
• SNR Profile: multiple grades of quality
• Spatially Scalable Profile: multiple grades of quality and resolution
• High Profile: multiple grades of quality, resolution, and chroma format
PROFILES AND LEVELS
Level | Simple (4:2:0) | Main (4:2:0) | SNR Scalable (4:2:0) | Spatially Scalable (4:2:0) | High (4:2:0 or 4:2:2)
High, 1920x1152 (60 frames/s) | - | 80 Mbit/s | - | - | 100 Mbit/s for 3 layers
High-1440, 1440x1152 (60 frames/s) | - | 60 Mbit/s | - | 60 Mbit/s for 3 layers | 80 Mbit/s for 3 layers
Main, 720x576 (30 frames/s) | 15 Mbit/s | 15 Mbit/s | 15 Mbit/s for 2 layers | - | 20 Mbit/s for 3 layers
Low, 352x288 (30 frames/s) | - | 4 Mbit/s | 4 Mbit/s for 2 layers | - | -
* numbers in the table are maximum allowed
Requirements
• CCIR-601 interlaced video with high quality at 4 to 9 Mbps
• Random access/channel switching in limited time:
*Frequent access points
• Fast forward/reverse:
*Seek and play in FF/FR using access points
• Allow video coding of higher chroma resolution formats:
*e.g. 4:2:2 and 4:4:4
• High quality low delay video coding for video communications
• Scalable video coding for multi-quality video applications
MPEG-1 & MPEG-2
• Coding of interlaced pictures
• Scalability
– Allows a receiver to decode a subset of the full bitstream in order to display an image sequence at a reduced quality, spatial and temporal resolution
New Features of MPEG-2
• Frame/field picture structure
• Frame/field dual prime adaptive motion compensation
• Frame/field adaptive DCT
• Alternate scan for DCT coefficients
• Picture format: (4:2:0), (4:2:2), (4:4:4)
• Nonlinear quantization table
MPEG-2: Resolutions & Formats
• Picture sizes extended up to 16k x 16k; 720 x 480 is roughly TV resolution
• Support picture rates: 23.98, 24, 25, 29.97, 30, 50, 59.94, 60
• Support both progressive and interlaced formats
• Support 4:2:0, 4:2:2, and 4:4:4 sampling formats
Interlaced Video Coding
1) Frame / Field motion compensated predictive coding; prediction modes: frame, field, dual prime
2) Frame / Field DCT
3) Progressive / Interlaced scan
Frame/Field Format for ME
[Figure: a 16x16 frame macroblock and the corresponding two 16x8 field blocks]
Low Delay Coding
• For face-to-face applications
• Total encoding and decoding delay of less than 150 ms can be achieved
• Low delay coding by not using B-pictures, using dual-prime prediction for P-frames, intra slices, skip frames
TEST MODEL 5 (TM5)
- Frame/field/dual prime and forward/backward ME
- Integer pel full search followed by half-pel search
- MPEG-1 mode decision: MC/no MC, inter/intra
- MPEG-1 and nonlinear quantizer tables
- Zigzag scan for inter; alternate scan for intra coding
- Quantizer and rate control
MPEG-4
An emerging coding standard
*Content-based interactivity: To interact with meaningful objects in audiovisual data
*Universal access: Access to audiovisual data can be available over a wide range of storage and transmission media
*High compression: Especially at low bit rates
*Flexible syntax for downloadable algorithms
[Figure: MPEG-4 at the intersection of Computer, TV/Film, and Telecommunication; networks: Wireless, Internet, WWW, ISDN, POTS, Cable]
MPEG-4 Applications
• Audiovisual communications and messaging
• Multimedia database access
• Remote monitoring, surveillance and control
• Video on LAN, Internet multimedia, Wireless video
• Interactive TV, tele-shopping, home movies
• Collaborative environment, distance learning, virtual reality games
• Audio/video streaming
MPEG-4 Products
• Microsoft supports MPEG-4 video in MS Media Player
• Sharp (JP) introduced MPEG-4 ViewCam in January ‘99
• Toshiba MPEG-4 chip
• Japan announced the use of MPEG-4 video for wireless services in the IMT2000 project
• PacketVideo Inc. provides technology for streaming MPEG-4 video over Internet and 2nd/3rd generation wireless networks
MPEG-4 ViewCam
• Add compressed MPEG-4 video images to e-mail or Internet home pages for new communication possibilities
• Supported by Microsoft Media Player
• MPEG-4 ViewCam enables up to 2 hrs video (1/8 TV size) on 32 MB storage device.
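Model: J-SH53; Size: 50 x 98 x 24 (mm); Weight: approx. 115 g
1-megapixel CCD camera
Maximum image capture size of 1140x858
Inner LCD: 260,000 colors (2.4QVGA); outer LCD: 65,536 colors (1.2GF)
Uses the MPEG-4 compression format: maximum recording time is 10 seconds at 80x60 and 5 seconds at 128x96 (both including video and audio)
Supports Java programs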
MSDL: MPEG-4 System Description Languages
*Define API, tools and algorithms
*Connect tools into algorithms, algorithms into profiles
Typical session with MSDL
*Negotiate decoder configuration
*Describe profile
*Download missing tools (cf. Java byte code)
*Begin transmission
MPEG-4
• Started in ‘93
• Originally targeted at < 64 kb/s next generation coding
• July ‘94, focus shifted from compression to new functionality + compression
• Address needs for emerging audiovisual applications which are not well supported by current standards
• Video coding for wider range of bit-rates and applications
• Significant emphasis on content oriented functionality
• MPEG-4 decoder must decode H.263
MPEG-4 Video
* Video object plane (VOP) structure
* Polygonal matching for motion estimation
* Padding
* Motion/Texture coding derived from H.263
* Shape coding
* B-VOPs derived from H.263 B-pictures and MPEG-1/2 B-pictures
Verification model (VM7)
MPEG-4 VIDEO OBJECT PLANE (VOP)
[Figure: a scene divided into VOP1, VOP2, and VOP3]
• Overview of MPEG-4 video
[Figure: video objects segmentation and object mask; layered encoding of VOP1, VOP2, and VOP3 into bitstream(VOP1), bitstream(VOP2), and bitstream(VOP3); content-based bitstream access and manipulation; separate decoding; content-based scalability]
MPEG-4 VIDEO
[Figure: baseline (conventional coding) versus extended (object coding) functionality: Compression, Scalability, Content-based Coding, Still Texture Coding]
Content-Based Coding
• Allows the user to access Arbitrarily-Shaped Objects in a Coded Scene
• Enables High Interaction with Scene Content
• Manipulation of Scene Content on Bitstream Level
MPEG-4 SCENE
[Figure: an AV presentation composed of a 2D background, 3D furniture, speech, and a video object]
MPEG-4 TERMINAL
[Block diagram: network layer, demultiplex, elementary streams, syntactic decoding, syntactic decoded streams, decompression, primitive AV objects, scene description (script or classes), composition information, composition and rendering, audiovisual hierarchical interactive scene; upstream data (user events, class requests, ...)]
Structure of VOP Encoder
[Block diagram: input, VOP definition, VOP 0 / VOP 1 / VOP 2 coding, MUX, bitstream]
Structure of VOP Decoder
[Block diagram: bitstream, demultiplexer, VOP 0 / VOP 1 / VOP 2 decoding, composition]
VOP Encoder
[Block diagram: VOP of arbitrary shape, shape coding, motion estimation, motion compensation, previous reconstructed VOP, texture coding; shape, motion, and texture information are passed to the MUX and buffer]
VOP Decoder
[Block diagram: bitstream, demultiplexer, shape decoding (shape information), motion decoding, texture decoding, motion compensation, VOP memory, reconstructed VOP, compositor driven by a compositing script, video out]
Shape Coding
• Bitmap-based shape coding
– Modified MMR shape coding
– Context-based Arithmetic Encoder (CAE)
• Contour-based shape coding
– Vertex-based shape coding
– Baseline-based shape coding
Bitmap-based shape coding
Context-based Arithmetic Encoder (CAE)
[Figure: context templates around the current pixel for the intra-frame case (current frame only) and the inter-frame case (previous and current frames)]
Contour-based shape coding
• Chain code: 4, 8, or more directions are used to describe contours
• MGCC code: Multi-Grid chain coding
• Vertex code: approximate the contour with a similar polygon
• Baseline code: project the shape onto the x-axis and encode the distance (y-coordinate)
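Chain code and MGCC
[Figure: direction numbering 0 to 7 for the 8-direction chain code]
Vertex code and Baseline code
[Figure: original contour approximated by a polygon with a start point, vertex points, and an end point, using a distance threshold (Thd); baseline code shown against x and y axes]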
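An 8-direction chain code, as mentioned under contour-based shape coding, can be sketched as follows. The direction numbering used here (0 for a step in +x, increasing every 45 degrees toward +y) is an assumption and may differ from the numbering in the slide's figure; the function names are illustrative.

```python
# Map each unit step (dx, dy) to a direction code 0..7 (assumed numbering).
STEP_TO_CODE = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}
CODE_TO_STEP = {code: step for step, code in STEP_TO_CODE.items()}


def chain_encode(points):
    """Encode a contour given as 8-connected (x, y) points into direction codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        codes.append(STEP_TO_CODE[(x1 - x0, y1 - y0)])
    return codes


def chain_decode(start, codes):
    """Rebuild the contour points from a start point and the direction codes."""
    points = [start]
    for code in codes:
        dx, dy = CODE_TO_STEP[code]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points


square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1), (0, 0)]
codes = chain_encode(square)
print(codes)
```

Only the start point and one 3-bit code per contour step need to be stored, and long runs of equal codes along straight edges compress well with a subsequent entropy coder.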
Sprite Coding
Chroma-key
STANDARDS COMPARISON
JPEG
- still image coding
- DCT + VLC
- simple hardware, low cost
H.261
- video-conferencing (64 kb/s - 1.92 Mb/s)
- DCT + VLC + optional integer-pixel MC
- progressive video
MPEG-1
- storage based applications (1.5 Mb/s)
- DCT + VLC + optional half-pixel bi-directional MC
- progressive video
MPEG-2
- general high quality applications (> 2 Mb/s)
- DCT + VLC + optional half-pixel bi-directional frame/field based MC
- progressive and interlaced video
H.263
- improved H.261 with four optional modes
H.263+
- improved H.263 with 12 new optional modes
- error resilience coding
MPEG-4
- improved H.263 with object-based coding and manipulations
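Future
• After MPEG-4, the problem to be solved is the management and rapid searching of the ever-growing volume of image and audio information. The MPEG-7 standard, based on this idea, was proposed in October 1998; its formal name is "Multimedia Content Description Interface". The aim of MPEG-7 is to describe the various types of multimedia information in a standardized way and to link each description to its content, so that searching becomes faster and more effective. Because MPEG-7 can operate on its own and does not prescribe any tools or programs for searching with the descriptions, it can be used independently of the other MPEG standards. The descriptions of audio and video objects defined in MPEG-4 still apply to MPEG-7, however, and these descriptions are the basis for classification. MPEG-7 descriptions can also be used to enhance the functionality of the other MPEG standards. MPEG-7 has a wide range of applications, for example digital libraries, multimedia directories, image catalogs, and music dictionaries.
MPEG-1, -2, -4 References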