T.Sharon-A.Frank
Multimedia
Video/Audio Compression
Hybrid coding
• Images:
  – JPEG
• Video/Audio:
  – M-JPEG
  – MPEG (1, 2, 4)
  – Other codings
  – H.26x
Video Coding Requirements
• Random access
• Fast forward/reverse searches
• Reverse playback
• Audio-visual synchronization
• Robustness to errors
• Low coding/decoding delay
• Editability
• Format flexibility
• Cost tradeoffs
Video Compression
• Spatial (intra-frame) compression:
  – Compresses each frame in isolation, treating it as a bitmapped image.
  – Based on quantization of DCT coefficients.
• Temporal (inter-frame) compression:
  – Compresses sequences of frames by storing only the differences between them.
  – Records the displacement of an object plus the changed pixels in the area exposed by its movement.
  – Based on Motion Compensation (MC).
Spatial Compression
• Image compression applied to each frame.
• Can therefore be lossless or lossy, but lossless rarely produces sufficiently high compression ratios for the volume of data.
• Lossy compression implies a loss of quality if the video is decompressed and then recompressed.
• Ideally, work with uncompressed video during post-production.
Temporal Compression
• Key frames are spatially compressed only.
  – Key frames are often regularly spaced (e.g., every 12 frames).
• Difference frames store only the differences between the frame and the preceding frame or most recent key frame.
• Difference frames can be efficiently spatially compressed.
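The key-frame and difference-frame idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: frames are flat lists of integer pixel values, differences are taken against the preceding frame, and no spatial compression is applied to either frame type. The function names and the key_interval parameter are hypothetical.

```python
def encode_sequence(frames, key_interval=12):
    """Encode frames as key frames plus per-pixel difference frames."""
    encoded = []
    previous = None
    for index, frame in enumerate(frames):
        if index % key_interval == 0:
            # Key frame: stored in full (a real codec would compress it spatially).
            encoded.append(("key", list(frame)))
        else:
            # Difference frame: only the change relative to the preceding frame.
            diff = [cur - prev for cur, prev in zip(frame, previous)]
            encoded.append(("diff", diff))
        previous = frame
    return encoded


def decode_sequence(encoded):
    """Rebuild full frames by adding each difference to the previous frame."""
    frames = []
    for kind, data in encoded:
        if kind == "key":
            frames.append(list(data))
        else:
            frames.append([prev + delta for prev, delta in zip(frames[-1], data)])
    return frames
```

When little changes between frames, a difference frame is mostly zeros, which is why it compresses well spatially.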
Motion-JPEG (M-JPEG)
• Purely spatial compression.
• Apply JPEG compression to each video frame.
• Compression rates: 2:1 to 12:1
  – lossy: up to 5:1 is considered broadcast quality.
• No standard, but the MJPEG-A format is widely supported.
• Excellent when there are rapid scene changes in the video.
• Easy to edit.
Video Compression
Coding of video is usually carried out in a series of steps:
• Divide the image into blocks:
  – 16x16 luminance
  – 8x8 chrominance (color)
• Use DCT-based techniques for spatial redundancy removal (intra-frame compression).
• Use MC (Motion Compensation) techniques for temporal redundancy removal (inter-frame compression).
• The final stage is two-dimensional run-length coding.
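The DCT step for spatial redundancy removal can be illustrated with a small, unoptimized sketch. It assumes an orthonormal 2-D DCT-II and a single uniform quantizer step of 16; both are illustrative choices, since real codecs use fast transforms and per-coefficient quantization tables. The function names are hypothetical.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block (direct O(n^4) form)."""
    n = len(block)

    def scale(k):
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)

    result = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            result[u][v] = scale(u) * scale(v) * total
    return result


def quantize(coefficients, step=16):
    """Uniform quantization: small coefficients collapse to zero."""
    return [[round(c / step) for c in row] for row in coefficients]
```

For a flat 8x8 block, all the energy ends up in the single top-left (DC) coefficient and every other quantized coefficient is zero, which is what makes the later run-length stage effective.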
Three consecutive video frames
Motion Compensation
• Motion compensation compensates for inter-frame differences.
• Real-time communication consideration: only the closest previous frame is used for prediction, to reduce the encoding delay.
[Figure: a block in the current frame and its best match in the previous frame]
Motion Compensation Algorithm
• Sends the new location of the block.
• If the block changed by more than a certain threshold, resends the whole block.
• Refreshes the whole image once in a while.
[Figure: best match between previous frame and current frame]
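The algorithm on this slide can be sketched as an exhaustive block-matching search. The sketch assumes grayscale frames stored as lists of rows, a sum-of-absolute-differences (SAD) error measure, and illustrative values for block size, search range and threshold; none of these choices come from the slides, and the periodic full-image refresh is left out. All function names are hypothetical.

```python
def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def best_match(previous, current, top, left, size=4, search=2):
    """Return (dy, dx, error) for the best-matching block in the previous frame."""
    target = get_block(current, top, left, size)
    height, width = len(previous), len(previous[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block would fall outside the frame
            error = sad(target, get_block(previous, y, x, size))
            if best is None or error < best[2]:
                best = (dy, dx, error)
    return best


def encode_block(previous, current, top, left, size=4, search=2, threshold=8):
    """Send a motion vector if the match is good enough, else the whole block."""
    dy, dx, error = best_match(previous, current, top, left, size, search)
    if error > threshold:
        return ("intra", get_block(current, top, left, size))
    return ("motion", (dy, dx))
```

Here the motion vector is the offset from the block's position in the current frame to its best match in the previous frame; other conventions use the opposite sign.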
Frame Types in Compressed Video
• Key Frame
  – Compression is based on the content of this frame.
• Difference/Delta Frame
  – Compression is based on the last key frame.
Bi-directional Motion Compensated Interpolation
MPEG Dynamics
• Delicate balance between intra-frame and inter-frame coding.
• Two basic techniques:
  – Transform-domain DCT-based compression for the reduction of spatial redundancy (intra-frame).
  – Block-based bi-directional MC for the reduction of temporal redundancy (inter-frame).
The MPEG Standard
Three types of MPEG-2 frames processed by the viewing program:
1. I (Intracoded) frames: self-contained JPEG-encoded still pictures.
2. P (Predictive) frames: block-by-block difference with the last frame.
3. B (Bidirectional) frames: differences with the last and next frame.
Use of MPEG Image Types
<I> Intra-picture/frame/image
  – Access points for random access
  – Moderate compression
<P> Predicted pictures
  – Coded with reference to a past picture
  – Used as reference for future predicted pictures
<B> Bi-directional prediction (interpolated pictures)
  – Require past and future references for prediction
  – Highest compression
MPEG GOPs
• Group of Pictures (GOP):
  – Repeating sequence of I-, P- and B-pictures.
  – Always begins with an I-picture.
  – Display order: frames in the order they will be displayed.
  – Bitstream order: re-ordered so that every P- or B-picture comes after the frames it depends on, allowing reconstruction of the complete frames.
A Typical MPEG Picture Display Order
[Figure: display-order sequence of I-, P- and B-pictures with forward-prediction arrows; annotation: 25 fps (9 I/P, 17 B)]
A Typical MPEG Picture Bitstream Order
Display position:  1 2 3 4 5 6 7 8 9
Picture type:      I B B B P B B B I
• Transmitting order: 1, 5, 2, 3, 4, 9, 6, 7, 8
[Figure labels: forward prediction, bi-directional prediction]
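The re-ordering from display order to transmitting order can be sketched as follows. The rule assumed here is that each B-picture is held back until the next I- or P-picture (its future reference) has been emitted; the function name is hypothetical, and B-pictures at the very end of the sequence are simply flushed, which glosses over how a real encoder would handle them.

```python
def bitstream_order(display_types):
    """Return 1-based display positions in the order they are transmitted."""
    order = []
    pending_b = []
    for position, picture_type in enumerate(display_types, start=1):
        if picture_type == "B":
            # A B-picture needs the next I/P picture first, so hold it back.
            pending_b.append(position)
        else:
            # Emit the reference picture, then the B-pictures that depend on it.
            order.append(position)
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)  # simplification: flush trailing B-pictures as-is
    return order


print(bitstream_order("IBBBPBBBI"))  # [1, 5, 2, 3, 4, 9, 6, 7, 8]
```

For the sequence I B B B P B B B I this reproduces the transmitting order given on the slide.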
MPEG Standards
• MPEG-1
  – 352x240 at 30 fps.
  – Quality is slightly below standard VCR videos.
• MPEG-2
  – 720x480 and 1280x720 at 60 fps, with full CD-quality audio.
  – Sufficient for television (including HDTV).
  – Used on DVD-ROMs.
• MP3
  – Audio compression.
  – Reduces digital sound files by a 12:1 ratio with virtually no loss in quality.
MPEG-1 Compression
• Source Interchange Format (SIF)
  – 4:2:0 chrominance sub-sampling
  – 352x240 pixel frame
• MPEG-1 compressed SIF video at 30 frames per second has a data rate of 1.86 Mbps (CD video: 40 mins of video at that rate).
• MPEG-1 can be scaled up to larger frames, but cannot handle interlacing.
MPEG Profiles & Levels
• Profiles define subsets of the features of the data stream.
• Levels define parameters such as frame size and data rate.
• Each profile may be implemented at one or more levels.
• Notation: profile@level, e.g. MP@ML.
MPEG-2 Main Profile & Level
• MPEG-2 Main Profile at Main Level (MP@ML) used for DVD video:
  – CCIR 601 scanning
  – 4:2:0 chrominance sub-sampling
  – 15 Mbits per second
  – Most elaborate representation of MPEG-2 compressed data.
MPEG-4 (1)
• Refinement of MPEG-1 compression:
  – I-pictures compressed by quantizing and Huffman coding DCT coefficients.
  – Improved motion compensation leads to better quality than MPEG-1 at the same bit rates.
• Designed to support a range of multimedia data at bit rates from 10 kbps to >1.8 Mbps.
• Applications from mobile phones to HDTV.
• Video codec becoming popular for Internet use; it is incorporated in QuickTime, RealMedia and DivX.
MPEG-4 (2)
• Standard defines an encoding for multimedia streams made up of different sorts of object: video, still images, animation, 3-D models…
• Higher profiles divide a scene into arbitrarily shaped video objects, where each one may be compressed and transmitted separately; the scene is composed at the receiving end by combining them.
• SP and ASP profiles are restricted to rectangular objects, usually complete frames.
MPEG-4 Profiles & Levels
• Simple Profile (SP), suitable for low-bandwidth streaming over the Internet:
  – P-pictures only
  – Efficient decompression, suitable for PDAs, etc.
  – SP@L1, 64 kbps, 176x144 pixel frame.
• Advanced Simple Profile (ASP), suitable for broadband streaming:
  – B-pictures
  – Global Motion Compensation
  – Sub-pixel motion compensation
  – ASP@L5, 8000 kbps, full CCIR 601 frame.
DV Compression
• Starts with chrominance sub-sampling of CCIR 601.
• Constant data rate of 25 Mbits per second; higher quality than MJPEG at the same rate.
• Apply DCT, quantization, run-length and Huffman coding on a zig-zag sequence (like JPEG) to 8x8 blocks of pixels.
• If there is little or no difference between fields (almost static frame), apply DCT to a block containing alternate lines from the odd and even fields.
• If there is motion between fields, apply DCT to two 8x4 blocks (one from each field) separately, leading to more efficient compression of frames with motion.
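The zig-zag sequence and run-length step mentioned above can be illustrated with a short sketch. It shows the generic JPEG-style zig-zag scan of an 8x8 block of quantized coefficients, followed by a simple (zero-run, value) encoding with an end-of-block marker. The exact scan DV uses for its two-field 8x4 mode and the Huffman stage are not modeled, and the function names are hypothetical.

```python
def zigzag_indices(size=8):
    """Return (row, col) pairs in JPEG-style zig-zag order."""
    order = []
    for diagonal in range(2 * size - 1):
        cells = [(row, diagonal - row)
                 for row in range(size)
                 if 0 <= diagonal - row < size]
        # Even diagonals run from bottom-left to top-right, odd ones the other way.
        if diagonal % 2 == 0:
            cells.reverse()
        order.extend(cells)
    return order


def run_length(coefficients):
    """Encode a coefficient sequence as (zeros_before, value) pairs plus 'EOB'."""
    pairs = []
    zeros = 0
    for value in coefficients:
        if value == 0:
            zeros += 1
        else:
            pairs.append((zeros, value))
            zeros = 0
    pairs.append("EOB")  # trailing zeros are implied by the end-of-block marker
    return pairs


def scan_block(block):
    """Zig-zag scan a square block and run-length encode the result."""
    return run_length([block[r][c] for r, c in zigzag_indices(len(block))])
```

Because quantization leaves most high-frequency coefficients at zero, the zig-zag scan groups those zeros into long runs near the end of the sequence, where the end-of-block marker absorbs them.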
DVI (Digital Video Interactive)
• Developed by General Electric.
• Uses specialized processors for compression.
• Hardware-only codec; lossless transforms.
• Compression rate: 80:1-160:1
  – A 10 sec video clip is compressed to ~2 MB.
• Intel: software version of DVI algorithms, marketed as Indeo (a software-only codec):
  – there is also an audio version of Indeo.
  – latest version uses a hybrid wavelet transform for the compression algorithm.
Cinepak
• Developed by Apple and SuperMac.
• Outputs 320x240 (quarter screen) at 15 fps with good quality
  – a data rate that even slow single-speed and 2x CD-ROM players can deliver.
• Software-only codec supported by Microsoft's Video for Windows and Apple's QuickTime.
• Better color definition than other codecs, so good for natural video without graphics or animation.
QuickTime
• Developed by Apple but is now cross-platform.
• Supports Cinepak, Indeo, M-JPEG and MPEG-1, and is extensible to support future codecs, such as DVCAM.
• Synchronizes all types of digital media.
• For example, video frames are dropped if necessary for synchronization with audio.
Video For Windows
• Microsoft (therefore, not cross-platform).
• Uses the generic AVI (audio video interleaved) format, which is provided by MCI (media control interface).
• Supports a number of compression methods, in real time or non-real time, with or without hardware assistance
  – Cinepak, Indeo, Microsoft Video-1.
ActiveMovie (API from Microsoft)
• Now called DirectShow (supports DVD).
• Solves problems of VfW and QuickTime.
• Cross-platform.
• Supports codecs supported by VfW as well as MPEG audio, WAV audio, MPEG video, and Apple QuickTime video.
• Fully integrated with DirectX technology, allowing use of DirectX components and more graphics card features.
Video Streaming Players
• RealVideo (from RealNetworks)
  – G2 Player also plays RealAudio.
  – Uses a variety of compression techniques.
• RealProducer (also from RealNetworks)
  – Allows you to create streaming audio and video.
  – Free software just like G2!
H.261 (Px64)
• Video compression for videoconferences
  – Compression in real time
  – Targeted to ISDN
• Compressed data stream: p*64 Kbits/s (p = 1, …, 30)
• 2 resolutions:
  – Common Intermediate Format (CIF)
  – Quarter CIF (QCIF)
H.261 (Px64) Resolutions
                     CIF                          QCIF
                     Lines/frame   Pixels/line    Lines/frame   Pixels/line
Luminance (Y)        288           352            144           176
Chrominance (Cb)     144           176            72            88
Chrominance (Cr)     144           176            72            88
• Common Intermediate Format (CIF)
• Quarter CIF (QCIF)
Image Preparation
• Uncompressed CIF
  – One frame = 288*352*8 + 2*144*176*8 = 1,216,512 bits
  – 30 fps
  – Bandwidth = 1,216,512*30 = 36.4 Mbits/s
• Uncompressed QCIF = 9.1 Mbits/s
• ISDN channels: 64 Kbits/s-2 Mbits/s
=> bit reduction required
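The arithmetic above can be checked with a short sketch. It assumes 8 bits per sample and two chrominance planes at half the luminance resolution in each dimension, as in the CIF and QCIF figures on the resolutions slide; the function names are hypothetical.

```python
def frame_bits(luma_lines, luma_pixels, bits_per_sample=8):
    """Bits in one frame: a luminance plane plus two half-resolution chrominance planes."""
    luma = luma_lines * luma_pixels * bits_per_sample
    chroma = 2 * (luma_lines // 2) * (luma_pixels // 2) * bits_per_sample
    return luma + chroma


def bandwidth_bps(luma_lines, luma_pixels, fps):
    """Uncompressed bit rate in bits per second."""
    return frame_bits(luma_lines, luma_pixels) * fps


print(frame_bits(288, 352))         # 1216512 bits per CIF frame
print(bandwidth_bps(288, 352, 30))  # 36495360 bits/s for CIF at 30 fps
print(bandwidth_bps(144, 176, 30))  # 9123840 bits/s for QCIF at 30 fps
```

The exact CIF product is 36,495,360 bits/s, which the slide quotes as 36.4 Mbits/s; QCIF comes to about 9.1 Mbits/s.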
Desktop Videophone Applications
• Channel capacity (p=1) = 64 Kbits/s
• QCIF at 10 fps --> 3 Mbits/s
• Required compression ratio = 3 Mbits/s / 64 Kbits/s = 47
• Channel capacity (p=10) = 640 Kbits/s
• CIF at 30 fps --> 36.4 Mbits/s
• Required compression ratio = 36.4 Mbits/s / 640 Kbits/s = 57
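The required compression ratio is simply the source bit rate divided by the channel capacity of p*64 Kbits/s. The sketch below uses the rounded source rates quoted on this slide and assumes 1 Kbit = 1000 bits; the function name is hypothetical.

```python
def required_compression_ratio(source_bps, p):
    """Ratio between the uncompressed source rate and a p*64 Kbits/s channel."""
    channel_bps = p * 64_000
    return source_bps / channel_bps


print(round(required_compression_ratio(3_000_000, 1)))    # 47 (QCIF at 10 fps, p=1)
print(round(required_compression_ratio(36_400_000, 10)))  # 57 (CIF at 30 fps, p=10)
```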
Audio Compression
• In general, lossy methods are required because of the complex and unpredictable nature of audio data.
• A CD-quality, stereo, 3-minute song requires over 25 Mbytes
  – Data rate exceeds the bandwidth of a dial-up Internet connection.
• The difference in the way we perceive sound and image means a different approach from image compression is needed.
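The storage figure for a CD-quality song can be worked out directly. The sketch assumes the CD-DA parameters listed elsewhere in these slides (44.1 kHz sampling, 16-bit PCM) with two channels for stereo; the 56 kbps dial-up rate used for comparison is an assumption, not a figure from the slides, and the function names are hypothetical.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def pcm_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Uncompressed PCM size in bytes for a clip of the given length."""
    return pcm_bit_rate(sample_rate_hz, bits_per_sample, channels) * seconds // 8


rate = pcm_bit_rate(44_100, 16, 2)
size = pcm_bytes(44_100, 16, 2, 3 * 60)
print(rate)           # 1411200 bits/s
print(size)           # 31752000 bytes for a 3-minute song
print(rate / 56_000)  # about 25 times an assumed 56 kbps dial-up link
```

At about 31.8 million bytes, the song is consistent with the "over 25 Mbytes" quoted above.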
Audio Compression Techniques
Sampling frequency (kHz)   Quantization (bits)   Format   Quality                  CD quality
44.1                       16                    PCM      HiFi music               CD-DA
37.8                       8                     ADPCM    HiFi music               CD-I level
37.8                       8                     ADPCM    FM broadcast (music)     CD-I level B
18.9                       4                     ADPCM    AM broadcast (speech)    CD-I level
8                          8                     PCM      Telephone                N/A
Standards of Speech Encoding
Standard   Description
G.711      PCM of voice frequencies
G.722      Audio coding at 7 kHz within 64 Kbit/s (ADPCM)
G.728      Coding of speech at 16 Kbit/s using low-delay code excited linear prediction (LD-CELP)
Basic Steps of Audio Encoding
[Figure: block diagram of an audio encoder. Blocks: Psychoacoustical Model, Filter-Banks (32 sub-bands), Quantization, Multiplexer, Entropy Coder. Input: uncompressed audio data. Output: compressed audio data. Label: Control.]
MP3
• MP3 = MPEG-1 Audio, Layer 3
• Three layers of audio compression in MPEG-1 (MPEG-2 essentially identical).
• From Layer 1 to Layer 3, the encoding process increases in complexity and the data rate for the same quality decreases
  – e.g. same quality: 192 kbps at Layer 1, 128 kbps at Layer 2, 64 kbps at Layer 3.
• 10:1 compression ratio at high quality.
• Variable bit rate coding (VBR).
Voice Quality - QoS
The Objective: Provide unfailing, ubiquitous, toll quality service
[Figure: one-way delay (ms) plotted against packet loss (%), with low and high thresholds separating regions labeled Acceptable Operation, Marginal Acceptance, Area of Unacceptable Operation, and Service Level Agreement Violation]
The Challenge: Eliminate the impact of delay-insensitive traffic on real-time traffic
QoS Parameters
                      Best            High             Medium           Best Effort
Mouth-to-Ear Delay:   0ms - 150ms     150ms - 250ms    250ms - 450ms    450ms and above
Call Setup:           0 sec - 1 sec   1 sec - 3 sec    3 sec - 5 sec    5 sec and above
Delay Budgets
[Figure: echo path across PSTN, G, IP network, G, PSTN, with echo cancellers (EC); segment delays labeled "few ms" and "hundred ms"]