Concepts of Multimedia Processing and Transmission
IT 481, Lecture #13
Dennis McCaughey, Ph.D.
4 December, 2006
08/28/2006 IT 481, Fall 2006

I-Picture Encoding Flow Chart
Slide: Courtesy, Hung Nguyen

I-Picture Encoding Process
Decompose the image into three components in RGB space
Convert RGB to YCbCr
Divide the image into macroblocks (each macroblock has 6 blocks: 4 for Y, 1 for Cb, 1 for Cr)
Apply the DCT to each block
After the DCT, quantize each coefficient
Use a zig-zag scan to gather the AC values
Use DPCM to encode the DC value, then use VLC to encode it
Use RLE to encode the AC values, then use VLC to encode them
Slide: Courtesy, Hung Nguyen
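The per-block steps of the I-picture process can be sketched in a few lines. This is a minimal illustration, not an MPEG-conformant encoder: the uniform quantizer step of 16 is an assumption (real coders use a per-frequency quantization matrix), the function names are hypothetical, and the final VLC stage is omitted.

```python
import math

N = 8  # block edge length (8x8 blocks)

def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def scale(k):
        return math.sqrt((1.0 if k == 0 else 2.0) / N)
    def basis(i, k):
        return math.cos((2 * i + 1) * k * math.pi / (2 * N))
    return [[scale(u) * scale(v) * sum(block[x][y] * basis(x, u) * basis(y, v)
                                       for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def quantize(coeffs, step=16):
    """Uniform quantizer (assumed step size, for illustration only)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]

def zigzag(block):
    """Return the N*N entries in zig-zag order; the DC value comes first."""
    order = sorted(((x, y) for x in range(N) for y in range(N)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))
    return [block[x][y] for x, y in order]

def dpcm_dc(dc_values):
    """Replace each block's DC value by its difference from the previous one."""
    prev, diffs = 0, []
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs

def rle_ac(ac_values):
    """Encode AC values as (zero_run, value) pairs, closed by an 'EOB' marker."""
    pairs, run = [], 0
    for v in ac_values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")
    return pairs

flat = [[128] * N for _ in range(N)]      # a uniform grey block
zz = zigzag(quantize(dct2(flat)))
print(zz[0], rle_ac(zz[1:]))              # DC level, then the AC run-length pairs
```

A flat block puts all of its energy in the DC term, so the AC run-length list reduces to the end-of-block marker; this is the case the DPCM and RLE stages are designed to exploit.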

The Inter-frame Encoding Flow Chart
Slide: Courtesy, Hung Nguyen

Inter-frame Encoding Process
Decompose the image into three components in RGB space
Convert RGB to YCbCr
Perform motion estimation to record the difference between the frame being encoded and the reference frame stored in the frame buffer
Divide the image into macroblocks (each macroblock has 6 blocks: 4 for Y, 1 for Cb, 1 for Cr)
Apply the DCT to each block
Quantize each coefficient
Use a zig-zag scan to gather the AC values
Reconstruct the frame and store it in the frame buffer if necessary
Apply DPCM to the DC value, then use VLC to encode it
Use RLE to encode the AC values, then use VLC to encode them
Slide: Courtesy, Hung Nguyen
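The motion estimation step can be illustrated with exhaustive block matching. This is a sketch under stated assumptions: the slides do not say which search strategy or cost is used, so full search with the sum of absolute differences (SAD) is assumed here, and the function names, block size and search range are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in cur and a shifted block in ref."""
    return sum(abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
               for y in range(size) for x in range(size))

def best_motion_vector(cur, ref, bx, by, size=4, search=2):
    """Full search over +/- search pixels; returns (dx, dy, cost) of the best match."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or bx + dx < 0:
                continue
            if by + dy + size > height or bx + dx + size > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best

# Reference frame with a non-repeating pattern, and a current frame that is
# the reference shifted by one pixel (with wrap-around at the borders).
ref = [[x * x + 3 * y * y * y for x in range(8)] for y in range(8)]
cur = [[ref[(y - 1) % 8][(x + 1) % 8] for x in range(8)] for y in range(8)]
print(best_motion_vector(cur, ref, bx=2, by=2))
```

Only the motion vector and the (ideally small) residual block then go through the DCT, quantization and entropy coding stages.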

Watermarking
A watermark is a secret code, described by a digital signal, that carries information about the copyright ownership of the product.
The watermark is embedded in the digital data in such a way that it is not visually perceptible.
The copyright owner should be the only person who can show the existence of his own watermark and thereby prove the origin of the product.

Watermark Requirements
Alterations introduced into the image or audio should be perceptually invisible.
A watermark must be undetectable and not removable by an attacker.
A sufficient number of watermarks, each detectable by its own key, can be produced in the same image or audio.
The detection of the watermark should not require the original image or audio.
A watermark should be robust against attacks that preserve the desired quality of the image or audio.

Main Features of Watermarking
Perceptual Invisibility
Trustworthy Detection
Associated Key
Automated Detection/Search
Statistical Invisibility
Multiple Watermarks
Robustness

Perceptual Invisibility
Watermark should not degrade the perceived image/audio quality
Differences may become apparent when the original and watermarked versions are directly compared

Trustworthy Detection
Watermarks should constitute a sufficient and trustworthy part of ownership.
False alarms should be extremely rare.
Watermark signatures/signals should be complex.
An enormous set of watermarks prevents recovery by trial-and-error methods.

Associated Key
Watermarks should be associated with an identifiable number called the watermark key.
The key is used to cast, detect and remove the watermark.
The key should be private and should exclusively characterize the legal owner.
Any signal extracted from the image/audio is assumed to be valid only if it can be associated with the key via a well-established algorithm.

Automated Detection/Search
Watermark should combine with a search algorithm.

Statistical Invisibility
Watermark should not be recoverable using statistical methods.
The possession of a great number of watermarked images embedded with the same key should not enable the recovery of the watermark through statistical methods.
– Watermarks should be image/audio independent.

Multiple Watermarks
Multiple watermarks assist in the case where someone illicitly watermarks an already watermarked image/audio.
Convenient in transferring copyrighted material.

Robustness
A watermark should survive some modifications to the data.
Common manipulations to image/video
– Data compression
– Filtering
– Color, quantization, brightness modifications, geometric distortions, etc.
– Other trans-coding operations

Robustness, Resilience & Detection
Attack columns AT1 to AT5 run from unintentional to intentional attacks.
Domain  AT1  AT2  AT3    AT4    AT5   Every Decoder  High Capacity  Application Example
A1      Yes  Yes  Maybe  No     No    Yes            Yes            Value-added metadata
A2      Yes  Yes  Yes    Yes    Yes   Yes            No             Copy protection
A3      Yes  Yes  Yes    Yes    Yes   No             No             Ownership/fingerprint
A4      Yes  No   No     No     Some  Yes            No             Authentication
A5      Yes  Yes  No     No     Yes   Yes            Yes            Broadcasting
A6      Yes  Yes  Maybe  Maybe  Yes   No             Yes            Secret communication

Application Domains
A1: Carrying value-added metadata
– Additional information such as hyperlinks, content-based indexing
– Malicious and non-malicious attacks
– Survive MPEG encoding
A2: Copy protection and conditional access
– Control Intellectual Property Management and Protection
– View and copy options
– Every compliant decoder must be able to trigger protection or royalty collection mechanisms at the time of decoding
– Unauthorized individuals should not be able to defeat the watermarks by any means
A3: Ownership assertion, recipient tracking
– Establish ownership and determine origin of unauthorized duplication
– Prosecution of copyright infringement

Application Domains Cont’d
A4: Authentication and verification
– Allows fragile watermarks; if the contents are modified, the watermarks should disappear
– Helps in identifying areas that were modified
A5: Broadcast monitoring
– Monitor where and when the contents are played
– Advertisements. Here heavy content degradation is less of an issue.
– Watermark removal, invalidation and forgery can be a significant concern
– Counterfeiting should be intractable for the system to be effective
A6: Secret communication or steganography
– Data hiding may require higher capacity watermarks than other applications
– Secrecy may be the overriding concern in some applications

Attacks
AT1: Basic attacks
– Lossy compression, frame dropping & temporal rescaling
AT2: Simple attacks
– Blurring, median filtering, noise addition, gamma correction and sharpening
AT3: Normal attacks
– Translation, cropping and scaling
AT4: Enhanced attacks
– Aspect ratio change & random geometric perturbations (Stirmark)
AT5: Advanced attacks
– Delete/insert watermarks, single-document watermark estimation attacks & multiple-document statistical attacks

Human Perception
Watermarking schemes take advantage of the fact that the human audio and visual systems are imperfect detectors.
Audio & visual signals must have a minimum intensity or contrast before they are perceptible.
These minima are spatially, temporally and frequency dependent.
These dependencies are either implicitly or explicitly exploited

Transform Domain Considerations
The human eye is more sensitive to noise in the lower frequency range than in the higher frequency range.
However, energy in most images is concentrated in the lower frequency range.
Quantization used in DCT-based compression reflects the HVS, which is less sensitive at the higher frequencies.
A trade-off is required to balance watermark invisibility and survivability, resulting in the use of the mid-frequency terms.

Transform Domain Considerations
An alteration of a transform coefficient is spread across the entire spatial block.
A one-dimensional example:
[Figure: two plots over sample index 0 to 140, amplitude 0 to 1. Panel titles: DCT Spectrum, Time Sequence.]
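The spreading effect can be checked numerically. The sketch below is an illustration only: it assumes a length of 128 samples and an orthonormal DCT-II, sets a single coefficient (index 5, an arbitrary choice) in an otherwise empty spectrum, and inverts the transform. The function names are hypothetical.

```python
import math

def _scale(k, n):
    return math.sqrt((1.0 if k == 0 else 2.0) / n)

def dct(x):
    """Orthonormal 1-D DCT-II."""
    n = len(x)
    return [_scale(k, n) * sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                               for i in range(n))
            for k in range(n)]

def idct(X):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(X)
    return [sum(_scale(k, n) * X[k] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                for k in range(n))
            for i in range(n)]

spectrum = [0.0] * 128
spectrum[5] = 1.0                      # alter a single transform coefficient
sequence = idct(spectrum)

changed = sum(1 for v in sequence if abs(v) > 1e-9)
print(changed, "of", len(sequence), "time samples are affected")
```

Every time sample moves by a small amount, which is why a modest change to one mid-frequency coefficient can stay below the visibility threshold while still being recoverable from the transform domain.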

Data Embedding Algorithm
[Figure: block diagram. Inputs: Signal (image, audio or video), Information, Key. Blocks: Perceptual Analysis, Embedding Algorithm. Output: Signal with embedded data.]

Embedded Data Examples
Multilingual soundtracks within a motion picture
Copyright data
Distribution permissions
Data used for accounting, billing and royalties
Etc.

Watermarking Techniques
Non-Blind: Watermark recovery requires the original
Blind: Watermark recovery does not require the original
Spatial domain or transform domain embedding
Spatial domain:
– LSB, color palette, geometric
Transform domain:
– FFT, DCT, Wavelet
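As an illustration of the simplest spatial-domain technique listed above, the following sketch embeds bits in the least significant bit (LSB) of selected pixels and recovers them blindly, that is, without the original image. Using the key to seed a pseudo-random choice of pixel positions is an assumption made for this example, and the function names are hypothetical.

```python
import random

def _positions(key, n_pixels, n_bits):
    """Key-dependent pseudo-random pixel positions (assumed scheme)."""
    return random.Random(key).sample(range(n_pixels), n_bits)

def embed_lsb(pixels, bits, key):
    """Return a copy of pixels with each bit written into one pixel's LSB."""
    if len(bits) > len(pixels):
        raise ValueError("not enough pixels for the watermark")
    marked = list(pixels)
    for pos, bit in zip(_positions(key, len(pixels), len(bits)), bits):
        marked[pos] = (marked[pos] & ~1) | bit
    return marked

def extract_lsb(marked, n_bits, key):
    """Blind recovery: only the marked pixels and the key are needed."""
    return [marked[pos] & 1 for pos in _positions(key, len(marked), n_bits)]

pixels = list(range(50, 114))            # 64 grey levels standing in for an image
bits = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed_lsb(pixels, bits, key=1234)
print(extract_lsb(marked, len(bits), key=1234) == bits)
print(max(abs(a - b) for a, b in zip(pixels, marked)))   # never more than 1 grey level
```

Each pixel changes by at most one grey level, so the mark is imperceptible, but it is also fragile: lossy compression or filtering rewrites the low-order bits and destroys it.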

Error Resilience
Redundancy is added to the compressed bitstream to allow the detection and correction of errors
– Can be added in either the source or channel encoder
Shannon Information Theory:
– Separately design the source and channel coders to achieve error-free transmission so long as the source is represented by a rate below the channel capacity
The source coder should compress the source as much as possible
The channel coder, via Forward Error Correction (FEC), adds redundancy bits to enable error detection and correction

Binary Symmetric Channel
[Figure: binary symmetric channel transition diagram. Each input bit is received correctly with probability 1-p and flipped with probability p.]
Capacity C = 1 + p log2(p) + (1-p) log2(1-p)
p         C
1.00E-06  1.0000
1.00E-05  0.9998
1.00E-04  0.9985
1.00E-03  0.9886
1.00E-02  0.9192
1.00E-01  0.5310
5.00E-01  0.0000
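The capacity values can be reproduced directly from the formula. A minimal sketch; the function name is hypothetical, and the endpoints p = 0 and p = 1 are handled separately because log2(0) is undefined.

```python
import math

def bsc_capacity(p):
    """Capacity in bits per channel use of a binary symmetric channel."""
    if p in (0.0, 1.0):
        return 1.0          # a channel that never (or always) flips is noiseless
    return 1.0 + p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p)

for p in (1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 0.5):
    print(f"{p:.2E}  {bsc_capacity(p):.4f}")
```

At p = 0.5 the output is statistically independent of the input, so the capacity drops to zero.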

Shannon’s Capacity Theorem
If the rate R = log2(m)/L of a code is less than the channel capacity C, there exists a combination of source and channel encoders such that the source can be communicated over the channel with fidelity arbitrarily close to perfect
– m = Number of message words
– L = Number of code word bits
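A small worked example of the rate condition, assuming a binary symmetric channel and a code with m = 16 message words and L = 7 code word bits (the parameters of a (7,4) block code, chosen here only for illustration). The function names are hypothetical.

```python
import math

def code_rate(m, L):
    """R = log2(m) / L, with m message words and L code word bits."""
    return math.log2(m) / L

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability 0 < p < 1."""
    return 1.0 + p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p)

R = code_rate(16, 7)
for p in (0.01, 0.1):
    C = bsc_capacity(p)
    print(f"p={p}: R={R:.3f}, C={C:.3f}, R < C: {R < C}")
```

The theorem is an existence result: R < C says that some sufficiently long code at that rate can be made arbitrarily reliable, not that this particular short code achieves it.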

Protocols for Multimedia
Network Layer Protocol
Transport Protocol
– UDP
– TCP
– RTP
– RTCP
Session Control Protocol

Standardized Protocols
Several protocols have been standardized for communication between clients and streaming servers.
Future research topics on the design of protocols include:
– 1) How to take caches into account (e.g., how to communicate with continuous media caches and how to control continuous media caches);
– 2) How to efficiently support pause/resume operations in caches (since the pause/resume operations interfere with the sharing of a multimedia stream among different viewers); and
– 3) How to provide security in the protocols

Streaming Server Components
Communicator:
– A communicator involves the application layer and transport protocols implemented on the server.
– Through a communicator, the clients can communicate with a server and retrieve multimedia contents in a continuous and synchronous manner.
Operating system:
– Unlike traditional operating systems, an operating system for streaming services needs to satisfy real-time requirements for streaming applications.
Storage system: A storage system for streaming services

Protocol Stack for Multimedia

RTP
RTP does not guarantee QoS or reliable delivery, but rather provides the following functions in support of media streaming:
Time-stamping: RTP provides time-stamping to synchronize different media streams.
Sequence numbering: RTP employs sequence numbering to place the incoming RTP packets in the correct order, since packets arriving at the receiver may be out of sequence (UDP does not deliver packets in sequence).
Payload type identification: The type of the payload contained in an RTP packet is indicated by an RTP-header field called payload type identifier.
– The receiver interprets the content of the packet based on the payload type identifier.
– Certain common payload types such as MPEG audio and video have been assigned payload type numbers.
– For other payloads, this assignment can be done with session control protocols.
Source identification: The source of each RTP packet is identified by an RTP-header field called Synchronization Source identifier (SSRC), which provides a means for the receiver to distinguish different sources.
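The four functions above map onto fields of the 12-byte fixed RTP header defined in RFC 3550. The sketch below parses that header with the standard library; the function name and the sample field values are made up for illustration, and header extensions and CSRC entries are not handled.

```python
import struct

def parse_rtp_header(packet):
    """Decode the 12-byte fixed RTP header (RFC 3550 layout, network byte order)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "padding": (b0 >> 5) & 1,
        "extension": (b0 >> 4) & 1,
        "csrc_count": b0 & 0x0F,
        "marker": b1 >> 7,
        "payload_type": b1 & 0x7F,     # e.g. 14 = MPEG audio, 32 = MPEG video
        "sequence_number": seq,        # used to restore packet order
        "timestamp": timestamp,        # used to synchronize media streams
        "ssrc": ssrc,                  # identifies the source
    }

# Build two sample packets (illustrative values) that arrive out of order.
def make_packet(seq, ts):
    return struct.pack("!BBHII", 0x80, 32, seq, ts, 0xDEADBEEF) + b"payload"

arrived = [make_packet(1001, 93000), make_packet(1000, 90000)]
ordered = sorted(arrived, key=lambda p: parse_rtp_header(p)["sequence_number"])
print([parse_rtp_header(p)["sequence_number"] for p in ordered])
```

Sorting on the raw sequence number ignores the 16-bit wrap-around; a real receiver has to account for it.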

RTCP
QoS feedback: This is the primary function of RTCP.
– RTCP provides feedback to an application regarding the quality of data distribution.
– The feedback is in the form of sender reports (sent by the source) and receiver reports (sent by the receiver).
– The reports can contain information on the quality of reception such as:
– 1) Fraction of the lost RTP packets since the last report;
– 2) Cumulative number of lost packets since the beginning of reception;
– 3) Packet interarrival jitter; and
– 4) Delay since receiving the last sender’s report.
Participant identification: A source can be identified by the SSRC field in the RTP header.
– RTCP provides a human-friendly mechanism for source identification.
– RTCP SDES (source description) packets contain textual information called canonical names as globally unique identifiers of the session participants.
– It may include a user’s name, telephone number, email address, and other information.
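The packet interarrival jitter reported in receiver reports is a running estimate. The sketch below follows the estimator given in RFC 3550 (each new transit-time difference is blended in with gain 1/16); the function name and the sample timings are made up, and arrival times are assumed to be expressed in the same units as the RTP timestamps.

```python
def update_jitter(jitter, prev_transit, arrival, rtp_timestamp):
    """One step of the interarrival jitter estimate; returns (jitter, transit)."""
    transit = arrival - rtp_timestamp
    if prev_transit is None:           # first packet: nothing to compare with
        return jitter, transit
    d = abs(transit - prev_transit)    # change in relative transit time
    jitter += (d - jitter) / 16.0
    return jitter, transit

jitter, transit = 0.0, None
# (arrival time, RTP timestamp) pairs; the last packet is delayed by 160 units.
for arrival, ts in [(1000, 0), (1160, 160), (1320, 320), (1640, 480)]:
    jitter, transit = update_jitter(jitter, transit, arrival, ts)
print(jitter)
```

The 1/16 gain smooths the estimate, so a single late packet raises the reported jitter only modestly.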

RTCP
Control packets scaling: To scale the RTCP control packet transmission with the number of participants, a control mechanism is designed as follows.
– The control mechanism keeps the total control packets to 5% of the total session bandwidth.
– Among the control packets, 25% are allocated to the sender reports and 75% to the receiver reports.
– To prevent control packet starvation, at least one control packet is sent within 5 s at the sender or receiver.
Inter-media synchronization: RTCP sender reports contain an indication of real time and the corresponding RTP timestamp. This can be used in inter-media synchronization like lip synchronization in video.
Minimal session control information: This optional functionality can be used for transporting session information such as names of the participants.
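The bandwidth split described above is simple arithmetic. A minimal sketch with a hypothetical function name; the 1 Mbit/s session bandwidth is an example value.

```python
def rtcp_bandwidth(session_bw_bps):
    """Split the RTCP share (5% of the session) 25/75 between senders and receivers."""
    control = session_bw_bps * 5 / 100
    return {
        "control": control,
        "sender_reports": control * 25 / 100,
        "receiver_reports": control * 75 / 100,
    }

shares = rtcp_bandwidth(1_000_000)     # a 1 Mbit/s session
for name, bps in shares.items():
    print(f"{name}: {bps:.0f} bit/s")
```

Because the control budget is fixed, each participant has to report less often as the number of participants grows.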

Session Control Protocols
RTSP functions
– Supports VCR-like control operations such as stop, pause/resume, fast forward, and fast backward.
– Provides a means for choosing delivery channels (e.g., UDP, multicast UDP, or TCP), and delivery mechanisms based upon RTP.
– RTSP works for multicast as well as unicast.
– Also establishes and controls streams of continuous audio and video media between the media servers and the clients.
Specifically, RTSP provides the following operations.
– Media retrieval: The client can request a presentation description, and ask the server to set up a session to send the requested media data;
– Adding media to an existing session: The server or the client can notify each other about any additional media becoming available to the established session
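RTSP is a text protocol, so the media retrieval and VCR-like operations can be illustrated by formatting the requests a client would send. This is a sketch only: the helper name, URL, port numbers and session identifier are hypothetical, and no network connection is made.

```python
def rtsp_request(method, url, cseq, headers=None):
    """Format one RTSP/1.0 request: request line, CSeq, optional headers, blank line."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"

url = "rtsp://example.com/movie"                     # hypothetical presentation
session = {"Session": "12345678"}                    # hypothetical session id
requests = [
    rtsp_request("DESCRIBE", url, 1),                # ask for the presentation description
    rtsp_request("SETUP", url + "/track1", 2,
                 {"Transport": "RTP/AVP;unicast;client_port=5000-5001"}),
    rtsp_request("PLAY", url, 3, session),
    rtsp_request("PAUSE", url, 4, session),
    rtsp_request("TEARDOWN", url, 5, session),
]
print(requests[1])
```

The SETUP request is where the delivery channel is chosen; the media itself then flows over RTP, outside the RTSP connection.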

Session Initiation Protocol
Similar to RTSP, SIP can also create and terminate sessions with one or more participants.
Unlike RTSP, SIP supports user mobility by proxying and redirecting requests to the user’s current location.

Types of Codes
Block Codes
– Hamming Codes
– Bose-Chaudhuri-Hocquenghem (BCH) Codes
– Reed-Solomon Codes
Convolutional Codes
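As a concrete instance of a block code, the sketch below implements the Hamming (7,4) code, which adds three parity bits to four data bits and corrects any single bit error. The bit ordering (parity bits at positions 1, 2 and 4) is one common convention and the function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits as the 7-bit code word [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(code):
    """Correct up to one flipped bit, then return the 4 data bits."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3   # 0 means no error detected
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1                                 # the channel flips one bit
print(hamming74_decode(word))
```

The three syndrome bits read out the position of the flipped bit directly, which is what makes decoding this code so cheap.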

DVB-S Transmission System Ku Band

Specifics
What is the Packetizer function?
What is the PES?
What is the transport stream?
What is the function of the scrambler?
Why the R-S coder?
Why the Interleaver?
Why the Convolutional encoder?
Why QPSK for satellites?
Why COFDM for terrestrial?

Overview of MPEG-4 System
[Figure: MPEG-4 system block diagram. Scene segmentation and depth layering separates objects O1, O2 and O3; layered encoding codes contour, motion and texture for each object into bitstream layers 1 to 3; a multiplexer and demultiplexer carry the layers; separate decoding and a compositor reconstruct the AV-objects.]

Two Enhancement Types in MPEG-4 Temporal Scalability
1. Type I: The enhancement layer improves the resolution of only a portion of the base layer.
2. Type II: The enhancement layer improves the resolution of the entire base layer.
In enhancement type I, only a selected region of the VOP (e.g., just the car) is enhanced, while the rest (e.g., the landscape) is not.
In enhancement type II, enhancement is applicable only at the entire VOP level.

DVB-T/H Transmitter
NOKIA

Terrestrial Drivers
Terrestrial broadcasts are omnidirectional
Multiple copies of the same signal may arrive at the receiver with slightly different delays and thus interfere with each other
– Multipath = (direct path signal + reflected signal + refracted signal)
– Intersymbol Interference (ISI)
– Limits the bit rate that may be achieved
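Intersymbol interference can be seen in a two-path toy model: the receiver gets the direct signal plus one delayed, attenuated copy. The echo gain of 0.5 and the delay of one symbol are assumptions chosen for illustration, and the function name is hypothetical.

```python
def two_path_channel(symbols, echo_gain=0.5, echo_delay=1):
    """Received samples = direct path + one delayed, attenuated echo."""
    received = [0.0] * (len(symbols) + echo_delay)
    for n, s in enumerate(symbols):
        received[n] += s                          # direct path
        received[n + echo_delay] += echo_gain * s # reflected path
    return received

symbols = [1, -1, -1, 1]                          # antipodal symbols
print(two_path_channel(symbols))
```

Each received sample now depends on the previous symbol as well as the current one. Shortening the symbol period makes the same physical delay span more symbols, which is how multipath limits the achievable bit rate.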

Multipath