Concepts of Multimedia Processing and Transmission IT 481, Lecture #13 Dennis McCaughey, Ph.D. 4 December, 2006


08/28/2006IT 481, Fall 20062

I-Picture Encoding Flow Chart

Slide: Courtesy, Hung Nguyen

I-Picture Encoding Process

Decompose the image into its three components in RGB space
Convert RGB to YCbCr
Divide the image into macroblocks (each macroblock has 6 blocks: 4 for Y, 1 for Cb, 1 for Cr)
Apply the DCT to each block
Quantize each DCT coefficient
Use a zig-zag scan to gather the AC values
Use DPCM to encode the DC value, then use VLC to encode it
Use RLE to encode the AC values, then use VLC to encode them
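The zig-zag, DPCM, and RLE steps listed above can be sketched in a few lines. This is a minimal illustration, not the MPEG bitstream syntax: the function names, the (zero_run, value) pair format, and the "EOB" marker are assumptions made for the example, and the final VLC stage is omitted.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag scan order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Walk the anti-diagonals; the scan direction alternates on each diagonal.
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))


def run_length_encode(ac):
    """Encode AC values as (zero_run, value) pairs followed by an end-of-block marker."""
    pairs, run = [], 0
    for v in ac:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied by the marker (assumed format)
    return pairs


def dpcm(dc_values):
    """Replace each DC value by its difference from the previous block's DC value."""
    prev, diffs = 0, []
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs


def encode_block(block):
    """Zig-zag scan a quantized block; return its DC value and run-length coded AC values."""
    scanned = [block[r][c] for r, c in zigzag_order(len(block))]
    return scanned[0], run_length_encode(scanned[1:])


# A quantized 8 x 8 block with only a few non-zero low-frequency coefficients.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 12, 3, -2, 1
dc, ac = encode_block(block)
print(dc, ac)
print(dpcm([12, 15, 11]))
```

Because quantization zeroes most high-frequency coefficients, the zig-zag order tends to push the zeros to the end of the scan, which is what makes the run-length stage effective.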


The Inter-frame Encoding Flow Chart


Inter-frame Encoding Process

Decompose the image into its three components in RGB space
Convert RGB to YCbCr
Perform motion estimation to record the difference between the frame being encoded and the reference frame stored in the frame buffer
Divide the image into macroblocks (each macroblock has 6 blocks: 4 for Y, 1 for Cb, 1 for Cr)
Apply the DCT to each block
Quantize each DCT coefficient
Use a zig-zag scan to gather the AC values
Reconstruct the frame and store it in the frame buffer if necessary
Use DPCM to encode the DC value, then use VLC to encode it
Use RLE to encode the AC values, then use VLC to encode them
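The motion estimation step can be illustrated with a full-search block matcher. This is a sketch under stated assumptions: the slides do not name a search strategy or cost function, so the sum of absolute differences (SAD), the 4 x 4 block size, and the search range of 2 pixels are all choices made for the example.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of cur at (bx, by)
    and the block of ref displaced by (dx, dy)."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))


def best_motion_vector(cur, ref, bx, by, n=4, search=2):
    """Full search over displacements in [-search, search]; returns (dx, dy, sad)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= w
                    and 0 <= by + dy and by + dy + n <= h):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Reference frame: a simple ramp. Current frame: the same content shifted by one pixel.
ref = [[10 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[(y + 1) % 8][(x - 1) % 8] for x in range(8)] for y in range(8)]
print(best_motion_vector(cur, ref, 2, 2))
```

Only the motion vector and the residual (the difference between the block and its best match) then need to be coded, which is why the residual goes through the same DCT, quantization, and entropy coding steps as an I-picture block.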


Watermarking

A watermark is a secret code, described by a digital signal, that carries information about the copyright ownership of the product.

The watermark is embedded in the digital data in such a way that it is not visually perceptible.

The copyright owner should be the only person who can show the existence of his own watermark and thereby prove the origin of the product.

Watermark Requirements

Alterations introduced into the image or audio should be perceptually invisible.

A watermark must be undetectable and not removable by an attacker.

It must be possible to produce a sufficient number of watermarks in the same image or audio, each detectable by its own key.

The detection of the watermark should not require the original image or audio.

A watermark should be robust against attacks which preserve the desired quality of the image or audio.

Main Features of Watermarking

Perceptual Invisibility
Trustworthy Detection
Associated Key
Automated Detection/Search
Statistical Invisibility
Multiple Watermarks
Robustness

Perceptual Invisibility

Watermark should not degrade the perceived image/audio quality

Differences may become apparent when the original and watermarked versions are directly compared

Trustworthy Detection

Watermarks should constitute sufficient and trustworthy evidence of ownership.

False alarms should be extremely rare.

Watermark signatures/signals should be complex.

An enormous set of watermarks prevents recovery by trial-and-error methods.

Associated Key

Watermarks should be associated with an identifiable number called the watermark key.

The key is used to cast, detect, and remove the watermark.

The key should be private and should exclusively characterize the legal owner.

Any signal extracted from the image/audio is assumed to be valid only if it can be associated with the key via a well-established algorithm.

Automated Detection/Search

The watermark should combine easily with a search algorithm, so that detection can be automated.

Statistical Invisibility

Watermark should not be recoverable using statistical methods.

The possession of a great number of watermarked images embedded with the same key should not enable the recovery of the watermark through statistical methods.
– Watermarks should be image/audio independent.

Multiple Watermarks

Multiple watermarks assist in the case where someone illicitly watermarks an already watermarked image/audio.

Convenient in transferring copyrighted material.

Robustness

A watermark should survive some modifications to the data.

Common manipulations to image/video
– Data compression
– Filtering
– Color, quantization, brightness modifications, geometric distortions, etc.
– Other transcoding operations

Robustness, Resilience & Detection

Requirements by application domain (A1 to A6). Columns AT1 to AT5 are the unintentional and intentional attack classes the watermark must survive.

Domain  AT1  AT2  AT3    AT4    AT5   Every Decoder  High Capacity  Application Example
A1      Yes  Yes  Maybe  No     No    Yes            Yes            Value-added metadata
A2      Yes  Yes  Yes    Yes    Yes   Yes            No             Copy protection
A3      Yes  Yes  Yes    Yes    Yes   No             No             Ownership/fingerprint
A4      Yes  No   No     No     Some  Yes            No             Authentication
A5      Yes  Yes  No     No     Yes   Yes            Yes            Broadcasting
A6      Yes  Yes  Maybe  Maybe  Yes   No             Yes            Secret communication

Application Domains

A1: Carrying value-added metadata
– Additional information such as hyperlinks, content-based indexing
– Malicious and non-malicious attacks
– Survive MPEG encoding
A2: Copy protection and conditional access
– Control Intellectual Property Management and Protection
– View and copy options
– Every compliant decoder must be able to trigger protection or royalty collection mechanisms at the time of decoding
– Unauthorized individuals should not be able to defeat the watermarks by any means
A3: Ownership assertion, recipient tracking
– Establish ownership and determine origin of unauthorized duplication.
– Prosecution of copyright infringement

Application Domains Cont’d

A4: Authentication and verification
– Allows fragile watermarks; if the contents are modified, the watermarks should disappear.
– Helps in identifying areas that were modified
A5: Broadcast monitoring
– Monitor where and when the contents are played
– Advertisements. Here heavy content degradation is less of an issue.
– Watermark removal, invalidation and forgery can be a significant concern
– Counterfeiting should be intractable for the system to be effective
A6: Secret communication or steganography
– Data hiding may require higher capacity watermarks than other applications
– Secrecy may be the overriding concern in some applications

Attacks

AT1: Basic attacks
– Lossy compression, frame dropping & temporal rescaling
AT2: Simple attacks
– Blurring, median filtering, noise addition, gamma correction and sharpening
AT3: Normal attacks
– Translation, cropping and scaling
AT4: Enhanced attacks
– Aspect ratio change & random geometric perturbations (Stirmark)
AT5: Advanced attacks
– Delete/insert watermarks, single-document watermark estimation attacks & multiple-document statistical attacks

Human Perception

Watermarking schemes take advantage of the fact that the human audio and visual systems are imperfect detectors.

Audio & visual signals must have a minimum intensity or contrast before they are perceptible.

These minima are spatially, temporally and frequency dependent.

These dependencies are either implicitly or explicitly exploited

Transform Domain Considerations

The human eye is more sensitive to noise in the lower frequency range than to noise in the higher frequency range.

However, energy in most images is concentrated in the lower frequency range.

Quantization used in DCT-based compression reflects the human visual system (HVS), which is less sensitive at the higher frequencies.

A trade-off is required to balance watermark invisibility and survivability, which results in the use of the mid-frequency terms.

Transform Domain Considerations

An alteration of a transform coefficient is spread across the entire spatial block

A one dimensional example:

[Figure: two panels, "DCT Spectrum" and "Time Sequence", each plotted over index 0 to 140 with amplitude 0 to 1]
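The one-dimensional example can be reproduced numerically. The sketch below sets a single DCT coefficient and inverts the transform; the block length of 128, the coefficient index 10, and the orthonormal scaling are assumptions made for the illustration, since the slide does not state them.

```python
import math


def idct(coeffs):
    """Orthonormal inverse DCT (DCT-III) of a 1-D coefficient list."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(s)
    return out


N = 128                      # assumed block length
spectrum = [0.0] * N
spectrum[10] = 1.0           # alter a single transform coefficient
sequence = idct(spectrum)    # the change shows up in every time sample

changed = sum(1 for v in sequence if abs(v) > 1e-12)
energy = sum(v * v for v in sequence)
print(changed, "of", N, "samples changed; energy =", round(energy, 6))
```

A watermark added to one coefficient is therefore a low-amplitude cosine spread over the whole block rather than a visible local change.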

Data Embedding Algorithm

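[Block diagram: the signal (image, audio or video), the information to embed, and a key are inputs to the embedding algorithm, which is guided by perceptual analysis; the output is the signal with embedded data]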

Embedded Data Examples

Multilingual soundtracks within a motion picture
Copyright data
Distribution permissions
Data used for accounting, billing, and royalties
Etc.

Watermarking Techniques

Non-Blind: Watermark recovery requires the original

Blind: Watermark recovery does not require the original

Spatial domain or transform domain embedding

Spatial domain:
– LSB, color palette, geometric

Transform domain:
– FFT, DCT, Wavelet
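As an illustration of the simplest spatial-domain technique in the list, the sketch below hides payload bits in the least significant bit (LSB) of each sample. The function names and the one-bit-per-pixel layout are assumptions for the example; no key is used here, so this shows only the embedding mechanics.

```python
def embed_lsb(pixels, bits):
    """Write one payload bit into the least significant bit of each pixel."""
    if len(bits) > len(pixels):
        raise ValueError("payload longer than cover signal")
    marked = list(pixels)
    for i, b in enumerate(bits):
        marked[i] = (marked[i] & ~1) | b   # clear the LSB, then set it to the payload bit
    return marked


def extract_lsb(pixels, count):
    """Blind recovery: read the payload back without the original image."""
    return [p & 1 for p in pixels[:count]]


cover = [200, 13, 77, 54, 255, 0]      # 8-bit grey levels
payload = [1, 0, 0, 1, 0, 1]
marked = embed_lsb(cover, payload)
print(marked, extract_lsb(marked, len(payload)))
```

Each pixel changes by at most one grey level, so the mark is perceptually invisible, and recovery is blind. The mark is fragile, however: lossy compression or filtering rewrites the low-order bits.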

Error Resilience

Redundancy is added to the compressed bitstream to allow the detection and correction of errors
– Can be added in either the source or channel encoder

Shannon Information Theory:
– Separately design the source and channel coders to achieve error-free transmission so long as the source is represented by a rate below the channel capacity

The source coder should compress the source as much as possible

The channel coder, via Forward Error Correction (FEC), adds redundancy bits to enable error detection and correction

Binary Symmetric Channel

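[Channel diagram: each transmitted bit is received correctly with probability 1-p and inverted with probability p]

Capacity C = 1 + p log2(p) + (1-p) log2(1-p)

p          C
1.00E-06   1.0000
1.00E-05   0.9998
1.00E-04   0.9985
1.00E-03   0.9886
1.00E-02   0.9192
1.00E-01   0.5310
5.00E-01   0.0000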
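The capacity values above can be recomputed directly from the formula. The function name below is an assumption; the expression is the one given on the slide.

```python
import math


def bsc_capacity(p):
    """Capacity in bits per channel use of a binary symmetric channel with crossover probability p."""
    if p in (0.0, 1.0):
        return 1.0  # p*log2(p) tends to 0 at both end points
    return 1.0 + p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p)


for p in (1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 0.5):
    print(f"{p:.2E}  {bsc_capacity(p):.4f}")
```

At p = 0.5 the output is statistically independent of the input, so the capacity drops to zero.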

Shannon’s Capacity Theorem

If the rate R of a code, R = log2(m)/L, is less than the channel capacity C, there exists a combination of source and channel encoders such that the source can be communicated over the channel with fidelity arbitrarily close to perfect
– m = Number of message words
– L = Number of code word bits
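A short worked example of the rate condition follows. The code parameters (m = 16 message words, L = 7 code word bits) and the two crossover probabilities are chosen for illustration only; the capacity expression is that of the binary symmetric channel.

```python
import math


def code_rate(m, L):
    """Rate R = log2(m) / L for m message words carried in L code word bits."""
    return math.log2(m) / L


def bsc_capacity(p):
    """Binary symmetric channel capacity for 0 < p < 1."""
    return 1.0 + p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p)


R = code_rate(16, 7)   # 4 information bits in 7 code bits, about 0.571
for p in (0.01, 0.1):
    C = bsc_capacity(p)
    verdict = "R < C" if R < C else "R >= C"
    print(f"p = {p}: R = {R:.3f}, C = {C:.3f}, {verdict}")
```

The theorem only states that a suitable code exists when R < C; it does not say that this particular code achieves arbitrarily small error.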

Protocols for Multimedia

Network Layer Protocol
Transport Protocol
– UDP
– TCP
– RTP
– RTCP
Session Control Protocol

Standardized Protocols

Several protocols have been standardized for communication between clients and streaming servers.

Future research topics on design of protocols include:
– 1) How to take caches into account (e.g., how to communicate with continuous media caches and how to control continuous media caches);
– 2) How to efficiently support pause/resume operations in caches (since the pause/resume operations interfere with the sharing of a multimedia stream among different viewers); and
– 3) How to provide security in the protocols

Streaming Server Components

Communicator:
– A communicator involves the application layer and transport protocols implemented on the server.
– Through a communicator, the clients can communicate with a server and retrieve multimedia contents in a continuous and synchronous manner.

Operating system:
– Different from traditional operating systems, an operating system for streaming services needs to satisfy real-time requirements for streaming applications.

Storage system: A storage system for streaming services

Protocol Stack for Multimedia

RTP

RTP does not guarantee QoS or reliable delivery, but rather, provides the following functions in support of media streaming:

Time-stamping: RTP provides time-stamping to synchronize different media streams.

Sequence numbering: Since packets arriving at the receiver may be out of sequence (UDP does not deliver packets in sequence), RTP employs sequence numbering to place the incoming RTP packets in the correct order.

Payload type identification: The type of the payload contained in an RTP packet is indicated by an RTP-header field called payload type identifier.

– The receiver interprets the content of the packet based on the payload type identifier.

– Certain common payload types such as MPEG audio and video have been assigned payload type numbers.

– For other payloads, this assignment can be done with session control protocols.

Source identification: The source of each RTP packet is identified by an RTP-header field called Synchronization Source identifier (SSRC), which provides a means for the receiver to distinguish different sources.
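The four functions above map onto fields of the 12-byte fixed RTP header. The sketch below parses those fields and reorders packets by sequence number; the function names are assumptions, header extensions and CSRC entries are ignored, and sequence-number wrap-around is not handled.

```python
import struct


def parse_rtp_header(packet):
    """Extract the fixed-header fields of an RTP packet (first 12 bytes)."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header is 12 bytes")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,           # top two bits of the first byte
        "marker": b1 >> 7,
        "payload_type": b1 & 0x7F,    # payload type identifier
        "sequence_number": seq,       # used to restore packet order
        "timestamp": timestamp,       # used for synchronization
        "ssrc": ssrc,                 # synchronization source identifier
    }


def reorder(packets):
    """Place packets in sequence-number order (wrap-around ignored for brevity)."""
    return sorted(packets, key=lambda p: parse_rtp_header(p)["sequence_number"])


def make_packet(seq, ts, payload=b""):
    # Version 2, payload type 32 (the static number assigned to MPEG video),
    # and an arbitrary example SSRC.
    return struct.pack("!BBHII", 0x80, 32, seq, ts, 0x1234ABCD) + payload


arrived = [make_packet(3, 9000), make_packet(1, 3000), make_packet(2, 6000)]
ordered = [parse_rtp_header(p)["sequence_number"] for p in reorder(arrived)]
print(parse_rtp_header(arrived[0]), ordered)
```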

RTCP

QoS feedback: This is the primary function of RTCP.
– RTCP provides feedback to an application regarding the quality of data distribution.
– The feedback is in the form of sender reports (sent by the source) and receiver reports (sent by the receiver).
– The reports can contain information on the quality of reception such as:
– 1) Fraction of the lost RTP packets since the last report;
– 2) Cumulative number of lost packets since the beginning of reception;
– 3) Packet interarrival jitter; and
– 4) Delay since receiving the last sender’s report.

Participant identification: A source can be identified by the SSRC field in the RTP header.

– RTCP provides a human-friendly mechanism for source identification.

– RTCP SDES (source description) packets contain textual information called canonical names as globally unique identifiers of the session participants.

– It may include a user’s name, telephone number, email address, and other information.
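Item 3 of the QoS feedback list, packet interarrival jitter, is typically computed as a running estimate. The sketch below follows the estimator defined in the RTP specification (RFC 3550): the transit-time difference between consecutive packets, smoothed with a gain of 1/16. The function name and the sample timestamps are assumptions for the example.

```python
def interarrival_jitter(send_ts, recv_ts):
    """Running jitter estimate from RTP send timestamps and arrival times
    (both in the same timestamp units)."""
    jitter = 0.0
    for i in range(1, len(send_ts)):
        # Difference in transit time between packet i and packet i-1.
        d = (recv_ts[i] - recv_ts[i - 1]) - (send_ts[i] - send_ts[i - 1])
        jitter += (abs(d) - jitter) / 16.0
    return jitter


sent = [0, 160, 320, 480]
steady = [10, 170, 330, 490]     # constant network delay
uneven = [10, 170, 346, 490]     # third packet arrives 16 units late
print(interarrival_jitter(sent, steady), interarrival_jitter(sent, uneven))
```

A constant delay gives zero jitter; only variation in delay contributes to the estimate.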

RTCP

Control packets scaling: To scale the RTCP control packet transmission with the number of participants, a control mechanism is designed as follows.
– The control mechanism keeps the total control packets to 5% of the total session bandwidth.
– Among the control packets, 25% are allocated to the sender reports and 75% to the receiver reports.
– To prevent control packet starvation, at least one control packet is sent within 5 s at the sender or receiver.

Inter-media synchronization: RTCP sender reports contain an indication of real time and the corresponding RTP timestamp. This can be used in inter-media synchronization like lip synchronization in video.

Minimal session control information: This optional functionality can be used for transporting session information such as names of the participants.
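The scaling rule can be made concrete with a small calculation. This sketch applies the 5% and 25%/75% figures from the slide; the function names, the example session bandwidth, the receiver count, and the average report size are assumptions, and the 5 s rule is not modelled.

```python
def rtcp_bandwidth_split(session_bw_bps, rtcp_fraction=0.05, sender_share=0.25):
    """Split the RTCP share of the session bandwidth between senders and receivers."""
    rtcp_bw = session_bw_bps * rtcp_fraction
    return {
        "rtcp_total": rtcp_bw,
        "senders": rtcp_bw * sender_share,
        "receivers": rtcp_bw * (1.0 - sender_share),
    }


def receiver_report_interval(session_bw_bps, receivers, avg_report_bits):
    """Seconds between reports from one receiver so that all receivers
    together stay within the receiver share."""
    receiver_bw = rtcp_bandwidth_split(session_bw_bps)["receivers"]
    return receivers * avg_report_bits / receiver_bw


split = rtcp_bandwidth_split(1_000_000)                    # 1 Mbit/s session
interval = receiver_report_interval(1_000_000, 100, 800)   # 100 receivers, 800-bit reports
print(split, round(interval, 3))
```

Because the interval grows in proportion to the number of receivers, the total control traffic stays bounded as the session grows.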

Session Control Protocols

RTSP functions
– Support VCR-like control operations such as stop, pause/resume, fast forward, and fast backward.
– Provides a means for choosing delivery channels (e.g., UDP, multicast UDP, or TCP), and delivery mechanisms based upon RTP.
– RTSP works for multicast as well as unicast.
– Also establishes control streams of continuous audio and video media between the media servers and the clients.

Specifically, RTSP provides the following operations.
– Media retrieval: The client can request a presentation description, and ask the server to set up a session to send the requested media data;
– Adding media to an existing session: The server or the client can notify each other about any additional media becoming available to the established session
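RTSP is a text-based protocol, so the media retrieval and VCR-like operations above correspond to short request messages. The sketch below only formats such requests and does not open a connection; the URL, port numbers, and session identifier are hypothetical values (a real session identifier is issued by the server in its SETUP response).

```python
def rtsp_request(method, url, cseq, headers=None):
    """Format an RTSP/1.0 request: request line, CSeq, optional headers, blank line."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"


url = "rtsp://example.com/lecture13"   # hypothetical presentation URL
session = "12345678"                   # hypothetical; assigned by the server

describe = rtsp_request("DESCRIBE", url, 1, {"Accept": "application/sdp"})
setup = rtsp_request("SETUP", url + "/track1", 2,
                     {"Transport": "RTP/AVP;unicast;client_port=5000-5001"})
play = rtsp_request("PLAY", url, 3, {"Session": session, "Range": "npt=0-"})
pause = rtsp_request("PAUSE", url, 4, {"Session": session})
print(describe + setup + play + pause)
```

DESCRIBE requests the presentation description, SETUP chooses the delivery channel (here RTP over unicast UDP), and PLAY and PAUSE provide the VCR-like control.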

Session Initiation Protocol

Similar to RTSP, SIP can also create and terminate sessions with one or more participants.

Unlike RTSP, SIP supports user mobility by proxying and redirecting requests to the user’s current location

Types of Codes

Block Codes
– Hamming Codes
– Bose-Chaudhuri-Hocquenghem (BCH) Codes
– Reed-Solomon Codes

Convolutional Codes
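As one concrete block code from the list, the sketch below implements the Hamming (7,4) code, which adds three parity bits to four data bits and corrects any single bit error. The bit ordering (parity bits at positions 1, 2, and 4) is the common textbook layout and is an assumption here, as are the function names.

```python
def hamming74_encode(d):
    """Encode 4 data bits into a 7-bit code word [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(c):
    """Correct up to one bit error and return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # syndrome gives the 1-based error position
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = hamming74_encode([1, 0, 1, 1])
corrupted = list(word)
corrupted[4] ^= 1                     # flip one bit in transit
print(word, corrupted, hamming74_decode(corrupted))
```

With 16 message words in 7 code word bits, the rate of this code is 4/7.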

DVB-S Transmission System Ku Band

Specifics

What is the Packetizer function?
What is the PES?
What is the transport stream?
What is the function of the scrambler?
Why the R-S coder?
Why the Interleaver?
Why the Convolutional encoder?
Why QPSK for satellites?
Why COFDM for terrestrial?

Overview of MPEG-4 System

[Figure: scene segmentation and depth layering separates the scene into AV-objects (O1, O2, O3); layered encoding of contour, motion, and texture produces bitstream layers 1 to 3, which pass through a multiplexer and demultiplexer, are decoded separately, and are combined by a compositor]

Two Enhancement Types in MPEG-4 Temporal Scalability

1. Type I: The enhancement-layer improves the resolution of only a portion of the base-layer.
2. Type II: The enhancement-layer improves the resolution of the entire base-layer.

In enhancement type I, only a selected region of the VOP (i.e. just the car) is enhanced, while the rest (i.e. the landscape) is not.

In enhancement type II, enhancement is applicable only at entire VOP level.

DVB-T/H Transmitter

NOKIA

Terrestrial Drivers

Terrestrial broadcasts are omnidirectional
Multiple copies of the same signal may arrive at the receiver with slightly different delays and thus interfere with each other
– Multipath = (direct path signal + reflected signal + refracted signal)
– Intersymbol Interference (ISI)
– Limits the bit rate that may be achieved
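The effect of a delayed copy on the received symbols can be shown with a two-path model. This is a simplified sketch: the echo gain of 0.6, the delay of exactly one symbol period, and the use of +1/-1 symbols are assumptions chosen to make the interference easy to see.

```python
def two_path_channel(symbols, echo_gain=0.6, echo_delay=1):
    """Direct path plus one delayed, attenuated copy of the transmitted symbols."""
    received = []
    for n, s in enumerate(symbols):
        echo = symbols[n - echo_delay] if n >= echo_delay else 0.0
        received.append(s + echo_gain * echo)
    return received


symbols = [1, 1, -1, 1, -1, -1]
received = two_path_channel(symbols)
margin = min(abs(r) for r in received[1:])   # distance from the decision threshold at 0
print([round(r, 2) for r in received], round(margin, 2))
```

Whenever consecutive symbols differ, the echo of the previous symbol partly cancels the current one, shrinking the decision margin from 1.0 to 0.4 in this example. Shortening the symbol period makes a fixed path delay span more symbols, which is why multipath limits the achievable bit rate.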

Multipath
