Source: mapl.nctu.edu.tw/course/vc_2008/files/lecture4.pdf (53 slides)

Lecture 4 Predictive Coding

Wen-Hsiao Peng, Ph.D

Multimedia Architecture and Processing Laboratory (MAPL)Department of Computer Science, National Chiao Tung University

March 2008

Wen-Hsiao Peng, Ph.D (NCTU CS) MAPL March 2008 1 / 53

Lecture 4 Predictive Coding Background

Introduction

Digital Communication System

Separation of Source and Channel Coding

Point-to-Point Communication, Ergodic Channel, Infinite Delay

Source Coding - Source Statistics and Distortion Measure

Channel Coding - Channel Statistics

Claude Shannon (http://en.wikipedia.org/wiki/Claude_Shannon)

Why Video Compression?

Video Raw Data Rate

| Data Rate | Format | Size/Hour |
|---|---|---|
| 9.1 Mbps | A: QCIF (176x144), 4:2:0, 30P (3G Phone) | 4 GB |
| 37 Mbps | B: CIF (352x288), 4:2:0, 30P (VCD) | 16 GB |
| 166 Mbps | C: BT.601 (720x480), 4:2:2, 60I (DVD) | 74 GB |
| 746 Mbps | D: 1080I (1920x1080), 4:2:0, 60I (HDTV) | 335 GB |

Transmission Capacity

| Network | Capacity | FPS w/o Compression |
|---|---|---|
| V.90 Modem | 56 kbps (Down), 33 kbps (Up) | A: 0.18 |
| 3G | 128-384 kbps (Car-Pedestrian) | B: 0.1-0.31 |
| ADSL | Typical 1-2 Mbps | B: 0.8-1.6, C: 0.18-0.36 |
| Ethernet LAN | Max. 100 Mbps, Typical 10-20 Mbps | C: 1.8-3.6, D: 0.4-0.8 |
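The figures in the two tables can be reproduced with a few lines of arithmetic. The sketch below is illustrative only and rests on assumptions the slide does not state: 8 bits per sample, 60I counted as 30 full frames per second, and decimal units (1 Mbps = 10^6 bits/s). The function names are hypothetical.

```python
def raw_bit_rate(width, height, chroma, frames_per_second, bits_per_sample=8):
    """Uncompressed bit rate in bits per second (assumes 8-bit samples by default)."""
    # Average samples per pixel: one luma sample plus the subsampled chroma.
    samples_per_pixel = {"4:2:0": 1.5, "4:2:2": 2.0, "4:4:4": 3.0}[chroma]
    return width * height * samples_per_pixel * bits_per_sample * frames_per_second


def fps_without_compression(channel_bps, width, height, chroma):
    """Number of raw frames per second that fit through a channel."""
    bits_per_frame = raw_bit_rate(width, height, chroma, frames_per_second=1)
    return channel_bps / bits_per_frame


qcif = raw_bit_rate(176, 144, "4:2:0", 30)
hdtv = raw_bit_rate(1920, 1080, "4:2:0", 30)   # 60I treated as 30 frames/s
modem_fps = fps_without_compression(56_000, 176, 144, "4:2:0")

print(f"QCIF: {qcif / 1e6:.1f} Mbps, {qcif * 3600 / 8 / 1e9:.1f} GB/hour")
print(f"HDTV: {hdtv / 1e6:.0f} Mbps")
print(f"QCIF over a 56 kbps modem: {modem_fps:.2f} frames/s")
```

Under these assumptions the results line up with rows A and D of the first table and with the V.90 row of the second.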

Approach

Lossless - Reversible, Low Compression

Exploit spatiotemporal correlation and non-uniform symbol distribution

Lossy - Irreversible, High Compression

Introduce non-perceivable fidelity loss

Most lossy systems include a lossless compression stage

Lecture 4 Predictive Coding Linear Correlation

Spatiotemporal Correlation

[Figure: Frame (t-1) and Frame (t), annotated with Temporal Correlations and Spatial Correlations]

Linear Correlation

Linear Correlation between random variables X and Y

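$(Y - \mu_Y) = \alpha (X - \mu_X)$, where $\alpha$ can be any value

Correlation Coefficient

$$\rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{E\big((X-\mu_X)(Y-\mu_Y)\big)}{\sqrt{E\big((X-\mu_X)^2\big)}\,\sqrt{E\big((Y-\mu_Y)^2\big)}}$$

Strength and directions of linear correlation

[Figure, four panels: Linearly Increasing, $\rho_{X,Y} = 1$; Linearly Decreasing, $\rho_{X,Y} = -0.4$; Linearly Uncorrelated, $\rho_{X,Y} = 0$; Linearly Uncorrelated, $\rho_{X,Y} = 0$]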

Linear Correlation

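Correlation Coefficient $\rho_{X,Y}$ detects only linear correlation

Given (1) $X$ is uniformly distributed over $(-1, +1)$ and (2) $Y = X^2$ is completely determined by $X$

$\Rightarrow \rho_{X,Y} = 0$

Sample Correlation Coefficient $r_{X,Y}$

Estimate $\rho_{X,Y}$ based on $N$ realizations of $(X, Y)$

$$r_{X,Y} = \underbrace{\frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{N-1}}_{S_{X,Y}} \cdot \frac{1}{\sqrt{\underbrace{\frac{1}{N-1}\sum (x_i - \bar{x})^2}_{S_X^2}}} \cdot \frac{1}{\sqrt{\underbrace{\frac{1}{N-1}\sum (y_i - \bar{y})^2}_{S_Y^2}}}$$

$r_{X,Y} \to \rho_{X,Y}$ as $N \to \infty$

$\bar{x}$ and $\bar{y}$ (sample means) are unbiased estimators of the means

$S_{X,Y}$, $S_X^2$, and $S_Y^2$ are unbiased estimators of the covariance and the variances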
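The claim that the correlation coefficient misses a purely quadratic dependence can be checked numerically. The sketch below is an illustration, not part of the lecture: it replaces random draws with a deterministic symmetric grid on [-1, 1] (an assumption made so the result is repeatable), and `sample_correlation` is a hypothetical helper implementing the $r_{X,Y}$ formula above.

```python
import math


def sample_correlation(xs, ys):
    """Sample correlation coefficient r_{X,Y} with (N - 1) normalization."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_xx = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    s_yy = sum((y - y_bar) ** 2 for y in ys) / (n - 1)
    return s_xy / math.sqrt(s_xx * s_yy)


n = 1001
xs = [-1 + 2 * i / (n - 1) for i in range(n)]   # symmetric grid standing in for U(-1, 1)

r_square = sample_correlation(xs, [x * x for x in xs])      # Y = X^2
r_line = sample_correlation(xs, [2 * x + 1 for x in xs])    # Y = 2X + 1

print(f"Y = X^2:    r = {r_square:.3e}")
print(f"Y = 2X + 1: r = {r_line:.6f}")
```

Although $Y = X^2$ is fully determined by $X$, its sample correlation is numerically zero, while the affine relation gives a value of one.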

Unbiased Estimation of Variance

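$S_X^2 = \frac{1}{N-1}\sum (x_i - \bar{x})^2$ is an unbiased estimator of $\sigma_X^2$

$$E(S_X^2) = E\big((X - \mu_X)^2\big), \quad S_X^2 \text{ being a random variable (R.V.)}$$

$S_Y^2 = \frac{1}{N-1}\sum (y_i - \bar{y})^2$ is an unbiased estimator of $\sigma_Y^2$

$$E(S_Y^2) = E\big((Y - \mu_Y)^2\big), \quad S_Y^2 \text{ being a random variable (R.V.)}$$

$S_{X,Y} = \frac{1}{N-1}\sum (x_i - \bar{x})(y_i - \bar{y})$ is an unbiased estimator of $\operatorname{cov}(X,Y)$

$$E(S_{X,Y}) = E\big((X - \mu_X)(Y - \mu_Y)\big), \quad S_{X,Y} \text{ being a random variable (R.V.)}$$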
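Unbiasedness can be verified exactly on a toy case by enumerating every possible sample. The sketch below assumes a hypothetical source that is uniform on {0, 1, 2} and i.i.d. samples of size N = 2; it uses exact rational arithmetic so no tolerance is needed.

```python
from fractions import Fraction
from itertools import product


def sample_variance(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    mean = Fraction(sum(sample), len(sample))
    return sum((Fraction(v) - mean) ** 2 for v in sample) / divisor


population = [0, 1, 2]          # assumed toy source: X uniform on {0, 1, 2}
n = 2                           # sample size N

mu = Fraction(sum(population), len(population))
true_variance = sum((v - mu) ** 2 for v in population) / len(population)

# All i.i.d. samples of size N are equally likely, so the expectation is a plain average.
samples = list(product(population, repeat=n))
mean_unbiased = sum(sample_variance(s, n - 1) for s in samples) / len(samples)
mean_biased = sum(sample_variance(s, n) for s in samples) / len(samples)

print("true variance          :", true_variance)
print("E[S^2] with 1/(N-1)    :", mean_unbiased)
print("E[S^2] with 1/N instead:", mean_biased)
```

In this toy case the $1/(N-1)$ normalization reproduces the true variance on average, while dividing by $N$ underestimates it.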

Autocorrelation Function

Random signal (process) variation w.r.t. time/space

Detect periodic component and fundamental frequency

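Continuous (see note 1)

$R_{xx}(t_1, t_2) = E\big(X(t_1)X(t_2)\big)$

$R_{xx}(\tau) = R_{xx}(t, t+\tau)$ if $X(t)$ is W.S.S.

Discrete

$R_{xx}[n_1, n_2] = E\big(X[n_1]X[n_2]\big)$

$R_{xx}[m] = R_{xx}[n, n+m]$ if $X[n]$ is W.S.S.

Note 1: Wide-Sense Stationary (W.S.S.) Random Signal $X(t)$

$E\big(X(t)\big) = E\big(X(t+\tau)\big) \quad \forall \tau \in \mathbb{R}$

$E\big(X(t_1)X(t_2)\big) = E\big(X(t_1+\tau)X(t_2+\tau)\big) \quad \forall \tau \in \mathbb{R}$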

Estimation of Autocorrelation Function

Estimate $R_{xx}[m]$ based on a finite record of a random signal $X[n]$

$v[n]$, a finite record of $X[n]$

$$v[n] = \begin{cases} X[n], & 0 \le n \le L-1 \\ 0, & \text{otherwise} \end{cases}$$

$R_{xx}[m]$ estimator

$$\hat{R}_{xx}[m] = \frac{1}{L - |m|}\, C_{vv}[m]$$

$$C_{vv}[m] = \sum_{n=0}^{L-1} v[n]\,v[n+m] = \begin{cases} \sum_{n=0}^{L-1-|m|} X[n]\,X[n+|m|], & |m| \le L-1 \\ 0, & \text{otherwise} \end{cases}$$

Unbiased Estimator

$E\big(\hat{R}_{xx}[m]\big) = R_{xx}[m]$ for $|m| \le L-1$
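The 1-D estimator translates almost directly into code. The following is a minimal sketch of $\hat{R}_{xx}[m]$ for a finite record held in a Python list; the function name and the choice to raise an error for lags outside $|m| \le L-1$ are assumptions made for illustration.

```python
def autocorrelation_estimate(record, m):
    """Unbiased estimate of R_xx[m] from a finite record v[0..L-1]."""
    length = len(record)
    lag = abs(m)                      # the estimate is symmetric in m
    if lag > length - 1:
        raise ValueError("lag must satisfy |m| <= L - 1")
    # C_vv[m]: sum of products over the L - |m| overlapping samples.
    c_vv = sum(record[n] * record[n + lag] for n in range(length - lag))
    return c_vv / (length - lag)


record = [1, 2, 3, 4]
for m in (0, 1, -1, 3):
    print(m, autocorrelation_estimate(record, m))
```

Dividing by $L - |m|$ rather than $L$ is what keeps the estimate unbiased, at the cost of higher variance at large lags where few products are averaged.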

Estimation of Autocorrelation Function

$R_{xx}[m_x, m_y]$ Estimator (2-D Case)

$$\hat{R}_{xx}[m_x, m_y] = \frac{1}{L_x - |m_x|} \cdot \frac{1}{L_y - |m_y|}\, C_{vv}[m_x, m_y]$$

$$C_{vv}[m_x, m_y] = \begin{cases} \sum_{n_y=0}^{L_y-1-|m_y|} \sum_{n_x=0}^{L_x-1-|m_x|} X[n_x, n_y]\,X[n_x + |m_x|, n_y + |m_y|], & |m_x| \le L_x - 1,\ |m_y| \le L_y - 1 \\ 0, & \text{otherwise} \end{cases}$$

Unbiased Estimator

$E\big(\hat{R}_{xx}[m_x, m_y]\big) = R_{xx}[m_x, m_y]$ for $|m_x| \le L_x - 1$, $|m_y| \le L_y - 1$

Autocorrelation Function

[Figure, three panels: Periodic; PCM; DPCM]

Lecture 4 Predictive Coding Differential Pulse Code Modulation

Predictive Coding

Differential Pulse Code Modulation (DPCM)

Input $x[n]$

Predictor $\hat{x}_p[n] = f(\{\tilde{x}[k]\})$ (How to find $f(\cdot)$?)

Residual $e[n] = x[n] - \hat{x}_p[n]$

Output $\tilde{e}[n] = e[n] + q[n]$, where $q[n]$ is the quantization error

Coded Value $\tilde{x}[n] = \hat{x}_p[n] + \tilde{e}[n]$

Fidelity Loss $\Delta[n] = \tilde{x}[n] - x[n] = q[n]$

[Figure: Closed-Loop Structure. Encoder: Input $x[n]$, Prediction $\hat{x}_p[n]$, Residual $e[n]$, Quantizer, Output $\tilde{e}[n]$, Reconstruction, Coded Value $\tilde{x}[n]$, Predictor. Decoder: Input $\tilde{e}[n]$, Prediction $\hat{x}_p[n]$, Reconstruction, Output $\tilde{x}[n]$, Predictor.]
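The closed-loop equations above can be traced in a few lines. The sketch below is only an illustration under assumptions not made on the slide: the predictor is the previous coded value, $f(\cdot) = \tilde{x}[n-1]$, the quantizer is uniform with a hypothetical step size, and both encoder and decoder start from a state of zero.

```python
def quantize(value, step):
    """Uniform quantizer: round to the nearest multiple of `step`."""
    return step * round(value / step)


def dpcm_closed_loop_encode(signal, step):
    """Return the coded residuals and the encoder-side reconstruction."""
    coded_residuals, reconstruction = [], []
    previous = 0.0                              # x~[-1], shared initial state
    for x in signal:
        prediction = previous                   # x_p[n] = x~[n-1] (assumed predictor)
        residual = x - prediction               # e[n]
        coded = quantize(residual, step)        # e~[n] = e[n] + q[n]
        previous = prediction + coded           # x~[n] = x_p[n] + e~[n]
        coded_residuals.append(coded)
        reconstruction.append(previous)
    return coded_residuals, reconstruction


def dpcm_decode(coded_residuals):
    """Decoder: same predictor, driven by the coded residuals only."""
    output, previous = [], 0.0
    for coded in coded_residuals:
        previous = previous + coded
        output.append(previous)
    return output


signal = [10.0, 12.3, 15.1, 14.2, 13.9, 20.5]
step = 1.0
coded, encoder_view = dpcm_closed_loop_encode(signal, step)
decoded = dpcm_decode(coded)
errors = [d - x for d, x in zip(decoded, signal)]
print("coded residuals:", coded)
print("max |error|    :", max(abs(e) for e in errors))
```

Because the encoder predicts from the same coded values the decoder will have, the decoder output matches the encoder's reconstruction and the error at every sample stays within half a quantizer step, which is the $\Delta[n] = q[n]$ property.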

PCM vs. DPCM

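PCM SNR (see note 2)

[Figure: Input $x[n]$, Quantizer, Output $\tilde{x}[n]$]

$$\mathrm{SNR}_{PCM} = \frac{\sigma^2_{x[n]}}{\sigma^2_{\Delta_1[n]}} = \underbrace{\frac{\sigma^2_{x[n]}}{\sigma^2_{q_1[n]}}}_{\text{Quant. SNR} \,\propto\, \text{Bits}}$$

DPCM SNR

$$\mathrm{SNR}_{DPCM} = \frac{\sigma^2_{x[n]}}{\sigma^2_{\Delta_2[n]}} = \frac{\sigma^2_{x[n]}}{\sigma^2_{q_2[n]}} = \underbrace{\frac{\sigma^2_{x[n]}}{\sigma^2_{e[n]}}}_{\text{Gain}} \cdot \underbrace{\frac{\sigma^2_{e[n]}}{\sigma^2_{q_2[n]}}}_{\text{Quant. SNR} \,\propto\, \text{Bits}}$$

Optimal Predictor in MSE

$$f^*(\cdot) = \arg\min_{\{f(\cdot)\}} \sigma^2_{e[n]}$$

Note 2: SNR of a $(B+1)$-bit quantizer: $\sigma_x^2 / \sigma_e^2 = \left(12 \cdot 2^{2B} \cdot \sigma_x^2\right) / X_m^2$, where $\sigma_e^2$ here denotes the quantization noise variance and $X_m$ the full-scale level of the quantizer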

Open-Loop DPCM

Input $x[n]$

Predictor $\hat{x}^e_p[n] = f(\{x[k]\})$, built from source data

Residual $e[n] = x[n] - \hat{x}^e_p[n]$

Output $\tilde{e}[n] = e[n] + q[n]$, where $q[n]$ is the quantization error

Coded Value $\tilde{x}[n] = \hat{x}^d_p[n] + \tilde{e}[n]$

Fidelity Loss $\Delta[n] = \tilde{x}[n] - x[n] = q[n] + \underbrace{\left(\hat{x}^d_p[n] - \hat{x}^e_p[n]\right)}_{\tilde{x}[n-1] - x[n-1]}$

Accumulate $\Delta[n] = q[n] + \Delta[n-1]$

[Figure: Open-Loop Structure. Encoder: Input $x[n]$, Predictor, Prediction $\hat{x}^e_p[n]$, Residual $e[n]$, Quantizer, Output $\tilde{e}[n]$. Decoder: Input $\tilde{e}[n]$, Predictor, Prediction $\hat{x}^d_p[n]$, Reconstruction, Output $\tilde{x}[n]$. Mismatch between $\hat{x}^d_p[n]$ and $\hat{x}^e_p[n]$!]
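The accumulation $\Delta[n] = q[n] + \Delta[n-1]$ is easy to provoke with a slowly rising input. The sketch below is illustrative and assumes a previous-sample predictor, a uniform quantizer with step 1, and a hypothetical ramp that rises by 0.4 per sample, so every residual is quantized to zero.

```python
def quantize(value, step):
    """Uniform quantizer: round to the nearest multiple of `step`."""
    return step * round(value / step)


def open_loop_encode(signal, step):
    """Encoder predicts from the *source* sample x[n-1], not from coded data."""
    coded, previous_source = [], 0.0
    for x in signal:
        coded.append(quantize(x - previous_source, step))
        previous_source = x
    return coded


def decode(coded):
    """Decoder can only predict from its own reconstruction x~[n-1]."""
    output, previous = [], 0.0
    for c in coded:
        previous += c
        output.append(previous)
    return output


signal = [0.4 * (k + 1) for k in range(10)]     # assumed test ramp
coded = open_loop_encode(signal, step=1.0)
decoded = decode(coded)
errors = [d - x for d, x in zip(decoded, signal)]
print("coded residuals:", coded)
print("errors         :", [round(e, 2) for e in errors])
```

Each quantization error is only about 0.4 in magnitude, yet the decoder error grows by that amount at every sample because the encoder never sees what the decoder reconstructs.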

Open-Loop vs. Closed-Loop

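Open-Loop SNR (Dynamic), $\hat{x}^o_p[n] \triangleq x[n-1]$

$$\mathrm{SNR}_o = \frac{\sigma^2_{x[n]}}{\sigma^2_{\Delta[n]}} = \frac{\sigma^2_{x[n]}}{\sum_{k=1}^{n} \sigma^2_{q_o[k]}} = \frac{\sigma^2_{x[n]}}{n\,\sigma^2_{e_o[n]}} \cdot \underbrace{\frac{\sigma^2_{e_o[n]}}{\sigma^2_{q_o[n]}}}_{\propto\, \text{Bits}}$$

Closed-Loop SNR (Static, see note 3), $\hat{x}^c_p[n] \triangleq \tilde{x}[n-1] = \underbrace{x[n-1] + q_c[n-1]}_{\text{Asymptotically Uncorrelated}}$

$$\mathrm{SNR}_c = \frac{\sigma^2_{x[n]}}{\sigma^2_{\Delta[n]}} = \frac{\sigma^2_{x[n]}}{\sigma^2_{q_c[n]}} = \frac{\sigma^2_{x[n]}}{\sigma^2_{e_c[n]}} \cdot \frac{\sigma^2_{e_c[n]}}{\sigma^2_{q_c[n]}} = \frac{\sigma^2_{x[n]}}{\sigma^2_{e_o[n]} + \sigma^2_{q_c[n-1]}} \cdot \underbrace{\frac{\sigma^2_{e_c[n]}}{\sigma^2_{q_c[n]}}}_{\propto\, \text{Bits}}$$

$\mathrm{SNR}_o > \mathrm{SNR}_c$ iff $n\,\sigma^2_{e_o[n]} < \sigma^2_{e_c[n]} \; (= \sigma^2_{e_o[n]} + \sigma^2_{q_c[n-1]})$ for $n > 1$

Note 3: $E\big(x[n]\,q_c[n]\big) < E\big(|q_c[n]|^2\big)$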

Drifting and Accumulation Errors

Gradually blurred image quality with open-loop control

[Figure, two panels: Closed-Loop; Open-Loop (with Drifting Errors)]
Adaptive Loop Control

Adaptive loop control can outperform closed-/open-loop only control

Question: which loop control to use and at what granularity?

[Figure, two panels: Closed-Loop; Adaptive Loop Control]
Lecture 4 Predictive Coding Linear Minimum Mean Squared Error Prediction

Linear Minimum Mean Squared Error (LMMSE) Predictor

Notation

$x[n]$: Input (Random)

$y[n] = W[n] * x[n]$: Predictor (convolution of the input with the filter)

$d[n]$: Desired (Random)

$e[n] = d[n] - y[n]$

Performance Function $\xi(\cdot)$

$$\xi(W[n]) = E\big(|e[n]|^2\big) = E\big(|d[n] - y[n]|^2\big)$$

Wiener Filter (Impulse Response $W^*[n]$)

$$W^*[n] = \arg\min_{\{W[n]\}} \xi(W[n])$$

Wiener Filter

Transversal Filter (FIR)

$x[n]$ and $d[n]$ are real, stationary processes

FIR Filter $\mathbf{w} = [w_0, w_1, \ldots, w_{N-1}]^T$

Input $\mathbf{x}[n] = \big[x[n], x[n-1], \ldots, x[n-N+1]\big]^T$

$$y[n] = \mathbf{w}^T \mathbf{x}[n] = \sum_{i=0}^{N-1} w_i\, x[n-i]$$

Wiener Filter

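Prediction Error

$$e[n] = d[n] - \mathbf{w}^T \mathbf{x}[n]$$

Performance Function $\xi(\cdot)$

$$\xi(\mathbf{w}) = E\big(e[n]\,e^T[n]\big) = E\big(d^2[n]\big) - 2\mathbf{w}^T \mathbf{p} + \mathbf{w}^T \mathbf{R}\, \mathbf{w}$$

$$= E\big(d^2[n]\big) - 2\sum_{l} w_l\, E\big(d[n]\,x[n-l]\big) + \sum_{l}\sum_{m} w_l\, r_{lm}\, w_m$$

where $\mathbf{p} = E\big(\mathbf{x}[n]\,d[n]\big)$, $\mathbf{R} = E\big(\mathbf{x}[n]\,\mathbf{x}^T[n]\big)$, and $r_{lm}$ is the $(l, m)$ entry of $\mathbf{R}$

Optimal Filter Tap $\mathbf{w}_o$

$$\nabla \xi(\mathbf{w}) = \left[\frac{\partial \xi(\mathbf{w})}{\partial w_0}, \frac{\partial \xi(\mathbf{w})}{\partial w_1}, \ldots, \frac{\partial \xi(\mathbf{w})}{\partial w_{N-1}}\right]^T = \mathbf{0}$$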

Wiener Filter

Optimal Filter Tap wo

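$$\xi(\mathbf{w}) = E\big(d^2[n]\big) - 2\sum_{l} w_l\, E\big(d[n]\,x[n-l]\big) + \sum_{l}\sum_{m} w_l\, r_{lm}\, w_m$$

$$\frac{\partial \xi(\mathbf{w})}{\partial w_i} = -2\,E\big(d[n]\,x[n-i]\big) + 2\sum_{l} r_{il}\, w_l = 0$$

$\nabla \xi(\mathbf{w}) = \mathbf{0} \Rightarrow$

(1) $\mathbf{R}\,\mathbf{w}_o = \mathbf{p}$

(2) $\xi_{\min} = E\big(d^2[n]\big) - \mathbf{w}_o^T \mathbf{p}$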
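A small worked case makes the normal equations $\mathbf{R}\mathbf{w}_o = \mathbf{p}$ concrete. The sketch below assumes a hypothetical unit-variance process with autocorrelation $r[k] = \rho^{|k|}$ and a two-tap filter that predicts $d[n] = x[n+1]$ from $x[n]$ and $x[n-1]$; none of these choices come from the lecture. Exact rational arithmetic is used so the result can be read off directly.

```python
from fractions import Fraction


def solve_2x2(a, b):
    """Solve a @ w = b for a 2x2 system with Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    w0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    w1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [w0, w1]


rho = Fraction(9, 10)                          # assumed correlation of adjacent samples
R = [[Fraction(1), rho], [rho, Fraction(1)]]   # E(x[n] x^T[n]) for inputs x[n], x[n-1]
p = [rho, rho ** 2]                            # E(x[n] d[n]) with d[n] = x[n+1]

w_o = solve_2x2(R, p)                                               # (1) R w_o = p
xi_min = Fraction(1) - sum(w * p_i for w, p_i in zip(w_o, p))       # (2), with E(d^2) = 1

# Orthogonality check: E(e_o[n] x[n-i]) = p_i - sum_l r_il w_l should vanish.
orthogonality = [p[i] - sum(R[i][l] * w_o[l] for l in range(2)) for i in range(2)]

print("w_o    =", w_o)
print("xi_min =", xi_min)
print("E(e x) =", orthogonality)
```

For this assumed correlation model the second tap comes out as zero: once $x[n]$ is known, $x[n-1]$ adds nothing to the linear prediction, and the residual error is $1 - \rho^2$.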

Orthogonality

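$e_o[n]$ is uncorrelated with the filter input $\{x[n-i] \mid i = 0, 1, \ldots, N-1\}$

$$\frac{\partial E\big(e^2[n]\big)}{\partial w_i} = 0 \Rightarrow 2\,E\!\left(e[n]\,\frac{\partial e[n]}{\partial w_i}\right) = -2\,\underbrace{E\big(e[n]\,x[n-i]\big)}_{\text{Orthogonal}} = 0$$

$e_o[n]$ is uncorrelated with the predictor (filter output) $y_o[n] = \mathbf{w}_o^T \mathbf{x}[n]$

$$E\big(e_o[n]\,y_o[n]\big) = 0$$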

LMMSE Prediction of Two Random Variables

LMMSE Predictor of Y based on X

(1) $Y_p = \alpha X$

(2) $\alpha_o = \arg\min_{\{\alpha\}} E\big((Y - Y_p)^2\big)$

$\mu_X = \mu_Y = 0$

$E(Y) = E(Y_p) = 0$,

$$\alpha_o = R^{-1} p = \frac{E(XY)}{E(X^2)}$$

LMMSE Prediction of Two Random Variables

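$\mu_X \neq 0$, $\mu_Y \neq 0$ + Without Mean Removal

$X'$ and $Y'$ have zero mean, with $X = X' + \mu_X$ and $Y = Y' + \mu_Y$

$E(Y) \neq E(Y_p) = \alpha'_o\, \mu_X$

$$\alpha'_o = R^{-1} p = \frac{E\big((X' + \mu_X)(Y' + \mu_Y)\big)}{E\big((X' + \mu_X)^2\big)} = \frac{E(X'Y') + \mu_X \mu_Y}{E(X'^2) + \mu_X^2}$$

$\mu_X \neq 0$, $\mu_Y \neq 0$ + Mean Removal

$(Y_p - \mu_Y) = \alpha (X - \mu_X) \Rightarrow Y_p = \alpha X + (\mu_Y - \alpha \mu_X)$

$E(Y) = E(Y_p)$

$$\alpha_o = \frac{E\big((X - \mu_X)(Y - \mu_Y)\big)}{E\big((X - \mu_X)^2\big)}$$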
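The effect of mean removal shows up clearly on a tiny example. The sketch below treats three equally likely pairs (a made-up data set satisfying $Y = 2X + 1$) as the full joint distribution and evaluates both predictors with exact fractions; the helper name `expectation` is hypothetical.

```python
from fractions import Fraction


def expectation(values):
    """Mean of equally likely outcomes, kept as an exact fraction."""
    return sum(values) / Fraction(len(values))


xs = [1, 2, 3]                     # assumed outcomes of X
ys = [3, 5, 7]                     # matching outcomes of Y = 2X + 1
mu_x, mu_y = expectation(xs), expectation(ys)

# Without mean removal: Y_p = alpha' X with alpha' = E(XY) / E(X^2).
alpha_raw = expectation([x * y for x, y in zip(xs, ys)]) / expectation([x * x for x in xs])
mean_pred_raw = alpha_raw * mu_x

# With mean removal: (Y_p - mu_Y) = alpha (X - mu_X).
alpha = (expectation([(x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)])
         / expectation([(x - mu_x) ** 2 for x in xs]))
beta = mu_y - alpha * mu_x
mean_pred = alpha * mu_x + beta

print("no mean removal:  alpha' =", alpha_raw, " E(Y_p) =", mean_pred_raw, " E(Y) =", mu_y)
print("mean removal   :  alpha  =", alpha, " beta =", beta, " E(Y_p) =", mean_pred)
```

Without mean removal the predictor is biased, since $E(Y_p)$ differs from $E(Y)$; with mean removal the slope and offset recover the underlying affine relation.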

Least-Squares Method

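Linear Predictor of $Y$ based on $X$

$\hat{Y}_p = \alpha X + \beta$, where $\beta$ is needed if $\mu_X \neq 0$, $\mu_Y \neq 0$

Least-Squares Method

$$(\alpha^*, \beta^*) = \arg\min_{\{\alpha, \beta\}} \sum (y_i - \alpha x_i - \beta)^2 = \left(\frac{S_{X,Y}}{S_X^2},\; \bar{Y} - \frac{S_{X,Y}}{S_X^2}\,\bar{X}\right),$$

where $\bar{X}$, $\bar{Y}$ are sample means

Interpretation ($S_{X,Y} \to \operatorname{cov}(X,Y)$, $S_X^2 \to \sigma_X^2$, $\bar{Y} \to \mu_Y$, $\bar{X} \to \mu_X$)

$$\left(\hat{Y}_p - \bar{Y}\right) = \underbrace{\frac{S_{X,Y}}{S_X^2}}_{\alpha^*} \left(X - \bar{X}\right)$$

$$(Y_p - \mu_Y) = \underbrace{\frac{E\big((X - \mu_X)(Y - \mu_Y)\big)}{E\big((X - \mu_X)^2\big)}}_{\alpha_o} (X - \mu_X)$$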
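The closed-form least-squares solution is short enough to code directly from the sample statistics. The following is a minimal sketch; `least_squares_line` is a hypothetical name and the data points are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (alpha, beta) minimizing sum (y_i - alpha x_i - beta)^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_xx = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    alpha = s_xy / s_xx                 # alpha* = S_XY / S_X^2
    beta = y_bar - alpha * x_bar        # beta*  = Y_bar - alpha* X_bar
    return alpha, beta


exact = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])   # points on y = 2x + 1
noisy = least_squares_line([1, 2, 3], [2, 2, 5])         # points not on a line
print("exact fit:", exact)
print("noisy fit:", noisy)
```

The $(N-1)$ factors cancel in the ratio $S_{X,Y}/S_X^2$, so the same slope results from the raw sums used in the appendix derivation.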

Forward Prediction

Notation

Forward Prediction Error $f_m[n] = x[n] - \sum_i a_{m,i}\, x[n-i]$

Forward Prediction Error Filter

$f_m[n] = x[n] * a_m[n]$, where $*$ denotes convolution
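Writing the prediction error as a convolution means the error filter has impulse response $1, -a_{m,1}, \ldots, -a_{m,m}$. The sketch below assumes the sum runs over $i = 1, \ldots, m$ and that samples before $n = 0$ are zero; the slide states neither, and the test signal is made up.

```python
def prediction_error_filter(signal, coefficients):
    """f_m[n] = x[n] - sum_{i=1..m} a_{m,i} x[n-i], taking x[n] = 0 for n < 0."""
    taps = [1.0] + [-a for a in coefficients]        # impulse response a_m[n]
    errors = []
    for n in range(len(signal)):
        errors.append(sum(t * signal[n - k] for k, t in enumerate(taps) if n - k >= 0))
    return errors


signal = [0.5 ** n for n in range(6)]                # x[n] = 0.5 x[n-1], started at 1
errors = prediction_error_filter(signal, [0.5])      # first-order predictor, a_{1,1} = 0.5
print(errors)
```

With a coefficient that matches how the signal was generated, the error is nonzero only at the first sample, where there is no past value to predict from.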

Lecture 4 Predictive Coding Building Blocks of Video Compression

Block Diagram of Video Encoder

Spatiotemporal DPCM - Energy Reduction

Spatial Transform - Decorrelation, Energy Compaction

Quantization - Perceptual-oriented Lossy Compression

Entropy Coding - Symbol Compaction

Bitstream: Control Info., Motion Vectors, Transform Coefficients

Block Diagram of Video Decoder

Inverse Quantization

Inverse Transform

Spatiotemporal Comp.

Reconstruction

Lecture 4 Predictive Coding Temporal Prediction

Motion Compensated Temporal Prediction

Motion-compensated DPCM along temporal axis

Closed-Loop - construct predictor from previously coded frames

Open-Loop - construct predictor from source frames

Unidirectional Prediction

Block Size of Motion Compensation

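[Figure: original frames at t=0 and t=9, and the residual at t=9 for each prediction from t=0]

| Prediction of t=9 | MAD of residual |
|---|---|
| Zero: t=0, No MC | 32.23 |
| 16x16: t=0 + 16x16 MC | 14.36 |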

Block Size of Motion Compensation

[Figure: residual at t=9 for each prediction from t=0]

| Prediction of t=9 | MAD of residual |
|---|---|
| t=0 + 4x4 MC | 6.00 |
| t=0 + 8x8 MC | 9.98 |
| t=0 + 16x16 MC | 14.36 |
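The MAD values above come from motion-compensated prediction, which at its simplest is a block-matching search. The sketch below is a generic full-search example using mean absolute difference, not the procedure used to produce the slide's figures; the frame contents, block size, and search range are made up, and frames are plain nested lists.

```python
def mad(block_a, block_b):
    """Mean absolute difference between two equally sized blocks."""
    total = sum(abs(a - b)
                for row_a, row_b in zip(block_a, block_b)
                for a, b in zip(row_a, row_b))
    return total / (len(block_a) * len(block_a[0]))


def get_block(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]


def best_motion_vector(current, reference, top, left, size, search_range):
    """Full search: return (MAD, dy, dx) of the best match in the reference frame."""
    height, width = len(reference), len(reference[0])
    target = get_block(current, top, left, size)
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue                 # candidate must lie inside the reference frame
            cost = mad(target, get_block(reference, y, x, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best


reference = [[8 * y + x for x in range(8)] for y in range(8)]
# Current frame: the reference content shifted one pixel to the right.
current = [[row[max(x - 1, 0)] for x in range(8)] for row in reference]

zero_motion_mad = mad(get_block(current, 2, 2, 4), get_block(reference, 2, 2, 4))
best = best_motion_vector(current, reference, top=2, left=2, size=4, search_range=2)
print("MAD without MC:", zero_motion_mad)
print("best (MAD, dy, dx):", best)
```

Smaller blocks let the search follow local motion more closely, which is consistent with the decreasing MAD in the table, but every block then needs its own motion vector.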

Variable Block Size Motion Compensation

Motion compensation can be switched among different block sizes

Extra bits for signaling motion vectors and block partitions

[Figure: Quad-Tree-based Block Sizing; Selection of Block Size]

Macroblock partitions (luma samples and associated chroma samples):

1 macroblock partition of 16*16

2 macroblock partitions of 16*8

2 macroblock partitions of 8*16

4 sub-macroblocks of 8*8

Sub-macroblock partitions (luma samples and associated chroma samples):

1 sub-macroblock partition of 8*8

2 sub-macroblock partitions of 8*4

2 sub-macroblock partitions of 4*8

4 sub-macroblock partitions of 4*4

Block Size Statistics

[Figure, two panels: Pedestrian (HD); Rush Hour (HD)]

4x4 may not be the most favorable one

Temporal prediction may become less beneficial for HD at high rate

Block Size Statistics

[Figure, two panels: Pedestrian (QCIF); Rush Hour (QCIF)]

Smaller block sizes are preferable at high bit rates

Temporal prediction reveals significant coding gain

Occlusion

Better match in Frame (t+1) instead of Frame (t-1)

Extra bits for signaling motion vectors and prediction directions

Bidirectional Prediction

Coding Order vs. Display Order

Bidirectional prediction requires frame buffer and picture re-ordering

Encoding/Decoding order: 0, 2, 1, 5, 3, 4

Display order: 0, 1, 2, 3, 4, 5
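The re-ordering follows from one rule: a bidirectionally predicted picture can only be coded after both of its references. The sketch below assumes the six pictures have types I B P B B P in display order; that pattern is not stated on the slide and was chosen only because it is consistent with the listed coding order. `coding_order` is a hypothetical helper.

```python
def coding_order(frame_types):
    """Map display-order picture types ('I', 'P', 'B') to a coding order."""
    order, pending_b = [], []
    for index, frame_type in enumerate(frame_types):
        if frame_type == "B":
            pending_b.append(index)     # must wait for the next anchor picture
        else:
            order.append(index)         # the I or P anchor is coded first
            order.extend(pending_b)     # then the B pictures displayed before it
            pending_b = []
    return order + pending_b            # trailing B pictures, if any, go last


print(coding_order("IBPBBP"))
```

The decoder has to hold each anchor picture in a buffer until the B pictures that precede it in display order have been decoded and shown.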

Hierarchical Bidirectional Prediction

Better coding efficiency as compared to IBP...

Encoding/Decoding order: 0, 4, 2, 1, 3,..

Display order: 0, 1, 2, 3, 4,..
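The hierarchical order can be generated by repeatedly coding the picture midway between two already coded ones. The sketch below assumes a dyadic structure with a group size that is a power of two and a depth-first traversal; the slide only lists the first five pictures, so the ordering for larger groups is an assumption.

```python
def hierarchical_order(first, last):
    """Coding order of the pictures strictly between two already coded pictures."""
    if last - first < 2:
        return []
    middle = (first + last) // 2
    # Code the midpoint, then recurse into the left and right halves.
    return [middle] + hierarchical_order(first, middle) + hierarchical_order(middle, last)


def gop_coding_order(gop_size):
    """Key pictures 0 and gop_size first, then the hierarchy in between."""
    return [0, gop_size] + hierarchical_order(0, gop_size)


print(gop_coding_order(4))
print(gop_coding_order(8))
```

For a group of four pictures this reproduces the order listed above; each extra level of hierarchy adds pictures that are predicted from closer references.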

Hierarchical Prediction + Adaptive Loop Control

Open-loop for B pictures but Closed-loop for P pictures

Adaptive loop control can also be applied at macroblock level

Multiple Reference Frames

Different regions can be predicted from different reference frames

Reference frames 0, 2 must be stored

Extra bits for signaling reference pictures and increased buffer size

Multiple Reference Prediction

Subpixel Motion Compensation

Better match may be found from interpolated sample positions

Extra bits for signaling motion vectors of higher precision

[Figure: Frame (t-1) and Frame (t), shown before and after Sampling]

Subpixel Motion Compensation

[Figure, six panels: Integer Pel; Sub-Pel (Horizontal); Diff. (Horizontal); Sub-Pel (Vertical); Sub-Pel (Diagonal); Diff. (Diagonal)]

Subpixel Interpolation

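[Figure, Luma: sample grid with integer positions A, B, C, D, E, F, G, H, I, J, K, L, M, N, P, Q, R, S, T, U; half-pel positions aa, bb, cc, dd, ee, ff, gg, hh, b, h, j, m, s; quarter-pel positions a, c, d, e, f, g, i, k, n, p, q, r]

[Figure, Chroma: interpolated sample between integer samples A, B, C, D with weights xFracC, yFracC, 8-xFracC, 8-yFracC]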

Subpixel Interpolation

Half Pel Samples $(b, h, m, s)$, $j$

$b = E - 5F + 20G + 20H - 5I + J$

$j = aa - 5bb + 20b + 20s - 5gg + hh$

Quarter Pel Samples $(a, c, d, n, f, q, i, k)$

$a = (G + b)/2$

$f = (b + j)/2$

Quarter Pel Samples $(e, g, p, r)$

$e = (b + h)/2$

$g = (b + m)/2$
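The half-pel formula is a six-tap filter with weights (1, -5, 20, 20, -5, 1), which sum to 32. The slide shows the unnormalized sums; the sketch below adds the rounding, division by 32, and clipping as they are understood from the H.264/AVC standard, and rounds the quarter-pel average upward. Treat those normalization details, the 8-bit sample range, and the function names as assumptions of this illustration.

```python
def clip_sample(value, bit_depth=8):
    """Clamp to the valid sample range (8-bit samples assumed)."""
    return max(0, min((1 << bit_depth) - 1, value))


def half_pel(e, f, g, h, i, j):
    """Half-sample between g and h from six integer neighbors."""
    unnormalized = e - 5 * f + 20 * g + 20 * h - 5 * i + j
    return clip_sample((unnormalized + 16) >> 5)     # round, divide by 32, clip


def quarter_pel(sample_a, sample_b):
    """Quarter-sample as the rounded average of two neighboring samples."""
    return (sample_a + sample_b + 1) >> 1


flat = half_pel(100, 100, 100, 100, 100, 100)        # flat area stays flat
ramp = half_pel(10, 20, 30, 40, 50, 60)              # halfway between 30 and 40
edge = half_pel(0, 0, 0, 255, 255, 255)              # step edge between g and h
print(flat, ramp, edge, quarter_pel(30, ramp))
```

The negative taps sharpen the interpolation compared with a plain two-sample average, which is why clipping is needed near strong edges.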

Lecture 4 Predictive Coding Statistics of Temporal Prediction

Motion Vector Statistics

[Figure, two panels: Pedestrian (HD); Rush Hour (HD)]
Motion Vector Statistics

[Figure, two panels: Pedestrian (QCIF); Rush Hour (QCIF)]
Coding Gain

Comparison of Coding Tools

[Figure, two panels: Mobile; Foreman. Axes: PSNR-Y vs. Rate (bits/s). Legend: 16x16; +8x16, 16x8; +8x8; +8x4, 4x8, 4x4; +Subpixel; +NumFrame=5; +IBBP; +GOP8]

Bitstream Composition

Motion Information vs. Compression Ratio

[Figure, two panels: Mobile; Foreman. Axes: Percentage (%) vs. QP. Legend: 16x16; +16x8, 8x16; +8x8; +4x8, 8x4, 4x4; +Subpixel; +NumberFrame=5; +IBBP; +GOP8]

Appendix

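$\partial \left(\sum_l \sum_m w_l\, r_{lm}\, w_m\right) / \partial w_i$

$$\sum_l \sum_m w_l\, r_{lm}\, w_m = \sum_l w_l\, f_l(w_i) = \sum_{l=0,\, l \neq i}^{N-1} w_l\, f_l(w_i) + w_i\, f_i(w_i),$$

where $f_l(w_i) = \sum_m r_{lm}\, w_m$

Take derivative w.r.t. $w_i$

$\partial f_l(w_i) / \partial w_i = r_{li}$

$\partial \big(w_i\, f_i(w_i)\big) / \partial w_i = f_i(w_i) + r_{ii}\, w_i$

Thus, using the symmetry $r_{li} = r_{il}$ of the correlation matrix,

$$\partial \left(\sum_l \sum_m w_l\, r_{lm}\, w_m\right) / \partial w_i = \sum_{l=0,\, l \neq i}^{N-1} w_l\, r_{li} + \sum_{m=0}^{N-1} r_{im}\, w_m + r_{ii}\, w_i$$

$$= \sum_{l=0}^{N-1} r_{il}\, w_l + \sum_{m=0}^{N-1} r_{im}\, w_m$$

$$= 2 \sum_{l=0}^{N-1} r_{il}\, w_l$$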

Appendix

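$f(\alpha, \beta) = \sum (y_i - \alpha x_i - \beta)^2$

$$\begin{cases} \partial f(\alpha, \beta) / \partial \alpha = 0 \\ \partial f(\alpha, \beta) / \partial \beta = 0 \end{cases} \Rightarrow \begin{cases} \sum y_i x_i = \alpha \sum x_i x_i + \beta \sum x_i \\ \sum y_i = \alpha \sum x_i + n\beta \end{cases}$$

$$\begin{bmatrix} \sum y_i x_i \\ \sum y_i \end{bmatrix} = \begin{bmatrix} \sum x_i x_i & \sum x_i \\ \sum x_i & n \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \end{bmatrix}$$

$$\begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \frac{1}{n \sum x_i^2 - \left(\sum x_i\right)^2} \begin{bmatrix} n & -\sum x_i \\ -\sum x_i & \sum x_i x_i \end{bmatrix} \begin{bmatrix} \sum y_i x_i \\ \sum y_i \end{bmatrix}$$

$$\alpha = \frac{n \sum y_i x_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}, \quad \beta = \frac{\sum y_i \sum x_i x_i - \sum x_i \sum y_i x_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}$$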

References

1 B. Farhang-Boroujeny - Adaptive Filters: Theory and Applications

2 A. Oppenheim et al. - Discrete-Time Signal Processing

3 G. Sullivan et al. - ISO/IEC 14496-10 Advanced Video Coding, 3rd Edition, W6540

4 Y. Wang et al. - Video Processing and Communications