Upload
lynga
View
218
Download
0
Embed Size (px)
Citation preview
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 1
Motion Estimation for Video Coding
Motion-Compensated Prediction
Bit Allocation
Motion Models
Motion Estimation
Efficiency of Motion Compensation Techniques
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 2
Intra-Frame
Decoder
Motion-
Compensated
Predictor
Control
Data
DCT
Coefficients
Motion
Data
0
Intra/Inter
Coder
Control
Decoder
Motion
Estimator
Intra-Frame
DCT Coder-
],,[ tyxs ],,[ tyxu
],,[' tyxs
],,[' tyxu
],,[̂ tyxs
Hybrid Video Encoder
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 3
Motion-Compensated Prediction
Prediction for the luminance signal s[x,y,t] within the moving object:
Moving
object
Displaced
object
time t
x
y
Previous
frame
Current
frame
Stationary
backgroundt
y
x
d
d
),,(],,[̂ ttdydxstyxs yx
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 4
Motion-Compensated Prediction: Example
Referenced blocks in frame 1 Difference between motion-
compensated prediction and
current frame u[x,y,t]
Frame 1 s[x,y,t-1] (previous)
Frame 2 with
displacement vectors
Accuracy of Motion Vectors
Frame 2 s[x,y,t] (current) Partition of frame 2 into blocks
(schematic)
Size of Blocks
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 5
yx
yyxx
dd
yx
yx
yxfyydyxfxxd
,
,
,
,
),,( ),,,(
ba
ba
Motion in 3-D space corresponds to displacements in the image plane
Motion compensation in the image plane is conducted to provide a
prediction signal for efficient video compression
Efficient motion-compensated prediction often uses side information to
transmit the displacements
Displacements must be efficiently represented for video compression
Motion models relate 3-D motion to displacements assuming
reasonable restrictions of the motion and objects in the 3-D world
Motion Models
Motion Model
: location in previous image
: location in current image
: vector of motion coefficients
: displacements
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 6
Decoded video signal is given as
Representation of Video Signal
],,[],,[̂],,[ tyxutyxstyxs
Motion-compensated
prediction signal
),,(],,[̂ ttdydxstyxs yx
),(),,(1
0
1
0
yxbdyxad i
N
i
iyi
N
i
ix
Prediction residual signal
),(),(1
0
yxcyxu j
M
j
j
,...)(,...),(),,( 00 bafRm baba
Transmitted motion parameters
Transmitted residual
parameters
,...)(),( 0cfRu cc
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 7
Optimum trade-off:
Displacement error variance can be influenced via
• Block-size, quantization of motion parameters
• Choice of motion model
Bit-rate
Displacement
error variance
D
Prediction
error rate
Ru
Motion
vector rate
Rm
Rate-Constrained Motion Estimation
um
um
RRRdR
dD
dR
dD ,
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 8
Lagrangian Optimization in Video Coding
Distortion after
motion compensationLagrange
parameter
Number of bits
for motion vector
Rate-Constrained Motion Estimation [Sullivan, Baker 1991]:
Rate-Constrained Mode Decision [Wiegand, et al. 1996]:
A number of interactions are often neglected
• Temporal dependency due to DPCM loop
• Spatial dependency of coding decisions
• Conditional entropy coding
Distortion after
reconstructionLagrange
parameter
Number of bits
for coding mode
mmm RD min
RD min
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 9
Translational motion model
4-Parameter motion model: translation, zoom (isotropic Scaling),
rotation in image plane
Affine motion model:
Parabolic motion model
Motion Models
00 , bdad yx
yaxabd
yaxaad
y
x
120
210
ybxbbd
yaxaad
y
x
210
210
xybybxbybxbbd
xyayaxayaxaad
x
x
5
2
6
2
2310
5
2
6
2
2310
),(),,(1
0
1
0
yxbdyxad i
N
i
iyi
N
i
ix
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 10
Impact of the Affine Parameters
-150
-100
-50
0
50
100
150
-150 -100 -50 0 50 100 150
x
y
-150
-100
-50
0
50
100
150
-150 -100 -50 0 50 100 150
x
y
-150
-100
-50
0
50
100
150
-150 -100 -50 0 50 100 150
x
y
-150
-100
-50
0
50
100
150
-150 -100 -50 0 50 100 150
x
y
translation
scaling
sheering
0adx
xadx 1
yadx 3
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 11
-150
-100
-50
0
50
100
150
-150 -100 -50 0 50 100 150
x
y
-150
-100
-50
0
50
100
150
-150 -100 -50 0 50 100 150
x
y
-150
-100
-50
0
50
100
150
-150 -100 -50 0 50 100 150
x
y
-150
-100
-50
0
50
100
150
-150 -100 -50 0 50 100 150
x
y
Impact of the Parabolic Parameters
2
2xadx
2
6 yadx
xyadx 5
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 12
Differential Motion Estimation
Assume small displacements dx,dy:
Aperture problem: several observations required
Inaccurate for displacements > 0.5 pel
multigrid methods, iteration
Minimize
Displace frame difference Horizontal and vertical
gradient of image signal S
yx
yx
dy
ttyxsd
x
ttyxsttyxstyxs
ddtyxstyxstyxu
),,(),,(),,(),,(
),,,,(ˆ),,(),,(
),,(min1 1
2 tyxuBy
y
Bx
x
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 13
Gradient-Based Affine Refinement
Displacement vector field is represented as
Combination
yields a system of linear equations:
System can be solved using, e.g., pseudo-inverse,
by minimizing
ybxbby
yaxaax
321
321
)()(),,(),,(),,( 321321 ybxbby
syaxaa
x
sttyxstyxstyxu
By
y
Bx
xbbbaaa
tyxu1 1
2
,,,,,
),,(minarg321321
3
2
1
3
2
1
,,,,,,
b
b
b
a
a
a
yy
sx
y
s
y
sy
x
sx
x
s
x
sssu
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 14
• Subdivide current frame
into blocks.
• Find one displacement
vector for each block.
• Within a search range, find
a “best match” that
minimizes an error
measure.
• Intelligent search strategies
can reduce computation.
search range in
previous frame
block of current
frame
Sk1
Sk
Block-matching Algorithm
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 15
Previous Frame Current Frame
Block of pixels is selected
as a measurement window
Measurement window is
compared with a shifted block
of pixels in the other image,
to determine the best match
Block-matching Algorithm
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 16
Block-matching Algorithm
. . . process repeated for another block.
Previous Frame Current Frame
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 17
Error Measures for Block-matching
Mean squared error (sum of squared errors)
Sum of absolute differences
Approximately same performance
SAD less complex for some architectures
2
1 1
)],,(),,([),( ttdydxstyxsddSSD y
By
y
Bx
x
xyx
|),,(),,(|),(1 1
ttdydxstyxsddSAD y
By
y
Bx
x
xyx
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 18
Full search
All possible displacements within the search range
are compared.
Computationally expensive
Highly regular, parallelizable
Block-matching: Search Strategies
dx
dy
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 19
Complexity of block-matching:
evaluation of complex error measure for many candidates
Speedup of Block-matching
Combine both approaches:
Choose starting point and search order that maximizes
likelihood for efficient approximations, early terminations
and excluding of candidates
Reduce complexity
• Approximations
• Early terminations
• Exclude candidates
Reduce number
• Cover likely search areas
• Unequal steps between
searched candidates
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 20
Approximations
• Stop search, if match is “good enough”
(SSD, SAD < threshold or J=D+R is small enough)
• Practical method in video conferencing for static
background: test zero-vector first and stop search if
match is good enough
static background
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 21
Early Termination
Partial distortion measure:
Previously computed minimum:
Early termination without loss:
Early termination with possible loss, but higher speedup:
p
y
K
y
B
x
xyxK ttdydxstyxsddDx
)],,(),,([),(1 1
yyxK BNKddD ...),,(
Jmin
min),(),( if Stop, JddRddD yxmyxK
min)(),(),( if Stop, JKddRddD yxmyxK
2/1)/()( e.g., yBKK
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 22
Block Comparison Speed-Ups
Triangle inequality for samples in block B (here SAD)
Strategy:1. Compute partial sums for blocks in current and previous frame
2. Compare blocks based on partial sums
3. Omit full block comparison, if partial sums indicate worse error
measure than previous best result
Block partitioning
Choose blocks to be nested (4x4, 8x8, 16x16, …) – efficient
for modern variable block size video codecs
n B
kk
B
kk
n
SSSS |||| 11
nn
n BBB ,
B
kk
B
kk SSSS |||||| 11
nB
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 23
Blockmatching: Search Order I
• Iterative comparison of error
measure values at 5
neighboring points
• Logarithmic refinement of the
search pattern if
– best match is in the center of
the 5-point pattern
– center of search pattern
touches the border of the
search range
2D logarithmic search [Jain + Jain, 1981]
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 24
Diamond search [Li, Zeng, Liou, 1994] [Zhu, Ma, 1997]
d x
d y
Start with
large diamond
pattern at
(0,0)
If best match
lies in the center
of large diamond,
proceed with
small diamond
If best match does not lie
in the center of large
diamond, center large
diamond pattern at new
best match
Blockmatching: Search Order II
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 25
Hierarchical blockmatching
current frame
previous frame
Block matching
Block matching
Block matching
Filtering and subsampling
Displacement vector field
Filtering and subsampling
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 26
Sub-pel Translational Motion
Motion vectors are often not restricted to only point into
the integer-pel grid of the reference frame
Typical sub-pel accuracies: half-pel and quarter-pel
Sub-pel positions are often estimated by „refinement“
•1 •1 •1
•1
•1
d
d
x
y
•1•1
•1•1
•2•2•2 •2
•2•2•2•2
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 27
Sub-pel Motion Compensation
Sub-pel positions are obtained via interpolation
Example: half-pixel accurate displacements
• • • • • • • • • • • • •• • • • • • • • • • • • •• • • • • • • • • • • • •
• • • • • • • • • • • • •
• • • • • • • • • • • • •
• • • • • • • • • • • • •
• • • • • • • • • • • • •• • • • • • • • •
• • • • • • • • •
• • • • • • • • •• • • • • •
• • • •• • • •• • • •• • • • • • •
• • • • • • • • • • • • •
• • • • • • • • • • • • •
• • • •• • • •
• • • •
• • • •
4.5
4.5
x
y
d
d
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 28
Bi-linear Interpolation
Interpolated
Pixel Value
brightness
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 29
History of Motion Compensation
• Intraframe coding: only spatial correlation exploited
DCT [Ahmed, Natarajan, Rao 1974], JPEG [1992]
• Conditional replenishment
H.120 [1984] (DPCM, scalar quantization)
• Frame difference coding
H.120 Version 2 [1988]
• Motion compensation: integer-pel accurate displacements
H.261 [1991]
• Half-pel accurate motion compensation
MPEG-1 [1993], MPEG-2/H.262 [1994]
• Variable block-size (16x16 & 8x8) motion compensation
H.263 [1996], MPEG-4 [1999]
• Variable block-size (16x16 – 4x4) and multi-frame
motion compensation
H.264/MPEG-4 AVC [2003]
Complexity
increase
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 30
Milestones in Video Coding
0 100 200 300
28
30
32
34
36
38
40
Rate [kbit/s]
PSNR
[dB]
Variable block size
(16x16 – 8x8)
(H.263, 1996) +
quarter-pel
motion compensation
(MPEG-4, 1998)
Variable block size
(16x16 – 4x4) +
quarter-pel +
multi-frame
motion compensation
(H.264/AVC, 2003)
Intraframe
DCT coding
(JPEG, 1990)
Foreman
10 Hz, QCIF
100 frames
Frame
Difference coding
(H.120 1988)
Conditional
Replenishment
(H.120)
Half-pel
motion compensation
(MPEG-1 1993
MPEG-2 1994)
Integer-pel
motion
compensation
(H.261, 1991)
Bit-rate Reduction: 75%35
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 31
Milestones in Video Coding
0 100 200 300
28
30
32
34
36
38
40
Rate [kbit/s]
PSNR
[dB]
Integer-pel
motion
compensation
(H.261, 1991)
Variable block size
(16x16 – 8x8)
(H.263, 1996) +
quarter-pel
motion compensation
(MPEG-4, 1998)
Variable block size
(16x16 – 4x4) +
quarter-pel +
multi-frame
motion compensation
(H.264/AVC, 2003)
Intraframe
DCT coding
(JPEG, 1990)
Foreman
10 Hz, QCIF
100 frames
Half-pel
motion compensation
(MPEG-1 1993
MPEG-2 1994)
Frame
Difference coding
(H.120 1988)
Conditional
Replenishment
(H.120)
Visual Comparison
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 32
Visual Comparison
Foreman, QCIF, 10 Hz, 100 kbit/s
JPEG H.264/AVC
T. Wiegand / B. Girod: EE398A Image and Video Compression Motion estimation no. 33
Summary
Video coding as a hybrid of motion compensation and prediction
residual coding
Motion models can represent various kinds of motions
Lagrangian bit-allocation rules specify constant slope allocation to
motion coefficients and prediction error
In practice: affine or 8-parameter model for camera motion,
translational model for small blocks
Differential methods calculate displacement from spatial and
temporal differences in the image signal -
Block matching computes error measure for candidate
displacements and finds best match
Speed up block matching by fast search methods, approximations,
early terminations and clever application of triangle inequality
Hybrid video coding has been drastically improved by enhanced
motion compensation capabilities