Temporal Motion Prediction for Fast Motion Estimation in Multiple Reference
Frames
G Nageswara Rao, PSSBK Gupta
Emuzed A Flextronics Company, Bangalore, India
Email: {nageswararao.gunupudi, shyam.pallapothu}@flextronicssoftware.com
Abstract--- In the new video coding standard, H.264/MPEG-4 Part 10,
motion compensation is allowed to use multiple reference frames,
which improves the rate-distortion performance but at the cost of a
drastic increase in complexity. The computation increases in
proportion to the number of searched reference frames, whereas the
reduction of prediction residues is highly dependent on the nature
of the sequences. In this paper, we present a fast technique to
predict the motion vector in reference frames to speed up the
matching process for multiple reference frames. The proposed
technique is based on choosing a motion centre and carrying out the
search around that centre with a radius of 1 or 2 pixels in all
reference frames, with the exception of the one that immediately
precedes the current frame. For the reference frame that
immediately precedes the current frame, any motion estimation
technique can be used. The results show that the proposed technique
reduces the computational requirements down to those required for
single reference frame motion estimation with only a negligible
loss of objective quality.
Index terms--- H.264/AVC, Motion Estimation,
Multi Frame Motion Compensation (MFMC)
I. INTRODUCTION
H.264/AVC is the newest video coding standard [1] of the ITU-T
Video Coding Experts Group and the ISO/IEC Moving Picture Experts
Group. H.264/AVC introduced many new coding and error-resilience
tools, and as a result it achieves a significant improvement in
rate-distortion efficiency relative to existing standards: up to
50% bit-rate reduction over the MPEG-4 Advanced Simple Profile. The
motion compensation process defined in H.264, with quarter-pixel
accuracy, variable block sizes and multiple reference frames,
greatly reduces prediction errors. Multi-frame motion compensation
(MFMC) is one such tool that provides significant coding gain as
well as better error robustness. The multi-frame buffer stores, at
both encoder and decoder, frames that are useful for
motion-compensated prediction. Multi-frame motion estimation was
proposed as a technique to improve the error resilience of
compressed video by Budagavi and Gibson [2]. The MFMC coder makes
use of the redundancy that exists across multiple frames in typical
video-conferencing sequences to achieve additional compression over
that obtained with the single-frame motion compensation (SFMC)
approach [3]. A further advantage of the MFMC approach is that it
is more robust to error propagation than traditional SFMC, as it
makes use of information from multiple frames.
As a consequence of the many prediction tools in motion
compensation, the motion estimation process in H.264 becomes
considerably more complex. Multi-frame motion compensation improves
the rate-distortion performance substantially, but it places a much
higher load on the system. If temporal correlations between
multiple reference frames are not considered, conventional
single-frame search algorithms can still be applied to multi-frame
motion estimation, but only through a rather inefficient
frame-by-frame approach. This multiplies the complexity of motion
estimation (ME) by the number of reference frames relative to
single-frame motion estimation, as ME needs to be carried out for
every reference frame, which is not feasible for most real-time
implementations. Moreover, the decrease of prediction residues
depends on the nature of the sequences. Sometimes the prediction
gain is very significant, but sometimes a lot of computation is
wasted without any considerable bit-rate reduction. As a
consequence, several fast techniques have recently been developed
for motion estimation in multiple reference frames. We present an
effective algorithm to accelerate multiple reference frame ME
without significant loss of video quality.
In this paper we propose an algorithm that makes use of the motion
vectors already computed with respect to the first reference frame,
i.e. the reference frame closest to the current frame, to predict
the best motion vectors for the second and subsequent reference
frames. The proposed scheme aims to minimize the computational
overhead of motion estimation in reference frames other than the
first. The proposed technique is independent of the ME algorithm
used for the first reference frame. The main goal of the algorithm
is to perform a fast search in all reference frames other than the
first while maintaining a PSNR similar to that obtained when the
full search block-matching algorithm (FSBMA) is used in each
reference frame. The rest of the paper presents statistics of
multi-frame motion estimation (Section II), the proposed motion
vector prediction system and the motion estimation algorithm
(Sections III and IV), simulation results (Section V) and
conclusions (Section VI).
2006 IEEE International Symposium on Signal Processing and Information Technology
0-7803-9754-1/06/$20.00 ©2006 IEEE
II. MULTI FRAME MOTION ESTIMATION
Multi-frame motion estimation extends the temporal displacement
vectors used in block-matching video coding by permitting the use
of more frames than just the previously decoded one for
motion-compensated prediction. The use of multiple frames for
motion estimation in many cases provides a significant improvement
in coding gain [5] and also provides better error robustness. It is
widely agreed that motion estimation is the most complex module of
standard-based video encoders, and the complexity increases N times
when the motion vector search is carried out in N reference frames.
The decrease of prediction residues, however, depends on the nature
of the video content. The new H.264 standard supports hybrid block
motion compensation with multiple reference frames, which
significantly increases the complexity of video encoders for
real-time implementations. Given an inter mode, the reference
software JM9.5 adopts a full search and carries out the matching
process in all reference frames one by one. The best mode is chosen
by minimizing a Lagrangian cost function that considers both the
2-D 4x4 Hadamard-transformed SAD (SATD) and the number of bits
required to code the side information. Table.1 gives the PSNR
results and the complexity of ME in seconds for different sequences
with one, two and three reference frames at a bit rate of 512 kbps.
Table.2 shows the percentage of references from the first and the
other reference frames. It is observed that the number of macro
blocks referenced decreases for farther reference frames. It can
also be observed that the number of references from all other
reference frames is less than that from the first reference frame.
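As a rough illustration of a cost of this kind, the following Python sketch computes an unnormalised 4x4 Hadamard SATD and adds a weighted bit count. The function names, the Lagrange multiplier argument `lam` and the lack of any normalisation are assumptions made for illustration; this is not the JM9.5 implementation.

```python
def hadamard4(v):
    """Unnormalised 1-D 4-point Hadamard transform of a length-4 sequence."""
    a, b, c, d = v
    s0, s1 = a + b, c + d
    d0, d1 = a - b, c - d
    return [s0 + s1, d0 + d1, s0 - s1, d0 - d1]


def satd4x4(cur, ref):
    """Sum of absolute 2-D Hadamard coefficients of the 4x4 difference block."""
    diff = [[c - r for c, r in zip(crow, rrow)] for crow, rrow in zip(cur, ref)]
    rows = [hadamard4(row) for row in diff]          # transform each row
    cols = [hadamard4(col) for col in zip(*rows)]    # then each column
    return sum(abs(v) for col in cols for v in col)


def motion_cost(cur, ref, side_info_bits, lam):
    """Lagrangian cost: distortion (SATD) plus lam times the side-information bits."""
    return satd4x4(cur, ref) + lam * side_info_bits


if __name__ == "__main__":
    cur = [[1] * 4 for _ in range(4)]
    ref = [[0] * 4 for _ in range(4)]
    print(motion_cost(cur, ref, side_info_bits=10, lam=2.0))
```

A candidate motion vector that needs more bits to signal (for example a farther reference frame index) is penalised through the second term, which is why an extra reference frame does not always pay off.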
Table.2 shows that the first reference frame is the most probable
source of references for motion compensation, except for the
Salesman sequence. The references from the other reference frames
are nevertheless not insignificant, which shows the scope for
designing fast and accurate ME techniques for secondary reference
frames. A PSNR loss is observed for the Salesman sequence (Table.1)
when going from two to three reference frames. This is explained by
its reference frame statistics in Table.2: the increase in overhead
bits for signaling the reference frame index in the three reference
frame case is not compensated by the gain in residual error.
Frames and Motion Vectors classification in MFMC:
The statistics and observations presented above clearly give
special importance to the reference frame that immediately precedes
the current frame and to the motion vectors corresponding to that
reference frame. Hence a classification of the reference frames and
motion vectors used in MFMC is presented here. We term the
reference frame that immediately precedes the current frame the
primary reference frame (PRF) and all the rest secondary reference
frames (SRF). Similarly, the motion vectors obtained with respect
to primary reference frames are termed primary motion vectors (PMV)
and all the rest secondary motion vectors (SMV). This
classification of primary and secondary reference frames and motion
vectors adds clarity to the motion estimation process in terms of
complexity-accuracy trade-offs. The proposed technique addresses
motion estimation in secondary reference frames; the motion
estimation algorithm for primary reference frames is free to be any
fast technique.
III. TEMPORAL PREDICTION OF MOTION VECTOR
The temporal prediction algorithm proposed here is based on the
fact that the motion of macro blocks can be tracked temporally over
the frames through the primary motion vectors of the frames in
between the current frame and the reference frame. The best match
for a block of the current frame in a reference frame can be
tracked temporally by adding the motion vectors between every two
neighboring frames that lie between the current frame and the
reference frame, i.e. the primary motion vectors of the frames
between the current frame and the reference frame.
Table.1. Quality (PSNR) and Complexity for Multiple reference frame Motion Estimation
Table.2. Reference frame statistics with 10 reference frames
It is empirically observed that the motion across frames is linear
and smooth. Hence tracking the motion between every two frames
using primary motion vectors, and summing up the motion vectors of
every pair of adjacent frames between the current frame and the
secondary reference frame, can identify the best match of a block
of the current frame.
Let us say the current block is at $(x, y)$ in the current frame
and is found to be best matched at location $(x_1, y_1)$ in the
primary reference frame, i.e. $(x - x_1, y - y_1)$ is the primary
motion vector (PMV) of the block. Then it can be argued that the
block at $(x, y)$ will be closely matched by the block that is the
best match for $(x_1, y_1)$. But the block at $(x_1, y_1)$ may not
be aligned with a block boundary of the reference frame at T-1.
Therefore $(x_1, y_1)$ is approximated to the nearest block
boundary $(xb_1, yb_1)$, and the motion vector of block
$(xb_1, yb_1)$, say $(mvx_2, mvy_2)$, is used as the predicted
motion vector for the reference frame at T-2. If $(xb_1, yb_1)$
falls outside the frame boundary, it is approximated to the nearest
block position. It is observed that the best match for the current
block is then most likely to lie around $(x_2, y_2)$ in the first
secondary reference frame. Hence the motion search for the current
block in its first secondary reference frame can be carried out
around $(x_2, y_2)$. Fig.2 explains the temporal motion tracking
system for secondary reference frames. The search center of motion
estimation in the first secondary reference frame can thus be
formulated as follows.
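$$(mvx_2, mvy_2) = \text{PMV of } (xb_1, yb_1)$$
$$(x_2, y_2) = (xb_1 + mvx_2,\ yb_1 + mvy_2), \text{ and}$$
$$xb_1 = \{(x_1 + \mathrm{BlockSize}/2) \gg B\} \times \mathrm{BlockSize}$$
$$yb_1 = \{(y_1 + \mathrm{BlockSize}/2) \gg B\} \times \mathrm{BlockSize}$$
$$\text{where } B = \log_2(\mathrm{BlockSize}) \qquad (1)$$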
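The computation in Eq.1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: positions and vectors are in integer pixels, the block size is a power of two, and `pmv_prev` is a hypothetical dictionary that maps the top-left corner of each block of the frame at T-1 to its primary motion vector, stored so that adding it to the block position gives the matched position, as Eq.1 does. The clamping helper is one possible reading of "approximated to the nearest block position".

```python
def snap_to_block(coord, block_size):
    """Round a coordinate to the nearest block boundary, as in Eq.1."""
    b = block_size.bit_length() - 1          # B = log2(BlockSize), power of two assumed
    return ((coord + block_size // 2) >> b) * block_size


def clamp_to_frame(block_coord, frame_extent, block_size):
    """Move a block origin that fell outside the frame to the nearest valid one."""
    last = (frame_extent // block_size - 1) * block_size
    return min(max(block_coord, 0), last)


def predict_search_centre(x1, y1, pmv_prev, width, height, block_size=16):
    """Return (x2, y2), the search centre in the first secondary reference frame."""
    xb1 = clamp_to_frame(snap_to_block(x1, block_size), width, block_size)
    yb1 = clamp_to_frame(snap_to_block(y1, block_size), height, block_size)
    mvx2, mvy2 = pmv_prev[(xb1, yb1)]        # PMV of the block at (xb1, yb1)
    return xb1 + mvx2, yb1 + mvy2


if __name__ == "__main__":
    # Hypothetical PMV field for a CIF frame: every block moved by (-3, 2).
    field = {(bx, by): (-3, 2) for bx in range(0, 352, 16) for by in range(0, 288, 16)}
    print(predict_search_centre(21, 40, field, 352, 288))
```

For the best match at (21, 40) the nearest block origin is (16, 48), so with the assumed vector (-3, 2) the search centre becomes (13, 50).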
The deviation of the predicted motion vector from the final best
motion vector in the secondary reference frame is shown for the
Foreman and Mobile sequences in Fig.1. About 80% to 90% of the
predicted motion vectors fall within a displacement of 2 pixels
from the best motion vector, and 60% fall within a displacement of
1 pixel. Thus a motion search around the predicted motion vector
calculated as described in Eq.1, with a search range of ±2 pixels,
is sufficient in most cases for finding the best match of the given
macro block in the given reference frame. The PSNR results for fast
motion estimation with search ranges of ±1 and ±2 (referred to as
1x1 and 2x2 searches respectively) are given in Table.3. To further
speed up the computation of motion vectors for secondary reference
frames, we have chosen an iterative pattern with a ±1 search area:
the search is first carried out in the ±1 search area and
terminates if the best motion vector is found at the center of the
search area; otherwise the search continues in a ±1 range around
the best motion vector position of the current search area. Table.3
also shows the results for the iterative search strategy with a
search window of ±1 pixel and two iterations. The early exit of the
iterative technique enables much faster convergence of motion
estimation in secondary reference frames. Fig.3 gives various
alternative search patterns that can be applied around the search
center.
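The iterative search with early exit can be sketched as below. This is an illustrative sketch, not the authors' code: `cost` stands for any hypothetical matching cost that takes a candidate position, and the square window with a cap of two iterations follows the description in the text.

```python
def iterative_search(cost, centre, radius=1, max_iterations=2):
    """Refine a search centre with a small window, stopping early when the centre wins.

    cost   : callable taking an (x, y) position and returning a number
    centre : (x, y) seed position, e.g. the predicted search centre
    """
    best, best_cost = centre, cost(centre)
    for _ in range(max_iterations):
        start = best
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                if dx == 0 and dy == 0:
                    continue                  # centre cost is already known
                cand = (start[0] + dx, start[1] + dy)
                c = cost(cand)
                if c < best_cost:
                    best, best_cost = cand, c
        if best == start:                     # best match is the centre: early exit
            break
    return best, best_cost


if __name__ == "__main__":
    # Toy cost with its minimum at (3, -1); a real encoder would use SAD or SATD.
    toy_cost = lambda p: abs(p[0] - 3) + abs(p[1] + 1)
    print(iterative_search(toy_cost, (2, 0)))
```

When the seed is already the best position, only the centre and its eight neighbours are evaluated, which is where the speed-up over a fixed larger window comes from.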
Fig.1
Fig.2. Motion vector tracking over the frames to predict the motion vector in secondary reference frames. Frame labels: T, T-1 (PRF), T-2 (SRF); block positions: (x, y), (x1, y1), (x2, y2).
Fig.3. Search pattern for secondary reference frame motion estimation
IV. PROPOSED ALGORITHM
The observations made in the previous sections lead to a fast
motion estimation algorithm for secondary reference frames, which
is presented in this section. The seed point for the motion search
is obtained from the primary motion vectors (PMVs) of the frames
that lie between the current frame and the reference frame under ME
search. However, to cover different kinds of motion and to make the
algorithm more robust, we also include the zero motion vector (ZMV)
and the predicted motion vector defined in the H.264 standard
(SPMV) in the seed point selection. Among the ZMV, the temporally
predicted motion vector and the SPMV, the motion vector that
results in the least rate-distortion cost is selected as the seed
point for the motion search, and an iterative search within a ±1
search area is carried out about the seed point. The algorithm for
motion estimation with multiple reference frames is summarized
below.
Step1: Perform motion estimation in the primary reference frame
(any fast technique can be applied) and store the primary motion
vectors.
Step2: Compute the temporal motion vector predictor using the
primary motion vectors of the frames between the current frame and
the reference frame under search. For the first secondary reference
frame, the seed for ME is the primary motion vector of the block in
the primary reference frame that maximally covers the best match of
the current block of the current frame.
Step3: Choose the best seed among the zero motion vector, the
standard predicted motion vector and the temporally predicted
motion vector derived in Step2 by minimizing the R-D cost.
Step4: Search for the best match for the current block in the
secondary reference frame around the best seed computed in Step3
with a radius of 1.
Step5: If the best MV from Step4 is the search center, i.e. the
seed point, take it as the best MV for the current block and go to
Step6. Otherwise choose the best MV of the ±1 search as the new
search center and continue the best MV search about the new search
center with a ±1 search range.
Step6: Repeat Step2 to Step5 for all the secondary reference
frames.
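The seed selection of Step3 can be illustrated with a short Python sketch. The names `choose_seed` and `rd_cost` are hypothetical, and `rd_cost` stands for whatever rate-distortion cost the encoder evaluates for a candidate motion vector; ties are resolved in favour of the zero motion vector here, which is an assumption and not stated in the paper.

```python
def choose_seed(rd_cost, temporal_mv, standard_mv):
    """Pick the seed with the least R-D cost among ZMV, SPMV and the temporal predictor.

    rd_cost     : callable taking a motion vector (mvx, mvy) and returning its cost
    temporal_mv : temporally predicted motion vector from Step2
    standard_mv : predicted motion vector defined by the standard (SPMV)
    """
    candidates = {
        "ZMV": (0, 0),
        "SPMV": standard_mv,
        "TEMPORAL": temporal_mv,
    }
    # min() keeps the first minimum, so ZMV wins ties (illustrative choice).
    name = min(candidates, key=lambda k: rd_cost(candidates[k]))
    return name, candidates[name]


if __name__ == "__main__":
    # Toy cost whose minimum lies at (4, 1); a real cost would combine SATD and bits.
    toy_cost = lambda mv: abs(mv[0] - 4) + abs(mv[1] - 1)
    print(choose_seed(toy_cost, temporal_mv=(4, 2), standard_mv=(1, 1)))
```

The chosen seed then serves as the search center for the ±1 refinement of Step4 and Step5.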
V. SIMULATION RESULTS
The JM9.5 reference software is used to generate the simulation
results for the proposed algorithm. All results are generated in
the Baseline profile at CIF (352x288) resolution, at 512 kbps and a
frame rate of 15 fps. The same techniques were also verified at
different bit rates. The Foreman, Hall, Football, Claire, Flower,
and Mother and Daughter sequences are used. Only I and P frames are
used, with a GOV length of 60. A search range of 32 and inter
prediction modes up to 8x8 blocks are used. Table.3 compares the
PSNR quality and the motion estimation time in seconds of the full
search algorithm and the proposed techniques when two reference
frames are used for motion compensation. The full search
block-matching algorithm is used for motion estimation in the
primary reference frame.
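To give a feel for the search effort involved, the short calculation below counts integer-pixel candidate positions per block and per reference frame for the settings named above. It is a back-of-the-envelope sketch: it assumes square search windows, ignores sub-pixel refinement and the different block sizes, and bounds the iterative search by one centre evaluation plus eight neighbours per iteration.

```python
def window_points(search_range):
    """Integer-pixel candidates in a square window of +/- search_range."""
    return (2 * search_range + 1) ** 2


def iterative_points(radius, iterations):
    """Upper bound for the iterative search: centre once, then all neighbours per pass."""
    return 1 + iterations * (window_points(radius) - 1)


full_search = window_points(32)      # full search with a range of 32
fixed_1 = window_points(1)           # fixed window of +/-1
fixed_2 = window_points(2)           # fixed window of +/-2
iterative = iterative_points(1, 2)   # +/-1 window, at most two iterations

print(full_search, fixed_1, fixed_2, iterative)  # 4225 9 25 17
```

Under these assumptions a secondary reference frame costs at most a few tens of candidate evaluations per block instead of several thousand, which is consistent with the total ME time approaching that of a single reference frame.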
VI. CONCLUSIONS
A fast technique for predicting motion in multiple reference frames
is presented. The proposed algorithm scales down the complexity of
motion estimation in the multiple reference frame scenario close to
that required for single reference frame motion estimation, with
negligible loss of PSNR. The proposed scheme also reduces the
memory traffic of hardware implementations, as the search area in
motion estimation is minimized.
REFERENCES
[1] Joint Video Team of ITU-T and ISO/IEC JTC 1, ITU-T Rec. H.264,
ISO/IEC 14496-10 AVC, March 2003.
[2] M. Budagavi and J. Gibson, "Multi-frame Block Motion
Compensated Video Coding for Wireless Channels," in Thirtieth
Asilomar Conf. on Signals, Systems, and Computers, vol. 2,
pp. 953-957, Nov. 1996.
[3] T. Wiegand, G. J. Sullivan, G. Bjontegaard and A. Luthra,
"Overview of the H.264/AVC Video Coding Standard," IEEE
Transactions on Circuits and Systems for Video Technology, vol. 13,
no. 7, July 2003.
[4] Yi-Hon Hsiao, Tien-Hsu Lee and Pao-Chi Chang, "Short/long-term
motion vector prediction in multi-frame video coding system," ICIP
2004, pp. 1449-1452.
[5] T. Wiegand, X. Zhang and B. Girod, "Long-term memory
motion-compensated prediction," IEEE Trans. Circuits Syst. Video
Technol., vol. 9, no. 1, pp. 70-84, Feb. 1999.
Table.3. Quality (PSNR) and Complexity of different fast MFMC schemes proposed with two reference frames