    Temporal Motion Prediction for Fast Motion Estimation in Multiple Reference Frames

    G Nageswara Rao, PSSBK Gupta

    Emuzed A Flextronics Company, Bangalore, India

    Email: {nageswararao.gunupudi, shyam.pallapothu}@flextronicssoftware.com

    Abstract--- In the new video coding standard, H.264/MPEG-4 Part 10, motion compensation may use multiple reference frames, which improves rate-distortion performance but at the cost of a drastic increase in complexity. The computation grows in proportion to the number of reference frames searched, whereas the reduction in prediction residue depends heavily on the nature of the sequence. In this paper, we present a fast technique that predicts the motion vector in each reference frame to speed up the matching process for multiple reference frames. The proposed technique chooses a motion centre and searches around that centre with a radius of 1 or 2 pixels in all reference frames with the exception of the one that immediately precedes the current frame. For the reference frame that immediately precedes the current frame, any motion estimation technique can be used. The results show that the proposed technique reduces the computational requirements to about those of single reference frame motion estimation with only a negligible loss of objective quality.

    Index terms--- H.264/AVC, Motion Estimation,

    Multi Frame Motion Compensation (MFMC)

    I. INTRODUCTION

    H.264/AVC is the newest video coding standard [1] of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. H.264/AVC introduced many new coding and error-resilience tools and, as a result, achieves a significant improvement in rate-distortion efficiency relative to existing standards: a bit-rate reduction of up to 50% over the MPEG-4 Advanced Simple Profile. The motion compensation process defined in H.264, with quarter-pixel accuracy, variable block sizes and multiple reference frames, greatly reduces prediction errors. Multi-frame motion compensation (MFMC) is one such tool that provides significant coding gain as well as better error robustness. The multi-frame buffer stores, at both encoder and decoder, frames that are useful for motion-compensated prediction. Multi-frame motion estimation was proposed by Budagavi and Gibson [2] as a technique to improve the error resilience of compressed video. The MFMC coder exploits the redundancy that exists across multiple frames in typical video-conferencing sequences to achieve additional compression over that obtained with the single-frame motion compensation (SFMC) approach [3]. Because it uses information from multiple frames, MFMC is also more robust to error propagation than traditional SFMC.

    As a consequence of the many prediction tools in motion compensation, the motion estimation process in H.264 becomes considerably more complex. Multi-frame motion compensation improves rate-distortion performance substantially, but at the price of a much higher load on the system. If temporal correlations between the reference frames are ignored, conventional single-frame search algorithms can still be applied to multi-frame motion estimation, using a rather inefficient frame-by-frame approach. This, however, multiplies the complexity of motion estimation (ME) by the number of reference frames relative to single-frame motion estimation, since ME must be carried out for every reference frame, which is not feasible for most real-time implementations. Moreover, the decrease in prediction residue depends on the nature of the sequence: sometimes the prediction gain is very significant, but sometimes a lot of computation is spent without any considerable bit-rate reduction. As a consequence, several fast techniques have recently been developed for motion estimation in multiple reference frames. We present an effective algorithm that accelerates multiple reference frame ME without significant loss of video quality.

    In this paper we propose an algorithm that uses the motion vectors already computed with respect to the first reference frame, i.e. the reference frame closest to the current frame, to predict the best motion vectors for the second and subsequent reference frames. The proposed scheme minimizes the computational overhead of motion estimation in reference frames other than the first. The technique is independent of the ME algorithm used for the first reference frame. The main goal of the algorithm is to perform a fast search in all reference frames other than the first while maintaining a PSNR similar to that obtained when the full search block-matching algorithm (FSBMA) is used in each reference frame. The rest of the paper presents statistics of multi-frame motion estimation (Section II), the proposed motion vector prediction system and motion estimation algorithm (Sections III and IV), simulation results (Section V) and conclusions (Section VI).

    2006 IEEE International Symposium on Signal Processing and Information Technology

    0-7803-9754-1/06/$20.00 ©2006 IEEE

    II. MULTI FRAME MOTION ESTIMATION

    Multi-frame motion estimation extends the temporal displacement vectors used in block-matching video coding by permitting motion-compensated prediction from more frames than just the most recently decoded one. Using multiple frames for motion estimation in many cases provides a significant improvement in coding gain [5] as well as better error robustness. It is widely agreed that motion estimation is the most complex module of standards-based video encoders, and its complexity increases N times when motion vectors are searched in N reference frames. The decrease in prediction residue, however, depends on the nature of the video content. The new H.264 standard supports hybrid block motion compensation with multiple reference frames, which significantly increases the complexity of video encoders for real-time implementations. Given an inter mode, the reference software JM9.5 adopts a full search and carries out the matching process in all reference frames one by one. The best mode is chosen by minimizing a Lagrangian cost function that considers both the 2-D 4x4 Hadamard-transformed SAD (SATD) and the number of bits required to code the side information. Table.1 gives the PSNR results and the ME complexity in seconds for different sequences with one, two and three reference frames at a bit rate of 512 kbps. Table.2 shows the percentage of references from the first and the other reference frames. The number of macroblocks referenced decreases for the farthest reference frames, and the number of references from the other reference frames is smaller than that from the first reference frame.
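The cost described above can be illustrated with a small sketch. It assumes the common form J = D + lambda * R, with D the SATD of a 4x4 block and R the side-information bits; the function names, the unnormalized SATD and the value of lambda are assumptions made for illustration and are not taken from JM9.5.

```python
# 4x4 Hadamard matrix (symmetric, entries +1/-1).
H4 = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
    [1, -1,  1, -1],
]


def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def satd4x4(cur, ref):
    """Sum of absolute 2-D Hadamard-transformed differences of two 4x4 blocks.

    No normalization is applied here (an assumption of this sketch).
    """
    diff = [[cur[i][j] - ref[i][j] for j in range(4)] for i in range(4)]
    # H * D * H^T; H4 is symmetric, so H^T == H.
    transformed = matmul4(matmul4(H4, diff), H4)
    return sum(abs(v) for row in transformed for v in row)


def lagrangian_cost(distortion, side_info_bits, lam):
    """Assumed cost form J = D + lambda * R."""
    return distortion + lam * side_info_bits
```

A candidate with a slightly larger SATD can still win if it needs fewer bits for its motion vector and reference index, which is why distant reference frames are penalized by their signaling overhead.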

    Table.2 shows that, except for the Salesman sequence, the first reference frame has the highest probability of being referenced for motion compensation. The references from the other reference frames are nevertheless not insignificant, which shows the scope for designing fast and accurate ME techniques for secondary reference frames. A PSNR loss is observed for the Salesman sequence (Table.1) when going from two to three reference frames. This is explained by its reference frame statistics in Table.2: the increase in overhead bits for signaling the reference frame index in the three reference frame case is not compensated by the gain in residual error.

    Frames and Motion Vectors classification in MFMC:

    The statistics and observations presented above clearly give special importance to the reference frame that immediately precedes the current frame and to the motion vectors corresponding to that reference frame. We therefore introduce a classification of the reference frames and motion vectors used in MFMC. We term the reference frame that immediately precedes the current frame the primary reference frame (PRF) and all the rest secondary reference frames (SRF). Similarly, the motion vectors obtained with respect to the primary reference frame are termed primary motion vectors (PMV) and all the rest secondary motion vectors (SMV). This classification adds clarity to the motion estimation process in terms of complexity-accuracy trade-offs. The proposed technique addresses motion estimation in the secondary reference frames; motion estimation in the primary reference frame may use any fast technique.

    III. TEMPORAL PREDICTION OF MOTION VECTOR

    The temporal prediction algorithm proposed here is based on the fact that the motion of macroblocks can be tracked temporally across frames through the primary motion vectors of the frames between the current frame and the reference frame. The best match for a block of the current frame in a reference frame can be tracked by adding the motion vectors between every two neighboring frames that lie between the current frame and the reference frame, i.e. the primary motion vectors of those intermediate frames.

    Table.1. Quality (PSNR) and Complexity for Multiple reference frame Motion Estimation

    Table.2. Reference frame statistics with 10 reference frames

    It is empirically observed that motion across frames is linear and smooth. Hence, tracking the motion between every two frames using primary motion vectors, and summing the motion vectors of every pair of adjacent frames between the current frame and the secondary reference frame, can identify the best match of a block in the current frame.

    Let us say the current block is at $(x, y)$ in the current frame and is found to be best matched at location $(x_1, y_1)$ in the primary reference frame, i.e. the displacement between $(x, y)$ and $(x_1, y_1)$ is the primary motion vector (PMV) of the block. It is then reasonable to expect that the block at $(x, y)$ closely matches the block that is the best match for $(x_1, y_1)$. However, the block at $(x_1, y_1)$ may not be aligned with a block boundary of the reference frame at T-1, so $(x_1, y_1)$ is approximated to the nearest block boundary $(xb_1, yb_1)$, and the motion vector of block $(xb_1, yb_1)$, say $(mvx_2, mvy_2)$, is used as the predicted motion vector for the reference frame at T-2. If $(xb_1, yb_1)$ falls outside the frame boundary, it is approximated to the nearest block position. The best match for the current block is then observed to be most likely around $(x_2, y_2)$ in the first secondary reference frame, so the motion search for the current block in its first secondary reference frame can be carried out around $(x_2, y_2)$. Fig.2 illustrates the temporal motion tracking system for secondary reference frames. The search center for motion estimation in the first secondary reference frame can thus be formulated as follows.

    $$
    \begin{aligned}
    (mvx_2,\ mvy_2) &= \text{PMV of } (xb_1,\ yb_1) \\
    (x_2,\ y_2) &= (xb_1 + mvx_2,\ yb_1 + mvy_2) \\
    xb_1 &= \{ (x_1 + BlockSize/2) \gg B \} \times BlockSize \\
    yb_1 &= \{ (y_1 + BlockSize/2) \gg B \} \times BlockSize \\
    \text{where } B &= \log_2(BlockSize)
    \end{aligned}
    \tag{1}
    $$
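A minimal sketch of Eq. 1 follows. It assumes a power-of-two block size, frame dimensions that are multiples of the block size, and a hypothetical `pmv_field` dictionary that maps the top-left corner of each block in the frame at T-1 to its stored primary motion vector. Clamping is used here as one way to realize the "nearest block position" rule for out-of-frame positions.

```python
def align_to_block(coord, block_size):
    """Round a pixel coordinate to the nearest block boundary (Eq. 1)."""
    b = block_size.bit_length() - 1  # B = log2(BlockSize), power of two assumed
    return ((coord + block_size // 2) >> b) * block_size


def predict_search_center(x1, y1, pmv_field, block_size, width, height):
    """Return (x2, y2), the search center in the first secondary reference frame.

    (x1, y1) is the best-match position found in the primary reference frame.
    pmv_field (hypothetical layout) maps a block origin (xb, yb) in the frame
    at T-1 to that block's primary motion vector (mvx, mvy).
    """
    xb1 = align_to_block(x1, block_size)
    yb1 = align_to_block(y1, block_size)
    # If the aligned position falls outside the frame, move it to the
    # nearest valid block position.
    xb1 = min(max(xb1, 0), width - block_size)
    yb1 = min(max(yb1, 0), height - block_size)
    mvx2, mvy2 = pmv_field[(xb1, yb1)]
    return xb1 + mvx2, yb1 + mvy2
```

For example, with 16x16 blocks a best match at (21, 40) is aligned to the block at (16, 48); if that block's stored PMV is (-3, 2), the search in the frame at T-2 is centered at (13, 50).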

    The deviation of the predicted motion vector from the final best motion vector in the secondary reference frame is shown for the Foreman and Mobile sequences in Fig.1. About 80% to 90% of the predicted motion vectors fall within a 2-pixel displacement of the best motion vector, and 60% fall within a 1-pixel displacement. A motion search around the predicted motion vector of Eq. 1 with a search range of ±2 pixels is therefore sufficient to find the best match of a given macroblock in a given reference frame in most cases. PSNR results for fast motion estimation with search ranges of ±1 and ±2 (referred to as 1x1 and 2x2 searches, respectively) are given in Table.3. To further speed up the computation of motion vectors in secondary reference frames, we have chosen an iterative pattern with a ±1 search area: the search is first carried out in a ±1 area and terminates if the best motion vector is found at the center of the search area; otherwise the search continues in a ±1 range around the best motion vector position of the current search area. Table.3 also shows the results for this iterative strategy with a ±1 pixel search window and two iterations. The early exit of the iterative technique enables much faster convergence of motion estimation in secondary reference frames. Fig.3 gives various alternative search patterns that can be applied around the search center.
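The iterative ±1 strategy with early exit can be sketched as below. The `cost` callable, the square 3x3 pattern and the returned evaluation count are assumptions made for illustration; the sketch does not cache positions shared between successive windows, which a real encoder would likely do.

```python
def iterative_search(cost, center, radius=1, max_iters=2):
    """Iterative small-window search with early exit.

    cost:   callable (mvx, mvy) -> number, lower is better (hypothetical).
    center: starting motion vector (mvx, mvy).
    Returns (best_mv, number_of_cost_evaluations).
    """
    best = center
    best_cost = cost(*center)
    evaluated = 1
    for _ in range(max_iters):
        cx, cy = best
        improved = False
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                if dx == 0 and dy == 0:
                    continue  # center cost is already known
                cand = (cx + dx, cy + dy)
                c = cost(*cand)
                evaluated += 1
                if c < best_cost:
                    best, best_cost, improved = cand, c, True
        if not improved:
            break  # best MV is the window center: early exit
    return best, evaluated
```

Under these assumptions the search costs 9 evaluations when the seed is already the best position and at most 17 with two iterations, compared with (2*32+1)^2 = 4225 integer positions for a full search over a ±32 window.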

    Fig.1.

    Fig.2. Motion vector tracking over the frames to predict the motion vector in secondary reference frames. The figure shows frames T, T-1 (PRF) and T-2 (SRF) with the block positions $(x, y)$, $(x_1, y_1)$ and $(x_2, y_2)$.

    Fig.3. Search Pattern for Secondary reference frames Motion estimation

    IV. PROPOSED ALGORITHM

    The observations made in the previous sections lead to a fast motion estimation algorithm for secondary reference frames, which is presented in this section. The seed point for the motion search is obtained from the primary motion vectors (PMVs) of the frames that lie between the current frame and the reference frame under ME search. To cover different kinds of motion and make the algorithm more robust, we also include the zero motion vector (ZMV) and the predicted motion vector defined in the H.264 standard (SPMV) in seed point selection. Among the ZMV, the temporally predicted motion vector and the SPMV, the motion vector that results in the least rate-distortion cost is selected as the seed point, and an iterative search within a ±1 search area is carried out about the seed point. The algorithm for motion estimation with multiple reference frames is summarized below.

    Step1: Perform motion estimation in the primary reference frame (any fast technique may be applied) and store the primary motion vectors.

    Step2: Compute the temporal motion vector predictor using the primary motion vectors of the frames between the current frame and the reference frame under search. For the first secondary reference frame, the seed for ME is the primary motion vector of the block in the primary reference frame that maximally covers the best match of the current block of the current frame.

    Step3: Choose the best seed among the zero motion vector, the standard predicted motion vector and the temporally predicted motion vector derived in Step2 by minimizing the R-D cost.

    Step4: Search for the best match for the current block in the secondary reference frame around the best seed computed in Step3 with a radius of ±1.

    Step5: If the best MV found in Step4 is the search center, i.e. the seed point, take it as the best MV for the current block and go to Step6. Otherwise, take the best MV of the ±1 search as the new search center and continue the search about it with a ±1 search range.

    Step6: Repeat Step2 to Step5 for all the secondary reference frames.
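Steps 3 to 6 can be sketched as follows, assuming the temporal predictor of Step2 and the standard predictor are already available as motion vectors. The function names, the per-frame `cost` callables and the two-iteration limit are illustrative assumptions, not the authors' implementation.

```python
def choose_seed(cost, temporal_mv, standard_mv):
    """Step3: pick the cheapest of ZMV, SPMV and the temporal predictor."""
    candidates = [(0, 0), standard_mv, temporal_mv]
    return min(candidates, key=lambda mv: cost(*mv))


def refine(cost, seed, max_iters=2):
    """Steps 4-5: search a +/-1 window, re-centering while the cost improves."""
    best = seed
    for _ in range(max_iters):
        cx, cy = best
        window = [(cx + dx, cy + dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
        winner = min(window, key=lambda mv: cost(*mv))
        if cost(*winner) >= cost(*best):
            break  # the center is still the best MV: early exit
        best = winner
    return best


def secondary_me(costs_per_frame, temporal_mvs, standard_mvs):
    """Step6: repeat seed selection and refinement for each secondary frame.

    costs_per_frame holds one hypothetical R-D cost callable per secondary
    reference frame; the other two lists hold the matching predictors.
    """
    return [refine(cost, choose_seed(cost, t_mv, s_mv))
            for cost, t_mv, s_mv in zip(costs_per_frame, temporal_mvs, standard_mvs)]
```

Including the zero and standard predictors costs only two extra evaluations per block and guards against cases where the tracked vector is a poor seed.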

    V. SIMULATION RESULTS

    The JM9.5 reference software is used to generate the simulation results for the proposed algorithm. All results are generated in Baseline profile at CIF (352x288) resolution, 512 kbps and a frame rate of 15 fps. The same techniques were also verified at different bit rates. The Foreman, Hall, Football, Claire, Flower, and Mother and Daughter sequences are used. Only I and P frames are used, with a GOV length of 60, a search range of 32, and inter prediction modes up to 8x8 blocks. Table.3 gives the comparison of PSNR and motion estimation time in seconds between the full search algorithm and the proposed techniques with two reference frames used for motion compensation. The full search block-matching algorithm is used for motion estimation in the primary reference frame.

    VI. CONCLUSIONS

    A fast technique for predicting motion in multiple reference frames is presented. The proposed algorithm scales the complexity of motion estimation in the multiple reference frame scenario down to close to that of single reference frame motion estimation, with negligible loss of PSNR. The proposed scheme also reduces memory traffic in hardware implementations, since the search area in motion estimation is minimized.

    REFERENCES

    [1] Joint Video Team of ITU-T and ISO/IEC JTC 1, ITU-T Rec. H.264 / ISO/IEC 14496-10 AVC, March 2003.

    [2] M. Budagavi and J. Gibson, "Multi-frame Block Motion Compensated Video Coding for Wireless Channels," in Thirtieth Asilomar Conf. on Signals, Systems, and Computers, vol. 2, pp. 953-957, Nov. 1996.

    [3] T. Wiegand, G. J. Sullivan, G. Bjontegaard and A. Luthra, "Overview of the H.264/AVC Video Coding Standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, July 2003.

    [4] Y.-H. Hsiao, T.-H. Lee and P.-C. Chang, "Short/long-term motion vector prediction in multi-frame video coding system," in ICIP 2004, pp. 1449-1452.

    [5] T. Wiegand, X. Zhang and B. Girod, "Long-term memory motion compensated prediction," IEEE Trans. Circuits Syst. Video Technol., vol. 9, no. 1, pp. 70-84, Feb. 1999.

    Table.3. Quality (PSNR) and Complexity of different fast MFMC schemes proposed with two reference frames