Error Concealment in Video

WCE Sangli

Page 1

WALCHAND COLLEGE OF ENGINEERING, SANGLI

A SEMINAR REPORT ON

Motion Vector Recovery Based Error Concealment For H.264 Video Communication

Submitted by:

BHADGAONKAR SATISH A.

T.E (ELECTRONICS)

ROLL NO: 70

SEAT NO: W33383

Under the guidance of:

Mr. R.G.MEVEKARI


CERTIFICATE

This is to certify that the seminar report entitled

Motion Vector Recovery Based Error Concealment For H.264 Video Communication

Submitted by:

BHADGAONKAR SATISH A.

It is a record of his own work carried out by him in partial fulfilment of

T.Y.B.TECH.(ELECTRONICS)

W.C.E, SANGLI.

Under my guidance during the academic year

2010-2011

Date:

Place: Sangli.

Mr. R. G. Mevekari (Guide)

Dr. (Mrs) S. S. Deshpande (H.O.D)

DEPARTMENT OF ELECTRONICS ENGINEERING

WALCHAND COLLEGE OF ENGINEERING, SANGLI.


ACKNOWLEDGEMENT

I would like to thank Mr. R. G. Mevekari for his guidance and inspiration, which helped me complete this seminar in a better way.

I am thankful to my college, WCE, for providing access to IEEE papers. I also thank the library personnel for offering all the help I needed for this work. Finally, I am thankful to all my colleagues who helped me directly or indirectly.

Mr. BHADGAONKAR SATISH A.


DECLARATION

I hereby declare that the seminar report entitled

"Motion Vector Recovery Based Error Concealment For H.264 Video Communication"

is an independent seminar work carried out by me during the T.Y.B.TECH. (ELECTRONICS) course under the guidance of Mr. R. G. Mevekari. This seminar report has not been previously submitted for any course in this college. I understand that any copying is liable to be punished in the way the College Authorities decide.

Date:

Place: Sangli.

Mr. BHADGAONKAR SATISH A.


INDEX

1. Introduction

2. Errors

3. Overview of Error Control and Error Concealment Techniques

4. Motion Vector Recovery

4.1 Temporal Replacement based MVR Scheme

4.2 Boundary Matching based MVR Algorithm

4.3 MVR based on Lagrange Interpolation

4.4 MVR based on Polynomial Model

5. MVR Techniques: A Performance Comparison

6. Conclusion and Open Issues

7. Bibliography


1. Introduction

With the explosive growth of the Internet and wireless networks, video services over these networks are becoming more and more popular. As a result, several compression-centric coding standards, such as MPEG-2/4, H.263 and H.264, have been developed to transmit video sequences, especially over bandwidth-limited channels. The common techniques employed by these video-coding standards include the discrete cosine transform (DCT), motion estimation/motion compensation (ME/MC), and variable length coding (VLC).


2. Errors

All of the above techniques are successful because they exploit the spatial, temporal, and statistical redundancy in video streams to compress them. However, highly compressed video bit streams are susceptible to transmission errors. These errors can propagate both spatially and temporally while decoding the current and subsequent video frames, leading to a severe degradation in the visual quality at the receiving side.

Transmission errors can be roughly classified into two categories: random bit errors and erasure errors. Random bit errors are caused by the imperfections of physical channels and result in bit insertion, bit inversion, and/or bit deletion. Depending upon the coding methods and the affected information content, the impact of random bit errors can range from negligible to objectionable. Erasure errors, on the other hand, can be caused by packet loss in packet networks, burst errors in storage media due to physical defects, or transient system failures. The effect of erasure errors is much more destructive than that of random bit errors, as the former cause loss of or damage to a contiguous segment of bits.

Figure 1: Subjective quality comparison for the "Stefan (QCIF)" sequence at 15% MB loss in the 9th frame at 1354 Kbps (a) Without error; PSNR=39.02 dB (b) 15% MBs lost; PSNR=13.24 dB.

Several techniques have been proposed to combat the visual quality degradation caused by errors that occur during transmission of such compressed videos. They fall into three groups:

1) Error resilience based techniques that improve the robustness of videos against transmission errors

2) Techniques that initiate an automatic retransmission request (ARQ) on a decoding error

3) Error concealment (EC) based techniques that hide or recover the errors by using the other non-erroneous video information received.


3. Overview of Error Control and Error Concealment Techniques

Several error-resilient coding techniques have been proposed, such as multiple description coding (MDC), layered video coding, error-resilient entropy coding (EREC), and reversible variable length coding (RVLC). A prominent drawback of these techniques is their low degree of compression. Some error resilience based methods rely on feedback from the decoder at the receiving end. In these techniques, error propagation cannot be suppressed immediately after an erroneous frame is detected, because of the round-trip time needed to transmit the feedback signal. The ARQ-based techniques also introduce retransmission delay in case of decoding errors. Thus, these techniques are not suitable for real-time video streaming.

Out of the techniques mentioned above, the post-processing EC methods are the most prominent. These EC methods are further classified into three schemes, namely frequency, spatial, and temporal. There are also hybrid schemes that use more than one of these three. Importantly, these methods can be made adaptive to changing error tolerance requirements, based on the channel noise. All three EC schemes use the data that is spatially or temporally adjacent to the lost data to recover the latter.

In the H.264 standard, the video is transmitted as a set of frames. Each frame is made of several macroblocks (MBs), which in turn are divided into sub-MBs. Each sub-MB is associated with a motion vector (MV).

The frequency concealment based techniques estimate the DCT coefficients of a missing block using the corresponding DCT coefficients of its neighbouring blocks. These techniques use either all the DCT coefficients or only the DC values. Spatial approaches, on the other hand, exploit the correlation between data (pixels or MVs) belonging to the blocks that neighbour the lost MB in the same frame to recover the lost MB. Temporal concealment based techniques use blocks from other frames to recover the lost data, either by attempting to reconstruct the MV of the lost MB or by searching for a block that is a good match to the neighbourhood of the missing MB.


4. Motion Vector Recovery

Motion vector recovery (MVR) techniques are popular because they effectively address two important problems related to EC in video communication, namely the computational time requirement and the quality of the recovered video. These techniques take less time to execute without compromising the quality of the recovered video, which makes them highly suitable for real-time video streaming applications.

4.1 Temporal Replacement based MVR Scheme

The most common temporal MVR method is temporal replacement (TR), which replaces the lost MVs with (0, 0). This signifies that no movement happened in the lost area of a given frame compared to the previous frame. Since all the lost MVs are simply replaced by (0, 0), TR is the fastest among the existing MVR techniques reported in the literature. Its main drawback is the poor quality of the recovered video compared to the other MVR techniques discussed below.
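The TR idea can be illustrated with a minimal sketch. It assumes that a frame is stored as a list of rows of luminance values and that the position and size of the lost block are known; the function name conceal_with_tr and the toy 4x4 frames are hypothetical and only serve the illustration.

```python
def conceal_with_tr(prev_frame, cur_frame, top, left, size=16):
    """Conceal one lost block by temporal replacement (MV = (0, 0)).

    Frames are lists of rows of luminance values (an assumed layout).
    The lost block in cur_frame is overwritten in place with the
    co-located block of prev_frame, i.e. "no motion" is assumed.
    """
    for row in range(top, top + size):
        for col in range(left, left + size):
            cur_frame[row][col] = prev_frame[row][col]
    return (0, 0)  # the MV assigned to the lost block


# Toy 4x4 frames with a 2x2 "lost" block at row 1, column 1.
previous = [[10, 10, 10, 10],
            [10, 50, 60, 10],
            [10, 70, 80, 10],
            [10, 10, 10, 10]]
current = [[11, 11, 11, 11],
           [11, 0, 0, 11],
           [11, 0, 0, 11],
           [11, 11, 11, 11]]
mv = conceal_with_tr(previous, current, top=1, left=1, size=2)
```

The sketch makes the cost argument visible: no search and no arithmetic on MVs is needed, only a block copy, which is also why the result is poor whenever the lost area actually moved.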

4.2 Boundary Matching based MVR Algorithm

The test model calculates the MVs for each of the lost MBs by using a boundary matching algorithm (BMA). The H.264 standard offers flexibility in configuring the size of an MB. Each MB can take any one of the following forms: a single MB, which is a 16×16 pixel matrix; four sub-MBs, which are 8×8 pixel matrices; eight sub-MBs, which are either 4×8 or 8×4 pixel matrices each; and sixteen sub-MBs, which are 4×4 pixel matrices each. In each of these cases, every sub-MB is associated with an MV.

The BMA works only on MBs that are configured as 8×8 sub-MBs. If an MB is realized as a single 16×16 pixel matrix, it is divided into four 8×8 sub-MBs, and the MVs of the new sub-MBs are the same as that of the larger MB. If an MB is divided into other sizes, for example 4×8 or 8×4, the smaller sub-MBs are merged to form a larger 8×8 sub-MB. In this case the MV of the newly formed sub-MB is the average of the MVs of the smaller sub-MBs that were merged to form it.

The lost MV of a sub-MB is predicted by choosing one of the MVs of the correctly decoded or already recovered adjacent sub-MBs. The choice is made as follows. The MV of one adjacent sub-MB is taken and assigned as the MV of the lost sub-MB. The resulting sub-MB is inserted into its place in the frame and the luminance change across its boundaries is computed. This step is repeated for each of the adjacent non-erroneous sub-MBs, and the MV that gives the smallest luminance change across the boundaries of the lost sub-MB is chosen as its predicted MV. The luminance change across the boundary of two sub-MBs is the average of the absolute differences of the pixels on the boundary.

Though this technique ensures reasonable picture quality in the recovered video, it requires a very large computational time compared to its counterparts discussed further in this report.

Unlike in other video-coding standards, the MVs of H.264 cover a smaller area of the video frame being encoded. This leads to a strong correlation between neighbouring MVs, which makes the H.264 standard amenable to statistical analysis for recovering the lost MVs. The techniques discussed further in this report are based on such statistical analysis.
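The boundary matching step described in this subsection can be sketched as follows. This is a simplified illustration, not the test model code: frames are assumed to be lists of rows of luminance values, an MV (dx, dy) is assumed to point from the block position into the previous frame, references outside the frame are clamped to the frame edge, and every in-frame pixel bordering the lost block is assumed to be correctly decoded. The function names fetch_block, boundary_change and select_mv are hypothetical.

```python
def fetch_block(prev_frame, top, left, mv, size):
    """Return the size x size block that mv = (dx, dy) points to in prev_frame."""
    dx, dy = mv
    height, width = len(prev_frame), len(prev_frame[0])
    block = []
    for r in range(size):
        rr = min(max(top + dy + r, 0), height - 1)  # clamp to frame (assumption)
        row = []
        for c in range(size):
            cc = min(max(left + dx + c, 0), width - 1)
            row.append(prev_frame[rr][cc])
        block.append(row)
    return block


def boundary_change(cur_frame, block, top, left, size):
    """Average absolute luminance difference across the block boundaries."""
    height, width = len(cur_frame), len(cur_frame[0])
    diffs = []
    for i in range(size):
        if top > 0:  # boundary with the row above
            diffs.append(abs(block[0][i] - cur_frame[top - 1][left + i]))
        if top + size < height:  # boundary with the row below
            diffs.append(abs(block[size - 1][i] - cur_frame[top + size][left + i]))
        if left > 0:  # boundary with the column to the left
            diffs.append(abs(block[i][0] - cur_frame[top + i][left - 1]))
        if left + size < width:  # boundary with the column to the right
            diffs.append(abs(block[i][size - 1] - cur_frame[top + i][left + size]))
    return sum(diffs) / len(diffs) if diffs else float("inf")


def select_mv(prev_frame, cur_frame, top, left, size, candidate_mvs):
    """Try each neighbouring MV and keep the one with the smallest change."""
    best_mv, best_cost = None, float("inf")
    for mv in candidate_mvs:
        block = fetch_block(prev_frame, top, left, mv, size)
        cost = boundary_change(cur_frame, block, top, left, size)
        if cost < best_cost:
            best_mv, best_cost = mv, cost
    return best_mv, best_cost


# Toy example: the current frame is the previous frame shifted right by one
# pixel, so the true MV of the lost 4x4 block at (2, 2) is (-1, 0).
prev_frame = [[10 * c + 3 * r for c in range(8)] for r in range(8)]
cur_frame = [[prev_frame[r][max(c - 1, 0)] for c in range(8)] for r in range(8)]
best_mv, best_cost = select_mv(prev_frame, cur_frame, 2, 2, 4,
                               [(0, 0), (-1, 0), (1, 0)])
```

Each candidate requires fetching a block and scanning its four boundaries, which illustrates why the BMA is considerably more expensive than TR.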

4.3 MVR based on Lagrange Interpolation

This sub-section presents an MVR method that is based on the Lagrange interpolation (LAGI) formula. The Lagrange interpolation formula is one of the most widely used interpolation functions, and its computational cost is lower than that of most other interpolation functions reported in the literature. The remainder of this section describes how a third order (n=3) polynomial interpolation can be used for MVR.

As mentioned earlier, the H.264 standard divides every frame into several MBs. Each MB is associated with 1 to 16 MVs, ensuring backward compatibility with previous standards. [Figure 2] shows an H.264 frame segment with 9 MBs denoted by F_m,n, where m and n denote the spatial location of the MB within the frame. Each MB is associated with 16 MVs. In [Figure 2], let F_m,n denote the lost MB. As in the case of many EC algorithms for MVR, it is assumed that either the two vertically adjacent or the two horizontally adjacent MBs of the lost MB are correctly decoded. In [Figure 2], it is assumed without loss of generality that both the horizontally adjacent MBs of F_m,n are error-free. In this case, the lost MVs of F_m,n are recovered row by row. Let MV_ij (0 ≤ i, j ≤ 3) denote the correct MVs that belong to the rows of the horizontally adjacent MBs of F_m,n, as shown in [Figure 2]. Let V_ij (0 ≤ i, j ≤ 3) represent the MVs of the rows of F_m,n that need to be recovered.


Figure 2: Location of lost MB and its neighbouring MBs.

The procedure to recover one row of MVs of F_m,n is as follows. Based on the LAGI formula, the correct neighbouring MVs MV_i0, ..., MV_i3 and the corresponding x coordinates (p_0, ..., p_3) are used to constitute a Lagrange polynomial. The value of V_ij can be computed as follows:

\[ V_{ij} = \sum_{k=0}^{3} L_{kj} \, MV_{ik} \]

where,

\[ L_{kj} = \prod_{l=0,\; l \neq k}^{3} \frac{x_j - p_l}{p_k - p_l} \]

and x_j denotes the x coordinate of the lost MV V_ij.

It is evident from the above equations that the values of the Lagrange parameters (L_0j, ..., L_3j) depend only on the coordinates and are therefore constant across all the lost MBs. The recovery of the other rows follows the same procedure. A similar procedure is followed if the vertically adjacent MBs of F_m,n are error-free. The main advantage of the LAGI technique is that it ensures high quality of the recovered video while consuming very little computational time.
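The row-wise recovery can be sketched as below. The x coordinates are an assumption made only for this illustration: two known MVs on each side of the lost MB are placed at positions -2, -1 and 4, 5, and the four lost MVs of the row at positions 0 to 3. The function names lagrange_weights and recover_row are hypothetical. Exact fractions are used so that the weights carry no rounding error.

```python
from fractions import Fraction


def lagrange_weights(known_x, target_x):
    """Return one Lagrange weight per known point for evaluation at target_x."""
    weights = []
    for k, p_k in enumerate(known_x):
        weight = Fraction(1)
        for l, p_l in enumerate(known_x):
            if l != k:
                weight *= Fraction(target_x - p_l, p_k - p_l)
        weights.append(weight)
    return weights


def recover_row(known_x, known_mvs, lost_x):
    """Interpolate the lost MVs of one row from four known MVs of that row.

    known_mvs holds (horizontal, vertical) MV components; both components
    are interpolated with the same weights.
    """
    recovered = []
    for x_j in lost_x:
        w = lagrange_weights(known_x, x_j)
        vx = sum(w_k * mv[0] for w_k, mv in zip(w, known_mvs))
        vy = sum(w_k * mv[1] for w_k, mv in zip(w, known_mvs))
        recovered.append((float(vx), float(vy)))
    return recovered


# Assumed coordinates: known MVs at x = -2, -1, 4, 5; lost MVs at x = 0..3.
# The horizontal component follows 2x + 1, the vertical one is constant.
known_x = [-2, -1, 4, 5]
known_mvs = [(-3, 3), (-1, 3), (9, 3), (11, 3)]
row = recover_row(known_x, known_mvs, [0, 1, 2, 3])
```

Because the weights depend only on the coordinates, lagrange_weights could be evaluated once for the four lost positions and reused for every row and every lost MB, which is where the low computational cost comes from.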

4.4 MVR based on Polynomial Model

This subsection presents the polynomial model method (PIM), which forms a polynomial that describes the motion tendency of the MVs adjacent to any of the lost MVs. The polynomial model results in an approximate function that describes the change tendency of the MVs within a small area. Based on this property, an approximation of the lost MVs can be obtained from the neighbouring MVs, and the lost MB can be reconstructed. The correct neighbouring MVs (the y values MV_i0, ..., MV_i3) and the corresponding x coordinates (the p values) are used to constitute a polynomial model. A polynomial model that describes the correlation of the MVs in the neighbouring MBs can be constituted as follows:

\[ W_l(x) = a_0 + a_1 x + \cdots + a_l x^l \]

where a_0, a_1, ..., a_l are a set of unknown coefficients that can be calculated from the given points and l is the order of the polynomial. The objective is to compute the set of coefficients such that the sum of the squares of the differences between W_l(x_i) and y_i is minimized. This sum can be expressed as a function of the independent variables a_0, a_1, ..., a_l, as shown in the following equation, where the sum runs over the four known points:

\[ F(a_0, a_1, \ldots, a_l) = \sum_{i=0}^{3} \left[ W_l(x_i) - y_i \right]^2 \]

To obtain the minimum of F(a_0, a_1, ..., a_l), the set of coefficients a_0, a_1, ..., a_l should satisfy Equation (4):

\[ \frac{\partial F}{\partial a_k} = 0, \quad k = 0, 1, \ldots, l \qquad (4) \]


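From Equation (4), a set of linear equations can be obtained to calculate the coefficients, as presented in Equation (5):

\[ \sum_{j=0}^{l} a_j \sum_{i=0}^{3} x_i^{j+k} = \sum_{i=0}^{3} y_i \, x_i^{k}, \quad k = 0, 1, \ldots, l \qquad (5) \]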

Since there are four MVs available in the neighbouring MBs, a polynomial of up to the third order (i.e. l=3) can be used to perform this interpolation. However, the first-order polynomial cannot accurately describe nonlinear movement. The third-order polynomial often results in an oscillatory curve and is suitable only for data that change quickly. The second-order polynomial represents a smooth curve and is therefore more suitable for this kind of application than the other two.

This technique produces recovered video of a quality comparable to that of the LAGI technique. On the other hand, it is slightly more expensive than LAGI in terms of the number of computations. Different polynomials are needed to handle the varying amount of motion in received video frames and so ensure higher PSNR values. In this context, the main advantage of this technique over LAGI is the flexibility of choosing different polynomial orders based on the characteristics of the frame to be concealed, which makes the technique adaptive in nature.
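A least-squares fit of this kind can be sketched as follows for a single MV component. The sketch builds the normal equations that follow from setting the partial derivatives of F to zero and solves them by Gaussian elimination. The coordinates of the known points (-2, -1, 4, 5) are an assumption for illustration, and the function names fit_polynomial and evaluate are hypothetical.

```python
from fractions import Fraction


def fit_polynomial(xs, ys, order=2):
    """Least-squares polynomial fit; returns [a_0, ..., a_order] as floats."""
    xs = [Fraction(x) for x in xs]
    ys = [Fraction(y) for y in ys]
    n = order + 1
    # Normal equations: sum_j a_j * sum_i x_i^(j+k) = sum_i y_i * x_i^k
    a = [[sum(x ** (j + k) for x in xs) for j in range(n)] for k in range(n)]
    b = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [Fraction(0)] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - tail) / a[r][r]
    return [float(c) for c in coeffs]


def evaluate(coeffs, x):
    """Evaluate the fitted polynomial W(x) = sum_k a_k * x^k."""
    return sum(a_k * x ** k for k, a_k in enumerate(coeffs))


# Known MV components sampled from 1 + 2x + 0.5x^2 at assumed coordinates.
coeffs = fit_polynomial([-2, -1, 4, 5], [-1, -0.5, 17, 23.5], order=2)
lost = [evaluate(coeffs, x) for x in (0, 1, 2, 3)]
```

Changing the order argument is the only modification needed to switch between the first-, second- and third-order models, which reflects the adaptivity discussed above; each MV component would be fitted separately.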


5. MVR Techniques: A Performance Comparison

The experimental results presented in this section use the standard benchmark video sequence Coastguard. The sequence has a total of 300 frames and is encoded and decoded by JM12.4, which is a standard CODEC program for H.264. In the simulations reported in this section, the fixed group of pictures (GOP) length (IntraPeriod) parameter is set to 11 to achieve the best trade-off between compression and quality. The PSNR results for the different benchmark sequences across different MVR algorithms with both the test scenarios are presented. In each case, the MB loss rate is assumed to be 15% and QP is set to 20.

[Figure 5] presents a different test scenario, in which deterministic errors of 15% MB loss are introduced in a given P frame of the Coastguard sequence. To capture the worst-case scenario, the simulator introduces these errors in the frames that have the maximum motion. Interestingly, in this case the 69th frame has the maximum motion irrespective of the bitrate at which the sequence is coded. It is evident from this performance analysis that LAGI and PIM are comparable, BMA stands second, and TR is at the bottom in terms of the quality of the recovered video.

Figure 5: Subjective quality comparison for the "Coastguard (QCIF)" sequence at 15% macro blocks loss in 69th frame at QP=24 (a) Original (b) 15% macro blocks lost (c) Concealed using TR (d) Concealed using BMA (e) Concealed using LAGI (f) Concealed using PIM.
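The quality figures quoted in this report are PSNR values. As a reminder of how such a value is obtained, the following minimal sketch computes the PSNR between an original and a decoded luminance frame from their mean squared error, assuming 8-bit samples (peak value 255). The function name psnr and the list-of-rows frame layout are illustrative assumptions and are not taken from JM12.4.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    total = 0
    count = 0
    for row_o, row_d in zip(original, decoded):
        for p_o, p_d in zip(row_o, row_d):
            total += (p_o - p_d) ** 2
            count += 1
    mse = total / count  # mean squared error over all pixels
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


# Two tiny 2x2 frames: identical frames and maximally different frames.
frame = [[0, 0], [0, 0]]
same = psnr(frame, [[0, 0], [0, 0]])
worst = psnr(frame, [[255, 255], [255, 255]])
```

A higher value means the concealed frame is closer to the error-free one, which is how the drop from 39.02 dB to 13.24 dB in Figure 1 should be read.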



6. Conclusion and Open Issues

In this report, various techniques for performing EC in video communication have been described. A subclass of EC methods applied to video streams focuses on the optimal estimation of erroneously received motion fields based on the available surrounding information. This subclass is termed motion vector recovery (MVR). This report elaborates some of the most popular, simple, efficient and widely used MVR techniques reported in the literature. It also presents performance comparison results for these selected MVR techniques in simulation as well as in a real-time video streaming environment.

There are many practical issues in error-resilient video communication that need to be addressed. The first and foremost of them is a system-level framework for video communication in which the encoding algorithm, transport protocol, and post-processing method are designed jointly to minimize the combined distortion due to both compression and transmission. In the case of MVR-based EC techniques, there must be more emphasis on covering all kinds of motion in a video frame (e.g. nonlinear, fast, and sudden movements) using a single MVR algorithm. There is great scope for using some of the more sophisticated nonlinear interpolation techniques in order to capture different kinds of movement in the lost MB at a lower computational complexity.


7. Bibliography

1. Kavish Seth, V Kamakoti, S Srinivasan, Department of Electrical Engineering, Indian Institute of Technology - Madras, Chennai - 600036, India; Department of Computer Science & Engineering, Indian Institute of Technology - Madras, Chennai - 600036, India

2. Jinghong Zheng, Student Member, IEEE, and Lap-Pui Chau, Senior Member, IEEE

3. Donghyung Kim, Sanghyup Cho, and Jechang Jeong, Dept. of Electrical and Computer Engineering, Hanyang University, Haengdang, Seongdong, Seoul, South Korea
