An Evaluation Framework for More Realistic Simulations of MPEG Video Transmission


    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 24, 425-440 (2008)


CHIH-HENG KE1, CE-KUEN SHIEH2, WEN-SHYANG HWANG3 AND ARTUR ZIVIANI4

1Department of Computer Science and Information Engineering
National Kinmen Institute of Technology
Kinmen, 892 Taiwan

2Department of Electrical Engineering
National Cheng Kung University
Tainan, 701 Taiwan

3Department of Electrical Engineering
National Kaohsiung University of Applied Sciences
Kaohsiung, 807 Taiwan

4National Laboratory for Scientific Computing (LNCC)
Petrópolis, Rio de Janeiro, 25651-075 Brazil

We present a novel and complete tool-set for evaluating the delivery quality of MPEG video transmissions in simulations of a network environment. This tool-set is based on the EvalVid framework. We extend the connecting interfaces of EvalVid to replace its simple error simulation model with a more general network simulator, such as NS2. With this combination, researchers and practitioners can analyze through simulation the performance of real video streams, i.e. taking the video semantics into account, under a wide range of network scenarios. To demonstrate the usefulness of the new tool-set, we show that it enables the investigation of the relationship between two popular objective metrics for Quality of Service (QoS) assessment of video quality delivery: the PSNR (Peak Signal to Noise Ratio) and the fraction of decodable frames. The results show that the fraction of decodable frames reflects the behavior of the PSNR metric well, while being less time-consuming to compute. The fraction of decodable frames can therefore serve as an alternative metric to objectively assess, through simulations driven by publicly available video trace files, the delivery quality of video transmissions in a network.

Keywords: network simulation, MPEG video, EvalVid, NS2, PSNR, fraction of decodable frames

    1. INTRODUCTION

The ever-increasing demand for multimedia distribution over the Internet motivates research on how to provide better delivered video quality through IP-based networks [1]. Previous studies [2-7] often use publicly available real video traces to evaluate their proposed network mechanisms in a simulation environment [8-12]. Results are usually presented using different performance metrics, such as the packet/frame loss rate, packet/frame jitter [13], effective frame loss rate [8], picture quality rating (PQR) [13], and the fraction of decodable frames [9]. Nevertheless, packet loss and jitter rates are network performance metrics and may be insufficient to adequately rate the quality perceived by a (human) end user. Although the effective frame loss rate, PQR, and the fraction of decodable frames are application-level Quality of Service (QoS) metrics, they are not as well known and accepted as MOS (Mean Opinion Score) and PSNR (Peak Signal to Noise Ratio) [14].

Received January 9, 2006; revised June 19, 2006; accepted August 2, 2006.
Communicated by Chung-Sheng Li.

Furthermore, it is hard to study the effects of proposed network mechanisms on

    different characteristics of the same video extensively because the encoding settings for

the publicly available video traffic traces are limited. As a consequence, how best to simulate and evaluate the performance of video quality delivery in a simulated network environment is a recurrent open issue in network simulation forums, such as [15].

    EvalVid [16], a complete framework and tool-set for evaluation of the quality of

    video transmitted over a real or simulated communication network, provides packet/

frame loss rate, packet/frame jitter, PSNR, and MOS metrics for video quality assessment purposes. The primary aim of EvalVid is to assist researchers or practitioners in

    evaluating their network designs or setups in terms of the perceived video quality by the

    end user. Nevertheless, the simulated environment provided by EvalVid is simply an

    error model to represent corrupted or missing packets in the real network. The lack of

    generalization of this simple error model causes problems for researchers or practitioners

who seek to assess the video quality delivered to end users in more complex and realistic network scenarios. For example, when transmitting video packets via unicast over an IEEE 802.11 wireless network, the MAC layer at a sender retransmits an unacknowledged packet at most N times before giving up. The perceived correct rate at the application level is thus

    P_CORRECT = sum_{i=1}^{N} (1 - p) p^(i-1) = 1 - p^N,

where N is the maximum number of transmission attempts at the MAC layer and p is the packet error rate at the physical level. As a consequence, the application-level error rate is p_effective = p^N. In this kind of scenario, the results obtained from the original EvalVid framework are misleading, since the simple error model does not take the retransmission mechanism into consideration.

This paper integrates EvalVid with NS2 [17], a widely adopted network simulator.
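As an illustration of the retransmission arithmetic above, the following sketch (our own; the sample values of p and N are arbitrary examples, not from the paper) contrasts the physical-level error rate with the error rate seen by the application:

```python
def effective_error_rate(p: float, n: int) -> float:
    """Application-level packet error rate when the MAC layer makes up to
    n transmission attempts, each failing independently with probability p:
    p_effective = p**n."""
    return p ** n

def correct_rate(p: float, n: int) -> float:
    """P_CORRECT = sum_{i=1}^{n} (1 - p) * p**(i - 1) = 1 - p**n."""
    return sum((1 - p) * p ** (i - 1) for i in range(1, n + 1))

# With a 10% physical-level error rate and 4 MAC-layer attempts, the
# application sees a 0.01% loss rate, not 10% -- which is why a flat
# error model without retransmission is misleading here.
p, n = 0.1, 4
assert abs(correct_rate(p, n) - (1 - effective_error_rate(p, n))) < 1e-12
```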

    On the one hand, the resulting tool-set from this integration allows network researchers

    and practitioners to analyze their proposed new network designs in the presence of real

    video traffic in a straightforward way. On the other hand, mechanisms for enhancing the

    delivery quality of video streams can be evaluated in more complex simulated network

    scenarios, including characteristics like relatively large topologies, broadband access,

    limited bandwidth, wireless, node mobility, and whatever functionality is available at the

    network simulator. Furthermore, we use our new evaluation framework provided by this

    tool-set to investigate the relationship between two objective QoS assessment metrics:

    PSNR [18] and the fraction of decodable frames [9]. PSNR takes into account the video

    content and hence it is more time-consuming than the fraction of decodable frames,

    which is straightforward to compute. The new tool-set enables the analysis showing that

    the fraction of decodable frames can reflect the behavior of the PSNR metric adequately,

    while being less time-consuming.

To the best of our knowledge, no other tool-set is publicly available to perform a comprehensive video quality evaluation of real video streams in a network simulation environment. We argue that the proposed tool-set enables more realistic simulations of video


transmission in a dual sense. This tool-set enables video-coding or video-QoS technicians to simulate the effects of a more realistic network on the video sequences resulting from their coding or QoS schemes, respectively. Likewise, the proposed tool-set also enables networking practitioners to evaluate the effects of real video streams on proposed network protocols, for instance. Indeed, we believe that our tool-set provides convergence toward more realistic simulations of video transmissions in the broad sense, thus enabling a wide range of video transmissions in network scenarios to be evaluated. References [19-21] are examples that use this tool-set to evaluate their respective proposed mechanisms. The new tool-set for evaluating the quality performance of network video transmissions is publicly available at [22].

The remainder of this paper is organized as follows. Section 2 provides a brief overview of EvalVid. Section 3 describes the developed connecting agents between EvalVid and NS2, as well as an improved fix YUV program to replace the conventional one. Section 4 analyzes the proposed QoS assessment framework for video streams, using two examples to illustrate the video quality evaluation. Section 5 investigates the relationship between the QoS assessment metrics PSNR and the fraction of decodable frames. Finally, section 6 presents the concluding remarks.

    2. OVERVIEW OF EVALVID

    The structure of the EvalVid framework is shown in Fig. 1, redrawn from [16].

[Figure omitted: it depicts the EvalVid tool chain. The raw YUV video at the sender is encoded, and the VS component transmits the coded video over a real network or a simulation, which introduces loss and delay and yields a video trace, a sender trace, and a receiver trace. From these, the possibly erroneous video is reconstructed, decoded, passed through the play-out buffer, and fixed (FV) into a reconstructed raw YUV video at the receiver. The results comprise frame loss / frame jitter and the user-perceived quality (PSNR, MOS).]

Fig. 1. Schematic illustration of the evaluation framework provided by EvalVid.

    The main components of the evaluation framework are described as follows:

Source  The video source can be either in the YUV QCIF (176 x 144) or in the YUV CIF (352 x 288) format.

Video Encoder and Video Decoder  Currently, EvalVid supports only single-layer video coding. It supports three kinds of MPEG-4 codecs, namely the NCTU codec [23], ffmpeg [24], and Xvid [25]. The focus of this investigation is the NCTU codec for video coding purposes.

VS (Video Sender)  The VS component reads the compressed video file from the output of the video encoder, fragments each large video frame into smaller segments, and

    then transmits these segments via UDP packets over a real or simulated network. For

    each transmitted UDP packet, the framework records the timestamp, the packet ID, and

    the packet payload size in the sender trace file with the aid of third-party tools, such as

    tcp-dump [26] or win-dump [27], if the network is a real link. Nevertheless, if the net-

    work is simulated, the sender trace file is provided by the sending entity of the simulation.

    The VS component also generates a video trace file that contains information about every

    frame in the real video file. The video trace file and the sender trace file are later used for

    subsequent video quality evaluation. Examples of a video trace file and a sender trace file

    are shown in Tables 1 and 2, respectively. It can be seen that the packets with IDs 1 to 4

    originate from the same video frame since their transmission times are equal.

    Table 1. Example of video trace file.

    Frame Number Frame Type Frame Size Number of UDP-packets Sender Time

    0 H 29 1 segment at 33 ms

    1 I 3036 4 segments at 67 ms

    2 P 659 1 segment at 99 ms

    3 B 357 1 segment at 132 ms

    4 B 374 1 segment at 165 ms

    ...

    Table 2. Example of sender trace file.

Time stamp (sec) Packet ID Packet Type Payload Size (bytes)

0.033333 0 udp 29

    0.066666 1 udp 1000

    0.066666 2 udp 1000

    0.066666 3 udp 1000

    0.066666 4 udp 36

    0.099999 5 udp 659

    0.133332 6 udp 357

    0.166665 7 udp 374

    ... ... ... ...
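The fragmentation performed by VS can be sketched as follows; this is our own simplified illustration, assuming the 1,000-byte maximum payload evident in Table 2:

```python
def fragment(frame_size: int, max_payload: int = 1000) -> list[int]:
    """Split one video frame into the UDP packet payload sizes that
    VS would transmit for it."""
    payloads = []
    remaining = frame_size
    while remaining > 0:
        payloads.append(min(remaining, max_payload))
        remaining -= max_payload
    return payloads

# The 3036-byte I frame of Table 1 becomes the four packets
# (1000, 1000, 1000, 36 bytes) listed in Table 2.
assert fragment(3036) == [1000, 1000, 1000, 36]
assert fragment(29) == [29]
```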

    ET (Evaluate Trace) Once the video transmission is over, the evaluation task begins.

The evaluation takes place at the sender side. Therefore, the information about the timestamp, the packet ID, and the packet payload size available at the receiver has to be transported back to the sender. Based on the original encoded video file, the video trace file, the sender trace file, and the receiver trace file, the ET component creates a frame/packet loss and frame/packet jitter report and generates a reconstructed video file, which

corresponds to the possibly corrupted video found at the receiver side as it would be reproduced to an end user. In principle, the generation of the potentially corrupted video

    can be regarded as a process of copying the original video trace file frame by frame,

omitting frames indicated as lost or corrupted at the receiver side. Nevertheless, the generation of the possibly corrupted video is more complex than this, and the process is further explained in detail in section 3.2. Furthermore, the current version of the ET component implements the cumulative inter-frame jitter algorithm [8] for the play-out buffer.

    If a frame arrives later than its defined playback time, the frame is counted as a lost

frame. This is an optional function. The size of the play-out buffer must also be set; otherwise it is assumed to be of infinite size.

FV (Fix Video)  Digital video quality assessment is performed frame by frame. Therefore, the total number of video frames at the receiver side, including the erroneous frames, must be the same as that of the original video at the sender side. If the codec cannot handle missing frames, the FV component is used to tackle this problem by inserting the last successfully decoded frame in place of each lost frame as an error concealment technique [28].

PSNR (Peak Signal to Noise Ratio)  PSNR is one of the most widespread objective metrics to assess the application-level QoS of video transmissions. The following equation defines the PSNR between the luminance component Y of a source image S and a destination image D:

    PSNR(n)_dB = 20 log10 [ V_peak / sqrt( (1 / (N_col N_row)) sum_{i=0}^{N_col} sum_{j=0}^{N_row} [Y_S(n, i, j) - Y_D(n, i, j)]^2 ) ],

where V_peak = 2^k - 1 and k is the number of bits per pixel of the luminance component. PSNR measures the error between a reconstructed image and the original one. Prior to transmission, it is possible to compute a reference PSNR value sequence on the reconstruction of

    the encoded video as compared to the original raw video. After transmission, the PSNR

    is computed at the receiver for the reconstructed video of the possibly corrupted video

    sequence received. The individual PSNR values at the source or receiver do not mean

    much, but the difference between the quality of the encoded video at the source and the

    received one can be used as an objective QoS metric to assess the transmission impact on

    video quality at the application level.
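The per-frame PSNR equation above can be sketched in pure Python (an illustrative implementation of ours; real tools operate on full YUV luminance planes):

```python
import math

def psnr(src, dst, k=8):
    """PSNR (dB) between the luminance planes of a source and a
    destination frame, given as equal-sized 2-D lists of pixel values;
    k is the number of bits per pixel of the luminance component."""
    v_peak = 2 ** k - 1
    rows, cols = len(src), len(src[0])
    mse = sum((src[i][j] - dst[i][j]) ** 2
              for i in range(rows) for j in range(cols)) / (rows * cols)
    if mse == 0:
        return float("inf")  # identical frames
    return 20 * math.log10(v_peak / math.sqrt(mse))

# Identical frames give infinite PSNR; a slightly corrupted frame
# gives a high but finite value.
frame = [[100, 120], [130, 140]]
assert psnr(frame, frame) == float("inf")
assert psnr(frame, [[100, 120], [130, 141]]) > 40
```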

Table 3. Possible PSNR to MOS conversion [29].

PSNR [dB]   MOS
> 37        5 (Excellent)
31-37       4 (Good)
25-31       3 (Fair)
20-25       2 (Poor)
< 20        1 (Bad)


    MOS (Mean Opinion Score) MOS is a subjective metric to measure digital video

    quality at the application level. This metric of the human quality impression is usually

    given on a scale that ranges from 1 (worst) to 5 (best). In this framework, the PSNR of

    every single frame can be approximated to the MOS scale using the mapping shown in

    Table 3.
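The mapping of Table 3 can be applied per frame as in the following sketch (our own; we assume each band boundary belongs to the lower band, which Table 3 leaves unspecified):

```python
def psnr_to_mos(psnr_db: float) -> int:
    """Approximate MOS (1-5) for a frame's PSNR value, per Table 3."""
    if psnr_db > 37:
        return 5  # Excellent
    if psnr_db > 31:
        return 4  # Good
    if psnr_db > 25:
        return 3  # Fair
    if psnr_db > 20:
        return 2  # Poor
    return 1      # Bad

assert psnr_to_mos(38.2) == 5
assert psnr_to_mos(26.86) == 3  # average PSNR of the best-effort run in section 4
assert psnr_to_mos(18.0) == 1
```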

    3. ENHANCEMENT OF EVALVID

    This section introduces the proposed enhancement of EvalVid by constructing three

connecting interfaces (agents) between EvalVid and NS2. Additionally, this section discusses the problem associated with the conventional fix YUV component (FV) and develops an improved fix YUV component to overcome this problem.

    3.1 New Network Simulation Agents

    Fig. 2 illustrates the QoS assessment framework for video traffic enabled by the

    new tool-set that combines EvalVid and NS2. As shown in Fig. 2, three connecting

simulation agents, namely MyTrafficTrace, MyUDP, and MyUDPSink, are implemented between NS2 and EvalVid. These interfaces are designed either to read the video

    trace file or to generate the data required to evaluate the quality of delivered video.

    Fig. 2. Interfaces between EvalVid and NS2.

    Consequently, the whole evaluation process starts from encoding the raw YUV

    video, and then the VS program will read the compressed file and generate the traffic

trace file. The MyTrafficTrace agent extracts the frame type and the frame size from the video trace file generated by the VS program, fragments the video frames into

    smaller segments, and sends these segments to the lower UDP layer at the appropriate

    time according to the user settings specified in the simulation script file. MyUDP is an

    extension of the UDP agent. This new agent allows users to specify the output file name

    of the sender trace file and it records the timestamp of each transmitted packet, the

    packet ID, and the packet payload size. The task of the MyUDP agent corresponds to the

task that tools such as tcp-dump or win-dump perform in a real network environment.

    MyUDPSink is the receiving agent for the fragmented video frame packets sent by

    MyUDP. This agent also records the timestamp, packet ID, and payload size of each

    received packet in the user specified receiver trace file. After simulation, based on these

    three trace files and the original encoded video, the ET program produces the corrupted

    video file. Afterward, the corrupted video is decoded and error concealed. Finally, the

    reconstructed fixed YUV video can be compared with the original raw YUV video to

    evaluate the end-to-end delivered video quality.

    3.2 Problem of the Original FV Program

    As described in section 2, when the video transmission is over, the receiver trace

    file has to be sent back to the sender side for the video quality evaluation. Based on the

    video trace file, the sender trace file, and the receiver trace file, the lost frames can be

    identified. If a frame is lost due to packet loss, the ET component sets the vop_coded bit

    of this video object plane (VOP) header in the original compressed video file to 0. The

    setting of this bit to 0 indicates that no subsequent data exists for this VOP. This type of

    frame is referred to as a vop-not-coded frame. When a frame is received completely and

    the vop_coded bit is set to 1, this type of frame is referred to as a decodable frame. After

    setting the vop_coded bit to 0 for all the lost frames, the processed file is then used to

    represent the compressed video file delivered to the receiver side.

    Currently, no standard exists to define an appropriate treatment of vop-not-coded

    frames. Some decoders with an error concealment mechanism simply replace the vop-not-coded frames by the last successfully decoded frame [28]. In these cases, the FV

    component is not required. Other decoders, however, without error concealment, such as

    ffmpeg, decode all frames other than the vop-not-coded frames. In these cases, the FV

    component can handle these vop-not-coded frames without difficulty by simply replacing

    them with the last successfully decoded frames. Other decoders, such as Xvid or the

NCTU codec, additionally fail to decode the subsequent frames in some cases. For example, when decoding a subsequent frame that is a decodable frame, this frame may fail

    to be decoded if the frame it depends on is a vop-not-coded frame because there is not

    enough information to decode it. This type of frame is referred to as a non-decodable

    frame. In this case, the original FV component fails since it does not take this possibility

    into consideration.

Based on these limitations, a requirement exists to design a new algorithm capable of solving the problem of non-decodable frames. In this study, we develop an algorithm that uses the decoder output to fix the decoding results, i.e. the reconstructed erroneous video sequence. If a frame is decodable, the improved FV component copies the decoded YUV frame data from the reconstructed erroneous raw video file into a temporary file and keeps it in a buffer as the last successfully decoded frame data. If a frame is vop-not-coded, the improved FV component reads the frame data from the reconstructed erroneous raw video file, but it does not copy the data into the temporary file. This is because

the data read is useless and the file pointer needs to be moved to the next frame. The improved FV component copies the data from the buffer into the temporary file instead. If a frame is missing or considered non-decodable, the improved FV component simply copies the last successfully decoded YUV frame data in the buffer into the temporary file. After processing all the frames in the reconstructed and possibly corrupted video sequence, the resulting temporary file is the reconstructed fixed video sequence. Afterwards, the frame-by-frame PSNR can be evaluated in the usual manner.
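The improved FV procedure described above can be summarized by the following sketch (our own simplified rendition, with stand-in frame labels in place of raw YUV data):

```python
def fix_video(frames, first_frame):
    """frames: list of (status, data) pairs with status in
    {'decodable', 'vop-not-coded', 'missing'}; data is None for
    missing frames. Returns the reconstructed fixed sequence, which
    has exactly one entry per original frame."""
    fixed = []
    last_good = first_frame  # buffer holding the last decoded frame
    for status, data in frames:
        if status == "decodable":
            fixed.append(data)   # copy the decoded frame to the output
            last_good = data     # and remember it in the buffer
        elif status == "vop-not-coded":
            # The decoder emitted data for this frame, but it is useless:
            # it is read (advancing the file pointer) and discarded, and
            # the buffered frame is copied in its place.
            fixed.append(last_good)
        else:
            # Missing or non-decodable: nothing usable to read,
            # so repeat the buffered frame.
            fixed.append(last_good)
    return fixed

seq = [("decodable", "F1"), ("vop-not-coded", "junk"),
       ("missing", None), ("decodable", "F4")]
assert fix_video(seq, "F0") == ["F1", "F1", "F1", "F4"]
```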

    4. SIMULATION RESULTS

This section demonstrates the usefulness of the new tool-set by considering two experimental cases, simulated in a best-effort network and in a DiffServ (Differentiated Services) network [19, 30, 31], when transmitting real video streams instead of synthetically generated video flow sequences. Fig. 3 presents the simple simulation topology, in which Host A delivers a video traffic stream to Host B through routers R1 and R2. The delivered video is a foreman QCIF format sequence composed of 400 frames. It has a mean bit rate of 200 Kbps and a peak bit rate of 400 Kbps. The bottleneck link has a capacity of 180 Kbps and is situated between router R1 and router R2. The queue limit at each router is set to 10 packets. The simulation scripts are publicly available at [22].

[Figure omitted: Host A attaches to router R1 and Host B to router R2, with the bottleneck link between R1 and R2.]

Fig. 3. Simulation topology.

    4.1 Conventional Best-Effort Network

In the first experiment, the video is delivered over a best-effort network, and routers R1 and R2 implement conventional First In First Out (FIFO) queue management. When the queue size reaches the queue limit, the FIFO queue management discards all incoming packets until the queue size decreases. Fig. 4 shows the results. The figure clearly shows that the curve of psnr_myfix_be, the video fixed by the improved FV component, outperforms that of psnr_fix_be, the video fixed by the original component, in the interval from frame 200 to frame 250 and above frame 370. This is because the original FV component cannot distinguish between vop-not-coded frames and missing frames. As a consequence, it may copy the wrong frame data from the reconstructed erroneous raw video file into the temporary file. In terms of average PSNR, the psnr_myfix_be curve measures 26.86 dB and the psnr_fix_be curve measures 23.43 dB. The simulation results demonstrate that the improved FV component is more effective than the conventional one in reconstructing the corrupted video sequence.


Fig. 4. Original FV vs. improved FV for best-effort delivered video.

Fig. 5. QoS delivery vs. best-effort delivery.

    4.2 DiffServ Network

    The second experiment is simulated in a DiffServ network in which I-frame packets

    are pre-marked with the lowest drop probability in the application layer at the source,

    P-frame packets are pre-marked with a medium drop probability, and B-frame packets

are pre-marked with the highest drop probability. Routers R1 and R2 implement Weighted Random Early Detection (WRED) queue management. When the queue builds up and exceeds a given threshold, WRED starts to drop packets following the specified drop probability parameters. Fig. 5 shows the results. The

    PSNR difference values between psnr_noloss, which means no packet loss during

    transmission, and psnr_myfix_qos, which is the video transmitted by QoS delivery, are

    less than those between psnr_noloss and psnr_myfix_be, which is the video transmitted

by best-effort delivery, especially in the interval from frame 260 to frame 360. In terms of average PSNR, the delivered video quality in the DiffServ network measures 28.64 dB. As expected, it outperforms the result obtained in the best-effort network, i.e. an average PSNR of 26.86 dB. Consequently, a DiffServ network provides a more suitable environment for video transmission. In addition, to illustrate how the difference in performance is perceived by an end user, the corresponding visual effects are shown in Fig. 6 by means of a YUV display tool, i.e. yuvviewer [32]. This kind of visual result

    Fig. 6 by means of the YUV display tool, i.e. yuvviewer [32]. This kind of visual result

    for a real video stream being transmitted over a simulated network is enabled by our new

    tool-set. The possibility of transmitting real video streams over a simulated network also

    enables the use of the PSNR quality measurement metric that takes into account the

    video content.

5. RELATIONSHIP BETWEEN PSNR AND THE FRACTION OF DECODABLE FRAMES

In this section, we investigate the relationship between two popular objective metrics: PSNR and the fraction of decodable frames. PSNR is a commonly accepted objective performance metric that takes the video content into account to assess the video quality. However, the pixel-by-pixel and frame-by-frame comparison needed to get the PSNR value


Fig. 6. Visual comparison of the reconstructed 180th to 184th frames: (a) QoS delivery; (b) best-effort delivery.

Table 4. QoS mappings.

QoS Index   Green       Yellow      Red
0           I           P           B
1           I           P + B       -
2           I           -           P + B
3           I + P       B           -
4           I + P       -           B
5           I + P + B   -           -
6           -           I           P + B
7           -           I + P       B
8           -           I + P + B   -
9           -           -           I + P + B

is a slow and laborious job. If the metric of the fraction of decodable frames can adequately correspond to the behavior of the PSNR metric while being less time-consuming, it can be an alternative to objectively evaluate the delivery quality of transmitted video streams.

The fraction of decodable frames reports the number of decodable frames over the total number of transmitted frames. A frame is considered decodable if at least a certain fraction of the data in the frame, called the decodable threshold, is received. However, a frame is considered decodable if and only if all of the frames upon which it depends are also decodable. Therefore, for instance, when the decodable threshold is 0.75, 25% of the data of a frame can be lost without causing that frame to be considered undecodable.
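The metric can be computed directly from per-frame reception records, as in the following sketch (our own; frame dependencies are given explicitly rather than derived from the GOP structure):

```python
def decodable_fraction(frames, threshold=0.75):
    """frames: list of dicts with 'received' and 'total' byte counts and
    'deps', the indices of earlier frames this frame depends on.
    Returns the fraction of decodable frames."""
    decodable = []
    for f in frames:
        # A frame needs at least threshold * total bytes received...
        enough_data = f["received"] >= threshold * f["total"]
        # ...and every frame it depends on must itself be decodable.
        deps_ok = all(decodable[d] for d in f["deps"])
        decodable.append(enough_data and deps_ok)
    return sum(decodable) / len(frames)

frames = [
    {"received": 3036, "total": 3036, "deps": []},      # I frame, intact
    {"received": 500,  "total": 659,  "deps": [0]},     # P frame, ~76% received
    {"received": 357,  "total": 357,  "deps": [0, 1]},  # B frame, intact
]
assert decodable_fraction(frames, threshold=0.75) == 1.0
assert decodable_fraction(frames, threshold=1.0) == 1 / 3
```

Note how the partially received P frame drags the dependent B frame down with it once the threshold rises to 1.0, which mirrors the dependency rule stated above.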

The simulation settings follow [10]. The goal of that paper was to study the delivered video quality for different QoS source mappings. The adopted QoS mapping table is shown in Table 4. For example, QoS 0 means that I-frame packets are pre-marked as green, P-frame packets are pre-marked as yellow, and B-frame packets are pre-marked as red, where the color markings red, yellow, and green represent increasing packet loss protection within the DiffServ network.

    This paper investigates the relationship between the objective metrics PSNR and

    fraction of decodable frames. The adopted network topology for this purpose is shown in

    Fig. 7. Three video sources were connected to a DiffServ network. The three video


[Figure omitted: sources S1, S2, and S3 attach to router R1, destinations D1, D2, and D3 attach to router R3, and routers R1, R2, and R3 form a chain. The access links are 10 Mbps with 1 ms delay; the bottleneck bandwidth is specified per scenario.]

Fig. 7. Network topology for different QoS source mappings.

Fig. 8. PSNR for the foreman video sequence.

Fig. 9. The fraction of decodable frames for the foreman video sequence for decodable thresholds 1.0 and 0.75.

sources transmitted the same video sequence to their respective destinations with a random start time within an interval of 3 seconds. The tested video sequences covered three different kinds of video content, i.e. foreman, akiyo, and highway [33]. These real

    video traces have different properties in terms of motion, frame size, and quality. Each

    frame is fragmented into packets of 1,000 bytes before transmission. The three routers in

    the simulation scenario implement the WRED mechanism for active queue management.

The WRED parameters include a minimum threshold, a maximum threshold, and a maximum drop probability, i.e. min_th, max_th, and P_max. The WRED parameters and the bottleneck bandwidth are set differently and are specified in the following three simulation scenarios.
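The per-color WRED drop behavior referred to above can be sketched as the standard linear RED ramp (a generic illustration of ours, not the NS2 implementation):

```python
def wred_drop_prob(avg_queue: float, min_th: float, max_th: float,
                   p_max: float) -> float:
    """Linear RED drop probability for one color class: 0 below min_th,
    ramping linearly to p_max at max_th, and 1 above max_th."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return p_max * (avg_queue - min_th) / (max_th - min_th)

# With the red-packet parameters {10, 20, 0.1} and the green-packet
# parameters {30, 40, 0.025} of the first scenario below, red packets
# start being dropped at queue lengths where green ones are still safe.
assert wred_drop_prob(15, 10, 20, 0.1) == 0.05
assert wred_drop_prob(15, 30, 40, 0.025) == 0.0
```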

In the first set of simulations, the tested video sequence is foreman. The parameters for the WRED queue mechanism are specified respectively as {10, 20, 0.1} for red

    packets, {20, 30, 0.05} for yellow packets, and {30, 40, 0.025} for green packets. The

    bottleneck bandwidth is set to 512 Kbps. The simulation results are shown in Figs. 8


Fig. 10. PSNR for the akiyo video sequence.

Fig. 11. The fraction of decodable frames for the akiyo video sequence for decodable thresholds 1.0 and 0.75.

and 9. The error bars show the 95% confidence interval. The behavior of the PSNR metric for different QoS indexes matches exactly that of the fraction of decodable frames, no matter whether the decodable threshold is 1.0 or 0.75. When the QoS indexes have higher PSNR values, the values of the fraction of decodable frames are also higher. Likewise, when the QoS indexes have lower PSNR values, the values of the fraction of decodable frames are also lower.

    In the second set of simulations, the tested video sequence is the CIF format akiyo

    video sequence, which has 300 frames coded at 30 frames/sec. It has a mean bit rate of

237 Kbps and a peak rate of 595 Kbps. The parameters for the WRED queue mechanism are specified respectively as {20, 40, 0.1} for red packets, {40, 60, 0.05} for yellow packets, and {60, 80, 0.025} for green packets. The bottleneck bandwidth is set to 640 Kbps.

    The simulation results are shown in Figs. 10 and 11. The error bars show the 95% confi-

    dence interval. Similarly to the foreman sequence, in the akiyo sequence the behav-

    ior of the PSNR metric for different QoS indexes matches exactly that of the fraction ofdecodable frames when = 0.75. However, the curve is somewhat inconsistent with that

    of PSNR values for QoS index 5 and QoS index 8 when = 1.0. During PSNR simula-

    tions, the improved FV conceals some packet losses, but the system is completely intol-

    erant to losses in the case of= 1.0. Therefore, using a smaller is better than using a

    largerin matching PSNR.
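The role of the decodable threshold can be sketched as follows. This is an illustrative simplification, not the tool-set's exact code: it applies only the per-frame threshold test (a frame counts as decodable if the fraction of its packets that arrived meets the threshold) and omits the inter-frame dependencies, under which a P or B frame additionally requires its reference frames to be decodable.

```python
# Illustrative sketch of the threshold part of the decodable-frames metric.
def fraction_decodable(received, sent, threshold):
    """received[i], sent[i]: packet counts for frame i."""
    decodable = sum(
        1 for r, s in zip(received, sent) if s > 0 and r / s >= threshold
    )
    return decodable / len(sent)

# Four frames; one frame lost 1 of its 4 packets:
# threshold 1.0  -> 3 of 4 frames count as decodable (0.75)
# threshold 0.75 -> all 4 frames count as decodable (1.0)
```

A threshold of 1.0 declares a frame undecodable on any packet loss, which is why it disagrees with PSNR whenever the decoder's error concealment recovers mildly damaged frames.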

In the third set of simulations, the tested video sequence is the CIF-format highway sequence, which has 2000 frames coded at 30 frames/sec, with a mean bit rate of 412 Kbps and a peak rate of 1116 Kbps. The parameters for the WRED queue mechanism are specified as {20, 40, 0.1} for red packets, {40, 60, 0.05} for yellow packets, and {60, 80, 0.025} for green packets. The bottleneck bandwidth is set to 1.024 Mbps. The simulation results are shown in Figs. 12 and 13; the error bars show the 95% confidence interval. Again, the behavior of the PSNR metric for the different QoS indexes matches exactly that of the fraction of decodable frames when the decodable threshold is 0.75.

It is also interesting to take a closer look at this simulation.

Fig. 12. PSNR for the highway video sequence. Fig. 13. The fraction of decodable frames for the highway video sequence with decodable thresholds 1.0 and 0.75.

When computing the PSNR metric, it takes around 3 to 4 minutes to finish the tasks of simulating, evaluating traces, decoding, fixing, and doing the frame-by-frame PSNR comparison on a Pentium III 1 GHz computer equipped with 512 MB RAM. In contrast, it takes less than 10 seconds to obtain the value of the fraction of decodable frames. Similar results hold for the other two video sequences. Note that highway has only 2000 frames, i.e. around 1.11 minutes of video at 30 frames/second; a test sequence with more frames needs correspondingly more time to finish all the tasks.
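For reference, the frame-by-frame comparison uses the standard PSNR definition over the pixel samples; a minimal pure-Python sketch for 8-bit samples follows (the tool-set itself operates on raw YUV files decoded from the received stream, not Python lists).

```python
# Illustrative sketch of per-frame PSNR for 8-bit samples (peak value 255).
import math

def psnr(original, received, peak=255.0):
    """PSNR in dB between two equal-length sequences of pixel samples."""
    mse = sum((o - r) ** 2 for o, r in zip(original, received)) / len(original)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * math.log10(peak * peak / mse)
```

The cost reported above comes not from this formula itself but from the pipeline around it: the received bitstream must be reassembled, decoded, and fixed before any pixel-wise comparison is possible, which is what makes the fraction of decodable frames so much cheaper to obtain.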

    6. CONCLUSION AND FUTURE WORK

The contribution of this paper is twofold. First, we have presented the integration of EvalVid and NS2 to provide a novel, generalized, and comprehensive tool-set for evaluating the video quality performance of network designs in a simulated environment. The developed integration provides three new connecting simulation agents, namely MyTrafficTrace, MyUDP, and MyUDPSink. These agents enable EvalVid to link seamlessly with NS2, so that researchers and practitioners have greater freedom to analyze their proposed network designs for video transmission without being obliged to devise an appropriate tool-set for video quality evaluation. Simulations of real video streams are enabled over a large set of network scenarios, including relatively large topologies, node mobility, different kinds of concurrent traffic, or any other functionality available in the network simulator. Second, in an analysis enabled by the new tool-set, we have shown that the fraction of decodable frames reflects the behavior of the PSNR video quality assessment metric with reasonable accuracy while being less time-consuming by at least one order of magnitude. Therefore, when researchers or practitioners want to encode their own test video sequences, or adopt well-known ones, in order to evaluate the delivered video quality in a simulated network environment, our proposed QoS assessment framework is a good choice.

Although this new evaluation framework benefits networking and video-coding practitioners in most cases, it still has some limitations. First, the current version only supports non-scalable video encoding. Second, due to the video encoding modes and the agents we developed, the current framework is not suitable for video transmission over bi-directional channels; the video encoding parameters cannot be changed during simulation time, so researchers interested in rate-adaptive designs can refer to [34] for more information. In the future, we will incorporate more codecs into the framework and support scalable video coding and multiple description coding (MDC). A prototype of a multiple description coding evaluation framework is publicly available at [35]; researchers interested in multiple-path transport and load-balancing designs can try this prototype framework for a preliminary evaluation.

    REFERENCES

1. S. F. Chang and A. Vetro, "Video adaptation: concepts, technologies, and open issues," in Proceedings of the IEEE, Vol. 93, 2005, pp. 148-158.

2. F. H. P. Fitzek and M. Reisslein, "MPEG-4 and H.263 video traces for network performance evaluation," IEEE Network, Vol. 15, 2001, pp. 40-54.

3. P. Seeling, M. Reisslein, and B. Kulapala, "Network performance evaluation using frame size and quality traces of single-layer and two-layer video: a tutorial," IEEE Communications Surveys and Tutorials, Vol. 6, 2004, pp. 58-78.

4. Traffic trace from Mark Garrett's MPEG encoding of the Star Wars movie, http://www.research.att.com/~breslau/vint/trace.html.

5. Video traffic generator based on the TES (Transform Expand Sample) model of MPEG-4 trace files, contributed by Ashraf Matrawy and Ioannis Lambadaris. It generates traffic that has the same first- and second-order statistics as an original MPEG-4 trace, http://www.sce.carleton.ca/~amatrawy/mpeg4.

6. O. Rose, "Statistical properties of MPEG video traffic and their impact on traffic modeling in ATM systems," Report No. 101, Institute of Computer Science, University of Wurzburg, Germany, 1995.

7. D. Saparilla, K. Ross, and M. Reisslein, "Periodic broadcasting with VBR-encoded video," in Proceedings of IEEE INFOCOM, 1999, pp. 464-471.

8. L. Tionardi and F. Hartanto, "The use of cumulative inter-frame jitter for adapting video transmission rate," in Proceedings of the Conference on Convergent Technologies for Asia-Pacific Region, Vol. 1, 2003, pp. 364-368.

9. A. Ziviani, B. E. Wolfinger, J. F. Rezende, O. C. M. B. Duarte, and S. Fdida, "Joint adoption of QoS schemes for MPEG streams," Multimedia Tools and Applications, Vol. 26, 2005, pp. 59-80.

10. J. M. H. Magalhaes and P. R. Guardieiro, "A new QoS mapping for streamed MPEG video over a DiffServ domain," in Proceedings of the IEEE International Conference on Communications, Circuits and Systems and West Sino Expositions, 2002, pp. 675-679.

11. M. F. Alam, M. Atiquzzaman, and M. A. Karim, "Traffic shaping for MPEG video transmission over the next generation internet," Computer Communications, Vol. 23, 2000, pp. 1336-1348.

12. N. E. Nasser and M. Al-Abdulmunem, "MPEG traffic over diffserv assured service," in Proceedings of the Asia-Pacific Conference on Communication, 2003, pp. 494-498.

13. J. Takahashi, H. Tode, and K. Murakami, "QoS enhancement methods for MPEG video transmission on the Internet," IEICE Transactions on Communications, Vol. E85-B, 2002, pp. 1020-1030.

14. F. A. Shaikh, S. McClellan, M. Singh, and S. K. Chakravarthy, "End-to-end testing of IP QoS mechanisms," IEEE Computer Magazine, Vol. 35, 2002, pp. 80-87.

15. NS related mailing lists, http://www.isi.edu/nsnam/htdig/search.html.

16. J. Klaue, B. Rathke, and A. Wolisz, "EvalVid - a framework for video transmission and quality evaluation," in Proceedings of the International Conference on Modelling Techniques and Tools for Computer Performance Evaluation, 2003, pp. 255-272.

17. NS, http://www.isi.edu/nsnam/ns/.

18. S. Olsson, M. Stroppiana, and J. Baina, "Objective methods for assessment of video quality: state of the art," IEEE Transactions on Broadcasting, Vol. 43, 1997, pp. 487-495.

19. C. H. Ke, C. K. Shieh, W. S. Hwang, and A. Ziviani, "A two-markers system for improved MPEG video delivery in a DiffServ network," IEEE Communications Letters, Vol. 9, 2005, pp. 381-383.

20. J. Naoum-Sawaya, B. Ghaddar, S. Khawam, H. Safa, H. Artail, and Z. Dawy, "Adaptive approach for QoS support in IEEE 802.11e wireless LAN," in Proceedings of the IEEE International Conference on Wireless and Mobile Computing, Networking and Communications, 2005, pp. 167-173.

21. H. Huang, J. Ou, and D. Zhang, "Efficient multimedia transmission in mobile network by using PR-SCTP," in Proceedings of the IASTED International Conference on Communications and Computer Networks, 2005, pp. 213-217.

22. http://hpds.ee.ncku.edu.tw/~smallko/ns2/Evalvid_in_NS2.htm.

23. NCTU codec, http://megaera.ee.nctu.edu.tw/mpeg.

24. ffmpeg, http://ffmpeg.sourceforge.net/index.php.

25. Xvid, http://www.xvid.org/.

26. tcp-dump, http://www.tcpdump.org.

27. win-dump, http://windump.polito.it.

28. Y. Wang and Q. F. Zhu, "Error control and concealment for video communication: a review," in Proceedings of the IEEE, Vol. 86, 1998, pp. 974-997.

29. J. R. Ohm, Bildsignalverarbeitung fuer Multimedia-Systeme, lecture notes, 1999.

30. B. Carpenter and K. Nichols, "Differentiated services in the internet," in Proceedings of the IEEE, Vol. 90, 2002, pp. 1479-1494.

31. J. Shin, J. Kim, and C. C. J. Kuo, "Quality of service mapping mechanism for packet video in differentiated services network," IEEE Transactions on Multimedia, Vol. 3, 2001, pp. 219-231.

32. yuvviewer, http://eeweb.poly.edu/~yao/VideobookSampleData/video/application/YUV-viewer.exe.

33. YUV video sequences (CIF), http://www.tkn.tu-berlin.de/research/evalvid/cif.html.

34. Evalvid-RA, http://www.item.ntnu.no/~arnelie/Evalvid-RA.htm.

35. Multiple description coding evaluation framework, http://hpds.ee.ncku.edu.tw/~smallko/ns2/MDC.htm.


Chih-Heng Ke received his B.S. and Ph.D. degrees in Electrical Engineering from National Cheng Kung University in 1999 and 2007, respectively. He is an assistant professor of Computer Science and Information Engineering at the National Kinmen Institute of Technology, Kinmen, Taiwan. His current research interests include multimedia communications, wireless networks, and QoS networking.

Ce-Kuen Shieh is currently a professor in the Department of Electrical Engineering, National Cheng Kung University. He received his Ph.D., M.S., and B.S. degrees from the Electrical Engineering Department of National Cheng Kung University, Tainan, Taiwan. His current research areas include distributed and parallel processing systems, computer networking, and operating systems.

Wen-Shyang Hwang received his B.S., M.S., and Ph.D. degrees in Electrical Engineering from National Cheng Kung University, Taiwan, in 1984, 1990, and 1996, respectively. He is a professor of Electrical Engineering at the National Kaohsiung University of Applied Sciences, Taiwan. His current research focus includes multi-channel WDM networks, performance evaluation, QoS, RSVP, and WWW database applications.

Artur Ziviani received a B.Sc. in Electronics Engineering in 1998 and an M.Sc. in Electrical Engineering in 1999, both from the Federal University of Rio de Janeiro (UFRJ), Brazil. In 2003, he received a Ph.D. in Computer Science from the University of Paris 6, France, where he was also a lecturer from 2003 to 2004. Since 2004, he has been with the National Laboratory for Scientific Computing (LNCC), Brazil. His research interests include QoS, wireless computing, Internet measurements, and the application of networking technologies in telemedicine.