An Evaluation Framework for More Realistic Simulations of MPEG Video Transmission


    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 24, 425-440 (2008)


CHIH-HENG KE1, CE-KUEN SHIEH2, WEN-SHYANG HWANG3 AND ARTUR ZIVIANI4

1Department of Computer Science and Information Engineering
National Kinmen Institute of Technology
Kinmen, 892 Taiwan

2Department of Electrical Engineering
National Cheng Kung University
Tainan, 701 Taiwan

3Department of Electrical Engineering
National Kaohsiung University of Applied Sciences
Kaohsiung, 807 Taiwan

4National Laboratory for Scientific Computing (LNCC)
Petrópolis, Rio de Janeiro, 25651-075 Brazil

We present a novel and complete tool-set for evaluating the delivery quality of MPEG video transmissions in simulations of a network environment. This tool-set is based on the EvalVid framework. We extend the connecting interfaces of EvalVid to replace its simple error simulation model with a more general network simulator, such as NS2. With this combination, researchers and practitioners can analyze through simulation the performance of real video streams, i.e. taking the video semantics into account, under a wide range of network scenarios. To demonstrate the usefulness of the new tool-set, we show that it enables the investigation of the relationship between two popular objective metrics for Quality of Service (QoS) assessment of video quality delivery: the PSNR (Peak Signal to Noise Ratio) and the fraction of decodable frames. The results show that the fraction of decodable frames reflects the behavior of the PSNR metric well, while being less time-consuming to compute. The fraction of decodable frames can therefore serve as an alternative metric to objectively assess, through simulations driven by publicly available video trace files, the delivery quality of video transmissions in a network.

Keywords: network simulation, MPEG video, EvalVid, NS2, PSNR, fraction of decodable frames

    1. INTRODUCTION

The ever-increasing demand for multimedia distribution over the Internet motivates research on how to provide better delivered video quality through IP-based networks [1]. Previous studies [2-7] often use publicly available real video traces to evaluate their proposed network mechanisms in a simulation environment [8-12]. Results are usually presented using different performance metrics, such as the packet/frame loss rate, packet/frame jitter [13], effective frame loss rate [8], picture quality rating (PQR) [13], and the fraction of decodable frames [9]. Nevertheless, packet loss and jitter rates are network performance metrics and may be insufficient to adequately rate the quality perceived by a (human) end user. Although the effective frame loss rate, PQR, and the fraction of decodable frames are application-level Quality of Service (QoS) metrics, they are not as well known and accepted as MOS (Mean Opinion Score) and PSNR (Peak Signal to Noise Ratio) [14].

Received January 9, 2006; revised June 19, 2006; accepted August 2, 2006.
Communicated by Chung-Sheng Li.

Furthermore, it is hard to study the effects of proposed network mechanisms on

    different characteristics of the same video extensively because the encoding settings for

the publicly available video traffic traces are limited. As a consequence, how best to simulate and evaluate the performance of video quality delivery in a simulated network environment is a recurrent open issue in network simulation forums, such as [15].

    EvalVid [16], a complete framework and tool-set for evaluation of the quality of

    video transmitted over a real or simulated communication network, provides packet/

frame loss rate, packet/frame jitter, PSNR, and MOS metrics for video quality assessment purposes. The primary aim of EvalVid is to assist researchers or practitioners in

    evaluating their network designs or setups in terms of the perceived video quality by the

    end user. Nevertheless, the simulated environment provided by EvalVid is simply an

    error model to represent corrupted or missing packets in the real network. The lack of

    generalization of this simple error model causes problems for researchers or practitioners

who seek to assess the video quality delivered to end users in more complex and realistic network scenarios. For example, when transmitting video packets via unicast over an IEEE 802.11 wireless network, the MAC layer at a sender retransmits an unacknowledged packet at most N times before giving up. The perceived correct rate at the application level is thus

    P_CORRECT = sum_{i=1}^{N} (1 - p) p^(i-1) = 1 - p^N,

where N is the maximum number of transmission attempts at the MAC layer and p is the packet error rate at the physical level. As a consequence, the application-level error rate is p_effective = p^N. In this kind of scenario, the results obtained from the original EvalVid framework are misleading, since the simple error model does not take the retransmission mechanism into consideration.

This paper integrates EvalVid with NS2 [17], a widely adopted network simulator.
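As an illustration of the retransmission arithmetic above, the following sketch (our own; the sample values of p and N are arbitrary examples, not from the paper) contrasts the physical-level error rate with the error rate seen by the application:

```python
def effective_error_rate(p: float, n: int) -> float:
    """Application-level packet error rate when the MAC layer makes up to
    n transmission attempts, each failing independently with probability p:
    p_effective = p**n."""
    return p ** n

def correct_rate(p: float, n: int) -> float:
    """P_CORRECT = sum_{i=1}^{n} (1 - p) * p**(i - 1) = 1 - p**n."""
    return sum((1 - p) * p ** (i - 1) for i in range(1, n + 1))

# With a 10% physical-level error rate and 4 MAC-layer attempts, the
# application sees a 0.01% loss rate, not 10% -- which is why a flat
# error model without retransmission is misleading here.
p, n = 0.1, 4
assert abs(correct_rate(p, n) - (1 - effective_error_rate(p, n))) < 1e-12
```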

    On the one hand, the resulting tool-set from this integration allows network researchers

    and practitioners to analyze their proposed new network designs in the presence of real

    video traffic in a straightforward way. On the other hand, mechanisms for enhancing the

    delivery quality of video streams can be evaluated in more complex simulated network

    scenarios, including characteristics like relatively large topologies, broadband access,

    limited bandwidth, wireless, node mobility, and whatever functionality is available at the

    network simulator. Furthermore, we use our new evaluation framework provided by this

    tool-set to investigate the relationship between two objective QoS assessment metrics:

    PSNR [18] and the fraction of decodable frames [9]. PSNR takes into account the video

    content and hence it is more time-consuming than the fraction of decodable frames,

    which is straightforward to compute. The new tool-set enables the analysis showing that

    the fraction of decodable frames can reflect the behavior of the PSNR metric adequately,

    while being less time-consuming.

To the best of our knowledge, no other tool-set is publicly available to perform a comprehensive video quality evaluation of real video streams in a network simulation environment. We argue that the proposed tool-set enables more realistic simulations of video


transmission in a dual sense. This tool-set enables video-coding or video-QoS technicians to simulate the effects of a more realistic network on the video sequences resulting from their coding or QoS schemes, respectively. Likewise, the proposed tool-set also enables networking practitioners to evaluate the effects of real video streams on proposed network protocols, for instance. Indeed, we believe that our tool-set provides convergence toward more realistic simulations of video transmissions in the broad sense, thus enabling a wide range of video transmissions in network scenarios to be evaluated. References [19-21] are examples that use this tool-set to evaluate their respective proposed mechanisms. The new tool-set for evaluating the quality performance of network video transmissions is publicly available at [22].

The remainder of this paper is organized as follows. Section 2 provides a brief overview of EvalVid. Section 3 describes the developed connecting agents between EvalVid and NS2, as well as an improved fix YUV program to replace the conventional one. Section 4 analyzes the proposed QoS assessment framework for video streams, using two examples to illustrate the video quality evaluation. Section 5 investigates the relationship between the QoS assessment metrics PSNR and the fraction of decodable frames. Finally, section 6 presents the concluding remarks.

    2. OVERVIEW OF EVALVID

    The structure of the EvalVid framework is shown in Fig. 1, redrawn from [16].

[Figure omitted: it depicts the EvalVid tool chain. The raw YUV video at the sender is encoded, and the VS component transmits the coded video over a real network or a simulation, which introduces loss and delay and yields a video trace, a sender trace, and a receiver trace. From these, the possibly erroneous video is reconstructed, decoded, passed through the play-out buffer, and fixed (FV) into a reconstructed raw YUV video at the receiver. The results comprise frame loss / frame jitter and the user-perceived quality (PSNR, MOS).]

Fig. 1. Schematic illustration of the evaluation framework provided by EvalVid.

    The main components of the evaluation framework are described as follows:

Source  The video source can be either in the YUV QCIF (176 x 144) or in the YUV CIF (352 x 288) format.

Video Encoder and Video Decoder  Currently, EvalVid supports only single-layer video coding. It supports three kinds of MPEG-4 codecs, namely the NCTU codec [23], ffmpeg [24], and Xvid [25]. The focus of this investigation is the NCTU codec for video coding purposes.

VS (Video Sender)  The VS component reads the compressed video file from the output of the video encoder, fragments each large video frame into smaller segments, and

    then transmits these segments via UDP packets over a real or simulated network. For

    each transmitted UDP packet, the framework records the timestamp, the packet ID, and

    the packet payload size in the sender trace file with the aid of third-party tools, such as

    tcp-dump [26] or win-dump [27], if the network is a real link. Nevertheless, if the net-

    work is simulated, the sender trace file is provided by the sending entity of the simulation.

    The VS component also generates a video trace file that contains information about every

    frame in the real video file. The video trace file and the sender trace file are later used for

    subsequent video quality evaluation. Examples of a video trace file and a sender trace file

    are shown in Tables 1 and 2, respectively. It can be seen that the packets with IDs 1 to 4

    originate from the same video frame since their transmission times are equal.

    Table 1. Example of video trace file.

    Frame Number Frame Type Frame Size Number of UDP-packets Sender Time

    0 H 29 1 segment at 33 ms

    1 I 3036 4 segments at 67 ms

    2 P 659 1 segment at 99 ms

    3 B 357 1 segment at 132 ms

    4 B 374 1 segment at 165 ms

    ...

    Table 2. Example of sender trace file.

Time stamp (sec) Packet ID Packet Type Payload Size (bytes)

0.033333 0 udp 29

    0.066666 1 udp 1000

    0.066666 2 udp 1000

    0.066666 3 udp 1000

    0.066666 4 udp 36

    0.099999 5 udp 659

    0.133332 6 udp 357

    0.166665 7 udp 374

    ... ... ... ...
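The fragmentation performed by VS can be sketched as follows; this is our own simplified illustration, assuming the 1,000-byte maximum payload evident in Table 2:

```python
def fragment(frame_size: int, max_payload: int = 1000) -> list[int]:
    """Split one video frame into the UDP packet payload sizes that
    VS would transmit for it."""
    payloads = []
    remaining = frame_size
    while remaining > 0:
        payloads.append(min(remaining, max_payload))
        remaining -= max_payload
    return payloads

# The 3036-byte I frame of Table 1 becomes the four packets
# (1000, 1000, 1000, 36 bytes) listed in Table 2.
assert fragment(3036) == [1000, 1000, 1000, 36]
assert fragment(29) == [29]
```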

    ET (Evaluate Trace) Once the video transmission is over, the evaluation task begins.

The evaluation takes place at the sender side. Therefore, the information about the timestamp, the packet ID, and the packet payload size available at the receiver has to be transported back to the sender. Based on the original encoded video file, the video trace file, the sender trace file, and the receiver trace file, the ET component creates a frame/packet loss and frame/packet jitter report and generates a reconstructed video file, which

corresponds to the possibly corrupted video found at the receiver side as it would be reproduced to an end user. In principle, the generation of the potentially corrupted video

    can be regarded as a process of copying the original video trace file frame by frame,

omitting frames indicated as lost or corrupted at the receiver side. Nevertheless, the generation of the possibly corrupted video is more complex than this, and the process is further explained in detail in section 3.2. Furthermore, the current version of the ET component implements the cumulative inter-frame jitter algorithm [8] for the play-out buffer.

    If a frame arrives later than its defined playback time, the frame is counted as a lost

frame. This is an optional function. The size of the play-out buffer must also be set; otherwise it is assumed to be of infinite size.

FV (Fix Video)  Digital video quality assessment is performed frame by frame. Therefore, the total number of video frames at the receiver side, including the erroneous frames, must be the same as that of the original video at the sender side. If the codec cannot handle missing frames, the FV component is used to tackle this problem by inserting the last successfully decoded frame in place of each lost frame as an error concealment technique [28].

PSNR (Peak Signal to Noise Ratio)  PSNR is one of the most widespread objective metrics to assess the application-level QoS of video transmissions. The following equation defines the PSNR between the luminance component Y of a source image S and a destination image D:

    PSNR(n)_dB = 20 log10 [ V_peak / sqrt( (1 / (N_col N_row)) sum_{i=0}^{N_col} sum_{j=0}^{N_row} [Y_S(n, i, j) - Y_D(n, i, j)]^2 ) ],

where V_peak = 2^k - 1 and k is the number of bits per pixel of the luminance component. PSNR measures the error between a reconstructed image and the original one. Prior to transmission, it is possible to compute a reference PSNR value sequence on the reconstruction of

    the encoded video as compared to the original raw video. After transmission, the PSNR

    is computed at the receiver for the reconstructed video of the possibly corrupted video

    sequence received. The individual PSNR values at the source or receiver do not mean

    much, but the difference between the quality of the encoded video at the source and the

    received one can be used as an objective QoS metric to assess the transmission impact on

    video quality at the application level.
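The per-frame PSNR equation above can be sketched in pure Python (an illustrative implementation of ours; real tools operate on full YUV luminance planes):

```python
import math

def psnr(src, dst, k=8):
    """PSNR (dB) between the luminance planes of a source and a
    destination frame, given as equal-sized 2-D lists of pixel values;
    k is the number of bits per pixel of the luminance component."""
    v_peak = 2 ** k - 1
    rows, cols = len(src), len(src[0])
    mse = sum((src[i][j] - dst[i][j]) ** 2
              for i in range(rows) for j in range(cols)) / (rows * cols)
    if mse == 0:
        return float("inf")  # identical frames
    return 20 * math.log10(v_peak / math.sqrt(mse))

# Identical frames give infinite PSNR; a slightly corrupted frame
# gives a high but finite value.
frame = [[100, 120], [130, 140]]
assert psnr(frame, frame) == float("inf")
assert psnr(frame, [[100, 120], [130, 141]]) > 40
```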

Table 3. Possible PSNR to MOS conversion [29].

PSNR [dB]   MOS
> 37        5 (Excellent)
31-37       4 (Good)
25-31       3 (Fair)
20-25       2 (Poor)
< 20        1 (Bad)


    MOS (Mean Opinion Score) MOS is a subjective metric to measure digital video

    quality at the application level. This metric of the human quality impression is usually

    given on a scale that ranges from 1 (worst) to 5 (best). In this framework, the PSNR of

    every single frame can be approximated to the MOS scale using the mapping shown in

    Table 3.
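The mapping of Table 3 can be applied per frame as in the following sketch (our own; we assume each band boundary belongs to the lower band, which Table 3 leaves unspecified):

```python
def psnr_to_mos(psnr_db: float) -> int:
    """Approximate MOS (1-5) for a frame's PSNR value, per Table 3."""
    if psnr_db > 37:
        return 5  # Excellent
    if psnr_db > 31:
        return 4  # Good
    if psnr_db > 25:
        return 3  # Fair
    if psnr_db > 20:
        return 2  # Poor
    return 1      # Bad

assert psnr_to_mos(38.2) == 5
assert psnr_to_mos(26.86) == 3  # average PSNR of the best-effort run in section 4
assert psnr_to_mos(18.0) == 1
```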

    3. ENHANCEMENT OF EVALVID

    This section introduces the proposed enhancement of EvalVid by constructing three

connecting interfaces (agents) between EvalVid and NS2. Additionally, this section discusses the problem associated with the conventional fix YUV component (FV) and develops an improved fix YUV component to overcome this problem.

    3.1 New Network Simulation Agents

    Fig. 2 illustrates the QoS assessment framework for video traffic enabled by the

    new tool-set that combines EvalVid and NS2. As shown in Fig. 2, three connecting

simulation agents, namely MyTrafficTrace, MyUDP, and MyUDPSink, are implemented between NS2 and EvalVid. These interfaces are designed either to read the video

    trace file or to generate the data required to evaluate the quality of delivered video.

    Fig. 2. Interfaces between EvalVid and NS2.

    Consequently, the whole evaluation process starts from encoding the raw YUV

    video, and then the VS program will read the compressed file and generate the traffic

trace file. The MyTrafficTrace agent extracts the frame type and the frame size from the video trace file generated by the VS program, fragments the video frames into

    smaller segments, and sends these segments to the lower UDP layer at the appropriate

    time according to the user settings specified in the simulation script file. MyUDP is an

    extension of the UDP agent. This new agent allows users to specify the output file name

    of the sender trace file and it records the timestamp of each transmitted packet, the

    packet ID, and the packet payload size. The task of the MyUDP agent corresponds to the

task that tools such as tcp-dump or win-dump perform in a real network environment.

    MyUDPSink is the receiving agent for the fragmented video frame packets sent by

    MyUDP. This agent also records the timestamp, packet ID, and payload size of each

    received packet in the user specified receiver trace file. After simulation, based on these

    three trace files and the original encoded video, the ET program produces the corrupted

    video file. Afterward, the corrupted video is decoded and error concealed. Finally, the

    reconstructed fixed YUV video can be compared with the original raw YUV video to

    evaluate the end-to-end delivered video quality.

    3.2 Problem of the Original FV Program

    As described in section 2, when the video transmission is over, the receiver trace

    file has to be sent back to the sender side for the video quality evaluation. Based on the

    video trace file, the sender trace file, and the receiver trace file, the lost frames can be

    identified. If a frame is lost due to packet loss, the ET component sets the vop_coded bit

    of this video object plane (VOP) header in the original compressed video file to 0. The

    setting of this bit to 0 indicates that no subsequent data exists for this VOP. This type of

    frame is referred to as a vop-not-coded frame. When a frame is received completely and

    the vop_coded bit is set to 1, this type of frame is referred to as a decodable frame. After

    setting the vop_coded bit to 0 for all the lost frames, the processed file is then used to

    represent the compressed video file delivered to the receiver side.

    Currently, no standard exists to define an appropriate treatment of vop-not-coded

    frames. Some decoders with an error concealment mechanism simply replace the vop-not-coded frames by the last successfully decoded frame [28]. In these cases, the FV

    component is not required. Other decoders, however, without error concealment, such as

    ffmpeg, decode all frames other than the vop-not-coded frames. In these cases, the FV

    component can handle these vop-not-coded frames without difficulty by simply replacing

    them with the last successfully decoded frames. Other decoders, such as Xvid or the

NCTU codec, additionally fail to decode the subsequent frames in some cases. For example, when decoding a subsequent frame that is a decodable frame, this frame may fail

    to be decoded if the frame it depends on is a vop-not-coded frame because there is not

    enough information to decode it. This type of frame is referred to as a non-decodable

    frame. In this case, the original FV component fails since it does not take this possibility

    into consideration.

Based on these limitations, a requirement exists to design a new algorithm capable of solving the problem of non-decodable frames. In this study, we develop an algorithm that uses the decoder output to fix the decoding results, i.e. the reconstructed erroneous video sequence. If a frame is decodable, the improved FV component copies the decoded YUV frame data from the reconstructed erroneous raw video file into a temporary file and keeps it in a buffer as the last successfully decoded frame data. If a frame is vop-not-coded, the improved FV component reads the frame data from the reconstructed erroneous raw video file, but it does not copy the data into the temporary file. This is because

the data read is useless and the file pointer needs to be moved to the next frame. The improved FV component copies the data from the buffer into the temporary file instead. If a frame is missing or considered non-decodable, the improved FV component simply copies the last successfully decoded YUV frame data in the buffer into the temporary file. After processing all the frames in the reconstructed and possibly corrupted video sequence, the resulting temporary file is the reconstructed fixed video sequence. Afterwards, the frame-by-frame PSNR can be evaluated in the usual manner.
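The improved FV procedure described above can be summarized by the following sketch (our own simplified rendition, with stand-in frame labels in place of raw YUV data):

```python
def fix_video(frames, first_frame):
    """frames: list of (status, data) pairs with status in
    {'decodable', 'vop-not-coded', 'missing'}; data is None for
    missing frames. Returns the reconstructed fixed sequence, which
    has exactly one entry per original frame."""
    fixed = []
    last_good = first_frame  # buffer holding the last decoded frame
    for status, data in frames:
        if status == "decodable":
            fixed.append(data)   # copy the decoded frame to the output
            last_good = data     # and remember it in the buffer
        elif status == "vop-not-coded":
            # The decoder emitted data for this frame, but it is useless:
            # it is read (advancing the file pointer) and discarded, and
            # the buffered frame is copied in its place.
            fixed.append(last_good)
        else:
            # Missing or non-decodable: nothing usable to read,
            # so repeat the buffered frame.
            fixed.append(last_good)
    return fixed

seq = [("decodable", "F1"), ("vop-not-coded", "junk"),
       ("missing", None), ("decodable", "F4")]
assert fix_video(seq, "F0") == ["F1", "F1", "F1", "F4"]
```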

    4. SIMULATION RESULTS

This section demonstrates the usefulness of the new tool-set by considering two experimental cases, simulated in a best-effort network and in a DiffServ (Differentiated Services) network [19, 30, 31], when transmitting real video streams instead of synthetically generated video flow sequences. Fig. 3 presents the simple simulation topology, in which Host A delivers a video traffic stream to Host B through routers R1 and R2. The delivered video is a foreman QCIF format sequence composed of 400 frames. It has a mean bit rate of 200 Kbps and a peak bit rate of 400 Kbps. The bottleneck link has a capacity of 180 Kbps and is situated between router R1 and router R2. The queue limit at each router is set to 10 packets. The simulation scripts are publicly available at [22].

[Figure omitted: Host A attaches to router R1 and Host B to router R2, with the bottleneck link between R1 and R2.]

Fig. 3. Simulation topology.

    4.1 Conventional Best-Effort Network

In the first experiment, the video is delivered over a best-effort network, and routers R1 and R2 implement conventional First In First Out (FIFO) queue management. When the queue size reaches the queue limit, the FIFO queue management discards all incoming packets until the queue size decreases. Fig. 4 shows the results. The figure clearly shows that the curve of psnr_myfix_be, the video fixed by the improved FV component, outperforms that of psnr_fix_be, the video fixed by the original component, in the interval from frame 200 to frame 250 and above frame 370. This is because the original FV component cannot distinguish between vop-not-coded frames and missing frames. As a consequence, it may copy the wrong frame data from the reconstructed erroneous raw video file into the temporary file. In terms of average PSNR, the psnr_myfix_be curve measures 26.86 dB and the psnr_fix_be curve measures 23.43 dB. The simulation results demonstrate that the improved FV component is more effective than the conventional one in reconstructing the corrupted video sequence.


Fig. 4. Original FV vs. improved FV for best-effort delivered video.

Fig. 5. QoS delivery vs. best-effort delivery.

    4.2 DiffServ Network

    The second experiment is simulated in a DiffServ network in which I-frame packets

    are pre-marked with the lowest drop probability in the application layer at the source,

    P-frame packets are pre-marked with a medium drop probability, and B-frame packets

are pre-marked with the highest drop probability. Routers R1 and R2 implement Weighted Random Early Detection (WRED) queue management. When the queue builds up and exceeds a given threshold, WRED starts to drop packets following the specified drop probability parameters. Fig. 5 shows the results. The

    PSNR difference values between psnr_noloss, which means no packet loss during

    transmission, and psnr_myfix_qos, which is the video transmitted by QoS delivery, are

    less than those between psnr_noloss and psnr_myfix_be, which is the video transmitted

by best-effort delivery, especially in the interval from frame 260 to frame 360. In terms of average PSNR, the delivered video quality in the DiffServ network measures 28.64 dB. As expected, it outperforms the result obtained in the best-effort network, i.e. an average PSNR of 26.86 dB. Consequently, a DiffServ network provides a more suitable environment for video transmission. In addition, to illustrate how the difference in performance is perceived by an end user, the corresponding visual effects are shown in Fig. 6 by means of a YUV display tool, i.e. yuvviewer [32]. This kind of visual result

    Fig. 6 by means of the YUV display tool, i.e. yuvviewer [32]. This kind of visual result

    for a real video stream being transmitted over a simulated network is enabled by our new

    tool-set. The possibility of transmitting real video streams over a simulated network also

    enables the use of the PSNR quality measurement metric that takes into account the

    video content.

5. RELATIONSHIP BETWEEN PSNR AND THE FRACTION OF DECODABLE FRAMES

In this section, we investigate the relationship between two popular objective metrics: PSNR and the fraction of decodable frames. PSNR is a commonly accepted objective performance metric that takes the video content into account to assess the video quality. However, the pixel-by-pixel and frame-by-frame comparison needed to get the PSNR value


Fig. 6. Visual comparison of the reconstructed 180th to 184th frames: (a) QoS delivery; (b) best-effort delivery.

Table 4. QoS mappings.

QoS Index   Green       Yellow      Red
0           I           P           B
1           I           P + B       -
2           I           -           P + B
3           I + P       B           -
4           I + P       -           B
5           I + P + B   -           -
6           -           I           P + B
7           -           I + P       B
8           -           I + P + B   -
9           -           -           I + P + B

is a slow and laborious job. If the metric of the fraction of decodable frames can adequately correspond to the behavior of the PSNR metric while being less time-consuming, it can be an alternative to objectively evaluate the delivery quality of transmitted video streams.

The fraction of decodable frames reports the number of decodable frames over the total number of transmitted frames. A frame is considered decodable if at least a certain fraction of the data in the frame, called the decodable threshold, is received. However, a frame is considered decodable if and only if all of the frames upon which it depends are also decodable. Therefore, for instance, when the decodable threshold is 0.75, 25% of the data of a frame can be lost without causing that frame to be considered undecodable.
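The metric can be computed directly from per-frame reception records, as in the following sketch (our own; frame dependencies are given explicitly rather than derived from the GOP structure):

```python
def decodable_fraction(frames, threshold=0.75):
    """frames: list of dicts with 'received' and 'total' byte counts and
    'deps', the indices of earlier frames this frame depends on.
    Returns the fraction of decodable frames."""
    decodable = []
    for f in frames:
        # A frame needs at least threshold * total bytes received...
        enough_data = f["received"] >= threshold * f["total"]
        # ...and every frame it depends on must itself be decodable.
        deps_ok = all(decodable[d] for d in f["deps"])
        decodable.append(enough_data and deps_ok)
    return sum(decodable) / len(frames)

frames = [
    {"received": 3036, "total": 3036, "deps": []},      # I frame, intact
    {"received": 500,  "total": 659,  "deps": [0]},     # P frame, ~76% received
    {"received": 357,  "total": 357,  "deps": [0, 1]},  # B frame, intact
]
assert decodable_fraction(frames, threshold=0.75) == 1.0
assert decodable_fraction(frames, threshold=1.0) == 1 / 3
```

Note how the partially received P frame drags the dependent B frame down with it once the threshold rises to 1.0, which mirrors the dependency rule stated above.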

The simulation settings follow [10]. The goal of that paper was to study the delivered video quality for different QoS source mappings. The adopted QoS mapping table is shown in Table 4. For example, QoS 0 means that I-frame packets are pre-marked as green, P-frame packets are pre-marked as yellow, and B-frame packets are pre-marked as red, where the color markings red, yellow, and green represent increasing packet loss protection within the DiffServ network.

    This paper investigates the relationship between the objective metrics PSNR and

    fraction of decodable frames. The adopted network topology for this purpose is shown in

    Fig. 7. Three video sources were connected to a DiffServ network. The three video


[Figure omitted: sources S1, S2, and S3 attach to router R1, destinations D1, D2, and D3 attach to router R3, and routers R1, R2, and R3 form a chain. The access links are 10 Mbps with 1 ms delay; the bottleneck bandwidth is specified per scenario.]

Fig. 7. Network topology for different QoS source mappings.

Fig. 8. PSNR for the foreman video sequence.

Fig. 9. The fraction of decodable frames for the foreman video sequence for decodable thresholds 1.0 and 0.75.

sources transmitted the same video sequence to their respective destinations with a random start time within an interval of 3 seconds. The tested video sequences covered three different kinds of video content, i.e. foreman, akiyo, and highway [33]. These real

    video traces have different properties in terms of motion, frame size, and quality. Each

    frame is fragmented into packets of 1,000 bytes before transmission. The three routers in

    the simulation scenario implement the WRED mechanism for active queue management.

The WRED parameters include a minimum threshold, a maximum threshold, and a maximum drop probability, i.e. min_th, max_th, and P_max. The WRED parameters and the bottleneck bandwidth are set differently and are specified in the following three simulation scenarios.
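The per-color WRED drop behavior referred to above can be sketched as the standard linear RED ramp (a generic illustration of ours, not the NS2 implementation):

```python
def wred_drop_prob(avg_queue: float, min_th: float, max_th: float,
                   p_max: float) -> float:
    """Linear RED drop probability for one color class: 0 below min_th,
    ramping linearly to p_max at max_th, and 1 above max_th."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return p_max * (avg_queue - min_th) / (max_th - min_th)

# With the red-packet parameters {10, 20, 0.1} and the green-packet
# parameters {30, 40, 0.025} of the first scenario below, red packets
# start being dropped at queue lengths where green ones are still safe.
assert wred_drop_prob(15, 10, 20, 0.1) == 0.05
assert wred_drop_prob(15, 30, 40, 0.025) == 0.0
```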

In the first set of simulations, the tested video sequence is foreman. The parameters for the WRED queue mechanism are specified respectively as {10, 20, 0.1} for red

    packets, {20, 30, 0.05} for yellow packets, and {30, 40, 0.025} for green packets. The

    bottleneck bandwidth is set to 512 Kbps. The simulation results are shown in Figs. 8


Fig. 10. PSNR for the akiyo video sequence.

Fig. 11. The fraction of decodable frames for the akiyo video sequence for decodable thresholds 1.0 and 0.75.

and 9. The error bars show the 95% confidence interval. The behavior of the PSNR metric for different QoS indexes matches exactly that of the fraction of decodable frames, no matter whether the decodable threshold is 1.0 or 0.75. When the QoS indexes have higher PSNR values, the values of the fraction of decodable frames are also higher. Likewise, when the QoS indexes have lower PSNR values, the values of the fraction of decodable frames are also lower.

    In the second set of simulations, the tested video sequence is the CIF format akiyo

    video sequence, which has 300 frames coded at 30 frames/sec. It has a mean bit rate of

237 Kbps and a peak rate of 595 Kbps. The parameters for the WRED queue mechanism are specified respectively as {20, 40, 0.1} for red packets, {40, 60, 0.05} for yellow packets, and {60, 80, 0.025} for green packets. The bottleneck bandwidth is set to 640 Kbps.

    The simulation results are shown in Figs. 10 and 11. The error bars show the 95% confi-

    dence interval. Similarly to the foreman sequence, in the akiyo sequence the behav-

    ior of the PSNR metric for different QoS indexes matches exactly that of the fraction ofdecodable frames when = 0.75. However, the curve is somewhat inconsistent with that

    of PSNR values for QoS index 5 and QoS index 8 when = 1.0. During PSNR simula-

    tions, the improved FV conceals some packet losses, but the system is completely intol-

    erant to losses in the case of= 1.0. Therefore, using a smaller is better than using a

    largerin matching PSNR.
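The role of the decodable threshold can be sketched as follows. This is an illustrative simplification, not the tool-set's exact code: it applies only the per-frame threshold test (a frame counts as decodable if the fraction of its packets that arrived meets the threshold) and omits the inter-frame dependencies, under which a P or B frame additionally requires its reference frames to be decodable.

```python
# Illustrative sketch of the threshold part of the decodable-frames metric.
def fraction_decodable(received, sent, threshold):
    """received[i], sent[i]: packet counts for frame i."""
    decodable = sum(
        1 for r, s in zip(received, sent) if s > 0 and r / s >= threshold
    )
    return decodable / len(sent)

# Four frames; one frame lost 1 of its 4 packets:
# threshold 1.0  -> 3 of 4 frames count as decodable (0.75)
# threshold 0.75 -> all 4 frames count as decodable (1.0)
```

A threshold of 1.0 declares a frame undecodable on any packet loss, which is why it disagrees with PSNR whenever the decoder's error concealment recovers mildly damaged frames.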

In the third set of simulations, the tested video sequence is the CIF-format highway sequence, which has 2000 frames coded at 30 frames/sec, with a mean bit rate of 412 Kbps and a peak rate of 1116 Kbps. The parameters for the WRED queue mechanism are specified as {20, 40, 0.1} for red packets, {40, 60, 0.05} for yellow packets, and {60, 80, 0.025} for green packets. The bottleneck bandwidth is set to 1.024 Mbps. The simulation results are shown in Figs. 12 and 13; the error bars show the 95% confidence interval. Again, the behavior of the PSNR metric for the different QoS indexes matches exactly that of the fraction of decodable frames when the decodable threshold is 0.75.

It is also interesting to take a closer look at this simulation.

Fig. 12. PSNR for the highway video sequence. Fig. 13. The fraction of decodable frames for the highway video sequence with decodable thresholds 1.0 and 0.75.

When computing the PSNR metric, it takes around 3 to 4 minutes to finish the tasks of simulating, evaluating traces, decoding, fixing, and doing the frame-by-frame PSNR comparison on a Pentium III 1 GHz computer equipped with 512 MB RAM. In contrast, it takes less than 10 seconds to obtain the value of the fraction of decodable frames. Similar results hold for the other two video sequences. Note that highway has only 2000 frames, i.e. around 1.11 minutes of video at 30 frames/second; a test sequence with more frames needs correspondingly more time to finish all the tasks.
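For reference, the frame-by-frame comparison uses the standard PSNR definition over the pixel samples; a minimal pure-Python sketch for 8-bit samples follows (the tool-set itself operates on raw YUV files decoded from the received stream, not Python lists).

```python
# Illustrative sketch of per-frame PSNR for 8-bit samples (peak value 255).
import math

def psnr(original, received, peak=255.0):
    """PSNR in dB between two equal-length sequences of pixel samples."""
    mse = sum((o - r) ** 2 for o, r in zip(original, received)) / len(original)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * math.log10(peak * peak / mse)
```

The cost reported above comes not from this formula itself but from the pipeline around it: the received bitstream must be reassembled, decoded, and fixed before any pixel-wise comparison is possible, which is what makes the fraction of decodable frames so much cheaper to obtain.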

    6. CONCLUSION AND FUTURE WORK

The contribution of this paper is twofold. First, we have presented the integration of EvalVid and NS2 to provide a novel, generalized, and comprehensive tool-set for evaluating the video quality performance of network designs in a simulated environment. The developed integration provides three new connecting simulation agents, namely MyTrafficTrace, MyUDP, and MyUDPSink. These agents enable EvalVid to link seamlessly with NS2, so that researchers and practitioners have greater freedom to analyze their proposed network designs for video transmission without being obliged to devise an appropriate tool-set for video quality evaluation. Simulations of real video streams are enabled over a large set of network scenarios, including relatively large topologies, node mobility, different kinds of concurrent traffic, or any other functionality available in the network simulator. Second, in an analysis enabled by the new tool-set, we have shown that the fraction of decodable frames reflects the behavior of the PSNR video quality assessment metric with reasonable accuracy while being less time-consuming by at least one order of magnitude. Therefore, when researchers or practitioners want to encode their own test video sequences, or adopt well-known ones, in order to evaluate the delivered video quality in a simulated network environment, our proposed QoS assessment framework is a good choice.

Although this new evaluation framework benefits networking and video-coding practitioners in most cases, it still has some limitations. First, the current version only supports non-scalable video encoding. Second, due to the video encoding modes and the agents we developed, the current framework is not suitable for video transmission over bi-directional channels; the video encoding parameters cannot be changed during simulation time, so researchers interested in rate-adaptive designs can refer to [34] for more information. In the future, we will incorporate more codecs into the framework and support scalable video coding and multiple description coding (MDC). A prototype of a multiple description coding evaluation framework is publicly available at [35]; researchers interested in multiple-path transport and load-balancing designs can try this prototype framework for a preliminary evaluation.

    REFERENCES

1. S. F. Chang and A. Vetro, "Video adaptation: concepts, technologies, and open issues," in Proceedings of the IEEE, Vol. 93, 2005, pp. 148-158.

2. F. H. P. Fitzek and M. Reisslein, "MPEG-4 and H.263 video traces for network performance evaluation," IEEE Network, Vol. 15, 2001, pp. 40-54.

3. P. Seeling, M. Reisslein, and B. Kulapala, "Network performance evaluation using frame size and quality traces of single-layer and two-layer video: a tutorial," IEEE Communications Surveys and Tutorials, Vol. 6, 2004, pp. 58-78.

4. Traffic trace from Mark Garrett's MPEG encoding of the Star Wars movie, http://www.research.att.com/~breslau/vint/trace.html.

5. Video traffic generator based on the TES (Transform Expand Sample) model of MPEG-4 trace files, contributed by Ashraf Matrawy and Ioannis Lambadaris. It generates traffic that has the same first- and second-order statistics as an original MPEG-4 trace, http://www.sce.carleton.ca/~amatrawy/mpeg4.

6. O. Rose, "Statistical properties of MPEG video traffic and their impact on traffic modeling in ATM systems," Report No. 101, Institute of Computer Science, University of Wurzburg, Germany, 1995.

7. D. Saparilla, K. Ross, and M. Reisslein, "Periodic broadcasting with VBR-encoded video," in Proceedings of IEEE INFOCOM, 1999, pp. 464-471.

8. L. Tionardi and F. Hartanto, "The use of cumulative inter-frame jitter for adapting video transmission rate," in Proceedings of the Conference on Convergent Technologies for Asia-Pacific Region, Vol. 1, 2003, pp. 364-368.

9. A. Ziviani, B. E. Wolfinger, J. F. Rezende, O. C. M. B. Duarte, and S. Fdida, "Joint adoption of QoS schemes for MPEG streams," Multimedia Tools and Applications, Vol. 26, 2005, pp. 59-80.

10. J. M. H. Magalhaes and P. R. Guardieiro, "A new QoS mapping for streamed MPEG video over a DiffServ domain," in Proceedings of the IEEE International Conference on Communications, Circuits and Systems and West Sino Expositions, 2002, pp. 675-679.

11. M. F. Alam, M. Atiquzzaman, and M. A. Karim, "Traffic shaping for MPEG video transmission over the next generation internet," Computer Communications, Vol. 23, 2000, pp. 1336-1348.

12. N. E. Nasser and M. Al-Abdulmunem, "MPEG traffic over diffserv assured service," in Proceedings of the Asia-Pacific Conference on Communication, 2003, pp. 494-498.

13. J. Takahashi, H. Tode, and K. Murakami, "QoS enhancement methods for MPEG video transmission on the Internet," IEICE Transactions on Communications, Vol. E85-B, 2002, pp. 1020-1030.

14. F. A. Shaikh, S. McClellan, M. Singh, and S. K. Chakravarthy, "End-to-end testing of IP QoS mechanisms," IEEE Computer Magazine, Vol. 35, 2002, pp. 80-87.

15. NS related mailing lists, http://www.isi.edu/nsnam/htdig/search.html.

16. J. Klaue, B. Rathke, and A. Wolisz, "EvalVid - a framework for video transmission and quality evaluation," in Proceedings of the International Conference on Modelling Techniques and Tools for Computer Performance Evaluation, 2003, pp. 255-272.

17. NS, http://www.isi.edu/nsnam/ns/.

18. S. Olsson, M. Stroppiana, and J. Baina, "Objective methods for assessment of video quality: state of the art," IEEE Transactions on Broadcasting, Vol. 43, 1997, pp. 487-495.

19. C. H. Ke, C. K. Shieh, W. S. Hwang, and A. Ziviani, "A two-markers system for improved MPEG video delivery in a DiffServ network," IEEE Communications Letters, Vol. 9, 2005, pp. 381-383.

20. J. Naoum-Sawaya, B. Ghaddar, S. Khawam, H. Safa, H. Artail, and Z. Dawy, "Adaptive approach for QoS support in IEEE 802.11e wireless LAN," in Proceedings of the IEEE International Conference on Wireless and Mobile Computing, Networking and Communications, 2005, pp. 167-173.

21. H. Huang, J. Ou, and D. Zhang, "Efficient multimedia transmission in mobile network by using PR-SCTP," in Proceedings of the IASTED International Conference on Communications and Computer Networks, 2005, pp. 213-217.

22. http://hpds.ee.ncku.edu.tw/~smallko/ns2/Evalvid_in_NS2.htm.

23. NCTU codec, http://megaera.ee.nctu.edu.tw/mpeg.

24. ffmpeg, http://ffmpeg.sourceforge.net/index.php.

25. Xvid, http://www.xvid.org/.

26. tcp-dump, http://www.tcpdump.org.

27. win-dump, http://windump.polito.it.

28. Y. Wang and Q. F. Zhu, "Error control and concealment for video communication: a review," in Proceedings of the IEEE, Vol. 86, 1998, pp. 974-997.

29. J. R. Ohm, Bildsignalverarbeitung fuer Multimedia-Systeme, lecture notes, 1999.

30. B. Carpenter and K. Nichols, "Differentiated services in the internet," in Proceedings of the IEEE, Vol. 90, 2002, pp. 1479-1494.

31. J. Shin, J. Kim, and C. C. J. Kuo, "Quality of service mapping mechanism for packet video in differentiated services network," IEEE Transactions on Multimedia, Vol. 3, 2001, pp. 219-231.

32. yuvviewer, http://eeweb.poly.edu/~yao/VideobookSampleData/video/application/YUV-viewer.exe.

33. YUV video sequences (CIF), http://www.tkn.tu-berlin.de/research/evalvid/cif.html.

34. Evalvid-RA, http://www.item.ntnu.no/~arnelie/Evalvid-RA.htm.

35. Multiple description coding evaluation framework, http://hpds.ee.ncku.edu.tw/~smallko/ns2/MDC.htm.


Chih-Heng Ke received his B.S. and Ph.D. degrees in Electrical Engineering from National Cheng Kung University in 1999 and 2007, respectively. He is an assistant professor of Computer Science and Information Engineering at the National Kinmen Institute of Technology, Kinmen, Taiwan. His current research interests include multimedia communications, wireless networks, and QoS networking.

Ce-Kuen Shieh is currently a professor in the Department of Electrical Engineering, National Cheng Kung University. He received his Ph.D., M.S., and B.S. degrees from the Electrical Engineering Department of National Cheng Kung University, Tainan, Taiwan. His current research areas include distributed and parallel processing systems, computer networking, and operating systems.

Wen-Shyang Hwang received his B.S., M.S., and Ph.D. degrees in Electrical Engineering from National Cheng Kung University, Taiwan, in 1984, 1990, and 1996, respectively. He is a professor of Electrical Engineering at the National Kaohsiung University of Applied Sciences, Taiwan. His current research focus includes multi-channel WDM networks, performance evaluation, QoS, RSVP, and WWW database applications.

Artur Ziviani received a B.Sc. in Electronics Engineering in 1998 and an M.Sc. in Electrical Engineering in 1999, both from the Federal University of Rio de Janeiro (UFRJ), Brazil. In 2003, he received a Ph.D. in Computer Science from the University of Paris 6, France, where he was also a lecturer from 2003 to 2004. Since 2004, he has been with the National Laboratory for Scientific Computing (LNCC), Brazil. His research interests include QoS, wireless computing, Internet measurements, and the application of networking technologies in telemedicine.