14
Accepted Manuscript Performance assessment of three-dimensional video codecs in mobile terminals Almudena S ´ anchez, Mat´ ıas Toril, Marta Solera, Salvador Luna-Ram´ ırez, Gerardo G ´ omez PII: S0140-3664(17)31268-9 DOI: 10.1016/j.comcom.2018.04.016 Reference: COMCOM 5692 To appear in: Computer Communications Received date: 3 December 2017 Revised date: 22 March 2018 Accepted date: 24 April 2018 Please cite this article as: Almudena S ´ anchez, Mat´ ıas Toril, Marta Solera, Salvador Luna-Ram´ ırez, Gerardo G ´ omez, Performance assessment of three-dimensional video codecs in mobile terminals, Computer Communications (2018), doi: 10.1016/j.comcom.2018.04.016 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Performance assessment of three-dimensional video codecs ...webpersonal.uma.es/~TORIL/files/2018 CC 3D video codecs.pdf · ACCEPTED MANUSCRIPT ACCEPTED MANUSCRIPT Performance assessment

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Performance assessment of three-dimensional video codecs ...webpersonal.uma.es/~TORIL/files/2018 CC 3D video codecs.pdf · ACCEPTED MANUSCRIPT ACCEPTED MANUSCRIPT Performance assessment

Accepted Manuscript

Performance assessment of three-dimensional video codecs inmobile terminals

Almudena Sanchez, Matıas Toril, Marta Solera,Salvador Luna-Ramırez, Gerardo Gomez

PII: S0140-3664(17)31268-9DOI: 10.1016/j.comcom.2018.04.016Reference: COMCOM 5692

To appear in: Computer Communications

Received date: 3 December 2017Revised date: 22 March 2018Accepted date: 24 April 2018

Please cite this article as: Almudena Sanchez, Matıas Toril, Marta Solera, Salvador Luna-Ramırez,Gerardo Gomez, Performance assessment of three-dimensional video codecs in mobile terminals,Computer Communications (2018), doi: 10.1016/j.comcom.2018.04.016

This is a PDF file of an unedited manuscript that has been accepted for publication. As a serviceto our customers we are providing this early version of the manuscript. The manuscript will undergocopyediting, typesetting, and review of the resulting proof before it is published in its final form. Pleasenote that during the production process errors may be discovered which could affect the content, andall legal disclaimers that apply to the journal pertain.

Page 2: Performance assessment of three-dimensional video codecs ...webpersonal.uma.es/~TORIL/files/2018 CC 3D video codecs.pdf · ACCEPTED MANUSCRIPT ACCEPTED MANUSCRIPT Performance assessment

ACCEPTED MANUSCRIPT

ACCEPTED MANUSCRIP

T

Performance assessment of three-dimensional video codecs in mobile terminals

Almudena Sancheza, Matıas Torila, Marta Soleraa, Salvador Luna-Ramıreza, Gerardo Gomeza

aDepartamento de Ingenierıa de Comunicaciones, E.T.S.I.Telecomunicacion, Universidad de Malaga, Bulevar Louis Pasteur, S/N, 29010, Malaga,Spain

Abstract

Understanding the most relevant factors influencing the end user experience is key to design and manage multimediaservices in mobile networks. Thus, user demand can be dealt with efficient strategies of resource management. Inthis paper, a comprehensive analysis is carried out to evaluate the Quality of Experience (QoE) obtained by three-dimensional (3D) video codecs in mobile terminals. The analysis covers the most widespread 3D video codingformats, namely the frame-compatible Side-by-Side (SbS) and Multiview Video Coding (MVC) schemes. The analy-sis considers both subjective and objective measurements over compression strategies that combine bit rate reductionand frame-rate dropping. Subjective tests are done by presenting 3D video sequences with different coding parame-ters to users who judge image quality, depth perception and visual comfort. For this purpose, a mobile phone withautostereoscopic screen is used. Then, mean opinion scores are compared with objective measurements obtained bya video quality measurement tool. Results show that MVC outperforms SbS in terms of image quality. However,when internal parameter settings are set to very restrictive configurations, the impairment of the original image aresimilar for both codecs. Likewise, results show that the reduction of the video bit rate is the key parameter control-ling video quality, and that the use of frame-rate dropping is not enough to counteract the impairment introduced bybit rate reduction. Finally, it is confirmed that a simple QoE indicator for 3D video can be obtained from objectivemeasurements.

Keywords: 3D video, Codecs, Mobile terminals, Subjective, Objective

1. Introduction

In the last decade, advances in consumer electronicshave led to an unprecedented growth in the numberof applications and devices in telecommunication net-works. Youtube, Vine, and social media applications,such as Facebook, Instagram or Snapchat, allow usersto upload, watch and share videos. As a result, videocontent has become a major contributor to Internet traf-fic, and it is expected that video traffic will be 80% ofall consumer Internet traffic by 2019 [1].

In parallel, the introduction of tablets and smart-phones has paved the way for new multimedia mobileservices. New mobile terminals make it easier to watchand share videos, which has become a daily activityfor many users. Mobile video is forecast to grow from50% in 2015 to 70% in 2021 of all mobile data traffic[2]. Thus, mobile networks have to cope with the hugeamount of data and strict performance requirements as-sociated to video services.

The blooming of video applications in the enter-tainment area has also favoured the return of on-and-

off technologies, such as three-dimensional (3D) video,which re-started in 2009 with the Avatar film. There-after, a large number of productions have been dis-tributed in 3D format. 3D video has not only foundits way into films, but it has also jumped into other ar-eas, such as medicine (e.g., to help in surgery [3] ordiagnosis [4]), sports (e.g., for advanced player perfor-mance analysis [5] or for investigation purposes (e.g.,to monitor animal behaviour [6]). Furthermore, 3D de-vices have evolved from the first 3D television sets athome to intelligent 3D-capable mobile terminals withautostereoscopic screens [7],[8].

To provide a successful 3D video service, it is neces-sary to somehow measure user’s satisfaction or Qualityof Experience (QoE), understood as how well a servicemeets customers goals and expectations, rather than fo-cusing only on network performance, which was thegoal of Quality of Service (QoS) [9].

In this paper, a thorough study of the Quality of Ex-perience (QoE) provided by 3D video codecs in mo-bile terminals is presented. Two of the most well-

Preprint submitted to Computer Communications April 30, 2018

Page 3: Performance assessment of three-dimensional video codecs ...webpersonal.uma.es/~TORIL/files/2018 CC 3D video codecs.pdf · ACCEPTED MANUSCRIPT ACCEPTED MANUSCRIPT Performance assessment

ACCEPTED MANUSCRIPT

ACCEPTED MANUSCRIP

T

known standards for 3D video coding, namely frame-compatible Side-by-Side (SbS) format and MultiviewCoding (MVC), are analysed by performing both sub-jective and objective measurements. Subjective assess-ment is carried out by means of a smartphone withan autostereoscopic screen. In that terminal, observersare provided with a comprehensive set of 3D videosequences corresponding to different coding parameterconfigurations that reduce both bit rate and frames. Theaim is to evaluate several 3D video aspects, such as im-age quality, depth quality and visual comfort. Subjec-tive measurements, measured as mean opinion scores(MOS), are compared with classical objective measure-ments, such as Peak Signal-to-Noise Ratio (PSNR) andMulti-Scale Structural Similarity (MS-SSIM), obtainedby a publicly-available video quality measurement tool.

1.1. Related workDistributing high-quality 3D video content requires a

large bandwidth. Thus, researchers have focused theirefforts on the development of efficient 3D coding tech-niques to reduce the amount of transmitted data [10]. Afirst approach, referred to as Simulcast, consist of dis-tributing a full-resolution stereo video by two indepen-dent views of maximum quality in parallel. Thus, com-putational load is kept to a minimum, although data rateis doubled compared to conventional 2D video.

Alternatively, frame-compatible formats were bornto restrict the data flow of the video representation bycombining sub-sampled versions of the left and rightviews into a single frame, which can then be codedwith conventional 2D video codecs, such as MPEG2,H.264/MPEG4 Part 10 Advanced Video Coding (AVC)[11] or High Efficiency Video Coding (HEVC) [12].Thus, operators can still use the existing infrastruc-ture to introduce stereoscopic services at the expense ofslightly deteriorating 3D video quality. An intermediateapproach is Mixed Resolution Stereo Coding (MRSC),where the resolution of one view is reduced to exploitthat the perceived quality of stereoscopic video withviews of different sharpness is rated by the observerclose to the sharper view [13].

More advanced schemes specifically designed for 3Dvideo, such as H.262/MPEG2 Multi-View Profile [14]and H.264/MPEG4 Multiview Video Coding (MVC)[11], make use of differential coding between left andright view to exploit inter-view redundancy (stereovideo). Nonetheless, the resulting coding rate is still50 % larger than that of a single view [15] becausethe information related to view’s differences is also sentin order to restore the secondary view in the decoder.Alternatively, depth-based coding schemes (V+D, for

video and depth) [16] [17], add depth information (usu-ally, a greys map) to a classical 2D video stream, en-abling depth-image-based rendering (DIBR) of addi-tional viewpoints in the decoder to generate the 3Dvideo representation with full resolution. The result-ing coding rate is only 10-20% higher than conventionalvideo.

Another important issue of 3D video is the need fornew metrics to assess video quality. Subjective metricsfor stereoscopic video have significant differences com-pared to 2D video metrics, due to the clever way the hu-man visual system handles similarities and dissimilari-ties between views. A first difference is the inclusion ofdepth information in video sequences, for which severalvideo databases exist in the public domain [18]. More-over, subjective 3D video assessment requires evaluat-ing new indicators, in addition to picture quality, such asdepth quality, naturalness, emotion, sense of presence oreven the price of the network service, as recommendedin ITU-BT.2021-1 [19]

Likewise, objective 3D video quality metrics alsodiffer from those of 2D, since 3D video presents sig-nificantly different quality issues that are not encoun-tered or do not have their equivalent in 2D. A naiveapproach to build 3D metrics from already existing 2Dmeasurements is to average the value of any objectivemetric (e.g., Peak Signal-to-Noise Ratio [20], StructuralSimilarity [21], Moving Picture Quality Metric [22], orVideo Quality Model [23]) for the left and right views[24]. More refined metrics take into account the proper-ties of 3D vision (e.g., disparity [25] or binocular depth[26]). However, the latter are not currently availablein open access measurement tools. In this work, the se-lected objective measurements are PSNR and MS-SSIM(described in detail in later sections), the latter being amulti-scale variant of Structural Similarity index. Thesemeasurements commonly used in video assessment areavailable in several publicly available video measure-ment tools, allowing the reader to replicate the studypresented here. Furthermore, these tools allow the useof common video file formats (e.g., mp4, yuv, avi . . . ).Thus, there is no need of a preprocessing stage wherethe original video format is modified to compute the de-sired objective metric.

Regarding the novelties that this article provides,multiple references have evaluated the performance of3D video coding schemes on large screen formats withobjective [27][15], or subjective tests [28] [29][30].However, few studies have extended the analysis to mo-bile phones with small autostereoscopic screens, wherenot only screen size impacts MOS scores, but also otherfactors such as the maturity of the 3D technology or user

2

Page 4: Performance assessment of three-dimensional video codecs ...webpersonal.uma.es/~TORIL/files/2018 CC 3D video codecs.pdf · ACCEPTED MANUSCRIPT ACCEPTED MANUSCRIPT Performance assessment

ACCEPTED MANUSCRIPT

ACCEPTED MANUSCRIP

T

familiarity with autostereoscopic screens. In [31] and[32], several stereoscopic coding schemes suitable formobile devices are compared based on objective met-rics, such as rate-distortion curves and computationalcomplexity. These studies aim to find the best codecparameter settings depending on processing power andmemory of the mobile device. However, these studiesdo not take into account the user subjective perspec-tive. Reference [33] presents subjective measurementsof 3D video over a Long Term Evolution ( LTE) networkunder packet losses following a Gilbert-Elliot ChannelModel. However, it does not consider bit rate variationsin the network and source, which is an important param-eter affecting video quality.

A more comprehensive analysis of the experienceprovided by 3D codecs in mobile phones is presentedin [34] [35]. It was shown there that the use of the highcoding profile (i.e., hierarchical B-frames and context-adaptive binary arithmetic coding) can provide the samequality as baseline profile with lower bit rates. Ourstudy extends the results in [34] [35] by also includingframe dropping as a rate reduction scheme to check ifa combination of both bit rate and frame dropping canobtain a higher QoE.

The rest of the paper is organized as follows. Sec-tion 2 reviews 3D video compression techniques. Sec-tion 3 describes the assessment methodology used inthis work. Section 4 presents the results of the subjec-tive assessment of 3D video coding approaches carriedout with real observers. Section 5 presents the resultsof objective measurements. Section 6 compares the re-sults of subjective and objective tests. Finally, Section 7presents the main conclusions of the study.

2. 3D Video Coding

This section describes the 3D video coding schemesused in this work. The selected coding formats includea scheme that makes use of existing 2D video codecs(frame-compatible Side-by-Side, SbS) and a schemebased on a base view (Multiview Video Coding, MVC).Both schemes have been standardized and provide back-ward support to legacy devices. For this reason, SbSand MVC have become the most widespread 3D videoschemes.

2.1. Side-by-Side

In frame-compatible formats, the stereo video signalis a multiplex of the two views into a single frame (or asequence of frames). To keep the same sequence size,the left and right views are sub-sampled and merged.

Decimation can be done in the spatial or temporal do-main. In the former case, the two views are decimatedhorizontally or vertically and stored in a side-by-side(SbS), top-and-bottom (TaB) or line/column-interleavedformat [36]. In the latter case, the left and right viewsare interleaved as alternating frames or fields. Both op-tions are referred to as frame sequential and field se-quential schemes, respectively [37].

Figure 1 shows how SbS exploits spatial redundancyto reduce the horizontal resolution into a single frame.The process begins at the encoder by low-pass filteringthe image of each view to reduce aliasing before sub-sampling. Then, the left and right views are decimated.Decimation can be performed by taking the same po-sition of pixels in both views (referred to as commondecimation) or alternated positions (e.g., odd columnsin left view and even columns in right view) [37]. Then,the sub-sampled versions of the views are merged into asingle frame. At the decoder side, de-multiplexing andinterpolation is carried out to reconstruct the two views.To facilitate the adoption of frame compatible formats,the H.264/AVC standard introduced a new Supplemen-tal Enhancement Information (SEI) message to signalthe frame packing arrangement and the view order used[11]. From these messages, a decoder can recognize theformat and apply suitable image processing techniques(e.g., scaling, denoising or color-format conversion).

2.2. Multiview Video Coding

Multiview Video Coding (MVC) [11] is an exten-sion of the Advanced Video Coding (AVC) standard,designed for coding a video sequence captured by mul-tiple cameras. MVC exploits temporal and inter-viewredundancy by interleaving camera views and using dif-ferential coding between frames and views. For in-stance, MVC can be used to reduce the transmitted dataof multi-view video to ensure a robust transmission overwireless networks despite the high bandwidth require-ment and time-varying channel quality [38].

Figure 2 shows an example of operation of the MVCStereo High Profile. Each view consists of an or-dered sequence of intra-coded (I), predicted (P) and bi-directionally predicted (B) frames, in a repetitive pat-tern known as group of pictures (GOP). Additionally,MVC applies differential coding between the left andright views to eliminate inter-view redundancy. To re-duce complexity, MVC does not allow the prediction ofa picture in one view at a given time using a picture fromanother view at a different time.

The MVC decoding process requires several addi-tions to the H.264/MPEG-4 AVC standard [36]. As

3

Page 5: Performance assessment of three-dimensional video codecs ...webpersonal.uma.es/~TORIL/files/2018 CC 3D video codecs.pdf · ACCEPTED MANUSCRIPT ACCEPTED MANUSCRIPT Performance assessment

ACCEPTED MANUSCRIPT

ACCEPTED MANUSCRIP

T

Figure 1: Side-by-Side coding method.

in H.264, MVC encoders can be configured to differ-ent levels, each defining constraints on the generatedbitstream to reduce decoding complexity. For MVC,the high-level syntax defined by H.264/MPEG-4 AVCis enriched with a multi-view extension of the sequenceparameter set (SPS) providing view identification andview dependencies. Likewise, MVC profiles determinethe subset of coding tools that must be supported byconforming decoders. There are two profiles defined byMVC: the Multiview High profile, to support multipleviews, and the Stereo High profile, only for two views.In both profiles, the base view can be encoded using ei-ther the High profile of H.264/MPEG-4 AVC, or a moreconstrained profile known as the Constrained Baselineprofile.

3. 3D Video Quality Assessment Methodology

In this section, the experimental methodology usedto evaluate the performance of different 3D codecs onthe screen of a mobile device is presented. Experimentsinclude both subjective and objective tests.

3.1. Test material

A proper selection of the video sequences is key forquality assessment. In this work, the following crite-

Figure 2: MVC coding method.

ria are used to construct the set of sequences in the tests[39][40]: a) avoid sequences with distracting content, b)combine real and animated sequences, as they responddifferently to compression algorithms, c) include se-quences with different planar motion (e.g., quick or slowmovement), d) avoid sequences causing visual discom-fort and/or eyestrain on the observer, which is achievedby limiting depth in the 3D scene, and e) use a suffi-ciently large number of sequences to ensure reliable re-sults, but not too many to shorten the display sessions(note that multiple coding approaches have to be testedfor each sequence).

The three chosen videos in the tests are taken from theMobile3DTV project server [18]. The original videosare provided as two separate raw sequences, calledviews (i.e., left and right views). These can be playedby specific YUV players (e.g., Yuv Player Deluxe orOpenEyeViwer). Table 1 shows the main properties ofthe sequences. Flower is a pop-out sequence of a flowerblooming. Bullinger is a single front shot of a mantalking, which might be representative of a videoconfer-ence. Finally, Rope is an animated video of a skippingrabbit. All of them have low spatial resolutions adaptedto the mobile screen. The nominal bitrate (NBR) shownin the table is the average bitrate per view ensuring aminimum PS NR = 40 dB for a reference codec (SbS inthis work).

3.2. Evaluated coding approaches

The analysis aims to compare several 3D videocodecs and rate reduction approaches used by networkand service providers. The considered coding schemesare SbS and MVC. At the same time, two rate reductionapproaches are evaluated. A first one is the classical ap-proach based only on a quantified scale. Alternatively,a second approach combines frame dropping and quan-tified scale [41]. Table 2 summarizes the 5 rate reduc-

4

Page 6: Performance assessment of three-dimensional video codecs ...webpersonal.uma.es/~TORIL/files/2018 CC 3D video codecs.pdf · ACCEPTED MANUSCRIPT ACCEPTED MANUSCRIPT Performance assessment

ACCEPTED MANUSCRIPT

ACCEPTED MANUSCRIP

T

Table 1: Characteristics of the original sequences per view.

Source Resolution Content Frame-rate Color sampling Bit depth Duration Nominal bitrateFlower 480x270 Pop-out flower 25 fps 4:2:0 8 16 s 200 kbps

Bullinger 432x240 Man talking 25 fps 4:2:0 8 16 s 200 kbpsRope 432x240 Skipping rabbit cartoon 25 fps 4:2:0 8 16 s 900 kbps

Table 2: Rate reduction approaches.

NR No reduction, used as a reference (e.g., Bullinger bitrate equal to nominal bit rate, 200 kbps)BR/ 2 Reduction of bitrate to 50 % only by quantified scale (e.g., Bullinger BR=100 kbps)BR/10 Reduction of bitrate to 10% only by quantified scale (e.g., Bullinger BR=20 kbps)

BR/2 & FR/2 Reduction of bitrate to 50% and discarding half of the frames (e.g., Bullinger BR=100 kbps)BR/10 & FR/4 Reduction of video bitrate to 10% and discarding 3 out of 4 frames(e.g., Bullinger BR=20 kbps)

tion approaches considered in the analysis (more pre-cisely, 4 reduction approaches and 1 reference method).When combined with the coding schemes, this resultsin 30 coded video sequences (= 3 sequences * 2 codingschemes * 5 rate reduction approaches).

Sequences are coded with the JM v18.5 ReferenceSoftware [42]. The reasons for selecting this tool arethat: a) it implements H.264-MPEG4 AVC SbS andMVC schemes, b) it is well proven and well docu-mented, c) it has an extremely flexible configuration bya comprehensive parameter set, and d) it is easy to con-trol via a parameter file.

Table 3 shows the values of the most relevant parame-ters that must be set in the JM configuration file. Pro-fileIDC determines the tools and capacities that corre-spond to SbS (i.e., H.264-MPEG4 /AVC mMain profileset by ProfileIDC=77) and MVC coding (i.e., Multi-view High Profile set by ProfileIDC =118). LevelIDCsets the memory requirements a decoder must fulfill todecode the bitstream (3.1 level in this work). Num-berOfViews sets the number of views (1 for SbS and 2for MVC). FrameSkip and Bitrate parameters define therate reduction strategies. FrameSkip reduces the framerate of the original video by a factor of 2 and 4 (i.e.,from 25 Hz to 12.5 and 6.25 Hz), whereas Bitrate fixes

Table 3: Coding parameters

Parameter Scheme SbS MVCProfileIDC 77 118LevelIDC 31 31

NumberOfViews 1 2FrameSkip 1,3 1,3

Bitrate NR,NR/2,NR/10 NR,NR/2,NR/10NumberBFrames 7 7

PrimaryGOPLength 8 8

the average bitrate to the nominal bitrate, half the nom-inal bitrate or 10% of the nominal bitrate of the particu-lar sequence. The last two parameters, NumberBFramesand PrimaryGOPLength , configure the GOP structure(i.e., 8-frame closed GOP, IBBBBBBBP).

3.3. Subjective Assessment

In this work, the Single-Stimulus (SS) approach de-scribed in ITU-R BT.500 [43] and ITU-T P.910 [44] isadopted. In SS, only the degraded sequence is presentedto the observer (i.e., the original sequence is not avail-able for comparison). Then, the observer provides aquality score of the entire presentation. SS is selectedas it is closer to how the mobile user perceives the videosequence in a real situation. Moreover, the limited view-ing angle of current autostereoscopic screens makes itvery difficult to display the reference sequence simulta-neously in a second screen.

3.3.1. ParticipantsA set of 31 observers (9 females and 22 males, ages

ranging from 18 to 45) partook in the subjective tests.All of them were naıve (i.e.,. non-expert) users and ful-filled the conditions of visual acuity, color vision andstereoscopic vision required by ITU-R BT.500 recom-mendation. No observer was rejected after the screeningof subjective results.

3.3.2. Environment and EquipmentThe evaluation took place in a controlled lab envi-

ronment. The mobile terminal used in the experimentswas an LG Optimus 3D P920, including an autostereo-scopic screen of 480x800 pixels implementing parallaxbarrier technology. Observers were placed 30-40 cmaway from the terminal to obtain an optimal viewingangle, since the technology used by the smartphone lim-its the angle where 3D is perceived correctly. To avoid

5

Page 7: Performance assessment of three-dimensional video codecs ...webpersonal.uma.es/~TORIL/files/2018 CC 3D video codecs.pdf · ACCEPTED MANUSCRIPT ACCEPTED MANUSCRIPT Performance assessment

ACCEPTED MANUSCRIPT

ACCEPTED MANUSCRIP

T

highlights on the screen that could hinder the observerevaluation, the mobile terminal was introduced inside ablack velvet box that blocked any disturbing light.

3.3.3. Test sessionThe observers were given the terminal with the full

set of 30 video sequences as independent files, so thatthese could be launched separately. Thus, the observercould launch each video sequence as many times as de-sired. To make voting easier, video sequences werepreviously edited to extend their length by reproducingthem four times in succession. The video player usedin the tests was the one provided by the manufacturer.Each session consisted of an accommodation stage, inwhich participants learned how to use the autostereo-scopic display and got used to impairments, followedby an evaluation stage, where participants rated the sub-jective quality of each video sequence. The total sessionlength was 30 minutes per observer, approximately.

3.3.4. Subjective quality indicatorsSubjective assessment is based on three indicators: a)

image quality, given by the lack of degradations (e.g.,blocking, blurring, staircase, tessellation, ...), b) depthquality, measuring how realistic is the reproduction ofdifferent planes in the scene or volume in objects, and c)visual comfort, measuring the level of naturalness, un-derstood as a lack of eyestrain. Each observer scores vi-sual quality, depth perception and visual comfort usingthe five-grade scale described in ITU-R BT.500 (Bad,Poor, Fair, Good and Excellent). Then, scores are trans-lated into three Mean Opinion Scores (MOS) from 1 to5 (i.e., 1 denotes bad quality and 5 denotes excellentquality). An overall MOS per coded sequence (i.e., pervideo sequence, codec and rate reduction approach) isthen calculated for each indicator by averaging MOSvalues from all observers. For such averages, 95% con-fidence intervals are computed.

3.4. Objective assessment

In this work, two simple objective video quality met-rics are used, as the focus is on subjective quality as per-ceived by the mobile user. The selected objective met-rics are full reference measurements, obtained by com-paring the degraded video sequence with the originalsequence.

The first one is the classical Peak Signal-to-Noise-Ratio (PSNR) for luminance [45], computed for a framek as

PS NR(k) = 10log10(MAXi

2

MS E) [dB] (1)

where MAX denotes the peak digital value for lumi-nance, given by the number of bits (i.e., 235 for 8-bitvideo formats [46]), and MS E(k) is the mean square er-ror of frame k, computed as

MS E(k) =

M∑i=1

N∑j=1

[ f (i, j, k) − F(i, j, k)]2

M · N , (2)

where f (i, j, k) and F(i, j, k) are the original and recov-ered luminance at pixel (i, j) in frame k, and M and Nare the number of pixels per line and lines per frame inthe video sequence, respectively.

Alternatively, a video quality metric can be ob-tained by computing the Multi-Scale Similar Structural-ity (MS-SSIM) indicator [47]. MS-SSIM is an ex-tended version of the SSIM indicator [48]. Both arebased on the fact that the human visual system is ex-tremely sensitive to the structural information from thescene, and therefore a measure of structural similar-ity should be a good approximation of perceived im-age quality. Structural similarity is obtained from thecomparison of the luminance, contrast and structure ofthe original and degraded images. The multi-scale vari-ant incorporates image details at different spatial reso-lutions. Similar to SSIM, MS-SSIM takes the form of aquality map showing the estimated quality of each pixelof the image. A mean value is then obtained by averag-ing across pixels and frames of the sequence. The readeris referred to [47] for more details about MS-SSIM.

In this work, PSNR and MS-SSIM are computed ona per-frame and per-view basis for each sequence withMSU Video Quality Measurement Tool (VQMT) [49].Then, average PSNR and MS-SSIM are computed persequence by averaging the values of the two views. Thelatter values are compared with the average MOS val-ues for image quality to check the correlation betweenobjective and subjective measurements.

4. Subjective assessment results

This section shows the results of subjective tests. Theanalysis is broken down by indicator to identify whichis the most effective coding scheme and rate-reductionapproach in terms of image quality, depth quality andvisual comfort.

4.1. Image quality

Table 4 presents the average MOS of image qualityfor the 3 sequences, 2 coding schemes and 5 rate re-duction approaches tested. To ease the analysis, aver-age values per rate reduction approach and sequence are

6

Page 8: Performance assessment of three-dimensional video codecs ...webpersonal.uma.es/~TORIL/files/2018 CC 3D video codecs.pdf · ACCEPTED MANUSCRIPT ACCEPTED MANUSCRIPT Performance assessment

ACCEPTED MANUSCRIPT

ACCEPTED MANUSCRIP

T

Table 4: Image quality MOS results

Sequence Bullinger Flower Rope Avg. per rate reduction approachApproach\ Codec SbS MVC SbS MVC SbS MVC SbS MVC

NR 3.08 3.85 3.77 4 3.65 4.15 3.5 4BR/2 3.08 3.58 3.85 3.81 3.58 4.12 3.5 3.83

BR/10 1.31 1.54 1.58 1.69 1.92 2.12 1.6 1.78BR/2 & FR/2 3.08 3.69 3.04 3.73 3 3.5 3.04 3.64

BR/10 & FR/4 2 2.35 2.08 2.12 2.04 2.04 2.04 2.17Avg. per sequence 2.51 3 2.86 3.07 2.84 3.18 2.54 2.86

Table 5: Depth quality MOS results

Sequence Bullinger Flower Rope Avg. per rate reduction approachApproach\ Codec SbS MVC SbS MVC SbS MVC SbS MVC

NR 2.76 2.52 4.04 3.56 3.68 3.52 3.49 3.2BR/2 3.16 3 4.08 3.48 3.24 3.32 3.49 3.27

BR/10 1.92 2.32 2.4 2 2.72 2.68 2.35 2.33BR/2 & FR/2 2.28 3.16 3.88 3.32 2.96 3.4 3.04 3.29

BR/10 & FR/4 2.2 2.52 2.72 2.2 2.84 2.68 2.59 2.47Avg. per sequence 2.46 2.7 3.42 2.91 3.09 3.12 2.87 2.84

Figure 3: Image quality for different coding approaches.

also computed by averaging across rate reduction ap-proaches and sequences (right and bottom of the table,respectively). In the table, it is clear that, as expected,MVC outperforms SbS for nearly all sequences and ratereduction approaches. Specifically, the overall averageMOS of image quality is 2.86 for MVC and 2.54 forSbS.

Figure 3 shows the average MOS for image qualityfor different rate reduction approaches and codecs byaveraging the values for the 3 sequences (i.e., last 2columns in Table 4). For completeness, error bars withthe 95th percentile are also superimposed. In the figure,it is first confirmed that MVC outperforms SbS in allrate reduction approaches. This is due to the high inter-view redundancy in normal 3D sequences, which is ef-

Figure 4: Image quality for same bitrate but frame-rate reduction.

fectively eliminated in MVC by differential coding. Onaverage, MVC outperforms SbS by 0.35 absolute MOSpoints, and the difference can be of up to 0.5 points.Secondly, it is observed that subjective image qualitydecreases when the bitrate decreases, as NR shows bet-ter MOS scores than BR and BR/10 approaches. Inparticular, note that a reduction of 50% of the bitrate(i.e., from NBR to NBR/2) causes a small degradationin image quality (e.g., 4% MOS decrease from NR toBR/2 with MVC). Additionally, a 10-fold rate reduction(i.e., BR/10) causes a large degradation of image qual-ity. Such a MOS decrease is larger for MVC, as MVCstarts from a better image quality score. More impor-tantly, it is seen in Figure 4 that for small compressionratios (i.e., rate reduction factor of 2), it is more effec-

7

Page 9: Performance assessment of three-dimensional video codecs ...webpersonal.uma.es/~TORIL/files/2018 CC 3D video codecs.pdf · ACCEPTED MANUSCRIPT ACCEPTED MANUSCRIPT Performance assessment

ACCEPTED MANUSCRIPT

ACCEPTED MANUSCRIP

T

Figure 5: Image quality for same bitrate but frame-rate reduction.

tive to rely only on the quantizer scale (i.e., BR/2) thanusing frame rate reduction also (i.e., BR/2&FR/2). Thisis possibly due to a less effective temporal predictioncaused by the loss of correlation between consecutiveframes when every 1 out of 2 frames is discarded. Incontrast, for large compression ratios (i.e., rate reduc-tion factor of 10), it is more effective to first reduce theamount of information by discarding multiple frames(i.e., BR/10&FR/4) rather than leaving all the work tothe quantizer (i.e., BR/10). From these results, it is in-ferred that, for large compression ratios, temporal deci-mation is preferred to very aggressive quantization.

Figure 5 breaks down image quality statistics per se-quence to identify differences among sequences. First,it is observed that the sequence with the lowest imagequality scores with the nominal bitrates (i.e., NR) isBullinger. Note that, even if Rope has more movement,recall that the nominal bitrate of Rope is 4.5 larger thanBullinger. For BR/10, Bullinger is also the worst se-quence. A more detailed analysis (not presented here)shows that this sequence suffers from pronounced blur-ring, blocking and freeze frames with large compres-sion ratios due to the continuous hand movement in thescene. Moreover, it is worth remarking that, for Rope,the difference between BR/10 and BR/10&FR/4 is al-most negligible. This result indicates that the impact ofthe frame rate reduction is less in Rope. This is possiblydue to the fact that, even if Rope has more movement,it has a lower spatial complexity as it is a synthetic se-quence (cartoon), unlike Bullinger, which is a real se-quence. All these trends are maintained for both SbSand MVC.

4.2. Depth quality

Table 5 presents the average depth quality MOSscores for the sequences, coding schemes and rate re-duction approaches tested. Again, average values per

Figure 6: Depth quality for different approaches.

rate reduction approach and sequence are computedby averaging across rate reduction approaches and se-quences.

Figure 6 shows MOS values for depth quality withthe different coding approaches. Again, it is observedthat the higher the bitrate, the better the depth sensa-tion. Likewise, for higher bitrates (i.e., lower compres-sion ratios), it is advantageous to rely only on quan-tization scale, whereas, for lower bitrates (i.e., highercompression ratios), it is better to also take advan-tage of frame-rate reduction. However, unexpectedly,MVC does not outperform SbS, but, on the contrary,SbS gives better depth quality than MVC for 3 out of5 of the rate reduction approaches. However, this isan indication that binocular disparity is more affectedby MVC errors. A more detailed analysis of the de-graded sequences shows that SbS sequences presentmore pronounced blocking artifacts in foreground ob-jects. In contrast, MVC sequences have better imagequality in general, but show noticeable artifacts (e.g.,blurring, staircase, oscillation, mosaic . . . [50]) in theborder of foreground objects (e.g., between the man andthe background in Bullinger, around the bunny in Ropeor around the flower in Flower), which explains the lossof depth perception.

Fig. 7 breaks down subjective depth quality resultsper sequence. It is observed that Bullinger sequencetends to have the worst depth quality in most coding ap-proaches. Unexpectedly, for Bullinger, SbS shows bet-ter depth quality than MVC. Nonetheless, care must beexercised in drawing conclusions from the comparisonof sequences, since depth perception is not only influ-enced by codec performance, but also by many otherfactors in the production process (e.g., scene content,inter-camera distance) or display process (e.g., screensize).

8

Page 10: Performance assessment of three-dimensional video codecs ...webpersonal.uma.es/~TORIL/files/2018 CC 3D video codecs.pdf · ACCEPTED MANUSCRIPT ACCEPTED MANUSCRIPT Performance assessment

ACCEPTED MANUSCRIPT

ACCEPTED MANUSCRIP

T

Table 6: Visual Comfort quality MOS results

Sequence Bullinger Flower Rope Avg. per rate reduction approachApproach\ Codec SbS MVC SbS MVC SbS MVC SbS MVC

NR 2.24 3.38 3.48 4.17 3.38 3.76 3.03 3.77BR/2 2.21 3.34 3.79 3.76 2.76 4.03 2.92 3.71

BR/10 1.66 2.79 2.28 2.41 2.45 3.14 2.13 2.78BR/2 & FR/2 1.83 3.79 2.79 3.83 2.83 3.45 2.48 3.69

BR/10 & FR/4 1.79 3.1 2.34 2.72 2.45 2.41 2.2 2.75Avg. per sequence 1.94 3.28 2.94 3.38 2.77 3.36 2.43 3.23

Figure 7: Subjective depth quality per sequence.

Figure 8: Visual Comfort for different approaches.

4.3. Visual Comfort

Table 6 presents the average visual comfort MOSscores for the sequences, coding schemes and rate re-duction approaches tested. In the table, it is clear thatMVC consistently gives more comfortable images thanSbS.

Similarly to previous figures, Figure 8 and 9 showvisual comfort results broken down by coding strat-egy and sequence, respectively. From Figure 8, it canbe concluded that discarding frames does not have anyimpact on visual comfort, as BR/10 is rated the sameas BR/10&FR/4. From Figure 9, it is deduced thatBullinger is the most uncomfortable sequence with SbS

Figure 9: Visual Comfort for difference codecs and rate reduction ap-proaches.

for all bitrates, whereas no sequence is more comfort-able than others for MVC. It is also worth noting thevisual discomfort caused by the elimination of framesin Rope with BR/10&FR/4.

4.3.1. Overall comparison of coding schemes

Figure 10 plots the values of three subjective indica-tors (image quality, depth quality and visual comfort)obtained by MVC and SbS, calculated as the averageMOS across rate reduction approaches and sequences.From the figure, it can be concluded that MVC performsbetter than (or, at least, similar to) SbS, especially for vi-sual comfort indicator (2.4 against 3.3). Only for depthquality, both schemes provide similar results. It shouldbe pointed that depth quality is heavily influenced by ex-ternal factors other than the compression method (e.g.,sequence content itself). For instance, low depth videosare often scored with smaller MOS values in spite of avery good coding method. In contrast, image quality ismainly affected by the compression process. Likewise,visual comfort is mainly given by compression issues,but is also affected by other factors.

9

Page 11: Performance assessment of three-dimensional video codecs ...webpersonal.uma.es/~TORIL/files/2018 CC 3D video codecs.pdf · ACCEPTED MANUSCRIPT ACCEPTED MANUSCRIPT Performance assessment

ACCEPTED MANUSCRIPT

ACCEPTED MANUSCRIP

T

Figure 10: Global subjective assessment.

Figure 11: Average peak signal-to-noise ratio for different codecs andrate reductions approaches.

Figure 12: PSNR assessment with sequence.

5. Objective assessment result

In this section, objective measurements are presented.These are restricted to image quality results in termsof PSNR and MS-SSIM, in the absence of a widely-accepted objective indicator for depth quality and visualcomfort. Average values are broken down by codingstrategy and video sequence as in subjective tests.

5.1. PSNR

Figure 11 displays average PSNR values for differentcoding strategies. It is observed that objective measure-ments for SbS and MVC strategies are almost identi-cal. This situation is due to the fact that coding per-formance for both codecs is similar when a high bitratereduction is applied, thus resulting in similar measureddistortion. Additionally, as shown in the figure, PSNRis only reduced by 1.5 dB from NR to BR/2 with bothSbS and MVC. However, for BR/10, PSNR is reducedby 7 and 21 dB with SbS and MVC, respectively. It isalso observed that frame-rate reduction approaches re-duce PSNR significantly.

Figure 12 displays average PSNR values for differentsequences and coding strategies. From the figure, it isdeduced that Rope is the sequence with lowest PSNRfor all bitrates.

A more detailed analysis can be made by checkingthe PSNR time evolution. Figure 13 breaks down thePSNR per frame for Rope sequence with SbS and MVCcodecs and rate-reduction approach BR/10 (i.e., nomi-nal bitrate of 900/10 = 90 kbps per view). It is ob-served that both codecs follow similar trends. Somefluctuations can be seen in both codecs. Long-termchanges are due to the video content (i.e., a skippingrabbit), whereas short term fluctuations are due to theGOP structure (i.e., size N=10 frames). MVC givesbetter PSNR than SbS for all frames. More importantly,PSNR deviations with MVC are only of 1.5 dB, whereasdeviations of up to 2 dB are seen with SbS. Similar re-sults are obtained for other sequences. Therefore, it canbe concluded that inter-view prediction not only reducescompression errors, but also leads to a more stable videoperformance.

Analogue plots can be derived for MS-SSIM. Forbrevity, only the analysis for different codecs and ratereduction approaches is presented in Figure 14. It is ob-served that, as with PSNR, MS-SSIM significantly de-grades with frame-rate reduction strategies.

5.2. PSNR vs MS-SSIM

A linear regression analysis is carried out to com-pare the value of both objective indicators (i.e., PSNRand MS-SSIM) over the set of tested 3D sequences andcoding approaches. In such an analysis, each samplecorresponds to a sequence with a certain codec andrate-reduction approach. Such a regression results inMS − S S IM = 0.0213PS NR + 0.162, with a deter-mination coefficient of R2 = 0.74. It can thus be con-cluded that both indicators give similar information inthe considered set of sequences and codecs, and, thus,

10

Page 12: Performance assessment of three-dimensional video codecs ...webpersonal.uma.es/~TORIL/files/2018 CC 3D video codecs.pdf · ACCEPTED MANUSCRIPT ACCEPTED MANUSCRIPT Performance assessment

ACCEPTED MANUSCRIPT

ACCEPTED MANUSCRIP

T

Figure 13: PSNR time evolution for Rope sequence with a tenfold bitrate (BR/10).

Figure 14: MS-SSIM assessment with bitrate.

only one indicator can be used from now on. Hereafter,only PSNR is considered.

6. Comparison of objective and subjective measure-ments

To fully characterize the 3D codecs, a comparisonbetween objective and subjective measurements is pre-sented. Figure 15 plots both the average PSNR andimage quality MOS values across observers for differ-ent coding strategies. Each of the 30 points in the fig-ure corresponds to a different sequence and coding ap-proach. It is observed that, when all samples are consid-ered for the regression, the determination coefficient islow (R2 = 0.23). A closer analysis reveals that the prob-lem is due to the samples with very low PSNR, on theleft of the figure, corresponding to sequences with re-duced frame rate. If these samples are discarded, PSNR

Figure 15: Comparison between objective and subjective image qual-ity.

and image quality MOS follows a linear regression withR2 = 0.59, and thus an estimate of image QoE could becomputed as MOS = 0, 1833PS NR − 3, 6682.

7. Conclusions

In this paper, a comprehensive analysis has been car-ried out to evaluate the Quality of Experience (QoE)obtained by 3D video codecs in mobile terminals. Theanalysis has covered the most widespread 3D video cod-ing formats, namely the frame-compatible Side-by-Side(SbS) and Multiview Video Coding (MVC) schemes.The analysis combines both subjective and objectivemeasurements. Subjective tests have been done by pre-senting 3D video sequences with different coding pa-rameters to users who judge image quality, depth per-

11

Page 13: Performance assessment of three-dimensional video codecs ...webpersonal.uma.es/~TORIL/files/2018 CC 3D video codecs.pdf · ACCEPTED MANUSCRIPT ACCEPTED MANUSCRIPT Performance assessment

ACCEPTED MANUSCRIPT

ACCEPTED MANUSCRIP

T

ception and visual comfort. For this purpose, a mobilephone with autostereoscopic screen has been used.

Results for image quality perception have shown thatH.264-MPEG4/MVC is the coding scheme that adaptsbetter to 3D multimedia content, since the processedvideos are clear and display no degradation at first sight.Moreover, MVC is more robust against more aggres-sive settings of coding parameters (frame-rate and bi-trate). On the contrary, H.264-MPEG4/AVC (corre-sponding to SbS method) videos presented blocking andbasis function artifacts. It has been shown that a reduc-tion in the video frame-rate decreases objective and sub-jective indicators, so that dropping a certain number offrames does not compensate for the negative effects ofan abrupt reduction of the video bitrate. As expected,MSSSIM measurement resembles more accurately sub-jective measurements, although PSNR can alternativelybe used for low encoding rates . It is also seen that SbSand MVC performance converge towards similar valuesfor low encoding bitrates.

In terms of depth quality perception, SbS affects spe-cially objects in the foreground while MVC tends to dis-tort the borders of these backgrounds, causing loss ofdepth perception. The variability in scores clearly indi-cates that the compression configuration parameters arenot the only factor affecting depth perception. For in-stance, screen size proves to be a relevant factor, sinceSbS mobile scores shown in this work are at least 2points lower than those obtained in similar studies inlarge TV screens [29]. For a better view of image depthin mobile devices, a higher coding profile and 3D con-tent more aggressive in terms of depth should be usedto compensate for the decrease of image quality and de-terioration of 3D experience when visualized in a hand-held. Finally, visual comfort results indicate that thebetter the image quality, the more comfortable the 3Dexperience of the observers is. However, it is unclearif other elements can influence visual comfort, such asambient light or display technology.

Acknowledgements

This research was supported by grant TIN2012-36455 and TEC2015-69982-R from the Spanish Min-istry of Economy and Competitiveness.

Conflict of interests

The authors have no competing interests

References

[1] Cisco, Cisco visual networking index. Forecast and methodol-ogy, 2014-2019, White paper, Cisco Systems Inc. (July 2015).

[2] Ericsson, Ericsson mobility report november 2015, Tech. rep.(January 2016).

[3] A. Tabaee, V. Anand, J. Fraser, S. Brown, A. Singh, T. Schwartz,3-dimensional endoscopic pituitary surgery, Neurosurgery. Op-erative Neurosurgery Supplement 2 (2009) 288–295.

[4] C. Hewage, M. Martini, N. Khan, 3D medical video transmis-sion over 4G networks (2011) 180.

[5] L. Unzueta, J. Goenetxea, M. Rodriguez, M. Linaza, Viewpoint-dependent 3D human body posing for sports legacy recoveryfrom images and video, Signal Processing Conference (EU-SIPCO) 2014, Proceedings of the 22nd European (2014) 361–365.

[6] S. Ballesta, G. Reymond, M. Pozzobon, J. Duhamel, A real-time 3D video tracking system for monitoring primate groups,Journal of neuroscience methods 234 (2014) 147–152.

[7] P. Didyk, P. Sitthi-Amorn, W. Matusik, F. Durand, W. Freeman,Joint view expansion and filtering for automultiscopic 3D dis-plays, uS Patent 9,756,316 (2017).

[8] N. Guo, X. Sang, S. Xie, P. Wang, D. Chen, C. Yu, A three-to-dense view conversion system based on adaptive joint viewoptimization, Multimedia & Expo Workshops (ICMEW), 2017IEEE International Conference on (2017) 91–96.

[9] M. Zapater, M. Nieblas, G. Bressan, Quality of Experience forVideo Services, IGI Global, US, 2009.

[10] A. Vetro, A. Tourapis, K. Muller, T. Chen, 3-d tv content stor-age and transmission, IEEE Transactions on Broadcasting 57 (2)(2011) 384–394.

[11] ITU-T, I. J. 1, H.264: Advanced video coding for generic audio-visual services (2010).

[12] ITU-T, H.265: High efficiency video coding (April 2015).[13] P. Aflaki, M. Hannuksela, M. Gabbouj, Subjective quality as-

sessment of asymmetric stereoscopic 3D video, EURASIP Jour-nal on Signal, Image and Video Processing 9 (2) (2015) 331–345.

[14] ITU-R report BT.2017 - stereoscopic television MPEG-2 multi-view (1999).

[15] D. Tian, P. Pandit, P. Yin, C. Gomila, Study of MVC codingtools, Joint Video Team JVT-Y044.

[16] Y. Chen, M. Hannuksela, T. Suzuki, S. Hattori, Overview of theMVC+D 3D video coding standard, Journal of Visual Commu-nication and Image Representation 25 (4) (2014) 679–688.

[17] L. Goldmann, F. D. Simone, T. Ebrahimi, A comprehensivedatabase and subjective evaluation methodology for quality ofexperience in stereoscopic video, Electronic Imaging (EI), 3DImage Processing (3DIP) and Applications 7526 (MMSPL-CONF-2009-021).

[18] D.-T. Group, Mobile 3D-TV content delivery optimizationgroup, 2016.URL http://sp.cs.tut.fi/mobile3dtv/

[19] ITU-R, Subjective methods for the assessment of stereoscopic3DTV systems (February 2015).

[20] R. Gonzalez, R.E.Woods, Digital image processing, Prentice-Hall, Inc., US, 2006.

[21] Z. Wang, A. Conrad, H. Sheikh, E. Simoncelli, Image qualityassessment: From error visibility to structural similarity, IEEETransactions on Image Processing 13 (4) (2004) 600–612.

[22] C. Lambrecht, O. Verscheure, Perceptual quality measure usinga spatio-temporal model of the human visual system, Proceed-ings of the SPIE 2668 (1996) 450–461.

[23] M. Pinson, S. Wolf, A new standardized method for objectively

12

Page 14: Performance assessment of three-dimensional video codecs ...webpersonal.uma.es/~TORIL/files/2018 CC 3D video codecs.pdf · ACCEPTED MANUSCRIPT ACCEPTED MANUSCRIPT Performance assessment

ACCEPTED MANUSCRIPT

ACCEPTED MANUSCRIP

T

measuring video quality, IEEE Transactions on Broadcasting50 (3) (2004) 312–322.

[24] C. Hewage, S. Worrall, S. Dogan, S. Villette, A. Kondoz, Qual-ity evaluation of color plus depth map-based stereoscopic video,IEEE Signal Processing 3 (2) (2009) 304–318.

[25] A. Benoit, P. Callet, P. Campisi, R. Cousseau, Quality assess-ment of stereoscopic images, EURASIP Journal on Image andVideo Processing 2008 (2009) 1–13.

[26] A. Boev, A. Gotchev, K. Egiazarian, A. Aksay, G. Akar,Towards compound stereo-video quality metric: a specificencoder-based framework, IEEE Southwest Symposium on Im-age Analysis and Interpretation (2006) 218–222.

[27] H. Schwarz, C. Bartnik, S. Bosse, H. Brust, T. Hinz, H. Lak-shman, P. Merkle, K. Muller, H. Rhee, G. Tech, M. Winken,D. Marpe, T. Wiegand, Extension of High Efficiency VideoCoding (HEVC) for multiview video and depth data, in: Im-age Processing (ICIP), 2012 19th IEEE International Confer-ence, 2013, pp. 205–208.

[28] P. Hanhart, N. Ramzan, V. Baroncini, T. Ebrahimi, Cross-labsubjective evaluation of the MVC+D and 3D-AVC 3D videocoding standards, 2014 Sixth International Workshop on Qualityof Multimedia Experience (QoMEX) (2014) 183–188.

[29] J. Gutierrez, P. Perez, F. Jaureguizar, N. Garcıa, Subjective studyof adaptive streaming strategies for 3DTV, 19th IEEE Interna-tional Conference on Image Processing (ICIP) (2012) 2265–2268.

[30] T. Chen, Y. Kashiwagi, Subjective picture quality evaluation ofMVC stereo high profile for full-resolution stereoscopic high-definition 3D video applications, Proceedings of IASTED Con-ference Signal Image Processing.

[31] A. Aksay, G. Akar, Evaluation of stereo video coding schemesfor mobile devices, 3DTV Conference: The True Vision - Cap-ture, Transmission and Display of 3D Video (2009) 1–4.

[32] P. Merkle, H. Brust, K. Dix, Y. Wang, A. Smolic, Adaptationand optimization of coding algorithms for mobile 3DTV, Tech.Rep. 216503, Mobile 3D-TV (2008).

[33] M. Nasralla, C. Hewage, M. Martini, Subjective and objectiveevaluation and packet loss modeling for 3D video transmissionover LTE networks, in: Telecommunications and Multimedia(TEMU), 2014 International Conference on, IEEE, 2014, pp.254–259.

[34] D. Strohmeier, S. Jumisko-Pyykko, K. Kunze, Open profiling ofquality as a mixed method approach to study multimodal expe-rienced quality, 8th European Conference on Interactive TV andVideo EuroITV.

[35] D. Strohmeier, G. Tech, On comparing different codec profilesof coding methods for mobile 3D television and video, Proceed-ings on 2nd Conference: 3D Systems and Applications, 3DSA.

[36] A. Vetro, T. Wiegand, G. Sullivan, Overview of the stereo andmultiview video coding extensions of the H.264/MPEG-4 AVCstandard, Proceedings of the IEEE 99 (4) (2011) 626–642.

[37] G. Su, Y. Lai, A. Kwasinski, H. Wang, 3D Visual communica-tions, John Wiley and Sons, US, 2013.

[38] Z. Chen, X. Zhang, Y. Xu, J. Xiong, Y. Zhu, X. Wang, MuVi:Multi-view video aware transmission over MIMO wireless sys-tems, IEEE Transactions on Multimedia.

[39] M. Urvoy, M. Barkowsky, R. Cousseau, Y. Koudota, V. Ri-cordel, P. L. Callet, J. Gutierrez, N. Garcıa, Subjective videoquality assessment database on coding conditions introducingfreely available high quality 3D stereoscopic sequences, IEEEon Quality of Multimedia Experience (QoMEX) (2012) 109–114.

[40] M. Pinson, M. Barkowsky, P. Callet, Selecting scenes for 2D and3D subjective video quality tests, EURASIP Journal on Imageand Video Processing 2013 (1) (2013) 50.

[41] J. Gutierrez, P. Perez, F. Jaureguizar, J. Cabrera, N. Garcıa, Sub-jective study of adaptive streaming strategies for 3DTV, in: Im-age Processing (ICIP), 2012 19th IEEE International Confer-ence on, IEEE, 2012, pp. 2265–2268.

[42] H. F. Institute, Developer group of JM 18.5 coding/decodingtool, 2017.URL http://iphome.hhi.de/suehring/tml/download/

[43] ITU-R, Methodology for the subjective assessment of the qual-ity of television pictures (January 2012).

[44] P. ITU-T RECOMMENDATION, Subjective video quality as-sessment methods for multimedia applications.

[45] R. C. Gonzalez, R. E. Woods, et al., Digital image processing(1992).

[46] S. Wolf, Objective video quality measurement using a peak-signal-to-noise-ratio (PSNR) full reference technique, T1, Tech.rep., TR.

[47] Z. Wang, E. P. Simoncelli, A. C. Bovik, Multiscale structuralsimilarity for image quality assessment, in: Signals, Systemsand Computers, 2004. Conference Record of the Thirty-SeventhAsilomar Conference on, Vol. 2, IEEE, 2003, pp. 1398–1402.

[48] Z. W. Ligang, Z. Wang, L. Lu, A. C. Bovik, Video quality as-sessment using structural distortion measurement, in: in Proc.IEEE Int. Conf. Image Proc, 2002.

[49] MSU-VQMT, Analysis tool for objective image quality assess-ment, 2017.URL http://compression.ru/video/quality measure/video measurement tool en.html

[50] M. Yuen, H. Wu, A survey of hybrid MC/DPCM/DCT videocoding distortions, Signal processing 70 (3) (1998) 247–278.

13