
Signal Processing: Image Communication 29 (2014) 247–256

http://dx.doi.org/10.1016/j.image.2013.09.010

Evaluation of HDR video tone mapping for mobile devices

Miguel Melo a,b, Maximino Bessa a,b, Kurt Debattista c, Alan Chalmers c,d

a Universidade de Trás-os-Montes e Alto Douro, Portugal
b INESC Porto, Porto, Portugal
c WMG, University of Warwick, United Kingdom
d goHDR Ltd., United Kingdom

Article info

Available online 12 October 2013

Keywords: HDR video; TMO evaluation; Mobile devices


Abstract

Tone mapping operators (TMOs) allow the visual appearance of High Dynamic Range (HDR) video to be reproduced on Low Dynamic Range (LDR) displays. While several studies have compared the performance of different TMOs on standard displays, there is only one preliminary study that takes into account the characteristics of Small Screen Devices (SSDs), which are common on mobile platforms that are growing rapidly in popularity. Due to the unique characteristics of mobile devices, the variety of viewing conditions in which they are used, and the fact that they are becoming so widespread, it is important to identify the best method for delivering HDR video content to these devices.

This paper presents an evaluation of HDR video on mobile devices using an HDR display as reference. Six state-of-the-art HDR video tone mappers are evaluated in a psychophysical experiment in which participants were asked to rank the tone mappers applied to different HDR video footage. Tone mapped HDR video footage shown on a tablet and on an LCD display was compared with the same HDR video footage shown simultaneously on an HDR display. This study shows that there is a subtle but statistically significant difference in the preference of TMOs between the SSD and the LDR display, even though the overall ordering of TMOs is the same across both displays.


1. Introduction

High Dynamic Range (HDR) imaging is able to capture, store, transmit and deliver real-world lighting. This is a step-change compared to conventional imaging technology. HDR can benefit all aspects of digital imaging, including diverse areas such as security, entertainment, art, scientific research and health. With HDR techniques, it is possible to deliver an enhanced viewing experience to users by providing the full dynamic range that the Human Visual System (HVS) can perceive at any level of adaptation. To achieve this it is necessary to ensure that the HDR content is preserved from the moment of capture until its subsequent display.

While it is possible for those who have access to an HDR display to reap the benefits of HDR content, the majority of available displays are not HDR but standard or low dynamic range (LDR) displays. This means that it is necessary to apply a Tone Mapping Operator (TMO) to the HDR content in order to preserve, to the extent the hardware allows, the perception of HDR on these displays. Several TMOs have been proposed that take into consideration human visual perception or image characteristics, but only a few of them have been designed to deal with HDR video.

As mobile or small screen devices (SSDs) are rapidly becoming the leading platform for the consumption of multimedia content [1], there is an urgent need to ensure an optimal experience when viewing HDR content on typical mobile device screens. Current mobile device screens raise new concerns that traditional displays do not. In particular, a mobile device can be exposed to different ambient lighting, the viewing angle and distance are considerably different, and, of course, they are smaller than traditional desktop displays. Although the area and resolution of mobile devices have increased significantly in the last few years, the majority of mobile device displays are still less powerful than conventional displays, resulting in quality disparities when reproducing content on SSDs. Due to the wide variety of screens available on modern SSDs, the retargeting of the contents can also be challenging [2].

This paper presents an evaluation of HDR video tone mappers for LDR displays and SSDs. It is novel because it is, to the best of our knowledge, the first evaluation of this type to assess tone mapped HDR video for LDR displays. It is also only the second paper to evaluate TMOs on SSDs, after the work of Urbano et al. [3]; however, unlike the work presented here, Urbano et al.'s evaluation only took into account tone mapped still images, not HDR video. In this paper we present the results of an evaluation of six HDR video TMOs. The TMOs evaluated were: the spatio-temporal TMO based on a retina model [4], the model of visual adaptation [5], the method proposed by Van Hateren [6] that encodes HDR video using a model of human cones, the display adaptive technique [7], the time-dependent visual adaptation approach [8], and the TMO proposed by Boitard et al. [9] that takes temporal coherency into account. This evaluation aims to determine whether participants' preference for a tone mapper is the same on a 37″ LDR display and on SSDs, and whether the preferences differ between the two display types.

2. Related work

TMOs can be divided into two categories: global operators and local operators [10]. Global operators are spatially invariant and they process the image as a whole, mapping all the pixels of an image equally. These operators use image statistics to optimize the dynamic range reduction. Some common statistics used are the maximum luminance and logarithmic or arithmetic averages. Global operators are simple and fast and they preserve the global contrast of the image, but are unable to maintain the local contrast of the image, which can result in a loss of detail in some regions. The Model of Visual Adaptation TMO [5], the Brightness Reproduction TMO [11], the Quantization Techniques TMO [12], and the Histogram Adjustment TMO [13] are examples of global TMOs.
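To make the distinction concrete, the following minimal sketch (in Python with NumPy) applies a purely global mapping in the spirit of the global step of the photographic operator [16]: a single curve, driven only by the image's log-average luminance, is applied identically to every pixel. The function name and the key value of 0.18 are illustrative choices for this sketch, not part of any of the evaluated operators.

```python
import numpy as np

def global_tone_map(luminance, key=0.18, eps=1e-6):
    """Map an HDR luminance image to [0, 1) with one spatially invariant curve.

    Every pixel is treated identically; the only image-dependent quantity is
    the log-average (geometric mean) luminance used to scale the frame.
    """
    log_average = np.exp(np.mean(np.log(luminance + eps)))  # global statistic
    scaled = key * luminance / log_average                  # scale to a target "key"
    return scaled / (1.0 + scaled)                          # compress highlights
```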

Local operators, on the other hand, are applied differently to each pixel, taking into account a set of surrounding pixels to perform the calculation for that pixel. One of the advantages of local operators is that they attempt to preserve both global and local contrast, which can lead to better image quality since the HVS is sensitive to local contrast. The major drawbacks of local TMOs are that they are typically complex and thus take more computational effort (and therefore may not be feasible for mobile devices), and they can introduce artifacts, for example halos [10]. Some examples of local TMOs are the Spatially Non-uniform Scaling TMO [14], the Multi-scale Model of Adaptation and Spatial Vision for Realistic Image Display TMO [15], the Photographic Tone Reproduction TMO [16], and the Tone Mapping Algorithm for High Contrast Images TMO [17].

In addition, TMOs for HDR video also need to take temporal coherency into account [4,7,9]. This is because significant luminance changes may occur in a sequence of frames and, if the frames are processed individually, this may cause noticeable flickering, since the processing of a single frame does not take into consideration the luminance levels of the adjacent frames or of the overall video.
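As an illustration of the idea (and not of any specific published operator), the sketch below tone maps a sequence with the same global curve as above but low-pass filters the per-frame statistic over time, so that adjacent frames are mapped with coherent parameters and frame-to-frame flicker is reduced. The smoothing factor alpha is an arbitrary choice for this sketch.

```python
import numpy as np

def tone_map_sequence(frame_luminances, key=0.18, alpha=0.1, eps=1e-6):
    """Globally tone map each frame, smoothing the log-average luminance over
    time so sudden changes in frame statistics do not cause flicker."""
    smoothed = None
    mapped = []
    for lum in frame_luminances:
        log_average = np.exp(np.mean(np.log(lum + eps)))
        # Temporal low-pass filter of the global statistic.
        smoothed = log_average if smoothed is None else \
            (1.0 - alpha) * smoothed + alpha * log_average
        scaled = key * lum / smoothed
        mapped.append(scaled / (1.0 + scaled))
    return mapped
```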

2.1. TMO evaluation

As a large number of TMOs have been proposed over two decades, some form of evaluation is needed to determine which TMOs perform better under different circumstances. To carry out such an evaluation, two approaches are possible: error metrics and psychophysical experiments.

Error metrics are generally thought to be an objective measurement. They are based on theoretical models and use computers to compare images. For example, the comparison can be made based on differences in individual pixel values, or using metrics that simulate the HVS in order to identify which aspects of the image would be perceived by the HVS [10]. A problem with this approach is that the HVS is complex and hard to simulate, although some studies such as [18] are making progress in understanding this complexity. The most used techniques of this type are the VDP (Visual Difference Predictor) [11] and HDR-VDP, which predicts visible differences in HDR images [19]. One of the issues with using objective evaluations is that there are as yet no reliable video comparison methods, particularly for HDR.

Psychophysical experiments, on the other hand, are subjective and based on studies in which participants are asked to give feedback about a comparison of images. Typically these experiments take place in a room where the experimental team has control of the environmental conditions. The evaluation can be made based on different rating methods. One possible approach is a pairwise comparison, where the participant is confronted with one reference image and two other images of the same scene in order to determine which of these two images reproduces the content most similarly to the reference [10]. Another possibility, which is the one we used in our experiments, is to have the participants rank the tone mapped images against a reference. Previous TMO evaluations that have used psychophysical experiments include Drago et al. [20], Ledda et al. [21], Yoshida et al. [22], Kuang et al. [23], Čadík et al. [24] and Petit et al. [25].

The studies conducted by Drago et al. [20] were among the first that aimed to evaluate TMOs with regard to similarity and preference. Four different scenes were considered and seven TMOs were compared. Eleven participants undertook a pairwise comparison of all the possible combinations of the TMOs applied to the four scenes. The comparison was made by showing each participant two images with different TMOs applied, each one on a CRT screen.


In 2005, Ledda et al. [21] were the first to use an image displayed on an HDR monitor as a reference. In addition to evaluating six TMOs, the authors also introduced a new methodology for comparing images. The six TMOs were applied to 23 images and examined by 18 participants. The experiments consisted of a paired comparison in which the participant sat in front of an HDR monitor (the reference) that had two LDR displays, one on each side, reproducing the same image as the reference but with a different TMO applied to each. In this study no correlation was found between TMOs and test scenes.

Yoshida et al. [22] undertook a set of experiments that consisted of a comparison between HDR images displayed on LDR displays and the real-world scene. The comparisons were made using seven different TMOs. The results showed that local TMOs were preferred over global TMOs. Other conclusions of the study were that the seven TMOs were perceived very differently in terms of all of the attributes when compared to their corresponding real-world views, and that none of the image appearance attributes had a strong influence on the perception of naturalness.

Kuang et al. [23] conducted studies to evaluate the preference and accuracy of HDR rendering algorithms. These studies involved 33 participants, considered six TMOs, and were divided into three experiments. The first experiment used a paired comparison, the second a ranking scale, and the third used a ranking scale to compare the real-world scene with the tone mapped image.

Čadík et al. [24] used a ranking-based experiment with a real-world scene as reference and a similar experiment without the reference. These studies evaluated 14 TMOs on three different scenes. One finding of this work was that almost all of the studied image quality attributes could be evaluated without comparison to a real HDR reference.

Petit et al. [25] is one of the few works that considered HDR video sequences, and it investigated images within virtual environments. These studies consisted of two experiments that tested TMOs not on predetermined scenes but within a virtual environment with dynamically changing environmental conditions. In the first experiment five computer generated HDR videos were tone mapped using eight different TMOs. Fourteen participants had to decide which TMO they felt was more realistic without having a reference. For the second experiment nine HDR photographs were tone mapped with the same operators and presented to 13 further participants. The participants had to rate the realism of each tone mapped photograph on a scale of 1–10, having the real-world scene as reference.

Urbano et al. [3] evaluated several TMOs on displays of different sizes. The experiment consisted of a pairwise comparison of tone mapped images with a real scene as reference. Three displays were used: two 17″ displays and one 2.8″ display. The aim was to analyse the rankings for each display and show that for small displays the rankings are different, and thus different TMOs need to be developed for mobile devices. The major findings of this work were that user preferences for TMOs change according to the display that is reproducing the HDR sequences, and that for SSDs users prefer stronger detail reproduction, more saturated colours, and an overall brighter image appearance.

Most of the TMO evaluations presented above address still images; only [25] used video clips, which were computer generated and for specific scenarios. Furthermore, the only previous experiment that was conducted using an HDR display as reference was [21], while the other evaluations used the real scene as reference, or no reference at all. The experiments presented in this paper benefit from using an HDR display as reference and address HDR video tone mapping with state-of-the-art TMOs.

The experiment described in this paper was preceded by a preliminary study [26] that considered the same six TMOs using the same HDR video footage. In that study the users had to rank the different tone mapped HDR videos shown on a tablet, with an HDR display as a reference. Its goal was to gather initial data about the preferred TMOs for watching HDR video on mobile devices.

2.2. Tone mapping operators considered

The six TMOs used in our evaluation are the following:

Benoit (Ben): The spatio-temporal TMO proposed in [4] is based on a model of the retina's local adaptation properties developed by Meylan et al. [27] and is complemented by spatio-temporal filters of the retina. This work simulates some of the foveal retina functionalities and includes temporal coherency, and thus is able to avoid flicker.

Boitard (Boi): This work [9] investigated temporal coherency for video tone mapping that preserves the overall contrast of the video. To achieve this, the authors considered the perceived consistency of an object throughout the video and strived to preserve the temporal overall contrast consistency. The TMO processes a frame in two steps. The first step processes each frame of the video individually, while the second step adjusts the luminance of each frame taking into account the lighting within the whole HDR video. This algorithm can be used in conjunction with any TMO as it works as a post-processing step.

Ferwerda (Fer): The TMO developed by Ferwerda et al. [5] is based on a model of visual adaptation derived from psychophysical experiments that considered various aspects of the HVS such as visibility, visual acuity and adaptation to illumination changes. This operator uses TVI functions for modelling photopic and scotopic vision. The mesopic range is achieved by a linear combination of photopic and scotopic vision.

Hateren (Hat): This TMO [6] is based on a model of human cones and takes advantage of their dynamic response characteristics. The TMO performs two steps. The first combines the dynamic non-linearities, while the second reduces noise through low-pass filtering that adapts to the scene luminance.

Mantiuk (Man): The TMO proposed in [7] aims to minimise the visible contrast distortions for a wide range of devices, based on a model of the HVS. The TMO takes into account the environment luminance levels and the display characteristics, such as the peak luminance of the display or the reflectivity of the screen. With these parameters it is possible to calibrate the tone mapping process in order to optimise the results for different viewing situations.

Pattanaik (Pat): Known as the time-dependent visual adaptation TMO [8], this method takes into consideration that the HVS does not adapt instantly to large changes in luminance intensity. The TMO includes these appearance changes to match the user's visual responses, so that he/she experiences the displayed scene as he/she would the real-world scene.

3. Experimental framework

The psychophysical experiments in this paper consisted of ranking seven HDR video sequences, tone mapped by the six TMOs presented in the previous section, against a reference of the same HDR video footage shown on an HDR display. The tone mapped footage was shown on a standard LDR display or on a tablet, representing the SSD, to determine whether the rankings differed between the displays.

3.1. Method

For this experiment, ranking was chosen over rating and pairwise comparison. This is because the experiments intended to achieve clear and timely results regarding the preference order of the TMOs. With rating this may be compromised, since it could result in a narrow distribution of ratings; furthermore, the ratings may depend on the response styles of the participants, since participants may be predisposed to rate all items equally or, on the contrary, to give only low or high scores. While normalisation does alleviate some of these issues, it may not be appropriate to normalise by the minimum and maximum values of each participant, as it is not always clear what the absolute thresholds of each participant are. Although pairwise comparisons could be used to infer ranks, they require significantly more time due to the large number of comparisons needed to reproduce results similar to ranking (with six TMOs, a complete pairwise design already requires 15 comparisons per scene). Ranking was thus chosen as the method that could best report TMO preference order, since it guarantees that each ranked item has a unique value through the ranking of all tone mapped HDR video footage for each scene.

The experiment consisted of three independent variables: the six chosen TMOs, the displays and the scene groups. In total four groups of participants were used. The display was a between-participants variable. The participants were divided into four groups: two for each display (LDR, SSD) crossed with two for the choice of scenes. Half the participants used the 37″ LDR display and the other half the SSD. Furthermore, the scenes – another between-participants variable – were divided between the two groups assigned to each display, such that under each display condition one group viewed four of the scenes and the other group viewed the remaining three. This choice was made to control for participant fatigue and also to analyse whether the results were consistent across scenes. All participants viewed and ranked all six TMOs.

Fig. 1. Experimental setup scheme.

Fig. 1 shows the overall configuration. Both display groups had an HDR display as reference. In the SSD group a tablet was placed on a table at a distance of 2.5 m from the HDR display (A). For the LDR display group, the LDR display was placed at the left side of the HDR display (B). In both cases the participant stood at a distance of 2.5 m from the HDR display.

The video sequences shown during the experiments could be played at the same time on the LDR/SSD display and on the HDR display; this may lead to a bias towards giving preference to those TMOs that generate brighter footage, as more luminous content is known to be preferred [28]. To minimise the impact of such an issue, the question asked of the participants was “which video sequence is closest to the reference?” rather than “which video sequence do you prefer?”. In the results section we present a test of the correlation between the preferred TMOs and the average luminance of the tone mapped footage, to understand whether this bias influenced the preferences.

3.2. Materials

The reference HDR display used in the experiments was a 37″ DR37-P from Brightside [29]. The LDR display was a Westinghouse [30] 37″ LCD, which uses the same front panel as the DR37-P, and the tablet was an Apple iPad 4 [31]. The viewing angle of the HDR display was around 40° horizontal and 15° vertical, while the LDR display values were approximately 160° horizontal and 120° vertical. The tablet viewing angle is approximately 175°. Table 1 lists the technical specifications of each of the displays used.

The HDR video footage was labelled “CGRoom”, “Jaguar”, “Kalabsha”, “Morgan Lovers”, “Explosion”, “IDLWedding”, and “Medical” (Fig. 2). Three of the clips (“Morgan Lovers”, “IDLWedding” and “Medical”) were shot with a Spheron HDRv [32], which is capable of capturing 20 f-stops at full HD resolution and 30 frames per second; one clip (“Explosion”) was captured using a Canon 5D (12 f-stops), one clip (“Jaguar”) was captured using a Canon 1D Mark II (14 f-stops), and two (“CGRoom” and “Kalabsha”) were computer generated (20 f-stops). Each video was tone mapped by the six different TMOs.

Table 1
Technical specifications of the displays used in the experiments.

Parameters           HDR display    LDR display    Tablet
Brand                Brightside     Westinghouse   Apple
Model                DR37-P         LVM-37w1       iPad 4 (A1458)
Size                 37″            37″            9.7″
Resolution           1920×1080      1920×1080      2048×1536
Contrast ratio       200 000:1      1 000:1        877:1
Maximum luminance    4000 cd/m²     550 cd/m²      476 cd/m²
Minimum luminance    0 cd/m²        0.55 cd/m²     0.48 cd/m²

CGRoom: A scene in a garage in which a barrel falls from a tall shelf that is in the dark and, after falling, rolls through a well-lit part of the scene. This clip is 7 s long and has an average Dynamic Range (DR) of 16.6 f-stops (a sketch of how such an average DR can be measured follows the scene descriptions).

Jaguar: This footage has a duration of 13 s and an average DR of 13.4. It shows a Jaguar E-type car, with the camera moving from the back to the front. The video has spots with high luminance levels due to the spotlights pointed directly at the car, together with some dark areas.

Kalabsha: This scene is a high-fidelity HDR computer graphics reconstruction of the ancient Egyptian temple of Kalabsha. The rendering was done to enable Egyptologists to explore how the site may have appeared in the past, before the temple was moved in 1963 to avoid it being submerged by Lake Nasser. The average DR of this clip is 18.5 and it is 11 s long.

Morgan Lovers: This is the first short film that was shot, manipulated and displayed entirely in HDR. It was a joint venture between the University of Warwick, goHDR, Vermillion films, Morgan cars, the Boultbee Flight Academy, and Vintage Flyers, Cotswold Airport. This footage is 15 s long and has an average DR of 19.5.

IDL Wedding: This clip was taken at the International Digital Laboratory located at the University of Warwick and shows a wedding couple exiting the building. The captured dynamic range enables all the details of the bride's dress to be distinguished even as she steps into the bright sunshine, and allows the person behind the glass door, and the reflection of the camera crew, to be seen. The measured DR for this clip was 18.5 and the clip is 10 s long.

Explosion: This footage was shot with a Canon 5D Mark II. It shows two people running before an explosion occurs. It is a scene from the short film “Delivery Boy”, a joint venture between the University of Warwick, goHDR, and Entanglement Productions. The average DR of this video sequence is 12 and it is 8 s long.

Medical: This scene consists of footage of thoracic surgery, shot over the shoulder of the surgeon with no additional lighting apart from the theatre lights. The film was a joint endeavour between the University of Warwick, goHDR, and Heartlands Hospital's Medical Media unit and Thoracic Surgery team. This clip is 14 s long and has an average DR of 15.2.
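The average dynamic range figures quoted above are expressed in f-stops. The paper does not spell out how they were measured, so the sketch below is only one plausible reading: the per-frame range is taken as log2 of the ratio between the brightest and darkest luminances, with percentile clipping (an assumption of this sketch) to ignore isolated outlier pixels, and then averaged over the clip.

```python
import numpy as np

def average_dynamic_range_stops(frame_luminances, low_pct=1.0, high_pct=99.0):
    """Average per-frame dynamic range in f-stops: log2(Lmax / Lmin),
    using robust percentiles rather than absolute min/max (an assumption)."""
    stops = []
    for lum in frame_luminances:
        l_min = max(np.percentile(lum, low_pct), 1e-9)  # avoid log2 of zero
        l_max = np.percentile(lum, high_pct)
        stops.append(np.log2(l_max / l_min))
    return float(np.mean(stops))
```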

The experiments took place in a room with controlled luminance levels, with a value of 55 lux, which corresponds approximately to the ambient luminance levels of a family living room [33]. The participants were placed at approximately 2.5 m from the 37″ displays. The displays were placed side by side and at the same height. The participants stood at the table where the tablet was. The iPad was at approximately 45 cm from the participants. A mouse was used to control the experimental software on the computer connected to the LDR display. The participants were given time to adapt to the ambient luminance of the room.

Fig. 2. Thumbnails of the videos used for the experiments.

The settings of the displays were set to their defaults. The LDR display was placed on the left of the HDR display at the same distance and at the same height, with a slight angle between them in order to optimise the viewing position. The tablet was placed where the user stood, centred between the two 37″ displays (Fig. 1). The tablet was placed on a stand that formed an angle of approximately 45°, but the participant was given the freedom to place the tablet in the most comfortable position for them. Apart from the tone mapping, no retargeting was done between the footage of the HDR display and the LDR display, since the panel of the LDR display is identical to that of the HDR display. To ensure a similar viewing experience of the HDR video content between the larger displays and the tablet, additional adjustments were required. The tablet used has a retina display with a scale of two (a pixel is represented by 2 points), so a scale factor of 2 was used. To overcome the difference between the resolution of the footage (1920×1080) and the tablet's native resolution (2048×1536), the resolution of the footage was maintained and the videos were centred on a black background.
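A minimal sketch of the centring adjustment described above, assuming frames are held as NumPy arrays: the 1920×1080 tone mapped frame is placed, unscaled, on a black canvas matching the tablet's native 2048×1536 resolution.

```python
import numpy as np

def centre_on_tablet_canvas(frame, canvas_w=2048, canvas_h=1536):
    """Letterbox a tone mapped frame onto a black canvas without rescaling it."""
    h, w, channels = frame.shape
    canvas = np.zeros((canvas_h, canvas_w, channels), dtype=frame.dtype)
    y0 = (canvas_h - h) // 2   # 228 rows of black above and below a 1080-row frame
    x0 = (canvas_w - w) // 2   # 64 columns of black either side of a 1920-column frame
    canvas[y0:y0 + h, x0:x0 + w] = frame
    return canvas
```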

Regarding the settings used for the TMOs: for Ben the configurable parameters are the histogram clipping value (h = 0), the colour saturation factor (s = 3), the retina horizontal cells gain (βh = 40), the local adaptation of the photo-receptors (Rp,0 = 193) and the local adaptation of the ganglion cells (Rg,0 = 186). Boi was used with gamma correction (γ = 2.2) on top of the Ramsey et al. TMO [34], since that is the one the authors advise using. Fer was used with the configurable parameters set to 200 cd/m² and 50 cd/m² for the maximum display luminance and the adaptation luminance, respectively. Hat was used with the fixed pupil area (pa = 10 mm²), the calcium feedback constant (aC = 9×10⁻²), the gain (kβ = 1.6×10⁻⁴), and the residual activity (Cβ = 2.8×10⁻³). Man was applied using the “LCD profile”, which is the profile that best describes the environmental conditions of the experiments; the parameters for this profile are the gamma correction (γ = 2.2), the maximum display luminance (Lmax = 200), the black level of the display (Lblack = 0.8), the reflectivity of the display (k = 0.01) and the ambient illumination (Eamb = 60). For Pat the global version of the TMO was used, since the local version cancels the time-dependent effects, and the adaptation levels for cones and rods were calculated using the average luminance.
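For reference, the reported settings can be collected in a small configuration structure such as the one sketched below. The keys are informal labels chosen for this sketch; they are not the option names of any particular tone mapping implementation.

```python
# Parameter values as reported in the text; names are illustrative only.
TMO_SETTINGS = {
    "Ben": {"histogram_clipping": 0, "colour_saturation": 3, "beta_h": 40,
            "Rp_0": 193, "Rg_0": 186},
    "Boi": {"gamma": 2.2, "base_tmo": "Ramsey et al. [34]"},
    "Fer": {"max_display_luminance_cd_m2": 200, "adaptation_luminance_cd_m2": 50},
    "Hat": {"pupil_area_mm2": 10, "a_C": 9e-2, "k_beta": 1.6e-4, "C_beta": 2.8e-3},
    "Man": {"profile": "LCD", "gamma": 2.2, "L_max": 200, "L_black": 0.8,
            "reflectivity_k": 0.01, "E_amb": 60},
    "Pat": {"variant": "global", "adaptation": "average luminance"},
}
```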

3.3. Participants

A total of 60 participants, 34 men and 26 women, with ages between 19 and 28 years, were randomly assigned between the LDR and SSD groups and between the seven scenes (half of them evaluated the first four scenes and the other half the last three). There were a total of 15 evaluations for each video on each device. All the participants reported normal or corrected-to-normal vision.

3.4. Procedure

Fig. 3. Screenshot of the experimental software.

The participants were asked to view the six tone mapped videos for the scenes they were allocated (four or three). Each scene was presented to the participant as a set of thumbnails of the tone mapped content. The thumbnails of the tone mapped content were presented on the left side of the HDR screen and a thumbnail of the reference video in the centre of the screen. On the right side of the screen there were six empty slots numbered from 1 to 6. Fig. 3 shows the software running for one of the scenes. The participants were asked to drag and drop the thumbnails of the tone mapped HDR videos according to how similar they thought each was to the reference. When a participant double-clicked on any of the thumbnails, the corresponding tone mapped video played on the LDR display at the same time as the HDR video played on the HDR display. The participants were allowed to look at any of the tone mapped footage, as often as required, before taking a final decision on any given scene. At any point they could focus on either the LDR/SSD or the HDR display directly. After making their choices they would proceed to the next scene until they had performed the ranking procedure for all the allocated scenes. Before starting the evaluations each participant was allowed a few minutes to adapt to the ambient luminance of the room. Furthermore, to avoid visual discomfort between scenes due to the variation of the lighting conditions on the displays, a grey image was shown to help the eyes adjust.

4. Results

In this section the results of the study are presented.Initially the results averaged across all the scenes arepresented. These include results for the different displays.Subsequently the results are presented for each scene.

To calculate the main effects, the results were collapsed across the video sequences by averaging the values across the seven video sequences. The main effects of display (mobile or LDR), TMO (Ben, Boi, Fer, Hat, Man, Pat) and group (the first group, which saw four video sequences, and the second group, which saw three video sequences) were analysed using a 2 (group) × 2 (display) × 6 (TMO) mixed-design factorial ANOVA. TMO was a within-participants variable, and group and display were between-participants variables. Furthermore, Kendall's coefficient of concordance W was used to give an estimate of agreement amongst participants. For this statistic, W = 1 signifies perfect agreement amongst participants and W = 0 complete disagreement.
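As an illustration of how the agreement statistic can be obtained from the raw data, the sketch below computes Kendall's W (without a tie correction, since each participant assigns every rank exactly once) from a participants-by-TMO matrix of ranks; the mean rank per column also gives the preference order. The example matrix is hypothetical.

```python
import numpy as np

def kendalls_w(ranks):
    """Kendall's coefficient of concordance for a (participants x items) rank
    matrix with no ties: W = 12*S / (m^2 * (n^3 - n))."""
    ranks = np.asarray(ranks, dtype=float)
    m, n = ranks.shape                       # m participants, n ranked TMOs
    rank_sums = ranks.sum(axis=0)
    s = ((rank_sums - rank_sums.mean()) ** 2).sum()
    return 12.0 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical ranks from three participants over six TMOs (1 = closest to reference).
example = [[1, 2, 3, 4, 5, 6],
           [1, 3, 2, 4, 5, 6],
           [2, 1, 3, 4, 5, 6]]
print(kendalls_w(example))        # a value close to 1 indicates strong agreement
print(np.mean(example, axis=0))   # mean rank per TMO gives the preference order
```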

The main effect of group was not significant, F(1,56) = 0, p > 0.5. This means that although different participants went through different scenes, the differences between these two groups were not significant. The results thus did not change according to the group and the different sets of scenes, meaning the results are more robust. This also means the approach adopted to reduce fatigue by splitting the participants into two groups could be deemed successful.

The main effect of display was significant, F(1,56) = 6.179, p < 0.05. This signifies a difference in the results across the two displays; effectively, there is a difference in the results of the different TMOs across the two displays. Although the order remains the same (as shall be seen below), there is a difference in how much the different TMOs are preferred between the displays.

For the main effect of TMO, sphericity was violated (Mauchly's test, p < 0.05), therefore the Greenhouse-Geisser correction was applied. The main effect of TMO was also significant, F(3.839, 215) = 2.836, p < 0.05. By itself this only signifies that there was a significant difference amongst all the TMOs. Pairwise comparisons using the Bonferroni adjustment were used to indicate significance amongst all TMOs (at p < 0.05). In this case there was a significant difference amongst all the TMOs. The order was Man, Boi, Pat, Ben, Fer and Hat. Kendall's coefficient of concordance was W = 0.862, p < 0.01. This indicates a very high level of agreement amongst the participants when assigning scores. The overall results are shown in Table 2.

4.1. Results obtained for the LDR display

When comparing results for the LDR display only, there was a main effect of TMO, F(3.102, 86.858) = 191.322, p < 0.01, with the Greenhouse-Geisser correction as sphericity was violated (Mauchly's test for sphericity, p < 0.01). As before, the main effect of group was not significant, F(1,28) = 0, p > 0.5.


Table 2
Overall similarity results from the experiments. Coloured groupings represent TMOs that were not found to be significantly different from each other using pairwise comparisons, via the Bonferroni adjustment, at p < 0.01.
*(p < 0.01).

Table 3
Overall results obtained for each scene. Coloured groupings represent TMOs that were not found to be significantly different from each other using pairwise comparisons, via the Bonferroni adjustment, at p < 0.05.
*(p < 0.01).


The order of TMO preference was Man, Boi, Pat, Ben, Fer and Hat. Pairwise comparisons with the Bonferroni adjustment show that Man and Boi were grouped together, Pat and Boi were also grouped together, as well as Boi, Man and Pat (see Table 2), indicating no significant difference between them (at p < 0.05). There was a significant difference amongst the rest of the TMOs. These results have a Kendall coefficient of concordance of W = 0.850, p < 0.01. Again this is a high value, indicating strong agreement amongst participants.

4.2. Results obtained for the SSD

For the SSD group, there was a significant difference amongst TMOs, F(3.101, 86.827) = 290.309, p < 0.01, with the Greenhouse-Geisser correction as sphericity was violated (Mauchly's test for sphericity, p < 0.01). Again, the main effect of group was not significant, F(1,28) = 0, p > 0.5.

The order of preference is the same for both displays, but in the case of the SSD, pairwise comparisons with the Bonferroni adjustment showed a significant difference between all TMOs except between the second and third ranked TMOs (Boi and Pat). For the SSD group, Kendall's coefficient of concordance was W = 0.890, p < 0.01, another significant result showing strong agreement amongst participants.

4.3. Overall results obtained for each scene

Table 3 shows the results obtained for all the scenes used in the experiments. The coloured squares indicate that the TMOs inside them have no significant difference between them at p < 0.05. As can be seen, the preference order does not remain the same for all scenes, and the significant differences between TMOs are also not the same for all cases. Kendall's coefficient of concordance W was significant, at p < 0.01, for all scenes across both displays.

4.4. Testing for luminance bias

In order to test whether the TMOs that produced brighter footage had a consistent effect on the results, the average brightness per footage for all the scenes was tested against the order of the preferred TMOs using Spearman's rank correlation coefficient. The results demonstrated that the average luminance values for each TMO were not significantly related to the order of the preferred TMOs, ρ = −0.429, p > 0.05. This indicates that a bias due to brightness did not significantly affect the results.
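A sketch of this test using SciPy is shown below. The rank and luminance values are made-up placeholders purely to show the call; the paper's actual measurements are not reproduced here.

```python
from scipy.stats import spearmanr

# Hypothetical inputs: mean preference rank per TMO (1 = closest to the
# reference) and the average luminance of that TMO's tone mapped output.
mean_rank         = [1, 2, 3, 4, 5, 6]                      # e.g. Man, Boi, Pat, Ben, Fer, Hat
average_luminance = [0.42, 0.47, 0.39, 0.44, 0.35, 0.31]    # placeholder values

rho, p_value = spearmanr(mean_rank, average_luminance)
print(rho, p_value)  # a non-significant rho would indicate no brightness bias
```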

5. Discussion

The overall results of the experiments presented in this paper are summarized in Table 2. The table shows the groupings of the six considered TMOs as well as the F value for significance and the coefficient of concordance. As can be seen, there is a strong level of agreement between the participants (W = 0.85 for LDR and W = 0.89 for SSD at p < 0.01). The results show that the preference order for the LDR display was Man, Boi, Pat, Ben, Fer and Hat; for the SSD, despite the order being the same, Boi and Pat were grouped together as there was no significant difference between them. Across the displays the preference order is the same, with no groupings. When considering the results as a whole, an interesting fact is revealed: at least at medium luminance levels, with HDR images the preference order is significantly different across display sizes (comparing with the results obtained by Urbano et al. [3]), whereas with HDR video the TMO preference order remains the same, although there can be some groupings within the preference order. This may occur because still images give participants the opportunity to view more spatial detail, so the differences in size between the SSD and the LDR display become more obvious. With video the spatial differences may not be as clear, hence the TMO preferences may be closer.

Analysing the preference order, the results suggest that for HDR video TMOs participants value colour appearance (Man and Pat). Although Boi is not dedicated to maintaining colour appearance, its good performance can be explained by the fact that it is the only TMO that considers the sequence as a whole and preserves the relative luminance levels as well as the temporal brightness consistency, improving the perception of the scene's objects. Overall, Man was the preferred TMO. Besides preserving colour appearance, it preserves visible contrast, and it was designed to match the scene and the display brightness by comparing the LDR image with the HDR image in order to minimize the residual errors. The performance of Pat was fair, and its third place could be explained by the fact that, although Pat preserves colour appearance and is time-dependent, it was not designed for video sequences; it only simulates the adaptation of the HVS to illumination changes, and this could be why Boi performed slightly better than Pat. The best rank achieved by Fer was fourth, for Explosion on the LDR display and for Kalabsha on the SSD, but it was mainly ranked as the fifth most preferred TMO. Fer's main goal is to deal with adaptation to illumination changes and, despite trying to preserve colour appearance, it seems to fall short when the dynamic range of the image is wide, as it is not able to preserve details in dark regions while reproducing details in bright areas, and it seems to add some artifacts to the sequence. Hat was the TMO that obtained the lowest rank. This TMO aims to preserve scene luminance and reduce noise, but it seems to struggle with brighter scenes since the contrast is small in the regions with more luminance. Another factor that may have contributed to the low ranking of Hat is that it considers the light-dependent activity of an enzyme, which has a residual activity even in complete darkness. This affects temporal processing according to a first-order low-pass filter that is dependent on the retinal luminance, which can be considered to adapt, continuously and automatically, to the prevailing light level.

The results obtained for each scene also show that the preference order of the TMOs changes, and this may be related to the scene features. The results are not uniform across all the evaluations; although one TMO may be the best ranked for one video, it is not the best ranked for all videos. The best TMO for each video depends on the attributes of the video, for example sudden light changes, movement, and mainly dark or bright scenes.

6. Conclusions and future work

Prior to running the experiments it was thought that the TMO preference for tone mapped HDR video would be similar to that for tone mapped HDR images. This assumption was based on the Urbano et al. [3] study, which considered SSDs and concluded that for LCD and CRT displays the TMO preference was the same, but that when comparing those displays with an SSD the results were significantly different. Our results showed that for tone mapped HDR video there can be a slight difference in the TMO preferences between displays, but the preference order remains the same across displays. This interesting finding could be explained by the motion associated with video clips. When a participant is viewing an image he/she pays more attention to all the details since it is static, whereas with a video the viewer is more concentrated on the sequence, and thus the differences between the performance of TMOs on different displays are not significant compared to their overall performance.

The tests took place in a room with medium ambient luminance values, which could possibly have a direct influence on the TMO preference order. Further work will be done to understand how the ambient luminance levels can have an impact on the TMO preference order, and thus whether this preference order is the same for the different sized displays under different luminance levels. The data gathered will help to identify more precisely which factors in a video influence the preference for a TMO, and whether indeed sequences within one video may benefit from different TMOs. Because the preference order of TMOs changes depending on the scene, a number of new HDR video sequences with specific test conditions, such as mainly bright or dark areas, indoor, outdoor, etc., will need to be investigated. Furthermore, as mobile devices are typically used in a variety of lighting conditions, in the shadow, in bright sunshine, etc., the preferred TMO also needs to be considered for these different conditions.

Acknowledgements

We would like to thank the authors of the TMOs for their help with the code for tone mapping the frames used in the experiments, and the participants for taking part in the study. We would also like to thank Elmedin Selmanovic for providing the Jaguar HDR footage.

This work was partially supported by the Portuguese government, through the National Foundation for Science and Technology – FCT (Fundação para a Ciência e a Tecnologia), through the project SFRH/BD/76384/2011 entitled “HDR Video for Mobile Devices”, and partially supported by the Portuguese government, through the National Foundation for Science and Technology – FCT (Fundação para a Ciência e a Tecnologia) and the European Union (COMPETE, QREN and FEDER), through the project RECI/EEI-SII/0360/2012 entitled “MASSIVE – Multimodal Acknowledgeable multiSenSorial Immersive Virtual Environments”.

This work was also partially supported by UK EPSRC grant EP/I038780/1 and by the ICT COST Action IC1005 “HDRi: The digital capture, storage, transmission and display of real-world lighting”.

References

[1] ITU, The World in 2011: ICT Facts and Figures, ⟨http://www.itu.int/ITU-/ict/facts/material/ICTFactsFigures2011.pdf⟩, 2011. Accessed 16/02/2013.

[2] J. Ross, R. Simpson, B. Simpson, Media richness, interactivity and retargeting to mobile devices: a survey, Int. J. Arts Technol. 4 (2011) 442–459.

[3] C. Urbano, L. Magalhães, J. Moura, M. Bessa, A. Marcos, A. Chalmers, Tone mapping operators on small screen devices: an evaluation study, Comput. Graph. Forum 29 (2010) 2469–2478.

[4] A. Benoit, D. Alleysson, P. Le Callet, J. Hérault, Spatio-temporal tone mapping operator based on a retina model, in: Computational Color Imaging Workshop, Lecture Notes in Computer Science, Springer, Département Images et Signal, Saint Etienne, France, 2009, pp. 12–22.

[5] J.A. Ferwerda, S.N. Pattanaik, P. Shirley, D.P. Greenberg, A model of visual adaptation for realistic image synthesis, in: Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '96, ACM, New York, NY, USA, 1996, pp. 249–258.

[6] J.H. Van Hateren, Encoding of high dynamic range video with a model of human cones, ACM Trans. Graph. 25 (2006) 1380–1399.

[7] R. Mantiuk, S. Daly, L. Kerofsky, Display adaptive tone mapping, ACM Trans. Graph. 27 (2008) 68:1–68:10.

[8] S.N. Pattanaik, J. Tumblin, H. Yee, D.P. Greenberg, Time-dependent visual adaptation for fast realistic image display, in: Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '00, ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 2000, pp. 47–54.

[9] R. Boitard, K. Bouatouch, R. Cozot, D. Thoreau, A. Gruson, Temporal coherency for video tone mapping, in: Proceedings of SPIE, Applications of Digital Image Processing XXXV, vol. 8499, pp. 84990D-84990D-10.

[10] F. Banterle, A. Artusi, K. Debattista, A. Chalmers, Advanced High Dynamic Range Imaging: Theory and Practice, first edition, AK Peters, Ltd (CRC Press), 2011.

[11] J. Tumblin, H. Rushmeier, Tone reproduction for realistic images, Comput. Graph. Appl., IEEE 13 (1993) 42–48.

[12] C. Schlick, Quantization techniques for visualization of high dynamic range pictures, in: G. Sakas, S. Müller, P. Shirley (Eds.), Photorealistic Rendering Techniques, Focus on Computer Graphics, Springer, Berlin, Heidelberg, 1995, pp. 7–20.

[13] G.W. Larson, H. Rushmeier, C. Piatko, A visibility matching tone reproduction operator for high dynamic range scenes, IEEE Trans. Vis. Comput. Graph. 3 (1997) 291–306.

[14] K. Chiu, M. Herf, P. Shirley, S. Swamy, C. Wang, K. Zimmerman, Spatially nonuniform scaling functions for high contrast images, in: Proceedings of Graphics Interface '93, pp. 245–253.

[15] S.N. Pattanaik, J.A. Ferwerda, M.D. Fairchild, D.P. Greenberg, A multiscale model of adaptation and spatial vision for realistic image display, in: Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '98, ACM, New York, NY, USA, 1998, pp. 287–298.

[16] E. Reinhard, M. Stark, P. Shirley, J. Ferwerda, Photographic tone reproduction for digital images, ACM Trans. Graph. 21 (2002) 267–276.

[17] M. Ashikhmin, A tone mapping algorithm for high contrast images, in: Proceedings of the 13th Eurographics Workshop on Rendering, EGRW '02, Eurographics Association, Aire-la-Ville, Switzerland, 2002, pp. 145–156.

[18] R. Brémond, J.-P. Tarel, E. Dumont, N. Hautière, Vision models for image quality assessment: one is not enough, J. Electron. Imaging 19 (2010) 043004–043004-14.

[19] R. Mantiuk, K. Myszkowski, H.-P. Seidel, Visible difference predicator for high dynamic range images, in: Proceedings of IEEE International Conference on Systems, Man and Cybernetics, pp. 2763–2769.

[20] F. Drago, W. Martens, K. Myszkowski, H.-P. Seidel, Perceptual Evaluation of Tone Mapping Operators with Regard to Similarity and Preference, Research Report MPI-I-2002-4-002, Max-Planck-Institut für Informatik, Stuhlsatzenhausweg 85, 66123 Saarbrücken, Germany, 2002.

[21] P. Ledda, A. Chalmers, T. Troscianko, H. Seetzen, Evaluation of tone mapping operators using a high dynamic range display, in: ACM SIGGRAPH 2005 Papers, SIGGRAPH '05, ACM, New York, NY, USA, 2005, pp. 640–648.

[22] A. Yoshida, V. Blanz, K. Myszkowski, H.-P. Seidel, Perceptual evaluation of tone mapping operators with real-world scenes, in: Human Vision & Electronic Imaging X, SPIE, 2005, pp. 192–203.

[23] J. Kuang, H. Yamaguchi, C. Liu, G.M. Johnson, M.D. Fairchild, Evaluating HDR rendering algorithms, ACM Trans. Appl. Percept. 4 (2007).

[24] M. Čadík, M. Wimmer, L. Neumann, A. Artusi, Evaluation of HDR tone mapping methods using essential perceptual attributes, Comput. Graph. 32 (2008) 330–349.

[25] J. Petit, R. Brémond, A. Tom, Evaluation of tone mapping operators in night-time virtual worlds, Virtual Reality (2012) 1–10.

[26] M. Melo, M. Bessa, K. Debattista, A. Chalmers, Evaluation of tone mapping operators for HDR video on small form factor displays, in: M. Bessa, R. Mantiuk (Eds.), HDRi2013 – First International Conference and SME Workshop on HDR Imaging.

[27] L. Meylan, D. Alleysson, S. Süsstrunk, Model of retinal local adaptation for the tone mapping of color filter array images, J. Opt. Soc. Am. A 24 (2007) 2807–2816.

[28] A.O. Akyüz, R. Fleming, B.E. Riecke, E. Reinhard, H.H. Bülthoff, Do HDR displays support LDR content?: a psychophysical evaluation, in: ACM SIGGRAPH 2007 Papers, SIGGRAPH '07, ACM, New York, NY, USA, 2007.

[29] Wikipedia, Brightside Technologies, ⟨http://en.wikipedia.org/wiki/BrightSide_Technologies_Inc⟩, 2013. Accessed 18/02/2013.

[30] Westinghouse Digital, ⟨http://legacywd.com/details.aspx?itemnum=56⟩, 2013. Accessed 16/02/2013.

[31] Apple, ⟨http://www.apple.com/ipad/specs/⟩, 2013. Accessed 13/02/2013.

[32] A. Chalmers, G. Bonnet, F. Banterle, P. Dubla, K. Debattista, A. Artusi, C. Moir, High-dynamic-range video solution, in: ACM SIGGRAPH ASIA 2009 Art Gallery & Emerging Technologies: Adaptation, SIGGRAPH ASIA '09, ACM, New York, NY, USA, 2009, p. 71.

[33] Illuminance – Recommended Light Levels, ⟨http://www.engineeringtoolbox.com/light-level-rooms-d_708.html⟩, 2013. Accessed 13/07/2013.

[34] S. Ramsey, J.T. Johnson III, C. Hansen, Adaptive temporal tone mapping, in: Computer Graphics and Imaging, CGIM 2004, pp. 3–7.