



Contents lists available at ScienceDirect

Journal of Visual Languages and Computing

journal homepage: www.elsevier.com/locate/jvlc

Journal of Visual Languages and Computing 20 (2009) 61–79

1045-926X/$ - see front matter © 2008 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jvlc.2008.07.001

Immersive authoring of Tangible Augmented Reality content: A user study

Gun A. Lee a, Gerard J. Kim b,*

a Department of Computer Science and Engineering, Pohang University of Science and Technology (POSTECH), Pohang, Republic of Korea
b Department of Computer Science and Engineering, College of Information and Communication, Korea University, Seoul, Republic of Korea

* Corresponding author. Tel.: +82 2 3290 3196; fax: +82 2 3290 4295. E-mail address: [email protected] (G.J. Kim).

Article info

Article history:

Received 8 August 2007

Received in revised form 25 July 2008

Accepted 30 July 2008

Keywords:

Immersive authoring

Augmented reality

Tangible interface

User study

Interaction design


Abstract

Immersive authoring refers to the style of programming or developing content from within the target executable environment. Immersive authoring is important for fields such as augmented reality (AR), in which interaction usability and user perception of the target content must be checked first hand, in situ. In addition, the interaction efficiency and usability of the authoring tool itself is equally important for ease of authoring. In this paper, we propose design principles and describe an implementation of an immersive authoring system for AR. More importantly, we present a formal user study demonstrating its benefits and weaknesses. In particular, our results demonstrate that, compared to the traditional 2D desktop development method, immersive authoring was significantly more efficient for specifying spatial arrangements and behaviors, a major component of AR content authoring. However, it was less successful for abstract tasks such as logical programming. Based on this result, we suggest that a comprehensive AR authoring tool should include such immersive authoring functionality to help non-technical media artists, in particular, create effective content based on the characteristics of the underlying media and interaction style.

© 2008 Elsevier Ltd. All rights reserved.

1. Introduction

Augmented reality (AR) is a newly emerging type of digital content that combines real imagery (usually captured by video cameras) with virtual 3D graphic objects. Thus, its content is 3D by nature. Compared to 2D oriented content or applications, for which a stable interaction platform exists, developing 3D content (such as AR content) requires a careful consideration of interaction usability and user perception, in addition to the basic functionality. Immersive authoring has been proposed in the virtual reality (VR) community as one way to achieve this objective [1]. Immersive authoring refers to the style of programming or developing content from


within the target executable environment. By working directly within the target executable environment, the developer gains a better "sense" (since a full blown formal usability test is not always feasible) for the content in development as seen, used, or felt by the user. Note that the executable environment of AR is quite different from that of the desktop, often requiring the user to wear a head mounted display (HMD), camera, and sensors, and to use non-traditional interfaces, a time consuming process in itself. Thus, immersive authoring has the additional benefit of reducing the period between content development and testing/deployment.

Immersive authoring is similar to the concept of "What You See Is What You Get (WYSIWYG)," the most prevalent form of visual authoring tool today [2]. While the concept of WYSIWYG is quite intuitive and its benefits have been attested to in theory and practice for some time, immersive authoring is still just an interesting proposal,


without benefits that have been formally demonstrated or verified. This is partly because the efficiency of immersive authoring depends on its own interaction usability and ease of use. Despite potential benefits, any authoring tool, immersive or not, will neither be effective nor gain popularity if it is difficult to use. In addition, one must also consider that some aspects of the immersive content may not be authored in the most efficient way through immersive interfaces (e.g. specifying logical behavior). Such issues need to be examined in conjunction with each other.

In this paper, we first propose requirements, particularly in terms of interaction design, for immersive authoring of AR content. Then, we briefly describe our implementation. Our central concept of immersive authoring for AR is an extension of "WYSIWYG" into "WYXIWYG (What You eXperience Is What You Get)" [3]. We demonstrate the projected benefits of immersive authoring by conducting an interaction usability test with AR authoring tasks, as compared to using the traditional desktop development method.

In the following, we first review previous research related to our study. We then discuss the design principles and requirements for immersive authoring in terms of interaction usability. We also briefly describe our implementation of an immersive authoring system called iaTAR. Section 4 describes the formal usability test we performed to evaluate our immersive authoring system. Finally, we conclude the paper with an executive summary of our contribution and future research directions.

2. Related work

The basic idea of immersive authoring has been known for some time, although mainly in the context of VR. Stiles et al. proposed a conceptual VR authoring system called "Lingua Graphica," in which various elements of the programming language were represented in a concrete manner using 3D objects [1]. Several researchers applied immersive VR (rather than conventional desktop CAD systems) to creating 3D virtual worlds [4-7]. Similar attempts have been made in the AR area as well, for example, to construct virtual scenes from within AR environments. Poupyrev et al. [8] suggested a rapid prototyping tool for modeling virtual aircraft cockpits. The system provided a set of virtual gauges and instruments that could be copied onto physical tiles. The users were able to test various layouts using an AR interface. Kato et al. [9] suggested a generic interaction method for manipulating virtual objects within AR environments, and applied it to the task of arranging furniture in a virtual room. Piekarski and Thomas [10] suggested 3D geometry modeling techniques for outdoor AR systems. Their modeling system was for constructing virtual representations of physical landmarks while roaming and examining the outdoor scene. All of these works, however, fell short of being true authoring tools, as they did not consider object behaviors.

On the other hand, "immersive" behavioral modeling has not attracted the same degree of attention, perhaps due to the seemingly logical nature of the task, which makes it appear unfit for 3D immersive VR/AR platforms. However, many object behaviors can be both logical and spatial at the same time. Although geometric modeling is an important part of an authoring process, in this paper we concentrate on the tasks of scene configuration, object behavior modeling, and other types of functionality for authoring support. A few others have considered immersive authoring of object behaviors, in the manner of Steed et al. [11] and Lee et al. [3]. All these systems explored defining behaviors of the scene and objects within the virtual environment using VR interfaces. For example, in the system by Steed et al., the users were able to view and manipulate the links (i.e. data flow) between virtual objects. This was one of the first systems implemented for immersive behavior modeling. However, the data-flow representation was not general enough to accommodate the various types of behaviors possible in a typical VR system, and there was arguably no compelling reason or advantage (other than merging the executable and development platforms) to employ 3D interaction or an immersive environment to view and interact with the data-flow representation. Most importantly, it was difficult to judge the overall comparative benefits of immersive authoring from these works without any formal user studies.

AR content has been developed mostly by programming with specialized toolkits (APIs) [12-14]. As a possible means of relieving the burden of low-level programming, a few researchers have proposed the use of abstract mark-up languages and visual tools for specifying AR content [15,16]. With the recent popularity of and interest in AR, more comprehensive AR authoring tools have been developed extending this approach [17-21]. These tools typically offer a desktop-based GUI with various representation constructs (e.g. data flow, state diagrams, geometry) and an executable window showing the evolving AR content (see Fig. 1). Note that in this situation a camera, usually fixed, monitors the target interaction area. For example, CATOMIR is a desktop AR authoring tool developed under the AMIRE project [19]. Its graphical user interface enables users to create and specify the properties of the required components and to link them to create behavior chains. Users can immediately switch to an executable mode (simply by pressing the compile button) for running and testing the result (by simply looking at the AR content window, see Fig. 1). The Designer's Augmented Reality Toolkit (DART) is also a 2D desktop tool for rapid development and prototyping of AR applications [21]. DART is implemented as an upper layer of Macromedia Director, leveraging its familiar behavior modeling method using scores, sprites and scripts.

While development is still based on a 2D desktop and an indirect (not immersive) method, the use of mark-up languages or GUI-based tools does significantly reduce the development time. That is, the AR content being designed can be immediately compiled, executed and displayed in the desktop window. While the view of the content is neither first person nor immersive, the development time is still significantly reduced compared to traditional programming. We posited (and found) that experts (who are used to AR programming) both prefer and are more productive using mark-up languages, while 2D GUI tools are more popular among novices (e.g. non-programmers, those not familiar with AR programming or the syntax of the mark-up language) [22].


Fig. 1. A typical desktop non-immersive authoring tool for augmented reality [18].


However, in both cases, the problem remains the excessive trial-and-error process resulting from the indirect specification of the spatial aspects of the content. Moreover, for mobile and location-dependent AR content (i.e. AR content tied to a wide physical area and displayed using a hand-held or wearable device), desktop authoring becomes even more indirect and difficult [20].

In fact, authoring AR content can be classified into three main types of tasks: (1) spatially oriented tasks, for which a 3D interface is needed (e.g. 3D placement/association of objects); (2) logically oriented tasks and discrete commands, for which a 2D interface is sufficient (e.g. logical behaviors); and (3) tasks coupling (1) and (2) (e.g. spatial behaviors). Thus, the effectiveness and design of an authoring tool, 2D oriented or 3D immersive, must be based on the nature of the particular target content (e.g. what proportion of the target content constitutes spatial behavior?). Our study focuses on investigating the benefits and weaknesses of 3D and immersive authoring for AR through a formal experiment using representative authoring tasks.

3. Immersive authoring system for Tangible Augmented Reality (iaTAR)

Immersive authoring allows a content developer to experience and verify the content first hand, while creating it through natural and direct interaction within the same environment as the one where the final result is used, in this case the AR environment. By definition, in immersive authoring, the development process occurs within the same environment as the one where the content will be experienced by the end users; therefore developers need not switch between the authoring and test environments.

Likewise, by the term "AR authoring," we mean a process of creating and organizing AR content by using interactive software (primarily designed for non-programmers), in contrast to programming libraries or application programming interfaces (APIs). More specifically, the authoring system described in this paper is targeted at creating Tangible Augmented Reality (TAR) content. TAR interfaces [23] are those in which (1) each virtual object is registered with a physical object and (2) the user interacts with virtual objects by manipulating the corresponding physical object. TAR applications/content have recently become quite popular. MagicBook is one such example [24]. MagicBook is an AR-based pop-up book for which users can watch 3D animations popping out from a physical story book (see Figs. 2 and 3).

3.1. Requirements for immersive TAR authoring

Based on the previous definition and our specific target application/content, we established four general requirements for immersive TAR authoring to help guide our implementation. We believe that our requirements will apply to immersive authoring systems in general. To describe the first and most fundamental design principle of immersive authoring systems, we have coined a new term, "WYXIWYG," which stands for "What You eXperience Is What You Get" [3].


Fig. 2. A tangible AR interactive story book built with iaTAR.

Fig. 3. Four examples of AR content with different task requirements: (a) circulation of water and the making of rain [31], which requires extensive immersive 3D interaction design (top row); (b) a MagicBook type [30] of application that only needs association of markers with pre-built graphical objects/behaviors (bottom left); and (c) mobile AR content, the Human Pacman [32], which is very difficult to implement using mere programming or desktop tools (bottom right).


Like the term "WYSIWYG (What You See Is What You Get)" [2] in modern graphical user interfaces, this design principle implies that an immersive authoring system must support fast and intuitive evaluation of the virtual world being built. In other words, developers must be able to experience the same feeling or aura (not only visual and aural, but even tactile or haptic) that the end users might feel with the content under development. This provides an instant (or even concurrent) evaluation of the content under construction, helping the developers to accurately understand the current status of the content. Ideally, any interactive and immersive content should undergo formal testing of its usability, level of presence, and degree of information transfer. In practice, this is difficult, for reasons of cost, time and limited resources. The next best alternative is enabling the developer to get the feel of the content as quickly and easily as possible. This enables the developer to experience the user's perspective, and ensures that the developer's intentions are truly reflected.

The next design principle is to employ direct manipulation techniques in the authoring process as much as possible. Direct manipulation is another important concept borrowed from the 2D user interface development field [22]. It refers to manipulating graphical objects directly. Since its introduction, along with the mouse, it has revolutionized and transformed the way we interact with computers, particularly for creating graphically oriented content [22]. Similarly, since the immersive authoring environment uses 3D interfaces by its nature, direct 3D manipulation should provide an efficient and intuitive way of manipulating virtual objects (one of the main subtasks in authoring 3D TAR content). Providing directness and tactility will increase the intimacy between the developer and the content [22].

While direct manipulation is undoubtedly intuitive, it lacks sufficient spatial accuracy to support fine and precise modeling tasks. This is because the exact 3D positioning/orienting task (with 6 degrees of freedom) is difficult by itself, and also because current movement tracking sensors lack the required accuracy. A separate provision (such as constrained positioning and alphanumeric input) must be made to ensure that sufficiently detailed control is possible and to support a reasonable range of modeling functionality.

3.2. Task analysis and interaction design for iaTAR

Suppose we want to construct interactive TAR-based content for the following simple story ("The Hare and the Tortoise") from Aesop's fables.

A Hare one day ridiculed the short feet and slow pace of the Tortoise, who replied, laughing, "Though you be swift as the wind, I will beat you in a race." The Hare, believing her assertion to be simply impossible, assented to the proposal; and they agreed that the Fox should choose the course and fix the goal. On the day appointed for the race the two started together. The Tortoise never for a moment stopped, but went on with a slow but steady pace straight to the end of the course. The Hare, lying down by the wayside, fell fast asleep. At last waking up, and moving as fast as he could, he saw the Tortoise had reached the goal, and was comfortably dozing after her fatigue. Slow but steady wins the race.

To realize this story as TAR content, several types of functionality will be required (this can be implemented in many ways). The objects must be modeled (i.e. geometric shape and configuration) according to the details required by their functions, which in turn must also be modeled. For instance, the hare's running requires modeling of its legs and a periodic animation/sound associated with this.

Then, various scenes must be put in place (e.g. the start of the race, the Hare's sleeping scene, etc.). Specific details need to be specified for each object's behavior, such as its timing, motion profiles, conditions, triggering events, etc. Note that the manifestation of the behavior may require a careful selection of multimodal output for the best effect. To make the story interactive, 2D or 3D interaction may be designed and inserted (see Fig. 2). The content developer can insert special effects, sound tracks and changing lighting conditions into the behavioral time line. All of these important modeling and specification tasks may be repeated and rehearsed as the content develops and matures, during which the developer will take notes, adjust parameters, try different versions, replay and review, check for usability and immersion effects, and even act out an object's role oneself. The developer will also constantly require various types of information to make decisions and perform these tasks.
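To make one of these specifications concrete, the following is a minimal sketch of how a triggering behavior from this story, starting the Hare's running animation when its prop comes into view, might be expressed in the route-style notation of TARML, the mark-up language introduced in Section 3.3. All tag, attribute and property names here are illustrative assumptions, not the actual iaTAR syntax.

  <!-- The Hare's virtual model and the physical prop it is bound to -->
  <pobject id="hare_card" marker="hare.patt" />
  <vobject id="hare" model="hare.obj" />

  <!-- Trigger: while the prop is visible, an assumed pre-built
       "running" animation property of the Hare is switched on -->
  <link from="hare_card.visible" to="hare.running" />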

Table 1 summarizes some of the important subtasks required for immersive authoring. The four major task categories are broadly identified as individual object specification, overall scene specification, content review (or execution) and miscellaneous operations, which are further decomposed into various subtasks. Table 2 matches each subtask to one of four forms of authoring interfaces, i.e. specialized tools, programming or text-based input, 2D GUI, and 3D immersive authoring. A more detailed explanation follows.

3.2.1. Object specification

One of the major tasks in object specification is geometric modeling, as part of the specification of its form. Although there are benefits to modeling the appearance of the virtual objects within the scene using 3D/AR interfaces, this is outside the scope of the paper, as there has been substantial previous work in this area [4,5,7,25]. Instead, we assume that there already exist geometric models of the virtual objects for the developer to import, use, modify, manipulate and specify gross behavior for. Detailed geometric/graphic modeling and animation of objects are often accomplished separately using specialized tools. Likewise, we can also assume that the virtual objects come with a few "basic" types of functionality already modeled, so that the developer does not have to bother specifying trivial details, although they can be specified using a separate function specification interface, if needed. Otherwise, "immersive" specification of form mainly involves small geometric adjustments (e.g. slight reconfiguration of the object organization, size adjustment, etc.) and assigning values to the object's relevant attributes. Whether the attribute values are discrete (e.g. color), continuous (e.g. dimension), or spatial (e.g. 2D or 3D) may dictate the required style of the interface.


Table 1. Possible subtasks for immersive authoring for/with TAR

Object specification
- Form specification: geometric modeling (not covered in this paper); (sub)object placement/rotation (reconfiguration); shape modification (e.g. scaling); discrete attribute setting
- Function specification: scripting/programming; (self) motion specification (e.g. animation)
- Behavior specification/user interaction: scripting/programming; model-based specification (e.g. event-action rules); (gross) motion specification; behavioral coordination (e.g. synchronization of concurrent behaviors); object-to-object linking (routes, data flow); 2D interaction specification; 3D interaction specification
- Object-prop binding: associating a virtual object with a specific prop or real object

Scene specification
- Object placement/deletion
- Object hierarchy specification
- Scene-wide settings (sound effects, lighting effects, camera placements)

Content review
- Deployment and testing of individual objects
- Running and testing the entire content

Miscellaneous
- Version management (save, replay)
- Information browsing/note taking
- Usability/presence assessment

Table 2. Matching the authoring subtasks to four different styles of authoring interfaces: specialized tools (e.g. geometric modeler), programming/text/script input, 2D GUI (menu/drag/drop), and 3D/immersive interfaces

Tasks and subtasks                          Specialized tool   Programming   2D GUI   3D/immersive
Form specification
  Geometric modeling                        ●                  ✕             △        △
  (Sub)object reconfiguration               ✕                  △             ◐        ●
  Shape modification                        ✕                  ✕             ◐        ●
  Discrete attribute setting                ✕                  ◐             ●        △
Function specification
  Scripting/programming                     ●                  ◐             △        △
  (Self) motion specification (animation)   ●                  ✕             △        △
Behavior specification/user interaction
  Scripting/programming                     ✕                  ●             △        △
  Model-based specification                 ✕                  △             ●        △
  (Gross) motion specification              ✕                  △             ◐        ●
  Behavioral coordination                   ✕                  △             △        ●
  Object-to-object linking                  ✕                  ●             ●        ◐
  2D interaction specification              ✕                  ◐             ●        △
  3D interaction specification              ✕                  △             △        ●
Object-prop binding
  Binding an object to a specific prop
  or real object                            ✕                  ●             ●        ◐
Scene specification
  Object placement/deletion                 ✕                  ✕             ◐        ●
  Object hierarchy specification            ✕                  ◐             ●        △
  Scene-wide settings                       ✕                  ◐             ●        ◐
Content review
  Deployment and testing of objects         ✕                  △             ●        ●
  Running/testing entire content            ✕                  △             ◐        ●
Miscellaneous
  Version management                        ✕                  ✕             ●        ◐
  Info. browsing/note taking                ✕                  ✕             ●        ◐
  Usability/presence assessment             ✕                  ✕             △        ●

●: most appropriate; ◐: can be easily done; △: difficult; ✕: almost impossible.



Particular types of virtual objects (e.g. articulated characters, moving objects, buildings, etc.) commonly used by a given application or content type can be identified through a domain analysis. Such a basic object model can free the user from needing to define frequently used attributes or functions during the authoring process. When the definition of a new attribute or function is required, the content developer can resort to an alphanumeric input interface for typing identifiers or script code.

The application of high level authoring tools is deemed more appropriate for specifying complex object behaviors in the context of the overall story and scene. Current methods rely mostly on programming. While there may be cases where programming will be necessary, much of the behavioral authoring can be made easy by the use of structured behavior "models." For example, the Virtual Reality Modeling Language (VRML) [26] uses the notion of "routes" to define behaviors. Object attributes can be connected by routes to define dynamic behaviors, forming a data-flow graph of virtual objects. Other methods, such as state-based methods and "drag and drop" scripts, are possible as well [27,28]. Thus, describing a behavior can be considered to be the specification of a chosen behavior model. Based on the chosen behavior model, the content developer might use different interaction methods, 2D or 3D/immersive. For instance, behavior specification using "routes" (making logical connections) or importing reusable behaviors can be accomplished using 2D interfaces. On the other hand, when using event-driven imperative models, the events and actions (e.g. sequences of functions) can be represented with metaphorical virtual objects, because virtual object behaviors frequently exhibit spatial characteristics. In this case, 3D direct manipulation and demonstration can be more useful. A behavior specification can even involve "spatially acting out" specific situations to encapsulate an event or behavior, e.g. a colliding motion or the coordination of timing, by directly manipulating the involved virtual objects. User interaction can be considered as the means by which individual objects "behave" based on user input. Again, this user interaction behavior may be 2D or 3D in nature. Finally, an authoring subtask unique to TAR applications is the "binding" task for associating a virtual object with a physical prop or real object in the AR environment.

3.2.2. Scene specification

Virtual objects collectively constitute a scene, and the scene changes according to user input and object behavior. Scenes are constructed by importing the needed objects/props into the (real) scene, and positioning and orienting them. In a conventional development platform, this is accomplished through trial and error (recompile, display and review), by guessing the translation and rotation offsets of the objects with respect to the parent coordinate system. Immersive 3D direct manipulation of objects can aid in making this process more intuitive and efficient. Studies have revealed human difficulty in specifying 3D spatial information without a strongly established reference, when having to resort to mental simulation, or with 2D interfaces (i.e. keyboard and mouse) [29].

An AR scene can also be considered as an individual object with its own attributes (e.g. shading properties, background sounds) and behavior (e.g. scene switches, changing weather effects). Also, note that there may be computational and "formless" objects, such as the camera and collision detection modules, which have features or behavior that can affect the whole scene. They can also be specified through interfaces similar to those used for object attribute specification, and their attribute types will dictate the styles of the required interfaces.

3.2.3. Content review/miscellaneous

As has already been pointed out, an authoring task is basically an exploration task involving much trial and error. Thus, aside from the above two major authoring tasks, authors need to execute and review the authored content. Note that the author may wish to deploy and review only parts of the content rather than the content in its entirety. During this process, the author will save different versions of the virtual world/objects for later reuse/replay. AR worlds or objects saved at different times may have to be merged to define a new version. Managing these operations can be accomplished easily through 2D interfaces.

During the review, the author should be able to navigate throughout the scene and view the content first hand from various viewpoints to assess the level of immersion and interaction usability. Thus, such a review has to be performed in the immersive executable mode.

3.2.4. Examples: Hare and Tortoise, AR Volcano [30], circulation of water [31], Human Pacman [32]

As summarized previously, AR authoring involves various types of subtasks. The composition of the different subtasks will vary according to the particular content. We use four different examples to further illustrate the need for immersive authoring and its unique role. The first is the interactive story of the Hare and the Tortoise (see Fig. 2), in which the dominant tasks are the scene configuration and the creation of coordinated behaviors among the main characters, both of which are best accomplished using 3D immersive interfaces. The second is the AR Volcano (see the bottom left part of Fig. 3), a typical MagicBook application [30]. In this type of AR content, the logical/2D authoring tasks consist of binding a particular marker (e.g. a page in a book) with previously modeled virtual objects/behaviors (e.g. an exploding volcano) and specifying simple 2D interaction (e.g. controlling the magnitude of the explosion and molten lava using the slide bar on the side). The third is AR-based educational content for teaching the concept of the circulation of water [31]. This content involves many metaphorical 3D interactions, e.g. the circulation of the marker (river → mid-air → cloud → mid-air → underground) along which particular events (illustration of vaporization, condensation, formation of clouds, etc.) occur, requiring the manipulation of water molecules to make rain drops (see the top row of Fig. 3). This requires exploration of the best possible way to interact, in situ, to communicate the scientific concept to students. Immersive authoring seems most appropriate for this case. The final example is the case of mobile AR content, the Human Pacman [32]. Placing virtual balls (for the human Pacman and monsters to eat) at particular locations in a wide area undoubtedly requires immersive authoring capabilities. The four examples demonstrate the practical needs and advantages of immersive authoring (derived from the particular needs of AR content) in addition to the standard 2D GUI.


Table 3. Interfaces for major tasks in iaTAR

Objects/scene specification
- Object placement/rotation: direct 3D manipulation (props)
- Shape modification (e.g. scaling): prop-based 2D GUI / direct 3D manipulation (props)
- Discrete attribute setting: prop-based 2D GUI
- Scripting/programming: real keyboard
- Motion specification: direct 3D manipulation (props) / PBD
- Object/prop binding/linking: prop-based 2D GUI / direct 3D manipulation (props)
- Timing coordination: prop-based 2D GUI / direct 3D manipulation (props)

Miscellaneous
- Deployment/testing: prop-based 2D GUI
- Version management/system control: prop-based 2D GUI
- Navigation/review: natural body motion
- Information browsing/note taking: real keyboard / real or virtual terminal / voice recording



3.3. Interface and implementation

Our prototype implementation of the immersive authoring tool for AR content is named iaTAR. The tool especially focuses on creating TAR content, where physical props work as a medium for controlling virtual objects. Specific interfaces were chosen for the variety of subtasks described in Section 3.2 (Table 3) based on our analysis (Table 2). Among the various subtasks, 3D spatial operations are performed by directly manipulating the props. For instance, the subtask "object placement/rotation" is realized in such a way. Logical tasks and discrete commands (e.g. changing the color of an object, opting to run the content) are realized through a 2D GUI associated with various props. For this purpose, virtual button functionality was partially implemented using an approach called occlusion-based interaction (OBI) [33]. In OBI, the visual occlusion of markers (by the interacting fingers) is used to detect and signal a virtual button press. The basic principle can be applied to design a variety of 2D tangible interfaces, such as menus, buttons, slider bars, and keypads. Finally, although rare, there are instances when the use of the keyboard is necessary for alphanumeric input. Note that navigation in the AR environment (for inspection or changing the view point) is simply accomplished by the motion of the user's own body (or head) in real space. Figs. 5-8 illustrate various examples of authoring task interactions.

When the user finishes the authoring tasks through these interfaces, in order to execute the authored result, the input data is interpreted according to an underlying executable model based on a structured mark-up language called TARML. Thus, the authored content can be saved in the TARML format, too.¹ TARML is very similar to VRML in terms of how it defines objects, their properties and values, and the means of connecting them for propagating events and simulating behaviors (like routes in VRML) [26]. TARML differs from VRML in two ways: (1) it provides a way to associate physical props with objects for tangible interaction, and (2) it supplies many predefined classes, such as "logical" objects (e.g. for various logical behavior specifications) and "sound" objects (e.g. for sound effects), for modeling convenience. Different types of objects (e.g. physical prop, virtual, logical, and sound) are specified using "tags." For instance, pobject tags represent physical objects (i.e. markers) and vobject tags virtual objects. Likewise, object properties and behavioral chains can be specified using the property and link tags, respectively. Fig. 4 shows an example of AR content represented in TARML. The content shows a rotating virtual fish on a physical card (prop), making bubble sounds when the card is visible. For richer, more flexible and more convenient content specification, more object and behavioral models should be added to TARML in the future. Nevertheless, content specification with the current form of TARML is conceptually quite easy, especially for those with programming or engineering backgrounds.

¹ As this paper focuses on the interaction design for immersive authoring and its validation, we give only a brief explanation of the details of the execution model and TARML.

iaTAR is implemented using a vision-based 3D tracking library called ARToolKit [12] (used for calculating the 3D position and orientation of the visual markers/props) and OpenGL for the graphics. A plain USB web camera from Logitech is used to acquire video images of the real environment and for the tracking. The capture resolution is set to 320×240 at 30 frames per second. The camera is mounted on an HMD to provide a real world view to the user, forming a video see-through AR configuration. A keyboard is used for alphanumeric input (see Figs. 9 and 10). iaTAR runs on a PC with the MS Windows XP operating system on a Pentium 4 processor with 1 GB of main memory. A GeForce4 3D graphics card from NVIDIA is used to accelerate the OpenGL graphics processing.


Fig. 4. An example of AR content represented in TARML. The content shows a rotating virtual fish on a physical card (prop), making bubble sounds when the card is visible.
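Since the figure itself is not reproduced here, the following is a minimal sketch of how content like that of Fig. 4 might be written, based only on the TARML constructs described above (pobject, vobject, property and link tags); the attribute names, the sound object tag and the file names are assumptions for illustration, not the actual listing of Fig. 4.

  <!-- The physical card (marker) that carries the content -->
  <pobject id="card" marker="card.patt" />

  <!-- The virtual fish registered to the card; "spinning" is an
       assumed property driving the rotation animation -->
  <vobject id="fish" model="fish.obj">
    <property name="spinning" value="true" />
  </vobject>

  <!-- An assumed sound object for the bubble effect -->
  <sound id="bubble" file="bubble.wav" />

  <!-- Behavioral chain: the card's visibility drives the fish's
       visibility and turns the bubble sound on and off -->
  <link from="card.visible" to="fish.visible" />
  <link from="card.visible" to="bubble.playing" />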

Fig. 5. Browsing through available objects and selecting one, using props.




Fig. 6. Using the inspector pad to browse through the object attributes and their values.

Fig. 7. Recording the motion profile of two objects using two hands.


4. User experiment

The main purpose of this study is both to propose interaction techniques and interfaces for immersive authoring, and to demonstrate its advantages (by comparison to conventional desktop methods), for instance in terms of authoring efficiency (e.g. ease of learning and authoring time) and resulting content quality (e.g. user-felt presence). However, such a validation is difficult because, in performing a comparative experiment and analysis, a representative task or a subject neutral to a particular authoring method is hard to find. Ideally, a particular approach (like immersive authoring, which has multifaceted factors) should be informally validated by "word of mouth" after it has been tested on various types of content over a long period of time. The resulting content quality is also hard to measure, because of the subjectivity of human perception. Despite these difficulties, we conducted a usability experiment comparing the proposed immersive authoring to desktop authoring, in the hopes of deriving a general principle.

In this paper, our focus is on authoring efficiency, such as ease of use and authoring time (rather than content quality, which is rather subjective), in comparison to the standard desktop method. Specifically, we compared iaTAR to XML-based scripting (which we named TARML) for authoring content with given specifications in a controlled setting. Our main hypothesis was that immersive authoring would be significantly more efficient, with less overall authoring time (particularly for tasks with spatial impact), and would be more natural and easy to use compared to its desktop counterpart. Aside from the merely representational differences, existing 2D GUI tools were not directly compared to iaTAR because, as we noted previously, with a sufficient amount of training, direct editing can often be more efficient than using 2D GUI tools [22], and informal direct comparisons have partially demonstrated similar results [34].


Fig. 8. Representing routes between object attributes; the visibility of the virtual fish is connected to the visibility of a single marker, showing that the value is being updated according to the value of the connected attribute.


To put it more precisely, the following are the three hypotheses investigated in this experiment.

Hypothesis I. The immersive authoring method (i.e. iaTAR) is significantly more efficient in authoring AR content than conventional development methods based on programming languages (i.e. TARML).

Hypothesis II. Immersive authoring is significantly more efficient on spatial authoring tasks than on non-spatial tasks.

Hypothesis III. The immersive authoring method (i.e. iaTAR) is significantly easier to use and learn in comparison to conventional development methods based on programming languages (i.e. TARML).

4.1. Experimental design and task

The experiment was designed as a one-factor within-subject experiment. The independent variable was the type of authoring system used (iaTAR or TARML). iaTAR represented the use of immersive authoring, and TARML represented the use of a desktop text editing-based method (i.e. typing in the required scripts/constructs according to the TARML syntactic format). The major dependent variables were the task completion time and the usability survey answers. The task assigned to the subject was to construct content that satisfied a given set of requirements using the particular authoring system. The subjects were given as much time as needed to finish the task, but were instructed to complete the task as quickly and as accurately as possible.

4.2. Experimental procedure

The experimental procedure for a participant consisted of two sessions: an hour for the training session and one and a half hours for the experimental session. During the training session, participants learned the basic concepts of TAR content and the respective authoring methods using iaTAR and TARML. Note that the syntax of the TARML scripts (at least for the experimental task) was sufficiently simple and easy, especially for our subject group of participants with engineering backgrounds (see also Section 4.3). In addition to the detailed briefing, the subjects practiced with each authoring tool by constructing a sample TAR content.


Fig. 9. Using the real keyboard and text overlay in iaTAR.

Fig. 10. Experimental environments for script editing (left) and immersive authoring (right).



In the actual experimental session (which followed the training session after a short break), the subjects had another practice trial of authoring TAR content, in order to help them recall the authoring methods after the break. After the practice trial, the subjects were asked to author six different types of TAR content according to given specifications (two examples are given in Table 4) using both authoring methods. The order of the authoring methods and of the six types of content was counter-balanced across participants. The overall experimental process is shown in Table 5.

During the experimental session, to prevent any possibility of the subjects being unduly guided by the experimenters, the subjects were not allowed to ask questions about how to use the authoring interfaces. Instead, the participants were allowed to freely consult the user guide document for reference. Only when the subject got "lost" (e.g. not knowing what to do even after looking up the user guide) and it was determined that they had spent too much time on a particular subtask (e.g. more than one minute) was the experimenter allowed to help. In most cases, users knew exactly what to do, but had forgotten the exact syntax (for TARML editing) or how to perform certain operations (for immersive authoring). One minute was deemed approximately the right amount of time, based on our prior observation, to resolve such a problem in authoring. Any extra time spent (due to help) was subtracted from the task completion time. This way, a provision was made so that the overall performance data was not severely biased by a few such outliers (there were two users, each getting lost twice, both during TARML editing; see Section 4.4).


Table 4. Two examples of requirement specifications, one spatial and one non-spatial, given to the subjects in the experiment

Spatial tasks
0. Use page 1 to construct this scene.
1. Place the terrain ("underhill.obj").
   - The terrain must be parallel to page 1.
   - The hill top must be on the right-hand side.
   - Scale it up to fit the size of page 1.
2. Place the tree ("tree.obj").
   - Place it on the hill top.
   - Make sure the tree is neither floating over nor buried under the ground.
   - The flat side of the tree must face the road on the hill.
3. Place the rabbit ("rabbit.obj") and the turtle ("turtle.obj").
   - Both of them are at the left front end of the road, under the hill.
   - The rabbit is on the left of the turtle.
   - Both of them are facing the front while slightly turned to face each other.

Non-spatial (logical) tasks
0. The following scene is given.
   - There are two marker cards.
   - Two models of a rabbit in different poses are placed on the first card, and likewise two models of a turtle on the other.
   - A logical object with the function of checking distance is provided.
1. Connect the "position" properties of both cards to the "input position 1" and "input position 2" properties of the logical object.
2. Connect the "far" property of the logical object to the "visible" property of the rabbit standing upright.
3. Do the same for the turtle standing upright.
4. Connect the "near" property of the logical object to the "visible" property of the rabbit with its hands up.
5. Do the same for the turtle with its hands up.
6. Change the value of the "threshold" property of the logical object to 100.
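As a concrete illustration, the non-spatial specification above amounts to writing one property value and six data-flow links in TARML. The sketch below follows the TARML constructs described in Section 3.3, under the same caveat as before: the element, attribute and object names are assumptions for illustration, not the exact syntax used in the experiment.

  <!-- The two marker cards and the distance-checking logical object -->
  <pobject id="card1" marker="card1.patt" />
  <pobject id="card2" marker="card2.patt" />
  <logical id="dist" type="distance_check">
    <property name="threshold" value="100" />  <!-- step 6 -->
  </logical>

  <!-- Step 1: feed both card positions into the logical object -->
  <link from="card1.position" to="dist.input_position1" />
  <link from="card2.position" to="dist.input_position2" />

  <!-- Steps 2-5: the far/near outputs toggle the visibility of the
       upright and hands-up poses of the rabbit and the turtle -->
  <link from="dist.far" to="rabbit_upright.visible" />
  <link from="dist.far" to="turtle_upright.visible" />
  <link from="dist.near" to="rabbit_hands_up.visible" />
  <link from="dist.near" to="turtle_hands_up.visible" />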

Table 5. Experimental procedure

Training session (1 h)
- Subject feature survey questionnaire (5 min)
- Instructions: overview of tangible AR content (10 min)
- Instructions: tutorial on TARML (5 min)
- Practice: TARML (15 min)
- Instructions: tutorial on iaTAR (5 min)
- Practice: iaTAR (15 min)
- Post-training survey questionnaire (5 min)
- Break (5 min)

Experimental session (1.5 h, approx. 90 min)
- A practice trial for both iaTAR and TARML with a particular content specification
- Six trials for both iaTAR and TARML with particular content specifications
- Post-experiment survey questionnaire



The requirement specification sheet was provided and explained by the experimenter at the beginning of each trial. The subject was asked to build the content described in the requirement specification sheet as quickly and accurately as possible. The subjects were allowed to refer to the requirement specification sheet and to ask questions about it whenever needed. In addition, the experimenter periodically read out the requirements to remind the subject what to do next. Such a procedure was needed because, when using the immersive authoring method, the subjects, wearing the HMD, sometimes could not read the specification text due to the low display resolution (this problem could easily have been overcome by increasing the text size or using a higher resolution HMD).

There were six experimental trials, each with a different requirement specification sheet. In each trial, participants used both the immersive and the script editing authoring methods, building the same content twice, once with each authoring interface. Three trials included only spatial tasks, the other three trials only non-spatial tasks. In the trials with spatial tasks, the participants were to build an AR scene (see Table 4), positioning and orienting four virtual objects and scaling one of them. To help the subjects make use of their spatial perception, spatial properties (positions, orientations and scale factors) were described in a relative manner (not by specific values or numbers). In addition, a sample picture of the final scene was shown to help the subject understand and remember the scene to be constructed. In the trials with non-spatial tasks, the subject began with a particular pre-modeled virtual scene. Using the virtual objects in this scene, the subjects were asked to make data-flow links and to change properties to specific values. As with the spatial task specifications, the task amounts were balanced between the non-spatial task specifications; each specification included six data-flow links and one property value to be set (see Table 4).

To measure user performance, the task completion time was recorded for each trial and each authoring method. The number of times the participant referred to the user guide and the number of times the participant got lost with the interface were also counted. The period of time the subject was lost (looking at the user guide for more than a minute and/or not knowing what to do) was subtracted from the task completion time, as already mentioned.



Subjective measurements were also collected with questionnaires at the end of the training and experimental sessions. At the end of the training session, the participants were asked to rate how easy each authoring method was to learn. At the end of the experimental session, they were asked to rate how easy each authoring method was to use, and how confident they were with the content they had built using each authoring method. Ratings were given on a 7-point Likert scale (0: very difficult/unconfident, 3: neutral, 6: very easy/confident). Other subjective opinions, such as user preference and the strengths and weaknesses of the authoring methods, were also collected.

4.3. Experimental setup

The experimental environments for the script editing and immersive authoring are shown in Fig. 10. A desktop computing environment (with a 2D display, keyboard and mouse) was provided for the script editing method. Because the target content was AR-based, a camera was set up on a fixture stand for testing the content script. The user was allowed to change the camera location freely if needed. For the immersive authoring configuration we used the iaTAR system. The system consisted of an HMD with a camera attached, providing video see-through functionality. Tangible props were used as authoring interfaces.

Twelve graduate/undergraduate students with engineering backgrounds participated in the experiment. The ages of the participants ranged from 19 to 26, and they all had sufficient typing skill for editing the scripts (an average of 242 characters per minute). Half of the participants were taking the VR class at the computer science department and thus had brief (approximately 3 months) experience in VR system development. The other half had no experience in VR system development or 3D graphics programming (but they possessed general programming skills). Table 6 summarizes the subject statistics.

Table 6. Average values for various features of the subject pool

Feature                              All                  With VR background   With no VR background   ANOVA
Number                               12                   6                    6                       –
Gender                               All male             All male             All male                –
Age (years)                          22.2 (SD = 2.04)     22.5 (SD = 1.76)     21.83 (SD = 2.40)       F(1,10) = 0.30, p = 0.5954
Experience with C/C++/JAVA (years)   2.8 (SD = 1.80)      4.3 (SD = 1.03)      1.25 (SD = 0.61)        F(1,10) = 39.57, p < 0.0001
Experience with HTML (years)         2.2 (SD = 2.04)      3 (SD = 2.45)        1.3 (SD = 1.21)         F(1,10) = 2.23, p = 0.1660
Typing skill (chars/min)             242.4 (SD = 78.89)   268.7 (SD = 84.19)   216.2 (SD = 70.35)      F(1,10) = 1.37, p = 0.2683

Table 7. Task completion time

                          iaTAR                      TARML                      ANOVA
Total task                27 m 43 s (SD = 5 m 10 s)  38 m 37 s (SD = 7 m 44 s)  F(1,11) = 37.20, p < 0.0001
Spatial tasks only        12 m 31 s (SD = 2 m 50 s)  25 m 12 s (SD = 4 m 47 s)  F(1,11) = 99.94, p < 0.0001
Non-spatial tasks only    15 m 12 s (SD = 2 m 37 s)  13 m 24 s (SD = 3 m 28 s)  F(1,11) = 6.20, p = 0.0301


4.4. Experimental results

In order to investigate Hypotheses I and II, we conducted a series of one-way within-subjects ANOVAs comparing users' task performance between iaTAR and TARML, all with an alpha level of 0.05. While the two hypotheses could have been investigated with a single two-way ANOVA on task completion time over two independent variables (i.e. the authoring tool and the task type), this was avoided since it would introduce biased results due to the differences in the amounts of task carried out. That is, the task amounts were only balanced within the same task type, so it would be unfair to directly compare task completion times between spatial and non-spatial tasks. As an alternative, we introduced an efficiency factor for the immersive authoring method and compared it over the different task types with a separate one-way ANOVA.

First, to validate the overall efficiency of immersive authoring (Hypothesis I), the total task completion times for authoring all six types of content using each method were compared. The average total authoring time was 27 min 43 s using iaTAR and 38 min 37 s using TARML. The ANOVA revealed a statistically significant difference, with F(1,11) = 37.20 and p < 0.0001 (see Table 7). This represents a time saving of about 28% when iaTAR is used in comparison to TARML. According to this result, we conclude that Hypothesis I is valid.

As described earlier, in order to assess whether iaTAR is indeed more efficient for spatial tasks (Hypothesis II), we compared the efficiency factor of the immersive authoring method according to the task group: spatial and non-spatial. The efficiency factor E of iaTAR (over TARML) for a given task was defined in the following way:

E(task) = T(iaTAR, task) / T(TARML, task)

ckground With no VR background ANOVA

6 –

All male –

1.76) 21.83 (SD ¼ 2.40) F(1,10) ¼ 0.30, p ¼ 0.5954

.03) 1.25 (SD ¼ 0.61) F(1,10) ¼ 39.57, po0.0001

5) 1.3 (SD ¼ 1.21) F(1,10) ¼ 2.23, p ¼ 0.1660

84.19) 216.2 (SD ¼ 70.35) F(1,10) ¼ 1.37, p ¼ 0.2683

TARML ANOVA

38 m 37 s (SD ¼ 7 m 44 s) F(1,11) ¼ 37.20, po0.0001

25 m 12 s (SD ¼ 4 m 47 s) F(1,11) ¼ 99.94, po0.0001

13 m 24 s (SD ¼ 3 m 28 s) F(1,11) ¼ 6.20, p ¼ 0.0301

Page 15: Immersive authoring of Tangible Augmented Reality content: A user study

ARTICLE IN PRESS

G.A. Lee, G.J. Kim / Journal of Visual Languages and Computing 20 (2009) 61–79 75

where T(x, y) is the time spent for completing task y withthe authoring tool x. We applied ANOVA with theefficiency factor for each task group as the dependentvariable. As a result, the efficiency factor of iaTAR overTARML turned out to be significantly greater (F(1,11) ¼115.2647, po0.0001) for the spatial task (mean ¼ 0.5060,SD ¼ 0.1130) than for the non-spatial (mean ¼ 1.1764,SD ¼ 0.2461). Hence, we conclude that the Hypothesis IIis valid as well.
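To make the analysis concrete, the following is a minimal sketch (not the authors' code) of the efficiency-factor comparison in Python. The per-subject times are illustrative placeholders drawn from the Table 7 means and SDs, since the paper reports only group statistics; with two within-subject levels, the one-way within-subjects ANOVA reduces to a paired t-test, with F(1, n-1) = t^2.

import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(0)
n = 12  # subjects

# Placeholder per-subject completion times (seconds), roughly matching
# the Table 7 means/SDs; the real per-subject data are not published.
t_iatar_spatial = rng.normal(751, 170, n)       # iaTAR, spatial (12 m 31 s)
t_tarml_spatial = rng.normal(1512, 287, n)      # TARML, spatial (25 m 12 s)
t_iatar_nonspatial = rng.normal(912, 157, n)    # iaTAR, non-spatial (15 m 12 s)
t_tarml_nonspatial = rng.normal(804, 208, n)    # TARML, non-spatial (13 m 24 s)

# E(task) = T(iaTAR, task) / T(TARML, task), computed per subject.
E_spatial = t_iatar_spatial / t_tarml_spatial
E_nonspatial = t_iatar_nonspatial / t_tarml_nonspatial

# A one-way within-subjects ANOVA with two levels equals a paired t-test.
res = ttest_rel(E_spatial, E_nonspatial)
print(f"F(1,{n - 1}) = {res.statistic ** 2:.2f}, p = {res.pvalue:.4g}")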

Our results so far appear to indicate that iaTAR is more efficient for spatial tasks than for non-spatial tasks. To further confirm this, we investigated whether the amount of time iaTAR spent on non-spatial tasks was significantly more than that spent on spatial tasks: we directly compared the task completion times of iaTAR and TARML for each task group.

First, we compared the task completion time between iaTAR and TARML for the spatial tasks only. As expected, iaTAR clearly took a shorter amount of time for the spatial authoring tasks (see Table 7). The total time spent completing the spatial authoring tasks using TARML (25 min 12 s on average) was approximately twice that of iaTAR (12 min 31 s on average), a significant difference under the ANOVA (F(1,11) = 99.94, p < 0.0001).

However, for the non-spatial authoring tasks, iaTAR turned out to be less efficient than TARML (see Table 7). With iaTAR, subjects took about 13% longer (mean = 15 min 12 s) to complete the non-spatial tasks than with TARML (mean = 13 min 24 s), and the difference was statistically significant (F(1,11) = 6.20, p = 0.0301). While iaTAR took longer on the non-spatial authoring tasks, the difference from TARML was not as large as in the spatial authoring tasks, where iaTAR was twice as fast as TARML, as mentioned earlier. We discuss this result further in Section 5. Fig. 11 shows the total task completion time for each task group in a graph.

Fig. 11. Total task completion time (minutes:seconds).

Table 8. Subjective rating results

                                    iaTAR               TARML               ANOVA
Ease of learning                    4.50 (SD = 1.168)   3.92 (SD = 1.240)   F(1,11) = 4.11, p = 0.0674
Ease of use                         4.08 (SD = 0.996)   4.25 (SD = 1.138)   F(1,11) = 0.17, p = 0.6887
Confidence with authoring results   3.92 (SD = 1.165)   4.50 (SD = 0.905)   F(1,11) = 3.01, p = 0.1106

Ratings on 7-point Likert scales (0: difficult or not confident, 3: neutral, 6: easy or confident).

To assess the usability (i.e., ease of use and learning, namely Hypothesis III), we compared the subjective ratings collected with a questionnaire asking how easy iaTAR or TARML was to learn and use. Based on the subjective ratings (see Table 8), no statistically significant differences were found between iaTAR and TARML (α = 0.05). For instance, iaTAR was, strictly speaking, only marginally easier to learn than TARML (p = 0.0674). However, according to the debriefing results, this appears to be mainly due to the relatively low quality of the devices and sensing accuracy (see Tables 9 and 10) rather than to the method itself. On the other hand, while the syntax of the language might have taken the users some time to learn, the TARML interface itself, standard 2D editing, was already quite familiar to most users.

To further assess Hypothesis III, we compared the number of times subjects referred to the user guide document and the number of times subjects got lost during the authoring task. Subjects referred to the user guide document 3.9 times on average (SD = 3.55) when using TARML, whereas they never needed to do so with iaTAR. Subjects with VR backgrounds used the guide only 1.8 times (SD = 1.33) on average, whereas those with no VR background used it 6 times (SD = 3.95). This demonstrates that there is a mental skill factor involved in using the conventional programming approach. It is desirable to eliminate such an obstacle as much as possible for non-technical content developers such as artists.

Table 9. Debriefing results: strengths and weaknesses of each method

Questions             Answers                                                      No. of participants
Strengths of iaTAR    Easy when placing virtual objects                            7
                      Easy to learn                                                5
                      Can check the results instantly (concurrently)               3
Weaknesses of iaTAR   Coarse object positioning (fine positioning not available)   7
                      Tracking lost                                                3
                      Narrow field of view                                         2
                      Inconvenience in changing property values                    2
                      Undo function unavailable                                    1
Strengths of TARML    Easy to assign exact values to a property                    6
                      Fast when given exact specification values                   4
                      Copy and paste function available                            1
                      Easy syntax                                                  1
Weaknesses of TARML   Need to learn the grammar (not easy to learn)                6
                      Repeated trial-and-error when placing objects                5
                      Need to reload the script after editing                      2
                      Hard to perceive rotations in 3D space                       1

Table 10. Debriefing results: inconveniences of each method

Questions                  Answers                    No. of participants
Inconvenience with iaTAR   Narrow field of view       10
                           Unclear focus with HMD     4
                           Depth perception problem   1
Inconvenience with TARML   Repeated trial-and-error   6
                           3D space perception        1


Another indication of authoring efficiency, or ease of use, can be observed from the number of times the subjects got lost when using the respective authoring methods. Two users (both not from the VR class) each got lost two times when using TARML. They reported that they had gotten lost due to confusion with the TARML grammar and problems with perceiving rotations in 3D space. Table 11 and Fig. 12 show the average task completion time per trial according to the authoring methods for each subject group. No statistically significant difference (α = 0.05) was found between the subject groups with regard to task performance, for both spatial and non-spatial authoring tasks. Thus, this result is another indication that iaTAR is easy enough to learn and use for those without background knowledge of VR technologies and 3D authoring tasks. In addition, for the non-spatial tasks, a statistically significant interaction between the subject group and the authoring method was found (F(1,10) = 16.37, p = 0.0023). From this, we can posit that background knowledge of VR gave an advantage when using the conventional authoring method (i.e., TARML). We believe that this resulted from their prior expertise and experience in content development, especially in dealing with 3D transformations and describing them with a computer language (such as scripts). Although it did not reach statistical significance (F(1,10) = 4.45, p = 0.0611), a similar trend was found for the spatial tasks as well.
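The interaction just described comes from a mixed design (between-subjects group, within-subjects authoring method). As a rough illustration of how such a test can be run, here is a minimal sketch (not the authors' code) using the pingouin library, with placeholder per-subject times loosely based on the Table 11 means and SDs:

import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(1)
# Placeholder per-trial completion times (seconds) for the non-spatial
# tasks, loosely following Table 11; the real per-subject data are not
# published in the paper. First 6 subjects: VR class; last 6: non-VR.
t_iatar = np.concatenate([rng.normal(308, 85, 6), rng.normal(300, 81, 6)])
t_tarml = np.concatenate([rng.normal(235, 82, 6), rng.normal(301, 94, 6)])

# Long format: one row per subject x method combination.
df = pd.DataFrame({
    "subject": np.tile(np.arange(12), 2),
    "group": np.tile(["VR"] * 6 + ["nonVR"] * 6, 2),
    "method": np.repeat(["iaTAR", "TARML"], 12),
    "time": np.concatenate([t_iatar, t_tarml]),
})

# Mixed-design ANOVA; the 'Interaction' row tests group x method.
aov = pg.mixed_anova(data=df, dv="time", within="method",
                     subject="subject", between="group")
print(aov[["Source", "F", "p-unc"]])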

5. Discussion

According to the results of the user experiment, we can conclude that there is significant evidence for Hypotheses I and II. That is, using the immersive authoring tool was efficient in comparison to using TARML, with approximately a 28% reduction in the time to complete the given authoring tasks. This was due to the fact that immersive authoring yielded twice the performance of the conventional authoring method for spatial tasks, while exhibiting comparable performance for non-spatial tasks. Also note that, in actual use of the conventional desktop authoring method, the user would have to change to an ‘‘execution’’ mode and ‘‘wear’’ the system (e.g., put on the HMD and the associated sensors) to test the resulting content, a time-consuming and bothersome process. This factor was not even included in assessing the non-immersive authoring time. Thus, it can be predicted that the overall authoring and testing time will be much shorter with immersive authoring, even for the non-spatial tasks.

Finally, iaTAR demonstrated no marked difference in ease of learning and use in comparison with TARML editing, which employs one of the most familiar interfaces, 2D editing (a similar argument can be extended to 2D GUIs). Although the subjective ratings showed no statistical difference between the two, the number of times participants referred to the user guide document, and the number of times participants got lost during the authoring task, provide indications of higher usability for iaTAR even for non-logical tasks.


Table 11. The average task completion time per trial according to the authoring tools for each subject group

Non-spatial tasks   VR class                   Non-VR class
iaTAR               5 m 8 s (SD = 1 m 25 s)    5 m 0 s (SD = 1 m 21 s)
TARML               3 m 55 s (SD = 1 m 22 s)   5 m 1 s (SD = 1 m 34 s)

Spatial tasks       VR class                   Non-VR class
iaTAR               4 m 12 s (SD = 1 m 1 s)    4 m 9 s (SD = 1 m 27 s)
TARML               7 m 39 s (SD = 2 m 3 s)    9 m 9 s (SD = 1 m 41 s)

Fig. 12. Average task completion time between participant groups (minutes:seconds).

Fig. 13. An ideal authoring tool for immersive content. Depending on the object types and behaviors prevalent in the content, the user might choose to use the 2D or 3D immersive mode. Spatially dynamic content will be more efficiently authored using the 3D immersive mode, and vice versa.


Noting that most of the inconvenience in using iaTAR was caused by imperfect devices (e.g., wired sensors, a low-resolution HMD), we anticipate that the usability of iaTAR will be significantly improved with higher quality equipment.

Subject debriefing also reflected the preceding analysis results. Participants were asked to freely write down their opinions on the strengths and weaknesses of each method. The reported opinions are summarized in Tables 9 and 10. Although these informal results do not have statistical significance, they raise important issues, in the context of the quantitative results, to be considered when designing and evaluating immersive authoring systems. Most subjects reported that the ease of configuring objects in 3D space was the foremost merit of iaTAR, and that the ability to specify exact values was the foremost merit of TARML. The users of iaTAR complained about the standard AR-related usability problems, such as the narrow field of view, degraded depth perception, the difficulty of continuous tracking and the inability to specify exact values. Consequently, the subjects expressed their preferences in a specific manner: they preferred iaTAR for spatial authoring tasks, and TARML for the non-spatial and fine-tuning tasks. At the end of the questionnaire, the participants were asked for their preferences about the authoring methods. All the participants preferred to have both methods available in the authoring system.

While the authoring of most TAR content will involve both spatial and non-spatial (e.g., logical/discrete) tasks, their relative proportion will differ every time, depending on the particular target content. In addition, it is often possible to metaphorically convert logical tasks to spatial tasks, which is one of the strengths and purposes of the visual programming approach. The programming by demonstration technique [35] is also a very good example. One can envision an authoring tool incorporating both immersive and desktop features, which can be used in a flexible manner depending on the nature of the target application/content. That is, if the target content is composed of many 3D spatial behaviors, the system can be used in an immersive manner (2D/logical tasks are difficult to perform in immersion). Conversely, on a 2D desktop, 3D tasks are performed either from a limited third-person viewpoint, or by intermittently donning the HMD and immersing oneself (see Fig. 13). An alternative is to initially perform the design/implementation of the content, with the spatial aspect only vaguely specified, using text input and a 2D GUI, and then complete the final design and validation in executable mode within a larger content development process. Fig. 14 illustrates the case in which different types of authoring tools are used during the development process, e.g. 2D GUI/editing for initial rough design and detailed logical object behaviors, and immersive tools for final adjustment, spatial behaviors and validation.

Fig. 14. Using different tools during the development process, e.g. 2D GUI/editing for initial vague design and detailed logical object behaviors, and immersive tools for final adjustment, spatial behaviors and validation.
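As a concrete, if simplistic, illustration of this division of labor, the following toy sketch (our own illustration, not part of iaTAR or TARML) routes an authoring task to the mode the study found more efficient for it: the 3D immersive mode for spatial tasks, and the 2D desktop mode for logical tasks and exact-value fine-tuning.

from enum import Enum

class Mode(Enum):
    DESKTOP_2D = "2D GUI/editing (TARML-style)"
    IMMERSIVE_3D = "3D immersive (iaTAR-style)"

def recommend_mode(is_spatial: bool, needs_exact_values: bool) -> Mode:
    # Debriefing result: exact property values were easier to set in
    # TARML, so fine-tuning goes to the desktop even for spatial content.
    if needs_exact_values:
        return Mode.DESKTOP_2D
    return Mode.IMMERSIVE_3D if is_spatial else Mode.DESKTOP_2D

# Example: rough spatial placement is immersive; final tuning is 2D.
print(recommend_mode(is_spatial=True, needs_exact_values=False))
print(recommend_mode(is_spatial=True, needs_exact_values=True))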

6. Conclusion and future work

In this paper, we proposed the concept of immersive authoring for TAR content. The main purpose of immersive authoring is to unify the development and executable environments and to improve authoring efficiency. This was predicted to be the case because immersive content, such as TAR content, unlike other software, requires in-situ evaluation with respect to human usability and perception. We implemented a prototype immersive authoring system called iaTAR. We also presented a usability experiment to further support the advantages that can be gained with immersive authoring.

TAR content authoring involves a variety of types of tasks. Immersive authoring is advantageous in general, because many of the object behaviors involve 3D spatial tasks, such as scene configuration, motion specification, and motion coordination. Note that with emerging mobile mixed reality content, the importance of immersive tools will also increase. However, this will only be the case provided that 2D interaction can be accomplished efficiently in the immersive environment, which has not yet been achieved with current devices. Thus, at present, we propose that an ideal TAR authoring system should include both types of functionality, a 2D editing mode and a 3D immersive mode, each selected depending on the nature of the target content and the phase of the content development process. Such a tool will be ideal for effective communication and collaboration between non-technical artists and programmers during content development.


There are still many roadblocks to making immersive authoring more practical. Aside from problems with the devices (such as cost, accuracy and usability, which, however, are constantly improving), a comprehensive authoring system must be able to incorporate and reuse various types of virtual objects and behavior models. Interaction and system issues for collaborative immersive authoring are an interesting future research direction. Despite these roadblocks, we believe that immersive authoring is one of the keys to making AR technology develop into the mainstream of future media and human computer interfaces.

Acknowledgments

This research was in part supported by the ‘‘Teukjung Gicho’’ program of the Korea Science Foundation (Grant no. R01-2006-000-11142-0).

Appendix A. Supporting Information

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.jvlc.2008.07.001.

References

[1] R. Stiles, M. Pontecorvo, Lingua graphica: a visual language for virtual environments, in: IEEE Workshop on Visual Languages, 1992, pp. 225–227.

[2] J. Foley, A. van Dam, S. Feiner, J. Hughes, Computer Graphics: Principles and Practice, second ed., Addison-Wesley, Reading, USA, 1990.

[3] G.A. Lee, G.J. Kim, M. Billinghurst, Immersive authoring: what you experience is what you get, Communications of the ACM 48 (7) (2005) 76–81.

[4] J. Butterworth, A. Davidson, S. Hench, T.M. Olano, 3DM: a three dimensional modeler using a head-mounted display, in: Proceedings of the Symposium on Interactive 3D Graphics, 1992, pp. 135–138.

[5] J. Liang, M. Green, JDCAD: a highly interactive 3D modeling system, Computers & Graphics 18 (4) (1994) 499–506.

[6] M.R. Mine, ISAAC: a virtual environment tool for the interactive construction of virtual worlds, Technical Report CS TR95-020, Department of Computer Science, UNC, Chapel Hill, 1995.

[7] G. Wesche, H. Seidel, FreeDrawer: a free-form sketching system on the responsive workbench, in: Proceedings of Virtual Reality Software & Technology, 2001, pp. 167–174.

[8] I. Poupyrev, D.S. Tan, M. Billinghurst, H. Kato, H. Regenbrecht, N. Tetsutani, Tiles: a mixed reality authoring interface, in: Proceedings of INTERACT, 2001, pp. 334–341.

[9] H. Kato, M. Billinghurst, I. Poupyrev, K. Imamoto, K. Tachibana, Virtual object manipulation on a table-top AR environment, in: Proceedings of the International Symposium on Augmented Reality, 2000, pp. 111–119.

[10] W. Piekarski, B.H. Thomas, Augmented reality working planes: a foundation for action and construction at a distance, in: Proceedings of the International Symposium on Mixed and Augmented Reality, 2004.

[11] A. Steed, M. Slater, A dataflow representation for defining behaviours within virtual environments, in: Proceedings of the Virtual Reality Annual International Symposium, 1996, pp. 163–167.

[12] ARToolKit, 2007, <http://www.hitl.washington.edu/artoolkit>.

[13] Open Computer Vision Library (OpenCV), 2006, <http://sourceforge.net/projects/opencvlibrary/>.

[14] D. Schmalstieg, A. Fuhrmann, G. Hesina, Z. Szalavari, L.M. Encarnacao, M. Gervautz, W. Purgathofer, The Studierstube augmented reality project, Presence: Teleoperators and Virtual Environments 11 (1) (2002) 33–54.

[15] F. Ledermann, D. Schmalstieg, APRIL: a high-level framework for creating augmented reality presentations, in: Proceedings of IEEE Virtual Reality, 2005.

[16] A. Vitzthum, SSIML/AR: a visual language for the abstract specification of AR user interfaces, in: Proceedings of 3DUI, 2006, pp. 135–142.

[17] R. Berry, N. Hikawa, M. Makino, M. Suzuki, T. Furuya, Authoring augmented reality: a code-free approach, in: Proceedings of ACM SIGGRAPH, August, 2004, pp. 8–12.

[18] C. Greenhalgh, S. Izadi, J. Mathrick, J. Humble, I. Taylor, A toolkit to support rapid construction of ubicomp environments, in: Proceedings of UbiSys, 2004.

[19] P. Grimm, M. Haller, V. Paelke, S. Reinhold, C. Reimann, J. Zauner, AMIRE: authoring mixed reality, in: The First IEEE International Augmented Reality Toolkit Workshop, 2002.

[20] S. Guven, S. Feiner, Authoring 3D hypermedia for wearable augmented and virtual reality, in: Proceedings of the Seventh IEEE International Symposium on Wearable Computers, 2003.

[21] B. MacIntyre, M. Gandy, S. Dow, J. Bolter, DART: a toolkit for rapid design exploration of augmented reality experiences, in: Proceedings of the 2004 ACM Symposium on User Interface Software and Technology, 2004, pp. 197–206.

[22] B. Shneiderman, C. Plaisant, Designing the User Interface, fourth ed., Addison-Wesley, Reading, MA, 2005.

[23] M. Haller, M. Billinghurst, B. Thomas, Interaction design for Tangible Augmented Reality applications, in: Emerging Technologies of Augmented Reality: Interface and Design, 2006, pp. 261–282 (Chapter XIII).

[24] M. Billinghurst, H. Kato, I. Poupyrev, The MagicBook: moving seamlessly between reality and virtuality, IEEE Computer Graphics and Applications 21 (3) (2001) 6–8.

[25] D.A. Bowman, L.F. Hodges, User Interface Constraints for Immersive Virtual Environment Applications, Technical Report GIT-GVU-95-26, Graphics, Visualization, and Usability Center, Georgia Institute of Technology, 1995.

[26] VRML Specification, 1997, <http://www.web3d.org/vrml/vrml.htm>.

[27] G.J. Kim, K. Kang, H. Kim, J. Lee, Software engineering of virtual worlds, in: Proceedings of Virtual Reality Software & Technology, 1998, pp. 131–139.

[28] R. Pausch, T. Burnette, A.C. Capehart, M. Conway, D. Cosgrove, R. DeLine, J. Durbin, R. Gossweiler, S. Koga, J. White, Alice: rapid prototyping for virtual reality, IEEE Computer Graphics and Applications 15 (3) (1995) 8–11.

[29] A. Rizzo, J. Buckwalter, T. Bowerly, S. Yeh, J. Hwang, M. Thiebaux, G.J. Kim, Virtual reality applications for assessment and rehabilitation of cognitive and motor processes, in: 15th Congress of the International Society of Electrophysiology & Kinesiology, 2004.

[30] HITLabNZ Projects, 2007, <http://www.hitlabnz.org/index.php?page=projects>.

[31] J. Seo, N. Kim, G.J. Kim, Designing interactions for augmented reality based educational contents, in: Proceedings of Edutainment, Lecture Notes in Computer Science, vol. 3942, 2006, pp. 1187–1196.

[32] A. Cheok, K. Goh, W. Liu, F. Farbiz, S. Fong, S. Teo, Y. Li, X. Yang, Human Pacman: a mobile, wide-area entertainment system based on physical, social, and ubiquitous computing, Personal and Ubiquitous Computing 8 (2) (2004).

[33] G.A. Lee, M. Billinghurst, G.J. Kim, Occlusion based interaction methods for Tangible Augmented Reality environments, in: Proceedings of the ACM SIGGRAPH International Conference on Virtual Reality Continuum and its Applications in Industry, 2004.

[34] G.A. Lee, C. Nelles, M. Billinghurst, G.J. Kim, Immersive authoring of Tangible Augmented Reality applications, in: Proceedings of the IEEE and ACM International Symposium on Mixed and Augmented Reality, 2005, pp. 172–181.

[35] G.A. Lee, G.J. Kim, C.-M. Park, Modeling virtual object behavior within virtual environment, in: Proceedings of the ACM Symposium on Virtual Reality Software and Technology (VRST), Hong Kong, China, November 11–13, 2002, pp. 41–48.