Worldlets - 3D Thumbnails for Wayfinding in Virtual ... · La Jolla, CA 92093-0515, USA ABSTRACT Virtual environment landmarks are essential in wayfinding: ... guidebook. A virtual

Worldlets - 3D Thumbnails forWayfinding in Virtual Environments

T. Todd Elvins David R. [email protected] [email protected]

San Diego Supercomputer CenterP.O. Box 85608

San Diego, CA 92186-9784, USA

David [email protected]

University of California, San Diego9500 Gilman Drive

La Jolla, CA 92093-0515, USA

ABSTRACTVirtual environment landmarks are essential in wayfinding:they anchor routes through a region and provide memorabledestinations to return to later. Current virtual environmentbrowsers provide user interface menus that characterizeavailable travel destinations via landmark textualdescriptions or thumbnail images. Such characterizationslack the depth cues and context needed to reliably recognize3D landmarks. This paper introduces a new user interfaceaffordance that captures a 3D representation of a virtualenvironment landmark into a 3D thumbnail, called aworldlet. Each worldlet is a miniature virtual worldfragment that may be interactively viewed in 3D, enabling atraveler to gain first-person experience with a traveldestination. In a pilot study conducted to compare textual,image, and worldlet landmark representations within awayfinding task, worldlet use significantly reduced theoverall travel time and distance traversed, virtuallyeliminating unnecessary backtracking.

KEYWORDS: 3D thumbnails, wayfinding, VRML, virtualreality.

INTRODUCTIONWayfinding is “the ability to find a way to a particularlocation in an expedient manner and to recognize thedestination when reached” [13]. Travelers find their wayusing survey, procedural, and landmark knowledge [5, 13,14, 9]. Each type of knowledge helps the traveler constructa cognitive map of a region and thereafter navigate usingthat map [10, 11].

Survey knowledge provides a map-like, bird’s eye view of aregion and contains spatial information including locations,orientations, and sizes of regional features. Proceduralknowledge characterizes a region by memorized sequencesof actions that construct routes to desired destinations.Landmark knowledge records the visual features oflandmarks, including their 3D shape, size, texture, etc. [2,9]. For a structure to be a landmark, it must have highimagability: it must be distinctive and memorable [10].

Landmarks are the subject of landmark knowledge, but alsoplay a part in survey and procedural knowledge. In surveyknowledge, landmarks provide regional anchors with whichto calibrate distances and directions. In proceduralknowledge, landmarks mark decision points along a route,helping in the recall of procedures to get to and fromdestinations of interest. Overall, landmarks help tostructure an environment and provide directional cues tofacilitate wayfinding.

Landmarks also influence the search strategies used bytravelers. With no a priori knowledge of a destination’slocation, a traveler is forced to use a naive, exhaustivesearch of the region. Landmarks provide directional cueswith which to steer such a naive search. In a primed search,the traveler knows the destination’s location and can movethere directly, navigating by survey, procedural, andlandmark knowledge. In practice, travelers use acombination of naive and primed searches. The location ofa curio shop, for instance, may be recalled as “near thecathedral,” enabling the traveler to use a primed search tothe cathedral landmark, then a bounded naive search in thecathedral’s vicinity to find the curio shop.

In city planning, the legibility of an environmentcharacterizes “the ease with which its parts can berecognized and can be organized into a coherent pattern”[10]. Legibility expresses the ease with which a travelermay gain wayfinding knowledge and later apply thatknowledge to search for and reach a destination. Forinstance, a city with distinctive landmarks, a clear citystructure (such as a street grid) and well-markedthoroughfares is legible.

In virtual environment design, the use of landmarks andstructure is essential in establishing an environment’slegibility. In a virtual environment lacking a structuralframework and directional cues, such as landmarks,travelers easily become disoriented and are unable to searchfor destinations or construct an accurate cognitive map ofthe region [5]. Such a virtual environment is illegible.

Real and virtual world travel guidebooks describe availablelandmarks and tourist attractions, highlighting regionalfeatures that enhance the environment’s legibility.Guidebook descriptions facilitate wayfinding by priming atraveler’s cognitive map with landmark knowledge,preparing them for exploration of the actual environment.

Similar to travel guidebooks, virtual environment browsersfacilitate wayfinding by providing menus of availabledestinations. Selection of a menu item “jumps” the travelerto the destination, providing them a short-cut to a point ofinterest. Systematic exploration of all destinations listed ona menu enables a traveler to learn an environment andprime their cognitive map with landmark knowledge.

Whereas a traveler’s landmark knowledge characterizes adestination by its 3D shape, size, texture, and so forth,browser menus and guidebooks characterize destinations bytextual descriptions or images. This representationmismatch reduces the effectiveness of destination menusand guidebooks. Unable to engage their memory of 3Dlandmarks to recognize destinations of interest, travelersmay resort to a naive, exhaustive search to find a desiredlandmark.

This paper introduces a user interface affordance to increasethe effectiveness of landmark menus and guidebooks. Thisaffordance, called a worldlet, reduces the mismatchbetween a traveler’s landmark knowledge and the landmarkrepresentation used in menus and guidebooks.

LANDMARK REPRESENTATION LEGIBILITYAnalogous to virtual environment legibility, the legibility ofa landmark representation technique expresses the ease withwhich it may be used to facilitate wayfinding. As a basis forcomparing landmark representations, we propose thefollowing legibility criteria:

• imagability: A landmark representation has goodimagability if it provides a faithful rendition of alandmark, preserving the landmark’s own imagability.Key landmark features recorded within a traveler’slandmark knowledge, such as 3D shape, size, andtexture, should be expressed in the landmarkrepresentation.

• landmark context: In addition to the landmark itself, alandmark representation should include portions of thesurrounding area. Such context supplies additionalvisual cues and enables a person to understand thelarger configuration of the environment [6, 7, 13].

• traveler context: Where landmark context expressesthe relationship between a landmark and itssurroundings, traveler context expresses therelationship between the landmark and the traveler.Travelers are better at recognizing a landmark when itis viewed from the direction in which they firstencountered it along a route [1]. Traveler contextexpresses this notion of an expected view of alandmark, such as a view of a prominent skyscraperfrom street level.

• multiple vantage points: While traveler contextprovides a typical vantage point of a landmark,additional vantage points enable a more comprehensiveunderstanding of a landmark and its context [10].

In addition to satisfying these criteria, a good landmarkrepresentation technique should be efficient to implementand have broad applicability.

RELATED WORKLandmark representations are used to characterizedestinations listed within the user interface of virtualenvironment browsers and within virtual environmentsthemselves. A browser may, for instance, list availabledestinations within a pull-down menu or in an on-line travelguidebook. A virtual environment may provide clickableanchor shapes distributed throughout the environment.Clicking on a door anchor shape in a virtual room, forinstance, may select and load a new virtual environmentpresumed to be behind the door.

Landmark representation use may be classified into twobroad categories:

• World selection: A virtual world is an independentlyloadable destination environment with its own shapes,lights, structural layout, and internal design themes.Browser world menus, guidebooks, or virtualenvironment anchors provide a selection of destinationworlds that, when clicked upon, load the selected worldinto the traveler’s browser.

• Viewpoint selection: A viewpoint is a preferredvantage point within the currently viewed virtualenvironment. Viewpoints are characterized by aposition and orientation. Browser viewpoint menus,guidebooks, or virtual environment anchors provide aselection of vantage points that, when clicked upon,jump the traveler to the selected destination.

Using the landmark representation legibility criteria above,we consider each of several representation techniques usedfor browser destination menus and guidebooks, or in virtualenvironments themselves.

Textual DescriptionsTextual descriptions are the dominant method used torepresent virtual environment landmarks in viewpoint andworld selection user interfaces. HTML pages, for instance,often provide lists of available Web-based virtualenvironments (such as those authored in VRML, the VirtualReality Modeling Language [3]), each one characterized bya URL, an environment name, and/or a brief description.Within VRML worlds, textual descriptions characterizeviewpoints and describe destinations associated withclickable anchor shapes.

In terms of our landmark representation legibility criteria,textual descriptions provide poor imagability, landmarkcontext, traveler context, and support for multiple vantagepoints. The subjective, and often brief nature of textualdescriptions limits their ability to express important visualcharacteristics of a landmark and its context. The complex3D shape of a distinctive building, for instance, may bedifficult to describe. The 3D position of a traveler in

relation to a landmark is often omitted from textualdescriptions, providing little support for traveler context.When traveler context is present in a textual description, itcharacterizes the author’s traveler context, and notnecessarily that of other travelers. Finally, the need to keeptextual descriptions relatively brief prevents a descriptionfrom providing descriptions for more than a few vantagepoints. Overall, textual descriptions provide a relativelyillegible form of landmark representation.

Images and IconsClickable icons, thumbnail images, and image maps providecommon visual wayfinding aids. In a 3D context, gamesoften provide “jump gates” onto which images of remotedestinations are texture mapped. Stepping through such agate jumps the traveler to the destination depicted on thegate.

In terms of our legibility criteria, images provide improvedimagability, landmark context, and traveler context,compared to textual descriptions, but do not supportmultiple vantage points. An image capturing a canonicalview of a landmark can show important visual detailsdifficult to describe textually. For complex 3D landmarks,or for landmarks placed in complex contexts, a single imagemay be insufficient. Overall, image-based descriptionsprovide an improved, but somewhat limited form oflandmark representation.

Image MosaicsAn image mosaic groups together multiple captured imagesinto a traversable structure. Apple’s QuickTime VR, forinstance, can use images captured from multiple viewingangles at the same viewing position [4]. By orderingimages within a traveler-centered cylindrical structure,QuickTime VR can provide a traveler the ability to look inany direction through automatic selection of an appropriateimage from the structure. By chaining multiple mosaicstructures together, the content author can create a walk-through path that hops from vantage point to vantage point.Similar image mosaics can be used to create zoom paths,pan paths, and so forth.

Using our landmark representation legibility criteria, theinclusion of multiple images within an image mosaicimproves imagability, landmark context, and travelercontext compared to that of a single image. Mosaics alsooffer multiple vantage points, but only those authored intothe mosaic structure. In a typical use, a QuickTime VRcylindrical mosaic provides multiple viewing angles, butonly a single viewing position. Such a mosaic structuremay not provide sufficient depth information to facilitaterecognition of complex 3D environments. Overall, mosaic-based descriptions provide increased landmarkrepresentation legibility, but are still limited in the range ofvantage points they support.

Miniature Worlds and MapsMost 3D environment browsers enable the traveler to zoomout and view the world in miniature, thereby gaining survey

knowledge. Stoakley et al extend this notion by creating aworld in miniature (or WIM) embedded within the mainworld [15, 12]. The miniature world duplicates all elementsof the main world and adds an icon denoting the traveler’sposition and orientation. Held within the traveler’s virtualhand, the traveler can reach into the miniature andreposition world content or themselves. Simultaneously,the outer main world is updated to match the alteredminiature, automatically adjusting the positions of shapes,or the traveler.

Similarly, 2D and 3D maps are frequently found asnavigation aids within virtual environments. 3D games, forinstance, often provide a 3D reduced-detail map in which anicon denotes the player’s location. Such maps can bepanned, zoomed, and rotated to provide alternate vantagepoints similar to that possible with miniature worlds.

Using our legibility criteria, miniature worlds and 3D mapsdo a good job of supporting imagability, landmark context,and multiple vantage points. Complex 3D landmarks, andtheir context, are accurately represented. The dominant useof a bird’s eye view of the miniature or map, however,somewhat limits the range of vantage points available andreduces support for traveler context. For instance, alandmark typically viewed and recognized at street levelmay be unrecognizable when viewed in a miniature fromabove.

The WIM approach is primarily designed to support a mapview of a region within an emersive environment. Thisspecial-purpose implementation has a few drawbacks. AWIM is held within the traveler’s virtual hand, occupyingspace in the main world and moving as the traveler moves.This implementation doubles the world’s rendering timeand requires that the traveler maintain adequate space infront of them to avoid collision between the WIM and mainworld features.

Additionally, the presence of the WIM within the mainworld may clash visually, affecting the environment’sstylistic integrity. A WIM of a mountain landscapehovering within the cockpit of a virtual aircraft simulator,for instance, would look out of place.

WIMs appear best suited within bounded environments,such as virtual rooms with walls and floors. In an unboundenvironment, such as one for a galaxy simulation, thesimilarly unbounded miniature may be indistinct andbecome easily lost in the background of the main world inwhich it hovers.

Overall, a miniature 3D representation of a virtual worldlandmark provides improved legibility over that availablewith textual descriptions, images, or image mosaics. WIMsillustrate a special-purpose approach to using 3Drepresentations within an emersive environment. This paperintroduces a general-purpose technique for creating 3Dlandmark representations.

WORLDLETSA worldlet is a 3D analog to a traditional 2D thumbnailimage or photograph. Like a photograph, a worldlet isassociated with a viewing position and orientation within aworld. Whereas a photograph captures the view of theworld as projected onto a 2D film plane, a worldlet capturesthe set of 3D shapes falling within the viewpoint’s viewingvolume. Where a photograph clips away shapes that projectoff the edges of the film, a worldlet clips away shapes thatfall outside of the viewing volume.

Like a thumbnail image, a worldlet provides a reduced-detail representation of larger content. Whereas athumbnail image reduces detail by down-sampling, theworldlet reduces detail by clipping away shapes outside of aviewing volume.

In typical use, the worldlet’s viewpoint is aimed at animportant landmark, and the worldlet’s captured shapesreconstruct that landmark and its associated context. Whenviewed within an interactive 3D browser, a worldletprovides a manipulatable 3D thumbnail representation ofthe landmark.

We have developed two types of worldlets:

• A frustum worldlet contains shapes within a standardpie-shaped viewing frustum, positioned and orientedbased upon a selected viewpoint. When viewed, afrustum worldlet looks like a pie-shaped fragmentclipped from the larger world.

• A spherical worldlet contains shapes within a sphericalviewing bubble, positioned at a selected viewpoint witha 360 degree field of view. When displayed, aspherical worldlet looks like a ball-shaped worldfragment, similar to a snow globe knick-knack.

For both worldlet types, hither and yon clipping planesrestrict the extent of the worldlet, insuring that the worldletcontains a manageable subset of the larger world. Worldletshape content is pre-shaded and pre-textured to match thecorresponding shapes in the main world. Though the mainworld may have content that changes over time, thecaptured worldlet remains static, recording the content ofthe world at the time the worldlet was captured.

Figure 1 shows a virtual city containing buildings,monuments, streets, stop lights, and so forth. Figure 1ashows the world from a viewpoint aimed at a landmark.Figure 1b shows a bird’s eye view highlighting the portionof the world falling within the viewing frustum anchored atthe viewpoint in Figure 1a. Figures 1c through 1f showseveral views of the same frustum worldlet captured fromthis viewpoint.

Figure 2a provides a bird’s eye view of the same virtualcity, highlighting a spherical portion of the world fallingwithin a viewing sphere anchored at a viewpoint. Figure 2bshows a spherical worldlet captured at the viewpoint.

(a) (b)

(c) (d)

(e) (f)

Figure 1: A virtual city landmark (a) viewed from avantage point, (b) showing the viewing frustum fromabove, and (c-f) captured within a frustum worldlet.

(a) (b)

Figure 2: A virtual city landmark (a) showing aviewing bubble from above, and (b) captured within aspherical worldlet.

Using our landmark representation legibility criteria, aworldlet provides good imagability, landmark context,traveler context, and support for multiple vantage points.The 3D content of a worldlet preserves a landmark’s 3Dshape, size, and texture, facilitating a traveler’s use oflandmark knowledge to recognize a destination of interest.The frustum or spherical capture area of a worldlet insuresthat landmark context is included along with a landmark.

To support a notion of traveler context, a worldlet istypically captured from a traveler-defined vantage point,such as street level within a virtual city. The traveler-defined vantage point insures that the landmarkrepresentation expresses what the traveler saw, while the 3Dnature of the worldlet enables the traveler to interactivelyexplore multiple additional vantage points.

WORLDLETS IN THE USER INTERFACEWe have incorporated worldlets into the user interface for aVRML browser. The browser provides features to selectamong world viewpoints and among previously visitedworlds on the browser’s history list.

Selecting ViewpointsTraditional VRML browsers provide a viewpoint menuoffering a choice of viewpoints, each denoted by a brieftextual description. We have extended this standard featureto provide three experimental viewpoint selectioninterfaces, each using worldlets. All three present a set ofworldlets, one for each author-selected viewpoint in theworld. The browser also supports on-the-fly capture ofworldlets using the traveler’s current viewpoint.

• The viewpoint list window provides a list of worldletsbeside a worldlet viewer. Selection of a worldlet fromthe list displays the worldlet in the viewer where it maybe interactively panned, zoomed, and rotated. A “Goto” button flies the main window’s viewpoint to thatassociated with the currently selected worldlet.

• The viewpoint guidebook window presents a grid ofworldlet viewers, arranged to form a guidebook photo-album page. Buttons on the window advance theguidebook forward or back a page at a time. Selectionof any worldlet on the page enables it to beinteractively examined. A “Go to” button flies themain window’s viewpoint to that of the currentlyselected worldlet. Figure 3 shows the viewpointguidebook window.

Figure 3: The viewpoint guidebook window.

• The viewpoint overlay window enables the traveler toselect a worldlet from a list, and overlay it atop themain window, highlighted in green. This worldletoverlay provides a clear indication of the worldlet’sviewpoint position and orientation, along with theportion of the world captured within that worldlet.Figures 1b and 2a, shown earlier, were each generatedusing this overlay technique.

Selecting WorldsTraditional VRML browsers provide a history list ofrecently visited worlds, each denoted by its title or URL.We have extended this standard feature to provide twoworld selection interfaces, each using worldlets.

• The world list window provides a list of worldletsbeside an interactive worldlet viewer, similar to theviewpoint list window discussed earlier. One worldletis available for each world on the browser’s history list.A “Go to” button loads into the main window the worldassociated with the currently selected worldlet.

• The world guidebook window uses the same guidebookphoto-album layout used for the viewpoint guidebookwindow discussed earlier. One worldlet is available foreach world on the history list. A “Go to” button loadsthe world associated with the currently selectedworldlet. Figure 4 shows the world guidebookwindow.

Figure 4: The world guidebook window.

Creating Worlds of WorldletsA “Save as” feature of the VRML browser enables thetraveler to save a worldlet to a VRML file. Using acollection of saved worldlets, a world author can create aVRML world of worldlets. Such a world acts like a 3Ddestination index, similar to a shelf full of snow globeknick-knacks depicting favorite tourist attractions. Whencast as a VRML anchor shape, a worldlet provides a 3D“button” that, when clicked upon, loads the associatedworld into the traveler’s browser

Figure 5 shows such a world of clickable worldlets. Figure5a shows a close-up view of a world “doorway” and a nichecontaining a worldlet illustrating a vantage point in thatworld. Figure 5b shows a wider view of the same worldand multiple such doorways.

(a) (b)

Figure 5: A world of worldlets that (a) associates aworldlet with each doorway (b) in an environmentcontaining multiple such doorways. Each doorwayleads to a different world.

SummaryThe viewpoint selection windows enable a traveler tobrowse a world’s viewpoint set using worldlets. Eachworldlet represents a 3D landmark and its context,facilitating the traveler’s recognition of a desireddestination. The use of viewpoint animation to fly betweenselected viewpoints helps the traveler understand landmarkspatial relationships and build up procedural knowledge forroutes between the landmarks.

World guidebook windows and worlds of worldlets bothenable a traveler to examine landmark worldlets in a set ofavailable worlds. Worldlets provide visual cues that help atraveler recognize a destination of interest.

In contrast to WIMs, the browser’s viewpoint and worldselection features display miniature worlds outside of themain world. No reserved space is required in the virtualenvironment between the traveler and collidable 3Dcontent. No stylistic clash or confusion with unboundedenvironments occurs. The separate display of worldlets andthe main world avoids impacting rendering performance.The use of separate worldlet display windows also enablesthe simultaneous display of multiple worldlets, includingthose for worlds different from that currently being viewedin the main viewer window.

An effect similar to WIMs can be created by including aworldlet within a world, like that shown in Figure 5. Aworldlet can remain stationary in the world or move alongwith the traveler, as in a WIM. In this regard, WIMs are aspecial-purpose implementation of the more generalworldlet concept.

IMPLEMENTATIONThe VRML browser used in this work maintains virtualenvironment geometry within a tree-like scene graph.Worldlets are also stored as scene graphs, together with

additional state information. To capture a worldlet ordisplay a worldlet or virtual environment the VRMLbrowser traverses the associated scene graph and feeds a 3Dgraphics pipeline.

Worldlet Capture in GeneralAny 3D graphics pipeline can be roughly divided into twostages: (1) transform, clip, and cull, and (2) rasterize [8].The first stage applies modeling, viewing, perspective, andviewport transforms to map 3D shapes to the 2D viewport.Along the way, shapes outside of the viewing frustum areclipped away and backfaces removed. The second stageuses 2D shapes output by the first stage and draws theassociated points, lines, and polygons on the screen.

Worldlet capture taps into this 3D graphics pipeline,extracting the transformed, shaded, clipped, and culledshape coordinates output by the first stage prior torasterization in the second stage. An extracted coordinatecontains X and Y screen-space components, a depth-bufferZ-space component, and the W coordinate. Each extractedcoordinate has an associated RGB color and texturecoordinates, computed by shading and texture calculationphases in the first pipeline stage.

To create a worldlet, these extracted coordinates areuntransformed to map them back to world space fromviewport space. The inverses of the viewport, perspective,viewing, and modeling transforms are each applied.Coordinate RGB colors and texture coordinates are used toreconstruct 3D worldlet geometry in a worldlet scene graph.

Display of a worldlet passes this 3D geometry back downthe graphics pipeline, transforming, clipping, culling, andrasterizing the worldlet like any other 3D content.

Frustum and Spherical WorldletsA frustum worldlet is the result of capturing 3D graphicspipeline output for a single traversal of the scene graph asviewed from the traveler’s current viewpoint. The shape setextracted after the first pipeline stage contains only thosepoints, lines, and polygons that fall within the viewingfrustum. The worldlet constructed by the browser from thisgeometry looks like a pie-shaped slice cut out of the world.

A spherical worldlet is the result of performing multiplefrustum captures and combining the results. The VRMLbrowser captures a spherical worldlet by sweeping outseveral stacked cylinders around a viewpoint position,generating a set of frustum worldlets each using a differentviewing orientation. Additional captures aimed straight up,and straight down complete the spherical worldlet. Theresulting set of capture geometry constructs a 360 degreespherical view from the current viewpoint.

When displayed, the spherical worldlet’s geometry lookslike a bubble cut out of the virtual environment. A closeyon clip plane keeps the bubble small, insuring that itcaptures only landmark features in the immediateneighborhood, and not the entire virtual world.

Worldlet Capture in OpenGLTo take advantage of the rendering speed offered by theaccelerated 3D graphics pipeline on high-speedworkstations, we implemented worldlet display and captureusing OpenInventor and OpenGL graphics libraries fromSilicon Graphics. Scene graph construction and displaytraversal is managed by OpenInventor. To capture worldletgeometry, the VRML browser places the pipeline intofeedback mode prior to a capture traversal, and returns it torendering mode following traversal.

While in feedback mode, the OpenGL pipeline diverts alltransformed, clipped, and culled coordinates into a bufferprovided by the browser. Upon completion of a capturetraversal, no rasterization has taken place and the feedbackbuffer contains the extracted geometry. By parsing thefeedback buffer, the VRML browser reconstructs worldletgeometry, applying appropriate inverse transforms.

OpenGL feedback buffer information includes shapecoordinates, colors, and texture coordinates, but does notinclude an indication of which texture image to use forwhich bit of geometry. To capture this additionalinformation, the VRML browser uses OpenGL’s passthrough features to pass custom flags down through thepipeline during traversal. To prepare these pass throughflags, the browser augments the world scene graph prior totraversal, assigning a unique identifier to each textureimage. During a capture traversal, each time a textureimage is encountered, the associated identifier is passeddown through the pipeline and into the feedback bufferalong with shape coordinates, colors, and texturecoordinates. During parsing of the feedback buffer, thesetexture identifiers enable worldlet geometry reconstructionto apply the correct texture images to the correct shapes.

PILOT STUDYA pilot study was conducted to evaluate landmarkrepresentation effectiveness within a wayfinding task.Subjects in the study were asked to use an on-line landmarkguidebook and follow a sequence of landmarks leadingfrom a starting point to a goal landmark. Guidebook entriesproviding landmark descriptions were offered in three ways:in textual form, as 2D images, and as 3D worldlets.

The pilot study used five subjects, three female and twomale. All subjects were computer-literate, but had varyingdegrees of experience with virtual environments. Subjectoccupations were student, programmer, ecologist, molecularbiologist, and computer animator.

Virtual Environment DesignSix different virtual city environments were created for thestudy. Each city was composed of a street grid, five blocksby five blocks, with pavement roads and sidewalks betweenthe blocks. Each block contained 20 buildings, side-by-sidearound the block perimeter. Using a cache of 100 buildingdesigns, buildings were randomly selected and placed oncity blocks. Buildings were colored using texture imagesderived from photographs of buildings in the San Francisco

area. Typical building photographs were of two-storyhouses, office buildings, shops, and warehouses.

Three of the six cities were used for training subjects, andthe remaining three used for the timed portion of theexperiment. The timed experiment required that subjectsmake their way from a starting point to a goal. Timedexperiment cities, therefore, contained a starting point, anending goal, and three intermediate landmarks. Thedistance between any adjacent pair of these varied betweenone and two blocks. The total distance from the startingpoint to the ending goal was six blocks. The intermediatelandmarks included two buildings and one non-building(mailbox, fire hydrant, or newspaper stand). The endinggoal was a distinctive six-sided kiosk marked “GOAL”.The starting point was unmarked.

Training cities were structurally equivalent to cities used inthe timed experiment. However, subjects were given astarting point, only a single intermediate landmark, and thegoal kiosk. The landmark in each training city differedfrom landmarks used in the timed cities.

Software DesignThe VRML browser user interface was modified for thestudy. A main city window displayed the city. Keyboardarrow key presses moved the subject forward and back by afixed distance, or turned the subject left or right by a fixedangle. Subjects were instructed to press a “Start” button tobegin the experiment and press a “Stop” button when theyreached the goal. Between the two button presses, datadescribing the subject’s position and actions wasautomatically collected at one second intervals.

A “Guidebook” button on the main window displayed afull-screen guidebook photo-album window with textual,image, or worldlet landmark descriptions. A “Dismiss”button on the guidebook window removed the window andagain revealed the main city window. The subject could notsee the main city window without dismissing the guidebook.

The study used a within-subject randomized design. Eachsubject visited three virtual cities in a random order. Foreach subject, one city provided a guidebook with textuallandmark descriptions leading to the goal, one providedimage landmark descriptions, and one provided worldletlandmark descriptions. In cities using textual and imagelandmark descriptions, the guidebook contained statictextual and image information. In the city using worldletlandmark descriptions, the guidebook contained interactiveworldlets, each of which could be explored using the samearrow key bindings as the main city window.

For each landmark, the landmark and a fifteen meter radiusaround the landmark, were expressed in the description.Textual descriptions described both the landmark and theimmediate surroundings. Image landmark descriptionsshowed portions of the neighboring buildings. Worldletdescriptions included a spherical bubble with a fifteen meterradius centered in front of the landmark.

ProcedurePrior to beginning the experiment, instructions were read toeach subject and an image shown of the goal kiosk. Eachsubject was shown the user interface and taught use of thearrow keys, both for city movement and worldletmovement. Subjects were allowed to spend as much timeas they needed practicing in three training cities, each withguidebook landmark descriptions in either text, image, orworldlet form. When subjects felt comfortable with eachinterface, the timed portion of the experiment was begun.

During the timed portion, subjects were asked to navigatefrom the starting point to the goal kiosk as quickly aspossible.

ResultsThe independent variable in the study was the type oflandmark description used: text, image, or worldlet.Dependent variables include the time spent consulting theguidebook, the time spent standing still within the city, thetime spent moving forward over new territory, the timespent backtracking over territory previously traversed, thedistance traversed moving forward, and the distancetraversed while backtracking. Table 1 includes the meanvalues for subject data collected for each of the dependentvariables. Travel time is measured in wall-clock secondswhile travel distance is measured in meters within thevirtual environment. Mean overall travel times anddistances are also listed in the table.

Table 1: Mean times and distances traveled.

Mean Times (seconds) Text Image Worldlet

Consulting guidebook 47.6 44.6 91.0Standing still 179.2 141.6 58.6Moving forward 155.0 156.4 91.0Backtracking 86.2 78.0 0.4Overall 468.0 420.6 241.0

Mean Distances (meters)

Moving forward 684.6 739.0 421.6Backtracking 409.0 371.4 2.2Overall 1093.6 1110.6 424.0

In the table above, Consulting guidebook values indicatethe time subjects spent with the guidebook window on-screen. City movement could not occur while theguidebook window was displayed.

Standing still values indicate the time subjects spentstanding at a single location, looking ahead or turning leftand right.

Landmarks in all three cities were arranged so that at notime would a subject be required to traverse the same blocktwice to reach the goal. Moving forward times anddistances record movement through previously untraversedterritory. Backtracking times and distances measureunnecessary travel over previously traversed territory.

In a post-study questionnaire subjects were asked to rankeach landmark representation technique according to howeasy it was to use. Table 2 summarizes subject rankings forthe five subjects in the pilot study.

Table 2: Rankings of landmark representations.

Text Image WorldletVery easy 0 0 3Easy 2 2 2Doable 1 1 0Difficult 2 1 0Very difficult 0 1 0Median Doable Doable Very easy

AnalysisA one-way analysis of variance (ANOVA) was performedfor each of the dependent variables and the overall timesand distances. The within-subjects variable was thelandmark description type with three levels: text, image,and worldlet. Post-hoc analyses were done using the TukeyHonest Significant Difference (HSD) test. We adopted asignificance level of .05 unless otherwise noted. Table 3summarizes these results.

Table 3: F-test values for F(2,8) and p < .05.

Mean Times F(2,8)

Consulting guidebook 5.78Standing still 5.80Moving forward 8.20Backtracking 5.40Overall 5.46

Mean Distances

Moving forward 7.09Backtracking 5.82Overall 6.79

Post-hoc analyses of each of the dependent variablesrevealed:

• Time spent consulting guidebook: text and imagetimes were not significantly different, but image timeswere significantly less than for worldlets.

• Time spent standing still: text and image times werenot significantly different, but text times weresignificantly greater than for worldlets. Image andworldlet times were not significantly different.

• Time spent moving forward: text and image timeswere not significantly different, but both weresignificantly greater than for worldlets.

• Time spent backtracking: text and image times werenot significantly different, but both were significantlygreater than for worldlets.

• Overall time: text and image times were notsignificantly different, but text times were significantly

greater than for worldlets. The difference betweenimage and worldlet times approached significance (p =.08) with image times greater than those for worldlets.

• Moving forward distance: text and image movementdistances were not significantly different, but both weresignificantly greater than for worldlets.

• Backtracking distance: text and image backtrackingdistances were not significantly different, but both weresignificantly greater than for worldlets.

• Overall distance: text and image movement distanceswere not significantly different, but both weresignificantly greater than for worldlets.

DiscussionFigure 6 plots mean times for each type of landmarkdescription for the time used consulting the guidebook,standing still, moving forward over new territory, andbacktracking over previously traversed territory.

Figure 6: Mean times.

Subjects spent more time on average consulting worldletdescriptions than consulting either text or imagedescriptions. This extra consultation time was more thancompensated for by reductions in time spent standing still,moving forward, and most dramatically in time spentbacktracking.

A natural conjecture is that subjects spent the additionaltime with worldlets creating a more comprehensivecognitive model of the landmark region which enabled themto spend less time searching for landmarks or landmarkcontext. This is reflected in the reduced total travel times.The striking reduction in backtracking time, bringing itvirtually to zero, indicates that worldlets enabled subjects todo less wandering and to move more directly to the nextlandmark.

Figure 7 plots mean travel distances for each type oflandmark description. As with travel time, forward andbacktracking travel distances also were reduced when usingworldlets.

Figure 7: Mean distances.

CONCLUSIONSWayfinding literature provides clear support for theimportance of landmarks in navigating an environment,whether real or virtual. Landmarks anchor routes throughan environment and provide memorable destinations toreturn to later. Landmarks help to structure an environmentand supply directional cues used to find destinations ofinterest.

Whereas a traveler’s landmark knowledge characterizes adestination by its 3D shape, size, texture, and so forth, themenus of today’s virtual environment browsers characterizedestinations by textual descriptions or thumbnail images.This representation mismatch reduces the effectiveness oflandmark descriptions in destination menus. Unable to usetheir memory of 3D landmarks to choose among menuitems, travelers may resort to a naive, exhaustive search tofind a desired landmark.

In a wayfinding task, textual or image guidebook landmarkdescriptions fail to engage the full range of 3D landmarkcharacteristics recognized and used by travelers to find theirway. Unable to extract sufficient landmark knowledge fromtextual or image descriptions, travelers move through anenvironment with less comprehensive cognitive models,spending more time standing still and looking around,moving in incorrect directions, and backtracking overpreviously traversed territory.

This paper has introduced a new user interface affordanceto increase wayfinding efficiency. This affordance, called aworldlet, captures a 3D thumbnail of a virtual environmentlandmark. Each worldlet is a miniature virtual worldfragment that may be interactively viewed in 3D. Byencapsulating a 3D description of a landmark, worldletsprovide better landmark imagability, landmark context,traveler context, and multiple vantage point support thantext or image representations. Displayed within abrowsable landmark guidebook, worldlets facilitate virtualenvironment wayfinding by enhancing a traveler’s ability torecognize and travel to destinations of interest. When usedto provide guidebook descriptions in a wayfinding task,worldlets significantly reduced the overall travel time anddistance traversed, virtually eliminating backtracking.

Text Image Worldlet

20

40

60

80

100

120

140

160

180

0

Still

Guidebook

Forward

Backtracking

Ti

me

Text Image Worldlet0

100

200

300

400

500

600

700

800

Forward

Backtracking

Distance

FUTURE WORKDevelopment of worldlets and the VRML browser revealedissues requiring further study:

• To insure that spherical worldlets capture only thetraveler’s immediate vicinity, the yon clip plane isautomatically placed relatively close to the traveler’sviewpoint. The current approach sets the yon clipplane distance to a fixed value. However, this distanceshould vary with traveler avatar characteristics, theenvironment being viewed, or the landmark captureintended. A general-purpose, automatic yon clip planeselection algorithm is needed.

• VRML provides features that describe worldcharacteristics that do not reduce to points, lines, ortriangles, and thus do not show up in a capturedworldlet. These features include background color,sounds, behaviors, and shape collidability. Worldletsconstructed without capture of these features may notlook and act like the main world from which they werecaptured. A mechanism to capture this additionalinformation is needed.

In addition to these issues, future work will include a moreextensive user study. The pilot study’s finding thatbacktracking was practically eliminated was unexpected anddeserves further attention.

ACKNOWLEDGEMENTSThe San Diego Supercomputer Center (SDSC) is funded bythe National Science Foundation (under grantASC8902825), industrial partners, and the State ofCalifornia. This work was also partially funded by the SanDiego Bay Interagency Water Quality Panel. SuzanneFeeney of the University of California, San Diego (UCSD)Psychology Department and Rina Schul of the UCSDCognitive Science Department were instrumental indeveloping the pilot study. Special thanks to John Morelandfor assistance in developing the software, and to MikeBailey, Andrew Glassner, Allan Snavely, and Len Wangerfor their input on the project. Thanks also to John Hellyand Reagan Moore for their support.

REFERENCES1. Allen, G.L., Kirasic, K.C. Effects of the Cognitive

Organization of Route Knowledge on Judgments ofMacrospatial Distances. In Memory & Cognition,1985, 3, pp. 218-227.

2. Appleyard, D.A. Why buildings are known. InEnvironment and Behavior, 1969, 1, pp. 131-156.

3. Bell, G.; Carey, R.; Marrin, C. The Virtual RealityModeling Language, version 2.0, 1996. Athttp://vrml.vag.org/VRML2.0/FINAL

4. Chen, S. E. QuickTime VR - An Image-basedApproach to Virtual Environment Navigation. In

Proceedings of the ACM SIGGRAPH 95 Conference,August 1995, Los Angeles, CA. pp. 29-38.

5. Darken, R. P., and Sibert, J. L. WayfindingStrategies and Behaviors in Large Virtual Worlds. InProceedings of the ACM CHI 96 Conference, April1996, Vancouver, BC., pp. 142-149.

6. Downs, R. J., and Stea, D. Cognitive Maps andSpatial Behavior. In Image and Environment,Chicago: Aldine Publishing Company, 1973, pp. 8-26.

7. Evans, G. Environmental cognition. In PsychologyBulletin, 1980, 88, pp. 259-287.

8. Foley, J., van Dam, A., Feiner, S., and Hughes, J.Computer Graphics Principles and Practice,Addison-Wesley, 1990.

9. Goldin, S.E., Thorndyke, P.W. SimulatingNavigation for Spatial Knowledge Acquisition. InHuman Factors, 1982, 24(4), pp. 457-471.

10. Lynch, K. The Image of the City, M.I.T. Press, 1960.

11. Passini, R. Wayfinding in Architecture, VanNostrand Reinhold, NY, second edition, 1992.

12. Pausch, R., Burnette, T., Brockway, D., Weiblen,M.E. Navigation and Locomotion in Virtual Worldsvia Flight into Hand-Held Miniatures. InProceedings of SIGGRAPH 95, 1995, pp. 399-400.

13. Peponis, J., Zimring, C., and Choi, Y.K. Finding theBuilding in Wayfinding. In Environment andBehavior, 1990, 22 (5), pp. 555-590.

14. Satalich, G. A. Navigation and Wayfinding in VirtualReality: Finding the Proper Tools and Cures toEnhance Navigational Awareness. Masters Thesis,Department of Computer Science, University ofWashington, 1995.

15. Stoakley, R., Conway, M. J., and Pausch, R. VirtualReality on a WIM: Interactive Worlds in Miniature.In Proceedings of the ACM CHI 95 Conference, pp.265-272.

Documents

Worldlets - 3D Thumbnails for Wayfinding in Virtual ... · La Jolla, CA 92093-0515, USA ABSTRACT Virtual environment landmarks are essential in wayfinding: ... guidebook. A virtual