Semantic visualization of 3D urban environments


Jose Luis Pina & Eva Cerezo & Francisco Seron

Published online: 11 March 2011
© Springer Science+Business Media, LLC 2011

Abstract The purpose of this work is the semantic visualization of complex 3D city models containing numerous dynamic entities, as well as performing interactive semantic walkthroughs and flights without predefined paths. This is achieved by using a 3D multilayer scene graph that integrates geometric and semantic information, as well as by performing efficient geometric and what we call semantic view culling. The proposed semantic-geometric scene graph is a 3D structure composed of several layers which is suitable for visualizing geometric data with semantic meaning while the user is navigating inside the 3D city model. The BqR-Tree is the data structure specially developed for the geometric layer for the purpose of speeding up rendering time in urban scenes. It is an improved R-Tree data structure based on a quadtree spatial partitioning which improves the rendering speed of the usual R-Trees when view culling is implemented in urban scenes. The BqR-Tree is defined by considering the city block as the basic and logical unit. The advantage of the block, as opposed to the traditional unit, the building, is that it is easily identified regardless of the data source format, and allows inclusion of mobile and semantic elements in a natural way. The usefulness of the 3D scene graph has been tested with low-structured data, which makes its application appropriate to almost all city data containing not only static but dynamic elements as well.

Keywords Semantic visualization . 3D scene graphs . Dynamic urban scenes . Virtual flights and walkthroughs . Data structures . View culling

Multimed Tools Appl (2012) 59:505–521
DOI 10.1007/s11042-011-0776-3

J. L. Pina (*) : E. Cerezo : F. Seron
Advanced Computer Graphics Group (GIGA), Computer Science Department, University of Zaragoza, Engineering Research Institute of Aragon (I3A), Zaragoza, Spain
e-mail: jlpinam@gmail.com

E. Cerezoe-mail: ecerezo@unizar.es

F. Serone-mail: seron@unizar.es

1 Introduction

In recent years, 3D city models have increasingly been used in more and more sectors of the economy and public administration. Moreover, new applications of 3D city models have arisen in the areas of tourism, town planning and urban development, thus increasing the need for more sophisticated and geometrically complex models. However, these complex models should be used not only for visualisation purposes, as is commonly the case nowadays, but should be integrated in semantic applications where graphic elements have a meaning and can thus be used to choose what to see: health centers, churches, bus lines, etc. Semantic-driven visualization of complex city models would dramatically enhance the usability of the models and would open the door to the use of complex models in more sophisticated applications.

There are several problems when trying to use semantics in virtual 3D cities. The first problem is that the graphical data formats commonly used in urban 3D visualization do not contain semantic information. Graphical data themselves are designed for representation of the visual aspect only; therefore additional semantic information must be added as external data. The second problem is related to the graphical tools used to render the data, the scene graphs. As is well known, a scene graph is a directed acyclic graph which describes the entities and dependencies of all the graphic elements in the scene. Scene graphs have become an established tool for developing interactive 3D applications, but they are not prepared for the inclusion of such information and are difficult to expand.

The solution presented in this paper helps to overcome both problems: a 3D multilayer scene graph that integrates geometric and semantic information, based on the use of a new hybrid graphic data structure, the BqR-Tree, is proposed. This semantic-geometric scene graph is a 3D structure which is suitable for visualizing geometric data with semantic meaning while the user is navigating inside the 3D city model. Moreover, the use of what we call semantic view culling, drawing an analogy with standard geometric view culling, greatly improves rendering time, allowing implementation of interactive semantic urban walkthroughs and flights without the use of predefined paths.

The geometric layer is based on a new data structure named BqR-Tree, which is capable of supporting the interactive visualization of city models containing numerous dynamic entities. Fast view frustum cullers, which cull away all scene graph nodes that lie outside the view frustum, i.e. those objects that are outside the observer's visual field, are particularly important if complex and large scene graphs are to be visualized. Dividing complex geometries comprised of multiple triangles into an adequate data structure can greatly improve the ability to cull away triangles that lie outside the view frustum, resulting in fewer triangles being sent to the rendering pipeline and, therefore, in a lowering of rendering time. Moreover, the work involved in creating the data structure is performed in pre-processing time and represents a negligible addition to the real-time computation. The proposed BqR-Tree, or Block-quadtree-R-Tree, presented here in detail, is based on the decomposition of the city into blocks and can be applied to highly or completely unstructured urban data, and therefore may be applied to data acquired from several sources: 2D GIS, terrain measurements, etc.

The subsequent semantic layers allow the addition of semantics to the data and the implementation of semantic view culling. The idea of semantic view culling is to apply to the semantic layers the same procedure applied to the geometric layer with usual geometric view culling. A semantic square, a kind of magic lens, delimits the semantic region of interest while the user is navigating. All the semantic elements outside the frame are culled out while traversing the semantic layer of the scene graph and, therefore, are not rendered


onto the screen. At the end of the process the geometry must be rendered and, therefore, standard geometric view culling over the remaining geometry must also be applied, a process which is especially efficient because it is based on the use of the BqR-Tree data structure. Nevertheless, although the graphic layer, the BqR-Tree, is specially developed for urban scenes, the semantic view culling technique and the multilayered scene graph are independent of the kind of scene. The semantic visualization of any other type of model will only involve replacing the graphic layer with another scene graph better suited to the specific scene. The benefits of semantic view culling will remain unchanged in terms of usability, and the rendering speed will even improve with the increase of the graphical complexity of the semantic elements.

The organization of the paper is as follows: the next section is devoted to discussing previous related work. Section 3 presents the 3D semantic scene graph, Section 4 shows the results, and finally, in Section 5, conclusions and future work are outlined.

2 Related work

2.1 Semantic related work

Efforts made in the semantic graphical visualization field have taken two main directions: the construction of semantic objects and the presentation of semantic information via semantic GUIs.

Object construction is more common, and is based on assigning semantic meaning to every graphic element, basic or complex, in some cases building a scene tree. There is strong interest in the research field in the enrichment of geometric models themselves or via their scene graphs. In the first case, Attene et al. [3] tackle the problem of providing useful semantic annotations to 3D shapes. They decompose the model into interesting features within a multi-segmentation framework, and introduce an annotation pipeline to attach semantics to the features and the whole model. In the second case, complex elements are organized in an object decomposition diagram with their basic elements on the bottom (leaves of the tree) and their intermediate aggregations in the middle (nodes). Other kinds of information can be incorporated into the elements of the tree. For example, an engine located at the root of the tree is decomposed into several complex pieces located at the nodes, and finally, the basic parts are located at the leaves with information concerning them, such as colour, symmetry, etc. In urban scenes, decomposition is performed in a similar way: basic elements such as roofs, walls, etc. are located at the leaves of the tree. Those basic elements are grouped on intermediate nodes such as rooms, houses, etc. The town is located at the top of the tree (root) [4]. A great deal of research in this field focuses on improving and standardizing the semantic information added. Otto [19] presents RDF, a Resource Description Framework, with the aim of making it possible to develop system-independent software. Biermann et al. [5] introduce a semantic representation for virtual prototyping in interactive virtual construction applications.
The semantic representation reflects dynamic constraints that define the modification and construction behaviour of objects, as well as knowledge structures which contain not only the identification of the element but also possible processes and actions that can be carried out within the environment. Stadler et al. [21] present a system capable of reducing ambiguities, using semantic information, for the geometric integration of spatial data obtained from different sources, which are often thematically and spatially fragmented. Despite these standardization efforts, these works do not solve the problem of what to do once the information is available.


Other works focus on improving the organization of added semantic information. Leissler et al. [15] introduced the idea of multilayered graphs, stating that in a visual application two layers can be identified as an outline structure: a semantic layer and a visualization layer. The semantic layer describes what is to be visualised, namely the relation between concepts, and the visualization layer describes what is visible on the screen. Building on this idea, Klima et al. [13] implement a multilayered graph to construct buildings in which the elements of a layer are grouped by type and connected with an element of its superior layer which represents them, similar to a level of detail (LOD). Inside a layer, the interrelated elements are also connected. None of these works take advantage of the multilayered graph to speed up visualization, as in our case.

A second group of works focuses on how to show semantic information when navigating through the geometric model. With this aim, several solutions for flexible GUIs have been proposed. Lie and Hsu propose the use of virtual forces that may be configured by the user [16] to build an intelligent 3D user interface, and Gerhard and Dieter [9] propose the addition of semantic behaviour to the nodes by means of a generic traversal context state, allowing flexible implementation of 3D user interfaces. Falquet et al. [8] integrate non-geometric urban knowledge and data into 3D city models and define a declarative specification language that enables the designer to specify 3D representations for non-geometric knowledge elements. All these approaches allow the user to define how to navigate but do not avail themselves of the user selection to improve navigation speed.

The paper of Mendez et al. [18] is more closely related to the aim of the present paper, in that it uses semantic and geometric criteria to implement a model rather similar to an urban LOD. Their application area is the underground infrastructure owned and operated by utility companies. The 3D models are encoded in a scene graph that is a mixture of visual models and semantic markup used for interactive filtering and styling of the models. In fact, this technique is a construction method, although the semantic information shown when navigating varies.

Our work contributes solutions in both research directions. A 3D scene graph containing a graphical layer and several semantic layers is proposed. The structure, as opposed to other similar ones [19, 5], is not used to construct objects but rather to locate them in the environment. This allows for the implementation of a culling of the graphical elements, which we have called semantic view culling, and is the foundation of the semantic navigation of the user through the model, rooted in a simple GUI.

2.2 Rendering related work

A great variety of solutions related to the rendering of large data models in real time can be found in the bibliography. Methods can be classified into five main groups of techniques: LOD [7], billboards [1], occlusion culling [10], ray tracing [22] and view culling [2]. These techniques are often combined to produce better results as well. Nonetheless, most of these techniques were developed for the efficient management of large static polygonal models, so their application to the management of thousands of complex dynamic entities, such as virtual actors, is not a trivial matter. This is especially the case of view culling, an acceleration technique well studied in static environments but rarely implemented in dynamic scenes.

Most of the previous algorithms incorporate an indexing data structure, built in preprocessing time, to accelerate rendering, which results in a great improvement in rendering times. Two main categories can be found among the most widely used indexing structures: hierarchical structures and Bounding Volume Hierarchies (BVH). The


main difference between these two categories lies in their approach to dividing data space. Structures which belong to the first category use space partitioning methods that divide data space along predefined hyper-planes regardless of data distribution. The resulting regions are mutually disjoint and their union completes the entire space. The quadtree [14] belongs to this category. Structures which belong to the second category use data-partitioning methods which divide data space into buckets of MBVs (Minimum Bounding Volumes) according to its distribution, which can lead to possible overlapping regions. The R-Tree [11] belongs to this category. The data structures which belong to the R-Tree family yield the best performance, with the VamSplit R-Tree performing best [12]; this is why this structure has been chosen for comparisons with the proposed data structure. The R-Tree data structure is very sensitive to the order in which objects are inserted, but its real drawback is the overlap, which especially affects the large objects common in urban scenes. The criterion of object insertion, minimal bounding volumes, usually leads to the separation of large objects in the tree although they may be near to each other in the scene.

On the other hand, the quadtree is not sensitive to the order in which objects are inserted and there is no overlapping; it keeps objects that are near to each other in the scene together in the scene graph, and is therefore a very efficient spatial decomposition. Nevertheless, it was designed as a point access method and, although there are more complex versions capable of managing polygonal data, they are not well suited to managing large urban objects.

Hybrid data structures have been proposed to overcome the drawbacks of the traditionalspatial index structures [6, 23].

In this paper, a hybrid indexed data structure (the BqR-Tree [20]), to be built in preprocessing time, is proposed for the geometric layer. The BqR-Tree is an R-Tree but, instead of using a space decomposition based on MBVs (Minimum Bounding Volumes), as is usual in an R-Tree and in most urban applications, a quadtree decomposition of the space has been chosen, which is rare but not unique in urban environments [17, 1].

3 The 3D semantic scene graph

The proposed 3D scene graph is a multilayered graph which is composed of a geometric layer and several semantic layers; as many semantic layers as desired may be associated to the same geometric layer. The data structure of the geometric layer is a BqR-Tree and the organization of the semantic layers is hierarchical. The geometric layer is composed of nodes and leaves. The leaves are used to allocate all the model geometry in the geometric layer; therefore, no leaves are present in the semantic layers.
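The layer organization described above can be sketched with a few illustrative types. This is a minimal sketch under our own naming conventions, not the paper's actual implementation; the key invariant it shows is that semantic layers own no leaves of their own and instead point at the shared leaves of the geometric layer.

```python
# Illustrative sketch of the multilayered scene graph: a geometric layer
# whose leaves hold all the geometry, and semantic layers whose final
# nodes point at those shared leaves. All names are ours (hypothetical).

class GeoLeaf:
    """Leaf of the geometric layer: a block's geometry and bounding volume."""
    def __init__(self, name, geometry=None, bounding_volume=None):
        self.name = name
        self.geometry = geometry or []
        self.bounding_volume = bounding_volume

class GeoNode:
    """Internal node of the geometric (BqR-Tree) layer."""
    def __init__(self, children=None, bounding_sphere=None):
        self.children = children or []      # GeoNode or GeoLeaf descendants
        self.bounding_sphere = bounding_sphere

class SemanticNode:
    """Node of a semantic layer; final nodes reference geometric leaves."""
    def __init__(self, label, children=None, geo_leaves=None):
        self.label = label
        self.children = children or []      # sub-nodes of this semantic layer
        self.geo_leaves = geo_leaves or []  # leaves owned by the geometric layer

# A geometric leaf can be shared by any number of semantic layers:
block = GeoLeaf("block-17")
bus_line_1 = SemanticNode("bus line 1", geo_leaves=[block])
transport = SemanticNode("Public transport", children=[bus_line_1])
```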

3.1 The graphic layer: the BqR-Tree data structure

The proposed data structure for the graphic layer, the BqR-Tree [20], is based on the decomposition of a city into its blocks: the block is considered the minimum and indivisible unit of the city as well as the basis of the proposed structure. The term block is used to name the group of urban elements completely surrounded by streets, i.e., the usual meaning of urban block. Even though the name of a street is often the same across many intersections with other streets, the term "street" is used here, for clarity reasons, as the portion of street from one intersection to the next.

The BqR-Tree structure is an improved R-Tree whose space partitioning method is a quadtree decomposition; its tree structure is populated with the geometry and its MBVs, like an R-Tree. The structure also incorporates a second improvement. The MBVs of


the nodes are bounding spheres, according to the results of [12]. This produces improvements in node access speed and storage requirements. Nevertheless, the use of bounding spheres produces too much overlapping, which is the reason for the use of the intersection of bounding boxes and spheres for the leaves (from now on, bounding-boxes-spheres), as will be explained later on.
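The bounding-box-sphere of a leaf can be sketched as a simple intersection test: a point belongs to the volume only if it lies inside both the axis-aligned box and the sphere. This is a hedged illustration with our own function names, not the paper's code; it only shows why the intersection is tighter than either primitive alone.

```python
import math

# Sketch (hypothetical names): a "bounding-box-sphere" is the intersection
# of a block's axis-aligned bounding box and its bounding sphere.

def point_in_box(p, box_min, box_max):
    # Inside the box iff inside on every axis.
    return all(lo <= c <= hi for c, lo, hi in zip(p, box_min, box_max))

def point_in_sphere(p, centre, radius):
    return math.dist(p, centre) <= radius

def point_in_box_sphere(p, box_min, box_max, centre, radius):
    # Intersection volume: tighter than either primitive alone, which is
    # why it reduces the overlap that plain bounding spheres produce.
    return point_in_box(p, box_min, box_max) and point_in_sphere(p, centre, radius)
```

For example, the corner (1, 1, 1) of a unit-radius sphere's bounding box is inside the box but outside the sphere, so the intersection rejects it.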

In order to build the BqR-Tree from the city data, it is necessary:

– To identify all the blocks in the scene and calculate their geometric centres and bounding spheres.

– To assign every graphic element of the town to an urban block.

In Fig. 1, a part of the city under consideration (Zaragoza, Spain) with the blocks and their centres already identified is displayed. Identification of blocks takes place setting out from the streets, which are identified from the outset. Block status is assigned to any portion of terrain completely surrounded by streets. The data used to test our model have been exported from a 2D GIS. In the initial files, neither buildings nor blocks were identified as entities. Therefore, filtering, structuring and 3D elevation have been performed automatically.

To build the BqR-Tree, the first step is to take the smallest rectangle bounding the city. This rectangle is divided into four equal rectangles called quadrants or buckets; all quadrants are recursively subdivided in this manner. The BqR-Tree is formed by recursive division of the city into quadrants. A quadrant is considered to be indivisible if it contains only one block. No splitting of the objects (urban blocks) is implemented, since this would involve a scattering of the pieces of urban blocks along the tree, which would in turn lead to a poorer performance of the data structure. Should a division cross a block, the entire block is assigned to a single quadrant, the one that contains the geometric centre of the block. At the end of the process, every final quadrant contains a single block or remains empty. Figure 2 presents a quadtree decomposition performed on the example shown in Fig. 1, where dashed lines correspond to the sub-nodes of the main nodes 1, 2, 3, 4, and small grey lines correspond to these four main nodes.
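The decomposition step above can be sketched as a short recursion. This is a minimal 2D sketch with hypothetical names, not the paper's implementation: each block is represented only by its name and geometric centre, blocks are assigned whole to the quadrant containing their centre, and subdivision stops when a quadrant holds at most one block.

```python
# Hedged sketch of the quadtree decomposition: blocks are never split;
# each block goes to the quadrant containing its geometric centre, and a
# quadrant is subdivided until it holds at most one block.

def build_quadtree(blocks, x0, y0, x1, y1):
    """blocks: list of (name, (cx, cy)) pairs whose centres lie in the rect."""
    if len(blocks) <= 1:
        # Indivisible quadrant: a single block, or an empty quadrant (None).
        return blocks[0][0] if blocks else None
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    quads = [(x0, y0, mx, my), (mx, y0, x1, my),
             (x0, my, mx, y1), (mx, my, x1, y1)]
    children = []
    for qx0, qy0, qx1, qy1 in quads:
        inside = [b for b in blocks
                  if qx0 <= b[1][0] < qx1 and qy0 <= b[1][1] < qy1]
        child = build_quadtree(inside, qx0, qy0, qx1, qy1)
        if child is not None:       # empty quadrants produce no node
            children.append(child)
    return children
```

With three blocks whose centres fall into three different quadrants of a 10x10 city rectangle, the empty fourth quadrant simply disappears from the result, which is the adaptive behaviour described in the text.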

The second step is to build the tree. To do this, every quadrant is assigned to a node or sub-node and every block is assigned to a leaf. Figure 3 shows a graphic representation of the BqR-Tree structure corresponding to the example of Fig. 1. Leaves are represented by squares and nodes by ellipses. Empty quadrants are not assigned to nodes, thus categorizing the quadtree as an adaptive quadtree. Each leaf of the tree stores the entire geometry of the

Fig. 1 Identifying the blocks and their centers in a virtual city


block as well as the bounding volume for that block. Each node stores a pointer to its descendants and to the bounding sphere of the node. In the case of the leaves, the bounding-box-sphere of each block (intersection of the sphere and the box of the block) is calculated and stored. For each node, the bounding sphere is composed of the union of the bounding volumes of its descendants, in a bottom-up manner.
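The bottom-up bounding-volume pass can be sketched with the standard two-sphere enclosing formula, applied pairwise over a node's children. This is an illustrative 2D sketch with our own names; the paper does not specify how the union spheres are computed, so the formula below is an assumption (the usual smallest-enclosing-sphere-of-two-spheres construction).

```python
import math

# Hedged sketch: a node's bounding sphere encloses its children's spheres.
# Spheres are (centre, radius) pairs; names are hypothetical.

def enclose(s1, s2):
    """Smallest sphere containing both s1 and s2 (standard construction)."""
    (c1, r1), (c2, r2) = s1, s2
    d = math.dist(c1, c2)
    if d + r2 <= r1:
        return s1                   # s2 already inside s1
    if d + r1 <= r2:
        return s2                   # s1 already inside s2
    r = (d + r1 + r2) / 2           # radius of the enclosing sphere
    t = (r - r1) / d                # slide the centre from c1 towards c2
    c = tuple(a + t * (b - a) for a, b in zip(c1, c2))
    return (c, r)

def node_sphere(child_spheres):
    """Bounding sphere of a node, built bottom-up from its children."""
    s = child_spheres[0]
    for other in child_spheres[1:]:
        s = enclose(s, other)
    return s
```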

The properties of the resulting BqR-Tree data structure are the following:

– The structure is a tree of the nodes' bounding spheres (quadrants are disjoint but not the bounding spheres of the nodes associated with them) and ends with the bounding-boxes-spheres and the geometry of the blocks.

– The complexity of the search for an element is proportional to the depth of the tree, being O(n) in the worst case, where n is the maximum depth of the tree, which tends to be small. Union, intersection and complement share the same complexity, which leads to very efficient tree culling.

One basic aim of our work has been the management of not only static entities but also dynamic mobile elements (MEs). Every ME is always associated with a street and a direction. Each street is composed of two directions and every direction (semi-street) is associated with a block. Therefore, every ME is associated in an unambiguous way with only one block, and the geometry of each mobile element is stored in the corresponding leaf of a final node. Thus, the position and speed of every ME is easily and quickly known at all times. When the MEs are moving, it is necessary to inspect their connections. Only the MEs whose movement leads them to another street need to be inspected. If the new street belongs to a different block, then the ME must be disconnected from its old block and connected to the new block. But in fact, not all MEs have to be re-connected, only those that are inside

Fig. 3 Tree representation of the example seen in Fig. 1

Fig. 2 Quadtree decomposition of the example in Fig. 1


the view frustum or have just left it. For the rest of the MEs, it is enough to annotate the change and implement it when they come into the view frustum. Connections are performed by means of a pointer to the ending node of the block (the parent of the geometry of the block). The geometry of the ME is stored in a leaf which has a pointer to its parent, the node of the block; thus the leaves of the block and the ME are siblings. Re-connection is performed by re-writing the pointer of the ME from its old parent to its new parent.
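The re-connection mechanism can be sketched as a single parent-pointer rewrite, with the lazy update for off-screen MEs as a pending annotation. All class and attribute names below are hypothetical; this is only an illustration of the pointer discipline described above.

```python
# Hedged sketch of ME re-connection: an ME's leaf is a sibling of the
# block's geometry leaf, so moving it to another block is one pointer
# rewrite. MEs outside the view frustum only annotate the pending change.

class BlockNode:
    def __init__(self, name):
        self.name = name
        self.children = []          # geometry leaf plus any ME leaves

class MobileElement:
    def __init__(self, parent):
        self.parent = parent
        self.pending = None         # block change noted while off-screen
        parent.children.append(self)

    def move_to(self, new_block, in_frustum):
        if in_frustum:
            self.parent.children.remove(self)   # disconnect from old block
            new_block.children.append(self)     # connect to new block
            self.parent = new_block             # rewrite the parent pointer
        else:
            self.pending = new_block            # defer until visible again

    def enter_frustum(self):
        if self.pending is not None:
            self.move_to(self.pending, in_frustum=True)
            self.pending = None

a, b = BlockNode("A"), BlockNode("B")
bus = MobileElement(a)
bus.move_to(b, in_frustum=False)    # off-screen: change only annotated
bus.enter_frustum()                 # change applied on entering the frustum
```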

3.2 The semantic layers

Each semantic layer is built on the basis of the association and grouping of elements with a related meaning in the tree structure. The semantic layer is only composed of nodes, and nodes correspond to clusters of elements with a related semantic meaning. The leaves of the semantic layers are stored in the graphic layer, and these leaves are shared by all the semantic layers. The semantic layer therefore has no leaves itself; the final nodes of the layer are connected to the leaves of the geometric layer. Thus, all the geometric elements with a related semantic meaning belong to one or another node of the semantic layer (see Fig. 4).

Elements of the graphic layer without a semantic meaning are not connected to asemantic layer. Thus, these elements will never be inside the semantic view frustum.

The model of the city of Zaragoza used for the BqR-Tree example will also be used to illustrate the process of constructing the 3D scene graph and performing the semantic view culling process. In the example, the visualization application is used to help the user locate the most appropriate bus line for him or her.

The graphic layer for the city model under consideration is built first; following this, three bus lines are implemented on it. The bus lines are represented by a collection of buses moving in the direction of the real buses (see Fig. 5).

Each bus line is composed of the semi-streets (excepting one-way streets) which correspond to the route of the bus. And since the semi-streets are the constituents of the

Fig. 4 Example of a Semantic 3D Scenegraph containing the graphic layer and one semantic layer


block and they are inside the block, the semantic nodes of the bus lines are connected to the blocks of the graphic layer.

The root node of the semantic layer corresponds to "Public transport". Descendants of the root node are bus lines 1, 2, and 3, which are the final nodes of this tree. Every "bus line" node is connected with the leaves (blocks) of the graphic layer which correspond to the semi-streets along which the bus lines run. The number of nodes and the depth of the semantic layer of the scene graph depend on the complexity of the implemented subject.
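The bus-line layer just described can be sketched as follows. Everything here is an illustrative assumption (four invented block names, three toy routes); the point is only that each "bus line" node references graphic-layer leaves, and a block crossed by two routes is shared by both semantic nodes.

```python
# Hedged sketch of the "Public transport" semantic layer: a root node,
# bus-line nodes as its descendants, and each bus-line node connected to
# the graphic-layer leaves (blocks) its route runs through.

blocks = {name: {"name": name} for name in ("b1", "b2", "b3", "b4")}

def bus_line(label, route):
    # Final node of the semantic layer: no leaves of its own, only
    # references to leaves owned by the geometric layer.
    return {"label": label, "geo_leaves": [blocks[b] for b in route]}

public_transport = {
    "label": "Public transport",
    "children": [
        bus_line("line 1", ["b1", "b2"]),
        bus_line("line 2", ["b2", "b3"]),
        bus_line("line 3", ["b4"]),
    ],
}

# A block crossed by two routes (b2) is shared by both semantic nodes:
shared = [n["label"] for n in public_transport["children"]
          if blocks["b2"] in n["geo_leaves"]]
```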

3.3 Semantic real-time visualization: geometric and semantic view culling

Several view culling processes are performed for every frame: one for the graphic layer and one for each semantic layer. The semantic 3D scene graph supports the selection of the semantic elements which will be interactively seen at rendering time. A square, similar to a magic lens, delimits the semantic region of interest while the user is navigating. The semantic region of interest is depicted for the user as a rectangle (the orange rectangle that can be seen in the next figures). This rectangle is the superior base of a truncated pyramid corresponding to the semantic view frustum; the inferior base of the pyramid would be at infinity. This rectangle delimits the semantic region of interest and belongs to the GUI, but a variation of this rectangle involves a variation of the semantic frustum and, therefore, a variation of the results of the culling. The view culling of the geometric layer is performed against the frustum of the window, and the view culling of the semantic layers is performed against the semantic frustum. The size of this square can be adjusted by the user, who can thus select the size of the area in which he or she is interested. The semantic frustum is therefore an interactively adjustable but restricted frustum, with a size delimited by the square that may be adjusted by the user. All the semantic elements outside the square are culled out while traversing the semantic layer of the scene graph and, therefore, are not rendered onto the screen. Note, however, that if at least one leaf of a semantic node is inside the semantic frustum, then all the leaves of the node will be displayed. At the end of the process it is necessary to render the geometry and, therefore, to apply standard geometric view culling over the remaining geometry, which is also especially efficient, since it is based on the use of the BqR-Tree data structure.

It is important not to confuse the proposed semantic view culling technique with the usual magic-lens technique, such as the one present in the work of Mendez et al. [18]. Although both techniques use a rectangle to delimit the region of interest, the techniques are very different and so are their results. All the elements inside the frustum of a magic lens are displayed at high resolution and elements outside are displayed at low

Fig. 5 The three bus lines implemented without semantic view culling


resolution; it is a LOD. In the case of semantic view culling, all the elements semantically related with the element that is inside the frustum of the semantic view culling are displayed at high resolution, although they are usually outside the semantic frustum, and all the elements not related with it are not displayed.

The culling process takes place as follows. First, semantic view culling is performed. Nodes accepted in semantic culling are marked as visible, and so are their attached geometric leaves. This process takes place very quickly because of the bounding volumes of the nodes. Four situations are possible, the first three being the usual ones in geometric view culling:

1. If a semantic node is completely outside the semantic view frustum, then the node and all its leaves are rejected, as usual.

2. If the semantic node is partially inside the semantic view frustum, all its leaves must be tested for their visibility (also as usual).

3. If the semantic node is completely inside the semantic frustum, the node and all its leaves are accepted.

4. Finally, and this differs slightly from the usual view culling, an unusual case must be considered: if the bounding volume of the semantic node is larger than the semantic frustum, it is possible that none of the leaves are inside the semantic view frustum, and then the node and its leaves should be rejected. This is the case in which the semantic node includes elements not connected to the semantic node, e.g., it is donut-shaped: the bounding volume of the semantic node is the outside of the donut and the semantic frustum is inside the hole of the donut. The semantic frustum is inside the bounding volume of the semantic node but there are no semantic elements inside the semantic frustum. Therefore, a testing process of the leaves against the semantic view frustum is necessary until one leaf is accepted or all of them are tested.
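The four cases above can be sketched in a few lines. To keep the sketch self-contained, volumes are reduced to 1D intervals (a stand-in for the real bounding-volume/frustum tests); all names are ours. Note how the "donut" case falls out naturally: a node that engulfs the frustum is tested leaf by leaf, and if no leaf overlaps, everything is rejected, while a single accepted leaf causes all the node's leaves to be displayed, as the text specifies.

```python
# Hedged 1D sketch of the four-case semantic view culling test.
# Volumes and the frustum are (lo, hi) intervals; names are hypothetical.

def classify(volume, frustum):
    (lo, hi), (flo, fhi) = volume, frustum
    if hi < flo or lo > fhi:
        return "OUTSIDE"
    if lo >= flo and hi <= fhi:
        return "INSIDE"
    return "PARTIAL"

def semantic_cull(node_volume, leaf_volumes, frustum):
    """Return the leaves to mark visible for one semantic node."""
    c = classify(node_volume, frustum)
    if c == "OUTSIDE":                  # case 1: reject node and all leaves
        return []
    if c == "INSIDE":                   # case 3: accept node and all leaves
        return list(leaf_volumes)
    # Cases 2 and 4: the node partially overlaps, or engulfs, the frustum.
    # Test leaves until one is accepted or all are rejected; one accepted
    # leaf means ALL the node's leaves are displayed (see the text above).
    for leaf in leaf_volumes:
        if classify(leaf, frustum) != "OUTSIDE":
            return list(leaf_volumes)
    return []                           # case 4: donut-shaped node, no hit
```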

Static and mobile elements which do not belong to a visible semantic node will not be visible, which greatly improves rendering time. However, sometimes it may be considered more adequate to always show certain elements for different reasons, such as user orientation, even if they have no semantic meaning (as in our example, where the geometric elements are still displayed). To do so, all of these geometric elements are left unconnected to the semantic tree and are marked as never culled in the semantic view culling. Finally, a standard view culling of the geometric layer for the leaves marked as visible returns the set of elements that will be sent to the graphic card.

We will continue with the public transport example presented in the previous section to show how semantic view culling is performed. It is implemented by an adjustable orange square that delimits the semantic view frustum. Blocks which do not belong to a semantic node (bus line) inside the orange square should not be visible; nevertheless, for reasons of clarity, in the example and in the later tests only mobile elements have been culled away. Thus it is possible to see the entire city, resulting in better orientation for the user.

In Figs. 5 and 6 it is possible to see different results of the semantic view culling process. In those figures, the city is shown in grey scale for the purpose of clarity. Figure 5 shows the city with the implemented bus lines, whereas Fig. 6 shows the city after applying semantic view culling; in this instance, two lines (red and blue buses) are inside the semantic frustum but the other bus line is outside.

Figures 7 and 8 show the influence of semantic view culling on walkthroughs. The usefulness of semantic view culling can also be appreciated; when the user changes his or her point of view, the semantic environment changes too. This could be particularly useful if implemented on virtual reality head-mounted displays. In Fig. 7 two lines (red and


blue buses) are inside the semantic frustum while in Fig. 8, only one semantic node (red buses) is inside the semantic frustum. In Fig. 7 one may also appreciate that, although only one element (one semi-street or block) of the semantic node (the red bus line) is inside the semantic view frustum (orange square), all the elements of the node (all the red buses) are displayed.

4 Visualization tests

The data used for the tests belong to the city of Zaragoza (Spain) and have been kindly provided by the city council. The initial file was in Microstation format and its data were converted to a 2,013,900-point text format. The BqR-Tree structure built comprised 145 nodes and 96 leaves. Auxiliary files for the 96 blocks, as well as one file with the 300 MB 3D model and its 1,688,218 triangles, were created. All tests were performed on a computer with a dual Pentium 4 Xeon processor at 2.8 GHz and 2 GB of memory. The graphic card was a GeForce FX 6800 Ultra with 256 MB of memory. The operating system was Windows XP.

4.1 Testing the impact of the BqR-Tree

The scene graph OpenGL Performer has been used to test the proposed BqR-Tree data structure. In order to load the structure, a file with all the geometry and the BqR-Tree structure elements arranged in nodes and leaves is supplied to Performer. A dll reads it: nodes are loaded as pfGroup, blocks are loaded as pfGeoset, and pfBuilder creates the tree in Performer format.

Fig. 6 Two of the three bus lines are inside the semantic view frustum

Fig. 7 Two bus lines are inside the semantic view frustum on this walkthrough

Of course, any other scene graph may be used, as only a new dll, generated for the new format, is required. Besides, the preprocessing programs are written in Tcl and the associated dll in C. Both are independent of the operating system and hardware and are, therefore, totally portable. Although Performer is equipped with tools for speeding up rendering, in order to show that the increase in rendering speed is caused only by the new structure, the only acceleration technique implemented is view culling, which is directly derived from the culling of the structures. Performer, like other scene graphs, implements its view culling in a top-down manner: the bounding volume of the first node, the root, is tested against the view frustum; if the bounding volume is completely or partially inside the view frustum, the culling process continues with its descendants; otherwise the node and its descendants are culled away because they are not visible.
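The top-down traversal just described can be sketched as follows; for brevity the "frustum" is a 2D rectangle and the node dictionaries are hypothetical stand-ins for Performer's node types:

```python
# Top-down hierarchical view culling: a node completely outside the frustum is
# discarded together with all its descendants; otherwise the test recurses.
# A real implementation tests bounding volumes against the six frustum planes;
# a 2D rectangle is used here to keep the sketch short.

OUTSIDE, PARTIAL, INSIDE = 0, 1, 2

def classify(bbox, frustum):
    bx0, by0, bx1, by1 = bbox
    fx0, fy0, fx1, fy1 = frustum
    if bx1 < fx0 or bx0 > fx1 or by1 < fy0 or by0 > fy1:
        return OUTSIDE
    if bx0 >= fx0 and bx1 <= fx1 and by0 >= fy0 and by1 <= fy1:
        return INSIDE
    return PARTIAL

def cull(node, frustum, visible):
    if classify(node["bbox"], frustum) == OUTSIDE:
        return  # node and all descendants are skipped
    if "children" in node:
        for child in node["children"]:
            cull(child, frustum, visible)
    else:
        visible.append(node["id"])  # a leaf (block) to be rendered

leaf_a = {"id": "block A", "bbox": (0, 0, 10, 10)}
leaf_b = {"id": "block B", "bbox": (100, 100, 110, 110)}
root = {"bbox": (0, 0, 110, 110), "children": [leaf_a, leaf_b]}
visible = []
cull(root, (0, 0, 50, 50), visible)
print(visible)  # ['block A']
```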

The starting point for the tests was the previously mentioned 3D model of the city, which is comprised of 1,688,023 triangles. A program with the essential elements allowing free or guided camera movements was implemented. A route containing all the usual camera movements in flights and walkthroughs was designed; this route is the same for all the structures tested.

The route (see Fig. 9) begins at the outskirts of the city at ground level and undertakes a walkthrough to the heart of the city. Once there, the camera rises vertically until it is above the rooftops, facing the ground. Finally, a diagonal flight is performed from this position to the ground, along with camera rotations pointing in the direction of the movement.

Fig. 8 One bus line is inside the semantic view frustum on this walkthrough

Fig. 9 Camera route used in the tests. To facilitate the identification of urban blocks, these are coloured instead of textured


To determine the real improvement resulting from the use of the BqR-Tree, it was decided to compare it with the data structure which yielded the best results in the tests performed by other authors. As stated in the Related Work section, the VAMSplit R-Tree structure is the one which leads to the best rendering speed. View culling is applied to both data structures in the same manner: the bounding volumes of the nodes and leaves are tested against the view frustum pyramid. An unstructured model was also used as a reference.

The average performance of each structure is shown in Table 1. If the ratio between the average rendering times of the structures is calculated, a marked improvement may be appreciated: the result of the BqR-Tree/VAMSplit R-Tree division is 0.638 and that of the BqR-Tree/No Tree division is 0.261. Therefore an improvement of almost 40% can be observed in relation to the VAMSplit R-Tree structure, and the results are 75% better than when no structure is used.
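The quoted ratios follow directly from the per-frame timings in Table 1 and can be checked in a few lines (the 0.638 figure presumably comes from unrounded timings):

```python
# Speed-up ratios recomputed from the Table 1 timings (seconds per frame).
no_tree, vamsplit, bqr_tree = 0.185, 0.0758, 0.0483

print(round(bqr_tree / vamsplit, 3))  # 0.637 (quoted as 0.638)
print(round(bqr_tree / no_tree, 3))   # 0.261
print(round((1 - bqr_tree / vamsplit) * 100))  # 36, i.e. "almost 40%" faster than VAMSplit
print(round((1 - bqr_tree / no_tree) * 100))   # 74, i.e. roughly 75% faster than no structure
```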

Cars have been used as mobile elements for test purposes. The car model used is composed of 387 triangles. The cars' geometry is stored in an independent node, to which neither LOD nor backface culling is applied. Tests were run on the previously mentioned city model, with cars initially randomly situated on the streets of the city. In every frame, every car is displaced a fixed amount. Tests with cars attached to the BqR-Tree data structure and to the previously mentioned VAMSplit R-Tree data structure have been performed.

Figure 10 shows the average rendering times plotted for both data structures with 100 and 1,000 cars moving in the city. As shown in the figure, the differences between the two data structures remain after the addition of mobile elements.

In order to improve rendering times, the addition of other acceleration techniques is necessary. The usual solution is the implementation of LODs. Therefore, the same tests as before have been performed with cars at different levels of detail: the models range from 18 to 1,701 triangles.

The evolution of the average rendering times for both data structures when increasing the number of cars is plotted in Fig. 11. It is easy to see that the differences between the two data structures persist, although the absolute rendering time has decreased in both cases due to the implementation of LOD.
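A distance-based LOD switch of the kind used in these tests can be sketched as follows; the triangle counts are taken from the text (18 to 1,701 triangles, 387 for the base model), while the switching distances are pure assumptions:

```python
# Distance-based LOD selection for the car models. The triangle counts come
# from the text (18 to 1,701 triangles, 387 for the base model); the switching
# distances are assumptions for illustration only.

LOD_LEVELS = [  # (max camera distance, triangles of the model to use)
    (50.0, 1701),   # close-up: most detailed model
    (150.0, 387),   # medium range: base model
    (400.0, 18),    # far away: coarsest model
]

def select_lod(distance):
    """Triangle budget of the car model rendered at this camera distance;
    beyond the last threshold the car is not drawn at all."""
    for max_dist, triangles in LOD_LEVELS:
        if distance <= max_dist:
            return triangles
    return 0

print(select_lod(30.0))    # 1701
print(select_lod(1000.0))  # 0
```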

Table 1 Average rendering times and visible triangles per frame for each structure

Structure    Seconds/frame    Triangles/frame
No tree      0.185            1,688,023
VamSplit     0.0758           704,873
BqR-Tree     0.0483           420,894

Fig. 10 Evolution of the average rendering time (in seconds per frame) with the number of cars in the city for both structures (Vs dotted line: VAMSplit R-Tree; Bq continuous line: BqR-Tree)


4.2 Testing the impact of the semantic view culling

Although the aim of semantic view culling is to allow meaningful navigation, an improvement in rendering time is also obtained, due to the prior culling of the unwanted geometry. It is therefore interesting to test the influence of the semantic layers over the BqR-Tree data structure in terms of rendering time. Thus a test comparing rendering times with and without the semantic layer was performed on the basis of the urban transportation example.

The city model was populated with 52 buses, each of which is composed of 659 triangles for the first test and of 24,139 triangles for the second. The first test uses a very small number of triangles per mobile element and the second a very high number, in order to test the performance of the semantic tree according to the number of mobile triangles. Three bus routes were implemented. The BqR-Tree data structure was tested with and without the semantic tree. The model without semantics is the same model presented in a prior section, but this time the mobile elements move along predefined routes instead of randomly. The second model is the 3D semantic tree previously presented, in which semantic view culling is implemented besides the graphic view culling of the BqR-Tree data structure. The same camera route already presented in previous sections was used to study rendering time.

Results can be seen in Fig. 12, which shows the average rendering time in seconds per frame for each data structure; the upper line corresponds to the BqR-Tree alone and the line below it corresponds to the semantic view culling of the 3D tree, i.e., the semantic layer plus the BqR-Tree. The introduction of semantic view culling therefore improves rendering time noticeably, a difference which increases when the number of triangles of the mobile elements also increases. Thus, one may conclude that the time invested in prior semantic view culling is more than adequately made up for by the smaller amount of triangles sent to the graphics card. Moreover, this improvement is produced only by the semantic culling of the mobile elements; if the blocks were to be culled along with the mobile elements, the improvement would be larger.

Fig. 11 Evolution of the average rendering time of both data structures (Vs dotted line: VAMSplit R-Tree; Bq continuous line: BqR-Tree) when increasing the number of LOD cars

Fig. 12 Rendering time for each structure in seconds/frame

5 Conclusions

The purpose of this work is the semantic visualization of complex 3D city models containing numerous dynamic entities, as well as the performance of interactive semantic walkthroughs and flights without predefined paths. This is achieved by using a 3D multilayer scene graph that integrates geometric and semantic information, as well as by performing efficient geometric and what we call semantic view culling. The outstanding features of our work are:

– The graphic layer, the BqR-Tree, is defined by considering the city block as the basic and logical unit. The advantage of the block as opposed to the traditional unit, the building, is that it is easily identified regardless of the data source format, and it allows the inclusion of mobile and semantic elements in a natural way.

– The structure of the multilayer scene graph integrates the geometric layer as well as all the different semantic contexts desired.

– Semantic view culling allows faster navigation and the rendering of meaningful semantic elements.

– The usefulness of this 3D structure has been tested with low-structured city data, which makes its application appropriate to almost all city data.

Regarding future developments, research focuses on integrating additional acceleration techniques in very extensive urban scenes and on implementing semantic view culling on head-mounted displays.

Acknowledgements This work has been partly financed by the Spanish Dirección General de Investigación, contract number TIN2007-63025, and by the Government of Aragón by way of the WALQA agreement.


Jose Luis Pina studied his degree in Physical Sciences in the Faculty of Sciences of the University of Zaragoza (Spain). He obtained his DEA certification (Advanced Studies Certification) in the Computer Science Department, University of Zaragoza. He is now finishing his PhD in Visualization with the Advanced Computer Graphics Group (GIGA) of the Computer Science Department, University of Zaragoza, under the direction of Francisco J. Seron and Eva Cerezo. http://jlpina.blogspot.com/.


Eva Cerezo obtained a B.S. degree in Physics in 1990 and an M.Sc. degree in Nuclear Physics in 1992 from the University of Zaragoza, Spain. She received a Ph.D. degree in Computer Science in 2002. She is currently an Associate Professor at the Computer Science and Systems Engineering Department at the University of Zaragoza and a member of the Advanced Computer Graphics Group. Her research fields are visualization, virtual humans, and affective multimodal human-computer interaction. http://giga.cps.unizar.es/fotos/faces/eva.html.

Francisco J. Serón is the head of the Advanced Computer Graphics Group, named GIGA (http://giga.cps.unizar.es), of the University of Zaragoza in Spain. Its laboratories are located in the Polytechnical Center in the same city. Dr. Serón has done research and innovation work in fields such as the simulation of natural phenomena, visualization and computer graphics techniques, and numerical computation and parallel computing. He has published numerous papers in these areas. His research has been funded by regional, national and European Union government agencies and by private corporations including IBM, GME, CASA, INDAL, WWP… http://webdiis.unizar.es/~seron/.

