Alternative approaches to multicast group management in large-scale distributed interactive simulation systems


Future Generation Computer Systems 22 (2006) 755–763 www.elsevier.com/locate/fgcs


Azzedine Boukerche∗, Caron Dzermajko, Kaiyuan Lu

PARADISE Research Laboratory, SITE, University of Ottawa, Canada

Available online 3 May 2006

Abstract

A primary concern in implementing and executing large-scale distributed simulations is limiting and controlling the volume of data, regarding simulated entities, exchanged between entities participating in the simulation. Computer scientists in the academic, military and corporate world have spent much time studying and applying various methods of Data Distribution Management (DDM) to learn the strengths and weaknesses of each approach, improve upon existing DDM methods, and discover the most efficient method to use for a particular application. The key to efficient DDM is to successfully limit the data sent to only the data that is needed, and to direct that data to only those entities requiring the data. In this paper, we explain the benefits and goals of DDM, define and describe the various methods of DDM, compare similarities and differences in DDM methods, and discuss several existing DDM implementations. In our discussion of DDM methods, we include Region-Based, Fixed and Dynamic Grid-Based, Hybrid, Agent-Based and Class-Based DDM and compare and contrast these methods and their applications. We also discuss existing High-Level Architecture (HLA)-compliant and non-compliant Run-Time Infrastructures (RTI) and their DDM implementations. Our goal is to promote an understanding of the benefits of DDM and to offer a detailed explanation of the available DDM methods and the RTIs that employ those methods.

© 2006 Published by Elsevier B.V.

1. Introduction

Data sharing and management across a large-scale distributed simulation environment requires message passing between processors and the Run-Time Infrastructure (RTI) to coordinate causality and state information throughout a simulation execution. Simulating many entities across many different hosts can increase communication across a network on the scale of O(N^2), where N is the number of processors or hosts [9]. Data Distribution Management (DDM) is a service that seeks to control the volume of messages exchanged during a simulation, thereby decreasing the workload on the simulation hosts. The key to reducing message volume in a large-scale distributed simulation is to find an efficient way of determining what data is relevant to which simulation hosts and to relay events and state information only to those applications that require them [3,1,4,7,9,14,13,18]. As one of the six services supported by the Run-Time Infrastructure (RTI), as defined by the High-Level Architecture (HLA) specifications [9,21], DDM has been the subject of much research in recent years and will continue to be, as long as the need to perform large-scale distributed simulations requires more scalability and accuracy in DDM.

This work was partially supported by the Canada Research Chair Program, Canada Foundation for Innovation Funds, OIT/Ontario Distinguished Researcher Award, and the NSERC Grant and Early Research Career Award.

∗ Corresponding author. E-mail address: [email protected] (A. Boukerche).

0167-739X/$ - see front matter © 2006 Published by Elsevier B.V. doi:10.1016/j.future.2006.02.001

HLA is a recently approved Institute of Electrical and Electronics Engineers (IEEE) standard, designed under the direction of the Defense Modeling and Simulation Office (DMSO), to link heterogeneous simulators across a common architecture, with reuse and interoperability as its primary goals [6,9,14,13,4,21]. With the recent approval of HLA as an IEEE standard came increased interest in the corporate, military and academic sectors, which has spawned much research in the area of DDM. As a result, there are several DDM strategies currently in use and various RTIs with different DDM implementations [2].

The objective of this paper is to review several methods of DDM currently implemented in various RTIs. The remainder of this paper is organized as follows. Section 2 gives a brief history of High-Level Architecture. Section 3 presents an overview of DDM. Multicast Group Communications is introduced in Section 4. In Section 5, we discuss various DDM strategies, including the Region-Based, Hybrid, Agent-Based, Class-Based and Grid-Based (including Fixed Grid-Based and Dynamic Grid-Based) DDM methods. Section 6 presents existing RTIs and discusses the DDM implementation of each RTI. Section 7 concludes the paper with a summary of the DDM methods and their RTI implementations, as well as a brief discussion of our plans for future research.

2. High-level architecture (HLA)

High-level architecture (HLA) is a way to link multiple and different simulations across a common architecture. Reuse and interoperability are the primary goals of the HLA concept, developed for the US Department of Defense (DoD) in the mid to late 1990s [8,4,21].

The DoD awarded contracts for the definition of HLA in 1994. The Defense Modeling and Simulation Office (DMSO) reviewed the presentations of that definition in March 1995 [7]. Ultimately, the HLA took the form of three documents, the Interface Specification, the HLA Rules, and the Object Model Template, which define the functionality, rules and interfaces between federates and the RTI [8]. Throughout the late 1990s and into the 21st century, these documents underwent modifications and enhancements through the joint efforts of military and non-military personnel. In addition to a team of DMSO employees, contracted companies and individuals directly involved in the HLA project, various companies and universities are independently researching methods to implement HLA efficiently and cost effectively. In the Fall of 2000, IEEE adopted the most recent versions of the HLA documents, 1516, 1516.1 and 1516.2, as an industry standard, broadening the HLA community even further [13].

The HLA is made up of distinct modular components, each of which has a clearly defined function within a federation. A federation is a designated group of simulations linked together by the HLA [8,4,21]. One of the components that makes up the HLA is the set of simulations or federates that form a federation. A federate is a broad term that encompasses any simulation, utility or interface that will run independently, but whose entities may interact with other federates’ entities, through messaging and data exchange that is controlled and monitored by the Run-Time Infrastructure (RTI) and facilitated by a run-time interface. The term federate might refer to a data collector or a live participant interface, as well as any computer or manned simulator [8,4]. Federates communicate information to each other through the RTI, using a run-time interface.

The run-time interface provides a way for federates to call upon the various services of the RTI. The interface is designed to set a standard for exchange between any federate and the RTI, independently of object models and data exchange requirements of a federation [21]. The central component of the HLA is the RTI, in that any and all interactions between federates occur through the RTI, which offers six classes of service to those federates. This HLA modular design supports the DoD and DMSO goals of reuse and interoperability of simulations. The following briefly describes the RTI, the central component of the HLA.

The Run-Time Infrastructure (RTI) is another component of the HLA, which offers six services to facilitate the interface between federates across the HLA. The Interface Specification document defines and describes these services [21]. The focus of this paper, data distribution management (DDM), is one of the six services that the RTI provides to facilitate the exchange of state information of simulated entities between simulators (federates) linked through the RTI. The following section describes the function of the RTI in detail and discusses current implementations and ongoing research into a more efficient RTI to support larger-scale simulations.

3. Overview of Data Distribution Management (DDM)

Before we move forward with our discussions about DDM, there are a few terms that we need to define.

• Dimension: A named coordinate axis.
• Extent: A sequence of ranges, one for each dimension.
• Federate: A simulator or processor participating in a federation simulation.
• Federation: A distributed simulation.
• Interest region: A general term referring to publication and subscription regions.
• Intersecting region: The intersection of a publication and subscription region.
• Name space: A set of values used to define and describe information of interest.
• Publication region: A defined set of data for which messages deliver state updates to subscribing federates.
• Range: A set of ordered pairs defining a continuous interval on a dimension.
• Region: A set of extents that are bound to declare a routing space.
• Routing space: A multi-dimensional coordinate system.
• Subscription region: A defined subset of the name space that describes the data in which a federate is interested.
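The relationships between these terms can be sketched as a handful of Python types. This is only an illustration: the class names, the field names and the `validate` helper are hypothetical and are not taken from the HLA Interface Specification or from any RTI. A range is simplified to a single (lower, upper) pair per dimension.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical types for illustration; no RTI exposes exactly these names.
Range = Tuple[float, float]  # (lower, upper) bounds on one dimension


@dataclass
class Extent:
    """One range for each named dimension of the routing space."""
    ranges: Dict[str, Range]


@dataclass
class Region:
    """A set of extents bound to a routing space, tagged by interest type."""
    owner: str            # federate that declared the region
    kind: str             # "publication" or "subscription"
    extents: List[Extent]


def validate(region: Region, dimensions: List[str]) -> bool:
    """Check that each extent has one well-formed range per dimension."""
    for extent in region.extents:
        if set(extent.ranges) != set(dimensions):
            return False
        if any(lo > hi for lo, hi in extent.ranges.values()):
            return False
    return True


# A normalized routing space with two spatial axes and a team axis (assumed).
routing_space = ["x", "y", "team"]
radar = Region(
    owner="Fed 1",
    kind="subscription",
    extents=[Extent({"x": (0.0, 0.5), "y": (0.0, 0.5), "team": (0.0, 1.0)})],
)
print(validate(radar, routing_space))
```

Keeping the dimension names as dictionary keys means an extent that omits a dimension is rejected rather than silently treated as unbounded, which is one possible design choice among several.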

The aim of DDM is to limit and control the volume of data exchanged during a simulation, and reduce the processing requirements of simulation hosts by relaying events and state information only to those applications that require them [3,1,4,7,9,14,13,18]. To accomplish this task, we use a multi-dimensional coordinate system known as a routing space, which defines areas of interest to federates, called publication and subscription regions, and identifies intersection regions, which are the intersections of publication and subscription regions [3,4,7,9,4,21]. Fig. 1 illustrates a two-dimensional routing space with two regions.

3.1. Routing space

The multi-dimensional coordinate system, known as a routing space, is a grid without partitions [4], within which a federate declares interest in receiving or sending state updates and interaction notifications. We declare routing spaces through the HLA’s federation execution data interchange format (FED DIF), which is a standard file exchange format [9,21] to facilitate the use of tools for data file transfer and exchange between federates. This declared routing space is a normalized coordinate system [7,9] consisting of at least two dimensions (usually more) and representing the simulation environment or virtual world of the simulation.

Fig. 1. Two-dimensional routing space.

The original HLA design incorporates the use of multiple routing spaces. However, implementation of this design can lead to problems with sharing a single attribute over multiple routing spaces. As a result, the IEEE 1516 version HLA specification includes several changes to the DDM service. One of these changes is the replacement of multiple routing spaces by a single coordinate space, facilitating the routing of additional attributes to base class definitions. This change also results in a reduction in coding modifications necessary to each federate when a dimension is added to a routing space shared by those federates. Another difference in the IEEE 1516 version of the Interface Specification is the elimination of extents to improve the accuracy of interaction notifications [13]. We explain extents in further detail in Section 3.2.

To put this into perspective, consider an example. Imagine a simulation of a military tank opposing an enemy fortress. At the least, this simulation requires a routing space consisting of three dimensions. Two spatial dimensions are required to represent tank movement (North–South and East–West), and a team or country designation is needed to indicate opposition. The team determines an entity’s ability to detect or be detected by an entity on the opposite team, which brings us to the next subsection, describing publication and subscription regions.

3.2. Publication and subscription regions

A subscription region is the area of interest declared by a federate, within a routing space. A publication region is the set of data made available to other federates by a simulated entity, or federate. An interest region is a general term that refers to both publication and subscription regions [3,7,4]. In the versions of the HLA Interface Specification prior to IEEE 1516, an extent is an ordered pair indicating the minimum and maximum ranges of a region across a routing space dimension.

As previously mentioned, the IEEE 1516 version eliminates extents from the HLA specifications, and replaces them with region sets [13]. An example of a subscription region is the radar range of a ground-based radar (GBR) simulator. The maximum range of the radar defines the boundaries of the subscription region. Similarly, the entities in a simulation that are designed to be detected by the GBR have defined publication regions to update the GBR with position data and collisions or interactions. The area where these subscription and publication regions overlap is called an intersection region, which we describe further in the following subsection.

3.3. Intersection region

The intersection of defined publication and subscription regions is called an intersection region. An intersection region exists when corresponding region sets overlap. When an intersection region exists, data exchange occurs between the subscribing federate and the publishing federate. How to identify these intersection regions and the method in which publishing and subscribing federates exchange data are the basic concerns of DDM [4]. The following section describes one method of sending data once an intersection region is detected.
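To make the overlap test concrete, the following minimal Python sketch computes the intersection of one publication extent and one subscription extent. It assumes axis-aligned extents with closed intervals, so regions that merely touch count as overlapping. The function name `intersect` and the sample coordinates are invented for illustration.

```python
from typing import Dict, Optional, Tuple

Range = Tuple[float, float]   # (lower, upper) on one dimension
Extent = Dict[str, Range]     # dimension name -> range


def intersect(pub: Extent, sub: Extent) -> Optional[Extent]:
    """Return the overlapping extent, or None if any dimension does not overlap."""
    result: Extent = {}
    for dim, (p_lo, p_hi) in pub.items():
        s_lo, s_hi = sub[dim]
        lo, hi = max(p_lo, s_lo), min(p_hi, s_hi)
        if lo > hi:
            # A gap on a single dimension is enough to rule out an intersection.
            return None
        result[dim] = (lo, hi)
    return result


publication = {"x": (0.2, 0.6), "y": (0.1, 0.4)}
subscription = {"x": (0.5, 0.9), "y": (0.3, 0.8)}
far_away = {"x": (0.7, 0.9), "y": (0.3, 0.8)}

print(intersect(publication, subscription))  # overlap on both dimensions
print(intersect(publication, far_away))      # no overlap on x
```

Two regions intersect only if their ranges overlap on every dimension, which is why the loop can stop at the first dimension with a gap.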

4. Multicast group communications

Once an intersection region is detected, the published data must be sent to the federates subscribing to that data. Multicast group communication is a commonly employed method of transferring this data.

Some of the DDM methods that we discuss in the following sections utilize multicast groups in a variety of ways. Generally, federates join and resign from multicast groups based on their subscription and/or publication regions. A list of active multicast group names is maintained, and membership of those groups is kept up to date during a simulation execution. The publishing federates that belong to a multicast group send their data to all subscribers who are members of that same multicast group. DDM schemes differ in when a multicast group is created and what triggers its creation, whether multicast groups that become inactive for lack of members can be reused, and when and why federates join a multicast group. We will discuss multicast group communication in more detail, and with examples, as we discuss the various DDM schemes and how multicasting is implemented with a particular DDM method.
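The bookkeeping described above can be sketched as a small in-memory registry. This is an illustration only: `MulticastRegistry` and its methods are hypothetical names, no real network multicast is performed, and delivery is simulated by returning what each subscriber would receive.

```python
from typing import Dict, List


class MulticastRegistry:
    """Tracks which federates belong to which multicast group, and in what role."""

    def __init__(self) -> None:
        # group name -> {federate name -> "publisher" or "subscriber"}
        self.groups: Dict[str, Dict[str, str]] = {}

    def join(self, group: str, federate: str, role: str) -> None:
        self.groups.setdefault(group, {})[federate] = role

    def resign(self, group: str, federate: str) -> None:
        members = self.groups.get(group, {})
        members.pop(federate, None)
        if not members:
            # Drop groups with no members so the group name can be reused.
            self.groups.pop(group, None)

    def active_groups(self) -> List[str]:
        return sorted(self.groups)

    def send(self, group: str, sender: str, payload: str) -> Dict[str, str]:
        """Simulate delivery of payload to every subscriber in the group."""
        members = self.groups.get(group, {})
        if members.get(sender) != "publisher":
            return {}
        return {name: payload for name, role in members.items()
                if role == "subscriber"}


registry = MulticastRegistry()
registry.join("MG-P", "P3", "publisher")
registry.join("MG-P", "S2", "subscriber")
delivered = registry.send("MG-P", "P3", "position update")
print(delivered)
```

Whether an emptied group is deleted or kept for reuse is exactly one of the policy choices on which DDM schemes differ. This sketch deletes it.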

5. DDM methods

Researchers throughout the mid to late 1990s published the results of experimental simulations on distributed simulation systems, showing performance gains using DDM [1]. The various DDM methods that appear in this work differ in the manner in which each method determines intersections of publication and subscription regions and how they employ multicast group communications. In this paper, we intend to explain several DDM strategies, and compare and contrast the methods that they use to determine intersecting regions and exchange data.

The six DDM methods that we describe are the Region-Based [5,7,8,18], Fixed Grid-Based [3,16], Dynamic Grid-Based [1,4], Class-Based [8,18], Agent-Based [18], and Hybrid [18] DDM methods. Each DDM method identifies intersections in publication and subscription regions differently. Class-Based DDM identifies interest regions based on object class and object class attribute declarations, and discards irrelevant data at the receiving processor [18]. The Region-Based method uses either matching or a centralized DDM coordinator to detect intersection regions. The Hybrid method uses a central DDM coordinator, but limits the amount of matching performed by that DDM coordinator through the use of a grid, similar to the Grid-Based methods. The Grid-Based methods (Fixed Grid and Dynamic Grid) map interest regions to a multi-dimensional grid and determine intersecting regions by identifying those grid cells with both publishers and subscribers. The difference between the Dynamic Grid-Based and Fixed Grid-Based DDM approaches is in how they assign multicast groups. Finally, the Agent-Based DDM mechanism employs intelligent mobile agents to collect and filter data from the publishers and deliver only relevant data to subscribers.

The subsections that follow detail the DDM strategies outlined above.

5.1. Class-Based DDM

The Class-Based method of DDM utilizes an HLA service known as Declaration Management to implement class-based subscription. Federates subscribe to attributes of an object class and receive notification any time that attribute is modified. This method allows federates to track changes in attributes, but notifications are sent for every object instance that carries that attribute, usually resulting in considerable unnecessary data exchange [8].

Take, for example, a simulation of an Airborne Warning and Control System (AWACS) tracking several squadrons of enemy fighter planes. Perhaps the AWACS has radar capabilities for a radius of 500 miles and there are two squadrons outside that 500 mile radius. Using the Class-Based method, the AWACS simulator would be responsible for discarding the information received for those squadrons that are outside its radar range, thereby expending unnecessary resources to process and discard these unwanted messages.
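The receiver-side cost in this example can be illustrated with a short Python sketch. The squadron names and positions are invented, distances are planar and measured in miles, and `filter_at_receiver` is a hypothetical helper rather than part of any RTI.

```python
import math
from typing import List, Tuple

RADAR_RADIUS_MILES = 500.0

# (squadron name, x position, y position), in miles; values are made up.
Update = Tuple[str, float, float]


def filter_at_receiver(radar_pos: Tuple[float, float],
                       updates: List[Update]) -> Tuple[List[Update], List[Update]]:
    """Split received updates into those inside and outside radar range."""
    kept: List[Update] = []
    discarded: List[Update] = []
    for update in updates:
        _name, x, y = update
        distance = math.hypot(x - radar_pos[0], y - radar_pos[1])
        if distance <= RADAR_RADIUS_MILES:
            kept.append(update)
        else:
            # Already transmitted and received, then thrown away.
            discarded.append(update)
    return kept, discarded


received = [
    ("Squadron 1", 100.0, 200.0),
    ("Squadron 2", 600.0, 0.0),
    ("Squadron 3", 400.0, 400.0),
    ("Squadron 4", 300.0, 300.0),
]
kept, discarded = filter_at_receiver((0.0, 0.0), received)
print(len(kept), "kept;", len(discarded), "discarded")
```

Every update in `discarded` consumed network bandwidth and receiver processing time before being dropped, which is the overhead that value-based filtering tries to avoid.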

The following subsections describe value-based DDM methods that seek ways to filter data further, beyond the capabilities of Class-Based DDM.

5.2. Region-Based DDM

The Region-Based DDM approach, implemented in RTI 1.3 [6,7], compares the subscription and publication regions directly, in order to find which ones overlap [5,18]. We refer to this process as matching. Once the DDM mechanism identifies intersections, a second type of communication must be enabled: the sending of data from publishers to subscribers. This second kind of inter-host communication generally uses multicast technology [1], which we introduced in Section 4.

Fig. 2. Region-Based DDM.

Now we will review how the Region-Based scheme uses the information provided by matching to assign hosts to multicast groups and facilitate data transfer. Fig. 2 represents a federation with four regions, Region 1, Region 2, Region 3 and Region 4, and five federates, subscribing federates S1 and S2 and publishing federates P1, P2 and P3. Note that S2 and P3 intersect and that the DDM mechanism in a Region-Based DDM scheme would detect a match as a result of that intersection. The detection of that match would trigger Region 2 to join a multicast group, which we will call MG-P, and the DDM mechanism would trigger federates S2 and P3 to join MG-P. P3 then begins sending update data to all subscribing federates on MG-P or, in our example, to S2. Subscriptions that intersect with Region 2 also join multicast group MG-P and their federates receive triggers to join MG-P, thereby receiving any data intended for that group. This facilitates the transfer of data between P3 and all other federates interested in the data that P3 publishes in Region 2. After the DDM mechanism performs additional matching, due to the addition, modification, or deletion of regions, suppose that Region 2 is no longer part of an intersection. Then, all federates that belong to MG-P, including P3, receive triggers to leave the group, terminating the data exchange between publisher(s) and subscriber(s).

Another way of determining intersecting regions with the Region-Based method is by using a centralized coordinator. All federates send their interest information, that is, their publications and subscriptions, to the DDM coordinator, which is a dedicated processor that collects the publication and subscription information in a database. The DDM coordinator is responsible for performing the matching between publications and subscriptions and triggering federates to join or leave multicast groups based on the results of the matching. One of the limiting factors of using a DDM coordinator is the potential for a bottleneck, since all messaging performed in a simulation that uses a DDM coordinator must be triggered and processed by the single processor that is designated as the coordinator.
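As an illustration of matching, the sketch below compares every publication region with every subscription region and forms one multicast group per publication that has at least one match. The coordinates are invented so that only P3 and S2 overlap, as in the example above. The group naming (`MG-` plus the publisher name) and the function names are assumptions, not the behaviour of RTI 1.3.

```python
from itertools import product
from typing import Dict, Set, Tuple

Extent = Dict[str, Tuple[float, float]]  # dimension -> (lower, upper)


def overlaps(a: Extent, b: Extent) -> bool:
    """True when the two extents overlap on every dimension."""
    return all(a[d][0] <= b[d][1] and b[d][0] <= a[d][1] for d in a)


def match(publications: Dict[str, Extent],
          subscriptions: Dict[str, Extent]) -> Dict[str, Set[str]]:
    """Test all publication/subscription pairs; return group -> members."""
    groups: Dict[str, Set[str]] = {}
    for (pub, p_ext), (sub, s_ext) in product(publications.items(),
                                              subscriptions.items()):
        if overlaps(p_ext, s_ext):
            # The publisher is added once; each matching subscriber joins it.
            groups.setdefault(f"MG-{pub}", {pub}).add(sub)
    return groups


publications = {
    "P1": {"x": (0.0, 0.2), "y": (0.0, 0.2)},
    "P2": {"x": (0.8, 1.0), "y": (0.0, 0.2)},
    "P3": {"x": (0.4, 0.6), "y": (0.4, 0.6)},
}
subscriptions = {
    "S1": {"x": (0.0, 0.2), "y": (0.8, 1.0)},
    "S2": {"x": (0.5, 0.7), "y": (0.5, 0.7)},
}
groups = match(publications, subscriptions)
print(groups)
```

With P publications and S subscriptions this performs P × S overlap tests each time matching runs, which is the cost that the grid-based methods try to avoid.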

5.3. Grid-Based DDM

The motivation behind the Grid-Based approach is to reduce the computational and communication overhead of the Region-Based method. While the Region-Based approach necessitates matching between all publications and all subscriptions, the Grid-Based technique avoids these extensive and costly operations. Instead of explicitly comparing publication and subscription regions, an RTI maps each interest region onto a multi-dimensional grid, which represents the routing space, as shown in Fig. 3. The grid will generally have the same number of dimensions as the routing space [4].

Fig. 3. Grid overlay.

Table 1
Dynamic Grid-Based grouping

Federate   Entity           Interest type   Cells in region   Assigned to group
Fed 1      Spy plane        Subscription    1, 3              MG-3
Fed 2      Squad A planes   Publication     3                 MG-3
Fed 3      Squad B planes   Publication     4                 〈none〉

In Fig. 3, the grid blocks, or cells, are numbered 1 through 4. The subscription of a spy plane, located in cell 1, is mapped to cell 1 and cell 3. These cells represent the spy plane’s area of interest. The publication of planes in one squadron, which are the three planes located in cell 3, is mapped to cell 3. The four planes, or squadron of planes, located in cell 4 have a publication mapped to cell 4. Cells that are part of both a publication and a subscription represent the intersection. In Fig. 3, the intersection occurs in cell 3. Therefore, entities with subscriptions to cell 3, such as our spy plane, require updates about entities publishing to cell 3, like our three squadron planes.

An RTI component on each host performs the process of overlaying a grid on the terrain. Each host’s RTI component performs the region-to-cells mapping independently of the other hosts, since we initialize the number and size of the grid cells before the simulation begins and these values remain constant throughout the federation execution. This method does not require a centralized DDM coordinator and does not incur the overhead involved in the complex operation of matching. While this potential reduction in communication overhead offers advantages in scalability, the major drawback of the Grid-Based method is that its accuracy in identifying intersection regions tends to be lower, compared to the Region-Based method. The mapping of the interest regions to the grid cells may not be exact, since region boundaries tend not to fall on cell boundaries, in which case the area covered by the cells representing the interest region will be slightly larger than the region itself. This may cause superfluous intersections in some cells, leading to publishers sending subscribers unneeded data.
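The region-to-cells mapping and the resulting over-coverage can be sketched as follows. The example assumes a two-dimensional, normalized routing space divided into square cells of side 0.5, and identifies cells by (column, row) index rather than by the numbering used in Fig. 3. The function names and the sample region are hypothetical.

```python
import math
from typing import Set, Tuple

Box = Tuple[float, float, float, float]  # (x_lo, x_hi, y_lo, y_hi)
Cell = Tuple[int, int]                   # (column, row)


def region_to_cells(box: Box, cell_size: float) -> Set[Cell]:
    """Return every grid cell that the region touches."""
    x_lo, x_hi, y_lo, y_hi = box

    def span(lo: float, hi: float) -> range:
        first = math.floor(lo / cell_size)
        # An upper bound lying exactly on a cell boundary does not
        # pull in the next cell.
        last = max(first, math.ceil(hi / cell_size) - 1)
        return range(first, last + 1)

    return {(col, row) for col in span(x_lo, x_hi) for row in span(y_lo, y_hi)}


def covered_area(cells: Set[Cell], cell_size: float) -> float:
    return len(cells) * cell_size ** 2


region = (0.3, 0.7, 0.1, 0.4)
cells = region_to_cells(region, cell_size=0.5)
region_area = (region[1] - region[0]) * (region[3] - region[2])

print(sorted(cells))
print("region area:", round(region_area, 3),
      "covered area:", covered_area(cells, 0.5))
```

Because the mapping depends only on the region and the fixed cell size, each host can compute it locally without consulting any other host. The gap between the two areas is where superfluous intersections can arise.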

• Fixed Grid-Based DDM: The Fixed Grid-Based DDM method involves assigning a multicast group to each grid cell at system initialization [3,4]. As the simulation progresses, an RTI component, running on each host, maps the interest regions of that federate (we assume one federate per host) to grid cells and the federate joins the multicast groups pre-assigned to those cells.

Referring again to Fig. 3, assume that three federates participate in the simulation. One federate simulates the spy plane, which we will call Fed 1. Fed 2 simulates the three planes in cell 3, which we refer to as Squad A in Table 1. Another federate, which we will refer to as Fed 3, simulates the four planes in cell 4, also known as Squad B. So, Fed 1, Fed 2 and Fed 3, our three participating federates, which simulate a spy plane and two squadrons, Squad A and Squad B, respectively, comprise our federation. Fed 1 subscribes to the terrain within its radar range, which the RTI component maps to cell 1 and cell 3. Therefore, Fed 1 joins two multicast groups, which we refer to as MG-1 and MG-3, that map to cell 1 and cell 3, respectively. The planes that Fed 2 simulates publish in cell 3, while the planes that Fed 3 simulates publish in cell 4. So, Fed 2 joins MG-3 and Fed 3 joins a multicast group that we label MG-4.

A feature unique to the Fixed Grid-Based method is that the DDM mechanism does not, directly or indirectly, detect intersection regions. This results in the advantage of minimizing inter-federate communication. Whenever an intersection is present, data is properly transferred between the publisher and subscriber because they both join the same multicast groups. These groups correspond to the cells in the intersection, which, in this example, is cell 3, served by MG-3. However, data will be transferred for every publication cell, even when no intersection is present. The cost of this irrelevant data transmission must be weighed against the benefit of decreased dependency between hosts to evaluate the appropriateness of the Fixed Grid-Based method for a particular simulation [1,4].

• Dynamic Grid-Based DDM: The Fixed Grid-Based method offers no mechanism to prevent publishers from sending data on a group with no subscribers, such as MG-4, in our previous example. The Dynamic Grid-Based method [3,1,4] addresses this drawback of the Fixed Grid-Based scheme.

As in all grid-based approaches, the cells are defined by overlaying a grid on the terrain. While the Fixed Grid-Based scheme statically assigns multicast groups to all of the cells in the grid, the Dynamic Grid-Based scheme dynamically allocates multicast groups, based on the current publication and subscription regions in the system, and triggers hosts to join those groups, as in the Region-Based method. This DDM mechanism assigns multicast groups to only those cells in which there is at least one publishing and one subscribing federate. Thus, multicast groups correspond only to each cell that is part of the intersection of a publication region and a subscription region [1,4].

Federates join and leave the appropriate multicast groups based on trigger messages that the grid system sends. Publishers only join and transmit from a group if there is at least one subscriber interested in that data, and subscribers only join and listen in on a group if there is at least one publisher transmitting from that group. This technique has the dual advantage of preventing senders from transmitting data needlessly and reducing the number of multicast groups that a federate needs to join.

Applying the dynamic approach to the previous scenario, which we represent in Fig. 3, the grid system detects an intersection in cell 3, and Fed 1 and Fed 2 receive triggers to join MG-3 (Table 1). In this simple example, the Dynamic Grid-Based method uses only one multicast group, MG-3, compared to the three groups MG-1, MG-3 and MG-4 that the Fixed Grid-Based method allocates. For a large-scale simulation with many federates and many entities, the savings in multicast groups that the Dynamic Grid-Based scheme provides are an important advantage of this approach over the Fixed Grid-Based method for promoting scalability.

Another benefit of the Dynamic Grid-Based scheme is that the RTI components on each host perform intersection detection and triggering of federates to join multicast groups. We can draw an analogy between these distributed RTI components and distributed DDM coordinators. Each RTI component is responsible for keeping track of a specific set of grid cells. There is no central database or coordinator, making this approach more scalable than a Region-Based scheme that uses a central DDM coordinator.

5.4. Hybrid DDM

Compared to the Fixed Grid-Based method, Dynamic Grid-Based DDM seeks to reduce the number of multicast groups used by assigning multicast groups only when an intersection occurs. Similarly, the Hybrid method of DDM aims to reduce the cost of matching in the Region-Based method by only performing matching when an intersection is apparent [19]. A centralized DDM coordinator determines intersections, as in the grid-based approaches, by mapping interest regions onto a grid. Using the information gleaned from this mapping, the DDM coordinator only performs matching where there is an intersection. This use of a centralized DDM coordinator still risks a potential bottleneck and limits scalability, but the reduction in matching operations offers an advantage over the Region-Based DDM approach, while the reduction in subscription list and publication list updates improves upon the costs associated with the Fixed Grid-Based DDM scheme.

Fig. 4 illustrates an example of a tank dog fight federation with a representation of a grid overlay. In this example, all tanks publish and subscribe, meaning that we have three subscription regions, T1, T2, T3, and three publication regions, T1, T2, T3. Using the Hybrid DDM method, the DDM coordinator establishes a multicast group associated with cell 22, since T1 and T2 both intersect with cell 22, and a multicast group associated with cell 23, since T2 and T3 both intersect that cell. The next step for the DDM coordinator is to determine whether a true intersection exists between the publishing and subscribing federates in those cells, or whether the federates overlap the cell without overlapping each other’s regions, as is the case with T2 and T3 in cell 23. Using the Region-Based approach of matching, the DDM coordinator recognizes that T2 and T3 do not overlap and removes the subscribing federates from that multicast group list.

Fig. 4. Hybrid DDM with tank dog fight federation.
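The two-stage filtering in this example can be sketched as follows. This is an illustration under stated assumptions, not the coordinator of [19]: regions are axis-aligned rectangles `(x_min, y_min, x_max, y_max)`, cells are `(column, row)` pairs, and the coordinates for T1, T2 and T3 are invented so that T1 and T2 truly overlap while T2 and T3 merely share a cell. The helper names are hypothetical.

```python
import math
from collections import defaultdict


def cells_covered(region, cell_size):
    """Return the set of (col, row) cells that a rectangular region overlaps."""
    x_min, y_min, x_max, y_max = region
    cols = range(math.floor(x_min / cell_size), math.ceil(x_max / cell_size))
    rows = range(math.floor(y_min / cell_size), math.ceil(y_max / cell_size))
    return {(c, r) for c in cols for r in rows}


def overlaps(a, b):
    """Exact rectangle intersection test (the Region-Based 'matching' step)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def hybrid_match(publications, subscriptions, cell_size):
    """Stage 1: map regions onto the grid. Stage 2: match only within shared cells.

    Returns {cell: set of (publisher, subscriber)} for true intersections only.
    """
    pubs = defaultdict(set)
    subs = defaultdict(set)
    for fed, region in publications.items():
        for cell in cells_covered(region, cell_size):
            pubs[cell].add(fed)
    for fed, region in subscriptions.items():
        for cell in cells_covered(region, cell_size):
            subs[cell].add(fed)
    matches = {}
    for cell in pubs.keys() & subs.keys():
        pairs = {
            (p, s)
            for p in pubs[cell]
            for s in subs[cell]
            if p != s and overlaps(publications[p], subscriptions[s])
        }
        if pairs:  # drop cells where federates share the cell but not a region
            matches[cell] = pairs
    return matches


# Hypothetical tank regions; every tank both publishes and subscribes.
tanks = {"T1": (3, 3, 8, 8), "T2": (6, 4, 13, 9), "T3": (15, 1, 19, 3)}
result = hybrid_match(tanks, tanks, cell_size=10)
print(result)
```

With these assumed coordinates, T2 and T3 both fall in cell `(1, 0)`, so the grid stage flags that cell as a candidate, but the exact test rejects the pair and only the T1/T2 intersection in cell `(0, 0)` survives. The exact test is never run for T1 against T3, since the two share no cell.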

Additional research has been done to investigate the most appropriate cell size for the Hybrid approach, as well as the grid-based approaches [3,19]. Findings show that a larger cell size will produce large multicast groups, leading to a larger number of messages being sent to federates that do not require the information carried in those messages, but a smaller cell size requires more frequent updating of multicast group membership lists.
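The trade-off can be made concrete with a small calculation for a single subscription region. The sketch below uses two crude proxies that are assumptions of this illustration, not measures from [3,19]: the number of cells the region overlaps (a stand-in for group membership bookkeeping) and the area covered by those cells but lying outside the region (a stand-in for irrelevant data received). The function name and the sample region are hypothetical.

```python
import math


def cell_tradeoff(region, cell_size):
    """Return (cells_joined, excess_area) for one rectangular region.

    cells_joined: number of grid cells, and hence groups, the region touches.
    excess_area:  area of those cells that lies outside the region itself.
    """
    x_min, y_min, x_max, y_max = region
    n_cols = math.ceil(x_max / cell_size) - math.floor(x_min / cell_size)
    n_rows = math.ceil(y_max / cell_size) - math.floor(y_min / cell_size)
    cells_joined = n_cols * n_rows
    covered_area = cells_joined * cell_size ** 2
    region_area = (x_max - x_min) * (y_max - y_min)
    return cells_joined, covered_area - region_area


region = (12, 12, 28, 28)  # hypothetical 16 x 16 subscription region
for size in (2, 8, 32):
    print(size, cell_tradeoff(region, size))
```

For this assumed region, the smallest cells give many groups to maintain and no excess area, while the largest cell gives a single group whose area is mostly outside the region.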

5.5. Agent-Based DDM

Agent-Based DDM precisely filters data, in a large-scale distributed simulation, by using intelligent mobile agents in lieu of a centralized DDM coordinator. Upon the declaration of a subscription, an agent, or sub-child, is launched by the subscribing federate to seek out the information needed from the publishing federate. When a publisher updates its interactions or object attributes, these agents fetch and filter the data and generate only those messages with data needed by their subscribing parent. Any time that a subscription region is modified, the subscriber notifies its agent, in order to keep the agents’ filtering data current [18].
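The sender-side filtering role of an agent can be sketched in a few lines. This is a single-process illustration only: the class and method names are hypothetical, agent mobility and the agent environment are not modelled, and updates are assumed to be dictionaries carrying `x` and `y` positions.

```python
class SubscriptionAgent:
    """Hypothetical stand-in for a sub-child residing at the publisher."""

    def __init__(self, subscriber, region):
        self.subscriber = subscriber
        self.region = region  # (x_min, y_min, x_max, y_max)

    def update_region(self, region):
        """Called when the subscriber modifies its subscription region."""
        self.region = region

    def filter(self, updates):
        """Keep only the updates that fall inside the subscription region."""
        x_min, y_min, x_max, y_max = self.region
        return [
            u for u in updates
            if x_min <= u["x"] < x_max and y_min <= u["y"] < y_max
        ]


class Publisher:
    """Hypothetical publishing federate that hosts visiting agents."""

    def __init__(self):
        self.agents = []

    def accept(self, agent):
        self.agents.append(agent)

    def publish(self, updates):
        """Let each agent generate only the messages its parent needs."""
        return {a.subscriber: a.filter(updates) for a in self.agents}


publisher = Publisher()
agent = SubscriptionAgent("Fed 2", (0, 0, 10, 10))
publisher.accept(agent)

updates = [{"id": "tank1", "x": 5, "y": 5}, {"id": "tank2", "x": 50, "y": 50}]
before = publisher.publish(updates)

agent.update_region((40, 40, 60, 60))  # the subscriber moves its region
after = publisher.publish(updates)
print(before, after)
```

The point of the design is that filtering happens before any message leaves the publisher's host, so the subscriber never receives the update it would otherwise have to discard.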

Fig. 5 shows the improved structure of the agent-based RTI (ARTI) as introduced in [18]. An agent-based DDM mechanism, implemented in the ARTI, employs an agent environment called D’Agents and socket programming to reduce delays in communication between agents. The improved structure allows sub-children to communicate with federates through sockets for data updates, and federates can modify their subscription regions through their sub-children, greatly reducing time delays related to these communications. The SubMaster agent manages launched sub-children, while the agent interface provides federates with agent-related services.


Table 2
Comparison of DDM methods

| DDM method | Filtering technique | Attributes | Basis |
| --- | --- | --- | --- |
| Class-Based | None | High message overhead and unnecessary processing on receiver side | Class |
| Region-Based | Matching | High CPU processing but lower message overhead | Value |
| Region-Based | Centralized DDM coordinator with matching | Limited scalability and potential bottleneck | Value |
| Fixed Grid-Based | Grid overlay | High message overhead but simple implementation | Value |
| Hybrid | Grid overlay and central DDM coordinator with matching | Lower CPU processing and lower overhead but potential bottleneck | Value |
| Dynamic Grid-Based | Grid overlay | Lower message overhead and fewer multicast groups | Value |
| Agent-Based | Intelligent agents and matching | Difficult implementation but low message overhead | Value |

Fig. 5. Agent-Based RTI structure.

While this method offers a promising improvement over the Region-Based method with a centralized coordinator, provides exact filtering of data, and reduces network communication costs, the implementation is complicated and research is ongoing.

5.6. Summary of DDM methods

Recall that Class-Based DDM is a basic method of DDM that utilizes the HLA RTI Declaration Management service to declare subscriptions to object class attributes, but the subscribing federate must filter unwanted data about object instances that are out of the subscriber’s region of interest. The use of a Region-Based DDM scheme, using either matching or a centralized DDM coordinator, would eliminate much of this unwanted data, but results in high overhead due to the expensive operation of matching, or a potential bottleneck if a DDM coordinator is used. The Fixed Grid-Based method eliminates the need for matching or a DDM coordinator by overlaying a grid on the simulated terrain and sending update messages, via multicast groups, based on subscription regions overlapping the grid, but results in a potential for messages to be sent on a multicast group with no subscribers. The Dynamic Grid-Based DDM strategy eliminates those multicast groups that do not contain at least one publisher and one subscriber, but, like all grid-based techniques, the potential for spurious matches still exists when a subscription region is not bounded exactly by the grid cell boundaries. The Hybrid DDM method uses a grid to help reduce the matching required by a central DDM coordinator, but, by using a centralized DDM coordinator, the potential for a bottleneck still exists. Finally, the Agent-Based DDM method eliminates the need for a centralized DDM coordinator by using intelligent mobile agents to gather subscription data and filter it before sending the data to the subscriber, but it is difficult to implement because of the additional interface required between the agents and the RTI.

Table 2 summarizes the DDM methods that we have discussed and compares their desirable and undesirable attributes.

All six DDM methods discussed in the previous subsections have been implemented and tested in different RTIs, with different simulation scenarios. The section that follows describes some existing RTIs and the DDM implementations on each, and concludes with a comparison of the RTIs.

6. RTI implementations

The advent of HLA resulted in the implementation of a variety of RTIs. Some of these RTIs are HLA-compliant and some are not, but all the RTIs discussed in this paper have one thing in common: they all implement some form of DDM service. In the following subsections, we detail some of the RTIs that are currently in use and summarize by comparing attributes of these RTIs and their implemented DDM strategies.

1. DMSO RTI-NG 1.3: The Defense Modeling and Simulation Office (DMSO) Run-Time Infrastructure (RTI) Next Generation (NG) 1.3 is a benchmark or reference implementation that was sponsored and developed by DMSO [11]. As of June 29, 2001, v4 of the DMSO RTI-NG is in general release and is available free of charge [10]. This RTI was developed on a contract awarded to Science Applications International Corporation (SAIC) and was the first DMSO-sponsored RTI resulting from competition between government contracting corporations [4]. SAIC currently offers support for v4–v6 of the RTI-NG 1.3, effective from October 1, 2002 [17], which is when DMSO decided to transition HLA and RTI support to the corporate sector. The design of this RTI is HLA-compliant and implements the Region-Based method of DDM.

2. MAK RTI: Effective December 5, 2002, MAK Technologies announced that DMSO certified the MAK RTI as fully HLA-compliant [12]. Prior to that date, the MAK RTI did not implement all HLA RTI services and, therefore, was not considered fully HLA-compliant. MAK Technologies is a leading supplier of distributed simulation software and is based in Cambridge, Massachusetts, USA [12,4]. The MAK RTI is also available free of charge to federation developers and implements a Region-Based DDM approach.

Table 3
Summary of RTIs

| RTI | Developer | Category | DDM method |
| --- | --- | --- | --- |
| RTI-NG 1.3 | DMSO/SAIC | Defense | Region-Based |
| MAK RTI | MAK Technologies | Defense/Commercial Entertainment | Region-Based |
| pRTI | Pitch Corporation | Defense/Commercial | Region-Based |
| GMU LW RTI | George Mason University | R&D | Fixed Grid-Based |
| FDK-DRTI | Georgia Institute of Technology | R&D | Region-Based |
| UNT-RTI | University of North Texas | R&D | Hybrid and Dynamic Grid-Based |
| ARTI | National University of Singapore | R&D | Agent-Based |

3. Pitch portable RTI (pRTI): Pitch is a Swedish corporation specializing in distributed simulation software and support and, with their pRTI 1516 product, touts the first certified IEEE 1516 HLA-compliant RTI. This certification was finalized in March 2003. Pitch also offers a pRTI 1.3 product that is certified HLA-compliant. Pitch touts their product as a best performer regarding latency and claims to use sender-side filtering to minimize messaging. However, at the time of this writing, the actual DDM implementation is not known to the authors [15].

4. Academic RTIs: The previously discussed RTIs are all HLA-compliant, commercial or military RTIs, but there are several RTIs in use in the academic sector that warrant discussion. The following RTIs were developed at universities to further the research and improvement of specific RTI services, but are not fully HLA-compliant. These RTIs were developed with only some of the RTI services implemented, in order to focus on the performance of those services in experiments that seek to discover or improve upon limitations of the selected service.

• GMU Light-Weight RTI: This RTI was developed at George Mason University and focuses on the Declaration Management and Data Distribution Management services [16]. The Fixed Grid-Based DDM approach is implemented in this RTI, which is best suited to small- or medium-sized federations.

• RTI-Kit (FDK-DRTI): The RTI-Kit was developed at Georgia Institute of Technology by Professor Richard Fujimoto [11] and contains a set of libraries designed to assist in the development of RTIs [4]. The RTI-Kit focuses on the Time Management and DDM services and implements the Region-Based DDM strategy.

• UNT-RTI: This RTI was based on Georgia Tech’s RTI-Kit and was developed at the University of North Texas [3,1,4]. The focus of this RTI is DDM and it implements the Fixed Grid-Based, Dynamic Grid-Based and Region-Based DDM approaches.

• Agent-Based RTI (ARTI): The ARTI was developed at the National University of Singapore’s School of Computing [18,20]. This implementation employs D’Agents as the agent environment and uses the Agent-Based DDM filtering mechanism, described in Section 5.5, to administer the DDM services.

Table 3 summarizes the various RTIs discussed in this section, their developers, category and implemented DDM method.

7. Conclusion

The selection of an appropriate data distribution management scheme can be a critical factor in accomplishing efficient data exchange within large-scale distributed interactive simulations. Of the various RTI designs, we have mentioned both commercial and academic RTIs, where each group has implemented a different variation of DDM techniques. We hope that our review of the six DDM schemes above (Class-Based, Region-Based, Fixed Grid-Based, Dynamic Grid-Based, Hybrid and Agent-Based) helps to clarify the existing DDM schemes and promotes continued research to improve these strategies.

Our plans for future research include a further experimental simulation study of the effects of aggregation and disaggregation of federates in relation to various DDM approaches and Agent-Based DDM schemes.

References

[1] A. Boukerche, C. Dzermajko, Performance comparisons of data distribution management strategies, in: Proc. 5th IEEE Int’l Workshop on Distributed Simulation and Real-Time Applications, 2001, pp. 67–75.

[2] A. Boukerche, C. Dzermajko, Alternative approaches to data distribution management in large scale distributed systems, in: Int’l Symposium on Performance Evaluation of Computer and Telecommunication Systems, 2003, Canada.

[3] A. Boukerche, A.J. Roy, In search of DDM in large-scale distributed simulations, in: Proc. Summer Computer Simulation Conference, Canada, 2000, pp. 12–19.

[4] A. Boukerche, A.J. Roy, Dynamic grid-based multicast group assignment in data distribution management, in: Proc. 4th IEEE Int’l Workshop on Distributed Simulation and Real-Time Applications Workshop, 2000, pp. 27–34.

[5] J.O. Calvin, C.J. Chiang, S.M. McGarry, S.J. Rak, D.J. Van Hook, M.R. Salisbury, Design, implementation, and performance of the STOW RTI prototype (RTI-s), in: Proc. of Spring SIW Workshop, 1997, 97S-SIW-019.

[6] J.S. Dahmann, K.L. Morse, High level architecture for simulation: An update, in: Proceedings Third IEEE International Workshop on Distributed Simulation and Real-Time Applications, 1998, pp. 32–40.

[7] R.T. Fujimoto, Parallel and Distributed Simulation Systems, John Wiley and Sons, New York, 2000.

[8] R.T. Fujimoto, T. Mclean, K. Perumalla, I. Tacic, Design of high performance RTI software, in: Proc. Fourth IEEE International Workshop on Distributed Simulation and Real-Time Applications, 2000, pp. 41–49.


[9] R.T. Fujimoto, I. Tacic, Synchronized data distribution management in distributed simulations, in: Proc. Twelfth Workshop on Parallel and Distributed Simulation, 1998.

[10] P. Grammer, [SDC]RTI-NG-NG 1.3v4 Release, http://lists.dmso.mil/pipermail/sdc-announce/2001-July/000000.html.

[11] P. Knight, A. Corder, R. Liedel, J. Giddens, R. Drake, C. Jenkins, P. Agarwal, Evaluation of Run-Time Infrastructure (RTI) Implementations, http://www.scs.org/confernc/hsc/hsc02/hsc/papers/hsc017.pdf.

[12] http://www.mak.com/pr rtiverified.htm.

[13] K.L. Morse, M. Petty, Data distribution management migration from DoD 1.3 to IEEE 1516, in: Proc. 5th IEEE Int’l Workshop on Distributed Simulation and Real-Time Applications, 2001, pp. 58–65.

[14] K.L. Morse, J.S. Steinman, Data distribution management in the HLA: Multidimensional regions and physically correct filtering, in: Proc. Spring SIW Workshop, 1997, 97S-SIW-052.

[15] http://www.pitch.se/prti.

[16] M.J. Pullen, V.P. Laviano, M. Moreau, Creating a light-weight RTI using selectively reliable transmission as an evolution of dual-mode multicast, in: Proc. Fall SIW Workshop, 1997, 97F-SIW-149.

[17] http://helpdesk.dctd.saic.com.

[18] G. Tan, L. Xu, F. Moradi, Y. Zhang, An agent-based DDM filtering mechanism, in: Proc. 8th IEEE/ACM Int’l Workshop on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, 2000, pp. 374–381.

[19] G. Tan, Y. Zhang, R. Ayani, A hybrid approach to data distribution management, in: Proc. 4th IEEE Int’l Workshop Distributed Simulation and Real-Time Applications, 2000, pp. 33–40.

[20] G. Tan, W.N. Ng, F. Moradi, Aggregation/disaggregation in HLA multi-resolution distributed simulation, in: Proc. 5th IEEE Int’l Workshop Distributed Simulation and Real-Time Applications, 2001, pp. 76–83.

[21] US Department of Defense, Defense Modeling and Simulation Office, High Level Architecture Interface Specification, April 2, 1998, http://www.dmso.mil/public/resources/documents, link to: Project Areas, High Level Architecture, Technical Specifications.

Azzedine Boukerche is a Full Professor and holds a Canada Research Chair position at the University of Ottawa. He is the Founding Director of the PARADISE Research Laboratory at the University of Ottawa. Prior to this, he held a Faculty position at the University of North Texas, USA, and was working as a Senior Scientist at the Simulation Sciences Division, Metron Corporation, located in San Diego. He was also employed as Faculty at the School of Computer Science, McGill University, and taught at the Polytechnic of Montreal. He spent a year at the JPL/NASA-California Institute of Technology, where he contributed to a project centered on the specification and verification of the software used to control interplanetary spacecraft operated by JPL/NASA Laboratory.

His current research interests include distributed computing, large-scale distributed interactive simulation, wireless networks, mobile and pervasive computing, wireless multimedia, Quality of Service (QoS) provisioning, wireless ad hoc and sensor networks, performance evaluation and modeling of large-scale distributed systems, and parallel discrete event simulation. Dr. Boukerche has published several research papers in these areas. He was the recipient of the Best Research Paper Award at IEEE/ACM PADS’97, and the recipient of the 3rd National Award for Telecommunication Software 1999 for his work on a distributed security system for mobile phone operations, and has been nominated for the best paper award at the IEEE/ACM PADS’99, ACM MSWiM 2001, and ACM MobiWac 2004.

Dr. A. Boukerche is a holder of an Early Career Research Excellence Award (previously known as Premier of Ontario Research Excellence Award), Ontario Distinguished Researcher Award, and Glinski Research Excellence Award. He is a Co-Founder of QShine International Conference, on the Quality of Service for Wireless/Wired Heterogeneous Networks (QShine 2004), has served as a General Chair for several conferences, such as ACM/IEEE MASCOTS 1998, IEEE DS-RT 1999-2000, ACM MSWiM 2000, as Program Chair for ACM/IFIPS Europar 2002, IEEE/SCS Annual Simulation Symposium ANNS 2002, ACM WWW’02, IEEE/ACM MASCOTS 2002, IEEE Wireless Local Networks WLN 03-04, IEEE WMAN 04-05, and ACM MSWiM 98-99, and as a TPC member of numerous IEEE and ACM conferences. He served as a Guest Editor for the Journal of Parallel and Distributed Computing (JPDC) (Special Issue for Routing for mobile ad hoc; Special Issue for wireless communication and mobile computing; Special Issue for mobile ad hoc networking and computing), and ACM/Kluwer Wireless Networks and ACM/Kluwer Mobile Networks Applications, and the Journal of Wireless Communication and Mobile Computing.

Dr. A. Boukerche serves as a General Chair for the 8th ACM/IEEE Symposium on Modeling, Analysis and Simulation of Wireless and Mobile Systems, and the 9th ACM/IEEE Symposium on Distributed Simulation and Real-Time Application. Dr. A. Boukerche serves as an Associate Editor for ACM/Kluwer Wireless Networks, Wiley International Journal of Wireless Communication and Mobile Computing, the Journal of Parallel and Distributed Computing, and the SCS Transactions on Simulation. He also serves as a Steering Committee Chair for the ACM Modeling, Analysis and Simulation for Wireless and Mobile Systems Symposium, the ACM Workshop on Performance Evaluation of Wireless Ad Hoc, Sensor, and Ubiquitous Networks, and the IEEE Distributed Simulation and Real-Time Applications Symposium (DS-RT). He is a member of the ACM and IEEE.

Caron Dzermajko received her MSc degree from the University of North Texas. Her research interests are large-scale distributed simulation systems and data distribution management.

Kaiyuan Lu is a PhD candidate at the University of Ottawa. She received her MSc degree from the same university. Her research interests are large-scale distributed interactive simulation systems, real-time systems, and grid computing.