
SAMAS: Scalable Architecture for Multiscale Agent-based Simulation
misrc.umn.edu/workshops/2006/spring/chaturvedi.pdf


DRAFT Please do not cite without lead author’s permission


Abstract— Large-scale agent-based simulation has been shown to be an effective tool for understanding a variety of complex systems, such as market economies, war games, and epidemic propagation models. As systems of interest grow in complexity, it is often desirable to support different categories of artificial agents that execute tasks on widely varying time scales. The scalability of a simulation environment becomes a crucial measure of its ability to handle the complexity of the underlying system. In this paper, we present a design for a highly scalable architecture for multi-resolution agent-based simulation. SAMAS is a dynamic data-driven application system (DDDAS) that allows large numbers of heterogeneous agents to operate across a wide range of time scales without unnecessarily compromising simulation runtime. This is accomplished through the use of gossiping, an efficient broadcasting communication model for maintaining the overall consistency of the simulation environment. We demonstrate the effectiveness of this communication model using experimental results obtained from simulation.

Index Terms— Cooperative Systems, Agent-based Simulation, Temporal Resolution

1 INTRODUCTION

Large-scale agent-based simulation has become an effective modeling platform for understanding a variety of highly complex systems and phenomena such as market economies, war games, and epidemic propagation [1]. What is becoming increasingly clear from the deployment of such environments is that they are powerful media for integrating models of widely differing aggregation and granularity. This multi-resolution property of agent-based simulations has already been demonstrated in the spatial dimensions and in the emergence of multi-agent systems, which support a diversity of agents playing different roles and exhibiting different behaviors.

What has been slower in coming is support for temporal multiscale models. Specifically, as systems and their associated models grow in complexity, it becomes increasingly desirable, and often necessary, to have different categories of artificial agents that execute tasks on varying time scales. Consider the eCommerce example shown in Figure 1, in which the agents can be grouped into three layers according to their time resolution requirements. At the market layer, agents represent various components of the IT infrastructure, such as web servers and databases that store merchandise inventory information. These agents must operate at millisecond time resolution in order to process orders posted by millions of customers around the globe. At the supplier layer, orders are aggregated and sent to different supplier agents, which need to operate at hourly time resolution. The supplier agents respond to changes in the market

This research is partially funded by the National Science Foundation and the Indiana 21st Century Research and Technology Fund.

SAMAS: Scalable Architecture for Multiscale Agent-based Simulation

Alok Chaturvedi, Mike Cheng, Daniel Dolk, and Jie Chi


Figure 1. Multi-resolution nature of agents in an Extended Enterprise

[Figure 1 labels: Organization; Supply Chain; MIS; Finance; HR; Ops; Marketing; Market; Server; Database; CPU; Peripherals; HD; Main Board; Memory; Notifying and Sampling links between layers.]

layer, such as a change in demand for a certain product. At the strategy layer, managers plan business policies and make policy changes in response to changes in the market and supplier layers. The effective simulation of this type of scenario presents great challenges to conventional agent-based systems in terms of temporal scalability. These systems typically dictate a single time resolution based on the requirement of the agents running at the highest resolution (finest granularity, as with the market layer agents in the above example). As a result, agents at the lower time resolution are either idle most of the time, causing considerable waste of resources, or have to be explicitly tracked and activated at different time intervals by the simulation environment, causing excessive overhead for the system. While these strategies are simple to implement and work well for systems of largely uniform agents, they do not scale well to a large number of diverse agents operating on a wide range of time resolutions.

In this paper, we propose the design of SAMAS, a dynamic data-driven application system (DDDAS) for multi-resolution simulation that uses the gossiping communication model to allow agents to maintain consistency across different time resolutions. Consistency between layers is maintained through a “sample and synchronize” process. In this process, an agent running asynchronously at a higher level can dynamically sample the lower level, which may be another agent (human or artificial), system, component, instrument, and/or sensor, and adjust its own parameters accordingly.

We approach this problem by first discussing in general terms the power of multi-resolution, agent-based simulation to integrate organizational processes and models at many levels of scope and time (Section 2). We then focus our attention on the problem of temporal multi-resolution and discuss a design of SAMAS (Section 3), which relies upon a gossiping algorithm to coordinate agents operating in different time domains (Section 4). Using simulation, we provide results showing the effectiveness of the gossiping algorithm as a communication model for maintaining global consistency of agents in a simulation environment (Section 5). We summarize our results within the larger context of multi-resolution systems, and briefly present a plan for continuing this avenue of research.


Figure 2. Integration of Multi-resolution Organizational Processes and Models

2 MULTI-RESOLUTION, AGENT-BASED SIMULATION IN SUPPORT OF ORGANIZATIONAL MODEL INTEGRATION

Agent-based simulation with robust multi-resolution capabilities offers the potential for coalescing the entire portfolio of processes and models upon which an organization relies for policy and decision-making. Figure 2 shows a conceptual notion of an integrated simulation environment, which takes a layered approach to organizational models. In this framework, models exist at different levels of organization with the Organizational Model being the highest level in terms of generality. The Organizational layer drives the Business Process layer which drives the Workflow layer which, in turn, depends, as all the models do, upon the underlying Infrastructure layer. Each of these layers can be modeled and then solved for different scenarios using the SEAS [2, 3] agent-based simulation engine. The resulting goal is to tie these layers together in an overarching virtual environment, which captures the overall organizational behavior to an acceptable degree of verisimilitude.

One of the critical problems in effecting the kind of integration envisioned in Figure 2 is how to handle the time dimension in simulations. The timeframes for organizational processes vary widely in scope. At the strategic level, the focus may be on the next few months or years, at the tactical level days or weeks, and at the operational level, minutes or hours. This disparity in time resolution has a large impact on an organization’s modeling processes, which tend to focus model development on a particular domain within a specific timeframe. Although this may be locally efficient, it is often globally suboptimal. The fragmentation of models by domain and timeframe interferes with an organization’s ability to integrate and reuse these models, which then can result in the inability to plan and operate to the fullest extent possible. We address the problem of time resolution below by designing an architecture in which agents existing at different time levels in an integrated multi-level model can communicate effectively with one another without consuming inappropriate amounts of overhead.

3 SAMAS ARCHITECTURE

3.1 Naïve Design for Multi-Resolution Agent-based Simulation

In most of the complex systems that we study, there are thousands, if not millions, of input elements that may affect the outcome of the system in any number of different ways. Agent-based simulation attempts to capture this complexity by using a large number of artificial agents, each of which plays the role of one or more of the elements in the real system. While existing agent-based simulation systems, given a sufficient number of agents, can effectively mimic the behavior of real world systems, they generally do not effectively take into consideration the temporal relationship among the different agents. As a result, the entire simulation environment uses one global clock time interval for the execution of all agents in the system. This design, however, does not scale well to a large number of heterogeneous agents and may lead to gross inefficiencies in simulation execution.

[Figure 2 labels: Scenario Builder; Scenario 1; Scenario 2; Scenario 3; Organization Model; Business Process Model; Workflow Model; Infrastructure Model; SEAS Agent-Based Engine; Organization Layer; Supply Chain Layer; Market Layer.]

Figure 3. Multi-resolution simulation based on naive design

Figure 4. SAMAS Simulation Design

In many sufficiently complex systems, there is often a hierarchy of different types of agents that need to operate on different time scales. The agents that operate at a lower time resolution (slower) often have to rely on agents that operate at a higher time resolution (faster) for information. To implement such a scenario using the naïve design, we have to assign a central location, e.g. an agent, at the lower level (higher time resolution) to gather all the information needed by agents at the upper level (lower time resolution). The agents at the upper level periodically gather information from the designated party. This scenario is shown in Figure 3. There are two major drawbacks associated with this design:

1. Since there is only one global clock, most of the time the agents on the upper level are doing nothing but checking for information in the lower level. These activities waste the resources in the simulation environment.

2. The agent that is responsible for gathering information at the lower level becomes the single point of contention as the number of agents increases.

3.2 SAMAS Design

The design of SAMAS is motivated by the need to build an agent-based simulation system that is capable of scaling up to millions of artificial agents running at multiple time resolutions. The SAMAS simulation environment no longer requires a single uniform global clock. Instead, it adopts a multi-resolution time interval concept that enables agents to operate at any time resolution. As shown in Figure 4, agents are grouped into Time Resolution Layers (TRLs), which are defined by their respective time resolutions. The agents on the same TRL require the same time resolution and therefore operate under a common clock. In addition to the interactions among agents on the same TRL, SAMAS allows interaction among agents on different TRLs by creating mappings among agents. An agent can be mapped to one or more agents on adjacent TRLs. The mapped agents, however, do not communicate directly with each other. Instead, an agent on each TRL is designated as the Coordinator Agent (CA), and all communication across TRLs goes through the CAs. Intuitively, a CA can be understood as the process responsible for Remote Procedure Calls on a single processor in a multi-processor parallel computing architecture. Correspondingly, a TRL can be understood as a processor in the parallel computing paradigm. The asynchronous design of TRLs and the use of CAs enable the simulation of agents on different TRLs to be carried out in parallel by different processors, thereby improving scalability.

The asynchronous nature of agents across different TRLs presents a formidable challenge in maintaining consistency in agent behaviors. Agents on different TRLs operate asynchronously based on certain prior knowledge of the state information of the agents in other TRLs.


When one agent detects a change in its state, the information about this event must be propagated across TRLs to the agents that are mapped to the originator of the event. These mapped agents may in turn signal changes to their own states, and the chain of events may continue. SAMAS addresses this issue through the use of the Gossip Communication Model, Periodic Sampling, and Event Notification.

3.2.1 Gossiping Model for Intra-TRL Communication

Intra-TRL communication is achieved by using the Gossip Communication Model for agents to maintain global state information within a given TRL. Figure 5 shows the initialization procedure based on gossiping used by all agents. Let $A_i$ denote the set of information, namely events, that agent i currently possesses, and let $R_i$ and $S_i$ denote a set of receiving agents and sending agents, respectively. In the event of a state change, the originating agent of the event randomly sends the information about the change to a small number of its neighbors, which, in turn, proceed in the same manner. The size of the set of agents receiving information is a function of the network size and bandwidth. In the experimental results section, we show that the Gossip Communication Model can achieve a “near optimal” level of communication latency while imposing a very small burden on the network. The procedure terminates when there is no new information to be sent for a given number of periods. When an agent receives the same information from more than one agent, the redundant messages are simply dropped. Our experimental results show that, by using Gossiping, state information can be quickly propagated among agents within a given TRL while imposing a minimal bandwidth requirement.

3.2.2 Periodic Sampling for Inter-TRL Communication

Inter-TRL communication is conducted through the CAs. The CA on a given TRL periodically requests state information from the CAs on adjacent TRLs. The frequency of this sampling can be set beforehand according to the nature of the simulation. When a CA discovers a change from other CAs, it propagates the change to the agents on its TRL using Gossiping. The success of periodic sampling in maintaining consistency among agents across different TRLs relies on the assumption that, as a result of the Gossiping Communication Model, CAs will have a close approximation of the most current state information of their respective TRLs.
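The sampling behavior described above can be sketched as follows. This is only an illustration under assumptions that are not taken from the paper: the class name `CoordinatorAgent`, the `version` counter used to detect changes, the `period` parameter, and the `pending` queue (standing in for the hand-off to intra-TRL gossiping) are all hypothetical.

```python
class CoordinatorAgent:
    """Hypothetical CA holding a versioned snapshot of its TRL's state."""

    def __init__(self, name, period):
        self.name = name
        self.period = period   # sample adjacent CAs every `period` ticks (assumed)
        self.version = 0       # incremented whenever the local state changes
        self.state = {}
        self.neighbors = []    # CAs on adjacent TRLs, fixed at initialization
        self.seen = {}         # last version observed for each neighbor
        self.pending = []      # changes waiting to be gossiped on this TRL

    def update(self, key, value):
        """Record a local state change."""
        self.state[key] = value
        self.version += 1

    def tick(self, t):
        """On sampling ticks, poll adjacent CAs and queue anything new."""
        if t % self.period != 0:
            return
        for ca in self.neighbors:
            if self.seen.get(ca.name, 0) != ca.version:
                self.seen[ca.name] = ca.version
                self.pending.append((ca.name, dict(ca.state)))


# Usage: a slow supplier-layer CA samples a fast market-layer CA every 5 ticks.
market = CoordinatorAgent("market", period=1)
supplier = CoordinatorAgent("supplier", period=5)
supplier.neighbors = [market]
market.update("demand", 42)
for t in range(1, 11):
    supplier.tick(t)
```

In this sketch the supplier CA picks up the change once, at its first sampling tick after the update, and ignores the unchanged state at the next one.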

3.2.3 Event Notification for High Priority Inter-TRL Communication

In addition to periodic sampling, SAMAS offers the selective capability for CAs to directly notify other CAs on adjacent TRLs in the event of a state change. When properly calibrated, the event notification mechanism can work as a complement to periodic sampling to enable faster propagation of high priority information. On the other hand, event notification is an aggressive communication protocol which, if overused, can impose a high burden on system bandwidth. In this paper, we do not devote attention to this communication mechanism; instead, we focus our experiments on the less aggressive sampling method.

Figure 5. Outline of the Gossiping Algorithm used in simulation

1. Initialize, for every agent $i$, the set $A_i \leftarrow \{i\}, \ \forall\, 0 < i < n$
2. Loop until $A_i = \{0, 1, \ldots, n\}, \ \forall\, 0 < i < n$
3. Agent $i$ randomly chooses a set of agents $S$, $i \notin S$, $|S| = N_S < n$, and sends $A_i$ to each of them
4. Agent $i$ randomly receives a set $R$, $|R| \le N_R < n$, of messages addressed to it, such that $A_i = A_i \cup A_j, \ \forall\, j \in R$
5. End loop
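The outline in Figure 5 can be turned into a small runnable sketch. The version below is an illustration, not the authors' implementation, and it makes several assumptions: rounds are synchronous, agents are indexed 0 to n-1, recipients are drawn uniformly at random from all other agents (a fully connected network), and when more than `n_recv` messages arrive in a round the surplus is dropped uniformly at random. The function name `gossip_rounds` and its parameters are hypothetical.

```python
import random


def gossip_rounds(n, n_send, n_recv, seed=0, max_rounds=10_000):
    """Count rounds until every agent holds every agent's information."""
    if not 0 < n_send < n:
        raise ValueError("n_send must satisfy 0 < n_send < n")
    rng = random.Random(seed)
    known = [{i} for i in range(n)]        # step 1: A_i <- {i}
    full = set(range(n))
    rounds = 0
    while any(k != full for k in known):   # step 2: loop until all A_i are complete
        if rounds >= max_rounds:
            raise RuntimeError("gossip did not converge")
        inbox = [[] for _ in range(n)]
        for i in range(n):                 # step 3: send A_i to n_send random agents
            others = [j for j in range(n) if j != i]
            for j in rng.sample(others, n_send):
                inbox[j].append(frozenset(known[i]))
        for i in range(n):                 # step 4: accept at most n_recv messages
            msgs = inbox[i]
            if len(msgs) > n_recv:
                msgs = rng.sample(msgs, n_recv)
            for m in msgs:
                known[i] |= m              # A_i <- A_i U A_j
        rounds += 1
    return rounds


if __name__ == "__main__":
    # The round count varies with the seed and the bandwidth settings.
    print(gossip_rounds(128, n_send=4, n_recv=4, seed=1))
```

Messages are snapshots taken at send time, so information received in a round is only forwarded from the next round onward.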


4 RELATED WORKS

The problems presented in temporal multiscale agent-based simulation are similar in nature to those studied in traditional multiscale models. Problems that require multiscale solutions occur across a wide range of fields of applied sciences, including material science [14, 15, 16, 17], chemistry [19], fluid dynamics [20] and biology [18]. Both numerical methods [12, 13] and simulation techniques [14-20] have been developed to solve multiscale problems. Much like the study in this paper, the goal of multiscale modeling is to understand and predict the behavior of complex systems that contain elements across a continuum of different scales, both in terms of space and time. In this paper, we adopt the agent-based simulation model for temporal multiscale simulation. We remove the excess requirement on computational resources and alleviate the load balancing problem faced in parallel simulation by adopting a handshake mechanism through which artificial agents on different TRLs, which could reside on different processors, exchange state information. The successful execution of simulation in SAMAS relies upon the use of the Gossiping Communication Model for the maintenance and update of state information.

The Gossiping Communication Model, which is also known as a variant of all-to-all broadcast, involves each node in a network of n nodes sending a unique piece of information to the other n − 1 nodes. The gossiping problem has long been the subject of research in a variety of contexts [4]. Researchers have investigated a number of different metrics for the solution of the problem. For a complete network of n nodes, the lower bound on time complexity (the number of communication rounds) is established to be O(log n) [5] (all logarithms are base 2 in this paper unless otherwise specified). The minimum number of calls needed for the operation has been proved to be 2n − 4 for n > 4 [6]. Czumaj et al. [7] show the trade-off between the time and cost of the Gossip Communication Model, where the cost is measured by either the number of calls or the number of links used in the communication. Various strategies have been proposed to perform gossiping on different network architectures with different lower bounds on the performance metrics [8, 9]. In particular, these works show that gossiping can be carried out with desirable performance properties on a hypercube architecture.
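To make the two quoted bounds concrete, the following sketch evaluates them for the network sizes used later in the experiments. It assumes a constant factor of 1 in the O(log n) round bound (that is, it simply takes the ceiling of log2 n), which is an illustrative choice rather than a statement from the cited works; the function name `gossip_bounds` is hypothetical.

```python
import math


def gossip_bounds(n):
    """Return (rounds, calls) for a complete network of n nodes."""
    if n <= 4:
        raise ValueError("the 2n - 4 call bound is quoted here for n > 4")
    rounds = math.ceil(math.log2(n))   # log n rounds, base-2 logarithm
    calls = 2 * n - 4                  # minimum number of calls [6]
    return rounds, calls


for n in (128, 256, 512):
    print(n, gossip_bounds(n))
```

For n = 128 this gives 7 rounds and 252 calls.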

Krumme and Cybenko [10] further show that gossiping requires at least $\log_{\rho} N$ communication steps, where ρ is the golden ratio, under the weakest communication model (H1), in which the links are half duplex and a node can only communicate with one other node at each communication step. Gossiping algorithms are at the heart of many fundamental parallel operations, such as matrix transposition and the Fast Fourier Transform [8]. In SAMAS, we implement a simple randomized gossiping algorithm, the details of which are discussed in Section 3.2.

Figure 6. Network size=128. Number of communication rounds as a function of both the number of messages sent and received. Both the number of messages sent and received axes are log-scaled.

5 EXPERIMENTAL RESULTS

We conduct experiments on two levels to verify the performance of SAMAS. Since the performance of the Gossip Communication Model is essential to the maintenance of consistency among agents across different TRLs, it is first important to measure the performance of the Gossiping Algorithm on a single-TRL network of agents. We use simulation to test this performance for networks of various sizes, i.e. numbers of agents. Second, we carry out a multi-resolution, agent-based simulation with several TRLs, each of which uses the Gossiping Algorithm as its intra-TRL communication process, and which uses sampling for inter-TRL communication.

5.1 Level 1 Experiment: Intra-Level Gossiping

We define the following parameters used in the Gossip Communication Model simulation.

1. Number of Communication Rounds (NCR). This is the number of steps it takes to finish the all-to-all broadcast and serves as the performance metric for the experiments.

2. Number of Messages to Send (NS). This is the upper bound on the number of messages an agent can send (and by implication the number of agents it can contact) during each communication round. It represents the outbound bandwidth of an agent.

3. Number of Messages to Receive (NR). This is the upper bound on the number of messages an agent can receive at each communication round. It represents the inbound bandwidth of an agent.

4. Network size is the number of agents in a given TRL. The network size is usually different for different TRLs. The TRLs at lower levels often have a larger number of agents, e.g. in the case where agents represent customers in the market layer in Figure 1.

Figure 7. Network size=256. Number of communication rounds as a function of both the number of messages sent and received. Both the number of messages sent and received axes are log-scaled

The Number of Communication Rounds performance metric is measured and compared to the optimal result for three agent network sizes: 128, 256, and 512 agents. The optimum in this case is O(log n) steps, where n is the number of agents in the TRL. We assume overlay networks, which means the networks are fully connected. The simulation is carried out using the procedure shown in Figure 5.

When the number of agents in the simulation is small, some irregularities are observed for the cases where few messages are being sent (the zigzag pattern along the base line in Figure 6). These irregularities are attributable to the random nature of the simulation. Since the Gossiping Algorithm implemented in SAMAS dictates that agents randomly choose the recipients of their messages, the probability of an agent choosing the same recipient over time is higher when the number of agents in the network is small. As the number of agents in the network increases, the run time converges nicely to a continuous surface (Figures 7 and 8).

The number of communication rounds is associated with the run time of the gossiping algorithm. We plot the number of communication rounds as a function of the outbound bandwidth and the inbound bandwidth of agents in the network (Figures 6, 7, 8). In all three networks with different sizes, we observe the following patterns:

• Only relatively small network bandwidth is required to perform gossip optimally, or very close to the optimum of O(log n) steps, where n is the number of agents. In fact, the simulation on average finishes in 7 steps when agents both send and receive 4 messages in each step in the 2^7 = 128 agent case, and similarly in other cases.

• A network of agents that has more balanced inbound and outbound bandwidth results in better performance.

• The performance is more sensitive to inbound bandwidth. This asymmetric property can be demonstrated by observing the large increase in time when the agents are allowed to send a large number of messages but only receive a small number of them, whereas the inverse is not true. This makes sense since limiting the inbound bandwidth means that some messages will not be received and therefore not passed on, so the propagation of a message across the TRL will be delayed. This case is an example of the negative effect of an overly aggressive update effort.

Figure 8. Network size=512. Number of communication rounds as a function of both the number of messages sent and received. Both the number of messages sent and received axes are log-scaled

Our empirical results indicate that the randomized gossiping communication model adopted in SAMAS can operate at, or very close to, the optimal level shown in previous theoretical works [5,11], particularly in cases where the inbound and outbound bandwidths are nearly equal. In the context of the multi-resolution simulation model presented in this paper, these results demonstrate that gossiping is an effective procedure for maintaining global state information across agents operating within the same time interval. They further suggest a criterion for determining when multi-resolution across adjacent TRLs will be effective. Specifically, if the difference in time scales between two adjacent TRLs is greater than O(log(n)), where n is the number of agents in the finer-grained TRL, then gossiping should be sufficient to keep the two layers operating without inter-communication delays. For example, if there are 1000 Type I agents operating on a 10-millisecond TRL and several Type II agents operating on another 1-second TRL, then in the worst case the Type I agents need about 10 steps (100 milliseconds) before the CA on the Type I TRL holds any global information needed by the Type II agents. This translates into a minimum sample rate of 10 times/interval for the Type II agents, which ensures that Type II agents can almost always get the most updated information.
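The arithmetic behind the 1000-agent example can be checked with a short sketch. It assumes that the gossip time is taken as the ceiling of log2 n rounds of the finer-grained TRL; the function name `samples_per_interval` and its parameters are hypothetical.

```python
import math


def samples_per_interval(n_fine, fine_step_ms, coarse_step_ms):
    """Return (gossip rounds, settle time in ms, samples per coarse interval)."""
    rounds = math.ceil(math.log2(n_fine))   # about log2(n) gossip rounds
    settle_ms = rounds * fine_step_ms       # time for the fine TRL to settle
    return rounds, settle_ms, coarse_step_ms // settle_ms


# 1000 Type I agents on a 10 ms TRL, Type II agents on a 1 s TRL.
print(samples_per_interval(1000, 10, 1000))
```

With these inputs the sketch reproduces the figures in the text: 10 rounds, 100 milliseconds, and 10 samples per 1-second interval.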

5.2 Level 2 Experiment: Inter-Level Communication

Next we use SAMAS to execute simulations with multiple TRLs, each hosting a different number of agents. Specifically, we run a total of ten simulations, with an increasing number of TRLs ranging from 1 to 10. In each simulation, the lowest TRL hosts the largest number of agents (2^10 = 1024), and the number of agents decreases exponentially, halving at each level, as we go up in TRLs. For example, when we have 3 TRLs in our simulation, the top-level TRL has 256 agents, the second level has 512 agents, and the third (bottom) level has 1024 agents. This design is compatible with the use of agents in complex simulations, which often are structured using hierarchical decomposition to manage the complexity.
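The layer sizes follow directly from the halving rule, as the sketch below illustrates. The function name `agents_per_trl` is hypothetical; the totals it produces can be compared against the Total Agents column of Table 1.

```python
def agents_per_trl(num_trls, bottom_exp=10):
    """Agent counts from the top TRL to the bottom TRL.

    The bottom TRL holds 2**bottom_exp agents and each TRL above it holds half
    as many as the one below.
    """
    return [2 ** (bottom_exp - k) for k in reversed(range(num_trls))]


for num_trls in (1, 2, 3, 10):
    sizes = agents_per_trl(num_trls)
    print(num_trls, sizes, sum(sizes))
```

For 3 TRLs this yields 256, 512, and 1024 agents, for a total of 1792.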

The topology of the agent network on each TRL is approximately a hypercube network, meaning each agent is connected to approximately log(n) other agents, where n is the number of agents on that TRL. This is a reasonable level of connectivity that maps closely to socio-economic systems in the real world [21,22]. One of the agents on each TRL is designated as the CA, which is the agent responsible for inter-TRL communication. In this experiment, we allow the CAs to sample during every cycle, which is the worst-case scenario in which sampling imposes the most bandwidth overhead on the system. When an event occurs, such as a failure of equipment in the lower TRL, the agent responsible for the event sends the information about the failure to all of its log(n) neighbors. When an agent receives an update, it forwards the update to all of its neighbors if the update is new. Once the CA acquires the information about a state change through Gossiping, it has the choice of directly forwarding the information to its adjacent TRLs in both directions. The CAs are informed of the CAs on adjacent levels during the initialization process. This static design eliminates the mapping of different CAs at run time, which simplifies the design and improves the reliability of the system.
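The forwarding rule described above (send a new update to all neighbors) can be illustrated on an exact hypercube. This is a simplifying assumption: the paper's topology is only approximately a hypercube, so the round counts produced here need not match the measured results. The function names `hypercube_neighbors` and `flood_rounds` are hypothetical.

```python
def hypercube_neighbors(node, dim):
    """Neighbors of `node` in a dim-dimensional hypercube (flip one bit each)."""
    return [node ^ (1 << b) for b in range(dim)]


def flood_rounds(dim, origin=0):
    """Rounds needed for an update from `origin` to reach all 2**dim agents."""
    n = 1 << dim
    informed = {origin}
    frontier = {origin}                 # agents that learned the update last round
    rounds = 0
    while len(informed) < n:
        newly_informed = set()
        for node in frontier:
            for nb in hypercube_neighbors(node, dim):
                if nb not in informed:  # an update is forwarded only if it is new
                    newly_informed.add(nb)
        informed |= newly_informed
        frontier = newly_informed
        rounds += 1
    return rounds


# 1024 agents, each with log2(1024) = 10 neighbors.
print(flood_rounds(10))
```

On an exact hypercube the number of rounds equals the dimension, because the farthest agent differs from the origin in every bit.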

During the experiment, we let the agents at the lowest TRL initiate events, which represent state changes associated with these agents in actual simulations. Note that the time resolution at this lowest TRL is the finest in the whole simulation. These events are propagated within the TRL via Gossiping and upward across TRLs through sampling. We track the following metrics in this experiment:

• Time to Stabilize is the number of communication rounds it takes for all agents on all TRLs to learn about the state changes, i.e. the time from the occurrence of the event to the time when all agents are aware of it.

• Total Simulation Cycles is the total number of simulation cycles elapsed from the time when the event originated to the time when the whole system stabilizes.

• Gossiping Cycles is the total number of simulation cycles that agents spend on propagating new event information via the Gossiping Communication Model.

• Communication Load is defined as the ratio of simulation cycles used for gossiping to all simulation cycles. Conversely, if we think of the cycles used for communication as system overhead, 1 - Communication Load is the fraction of simulation cycles that can be used for useful computation.
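As a small worked example of the last metric, the Python sketch below computes the Communication Load and the remaining useful fraction from a pair of cycle counts. The function names are ours; the sample values (1024 gossiping cycles out of 5120 total cycles) are those of the single-TRL simulation reported in Table 1.

```python
def communication_load(gossiping_cycles, total_cycles):
    """Ratio of gossiping cycles (NG) to all simulation cycles (NS)."""
    return gossiping_cycles / total_cycles


def useful_fraction(gossiping_cycles, total_cycles):
    """Fraction of simulation cycles left for useful computation."""
    return 1.0 - communication_load(gossiping_cycles, total_cycles)


load = communication_load(1024, 5120)
print(f"communication load: {load:.6f}")
print(f"useful fraction:    {useful_fraction(1024, 5120):.6f}")
```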

Total Layers   Total Agents   Time to Stabilize (in NCRs)   Total Simulation Cycles (NS)   Gossiping Cycles (NG)   Communication Load (NG / NS)
1              1024           5                             5120                           1024                    0.200000
2              1536           9                             13824                          2045                    0.147931
3              1792           16                            28672                          2814                    0.098145
4              1920           24                            45952                          3328                    0.072423
5              1984           31                            61248                          3650                    0.059594
6              2016           38                            76288                          3846                    0.050414
7              2032           44                            89008                          3963                    0.044524
8              2040           48                            97472                          4030                    0.041345
9              2044           50                            101728                         4068                    0.039989
10             2046           52                            105908                         4091                    0.038628

Table 1. Summary of SAMAS simulation performance


Table 1 contains a summary of the measurements, with each row representing one simulation. We examine Time to Stabilize and Communication Load in greater detail. Figure 9 shows the desirable property that the time required to broadcast an event throughout all TRLs grows approximately linearly with the number of TRLs in the system. Thus, it takes a relatively short amount of time to fully broadcast an event to all agents within the SAMAS system, largely due to the desirable properties of the Gossip Communication Model. In Figure 9, the number of events has a small effect on Time to Stabilize for 2 and 3 TRLs. The effect does not appear with a single TRL and gradually decreases as the number of TRLs increases. This effect is caused by the randomized behavior of our Gossip Communication Model, in which we assume a CA can receive more than one event at the same time. Section 5.3 discusses this behavior in detail.
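The trend in Time to Stabilize can be checked directly against the values listed in Table 1. The Python sketch below fits an ordinary least-squares line to those ten values and prints the increment added by each extra TRL. The fit is our own quick check rather than part of the SAMAS experiments, and the helper name `least_squares` is ours.

```python
trls = list(range(1, 11))
# Time to Stabilize in NCRs for 1..10 TRLs, copied from Table 1.
time_to_stabilize = [5, 9, 16, 24, 31, 38, 44, 48, 50, 52]


def least_squares(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = least_squares(trls, time_to_stabilize)
# Extra rounds needed each time one more TRL is added.
increments = [b - a for a, b in zip(time_to_stabilize, time_to_stabilize[1:])]
print(f"slope: {slope:.2f} NCRs per TRL, intercept: {intercept:.2f}")
print("increments:", increments)
```

Printing the per-TRL increments next to the fitted slope makes it easy to see over which range of TRLs the growth is closest to linear.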

Figures 10 and 11 show that, in general, the proportion of time an agent spends on gossiping decreases as both the number of TRLs and the number of agents in the system increase. An agent spends a larger proportion of time propagating information about state changes when there are fewer TRLs. This is because, as the number of TRLs increases, the communication load necessary to maintain the SAMAS environment grows at a slower rate than the number of agents. In the worst case, agents spend about 50% of their simulation cycles on gossiping. However, the worst-case scenario occurs only when the agents in the system are largely homogeneous in terms of time resolution, which is not the type of system SAMAS is intended to handle. As the time resolution difference among agents becomes large and the number of TRLs increases, agents in the system typically spend about 30% of their simulation cycles maintaining global state information through the Gossip Communication Model. Considering the benefit of SAMAS in terms of the additional work that can now be done by otherwise idle agents, this cost in communication is quite reasonable.

[Plot: Time to stabilize (y-axis) vs. Number of TRLs (x-axis); series: 1 event, 2 events, 4 events, 8 events]

Figure 9. Time to stabilize as the number of TRLs increases


[Plot: Communication Load (y-axis) vs. Number of TRLs (x-axis); series: 1 event, 2 events, 4 events, 8 events]

Figure 10. Communication Load as the number of TRLs increases

[Plot: Communication Load (y-axis) vs. Number of Agents (x-axis); series: 1 event, 2 events, 4 events, 8 events]

Figure 11. Communication Load as the number of agents increases

As the number of events in the system increases, SAMAS scales well in terms of both the time it takes for the events to propagate throughout the system and the resources the system has to commit to propagating the event information. Figure 9 shows that it takes about the same time for events to propagate throughout all TRLs as the number of events increases. Figures 10 and 11 show how the communication load changes as the number of events changes. In the worst case, the system uses about 50% of the time for event propagation when there is one type of agent in the system. As the system becomes more heterogeneous in its time resolution and the number of TRLs increases, the percentage of simulation cycles committed to communication decreases dramatically, e.g. the communication load falls to 7% when we have four different types of agents (4 TRLs). In addition, we observe excellent scalability as the number of TRLs in the system increases. The additional communication overhead required as the number of events increases is very small, especially as the system becomes more heterogeneous. We believe that this increase stays small, while events still propagate in a timely manner, because of the non-aggressive nature of the Gossip Communication Model.


5.3 Analysis of CA overhead

The following experiments investigate the overhead of CAs. Figures 12 and 13 show the increase in CA communication cycles as the number of TRL layers and the number of events increase. The number of TRL layers has the greater effect on the CA communication cycles. This result is to be expected: if there are 8 events per TRL layer, adding one layer is equivalent to adding 8 events. This characteristic is especially visible in Figure 13.

As mentioned in the previous section, we assume a CA can receive more than one event at the same time. The number of events a CA receives at a given time is not deterministic, and as a result the CA communication cycles in Figure 13 do not increase linearly.

Figures 12 and 13 show average-case results obtained by running each experiment 100 times, each time with a different random seed that generates events at random times. Equations 5.3.1 give the theoretical worst-case overhead of a CA. Equation 5.3.2 gives the probability that the worst case occurs. We define the worst case as the situation in which, at any given time, a CA receives only one unique message, so that only one unique event is forwarded to other CAs. Therefore, for k unique events, the CA takes k communication cycles to forward the events.

[Plot: CA communication cycles (y-axis) vs. Number of Events (x-axis); series: 1 layer through 8 layers]

Figure 12. CA communication cycle vs. total number of events

[Plot: CA communication cycles (y-axis) vs. Number of Layers (x-axis); series: 1, 2, 4, 8, 16, 32, 64, 128, and 256 events]

Figure 13. CA communication cycle vs. number of TRL layers

Let $\mathit{events}_i$ be the number of events, $n_i$ the number of agents, and $\mathit{CACC}_i$ the CA communication cycles for the $i$th TRL layer. Given that $CA_i$ has $\log_2 n_i$ neighbors, for every event that is generated in the $i$th layer, $CA_i$ must forward it to $\log_2 n_i$ neighbors and to $CA_{i+1}$ on the upper layer. The CAs at the bottom layer only forward events from their own layer and do not receive events from other layers. The following equations show the worst case $\mathit{CACC}_i$ for different TRL layers.

$$\mathit{CACC}_i = \mathit{events}_i \cdot (\log_2 n_i + 1) + \mathit{events}_{i-1} \cdot \log_2 n_i \quad \text{(Middle layer TRL)}$$

$$\mathit{CACC}_i = \mathit{events}_i \cdot (\log_2 n_i + 1) \quad \text{(Bottom layer TRL)}$$

$$\mathit{CACC}_i = \mathit{events}_i \cdot \log_2 n_i + \mathit{events}_{i-1} \cdot \log_2 n_i \quad \text{(Top layer TRL)}$$

Equations 5.3.1

The probability that a CA sends 1 event to the other layers is the union of the following cases:

1. Probability that only 1 neighbor of the CA sends an event
2. Probability that the CA receives more than one event from its neighbors and they are the same events.

Let $n$ be the number of agents at a given TRL layer and $p$ be the probability that a neighbor of a CA sends an event. If we assume $p$ is the same for all neighbors of a CA at a given layer, the probability for the first case is $\log_2 n_i \cdot p\,(1-p)^{\log_2 n_i - 1}$ and $\frac{p\,(2^{\log_2 n_i} - \log_2 n_i - 1)}{n_i}$ is the probability for the second case. Therefore, the probability of the worst case for a CA is:

$$\left( \log_2 n_i \cdot p\,(1-p)^{\log_2 n_i - 1} + \frac{p\,(2^{\log_2 n_i} - \log_2 n_i - 1)}{n_i} \right)^{\mathit{events}_i}$$

Equation 5.3.2

The worst case probability is a very small number. Given p equals 0.5, n equals 100, and events equals 100, the worst case probability is 2.78e-34.
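Returning to the worst-case cycle counts of Equations 5.3.1, the following Python sketch evaluates them for a small configuration. It assumes the equations are read as CACC_i = events_i (log2 n_i + 1) for the bottom layer, with an additional events_{i-1} log2 n_i term for a middle layer, and with the +1 dropped for the top layer, which has no upper CA to forward to. The function names and the example workload of 8 events per layer are ours.

```python
import math


def cacc_bottom(events_i, n_i):
    """Bottom layer: forward each local event to log2(n_i) neighbors and one upper CA."""
    return events_i * (math.log2(n_i) + 1)


def cacc_middle(events_i, events_below, n_i):
    """Middle layer: local events as above, plus events from layer i-1 sent to neighbors."""
    return events_i * (math.log2(n_i) + 1) + events_below * math.log2(n_i)


def cacc_top(events_i, events_below, n_i):
    """Top layer: no upper CA, so every event goes to the log2(n_i) neighbors only."""
    return events_i * math.log2(n_i) + events_below * math.log2(n_i)


# Assumed workload: 3 TRLs with 1024, 512, and 256 agents and 8 events per layer.
print("bottom:", cacc_bottom(8, 1024))
print("middle:", cacc_middle(8, 8, 512))
print("top:   ", cacc_top(8, 8, 256))
```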

6 CONCLUSION AND FUTURE WORK

In this paper, we have discussed the importance of multi-resolution simulation for integrating models and processes at all levels of an organization independent of time granularity. We have proposed a preliminary design for SAMAS, a highly scalable dynamic data-driven application system (DDDAS) for multi-resolution, agent-based simulation. At the core of SAMAS, we use a gossiping algorithm to efficiently maintain global information among the agents, which operate at different time resolutions. We demonstrate through simulation results that gossiping can be used to implement an architecture that allows large numbers of agents to operate on a wide range of time scales. Should our future experiments with large-scale simulations confirm the feasibility of the SAMAS gossiping approach, we will have taken a significant step toward creating a virtual environment for integrating and coordinating mission-critical organizational models.


REFERENCES

[1] Chaturvedi, A., Mehta, S., Drnevich, P., "Live and computational experimentation in bio-terrorism response", In Dynamic Data Driven Applications Systems, Kluwer Academic Publishers, (2004)
[2] Chaturvedi, A., Mehta, S., Dolk, D., Ayer, R., "Agent-based simulation for computational experimentation: developing an artificial labor market", European Journal of Operational Research, 166(3), (2005), 694–716
[3] Chaturvedi, A.R., Choubey, A.K., Roan, J.S., "Active replication and update of content for electronic commerce", International Journal of Electronic Commerce, 5, (2003)
[4] Hedetniemi, S.M., Hedetniemi, T., Liestman, A.L., "A survey of gossiping and broadcasting in communication networks", Networks, 18, (1988), 319–349
[5] Bavelas, A., "Communication patterns in task-oriented groups", J. Acoust. Soc. Amer., 22, (1950), 271–282
[6] Baker, B., Shostak, R., "Gossips and telephones", Discrete Mathematics, 2, (1972), 191–193
[7] Czumaj, A., Gasieniec, L., Pelc, A., "Time and cost trade-offs in gossiping", SIAM Journal on Discrete Mathematics, 11, (1998), 400–413
[8] Grama, A., Kumar, V., Gupta, A., Karypis, G., "An Introduction to Parallel Computing: Design and Analysis of Algorithms", 2nd Edition, Addison-Wesley, (2003)
[9] Krumme, D.W., "Fast gossiping for the hypercube", SIAM J. Comput., 21, (1992), 365–380
[10] Krumme, D.W., Cybenko, G., Venkataraman, K.N., "Gossiping in minimal time", SIAM J. Comput., 21, (1992), 111–139
[11] Landau, H., "The distribution of completion times for random communications in a task-oriented group", Bull. Math. Biophys., 16, (1954), 187–201
[12] Zhang, L.T., Liu, W.K., Li, S.F., Qian, D., Hao, S., "Survey of Multi-Scale Meshfree Particle Methods", Lecture Notes in Computational Science and Engineering (LNCSE), 26, (2002), 441
[13] E, W., Engquist, B., "Multiscale Modeling and Computation", Notices of the AMS, 50(9), (2003), 1062–1070
[14] Odette, G.R., Lucas, G.E., "Embrittlement of Nuclear Reactor Pressure Vessels", Iron and Steel, (2001), 18–22
[15] Liu, G., Zhang, G.J., Ding, X.D., Sun, J., Chen, K.H., "The Influences of Multiscale-Sized Second-Phase Particles on Ductility of Aged Aluminum Alloys", Metallurgical and Materials Transactions A, 35(6), (2004), 1725–1734
[16] Kremer, K., Müller-Plathe, F., "Multiscale problems in polymer science: Simulation Approaches", MRS Bulletin, 26(3), (2001), 205–210
[17] Ortiz, M., Cuitino, A.M., Knap, J., Koslowski, M., "Mixed Atomistic-Continuum Models of Material Behavior: The Art of Transcending Atomistics and Informing Continua", MAT. RES. SOC. BULLETIN, (2001), 1–6
[18] Villa, E., Balaeff, A., Mahadevan, L., Schulten, K., "Multiscale Method for Simulating Protein-DNA Complexes", Multiscale Modeling and Simulation: A SIAM Interdisciplinary Journal, 2(4), (2004), 527–553
[19] Bordat, P., Sacristan, J., Reith, D., Girard, S., Müller-Plathe, F., "An improved Dimethyl Sulfoxide Force Field for Molecular Dynamics Simulations", J. Chem. Phys., (2002)
[20] Jendrejack, R.M., de Pablo, J.J., Graham, M.D., "A Multiscale Approach to Computational Fluid Dynamics", Proceedings of the XIIIth International Congress on Rheology, (2000)
[21] Guare, J., Six Degrees of Separation: A Play, Vintage Books, New York, (1990)
[22] Newman, M., "The structure of scientific collaboration networks", Proc. Natl. Acad. Sci. USA, 98, (2001), 404–409