A Distributed Multiplayer Game Server System

Eric Cronin, Burton Filstrup, Anthony Kurc
Electrical Engineering and Computer Science Department
University of Michigan, Ann Arbor, MI 48109-2122
{ecronin,bfilstru,tkurc}@eecs.umich.edu

May 4, 2001



Chapter 1

Introduction

1.1 Background

Online multiplayer games have improved dramatically in the past few years. The newest online virtual worlds such as Everquest [5] feature fantastic artwork, realistic graphics, imaginative gameplay and sophisticated artificial intelligence. Before the next generation of multiplayer games can complete the transition to lifelike virtual worlds, they must be able to support real-time interactions. The main obstacle to real-time interactions is the Internet's inability to provide low-latency guarantees. In this paper, we present a system for enabling real-time multiplayer games.

In particular, we focus on real-time multiplayer games that have strong consistency requirements. That is, all players must share a common view of a complex virtual world. The combination of low latency and absolute consistency is difficult to achieve because messages may be delayed indefinitely in the network. Our system realizes these goals using three building blocks: a Mirrored-Server architecture, a trailing state synchronization protocol, and a low-latency reliable multicast protocol. We have implemented and integrated these three components into a working prototype and performed some preliminary experiments on our system.

Commercial games are either designed on top of client-server architectures or, less frequently, on top of peer-to-peer architectures. Client-server architectures are simple to implement and allow gaming companies to retain control over the game state. Peer-to-peer architectures deliver lower latencies and eliminate bottlenecks at the server. Our proposed Mirrored-Server architecture requires a complex consistency protocol, but it achieves low latency and allows administrative control over the game state.

In order to make our Mirrored-Server architecture attractive, we have developed a consistency protocol that can be used to port a client-server game to a Mirrored-Server architecture with minimal modifications. The two key pieces of the consistency protocol are a synchronization mechanism called trailing state synchronization and CRIMP, a reliable multicast protocol.

Previous synchronization mechanisms such as bucket synchronization and Time Warp are not well suited to the demands of a real-time multiplayer game. Trailing state synchronization is designed specifically for real-time multiplayer games, and achieves better responsiveness and scalability while maintaining near-perfect consistency.

The consistency protocol assumes a low-latency, many-to-many reliable multicast protocol. Previous reliable multicast protocols focus more on scalability of state than on overall packet latency. CRIMP is a custom-built, application-specific multicast protocol that provides the infrastructure requirements of our architecture in a low-latency manner.

Two bootstrapping services will be included in the final system, although they have not yet been implemented. The first is a master server that allows users to locate candidate game servers. The second is a server selection service named qm-find that can be used to locate the closest game server to a particular client.

The most significant result of this project is a fully operational, distributed Quake system, complete with a working synchronization mechanism and a functional multicast protocol. In addition, we have established the infrastructure to test and evaluate our system under different network conditions. The experimental evaluation of trailing state synchronization is in its early phases, and our preliminary results do not adequately explore the mechanism's potential to respond to different network conditions. In future work, we would like to concentrate on understanding the effects of different configurations on trailing state synchronization's performance.

1.2 Quake

We chose Quake as our proof-of-concept game because it is very sensitive to latency and provides a virtual world environment. Quake is a commercial game that was not designed for a distributed architecture. As such, Quake forces us to address the needs of a real game, as opposed to a game that fits neatly into our architecture. Another reason for using Quake is that it is one of the few commercial games for which the source code has been released.

In this section, we introduce Quake and describe the features of Quake that are relevant to our report. We outline Quake's architecture, including its master server. We describe Quake's interest management techniques and then delve into details about the messaging protocols.

1.2.1 Introduction

Quake is a 3-D first-person shooter by id Software, the creators of a revolutionary and spectacularly successful line of games including Wolfenstein 3D, Doom, Quake and Quake III. Besides 3-D graphics, their major innovation was the online multiplayer game. College students found that they could put the newly installed dorm LANs to use by engaging in mortal combat with their neighbors. Multiplayer Quake is a simple game. Your avatar has a gun and so does everyone else. You're going to die in the next minute or so, but before that happens, your goal is to finish off as many of your opponents as possible. All interactions take place in real-time, and more than a 100 ms delay can be a crippling handicap.

John Carmack, the mastermind behind Quake, released the source code in 1999. Two open source projects arose to improve Quake, QuakeForge and QuakeWorld. The two subsequently merged their sources into QuakeForge, which has developed several new versions of the original game. Development in QuakeForge is sporadic, with a handful of people doing most of the work. Their major contributions have been porting Quake to new platforms, supporting more sound and graphics hardware, and enhancing the graphics. QuakeForge has also built in some simple cheat detection algorithms and a denial-of-service attack prevention mechanism.

A significant amount of information about Quake has been compiled by the Quake Documentation Project. Two pieces of this project describe the network code: The Unofficial Quake Specification and The Unofficial DEM Format Description. For the most part, these documents focus on the details of the message formats. Neither is a comprehensive look at the networking issues in Quake. Other Quake resources can be found at the Quake Developer's Pages.

The purpose of this document is to give a high-level overview of the technologies Quake uses to enable multiplayer games. The first section describes the Quake Client-Server architecture. Next, three optimizations are discussed: dead reckoning, area of interest filtering and packet compression. The QuakeForge modifications to prevent cheating and denial-of-service attacks are then discussed. Finally, the last section presents the network protocols.

1.2.2 Architecture

Quake uses a centralized Client-Server architecture. In this architecture, the server is responsible for computing all game state and distributing updates about that game state to the clients. The client acts as a viewport into the server's state and passes primitive commands such as button presses and movement commands to the server. Because this architecture allows most of the game state computation to be performed on the server, there is no consistency problem. However, the gameplay is very sensitive to latencies between the client and the server, as users must wait at least a round trip time for key presses to be reflected in reality.

The server is responsible for receiving input commands from clients, updating the game state, and sending out updates to the clients. The control loop for this procedure is discussed in more detail in Appendix A. The server's biggest job is keeping track of entities. An entity is almost any dynamic object: avatars, monsters, rockets and backpacks. The server must decide when a player is hit, use physics models to calculate the position of free-falling objects, turn key presses into jumps and weapon fires, etc. The server also informs clients of world state: gravity, lighting and map information. Aside from its role in gameplay, the server is a session manager that starts and ends games, accepts client connections and performs numerous other administrative tasks.

The client sends simple avatar control commands (movements, tilts, buttons) to the server. It uses its cache of the game state plus any updates from the server to render a 3-D representation of the virtual world. The graphics rendering is performed on the client, as is fine-grained collision detection. The client has been the main area of interest in the game developer community because it is where all the 'cool graphics' are done. However, from an architectural or networking point of view, the Quake client is relatively simple compared to the server.

1.2.3 Master Server

Quake supports a master server that keeps track of active game servers on the Internet. Although the master server is not packaged with Quake, GameSpy provides a full-featured master server service. For a Quake server to be listed in GameSpy's database, the administrator must manually add GameSpy's IP address to the master server list. The server then periodically sends heartbeats containing the number of connected clients to the master server. The server also informs the master server when it is shutting down.
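The heartbeat scheme above is easy to picture in code. The following is a minimal sketch of the master server's bookkeeping (field names and addresses are illustrative assumptions, not GameSpy's real wire format):

```python
# Hypothetical master-server bookkeeping: game servers send periodic
# heartbeats with their client count, and announce shutdown explicitly.

class MasterServer:
    def __init__(self):
        self.servers = {}  # addr -> number of connected clients

    def on_heartbeat(self, addr, num_clients):
        # A heartbeat both registers the server and refreshes its client count.
        self.servers[addr] = num_clients

    def on_shutdown(self, addr):
        # A shutting-down server removes itself from the listing.
        self.servers.pop(addr, None)

master = MasterServer()
master.on_heartbeat(("192.0.2.10", 26000), 5)   # first heartbeat registers
master.on_heartbeat(("192.0.2.10", 26000), 7)   # later heartbeat refreshes
assert master.servers[("192.0.2.10", 26000)] == 7
master.on_shutdown(("192.0.2.10", 26000))
assert master.servers == {}
```

A production master server would additionally expire entries whose heartbeats stop arriving, since a crashed server never sends the shutdown message.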

1.2.4 Area of Interest Filtering

One of the advantages of having a server architecture is that the server can dynamically filter information before sending it to the clients. The Quake server only sends a client information for entities that are within that player's potential field of sight. Similarly, sounds are filtered to those that are within that player's earshot.

Calculating an entity's potential field of sight is tricky. The data structure used to do this is called a Binary Space Partition (BSP) tree or octree. The three-dimensional world is split into eight regions by three planes. Each region can be further subdivided until the necessary resolution is achieved. The field of sight for entities in each leaf of the BSP tree is precomputed so that the server can quickly determine which other leaves of the BSP tree to include in the filter. For example, a room might be represented by a single leaf. This leaf would have associated with it a list of adjacent rooms (leaves) within the room's field of sight. A great deal of information about this general technique is available on the web. Information about BSP trees in Quake is available in the Unofficial Quake Specs.
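Once the per-leaf visibility lists exist, the server-side filter reduces to a set lookup. The following sketch (leaf names and the PVS table are invented for illustration, and the real precomputation over a BSP tree is elided) shows the filtering step itself:

```python
# Hypothetical precomputed potentially-visible-set: each leaf maps to the
# set of leaves within its field of sight. In Quake this table would be
# derived offline from the BSP tree; here it is simply assumed data.
PVS = {
    "hallway": {"hallway", "room_a", "room_b"},
    "room_a": {"room_a", "hallway"},
    "room_b": {"room_b", "hallway"},
    "cellar": {"cellar"},
}

def visible_entities(player_leaf, entities):
    """Return only the entities whose leaf is in the player's precomputed PVS."""
    seen = PVS[player_leaf]
    return [name for name, leaf in entities.items() if leaf in seen]

entities = {"rocket": "room_a", "monster": "cellar", "backpack": "hallway"}
# A player in room_a is sent the rocket and the backpack, but not the monster.
print(sorted(visible_entities("room_a", entities)))
```

The point of the precomputation is that this per-update filter costs a dictionary lookup plus a membership test per entity, rather than any geometric visibility calculation.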

1.2.5 Network Protocols

Almost all of the messages exchanged between the client and the server during gameplay are sent unreliably. Because there are few reliable messages, the client and server can get away with an inefficient yet simple stop-and-wait protocol with single-bit sequence numbers. Although recovery of lost packets is not necessary, the server does require knowledge of which packets have been dropped so that it can implement the FEC scheme described above. This is accomplished using sequence numbers.

Both the client and the server have a frame rate. On the client, this frame rate is the frame rate of the rendered scene. On the server, a frame is a single iteration through the receive, process, send loop. This is typically a tight loop unless the server machine is also running a client, in which case some time is reserved for client-side processing. At the start of a frame, all waiting packets are read. During the processing step of a frame, a number of different functions may write messages into either the reliable or unreliable message buffers. At the end of each frame, the server sends at most one packet to each client. Similarly, the client sends at most one packet to the server. Packets can contain both reliable and unreliable data. Reliable data takes precedence over unreliable data when constructing a packet. All unsent data is sent in the next frame.

Reliability is achieved using a stop-and-wait protocol with a single-bit reliable sequence number. The receiver will not reorder packets; stale packets are discarded. Acknowledgments are piggybacked on outgoing messages. Both unreliable and reliable messages are sent using UDP datagrams.
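The single-bit scheme can be sketched compactly. This is an illustrative model of one direction of such a channel, not Quake's actual netchan code; the class and method names are ours:

```python
class StopAndWait:
    """One direction of a single-bit stop-and-wait reliable stream."""

    def __init__(self):
        self.send_bit = 0    # bit stamped on the next outgoing reliable message
        self.acked = True    # True once the last reliable message was acked
        self.expect_bit = 0  # bit expected on the next incoming reliable message
        self.pending = None  # in-flight (bit, payload), retransmitted until acked

    def send(self, payload):
        # Stop-and-wait: only one reliable message may be in flight at a time.
        assert self.acked, "previous reliable message not yet acknowledged"
        self.pending = (self.send_bit, payload)
        self.acked = False
        return self.pending

    def on_ack(self, bit):
        # An ack (piggybacked on a peer packet) for the in-flight bit
        # completes the exchange; flip the bit for the next message.
        if not self.acked and bit == self.send_bit:
            self.acked = True
            self.send_bit ^= 1
            self.pending = None

    def on_receive(self, bit, payload):
        # Deliver only a message carrying the expected bit; a stale
        # retransmission carries the old bit and is silently dropped.
        if bit != self.expect_bit:
            return None
        self.expect_bit ^= 1
        return payload

chan_a, chan_b = StopAndWait(), StopAndWait()
bit, msg = chan_a.send(b"name player1")
assert chan_b.on_receive(bit, msg) == b"name player1"  # delivered once
assert chan_b.on_receive(bit, msg) is None             # duplicate dropped
chan_a.on_ack(bit)                                     # piggybacked ack arrives
```

One bit suffices precisely because at most one reliable message is outstanding per direction; with a window larger than one, a single bit could not distinguish retransmissions.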

Congestion control for both unreliable and reliable messages is a simple rate-based scheme. In QuakeForge 5, the channel is assumed to have a capacity of 20 Kbps. However, the original scheme used round trip times as a measure of bandwidth, allowing the channel capacity to increase up to 40 Kbps. The chosen rate is enforced by aborting packet transmissions that might result in an overloaded channel. A packet is only sent if the sender decides, based on the time and size of the last transmission, that the channel is currently clear. Aborted packets are sent at the end of the next frame. No flow control is performed; the sender assumes that the receiver can process packets as fast as the sender can send them.
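The "channel clear" test described above amounts to tracking when the previous transmission will have drained at the assumed rate. A minimal sketch, with the 20 Kbps figure taken from the text and the class name ours:

```python
RATE_BPS = 20_000  # assumed channel capacity, 20 Kbps as in QuakeForge

class RateLimiter:
    def __init__(self, rate_bps=RATE_BPS):
        self.rate = rate_bps
        self.clear_time = 0.0  # time at which the channel is next clear

    def try_send(self, size_bytes, now):
        """Send only if the channel is clear; otherwise abort until next frame."""
        if now < self.clear_time:
            return False  # would overload the channel: transmission aborted
        # Charge the channel for the time this packet occupies at the rate.
        self.clear_time = now + (size_bytes * 8) / self.rate
        return True

limiter = RateLimiter()
assert limiter.try_send(1000, now=0.0) is True   # 1000 B occupies 0.4 s at 20 Kbps
assert limiter.try_send(1000, now=0.2) is False  # channel still busy: aborted
assert limiter.try_send(1000, now=0.5) is True   # clear again
```

Because an aborted packet is simply retried at the end of the next frame, the scheme never queues more than one frame's worth of data and needs no explicit token bucket.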

1.2.6 Client to Server Messages

During gameplay, the client sends the following commands to the server unreliably:

• move forward, up, sideways

• change orientation

• button presses

• impulse

• time between this command and the last

Each packet from the client contains the last threecommands to compensate for lost packets.
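This redundancy means a single lost packet costs nothing: the next packet re-carries the missing commands. A sketch of the idea (message layout and class names are our own, chosen for illustration):

```python
from collections import deque

class CommandSender:
    def __init__(self, depth=3):
        self.recent = deque(maxlen=depth)  # last `depth` commands, newest last
        self.seq = 0

    def packet_for(self, command):
        # Each packet body carries the last three (seq, command) pairs.
        self.recent.append((self.seq, command))
        self.seq += 1
        return list(self.recent)

class CommandReceiver:
    def __init__(self):
        self.next_seq = 0
        self.applied = []

    def on_packet(self, commands):
        # Apply only commands not yet seen; redundant copies fill loss gaps.
        for seq, cmd in commands:
            if seq >= self.next_seq:
                self.applied.append(cmd)
                self.next_seq = seq + 1

sender, receiver = CommandSender(), CommandReceiver()
p0 = sender.packet_for("forward")
p1 = sender.packet_for("turn left")
p2 = sender.packet_for("fire")      # suppose this packet is lost
p3 = sender.packet_for("jump")      # carries ("turn left", "fire", "jump")
receiver.on_packet(p0)
receiver.on_packet(p1)
receiver.on_packet(p3)              # p2 was lost, yet "fire" is recovered
assert receiver.applied == ["forward", "turn left", "fire", "jump"]
```

Two consecutive losses would still drop a command, which matches the scheme's goal: cheap tolerance of occasional loss rather than full reliability.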


A number of other non-gameplay messages can either be generated automatically by the client or by the user. These include instructions to spawn a new entity, kill an unruly player, pause the game, download a new map, change user information, etc.

1.2.7 Server to Client Messages

There are a large number of server-to-client messages. The ones that are directly involved in the gameplay and require low latency describe the state of each client in the game. Key elements of a client's state include that client's position, orientation, health, weapons, and ammo. Client state information is computed as a delta off a client's baseline state. All server-to-client messages are sent unreliably.
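The delta encoding above can be illustrated in a few lines. The state fields shown are examples drawn from the list in the text; the functions are our own sketch, not Quake's message format:

```python
def delta(baseline, current):
    """Fields of `current` that differ from the baseline (the delta sent)."""
    return {k: v for k, v in current.items() if baseline.get(k) != v}

def apply_delta(baseline, update):
    """Receiver side: merge the delta back over the baseline."""
    return {**baseline, **update}

baseline = {"x": 10, "y": 4, "health": 100, "ammo": 25}
current  = {"x": 12, "y": 4, "health": 88, "ammo": 25}

update = delta(baseline, current)   # only x and health changed, so only
print(update)                       # those two fields go on the wire
assert apply_delta(baseline, update) == current
```

Since these updates are sent unreliably, the baseline must be one the server knows the client has; deltas against an unacknowledged state would desynchronize after a loss.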


Chapter 2

Mirrored-Server Architecture

2.1 Introduction

This chapter describes the tradeoffs of the two basic multiplayer game architectures, the Client-Server and Peer-to-Peer (P2P) architectures, and proposes a new architecture, the Mirrored-Server architecture. The architectures vary in their end-to-end latencies, bandwidth requirements, degrees of administrative control and consistency protocol requirements. The Client-Server architecture is popular in the gaming industry because it allows companies to retain administrative control over servers and because no elaborate protocol is required to maintain consistency between different copies of the game state. The P2P architecture exhibits lower end-to-end latency and less concentrated bandwidth consumption. The Mirrored-Server architecture can achieve low latencies while still allowing administrative control of servers.

2.2 Architecture Goals

The choice of architecture depends on the characteristics of the game in question. The games considered in this report require very low latencies and a high degree of consistency over a complicated game state. First-person shooters such as Quake match this description closely. Many games do not typically fall into this category, including strategy games, role playing games, racing games, flight simulators, and sports games. Of secondary importance are minimizing bandwidth usage and maintaining administrative control over servers.

We consider games that are extremely sensitive to latencies. These games must be playable with latencies equal to the network latency, but their playability degrades rapidly as the latency is increased. For a racing game, a study has found that latencies of 100 ms are almost unnoticeable, but as latencies reach 200 ms, the players have trouble controlling their race cars [15]. We have experienced similar latency sensitivity with Quake. In contrast, many if not most games can function adequately with high latencies. In strategy and role-playing games, reaction times are not a factor. Even games that feel like action games may mask latencies by implementing vague or sluggish actions. For example, Everquest players report realistic fighting, even though Everquest uses TCP connections over the Internet [5]. Games can fake real-time interactions by requiring a second to sluggishly swing an axe, or by using weapons like spells where the exact time or direction of the action is unimportant or the results are difficult to verify.

The second criterion for relevant games is that they have strict consistency requirements over a complex game state. Quake's game state must quickly and accurately reflect a number of permanent changes such as rocket fires, rocket impacts, damage and death. For racing games [15] and simple games like MiMaze [8], consistency requirements are much simpler. Two game states need only agree on positional information, which is refreshed rapidly, can be transiently incorrect and can be dead reckoned.

Bandwidth requirements for a multiplayer game can be significant. The Quake clients send out tiny command packets, and the Quake server performs delta updates and interest management to reduce state update sizes. Still, as you will see from our results, an eight-player game consumes 50 Kbps on the client-side link, making modem play difficult. Even worse, the server consumes 400 Kbps, which means it must have good connectivity. If possible, we would like our architecture to reduce bandwidth requirements, particularly on the client-side links.

The fourth goal is to use an architecture that allows game companies to retain administrative control over servers. Administrative control can allow the game company to:

• Create persistent virtual worlds. The game company's servers can securely store player profiles and maintain centralized worlds that are always "on", as in Everquest [5].

• Reduce cheating. If a secure server manages game state, opportunities for cheating may be reduced [1].

• Facilitate server and client maintenance. The game company can easily distribute bug fixes, and can allow the game to evolve after its release date.

• Eliminate piracy. Players can be forced to log in to verify their identity [2].

• Enable player tracking. Everquest charges users based on the amount of time they spend playing.

The drawback of administrative control is that the game company is forced to run its own servers. Particularly for small companies, the cost of operating servers may be prohibitive.

2.3 Assumptions

For the following discussion, we assume that the gaming company operates a private low-latency network where mirrors are gateways to that network. This assumption makes the real-time multiplayer game problem tractable; the Internet cannot provide low latencies, nor can it handle gaming traffic that lacks congestion control mechanisms. A side benefit of this assumption is that we can choose to enable IP Multicast within the private network, whereas IP Multicast is not available on the Internet. Without IP Multicast, our architecture would require an end-host multicast protocol such as BTP [9], HyperCast [13] or NARADA [20]. Unfortunately, these end-host multicast protocols fail to provide the low latency required by our architecture.

2.4 Client-Server vs Peer-to-Peer Architecture

In this section, we will briefly describe the key features of the Client-Server and P2P architectures and compare them along the axes described in the previous section, namely: latency, consistency, bandwidth, and administrative control.

Most commercial games use a Client-Server architecture, where a single copy of the game state is computed at the server. This architecture is used in Quake [16] and Starcraft [3]. The Client-Server architecture as used in Quake is depicted in Figure 2.2. Clients input key presses from the user and send them to the server. We refer to these client-to-server messages as commands. The server collects commands from all of the clients, computes the new game state, and sends state updates to the clients. Finally, the clients render the new state. Notice that this is a thin client architecture, where clients are only responsible for inputting commands and rendering game state.

Figure 2.1: Peer-To-Peer Architecture

In the peer-to-peer (P2P) architecture, there is no central repository of the game state. Instead, each client maintains its own copy of the game state based on messages from all the other clients (see Figure 2.1). A P2P architecture has been used in a few simple games, including Xpilot [19] and MiMaze [8]. It is also used in complex, scalable battlefield simulators such as STOW-E. In MiMaze, clients multicast their game positions to every other client. Each client computes its own copy of the game state based on messages from other clients and renders that game state.

The primary advantage of the P2P architecture is reduced message latency. In the Client-Server architecture, messages travel from clients to a potentially distant server, then the resulting game state updates propagate back from the server to the clients. In the P2P architecture, messages travel directly from one client to another through the multicast tree. Because no widespread multicast service that we know of is provided on the Internet, in practice, latency-sensitive P2P messages need to be sent using unicast, which scales poorly with the number of clients.

Figure 2.2: Client-Server Architecture

In the Client-Server architecture, game state consistency is not a problem. There is only one copy of the game state, and that copy is maintained by a single authoritative server. In the P2P architecture, there are multiple copies of the game state. If messages are lost or arrive at different times, inconsistencies can arise. Although synchronization mechanisms such as our novel trailing state synchronization can resolve these inconsistencies, these mechanisms tend to increase the system's complexity and may require disorienting rollbacks.


The relative bandwidth consumption of the two architectures is roughly equivalent. In the P2P architecture, each host multicasts an input message every time period. This results in the reception of N^2 total input messages per time period, where N is the number of hosts/players. In the Client-Server architecture, each host unicasts an input message to the central server every time period. Also, the server sends a game state message per client per time period. So the Client-Server architecture totals N input messages and N game state messages per time period. Typically, game state messages grow linearly with N because they must describe the change in state that occurred as a result of N input messages. Both architectures consume total bandwidth proportional to N^2. However, note that the traffic in the Client-Server architecture is concentrated at the central server, which must handle all N^2 messages.

Interest management techniques can be used to reduce the bandwidth requirements of both architectures [14]. Hosts are only interested in a small portion of the game state. If this portion of the game state is constant despite the addition of new players to the game, interest management will improve the scalability of both architectures. A centralized server can determine which pieces of state a client is interested in and only send them this filtered state information. The result, constant-sized state updates, would result in cN total bandwidth consumption. In a P2P architecture, a host can subscribe only to the multicast groups that match its area of interest. Again, the result would be cN total bandwidth consumption. A detailed discussion of interest management is beyond the scope of this paper. Overall, the relative bandwidth consumption depends on the sizes of different messages and the effectiveness of interest management techniques. Later we will present simple experiments that show that the total bandwidth usage of the two architectures is comparable.

The fourth axis along which we compare the two architectures is administrative control. Administrative control is trivial for Client-Server architectures. The game company need only run its own servers, where it can protect sensitive information such as player profiles and the server code itself. In the P2P architecture, most if not all of the code and data that are necessary to run a game must be resident on the clients, where they are vulnerable to hackers. Some administrative control can be retained by using a Client-Server protocol for non-gameplay messaging, but this does not resolve the security issues.

For most large-scale massively-multiplayer games, gaming companies overwhelmingly choose Client-Server architectures: the simplicity of dealing with a single repository of consistent state far outweighs the cost of increased latency, and game companies may prefer to retain administrative control of their servers. DOD simulations, on the other hand, have gone with the P2P architecture. These large organizations operate highly scalable distributed simulations, and they generally have no administrative control problems because they control all of the clients.

2.5 Mirrored-Server Architecture

Our chosen architecture is the Mirrored-Server architecture, where game server mirrors are topologically distributed across the Internet and clients connect to the closest mirror. The Mirrored-Server architecture is a compromise between the two previously discussed architectures. As in the Client-Server architecture, the Mirrored-Servers can be placed under a central administrative control. Like the P2P architecture, messages do not have to pay the latency cost of traveling to a centralized server. Unfortunately, a Mirrored-Server system must still cope with multiple copies of the game state. We hope to solve this last remaining problem with our trailing state synchronization mechanism.

In our mirrored version of Quake, clients send commands to their closest mirror. The mirror closest to the sender is referred to as the ingress mirror. The ingress mirror multicasts incoming commands to the other mirrors, which we term egress mirrors. Each egress mirror calculates its own copy of the game state and sends state updates to its directly connected clients. The client then renders this updated game state.
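The ingress/egress command flow can be sketched as follows. This is a simplified in-process model (no real networking, no trailing state synchronization), with class and variable names of our own choosing:

```python
class Mirror:
    def __init__(self, name, peers):
        self.name = name
        self.peers = peers  # all mirrors, including self (the multicast group)
        self.state = []     # this mirror's own copy of the game state

    def on_client_command(self, cmd):
        # Acting as the ingress mirror for this command: multicast it
        # to every mirror, including ourselves.
        for mirror in self.peers:
            mirror.apply(cmd)

    def apply(self, cmd):
        # Acting as an egress mirror: fold the command into the local
        # game state copy, from which client updates would be derived.
        self.state.append(cmd)

peers = []
m1, m2 = Mirror("m1", peers), Mirror("m2", peers)
peers.extend([m1, m2])

m1.on_client_command("player1: fire")  # player1's closest mirror is m1
m2.on_client_command("player2: jump")  # player2's closest mirror is m2
assert m1.state == m2.state == ["player1: fire", "player2: jump"]
```

In this toy model all mirrors trivially agree because commands arrive in one order; the whole point of trailing state synchronization in the real system is to repair the disagreements that arise when multicast delivery orders differ between mirrors.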

Comparing bandwidth consumption among the three architectures becomes more complicated when assuming a high-speed network with each client sending messages to the nearest gateway. A detailed analysis of the bandwidth requirements of each architecture is presented in Section ??. The Mirrored-Server architecture uses more bandwidth on the private network, where bandwidth is plentiful, than it does on the public network. Also, the traffic concentration seen in the Client-Server architecture is shared among the mirrors in the Mirrored-Server architecture. Overall, the bandwidth consumption of all three architectures is comparable.

The Mirrored-Server architecture appears to be a reasonable alternative to the Client-Server architecture. Some of the larger gaming companies have already distributed game server clusters across the world. Both Blizzard [3] and Verant [5] have a number of geographically distributed clusters. To our knowledge, no game state is shared between these clusters.

2.6 Experiments and Results

The experiments in this section are designed to investigate the bandwidth requirements of the three architectures. We begin by developing a model to predict bandwidth usage based on the number of players in a game. Then, we collect some bandwidth statistics to populate the model. Our findings confirm that the architectures use similar amounts of bandwidth. The measurements are also used to evaluate the effectiveness of Quake's interest management techniques, but the results are inconclusive. We also include plans for future experiments in this section.

The bandwidth requirements of Quake can be described using a simple model. Clients always generate commands at a rate a, regardless of the architecture. The command streams increase to a rate b upon being forwarded to the multicast channel due to added timestamps and multicast headers. In both the Client-Server and Mirrored-Server architectures, the server sends state updates to each client at a rate c + n(d + εe). For the Mirrored-Server architecture, we assume 8 clients per server, so m = n/8. Variable definitions are listed in Table 2.1.

First we calculate the asymptotic bandwidth requirements of the architectures as the number of clients increases. In Quake, interest management techniques remove only about 70% of the information about clients that are not in the area of interest, so they do not affect the asymptotic results. We further simplify the model by assuming that all messages are unicast. The asymptotic input and output rates of the clients, servers and mirrors are listed in Table 2.2. For the Mirrored-Server architecture, the server rates are for messages between the clients and mirrors, and the mirror rates are for messages between mirrors. Notice that all of the approaches use Θ(n^2) bandwidth on the public and private networks and Θ(n) bandwidth at the client. However, the Mirrored-Server architecture uses only Θ(n) bandwidth at the mirrors, while the Client-Server architecture uses Θ(n^2) bandwidth at the server.

To determine the values of a, b, c, d, e, and α, we set up a simple series of experiments. We ran a total of five experiments with n = 1, 2, 3, 4, and 5 clients. For each experiment, the clients connected to a single mirror and played a two-minute game, 60 seconds of which was used for analysis. The server was instrumented to record the transport-layer bandwidth consumption of client-to-server, server-to-client, and mirror-to-mirror messages. The coefficients a and b were directly observable. The values of c and d + e were calculated by doing a linear regression on the server-to-client bandwidth measurements. Table 2.3 contains the measured coefficient values. The linear regression had a fairly encouraging correlation of 0.8682.

a   Command rate on the Client-Server connection
b   Command rate on the mirror-mirror connection
c   Rate of state updates with zero clients
d   Rate of information that is always included about each client
e   Rate of information about clients in proximity to the receiving client
α   Describes the effectiveness of the interest management techniques
n   Number of clients in the game
m   Number of mirrors

Table 2.1: Variable Definitions

                       Client-Server        Peer-to-Peer   Mirrored-Server
Client Output Rate     a                    bn             a
Server Input Rate      an                   --             a
Client Input Rate      c + (d+e)n           bn             c + (d+e)n
Server Output Rate     cn + (d+e)n²         --             c + (d+e)n
Mirror Output Rate     --                   --             bn
Mirror Input Rate      --                   --             bn
Private Network Rate   an + cn + (d+e)n²    bn²            bn²
Public Network Rate    an + cn + (d+e)n²    bn²            an + cn + (d+e)n²
Total Client           a + c + (d+e)n       bn             a + c + (d+e)n
Total Mirror/Server    an + cn + (d+e)n²    --             bn + c + (d+e)n

Table 2.2: Asymptotic Input and Output Rates

a       5.68 Kbps
b       10.6 Kbps
c       1.16 Kbps
d + e   5.36 Kbps

Table 2.3: Linear Regression Results
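The regression step described above can be sketched as a plain least-squares fit of the per-client server-to-client rates; the sample rates below are made up for illustration, since the paper's values came from the instrumented server:

```python
def least_squares(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum(x * y for x, y in zip(xs, ys)) - n * mean_x * mean_y) / \
            (sum(x * x for x in xs) - n * mean_x ** 2)
    return mean_y - slope * mean_x, slope  # (intercept ~ c, slope ~ d + e)

# Made-up server-to-client rates (Kbps) for n = 1..5 clients; the intercept
# estimates c and the slope estimates d + e.
clients = [1, 2, 3, 4, 5]
rates = [6.4, 12.1, 17.0, 22.8, 28.1]
c_hat, de_hat = least_squares(clients, rates)
print(round(c_hat, 2), round(de_hat, 2))  # → 1.05 5.41
```
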

The asymptotic bandwidth usages of the architectures without interest management can be computed using the values of a, b, c, and d + e. First, Figure 2.3 compares our actual and predicted bandwidth usages. The actual server output rates are much lower than the predicted server output rates, although both increase quadratically. This discrepancy is due to Quake's interest management techniques, which are not reflected in our simple model. Table 2.4 shows the bandwidth consumption of each architecture. With Quake-like bandwidth configurations, the P2P architecture would use twice as much total bandwidth as the traditional Client-Server architecture. However, Quake's messaging protocols have been optimized for the Client-Server architecture, so this comparison is not fair. Note that the Mirrored-Server architecture uses twice as much bandwidth on the private network, where congestion is not a problem. The advantages of the Mirrored-Server architecture are illustrated in Figure 2.4, which compares the Client-Server server load and the Mirrored-Server mirror load. With four mirrors and 32 clients, mirrors consume 2 Mb/s, while servers in the Client-Server architecture must handle 6 Mb/s.
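Plugging n = 32 into the fitted totals of Table 2.4 reproduces this comparison; a quick check (function names are ours, rates in Kbps):

```python
# Fitted total loads from Table 2.4 (rates in Kbps).
def client_server_kbps(n):
    return 5.36 * n ** 2 + 6.84 * n    # Theta(n^2) at the central server

def mirrored_server_kbps(n):
    return 53.5 * n + 9.28             # Theta(n) per mirror (8 clients/mirror)

n = 32
print(round(client_server_kbps(n)))    # → 5708  (~6 Mb/s at the server)
print(round(mirrored_server_kbps(n)))  # → 1721  (~2 Mb/s at each mirror)
```
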

Figure 2.3: Measured vs. Predicted Bandwidth Usage (measured and predicted server input, server output, and mirror input rates, in Kbps, for 0 to 5 clients)

To determine the success of Quake's interest management techniques, we attempted to estimate the values of c, d, and e separately. Given these values, α can be computed for each trace. Remember that α represents the proportion of the other clients whose detailed state is included in any given state update. Because d and e are both coefficients of n, it is impossible to separate them using regression analysis. Instead, we looked at the source code and attempted to estimate how many bytes per state update would be devoted to c, d, and e. These estimates are imperfect because the amount of data written for each client depends on how much that client's state has changed since the last state update.

Table 2.5 lists the value of α for each set of measurements. We expected a fixed percentage of the other clients to be represented in any state update, regardless of the number of clients. Unfortunately, the results are not strongly correlated. We need to perform more experiments with a larger number of clients to better understand the behavior of Quake's interest management mechanism. Also, better instrumentation could be installed to eliminate the error due to rough estimations of the coefficients c, d, and e. We reserve this for future work.

                       Client-Server      Peer-to-Peer   Mirrored-Server
Private Network Rate   5.36n² + 6.84n     10.6n²         10.6n²
Public Network Rate    5.36n² + 6.84n     10.6n²         5.36n² + 6.84n
Total Client           5.36n + 6.84       10.6n          5.36n + 6.84
Total Mirror/Server    5.36n² + 6.84n     --             53.5n + 9.28 (n/m = 8)

Table 2.4: Bandwidth Consumption (Kbps)

Figure 2.4: Predicted Client-Server and Mirrored-Server Bandwidth Requirements (bandwidth in Kbps for 0 to 14 clients)

Clients   Clients in State Updates (%)
1         N/A
2         66.7
3         13.2
4         28.5
5         16.7

Table 2.5: α Values

Latency is the most important experimentally verifiable difference between the architectures presented in this section. Unfortunately, coming up with meaningful latency comparisons involves choosing a realistic network topology. At one extreme, a star topology centered at the server will result in identical latency for both the Client-Server and P2P architectures (see Figure 2.5a). At the other extreme, a topology where the server is located in an infinitely distant or congested part of the network, as in Figure 2.5b, will result in the P2P architecture being infinitely faster than the Client-Server architecture. For the Mirrored-Server architecture, the best and worst cases are less clear. The topology in Figure 2.5c will result in much better average end-to-end latency for the Mirrored-Server and P2P architectures, because a centralized server would have to be placed far away from half of the clients. This topology-dependent evaluation is reserved for future work.

2.7 Conclusions

Figure 2.5: Different topology types: (a) Star, (b) Remote Server, (c) Bi-nodal

Each of the three architectures sports a different set of advantages and disadvantages. No single architecture is suited to all classes of multiplayer games. The Client-Server architecture has poor latency and good consistency characteristics, so it is a good match for any non-real-time game. The P2P architecture has good latency but poor administrative control characteristics, making it ideal for small-scale, latency-sensitive games. The Mirrored-Server architecture achieves low latency and administrative control; we feel it is a great solution for next-generation real-time persistent-world games.


Chapter 3

Consistency Protocols and Synchronization Mechanisms

3.1 Introduction

The Mirrored-Server and P2P architectures involve computing multiple copies of the game state. Lost or delayed messages can cause inconsistencies between game states. A consistency protocol is needed to deliver messages to their destinations, and a synchronization mechanism must be used to detect and resolve inconsistencies. Although the Mirrored-Server and P2P architectures pose identical consistency problems, we will focus on the Mirrored-Server architecture because it is the chosen architecture for our Quake system.

This chapter begins with some key definitions and an overview of our consistency goals. Next, we launch into a description of two consistency protocols. The last half of the chapter concerns our synchronization mechanism, trailing state synchronization (TSS). We set the stage with a discussion of previously proposed synchronization mechanisms and how our approach differs from those mechanisms. TSS is described in detail, with some analysis of optimal configurations. The results in this section support our claim that TSS is an efficient, low-latency, consistent synchronization protocol that is appropriate for multiplayer games.

3.2 Definitions

We make a distinction between the consistency protocol and the synchronization mechanism. The consistency protocol is responsible for delivering messages to the destination mirror and incorporating those messages into the game state. A key piece of the consistency protocol is the synchronization mechanism, which detects and resolves inconsistencies. The consistency protocol gives a high-level specification of the messaging protocols, while the synchronization mechanism deals with the details of maintaining consistent state.

Commands are simple messages that convey input from a client to a server. In Quake, commands contain key presses and orientation information. Quake servers calculate the game state directly from commands. If instead we think of the Quake server as a simulator, then it is responding to events by changing the game state. We have simplified the Quake gameplay so that the following events are sufficient to initiate all game state changes.

- Move event: An avatar moves to a new position.

- Fire event: An avatar fires a rocket.

- Impact event: The rocket impacts and detonates.

- Damage event: An avatar takes damage and possibly dies.

- Spawn event: An avatar is reborn in a random part of the map.

The clients generate move, fire, and spawn events, while the mirrors generate impact and damage events during the game state computation.

A key observation is that different events have different consistency and latency requirements. Real-time events have strict timeliness and lax consistency requirements. At the other end of the spectrum, consistent events have lax timeliness and strict consistency requirements. The most challenging events are consistent real-time events, which have strict timeliness and strict consistency requirements.

Real-time events are those events that players must be informed of in real-time in order to react reflexively. The best example of a real-time event is the move event. Any delay in the reaction time of one's own avatar movement inhibits the effectiveness of player reflexes. Delays in the perceived motion of other avatars force the player to anticipate their motion. Because shooting another player requires significant hand-eye coordination, players are extremely sensitive to the timeliness of move events. If possible, real-time event latency must be less than 100 ms. Although human reflexes can act on the order of 100 ms, players cannot easily detect short-lived inaccuracies in avatar position. Therefore, real-time events can occasionally be lost or delayed without ill effect.

All events that take more than 100 ms to react to are termed consistent events. Damage events, die events, and spawn events can all be placed into this category. Unlike real-time events, these events affect the game state in permanent and noticeable ways. Therefore, consistent events must be consistently processed across all clients. For example, all clients must agree that an avatar took a hit and died. Here, consistency is far more important than timeliness, so we might be willing to wait 500 ms before the avatar keeled over.

Unfortunately, there are some events that have tight real-time constraints and require a high degree of consistency. Fire events and impact events are two examples. Clearly, all clients must agree on whether or not a rocket hit an avatar, or whether that avatar dodged, allowing the rocket to whiz by. Assume that we treated this event like the consistent events, where there could be a 500 ms delay after the event occurred before the event is reflected in the game state. Then all weapon impacts would occur after they have struck their targets. This means that rockets would continue along their trajectory for a fraction of a second, possibly passing through their target before detonating. This behavior is unacceptable, so we must treat consistent real-time events differently.
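The three event classes can be summarized as a small dispatch table; in this sketch the event names follow this section, while the handling-strategy labels are our own:

```python
# Hypothetical classification of Quake events by the three classes above.
# The "strategy" labels are illustrative, not the paper's implementation.
EVENT_CLASSES = {
    "move":   {"timeliness": "strict", "consistency": "lax",
               "strategy": "unreliable, low-latency delivery"},
    "damage": {"timeliness": "lax",    "consistency": "strict",
               "strategy": "reliable delivery, delayed execution"},
    "spawn":  {"timeliness": "lax",    "consistency": "strict",
               "strategy": "reliable delivery, delayed execution"},
    "fire":   {"timeliness": "strict", "consistency": "strict",
               "strategy": "optimistic execution with rollback"},
    "impact": {"timeliness": "strict", "consistency": "strict",
               "strategy": "optimistic execution with rollback"},
}

def handling_strategy(event_type: str) -> str:
    """Return the delivery/execution strategy for an event type."""
    return EVENT_CLASSES[event_type]["strategy"]

print(handling_strategy("fire"))  # → optimistic execution with rollback
```
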

3.3 Consistency Protocols

3.3.1 Consistency Protocol Goals

The goals of the consistency protocol follow directly from the event class definitions. Real-time events must appear in the game state with the lowest possible latency, and consistent events must be consistent between game states. The mechanisms to accomplish this should be as efficient as possible, both in bandwidth and computational load. For example, if real-time event servicing can be made more efficient than consistent real-time event servicing, the same mechanism should not be used for both.

3.3.2 Distributing Game State Computations

So far, we have not explicitly examined the issueof where the game state calculation should occur.


For this discussion, it helps to think of each ingress mirror as a sender in the peer-to-peer architecture, and the egress mirrors as the receivers. Should the ingress mirror calculate the new game state and then send a state update to the egress mirror? Or should the ingress mirror just forward the incoming command to the egress mirror and allow it to generate the new game state? As a compromise, should the ingress mirror perform some computation, send an intermediate result to the egress mirror, and then allow the egress mirror to complete the computation?

Computing the game state at the ingress mirror is a strange proposition. This has the advantage that the game state calculation need occur only once. However, if different pieces of game state are computed at different ingress mirrors without input from other mirrors, there will be no interactivity between clients on different mirrors. For example, two ingress mirrors could simultaneously decide that two different avatars occupy the same position on the map instead of coordinating to detect a collision.

The more natural solution is to have all game state calculated at the egress mirror. This approach is used in the command-based consistency protocol. In this configuration, the ingress mirror forwards commands to the egress mirrors, and each egress mirror computes its own copy of the game state. This approach requires redundant calculations, but collisions and other interactions are handled properly because the egress mirror makes decisions based on input from all the clients.

A compromise between the two approaches is to compute game state that is entirely under the control of the ingress mirror at the ingress mirror. For example, the ingress mirror could take raw front-end commands and convert them into different events (see Section 3.4). This is the key feature of the event-based consistency protocol. It could also perform initial avatar position calculations, which would then be modified by the egress mirror to account for dynamic collisions. This is the approach used in MiMaze, which does an initial position calculation at the ingress mirror, then calculates the final interactions at the egress mirror [8].

The decision on where to place the game state computation may be based on which representation consumes the least bandwidth. In Quake, the initial command is very small. After some event calculation, the generated events might be slightly larger than a command. If a significant amount of game state were calculated at the ingress mirror, the resulting game state updates could be huge.

In our implementation, we elected to use the command-based consistency protocol, where no game state computations are performed at the ingress mirror. This decision produced a much simpler implementation where most of the Quake code remained intact. Dividing the game state computation between the egress mirror and the ingress mirror would have required a complete rewrite of the server code into modular event handlers. In general, keeping all of the game state computation on one server becomes more attractive as the game becomes more complicated. Although we chose the command-based consistency protocol, we also investigate the event-based consistency protocol.

3.3.3 Command-Based Consistency Protocol

Our implementation places all of the game state computation at the egress mirror. The ingress mirror timestamps commands from the client and reliably multicasts them on to the egress mirror. Given the commands from each client and the time at which they were issued, the egress mirror can compute the correct game state. The egress mirror sends game state updates only to its directly connected clients. The command-based consistency protocol is simple and provides low-latency, reliable delivery of all commands. However, it does not distinguish between different classes of events, and it requires a many-to-many, low-latency, reliable multicast protocol.

By using a reliable multicast protocol, we ensure that all commands eventually reach the egress mirror. The egress server can compute the correct game state by executing the commands in the order in which they were issued. Because we perform optimistic execution of command packets, we must correct the game state when commands are received out of order. These corrections, or rollbacks, can cause unexpected jumps in the game state, severely degrading the user experience. Therefore, we must at all costs minimize the frequency and magnitude of rollbacks. Informally, we have found that rollbacks of less than 100 ms are unnoticeable. The details of the trailing state synchronization mechanism are presented in Section 3.4.

The main feature of the command-based consistency protocol is simplicity. Our command-based protocol requires minimal modifications to the server when the protocol is added to an existing game. To the existing server back-end running on the egress mirror, commands from any of the ingress mirrors can appear to have come from a client attached directly to the local mirror. On the ingress mirror, the changes are minimal as well. Instead of calling the local back-end to execute a command after pulling it up from the client, the mirror multicasts it to the other mirrors and then later processes it like every other command. Other consistency protocols require at least partial processing at the ingress mirror, and changes to the back-end to accept partially processed commands instead of raw commands. When a game is not designed with distribution in mind, this is an important feature.

Aside from simplicity, the command-based consistency protocol has several secondary advantages. First, it achieves consistency without the need for a third-party trusted server. Second, it should result in reasonably efficient bandwidth usage by using multicast and sending only small command packets. Third, it will achieve the lowest possible latency, because commands are sent directly from mirror to mirror across a private network. The weakness of this protocol is that it requires low-latency many-to-many multicast, which is a difficult problem. Our reliable multicast protocol, CRIMP, is covered in Chapter 4.
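A minimal sketch of the command-based data path follows; the class and method names are ours, the reliable multicast is reduced to a direct function call, and rollback is omitted:

```python
import heapq

class IngressMirror:
    """Stamps client commands on arrival and forwards them to all mirrors."""
    def __init__(self, clock, multicast):
        self.clock = clock          # function returning current wall-clock time (ms)
        self.multicast = multicast  # stand-in for the reliable multicast send

    def on_client_command(self, client_id, command):
        # Timestamp at arrival; egress mirrors order commands by this stamp.
        self.multicast((self.clock(), client_id, command))

class EgressMirror:
    """Buffers timestamped commands and executes them in issue order."""
    def __init__(self):
        self.pending = []   # min-heap ordered by timestamp
        self.executed = []

    def on_multicast(self, stamped_command):
        heapq.heappush(self.pending, stamped_command)

    def advance(self):
        # Execute everything buffered so far, oldest first.  A real egress
        # mirror would also roll back if a late command arrives out of order.
        while self.pending:
            self.executed.append(heapq.heappop(self.pending))

# Example: one ingress mirror feeding one egress mirror.
egress = EgressMirror()
times = iter([120, 100])
m1 = IngressMirror(clock=lambda: next(times), multicast=egress.on_multicast)
m1.on_client_command("c1", "FIRE")   # stamped t=120, arrives first
m1.on_client_command("c2", "MOVE")   # stamped t=100, arrives second
egress.advance()
print([c for (_, _, c) in egress.executed])  # → ['MOVE', 'FIRE']
```

Even though the FIRE command arrived first, the egress mirror executes the commands in issue order, which is the property the timestamps exist to provide.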

3.3.4 Event-Based Consistency Protocol

Previously, we stated that the ingress mirror can process incoming commands in order to generate different events. Real-time events, consistent events, and consistent real-time events can then each be handled in a different way. In this section, we present a consistency protocol that does not require low-latency many-to-many multicast. Instead, it relies on an uberserver and a standard (not low-latency) one-to-many reliable multicast protocol.

In this protocol, a single uberserver is responsible for maintaining the authoritative game state, while each egress mirror maintains a local, possibly inconsistent game state. The uberserver is responsible for informing the egress mirrors of consistent events. If an egress mirror disagrees with the consistent event, it must perform a rollback to the correct state.

The ingress mirror generates move events in response to move commands. Remember that move events require low latency but not reliability. Therefore, move events can be multicast to other egress mirrors using unreliable multicast. However, in order to ensure approximately correct game state at the uberserver, move events are unicast reliably from the ingress mirror to the uberserver. There are no consistency checks for move events.

Consistent events are generated solely by the uberserver, because the uberserver is the only server with the correct positional information. For example, the uberserver is the only one that can determine exactly how much damage an avatar took, because it is the only server that knows exactly how far an avatar was from the blast. The uberserver sends consistent events to the egress mirrors using reliable multicast. On reception of a consistent event, the egress mirror makes the necessary changes in the game state. Note that the reliable multicast protocol could take on the order of a second to deliver the consistent event.

The most challenging events are consistent real-time events. The only way to deal with such events is to process them optimistically, then perform a rollback if the event was processed out of order. A mechanism for performing efficient rollbacks is presented in Section 3.4. The ingress mirror generates fire events in response to fire commands from the client. However, the ingress mirror does not know the exact position and orientation of the shooter, so it cannot generate an authoritative fire event. The non-authoritative fire event is multicast unreliably to the egress mirrors and processed as if it were the correct fire event. It is also forwarded reliably to the uberserver, which generates an authoritative fire event. This authoritative fire event is multicast reliably to the egress mirrors. If the authoritative and non-authoritative fire events differ, the egress mirror performs a rollback using the trailing state synchronization mechanism.
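The provisional/authoritative fire-event flow can be sketched as follows; the class names are ours, and network delivery is reduced to direct method calls:

```python
class FireEventMirror:
    """Optimistically applies provisional fire events, rolls back on mismatch."""
    def __init__(self):
        self.applied = {}    # event_id -> fire event as applied to local state
        self.rollbacks = 0

    def on_provisional(self, event_id, event):
        # Non-authoritative event from the ingress mirror: execute optimistically.
        self.applied[event_id] = event

    def on_authoritative(self, event_id, event):
        # Authoritative event from the uberserver: if it differs from what we
        # optimistically applied, roll back and adopt the correct event.
        if self.applied.get(event_id) != event:
            self.rollbacks += 1
            self.applied[event_id] = event

# The ingress mirror guesses the shooter's position; the uberserver corrects it.
mirror = FireEventMirror()
mirror.on_provisional(1, {"shooter": "c1", "pos": (10, 20)})
mirror.on_authoritative(1, {"shooter": "c1", "pos": (11, 20)})  # differs: rollback
mirror.on_provisional(2, {"shooter": "c2", "pos": (5, 5)})
mirror.on_authoritative(2, {"shooter": "c2", "pos": (5, 5)})    # matches: no rollback
print(mirror.rollbacks)  # → 1
```
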

Although the event-based consistency protocol is probably a superior solution, it is much more complicated than the command-based consistency protocol. Implementing the event-based consistency protocol would have meant ripping apart the very poorly written Quake code into event handlers. We must leave this ordeal for someone with more time on their hands.

3.4 Synchronization Mechanisms

In order for each mirror to have a consistent view of the game state, we need some mechanism to guarantee a global ordering of events. This can either be done by preventing misordering outright, or by having mechanisms in place to detect and correct misorderings. At the same time, if we are unable to maintain ordering within a reasonable delay, no one will be willing to use the game servers. Similar problems exist in both military simulations and simpler games, and a number of synchronization algorithms are used in these cases [7]. However, none of these algorithms work well in a fast-paced multiplayer game such as Quake. Unlike other uses of synchronization, the events in Quake are not directly generated by other events; they are generated by user commands. Therefore, many of the concepts in these other algorithms, such as event horizons, fixed buckets, and rounds of messages, are not applicable. Likewise, unlike less complicated games, it is not sufficient to be mostly consistent and dead reckon the rest. The speed at which game state changes causes these types of errors to quickly multiply and lead to large divergences.

3.4.1 Conservative Algorithms

Lock-step simulation-time driven synchronization [7] is by far the simplest, and by far the least suitable for a real-time game. No member is allowed to advance its simulation clock until all other members have acknowledged that they are done with computation for the current time period. This takes the first approach, preventing inconsistencies from ever being generated. In this system, it is impossible for inconsistencies to occur, since no member performs calculations until it is sure it has the exact same information as everyone else. Unfortunately, this scheme also means that it is impossible to guarantee any relationship with real wall-clock time. In other words, the simulation does not advance in a clocked manner, much less in real time. In an interactive situation such as a multiplayer game, this is not an acceptable tradeoff. There are a number of similar algorithms and variants, such as fixed time-bucket synchronization, which still suffer from the same problem of simulation time and wall-clock time having no real correlation.

Chandy-Misra synchronization does not require the coordinated incrementing of simulation time like the previous two algorithms. Instead, each member is allowed to advance as soon as it has heard from every other member it is interacting with. Additionally, it requires that messages from each client arrive in order. This requires the concept of a round of messages to be of any use. In a mirrored game server architecture, different servers will have different event generation rates depending on the number of connected clients. This scheme limits execution to the rate of the slowest member. Additionally, the times at which these events are generated will have little relation to the generation of earlier events.

3.4.2 Optimistic Algorithms

The two types of algorithms above have taken a cautious approach to synchronization. There are also several algorithms which execute events optimistically, before they know for sure that no earlier events could arrive, and then repair inconsistencies if they are wrong. These types of algorithms are far better suited for interactive situations.

Time Warp synchronization works by taking a snapshot of the state at each execution, and rolling back to an earlier state if an event earlier than the last executed event is ever received. Periodically, all members reset the oldest time at which an event can be outstanding, thereby limiting the number of snapshots needed. On a rollback, the state is first restored to the snapshot, and then all events between the snapshot and the execution time are re-executed. Additionally, since, like the other algorithms, Time Warp assumes that events directly generate new events, anti-messages are sent out as part of the rollback to cancel any now-invalid events (which in turn trigger other rollbacks). The big problem with Time Warp in a game such as Quake is the requirement to checkpoint at every message. A Quake context consumes about one megabyte of memory, and new messages arrive at a rate of one every thirty milliseconds for each client. Additionally, copying a context involves not just the memory copy but also repairing linked lists and other dynamic structures. The other problem of Time Warp, anti-message explosion, is not as important, since the Mirrored-Servers do not directly generate new events in response.
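To see why per-message checkpointing is prohibitive, multiply the figures above out; a back-of-the-envelope sketch, where the 8-client count is an assumed example:

```python
# Back-of-the-envelope Time Warp snapshot cost for Quake, using the
# paper's figures: ~1 MB per context snapshot, one command per 30 ms
# per client.  The 8-client count is an assumed example.
SNAPSHOT_MB = 1.0
COMMAND_INTERVAL_S = 0.030
clients = 8

snapshots_per_second = clients / COMMAND_INTERVAL_S       # one snapshot per command
copy_rate_mb_per_s = snapshots_per_second * SNAPSHOT_MB   # raw memory-copy traffic

print(round(snapshots_per_second), round(copy_rate_mb_per_s))  # → 267 267
```

Even this modest configuration implies hundreds of megabytes of state copying per second, before accounting for the cost of repairing linked lists and other dynamic structures.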

There is additionally a class of algorithms which are "breathing" variations on the above algorithms. Instead of fully optimistic execution, breathing algorithms limit their optimism to events within an event horizon. Events beyond the horizon cannot be guaranteed to be consistent, and are therefore not executed. Since in Quake events do not directly generate new events, this concept does not work.

The algorithm implemented in MiMaze [8] is an optimistic version of bucket synchronization. Events are delayed before being executed for a time which should be long enough to prevent misorderings. If events are lost or delayed, however, MiMaze does not detect the inconsistency or attempt to recover in any way. If no events from a member are available at a particular bucket, the previous bucket's event is dead reckoned; if multiple events are available, only the most recent is used. Late events are just scheduled for the next available bucket but, because only one event per bucket is executed, are not likely to be used. For a simple game such as MiMaze, these optimizations at the cost of consistency are claimed to be acceptable. For a much faster-paced multiplayer game, though, a higher level of consistency is required.

3.5 Trailing State Synchronization

As described above, none of the existing distributed game or military simulation synchronization algorithms are well suited to a game such as Quake. Our solution to this problem is trailing state synchronization (TSS). Similar to Time Warp, TSS is an optimistic algorithm, executing rollbacks when necessary. However, it does not suffer from the high memory and processor overheads of constant snapshots. TSS also borrows several ideas from MiMaze's bucket synchronization algorithm. Instead of executing events immediately, synchronization delays are added to allow events to reorder. The end result is an algorithm which has the strong consistency requirements needed by a rapidly changing game without the excessive overhead of numerous snapshots of game state. Because TSS was designed with our mirrored game server in mind, we refer to commands being synchronized, as opposed to the events of the last section. This is merely semantic and has no bearing on the operation of TSS.

In order to recover from an inconsistency, there must be some way to "undo" any commands which should not have been executed. The easiest way to handle this problem is to "roll back" the game state to when the inconsistency occurred, and then re-execute any following commands. The difficult part is where to get the state to roll back from. Since any command could be out of order and there are hundreds of commands generated every second, the number of states needed quickly grows out of hand. Even if a limit is placed on how old a command can be and still be successfully recovered (creating a window of synchronization), to maintain reasonable consistency this limit will probably still be larger than the number of snapshots we would like to keep.

Instead of keeping snapshots at every command, TSS keeps multiple constantly updating copies of the same game, each at a different simulation time. If the leading state (which is the one that actually communicates to clients) is executing a command at simulation time t, then the first trailing state will be executing commands up to simulation time t − d1, the second trailing state commands up to t − d2, and so on. This way, only one snapshot's worth of memory is required for each trailing state, reducing and bounding the memory requirements.

TSS is able to provide consistency because each trailing state sees fewer misordered commands than the one before it. The leading state executes with little or no synchronization delay d0. The synchronization delay is defined as the difference between wall-clock time and simulation time, and is used to allow commands to be reordered properly at the synchronizers before execution. If a command is stamped with wall-clock time a at the ingress mirror, then it cannot be executed until wall-clock time a + d, even on the same ingress mirror. With a delay of zero, the leading state provides the fastest updates to its clients (preserving the "twitch" nature of the game) but is also very frequently incorrect in its execution. The first trailing state waits longer before executing commands, and is therefore less likely to create inconsistencies but also less responsive to clients. This continues, forming a chain of synchronizers, each with a slightly longer synchronization delay than its predecessor.

In order to detect inconsistencies, each synchronizer looks at the changes in game state that the execution of a command produced, and compares them with the changes recorded at the directly preceding state. In Quake, there are two basic classes of events that a command can generate. The first type we refer to as weakly consistent, and it consists of move commands. With these events, it is not essential that the

Figure 3.1: Trailing State Synchronization Execution. (Timeline of the leading state and two trailing states; execution time is real time minus each state's synchronization delay. Command A: MOVE, issued at t=150 ms, on time. Command B: FIRE, issued at t=200 ms, received when the leading state is at t=225 ms.)

same move happened at the same time as much as that the position of the player in question is within some small margin of error in both states. The other class is strictly consistent; for these events (such as missiles being fired) it is important that both states agree on exactly when and where the event occurred.
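The two checks might be sketched like this (the predicate names and the position tolerance are illustrative assumptions, not values from the paper):

```python
import math

POSITION_TOLERANCE = 1.0   # margin of error for weakly consistent events (assumed value)

def moves_consistent(pos_a, pos_b, tol=POSITION_TOLERANCE):
    """Weakly consistent check: the player's position in the two states
    must agree within a small margin of error `tol`."""
    return math.dist(pos_a, pos_b) <= tol

def fires_consistent(fire_a, fire_b):
    """Strictly consistent check: both states must agree on exactly when
    and where the event occurred (here: player, time, position)."""
    return fire_a == fire_b
```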

If an inconsistency is discovered, a rollback from the trailing state to the leading state is performed. This consists of copying the game state as well as adding any commands after the rollback time back to the leading state's pending list. The next time that state executes, it will calculate its execution time as t - d and execute all the commands again. The difference in delays between states determines how drastic the rollback will be. Additionally, the rollback of one state may cause, upon its re-execution, inconsistencies to be detected in its leading state. In this fashion, any inconsistencies that a trailing state shares with its leading state will be corrected. The last synchronizer has no trailing state to synchronize it, and therefore any inconsistencies there will go undetected. However, if it is assumed that the longest delay, as in the other bounded optimistic algorithms,

is large compared to expected command transit delays, this is unlikely to occur.
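A rollback along these lines might look like the following sketch, with the game context reduced to a small dictionary (the real system copies an entire Quake context):

```python
import copy

def rollback(leading, trailing, rollback_time):
    """Copy the trailing state's game data onto the leading state and
    mark every command after `rollback_time` as pending again."""
    # 1. Overwrite the leading game state with the older, correct one.
    leading["game"] = copy.deepcopy(trailing["game"])
    leading["sim_time"] = trailing["sim_time"]
    # 2. Commands executed after the rollback point must be re-executed.
    redo = [c for c in leading["executed"] if c[0] > rollback_time]
    leading["executed"] = [c for c in leading["executed"] if c[0] <= rollback_time]
    leading["pending"] = sorted(leading["pending"] + redo)

# Hypothetical states: the leading state diverged after sim_time 200.
leading  = {"game": {"frags": 7}, "sim_time": 300,
            "executed": [(150, "MOVE"), (250, "MOVE")], "pending": []}
trailing = {"game": {"frags": 5}, "sim_time": 200,
            "executed": [(150, "MOVE")], "pending": [(250, "MOVE")]}
rollback(leading, trailing, 200)
```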

3.5.1 An Example of TSS

Figures 3.1 and 3.2 depict a simple example of TSS. There are three states in the example, with delays of 0 ms, 100 ms and 200 ms. Two commands are highlighted. Command A is a MOVE, issued (locally) at t = 150 and executed immediately in the leading state. At time t = 250, the first trailing state reaches a simulation time of 150 and executes command A. Since A was on time, its execution matches the leading state's and no inconsistency occurs. Similarly, at time t = 350, the final trailing state reaches simulation time 150 and executes command A. It too finds no inconsistency, and no one is left to check it (this is a contrived example; in practice one would likely want a longer delay than 200 ms on the last synchronizer). Command B is a FIRE event, issued at time t = 200 on a remote mirror. By the time it arrives, the real time is t = 225. The command is executed immediately in the leading state and placed in the proper position in the other two states, since they are


at simulation times 100 and 0. At time t = 300, the first trailing state executes B. When it compares its results with the leading state's, it is unable to find a FIRE event from the same player at time 200, and signals the need for a rollback. Figure 3.2 zooms in on the recovery procedure. The game state of the trailing state is copied to the leading state, which places it at time 200. The leading state then marks all commands after time 200 as unexecuted and re-executes them up to the current t. This example highlights one of the features of TSS: it is possible that there were other inconsistencies in the gap between times 200 and 300 (a burst of congestion, perhaps). The recovery of the first inconsistency at time 200 in effect canceled any future recoveries in this window.

3.5.2 Analysis of TSS

Although similar to many other synchronization algorithms, TSS has key differences from each of them. It is clearly very different from any of the conservative algorithms, since its scheduling of execution is based on synchronization delay and not on when execution is safe. It is also clearly different from MiMaze's bucket synchronization, since it provides absolute synchronization for events delayed no longer than the longest synchronization delay; MiMaze, on the other hand, does not really detect, let alone recover from, any inconsistencies it may cause. TSS and Time Warp both execute commands as soon as they arrive. They differ, however, in their methods of recovering from inconsistencies. TSS is a little more optimistic than Time Warp in that it does not keep a snapshot of the state before executing every command so that it can recover as soon as a late command arrives. Instead, it catches inconsistencies by detecting when the leading state and the correct state diverge, and corrects at that point. It is possible, especially with weakly consistent events, that commands executed out of order will not cause an inconsistency at all.

TSS will perform best in comparison to other synchronization algorithms when three conditions hold: the game state is large and expensive to snapshot, the gap between states' delays is small and easy to repair, and event processing is cheap. The first is definitely present in Quake, with one megabyte of data per context. The second is a parameter of TSS, and it is therefore possible to tweak the number and delays of the synchronizers (see Section 3.5.3). The cost of processing each command is not immediately apparent, and we must determine it experimentally. A preliminary study of the performance of TSS appears in Section 3.6.

3.5.3 Choosing Synchronization Delays

Picking the correct number of synchronizers and the synchronization delay for each is critical to the optimal performance of TSS. If too few synchronizers are used, then to provide a large enough window of synchronization the gaps between synchronizers must necessarily be large. This leads to greater delay before an inconsistency is detected, and to more drastic and noticeable rollbacks. Conversely, if too many synchronizers are used, the memory savings provided by TSS are eliminated. Additionally, rollbacks will likely be more expensive, since a longer cascading rollback is needed before reaching the trailing state. Given information on the network and server parameters (delay distribution, message frequency, the cost of executing a message and the cost of a context copy), it should be possible to build a model to calculate how many synchronizers should be used and with what delays. Due to time constraints, this aspect of TSS has not been fully explored.
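As a starting point for such a model, one can estimate how many commands arrive too late for each state under an assumed delay distribution. Everything below (the exponential delay model and all parameter values) is our own illustration, not a result from the paper:

```python
import math

def late_probability(delay_ms, mean_transit_ms):
    """P(transit delay > d_i) under an assumed exponential delay model."""
    return math.exp(-delay_ms / mean_transit_ms)

def expected_rollbacks_per_sec(delays, msg_rate, mean_transit_ms):
    """Rough expected rollback rate at each trailing state: a rollback at
    state i is triggered by commands that arrive too late for state i-1
    but in time for state i."""
    rates = []
    for prev, cur in zip(delays, delays[1:]):
        p = (late_probability(prev, mean_transit_ms)
             - late_probability(cur, mean_transit_ms))
        rates.append(msg_rate * p)
    return rates

# e.g. 100 commands/s, 40 ms mean transit delay, delay ladder 0/50/100 ms
rates = expected_rollbacks_per_sec([0, 50, 100], 100, 40)
```

A model like this could be compared across candidate delay ladders, weighting each rollback by the measured rollback cost.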

3.5.4 Implementation

The first step in implementing TSS, or any optimistic synchronization algorithm for that matter, in


Figure 3.2: Trailing State Synchronization Rollback. (The leading state pre-recovery, Trailing State #1, and the leading state post-recovery; the trailing state's game state is copied onto the leading state. Command B: FIRE, issued at t=200 ms, received when the leading state is at t=225 ms.)

Quake was to alter the server so that all game state data was within a context, and to create functions capable of copying one context onto another. Although Quake uses almost no dynamic memory for performance reasons, it makes numerous very ugly hacks to its static structures, for the same reason, to avoid extraneous dereferencing of pointers. Additionally, the actual gameplay within the server is not done in C code, but in a primitive interpreted byte-code language known as QuakeC. It was because of the amount of work done within QuakeC that we decided to simplify the Quake game to a single weapon and level.

Once the Quake server had been altered to use contexts, it was fairly simple to add synchronization. Instead of executing client packets immediately, they are intercepted and sent out over the multicast channel. Upon receiving a multicast command, it is inserted back into Quake's network buffer and parsed by Quake's normal routines. All commands other than moves (which also include fires) are executed normally in all contexts, while the moves are inserted into the synchronizers. Every time through the main event loop, each of the synchronizers is checked, any pending commands that are due are executed, and inconsistencies are checked for.

In addition to the mirroring and synchronization, we also added a trace feature to the server. This logs to a file every command sent to the multicast channel and allows games to be replayed exactly in the future for deterministic simulations.

3.6 Experimental Results

To test the performance of TSS, we ran a series of simulations using the trace feature described above with different network and synchronization parameters. Unfortunately, because the QuakeC code still exhibits randomness in certain situations, inconsistencies would occasionally occur even with a single server and two states. Because of this, a detailed and accurate study of the behavior of TSS with different configurations and different network conditions was not possible.

From the experiments we were still able to gather several useful results. The results in Table 3.1 show seven runs, all using the same trace file with three users connected to each of two servers. The statistics were gathered at mirror one, which saw 18593 commands from local clients and the multicast group.


Synchronization   Executed   Execution   Rollbacks   Rollback    Total       Command   Rollback
Delays (ms)       Commands   Time (s)                Time (s)    Time (s)    Cost      Cost

0,50                 40780    6.135408        817     1.148895    8.953207   .15 ms    1.41 ms
0,100                45401    6.369374        870     1.226166    9.317506   .14 ms    1.41 ms
0,50,100             59981    9.024021        938     1.315154   12.296119   .15 ms    1.40 ms
0,100,1000          331687   26.195788       6687    10.105350   43.772133   .08 ms    1.51 ms
0,50,100,150         79357   12.144979       1092     1.534347   15.904039   .15 ms    1.41 ms
0,50,100,500         99730   13.261478       2370     3.361073   19.375112   .13 ms    1.42 ms
0,50,500,1000       251044   23.223288       6995    10.513721   39.266619   .09 ms    1.50 ms

Table 3.1: Trailing State Synchronization Execution Times

For each configuration of TSS, the table lists the seconds spent executing commands (including re-execution of commands due to rollbacks), the seconds spent performing rollbacks (copying contexts and moving commands back to the pending list), and the number of occurrences of each of these events. These are then used to calculate the per-event cost of executing a command or performing a rollback. Across all the configurations these costs were nearly identical, with command execution being an order of magnitude less expensive than rollback. This supports the third condition for TSS to be advantageous, that event processing be inexpensive. For every command executed, TSS does an order of magnitude better than Time Warp or other similar optimistic algorithms.
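For example, the per-event costs in the first row of Table 3.1 fall out of the raw counts directly:

```python
# First row of Table 3.1: synchronization delays 0,50
executed_commands = 40780
execution_time_s  = 6.135408
rollbacks         = 817
rollback_time_s   = 1.148895

# Per-event costs, converted from seconds to milliseconds.
command_cost_ms  = execution_time_s / executed_commands * 1000
rollback_cost_ms = rollback_time_s / rollbacks * 1000

# command_cost_ms is about 0.15 ms and rollback_cost_ms about 1.41 ms,
# matching the table: a rollback costs roughly ten times a command.
```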

3.7 Future Work

As discussed above, a more thorough examination of the parameter space for synchronization delays is needed. Additionally, taking the idea of weakly consistent events one step further, it would be interesting to look at the effect of correcting positional differences between states instead of performing a full rollback. While we were limited in what we could implement by Quake's existing design, in another situation the event-based consistency protocol might be implemented and studied. Finally, even though it is entirely subjective, a better user study is needed to determine what delay and rollback rates are unnoticeable, which are noticeable but playable, and at what point the game becomes unplayable, in order to establish the true usefulness of mirrored multiplayer game servers.


Chapter 4

Reliable Multicast

4.1 Introduction

As stated in earlier sections, our architecture requires a reliable many-to-many delivery method between mirrors. In this chapter, we investigate the requirements of such a delivery method and provide a design, the Clocked Reliable Interactive Multicast Protocol (CRIMP), that meets them.

This chapter is organized as follows. Section 4.2 describes the requirements and desired features of our delivery method. In Section 4.3 we discuss other designs in the literature that informed our design decisions. Section 4.4 provides a description of our architecture. In Section 4.6 we provide some evaluation and analysis of CRIMP. We close the chapter with future work.

4.2 Requirements and Simplifications

To provide perfect consistency, our architecture requires that all packets be delivered to each mirror. Each server provides consistency by executing the commands sent between mirrors. If packets are dropped, the servers' states could diverge, in essence no longer being consistent.

The mirrored architecture, from a quality-of-play point of view, also requires that packets be delivered

with low latency. If a long-delay synchronizer plays back a packet that an earlier synchronizer did not, the mirror is forced to roll back. Our architecture evaluation shows why this circumstance should be avoided. If possible, a reliable delivery method designed for our architecture should minimize what we call perceived latency. We define perceived latency as the difference between the time ts at which a packet is originally sent by the sender and the time tr at which a receiver receives it.

Since the consistency layer of the architecture does its own ordering of packets, packet ordering is not required of the delivery layer. Instead, the synchronizer insists that packets be passed up to the application layer as soon as they arrive.

As described in [12], most packet loss is due to congestion, and the delivery layer must react carefully to packet loss. The delivery layer should avoid flooding the network when a loss is detected, to avoid deteriorating network conditions further.

The delivery layer can use the abstraction of an uberserver. This is an entity on the delivery channel which can provide authority on any issues in the channel.

Since the mirrors in the channel forward client commands to the many-to-many delivery channel, the delivery rate is determined by the clients, not by the mirrors. Congestion control must thus be done


with the cooperation of the clients. Since we are unable to modify the Quake client, congestion control is outside the scope of our design.

Since each mirror has a finite number of synchronizers, once a packet misses its playback point in the last synchronizer, it can no longer be used. Thus, each packet has a finite lifetime in the system.

4.3 Discussion

Reliable many-to-many communication is not a new concept. In this section, we discuss which existing ideas we can borrow and which we must discount for our delivery layer.

4.3.1 Unicast

Naive Unicast

The most basic many-to-many method of providing reliability is a full mesh of connections between the entities on the channel. Using this method, a sender has a connection to each of the receivers. Either the sender or the receiver detects losses and prompts the sender to retransmit lost packets. This method may work well on a small scale, but as noted in [20], it causes unnecessarily high link usage, especially at the sender, which must send individual messages to each receiver. In addition, all recovery messages must come from the sender, so the time to recovery is bounded by the longest RTT. Because of these problems, we discounted naive unicast as infeasible.

End-Host Multicast

Another method of providing many-to-many reliable delivery is end-host multicast [20], whereby entities unicast messages to each other based on some global topology. Reliability

is enforced between neighbors in this method, generally in the transport layer. However, as the authors concede, latency is generally higher than with other many-to-many methods such as IP-Multicast [20]: packets must propagate through the topology until they reach every host. End-host multicast is quite scalable, yet it most likely would not meet the latency requirements of our architecture.

4.3.2 IP-Multicast

To avoid the propagation delay of end-host multicast, we can use IP-Multicast to send efficiently to many hosts. However, since IP-Multicast is built at the IP layer, to date no standard reliable transport protocol is available for it. Nonetheless, research has been done on making IP-Multicast scalably reliable [11, 18, 4]. For simplicity, we divide these efforts into two categories: topology-based and receiver-based.

Topology-Based Reliability

In this category of reliable IP-Multicast, receivers are grouped together in trees, rings or some other topology [11]. Reliability is ensured by having receivers aggregate their acknowledgments and send them back to the sender, which retransmits. Local recovery is an enhancement of this, whereby members of an aggregate group fill requests for retransmission among themselves [11]. The authors of [18, 11] show this to be a very scalable approach to reliability over IP-Multicast. However, the scalability comes at the cost of latency, as retransmission must either wait for NAKs to propagate up from a receiver through the topology, or use ACKs and RTO timers, which are bounded by RTTs.


Receiver-Based Reliability

This category of reliability, as described in [4], uses receiver NAKs to drive sender retransmission. A receiver detects losses and sends a recovery request back to the multicast group. Any host that stored the packet can respond to the request with the data. One implementation of this, SRM, sets RTT-based timers to ensure that the multicast channel is not flooded with requests. This method is less scalable than topology-based reliability, in that all entities must keep state about all other entities on the channel. However, the timers ensure that network utilization is kept to a minimum. The authors of [12] show that in the ideal case, SRM provides the lowest-latency recovery among known solutions. Observe that topology-based reliability is actually a subset of receiver-based reliability.

4.4 CRIMP Design

Using the methods of reliable many-to-many delivery discussed above, we designed a reliable multicast layer that conforms to the requirements of the architecture. We also use the simplifications the architecture affords us to further enhance the design. In this section, we describe the design decisions behind CRIMP, our multicast library.

4.4.1 Design Overview

Since we required low-latency recovery, we chose the receiver-based reliable multicast approach described above. Our implementation borrows heavily from SRM [4]. The major criticisms [11] of this approach are the scalability of the state kept by the senders, the requirement of infinite storage for packets, and the session messages needed to avoid missing the ends of packet sequences. Since the senders in our architecture are limited in number, keeping state for them is not unreasonable.

Since packets have a limited lifetime, infinite storage is also not a problem. Packets are part of an unending stream, and thus session messages are unnecessary. Since the simplifications afforded by the architecture cancel these issues, we felt that a receiver-based approach with slight augmentations could provide us with a scalable, lower-latency solution. In the following subsections, we discuss the details of the multicast layer and the augmentations we made to provide potentially lower latency.

4.4.2 Bootstrapping

In CRIMP, since we limit the number of servers and each server needs a unique identity, we require that new servers establish themselves before sending game messages to the multicast channel. The unique identity is required because packets are referenced, as in SRM, by the tuple {server identifier, sequence number}. Using the uberserver abstraction, we defined a bootstrapping sequence for establishing a mirror in the group.

1. When a mirror wishes to join the game, it sends a request-join control packet to the multicast channel.

2. The uberserver responds to the mirror on the multicast channel with either an establishment message or a denial message. The establishment message contains the mirror's identifier.

3. Upon establishment, the mirror requests the game-state from the uberserver.

4. The uberserver replies via unicast TCP with the game-state.

If the messages in steps 1 or 2 are lost, the mirror will retransmit its request based on an exponentially


backed-off RTO. These control messages are sent on the multicast channel so other mirrors can add the new mirror to their server tables. The uberserver, rather than another mirror, replies with the game-state, since the game state at the uberserver is considered to be the reference state. This is done via TCP to simplify the reliable transmission of a large chunk of data. The mirror captures packets on the multicast channel while the transfer of game-state is occurring, and applies the changes once it finishes. Once the changes are applied, the mirror can become an active participant in the game.¹
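The retransmission schedule for the join request might be sketched as follows (the base timeout and cap are assumed values; the paper does not give concrete numbers):

```python
def join_rto_schedule(base_rto_ms=500, max_rto_ms=8000, attempts=6):
    """Exponentially backed-off retransmission timeouts for the
    request-join control packet (base and cap are assumed values)."""
    rtos = []
    rto = base_rto_ms
    for _ in range(attempts):
        rtos.append(rto)
        rto = min(rto * 2, max_rto_ms)   # double, but never exceed the cap
    return rtos
```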

4.4.3 Loss Detection

Since packets are sent by each mirror at a relatively high rate, at least one packet per 30 ms, loss can be detected quite quickly by counting the packets received after the next expected sequence number. This scheme, which we borrow from TCP's duplicate ACKs, we call N-out-of-order, where N corresponds to the number of packets received after the expected one. If N is chosen too low, packets will be falsely assumed to be lost. If N is chosen too high, losses will be discovered too late.
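A sketch of N-out-of-order detection over a stream of sequence numbers (the implementation details are our own):

```python
def detect_losses(received_seqs, n):
    """N-out-of-order loss detection: sequence s is declared lost once
    n packets with higher sequence numbers have arrived without it."""
    seen = set()
    highest = -1
    lost = set()
    for seq in received_seqs:
        seen.add(seq)
        highest = max(highest, seq)
        # Check every sequence number below the highest seen so far.
        for s in range(highest):
            if s not in seen and s not in lost:
                later = sum(1 for r in seen if r > s)
                if later >= n:
                    lost.add(s)
    return lost
```

With N = 3, a missing packet is declared lost only after three later packets arrive, which at one packet per 30 ms bounds detection time to roughly 90 ms plus jitter.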

4.4.4 Recovering from Loss

The loss recovery algorithm we use is very similar to SRM's. As in SRM, when a lost packet is detected, the receiver schedules a recovery message to the multicast channel on the interval [C1*d(S,A), (C1+C2)*d(S,A)], where, as in [4], d(S,A) is the approximate one-way delay from the source server S to the receiving server A, as estimated by A. C1 and C2 are parameters that tune how many duplicate recovery messages are sent versus the time waited. In our case we set C1 and C2 to 1.5 and 1, somewhat more aggressively than in SRM, to reduce latency. As in SRM, if the receiver receives either the packet or a recovery request for the same packet from a different server before its timer fires, nothing is done; otherwise, the receiver sends a recovery request for the packet to the multicast channel. Our enhancement to this recovery algorithm is to have the receiver send an immediate NAK with probability 1/n, where n is the number of mirrors. By doing this, we expect to always have one immediate NAK when no mirror receives the packet. This should add a negligible amount of traffic: in the worst case, only one extra packet is expected.

¹Since the focus of the paper is on asymptotic performance and not startup cost, for most of our studies we implemented a simplified uberserver-mirror transaction, having the mirror block until the uberserver registered all games. This allowed for a more deterministic study of both the network layer and the overall architecture.

To repair the lost packet, any server that receives the recovery request and has the packet stored becomes a candidate for repairing it. As in SRM, the candidate schedules a repair packet randomly on the uniform interval [D1*d(A,C), (D1+D2)*d(A,C)], where d(A,C) is the approximate one-way delay from the server A that sent the recovery request to the repair candidate C, as estimated by C. D1 and D2 are parameters that tune how many duplicate repairs are sent versus the delay in receiving a repair. Ideally, these parameters yield a single low-latency repair packet. We choose D1 = 2 and D2 = 1, as in SRM. If a repair for the packet is seen before the timer expires, no repair is sent; otherwise, a repair is sent when the timer elapses. We enhance this further by having the original sender respond to recovery requests immediately. This provides a low-latency repair when all receivers, or the set of receivers close to the sender, miss the packet, and it adds only the cost of one packet per recovery request to the multicast channel.
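The two timers, together with the immediate-NAK enhancement, might be sketched as follows (the parameter values are the ones chosen above; the function names are our own):

```python
import random

# SRM-style timer parameters from the text.
C1, C2 = 1.5, 1.0   # recovery-request timer
D1, D2 = 2.0, 1.0   # repair timer

def recovery_request_delay(d_sa, n_mirrors, rng=random):
    """Delay before a receiver multicasts a recovery request for a lost
    packet.  With probability 1/n the NAK goes out immediately, so a
    packet missed by every mirror still gets one expected immediate NAK."""
    if rng.random() < 1.0 / n_mirrors:
        return 0.0
    return rng.uniform(C1 * d_sa, (C1 + C2) * d_sa)

def repair_delay(d_ac, rng=random):
    """Delay before a repair candidate answers a recovery request."""
    return rng.uniform(D1 * d_ac, (D1 + D2) * d_ac)
```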

To avoid responding to duplicate recovery requests for the same lost packet, every candidate server which sees a


repair message will ignore recovery requests for that packet for a time E*d(S,C). E is a parameter that tunes how many duplicate repairs are sent for the same recovery request against how often a fresh recovery request will be ignored. We chose E = 3, as in SRM. We believe this gives a good balance between ignoring duplicate messages and replying to valid ones.

A server which either sends a recovery request or witnesses one from a different server must wait for a repair to return. It is possible to lose either the recovery message or the repair, so a retransmission timeout is set by the waiting server. This RTO is F*d(S,A) and is started when the recovery request is sent or witnessed. Ideally, a repair should return in about one RTT to the sender plus the wait time needed to avoid duplicate repairs. If the timer expires, the server restarts the recovery process. We chose F = 2, corresponding to about one RTT.

4.4.5 Canceling Recovery

Since packets are only useful for a finite slice of time, a server can stop trying to recover a lost packet after a certain amount of time. This is facilitated through the clocking built into CRIMP. Packets and first recovery attempts are stored in a data structure called a CRIMP interval. CRIMP intervals act as a history of what has occurred on the multicast channel. When a new packet arrives, it is copied into the newest CRIMP interval. Periodically, the server calls a function tic() which moves the newest CRIMP interval into the past and provides a fresh one. The oldest CRIMP interval is removed; any packets stored in it are deleted and its recovery requests are canceled. The length of the history is configurable. In our case, it makes sense to set the length of the history to the delay of the longest synchronizer, since packets are of no use after that point.
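The interval history might be sketched as a fixed-length queue (a minimal sketch with our own names; the real tic() also cancels outstanding recovery requests, which is omitted here):

```python
from collections import deque

class CrimpHistory:
    """Fixed-length history of CRIMP intervals.  Stored packets age out
    when their interval falls off the old end of the history."""
    def __init__(self, length):
        self.intervals = deque([[] for _ in range(length)], maxlen=length)

    def store(self, packet):
        # New packets always go into the newest interval.
        self.intervals[-1].append(packet)

    def tic(self):
        """Advance time: push a fresh interval; the oldest interval (and
        every packet stored in it) is dropped by the bounded deque."""
        self.intervals.append([])

    def stored_packets(self):
        return [p for interval in self.intervals for p in interval]
```

A history sized to the longest synchronization delay (e.g. `CrimpHistory(delay_ms // tic_period_ms)`) guarantees a packet is kept exactly as long as it could still be played back.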

4.4.6 Server Management

CRIMP also handles the problems of network partitioning and server crashes. This is done using the authority of the uberserver, which maintains a list of the last contact with each server. If no packets are received from a server within 3/4 of the time spanned by its CRIMP intervals, the server is considered dropped. This threshold is used because beyond that point the server would have difficulty recovering from its losses anyway. Hence, any servers that can reach the uberserver are still considered part of the game. If the network partitions, all servers in a partition other than the uberserver's are dropped. If the partitioned servers return after being dropped, they must go through the bootstrapping process again to rejoin the game.²

4.5 Server Selection

Since each game will have a different set of servers, we needed a system for directing a user to a game mirror. We do this with a master server which tracks games and a graphical tool qm-find which interfaces with the master server.

4.5.1 Master Server

When a new game is started by an uberserver, it registers with a well-known server called a master server (a term borrowed from [2]). The uberserver sends the IP address and port for itself and its mirrors, as well as information about the game being played on the server (map, game style, etc.). The master server sends periodic keep-alive messages via TCP to the uberserver, which replies with any changes in its relevant information, such as new mirrors, removed mirrors, players, or maps. If the uberserver does not

²We do not consider network partitioning in our evaluation; the description is here only for completeness.


reply to a keep-alive message, the master server removes it from its database. The master server also has a client interface which returns games and their lists of mirrors.

4.5.2 qm-find

qm-find is our proposed tool for finding the closest game to a client. The tool queries the master server for all games matching a specific query (map, game style, etc.), and the master server returns the corresponding list of games. When the user selects a game, qm-find queries an IDMaps [6] HOPS server to find the mirror with the lowest approximate RTT relative to the user. qm-find then spawns the game client with the lowest-RTT mirror as the server name and port arguments.
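The selection step itself reduces to an argmin over estimated RTTs. In this sketch the IDMaps HOPS lookup is stubbed by a plain function, and the mirror list and RTT values are hypothetical:

```python
def pick_mirror(mirrors, estimate_rtt):
    """Return the (host, port) of the mirror with the lowest estimated RTT.
    `estimate_rtt` stands in for the IDMaps HOPS query; qm-find would
    consult a HOPS server here rather than a local table."""
    return min(mirrors, key=estimate_rtt)

# Hypothetical mirror list and RTT estimates (ms) for illustration only.
rtts = {("mirror-a.example", 26000): 80.0,
        ("mirror-b.example", 26000): 35.0,
        ("mirror-c.example", 26000): 120.0}
best = pick_mirror(list(rtts), rtts.get)
```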

4.6 Evaluation

Since CRIMP was designed with the architecture in mind, we wanted to evaluate the network layer under game conditions. However, due to limited equipment and test subjects, we were unable to run experiments with large numbers of clients and servers. Nevertheless, we captured the traffic on the multicast channel for a smaller number of clients and created a traffic model based on the patterns we observed.

In Figure 4.1, we present the network traffic for two servers on a 100baseT LAN. Here our definition of traffic is bytes sent per packet. We considered multiple games with varying numbers of clients. Observe that after some initial large messages, the packet size remains nearly constant. We also present this data as a histogram in Figure 4.2. We removed the outliers (over 100 bytes) from the plot since we care about asymptotic performance, and we used the remaining data points in the histogram as a traffic model for studying multicast performance.

[Figure 4.2: Histogram of Packet Sizes — instances vs. packet size (bytes), for games with 2 through 6 clients]

Experimental Setup

Using the traffic model discussed above, we created a simple test harness to evaluate the network layer. The harness sent packets to the multicast channel, with each packet's size drawn randomly using the histogram as a probability density function. Due to limitations of dummynet, we were unable to create random network delay and loss on a multicast channel between two machines. We therefore ran multiple test harnesses on the machines to create a virtual topology like that in Figure 4.3.
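The size-sampling step can be sketched as follows: the measured histogram is treated as an empirical probability density and sampled once per outgoing packet. The bin sizes and counts below are placeholders for illustration, not our measured values.

```python
import random

def make_size_sampler(histogram, rng=random):
    """Build a packet-size sampler from a histogram.

    histogram -- dict mapping packet size (bytes) -> observed count
    Returns a zero-argument function that draws a size with probability
    proportional to its count, i.e. the histogram used as a PDF.
    """
    sizes = list(histogram.keys())
    counts = [histogram[s] for s in sizes]
    def sample():
        return rng.choices(sizes, weights=counts, k=1)[0]
    return sample

# Example with placeholder bins (not our measured data):
sampler = make_size_sampler({48: 2500, 64: 1200, 80: 300})
size = sampler()  # one of 48, 64, or 80 bytes
```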

Using this virtual topology, with a two-way FreeBSD Dummynet [17] bridge (denoted DummyNet in the figure) imposing a delay of 25 ms and a variety of packet loss rates, we evaluated the multicast layer in terms of perceived RTT, total losses, and duplicate ACKs. Each sender sent 2000 total packets, one packet every 30 ms, as in Quake. We initialized CRIMP with a history of 1000 ms.

To compute meaningful results, we recorded the send time of each packet and the time at which each client received it. We also recorded the number of duplicate recovery packets sent and the number



[Figure 4.1: Two-mirror Client Traces — packet size (bytes) vs. packet number, one panel each for games with 2 through 6 clients]

[Figure 4.3: Virtual Multicast Topology — multiple test harnesses on either side of a two-way DummyNet bridge]

of duplicate repairs. We consider a packet to be a duplicate if a packet with the same data has already been sent. Hence, if a receiver sends two recovery requests for the same packet, one is considered a duplicate. We present the results from our experiment in Table 4.1. We concede that the topology may cause the

Loss Rate   Perceived RTT   Total Losses   Duplicate Recovery   Duplicate Repairs
 0%          15 ms           0                 0                    0
 5%          32 ms           0               441                  323
10%          78 ms           0              1653                 1215
15%         161 ms           2              2512                 1951

Table 4.1: Performance of Multicast

multicast layer to exhibit unusual results, since half of the machines in essence see the same behavior (a 25 ms delay, a loss, or a 0 ms delay). This nullifies any chance of localized recovery, and also means that when the bridge drops a packet an entire side misses it, in essence making our aggressive recovery less effective.
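The duplicate accounting above can be sketched as follows: a recovery message counts as a duplicate when one carrying the same data has already been sent. The function name and trace representation are illustrative, not taken from CRIMP.

```python
def count_duplicates(messages):
    """Count duplicate recovery messages in a trace.

    messages -- iterable of hashable message identifiers (e.g. the
                sequence number of the packet being recovered); a
                message is a duplicate if the same identifier has
                already appeared earlier in the trace.
    """
    seen = set()
    duplicates = 0
    for m in messages:
        if m in seen:
            duplicates += 1
        else:
            seen.add(m)
    return duplicates
```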

From these results, we can see that our multicast layer adequately provides reliability for our architecture. Note that only in the 15% loss-rate case were packets never received. We think this may be misrepresentative, an artifact of our topology having an entire side miss a packet. In the 5% loss case, which we view as a conservative estimate of an actual network, we experience a perceived RTT about twice that of the 0% case. This behavior is consistent with our expectations of an efficient reliable multicast.

4.7 Future Work

4.7.1 Forward Error Correction

A possible improvement for recovery is to use forward error correction [10] to recover packets without going to the network. If, for example, each packet also included the contents of the last two packets, then a receiver of any one of three consecutive packets would, in essence, have received all three. If one of the previous two packets was lost, it could be recovered from the packet that carried it. Network-based reliability would thus be needed only for bursts of packet loss. This would decrease recovery latency for some packets at the expense of increased network stress (in the example, packets would triple in size).
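The piggybacking scheme above can be sketched as follows: each outgoing packet carries copies of the previous two payloads, so a receiver can fill a gap of up to two consecutive losses without going to the network. This is a minimal illustrative sketch, not CRIMP's implementation; all names are hypothetical.

```python
class FecSender:
    """Attach the previous two payloads to each outgoing packet."""

    def __init__(self):
        self.history = []  # last two payloads sent, oldest first

    def send(self, seq, payload):
        # A packet carries its own payload plus copies of the two
        # preceding payloads, roughly tripling its size.
        pkt = {"seq": seq, "payload": payload, "prev": list(self.history)}
        self.history = (self.history + [payload])[-2:]
        return pkt

class FecReceiver:
    """Recover short loss bursts from the piggybacked copies."""

    def __init__(self):
        self.delivered = {}  # seq -> payload

    def receive(self, pkt):
        seq = pkt["seq"]
        self.delivered[seq] = pkt["payload"]
        # pkt["prev"] holds payloads for seq-2 and seq-1 (oldest first);
        # use them to fill any gaps left by lost packets.
        start = seq - len(pkt["prev"])
        for prev_seq, payload in enumerate(pkt["prev"], start=start):
            self.delivered.setdefault(prev_seq, payload)
        return self.delivered
```

With this scheme, losing two consecutive packets is harmless as long as the next packet arrives; only longer bursts fall back to network-based recovery.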



Chapter 5

Conclusions

In conclusion, we have presented a novel networked game architecture, the Mirrored-Server architecture. This architecture takes advantage of mirroring, which has been shown to provide better services to clients [6], and extends it to a system whose state changes at a very high rate. To facilitate this architecture, we also described a new synchronization mechanism, Trailing State Synchronization, which provides efficient consistency between mirrors. We also presented a reliable multicast layer, CRIMP, which takes advantage of simplifications afforded by this architecture to provide low-latency many-to-many reliability. Finally, we presented some results suggesting that our architecture can provide quality gameplay, though we leave much evaluation for future work, since we were hamstrung by the oddities of the Quake server.





Bibliography

[1] N. Baughman and B. Levine. “Cheat-Proof Playout for Centralized and Distributed Online Games”. In Proc. Infocom 2001, April 2001.

[2] Y. W. Bernier. “Half-Life and Team Fortress Networking: Closing the Loop on Scalable Network Gaming Backend Services”. In Proc. of the Game Developers Conference 2000, May 2000.

[3] Diablo II, Starcraft. Blizzard Entertainment, January 2001. http://www.blizzard.com/.

[4] S. Floyd et al. “A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing”. IEEE/ACM Transactions on Networking, Dec. 1997.

[5] EverQuest, PlanetSide. Verant Interactive, January 2001. http://www.verant.com/.

[6] P. Francis, S. Jamin, C. Jin, Y. Jin, V. Paxson, D. Raz, Y. Shavitt, and L. Zhang. “IDMaps: A Global Internet Host Distance Estimation Service”. Submitted for publication, 2001.

[7] T. A. Funkhouser. “RING: A Client-Server System for Multi-User Virtual Environments”. In Proc. of the SIGGRAPH Symposium on Interactive 3D Graphics, pages 85–92. ACM SIGGRAPH, April 1995.

[8] L. Gautier and C. Diot. “Distributed Synchronization for Multiplayer Interactive Applications on the Internet”. Submitted for publication, October 1998.

[9] D. Helder and S. Jamin. “Banana Tree Protocol, an End-host Multicast Protocol”. Submitted for publication, July 2000.

[10] J. F. Kurose and K. W. Ross. Computer Networking: A Top-Down Approach Featuring the Internet. Addison Wesley, preliminary edition, 2000.

[11] B. N. Levine and J. J. Garcia-Luna-Aceves. “A Comparison of Reliable Multicast Protocols”.

[12] D. Li and D. Cheriton. “OTERS (On-Tree Efficient Recovery using Subcasting): A Reliable Multicast Protocol”. In Proc. IEEE International Conference on Network Protocols (ICNP’98), pages 237–245, 1998.

[13] J. Liebeherr and T. Beam. “HyperCast: A Protocol for Maintaining Multicast Group Members in a Logical Hypercube Topology”. In Proc. of the 1st International Workshop on Networked Group Communication, pages 72–89, 1999.

[14] K. L. Morse. “Interest Management in Large-Scale Distributed Simulations”. Technical Report ICS-TR-96-27, 1996.



[15] L. Pantel and L. Wolf. “On the Impact of Delay on Real-Time Multiplayer Games”. Submitted for publication, 2001.

[16] Quake. id Software, January 2001. http://www.idsoftware.com/.

[17] L. Rizzo. “Dummynet: A Simple Approach to the Evaluation of Network Protocols”. ACM Computer Communication Review, 1997.

[18] S. Paul, K. K. Sabnani, J. C.-H. Lin, and S. Bhattacharyya. “Reliable Multicast Transport Protocol (RMTP)”. IEEE Journal on Selected Areas in Communications, 15(3):407–421, 1997.

[19] B. Stabell and K. R. Schouten. “The Story of XPilot”. ACM Crossroads, January 2001.

[20] Y. Chu, S. G. Rao, and H. Zhang. “A Case for End System Multicast”. In Measurement and Modeling of Computer Systems, pages 1–12, 2000.
