Issues and Trends in Router Design

S. Keshav and Rosen Sharma
Cornell University

ABSTRACT
Future routers must not only forward packets at high speeds, but also deal with nontrivial issues such as scheduling support for differential services, heterogeneous link technologies, and backward compatibility with a wide range of packet formats and routing protocols. In this article, the authors outline the design issues facing the next generation of backbone, enterprise, and access routers. The authors also present a survey of recent advances in router design, identifying important trends, concluding with a selection of open issues.

Routers knit together the constituent networks of the global Internet, creating the illusion of a unified whole. While their primary role is to transfer packets from a set of input links to a set of output links, they must also deal with heterogeneous link technologies, provide scheduling support for differential service, and participate in complex distributed algorithms to generate globally coherent routing tables. These demands, along with an insatiable need for bandwidth in the Internet, complicate their design.

Routers are found at every level in the Internet. Routers in access networks allow homes and small businesses to connect to an Internet service provider (ISP). Routers in enterprise networks link tens of thousands of computers within a campus or an enterprise. Routers in the backbone are not usually directly accessible to end systems. Instead, they link together ISPs and enterprise networks with long distance trunks. The rapid growth of the Internet has created different challenges for routers in backbone, enterprise, and access networks. The backbone needs routers capable of routing at high speeds on a few links.
Enterprise routers should have low cost per port and a large number of ports, be easy to configure, and support quality of service (QoS). Finally, access routers should support many heterogeneous high-speed ports and a variety of protocols at each port, and try to bypass the central office voice switch.

This article presents the design issues and trends that arise in these three classes of routers. The following section describes the structure of a generic router. The section after that discusses design issues in backbone, enterprise, and access routers. We then present some recent advances and trends in router design. Finally, we conclude with a description of some open problems. We note that our main topic of discussion is packet forwarding; routing protocols, which create the forwarding tables, are dealt with only in passing.

COMPONENTS OF A ROUTER
Figure 1 abstracts the architecture of a generic router. A generic router has four components: input ports, output ports, a switching fabric, and a routing processor. An input port is the point of attachment for a physical link and is the point of entry for incoming packets. Ports are instantiated on line cards, which typically support 4, 8, or 16 ports. The switching fabric interconnects input ports with output ports. We classify a router as input-queued or output-queued depending on the relative speed of the input ports and the switching fabric. If the switching fabric has a bandwidth greater than the sum of the bandwidths of the input ports, packets are queued only at the outputs, and the router is called an output-queued router. Otherwise, queues may build up at the inputs, and the router is called an input-queued router. An output port stores packets and schedules them for service on an output link. Finally, the routing processor participates in routing protocols and creates a forwarding table that is used in packet forwarding. We now discuss the components of this generic router in more detail.
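The input-queued versus output-queued distinction above reduces to a single bandwidth comparison. The following is a minimal sketch of that rule; the function name `queueing_discipline` and the port speeds in the example are illustrative assumptions, not values from the article.

```python
def queueing_discipline(fabric_bps, input_port_bps):
    """Classify a router by comparing fabric bandwidth with total input bandwidth.

    fabric_bps: switching fabric bandwidth in bits per second.
    input_port_bps: list with the bandwidth of each input port in bits per second.
    """
    # Only a fabric strictly faster than all inputs combined can drain every
    # input as fast as packets arrive, so that queues form only at the outputs.
    if fabric_bps > sum(input_port_bps):
        return "output-queued"
    return "input-queued"


# Hypothetical line card: 16 ports at 155 Mb/s each (2.48 Gb/s in total).
ports = [155e6] * 16
print(queueing_discipline(5e9, ports))    # output-queued
print(queueing_discipline(1.2e9, ports))  # input-queued
```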
An input port provides several functions. First, it carries out data link layer encapsulation and decapsulation. Second, it may also have the intelligence to look up an incoming packet's destination address in its forwarding table to determine its destination port (this is called route lookup). The route lookup algorithm can be implemented using custom hardware, or each line card may be equipped with a general-purpose processor. Third, in order to provide QoS guarantees, a port may need to classify packets into predefined service classes. Fourth, a port may need to run data-link-level protocols such as Serial Line Internet Protocol (SLIP) and Point-to-Point Protocol (PPP), or network-level protocols such as Point-to-Point Tunneling Protocol (PPTP). Once the route lookup is done, the packet needs to be sent to the output port using the switching fabric. If the router is input-queued, several input ports must share the fabric: the final function of an input port is to participate in arbitration protocols to share this common resource.

The switching fabric can be implemented using many different techniques (for a detailed survey see [1]). The most common switch fabric technologies in use today are buses, crossbars, and shared memories. The simplest switch fabric is a bus that links all the input and output ports. However, a bus is limited in capacity by its capacitance and the arbitration overhead for sharing this single critical resource. Unlike a bus, a crossbar provides multiple simultaneous data paths through the fabric. A crossbar can be thought of as 2N buses linked by N * N crosspoints: if a crosspoint is on, data on an input bus is made available to an output bus; otherwise, it is not. However, a scheduler must turn on and off crosspoints for each set of packets transferred in parallel across the crossbar. Thus, the scheduler limits the speed of a crossbar fabric.
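As a rough illustration of the crossbar description above, the sketch below models the N * N crosspoints as a boolean matrix and uses a deliberately naive, first-come scheduler to decide which crosspoints to turn on for one transfer slot. The function name `schedule_slot` and the greedy policy are assumptions made for illustration; the article does not specify a scheduling algorithm.

```python
def schedule_slot(requests, n):
    """Pick a conflict-free set of (input, output) transfers for one slot.

    requests: iterable of (input_port, output_port) pairs, one per waiting packet.
    Returns the crosspoint matrix and the list of granted transfers.
    """
    crosspoints = [[False] * n for _ in range(n)]
    busy_inputs, busy_outputs = set(), set()
    granted = []
    for i, o in requests:
        # Each input bus and each output bus can carry one packet per slot.
        if i not in busy_inputs and o not in busy_outputs:
            crosspoints[i][o] = True
            busy_inputs.add(i)
            busy_outputs.add(o)
            granted.append((i, o))
    return crosspoints, granted


# Hypothetical 4x4 example: inputs 0 and 1 both want output 2.
crosspoints, granted = schedule_slot([(0, 2), (1, 2), (1, 0), (3, 3)], n=4)
print(granted)  # [(0, 2), (1, 0), (3, 3)]
```

The request from input 1 to output 2 loses the contention and must wait for a later slot, which is why the scheduler's decision time bounds how fast the fabric can run.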
In a shared-memory router, incoming packets are stored in a shared memory and only pointers to packets are switched. This increases switching capacity. However, the speed of the switch is limited to the speed at which we can access memory. Unfortunately, unlike memory size, which doubles every 18 months, memory access times decline only around 5 percent each year. This is an intrinsic limitation of shared-memory switch fabrics.

0163-6804/98/$10.00 © 1998 IEEE. IEEE Communications Magazine, May 1998, p. 144.




Output ports store packets before they are transmitted on the output link. They can implement sophisticated scheduling algorithms to support priorities and guarantees. Like input ports, output ports also need to support data link layer encapsulation and decapsulation, and a variety of higher-level protocols. More details on packet scheduling can be found in [2].

The routing processor computes the forwarding table, implements routing protocols, and runs the software to configure and manage the router. It also handles any packet whose destination address cannot be found in the forwarding table in the line card.

Figure 1. Architecture of a router.

DESIGN ISSUES
With this description of a generic router in hand, in this section we turn our attention to design issues for backbone, enterprise, and access routers.

BACKBONE ROUTERS
The Internet currently has a few tens of backbones that each serve up to a few thousand smaller networks. Backbone routers interconnect enterprise networks, so their cost is shared among a large customer base. Moreover, the cost of wide-area transmission links is currently so high that cost is a secondary issue in the design of backbone routers. The primary issues are reliability and speed.

Hardware reliability in backbone routers can be achieved using much the same techniques as in telephone switches: hot spares, dual power supplies, and duplicate data paths through the routers. These are standard in all high-end backbone routers, and we will not consider these issues in any detail. Instead, we focus on techniques to achieve high-speed routing.

The major performance bottleneck in backbone IP routers is the time taken to look up a route in the forwarding table.

To find the destination port for a packet with destination address A, the router masks A with the mask stored with each forwarding table entry and, if the result matches the corresponding network address, adds the entry's port to the set of candidate destination ports. The selected destination is the candidate port corresponding to the longest mask (we call this the entry with the longest prefix match). For example, consider a router that has the following entries in its routing table: {<128.32.1.5/16, 1>, <128.32.225.0/18, 3>, <128.0.0.0/8, 5>}. A packet with a destination address 128.32.195.1 matches all three entries, so the set of candidate destination ports for this packet is {1, 3, 5}. However, port 3 corresponds to the routing entry with the longest mask (i.e., 18). Therefore, the destination port of the packet is port 3. This example outlines the two reasons why looking up a route is hard. First, the routing table may contain many thousands of entries. Thus, it is highly inefficient to match an incoming packet serially with every entry. Indeed, to achieve high routing speeds, we should tightly bound the worst-case time required to look up a destination port. Second, an incoming packet may match multiple routing entries. We must find, among matching entries, the one with the longest match.
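The worked example above can be checked with a few lines of Python. This is a minimal sketch of the serial scan the text calls inefficient, not a production lookup; the function name `longest_prefix_match` is an assumption, and the standard-library `ipaddress` module does the masking (`strict=False` lets it accept entries such as 128.32.1.5/16 whose host bits are not zero).

```python
import ipaddress


def longest_prefix_match(table, dest):
    """Scan every entry; return (candidate ports, port of the longest match)."""
    addr = ipaddress.ip_address(dest)
    candidates = []
    best_len, best_port = -1, None
    for cidr, port in table:
        net = ipaddress.ip_network(cidr, strict=False)  # applies the mask
        if addr in net:  # masked destination equals the network address
            candidates.append(port)
            if net.prefixlen > best_len:
                best_len, best_port = net.prefixlen, port
    return candidates, best_port


# The routing table and destination address from the example in the text.
table = [("128.32.1.5/16", 1), ("128.32.225.0/18", 3), ("128.0.0.0/8", 5)]
candidates, port = longest_prefix_match(table, "128.32.195.1")
print(candidates, port)  # [1, 3, 5] 3
```

The loop touches every entry for every packet, so its cost grows linearly with table size, which is exactly the first difficulty the text identifies.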

The cost of route lookups increases if packets are small or if packets are routed to a large number of destinations, so that a cache of frequently visited destinations becomes ineffective. We now present some representative measurements to evaluate the situation in current backbones. Figure 2a shows the packet size distribution from a 5 min trace collected on a backbone router in the MCI backbone [3]. We see that approximately 40 percent of packets are 40 bytes long (these correspond to TCP acknowledgment packets). This distribution implies that current backbone routers must perform a large number of route lookups every second. One metric of route diversity is the size of the forwarding cache if routes are kept in the cache for a given time period. In a trace obtained in 1995, Fig. 2b [4], we find that if routes are kept in a cache for only 64 s since their last use, we still need a cache of 64,000 entries.

Figure 2. a) Packet size distribution in a backbone router, shown as the contribution of each packet size (bytes) to total packet and byte volume (data courtesy MCI, 25 June 1997, OC-3 backbone link); b) number of active flows in a backbone router, shown for FIX-West as a function of flow granularity (src net, dst net, src host, dst host, net pair, host pair, port pair) in September and December 1995, where a flow is a packet train with a 64 s idle period.
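To give a feel for what "a large number of route lookups every second" means, the sketch below computes the lookup rate needed to keep a link busy for a given packet size. The 155 Mb/s figure approximates the OC-3 link mentioned in the Figure 2 caption and ignores framing overhead; the 1500-byte comparison size and the name `lookups_needed` are assumptions added for illustration.

```python
def lookups_needed(link_bps, packet_bytes):
    """Packets, and hence route lookups, per second on a fully loaded link."""
    return link_bps / (packet_bytes * 8)


OC3_BPS = 155e6  # approximate OC-3 rate; framing overhead ignored (assumption)

for size in (40, 1500):
    pps = lookups_needed(OC3_BPS, size)
    # 1e9 / pps is the time budget per lookup in nanoseconds.
    print(f"{size:5d}-byte packets: {pps:9.0f} lookups/s, {1e9 / pps:7.0f} ns each")
```

Under these assumptions a stream of 40-byte packets leaves roughly 2 microseconds per lookup, while 1500-byte packets leave tens of microseconds, which is why a traffic mix dominated by small packets stresses the lookup path.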

Input-queued and output-queued routers share the route lookup bottleneck, but each has an additional performance bottleneck that the other does not. Recall that output-queued switches must run the switch fabric at a speed greater than the sum of the speeds of the incoming links. While this can be solved by building third-generation interconnection networks, that still leaves the problem of storing packets rapidly in output buffers. The rate at which the output buffer can be accessed is limited by DRAM or SRAM access times. These ultimately limit the speed at which an output-queued router can be run. One way to get around this problem is to place all queuing at the input. However, with this approach an arbiter must resolve contention for the switching fabric and output queue. It is hard to design arbiters that run at high speeds and can also fairly schedule the switch fabric and output line [5, 6]. We discuss strategies for speeding up output-queued routers later.

In addition to overcoming bottlenecks in performance at individual routers, there is an additional design issue that is often ignored. We believe that the stability and reliability of routing protocol implementations critically affect the scalability of the Internet. Little is known about the stability of networks where routers run different versions of the same protocol, or worse, run different protocols altogether. Thus, even slight changes in network configuration can lead to serious and nearly undetectable problems. For example, the description of filters to export (import) routes from an interior gateway protocol to BGP is the cause of many routing problems on the Internet. Recent studies have shown that the routing oscillations which characterize the current Internet are often the result of small bugs in protocol implementation or router misconfiguration [7].

ENTERPRISE ROUTERS

Enterprise or campus networks interconnect end systems. Unlike backbone networks, where speed comes first and cost is a secondary issue, their primary goal is to provide connectivity to a large number of endpoints as cheaply as possible. Moreover, support for different service qualities is desirable because this would allow QoS guarantees at least for traffic confined to the local area. Most enterprise networks are currently built from Ethernet segments connected by hubs or bridges. These devices are cheap and easy to install, and require no configuration. However, not only does the performance of a network built with hubs and bridges degrade with the network's size, but there is usually little support for service differentiation. In contrast, a network built with routers partitions the machines into multiple collision domains and therefore scales better with the size of the network. Besides, most routers support some form of service differentiation, at least allowing multiple priority levels. Routers, however, tend to be more expensive per port and require extensive configuration before they can be used. The challenge, therefore, is to build enterprise routers that have a low cost per port and a large number of ports, are easy to configure, and support QoS.

Enterprise routers have several additional design requirements. Unlike backbone networks, enterprise networks may carry a significant amount of multicast and broadcast traffic. Thus, they must carry multicast traffic efficiently. Backbone routers tend to support only the IP protocol. In contrast, enterprise networks, which must deal with legacy LAN technologies, must support multiple protocols, including IP, IPX, and Vines. They must also support features such as firewalls, traffic filters, extensive administrative and security policies, and virtual LANs. Finally, unlike backbones, which link a handful of trunks, enterprise routers must provide a large number of ports. Thus, enterprise router designers must solve the conflicting design goals of providing a rich feature set at each port, and reducing the cost per port. This is not an easy task!

ACCESS ROUTERS

Access networks link customers at home or in a small business with an ISP. Access networks have traditionally been little more than modem pools attached to terminal concentrators, serving a large number of slow-speed dialup connections. However, this simple model of the access network is changing. First, access networks are beginning to use a variety of access technologies such as high-speed modems, ADSL, and cable modems. Second, the use of telephone lines for accessing the Internet from home has increased the load on the phone network. The long connection holding time for Internet dialup connections creates problems for phone switches. Thus, there is considerable pressure on access networks to be aware of the underlying telephone network, and to try to bypass the voice switch for data calls. Third, access routers are beginning to provide not just a SLIP or PPP connection, but also virtual private network protocols such as PPTP and IPSec. These protocols need to be run at every router port. Finally, technologies such as asymmetric digital subscriber loop (ADSL) will soon increase the bandwidth available from each home. This will further increase the load on access routers. Because of these four trends, access routers will soon need to support a large number of heterogeneous, potentially high-speed ports and a variety of protocols operating at each port, and try to bypass the voice switch if this is at all possible. We feel that access router technology is in a state of flux due to recent advances in technologies such as ADSL, and it is too early to discuss their architecture in this article.

RECENT ADVANCES AND CURRENT TRENDS
The rapid growth of the Internet has led to a flurry of new research in router designs. In this section, we present a selection of these approaches, along with our observations about current trends in router design. A note of warning: this is not meant to be an exhaustive list; it is only a sampling of work in a rapidly changing field!

HIGH-SPEED ROUTE LOOKUP

We saw previously that one of the major bottlenecks in backbone routers is the need to compute the longest prefix match for each incoming packet. The speed of a route lookup algorithm is determined by the number of memory accesses it requires to find the matching route entry, and the speed of the memory. For example, if an algorithm performs eight memory lookups and the input port has a memory with an access time of 60 ns, the time taken to look up a route is 480 ns, allowing it to do about 2 million route lookups/s. The same algorithm, using a costlier memory with a 10 ns access time, would allow the port to perform 12.5 million lookups/s. A second consideration in designing forwarding table data structures is the time taken to update the table. Recent studies have shown that a routing table changes relatively slowly, requiring updates only around once every 2 min [7]. This allows us to use complicated data structures that optimize route lookup at the expense of the time taken to update the routing table.

The standard data structure to store routes is a tree, where each path in the tree from root to leaf corresponds to an entry in the forwarding table. Thus, the longest prefix match is the longest path in the tree that matches the destination address of an incoming packet [8]. Conceptually, a tree-based algorithm starts at the root of the tree and recursively matches the children of the current node with the next few bits of the destination address, stopping if no match is found [9]. Thus, in the worst case it takes time proportional to the length of the destination address to find the longest prefix match. The key idea in a tree-based algorithm is that most nodes require storage for only a few children instead of all possible ones. Such algorithms, therefore, make frugal use of memory at the expense of doing more memory lookups. As memory prices drop, this is precisely the wrong design decision. Worse, the commonly used Patricia-tree algorithm may need to backtrack to find the longest match, leading to poor worst-case performance.

The performance of route lookup algorithms can be improved in several ways.1 We classify these improvement techniques into:
• Hardware-oriented techniques
• Table compaction techniques
• Hashing techniques

1 These improvements are commercially very valuable, so they are not well described in the literature. Our descriptions are meant only as an outline; nondisclosure agreements preclude greater detail.
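The tree-based lookup and the access-time arithmetic described earlier in this section can be sketched together. The code below is a plain one-bit-per-level binary trie, chosen for brevity; it is not the Patricia tree mentioned in the text, and the names `Node`, `insert`, `lookup`, and `lookups_per_second` are assumptions. Each node visited stands in for one memory access.

```python
import ipaddress


class Node:
    """One trie node: two children (bit 0 and bit 1) and an optional port."""
    __slots__ = ("children", "port")

    def __init__(self):
        self.children = [None, None]
        self.port = None


def insert(root, cidr, port):
    net = ipaddress.ip_network(cidr, strict=False)
    bits = int(net.network_address)
    node = root
    for i in range(net.prefixlen):
        b = (bits >> (31 - i)) & 1  # i-th most significant bit
        if node.children[b] is None:
            node.children[b] = Node()
        node = node.children[b]
    node.port = port


def lookup(root, addr):
    """Walk the trie bit by bit; return (best port, nodes visited)."""
    a = int(ipaddress.ip_address(addr))
    node, best, accesses = root, root.port, 0
    for i in range(32):
        node = node.children[(a >> (31 - i)) & 1]
        if node is None:
            break  # no longer prefix can match
        accesses += 1
        if node.port is not None:
            best = node.port  # remember the longest match seen so far
    return best, accesses


def lookups_per_second(accesses, access_ns):
    """Lookup rate when each lookup costs `accesses` memory reads."""
    return 1e9 / (accesses * access_ns)


root = Node()
for cidr, port in [("128.32.1.5/16", 1), ("128.32.225.0/18", 3), ("128.0.0.0/8", 5)]:
    insert(root, cidr, port)

print(lookup(root, "128.32.195.1"))  # (3, 18)
print(lookups_per_second(8, 60))     # about 2.08 million
print(lookups_per_second(8, 10))     # 12500000.0
```

In this sketch the lookup for 128.32.195.1 visits 18 nodes, one per prefix bit, which illustrates the text's point that a memory-frugal tree pays for its compactness with more memory accesses per lookup.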

Well-known hardware-oriented solutions are based on content-addressable memories (CAMs) and caches. Both techniques scale poorly with routing table size, and cannot be used for backbone routers that support large routing tables. Some recent hardware-oriented approaches essentially combine logic and memory together in a single device, drastically reducing the memory access time. This "intelligent-memory" approach is quite general, and can be used in conjunction with the software techniques described later. A second hardware-oriented solution is to increase the amount of memory used to store the routing table. Reference [10] argues that it is feasible to use a single 1 Gb table to look up a 32-bit address. Even at current prices, this would cost only about $6500, which is well within the range of affordability for backbone routers. Over time, as memory costs drop, this approach might well be the best one even for enterprise routers. A subtle problem with this approach, however, is that the table becomes very hard to update: changing a single forwarding entry might cause several thousand memory locations to be updated. Cheap special-purpose hardware that can perform rapid updates on the forwarding table is described in [10].
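The update problem of a large directly indexed table is easy to see in miniature. The sketch below is a toy model, not the design in [10]: it assumes a 16-bit address space so that the whole table fits in a Python list, and the class name `DirectTable` and its methods are hypothetical.

```python
WIDTH = 16  # toy address width in bits (assumption; the article discusses 32-bit addresses)


class DirectTable:
    """One slot per possible address; a lookup is a single memory access."""

    def __init__(self):
        self.port = [None] * (1 << WIDTH)
        self.plen = [-1] * (1 << WIDTH)  # length of the prefix that owns each slot

    def add_route(self, prefix, length, port):
        """Install prefix/length and return how many slots had to be written."""
        span = 1 << (WIDTH - length)
        start = prefix & ~(span - 1)  # clear the host bits
        writes = 0
        for addr in range(start, start + span):
            if length >= self.plen[addr]:  # longer prefixes keep their slots
                self.port[addr], self.plen[addr] = port, length
                writes += 1
        return writes

    def lookup(self, addr):
        return self.port[addr]


table = DirectTable()
print(table.add_route(0x8200, 8, port=1))  # 256
print(table.add_route(0x8000, 4, port=5))  # 3840
print(table.lookup(0x8234), table.lookup(0x8134))  # 1 5
```

Even in this 16-bit toy, installing one 4-bit prefix rewrites thousands of slots, while every lookup remains a single indexed read; the same trade-off grows with the address width.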

Table compaction techniques, such as the algorithm described in [11], exploit the sparse distribution of forwarding entries in the space of all possible network addresses to build a complicated but compact data structure for the forwarding table. The table is then stored in the primary cache of a processor, allowing route lookup at gigabit speeds.

Finally, hash-based solutions have also been proposed for route lookup. The need to determine the longest prefix match limits the use of hashing. In particular, given a destination address, we do not know the prefix to use for finding the longest match. The solution to this problem is to try different masks, choosing the one that has the longest mask length. The choice of masks can be iterative [12] or hierarchical, or the first few bits of the address could be used to find a list of prefix lengths. Unfortunately, none of these solutions scale well with the size of the destination address.

In recent work, Waldvogel et al. [13] have presented a scalable hash-based algorithm that can look up the longest prefix match for an N bit address in O(log N) steps. Their algorithm computes a separate hash table for each possible prefix length. Instead of naively searching for a successful hash starting from the longest possible prefix, their algorithm does a binary search on the prefix lengths. This requires hash tables to contain markers that, on a hash failure, point to the correct smaller-length hash table to search in. The search path for a particular forwarding entry is compactly stored in the form of a "rope," which reduces the storage requirements for markers. In addition, by precomputing hash tables that hold all forwarding entries associated with each 16-bit prefix, they can "mutate" the hash table on the fly, further reducing the number of memory accesses to an average of two per lookup.
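The core idea of binary search over prefix lengths can be sketched as follows. This is a simplified reading of the scheme, not the algorithm of [13] itself: Python dictionaries stand in for the per-length hash tables, each marker stores a precomputed best match so the search never has to backtrack (an assumption about how markers are used), and the "rope" and table "mutation" refinements are omitted. The names `build_tables` and `lookup` are hypothetical.

```python
import ipaddress

W = 32  # IPv4 address width in bits


def build_tables(routes):
    """routes: {"a.b.c.d/len": port}. Returns {length: {key: best port or None}}."""
    real = {}  # (length, top `length` bits) -> port, genuine prefixes only
    for cidr, port in routes.items():
        net = ipaddress.ip_network(cidr, strict=False)
        length = net.prefixlen
        real[(length, int(net.network_address) >> (W - length))] = port

    def best_match(length, key):
        # Longest genuine prefix that is a prefix of this bit string.
        for l in range(length, -1, -1):
            port = real.get((l, key >> (length - l)))
            if port is not None:
                return port
        return None

    tables = {l: {} for l in range(W + 1)}
    for (length, key), port in real.items():
        tables[length][key] = port
        # Leave markers at the shorter lengths the binary search probes
        # on its way to `length`, so the search is steered toward it.
        lo, hi = 1, W
        while lo <= hi:
            mid = (lo + hi) // 2
            if mid == length:
                break
            if mid < length:
                marker = key >> (length - mid)
                tables[mid].setdefault(marker, best_match(mid, marker))
                lo = mid + 1
            else:
                hi = mid - 1
    return tables


def lookup(tables, addr):
    """Return (port of longest matching prefix, number of hash probes)."""
    a = int(ipaddress.ip_address(addr))
    best = tables[0].get(0)  # default route, if one was installed
    lo, hi, probes = 1, W, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        key = a >> (W - mid)
        probes += 1
        if key in tables[mid]:  # prefix or marker: try longer lengths
            if tables[mid][key] is not None:
                best = tables[mid][key]
            lo = mid + 1
        else:  # miss: only shorter prefixes can match
            hi = mid - 1
    return best, probes


routes = {"128.32.1.5/16": 1, "128.32.225.0/18": 3, "128.0.0.0/8": 5}
tables = build_tables(routes)
print(lookup(tables, "128.32.195.1"))  # (3, 5)
```

For a 32-bit address the search makes five probes (lengths 16, 24, 20, 18, 19 in this example), compared with up to 32 for a naive longest-first scan of the per-length tables.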

To sum up, we believe that fast route lookup is a solved problem. This has major implications for ATM switches (see sidebar).

ADVANCESN SWITCHING FABRICS

As mentioned in the first section, switching fabrics are usually implemented as a crossbar, shared memory, or bus. The speed of a crossbar fabric is limited by the scheduler, that of a shared memory fabric by memory access speeds, and that of a bus by bus capacitance and arbitration overhead. The mid-’80s saw a lot of research in the area of switch fabric design, but few of these designs were actually built because advances in bus speeds made them unnecessary. These designs include the well-known Banyan family of fabrics, along with others such as the Delta and Omega fabrics (for a survey see [1]). These designs were revived in the early ’90s, mostly for building large ATM switches. With the recent decline in demand for ATM, IP routers are being built by wrapping segmentation and reassembly modules around ATM switch fabric cores. In these routers, permanent virtual circuits are established from each port to all the other ports. After a longest-prefix match determines the destination port for an IP packet, it is fragmented into ATM cells and switched. The ATM cells are reassembled at the output port before transmission.

Using an ATM core allows the router to support different QoS streams in the switching fabric and overcome the problems associated with switching variable-size packets. However, these designs inherit some of the drawbacks of ATM. First, ATM switches, for the most part, do not have good support for multicast.² Multicasting data through the switch core requires a VCI to be mapped to multiple VCIs, and copies of a cell to be generated either at the input port or within the switch fabric. These overheads decrease the efficiency of the switching fabric. Second, a subtler problem arises because IP traffic control algorithms are usually specified in terms of packets rather than in terms of cells. Thus, with cell-based fabrics, implementing semantics such as those required by shared filters in RSVP can be a challenging task. Despite these drawbacks, the use of a fixed-size internal switching core seems to be a widespread design technique, and we will assume the existence of such a core in the rest of this article.
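To make the segmentation and reassembly step concrete, here is a minimal Python sketch of fragmenting a packet into fixed-size cells and rebuilding it at the output port. It is an illustration only: the 48-byte payload matches an ATM cell, but the tuple used as a cell header and the way the packet length is carried are assumptions, and this is not an implementation of any ATM adaptation layer.

```python
from typing import List, Tuple

CELL_PAYLOAD = 48  # payload bytes per ATM cell

# A cell is modeled as (output_port, is_last_cell, payload); hypothetical layout.
Cell = Tuple[int, bool, bytes]


def segment(packet: bytes, out_port: int) -> List[Cell]:
    """Split a variable-size packet into fixed-size cells, padding the last one."""
    cells: List[Cell] = []
    for off in range(0, len(packet), CELL_PAYLOAD):
        chunk = packet[off:off + CELL_PAYLOAD]
        is_last = off + CELL_PAYLOAD >= len(packet)
        cells.append((out_port, is_last, chunk.ljust(CELL_PAYLOAD, b"\x00")))
    return cells


def reassemble(cells: List[Cell], length: int) -> bytes:
    """Concatenate payloads in arrival order and strip the padding."""
    return b"".join(payload for _, _, payload in cells)[:length]
```

A 100-byte packet becomes three cells (48 + 48 + 4 bytes of data, the last padded to 48), so the fabric only ever switches fixed-size units.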

SPEEDING UP OUTPUT QUEUES

We noted earlier that a major problem in output-queued switches is the speed at which the output queue can be accessed. Two design techniques allow this bottleneck to be overcome. The first is to build very wide memories that can load an entire cell in a single memory cycle. We can do this by deploying memory elements in parallel and feeding them with a cell-wide data bus. Although this extravagant use of memory is costly, as memory prices continue to drop by 60 percent a year the approach is rapidly becoming attractive. A second approach to building fast output queues is to integrate a port controller and the associated queues on a single chip. This approach allows the read/write control logic to access queues in memory an entire row at a time, and therefore at speeds far greater than with external logic. An example of this approach can be seen in the 77V400 chipset from IDT Inc. The key idea here is to integrate eight serial input and output port controllers and a shared memory on a single very large-scale integrated (VLSI) chip. The serial inputs are parallelized in a shift register, and the entire shift register, usually containing an ATM cell, is read into the memory in parallel. When the output port scheduler decides to serve a packet, it reads the cell in parallel into a shift register, converts it to serial, and transmits it. Switching is accomplished by deciding which output port controller should receive an incoming cell. Since the memory can be accessed rapidly, an output port controller can store eight cells in a single cell time (at a line rate of 155 Mb/s). The VLSI packaging makes the chip economical: a 1.2 Gb/s 8-port switch-on-a-chip costs only around $50. While this approach does not scale to very large switches, 77V400 chips can be interconnected to form larger buffered switch fabrics. For instance, a three-stage buffered Banyan switch can be easily constructed with such elements. Again, at each stage, cell loss due to simultaneous cell arrivals on multiple inputs is avoided because of the integration of the memory element with the port logic. We believe that intelligent port controllers, such as IDT’s 77V400, Lucent’s Atlanta, and MMC’s ATMS2000 chipsets, are the wave of the future.

² Currently multicast is not widely supported in the backbone. Efforts like the IP Multicast Initiative (IPMI) may change this in the near future, making multicast support an important feature for backbone routers.

INPUT-QUEUED SWITCHES

Despite improvements in the speed of output queues, they are still a significant bottleneck, since, to avoid packet loss, they must run much faster than the input links. We can avoid this problem altogether by building input-queued switches. Input queuing is often deprecated because of the head-of-line blocking problem: packets blocked at the head of an input queue prevent schedulable packets deeper within the queue from accessing the switch fabric. However, by maintaining per-output queues at each input, head-of-line blocking can be completely avoided. This still leaves the problem of arbitrating access to the switch fabric at high speeds. Recent research suggests that this may be solvable with current technology [5]. Consequently, it appears that router bandwidth can be increased by yet another order of magnitude by moving to input queuing.

Input queuing, however, suffers from some serious problems that still need resolution. First, packet scheduling algorithms for providing QoS are usually specified in terms of output queues. It is not clear how to modify these algorithms to simultaneously schedule the output queues and the switch fabric. This is a complex and nontrivial issue: in effect, we are asking each input port controller to mimic the actions of the entire set of output port controllers, each of which could conceivably be transmitting packets on a different link technology. For example, consider a router that has a fiber distributed data interface (FDDI), a Fast Ethernet, and a T3 port. If this router uses input queuing, each input port controller should schedule packets not only according to the varying transmission speeds of the output links, but also in accordance with the transmission policies associated with the disparate links. In particular, an input queue cannot send a packet to the Fast Ethernet port if that port is backing off from a collision. It cannot send a packet to the FDDI port if that port does not have the token. Clearly, with the diversity of link technologies, building a general-purpose input port controller is a challenging if not impossible task. Second, enhanced router services such as Random Early Detection [14] depend on the length of the output queue. With an input-queued switch, the output queue length is not known. Due to these practical problems, nontrivial input-queued enterprise routers that deal with heterogeneous links and policies may never become practical, and hybrid approaches with both input and output queuing and a moderate degree of speedup in the switching fabric may be necessary.
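The per-output queues at each input (often called virtual output queues) can be sketched in a few lines of Python. The arbiter below is a simple greedy matcher with a rotating starting input, chosen only for brevity; it is an assumption of this sketch and not one of the scheduling algorithms studied in [5]. The class name `VOQSwitch` is hypothetical.

```python
from collections import deque
from typing import Dict, Tuple


class VOQSwitch:
    """N x N input-queued switch with one queue per (input, output) pair."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.voq = [[deque() for _ in range(n)] for _ in range(n)]
        self.start = 0  # input that gets first pick in the next cell time

    def enqueue(self, inp: int, out: int, pkt: str) -> None:
        self.voq[inp][out].append(pkt)

    def schedule(self) -> Dict[int, Tuple[int, str]]:
        """One cell time: match each input to at most one free output."""
        taken = set()
        match: Dict[int, Tuple[int, str]] = {}
        for k in range(self.n):
            inp = (self.start + k) % self.n
            for out in range(self.n):
                if out not in taken and self.voq[inp][out]:
                    match[inp] = (out, self.voq[inp][out].popleft())
                    taken.add(out)
                    break
        self.start = (self.start + 1) % self.n
        return match
```

Suppose input 0 holds a cell for output 0, and input 1 holds a cell for output 0 followed by one for output 1. With a single FIFO at input 1, the second cell would wait behind the first even though output 1 is idle. With per-output queues, the first call to `schedule()` sends input 0's cell to output 0 and input 1's later cell to output 1 in the same cell time.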

SCHEDULING

Suppose that packets arriving at all the input ports of a router wish to leave from the same output port. If the output trunk speed is the same as the input trunk speed, only one of these packets can be transmitted in the time it takes for all of them to arrive at the output port. In order to prevent packet loss, the output port provides buffers to store excess arriving packets, and serves packets from the buffer as and when the output trunk is free. The obvious way to serve packets from the buffer is in the order in which they arrived at the buffer, that is, first-come first-served (FCFS). FCFS service is trivial to implement, requiring the router or switch to store only a single head and tail pointer per output trunk. However, this solution has its problems because it does not allow the router to give some sources a lower delay than others, or prevent a malicious source, which sends an unending stream of packets as fast as it can, from causing other well-behaved streams to lose packets.

An alternative service method, called Fair Queuing, solves these problems, albeit at a greater implementation cost [15]. In the Fair Queuing approach, each source sharing a bottleneck link is allocated an ideal rate of service at that link. Specifically, focusing only on the sources that are backlogged at the link at a given instant in time, the available service rate of the trunk is partitioned in accordance with a set of weights. If we represent the weights of backlogged sources 1, 2, ..., n by $w_1, w_2, \ldots, w_n$, then the ideal service received by source k is $r w_k / \sum_{i=1}^{n} w_i$, where r is the rate of the outgoing trunk. Fair Queuing and its variants are mechanisms that serve packets from the output queue to approximately partition the trunk service rate in this manner. All versions of Fair Queuing require packets to be served in an order different from that in which they arrived. Consequently, Fair Queuing is more expensive to implement than FCFS, since it must decide the order in which to serve incoming packets, and then manage the queues in order to carry this out. In general, the higher the number of conversations going through a router, the costlier it is to implement Fair Queuing, since Fair Queuing requires some form of per-conversation state to be stored on the routers. An exhaustive survey of variants of the Fair Queuing algorithm can be found in [16].
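The weighted partition of the trunk rate, and one way to approximate it packet by packet, can be sketched as follows. The scheduler is a simplified finish-tag scheme in which virtual time is taken from the tag of the packet last served; this choice, and the names `ideal_rates` and `FairQueue`, are assumptions made for illustration rather than the specific algorithm of [15].

```python
import heapq
from typing import Dict, List, Tuple


def ideal_rates(r: float, weights: Dict[str, float]) -> Dict[str, float]:
    """Ideal service rate of each backlogged source: r * w_k / sum of weights."""
    total = sum(weights.values())
    return {k: r * w / total for k, w in weights.items()}


class FairQueue:
    """Serve packets in increasing order of per-source finish tags."""

    def __init__(self, weights: Dict[str, float]) -> None:
        self.weights = weights
        self.last_finish = {k: 0.0 for k in weights}  # per-conversation state
        self.vtime = 0.0                               # tag of packet last served
        self.heap: List[Tuple[float, int, str, int]] = []
        self.seq = 0                                   # tie-breaker: arrival order

    def enqueue(self, src: str, length: int) -> None:
        start = max(self.vtime, self.last_finish[src])
        finish = start + length / self.weights[src]
        self.last_finish[src] = finish
        heapq.heappush(self.heap, (finish, self.seq, src, length))
        self.seq += 1

    def dequeue(self) -> Tuple[str, int]:
        finish, _, src, length = heapq.heappop(self.heap)
        self.vtime = finish
        return src, length
```

If a greedy source `a` queues four 100-byte packets and a second source `b` then queues one, FCFS would serve `b` fifth. With equal weights the sketch serves `b` second, because its finish tag does not depend on the backlog of `a`. The `last_finish` dictionary is the per-conversation state whose cost the text refers to.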

Fair Queuing has three important and useful properties. First, it provides protection so that a well-behaved source does not see packet losses due to misbehavior by other sources. Second, by design it provides fair bandwidth allocation. If the sum of weights of the sources is bounded, each source is guaranteed a minimum share of link capacity. Finally, it can be shown that if a source is leaky-bucket regulated, independent of the behavior of the other sources, it receives a bound on its worst-case end-to-end delay. For these reasons, almost all current routers support some variant of Fair Queuing.

A related scheduling problem has to do with the partitioning of link capacity among different classes of users. Consider a wide-area trunk shared by two companies. All other things being equal, in times of congestion the trunk should be equally shared by packets from both companies. This sort of link-sharing requirement deals with classes of connections rather than individual connections, and requires per-class bookkeeping. In recent work it has been shown that extensions of Fair Queuing are compatible with hierarchical link-sharing requirements [17, 18]. Fast implementations of algorithms that provide both hierarchical link sharing and per-connection QoS guarantees are an area of active research [19, 20]. We expect that all future routers will provide some form of Fair Queuing at output queues.

REDUCING PORT COST

Both enterprise and access routers support a large number of ports. Thus, they need to reduce the cost of a port to the bare minimum. The cost of a port depends on:

- The amount and kind of memory it uses
- Its processing power
- The complexity of the protocol used for communication between the port and the routing processor

Ports built with general-purpose processors, large buffers, and complex communication protocols tend to be more expensive than those built using ASICs, with smaller buffers and simple communication protocols. Choosing between ASICs and general-purpose processors for a port is not straightforward. General-purpose processors are costlier, but allow extensible port functionality, are available off-the-shelf, and their price/performance ratio improves rapidly with time. Their cost currently makes them suitable only for backbone routers, but their flexibility will eventually make them attractive for enterprise and access routers. On the other hand, ASICs are not only cheaper, but can also provide operations that are specific to routing, such as traversing a Patricia tree. Moreover, the lack of flexibility with ASICs can be overcome by implementing functionality in the routing processor. Thus, at the moment it seems to make sense to use processor-based designs for backbone routers and ASIC-based designs for the local area. Over time, the situation may well be reversed.

The cost of a port also depends on the type and size of memory on the port. SRAMs offer faster access times, but are costlier than DRAMs. In general, backbone routers use SRAMs, and enterprise and access routers use DRAMs. Buffer memory is another parameter that is difficult to size. In general, the rule of thumb is that a port should have enough buffers to support at least one bandwidth-delay product worth of packets, where the delay is the mean end-to-end delay and the bandwidth is the largest bandwidth available to TCP connections traversing that router. This sizing allows TCP connections to open up their transmission windows without excessive congestive losses. The largest bandwidth available to connections in the Internet today is around 100 Mb/s. In the backbone, assuming conservatively that the mean connection delay is 100 ms, this comes to 10 Mb, or about 1.25 MB, of buffering per port. Far less buffering is necessary in enterprise networks, where the mean connection delay is usually less than 10 ms, corresponding to a per-port buffering of about 100 KB.
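The bandwidth-delay rule of thumb is simple arithmetic, shown here as a short Python sketch using the figures quoted in the text (100 Mb/s, with delays of 100 ms and 10 ms). The function name `buffer_bytes` is hypothetical, and integer units are used to keep the results exact.

```python
def buffer_bytes(bandwidth_bps: int, delay_ms: int) -> int:
    """One bandwidth-delay product, in bytes, for a single port."""
    bits = bandwidth_bps * delay_ms // 1000  # bits in flight during one delay
    return bits // 8


# Backbone: 100 Mb/s x 100 ms = 10 Mb = 1,250,000 bytes (about 1.25 MB)
backbone = buffer_bytes(100_000_000, 100)

# Enterprise: 100 Mb/s x 10 ms = 1 Mb = 125,000 bytes (on the order of 100 KB)
enterprise = buffer_bytes(100_000_000, 10)
```

Because the enterprise delay is an upper bound ("less than 10 ms"), 125,000 bytes is a ceiling for that case rather than a typical value.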

Finally, the cost of the port is also determined by the complexity of the connections between the control path and the data path in the line card. In some cases, a routing processor sends commands to each port through the switching fabric and the port’s internal buffers. If command packets can get lost, they need retransmission. Careful engineering of the control protocol is necessary to reduce the cost of port control.

ENTERPRISE-LEVEL MANAGEMENT AND CENTRALIZATION

An enterprise or campus may contain many routers under the control of a single administrator. In this situation, it is often a good idea to centralize some functions usually associated with individual routers. For example, a central route server can compute loop-free routes for the entire enterprise and load them into each router’s forwarding tables. A similar approach can be taken for loading multicast forwarding entries, thus freeing the routers from the burden of participating in complex multicast routing protocols. We call this “enterprise-level management.” The enterprise-level approach also makes it easier to implement global policies. For example, an organization may want to limit the total amount of resources dedicated to multicast traffic. Although the mechanisms to restrict traffic (like policers) may be implemented at each router, the computation of parameters for the individual policers can be centralized.

AVOIDING ROUTE LOOKUPS

Instead of reducing the cost of route lookups, backbone routers can use two techniques to avoid route lookups altogether. First, backbone networks can provide a virtual circuit interface (e.g., carrying IP over ATM over SONET), and require edge networks to translate from the destination address to a VCI. Since VCIs are integers drawn from a small space, they can be looked up with a single memory access. Moreover, the complexity of a longest-prefix match is avoided. However, the edge network must somehow distribute the mapping between a VCI (or tag) and the destination port to each router in the backbone. This can be achieved through protocols such as IP switching³ and tag switching. The deployment of such a technique requires a major change to the network.

A second technique essentially introduces another level in the routing hierarchy to reduce the size of routing tables, making route lookups cheaper. With this approach, backbone routers only keep routes to destinations served by that backbone. All unknown destinations are routed to a “gateway” at a network access point (NAP). Competing backbone providers exchange routing information and packets at NAPs. While this means that global (large) routing tables are needed only at NAPs, the growth in the number of providers at the NAP has stressed the scalability of the Border Gateway Protocol (BGP), which is used to exchange routing information among peer routers at the NAP.
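The single-memory-access lookup that a VCI (or tag) makes possible can be sketched as a direct-indexed table. The class `CoreRouter` and its methods are hypothetical names; how the table entries are distributed (IP switching, tag switching) is outside the sketch.

```python
from typing import List, Optional, Tuple


class CoreRouter:
    """Backbone router that forwards on a small integer label, not an address."""

    def __init__(self, vci_space: int) -> None:
        # One slot per possible VCI: (output port, outgoing VCI), or None if unused.
        self.table: List[Optional[Tuple[int, int]]] = [None] * vci_space

    def install(self, in_vci: int, out_port: int, out_vci: int) -> None:
        """Record a mapping distributed by the edge or by a signaling protocol."""
        self.table[in_vci] = (out_port, out_vci)

    def forward(self, in_vci: int) -> Optional[Tuple[int, int]]:
        """A single array index replaces the longest-prefix match."""
        return self.table[in_vci]
```

After `install(42, 3, 17)`, a cell arriving with VCI 42 is sent to port 3 with VCI 17 by one array read, whatever the size of the global routing table.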

ROUTER OPERATING SYSTEMS

In the past, routers have been viewed as hardware devices that are optimized for routing packets at high speeds. Thus, the software environment on a typical control processor is bare bones, providing little beyond a basic monitor. There is a growing realization, however, that router hardware is getting to be a commodity which is easy to build, and the greatest asset of a router vendor is its software. A parallel trend, which reinforces this idea, is the notion that by opening up the router architecture to third parties, router vendors can leverage them to create enhanced services within the network. Thus, there is an intense focus on the development of router operating systems: operating systems that are specialized to run on routers and provide a carefully controlled application programming interface (API) to the underlying hardware. If this trend continues, end users may be able to install custom software modules within routers to provide services such as firewalls, traffic management policies, application-specific signaling, and fine-grained control of the routing policy. There is a fairly large and vocal “open signaling” community that is lobbying router vendors for precisely this kind of access to router internals [21]. It is interesting to note that telephone companies already provide an extremely rich programming interface to PBX hardware through the JTAPI interface [22]. An extreme variant of this trend, dubbed “active networking,” would allow individual packets to install software on the fly in routers. It is not clear what performance penalty such an approach would exact, but recent work shows that both the security concerns and the performance hit of active networks may be tolerable, at least in some environments [23].

OPEN PROBLEMS

In this section, we identify some challenging open problems in router design. We believe that the solutions to these problems will lead to interesting new trade-offs in the next generation of IP routers.

FLOW IDENTIFICATION

It is often useful to think of the set of packets traveling through the Internet between a given source and a given destination close together in time as constituting a flow. A flow can result from the set of packets within a long-lasting TCP connection or from the set of UDP packets in an audio or video session. By definition flows last for a while, so it is a useful optimization to pin resources, such as cache entries, associated with the current set of flows. Therefore, it is useful to identify flows on the fly by noticing, for instance, that more than X packets with the same source and destination IP addresses and TCP port numbers have been seen in the last Y seconds [24]. Flows may also be associated with real-time performance guarantees. We can identify these flows by matching incoming packet headers with a set of prespecified filters. Since classification needs to be done for each incoming packet, we need fast classification algorithms. Unfortunately, while the algorithms described earlier can look up routes for 32-bit addresses at line speed, they cannot be easily modified for fast flow classification. Moreover, we lack generic yet efficient flow descriptors. For instance, the most generic classifier is one that masks the source and destination IP addresses and ports and the protocol number, thus requiring a lookup on 104 bits of the packet. This sort of classifier seems difficult to implement at high speed. We believe that coming up with a concise description of a classifier and a way to match the “best” classifier among the several thousand that may be present at a router is an open problem.

³ In IP switching the backbone tells the edge about the mapping, instead of the other way around, but the idea is conceptually similar to what we describe here.
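The "more than X packets in the last Y seconds" rule from this subsection can be sketched as a small Python class keyed on the 104-bit header tuple (32 + 32 bits of addresses, 16 + 16 bits of ports, 8 bits of protocol). The class name `FlowDetector` and the use of a per-key timestamp queue are assumptions for illustration; a router would need a bounded, hardware-friendly structure instead.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple

# (src IP, dst IP, src port, dst port, protocol): 104 bits in an IPv4 header
FlowKey = Tuple[str, str, int, int, int]


class FlowDetector:
    """Flag a flow once more than x packets are seen within y seconds."""

    def __init__(self, x_packets: int, y_seconds: float) -> None:
        self.x = x_packets
        self.y = y_seconds
        self.seen: Dict[FlowKey, Deque[float]] = defaultdict(deque)

    def observe(self, key: FlowKey, now: float) -> bool:
        """Record one packet arrival; return True if key now counts as a flow."""
        stamps = self.seen[key]
        stamps.append(now)
        while stamps and now - stamps[0] > self.y:  # forget arrivals older than y
            stamps.popleft()
        return len(stamps) > self.x
```

With X = 3 and Y = 1 second, the fourth packet of a burst triggers detection, while a single packet arriving five seconds later does not, since the earlier arrivals have aged out.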

RESOURCE RESERVATION

The Internet has poor support for resource reservations: Ethernet-based LANs, WAN access links, and backbone routers are geared toward best-effort traffic, with no support even for the simplest of priority schemes. As Ethernets become switched and the demand for some form of service quality increases, we expect to see support for resource reservation in all three classes of routers. Resource reservation goes hand in hand with flow classification, because resources are reserved on behalf of prespecified flows. Unfortunately, this coupling makes resource reservation at least as hard to solve as this open problem!

Even if we had efficient flow classifiers, resource reservation additionally requires either policing, so the demand of an individual flow is limited, or some form of segregation in packet scheduling, so over-limit flows are automatically discouraged. Given the complexity of implementing Fair-Queuing-type scheduling algorithms at high speed, there has been much recent work in coming up with efficient policers. For example, in the RIO scheme over-limit packets are marked as low priority and preferentially discarded [25]. The choice of good policing algorithms and associated pricing schemes is an open problem.
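As an illustration of marking over-limit packets, here is a minimal Python sketch of a token-bucket marker. The token bucket is a stand-in chosen for familiarity and is not claimed to be the metering method of [25]; the class name, the "in"/"out" labels, and the parameters are assumptions. A downstream queue would then discard "out" packets preferentially under congestion.

```python
class TokenBucketMarker:
    """Mark packets as in-profile or out-of-profile for one flow."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float) -> None:
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes  # bucket starts full
        self.last = 0.0            # time of the previous packet

    def mark(self, now: float, size: int) -> str:
        # Refill for the elapsed time, never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return "in"            # within the reserved profile
        return "out"               # over limit: low priority, dropped first
```

With a rate of 1000 bytes/s and a burst of 1500 bytes, two back-to-back 1000-byte packets are marked "in" and "out" respectively, and a third packet one second later is "in" again.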

EASE OF CONFIGURATION

Configuring routers is hard work. Misconfigured routers can be hard to detect and can cause nearly untraceable performance problems. For example, bugs in the configuration of proxy ARP on routers manifest themselves only as a mysterious increase in network delay [26]. We believe that simple and intuitive abstractions of the underlying network functionality would go a long way in solving these problems. These abstractions remain elusive. Configuration becomes harder if the functionality, such as limiting the amount of multicast traffic in a network, requires the simultaneous configuration of more than one router in the network. Interaction between inconsistent configurations can cause networkwide problems and failures. It is not always possible to visually examine configuration files to discover mistakes and inconsistencies. We believe that the next generation of configuration tools will need rule-based and simulation-based subsystems to test a configured router before installing it in the field. This is a difficult and interesting open problem.


STABILITY OF LARGE SYSTEMS

Recall that router hardware can be made more reliable by adding hot spares, dual power supplies, and duplicate data paths. In contrast to hardware reliability, which is a well understood and solved problem, software reliability remains a challenging open problem. We believe that stability of router software is a necessary prerequisite for the reliability of a large network. Software stability is hard to achieve because software state is affected by interaction among different features. For example, the addition of a BGP (exterior routing) attribute may affect the calculation of routes exported to interior routing protocols. Furthermore, interaction between bugs in implementations from different vendors [7] can lead to persistent instabilities in the network. Simulation may not be very helpful since it is difficult to reproduce bugs in implementations, and thus the exact behavior of nodes in the network. We believe that one solution to software reliability may lie in adding features to protocol implementations, similar to the support for multicast traceroute in mrouted, which allow us to detect and isolate problems. This would at least allow us to track the cause of instability once problems occur.

ACCOUNTABILITY

The introduction of differential service in the Internet must necessarily be accompanied by pricing. Pricing requires router support for accounting. The cost and feasibility of accounting support depends on the granularity at which accounting is done. Similar to flow identification, where coming up with a concise definition of a classifier and a way to match the best classifier is hard, a concise definition of an account and a way to identify and bill an account is an open problem.

CONCLUSIONS

IP routers are in the midst of great change, due to both customer pull and technology push. Customers are demanding higher bandwidth, greater reliability, lower cost, greater flexibility, and ease of configuration. Simultaneously, technology, in the form of ATM switching cores and fast route-lookup algorithms, has allowed router vendors to build the next generation of routers. We believe that the advances described in this article, such as the use of ATM cores, better output queuing, advanced scheduling algorithms, avoiding route lookups, and centralized administration, will be the distinguishing features of this generation. While these advances have solved some difficult problems, important issues still remain unresolved. We believe that understanding the stability of a network of routers is a critical open issue. Trading off cost, speed, flexibility, and ease of configuration, as always, will be a challenge for router designers in years to come.

ACKNOWLEDGMENTS

We would like to thank Hemant Kanakia and Ritesh Ahuja for helping us better understand the "real" problems and trends in router design.

REFERENCES

[1] Y. Oie et al., "Survey of Switching Techniques in High-Speed Networks and Their Performance," Proc. IEEE INFOCOM '90, June 1990, pp. 1242-51.
[2] S. Keshav, An Engineering Approach to Computer Networking, Reading, MA: Addison-Wesley, 1997.
[3] "Contribution of Packet Sizes to Packet and Byte Volumes," MCI vBNS, http://www.nlanr.net/NA/Learn/packetsize.html.
[4] "Flow Counts as a Function of Flow Granularity," FIX-West, http://www.nlanr.net/NA/Learn/aggregation.html.
[5] N. McKeown, "Scheduling Algorithms for Input-Queued Cell Switches," Ph.D. thesis, Univ. Calif. Berkeley, May 1995.
[6] N. McKeown, V. Anantharam, and J. Walrand, "Achieving 100% Throughput in an Input-Queued Switch," Proc. IEEE INFOCOM '96, San Francisco, CA, Mar. 1996.
[7] C. Labovitz, G. R. Malan, and F. Jahanian, "Internet Routing Instability," Proc. ACM SIGCOMM '97, Cannes, France, Sept. 1997.
[8] W. R. Stevens, TCP/IP Illustrated, vol. 2, Reading, MA: Addison-Wesley, 1995.
[9] K. Sklower, A Tree-Based Packet Routing Scheme for Berkeley Unix, internal memo, Univ. Calif. Berkeley.
[10] P. Gupta, S. Lin, and N. McKeown, "Routing Lookups in Hardware at Memory Access Speeds," Proc. IEEE INFOCOM '98, Mar. 1998.
[11] A. Brodnik et al., "Small Forwarding Tables for Fast Route Lookups," Proc. ACM SIGCOMM '97, Cannes, France, Sept. 1997.
[12] Online Proceedings of the panel discussion on Better Integration of IP and ATM, IP ATM Workshop, Washington University, St. Louis, MO, Nov. 1996; http://arl.wustl.edu/arl.
[13] M. Waldvogel et al., "Scalable High Speed IP Routing Lookups," Proc. ACM SIGCOMM '97, Cannes, France, Sept. 1997.
[14] S. Floyd and V. Jacobson, "Random Early Detection Gateways for Congestion Avoidance," IEEE/ACM Trans. Networking, Aug. 1993.
[15] A. Demers, S. Keshav, and S. Shenker, "Design and Analysis of a Fair Queuing Algorithm," Proc. ACM SIGCOMM '89, Austin, TX, Sept. 1989.
[16] P. Goyal, "Packet Scheduling Algorithms for Integrated Service Networks," Ph.D. thesis, Univ. Texas, Austin, 1997.
[17] J. C. R. Bennett and H. Zhang, "Hierarchical Packet Fair Queuing Algorithms," Proc. ACM SIGCOMM '96, Stanford, CA, Aug. 1996.
[18] P. Goyal, H. Vin, and H. Chen, "Start-time Fair Queuing: A Scheduling Algorithm for Integrated Services Packet Switching Networks," Proc. ACM SIGCOMM '96, Aug. 1996.
[19] J. C. R. Bennett, D. C. Stephens, and H. Zhang, "High Speed, Scalable, and Accurate Implementation of Fair Queuing Algorithms in ATM Networks," Proc. ICNP '97, 1997.
[20] J. Rexford et al., "A Scalable Architecture for Fair Leaky-Bucket Shaping," Proc. IEEE INFOCOM '97, Kobe, Japan.
[21] OPENSIG Spring '97 Workshop, http://www.ctr.columbia.edu/opensig
[22] Java Telephony Application Programmer's Interface, available from http://java.sun.com/products/jtapi/index.html
[23] D. S. Alexander et al., "Active Bridging," Proc. ACM SIGCOMM '97, Cannes, France, Sept. 1997.
[24] S. Lin and N. McKeown, "A Simulation Study of IP Switching," Proc. ACM SIGCOMM '97, Cannes, France, Sept. 1997.
[25] D. Clark and J. Wroclawski, "An Approach to Service Allocations in the Internet," Internet draft, draft-clark-diff-svc-alloc-00.txt, available from http://www.ietf.org.
[26] S. M. Ballew, Managing IP Networks with Cisco Routers, O'Reilly, 1997.

BIOGRAPHIES

S. KESHAV ([email protected]) is an associate professor of computer science at Cornell University. He was formerly a member of technical staff at AT&T Bell Laboratories, where he worked on ATM network design and traffic management. He is the author of An Engineering Approach to Computer Networking (Addison-Wesley Longman, 1997). He received the Sakrison Prize for the best Ph.D. thesis in the EECS Department at University of California, Berkeley, in 1992, and a Sloan Fellowship in 1997. He is a member of the editorial teams for IEEE/ACM Transactions on Networking and the Journal of High Speed Networks, and has published numerous technical papers. His current research interests are in network management, protocol design, and computer telephony integration.

ROSEN SHARMA completed his B.Tech in computer science from the Indian Institute of Technology, Delhi, in 1993, receiving the President's Gold Medal for academic excellence. He then joined Stanford University to pursue a doctoral degree in networking. His research led to the formation of VxTreme Corporation in December 1995 in Palo Alto, California, of which he was cofounder and lead architect. VxTreme developed client/server applications for multimedia delivery on the Internet. He is currently a Ph.D. candidate at Cornell University. His research interests are in the areas of multimedia and networking.

IEEE Communications Magazine May 1998 151