
IEEE TRANSACTIONS ON COMPONENTS, PACKAGING, AND MANUFACTURING TECHNOLOGY—PART B, VOL. 21, NO. 4, NOVEMBER 1998

Architecture, Defect Tolerance, and Buffer Design for a New ATM Switch

Vijay K. Jain, Senior Member, IEEE, Lei Lin, and Susumu Horiguchi, Senior Member, IEEE

Abstract—This paper presents a modular architecture for a scalable ATM switch. The cell routing function, as well as the associated queueing, is distributed over many small clusters of nodes, called basic modules. These basic modules are hierarchically interconnected to form larger switches. In a basic module, every node is interconnected with adjacent nodes in the same module with three of its four links. The fourth link is used to connect either to an external port or to other basic modules at higher levels of the hierarchy.

From a hardware implementation perspective, the simplicity of the architecture stems from the fact that each node in the switch consists of two small crossbar switches of low complexity, a buffer, and a controller. The hierarchical nature of the topology allows for modular growth of the switch. Further, the interconnection topology of the switch makes it suitable for three-dimensional (3-D) (stacked VLSI) implementation.

Index Terms—Asynchronous transfer mode (ATM), broadband switches, defect tolerance, hierarchical switch, modular architecture, redundancy, routing, tera-switch, 3-D VLSI, yield.

I. INTRODUCTION

THE rapidly increasing customer demand for a wide range of communication services has led to the evolution of the integrated broadband network, which can accommodate diverse bit rates and quality of service (QoS) objectives. Called the Broadband Integrated Services Digital Network (BISDN), it is an all-inclusive network. When fully deployed, users will be able to multiplex low bit-rate traffic with high and ultra-high bit-rate traffic. Services will include mono and multimedia types, point-to-point and multipoint connections, and interactive and noninteractive modes. This new direction of development in telecommunication networks, together with synergistic advances in fiber-optic technology and microelectronics, has resulted in the adoption of a new switching technology, called fast packet technology or asynchronous transfer mode (ATM) [1]–[3]. This switching and multiplexing technology assures the transfer of a wide range of data. As might be expected, current proposals for broadband switches seek to employ a high degree of parallelism, distributed control, and routing at the hardware level [4]–[8].

Manuscript received October 1, 1997; revised August 26, 1998. This work was presented at the Second IEEE International Conference on Innovative Systems in Silicon, Austin, TX, October 1997.
V. K. Jain is with the University of South Florida, Tampa, FL 33620 USA.
L. Lin is with the National Semiconductor Corporation, Santa Clara, CA 95054 USA.
S. Horiguchi is with the School of Information Science, Japan Advanced Institute of Science and Technology, Nomi, Ishikawa 923-12, Japan.
Publisher Item Identifier S 1070-9894(98)08528-4.

Fig. 1. Multiplexed cell stream on a link—an example.

In this paper, we propose a new ATM switch, hereafter referred to as Tera-Switch, which embodies the above design philosophy while attempting to achieve two additional goals. First, it makes implementation in a three-dimensional (3-D) stack feasible. A 3-D implementation has the potential of putting most, if not all, parts of the switch in one package, thereby achieving considerable increases in speed with a concurrent reduction in volume [9]–[11]. Secondly, by using a hierarchical topology, it allows for modular growth. The switch can be extended to a larger size by fully populating the existing highest level or by adding still more levels. Switch capacities up to several terabits/s can be achieved.

II. BRIEF OVERVIEW OF ATM

We begin by remarking that ATM attempts to harness the best features of packet and circuit switching. It resembles circuit switching, inasmuch as a connection must first be established between the users before data communication can take place. On the other hand, it resembles packet switching because small packets, or cells as they are called, are transmitted. This is depicted in Fig. 1, as is also the fact that cells are sent asynchronously, inasmuch as the inter-cell timing from a given source is asynchronous. In a physical link, the cells for a particular connection are multiplexed with those of numerous other connections [1]–[3]. The cell size is standardized to 53 bytes, of which the first five constitute a header that includes a connection identifier. The connection identifier is broken down into two fields, namely the virtual path identifier (VPI) and the virtual channel identifier (VCI), somewhat analogous to a person's name divided into a family name, or last name, and a first name. The fact that many cells with connection identifier 3 appear in Fig. 1 could be indicative of a high-speed video connection, or simply that there is a burst of data occurring on that connection.

Implicit also in Fig. 1 is the fact that ATM allows for integrated services, with a wide variety of bit rates and/or burstiness, and for control over the QoS through a combination of reservation and allocation of bandwidth and dynamic resource management [3].

TABLE I
FAMILY OF TERA-SWITCHES

Note also that, in view of Fig. 1, a link can be thought of as a superpipe with several pipes that can be associated with distinct VP's, each in turn containing subpipes associated with several VC's, all belonging to the same VP. At a switching node, the cell identifier (actually, the connection identifier) together with the identity of the incoming physical link are used to perform routing within the switch, leading the cell to an appropriate output port and also endowing it with a translated identifier. This is shown in Fig. 2 for a hypothetical switch. For simplicity, only a 2 × 2 switch is displayed, whereas in practice the number of input and output ports may be quite large, say 256 each. From a conceptual point of view, the switch performs a mathematical mapping from the pair

(Incoming_link, Cell_identifier) to the pair (Outgoing_link, New_cell_identifier) [1]–[3], as demonstrated in the table. Also for simplicity, the input port processors (IPP's) and the output port processors (OPP's) are not shown for the incoming and outgoing lines. The IPP performs such functions as synchronization of the incoming cells with the cell clock of the switch. Among the functions performed by the OPP, one of the most important is scheduling.
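As a small illustration of this mapping (a sketch only; the table entries and field values below are hypothetical, not taken from the paper), the header translation at a switching node can be modeled as a lookup keyed on the incoming link and connection identifier:

    # Sketch of the (Incoming_link, Cell_identifier) ->
    # (Outgoing_link, New_cell_identifier) mapping described above.
    # The table entries below are illustrative only.
    translation_table = {
        # (incoming_link, cell_identifier): (outgoing_link, new_cell_identifier)
        (0, 3): (1, 7),
        (0, 5): (1, 2),
        (1, 3): (0, 4),
    }

    def route_cell(incoming_link, cell_identifier):
        """Return (outgoing_link, new_cell_identifier) for an arriving cell."""
        try:
            return translation_table[(incoming_link, cell_identifier)]
        except KeyError:
            raise ValueError("no connection established for this cell") from None

    print(route_cell(0, 3))  # -> (1, 7)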

III. SWITCH ARCHITECTURE

In common with other switches, the proposed switch consists of individual switching nodes, their interconnections, and of course input/output (I/O) ports. However, as mentioned earlier, it is distinguished by a multilevel hierarchical structure. At the lowest level of the hierarchy, level one, to which we also refer as the basic module (BM), 16 switching nodes are arranged in a square cluster. As seen from Fig. 3, each node has four links and is a 2 × 2 switch. Three of these links are used to interconnect the 16 nodes to form a basic module. The remaining free link of each node is used either to interconnect the module to external ports or to other basic modules to create a higher level interconnection network.

Fig. 2. A simple 2 × 2 switch and its header/link translation table.

A. Allocation of Free Links

Referring to Fig. 3, a BM has 16 free links. As mentioned above, these links may be used for external port connections or higher level interconnections. For example, a BM all of whose free links are allocated for external port connections will have eight external inputs (EI) and eight external outputs (EO) and is referred to as an 8EI, 8EO BM. Such a BM would have no free links left for the higher level interconnections; therefore, it can only have a Level-1 interconnection. In fact, a family of BM's can be envisioned, parameterized by the number of external connection links. Table I lists this family, while identifying the corresponding highest possible level (L) of interconnections.


Fig. 3. Level-1 (basic module) architecture for ATM switch.

Also, the maximum number of nodes and the number of external input/output port pairs are shown. Entries in the table also give the maximum offered load to the switch in Gigabits per second (Gbps), assuming that the line rate is 155 Megabits per second (Mbps) [2]; similarly, the maximum offered load in Gbps based on a 622 Mbps line rate is given [2]. For the largest switch, the Level-4 switch shown in the rightmost column, the maximum offered load can be as high as 5095.424 Gbps (8192 ports × 622 Mbps), or about 5 terabits per second, hence the name Tera-Switch. The throughput of the switch is

Throughput = (Offered load) × (1 − P_loss)     (1)

where P_loss denotes the cell loss probability. Of course, the offered load is bounded by the corresponding entry in the table (for 155 Mbps or 622 Mbps lines) for the particular member of the family utilized. Results on the loss probability of the switch are presented in Section IV.

B. 2EI, 2EO Basic Module

The BM in Fig. 3 is a 2EI, 2EO basic module, in which four special nodes can be discerned. Nodes 00 and 02 are the external input nodes, and nodes 10 and 12 are the external output nodes. The basic module therefore has 12 free links for higher level interconnection. Since four free links are used up at each level of interconnection, the highest level switch that can be created with this basic module is Level-4. In particular, we may connect 16 basic modules to form a Level-2 network. Fig. 4 shows the architecture of a Level-2 network, where it is seen that the basic modules are connected in a two-dimensional (2-D) torus configuration. Nodes 31 and 32 of the Level-1 network are used for horizontal connection and nodes 20 and 30 for vertical connection. The Level-2 network consists of 256 nodes, which can provide a 32 × 32 switching capability. In a similar manner, one can construct a Level-3 network from Level-2 networks, which can provide a 512 × 512 switching capability. In turn, a Level-4 network can be formed by using 16 Level-3 networks, which would provide an 8192 × 8192 switching capability.
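As a quick check of these counts, the sketch below computes the switch size at each level for 2EI, 2EO basic modules; the function name and structure are ours, but the constants (16 nodes per BM, a 16-fold fan-out per level, two external input/output pairs per BM) come from the text.

    def switch_size(level, ei_per_bm=2):
        """Return (total_nodes, external_ports_per_side) for a Level-`level` switch."""
        num_bms = 16 ** (level - 1)      # basic modules in the whole switch
        total_nodes = 16 * num_bms       # 16 nodes per basic module
        ports = ei_per_bm * num_bms      # external inputs (= external outputs)
        return total_nodes, ports

    for lvl in range(1, 5):
        nodes, ports = switch_size(lvl)
        print(f"Level-{lvl}: {nodes} nodes, {ports} x {ports} switching capability")
    # Level-2 -> 256 nodes, 32 x 32; Level-3 -> 4096 nodes, 512 x 512;
    # Level-4 -> 65536 nodes, 8192 x 8192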

C. Node

As shown in Fig. 5, each node is a 2 × 2 switch with a shared buffer. Basically, incoming cells from the two input links are switched to the two output links. However, if both cells request the same output link, then one of the cells is buffered in the shared buffer for later transmission. Normally, an incoming cell is first stored in one of the buffer slots. A small crossbar with two inputs implements this switching function. During each time slot (cell slot), a buffered cell competes to access an output link and, after winning, the cell exits through a second small crossbar with two outputs. Shared buffering helps minimize the problem of blocking within the switch, while providing somewhat better performance than alternative queuing strategies.

Fig. 4. Level-2 interconnections for ATM switch (using 2EI, 2EO BM's).

Fig. 5. Node architecture for an ATM switch.

Buffered cells which request access to the same output link are allowed to do so on a first-in, first-out (FIFO) basis. The controller part of the node keeps track of the relative arrival times of the cells. Specifically, it assigns a pseudo time stamp (actually a sequence number) to each cell. Assuming that the buffer capacity is 2^n ATM cells, the pseudo time stamp is an n-bit number. Moreover, a one-bit code is required, indicative of the particular output link the cell is requesting to access. Suppose, for example, that the buffer size is four; then the pseudo time stamp sequence numbers are 2-bit numbers (00, 01, 10, 11). The assignment of the pseudo time stamp numbers is in a circular mode, which means that if the last cell was assigned 11, the next cell is assigned 00. Besides assigning the pseudo time stamp numbers, the controller also keeps track of the time stamp of the last cell which left the buffer.

Additionally, the controller can be designed to have the capability of regulating the arrival of cells at a node. In this vein, the controller gives or denies permission to adjacent nodes to send cells, depending on the availability of buffer slots. This is accomplished by issuing a GRANT-IN signal only when there is buffer space available to accommodate more cells. If the buffer is full, the node denies the GRANT-IN signals to both neighbors which are connected to its input links. In this scenario, cell loss can be completely eliminated at the output and routing nodes, thus occurring only at the external input nodes.
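The bookkeeping described above, circular pseudo time stamps, FIFO service per output link, and GRANT-IN withheld when the buffer is full, can be summarized in a small sketch. The class shape and method names are ours, and the two per-output queues are a simplification of the single shared buffer with a one-bit output code, so treat it as illustrative only.

    from collections import deque

    class NodeController:
        """Sketch of a node controller: circular pseudo time stamps, per-output
        FIFO service, and GRANT-IN withheld when the shared buffer is full."""

        def __init__(self, capacity=4):
            self.capacity = capacity                 # buffer size in cells (2**n)
            self.next_stamp = 0                      # next n-bit pseudo time stamp
            self.queues = {0: deque(), 1: deque()}   # cells waiting for output link 0 or 1

        def grant_in(self):
            """True if a buffer slot is free, i.e., neighbors may send a cell."""
            return sum(len(q) for q in self.queues.values()) < self.capacity

        def accept(self, cell, output_link):
            """Store an arriving cell, tagging it with a circular sequence number."""
            if not self.grant_in():
                return False                         # buffer full: GRANT-IN denied
            self.queues[output_link].append((self.next_stamp, cell))
            self.next_stamp = (self.next_stamp + 1) % self.capacity  # wraps 11 -> 00
            return True

        def serve(self, output_link):
            """Release the oldest cell waiting for the requested output link."""
            queue = self.queues[output_link]
            return queue.popleft()[1] if queue else None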

As may be clear by now, nodes fall into one of three categories:

1) input nodes;
2) output nodes;
3) routing nodes.

An input node has one of its input links connected to an input port processor (IPP). The input port processor acts as an interface to connect the node to an external input link. One of its primary functions is to convert the incoming ATM cell format to the switch network format and to provide the (switch) routing information within these converted cells. An output node, on the other hand, has one of its output links connected to an output port processor (OPP). An output port processor reverses the format of the cell to the standard ATM format before transmission on to the output link. For the routing nodes, one of the input or output links may be used to provide connection to higher levels of the network.

D. Addressing and Routing

Base-4 digits are used for convenience of address representation. As shown in Fig. 3, nodes in a basic module are addressed by two digits, the first representing the row index and the next representing the column index. More generally, in a Level-L network, the node address consists of 2L base-4 digits. Here, the total number of digits is 2L, and pairs of digits run from pair number L, for Level-L, down to pair number 1 for the first level, i.e., the basic module. Specifically, the ith pair (for i > 1) indicates the position of the Level-(i−1) subnetwork, within the Level-i network, to which the node belongs, while the first pair identifies the node within its basic module. In a two-level network, for example, the address becomes a four-digit string: the first pair of digits identifies the BM (that the node belongs to) and the last pair of digits identifies the node within that BM.

Routing of cells in the switch is performed from the top level to the bottom level. Thus, it is commenced at the highest level network; then, after the packet reaches its highest level sub-destination, routing is continued within that subnetwork to the next lower level sub-destination. This process is repeated until the packet arrives at its final destination. When a cell is issued from a source node, the node checks its destination. If the cell is destined to the current BM, routing is performed within that BM only. On the other hand, if it is addressed to another BM, the source node sends the cell to the outlet node which connects the BM to the level at which routing should be performed. Consider that a cell is to be transported from source node 000000 to destination node 231121. In this case, the routing must first be performed at Level-3; therefore, the source node sends the cell to the Level-3 outlet node 000001, whereupon the cell is routed at Level-3. After the cell reaches the 2,3 Level-2 network, routing within that network is carried out until the cell reaches its BM 1,1. Finally, the cell is routed to its destination node 2,1 within that BM.
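To make the addressing concrete, the short sketch below parses a 2L-digit base-4 address into its per-level pairs and finds the level at which routing must begin, namely the highest level whose pair differs between source and destination; the function names are ours.

    def address_pairs(address):
        """Split a 2L-digit base-4 node address into (row, col) pairs,
        ordered from the highest level (leftmost pair) down to the BM (rightmost)."""
        assert len(address) % 2 == 0
        return [(int(address[i]), int(address[i + 1])) for i in range(0, len(address), 2)]

    def routing_level(src, dst):
        """Highest level whose address pair differs between source and destination."""
        levels = len(src) // 2
        for i, (s, d) in enumerate(zip(address_pairs(src), address_pairs(dst))):
            if s != d:
                return levels - i     # leftmost pair corresponds to the highest level
        return 1                      # no higher-level routing needed; stay within the BM

    # Example from the text: 000000 -> 231121 must first be routed at Level-3.
    print(routing_level("000000", "231121"))  # -> 3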

In general, multiple paths exist for routing a cell in the switch. We are, however, considering only a deterministic routing strategy in which the source and destination node addresses are sufficient to uniquely determine the path traced by a cell. With this strategy, routing can be performed by means of a routing tag. In the interest of brevity, we give here the details of only the routing within a basic module.

BM Routing Algorithm
Comment: from source node … to destination node …
If … is odd,
    If …, climb staircase until row … is reached; endif
endif;
Comment: Note that a staircase starts with a vertical movement and also ends with a vertical movement.

Note that the short-hand notation … means that a horizontal routing is to be performed until the column … is reached.

IV. SIMULATIONS AND BUFFER DESIGN

We present the results of Monte Carlo simulations upon a Level-2 64 × 64 switch with 4EI, 4EO BM's. The objective is to measure the cell loss rate and average delay. Fig. 6 displays the cell loss rate of the switch versus link utilization for different buffer sizes. The traffic distribution was assumed as follows: the probability of a node sending a cell within the BM was taken to be 0.6; the probability of a node sending a cell to the four neighboring BM's was taken to be 0.3; and the combined probability of sending a cell to the remaining 11 BM's was taken to be 0.1. It should also be remarked that the resulting data were smoothed using polynomial interpolation, and extrapolated to very low offered loads. In the figure on the left, the internal link speed is assumed to be 1/4 of the line rate, and in the figure on the right, the internal link speed is assumed to be 1/2 of the line rate, both with byte-wide internal links. As expected, the cell loss rate decreases with an increase in buffer size. For a line utilization of 0.75, the cell loss rate is seen to be on the order of … with a buffer size of 16 for this case of internal speed.1 This appears to be a good choice. Note that for this 64 × 64 switch, the total throughput is 9.92 Gbps at a line speed of 155 Mbps, and 39.808 Gbps at a line speed of 622 Mbps. The buffer memory needed per node (recalling that each cell comprises 53 bytes) is

16 cells × 53 bytes/cell = 848 bytes.

Fig. 6. Cell loss probability versus offered load, for various buffer sizes.

Fig. 7 shows the average delay incurred in the switch for various buffer sizes. As seen from the graph, the average delay contributed by the switch is small. In fact, it remains in the range of 2–9 cell cycles as the link utilization spans the interval 0.2–0.8. Another interesting observation is that the average delay is somewhat invariant as the buffer size increases from 4 to 32 cells, perhaps indicating that the tail probability effect has set in at a buffer size of four cells.
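For reference, the destination traffic mix assumed in these simulations can be reproduced with a sketch such as the one below; only the 0.6/0.3/0.1 split and the neighbor counts come from the text, while the function shape is ours.

    import random

    def pick_destination_bm(src_bm, neighbors, others):
        """Choose a destination BM for a generated cell using the traffic mix
        assumed in the simulations: probability 0.6 of staying within the source
        BM, 0.3 spread over the four neighboring BM's, and 0.1 spread over the
        remaining eleven BM's of the Level-2 network."""
        r = random.random()
        if r < 0.6:
            return src_bm                      # stays within the source BM
        elif r < 0.9:
            return random.choice(neighbors)    # one of the 4 torus neighbors
        else:
            return random.choice(others)       # one of the remaining 11 BM's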

A. Comparison with Other Architectures

While the switch provides performance comparable to that of other architectures, it requires less wiring. This attribute, in fact, makes our switch a better candidate for a 3-D stack implementation. Although the next section briefly examines wiring considerations for the 3-D geometry, a more detailed discussion will be presented elsewhere.

V. 3-D IMPLEMENTATION

As mentioned earlier, an important consideration in the design of the new switch is the feasibility of 3-D implementation [9]–[11]. To be specific, let us consider a Level-3 switch with 4096 nodes. For this network, the designer may implement each of the 16 Level-2 networks on a single large-area device. Let us call this device a plane.

1 The corresponding internal "bit" clock rates for the two different line speeds are 77.5 and 311 Mbps.

Fig. 7. Delay versus offered load, for various buffer sizes.

Then 16 such planes can be used to build a stack, as shown in Fig. 8, with a total of 4096 nodes.2 Consider that the node has an effective area of 0.8 mm × 0.8 mm and a tile area of 1.0 mm × 1.0 mm in 0.25 µm CMOS technology. The total area required in the plane is then 1.6 cm × 1.6 cm. Clearly, a large VLSI chip can support this subnetwork and its associated control and power wiring. Sixteen such planes are therefore required to build a stack with 4096 nodes. Given below is an estimate of the vertical wiring needed to interconnect the planes.

Link width: 8 bits (byte-wide)
Number of basic modules in each Level-2 network: 16
Peak number of Level-3 links per BM: 10
Number of vertical links: 16 × 10 = 160
Number of vertical wires: 160 × 8 = 1280

For stacked planes, the vertical wiring pitch on an area basis is approximately 150 µm × 150 µm, somewhat reduced compared to [9], [10] due to technological advances. Thus, we compute the area needed for the vertical connections to be 0.0225 mm² × 160 = 3.6 mm² per bit of link width. Thus, for byte-wide communication, the area needed for vertical connections is about 0.3 cm². Clearly, the chip (plane) described earlier can readily support the vertical wiring needed for the vertical Level-3 connections.
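For reference, the arithmetic behind this estimate can be laid out as below (the 160-link and 1280-wire totals are the implied products; everything else is taken from the values quoted above):

    # Sketch of the vertical-wiring estimate above.
    link_width_bits     = 8                  # byte-wide internal links
    bms_per_level2      = 16                 # basic modules per Level-2 plane
    level3_links_per_bm = 10                 # peak Level-3 links per BM
    pitch_area_mm2      = 0.150 * 0.150      # 150 um x 150 um vertical wiring pitch

    vertical_links = bms_per_level2 * level3_links_per_bm      # 160
    vertical_wires = vertical_links * link_width_bits          # 1280
    area_mm2 = vertical_wires * pitch_area_mm2                 # ~28.8 mm^2
    print(vertical_links, vertical_wires, round(area_mm2 / 100, 2), "cm^2")  # ~0.29 cm^2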

An important advantage of 3-D implementation is the reduction of the length of the interconnects. The longest wires in a planar Level-3 network are the wrap-around wires which interconnect the physically farthest Level-2 subnetworks. The length of these longest interconnects is 64 × (width of a tile) = 64 × 1.0 mm = 6.4 cm. However, in the 3-D implementation of the Level-3 network described above, these long wires run vertically, and the corresponding vertical wire has a length of 16t, where t is the thickness of a chip plus the length of the microbridge between chips. The thickness of the chip is about 0.02 in and the length of the microbridge is 0.002 in [9]. Thus, t is about 0.056 cm and the longest vertical wire has a length of about 0.9 cm. Clearly, in the 3-D implementation the limiting wires are the horizontal wires within the Level-2 subnetworks. In fact, the longest such wire has a length of 16 × 1.0 mm = 1.6 cm which, when compared to 6.4 cm, gives rise to a factor of four improvement over the planar implementation. Although not discussed here, the vertical wiring for the proposed architecture is considerably less than that for other alternative switch fabrics, e.g., the k-ary n-cube [15], [12]. This is attested by Fig. 3 of [12], depicting the peak number of vertical wires, which holds for the present architecture as well. While we have employed the 3-D technology of [9], [10] in the discussion above, another alternative is to use the 3-D technology of [11].

2 For simplicity of presentation, in this subsection we will ignore the redundancy provided at various levels.

Fig. 8. Three-dimensional realization of a 4096-node ATM switch.
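The wire-length comparison can be tallied directly from the numbers above (a small check, with the inch-to-centimeter conversion made explicit):

    # Longest-interconnect comparison (t = chip thickness + microbridge length).
    tile_mm = 1.0
    t_cm = (0.02 + 0.002) * 2.54          # ~0.056 cm per plane
    planar_wrap_cm   = 64 * tile_mm / 10  # 6.4 cm in a planar Level-3 layout
    vertical_wrap_cm = 16 * t_cm          # ~0.9 cm through the 16-plane stack
    level2_horiz_cm  = 16 * tile_mm / 10  # 1.6 cm, the limiting wire in 3-D
    print(planar_wrap_cm, round(vertical_wrap_cm, 2), level2_horiz_cm)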

VI. DEFECT TOLERANCE

Defect tolerance is essential for all large-area devices in order to achieve satisfactory yield. The proposed ATM switch is no exception. However, due to its hierarchical structure, defect tolerance is facilitated considerably for this architecture. Redundancy can be provided at each level; correspondingly, reconfiguration can also be performed at each level. This is depicted in Fig. 9. Each BM is provided with one spare column of nodes. Thus, the physical BM consists of 5 × 4 nodes while the target (logical) BM consists of 4 × 4 nodes. This represents a 25% redundancy. We assume that enough wiring resources exist between the nodes so that reconfiguration can be performed successfully if 16 or more of the 20 nodes are healthy. An example reconfiguration is illustrated in Fig. 10, which represents a worst-case situation inasmuch as there are four cluster-defect nodes. Note that the input/output ports of this 2EI, 2EO BM are shown by double lines. Automatic reconfiguration has been dealt with for a similar (hierarchical) network in [13], and can be developed for this switch network as well. Although quite effective, it is clear that a large number of switches are used in this illustration. However, with the continuing improvements in defect densities, a nonredundant BM, or one with much reduced switch resources, should become practical in the near future. This would also be facilitated by the ongoing reductions in feature size. Then, the reconfiguration would only be needed at the second or higher levels. The three-track wiring channel width may be estimated as 3 tracks × 8 b × (6 × 0.25 µm pitch) = 0.036 mm. Allowing for switches and for clock and power, we estimate the channel width to be 0.2 mm. If hard restructuring is used, e.g., through laser linking and cutting, then the inter-node delay would be small. Soft switches, on the other hand, can introduce significant delay, on the order of a bit interval (for the longest path) for the lower line rate or a few bit intervals for the higher rate. However, this is not highly critical since, unlike a multiprocessor computing structure, here there are no data dependencies. Further, the ATM cells are resynchronized to the network clock at the output port processor.

Fig. 9. Hierarchical redundancy.

With a node tile area of A = 1.0 mm × 1.0 mm = 0.01 cm², and assuming a Poisson defect distribution, the node yield may be estimated as

Y_node = exp(−A·D)     (2)

where D denotes the defect density per cm². The area assumed for the tile is reasonable in 0.25 µm CMOS, based upon preliminary estimates. Also note that a 40 Gbps ATM switch chip was recently announced in 0.25 µm CMOS with a total area of 1.4 cm² [14]. Then the yield of the basic module is

given by

Y_BM = Σ (k = 16 to 20) C(20, k) · Y_node^k · (1 − Y_node)^(20−k)     (3)

that is, the probability that at least 16 of the 20 physical nodes are healthy.

Fig. 10. Reconfiguration of a BM in the presence of defective nodes.

In a similar manner, the yield for the second level is estimated for various defect densities. As seen in Fig. 9, the physical second-level network consists of 5 × 4 BM's, i.e., a total of twenty BM's. Only 16 are needed for a functional second-level network. This represents, again, a 25% redundancy (at the second level). Using an approach somewhat similar to that in (3), the yield for Level-2 is estimated. The yield estimates are plotted in Fig. 11 and are seen to be reasonable.
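These yield estimates can be reproduced with a short calculation. The sketch below assumes the Poisson node yield of (2), the 16-of-20 binomial form reconstructed in (3), and the same 16-of-20 rule at Level-2; the defect densities are illustrative, so the printed numbers should not be read as the authors' exact figures.

    from math import comb, exp

    def node_yield(defect_density, tile_area_cm2=0.01):
        """Eq. (2): Poisson yield of a single node tile."""
        return exp(-tile_area_cm2 * defect_density)

    def redundant_yield(unit_yield, total=20, needed=16):
        """Eq. (3)-style yield: at least `needed` of `total` units must be healthy."""
        return sum(comb(total, k) * unit_yield**k * (1 - unit_yield)**(total - k)
                   for k in range(needed, total + 1))

    for D in (0.5, 1.0, 2.0):                 # defects per cm^2 (illustrative)
        y_node = node_yield(D)
        y_bm   = redundant_yield(y_node)      # basic module: 16 of 20 nodes
        y_l2   = redundant_yield(y_bm)        # Level-2: 16 of 20 BM's
        print(f"D={D}: node={y_node:.4f}  BM={y_bm:.4f}  Level-2={y_l2:.4f}")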

VII. CONCLUSION

We have presented a new hierarchical ATM switch. The motivation for this new architecture arose from a desire to achieve throughputs ranging from tens of Gbps to several terabits per second, and to allow modular growth. To achieve higher performance, a node structure with shared buffering, wherein an incoming cell from any input can be stored in any available buffer slot, was used. Simulation studies on the performance of the switch based on a single-path strategy were performed. These studies showed that a cell loss probability on the order of 10^−… and a delay of about 8 cell cycles can be achieved with a buffer size of 16 even at high loads (specifically, for a 75% fill-factor of incoming traffic). Several important issues remain to be explored. On the architectural side, these include copying and multicasting. On the manufacturing side, testing and packaging of this family of large switches pose new challenges for the microelectronics community.

Fig. 11. Yield estimates for Level-1 and Level-2 ATM switches.

REFERENCES

[1] M. de Prycker, Asynchronous Transfer Mode. New York: Prentice-Hall, 1995.
[2] R. Handel, M. Huber, and S. Schroder, ATM Networks: Concepts, Protocols, Applications. New York: Addison-Wesley, 1994.
[3] A. Acampora, Introduction to Broadband Networks. New York: Plenum, 1993.
[4] Special issue on "Large-scale ATM switching systems for B-ISDN," IEEE J. Select. Areas Commun., vol. 9, Oct. 1991.
[5] H. Ahmadi and W. E. Denzel, "A survey of modern high-performance switching techniques," IEEE J. Select. Areas Commun., vol. 7, pp. 1091–1103, Sept. 1989.
[6] P. Newman, "A fast packet switch for the integrated services backbone network," IEEE J. Select. Areas Commun., vol. 6, pp. 1468–1479, Dec. 1988.
[7] M. G. Hluchyj and M. J. Karol, "Queuing in high-performance packet switching," IEEE J. Select. Areas Commun., vol. 6, pp. 1587–1597, Dec. 1988.
[8] N. Mirfakhraei, "A wafer-scale ATM switching system based on the Manhattan-Street network," in Proc. IEEE Int. Conf. Wafer-Scale Integr., San Francisco, CA, Jan. 1995, pp. 182–190.
[9] M. J. Little, R. D. Etchells, J. Grinberg, S. P. Laub, J. G. Nash, and M. W. Yung, in Proc. Int. Conf. Wafer Scale Integr., 3-D Comput., 1989, pp. 55–64.
[10] M. L. Campbell and S. T. Toborg, "3-D wafer stack neurocomputing," in Proc. Int. Conf. Wafer Scale Integr., 1993, pp. 67–74.
[11] H. Kurino, T. Matsumoto, K.-H. Yu, N. Miyakawa, H. Itani, H. Tsukamoto, and M. Koyanagi, "Three-dimensional integration technology for real time micro-vision systems," in Proc. Int. Conf. Innovative Syst. Silicon, Oct. 1997, pp. 203–212.
[12] V. K. Jain, T. Ghirmai, and S. Horiguchi, "TESH: A new hierarchical interconnection network for massively parallel computing," IEICE Trans. Inform. Syst., vol. E80-D, pp. 837–846, Sept. 1997.
[13] B. M. Maziarz and V. K. Jain, "Automatic reconfiguration and yield of the TESH multicomputer network: Theory, algorithm and software," submitted for publication.
[14] Y. Ohtomo et al., "A 40 Gbps 8 × 8 ATM switch LSI using 0.25 µm CMOS/SIMOX," in Proc. IEEE Int. Solid State Circuits Conf., San Francisco, CA, Feb. 1997, pp. 154–155.
[15] W. J. Dally, "Performance analysis of k-ary n-cube interconnection networks," IEEE Trans. Comput., vol. 39, pp. 775–785, 1990.

Vijay K. Jain (M’65–SM’74) received the Ph.D.degree in electrical engineering from Michigan StateUniversity, East Lansing, in 1964.

He is Professor of Electrical Engineering at the University of South Florida, Tampa. He was the Group Leader of the Architecture, Applications, and Design Group, DARPA Project on Microelectronics, University of South Florida. He has also served as the Task Leader on an effort on a "Fully Parallel FFT Processor." His research interests include parallel processing systems, interconnection networks, VLSI architectures and design, high-speed signal and image processing systems, and broadband digital communications.

Lei Lin received the B.S.E.E. and M.S.E.E. degrees from Zhejiang University, China, in 1987 and 1989, respectively, and the Ph.D. degree from the University of South Florida, Tampa, in 1997.

He is a Senior Designer at the National Semiconductor Corporation, Santa Clara, CA. His research interests include VLSI algorithms and design for image processing and high-speed communications.

Susumu Horiguchi (S’79–M’81–SM’95) receivedthe B.S. degree in communication engineering andthe M.S. and Ph.D. degrees from Tohoku University,Japan, in 1976, 1978, and 1981, respectively.

He was with the faculty of the Department of Information Science, Tohoku University, from 1981 to 1992. He was a Visiting Scientist with the IBM Thomas J. Watson Research Center, Yorktown Heights, NY, from 1986 to 1987, and a Visiting Professor of the Center for Advanced Studies, University of Southwestern Louisiana, Lafayette, in 1994. He is a Full Professor in the Graduate School of Information Science, Japan Advanced Institute of Science and Technology, where he chairs the Multimedia Integral System Laboratory. His research interests have been mainly concerned with interconnection networks, parallel computing algorithms, massively parallel processing, parallel computer architecture, VLSI/WSI architecture, and multimedia integral systems.

Dr. Horiguchi is a member of IEICE and IPS of Japan. He has been involved in organizing many international workshops, symposia, and conferences sponsored by IEEE, IEICE, and IPS.