Upload
dorothy-gallagher
View
221
Download
0
Tags:
Embed Size (px)
Citation preview
Stage-Distributed Time-Division Permutation Routing in a Multistage Optically Interconnected Switching Fabric Alvaro Cassinelli(1), Makoto Naruse (2), Masatoshi Ishikawa(1)
1: University of Tokyo, Dept. Information Physics and Computing, 7-3-1 Hongo Bunkyo-ku, Tokyo 113-0033, Japan. ([email protected])2: Communications Research Laboratory, 4-2-1 Nukui-kita, Koganei, Tokyo 184-8795, Japan.
Mechanically Reconfigurable wave-guide-based interconnections
Introduction Column-controlled buffered MIN architecturePlane-to-plane guide-wave based optical interconnections
Bandwidth, cross-talk and volume consumption considerations, all advocate for plane-to-plane optical interconnects within massively interconnected electronic system as a promising substitute to conventional electronic circuitry in the near future.
Multistage Interconnection Networks (MINs) are capable of supporting arbitrary input-output interconnections, and have been long proposed as an interesting alternative to the full crossbar and the time-shared bus (in terms of performance, cost and scalability) for use as permutation networks for multiprocessors but also as point-to-point switching fabrics for telecom applications.
As a consequence, multistage architectures using optical plane-to-plane regular interconnections may well represent a theoretically optimal architecture for use within a massively interconnected systems.
The work presented here develops on some fundamental aspects of such optical multistage architecture:
Transparent column-controlled Multistage Network
Possible implementation of an electro-optical bi-permutation module using stacked layers and electro-optically controlled normal coupling between the layers.
Performance analysis of a buffered GSMIN
Conclusions and further research
Column-control certainly reduces overall interconnection capacity, but if things are properly designed, the MIN architecture can still accommodate communication primitives of most static interconnection networks. The interest of such an arrangement lies in the ease of control and straightforward implementation. We presented here preliminary experimental results demonstrating a simple optical architecture using cascaded fiber-based bi-permutation modules. An electro-mechanical system has been developed providing stage-switching times on the order of milliseconds, making this architecture suitable for reconfigurable, high-bandwidth inter-processor communications. Further research on this topic include the feasibility study of dense guided-wave multi-permutation modules by stacking layers of planar lightwave circuits, as well as the introduction of electro-optical switching.
6
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Input request probability per unit time ()
Pro
ba
bil
ity
of
pa
ck
et
ac
ce
pta
nc
e
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
00
1
2
3
4
5
1
2
3
4
65
0
crossbar
Depth Analysis = Buffer Size
Depth Transfer = Buffer Size
SEMIN
GSMIN
Bu
ffer s
ize
6
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Input request probability per unit time ()
Pro
ba
bil
ity
of
pa
ck
et
ac
ce
pta
nc
e
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
00
1
2
3
4
5
crossbar
Size: 128 x 128
GSMIN (alternate)GSMIN (forced alternate)
Bu
ffer s
ize
6
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Input request probability per unit time ()
Pro
ba
bil
ity
of
pa
ck
et
ac
ce
pta
nc
e
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
00
1
2
3
4
5
1
2
3
4
65
0
crossbar
SEMIN
GSMINSize: 128 x 128
SEMIN (alternate)SEMIN/GSMIN (forced alternate)
Bu
ffer s
ize
“Fair” selection mode (the state of the global-switch is chosen randomly if the poll is a draw)
Time slot
Permutation appearance period
time
Inte
rco
nn
ect
1
Inte
rco
nn
ect
2
Inte
rco
nn
ect
3
Inte
rco
nn
ect
N
Inte
rco
nn
ect
1
Inte
rco
nn
ect
2
Inte
rco
nn
ect
3
Inte
rco
nn
ect
N
Inte
rco
nn
ect
1
Inte
rco
nn
ect
2
Inte
rco
nn
ect
3
Inte
rco
nn
ect
N
(B)
(A) (C)
A small mechanical (or optical ) perturbation produces a drastic change of the interconnection pattern from input to output.
• better efficiency (than holograms) for long-range interconnections.
• no cross-talk in 3D, just like free-space optics.
• No space-invariance required.
• Theoretically more volume efficient than free-space
• Precise and robust alignment possible…
• multiple interleaved permutations possible.
Fiber-based Modules vs. Free-Space
• Maybe “hard” to build? Boring, but not a fundamentally difficult (can be automated, can be done by “layers”, see below).
• Alignment of both output and input needed.
• Power dissipation may be a fundamental limitation, but we are far from these limits.
Routing Strategy
Conflicts are not resolved individually at the switch level, but globally at the stage level by a “tournament” between all the incoming requests to that particular stage. Provided that these request are uniformly directed to any possible output, "votes" leading to the adoption of one of the two possible states of the global-switch will be evenly distributed. Such behaviour takes place for all stages of the network, so that at each stage, half of the requests will be dropped and half will be able to pass to the next stage. This means there is an enormous number of discarded packets, certainly much bigger than that occurring by internal blocking in a standard SEMIN. However, if one considers a buffered architecture, then presumably there will be no need to provide it with a large buffer memory, because the packets that have been retained in the buffers are very likely to go forward in the following tournament.
Figures represent performance (normalized amount of satisfied requests) as a function of the input load (computed as the probability of a request being issued at any input per unit time) for the MIN with stage-distributed switching, and for a standard Shuffle Exchange MIN with independent control of switches (both 128x128 large networks). The input traffic follows the uniform request model.
Individual control of switches as well as arbitration may be unnecessary on a standard SEMIN if the buffer size is larger than four. Also, when the buffer size is equal to three, the 128x128 GSMIN already outperforms a 128x128 full-crossbar.
Why common control of switches per switching stage (column-control)?
A column-controlled MIN can be an efficient (circuit-switched) permutation network.
“indirect” implementation obtained by electrically joining individual switches (common control lines a, b and c) of a Shuffle-Exchange network (here the Inverse Baseline)
A more “direct” Implementation of a column-controlled Multistage Network using multi-permutation switching modules (the GSMIN network)
L2 E2
I4
switching region
inputcross
straight
In the case of “column/row-decomposable” permutations (see left), which happen to be the ones required in most regular interconnected MINs, 2D interconnection modules can be easily implemented by stacking layers of printed lightwave circuits containing 1D permutations (see right).
(A) Guide-wave based interconnection modules
While many demonstrator have been built to illustrate the advantages of free-space optics over electronics for dense plane-to-plane interconnections, there has been relatively little research on the use of two-dimensional wave-guide-based interconnections, (or guide-wave based interconnection modules). Yet, these can easily achieve better transmission efficiency than holographic-based interconnections while almost completely cancelling cross-talk, and contrary to the common belief they may be more volume efficient than free-space optics for both space-invariant and space-variant interconnects.
(B) Transparent column-controlled Optical Multistage Network
The use of “modular” inter-stage interconnections leads naturally to consider a new paradigm for the optical MIN architecture: interconnection modules containing several interleaved global interconnections (i.e. permutations). Addressing of the required permutation can be done by mechanical displacement of the whole multi-permutation module.
Cascading such modules without intermediate optoelectronic arrays gives a transparent ”globally-stage switched” multistage interconnection network (GSMIN) that can be used as a circuit-switched permutation network for multiprocessor communications (this architecture represents an original optical implementation of a column-controlled shuffle-exchange MIN).
(C) Buffered column-controlled Optical Multistage Network
Column-controlled unbuffered MINs have been extensively studied for circuit-switched applications. We found that an inter-stage buffered column-controlled MIN architecture may also represent an interesting alternative to the well-known Shuffle-Exchange MINs (SEMINs) for point-to-point packet-routed communications, both from the point of view of its implementation complexity and simplicity of routing protocol.
Recently, we tested this approach by implementing several 4x4 fiber-based modules each integrating a
different fixed topology.
Automatic alignment of modules, both dynamically and statically (pre-aligned ”plug-and-play”
exchangeable blocks), is a critical issue now being studied.
(b)(a)
input plane
0
1
2
3
4
5
6
8
output p
lane
“Brute” folding and row/column decomposed folding of the perfect shuffle inter-stage permutation
Stacking planar lightwave circuits to produce 2D guided-wave interconnection modules
Experimental results
Stacking layers
Multistage architecture
parallel computers
switching networks
Dense optical interconnect:
interconnection folded in 2D
Optical Multistage Architecture Paradigm
0000000100100011010001010110
011110001001101010111100110111101111
Inp
ut
Ou
tpu
t
E(1) E(1) E(1) E(1)0000000100100011010001010110011110001001101010111100110111101111
(4) (4) (4)(4)
Shuffle interconnectionExchange switch
0000000100100011010001010110
011110001001101010111100110111101111
Inp
ut
Ou
tpu
t
E(1) E(1) E(1) E(1)0000000100100011010001010110011110001001101010111100110111101111
(4) (4) (4)(4)
Shuffle interconnectionExchange switch
+
(2)
…an optical “3D optical wiring” module between
2D VLSI arrays.
networkinterconnection modules
Processor arrays
interconnection modules
arrays
networkinterconnection modules
Processor arrays
interconnection modules
arrays
Multi-permutation interconnection module paradigm
output
input
3D bi-permutation module built by stacking planar
lightwave circuits
output
input
3D bi-permutation module built by stacking planar
lightwave circuits
actuators
inputs
outputsactuators
inputs
outputs
Input(VCSEL array)
Output
C1 or IdC2 or Id
Id . Id E1 . E2Id . C2 C1 . IdId . Id E1 . E2Id . C2 C1 . Id
c3
c4
c2
c1
c2
c1
c 3 c4
…topology mapped on a plane (optical interconnects, VLSI integration)
four-dimensional hypercube connected multiprocessor…
{c2, id}{c1, id}
{c3, id}{c4, id}
Four-dimensional “spanned” hypercube network (16 nodes) using four bi-permutation modules, each providing a cube permutation and the identity.
A total of 24=16 global permutations for the whole network.
First two modules of a spanned version of a 4-dimensional hypercube were fabricated using interleaved optical fibers, and the resulting four possible interconnections observed.
Spanned hypercube using Bi-permutation modules
Guide-wave based multi-permutation interconnection module
Mechanically reconfigurable multi-stage architecture based on cascading multi-permutation modules
Channels are multi mode fibers: MFD = 9.5 mGrad diam. 125 m NA: 0.1
{c2, id}input output
Output (to CCD)
Input(from VCSEL array)
Exit first module
Input second module
Displacement stages
Prototype using interleaved fibers
Experience with two cascaded modules:
Displacement pitch for commutation: 125 m
Alignment tolerance: 5 m (half peak power).
Inter-module Coupling Efficiency: 1.7dB (no additional optics, matching oil or antireflection coating).
Validation of simple cascaded architecture
Electro-mechanical switching device
The switching speed seems to be limited to the millisecond range (resonant frequency).
Micro electro-mechanical actuators (MEMS) may also be an interesting alternative when switching latency in the millisecond range is tolerable.
This electro-magnetic actuated device can position the module
in both X and Y directions.
Non-integrated bi-permutation module:
Only the older packets waiting on the queues are given attention in the selection phase, and that only these packets are candidates to be transferred during the transfer phase. 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Input request probability per unit time ()
Pro
ba
bil
ity
of
pa
ck
et
ac
ce
pta
nc
e
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0
Depth Analysis = 1
Depth Transfer = 1
SEMIN
GSMIN
crossbar
Bu
ffer s
ize
3
4
6
Depth Analysis = Buffer Size
Depth Transfer = Buffer Size
0
crossbar
0
1
2
3
4
56
012
5
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Input request probability per unit time ()
Pro
ba
bil
ity
of
pa
ck
et
ac
ce
pta
nc
e
SEMIN
GSMIN
Bu
ffer s
ize
“Alternate” selection mode (the state of the global-switch changes if the poll is even)
“Forced-alternate” selection mode (the state of the global-switch changes periodically)Bi-permutation module (integrates switching and
permutation stage)Column-controlled Shuffle-Exchange stage
a b ca b c c b ac b a
Despite the reduced permutation capacity of a column-controlled MIN, a permutation network can emulate (by time-multiplexing) a full-crossbar provided that the reconfiguration latency remains small with respect to the interconnection request rate, and that the bandwidth is sufficiently high to accommodate in a single time slot all the data to be transferred. An unbuffered column-controlled MIN may provide enough optical bandwidth - but the reconfiguration time may be still too slow (using thermo-optical switches for instance).
A clear improvement in performance is seen in the standard SEMIN for buffer
sizes larger than one.
This performance may exceed that of the GSMIN, but it seems to increase slowly
(with buffer size), so that when the buffer size is equal to five (or four in the case of
a 64×64 network), both networks show roughly the same performance.
From the routing point of view, simulations showed that a buffered column-controlled MIN would not require excessive buffer size to achieve respectable performances (under uniform traffic). The path-selection mechanism can be further reduced to the simple alternation of the available stage permutations, without degrading the performance. Under such stage-distributed time-division permutation multiplexing, the SEMIN and GSMIN fabrics become strictly equivalent routing architectures; hence (provided that buffer size is chosen to be larger than three) this analysis-free strategy will provide a very simple arbitration mechanism for standard SEMIN networks. This is an interesting result on its own.
There is no significant change in performance between a buffered GSMIN with "alternate" selection and a buffered GSMIN with "forced alternate" selection. A continual "blind" alternation of switching states performs just as well.
When a standard SEMIN and a column-control SEMIN (GSMIN) are operated in forced alternate mode, they become strictly equivalent architectures. Therefore, when a standard SEMIN is operated in forced alternate mode, its performance roughly degrades to that of an alternate operated GSMIN; on the other hand, when a GSMIN is operated in forced alternate mode, the performance remains substantially the same.
A multistage ”spanned” version of most direct network topologies (hypercube, cube-connected-cycles, etc.) can be implemented as an unbuffered column-controlled MIN architecture.
A time-division multiplexing (TDM) technique can be used to select the interconnections at each stage.
A column-controlled MIN can emulate a full-crossbar by time-multiplexing
Simplicity and Scalability: A whole switching stage needs a unique control signal.
Can have a “direct” optical implementation: the switching stage and its adjacent inter-stage permutation can be physically merged into a unique “permutation module" providing two different interconnection patterns (bi-permutation switching module).
=
arbitration
Total length of buffer
depth of analysis
Length of transferred packets/cycle
…
…
…
Buffer 1
Buffer N
…
…
Bi-p
erm
uta
tio
n
mo
du
le
arbitration
Total length of buffer
depth of analysis
Length of transferred packets/cycle
…
…
…
Buffer 1
Buffer N
…
…
Bi-p
erm
uta
tio
n
mo
du
le
Routing parameters
Optimal situation: both “depth of analysis” and “depth of transfer” are equal to the actual buffer size.
The performance of a "fair" operated GSMIN is superior to that of a standard ("fair" operated) MIN for buffer sizes larger than one.
Also, for a buffer size equal to three, the GSMIN network gives better performance than the crossbar, even at full-load.
This arbitration free mechanism may be very appealing for all-optical networks, if optical buffering functions (e.g. delay lines) could be integrated on the cascaded modules themselves, an issue worth exploring.
Two implementation of a column-controlled Multistage Interconnection Network
inputoutput
inputoutput
inp
ut
Alignments hardware
inp
ut
Alignments hardware
Input pattern (exit VCSEL array)
Output pattern (CCD image)
Ou
tpu
t (C
CD
)
Inp
ut
(VC
SE
Ls)
Input pattern (exit VCSEL array)
Output pattern (CCD image)
Input pattern (exit VCSEL array)
Output pattern (CCD image)
Ou
tpu
t (C
CD
)
Inp
ut
(VC
SE
Ls)
Ou
tpu
t (C
CD
)
Inp
ut
(VC
SE
Ls)
ECOC 2003 / We4.P.137
processing (analysis/buffer)
bi-permutation module
module control
processing (analysis/buffer)
bi-permutation module
module control