Greedy Applications
Ethernet Everywhere
Real Time Networks for the Atlas Experiment
Brian Martin
CERN
In memory of Bob Dobinson
And on behalf of a large collaboration.
TNC 2004 Rhodes Greece 2
The Large Hadron Collider (LHC)
The LHC is being constructed underground inside a 27 km tunnel. Head-on collisions of very high energy protons. Worldwide collaboration. Experiments in 2007. Big science.
The ATLAS experiment at LHC
Atlas Experiment
1700 People
33 countries
7000 ton detector
80 Terabytes/s data output
Event Visualisation
Trigger And Event Data Flow
[Diagram labels: ROB, L2PU, SFI, EF, SFO, Data Collection Network, Back End Network]
• 1600 ROB buffers
• ~500 Level 2 dual CPUs
• ~100 Sub Farm Interfaces (SFIs)
• ~3K Level 3 dual CPUs
• ~25 ROBs have information for one Level 2 CPU. Subsequent events go to a free CPU. Total traffic is ~20 Kbytes @ 100 kHz
• 'Good' events are collected from the ROBs by the SFIs and sent to Level 3 CPUs. Total traffic ~2 Mbytes @ 3.5 kHz
• After processing, the accepted event is fetched from Level 3 and sent to storage. Total traffic ~2 Mbytes @ 200 Hz to storage
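The aggregate bandwidth at each stage follows from multiplying the quoted event size by the quoted event rate. A minimal sketch of that arithmetic, assuming the slide's figures mean "bytes moved per event" at "events per second" and using decimal units (1 Kbyte = 1000 bytes). The names `FLOWS` and `aggregate_rates` are illustrative only:

```python
# Per-stage figures quoted on the slide: (bytes per event, events per second).
# Assumption: "X bytes @ Y Hz" means X bytes moved per event at Y events/s.
FLOWS = {
    "level2": (20_000, 100_000),           # ~20 Kbytes @ 100 kHz
    "event_building": (2_000_000, 3_500),  # ~2 Mbytes @ 3.5 kHz
    "storage": (2_000_000, 200),           # ~2 Mbytes @ 200 Hz
}


def aggregate_rates(flows=FLOWS):
    """Return the aggregate traffic per stage in bytes per second."""
    return {name: size * rate for name, (size, rate) in flows.items()}


if __name__ == "__main__":
    for name, rate in aggregate_rates().items():
        print(f"{name}: {rate / 1e9:.1f} Gbyte/s ({rate * 8 / 1e9:.1f} Gbit/s)")
```

Under these assumptions the Level 2 traffic is about 2 Gbyte/s, event building about 7 Gbyte/s, and the flow to storage about 0.4 Gbyte/s.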
Parameterised Switch Model
Switch & Network Performance
It’s an issue
Measure and model switches
• input and output buffer depth
• max throughput module to backplane
• max throughput across backplane
• max throughput backplane to module
• intra-module throughput
• fixed part of intra-module latency
• floating part (depends on packet size) of intra-module latency
• fixed part of inter-module latency
• floating part (depends on packet size) of inter-module latency
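The parameter list above maps naturally onto a small record type. The sketch below is one possible encoding, not the model used by the authors. It assumes that the "floating" latency part grows linearly with frame size and that inter-module throughput is bounded by the slowest of the three stages; all field names and units are hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwitchModel:
    """Parameterised switch description (field names are illustrative)."""
    input_buffer_frames: int
    output_buffer_frames: int
    max_module_to_backplane_mbps: float
    max_across_backplane_mbps: float
    max_backplane_to_module_mbps: float
    max_intra_module_mbps: float
    intra_fixed_us: float       # fixed part of intra-module latency
    intra_us_per_byte: float    # floating part, assumed linear in frame size
    inter_fixed_us: float       # fixed part of inter-module latency
    inter_us_per_byte: float    # floating part, assumed linear in frame size

    def latency_us(self, frame_bytes: int, same_module: bool) -> float:
        """Unloaded latency for one frame: fixed part plus size-dependent part."""
        if same_module:
            return self.intra_fixed_us + self.intra_us_per_byte * frame_bytes
        return self.inter_fixed_us + self.inter_us_per_byte * frame_bytes

    def max_throughput_mbps(self, same_module: bool) -> float:
        """Throughput ceiling; inter-module paths are limited by the slowest stage."""
        if same_module:
            return self.max_intra_module_mbps
        return min(
            self.max_module_to_backplane_mbps,
            self.max_across_backplane_mbps,
            self.max_backplane_to_module_mbps,
        )
```

Each field would be filled in from measurements on a real switch; no measured values are given in the slides, so none are shown here.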
Switch Testing: Data Source
Programmable Ethernet frame source
Common clock (+ GPS option)
Outgoing we control:
• Destination address
• Frame size
• Time of dispatch
Incoming we measure:
• Packet loss per source
• Frame size
• Latency
Test consists of:
• Defining traffic profile for dispatch
• Histogramming latency & packet loss per source/destination pair
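The histogramming step can be sketched in a few lines. This is an illustration only: the record layout `(source, destination, dispatch time, arrival time or None if lost)`, the function name `summarize_test` and the 10 us bin width are assumptions, not details of the actual tester.

```python
from collections import defaultdict


def summarize_test(records, bin_us=10.0):
    """Histogram latency and count packet loss per (source, destination) pair.

    records: iterable of (src, dst, t_dispatch_us, t_arrival_us), where
    t_arrival_us is None for a frame that never arrived (hypothetical layout).
    """
    sent = defaultdict(int)
    lost = defaultdict(int)
    hist = defaultdict(lambda: defaultdict(int))
    for src, dst, t_dispatch_us, t_arrival_us in records:
        pair = (src, dst)
        sent[pair] += 1
        if t_arrival_us is None:
            lost[pair] += 1
            continue
        latency_us = t_arrival_us - t_dispatch_us
        hist[pair][int(latency_us // bin_us)] += 1  # bin index, not microseconds
    return {
        pair: {
            "sent": sent[pair],
            "lost": lost[pair],
            "loss_fraction": lost[pair] / sent[pair],
            "latency_hist": dict(hist[pair]),
        }
        for pair in sent
    }
```

Because source and destination share a common clock, latency is simply arrival time minus dispatch time; without that clock the subtraction would not be meaningful.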
e.g. Buffer size measurement
[Diagram: ports A, B, C and D on the switch]
• Send
– A->D 50%
– Measure: Tad0
• Send
– A->D 5%
– B,C->D 100%
– Measure: Tad1
• Send
– A,B,C->D 100%
– Measure: Tad2
• Size of the queues (Tpp = time per packet):
– Output queue = (Tad1-Tad0)/Tpp
– Input queue = (Tad2-Tad1)/Tpp
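The two queue formulas translate directly into code. A minimal sketch, assuming all four quantities are expressed in the same time unit; the function name `queue_depths` is illustrative:

```python
def queue_depths(tad0, tad1, tad2, tpp):
    """Estimate queue depths in packets from the three A->D latency measurements.

    tad0: A->D latency with the output port uncongested
    tad1: A->D latency with B and C saturating the output port
    tad2: A->D latency with A, B and C all sending at 100%
    tpp:  time per packet on the output link (same unit as the latencies)
    """
    if tpp <= 0:
        raise ValueError("tpp must be positive")
    output_queue = (tad1 - tad0) / tpp
    input_queue = (tad2 - tad1) / tpp
    return output_queue, input_queue
```

The extra latency in each step is the time a frame spends waiting behind a full queue, so dividing by the per-packet drain time gives the queue length in packets.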
Modeling versus measurement
[Plot: Combined 12 ROS x 8 SFI (UDP, Fs=1Kb) for 3, 10 and 100 % traffic sent to EB. x-axis: #L2PUs, 0 to 12; y-axis: EB rate (Hz), 0 to 6000; series: EB Only, 100 pc, 10 pc, 3 pc, 3 pc m, 10 pc m, 100 pc m, EB Only m]
• Using the calibrated switch model
• Build a system model
• And check it against measurements
Application of results
• Can't use TCP in the data collection network
– Latency too large
– Requires low loss raw Ethernet
• Gathering full event is a bottleneck: 1600 ROBs -> 1 CPU
– Credit based solution to random sources
– Link between number of credits and buffer depth
– More credits, better use of bandwidth
– But increases requirement for buffer depth
• Concentrating edge switches
– Concatenates underused Gb/s links
– Fewer but faster ports at the core
– More smaller switches at the edge
– Random congestion between switches
• Modeling can give the answer
• And defines acceptable switches
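The credit idea can be illustrated with a small sketch: the collecting CPU only allows a fixed number of outstanding requests, so at most that many responses can converge on its switch port at once. This is a simplified illustration under that assumption, not the ATLAS implementation; `CreditedCollector` and `worst_case_buffer_frames` are hypothetical names.

```python
from collections import deque


class CreditedCollector:
    """Request event fragments from ROBs with a cap on outstanding requests."""

    def __init__(self, rob_ids, credits):
        self.pending = deque(rob_ids)  # ROBs not yet asked
        self.credits = credits         # maximum requests in flight
        self.in_flight = set()
        self.peak_in_flight = 0

    def issue_requests(self):
        """Send requests while credits remain; return the ROBs just asked."""
        issued = []
        while self.pending and len(self.in_flight) < self.credits:
            rob = self.pending.popleft()
            self.in_flight.add(rob)
            issued.append(rob)
        self.peak_in_flight = max(self.peak_in_flight, len(self.in_flight))
        return issued

    def on_response(self, rob):
        """A fragment arrived: free its credit and reuse it immediately."""
        self.in_flight.remove(rob)
        return self.issue_requests()

    def done(self):
        return not self.pending and not self.in_flight


def worst_case_buffer_frames(credits, frames_per_response=1):
    """Assumed worst case: every outstanding response arrives at the same time."""
    return credits * frames_per_response
```

More credits keep the link busier, but the worst-case burst at the output port grows in proportion, which is the trade-off between bandwidth use and buffer depth noted above.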
Must all the CPU’s be on site?
[Diagram labels: ATLAS Detectors; ROB; L2PU; SFI; Data Collection Network; Back End Network; Local Event Processing Farms (PF); Remote Event Processing Farms (PF); dedicated light paths; 10 GE; packet switched WAN]
Ethernet, how you’ve grown!
• Shared copper media
• Limited distance
• Limited speed
• Half duplex
IEEE 802.3ae:
• Point to point
• Full duplex
• Switched architecture
• 100 meters over copper
• 40 km over fiber
• 10 Gbit/second
• WAN access
• Carrier class product
LAN at 10Gbit/s
• Ethernet LAN at 10 Gbit/s
• 40 km is the standard,
– based on commercially available lasers
– and installed fiber
• If you have access to dark fiber
• Then you can amplify the light
• One amplifier every 40-100 km
• DCF (dispersion compensating fiber) also needed
• Amplifiers also amplify the noise
• 3R regeneration every 400-600 km
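As a rough planning aid, the spacing figures above give a range for how many in-line amplifiers and 3R regenerators a link of a given length might need. The sketch assumes evenly spaced sites and none at the end points, which is a simplification; `inline_sites` and `plan_link` are illustrative names.

```python
import math


def inline_sites(distance_km, spacing_km):
    """Intermediate sites needed when one is placed at most every spacing_km.

    Assumption: evenly spaced, no site at either end point.
    """
    return max(0, math.ceil(distance_km / spacing_km) - 1)


def plan_link(distance_km, amp_spacing_km=(40, 100), regen_spacing_km=(400, 600)):
    """Return (min, max) counts of amplifiers and 3R regenerators."""
    amp_short, amp_long = amp_spacing_km
    regen_short, regen_long = regen_spacing_km
    return {
        "amplifiers": (inline_sites(distance_km, amp_long),
                       inline_sites(distance_km, amp_short)),
        "regenerators": (inline_sites(distance_km, regen_long),
                         inline_sites(distance_km, regen_short)),
    }
```

For a 250 km link this gives between 2 and 6 in-line amplifiers and no 3R regeneration; real designs depend on fiber loss and available sites rather than on distance alone.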
10 GE over dark fibre - ESTA
• 250 km of DARENET's dark fibre in July 2003
• 4 optical amplifiers in each direction
• error-free transmission for 66 hours
[Diagram labels: Lyngby, Næstved, Odense; spans of 121 km and 131 km; 10 GbE; EDFA; DCF; optical filter]
525 km Long Haul LAN
• Further tests done
• Paper accepted for COIN2004 in Yokohama
10 Gig E to the WAN: Theory
[Diagram labels: 10GE Switch, WANPHY, ELTE, LTE, 3R, OC192, Router, WAN]
In practice however:
• WAN PHY designed to connect to SONET through an ELTE
– Same bitrate, same frame structure
– A defined ELTE specification (but nobody built one)
• Not guaranteed to operate with existing LTE (OC192 port)
– Unused / fixed-value bits in the management overhead
– Optical signal has relaxed jitter and clock specs (cheaper optics)
• Nobody built the optics either, everyone uses SONET optics
• Tee Shirts versus the Suits
• Where to plug the WAN PHY fiber?
• Into an OC192 socket, perhaps?
Plug and Play at Canarie
[Diagram labels: IXIA 10GE; E-600; ONS 15454; optics at 1310 nm, 1550 nm, 1310 nm; test ports 1 and 2]
[Plot: Latency @ 91.3% of the 10 GE LAN PHY line speed. x-axis: frame size [bytes], 0 to 10000; y-axis: latency [us], 100 to 170; series: 1 to 2, 2 to 1]
WAN PHY over SONET
• Trials between Geneva and Amsterdam
• Error-free transmission
• Field validation of our previous lab experiments
[Diagram labels: IXIA 10 GE; ONS 15454; OC192; DWDM; CERN (Geneva); SURFnet (Amsterdam)]
Raw Ethernet
TCP
• Using
– kernel 2.4.21, web100 TCP patch, iperf 1.7.0
TCP: Geneva to Amsterdam
CERN – UvA: Average of 5448 Mbps for almost 15 hours
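To put the sustained rate in perspective, the total volume moved follows from rate times duration. A small sketch, assuming decimal units (1 Mbps = 10^6 bit/s, 1 TB = 10^12 bytes) and treating "almost 15 hours" as an upper bound of 15 hours; `volume_terabytes` is an illustrative name:

```python
def volume_terabytes(rate_mbps, hours):
    """Data volume in terabytes moved at a constant rate for a given duration."""
    bits = rate_mbps * 1e6 * hours * 3600
    return bits / 8 / 1e12


if __name__ == "__main__":
    # Upper bound for the CERN to UvA run: a full 15 hours at the average rate.
    print(f"{volume_terabytes(5448, 15):.1f} TB")
```

At the quoted average this corresponds to a little under 37 TB over the run.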
WAN PHY over DWDM
• No packet loss for 91 hours and 365 TB of continuous streaming (equivalent to a BER better than 0.3*10^-15)
[Diagram labels: IXIA 10 GE; DWDM; ONS 15454; OC192; CERN (Geneva); SURFnet (Amsterdam)]
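The quoted BER figure can be checked from the transferred volume: with N bits sent and no errors observed, the measured error ratio is below 1/N. The sketch below uses that simple bound (not a statistical confidence interval) and assumes 1 TB = 10^12 bytes; the function names are illustrative.

```python
def ber_upper_bound(terabytes):
    """Simple bound: zero errors in N bits means the measured BER is below 1/N."""
    bits = terabytes * 1e12 * 8
    return 1.0 / bits


def mean_rate_gbps(terabytes, hours):
    """Average rate implied by moving a given volume in a given time."""
    return terabytes * 1e12 * 8 / (hours * 3600) / 1e9


if __name__ == "__main__":
    print(f"BER below {ber_upper_bound(365):.2e}")
    print(f"mean rate {mean_rate_gbps(365, 91):.2f} Gbit/s")
```

For 365 TB this gives a bound of about 0.34*10^-15, in line with the figure on the slide, and an average rate of about 8.9 Gbit/s over the 91 hours.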
10 GE WAN PHY demo ITU World Telecom’03
• First native transatlantic 10 GE experiment
• In June 2003, 700 GBytes of data transferred to Ottawa in 6.5 hours
• 9.24 Gbps with traffic generators
• 6 Gbps with UDP
• 5.24 Gbps with TCP
• Article describing the experiments accepted at the IEEE HSMNC'04 conference, Toulouse, June 28 - July 2
[Diagram, Oct. 6th, 2003. Sites: Ottawa, Toronto, Chicago, Amsterdam, Geneva. Equipment labels: Cisco ONS 15454, Force10 E 600, Ixia 400T, Intel Itanium-2, Intel Xeon, HP Itanium-2. Link types: 10GE WAN PHY, 10GE LAN PHY, OC192c]
Native Ethernet CERN to Ottawa
[Plots: Single stream UDP throughput; Single stream TCP throughput]
• Data rates are limited by the PC, even for our memory-to-memory tests
• UDP uses fewer resources than TCP on high bandwidth-delay product networks
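One reason TCP costs more on such paths is that the sender must keep a full bandwidth-delay product of unacknowledged data in flight, and buffered for retransmission, to fill the link; UDP keeps no such window. A minimal sketch of that product follows. The round-trip time of the CERN to Ottawa path is not given in the slides, so the 100 ms used in the example is purely a placeholder.

```python
def bdp_bytes(rate_gbps, rtt_ms):
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return rate_gbps * 1e9 * rtt_ms / 1e3 / 8


if __name__ == "__main__":
    # Placeholder RTT of 100 ms; the real path RTT is not stated in the slides.
    print(f"{bdp_bytes(10, 100) / 1e6:.0f} MB window for 10 Gbit/s at 100 ms")
```

With these placeholder numbers a single stream would need a window on the order of a hundred megabytes, which the end host has to hold in memory and manage for every connection.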
Thanks to: Bob Dobinson, Piotr Golonka, Andreas Hirstius,
Mihai Ivanovici, Kris Korcyl, Olivier Martin, Catalin Meirosu,
Stefan Stancu, Mikkel Olesen, Stan Cannon (CERN)
Lars Dittmann, Martin Petersen (COM)
Cees de Laat, Antony Antony,
Freek Dijkstraa (University of Amsterdam)
Wade Hong (University of Carleton)
Bill St. Arnaud, Rene Hatem (CANARIE )
Ray Belleville (Cortex Networks, Ottawa)
Erik Radius (SURFnet )
Pieter de Boer (SARA )
..and to our many partners
ESTA, IST-2001-33182
Conclusions
10 GE works over dark fiber up to <600 km using LAN PHY
10 GE works over legacy trans / intercontinental links with WAN PHY
Commercial switches already available with all needed features
The bandwidth is there to run off-site ATLAS farms in real time
Further work to do on managing traffic flow
Further work to do on understanding the cost model
ANY QUESTIONS