Collaboration Meeting , 4 Jul 2006, R. Hughes-Jones Manchester1
Collaborations in Networking and Protocols
HEP and Radio Astronomy
Richard Hughes-Jones The University of Manchester
www.hep.man.ac.uk/~rich/ then “Talks”
VLBI Proof of Concept at iGrid 2002
European Topology: NRNs, GÉANT, Sites
[Diagram labels: SuperJANET4, iGrid 2002, Manchester, Jodrell, SURFnet, JIVE]
Some results of the e-VLBI Proof of Concept
Collaboration: HEP, Radio Astronomy, Dante, the NRNs, and Campus folks
[Plots of four traffic conditions:]
Normal Traffic
Normal Traffic + Less Than Best Effort 2.0 Gbit/s
Normal Traffic + Radio Astronomy Data 500 Mbit/s
Normal Traffic + Radio Astronomy Data + Less Than Best Effort 2.0 Gbit/s
e-VLBI at the GÉANT2 Launch Jun 2005
[Map labels: Jodrell Bank (UK), Dwingeloo (DWDM link), Medicina (Italy), Torun (Poland)]
e-VLBI UDP Data Streams
Collaboration: HEP, Radio Astronomy, Dante, the NRNs, and Campus folks
Good opportunity to test UDP Throughput: 5 Hour run
ESLEA and UKLight
Exploiting Switched Lightpaths for e-Science Applications
EPSRC e-Science project: £1.1M, 11.5 FTE
Core Technologies: Protocols, Control plane
HEP data transfers: ATLAS and D0
e-VLBI
Medical Applications
High Performance Computing
Involved with Protocols, HEP and e-VLBI
Stephen Kershaw appointed as RA (joint with EXPReS)
Investigate how well the protocol implementations work: UDP flows, TCP advanced stacks, DCCP (developed by UCL partners)
Also examine how the Applications “use” the protocols
Also the effect of the transport protocol on what the Application intended!
Develop real-time UDP transport for e-VLBI: vlbi_udp
ESLEA and UKLight
6 * 1 Gbit transatlantic Ethernet layer 2 paths, UKLight + NLR
Disk-to-disk transfers with bbcp, Seattle to UK
Set TCP buffer and application to give ~850 Mbit/s
One stream of data: 840-620 Mbit/s
Stream UDP VLBI data, UK to Seattle: 620 Mbit/s
[Plots: sc0501, sc0502, sc0503 and sc0504 at SC|05, each showing Rate (Mbit/s, 0 to 1000) against time, 16:00 to 23:00; UKLight SC|05 showing Rate (Mbit/s, 0 to 4500) over the same period; annotation "Reverse TCP"]
tcpmon: TCP Activity for remote Farms: Manc-CERN Req-Resp
Web100 hooks for TCP status
[Plot: DataBytesOut (Delta) and DataBytesIn (Delta) against time, 0 to 2000 ms]
Round trip time 20 ms
64 byte Request (green)
1 Mbyte Response (blue)
TCP in slow start
1st event takes 19 rtt or ~380 ms
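A rough cross-check on the 19 rtt figure: the sketch below counts the round trips an idealised slow start would need to deliver the 1 Mbyte response. The segment size (1448 bytes), the initial window of 2 segments and the growth factors are assumptions made for illustration, not values taken from the measurement, and `slow_start_rtts` is a hypothetical helper.

```python
import math


def slow_start_rtts(response_bytes, mss=1448, init_cwnd=2, growth=2.0):
    """Count round trips needed to send response_bytes in pure slow start.

    Each RTT the sender transmits one congestion window of segments, then
    the window is multiplied by `growth` (2.0 models an ACK per segment,
    1.5 is a common approximation for delayed ACKs). Losses, receiver
    window limits and the request's own round trip are ignored.
    """
    segments_left = math.ceil(response_bytes / mss)
    cwnd = float(init_cwnd)  # assumed initial window, in segments
    rtts = 0
    while segments_left > 0:
        segments_left -= int(cwnd)
        cwnd *= growth
        rtts += 1
    return rtts


# Compare both growth assumptions at the 20 ms round trip time quoted above.
for growth in (2.0, 1.5):
    n = slow_start_rtts(1_000_000, growth=growth)
    print(f"growth {growth}: {n} RTTs, about {n * 20} ms")
```

Under these assumptions the model needs fewer round trips than the 19 observed, so it is best read as a lower bound; it does not by itself explain the measured value.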
[Plot: DataBytesOut (Delta), DataBytesIn (Delta) and CurCwnd (Value) against time, 0 to 2000 ms]
TCP Congestion window gets reset on each Request
TCP stack: RFC 2581 & RFC 2861 reduction of Cwnd after inactivity
Even after 10 s, each response takes 13 rtt or ~260 ms
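The reduction of Cwnd after inactivity can be sketched as below. This is a simplified paraphrase of the idle-period rule in RFC 2861, not the code of any particular TCP stack: the receiver's advertised window is left out, and the window, ssthresh, RTO and MSS values in the example are hypothetical.

```python
def cwnd_after_idle(cwnd, ssthresh, idle_s, rto_s, mss=1448):
    """Decay the congestion window after an idle period (RFC 2861 style).

    cwnd, ssthresh and mss are in bytes. For every full RTO of idle time
    the window is halved, but never below one MSS. ssthresh keeps at least
    3/4 of the old window so that slow start can recover towards it.
    """
    if idle_s < rto_s:
        return cwnd, ssthresh  # not idle long enough, nothing changes
    ssthresh = max(ssthresh, 3 * cwnd // 4)
    for _ in range(int(idle_s // rto_s)):
        cwnd = max(cwnd // 2, mss)
    return cwnd, ssthresh


# Hypothetical example: 250 kbyte window, 10 s between requests, 0.25 s RTO.
print(cwnd_after_idle(250_000, 100_000, idle_s=10.0, rto_s=0.25))
```

With gaps between requests that are many RTOs long, the window collapses to its floor, so every response has to climb through slow start again.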
[Plot: TCPAchive (Mbit/s, 0 to 180) and Cwnd (0 to 250000) against time, 0 to 2000 ms]
Transfer achievable throughput: 120 Mbit/s
Event rate very low
Application not happy!
ESLEA: ATLAS on UKLight
1 Gbit Lightpath Lancaster-Manchester
Disk 2 Disk Transfers
Storage Element with SRM using distributed disk pools: dCache & xrootd
udpmon: Lanc-Manc Throughput not quite what we expected!!
Lanc to Manc: Plateau ~640 Mbit/s wire rate, No packet Loss
Manc to Lanc: ~800 Mbit/s but packet loss
Send times: Pause 695 μs every 1.7 ms, So expect ~600 Mbit/s
Receive times (Manc end): No corresponding gaps
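The "expect ~600 Mbit/s" estimate follows from the fraction of time the sender is not paused. A minimal sketch of that arithmetic, assuming the sender runs at the full 1 Gbit/s line rate between pauses (`expected_rate_mbit` is a hypothetical helper name):

```python
def expected_rate_mbit(line_rate_mbit, pause_us, period_us):
    """Average rate when the sender is silent for pause_us in every period_us."""
    sending_fraction = (period_us - pause_us) / period_us
    return line_rate_mbit * sending_fraction


# 695 us pause every 1.7 ms on an assumed 1 Gbit/s line rate.
rate = expected_rate_mbit(1000.0, pause_us=695.0, period_us=1700.0)
print(f"{rate:.0f} Mbit/s")  # prints "591 Mbit/s"
```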
[Plots, pyg13-gig1_19Jun06: 1-way delay (us) against Recv time (0.1 us units) and against Send time (0.1 us units), both labelled W11; Recv wire rate (Mbit/s, 0 to 1000) against spacing between frames (0 to 40 us) for packet sizes of 50, 100, 200, 400, 600, 800, 1000, 1200, 1400 and 1472 bytes]
EXPReS & FABRIC
EU Project to realise the current potential of e-VLBI and investigate the Next Generation capabilities.
SSA
Use of GRID Farms for distributed correlation.
Linking Merlin telescopes to JIVE (present correlator)
4 * 1 Gigabit from Jodrell. Links to 10 Service Challenge work.
Interface to eMERLIN: data at 30 Gbit/s
JRA - FABRIC
Investigate use of different IP Protocols
10 Gigabit Onsala to Jodrell. Links to 10 Gbit HEP work.
Investigate 4 Gigabit over GEANT2 Switched Lightpaths
UDP and TCP. Links to Remote Compute Farm HEP work.
Develop 1 and 10 Gbit Ethernet end systems using FPGAs. Links to CALICE HEP work.
FABRIC 4 Gigabit Demo
Will use a 4 Gbit Lightpath between two GÉANT PoPs
Collaboration with Dante: Discussions in progress
Continuous (days) Data Flows: VLBI_UDP and multi-Gigabit TCP tests
10 Gigabit Ethernet: UDP Data transfer on PCI-X
Sun V20z 1.8 GHz to 2.6 GHz Dual Opterons
Connect via 6509
XFrame II NIC
PCI-X mmrbc 2048 bytes, 66 MHz
One 8000 byte packet: 2.8 us for CSRs, 24.2 us data transfer, effective rate 2.6 Gbit/s
2000 byte packet, wait 0 us: ~200 ms pauses
8000 byte packet, wait 0 us: ~15 ms between data blocks
[Trace labels: CSR Access 2.8 us; Data Transfer]
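The effective rate quoted for the 8000 byte packet can be reproduced from the bus timings. A minimal sketch of the arithmetic (`rate_gbit` is a hypothetical helper; adding the CSR access time to the transfer time is an extra illustration, not a figure from the slide):

```python
def rate_gbit(payload_bytes, time_us):
    """Rate in Gbit/s for payload_bytes moved in time_us microseconds."""
    return payload_bytes * 8 / (time_us * 1000.0)  # bits per nanosecond


data_only = rate_gbit(8000, 24.2)        # data transfer phase alone
with_csr = rate_gbit(8000, 24.2 + 2.8)   # data transfer plus CSR access
print(f"data only: {data_only:.2f} Gbit/s, with CSR access: {with_csr:.2f} Gbit/s")
```

The data-only figure matches the 2.6 Gbit/s quoted above; counting the 2.8 us of CSR access lowers the per-packet rate somewhat.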
Calice
Virtex 4 board from PLD Applications
PCI-express development card
Using the FPGA to send and receive raw Ethernet frames at 1 Gigabit
Package data from internal memory or external source into Ethernet
Considering building a 10 Gigabit Ethernet add-on card
Take data in on the 1 Gig links, process it, send results out on the 10 Gig link.
Using 2 boards (2nd is a data generator) we could produce a small scale Calice DAQ: take data in, buffer it to the DDR2 RAM, then read it out, Ethernet frame it and ship it to PCs.
Ideas for an Ethernet packet monitor.
From Slides of Marc Kelly
Backup Slides
Further network & end host investigations
VLBI Work
TCP Delay and VLBI Transfers
Manchester 4th Year MPhys Project
by Stephen Kershaw & James Keenan
VLBI Application Protocol
VLBI data is Constant Bit Rate
tcpdelay: instrumented TCP program that emulates sending CBR Data. Records relative 1-way delay
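To make the idea concrete, the sketch below shows one way a constant bit rate send schedule and a relative 1-way delay could be computed from timestamps. It is not the tcpdelay implementation: both helper functions are hypothetical, and taking the smallest observed delay as the zero point is an assumption used here because sender and receiver clocks are not expected to be synchronised.

```python
def cbr_send_times(n_messages, message_bytes, rate_bit_s, start_s=0.0):
    """Ideal send times for n_messages of message_bytes at a constant bit rate."""
    interval_s = message_bytes * 8 / rate_bit_s
    return [start_s + i * interval_s for i in range(n_messages)]


def relative_one_way_delay(send_ts, recv_ts):
    """Per-message delay relative to the smallest delay seen.

    The raw difference recv - send includes the unknown clock offset
    between the two hosts; subtracting the minimum removes that offset.
    """
    raw = [r - s for s, r in zip(send_ts, recv_ts)]
    base = min(raw)
    return [d - base for d in raw]


# Hypothetical example: 1448 byte messages at a rate giving 1 ms spacing,
# a 5 s clock offset, and extra queueing delay on the last two messages.
send = cbr_send_times(3, 1448, 11_584_000)
recv = [s + 5.0 + extra for s, extra in zip(send, [0.0, 0.002, 0.001])]
print([round(d, 6) for d in relative_one_way_delay(send, recv)])
```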
[Diagram labels: Sender, TCP & Network, Receiver; Data1 to Data4; Timestamp1 to Timestamp5; Packet loss; Time. Second diagram: Sender, Receiver, ACK, RTT, Time]
Segment time on wire = bits in segment/BW
Remember the Bandwidth*Delay Product: BDP = RTT*BW
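Both formulas are easy to evaluate. A minimal sketch, assuming a 1 Gbit/s path and a 1448 byte segment for the example (the function names are hypothetical; the 26 ms RTT is the value quoted for the Man-ukl-JIVE-prod-Man route):

```python
def segment_time_s(segment_bytes, bw_bit_s):
    """Time a segment occupies the wire: bits in segment / bandwidth."""
    return segment_bytes * 8 / bw_bit_s


def bdp_bytes(rtt_s, bw_bit_s):
    """Bandwidth*Delay Product, RTT * BW, converted from bits to bytes."""
    return rtt_s * bw_bit_s / 8


bw = 1e9  # assumed 1 Gbit/s path
print(f"segment time: {segment_time_s(1448, bw) * 1e6:.2f} us")
print(f"BDP at 26 ms RTT: {bdp_bytes(0.026, bw) / 1e6:.2f} Mbytes")
```

The BDP is the amount of data that must be in flight to keep the path full, which is why it sets the TCP buffer size needed for full-rate transfers.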
1 way delay: 10000 packets
[Plot: 1-Way Delay against Message number; scale markers 100 ms]
10,000 Messages
Message size: 1448 Bytes
Wait time: 0
TCP buffer 64k
Route: Man-ukl-JIVE-prod-Man
RTT ~26 ms
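With a 64k TCP buffer and an RTT of ~26 ms, the sender can have at most one buffer of data in flight per round trip, which caps the achievable rate. A minimal sketch of that bound, assuming "64k" means 65536 bytes and that the whole buffer is usable as window (real stacks reserve part of it for overhead); `window_limited_mbit` is a hypothetical helper:

```python
def window_limited_mbit(buffer_bytes, rtt_s):
    """Upper bound on TCP throughput when limited to one buffer per RTT."""
    return buffer_bytes * 8 / rtt_s / 1e6


limit = window_limited_mbit(64 * 1024, 0.026)
print(f"at most about {limit:.1f} Mbit/s")
```

With wait time 0 the application offers data faster than this bound, so messages queue in the send buffer and the 1-way delay grows.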
1-Way Delay Detail
[Plot: 1-way delay against Message number, scale markers 10 ms, with levels marked "= 1 x RTT 26 ms", "= 1.5 x RTT" and "≠ 0.5 x RTT"]
Why not just 1 RTT?
After SlowStart, TCP Buffer Full
Messages at front of TCP Send Buffer have to wait for next burst of ACKs, 1 RTT later
Messages further back in the TCP Send Buffer wait for 2 RTT
Recent RAID Tests
Manchester HEP Server
“Server Quality” Motherboards
Boston/Supermicro H8DCi
Two Dual Core Opterons 1.8 GHz
550 MHz DDR Memory
HyperTransport
Chipset: nVidia nForce Pro 2200/2050, AMD 8132 PCI-X Bridge
PCI: 2 16 lane PCIe buses, 1 4 lane PCIe, 133 MHz PCI-X
2 Gigabit Ethernet
SATA
Disk_test: areca PCI-Express 8 port
Maxtor 300 GB SATA disks
RAID0 5 disks: Read 2.5 Gbit/s, Write 1.8 Gbit/s
RAID5 5 data disks: Read 1.7 Gbit/s, Write 1.48 Gbit/s
RAID6 5 data disks: Read 2.1 Gbit/s, Write 1.0 Gbit/s
[Plots, afs6 areca 8PCIe 10 Jun06: Throughput (Mbit/s, 0 to 7000) against File size (0 to 4000 Mbytes) for 8k reads and 8k writes, one panel each for R0 5disk, R5 5disk and R6 7disk]
UDP Performance: 3 Flows on GÉANT
Throughput: 5 Hour run
Jodrell: JIVE, 2.0 GHz dual Xeon to 2.4 GHz dual Xeon, 670-840 Mbit/s
Medicina (Bologna): JIVE, 800 MHz PIII to mark623 1.2 GHz PIII, 330 Mbit/s limited by sending PC
Torun: JIVE, 2.4 GHz dual Xeon to mark575 1.2 GHz PIII, 245-325 Mbit/s limited by security policing (>400 Mbit/s to 20 Mbit/s) ?
Throughput: 50 min period
Period is ~17 min
[Plots, BW 14Jun05: Recv wire rate (Mbit/s, 0 to 1000) against time in 10 s steps for Jodrell, Medicina and Torun; full run (steps 0 to 2000) and detail (steps 200 to 500)]