High Throughput: Progress and Current Results

R. Hughes-Jones, Manchester
DataTAG Meeting, CERN, 7-8 May 03

Lots of people helped: MB-NG team at UCL, MB-NG team at Manchester, Andrew McNab



What’s Involved in Performance

End Hosts
  • NIC: operation & PCI design, use of modern PCI commands
  • Chipset & PCI-X
  • CPU, memory & memory bus
  • OS kernel: drivers & TCP/UDP/IP stack
  • Disk sub-system: disks, controllers & interconnects

Routers
  • Blades: operation
  • Switching / routing fabric: bus, crossbar, non-blocking operation
  • Policies

The Network(s)
  • Framing
  • Bandwidth
  • Load & congestion


MB-NG SuperJANET4 Development Network (22 Mar 02)

[Network diagram. Sites: UCL, MCC, MAN, linked through SJ4 Dev. Router blades: OSM-1OC48-POS-SS. End systems: PCs, some with 3ware RAID0. Legend: Gigabit Ethernet, 2.5 Gbit POS access, 2.5 Gbit POS core, MPLS, Admin. Domains.]


Initial Measurements

  • UDP throughput very variable
  • TCP throughput very variable: Lon to Man < Man to Lon?

Investigations:
  • Packet loss of several %
  • Errors on interface (ifconfig): overruns
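The interface counters mentioned above can also be read programmatically. The following is a minimal sketch, assuming a Linux host that exposes `/proc/net/dev` and assuming that its `fifo` column is the counter `ifconfig` prints as overruns. The helper name `parse_net_dev` and the sample text are illustrative and do not come from the slides.

```python
def parse_net_dev(text):
    """Parse /proc/net/dev-style text into {interface: {"rx": {...}, "tx": {...}}}."""
    rx_fields = ["bytes", "packets", "errs", "drop", "fifo", "frame",
                 "compressed", "multicast"]
    tx_fields = ["bytes", "packets", "errs", "drop", "fifo", "colls",
                 "carrier", "compressed"]
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, values = line.split(":", 1)
        numbers = [int(v) for v in values.split()]
        stats[name.strip()] = {
            "rx": dict(zip(rx_fields, numbers[:8])),
            "tx": dict(zip(tx_fields, numbers[8:16])),
        }
    return stats


# Hypothetical counters, made up for illustration only (not measured data).
SAMPLE = """Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth0: 1000000 10000 250 0 250 0 0 0 2000000 20000 0 0 0 0 0 0
"""

if __name__ == "__main__":
    for iface, counters in parse_net_dev(SAMPLE).items():
        rx = counters["rx"]
        print(iface, "rx errs:", rx["errs"], "rx overruns (fifo):", rx["fifo"])
```

On a live system the same function could be fed the contents of `/proc/net/dev` before and after a test run, so that only the counters accumulated during the test are compared.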


txqueuelen vs sendstalls

TCP throughput very variable: Lon to Man < Man to Lon?


Interrupt Coalescence Investigations

TCP mem-mem lon2-man1

Tx    Tx-abs    Rx    Rx-abs    Throughput (Mbit/s)    Variation (Mbit/s)
64    64        0     128       820-980                +/- 50
64    64        20    128       937-940                +/- 1.5
64    64        80    128       937-939                +/- 1


24 Hours HighSpeed TCP mem-mem

TCP mem-mem lon2-man1
Tx 64 Tx-abs 64 Rx 64 Rx-abs 128
941.5 Mbit/s +/- 0.5 Mbit/s
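The 941.5 Mbit/s figure is close to what framing overhead alone allows on Gigabit Ethernet. Below is a rough sketch of that arithmetic, assuming a 1500-byte MTU, IPv4, TCP timestamps (12 option bytes) and untagged Ethernet framing. None of these settings is stated on the slide, and `tcp_goodput_mbit` is an illustrative name.

```python
def tcp_goodput_mbit(line_rate_mbit=1000.0, mtu=1500, tcp_option_bytes=12):
    """Upper bound on TCP payload rate for full-sized frames on Ethernet.

    Assumptions (not taken from the slides): IPv4 header of 20 bytes,
    TCP header of 20 bytes plus options, no VLAN tag.
    """
    ip_header = 20
    tcp_header = 20 + tcp_option_bytes
    payload = mtu - ip_header - tcp_header
    # Per-frame bytes on the wire: preamble + SFD (8), Ethernet header (14),
    # frame check sequence (4) and inter-frame gap (12).
    wire_bytes = mtu + 8 + 14 + 4 + 12
    return line_rate_mbit * payload / wire_bytes


if __name__ == "__main__":
    print(f"with timestamps:    {tcp_goodput_mbit():.1f} Mbit/s")
    print(f"without timestamps: {tcp_goodput_mbit(tcp_option_bytes=0):.1f} Mbit/s")
```

Under these assumptions the bound with timestamps is about 941.5 Mbit/s, which suggests the 24-hour run was effectively at line rate.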


Raid0 Performance (1)

Maxtor 3.5 Series DiamondMax Plus 9, 120 GB, ATA/133

  • Write: slight increase with number of disks
  • Read: 3 disks OK

Write 100 MBytes/s, Read 130 MBytes/s

[Plot: Read Throughput and no. Disks, Stripe 64k raid0. x-axis: File size (MBytes), 0 to 2000. y-axis: MBytes/s, 0 to 160. Series: 2 disk, 3 disk, 4 disk.]

[Plot: Write Throughput and no. Disks, Stripe 64k raid0. x-axis: File size (MBytes), 0 to 2000. y-axis: MBytes/s, 0 to 180. Series: 2 disk, 3 disk, 4 disk.]
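The disk figures are quoted in MBytes/s while the network figures are in Mbit/s, so a conversion helps when asking which side limits a disk-to-disk transfer. This is a small sketch, assuming the same prefix convention on both sides (1 MByte/s = 8 Mbit/s) and using the 941.5 Mbit/s memory-to-memory result as the network reference. The function names are illustrative.

```python
MEM_TO_MEM_MBIT = 941.5  # memory-to-memory TCP rate quoted in the slides


def mbytes_to_mbit(mbytes_per_s):
    """Convert MBytes/s to Mbit/s (8 bits per byte, same prefix on both sides)."""
    return mbytes_per_s * 8


def bottleneck(disk_mbytes_per_s, network_mbit=MEM_TO_MEM_MBIT):
    """Return which side is slower: 'disk' or 'network'."""
    return "disk" if mbytes_to_mbit(disk_mbytes_per_s) < network_mbit else "network"


if __name__ == "__main__":
    for label, rate in (("RAID0 write", 100), ("RAID0 read", 130)):
        print(f"{label}: {mbytes_to_mbit(rate)} Mbit/s, slower side: {bottleneck(rate)}")
```

With the quoted rates, RAID0 writes (800 Mbit/s) would sit below the network rate and reads (1040 Mbit/s) above it, so in this simple view the receiving disk is the tighter constraint.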


Raid0 Performance (2)

Maxtor 3.5 Series DiamondMax Plus 9, 120 GB, ATA/133

  • No difference for write
  • Larger stripe sizes lower the performance

Write 100 MBytes/s, Read 120 MBytes/s

[Plot: Disk-mem Write Throughput vs Stripe Size, 3 disk raid0. x-axis: File size (MBytes), 0 to 2000. y-axis: MBytes/s, 0 to 180. Series (stripe size): 64, 128, 256, 512, 1000.]

[Plot: Disk-mem Read Throughput vs Stripe Size, 3 disk raid0. x-axis: File size (MBytes), 0 to 2000. y-axis: MBytes/s, 0 to 160. Series (stripe size): 64, 128, 256, 512, 1000.]


GridFTP Throughput

  • HighSpeed TCP
  • Int Coal 64 128
  • Txqueuelen 2000
  • TCP buffer 1 MByte (rtt*BW = 750 kbytes)

Plotted: interface throughput, ACKs received, data moved

  • 520 Mbit/s
  • Same for back-to-back (B2B) tests

So it's not that simple!
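The rtt*BW product quoted above is the bandwidth-delay product, the amount of data that must be in flight to keep the path full. Here is a minimal sketch of the relation, assuming a 1 Gbit/s path and 1 kbyte = 1000 bytes. The slide does not state the rtt, so the value computed here is only what those assumptions imply.

```python
def bdp_bytes(bandwidth_bit_per_s, rtt_s):
    """Bandwidth-delay product in bytes."""
    return bandwidth_bit_per_s * rtt_s / 8


def implied_rtt_s(bdp, bandwidth_bit_per_s):
    """Round-trip time implied by a given bandwidth-delay product (bytes)."""
    return bdp * 8 / bandwidth_bit_per_s


if __name__ == "__main__":
    link = 1e9    # assumed path rate in bit/s (Gigabit Ethernet)
    bdp = 750e3   # rtt*BW quoted on the slide, taken as 750,000 bytes
    rtt = implied_rtt_s(bdp, link)
    print(f"implied rtt: {rtt * 1e3:.1f} ms")
    print(f"1 MByte buffer covers the BDP: {1e6 >= bdp}")
```

Under these assumptions the rtt comes out at about 6 ms, and the 1 MByte TCP buffer exceeds the 750 kbyte product, which is consistent with the slide's point that buffer size alone does not explain the 520 Mbit/s result.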


GridFTP Throughput + Web100

  • Throughput (Mbit/s): see it alternate between 600/800 Mbit/s and zero
  • Cwnd smooth
  • No dup ACKs / send stalls / timeouts


GridFTP Throughput + Web100

  • Throughput (Mbit/s) vs Recv Window Size
  • Zero throughput is independent of Recv Window Size
  • Plotted: bytes sent, bytes received
  • Waits 0.4 s at start!


http data transfers HighSpeed TCP

  • Apache web server, out of the box!
  • Prototype client: curl http library
  • 1 MByte TCP buffers
  • 2 GByte file
  • Throughput 72 MBytes/s
  • Cwnd: some variation
  • No dup ACKs / send stalls / timeouts


http data transfers (2)

Limited by: Sender

Receive window size


TCP sharing man1-lon2 mem-mem web100

  • Int Coal 64 128
  • Txqueuelen 500 (no stall in rtt)
  • TCP buffer 750 kbytes (rtt*BW = 750 kbytes)
  • 1 stream every 60 s: man1-lon2, man2-lon2, man3-lon2
  • Sample every 10 ms
  • Send rates: 940 Mbit/s, 450 Mbit/s, 300 Mbit/s
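The three send rates can be compared with an equal split of the link. The sketch below assumes that 940, 450 and 300 Mbit/s are the per-stream rates seen with one, two and three streams active, and that the single-stream rate of 940 Mbit/s is the usable capacity. Both are interpretations of the slide, not statements from it, and the names are illustrative.

```python
CAPACITY_MBIT = 940.0                           # single-stream rate, taken as usable capacity
MEASURED_MBIT = {1: 940.0, 2: 450.0, 3: 300.0}  # assumed per-stream rate vs. active streams


def fair_share_mbit(capacity_mbit, n_streams):
    """Equal share of the capacity for each of n_streams flows."""
    if n_streams < 1:
        raise ValueError("n_streams must be at least 1")
    return capacity_mbit / n_streams


def share_ratios(measured, capacity_mbit):
    """Measured per-stream rate divided by the equal share, per stream count."""
    return {n: rate / fair_share_mbit(capacity_mbit, n)
            for n, rate in measured.items()}


if __name__ == "__main__":
    for n, ratio in share_ratios(MEASURED_MBIT, CAPACITY_MBIT).items():
        share = fair_share_mbit(CAPACITY_MBIT, n)
        print(f"{n} stream(s): fair share {share:.0f} Mbit/s, measured/fair = {ratio:.2f}")
```

Read this way, two and three streams each reach roughly 96% of an equal share, so the link is divided almost evenly with a small loss in aggregate throughput.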


TCP sharing man1-lon2 mem-mem web100

  • Int Coal 64 128
  • Txqueuelen 500 (no stall in rtt)
  • TCP buffer 750 kbytes (rtt*BW = 750 kbytes)
  • 1 stream every 60 s: man1-lon2, man2-lon2, man3-lon2
  • Sample every 10 ms
  • Time in send limit: sender, Cwnd, Recv window


TCP sharing man1-lon2 - the WHY?

1 stream:
  • No dup ACKs
  • No SACKs
  • No sendstalls
  • Why does Cwnd vary?


2 TCP streams man1-lon2 - the WHY?

2 streams:
  • Many dup ACKs
  • Many SACKs
  • Why does Cwnd have large variations?


2 TCP streams man1-lon2 - the WHY? (2)

2 streams:
  • Dips in throughput due to dup ACKs
  • ~4 losses/sec. A bit regular?
  • Cwnd decreases: 1 point 33%; ramp starts at 62%; slope 70 Bytes/us

[Plot time scale: 1 sec]


3 TCP streams man1-lon2 - the WHY?

3 streams:
  • Dips in throughput due to dup ACKs

[Plot time scale: 10 sec]


TCP sharing man1-lon2 - the WHY?

There is (a) correlation