48
High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Embed Size (px)

Citation preview

Page 1: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

High Performance Grid Data Transfer

Bill Allcock, ANL

BNL Technology Meeting

26 April 2004

Page 2: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Outline Transport and Related Services GridFTP and XIO Alternate Transport Protocols Optical Networking Firewalls

Page 3: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Transport and Related Services

Page 4: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Services: What’s the Big Deal? Grids (Globus anyway) has been service oriented

from the start. However, every service had its own, unique, non-

standard, and not necessarily well documented protocol

Web services began to gain traction and via SOAP and WSDL provided a nice interface, good tooling, etc..

But, did not deal with some issues critical to the Grid, such as state, lifetime management, etc.

Page 5: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Open Grid Services Infrastructure (OGSI)

The first crack at melding the Grid with Web Services.

Largely driven by Globus and IBM, but OGSI working group in GGF was very active in it’s creation.

Worked well, and gained lots of support Except from the Web Service Stack

Vendors…

Page 6: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Grid and Web Services:Convergence?

Grid

Web

However, despite enthusiasm for OGSI, adoption within Web community turned out to be problematic

Started far apart in apps & tech

OGSI

GT2

GT1

HTTPWSDL,

WS-*

WSDL 2,

WSDM

Have beenconverging ?

Page 7: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Three Major Web Services Concerns about OGSI

Too much stuff in one specification

Does not work well with existing Web services tooling

Too “object oriented” Read “It didn’t fit their model”

Page 8: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Concerns Addressed

Too much stuff in one specification WSRF partitions OGSI v1.0 functionality into a

family of composable specifications Does not work well with existing Web

services tooling WSRF tones down the usage of XML Schema

Too object oriented WSRF makes an explicit distinction between

the “service” and the stateful “resources” acted upon by that service

Page 9: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Grid and Web Services:Convergence: Yes!

Grid

Web

The definition of WSRF means that Grid and Web communities can move forward on a common base

WSRF

Started far apart in apps & tech

OGSI

GT2

GT1

HTTPWSDL,

WS-*

WSDL 2,

WSDM

Have beenconverging

Page 10: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

From OGSI to WSRF:Refactoring and Evolution**

OGSI WSRF

Grid Service Reference WS-Addressing Endpoint Reference

Grid Service Handle WS-Addressing Endpoint Reference

HandleResolver portType WS-RenewableReferences

Service data defn & access WS-ResourceProperties

GridService lifetime mgmt WS-ResourceLifeCycle

Notification portTypes WS-Notification

Factory portType Treated as a pattern

ServiceGroup portTypes WS-ServiceGroup

Base fault type WS-BaseFaults

**Draft document at www.globus.org/wsrf this week

Page 11: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

GridFTP and XIO

Page 12: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

What is GridFTP?

A secure, robust, fast, efficient, standards based, widely accepted data transfer protocol

A Protocol Multiple Independent implementation can interoperate

This works. Both the Condor Project at Uwis and Fermi Lab have home grown servers that work with ours.

Lots of people have developed clients independent of the Globus Project.

We also supply a reference implementation: Server Client tools (globus-url-copy) Development Libraries

Page 13: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

wuftpd based GridFTPExisting Functionality

Security Reliability / Restart Parallel Streams Third Party Transfers Manual TCP Buffer Size Server Side Processing Partial File Transfer Large File Support Data Channel Caching Integrated

Instrumentation De facto standard on the

Grid

New Functionality in 3.2 Server Improvements

Structured File Info MLST, MLSD

checksum support chmod support

globus-url-copy changes File globbing support Recursive dir moves RFC 1738 support Control of restart Control of DC security

Page 14: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

New GridFTP Implementation 100% Globus code. No licensing issues. GT3.2 Alpha had a very minimal implementation GT3.2 final does *NOT* have the new server. It

will be released in 4.0. wuftpd specific functionality, such as virtual

domains, will NOT be present Will have IPV6 support included (EPRT, EPSV) Based on XIO Extremely modular to allow integration with a

variety of data sources (files, mass stores, etc.) Striping will also be present in 4.0

Page 15: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

GridFTP at SC’2000: Long-Running Dallas-Chicago Transfer

SciNet Power Failure Other demos starting up

(Congestion)

Parallelism Increases (Demos)

Backbone problems on the SC Floor

DNS Problems

Transition between files (not zero due to averaging)

Page 16: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

GridFTP: Standards Based

FTP protocol is defined by several IETF RFCs Start with most commonly used subset

Standard FTP: get/put etc., 3rd-party transfer Implement standard but often unused features

GSS binding, extended directory listing, simple restart Extend in various ways, while preserving

interoperability with existing servers Striped/parallel data channels, partial file, automatic &

manual TCP buffer setting, progress monitoring, extended restart

Page 17: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

GridFTP: Standards Based (cont)

Existing standards RFC 959: File Transfer Protocol RFC 2228: FTP Security Extensions RFC 2389: Feature Negotiation for the File

Transfer Protocol Draft: FTP Extensions

New drafts GridFTP: Protocol Extensions to FTP for the Grid

Grid Forum Recommendation GFD.20 http://www.ggf.org/documents/GWD-R/GFD-R.020.pdf

Page 18: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

GridFTP: Caveats Protocol requires that the sending side do the

TCP connect (possible Firewall issues) Client / Server

Currently, no server library, therefore Peer to Peer type apps VERY difficult

Generally needs a pre-installed server Looking at a “dynamically installable” server

Striped Server is a prototype, though a heavily used and well tested prototype. alpha of new server with striping by mid summer Should be released by Fall 2004

Page 19: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Striped Server Multiple nodes work together and act as a

single GridFTP server An underlying parallel file system stores

blocks of the file, usually in round robin fashion, across all of the nodes.

Each node then moves only the pieces of the file that it is responsible for.

The other side then writes the file in the same way, block round robin on a parallel file system.

This allows multiple levels of parallelism, CPU, bus, NIC, disk, etc.

Page 20: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

MODE ESPAS (Listen) - returns list of host:port pairsSTOR <FileName>

MODE ESPOR (Connect) - connect to the host-port pairsRETR <FileName>

18-Nov-03

GridFTP Striped Transfer

Host Z

Host Y

Host A

Block 1

Block 5

Block 13

Block 9

Host B

Block 2

Block 6

Block 14

Block 10

Host C

Block 3

Block 7

Block 15

Block 11

Host D

Block 4

Block 8 - > Host D

Block 16

Block 12 -> Host D

Host X

Block1 -> Host A

Block 13 -> Host A

Block 9 -> Host A

Block 2 -> Host B

Block 14 -> Host B

Block 10 -> Host B

Block 3 -> Host C

Block 7 -> Host C

Block 15 -> Host C

Block 11 -> Host C

Block 16 -> Host D

Block 4 -> Host D

Block 5 -> Host A

Block 6 -> Host B

Block 8

Block 12

Page 21: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Extensible IO (XIO) system Provides a framework that implements a

Read/Write/Open/Close Abstraction Drivers are written that implement the

functionality (file, TCP, UDP, GSI, etc.) Different functionality is achieved by building

protocol stacks GridFTP drivers will allow 3rd party applications to

easily access files stored under a GridFTP server Other drivers could be written to allow access to

other data stores. Changing drivers requires minimal change to the

application code.

Page 22: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Optical Networking

Page 23: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Optical Networking Will have (is having) a profound effect on the

way we do computing. Networking speeds increassing >> Moore’s Law

The network is no longer necessarily the weak link.

Network bandwidth is >> than the bus on your computer. Explode the computer out over the network

Current economics make it reasonable to have your own fiber.

Page 24: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Optical Networking Whats the big deal?

Real QoS available Can carve out an OC-12 that is all yours

Its all yours, use any protocol you want i.e. ditch TCP

Optical switches are cheap Bringing about a change in the business model.

Users can request reconfiguration and get it within minutes or even seconds.

Page 25: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Optical Networking What does it mean to you?

User: It should be nothing more than a service to request a

reservation for a path. Should be able to get guaranteed, or at least reasonable and

consistent network performance.

Developer For middleware or network developers some people think the

level of exposure will go down to every optical switch (mems, or individual path switch, not switch like the entire box)

Applications can (and should) become network (or if you luck buzz words, photonic) aware. They can alter their behavior based on what network resources are available.

Page 26: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Optical Networking Caveat The following is all IMHO

We will never go to a fully circuit switched network

Just does not make sense for most web traffic Will be treated as a premium service

Most people will not have the luxury of true end-to-end optical switching

Will often do packet switching on-site to the Optical gear. Often will be virtual circuits over MPLS or GMPLS In those cases you still need proper traffic engineering

This may seem obvious, but this will only work if it is profitable.

Page 27: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Optical Networking Picture

Src Domain

Intra-DomainController

Domain 1

Intra-DomainController

Domain 2

Intra-DomainController

Dst Domain

Intra-DomainController

Inter-DomainController

Inter-DomainController

Inter-DomainController

Inter-DomainController

Dedicated Path (optical or not, virtual or dedicated lambdas)

Dynamic Optical Path AllocatorDOPA

Page 28: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Protocols

Page 29: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Protocols

Current TCP variants are great for web traffic, but not for bulk data transfer.

Attempts to improve this fall into a few categories: Evolution of AIMD Rate based Reliable UDP based XCP

Page 30: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

What’s wrong with TCP? Uses packet loss to signal congestion

Not necessarily the case Feedback is too coarse You only know when it is too late

Additive Increase, Multiplicative Decrease (AIMD) for its control mechanism

Need to have a buffer equal to the Bandwidth Delay Product

The Interaction between AIMD and BWDP

Page 31: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

AIMD To the first order this algorithm:

Begin in “slow start” by putting one packet on the network and then doubling (exponentially increasing) the number of unacknowledged packets (congestion window or CWND) each time an ACK is received, until it gets a congestion event

When there is a congestion event, enter congestion avoidance:

Cut the CWND in half Linearly increase one or a few packets per ACK

Note that CWND size is equivalent to Max BW

Page 32: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

BWDP TCP is reliable and must keep a copy of every

packet until the receiver acknowledges it has been received.

Use a tank as an analogy I can keep putting water in until it is full. After that, I can only put in one gallon for each

gallon removed. The volume of the tank is the cross sectional

area times the height Think of the BW as the area and the RTT as the

length of the network pipe.

Page 33: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Recovery Time

MTU

BW*RTT

RTTMTU

RTT *BW 21

TimeRecovery

RTT

MTU Recovery of Rate

(BWDP) RTT *BW 2

1 Recover toBytes

Recovery of Rate

Recover toBytes TimeRecovery

2

Page 34: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Recovery Time for a Single Congestion Event

T1 (1.544 Mbs) with 50ms RTT 10 KB Recovery Time (1500 MTU): 0.16 Sec

GigE with 50ms RTT 6250 KB Recovery Time (1500 MTU): 104 Seconds

GigE to Amsterdam (100ms) 1250 KB Recovery Time (1500 MTU): 416 Seconds

GigE to CERN (160ms) 2000 KB Recovery Time (1500 MTU): 1066 Sec (17.8

min)

Page 35: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Evolution of AIMD Still do Additive Increase, Multiplicative

Decrease, but change the rate at which you increase and decrease: High Speed TCP (RFC ?) Sally Floyd

After a set point (typically 38 packets) a table is used and more aggressive AIMD parameters are set

Scalable TCP Increase is exponential and decrease is 0.125

H-TCP (Leith) Uses time between congestion events to differentiate high

speed network from low speed.

Good Paper that compares these and others http://dsd.lbl.gov/DIDC/PFLDnet2004/papers/Bullot.pdf

Page 36: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Rate Based Control Flow control is based on changes in RTT

Assume an increase in RTT means buffers in the router are filling up

This was tried back in the 70’s, but TCP won out. FAST TCP out of Cal Tech (Stephen Low) uses

this approach. So far, results look very promising

Good performance Fairly TCP friendly

Still TCP, so socket libraries work

Page 37: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Reliable UDP Send the data UDP and use some form of

retransmit for lost data Reliable UDP (RFC ?)

Retransmit and control on TCP channel

Reliable Blast UDP (UIC/EVL/Leigh) Retransmit and control on TCP channel File based (must know # of bytes being transmitted)

Tsunami (IU/ANML/Wallace) Retransmit and control on TCP channel File based (must know # of bytes being transmitted)

SABUL/UDT (UIC/Grossman) SABUL (UDP/TCP), UDT (UDP ONLY) General transport, not file based

Page 38: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Reliable UDP Issues

TCP friendliness Only SABUL/UDT attempt to back off in the face of congestion. This means all others will shut out TCP flows This is why UDT went to UDP control, others strangle their

own control / retransmit flows.

Standardization Some routers won’t let UDP through

Great choice over an optical circuit though Can’t use the socket libraries Can use XIO

Page 39: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

XCP XCP (Katabi / MIT)

Generalization of Explicit Congestion Notification (ECN)

Decouples utilization control from fairness control

Based on control systems theory Requires router modifications

Again, IMHO This is the way to go You need a real feedback and a proper control

system

Page 40: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Firewalls

Page 41: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

What Security People Would like

Client

GridFTPServer

GridFTPServer

The NETWORK

Page 42: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

The NETWORK

What application People Want

Big, Fat, PipeGridFTPServer

GridFTPServer

Client

Page 43: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Where are we at today? Applications often can’t run, and if they are

high bandwidth apps, the firewall often limits performance.

Today, it means manual configuration to open holes, but then they stay open too long, and anyone can exploit them.

We need something better… We need to open holes in the firewall

Only while absolutely necessary For a specific party With confidence that the use of that port falls

within authorized behavior

Page 44: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

How can we do that? We envision two services

A service proxy Intercepts incoming service requests (SOAP in Web Services) Validates / Authorizes the request Pluggable framework so it can be easily extended Once authorized forwards it to the service

A secure, dynamic firewall “automatic garage door opener”

Temporarily opens holes through the firewall Uses lifetime management to ensure the holes close Ideally can specify exact host and service that will contact Possibly have no monitoring of packets after that? But could work with IDS, if it were fast enough.

Could, in theory, use these separately

Page 45: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

A picture

ProxyOpener

client

request

service Authorized requestOpen for

“n” minutes

Page 46: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Issues You might notice the picture does not show

who talks to the opener, this has significant security and effort impacts The Proxy? The Service? The Client?

Stability / Security of plug-ins What about non-SOAP requests? Would this be secure enough to let the fat

pipe run un-monitored?

Page 47: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Fusion Collaboratory

Reliable File TransferService

EstablishPaths

PPPL

Proxy SvcF/W GDO

GridFTPServer

GA

Proxy Svc F/W GDO

GridFTPServer

Data Channel

Open

EstablishPaths

Open

JET

Proxy Svc

GridFTPServer

GCB

GCBAgent

Client

Proxy Svc

GridFTPServer

GCB

Control Channels

Page 48: High Performance Grid Data Transfer Bill Allcock, ANL BNL Technology Meeting 26 April 2004

Even with Optical Paths…

Ultranet

Reliable File TransferService

EstablishPaths

PPPL

Proxy SvcF/W GDO

GridFTPServer

GA

Proxy Svc F/W GDO

GridFTPServer

Data Channel

Open

EstablishPaths

Open

DOPADOPA

ES-NetEstablish

PathEstablish

Path

ControlChannel

ControlChannel