
A Managed Object Placement Service (MOPS) using NEST and GridFTP

Dr. Dan Fraser

John Bresnahan, Nick LeRoy, Mike Link, Miron Livny, Raj Kettimuthu

SciDAC Center for Enabling Distributed Petascale Science (CEDPS)
www.cedps.org


Overview

• Brief CEDPS overview

• Focus on data movement

• Managed Object Placement Service (MOPS)
  – Internal resource management (awareness)
    • GFork capability
  – External awareness & interaction
    • NEST (Network Storage Technology)


Petascale Data Challenge

• DOE facilities generate many petabytes of data (2 petabytes = all U.S. academic research libraries!)
• Remote users (at labs, universities, industry) need data!
• Rapid, reliable access is key to maximizing the value of $B facilities

[Figure: massive data at DOE facilities, accessed by remote distributed users (U)]


Bridging the Divide (1): Move Data to Users When & Where Needed

“Deliver this 100 Terabytes to locations A, B, C by 9am tomorrow”

• Reliable: recover from many failures
• Predictable: data arrives when scheduled
• Secure: protect expensive resources & data
• Scalable: deal with many users & much data
• Fast: >10,000x faster than usual Internet
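The quoted request implies a sustained transfer rate that is easy to estimate. The sketch below is only back-of-the-envelope arithmetic: it assumes decimal terabytes (10^12 bytes), and the function name `required_gbps` and the 18-hour window are illustrative assumptions, since the slide does not say when the request is made.

```python
def required_gbps(terabytes: float, hours: float) -> float:
    """Sustained rate in gigabits per second needed to move `terabytes` in `hours`."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    bits = terabytes * 1e12 * 8      # decimal terabytes -> bits (assumption)
    seconds = hours * 3600
    return bits / seconds / 1e9      # bits per second -> Gbit/s


# Hypothetical window: request placed at 3pm, deadline 9am the next day (18 hours).
rate_per_destination = required_gbps(100, 18)
```

Under these assumptions the result is about 12.3 Gbit/s per destination. If a single source had to send separate copies to A, B, and C, the aggregate rate would be three times that.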


Bridging the Divide (2): Allow Users to Move Computation Near Data

“Perform my computation F on datasets X, Y, Z”

• Flexible: easy integration of functions
• Secure: protect expensive resources & data
• Scalable: deal with many users & much data
• Science services: provide analysis functions near data source


Bridging the Divide (3): Troubleshoot End-to-End Problems

“Why did my data transfer (or remote operation) fail?”

• Instrument: include monitoring points in all system components
• Monitor: collect data in response to problems
• Diagnose: identify the source of problems
• Identify & diagnose failures & performance problems


What is GridFTP?

• Widely used, open source, production quality data mover
  – Separate control and data channels
  – Parallel streams (~3-5x faster than a single TCP stream)
  – Parallel stripes (multiple servers)
  – Partial file transfer
  – Multiple security options (GSI, SSH)
  – Third party control
  – Extensible for both file system & protocols
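Parallel streams, stripes, and partial file transfer all rest on the same idea: a file can be treated as independent byte ranges. The sketch below shows one simple way to partition a file into (offset, length) ranges. It illustrates the concept only and is not how GridFTP itself assigns data to streams; the function name `split_into_ranges` is a hypothetical helper.

```python
from __future__ import annotations


def split_into_ranges(file_size: int, streams: int) -> list[tuple[int, int]]:
    """Divide `file_size` bytes into up to `streams` contiguous (offset, length) ranges."""
    if file_size < 0 or streams < 1:
        raise ValueError("file_size must be >= 0 and streams must be >= 1")
    base, extra = divmod(file_size, streams)
    ranges = []
    offset = 0
    for i in range(streams):
        # The first `extra` ranges carry one additional byte each.
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges
```

Each range could then be handed to a separate stream or server, and a partial transfer is simply a request for one such range.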


GridFTP Modularity

• Data Storage Interfaces (DSI): POSIX, SRB, HPSS, NEST
• GridFTP Server: separate control and data channels, striping
• XIO Drivers: TCP, UDT (UDP), parallel streams, GSI, SSH
• Client Interfaces: globus-url-copy, C library, RFT (3rd party)

[Diagram labels: File Systems, I/O, Clients]
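The modularity described here means the server core does not need to know which storage system sits behind it. The sketch below illustrates that separation with a small abstract interface and an in-memory backend. It is not the real DSI API (which is a C interface); `StorageInterface` and `MemoryStorage` are hypothetical names used only for illustration.

```python
from abc import ABC, abstractmethod


class StorageInterface(ABC):
    """Hypothetical stand-in for a pluggable storage backend."""

    @abstractmethod
    def write(self, path: str, offset: int, data: bytes) -> None: ...

    @abstractmethod
    def read(self, path: str, offset: int, length: int) -> bytes: ...


class MemoryStorage(StorageInterface):
    """Toy backend that keeps files in memory and accepts out-of-order writes."""

    def __init__(self) -> None:
        self.files = {}  # path -> bytearray

    def write(self, path: str, offset: int, data: bytes) -> None:
        buf = self.files.setdefault(path, bytearray())
        end = offset + len(data)
        if len(buf) < end:
            buf.extend(b"\0" * (end - len(buf)))  # grow the file to fit this block
        buf[offset:end] = data

    def read(self, path: str, offset: int, length: int) -> bytes:
        return bytes(self.files[path][offset:offset + length])
```

Writing by offset matters because blocks that arrive over parallel streams need not arrive in order.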


GridFTP Advanced Configurations

• GFork (Internal awareness)
  – Robust Unix fork/setuid model
  – Allows server state to be maintained across connections
• Dynamic backends
  – Stability in the event of backend failure
  – Growing resource pools for peak demands
• Storage/Access Allocation (External awareness)
  – NEST (Network Storage Technology)


Why is awareness important?

• Currently, GridFTP does everything it is asked
• If asked, GridFTP in a worst case scenario could:
  – Use all available memory & buffers on the server
  – Write until the file system is full
  – Slow down all the transfers when overloaded
  (Worst case scenarios do not happen very often)
• Many tools are designed to work around these limitations
  – SRM, dCache, …

Services should be able to protect both themselves and their environments


GFork (Internal Awareness)

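[Diagram: clients open control channel connections to the server host, where the GFork server forks a GridFTP server instance per connection. Labels: Client (x3), Server Host, GFork Server, GridFTP Plugin, Fork, GridFTP Server Instance (x3), State Sharing Link, Inherited Links, Control Channel Connections]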
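The value of a long-lived master is that it can hold state no single per-connection instance can see, such as how much buffer memory is in use across the whole host. The sketch below models that idea in a single process. It is not GFork's actual interface (GFork uses forked processes connected by links); `MasterState` and its methods are hypothetical names.

```python
class MasterState:
    """State kept by a long-lived master on behalf of short-lived server instances."""

    def __init__(self, memory_limit: int) -> None:
        self.memory_limit = memory_limit
        self.granted = {}  # instance id -> bytes granted so far

    def request_memory(self, instance_id: int, wanted: int) -> int:
        """Grant up to `wanted` bytes without exceeding the host-wide limit."""
        available = self.memory_limit - sum(self.granted.values())
        grant = max(0, min(wanted, available))
        self.granted[instance_id] = self.granted.get(instance_id, 0) + grant
        return grant

    def instance_exited(self, instance_id: int) -> None:
        """Return an instance's memory to the pool when its connection closes."""
        self.granted.pop(instance_id, None)
```

With a limit of 100, two instances asking for 60 each would receive 60 and 40, and a third would receive nothing until one of them exits.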


External Awareness: Why storage allocations?

• Users need both temporary storage and long-term guaranteed storage.

• Administrators need a storage solution with configurable limits and policy.

• Administrators will benefit from NeST’s autonomous reclamation of expired storage allocations.


External Awareness: GridFTP + NeST

[Diagram: a NeST client performs lot operations against the NeST server, while globus-url-copy performs file transfers over GSI-FTP against the GridFTP server, which uses a NeST callout. Other labels: Negotiator, Disk Storage]


Overview of NeST

• NeST: Network Storage Technology
• Lightweight: Configuration and installation can be performed in minutes.
• Multi-protocol: Supports Chirp, GridFTP, NFS, HTTP
  – Chirp is NeST’s internal protocol
• Secure: GSI authentication
• Allocation: NeST negotiates “mini storage contracts” between users and server.


Storage allocations in NeST

• Lot – abstraction for storage allocation with an associated handle
  – Handle is used for all subsequent operations on this lot
• Client requests lot of a specified size and duration. Server accepts or rejects client request.
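The request/accept-or-reject exchange for lots can be sketched as follows. This is a minimal model under stated assumptions, not NeST's actual API: the names `Lot`, `LotServer`, and `request_lot` are hypothetical, the policy is a single fixed capacity, and time is passed in explicitly so the behaviour is easy to follow.

```python
from __future__ import annotations

import itertools
from dataclasses import dataclass


@dataclass
class Lot:
    handle: int        # used for all later operations on this lot
    size: int          # bytes reserved
    expires_at: float  # time at which the lot may be reclaimed


class LotServer:
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.lots: dict[int, Lot] = {}
        self._handles = itertools.count(1)

    def reclaim_expired(self, now: float) -> None:
        """Drop lots whose duration has run out, freeing their space."""
        self.lots = {h: lot for h, lot in self.lots.items() if lot.expires_at > now}

    def request_lot(self, size: int, duration: float, now: float) -> int | None:
        """Return a handle if the request is accepted, or None if it is rejected."""
        self.reclaim_expired(now)
        in_use = sum(lot.size for lot in self.lots.values())
        if size <= 0 or duration <= 0 or in_use + size > self.capacity:
            return None
        handle = next(self._handles)
        self.lots[handle] = Lot(handle, size, now + duration)
        return handle
```

A request that does not fit is rejected, but the same request can succeed later once an earlier lot has expired and been reclaimed.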


External Awareness Architecture

[Diagram: Client; GridFTP Server (Main Codebase, ACL Plugin, DSI Plugin); NEST]


ACL Plugin

• Authorize/Init
  – Grant access Yes/No
  – Plugin establishes context (initializes state for future requests)
• Create/Modify/Read a file
  – Given pathname and size
  – Creates a transaction
• Update Transaction
  – Plugin may time out waiting
  – Progressively commit bytes as ‘complete’
  – Finished flag
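The three groups of calls listed for the ACL plugin can be pictured as a small interface. The sketch below is an assumption-laden illustration, not the actual GridFTP plugin API: `AclPlugin`, `QuotaAcl`, and the method names and signatures are hypothetical, and the policy (a user allow-list plus a maximum file size) is invented for the example.

```python
from abc import ABC, abstractmethod


class AclPlugin(ABC):
    """Hypothetical shape of the three call groups: authorize, begin, update."""

    @abstractmethod
    def authorize(self, identity: str) -> bool: ...

    @abstractmethod
    def begin(self, action: str, pathname: str, size: int) -> bool: ...

    @abstractmethod
    def update(self, pathname: str, committed_bytes: int, finished: bool) -> None: ...


class QuotaAcl(AclPlugin):
    """Toy policy: known users may transfer files up to a fixed size."""

    def __init__(self, allowed_users, max_file_size: int) -> None:
        self.allowed = set(allowed_users)
        self.max_file_size = max_file_size
        self.identity = None      # context established by authorize()
        self.transactions = {}    # pathname -> bytes committed so far

    def authorize(self, identity: str) -> bool:
        if identity in self.allowed:
            self.identity = identity
            return True
        return False

    def begin(self, action: str, pathname: str, size: int) -> bool:
        if self.identity is None or size > self.max_file_size:
            return False
        self.transactions[pathname] = 0  # the pathname identifies the transaction
        return True

    def update(self, pathname: str, committed_bytes: int, finished: bool) -> None:
        self.transactions[pathname] = committed_bytes
        if finished:
            del self.transactions[pathname]  # finished flag closes the transaction
```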


Granting Access

[Sequence diagram: Client; GridFTP Server (Main Codebase, ACL Plugin, DSI Plugin); NEST]

• Client connects: GSI handshake
• GSI ID: the now-known ID is sent to the auth plugin
• Allow? Y: the plugin does whatever is needed to determine whether access is allowed
• 230 Enter: the client is notified of access


Receiving a File

[Sequence diagram: Client; GridFTP Server (Main Codebase, ACL Plugin, DSI Plugin); NEST]

• RECV file
• Path/size
• Reserve Space
• Allow? Y
• 150 Begin
• Start transfer
• Receive Bytes (01010101010101010101)
• Update Transaction
• Transaction Complete
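The receive steps above can be walked through in code. The sketch below is a simplified driver, not the GridFTP server's implementation: `receive_file` and its `reserve` and `update` callbacks are hypothetical stand-ins for the ACL plugin and storage interactions, and the log strings simply reuse the labels from the slide.

```python
from typing import Callable, Iterable, List


def receive_file(
    path: str,
    size: int,
    chunks: Iterable[bytes],
    reserve: Callable[[str, int], bool],
    update: Callable[[str, int, bool], None],
) -> List[str]:
    """Walk the receive steps and return a log of what happened, in order."""
    log = [f"RECV {path}"]
    if not reserve(path, size):        # path/size checked, space reserved
        log.append("denied")
        return log
    log.append("150 Begin")            # client is told to start the transfer
    received = 0
    for chunk in chunks:
        received += len(chunk)
        update(path, received, False)  # progressively commit bytes
        log.append(f"committed {received}")
    update(path, received, True)       # finished flag completes the transaction
    log.append("transaction complete")
    return log
```

Passing the policy in as callbacks keeps the transfer loop independent of how space is reserved, which mirrors the plugin split shown in the diagram.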


Notes

• Sending a file
  – Same interactions as receiving, only simpler (no space reservation)
• ACLs can be chained together
  – Chaining semantics still being worked out


Using NeST

• Init
  – NeST can use the client username/GSI subject to initialize.
• Create/modify
  – Reserve space with a given timeout
    • Pathname is key to transaction
    • If the timeout expires, the reservation and any uncommitted data are lost
• Update
  – Commit bytes, reset timeout.
• Complete
  – Clean up state
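The timeout behaviour in this list (each commit resets the clock, and an expired reservation loses whatever was not committed) can be modelled in a few lines. This is a sketch under assumptions rather than NeST's implementation: the `Reservation` class and its methods are hypothetical, and the current time is passed in explicitly instead of being read from a clock.

```python
class Reservation:
    """Space reserved for one pathname; uncommitted bytes are lost on expiry."""

    def __init__(self, pathname: str, size: int, timeout: float, now: float) -> None:
        self.pathname = pathname   # the pathname is the key to the transaction
        self.size = size
        self.timeout = timeout
        self.deadline = now + timeout
        self.committed = 0

    def expired(self, now: float) -> bool:
        return now >= self.deadline

    def commit(self, total_bytes: int, now: float) -> bool:
        """Commit bytes and reset the timeout; refuse if expired or over size."""
        if self.expired(now) or total_bytes > self.size:
            return False
        self.committed = total_bytes
        self.deadline = now + self.timeout
        return True
```

With a 30-second timeout, a commit at t=20 moves the deadline to t=50, so a further commit attempted at t=60 is refused and only the bytes committed earlier remain.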


Conclusion

• Services must be able to protect themselves

• Awareness of environment (Internal & External) is key

• Managed Object Placement Service
  – Straightforward technology advancements
  – Capability greater than sum of parts

• Invitation to work together…