HEPiX Storage, Edinburgh 27-28 May 2004
SE Experiences
Supporting Multiple Interfaces to Mass Storage
HEPiX Edinburgh May 2004
Outline
Objectives
Achievements
Experiences, lessons learned
Protocols, interfaces
Technical
Future
Conclusion
Objectives
Implement uniform interfaces to mass storage
Independent of underlying storage system
SRM
Uniform interface – much is optional
Develop back-end support for mass storage systems
Provide “missing” features – directory support?
Publish information
Objectives – SRM
SRM 1 provides async get, put
get (put) returns request id
getRequestStatus returns status of request
When status = Ready, status contains Transfer URL – aka TURL
Client changes status to Running
Client downloads (uploads) file from (to) TURL
Client changes status to Done
Files can be pinned and unpinned
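The request lifecycle above can be sketched as a client-side polling loop. This is a minimal simulation under stated assumptions, not a real SRM client: `FakeSRM`, its method names, the number of polls before a file becomes Ready, and the example host names are all hypothetical stand-ins for the web-service calls.

```python
import itertools


class FakeSRM:
    """Hypothetical in-memory stand-in for an SRM 1 endpoint (not a real API)."""

    def __init__(self, polls_until_ready=2):
        self._ids = itertools.count(1)
        self._requests = {}
        self._polls_until_ready = polls_until_ready

    def get(self, surl):
        # Asynchronous: returns a request id immediately, not the file.
        request_id = next(self._ids)
        self._requests[request_id] = {
            "surl": surl, "status": "Pending", "polls": 0, "turl": None,
        }
        return request_id

    def get_request_status(self, request_id):
        req = self._requests[request_id]
        if req["status"] == "Pending":
            req["polls"] += 1
            if req["polls"] >= self._polls_until_ready:
                # File is staged; the status now carries the transfer URL.
                name = req["surl"].rsplit("/", 1)[-1]
                req["status"] = "Ready"
                req["turl"] = "gsiftp://se.example.org/cache/" + name
        return {"status": req["status"], "turl": req["turl"]}

    def set_status(self, request_id, status):
        self._requests[request_id]["status"] = status


def fetch(srm, surl, download, max_polls=10):
    """Drive one file through Pending -> Ready -> Running -> Done."""
    request_id = srm.get(surl)
    for _ in range(max_polls):
        status = srm.get_request_status(request_id)
        if status["status"] == "Ready":
            break
    else:
        raise TimeoutError("request never became Ready")
    srm.set_status(request_id, "Running")  # client reports that transfer started
    download(status["turl"])               # data transfer happens outside the SRM
    srm.set_status(request_id, "Done")     # client reports completion
    return request_id


srm = FakeSRM()
transferred = []
rid = fetch(srm, "srm://se.example.org/data/foo", transferred.append)
print(transferred, srm.get_request_status(rid)["status"])
```

The `download` callback stands in for the GridFTP transfer; a real client would also sleep between polls.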
Objectives – SRM
SRM 1 interface requires web services
SOAP messages via HTTP
Also SOAP via GSI-HTTP: HTTPG
Data Transfer
GridFTP mandatory
Requires GSI
Also requires Gridmap files
Achievements
In EDG, we developed EDG Storage Element
Uniform interface to mass storage and disk
Interfaces with EDG Replica Manager
Also client command line tools
Interface was based on SRM but simplified
Synchronous
Trade-off between “getting it done soon” and “getting it right the first time”
Additional functionality such as directory functions
Highly modular system
Achievements – SE
[Figure: SE request processing along a time axis. Labels: “Thin layer” interface; Look up user; User database; Look up file data; File metadata; Access control; MSS access; Request and handler process management; Mass Storage.]
Achievements - SE
The request contains the sequence of names of handlers
As each handler processes the request, it calls a library that moves the name to an audit section
The library allows easy access to global data
Handlers may also have handler-specific data in the XML
Storing the XML output from each handler as the request gets processed makes it easy to debug the SE
[Figure: the request XML. Labels: GlobalData; sequence; audit; handler1 to handler5.]
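The sequence and audit mechanism described above can be sketched with a small XML request. This is an illustration only: the element names, handler names and handler bodies are hypothetical, and here the driver loop moves each name to the audit section, whereas in the SE each handler does so through a library call.

```python
import xml.etree.ElementTree as ET

# Hypothetical request layout: global data, handlers still to run, handlers done.
REQUEST = """
<request>
  <global><user>alice</user><file>/data/foo</file></global>
  <sequence>
    <handler>lookup_user</handler>
    <handler>access_control</handler>
  </sequence>
  <audit/>
</request>
"""


def lookup_user(root):
    # Handlers read and extend the shared global data in the XML.
    ET.SubElement(root.find("global"), "uid").text = "1001"


def access_control(root):
    ET.SubElement(root.find("global"), "allowed").text = "true"


HANDLERS = {"lookup_user": lookup_user, "access_control": access_control}


def run(xml_text):
    root = ET.fromstring(xml_text)
    sequence, audit = root.find("sequence"), root.find("audit")
    snapshots = []
    while len(sequence):
        entry = sequence[0]
        HANDLERS[entry.text](root)
        # Move the handler's name from the sequence to the audit section.
        sequence.remove(entry)
        audit.append(entry)
        # Keep the XML as it looked after this handler, for debugging.
        snapshots.append(ET.tostring(root, encoding="unicode"))
    return root, snapshots


root, snapshots = run(REQUEST)
print([h.text for h in root.find("audit")])
```

Changing what the SE does then amounts to changing the names listed in the sequence section.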
Achievements - SE
The choice of architecture was right
Multiple interfaces can be supported with the same core
New functionality can be added fairly easily
Adding or removing access control is more or less a question of adding or removing the appropriate handler
Easy to debug
Messages passed between handlers can be debugged easily
Disadvantage: not as fast as a monolithic core
Speed can be improved by having persistent handlers
Definition
A current definition of “Storage Element”
A “Storage Element” must provide:
GridFTP for data transfer
An SRM 1.0 or 1.1 web services interface
The difference is SRMCopy – 3rd party copying
Which the client can always do anyway using GridFTP
Information published via MDS in GLUE schema
This includes the end point
Interoperability via WSDL document
Experiences – technical
Initially we had to use Java for web services (Tomcat)
Interfacing Java to Unix processes is not always painless:
Java does not appear to work well with Unix pipes
We had to write a socket-to-pipe daemon
[Figure: a parent process calls fork() to create a child process, which calls exec() to become the new process; pipes are set up for std{in,out,err}.]
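The idea of a socket-to-pipe relay can be sketched as follows. This is not the daemon used in the SE: it handles a single request, uses a connected socket pair instead of a listening TCP socket, and the child command is a hypothetical stand-in for a handler process.

```python
import socket
import subprocess
import sys


def serve_one(conn, argv):
    """Read one request from a socket, pipe it to a child, send back its stdout."""
    chunks = []
    while True:
        data = conn.recv(4096)
        if not data:  # peer closed its sending side
            break
        chunks.append(data)
    # subprocess creates the child with pipes for stdin, stdout and stderr.
    child = subprocess.run(
        argv,
        input=b"".join(chunks),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    conn.sendall(child.stdout)
    conn.close()
    return child.returncode


# Demonstration: the "client" end plays the role of the Java web service.
client, server = socket.socketpair()
client.sendall(b"hello from the web service\n")
client.shutdown(socket.SHUT_WR)

# Hypothetical handler process: upper-cases whatever arrives on stdin.
child_cmd = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
status = serve_one(server, child_cmd)
reply = client.recv(4096)
client.close()
print(status, reply)
```

With a relay of this kind the Java side only has to speak to a socket, while the Unix side keeps ordinary pipes.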
Experiences – technical
gSOAP is pretty good these days
Now uses std::string for strings
But how to do GSI?
We use Ben Couturier’s plugin
Clean
Need no knowledge of Globus API to use it
The “official” GSI code was recently rewritten
It now uses Globus GSSAPI
Previously it used Globus IO module
Free, but you must register your email address
Experiences – development
We use RFIO for data access to both CASTOR and HPSS
Unfortunately the RFIO libraries are binary incompatible – but installed in the same location
Slight differences in rfio.h, too
ADS stress testing found limitations in the pathtape interface
All users in a VO map to same ADS user
EDG testbed: Hit limits on concurrent writes
Experiences - development
Look for opportunities for component reuse
Used or improved components from other EDG WPs
Almost all parts of the Data Transfer components were developed externally
Prefer Open Source
Often need to look at source to debug or supplement docs
Prototype implementations live longer than expected
SE’s metadata system was implemented as a prototype
Now replaced with improved system
Experiences – protocols
SRM 1 (and the EDG SE interface) is a control interface
Separating control from data transfer is useful
Can be used for load balancing, redirection, etc
Easy to add new data transfer protocols
However, files in cache must be released by the client or time out
[Figure: a client with one control connection and two data transfer (“data xf”) connections.]
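Because the control interface only hands out TURLs, it can spread clients over several data servers. A minimal sketch of that idea, assuming a simple round-robin policy; the class name, method name and host names are hypothetical and not part of SRM or the EDG SE.

```python
import itertools


class ControlInterface:
    """Hypothetical control service: hands out TURLs, never carries file data."""

    def __init__(self, data_servers):
        self._next_server = itertools.cycle(data_servers)

    def turl_for(self, path, protocol="gsiftp"):
        # Round-robin over data servers; each request may be redirected elsewhere.
        server = next(self._next_server)
        return f"{protocol}://{server}/{path.lstrip('/')}"


control = ControlInterface(["door1.example.org", "door2.example.org"])
turls = [control.turl_for("/cache/foo") for _ in range(3)]
print(turls)
```

Adding a new data transfer protocol in this sketch means returning a TURL with a different scheme; the control conversation is unchanged.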
Experiences – SRM 1
WSDL definitely helped interoperability
Nevertheless we often saw and still see incompatibilities between various implementations
Many parts of the protocol are open to interpretation
E.g. SRMCopy – how does the client know that the copy has finished? And what the error is if it fails?
Experiences - protocols
Confusingly many names for files:
LFN: Logical File Name
GUID: used by RM
SFN: Site filename aka PFN – “physical”
SURL: SFN as URL
StFN: Storage file name
TURL: Transfer URL
[Figure: where the names are used between User, Replica manager, Replica catalogue, Storage Element, SE disk cache and Mass Storage. Labels: LFN, GUID, SURL, TURL, StFN, SFN.]
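One way to keep the names apart is to follow a lookup from the logical name down to the storage URLs. The sketch below is an assumption-laden illustration: the catalogue contents, the GUID value, the host names and the `srm` scheme used to turn an SFN into a SURL are all hypothetical.

```python
# Hypothetical catalogue contents; real entries live in the replica catalogue.
LFN_TO_GUID = {
    "lfn:higgs-candidates.root": "guid:6f1c0a2e-0000-4000-8000-000000000001",
}
GUID_TO_SFNS = {
    "guid:6f1c0a2e-0000-4000-8000-000000000001": [
        "se1.example.org/data/higgs-candidates.root",
        "se2.example.org/data/higgs-candidates.root",
    ],
}


def sfn_to_surl(sfn, scheme="srm"):
    """A SURL is the SFN written as a URL (the scheme is assumed here)."""
    return f"{scheme}://{sfn}"


def replicas(lfn):
    """Resolve a logical file name to the SURLs of all its replicas."""
    guid = LFN_TO_GUID[lfn]
    return [sfn_to_surl(sfn) for sfn in GUID_TO_SFNS[guid]]


surls = replicas("lfn:higgs-candidates.root")
print(surls)
```

The TURL and StFN are not modelled here; they only appear once a Storage Element has been asked for one of the SURLs.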
Experiences – file access
Requirement: Clients must be able to access file in MSS independently
Not all access can be controlled by SE
SE cannot guarantee that it has the file…
…nor that file hasn’t been modified
Need to keep checksums of data
Also hard to guarantee reservations…
…unless the MSS can guarantee space for the SE
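Keeping checksums lets the SE notice when a file was changed or removed behind its back. A minimal sketch, assuming SHA-256 (the slides do not name an algorithm) and a hypothetical in-memory `Catalogue` in place of the SE's metadata system.

```python
import hashlib


def checksum(data: bytes) -> str:
    # SHA-256 is an assumption; any agreed checksum would serve the same purpose.
    return hashlib.sha256(data).hexdigest()


class Catalogue:
    """Hypothetical record of the checksum seen when the file was stored."""

    def __init__(self):
        self._recorded = {}

    def record(self, name, data):
        self._recorded[name] = checksum(data)

    def verify(self, name, data_now):
        """Return 'ok', 'modified' or 'missing' for the file as the MSS presents it."""
        if data_now is None:
            return "missing"
        return "ok" if checksum(data_now) == self._recorded[name] else "modified"


catalogue = Catalogue()
catalogue.record("foo", b"event data")
results = [
    catalogue.verify("foo", b"event data"),   # untouched
    catalogue.verify("foo", b"event dat4"),   # changed by another MSS client
    catalogue.verify("foo", None),            # deleted by another MSS client
]
print(results)
```

The check detects the problem after the fact; it does not prevent independent access to the MSS.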
Experiences – GridFTP
Requires Gridmap files
Pooled accounts (contributed by Andrew McNab) help make this scalable
Observed data transfer rates: 1-6 MB/s
But seen 30 MB/s with specially tuned settings
0.7 seconds to transfer a 200-byte file
Time is spent negotiating secure connection
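The effect of pooled accounts can be sketched as a mapping from certificate names to leased local accounts. This is an in-memory illustration only: the entries, the pool size and the `PoolMapper` class are hypothetical, and the leading-dot marker for a pool is assumed here rather than taken from the slides. Real deployments persist the leases on disk.

```python
import shlex

# Hypothetical gridmap entries: "<certificate DN>" <account or .pool>
GRIDMAP = '''
# hypothetical entries
"/C=UK/O=eScience/CN=alice example" .dteam
"/C=UK/O=eScience/CN=bob example" .dteam
"/C=UK/O=eScience/CN=site admin" seadmin
'''


def parse_gridmap(text):
    """Parse lines of the form: "<certificate DN>" <account>."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        dn, account = shlex.split(line)
        mapping[dn] = account
    return mapping


class PoolMapper:
    """In-memory sketch of pooled accounts; a leading '.' marks a pool (assumed)."""

    def __init__(self, mapping, pool_size=3):
        self._mapping = mapping
        self._pool_size = pool_size
        self._leases = {}  # DN -> leased account

    def account_for(self, dn):
        target = self._mapping[dn]
        if not target.startswith("."):
            return target  # static one-to-one mapping
        if dn not in self._leases:
            prefix = target[1:]
            in_use = set(self._leases.values())
            for i in range(1, self._pool_size + 1):
                candidate = f"{prefix}{i:03d}"
                if candidate not in in_use:
                    self._leases[dn] = candidate
                    break
            else:
                raise RuntimeError(f"pool {prefix} exhausted")
        return self._leases[dn]


mapper = PoolMapper(parse_gridmap(GRIDMAP))
alice = "/C=UK/O=eScience/CN=alice example"
bob = "/C=UK/O=eScience/CN=bob example"
admin = "/C=UK/O=eScience/CN=site admin"
print(mapper.account_for(alice), mapper.account_for(bob), mapper.account_for(admin))
```

With a pool, new users need one line per VO pattern or user rather than a hand-created local account each.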
Experiences – NFS
Used by jobs on worker nodes
SE is NFS mounted on WNs
Who converts TURL to local file name?
E.g. put the mount point into the filename
Potential need for several copies of file in disk cache
…due to ownership and access control issues
…and who gets the file?
Who releases it?
How does client get the filename?
What if the file times out before the job runs?
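One possible answer to the TURL question is a small client-side helper that places the TURL's path under the NFS mount point. The sketch assumes a `file://` TURL and a mount point of `/mnt/se`; both are hypothetical conventions, not something the slides specify.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def turl_to_local(turl, mount_point="/mnt/se"):
    """Map file://host/path to <mount_point>/path (assumed convention)."""
    parts = urlparse(turl)
    if parts.scheme != "file":
        raise ValueError(f"not a local-access TURL: {turl}")
    relative = PurePosixPath(parts.path).relative_to("/")
    if ".." in relative.parts:
        raise ValueError("path escapes the mount point")
    return str(PurePosixPath(mount_point) / relative)


local_name = turl_to_local("file://se.example.org/cache/foo")
print(local_name)
```

This only answers the naming question; ownership, release and timeout of the cached file still have to be handled by the SE and the job.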
Experiences – protocols
SRM messages are well designed
Each has a status: Pending, Failed, Running, Done
An array of file objects
Status: Pending, Ready, Running, Failed, Done
TURL, if applicable
File metadata (size etc)
Example:
Status=Running
File1: SFN=se.ac.uk/foo status=Ready TURL=gsiftp://se.ac.uk/…
File2: SFN=se.ac.uk/bar status=Failed Error=Access denied
File3: SFN=se.ac.uk/nain status=Pending
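The message layout above maps naturally onto two small record types. The sketch below mirrors the example request; the class names, field names and the `request_id` value are hypothetical, and the TURL is left elided as in the original.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FileStatus:
    sfn: str
    status: str                 # Pending, Ready, Running, Failed or Done
    turl: Optional[str] = None  # only present once the file is Ready
    error: Optional[str] = None
    size: Optional[int] = None  # file metadata, if known


@dataclass
class RequestStatus:
    request_id: int
    status: str                 # Pending, Failed, Running or Done
    files: List[FileStatus] = field(default_factory=list)

    def ready(self):
        """Files the client may start transferring now."""
        return [f for f in self.files if f.status == "Ready"]

    def failed(self):
        return [f for f in self.files if f.status == "Failed"]


request = RequestStatus(
    request_id=1,
    status="Running",
    files=[
        FileStatus("se.ac.uk/foo", "Ready", turl="gsiftp://se.ac.uk/…"),
        FileStatus("se.ac.uk/bar", "Failed", error="Access denied"),
        FileStatus("se.ac.uk/nain", "Pending"),
    ],
)
print([f.sfn for f in request.ready()], [f.error for f in request.failed()])
```

Per-file status is what lets a client start on the first file while others are still pending or have failed.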
Experiences – technical
The dCache SRM server/client
“Standard” – an SRM SHOULD work with these
…but not Open Source
Many idiosyncrasies
Fetches WSDL file before sending commands!
So the SOAP server has to be a web server as well – and HTTP/1.1 is not entirely trivial to implement correctly – and we can’t just use Apache because of GSI
Doesn’t check all errors – e.g. if a request fails
Doesn’t set status to Running
Experiences – security
Delegation is not ideal
Server must be trusted fully, no fine grained capabilities
Proxy certificates incompatible with normal certificates (but on IETF track)
Incompatibility between GSI (HTTPG) and HTTPS
We used HTTPS initially, hooking the SE core into Apache
Easy to develop and debug, reliable, well understood tech
Debug with curl etc
Need special clients for GSI
Andrew McNab proposes G-HTTPS: compatible with HTTPS, can still do delegation
Future – SRM 2.1
Frozen-ish as of this spring
Provides lots of new functionality…
…much of which is optional…
…e.g. directory support, space reservation
SRM 2 “basic” will (probably) be functionally equivalent to SRM 1
Space types, file types: Volatile, Durable, Permanent
Slightly complicated semantics
Future – SRM 2.1
Guaranteed space reservation is hard
Unless you have infinite space
Define all state transitions for the protocol
Avoid ambiguities in SRM 1
WSRF currently being discussed in SRM community
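Defining all state transitions can be as simple as an explicit table that every implementation checks against. The table below is illustrative only: it uses the file states named for SRM 1 in this talk and an assumed set of allowed moves, and is not taken from the SRM 2.1 specification.

```python
# Assumed transition table; terminal states have no successors.
ALLOWED = {
    "Pending": {"Ready", "Failed"},
    "Ready": {"Running", "Failed"},
    "Running": {"Done", "Failed"},
    "Done": set(),
    "Failed": set(),
}


def transition(current, new):
    """Return the new state, or raise if the move is not in the table."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current} -> {new}")
    return new


state = "Pending"
for step in ("Ready", "Running", "Done"):
    state = transition(state, step)
print(state)
```

With a table of this kind, a client that skips setting the status to Running is rejected rather than silently tolerated.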
Future – SRB
There exists an SRB SRM interface
Or not? Depending on who you ask
Build an SRM SRB interface?
E.g. build a handler for the EDG SE
Need to consider data transfer: GridFTP vs SRB
DICOM server support
[Figure: a DICOM server and a Storage Element on the Grid. Labels: WP10; DM2; Metadata; Encrypt, anonymise; Store key; Store patient metadata.]
Access control on metadata required; different ACLs for different types of metadata
Biomed applications in EGEE: Difficult task, not done in EDG
Conclusion
We have built a good framework
SRM 1 is a good choice
But file access is not trivial for the client
Technology is maturing
But not as quickly as most people would like
Still frequent tech and requirements changes
Prototypes go into production
Get it right vs build it now
Lots of idiosyncrasies – some end up getting promoted to “standard”
Still many challenges ahead – e.g. reservations