Advanced topic: The SRM protocol and the StoRM implementation Ezio Corso (EGRID Project, ICTP)

Advanced topic on data management

- I will briefly describe how the classic SE works:
  - Highlight design points and consequences for file security.
  - File security: POSIX-like ACL access to files from the GRID.
- I’ll then talk about the SRM protocol:
  - Its origin: to allow tape resources to be accessed from the GRID.
  - Particular attention to design differences with the classic SE.
- SRM transition as an interface to disk storage resources:
  - Differences with tape-based systems.
- I’ll finally talk about StoRM: an SRM implementation that allows POSIX-like ACL access.

I. Classic SE

Classic SE

- It allows disk resources to be accessed from the GRID.
- What makes a machine into an SE? Three components are needed:
  - A component that publishes to the GRID that it is an available storage resource.
  - The usual framework for authentication: GSI.
  - A component that actually moves the files around: the characterizing feature!

Classic SE

- Component that allows the GRID to be aware of its presence, i.e. to be included in the GRID information system:
  - There is an LDAP server that publishes information about the SE.
  - Information is organised according to the GlueSchema, specifically by the GlueSEUniqueID entity:
    - Information describing the SE, such as its name and the listening port of the service.
    - Information specific to each VO that the SE is serving, such as the local path to the file-holding directory, available space, etc.
  - Part of the information is updated dynamically, especially the disk space available and the disk space occupied:
    - It is done through LDAP providers found in /opt/lcg/libexec.
    - The providers periodically run scripts which update the dynamic information.
  - Finally, the rest of the grid information system periodically polls the information made available by the SE.

Classic SE

- User authentication: Grid Security Infrastructure (GSI)
  - GSI core of the GLOBUS 2.4 libraries: used by the service in charge of moving files around! i.e. /opt/globus/lib/libglobus_gsi_credential_gcc32dbg.so.0, /opt/globus/lib/libglobus_gsi_proxy-core_gcc32dbg.so.0, etc.
  - Set of scripts run by cron jobs to manage pool accounts:
    - /opt/edg/sbin/edg-mkgridmap creates a gridmap file by reading a local configuration file that specifies the sources of allowed credentials: an LDAP server or a specific file.
    - /opt/edg/sbin/lcg-expiregridmapdir removes the mapping to local credentials when a grid user is no longer working on that machine.
    - /opt/edg/sbin/edg-fetch-crl retrieves the revocation lists of invalid certificates.
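
To make the role of the gridmap file concrete, here is a minimal Python sketch that parses entries of the usual form "certificate subject" followed by a local account. The sample subjects, the account names and the helper names are invented for illustration, and the leading dot marking a pool account is assumed from the common convention rather than taken from these slides.

```python
import shlex

def parse_gridmap(text):
    """Return {certificate subject: local account} from grid-mapfile text."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # shlex honours the double quotes around the certificate subject
        subject, account = shlex.split(line)
        mapping[subject] = account
    return mapping

def is_pool_account(account):
    # Assumed convention: ".name" means "any free account from pool 'name'"
    return account.startswith(".")

# Hypothetical entries, for illustration only
SAMPLE = '''
# subject                          local account
"/C=IT/O=EXAMPLE/CN=Alice Rossi" alice
"/C=IT/O=EXAMPLE/CN=Bob Bianchi" .egrid
'''

if __name__ == "__main__":
    for subject, account in parse_gridmap(SAMPLE).items():
        kind = "pool" if is_pool_account(account) else "individual"
        print(subject, "->", account, f"({kind})")
```

The distinction between individual and pool accounts matters later on, when file access has to be controlled per grid user.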

Classic SE

- Component that carries out the functionality of moving files around the GRID.
- In general it is just any implementation of a transport protocol that supports GSI!
  - GridFTP: the most common!
  - RFIO
  - Anything that somebody comes up with, as long as it is GSI-enabled: it is just a matter of who will adopt it and use it!

Classic SE

- GridFTP: essentially an FTP server extended/optimized for large data transfers:
  - Parallel streams for speed.
  - Allows checkpoints during file transfers, for later resuming.
  - Authentication through GSI certificates instead of user name + password.
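
The bookkeeping behind parallel streams and checkpoints can be illustrated without any GridFTP library: split the file into byte ranges, one per stream, and remember which ranges completed so that only the missing ones are requested on resume. This is a plain Python sketch of the idea, not of the GridFTP wire protocol; both function names are hypothetical.

```python
def split_ranges(size, streams):
    """Split [0, size) into at most `streams` contiguous (start, end) ranges."""
    if size < 0 or streams < 1:
        raise ValueError("size must be >= 0 and streams >= 1")
    chunk, extra = divmod(size, streams)
    ranges, start = [], 0
    for i in range(streams):
        # the first `extra` streams carry one additional byte
        length = chunk + (1 if i < extra else 0)
        if length == 0:
            continue  # more streams than bytes: leave this stream idle
        ranges.append((start, start + length))
        start += length
    return ranges

def remaining(ranges, done):
    """Ranges still to transfer, given the ranges recorded at the checkpoint."""
    return [r for r in ranges if r not in done]

if __name__ == "__main__":
    plan = split_ranges(10, 3)
    print(plan)
    print(remaining(plan, {plan[0]}))
```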

Classic SE

Central point:
- It is FTP! A user can do whatever an FTP client allows!
- There is no separation between what can be done from the grid and the actual transport protocol.
- There is no explicit and separate list of file manipulation operations that can be done from the grid!
- There is no uniform view of the possible file manipulations: they are tied to the underlying transport protocol!
  - Depending on the protocol, you may not have the same functionality.
  - To get the same functionality, the specific protocol must be used: it may not be possible to access all SEs seamlessly!

Classic SE

- Compare with CEs, which have an LRMS interface to forked jobs or to batch jobs.
- It is an abstraction layer over the kinds of computations that can be done.
- LRMS may not be a great protocol (gLite CEs are somewhat different)… yet it is an attempt to introduce an abstraction.

Classic SE

- A more serious consequence of the lack of abstraction is how to apply POSIX-ACL-like control on files from the grid. It is left up to the transport protocol!
- For GridFTP:
  - It is FTP modified for GSI.
  - FTP allows file manipulation compatible with the underlying Unix filesystem permissions.
- If grid control on files is needed, it is the underlying filesystem that must be carefully managed!
  - Map users to specific local accounts, not pool accounts: each grid user can then be controlled individually once inside the machine.
  - Partition local accounts into specially created groups that reflect data access patterns.
  - A carefully crafted directory tree guides data access.
- So a grid user with no access rights to a file is stopped because the GridFTP server is stopped in its tracks by the local filesystem!
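
What "stopped by the local filesystem" means can be shown with the classic owner/group/other mode bits. The sketch below is a simplified model in Python: the LocalUser type and the may_read helper are hypothetical, and it ignores root and any extended ACLs.

```python
import stat
from dataclasses import dataclass, field

@dataclass
class LocalUser:
    uid: int
    gids: set = field(default_factory=set)

def may_read(user, file_uid, file_gid, mode):
    """Classic Unix check: owner bits, else group bits, else other bits."""
    if user.uid == file_uid:
        return bool(mode & stat.S_IRUSR)
    if file_gid in user.gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)

if __name__ == "__main__":
    # A file owned by uid 1000, group 200, with mode rw-r-----
    owner = LocalUser(1000)
    colleague = LocalUser(1001, {200})
    outsider = LocalUser(1002, {300})
    for who in (owner, colleague, outsider):
        print(who.uid, may_read(who, 1000, 200, 0o640))
```

The decision depends only on the local uid and groups, which is why the mapping from grid users to local accounts and groups has to be crafted so carefully.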

Classic SE

- In any case the proposed solution is problematic because data may be present in several SEs, which requires:
  - Users to have the same UID across all SEs.
  - Replication/synchronisation of the directory structure across all SEs.
  - Users to be supplied with tools to manage permissions coherently across all SEs.

Classic SE

Central point:
- The GRID lacked the concept of access control within the same VO.
  - It was only possible to find it when passing to the local machine.
  - The local machine had the means to enforce it: users + group membership.
- Security is therefore set up behind the scenes, at the implementation level! No GRID concept is involved!
- No GRID abstraction is available to:
  - Express fine-grained authorization.
  - Express what can be accessed.
  - Check GRID credentials.

Classic SE

- VOMS proxies and GridFTP:
  - VOMS allows roles and groups to be defined: it therefore allows fine tuning of who the GRID user is.
  - It is up to the system receiving these detailed credentials to decide what local resources to use.
- For the SE there is still the same problem of explicitly listing what these resources are: the dependency on the transport protocol, as stated.

II. The SRM protocol

The SRM protocol

- Storage Resource Manager protocol: originally devised to allow grid access to tape-based resources that had a disk area acting as cache.
- Staging of files:
  - A request for a file arrives.
  - If it is in cache, it is returned right away.
  - Otherwise it is first fetched from tape, copied to disk and then returned.
  - The system takes care of consistency between cache and tapes.
- Needed to offset the latency due to the robotic arm switching tapes.
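
The staging logic above can be sketched as a toy model in Python. Two dictionaries stand in for the tape library and the disk cache; the class and attribute names are hypothetical and nothing here talks to real hardware.

```python
class StagingStore:
    """Toy model: one dict stands in for tape, another for the disk cache."""

    def __init__(self, tape):
        self.tape = dict(tape)
        self.cache = {}
        self.tape_reads = 0  # counts the slow operations

    def get(self, name):
        if name in self.cache:
            return self.cache[name]   # cache hit: returned right away
        data = self.tape[name]        # cache miss: slow fetch (KeyError if unknown)
        self.tape_reads += 1
        self.cache[name] = data       # copy to disk before returning
        return data

if __name__ == "__main__":
    store = StagingStore({"run42.dat": b"payload"})
    store.get("run42.dat")
    store.get("run42.dat")
    print("tape reads:", store.tape_reads)
```

Only the first request pays the tape latency; the second one is served from the cache.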

The SRM protocol

SRM was designed to handle that tape/disk-cache scenario from the GRID:

1. The presence of a cache area introduces the concept of file type:
  - Volatile: files get written in cache and the system then removes them automatically after their lifetime expires.
  - Permanent: files that get into cache are not removed automatically by the system.
  - Durable: files do have a lifetime that may expire, but the system does not remove them and instead sends an e-mail notification to the user.
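
The three file types differ only in what happens at lifetime expiry. A minimal Python sketch of a periodic sweep over the cache follows, under the assumption that lifetimes are stored as absolute timestamps; the type and function names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

class FileType(Enum):
    VOLATILE = "volatile"
    PERMANENT = "permanent"
    DURABLE = "durable"

@dataclass
class CachedFile:
    name: str
    ftype: FileType
    expires_at: float  # ignored for PERMANENT files

def sweep(files, now):
    """Return (kept files, names removed, names whose owner must be notified)."""
    kept, removed, notify = [], [], []
    for f in files:
        expired = f.ftype is not FileType.PERMANENT and now >= f.expires_at
        if expired and f.ftype is FileType.VOLATILE:
            removed.append(f.name)      # removed automatically
            continue
        if expired and f.ftype is FileType.DURABLE:
            notify.append(f.name)       # kept, but the user is warned
        kept.append(f)
    return kept, removed, notify
```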

The SRM protocol

2. File staging introduces the concept of asynchronous calls to get or put a file:
  - An SRM request is issued to get a file.
  - The server replies immediately without waiting for staging to complete.
  - The server returns a Request Token, which the client uses to periodically poll the request’s status.
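
The request-token pattern can be sketched with an in-memory stand-in for the server. This is not a real SRM client: the ToySrmServer class, its method names, the state strings and the example.invalid TURL are all invented for the example.

```python
import itertools

class ToySrmServer:
    """In-memory stand-in for an SRM server; not a real SRM API."""

    def __init__(self, polls_until_ready=2):
        self._ids = itertools.count(1)
        self._pending = {}
        self._polls_until_ready = polls_until_ready

    def prepare_to_get(self, surl):
        token = f"req-{next(self._ids)}"
        self._pending[token] = [surl, self._polls_until_ready]
        return token  # returned at once; staging goes on in the background

    def status(self, token):
        entry = self._pending[token]
        if entry[1] > 0:
            entry[1] -= 1
            return "QUEUED", None
        filename = entry[0].rsplit("/", 1)[-1]
        # Placeholder TURL, built only for this example
        return "READY", "gridftp://example.invalid/cache/" + filename

def wait_for_turl(server, token, max_polls=10):
    """Poll the request status until a TURL is available."""
    for _ in range(max_polls):
        state, turl = server.status(token)
        if state == "READY":
            return turl
        # a real client would sleep between polls
    raise TimeoutError(f"request not ready after {max_polls} polls")
```

The client never blocks inside the get call itself: it holds the token and decides how often to ask.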

The SRM protocol

3. The cache area also introduces a partition of the file namespace:
  - Tape must store files: there have to be names that uniquely identify the file on tape!
  - The cache area must serve files.
    - It may return a path to fetch the file on disk that is different from the name that uniquely identifies the file on tape.
    - It can easily support different fetching mechanisms… that is, different transport protocols!
- SRM reflects this distinction in the concept of SURLs and TURLs:
  - SURL: Storage URL. A name that identifies a grid file in SRM storage: it is what the GRID sees!
    srm://storage.egrid.it:8334/old-stocks/NYSE.txt
  - TURL: Transfer URL. A name that identifies a transport protocol and the path to fetch the file: it is how the GRID moves the file around!
    gridftp://storage.egrid.it:2110/home/ecorso/examples/2005/data.txt
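
Both kinds of name are ordinary URLs, so the two examples can be taken apart with Python's standard urllib.parse. The describe helper is a hypothetical name used only for this illustration.

```python
from urllib.parse import urlsplit

def describe(url):
    """Break a SURL or TURL into scheme, host, port and path."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
    }

SURL = "srm://storage.egrid.it:8334/old-stocks/NYSE.txt"
TURL = "gridftp://storage.egrid.it:2110/home/ecorso/examples/2005/data.txt"

if __name__ == "__main__":
    print(describe(SURL))  # the scheme says "ask the SRM service"
    print(describe(TURL))  # the scheme names the transport protocol
```

The scheme of the SURL is always srm, while the scheme of the TURL names whichever transport protocol the client should use.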

The SRM protocol

Central point:
- SRM introduces an abstraction to separate the transfer protocol from the file operation itself.
- Although introduced to handle the cache area, it also solves the classic SE issues!
- It decouples file operations from the transfer protocol!

The SRM protocol

Direct consequence:
- SRM servers do not move files in and out of GRID storage! They only return TURLs!
- It is up to the SRM client, once it gets a TURL, to call a GridFTP/RFIO/etc. client for moving files!
- SRM acts only as a broker for file management requests!
- Transfer is decoupled from data presentation!

The SRM protocol

Extra features and concepts in the protocol:
- Big issue of not running out of space during a large file transfer.
  - System used by the HEP community to store/manage huge amounts of data from the LHC.
- SRM introduced a space management and reservation interface.

The SRM protocol

- It distinguishes three types of reserved disk space:
  - Volatile: will be freed by the system as soon as its lifetime expires.
  - Permanent: will not be freed by the system.
  - Durable: will not be freed, but the user that allocated it will be warned.
- Space type and file type cannot be mixed in arbitrary ways:
  - Permanent space can host all three types of files.
  - Volatile space can only host Volatile files.
- The general way of working:
  - A space request is made.
  - The server returns a SpaceToken.
  - All subsequent SRM calls made by the client pass on the token.
  - The SRM server keeps track of tokens and recognises the allocated space.
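
The compatibility rules and the token bookkeeping can be sketched together in Python. The rows for permanent and volatile space come from the slide; the row for durable space is an assumption added to complete the table. The SpaceManager class is a hypothetical toy, not StoRM code or the SRM interface.

```python
import uuid

ALLOWED = {
    "permanent": {"permanent", "durable", "volatile"},  # stated on the slide
    "volatile": {"volatile"},                            # stated on the slide
    "durable": {"durable", "volatile"},                  # assumption, not on the slide
}

class SpaceManager:
    """Toy server-side bookkeeping of reserved space, keyed by token."""

    def __init__(self):
        self._spaces = {}

    def reserve(self, space_type, size):
        if space_type not in ALLOWED:
            raise ValueError(f"unknown space type: {space_type}")
        token = uuid.uuid4().hex
        self._spaces[token] = {"type": space_type, "free": size}
        return token

    def put(self, token, file_type, size):
        space = self._spaces[token]  # KeyError: token not recognised
        if file_type not in ALLOWED[space["type"]]:
            raise ValueError(f"{file_type} file not allowed in {space['type']} space")
        if size > space["free"]:
            raise ValueError("not enough reserved space left")
        space["free"] -= size
        return space["free"]  # bytes still available under this token
```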

The SRM protocol

The protocol calls: Data Transfer Functions
- A misnomer… no data is moved by an SRM server.
- srmPrepareToPut, srmPrepareToGet: for putting a file into GRID storage or getting one out.
- srmStatusOfPutRequest, srmStatusOfGetRequest: for polling!
- They work on SURLs!

The SRM protocol

The protocol calls: Cache area management
- srmExtendFileLifeTime: for extending the lifetime of volatile files.
- srmRemoveFiles: to remove permanent files.
- srmReleaseFiles, srmPutDone: to force early lifetime expiry.

The SRM protocol

The protocol calls: Directory functions to manage files on tape
- srmRmdir
- srmMkdir
- srmRm
- srmLs
- They work on SURLs!

The SRM protocol

The protocol calls: Space management functions
- srmReserveSpace
- srmReleaseSpace
- srmGetSpaceMetaData
- A Space Token is returned and used with all Data Transfer functions.

III. SRM applied to disk storage!

SRM applied to disk storage!

- SRM addresses the issues of the classic SE: it is natural to use it also for disk resources.
- There was also another important driving force for its adoption:
  - Many facilities were in place for LHC analysis of data coming from the experiments’ production centres.
  - The facilities had high-performance storage solutions in place, employing parallel disk file systems such as GPFS and Lustre.
  - With the advent of GRID technologies it became necessary to adapt existing installations to the GRID.

SRM applied to disk storage!

- The context of operation is now different: there is no tape with a cache in between.
- In general all concepts are kept, with slight semantic adjustments:
  - The SURL/TURL distinction is kept: it decouples the transfer protocol from data presentation, as stated.
  - The three file types are kept: some files may be copied and live just for a certain amount of time.
  - Space reservation is kept: it is an important functionality.
  - Directory functions are kept.

SRM applied to disk storage!

Some compromises:
- The asynchronous nature of srmPrepareToGet, srmPrepareToPut and srmCopy remains, although it no longer makes sense.
- The SpaceType distinction makes less sense:
  - Arguably the whole disk can be seen as permanent space, and so allows all three file types.
  - Akin to tapes, which are permanent by their nature.
- Releasing of files and lifetime extension remain for volatile files; srmRemoveFiles for managing cache files does not make sense.

IV. StoRM SRM implementation

StoRM SRM implementation

Result of a collaboration between:
- INFN - Grid.IT Project, from the Physics community
- ICTP - EGRID Project, to build a pilot national grid facility for research in Economics and Finance (www.egrid.it)

StoRM SRM implementation

StoRM’s implementation of SRM 2.1.1 is meant to meet three important requirements from the Physics community:
- Large volumes of data straining disk resources: Space Reservation is paramount.
- Boosted performance for data management: direct POSIX I/O calls.
- Security on data as expressed by VOMS: strategic integration with VOMS proxies.

StoRM SRM implementation

EGRID requirements:
- Data comes from Stock Exchanges: very strict, legally binding disclosure policies.
  - POSIX-like ACL access from the GRID environment.
- Promiscuous file access: the existing file organisation on disk must be seamlessly available from the grid, and files entering from the grid must blend seamlessly with the existing file organisation.
  - Very challenging: probably only partly achievable!
- StoRM: disk-based storage resource manager… allows for controlled access to files: a major opportunity for low-level intervention during implementation.

StoRM SRM implementation

How StoRM solves POSIX-like ACL access from the GRID:
- All file requests are brokered with the SRM protocol.
- When StoRM receives an SRM request for a file:
  - StoRM asks the policy source for the access rights to the given SURL for the given grid credentials.
  - The check is made at the grid credential level: not the local user as before!
  - And it is done on a grid view of the file, as identified by the SURL!
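
The shape of that check can be sketched as follows. This is a toy model in Python, not StoRM's code: the PolicySource class, the rule format (SURL prefix, certificate subject, allowed operations) and the sample subject are assumptions made for illustration, and the returned strings are only modelled on SRM status codes.

```python
class PolicySource:
    """Toy policy source keyed on SURL prefix and grid certificate subject."""

    def __init__(self, rules):
        # rules: list of (surl_prefix, subject, set of allowed operations)
        self._rules = rules

    def allowed(self, surl, subject, operation):
        return any(
            surl.startswith(prefix) and subject == who and operation in ops
            for prefix, who, ops in self._rules
        )

def handle_request(policy, surl, subject, operation):
    """Decide on grid-level names only: no local uid or path is consulted."""
    if not policy.allowed(surl, subject, operation):
        return "SRM_AUTHORIZATION_FAILURE"
    return "SRM_SUCCESS"

if __name__ == "__main__":
    policy = PolicySource([
        ("srm://storage.egrid.it:8334/old-stocks/",
         "/C=IT/O=EXAMPLE/CN=Alice Rossi", {"read"}),
    ])
    surl = "srm://storage.egrid.it:8334/old-stocks/NYSE.txt"
    print(handle_request(policy, surl, "/C=IT/O=EXAMPLE/CN=Alice Rossi", "read"))
    print(handle_request(policy, surl, "/C=IT/O=EXAMPLE/CN=Alice Rossi", "write"))
```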

StoRM SRM implementation

- The only part of the implementation outside of the protocol is the Policy Source: a GRID service that is able to formulate/express physical access rules to resources.
- StoRM leverages the grid’s LogicalFileCatalogue (LFC) as policy source:
  - The LFC is intended for Logical Names! StoRM therefore stretches its use.
  - Still, it is very GRID-friendly: it is not a proprietary solution!
- It would be better to have it explicitly in the SRM protocol:
  - SRM 2.1.1 does have some Permission functions (srmSetPermission, srmReassignToUser, srmCheckPermission), but their expressive power is weak and they will be re-addressed in the next version of the protocol.

StoRM SRM implementation

A last note: physical enforcement through JustInTime ACL setup.
- Initially, files have no ACLs set up: no user can access them.
- The local Unix account corresponding to the grid credentials is determined.
- An ACL granting the requested access is set up for the local user.
- The ACL is removed when the file is no longer needed.
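
The grant-then-revoke cycle maps naturally onto a context manager. The sketch below, in Python, only builds setfacl command lines and hands them to a caller-supplied run function, so nothing is executed unless the caller passes something like subprocess.run. The helper names and the subject-to-account dictionary are hypothetical, and how StoRM actually sets ACLs is not described in these slides.

```python
from contextlib import contextmanager

def grant_cmd(local_user, path, perms="r"):
    # setfacl -m adds or modifies an ACL entry for the named user
    return ["setfacl", "-m", f"u:{local_user}:{perms}", path]

def revoke_cmd(local_user, path):
    # setfacl -x removes the ACL entry for the named user
    return ["setfacl", "-x", f"u:{local_user}", path]

@contextmanager
def just_in_time_acl(subject, path, gridmap, run):
    local_user = gridmap[subject]          # grid credentials -> local account
    run(grant_cmd(local_user, path))       # ACL set up for the local user
    try:
        yield local_user                   # the file is accessible in here
    finally:
        run(revoke_cmd(local_user, path))  # ACL removed when no longer needed

if __name__ == "__main__":
    issued = []
    gridmap = {"/C=IT/O=EXAMPLE/CN=Alice Rossi": "alice"}
    with just_in_time_acl("/C=IT/O=EXAMPLE/CN=Alice Rossi",
                          "/storage/old-stocks/NYSE.txt", gridmap, issued.append):
        pass  # the transfer would happen here
    for cmd in issued:
        print(" ".join(cmd))
```

Putting the revoke step in a finally clause means the ACL is removed even if the transfer fails part way through.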

Advanced topic on data management

Thank-you!