An overview of the EGEE infrastructure and middleware. EGEE is funded by the European Union under contract IST-2003-508833. Elena Slabospitskaya, IHEP, NA3 manager for Russia.

Page 1: An overview of the EGEE infrastructure and middleware

An overview of the EGEE infrastructure and middleware

EGEE is funded by the European Union under contract IST-2003-508833

Elena Slabospitskaya, IHEP

NA3 manager for Russia

Page 2: An overview of the EGEE infrastructure and middleware

Sources of information

LCG-2 User Guide: https://edms.cern.ch/file/454439//LCG-2-UserGuide.html

LCG Releases: http://grid-deployment.web.cern.ch/grid-deployment/cgi-bin/index.cgi?var=releases

LCG-2 Install Notes (for administrators)

LCG-2 Manual Installation Guide (for administrators): https://edms.cern.ch/file/434070//LCG2Install.html

Site with EDG Tutorials: http://hep-proj-grid-tutorials.web.cern.ch/hep-proj-grid-tutorials/

Page 3: An overview of the EGEE infrastructure and middleware

Overall

1. GSI – Grid Security Infrastructure

2. Information System

3. Job Management

4. Data Management

5. Monitoring System

Conclusions

Page 4: An overview of the EGEE infrastructure and middleware

Main Logical Machine Types (1)

[Diagram: the sites Protvino (IHEP), Dubna (JINR), Moscow (SINP MSU) and Moscow (ITEP), shown with their CE, SE and UI machines, together with the services RMS (CERN), PS, RB (MSU) and BDII (MSU).]

Distributed system - A collection of (probably heterogeneous) automata whose distribution is transparent to the user so that the system appears as one local machine.

Page 5: An overview of the EGEE infrastructure and middleware

Main Logical Machine Types (2)

UI – User Interface

CE – Grid Gate (GG) and Worker Nodes

GG – Globus Gatekeeper, Globus Resource Allocation Manager, master server of the Local Resource Management System, local Logging and Bookkeeping server

SE – Classic Storage Element (GridFTP server). An SE may control large disk arrays or a Mass Storage System (MSS). These storage resources are managed by a Storage Resource Manager (SRM). The SRM interacts with the OS, the MSS and the transfer protocols to perform file transfer operations. As MSS, LCG-2 supports the dCache disk pool (GridFTP and rfio) and the tape archiving systems Castor (GridFTP and rfio) and Enstore (GridFTP).

RB – Resource Broker

RMS – Replica Management System

BDII – Berkeley DB Information Index

PS – Proxy Server

Page 6: An overview of the EGEE infrastructure and middleware

How do I login on the Grid ?

Two basic concepts:

Authentication: Who am I? “Equivalent” to a passport, ID card etc.

Authorisation: What can I do? Certain permissions, duties etc.

The Grid Security Infrastructure (GSI) in LCG-2 enables secure authentication and communication over an open network. GSI is based on public key encryption, X.509 certificates, and the Secure Sockets Layer (SSL) communication protocol.
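The difference between the two concepts can be illustrated with a toy sketch. It is not how GSI is implemented: no cryptography is performed, and the certificate fields, the trusted issuer and the permission table are all hypothetical names chosen for the example. It only shows that "who am I?" and "what can I do?" are two separate checks.

```python
from dataclasses import dataclass

# Hypothetical names and data, for illustration only.
# No cryptographic verification is performed here.
TRUSTED_ISSUERS = {"/C=XX/O=ExampleGrid/CN=Example CA"}
PERMISSIONS = {
    "/C=XX/O=ExampleGrid/CN=Alice Example": {"submit_job", "read_file"},
}


@dataclass(frozen=True)
class Certificate:
    subject: str  # Distinguished Name of the certificate holder
    issuer: str   # Distinguished Name of the signing authority


def authenticate(cert):
    """Who am I? Return the subject if the issuer is trusted, else None."""
    if cert.issuer in TRUSTED_ISSUERS:
        return cert.subject
    return None


def authorise(subject, action):
    """What can I do? Check the action against the subject's permissions."""
    return action in PERMISSIONS.get(subject, set())


alice = Certificate(
    subject="/C=XX/O=ExampleGrid/CN=Alice Example",
    issuer="/C=XX/O=ExampleGrid/CN=Example CA",
)
who = authenticate(alice)
print(who is not None and authorise(who, "submit_job"))
print(who is not None and authorise(who, "delete_file"))
```

A real deployment would replace the issuer lookup with X.509 signature verification over an SSL connection, as the slide states.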

Page 7: An overview of the EGEE infrastructure and middleware

Information System

- Provides information about grid resources and their status

- GLUE (Grid Laboratory for a Uniform Environment) schema – common conceptual data model for CE, SE and the CE-SE binding.

- MDS (Monitoring and Discovery Service) from Globus has been adopted as the provider of the IS.

- The IS implements the GLUE schema using OpenLDAP (LDAP – Lightweight Directory Access Protocol)

- GRIS – Grid Resource Information System – local on CE and SE

- GIIS – Grid Index Information Service – site (CE)

- BDII – Berkeley DB Information Index

Page 8: An overview of the EGEE infrastructure and middleware

Information system in LCG-2

Page 9: An overview of the EGEE infrastructure and middleware

Directory Information Tree

An LDAP information system is based on entries. Each entry describes an object (a person, a computer etc.) and has a unique Distinguished Name (DN). The kind of information that can be stored in each entry is specified in an LDAP schema.

Directory Information Tree (DIT) – a tree of directory entries
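As a rough illustration of how Distinguished Names form a tree, the sketch below splits a DN into its components and inserts it into a nested dictionary, root first. The two DNs are made-up examples (the host names are hypothetical), and the parser assumes that values contain no escaped commas, which a real LDAP library would handle.

```python
def parse_dn(dn):
    """Split a DN into (attribute, value) pairs, leaf first.

    Simplifying assumption: values contain no escaped commas.
    """
    pairs = []
    for rdn in dn.split(","):
        attr, _, value = rdn.strip().partition("=")
        pairs.append((attr, value))
    return pairs


def insert_entry(tree, dn):
    """Insert one entry into a nested-dict tree, walking from the root down."""
    node = tree
    for attr, value in reversed(parse_dn(dn)):
        node = node.setdefault(f"{attr}={value}", {})
    return tree


# Hypothetical entries, for illustration only.
dit = {}
insert_entry(dit, "GlueCEUniqueID=ce.example.org,mds-vo-name=local,o=grid")
insert_entry(dit, "GlueSEUniqueID=se.example.org,mds-vo-name=local,o=grid")
print(dit)
```

Both entries share the suffix `mds-vo-name=local,o=grid`, so they end up as siblings under the same branch of the tree.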

Page 10: An overview of the EGEE infrastructure and middleware

LDAP directory of an LCG-2 BDII

Page 11: An overview of the EGEE infrastructure and middleware

Job management

Workload Management System (WMS) services usually run on the Resource Broker machine:

- Network Server (NS), which accepts the incoming job requests from the UI and provides the job control functionality.

- Workload Manager, which is the core component of the system.

- Match-Maker (also called Resource Broker), whose duty is to find the best resource matching the requirements of a job (match-making process).

- Job Adapter, which prepares the environment for the job and its final description, before passing it to the Job Control Service.

- Job Control Service (JCS), which finally performs the actual job management operations (job submission, removal...)

- Logging and Bookkeeping service (LB), which logs all job management Grid events; these can then be retrieved by users or system administrators for monitoring or troubleshooting.
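The match-making step can be sketched as "filter, then rank". The code below is a simplified illustration and not the Match-Maker's actual algorithm: the resource attributes (`free_cpus`, `max_wall_minutes`), the host names and the job description are hypothetical, loosely modelled on the idea of a requirements expression and a rank expression in a job description.

```python
def match(job, resources):
    """Return the best resource for the job, or None if nothing qualifies.

    Step 1: keep only resources that satisfy the job's requirements.
    Step 2: among those, pick the one with the highest rank.
    """
    candidates = [r for r in resources if job["requirements"](r)]
    if not candidates:
        return None
    return max(candidates, key=job["rank"])


# Hypothetical resource descriptions, as an information system might publish.
resources = [
    {"name": "ce-a.example.org", "free_cpus": 4, "max_wall_minutes": 60},
    {"name": "ce-b.example.org", "free_cpus": 10, "max_wall_minutes": 720},
    {"name": "ce-c.example.org", "free_cpus": 2, "max_wall_minutes": 1440},
]

# Hypothetical job: needs at least two hours, prefers many free CPUs.
job = {
    "requirements": lambda r: r["max_wall_minutes"] >= 120,
    "rank": lambda r: r["free_cpus"],
}

best = match(job, resources)
print(best["name"] if best else "no matching resource")
```

Separating the hard constraint from the preference keeps the two concerns independent: a job can tighten its requirements without changing how the surviving candidates are ordered.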

Page 12: An overview of the EGEE infrastructure and middleware

A Job Submission Example

[Diagram: UI (with JDL), Resource Broker (RB), Job Submission Service (JSS), Logging & Book-keeping (LB), Information Service (IS), Replica Catalogue (RC), Storage Element (SE) and Compute Element (CE).]

Page 13: An overview of the EGEE infrastructure and middleware

A Job Submission Example

[Diagram: the job submission components (UI with JDL, RB, JSS, LB, IS, RC, SE, CE), with the labels "Job Submit Event" and "Input Sandbox".]

Job Status: submitted

Page 14: An overview of the EGEE infrastructure and middleware

A Job Submission Example

[Diagram: the job submission components (UI with JDL, RB, JSS, LB, IS, RC, SE, CE).]

Job Status: submitted -> waiting

Page 15: An overview of the EGEE infrastructure and middleware

A Job Submission Example

[Diagram: the job submission components (UI with JDL, RB, JSS, LB, IS, RC, SE, CE).]

Job Status: submitted -> waiting -> ready

Page 16: An overview of the EGEE infrastructure and middleware

A Job Submission Example

[Diagram: the job submission components (UI with JDL, RB, JSS, LB, IS, RC, SE, CE), with the label "BrokerInfo".]

Job Status: submitted -> waiting -> ready -> scheduled

Page 17: An overview of the EGEE infrastructure and middleware

A Job Submission Example

[Diagram: the job submission components (UI with JDL, RB, JSS, LB, IS, RC, SE, CE), with the label "Input Sandbox".]

Job Status: submitted -> waiting -> ready -> scheduled -> running

Page 18: An overview of the EGEE infrastructure and middleware

A Job Submission Example

[Diagram: the job submission components (UI with JDL, RB, JSS, LB, IS, RC, SE, CE).]

Job Status: submitted -> waiting -> ready -> scheduled -> running

Page 19: An overview of the EGEE infrastructure and middleware

A Job Submission Example

[Diagram: the job submission components (UI with JDL, Resource Broker, Job Submission Service, Logging & Book-keeping, Information Service, Replica Catalogue, Storage Element, Compute Element).]

Job Status: submitted -> waiting -> ready -> scheduled -> running -> done

Page 20: An overview of the EGEE infrastructure and middleware

A Job Submission Example

[Diagram: the job submission components (UI with JDL, Resource Broker, Job Submission Service, Logging & Book-keeping, Information Service, Replica Catalogue, Storage Element, Compute Element), with the label "Output Sandbox".]

Job Status: submitted -> waiting -> ready -> scheduled -> running -> done -> outputready

Page 21: An overview of the EGEE infrastructure and middleware

A Job Submission Example

[Diagram: the job submission components (UI with JDL, RB, JSS, LB, IS, RC, SE, CE), with the label "Output Sandbox".]

Job Status: submitted -> waiting -> ready -> scheduled -> running -> done -> outputready -> cleared

Page 22: An overview of the EGEE infrastructure and middleware

Possible Job States

SUBMITTED

WAITING

READY

SCHEDULED

RUNNING

DONE(ok)

DONE(failed)

OUTPUTREADY

CLEARED

ABORTED

DONE(cancelled)
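The nominal lifecycle shown in the job submission example can be written down as a small state machine. The sketch below is only an illustration: the nominal path follows the order of states in the example slides, while the rule that ABORTED can be reached from any non-terminal state is an assumption made here, since the transition arrows of the original diagram are not available. The DONE variants are collapsed into a single DONE state for brevity.

```python
# Nominal path, in the order shown in the job submission example.
NOMINAL_PATH = [
    "SUBMITTED", "WAITING", "READY", "SCHEDULED",
    "RUNNING", "DONE", "OUTPUTREADY", "CLEARED",
]

# Assumption: these states have no outgoing transitions.
TERMINAL = {"CLEARED", "ABORTED"}


def can_transition(current, new):
    """Return True if moving from `current` to `new` is allowed in this sketch."""
    if current in TERMINAL:
        return False
    if new == "ABORTED":
        # Assumption: a job may be aborted from any non-terminal state.
        return True
    position = NOMINAL_PATH.index(current)  # ValueError for unknown states
    return NOMINAL_PATH[position + 1] == new


print(can_transition("SUBMITTED", "WAITING"))
print(can_transition("SUBMITTED", "RUNNING"))
```

Encoding the path as an ordered list keeps the check to a single index lookup; a richer model would use an explicit transition table per state.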

Page 23: An overview of the EGEE infrastructure and middleware

Data Management: Data Naming

SURL – Storage URL. An SURL is a locator for a physical file. A SURL is often called PFN (Physical File Name).

srm://lxshare0282.cern.ch:8443/castor/cern.ch/home/dteam/generated/2004-02-11/filed8f59bcf-5c85-11d8-bbf3-c59c9bed1519

UUID – Universally Unique IDentifier. A UUID is a 128-bit number.

GUID – Grid Unique IDentifier. A UUID generated by the Replica Management System.

guid:e4fbe9b0-5c85-11d8-bbf3-c59c9bed1519

LFN – Logical File Name. A Logical File Name is a user-defined alias to a GUID.

lfn:anjita-demo0236-2004-11-02

TURL – Transport URL. A Transport URL is returned by an SRM in response to a request for a way to access a SURL.

rfio://lxshare0282.cern.ch//data/dt/stage/filec0fabd63-5cba-11d8-ba4c-e2aa3666572b.4003
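The four kinds of names can be told apart by their prefix. The following sketch classifies a name using only the schemes that appear on this slide; the function name and the decision to treat `rfio` as the only transport scheme are assumptions made for the example (the SURL used below is a shortened form of the one on the slide).

```python
import uuid
from urllib.parse import urlparse

# Only the transport scheme seen on the slide; others would be added here.
TURL_SCHEMES = {"rfio"}


def classify(name):
    """Return 'LFN', 'GUID', 'SURL' or 'TURL' for a grid file name."""
    if name.startswith("lfn:"):
        return "LFN"
    if name.startswith("guid:"):
        # A GUID is a UUID; uuid.UUID raises ValueError if it is malformed.
        uuid.UUID(name[len("guid:"):])
        return "GUID"
    scheme = urlparse(name).scheme
    if scheme == "srm":
        return "SURL"
    if scheme in TURL_SCHEMES:
        return "TURL"
    raise ValueError(f"unrecognised file name: {name!r}")


examples = [
    "lfn:anjita-demo0236-2004-11-02",
    "guid:e4fbe9b0-5c85-11d8-bbf3-c59c9bed1519",
    "srm://lxshare0282.cern.ch:8443/castor/cern.ch/home/dteam/",
    "rfio://lxshare0282.cern.ch//data/dt/stage/"
    "filec0fabd63-5cba-11d8-ba4c-e2aa3666572b.4003",
]
for example in examples:
    print(classify(example), example)
```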

Page 24: An overview of the EGEE infrastructure and middleware

Different filenames in LCG-2

Page 25: An overview of the EGEE infrastructure and middleware

REPLICA MANAGEMENT SYSTEM (RMS)

The main services offered by the RMS are the Replica Location Service (RLS) and the Replica Metadata Catalog (RMC).

The RLS maintains information about the physical location of the replicas (mapping them to the GUIDs). It is composed of several Local Replica Catalogs (LRCs), which hold the replica information for a single VO.

The RMC stores the mapping between GUIDs and the respective aliases (LFNs) associated with them, and maintains other metadata information (sizes, dates, ownerships...).

The last component of the Data Management framework is the Replica Manager. The Replica Manager presents a single interface for the RMS to the user, and interacts with the other services.
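The two mappings described above (LFN to GUID in the RMC, GUID to physical locations in an LRC) can be pictured as two dictionaries behind a single interface. This is a toy in-memory sketch, not the Replica Manager API: the class name, method names, host names and file names are all hypothetical.

```python
class ToyReplicaManager:
    """Single entry point over two in-memory catalogs (illustration only)."""

    def __init__(self):
        self.rmc = {}  # Replica Metadata Catalog: LFN -> GUID
        self.lrc = {}  # Local Replica Catalog: GUID -> set of SURLs

    def register(self, lfn, guid, surl):
        """Record an alias for the GUID and one physical replica of it."""
        self.rmc[lfn] = guid
        self.lrc.setdefault(guid, set()).add(surl)

    def list_replicas(self, lfn):
        """Resolve LFN -> GUID -> SURLs; raises KeyError for unknown LFNs."""
        guid = self.rmc[lfn]
        return sorted(self.lrc.get(guid, ()))


# Hypothetical names, for illustration only.
rm = ToyReplicaManager()
guid = "guid:e4fbe9b0-5c85-11d8-bbf3-c59c9bed1519"
rm.register("lfn:demo-file", guid, "srm://se1.example.org/data/demo-file")
rm.register("lfn:demo-file", guid, "srm://se2.example.org/data/demo-file")
print(rm.list_replicas("lfn:demo-file"))
```

The user only ever handles the LFN; the GUID is the stable key that ties the alias to however many physical copies exist.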

Page 26: An overview of the EGEE infrastructure and middleware

Interactions of the RM with other grid components

Page 27: An overview of the EGEE infrastructure and middleware

CONCLUSIONS

The EGEE Grid requires resources, an infrastructure and middleware that allows for:

- Authentication and Authorization

- Information services

- Job and Data Management

- Monitoring and fault recovery

Page 28: An overview of the EGEE infrastructure and middleware

Appendix. Data Management Services

SRM – Storage Resource Manager. A high-level interface to a storage system.

RLS – Replica Location Service. The distributed service providing the mappings between GUIDs and SURLs. An RLS has two components: LRC and RLI.

LRC – Local Replica Catalog. The catalog storing GUID to SURL mappings, along with SURL attributes, for a given site or a single Storage Resource Manager at a site.

RLI – Replica Location Index. The catalog storing information about which Local Replica Catalogs have GUID to SURL mappings for a particular GUID. It thus provides the link between different LRCs, allowing for distributed indexing and querying of the catalogs.

RMC – Replica Metadata Catalog. The catalog storing LFN aliases for GUIDs, as well as attributes on GUIDs and LFNs.

ROS – Replica Optimization Service. A service providing information to guide selection between replicas located at different sites. This is based on network information collected from available network monitors.

http://lspitsky.home.cern.ch/lspitsky/

MDS – Monitoring and Discovery Service

LCFG – Local ConFiguration System – Edinburgh
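The relationship between the LRC and the RLI can be sketched as an index over per-site catalogs: the index only records which sites know a GUID, and the SURLs themselves are then fetched from those sites' catalogs. The site names, GUID strings and SURLs below are hypothetical, and the functions are an illustration of the indexing idea rather than the actual service interface.

```python
def build_rli(lrcs):
    """Build the index: GUID -> set of sites whose LRC has that GUID."""
    rli = {}
    for site, catalog in lrcs.items():
        for guid in catalog:
            rli.setdefault(guid, set()).add(site)
    return rli


def locate(guid, rli, lrcs):
    """Consult the index first, then query only the LRCs it points to."""
    surls = []
    for site in sorted(rli.get(guid, ())):
        surls.extend(lrcs[site][guid])
    return surls


# Hypothetical per-site Local Replica Catalogs: GUID -> list of SURLs.
lrcs = {
    "site-a": {"guid:1111": ["srm://se.site-a.example.org/data/f1"]},
    "site-b": {
        "guid:1111": ["srm://se.site-b.example.org/data/f1"],
        "guid:2222": ["srm://se.site-b.example.org/data/f2"],
    },
}

rli = build_rli(lrcs)
print(locate("guid:1111", rli, lrcs))
```

Keeping the index separate from the mappings means a lookup never has to contact a site that holds no replica of the requested file.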