
Page 1: The Grid approach for the HEP computing problem Massimo Sgaravatto INFN Padova massimo.sgaravatto@pd.infn.it

The Grid approach for the HEP computing problem

Massimo Sgaravatto, INFN Padova

massimo.sgaravatto@pd.infn.it

Page 2

What is a Grid?

“Dependable, consistent, pervasive access to resources”

Enable communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals in the absence of central control, omniscience, trust relationships

Make it easy to use diverse, geographically distributed, locally managed and controlled computing facilities as if they formed a coherent local cluster

Page 3

What does the Grid do for you?

You submit your work, and the Grid:

• “Partitions” your work into convenient execution units based on the available resources, data distribution, … (if there is scope for parallelism)
• Finds convenient places for it to be run
• Organises efficient access to your data
    • Caching, migration, replication
• Deals with authentication and authorization to the different sites that you will be using
• Interfaces to local site resource allocation mechanisms, policies
• Runs your jobs
• Monitors progress
• Recovers from problems
• Tells you when your work is complete
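The “partitioning” step can be illustrated with a minimal sketch. It assumes the work is a set of independent events (as in HEP) and splits them across sites in proportion to their free CPUs. The `Site` type, the site names and the CPU counts are hypothetical, and a real Grid scheduler would also weigh data location and site policies.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Site:
    name: str       # hypothetical site label
    free_cpus: int  # CPUs currently available at the site


def partition_events(n_events: int, sites: List[Site]) -> Dict[str, range]:
    """Split n_events into contiguous ranges, one per site,
    sized in proportion to each site's free CPUs."""
    total = sum(s.free_cpus for s in sites)
    if total == 0:
        raise ValueError("no free CPUs available")
    plan: Dict[str, range] = {}
    start = 0
    cpus_so_far = 0
    for s in sites:
        cpus_so_far += s.free_cpus
        # Cumulative rounding: the ranges cover every event exactly once.
        end = n_events * cpus_so_far // total
        plan[s.name] = range(start, end)
        start = end
    return plan


# Illustrative numbers only.
sites = [Site("padova", 20), Site("cnaf", 60), Site("cern", 120)]
plan = partition_events(1000, sites)
for name, events in plan.items():
    print(name, events.start, events.stop)
```

Each range could then be submitted as one execution unit to the corresponding site.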

Page 4

Grid approach in many sciences and disciplines …

Page 5

Mathematicians Solve NUG30

Looking for the solution to the NUG30 quadratic assignment problem

An informal collaboration of mathematicians and computer scientists

Condor-G delivered 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)

14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23

MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin
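As a quick sanity check on the figures above, the following sketch derives the average number of processors kept busy over the 7 days and verifies that the number sequence on the slide (presumably the assignment that was found) is a valid permutation of 1..30. Only the quoted numbers are taken from the slide; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

cpu_seconds = 3.46e8     # CPU time delivered by Condor-G (from the slide)
wall_days = 7            # duration of the run (from the slide)
peak_processors = 1009   # peak processor count (from the slide)

wall_seconds = wall_days * SECONDS_PER_DAY
avg_processors = cpu_seconds / wall_seconds          # roughly 572
cpu_years = cpu_seconds / (365 * SECONDS_PER_DAY)    # roughly 11

# The sequence printed on the slide.
assignment = [14, 5, 28, 24, 1, 3, 16, 15, 10, 9, 21, 2, 4, 29, 25,
              22, 13, 26, 17, 30, 6, 20, 19, 8, 18, 7, 27, 12, 11, 23]
is_permutation = sorted(assignment) == list(range(1, 31))

print(f"average processors: {avg_processors:.0f} (peak {peak_processors})")
print(f"CPU years: {cpu_years:.1f}")
print(f"valid permutation of 1..30: {is_permutation}")
```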

Page 6

Network for Earthquake Engineering Simulation

NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other

On-demand access to experiments, data streams, computing, archives, collaboration

NEESgrid: Argonne, Michigan, NCSA, UIUC, USC

Page 7

Grid approach to address the High Energy Physics (HEP) computing problem

Page 8

HEP computing characteristics

• Large numbers of independent events to process
• Large data sets, mostly read-only
• Modest floating point requirement
• Batch processing for production & selection; interactive for analysis
• Commodity components are just fine for HEP
• Very large aggregate requirements: computation, data
• The LHC challenge
    • Jump in orders of magnitude with respect to previous experiments
    • Geographical dispersion of people and of resources
    • Scale
        • Petabytes per year of data
        • Thousands of processors
        • Thousands of disks
        • Terabits/second of I/O bandwidth
        • …
    • Complexity
    • Lifetime (20 years)
    • …

Page 9

CMS: 1800 physicists, 150 institutes, 32 countries

World Wide Collaboration distributed computing & storage capacity

Page 10

Solution?

Regional Computing Centres
• Serve better the needs of the world-wide distributed community
    • Data available nearby
• Reduce dependence on links to CERN
• Exploit established computing expertise & infrastructure in national labs, universities

See http://www.cern.ch/monarc

Page 11

[Figure: tiered computing model diagram]

Components shown in the figure:
• Online System
• Offline Processor Farm, ~20 TIPS
• Tier 0: CERN Computer Centre
• Tier 1: FermiLab (~4 TIPS), France Regional Centre, Italy Regional Centre, Germany Regional Centre
• Tier 2: Tier2 Centres (~1 TIPS each), Caltech (~1 TIPS)
• Institutes (~0.25 TIPS), with a physics data cache
• Tier 4: Physicist workstations

Link labels shown in the figure: ~PBytes/sec, ~100 MBytes/sec, ~622 Mbits/sec or Air Freight (deprecated), ~622 Mbits/sec, ~1 MBytes/sec

Notes in the figure:
• There is a “bunch crossing” every 25 nsecs.
• There are 100 “triggers” per second.
• Each triggered event is ~1 MByte in size.
• Physicists work on analysis “channels”.
• Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
• 1 TIPS is approximately 25,000 SpecInt95 equivalents.
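The numbers quoted on this slide fit together arithmetically, as the following sketch shows. The live time of 10^7 seconds per year is an assumption (a common rule of thumb for accelerator running time, not stated on the slide), and decimal units (1 MByte = 10^6 bytes) are assumed throughout.

```python
bunch_crossing_interval_s = 25e-9  # one bunch crossing every 25 ns
trigger_rate_hz = 100              # triggers per second
event_size_bytes = 1e6             # ~1 MByte per triggered event

# Rate of bunch crossings and the reduction achieved by the trigger.
crossing_rate_hz = 1 / bunch_crossing_interval_s        # about 4e7 per second
rejection_factor = crossing_rate_hz / trigger_rate_hz   # about 4e5

# Data rate written out after triggering: matches the ~100 MBytes/sec label.
recorded_rate_bytes_s = trigger_rate_hz * event_size_bytes

# ASSUMPTION: about 1e7 seconds of data taking per year.
LIVE_SECONDS_PER_YEAR = 1e7
yearly_volume_bytes = recorded_rate_bytes_s * LIVE_SECONDS_PER_YEAR  # 1e15 = 1 PB

# A 622 Mbit/s link expressed in MBytes/sec.
link_mbytes_s = 622 / 8

# The ~20 TIPS figure expressed in SpecInt95 (1 TIPS ~ 25,000 SpecInt95).
farm_specint95 = 20 * 25_000

print(f"recorded rate: {recorded_rate_bytes_s / 1e6:.0f} MB/s")
print(f"yearly volume: {yearly_volume_bytes / 1e15:.1f} PB")
print(f"622 Mbit/s link: {link_mbytes_s:.2f} MB/s")
```

Under these assumptions a single experiment records on the order of a petabyte per year, consistent with the “Petabytes per year of data” scale mentioned earlier.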

Page 12

Grid as a possible approach

Various technical issues to address:
• Resource Discovery
• Resource Management
    • Distributed scheduling, optimal co-allocation of CPU, data and network resources, uniform interface to different local resource managers, …
• Data Management
    • Petabyte-scale information volumes, high speed data moving and replication, replica synchronization, data caching, uniform interface to mass storage management systems, …
• Automated system management techniques for large computing fabrics
• Monitoring Services
• Security
    • Authentication, Authorization, …
• Scalability, Robustness, Resilience

Grid model to address such problems

Page 13

State (HEP-centric view) circa 2.5 years ago

• Globus project
    • Globus toolkit: core services for Grid tools and applications (Authentication, Information service, Resource management, etc.)
• Good basis to build on, but:
    • No higher level services
    • Handling of lots of data not addressed
    • No production quality implementations
• Not possible to do real work with Grids yet …

Page 14

DataGrid Project (EDG)

• Project started Jan 2001, duration 3 years
• Goals
    • To build a significant prototype of the LHC computing model
    • To collaborate with and complement other European and US projects
    • To develop a sustainable computing model applicable to other sciences and industry: biology, earth observation etc.
• Specific project objectives
    • Middleware for fabric & Grid management: evaluation, test, and integration of existing M/W S/W and research and development of new S/W as appropriate
    • Large scale testbed
    • Production quality demonstrations
    • Open source and technology transfer

See http://www.eu-datagrid.org

Page 15

Main Partners

• CERN
• CNRS - France
• ESA/ESRIN - Italy
• INFN - Italy
• NIKHEF - The Netherlands
• PPARC - UK

Page 16

Associated Partners

Research and Academic Institutes
• CESNET (Czech Republic)
• Commissariat à l'énergie atomique (CEA) - France
• Computer and Automation Research Institute, Hungarian Academy of Sciences (MTA SZTAKI)
• Consiglio Nazionale delle Ricerche (Italy)
• Helsinki Institute of Physics - Finland
• Institut de Fisica d'Altes Energies (IFAE) - Spain
• Istituto Trentino di Cultura (IRST) - Italy
• Konrad-Zuse-Zentrum für Informationstechnik Berlin - Germany
• Royal Netherlands Meteorological Institute (KNMI)
• Ruprecht-Karls-Universität Heidelberg - Germany
• Stichting Academisch Rekencentrum Amsterdam (SARA) - Netherlands
• Swedish Natural Science Research Council (NFR) - Sweden

Industry Partners
• Datamat (Italy)
• IBM (UK)
• Compagnie des Signaux (France)

Page 17

The Middleware Working Group coordinates the development of the software modules, leveraging existing, long-tested open standard solutions. Five parallel development teams implement the software: job scheduling, data management, grid monitoring, fabric management and mass storage management.

The Infrastructure Working Group focuses on integrating the middleware with systems and networks to provide testbeds that demonstrate the effectiveness of DataGrid in production quality operations over high performance networks.

The Applications Working Group exploits the project developments to process large amounts of data produced by experiments in the fields of High Energy Physics (HEP), Earth Observations (EO) and Biology.

The Management Working Group is in charge of coordinating the entire project on a day-to-day basis and of disseminating the results among industries and research institutes.

[Figure: project structure diagram with blocks labelled Applications, Middleware, Infrastructure, Management and Testbed]

Page 18

DataGrid Architecture

[Figure: layered architecture diagram]

Local Computing
• Local Application, Local Database

Grid
• Grid Application Layer: Data Management, Job Management, Metadata Management, Object to File Mapping
• Collective Services: Information & Monitoring, Replica Manager, Grid Scheduler
• Underlying Grid Services: Computing Element Services, Authorization Authentication and Accounting, Replica Catalog, Storage Element Services, SQL Database Services, Service Index

Fabric
• Fabric services: Configuration Management, Node Installation & Management, Monitoring and Fault Tolerance, Resource Management, Fabric Storage Management

Page 19

DataGrid achievements

• Testbed 1: first release of EDG middleware
    • First workload management system
        • “Super scheduling” component using application data and computing elements requirements
    • File Replication Tools (GDMP), Replica Catalog, SQL Grid Database Service, …
    • Tools for farm installation and configuration
    • …
• Used for real productions
• Towards testbed 2: new functionalities and increased reliability

Page 20

Job submission scenario

dg-job-submit myjob.jdl

myjob.jdl:

Executable = "$(CMS)/exe/sum.exe";
InputData = "LF:testbed0-00019";
ReplicaCatalog = "ldap://sunlab2g.cnaf.infn.it:2010/rc=WP2 INFN Test Replica Catalog,dc=sunlab2g, dc=cnaf, dc=infn, dc=it";
DataAccessProtocol = "gridftp";
InputSandbox = {"/home/user/WP1testC", "/home/file*", "/home/user/DATA/*"};
OutputSandbox = {"sim.err", "test.out", "sim.log"};
Requirements = other.Architecture == "INTEL" && other.OpSys == "LINUX Red Hat 6.2";
Rank = other.FreeCPUs;
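The role of the Requirements and Rank expressions can be sketched as a simple matchmaking step: Requirements filters the candidate resources and Rank orders the survivors. This is a minimal illustration in Python, not the EDG broker itself; the resource records and host names are hypothetical, and the data-location aspects (InputData, ReplicaCatalog) are not modelled.

```python
from typing import Dict, List, Optional

Resource = Dict[str, object]


def requirements(other: Resource) -> bool:
    # Mirrors: other.Architecture == "INTEL" && other.OpSys == "LINUX Red Hat 6.2"
    return (other["Architecture"] == "INTEL"
            and other["OpSys"] == "LINUX Red Hat 6.2")


def rank(other: Resource) -> int:
    # Mirrors: Rank = other.FreeCPUs (higher is better)
    return int(other["FreeCPUs"])


def match(resources: List[Resource]) -> Optional[Resource]:
    """Return the best-ranked resource satisfying the requirements, or None."""
    candidates = [r for r in resources if requirements(r)]
    if not candidates:
        return None
    return max(candidates, key=rank)


# Hypothetical computing elements, for illustration only.
resources = [
    {"Name": "ce-a.example.org", "Architecture": "INTEL",
     "OpSys": "LINUX Red Hat 6.2", "FreeCPUs": 4},
    {"Name": "ce-b.example.org", "Architecture": "INTEL",
     "OpSys": "LINUX Red Hat 6.2", "FreeCPUs": 12},
    {"Name": "ce-c.example.org", "Architecture": "ALPHA",
     "OpSys": "Tru64", "FreeCPUs": 64},
]

best = match(resources)
print(best["Name"] if best else "no matching resource")
```

In this sketch the third resource has the most free CPUs but is discarded by the requirements, so the second one is selected.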

Page 21

Other HEP Grid initiatives

• PPDG (US)
• GriPhyN (US)
• DataTag & iVDGL
    • Transatlantic testbeds (to address interoperability)
• LCG (LHC Computing Grid Project)

Page 22

The Grid World: current status

• Dozens of major Grid projects in scientific & technical computing/research & education
• Considerable consensus on key concepts and technologies
    • Open source Globus Toolkit™ a de facto standard for major protocols & services
• Industrial interest emerging rapidly
• Opportunity: convergence of eScience and eBusiness requirements & technologies

Page 23

Problems

• Almost all projects have developed specialized services which have been layered on top of standard services (security, remote job execution, etc.)
• Patchwork of protocols and non-interoperable “standards” and difficult to re-use “implementations”

Exploit Web Services

Page 24

Web Services

• Increasingly popular standards-based framework for accessing network applications
    • W3C standardization; Microsoft, IBM, Sun, others
• WSDL: Web Services Description Language
    • Interface Definition Language for Web services
• SOAP: Simple Object Access Protocol
    • XML-based RPC protocol; common WSDL target
• WS-Inspection
    • Conventions for locating service descriptions
• UDDI: Universal Description, Discovery, & Integration
    • Directory for Web services
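To make “XML-based RPC protocol” concrete, here is a minimal sketch that builds a SOAP 1.1 request envelope with the Python standard library. The envelope namespace is the standard SOAP 1.1 one; the service namespace, the operation name `getJobStatus` and the parameter `jobId` are hypothetical and only serve as an illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:job-service"                  # hypothetical service namespace


def build_request(operation: str, params: Dict[str, str]) -> str:
    """Serialize an RPC-style call as a SOAP envelope:
    Envelope > Body > operation element > one child per parameter."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and parameter.
request_xml = build_request("getJobStatus", {"jobId": "job-42"})
print(request_xml)

# Parse it back to show the structure a receiving service would see.
parsed = ET.fromstring(request_xml)
body = parsed.find(f"{{{SOAP_ENV}}}Body")
```

In practice a WSDL document would describe the operation and its parameters, and the envelope would typically be sent over HTTP.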

Page 25

Open Grid Services Architecture (OGSA)

• Service orientation
    • Computational resources, storage resources, networks, programs, databases, etc. all represented as services
    • Allows standard interface definition mechanisms: multiple protocol bindings, multiple implementations, local/remote transparency
• Grid service: web service with semantics for service interactions
    • Management of transient instances (& state)
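One way to picture “management of transient instances (& state)” is a factory that creates short-lived service instances, each carrying its own state and a termination time that clients can extend. The sketch below is only an illustration of that idea: the class names, method names and the lifetime mechanism are assumptions, not the OGSA interfaces themselves.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ServiceInstance:
    handle: str               # unique name of this transient instance
    termination_time: float   # instance is discarded at or after this time
    state: dict = field(default_factory=dict)  # per-instance state


class Factory:
    """Hypothetical factory that creates and reaps transient service instances."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._ids = itertools.count(1)
        self.instances: Dict[str, ServiceInstance] = {}

    def create(self, now: float, lifetime_s: float) -> ServiceInstance:
        handle = f"{self.name}/{next(self._ids)}"
        instance = ServiceInstance(handle, now + lifetime_s)
        self.instances[handle] = instance
        return instance

    def extend(self, handle: str, now: float, lifetime_s: float) -> None:
        # A client that still needs the instance pushes its termination time out.
        self.instances[handle].termination_time = now + lifetime_s

    def sweep(self, now: float) -> List[str]:
        # Remove instances whose termination time has passed.
        expired = [h for h, inst in self.instances.items()
                   if inst.termination_time <= now]
        for handle in expired:
            del self.instances[handle]
        return expired


factory = Factory("job-service")
a = factory.create(now=0.0, lifetime_s=60.0)
b = factory.create(now=0.0, lifetime_s=60.0)
factory.extend(b.handle, now=50.0, lifetime_s=60.0)  # b now expires at t=110
expired = factory.sweep(now=100.0)
print("expired:", expired, "alive:", list(factory.instances))
```

Time is passed in explicitly so the behaviour is deterministic; a real service would use a clock.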

Page 26

Global Grid Forum

• Mission
    • To focus on the promotion and development of Grid technologies and applications via the development and documentation of "best practices," implementation guidelines, and standards with an emphasis on "rough consensus and running code"
• An Open Process for Development of Standards
• A Forum for Information Exchange
• A Regular Gathering to Encourage Shared Effort

See http://www.globalgridforum.org