LHC Computing Grid Project. Creating a Global Virtual Computing Centre for Particle Physics. ACAT’2002, 27 June 2002. Les Robertson, IT Division, CERN, les.robertson@cern.ch.
les robertson - cern-it 1last update: 19/04/23 21:24
LCG LHC Computing Grid Project
Creating a Global Virtual Computing Centre for Particle Physics
ACAT’2002
27 June 2002
Les Robertson
IT Division, CERN
LCG Summary
LCG – The LHC Computing Grid Project
  - requirements, funding, creating a Grid
Areas of work
  - grid technology
  - computing fabrics
  - deployment
  - operating a grid
Plan for the LCG Global Grid Service
A few remarks
LCG
source: CERN/LHCC/2001-004 - Report of the LHC Computing Review - 20 February 2001
(ATLAS with 270 Hz trigger)

                       ---------- CERN ----------   Regional   Grand
                       Tier 0    Tier 1    Total    Centres    Total
Processing (K SI95)    1,727     832       2,559    4,974      7,533
Disk (PB)              1.2       1.2       2.4      8.7        11.1
Magnetic tape (PB)     16.3      1.2       17.6     20.3       37.9

Summary of Computing Capacity Required for all LHC Experiments in 2007
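As a quick consistency check on the capacity figures above, the sketch below recomputes the CERN and grand totals from the per-tier values and derives the share of each resource located at CERN. The dictionary layout and the names `capacity`, `check_totals` and `cern_share` are illustrative assumptions, not part of the LHCC report. The tolerance absorbs rounding in the published values (for example, tape at Tier 0 plus Tier 1 is 16.3 + 1.2 but is listed as 17.6).

```python
# Figures copied from the 2007 capacity table; the data layout is an
# assumption made for this illustration.
capacity = {
    # resource: (tier0, tier1, cern_total, regional_centres, grand_total)
    "Processing (K SI95)": (1727, 832, 2559, 4974, 7533),
    "Disk (PB)": (1.2, 1.2, 2.4, 8.7, 11.1),
    "Magnetic tape (PB)": (16.3, 1.2, 17.6, 20.3, 37.9),
}


def check_totals(rows, tol=0.11):
    """Return {resource: (cern_ok, grand_ok)}, allowing for rounding."""
    result = {}
    for name, (t0, t1, cern, regional, grand) in rows.items():
        cern_ok = abs((t0 + t1) - cern) <= tol
        grand_ok = abs((cern + regional) - grand) <= tol
        result[name] = (cern_ok, grand_ok)
    return result


def cern_share(rows):
    """Fraction of each resource located at CERN (Tier 0 + Tier 1)."""
    return {name: row[2] / row[4] for name, row in rows.items()}


if __name__ == "__main__":
    for name, ok in check_totals(capacity).items():
        print(name, ok)
    for name, share in cern_share(capacity).items():
        print(f"{name}: {share:.0%} at CERN")
```

The shares make the slide's point concrete: on these figures roughly a third of the processing capacity sits at CERN, with the rest in the regional centres.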
Funding dictates – Worldwide distributed computing system
  - Small fraction of the analysis at CERN
  - Batch analysis – using 12-20 large regional centres
      how to use the resources efficiently
      establishing and maintaining a uniform physics environment
  - Data exchange and interactive analysis involving tens of smaller regional centres, universities, labs
LCG Summary - Project Goals
applications - tools, frameworks, environment, persistency
computing system → global grid service
  - cluster → automated fabric
  - collaborating computer centres → grid
  - CERN-centric analysis → global analysis environment
Goal – Prepare and deploy the LHC computing environment
This is not another grid technology project –
it is a grid deployment project
LCG Two Phases
The first phase of the project – 2002-2005: preparing the prototype computing environment, including
  - support for applications – libraries, tools, frameworks, common developments, …
  - global grid computing service
funded by Regional Centres, CERN, special contributions to CERN by member and observer states, middleware developments by national and regional Grid projects
  - manpower OK
  - hardware at CERN - ~40% funded
Phase 2 – construction and operation of the initial LHC Computing Service – 2005-2007
at CERN – missing funding of ~80M CHF
LCG Funding
Funding agencies have little enthusiasm for investing more in particle physics
HEP seen as a ground-breaker in computing
  - initiator of the Web
  - track record of exploiting leading edge computing
  - effective global collaborations
  - real need – for data as well as computation
  - one of the few application areas with real cross-border data needs
LHC in sync with
  - emergence of Grid technology
  - explosion of network bandwidth
We must deliver on Phase 1 for LHC - and show the relevance for other sciences
LCG Building a Grid
[Diagram: Computing Centre Cluster – mass storage, application servers, data cache, WAN]
LCG
automated management – installation, configuration, maintenance, monitoring, error recovery, …
  - reliability
  - cost containment
Cluster Fabric
autonomic computing
LCG The MONARC Multi-Tier Model (1999)
MONARC report: http://home.cern.ch/~barone/monarc/RCArchitecture.html
[Diagram: tier hierarchy]
CERN: Tier 0 - recording, reconstruction
Tier 1 – full service: FNAL, RAL, IN2P3
Tier 2: Lab a, Uni b, Lab c, Uni n
Department
Desktop
les.robertson@cern.ch
LCG Building a Grid
Collaborating Computer Centres
LCG Building a Grid
Collaborating Computer Centres
The virtual LHC Computing Centre
Grid
Alice VO
CMS VO
LCG Virtual Computing Centre
The user
  - sees the image of a single cluster
  - does not need to know
      where the data is
      where the processing capacity is
      how things are interconnected
      the details of the different hardware
  - and is not concerned by the conflicting policies of the equipment owners and managers
LCG Project Implementation Organisation
Four areas
Applications (see Matthias Kasemann’s presentation)
Grid Technology
Fabrics
Grid deployment
LCG Grid Technology Area: Leveraging Grid R&D Projects
US projects, European projects
Many national, regional Grid projects -- GridPP (UK), INFN-grid (I), NorduGrid, Dutch Grid, …
• significant R&D funding for Grid middleware
• risk of divergence
and is that good or bad?
• global grids need standards
• useful grids need stability
• hard to do this in the current state of maturity
• will we recognise and be willing to migrate to the winning solutions?
LCG Grid Technology Area
Ensuring that the appropriate middleware is available
Supplied and maintained by the “Grid projects”
It is proving hard to get the first “production” data intensive grids going as user services
Can the grid projects provide long-term support and maintenance?
Trade-off between new functionality and stability
LCG The Trans-Atlantic Issue
Bridging the ATLANTIC is essential for the project
  - HICB – High Energy and Nuclear Physics Intergrid Collaboration Board
  - GLUE – Grid Laboratory Uniform Environment
      compatible middleware and infrastructure
      funded by DataTAG and iVDGL
  - Certificates - OK
  - Schemas – under way, working with the wider Globus world, getting complicated – probably OK
  - Middleware components – not yet clear – but close collaboration on
      File replication
      Job scheduling
LCG Collaboration with Grid Projects
LCG must deploy a GLOBAL GRID
  - essential to have compatible middleware & grid infrastructure
  - better – have identical middleware
We are banking on GLUE
But we have to make some choices towards the end of the year
Services are about stability, support, maintenance
Can the R&D grid projects take commitments for long term maintenance of their middleware?
LCG Scope of Fabric Area
Tier 1,2 centre collaboration
Grid-Fabric integration middleware (DataGrid WP4)
Automated systems management package
Technology assessment (PASTA III) started
CERN Tier 0+1 centre
LCG Grid Deployment Area
The aim is to build a general computing service for a very large user population of independently-minded scientists using a large number of independently managed sites
This is NOT a collection of sites providing pre-defined services
  - it is the user’s job that defines the service
  - it is current research interests that define the workload
  - it is the workload that defines the data distribution
DEMAND - Unpredictable & Chaotic
But the SERVICE had better be Available & Reliable
LCG Grid Deployment – current status
Experiments can do (and are doing) their event production using distributed resources with a variety of solutions
  - classic distributed production – send jobs to specific sites, simple bookkeeping
  - some use of Globus, and some of the HEP Grid tools
  - other integrated solutions (ALIEN)
The hard problem for distributed computing is data analysis – ESD and AOD
  - chaotic workload
  - unpredictable data access patterns
this is where new Grid technology is needed – resource broker, replica management, ..
this is the problem that the LCG has to solve
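To make the resource broker and replica management idea concrete, here is a toy sketch of the decision such a broker faces for an analysis job: run where the data already is, or move the data to where CPUs are free. Everything in it is hypothetical (the `Site` class, the `replica_catalogue` contents, the `broker` function and the CPU counts); it is not the interface of Globus, DataGrid, or any LCG middleware.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    free_cpus: int


# Hypothetical replica catalogue: logical dataset name -> sites holding a copy.
replica_catalogue = {
    "aod/run1": {"CERN", "RAL"},
    "esd/run1": {"CERN"},
}


def broker(dataset, sites, catalogue):
    """Pick a site for a job that reads `dataset`.

    Sites with free CPUs that already hold a replica are preferred, since
    no transfer is needed; otherwise any site with free CPUs is eligible.
    Among the eligible sites the one with the most free CPUs wins.
    Returns (site_name, needs_transfer), or None if no CPU is free anywhere.
    """
    available = [s for s in sites if s.free_cpus > 0]
    if not available:
        return None
    holders = catalogue.get(dataset, set())
    local = [s for s in available if s.name in holders]
    candidates = local or available
    best = max(candidates, key=lambda s: s.free_cpus)
    return best.name, best.name not in holders


if __name__ == "__main__":
    sites = [Site("CERN", 0), Site("RAL", 40), Site("FNAL", 200)]
    print(broker("aod/run1", sites, replica_catalogue))
    print(broker("esd/run1", sites, replica_catalogue))
```

With a chaotic workload the second case is the interesting one: the only replica sits at a busy site, so the job goes elsewhere and a new replica has to be created, which is how the workload ends up defining the data distribution.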
LCG Grid Operation
[Diagram: grid operation. Elements: User; Call Centre; Grid Operations Centre; Network Operations Centre; Local site (local operation, local user support); Grid operations; Grid information service; Grid logging & bookkeeping; Virtual Organisation. Flows: queries, monitoring & alarms, corrective actions]
LCG Grid Operation
We do not know how to do this
Probably nobody knows
  - looks like network operation, but there are many more variables to be watched and adjusted
  - looks like multi-national commercial systems, but we have no central ownership, control
A 24 hour service is needed – round the clock and round the world
LCG Setting up the LHC Global Grid Service
First data is in 2007
LCG must learn from current solutions, leverage the tools coming from the grid projects, show that grids are useful, but set realistic targets
short term (this year):
  - use current solutions for physics data challenges (event productions)
  - consolidate (stabilise, maintain) middleware
  - learn what a “production grid” really means by working with DataGrid and VDT
medium term (next year):
  - Set up a reliable global grid service – initially only a few larger centres, but on three continents
  - Stabilise it
  - Several times the capacity of the CERN facility and as easy to use
LCG
Having stabilised this base service –
showing that we can run a solid service for the experiments
then – progressive evolution –
  - integrate all of the Regional Centre resources provided for LHC
  - improve quality, reliability, predictability
  - integrate new middleware functionality – possibly once per year
  - migrate to de facto standards as soon as they emerge
LCG Final comments
It is not just about distributing computation, it is also about managing distributed data (lots of it!) and maintaining a single view of the environment
All these parallel developments, rapidly changing technology .. may be good in the long term, but we must deploy a global grid service next year
A dependable, reliable 24 X 7 service is essential and not so easy to do with all these sites and all that data
Service Quality is the Key to Acceptance of Grids
Reliable OPERATION will be the factor that limits the size of practical Grids
We are getting funding because of the relevance for other sciences, engineering, business -- keeping things general, main-line must remain a high priority