The LHC Computing Grid Visit of Dr. John Marburger

Director, Office of Science and Technology Policy
Frédéric Hemmer, Deputy Head, IT Department
July 10, 2006
The LHC Computing Grid – July 2006

Outline: The Challenge; The (W)LCG Project; The LCG Infrastructure; The LCG Service; Beyond LCG

New frontiers in data handling
ATLAS experiment: ~150 million sensors read out at 40 MHz, i.e. ~10 million Gigabytes per second off the detector. Massive on-line data reduction; still ~1 Gigabyte per second to handle.

The Data Challenge
LHC experiments will produce about 15 million Gigabytes of data each year (about 20 million CDs!)
LHC data analysis requires computing power equivalent to ~100,000 of today's fastest PC processors.
Requires many cooperating computer centres, with CERN providing only ~20% of the computing resources.
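The headline numbers above hang together; a quick back-of-envelope check (a sketch assuming a 700 MB CD capacity, which is not stated on the slide):

```python
# Back-of-envelope check of the slide's figures.
# Assumption (not from the slide): one CD holds ~700 MB = 0.7 GB.

annual_data_gb = 15_000_000        # ~15 million Gigabytes recorded per year
cd_capacity_gb = 0.7               # assumed CD capacity

cds = annual_data_gb / cd_capacity_gb
print(f"~{cds / 1e6:.0f} million CDs per year")   # prints "~21 million CDs per year"

raw_gb_s = 10_000_000              # ~10 million GB/s off the detector
kept_gb_s = 1                      # ~1 GB/s after on-line selection
print(f"on-line reduction factor: {raw_gb_s // kept_gb_s:,}x")
```

The ~20 million CDs quoted on the slide is consistent with ~15 PB/year at this assumed CD capacity.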
Data Handling and Computation for Physics Analysis
[Diagram: detector → event filter (selection & reconstruction) → raw data → reconstruction → event summary data; event reprocessing and event simulation feed back in; batch physics analysis produces analysis objects (extracted by physics topic) for interactive physics analysis]

LCG Service Hierarchy
Tier-0: the accelerator centre
Data acquisition & initial processing
Long-term data curation
Data distribution to Tier-1 centres

Tier-1 centres:
Canada: TRIUMF (Vancouver)
France: IN2P3 (Lyon)
Germany: Karlsruhe
Italy: CNAF (Bologna)
Netherlands: NIKHEF/SARA (Amsterdam)
Nordic countries: distributed Tier-1
Spain: PIC (Barcelona)
Taiwan: Academia Sinica (Taipei)
UK: CLRC (Oxford)
US: FermiLab (Illinois), Brookhaven (NY)

Tier-1 roles: online to the data acquisition process, high availability; managed mass storage, a grid-enabled data service; all re-processing passes; data-heavy analysis; national and regional support

Tier-2: ~100 centres in ~40 countries; simulation; end-user analysis, batch and interactive; services (including data archive and delivery) from Tier-1s

LHC Computing Grid project (LCG)
More than 100 computing centres
12 large centres for primary data management: CERN (Tier-0) and eleven Tier-1s
38 federations of smaller Tier-2 centres
40 countries involved

The new European Network Backbone
LCG working group with Tier-1s and national/regional research network organisations
New GÉANT2 research network backbone
Strong correlation with major European LHC centres
Swiss PoP at CERN
[Map: October 2005 (Lambda Triangle)]
Summary of Computing Resource Requirements
~100K of today's fastest processors
All experiments, from the LCG TDR (June 2005):

                       CERN   All Tier-1s   All Tier-2s   Total
CPU (MSPECint2000s)      25            56            61     142
Disk (PetaBytes)          7            31            19      57
Tape (PetaBytes)         18            35             -      53

LHC Computing Grid Project - a Collaboration
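The table's totals can be cross-checked by summing each row across the tiers (a small sketch; the zero tape entry for Tier-2s stands in for the blank cell in the table):

```python
# Row-wise consistency check of the LCG TDR resource table (June 2005).
# Tier-2 tape is taken as 0, since the table leaves that cell empty.
table = {
    "CPU (MSPECint2000s)": (25, 56, 61, 142),
    "Disk (PetaBytes)":    (7, 31, 19, 57),
    "Tape (PetaBytes)":    (18, 35, 0, 53),
}
for name, (cern, t1, t2, total) in table.items():
    assert cern + t1 + t2 == total, f"{name} does not sum"
    print(f"{name}: {cern} + {t1} + {t2} = {total}")
```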
Building and operating the LHC Grid: a global collaboration between
the physicists and computing specialists from the LHC experiments
the national and regional projects in Europe and the US that have been developing Grid middleware
the regional and national computing centres that provide resources for LHC
the research networks
Researchers, Computer Scientists & Software Engineers, Service Providers

LCG depends on 2 major science grid infrastructures
The LCG service runs on, and relies on, grid infrastructure provided by:
EGEE: Enabling Grids for E-sciencE
OSG: US Open Science Grid

EGEE Grid Sites, Q1 2006: steady growth over the lifetime of the project
EGEE: >180 sites in 40 countries; >24,000 processors; ~5 PB storage

OSG & WLCG: OSG infrastructure is a core piece of the WLCG.
OSG delivers accountable resources and cycles for LHC experiment production and analysis.
OSG federates with other infrastructures; experiments see a seamless global computing facility.

Ramp-up of OSG use over the last 6 months
[Chart: OSG deployment]

Data Transfer by VOs (e.g. CMS)
LCG Service Planning
Pilot services: stable service from 1 June 2006
LHC service in operation: 1 Oct 2006; over the following six months, ramp up to full operational capacity & performance (2006: cosmics)
LHC service commissioned: 1 Apr 2007 (2007: first physics)
2008: full physics run

Service Challenges
Purpose:
Understand what it takes to operate a real grid service, run for weeks/months at a time (not just limited to experiment Data Challenges)
Trigger and verify Tier-1 & large Tier-2 planning and deployment, tested with realistic usage patterns
Get the essential grid services ramped up to target levels of reliability, availability, scalability and end-to-end performance

Four progressive steps from October 2004 through September 2006:
End 2004: SC1, data transfer to a subset of Tier-1s
Spring 2005: SC2, adding mass storage, all Tier-1s, some Tier-2s
2nd half 2005: SC3, Tier-1s, >20 Tier-2s, first set of baseline services
Jun-Sep 2006: SC4, pilot service
Autumn 2006: LHC service in continuous operation, ready for data taking in 2007

SC4: the Pilot LHC Service from June 2006
A stable service on which experiments can make a full demonstration of the experiment offline chain:
DAQ → Tier-0 → Tier-1: data recording, calibration, reconstruction
Offline analysis: Tier-1 ↔ Tier-2 data exchange; simulation, batch and end-user analysis
And on which sites can test their operational readiness:
Service metrics → MoU service levels
Grid services
Mass storage services, including magnetic tape
Extension to most Tier-2 sites
An evolution of SC3 rather than lots of new functionality
In parallel: development and deployment of distributed database services (3D project); testing and deployment of new mass storage services (SRM 2.2)

Sustained Data Distribution Rates: CERN → Tier-1s
Nominal rate into each Tier-1 for ALICE/ATLAS/CMS/LHCb during the pp run:

Centre                        Rate into T1 (MB/s)
ASGC, Taipei                  100
CNAF, Italy                   200
PIC, Spain                    -
IN2P3, Lyon                   -
GridKA, Germany               -
RAL, UK                       150
BNL, USA                      -
FNAL, USA                     -
TRIUMF, Canada                50
NIKHEF/SARA, NL               -
Nordic Data Grid Facility     -
Total                         1,600

Design target is twice these rates, to enable catch-up after problems.

SC4 T0-T1: Results
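The 2x design target follows from a simple catch-up argument: if the spare capacity equals the nominal rate, an outage can be recovered in roughly the time it lasted. A sketch using the slide's 1,600 MB/s aggregate (the 12-hour outage is an illustrative number, not from the slide):

```python
# Catch-up after an outage when the design rate is 2x nominal.
nominal_mb_s = 1_600                    # aggregate CERN -> Tier-1 rate (slide)
design_mb_s = 2 * nominal_mb_s          # design target: twice nominal

outage_h = 12                           # illustrative outage length
backlog_mb = nominal_mb_s * outage_h * 3600    # data queued while down
spare_mb_s = design_mb_s - nominal_mb_s        # headroom left for catch-up
catchup_h = backlog_mb / spare_mb_s / 3600
print(f"{outage_h} h outage -> {catchup_h:.0f} h of catch-up")  # prints "12 h outage -> 12 h of catch-up"
```

With less than 2x headroom, catch-up would take longer than the outage itself, which is why the target is set at twice the nominal rates.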
Target: sustained disk-to-disk transfers at 1.6 GB/s out of CERN, at full nominal rates, for ~10 days
Result: just managed this rate, on Easter Sunday only (1 day out of 10)

ATLAS SC4 tests
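For scale, sustaining that target over the full window is a petabyte-class exercise (a quick sketch):

```python
# Volume implied by the SC4 disk-to-disk target: 1.6 GB/s for 10 days.
rate_gb_s = 1.6
days = 10
total_pb = rate_gb_s * 86_400 * days / 1e6    # 86,400 s/day; 1 PB = 1e6 GB
print(f"{total_pb:.2f} PB over {days} days")  # prints "1.38 PB over 10 days"
```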
From last week: initial ATLAS SC4 work. Rates to ATLAS Tier-1 sites close to target rates.
[Charts: ATLAS transfers; background transfers]

Impact of the LHC Computing Grid in Europe
LCG has been the driving force for the European multi-science Grid EGEE (Enabling Grids for E-sciencE)
EGEE is now a global effort, and the largest Grid infrastructure worldwide
Co-funded by the European Commission (~€130 M over 4 years)
EGEE already used for >20 applications, including bio-informatics, education & training, and medical imaging

The EGEE Project
Infrastructure operation: currently includes >200 sites across 39 countries; continuous monitoring of grid services & automated site configuration/management
Middleware: production-quality middleware distributed under a business-friendly open source licence
User support: managed process from first contact through to production usage; training; documentation; expertise in grid-enabling applications; online helpdesk; networking events (User Forum, conferences, etc.)
Interoperability: expanding interoperability with related infrastructures

EGEE-II Expertise & Resources
32 countries, 13 federations
Major national Grid projects in Europe, USA and Asia
+ 27 countries through related projects: BalticGrid, EELA, EUChinaGrid, EUIndiaGrid, EUMedGrid, SEE-GRID

Example: EGEE Attacks Avian Flu
EGEE was used to analyse 300,000 potential drug compounds against the bird flu virus, H5N1.
2,000 computers at 60 computer centres in Europe, Russia, Taiwan and Israel ran for four weeks in April: the equivalent of 100 years on a single computer.
Potential drug compounds are now being identified and ranked.
Neuraminidase, one of the two major surface proteins of influenza viruses, facilitates the release of virions from infected cells. Image courtesy Ying-Ta Wu, Academia Sinica.

Example: Geocluster industrial application
The first industrial application successfully running on EGEE
Developed by the Compagnie Générale de Géophysique (CGG) in France, doing geophysical simulations for the oil, gas, mining and environmental industries
EGEE technology helps CGG to federate its computing resources around the globe

Grids in Europe: great investment in developing Grid technology
Sample of national Grid projects:
Austrian Grid Initiative; Belgium: BEGrid; DutchGrid; France: e-Toile, ACI Grid; Germany: D-Grid, Unicore; Greece: HellasGrid; Grid Ireland; Italy: INFNGrid, GRID.IT; NorduGrid; UK e-Science: National Grid Service, OMII, GridPP
EGEE provides a framework for national, regional and thematic Grids
Average of €180 M per year since 2002 (national + EC)

European e-Infrastructure Coordination
Evolution of European e-Infrastructure coordination: EDG → EGEE → EGEE-II → EGEE-III; from testbeds, to utility service, to routine usage.