HENP DATA GRIDS and STARTAP: Worldwide Analysis at Regional Centers. Harvey B. Newman (Caltech), HPIIS Review, San Diego, October 25, 2000

Slide 1: HENP Data Grids and STARTAP: Worldwide Analysis at Regional Centers
Harvey B. Newman (Caltech), HPIIS Review, San Diego, October 25, 2000
http://l3www.cern.ch/~newman/hpiis2000.ppt

Slide 2: Next Generation Experiments: Physics and Technical Goals
- The extraction of small or subtle new discovery signals from large and potentially overwhelming backgrounds, or precision analysis of large samples
- Providing rapid access to event samples and subsets from massive data stores: from ~300 Terabytes in 2001 to Petabytes by ~2003, ~10 Petabytes by 2006, and ~100 Petabytes by ~2010
- Providing analyzed results with rapid turnaround, by coordinating and managing the LIMITED computing, data handling and network resources effectively
- Enabling rapid access to the data and the collaboration, across an ensemble of networks of varying capability, using heterogeneous resources

Slide 3: The Large Hadron Collider (2005-)
- A next-generation particle collider, and the largest superconducting installation in the world
- A bunch-bunch collision will take place every 25 nanoseconds, each generating ~20 interactions; but only one in a trillion may lead to a major physics discovery
- Real-time data filtering: Petabytes per second down to Gigabytes per second
- Accumulated data of many Petabytes per year
- Large data samples explored and analyzed by thousands of geographically dispersed scientists, in hundreds of teams

Slide 4: Computing Challenges: LHC Example
- Geographical dispersion of people and resources: 1800 physicists, 150 institutes, 34 countries
- Complexity: the detector and the LHC environment
- Scale: tens of Petabytes per year of data
- Major challenges associated with: communication and collaboration at a distance; network-distributed computing and data resources; remote software development and physics analysis
- R&D: new forms of distributed systems: Data Grids

Slide 5: Four LHC Experiments: The Petabyte to Exabyte Challenge
- ATLAS, CMS, ALICE, LHCb: Higgs and new particles; quark-gluon plasma; CP violation
- Data written to tape: ~25 Petabytes/year and up; 0.25 Petaflops and up
- Total for the LHC experiments: 0.1 to 1 Exabyte (1 EB = 10^18 Bytes) by ~2010 (~2015?)
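As a quick check on the storage figures in Slides 2 and 5, the sketch below converts the quoted annual tape volume into a sustained writing rate and relates it to the Exabyte-scale totals. Only the numbers on the slides are used, plus the assumption of year-round running at a flat ~25 PB/year.

```python
# Back-of-envelope check of the Slide 2/5 data-volume figures.
# Assumption (not from the slides): year-round running at a flat ~25 PB/year.

PB = 1e15   # bytes
EB = 1e18   # bytes
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # ~3.156e7 s

annual_tape_volume = 25 * PB            # "~25 Petabytes/Year and UP" (Slide 5)

# Sustained tape-writing bandwidth implied by a flat 25 PB/year
avg_bandwidth = annual_tape_volume / SECONDS_PER_YEAR
print(f"25 PB/year -> {avg_bandwidth / 1e9:.2f} GB/s sustained average")

# Years of flat running needed to reach the 0.1-1 EB totals quoted on Slide 5.
# Reaching 1 EB at a flat rate would take decades, which is why the slide
# says "and UP": the annual volume itself is expected to grow.
for total in (0.1 * EB, 1.0 * EB):
    years = total / annual_tape_volume
    print(f"{total / EB:4.1f} EB total -> {years:.0f} years at a flat 25 PB/year")
```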
Slide 6: From Physics to Raw Data (LEP)
[Diagram: e+e- -> Z0 -> f f-bar. Basic physics -> fragmentation and decay -> interaction with detector material (multiple scattering, secondary interactions) -> detector response (noise, pile-up, cross-talk, inefficiency, ambiguity, resolution, response function, alignment, temperature) -> raw data in bytes: read-out addresses, ADC and TDC values, bit patterns]

Slide 7: The Compact Muon Solenoid (CMS)
- Trackers: silicon microstrips (230 sq m), pixels (80M channels)
- Calorimeters: ECAL scintillating PbWO4 crystals; HCAL plastic scintillator / copper sandwich
- Muon barrel: drift tube chambers (DT), resistive plate chambers (RPC)
- Muon endcaps: cathode strip chambers (CSC), resistive plate chambers (RPC)
- Superconducting coil and iron yoke
- Total weight: 12,500 t; overall diameter: 15 m; overall length: 21.6 m; magnetic field: 4 Tesla

Slide 8: From Raw Data to Physics (LEP)
[Diagram: the inverse chain. Raw data -> apply calibration and alignment to convert to physics quantities (detector response) -> pattern recognition and particle identification -> physics analysis -> results; supported by reconstruction, simulation (Monte Carlo) and analysis]

Slide 9: Real-time Filtering and Data Acquisition (figures are for one experiment)
- Data fragments from on-detector digitizers pass through a switch into a computer farm (filtering: 35K SI95), which sends raw and summary data over a high-speed network to tape and disk servers
- Input: 1-100 GB/s; recording: 100-1000 MB/s
- Over 1 PetaByte/year of raw data; 1-200 TB/year of summary data

Slide 10: Higgs Search, LEPC, September 2000 [plot]

Slide 11: LHC: Higgs Decay into 4 Muons (tracker only); 1000X the LEP Data Rate
- 10^9 events/sec, selectivity: 1 in 10^13 (like finding 1 person in a thousand world populations)

Slide 12: On-line Filter System
- Large variety of triggers and thresholds: select physics à la carte
- Multi-level trigger: filter out less interesting events, keep highly selected events; online reduction of 10^7
- Result: Petabytes of binary compact data per year
- Level 1 (special hardware): 40 MHz input (equivalent to 1000 TB/sec) reduced to 75 kHz (75 GB/sec, fully digitised)
- Level 2 (processors): 75 kHz reduced to 5 kHz (5 GB/sec)
- Level 3 (farm of commodity CPUs): 5 kHz reduced to 100 Hz (100 MB/sec), passed to data recording and offline analysis
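The trigger chain in Slide 12 can be summarized numerically. The sketch below lists the event and data rates quoted per level and derives the per-level rejection factors and the overall ~10^7 online data-rate reduction; only rates printed on the slide are used.

```python
# Rate-reduction arithmetic for the Slide 12 multi-level trigger chain.
# All rates are taken straight from the slide.

stages = [
    # (name,                            event rate [Hz], data rate [bytes/s])
    ("Collision rate (Level-1 input)",  40e6,  1000e12),  # 40 MHz, ~1000 TB/s equivalent
    ("Level 1 (special hardware)",      75e3,  75e9),     # 75 kHz, 75 GB/s fully digitised
    ("Level 2 (processors)",            5e3,   5e9),      # 5 kHz, 5 GB/s
    ("Level 3 (commodity CPU farm)",    100.0, 100e6),    # 100 Hz, 100 MB/s to recording
]

prev_rate = None
for name, ev_rate, data_rate in stages:
    rejection = "" if prev_rate is None else f"  (x{prev_rate / ev_rate:,.0f} rejection)"
    print(f"{name:34s} {ev_rate:12,.0f} Hz  {data_rate / 1e9:12,.3f} GB/s{rejection}")
    prev_rate = ev_rate

overall = stages[0][2] / stages[-1][2]
print(f"\nOverall online data-rate reduction: ~{overall:.0e} (the slide's 'online reduction 10^7')")
```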
Slide 13: LHC Vision: Data Grid Hierarchy (see the transfer-time sketch after Slide 14)
- Tier 0+1 (CERN): the online system takes ~PByte/sec from the experiment and feeds the offline farm / CERN Computer Centre (>20 TIPS) at ~100 MBytes/sec
- Tier 1: national centres (France Centre, FNAL Center, Italy Center, UK Center), linked to CERN at ~2.5 Gbits/sec
- Tier 2: regional centres, linked to Tier 1 at ~0.6-2.5 Gbits/sec (~622 Mbits/sec)
- Tier 3: institutes (~0.25 TIPS) holding physics data caches; Tier 4: workstations, connected at 100-1000 Mbits/sec
- Physicists work on analysis channels; each institute has ~10 physicists working on one or more channels

Slide 14: Why Worldwide Computing? Regional Center Concept Advantages
- Managed, fair-shared access for physicists everywhere
- Maximize total funding resources while meeting the total computing and data handling needs
- Balance between proximity of datasets to appropriate resources, and proximity to the users
- Tier-N model: efficient use of the network and higher throughput; per flow: local > regional > national > international
- Utilizing all intellectual resources, in several time zones: CERN, national labs, universities, remote sites; involving physicists and students at their home institutions
- Greater flexibility to pursue different physics interests, priorities, and resource allocation strategies by region and/or by common interests (physics topics, subdetectors, ...)
- Manage the system's complexity: partitioning facility tasks, to manage and focus resources
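To make the Slide 13 link speeds concrete, the sketch below estimates ideal transfer times over the quoted Tier-to-Tier links. The 1 TB sample size is hypothetical, chosen only for illustration, and protocol overhead is ignored.

```python
# Transfer-time sketch for the Slide 13 Tier-N link speeds.
# Link bandwidths are the ones quoted on the slide; the 1 TB sample size is a
# hypothetical analysis dataset used only to make the comparison concrete.

SAMPLE_BYTES = 1e12  # 1 TB, hypothetical

links_mbps = {
    "CERN Tier 0/1 -> Tier 1 national centre": 2500,  # ~2.5 Gbits/sec
    "Tier 1 -> Tier 2 centre":                  622,   # ~622 Mbits/sec
    "Tier 2/3 -> institute desktop":            100,   # low end of 100-1000 Mbits/sec
}

for link, mbps in links_mbps.items():
    seconds = SAMPLE_BYTES * 8 / (mbps * 1e6)   # ideal wire time, no overhead
    print(f"{link:42s} {mbps:5d} Mbit/s -> {seconds / 3600:5.1f} h per TB")
```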
Slide 15: Grid Services Architecture [*]
- Applications: a rich set of HEP data-analysis related applications
- Application Toolkits: remote visualization, remote computation, remote data, remote sensors and remote collaboration toolkits
- Grid Services: protocols, authentication, policy, resource management, instrumentation, discovery, etc.
- Grid Fabric: data stores, networks, computers, display devices, ...; associated local services
[*] Adapted from Ian Foster

Slide 16: SDSS Data Grid (in GriPhyN): A Shared Vision
Three main functions:
- Raw data processing on a Grid (FNAL): rapid turnaround with TBs of data; accessible storage of all image data
- Fast science analysis environment (JHU): combined data access and analysis of calibrated data; distributed I/O layer and processing layer, shared by the whole collaboration
- Public data access: SDSS data browsing for astronomers and students; complex query engine for the public

Slide 17: LIGO Data Grid Vision
Principal areas of GriPhyN applicability:
- Main data processing (Caltech/CACR): enable computationally limited searches for periodic sources; access to the LIGO deep archive; access to the observatories
- Science analysis environment for the LSC (LIGO Scientific Collaboration): Tier2 centers as a shared LSC resource; exploratory algorithm and astrophysics research with LIGO reduced data sets; distributed I/O layer and processing layer built on existing APIs; data mining of LIGO (event) metadatabases; LIGO data browsing for LSC members and outreach
- Sites: Hanford, Livingston, Caltech, MIT, Tier1 and LSC Tier2 centers, interconnected over Internet2/Abilene with OC3 to OC48 links

Slide 18: Computer Farm at CERN (2005) (see the per-unit arithmetic after Slide 19)
- 0.5 M SPECint95 on >5K processors; 0.6 PByte of disk on >5K disks; plus 2X more capacity outside CERN
- Thousands of CPU boxes, thousands of disks, hundreds of tape drives; real-time detector data flowing in
- [Diagram: LAN-WAN routers, farm network and storage network, with link data rates marked in Gbps]

Slide 19: Tier1 Regional Center Architecture (I. Gaines, FNAL)
- Network from CERN; network from Tier 2 centers
- Tapes; tape mass storage and disk servers; database servers
- Physics software development; R&D systems and testbeds
- Info servers, code servers, web servers, telepresence servers
- Training, consulting, help desk
- Production recons ...
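Returning to the Slide 18 farm figures, a minimal per-unit division gives a feel for the individual components. It assumes the ">5K" counts can be read as roughly 5,000 units, although the slide only gives them as lower bounds.

```python
# Per-unit arithmetic for the Slide 18 CERN farm figures (2005 estimate).
# Assumption: ">5K" processors and disks are taken literally as 5,000 units;
# the slide only quotes lower bounds.

total_cpu_si95   = 0.5e6    # 0.5 M SPECint95
total_disk_bytes = 0.6e15   # 0.6 PByte of disk
n_processors     = 5000     # ">5K processors" (lower bound)
n_disks          = 5000     # ">5K disks" (lower bound)

print(f"~{total_cpu_si95 / n_processors:.0f} SPECint95 per processor box")
print(f"~{total_disk_bytes / n_disks / 1e9:.0f} GB per disk")
# With "+2X more outside" CERN, worldwide totals would be roughly triple these aggregates.
```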