Grid Team
David Martin + ScotGRID
Alan Flavell + CDF
Tony Doyle - System Management
Paul Millar + System
Will Bell + CDF
David Cameron - 1st Year PhD
Ian Skillicorn + ATLAS
Caitriona Nicholson - 0th Year PhD
Gavin McCance - Grid Data Management
Stan Thompson + ATLAS
Alan Flavell + System
Rick St Denis - CDF
Stan Thompson + CDF
John Kennedy
Sasha Tcheplakov, Ian Skillicorn + Grid
David Saxon + PPE Group
Tony Doyle - ATLAS
Andrew Pickford
Paul Soler - LHCb
Ken Smith - Detectors
Tony Doyle - Grid Development
David Saxon - PPE Group Leader
Hardware | Software
System | Middleware | Applications
See David.. # working
LHC Computing at a Glance
• The investment in LHC computing will be massive
  – LHC Review estimated 240 MCHF
  – 80 MCHF/year afterwards
• These facilities will be distributed
  – Political as well as sociological and practical reasons
Europe: 267 institutes, 4603 users
Elsewhere: 208 institutes, 1632 users
Rare Phenomena - Huge Background
The Higgs signal sits 9 orders of magnitude below the rate of all interactions!
CPU Requirements
• Complex events
  – Large number of signals
  – "good" signals are buried in background
• Many events
  – 10^9 events/experiment/year
  – 1-25 MB/event raw data
  – several passes required
Need world-wide: 7×10^6 SPECint95 (3×10^8 MIPS)
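A back-of-envelope check of the figures above (taking the upper raw-data estimate, and the PC (1999) = ~15 SpecInt95 figure quoted later in the talk):

```python
# Back-of-envelope check of the LHC computing requirements above.
events_per_experiment_per_year = 1e9
mb_per_event_max = 25            # raw data, upper estimate
raw_data_pb = events_per_experiment_per_year * mb_per_event_max / 1e9  # MB -> PB
print(f"raw data per experiment: up to {raw_data_pb:.0f} PB/year")

total_specint95 = 7e6            # world-wide CPU requirement
pc_1999_specint95 = 15           # one 1999-era PC
print(f"equivalent 1999 PCs: ~{total_specint95 / pc_1999_specint95:,.0f}")
```

So a single pass over one experiment's raw data is already a multi-petabyte problem, which is why the processing has to be distributed.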
LHC Computing Challenge
• One bunch crossing every 25 ns; 100 triggers per second; each event is ~1 MByte
• Detector → Online System: ~PBytes/sec
• Online System → Offline Farm (~20 TIPS): ~100 MBytes/sec
• CERN Computer Centre (Tier 0): >20 TIPS
• Tier 0 → Tier 1 Regional Centres (RAL, US, French, Italian): ~Gbits/sec or air freight
• Tier 1 → Tier 2 Centres (~1 TIPS each, e.g. ScotGRID++ ~1 TIPS): ~Gbits/sec
• Tier 2 → Tier 3 Institutes (~0.25 TIPS): 100 - 1000 Mbits/sec
• Tier 4: Workstations
• 1 TIPS = 25,000 SpecInt95; PC (1999) = ~15 SpecInt95
• Physicists work on analysis "channels"; Glasgow has ~10 physicists working on one or more channels, and data for these channels is cached by the Glasgow server (physics data cache)
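The trigger rate and event size quoted above fix the bandwidth into the offline farm, and (assuming a canonical ~10^7 s accelerator year) reproduce the 10^9 events/experiment/year figure from the CPU-requirements slide:

```python
# Data rate implied by the trigger numbers above.
triggers_per_sec = 100
mb_per_event = 1                 # each event is ~1 MByte
rate = triggers_per_sec * mb_per_event
print(rate, "MB/s into the offline farm")

seconds_per_year = 1e7           # assumption: typical accelerator "live" year
events_per_year = triggers_per_sec * seconds_per_year
print(f"{events_per_year:.0e} events/year")
```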
Starting Point
CPU Intensive Applications
Numerically intensive simulations:
  – Minimal input and output data
• ATLAS Monte Carlo (gg → H → bb): 182 sec/3.5 MB event on a 1000 MHz Linux box

Standalone physics applications:
1. Simulation of neutron/photon/electron interactions for 3D detector design
2. NLO QCD physics simulation

Compiler Tests:
Compiler        Speed (MFlops)
Fortran (g77)   27
C (gcc)         43
Java (jdk)      41
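Figures like the MFlops column above come from timing a tight floating-point loop. A minimal sketch of such a timing loop (in Python rather than the compiled languages benchmarked, so interpreter overhead dominates the absolute number):

```python
import time

def mflops(n=1_000_000):
    """Time n multiply-add iterations and return an MFlops estimate (rough sketch)."""
    x = 1.0000001
    acc = 0.0
    t0 = time.perf_counter()
    for _ in range(n):
        acc = acc * x + 1.0      # 2 floating-point ops per iteration
    dt = time.perf_counter() - t0
    return 2 * n / dt / 1e6

print(f"~{mflops():.0f} MFlops (interpreted; compiled code is far faster)")
```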
Timeline
2002-2005, by quarter (Q1-Q4 each year):
• Prototype of Hybrid Event Store (Persistency Framework)
• Hybrid Event Store available for general users
• Distributed production using grid services
• First Global Grid Service (LCG-1) available
• Distributed end-user interactive analysis
• Full Persistency Framework
• LCG-1 reliability and performance targets
• "50% prototype" (LCG-3) available
• LHC Global Grid TDR
ScotGRID: ~300 CPUs + ~50 TBytes
ScotGRID Processing nodes at Glasgow
• 59 IBM X Series 330: dual 1 GHz Pentium III, 2 GB memory
• 2 IBM X Series 340: dual 1 GHz Pentium III, 2 GB memory, dual ethernet
• 3 IBM X Series 340: dual 1 GHz Pentium III, 2 GB memory, 100 + 1000 Mbit/s ethernet
• 1 TB disk
• LTO/Ultrium Tape Library
• Cisco ethernet switches

ScotGRID Storage at Edinburgh
• IBM X Series 370: PIII Xeon, 512 MB memory, 32 x 512 MB RAM
• 70 x 73.4 GB IBM FC Hot-Swap HDD

CDF equipment at Glasgow
• 8 x 700 MHz Xeon IBM xSeries 370, 4 GB memory, 1 TB disk

Grid development test rig at Glasgow
• 4 x 233 MHz Pentium II
EDG TestBed 1 Status
Web interface showing status of (~400) servers at testbed 1 sites
GRID: extend to all experiments
Glasgow within the Grid
GridPP
EDG - UK Contributions
Architecture
• Testbed-1
• Network Monitoring
• Certificates & Security
• Storage Element
• R-GMA
• LCFG
• MDS deployment
• GridSite
• SlashGrid
• Spitfire
• Optor
• GridPP Monitor Page
(• = Glasgow element)
Applications (start-up phase)
• BaBar
• CDF + D0 (SAM)
• ATLAS/LHCb
• CMS (ALICE)
• UKQCD
£17m 3-year project funded by PPARC
CERN - LCG (start-up phase): funding for staff and hardware...
Components: CERN, DataGrid, Tier-1/A, Applications, Operations
Allocations: £3.78m, £5.67m, £3.66m, £1.99m, £1.88m (~£17m total)
http://www.gridpp.ac.uk
Overview of SAM
Fabric:
• Tape Storage Elements, Disk Storage Elements, Compute Elements, LANs and WANs, Code Repository
Catalogs:
• Resource and Services Catalog, Replica Catalog, Meta-data Catalog
Connectivity and Resource:
• Authentication and Security: GSI, SAM-specific user/group/node/station registration, bbftp 'cookie'
• CORBA, UDP; file transfer protocols: ftp, bbftp, rcp, GridFTP; mass storage system protocols, e.g. encp, hpss
Collective Services:
• Catalog protocols, Significant Event Logger, Naming Service, Database Manager, Catalog Manager
• SAM Resource Management; batch systems: LSF, FBS, PBS, Condor; Data Mover; Job Services
• Storage Manager, Job Manager, Cache Manager, Request Manager
• "Dataset Editor", "File Storage Server", "Project Master", "Station Master", "Stager", "Optimiser"
Client Applications:
• Web, Python codes, Java codes, command line, D0 Framework C++ codes
• Request Formulator and Planner
Notes: a name in "quotes" is the SAM-given software component name; marked components will be replaced or enhanced using PPDG and Grid tools.
Spitfire - Security Mechanism
An HTTP + SSL request with a client certificate arrives at the Servlet Container (SSLServletSocketFactory with a TrustManager):
1. Is the certificate signed by a trusted CA (Trusted CAs store)? If not, reject.
2. Has the certificate been revoked (Revoked Certs repository)? If so, reject.
3. Security Servlet: does the user specify a role? If not, find the default role.
4. Authorization Module: check the role against the Role repository; if ok, map the role to a connection id (Connection mappings).
5. The Translator Servlet requests a connection ID from the Connection Pool and executes the request against the RDBMS.
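The decision chain above can be sketched as a single check function. This is an illustrative model only; the names, example CAs, and role table are hypothetical, not the actual Spitfire (Java servlet) API:

```python
# Hypothetical sketch of the Spitfire security flow: trusted CA -> not revoked
# -> role (or default) -> connection id. All names here are illustrative.
TRUSTED_CAS = {"Example CA"}                        # assumed trusted-CA store
REVOKED = set()                                     # revoked-certificate serials
ROLES = {"reader": "conn_ro", "admin": "conn_rw"}   # role -> connection id
DEFAULT_ROLE = "reader"

def authorize(cert_ca, cert_serial, requested_role=None):
    """Return a connection id for an authorized request, else raise."""
    if cert_ca not in TRUSTED_CAS:
        raise PermissionError("certificate not signed by a trusted CA")
    if cert_serial in REVOKED:
        raise PermissionError("certificate has been revoked")
    role = requested_role or DEFAULT_ROLE           # find default if unspecified
    if role not in ROLES:
        raise PermissionError(f"role {role!r} not permitted")
    return ROLES[role]                              # map role to connection id

print(authorize("Example CA", "1234"))              # -> conn_ro (default role)
```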
Optor – replica optimiser simulation
• Simulate prototype Grid
• Input site policies and experiment data files
• Introduce replication algorithm:
  – Files are always replicated to the local storage.
  – If necessary, the oldest files are deleted.
• Even a basic replication algorithm significantly reduces network traffic and program running times.
• New economics-based algorithms under investigation.
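The basic policy described above (always replicate locally, evict oldest first) is essentially a FIFO cache. A minimal sketch, with hypothetical names and sizes:

```python
# Minimal sketch of the basic Optor-style replication policy described above:
# every requested file is replicated to local storage; oldest files are
# deleted when space runs out. Names/capacities are illustrative.
from collections import OrderedDict

class LocalStore:
    def __init__(self, capacity_gb):
        self.capacity = capacity_gb
        self.files = OrderedDict()               # filename -> size, oldest first

    def used(self):
        return sum(self.files.values())

    def replicate(self, name, size_gb):
        if name in self.files:                   # already cached locally
            return
        while self.used() + size_gb > self.capacity and self.files:
            self.files.popitem(last=False)       # delete oldest file
        self.files[name] = size_gb

store = LocalStore(capacity_gb=2)
for f in ["a", "b", "c"]:
    store.replicate(f, 1)                        # "a" is evicted to make room
print(list(store.files))                         # -> ['b', 'c']
```

The economics-based algorithms mentioned above replace this age-based eviction with a valuation of each replica.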
Prototypes
• Tools: Java Analysis Studio over TCP/IP
• Instantaneous CPU usage
• Scalable architecture
• Individual node info
• Real world... vs. simulated world...
Glasgow Investment in Computing Infrastructure
• Long tradition
• Significant Dept. Investment
• £100,000 refurbishment (just completed)
• Long term commitment (LHC era ~ 15 years)
• Strong System Management Team – underpinning role
• New Grid Data Management Group – fundamental to Grid Development
• ATLAS/CDF/LHCb software
• Alliances with Glasgow Computing Science, Edinburgh, IBM.
Summary (to be updated...)
• Grids are (already) becoming a reality
• Mutual interest: the ScotGRID example
• Glasgow emphasis on:
  – DataGrid core development
  – Grid Data Management
  – CERN + UK lead
  – Multidisciplinary approach
  – University + regional basis
  – Applications: ATLAS, CDF, LHCb
  – Large distributed databases, a common problem/challenge: CDF → LHC; Genes → Proteins
Detector for ALICE experiment
Detector for LHCb experiment
ScotGRID