GridPP: UK Computing for Particle Physics Tony Doyle

R-ECFA Meeting, 11 May 2007. Tony Doyle, University of Glasgow

• Context
• A Brief History Of GridPP
• UK Computing Centres
• The Grid & its Challenges
• Resource Accounting
• Performance Monitoring
• Outlook
• Conclusions

The Icemen Cometh

Outline

• To create a UK Particle Physics Grid and the computing technologies required for the Large Hadron Collider (LHC) at CERN

• To place the UK in a leadership position in the international development of an EU Grid infrastructure

Context (2000)

• 1999
    • Grid: Blueprint for a New Computing Infrastructure by Ian Foster and Carl Kesselman published.
• February 2000
    • A Joint Infrastructure Fund bid is submitted for £6.2m to fund a prototype Tier-1 centre at RAL, for the EU-funded DataGrid project.
    • At the time of the JIF bid the LHC was expected to produce 4PB of data a year for 10 years. By 2005, the expected figures had risen to 15PB a year for 15 years.
    • RAL was chosen as the location of the Tier-1 centre because it already hosts the UK BaBar computing centre.
• May 2000
    • Last R-ECFA meeting in the UK.
• October 2000
    • PPARC signs up to the EU DataGrid project, contributing 20 people and a Tier-1 centre.
• November 2000
    • Trade and Industry Secretary Stephen Byers announces £98m for e-Science with Spending Review 2000.
    • This includes £26m for PPARC to develop HEP and astronomy Grid Projects.
• December 2000
    • GridPP plan created at a meeting at RAL. Initially the £26m was to help fund UK posts to coordinate the UK arm of LCG, as part of that organisation.
• April 2001
    • A Shadow Project Management Board, referred to as "DataGrid-UK", is established. GridPP first proposal submitted.
• 30/31st May 2001
    • PPARC's e-Science Committee meets to consider the proposal and approves the GridPP project, allocating £17m.
• 1st September 2001
    • GridPP officially starts, with funding for 3 years.
• January 2002
    • DataGrid releases first production version of the testbed middleware.
• February 2002
    • First international file transfers using X.509 digital certificates.
• 1st March 2002
    • RAL involved in a test of DataGrid by creating a small 5 site testbed Grid, with CERN, IN2P3-Lyon, CNAF-Bologna and NIKHEF.
• 11th March 2002
    • LHC Computing Grid Project launched.
• 23rd March 2002
    • First Prototype Tier1/A Hardware delivered to RAL, consisting of 156 dual CPU PCs with 30GB of storage each.
• 25th April 2002
    • UK National e-Science Centre (NeSC) opened in Edinburgh by Gordon Brown.
• June 2002
    • ScotGrid, one of the four Tier-2s in GridPP, goes into production.
• August 2002
    • GridPP makes its first visit to the All Hands e-Science meeting.

A Brief History Of GridPP

• December 2002
    • PPARC receive a further £31.6m for their e-Science programme.
    • The UK plays a significant role in the LHCb Data Challenge.
• February 2003
    • PPARC put out a call for proposals for the second phase of its e-Science programme.
• June 2003
    • Proposal for GridPP2 submitted.
• August 2003
    • UKHEP Certificate Authority is replaced by the UK e-Science Certificate Authority. This issues the digital certificates needed to use the Grid.
• September 2003
    • LHC Computing Grid is launched.
• December 2003
    • GridSite, initially a tool used by the GridPP website, gets its first production release.
    • GridPP2 proposal accepted by PPARC, ensuring the project will run until Sept 2007 with £16.9m.
• April 2004
    • EU DataGrid project ends and is replaced by EGEE (Enabling Grids for E-science in Europe).
• September 2004
    • GridPP2 is launched.
    • GridPP website wins award at All Hands Meeting for Best e-Science Project Website.
• October 2004
    • CERN's 50th anniversary.
• January 2005
    • BaBar UK demonstrates the first successful integration of the Grid into the official BaBar Monte Carlo production system.
• March 2005
    • LCG passes 100 sites worldwide.
• May 2005
    • GridPP has grown to 2,740 CPUs and 67 TB of storage.
• July 2005
    • GridPP members use the UKLight high speed connection between Lancaster and RAL for the first time, moving data 50 times faster than a normal ADSL line.
• September 2005
    • First WISDOM biomedical data challenge for drug discovery is run to look for drugs against malaria.
    • GridSiteWiki software released, which allows users with the correct digital certificate to edit wiki pages.
    • New version of Real Time Monitor launched at e-Science All Hands Meeting.
• October 2005
    • International Grid Trust Federation (IGTF) established to regulate the digital certificates used on the Grid worldwide.
• November 2005
    • GridPP's storage capacity reaches 100TB.
    • 2,000,000 jobs were run on the EGEE Grid in 2005.
• January 2006
    • LCG reaches data speeds of 1GB/s during testing of the infrastructure.
• March 2006
    • The PEGASUS project is announced, a social science study of GridPP by researchers from the London School of Economics.
    • PPARC signs the LCG Memorandum of Understanding with CERN, which commits the UK Tier-1 at RAL and the four UK Tier-2s to provide services and resources to the LCG.
• April 2006
    • EGEE enters 2nd phase.
    • PPARC looks for proposals for the continuation of the UK's Grid computing for Particle Physicists after September 2007.
• May 2006
    • Second WISDOM biomedical data challenge for drug discovery is run to look for drugs against avian flu.
• July 2006
    • Proposal for GridPP3 submitted; this would extend the project beyond the current end date of September 2007.
• August 2006
    • GridPP has 3,240 CPUs and 246.25TB.
• December 2006
    • GridPP accounts for 27% of the 2006 total EGEE CPU resources.
• March 2007
    • PPARC announces £30m for GridPP extension.

A Brief History Of GridPP

• 2006 was the second full year for the UK Production Grid
• More than 5,000 CPUs and more than 1/2 Petabyte of disk storage
• The UK is the largest CPU provider on the EGEE Grid, with total CPU used of 15 GSI2k-hours in 2006
• The GridPP2 project has met 69% of its original targets with 92% of the metrics within specification
• The initial LCG Grid Service is now underway and will run for the first 6 months of 2007
• The aim is to continue to improve reliability and performance ready for startup of the full Grid service on 1st July 2007
• The GridPP2 project has been extended by 7 months to April 2008
• The GridPP3 proposal was recently accepted by PPARC (£30m) to extend the project to March 2011
• We anticipate a challenging period ahead

Context (2007)

Real Time Monitor

• High quality data services
• National and International Role
• UK focus for International Grid development

• 1500 CPUs
• 750 TB Disk
• 530 TB Tape (Capacity 1PB)

Grid Operations Centre

Tier-1 Centre at RAL

• ScotGrid: Durham, Edinburgh, Glasgow
• NorthGrid: Daresbury, Lancaster, Liverpool, Manchester, Sheffield
• SouthGrid: Birmingham, Bristol, Cambridge, Oxford, RAL PPD
• London: Brunel, Imperial, QMUL, RHUL, UCL

Mostly funded by HEFCE (SFC)

UK Tier-2 Centres

19 UK Universities + STFC

• GridPP1 2001-2004 "From Web to Grid" [£16m+]
• GridPP2+ 2004-2008 "From Prototype to Production" [£17m+]
• GridPP3 2008-2011 "From Production to Exploitation" [£30m]

GridPP: Who are we?

GridPP Middleware is..

Security

Network Monitoring

Information Services

Grid Data Management

Storage Interfaces

Workload Management

Middleware

Grid Challenges

Data Management, Security and Sharing

1. Software process
2. Software efficiency
3. Deployment planning
4. Link centres
5. Share data
6. Manage data
7. Install software
8. Analyse data
9. Accounting
10. Policies

Aim: by 2008 (full year’s data taking)

- CPU ~100MSI2k (100,000 CPUs)
- Storage ~80PB
- Involving >100 institutes worldwide
- Build on complex middleware being developed in advanced Grid technology projects, both in Europe (gLite) and in the USA (VDT)

1. Prototype went live in September 2003 in 12 countries
2. Extensively tested by the LHC experiments in September 2004
3. February 2006: 25,547 CPUs, 4398 TB storage

Status in May 2007 (last night): 177 sites, 29,266 CPUs, 13,815 TB storage

Monitoring via Grid Operations Centre

Grid Status

Resources

Accumulated EGEE CPU Usage: 104,126,019 kSI2k-hours, or >100 GSI2k-hours (!)

Via APEL accounting

UKI: 30,644,586 kSI2k-hours

http://www3.egee.cesga.es/gridsite/accounting/CESGA/tree_egee.php
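The two accounting totals quoted on this slide can be cross-checked with a short calculation. This is a minimal sketch, not part of the original slides: the constant and function names are illustrative, and it assumes the usual SI prefixes (k = 10^3, G = 10^9).

```python
# Figures quoted on the slide (kSI2k-hours, via APEL accounting).
EGEE_TOTAL_KSI2K_HOURS = 104_126_019
UKI_KSI2K_HOURS = 30_644_586


def ksi2k_to_gsi2k(value_ksi2k: float) -> float:
    """Convert kSI2k-hours to GSI2k-hours (k = 10^3, G = 10^9)."""
    return value_ksi2k / 1e6


egee_gsi2k_hours = ksi2k_to_gsi2k(EGEE_TOTAL_KSI2K_HOURS)
uki_share = UKI_KSI2K_HOURS / EGEE_TOTAL_KSI2K_HOURS

print(f"EGEE total: {egee_gsi2k_hours:.1f} GSI2k-hours")
print(f"UKI share of accumulated usage: {uki_share:.1%}")
```

Under these assumptions the accumulated total comes out just above 100 GSI2k-hours, consistent with the slide, and the UKI region accounts for a little under 30% of it.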

Figure: UK total job slots. Published job slots (0 to 9000) vs. date, 2004 to 2007.

Figure: % job slots used (0% to 120%) vs. date, 2004 to 2007. Series: % EGEE slots used; % UK slots used.

Job Slots and Use

Resource Accounting

• CPU resources at ~required levels (just in time delivery)
• Grid-accessible disk accounting being improved

Figure: CPU vs. time, with labels "100,000 3GHz CPUs" and "LHC start-up".

Grid Operations Centre

Efficiency (measured by UK Tier-1 for all VOs)

• ~90% CPU efficiency due to i/o bottlenecks is OK
• Concern that this is currently ~75%
• Each experiment needs to work to improve their system/deployment practice, anticipating e.g. hanging gridftp connections during batch work

Figure: efficiency plot, with the "target" level marked.
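As an illustration of the efficiency figures above, the sketch below computes an aggregate CPU efficiency as total CPU time divided by total wall-clock time. That definition is an assumption (the slide does not spell out how the Tier-1 measures it), and the `JobRecord` type and sample jobs are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    """Hypothetical accounting record for one batch job."""
    cpu_seconds: float   # CPU time actually consumed
    wall_seconds: float  # wall-clock time the job held a slot


def cpu_efficiency(jobs: list[JobRecord]) -> float:
    """Aggregate efficiency: total CPU time over total wall-clock time."""
    total_wall = sum(j.wall_seconds for j in jobs)
    if total_wall == 0:
        return 0.0
    return sum(j.cpu_seconds for j in jobs) / total_wall


TARGET = 0.90  # the "~90%" level quoted on the slide

# Made-up sample: one job spends most of its slot waiting on a stalled transfer.
jobs = [
    JobRecord(cpu_seconds=3400, wall_seconds=3600),
    JobRecord(cpu_seconds=600, wall_seconds=3600),
    JobRecord(cpu_seconds=3300, wall_seconds=3600),
]
eff = cpu_efficiency(jobs)
print(f"efficiency = {eff:.1%}, meets target: {eff >= TARGET}")
```

A single job that idles on a hanging connection is enough to pull the aggregate well below the target, which is why the slide asks experiments to anticipate such cases.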

2006 CPU Usage by experiment

ALICE    1%
ATLAS   34%
CMS     16%
LHCb    35%
Other    3%
H1       1%
ZEUS     1%
D0       5%
BaBar    4%
CDF      0%

UK Resources

SAM tests (critical=subset)

BDII      Top-level BDII
sBDII     Site BDII
FTS       File Transfer Service
gCE       gLite Computing Element
LFC       Global LFC
VOMS      VOMS
CE        Computing Element
SRM       SRM
gRB       gLite Resource Broker
MyProxy   MyProxy
RB        Resource Broker
VOBOX     VO BOX
SE        Storage Element
RGMA      RGMA Registry

Global Tier-1s

UK Tier-1 (RAL)

site testing

http://gridview.cern.ch/GRIDVIEW/same_index.php

• End-user analysis tests in advance of LHC data-taking
• Example: ATLAS
• Hourly polling of all sites
• Measurably improved performance

Figure: UK-ATLAS site performance (2007). %age success (0 to 100) vs. day, 12/01/07 to 10/05/07.

http://hepwww.ph.qmul.ac.uk/~lloyd/atlas/atest.php

ATLAS site testing
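To make the link between hourly polling and a daily "%age success" figure concrete, here is a minimal sketch of one way to aggregate the results. The record layout, the function name and the sample data are all hypothetical; the actual test harness behind the plot is not described in the slides.

```python
from collections import defaultdict


def daily_success(polls):
    """Aggregate (day, passed) hourly poll results into % success per day."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for day, ok in polls:
        total[day] += 1
        passed[day] += 1 if ok else 0
    return {day: 100.0 * passed[day] / total[day] for day in sorted(total)}


# Made-up sample: 24 hourly polls per day for one site.
polls = [(1, hour >= 6) for hour in range(24)]  # first 6 polls of day 1 fail
polls += [(2, True) for _ in range(24)]         # day 2 fully successful

print(daily_success(polls))
```

Averaging the per-site daily values across all UK sites would give a single curve of the kind plotted, again assuming this is how the figure was built.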

CSA06: Successful CMS global 25% capacity test over a 6 week period in Sep/Oct 2006.

Reconstruction, event selection, calibration, alignment, analysis.

1PB of data shipped between T0 – T1 – T2s in 6 weeks.

30 analysis projects involving 70 physicists

CMS Challenge
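The CSA06 transfer volume implies a sustained average rate that is easy to estimate. The sketch below assumes a decimal petabyte (10^15 bytes) and a uniform rate over the full 6 weeks; neither is stated on the slide, and the names are illustrative.

```python
PETABYTE = 10**15                 # decimal petabyte (assumption)
SECONDS_PER_WEEK = 7 * 24 * 3600


def average_rate_mb_per_s(total_bytes: int, weeks: int) -> float:
    """Mean transfer rate in MB/s (10^6 bytes) over the whole period."""
    return total_bytes / (weeks * SECONDS_PER_WEEK) / 1e6


rate = average_rate_mb_per_s(PETABYTE, weeks=6)
print(f"average rate: {rate:.0f} MB/s")
```

This is an average only; peak rates during the challenge would have been higher.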

COUNTRY       CPU USE (%)
UK            41.1
CERN          12.1
Italy          9.6
Germany        8.1
France         7.7
Spain          6.6
Greece         3.8
Netherlands    3.1
Poland         2.4
Russia         2.0
Hungary        0.9

LHCb Production

UK consistently largest producer for LHCb

Scenario Planning – Resource Requirements [TB, kSI2k]

GridPP requested a fair share of global requirements, according to experiment requirements

Changes in the LHC schedule prompted a(nother) round of resource planning - presented to CRRB on Oct 24th

New UK resource requirements have been derived and incorporated in the scenario planning e.g. Tier-1

Tier1 CPU [kSI2k]    2008     2009     2010
ALICE               10230    18430    22930
ATLAS               18123    28423    49573
CMS                 12400    16900    36900
LHCb                 1770     4870     6740
TOTAL               42523    68623   116143

Tier1 Disk [TB]      2008     2009     2010
ALICE                5220     7940     9870
ATLAS                9939    19686    39488
CMS                  5600     8500    13700
LHCb                 1025     2759     3250
TOTAL               21784    38885    66308

Tier1 Tape [TB]      2008     2009     2010
ALICE                7030    13980    20930
ATLAS                7694    14950    28698
CMS                 13100    23500    36600
LHCb                  860     3070     5864
TOTAL               28684    55500    92092

Forward Look
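The Tier-1 CPU requirements can be checked for internal consistency and turned into year-on-year growth factors. This is a minimal sketch using the CPU figures as quoted on the slide; the helper name is illustrative.

```python
YEARS = (2008, 2009, 2010)

# Tier-1 CPU requirements in kSI2k, copied from the slide.
TIER1_CPU = {
    "ALICE": (10230, 18430, 22930),
    "ATLAS": (18123, 28423, 49573),
    "CMS": (12400, 16900, 36900),
    "LHCb": (1770, 4870, 6740),
}
QUOTED_TOTAL = (42523, 68623, 116143)


def column_totals(table):
    """Sum each year's column across experiments."""
    return tuple(sum(col) for col in zip(*table.values()))


totals = column_totals(TIER1_CPU)
growth = [totals[i + 1] / totals[i] for i in range(len(totals) - 1)]

print("totals match slide:", totals == QUOTED_TOTAL)
for year, factor in zip(YEARS[1:], growth):
    print(f"{year}: x{factor:.2f} relative to previous year")
```

The same function applies unchanged to the disk and tape figures.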

Input to Scenario Planning – Hardware Costing

• Empirical extrapolations with extrapolated (large) uncertainties
• Hardware prices have been re-examined following recent Tier-1 purchase
• CPU (Woodcrest) was cheaper than expected based on extrapolation of previous 4 years of data

Figure: CPU Costs. Ln(K£/KSI2K) vs. date, 01-Jan-02 to 20-Mar-10. Series: Past CPU Purchase; Best fit to past purchases; 29 month extrapolation; 20 month extrapolation; Future price estimates; Max; Min; GridPP3 submission; CERN.

Figure: Disk Costs. Ln(K£/TB) vs. date, 01-Jan-02 to 20-Mar-10. Series: Past Purchases; Best fit (21.7 months); 24 months; 19 months; Future price estimates; Upper Limit; Lower limit; GridPP3 Proposal; CERN.

Forward Look
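The cost plots use a logarithmic price axis, on which an exponential price decline appears as a straight line. The sketch below shows such an extrapolation. It assumes the month figures in the legends (for example 19, 21.7 and 24 months for disk) are price-halving times, which the slides do not state; the starting price and 36-month horizon are arbitrary illustrative values.

```python
import math


def extrapolate_price(price_now: float, months_ahead: float, halving_months: float) -> float:
    """Exponential price decay: the unit price halves every `halving_months`."""
    return price_now * 2.0 ** (-months_ahead / halving_months)


def log_slope_per_month(halving_months: float) -> float:
    """Slope of ln(price) against time in months for a given halving time."""
    return -math.log(2.0) / halving_months


# Hypothetical starting price of 1.0 (arbitrary units); only the ratios matter.
for label, t_half in [("19 months", 19.0), ("21.7 months", 21.7), ("24 months", 24.0)]:
    price = extrapolate_price(1.0, months_ahead=36, halving_months=t_half)
    print(f"{label}: price after 36 months = {price:.3f} of today's")
```

A shorter halving time gives a steeper line and a lower future price, so the spread between the fastest and slowest assumed times gives a rough band on the hardware budget.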

Scenario Planning

An example 70% “minimum viable level” scenario [£m]

                               GridPP3 Proposal    70% Scenario
WG  Area         Item          Cost                Frac    Cost
A   Tier-1       Staff          4.99               93%     4.62
A   Tier-1       Hardware      11.72               85%     7.20
B   Tier-2       Staff          3.29               89%     2.94
B   Tier-2       Hardware       5.12               85%     4.35
C   Support      Staff          4.50               69%     3.10
D   Operations   Staff          1.89               88%     1.66
E   Management   Staff          1.17               90%     1.06
F   Outreach     Staff          0.37               74%     0.28
G   Travel etc   Other          0.84               75%     0.63
    (subtotal)                 33.89                      25.84
    Working Allowance           1.25                0%     0.00
    Project Cost               35.14                      25.84
    Contingency                 4.15                       5.17
    Running Costs               2.50                       2.50
    Full Approval Cost         41.79                      33.51

Forward Look
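The cost roll-up in the scenario table can be reproduced from its individual entries. This sketch assumes the structure the figures suggest: work areas A to G sum to a subtotal, the working allowance gives the project cost, and contingency plus running costs give the full approval cost. The assignment of labels to rows is inferred from the flattened slide text, and the variable names are illustrative.

```python
from decimal import Decimal as D

# Work-area costs in £m: (GridPP3 proposal, 70% scenario).
ITEMS = {
    "A Tier-1 staff": (D("4.99"), D("4.62")),
    "A Tier-1 hardware": (D("11.72"), D("7.20")),
    "B Tier-2 staff": (D("3.29"), D("2.94")),
    "B Tier-2 hardware": (D("5.12"), D("4.35")),
    "C Support": (D("4.50"), D("3.10")),
    "D Operations": (D("1.89"), D("1.66")),
    "E Management": (D("1.17"), D("1.06")),
    "F Outreach": (D("0.37"), D("0.28")),
    "G Travel etc": (D("0.84"), D("0.63")),
}
WORKING_ALLOWANCE = (D("1.25"), D("0.00"))
CONTINGENCY = (D("4.15"), D("5.17"))
RUNNING_COSTS = (D("2.50"), D("2.50"))


def full_approval_cost(column: int) -> D:
    """Work-area subtotal + working allowance + contingency + running costs."""
    subtotal = sum(costs[column] for costs in ITEMS.values())
    project_cost = subtotal + WORKING_ALLOWANCE[column]
    return project_cost + CONTINGENCY[column] + RUNNING_COSTS[column]


print("GridPP3 proposal:", full_approval_cost(0))
print("70% scenario:    ", full_approval_cost(1))
```

`Decimal` is used so that the two-decimal figures add up exactly, avoiding binary floating-point rounding in the comparison with the quoted totals.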

Grid Middleware

Experiment Application Software

Integration

Application Middleware

Facilities and Fabrics

GridPP3 was funded predominately to install and operate resources, and to deploy the wLCG.

Conclusion

From UK Particle Physics perspective the Grid is the basis for computing in the 21st Century:

1. needed to utilise computing resources efficiently and securely
2. uses gLite middleware (with evolving standards for interoperation)
3. required significant investment from PPARC (STFC), £100m over 10 yrs, including support from HEFCE/SFC
4. required 3 years’ prototype testbed development [GridPP1]
5. provides a working production system that has been running for over two years in build-up to LHC data-taking [GridPP2]
6. enables seamless discovery of computing resources: utilised to good effect across the UK – internationally significant
7. not (yet) as efficient as end-user analysts require: ongoing work to improve performance
8. ready for LHC – just in time delivery
9. future operations-led activity as part of LCG, working with EGEE/EGI (EU) and NGS (UK) [GridPP3]
10. future challenge is to exploit this infrastructure to perform (previously impossible) physics analyses from the LHC (and ILC and Fact and..)

Further Info: http://www.gridpp.ac.uk/