T. Bowcock, A. Moreton, M. McCubbin, CERN-IT 5/00

29 May 2000 CERN-IT T. Bowcock

University of Liverpool

• MAP System
• COMPASS
• Grid
• Summary

MAP@Liverpool

• LHCb Experiment
  – CP violation
  – Rare B decays
  – Signals of 10^3 to 10^6

• Backgrounds
  – Potentially all 10^14 collisions/year!

About 1×10^12 BB-bar produced/year

LHCb Experiment

Vertex detector

LHCb Experiment

Optimize the Detector

Study the Backgrounds

Simulation

• Full GEANT3 simulation
  – Event takes of order 120-200 s on a 400 MHz PC

• Put together a simulation facility
  – Samples of 10^7 to 10^8 / year
  – Many times more passed through GEANT
  – Monte Carlo Array Processor
  – Similar or larger samples
  – 10^9 institute/year

• Analysis, reprocessing
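As a rough cross-check of the figures on this slide, the sketch below converts the quoted per-event time (120-200 s on a 400 MHz PC) into events per day and per year. The farm size of 300 processors comes from the Hardware slide. Full uptime and one event stream per processor are simplifying assumptions, and the function name is illustrative only.

```python
SECONDS_PER_DAY = 86_400


def events_per_day(n_cpus: int, seconds_per_event: float) -> float:
    """Events completed per day if every CPU simulates continuously (assumed)."""
    return n_cpus * SECONDS_PER_DAY / seconds_per_event


if __name__ == "__main__":
    # 120 s and 200 s are the bounds quoted on the slide; 300 CPUs is the farm size.
    for t in (120, 200):
        daily = events_per_day(300, t)
        print(f"{t} s/event: {daily:,.0f} events/day, {daily * 365:.2e} events/year")
```

Under these assumptions the farm falls inside the 10^7 to 10^8 events/year range quoted above.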

Philosophy

• Fixed Purpose (MC): simplicity
• Low Cost
  – No Gbit ethernet until price falls
  – Don’t buy top of range processors
  – No SMP boards
    • 1998/1999
  – No tapes
• Develop architecture with future in mind
  – Minimum maintenance/development

Using MAP

Disposable MC (throwaway!)

• Cost

• Write out ntuple/summary information

• I/O not really limited by architecture

• Events may be written out

• Small internal disks

Hardware

• 300 processors
  – 400 MHz PII
  – 128 Mbytes memory
  – 3 Gbytes disk/processor (IDE)
  – D-Link 100BaseT ethernet + hubs
  – Commercial units
    • Custom boxes for packing and cooling
  – Total 600 kCHF inc. 17.5% VAT, 1998/1999 (funding Jan 99). ITS
    • Including installation and 3-yr next-day on-site maintenance.

MAP-OS

• Linux
  – Originally RH5.2 (also tested 6.1)
  – Stripped to minimum
    • On disk 180 MBytes!
  – Will (with FCS) reinstall/upgrade itself
  – Access/security

View

High Gflops/m^2

Old Mainframe Room

Power supply (3 phase), 0.1 MW max

50 kW cooling

Architecture

[Figure: architecture diagram. Labels: Master; External Ethernet; MAP Slaves; Hub (Switch - 00); Hub (Switch - 00), 100BaseT]

Design Features

• Mother boards/BIOS
  – No keyboard etc. required on boot!
• Front panels
  – All connections except power
    • Access to each PC via trolley on wheels
    • Cheaper than patch panel! Very convenient.
  – Cooling (room air flow)
    • 30 kW required, 50 kW capacity
    • Power cutoff installed
• Rack mount
  – 30/rack, easy to extract

Learned…

• Prototype
• Cables
  – Cheaper ethernet cables seem OK
• Would have been nice to have
  – On-board power/heat sensing
• Don’t really need power system
  – Daisy chain in groups of 5
  – Transients can be huge!

Bad things happen…

• Catastrophic power failure
  – No UPS (original design had one)
  – 4% needed manual intervention but no hardware failure

• Burn-in & 4 months of operation
  – 1 power supply exploded
  – 4 PCs with mother-board problems
  – 5 HD failures (within 1 week of turn on)
  – NIC cards fail
  – Typically 1% of nodes may have a problem

Flow Control System

• MAP-FCS
  – UDP level (frames)
  – Solve packet-loss problem
    • Bad hubs (D-Link)
    • NIC Realtek clones with high failure rate
  – Broadcast system
    • 4 Mbytes/s × 300 (Master to Slaves)
  – Point to point on fail
  – “Standard Mode”: communication only with master
  – Control up to 10,000 PCs
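The slide gives only the outline of MAP-FCS: frames broadcast from the master, with point-to-point transfer on failure. The sketch below is a small in-memory simulation of that general pattern, not the MAP-FCS protocol itself. The frame size, the loss model, the assumption that unicast repairs always succeed, and all function names are hypothetical.

```python
import random


def split_into_frames(payload: bytes, frame_size: int) -> list[bytes]:
    """Cut the payload into fixed-size frames; the index is the sequence number."""
    return [payload[i:i + frame_size] for i in range(0, len(payload), frame_size)]


def distribute(payload: bytes, n_slaves: int, frame_size: int = 1024,
               loss_rate: float = 0.05, seed: int = 0):
    """Broadcast each frame once, then repair gaps point to point.

    Returns (reassembled payload per slave, number of unicast resends).
    """
    rng = random.Random(seed)  # seeded so the simulated losses are repeatable
    frames = split_into_frames(payload, frame_size)
    received = [dict() for _ in range(n_slaves)]

    # Phase 1: one broadcast per frame; each slave may independently lose it.
    for seq, frame in enumerate(frames):
        for slave in received:
            if rng.random() >= loss_rate:
                slave[seq] = frame

    # Phase 2: every missing sequence number is resent to that slave alone
    # (assumed reliable here; a real system would retry and time out).
    resends = 0
    for slave in received:
        for seq in range(len(frames)):
            if seq not in slave:
                slave[seq] = frames[seq]
                resends += 1

    payloads = [b"".join(s[i] for i in range(len(frames))) for s in received]
    return payloads, resends


if __name__ == "__main__":
    data = bytes(range(256)) * 64
    out, resends = distribute(data, n_slaves=300)
    print("all slaves complete:", all(p == data for p in out), "resends:", resends)
```

The point of the pattern is that the master sends the bulk of the data once regardless of the number of slaves, and pays a per-slave cost only for the frames that were actually lost.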

Performance

• Jan/May 00
  – 15 million GEANT events for optimization
  – cf. 250,000 possible at CERN
  – DELPHI events
    • 500,000/day
    • Trilinear Gauge Couplings, W-mass systematics
  – ATLAS, CDF, H1

User

• Interface to master only
  – Web/Grid interface
  – Security
• Submission script
  – Job Control File
    • Sequential jobs, files to keep etc.
    • Quick and easy to use
• Statically linked executable
• Toolkit
  – Enables assembly/merging of 300 outputs
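The merging toolkit is not described beyond this bullet. As an illustration of what assembling 300 per-node outputs can look like, the sketch below sums per-node histograms bin by bin. The output format (a mapping from bin label to count) and the function name are assumptions, not the MAP toolkit's interface.

```python
from collections import Counter


def merge_histograms(per_node: list[dict[str, int]]) -> dict[str, int]:
    """Sum the bin counts of every node's histogram into one histogram."""
    total = Counter()
    for hist in per_node:
        total.update(hist)  # Counter.update adds counts rather than replacing them
    return dict(total)


if __name__ == "__main__":
    # Hypothetical summary output from each of 300 slaves.
    node_outputs = [{"0-10 GeV": 2, "10-20 GeV": 1} for _ in range(300)]
    print(merge_histograms(node_outputs))
```

Because addition is order-independent, the outputs can be merged as they arrive, and a node that failed can be rerun and added later.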

Search Analysis

• As a search engine, MAP architecture is ideal
  – Low search and recovery times
  – Chemistry
    • Centre for Innovative Catalysis (JIF ’00), promises world lead for Liverpool.
  – Bio-informatics
    • Compute/search farms

Extending MAP

• Wish to store events
  – Part of our mindset (reevaluate?)
• With existing system
  – Build an analysis and storage system
  – Add on disk servers

COMPASS

COMPASS-99

DELL ITS

COMPASS-00

• 3 Tbytes
  – On top of 1 TByte MAP internal
• Rack mounted
• Prototype of 40 Tbyte system

COMPASS

• Low cost (25 kCHF/Tbyte inc. 17.5% VAT)
  – SCSI disks (10 × 50 GByte)
  – Dual redundant power supplies
  – No RAID backplane
  – No hotswap
  – 750 MHz processors + 512 MBytes memory
  – Linux
  – Act as MAP masters
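To put the quoted price in context, the sketch below scales the 25 kCHF/Tbyte figure to the 3 Tbyte prototype and the 40 Tbyte system mentioned elsewhere in the talk. It assumes that the ten 50 GByte disks sit in a single server, that 1 Tbyte = 1000 GByte, and that cost scales linearly with capacity. None of this is stated on the slide, and the names are illustrative.

```python
import math

COST_PER_TB_KCHF = 25   # from the slide, inc. 17.5% VAT
DISKS_PER_SERVER = 10   # assumed: the slide's "10 x 50 GByte" is one server
DISK_SIZE_GB = 50       # from the slide


def servers_needed(capacity_tb: float) -> int:
    """Number of disk servers required for the requested capacity."""
    per_server_tb = DISKS_PER_SERVER * DISK_SIZE_GB / 1000
    return math.ceil(capacity_tb / per_server_tb)


def cost_kchf(capacity_tb: float) -> float:
    """Cost in kCHF, assuming the per-Tbyte price scales linearly."""
    return capacity_tb * COST_PER_TB_KCHF


if __name__ == "__main__":
    for tb in (3, 40):
        print(f"{tb} Tbyte: {servers_needed(tb)} servers, {cost_kchf(tb):.0f} kCHF")
```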

COMPASS

• Have 3Tbytes of store for R&D on GRID and exploitation of MAP

• MAP & COMPASS are complementary…

• Originally requested 40 TBytes of store
  – For H1, BaBar, ATLAS, DELPHI

MAP & COMPASS

• DST or processed data stored
  – From MAP
• Reprocessed/analysed locally
  – COMPASS
• Limit data movement off site
  – COMPASS farm in own right
  – Powerful analysis engines
  – Access from remote sites
  – Designed to, in parallel, analyse very large data sets (Data split nodes – June 00)

Data Transfer 2000-2003

• Data transfer to/from
  – Liverpool-CERN/RAL
  – Liverpool-SLAC/FNAL
• High speed link may be a waste of money
  – 3 MCHF for 2MBs line!
  – Quality of service
  – Probably not true in long term
• Transfer disks
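A back-of-the-envelope estimate helps show why shipping disks is on the table. The slide's "2MBs" does not say whether it means megabytes or megabits per second, so the sketch below evaluates both readings. It assumes the line runs at full rate with no protocol overhead and takes 1 Tbyte as 10^12 bytes; the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400
TBYTE = 1e12  # bytes, decimal convention assumed


def transfer_days(volume_bytes: float, rate_bytes_per_s: float) -> float:
    """Days needed to move a volume at a sustained rate (no overhead assumed)."""
    return volume_bytes / rate_bytes_per_s / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"2 Mbyte/s reading: {transfer_days(TBYTE, 2e6):.1f} days per Tbyte")
    print(f"2 Mbit/s reading:  {transfer_days(TBYTE, 2e6 / 8):.1f} days per Tbyte")
```

Under either reading, one Tbyte occupies the line for days to weeks.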

MAP-2001

• Extension of existing architecture
  – Vast underestimate of amount of MC required
  – Extend to 1000 PCs
    • 720 × 800 MHz PIII with 72 Gbyte disks
    • 128 MBytes memory
    • Switched network (& higher quality!)
    • Better NICs (onboard?)

MAP-2001

• Companies more willing to discuss COTS type architecture
  – Many selling BEOWULF systems
  – Even IBM!
  – ITS will provide a turnkey system including our version of MAP control

MAP-2001

• Capability
  – Standard MAP mode
  – DST transfer
  – Search Engine
  – Interprocess communication
  – Large Internal Store
    • Minimize network traffic
    • Reprocessing

MAP-2001

• Increase power by factor of 5
• Aim for 1.5M LHCb events/day
  – Non-volatile 1 Tbyte/day
  – 50 days internal store
    • Use for reprocessing data
    • Disk size will increase by calendar 2001
• Multi-user and projects
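The targets on this slide can be checked against the MAP-2001 hardware figures quoted earlier (720 PCs with 72 Gbyte disks). The sketch below derives the implied event size and compares the 50-day store with the total internal disk. It assumes one disk per PC, that all disk space is usable for events, and decimal units; the names are illustrative.

```python
EVENTS_PER_DAY = 1.5e6   # target from this slide
BYTES_PER_DAY = 1e12     # "1 Tbyte/day", decimal units assumed
STORE_DAYS = 50          # from this slide
N_DISKS = 720            # from the earlier MAP-2001 slide, one disk per PC assumed
DISK_BYTES = 72e9        # 72 Gbyte per disk


def bytes_per_event() -> float:
    """Average stored size of one event implied by the daily targets."""
    return BYTES_PER_DAY / EVENTS_PER_DAY


def store_needed_tb() -> float:
    """Tbytes required to hold STORE_DAYS of output."""
    return STORE_DAYS * BYTES_PER_DAY / 1e12


def store_available_tb() -> float:
    """Total internal disk in Tbytes if every byte is usable (assumed)."""
    return N_DISKS * DISK_BYTES / 1e12


if __name__ == "__main__":
    print(f"event size: {bytes_per_event() / 1e6:.2f} Mbyte")
    print(f"needed: {store_needed_tb():.1f} Tbyte, available: {store_available_tb():.2f} Tbyte")
```

Under these assumptions the 50-day store (50 Tbyte) fits within the roughly 52 Tbyte of internal disk, with little margin.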

Issues

• Authentication and Security
• Quality of Service
• Resource Allocation

Grid

• Adding Globus (June 2000)
• Access from CERN &
  – Cambridge University, JMU, Liverpool, RAL
• Remote submission

Grid 2005

[Figure: Grid hierarchy diagram. Labels: Tier 0 (CERN); Tier 1; several T2 centres; nodes marked 3 and 4; "??????"]

Grid-LHCb

• Aim to use MAP as an LHCb testbed
  – MC production
  – Data access
  – Analysis
  – UK and CERN sites
  – Interaction with RAL

HealthGrid

• Virtual Population Laboratory
  – Co-proposed by Liverpool for a “world scale met office for disease prediction”
    • In collaboration with WHO
  – Analysis power based on MAP
    • 5000 PC system

HealthGrid

• Community Health Surveillance
  – WAP, local data bases
• Information
  – Statistics
• Analysis
  – MAP-like centres for Health Policy
    • WHO Med Centre

Comments

• High Power MC systems vital for HEP
  – Do we have/plan enough for LHC?
• Cost and Techniques of Storage
  – Small groups can’t afford/want HSM
  – Is tape obsolete?
• Problems for institutes not the same as for Tier 0/1 centres

Summary

• MAP fulfils its design goals (works!)
  – MAP-FCS can control up to 10,000 PCs
• Minimum manpower: 0.5 FTE to date
  – Maintenance and development
• COTS architecture a success
  – Low cost has its ups and downs!
  – MAP available off the shelf for HEP-MC
• Low cost high density storage server farm in prototype
• Grid enabled
  – Access from CERN
  – and UK HEP institutes soon
• MAP-2001
  – Test a 1000 PC farm for LHC