Run II Computing and Software

Nick Hadley, DØ Prague Workshop, Sep 23 1999

Overview
Organization
Status (brief)
Conclusion



Computing

Lehman Baseline

Trigger Systems
DAQ System
$6.0 M
$0.6 M
50 - 100 Hz
Reconstruction Farms: $1.6 M, 20 - 40 Hz DC
Central Analysis Systems: $2.8 M
Mass Storage
$2.9 M
Workgroup Server
Workgroup Server
Desktops
Remote Computing
Monte Carlo, ...
$0.8 M *

Software

Trigger Controls
Trigger Algorithms
Controls & Monitoring
Run Control, Data Logging
Subdetector Tasks
Accelerator Interface
Calibration and Hardware Databases
Farm Control Software
Offline Reconstruction
Production Databases
Analysis Tools
Analysis Software
MSS Control
Persistency Tools
Remote Systems
Monte Carlos
Development Environment
Physicists’ Analyses

History

Run I Computing at DØ

Data collection at 2 Hz with very high efficiency
Reconstruction kept up with collection
• time lag of ~2 weeks for calibration

Successes:
• Robustness and timeliness of reconstruction
• Easy access to microDST data set of all events

Problems/Bottlenecks:
• Tape access (8mm drives, no significant HSM system)
• Network access
• Reconstruction system difficult to test and verify

DØ Run II Data Set

Run II Parameters for DØ:

Trigger rate                  50 Hz             (LHC / 2)
Raw data event size           250 kB            (LHC / 4)
Data collection               600 M events/yr   (LHC / 2)
Summary event size            150 kB            (LHC x 2)
Physics summary event size    10 kB             (LHC ??)
Total dataset size            300 TB/yr         (LHC / 3)

Bottom line: Computing project ~ O(Run I x 20), ~ O(LHC / 3-4)

Must be accomplished with resources available in 2000, not 2005!
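The parameters above can be cross-checked with a few lines of arithmetic. The sketch below is only a back-of-envelope check, not part of the original plan: it assumes decimal units (1 TB = 1e9 kB), and the variable and function names are illustrative. Under those assumptions the three data tiers add up to about 246 TB per year, the same order as the quoted 300 TB/yr (the slide does not itemize the remainder), and 600 M events averaged over a calendar year corresponds to roughly 19 Hz, close to the lower end of the 20 - 40 Hz DC figure quoted for the reconstruction farms.

```python
# Back-of-envelope check of the Run II data-set figures quoted on the slide.
# Inputs come from the slide; decimal units (1 TB = 1e9 kB) are an assumption.

EVENTS_PER_YEAR = 600e6              # data collection, events per year
TRIGGER_RATE_HZ = 50.0               # trigger rate
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Per-event sizes in kB for each data tier.
EVENT_SIZE_KB = {
    "raw": 250.0,
    "summary": 150.0,
    "physics summary": 10.0,
}

KB_PER_TB = 1e9


def tier_volumes_tb(events_per_year, sizes_kb):
    """Return the yearly volume in TB for each data tier."""
    return {tier: events_per_year * kb / KB_PER_TB for tier, kb in sizes_kb.items()}


volumes = tier_volumes_tb(EVENTS_PER_YEAR, EVENT_SIZE_KB)
total_tb = sum(volumes.values())

# Seconds of data taking needed to collect the yearly sample at the trigger rate.
live_seconds = EVENTS_PER_YEAR / TRIGGER_RATE_HZ
# Event rate averaged over the whole calendar year.
average_rate_hz = EVENTS_PER_YEAR / SECONDS_PER_YEAR

if __name__ == "__main__":
    for tier, tb in volumes.items():
        print(f"{tier:16s} {tb:6.1f} TB/yr")
    print(f"{'sum of tiers':16s} {total_tb:6.1f} TB/yr")
    print(f"live time at 50 Hz: {live_seconds:.2e} s")
    print(f"average rate over a year: {average_rate_hz:.1f} Hz")
```

The live time that falls out of the calculation, about 1.2e7 seconds per year, is implied by the slide's own numbers (600 M events at 50 Hz) rather than stated on it.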

Run II Computing and Software Plan

Original Plan: January 1997
Update to Plan: January 1999

See DØ at Work → Computing → Reviews & Documentation

Key goals:
Maintainability
Separately testable modules
Flexibility
• Replaceable packages (e.g., the implementation of persistency, the graphics package, etc.)

Key decisions:
OOAD / C++
Interface to persistency mechanism (first implementations: DSPACK, EVPACK)
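The idea of a replaceable persistency package behind a fixed interface can be illustrated with a small sketch. Everything in it is hypothetical: the class and method names are invented for illustration, the example is in Python rather than the project's C++, and the two backends (JSON and pickle) are stand-ins that have nothing to do with DSPACK or EVPACK. The point is only that user code depends on the interface, so the storage format can be swapped without touching it.

```python
# Sketch of a replaceable persistency layer. Class and method names are
# hypothetical; the two backends are stand-ins, not DSPACK or EVPACK.
import json
import pickle
from abc import ABC, abstractmethod


class EventStore(ABC):
    """Interface that user code sees; it hides the storage format."""

    @abstractmethod
    def dumps(self, event: dict) -> bytes:
        """Serialize one event to bytes."""

    @abstractmethod
    def loads(self, blob: bytes) -> dict:
        """Rebuild one event from bytes."""


class JsonStore(EventStore):
    """Backend that stores events as UTF-8 encoded JSON."""

    def dumps(self, event):
        return json.dumps(event, sort_keys=True).encode("utf-8")

    def loads(self, blob):
        return json.loads(blob.decode("utf-8"))


class PickleStore(EventStore):
    """Backend that stores events with Python's pickle format."""

    def dumps(self, event):
        return pickle.dumps(event)

    def loads(self, blob):
        return pickle.loads(blob)


def round_trip(store: EventStore, event: dict) -> dict:
    """User code: depends only on the EventStore interface."""
    return store.loads(store.dumps(event))


if __name__ == "__main__":
    event = {"run": 1, "event": 42, "tracks": [1.5, 2.5]}
    for store in (JsonStore(), PickleStore()):
        assert round_trip(store, event) == event
        print(type(store).__name__, "round trip ok")
```

Because each backend can be exercised through the same `round_trip` call, it is also separately testable, which matches the other stated goal.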

R2CSP Organization

Co-Leaders: Hadley, Merritt
Computing Planning Board
Tightly connected to Joint Projects

Infrastructure: Greenlee, Li, Prosper
• Software Tools, edm, dØom, framework, RCP, Config Man, Graphics

Algorithms: Protopopescu, Womersley
• Subdetectors, Calib/Align, Level 3, Global Tracking, Vertexing, EM ID, Muon ID, Tau ID, Jets/Missing Et

Monte Carlo: Klima
• Generators, Geant sim, C++ wrapper, Param MC

Production/Data Access: Diesburg, Lueking
• Procurements: Farms, Mass Storage, Analysis Sys
• Software: Farms, ENSTORE, LSF, etc.
• Data bases
• Data handling: SAM

Online: Fuess, Slattery
• Hardware controls, DAQ - primary & secondary, Event monitoring, Control room applications

Joint Offline Projects

ZOOM - C++ Class Libraries
RIP - Reconstruction Input Pipeline (writes data into robot in FCC)
Support Databases
• Using ORACLE, license negotiated through CD
Configuration Management
• Using SoftRelTools with DØ interface (ctest)
Hardware Projects
• Mass storage - new robot purchased
• Networking - new single mode fiber to DØ
• Farms - in progress (prototype purchase)
• Physics analysis - 1/3 purchased, online soon, 5 TB disk

Note: So far, costed at or under DMNAG estimates.

Joint Offline Projects cont’d

Visualization
• Making use of Open Inventor licenses
Physics Analysis
• ROOT chosen
• Evaluations of products
Storage Management
• Decision to use ENSTORE, linked to SAM
Data Access
• SAM Working Group - charged with producing a proof-of-principle demonstration Oct. 98 - done!

Computing and Software Plan: Decision Timeline

1995  Use C++; write data using DSPACK
1996  Begin project; CPB structure; begin to understand large C++ system
      Use DØOM: isolate I/O from user code
1997  (Jan) 60-page plan submitted
      (Jul) Use KAI & VC++ compilers
      (Oct) Use SAM
      (Dec) Use ZOOM products
1998  (Jan) Use ORACLE for DB needs
      (Mar - DAM review) Use modified exclusive streaming
      (May) Use ENSTORE instead of HPSS; use Open Inventor graphics
      (Aug - GCM review) Use tri-level analysis system; PCs on desktop
      (Sept.) Physics analysis software - ROOT
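The March 1998 entry mentions exclusive streaming. A minimal sketch of the general idea follows, assuming the common meaning of the term: each event is written to exactly one output stream, chosen from the triggers it fired. The stream names, trigger names and priority order are hypothetical, and the "modified" scheme adopted at the DAM review is not described in these slides, so it is not modeled here.

```python
# Sketch of exclusive streaming: every event is written to exactly one stream.
# Stream names, trigger names and the priority order are hypothetical.

# Streams in priority order; the first stream whose triggers fired takes the event.
STREAM_TRIGGERS = [
    ("muon", {"MU_1", "MU_2"}),
    ("electron", {"EM_1"}),
    ("jet", {"JT_1", "JT_2"}),
]
DEFAULT_STREAM = "other"


def assign_stream(fired_triggers):
    """Return the single stream an event belongs to."""
    fired = set(fired_triggers)
    for stream, triggers in STREAM_TRIGGERS:
        if triggers & fired:
            return stream
    return DEFAULT_STREAM


def split_events(events):
    """Group event ids by stream; each id appears in exactly one list."""
    streams = {}
    for event_id, fired in events:
        streams.setdefault(assign_stream(fired), []).append(event_id)
    return streams


if __name__ == "__main__":
    sample = [(1, ["MU_1", "JT_1"]), (2, ["JT_2"]), (3, []), (4, ["EM_1"])]
    for stream, ids in split_events(sample).items():
        print(stream, ids)
```

The trade-off of this layout is that no event is stored twice, at the cost that an analysis needing events from several triggers may have to read more than one stream.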

Status in a Nutshell

SAM Data Handling:
• Version 0 released in KITS
• Used on farm prototype
• Ready for general users on d02ka

Reconstruction Program:
• Works from MC digi hits to produce central tracks, cal & PS clusters, muon system hits, first versions of physics objects

Online System:
• Prototype DAQ working
• Able to log data, Cal, muon
• Interface to control HV, etc.
• Framework for data monitoring
• Level 3 readout software

Monte Carlo:
• Full GEANT 3.21 sim of detector
• Multiple interaction capability
• ~90K events available (bkg, sig)
• Fast MC under development

Analysis Program:
• HBOOK Ntuples from MCC1 available, MCC2 by 1/00
• ROOT - new Run II tool (can analyze old & new format)

Level 3 Filters:
• Tools & Filters under development; excellent tutorial on web

Databases:
• Oracle Event/File DB in production; used by SAM
• Other DBs under development

Event Display:
• Full detector due in Oct
• Displays for algorithm development available now

Status in a Nutshell

ADIC tape robot:
• Operational
• 750 TB capacity
• In use for SAM & ENSTORE tests; MCC99-1

Central Analysis System:
• Now: d02ka + ~500 GB disk
• Coming: 1/3 of final system, available to users by 22-Sep
• 64 SGI R12000s (final system: 192)
• 3 Gigabit Ethernet interfaces
• 5 TB disk (final system: 30 TB)

Online Systems:
• DEC Alpha hosts
• Level 3 farm
• Linux & NT PCs for controls & monitoring
• EXAMINE farm

Networking Upgrades:
• FC rewiring of DAB & PKs
• 100 Mb to desktops
• Gb uplinks to main switch
• Done by Sept

Desktops at DØ:
• NT or Linux PCs
• Linux support person hired
• Operations strategy being worked out

Project Servers?
• NT, LINUX
• Possible roles
• Model for offsite?
• Specification
• Tax ??

Database Servers:
• 2 Suns here
• In use for Evt/File DB

Farms:
• Prototype (4 nodes) for MCC-1
• Fermilab purchase this yr - 50 nodes
• Offsite proposals for MC

Status: Infrastructure

Packages to support batch reconstruction are in place and in use
• edm - stable, a few changes coord. with dØom
• rcp - flat file version stable
• dØom - DSPACK, EVPACK done, DB design work
• framework - batch, single threaded interactive

Have event displays for many detectors

Still to do: multi-threaded Interactive Framework, DB version of RCP, integrated event display
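To make the notion of a single-threaded batch framework concrete, here is a toy sketch in which each event is passed through a chain of independent modules. It is an illustration only: the module names, the dictionary-based event layout and the `run_batch` function are hypothetical and do not reflect the actual DØ framework or edm interfaces.

```python
# Sketch of a single-threaded batch framework. Module names and the event
# layout are hypothetical and do not reflect the DØ framework's interfaces.


def find_clusters(event):
    """Toy module: keep calorimeter cells above a threshold."""
    event["clusters"] = [e for e in event["cells"] if e > 1.0]
    return event


def sum_energy(event):
    """Toy module: total energy of the clusters found upstream."""
    event["total_energy"] = sum(event["clusters"])
    return event


def run_batch(events, modules):
    """Pass each event through the module chain, one event at a time."""
    results = []
    for event in events:
        for module in modules:
            event = module(event)
        results.append(event)
    return results


if __name__ == "__main__":
    events = [{"cells": [0.5, 2.0, 3.0]}, {"cells": [0.1]}]
    for out in run_batch(events, [find_clusters, sum_energy]):
        print(out["total_energy"])
```

Since each module is a plain function of one event, it can be tested on its own with a hand-built event, independently of the rest of the chain.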

Status: Infrastructure

Configuration management status
Golden releases ~ every three weeks
Latest release to IRIX and Linux (~200 packages)
• OSF1, NT (40 packages now)
• Switching to CTB2, SRT2. Needed for NT

Still with KAI compiler & VC++
• Debugger - TotalView a big improvement

Releases - set a new goal of one week releases
Still working on t59
Established subsystem coordinators
t60 will use SRT2 (and this will be the only change!)

Status: Monte Carlo

Full GEANT simulation
• Phase 1 of MCC 99 generated 87K events
• multiple interactions, better secondary tracing now implemented

Production release of DØGSTAR/simpp certified for MCC99-2

Monte Carlo presentations to Run II Physics meeting: hope for additional users/feedback/help

Fast MC, trigger simulation in progress

Status: Algorithms

RECO program including first versions of all particle ID exists
• First production version March 99
• Algorithms group proceeding toward Oct. 30 production release

Level 3 reviewed in April 99
• software infrastructure in good shape, more effort needed on releases
• all basic tools being designed and coded
• number of people increasing

Status: Production and Data Access

SAM and ENSTORE prototypes exist and have been used to access MCC data with production RECO.

Solved numerous problems

Good progress on framework integration for Oct.1 SAM release

first 1/3 of central analysis server has arrived, available for users soon

final system 90K MIPS, 30 TB of disk

New DØ tape robot (750 TB) installed and in use

Serial Media choice by early fall

Status: Online

Version 0 of COOR, other infrastructure in place

Two DEC Alpha hosts + some PCs

Beginning to use system for commissioning tasks

Milestones driven by users’ needs

Muons and calorimeter successfully read out through the complete chain

Von Rüden V Closeout, June 99

Joint Projects
• Procurement: Spending on track; well under control
• Management & Organization: Move into understanding support phase
• Infrastructure: Going well

Data Access
• Networking: Continue good collaboration with CD
• RIP: Stick to schedule
• Serial Media: Work out a backup strategy for Mammoth II’s
• ENSTORE (DØ): Very pleased with excellent management
• SAM (DØ): Pleased with further SAM progress; develop plans to further exercise the system with users

Production System Procurements
• (Robot: Well done at the last review)
• Central Analysis System: Well done
• Farms: Another success

Von Rüden V Closeout

DØ
• Ensure further attention to Level 3
• Global Tracking should work to recruit more people, especially with prior experience in tracking
• Start working on calibration & alignment

Overall
• We are “very much on track”
• The committee feels “rather confident” of a successful completion (And they request us not to embarrass them by proving them wrong)

Schedule

The Schedule

MCC99-2 Monte Carlo Production release          9/01/99
Monte Carlo evt gen complete                    11/15/99
RECO 3rd Production release                     10/31/99
Farm production for MCC99-2                     11/30/99
Level 3
• many skeleton phys tools                      7/31/99
• 1st prod rel. working physics tools           10/31/99
• physics tools with raw data                   2/28/00
Online System
• data logging rate test to FCC                 9/01/99
• integrated DAQ run                            11/22/99
MCC99-2 analysis (SAM, DB, ROOT)                12/31/99
MCC99-2 Workshop for results                    ~2/28/00
Detector Ready for data                         1/23/00

We still have a great deal to do. More time but no extra people. Set priorities and keep focus.

Summary of Computing and Software

Integration of systems is beginning!

Good reviews for software infrastructure and hardware purchasing plans

Reconstruction and Monte Carlo moving to V2.

We are well positioned but still need manpower. Hardware needs may make it difficult to add people at FNAL. Contributions of those based elsewhere very important.