My Journey through Scientific Computing August 28 2013, FNAL René Brun/CERN*


Page 1: My Journey through Scientific Computing

My Journey through Scientific Computing

August 28 2013, FNAL

René Brun/CERN*

Page 2: My Journey through Scientific Computing

R.Brun 2

Disclaimer

In this talk I present the views of somebody involved in some aspects of scientific computing as seen from a major lab in HEP.

Having been involved in the design and implementation of many systems, my views are necessarily biased.

28/8/13

Page 3: My Journey through Scientific Computing

CV with experiments

Jun 1973: Thesis in Nuclear Physics

Jul 1973: ISR: R602 with C. Rubbia (reconstruction)

Oct 1974: SPS: NA4 with C. Rubbia (simulation/reconstruction)

Feb 1980: LEP: OPAL (simulation and framework)

1991-1993: LHC: ATLAS/CMS simulation

Sep 1994: SPS: NA49 (exile)

Sep 1996: LHC: ALICE (simulation and framework)

Page 4: My Journey through Scientific Computing

CV with general software

1974: HBOOK

1975: GEANT1, Lintra, Mudifi

1977: GEANT2, HPLOT, ZBOOK

1980: GEANT3

1983: ZEBRA

1984: PAW

1995: ROOT

2012: GEANT4+5 vector prototype concepts

Page 5: My Journey through Scientific Computing

Machines

From mainframes to clusters, walls of cores, GRIDs and clouds

Page 6: My Journey through Scientific Computing

Machine Units (bits)

Word sizes: 16, 32, 36, 48, 56, 60, 64

Machines shown: PDP-11, Nord-50, BESM-6, CDC, UNIVAC and many others

With even more combinations of exponent/mantissa size or byte ordering

A strong push to develop portable, machine-independent I/O systems
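The idea of machine-independent I/O can be sketched in a few lines: the writer fixes the byte order and the field sizes explicitly instead of dumping whatever the local machine uses. The record layout and the names `write_record` and `read_record` below are hypothetical illustrations, not the format of any of the packages named in this talk.

```python
import struct

# Hypothetical record layout (an assumption made for this sketch):
# a 32-bit count followed by that many IEEE 754 64-bit floats.
# The ">" prefix fixes the byte order (big-endian) and the field sizes,
# so the bytes do not depend on the machine that wrote them.

def write_record(values):
    """Pack a list of floats into a portable byte string."""
    header = struct.pack(">i", len(values))
    body = struct.pack(f">{len(values)}d", *values)
    return header + body

def read_record(data):
    """Unpack a byte string produced by write_record."""
    (count,) = struct.unpack_from(">i", data, 0)
    return list(struct.unpack_from(f">{count}d", data, 4))

record = write_record([1.5, -2.25, 3.0])
restored = read_record(record)
```

Any machine that reads `record` with the same format string recovers the same values, whatever its native word size or byte ordering.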

Page 7: My Journey through Scientific Computing

User machine interface

Page 8: My Journey through Scientific Computing

Systems in 1980

OS & Fortran: 1000 KLOC

Libraries (HBOOK, Naglib, cernlib): 500 KLOC

Experiment software: 100 KLOC

End-user analysis software: 10 KLOC

Hardware: CDC, IBM, Vax780, tapes, RAM 1 MB

Page 9: My Journey through Scientific Computing

Systems today

OS & compilers: 20 MLOC

Frameworks like ROOT, Geant4: 5 MLOC

Experiment software: 4 MLOC

End-user analysis software: 0.1 MLOC

Hardware: clusters of multi-core machines (10000x8), GRIDs, clouds

Networks: 10 Gbit/s

Disks: 10 PB

RAM: 16 GB

Page 10: My Journey through Scientific Computing

Tools & Libs

hbook, zbook, hydra, zebra, paw, geant1, geant2, geant3, geant4, Geant4+5, root, minuit, bos

Page 11: My Journey through Scientific Computing

General Software in 1973

Software for bubble chambers: Thresh, Grind, Hydra

Histogram tool: SUMX from Berkeley

Simulation with EGS3 (SLAC), MCNP (Oak Ridge)

Small Fortran IV programs (1000 LOC, 50 kbytes)

Punched cards, line printers, pen plotters (GD3)

Small archive libraries (cernlib), lib.a

Page 12: My Journey through Scientific Computing

Software in 1974

First “Large Electronic Experiments”

Data Handling Division == Track Chambers

Well organized software in TC with HYDRA, Thresh, Grind; anarchy elsewhere

HBOOK: from 3 routines to 100, from 3 users to many

First software group in DD

Page 13: My Journey through Scientific Computing

GEANT1 in 1975

Very basic framework to drive a simulation program: reading data cards with FFREAD, step actions with GUSTEP and GUNEXT, applying the magnetic field with GUFLD.

Output (Hist/Digits) was user defined

Histograms with HBOOK

About 2,000 LOC

Page 14: My Journey through Scientific Computing

ZBOOK in 1975

Extraction of the HBOOK memory manager into an independent package.

Creation of banks and data structures anywhere in common blocks

Machine-independent I/O, sequential and random

About 5,000 LOC

Page 15: My Journey through Scientific Computing

GEANT2 in 1976

Extension of GEANT1 with more physics (e-showers based on a subset of EGS, multiple scattering, decays, energy loss)

Kinematics, hits/digits data structures in ZBOOK

Used by several SPS experiments (NA3, NA4, NA10, Omega)

About 10,000 LOC

Page 16: My Journey through Scientific Computing

Problems with GEANT2

Very successful small framework.

However, the detector description was user written and defined via “if” statements at tracking time.

This was becoming a hard task for large and constantly evolving detectors (as was the case with NA4 and C. Rubbia).

Many attempts were made to describe a detector geometry via data cards (a bit like XML), but the main problem was the poor and inefficient detector description in memory.

Page 17: My Journey through Scientific Computing

GEANT3 in 1980

A data structure (ZBOOK tree) describing complex geometries was introduced, then gradually the geometry routines computing distances, etc.

This was a huge step forward, implemented first in OPAL, then L3 and ALEPH.

Full electromagnetic showers (first based on EGS, then own developments)

Page 18: My Journey through Scientific Computing

(HYDRA, ZBOOK), GEM -> ZEBRA

HYDRA and ZBOOK were under continuous development, both having nice and complementary features.

In 1981 the GEM project was launched. Developed with no contact with the experiments, it failed to deliver a working system in 1982.

In 1983 the director for computing decided to stop GEM and HYDRA and support the ZBOOK line, mainly because of the success of GEANT3 based on ZBOOK.

I decided to collaborate with J. Zoll (HYDRA) to develop ZEBRA, combining ZBOOK and HYDRA.

This decision made OPAL and L3 happy, but ALEPH decided to use BOS from DESY.

Page 19: My Journey through Scientific Computing

GEANT3 with ZEBRA

ZEBRA was very rapidly implemented in 1983.

We introduced ZEBRA in GEANT3 in 1984.

From 1984 to 1993 we introduced plenty of new features in GEANT3: extensions of the geometry, hadronic models with Tatina, Gheisha and Fluka, graphics tools.

In 1998, GEANT3 was interfaced with ROOT via the VMC (Virtual Monte Carlo).

GEANT3 has been used, and is still in use, by many experiments.

Page 20: My Journey through Scientific Computing

Graphics/UI in the 80s

CORE system (Andy Van Dam) in the US

GKS in Europe

X Window wins with X terminals and workstations

We designed HIGZ (interface to graphics and ZEBRA) with many interfaces (CORE, GKS, X, PHIGS, etc.)

KUIP for the user interface (VT100, workstations, xterms)

Page 21: My Journey through Scientific Computing

PAW

First minimal version in 1984

An attempt was made to merge with GEP (DESY) in 1985, but in the end we took from it the idea of ntuples for storage and analysis. GEP was written in PL1.

The package kept growing until 1994 with more and more functions. Column-wise ntuples in 1990.

Users liked it, mainly once the system was frozen in 1994.

Page 22: My Journey through Scientific Computing

Vectorization attempts

During the years 1985-1990 a big effort was invested in vectorizing GEANT3 (in collaboration with Florida State University) on CRAY Y-MP, CYBER 205 and ETA10.

The minor gains obtained did not justify the big manpower investment. GEANT3 transport was still essentially sequential and we had a big overhead from vector creation and gather/scatter.

However, this experience and failure was very important for us, and many of its lessons were useful for the design of GEANT5 many years later.

Page 23: My Journey through Scientific Computing

So far so good

The years 1980-1989 were pretty quiet (i.e. no fights). Fortran 77 was the main language and LEP our main target. The SSC simulations (GEM, SDC) were along the same lines.

The ADAMO system had been proposed as an implementation of the Entity Relationship Model, but its use remained confidential (ALEPH/ZEUS). The same goes for the Jazelle system from SLD.

In 1989 Tim Berners-Lee joined my group in DD to implement a system allowing access to central documentation on the IBM via RPCs (Remote Procedure Calls), but he developed something else. This coincided with a lot of controversy about future languages, data structure management, data bases, user interfaces, documentation systems, etc.

Page 24: My Journey through Scientific Computing

1991: Erice Workshop

This workshop was supposed to produce an agreement on the directions and languages for the next-generation experiments. Instead, a very confusing picture emerged.

The MOOSE project was created at CERN to investigate languages such as Eiffel, C++, Objective-C and F90, but this project failed to produce anything concrete.

At the same time my group was very busy with the LEP analysis with PAW and the SSC and LHC simulations with GEANT3 (ATLAS and CMS).

Page 25: My Journey through Scientific Computing

Erice workshop 1991

Where to go with DS tools & languages?

Powerful tools, but programming with them was odd

Page 26: My Journey through Scientific Computing

1992: CHEP Annecy

Web, web, web, web...

Attempts to replace/upgrade ZEBRA to support/use F90 modules and structures, but parsing and analysing the modules was thought to be too difficult.

With ZEBRA the bank description was within the bank itself (just a few bits). A bank was typically a few integers followed by a dynamic array of floats/doubles.

We did not realize at the time that parsing user data structures was going to be a big challenge!!
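A bank that carries its own description can be sketched as follows. The layout below (a two-integer header followed by the integers and the doubles) and the names `pack_bank` and `unpack_bank` are assumptions made for illustration; this is not the real ZEBRA bank format.

```python
import struct

# Hypothetical bank layout, for illustration only:
#   header: two 32-bit integers = (number of integers, number of doubles)
#   body:   the integers, then the doubles, all big-endian.
# The header plays the role of the in-bank description: a reader needs no
# external schema to decode the body.

def pack_bank(ints, doubles):
    """Build a self-describing bank from a few integers and an array of doubles."""
    header = struct.pack(">2i", len(ints), len(doubles))
    body = struct.pack(f">{len(ints)}i{len(doubles)}d", *ints, *doubles)
    return header + body

def unpack_bank(data):
    """Decode a bank using only the description stored in its header."""
    n_ints, n_doubles = struct.unpack_from(">2i", data, 0)
    fields = struct.unpack_from(f">{n_ints}i{n_doubles}d", data, 8)
    return list(fields[:n_ints]), list(fields[n_ints:])

bank = pack_bank([7, 42], [0.5, 1.25, 2.0])
ints, doubles = unpack_bank(bank)
```

Such a fixed shape is easy to describe in a few bits. An arbitrary user-defined structure is not, which is why describing user data structures requires parsing their declarations.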

Page 27: My Journey through Scientific Computing

Parallelism in the 80s & early 90s

Many attempts (all failing) with parallel architectures

Transputers and OCCAM

MPP (CM2, CM5, ELXI, ...) with OpenMP-like software

Too many GLOBAL variables/structures with Fortran common blocks.

RISC architectures or emulators were perceived as a cheaper solution in the early 90s.

Then MPPs died with the advent of the Pentium Pro (1994) and farms of PCs or workstations.

Page 28: My Journey through Scientific Computing

Consequences

In 1993/1994 performance was no longer the main problem.

Our field was invaded by computer scientists.

Program design, object-oriented programming and the move to more sexy languages were becoming a priority.

The “goal” was thought less important than the “how”

This situation deteriorated even more with the death of the SSC.

Page 29: My Journey through Scientific Computing

1993: Warning Danger

3 “clans” in my group:

1/3 pro F90

1/3 pro C++

1/3 pro commercial products (any language) for graphics, user interfaces, I/O and data bases

My proposal to continue with PAW, develop ZOO (ZEBRA Object-Oriented) and GEANT3 geometry in C++ was not accepted.

Evolution vs Revolution

Page 30: My Journey through Scientific Computing

1994: What next?

SSC down, LHC up. SSC refugees joining LHC development groups.

DRDC projects: Objectivity, GEANT4

Time to think, think, think and learn new things (OO, UML, C++, Eiffel, O2, ObjectStore, Objectivity, ...)

Discard several proposals and choose exile to NA49

Fall 94: first version of histogram package in C++, including some I/O attempts. Now in a better position to estimate development time, C++ pros & cons, Objectivity cons.

Christmas 94: YES, let’s go with ROOT

Page 31: My Journey through Scientific Computing

1995: roads for ROOT

The official line was with GEANT4 and Objectivity. Not much room is left for success with an alternative product when you are alone.

The best tactic had to be a mixture of sociology, technicalities and very hard work.

Strong support from PAW and GEANT3 users

Strong support from HP (workstations + manpower)

In November we were ready for a first ROOT show

Java is announced (problem?)

Page 32: My Journey through Scientific Computing

1996: Technicalities

Histogram classes (3 versions) + Minuit

Hand-written Streamers

From PAW/KUIP style to CINT

Collection classes (STL, templates, hesitations...)

ATLFAST++ (fast simulation of ATLAS based on ROOT)

Letter from the director of computing against the use of ROOT by experiments (except NA49). Problem for ALICE.

LHC++ official alternative to ROOT

Page 33: My Journey through Scientific Computing

1997: Technicalities++

ROOTCINT-generated Streamers with parsing of C++ header files, including user classes.

Many improvements and new packages

Automatic class documentation system (THTML), first manual and tutorials.

gAlice converted to AliRoot

Interest from RHIC (Phobos and Star)

CHEP97 in Berlin with focus on GEANT4, Objectivity, LHC++, JAS

Page 34: My Journey through Scientific Computing

1998: work & smile

RUN II projects at FNAL:

Data Analysis and Visualization

Data Formats and storage

ROOT competing with HistoScope, JAS, LHC++

CHEP98 (September) Intercontinental Chicago

ROOT selected by FNAL, followed by RHIC

Vital decision for ROOT, thanks FermiLab !!!

But official support at CERN only in 2002

Page 35: My Journey through Scientific Computing

ROOT Trees vs Objectivity

Compared to the best OODBMS candidate in 1995 (Objectivity), ROOT supports a persistent class that may be a subset of the transient class.

ROOT supports compression (typical factors 3 to 6), file portability and access in heterogeneous networks.

ROOT supports branch splitting, which drastically increases read performance.

It is based on classical system files and does not require the nightmare of a central data base.

ROOT TTreeCache allows efficient access to LAN and WAN files.
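Why splitting helps when reading can be sketched with plain Python: if each attribute of the events is stored in its own compressed buffer, a job that needs one attribute decodes only that buffer. This is an illustration of the principle only; it is not the ROOT file format or the TTree API, and the event content and function names are invented for the sketch.

```python
import struct
import zlib

# Illustrative only: each event has three attributes, and "splitting"
# stores one buffer per attribute (a stand-in for a branch).
events = [{"px": 0.1 * i, "py": 0.2 * i, "ntracks": i % 5} for i in range(1000)]

def write_split(events):
    """Return one compressed buffer per attribute."""
    n = len(events)
    px = struct.pack(f">{n}d", *(e["px"] for e in events))
    py = struct.pack(f">{n}d", *(e["py"] for e in events))
    nt = struct.pack(f">{n}i", *(e["ntracks"] for e in events))
    return {
        "px": zlib.compress(px),
        "py": zlib.compress(py),
        "ntracks": zlib.compress(nt),
    }

def read_branch_ntracks(branches, n_events):
    """Decode only the 'ntracks' buffer; the other buffers are never touched."""
    raw = zlib.decompress(branches["ntracks"])
    return list(struct.unpack(f">{n_events}i", raw))

branches = write_split(events)
ntracks = read_branch_ntracks(branches, len(events))
bytes_read = len(branches["ntracks"])
bytes_total = sum(len(buf) for buf in branches.values())
```

Here `bytes_read` is a fraction of `bytes_total`, and grouping values of the same type together also tends to compress better than interleaving them event by event.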

Page 36: My Journey through Scientific Computing

Input/Output: Major Steps

parallel merge

TreeCache

member-wise streaming for STL collections<T*>

member-wise streaming for TClonesArray

automatic streamers from dictionary with StreamerInfos in self-describing files

streamers generated by rootcint

User-written streamers filling TBuffer

Page 37: My Journey through Scientific Computing

ROOT evolution

No time to discuss the creation/evolution of the 110 ROOT shared libs/packages.

ROOT has gradually evolved from a data storage, analysis and visualization system to a more general software environment, totally replacing what was known before as CERNLIB.

This has been possible thanks to MANY contributors from experiments, labs or people working in other fields.

In the following few slides, I show the current big systems assigned to at least one developer.

Page 38: My Journey through Scientific Computing

Code repository and Management

CMZ -> CVS -> SVN -> GIT

Make -> cmake

Build and test infrastructure

Distribution

Forum, mails, Savannah -> JIRA

Documentation, User Guide

Page 39: My Journey through Scientific Computing

CINT -> CLING

CINT, originally developed by Masa Goto (HP/Taligent/Japan) in 1991, has been gradually upgraded over the years.

Work is in progress to replace CINT by CLING, based on the CLANG C++ compiler from Apple.

With CLING many C++ limitations of CINT will be eliminated, and full support for C++11 will be provided at the command line and in scripts via a JIT compiler.

Page 40: My Journey through Scientific Computing

2-D Graphics

Extremely important area that requires a lot of effort to support a large number of styles and options.

Support for graphics on screen with different back-ends for Linux, Macs/iPad, Windows, Android, etc.

Support for many output formats (.ps, .eps, .pdf, .tex, .gif, .png, .jpg, .C, .root)

Support for interaction, picking, reflexion, etc.

Page 41: My Journey through Scientific Computing

3D Graphics

Again many back-ends: X, OpenGL, GDK/Windows, Cocoa/Mac

Event viewing

Output drivers

User Interfaces

Page 42: My Journey through Scientific Computing


Geometry package

Originally from GEANT3, then considerable developments.

Many interfaces: to/from GEANT3 and GEANT4

New developments (parallelism, thread safety and vectorization) in the context of the GEANT4+5 project


Page 43: My Journey through Scientific Computing

User Interface

UI library for X, GL, GDK, Cocoa, iPad

Pre-defined menus, pop-up, pull-down

Interface builder

C++ automatic code generator

Qt interface

Page 44: My Journey through Scientific Computing

Maths & Stats

Mathematical functions

Matrix packages

Random number generators

MINUIT, Minuit2, ...

RooStats, RooFit

TMVA, ...

Page 45: My Journey through Scientific Computing

PyRoot/Python

Full-time work to support an easy Python interface

Old interface based on CINT

New interface based on CLING

Page 46: My Journey through Scientific Computing

PROOF

Use multi-process techniques to speed up interactive analysis.

Evolved, and again evolving, to be used via automatic interfaces to the GRID systems

PROOF-Lite is interesting for multi-core laptops.

Page 47: My Journey through Scientific Computing

Systems in 2030 ?

OS & compilers: 100 MLOC

Frameworks like ROOT, Geant5: 20 MLOC

Experiment software: 50 MLOC

End-user analysis software: 1 MLOC

Hardware: multi-level parallel machines (10000x1000x1000), GRIDs, clouds on demand

Networks: 100 Gbit/s, 10 Tbit/s

Disks: 1000 PB

RAM: 10 TB

Page 48: My Journey through Scientific Computing

Parallelism: key points

Minimize the sequential/synchronization parts (Amdahl's law): very difficult

Run the same code (processes) on all cores to optimize the memory use (code and read-only data sharing)

Job-level is better than event-level parallelism for offline systems.

Use the good old principle of data locality to minimize cache misses.

Exploit the vector capabilities, but be careful with the new/delete/gather/scatter problem

Reorganize your code to reduce tails
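The first point can be made concrete with Amdahl's law: if a fraction s of the work is sequential, the speedup on N cores is at most 1 / (s + (1 - s) / N). The function name and the 5% figure below are illustrative assumptions, not measurements from any experiment.

```python
def amdahl_speedup(serial_fraction, n_cores):
    """Upper bound on speedup when a fraction of the work cannot be parallelized."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)

# Illustrative numbers only: 5% of the work is sequential.
for cores in (8, 64, 1024):
    print(cores, round(amdahl_speedup(0.05, cores), 2))
```

With 5% sequential work the bound can never exceed 1 / 0.05 = 20, however many cores are added, which is why shrinking the sequential and synchronization parts matters more than adding hardware.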

Page 49: My Journey through Scientific Computing

Data Structures & parallelism

Event structure: event, vertices, tracks

C++ pointers are specific to a process

Copying the structure implies a relocation of all pointers

I/O is a nightmare

Update of the structure from a different thread implies a lock/mutex
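One common way to soften the relocation problem is to link objects by their position in a collection rather than by address. The sketch below is an illustration of that idea under invented names and values, not the design of any package discussed in this talk, and it does not address the locking issue.

```python
import json

# Illustrative event model (an assumption made for this sketch).
# Tracks refer to their vertex by position in the "vertices" list
# instead of holding a direct reference (the analogue of a C++ pointer).
event = {
    "vertices": [{"z": 0.0}, {"z": 1.5}],
    "tracks": [
        {"pt": 12.0, "vertex": 0},
        {"pt": 3.4, "vertex": 1},
        {"pt": 7.1, "vertex": 1},
    ],
}

def vertex_of(event, track_index):
    """Resolve the index link of a track inside the given event."""
    return event["vertices"][event["tracks"][track_index]["vertex"]]

# A byte-level round trip stands in for I/O or for shipping the event
# to another process: the links are plain integers, so nothing is relocated.
restored = json.loads(json.dumps(event))
```

After the round trip, `vertex_of(restored, 2)` resolves inside the copy without any fix-up step, because an index stays valid wherever the structure lives.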

Page 50: My Journey through Scientific Computing

Data Structures & Locality

Sparse data structures defeat the system memory caches

Group object elements/collections such that the storage matches the traversal processes

For example: group the cross-sections for all processes per material instead of all materials per process
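The cross-section example can be sketched with index arithmetic on a flat table. The sizes and function names are illustrative assumptions. Python lists do not themselves guarantee cache behaviour, so the point is only which flat indices a traversal touches; in a compiled language with contiguous arrays, adjacent indices mean adjacent memory.

```python
# Illustrative sizes, not real cross-section tables.
N_MATERIALS = 4
N_PROCESSES = 3

def index_per_material(material, process):
    """Flat index when all processes of one material are stored together."""
    return material * N_PROCESSES + process

def index_per_process(material, process):
    """Flat index when all materials of one process are stored together."""
    return process * N_MATERIALS + material

# A particle inside one material needs the cross-sections of every process.
material = 2
grouped = [index_per_material(material, p) for p in range(N_PROCESSES)]
strided = [index_per_process(material, p) for p in range(N_PROCESSES)]
# grouped is [6, 7, 8]: consecutive entries, one contiguous read.
# strided is [2, 6, 10]: one entry per block, a jump of N_MATERIALS each time.
```

With the per-material grouping, the storage order matches the traversal order, which is the condition stated above.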

Page 51: My Journey through Scientific Computing

Tails

A killer if one has to wait for the end of col(i) before processing col(i+1)

Figure label: average number of objects in memory

Page 52: My Journey through Scientific Computing

A better solution

Pipeline of objects

Checkpoint synchronization: only 1 « gap » every N events

This type of solution is required anyhow for pile-up studies

Page 53: My Journey through Scientific Computing

Other requirements

Eliminate the sequential part (like merging files) when running jobs/threads/processes in parallel. Use parallel buffer merges instead.

Use efficient tools to monitor bottlenecks like memory allocation, cache misses, too many locks, etc.

Compare the results with the most efficient sequential version and not just the version using one single thread.

Prove that you use an 8-core node more efficiently with one job than by running 8 independent jobs (memory, CPU, I/O).

This still requires more effort. Urgent!!

Page 54: My Journey through Scientific Computing

Towards Parallel Software

A long way to go!!

There is no point in just making your code thread-safe. Use of parallel architectures requires a deep rethinking of the algorithms and dataflow.

Avoid committees!! Use small teams instead, with well-focused projects, well-defined milestones, reference benchmarks and test suites.

One such project is GEANT4+5, launched 2 years ago. We are starting to have very nice results. It is now led by Federico Carminati.

But there is still a long way to go to adapt existing software (or write radically new software) for the emerging parallel systems.

Page 55: My Journey through Scientific Computing

A long journey

I had a fantastic opportunity to work in a great environment and in contact with many, many colleagues and users.

I had the chance to follow the evolution of physics via many experiments and to understand a bit better the experiments' expectations for simulation and analysis.

I had the freedom to propose, implement and support a list of widely used tools.

Page 56: My Journey through Scientific Computing

A long journey++

I continue to work at CERN as an honorary member.

I am following the developments of GEANT4+5 and, of course, am very interested in the current developments with ROOT.

I had the opportunity to start my own physics project (top left logo), my main occupation, where I feel free to think & work in crazy directions.

Page 57: My Journey through Scientific Computing

Thanks

Many thanks to the many people met in this lab over the past decades.

Many thanks to FermiLab for the strong support in 1998 and then for a strong contribution to the development of the ROOT system.

Thanks to Pushpa Bhat for inviting me to give this talk and suggesting the theme and the title.