170
Introduction to the ROOT framework http://root.cern.ch René Brun CERN Do-Son school on Advanced Computing and GRID Technologies for Research Institute of Information Technology, VAST, Hanoi

Introduction to the ROOT framework root.cern.ch

Embed Size (px)

DESCRIPTION

Introduction to the ROOT framework http://root.cern.ch. Ren é Brun CERN. Do-Son school on Advanced Computing and GRID Technologies for Research Institute of Information Technology, VAST, Hanoi. What's ROOT?. ROOT Application Domains. Data Analysis & Visualization. General Framework. - PowerPoint PPT Presentation

Citation preview

Page 1: Introduction to the ROOT framework root.cern.ch

Introduction tothe ROOT framework

http://root.cern.ch

René BrunCERN

Do-Son school on Advanced Computing and GRID Technologies for Research

Institute of Information Technology, VAST, Hanoi

Page 2: Introduction to the ROOT framework root.cern.ch

René Brun 2

What's ROOT?

Page 3: Introduction to the ROOT framework root.cern.ch

René Brun 3

ROOT Application Domains

Data Storage: Local, Network

Data Analysis & Visualization

General Fram

ework

Page 4: Introduction to the ROOT framework root.cern.ch

René Brun 4

ROOT in a Nutshell The ROOT system is an Object Oriented framework

for large scale data handling applications. It is written in C++.

Provides, among others, an efficient data storage and access system designed to support

structured data sets (PetaBytes) a query system to extract data from these data sets a C++ interpreter advanced statistical analysis algorithms (multi dimensional

histogramming, fitting, minimization and cluster finding) scientific visualization tools with 2D and 3D graphics an advanced Graphical User Interface

The user interacts with ROOT via a graphical user interface, the command line or scripts

The command and scripting language is C++, thanks to the embedded CINT C++ interpreter, and large scripts can be compiled and dynamically loaded.

A Python shell is also provided.

Page 5: Introduction to the ROOT framework root.cern.ch

René Brun 5

ROOT: An Open Source Project

The project was started in 1995.The project is developed as a collaboration

between: Full time developers:

• 11 people full time at CERN (PH/SFT)• +4 developers at Fermilab/USA, Protvino , JINR/Dubna (Russia)

Large number of part-time contributors (155 in CREDITS file)A long list of users giving feedback, comments, bug

fixes and many small contributions• 2400 registered to RootForum• 10,000 posts per year

An Open Source Project, source available under the LGPL license

Page 6: Introduction to the ROOT framework root.cern.ch

René Brun 6

More information...

http://root.cern.chDownloadDocumentationTutorialsOnline HelpMailing listForum

Page 7: Introduction to the ROOT framework root.cern.ch

René Brun 7

Install ROOT

Download & extract ROOT v5.16/00 from http://root.cern.ch/root/Version516.html Set of precompiled versions and source code

Set environment export ROOTSYS=<path-to-installation> export PATH=$ROOTSYS/bin:$PATH export LD_LIBRARY_PATH=$ROOTSYS/lib:$LD_LIBRARY_PATH export DYLD_LIBRARY_PATH=$ROOTSYS/lib:

$DYLD_LIBRARY_PATH (MacOS X only) Compile source code (if chosen)

cd $ROOTSYS ./configure make More information at http://root.cern.ch/root/Install.html

Page 8: Introduction to the ROOT framework root.cern.ch

René Brun 8

ROOT Library StructureROOT libraries are a layered structure CORE classes always required: support for RTTI, basic

I/O and interpreter Optional libraries loaded when needed. Separation

between data objects and the high level classes acting on these objects. Example: a batch job uses only the histogram library, no need to link histogram painter library

Shared libraries reduce the application size and link time

Mix and match, build your own application Reduce dependencies via plug-in mechanism

Page 9: Introduction to the ROOT framework root.cern.ch

René Brun 9

Three User Interfaces

GUIwindows, buttons, menus

Command lineCINT (C++ interpreter),Python, Ruby,…

Macros, applications, libraries (C++ compiler and interpreter)

{ // example // macro...}

Page 10: Introduction to the ROOT framework root.cern.ch

CINT Interpreter

Page 11: Introduction to the ROOT framework root.cern.ch

René Brun 11

CINT in ROOTCINT is used in ROOT:

As command line interpreterAs script interpreterTo generate class dictionariesTo generate function/method calling stubsSignals/Slots with the GUI

The command line, script and programming language become the same

Large scripts can be compiled for optimal performance

Page 12: Introduction to the ROOT framework root.cern.ch

René Brun 12

Compiled versus Interpreted Why compile? Faster execution, CINT has limitations…

Why interpret? Faster Edit → Run → Check result → Edit cycles

("rapid prototyping"). Scripting is sometimes just easier.

Are Makefiles dead? No! if you build/compile a very large application Yes! ACLiC is even platform independent!

Page 13: Introduction to the ROOT framework root.cern.ch

René Brun 13

Running Code

To run function mycode() in file mycode.C:root [0] .x mycode.C

Equivalent: load file and run function:root [1] .L mycode.C

root [2] mycode()

All of CINT's commands (help):root [3] .h

Page 14: Introduction to the ROOT framework root.cern.ch

René Brun 14

Running Code

Macro: file that is interpreted by CINT (.x)

Execute with .x mymacro.C(42)

int mymacro(int value)

{

int ret = 42;

ret += value;

return ret;

}

Page 15: Introduction to the ROOT framework root.cern.ch

René Brun 15

Unnamed Macros

No functions, just statements

Execute with .x mymacro.C No functions thus no arguments

Recommend named macros!Compiler prefers it, too…

{

float ret = 0.42;

return sin(ret);

}

Page 16: Introduction to the ROOT framework root.cern.ch

René Brun 16

Named Vs. UnnamedNamed: function scope unnamed:

global

mymacro.C: unnamed.C:

Back at the prompt:

void mymacro()

{ int val = 42; }

{ int val = 42; }

root [] val

Error: Symbol val is not defined

root [] val

(int)42

Page 17: Introduction to the ROOT framework root.cern.ch

René Brun 17

Survivors: Heap Objects

obj local to function, inaccessible from outside:

Instead: create on heap, pass to outside:

pObj gone – but MyClass still where pObj pointed to! Returned by mymacro()

void mymacro()

{ MyClass obj; }

MyClass* mymacro()

{ MyClass* pObj = new MyClass();

return pObj; }

Page 18: Introduction to the ROOT framework root.cern.ch

René Brun 18

Running Code – Libraries"Library": compiled code, shared library

CINT can call its functions!

Build a library: ACLiC! "+" instead of Makefile(Automatic Compiler of Libraries for CINT)

CINT knows all its functions / types:

something(42)

.x something.C(42)++

Page 19: Introduction to the ROOT framework root.cern.ch

René Brun 19

My first session

root [0] 344+76.8(const double)4.20800000000000010e+002root [1] float x=89.7;root [2] float y=567.8;root [3] x+sqrt(y)(double)1.13528550991510710e+002root [4] float z = x+2*sqrt(y/6);root [5] z(float)1.09155929565429690e+002root [6] .q

root

root

root [0] try up and down arrows

See file $HOME/.root_hist

Page 20: Introduction to the ROOT framework root.cern.ch

René Brun 20

My second session

root [0] .x session2.Cfor N=100000, sum= 45908.6

root [1] sum(double)4.59085828512453370e+004

Root [2] r.Rndm()(Double_t)8.29029321670533560e-001

root [3] .q

root

{ int N = 100000; TRandom r; double sum = 0; for (int i=0;i<N;i++) { sum += sin(r.Rndm()); } printf("for N=%d, sum= %g\n",N,sum);}

session2.C

unnamed macroexecutes in global scope

Page 21: Introduction to the ROOT framework root.cern.ch

René Brun 21

My third session

root [0] .x session3.Cfor N=100000, sum= 45908.6

root [1] sumError: Symbol sum is not defined in current scope*** Interpreter error recovered ***

Root [2] .x session3.C(1000)for N=1000, sum= 460.311

root [3] .q

root

void session3 (int N=100000) { TRandom r; double sum = 0; for (int i=0;i<N;i++) { sum += sin(r.Rndm()); } printf("for N=%d, sum= %g\n",N,sum);}

session3.C

Named macroNormal C++ scope

rules

Page 22: Introduction to the ROOT framework root.cern.ch

René Brun 22

My third session with ACLIC

root [0] gROOT->Time();root [1] .x session4.C(10000000)for N=10000000, sum= 4.59765e+006Real time 0:00:06, CP time 6.890

root [2] .x session4.C+(10000000)

for N=10000000, sum= 4.59765e+006 Real time 0:00:09, CP time 1.062

root [3] session4(10000000)for N=10000000, sum= 4.59765e+006Real time 0:00:01, CP time 1.052

root [4] .q

#include “TRandom.h”void session4 (int N) { TRandom r; double sum = 0; for (int i=0;i<N;i++) { sum += sin(r.Rndm()); } printf("for N=%d, sum= %g\n",N,sum);}

session4.C

File session4.CAutomatically compiled

and linked by thenative compiler.

Must be C++ compliant

Page 23: Introduction to the ROOT framework root.cern.ch

René Brun 23

Macros with more than one function

root [0] .x session5.C >session5.logroot [1] .q

void session5(int N=100) { session5a(N); session5b(N); gROOT->ProcessLine(“.x session4.C+(1000)”);}void session5a(int N) { for (int i=0;i<N;i++) { printf("sqrt(%d) = %g\n",i,sqrt(i)); }}void session5b(int N) { double sum = 0; for (int i=0;i<N;i++) { sum += i; printf("sum(%d) = %g\n",i,sum); }}

session5.C

.x session5.Cexecutes the function

session5 in session5.C

root [0] .L session5.Croot [1] session5(100); >session5.logroot [2] session5b(3)sum(0) = 0sum(1) = 1sum(2) = 3

root [3] .q

use gROOT->ProcessLineto execute a macro from a

macro or from compiled code

Page 24: Introduction to the ROOT framework root.cern.ch

René Brun 24

Running a script in batch mode

root –b –q myscript.Cmyscript.C is executed by the interpreter

root –b –q myscript.C(42)same as above, but giving an argument

root –b –q myscript.C+myscript.C is compiled via ACLIC with the native

compiler and linked with the executable.

root –b –q myscript.C++g(42)same as above. Script compiled in debug mode and

an argument given.

Page 25: Introduction to the ROOT framework root.cern.ch

René Brun 25

Calling interpreter in a macro

All CINT commands can be excuted from an interpreted macro or from compiled code.gROOT->ProcessLine(“.x myscript.C”);

Page 26: Introduction to the ROOT framework root.cern.ch

René Brun 26

ROOT Tutorials & Doc

See Users Guide (pdf) atSee online Reference Manual atExecute tutorials at $ROOTSYS/tutorials

293 tutorials in the following directoriesfoam gui matrix spectrum xml hist mlp splotnet sql geom physics gl image pyroot threadfft graphics io pythia tree fit graphs math quadp ruby unuran

Page 27: Introduction to the ROOT framework root.cern.ch

René Brun 27

Generating a Dictionary

MyClass.h

MyClass.cxx rootcint –f MyDict.cxx –c MyClass.h

compile and link

libMyClass.so

MyDict.hMyDict.cxx

Page 28: Introduction to the ROOT framework root.cern.ch

René Brun 2828

HTML

Page 29: Introduction to the ROOT framework root.cern.ch

René Brun 29

Math TMVA

SPlot

Page 30: Introduction to the ROOT framework root.cern.ch

Graphics & GUI

Page 31: Introduction to the ROOT framework root.cern.ch

René Brun 31

TPad: main graphics container

Hello

Root > TLine line(.1,.9,.6,.6)

Root > line.Draw()

Root > TText text(.5,.2,”Hello”)

Root > text.Draw()

The Draw function adds the object to the list of primitives of the current pad.

If no pad exists, a pad is automatically created with a default range [0,1].

When the pad needs to be drawn or redrawn, the object Paint function is called.

Only objects derivingfrom TObject may be drawn

in a padRoot Objects or User objects

Page 32: Introduction to the ROOT framework root.cern.ch

René Brun 32

Basic Primitives

TButton

TLine TArrow TEllipse

TCurvyLine

TPaveLabel

TPave

TDiamond

TPavesText

TPolyLine

TLatex

TCrown

TMarker

TText

TCurlyArc

TBox

Page 33: Introduction to the ROOT framework root.cern.ch

René Brun 33

Full LateX

support on

screen and

postscript

TCurlyArcTCurlyLineTWavyLine

and other building blocks for Feynmann diagrams

Formula or diagrams can be

edited with the mouse

Feynman.C

latex3.C

Page 34: Introduction to the ROOT framework root.cern.ch

René Brun 34

Graphs

TGraph(n,x,y)

TCutG(n,x,y)

TGraphErrors(n,x,y,ex,ey)

TGraphAsymmErrors(n,x,y,exl,exh,eyl,eyh)

TMultiGraph

gerrors2.C

Page 35: Introduction to the ROOT framework root.cern.ch

René Brun 35

Graphics examples

Page 36: Introduction to the ROOT framework root.cern.ch

René Brun 36

Page 37: Introduction to the ROOT framework root.cern.ch

René Brun 37

Page 38: Introduction to the ROOT framework root.cern.ch

René Brun 38

Page 39: Introduction to the ROOT framework root.cern.ch

René Brun 39

Graphics (2D-3D)

“SURF”“LEGO”

TF3

TH3

TGLParametric

Page 40: Introduction to the ROOT framework root.cern.ch

René Brun 40

ASImage: Image processor

Page 41: Introduction to the ROOT framework root.cern.ch

René Brun 41

GUI (Graphical User Interface)

Page 42: Introduction to the ROOT framework root.cern.ch

René Brun 42

Canvas tool bar/menus/help

Page 43: Introduction to the ROOT framework root.cern.ch

René Brun 43

Object editor Click on any object to show

its editor

Page 44: Introduction to the ROOT framework root.cern.ch

René Brun 44

TBrowser: a dynamic browser

see $ROOTSYS/test/RootIDE

Page 45: Introduction to the ROOT framework root.cern.ch

René Brun 45

TBrowser Editor

Page 46: Introduction to the ROOT framework root.cern.ch

René Brun 46

GUI C++ code generator

When pressing ctrl+S on any widget it is saved as a C++ macro file thanks to the SavePrimitive methods implemented in all GUI classes. The generated macro can be edited and then executed via CINT

Executing the macro restores the complete original GUI as well as all created signal/slot connections in a global way

// transient frame TGTransientFrame *frame2 = new TGTransientFrame(gClient->GetRoot(),760,590); // group frame TGGroupFrame *frame3 = new TGGroupFrame(frame2,"curve");

TGRadioButton *frame4 = new TGRadioButton(frame3,"gaus",10); frame3->AddFrame(frame4);

root [0] .x example.C

Page 47: Introduction to the ROOT framework root.cern.ch

René Brun 47

The GUI builder provides GUI tools for developing user interfaces based on the ROOT GUI classes. It includes over 30 advanced widgets and an automatic C++ code generator.

The GUI Builder

Page 48: Introduction to the ROOT framework root.cern.ch

René Brun 48

More GUI Examples$ROOTSYS/tutorials/gui

$ROOTSYS/test/RootShower

$ROOTSYS/test/RootIDE

Page 49: Introduction to the ROOT framework root.cern.ch

René Brun 49

Geometry

The GEOMetry package is used to model very complex detectors (LHC). It includes

-a visualization system

-a navigator (where am I, distances, overlaps, etc)

Page 50: Introduction to the ROOT framework root.cern.ch

René Brun 50

OpenGL

see $ROOTSYS/tutorials/geom

Page 51: Introduction to the ROOT framework root.cern.ch

René Brun 51

Peak Finder + Deconvolutions

TSpectrum

Page 52: Introduction to the ROOT framework root.cern.ch

René Brun 52

Fitters

Minuit

Fumili

LinearFitter

RobustFitter

Roofit

Page 53: Introduction to the ROOT framework root.cern.ch

René Brun 53

Histograms

Analysis result: often a histogram

Value (x) vs. count (y)

Menu:View / Editor

Page 54: Introduction to the ROOT framework root.cern.ch

René Brun 54

Fitting

Analysis result: often a fit based on a histogram

Page 55: Introduction to the ROOT framework root.cern.ch

René Brun 55

FitFit = optimization in parameters,

e.g. Gaussian

For Gaussian: [0] = "Constant"[1] = "Mean"[2] = "Sigma" / "Width"

Objective: choose parameters [0], [1], [2] to get function as close as possible to histogram

2

2

]2[2

])1[(

]0[)(

x

exf

Page 56: Introduction to the ROOT framework root.cern.ch

René Brun 56

Fit Panel

To fit a histogram:

right click histogram,

"Fit Panel"

Straightforward interface for fitting!

Page 57: Introduction to the ROOT framework root.cern.ch

René Brun 57

Roofit: a powerful fitting framework

see $ROOTSYS/tutorials/fit/RoofitDemo.C

Page 58: Introduction to the ROOT framework root.cern.ch

Input/Output

Page 59: Introduction to the ROOT framework root.cern.ch

René Brun 59

Saving Data

Streaming, Reflection, TFile,

Schema Evolution

Page 60: Introduction to the ROOT framework root.cern.ch

René Brun 60

Saving Objects – the Issues

Storing aLong: How many bytes? Valid:

0x0000002a0x000000000000002a0x0000000000000000000000000000002a

which byte contains 42, which 0? Valid:char[] {42, 0, 0, 0}: little endian, e.g. Intelchar[] {0, 0, 0, 42}: big endian, e.g. PowerPC

long aLong = 42; WARNING:

platform dependent!

Page 61: Introduction to the ROOT framework root.cern.ch

René Brun 61

Platform Data Types

Fundamental data types:size is platform dependent

Store "int" on platform A0x000000000000002a

Read back on platform B – incompatible!Data loss, size differences, etc:0x000000000000002a

Page 62: Introduction to the ROOT framework root.cern.ch

René Brun 62

ROOT Basic Data Types

Solution: ROOT typedefs

Signed Unsigned sizeof [bytes]

Char_t UChar_t 1

Short_t UShort_t 2

Int_t UInt_t 4

Long64_t ULong64_t 8

Double32_t float on disk, double in RAM

Page 63: Introduction to the ROOT framework root.cern.ch

René Brun 6363

Object Oriented Concepts

Members: a “has a” relationship to the class.

Inheritance: an “is a” relationship to the class.

Class: the description of a “thing” in the system Object: instance of a class Methods: functions for a class

TObject

Jets Tracks EvNum

Momentum

Segments

Charge

Event

IsA

HasAHasA

HasA

HasAHasAHasA

Page 64: Introduction to the ROOT framework root.cern.ch

René Brun 64

Saving Objects – the Issues

Storing anObject:need to know its members + base classesneed to know "where they are"

WARNING:

platform dependent!

class TMyClass {

float fFloat;

Long64_t fLong;

};

TMyClass anObject;

Page 65: Introduction to the ROOT framework root.cern.ch

René Brun 65

Reflection

Need type description (aka reflection)

1. types, sizes, members

TMyClass is a class.

Members: "fFloat", type float, size 4 bytes "fLong", type Long64_t, size 8 bytes

class TMyClass {

float fFloat;

Long64_t fLong;

};

Page 66: Introduction to the ROOT framework root.cern.ch

René Brun 66

Reflection

Need type description

1. types, sizes, members

2. offsets in memory class TMyClass {

float fFloat;

Long64_t fLong;

};

TMyClass

Mem

ory

Add

ress

fLong

fFloat

– 16– 14– 12– 10– 8– 6– 4– 2– 0

PADDING "fFloat" is at offset 0

"fLong" is at offset 8

Page 67: Introduction to the ROOT framework root.cern.ch

René Brun 67

members memory re-order

I/O Using Reflection

TMyClass

Mem

ory

Add

ress

fLong

fFloat

– 16– 14– 12– 10– 8– 6– 4– 2– 0

PADDING

2a 00 00 00...

...00 00 00 2a

Page 68: Introduction to the ROOT framework root.cern.ch

René Brun 68

C++ Is Not Java

Lesson: need reflection!

Where from?

Java: get data members with

C++: get data members with – oops. Not part of C++.

-- discussions for next C++

Class.forName("MyClass").getFields()

Page 69: Introduction to the ROOT framework root.cern.ch

René Brun 69

Reflection For C++Program parses header files, e.g. Funky.h:

Collects reflection data for types requested in Linkdef.h

Stores it in Dict.cxx (dictionary file)

Compile Dict.cxx, link, load:C++ with reflection!

rootcint –f Dict.cxx Funky.h LinkDef.h

Page 70: Introduction to the ROOT framework root.cern.ch

René Brun 70

Reflection By Selection

LinkDef.h syntax:

#pragma link C++ class MyClass+;#pragma link C++ typedef MyType_t;#pragma link C++ function MyFunc(int);#pragma link C++ enum MyEnum;

Page 71: Introduction to the ROOT framework root.cern.ch

René Brun 71

Why So Complicated?

Simply use ACLiC:

Will create library with dictionary of all types in MyCode.cxx automatically!

.L MyCode.cxx+

Page 72: Introduction to the ROOT framework root.cern.ch

René Brun 72

Saving Objects: TFile

ROOT stores objects in TFiles:

TFile behaves like file system:

TFile has a current directory:

TFile* f = new TFile("afile.root", "NEW");

f->mkdir("dir");

f->cd("dir");

Page 73: Introduction to the ROOT framework root.cern.ch

René Brun 73

Saving Objects, Really

Given a TFile:

Write an object deriving from TObject:

"optionalName" or TObject::GetName()

Write any object (with dictionary):

TFile* f = new TFile("afile.root", "RECREATE");

object->Write("optionalName");

f->WriteObject(object, "name");

Page 74: Introduction to the ROOT framework root.cern.ch

René Brun 74

"Where Is My Histogram?"

TFile owns histograms, graphs, trees(due to historical reasons):

h automatically deleted: owned by file.

c still there.

TFile* f = new TFile("myfile.root");TH1F* h = new TH1F("h","h",10,0.,1.);h->Write();Canvas* c = new TCanvas();c->Write();delete f;

Page 75: Introduction to the ROOT framework root.cern.ch

René Brun 75

TFile as Directory: TKeys

One TKey per object:named directory entryknows what it points tolike inodes in file system

Each TFile directory is collection of TKeys

TKey: "hist1", TH1F

TFile "myfile.root"

TKey: "list", TListTKey: "hist2", TH2FTDirectory: "subdir"

TKey: "hist1", TH2D

Page 76: Introduction to the ROOT framework root.cern.ch

René Brun 76

TKeys Usage

Efficient for hundreds of objects

Inefficient for millions:lookup ≥ log(n)sizeof(TKey) on 64 bit machine: 160 bytes

Better storage / access methods available,

Page 77: Introduction to the ROOT framework root.cern.ch

René Brun 77

Risks With I/OPhysicists can loop a lot:

For each particle collision

For each particle created

For each detector module

Do something.

Physicists can loose a lot:

Run for hours…

Crash.

Everything lost.

Page 78: Introduction to the ROOT framework root.cern.ch

René Brun 78

Name Cycles

Create snapshots regularly:

MyObject;1

MyObject;2

MyObject;3

MyObject

Write() does not replace but append!but see documentation TObject::Write()

Page 79: Introduction to the ROOT framework root.cern.ch

René Brun 79

The "I" Of I/O

Reading is simple:

Remember:TFile owns histograms!file gone, histogram gone!

TFile* f = new TFile("myfile.root");TH1F* h = 0;f->GetObject("h", h);h->Draw();delete f;

Page 80: Introduction to the ROOT framework root.cern.ch

René Brun 80

Ownership And TFiles

Separate TFile and histograms:

… and h will stay around.

TFile* f = new TFile("myfile.root");TH1F* h = 0;TH1::AddDirectory(kFALSE);f->GetObject("h", h);h->Draw();delete f;

Page 81: Introduction to the ROOT framework root.cern.ch

René Brun 81

Changing Class:a Problem

Things change:

class TMyClass { float fFloat; Long64_t fLong;};

Page 82: Introduction to the ROOT framework root.cern.ch

René Brun 82

Changing Class: a Problem

Things change:

Inconsistent reflection data!

Objects written with old version cannot be read!

class TMyClass { double fFloat; Long64_t fLong;};

Page 83: Introduction to the ROOT framework root.cern.ch

René Brun 83

Solution: Schema Evolution

Support changes:

1. removed members: just skip data

2. added members: just default-initialize them (whatever the default constructor does)

Long64_t fLong;float fFloat;

file.root

float fFloat;

RAMignore

float fFloat;file.root Long64_t fLong;

float fFloat;

RAMTMyClass(): fLong(0)

Page 84: Introduction to the ROOT framework root.cern.ch

René Brun 84

Solution: Schema Evolution

Support changes:

3. members change types

Long64_t fLong;float fFloat;

file.rootLong64_t fLong;double fFloat;

RAM

float DUMMY;RAM (temporary)

convert

Page 85: Introduction to the ROOT framework root.cern.ch

René Brun 85

Supported Type Changes

Schema Evolution supports: float ↔ double, int ↔ long, etc (and pointers) TYPE* ↔ std::vector<TYPE> CLASS* ↔ TClonesArray

Adding / removing base class equivalent to adding / removing data member

Page 86: Introduction to the ROOT framework root.cern.ch

René Brun 86

Storing Reflection

Big Question: file == memory class layout?

Need to store reflection to file,see TFile::ShowStreamerInfo():

StreamerInfo for class: TH1F, version=1 TH1 BASE offset= 0 type= 0 TArrayF BASE offset= 0 type= 0

StreamerInfo for class: TH1, version=5 TNamed BASE offset= 0 type=67... Int_t fNcells offset= 0 type= 3 TAxis fXaxis offset= 0 type=61

Page 87: Introduction to the ROOT framework root.cern.ch

René Brun 87

Class Version

One TStreamerInfo for all objects of the same class layout in a file

Can have multiple versions in same file

Use version number to identify layout:

class TMyClass: public TObject {public: TMyClass(): fLong(0), fFloat(0.) {} virtual ~TMyClass() {} ... ClassDef(TMyClass,1); // example class};

Page 88: Introduction to the ROOT framework root.cern.ch

René Brun 88

Random Facts On ROOT I/O

We know exactly what's in a TFile – need no library to look at data!

ROOT files are zippedCombine contents of TFiles with $ROOTSYS/bin/hadd

Can even open TFile("http://myserver.com/afile.root") including read-what-you-need!

Nice viewer for TFile: new TBrowser

Page 89: Introduction to the ROOT framework root.cern.ch

René Brun 89

I/O

Object in Memory

Object in Memory

Streamer: No need for

transient / persistent classes

http

sockets

File on disk

Net File

Web File

XML XML File

SQL DataBase

LocalB

uffe

r

Page 90: Introduction to the ROOT framework root.cern.ch

René Brun 90

TFile / TDirectoryA TFile object may be divided in a hierarchy of

directories, like a Unix file system.Two I/O modes are supported

Key-mode (TKey). An object is identified by a name (key), like files in a Unix directory. OK to support up to a few thousand objects, like histograms, geometries, mag fields, etc.

TTree-mode to store event data, when the number of events may be millions, billions.

Page 91: Introduction to the ROOT framework root.cern.ch

René Brun 91

Self-describing filesDictionary for persistent classes written to the

file.ROOT files can be read by foreign readers Support for Backward and Forward

compatibilityFiles created in 2001 must be readable in 2015Classes (data objects) for all objects in a file

can be regenerated via TFile::MakeProject

Root >TFile f(“demo.root”);

Root > f.MakeProject(“dir”,”*”,”new++”);

Page 92: Introduction to the ROOT framework root.cern.ch

René Brun 92

Example of key mode

void keywrite() {

TFile f(“keymode.root”,”new”);

TH1F h(“hist”,”test”,100,-3,3);

h.FillRandom(“gaus”,1000);

h.Write()

}void keyRead() {

TFile f(“keymode.root”);

TH1F *h = (TH1F*)f.Get(“hist”);

h.Draw();

}

Page 93: Introduction to the ROOT framework root.cern.ch

René Brun 93

A Root file pippa.rootwith two levels of

directories

Objects in directory/pippa/DM/CJ

eg:/pippa/DM/CJ/h15

Page 94: Introduction to the ROOT framework root.cern.ch

René Brun 94

1 billion people surfing the

Web

LHC: How Much Data?

105

104

103

102

Level 1 Rate (Hz)

High Level-1 Trigger(1 MHz)

High No. ChannelsHigh Bandwidth(500 Gbit/s)

High Data Archive(5 PetaBytes/year)10 Gbits/s in Data base

LHCB

KLOE

HERA-B

CDF II

CDF

H1ZEUS

UA1

LEP

NA49ALICE

Event Size (bytes)

104 105 106

ATLASCMS

106

107

STAR

Page 95: Introduction to the ROOT framework root.cern.ch

ROOT Collection Classes

Or when one million TKeys are not enough…

Page 96: Introduction to the ROOT framework root.cern.ch

René Brun 96

Collection Classes

The ROOT collections are polymorphic

containers that hold pointers to TObjects, so:They can only hold objects that inherit from

TObjectThey return pointers to TObjects, that have to

be cast back to the correct subclass

Page 97: Introduction to the ROOT framework root.cern.ch

René Brun 97

Types Of Collections

The inheritance hierarchy of the primary collection classes

TCollection

TSeqCollection THashTable TMap

TList TOrdCollection TObjArray TBtree

TSortedList THashList TClonesArray

Page 98: Introduction to the ROOT framework root.cern.ch

René Brun 98

Collections (cont)Here is a subset of collections supported by ROOT:

TObjArray TClonesArray THashTable THashList TMap

Templated containers, e.g. STL (std::list etc)

Page 99: Introduction to the ROOT framework root.cern.ch

René Brun 99

TObjArray

A TObjArray is a collection which supports traditional array semantics via the overloading of operator[].

Objects can be directly accessed via an index. The array expands automatically when objects

are added.

Page 100: Introduction to the ROOT framework root.cern.ch

René Brun 100

TClonesArray

Array of identical classes (“clones”).

Designed for repetitive data analysis tasks: same type of objects created and deleted many times.

No comparable class in STL!

The internal data structure of a TClonesArray

Page 101: Introduction to the ROOT framework root.cern.ch

René Brun 101

Traditional Arrays

Very large number of new and delete calls in large loops like this (N(100000) x N(10000) times new/delete):

TObjArray a(10000); while (TEvent *ev = (TEvent *)next()) { for (int i = 0; i < ev->Ntracks; ++i) { a[i] = new TTrack(x,y,z,...); ... } ... a.Delete(); }

N(100000)

N(10000)

Page 102: Introduction to the ROOT framework root.cern.ch

René Brun 102

You better use a TClonesArray which reduces the number of new/delete calls to only N(10000):

TClonesArray a("TTrack", 10000); while (TEvent *ev = (TEvent *)next()) { for (int i = 0; i < ev->Ntracks; ++i) { new(a[i]) TTrack(x,y,z,...); ... } ... a.Delete(); }

Pair of new/delete calls cost about 4 μsNN(109) new/deletes will save about 1 hour.

N(100000)

N(10000)

Page 103: Introduction to the ROOT framework root.cern.ch

René Brun 103

THashTable - THashList

THashTable puts objects in one of several of its short lists. Which list to pick is defined by TObject::Hash(). Access much faster than map or list, but no order defined. Nothing similar in STL.

THashList consists of a THashTable for fast access plus a TList to define an order of the objects.

Page 104: Introduction to the ROOT framework root.cern.ch

René Brun 104

TMap

TMap implements an associative array of (key,value) pairs using a THashTable for efficient retrieval (therefore TMap does not conserve the order of the entries).

Page 105: Introduction to the ROOT framework root.cern.ch

ROOT Trees

Page 106: Introduction to the ROOT framework root.cern.ch

René Brun 106

Why Trees ?

As seen previously, any object deriving from TObject can be written to a file with an associated key with object.Write()

However each key has an overhead in the directory structure in memory (up to 160 bytes). Thus, Object.Write() is very convenient for simple objects like histograms, but not for saving one object per event.

Using TTrees, the overhead in memory is in general less than 4 bytes per entry!

Page 107: Introduction to the ROOT framework root.cern.ch

René Brun 107

Why Trees ?

Extremely efficient write once, read many ("WORM")

Designed to store >109 (HEP events) with same data structure

Load just a subset of the objects to optimize memory usage

Trees allow fast direct and random access to any entry (sequential access is the best)

Page 108: Introduction to the ROOT framework root.cern.ch

René Brun 108

Building ROOT Trees

Overview of TreesBranches

5 steps to build a TTree

Page 109: Introduction to the ROOT framework root.cern.ch

René Brun 109

Memory <--> TreeEach Node is a branch in the Tree

0123456789101112131415161718

T.Fill()

T.GetEntry(6)

T

Memory

Page 110: Introduction to the ROOT framework root.cern.ch

René Brun 110

ROOT I/O -- Split/ClusterTree version

Streamer

File

Branches

Tree in memory

Tree entries

Page 111: Introduction to the ROOT framework root.cern.ch

René Brun 111

Writing/Reading a Tree class Event : public Something {

Header fHeader;

std::list<Vertex*> fVertices;

std::vector<Track> fTracks;

TOF fTOF;

Calor *fCalor;

}

main() {

Event *event = 0;

TFile f(“demo.root”,”recreate”);

int split = 99; //maximum split

TTree *T = new TTree(“T”,”demo Tree”);

T->Branch(“event”,&event,split);

for (int ev=0;ev<1000;ev++) {

event = new Event(…);

T->Fill();

delete event;

}

t->AutoSave();

}

main() {

Event *event = 0;

TFile f(“demo.root”);

TTree *T = (TTree*)f.Get”T”);

T->SetBranchAddress(“event”,&event);

Long64_t N = T->GetEntries();

for (Long64_t ev=0;ev<N;ev++) {

T->GetEntry(ev);

// do something with event

}

}

Event.h

Write.C Read.C

Page 112: Introduction to the ROOT framework root.cern.ch

René Brun 112

Tree structure

Page 113: Introduction to the ROOT framework root.cern.ch

René Brun 113

8 Branches of T

8 leaves of branchElectrons

A double-clickto histogram

the leaf

Browsing a TTree with TBrowser

Page 114: Introduction to the ROOT framework root.cern.ch

René Brun 114

The TTreeViewer

Page 115: Introduction to the ROOT framework root.cern.ch

René Brun 115

Tree structureBranches: directories Leaves: data containersCan read a subset of all branches – speeds up

considerably the data analysis processesTree layout can be optimized for data analysisThe class for a branch is called TBranchVariables on TBranch are called leaf (yes -

TLeaf)Branches of the same TTree can be written to

separate files

Page 116: Introduction to the ROOT framework root.cern.ch

René Brun 116

Five Steps to Build a Tree

Steps:

1. Create a TFile

2. Create a TTree

3. Add TBranch to the TTree

4. Fill the tree

5. Write the file

Page 117: Introduction to the ROOT framework root.cern.ch

René Brun 117

Step 1: Create a TFile Object

Trees can be huge need file for swapping filled entries

TFile *hfile = new TFile("AFile.root");

Page 118: Introduction to the ROOT framework root.cern.ch

René Brun 118

Step 2: Create a TTree Object

The TTree constructor:Tree name (e.g. "myTree") Tree title

TTree *tree = new TTree("myTree","A Tree");

Page 119: Introduction to the ROOT framework root.cern.ch

René Brun 119

Step 3: Adding a Branch

Branch nameAddress of the pointer

to the object

Event *event = new Event(); myTree->Branch("EventBranch", &event);

Page 120: Introduction to the ROOT framework root.cern.ch

René Brun 120

Splitting a Branch

Setting the split level (default = 99)

Split level = 0 Split level = 99

tree->Branch("EvBr", &event, 64000, 0 );

Page 121: Introduction to the ROOT framework root.cern.ch

René Brun 121

Splitting (real example)

Split level = 0 Split level = 99

Page 122: Introduction to the ROOT framework root.cern.ch

René Brun 122

Splitting

Creates one branch per member – recursively Allows to browse objects that are stored in trees,

even without their library Makes same members consecutive, e.g. for object

with position in X, Y, Z, and energy E, all X are consecutive, then come Y, then Z, then E. A lot higher zip efficiency!

Fine grained branches allow fain-grained I/O - read only members that are needed, instead of full object

Supports STL containers, too!

Page 123: Introduction to the ROOT framework root.cern.ch

René Brun 123

Performance Considerations

A split branch is:Faster to read - the type does not have to be

read each timeSlower to write due to the large number of

buffers

Page 124: Introduction to the ROOT framework root.cern.ch

René Brun 124

Step 4: Fill the Tree

Assign values to the object contained in each branch

TTree::Fill() creates a new entry in the tree: snapshot of values of branches’ objects

Create a for loop

for (int e=0;e<100000;++e) { event->Generate(e); // fill event myTree->Fill(); // fill the tree }

Page 125: Introduction to the ROOT framework root.cern.ch

René Brun 125

Step 5: Write Tree To File

myTree->Write();

Page 126: Introduction to the ROOT framework root.cern.ch

René Brun 126

Example macro{ Event *myEvent = new Event(); TFile f("mytree.root"); TTree *t = new TTree("myTree","A Tree"); t->Branch("SplitEvent", &myEvent); for (int e=0;e<100000;++e) { myEvent->Generate(); t->Fill(); } t->Write();}

Page 127: Introduction to the ROOT framework root.cern.ch

René Brun 127

Reading a TTree

Looking at a treeHow to read a treeTrees, friends and chains

Page 128: Introduction to the ROOT framework root.cern.ch

René Brun 128

Looking at the Tree

TTree::Print() shows the data layout

root [] TFile f("AFile.root")root [] myTree->Print();*******************************************************************************Tree :myTree : A ROOT tree **Entries : 10 : Total = 867935 bytes File Size = 390138 ** : : Tree compression factor = 2.72 ********************************************************************************Branch :EventBranch **Entries : 10 : BranchElement (see below) **............................................................................**Br 0 :fUniqueID : **Entries : 10 : Total Size= 698 bytes One basket in memory **Baskets : 0 : Basket Size= 64000 bytes Compression= 1.00 **............................................................................*……

Page 129: Introduction to the ROOT framework root.cern.ch

René Brun 129

Looking at the Tree

TTree::Scan("leaf:leaf:….") shows the valuesroot [] myTree->Scan("fNseg:fNtrack"); > scan.txt

root [] myTree->Scan("fEvtHdr.fDate:fNtrack:fPx:fPy","", "colsize=13 precision=3 col=13:7::15.10");

******************************************************************************* Row * Instance * fEvtHdr.fDate * fNtrack * fPx * fPy ******************************************************************************** 0 * 0 * 960312 * 594 * 2.07 * 1.459911346 ** 0 * 1 * 960312 * 594 * 0.903 * -0.4093382061 ** 0 * 2 * 960312 * 594 * 0.696 * 0.3913401663 ** 0 * 3 * 960312 * 594 * -0.638 * 1.244356871 ** 0 * 4 * 960312 * 594 * -0.556 * -0.7361358404 ** 0 * 5 * 960312 * 594 * -1.57 * -0.3049036264 ** 0 * 6 * 960312 * 594 * 0.0425 * -1.006743073 ** 0 * 7 * 960312 * 594 * -0.6 * -1.895804524 *

Page 130: Introduction to the ROOT framework root.cern.ch

René Brun 130

TTree Selection Syntax

Prints the first 8 variables of the tree.

Prints all the variables of the tree.Select specific variables:

Prints the values of var1, var2 and var3.A selection can be applied in the second argument:

Prints the values of var1, var2 and var3 for the entries where var1 is exactly 0.

MyTree->Scan();

MyTree->Scan("*");

MyTree->Scan("var1:var2:var3");

MyTree->Scan("var1:var2:var3", "var1==0");

Page 131: Introduction to the ROOT framework root.cern.ch

René Brun 131

Looking at the TreeTTree::Show(entry_number) shows the values for

one entry

root [] myTree->Show(0);======> EVENT:0 EventBranch = NULL fUniqueID = 0 fBits = 50331648 [...] fNtrack = 594 fNseg = 5964 [...] fEvtHdr.fRun = 200 [...] fTracks.fPx = 2.066806, 0.903484, 0.695610, -

0.637773,... fTracks.fPy = 1.459911, -0.409338, 0.391340,

1.244357,...

Page 132: Introduction to the ROOT framework root.cern.ch

René Brun 132

How To Read a TTree

Example:

1. Open the Tfile

2. Get the TTree

TFile f("tree4.root")

TTree *t4=0;

f.GetObject("t4",t4)

OR

Page 133: Introduction to the ROOT framework root.cern.ch

René Brun 133

How to Read a TTree

3. Create a variable pointing to the dataroot [] Event *event = 0;

4. Associate a branch with the variable:root [] t4->SetBranchAddress("event_split", &event);

5. Read one entry in the TTreeroot [] t4->GetEntry(0)

root [] event->GetTracks()->First()->Dump()

==> Dumping object at: 0x0763aad0, name=Track, class=Track

fPx 0.651241 X component of the momentum

fPy 1.02466 Y component of the momentum

fPz 1.2141 Z component of the momentum

[...]

Page 134: Introduction to the ROOT framework root.cern.ch

René Brun 134

Example macro

{

Event *ev = 0;

TFile f("mytree.root");

TTree *myTree = (TTree*)f->Get("myTree");

myTree->SetBranchAddress("SplitEvent",&ev);

for (int e=0;e<100000;++e) {

myTree->GetEntry(e);

ev->Analyse();

}

}

Page 135: Introduction to the ROOT framework root.cern.ch

René Brun 135

Chains of Trees

A TChain is a collection of Trees.Same semantics for TChains and TTrees

root > .x h1chain.C root > chain.Process(“h1analysis.C”)

{ //creates a TChain to be used by the h1analysis.C class //the symbol H1 must point to a directory where the H1 data sets //have been installed TChain chain("h42"); chain.Add("$H1/dstarmb.root"); chain.Add("$H1/dstarp1a.root"); chain.Add("$H1/dstarp1b.root"); chain.Add("$H1/dstarp2.root");}

Page 136: Introduction to the ROOT framework root.cern.ch

René Brun 136

Data Volume & Organisation

100MB 1GB 10GB 1TB100GB 100TB 1PB10TB

1 1 500005000500505

TTree TChain

A TFile typically contains 1 TTreeA TChain is a collection of TTrees or/and TChainsA TChain is typically the result of a query to a file

catalog

Page 137: Introduction to the ROOT framework root.cern.ch

René Brun 137

Friends of Trees

- Adding new branches to existing tree without touching it, i.e.:

- Unrestricted access to the friend's data via virtual branch of original tree

myTree->AddFriend("ft1", "friend.root")

Page 138: Introduction to the ROOT framework root.cern.ch

René Brun 138

Tree Friends

0123456789101112131415161718

0123456789101112131415161718

0123456789101112131415161718

Public

read

Public

read

User

Write

Entry # 8

Page 139: Introduction to the ROOT framework root.cern.ch

René Brun 139

Tree Friends

TFile f1("tree1.root");tree.AddFriend("tree2", "tree2.root")tree.AddFriend("tree3", "tree3.root");tree.Draw("x:a", "k<c");tree.Draw("x:tree2.x", "sqrt(p)<b");

x

Processing timeindependent of thenumber of friendsunlike table joins

in RDBMS

Collaboration-widepublic read

Analysis groupprotected

userprivate

Page 140: Introduction to the ROOT framework root.cern.ch

René Brun 140

Summary: Reading TreesTTree is one of the most powerful collections

available for HEPExtremely efficient for huge number of data sets

with identical layoutVery easy to look at TTree - use TBrowser!Write once, read many (WORM) ideal for

experiments' dataStill: extensible, users can add their own tree as

friend

Page 141: Introduction to the ROOT framework root.cern.ch

René Brun 141

Analyzing Trees

Selectors, Proxies, PROOF

Page 142: Introduction to the ROOT framework root.cern.ch

René Brun 142

Recap

TTree efficient storage and access for huge amounts of structured data

Allows selective access of data

TTree knows its layout

Almost all HEP analyses based on TTree

Page 143: Introduction to the ROOT framework root.cern.ch

René Brun 143

TTree Data Access

Data access via TTree / TBranch is complex

Lots of code identical for all TTrees:getting tree from file, setting up branches, entry loop

Lots of code completely defined by TTree:branch names, variable types, branch ↔ variable mapping

Need to enable / disable branches "by hand"

Want common interface to allow generic "TTree based analysis infrastructure"

Page 144: Introduction to the ROOT framework root.cern.ch

René Brun 144

TTree Proxy

The solution:write analysis code using branch names

as variablesauto-declare variables from branch

structureread branches on-demand

Page 145: Introduction to the ROOT framework root.cern.ch

René Brun 145

Branch vs. Proxy

Take a simple branch:

Access via proxy in pia.C:

class ABranch { int a;};ABranch* b=0;tree->Branch("abranch", &b);

double pia() { return sqrt(abranch.a * TMath::Pi());}

Page 146: Introduction to the ROOT framework root.cern.ch

René Brun 146

Proxy Details

Analysis script somename.C must contain somename()

Put #includes into somename.h

Return type must convert to double

double pia() { return sqrt(abranch.a * TMath::Pi());}

Page 147: Introduction to the ROOT framework root.cern.ch

René Brun 147

Proxy Advantages

Very efficient: only reads leaf "a"

Can use arbitrary C++

Leaf behaves like a variable

Uses meta-information stored with TTree:branch namestypes contained in branches / leavesand their members (unrolling)

double pia() { return sqrt(abranch.a * TMath::Pi());}

Page 148: Introduction to the ROOT framework root.cern.ch

René Brun 148

Simple Proxy Analysis

TTree::Draw() runs simple analysis:

Compiles pia.C

Calls it for each event

Fills histogram named "htemp" with return value

TFile* f = new TFile("tree.root");TTree* tree = 0;f->GetObject("MyTree", tree);tree->Draw("pia.C+");

Page 149: Introduction to the ROOT framework root.cern.ch

René Brun 149

Behind The Proxy Scene

TTree::Draw("pia.C+") creates helper class

pia.C gets #included inside generatedSel!

Its data members are named like leaves,wrap access to Tree data

Can also generate Proxy via

generatedSel: public TSelector {#include "pia.C" ...};

tree->MakeProxy("MyProxy", "pia.C")

Page 150: Introduction to the ROOT framework root.cern.ch

René Brun 150

TSelector

TSelector is base for event-based analysis:

1. setup

2. analyze an entry

3. draw / save histograms

generatedSel: public TSelector

TSelector::Begin()

TSelector::Process(Long64_t entry)

TSelector::Terminate()

Page 151: Introduction to the ROOT framework root.cern.ch

René Brun 151

Extending The ProxyGiven Proxy generated for script.C (like pia.C):looks for and calls

Correspond to TSelector's functions

Can be used to set up complete analysis:fill several histograms,control event loop

void script_Begin(TTree* tree)Bool_t script_Process(Long64_t entry)void script_Terminate()

Page 152: Introduction to the ROOT framework root.cern.ch

René Brun 152

Proxy And TClonesArray

TClonesArray as branch:

Cannot loop over entries and return each:

class TGap { float ore; };TClonesArray* b = new TClonesArray("TGap");tree->Branch("gap", &b);

double singapore() { return sin(gap.ore[???]);}

Page 153: Introduction to the ROOT framework root.cern.ch

René Brun 153

Proxy And TClonesArray

TClonesArray as branch:

Implement it as part of pia_Process() instead:

class TGap { float ore; };TClonesArray* b = new TClonesArray("TGap");tree->Branch("gap", &b);

Bool_t pia_Process(Long64_t entry) { for (int i=0; i<gap.GetEntries(); ++i) hSingapore->Fill(sin(gap.ore[i])); return kTRUE;}

Page 154: Introduction to the ROOT framework root.cern.ch

René Brun 154

Extending The Proxy, Example

Need to declare hSingapore somewhere!

But pia.C is #included inside generatedSel, so:

TH1* hSingapore;void pia_Begin() { hSingapore = new TH1F(...);}Bool_t pia_Process(Long64_t entry) { hSingapore->Fill(...);}void pia_Terminate() { hSingapore->Draw();}

Page 155: Introduction to the ROOT framework root.cern.ch

ROOT on the GRID(s)

Many ways to use ROOT on a GRIDSingle laptop

Multi-Core desktopLocal area cluster

Cluster on a GRID centerClusters on GRID(s)

Page 156: Introduction to the ROOT framework root.cern.ch

René Brun 156

GRID: the ideaExploit resources available on the Net: You

take and you give.Not a new idea: Seti, Condor, etcBUT:

Can we make it transparent?What type of work?

Problem is more complex than a simple CPU distribution taskWAN data accessData managementAuthentication, time & space sharingDisk and CPU quotas

Page 157: Introduction to the ROOT framework root.cern.ch

René Brun 157

The HEP GRID100,000 computers

in 1000 locations5,000 physicists

in 1000 locations

WAN

Page 158: Introduction to the ROOT framework root.cern.ch

René Brun 158

GRID: Users profile

Few big users submittingmany long jobs (Monte Carlo,

reconstruction)They want to run many jobs in

one month

Many users submittingmany short jobs (physics

analysis)They want to run many jobs in one hour or less

Page 159: Introduction to the ROOT framework root.cern.ch

René Brun 159

Big Users

Monte Carlo jobs (one hour one day)Each job generates one file (1 GigaByte)

Reconstruction job (10 minutes -> one hour) Input from the MC job or copied from a storage

centerOutput (< input) is staged back to a storage center

Success rate (90%). If the job fails you resubmit it.

For several years, GRID projects focused effort on big users.

Page 160: Introduction to the ROOT framework root.cern.ch

René Brun 160

Small Users Scenario 1: submit one batch job to the GRID. It runs

somewhere with varying response times. Scenario 2: Use a splitter to submit many batch jobs to

process many data sets (eg CRAB, Ganga, Alien). Output data sets are merged automatically. Success rate < 90%. You see the final results only when the last job has been received and all results merged.

Scenario 3: Use PROOF (automatic splitter and merger). Success rate close to 100%. You can see intermediate feedback objects like histograms. You run from an interactive ROOT session.

Page 161: Introduction to the ROOT framework root.cern.ch

René Brun 161

Scenario 1 & 2: PROS

Job level parallelism. Conventional model. Nothing to change in user program.Initialisation phaseLoop on eventsTermination

Same job can run on laptop or on a GRID job.

Page 162: Introduction to the ROOT framework root.cern.ch

René Brun 162

Scenario 1 & 2: CONS(1)

Long tail in the jobs wall clock time distribution.

Page 163: Introduction to the ROOT framework root.cern.ch

René Brun 163

Scenario 1 & 2: CONS(2)

Can only merge output after a timecut.More data movement (input & output)Cannot have interactive feedbackTwo consecutive queries will produce

different results (problem with rare events)

Will use only one core on a multicore laptop or GRID node.

Hard to control priorities and quotas.

Page 164: Introduction to the ROOT framework root.cern.ch

René Brun 164

Scenario 3: PROS

Predictive response. Event level parallelism. Workers terminate at the same time

Process moved to data as much as possible, otherwise use network.

Interactive feedbackCan view running queries in different

ROOT sessions.Can take advantage of multicore cpus

Page 165: Introduction to the ROOT framework root.cern.ch

René Brun 165

Scenario 3: CONS

Event level parallelism. User code must follow a template: the TSelector API.

Good for a local area cluster. More difficult to put in place in a GRID collection of local area clusters.

Interactive schedulers, priority managers must be put in place.

Debugging a problem slightly more difficult.

Page 166: Introduction to the ROOT framework root.cern.ch

René Brun 166

GRID & Parallelism: 1

The user application splits the problem in N subtasks. Each task is submitted to a GRID node (minimal input, minimal output).

The GRID task can run synchronously or asynchronously. If the task fails or time-out, it can be resubmitted to another node.

One of the first and simplest use of the GRID, but not many applications in HEP.

Page 167: Introduction to the ROOT framework root.cern.ch

René Brun 167

GRID & Parallelism: 2

The typical case of Monte Carlo or reconstruction in HEP.

It requires massive data transfers between the main computing centers.

This activity has concentrated so far a very large fraction of the GRID projects and budgets.

It has been an essential step to foster coordination between hundreds of sites, improve international network bandwidths and robustness.

Page 168: Introduction to the ROOT framework root.cern.ch

René Brun 168

GRID & Parallelism: 3Distributed data analysis will be a major

challenge for the coming experiments.This is the area with thousands of people

running many different styles of queries, most of the time in a chaotic way.

The main challenges:Access to millions of data sets (eg 500 TeraBytes)Best match between execution and data locationDistributing/compiling/linking users code(a few

thousand lines) with experiment large libraries (a few million lines of code).

Simplicity of useReal time responseRobustness.

Page 169: Introduction to the ROOT framework root.cern.ch

René Brun 169

GRID & Parallelism: 3a

Currently 2 different & competing directions for distributed data analysis.Batch solution using the existing GRID

infrastructure for Monte Carlo and reconstruction programs. A front-end program partitions the problem to analyze ND data sets on NP processors.

Interactive solution PROOF. Each query is parallelized with an optimum match of execution and data location.

Page 170: Introduction to the ROOT framework root.cern.ch

René Brun 170

Parallelism: Where ?

Multi-Core CPU laptop/desktop

2(2007) 32(2012?)

Network of desktops

Local Cluster

with multi-core CPUs

GRID(s)