David Adams ATLAS DIAL Distributed Interactive Analysis of Large datasets David Adams BNL March 25, 2003 CHEP 2003 Data Analysis Environment and Visualization

David Adams

ATLAS

DIAL

Distributed Interactive Analysis of Large datasets

David Adams

BNL

March 25, 2003

CHEP 2003

Data Analysis Environment and Visualization

Contents

Goals of DIAL

What is DIAL?

Design

• Applications

• Schedulers

• Datasets

Status

Development plans

GRID requirements

Goals of DIAL

1. Demonstrate the feasibility of interactive analysis of large datasets

• Large means too big for interactive analysis on a single CPU

2. Set requirements for GRID services

• Datasets, schedulers, jobs, resource discovery, authentication, allocation, ...

3. Provide ATLAS with analysis tool

• For current and upcoming data challenges

What is DIAL?

Distributed

• Data and processing

Interactive

• Prompt response (seconds rather than hours)

Analysis of

• Fill histograms, select events, …

Large datasets

• Any event data (not just ntuples or tag)

What is DIAL? (cont)

DIAL provides a connection between

• Interactive analysis framework

– Fitting, presentation graphics, …

– E.g. ROOT

• and Data processing application

– E.g. athena for ATLAS

– Natural for the data of interest

DIAL distributes processing

• Among sites, farms, nodes

• To provide user with desired response time

[Figure: DIAL sits between interactive analysis (e.g. ROOT, JAS, ...) and distributed processing running a data-specific application. Labeled elements: Dataset, Scheduler, Job.]

Design

DIAL has the following components

• Dataset describing the data of interest

– Organized into events

• Application

– Event loop providing access to the data

• Task

– Result to fill for each event

– Code to process each event

• Scheduler

– Distributes processing and combines results
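
The relationship between the four components can be pictured with a minimal Python sketch. It is an illustration only and not the DIAL API: the class names mirror the slide, but every method, field and the toy event format (plain dicts with a `pt` value) are assumptions, and the "scheduler" here runs its sub-jobs one after another in a single process.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Dataset:
    """Data of interest, organized into events (toy events are plain dicts)."""
    events: List[dict]


@dataclass
class Task:
    """Empty result to fill for each event, plus the code that fills it."""
    result: Dict[str, float]
    code: Callable[[dict, Dict[str, float]], None]


class Application:
    """Event loop providing access to the data."""

    def run(self, task: Task, dataset: Dataset) -> Dict[str, float]:
        result = dict(task.result)  # work on a copy of the task's empty result
        for event in dataset.events:
            task.code(event, result)
        return result


class Scheduler:
    """Distributes processing and combines results (sequentially here)."""

    def __init__(self, n_jobs: int) -> None:
        self.n_jobs = n_jobs

    def submit(self, app: Application, task: Task, dataset: Dataset) -> Dict[str, float]:
        # Split the events into n_jobs interleaved sub-datasets.
        subsets = [Dataset(dataset.events[i::self.n_jobs]) for i in range(self.n_jobs)]
        combined: Dict[str, float] = {}
        for sub in subsets:
            partial = app.run(task, sub)
            for key, value in partial.items():
                combined[key] = combined.get(key, 0.0) + value  # merge by summing
        return combined


def count_and_sum(event: dict, result: Dict[str, float]) -> None:
    """Toy task code: count events and sum a made-up 'pt' quantity."""
    result["n_events"] += 1
    result["sum_pt"] += event["pt"]


def demo() -> Dict[str, float]:
    dataset = Dataset([{"pt": float(i)} for i in range(10)])
    task = Task(result={"n_events": 0.0, "sum_pt": 0.0}, code=count_and_sum)
    return Scheduler(n_jobs=3).submit(Application(), task, dataset)
```

Summing works as a merge step here because the toy result holds only counters; a histogram result would be merged bin by bin in the same spirit.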

[Figure: DIAL components and job flow. Boxes: User Analysis (e.g. ROOT), Application (e.g. athena), Task (Result, Code), Dataset 1, Dataset 2, Scheduler, Job 1, Job 2, Result. Numbered arrows: 1. Create or locate; 2. select; 3. Create or select; 4. select; 5. submit(app,tsk,ds); 6. splitDataset; 7. create; 8. run(app,tsk,ds1) and run(app,tsk,ds2); 9. fill; 10. gather.]

Design (cont)

Sequence diagrams follow

1. User creates a task made up of

• Event selection

• Two histograms

• Code to fill these

2. User submits a job (application, task and dataset) to an existing scheduler

3. Grid scheduler uses site schedulers to process a job

Dial sequence 1: Creating a task

[Sequence diagram. Participants: Actor, Result res, EventSel sel, TH1 hist1, Adapter<TH1> histprod1, TH1 hist2, Adapter<TH1> histprod2, Text code, Task tsk, XmlElement tskxml.]

1. Create empty result: create()

2. Add event selector: create(), insert(sel)

3. Add first histogram: create(...), create(hist1), insert("h1", &histprod1)

4. Add second histogram: create(...), create(hist2), insert("h2", &histprod2)

5. Fetch code: create("mycode.cxx")

6. Create task: create(res, code)

7. Create task XML: xml() : &tskxml, create(...), to_xml_text() : "<Task>...</Task>"
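
The task-creation sequence can be mimicked in a few lines of Python. This is a sketch under stated assumptions rather than the DIAL classes themselves: `Histogram` is a stand-in for a TH1 wrapped by an adapter, the event-selection expression is made up, and the XML layout below `<Task>` is invented for illustration (the slide only shows that the task serializes to `<Task>...</Task>`).

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Histogram:
    """Stand-in for a TH1 wrapped by an adapter; only the binning is kept."""
    nbins: int
    low: float
    high: float


@dataclass
class Result:
    """Empty result: an optional event selector plus named histograms."""
    selector: Optional[str] = None
    histograms: Dict[str, Histogram] = field(default_factory=dict)

    def insert_selector(self, expression: str) -> None:
        self.selector = expression

    def insert(self, name: str, hist: Histogram) -> None:
        self.histograms[name] = hist


@dataclass
class Task:
    result: Result
    code: str  # source text used to fill the result for each event

    def to_xml_text(self) -> str:
        # Hypothetical XML layout; only the <Task> root comes from the slide.
        root = ET.Element("Task")
        res = ET.SubElement(root, "Result")
        if self.result.selector is not None:
            ET.SubElement(res, "EventSel").text = self.result.selector
        for name, h in self.result.histograms.items():
            ET.SubElement(res, "Hist", name=name, nbins=str(h.nbins),
                          low=str(h.low), high=str(h.high))
        ET.SubElement(root, "Code").text = self.code
        return ET.tostring(root, encoding="unicode")


def build_task(code_text: str) -> Task:
    res = Result()                               # create empty result
    res.insert_selector("n_tracks > 2")          # add event selector (made-up cut)
    res.insert("h1", Histogram(50, 0.0, 100.0))  # add first histogram
    res.insert("h2", Histogram(20, -5.0, 5.0))   # add second histogram
    return Task(res, code_text)                  # create task from result and code
```

Calling `build_task(...)` followed by `to_xml_text()` reproduces the order of steps in the diagram: result, selector, histograms, code, task, XML.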

Dial sequence 2: Submitting a job

[Sequence diagram. Participants: Actor, Application app, Task tsk, DSCatalog, Dataset ds, Scheduler abssch, JobId jobid, XJob xjob, Job jobcopy.]

1. Choose application: create("athena")

2. Create task: create(tskxml)

3. Select dataset: select(dssql) : dsxml, create(dsxml)

4. Submit job: submit(app, tsk, ds) : jobid, create(...), create(jobid, ...), create(xjob)

5. Check job status: job(jobid) : jobcopy, status() : (INVALID | INITIALIZING | RUNNING | FAILED | DONE | KILLED), start_time() : t1, stop_time() : t2, event_count() : 123, result() : res
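
The submit-then-inspect pattern can be sketched as follows. The six status values are taken from the diagram; everything else is an assumption made for illustration: the `LocalScheduler` class, the synchronous execution inside `submit`, and the toy task (a function returning a number per event) are not part of DIAL.

```python
import enum
import itertools
from dataclasses import dataclass
from typing import Dict, Optional


class Status(enum.Enum):
    INVALID = "INVALID"
    INITIALIZING = "INITIALIZING"
    RUNNING = "RUNNING"
    FAILED = "FAILED"
    DONE = "DONE"
    KILLED = "KILLED"


@dataclass
class Job:
    job_id: int
    status: Status = Status.INITIALIZING
    event_count: int = 0
    result: Optional[dict] = None


class LocalScheduler:
    """Toy scheduler that runs each job synchronously inside submit()."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._jobs: Dict[int, Job] = {}

    def submit(self, app: str, task, dataset) -> int:
        job = Job(next(self._ids))  # assign a job ID
        self._jobs[job.job_id] = job
        job.status = Status.RUNNING
        try:
            total = 0
            for event in dataset:
                total += task(event)
                job.event_count += 1
            job.result = {"total": total}
            job.status = Status.DONE
        except Exception:
            job.status = Status.FAILED
        return job.job_id

    def job(self, job_id: int) -> Job:
        # Hand back a copy, as job(jobid) : jobcopy in the diagram suggests.
        j = self._jobs[job_id]
        return Job(j.job_id, j.status, j.event_count, j.result)
```

Returning a copy keeps the caller's view a snapshot; the caller asks again to see progress, which matches an interactive session polling a running job.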

Dial sequence 3: Grid scheduler

[Sequence diagram. Participants: Actor, GridScheduler, JobId jobid, Dataset ds, DatasetList fdss, DatasetIter ifds, Dataset fds, ClientScheduler ssched; at the remote site: ServerScheduler, SiteScheduler. ClientScheduler and ServerScheduler exchange XML.]

1. Job submitted: submit(app, tsk, ds) : jobid

2. Assign job ID: create(...)

3. Split dataset: split_files() : fdss, create(...), begin() : ifds

4. Loop over sub-datasets until ++ifds == fdss.end(): op*() : fds, filenames() : fnames, find_site(fnames) : ssched, create()

5. Submit job for each sub-dataset: submit(app, tsk, fds) : sjobid, forwarded to the remote site as submit(...) : sjobid, then record_job(jobid, sjobid, ssched)
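
A compact Python sketch of the grid-scheduler loop follows. The method names `split_files`, `find_site` and the idea of recording sub-job IDs come from the diagram; the rest is assumed for illustration, in particular the `file_site` dictionary standing in for a replica catalog and the in-process `SiteScheduler` standing in for the client-server link to a remote site.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class SiteScheduler:
    """Toy site scheduler: accepts sub-jobs and returns site-local job IDs."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.submitted: List[List[str]] = []

    def submit(self, app: str, task: str, files: List[str]) -> int:
        self.submitted.append(files)
        return len(self.submitted)  # site-local sub-job ID


class GridScheduler:
    def __init__(self, file_site: Dict[str, str],
                 sites: Dict[str, SiteScheduler]) -> None:
        self.file_site = file_site  # hypothetical catalog: file name -> site name
        self.sites = sites
        self.records: Dict[int, List[Tuple[str, int]]] = defaultdict(list)
        self._next_id = 0

    def split_files(self, files: List[str]) -> List[List[str]]:
        # One sub-dataset per site, so each sub-job runs where its files are.
        by_site: Dict[str, List[str]] = defaultdict(list)
        for f in files:
            by_site[self.file_site[f]].append(f)
        return list(by_site.values())

    def find_site(self, files: List[str]) -> SiteScheduler:
        return self.sites[self.file_site[files[0]]]

    def submit(self, app: str, task: str, files: List[str]) -> int:
        self._next_id += 1  # assign job ID
        job_id = self._next_id
        for sub_files in self.split_files(files):  # loop over sub-datasets
            site = self.find_site(sub_files)
            sub_id = site.submit(app, task, sub_files)  # submit job for each
            self.records[job_id].append((site.name, sub_id))  # record the sub-job
        return job_id
```

The recorded (site, sub-job ID) pairs are what a later status or result request would walk through to gather the pieces of the job.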

Applications

Current application specification is

• Name

– E.g. athena

• Version

– E.g. 6.10.01

• List of shared libraries

– E.g. libRawData, libInnerDetectorReco

Applications (cont)

Each DIAL compute node provides an application description database

• File-based

– Location specified by environment variable

• Indexed by application name and version

• Application description includes

– Location of executable

– Run time environment (shared lib path, …)

– Command to build shared library from task code

• Defined by ChildScheduler

– Different scheduler could change conventions
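
A file-based database of this kind could look like the following Python sketch. The slides do not give the variable name, directory layout or file format, so all of these are assumptions: the environment variable `DIAL_APPDB`, the `<name>/<version>.json` layout, the JSON keys, and the paths and build command in the demo entry are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def lookup_application(name: str, version: str,
                       env_var: str = "DIAL_APPDB") -> dict:
    """Read <db>/<name>/<version>.json from the directory named by env_var.

    The variable name and file layout are hypothetical, chosen for this sketch.
    """
    db_dir = Path(os.environ[env_var])
    with open(db_dir / name / f"{version}.json") as f:
        return json.load(f)


def make_demo_db() -> str:
    """Create a throwaway database holding one made-up athena entry."""
    db_dir = Path(tempfile.mkdtemp())
    entry = {
        "executable": "/opt/atlas/6.10.01/bin/athena",  # made-up path
        "environment": {"LD_LIBRARY_PATH": "/opt/atlas/6.10.01/lib"},
        "build_task_command": "g++ -shared -fPIC -o libtask.so task.cxx",
    }
    (db_dir / "athena").mkdir()
    (db_dir / "athena" / "6.10.01.json").write_text(json.dumps(entry))
    return str(db_dir)
```

A node would set the variable once and every scheduler process on it would resolve (name, version) pairs through `lookup_application`, for example `lookup_application("athena", "6.10.01")`.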

Schedulers

A DIAL scheduler provides means to

• Submit a job

• Terminate a job

• Monitor a job

– Status

– Events processed

– Partial results

• Verify availability of an application

• Install and verify the presence of a task for a given application

Schedulers (cont)

Schedulers form a hierarchy

• Corresponding to that of compute nodes

– Grid, site, farm, node

• Each scheduler splits job into sub-jobs and distributes these over lower-level schedulers

• Lowest level ChildScheduler starts processes to carry out the sub-jobs

• Scheduler concatenates results for its sub-jobs

• User may enter the hierarchy at any level
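
The split-distribute-concatenate recursion can be illustrated with a small Python sketch. It assumes a common `submit(task, events)` method on every level, which is an invention for this example; only the name `ChildScheduler` and the grid/site/farm/node layering come from the slides, and the sub-jobs run in-process rather than as separate processes.

```python
from typing import List, Sequence


class ChildScheduler:
    """Lowest level: actually processes the events (here, in-process)."""

    def submit(self, task, events: Sequence) -> List:
        return [task(e) for e in events]


class SplittingScheduler:
    """Grid, site or farm level: splits a job over lower-level schedulers."""

    def __init__(self, children: List) -> None:
        self.children = children

    def submit(self, task, events: Sequence) -> List:
        n = len(self.children)
        chunk = -(-len(events) // n)  # ceiling division
        results: List = []
        for i, child in enumerate(self.children):
            sub_events = events[i * chunk:(i + 1) * chunk]
            if sub_events:
                # Concatenate the results of the sub-jobs in order.
                results.extend(child.submit(task, sub_events))
        return results


def make_farm() -> SplittingScheduler:
    return SplittingScheduler([ChildScheduler(), ChildScheduler()])


def make_site() -> SplittingScheduler:
    return SplittingScheduler([make_farm(), make_farm()])


def build_hierarchy() -> SplittingScheduler:
    return SplittingScheduler([make_site(), make_site()])  # grid level
```

Because every level exposes the same `submit` call, a user can hand a job to the grid-level object or directly to one farm, which is the sense in which the hierarchy can be entered at any level.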

Schedulers (cont)

Schedulers communicate using client-server

• Between processes, nodes, sites

• User constructs a client scheduler specifying

– Remote node

– Name for remote scheduler

• Server process on remote machines

– Starts schedulers and assigns them names

– Passes requests from clients to the named scheduler

• Not yet implemented

– Communication protocols not established

DIAL scheduler hierarchy

[Figure: four tiers labeled Grid portal, Site gateway, Farm gateway and Farm node, holding a GridScheduler, SiteScheduler, FarmScheduler and ChildScheduler respectively. Each tier has a ServerScheduler, and ClientSchedulers connect one tier to the next. The user holds a ClientScheduler and chooses any one of these tiers.]

Datasets

Datasets specify event data to be processed

Datasets provide the following

• List of event identifiers

• Content

– E.g. raw data, refit tracks, cone=0.3 jets, …

• Means to locate the data

– List of logical files where data can be found

– Mapping from event ID and content to a file and the location in that file where the data may be found

– Example follows

Example dataset with mapping to files

[Figure: data objects arranged by event and by content (Raw, Clusters, Tracks, Jets, Electrons, AOD), grouped into files. Legend: File, Data object.]
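
The three things a dataset provides (event identifiers, content, and a way to locate the data) map naturally onto a small data structure. The Python sketch below is an assumption-laden illustration, not the dataset package's interface: the `Location` type, the method names and the file names are invented.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Location:
    logical_file: str
    entry: int  # position of the data object inside the file


@dataclass
class Dataset:
    event_ids: List[int]
    content: Set[str]
    mapping: Dict[Tuple[int, str], Location]  # (event ID, content) -> location

    def logical_files(self) -> List[str]:
        """List of logical files where the data can be found."""
        return sorted({loc.logical_file for loc in self.mapping.values()})

    def locate(self, event_id: int, content: str) -> Location:
        return self.mapping[(event_id, content)]


def make_example() -> Dataset:
    """Three events; raw data in one made-up file, tracks and jets in another."""
    mapping: Dict[Tuple[int, str], Location] = {}
    for i, event_id in enumerate([101, 102, 103]):
        mapping[(event_id, "Raw")] = Location("raw_0.root", i)
        mapping[(event_id, "Tracks")] = Location("reco_0.root", i)
        mapping[(event_id, "Jets")] = Location("reco_0.root", i)
    return Dataset([101, 102, 103], {"Raw", "Tracks", "Jets"}, mapping)
```

With this layout, `locate(102, "Jets")` answers the question the slide poses: which file, and where in that file, holds a given kind of data for a given event.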

Datasets (cont)

User may specify content of interest

• Dataset plus this content restriction is another dataset

• Event data for the new dataset located in a subset of the files required for the original

• Only this subset required for processing

Example dataset with content selection

[Figure: same layout of data objects by event and by content (Raw, Clusters, Tracks, Jets, Electrons, AOD), with the files holding the selected content highlighted. Legend: File, Data object, Selected file.]
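
Content restriction is easy to demonstrate on a toy mapping. In the Python sketch below a dataset is reduced to a dictionary from (event ID, content) to a logical file name; that representation, the function names and the file names are assumptions made for the example.

```python
from typing import Dict, Set, Tuple

Mapping = Dict[Tuple[int, str], str]  # (event ID, content) -> logical file


def restrict_content(mapping: Mapping, wanted: Set[str]) -> Mapping:
    """Return the mapping of the dataset restricted to the wanted content."""
    return {key: f for key, f in mapping.items() if key[1] in wanted}


def files_needed(mapping: Mapping) -> Set[str]:
    """Files that must be available to process the dataset."""
    return set(mapping.values())


# Made-up example: two events, three kinds of content, four files.
FULL: Mapping = {
    (1, "Raw"): "raw_0.root", (2, "Raw"): "raw_1.root",
    (1, "Tracks"): "reco_0.root", (2, "Tracks"): "reco_0.root",
    (1, "AOD"): "aod_0.root", (2, "AOD"): "aod_0.root",
}
```

Restricting `FULL` to `{"AOD"}` yields a mapping of the same type, so the restricted dataset is again a dataset, and `files_needed` on it returns only the AOD file rather than all four.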

Datasets (cont)

Distributed analysis requires means to divide a dataset into sub-datasets

• Sub-dataset is a dataset

• Do not split data from any one event

• Split along file boundaries

– Jobs can be assigned where files are already present

– Split most likely done at grid level

• May assign different events from one file to different jobs to speed processing

– Split likely done at farm level

Example sub-dataset with content selection

[Figure: same layout of data objects by event and by content (Raw, Clusters, Tracks, Jets, Electrons, AOD), with one selected file and a subset of events highlighted. Legend: File, Data object, Selected file, Selected events.]
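
The two kinds of split can be sketched in Python. The example simplifies a dataset to a dictionary from event ID to the single file holding that event, which is an assumption (a real dataset may spread one event's content over several files); the function names are also invented. Neither split ever divides the data of one event.

```python
from collections import defaultdict
from typing import Dict, List

# A toy dataset: event ID -> logical file holding that event's data.
EventFiles = Dict[int, str]


def split_by_file(dataset: EventFiles) -> List[EventFiles]:
    """Grid-level split: one sub-dataset per file, so jobs can go to the files."""
    by_file: Dict[str, EventFiles] = defaultdict(dict)
    for event_id, fname in dataset.items():
        by_file[fname][event_id] = fname
    return [by_file[f] for f in sorted(by_file)]


def split_by_events(dataset: EventFiles, n_jobs: int) -> List[EventFiles]:
    """Farm-level split: share the events of a sub-dataset among n_jobs jobs."""
    ids = sorted(dataset)
    parts = [ids[i::n_jobs] for i in range(n_jobs)]
    return [{e: dataset[e] for e in part} for part in parts if part]
```

Each function returns objects of the same type as its input, reflecting the point that a sub-dataset is itself a dataset and can be split again at the next level.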

Status

All DIAL components in place

• http://www.usatlas.bnl.gov/~dladams/dial

• But scheduler is very simple

– Only local ChildScheduler is implemented

– Grid, site, farm and client-server schedulers not yet implemented

Datasets implemented as a separate system

• http://www.usatlas.bnl.gov/~dladams/dataset

• Only concrete dataset is ATLAS AthenaRoot

– Holds Monte Carlo generator information

Status (cont)

DIAL and dataset classes imported to ROOT

• ROOT can be used as user interface

– All DIAL and dataset classes and methods available at command prompt

– DIAL and dataset libraries must be loaded

• Import done with ACLiC

• Only preliminary testing done

• Need to add adapter for TH1 and any other classes of interest

DIAL status (cont)

No application integrated to process jobs

• Except test program dialproc can be used to count events

• In ATLAS natural thing is to define a DIAL algorithm to run in athena

– However ATLAS is not yet able to persist reconstructed data

• Perhaps a ROOT backend to process ntuples?

– Or is this better handled with PROOF?

– Or use PROOF to implement a farm scheduler?

Development plans

(Items in red required for useful ATLAS tool)

Schedulers

• Add client-server schedulers

• Farm scheduler

– Allows large-scale test

• Site and grid schedulers

– GRID integration

– Interact with dataset, file and replica catalogs

– Authentication and authorization

Development plans (cont)

Datasets

• Interface to ATLAS POOL event collections

– expected in summer

• ROOT ntuples ??

Applications

• Athena for ATLAS

• ROOT ??

Analysis environment

• Import classes into LCG/SEAL? (Python)

• JAS? (java binding?)

GRID requirements

Identify components and services that can be shared with

• Other distributed interactive analysis projects

– PROOF, JAS, …

• Distributed batch projects

– Production

– Analysis

• Non-HEP event-oriented problems

– Data organized into a collection of “events” that are each processed in the same way

GRID requirements (cont)

Candidates for shared components include

• Dataset

– Events

– Content

– File mapping

– Splitting

• Job

– Specification (application, task, response time)

– Splitting

– Merging results

– Monitoring

GRID requirements (cont)

• Scheduler

– Available applications and tasks

– Job submission

– Job status including partial results

• Application

– Specification

– Installation

• Authentication and authorization

• Resource location and allocation

– Data, processing and matching

GRID requirements (cont)

Difference with batch processing is latency

• Interactive system provides means for user to specify maximum acceptable response time

• All actions must take place within this time

– Locate data and resources

– Splitting and matchmaking

– Job submission

– Gathering of results

• Longer latency for first pass over a dataset

– Record state for later passes

– Still must be able to adjust to changing conditions
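
One way to see what a maximum response time implies is a back-of-the-envelope estimate of how many parallel sub-jobs are needed. The Python sketch below is an illustrative model and not something stated in the slides: it assumes equal-sized sub-jobs, a known processing time per event, and a single fixed overhead covering data location, splitting, matchmaking, submission and gathering of results.

```python
import math


def jobs_needed(n_events: int, sec_per_event: float,
                max_response_sec: float, overhead_sec: float) -> int:
    """Smallest number of equal parallel sub-jobs that meets the response time.

    Simplified model: total time = overhead + (n_events * sec_per_event) / n_jobs.
    """
    budget = max_response_sec - overhead_sec
    if budget <= 0:
        raise ValueError("fixed overhead alone exceeds the requested response time")
    return max(1, math.ceil(n_events * sec_per_event / budget))
```

For instance, 1000 events at 0.5 s each with a 60 s limit and 10 s of overhead need 10 sub-jobs under this model. If recorded state from a first pass lowers the overhead on later passes, the same call with a smaller `overhead_sec` shows fewer sub-jobs are needed for the same limit.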

GRID requirements (cont)

Avoid sharp division between interactive and batch resources

• Sharing implies more available resources for both

• Interactive use varies significantly

– Time of day

– Time to the next conference

– Discovery of interesting events

• Interactive request must be able to preempt long-running batch jobs

– But allocation determined by sites, experiments, …