
October 21, 2005, The Korean Physical Society
YuChul Yang (ycyang@mail.knu.ac.kr)

The Current Status of CDF Grid

양유철*, 한대희, 공대정, 김지은, 서준석, 장성현, 조기현, 오영도, MIAN Shabeer, AHMAD KHAN Adil, 김동희 (Kyungpook National University)

김수봉, 김현수, 문창성, 이영장, 전은주, 정지은, 주경광 (Seoul National University)

유인태, 이재승, 조일성, 최동혁 (Sungkyunkwan University)


Introduction to CDF Computing

Developed in 2001-2002 to respond to the experiment's greatly increased need for computational and data handling resources to deal with Run II.

One of the first large-scale cluster approaches to user computing for general analysis.

Greatly increased the CPU power and data available to physicists.

CDF Grid via CAF, DCAF, SAM and JIM
☞ DCAF (Decentralized CDF Analysis Farm)
☞ SAM (Sequential Access via Metadata): real data handling system
☞ JIM (Job and Information Management): resource broker


Outline

CAF
Central Analysis Farm: a large central computing resource at Fermilab, based on Linux cluster farms with a simple job management scheme.

DCAF
Decentralized CDF Analysis Farm: we extended the above model, including its command-line interface and GUI, to manage and work with remote resources.

Grid
We are now in the process of adapting and converting our workflow to the Grid.


Environment on CAF

All basic CDF software is pre-installed on CAF.

Authentication via Kerberos
☞ Jobs run via mapped accounts, with the actual user authenticated through a special principal
☞ For database and data handling access, the remote user's ID is passed on through lookup of the actual user via the special principal

The user's analysis environment comes over in a tarball: there is no need to pre-register, and users are not restricted to submitting only certain jobs.

The job returns results to the user via secure ftp/rcp, controlled by the user script and principal.
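As an illustration of the tarball idea only, here is a minimal Python sketch of packing a user's working directory before submission. It is not the CAF submission tool: the function name `pack_environment` and the directory layout are hypothetical, and only the standard-library `tarfile` module is used.

```python
import tarfile
from pathlib import Path


def pack_environment(src_dir: Path, tarball: Path) -> list[str]:
    """Pack every regular file under src_dir into a gzip-compressed tarball.

    Returns the sorted member names so the caller can check what will be shipped.
    Hypothetical helper for illustration; the real CAF client is not shown here.
    """
    with tarfile.open(tarball, "w:gz") as tar:
        for path in sorted(src_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to src_dir so the job unpacks cleanly.
                tar.add(path, arcname=path.relative_to(src_dir).as_posix())
    # Re-open the archive and list what was actually written.
    with tarfile.open(tarball, "r:gz") as tar:
        return sorted(tar.getnames())
```

The tarball should be written outside `src_dir`, otherwise the archive would try to include itself.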


In 2005, 50% of the analysis farm capacity is outside FNAL

Distributed clusters in Korea, Taiwan, Japan, Italy, Germany, Spain, UK, USA and Canada


Current DCAF approach

Cluster technology (CAF = “Central Analysis Farm”) extended to remote sites (DCAFs = Decentralized CDF Analysis Farms)

Multiple batch systems supported: converting from the FBSNG system to Condor on all DCAFs

SAM data handling system required for offsite DCAFs


Current CDF Dedicated Resources

Source: http://www-cdf.fnal.gov/internal/fastnavigator/fastnavigator.html (2005/Oct/17)


Detail of KorCAF resources

TYPE             HOST                  CPU                RAM   HDD     NO
head node        cluster46.knu.ac.kr   AMD MP2000 * 2     2G    80G     1
SAM station      cluster67.knu.ac.kr   Pentium 4 2.4G     1G    80G     1
submission node  cluster52.knu.ac.kr   Pentium 4 2.4G     1G    80G     1
worker node      cluster39~cluster73 (21), cluster102~cluster114 (13), cluster122~cluster130 (9)
                                       AMD MP2000 * 2     2G    80G     4
                                       AMD MP2200 * 2     1G    80G     2
                                       AMD MP2800 * 2     2G    80G     11
                                       AMD MP2800 * 2     2G    250G    2
                                       Pentium 4 2.4G     1G    80G     15
                                       Xeon 3.G * 2       2G    80G     9
Total                                  75 CPU (173.9GHz)  73G   4020G   46
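The totals quoted for KorCAF can be cross-checked from the per-row figures. The sketch below assumes that "* 2" means two CPUs per node and that RAM and HDD are given per node; both are readings of the table, not statements from the slide.

```python
# Rows transcribed from the KorCAF table:
# (description, CPUs per node, RAM in GB per node, HDD in GB per node, node count)
# Assumption: "* 2" denotes a dual-CPU node; RAM and HDD are per-node values.
ROWS = [
    ("head node, AMD MP2000 * 2", 2, 2, 80, 1),
    ("SAM station, Pentium 4 2.4G", 1, 1, 80, 1),
    ("submission node, Pentium 4 2.4G", 1, 1, 80, 1),
    ("worker, AMD MP2000 * 2", 2, 2, 80, 4),
    ("worker, AMD MP2200 * 2", 2, 1, 80, 2),
    ("worker, AMD MP2800 * 2", 2, 2, 80, 11),
    ("worker, AMD MP2800 * 2 (250G disk)", 2, 2, 250, 2),
    ("worker, Pentium 4 2.4G", 1, 1, 80, 15),
    ("worker, Xeon * 2", 2, 2, 80, 9),
]


def totals(rows):
    """Sum node count, CPU count, RAM (GB) and HDD (GB) over all rows."""
    nodes = sum(n for _, _, _, _, n in rows)
    cpus = sum(c * n for _, c, _, _, n in rows)
    ram_gb = sum(r * n for _, _, r, _, n in rows)
    hdd_gb = sum(h * n for _, _, _, h, n in rows)
    return nodes, cpus, ram_gb, hdd_gb


if __name__ == "__main__":
    print(totals(ROWS))
```

Under these assumptions the sums reproduce the Total row (46 nodes, 75 CPUs, 73G RAM, 4020G HDD), and the 43 worker rows match the 21 + 13 + 9 hosts listed.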


Storage Upgrade status

CPU                   RAM   HDD     NO
Current                     0.6TB
Opteron dual (2005)   2G    4TB     1
Xeon dual (2005)      1G    1TB     1
Total                       5.6TB   2

Now converting to the Condor batch system.

cdfsoft
Installed products: 4.11.1, 4.11.2, 4.8.4, 4.9.1, 4.9.1hpt3, 5.2.0, 5.3.0, 5.3.1, 5.3.3, 5.3.3_nt, 5.3.4, development
Installed binary products: 4.11.2, 5.3.1, 5.3.3, 5.3.3_nt, 5.3.4


CAF GUI & Monitoring System

Select farm

Process type

Submit status

User script, I/O file location

Data access

http://cluster46.knu.ac.kr/condorcaf


Functionality for User

Feature                            Status
Self-contained user interface      Yes
Runs arbitrary user code           Yes
Automatic identity management      Yes
Network delivery of results        Yes
Input and output data handling     Yes
Batch system priority management   Yes
Automatic choice of farm           Not yet
Negotiation of resources           Not yet
Runs on arbitrary grid resources   Not yet

Grid


Luminosity and Data Volume

Expectations are for continued high-volume growth as luminosity and data logging rate continue to improve :

Luminosity on target to reach goal of 2.5x present rate.

Data logging rate will increase to 25-40 MB/s in 2005

Rate will further increase to 60 MB/s in FY 2006
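To give a feel for these rates, the sketch below converts a logging rate in MB/s into a daily volume. It assumes continuous logging at the quoted rate for a full 24 hours and decimal units (1 TB = 10^6 MB), so the results are upper bounds rather than figures from the talk.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tb_per_day(rate_mb_per_s: float) -> float:
    """Volume in TB written in one day of uninterrupted logging at the given rate.

    Assumption: decimal units (1 TB = 1e6 MB) and 100% uptime.
    """
    return rate_mb_per_s * SECONDS_PER_DAY / 1e6


if __name__ == "__main__":
    for rate in (25, 40, 60):  # MB/s values quoted on the slide
        print(f"{rate} MB/s -> {tb_per_day(rate):.3f} TB/day")
```

At 60 MB/s this works out to about 5.2 TB per day if the rate were sustained around the clock.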



Total Computing Requirements

                    Input Conditions               Resulting Requirements
           Fiscal   Int L   Evts     Peak rate     Ana   Reco  Disk  Tape I/O  Tape Vol
           Year     fb^-1   x 10^9   MB/s   Hz     THz   THz   PB    GB/s      PB
Actual     2003     0.3     0.6      20     80     1.5   0.5   0.2   0.2       0.4
           2004     0.7     1.1      20     80     4.0   0.7   0.3   0.5       1.0
Estimated  2005     1.2     2.4      35     220    7.2   1.0   0.7   0.9       2.0
           2006     2.7     4.7      60     360    16    1.4   1.2   1.9       3.3
           2007     4.4     7.1      60     360    26    2.8   1.8   3.0       4.9

Analysis CPU, disk, tape needs scale with number of events.

FNAL portion of analysis CPU assumed at roughly 50% beyond 2005.
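The two remarks above can be checked against the table with a few lines of Python. The sketch takes the Evts and Ana columns as transcribed, computes the analysis CPU needed per 10^9 events, and applies the "roughly 50%" FNAL share to the years after 2005. Treating the share as exactly 0.5 is an assumption made for illustration.

```python
# (fiscal year, events in units of 1e9, analysis CPU in THz), from the table
REQUIREMENTS = [
    (2003, 0.6, 1.5),
    (2004, 1.1, 4.0),
    (2005, 2.4, 7.2),
    (2006, 4.7, 16.0),
    (2007, 7.1, 26.0),
]

FNAL_FRACTION = 0.5  # assumption: "roughly 50%" taken as exactly one half


def thz_per_billion_events(rows):
    """Analysis CPU (THz) divided by event count (1e9), per fiscal year."""
    return {year: ana / evts for year, evts, ana in rows}


def fnal_analysis_thz(rows, fraction=FNAL_FRACTION, after=2005):
    """FNAL share of analysis CPU for fiscal years later than `after`."""
    return {year: ana * fraction for year, _, ana in rows if year > after}


if __name__ == "__main__":
    for year, ratio in thz_per_billion_events(REQUIREMENTS).items():
        print(year, round(ratio, 2))
    print(fnal_analysis_thz(REQUIREMENTS))
```

The ratio stays between about 2.5 and 3.7 THz per 10^9 events across all five years, which is consistent with the scaling remark, and the FNAL share comes to 8 THz in 2006 and 13 THz in 2007 under the stated assumption.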


Movement to Grid

It is the worldwide trend for HEP experiments.

Need to take advantage of global innovations and resources.

CDF still has a lot of data to be analyzed.

Cannot continue to expand dedicated resources.

Use Grid


Activities for CDF Grid

Testing various approaches to using Grid resources (Grid3/OSG and LCG)

Adapt the CAF infrastructure to run on top of the Grid using Condor glide-ins

Use direct submission via CAF interface to OSG and LCG

Use SAMGrid/JIM sandboxing as an alternate way to deliver experiment + user software

Combine DCAFs with Grid resources
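The glide-in approach can be pictured with a toy model: placeholder jobs ("pilots") are sent to Grid sites, and once a pilot is running it takes real user jobs from the home queue. The sketch below is a conceptual illustration only. It does not use or imitate any Condor API, and the names `Pilot`, `max_jobs`, and `dispatch` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Pilot:
    """Toy stand-in for a glide-in running on a Grid worker node."""
    site: str
    max_jobs: int  # hypothetical limit on user jobs a pilot runs before exiting
    ran: list = field(default_factory=list)

    def run(self, queue: deque) -> None:
        # Late binding: user jobs are assigned only after the pilot has started.
        while queue and len(self.ran) < self.max_jobs:
            self.ran.append(queue.popleft())


def dispatch(jobs, pilots):
    """Let each pilot drain the shared queue in turn.

    Returns (assignments, leftover), where assignments is a list of
    (site, jobs_run) pairs and leftover holds jobs no pilot picked up.
    """
    queue = deque(jobs)
    for pilot in pilots:
        pilot.run(queue)
    return [(p.site, p.ran) for p in pilots], list(queue)


if __name__ == "__main__":
    pilots = [Pilot("OSG", 2), Pilot("LCG", 1)]
    print(dispatch(["job-a", "job-b", "job-c", "job-d"], pilots))
```

The point of the design is that the user queue stays at the home site, while the Grid only ever sees generic pilot jobs.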


Conclusions

CDF has successfully deployed a global computing environment (DCAFs) for user analysis.

A large portion (50%) of the total CPU resources of the experiment is now provided offsite through a combination of DCAFs and other clusters. KorCAF (the DCAF in Korea) is switching to the Condor batch system.

Active work is in progress to build bridges to true Grid methods and protocols, which provide a path to the future.


Backup


Abstract

CDF is a large-scale collaborative experiment in particle physics currently taking data at the Fermilab Tevatron. As a running experiment, it generates a large amount of physics data that requires processing for user analysis. The collaboration has developed techniques for such analysis and the related simulations based on distributed clusters at several locations throughout the world. We will describe the evolution of CDF's global computing approach, which exceeded 5 THz of aggregate CPU capability during the past year, and its plans for putting increasing amounts of user analysis and simulation onto the grid.


CDF Data Analysis Flow

Stages shown in the flow diagram: CDF, Level-3 Trigger, Tape Storage, Production Farm, Central Analysis Farm