Tevfik Kosar
Computer Sciences Department
University of Wisconsin-Madison
[email protected]
http://www.cs.wisc.edu/condor
Managing and Scheduling Data Placement (DaP) Requests
www.cs.wisc.edu/condor
Outline
› Motivation
› DaP Scheduler
› Case Study: DAGMan
› Conclusions
Demand for Storage
› Applications require access to larger and larger amounts of data
  • Database systems
  • Multimedia applications
  • Scientific applications
    • E.g. High Energy Physics & Computational Genomics
  • Currently terabytes, soon petabytes of data
Is Remote access good enough?
› Huge amounts of data (mostly on tape)
› Large number of users
› Distance / low bandwidth
› Different platforms
› Scalability and efficiency concerns
=> Middleware is required
Two approaches
› Move job/application to the data
  • Less common
  • Insufficient computational power on storage site
  • Not efficient
  • Does not scale
› Move data to the job/application
Move data to the Job
[Diagram components: huge tape library (terabytes), remote staging area, WAN, local storage area (e.g. local disk, NeST server), LAN, compute cluster]
Main Issues
› 1. Insufficient local storage area
› 2. CPU should not wait much for I/O
› 3. Crash Recovery
› 4. Different Platforms & Protocols
› 5. Make it simple
Data Placement Scheduler (DaPS)
› Intelligently manages and schedules Data Placement (DaP) activities/jobs
› DaPS is to DaP jobs what Condor is to computational jobs
› Just submit a bunch of DaP jobs and then relax
DaPS Architecture
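[Diagram: DaPS clients send requests (Req.) to the DaPS server, which is drawn with Accept, Exec. and Sched. components, a request queue, and a buffer. The server issues Get, Put, and third-party transfer operations against local and remote storage: GridFTP servers, a NeST server, an SRB server, an SRM server, and local disk.]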
DaPS Client Interface
› Command line: dap_submit <submit file>
› API: dapclient_lib.a, dapclient_interface.h
DaP jobs
› Defined as ClassAds
› Currently four types:
  • Reserve
  • Release
  • Transfer
  • Stage
DaP Job ClassAds

[
  Type = Reserve;
  Server = nest://turkey.cs.wisc.edu;
  Size = 100MB;
  reservation_no = 1;
  ……
]

[
  Type = Transfer;
  Src_url = srb://ghidorac.sdsc.edu/kosart.condor/x.dat;
  Dst_url = nest://turkey.cs.wisc.edu/kosart/x.dat;
  reservation_no = 1;
  ......
]
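The two records above share a simple bracketed `Key = value;` layout. As a rough illustration only, the Python sketch below renders dictionaries in that layout. The helper name `format_dap_job` is hypothetical, and the exact quoting rules and submit-file format that `dap_submit` expects are not given in the slides, so treat the output as a mirror of the slide text rather than a validated submit file.

```python
def format_dap_job(fields):
    """Render a dict as a bracketed `Key = value;` record.

    The layout only imitates the examples on the slide; the real
    DaPS submit-file syntax may differ (assumption).
    """
    lines = ["["]
    for key, value in fields.items():
        lines.append(f"  {key} = {value};")
    lines.append("]")
    return "\n".join(lines)


# Field names and values are copied from the slide's two examples.
reserve = {
    "Type": "Reserve",
    "Server": "nest://turkey.cs.wisc.edu",
    "Size": "100MB",
    "reservation_no": 1,
}
transfer = {
    "Type": "Transfer",
    "Src_url": "srb://ghidorac.sdsc.edu/kosart.condor/x.dat",
    "Dst_url": "nest://turkey.cs.wisc.edu/kosart/x.dat",
    "reservation_no": 1,
}

submit_text = format_dap_job(reserve) + "\n" + format_dap_job(transfer)
print(submit_text)
```

Both records carry the same `reservation_no`, which is how the slide ties the transfer to the space reserved for it.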
Supported Protocols
› Currently supported:
  • FTP
  • GridFTP
  • NeST (chirp)
  • SRB (Storage Resource Broker)
› Very soon:
  • SRM (Storage Resource Manager)
  • GDMP (Grid Data Management Pilot)
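One way a scheduler could pick a protocol for each end of a transfer is to look at the URL scheme, as in the `srb://` and `nest://` URLs of the ClassAd examples. The sketch below is not taken from DaPS: the function names and the lookup table are hypothetical, and the `gsiftp` scheme for GridFTP is an assumption, since the slides do not show a GridFTP URL.

```python
from urllib.parse import urlparse

# Scheme-to-protocol table for the "currently supported" list.
# `nest` and `srb` appear in the slide's URLs; `gsiftp` is an assumed
# scheme name for GridFTP and is not given in the slides.
SCHEME_TO_PROTOCOL = {
    "ftp": "FTP",
    "gsiftp": "GridFTP",
    "nest": "NeST (chirp)",
    "srb": "SRB",
}


def protocol_for(url):
    """Return the protocol name for a URL, or raise ValueError if unsupported."""
    scheme = urlparse(url).scheme.lower()
    if scheme not in SCHEME_TO_PROTOCOL:
        raise ValueError(f"unsupported protocol scheme: {scheme!r}")
    return SCHEME_TO_PROTOCOL[scheme]


def plan_transfer(src_url, dst_url):
    """Return the (source, destination) protocols a transfer would need."""
    return protocol_for(src_url), protocol_for(dst_url)


print(plan_transfer(
    "srb://ghidorac.sdsc.edu/kosart.condor/x.dat",
    "nest://turkey.cs.wisc.edu/kosart/x.dat",
))
```

Keeping the table separate from the lookup means adding SRM or GDMP later would be a one-line change in this sketch.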
Case Study: DAGMan
[Diagram: DAGMan reads a .dag file (jobs A, B, C, D) and submits jobs to the Condor job queue.]
Current DAG structure
› All jobs are assumed to be computational jobs
[Diagram: DAG with Job A at the top, Jobs B and C in the middle, and Job D at the bottom.]
Current DAG structure
› If data transfer to/from remote sites is required, this is performed via pre- and post-scripts attached to each job.
[Diagram: the same DAG of Jobs A, B, C, and D, with Job B shown wrapped by a PRE script and a POST script.]
New DAG structure
› Add DaP jobs to the DAG structure
[Diagram: the PRE / Job B / POST group is replaced by DaP nodes around Job B. Nodes shown: Reserve in & out, Transfer in, Job B, Transfer out, Release in, Release out.]
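To make the restructuring concrete, the sketch below expands one computational job into the DaP nodes named on this slide and derives a valid execution order. The dependency edges are an assumption (reserve before transfer in, transfer in before the job, both releases after the data is no longer needed); the slide's exact arrows are not recoverable from the text. The function name `expand_with_dap` is hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def expand_with_dap(job):
    """Return {node: set_of_predecessors} for one job wrapped in DaP steps.

    The edges below are assumed for illustration; they are not taken
    from the slide.
    """
    reserve = f"reserve_in_out({job})"
    transfer_in = f"transfer_in({job})"
    release_in = f"release_in({job})"
    transfer_out = f"transfer_out({job})"
    release_out = f"release_out({job})"
    return {
        reserve: set(),               # reserve space for input and output first
        transfer_in: {reserve},       # stage input into the reserved space
        job: {transfer_in},           # run the computation once input is local
        release_in: {job},            # input space is free after the job ends
        transfer_out: {job},          # ship results back to the remote site
        release_out: {transfer_out},  # output space is free after the copy
    }


graph = expand_with_dap("B")
order = list(TopologicalSorter(graph).static_order())
print(order)
```

Because releasing the input space and transferring the output both depend only on the job, a scheduler is free to run them in either order or concurrently under these assumed edges.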
New DAGMan Architecture
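[Diagram: DAGMan reads a .dag file containing computational jobs (A, B, C, D) and DaP jobs (X, Y). Computational jobs go to the Condor job queue; DaP jobs go to the DaPS job queue.]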
Conclusion
› More intelligent management of remote data transfer & staging
  • Increase local storage utilization
  • Maximize CPU throughput
Future Work
› Enhanced interaction with DAGMan
› Data Level Management instead of File Level Management
› Possible integration with Kangaroo to keep the network pipeline full
Thank You for Listening & Questions
› For more information
  • Drop by my office anytime
    • Room: 3361, Computer Science & Stats. Bldg.
  • Email to:
    • [email protected]