Pegasus for Automated Workflows
Derrick Kearney
HUBzero® Platform for Scientific Collaboration
Purdue University
This work is licensed under Creative Commons.
See license online: by-nc-sa/3.0
Building Blocks of Programs
[Diagram: a program built from Function1, Function2, and Function3, with Inputs and Outputs]
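The building-block idea can be sketched in a few lines of Python. The function names mirror the diagram labels, but their bodies and the order in which they are chained are placeholders chosen for illustration, not something the slides specify.

```python
def function1(x):
    """Hypothetical first building block: scale the input."""
    return x * 2


def function2(x):
    """Hypothetical second building block: shift the value."""
    return x + 3


def function3(x):
    """Hypothetical third building block: format the result."""
    return "result=" + str(x)


def program(inputs):
    """Chain the blocks so each output becomes the next block's input."""
    return [function3(function2(function1(x))) for x in inputs]


outputs = program([1, 2, 3])
print(outputs)
```

Each function only needs to agree with its neighbors on what it receives and returns, which is what lets the same blocks be rearranged into different programs.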
Types of Workflows
Sequential Workflows
Execute steps in order until all of the work has been completed
Could include activities that run in parallel
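A sequential workflow with one parallel activity might look like the sketch below. The three step functions are hypothetical stand-ins; only the shape (step 1, then a parallel step 2, then step 3) reflects the slide.

```python
from concurrent.futures import ThreadPoolExecutor


def prepare(n):
    """Hypothetical step 1: build the input data."""
    return list(range(n))


def square(x):
    """Hypothetical unit of work used inside the parallel step."""
    return x * x


def summarize(values):
    """Hypothetical step 3: reduce the results to one number."""
    return sum(values)


def run_sequential_workflow(n):
    data = prepare(n)  # step 1 must finish before step 2 starts
    with ThreadPoolExecutor(max_workers=4) as pool:
        # step 2: independent items run side by side
        squared = list(pool.map(square, data))
    return summarize(squared)  # step 3 waits for all of step 2


total = run_sequential_workflow(5)
print(total)
```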
Example: CNTBands
Science Domain: Nanoelectronics
Scientists: Lundstrom et al. (Purdue)
https://nanohub.org/resources/cntbands-ext
Types of Workflows
Wideband Workflows
Execute the same function many (1000s of) times
Massively parallel
Scatter / Gather
Sweeps
Example: Epigenomics
Science Domain: Bioinformatics
Scientists: Ben Berman et al. (USC)
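A scatter/gather sweep can be sketched with the standard library alone. The `simulate` function and its `voltage` parameter are invented for this example; a real wideband workflow would run a science code at each point, typically on remote resources rather than local threads.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(voltage):
    """Hypothetical model evaluated at a single sweep point."""
    return voltage ** 2 - 4.0 * voltage


def sweep(points, workers=8):
    """Scatter the same function over all points, then gather the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(simulate, points))  # scatter
    return dict(zip(points, results))  # gather


points = [i * 0.5 for i in range(1000)]  # 1000 sweep points
table = sweep(points)
best = min(table, key=table.get)  # point with the lowest result
print(len(table), best, table[best])
```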
Pegasus
Developed at USC
Ewa Deelman et al.
Website: pegasus.isi.edu
Open Source
Bindings for your favorite languages
Benefits:
Performance
Portability
Provenance
Data Management
Error Recovery
How does Pegasus Work?
If you can draw it, they can make it run.
[Diagram: the workflow f.a → sayhi → f.b → inquire → f.c is described as a DAX, converted to a DAG, and run on the Grid]
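The reason a drawn workflow can be run is that a DAG always admits an execution order in which every job comes after the jobs it depends on. The sketch below shows only that ordering idea using Python's standard `graphlib` module (Python 3.9+); it is not how Pegasus plans a workflow, and the dictionary layout is an assumption made for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Map each job to the set of jobs that must finish before it can start.
dependencies = {
    "sayhi": set(),        # reads f.a, which already exists
    "inquire": {"sayhi"},  # reads f.b, which sayhi produces
}

# static_order() yields jobs so that prerequisites always come first.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```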
HUBzero Infrastructure
Example Workflow
$ cat /apps/pegtut/current/bin/sayhi.sh
#!/bin/bash
# output something on stdout
echo "Hello `cat ${1}`!"
# print greeting to a file
echo "Hello `cat ${1}`!" > f.b
[Diagram: HUBzero infrastructure, with tool session containers holding sayhi.sh, inquire.sh, and f.a]
Example Workflow
$ cat /apps/pegtut/current/bin/inquire.sh
#!/bin/bash
# output something to stdout
echo "`cat ${1}` How are you?"
# print greeting to a file
echo "`cat ${1}` How are you?" > f.c
Example Workflow
$ cat f.a
pete
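To see what the two scripts produce before involving Pegasus, their file handling can be mimicked locally. This is a plain-Python stand-in for sayhi.sh and inquire.sh, run in a temporary directory; it assumes f.a contains the single line shown above and is not part of the tutorial's tool.

```python
import os
import tempfile


def sayhi(workdir, infile):
    """Mimic sayhi.sh: read a name and write 'Hello <name>!' to f.b."""
    with open(os.path.join(workdir, infile)) as fp:
        name = fp.read().rstrip("\n")  # backticks drop trailing newlines
    greeting = "Hello " + name + "!"
    with open(os.path.join(workdir, "f.b"), "w") as fp:
        fp.write(greeting + "\n")  # echo appends a newline
    return greeting


def inquire(workdir, infile):
    """Mimic inquire.sh: append ' How are you?' and write the line to f.c."""
    with open(os.path.join(workdir, infile)) as fp:
        text = fp.read().rstrip("\n")
    question = text + " How are you?"
    with open(os.path.join(workdir, "f.c"), "w") as fp:
        fp.write(question + "\n")
    return question


with tempfile.TemporaryDirectory() as workdir:
    with open(os.path.join(workdir, "f.a"), "w") as fp:
        fp.write("pete\n")
    first = sayhi(workdir, "f.a")     # f.a -> f.b
    second = inquire(workdir, "f.b")  # f.b -> f.c
    with open(os.path.join(workdir, "f.c")) as fp:
        final = fp.read()

print(first)
print(second)
```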
How does Pegasus Work?
Step 2. Convert Workflow to DAX using the Python API
import os
from Pegasus.DAX3 import *

sayhipath = '/apps/pegtut/current/bin/sayhi.sh'
inquirepath = '/apps/pegtut/current/bin/inquire.sh'

# create an abstract DAX
dax = ADAG("sayhi_inquire")
How does Pegasus Work?
Step 2. Convert Workflow to DAX – Declare files and executables to replica catalog
# Add input file to the DAX-level replica catalog
a = File("f.a")
a.addPFN(PFN("file://" + os.path.join(os.getcwd(), "f.a"), "local"))
dax.addFile(a)

# Add executables to the DAX-level replica catalog
e_sayhi = Executable(namespace="sayhi_inquire",
                     name="sayhi", version="1.0",
                     os="linux", arch="x86_64",
                     installed=False)
e_sayhi.addPFN(PFN("file://" + sayhipath, "condorpool"))
dax.addExecutable(e_sayhi)

e_inquire = Executable(namespace="sayhi_inquire",
                       name="inquire", version="1.0",
                       os="linux", arch="x86_64",
                       installed=False)
e_inquire.addPFN(PFN("file://" + inquirepath, "condorpool"))
dax.addExecutable(e_inquire)
How does Pegasus Work?
Step 2. Convert Workflow to DAX – Add jobs to the DAX
# Add the sayhi job
sayhi = Job(namespace="sayhi_inquire",
            name="sayhi", version="1.0")
sayhi.addArguments('f.a')
b = File("f.b")
sayhi.uses(a, link=Link.INPUT)
sayhi.uses(b, link=Link.OUTPUT)
dax.addJob(sayhi)

# Add the inquire job (depends on the sayhi job)
inquire = Job(namespace="sayhi_inquire",
              name="inquire", version="1.0")
inquire.addArguments('f.b')
c = File("f.c")
inquire.uses(b, link=Link.INPUT)
inquire.uses(c, link=Link.OUTPUT)
dax.addJob(inquire)
How does Pegasus Work?
Step 2. Convert Workflow to DAX – Add control-flow dependencies
# Add control-flow dependencies
dax.addDependency(Dependency(parent=sayhi, child=inquire))
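The parent/child edge declared here follows the data flow: inquire reads f.b, which sayhi writes. The sketch below derives such edges from input and output lists. It uses plain dictionaries rather than the Pegasus API, the `infer_dependencies` helper is hypothetical, and nothing here claims that Pegasus computes dependencies this way.

```python
def infer_dependencies(jobs):
    """Return (parent, child) pairs where the child reads a file the parent writes."""
    producers = {}
    for name, io in jobs.items():
        for filename in io["outputs"]:
            producers[filename] = name  # remember which job writes each file

    edges = set()
    for name, io in jobs.items():
        for filename in io["inputs"]:
            parent = producers.get(filename)
            if parent is not None and parent != name:
                edges.add((parent, name))
    return sorted(edges)


# Hypothetical description of the two jobs from this example.
jobs = {
    "sayhi": {"inputs": ["f.a"], "outputs": ["f.b"]},
    "inquire": {"inputs": ["f.b"], "outputs": ["f.c"]},
}

edges = infer_dependencies(jobs)
print(edges)
```

No job produces f.a, so it yields no edge; it is an input the workflow expects to exist already.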
How does Pegasus Work?
Step 2. Convert Workflow to DAX – Write DAX to file
# Write the DAX to file
with open('sayhiinquire.dax', 'w') as fp:
    dax.writeXML(fp)
Running the DAX
Step 3. Submit the DAX for planning and execution
$ submit pegasus-plan --dax sayhiinquire.dax
(989.0) Job Submitted at WF-DiaGrid
(989.0) DAG Running at WF-DiaGrid
…
(989.0) DAG Done at WF-DiaGrid
$ cat f.b
Hello pete!
$ cat f.c
Hello pete! How are you?
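If a script needs to wait for the workflow to finish, the status lines printed by submit can be scanned for the "DAG Done" event. The pattern below is inferred only from the three sample lines shown above, so the line format is an assumption and may not cover other messages the tool prints.

```python
import re

# Assumed format, taken from the sample output: "(<id>) <event> at <site>"
STATUS = re.compile(r"^\((?P<job>\d+\.\d+)\) (?P<event>.+) at (?P<site>\S+)$")


def parse_status(line):
    """Return job id, event, and site for a status line, or None if no match."""
    match = STATUS.match(line.strip())
    return match.groupdict() if match else None


lines = [
    "(989.0) Job Submitted at WF-DiaGrid",
    "(989.0) DAG Running at WF-DiaGrid",
    "(989.0) DAG Done at WF-DiaGrid",
]

events = [parse_status(line) for line in lines]
finished = any(e is not None and e["event"] == "DAG Done" for e in events)
print(finished)
```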