
Introduction to Scalable Programming using Makeflow and Work Queue

Dinesh Rajan and Mike Albrecht
University of Notre Dame

October 24 and November 7, 2012

Go to: http://nd.edu/~ccl
Click “Tutorial: Introduction to Scalable Programming”


I have a standard, debugged, trusted application that runs on my laptop. One simulation runs in an hour. I have to run 100. Then I have to analyze the results, tweak the simulation, and run 100 more.

Can I get a single result faster? Can I get more results in the same time?

Last year, I heard about this grid thing.

This year, I heard about this cloud thing.

What do I do next?

Should I port my program to MPI or Hadoop?
• Learn C / Java
• Learn MPI / Hadoop
• Re-architect
• Re-write
• Re-test
• Re-debug
• Re-certify

What if my application looks like this?

I can get as many machines on the cloud as I want!

How do I organize my application to run on those machines?

Makeflow: A Portable Workflow System


An Old Idea: Makefiles

part1 part2 part3: input.data split.py
	./split.py input.data

out1: part1 mysim.exe
	./mysim.exe part1 >out1

out2: part2 mysim.exe
	./mysim.exe part2 >out2

out3: part3 mysim.exe
	./mysim.exe part3 >out3

result: out1 out2 out3 join.py
	./join.py out1 out2 out3 > result
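Writing three rules by hand is fine, but the motivating case has 100 simulations. A minimal sketch of generating the same rule pattern for any number of parts follows. The function name make_rules is hypothetical, and the sketch assumes split.py and join.py can handle n parts; the slides only show the three-part case.

```python
def make_rules(n):
    """Return Make-style rules for n simulation runs (illustrative only)."""
    parts = [f"part{i}" for i in range(1, n + 1)]
    outs = [f"out{i}" for i in range(1, n + 1)]
    lines = []

    # One rule splits the input into n parts.
    # Assumption: split.py knows how many parts to write.
    lines.append(f"{' '.join(parts)}: input.data split.py")
    lines.append("\t./split.py input.data")
    lines.append("")

    # One independent rule per simulation; these can run in parallel.
    for part, out in zip(parts, outs):
        lines.append(f"{out}: {part} mysim.exe")
        lines.append(f"\t./mysim.exe {part} >{out}")
        lines.append("")

    # A final rule joins all outputs into a single result.
    lines.append(f"result: {' '.join(outs)} join.py")
    lines.append(f"\t./join.py {' '.join(outs)} > result")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the three-part workflow shown on the slide.
    print(make_rules(3), end="")
```

Redirecting the output of make_rules(100) to a file would give a workflow with 100 simulation rules in the same shape as the slide.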


Makeflow = Make + Workflow

• Provides portability across batch systems.
• Enables parallelism (but not too much!)
• Fault tolerance at multiple scales.
• Data and resource management.

[Diagram: Makeflow layered over its back ends: Local, Condor, SGE, WorkQueue]

http://www.nd.edu/~ccl/software/makeflow

Makeflow Language - Rules

• Each rule specifies:
– a set of target files to create;
– a set of source files needed to create them;
– a command that generates the target files from the source files.

part1 part2 part3: input.data split.py
	./split.py input.data

out1: part1 mysim.exe
	./mysim.exe part1 >out1

out2: part2 mysim.exe
	./mysim.exe part2 >out2

out3: part3 mysim.exe
	./mysim.exe part3 >out3

result: out1 out2 out3 join.py
	./join.py out1 out2 out3 > result
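Because each rule names its targets and sources, the rules form a dependency graph, and any rules whose sources already exist can run at the same time. The sketch below groups the five rules above into "waves" of runnable rules. It is only an illustration of the idea, not Makeflow's actual scheduler; the names RULES and schedule are hypothetical.

```python
# Each rule is (targets, sources), copied from the workflow above.
RULES = [
    (["part1", "part2", "part3"], ["input.data", "split.py"]),
    (["out1"], ["part1", "mysim.exe"]),
    (["out2"], ["part2", "mysim.exe"]),
    (["out3"], ["part3", "mysim.exe"]),
    (["result"], ["out1", "out2", "out3", "join.py"]),
]


def schedule(rules, existing):
    """Group rules into waves; every rule in a wave can run in parallel."""
    available = set(existing)  # files that exist right now
    pending = list(rules)
    waves = []
    while pending:
        # A rule is ready when all of its sources are available.
        ready = [r for r in pending if all(s in available for s in r[1])]
        if not ready:
            raise ValueError("some rules can never run: missing sources")
        waves.append([targets for targets, _ in ready])
        for targets, _ in ready:
            available.update(targets)  # running a rule creates its targets
        pending = [r for r in pending if r not in ready]
    return waves


waves = schedule(RULES, ["input.data", "split.py", "mysim.exe", "join.py"])
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {wave}")
```

With these inputs the split runs first, the three simulations form the second wave, and the join runs last.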

out1: part1 mysim.exe
	./mysim.exe part1 > out1

You must state all the files needed by the command, including the program itself (here, mysim.exe).
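One way to catch a forgotten file is to compare the words in the command against the rule's declared targets and sources. The sketch below is a rough heuristic under a stated assumption: every command word that is not a flag or a redirection operator is treated as a file name. The function undeclared_files is hypothetical and is not part of Makeflow.

```python
import shlex


def undeclared_files(targets, sources, command):
    """Return command words that look like files but are not declared."""
    declared = set(targets) | set(sources)
    missing = []
    for token in shlex.split(command):
        name = token.lstrip("<>")      # ">out1" -> "out1", ">" -> ""
        if name.startswith("./"):
            name = name[2:]            # "./mysim.exe" -> "mysim.exe"
        if not name or name.startswith("-"):
            continue                   # skip bare redirections and flags
        if name not in declared and name not in missing:
            missing.append(name)
    return missing


# The rule from the slide declares everything the command uses.
print(undeclared_files(["out1"], ["part1", "mysim.exe"],
                       "./mysim.exe part1 > out1"))

# Forgetting to list the program is reported.
print(undeclared_files(["out1"], ["part1"], "./mysim.exe part1 >out1"))
```

The first call reports nothing missing; the second reports mysim.exe, which would not be sent along with the job on a remote machine.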

Makeflow + Batch System

[Diagram. Labels: Makefile; Makeflow; Local Files and Programs; Private Cluster; Campus Condor Pool; Public Cloud Provider; CRC SGE Cluster; makeflow -T sge; makeflow -T condor; Work Queue]

Drivers

• Local
• Condor
• SGE
• Batch
• Hadoop
• WorkQueue
• Torque
• MPI-Queue
• XGrid
• Moab

How to run a Makeflow

• Run a workflow locally (multicore?)
– makeflow -T local sims.mf

• Clean up the workflow outputs:
– makeflow -c sims.mf

• Run the workflow on SGE:
– makeflow -T sge sims.mf
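If the analyze, tweak, and re-run loop is driven from a script, the three commands above can be wrapped in a small helper. This is a sketch only: it uses just the -T and -c options shown on this slide, the helper names makeflow_command and run_makeflow are hypothetical, and it assumes the makeflow executable is on the PATH.

```python
import subprocess


def makeflow_command(workflow, batch_type=None, clean=False):
    """Build the argument list for one of the invocations shown above."""
    cmd = ["makeflow"]
    if clean:
        cmd.append("-c")              # clean up the workflow outputs
    elif batch_type is not None:
        cmd += ["-T", batch_type]     # e.g. "local" or "sge"
    cmd.append(workflow)
    return cmd


def run_makeflow(workflow, batch_type=None, clean=False):
    """Run makeflow and raise CalledProcessError if it fails."""
    return subprocess.run(
        makeflow_command(workflow, batch_type, clean), check=True
    )


# Show the commands without executing them.
print(" ".join(makeflow_command("sims.mf", batch_type="local")))
print(" ".join(makeflow_command("sims.mf", clean=True)))
print(" ".join(makeflow_command("sims.mf", batch_type="sge")))
```

Calling run_makeflow("sims.mf", batch_type="sge") would then submit the same workflow file to SGE without any change to the rules.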

Hands On
http://nd.edu/~ccl/software/tutorials/ndtut12/mf-tutorial.php

Practice Problems

http://nd.edu/~ccl/software/tutorials/ndtut12/mf-hw.php

1. Construct a makeflow to render a short movie featuring a Rubik’s cube

2. Launch the makeflow on both your laptop and SGE

3. Consider ways you might use Makeflow for your research