Introduction to Scalable Programming using Makeflow and Work Queue
Dinesh Rajan and Mike Albrecht
University of Notre Dame
October 24 and November 7, 2012


Go to: http://nd.edu/~ccl
Click “Tutorial: Introduction to Scalable Programming”


I have a standard, debugged, trusted application that runs on my laptop. One simulation runs in an hour. I have to run 100. Then I have to analyze the results, tweak the simulation, and run 100 more.

Can I get a single result faster?
Can I get more results in the same time?

Last year, I heard about this grid thing.

This year, I heard about this cloud thing.

What do I do next?

Should I port my program to MPI or Hadoop?

• Learn C / Java
• Learn MPI / Hadoop
• Re-architect
• Re-write
• Re-test
• Re-debug
• Re-certify

What if my application looks like this?

I can get as many machines on the cloud as I want!

How do I organize my application to run on those machines?

Makeflow: A Portable Workflow System


An Old Idea: Makefiles

part1 part2 part3: input.data split.py
	./split.py input.data

out1: part1 mysim.exe
	./mysim.exe part1 > out1

out2: part2 mysim.exe
	./mysim.exe part2 > out2

out3: part3 mysim.exe
	./mysim.exe part3 > out3

result: out1 out2 out3 join.py
	./join.py out1 out2 out3 > result
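Writing one rule per simulation by hand does not scale to 100 runs, so a file like this is often generated by a script. The following is a minimal sketch, not part of the original slides. It assumes the same illustrative programs as above (split.py, mysim.exe, join.py) and further assumes that split.py writes files named part1 through partN; the function name make_rules is hypothetical.

```python
def make_rules(n):
    """Return Make-style rules for an n-way split / simulate / join workflow.

    Assumptions (not from the slides): split.py writes part1..partN,
    and each simulation reads one part and writes one out file.
    """
    parts = [f"part{i}" for i in range(1, n + 1)]
    outs = [f"out{i}" for i in range(1, n + 1)]

    rules = []
    # One rule splits the input into n parts.
    rules.append(
        f"{' '.join(parts)}: input.data split.py\n"
        f"\t./split.py input.data\n"
    )
    # One independent rule per part; these are the jobs that can run in parallel.
    for part, out in zip(parts, outs):
        rules.append(
            f"{out}: {part} mysim.exe\n"
            f"\t./mysim.exe {part} > {out}\n"
        )
    # One rule joins all outputs into the final result.
    rules.append(
        f"result: {' '.join(outs)} join.py\n"
        f"\t./join.py {' '.join(outs)} > result\n"
    )
    return "\n".join(rules)


if __name__ == "__main__":
    # Print the three-part workflow shown above.
    print(make_rules(3))
```

Calling make_rules(100) yields the same structure with 100 simulation rules; the output can be redirected to a file and used as the workflow description.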


Makeflow = Make + Workflow

• Provides portability across batch systems.
• Enables parallelism (but not too much!)
• Fault tolerance at multiple scales.
• Data and resource management.

Makeflow

Local Condor SGE WorkQueue

http://www.nd.edu/~ccl/software/makeflow
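The parallelism comes from the dependency structure: a rule can run as soon as all of its source files exist, and rules that do not depend on one another can run at the same time. The sketch below illustrates that idea on the split / simulate / join example. It is not Makeflow's actual scheduler; the function name parallel_waves and the dictionary encoding of rules are assumptions made for illustration.

```python
def parallel_waves(rules, existing):
    """Group rules into waves of jobs that could run in parallel.

    rules:    dict mapping a tuple of target files to a list of source files.
    existing: set of files present before the workflow starts.
    Returns a list of waves; each wave is a sorted list of target tuples.
    """
    have = set(existing)
    pending = dict(rules)
    result = []
    while pending:
        # A rule is ready when every one of its sources already exists.
        ready = [t for t, srcs in pending.items() if all(s in have for s in srcs)]
        if not ready:
            raise ValueError("unsatisfiable rules: missing sources or a cycle")
        result.append(sorted(ready))
        # Pretend the ready rules ran: their targets now exist.
        for targets in ready:
            have.update(targets)
            del pending[targets]
    return result


RULES = {
    ("part1", "part2", "part3"): ["input.data", "split.py"],
    ("out1",): ["part1", "mysim.exe"],
    ("out2",): ["part2", "mysim.exe"],
    ("out3",): ["part3", "mysim.exe"],
    ("result",): ["out1", "out2", "out3", "join.py"],
}
EXISTING = {"input.data", "split.py", "mysim.exe", "join.py"}

if __name__ == "__main__":
    for number, wave in enumerate(parallel_waves(RULES, EXISTING), start=1):
        print(number, wave)
```

For the example this gives three waves: the split, then the three simulations together, then the join. A real system also caps how many jobs run at once, which is the "not too much" in the list above.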

Makeflow Language - Rules

• Each rule specifies:
– a set of target files to create;
– a set of source files needed to create them;
– a command that generates the target files from the source files.

part1 part2 part3: input.data split.py
	./split.py input.data

out1: part1 mysim.exe
	./mysim.exe part1 > out1

out2: part2 mysim.exe
	./mysim.exe part2 > out2

out3: part3 mysim.exe
	./mysim.exe part3 > out3

result: out1 out2 out3 join.py
	./join.py out1 out2 out3 > result

out1 : part1 mysim.exe
	mysim.exe part1 > out1

You must state all the files needed by the command.
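Listing every file matters because a job may run on a remote machine, where only the declared files are available. A simple check can catch the most common mistake, a program or data file that appears in the command but not in the rule. The sketch below is an illustrative helper, not a Makeflow feature; the function name undeclared_inputs and the known_files argument are assumptions.

```python
import shlex


def undeclared_inputs(targets, sources, command, known_files):
    """Return files named in the command that the rule does not declare.

    known_files is the set of file names the workflow knows about; it is
    how this sketch decides whether a command token refers to a file.
    """
    declared = set(targets) | set(sources)
    used = set()
    for token in shlex.split(command):
        # Handle redirections written without a space, such as ">out1".
        name = token.lstrip("<>")
        # Treat "./mysim.exe" and "mysim.exe" as the same file.
        if name.startswith("./"):
            name = name[2:]
        if name in known_files:
            used.add(name)
    return sorted(used - declared)


if __name__ == "__main__":
    known = {"mysim.exe", "part1", "out1"}
    # The rule below forgets to list mysim.exe as a source.
    print(undeclared_inputs(["out1"], ["part1"], "./mysim.exe part1 > out1", known))
```

A check like this only sees names that appear on the command line. Files the program opens internally, such as a configuration file, still have to be listed by hand.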

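Makeflow + Batch System

[Figure: a Makefile, together with local files and programs, is given to Makeflow, which submits jobs to a batch system using "makeflow -T sge" or "makeflow -T condor". Resources pictured: Private Cluster, Campus Condor Pool, Public Cloud Provider, CRC SGE Cluster.]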

Work Queue

Work Queue

Drivers

• Local
• Condor
• SGE
• Batch
• Hadoop
• WorkQueue
• Torque
• MPI-Queue
• XGrid
• Moab

How to run a Makeflow

• Run a workflow locally (multicore?)
– makeflow -T local sims.mf
• Clean up the workflow outputs:
– makeflow -c sims.mf
• Run the workflow on SGE:
– makeflow -T sge sims.mf
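When the run, analyze, tweak, and rerun cycle is scripted, these commands can be wrapped in a small helper. The sketch below uses only the options shown above (-T and -c) and assumes the makeflow program is installed and on the PATH. The function names makeflow_argv and run_makeflow are hypothetical.

```python
import subprocess


def makeflow_argv(workflow, batch_type=None, clean=False):
    """Build the argument list for one makeflow invocation.

    Only the options shown on the slide are used:
    -T <type> selects the batch system, -c cleans up workflow outputs.
    """
    argv = ["makeflow"]
    if clean:
        argv.append("-c")
    elif batch_type is not None:
        argv += ["-T", batch_type]
    argv.append(workflow)
    return argv


def run_makeflow(workflow, batch_type=None, clean=False):
    """Run makeflow; raises CalledProcessError if it exits with an error.

    Assumes the makeflow executable is on the PATH.
    """
    return subprocess.run(makeflow_argv(workflow, batch_type, clean), check=True)


if __name__ == "__main__":
    # Show the commands without running them.
    print(" ".join(makeflow_argv("sims.mf", batch_type="local")))
    print(" ".join(makeflow_argv("sims.mf", clean=True)))
    print(" ".join(makeflow_argv("sims.mf", batch_type="sge")))
```

Building the argument list separately from running it makes the commands easy to inspect before anything is submitted.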

Hands On
http://nd.edu/~ccl/software/tutorials/ndtut12/mf-tutorial.php

Practice Problems

http://nd.edu/~ccl/software/tutorials/ndtut12/mf-hw.php

1. Construct a makeflow to render a short movie featuring a Rubik’s cube

2. Launch the makeflow on both your laptop and SGE

3. Consider ways you might use Makeflow for your research