Introduction to Scalable Programming using Makeflow and Work Queue


Introduction to Scalable Programming using Makeflow and Work Queue. Dinesh Rajan and Mike Albrecht, University of Notre Dame. October 24 and November 7, 2012. Go to: http://nd.edu/~ccl. Click “Tutorial: Introduction to Scalable Programming”.


Welcome!
Department of Computer Science and Engineering, College of Engineering, University of Notre Dame

Introduction to Scalable Programming using Makeflow and Work Queue
Dinesh Rajan and Mike Albrecht
University of Notre Dame

October 24 and November 7, 2012

Go to: http://nd.edu/~ccl
Click "Tutorial: Introduction to Scalable Programming"

I have a standard, debugged, trusted application that runs on my laptop.
One simulation runs in an hour.
I have to run 100.
Then I have to analyze the results, tweak the simulation, and run 100 more.
Can I get a single result faster?
Can I get more results in the same time?

Last year, I heard about this grid thing.

This year, I heard about this cloud thing.
What do I do next?

Should I port my program to MPI or Hadoop?
Learn C / Java
Learn MPI / Hadoop
Re-architect
Re-write
Re-test
Re-debug
Re-certify

What if my application looks like this?
I can get as many machines on the cloud as I want!

How do I organize my application to run on those machines?

Makeflow: A Portable Workflow System

An Old Idea: Makefiles

part1 part2 part3: input.data split.py
	./split.py input.data

out1: part1 mysim.exe
	./mysim.exe part1 >out1

out2: part2 mysim.exe
	./mysim.exe part2 >out2

out3: part3 mysim.exe
	./mysim.exe part3 >out3

result: out1 out2 out3 join.py
	./join.py out1 out2 out3 > result
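
The dependency structure in the Makefile above is what lets a workflow system decide which commands may run at the same time. The following is a minimal Python sketch of that idea, not Makeflow's actual scheduler: it transcribes the five rules into a list and groups them into "waves" whose commands do not depend on each other. The names RULES and parallel_waves are hypothetical, introduced only for this illustration.

```python
# Each rule is (targets, sources, command), transcribed from the Makefile above.
RULES = [
    (["part1", "part2", "part3"], ["input.data", "split.py"], "./split.py input.data"),
    (["out1"], ["part1", "mysim.exe"], "./mysim.exe part1 >out1"),
    (["out2"], ["part2", "mysim.exe"], "./mysim.exe part2 >out2"),
    (["out3"], ["part3", "mysim.exe"], "./mysim.exe part3 >out3"),
    (["result"], ["out1", "out2", "out3", "join.py"], "./join.py out1 out2 out3 > result"),
]


def parallel_waves(rules):
    """Group rule commands into waves; commands in one wave are independent."""
    produced_by = {t: i for i, (targets, _, _) in enumerate(rules) for t in targets}
    # Assumption: any source that no rule produces already exists on disk.
    available = {s for _, sources, _ in rules for s in sources if s not in produced_by}
    pending = set(range(len(rules)))
    waves = []
    while pending:
        # A rule is ready once every one of its sources is available.
        ready = sorted(i for i in pending if all(s in available for s in rules[i][1]))
        if not ready:
            raise ValueError("cyclic or unsatisfiable dependencies")
        waves.append([rules[i][2] for i in ready])
        for i in ready:
            available.update(rules[i][0])
        pending -= set(ready)
    return waves


if __name__ == "__main__":
    for n, wave in enumerate(parallel_waves(RULES), 1):
        print(f"wave {n}: {wave}")
```

In this sketch the split runs first, the three simulations form one wave that could run on three machines at once, and the join runs last.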

Makeflow = Make + Workflow
Provides portability across batch systems.
Enables parallelism (but not too much!).
Fault tolerance at multiple scales.
Data and resource management.

[Diagram: Makeflow on top of Local, Condor, SGE, Work Queue]
http://www.nd.edu/~ccl/software/makeflow

Makeflow Language - Rules

Each rule specifies:
a set of target files to create;
a set of source files needed to create them;
a command that generates the target files from the source files.

part1 part2 part3: input.data split.py
	./split.py input.data

out1: part1 mysim.exe
	./mysim.exe part1 >out1

out2: part2 mysim.exe
	./mysim.exe part2 >out2

out3: part3 mysim.exe
	./mysim.exe part3 >out3

result: out1 out2 out3 join.py
	./join.py out1 out2 out3 > result

out1 : part1 mysim.exe
	mysim.exe part1 > out1

You must state all the files needed by the command.

sims.mf

out.10 : in.dat calib.dat sim.exe
	sim.exe -p 10 in.dat > out.10

out.20 : in.dat calib.dat sim.exe
	sim.exe -p 20 in.dat > out.20

out.30 : in.dat calib.dat sim.exe
	sim.exe -p 30 in.dat > out.30
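
Because the three sims.mf rules differ only in the parameter value, a file like this can be generated rather than typed by hand. The sketch below shows one way to do that in Python. It assumes the option is spelled -p (the dash appears to have been lost from the slide text) and that the command reads in.dat, the file named in the source list. The function names sim_rule and build_makeflow are hypothetical.

```python
def sim_rule(param, sources=("in.dat", "calib.dat", "sim.exe")):
    """Return one rule: a 'target : sources' line and a tab-indented command."""
    target = f"out.{param}"
    # Assumption: the option is '-p' and the input file is in.dat.
    command = f"sim.exe -p {param} in.dat > {target}"
    return f"{target} : {' '.join(sources)}\n\t{command}\n"


def build_makeflow(params):
    """Join one rule per parameter value, separated by blank lines."""
    return "\n".join(sim_rule(p) for p in params)


if __name__ == "__main__":
    # Redirect this output to a file such as sims.mf to use it as a workflow.
    print(build_makeflow([10, 20, 30]))
```

Every file the command touches (in.dat, calib.dat, sim.exe) stays in the source list of each generated rule, in keeping with the requirement to state all the files needed by the command.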

Makeflow + Batch System
[Diagram labels: Makefile, Local Files and Programs, Makeflow, Private Cluster, Campus Condor Pool, Public Cloud Provider, CRC SGE Cluster, Work Queue]
makeflow -T sge
makeflow -T condor

Drivers
Local, Condor, SGE, Batch, Hadoop, WorkQueue, Torque, MPI-Queue, XGrid, Moab

How to run a Makeflow
Run a workflow locally (multicore?):
	makeflow -T local sims.mf
Clean up the workflow outputs:
	makeflow -c sims.mf
Run the workflow on SGE:
	makeflow -T sge sims.mf

Hands On
http://nd.edu/~ccl/software/tutorials/ndtut12/mf-tutorial.php
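
The three invocations above differ only in their options, so they can be wrapped in a small helper when a workflow is launched from a script. This is a rough Python sketch that uses only the -T and -c options shown on the slide. The helper names makeflow_command and run_makeflow are hypothetical, and the sketch assumes the makeflow executable is on PATH.

```python
import shutil
import subprocess


def makeflow_command(workflow, batch_type=None, clean=False):
    """Build the argument list for one makeflow invocation."""
    cmd = ["makeflow"]
    if clean:
        cmd.append("-c")             # clean up the workflow outputs
    elif batch_type is not None:
        cmd += ["-T", batch_type]    # e.g. "local", "sge", "condor"
    cmd.append(workflow)
    return cmd


def run_makeflow(workflow, batch_type=None, clean=False):
    """Run makeflow if it is installed; otherwise only report the command."""
    cmd = makeflow_command(workflow, batch_type, clean)
    if shutil.which("makeflow") is None:
        print("makeflow not found on PATH; would run:", " ".join(cmd))
        return None
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the three command lines from the slide without executing them.
    print(" ".join(makeflow_command("sims.mf", batch_type="local")))
    print(" ".join(makeflow_command("sims.mf", clean=True)))
    print(" ".join(makeflow_command("sims.mf", batch_type="sge")))
```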

Practice Problems
http://nd.edu/~ccl/software/tutorials/ndtut12/mf-hw.php
Construct a makeflow to render a short movie featuring a Rubik's cube.
Launch the makeflow on both your laptop and SGE.
Consider ways you might use Makeflow for your research.