Motivation (in words)
• Large datasets
  – geographic information systems
  – bioinformatics
  – chemistry
  – physics
  – environmental modeling
• A single Dell office computer can’t handle the load, but… what if we use more than one?!
Motivation, cont’d
• We have a plethora of computers that are idle a large majority of the time
• Let’s take advantage of the hardware investment that has already been made to provide computing power to enable research tasks on traditionally computationally intractable problems
What’s the catch?
• Sounds almost too good to be true
• It is and it isn’t
  – easy: providing an environment to connect computers together … BOINC!
  – challenge: creating parallel algorithms to run on the computing environment
  – challenge: making it easy for programmers and scientists to submit work
Concepts
• Technical term: non-dedicated cluster
  – set of computers whose idle time is harnessed to process jobs
  – individual nodes in the cluster function as standalone computers
  – in layman’s terms: “let’s hook a bunch of lab computers together”
• Software environment: BOINC
  – powers many worldwide projects (e.g., SETI@home, World Community Grid, climateprediction.net)
  – step-by-step instructions (minus the details) of how to build a campus virtual supercomputer
Can we build a supercomputer?
• Yes, with campus-wide buy-in
• Let’s start on a smaller scale…
  – Here we have five computers
  – With 5 computers: Florida export in ~1 day
  – Adding additional computers is easy (<5 min setup)
  – With 40 computers: Florida export in ~2 hours
Usage scenarios
• Simplest scenario
• Non-parallel application with long runtime
• You don’t want your office or lab computer tied up running the computation
• Solution: submit your non-parallel app to the cluster using an easy-to-use web interface!
Usage scenarios
• Build a new parallel application or convert an existing application
• Four components
  – Work generator
  – The client program
  – Result validator
  – Result assimilator
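To make the four roles concrete, here is a minimal single-machine sketch in plain Java. It does not use the BOINC API; the class and method names (`PipelineSketch`, `WorkUnit`, `generateWork`, `runClient`, `validate`, `assimilate`) are hypothetical, and validating by comparing two independent runs is an assumed policy, not something stated in the slides.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class PipelineSketch {
    // Hypothetical unit of work: an inclusive sub-range of integers to sum.
    static final class WorkUnit {
        final long start;
        final long end;
        WorkUnit(long start, long end) { this.start = start; this.end = end; }
    }

    // Work generator: split [start, end] into chunks of at most chunkSize numbers.
    static List<WorkUnit> generateWork(long start, long end, long chunkSize) {
        List<WorkUnit> units = new ArrayList<>();
        for (long lo = start; lo <= end; lo += chunkSize) {
            units.add(new WorkUnit(lo, Math.min(lo + chunkSize - 1, end)));
        }
        return units;
    }

    // Client program: the computation a single node performs on one work unit.
    static BigInteger runClient(WorkUnit unit) {
        BigInteger sum = BigInteger.ZERO;
        for (long i = unit.start; i <= unit.end; i++) {
            sum = sum.add(BigInteger.valueOf(i));
        }
        return sum;
    }

    // Result validator (assumed policy): accept only if two independent runs agree.
    static boolean validate(BigInteger first, BigInteger second) {
        return first.equals(second);
    }

    // Result assimilator: fold the validated partial results into the final answer.
    static BigInteger assimilate(List<BigInteger> partials) {
        BigInteger total = BigInteger.ZERO;
        for (BigInteger p : partials) {
            total = total.add(p);
        }
        return total;
    }

    public static void main(String[] args) {
        List<BigInteger> validated = new ArrayList<>();
        for (WorkUnit unit : generateWork(1, 1000, 300)) {
            BigInteger first = runClient(unit);
            BigInteger second = runClient(unit); // stands in for a second node
            if (validate(first, second)) {
                validated.add(first);
            }
        }
        System.out.println(assimilate(validated));
    }
}
```

In a real deployment the client step would run on the cluster nodes while the other three run on the server; here everything runs in one process purely to show how the pieces hand data to each other.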
Usage scenarios
• Classroom tool
  – networking
  – databases
  – algorithms and data structures
  – parallel computing
  – hardware
Demo
• Simple example (embarrassingly parallel): calculate the sum of all the numbers between 1 and 100,000,000,000
• No modifications necessary to original Java program as long as it already reads its starting and ending numbers from the command line
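A program of the kind described might look like the sketch below. This is not the original demo code; the class name `RangeSum` and the method `sumRange` are hypothetical. One detail worth noting: the full sum from 1 to 100,000,000,000 is about 5 × 10^21, which does not fit in a Java `long`, so the sketch accumulates into a `BigInteger`. It assumes non-negative bounds no larger than `Long.MAX_VALUE / 2`.

```java
import java.math.BigInteger;

public class RangeSum {
    // Sums every integer in [start, end], inclusive.
    // Assumes 0 <= start and end <= Long.MAX_VALUE / 2.
    static BigInteger sumRange(long start, long end) {
        BigInteger total = BigInteger.ZERO;
        long partial = 0;
        for (long i = start; i <= end; i++) {
            partial += i;
            // Flush into the BigInteger before the long accumulator can overflow.
            if (partial > Long.MAX_VALUE / 2) {
                total = total.add(BigInteger.valueOf(partial));
                partial = 0;
            }
        }
        return total.add(BigInteger.valueOf(partial));
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java RangeSum <start> <end>");
            System.exit(1);
        }
        // The range comes from the command line, so each node can be handed its own slice.
        long start = Long.parseLong(args[0]);
        long end = Long.parseLong(args[1]);
        System.out.println(sumRange(start, end));
    }
}
```

Because the bounds are command-line arguments, the same program can be run unchanged on different sub-ranges (for example `java RangeSum 1 1000000` on one node and `java RangeSum 1000001 2000000` on another) and the partial sums added together afterwards.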
Getting Involved
• Become a beta tester for usage scenario 1 (i.e., the web application for uploading an app to run on the cluster)
• Suggest a project for collaboration, and I will assist in the conversion process
Thank you!
• More information about the project can be found on my faculty web page: http://faculty.samford.edu/~brtoone