
Highly Distributed Parallel Computing

Neil Skrypuch, COSC 3P93, 3/21/2007

Overview

a network of computers all working towards a similar goal
network consists of many nodes, few servers
nodes perform computing and send results to a server
servers distribute jobs
node machines do not communicate with each other (see the sketch below)
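
The job flow described above can be made concrete with a minimal sketch of a node's main loop in Python. Everything here is an assumption for illustration: the server URL, the /job and /result endpoints, and the JSON job format are hypothetical, not part of any particular HDPC system.

```python
# Minimal sketch of an HDPC node loop (hypothetical server endpoints).
# The node repeatedly fetches a self-contained job from a central server,
# computes the result locally, and sends it back; nodes never talk to
# each other.
import json
import time
import urllib.request

SERVER = "http://example.org/hdpc"  # hypothetical central server

def fetch_job():
    """Ask the server for the next unit of work (None if none available)."""
    with urllib.request.urlopen(f"{SERVER}/job") as resp:
        job = json.load(resp)
    return job or None

def compute(job):
    """Placeholder for the actual computation, e.g. summing a range."""
    return sum(range(job["start"], job["end"]))

def submit(job_id, result):
    """Send the finished result back to the server."""
    data = json.dumps({"job_id": job_id, "result": result}).encode()
    req = urllib.request.Request(f"{SERVER}/result", data=data,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)

if __name__ == "__main__":
    while True:
        job = fetch_job()
        if job is None:
            time.sleep(60)      # no work available; poll again later
            continue
        submit(job["id"], compute(job))
```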

Pros

Relatively Simple

don't need to worry about special interconnections

don't need to worry about cluster booting

Non-Homogeneous Network

can work across different computer architectures, OSes, etc

computers can be of varying speeds
doesn't require the fastest or most expensive computers
computers can be distributed anywhere in the world

Infrastructure

infrastructure for HDPC already exists almost everywhere
anyone with a network of computers is already ready for HDPC
lots of programs already exist that take advantage of HDPC

Expansion

expansion is painless
there are no special constraints on the “shape” of the network
not fast enough yet? keep adding more computers until it is

Resilience to Failure

it doesn't matter if one or more nodes die
only the reliability of the central server(s) matters

Cons

Suitability

not all problems are suited to HDPC
highly communication bound problems are a poor fit for HDPC

Server Dependence

central server dependence is a double-edged sword

if the central server becomes unavailable, everything grinds to a halt

Network (In)security

how to verify if a client should be allowed to join the network?

protecting data sent over the network
verifying integrity and authenticity of data sent over the network

Network (Un)reliability

nodes that temporarily lose connectivity may be useless until they reconnect

Dealing With the Issues

Server Dependence

the central server need not be a single server
the server itself may be clustered
countless ways to cluster servers

Clustering With a Database

allow nodes to talk directly to the database (see the sketch below)
cluster the database over multiple servers
multi-master replication
single master replication
lots more...
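
As a rough sketch of nodes talking directly to the database, the following uses sqlite3 as a stand-in for what would in practice be a replicated server-side database (single- or multi-master). The jobs table layout, status values, and helper names are assumptions made for illustration.

```python
# Sketch of nodes claiming jobs straight from a shared database.
# sqlite3 stands in for a replicated server-side database.
import sqlite3

def setup(conn):
    """Create the (assumed) jobs table and seed it with some work."""
    conn.execute("CREATE TABLE IF NOT EXISTS jobs ("
                 "id INTEGER PRIMARY KEY, payload TEXT, "
                 "status TEXT DEFAULT 'pending', result TEXT)")
    conn.executemany("INSERT INTO jobs (payload) VALUES (?)",
                     [(f"chunk-{i}",) for i in range(5)])
    conn.commit()

def claim_job(conn):
    """Atomically claim one pending job, or return None if none are left."""
    row = conn.execute(
        "SELECT id, payload FROM jobs WHERE status = 'pending' LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    job_id, payload = row
    # The WHERE clause guards against another node claiming it first.
    cur = conn.execute(
        "UPDATE jobs SET status = 'running' WHERE id = ? AND status = 'pending'",
        (job_id,),
    )
    conn.commit()
    return (job_id, payload) if cur.rowcount == 1 else claim_job(conn)

def finish_job(conn, job_id, result):
    """Record the result and mark the job done."""
    conn.execute("UPDATE jobs SET status = 'done', result = ? WHERE id = ?",
                 (result, job_id))
    conn.commit()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    setup(conn)
    while (job := claim_job(conn)) is not None:
        job_id, payload = job
        finish_job(conn, job_id, f"processed {payload}")
    print(conn.execute("SELECT id, result FROM jobs").fetchall())
```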

Server Hierarchy

multiple tiers of servers may also be used
could be considered recursive HDPC (see the sketch below)
very similar to the tree architecture of supercomputers
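
A toy illustration of the recursive structure: a mid-tier server splits whatever work it receives among its children (which may be leaf nodes or further servers) and merges their results on the way back up. The Node and Server classes are hypothetical, not drawn from any real system.

```python
# Toy sketch of a server hierarchy as recursive HDPC.
class Node:
    """Leaf: does the actual computation on one chunk."""
    def process(self, chunk):
        return sum(chunk)

class Server:
    """Interior tier: distributes chunks to children and merges results."""
    def __init__(self, children):
        self.children = children

    def process(self, chunk):
        # Split the chunk among children (leaves or sub-servers) and combine.
        k = len(self.children)
        parts = [chunk[i::k] for i in range(k)]
        return sum(child.process(part)
                   for child, part in zip(self.children, parts))

if __name__ == "__main__":
    # Two mid-tier servers, each fronting three nodes, under one root.
    root = Server([Server([Node() for _ in range(3)]) for _ in range(2)])
    print(root.process(list(range(100))))   # 4950, same as a single machine
```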

Lost Nodes

define a maximum amount of time to wait for a node's response

use redundancy
assume some nodes will always be lost
send duplicate jobs to multiple nodes simultaneously (see the sketch below)
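
The timeout and redundancy ideas can be combined in a small piece of server-side bookkeeping, sketched below. The Dispatcher class, the one-hour timeout, and the redundancy factor of 2 are assumptions chosen for illustration.

```python
# Sketch of server-side handling of lost nodes: each job is handed to
# several nodes at once and re-issued if no answer arrives in time.
import time

TIMEOUT = 3600          # seconds to wait before giving up on a node
REDUNDANCY = 2          # how many nodes work on each job simultaneously

class Dispatcher:
    def __init__(self, jobs):
        self.pending = list(jobs)          # jobs not yet finished
        self.outstanding = {}              # (job, node) -> time dispatched
        self.results = {}                  # job -> first result received

    def next_assignment(self, node):
        """Pick a job for a node: prefer jobs with too few live copies."""
        now = time.time()
        # Forget assignments whose deadline has passed.
        self.outstanding = {
            k: t for k, t in self.outstanding.items() if now - t < TIMEOUT
        }
        for job in self.pending:
            copies = sum(1 for (j, _) in self.outstanding if j == job)
            if copies < REDUNDANCY:
                self.outstanding[(job, node)] = now
                return job
        return None

    def record_result(self, job, node, result):
        """Accept the first result for a job and drop duplicates."""
        self.outstanding.pop((job, node), None)
        if job not in self.results:
            self.results[job] = result
            self.pending.remove(job)

if __name__ == "__main__":
    d = Dispatcher(["job-a", "job-b"])
    print(d.next_assignment("node-1"))   # job-a
    print(d.next_assignment("node-2"))   # job-a again (redundant copy)
    print(d.next_assignment("node-3"))   # job-b
    d.record_result("job-a", "node-2", 42)
    print(d.results)
```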

Network (In)security

not as big an issue as one might think
encryption and public key infrastructures mitigate most confidentiality and authenticity concerns (see the sketch below)
redundancy is useful for both reliability and security
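
One way the integrity and authenticity concerns can be mitigated is sketched below, using an HMAC over each result message and assuming the server and each node share a per-node secret; a full public key infrastructure would replace the shared secret in practice.

```python
# Sketch of protecting result integrity and authenticity with an HMAC.
import hashlib
import hmac
import json

SECRET = b"per-node shared secret"   # assumption: provisioned out of band

def sign(message: dict) -> dict:
    """Attach an HMAC-SHA256 tag to a result message."""
    body = json.dumps(message, sort_keys=True).encode()
    tag = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return {"body": message, "tag": tag}

def verify(envelope: dict) -> bool:
    """Server side: reject results whose tag does not match."""
    body = json.dumps(envelope["body"], sort_keys=True).encode()
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["tag"])

if __name__ == "__main__":
    env = sign({"job_id": 7, "result": 12345})
    print(verify(env))                       # True
    env["body"]["result"] = 99999            # tampering is detected
    print(verify(env))                       # False
```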

Work Buffering

taking larger portions of work at a time (see the sketch below)
temporary connectivity issues pose less of a problem this way
a node can continue working for longer without talking to a central server
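
A minimal sketch of work buffering: the node asks for a whole batch of jobs in one request, works through them with no further server contact, and submits every result in a second request. fetch_batch and submit_batch are hypothetical stand-ins for whatever transport a real system would use.

```python
# Sketch of work buffering on the node side.
def fetch_batch(size):
    """Hypothetical call to the server for `size` self-contained jobs."""
    return [{"id": i, "start": i * 1000, "end": (i + 1) * 1000}
            for i in range(size)]

def submit_batch(results):
    """Hypothetical call sending all buffered results back at once."""
    print(f"submitting {len(results)} results")

def run(buffer_size=50):
    jobs = fetch_batch(buffer_size)          # one round trip for many jobs
    results = []
    for job in jobs:                         # no server contact needed here,
        results.append({"id": job["id"],     # so brief outages don't matter
                        "result": sum(range(job["start"], job["end"]))})
    submit_batch(results)                    # one round trip for all results

if __name__ == "__main__":
    run()
```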

Where is HDPC Useful?

Combinatorics

search
enumeration
generation

Cryptography

brute force cipher cracking
gives a glimpse of the future, in terms of what the average person will be able to crack

Artificial Intelligence

genetic algorithms
genetic programming
alpha-beta search

Graphics

ray tracing
animation
fractal generation and calculation

Simulation

weather and climate modeling
particle physics

Guidelines for Suitability

most problems involving a large search tree are well suited to HDPC

anything that can be broken down into smaller, self-contained chunks is a good candidate for HDPC (see the sketch below)
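
To illustrate the decomposition guideline, the sketch below cuts a large search space into small, self-contained chunks that any node could process on its own, with the server only merging the per-chunk answers. The example problem (finding numbers whose squares end in 444) is a stand-in chosen purely for illustration.

```python
# Sketch of splitting a search into self-contained chunks.
def make_chunks(limit, chunk_size):
    """Split the range [0, limit) into independent (start, end) chunks."""
    return [(s, min(s + chunk_size, limit)) for s in range(0, limit, chunk_size)]

def search_chunk(chunk):
    """Self-contained work unit: needs nothing but its own bounds."""
    start, end = chunk
    return [n for n in range(start, end) if (n * n) % 1000 == 444]

def merge(partial_results):
    """Server-side combination step; order of chunks does not matter."""
    return sorted(x for part in partial_results for x in part)

if __name__ == "__main__":
    chunks = make_chunks(100_000, 10_000)            # jobs to hand out
    found = merge(search_chunk(c) for c in chunks)   # same answer as a serial run
    print(len(found), found[:5])                     # e.g. 38**2 == 1444
```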

How Well Does HDPC Work?

Folding@Home

~200,000 non-dedicated nodes
240 TFLOPS
approximately 40 central servers, unknown speeds

SETI@Home

~200,000 non-dedicated nodes
288 TFLOPS
10 central servers, all relatively modest

Blue Gene/L

currently the fastest supercomputer
not HDPC
65,536 dedicated nodes
280 TFLOPS
cost about $100,000,000 US

HDPC Works Well

typical speedup is close to linear
cost is substantially less than a comparable supercomputer
nodes can also be general purpose computers

Why Does HDPC Work Well?

Infrastructure Reuse

in general, new hardware investments are not necessary
creating new infrastructure is expensive and time consuming
it's easy to justify using things you already have for additional purposes
there are tons of idle CPUs at any given time, why not use them?

Low Barrier to Entry

anyone with a couple of networked computers can start experimenting

Painlessly Scalable

smooth curve upwards for both cost and performance

Simpler to Program

doesn't require as much “thinking in parallel” in comparison to other approaches
thinking in parallel is hard and fundamentally different from thinking serially
pushes the heavy lifting onto the database instead of the application programmer

Commodity Hardware is Fast

a typical desktop machine today is more powerful than a supercomputer from 15 years ago
and costs orders of magnitude less
and outputs much less heat
and takes up much less space
and consumes much less power

The Future

supercomputers will become faster
HDPC will become even faster than supercomputers as both the number of computers and their speed increase
both supercomputers and HDPC will fill their own separate niches

Questions and Discussion

References

http://fah-web.stanford.edu/cgi-bin/main.py?qtype=osstats
http://www.boincstats.com/stats/project_graph.php?pr=sah
http://www.boincstats.com/stats/project_graph.php?pr=bo
http://www.itjungle.com/tlb/tlb033004-story04.html
http://setiathome.berkeley.edu/sah_status.html
http://fah-web.stanford.edu/serverstat.html
http://top500.org/list/2006/11/100