GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing (joint work involving source code of at least 25 people) GPU logo created by Mark Grady and David A. Lucas


Page 1: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

GPU, a Gnutella Processing Unit

an extensible P2P framework for distributed computing

(joint work involving source code of at least 25 people)

GPU logo created by Mark Grady and David A. Lucas

Page 2: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

How many of you did not shut down the computer and are now here in this room?

• Assume we are 15 people running a screensaver without performing real work. The talk lasts one hour.

• Opportunity loss for one hour:
    Speed: 15 * 0.8 GFlops = 12 GFlops
    Comp: 12 GFlops * 1 h = 43 200 billion floating point operations

• Costs for one hour:
    Power consumption: 15 * 300 W = 4500 W during one hour = 4.5 kWh
    Money: 4.5 kWh at 0.20 CHF per kWh = 0.9 CHF
    Oil needed: 0.36 liter (Gasoline: 12.3 kWh/kg)
    CO2 emissions: 0.81 kg CO2 (Gasoline: 2.27 kg CO2 / liter)
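The arithmetic on this slide can be rechecked with a small sketch. The function name `idle_cost` and its parameter names are made up for illustration; the default values are the ones quoted on the slide. The sketch assumes, as the slide appears to, that one kilogram of gasoline is roughly one liter. Without intermediate rounding it gives about 0.37 liter and 0.83 kg CO2 for one hour, slightly above the slide's 0.36 and 0.81.

```python
def idle_cost(hours, machines=15, gflops_each=0.8, watts_each=300.0,
              chf_per_kwh=0.20, kwh_per_fuel_unit=12.3, kg_co2_per_liter=2.27):
    """Recompute the slide's figures for an idle period given in hours."""
    speed_gflops = machines * gflops_each
    # Lost work in billions of floating point operations (GFlops * seconds).
    lost_billion_ops = speed_gflops * hours * 3600
    energy_kwh = machines * watts_each * hours / 1000.0
    money_chf = energy_kwh * chf_per_kwh
    # Assumption: 1 kg of gasoline is treated as about 1 liter.
    fuel_liters = energy_kwh / kwh_per_fuel_unit
    co2_kg = fuel_liters * kg_co2_per_liter
    return {
        "speed_gflops": speed_gflops,
        "lost_billion_ops": lost_billion_ops,
        "energy_kwh": energy_kwh,
        "money_chf": money_chf,
        "fuel_liters": fuel_liters,
        "co2_kg": co2_kg,
    }


one_hour = idle_cost(hours=1)
one_year = idle_cost(hours=24 * 365)
```

Calling the same function with 24 * 365 hours gives the energy and money figures for a whole year.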

Page 3: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

During one year (15 people)…

• Opportunity loss for one year: Speed: 15 * 0.8 GFlops = 12 GFlops

Comp: 12GFlops * 1y = 378 432 000 billion of floating point ops

• Costs for one year:Power consumption: 15 * 300 W =

4500 W during one year = 39.42 MWh

Money: 39.42 MWh => 7 884 CHF (525.6 CHF per head)

Oil needed: 3153.6 liter (Gasoline: 12.3 kWh/kg)

CO2 emissions: 7 t CO2 (Gasoline: 2.27 kg CO2 / liter)

We signed the Kyoto Protocol…

Page 4: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

Let's jump into the matter

This video was computed in a distributed fashion by 10 computers running for about one day. The GPU program distributed jobs, Terragen computed the frames. Hacker Red created terrain, sky and camera path.

Terragen artists: Nico, Red, paulatreides, nikoala, nanobit

Page 5: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

Distributed Computing

Distributed computing is a science which solves a large problem by giving small parts of the problem to many computers to solve and then combining the solutions for the parts into a solution for the problem.

Recent distributed computing projects have been designed to use the computers of hundreds of thousands of volunteers all over the world, via the Internet, to look for extra-terrestrial radio signals, to look for prime numbers so large that they have more than ten million digits, and to find more effective drugs to fight cancer and the AIDS virus.

These projects run when the computer is idle.

These projects are so large, and require so much computing power to solve, that they would be impossible for any one computer or person to solve in a reasonable amount of time. (from distributedcomputing.info)
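The split, solve and combine pattern described above can be sketched in a few lines. This is only an illustration: a thread pool stands in for the many volunteer computers, the problem (a sum of squares) is a toy one, and the function names `split`, `solve_part` and `solve_distributed` are made up here.

```python
from concurrent.futures import ThreadPoolExecutor


def split(numbers, parts):
    """Cut the large problem into roughly equal work units."""
    size = max(1, -(-len(numbers) // parts))  # ceiling division, at least 1
    return [numbers[i:i + size] for i in range(0, len(numbers), size)]


def solve_part(chunk):
    """Work done by one computer: here, a partial sum of squares."""
    return sum(n * n for n in chunk)


def solve_distributed(numbers, workers=4):
    """Hand the parts to the workers, then combine the partial solutions."""
    chunks = split(numbers, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(solve_part, chunks))
    return sum(partial_results)


total = solve_distributed(list(range(1000)))
```

The combined result equals what a single computer would compute; only the work is shared.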

Page 6: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

Distributed Computing vs. Supercomputers

A lot of computational power is available through distributed computing.

However, supercomputers support intra-processor algorithms.

Distributed computing projects keep running the same dumb task on all computers.

Page 7: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

Timeline (1998, 2000, 2002, 2004)

Centralized approach (Client/Server: one server, many clients)

• Seti@Home (Anderson et al.)

• Many other projects follow: Folding@home, Chessbrain.net…

• BOINC project provides framework to unify several projects.

Peer to Peer approach (nodes connected to other nodes)

• Gnutella (J. Frankel, T. Pepper)

• File sharing systems: Kazaa, eMule..

• P2P research frameworks: Sun JXTA, Triana, GPU

Grid Computing: Globus Toolkit

Page 8: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

Principle of Gnutella

• Flooding

[Diagram: a node receives packet A on an incoming connection and sends copies of A on its outgoing connections.]

Several ideas to limit geometric growth of packets:

• List of already seen packets

• Time To Live stamp

• QueryHit routing

• Ultrapeers system keeps the network more tree-like
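Two of the ideas above, the list of already seen packets and the Time To Live stamp, can be illustrated with a toy simulation. This is a simplified sketch and not the Gnutella protocol or the GPU implementation: the network is a plain dictionary of neighbors, and the function name `flood` is made up for illustration.

```python
from collections import deque


def flood(graph, origin, ttl):
    """Flood one packet from origin; return (nodes reached, messages sent)."""
    seen = {origin}                       # nodes that already saw the packet
    queue = deque([(origin, None, ttl)])  # (node, sender, remaining TTL)
    messages = 0
    while queue:
        node, sender, remaining = queue.popleft()
        if remaining == 0:
            continue                      # TTL expired: do not forward
        for neighbor in graph[node]:
            if neighbor == sender:
                continue                  # never send back to the sender
            messages += 1
            if neighbor not in seen:      # duplicates are dropped on arrival
                seen.add(neighbor)
                queue.append((neighbor, node, remaining - 1))
    return seen, messages


# A ring of six nodes: every node has a left and a right neighbor.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
reached, sent = flood(ring, origin=0, ttl=3)
```

Raising the TTL widens the reach; the seen list keeps a packet that arrives twice from being forwarded again.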

Gnutella is implemented in GPU thanks to Kamil Pogorzelski.

Page 9: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

P2P networks are random graphs…

A node is a computer. The length of an edge is the distance in milliseconds between two nodes.

Page 10: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

…random graphs with given edge lengths do not necessarily fit into a plane…
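A small worked example, not taken from the slides, shows why. Assume four hypothetical nodes that all measure the same latency to each other. Three of them form an equilateral triangle. In the plane, the only point at equal distance from all three corners is the center of the triangle, and that distance is shorter than the side, so the fourth node cannot be placed.

```python
import math


def center_distance(side):
    """Distance from the center of an equilateral triangle to each corner."""
    return side / math.sqrt(3)


def fourth_node_fits_in_plane(latency):
    """True only if a fourth node could sit at `latency` from all three corners."""
    # The center is the only candidate point in the plane, so it must match.
    return math.isclose(center_distance(latency), latency)


fits = fourth_node_fits_in_plane(1.0)
```

For a latency of 1 the center lies at about 0.577 from each corner, so the check fails. Four such nodes need a third dimension (a tetrahedron).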

Page 11: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

…random graphs are fractal in their nature.

• Fractal dimension of the Gnutella network: D = 7.79 (difficult to imagine).

• Interesting property of fractals: patterns repeat, patterns similar at different scale lengths.

Page 12: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

Framework Architecture

Network Architecture

Suitable Tasks for the framework

Extensions:

• Distributed Search Engine

• Terragen Landscape Generator

Long term goals

Page 13: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

GPU architecture on a single computer

Page 14: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

Example

Main GPU application

Plugins perform computation.

Frontends visualize results of computations.

Frontends monitor network performance.

Delphi GL port and effects by Tom Nuydens and Chris Rorden

Page 15: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

Network Architecture

• GPUs advertise their IP address on a public list.

• GPUs know each other through autopings and autopongs.

• GPUs know the IP addresses of entry gates.

Page 16: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

GPU Network in practice (December 2004)

• Around 10 computers available at any time of the day, on average

• 3 FTP servers: one for collecting generated images, one to distribute updated binaries and one to distribute videos

• Web on sourceforge.net

• CVS on sourceforge.net

• Documentation on sourceforge.net

Page 17: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

Special features: Chat System

• Allows developers and users to meet, to exchange ideas and bug reports

• Debugging on the fly: if the network is running correctly, you should definitely not see the same sentence repeated five times (as happened before).

Page 18: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

Suitable Tasks for the framework

The star topology is a subset of the random graph. Any centralized approach can run with some overhead on the P2P network (Rene Tegel, applaunch.dll).

No overhead for:

• Monte Carlo Methods

• Evolutionary Algorithms

• Randomized Algorithms

• Distributed Databases (to some extent)
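Monte Carlo methods suit such a network because every node can sample on its own and only a small result has to travel. A minimal sketch, assuming each node gets its own random seed and reports a pair of counters; the names `node_estimate` and `combine` are made up for illustration and the task (estimating pi) is a toy one:

```python
import random


def node_estimate(seed, samples):
    """Work of one node: count random points that fall inside the unit circle."""
    rng = random.Random(seed)  # own seed, so nodes do not repeat each other
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits, samples


def combine(results):
    """Merge the counters of all nodes into one estimate of pi."""
    hits = sum(h for h, _ in results)
    total = sum(n for _, n in results)
    return 4.0 * hits / total


results = [node_estimate(seed, 20000) for seed in range(5)]
pi_estimate = combine(results)
```

No node has to wait for another one, and a node that disappears only makes the estimate slightly less precise.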

Page 19: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

GPU Extension I

• Distributed Search Engine by Rene Tegel

Each GPU can run crawlers on websites. Links are visited randomly. Visited pages are indexed in a local database. Each GPU can query the databases of other GPUs. Status for this extension: experimental.
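The idea can be sketched without any network access. Everything below is hypothetical and is not the code of the extension: the web is a small dictionary mapping a URL to its text and links, the local database is an in-memory dictionary, and the names `SearchNode` and `query_network` are made up for illustration.

```python
import random


class SearchNode:
    """One GPU with a random crawler and a local word index."""

    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.index = {}  # word -> set of URLs where the word was seen

    def crawl(self, web, start, steps):
        """Visit up to `steps` pages, following a random link each time."""
        url = start
        for _ in range(steps):
            text, links = web[url]
            for word in text.lower().split():
                self.index.setdefault(word, set()).add(url)
            if not links:
                break
            url = self.rng.choice(links)

    def query_local(self, word):
        return self.index.get(word.lower(), set())


def query_network(nodes, word):
    """Ask every node for its local matches and merge the answers."""
    found = set()
    for node in nodes:
        found |= node.query_local(word)
    return found


toy_web = {
    "a": ("peer to peer", ["b"]),
    "b": ("distributed computing", ["a"]),
}
first, second = SearchNode(seed=1), SearchNode(seed=2)
first.crawl(toy_web, "a", steps=1)
second.crawl(toy_web, "b", steps=1)
hits = query_network([first, second], "computing")
```

Here the first node never visited page "b", yet the query still finds it through the second node's database.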

Page 20: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

GPU Extension II

• Terragen Landscape Generator

(PlanetSide Software and Rene Tegel)

Page 21: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

Terragen™

• Terragen is software written by PlanetSide, a UK company. It is not open source but is free for personal use.

• GPUs download terrain description and camera path from the FTP server, and decide to render a particular frame randomly. The computed frame is uploaded back to the FTP server.

• By merging frames together with a codec, we are able to generate videos (merging on one computer only)

• Status for this extension: production

• Typical centralized extension

• Download already produced videos here.
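The random choice of frames can be simulated in a few lines. This is a sketch under assumptions that are not stated on the slide: a set stands in for the FTP server, workers only pick among frames that are still missing, and all workers of one round pick before any of them uploads. The name `render_all` is made up for illustration.

```python
import random


def render_all(total_frames, workers, seed=0):
    """Simulate workers picking random missing frames until none is left."""
    rng = random.Random(seed)
    server = set()  # frame numbers already uploaded to the server
    rounds = 0
    while len(server) < total_frames:
        rounds += 1
        missing = [f for f in range(total_frames) if f not in server]
        # Workers choose independently, so two of them may render the same frame.
        picks = [rng.choice(missing) for _ in range(workers)]
        server.update(picks)
    return server, rounds


frames, rounds_needed = render_all(total_frames=30, workers=10)
```

No coordinator hands out frame numbers; the price is some duplicated rendering, which grows as fewer frames remain.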

Page 22: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

Special features: Autoupdate system

Releasing through Sourceforge takes about three quarters of an hour.

Quick way to deliver fixes and to keep the cluster updated: download new files from the FTP server (tea.ch)

Page 23: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

Long term goals

• At present, the system scales up to 40-60 computers. Change this to scale up to 500 000 computers as any good P2P network does.

• Try to extend the framework so that it supports agents (agent-based model)

• Try to implement an example of an evolutionary algorithm (e.g. Core Wars)

• Try to implement a project with public appeal, like Near Earth Objects Hazard Monitoring (ORSA)

Page 24: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

Long term goals II

• GPU Core

– Keep it under GPL license.

– Rewrite it to be less ugly and more object-oriented.

– Abandon Gnutella and go for a connection layer with Distributed Hash Tables.

– Connection layer should be generalized to support any sort of communication (chats, computations, file-sharing)

– Native Linux implementation (not only through Wine emulator, although already stable and fast)

– Not only x86 architecture

Page 25: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

Special features: CVS support

• Goal: keep source code of developers synchronized.

• Done through CVS of Sourceforge, bash Unix shell, Cygwin or TortoiseCVS

• Red files are not in sync with repository.

Page 26: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

GPU Cluster Pictures

Page 27: GPU, a Gnutella Processing Unit an extensible P2P framework for distributed computing

Thank you for your attention!

Home of the project is http://gpu.sourceforge.net

More videos here…

And thanks to the GPU Team and