An Application Programming Interface for HPC



    An Application Programming Interface for High

    Performance Distributed Computing

    For the Copernicus computing project

    ASHKAN JAHANBAKHSH

    HANIF FARAHMAND MOKARREMI

    Master of Science Thesis, KTH

    Supervisor: Iman Pouya, Erik Lindahl

    Examiner: Patrik Henelius

    Stockholm, Sweden 2013


    Abstract

This master thesis was performed at Lindahl Lab at Science for Life Laboratory (SciLifeLab), located in Stockholm, Sweden.

The need for compute resources is increasing, and the speed of a single computer is not enough for data intensive computations. Distributed computing has been developed to improve the computation of such tasks by distributing them to other machines. Many platforms have been implemented for this purpose. Copernicus is one such platform, providing the ability to distribute computationally intensive tasks. The current Copernicus API handles only the distribution of entire applications; there is no support for distributing sections of code in Copernicus.

In this paper we provide a general API design, and a Python implementation of it, for distributing sections of code on Copernicus. The implementation of the API handles only Python code, but it is possible to extend it to other languages with the help of Python wrappers. The API also lowers the learning threshold for a new Copernicus user, especially for those who are not computer scientists or who have little knowledge of programming.


Referat

API för distribuerade beräkningar

This master thesis was carried out at Science for Life Laboratory (SciLifeLab), at the Lindahl Lab division, located in Stockholm.

The need for computing resources is growing, and the speed of a single computer is not sufficient for data intensive computations. Distributed computing has been developed to improve the computation of such work by distributing it to other machines. There are many platforms that have been implemented for this purpose. Copernicus is one such platform, which provides access to the distribution of computationally intensive work. The current Copernicus API handles only the distribution of a program as a whole, and there is no support for distributing parts of code in Copernicus.

In this report we provide a general API design, and its implementation in Python, for distributing parts of code in Copernicus. The implementation of the API handles only Python code, but it is possible to extend it to other languages with the help of Python wrappers. This also lowers the learning threshold for a new Copernicus user, especially for those who are not computer scientists or who have little knowledge of programming.


Acknowledgements

We would like to thank everyone who helped us during this master thesis. A special thanks goes to Iman Pouya, our supervisor, who guided us in the right direction to reach our goal. Special thanks also go to Sander Pronk and Patrik Falkman, who gave us the opportunity to discuss our problems with them during the work and provided much valuable feedback. We also thank Professor Erik Lindahl, our examiner at SciLifeLab, who let us do this master thesis.

Finally, we would like to thank our friends, especially Ali Mehrabi and Hannes Salin, who took the time to read our report and gave us valuable feedback.


    Contents

1 Introduction
1.1 Problem statement
1.2 Methodology

2 Parallelization and Distributed computing
2.1 Parallelization
2.2 Distributed computing
2.3 MapReduce
2.4 Advantages and Disadvantages

3 Copernicus
3.1 Copernicus design
3.2 Copernicus module
3.3 Copernicus module example
3.3.1 Defining the _import.xml file
3.3.2 Defining the runner.py file
3.3.3 Defining the executable.xml file
3.3.4 Adding jobs to Copernicus

4 Related work
4.1 Folding@home
4.2 PiCloud
4.3 Hadoop
4.4 Techila

5 The API
5.1 Design
5.1.1 Considerations
5.1.2 First attempt: Module generator
5.1.3 Final attempt: Generic module
5.2 API implementation

6 Results
6.1 MD5 cracker with the new API
6.2 Comparison

7 Discussion
7.1 Future work
7.2 Conclusion

Bibliography

Appendices

A Link to source code


    Chapter 1

    Introduction

Today, many scientists run a lot of computationally intensive simulations[1]. These are mostly used in the academic world, in biology, chemistry and physics, but also in commercial industry, e.g. the automotive industry[1]. These simulations are usually done for several reasons:

- The problem is too complex to solve analytically
- It is too expensive (in money or time) to solve it in the real world

To be able to do a good simulation of a real world application, one must build a model that can represent that application. For example, in a car crash simulation it could be too expensive and time consuming for a company to destroy hundreds of cars to get a representative statistical result of how a possible real car crash would turn out. Instead the company can use simulations to minimize the number of real car crash tests, thus saving both time and money.

    Figure 1.1: A snapshot of a car crash simulation, source: BMW.


These simulations are usually computationally intensive, so it would take a long time to do the processing on a single workstation. Therefore the simulations are usually done on computational resources equipped with hundreds to millions of CPU cores. These computational resources could either be a cluster of computers that are interconnected with a high performance link, or a cluster of smaller computers placed all over the world and connected through the Internet. The former kind of cluster (also called a supercomputer) is usually very large. Supercomputers are specially built to be energy efficient, but they still consume a lot of energy and emit huge amounts of heat that need to be dissipated. They are therefore located in areas with a lot of space, needed not only for the computers but also for the cooling system and electricity.

Figure 1.2: A supercomputer, source: NASA.

On top of the hardware, the user must implement a mechanism for the application to communicate across all computers and clusters. There is support for such a mechanism both in the application layer and in a lower programming language layer. In the programming layer there are several language APIs, such as the Message Passing Interface (MPI)[2], Remote Procedure Call (RPC)[3] and Remote Method Invocation (RMI)[4]. The programmer needs to handle the communication between the computers, fault tolerance and the available resources.

Many applications have been created to simplify the language layer and make it more abstract. While these applications might be very different in design and usage, most of them share some fundamental functionality, such as handling the communication between computers, resource management and fault tolerance.

In the Related work chapter, a number of such tools will be briefly explained, and in the next section another tool is presented together with a problem statement.


    1.1 Problem statement

As explained in the previous section, there are many software tools that aim to simplify the distribution of work. One of these tools is called Copernicus[5], which has been developed mainly at the Royal Institute of Technology (KTH) in cooperation with Virginia and Stanford universities. Copernicus is a peer-to-peer (P2P) platform for distributed computing. It connects heterogeneous computing resources and allows them to be utilized by defining a problem as a workflow. This means that users can focus on formulating their problem and not worry about the parallel work distribution and fault tolerance.

It is designed primarily for molecular dynamics but can in practice be used for any computationally intensive work that can be run in a distributed manner.

Figure 1.3: A molecular simulation of a protein folding, source: Stanford University.

The current structure of Copernicus allows binaries, custom programs and scripts to be used in a workflow. However, this is a monolithic approach, and in certain use cases one needs more granular control over the work distribution. The workflow creation part of Copernicus is very powerful, but it is also quite time consuming for a new Copernicus user or for users with little knowledge of programming. In some use cases it is utterly impractical to design a workflow just to distribute some work on the Copernicus platform.

One such use case is when only some part of a code base needs to be run in a distributed manner. In such a use case, the current Copernicus design would force the user to change the code dramatically to make it run on the Copernicus platform. Each time the user changes the code, the corresponding changes must be made again. While this is really time consuming and even frustrating for a code base with a single developer, it is even more frustrating and impractical for a code base with multiple developers. This is a high entry barrier, which is probably one of the reasons for the small user base of Copernicus.

The goal of this project is to define an API that users can use when they want to mark sections of their code to be distributed in parallel on the Copernicus platform.


    1.2 Methodology

The Copernicus project is a large free software project. It lacks a flow chart and a UML diagram that explain its design decisions, and its user API documentation was also under construction at the start of this thesis. From a developer standpoint these are two huge drawbacks. This means that not only did we need to learn how to use Copernicus, but we also had to read its code and work out its current design, structure, strengths and weaknesses. However, an introduction to how to use Copernicus was given by our supervisor.

Because of the reasons above, we decided to first get familiar with the usage of Copernicus. When a full understanding of its usage was obtained, literature studies were done in the area of High Performance Computing (HPC) and similar applications, most of which have been mentioned in the previous sections. The reason for this was to get good knowledge of what was already out there and how it is used.

After the literature studies were done, we implemented a simple distributable application that used Copernicus to compute some tasks. The goal was to understand and get an overview of how Copernicus works and get used to the available Copernicus commands. After that, we iteratively implemented a simple version of the API. This version was only intended to get a flow in the development and an understanding of the Copernicus design. When there is a working flow, it is much easier to both understand and change each section of the implementation.

    The methodology of this thesis can be described by the following flow chart:

    Figure 1.4: The methodology of this thesis.


    Chapter 2

Parallelization and Distributed computing

    2.1 Parallelization

The clock speed of processors will no longer increase significantly, because higher clock speeds require more voltage, which generates almost exponentially more heat[6]. Instead, processor manufacturers use the new transistors to add multiple processor cores to each chip[6]. To use the power of these multiple cores, programs have to be run in parallel. Parallel processing is a way of computing in which a large problem is divided into smaller, independent tasks and all tasks are computed concurrently, each on a separate core, usually on a single computer. In other words, parallel processing is the use of two or more processor cores at the same time to solve a single problem that can be divided into sub-problems.

    Figure 2.1: A visualization of parallelization of a problem.
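As a minimal illustration of this idea (plain Python on a single machine, not Copernicus code), the following sketch divides a toy problem into independent sub-problems and computes them concurrently on the local cores with the standard multiprocessing module:

from multiprocessing import Pool

def solve_subproblem(n):
    # One independent sub-problem: here just a toy computation.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    tasks = [10**6, 2 * 10**6, 3 * 10**6, 4 * 10**6]
    pool = Pool()                            # one worker process per CPU core by default
    partial_results = pool.map(solve_subproblem, tasks)
    pool.close()
    pool.join()
    print(sum(partial_results))              # merge the sub-results into one answer

The essential requirement is the same as in the text above: the sub-problems must be independent of each other so that they can be computed in any order and in parallel.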


    2.2 Distributed computing

One way to compute computationally intensive tasks in a reasonable time frame is to use a supercomputer, i.e. a powerful high performance computer that consists of many compute nodes (a compute node is simply a single machine in a cluster or a network). In a supercomputer, tasks can take advantage of the huge number of nodes and thus be computed in a parallelized way.

Another way to compute computationally intensive tasks in a reasonable time frame is to set up a number of computers in a network and use their resources to compute tasks. Distributed computing[7] is a field in computer science which solves this by dividing a large problem into smaller parts, sending them to many computers in a network to solve, and then merging the sub-solutions into a solution for the whole problem.

Figure 2.2: A visualization of distribution of a problem.


    2.3 MapReduce

MapReduce is a programming model that allows developers to process large data sets in a distributed way[8]. There are two key types of functions in the MapReduce framework: the Map function and the Reduce function. The job is separated into sub-problems which are processed by the mappers. The outputs of the mappers are sent to the reducers, where they are collected into one result. While the idea of having a mapper function and a reducer function is quite old and widely used[9], the name MapReduce was first introduced in a paper on the subject published by two Google engineers.
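As a rough single-machine sketch of the model (word counting is the classic example; this is plain Python, not Copernicus or Hadoop code), the mappers emit key/value pairs and the reducers collapse all values for a key into one result:

from collections import defaultdict

def mapper(chunk):
    # Map step: emit (key, value) pairs for one sub-problem.
    return [(word, 1) for word in chunk.split()]

def reducer(key, values):
    # Reduce step: collect all values for one key into a single result.
    return key, sum(values)

chunks = ["to be or not to be", "to compute is to distribute"]
grouped = defaultdict(list)
for pairs in map(mapper, chunks):        # each mapper could run on a different machine
    for key, value in pairs:
        grouped[key].append(value)       # shuffle: group values by key
result = dict(reducer(k, v) for k, v in grouped.items())
print(result)                            # e.g. {'to': 4, 'be': 2, ...}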

Figure 2.3: A visualization of the MapReduce programming concept. A problem is separated into sub-problems and sent to the mappers. When the mappers are done, their results are sent to the reducers, where they are collected and presented as a final output.


    2.4 Advantages and Disadvantages

The main advantage of a distributed system compared to a parallel computing system is its reliability. If one machine crashes, the remaining computers are unaffected and the system will still run as a whole, given that there is support for fault tolerance. In a parallel computing system, a failure of a CPU or other hardware may cause the whole system to stop. Another advantage of a distributed system is its flexibility: it is very easy to implement, install and debug new services, which in turn can be accessed equally by every client. Finally, there are many volunteers around the world who might want to contribute CPU resources to help scientists compute data-intensive tasks in a distributed manner, as in the Folding@home project[10].

One of the main disadvantages of distributed computing systems is troubleshooting and diagnosing them. The maintainer may be required to connect to remote machines or check whether the communication between the computers in the system is working. The network is another factor in the reliability of a distributed system: if the network is overloaded or there are problems with data transmission, the performance of the whole system will be affected.

Both parallel and distributed computing require one to modify the sections of code that are meant to be parallelized into independent tasks. This requires a good understanding of parallel and distributed computing.


    Chapter 3

    Copernicus

As mentioned in chapter 1, Copernicus is a highly scalable platform for the distribution of computationally intensive tasks, connected in a model that is called a workflow. The user can focus on formulating the problem as a workflow instead of spending time on handling the message passing and fault tolerance, which are crucial parts of distributed computing.

In Copernicus, a user can create a workflow of connected instances of functions that work as wrappers for running executables. For example, one might want to connect the output of function A to the input of function B, and thereafter connect the output of B to the input of function C. The idea of the Copernicus workflow design is that a user should be able to change the inputs of the module functions while the job is being executed.

Figure 3.1: An example of a simple workflow, with three connected functions in Copernicus.


In section 3.2 a more detailed explanation of a Copernicus module is presented.

Copernicus is also designed with scalability in mind. This means that the user can add new workers on demand, and the number of workers that can be in the cluster is practically unlimited.

The following sections describe the Copernicus design and the requirements for a project to be distributed on it.


    3.1 Copernicus design

The current version of Copernicus is designed around three different environments: client, server and workers. On the client machine the user starts a new project and issues all commands to the server. The server's job is to handle all commands coming from the client and send jobs to the workers. Copernicus commands are used as an interface between the client and the server, i.e. to start a project, set inputs, add jobs, receive outputs etc. The main purpose of the server is to distribute the computationally intensive work to the workers; the sequential part of the code should always run on the server. The server also handles persistency and fault tolerance in case a worker does not respond.

In order for Copernicus to scale dynamically, i.e. to add new workers to a cluster, it is designed such that the workers ask for jobs instead of the other way around. The workers tell the server what they are capable of computing, and the server then checks whether it has any matching job for that specific worker. If it has a job that matches the worker's capabilities, it will send the job to the worker. Otherwise it will let the worker know that it has no job at the moment. The worker will then wait for a given amount of time and ask the server again, simply because the server might have a job at a later time.

    Figure 3.2: An illustration of jobs delivered to workers from a Copernicus server.
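The pull-based scheme can be summarized with the following conceptual sketch (the server object and its methods here are illustrative assumptions, not the actual Copernicus interfaces):

import time

def worker_loop(server, capabilities, poll_interval=60):
    # Conceptual pull model: the worker announces what it can run and asks for work.
    while True:
        job = server.request_job(capabilities)   # hypothetical call to the server
        if job is None:
            time.sleep(poll_interval)            # no matching job yet; ask again later
            continue
        result = job.run()                       # execute the matched job locally
        server.send_result(job, result)          # return the output to the server

Because the workers initiate the contact, new workers can join at any time without the server needing to know about them in advance.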


In order to make Copernicus even more dynamic, it was designed around a P2P architecture[5]. Befriended servers can help each other with load balancing, i.e. a server asks another server whether it has unfinished jobs in its queue. The jobs can then be transferred to the other server so that the work is completed more efficiently.

Figure 3.3: An illustration of jobs being transferred to a trusted Copernicus server when the other Copernicus server has unfinished jobs in its queue.


    3.2 Copernicus module

To be able to connect different executables' inputs and outputs, files and different data types, the user needs to define a module. The module specifies all the wrapping functions and their corresponding executables, with the inputs/outputs that need to be connected. This is defined in an XML file called _import.xml.

Figure 3.4: This figure shows where all files must be located in order to create a Copernicus module and run an application on the Copernicus platform.

Another file that is needed for a working Copernicus module is a Python script. Each function definition in the _import.xml file needs to be implemented in a Python script. When Copernicus reads the _import.xml file, it knows which functions that specific module has and all the data types and inputs/outputs of each function. In the Python script, a user can connect instances of those functions and manipulate the outputs of one executable before setting them as the input of another executable.

Each time a change is made to a module instance, Copernicus will call the specific function with some arguments. The function must return, otherwise the project will be blocked. This is a crucial point to consider for the implementation of the API. Copernicus is designed in this way so that users can interact with a running project; for example, a user might want to change a value in the middle of a huge continuous project.


    3.3 Copernicus module example

This section covers the basics of creating a module in Copernicus through an example. After the module creation, adding jobs to the Copernicus job queue will be explained.

The aim of the example application, called MD5 Cracker, is to crack an MD5 hash, i.e. to find the plaintext that corresponds to a given MD5 hash. An MD5 hash is produced by a one-way hash algorithm. There are several techniques to crack these kinds of hashes, such as a brute force attack, a dictionary attack[11] and the use of rainbow tables[12]. The application uses the brute force technique, i.e. cracking by trying all possible combinations until the plaintext is found. In order to crack a single lowercase character (English alphabet), it takes a maximum of 26 tries, and to crack a two character text it takes a maximum of 26² tries. The exponential nature of the brute force attack makes it a computationally intensive task. The problem can easily be divided into sub-problems by defining a range for each task to compute. These two properties make this an ideal application to run on the Copernicus platform.

import hashlib, string, itertools, sys

def bruteforce(job):
    def validateWord(word, original_hash):
        return hashlib.md5("".join(word)).hexdigest() == original_hash

    def nextPermutation(FIRST_WORD_TUP, LAST_WORD_TUP, wSize):
        for x in itertools.product(string.ascii_lowercase, repeat=wSize):
            if x >= FIRST_WORD_TUP and x <= LAST_WORD_TUP:
                yield x

    # job = [plaintext length, md5 hash, start of range, end of range]
    wSize, original_hash, first, last = job
    for word in nextPermutation(tuple(first), tuple(last), int(wSize)):
        if validateWord(word, original_hash):
            return "".join(word)
    return None

if __name__ == "__main__":
    # Invoked by a worker as: md5cracker <length> <hash> <start> <end>
    print(bruteforce(sys.argv[1:]))

Listing 3.1: The contents of "md5cracker.py"


    3.3.1 Defining the _import.xml file

As already described in section 3.2, a file called _import.xml must be created, and it should be located on the Copernicus server.

[XML listing not reproduced here. It names the module MY_MODULE, described as "MD5 cracker", and defines one function, described as "crack a given hash and find the corresponding plaintext", with four inputs ("Length of the plaintext", "The hash string", "The start point", "The end point") and one output ("The output of md5cracker").]

Listing 3.2: The contents of "_import.xml"

    15

  • 8/13/2019 An Application Programming Interface for HPC

    24/53

    CHAPTER 3. COPERNICUS

In this XML example, the module is named MY_MODULE and it has one function. The function is named runner; it takes four input parameters and has one output. The inputs to the function are the total length of the plaintext, the hash value, and the starting and ending values of the search range. The output is the plaintext value, which will be returned in a file after it has been computed by the workers.

    3.3.2 Defining the runner.py file

    Under the controller tag in the _import.xml file there is a property called function.

    This property tells Copernicus that this file must be called runner.py.

import logging, cpc.command, cpc.util, os, shutil

log = logging.getLogger("cpc.lib.MY_MODULE")

def runner(inp):
    if inp.testing():
        return
    fo = inp.getFunctionOutput()
    persDir = inp.getPersistentDir()
    val1 = inp.getInput("num")
    val2 = inp.getInput("hash")
    val3 = inp.getInput("start")
    val4 = inp.getInput("end")
    fileExist = os.path.isfile(persDir + "/stdout")
    if not fileExist:
        for i in range(len(val1)):
            outputFiles = ["out.%d" % i]
            args = ["md5cracker", val1[i].get(), val2[i].get(),
                    val3[i].get(), val4[i].get()]
            cmd = cpc.command.Command(persDir, "MY_MODULE/runner", args,
                                      minVersion=cpc.command.Version("1.0"),
                                      addPriority=0, outputFiles=outputFiles)
            fo.addCommand(cmd)
    return fo

Listing 3.3: The contents of "runner.py"

This Python script will tell Copernicus to run the application md5cracker on the workers. The input values to the application are read from the module. After that, the internal Copernicus API function addCommand is executed to add the desired job to the Copernicus queue.


    3.3.3 Defining the executable.xml file

In order to let the Copernicus server know what functions a worker is capable of running, a file called executable.xml must be created. This file must be copied to all the workers that are meant to run the job.

[XML listing not reproduced here. It declares a single executable whose name property is set to MY_MODULE/runner.]

Listing 3.4: The contents of "executable.xml"

    Under the executable tag in this file there is a property called name. This property tells

    Copernicus that this particular worker is capable of running MY_MODULE/runner. As

    already mentioned in the previous sections, MY_MODULE is the name of the module

    and runner is its function.

    3.3.4 Adding jobs to Copernicus

When the module creation is finished, a project can be set up and jobs can be added to the Copernicus queue. In order to create a workflow and connect the inputs/outputs, some Copernicus commands must be executed. In this particular example, the hash value 95ebc3c7b3b9f1d2c40fec14415d3cb8, which represents the plaintext zzzzz, is brute forced by the application. For the sake of simplicity, the length of the plaintext in this example is known, which is 5. In a real case, it is not possible to know the length of the plaintext from the hash value.

//create a project
$cpcc start MY_CRACKER

//import the recently created module into the project
$cpcc import MY_MODULE

$cpcc transact

//create an instance of the job and name it "runner_1"
$cpcc instance MY_MODULE::runner runner_1

$cpcc activate

//set the length of the plaintext
$cpcc set runner_1:in.num[+] "5"

//set the hash value to be cracked
$cpcc set runner_1:in.hash[+] "95ebc3c7b3b9f1d2c40fec14415d3cb8"

//the start point
$cpcc set runner_1:in.start[+] "aaaaa"

//the end point
$cpcc set runner_1:in.end[+] "mzzzz"

//commit the first job
$cpcc commit

    Doing the same but with different arguments to add the second job:

$cpcc transact
$cpcc instance MY_MODULE::runner runner_2
$cpcc activate
$cpcc set runner_2:in.num[+] "5"
$cpcc set runner_2:in.hash[+] "95ebc3c7b3b9f1d2c40fec14415d3cb8"
$cpcc set runner_2:in.start[+] "naaaa"
$cpcc set runner_2:in.end[+] "zzzzz"

//commit the second job
$cpcc commit

Two jobs have now been added to the Copernicus queue to be run on the workers. If there are workers with available computational resources, the jobs will be fetched by them and the computation will start. When they are done with the computation, the results are sent back to the server. After all jobs in the queue are done, the project is considered finished by Copernicus.

While this is a very simple code example, it shows that creating a Copernicus module is a very time consuming process even for a simple distribution.


    Chapter 4

    Related work

In this chapter a number of related works are briefly explained, with their similarities/differences and advantages/disadvantages compared to Copernicus. The goal is to get a good insight into how other distributed platforms work and to get some inspiration before starting to design the new API.

    4.1 Folding@home

Folding@home (FaH)[13] is a distributed computing project with the goal of researching protein folding[14], i.e. predicting the 3D structure of a protein from its primary structure. Currently, there are more than 263,000 volunteers all around the world who contribute their computer resources to this project[13].

Copernicus is in fact highly influenced by the FaH design[15]. While most of the FaH code is proprietary software and is only used for protein folding, Copernicus is completely free software and is designed to do any kind of distributed computing[5].

    4.2 PiCloud

PiCloud[16] is a so called cloud computing service, a commercial web application that distributes computational work. Its API design is based on function calls: you define the functions that you want to run in a distributed manner, and then you call the PiCloud API functions to run them, check their progress and receive their return values. PiCloud can both run a function sequentially on the cloud and map the function so that it runs in parallel on a distributed system.

The big difference between the APIs of Copernicus and PiCloud is that the latter can distribute a single function and run it on the cloud, while in the current Copernicus API a single function


cannot be distributed, only a whole program. On the other hand, Copernicus is capable of running several applications, collecting outputs, connecting the inputs/outputs of each application to other ones, and distributing the desired jobs to the workers. PiCloud requires all data needed to compute the tasks to be available before the computation starts, whereas Copernicus can start a job and receive the data it needs during the computation.

    4.3 Hadoop

Hadoop[17] is a software framework for running applications on large clusters with support for large amounts of data. It is derived from MapReduce and the Google File System (GFS). It is free software, licensed under the Apache License 2.0, and it is widely used by large companies such as Facebook, Yahoo, Amazon.com, IBM, HP and others. While Hadoop was not mainly designed for computationally intensive work, it can certainly be used for it. Its core strength, however, is the Hadoop Distributed File System (HDFS)[18]. HDFS replicates data across the computers connected in the cluster, so that if one node goes down another node can take its place without any data being lost. Data intensive jobs can take advantage of the so called data localization system: Hadoop holds information on where each block of data is stored and, instead of transferring the data to the program, the program is transferred to where the data is located.

    4.4 Techila

Techila[19] is a commercial distributed computation platform that lets intensive computations be processed in a distributed way. It is meant to distribute sections of code, e.g. a for-loop, and it supports many languages such as Perl, Python, Matlab and C/C++. It is only capable of distributing embarrassingly parallel workloads, i.e. tasks that are completely independent and do not have any shared variables.

Techila is very similar to Copernicus in that both platforms distribute jobs to workers and the end user can receive the computed results from them. The difference between Techila and the current Copernicus API is that Techila is able to distribute sections of code, whereas Copernicus distributes entire executable programs.


    Chapter 5

    The API

    5.1 Design

Two fundamentally different design approaches were considered: function calls and annotations. In the case of function calls, the user would need to move the part of the code that needs to be distributed into a function and add calls to our API functions together with the needed arguments. The arguments would be a function pointer together with a list of the data that needs to be distributed. The function pointer points to the function that is going to be called for each distributed work item, and each element in the argument list is the list of arguments for one distributed work item. An example of the function call design would look like this:

def myFunc(args):
    #do a lot of work...
    return something

args = [[arg1], [arg2], ... , [argN]]

if not COPERNICUS:
    # Conventional way
    retValueList = []
    for arg in args:
        retValueList.append(myFunc(arg))
else:
    # The new API way
    retValueList = call_to_our_api(myFunc, args)

Listing 5.1: API Design


When a user runs the example script above through Copernicus, everything that is needed for a Copernicus project will be created automatically, the script will be executed, the function myFunc will be distributed to the workers, and each return value from the workers will be stored in the retValueList variable.

The annotation design would use preprocessor directives like the OpenMP annotations in C++, i.e. #pragma omp, where the programmer can add an annotation above the section that needs to be run in parallel[20]. If the compiler has support for OpenMP, it will make that section of code run in parallel for that specific environment and CPU architecture. If the compiler does not have support for OpenMP, the annotations will be ignored and the program will run sequentially. With this approach the user would only need to add such an annotation right before the section of code that needs to be distributed. In Python the closest equivalent is decorators, but they cannot be applied to sections of code and are therefore limited to functions only. This makes it unreasonable to use annotations for the design.

    5.1.1 Considerations

There were many considerations made before the implementation started. This section lists the most important of them.

1. The user should have to change her code as little as possible to make it run on Copernicus.

2. The user might call our API functions multiple times in her code, so the user script must wait until all jobs on the workers are done before continuing to execute the rest of the code.

3. The script should not run on the client computer, because the job might take a long time to complete and the user might want to shut down the client computer.

4. The Copernicus module should be as general as possible, i.e. able to handle an arbitrary number of input arguments, data types and executables.

5. Most likely, the server and workers will not have all required dependencies; therefore these need to be copied both to the server and to all workers.

    5.1.2 First attempt: Module generator

This section briefly describes the module generator design. The idea is that instead of creating all the module files manually, our API creates all the needed files and executes the Copernicus commands by running a generated script. The user script is started normally, as the user does when she runs it on a single machine. The user adds calls to our API wherever the distribution of a function is needed. Our API function searches for all dependencies of that specific function and serializes and dumps the function together with all its dependencies. The API function then analyses the input data and generates all the needed


Copernicus module files: the _import.xml file, the Python script, the plugin script and a bash/shell script for:

- Creating a Copernicus project
- Importing the module
- Creating instances
- Setting the input values

There are multiple challenges with this design approach. The first problem is that when the user script is executed, each call to the API functions generates a new Copernicus module that has its own specific name, number of inputs/outputs and their specific types. While this actually works, it is not very practical for keeping a good overview of the project. The second problem is that the user has to manually copy the generated plugin script, which is specific to each call to the API functions.

While this design was not general enough to handle all kinds of Copernicus workflows, it helped us gain a lot of knowledge about the internal design of Copernicus and influenced the final design. Some code for handling dependencies and test code for creating a Copernicus project were also reused in the final design.

    5.1.3 Final attempt: Generic module

The generic module was designed and redesigned multiple times, but the final design turned out to be quite simple. It has two main Copernicus module functions: one that starts the user's script (called mainRunner) and one that is created for each call to our API functions, i.e. for each function distribution (called subRunner). The mainRunner gets its first inputs from the client when a project is created. The inputs are the name of the script that calls our API functions and a tarball (an archive that contains a set of files) with all its dependencies. The outputs are the standard output and standard error streams and a tarball containing all the files that were created during the execution. The user script is started in the mainRunner as a sub-process; this way the mainRunner does not block the whole project.


Figure 5.1: Module design showing the subRunner instances inside the mainRunner. Multiple subRunner instances are created for each call to the API functions.

When a call to our API occurs in the script, new instances of the subRunner are created and the execution of the script is halted. Each subRunner gets a dump of the function that needs to be distributed, together with the specific arguments for that job. The outputs of each subRunner are then connected to the sub-inputs of the mainRunner. In this way the mainRunner can collect the output from each worker, reassemble the outputs into a list and return it back to the script, which then continues its execution. When the whole execution of the script is done, the mainRunner collects the final outputs and makes a tarball out of them.


Figure 5.2: A flowchart of a user script running inside Copernicus using the generic module API. The Communication Server and the user script are executed in separate threads from the mainRunner.

One thing that was deliberately left out of the description of the execution scheme above is how the user script waits for the list of outputs from the workers. The problem is that, while the user script is running in a separate process, there is by design no support in Copernicus for the script to communicate with a specific project and add new jobs. The function that adds new jobs, i.e. creates new instances, should always be called inside the Copernicus scope. This is solved by using inter-process communication (IPC); the actual implementation is described in the next section.

When a call is made to our API from the user script, the function and the arguments are dumped and a signal is sent through the IPC, which tells the server that there are now jobs that need to be created. The jobCreator function, which is started from the mainRunner and waits for a signal, receives the signal from the server and loads the dumped function and its arguments. It then creates a list of instances of the subRunner and connects their outputs to the sub-inputs of the mainRunner. The jobCreator function returns after the job creation. As mentioned earlier, this is crucial for the Copernicus project not to be blocked. After the jobCreator and mainRunner have returned, the subRunner function is called by Copernicus, and the jobs will be put on the Copernicus job queue for the workers


to fetch and run.

On the workers, the serialized function and its list of arguments are loaded and executed. After the job is done, the return value is serialized. It is then compressed into a tarball, together with all the new files that might have been created by the distributed function, and sent back to the Copernicus server.

As mentioned in section 3.2, each time a change is made to a module function instance, that function is executed by Copernicus. In this case, the change is the workers sending back their results. When the results from all workers have been sent back, they are deserialized and collected into a list. The list is then serialized and dumped, and a signal is sent back to the API function that is currently waiting and blocking the user script.

When the blocking API function gets the signal through the IPC, it loads the serialized list and returns it to the user script. The script then continues its execution, and a new call to our API function is possible.

When the user script is done executing, the stdout and stderr streams, together with a tarball that includes all newly created files, will be ready for the user to fetch.

    5.2 API implementation

This section goes through each part of the implementation of the API design. Since the whole Copernicus project is implemented in Python, this implementation is also in Python.

The main challenge is that, since the function myFunc is executed on the workers, all of its dependencies might not be available on those machines. The API must handle this by recursively inspecting all the underlying dependencies and copying them to all the workers. This is solved by using a Python package called snakefood[21].

For the implementation of the IPC, Unix Domain Sockets (UDS) are used, which provide a socket-like API where the communication goes through a file instead of an IP address[22].


Figure 5.3: The API function and the job creator function communicate through the communication server using UDS.

For passing data between the user script and the Copernicus module functions, UDS was not used. The data is instead serialized and dumped to the hard drive using the internal Python object serialization module, called marshal[23].
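A minimal sketch of how these two mechanisms can be combined is shown below. The socket path, dump file name and message format are illustrative assumptions for this sketch, not the actual Copernicus implementation:

import marshal
import socket
import types

SOCKET_PATH = "/tmp/hapi.sock"        # hypothetical socket path
DUMP_PATH = "/tmp/hapi_call.dump"     # hypothetical dump file

def dump_call(func, args):
    # marshal can serialize a function's code object and plain Python data.
    with open(DUMP_PATH, "wb") as f:
        marshal.dump(func.__code__, f)
        marshal.dump(args, f)

def load_call():
    # Rebuild a callable from the dumped code object and read back its arguments.
    with open(DUMP_PATH, "rb") as f:
        code = marshal.load(f)
        args = marshal.load(f)
    return types.FunctionType(code, globals()), args

def signal_new_jobs():
    # Tell the job creator, through a Unix domain socket, that a dump is ready,
    # then block until it answers; this blocking is what keeps the user script waiting.
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect(SOCKET_PATH)
    s.sendall(b"new jobs")
    reply = s.recv(4096)
    s.close()
    return reply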


    Chapter 6

    Results

The final result of this thesis is a general design of an API for the Copernicus computing project, together with an implementation of the design in Python. The API radically simplifies the modifications needed for a user's Python script to run on the Copernicus computing platform. The user no longer needs to create customized Copernicus modules, but only calls the API function, which we call hapi_map, and which handles everything that is needed for the given function to be distributed on the Copernicus platform. A full comparison between the two APIs is given in section 6.2, where some further conclusions are drawn.

    6.1 MD5 cracker with the new API

In this section the result of distributing the MD5 Cracker application on the Copernicus platform is presented. The goal is to show how the new API is used in an application. Two test systems are used to measure the results and to find the threshold for how small the jobs can be while still obtaining a speedup when distributing an application with the new API. The speedup is the time it takes to execute an application in sequential mode divided by the time it takes to execute the same application in parallel mode:

    Speedup = T1 / TP    (6.1)
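For example, for the Opteron system in Table 6.1, T1 = 2733 s with a single worker and TP = 167 s with 24 workers, which gives a speedup of 2733/167 ≈ 16.4, the value listed in Table 6.2.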


The two test systems run Ubuntu GNU/Linux 12.04 with Python 2.7.3. Both have x86-64 CPUs, but from different vendors and thus with different architecture designs. The following code shows how to run the application in parallel by splitting the work into two jobs. The same concept is used to split the work into 24, 48 and 96 smaller jobs. Different plaintext lengths, 4 and 6 characters, are also used in order to analyze how the API implementation scales for small and large jobs.

import hashlib, string, itertools

def bruteforce(job):
    #code omitted

job1 = [6, "453e41d218e071ccfb2d1c99ce23906a", "aaaaaa", "mzzzzz"]
job2 = [6, "453e41d218e071ccfb2d1c99ce23906a", "naaaaa", "zzzzzz"]
jobs = [job1, job2]

# This is the new distributed way.
from cpc.lib.execPythonModule.hapiModule import hapi_map
results = hapi_map(bruteforce, jobs)

Listing 6.1: md5cracker.py


Figure 6.1:

Number of workers    Duration (sec) for Opteron    Duration (sec) for Xeon
1                    2733                          1707
2                    1771                          879
8                    438                           231
24                   167                           150
64                   177                           152

Table 6.1: The duration for 24 jobs and 6 characters.

Number of workers    Speedup for Opteron    Speedup for Xeon
1                    1.00                   1.00
2                    1.54                   1.94
8                    6.24                   7.39
24                   16.37                  11.38
64                   15.44                  11.23

Table 6.2: The speedup for 24 jobs and 6 characters.


Figure 6.2:

Number of workers    Duration (sec) for Opteron    Duration (sec) for Xeon
1                    6800                          4701
2                    3626                          2435
8                    1031                          634
24                   365                           399
64                   222                           397

Table 6.3: The duration for 96 jobs and 6 characters.

Number of workers    Speedup for Opteron    Speedup for Xeon
1                    1.00                   1.00
2                    1.88                   1.93
8                    6.60                   7.41
24                   18.63                  11.78
64                   30.63                  11.84

Table 6.4: The speedup for 96 jobs and 6 characters.


Figure 6.3:

Number of workers    Duration (sec) for Opteron    Duration (sec) for Xeon
1                    4                             3
2                    21                            18
8                    6                             5
24                   4                             3
64                   8                             5

Table 6.5: The duration for 24 jobs and 4 characters.

Number of workers    Speedup for Opteron    Speedup for Xeon
1                    1.00                   1.00
2                    0.19                   0.17
8                    0.67                   0.60
24                   1.00                   1.00
64                   0.50                   0.60

Table 6.6: The speedup for 24 jobs and 4 characters.


Figure 6.4:

Number of workers    Duration (sec) for Opteron    Duration (sec) for Xeon
1                    5.9                           4.7
2                    40                            37
8                    10                            10
24                   8                             7
64                   11                            13

Table 6.7: The duration for 48 jobs and 4 characters.

Number of workers    Speedup for Opteron    Speedup for Xeon
1                    1.00                   1.00
2                    0.15                   0.13
8                    0.59                   0.47
24                   0.74                   0.67
64                   0.54                   0.36

Table 6.8: The speedup for 48 jobs and 4 characters.


From the results above we can clearly see that when the computation time for a single job is very short, the network data transfer and communication become the bottleneck. We can also see that when the number of workers on a system is greater than the number of jobs, there is not only no additional speedup; it sometimes even results in a performance decrease. The reason for this is that the server is being overloaded by a lot of workers asking for jobs.

Comparing the first two charts, we can see that as long as there are enough CPUs and enough jobs available for them, there is a close to linear speedup.


    6.2 Comparison

In this section the usage of the previous Copernicus API and the new one are compared, with focus on the user's perspective. The MD5 Cracker application described in section 3.3 is used:

# This function is supposed to be distributed.
def bruteforce(args):
    # Code omitted.

# Code omitted.

# Calling the brute force code.
result = bruteforce(args)

Listing 6.2: md5cracker.py

To run the application in a distributed way through Copernicus, the user would first of all have to define a Copernicus module. The creation of a module in Copernicus has already been described in section 3.2; it requires access to the Copernicus server and several hundred lines of code. Since the bruteforce function is meant to be distributed and Copernicus does not support function distribution, md5cracker.py must be rewritten. The next step is to copy the md5cracker.py file and all necessary dependencies to all workers.

With the newly implemented API, the same application is distributed as follows. The sections of code that are meant to be distributed must call the new API function (hapi_map), as in the following code:

# This function is supposed to be distributed.
def bruteforce(args):
    # Code omitted.

# Code omitted.

# This tells Copernicus to distribute the bruteforce function.
results = hapi_map(bruteforce, listOfArgs)

Listing 6.3: md5cracker.py

Now the user is only required to run a single command to get the job done, i.e. to distribute the bruteforce function.


Besides freeing the user from the burden of writing several hundred lines of code to create a module, the new API brings more functionality to Copernicus:

- Copernicus is now able to distribute sections of code and not only entire applications. This means that the sequential sections of the code can run locally on the server and only those sections that are meant to run on the workers will be distributed.

- There is no longer any need to install or copy any files to the workers manually; all of that is handled by the API functions. With the previous Copernicus API, the user was required to copy the application and its dependencies to all workers.

- The user is no longer required to modify the Copernicus server to create Copernicus modules. The main advantage of this is that the system administrator no longer has to give users access to the server, which in turn makes Copernicus more secure.

- There is no need to have any prior knowledge about Copernicus, how it works or its internal design. A user only needs to learn a single new, but simple, function call.


    Chapter 7

    Discussion

During the implementation of the different designs, many unforeseen problems were encountered. Some of them could not be solved without changing the Copernicus design, while others needed more consideration than they were initially given. For example, in the first design attempt we did not know that when Copernicus calls a module function, that function must return before any other function inside that project can be called from Copernicus. Another challenge was to decide where to run the user script and how to handle the communication between the user script and Copernicus. Since Copernicus distributes the jobs to the workers, and the results from the workers need to be returned to the user script, the user script must be blocked inside the API function. If the script were executed on the client machine, a result-fetching function would be needed on the client side of Copernicus. This design would add overhead to the workflow of Copernicus and was therefore abandoned.

Most of our shortcomings during the design and implementation were due to the lack of documentation on how to use Copernicus and on its internal design. Had it not been for all the fruitful discussions with Iman Pouya, Sander Pronk and Patrik Falkman, this project would probably not have been possible within the given time frame.

    7.1 Future work

While the current design is general enough for a range of different domains, it lacks two fundamental functionalities: distributing data files efficiently and distributing executable binaries. These two functionalities are so common in distributed computing that the lack of them would make the API unusable in practice. For such an implementation in the future, the designer might need to consider how to handle binaries that are compiled for different architectures. This is an important point, since one big advantage of Copernicus compared to other distribution platforms is that it is cross-platform.

For efficient file distribution, a simple solution would be for the user to give the name or path of a file or a folder as an argument when running the Copernicus exec-py command. A fundamentally different approach would be to design something more like how Hadoop handles the distribution of files: a large cluster of computers in a distributed file system with support for redundancy and network locality. Since this second approach is considerably more complex, the first approach would be a good starting point; a sketch of what it could look like from the user's perspective is given below.
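The following is a minimal sketch of the first approach, seen from the user script. The optional input_files parameter of hapi_map is a hypothetical extension and does not exist in the current API; the idea is simply that the API, rather than the user, would copy the listed files to every worker before the function runs there.

# Hypothetical extension: the API stages wordlist.txt on every worker
# before bruteforce is executed, so the function can open it as a local
# file. Neither input_files nor the staging behaviour exists today.
def bruteforce(args):
    # args could, for example, identify the slice of the wordlist that
    # this worker should try (illustrative only).
    start, end = args
    with open("wordlist.txt") as f:
        candidates = f.read().splitlines()[start:end]
    # Hash the candidates and compare against the target (omitted).
    return []

results = hapi_map(bruteforce, listOfArgs, input_files=["wordlist.txt"])

Behind the scenes such a parameter could, for instance, reuse the same channel that already transfers the function and its arguments to the workers.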

One of the requirements for this project was that the design should be general enough to be implemented in multiple languages. While the current implementation only supports Python code, the design is fully portable to other similar dynamic programming languages. For other common languages such as C/C++ and Java, the user would have to either build binaries of each function and use a Python wrapper to call them, or compile the code as a shared/dynamic library and call the functions through a Python wrapper, as sketched below. In both cases, the support for running executable binaries, explained above, must be implemented.
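As an illustration of the shared-library route, the sketch below wraps a C function in a thin Python function that Copernicus could then distribute like any other. The library name libbrute.so and the signature of bruteforce_c are invented for the example, and the library would still have to be present on every worker, which is exactly the binary-distribution gap discussed above.

import ctypes

# Load a shared library compiled from the user's C code (hypothetical name).
_lib = ctypes.CDLL("./libbrute.so")
_lib.bruteforce_c.argtypes = [ctypes.c_char_p]
_lib.bruteforce_c.restype = ctypes.c_int

def bruteforce(args):
    # Thin Python wrapper: the heavy computation happens in C, but from
    # the point of view of hapi_map this is an ordinary Python function.
    target_hash = args[0]
    return _lib.bruteforce_c(target_hash.encode("utf-8"))

results = hapi_map(bruteforce, listOfArgs)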

Another future addition that would be of interest to a Copernicus user is so-called reducer functionality. At the moment there is only one API function, hapi_map; it is a mapper function that takes two arguments: the function that the user wants to run on each worker and a list of arguments, one per worker. Support for reducer functionality could either be added as a collection of standard reducers built into the API, or as user-defined reducer functions. The user would simply pass a function reference, or an identifier for one of the standard reducers, as a third argument to hapi_map, as in the sketch below.
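A possible shape of a user-defined reducer is sketched here. The three-argument form of hapi_map is an assumption; today the function only accepts the mapper and the argument list, and the reduction has to be done by hand on the returned list.

def bruteforce(args):
    # Returns a list of cracked passwords found by this worker (omitted).
    return []

def first_hit(worker_results):
    # Hypothetical user-defined reducer: flatten the per-worker lists and
    # return the first password found, or None if no worker succeeded.
    for result in worker_results:
        if result:
            return result[0]
    return None

# Hypothetical call with the reducer as a third argument.
password = hapi_map(bruteforce, listOfArgs, first_hit)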

    7.2 Conclusion

With the new world of cloud computing and the birth of quantum computing [24], we can see that the demand for computationally intensive tasks is rapidly increasing.

We have in this master thesis managed to design and implement an easy-to-use API that consists of a single function call. It not only simplifies the usage of the Copernicus API but also adds new functionality. Users can now distribute sections of their code to run in a distributed manner with a single function call. We have also shown where the minimum limits for job sizes lie in Copernicus when using our API. Finally, we have listed many improvements that can be made both to this API design and to the Copernicus project as a whole. By giving Copernicus this boost, we believe that the whole field can benefit from it.

    Bibliography

[1] Wikipedia. Computer simulation. http://en.wikipedia.org/wiki/Simulation#Computer_simulation, May 2013.

[2] Wikipedia. Message passing interface. http://en.wikipedia.org/wiki/Message_Passing_Interface, May 2013.

[3] Wikipedia. Remote procedure call. http://en.wikipedia.org/wiki/Remote_procedure_call, May 2013.

[4] Wikipedia. Java remote method invocation. http://en.wikipedia.org/wiki/Java_remote_method_invocation, May 2013.

[5] Sander Pronk, Per Larsson, Iman Pouya, Gregory R. Bowman, Imran S. Haque, Kyle Beauchamp, Berk Hess, Vijay S. Pande, Peter M. Kasson, and Erik Lindahl. Copernicus: a new paradigm for parallel adaptive molecular dynamics. In Proceedings of the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC '11, pages 60:1-60:10, New York, NY, USA, 2011. ACM.

[6] Intel. Why parallel processing? Why now? What about my legacy code? http://software.intel.com/en-us/blogs/2009/08/31/why-parallel-processing-why-now-what-about-my-legacy-code, August 2009.

[7] Wikipedia. Distributed computing. http://en.wikipedia.org/wiki/Distributed_computing, May 2013.

[8] Jeffrey Dean and Sanjay Ghemawat. MapReduce: simplified data processing on large clusters. Commun. ACM, 51(1):107-113, January 2008.

[9] Google Research. MapReduce: simplified data processing on large clusters. http://research.google.com/archive/mapreduce.html, May 2013.

[10] Adam L. Beberg, Daniel L. Ensign, Guha Jayachandran, Siraj Khaliq, and Vijay S. Pande. Folding@home: lessons from eight years of volunteer distributed computing. In Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing, IPDPS '09, pages 1-8, Washington, DC, USA, 2009. IEEE Computer Society.

[11] Wikipedia. Dictionary attack, a technique for defeating a cipher or a hash to determine its decryption. http://en.wikipedia.org/wiki/Dictionary_attack, May 2013.

[12] Wikipedia. Rainbow tables, a precomputed table for reversing cryptographic hash functions. http://en.wikipedia.org/wiki/Rainbow_table, May 2013.

[13] Stanford University. Protein folding simulation software. http://folding.stanford.edu/, May 2013.

[14] Wikipedia. Protein folding. http://en.wikipedia.org/wiki/Protein_folding, May 2013.

[15] Wikipedia. Copernicus and Folding@home's Markov state model. http://en.wikipedia.org/wiki/Folding@home#Biomedical_research, May 2013.

[16] PiCloud. A distributed computing service. http://www.picloud.com/, May 2013.

[17] Wikipedia. Apache Hadoop, software framework that supports data-intensive distributed applications. http://en.wikipedia.org/wiki/Apache_Hadoop, May 2013.

[18] Tevfik Kosar. Data Intensive Distributed Computing: Challenges and Solutions for Large-scale Information Management. Information Science Reference - Imprint of: IGI Publishing, Hershey, PA, 1st edition, 2011.

[19] Wikipedia. Techila, a distributed computing platform. http://en.wikipedia.org/wiki/Techila_Grid, May 2013.

[20] Wikipedia. OpenMP. http://en.wikipedia.org/wiki/OpenMP, May 2013.

[21] Martin Blais. Snakefood, a Python dependency analyzer. http://furius.ca/snakefood/, May 2013.

[22] Wikipedia. Unix domain socket. http://en.wikipedia.org/wiki/Unix_domain_socket, May 2013.

[23] Python. Read and write Python values in a binary format. http://docs.python.org/2/library/marshal.html, May 2013.

[24] Ars Technica. Google buys a D-Wave quantum optimizer. http://arstechnica.com/science/2013/05/google-buys-a-d-wave-quantum-optimizer/, May 2013.

[25] Wikipedia. Embarrassingly parallel workload. http://en.wikipedia.org/wiki/Embarrassingly_parallel, May 2013.

[26] Xavier Vigouroux. What designs for coming supercomputers? In Proceedings of the Conference on Design, Automation and Test in Europe, DATE '13, pages 469-469, San Jose, CA, USA, 2013. EDA Consortium.

[27] Wikipedia. Peer-to-peer, a distributed application architecture that partitions tasks or workloads between peers. http://en.wikipedia.org/wiki/Peer-to-peer, May 2013.


    Appendix A

    Link to source code

Link to Copernicus computing: http://copernicus-computing.org/