An Application Programming Interface for HPC



    An Application Programming Interface for High

    Performance Distributed Computing

    For the Copernicus computing project

    ASHKAN JAHANBAKHSH

    HANIF FARAHMAND MOKARREMI

    Master of Science Thesis, KTH

    Supervisor: Iman Pouya, Erik Lindahl

    Examiner: Patrik Henelius

    Stockholm, Sweden 2013


    Abstract

This master thesis was performed at Lindahl Lab at Science for Life Laboratory (SciLifeLab), located in Stockholm, Sweden.

The need for compute resources is increasing, and the speed of a single computer is not enough for data intensive computations. Distributed computing has been developed to improve the computation of such tasks by distributing them to other machines. Many platforms have been implemented for this purpose. Copernicus is one such platform, providing the ability to distribute computationally intensive tasks. The current Copernicus API handles only the distribution of entire applications; there is no support for distributing sections of code in Copernicus.

In this paper we provide a general API design, and a Python implementation of it, for distributing sections of code on Copernicus. The implementation of the API handles only Python code, but it is possible to extend it to other languages with the help of Python wrappers. The API also lowers the learning threshold for a new Copernicus user, especially for those who are not computer scientists or who have little knowledge of programming.


Referat

API för distribuerade beräkningar

This master thesis was carried out at Science for Life Laboratory (SciLifeLab), at the Lindahl Lab division, located in Stockholm.

The need for computing resources is growing, and the speed of a single computer is not sufficient for data intensive computations. Distributed computing has been developed to improve the computation of such work by distributing it to other machines. There are many platforms that have been implemented for this purpose. Copernicus is one such platform, which provides access to the distribution of computationally intensive work. The current Copernicus API handles only the distribution of a program as a whole, and there is no support for distributing parts of code in Copernicus.

In this report we provide a general API design, and its implementation in Python, for distributing parts of code in Copernicus. The implementation of the API handles only Python code, but it is possible to extend it to other languages with the help of Python wrappers. This also lowers the learning threshold for a new Copernicus user, especially for those who are not computer scientists or who have little knowledge of programming.


Acknowledgements

We would like to thank everyone who helped us during this master thesis. A special thanks goes to Iman Pouya, our supervisor, who guided us in the right direction to reach our goal. Special thanks also go to Sander Pronk and Patrik Falkman, who gave us the opportunity to discuss our problems with them during the work and provided much valuable feedback. We also thank Professor Erik Lindahl, our examiner at SciLifeLab, who let us do this master thesis.

Finally, we would like to thank our friends, especially Ali Mehrabi and Hannes Salin, who took the time to read our report and gave us valuable feedback.


    Contents

1 Introduction
1.1 Problem statement
1.2 Methodology

2 Parallelization and Distributed computing
2.1 Parallelization
2.2 Distributed computing
2.3 MapReduce
2.4 Advantages and Disadvantages

3 Copernicus
3.1 Copernicus design
3.2 Copernicus module
3.3 Copernicus module example
3.3.1 Defining the _import.xml file
3.3.2 Defining the runner.py file
3.3.3 Defining the executable.xml file
3.3.4 Adding jobs to Copernicus

4 Related work
4.1 Folding@home
4.2 PiCloud
4.3 Hadoop
4.4 Techila

5 The API
5.1 Design
5.1.1 Considerations
5.1.2 First attempt: Module generator
5.1.3 Final attempt: Generic module
5.2 API implementation

6 Results
6.1 MD5 cracker with the new API
6.2 Comparison

7 Discussion
7.1 Future work
7.2 Conclusion

Bibliography

Appendices

A Link to source code


    Chapter 1

    Introduction

Today, many scientists run a lot of computationally intensive simulations[1]. These are mostly used in the academic world, in biology, chemistry and physics, but also in commercial industry, e.g. the automotive industry[1]. These simulations are usually done for several reasons:

- The problem is too complex to solve analytically
- It is too expensive (in money or time) to solve it in the real world

To be able to do a good simulation of a real world application, one must build a model that can represent that application. For example, in a car crash simulation it could be too expensive and time consuming for a company to destroy hundreds of cars to get a representative statistical result of how a possible real car crash would turn out. Instead the company can use simulations to minimize the number of real car crash tests, thus saving both time and money.

    Figure 1.1: A snapshot of a car crash simulation, source: BMW.


These simulations are usually computationally intensive, so it would take a long time to do the processing on a single workstation. Therefore the simulations are usually done on computational resources equipped with hundreds to millions of CPU cores. These computational resources could either be a cluster of computers that are interconnected with a high performance link, or a cluster of smaller computers placed all over the world and connected through the Internet. The former kind of cluster (also called a supercomputer) is usually very large. Supercomputers are specially built to be energy efficient, but they still consume a lot of energy and emit huge amounts of heat that need to be dissipated. They are therefore located in areas with a lot of space, needed not only for the computers but also for the cooling system and electricity.

Figure 1.2: A supercomputer, source: NASA.

On top of the hardware, the user must implement a mechanism for the application to communicate across all computers and clusters. There is support for such a mechanism both in the application layer and in a lower programming language layer. In the programming layer there are several language APIs, such as the Message Passing Interface (MPI)[2], Remote Procedure Call (RPC)[3] and Remote Method Invocation (RMI)[4]. The programmer needs to handle the communication between the computers, fault tolerance and the available resources.

Many applications have been created to simplify the language layer and make it more abstract. While these applications might be very different in design and usage, most of them share some fundamental functionality, such as handling the communication between computers, resource management and fault tolerance.

In the Related work chapter, a number of such tools will be briefly explained, and in the next section another tool is presented together with a problem statement.


    1.1 Problem statement

As explained in the previous section, there are many software tools that aim to simplify the distribution of work. One of these tools is called Copernicus[5], which has been developed mainly at the Royal Institute of Technology (KTH) in cooperation with Virginia and Stanford universities. Copernicus is a peer-to-peer (P2P) platform for distributed computing. It connects heterogeneous computing resources and allows them to be utilized by defining a problem as a workflow. This means that users can focus on formulating their problem and not worry about the parallel work distribution and fault tolerance.

It is designed primarily for molecular dynamics but can in practice be used for any computationally intensive work that can be run in a distributed manner.

Figure 1.3: A molecular simulation of a protein folding, source: Stanford University.

The current structure of Copernicus allows binaries, custom programs and scripts to be used in a workflow. However, this is a monolithic approach, and in certain use cases one needs more granular control over the work distribution. The workflow creation part of Copernicus is very powerful, but it is also quite time consuming for a new Copernicus user or for users with little knowledge of programming. In some use cases it is utterly impractical to design a workflow just to distribute some work on the Copernicus platform.

One such use case is when only some part of a code base needs to be run in a distributed manner. In such a use case, the current Copernicus design would force the user to change the code dramatically to make it run on the Copernicus platform. Each time the user changes the code, the corresponding changes must be made again. While this is really time consuming and even frustrating for a code base with a single developer, it is even more frustrating and impractical for a code base with multiple developers. This is a high entry barrier, which is probably one of the reasons for the small user base of Copernicus.

The goal of this project is to define an API that users can use when they want to mark sections of their code to be distributed in parallel on the Copernicus platform.


    1.2 Methodology

The Copernicus project is a large free software project. It lacks a flow chart and a UML diagram that explain its design decisions, and its user API documentation was also under construction at the start of this thesis. From a developer standpoint these are two huge drawbacks. This means that not only did we need to learn how to use Copernicus, but we also had to read its code and work out its current design, structure, strengths and weaknesses. However, an introduction to how to use Copernicus was given by our supervisor.

Because of the reasons above, we decided to first get familiar with the usage of Copernicus. When a full understanding of its usage was obtained, literature studies were done in the area of High Performance Computing (HPC) and similar applications, most of which have been mentioned in the previous sections. The reason for this was to get good knowledge of what was already out there and how it is used.

After the literature studies were done, we implemented a simple distributable application that used Copernicus to compute some tasks. The goal was to understand and get an overview of how Copernicus works and get used to the available Copernicus commands. After that, we iteratively implemented a simple version of the API. This version was only intended to get a flow in the development and an understanding of the Copernicus design. When there is a working flow, it is much easier to both understand and change each section of the implementation.

    The methodology of this thesis can be described by the following flow chart:

    Figure 1.4: The methodology of this thesis.


    Chapter 2

Parallelization and Distributed computing

    2.1 Parallelization

The clock speed of processors will no longer increase significantly, because higher clock speeds require more voltage, which generates almost exponentially more heat[6]. Instead, processor manufacturers use the new transistors to add multiple processor cores to each chip[6]. To use the power of these multiple cores, programs have to be run in parallel. Parallel processing is a way of computing in which a large problem is divided into smaller, independent tasks and all tasks are computed concurrently, each on a separate core, usually on a single computer. In other words, parallel processing is the use of two or more processor cores at the same time to solve a single problem that can be divided into sub-problems.

    Figure 2.1: A visualization of parallelization of a problem.
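As a minimal illustration of this idea (plain Python on a single machine, not Copernicus code), the following sketch divides a toy problem into independent sub-problems and computes them concurrently on the local cores with the standard multiprocessing module:

from multiprocessing import Pool

def solve_subproblem(n):
    # One independent sub-problem: here just a toy computation.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    tasks = [10**6, 2 * 10**6, 3 * 10**6, 4 * 10**6]
    pool = Pool()                            # one worker process per CPU core by default
    partial_results = pool.map(solve_subproblem, tasks)
    pool.close()
    pool.join()
    print(sum(partial_results))              # merge the sub-results into one answer

The essential requirement is the same as in the text above: the sub-problems must be independent of each other so that they can be computed in any order and in parallel.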


    2.2 Distributed computing

One way to compute computationally intensive tasks in a reasonable time frame is to use a supercomputer, i.e. a powerful high performance computer that consists of many compute nodes (a compute node is simply a single machine in a cluster or a network). In a supercomputer, tasks can take advantage of the huge number of nodes and thus be computed in a parallelized way.

Another way to compute computationally intensive tasks in a reasonable time frame is to set up a number of computers in a network and use their resources to compute tasks. Distributed computing[7] is a field in computer science which solves this by dividing a large problem into smaller parts, sending them to many computers in a network to solve, and then merging the sub-solutions into a solution for the whole problem.

Figure 2.2: A visualization of distribution of a problem.


    2.3 MapReduce

MapReduce is a programming model that allows developers to process large data sets in a distributed way[8]. There are two key types of functions in the MapReduce framework: the Map function and the Reduce function. The job is separated into sub-problems which are processed by the mappers. The outputs of the mappers are sent to the reducers, where they are collected into one result. While the idea of having a mapper function and a reducer function is quite old and widely used[9], the name MapReduce was first introduced in a paper on the subject published by two Google engineers.
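As a rough single-machine sketch of the model (word counting is the classic example; this is plain Python, not Copernicus or Hadoop code), the mappers emit key/value pairs and the reducers collapse all values for a key into one result:

from collections import defaultdict

def mapper(chunk):
    # Map step: emit (key, value) pairs for one sub-problem.
    return [(word, 1) for word in chunk.split()]

def reducer(key, values):
    # Reduce step: collect all values for one key into a single result.
    return key, sum(values)

chunks = ["to be or not to be", "to compute is to distribute"]
grouped = defaultdict(list)
for pairs in map(mapper, chunks):        # each mapper could run on a different machine
    for key, value in pairs:
        grouped[key].append(value)       # shuffle: group values by key
result = dict(reducer(k, v) for k, v in grouped.items())
print(result)                            # e.g. {'to': 4, 'be': 2, ...}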

Figure 2.3: A visualization of the MapReduce programming concept. A problem is separated into sub-problems and sent to the mappers. When the mappers are done, their results are sent to the reducers, where they are collected and presented as a final output.


    2.4 Advantages and Disadvantages

The main advantage of a distributed system compared to a parallel computing system is its reliability. If one machine crashes, the remaining computers are unaffected and the system will still run as a whole, given that there is support for fault tolerance. In a parallel computing system, a failure of a CPU or other hardware may cause the whole system to stop. Another advantage of a distributed system is its flexibility: it is very easy to implement, install and debug new services, which in turn can be accessed equally by every client. Finally, there are many volunteers around the world who might want to contribute CPU resources to help scientists compute data-intensive tasks in a distributed manner, as in the Folding@home project[10].

One of the main disadvantages of distributed computing systems is troubleshooting and diagnosing them. The maintainer may be required to connect to remote machines or check whether the communication between the computers in the system is working. The network is another factor in the reliability of a distributed system: if the network is overloaded or there are problems with data transmission, the performance of the whole system will be affected.

Both parallel and distributed computing require one to modify the sections of code that are meant to be parallelized into independent tasks. This requires a good understanding of parallel and distributed computing.


    Chapter 3

    Copernicus

As mentioned in chapter 1, Copernicus is a highly scalable platform for the distribution of computationally intensive tasks, connected in a model that is called a workflow. The user can focus on formulating the problem as a workflow instead of spending time on handling the message passing and fault tolerance, which are crucial parts of distributed computing.

In Copernicus, a user can create a workflow of connected instances of functions that work as wrappers for running executables. For example, one might want to connect the output of function A to the input of function B, and thereafter connect the output of B to the input of function C. The idea of the Copernicus workflow design is that a user should be able to change the inputs of the module functions while the job is being executed.

Figure 3.1: An example of a simple workflow, with three connected functions in Copernicus.


In section 3.2 a more detailed explanation of a Copernicus module is presented.

Copernicus is also designed with scalability in mind. This means that the user can add new workers on demand, and the number of workers that can be in the cluster is practically unlimited.

The following sections describe the Copernicus design and the requirements for a project to be distributed on it.


    3.1 Copernicus design

The current version of Copernicus is designed around three different environments: client, server and workers. On the client machine the user starts a new project and issues all commands to the server. The server's job is to handle all commands coming from the client and send jobs to the workers. Copernicus commands are used as an interface between the client and the server, i.e. to start a project, set inputs, add jobs, receive outputs etc. The main purpose of the server is to distribute the computationally intensive work to the workers; the sequential part of the code should always run on the server. The server also handles persistency and fault tolerance in case a worker does not respond.

In order for Copernicus to scale dynamically, i.e. to add new workers to a cluster, it is designed such that the workers ask for jobs instead of the other way around. The workers tell the server what they are capable of computing, and the server then checks whether it has any matching job for that specific worker. If it has a job that matches the worker's capabilities, it will send the job to the worker. Otherwise it will let the worker know that it has no job at the moment. The worker will then wait for a given amount of time and ask the server again, simply because the server might have a job at a later time.

    Figure 3.2: An illustration of jobs delivered to workers from a Copernicus server.
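The pull-based scheme can be summarized with the following conceptual sketch (the server object and its methods here are illustrative assumptions, not the actual Copernicus interfaces):

import time

def worker_loop(server, capabilities, poll_interval=60):
    # Conceptual pull model: the worker announces what it can run and asks for work.
    while True:
        job = server.request_job(capabilities)   # hypothetical call to the server
        if job is None:
            time.sleep(poll_interval)            # no matching job yet; ask again later
            continue
        result = job.run()                       # execute the matched job locally
        server.send_result(job, result)          # return the output to the server

Because the workers initiate the contact, new workers can join at any time without the server needing to know about them in advance.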


In order to make Copernicus even more dynamic, it was designed around a P2P architecture[5]. Befriended servers can help each other with load balancing, i.e. a server asks another server whether it has unfinished jobs in its queue. The jobs can then be transferred to the other server so that the work is completed more efficiently.

Figure 3.3: An illustration of jobs being transferred to a trusted Copernicus server when the other Copernicus server has unfinished jobs in its queue.


    3.2 Copernicus module

To be able to connect different executables' inputs and outputs, files and different data types, the user needs to define a module. The module specifies all the wrapping functions and their corresponding executables, with the inputs/outputs that need to be connected. This is defined in an XML file called _import.xml.

Figure 3.4: This figure shows where all files must be located in order to create a Copernicus module and run an application on the Copernicus platform.

Another file that is needed for a working Copernicus module is a Python script. Each function definition in the _import.xml file needs to be implemented in a Python script. When Copernicus reads the _import.xml file, it knows which functions that specific module has and all the data types and inputs/outputs of each function. In the Python script, a user can connect instances of those functions and manipulate the outputs of one executable before setting them as the input of another executable.

Each time a change is made to a module instance, Copernicus will call the specific function with some arguments. The function must return, otherwise the project will be blocked. This is a crucial point to consider for the implementation of the API. Copernicus is designed in this way so that users can interact with a running project; for example, a user might want to change a value in the middle of a huge continuous project.


    3.3 Copernicus module example

This section covers the basics of creating a module in Copernicus through an example. After the module creation, adding jobs to the Copernicus job queue will be explained.

The aim of the example application, called MD5 Cracker, is to crack an MD5 hash, i.e. to find the plaintext that corresponds to a given MD5 hash. An MD5 hash is produced by a one-way hash algorithm. There are several techniques to crack these kinds of hashes, such as a brute force attack, a dictionary attack[11] and the use of rainbow tables[12]. The application uses the brute force technique, i.e. cracking by trying all possible combinations until the plaintext is found. In order to crack a single lowercase character (English alphabet), it takes a maximum of 26 tries, and to crack a two character text it takes a maximum of 26² tries. The exponential nature of the brute force attack makes it a computationally intensive task. The problem can easily be divided into sub-problems by defining a range for each task to compute. These two properties make this an ideal application to run on the Copernicus platform.

import hashlib, string, itertools, sys

def bruteforce(job):
    def validateWord(word, original_hash):
        return hashlib.md5("".join(word)).hexdigest() == original_hash

    def nextPermutation(FIRST_WORD_TUP, LAST_WORD_TUP, wSize):
        for x in itertools.product(string.ascii_lowercase, repeat=wSize):
            if x >= FIRST_WORD_TUP and x <= LAST_WORD_TUP:
                yield x

    # job = [plaintext length, md5 hash, start of range, end of range]
    wSize, original_hash, first, last = job
    for word in nextPermutation(tuple(first), tuple(last), int(wSize)):
        if validateWord(word, original_hash):
            return "".join(word)
    return None

if __name__ == "__main__":
    # Invoked by a worker as: md5cracker <length> <hash> <start> <end>
    print(bruteforce(sys.argv[1:]))

Listing 3.1: The contents of "md5cracker.py"


    3.3.1 Defining the _import.xml file

As already described in section 3.2, a file called _import.xml must be created, and it should be located on the Copernicus server.

[XML listing not reproduced here. It names the module MY_MODULE, described as "MD5 cracker", and defines one function, described as "crack a given hash and find the corresponding plaintext", with four inputs ("Length of the plaintext", "The hash string", "The start point", "The end point") and one output ("The output of md5cracker").]

Listing 3.2: The contents of "_import.xml"

    15

  • 8/13/2019 An Application Programming Interface for HPC

    24/53

    CHAPTER 3. COPERNICUS

In this XML example, the module is named MY_MODULE and it has one function. The function is named runner; it takes four input parameters and has one output. The inputs to the function are the total length of the plaintext, the hash value, and the starting and ending values of the search range. The output is the plaintext value, which will be returned in a file after it has been computed by the workers.

    3.3.2 Defining the runner.py file

    Under the controller tag in the _import.xml file there is a property called function.

    This property tells Copernicus that this file must be called runner.py.

import logging, cpc.command, cpc.util, os, shutil

log = logging.getLogger("cpc.lib.MY_MODULE")

def runner(inp):
    if inp.testing():
        return
    fo = inp.getFunctionOutput()
    persDir = inp.getPersistentDir()
    val1 = inp.getInput("num")
    val2 = inp.getInput("hash")
    val3 = inp.getInput("start")
    val4 = inp.getInput("end")
    fileExist = os.path.isfile(persDir + "/stdout")
    if not fileExist:
        for i in range(len(val1)):
            outputFiles = ["out.%d" % i]
            args = ["md5cracker", val1[i].get(), val2[i].get(),
                    val3[i].get(), val4[i].get()]
            cmd = cpc.command.Command(persDir, "MY_MODULE/runner", args,
                                      minVersion=cpc.command.Version("1.0"),
                                      addPriority=0, outputFiles=outputFiles)
            fo.addCommand(cmd)
    return fo

Listing 3.3: The contents of "runner.py"

This Python script will tell Copernicus to run the application md5cracker on the workers. The input values to the application are read from the module. After that, the internal Copernicus API function addCommand is executed to add the desired job to the Copernicus queue.


    3.3.3 Defining the executable.xml file

In order to let the Copernicus server know what functions a worker is capable of running, a file called executable.xml must be created. This file must be copied to all the workers that are meant to run the job.

[XML listing not reproduced here. It declares a single executable whose name property is set to MY_MODULE/runner.]

Listing 3.4: The contents of "executable.xml"

    Under the executable tag in this file there is a property called name. This property tells

    Copernicus that this particular worker is capable of running MY_MODULE/runner. As

    already mentioned in the previous sections, MY_MODULE is the name of the module

    and runner is its function.

    3.3.4 Adding jobs to Copernicus

When the module creation is finished, a project can be set up and jobs can be added to the Copernicus queue. In order to create a workflow and connect the inputs/outputs, some Copernicus commands must be executed. In this particular example, the hash value 95ebc3c7b3b9f1d2c40fec14415d3cb8, which represents the plaintext zzzzz, is brute forced by the application. For the sake of simplicity, the length of the plaintext in this example is known, which is 5. In a real case, it is not possible to know the length of the plaintext from the hash value.

//create a project
$cpcc start MY_CRACKER

//import the recently created module into the project
$cpcc import MY_MODULE

$cpcc transact

//create an instance of the job and name it "runner_1"
$cpcc instance MY_MODULE::runner runner_1

$cpcc activate

//set the length of the plaintext
$cpcc set runner_1:in.num[+] "5"

//set the hash value to be cracked
$cpcc set runner_1:in.hash[+] "95ebc3c7b3b9f1d2c40fec14415d3cb8"

//the start point
$cpcc set runner_1:in.start[+] "aaaaa"

//the end point
$cpcc set runner_1:in.end[+] "mzzzz"

//commit the first job
$cpcc commit

    Doing the same but with different arguments to add the second job:

$cpcc transact
$cpcc instance MY_MODULE::runner runner_2
$cpcc activate
$cpcc set runner_2:in.num[+] "5"
$cpcc set runner_2:in.hash[+] "95ebc3c7b3b9f1d2c40fec14415d3cb8"
$cpcc set runner_2:in.start[+] "naaaa"
$cpcc set runner_2:in.end[+] "zzzzz"

//commit the second job
$cpcc commit

Two jobs have now been added to the Copernicus queue to be run on the workers. If there are workers with available computational resources, the jobs will be fetched by them and the computation will start. When they are done with the computation, the results are sent back to the server. After all jobs in the queue are done, the project is considered finished by Copernicus.

While this is a very simple code example, it shows that creating a Copernicus module is a very time consuming process even for a simple distribution.


    Chapter 4

    Related work

In this chapter a number of related works are briefly explained, with their similarities/differences and advantages/disadvantages compared to Copernicus. The goal is to get a good insight into how other distributed platforms work and to get some inspiration before starting to design the new API.

    4.1 Folding@home

Folding@home (FaH)[13] is a distributed computing project with the goal of researching protein folding[14], i.e. predicting the 3D structure of a protein from its primary structure. Currently, there are more than 263,000 volunteers all around the world who contribute their computer resources to this project[13].

Copernicus is in fact highly influenced by the FaH design[15]. While most of the FaH code is proprietary software and is only used for protein folding, Copernicus is completely free software and is designed to do any kind of distributed computing[5].

    4.2 PiCloud

PiCloud[16] is a so called cloud computing service, a commercial web application that distributes computational work. Its API design is based on function calls: you define the functions that you want to run in a distributed manner, and then you call the PiCloud API functions to run them, check their progress and receive their return values. PiCloud can both run a function sequentially on the cloud and map the function so that it runs in parallel on a distributed system.

The big difference between the APIs of Copernicus and PiCloud is that the latter can distribute a single function and run it on the cloud, while in the current Copernicus API a single function


cannot be distributed, only a whole program. On the other hand, Copernicus is capable of running several applications, collecting outputs, connecting the inputs/outputs of each application to other ones, and distributing the desired jobs to the workers. PiCloud requires all data needed to compute the tasks to be available before the computation starts, whereas Copernicus can start a job and receive the data it needs during the computation.

    4.3 Hadoop

Hadoop[17] is a software framework for running applications on large clusters with support for large amounts of data. It is derived from MapReduce and the Google File System (GFS). It is free software, licensed under the Apache License 2.0, and it is widely used by large companies such as Facebook, Yahoo, Amazon.com, IBM, HP and others. While Hadoop was not mainly designed for computationally intensive work, it can certainly be used for it. Its core strength, however, is the Hadoop Distributed File System (HDFS)[18]. HDFS replicates data across the computers connected in the cluster, so that if one node goes down another node can take its place without any data being lost. Data intensive jobs can take advantage of the so called data localization system: Hadoop holds information on where each block of data is stored and, instead of transferring the data to the program, the program is transferred to where the data is located.

    4.4 Techila

Techila[19] is a commercial distributed computation platform that lets intensive computations be processed in a distributed way. It is meant to distribute sections of code, e.g. a for-loop, and it supports many languages such as Perl, Python, Matlab and C/C++. It is only capable of distributing embarrassingly parallel workloads, i.e. tasks that are completely independent and do not have any shared variables.

Techila is very similar to Copernicus in that both platforms distribute jobs to workers and the end user can receive the computed results from them. The difference between Techila and the current Copernicus API is that Techila is able to distribute sections of code, whereas Copernicus distributes entire executable programs.


    Chapter 5

    The API

    5.1 Design

Two fundamentally different design approaches were considered: function calls and annotations. In the case of function calls, the user would need to move the part of the code that needs to be distributed into a function and add calls to our API functions together with the needed arguments. The arguments would be a function pointer together with a list of the data that needs to be distributed. The function pointer points to the function that is going to be called for each distributed work item, and each element in the argument list is the list of arguments for one distributed work item. An example of the function call design would look like this:

def myFunc(args):
    #do a lot of work...
    return something

args = [[arg1], [arg2], ... , [argN]]

if not COPERNICUS:
    # Conventional way
    retValueList = []
    for arg in args:
        retValueList.append(myFunc(arg))
else:
    # The new API way
    retValueList = call_to_our_api(myFunc, args)

Listing 5.1: API Design


When a user runs the example script above through Copernicus, everything that is needed for a Copernicus project will be created automatically, the script will be executed, the function myFunc will be distributed to the workers, and each return value from the workers will be stored in the retValueList variable.

The annotation design would use preprocessor directives like the OpenMP annotations in C++, i.e. #pragma omp, where the programmer can add an annotation above the section that needs to be run in parallel[20]. If the compiler has support for OpenMP, it will make that section of code run in parallel for that specific environment and CPU architecture. If the compiler does not have support for OpenMP, the annotations will be ignored and the program will run sequentially. With this approach the user would only need to add such an annotation right before the section of code that needs to be distributed. In Python the closest equivalent is decorators, but they cannot be applied to sections of code and are therefore limited to functions only. This makes it unreasonable to use annotations for the design.

    5.1.1 Considerations

There were many considerations made before the implementation started. This section lists the most important of them.

1. The user should have to change her code as little as possible to make it run on Copernicus.

2. The user might call our API functions multiple times in her code, so the user script must wait until all jobs on the workers are done before continuing to execute the rest of the code.

3. The script should not run on the client computer, because the job might take a long time to complete and the user might want to shut down the client computer.

4. The Copernicus module should be as general as possible, i.e. able to handle an arbitrary number of input arguments, data types and executables.

5. Most likely, the server and workers will not have all required dependencies; therefore these need to be copied both to the server and to all workers.

    5.1.2 First attempt: Module generator

This section briefly describes the module generator design. The idea is that instead of creating all the module files manually, our API creates all the needed files and executes the Copernicus commands by running a generated script. The user script is started normally, as the user does when she runs it on a single machine. The user adds calls to our API wherever the distribution of a function is needed. Our API function searches for all dependencies of that specific function and serializes and dumps the function together with all its dependencies. The API function then analyses the input data and generates all the needed


Copernicus module files: the _import.xml file, the Python script, the plugin script and a bash/shell script for:

- Creating a Copernicus project
- Importing the module
- Creating instances
- Setting the input values

There are multiple challenges with this design approach. The first problem is that when the user script is executed, each call to the API functions generates a new Copernicus module that has its own specific name, number of inputs/outputs and their specific types. While this actually works, it is not very practical for keeping a good overview of the project. The second problem is that the user has to manually copy the generated plugin script, which is specific to each call to the API functions.

While this design was not general enough to handle all kinds of Copernicus workflows, it helped us gain a lot of knowledge about the internal design of Copernicus and influenced the final design. Some code for handling dependencies and test code for creating a Copernicus project were also reused in the final design.

    5.1.3 Final attempt: Generic module

The generic module was designed and redesigned multiple times, but the final design turned out to be quite simple. It has two main Copernicus module functions: one that starts the user's script (called mainRunner) and one that is created for each call to our API functions, i.e. for each function distribution (called subRunner). The mainRunner gets its first inputs from the client when a project is created. The inputs are the name of the script that calls our API functions and a tarball (an archive that contains a set of files) with all its dependencies. The outputs are the standard output and standard error streams and a tarball containing all the files that were created during the execution. The user script is started in the mainRunner as a sub-process; this way the mainRunner does not block the whole project.


Figure 5.1: Module design showing the subRunner instances inside the mainRunner. Multiple subRunner instances are created for each call to the API functions.

When a call to our API occurs in the script, new instances of the subRunner are created and the execution of the script is halted. Each subRunner gets a dump of the function that needs to be distributed, together with the specific arguments for that job. The outputs of each subRunner are then connected to the sub-inputs of the mainRunner. In this way the mainRunner can collect the output from each worker, reassemble the outputs into a list and return it back to the script, which then continues its execution. When the whole execution of the script is done, the mainRunner collects the final outputs and makes a tarball out of them.


Figure 5.2: A flowchart of a user script running inside Copernicus using the generic module API. The Communication Server and the user script are executed in separate threads from the mainRunner.

One thing that was deliberately left out of the description of the execution scheme above is how the user script waits for the list of outputs from the workers. The problem is that, while the user script is running in a separate process, there is by design no support in Copernicus for the script to communicate with a specific project and add new jobs. The function that adds new jobs, i.e. creates new instances, should always be called inside the Copernicus scope. This is solved by using inter-process communication (IPC); the actual implementation is described in the next section.

When a call is made to our API from the user script, the function and the arguments are dumped and a signal is sent through the IPC, which tells the server that there are now jobs that need to be created. The jobCreator function, which is started from the mainRunner and waits for a signal, receives the signal from the server and loads the dumped function and its arguments. It then creates a list of instances of the subRunner and connects their outputs to the sub-inputs of the mainRunner. The jobCreator function returns after the job creation. As mentioned earlier, this is crucial for the Copernicus project not to be blocked. After the jobCreator and mainRunner have returned, the subRunner function is called by Copernicus, and the jobs will be put on the Copernicus job queue for the workers


to fetch and run.

On the workers, the serialized function and its list of arguments are loaded and executed. After the job is done, the return value is serialized. It is then compressed into a tarball, together with all the new files that might have been created by the distributed function, and sent back to the Copernicus server.

As mentioned in section 3.2, each time a change is made to a module function instance, that function is executed by Copernicus. In this case, the change is the workers sending back their results. When the results from all workers have been sent back, they are deserialized and collected into a list. The list is then serialized and dumped, and a signal is sent back to the API function that is currently waiting and blocking the user script.

When the blocking API function gets the signal through the IPC, it loads the serialized list and returns it to the user script. The script then continues its execution, and a new call to our API function is possible.

When the user script is done executing, the stdout and stderr streams, together with a tarball that includes all newly created files, will be ready for the user to fetch.

    5.2 API implementation

This section goes through each part of the implementation of the API design. Since the whole Copernicus project is implemented in Python, this implementation is also in Python.

The main challenge is that, since the function myFunc is executed on the workers, all of its dependencies might not be available on those machines. The API must handle this by recursively inspecting all the underlying dependencies and copying them to all the workers. This is solved by using a Python package called snakefood[21].

For the implementation of the IPC, Unix Domain Sockets (UDS) are used, which provide a socket-like API where the communication goes through a file instead of an IP address[22].


Figure 5.3: The API function and the job creator function communicate through the communication server using UDS.

For passing data between the user script and the Copernicus module functions, UDS was not used. The data is instead serialized and dumped to the hard drive using the internal Python object serialization module, called marshal[23].
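A minimal sketch of how these two mechanisms can be combined is shown below. The socket path, dump file name and message format are illustrative assumptions for this sketch, not the actual Copernicus implementation:

import marshal
import socket
import types

SOCKET_PATH = "/tmp/hapi.sock"        # hypothetical socket path
DUMP_PATH = "/tmp/hapi_call.dump"     # hypothetical dump file

def dump_call(func, args):
    # marshal can serialize a function's code object and plain Python data.
    with open(DUMP_PATH, "wb") as f:
        marshal.dump(func.__code__, f)
        marshal.dump(args, f)

def load_call():
    # Rebuild a callable from the dumped code object and read back its arguments.
    with open(DUMP_PATH, "rb") as f:
        code = marshal.load(f)
        args = marshal.load(f)
    return types.FunctionType(code, globals()), args

def signal_new_jobs():
    # Tell the job creator, through a Unix domain socket, that a dump is ready,
    # then block until it answers; this blocking is what keeps the user script waiting.
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect(SOCKET_PATH)
    s.sendall(b"new jobs")
    reply = s.recv(4096)
    s.close()
    return reply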


    Chapter 6

    Results

The final result of this thesis is a general design of an API for the Copernicus computing project, together with an implementation of the design in Python. The API radically simplifies the modifications needed for a user's Python script to run on the Copernicus computing platform. The user no longer needs to create customized Copernicus modules, but only calls the API function, which we call hapi_map, and which handles everything that is needed for the given function to be distributed on the Copernicus platform. A full comparison between the two APIs is given in section 6.2, where some further conclusions are drawn.

    6.1 MD5 cracker with the new API

In this section the result of distributing the MD5 Cracker application on the Copernicus platform is presented. The goal is to show how the new API is used in an application. Two test systems are used to measure the results and to find the threshold for how small the jobs can be while still obtaining a speedup when distributing an application with the new API. The speedup is the time it takes to execute an application in sequential mode divided by the time it takes to execute the same application in parallel mode:

    Speedup = T1 / TP    (6.1)
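For example, for the Opteron system in Table 6.1, T1 = 2733 s with a single worker and TP = 167 s with 24 workers, which gives a speedup of 2733/167 ≈ 16.4, the value listed in Table 6.2.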


The two test systems run Ubuntu GNU/Linux 12.04 with Python 2.7.3. Both have x86-64 CPUs, but from different vendors and thus with different architecture designs. The following code shows how to run the application in parallel by splitting the work into two jobs. The same concept is used to split the work into 24, 48 and 96 smaller jobs. Different plaintext lengths, 4 and 6 characters, are also used in order to analyze how the API implementation scales for small and large jobs.

import hashlib, string, itertools

def bruteforce(job):
    #code omitted

job1 = [6, "453e41d218e071ccfb2d1c99ce23906a", "aaaaaa", "mzzzzz"]
job2 = [6, "453e41d218e071ccfb2d1c99ce23906a", "naaaaa", "zzzzzz"]
jobs = [job1, job2]

# This is the new distributed way.
from cpc.lib.execPythonModule.hapiModule import hapi_map
results = hapi_map(bruteforce, jobs)

Listing 6.1: md5cracker.py


Figure 6.1:

Number of workers    Duration (sec) for Opteron    Duration (sec) for Xeon
1                    2733                          1707
2                    1771                          879
8                    438                           231
24                   167                           150
64                   177                           152

Table 6.1: The duration for 24 jobs and 6 characters.

Number of workers    Speedup for Opteron    Speedup for Xeon
1                    1.00                   1.00
2                    1.54                   1.94
8                    6.24                   7.39
24                   16.37                  11.38
64                   15.44                  11.23

Table 6.2: The speedup for 24 jobs and 6 characters.


Figure 6.2:

Number of workers    Duration (sec) for Opteron    Duration (sec) for Xeon
1                    6800                          4701
2                    3626                          2435
8                    1031                          634
24                   365                           399
64                   222                           397

Table 6.3: The duration for 96 jobs and 6 characters.

Number of workers    Speedup for Opteron    Speedup for Xeon
1                    1.00                   1.00
2                    1.88                   1.93
8                    6.60                   7.41
24                   18.63                  11.78
64                   30.63                  11.84

Table 6.4: The speedup for 96 jobs and 6 characters.


Figure 6.3:

Number of workers    Duration (sec) for Opteron    Duration (sec) for Xeon
1                    4                             3
2                    21                            18
8                    6                             5
24                   4                             3
64                   8                             5

Table 6.5: The duration for 24 jobs and 4 characters.

Number of workers    Speedup for Opteron    Speedup for Xeon
1                    1.00                   1.00
2                    0.19                   0.17
8                    0.67                   0.60
24                   1.00                   1.00
64                   0.50                   0.60

Table 6.6: The speedup for 24 jobs and 4 characters.


Figure 6.4:

Number of workers    Duration (sec) for Opteron    Duration (sec) for Xeon
1                    5.9                           4.7
2                    40                            37
8                    10                            10
24                   8                             7
64                   11                            13

Table 6.7: The duration for 48 jobs and 4 characters.

Number of workers    Speedup for Opteron    Speedup for Xeon
1                    1.00                   1.00
2                    0.15                   0.13
8                    0.59                   0.47
24                   0.74                   0.67
64                   0.54                   0.36

Table 6.8: The speedup for 48 jobs and 4 characters.


From the results above we can clearly see that when the computation time for a single job is very short, the network data transfer and communication become the bottleneck. We can also see that when the number of workers on a system is greater than the number of jobs, there is not only no additional speedup; it sometimes even results in a performance decrease. The reason for this is that the server is being overloaded by a lot of workers asking for jobs.

Comparing the first two charts, we can see that as long as there are enough CPUs and enough jobs available for them, there is a close to linear speedup.


    6.2 Comparison

In this section the usage of the previous Copernicus API and the new one are compared, with focus on the user's perspective. The MD5 Cracker application described in section 3.3 is used:

# This function is supposed to be distributed.
def bruteforce(args):
    # Code omitted.

# Code omitted.

# Calling the brute force code.
result = bruteforce(args)

Listing 6.2: md5cracker.py

To run the application in a distributed way through Copernicus, the user would first of all have to define a Copernicus module. The creation of a module in Copernicus has already been described in section 3.2; it requires access to the Copernicus server and several hundred lines of code. Since the bruteforce function is meant to be distributed and Copernicus does not support function distribution, md5cracker.py must be rewritten. The next step is to copy the md5cracker.py file and all necessary dependencies to all workers.

With the newly implemented API, the same application is distributed as follows. The sections of code that are meant to be distributed must call the new API function (hapi_map), as in the following code:

# This function is supposed to be distributed.
def bruteforce(args):
    # Code omitted.

# Code omitted.

# This tells Copernicus to distribute the bruteforce function.
results = hapi_map(bruteforce, listOfArgs)

Listing 6.3: md5cracker.py

Now the user is only required to run a single command to get the job done, i.e. to distribute the bruteforce function.


Besides freeing the user from the burden of writing several hundred lines of code to create a module, the new API brings more functionality to Copernicus:

- Copernicus is now able to distribute sections of code and not only entire applications. This means that the sequential sections of the code can run locally on the server and only those sections that are meant to run on the workers will be distributed.

- There is no longer any need to install or copy any files to the workers manually; all of that is handled by the API functions. With the previous Copernicus API, the user was required to copy the application and its dependencies to all workers.

- The user is no longer required to modify the Copernicus server to create Copernicus modules. The main advantage of this is that the system administrator no longer has to give users access to the server, which in turn makes Copernicus more secure.

- There is no need to have any prior knowledge about Copernicus, how it works or its internal design. A user only needs to learn a single new, but simple, function call.


    Chapter 7

    Discussion

During the implementation of the different designs, many unforeseen problems were encountered. Some of them could not be solved without changing the Copernicus design, while others needed more consideration than they were initially given. For example, in the first design attempt we did not know that when Copernicus calls a module function, that function must return before any other function inside that project can be called from Copernicus. Another challenge was to decide where to run the user script and how to handle the communication between the user script and Copernicus. Since Copernicus distributes the jobs to the workers, and the results from the workers need to be returned to the user script, the user script must be blocked inside the API function. If the script were executed on the client machine, a result-fetching function would be needed on the client side of Copernicus. This design would add overhead to the workflow of Copernicus and was therefore abandoned.

Most of our shortcomings during the design and implementation were due to the lack of documentation on how to use Copernicus and on its internal design. Had it not been for all the fruitful discussions with Iman Pouya, Sander Pronk and Patrik Falkman, this project would probably not have been possible within the given time frame.

    7.1 Future work

While the current design is general enough for a range of different domains, it lacks two fundamental functionalities: distributing data files efficiently and distributing executable binaries. These two functionalities are so common in distributed computing that the lack of them would make the API unusable in practice. For such an implementation in the future, the designer might need to consider how to handle binaries that are compiled for different architectures. This is an important point, since one big advantage of Copernicus compared to other distribution platforms is that it is cross-platform.

For efficient file distribution, a simple solution would be for the user to give the name or path of a file or a folder as an argument when running the Copernicus exec-py command. A fundamentally different approach would be to design something more like how Hadoop handles the distribution of files: a large cluster of computers in a distributed file system with support for redundancy and network locality. Since this second approach is considerably more complex, the first approach would be a good starting point; a sketch of what it could look like from the user's perspective is given below.
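The following is a minimal sketch of the first approach, seen from the user script. The optional input_files parameter of hapi_map is a hypothetical extension and does not exist in the current API; the idea is simply that the API, rather than the user, would copy the listed files to every worker before the function runs there.

# Hypothetical extension: the API stages wordlist.txt on every worker
# before bruteforce is executed, so the function can open it as a local
# file. Neither input_files nor the staging behaviour exists today.
def bruteforce(args):
    # args could, for example, identify the slice of the wordlist that
    # this worker should try (illustrative only).
    start, end = args
    with open("wordlist.txt") as f:
        candidates = f.read().splitlines()[start:end]
    # Hash the candidates and compare against the target (omitted).
    return []

results = hapi_map(bruteforce, listOfArgs, input_files=["wordlist.txt"])

Behind the scenes such a parameter could, for instance, reuse the same channel that already transfers the function and its arguments to the workers.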

One of the requirements for this project was that the design should be general enough to be implemented in multiple languages. While the current implementation only supports Python code, the design is fully portable to other similar dynamic programming languages. For other common languages such as C/C++ and Java, the user would have to either build binaries of each function and use a Python wrapper to call them, or compile the code as a shared/dynamic library and call the functions through a Python wrapper, as sketched below. In both cases, the support for running executable binaries, explained above, must be implemented.
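As an illustration of the shared-library route, the sketch below wraps a C function in a thin Python function that Copernicus could then distribute like any other. The library name libbrute.so and the signature of bruteforce_c are invented for the example, and the library would still have to be present on every worker, which is exactly the binary-distribution gap discussed above.

import ctypes

# Load a shared library compiled from the user's C code (hypothetical name).
_lib = ctypes.CDLL("./libbrute.so")
_lib.bruteforce_c.argtypes = [ctypes.c_char_p]
_lib.bruteforce_c.restype = ctypes.c_int

def bruteforce(args):
    # Thin Python wrapper: the heavy computation happens in C, but from
    # the point of view of hapi_map this is an ordinary Python function.
    target_hash = args[0]
    return _lib.bruteforce_c(target_hash.encode("utf-8"))

results = hapi_map(bruteforce, listOfArgs)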

Another future addition that would be of interest to a Copernicus user is so-called reducer functionality. At the moment there is only one API function, hapi_map; it is a mapper function that takes two arguments: the function that the user wants to run on each worker and a list of arguments, one per worker. Support for reducer functionality could either be added as a collection of standard reducers built into the API, or as user-defined reducer functions. The user would simply pass a function reference, or an identifier for one of the standard reducers, as a third argument to hapi_map, as in the sketch below.
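A possible shape of a user-defined reducer is sketched here. The three-argument form of hapi_map is an assumption; today the function only accepts the mapper and the argument list, and the reduction has to be done by hand on the returned list.

def bruteforce(args):
    # Returns a list of cracked passwords found by this worker (omitted).
    return []

def first_hit(worker_results):
    # Hypothetical user-defined reducer: flatten the per-worker lists and
    # return the first password found, or None if no worker succeeded.
    for result in worker_results:
        if result:
            return result[0]
    return None

# Hypothetical call with the reducer as a third argument.
password = hapi_map(bruteforce, listOfArgs, first_hit)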

    7.2 Conclusion

With the new world of cloud computing and the birth of quantum computing [24], we can see that the demand for computationally intensive tasks is rapidly increasing.

We have in this master thesis managed to design and implement an easy-to-use API that consists of a single function call. It not only simplifies the usage of the Copernicus API but also adds new functionality. Users can now distribute sections of their code to run in a distributed manner with a single function call. We have also shown where the minimum limits for job sizes lie in Copernicus when using our API. Finally, we have listed many improvements that can be made both to this API design and to the Copernicus project as a whole. By giving Copernicus this boost, we believe that the whole field can benefit from it.

    Bibliography

[1] Wikipedia. Computer simulation. http://en.wikipedia.org/wiki/Simulation#Computer_simulation, May 2013.

[2] Wikipedia. Message passing interface. http://en.wikipedia.org/wiki/Message_Passing_Interface, May 2013.

[3] Wikipedia. Remote procedure call. http://en.wikipedia.org/wiki/Remote_procedure_call, May 2013.

[4] Wikipedia. Java remote method invocation. http://en.wikipedia.org/wiki/Java_remote_method_invocation, May 2013.

[5] Sander Pronk, Per Larsson, Iman Pouya, Gregory R. Bowman, Imran S. Haque, Kyle Beauchamp, Berk Hess, Vijay S. Pande, Peter M. Kasson, and Erik Lindahl. Copernicus: a new paradigm for parallel adaptive molecular dynamics. In Proceedings of the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC '11, pages 60:1-60:10, New York, NY, USA, 2011. ACM.

[6] Intel. Why parallel processing? Why now? What about my legacy code? http://software.intel.com/en-us/blogs/2009/08/31/why-parallel-processing-why-now-what-about-my-legacy-code, August 2009.

[7] Wikipedia. Distributed computing. http://en.wikipedia.org/wiki/Distributed_computing, May 2013.

[8] Jeffrey Dean and Sanjay Ghemawat. MapReduce: simplified data processing on large clusters. Commun. ACM, 51(1):107-113, January 2008.

[9] Google Research. MapReduce: simplified data processing on large clusters. http://research.google.com/archive/mapreduce.html, May 2013.

[10] Adam L. Beberg, Daniel L. Ensign, Guha Jayachandran, Siraj Khaliq, and Vijay S. Pande. Folding@home: lessons from eight years of volunteer distributed computing. In Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing, IPDPS '09, pages 1-8, Washington, DC, USA, 2009. IEEE Computer Society.

[11] Wikipedia. Dictionary attack, a technique for defeating a cipher or a hash to determine its decryption. http://en.wikipedia.org/wiki/Dictionary_attack, May 2013.

[12] Wikipedia. Rainbow tables, a precomputed table for reversing cryptographic hash functions. http://en.wikipedia.org/wiki/Rainbow_table, May 2013.

[13] Stanford University. Protein folding simulation software. http://folding.stanford.edu/, May 2013.

[14] Wikipedia. Protein folding. http://en.wikipedia.org/wiki/Protein_folding, May 2013.

[15] Wikipedia. Copernicus and Folding@home's Markov state model. http://en.wikipedia.org/wiki/Folding@home#Biomedical_research, May 2013.

[16] PiCloud. A distributed computing service. http://www.picloud.com/, May 2013.

[17] Wikipedia. Apache Hadoop, software framework that supports data-intensive distributed applications. http://en.wikipedia.org/wiki/Apache_Hadoop, May 2013.

[18] Tevfik Kosar. Data Intensive Distributed Computing: Challenges and Solutions for Large-scale Information Management. Information Science Reference - Imprint of: IGI Publishing, Hershey, PA, 1st edition, 2011.

[19] Wikipedia. Techila, a distributed computing platform. http://en.wikipedia.org/wiki/Techila_Grid, May 2013.

[20] Wikipedia. OpenMP. http://en.wikipedia.org/wiki/OpenMP, May 2013.

[21] Martin Blais. Snakefood, a Python dependency analyzer. http://furius.ca/snakefood/, May 2013.

[22] Wikipedia. Unix domain socket. http://en.wikipedia.org/wiki/Unix_domain_socket, May 2013.

[23] Python. Read and write Python values in a binary format. http://docs.python.org/2/library/marshal.html, May 2013.

[24] Ars Technica. Google buys a D-Wave quantum optimizer. http://arstechnica.com/science/2013/05/google-buys-a-d-wave-quantum-optimizer/, May 2013.

[25] Wikipedia. Embarrassingly parallel workload. http://en.wikipedia.org/wiki/Embarrassingly_parallel, May 2013.

[26] Xavier Vigouroux. What designs for coming supercomputers? In Proceedings of the Conference on Design, Automation and Test in Europe, DATE '13, pages 469-469, San Jose, CA, USA, 2013. EDA Consortium.

[27] Wikipedia. Peer-to-peer, a distributed application architecture that partitions tasks or workloads between peers. http://en.wikipedia.org/wiki/Peer-to-peer, May 2013.


    Appendix A

    Link to source code

Link to Copernicus computing: http://copernicus-computing.org/