Orca: A language for parallel programming of distributed systems



Orca

• Parallel language designed at the VU (Vrije Universiteit, Amsterdam)

• Design and first implementation ('88-'92):
  – Bal, Kaashoek, Tanenbaum

• Portable Orca system ('93-'97):
  – Bal, Bhoedjang, Langendoen, Rühl, Jacobs, Hofman

• Used by ~30 M.Sc. students


Overview

• Distributed shared memory

• Orca
  – Shared data-object model
  – Processes
  – Condition synchronization
  – Language aspects

• Examples: TSP (Traveling Salesman Problem) and ASP (All-pairs Shortest Paths)

• Implementation


Orca’s Programming Model

• Explicit parallelism (processes)

• Communication model:
  – Shared memory: hard to build
  – Distributed memory: hard to program

• Idea: offer a shared-memory programming model on a distributed-memory machine

• Distributed shared memory (DSM)


Distributed Shared Memory (1)

• Hardware (CC-NUMA):
  – Cache-coherent Non-Uniform Memory Access
  – Processor can copy remote cache line
  – Hardware keeps caches coherent
  – Examples: DASH, Alewife, SGI Origin

[Figure: two CPUs with cache/memory connected by an interconnect; a local read hits the local cache, a remote read fetches a cache line from the other CPU]


Distributed Shared Memory (2)

• Operating system:
  – Shared virtual memory
  – Processor can fetch remote pages
  – OS keeps copies of pages coherent

• User-level system:
  – TreadMarks
  – Uses OS-like techniques
  – Implemented with mmap and signals


Distributed Shared Memory (3)

• Languages and libraries:
  – Do not provide a flat address space
  – Examples:
    • Linda: tuple spaces
    • CRL: shared regions
    • Orca: shared data-objects


Shared Data-object Model

• Shared data encapsulated in objects

• Object = variable of abstract data type

• Shared data accessed by user-defined, high-level operations

[Figure: an object encapsulating local data, accessed only through operations such as Enqueue() and Dequeue()]


Semantics

• Each operation is executed atomically
  – As if operations were executed one at a time
  – Mutual exclusion synchronization
  – Similar to monitors

• Each operation applies to a single object
  – Allows efficient implementation
  – Atomic operations on multiple objects are seldom needed and hard to implement
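Orca's per-operation atomicity behaves like a monitor: conceptually, every operation holds the object's mutual-exclusion lock for its whole duration. A minimal Python sketch of that discipline (the class, names, and counter workload are illustrative, not part of Orca or its runtime):

```python
import threading

class SharedObject:
    """Monitor-style object: every operation runs under one lock,
    so operations appear to execute one at a time."""
    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def increment(self):
        # An "operation": the whole read-modify-write is atomic.
        with self._lock:
            tmp = self._count
            self._count = tmp + 1

    def value(self):
        with self._lock:
            return self._count

obj = SharedObject()
threads = [threading.Thread(target=lambda: [obj.increment() for _ in range(1000)])
           for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(obj.value())  # 4000: no lost updates
```

Because each operation is serialized on the object's lock, concurrent callers never observe a half-finished update, which is exactly the "as if one at a time" guarantee the slide describes.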


Implementation

• System determines object distribution

• It may replicate objects (transparently)

[Figure: a single-copy object stored on one of two CPUs vs. a replicated object with a copy on each CPU, connected by a network]


Object Types

• Abstract data type

• Two parts:
  1. Specification part
     • ADT operations
  2. Implementation part
     • Local data
     • Code for operations
     • Optional initialization code


Example: IntObject

• Specification part

object specification IntObject;
  operation Value(): integer;
  operation Assign(Val: integer);
  operation Min(Val: integer);
end;



IntObject Implementation Part

object implementation IntObject;
  X: integer;  # internal data of the object

  operation Value(): integer;
  begin
    return X;
  end;

  operation Assign(Val: integer);
  begin
    X := Val;
  end;

  operation Min(Val: integer);
  begin
    if Val < X then X := Val; fi;
  end;
end;



Usage of Objects

# declare (create) object
MyInt: IntObject;

# apply operations to the object
MyInt$Assign(5);
tmp := MyInt$Value();

# atomic operation
MyInt$Min(4);

# multiple operations (not atomic)
if MyInt$Value() > 4 then MyInt$Assign(4); fi;

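The contrast on this slide can be made concrete in Python with a lock-protected stand-in for IntObject (illustrative, not Orca code): min() does its read-modify-write inside one atomic operation, while the value()-then-assign() pair is two separate operations and therefore not atomic as a whole.

```python
import threading

class IntObject:
    """Python analogue of the IntObject ADT; each method is atomic,
    like a single Orca operation."""
    def __init__(self, x=0):
        self._lock = threading.Lock()
        self._x = x
    def value(self):
        with self._lock:
            return self._x
    def assign(self, val):
        with self._lock:
            self._x = val
    def min(self, val):
        with self._lock:  # compare AND update inside ONE operation
            if val < self._x:
                self._x = val

my_int = IntObject(100)
# Concurrent min() calls are safe: each is a single atomic operation.
threads = [threading.Thread(target=my_int.min, args=(v,)) for v in [42, 7, 63]]
for t in threads: t.start()
for t in threads: t.join()
print(my_int.value())  # 7

# In contrast, "if my_int.value() > 4: my_int.assign(4)" is TWO operations;
# another process may run in between, so the pair as a whole is not atomic.
```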


Parallelism

• Expressed through processes
  – Process declaration: defines behavior
  – Fork statement: creates new process

• Object made accessible by passing it as shared parameter (call-by-reference)

• Any other data structure can be passed by value (copied)


Example (Processes)

# declare a process type
process worker(n: integer; x: shared IntObject);
begin
  # do work ...
  x$Assign(result);
end;

# declare an object
min: IntObject;

# create a process on CPU 2
fork worker(100, min) on (2);

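Assuming Orca's shared parameters behave like pass-by-reference, the worker example maps naturally onto Python threads sharing one lock-protected object. The names and the sum(range(n)) workload are made up for illustration, and thread placement on a specific CPU (Orca's `on (2)`) has no portable Python equivalent, so it is omitted:

```python
import threading

class IntObject:
    """Lock-protected shared object; illustrative stand-in for Orca's IntObject."""
    def __init__(self, x=0):
        self._lock, self._x = threading.Lock(), x
    def assign(self, val):
        with self._lock:
            self._x = val
    def value(self):
        with self._lock:
            return self._x

def worker(n, x):
    # x is passed by reference (Orca: "shared" parameter);
    # n is plain data, which Orca would copy (pass by value).
    result = sum(range(n))  # stand-in for "do work"
    x.assign(result)

minimum = IntObject()
t = threading.Thread(target=worker, args=(100, minimum))  # "fork worker(100, min)"
t.start()
t.join()
print(minimum.value())  # 4950 = sum of 0..99
```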


Structure of Orca Programs

• Initially there is one process (OrcaMain)

• A process can create child processes and share objects with them

• Hierarchy of processes communicating through objects

• No lightweight threads


Condition Synchronization

• An operation is allowed to block initially
  – Using one or more guarded statements

• Semantics:
  – Block until one or more guards are true
  – Select a true guard and execute its statements

operation name(parameters);
  guard expr-1 do
    statements-1;
  od;
  ...
  guard expr-N do
    statements-N;
  od;
end;



Example: Job Queue

object implementation JobQueue;
  Q: "queue of jobs";

  operation addjob(j: job);
  begin
    enqueue(Q, j);
  end;

  operation getjob(): job;
  begin
    guard NotEmpty(Q) do
      return dequeue(Q);
    od;
  end;
end;

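A Python sketch of the same JobQueue, where threading.Condition plays the role of the guard: getjob() blocks until NotEmpty(Q) holds. This is an analogy to Orca's guard semantics, not a description of its implementation:

```python
import threading
from collections import deque

class JobQueue:
    """getjob() blocks until the guard NotEmpty(Q) is true."""
    def __init__(self):
        self._q = deque()
        self._cond = threading.Condition()

    def addjob(self, j):
        with self._cond:
            self._q.append(j)
            self._cond.notify()  # a guard may have become true

    def getjob(self):
        with self._cond:
            while not self._q:   # guard: NotEmpty(Q)
                self._cond.wait()
            return self._q.popleft()

jq = JobQueue()
results = []
consumer = threading.Thread(target=lambda: results.append(jq.getjob()))
consumer.start()        # blocks: the queue is empty
jq.addjob("job-1")      # guard becomes true; the consumer proceeds
consumer.join()
print(results)  # ['job-1']
```

The while-loop re-check after wait() mirrors the guard semantics: a woken operation re-evaluates its guard before executing the guarded statements.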


Traveling Salesman Problem

• Structure of the Orca TSP program

• JobQueue and Minimum are objects

[Figure: a Master process and several Slave processes, all communicating through the shared JobQueue and Minimum objects]
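A toy version of this master/slave structure in Python, using threads instead of Orca processes. To keep it short, a "job" is reduced to a precomputed tour length, so this sketches only the communication pattern (job queue plus shared minimum), not a real branch-and-bound TSP solver:

```python
import threading, queue

class Minimum:
    """Shared minimum bound; min() is atomic, like IntObject$Min."""
    def __init__(self, x):
        self._lock, self._x = threading.Lock(), x
    def min(self, val):
        with self._lock:
            if val < self._x:
                self._x = val
    def value(self):
        with self._lock:
            return self._x

def slave(jobs, minimum):
    while True:
        job = jobs.get()          # blocks, like JobQueue's guarded getjob()
        if job is None:           # sentinel: no more work
            return
        tour_length = job         # toy: the "work" is already done
        minimum.min(tour_length)  # atomic update of the global bound

jobs = queue.Queue()
minimum = Minimum(10**9)
slaves = [threading.Thread(target=slave, args=(jobs, minimum)) for _ in range(3)]
for s in slaves: s.start()
for length in [42, 17, 58, 23]:   # the master enqueues jobs
    jobs.put(length)
for _ in slaves:                  # one sentinel per slave
    jobs.put(None)
for s in slaves: s.join()
print(minimum.value())  # 17
```

In the real program, slaves would also read the Minimum object to prune partial tours that already exceed the bound, which is why a low read/write ratio on that object matters for performance.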


Language Aspects (1)

• Syntax somewhat similar to Modula-2

• Standard types:
  – Scalar (integer, real)
  – Dynamic arrays
  – Records, unions, sets, bags
  – Graphs

• Generic types (as in Ada)

• User-defined abstract data types


Language Aspects (2)

• No global variables

• No pointers

• Type-secure
  – Every error against the language rules will be detected by the compiler or runtime system

• Not object-oriented
  – No inheritance, dynamic binding, or polymorphism


Example Graph Type: Binary Tree

type node = nodename of BinTree;
type BinTree = graph
  root: node;              # global field
nodes                      # fields of each node
  data: integer;
  LeftSon, RightSon: node;
end;


t: BinTree;                  # create tree
n: node;                     # nodename variable
n := addnode(t);             # add a node to t
t.root := n;                 # access global field
t[n].data := 12;             # fill in data of node n
t[n].LeftSon := addnode(t);  # create left son
deletenode(t, n);            # delete node n
t[n].data := 11;             # runtime error

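The nodename semantics, including the runtime error on a dangling nodename, can be imitated with a dictionary in Python (the class is illustrative; KeyError stands in for Orca's runtime error):

```python
class BinTree:
    """Dict-based sketch of Orca's graph type: nodenames are keys,
    deletenode invalidates them, and a later access fails at runtime."""
    def __init__(self):
        self.root = None   # global field
        self._nodes = {}   # nodename -> per-node fields
        self._next = 0

    def addnode(self):
        n = self._next
        self._next += 1
        self._nodes[n] = {"data": None, "LeftSon": None, "RightSon": None}
        return n

    def __getitem__(self, n):
        return self._nodes[n]   # KeyError if n was deleted

    def deletenode(self, n):
        del self._nodes[n]

t = BinTree()
n = t.addnode()                  # add a node to t
t.root = n                       # access global field
t[n]["data"] = 12                # fill in data of node n
t[n]["LeftSon"] = t.addnode()    # create left son
t.deletenode(n)                  # delete node n
error_caught = False
try:
    t[n]["data"] = 11            # dangling nodename
except KeyError:
    error_caught = True          # "runtime error", as in Orca
print(error_caught)  # True
```

Because nodenames are handles rather than pointers, the system can always check validity on access; this is one reason Orca can be type-secure without pointers.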


Performance Issues

• Orca provides a high level of abstraction:
  + easy to program
  – hard to understand performance behavior

• Example: X$foo() can result in any of:
  – a function call (if X is not shared)
  – a monitor call (if X is shared and stored locally)
  – a remote procedure call (if X is stored remotely)
  – a broadcast (if X is replicated)


Performance model

• The Orca system will:
  – replicate objects with a high read/write ratio
  – store a nonreplicated object on the "best" location

• Communication is generated for:
  – writing a replicated object (broadcast)
  – accessing a remote nonreplicated object (RPC)

• Programmer must think about locality
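One plausible toy decision rule in the spirit of these bullets. This is emphatically not the Orca system's actual heuristic (whose details the slides do not give); the function, threshold, and inputs are all invented for illustration:

```python
def placement(reads_per_cpu, writes, n_cpus):
    """Toy placement rule: replicate when reads dominate writes,
    otherwise keep a single copy where the object is read most."""
    total_reads = sum(reads_per_cpu)
    # Replicating makes every read local but turns every write into a
    # broadcast; a single copy makes remote reads into RPCs.
    if writes == 0 or total_reads / writes > n_cpus:
        return "replicate on all CPUs"
    best = max(range(n_cpus), key=lambda i: reads_per_cpu[i])
    return f"single copy on CPU {best}"

print(placement([900, 50, 50], 10, 3))   # mostly-read object
print(placement([5, 200, 5], 100, 3))    # write-heavy object
```

The point of the sketch is the trade-off, not the threshold: a replicated object pays a broadcast per write, a single-copy object pays an RPC per remote access, which is why the programmer must think about locality.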


Summary of Orca

• Object-based distributed shared memory
  – Hides the underlying network from the user
  – Applications can use shared data

• Language is especially designed for distributed systems

• User-defined, high-level operations