POSH: Python Object Sharing
Steffen Viken Valvåg
In collaboration with Kjetil Jacobsen & Åge Kvalnes
University of Tromsø, Norway
Sponsored by Fast Search & Transfer
Python Execution Model
Test.py
    for x in "TEST PROGRAM":
        if x not in "FORGET IT":
            print x,
Byte-code compilation produces Test.pyc
     0 SETUP_LOOP
     3 LOAD_CONST ('TEST PROGRAM')
     6 GET_ITER
     7 FOR_ITER
    10 STORE_FAST (x)
    13 LOAD_FAST (x)
    16 LOAD_CONST ('FORGET IT')
    19 COMPARE_OP (not in)
    22 JUMP_IF_FALSE (to 33)
    25 POP_TOP
    26 LOAD_FAST (x)
    29 PRINT_ITEM
    30 JUMP_FORWARD (to 34)
    33 POP_TOP
    34 JUMP_ABSOLUTE
    37 POP_BLOCK
Interpretation produces the output
    SPAM
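The compile-then-interpret pipeline on this slide can be reproduced on a current CPython with the standard dis module. The sketch below is a Python 3 rewrite of Test.py (the slide uses the Python 2 print statement), and the opcode names and offsets it prints will differ from the listing above, since the byte-code format changes between interpreter versions.

```python
import dis

# Python 3 rewrite of the slide's Test.py (the original uses "print x,").
SOURCE = '''
for x in "TEST PROGRAM":
    if x not in "FORGET IT":
        print(x, end=" ")
'''

# Step 1: byte-code compilation (what normally ends up in a .pyc file).
code = compile(SOURCE, "Test.py", "exec")

# Collect the opcode names so they can be inspected programmatically.
opnames = [instruction.opname for instruction in dis.get_instructions(code)]

# Print a human-readable disassembly; the exact listing is version dependent.
dis.dis(code)

# Step 2: interpretation of the compiled code object.
exec(code)
```

Only the overall shape (a loop driven by FOR_ITER, a comparison, a conditional jump) should be expected to match the slide.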
Python Threading Model
[Diagram: Thread A and Thread B each run their own bytecodes and contend for the GIL]
Each thread executes a separate sequence of byte codes
All threads must contend for one global interpreter lock
Example: Matrix Multiplication
Performs a matrix multiplication A = B x C
The work is split between several worker threads
The application runs on a machine with 8 CPUs
[Figure: time (axis range 0 to 500) versus number of workers (1 to 8); series: Ideal, Threads]
Threads do not scale for multiple CPUs due to lock contention on the GIL
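A threaded matrix multiplication of the kind measured here might look roughly like the following sketch. The function names and the way rows are dealt out to workers are assumptions made for illustration; the slides do not show the benchmark code. The workers write directly into a shared list, which is convenient, but on CPython the GIL lets only one of them execute byte codes at any moment.

```python
import threading


def multiply_rows(b, c, a, rows):
    """Compute the given rows of A = B x C and store them in the shared list a."""
    columns = list(zip(*c))
    for i in rows:
        a[i] = [sum(x * y for x, y in zip(b[i], col)) for col in columns]


def threaded_matmul(b, c, workers=4):
    """Split the rows of B between worker threads (hypothetical split: round-robin)."""
    a = [None] * len(b)  # shared result, filled in place by the workers
    threads = [
        threading.Thread(
            target=multiply_rows,
            args=(b, c, a, range(k, len(b), workers)),
        )
        for k in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a


result = threaded_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

No explicit communication is needed because every thread sees the same list object.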
Workaround: Processes
[Diagram: Process A and Process B connected by IPC]
Each process has its own interpreter lock
Requires inter-process communication, using e.g. message passing by means of pipes
Matrix Multiplication using Processes
A master process distributes the input matrices to a set of worker processes
Each worker process computes some part of the output matrix, and returns its result to the master
The master process assembles the final result matrix
More communication, and a more complex communication pattern, than with threads
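The master/worker pattern described on this slide can be sketched with plain pipes and fork, as below. This is a minimal illustration assuming a Unix system; the function names, the use of pickle for marshalling, and the row-block split are all choices made here, not details from the slides.

```python
import os
import pickle


def multiply_rows(b_rows, c):
    """Multiply a block of rows of B by the full matrix C."""
    columns = list(zip(*c))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns]
            for row in b_rows]


def worker(task_fd, result_fd):
    """Worker: read a task from one pipe, write the partial result to another."""
    with os.fdopen(task_fd, "rb") as task_pipe:
        b_rows, c = pickle.load(task_pipe)
    with os.fdopen(result_fd, "wb") as result_pipe:
        pickle.dump(multiply_rows(b_rows, c), result_pipe)


def parallel_matmul(b, c, workers=2):
    """Master: distribute row blocks of B plus C, then assemble A = B x C."""
    block = max(1, -(-len(b) // workers))  # ceiling division, at least 1
    children = []
    for start in range(0, len(b), block):
        task_r, task_w = os.pipe()
        result_r, result_w = os.pipe()
        pid = os.fork()
        if pid == 0:  # child process
            try:
                os.close(task_w)
                os.close(result_r)
                worker(task_r, result_w)
            finally:
                os._exit(0)  # never fall back into the master's code
        # Master: close the child's pipe ends, then send the task.
        os.close(task_r)
        os.close(result_w)
        with os.fdopen(task_w, "wb") as task_pipe:
            pickle.dump((b[start:start + block], c), task_pipe)
        children.append((pid, result_r))

    a = []
    for pid, result_r in children:  # collect partial results in row order
        with os.fdopen(result_r, "rb") as result_pipe:
            a.extend(pickle.load(result_pipe))
        os.waitpid(pid, 0)
    return a


result = parallel_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Every matrix crosses a pipe twice in serialized form, which is the extra communication the slide refers to.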
Ways Ahead
Communication through standard Python container objects favors threads
Scalability on multiprocessor architectures favors processes
The GIL is here to stay, so making threads scale better is hard
However, there might be room for improvement of inter-process communication mechanisms
Using Shared Memory for IPC
[Diagram: Process A and Process B both attached to a shared memory region]
Processes communicate by accessing a shared memory region
Requires explicit synchronization and data marshalling, imposes a flat data structure
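To make the "flat data structure" point concrete, here is a minimal sketch of raw shared-memory IPC, assuming a Unix system where an anonymous mmap region is shared with forked children. The record layout (one integer, one double) is hypothetical. Values have to be packed into bytes by hand, and the only synchronization is the parent waiting for the child to exit.

```python
import mmap
import os
import struct

# Flat, fixed layout agreed on by both processes: one int32 and one float64.
RECORD = struct.Struct("<id")

# Anonymous mapping; on Unix it is shared between parent and forked child.
region = mmap.mmap(-1, RECORD.size)

pid = os.fork()
if pid == 0:
    # Child: marshal two values into the shared region, then exit.
    RECORD.pack_into(region, 0, 42, 3.5)
    os._exit(0)

# Parent: explicit (and crude) synchronization, wait for the writer to finish.
os.waitpid(pid, 0)

# Unmarshal the values written by the child.
count, value = RECORD.unpack_from(region, 0)
```

Anything richer than fixed-size fields, such as a list of objects, would need its own hand-written encoding.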
Using POSH for IPC
[Diagram: Process A and Process B operate on objects X, Y, and L in shared memory through calls such as X.method1() and L.extend([X, Y])]
Allocates regular Python objects in shared memory
Shared objects are accessed transparently through regular method calls
IPC is done by modifying shared, mutable objects
Complications
Processes must synchronize their access to shared, mutable objects (just like threads)
Explicit synchronization of critical regions must be possible, while implicit synchronization upon accessing shared objects is desirable
Python’s regular garbage collection algorithm is inadequate for shared objects, which may be referenced by multiple processes
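The need for explicit synchronization of a critical region can be illustrated with the standard library rather than POSH itself. The sketch below uses a multiprocessing.Value as a stand-in for a shared mutable object and assumes a platform where the fork start method is available; the counter and the worker function are hypothetical.

```python
import multiprocessing as mp

# Use fork explicitly so the workers inherit the shared counter (Unix only).
ctx = mp.get_context("fork")

# A shared integer living in shared memory, with an associated lock.
counter = ctx.Value("i", 0)


def add(times):
    """Increment the shared counter; read-modify-write is the critical region."""
    for _ in range(times):
        with counter.get_lock():  # explicit synchronization
            counter.value += 1


processes = [ctx.Process(target=add, args=(1000,)) for _ in range(4)]
for p in processes:
    p.start()
for p in processes:
    p.join()

total = counter.value
```

Without the lock, concurrent increments from different processes could overwrite each other, exactly as with threads.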
Proxy Objects
[Diagram: calls X.method1() and X.method2() go to a proxy object, which forwards them to shared object X and passes back the return values]
Provides transparent access to a shared object by forwarding all attribute accesses and method calls
Provides a single entry point to a shared object, where synchronization policies may be enforced
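The forwarding idea can be sketched in a few lines of ordinary Python. This is not POSH's implementation: the Proxy and Account classes are hypothetical, and a threading.RLock stands in for whatever process-shared lock a real implementation would need.

```python
import threading


class Proxy:
    """Forwards attribute accesses and method calls to a target object."""

    def __init__(self, target, lock=None):
        # Bypass our own __setattr__ so these are stored on the proxy itself.
        object.__setattr__(self, "_target", target)
        object.__setattr__(self, "_lock", lock or threading.RLock())

    def __getattr__(self, name):
        # Called only for names not found on the proxy, i.e. the target's.
        with self._lock:
            attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def synchronized_call(*args, **kwargs):
            # Single entry point: a synchronization policy is enforced here.
            with self._lock:
                return attr(*args, **kwargs)

        return synchronized_call

    def __setattr__(self, name, value):
        with self._lock:
            setattr(self._target, name, value)


class Account:
    """Hypothetical target object used to exercise the proxy."""

    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount
        return self.balance


x = Proxy(Account())
x.deposit(40)              # method call forwarded to the target
x.balance = x.balance + 2  # attribute read and write forwarded as well
```

Because every access passes through the proxy, the locking policy can be changed in one place without touching the target class.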
Multi-Process Garbage Collection
Must account for references from all live processes
Must stay up-to-date when processes fork, as this may create new references to shared objects
Should be able to handle abnormal process termination without leaking shared objects
Garbage Collection in POSH
[Diagram: Process A, shared memory, and Process B, with objects X, Y, L, and M drawn as shared objects and proxy objects]
Legend:
Regular Python reference
Reference from a process to a shared object (type I)
Reference from one shared object to another (type II)
Garbage Collection Details
POSH creates at most one proxy object per process for any given shared object
Shared objects are always referenced through their proxy objects
A bitmap in each shared object records the processes that have a corresponding proxy object. This tracks references of type I (from a process)
A separate count in each shared object records the number of references to the object from other shared objects. This tracks references of type II
Shared objects are deleted when there are no references to them of either type
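The bookkeeping described on this slide can be modelled as a small data structure. The class and method names below are hypothetical, and the sketch ignores where the data lives (in POSH it would sit in shared memory next to the object) and how updates are synchronized.

```python
class SharedObjectHeader:
    """Per-object bookkeeping: a process bitmap (type I) and a count (type II)."""

    def __init__(self):
        self.proxy_bitmap = 0  # bit i set: process i holds a proxy object
        self.shared_refs = 0   # references from other shared objects

    def add_proxy(self, process_index):
        """A process created its (single) proxy for this object."""
        self.proxy_bitmap |= 1 << process_index

    def drop_proxy(self, process_index):
        """A process released its proxy, or terminated abnormally."""
        self.proxy_bitmap &= ~(1 << process_index)

    def process_forked(self, parent_index, child_index):
        """A forked child inherits the parent's proxy, so mark it as well."""
        if self.proxy_bitmap & (1 << parent_index):
            self.add_proxy(child_index)

    def incref(self):
        """Another shared object started referring to this one."""
        self.shared_refs += 1

    def decref(self):
        """Another shared object stopped referring to this one."""
        self.shared_refs -= 1

    def is_garbage(self):
        """Deletable only when no references of either type remain."""
        return self.proxy_bitmap == 0 and self.shared_refs == 0


header = SharedObjectHeader()
header.add_proxy(0)          # process 0 creates a proxy
header.process_forked(0, 3)  # process 0 forks process 3
header.incref()              # a shared list now contains the object
```

Since there is at most one proxy per process, one bit per process is enough for type I references, and a dead process can be handled by clearing its bit.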
Performance
Performing a matrix multiplication A = B x C using POSH
The work is split between several worker processes
The application runs on a machine with 8 CPUs
More overhead, but scales for multiple CPUs
[Figure: time (axis range 0 to 500) versus number of workers (1 to 8); series: Ideal, Threads, POSH]
Summary
Python uses a global interpreter lock (GIL) to serialize execution of byte codes
This entails a lack of scalability on multiprocessor architectures for CPU-intensive multi-threaded apps
However, threads offer an attractive programming model, with implicit communication
Processes + shared memory reduce IPC overheads, but normally impose flat data structures and require data marshalling
POSH uses processes + shared memory to offer a programming model similar to threads, with the scalability of processes
Availability
Open source, hosted at SourceForge
http://poshmodule.sf.net/
Still not very stable
Developers wanted
Example Usage
    import posh

    class Stuff(object):
        pass

    # Register Stuff as a type that may be shared
    posh.allow_sharing(Stuff, posh.generic_init)
    # Create a shared instance
    mystuff = posh.share(Stuff())

    def worker1():
        mystuff.money = 0

    def worker2():
        mystuff.debt = 100000

    def worker3():
        mystuff.balance = mystuff.money - mystuff.debt

    # Run each worker in a forked process, then wait for all of them
    for w in worker1, worker2, worker3:
        posh.forkcall(w)
    posh.waitall()