Talk at Pycon India 2012 by Aalok the Magnificent
Concurrency & Gevent: Scaling Real World Applications
Concurrency
handling a number of things at the same time
Examples: (Incoming) WebServers, Database Servers
(Outgoing) SSH Mux
SSH Mux
● Execute a command on a remote SSH server
● Handle concurrent SSH clients
● Command execution time varies from seconds to days
● Command execution happens on remote servers, so the SSH mux is I/O bound
SSH Client
1. Init session
2. Authenticate
3. Get a channel
4. Issue command
5. Read output
Need Concurrency?
● Process blocks on read()
● No new connections can be initiated while it is blocked
● Need the ability to handle multiple clients at the same time
Multiprocessing
● One process is the master
● Master can spawn workers
● Each worker handles one request at a time
● Pre-forked pool of workers
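The master/worker arrangement above can be sketched with the standard library's `multiprocessing.Pool`. This is only an illustration, not the talk's code: `handle_request` and `serve` are hypothetical names, and the `time.sleep` call stands in for waiting on a remote SSH server.

```python
import multiprocessing
import os
import time


def handle_request(command):
    """Hypothetical stand-in for one SSH request; the sleep simulates remote I/O."""
    time.sleep(0.1)
    return "%s handled by pid %d" % (command, os.getpid())


def serve(commands, num_workers=4):
    """Master: start the worker pool up front, then hand out the requests."""
    with multiprocessing.Pool(processes=num_workers) as pool:
        # Each worker handles one command at a time; map blocks until all finish.
        return pool.map(handle_request, commands)


if __name__ == "__main__":
    for line in serve(["uptime", "df -h", "whoami", "date"]):
        print(line)
```

With four workers, at most four requests are in flight at once, which is the limit the later "Multiprocessing nay" slide points at.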
Concurrent SSH Clients
SSH Mux Memory Usage
[Chart: ssh mux memory usage. Memory (MB) vs. # concurrent requests. Series: Processes.]
SSH Mux Performance
[Chart: ssh mux performance. Time (s) vs. # concurrent requests. Series: Processes.]
Multiprocessing yay
● Easy to get started
● OS guaranteed process isolation & fairness
● Covers up for misbehaving workers
● Add more concurrency by adding more workers
● Convenient when the number of concurrent requests is small
Multiprocessing nay
● Concurrency limited by number of processes
● Memory heavy
● Implicit scheduling
● Synchronization is not trivial
More Concurrency?
● Command execution is happening on remote servers, we are mostly blocked on I/O
● Handle multiple I/O in a single process?
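One way to handle several I/O streams in a single process is to ask the OS which sockets are ready and read only from those. The sketch below uses the standard library's `selectors` module rather than gevent. `read_all` is a hypothetical helper, and the socket pairs stand in for connections to remote servers.

```python
import selectors
import socket


def read_all(connections):
    """Read from several sockets in one process, servicing whichever is ready."""
    selector = selectors.DefaultSelector()
    outputs = {name: b"" for name in connections}
    for name, sock in connections.items():
        sock.setblocking(False)
        selector.register(sock, selectors.EVENT_READ, data=name)

    open_count = len(connections)
    while open_count:
        # select() blocks until at least one registered socket is readable.
        for key, _events in selector.select():
            chunk = key.fileobj.recv(4096)
            if chunk:
                outputs[key.data] += chunk
            else:  # empty read: the peer closed its end
                selector.unregister(key.fileobj)
                open_count -= 1
    selector.close()
    return outputs


# Two socket pairs stand in for two remote commands producing output.
local_a, remote_a = socket.socketpair()
local_b, remote_b = socket.socketpair()
remote_b.sendall(b"output from b")
remote_a.sendall(b"output from a")
remote_a.close()
remote_b.close()

results = read_all({"a": local_a, "b": local_b})
local_a.close()
local_b.close()
print(results)
```

The code works, but the control flow is driven by readiness events instead of reading top to bottom. That is the shape gevent hides behind a synchronous-looking API.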
Gevent
gevent is a coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on
top of the libevent event loop.
Greenlets
● Lightweight 'threads' - not OS threads
● Explicit scheduling - Cooperative
● Minimal stack
● Application decides execution flow
● Easy to synchronize/ Avoid locks
● All run inside one process
Libevent
● Use fastest mechanism to poll (portable)
● Fast Event loop
● In Gevent, event loop runs in a greenlet (event hub)
● Instead of blocking, greenlets switch to event hub
● It's all done behind the scene
Monkey Patching
● Modifies the behaviour of blocking calls such as select and sleep so that they become non-blocking
● Patches the Python standard socket library
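A minimal sketch of monkey patching in practice, assuming gevent is installed. `worker` is a hypothetical function, and only the `time` and `socket` modules are patched here; `monkey.patch_all()` is the broader variant.

```python
from gevent import monkey

# Patch before anything else imports the modules being replaced.
monkey.patch_time()
monkey.patch_socket()

import time

import gevent


def worker(name, delay, finished):
    # time.sleep is now gevent's cooperative sleep: it yields to the event
    # hub instead of blocking the whole process.
    time.sleep(delay)
    finished.append(name)


finished = []
gevent.joinall([
    gevent.spawn(worker, "slow", 0.3, finished),
    gevent.spawn(worker, "fast", 0.1, finished),
])
print(finished)
```

If the sleep were still blocking, "slow" would start first and also finish first. With the patch in place, "fast" completes while "slow" is still waiting.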
Gevent
● Greenlet 1 is running; Greenlets 2 and 3 are ready
● Greenlet 1 has to wait for a read, so it switches to the event hub
● The event hub switches to Greenlet 2, which runs
● Greenlet 2 wants to sleep, so it switches to the event hub
● Data for Greenlet 1 arrives and it moves to the ready state; the event hub switches to Greenlet 3
● Greenlet 3 runs
● When Greenlet 1 resumes, it continues from the next instruction
● To Greenlet 1, it is as if the read had been a blocking call
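The walk-through above can be reproduced in a few lines, assuming gevent is installed. The three functions are hypothetical, and `gevent.sleep` stands in for both the read wait and the sleep; the `log` list records the order in which things happen.

```python
import gevent

log = []


def greenlet_1():
    log.append("g1 starts")
    gevent.sleep(0.05)  # stands in for waiting on a read; yields to the hub
    log.append("g1 resumes at the next instruction")


def greenlet_2():
    log.append("g2 starts")
    gevent.sleep(0.2)  # wants to sleep; yields to the hub
    log.append("g2 resumes")


def greenlet_3():
    log.append("g3 runs")  # never waits, so it runs straight through


gevent.joinall([
    gevent.spawn(greenlet_1),
    gevent.spawn(greenlet_2),
    gevent.spawn(greenlet_3),
])
for entry in log:
    print(entry)
```

Each function is written as plain sequential code; the switching happens only at the points where a greenlet would otherwise have blocked.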
Green SSH Client
1. Init session
2. Authenticate
3. Get a channel
4. Issue command
5. Read output
A closer look
Going Concurrent
Use pre-forked processes to use all cores
Memory usage
[Chart: ssh mux memory usage. Memory (MB) vs. # concurrent requests. Series: Gevent+Processes.]
SSH Mux Performance
[Chart: ssh mux performance. Time (s) vs. # concurrent requests. Series: Gevent+Processes.]
SSH Mux memory usage
[Chart: ssh mux memory usage. Memory (MB) vs. # concurrent requests. Series: Processes, Gevent.]
SSH Mux Performance
[Chart: ssh mux performance. Time (s) vs. # concurrent requests. Series: Processes, Gevent+Processes.]
Gevent yay!
● Untwist: write linear, non-blocking code
● Explicit scheduling, dictate the execution flow
● Timeouts
● Events, AsyncResults for Synchronization
● gevent.wsgi
● Pre-spawned pool of greenlets
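A short sketch of three of the features listed above, assuming gevent is installed: a bounded pool, an `AsyncResult` for passing a value between greenlets, and a `Timeout`. `run_command` and the host names are hypothetical.

```python
import gevent
from gevent.event import AsyncResult
from gevent.pool import Pool


def run_command(host):
    """Hypothetical stand-in for a remote command; the sleep simulates I/O wait."""
    gevent.sleep(0.05)
    return "%s: ok" % host


# Pool: caps how many greenlets run at once (two here).
pool = Pool(2)
outputs = pool.map(run_command, ["host1", "host2", "host3"])

# AsyncResult: one greenlet publishes a value, another waits for it.
result = AsyncResult()
gevent.spawn_later(0.05, result.set, "done")
value = result.get(timeout=1)

# Timeout: give up on a slow operation instead of waiting forever.
timed_out = False
try:
    with gevent.Timeout(0.05):
        gevent.sleep(1)
except gevent.Timeout:
    timed_out = True

print(outputs, value, timed_out)
```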
Gevent beware of
● No multicore support
● Not great for CPU bound applications
● Third-party libs must be green (non-blocking)
● Misbehaving workers can be lethal
● No fairness when it comes to scheduling
Take Away
● Gevent lets you write asynchronous code in a synchronous manner
● No multicore support, still need multiprocessing
● Not so great for CPU bound applications
● Split your application into CPU bound and IO bound parts
● Be willing to contribute patches
● Code available at git@github.com:aaloksood/pyexamples.git
Thank you
That's all folks!
Countdown Timer
● Count down from 200000000
● Split work among workers
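The countdown benchmark can be sketched as below for the threaded case. This is not the talk's code: `countdown` and `run_with_threads` are hypothetical names, and the total is scaled down so the example finishes quickly. On standard CPython builds the global interpreter lock lets only one thread execute Python bytecode at a time, so adding threads is not expected to shorten a pure CPU loop like this one.

```python
import threading
import time


def countdown(n):
    """CPU-bound work: a pure Python loop with no I/O."""
    while n > 0:
        n -= 1


def run_with_threads(total, num_workers):
    """Split `total` evenly across threads and return the elapsed seconds."""
    share = total // num_workers
    threads = [threading.Thread(target=countdown, args=(share,))
               for _ in range(num_workers)]
    started = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - started


if __name__ == "__main__":
    total = 2000000  # scaled down from the slide's 200000000
    for workers in (1, 2, 4):
        elapsed = run_with_threads(total, workers)
        print(workers, "thread(s):", round(elapsed, 3), "s")
```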
Threads
[Chart: Multithreading wonder. Time (s) vs. # workers. Series: 1 core, 4 cores.]
One core
[Chart: Execution time, one core. Time (s) vs. # workers. Series: Processes, 1 Core, Gevent_1, Gevent_4.]
Four cores
[Chart: Execution time, 4 cores. Time (s) vs. # workers. Series: Process, Threads, Gevent_1, Gevent_4.]