Scaling real world applications using gevent

Talk at Pycon India 2012 by Aalok the Magnificent

Concurrency & Gevent: Scaling Real World Applications

Concurrency

handling a number of things at the same time

Examples: (Incoming) Web servers, database servers

(Outgoing) SSH Mux

SSH Mux

● Execute a command on a remote SSH server

● Handle concurrent SSH clients

● Command execution time varies from seconds to days

● Command execution happens on remote servers, so the SSH mux is I/O bound

SSH Client

1. Init session

2. Authenticate

3. Get a channel

4. Issue command

5. Read output

Need Concurrency?

● Process blocks on read()

● No new connections can be initiated

● Need the ability to handle multiple clients at the same time
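The blocking problem above can be illustrated with a minimal sketch that uses only the standard library. It is not the SSH mux itself: `handle_client` and `serve_serially` are hypothetical names, and `time.sleep` stands in for a `read()` that blocks until the remote command produces output.

```python
import time


def handle_client(client_id, read_delay):
    """Serve one client; time.sleep stands in for a blocking read()."""
    time.sleep(read_delay)  # the whole process is stuck here
    return f"client {client_id}: done"


def serve_serially(delays):
    """Handle clients one after another in a single process."""
    start = time.monotonic()
    results = [handle_client(i, delay) for i, delay in enumerate(delays)]
    elapsed = time.monotonic() - start
    return results, elapsed
```

Calling `serve_serially([0.2, 0.2, 0.2])` takes roughly the sum of the individual delays, because no client is started until the previous read has returned.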

Multiprocessing

● One process is the master

● Master can spawn workers

● Each worker handles one request at a time

● Pre-forked pool of workers
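A pre-forked worker pool along these lines can be sketched with the standard `multiprocessing` module. This is an illustration under assumptions, not the talk's code: `handle_request` and `run_prefork_pool` are hypothetical names, the sleep stands in for a blocking request, and the `"fork"` start method assumes a Unix-like system.

```python
import multiprocessing as mp
import os
import time


def handle_request(delay):
    """Each worker handles one request at a time; sleep stands in for I/O."""
    time.sleep(delay)
    return os.getpid()  # report which worker process served the request


def run_prefork_pool(num_workers, delays):
    """The master forks num_workers processes up front and hands out requests."""
    ctx = mp.get_context("fork")  # assumption: Unix-like OS
    with ctx.Pool(processes=num_workers) as pool:
        return pool.map(handle_request, delays)
```

For example, `run_prefork_pool(4, [0.1] * 8)` returns eight process IDs drawn from at most four distinct workers. Concurrency here is capped by the pool size.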

Concurrent SSH Clients

SSH Mux Memory Usage

[Chart: memory (MB, 0 to 600) against number of concurrent requests (0 to 100); series: Processes]

SSH Mux Performance

[Chart: time (s, 0 to 160) against number of concurrent requests (20 to 2100); series: Processes]

Multiprocessing yay

● Easy to get started

● OS guaranteed process isolation & fairness

● Covers up for misbehaving workers

● Add more concurrency by adding more workers

● Convenient when the numbers are small

Multiprocessing nay

● Concurrency limited by number of processes

● Memory heavy

● Implicit scheduling

● Synchronization is not trivial

More Concurrency?

● Command execution is happening on remote servers, we are mostly blocked on I/O

● Handle multiple I/O in a single process?

Gevent

gevent is a coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libevent event loop.

Greenlets

● Lightweight 'threads' - not OS threads

● Explicit scheduling - Cooperative

● Minimal stack

● Application decides execution flow

● Easy to synchronize / avoid locks

● All run inside one process

Libevent

● Use fastest mechanism to poll (portable)

● Fast Event loop

● In Gevent, event loop runs in a greenlet (event hub)

● Instead of blocking, greenlets switch to event hub

● It's all done behind the scenes

Monkey Patching

● Modifies the behaviour of blocking calls such as select and sleep to make them non-blocking

● Patches the Python standard socket library
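As a rough sketch of what monkey patching buys you, the snippet below patches the standard library and then calls the ordinary `time.sleep` from several greenlets. It assumes gevent is installed; `worker` and `run_workers` are hypothetical names used only for illustration.

```python
from gevent import monkey

monkey.patch_all()  # patch early, before other modules grab the blocking versions

import time

import gevent


def worker(delay):
    """Calls the standard time.sleep, which is now cooperative."""
    time.sleep(delay)  # patched: switches to the event hub instead of blocking
    return delay


def run_workers(delays):
    """Run one greenlet per delay and report the total wall-clock time."""
    start = time.monotonic()
    jobs = [gevent.spawn(worker, delay) for delay in delays]
    gevent.joinall(jobs)
    return [job.value for job in jobs], time.monotonic() - start
```

With patching in place, `run_workers([0.2] * 5)` should finish in about the time of the longest sleep rather than the sum, since the sleeps overlap inside one process.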

Gevent

● Greenlet 1 is running

● Greenlet 2 and 3 are ready

Gevent

● Greenlet 1 has to wait for read

● Greenlet 1 switches to Event hub

Gevent

● Event hub switches to Greenlet 2

Gevent

● Greenlet 2 runs

Gevent

● Greenlet 2 wants to sleep

● Greenlet 2 switches to Event hub

Gevent

● Greenlet 1 data has come, moved to ready state

● Event hub switches to Greenlet 3

Gevent

● Greenlet 3 runs

Gevent

● When Greenlet 1 resumes, it continues from the next instruction

● It's as if it were a blocking call
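The switching between greenlets and the event hub can be imitated with plain Python generators. This toy model is not how gevent is implemented: `run_hub` is a hypothetical scheduler in which a parked task becomes ready again after one other task has had a turn, which is enough to show that a task resumes right after the point where it gave up control.

```python
from collections import deque


def greenlet_1(log):
    log.append("G1 runs, waits for read")
    data = yield "read"  # give control to the hub until data arrives
    log.append(f"G1 resumes with {data}")


def greenlet_2(log):
    log.append("G2 runs, wants to sleep")
    yield "sleep"  # give control to the hub
    log.append("G2 wakes up")


def greenlet_3(log):
    log.append("G3 runs")
    yield "done"  # hand control back to the hub one last time


def run_hub(tasks):
    """Toy event hub: run ready tasks in order, park tasks that yield."""
    ready = deque((task, None) for task in tasks)
    parked = []
    while ready or parked:
        if not ready:  # nothing runnable: wake whatever is parked
            ready.extend((task, "event") for task in parked)
            parked = []
        task, value = ready.popleft()
        try:
            task.send(value)  # run until the task yields to the hub
            yielded = True
        except StopIteration:
            yielded = False
        # Events for previously parked tasks "fire" after this turn.
        ready.extend((t, "event") for t in parked)
        parked = [task] if yielded else []
```

Running `run_hub([greenlet_1(log), greenlet_2(log), greenlet_3(log)])` with an empty list `log` records G1, then G2, then G3, then the resumed G1, and finally G2 waking up.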

Green SSH Client

1. Init session

2. Authenticate

3. Get a channel

4. Issue command

5. Read output
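A green version of the client might be driven as in the sketch below. It assumes gevent is installed and does not talk to a real SSH server: `green_ssh_client` and `run_clients` are hypothetical names, steps 1 to 4 are omitted, and `gevent.sleep` stands in for waiting on command output.

```python
import gevent


def green_ssh_client(host, command, read_delay):
    """Hypothetical client; a real one would need a green SSH library."""
    # Steps 1-4 (init session, authenticate, get a channel, issue command)
    # are left out of this sketch.
    gevent.sleep(read_delay)  # step 5: stand-in for waiting on output
    return f"{host}: output of {command!r}"


def run_clients(hosts, command, read_delay=0.1):
    """Spawn one greenlet per host and collect the results in order."""
    jobs = [gevent.spawn(green_ssh_client, host, command, read_delay)
            for host in hosts]
    gevent.joinall(jobs, timeout=5)
    return [job.value for job in jobs]
```

Each client reads like ordinary blocking code, yet all of them wait at the same time inside a single process.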

A closer look

Going Concurrent

Use pre-forked processes to use all cores

Memory usage

[Chart: memory (MB, 0 to 45) against number of concurrent requests (0 to 100); series: Gevent+Processes]

SSH Mux Performance

[Chart: time (s, 0 to 70) against number of concurrent requests (20 to 2100); series: Gevent+Processes]

SSH Mux memory usage

[Chart: memory (MB, 0 to 600) against number of concurrent requests (0 to 100); series: Processes, Gevent]

SSH Mux Performance

[Chart: time (s, 0 to 160) against number of concurrent requests (20 to 2100); series: Processes, Gevent+Processes]

Gevent yay!

● Untwist – write linear non blocking code

● Explicit scheduling, dictate the execution flow

● Timeouts

● Events, AsyncResults for Synchronization

● gevent.wsgi

● Pre-spawned pool of greenlets
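Three of the features listed above (timeouts, AsyncResult and greenlet pools) can be sketched briefly. The snippet assumes gevent is installed; the helper names are hypothetical and `gevent.sleep` stands in for real I/O.

```python
import gevent
from gevent.event import AsyncResult
from gevent.pool import Pool


def slow_read(delay):
    gevent.sleep(delay)  # stand-in for a slow network read
    return "data"


def read_with_timeout(delay, limit):
    """Give up on the read after `limit` seconds and return None."""
    result = None
    with gevent.Timeout(limit, False):  # False: swallow the timeout silently
        result = slow_read(delay)
    return result


def wait_for_result():
    """One greenlet sets a value that the caller is waiting on."""
    box = AsyncResult()
    gevent.spawn_later(0.01, box.set, "ready")
    return box.get(timeout=1)


def pooled_reads(delays, size=2):
    """Run the reads through a pool of at most `size` greenlets."""
    return Pool(size).map(slow_read, delays)
```

The pool bounds how many greenlets run at once, which helps when the remote side can only take a limited number of connections.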

Gevent beware of

● No multicore support

● Not great for CPU bound applications

● Third party libs must be green (non blocking)

● Misbehaving workers can be lethal

● No fairness when it comes to scheduling

Take Away

● Gevent lets you write asynchronous code in a synchronous manner

● No multicore support, still need multiprocessing

● Not so great for CPU bound applications

● Split your application into CPU bound and IO bound parts

● Be willing to contribute patches

● Code available at git@github.com:aaloksood/pyexamples.git

Thank you

That's all folks!

Countdown Timer

● Count down from 200000000

● Split work among workers
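The countdown benchmark can be approximated with a short sketch using threads from the standard library. This is not the talk's benchmark code: `countdown`, `split_work` and `run_with_threads` are hypothetical names, and the same split could be handed to processes or greenlets for comparison.

```python
import threading
import time


def countdown(n):
    """Pure CPU work: count down to zero."""
    while n > 0:
        n -= 1


def split_work(total, workers):
    """Divide `total` into `workers` shares that add up to `total`."""
    base, extra = divmod(total, workers)
    return [base + (1 if i < extra else 0) for i in range(workers)]


def run_with_threads(total, workers):
    """Count down `total` steps split across threads; return elapsed seconds."""
    threads = [threading.Thread(target=countdown, args=(share,))
               for share in split_work(total, workers)]
    start = time.monotonic()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.monotonic() - start
```

On CPython with the global interpreter lock, adding threads to this kind of CPU bound loop is not expected to reduce the elapsed time, which is the comparison the charts that follow explore.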

Threads

[Chart: "Multithreading wonder", time (s, 0 to 25) against number of workers (1 to 10); series: 1 Core, 4 cores]

One core

[Chart: "Execution time One Core", time (s, 10.5 to 14.5) against number of workers (1 to 10); series: Processes, 1 Core, Gevent_1, Gevent_4]

Four cores

[Chart: "Execution time 4 cores", time (s, 0 to 25) against number of workers (1 to 10); series: Process, Threads, Gevent_1, Gevent_4]
