gevent network library
Denis Bilenko
gevent.org
Problem statement
    from urllib2 import urlopen
    response = urlopen('http://gevent.org')
    body = response.read()
How to manage concurrent connections?
Problem statement
    def on_response_read(response):
        d = response.read()
        d.addCallbacks(on_body_read, on_error)

    def on_error(error):
        ...

    def on_body_read(body):
        ...

    d = readURL('http://gevent.org')
    d.addCallbacks(on_response_read, on_error)
    reactor.run()
Possible answer: Async framework (Twisted, asyncore, ...)
simplicity is lost
Problem statement
    import urllib2
    from threading import Thread

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    t1 = Thread(target=read_url, args=('http://gevent.org',))
    t1.start()
    t2 = Thread(target=read_url, args=('http://python.org',))
    t2.start()
    t1.join()
    t2.join()
Possible answer: Threads
resource hog
Memory required for 10k connections
• twisted: 55 MB
• threading: 400 MB
gevent (greenlet + libevent)
    from gevent import monkey; monkey.patch_all()
    import gevent
    import urllib2

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    a = gevent.spawn(read_url, 'http://gevent.org')
    b = gevent.spawn(read_url, 'http://python.org')
    gevent.joinall([a, b])
concurrent fetch
Memory required for 10k connections
• twisted: 55 MB
• gevent: 70 MB
• threading: 400 MB
greenlet
from greenlet import greenlet
    >>> def myfunction(arg):
    ...     return arg + 1
    ...
    >>> g = greenlet(myfunction)
    >>> g.switch(2)
    3
from greenlet import greenlet
    >>> MAIN = greenlet.getcurrent()
    >>> def myfunction(arg):
    ...     MAIN.switch('hello')
    ...     return arg + 1
    ...
    >>> g = greenlet(myfunction)
    >>> g.switch(2)
    'hello'
    >>> g.switch('hello to you')
    3
switching deep down the stack
    >>> def myfunction(arg):
    ...     MAIN.switch('hello')
    ...     return arg + 1
    ...
    >>> def top_function(arg):
    ...     return myfunction(arg)
    ...
    >>> g = greenlet(top_function)
    >>> g.switch(2)
    'hello'
from greenlet import greenlet
• primitive pseudothreads, share the same OS thread
• switched explicitly via switch() and throw()
• organized in a tree, each has .parent except MAIN
• switch(), throw() and .parent are reserved for gevent
http://codespeak.net/py/0.9.2/greenlet.html
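The explicit hand-off shown on the greenlet slides can be approximated with plain generators from the standard library. This is only an analogy, not how greenlet is implemented: `yield` plays the role of `MAIN.switch('hello')`, and `send()` plays the role of `g.switch('hello to you')`. The helper `drive` is a hypothetical name introduced for this sketch, and the code targets Python 3 rather than the Python 2 used on the slides.

```python
def myfunction(arg):
    # `yield` hands 'hello' to the caller and pauses here, roughly like
    # MAIN.switch('hello'); the value passed to send() becomes `reply`.
    reply = yield 'hello'
    return arg + 1, reply


def top_function(arg):
    # Unlike a greenlet, a generator cannot pause from inside a plain nested
    # call: every frame in between has to delegate with `yield from`.
    result = yield from myfunction(arg)
    return result


def drive(gen, reply):
    """Hypothetical helper: run `gen` to its first pause, then resume it once."""
    first = next(gen)
    try:
        gen.send(reply)
    except StopIteration as stop:
        return first, stop.value
    raise RuntimeError('generator paused more than once')


if __name__ == '__main__':
    print(drive(top_function(2), 'hello to you'))
```

The need for `yield from` in `top_function` is the limitation that "switching deep down the stack" removes: with greenlet, `top_function` stays an ordinary function.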
How gevent uses greenlet
[Diagram: the MAIN greenlet, the HUB greenlet, and the spawned greenlets]
Hub: greenlet that runs the event loop

    import greenlet
    from gevent import core

    class Hub(greenlet.greenlet):

        def run(self):
            core.dispatch()  # wrapper for event_dispatch()

    def get_hub():
        """Return the global Hub instance, creating one if it does not exist."""
gevent/hub.py
Event loop
• libevent 1.4.x or 2.0.5-beta
• gevent.core: wraps libevent API (like pyevent)

    >>> def print_hello():
    ...     print 'hello'
    ...
    >>> gevent.core.timer(1, print_hello)
    <timer ...>
    >>> gevent.core.dispatch()
    hello
    1  # return value (no more events)
Implementation of gevent.sleep()

    def sleep(seconds=0):
        """Put the current greenlet to sleep"""
        switch = getcurrent().switch
        timer = core.timer(seconds, switch)
        try:
            get_hub().switch()
        finally:
            timer.cancel()
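The pattern behind sleep() (register a timer that resumes the caller, then hand control to the hub) can be sketched with the standard library alone. The sketch below is an assumption-laden toy, not gevent's code: `ToyHub`, `worker` and `demo` are hypothetical names, generators stand in for greenlets, a heap stands in for libevent's timers, and the clock is simulated so the run is deterministic.

```python
import heapq
import itertools


class ToyHub:
    """Toy event loop: resumes paused generators when their timer is due."""

    def __init__(self):
        self.now = 0.0                      # simulated clock, in seconds
        self._timers = []                   # heap of (due time, order, task)
        self._order = itertools.count()     # tie-breaker for equal due times

    def spawn(self, task):
        self._schedule(0, task)

    def _schedule(self, delay, task):
        heapq.heappush(self._timers, (self.now + delay, next(self._order), task))

    def run(self):
        # Like dispatch(): loop until there are no more events.
        while self._timers:
            due, _, task = heapq.heappop(self._timers)
            self.now = max(self.now, due)
            try:
                delay = next(task)          # resume the task until it "sleeps"
            except StopIteration:
                continue                    # task finished
            self._schedule(delay, task)     # timer that will resume it later


def worker(name, delay, log, hub):
    log.append((name, 'start', hub.now))
    yield delay                             # stands in for gevent.sleep(delay)
    log.append((name, 'done', hub.now))


def demo():
    hub = ToyHub()
    log = []
    hub.spawn(worker('a', 2, log, hub))
    hub.spawn(worker('b', 1, log, hub))
    hub.run()
    return log


if __name__ == '__main__':
    for entry in demo():
        print(entry)
```

Both workers start at time 0, and the one with the shorter sleep finishes first even though it was spawned second, which is the behaviour the hub-plus-timer design is meant to give.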
Cooperative socket
• gevent.socket: compatible synchronous interface
• wraps a non-blocking socket

    def recv(self, size):
        while True:
            try:
                return self._sock.recv(size)
            except error, ex:
                if ex[0] == EWOULDBLOCK:
                    wait_read(self.fileno())
                else:
                    raise
    def wait_read(fileno):
        switch = getcurrent().switch
        event = core.read_event(fileno, switch)
        try:
            get_hub().switch()
        finally:
            event.cancel()
gevent/socket.py
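The retry loop around a non-blocking socket can be tried out with the standard library. In this sketch, which is an illustration under stated assumptions and not gevent's implementation, `wait_read` blocks in `selectors` instead of switching to the hub, so nothing else runs while it waits. The function names mirror the slides, `demo` is a hypothetical helper, and the code is Python 3 (where EWOULDBLOCK surfaces as `BlockingIOError`).

```python
import selectors
import socket
import threading


def wait_read(sock, timeout=2.0):
    # gevent would register a read event and switch to the hub here;
    # this stand-in simply blocks until the socket is readable.
    with selectors.DefaultSelector() as selector:
        selector.register(sock, selectors.EVENT_READ)
        if not selector.select(timeout):
            raise TimeoutError('socket did not become readable')


def recv(sock, size):
    while True:
        try:
            return sock.recv(size)
        except BlockingIOError:
            # No data yet (EWOULDBLOCK): wait, then retry the recv.
            wait_read(sock)


def demo():
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    # Send a little later so the first recv() attempt finds no data.
    sender = threading.Timer(0.05, writer.sendall, args=(b'ping',))
    sender.start()
    try:
        return recv(reader, 4)
    finally:
        sender.join()
        reader.close()
        writer.close()


if __name__ == '__main__':
    print(demo())
```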
Cooperative socket
• gevent.socket
• dns queries are resolved through libevent-dns (getaddrinfo, gethostbyname)
• gevent.ssl
Monkey patching
    from gevent import monkey; monkey.patch_all()
    import gevent
    import urllib2

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    a = gevent.spawn(read_url, 'http://gevent.org')
    b = gevent.spawn(read_url, 'http://python.org')
    gevent.joinall([a, b])
Monkey patching
Patches:
• socket and ssl modules
• time.sleep, select.select
• thread and threading

Beware:
• libraries that wrap C libraries (e.g. MySQLdb)
• Disk I/O
• things not yet patched: subprocess, os.system, sys.stdin

Tested with httplib, urllib2, mechanize, mysql-connector, SQLAlchemy, ...
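Monkey patching itself is ordinary attribute replacement on a module at runtime. The sketch below shows the idea on `time.sleep` using only the standard library. It is not what `monkey.patch_all()` does internally, and `recording_sleep`, `patch_sleep` and `demo` are hypothetical names. It also suggests why the patch call is placed on the first line of the examples: a name bound with `from time import sleep` before patching keeps pointing at the original function.

```python
import time
from time import sleep as early_sleep  # bound before any patching happens

calls = []


def recording_sleep(seconds):
    # Stand-in replacement: record the request instead of blocking.
    calls.append(seconds)


def patch_sleep():
    """Replace time.sleep and return the original so it can be restored."""
    original = time.sleep
    time.sleep = recording_sleep
    return original


def demo():
    del calls[:]
    original = patch_sleep()
    try:
        time.sleep(5)  # returns immediately: the replacement runs
        late_lookup_patched = time.sleep is recording_sleep
        early_binding_patched = early_sleep is recording_sleep
    finally:
        time.sleep = original  # always undo the patch
    return list(calls), late_lookup_patched, early_binding_patched


if __name__ == '__main__':
    print(demo())
```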
Greenlet objects
    from gevent import monkey; monkey.patch_all()
    import gevent
    import urllib2

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    a = gevent.spawn(read_url, 'http://gevent.org')
    b = gevent.spawn(read_url, 'http://python.org')
    gevent.joinall([a, b])
Greenlet objects
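    import urllib2
    from gevent import Greenlet

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    g = Greenlet(read_url, url)
    g.start()

    # wait for it to complete
    g.join()

    # or raise an exception and wait to exit
    g.kill()

Greenlet(...) followed by start() is the same as gevent.spawn(...)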
Greenlet objects
    import urllib2
    from gevent import Greenlet

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    g = Greenlet(read_url, url)
    g.start()

    # wait for it to complete (or timeout expires)
    g.join(timeout=2)

    # or raise and wait to exit (or timeout expires)
    g.kill(timeout=2)

Greenlet(...) followed by start() is the same as gevent.spawn(...)
Timeouts
    with gevent.Timeout(5):
        response = urllib2.urlopen(url)
        for line in response:
            print line
    # raises Timeout if not done after 5 seconds

    with gevent.Timeout(5, False):
        response = urllib2.urlopen(url)
        for line in response:
            print line
    # exits block if not done after 5 seconds

Beware: a catch-all "except:" can swallow the Timeout, and code that never yields is not interrupted
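Both timeout modes and both caveats can be imitated with a small context manager. This is a toy under stated assumptions, not gevent's Timeout: `ToyTimeout`, its `checkpoint()` method and the `silent` flag are hypothetical, and time is counted in checkpoints rather than seconds. The point it illustrates is that the timeout can only fire at a checkpoint, which stands in for a call that yields.

```python
class ToyTimeout(Exception):
    """Toy stand-in for a cooperative timeout; fires only at checkpoint()."""

    def __init__(self, limit, silent=False):
        super().__init__(limit)
        self.limit = limit
        self.silent = silent
        self.elapsed = 0

    def __enter__(self):
        self.elapsed = 0
        return self

    def checkpoint(self, cost=1):
        # Stands in for a yielding call (socket read, sleep, ...).
        self.elapsed += cost
        if self.elapsed > self.limit:
            raise self

    def __exit__(self, exc_type, exc, tb):
        # Swallow only our own timeout, and only in silent mode.
        return exc is self and self.silent


def read_lines(timeout, count):
    done = 0
    with timeout:
        for _ in range(count):
            timeout.checkpoint()
            done += 1
    return done


def swallow_everything(timeout, count):
    done = 0
    with timeout:
        for _ in range(count):
            try:
                timeout.checkpoint()
            except:  # catch-all: the timeout is silently lost here
                pass
            done += 1
    return done


if __name__ == '__main__':
    print(read_lines(ToyTimeout(5, silent=True), 10))
    print(swallow_everything(ToyTimeout(5), 10))
```

In silent mode the block is left early and execution continues after it. With the catch-all handler, the loop runs to the end as if no timeout had been set.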
API
• socket, ssl
• Greenlet
• Timeout
• Event, AsyncResult
• Queue (also JoinableQueue, PriorityQueue, LifoQueue)
  – Queue(0) is a synchronous channel
• Pool
• StreamServer: TCP and SSL servers
• WSGI servers
WSGI servers
• gevent.wsgi
  – uses libevent-http
  – efficient, but lacks important features
• gevent.pywsgi
  – uses gevent sockets
• green unicorn (gunicorn.org)
  – its own parser or gevent's server
  – pre-fork workers
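What all of these servers host is a WSGI application, which is just a callable. The sketch below defines a minimal one and calls it directly with a test environ from the standard library's wsgiref, so it runs without any server. `application` and `call_app` are hypothetical names, and the code is Python 3.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    # A WSGI app receives the request environ and a start_response callable,
    # and returns an iterable of byte strings.
    body = b'hello from a WSGI app\n'
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    return [body]


def call_app(app):
    """Hypothetical helper: invoke `app` once without a real server."""
    environ = {}
    setup_testing_defaults(environ)   # fills in a minimal GET / request
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], body


if __name__ == '__main__':
    print(call_app(application))
```

With gevent installed, the same callable would typically be handed to a server such as `gevent.pywsgi.WSGIServer(('', 8000), application).serve_forever()`. The port is an arbitrary choice for illustration.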
Caveat emptor
• Reduced portability
  – no Jython, IronPython
  – not all platforms supported by CPython
• PyThreadState is shared
  – exc_info (saved/restored by gevent)
  – tracing, profiling info
Future plans
• http://code.google.com/p/gevent/issues/list
• alternative coroutine libraries
  – Stackless
  – swapcontext
• more libevent:
  – http client
  – buffered socket operations
  – priorities
• process handling (gevent.subprocess)
• even more stable API with 1.0
Examples
• bitbucket.org/denis/gevent/src/tip/examples/
• chat.gevent.org
• omegle.com
• ProjectsUsingGevent
  – gevent-mysql
  – psycopg2
• bit.ly/use-gevent
  – websockets, web crawlers, facebook apps
Summary
• coroutines are easy-to-use threads
• as efficient as async libraries
• works well if app is I/O bound
• simple API, many things familiar
• works with unsuspecting 3rd party modules
Thank you!
gevent.org
@gevent