Debugging of (C)Python applications June the 20th, KharkivPy Roman Podoliaka (@amd4ever) http://bit.ly/1LpjXGL

Page 1: Debugging of (C)Python applications

Debugging of (C)Python applications

June the 20th, KharkivPy Roman Podoliaka (@amd4ever)

http://bit.ly/1LpjXGL

Page 2: Debugging of (C)Python applications

Why debugging?

• OpenStack: an open source cloud platform

• dozens of (micro-)services

• new features are important, but making OpenStack stable, scalable and HA is even more important

• daily performance testing on hundreds of bare metal nodes

• nightly CI jobs running functional and destructive tests

• things break… pretty much all the time!

Page 3: Debugging of (C)Python applications

A little humble OpenStack

Page 4: Debugging of (C)Python applications

Typical environment

• CentOS 6 or Ubuntu 14.04

• CPython 2.6 or 2.7

• eventlet-based concurrency model for Python services

• MySQL (Galera), memcache [, MongoDB]

• RabbitMQ

Page 5: Debugging of (C)Python applications

Credits

• “Debugging Python applications in Production” by Vladimir Kirillov (https://www.youtube.com/watch?v=F9FHIghn_Vk)

• Brendan Gregg’s Blog (http://www.brendangregg.com/blog/index.html)

Page 6: Debugging of (C)Python applications

printf() debugging

Page 7: Debugging of (C)Python applications

printf() debugging: python-memcache

def _get_server(self, key):
    if isinstance(key, tuple):
        serverhash, key = key
    else:
        serverhash = serverHashFunction(key)

    if not self.buckets:
        return None, None

    for i in range(Client._SERVER_RETRIES):
        server = self.buckets[serverhash % len(self.buckets)]
        if server.connect():
            # print("(using server %s)" % server,)
            return server, key
        serverhash = serverHashFunction(str(serverhash) + str(i))
    return None, None

Page 8: Debugging of (C)Python applications

printf() debugging: just don’t do that!

• the most primitive way of introspection at runtime

• either always enabled or explicitly commented out in the code

• limited to stdout/stderr streams

• information is only (barely) usable for developers

• pollutes the code when committed to VCS repositories
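A minimal sketch of the alternative to a commented-out print: the same message goes through the standard logging module at DEBUG level, so it can be switched on by configuration instead of by editing the code. The function and logger names below are hypothetical and are not taken from python-memcache.

```python
import logging

LOG = logging.getLogger("example.client")  # hypothetical logger name


def pick_server(servers, serverhash):
    """Pick a server by hash; a simplified stand-in for the lookup shown earlier."""
    if not servers:
        return None
    server = servers[serverhash % len(servers)]
    # Emitted only when DEBUG is enabled for this logger; no code edit needed.
    LOG.debug("using server %s", server)
    return server


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    pick_server(["10.0.0.1:11211", "10.0.0.2:11211"], 12345)
```

With the default WARNING level the debug call produces no output, so the line can stay in the committed code.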

Page 9: Debugging of (C)Python applications

Logging

Page 10: Debugging of (C)Python applications

Logging: basics

import logging

FORMAT = "%(asctime)-15s %(clientip)s %(user)-8s %(message)s"
logging.basicConfig(format=FORMAT)

d = {'clientip': '192.168.0.1', 'user': 'fbloggs'}
logging.warning("Protocol problem: %s", "connection reset", extra=d)

2006-02-08 22:20:02,165 192.168.0.1 fbloggs Protocol problem: connection reset

Page 11: Debugging of (C)Python applications

Logging: log levels

if is_pid_cmdline_correct(pid, conffile.split('/')[-1]):
    try:
        _execute('kill', '-HUP', pid, run_as_root=True)
        _add_dnsmasq_accept_rules(dev)
        return
    except Exception as exc:
        LOG.error(_LE('kill -HUP dnsmasq threw %s'), exc)
else:
    LOG.debug('Pid %d is stale, relaunching dnsmasq', pid)

Level      Numeric value
CRITICAL   50
ERROR      40
WARNING    30
INFO       20
DEBUG      10
NOTSET     0

Page 12: Debugging of (C)Python applications

Logging: log records propagation

import logging

LOG = logging.getLogger('sqlalchemy.orm')

...

LOG.debug('Instance changed state from `%(prev_state)s` to `%(new_state)s`',
          {'prev_state': prev_state, 'new_state': new_state})

sqlalchemy.orm -> sqlalchemy -> (root)
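The propagation chain can be observed with a small sketch. It uses hypothetical logger names (demo_app, demo_app.orm) rather than the real sqlalchemy loggers, and a list-collecting handler defined only for this example. A record emitted on the child reaches a handler attached to the parent until propagation is switched off.

```python
import logging


class ListHandler(logging.Handler):
    """Collects messages in a list so propagation can be observed."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append("%s: %s" % (record.name, record.getMessage()))


parent = logging.getLogger("demo_app")      # hypothetical parent logger
child = logging.getLogger("demo_app.orm")   # its child, by dotted name
parent.setLevel(logging.DEBUG)

handler = ListHandler()
parent.addHandler(handler)                  # attached to the parent only

# Named placeholders are filled from a single dict argument.
child.debug("state changed from %(prev)s to %(new)s",
            {"prev": "pending", "new": "persistent"})

child.propagate = False                     # stop records at the child
child.debug("this record never reaches the parent handler")

print(handler.messages)
```

Only the first record ends up in handler.messages; the second is dropped because the child has no handlers of its own.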

Page 13: Debugging of (C)Python applications

Logging: context matters

cfg.StrOpt('logging_context_format_string',
           default='%(asctime)s.%(msecs)03d %(process)d %(levelname)s '
                   '%(name)s [%(request_id)s %(user_identity)s] '
                   '%(instance)s%(message)s')

2015-06-10 12:42:00.765 27516 INFO nova.osapi_compute.wsgi.server [req-58f233ab-f2b6-452f-b4fe-0c781ce8f8d0 None] 192.168.0.1 "GET /v2/fc7f78f1c53d4443976514d2fd16e5cb/images/detail HTTP/1.1" status: 200 len: 905 time: 0.1043971

2015-06-10 12:41:57.004 2760 AUDIT nova.virt.block_device [req-209db629-0d06-4f81-92ad-b910f1a72b36 None] [instance: a0d1c6ef-1fa8-46f9-a19d-f8fb7d2df6a2] Booting with volume 8bad9533-9d6f-4be8-939d-b7a28a536a1a at /dev/vda
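How the request id gets into every record is not shown on the slides. One way to do it with the standard library alone is logging.LoggerAdapter, sketched below. The logger name, the format string and the request id value are assumptions made for illustration; this is not the OpenStack implementation.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(
    "%(levelname)s %(name)s [%(request_id)s] %(message)s"))

base = logging.getLogger("demo.api")   # hypothetical logger name
base.setLevel(logging.INFO)
base.addHandler(handler)
base.propagate = False                 # keep the demo output in one place

# LoggerAdapter injects the same 'extra' dict into every record it emits,
# so the formatter can always resolve %(request_id)s.
log = logging.LoggerAdapter(base, {"request_id": "req-1234"})
log.info("GET /images/detail status: %d", 200)

print(stream.getvalue(), end="")
```

In a real service the adapter would typically be created per request, so that all records of one request share the same id and can be correlated later.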

Page 14: Debugging of (C)Python applications

Logging: log processing

• logs are collected from different sources and parsed (Logstash)

• then they are imported into a full-text search system (ElasticSearch)

• Web UI is used for providing easy access to results and querying (Kibana)

Page 15: Debugging of (C)Python applications

Logging: log processing

title: Kernel Neighbour table overflow
query: >
    filename:kernel.log AND level:warning AND message:neighbour AND message:overflow

title: Neutron Skipping router removal
query: >
    filename:neutron-l3-agent.log AND location:neutron.agent.l3_agent AND message:skipping AND message:removal

title: Neutron OVS lib errors and warnings
query: >
    filename:neutron-openvswitch-agent.log AND location:neutron.agent.linux.ovs_lib AND level:(error OR warning)

title: Neutron race condition at subnet deletion
query: >
    filename:neutron AND level:trace AND message:AttributeError

Page 16: Debugging of (C)Python applications

Logging: summary

• useful for both developers and operators

• developers define verbosity by means of logging levels

• configurable handlers (file, syslog, network, etc)

• advanced tooling for log processing / monitoring

Page 17: Debugging of (C)Python applications

Logging: useful links

• General info: https://docs.python.org/3.3/howto/logging.html#logging-howto

• Adding contextual information: https://docs.python.org/2/howto/logging-cookbook.html#adding-contextual-information-to-your-logging-output

• Logstash/ElasticSearch/Kibana: http://www.logstash.net/docs/1.4.2/tutorials/getting-started-with-logstash

Page 18: Debugging of (C)Python applications

pdb

Page 19: Debugging of (C)Python applications

pdb: basics

def _binary_search(arr, left, right, key):
    if left == right:
        return -1

    middle = left + (right - left) / 2

    if key == arr[middle]:
        return middle
    elif key > arr[middle]:
        return _binary_search(arr, middle, right, key)
    else:
        return _binary_search(arr, left, middle, key)


def binary_search(arr, key):
    return _binary_search(arr, 0, len(arr), key)


l = list(range(10))
assert binary_search(l, 5) == 5
assert binary_search(l, 0) == 0
assert binary_search(l, 9) == 9
assert binary_search(l, 10) == -1
assert binary_search(l, -5) == -1

Page 20: Debugging of (C)Python applications

pdb: basics

Romans-MacBook-Air:03-pdb malor$ python -m pdb basics.py
> /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py(1)<module>()
-> def _binary_search(arr, left, right, key):
(Pdb) break binary_search
Breakpoint 1 at /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py:15
(Pdb) continue
> /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py(16)binary_search()
-> return _binary_search(arr, 0, len(arr), key)
(Pdb) args
arr = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
key = 5
(Pdb) step
--Call--
> /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py(1)_binary_search()
-> def _binary_search(arr, left, right, key):
(Pdb) next
> /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py(2)_binary_search()
-> if left == right:

Page 21: Debugging of (C)Python applications

pdb: basics

(Pdb) list
  1     def _binary_search(arr, left, right, key):
  2  ->     if left == right:
  3             return -1
  4
  5         middle = left + (right - left) / 2
  6
  7         if key == arr[middle]:
  8             return middle
  9         elif key > arr[middle]:
 10             return _binary_search(arr, middle, right, key)
 11         else:
(Pdb) where
  /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/bdb.py(400)run()
-> exec cmd in globals, locals
  <string>(1)<module>()
  /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py(20)<module>()
-> assert binary_search(l, 5) == 5
  /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py(16)binary_search()
-> return _binary_search(arr, 0, len(arr), key)
> /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py(2)_binary_search()
-> if left == right:

Page 22: Debugging of (C)Python applications

pdb: post-mortem debugging

Romans-MacBook-Air:03-pdb malor$ python -m pdb basics.py
> /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py(1)<module>()
-> def _binary_search(arr, left, right, key):
(Pdb) continue
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pdb.py", line 1314, in main
    pdb._runscript(mainpyfile)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pdb.py", line 1233, in _runscript
    self.run(statement)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/bdb.py", line 400, in run
    exec cmd in globals, locals
  File "<string>", line 1, in <module>
  File "basics.py", line 1, in <module>
    def _binary_search(arr, left, right, key):
  File "basics.py", line 16, in binary_search
    return _binary_search(arr, 0, len(arr), key)
…
RuntimeError: maximum recursion depth exceeded
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
-> return _binary_search(arr, middle, right, key)

py.test --pdb
nosetests --pdb -s
. . .
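Post-mortem debugging can also be triggered from code rather than from the command line. The sketch below wraps a call and hands the traceback to pdb.post_mortem when an exception escapes. The helper name run_with_post_mortem and its debugger parameter are hypothetical; the parameter exists only so the debugger can be swapped out in non-interactive runs. The signature uses Python 3 syntax.

```python
import pdb
import sys


def run_with_post_mortem(fn, *args, debugger=pdb.post_mortem):
    """Call fn(*args); on an uncaught exception, open the debugger on its traceback."""
    try:
        return fn(*args)
    except Exception:
        # sys.exc_info()[2] is the traceback of the exception being handled.
        debugger(sys.exc_info()[2])
        raise
```

Calling run_with_post_mortem(binary_search, l, 10) on the example above would stop in the frame that raised, with the usual pdb commands available for inspecting the call stack.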

Page 23: Debugging of (C)Python applications

pdb: commands

(Pdb) break
Num Type         Disp Enb   Where
1   breakpoint   keep yes   at /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py:15
        breakpoint already hit 2 times
(Pdb) commands 1
(com) args
(com) where
(com) end
(Pdb) continue
arr = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
key = 5
  /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/bdb.py(400)run()
-> exec cmd in globals, locals
  <string>(1)<module>()
  /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py(20)<module>()
-> assert binary_search(l, 5) == 5
> /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py(16)binary_search()
-> return _binary_search(arr, 0, len(arr), key)
> /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py(16)binary_search()
-> return _binary_search(arr, 0, len(arr), key)

Page 24: Debugging of (C)Python applications

pdb: conditional break points

(Pdb) break binary_search
Breakpoint 1 at /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py:15
(Pdb) break
Num Type         Disp Enb   Where
1   breakpoint   keep yes   at /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py:15
(Pdb) condition 1 key == 10
(Pdb) continue
> /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py(16)binary_search()
-> return _binary_search(arr, 0, len(arr), key)
(Pdb) args
arr = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
key = 10
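The breakpoint on key == 10 isolates the failing call. The slides do not show a fix, so the following is one possible correction, offered as a sketch. In the original, the upper-half recursion keeps middle inside the range, so once right - left == 1 the range (middle, right) stops shrinking and the recursion never ends. Excluding middle resolves that; integer division (//) is used so the code also runs on Python 3.

```python
def _binary_search(arr, left, right, key):
    # Search the half-open range [left, right).
    if left >= right:
        return -1

    middle = left + (right - left) // 2

    if key == arr[middle]:
        return middle
    elif key > arr[middle]:
        # Exclude middle, otherwise a one-element range never shrinks.
        return _binary_search(arr, middle + 1, right, key)
    else:
        return _binary_search(arr, left, middle, key)


def binary_search(arr, key):
    return _binary_search(arr, 0, len(arr), key)
```

With this change the five assertions from the example script pass, including the lookup of 10, which previously exceeded the recursion limit.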

Page 25: Debugging of (C)Python applications

pdb: summary

• bread and butter of Python developers

• usually the easiest and the quickest way of debugging scripts/apps

• integrated with popular test runners

• greenlet-friendly

• requires stdin/stdout, thus not usable for debugging daemons or embedded Python code (like Gimp or Blender plugins)

• not suitable for debugging of multithreaded/multiprocessing applications

• can’t attach to a running process (if not modified in advance)

Page 26: Debugging of (C)Python applications

winpdb

Page 27: Debugging of (C)Python applications

winpdb: attaching to a process

rpodolyaka@rpodolyaka-pc:~/sandbox/debugging$ rpdb2 -d search.py
A password should be set to secure debugger client-server communication.
Please type a password:r00tme
Password has been set

rpodolyaka@rpodolyaka-pc:~$ rpdb2
RPDB2 - The Remote Python Debugger, version RPDB_2_4_8,
Copyright (C) 2005-2009 Nir Aides.

> password r00tme
Password is set to: "r00tme"

> attach
Connecting to 'localhost'...
Scripts to debug on 'localhost':

   pid    name
--------------------------
   3706   /home/rpodolyaka/sandbox/debugging/search.py

> attach 3706
> *** Attaching to debuggee...

Page 28: Debugging of (C)Python applications

winpdb: attaching to a process

> bp binary_search

> bl
List of breakpoints:

 Id  State      Line  Filename-Scope-Condition-Encoding
------------------------------------------------------------------------------
  0  enabled      15  /home/rpodolyaka/sandbox/debugging/search.py
                      binary_search

> go

> *** Debuggee is waiting at break point for further commands.
> stack
Stack trace for thread 140416296978176:

   Frame  File Name                     Line  Function
------------------------------------------------------------------------------
 >     0  ...ndbox/debugging/search.py    15  <module>
       1  ....7/dist-packages/rpdb2.py 14220  StartServer
       2  ....7/dist-packages/rpdb2.py 14470  main
       3  /usr/bin/rpdb2                  31  <module>

Page 29: Debugging of (C)Python applications

winpdb: embedded debugging

def add_lease(mac, ip_address):
    """Set the IP that was assigned by the DHCP server."""

    import rpdb2; rpdb2.start_embedded_debugger('r00tme')
    api = network_rpcapi.NetworkAPI()
    api.lease_fixed_ip(context.get_admin_context(), ip_address, CONF.host)

dnsmasq daemon forks and executes this like:

nova-dhcpbridge add AA:BB:CC:DD:EE:FF 10.0.0.2

Page 30: Debugging of (C)Python applications

winpdb: debugging of threads

def allocate_ips(engine, host):
    while True:
        with engine.begin() as conn:
            result = conn.execute(
                ip_addresses.select() \
                    .where(ip_addresses.c.host.is_(None))
            ).first()
            if result is None:
                # no IPs left
                break

            id, address = result.id, result.address
            rows = conn.execute(
                ip_addresses.update() \
                    .values(host=host) \
                    .where(ip_addresses.c.id == id) \
                    .where(ip_addresses.c.address == address) \
                    .where(ip_addresses.c.host.is_(None))
            )
            if not rows:
                # concurrent update
                continue

Page 31: Debugging of (C)Python applications

winpdb: debugging of threads

t1 = threading.Thread(target=allocate_ips, args=(eng, 'host1'))
t1.start()

t2 = threading.Thread(target=allocate_ips, args=(eng, 'host2'))
t2.start()

t1.join()
t2.join()

> attach $PID

> thread
List of active threads known to the debugger:

    No    Tid               Name         State
-----------------------------------------------
     0    140456866166528   MainThread   waiting at break point
  >  1    140456389068544   Thread-1     waiting at break point
     2    140456380675840   Thread-2     waiting at break point

Page 32: Debugging of (C)Python applications

winpdb: debugging of threads

> thread 2
Focus was set to chosen thread.

> stack
Stack trace for thread 140456380675840:

   Frame  File Name                     Line  Function
------------------------------------------------------------------------------
 >     0  /home/rpodolyaka/sa.py          30  allocate_ips
       1  ...ib/python2.7/threading.py   763  run

> go

> break
> *** Debuggee is waiting at break point for further commands.

> stack
Stack trace for thread 140456380675840:

   Frame  File Name                     Line  Function
------------------------------------------------------------------------------
 >     0  ...alchemy/engine/default.py   409  do_commit
       1  ...sqlalchemy/engine/base.py   525  _commit_impl
       2  ...sqlalchemy/engine/base.py  1364  _do_commit

Page 33: Debugging of (C)Python applications

winpdb: summary

• supports debugging of multithreaded Python applications

• remote debugging (which effectively means no stdin/stdout limitations as with pdb)

• wxWidgets-based GUI

• to attach to a running process, you need to modify it in advance (embedded debugging) or start it with rpdb2

Page 34: Debugging of (C)Python applications

cProfile

Page 35: Debugging of (C)Python applications

cProfile: basics

def count_freq(stream):
    res = {}
    for i in iter(lambda: stream.read(1), ''):
        try:
            res[i] += 1
        except KeyError:
            res[i] = 1
    return res


def build_tree(stream):
    queue = [Node(freq=v, symb=k) for k, v in count_freq(stream).items()]
    while len(queue) > 1:
        queue.sort(key=lambda k: k.freq)

        first = queue.pop(0)
        second = queue.pop(0)

        queue.append(
            Node(freq=(first.freq + second.freq), left=first, right=second)
        )

    return queue[0]

Page 36: Debugging of (C)Python applications

cProfile: Amdahl's law

Page 37: Debugging of (C)Python applications

cProfile: basics

Romans-MacBook-Air:07-cprofile malor$ python -m cProfile -s cumtime huffman.py ~/Downloads/kharkivpy-debugging.key
         24868775 function calls in 14.059 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.008    0.008   14.059   14.059 huffman.py:1(<module>)
        1    0.001    0.001   14.051   14.051 huffman.py:33(build_tree)
        1    5.029    5.029   14.035   14.035 huffman.py:23(count_freq)
 12417038    3.863    0.000    9.006    0.000 huffman.py:25(<lambda>)
 12417038    5.143    0.000    5.143    0.000 {method 'read' of 'file' objects}
      255    0.009    0.000    0.014    0.000 {method 'sort' of 'list' objects}
    32895    0.005    0.000    0.005    0.000 huffman.py:36(<lambda>)
      511    0.001    0.000    0.001    0.000 huffman.py:7(__init__)
      510    0.000    0.000    0.000    0.000 {method 'pop' of 'list' objects}
        1    0.000    0.000    0.000    0.000 functools.py:53(total_ordering)
        1    0.000    0.000    0.000    0.000 {open}
      256    0.000    0.000    0.000    0.000 {len}
      255    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {dir}
        1    0.000    0.000    0.000    0.000 {method 'items' of 'dict' objects}
        3    0.000    0.000    0.000    0.000 {setattr}
        3    0.000    0.000    0.000    0.000 {getattr}
        1    0.000    0.000    0.000    0.000 {max}
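The profile attributes nearly all of the run time to count_freq and its 12,417,038 one-symbol read() calls. The slides do not show a follow-up, so the version below is only a sketch of one likely improvement: read the stream in larger chunks and let collections.Counter do the counting. The chunk_size parameter and its default value are assumptions.

```python
import collections
import io


def count_freq(stream, chunk_size=64 * 1024):
    """Count symbol frequencies, reading up to chunk_size symbols per call."""
    res = collections.Counter()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:          # empty str or bytes means end of stream
            break
        # Counts every symbol of the chunk in one call. Note that iterating
        # a bytes chunk on Python 3 yields ints rather than 1-byte strings.
        res.update(chunk)
    return res


print(dict(count_freq(io.StringIO("abracadabra"), chunk_size=4)))
```

Re-running the profiler after a change like this is the only reliable way to confirm that the hotspot has actually moved.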

Page 38: Debugging of (C)Python applications

cProfile: visualisation

Page 39: Debugging of (C)Python applications

cProfile: context matters

import cProfile as profiler
import gc, pstats, time


def profile(fn):
    def wrapper(*args, **kw):
        elapsed, stat_loader, result = _profile("out.prof", fn, *args, **kw)
        stats = stat_loader()
        stats.sort_stats('cumulative')
        stats.print_stats()
        return result
    return wrapper


def _profile(filename, fn, *args, **kw):
    load_stats = lambda: pstats.Stats(filename)
    gc.collect()

    began = time.time()
    profiler.runctx('result = fn(*args, **kw)', globals(), locals(),
                    filename=filename)
    ended = time.time()

    return ended - began, load_stats, locals()['result']

Page 40: Debugging of (C)Python applications

cProfile: context matters

from werkzeug.contrib.profiler import ProfilerMiddleware
app = ProfilerMiddleware(app)

Page 41: Debugging of (C)Python applications

cProfile: context matters

PATH: '/6e0f43cd74db46f5b95f2142fe0c9431/flavors/detail'
         2732 function calls (2602 primitive calls) in 1.294 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    1.287    1.287 /usr/lib/python2.7/dist-packages/nova/api/compute_req_id.py:38(__call__)
      2/1    0.008    0.004    1.287    1.287 /usr/lib/python2.7/dist-packages/webob/request.py:1300(send)
      2/1    0.000    0.000    1.287    1.287 /usr/lib/python2.7/dist-packages/webob/request.py:1262(call_application)
        1    0.000    0.000    1.287    1.287 /usr/lib/python2.7/dist-packages/nova/api/openstack/__init__.py:121(__call__)
        1    0.000    0.000    1.271    1.271 /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:686(__call__)
        1    0.000    0.000    1.270    1.270 /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:829(_validate_token)
        1    0.000    0.000    1.270    1.270 /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:1669(get)
        1    0.000    0.000    1.270    1.270 /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:1726(_cache_get)

Page 42: Debugging of (C)Python applications

cProfile: summary

• easy CPU profiling of Python code with low overhead

• text/binary representation of profiling results (the latter can be used for merging results and/or visualisation done by external tools)

• can’t attach to a running process

• can’t profile Python interpreter-level code (PyEval_EvalFrameEx, etc)
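The binary representation and the merging of results mentioned in the summary can be sketched with the standard library alone. The workload function work and the file names are hypothetical. Each run is dumped to its own file, then both are loaded into a single pstats.Stats object and printed as text.

```python
import cProfile
import io
import os
import pstats
import tempfile


def work(n):
    """Hypothetical workload to profile."""
    return sum(i * i for i in range(n))


def profile_to_file(path, n):
    profiler = cProfile.Profile()
    profiler.enable()
    work(n)
    profiler.disable()
    profiler.dump_stats(path)      # binary representation on disk


tmpdir = tempfile.mkdtemp()
first = os.path.join(tmpdir, "first.prof")
second = os.path.join(tmpdir, "second.prof")
profile_to_file(first, 1000)
profile_to_file(second, 2000)

out = io.StringIO()
stats = pstats.Stats(first, stream=out)
stats.add(second)                  # merge the second run into the first
stats.sort_stats("cumulative").print_stats(5)
report = out.getvalue()
print(report)
```

The same dump files are what external visualisation tools consume, so profiles can be collected on one machine and inspected elsewhere.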

Page 43: Debugging of (C)Python applications

objgraph

Page 44: Debugging of (C)Python applications

objgraph: basics

In [1]: import objgraph

In [2]: objgraph.show_most_common_types()
function                   4530
dict                       2483
tuple                      1428
wrapper_descriptor         1260
weakref                    981
list                       911
builtin_function_or_method 897
method_descriptor          705
getset_descriptor          531
type                       473

Page 45: Debugging of (C)Python applications

objgraph: basics

In [3]: objgraph.show_growth()
function                       4530     +4530
dict                           2412     +2412
tuple                          1353     +1353
wrapper_descriptor             1272     +1272
weakref                         985      +985
list                            904      +904
builtin_function_or_method      897      +897
method_descriptor               706      +706
getset_descriptor               535      +535
type                            473      +473

In [4]: objgraph.show_growth()
weakref      986        +1
list         905        +1
tuple       1354        +1

Page 46: Debugging of (C)Python applications

objgraph: graphs

>>> x = []
>>> y = [x, [x], {'x': x}]
>>> objgraph.show_refs([y], filename='sample-graph.png')
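The idea behind show_growth() can be approximated with the standard library: count the objects tracked by the garbage collector by type name, then compare two snapshots. This is a rough sketch of the idea, not objgraph's actual implementation. The helper names and the Leaky class are hypothetical.

```python
import collections
import gc


def type_counts():
    """Count live gc-tracked objects by type name."""
    return collections.Counter(type(o).__name__ for o in gc.get_objects())


def growth(before, after):
    """Return {type name: increase} for types whose count went up."""
    return {name: after[name] - before[name]
            for name in after if after[name] > before[name]}


class Leaky(object):   # hypothetical class used only for the demo
    pass


before = type_counts()
leaked = [Leaky() for _ in range(100)]   # objects kept alive on purpose
after = type_counts()
delta = growth(before, after)
print(delta.get("Leaky"))
```

Types whose counts keep growing between snapshots taken at the same point of a request loop are the usual suspects for a leak.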

Page 47: Debugging of (C)Python applications

strace

Page 48: Debugging of (C)Python applications

strace: tracing syscalls

rpodolyaka@rpodolyaka-pc:~$ strace -e network python sa.py

. . .

socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = 5
setsockopt(5, SOL_TCP, TCP_NODELAY, [1], 4) = 0
setsockopt(5, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
connect(5, {sa_family=AF_INET6, sin6_port=htons(5432), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EINPROGRESS (Operation now in progress)
getsockopt(5, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
getsockname(5, {sa_family=AF_INET6, sin6_port=htons(36894), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
sendto(5, "\0\0\0\10\4\322\26/", 8, MSG_NOSIGNAL, NULL, 0) = 8
recvfrom(5, "S", 16384, 0, NULL, NULL) = 1

. . .

Page 49: Debugging of (C)Python applications

strace: tracing syscalls

root@node-13:~# strace -p 1508 -s 4096 -tt

. . .

16:53:29.532770 epoll_wait(7, {}, 1023, 0) = 0
16:53:29.532832 epoll_wait(7, {}, 1023, 0) = 0
16:53:29.532892 epoll_wait(7, {}, 1023, 0) = 0
16:53:29.532953 epoll_wait(7, {}, 1023, 0) = 0
16:53:29.533022 epoll_wait(7, {{EPOLLIN, {u32=9, u64=39432335262744585}}}, 1023, 915) = 1
16:53:29.596409 epoll_ctl(7, EPOLL_CTL_DEL, 9, {EPOLLRDNORM|EPOLLWRBAND|EPOLLMSG|0x28c45820, {u32=32644, u64=22396489217113988}}) = 0
16:53:29.596494 accept(9, 0x7ffe1ef32b10, [16]) = -1 EAGAIN (Resource temporarily unavailable)
16:53:29.596638 epoll_ctl(7, EPOLL_CTL_ADD, 9, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=9, u64=39432335262744585}}) = 0
16:53:29.596747 epoll_wait(7, {{EPOLLIN, {u32=9, u64=39432335262744585}}}, 1023, 851) = 1
16:53:29.611852 epoll_ctl(7, EPOLL_CTL_DEL, 9, {EPOLLRDNORM|EPOLLWRBAND|EPOLLMSG|0x28c45820, {u32=32644, u64=22396489217113988}}) = 0
16:53:29.611937 accept(9, 0x7ffe1ef32b10, [16]) = -1 EAGAIN (Resource temporarily unavailable)

. . .

Page 50: Debugging of (C)Python applications

strace: summary

• allows tracing of an application's interactions with the `outside world`

• points out possible performance problems (like excessive system calls, polling for events with too small a timeout, etc)

• limited to tracing of system calls of one process and its forks

• use cautiously on production environments as it greatly affects performance

Page 51: Debugging of (C)Python applications

gdb

Page 52: Debugging of (C)Python applications

gdb: prerequisites

• Ubuntu/Debian:

• sudo apt-get install gdb python-dbg

• CentOS/RHEL/Fedora (separate debuginfo package repository):

• sudo yum install gdb python-debuginfo

Page 53: Debugging of (C)Python applications

gdb: basics

• python-dbg is a CPython binary built with the ‘--with-pydebug’ configure option and ‘-g’. It’s slow and verbose about memory management

• you can debug regular CPython processes in production using the debug symbols shipped separately

• gdb has Python bindings to write scripts for it

• CPython ships with a gdb script that analyses interpreter-level stack frames to produce app-level backtraces

Page 54: Debugging of (C)Python applications

gdb: `hanging` app

def allocate_ips(eng, host):
    while True:
        with eng.begin() as conn:
            row = conn.execute(
                ip_addresses.select() \
                    .where(ip_addresses.c.host.is_(None))
            ).fetchone()
            if row is None:
                break

            id, address = row.id, row.address
            updated_rows = conn.execute(
                ip_addresses.update() \
                    .values(host=host) \
                    .where(ip_addresses.c.id == id) \
                    .where(ip_addresses.c.host.is_(None))
            )
            if not updated_rows:
                continue


t = threading.Thread(target=allocate_ips, args=(eng, 'host1'))
t.start()
t.join()

Page 55: Debugging of (C)Python applications

gdb: `hanging` app

rpodolyaka@rpodolyaka-pc:~$ strace -p 20267
Process 20267 attached
futex(0x7fea50000c10, FUTEX_WAIT_PRIVATE, 0, NULL

rpodolyaka@rpodolyaka-pc:~$ gdb /usr/bin/python3.4 -p 20216

(gdb) t a a frame

Thread 2 (Thread 0x7f7702c83700 (LWP 20353)):
#0  sem_timedwait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:101
101     ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S: No such file or directory.

Thread 1 (Thread 0x7f770a03b700 (LWP 20350)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
85      ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S: No such file or directory.

Page 56: Debugging of (C)Python applications

gdb: `hanging` app

(gdb) t a 2 py-bt
Thread 2 (Thread 0x7f7702c83700 (LWP 20353)):
Traceback (most recent call first):
  File "/usr/lib/python3.4/threading.py", line 294, in wait
    gotit = waiter.acquire(True, timeout)
  File "/home/rpodolyaka/venv3/lib/python3.4/site-packages/sqlalchemy/util/queue.py", line 157, in get
    self.not_empty.wait(remaining)
  File "/home/rpodolyaka/venv3/lib/python3.4/site-packages/sqlalchemy/pool.py", line 1039, in _do_get
    return self._pool.get(wait, self._timeout)
  File "/home/rpodolyaka/venv3/lib/python3.4/site-packages/sqlalchemy/engine/base.py", line 2037, in contextual_connect
    self._wrap_pool_connect(self.pool.connect, None),
  File "/home/rpodolyaka/venv3/lib/python3.4/site-packages/sqlalchemy/engine/base.py", line 1906, in begin
    conn = self.contextual_connect(close_with_result=close_with_result)
  File "sa.py", line 31, in allocate_ips
    with eng.begin() as conn:
  File "/usr/lib/python3.4/threading.py", line 868, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.4/threading.py", line 920, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.4/threading.py", line 888, in _bootstrap
    self._bootstrap_inner()

Page 57: Debugging of (C)Python applications

gdb: virtualenv pitfalls

rpodolyaka@rpodolyaka-pc:~$ gdb -p 20656 # WARN: executable not passed!

(gdb) py-bt
Undefined command: "py-bt".  Try "help".
(gdb) bt
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000004cdff5 in PyThread_acquire_lock_timed ()
#2  0x0000000000522039 in ?? ()
#3  0x00000000004ee01a in PyEval_EvalFrameEx ()
#4  0x00000000004ec9fc in PyEval_EvalCodeEx ()
#5  0x00000000004f25a9 in PyEval_EvalFrameEx ()
#6  0x00000000004ec9fc in PyEval_EvalCodeEx ()
#7  0x00000000004f25a9 in PyEval_EvalFrameEx ()
#8  0x00000000004ec9fc in PyEval_EvalCodeEx ()
#9  0x0000000000581115 in ?? ()
#10 0x00000000005ab019 in PyRun_FileExFlags ()
#11 0x00000000005aa194 in PyRun_SimpleFileExFlags ()
#12 0x00000000004cb4cb in Py_Main ()
#13 0x00000000004ca8ef in main ()

Page 58: Debugging of (C)Python applications

gdb: summary

• supports debugging of multithreaded applications

• can attach to a running process at any moment

• can be used for analysing core dumps (e.g. if we don’t want to stop a process, or if it died unexpectedly)

• can be used for debugging C extensions, CFFI calls, etc

• success depends on how CPython was built and whether you have installed debug symbols or not

• used by pyringe to provide pdb-like experience (https://github.com/google/pyringe)

Page 59: Debugging of (C)Python applications

htop

Page 60: Debugging of (C)Python applications

htop

Page 61: Debugging of (C)Python applications

lsof

Page 62: Debugging of (C)Python applications

lsof: lsof -p $PID

nova-api 5910 nova  mem    REG              252,0   141574   3586 /lib/x86_64-linux-gnu/libpthread-2.19.so
nova-api 5910 nova  mem    REG              252,0   149120   3582 /lib/x86_64-linux-gnu/ld-2.19.so
nova-api 5910 nova  mem    REG              252,0    26258  52555 /usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
nova-api 5910 nova    0u   CHR                1,3      0t0   1029 /dev/null
nova-api 5910 nova    1u   CHR             136,13      0t0     16 /dev/pts/13
nova-api 5910 nova    2u   CHR             136,13      0t0     16 /dev/pts/13
nova-api 5910 nova    3w   REG              252,0 34967268 135756 /var/log/nova/nova-api.log
nova-api 5910 nova    4u  unix 0xffff880850b92a00      0t0 260406 socket
nova-api 5910 nova    5r  FIFO                0,8      0t0 260407 pipe
nova-api 5910 nova    6w  FIFO                0,8      0t0 260407 pipe
nova-api 5910 nova    7u  IPv4             260408      0t0    TCP node-13.domain.tld:8773 (LISTEN)
nova-api 5910 nova    8r   CHR                1,9      0t0   1034 /dev/urandom
nova-api 5910 nova    9u  IPv4             260409      0t0    TCP node-13.domain.tld:8774 (LISTEN)
nova-api 5910 nova   10u  IPv4             260420      0t0    TCP *:8775 (LISTEN)
nova-api 5910 nova   15u  0000                0,9        0   7380 anon_inode

Page 63: Debugging of (C)Python applications

netstat

Page 64: Debugging of (C)Python applications

netstat: netstat -nlap

tcp        8      0 192.168.0.16:52819   192.168.0.11:5673   ESTABLISHED 5975/python
tcp        0      0 192.168.0.16:36901   192.168.0.11:5673   ESTABLISHED 1513/python
tcp        0      0 0.0.0.0:22           0.0.0.0:*           LISTEN      3042/sshd
tcp        0      0 0.0.0.0:4567         0.0.0.0:*           LISTEN      13888/mysqld
tcp        0      0 0.0.0.0:25           0.0.0.0:*           LISTEN      7433/master
tcp        0      0 0.0.0.0:3260         0.0.0.0:*           LISTEN      19704/tgtd
tcp        0      0 192.168.0.16:35357   0.0.0.0:*           LISTEN      5546/python

Page 65: Debugging of (C)Python applications

perf_events

Page 66: Debugging of (C)Python applications

perf_events: perf top

Page 67: Debugging of (C)Python applications

perf_events: perf trace

254.663 ( 0.001 ms): sshd/22802 clock_gettime(which_clock: 7, tp: 0x7ffd0e807970 ) = 0
254.666 ( 0.003 ms): sshd/22802 read(fd: 14</dev/ptmx>, buf: 0x7ffd0e8038b0, count: 16384 ) = 4095
254.672 ( 0.243 ms): chrome/11973 epoll_wait(epfd: 16, events: 0x6a6a1b73480, maxevents: 32, timeout: 4294967295) = 1
254.678 ( 0.003 ms): chrome/11973 read(fd: 24<socket:[147806]>, buf: 0x6a6a2d5b018, count: 4096 ) = 32
254.685 ( 0.003 ms): chrome/11973 write(fd: 11<pipe:[147797]>, buf: 0x7f940dfa55e7, count: 1 ) = 1
254.688 ( 0.001 ms): chrome/11973 read(fd: 24<socket:[147806]>, buf: 0x6a6a2d5b018, count: 4096 ) = -1 EAGAIN Resource temporarily unavailable
254.691 ( 0.001 ms): chrome/11973 epoll_wait(epfd: 16, events: 0x6a6a1b73480, maxevents: 32 ) = 0
254.693 ( 0.001 ms): chrome/11973 epoll_wait(epfd: 16, events: 0x6a6a1b73480, maxevents: 32 ) = 0

Page 68: Debugging of (C)Python applications

perf_events: perf stat

Performance counter stats for 'python sa.py':

       125.242831      task-clock (msec)         #    0.004 CPUs utilized
              945      context-switches          #    0.008 M/sec
               14      cpu-migrations            #    0.112 K/sec
            6,996      page-faults               #    0.056 M/sec
      408,133,256      cycles                    #    3.259 GHz
      213,117,410      stalled-cycles-frontend   #   52.22% frontend cycles idle
  <not supported>      stalled-cycles-backend
      432,245,331      instructions              #    1.06  insns per cycle
                                                 #    0.49  stalled cycles per insn
       91,417,607      branches                  #  729.923 M/sec
        3,937,108      branch-misses             #    4.31% of all branches

30.130596204 seconds time elapsed

Page 69: Debugging of (C)Python applications

Questions?

slides: http://bit.ly/1LpjXGL
twitter: @amd4ever