james-golick
DESCRIPTION
Does your code work? Probably not. The libraries you're using probably don't work either. If you're lucky, the OS does, but even then you'll probably find something wrong if you look hard enough. Debugging is the reason that the last 20% of shipping a product usually accounts for 80% of the time. And yet, there are a million blog posts and talks about writing code, but very few about figuring out why it doesn't work right once you have. So, how do you find bugs? In this talk I'll explore a set of tools and techniques that have helped me diagnose defects in everything from PHP code to malloc implementations. One time I even used this strategy to diagnose an outage in a codebase I'd never seen, written in a language I barely knew and a framework I'd never heard of, in less than 5 minutes. You'll walk away from this talk with everything you need to learn how to debug anything.
Video: https://www.youtube.com/watch?v=VV7b7fs4VI8
How to Debug Anything
@jamesgolick
well ok, not anything, but most stuff on unixy operating systems that have the tooling i’m going to talk about today
Everything is Terrible
Everything is Broken
“Correct Code”™
“If you want to deploy high quality software that performs, you should expect to fix bugs at every level.” - me
“I don’t understand how this is possible.” - every programmer ever
0. php
a blind debugging session
the website is down
what we have to work with
• The source code. (nope)
• Knowledge of the system. (nope)
• Familiarity with the programming language. (nope)
• SSH Access. (yup)
logging in the real world
(often useless)
#cool
find a pid
sudo strace -ff -s 2048 -p 22935
how to read strace output
write(1, "hi\n", 3) = 3
function name: write; arguments: 1, "hi\n", 3; return value: 3
man 2 write
learn more about your favorite system calls
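the same three-part shape (name, arguments, return value) holds for every line strace prints, so a saved trace can be cut down mechanically. a minimal sketch, assuming one complete syscall per line with no pid or timestamp prefix; `strace_fields` is a hypothetical helper name, not part of strace:

```sh
#!/bin/sh
# strace_fields (hypothetical helper): read strace lines on stdin and print
# "syscall name<TAB>return value" for each one.
# Assumes one complete syscall per line, with no pid or timestamp prefix.
strace_fields() {
  awk '{
    paren = index($0, "(")        # the name is everything before the first "("
    rest = $0
    found = 0
    # the return value is whatever follows the LAST " = " on the line
    while ((i = index(rest, " = ")) > 0) {
      rest = substr(rest, i + 3)
      found = 1
    }
    if (paren > 1 && found) printf "%s\t%s\n", substr($0, 1, paren - 1), rest
  }'
}

# the annotated example from the slide, plus a failing call
printf '%s\n' 'write(1, "hi\n", 3) = 3' | strace_fields
printf '%s\n' 'open("/var/www/db.in.php", O_RDONLY) = -1 ENOENT (No such file or directory)' | strace_fields
```

dropping the arguments like this makes failures (anything returning -1) easy to spot, but the arguments are usually where the answer is, so go back to the full line once something looks wrong.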
{sa_family=AF_INET, sin_port=htons(50318), sin_addr=inet_addr("192.168.212.2")}, [16]) = 12
fcntl(12, F_GETFD) = 0
fcntl(12, F_SETFD, FD_CLOEXEC) = 0
getsockname(12, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("192.168.212.182")}, [16]) = 0
fcntl(12, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(12, F_SETFL, O_RDWR|O_NONBLOCK) = 0
read(12, "GET / HTTP/1.1\r\nHost: localhost:8181\r\nConnection: keep-alive\r\nCache-Control: max-age=0\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\r\nUser-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36\r\nAccept-Encoding: gzip,deflate,sdch\r\nAccept-Language: en-US,en;q=0.8\r\nCookie: _jsuid=3899596137; _packages_session=d2NrM3RZMUJXRE8zcFB0aXNGVk83Ny9lRDR3Y09uSVNoRUcrREV0VnF2UjFxRjk1NjAyUzZ5ZG81M1JyczRzUU10ZTBqMXI5QkJXZzFqZnM1RUNmdEdGYmN2eG92SUsvU24wOWhJSlhNZzYrQXdYN2tMYnRZaEhWN3ArbEpiZVpMUWNjWHNRWHc2VjkwQzZ2S0Y4aGlLeks3MmhoTXBXN2NRWUEwbGFFekpENHdveCtTNXl1MllDUTFzUzZMSU5WZlRqUlQ1aXB2bWVsZDVGVFE1Tlp0UT09LS1vYWdoMk9mZHUvS3U5OWpoME1ZY3pBPT0%3D--8e4ac5c1aebe1e9226063c3d2b83b4176535377a\r\n\r\n", 8000) = 792
stat("/var/www/", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
stat("/var/www/index.php", {st_mode=S_IFREG|0664, st_size=447, ...}) = 0
setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={60, 0}}, NULL) = 0
rt_sigaction(SIGPROF, {0x7f8a898930c0, [PROF], SA_RESTORER|SA_RESTART, 0x7f8a8c6d14a0}, {0x7f8a898930c0, [PROF], SA_RESTORER|SA_RESTART, 0x7f8a8c6d14a0}, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [PROF], NULL, 8) = 0
umask(077) = 022
umask(022) = 077
getcwd("/", 4095) = 2
chdir("/var/www") = 0
setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={30, 0}}, NULL) = 0
lstat("/var/www/index.php", {st_mode=S_IFREG|0664, st_size=447, ...}) = 0
open("/var/www/index.php", O_RDONLY) = 13
fstat(13, {st_mode=S_IFREG|0664, st_size=447, ...}) = 0
fstat(13, {st_mode=S_IFREG|0664, st_size=447, ...}) = 0
fstat(13, {st_mode=S_IFREG|0664, st_size=447, ...}) = 0
fstat(13, {st_mode=S_IFREG|0664, st_size=447, ...}) = 0
mmap(NULL, 447, PROT_READ, MAP_SHARED, 13, 0) = 0x7f8a8d34f000
munmap(0x7f8a8d34f000, 447) = 0
close(13) = 0
getcwd("/var/www", 4096) = 9
lstat("/var/www/./oh-fuck.php", 0x7fff11342620) = -1 ENOENT (No such file or directory)
lstat("/usr/share/php/oh-fuck.php", 0x7fff11342620) = -1 ENOENT (No such file or directory)
lstat("/usr/share/pear/oh-fuck.php", 0x7fff11342620) = -1 ENOENT (No such file or directory)
lstat("/var/www/oh-fuck.php", 0x7fff11342620) = -1 ENOENT (No such file or directory)
getcwd("/var/www", 4096) = 9
lstat("/var/www/./oh-fuck.php", 0x7fff11342580) = -1 ENOENT (No such file or directory)
lstat("/usr/share/php/oh-fuck.php", 0x7fff11342580) = -1 ENOENT (No such file or directory)
lstat("/usr/share/pear/oh-fuck.php", 0x7fff11342580) = -1 ENOENT (No such file or directory)
lstat("/var/www/oh-fuck.php", 0x7fff11342580) = -1 ENOENT (No such file or directory)
getcwd("/var/www", 4096) = 9
lstat("/var/www/oh-fuck.php", 0x7fff113446e0) = -1 ENOENT (No such file or directory)
open("/var/www/oh-fuck.php", O_RDONLY) = -1 ENOENT (No such file or directory)
chdir("/") = 0
umask(022) = 022
open("/dev/urandom", O_RDONLY) = 13
read(13, "\33\260\300\377gK\222d", 8) = 8
close(13) = 0
open("/dev/urandom", O_RDONLY) = 13
read(13, "4\274\17x\35\16\336\260", 8) = 8
close(13) = 0
open("/dev/urandom", O_RDONLY) = 13
read(13, "&M\330\225-P\340\345", 8) = 8
close(13) = 0
setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8a8d34e000
writev(12, [{"HTTP/1.0 500 Internal Server Error\r\nDate: Sat, 14 Jun 2014 17:40:32 GMT\r\nServer: Apache/2.2.22 (Ubuntu)\r\nX-Powered-By: PHP/5.3.10-1ubuntu3.11\r\nVary: Accept-Encoding\r\nContent-Encoding: gzip\r\nContent-Length: 20\r\nConnection: close\r\nContent-Type: text/html\r\n\r\n", 256}, {"\37\213\10\0\0\0\0\0\0\3", 10}, {"\3\0", 2}, {"\0\0\0\0\0\0\0\0", 8}], 4) = 276
write(7, "192.168.212.2 - - [14/Jun/2014:17:40:32 +0000] \"GET / HTTP/1.1\" 500 276 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36\"\n", 199) = 199
times({tms_utime=6, tms_stime=56, tms_cutime=0, tms_cstime=0}) = 1718165756
shutdown(12, 1 /* send */) = 0
poll([{fd=12, events=POLLIN}], 1, 2000) = 1 ([{fd=12, revents=POLLIN|POLLHUP}])
read(12, "", 512) = 0
close(12) = 0
read(4, 0x7fff113490cf, 1) = -1 EAGAIN (Resource temporarily unavailable)
ETOOMUCHOUTPUT
writev(12, [{"HTTP/1.0 500 Internal Server Err"..., 256}, {"\37\213\10\0\0\0\0\0\0\3", 10}, {"\3\0", 2}, {"\0\0\0\0\0\0\0\0", 8}], 4) = 276
find failure
always work backwards
open("/var/www/db.in.php", O_RDONLY) = -1 ENOENT (No such file or directory)
find the cause
hopefully.
open("/var/www/index.php", O_RDONLY) = 13
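working backwards can be done by eye, but on a long trace it helps to let awk do the scrolling. a rough sketch, assuming the trace was saved to a file with one syscall per line and no pid prefix; both function names are hypothetical, and the sample trace is a hand-condensed stand-in modeled on the lines above, not real output:

```sh
#!/bin/sh
# last_error_before PATTERN: print the last failed syscall (return value -1)
# seen before the first line containing PATTERN. EAGAIN is skipped because it
# is routine on non-blocking descriptors.
last_error_before() {
  PAT="$1" awk '
    index($0, ENVIRON["PAT"]) > 0 { if (last != "") print last; exit }
    / = -1 E/ && !/EAGAIN/        { last = $0 }
  '
}

# last_open_before PATTERN: print the last successful open() seen before the
# first line containing PATTERN, i.e. the file most likely asking for it.
last_open_before() {
  PAT="$1" awk '
    index($0, ENVIRON["PAT"]) > 0 { if (last != "") print last; exit }
    /^open\(/ && !/ = -1 /        { last = $0 }
  '
}

# hand-condensed sample trace (an assumption for illustration)
sample_trace='open("/var/www/index.php", O_RDONLY) = 13
close(13) = 0
lstat("/var/www/db.in.php", 0x7fff11342620) = -1 ENOENT (No such file or directory)
open("/var/www/db.in.php", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/dev/urandom", O_RDONLY) = 13
writev(12, [{"HTTP/1.0 500 Internal Server Err"..., 256}], 4) = 276'

# step 1: start at the visible failure (the 500) and find the last error before it
printf '%s\n' "$sample_trace" | last_error_before 'HTTP/1.0 500'
# step 2: find the last file opened successfully before the missing file shows up
printf '%s\n' "$sample_trace" | last_open_before 'db.in.php'
```

the second result is only a suspect, not a verdict: the last file opened is a good place to look for the include, which is why the next step is proving it.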
find the offender
prove your hypothesis
find the offender
fix the bug!
#cool
“I don’t understand how this is possible.” - every programmer ever
0. Forget everything you think you know.
1. Get a third party opinion.
third parties
i have known and loved
source: http://www.brendangregg.com/linuxperf.html
1. apt
building a cloud for packages is hard
sudo apt-get update
sudo strace -ff apt-get update
write(1, "Ign http://192.168.212.136:3000 trusty Release\n", 62) = 62
find failure
always work backwards
read(6, "400 URI Failure\nURI: https://packagecloud-repositories-dev2.s3.amazonaws.com/1/1/ubuntu/dists/trusty/Release?AWSAccessKeyId=AKIAILW54TIPGLUGWOYA&Signature=s/c0fzVQhxpBPbpyTIzCxAfo/8g=&Expires=1402837136\nMessage: Bad header line \n\n", 64000) = 230
find the cause
confirm your hypothesis
#cool
apt-get source apt
read(6, "400 URI Failure\nURI: https://packagecloud-repositories-dev2.s3.amazonaws.com/1/1/ubuntu/dists/trusty/Release?AWSAccessKeyId=AKIAILW54TIPGLUGWOYA&Signature=s/c0fzVQhxpBPbpyTIzCxAfo/8g=&Expires=1402837136\nMessage: Bad header line \n\n", 64000) = 230
locate a hook
stare at the code
confirm your hypothesis
> Content-Type:
> Content-Type: text/plain
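the two lines above differ only in whether the header carries a value. assuming they were captured in a `curl -v` style dump (where `>` and `<` prefix header lines), a small filter can list every header that was sent or received with nothing after the colon. `empty_headers` is a hypothetical helper, and this is a sketch for spotting candidates, not a statement of how apt parses headers:

```sh
#!/bin/sh
# empty_headers (hypothetical helper): read "> Name: value" / "< Name: value"
# lines on stdin and print the ones whose value is empty.
# A trailing carriage return, if present, is stripped before printing.
empty_headers() {
  awk '/^[<>] [A-Za-z0-9-]+:[ \t]*\r?$/ { sub(/\r$/, ""); print }'
}

# the two header lines from the slide
printf '%s\n' '> Content-Type:' '> Content-Type: text/plain' | empty_headers
```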
#cool
2. Locate the correct source code.
3. Identify a hard-coded string to grep for.
4. Stare at the code until it makes sense.
5. Fix whatever is broken.
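steps 2 and 3 can be sketched in a few lines of shell. in the apt example the hard-coded string was the `Bad header line` message from the trace, and the source came from `apt-get source apt`. the function name, the temporary directory and `example.cc` below are hypothetical stand-ins so the sketch runs anywhere; point it at the real source tree instead. it relies on `grep -r`, which GNU and BSD grep both provide:

```sh
#!/bin/sh
# find_string_in_source STRING DIR (hypothetical helper): list every
# file:line under DIR that contains STRING.
# -F treats STRING as literal text, so punctuation in an error message
# is not interpreted as a regular expression.
find_string_in_source() {
  grep -rnF -- "$1" "$2"
}

# stand-in source tree; with the real thing this would be the directory
# created by `apt-get source apt`
src=$(mktemp -d)
printf '%s\n' '// hypothetical stand-in for a real source file' \
  'return Error("Bad header line");' > "$src/example.cc"

find_string_in_source 'Bad header line' "$src"

rm -rf "$src"
```

pick the most specific string you can find in the output. a short or generic message will match half the tree, which puts you back at step 4 with more code to stare at.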
How to Debug Anything
0. Forget everything you think you know.
1. Get a third party opinion.
2. Locate the correct source code.
3. Identify a hard-coded string to grep for.
4. Stare at the code until it makes sense.
5. Fix whatever is broken.