At Lisp Meetup #22
Writing a fast HTTP parser
Lisp Meetup #22 Eitaro Fukamachi
Thank you for coming.
I’m Eitaro Fukamachi @nitro_idiot fukamachi
(and 'web-application-developer 'common-lisper)
We’re hiring! Tell @Rudolph_Miller.
fast-http
• HTTP request/response parser
• Written in portable Common Lisp
• Fast
• Chunked body parser
Let me tell you why I had to write a fast HTTP parser.
Wookie is slower than Node.js
• Wookie is twice as slow as Node.js
• The profiling result showed that “WOOKIE:READ-DATA” was pretty slow.
• It was only calling “http-parse”.
• “http-parse” is the HTTP parser Wookie uses.
The bottleneck was HTTP parsing.
Wookie is slower than Node.js
• Node.js’s HTTP parser is “http-parser”.
• Written in C.
• A generalized version of Nginx’s HTTP parser.
• Is it possible to beat it with Common Lisp?
Today, I’m talking about what I did to write a fast Common Lisp program.
5 important things
• Architecture
• Reducing memory allocation
• Choosing the right data types
• Benchmark & Profile
• Type declarations
A brief introduction to HTTP
An HTTP request looks like…
GET /media HTTP/1.1↵
Host: somewrite.jp↵
Connection: keep-alive↵
Accept: */*↵
↵
• First line: GET /media HTTP/1.1
• Headers: Host, Connection, Accept
• Body: empty, in this case
• ↵ is CR + LF
• CRLF * 2 at the end of headers
An HTTP response looks like…
HTTP/1.1 200 OK↵
Cache-Control: max-age=0↵
Content-Type: text/html↵
Date: Wed, 26 Nov 2014 04:52:55 GMT↵
↵
<html> …
• Status line: HTTP/1.1 200 OK
• Headers: Cache-Control, Content-Type, Date
• Body: <html> …
HTTP is…
• Text-based protocol. (not binary)
• Lines terminated with CRLF
• Very lenient.
  • Ignores multiple spaces
  • Allows header values to continue across multiple lines
And, there’s another difficulty.
HTTP messages are sent over a network.
Which means we need to think about long & incomplete HTTP messages.
There are 2 ways to solve this problem.
1. Stateful (http-parser)
http-parser (used in Node.js)
• https://github.com/joyent/http-parser
• Written in C
• Ported from Nginx’s HTTP parser
• Written to be Node.js’s HTTP parser
• Stateful
http-parser (used in Node.js)
for (p=data; p != data + len; p++) {
  …
  switch (parser->state) {
    case s_dead: …
    case s_start_req_or_res: …
    case s_res_or_resp_H: …
  }
}
• Process char by char
• Do something for each state
2. Stateless (PicoHTTPParser)
PicoHTTPParser (used in H2O)
• https://github.com/h2o/picohttpparser
• Written in C
• Stateless
• Reparses when the data is incomplete
• Most HTTP requests are small
And fast-http is…
fast-http is in the middle
• Doesn’t track state for every character
• Sets state for every line
• It makes the program simple
• And easy to optimize
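A minimal sketch of the "state per line" idea, to make the middle ground concrete. It is not fast-http's actual code: LINE-PARSER, FIND-CRLF and FEED are hypothetical names, and the sketch only counts header lines instead of parsing them. The state changes once per complete line, and an incomplete line is simply left for the next call.

```lisp
;; Sketch only: LINE-PARSER, FIND-CRLF and FEED are hypothetical names,
;; not fast-http's API.
(defstruct line-parser
  ;; :first-line, :headers or :body. Updated once per line,
  ;; not once per character.
  (state :first-line :type keyword)
  (header-count 0 :type fixnum))

(defun find-crlf (data start end)
  "Return the index of the first CR LF pair in DATA[START, END), or NIL."
  (loop for i from start below (1- end)
        when (and (= (aref data i) 13)
                  (= (aref data (1+ i)) 10))
          return i))

(defun feed (parser data &key (start 0) (end (length data)))
  "Consume every complete line of the message head in DATA[START, END).
Return the index of the first byte that was not consumed."
  (loop
    ;; Once the headers are done, the rest belongs to the body.
    (when (eq (line-parser-state parser) :body)
      (return start))
    (let ((eol (find-crlf data start end)))
      ;; Incomplete line: stop and let the caller retry from START
      ;; when more data has arrived.
      (unless eol
        (return start))
      (ecase (line-parser-state parser)
        (:first-line
         (setf (line-parser-state parser) :headers))
        (:headers
         (if (= eol start)              ; an empty line ends the headers
             (setf (line-parser-state parser) :body)
             (incf (line-parser-header-count parser)))))
      ;; Skip past the CR LF to the start of the next line.
      (setf start (+ eol 2)))))
```

When a chunk ends in the middle of a line, FEED returns the index where that line starts. The caller keeps the bytes from that index on and calls FEED again once more data arrives, so only one line is ever reparsed.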
Memory allocation is slow
• (in general)
• Make sure not to allocate memory during processing
• cons, make-instance, make-array…
• subseq, append, copy-seq
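As an illustration of avoiding subseq during processing, here is a small sketch that compares a header name inside the input buffer by index instead of copying it out first. Both function names are hypothetical and neither is taken from fast-http.

```lisp
;; Hypothetical helpers, for illustration only.

(defun header-name-equal-allocating-p (data start end name)
  "Allocating version: SUBSEQ copies the bytes, MAP builds a fresh string."
  (string-equal (map 'string #'code-char (subseq data start end))
                name))

(defun header-name-equal-p (data start end name)
  "Compare DATA[START, END) with the ASCII string NAME in place,
ignoring case. No intermediate sequence is created."
  (and (= (- end start) (length name))
       (loop for i from start below end
             for ch across name
             always (char-equal (code-char (aref data i)) ch))))
```

Both functions answer the same question. The second one only walks two indices over data that already exists, which is the pattern to aim for in a hot parsing loop.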
Data types
• The wrong data type makes your program slow.
• List or Vector
• Hash Table or Structure or Class
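To make the two choices above concrete, a short sketch with hypothetical names (not fast-http's data structures). A structure with typed slots typically gives the compiler fixed-offset slot access, while a hash table has to hash a key on every lookup. For raw bytes, a vector specialized to (unsigned-byte 8) gives indexed access, while a list has to be walked.

```lisp
;; Hypothetical names, for illustration only.

;; Structure: a fixed set of typed slots.
(defstruct parsed-request
  (method-name :get :type keyword)
  (major-version 1 :type fixnum)
  (minor-version 1 :type fixnum))

;; The same data in a hash table: every access hashes a key at run time.
(defun make-request-table ()
  (let ((table (make-hash-table :test 'eq)))
    (setf (gethash :method-name table) :get
          (gethash :major-version table) 1
          (gethash :minor-version table) 1)
    table))

;; Vector: a specialized byte vector for the raw input instead of a list.
(defun make-octets (&rest bytes)
  (make-array (length bytes)
              :element-type '(unsigned-byte 8)
              :initial-contents bytes))
```

Which one is faster for a given program is still something to measure, not to guess.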
Benchmarking is quite important
• “Don’t guess, measure!”
• Check if your changes improve the performance.
• Benchmarking also keeps you motivated.
Profiling
• SBCL has a built-in profiler
• (sb-profile:profile "FAST-HTTP" …)
• (sb-profile:report)
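A minimal, portable way to measure before and after a change is sketched below. RUN-TIME-SECONDS is a hypothetical helper, not part of fast-http or SBCL. The comment at the end shows how the SBCL profiler calls from the slide could be wrapped around the same workload.

```lisp
;; Hypothetical helper, for illustration only.
(defun run-time-seconds (thunk &key (iterations 100000))
  "Call THUNK ITERATIONS times and return the elapsed run time in seconds."
  (let ((start (get-internal-run-time)))
    (dotimes (i iterations)
      (funcall thunk))
    (/ (- (get-internal-run-time) start)
       (float internal-time-units-per-second 1d0))))

;; On SBCL, the deterministic profiler can be wrapped around the same call.
;; A string argument names a package whose functions get profiled:
;;
;;   (sb-profile:profile "FAST-HTTP")
;;   (run-time-seconds (lambda () ...))   ; the workload to inspect
;;   (sb-profile:report)
;;   (sb-profile:unprofile "FAST-HTTP")
```

Run the same thunk before and after each change and compare the two numbers.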
Type declarations
• Common Lisp has (optional) type declarations
• (declare (type <type> <variable symbol>))
• It’s a hint for your Lisp compiler
• (declare (optimize (speed 3) (safety 0)))
• It’s your wish to your Lisp compiler
See also: Cより高速なCommon Lispコードを書く (Writing Common Lisp code that is faster than C)
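A small sketch of both kinds of declaration in one function. DIGITS-TO-INTEGER is a hypothetical name, not fast-http's API. The sketch deliberately uses (safety 1) instead of the (safety 0) shown above, so that on SBCL a wrong argument type or an overflow past fixnum still signals an error.

```lisp
;; Hypothetical function, for illustration only.
(defun digits-to-integer (data start end)
  "Parse the ASCII decimal digits in DATA[START, END) into a fixnum."
  (declare (type (simple-array (unsigned-byte 8) (*)) data)
           (type fixnum start end)
           (optimize (speed 3) (safety 1)))
  (let ((result 0))
    (declare (type fixnum result))
    (loop for i of-type fixnum from start below end
          ;; 48 is the character code of #\0.
          do (setf result (+ (* result 10)
                             (- (aref data i) 48))))
    result))
```

With the element type and index types declared, the compiler can use specialized array access and fixnum arithmetic instead of generic operations.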
(safety 0)
• (safety 0) means “don’t check types & array indices at run time”.
• Fast & unsafe (like C)
• Is fixnum enough?
• What do you do when someone passes a bignum to the function?
(safety 0)
• fast-http has 2 layers
  • Low-level API
    • (speed 3) (safety 0)
  • High-level API (safer)
    • Check the variable type
    • (speed 3) (safety 2)
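The two-layer idea could be sketched as follows. The names %FIND-BYTE, FIND-BYTE and the OCTETS type are hypothetical and are not fast-http's API. The low-level function trusts its arguments and is compiled with (safety 0). The high-level function checks everything first and is the only entry point meant for outside callers.

```lisp
;; Hypothetical names, for illustration only.
(deftype octets () '(simple-array (unsigned-byte 8) (*)))

(defun %find-byte (byte data start end)
  "Low-level layer: no checks at all, so the caller must pass valid arguments."
  (declare (type (unsigned-byte 8) byte)
           (type octets data)
           (type fixnum start end)
           (optimize (speed 3) (safety 0)))
  (loop for i of-type fixnum from start below end
        when (= (aref data i) byte)
          return i))

(defun find-byte (byte data &key (start 0) (end (length data)))
  "High-level layer: validate the arguments, then call the low-level layer."
  (declare (optimize (speed 3) (safety 2)))
  (check-type byte (unsigned-byte 8))
  (check-type data octets)
  (check-type start fixnum)
  (check-type end fixnum)
  (assert (<= 0 start end (length data)) (start end)
          "Invalid range [~D, ~D) for a vector of length ~D."
          start end (length data))
  (%find-byte byte data start end))
```

The checks run once per call at the boundary, so the inner loop stays free of them.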
Attitude
• Write carefully.
• It’s possible to beat a C program
• (if the program is complicated enough)
• Don’t give up easily
• Safety is more important than speed
Thanks.
EITARO FUKAMACHI 8arrow.org @nitro_idiot fukamachi