"Http protocol and other stuff" by Bipin Upadhyay

A holistic view of how the web works, with an overview of the HTTP protocol. Presented by me at the null security group (http://null.co.in) Mumbai chapter meet on Aug 27th.

…and other stuff

that makes the web work

Bits ‘bout Moi!

Senor Bipin Upadhyay

Developer, Directi Pvt. Ltd.

Lead, NULL Open Security Group – Mumbai Chapter

OWASP ESAPI-PHP Committer

Part of IHP (Honeynet Project)

Amateur Photographer

I know Kung-fu…

If Only it was true…

Think about the possibilities…

I know Kung-fu

Me too..

Me three..

Sigh! But it ain’t true, yet!

Agenda

http://icanhascheezburger.files.wordpress.com/2009/02/funny-pictures-cat-has-naps-on-his-agenda.jpg

Agenda

Intro: What & Why???

OSI model: Back to the basics

10,000-foot view: How the web works

RFC 2616: Anatomy

RFC 2965: Handling Statelessness

Bit of History

Mar’89 – Tim Berners-Lee presents “Information Management: A Proposal”

Aug’91 – Announces WWW

Mar’93 – Mosaic announced

Apr’94 – Netscape founded

Oct’94 – W3C founded by Tim Berners-Lee

Web 2.0, uh!

http://www.wagnerblog.com/images/AjaxDarkSide.jpg

HTTP: What is it?

Part of the Application Layer of the TCP/IP protocol suite

A set of grammatical rules for a client and server to communicate

http://www.flickr.com/photos/joshfassbind/4584323789/

HTTP is what powers the WWW

…but

http://www.flickr.com/photos/quinnanya/4456123452/

Why should I bother?

Because:

web development sucks

http://www.flickr.com/photos/sneeu/1589152071/

Even your grandmom knows, ‘tis all about fundamentals

Why should I bother?

Also:

facilitates debugging,

improves understanding of security & performance

http://www.flickr.com/photos/stephenpoff/2312981944/

OSI & TCP/IP protocol suite

OSI is a reference model

http://blog.uad.ac.id/imam_riadi/files/2009/01/osi-layer.jpg

OSI & TCP/IP protocol suite…

The TCP/IP protocol suite is what is actually deployed; its layers map only loosely onto OSI’s

http://www.hill2dot0.com/wiki/index.php?title=Image:G0209_TCPIP_vs_OSI.jpg
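
With the comparison diagram gone, the usual textbook alignment of the two models can be sketched as a small lookup table. The four TCP/IP layer names and the grouping are the commonly drawn approximation, not something stated on the slides, and `osi_layers_for` is a hypothetical helper name.

```python
# Commonly drawn (approximate) alignment of TCP/IP layers with OSI layers.
# The grouping is a teaching aid, not a strict equivalence.
TCPIP_TO_OSI = {
    "Application": ["Application", "Presentation", "Session"],
    "Transport": ["Transport"],
    "Internet": ["Network"],
    "Link": ["Data Link", "Physical"],
}


def osi_layers_for(tcpip_layer: str) -> list[str]:
    """Return the OSI layers usually drawn next to a TCP/IP layer."""
    return TCPIP_TO_OSI[tcpip_layer]


if __name__ == "__main__":
    for layer, osi in TCPIP_TO_OSI.items():
        print(f"{layer:<12} ~ {', '.join(osi)}")
```

HTTP sits in the first row: it is an Application-layer protocol in both models.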

OSI & TCP/IP protocol suite…

Visual learning: Wireshark, baby

http://www.wireshark.org/

The Communication

My favorite interview question:

http://www.flickr.com/photos/terryhart/2890904949/

What happens between the moment when:

we click on a hyperlink

and the page is completely rendered in a browser?

Browser → Internetz → Proxy → LB → Web Server → DB Server

Client ↔ Server (null.co.in)

Name resolution: browser cache / hosts file / DNS server

null.co.in → 74.53.228.212
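
The name-resolution step can be tried from a script. This is a minimal sketch using the operating system's resolver, which consults the hosts file and then DNS; the browser's own cache is not involved here. The `resolve` helper is a hypothetical name, and the address returned today may differ from the one on the slide.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the distinct IPv4 addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, 80, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # sockaddr is (ip, port) for IPv4
        if ip not in addresses:
            addresses.append(ip)
    return addresses


if __name__ == "__main__":
    # Needs network access; the result depends on current DNS records.
    print(resolve("null.co.in"))
```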

TCP Connection: There, bro?

Client → Server: SYN

TCP Connection: Yo!

Server → Client: SYN-ACK

TCP Connection: Cool!

Client → Server: ACK
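
Application code never sends SYN or ACK itself; the operating system runs the three-way handshake when a program asks for a connection. A minimal sketch follows, assuming a throwaway listener on 127.0.0.1 so it runs without network access. `open_tcp` and `demo` are hypothetical names.

```python
import socket


def open_tcp(host: str, port: int, timeout: float = 5.0) -> socket.socket:
    """Connect to host:port; the OS performs SYN / SYN-ACK / ACK inside this call."""
    return socket.create_connection((host, port), timeout=timeout)


def demo() -> tuple:
    # Local listener standing in for the server (an assumption for the demo).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = open_tcp("127.0.0.1", port)
    server_side, peer = listener.accept()
    try:
        # The client's local address is what the server sees as its peer.
        return client.getsockname(), peer
    finally:
        client.close()
        server_side.close()
        listener.close()


if __name__ == "__main__":
    print(demo())
```

Running Wireshark on the loopback interface while this executes should show the three segments from the slides.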

HTTP: Got this file?

Client → Server: GET /

HTTP: Yup! Here ‘tis.

Server → Client: 200 OK, index.html

HTTP: Can I have these as well?

Client → Server: GET /js.js, GET /pic.jpg

HTTP: Sure!

Server → Client: 200 OK, more content…
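
The same exchange can be reproduced with Python's standard library. This is a sketch under stated assumptions: `DemoHandler` and its `PAGES` contents are hypothetical stand-ins for the web server in the diagram, served from 127.0.0.1 so nothing leaves the machine. The handler speaks HTTP/1.1 so that all three requests travel over one TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical content for the three resources named on the slides.
PAGES = {
    "/": b"<html>index</html>",
    "/js.js": b"// script",
    "/pic.jpg": b"(picture bytes)",
}


class DemoHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep the connection open between requests

    def do_GET(self):
        body = PAGES.get(self.path)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def fetch_all(host: str, port: int, paths: list[str]) -> list[tuple]:
    """GET each path over a single connection; return (path, status, reason, body)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    results = []
    try:
        for path in paths:
            conn.request("GET", path)
            resp = conn.getresponse()
            results.append((path, resp.status, resp.reason, resp.read()))
    finally:
        conn.close()
    return results


def demo() -> list[tuple]:
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_all("127.0.0.1", server.server_port, ["/", "/js.js", "/pic.jpg"])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for path, status, reason, body in demo():
        print(path, status, reason, len(body))
```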

TCP Connection: Arigato, am done.

FIN

TCP Connection: Sayonara!

FIN-ACK

The Communication

…. or simply

The Communication

Web 2.0 has blurred the distinction between client and server

Conventionally, the client sends an HTTP request

The server responds with an HTTP response

The Communication: HTTP Request

Request Line

Request Method

Requested Resource

HTTP Version used

Headers

General Headers

Request Headers

Entity Headers

Content (Optional)
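
The three parts of a request (request line, headers, optional content) can be seen by assembling the bytes by hand. This is a minimal sketch; `build_request` is a hypothetical helper, and real clients add more headers than shown.

```python
def build_request(
    method: str,
    target: str,
    headers: dict[str, str],
    body: bytes = b"",
    version: str = "HTTP/1.1",
) -> bytes:
    """Assemble request line + headers + blank line + optional content."""
    lines = [f"{method} {target} {version}"]               # request line
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"                 # blank line ends the headers
    return head.encode("ascii") + body


if __name__ == "__main__":
    raw = build_request("GET", "/", {"Host": "null.co.in", "Accept": "text/html"})
    print(raw.decode("ascii"))
```

Every line ends in CRLF, and the empty line is what tells the server the headers are over.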

The Communication: HTTP Response

Status Line

HTTP version(s) understood by server

Status code (3 digit numerical value)

Status description

Headers

General Headers

Response Headers

Entity Headers

Content (Optional)
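
Going the other way, a raw response splits into the same three parts. This is a simplified sketch, assuming a complete response held in memory; `parse_response` is a hypothetical helper that keeps only the last value of a repeated header and ignores chunked transfer encoding.

```python
def parse_response(raw: bytes) -> tuple:
    """Split a raw response into (version, code, reason, headers, content)."""
    head, _, content = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: HTTP version, 3-digit status code, status description.
    version, code, reason = status_line.split(" ", 2)

    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return version, int(code), reason, headers, content


if __name__ == "__main__":
    sample = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\nhello"
    print(parse_response(sample))
```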

http://www.saynotocrack.com/wp-content/uploads/2007/06/flinstones-anatomy.jpg

Anatomy

HTTP requests and responses are made up of various components:

Request Methods

Response Status Codes

Request Headers

Response Headers

General Headers

Entity Headers

Content (MIME Media Types)

Anatomy: Request Methods

Humans can convey emotions in several ways

Why should HTTP clients lag behind?

HTTP methods describe the type of communication

GET POST HEAD OPTIONS

TRACE PUT DELETE CONNECT
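
A rough cheat sheet for the eight methods, written as data. The one-line descriptions are paraphrases, and the idempotent set follows RFC 2616 section 9.1.2; `safe_to_retry` is a hypothetical helper name.

```python
# One-line paraphrases of what each RFC 2616 method asks the server to do.
METHODS = {
    "GET": "retrieve a representation of the resource",
    "HEAD": "like GET, but the response carries headers only",
    "POST": "submit data to be processed by the resource",
    "PUT": "store the enclosed entity at the given URI",
    "DELETE": "remove the resource at the given URI",
    "OPTIONS": "ask which methods and options the resource supports",
    "TRACE": "echo the received request back, for diagnostics",
    "CONNECT": "reserved for proxies that can switch to being a tunnel",
}

# Methods RFC 2616 section 9.1.2 describes as idempotent:
# repeating the request has the same effect as sending it once.
IDEMPOTENT = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}


def safe_to_retry(method: str) -> bool:
    """True when repeating the request should not change the outcome."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method}")
    return method in IDEMPOTENT


if __name__ == "__main__":
    for name, purpose in METHODS.items():
        print(f"{name:<8} {purpose} (retry ok: {safe_to_retry(name)})")
```

This is why a browser warns before re-submitting a form: POST is not in the idempotent set.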

Anatomy: Response Status Codes

Indicate the server’s mood in response to a request

A combination of a numerical code and a short description

Can be grouped into 5 categories:

1xx -- Informational

2xx -- Successful

3xx -- Redirection

4xx -- Client Error

5xx -- Server Error
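
The category is simply the first digit of the code. A small sketch using Python's `http.HTTPStatus` for the short descriptions; `categorize` and `describe` are hypothetical helper names.

```python
from http import HTTPStatus

CATEGORIES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def categorize(code: int) -> str:
    """Map a 3-digit status code to its class via the first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return CATEGORIES[code // 100]


def describe(code: int) -> str:
    """Combine the code, its standard phrase, and its category."""
    return f"{code} {HTTPStatus(code).phrase} ({categorize(code)})"


if __name__ == "__main__":
    for code in (200, 301, 404, 500):
        print(describe(code))
```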

Anatomy: Request Headers

Specific to an HTTP Request

Carry information about the client, and the type of request

Facilitate better understanding between client and server

Host User-Agent Accept Accept-Charset

Accept-Encoding Accept-Language Authorization Proxy-Authorization

Max-Forwards If-Match If-None-Match If-Modified-Since

If-Unmodified-Since If-Range Range Referer

Expect From TE

Anatomy: Response Headers

Specific to an HTTP Response

Carry information about the server, and the type of response

Accept-Ranges ETag Retry-After WWW-Authenticate

Age Location Server Proxy-Authenticate

Vary
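
One pairing from the two lists shows how request and response headers cooperate: the server labels content with `ETag`, and the client later sends it back in `If-None-Match` to ask whether anything changed. This is a simplified sketch of the server-side check, assuming weak comparison and entity tags that contain no commas; `not_modified` is a hypothetical helper name.

```python
def _opaque(tag: str) -> str:
    """Drop surrounding whitespace and a weak-validator prefix (W/)."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def not_modified(if_none_match: str, current_etag: str) -> bool:
    """True when the server could answer 304 Not Modified instead of resending."""
    if if_none_match.strip() == "*":
        return True  # "*" matches any current representation
    # Simplification: splitting on commas assumes no comma inside a tag.
    candidates = {_opaque(tag) for tag in if_none_match.split(",")}
    return _opaque(current_etag) in candidates


if __name__ == "__main__":
    print(not_modified('"abc", "def"', '"def"'))
    print(not_modified('"abc"', '"xyz"'))
```

When the check succeeds the server can skip the content entirely, which is one reason header knowledge pays off for performance.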

Anatomy: General Headers

Carry information about the HTTP transaction

Can be a part of request, as well as response

Cache-Control Keep-Alive Pragma Via

Connection Upgrade Trailer Warning

Transfer-Encoding Date

Anatomy: Entity Headers

Carry information about the content

Mainly a part of HTTP response

Allow Content-Language Content-Location Content-Range

Content-Encoding Content-Length Content-MD5 Content-Type

Expires Last-Modified

Anatomy: Content

IANA maintains a list of valid content types

It is specified by the Content-Type Entity header

Categorized into 9 top-level MIME media types:

application audio example image message

model multipart text video
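
A Content-Type value is a top-level type, a subtype, and optional parameters. A minimal sketch that reuses the standard library's MIME header parsing; `parse_content_type` is a hypothetical helper name.

```python
from email.message import Message


def parse_content_type(value: str) -> tuple:
    """Return (top-level type, subtype, charset or None) for a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = value
    return (
        msg.get_content_maintype(),
        msg.get_content_subtype(),
        msg.get_content_charset(),  # None when no charset parameter is present
    )


if __name__ == "__main__":
    print(parse_content_type("text/html; charset=UTF-8"))
    print(parse_content_type("image/jpeg"))
```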

Handling Statelessness

HTTP is a stateless protocol

i.e., server’s got a bad memory

Handling Statelessness

Cookies to the rescue

http://www.flickr.com/photos/lij/283869088/

Handling Statelessness

Cookies:

are small pieces of text stored by the client’s browser

maintain session by storing information

are non-executable

Handling Statelessness

Cookie attributes:

name=value

expires=value

domain=value

path=value

Secure

HttpOnly (not a part of the RFC 2965 spec)
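
The attributes above end up in a `Set-Cookie` response header, and the browser returns only the name=value pairs in a `Cookie` request header. A minimal sketch with Python's `http.cookies`; the cookie name, value and the two helper names are hypothetical.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name: str, value: str, domain: str, path: str = "/") -> str:
    """Build the value of a Set-Cookie header carrying the listed attributes."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["domain"] = domain
    morsel["path"] = path
    morsel["secure"] = True    # only send over HTTPS
    morsel["httponly"] = True  # hide from page scripts
    return morsel.OutputString()


def read_cookie_header(header: str) -> dict[str, str]:
    """Parse the Cookie header a browser sends back into name/value pairs."""
    cookie = SimpleCookie()
    cookie.load(header)
    return {name: morsel.value for name, morsel in cookie.items()}


if __name__ == "__main__":
    print("Set-Cookie:", make_set_cookie("session", "abc123", "null.co.in"))
    print(read_cookie_header("session=abc123; theme=dark"))
```

The server looks up `session` on each request, which is how a stateless protocol gets its memory back.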

Conclusion

The single biggest problem in communication

is the illusion… that it has taken place.

--George Bernard Shaw

Think about it

Q&A!!!

Got queries? Raise your hands.

Arigato!

Contact info:

Om—At—[projectbee.org/null.co.in]

http://projectbee.org/

Twitter - @bipinu

Flickr -- projectbee