1
Electronic Mail (SMTP, POP, IMAP, MIME)
We will work through the handout from Tanenbaum’s book “Computer Networks.”
Overview:
The message will be constructed under RFC 822, then passed to SMTP (RFC 821) for transmission.
Internet E-mail standards were published in two parts in 1982:
RFC 822: STANDARD FOR THE FORMAT OF ARPA INTERNET TEXT MESSAGES, by David H. Crocker
RFC 821: SIMPLE MAIL TRANSFER PROTOCOL, by Jonathan B. Postel
(Updated as RFC 2822 and RFC 2821 in April 2001.)
2
7.4.3 Message Formats
RFC 822 messages consist of lines of ASCII text, each ending with <CR><LF>, with a maximum of 1000 characters per line.
Messages are divided into three sections:
■ header fields
■ a blank line (a line with nothing except <CR><LF>)
■ optionally, the message body.
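The three-part layout can be illustrated with a minimal Python sketch. The function name split_message and the addresses are made up for illustration, and the sketch ignores header lines that are continued ("folded") over several lines.

```python
def split_message(raw: str):
    """Split an RFC 822 style message at the first blank line.

    Returns (headers, body), where headers is a list of
    (keyword, value) pairs. Folded header lines are not handled.
    """
    # The blank line is the first place where two <CR><LF> pairs meet.
    head, _, body = raw.partition("\r\n\r\n")
    headers = []
    for line in head.split("\r\n"):
        if not line:
            continue  # skip the empty tail when there is no body at all
        keyword, _, value = line.partition(":")
        headers.append((keyword.strip(), value.strip()))
    return headers, body


# Hypothetical message; the addresses are illustrative only.
raw = (
    "From: alice@abc.com\r\n"
    "To: bob@xyz.com\r\n"
    "Subject: Hello\r\n"
    "\r\n"
    "This is the body.\r\n"
)
headers, body = split_message(raw)
```

Here headers holds the three keyword/value pairs and body holds everything after the blank line.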
3
Headers
■ contain readable text (ASCII – no control characters)
■ are divided into lines
■ each line of form <keyword> : <value>
Keywords To and From are required, others optional.
4
Some other RFC 822 header fields not involved in transport:
5
RFC 822 states that the message can consist
only of ASCII text and SMTP (RFC 821) expects this.
ASCII is a 7-bit code, which is transmitted right-adjusted in an 8-bit byte, leaving binary 0 in the
high-order position.
6
MIME – Multipurpose Internet Mail Extensions (RFC 1521, 1993)
In the body of the message we would like to be able to include items such as:
■ Messages in languages with accents
■ Messages in non-Latin alphabets (Arabic, Russian, Hebrew)
■ Messages in languages without alphabets (Chinese and Japanese)
■ Messages not containing any kind of text (image, audio and video)
Such material may contain arbitrary sequences of binary digits. There is no reason for the high-order bit of each byte to be zero.
To send non-ASCII information (an arbitrary binary string) we must “disguise” it as ASCII.
7
Questions:
■ how does the sender disguise the binary string as ASCII?
■ when recipient receives the “ASCII” how does she
retrieve the binary string?
■ when recipient retrieves the binary string, how does
she know what it is?
8
Questions:
■ how do we disguise the binary string as ASCII?
9
10
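[Figure: encoding the string ‘UAB’ in 6-bit groups. ‘U’ is 01010101; its first six bits, 010101, become the character ‘V’.]
In this example, disguise is not necessary, since ‘UAB’ is already ASCII text!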
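The 6-bit regrouping in the figure matches base64, so the following Python sketch assumes that this is the encoding the slide illustrates. It performs the regrouping by hand and compares the result with the standard library. The function name base64_by_hand is made up for illustration.

```python
import base64

# The 64 printable characters that stand for the 6-bit values 0..63.
B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def base64_by_hand(data: bytes) -> str:
    """Encode bytes by regrouping the bit string into 6-bit pieces."""
    bits = "".join(f"{byte:08b}" for byte in data)
    # Pad with zero bits so the length is a multiple of 6.
    bits += "0" * ((-len(bits)) % 6)
    chars = [B64_ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # Pad with '=' so the output length is a multiple of 4.
    chars.append("=" * ((-len(chars)) % 4))
    return "".join(chars)


encoded = base64_by_hand(b"UAB")                 # first character comes from 010101
library = base64.b64encode(b"UAB").decode("ascii")
```

The same function works for bytes whose high-order bit is set, which is the case that actually needs the disguise.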
11
Second Question:
■ when recipient receives the “ASCII” how does she retrieve the binary string?
Receiver sees the Content-Transfer-Encoding header, then knows how to reverse the encoding to retrieve the original binary string.
12
Third question:
■ when recipient retrieves the binary string, how does
she know what it is?
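The second and third questions are answered by two MIME headers: Content-Transfer-Encoding says how to undo the disguise, and Content-Type says what the recovered bytes are. The sketch below uses Python's standard email package to show both. The addresses, subject and payload bytes are made up for illustration.

```python
from email import message_from_bytes
from email.message import EmailMessage

# Arbitrary binary data, including bytes with the high-order bit set.
payload = bytes([0x89, 0x50, 0x4E, 0x47, 0xFF, 0x00, 0x80])

# Sender side: the library picks base64 for a bytes payload.
msg = EmailMessage()
msg["From"] = "alice@abc.com"   # hypothetical addresses
msg["To"] = "bob@xyz.com"
msg["Subject"] = "binary test"
msg.set_content(payload, maintype="application", subtype="octet-stream")
wire = msg.as_bytes()           # what would be handed to SMTP

# Receiver side: the headers say how to decode and what the data is.
received = message_from_bytes(wire)
encoding = received["Content-Transfer-Encoding"]
content_type = received.get_content_type()
recovered = received.get_payload(decode=True)
```

Every byte of wire is 7-bit ASCII, yet recovered equals the original binary payload.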
13
14
[Figure: example MIME message, with these parts labeled: RFC 822 headers, required blank line, section boundary, body]
15
7.4.4 Message Transfer
This is RFC 821, “Simple Mail Transfer Protocol.”
SMTP is a simple ASCII protocol, running on top of TCP.
First, the client establishes a TCP connection to port 25 of the server
(this would have involved a preliminary access to the DNS system to discover a type MX resource record for the destination domain).
Overview:
This message has been constructed under RFC 822, and will be passed to SMTP (RFC 821) for transmission.
We will illustrate the client/server exchange by considering transmission of the message in figure 7-46.
16
TCP connection from client abc.com to port 25 on Mail Exchanger for xyz.com already established.
RFC821 (SMTP) Dialog
RFC 822 message
End marker added by SMTP client
17
The e-mail message as seen on user screen:
Subject: Test II
From: Anthony Barnard <barnard@earthlink.net>
Date: Fri, 20 Jul 2007 11:59:23 -0500
To: "Anthony (work) Barnard" barnard@cis.uab.edu
The following two lines have a period in the first position:
.
.
The following two lines have periods in the first two positions:
..
..
end test
What if the 822 message itself has a period alone in the first position?
Will SMTP server see this and terminate the message prematurely?
18
Wireshark trace of sending message:
Frame 22 (588 bytes on wire, 588 bytes captured)
Internet Protocol, Src: 192.168.2.99, Dst: 207.69.189.206
Transmission Control Protocol, Src Port: 3693 (3693), Dst Port: smtp (25),
Simple Mail Transfer Protocol
Message: Message-ID: <46A0E9EB.8030105@earthlink.net>\r\n
Message: Date: Fri, 20 Jul 2007 11:59:23 -0500\r\n
Message: From: Anthony Barnard <barnard@earthlink.net>\r\n
Message: User-Agent: Thunderbird 1.5.0.12 (Windows/20070509)\r\n
Message: MIME-Version: 1.0\r\n
Message: To: "Anthony (work) Barnard" <barnard@cis.uab.edu>\r\n
Message: Subject: Test II\r\n
Message: Content-Type: text/plain; charset=ISO-8859-1; format=flowed\r\n
Message: Content-Transfer-Encoding: 7bit\r\n
Message: \r\n [the blank line]
Message: The following two lines have a period in the first position:\r\n
Message: ..\r\n [extra period “stuffed” in by SMTP client]
Message: ..\r\n
Message: The following two lines have periods in the first two positions:\r\n
Message: ...\r\n [extra period “stuffed” in by SMTP client]
Message: ...\r\n
Message: end test\r\n
Message: .\r\n [the end-of-message marker appended by SMTP client]
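The period "stuffing" seen in the trace can be sketched in a few lines of Python. The function names dot_stuff and dot_unstuff are made up for illustration, and lines are assumed to be already split with their <CR><LF> removed.

```python
def dot_stuff(message_lines):
    """SMTP client side: protect lines that begin with a period.

    Any line starting with '.' gets one extra leading period, then the
    end-of-message marker (a line holding a single period) is appended.
    """
    stuffed = ["." + line if line.startswith(".") else line
               for line in message_lines]
    return stuffed + ["."]


def dot_unstuff(wire_lines):
    """SMTP server side: stop at the marker and strip the extra periods."""
    message = []
    for line in wire_lines:
        if line == ".":
            break  # a lone period can only be the end marker
        message.append(line[1:] if line.startswith(".") else line)
    return message


# The body lines of the test message shown on the user's screen.
body = [
    "The following two lines have a period in the first position:",
    ".",
    ".",
    "The following two lines have periods in the first two positions:",
    "..",
    "..",
    "end test",
]
on_the_wire = dot_stuff(body)
```

Because every original leading period is doubled, a lone period in the message can never be mistaken for the end marker, so the server does not terminate the message prematurely.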
19
Introduction to the World Wide Web
Since we are coming off a study of E-mail, it may be helpful to note the influence that it had on the WWW protocols. Both separate the specification of the message from its transmission.
►RFC 822/MIME govern format of E-mail messages
HTML governs format of WWW pages
►RFC 821/SMTP and RFC 1939/POP3 govern transmission of E-mail messages
HTTP governs transmission of WWW pages
Like SMTP and POP3, HTTP is an “ASCII protocol” that can be easily read and understood by humans.
However, the correspondence is only loose: HTML looks very different from RFC 822/MIME, whereas HTTP draws from both RFC 822/MIME and RFC 821/SMTP.
20
An HTML document!
We will revisit this!
21
Chapter 27 – World Wide Web
Skim sections 27.1 – 27.5
22
27.6 Hypertext Transfer Protocol (HTTP)
► Application Level
► Request/Response
► Stateless
► Bi-directional Transfer
► Capability Negotiation
► Support for Caching
► Support for Intermediaries (proxies)
23
27.7 HTTP GET Request
Using Comer’s example
http://www.cs.purdue.edu/people/comer/
Once the TCP connection to HTTP server www.cs.purdue.edu has been made, the browser sends the command
GET /people/comer/ HTTP/1.1
Host: www.cs.purdue.edu [required request header (see later)]
27.8 Error Messages
Not much to say!
24
27.9 Persistent Connections
HTTP/1.0 followed the FTP paradigm, using one TCP connection per data transfer – create data connection, transfer one file, close data connection.
Default in HTTP/1.1 is persistent connection.
► Advantage:
reduced overhead
pipelining
► Disadvantage:
need to identify beginning and end of each item
can’t reserve a bit pattern as “sentinel”
have to use Content-Length response header
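To see why a length header is needed on a persistent connection, the Python sketch below separates two responses that arrive back to back on one byte stream. The function name split_responses and the sample responses are made up for illustration; only the Content-Length case is handled.

```python
def split_responses(stream: bytes):
    """Cut a byte stream into (header_text, body_bytes) pairs.

    The end of each header block is the first blank line; the end of
    each body is found only by counting Content-Length bytes.
    """
    responses = []
    while stream:
        head, sep, rest = stream.partition(b"\r\n\r\n")
        if not sep:
            raise ValueError("incomplete header block")
        length = None
        for line in head.split(b"\r\n")[1:]:      # skip the status line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.decode("ascii").strip())
        if length is None:
            raise ValueError("no Content-Length: cannot find the end of the item")
        responses.append((head.decode("ascii"), rest[:length]))
        stream = rest[length:]                    # next response starts here
    return responses


# The first body deliberately contains a blank line, so no byte pattern
# could serve as an end-of-item sentinel.
first = b"HTTP/1.1 200 OK\r\nContent-Length: 6\r\n\r\na\r\n\r\nb"
second = b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
items = split_responses(first + second)
```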
25
27.10 Data Length and Program Output
May not be convenient or even possible for server to know the length of an item before sending.
In this case we cannot use persistent connection.
HTTP server reverts to closing the connection after sending a single file (as in HTTP/1.0).
Server tells client about this by sending a Connection: close header (HTTP headers in next section).
26
27.11 Length Encoding and Headers
After the first line of a request or response:
“..HTTP borrows the basic format from e-mail, using the 822 format and MIME extensions. Like a standard 822 message, each HTTP transmission contains a header, a blank line, and the item being sent. Furthermore each line in the header contains a keyword, a colon, and information.”
Figure 27.1
Some headers:
27
Wireshark example (request):
Key in http://www.cis.uab.edu/barnard/old_home.html
Hypertext Transfer Protocol
GET /barnard/old_home.html HTTP/1.1\r\n [request line]
Host: www.cis.uab.edu\r\n [required request header]
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922\r\n
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5\r\n
Accept-Language: en-us,en;q=0.5\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n
Keep-Alive: 300\r\n
Connection: keep-alive\r\n
\r\n [required blank line]
[message body empty]
28
Wireshark example (reply to request in previous slide):
Hypertext Transfer Protocol
HTTP/1.1 200 OK\r\n [status line, response code 200]
Date: Fri, 08 Oct 2004 17:30:54 GMT\r\n
Server: Apache/1.3.29 (Unix) PHP/4.3.5RC3\r\n
Last-Modified: Mon, 08 Mar 2004 23:52:12 GMT\r\n
ETag: "770077-ee9-404d072c"\r\n
Accept-Ranges: bytes\r\n
Content-Length: 3817\r\n
Keep-Alive: timeout=15, max=100\r\n
Connection: Keep-Alive\r\n
Content-Type: text/html\r\n
\r\n [required blank line]
Line-based text data: text/html [message body: 3817 bytes of data, /barnard/old_home.html]
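Because the header block follows the RFC 822 pattern, an e-mail header parser can read it once the status line is removed. The Python sketch below does this for part of the reply trace, using the standard email.parser module. It is an illustration of the shared format, not a complete HTTP client.

```python
from email.parser import Parser

# Status line plus a few of the headers from the reply trace.
raw_head = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Fri, 08 Oct 2004 17:30:54 GMT\r\n"
    "Server: Apache/1.3.29 (Unix) PHP/4.3.5RC3\r\n"
    "Content-Length: 3817\r\n"
    "Connection: Keep-Alive\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)

# The first line is specific to HTTP; everything after it is 822 style.
status_line, _, header_block = raw_head.partition("\r\n")
version, code, reason = status_line.split(" ", 2)

# headersonly=True tells the e-mail parser not to look for a body.
headers = Parser().parsestr(header_block, headersonly=True)
body_length = int(headers["Content-Length"])
```

Header lookups are case-insensitive, so headers["content-type"] and headers["Content-Type"] return the same value.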
29
Omit rest of Chapter 27 – World Wide Web
Comer’s presentation is inadequate for our purpose, so we will again use parts of Tanenbaum’s presentation (handout).
Preview:
To make the WWW usable for E-commerce, 4 key developments were needed:
1. Cookies
2. Forms
3. Three-tier system
4. Security
30
7.3 THE WORLD WIDE WEB
7.3.1 Architectural Overview
Statelessness and Cookies
WWW server is stateless
Like a packet filter, it does not remember anything.
But for applications like E-commerce we need state!
In 1994 Netscape invented a “fix” to HTTP – “cookies”
31
Statelessness and Cookies – continued
Along with a WWW page, the server sends a cookie to the client.
On later accesses to the server, the client returns the cookie.
This identifies the client and provides continuity from visit to visit.
Cookie is a small file that the client stores on its hard disk
(terminology is that server sets the cookie).
32
Statelessness and Cookies – continued
Cookies have been set by:
►Tom’s Casino to identify this client.
►Joe’s Store to record the items currently in the shopping cart.
►A WWW portal to record the client’s news interests.
►Sneaky.com to track the user’s WWW browsing.
We’ll take a closer look at cookies later.
33
7.3 THE WORLD WIDE WEB
7.3.1 Architectural Overview
Statelessness and Cookies
******** content *********
7.3.2 Static Web Documents
HTML – The Hypertext Markup Language
Forms
34
7.3.2 Static Web Documents
WWW pages are written in Hypertext Markup Language (HTML)
Formatting commands are called tags
e.g. <h2> this is a second-level headline </h2>
states that the text between the tags should be displayed at level-2 size.
I will assume that you are familiar with basic HTML
35
Forms
HTML 1.0 was basically one-way;
HTML 2.0 introduced forms, which can be completed by the client and returned to the server.
This was a key step in making E-commerce possible.
(Latest is HTML 5.0)
36
Forms - continued
Upper part of Figure 7-29(a)
In these examples the input tag has no type parameter
– default is “text” – user keys in information
In the first example, the system will assign the keyed-in string to the variable “customer”.
37
Forms - continued
Figure 7-29(b)
Anthony Barnard
3037 Westmoreland Drive
Mountain Brook AL USA
123456789 07/20
*
38
Forms - continued
Figure 7-29(a)
The input tag has parameter type with value radio (like the station buttons on a car radio).
Select exactly one of the alternatives.
If VISA is clicked, the value visacard will be assigned to the variable cc.
39
Input type checkbox – optional – can check or ignore
Input type submit – click when ready to upload data to WWW server
40
When the Submit order button is clicked, the system first assembles the input information into a string:
customer=Anthony+Barnard&address=3037+Westmoreland+Drive&city=Mountain+Brook&state=AL&country=USA&cardno=123456789&expires=7/20&cc=visacard&product=expensive&express=on
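The assembly of the form string can be reproduced with Python's standard urllib.parse module. This is a sketch of the encoding rule (spaces become +, fields are joined with &), not of what any particular browser does internally. The argument safe="/" is an assumption made here so that the expiry date keeps its slash, as it does in the string above.

```python
from urllib.parse import parse_qs, urlencode

# Field names and values as filled in on the form.
fields = {
    "customer": "Anthony Barnard",
    "address": "3037 Westmoreland Drive",
    "city": "Mountain Brook",
    "state": "AL",
    "country": "USA",
    "cardno": "123456789",
    "expires": "7/20",
    "cc": "visacard",         # the radio button that was selected
    "product": "expensive",
    "express": "on",          # the checkbox was ticked
}

# name=value pairs joined by '&'; spaces are sent as '+'.
encoded = urlencode(fields, safe="/")

# The server-side script reverses the encoding.
decoded = parse_qs(encoded)
```

On the server side, parse_qs returns each value as a list, because a form may send the same name more than once.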
41
Every form needs at least one submit button!
The ACTION and method parameters specify what should happen next after the submit order button is clicked.
42
What happens when the submit order button is clicked?
1. Make TCP connection to widget.com, port 80
2. Use HTTP to POST the string to script widgetorder in directory cgi-bin
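Step 2 can be sketched as the bytes that would be written to the TCP connection. The Python function below only builds the request and does not open a connection. The function name build_post_request and the shortened form string are illustrative assumptions; the Content-Type value is the standard one for URL-encoded form data.

```python
def build_post_request(host: str, path: str, form_string: str) -> bytes:
    """Build an HTTP/1.1 POST request carrying an encoded form string."""
    body = form_string.encode("ascii")
    head = [
        f"POST {path} HTTP/1.1",                    # request line
        f"Host: {host}",                            # required request header
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body)}",             # tells the server where the body ends
        "",                                         # blank line ends the headers
        "",
    ]
    return "\r\n".join(head).encode("ascii") + body


# Shortened version of the form string, for illustration.
form = "customer=Anthony+Barnard&cc=visacard&express=on"
request = build_post_request("widget.com", "/cgi-bin/widgetorder", form)
```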
43
7.3 THE WORLD WIDE WEB
7.3.1 Architectural Overview
Statelessness and Cookies
******** content *********
7.3.2 Static Web Documents
HTML – The Hypertext Markup Language
Forms
7.3.3 Dynamic Web Documents
Server-Side Dynamic Page Generation
44
7.3.3 Dynamic Web Documents
Not all WWW pages can be prepared in advance.
Server-side Dynamic Web Page Generation
Example of the need for a server to build a page dynamically:
You have several items in your shopping cart and have clicked on the PROCEED TO CHECKOUT button.
The server needs to build a page showing your purchases, for your confirmation.
45
7.3.3 Dynamic Web Documents – continued
Common Gateway Interface (CGI)
Standard interface allows WWW servers to talk to back-end servers.
Scripts are usually stored in directory cgi-bin
Recall ACTION parameter in figure 7-29(a) :
“3-tier system”
46
7.3 THE WORLD WIDE WEB
7.3.1 Architectural Overview
Statelessness and Cookies
******** content *********
7.3.2 Static Web Documents
HTML – The Hypertext Markup Language
Forms
7.3.3 Dynamic Web Documents
Server-Side Dynamic Page Generation
******** transmission across internet *******
7.3.4 HTTP – The HyperText Transfer Protocol
Connections
Methods
Message Headers
Example HTTP Usage
47
7.3.4 HTTP – The Hypertext Transfer Protocol
Each interaction consists of one ASCII request, followed by one RFC 822 MIME-like response.
48
7.3.4 HTTP – The Hypertext Transfer Protocol
Connections (recall from Comer section 27.9)
“In HTTP 1.0 after the connection was established, a single request was sent over and a single response was sent back. Then the TCP connection was released.”
HTTP 1.1 default is persistent connections – can send numerous requests and get numerous responses over the same TCP connection.
49
7.3.4 HTTP – The Hypertext Transfer Protocol – continued
Requests
“Each request consists of one or more lines of ASCII text, with the first word on the first line being the name of the method requested.”
Example:
GET filename HTTP/1.1
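As a concrete instance of the request format, the Python sketch below builds the GET request for Comer's example URL from section 27.7. It only constructs the bytes and does not contact any server. The function name build_get_request is made up for illustration.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> bytes:
    """Build a minimal HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)
    path = parts.path or "/"          # an empty path means the root
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",       # method name is the first word
        f"Host: {parts.netloc}",      # required request header in HTTP/1.1
        "",                           # blank line ends the request
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_get_request("http://www.cs.purdue.edu/people/comer/")
```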
50
7.3.4 HTTP – The Hypertext Transfer Protocol – continued
Responses
“Every request gets a response, consisting of a status line and possibly additional information (e.g. all or part of a WWW page).”
51
7.3.4 HTTP – The Hypertext Transfer Protocol – continued
Message Headers
After the first line (request or response), HTTP messages follow the pattern of E-mail messages: one or more headers, followed by a blank line, optionally followed by the message body.
The MIME rules apply to the body, and some of the MIME headers are used (e.g. Content-Type and Content-Encoding).
52
7.3.4 HTTP – The Hypertext Transfer Protocol – continued
Message headers - Request
Message headers - Response
*
53
7.3.4 HTTP – The Hypertext Transfer Protocol – continued
Example HTTP usage
“Because HTTP is an ASCII protocol, it is quite easy for a person at a terminal (as opposed to a browser) to talk directly to Web servers. All that is needed is a TCP connection to port 80 on the server. Readers are encouraged to try this scenario personally.”
User keys in:
telnet www.ietf.org 80 > log
GET /rfc.html HTTP/1.1
Host: www.ietf.org [required request header]
.
.
.
close
54
7.3.4 HTTP – The Hypertext Transfer Protocol - continued
Figure 7-44
telnet www.ietf.org 80 > log
GET /rfc.html HTTP/1.1
Host: www.ietf.org
[blank line to signal end]
Blank line in response
[More HTML]
55
Back to Cookies!
This treatment is based on RFC 2109/2965, HTTP State Management Mechanism.
TERMINOLOGY
► user agent – the client that initiates a request, usually a browser.
► origin server – the server on which a given resource resides (“origin” to distinguish it from any proxy servers involved)
► Host domain name (HDN)
► request-host
► request-URI
URL: www.mylab.org/cgi-bin/sampleform
(request-host: www.mylab.org; request-URI: /cgi-bin/sampleform)
56
TERMINOLOGY - continued
► domain-match
Host A’s name domain-matches host B’s if
► their names or IP addresses match exactly, or
► A is a HDN string and has the form NB, where N is a non-empty name string, B has the form .B′ and B′ is a HDN
Examples:
► www.amazon.com domain-matches .amazon.com (N = www, B = .amazon.com)
► pda-as.amazon.com domain-matches .amazon.com
► www.amazon.com does not domain-match amazon.com
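The domain-match rule can be written as a small Python function. This is a simplified sketch: it compares names as strings and does not check that A really is a host name rather than an IP address. The function name domain_match is made up for illustration.

```python
def domain_match(a: str, b: str) -> bool:
    """Return True if host name a domain-matches b."""
    a, b = a.lower(), b.lower()
    if a == b:
        return True                   # exact match
    # Otherwise a must have the form N + b, where N is non-empty
    # and b begins with a dot.
    return b.startswith(".") and len(a) > len(b) and a.endswith(b)


examples = {
    ("www.amazon.com", ".amazon.com"): domain_match("www.amazon.com", ".amazon.com"),
    ("pda-as.amazon.com", ".amazon.com"): domain_match("pda-as.amazon.com", ".amazon.com"),
    ("www.amazon.com", "amazon.com"): domain_match("www.amazon.com", "amazon.com"),
}
```

The first two entries are matches and the third is not, because amazon.com has no leading dot.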
57
Definition of HTTP session
Informally: a session might include access to a catalog, selection of purchase items into a shopping cart, checkout, and acknowledgement of purchase.
An HTTP session may contain several TCP sessions.
1. Each session has a beginning and an end.
2. Each session is relatively short-lived.
3. Session is started by the origin server.
4. Either the user-agent or the origin server may terminate a session.
5. The session is implicit in the exchange of state information (there is no special message to start or stop a session).
58
OUTLINE
Origin server sends state information (cookie) to the user agent
User agent returns state information to origin server.
59
The Role of the Origin Server
►The origin server (surprising!) initiates an HTTP session, if it so desires.
► To initiate a session, the origin server sends a message with an extra response header to the client, Set-Cookie.
► The origin server may include multiple Set-Cookie headers in a response.
► Servers may send a Set-Cookie header with any response (not necessarily with every response, but Amazon sends the same cookies repeatedly; see Lab session 8).
► To identify themselves, user agents should send Cookie request headers (subject to other rules detailed below) with every request.
60
Set-Cookie Syntax
set-cookie = "Set-Cookie:" cookies
cookies = 1#cookie [at least one cookie]
cookie = NAME "=" VALUE *(";" cookie-av) [zero or more attribute-value pairs]
cookie-av = "Comment" "=" value
          | "Domain" "=" value
          | "Expires" "=" value
          | "Path" "=" value
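The grammar can be exercised with a short Python parser for the value of one Set-Cookie header. This sketch handles a single cookie per header, as in the Amazon traces; the comma-separated list allowed by 1#cookie is not handled. The function name parse_set_cookie is made up for illustration.

```python
def parse_set_cookie(value: str):
    """Parse 'NAME=VALUE; attr=val; ...' into (name, value, attributes)."""
    first, *rest = [part.strip() for part in value.split(";")]
    name, _, cookie_value = first.partition("=")
    attributes = {}
    for pair in rest:
        if not pair:
            continue                  # tolerate a trailing ';'
        key, _, val = pair.partition("=")
        # Attribute names are stored in lower case for easy lookup.
        attributes[key.strip().lower()] = val.strip()
    return name, cookie_value, attributes


# Header value taken from the Wireshark trace of the Amazon response.
name, cookie_value, attributes = parse_set_cookie(
    "skin=noskin; path=/; domain=.amazon.com; "
    "expires=Fri, 04-Nov-2011 19:55:42 GMT"
)
```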
61
Example: Wireshark trace of response to user keying in www.amazon.com (from Lab session 8)
Hypertext Transfer Protocol
HTTP/1.1 200 OK\r\n
Date: Fri, 04 Nov 2011 19:55:42 GMT\r\n
Server: Server\r\n
Set-Cookie: skin=noskin; path=/; domain=.amazon.com; expires=Fri, 04-Nov-2011 19:55:42 GMT\r\n
x-amz-id-1: 06RZ1EQG59NPJZENETCY\r\n
p3p: policyref="http://www.amazon.com/w3c/p3p.xml\r\n
x-amz-id-2: 3LFrkcUrpQGeTZd5nBBJpw7sW67
Vary: Accept-Encoding,User-Agent\r\n
Content-Encoding: gzip\r\n
Content-Type: text/html; charset=ISO-8859-1\r\n
Set-cookie: session-id-time=2082787201l; path=/;domain=.amazon.com; expires=Tue, 01-Jan-2036 08:00:01 GMT\r\n
Set-cookie: session-id=182-8717797-2826126; path=/; domain=.amazon.com; expires=Tue, 01-Jan-2036 08:00:01 GMT\r\n
Transfer-Encoding: chunked\r\n
\r\n [blank line]
62
The Role of the User Agent (browser)
The user agent keeps separate track of state information that arrives via Set-Cookie response headers from each origin server.
When the user agent sends a request to an origin server, the user agent includes a Cookie request header if it has applicable cookies, based on:
► the request-host – Domain selection
AND
► the request-URI – Path selection
AND
► the expiration date – Age selection
63
User Agent Role – continued
URL: www.mylab.org/cgi-bin/sampleform (request-host: www.mylab.org; request-URI: /cgi-bin/sampleform)
Domain selection:
The origin server’s FQDN must domain-match the domain attribute of the cookie.
Path selection:
The path attribute of the cookie must match a prefix of the request-URI.
Age selection:
Cookies that have expired should have been discarded and so are not sent.
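The three selection rules can be combined into one check, sketched below in Python with the cookies from the Amazon trace. The dictionary layout, the function names and the chosen value of now (one second after the response's Date header) are assumptions made for illustration. A real browser follows the full rules of the RFC.

```python
from datetime import datetime, timezone

UTC = timezone.utc


def domain_match(host: str, domain: str) -> bool:
    """Simplified domain-match: exact, or host = N + domain with a leading dot."""
    host, domain = host.lower(), domain.lower()
    if host == domain:
        return True
    return domain.startswith(".") and len(host) > len(domain) and host.endswith(domain)


def applicable(cookie: dict, request_host: str, request_uri: str, now: datetime) -> bool:
    """Domain selection AND path selection AND age selection."""
    return (
        domain_match(request_host, cookie["domain"])
        and request_uri.startswith(cookie["path"])
        and cookie["expires"] > now
    )


def cookie_header(jar, request_host, request_uri, now) -> str:
    """Build the Cookie request header, or '' if no cookie applies."""
    pairs = [f'{c["name"]}={c["value"]}' for c in jar
             if applicable(c, request_host, request_uri, now)]
    return "Cookie: " + "; ".join(pairs) if pairs else ""


# Cookies set by the Amazon response in the trace.
jar = [
    {"name": "skin", "value": "noskin", "domain": ".amazon.com", "path": "/",
     "expires": datetime(2011, 11, 4, 19, 55, 42, tzinfo=UTC)},
    {"name": "session-id-time", "value": "2082787201l", "domain": ".amazon.com",
     "path": "/", "expires": datetime(2036, 1, 1, 8, 0, 1, tzinfo=UTC)},
    {"name": "session-id", "value": "182-8717797-2826126", "domain": ".amazon.com",
     "path": "/", "expires": datetime(2036, 1, 1, 8, 0, 1, tzinfo=UTC)},
]
now = datetime(2011, 11, 4, 19, 55, 43, tzinfo=UTC)   # assumed time of the next request
header = cookie_header(jar, "www.amazon.com",
                       "/aan/2009-09-09/static/amazon/iframeproxy-9.html", now)
```

The skin cookie fails age selection because its expiry equals the time of the response, so only the two session cookies are returned.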
64
Example: the Wireshark trace of the response to the user keying in www.amazon.com (from Lab session 8, shown in full on the earlier slide) set these cookies:
Set-Cookie: skin=noskin; path=/; domain=.amazon.com; expires=Fri, 04-Nov-2011 19:55:42 GMT\r\n
Set-cookie: session-id-time=2082787201l; path=/;domain=.amazon.com; expires=Tue, 01-Jan-2036 08:00:01 GMT\r\n
Set-cookie: session-id=182-8717797-2826126; path=/; domain=.amazon.com; expires=Tue, 01-Jan-2036 08:00:01 GMT\r\n
www.amazon.com domain-matches the domain attribute .amazon.com
65
Trace of next HTTP request message client to server www.amazon.com
Should we send the cookies set in previous slide?
Cookie domain was .amazon.com
Cookie path was /
Hypertext Transfer Protocol
GET /aan/2009-09-09/static/amazon/iframeproxy-9.html HTTP/1.1\r\n
Host: www.amazon.com\r\n
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23) Gecko/20110921 Ubuntu/10.04 (lucid) Firefox/3.6.23\r\n
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n
Accept-Language: en-us,en;q=0.5\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n
Keep-Alive: 115\r\n
Connection: keep-alive\r\n
Referer: http://www.amazon.com/\r\n
Cookie: session-id-time=2082787201l; session-id=182-8717797-2826126\r\n
66
Trace of HTTP request message client to a different server pda-as.amazon.com
Should we send the same two cookies?
Cookie domain was .amazon.com
Cookie path was /
Hypertext Transfer Protocol
GET /getad?site=amazon.us;pt=Gateway;slot=right-2;ef=0 HTTP/1.1\r\n
Host: pda-as.amazon.com\r\n
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23) Gecko/20110921 Ubuntu/10.04 (lucid) Firefox/3.6.23\r\n
Accept: */*\r\n
Accept-Language: en-us,en;q=0.5\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n
Keep-Alive: 115\r\n
Connection: keep-alive\r\n
Referer: http://www.amazon.com/aan/2009-09-09/static/amazon/iframeproxy-9.html\r\n
Cookie: session-id-time=2082787201l; session-id=182-8717797-2826126\r\n
67
Trace of HTTP request message client to a different server fls-na.amazon.com
Should we send the same two cookies?
Cookie domain was .amazon.com
Cookie path was /
Hypertext Transfer Protocol
[truncated] GET /1/display-ads- *** more!
Host: fls-na.amazon.com\r\n
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23) Gecko/20110921 Ubuntu/10.04 (lucid) Firefox/3.6.23\r\n
Accept: image/png,image/*;q=0.8,*/*;q=0.5\r\n
Accept-Language: en-us,en;q=0.5\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n
Keep-Alive: 115\r\n
Connection: keep-alive\r\n
Referer: http://www.amazon.com/aan/2009-09-09/static/amazon/iframeproxy-9.html\r\n
Cookie: session-id-time=2082787201l; session-id=182-8717797-2826126\r\n