McGraw-Hill©The McGraw-Hill Companies, Inc., 2004 Chapter 27 HTTP and WWW


Chapter 27

HTTP and WWW


HTTP

Hypertext Transfer Protocol (HTTP) is used mainly to access data on the World Wide Web.

It supports jumping from one document to another.

It functions like a combination of FTP and SMTP: it transfers files and uses the services of TCP.

It uses TCP port 80.

It transfers data between client and server.

HTTP information is read and interpreted by the HTTP server and the HTTP client.


Figure 27.1 HTTP Transaction

HTTP itself is a stateless protocol.

The client initializes the transaction by sending a request message. The server replies by sending a response.

There are two types of HTTP messages: request and response.


Figure 27.2 Request Message

Request line
    Request type
    Uniform Resource Locator (URL): Address of the web page.
        Method: Protocol used to retrieve the document.
        Host computer: Name of the computer where the information is located.
        Port: [Optional] Port number of the server.
        Path: Path name of the file where the information is located.
    Version: HTTP 1.1, 1.0, or 0.9.

Headers

Body
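The four URL parts listed above can be picked out of a URL string with Python's standard urllib.parse module. This is a minimal sketch: the host www.example.com, the port 8080, and the path are made-up values, and the function name url_parts is hypothetical.

```python
from urllib.parse import urlsplit

def url_parts(url):
    """Split a URL into the four parts named in the request-line description."""
    parts = urlsplit(url)
    return {
        "method": parts.scheme,   # protocol used to retrieve the document
        "host": parts.hostname,   # computer where the information is located
        "port": parts.port,       # None when the optional port is omitted
        "path": parts.path,       # path name of the file
    }

# Hypothetical URLs, one with the optional port and one without.
print(url_parts("http://www.example.com:8080/docs/index.html"))
print(url_parts("http://www.example.com/docs/index.html"))
```

When the port is omitted, the client falls back to the protocol's default, which is port 80 for HTTP.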


Figure 27.3 Request line

URL


Methods

The request method is the actual command or request that a client issues to the server.

GET: Client wants to retrieve a document from the server.
HEAD: Client wants information about a document and not the document itself.
POST: Client provides information to the server.
PUT: Client provides a document to the server.
PATCH: Similar to PUT, but carries only the differences that should be implemented in the existing file.
COPY: Copies a file to another location. The source is in the request line and the destination is in the entity header.
MOVE: Moves a file to another location.
DELETE: Removes a document from the server.
LINK: Creates a link or links from a document to another location.
UNLINK: Deletes links created by the LINK method.
OPTION: Used by the client to ask the server about available options (the method name is OPTIONS in HTTP/1.1).
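As a rough illustration of how a method ends up in a request message, the sketch below assembles the request line, the header lines, and the blank line that ends the headers. The helper build_request is hypothetical. The path and Accept values follow the document-retrieval example later in this chapter. A real HTTP/1.1 request would also carry a Host header, which is left out here to match that example.

```python
def build_request(method, path, headers, version="HTTP/1.1", body=""):
    """Assemble a request message: request line, header lines, blank line, body."""
    lines = [f"{method} {path} {version}"]
    # Each header is written as: header name, colon, space, header value.
    lines += [f"{name}: {value}" for name, value in headers]
    return "\r\n".join(lines) + "\r\n\r\n" + body

# A GET request has no body, so the message ends with the blank line.
request = build_request(
    "GET",
    "/usr/bin/image1",
    [("Accept", "image/gif"), ("Accept", "image/jpeg")],
)
print(request)
```

Changing the first argument to HEAD yields a request for information about the same document without the document itself.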


Figure 27.5 Response Message

Status line
    HTTP version
    Status code: Three-digit code, similar to the status codes in FTP and SMTP.
    Status phrase: Explains the status code in text form.

Header
    Exchanges additional information between client and server.
    Format: header name, colon, space, header value.

Body
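The response layout described above (status line, headers, body) can be taken apart with a few string operations. This is only a sketch: parse_response is a hypothetical helper, and the sample message uses invented header values.

```python
def parse_response(message):
    """Split a response message into status-line fields, headers, and body."""
    # The blank line separates the status line and headers from the body.
    head, _, body = message.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # Status line: HTTP version, three-digit status code, status phrase.
    version, code, phrase = status_line.split(" ", 2)
    headers = []
    for line in header_lines:
        # Header format: header name, colon, space, header value.
        name, _, value = line.partition(": ")
        headers.append((name, value))
    return version, int(code), phrase, headers, body

# Hypothetical response; the header values are made up for illustration.
sample = (
    "HTTP/1.1 200 OK\r\n"
    "Server: ExampleServer\r\n"
    "Content-length: 5\r\n"
    "\r\n"
    "hello"
)
version, code, phrase, headers, body = parse_response(sample)
print(version, code, phrase)
print(headers)
```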


Figure 27.6 Status Line

Header Format


Figure 27.8 Header Categories

General header: Gives general information about the message.
Request header: Specifies the client’s configuration and the client’s preferred document format.
Response header: Specifies the server’s configuration and special information about the request.
Entity header: Gives information about the body of the document. It appears mostly in response messages, but also in request messages that use the POST and PUT methods.

A request message has only general, request, and entity headers. A response message has only general, response, and entity headers.


Example 1

This example retrieves a document. We use the GET method to retrieve an image with the path /usr/bin/image1. The request line shows the method (GET), the URL, and the HTTP version (1.1). The header has two lines that show that the client can accept images in GIF and JPEG format. The request does not have a body. The response message contains the status line and four lines of header. The header lines define the date, server, MIME version, and length of the document. The body of the document follows the header (see Fig. 27.9, next slide).


Figure 27.9 Example 1


Example 2

This example retrieves information about a document. We use the HEAD method to retrieve information about an HTML document (see the next section). The request line shows the method (HEAD), URL, and HTTP version (1.1). The header is one line showing that the client can accept the document in any format (wild card). The request does not have a body. The response message contains the status line and five lines of header. The header lines define the date, server, MIME version, type of document, and length of the document (see Fig. 27.10, next slide). Note that the response message does not contain a body.


Figure 27.10 Example 2


Features of HTTP 1.1

Persistent connection
    This is the default option in HTTP 1.1.
    The server leaves the connection open for more requests after sending a response.
    The server can close the connection at the request of a client or if a timeout has been reached.
    Usually the length of the data is sent along with each response. When the length is not known, the server informs the client that the length is not known and closes the connection after sending the data, so the client knows that the end of the data has been reached.

Nonpersistent connection
    This is the behavior of HTTP 1.0.
    One TCP connection is made for each request/response.
    The client opens a TCP connection and sends a request.
    The server sends the response and closes the connection.
    The client reads the data until it encounters an end-of-file marker; the client then closes the connection.
    For N different images in different files, the connection must be opened and closed N times, which imposes high overhead on the server.
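The difference between the two ways of finding the end of a response can be sketched as follows. An in-memory byte stream stands in for the TCP connection (an assumption made so the sketch runs without a network), and read_body is a hypothetical helper.

```python
import io

def read_body(stream, content_length=None):
    """Read one response body from a byte stream.

    With a known length, read exactly that many bytes so the connection can
    stay open for the next response. Without one, read until end of file,
    which the server signals by closing the connection.
    """
    if content_length is not None:
        return stream.read(content_length)
    return stream.read()

# Persistent case: two bodies share one stream, separated only by their lengths.
stream = io.BytesIO(b"first" + b"second")
print(read_body(stream, 5), read_body(stream, 6))

# Length unknown: everything up to end of file is the body.
print(read_body(io.BytesIO(b"until close")))
```

A real socket may deliver fewer bytes per read than requested, so production code would loop until the full length has arrived.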


Proxy Server

HTTP supports proxy servers. A proxy server is a computer that keeps copies of responses to recent requests.

If a proxy server is present, the HTTP client sends a request to the proxy server, and the proxy server checks its cache.

If the response is not stored in the cache, the proxy server sends the request to the corresponding server.

Incoming responses are sent to the proxy server and stored for future requests from other clients.

A proxy server reduces the load on the original server, decreases traffic, and improves latency.
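The cache check described above can be sketched with a dictionary keyed by URL. The class ProxyServer, the function fake_origin, and the example URL are all hypothetical; a real proxy would also have to decide when a stored copy is too old to reuse, which this sketch ignores.

```python
class ProxyServer:
    """Keeps copies of responses to recent requests, keyed by URL."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # callable: url -> response
        self.cache = {}

    def handle(self, url):
        if url not in self.cache:
            # Cache miss: forward the request to the corresponding server
            # and store the incoming response for later requests.
            self.cache[url] = self.fetch_from_origin(url)
        return self.cache[url]

origin_hits = []

def fake_origin(url):
    """Stand-in for the original server; records each request it receives."""
    origin_hits.append(url)
    return f"response for {url}"

proxy = ProxyServer(fake_origin)
proxy.handle("http://www.example.com/a")  # miss, goes to the origin
proxy.handle("http://www.example.com/a")  # hit, served from the cache
print(len(origin_hits))
```

The second request never reaches the original server, which is where the reduced load and traffic come from.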


Figure 27.11 Distributed services

World Wide Web (WWW)

The WWW is a repository of information spread all over the world.

It offers a unique combination of flexibility, portability, and user-friendliness.

The WWW today is a distributed client-server service, in which a client using a browser can access a service using a server. However, the service provided is distributed over many locations called websites.


Figure 27.12 Hypertext

Documents are linked to one another using pointers.

Hypertext documents contain only text; hypermedia documents can also contain pictures, graphics, and sound.

A unit of hypertext or hypermedia available on the web is called a page. The main page for an organization or an individual is called a homepage.


Figure 27.13 Browser architecture

A browser has three parts: a controller, client programs, and interpreters.

Controller: Receives input from the keyboard or mouse and uses the client programs to access the document.

Interpreters: After the document has been accessed, the controller uses one of the interpreters (HTML or Java) to display the document on the screen.


Figure 27.14 Categories of Web documents

Static documents

Static documents are fixed-content documents that are created and stored in the server.

The client can get only a copy of the document.

The contents in the server can be changed, but the user cannot change them.


Figure 27.15 Static Document


Figure 27.16 Boldface Tags

HTML (Hypertext Markup Language)

HTML is a language for creating web pages.

Tags are instructions to the browser.

HTML allows us to embed formatting instructions in the file itself.

HTML lets us use only ASCII characters for both the main text and the formatting instructions.


Figure 27.18 Beginning and Ending Tags

Structure of a web page

Head
    The first part of a web page.
    Contains the title of the page and other parameters that the browser will use.

Body
    The actual contents of a page are in the body, which includes text and tags.
    Tags define the appearance of the document.

Tags
    Marks that are embedded into the text.
    Enclosed in two signs (< and >) and usually come in pairs.
    The beginning tag starts with the name of the tag, and the ending tag starts with a slash followed by the name of the tag.
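To see the pairing of beginning and ending tags in practice, the sketch below walks a small page with Python's standard html.parser module and records each tag and piece of text in order. The class TagLister and the sample page are hypothetical. The parser reports tag names in lowercase.

```python
from html.parser import HTMLParser

class TagLister(HTMLParser):
    """Records beginning tags, ending tags, and text in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("begin", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Skip whitespace-only text between tags.
        if data.strip():
            self.events.append(("text", data.strip()))

lister = TagLister()
lister.feed("<HTML><HEAD><TITLE>Sample</TITLE></HEAD><BODY>Hello</BODY></HTML>")
lister.close()
for event in lister.events:
    print(event)
```

Every beginning tag in this page has a matching ending tag, and the head is closed before the body begins.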


Table 27.1 Common tags

Beginning Tag    Ending Tag    Meaning

Skeletal Tags
<HTML>           </HTML>       Defines an HTML document
<HEAD>           </HEAD>       Defines the head of the document
<BODY>           </BODY>       Defines the body of the document

Title and Header Tags
<TITLE>          </TITLE>      Defines the title of the document
<Hn>             </Hn>         Defines a header at level n


Table 27.1 Common tags (continued)

Beginning Tag    Ending Tag    Meaning

Text Formatting Tags
<B>              </B>          Boldface
<I>              </I>          Italic
<U>              </U>          Underlined
<SUB>            </SUB>        Subscript
<SUP>            </SUP>        Superscript

Data Flow Tag
<CENTER>         </CENTER>     Centered
<BR>                           Line break


Table 27.1 Common tags (continued)

Beginning Tag    Ending Tag    Meaning

List Tags
<OL>             </OL>         Ordered list
<UL>             </UL>         Unordered list
<LI>             </LI>         An item in a list

Image Tag
<IMG>                          Defines an image

Hyperlink Tag
<A>              </A>          Defines an address (hyperlink)

Executable Contents
<APPLET>         </APPLET>     The document is an applet


Example 3

This example shows how tags are used to let the browser format the appearance of the text.

<HTML>
<HEAD>
<TITLE> First Sample Document </TITLE>
</HEAD>
<BODY>
<CENTER>
<H1> ATTENTION </H1>
</CENTER>
You can get a copy of this document by:
<UL>
<LI> Writing to the publisher
<LI> Ordering online
<LI> Ordering through a bookstore
</UL>
</BODY>
</HTML>


Example 4

This example shows how tags are used to import an image and insert it into the text.

<HTML>
<HEAD>
<TITLE> Second Sample Document </TITLE>
</HEAD>
<BODY>
This is the picture of a book:
<IMG SRC="Pictures/book1.gif" ALIGN=MIDDLE>
</BODY>
</HTML>


Example 5

This example shows how tags are used to make a hyperlink to another document.

<HTML>
<HEAD>
<TITLE> Third Sample Document </TITLE>
</HEAD>
<BODY>
This is a wonderful product that can save you money and time. To get information about the producer, click on
<A HREF="http://www.phony.producer"> Producer </A>
</BODY>
</HTML>


Figure 27.19 Dynamic Document

Dynamic documents do not exist in a predefined format. A dynamic document is created by a Web server whenever a browser requests the document.

When a request arrives, the Web server runs an application program that creates the dynamic document. The server returns the output of the program as a response to the browser that requested the document.

Because a fresh document is created for each request, the contents of a dynamic document can vary from one request to another.

An example is getting the date and time from the server.


Steps involved in handling dynamic documents:

The server examines the URL to find out whether it defines a dynamic document.

If the URL defines a dynamic document, the server executes the program.

The server sends the output of the program to the client (browser).
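The handling sequence above can be sketched as a small dispatch function. This is not how any particular Web server is implemented. The paths, the table DYNAMIC_PROGRAMS, and the function names are hypothetical, chosen only to show the decision between running a program and returning a stored file.

```python
from datetime import datetime

# Stored, fixed-content documents (hypothetical path and contents).
STATIC_DOCUMENTS = {"/index.html": "<HTML><BODY>Welcome</BODY></HTML>"}

def date_program():
    """Program run on each request; its output is the dynamic document."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

# Hypothetical convention: paths listed here name programs, not stored files.
DYNAMIC_PROGRAMS = {"/cgi-bin/date": date_program}

def handle_request(path):
    # Examine the URL path to see whether it defines a dynamic document.
    program = DYNAMIC_PROGRAMS.get(path)
    if program is not None:
        # Execute the program and send its output to the client.
        return program()
    # Otherwise return the stored copy of a static document.
    return STATIC_DOCUMENTS[path]

print(handle_request("/cgi-bin/date"))
print(handle_request("/index.html"))
```

Two requests for /cgi-bin/date made at different times return different contents, while /index.html always returns the same stored copy.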

Common Gateway Interface (CGI)

CGI is a technology that creates and handles dynamic documents.

CGI is a set of standards that defines how a dynamic document should be written, how input data should be supplied to the program, and how the output result should be used.

CGI programs can be written in C, C++, Perl, …

The term common in CGI indicates that the standard defines a set of rules that are common to any language or platform.

Gateway here means that a CGI program is a gateway that can be used to access other resources such as databases and graphics packages.

Interface means that there is a set of predefined terms, variables, calls, and so on that can be used in any CGI program.


Example 6

Example 6 is a CGI program written in Bourne shell script. The program accesses the UNIX utility (date) that returns the date and the time. Note that the program output is in plain text.

#!/bin/sh
# The head of the program
echo Content-type: text/plain
echo
# The body of the program
now=`date`
echo $now
exit 0


Example 7

Example 7 is similar to Example 6 except that program output is in HTML.

#!/bin/sh
# The head of the program
echo Content-type: text/html
echo
# The body of the program
# The tags are quoted so the shell does not treat < and > as redirections
echo "<HTML>"
echo "<HEAD><TITLE> Date and Time </TITLE></HEAD>"
echo "<BODY>"
now=`date`
echo "<CENTER> $now </CENTER>"
echo "</BODY>"
echo "</HTML>"
exit 0


Example 8

Example 8 is similar to Example 7 except that the program is written in Perl.

#!/bin/perl
# The head of the program
print "Content-type: text/html\n";
print "\n";
# The body of the program
print "<HTML>\n";
print "<HEAD><TITLE> Date and Time </TITLE></HEAD>\n";
print "<BODY>\n";
# Backticks run the date utility and capture its output
$now = `date`;
print "<CENTER> $now </CENTER>\n";
print "</BODY>\n";
print "</HTML>\n";
exit 0;


Figure 27.20 Active document

For active documents, a program must be run at the client side, for example to run animations.

When a browser requests an active document, the server sends a copy of the document in the form of byte code. The document is then run at the client (browser) site; the client can also store this document in its own storage area.

An active document is stored in binary code at the server.


Creation, compilation and execution

At the server site, a programmer writes a program in source code and stores it in a file.

The program is compiled into byte code and stored in a file. The path name of this file is the one used by a URL to refer to the file. In this file, each program command (statement) is in binary form, and each identifier (variable, constant, function name, and so on) is referred to by a binary offset address.

The client (browser) requests a copy of the binary code, which is probably transported in compressed form from the server to the client (browser).

The client (browser) uses its own software to change the binary code into executable code. The software links all the library modules and makes the program ready for execution.

The client (browser) runs the program and creates the result, which can include animation or interaction with the user.


Java

Java is a combination of a high-level programming language, a run-time environment, and a class library that allows a programmer to write an active document (an applet) and a browser to run it.

Java can also be a stand-alone program without using a browser.

Java is an object-oriented language like C++ without operator overloading or multiple inheritance.

Java is platform-independent and does not use pointer arithmetic.

Because Java is an object-oriented language, a programmer defines a set of objects and a set of operations (methods) to operate on those objects.

Java is a typed language which means that the programmer must declare the type of any piece of data before using it.

Java is also a concurrent language, which means the programmer can use multiple threads to create concurrency.


Classes and Objects

An object is an instance of a class that uses methods (procedures or functions) to manipulate encapsulated data.

Inheritance

Inheritance defines a hierarchy of objects, in which one object can inherit data and methods from other objects.

In Java, we can define a class as the base class that contains data and methods common to many classes.

Inherited classes can inherit these data and methods and can also have their own data and methods.

Packages

Java has a rich library of classes, which allows the programmer to create and use different objects in an applet.


Figure 27.21 Skeleton of an applet

An applet is an active document written in Java. It is actually the definition of a publicly inherited class, which inherits from the Applet class defined in the java.applet library.

The programmer can define private data and public and private methods in this definition.

The client process (browser) creates an instance of this applet. The browser then uses the public methods defined in the applet to invoke private methods or to access data.


Figure 27.23 Creation and compilation

Use an editor to create a Java source file. The name of the file is the same as the name of the publicly inherited class, with the “java” extension.

The Java compiler creates the bytecode for the file, with the “class” extension.

This creates an applet that can be run by a browser.


Figure 27.24 HTML document carrying an Applet

To use an applet, an HTML document is created and the name of the applet is inserted between the <APPLET> tags.

The tag also defines the size of the window used for the applet.


Example 9

In this example, we first import two packages, java.awt and java.applet. They contain the declarations and definitions of classes and methods that we need. Our example uses only one publicly inherited class called First. We define only one public method, paint. The browser can access the instance of First through the public method paint. The paint method, however, calls another method called drawString, which is defined in java.awt.*.

import java.applet.*;
import java.awt.*;

public class First extends Applet
{
    public void paint (Graphics g)
    {
        g.drawString ("Hello World", 100, 100);
    }
}


Example 10

In this example, we modify the program in Example 9 to draw a line. Instead of method drawString, we use another method called drawLine. This method needs four parameters: the x and y coordinates at the beginning of the line and the x and y coordinates at the end of the line. We use 0, 0 for the beginning and 80, 90 for the end.

import java.applet.*;
import java.awt.*;

public class Second extends Applet
{
    public void paint (Graphics g)
    {
        g.drawLine (0, 0, 80, 90);
    }
}
