HTML and HTTP

Based on Computer Networks and Internets, Comer

CSIT 320 (Blum)

Hypertext

HTML stands for HyperText Markup Language and HTTP stands for HyperText Transfer Protocol, so that raises the question: what is hypertext?

Hypertext is “a method of storing data through a computer program that allows a user to create and link fields of information at will and to retrieve the data non-sequentially.” (Webster’s)

A hyperlink is a region of one document (page) that, when clicked, brings up another document for the user.

The term "hypertext" was coined by Ted Nelson in the 1960s.

URL

The “resources” (data or program files) are located on many computers across an internet or the Internet; hence this is a “distributed” system.

The location of a resource is given by its URL (Uniform Resource Locator).
◦ http://www.lasalle.edu:1234/it/fake.htm#attach
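
As a small illustration of how the pieces of a URL are distinguished, the sketch below splits the example URL from this slide using Python's standard `urllib.parse` module. The variable name `parts` is only for illustration.

```python
from urllib.parse import urlsplit

# The example URL from the slide, split into its components.
parts = urlsplit("http://www.lasalle.edu:1234/it/fake.htm#attach")

print(parts.scheme)    # protocol used to reach the resource
print(parts.hostname)  # machine that holds the resource
print(parts.port)      # port the server listens on
print(parts.path)      # location of the file on that machine
print(parts.fragment)  # named position within the document
```

The port is optional in most URLs; when it is omitted, the client falls back to the default port for the scheme.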

Browser

Hypertext is generally viewed in a web browser, an application used to locate web pages (linked or otherwise) and display them.

Some browsers, such as Lynx, display only text.

But when most people think of browsers they think of Netscape Navigator and/or Microsoft Internet Explorer, which support more than just text.

Hypermedia

Modern browsers link information in non-textual formats (graphics, sound, video, etc.) and so are “multimedia” or “hypermedia” programs.

The browser may need a plug-in to support some formats. A plug-in adds a particular feature or service to a larger system.

Browser plug-ins are associated with MIME types: the MIME type of a resource tells the browser which plug-in should handle it.

Mosaic

The first widely used multimedia browser was Mosaic.

Marc Andreessen is credited with initiating the development of Mosaic.

Mosaic moved the Internet out of the realm of academics and computer hobbyists by making it accessible to a much more general audience.

It helped the Internet maintain its exponential growth in number of users.

Fig. 2.1: Computers connected to the Internet vs. year (plot annotation: Mosaic)

Mosaic (Cont.)

Andreessen started Mosaic while working for the National Center for Supercomputing Applications (NCSA) at the University of Illinois.

Andreessen helped found Netscape Communications, which was originally called Mosaic Communications.

Mosaic is distinct from Netscape. In fact, Mosaic is also licensed for commercial use and is provided to users by some Internet access providers.

HTML

Browsers interpret web documents, especially HTML documents.

HyperText Markup Language is an “authoring” scheme for creating documents for the World Wide Web.

The World Wide Web (WWW) is the collection of resources available through HTTP to users on the Internet.

Markup

The M in HTML stands for “Markup”.

Markup refers to the sequence of characters (or symbols) inserted in a document to indicate how the file should look when it is printed or displayed and/or to describe the document's logical structure.

The markup indicators are often called "tags."

HTML5 is the latest version

Tags

These formatting instructions must be distinguishable from the text they are in.

In HTML, angle brackets < and > are used as delimiters to indicate the beginning and end of a tag.
◦ This gives <b>bold</b> type.

As with the byte stuffing we saw in frames (where soh and eot were special characters), angle brackets that are meant to appear as text in an HTML document must be replaced with &lt; and &gt;
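
The replacement of literal angle brackets can be sketched with Python's standard `html` module. The sample string `raw` is made up for illustration; `html.escape` also replaces `&`, since that character introduces the escape sequences themselves.

```python
import html

# Text that contains angle brackets meant to be displayed, not interpreted.
raw = "if a < b then <b>bold</b>"

# Replace the special characters so a browser shows them literally.
escaped = html.escape(raw, quote=False)
print(escaped)

# Reversing the substitution recovers the original text.
restored = html.unescape(escaped)
print(restored == raw)
```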

Tags (Cont.)

The formatting or structure the tag indicates often refers to an entire region, so many HTML tags occur in pairs (leading and trailing). The trailing tag includes a slash.

An HTML document begins with an <HTML> tag and ends with an </HTML> tag.

An HTML document is broken into two pieces: the head and the body.
◦ The head is the part between the head tags <head> and </head>
◦ The body is the part between the body tags <body> and </body>
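
To make the pairing of leading and trailing tags concrete, here is a minimal sketch using Python's standard `html.parser` module. The class name `TagLogger` and the tiny sample document are assumptions made for illustration; real browsers do far more than this.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Records every leading and trailing tag in the order encountered."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        self.events.append(("close", tag))


# A made-up document with a head and a body.
doc = "<HTML><head><title>Demo</title></head><body><p>Hello</p></body></HTML>"

logger = TagLogger()
logger.feed(doc)
logger.close()

for kind, tag in logger.events:
    print(kind, tag)
```

The parser reports tag names in lowercase, so `<HTML>` and `<html>` are treated alike.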

Cascading Style Sheet

An HTML document can be written to work in conjunction with a CSS file, a cascading style sheet.

A cascading style sheet separates out instructions about look and layout so that they can be reused, that is, referred to many times, either within the same document or even by different documents.

HTML (Cont.)

There are many other tags used to format and lay out the information in a Web page.

For instance, <p>…</p> is used to make paragraphs and <i>…</i> is used to italicize text.

Tags are also used to specify hypertext links.
◦ <a href="http://www.lasalle.edu">La Salle</a>

HTML is not the only markup language.

SGML

HTML has similarities to SGML, Standard Generalized Markup Language, a generic system for organizing and tagging elements of a document.

GML was started by IBM and became SGML when it was taken over by the International Organization for Standardization (ISO).

SGML is not about formatting; it is more general. SGML provides rules for tagging elements.

Those tags might be interpreted as formatting, as is done in HTML, but can be interpreted in other ways as well.

XML: eXtensible Markup Language

“Extensible” means capable of being extended, and markup language involves tags, so XML is a scheme in which the user can define his or her own tags.

For example, a company may elect to designate a social security number by placing it in tags defined for that purpose.
◦ <ssn>123456789</ssn>

This data can be transported from application to application and system to system, carrying a self-identifying tag with it.

XML (Cont.)

Unlike HTML tags, XML tags are not necessarily about formatting and presentation.

However, a presentation application can be instructed to represent a certain type of data (as identified by its XML tags) in a particular way.

On the other hand, a database interface program can be instructed to place the information into the appropriate field.
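
The idea that the same tagged data can feed both a presentation program and a database interface can be sketched with Python's standard `xml.etree.ElementTree` module. The `<ssn>` tag comes from the slides; the enclosing `<employee>` element, the `<name>` tag, and the dashed display format are hypothetical choices made only for this example.

```python
import xml.etree.ElementTree as ET

# A made-up record that carries self-identifying tags.
record = ET.fromstring(
    "<employee><name>Pat</name><ssn>123456789</ssn></employee>"
)

# A "presentation" use: display the value identified by the ssn tag
# in a particular way.
ssn = record.findtext("ssn")
formatted = f"{ssn[:3]}-{ssn[3:5]}-{ssn[5:]}"
print(formatted)

# A "database interface" use: place each value in the field named by its tag.
row = {child.tag: child.text for child in record}
print(row)
```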

HTTP

HTML and other web documents are sent across the network using HTTP, the HyperText Transfer Protocol, which was originally developed by Dr. Tim Berners-Lee.
◦ It was developed while he worked at CERN, a center for particle physics, so that scientists from all over the world could share information.

HTTP defines rules for how messages are formatted and transmitted, what actions are allowed by Web servers, what actions are allowed by clients, etc.

HTTP message
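
The slide's figure is not reproduced here, so the following is only a rough sketch of the general layout of HTTP/1.1 messages: a start line, header lines, a blank line, and an optional body, with lines ending in CR LF. The helper names `build_request` and `parse_response` and the sample response bytes are assumptions made for illustration, not part of the original slides.

```python
def build_request(host, path):
    """Build a minimal HTTP/1.1 GET request as bytes."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, path, version
        f"Host: {host}",          # header naming the server
        "Connection: close",      # ask the server to close when done
        "",                       # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw):
    """Split a raw response into status code, reason, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


request = build_request("www.lasalle.edu", "/it/fake.htm")
print(request.decode("ascii"))

# A hand-written response used only to exercise the parser.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>Hello</p>\n"
)
code, reason, headers, body = parse_response(sample)
print(code, reason, headers["content-type"], len(body))
```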

HTTP

A Web server has an HTTP daemon that waits for HTTP requests and handles them when they arrive.

A Web browser is an HTTP client, sending requests to server machines.

For example, entering a URL in the location field of a browser (client) sends an HTTP request to the appropriate Web server, which responds with the page.
◦ Of course, if a domain name is entered, the browser may first have to ask a DNS server for the corresponding IP address.

HTTP

HTTP is a stateless protocol. Each command is executed without knowing anything about any preceding commands.
◦ This is good for keeping transmission lines available, since there are no ongoing sessions tying up resources.
◦ This is bad for having a web site respond in an intelligent way to a user.

This problem of HTTP is addressed in a number of ways, including ActiveX, Java, JavaScript (AJAX) and cookies.
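
Of the workarounds listed, cookies are the easiest to sketch. In the toy example below, the server hands the client an identifier and the client returns it with each later request, which lets the server recognize a returning visitor even though each request is handled on its own. It uses Python's standard `http.cookies` module; the `sessions` dictionary, the `handle_request` function, and the `session_id` cookie name are hypothetical.

```python
from http.cookies import SimpleCookie

# Server-side memory: visit counts keyed by session identifier.
sessions = {}


def handle_request(cookie_header):
    """Handle one request; return (visit count, Set-Cookie header line)."""
    jar = SimpleCookie()
    if cookie_header:
        jar.load(cookie_header)

    if "session_id" in jar and jar["session_id"].value in sessions:
        sid = jar["session_id"].value        # returning visitor
    else:
        sid = f"s{len(sessions) + 1}"        # new visitor: issue an id
        sessions[sid] = 0

    sessions[sid] += 1
    reply = SimpleCookie()
    reply["session_id"] = sid
    return sessions[sid], reply.output(header="Set-Cookie:")


# First request carries no cookie; the second sends back the issued one.
first_count, first_header = handle_request(None)
second_count, _ = handle_request("session_id=s1")
# A request without the cookie looks like a brand-new visitor.
stranger_count, _ = handle_request(None)
print(first_count, second_count, stranger_count)
```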

HTTP 1.1

Modern browsers support HTTP 1.1.

Instead of opening and closing a connection for each application request, HTTP 1.1 provides a persistent connection that allows multiple requests to be batched or pipelined to an output buffer.

The underlying TCP layer can put multiple requests (and responses to requests) into one TCP segment.

Fewer segments, less overhead.

HTTP 1.1 (Cont.)

Compression: If a browser (client) indicates that it can decompress HTML files, then a server compresses them for transport across the Internet.

Standard image files are already in a compressed format, so this improvement applies only to HTML and other non-image data types.
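
A rough sketch of this negotiation, assuming the client advertises support through an `Accept-Encoding` header and the server marks compressed replies with `Content-Encoding`. The `respond` function and the sample data are made up for illustration. Random bytes stand in for an already-compressed image, since such data has little redundancy left to remove.

```python
import gzip
import os


def respond(body, accept_encoding):
    """Compress the body only if the client says it can handle gzip."""
    offered = [e.strip().split(";")[0] for e in accept_encoding.split(",")]
    if "gzip" in offered:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}


# Repetitive markup compresses well.
html_doc = ("<p>" + "hypertext " * 200 + "</p>").encode("ascii")

sent, headers = respond(html_doc, "gzip, deflate")
print(len(html_doc), "->", len(sent), headers)

# A client that does not advertise gzip gets the body unchanged.
plain, plain_headers = respond(html_doc, "")

# Stand-in for an already-compressed image: compressing again does not help.
image_like = os.urandom(4096)
recompressed = gzip.compress(image_like)
print(len(image_like), "->", len(recompressed))
```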

S-HTTP

Secure HTTP (S-HTTP) is an extension to the HTTP protocol for sending data securely over the Web.

Not all browsers and servers support S-HTTP.

Another technology for secure communications over the Web is Secure Sockets Layer (SSL).

SSL and S-HTTP have different designs and goals. SSL is designed to establish a secure connection between two computers, whereas S-HTTP is designed to send individual messages securely.

Cache

To increase speed, browsers cache web page documents locally.

There are also cache servers, machines on the local network that cache web page documents.

First, the page is looked for on the local machine, then on the local network (cache server) and then at the remote location.

Refresh if you don’t want the cached version.
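
The lookup order described above can be sketched with plain dictionaries standing in for the browser's local cache, the cache server, and the remote site. The `fetch` function, its `refresh` flag, and the sample page contents are hypothetical; real caches also track expiry times, which this sketch ignores.

```python
def fetch(url, local_cache, cache_server, remote, refresh=False):
    """Return (page, where it was found), checking the nearest copy first."""
    if not refresh:
        if url in local_cache:
            return local_cache[url], "local machine"
        if url in cache_server:
            local_cache[url] = cache_server[url]   # keep a local copy
            return local_cache[url], "cache server"
    # Not cached (or refresh requested): go to the remote location.
    page = remote[url]
    cache_server[url] = page
    local_cache[url] = page
    return page, "remote location"


url = "http://www.lasalle.edu/"
local, shared, remote = {}, {}, {url: "<html>v1</html>"}

first = fetch(url, local, shared, remote)
second = fetch(url, local, shared, remote)

# The remote page changes, but the cached copy is still served...
remote[url] = "<html>v2</html>"
stale = fetch(url, local, shared, remote)
# ...until the user refreshes.
fresh = fetch(url, local, shared, remote, refresh=True)

# A second machine on the same network is served by the cache server.
other_machine = fetch(url, {}, shared, remote)
print(first, second, stale, fresh, other_machine)
```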

FTP

Based on Computer Networks and Internets, Comer

FTP

File Transfer Protocol is a set of rules for moving files around on an internet or the Internet.

One common use of FTP is to move web-page files from the computer on which they were created to the web server where they are accessible to people on the web.

Another common use is to download programs and other files to one’s computer.

One can also download files using HTTP, but FTP is often regarded as faster for large transfers.

Versions

There is a command-line version of FTP.
◦ This is a fairly standard utility, but the user must know a set of commands to use it.
◦ A user can put a file into a directory at a remote location or get a file from there.

There is also a GUI version.
◦ This version is easier to operate (with its listboxes, scrollbars and buttons).
◦ Typically it must be downloaded.

One can also use a browser to get files from FTP sites.

Access and Capability

Access to FTP services typically requires authenticating the user (username and password).

In such cases, the user can typically delete, rename and move files, and so on, in addition to copying them.

Anonymous FTP does not authenticate a user but allows the user to do less; typically one can only get files.
◦ It is used as a means to distribute files.

Anonymous FTP

In anonymous FTP, one enters "anonymous" for the username.

The password may not matter, or the server may request an email address, or in old versions the password may be “guest”.

This is a way of giving the public access to a server so that files can be downloaded.

Data and Control

The local machine must have an FTP client.

The remote machine must have an FTP server.

Transferring a file using FTP actually consists of two connections.

An FTP daemon listens at TCP port 21.
◦ (UDP has its own set of ports.)

Port 21 is for initiating a control connection.

Figure (labels: Data, Control)

Data and Control

Before a transfer, the client sends a control message that includes the port number at which the client expects to receive data.

The server’s port 20 initiates a data connection to that port on the client.

The control connection indicates what files will be transferred in which direction; the actual transferring takes place on the data connection.

There is one control connection during an FTP session, but each data connection closes when its transfer is complete; thus an FTP session may have several data connections.
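
In the FTP specification (RFC 959) the control message that carries the client's data port is the PORT command, whose argument is six decimal numbers: the four bytes of the IP address followed by the high and low bytes of the port. The sketch below encodes and decodes that argument. The helper names and the sample address and port are assumptions for illustration.

```python
def encode_port_command(ip, port):
    """Build a PORT command announcing where the client will accept data."""
    high, low = divmod(port, 256)          # split the 16-bit port into bytes
    return f"PORT {','.join(ip.split('.'))},{high},{low}"


def decode_port_command(command):
    """Recover (ip, port) from a PORT command, as the server would."""
    fields = command.split(" ", 1)[1].split(",")
    ip = ".".join(fields[:4])
    port = int(fields[4]) * 256 + int(fields[5])
    return ip, port


command = encode_port_command("192.168.1.10", 50000)
print(command)
print(decode_port_command(command))
```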

FTP Client and Server

Command-line FTP: Start/Run/cmd

For older operating systems, use command instead of cmd.

Command-line FTP: ftp <domain name>

Enter the username and password; the password is typically not echoed to the screen.

Command-line FTP: ls

Shows the contents of the current directory (folder).

Command-line FTP: cd <directory name>

Moves one into the specified folder on the remote machine.

Command-line FTP: wildcard

* is the wildcard; it stands in for anything that might follow. In this case we are listing any files that begin with f.

Command-line FTP: wildcard

* is the wildcard; it stands in for anything that might precede. In this case we are listing any files that end with .jpg.
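
The two wildcard patterns can be tried out locally with Python's standard `fnmatch` module, which implements the same style of `*` matching. In a real session the matching is done by the FTP client or server, not by this code, and the file names below are made up.

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing.
names = ["fake.htm", "figure1.jpg", "notes.txt", "photo.jpg"]

# "f*" : an f followed by anything.
starts_with_f = [n for n in names if fnmatchcase(n, "f*")]

# "*.jpg" : anything followed by .jpg.
ends_with_jpg = [n for n in names if fnmatchcase(n, "*.jpg")]

print(starts_with_f)
print(ends_with_jpg)
```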

Command-line FTP: get <filename>

Transfers a copy of a remote file to the local machine.

Overwriting

Most versions of FTP simply overwrite a file of the same name when one uses the get or put commands.

Unlike many applications, the user will not be given a warning that he or she is about to overwrite a file.

Command-line FTP: put <filename>

Places a copy of a local file onto a remote computer.

Command-line FTP: binary

By default, get and put assume files are in ASCII; the binary command switches the mode to binary for transferring other types of files.

While the first get looks like it worked, the PowerPoint file could not be opened; the second get provided a usable ppt file.

Command-line FTP: ascii

Puts FTP back into ASCII mode.

Htm file transferred in ASCII mode

Htm file transferred in Binary mode

“Returns” in the original document can be lost or replaced with unprintable characters.
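
Why the mode matters can be shown with a simplified simulation. ASCII mode translates line endings between the sender's convention and the receiver's (using CR LF on the wire), which is helpful for text but alters any other file that happens to contain those byte values. The functions below are a toy model, not an FTP implementation, and the eight bytes of the PNG file signature are used as a hypothetical stand-in for a non-text file such as the PowerPoint file mentioned earlier.

```python
def ascii_transfer(data, sender_newline, receiver_newline):
    """Toy model of ASCII mode: newlines become CR LF on the wire,
    then the receiver's own newline convention on arrival."""
    wire = data.replace(sender_newline, b"\r\n")
    return wire.replace(b"\r\n", receiver_newline)


def binary_transfer(data):
    """Toy model of binary mode: bytes arrive exactly as sent."""
    return data


# Text file sent from a machine using LF to one using CR LF: still readable.
text = b"line1\nline2\n"
text_received = ascii_transfer(text, b"\n", b"\r\n")
print(text_received)

# Non-text data that happens to contain CR LF bytes.
png_signature = b"\x89PNG\r\n\x1a\n"
damaged = ascii_transfer(png_signature, b"\r\n", b"\n")
intact = binary_transfer(png_signature)
print(len(png_signature), len(damaged), intact == png_signature)
```

In the other direction, a text file sent in binary mode keeps the sender's line endings, which is why "returns" can appear lost or show up as odd characters on the receiving system.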

FTP commands

Passive FTP

Passive FTP is a more secure form of data transfer in which the flow of data is set up and initiated by the File Transfer Protocol (FTP) client rather than by the FTP server program.

FTP client programs sometimes allow the user to select passive FTP.

Most Web browsers (which act as FTP clients) use passive FTP by default.

Passive FTP

Recall that FTP consists of two connections. In normal FTP the client initiates the control connection, but the server establishes the data connection.

Some networks have firewalls that only allow connections that were initiated from within; this would rule out the data connection of a normal FTP session.

“Normal” vs Passive FTP

Normal: The client initiates the control connection and gives a port number to the server, which then initiates the data connection.

Passive: The client initiates the control connection and asks the server to return, over the control connection, the port it intends to use for data; the client then initiates a data connection using the port number supplied by the server.
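
In the FTP specification (RFC 959) the passive request is the PASV command, and the server answers with a 227 reply that carries its address and data port as six comma-separated numbers. A minimal sketch of how a client might extract the port from such a reply follows; the function name and the sample reply text are assumptions, and the wording around the numbers varies between servers.

```python
import re


def parse_pasv_reply(reply):
    """Extract (ip, port) from a '227 Entering Passive Mode (...)' reply."""
    match = re.search(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)", reply)
    if match is None:
        raise ValueError("no address found in reply")
    numbers = [int(n) for n in match.groups()]
    ip = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]   # high byte, then low byte
    return ip, port


# Hypothetical reply received over the control connection.
reply = "227 Entering Passive Mode (192,168,1,20,19,137)"
server_ip, data_port = parse_pasv_reply(reply)
print(server_ip, data_port)
```

The client would then open the data connection itself to `server_ip` on `data_port`, which is what lets the transfer pass a firewall that only permits connections initiated from inside.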

TFTP

Trivial File Transfer Protocol (TFTP) is a simple version of FTP that uses the User Datagram Protocol (UDP) instead of TCP.
◦ It is simpler, faster and requires less code.
◦ But it is less capable and less secure.

It is used where user authentication and directory visibility are not required.

It is often used by servers to boot diskless workstations, X-terminals, and routers.
◦ Diskless workstations need operating systems too.
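
To give a sense of how little code TFTP needs, the sketch below builds two of its packets following the layout in RFC 1350: a read request (opcode 1, then the file name and transfer mode, each ending in a zero byte) and an acknowledgment (opcode 4, then a block number). Nothing is sent over the network here. The function names and the file name `boot.img` are hypothetical.

```python
import struct


def build_read_request(filename, mode="octet"):
    """Build a TFTP read request (RRQ) packet."""
    return (
        struct.pack("!H", 1)              # opcode 1 = read request
        + filename.encode("ascii") + b"\x00"
        + mode.encode("ascii") + b"\x00"
    )


def build_ack(block_number):
    """Build a TFTP acknowledgment for one received data block."""
    return struct.pack("!HH", 4, block_number)   # opcode 4 = ACK


rrq = build_read_request("boot.img")
ack = build_ack(1)
print(rrq)
print(ack)
```

Note that the request carries no username or password, consistent with the point that TFTP is used where authentication is not required.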

Other References

Computer Networks and Internets, Comer
http://www.webopedia.com
http://www.whatis.com
http://www.uic.edu/depts/accc/network/ftp/vftp.html
http://www.w3.org/TR/REC-html40/struct/global.html