WWW, HTML, HTTP,...

How does the World Wide Web work?

© [email protected]

Why the hype?

- Prerequisite: a worldwide network
    - The Internet already existed before the WWW: email, FTP, Gopher, ...
- Anyone can find information with practically no effort
    - Surfing (information is linked)
    - Searching (searchable directories)
- Publishing information
    - Simple tools: browsers, editors
    - Multimedia
- No central control or censorship
    - Anyone can publish anything


As we may think

The investigator is staggered by the findings and conclusions of thousands of other workers -- conclusions which he cannot find time to grasp, much less to remember, as they appear. [...]

Professionally our methods of transmitting and reviewing the results of research are generations old and by now are totally inadequate for their purpose. [...]

Mendel's concept of the laws of genetics was lost to the world for a generation because his publication did not reach the few who were capable of grasping and extending it [...]

A record if it is to be useful to science, must be continuously extended, it must be stored, and above all it must be consulted. [...]

When data of any sort are placed in storage, they are filed alphabetically or numerically, and information is found (when it is) by tracing it down from subclass to subclass. [...]

The human mind does not work that way. It operates by association [...]

If the user wishes to consult a certain book, he taps its code on the keyboard, and the title page of the book promptly appears before him, projected onto one of his viewing positions. [...]

This is the essential feature of the memex. The process of tying two items together is the important thing. [...]

Wholly new forms of encyclopedias will appear, ready made with a mesh of associative trails


Information

- Everyone needs information
- Information must exist
    - Documents, books, journals, ...
- ... and it must be found (knowledge management)
    - Catalogues
    - Hierarchies
        - not always unambiguous
    - Metadata
        - which will be important (now and later)?
    - Classifications (keywords)
        - Everyone uses their own!


Order, first attempt: Gopher


Second attempt: Udine, HyperG, ...

At the end of the 1980s many hypertext information systems emerged, mostly hierarchically structured, which tried to overcome the weaknesses of Gopher and others.


And chaos: T. Berners-Lee

The actual observed working structure of the organisation is a multiply connected "web" whose interconnections evolve with time. In this environment, a new person arriving, or someone taking on a new task, is normally given a few hints as to who would be useful people to talk to. Information about what facilities exist and how to find out about them travels in the corridor gossip and occasional newsletters, and the details about what is required to be done spread in a similar way. [...]

A problem, however, is the high turnover of people. When two years is a typical length of stay, information is constantly being lost. [...] Often, the information has been recorded, it just cannot be found. [...]

CERN is a model in miniature of the rest of world in a few years time. CERN meets now some problems which the rest of the world will have to face soon [...]

the method of storage must not place its own restraints on the information

This is why a "web" of notes with links (like references) between them is far more useful than a fixed hierarchical system. The system we need is like a diagram of circles and arrows, where circles and arrows can stand for anything.


Requirements

- Remote access across networks: CERN is distributed, and access from remote machines is essential.
- Heterogeneity: Access is required to the same data from different types of system (VM/CMS, Macintosh, VAX/VMS, Unix).
- Non-centralisation: Information systems start small and grow. They also start isolated and then merge. A new system must allow existing systems to be linked together without requiring any central control or coordination.
- Access to existing data: If we provide access to existing databases as though they were in hypertext form, the system will get off the ground quicker.
- Private links: One must be able to add one's own private links to and from public information. One must also be able to annotate links, as well as nodes, privately.


Information


Architecture


Integration


WWW

- The protocols and tools developed at CERN were quickly accepted worldwide, especially once graphical browsers (NCSA Mosaic) were developed
- Why?
    - Anyone can publish and integrate documents with minimal effort
    - The protocols are so simple (primitive) and portable that any system can be connected
    - Links are private, i.e. they are not registered centrally
        - unidirectional
        - bidirectional links (e.g. HyperG) did not catch on
        - the broken-link problem
    - No central structure
        - lost in hyperspace
        - relevance of information (altavista vs. google, ...)


Architecture

- How does it actually work?
    - Presentation: HTML
    - Transfer: HTTP

Browser  <-- TCP/IP -->  Web server  <-->  Files ???


HTML

- Markup language
- Tags
    - Document: <HEAD>...</HEAD>, <HTML></HTML>, ...
    - Metadata: <META>, <TITLE>, ... (the author is given via <META NAME="author">, not a tag of its own)
    - Structure: <h1>, <ul>, <p>
    - Links: <a href="http://www.tillh.de">home</a>
- Tags represent the semantics of the document
- Tags contain the links
- Presentation is left to the browser
- Today presentation is also done in HTML
    - Formatting: <FONT>, ...
    - Logic: JavaScript, DHTML, ...


HTML document

    <HTML>
    <HEAD><TITLE>A nice document</TITLE></HEAD>
    <BODY>
    <h1>A heading</h1>
    Some text
    <P> A paragraph
    <IMG SRC="bunt.gif"> An image
    <A HREF="http://www.interessant.de">click here</A>
    <P> A list
    <UL>
    <LI> first </LI>
    <LI> second </LI>
    <LI> third </LI>
    </UL>
    </BODY>
    </HTML>
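That tags carry the links of a document can be checked with a few lines of Python. This is only a sketch using the standard library's `html.parser`; the class name `LinkCollector` and the shortened sample page are made up for illustration and are not part of the original slides.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the HREF targets of all <A> tags in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser converts tag and attribute names to lower case,
        # so <A HREF=...> and <a href=...> are treated alike.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


# Shortened sample page in the style of the slide (assumption).
page = (
    "<HTML><BODY><h1>A heading</h1>"
    '<A HREF="http://www.interessant.de">click here</A>'
    "</BODY></HTML>"
)

collector = LinkCollector()
collector.feed(page)
print(collector.links)
```

A browser does far more than this, but the principle is the same: the link target is plain text inside the tag, nothing is registered anywhere else.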


Tables

    <HTML>
    <HEAD><TITLE>A nice document</TITLE></HEAD>
    <BODY>
    <h1>A heading</h1>
    Some text
    <TABLE BORDER=3>
    <TR><TD>Column 1</TD><TD>Column 2</TD><TD>Column 3</TD></TR>
    <TR><TD COLSPAN=2>Columns 1 and 2</TD><TD>Column 3</TD></TR>
    <TR><TD COLSPAN=3>pretty wide</TD></TR>
    </TABLE>
    </BODY>
    </HTML>


HTML

- Originally intended only for structuring documents
- Today also used for formatting
    - difficult, because there is no positioning, ...
    - done mainly through
        - fonts, colours
        - tables
        - images
    - browser-dependent
- Style sheets
- A kludge --> PDF, XML, ...


HTTP

- How do the documents get to the browser?
- Originally: delivery of files
    - hence a simple, stateless protocol
- Structure
    - The browser requests a page (GET)
    - The web server reads the file and sends it back
    - Done
- No login, ...
- A new connection is set up for every request
- Simple!
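The request/response cycle described here fits into one small function. The following Python sketch is an illustration only, not how Apache or any real server is written: `handle_get` is a made-up name, and a dictionary stands in for the files on disk so that the example runs without a network.

```python
def handle_get(request: bytes, files: dict) -> bytes:
    """Answer a single request from an in-memory 'file system' (assumption)."""
    # The first line of the request has the form: METHOD PATH VERSION
    request_line = request.split(b"\r\n", 1)[0].decode("ascii")
    method, path, _version = request_line.split()
    if method != "GET":
        return b"HTTP/1.0 501 Not Implemented\r\n\r\n"
    content = files.get(path)
    if content is None:
        return b"HTTP/1.0 404 Not Found\r\n\r\n"
    # Status line, headers, an empty line, then the file itself.
    head = (
        "HTTP/1.0 200 OK\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(content)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + content


files = {"/mini.html": b"<html>Hallo</html>"}
response = handle_get(b"GET /mini.html HTTP/1.0\r\nHost: dbserv\r\n\r\n", files)
print(response.decode("ascii"))
```

The function keeps no state between calls, which is exactly what "stateless" means: every request must carry everything the server needs to answer it.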


Request ...

    GET /mini.html HTTP/1.0
    Connection: Keep-Alive
    User-Agent: Mozilla/4.51 [de]C-CCK-MCD DT (WinNT; I)
    Host: dbserv
    Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
    Accept-Encoding: gzip
    Accept-Language: de
    Accept-Charset: iso-8859-1,*,utf-8


... and response

    HTTP/1.1 200 OK
    Date: Thu, 17 May 2001 09:12:50 GMT
    Server: Apache/1.3.12 (Unix) (SuSE/Linux)
    Last-Modified: Thu, 17 May 2001 09:05:17 GMT
    ETag: "3aed8-40-3b03944d"
    Accept-Ranges: bytes
    Content-Length: 64
    Connection: close
    Content-Type: text/html
    X-Pad: avoid browser bug

    <html><head><title>Hallo</title><body>Hallo</body></html>
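A response like this one can be taken apart mechanically: status line first, then headers, then an empty line, then the body. The Python sketch below shows the idea; `parse_response` is a hypothetical helper, and the sample bytes are a shortened copy of the response above (several headers left out).

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into status line parts, headers and body."""
    # Headers and body are separated by an empty line (CRLF CRLF).
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: protocol version, numeric status code, reason phrase.
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


# Shortened sample (assumption: not the complete response from the slide).
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: Apache/1.3.12 (Unix) (SuSE/Linux)\r\n"
    b"Connection: close\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html><head><title>Hallo</title><body>Hallo</body></html>"
)

version, code, reason, headers, body = parse_response(sample)
print(code, reason, headers["content-type"])
```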


Dynamic pages

- GET <path> delivers a file
    - MIME type in Content-Type
- But what if the content does not exist as a file?
    - e.g. searching for content, the current time, ...
- CGI: Common Gateway Interface
    - When a file in a special directory (usually cgi-bin) is requested, the web server "knows" that it should be executed
    - GET /cgi-bin/SayHello runs the program SayHello in the corresponding directory
    - Parameters are passed in the environment
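What a program such as SayHello might look like is sketched below in Python. The slides do not show its source, so everything here is an assumption: the function name `say_hello`, the text it prints, and the choice of Python. The only fixed parts are the CGI conventions themselves: request data arrives in environment variables, and the program writes a Content-Type header, an empty line and then the body.

```python
import html
import os


def say_hello(environ) -> str:
    """Build a complete CGI response: header, empty line, body."""
    # The web server passes request data in environment variables.
    method = environ.get("REQUEST_METHOD", "GET")
    script = environ.get("SCRIPT_NAME", "")
    body = (
        "<html><body>Hello from "
        + html.escape(script)
        + " via "
        + html.escape(method)
        + "</body></html>"
    )
    # A CGI program must emit at least Content-Type before the body.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When started by the web server, os.environ holds the CGI variables.
    print(say_hello(os.environ), end="")
```

Passing the environment in as a parameter keeps the function easy to try out by hand, for example with `say_hello({"SCRIPT_NAME": "/cgi-bin/SayHello"})`.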


CGI Environment

    DOCUMENT_ROOT="/usr/local/httpd/htdocs"
    GATEWAY_INTERFACE="CGI/1.1"
    HTTP_ACCEPT="image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*"
    HTTP_ACCEPT_CHARSET="iso-8859-1,*,utf-8"
    HTTP_ACCEPT_ENCODING="gzip"
    HTTP_ACCEPT_LANGUAGE="de"
    HTTP_CONNECTION="Keep-Alive"
    HTTP_HOST="192.168.1.110"
    HTTP_USER_AGENT="Mozilla/4.51 [de]C-CCK-MCD DT (WinNT; I)"
    PATH="/sbin:/bin:/usr/sbin:/usr/bin"
    QUERY_STRING=""
    REMOTE_ADDR="192.168.1.101"
    REMOTE_PORT="1049"
    REQUEST_METHOD="GET"
    REQUEST_URI="/cgi-bin/printenv"
    SCRIPT_FILENAME="/usr/local/httpd/cgi-bin/printenv"
    SCRIPT_NAME="/cgi-bin/printenv"
    SERVER_ADDR="192.168.1.110"
    SERVER_ADMIN="[no address given]"
    SERVER_NAME="mac.e-technik.uni-ulm.de"
    SERVER_PORT="80"
    SERVER_PROTOCOL="HTTP/1.0"
    SERVER_SIGNATURE="<ADDRESS>Apache/1.3.12 Server at mac.e-technik.uni-ulm.de Port 80</ADDRESS>\n"
    SERVER_SOFTWARE="Apache/1.3.12 (Unix) (SuSE/Linux)"
    UNIQUE_ID="OwOfSMCoAW4AAAGfAxA"


Parameters

- How does a search program, for example, get the word to search for?
- URL of the form: http://server/pfad/script?Parameter=Wert
    - e.g. ...suche?Begriff=Internet
- Query string
    - QUERY_STRING="Begriff=Internet"
    - several parameters: ?p1=w1&p2=w2...
- Input by the user?
    - HTML forms, INPUT tags

    <FORM METHOD="GET" ACTION="http://localhost/Suche">
    <INPUT TYPE="text" NAME="Begriff">
    <INPUT TYPE="submit">
    </FORM>
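On the receiving side the query string has to be split back into names and values. A short Python sketch with the standard `urllib.parse` module shows both directions; the second parameter `Sprache=de` is invented for the example and does not appear in the slides.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# URL in the form used on the slide; "Sprache" is a made-up second parameter.
url = "http://server/pfad/script?Begriff=Internet&Sprache=de"

# Everything after the "?" is the query string (what CGI puts in QUERY_STRING).
query = urlsplit(url).query

# parse_qs returns a dict mapping each name to a list of values,
# because a name may occur more than once.
params = parse_qs(query)

# The other direction: encode form fields; blanks become "+".
rebuilt = urlencode({"Begriff": "World Wide Web"})

print(query)
print(params)
print(rebuilt)
```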


Parameter contd.

- Problem: many or long parameters
    - The length of a URL is limited
    - Space in the environment is limited (OS-specific)
- Solution: POST

    GET /Suche?Begriff=Internet HTTP/1.0
    Connection: Keep-Alive
    User-Agent: Mozilla/4.51 [de]C-CCK-MCD DT (WinNT; I)
    Host: localhost
    Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
    Accept-Encoding: gzip
    Accept-Language: de
    Accept-Charset: iso-8859-1,*,utf-8

    POST /Suche HTTP/1.0
    Connection: Keep-Alive
    Content-type: application/x-www-form-urlencoded
    Content-length: 16

    Begriff=Internet
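With POST the parameters travel in the body, and Content-length tells the server how many bytes to read. The Python sketch below builds such a request; `build_post` is a hypothetical helper and leaves out the other headers a real browser would send.

```python
from urllib.parse import urlencode


def build_post(path: str, fields: dict) -> bytes:
    """Assemble a minimal HTTP/1.0 POST request for form fields."""
    # Same encoding as in a query string, but sent as the request body.
    body = urlencode(fields).encode("ascii")
    head = (
        f"POST {path} HTTP/1.0\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        # The length is counted in bytes of the encoded body.
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


request = build_post("/Suche", {"Begriff": "Internet"})
print(request.decode("ascii"))
```

For the single field `Begriff=Internet` the computed length is 16, matching the value in the request above.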


Methods

- Selected in the <FORM METHOD="xxx" ...> tag
    - xxx=GET
    - xxx=POST
- GET
    - maximum length of the parameters is limited
    - problems with special characters (blank, ...): URL encoding
    - can be saved as a bookmark
    - better not used for orders, ...
- POST
    - parameter length is not limited by the URL or the environment
    - even whole files, ...


Typical application: search engine

    <HTML>
    <HEAD><TITLE>Phone book</TITLE></HEAD>
    <BODY>
    <h1>Phone book</h1>
    <FORM METHOD="POST" ACTION="http://localhost/Suche">
    Name:<INPUT TYPE="text" NAME="Name">
    <INPUT TYPE="submit">
    </FORM>
    </BODY>
    </HTML>


Result

    <HTML>
    <HEAD><TITLE>Phone book list</TITLE></HEAD>
    <BODY>
    <h1>Search result</h1>
    <UL>
    <LI> <a href="ShowDetail?ID=4711">Müller, Hans</a></LI>
    <LI> <a href="ShowDetail?ID=4243">Müller, Hugo</a></LI>
    <LI> <a href="ShowDetail?ID=1234">Müller, Karin</a></LI>
    </UL>
    </BODY>
    </HTML>
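A result page like this is not a file on disk; the search program generates it for each request. How the original program did this is not shown, so the following Python sketch is only one possible way: `render_results` and the `(id, name)` record layout are assumptions.

```python
from html import escape


def render_results(rows) -> str:
    """rows: iterable of (id, name) pairs (hypothetical record layout)."""
    items = "".join(
        # int() guards the ID; escape() keeps names from being read as markup.
        f'<LI> <a href="ShowDetail?ID={int(ident)}">{escape(name)}</a></LI>'
        for ident, name in rows
    )
    return (
        "<HTML><HEAD><TITLE>Phone book list</TITLE></HEAD><BODY>"
        "<h1>Search result</h1><UL>" + items + "</UL></BODY></HTML>"
    )


page = render_results([(4711, "Müller, Hans"), (4243, "Müller, Hugo")])
print(page)
```

Each entry links to ShowDetail with the record's ID as a parameter, so the detail page is again produced by a program that receives that ID in its query string.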