Programming for WWW (ICE 1338)
Lecture #9, July 23, 2004
In-Young Ko (iko .AT. icu.ac.kr)
Information and Communications University (ICU)


July 23, 2004 2 Programming for WWW (Lecture#9) In-Young Ko, Information Communications University

Announcements

- Class hours on Friday, July 30th will be moved to 3:00PM~5:30PM


Review of the Previous Lecture

- Interaction between Java Applets and JavaScript
- CGI programming
- Perl pattern matching


Contents of Today's Lecture

- Perl modules
- Cookies
- Introduction to PHP
- XML and XML Processing


CGI.pm Module

- CGI.pm: a Perl module that serves as a library
- The use declaration makes a module available to a program
  - To make only part of a module available, specify the part name after a colon
    e.g., use CGI ":standard";
- Common CGI.pm functions
  - "Shortcut" functions produce tags, using their parameters as tag content or attribute values
    e.g., h2("Very easy!"); produces <h2>Very easy!</h2>
  - In this example, the parameter to the function h2 is used as the content of the <h2> tag

AW lecture notes


CGI.pm Module (cont.)

- Tags can have both content and attributes
  - Each attribute is passed as a name/value pair
  - Attribute names are passed with a preceding dash
    e.g., textarea(-name => "Description", -rows => "2", -cols => "35");
    Produces:
        <textarea name="Description" rows="2" cols="35">
        </textarea>
- Tags and their attributes are distributed over the parameters of the function
    e.g., ol(li({-type => "square"}, ["milk", "bread", "cheese"]));
    Output:
        <ol>
        <li type="square">milk</li>
        <li type="square">bread</li>
        <li type="square">cheese</li>
        </ol>
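The idea of a shortcut function that distributes a tag over a list can be sketched outside Perl as well. The following is a minimal Python illustration, not CGI.pm itself: the helper name tag and its keyword-argument convention are assumptions made for this sketch.

```python
from html import escape


def tag(name, content="", **attrs):
    """Return an HTML element string; keyword arguments become attributes."""
    # A list of contents distributes the tag (and its attributes) over every item.
    if isinstance(content, list):
        return "".join(tag(name, item, **attrs) for item in content)
    attr_text = "".join(
        f' {key}="{escape(str(value))}"' for key, value in attrs.items()
    )
    return f"<{name}{attr_text}>{content}</{name}>"


# Hypothetical usage mirroring the h2(...) and ol(li(...)) calls on the slides.
heading = tag("h2", "Very easy!")
items = tag("ol", tag("li", ["milk", "bread", "cheese"], type="square"))
print(heading)
print(items)
```

As in the Perl example, the single li call yields one element per list item, each carrying the same attribute.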


CGI.pm Module (cont.)

- Producing output for return to the user
  - A call to header() produces:
        Content-type: text/html;charset=ISO-8859-1
        -- blank line --
  - The start_html function is used to create the head of the return document, as well as the <body> tag
  - The parameter to start_html is used as the title of the document
    e.g., start_html("Bill's Bags"); produces:
        <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
            "DTD/xhtml11-transitional.dtd">
        <html xmlns="http://www.w3.org/1999/xhtml" lang="en-US">
        <head><title>Bill's Bags</title>
        </head><body>
  - The end_html function generates </body></html>


CGI.pm Module (cont.)

- The param function is given a widget's name; it returns the widget's value
  - If the query string has name=Abraham in it, param("name") will return "Abraham"
    e.g., my($age, $gender, $vote) =
              (param("age"), param("gender"), param("vote"));
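What param does with the query string can be illustrated with a small Python sketch using the standard urllib.parse module. The function name param is reused here only for the analogy, and the sample query string is made up.

```python
from urllib.parse import parse_qs


def param(query_string, name):
    """Return the first value submitted for a form widget, or None if absent."""
    values = parse_qs(query_string).get(name)
    return values[0] if values else None


# Hypothetical query string, as it would follow the '?' in a GET request.
query = "name=Abraham&age=35&gender=male"
name = param(query, "name")
age, gender, vote = (param(query, key) for key in ("age", "gender", "vote"))
print(name, age, gender, vote)
```

A widget that was not submitted (vote in this sample) yields None, so a caller can tell a missing field from an empty one.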


Cookies

- A session is the collection of all of the requests made by a particular browser from the time the browser is started until the user exits the browser
- The HTTP protocol is stateless, but there are several reasons why it is useful for the server to relate a request to a session
  - Shopping carts for many different simultaneous customers
  - Customer profiling for advertising
  - Customized interfaces for specific clients
- Approaches to storing client information:
  - Store it on the server: too much to store!
  - Store it on the client machine: this works


Cookies (cont.)

- A cookie is an object sent by the server to the client
- Cookies are created by some software system on the server (maybe a CGI program)
- At the time a cookie is created, it is given a lifetime
- Every time the browser sends a request to the server that created the cookie, while the cookie is still alive, the cookie is included
- A browser can be set to reject all cookies


Using CGI.pm for Cookies

- CGI.pm includes support for cookies
    cookie(-name => a_name, -value => a_value, -expires => a_time);
  - The time is a number followed by a unit code (d, s, m, h, M, y)
    e.g., -expires => '+5d'
- Cookies must be placed in the HTTP header at the time the header is created
    e.g., header(-cookie => $my_cookie);
- To fetch the cookies from an HTTP request, call cookie with no parameters; the names of all cookies are returned
- To fetch the value of one particular cookie, send the cookie's name to the cookie function
    e.g., $age = cookie('age');
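The round trip of a cookie (set by the server in a response header, returned by the browser in a later request) can be sketched with Python's standard http.cookies module. This is an analogy to the CGI.pm calls, not their implementation; the cookie values and the incoming header string are invented for the example.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header that goes into the HTTP response.
outgoing = SimpleCookie()
outgoing["age"] = "35"
outgoing["age"]["max-age"] = 5 * 24 * 60 * 60  # lifetime of five days, like '+5d'
set_cookie_header = outgoing.output()

# Later request: the browser sends live cookies back in a Cookie header.
incoming = SimpleCookie()
incoming.load("age=35; last_time=1090540800")  # hypothetical request header value
cookie_names = sorted(incoming.keys())
age = incoming["age"].value

print(set_cookie_header)
print(cookie_names, age)
```

The lifetime travels with the cookie in the response header; the request header carries only name/value pairs.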


A Cookie Example

- A cookie that tells the client the time of his or her last visit to this site
- Use the Perl function, localtime, to get the parts of time

    ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = localtime;
    @day_stuff = ($sec, $min, $hour, $mday, $mon, $year);
    $day_cookie = cookie(-name => 'last_time',
                         -value => \@day_stuff,
                         -expires => '+5d');


Perl References

- Textbook chapters 4 and 5
- Perl.com: http://www.perl.com/
- A Perl Tutorial: http://www.comp.leeds.ac.uk/Perl/start.html
- Perl Pattern Matching: http://www.sarand.com/td/ref_perl_pattern.html
- Perl Functions: http://www.sunsite.ualberta.ca/Documentation/Misc/perl-5.6.1/pod/perlfunc.html
- Perl Modules: http://www.cpan.org/modules/


PHP (PHP: Hypertext Preprocessor)

- Developed in 1994 by Rasmus Lerdorf to allow him to track visitors to his Web site
- Used for form handling, file processing, and database access
- A server-side scripting language whose scripts are embedded in HTML documents
  - Similar to JavaScript, but on the server side
- An alternative to CGI, Active Server Pages (ASP), and Java Server Pages (JSP)
- The PHP processor has two modes: copying HTML text and interpreting PHP code
- Syntax is similar to that of JavaScript
- Dynamically typed


An Example PHP Code

Source: http://www.php.net/manual/en/tutorial.firstpage.php

'hello.php' on the Web server:

    <html>
     <head>
      <title>PHP Test</title>
     </head>
     <body>
     <?php echo '<p>Hello World</p>'; ?>
     </body>
    </html>

The document content received by the client via 'http://…../hello.php':

    <html>
     <head>
      <title>PHP Test</title>
     </head>
     <body>
     <p>Hello World</p>
     </body>
    </html>


A PHP Example: A Hit Counter

counter.php:

    <?php
    // Read the current count from the first line of the counter file
    $counter_file = "counter.txt";
    $visits = file($counter_file);
    $visits[0]++;
    // Write the updated count back, replacing the old contents
    $fp = fopen($counter_file, "w");
    fputs($fp, "$visits[0]");
    fclose($fp);
    echo "There have been $visits[0] visitors so far";
    ?>

Now add the following to your page where you wish the counter to appear:

    <?php include("counter.php"); ?>

http://tutorials.programmingsite.co.uk/counter1.php
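The same read-increment-write cycle can be sketched in Python to make the steps explicit. The function name record_visit is hypothetical, and treating a missing file as zero visits is an assumption of this sketch (the PHP version expects counter.txt to exist already).

```python
import os
import tempfile
from pathlib import Path


def record_visit(counter_file):
    """Increment the count stored in counter_file and return the new value."""
    path = Path(counter_file)
    # Assumption: a missing or empty file means no visits have been recorded yet.
    text = path.read_text().strip() if path.exists() else ""
    visits = int(text) + 1 if text else 1
    path.write_text(str(visits))
    return visits


# Demonstration in a throwaway directory so no real file is touched.
with tempfile.TemporaryDirectory() as folder:
    counter = os.path.join(folder, "counter.txt")
    first = record_visit(counter)
    second = record_visit(counter)

print(f"There have been {second} visitors so far")
```

Neither version locks the file, so two requests arriving at the same moment could read the same count and lose one update.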


PHP References

- PHP.net: http://www.php.net/
- PHP Manual: http://www.php.net/manual/en/index.php
- Examples and Tutorials: http://php.resourceindex.com/Documentation/Examples_and_Tutorials/


SGML (Standard Generalized Markup Language)

- SGML is a meta-markup language developed in the early 1980s (ISO 8879, 1986)
- HTML was developed using SGML in the early 1990s, specifically for Web documents
- Problems with HTML:
  1. Fixed set of tags and attributes
     - User cannot define new tags or attributes
     - So, the tags cannot connote any particular meaning
  2. No restrictions on arrangement or order of tag appearance
- SGML is too large and complex to use, and it is very difficult to build a parser for it


XML (eXtensible Markup Language)

- XML is a light version of SGML that provides a way of storing and transferring data of any kind
- XML vs. HTML
  - HTML is a markup language used to describe the layout of any kind of information
  - XML is a meta-markup language that can be used to define markup languages that can define the meaning of specific kinds of information
- XML does not predefine any tags
- All documents described with an XML-derived markup language can be parsed with a single parser


XML Syntax

- A flexible text format that was originally designed for large-scale electronic publishing of documents
- An XML document is a hierarchical organization of one or more named elements
  - An element is composed of an opening-tag, data (a string or another element), and a closing-tag
  - An opening-tag is an element name surrounded by '<' and '>'
  - A closing-tag is an element name surrounded by '</' and '>'
  - An element may have zero or more attributes
  - An attribute is a name-value pair that specifies a property of the element


XML Syntax (cont.)

- XML documents should begin with an XML declaration:
    <?xml version = "1.0"?>
- XML comments are just like HTML comments
- XML names:
  - Must begin with a letter or an underscore
  - They can include digits, hyphens, and periods
  - There is no length limitation
  - They are case sensitive (unlike HTML names)
- Syntax rules for XML:
  - Every XML document defines a single root element, whose opening tag must appear before any other element of the document
  - Every element that has content must have a closing tag
  - Tags must be properly nested
  - All attribute values must be quoted
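These syntax rules are what a parser enforces as well-formedness. A minimal Python sketch using the standard xml.etree.ElementTree module shows each rule being checked; the sample strings and their labels are invented for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One conforming document, then one violation per rule from the slide.
samples = {
    "ok": '<?xml version="1.0"?><a id="1"><b>text</b></a>',
    "bad_nesting": "<a><b>text</a></b>",
    "unquoted_attribute": "<a id=1>text</a>",
    "missing_close": "<a><b>text</a>",
    "case_mismatch": "<a>text</A>",
    "two_roots": "<a/><b/>",
}
results = {label: is_well_formed(text) for label, text in samples.items()}
for label, passed in results.items():
    print(label, passed)
```

Only the first sample parses; a well-formedness check says nothing about whether the document also follows a DTD.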


An XML Document Example

    <?xml version = "1.0"?>
    <class>
      <name>Prog. for WWW</name>
      <code>ICE1338</code>
      <students>
        <student id="20037001">
          <name>Y.K. Ko</name>
          <bday>820304</bday>
        </student>
        <student id="20037002">
          <name>D.W. Kim</name>
          <bday>830512</bday>
        </student>
      </students>
    </class>
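To see the hierarchy in this document from a program's point of view, here is a short Python sketch that parses it with the standard xml.etree.ElementTree module. The variable names are arbitrary; the document text is the one from the slide.

```python
import xml.etree.ElementTree as ET

document = """<?xml version = "1.0"?>
<class>
  <name>Prog. for WWW</name>
  <code>ICE1338</code>
  <students>
    <student id="20037001">
      <name>Y.K. Ko</name>
      <bday>820304</bday>
    </student>
    <student id="20037002">
      <name>D.W. Kim</name>
      <bday>830512</bday>
    </student>
  </students>
</class>"""

root = ET.fromstring(document)           # the root element, <class>
course = root.findtext("name")           # a direct child of the root
students = [
    (s.get("id"), s.findtext("name"), s.findtext("bday"))  # attribute, then child elements
    for s in root.iter("student")
]
print(root.tag, course)
print(students)
```

The id is read as an attribute, while name and bday are read as nested elements.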


An XML Document Example (annotated)

[Figure: the same XML document as on the previous slide, with callouts labeling an opening-tag, a closing-tag, an element, an attribute, the root element, and a value]


XML Document Structures

- Logical Structure: tells what elements are to be included in a document and in what order
  - A new nested tag needs to be defined to provide more info about the content of a tag
  - Nested tags are better than attributes, because attributes cannot describe structure and the structural complexity may grow
  - Attributes should always be used to identify numbers or names of elements (like HTML id and name attributes)
- Physical Structure: governs the content in a document in the form of storage units called entities

Sources: AW lecture notes; http://tech.irt.org/articles/js212/


XML Logical Structure Examples

    <!-- A tag with one attribute -->
    <patient name = "Maggie Dee Magpie">
      ...
    </patient>

    <!-- A tag with one nested tag -->
    <patient>
      <name> Maggie Dee Magpie </name>
      ...
    </patient>

    <!-- Multi-level nested tags -->
    <patient>
      <name>
        <first> Maggie </first>
        <middle> Dee </middle>
        <last> Magpie </last>
      </name>
      ...
    </patient>
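The practical difference between the attribute form and the nested form shows up when a program needs one part of the name. A small Python sketch with the standard xml.etree.ElementTree module follows; the elided "..." content is left out, and the splitting rule for the attribute form is a guess made only for this example.

```python
import xml.etree.ElementTree as ET

flat = ET.fromstring('<patient name="Maggie Dee Magpie"></patient>')
nested = ET.fromstring(
    "<patient><name>"
    "<first> Maggie </first><middle> Dee </middle><last> Magpie </last>"
    "</name></patient>"
)

# Attribute form: the name is one opaque string, so the program has to guess
# where the last name starts (here: the final whitespace-separated word).
full_name = flat.get("name")
guessed_last = full_name.split()[-1]

# Nested form: the structure itself says which part is the last name.
last = nested.findtext("name/last").strip()

print(guessed_last, last)
```

Both give the same answer here, but the guess breaks on a multi-word last name, while the nested form does not depend on such a convention.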


DTD (Document Type Definition)

- A DTD is a set of structural rules called declarations
  - Specify a set of elements, along with how and where they can appear in a document (in BNF)
- Purpose: provide a standard form for a collection of XML documents
- Not all XML documents have or need a DTD
- The DTD for a document can be internal or external
- All of the declarations of a DTD are enclosed in the block of a DOCTYPE markup declaration
- DTD declarations have the form: <!keyword … >
- Possible declaration keywords: ELEMENT, ATTLIST, ENTITY, and NOTATION


A DTD Example

    <!DOCTYPE HYPERLIB [
      <!ELEMENT HYPERLIB (HEAD?, BODY)>
      <!ELEMENT HEAD     (TITLE, AUTHOR+)>
      <!ELEMENT BODY     (INTRO?, TOPIC+)>
      <!ELEMENT TOPIC    (TITLE,(#PCDATA | P | DL )*)>
      <!ELEMENT DL       (DT,DD)+>
      <!ELEMENT DT       (#PCDATA | EMBED | A | EMPH )+>
      <!ELEMENT DD       (#PCDATA | P | EMBED | A | EMPH | DL)+>
      <!ELEMENT INTRO    (#PCDATA)>
      <!ELEMENT TITLE    (#PCDATA)>
      <!ELEMENT AUTHOR   (#PCDATA)>
      <!ATTLIST AUTHOR   function (manager | editor | contrib) #REQUIRED>
      <!ELEMENT P        (#PCDATA)>
      <!ELEMENT A        (#PCDATA)>
      <!ELEMENT EMPH     (#PCDATA)>
      <!ELEMENT EMBED    (#PCDATA)>
    ]>

Source: http://lib.ua.ac.be/MAN/WP31/t15.html
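An element-only content model such as (TITLE, AUTHOR+) is essentially a regular expression over the sequence of child element names. The Python sketch below makes that reading concrete for three of the declarations above. It is not a DTD validator (Python's xml.etree module does not validate against DTDs); the regular expressions were translated by hand, and the two sample documents are invented.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated content models: ',' becomes sequence, '?' and '+' keep their
# regular-expression meaning, and each child name is followed by one space.
CONTENT_MODELS = {
    "HYPERLIB": r"(HEAD )?BODY ",
    "HEAD": r"TITLE (AUTHOR )+",
    "BODY": r"(INTRO )?(TOPIC )+",
}


def children_match(element):
    """Check an element's child sequence against its hand-written model."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return True  # this sketch records no model for the element, so it is not checked
    sequence = "".join(child.tag + " " for child in element)
    return re.fullmatch(model, sequence) is not None


def conforms(root):
    """True if every element with a recorded model has an allowed child sequence."""
    return all(children_match(element) for element in root.iter())


good = ET.fromstring(
    '<HYPERLIB><HEAD><TITLE>t</TITLE><AUTHOR function="editor">a</AUTHOR></HEAD>'
    "<BODY><TOPIC><TITLE>x</TITLE></TOPIC></BODY></HYPERLIB>"
)
bad = ET.fromstring(  # HEAD lacks the required AUTHOR
    "<HYPERLIB><HEAD><TITLE>t</TITLE></HEAD>"
    "<BODY><TOPIC><TITLE>x</TITLE></TOPIC></BODY></HYPERLIB>"
)
good_ok = conforms(good)
bad_ok = conforms(bad)
print(good_ok, bad_ok)
```

A real validating parser also checks mixed content, attribute declarations such as the #REQUIRED function attribute, and entities, all of which this sketch ignores.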


More on DTD…

- Internal and External DTDs
  - Internal DTDs
        <!DOCTYPE root_name [ … ]>
  - External DTDs
        <!DOCTYPE root_name SYSTEM "DTD_file_name">
- Problems with DTDs:
  - Syntax is different from XML, so DTDs cannot be parsed with an XML parser
  - It is confusing to deal with two different syntactic forms
  - DTDs do not allow specification of particular kinds of data


XML Entities

- Entities allow users to assign a name to some content, and use that name to refer to that content
- Used as "macros" for content (e.g., special characters, images, documents)
- Entity Categories
  - The Document Entity: the root of the entity tree, the whole document
  - Internal General Entities: association of an arbitrary piece of text with a name
  - External General Entities: incorporate content from external files

Source: http://tech.irt.org/articles/js212/


Internal General Entities

- Predefined Entities

    Entity                               Name   Replacement Text
    The left angle bracket (<)           lt     &#60;
    The right angle bracket (>)          gt     &#62;
    The ampersand (&)                    amp    &#38;
    The single quote or apostrophe (')   apos   &#39;
    The double quote (")                 quot   &#34;

- Character References: refer to Unicode characters using &#decimal; or &#xhex;
- Internal Entity Declaration: <!ENTITY entityname "replacement text">
- Internal Entity Reference: &entityname;

http://tech.irt.org/articles/js212/
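A parser replaces each kind of reference before the application sees the text. The Python sketch below uses the standard xml.etree.ElementTree module on a small made-up document; the entity name icu and the element names are assumptions for the example.

```python
import xml.etree.ElementTree as ET

document = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY icu "Information and Communications University">
]>
<note>
  <school>&icu;</school>
  <math>1 &lt; 2 &amp;&amp; 3 &#62; 2</math>
  <letter>&#x41;</letter>
</note>"""

root = ET.fromstring(document)
school = root.findtext("school")   # internal general entity declared in the DTD
math = root.findtext("math")       # predefined entities and a decimal character reference
letter = root.findtext("letter")   # hexadecimal character reference

print(school)
print(math)
print(letter)
```

After parsing, the application receives only the replacement text and cannot tell which characters came from references.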


External General Entities

- Provide a mechanism for dividing a document up into logical chunks, each of which can be stored in a separate file
- When the parent file is parsed by an XML processor, it will have the effect of inserting the contents of each of the individual files at the location of the respective entity references
- External entities can contain binary data, which can be used to reference images and other non-XML content in the document

http://tech.irt.org/articles/js212/


External Entity Example

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE document [
      <!ENTITY section1 SYSTEM "path/to/section_1.xml">
      ...
      <!ENTITY sectionm SYSTEM "path/to/section_m.xml">
    ]>
    …
    <document>
      &section1;
      ...
      &sectionm;
    </document>


Namespaces

- Markup vocabulary: the collection of all of the element types and attribute names of a markup language (a tag set)
- An XML document may define its own tag set and also use that of another tag set: CONFLICTS!
- XML namespace: a collection of names used in XML documents as element types and attribute names
  - The name of an XML namespace has the form of a URI
  - A namespace declaration has the form:
        <element_name xmlns[:prefix] = URI>
  - The prefix is a short name for the namespace, which is attached to names from the namespace in the XML document
    e.g., <gmcars xmlns:gm = "http://www.gm.com/names">
    In the document, you can use <gm:pontiac>


Namespace Example

    <h:html xmlns:xdc="http://www.xml.com/books"
            xmlns:h="http://www.w3.org/HTML/1998/html4">
      <h:head><h:title>Book Review</h:title></h:head>
      <h:body>
        <xdc:bookreview>
          <xdc:title>XML: A Primer</xdc:title>
          <h:table>
            <h:tr align="center">
              <h:td>Author</h:td><h:td>Price</h:td>
              <h:td>Pages</h:td><h:td>Date</h:td></h:tr>
            <h:tr align="left">
              <h:td><xdc:author>Simon St. Laurent</xdc:author></h:td>
              <h:td><xdc:price>31.98</xdc:price></h:td>
              <h:td><xdc:pages>352</xdc:pages></h:td>
              <h:td><xdc:date>1998/01</xdc:date></h:td></h:tr>
          </h:table>
        </xdc:bookreview>
      </h:body>
    </h:html>
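The two title elements in this example do not conflict because a parser identifies each element by its namespace URI plus local name, not by the prefix. The Python sketch below shows this with the standard xml.etree.ElementTree module on a shortened copy of the document; the prefix-to-URI dictionary is supplied by the calling code and could use any prefixes.

```python
import xml.etree.ElementTree as ET

document = """<h:html xmlns:xdc="http://www.xml.com/books"
        xmlns:h="http://www.w3.org/HTML/1998/html4">
  <h:head><h:title>Book Review</h:title></h:head>
  <h:body>
    <xdc:bookreview>
      <xdc:title>XML: A Primer</xdc:title>
    </xdc:bookreview>
  </h:body>
</h:html>"""

root = ET.fromstring(document)

# Prefixes used in the search paths below; they are mapped to the same URIs as
# in the document but are otherwise independent of the document's own prefixes.
namespaces = {
    "h": "http://www.w3.org/HTML/1998/html4",
    "xdc": "http://www.xml.com/books",
}
page_title = root.findtext("h:head/h:title", namespaces=namespaces)
book_title = root.findtext(".//xdc:title", namespaces=namespaces)
expanded_root_name = root.tag  # stored as {namespace URI}local-name

print(expanded_root_name)
print(page_title, "/", book_title)
```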


XML Processors
    XML Parsers: read XML documents and provide access to their content and structure via DOM (e.g., Xerces, Sun's Java XML Parser)
    Document Filtering (Validation)
        Document Type Definition (DTD): a grammar for a class of XML documents
        XML Schema (XSD): a successor of DTD. Describes the structure of an XML document
    XML Presentation
        eXtensible Stylesheet Language (XSL): a language to define the transformation and presentation of an XML document
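As a rough illustration of validation against a DTD, the sketch below turns on validation in a JAXP parser and reports whether a document conforms to a small grammar. The DTD here (a class element containing exactly one name and one code) is an assumption made up for this example, as are the names DtdValidationDemo and isValid.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class DtdValidationDemo {
    // Hypothetical grammar, declared as an internal DTD subset.
    static final String DTD =
        "<!DOCTYPE class [\n"
      + "  <!ELEMENT class (name, code)>\n"
      + "  <!ELEMENT name (#PCDATA)>\n"
      + "  <!ELEMENT code (#PCDATA)>\n"
      + "]>\n";

    // Returns true if the body is well-formed and valid against DTD.
    static boolean isValid(String body) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // enable DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            final boolean[] ok = { true };
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                // Validity violations are reported as recoverable errors.
                public void error(SAXParseException e) { ok[0] = false; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
            return ok[0];
        } catch (SAXException e) {
            return false; // not well-formed
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<class><name>Prog. Lang.</name><code>ICE1341</code></class>"));
        System.out.println(isValid("<class><name>Prog. Lang.</name></class>"));
    }
}
```

The second document is well-formed but omits the code element that the grammar requires, so the parser reports a validity error through the error handler.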


XML Processors

[Figure: XML processing pipeline. Labeled components: XML Document, XML Parser, DTD/XML Schema (XML Grammar (Structure) Validation), Parsing Events (SAX API), DOM Objects (DOM API), XSL Description, XSL Processor, HTML Presentation, Databases. The DOM Objects box shows the tree: class -> name (Prog. Lang.), code (ICE1341), students -> two student nodes, each with name and bday (Y.K. Ko, 820304; D.W. Kim, 830512).]
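The XSL Processor stage named on this slide can be sketched with the javax.xml.transform API from the Java platform. The stylesheet below is a made-up example (an assumption of this sketch, not part of the lecture) that renders the class name and code as an HTML heading. XslDemo and toHtml are hypothetical names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XslDemo {
    static final String XML =
        "<class><name>Prog. Lang.</name><code>ICE1341</code></class>";

    // Hypothetical XSL description: one template that emits an <h1>.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/class\">"
      + "<h1><xsl:value-of select=\"name\"/> (<xsl:value-of select=\"code\"/>)</h1>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML document and returns the HTML text.
    static String toHtml() {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(XML)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml());
    }
}
```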


XML APIs
    SAX (Simple API for XML), developed by the XML-DEV community
        Stream-based Access Interface (Sequential Access)
        Notifies an application of a stream of parsing events
        Needs a Content Handler to handle the parsing events (e.g., start and end of an element)
        Suitable for large XML documents, since the whole document does not have to be held in memory
    DOM (Document Object Model), a W3C standard
        Object-oriented Access Interface (Random Access)
        Builds a tree of nodes based on the structure and information in an XML document
        Types of nodes: Document, Element, Attr, …
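To make the SAX side concrete, here is a minimal sketch of a content handler that records the parsing events it receives. It extends DefaultHandler from org.xml.sax.helpers, which supplies empty implementations of the ContentHandler callbacks. The class name SaxDemo, the method events, and the log format are assumptions of this sketch.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxDemo {
    // Parses the XML text and returns the sequence of events the handler saw.
    static List<String> events(String xml) {
        final List<String> log = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                log.add("start " + qName);
            }
            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end " + qName);
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver text in several chunks; skip blank ones.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text " + text);
                }
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        String xml = "<student id=\"20037001\"><name>Y.K. Ko</name></student>";
        for (String event : events(xml)) {
            System.out.println(event);
        }
    }
}
```

Unlike DOM, nothing is retained after each callback returns unless the handler stores it, which is why this style scales to large inputs.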


DOM Representation

XML Document:

    <class>
      <name>Prog. Lang.</name>
      <code>ICE1341</code>
      <students>
        <student id="20037001">
          <name>Y.K. Ko</name>
          <bday>820304</bday>
        </student>
        <student id="20037002">
          <name>D.W. Kim</name>
          <bday>830512</bday>
        </student>
      </students>
    </class>

DOM Representation:

    Document (Root Node)
      class
        name
          "Prog. Lang."
        code
          "ICE1341"
        students
          student
            name
              "Y.K. Ko"
            bday
              "820304"
          student
            name
              "D.W. Kim"
            bday
              "830512"

    Elements (Child Nodes): class, name, code, students, student, bday
    Node Values (Text Nodes): the quoted strings
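The tree on this slide can be walked with the standard org.w3c.dom interfaces. The sketch below parses the same class document and reads each student's id attribute and name; note that the name string is the value of a Text node that is a child of the name element, not a value of the element itself. DomReadDemo and students are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomReadDemo {
    static final String XML =
        "<class><name>Prog. Lang.</name><code>ICE1341</code><students>"
      + "<student id=\"20037001\"><name>Y.K. Ko</name><bday>820304</bday></student>"
      + "<student id=\"20037002\"><name>D.W. Kim</name><bday>830512</bday></student>"
      + "</students></class>";

    // Returns one "id name" string per <student> element.
    static List<String> students() {
        List<String> result = new ArrayList<String>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList students = doc.getElementsByTagName("student");
            for (int i = 0; i < students.getLength(); i++) {
                Element student = (Element) students.item(i);
                String id = student.getAttribute("id");
                // <name> element -> its first child (a Text node) -> node value
                String name = student.getElementsByTagName("name")
                        .item(0).getFirstChild().getNodeValue();
                result.add(id + " " + name);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String s : students()) {
            System.out.println(s);
        }
    }
}
```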


Java API Hierarchy for DOM

Node
    getChildNodes(): NodeList
    getAttributes(): NamedNodeMap
    getNodeName(): String
    getNodeValue(): String
    appendChild(Node)
    removeChild(Node)
    setNodeValue(String)

    Attr (extends Node)
        getName(): String
        getValue(): String
        setValue(String)

    CharacterData (extends Node)
        getData(): String
        getLength(): int
        setData(String)

        Text (extends CharacterData)
            splitText(int): Text

        Comment (extends CharacterData)

    Document (extends Node)
        createAttribute(String): Attr
        createElement(String): Element
        createTextNode(String): Text
        getDocumentElement(): Element
        getElementsByTagName(String): NodeList

    Element (extends Node)
        getAttribute(String): String
        getTagName(): String
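As a minimal sketch of the creation methods listed above, the code below builds part of the class document in memory with createElement, createTextNode, and appendChild, then writes it out as XML text. Serialization uses an identity Transformer from javax.xml.transform, which is one common approach and an assumption of this sketch; Element.setAttribute is a standard DOM method that the slide does not list. DomBuildDemo, child, and build are hypothetical names.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomBuildDemo {
    // Creates <tag>text</tag> (or an empty <tag/> if text is null) under parent.
    static Element child(Document doc, Element parent, String tag, String text) {
        Element e = doc.createElement(tag);
        if (text != null) {
            e.appendChild(doc.createTextNode(text));
        }
        parent.appendChild(e);
        return e;
    }

    // Builds the DOM tree and returns it serialized as XML text.
    static String build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("class");
            doc.appendChild(root);
            child(doc, root, "name", "Prog. Lang.");
            child(doc, root, "code", "ICE1341");
            Element students = child(doc, root, "students", null);
            Element student = child(doc, students, "student", null);
            student.setAttribute("id", "20037001");
            child(doc, student, "name", "Y.K. Ko");
            child(doc, student, "bday", "820304");

            // A Transformer with no stylesheet copies the source to the result.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```

Passing a StreamResult that wraps a File instead of a StringWriter would write the same output to disk.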


An Example of Creating DOM Objects from an XML File

    import java.io.File;
    import javax.xml.parsers.*;
    import org.w3c.dom.*;

    try {
        // Obtain a parser (DocumentBuilder) from the factory
        DocumentBuilderFactory docBuilderFactory =
            DocumentBuilderFactory.newInstance();
        DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
        // Parse the file into a DOM tree
        Document doc = docBuilder.parse(new File("sample.xml"));
        // Visit each child node of the root element
        Element rootEle = doc.getDocumentElement();
        NodeList children = rootEle.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node subEle = children.item(i);
            …
        }
    } catch (Exception e) {
        e.printStackTrace();
    }


Related Materials
    W3C's XML Web Site: http://www.w3.org/XML/
    XML Specification: http://www.w3.org/TR/2004/REC-xml-20040204/
    XML Concepts: http://www.w3.org/Talks/General/Concepts.html
    DTD Tutorial: http://www.w3schools.com/dtd/
    XML Schema Tutorial: http://www.w3schools.com/schema/default.asp
    W3C's XSL Site: http://www.w3.org/Style/XSL/
    XML Entities and their Applications: http://tech.irt.org/articles/js212/
    Other XML-related Notes: http://www.w3.org/XML/notes.html


Related Materials
    W3C Document Object Model (www.w3.org/DOM/)
    A simple way to read an XML file in Java (www.developerfusion.com/show/2064/)
    Working with XML (java.sun.com/xml/jaxp/dist/1.0.1/docs/tutorial/index.html)
    Java Technology and XML FAQs (java.sun.com/xml/faq.html)
    Java API Manual (java.sun.com/j2se/1.4.2/docs/api/)
        See org.w3c.dom and javax.xml.parsers
    XML.org (www.xml.org)


Homework #3
    Due by Friday July 30th
    Design an XML document structure to represent the results from your Web Wrapper
        You can use DTD or XSD for writing the grammar, but it is not a requirement
        You can just sketch the structure by drawing a tree hierarchy
    Write a program to generate a DOM hierarchy of the wrapper results by using a DOM library, and link the program with your Web wrapper
    Produce an XML file from the DOM representation of the results
    Submit the following things electronically to the TA
        The XML document structure design
        Your Web wrapper program with the XML generation part
        An output XML file