XML Marc Nyssen Vrije Universiteit Brussel Medical Informatics 1st International Summer School Applications of ICT in Biomedicine August 5-10, 2002 Dubrovnik, CROATIA


XML

Marc Nyssen, Vrije Universiteit Brussel

Medical Informatics

1st International Summer School: Applications of ICT in Biomedicine

August 5-10, 2002, Dubrovnik, CROATIA


XML: COURSE CONTENTS

1. Introduction: what is XML?
2. Syntax rules
3. XML Schema definitions
4. XML document formats
5. Extensible Stylesheet Language
6. Cascading Stylesheets
7. XSL Formatting Objects (XSL-FO)
8. XML Data formatting
9. Document Object Model (DOM)
10. Simple API for XML (SAX)
11. The Broader view: semantic web
12. Examples
13. References


1. Introduction: what is XML?


eXtensible Markup Language

subset of SGML (Standard Generalized Markup Language, 1986), ISO standard ISO 8879 (Charles F. Goldfarb)

GML (Generalized ...): dates from 1969

markup: tags contain meta-information

extensible: define your own tags

with XML you can define specific markup languages

Strict separation: structure / layout

1. Introduction: what is XML? (2)

Extensible Markup Language 1.0

Open solution: W3C Recommendation, Feb. 10th 1998

hardware independent (data exchange)

software independent

text based (human + computer readable)

strict syntax

International (Unicode character set)

1. Introduction: what is XML? (3)

Example:

<?xml version="1.0"?>
<patients>
  <patfile nr="952345">
    <name>Frank Doo</name>
    <bdate>
      <day>23</day><month>05</month><year>1958</year>
    </bdate>
    <diag>healthy</diag>
  </patfile>
</patients>

1. Introduction: what is XML? (4)

Tree representation:

<?xml version="1.0"?>
<patients>
  <patfile nr="952345">
    <name>Frank Doo</name>
    <bdate>
      <day>23</day><month>05</month><year>1958</year>
    </bdate>
    <diag>healthy</diag>
  </patfile>
</patients>
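The tree view can be made explicit with a short script. This is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module (neither is part of the course material); the helper name tree_lines is hypothetical.

```python
import xml.etree.ElementTree as ET

# The patient record from the slide, with plain ASCII quotes.
PATIENTS_XML = """<?xml version="1.0"?>
<patients>
  <patfile nr="952345">
    <name>Frank Doo</name>
    <bdate>
      <day>23</day><month>05</month><year>1958</year>
    </bdate>
    <diag>healthy</diag>
  </patfile>
</patients>"""


def tree_lines(element, depth=0):
    """Return one indented line per element, with its attributes and text."""
    text = (element.text or "").strip()
    attrs = "".join(f" {key}={value!r}" for key, value in element.attrib.items())
    lines = ["  " * depth + element.tag + attrs + (f": {text}" if text else "")]
    for child in element:
        lines.extend(tree_lines(child, depth + 1))
    return lines


root = ET.fromstring(PATIENTS_XML)
print("\n".join(tree_lines(root)))
```

Each level of indentation in the output corresponds to one level of nesting in the document, with patients as the single root.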


1. Introduction: what is XML? (5)

XML is structure:

strict separation structure/layout

self-describing data

style sheet required

XML: single source:


1. Introduction: what is XML? (6)

XML is multimedia:

MathML: mathematics

VoiceXML: speech

XML medical applications:

data exchange

medical record storage

electronic prescriptions

summary records


1. Introduction: what is XML? (7)

XML comes with a number of associated technologies:

data rendering/formatting via style sheets: Cascading Style Sheets (CSS), eXtensible Stylesheet Language Transformations (XSLT)

data structuring and integrity via data description

data processing via parsers (huge body of work available)

XForms

Some 93 (!) languages (MathML, XLink, XPath, ebXML, ...)


1. Introduction: what is XML? (8)

XML support:

In web browsers: great differences in supported features; Mozilla (open-source software) does the best job

Great variety of free tools available!

Java-based parsers

Active web sites (validation)


2. Syntax rules

Well-formed XML documents comply with the syntax rules:

XML declaration (processing-instruction syntax; other examples will follow): <?xml version="1.0"?>

element: <name>Frank Zappa</name>

attributes: <patient nr="99858201">

comments: <!-- Any text ... ... -->

2. Syntax rules (2)

Using elements:

<name>Frank Zappa</name>

start tag: <name>, data: Frank Zappa, end tag: </name>

<empty></empty> no data, but both tags are still required

<empty/> brief notation for an empty element


2. Syntax rules (3)

all elements MUST have a start and an end tag

tags are case sensitive (Tag differs from tag)

elements must be nested cleanly: <a><b> data </b></a>

an XML document has a single root element

the order of the elements counts!
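These rules can be tried out with any parser, since a conforming parser must reject a document that is not well formed. A minimal sketch, assuming Python's standard xml.etree.ElementTree module; the function name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<a><b> data </b></a>"))   # cleanly nested
print(is_well_formed("<a><b> data </a></b>"))   # overlapping tags
print(is_well_formed("<Tag>data</tag>"))        # Tag differs from tag
print(is_well_formed("<a>one</a><b>two</b>"))   # two root elements
print(is_well_formed("<empty/>"))               # brief notation
```

Only the first and last documents are accepted; the other three each break one of the rules above.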

2. Syntax rules (4)

Attributes add extra information:

<date lastcorrect="03Jan2002">Mon Oct 21 1999</date>

attributes have a name and a value

name: lastcorrect

value: 03Jan2002

the order of attributes is of no importance, except in the XML declaration:

<?xml version="1.0" encoding="UTF-8"?>


2. Syntax rules (5)

Special characters via the five predefined entity references:

character   meaning                        entity reference
<           less than (tag delimiter)      &lt;
>           greater than (tag delimiter)   &gt;
&           ampersand                      &amp;
"           double quote                   &quot;
'           apostrophe                     &apos;
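In practice the escaping is usually left to a library. A minimal sketch, assuming Python's standard xml.sax.saxutils and xml.etree.ElementTree modules; the sample text is made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

raw = "weight < 80 & height > 1.70"

# escape() replaces &, < and > by default.
escaped = escape(raw)
print(escaped)

# Quotes are only replaced when asked for explicitly.
quoted = escape('say "hi"', {'"': "&quot;"})
print(quoted)

# A parser turns the entity references back into characters.
parsed = ET.fromstring("<diag>" + escaped + "</diag>").text
print(parsed == raw, unescape(escaped) == raw)
```

The round trip shows that the entity references are only a notation: the application sees the original characters.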


2. Syntax rules (6)

CDATA section: between <![CDATA[ and ]]>

Can contain any character data except ]]>

Example:

<p>A sample XML code would be:
<![CDATA[
<?xml version="1.0" ?>
<patients> .... </patients>
]]>
</p>
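A parser hands the content of a CDATA section to the application as ordinary text, without interpreting the markup inside it. A minimal sketch, assuming Python's standard xml.etree.ElementTree module and a shortened version of the example above:

```python
import xml.etree.ElementTree as ET

doc = "<p>A sample XML code would be: <![CDATA[<patients> .... </patients>]]></p>"
p = ET.fromstring(doc)

# The CDATA content is part of the text of <p>, not a child element.
print(p.text)
print(len(p))  # number of child elements
```

The tags inside the section arrive as literal characters, and the p element has no children.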

2. Syntax rules (7)

The XML declaration: <?xml version="1.0" ... ?>

Useful but not absolutely required

If present: at the very start of the document (nothing, not even spaces, in front)

version="1.0": currently (2002) the only version (kept for backward compatibility)

encoding="UTF-8": other encodings are possible, e.g. US-ASCII, UTF-16, ISO-8859-1, ... (optional, default: UTF-8, or UTF-16 when signalled by a byte order mark)

standalone="yes": if "no", the application may need to read an external DTD, in another file (optional attribute, default: "no")

2. Syntax rules (8)

Names of elements and attributes:

must start with a letter or _ (underscore)

cannot start with the letters xml (in any combination of upper and lower case)

followed by any succession of: letters, digits, _ (underscores), - (minus signs), . (full stops)

the : (colon) is reserved for namespaces

spaces inside tags and between attributes: irrelevant

spaces in element content are kept
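The naming rules can be approximated with a regular expression. This is only a sketch in Python: it covers ASCII letters, whereas real XML names also allow a large set of Unicode letters, and it rejects the colon altogether. The names NAME_RE and is_plain_name are hypothetical.

```python
import re

# First character: letter or underscore; then letters, digits, _ . -
# (ASCII-only simplification of the XML name rules.)
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def is_plain_name(name):
    """Check a name against the simplified rules listed above."""
    if name.lower().startswith("xml"):
        return False  # names starting with xml are reserved
    return NAME_RE.fullmatch(name) is not None


for candidate in ["patfile", "_id", "birth-date.v2", "2nd", "xmlData", "a b"]:
    print(candidate, is_plain_name(candidate))
```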


2. Syntax rules (9)

Well formedness test:

Use an XML-capable browser (Netscape Navigator 6, Mozilla, IE5):


2. Syntax rules (10)

XPath:

An XML document is a tree structure (7 node types):

root node, element nodes, text nodes, attribute nodes, comment nodes, processing instruction nodes, namespace nodes

XPath does not distinguish CDATA sections: their content is treated as ordinary text


2. Syntax rules (11)

XPath: syntax

A location path identifies a set of nodes in a document

/ is the document root

// selects all descendants

.. selects the parent element

/patient/name/first

* wildcard

//patient[@born <= 1995 and @born >= 1990]

(used in xsl)
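Location paths can also be tried outside XSL. A minimal sketch, assuming Python's standard xml.etree.ElementTree module, which supports only a subset of XPath: paths are relative to the element they are called on, and range comparisons such as the one above are not available, so that filter is written in plain Python here. The patient data is made up for illustration.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<patients>
  <patient born="1988"><name><first>Ann</first></name></patient>
  <patient born="1992"><name><first>Bob</first></name></patient>
  <patient born="1995"><name><first>Cleo</first></name></patient>
</patients>""")

# Child steps, relative to the root element <patients>.
firsts = [e.text for e in doc.findall("patient/name/first")]

# .// selects all descendants, .. selects the parent, * is the wildcard.
patients = doc.findall(".//patient")
parents = doc.findall(".//first/..")
children = [e.tag for e in doc.findall("*")]

# Attribute predicate (equality only in this subset).
born_1992 = doc.findall(".//patient[@born='1992']")

# Range condition, done in Python instead of XPath.
in_range = [p for p in patients if 1990 <= int(p.get("born")) <= 1995]

print(firsts, len(patients), [p.tag for p in parents], children)
print([p.find("name/first").text for p in in_range])
```

A full XPath implementation, as found in XSLT processors, evaluates the range predicate directly.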


2. Syntax rules (12)

XLink: syntax

can be a simple point-to-point link (as in HTML): xlink:type="simple"

<course xmlns:xlink="http://www.w3.org/1999/xlink"
        xlink:type="simple"
        xlink:href="http://mnf.ac.be/course"> ...

several more types: extended, locator, arc, title, resource

several attributes: xlink:show, xlink:actuate
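An application reads XLink attributes like any other namespaced attribute. A minimal sketch, assuming Python's standard xml.etree.ElementTree module, which spells a qualified name as {namespace-URI}local-name; the element content "XML" and the closing tag are added here to make the fragment complete.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

course = ET.fromstring(
    '<course xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xlink:type="simple" '
    'xlink:href="http://mnf.ac.be/course">XML</course>'
)

# Attributes are looked up by namespace URI, not by the xlink: prefix.
link_type = course.get(f"{{{XLINK}}}type")
href = course.get(f"{{{XLINK}}}href")
print(link_type, href)
```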


3. XML Schema

Structuring XML:

define names for elements and attributes

Imposes order in which elements and children appear

Schema

elements

attributes

entities

3. XML Schema (2)

Syntax:

XML Schema (W3C): XML Schema Definition Language (XSD)

3. XML Schema (9)

Validation: checking that an XML document conforms to its XSD

XMLSPY and other validating parsers (xmllint)

http://www.stg.brown.edu/service/xmlvalid/

3. XML Schema (10)

Graphical representation

3. XML Schema (11)

Summary

well-formed XML: complies with all syntax rules

valid XML document: well formed and complies with its XML Schema

use tools such as XML editors and validating parsers

3. XML Schema (12)

Namespaces

distinguish between elements/attributes sharing same name

group related elements/attributes from 1 XML application for processing software

3. XML Schema (13)

Namespaces (2)

a namespace is defined by a Uniform Resource Identifier (URI)

looks like a URL but is just an identifier!

suppose you work with 2 XML docs in 1 app

<title>XML Course</title> and <title>Great Student</title>

to distinguish: associate each with a different namespace

3. XML Schema (14)

Namespaces (3)

<crs:courselist xmlns:crs="http://www.docarch.be/crs"
                xmlns:stu="http://www.docarch.be/stu">
  <crs:title>XML</crs:title>
  <stu:name>John</stu:name>
  <stu:title>distinguished</stu:title>
</crs:courselist>

Prefix not necessary for 1 (default) namespace
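Processing software tells the two title elements apart by their namespace URI; the prefix itself is arbitrary. A minimal sketch, assuming Python's standard xml.etree.ElementTree module; the short prefixes c and s in the lookup table are chosen here on purpose to differ from those in the document.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<crs:courselist xmlns:crs="http://www.docarch.be/crs"
                xmlns:stu="http://www.docarch.be/stu">
  <crs:title>XML</crs:title>
  <stu:name>John</stu:name>
  <stu:title>distinguished</stu:title>
</crs:courselist>""")

# Prefix-to-URI table used only for the lookups below.
ns = {"c": "http://www.docarch.be/crs", "s": "http://www.docarch.be/stu"}

course_title = doc.find("c:title", ns).text
student_title = doc.find("s:title", ns).text

print(doc.tag)  # the parser stores names as {URI}local-name
print(course_title, student_title)
```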

3. XML Schema (16)

XSD (XML Schema Definition):

<xs:element>

attributes: name; minOccurs: 0 .. x; maxOccurs: 0 .. unbounded

<xs:simpleType>, <xs:complexType>, <xs:sequence>

3. XML Schema (17)

XSD: example

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="patientlist">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- assumed: a declaration for the referenced "name" element and the closing tag, both omitted on the slide -->
  <xs:element name="name" type="xs:string"/>
</xs:schema>

3. XML Schema (18)

XSD self-defined data types:

<xs:simpleType name="date">
  <xs:restriction base="xs:date"/>
</xs:simpleType>


4. XML document formats

Different stylesheet methods

CSS: Cascading Style Sheets: simple instructions determine layout, fonts, colors (CSS levels 1 and 2)

XSLT (eXtensible Stylesheet Language Transformations)

also: XHTML, a strict form of HTML, in three variants

strict:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">

transitional:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "DTD/xhtml1-transitional.dtd">

frameset:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "DTD/xhtml1-frameset.dtd">


4. XML document formats (2)

CSS: simplest, most supported by browsers

not XML related

XSL-T: more general, more complex

fully XML-related

4. XML document formats (3)

DocBook: (http://www.oasis-open.org/docbook/)

XML/SGML vocabulary for books/papers

Technical Committee

DocBook schema

Document Type Definition (DTD)

<!DOCTYPE book PUBLIC "-//Norman Walsh//DTD DocBk XML V4.2.1//EN" "docbook/docbookx.dtd">

Annex activities

Lets you write books in XML, then use tools to produce different output styles


4. XML document formats (4)

The Dublin Core (http://purl.org/dc/)

Minimal set of 15 publication items:

Title, Author or Creator, Subject and Keywords, Description, Publisher, Contributor, Type, Format, Resource Identifier, Date, Source, Language, Relation, Coverage, Rights Management


5. Extensible Stylesheet Language (XSL)


XSL is 'client' based

XML technology (vs. CSS: non-XML)

Two instances:

XSL-T: Transformations

XSL-FO: Formatting Objects

5. Extensible Stylesheet Language (XSL) (2)

5. Extensible Stylesheet Language (XSL) (3)

XSL components:

XPath: XML path, references nodes for processing

XSL-T: XSL Transformations

- to transform from XML to XML

- to produce a data presentation document

XSL-FO: Formatting Objects, to produce documents


6. Cascading Stylesheets (CSS)


Syntax: (non-XML)

element-match {formatting-item: value; ........ }

* {font-size: large} set a large font for all elements

patient[nr="12345"] {display: none} select on attribute value

diag {display: block; text-align: center} text block display

diag {display: list-item} list (bullets or not)

diag {display: table} start a table; children become rows and cells


6. Cascading Stylesheets (CSS) (2)

/* Defaults for whole doc */
patients {font-family: "Times New Roman"; font-size: 18pt}

/* name as header */
name {display: block; text-align: center; font-size: 36pt; font-weight: bold}

/* bdate as list */
day {display: list-item; list-style-type: decimal}
month {display: list-item; list-style-type: decimal}
year {display: list-item; list-style-type: decimal}

/* diagnosis */
diag {display: block; text-align: center; font-size: 22pt}

6. Cascading Stylesheets (CSS) (3)

Example: XML file

<?xml version="1.0" standalone="yes"?>
<?xml-stylesheet type="text/css" href="patient.css"?>
<patients>
  <patfile nr="A952345">
    <name>Frank Doo</name>
    <bdate><day>23</day> <month>05</month> <year>1958</year></bdate>
    <address>Long Street 15, Hightown</address>
    <diag>Jan 2002: healthy</diag>
    <diag>May 2002: flu</diag>
    <diag>July 2002: pain in the back</diag>
  </patfile>
</patients>

6. Cascading Stylesheets (CSS) (4)

Example: patient.css

/* Defaults for whole document */
patients {font-family: "Times New Roman"; font-size: 22pt}

/* name as header */
name {display: block; text-align: center; font-size: 30pt; font-weight: bold}

/* bdate as block */
bdate {display: block}
day {color: blue}
month {color: green}
year {color: blue}

/* address */
address {display: block; font-style: italic}

/* diagnosis */
diag {display: list-item; text-indent: 2cm; list-style-position: inside; font-size: 18pt; color: red}


6. Cascading Stylesheets (CSS) (5)

Result: xml file, together with patient.css stylesheet


7. XSL Formatting Objects (XSL-FO)


Documents consist of boxes

Block areas, inline areas, line areas, glyph areas

Master pages define margins and dimensions

XSLT-like syntax defines 'format processing'


8. XML Data formatting


Data formatting options:

client-side: XSL-T

server-side:

DOM (Document Object Model)

SAX (Simple API for XML)


8. XML Data formatting (2)

General model for data processing:


9. Document Object Model (DOM)

W3C recommendation: model to store hierarchical documents in memory

the whole document is in memory, we have random access

ideal for document editing, data retrieval, navigation

disadvantage 1: speed

disadvantage 2: memory consumption

9. Document Object Model (DOM) (2)

Document structure:

9. Document Object Model (DOM) (3)

DOM nodes:

document: parent of all nodes

elements: have child elements, text nodes and attributes

attributes

comment

CDATA: not parsed

processing instructions

document fragments

other types: entities, entity references, notations
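The node types and the random access they allow can be seen with a small DOM program. A minimal sketch, assuming Python's standard xml.dom.minidom module (a lightweight DOM implementation; the slides do not prescribe one). The added "flu" diagnosis is made up for illustration.

```python
from xml.dom import minidom

# The whole document is parsed into an in-memory tree.
doc = minidom.parseString(
    '<patients><!-- list --><patfile nr="A952345">'
    "<name>Frank Doo</name><diag>healthy</diag></patfile></patients>"
)

root = doc.documentElement
patfile = root.getElementsByTagName("patfile")[0]

# Navigation: text lives in a text node below the element node.
name_text = patfile.getElementsByTagName("name")[0].firstChild.data

# Children of the root: one comment node and one element node.
node_types = [child.nodeType for child in root.childNodes]

# Editing: add a second diagnosis to the tree already in memory.
new_diag = doc.createElement("diag")
new_diag.appendChild(doc.createTextNode("flu"))
patfile.appendChild(new_diag)
diags = [d.firstChild.data for d in patfile.getElementsByTagName("diag")]

print(root.tagName, patfile.getAttribute("nr"), name_text, diags)
```

Because the complete tree stays in memory, the edit and the later query need no second pass over the source, which is the trade-off against speed and memory mentioned for DOM.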


10. Simple API for XML (SAX)

Event-based:

the parser reads through the document; when an event matches, the corresponding action is called

Non-official (de facto) standard

good speed

minimal memory requirement

platform independent: Java


10. Simple API for XML (SAX) (2)

Methodology:

write an appropriate event handler

get a SAX parser

link the event handler to the parser

parse and process as events are triggered
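The four steps above can be sketched as follows. The slides present SAX in a Java context; this sketch assumes Python's standard xml.sax module instead, and the handler class DiagnosisHandler and the sample data are hypothetical.

```python
import io
import xml.sax


class DiagnosisHandler(xml.sax.ContentHandler):
    """Step 1: an event handler that collects the text of every <diag>."""

    def __init__(self):
        super().__init__()
        self.in_diag = False
        self.buffer = []
        self.diagnoses = []

    def startElement(self, name, attrs):
        if name == "diag":
            self.in_diag = True
            self.buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so it is buffered.
        if self.in_diag:
            self.buffer.append(content)

    def endElement(self, name):
        if name == "diag":
            self.diagnoses.append("".join(self.buffer))
            self.in_diag = False


XML_BYTES = (
    b"<patients><patfile nr='A952345'><name>Frank Doo</name>"
    b"<diag>healthy</diag><diag>flu</diag></patfile></patients>"
)

parser = xml.sax.make_parser()        # step 2: get a SAX parser
handler = DiagnosisHandler()
parser.setContentHandler(handler)     # step 3: link handler and parser
parser.parse(io.BytesIO(XML_BYTES))   # step 4: events fire while parsing

print(handler.diagnoses)
```

Only the current diagnosis text is kept in memory at any time, which is why this style suits large documents.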


11. The Broader view: semantic web


11. The Broader view: semantic web (2)

The semantic web: extension of the current web, in which information is given well-defined meaning, better enabling computers and people to work in cooperation (W3C)

searching: works far better when semantic data is available

keywords alone are too weak

HTML meta tags <meta content="diagnosis">

ad-hoc and insufficient

RDF: Resource Description Framework

11. The Broader view: semantic web (3)

RDF: Resource Description Framework

XML encoding for resources

Each Description element contains: an about attribute with a URI

Children: properties of the resource

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description about="http://mnf.ac.be/xml/">
    <author>Marc Nyssen</author>
    <coursetype>lecture</coursetype>
  </rdf:Description>
</rdf:RDF>
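Read this way, each Description yields statements of the form (resource, property, value). A minimal sketch that extracts them with Python's standard xml.etree.ElementTree module; it is a plain XML reading of the example above, not a full RDF parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

doc = ET.fromstring("""
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description about="http://mnf.ac.be/xml/">
    <author>Marc Nyssen</author>
    <coursetype>lecture</coursetype>
  </rdf:Description>
</rdf:RDF>""")

triples = []
for description in doc.findall(f"{{{RDF_NS}}}Description"):
    resource = description.get("about")      # the URI being described
    for prop in description:                 # each child is one property
        triples.append((resource, prop.tag, prop.text))

for triple in triples:
    print(triple)
```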


12. Examples

patients+dtd.xml

<?xml version="1.0"?>
<!DOCTYPE patients [
  <!ELEMENT patients (patfile*)>
  <!ELEMENT patfile (name+, bdate+, diag+)>
  <!ATTLIST patfile nr ID #REQUIRED>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT bdate (day+, month+, year+)>
  <!ELEMENT day (#PCDATA)>
  <!ELEMENT month (#PCDATA)>
  <!ELEMENT year (#PCDATA)>
  <!ELEMENT diag (#PCDATA)>
]>
<patients>
  <patfile nr="A952345">
    <name>Frank Doo</name>
    <bdate>
      <day>23</day> <month>05</month> <year>1958</year>
    </bdate>
    <diag>healthy</diag>
  </patfile>
</patients>

12. Examples (2)

patient.css

/* Defaults for whole doc */
patients {font-family: "Times New Roman"; font-size: 22pt}

/* name as header */
name {display: block; text-align: center; font-size: 30pt; font-weight: bold}

/* bdate as block */
bdate {display: block}
day {color: blue}
month {color: green}
year {color: blue}

/* address */
address {display: block; font-style: italic}

/* diagnosis */
diag {display: list-item; text-indent: 2cm; list-style-position: inside; font-size: 18pt; color: red}

12. Examples (3)

pat-css.xml

<?xml version="1.0" standalone="yes"?>
<?xml-stylesheet type="text/css" href="patient.css"?>
<patients>
  <patfile nr="A952345">
    <name>Frank Doo</name>
    <bdate>
      <day>23</day> <month>05</month> <year>1958</year>
    </bdate>
    <address>Long Street 15, Hightown</address>
    <diag>Jan 2002: healthy</diag>
    <diag>May 2002: flu</diag>
    <diag>July 2002: pain in the back</diag>
  </patfile>
</patients>

12. Examples (4)

xsltproc patients-xslt.xml > patients-xslt.html

<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="./patients.xsl" ?>
<patients>
  <patient>
    <number>0599123123</number>
    <name>John Doo</name>
    <diag>healthy</diag>
  </patient>
  <patient>
    <number>0479123123</number>
    <name>Jane Bee</name>
    <diag>flu</diag>
  </patient>
  <patient>
    <number>2469523729</number>
    <name>Louise Three</name>
    <diag>pregnant</diag>
  </patient>
</patients>

12. Examples (5)

patients.xsl

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <html><body><h1>Patient list</h1><p></p>
      <table border="3">
        <xsl:apply-templates select="//patients"/>
      </table>
    </body></html>
  </xsl:template>

  <xsl:template match="patient">
    <tr>
      <td><B>Patient name: </B></td>
      <td><xsl:value-of select="name"/></td>
      <td><B>Number: </B></td>
      <td><xsl:value-of select="number"/></td>
      <td><B>Diagnosis: </B></td>
      <td><xsl:value-of select="diag"/></td>
    </tr>
  </xsl:template>

</xsl:stylesheet>
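To see what the two templates do, the same transformation can be imitated by hand. This is not an XSLT processor: it is a sketch in Python with the standard xml.etree.ElementTree module, in which one function plays the role of each template. The function names patient_row and to_html are hypothetical, and the input is shortened to two patients.

```python
import xml.etree.ElementTree as ET

PATIENTS = """<patients>
  <patient><number>0599123123</number><name>John Doo</name><diag>healthy</diag></patient>
  <patient><number>0479123123</number><name>Jane Bee</name><diag>flu</diag></patient>
</patients>"""


def patient_row(patient):
    """Counterpart of the template matching "patient": one table row."""
    row = ET.Element("tr")
    for label, field in (("Patient name: ", "name"),
                         ("Number: ", "number"),
                         ("Diagnosis: ", "diag")):
        label_cell = ET.SubElement(row, "td")
        ET.SubElement(label_cell, "b").text = label
        ET.SubElement(row, "td").text = patient.findtext(field)
    return row


def to_html(source):
    """Counterpart of the template matching "/": the page skeleton."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = "Patient list"
    table = ET.SubElement(body, "table", border="3")
    for patient in source.iter("patient"):
        table.append(patient_row(patient))
    return ET.tostring(html, encoding="unicode", method="html")


page = to_html(ET.fromstring(PATIENTS))
print(page)
```

The stylesheet expresses the same two rules declaratively, so the XSLT processor, not the author, decides how the tree is walked.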

12. Examples (6)

Resulting HTML output: patients-xslt.html

<html><body><h1>Patient list</h1><p><table border="3">
<tr><td><B>Patient name: </B></td><td>John Doo</td><td><B>Number: </B></td><td>0599123123</td><td><B>Diagnosis: </B></td><td>healthy</td></tr>
<tr><td><B>Patient name: </B></td><td>Jane Bee</td><td><B>Number: </B></td><td>0479123123</td><td><B>Diagnosis: </B></td><td>flu</td></tr>
<tr><td><B>Patient name: </B></td><td>Louise Three</td><td><B>Number: </B></td><td>2469523729</td><td><B>Diagnosis: </B></td><td>pregnant</td></tr>
</table>
</body></html>


13. References

XML in a Nutshell, Elliotte Rusty Harold, W. Scott Means, O'Reilly, Jan 2001, ISBN 0-596-00058-8

XML Specification Guide, Ian S. Graham, Liam Quin, Wiley, 1999, ISBN 0-471-32753-0

Learning XML (Creating Self-Describing Data), Erik T. Ray, O'Reilly, 2001, ISBN 0-596-00046-4

XML Cursus (Technologisch Instituut KVIV 2001-2002), Erik Duval, Bert Paepen (Departement Computerwetenschappen, KUL)


13. References (2)

http://www.w3.org/, http://www.w3.org/XML/ the reference for XML

http://www.stg.brown.edu/service/xmlvalid/ XML validation form

Namespaces FAQ http://www.rpbourret.com/xml/NamespacesFAQ.htm

Docbook: http://www.docbook.org ... and others (http://www.oasis-open.org)

The Dublin Core (http://purl.org/dc/)

Specialized XML sites: http://www.xml.org http://www.oasis-open.org/cover

XML encryption: http://www.w3.org/Encryption/2001

XML signatures: http://www.w3.org/signature

XML tutorials: http://www.xml101.com/xml/default.asp

13. References (3)

Apache project: http://xml.apache.org/

The goals of the Apache XML Project are:

* to provide commercial-quality standards-based XML solutions that are developed in an open and cooperative fashion,

* to provide feedback to standards bodies (such as IETF and W3C) from an implementation perspective, and

* to be a focus for XML-related activities within Apache projects

The Apache XML Project currently consists of the following sub-projects, each focused on a different aspect of XML:

* Xerces - XML parsers in Java, C++ (with Perl and COM bindings)

* Xalan - XSLT stylesheet processors, in Java and C++

* Cocoon - XML-based web publishing, in Java

* AxKit - XML-based web publishing, in mod_perl

* FOP - XSL formatting objects, in Java

* Xang - Rapid development of dynamic server pages, in JavaScript

* SOAP - Simple Object Access Protocol

* Batik - A Java based toolkit for Scalable Vector Graphics

* Crimson - A Java XML parser derived from the Sun Project X Parser.