IT Strategy, IBS, Technology & Solutions 1 ian.graham@bmo.com / 416.513.5656 XML 101: A Technical Introduction to XML 20 November 2002 Bank of Montreal Database Users Group Ian GRAHAM IT Strategy, IBS, Technology and Solutions, BMO Financial Group E: <ian.graham@bmo.com> T: (416) 513.5656 / F: (416) 513.5590 To download this talk: http://www.utoronto.ca/ian/talks/


XML 101: A Technical Introduction to XML

20 November 2002, Bank of Montreal Database Users Group

Ian GRAHAM

IT Strategy, IBS, Technology and Solutions, BMO Financial Group

E: <ian.graham@bmo.com>

T: (416) 513.5656 / F: (416) 513.5590

To download this talk: http://www.utoronto.ca/ian/talks/


Presentation Outline

1. What is XML (basic introduction)

2. Defining language dialects and constraints
– DTDs, namespaces, and schemas

3. XML processing
– Parsers and parser interfaces; XML processing tools

4. XML databases
– High-level issues, and references

5. XML messaging / web services
– Why, and some issues/example

6. Conclusions


What is XML?

A base-level syntax
– For encoding structured, text-based information (words, characters, ...)

A text-based syntax
– XML is written using printable Unicode characters. Explicit binary data is not allowed

Supports extensible data formats
– XML lets you define your own elements (essentially data types), within the constraints of the syntax rules

Designed as a universal format
– The syntax rules ensure that all XML processing software MUST identically handle a given piece of XML data.

If you can read and process it, so can anybody else


XML: A Simple Example

<?xml version="1.0" encoding="iso-8859-1"?>
<partorders xmlns="http://myco.org/Spec/partorders">
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    <desc> Gold sprockel grommets, with matching hamster </desc>
    <part number="23-23221-a12" />
    <quantity units="gross"> 12 </quantity>
    <deliveryDate date="27aug1999-12:00h" />
  </order>
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    . . . Order something else . . .
  </order>
</partorders>

The first line is the XML declaration (“this is XML”); it also flags the character encoding used in the file.

Slide colour key: black marks XML tags and markup, blue marks the encoded text data.


Example Revisited

<partorders xmlns="http://myco.org/Spec/partorders">
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    <desc> Gold sprockel grommets, with matching hamster </desc>
    <part number="23-23221-a12" />
    <quantity units="gross"> 12 </quantity>
    <deliveryDate date="27aug1999-12:00h" />
  </order>
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    . . . Order something else . . .
  </order>
</partorders>

Slide callouts: tags; an element; an attribute of the quantity element (units="gross").

Hierarchical, structured data


XML Data Model - A Tree

<partorders xmlns="...">
  <order date="..." ref="...">
    <desc> ..text.. </desc>
    <part />
    <quantity />
    <delivery-date />
  </order>
  <order ref=".." .../>
</partorders>

The same document drawn as a tree:

partorders (xmlns=)
  order (date=, ref=)
    desc
      text
    part
    quantity
    delivery-date
  order (date=, ref=)
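The tree view can be tried out in a few lines of code. This is a minimal sketch, assuming Python's standard-library xml.etree.ElementTree (the talk itself does not name a Python toolkit); the helper names local_name and outline are hypothetical, and the document is a trimmed copy of the part-orders example.

```python
import xml.etree.ElementTree as ET

DOC = """<partorders xmlns="http://myco.org/Spec/partorders">
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    <desc>Gold sprockel grommets, with matching hamster</desc>
    <part number="23-23221-a12" />
    <quantity units="gross">12</quantity>
    <deliveryDate date="27aug1999-12:00h" />
  </order>
</partorders>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def outline(elem, depth=0):
    """Return one indented line per element, listing its attribute names."""
    attrs = ", ".join(sorted(elem.attrib))
    label = local_name(elem.tag) + (f" [{attrs}]" if attrs else "")
    lines = ["  " * depth + label]
    for child in elem:  # child elements, in document order
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(DOC)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

Note that ElementTree treats xmlns as a namespace declaration rather than an ordinary attribute, so it does not show up in the attribute list of partorders.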


XML: Design goals

Simple but reliable
– Strict syntax rules, to eliminate syntax errors
– Syntax defines structure (hierarchically), and names structural parts (element names) -- it is self-describing data

Extensible and ‘mixable’
– Can create your own language of tags/elements
– Can mix one language with another, and still reliably separate / process the data

Designed for a distributed environment
– Can have remote (‘webbed’) data, and retrieve and use it reliably


XML Processing: The XML Parser

XML data -> XML parser -> parser interface -> XML-based application

The parser must verify that the XML is syntactically correct
– Such data is said to be well-formed
– The minimal requirement to “be” XML

A parser MUST stop processing if the data isn’t well-formed
– E.g., stop processing and “throw an exception” to the XML-based application. The XML 1.0 spec requires this behaviour
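The “stop and throw an exception” behaviour is easy to observe. The sketch below assumes Python's standard-library xml.etree.ElementTree as the parser; is_well_formed is a hypothetical helper name, not part of any library.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text, False if it raises."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # The parser stopped at the first well-formedness error.
        return False
    return True


good = "<order><part number='23-23221-a12'/></order>"
bad = "<order><part number='23-23221-a12'></order>"  # <part> is never closed

print(is_well_formed(good), is_well_formed(bad))
```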


Special Issues: Characters and Charsets

The XML specification defines the characters allowed as whitespace in tags: <element id = "23.112" />

You cannot use the EBCDIC character ‘NEL’ as whitespace
– Must make sure not to do so!

What if you want to include characters not defined in the encoding charset (e.g., Greek characters in an ISO-Latin-1 document)?
– Use character references. For example: &#9824; -- the spades character (♠)
– This is character number 9824 (decimal) in the Unicode character set

Also, a reminder that binary data is forbidden
– It must be encoded as printable characters (e.g. using Base64)
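Both points can be illustrated briefly. This is a sketch under assumptions: it uses Python's standard-library xml.etree.ElementTree and base64 modules, and the element names suit and blob are made up for the illustration.

```python
import base64
import xml.etree.ElementTree as ET

# A character reference is resolved by the parser into the real character.
suit = ET.fromstring("<suit>&#9824;</suit>")
spade = suit.text  # the single character U+2660

# Binary data has to be turned into printable characters first.
payload = bytes([0, 255, 16, 128])  # arbitrary non-text bytes
blob = ET.Element("blob")
blob.text = base64.b64encode(payload).decode("ascii")
xml_text = ET.tostring(blob, encoding="unicode")

# The receiver parses the XML and decodes the Base64 text back to bytes.
recovered = base64.b64decode(ET.fromstring(xml_text).text)
print(ord(spade), xml_text, recovered == payload)
```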


Parsers and DTDs

XML data -> parser -> parser interface -> XML-based application (the parser also reads the DTD)

– A DTD can define external parts (entities) to be ‘included’ in the document
– But … what if the parser can’t find the external parts (firewall?)?
– That depends on the type: there are two types of XML parsers
• one that MUST retrieve all parts
• one that can ignore them (if it can’t find them)


Two types of XML parsers

Validating
– Must retrieve all entities and process all of the DTD. Will stop processing and indicate a failure if it cannot
– It must also test and verify other things in the DTD -- instructions that define syntactic document rules (allowed elements, attributes, etc.).

Non-validating (well-formed only)
– Tries to retrieve all ‘parts’, but will cease processing the DTD content at the first part (entity) it can’t find
– But this is not an error -- the parser simply makes available the XML data (and the names of any unresolved ‘parts’) to the application.

Application behavior will depend on parser type

Many parsers can operate in either mode (a configuration setting)
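The difference shows up as soon as a document breaks its own DTD. A minimal sketch, assuming Python's standard-library xml.etree.ElementTree, which sits on the non-validating Expat parser; the element name somethingElse is invented for the illustration.

```python
import xml.etree.ElementTree as ET

# The internal DTD says <transfers> may only contain <fundsTransfer> elements,
# but the document body contains something else entirely.
DOC = """<!DOCTYPE transfers [
  <!ELEMENT transfers (fundsTransfer)+ >
]>
<transfers><somethingElse/></transfers>"""

# A non-validating parser only checks well-formedness, so this succeeds.
root = ET.fromstring(DOC)
child_tags = [child.tag for child in root]
print(child_tags)
```

A validating parser would be expected to reject the same input, so an application that relies on DTD rules needs to know which kind of parser it has been given.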


Defining constraints / languages

Two ways of doing so:
– XML Document Type Definition (DTD) -- Part of core XML spec.
– XML Schema (often called XSD) -- New specification (2001), which allows for richer constraints on XML documents.

What DTDs and/or schemas specify:
– Allowed element and attribute names, hierarchical nesting rules; element content/type restrictions

Adding dialect specifications implies two classes of XML data
– Well-formed: XML that is syntactically correct
– Valid: XML that is well-formed and consistent with a specific DTD (or Schema)

Schemas are more powerful than DTDs
– Often used for type validation, or for defining low-level type constraints (integer, varchar, datetime, etc.) on values.


DTD Example

<!DOCTYPE transfers [
  <!ELEMENT transfers (fundsTransfer)+ >
  <!ELEMENT fundsTransfer (from, to) >
  <!ATTLIST fundsTransfer date CDATA #REQUIRED>
  <!ELEMENT from (amount, transitID?, accountID, acknowledgeReceipt ) >
  <!ATTLIST from type (intrabank|internal|other) #REQUIRED>
  <!ELEMENT amount (#PCDATA) >
  . . . Omitted DTD content . . .
  <!ELEMENT to EMPTY >
  <!ATTLIST to account CDATA #REQUIRED>
]>
<transfers>
  <fundsTransfer date="20010923T12:34:34Z">
    . . . As with previous example . . .


XML Namespaces

Mechanism for identifying different “spaces” for XML names
– That is, element or attribute names

This is a way of identifying different language dialects, consisting of names that have specific semantic (and processing) meanings.

For example, <key/> in one language (e.g. a security key) can be distinguished from <key/> in another language (a database key)

Mechanism uses a special xmlns attribute to define namespaces.
– The namespace is a URL string
– But the URL does not reference anything in particular (there may be nothing there!)
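The two kinds of <key/> can be told apart in code. This is a sketch, assuming Python's standard-library xml.etree.ElementTree; the two namespace URLs under example.org are hypothetical and, as noted above, need not point at anything.

```python
import xml.etree.ElementTree as ET

DOC = """<doc xmlns="http://example.org/security"
     xmlns:db="http://example.org/database">
  <key>security key</key>
  <db:key>database key</db:key>
</doc>"""

root = ET.fromstring(DOC)

# ElementTree expands each name to '{namespace}localname', so the two
# <key> elements end up with different tags.
tags = [child.tag for child in root]

# Lookups use a prefix map chosen by the application; the prefixes here
# do not have to match the ones used in the document.
ns = {"sec": "http://example.org/security", "db": "http://example.org/database"}
sec_key = root.find("sec:key", ns).text
db_key = root.find("db:key", ns).text
print(tags, sec_key, db_key)
```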


Mixing languages together

Namespaces let you do this relatively easily:

<?xml version="1.0" encoding="utf-8" ?>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:mt="http://www.w3.org/1998/Math/MathML">
<head>
  <title> Title of XHTML Document </title>
</head>
<body>
<div class="myDiv">
  <h1> Heading of Page </h1>
  <mt:math> . . . MathML markup . . . </mt:math>
  <p> more html stuff goes here </p>
</div>
</body>
</html>

Default ‘space’ is xhtml

mt: prefix indicates ‘space’ mathml (a different language)


XML Schemas

A specification for defining XML validation rules
– Specs: http://www.w3.org/XML/Schema
– Best-practice: http://www.xfront.com/BestPracticesHomepage.html

Uses pure XML (plus namespaces) to do this

More powerful than DTDs - can specify things like integer types, date strings, real numbers in a given range, etc.

Often used for type validation, or for relating database schemas to XML models

They don’t, however, let you declare entities -- those can only be done in DTDs

The following slide shows the XML schema equivalent to our DTD


XML Schema version of our DTD (Portion)

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="accountID" type="xs:string"/>
  <xs:element name="acknowledgeReceipt" type="xs:string"/>
  <xs:complexType name="amountType">
    <xs:simpleContent>
      <xs:restriction base="xs:string">
        <xs:attribute name="currency" use="required">
          <xs:simpleType>
            <xs:restriction base="xs:NMTOKEN">
              <xs:enumeration value="USD"/>
              . . . (some stuff omitted) . . .
            </xs:restriction>
          </xs:simpleType>
        </xs:attribute>
      </xs:restriction>
    </xs:simpleContent>
  </xs:complexType>
  <xs:complexType name="fromType">
    <xs:sequence>
      <xs:element name="amount" type="amountType"/>
      <xs:element ref="transitID" minOccurs="0"/>
      <xs:element ref="accountID"/>
      <xs:element ref="acknowledgeReceipt"/>
    </xs:sequence>
    . . . And still more !!! . . .


XML Software

XML parsers
– Read in XML data, check for syntactic (and possibly DTD/Schema) constraints, and make the data available to an application. There are several 'generic' parser APIs:
• SAX: Simple API for XML (event-based)
• DOM: Document Object Model (object/tree based)
• JDOM: Java Document Object Model (object/tree based)
• Pull: evolving API (new) (pull-based / object + tree)

– Lots of XML parsers and interface software available
• Unix, Linux, Windows 2000/XP, z/OS, etc.

– SAX-based parsers are fast (often as fast as you can stream data)

– DOM is slower and more memory intensive (creates an in-memory version of the entire document)

– Validating can be much slower than non-validating


Parser API: SAX

A) SAX: Simple API for XML
– http://www.megginson.com/SAX/index.html
– An event-based interface (a push parser API)
– Parser reports events whenever it sees a tag/attribute/text node/unresolved external entity/other (driven by input stream)
– Programmer attaches “event handlers” to handle the event

Advantages
– Simple to use
– Very fast (not doing very much before you get the tags and data)
– Low memory footprint (doesn’t read an XML document entirely into memory)

Disadvantages
– Not doing very much for you -- you have to do everything yourself
– Not useful if you have to dynamically modify the document once it’s in memory (since you’ll have to do all the work to put it in memory yourself!)
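The event-handler style looks roughly like this. A minimal sketch, assuming Python's standard-library xml.sax module as the SAX implementation; the handler class OrderHandler and the sample document are made up for the illustration.

```python
import xml.sax


class OrderHandler(xml.sax.ContentHandler):
    """Records each event the parser pushes at us."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Called by the parser when it sees an opening tag.
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        # Called by the parser when it sees a closing tag.
        self.events.append(("end", name))

    def characters(self, content):
        # Called for text; whitespace-only chunks are skipped here.
        if content.strip():
            self.events.append(("text", content.strip()))


DOC = b"<order ref='x23'><quantity units='gross'>12</quantity></order>"
handler = OrderHandler()
xml.sax.parseString(DOC, handler)
for event in handler.events:
    print(event)
```

Nothing is kept in memory unless the handler keeps it, which is both the low-footprint advantage and the “do everything yourself” disadvantage listed above.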


Parser API: DOM

B) DOM: Document Object Model
– http://www.w3.org/DOM/
– An object-based interface
– Parser generates an in-memory tree corresponding to the document
– DOM interface defines methods for accessing and modifying the tree

Advantages
– Very useful for dynamic modification of, access to the tree
– Useful for querying (i.e. looking for data) that depends on the tree structure [element.childNode("2").getAttributeValue("boobie")]
– Same interface for many programming languages (C++, Java, ...)

Disadvantages
– Can be slow (needs to produce the tree), and may need lots of memory
– DOM programming interface is a bit awkward, not terribly object oriented
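Access by tree position and in-place modification can be sketched as follows, assuming Python's standard-library xml.dom.minidom as the DOM implementation; the sample document and the added note element are invented for the illustration.

```python
from xml.dom import minidom

DOC = ("<order ref='x23'><part number='23-23221-a12'/>"
       "<quantity units='gross'>12</quantity></order>")

dom = minidom.parseString(DOC)  # the whole document is now an in-memory tree
order = dom.documentElement

# Query by position in the tree: the second child of <order>.
second_child = order.childNodes[1]
units = second_child.getAttribute("units")

# Modify the tree in place, then add a new element with a text node.
second_child.setAttribute("units", "dozen")
note = dom.createElement("note")
note.appendChild(dom.createTextNode("rush"))
order.appendChild(note)

result = order.toxml()
print(units, result)
```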


DOM Parser Processing Model

XML data -> parser -> DOM (Document “object”) -> parser interface -> application

The Document “object” holds the tree:

partorders
  order
    desc
      text
    part
    quantity
    delivery-date
  order


Parser API: JDOM

B2) JDOM: Java Document Object Model
– http://www.jdom.org
– A Java-specific object-oriented interface
– Parser generates an in-memory tree corresponding to the document
– JDOM interface has methods for accessing and modifying the tree

Advantages
– Very useful for dynamic modification of the tree
– Useful for querying (i.e. looking for data) that depends on the tree structure
– Much nicer Object Oriented programming interface than DOM

Disadvantages
– Can be slow (make that tree...), and can take up lots of memory
– New, and not entirely cooked (but close)
– Only works with Java


Parser API: Pull

C) Pull Interfaces
– http://www.xmlpull.org/ (Java); there is also a .NET pull API
– A pull-parser interface
– API uses expressions / methods to ‘pull’ specific chunks of XML data, or to iterate over the XML
– Can be built on top of a DOM model

Advantages
– Easier to write applications that need to read in and process XML data (‘easier’ model than a push API, in many cases)
– Has proven a very popular component in the .NET toolkit

Disadvantages
– Can be slow if you do lots of iteration over the XML input data
– No common API across different languages (although xmlpull.org tries to be similar to the .NET API); not yet a ‘real’ standard (still being worked on; not part of most commercial environments)
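To show the flavour of a pull API, here is a sketch that assumes Python's standard-library XMLPullParser (in xml.etree.ElementTree) as a stand-in; it is not the xmlpull.org or .NET API named above, and the chunked sample input is made up.

```python
import xml.etree.ElementTree as ET

# Input arrives in pieces, e.g. from a network stream.
CHUNKS = [
    "<partorders><order ref='a1'><quantity>12</quantity></order>",
    "<order ref='b2'/></partorders>",
]

parser = ET.XMLPullParser(events=("start", "end"))
for chunk in CHUNKS:
    parser.feed(chunk)
parser.close()

# The application asks for events when it wants them (pull), instead of
# the parser calling handlers as it goes (push, as in SAX).
order_refs = []
for event, elem in parser.read_events():
    if event == "end" and elem.tag == "order":
        order_refs.append(elem.get("ref"))

print(order_refs)
```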


XML Processing: XSLT

D) XSLT: eXtensible Stylesheet Language -- Transformations
– http://www.w3.org/TR/xslt
– An XML language for processing/transforming XML
– Does tree transformations -- takes XML and an XSLT style sheet as input, and produces a new XML document with a different structure

Advantages
– Very useful for tree transformations -- much easier than DOM or SAX for this purpose
– Can be used to query a document (XSLT pulls out the part you want)

Disadvantages
– Can be slow for large documents or stylesheets
– Can be difficult to debug stylesheets (poor error detection; much better if you use schemas)


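XSLT processing model

D) Processing model

XML data in -> XML parser -> document “object” for the data
XSLT style sheet in -> XML parser -> document “object” for the style sheet
Both document “objects” -> XSLT processor -> data out (XML)

Input tree shown on the slide:

partorders
  order
    desc
      text
    part
    quantity
    delivery-date
  order

Output tree shown on the slide (a different structure), with nodes: partorders, xza, order, foo, bee

Two “schema” labels also appear on the slide.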


XML Processing Toolkits

Lots of them …
– Java: JAXP ( http://java.sun.com/xml/jaxp/faq.html ), dom4j ( http://www.dom4j.org )
– .NET (part of the .NET framework)
– … others …

Provide DOM, SAX, (JDOM) interfaces, plus lots of other useful tools in a standardized way (loading parsers, performing XSLT transformations, etc.)

JAXP is standard Java, and thus integrated with Websphere


XML and databases

So where do you stick XML data?
– Inside a database!?!
– But how to do this, and which database type to use: RDBMS, ORDBMS, ODB, XML??

How you do so depends on the use cases you have for the data. Some good-to-ask questions are:

– Am I talking about storing documents, or data?
• Is the XML format integral to the application (e.g. XHTML, DocBook)?

– How will the database be queried?
• Queried by XML structure, or by standard SQL?
• What ‘parts’ of the document need to be queried?
• Do I need a text index?

– How will the data be used/retrieved?
• Passed to XML processing tools (e.g. XSLT), or used at the ‘atomic’ simple-type level?

– The answers drive
• What database to choose, how to map XML to tables (O-R or table mappings), store as BLOB or broken up …..
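The “store whole or broken up” choice can be sketched side by side. This is an illustration under assumptions: it uses Python's standard-library sqlite3 and xml.etree.ElementTree, and the table names, column names, and second order are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOC = """<partorders>
  <order ref="x23-1"><part number="23-23221-a12"/><quantity units="gross">12</quantity></order>
  <order ref="x23-2"><part number="99-00001-b07"/><quantity units="each">3</quantity></order>
</partorders>"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_docs (ref TEXT PRIMARY KEY, xml TEXT)")
conn.execute("CREATE TABLE order_lines (ref TEXT, part TEXT, quantity INTEGER, units TEXT)")

for order in ET.fromstring(DOC).findall("order"):
    ref = order.get("ref")
    # Option 1: keep the XML fragment whole (good for handing to XML tools).
    conn.execute("INSERT INTO order_docs VALUES (?, ?)",
                 (ref, ET.tostring(order, encoding="unicode")))
    # Option 2: break it up into columns (good for plain SQL queries).
    qty = order.find("quantity")
    conn.execute("INSERT INTO order_lines VALUES (?, ?, ?, ?)",
                 (ref, order.find("part").get("number"), int(qty.text), qty.get("units")))

# Plain SQL works on the broken-up form ...
gross_refs = [row[0] for row in
              conn.execute("SELECT ref FROM order_lines WHERE units = 'gross'")]
# ... while the whole-document form comes back as XML to re-parse.
stored = conn.execute("SELECT xml FROM order_docs WHERE ref = ?", ("x23-2",)).fetchone()[0]
print(gross_refs, stored.strip())
```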


XML and databases

Upcoming technologies
– XML Query: a query language for querying XML datasets (and databases)
• Uses XML schema for type casting, and validation
• Info: http://www.w3.org/XML/Query

Useful XML Database references
– http://www.xml.com/pub/a/2001/10/31/nativexmldb.html (introductory article)
– http://www.rpbourret.com/xml/XMLAndDatabases.htm (XML and databases)
– http://www.rpbourret.com/xml/XMLDatabaseProds.htm (products list)
– http://www.xmldb.org/resources.html (docs / resource list)


XML Messaging

Use XML as the format for sending messages between systems

Advantages
– Common syntax; self-describing (easier to parse)
– Can use common/existing transport mechanisms to “move” the XML data (HTTP, HTTPS, SMTP (email), MQ, IIOP (CORBA), JMS, ….)

Requirements
– Shared understanding of dialects for transport (requires a registry [namespace!] for identifying dialects)
– Shared acceptance of messaging contract

Disadvantages
– Asynchronous transport; no guarantee of delivery, no guarantee that partner (external) shares acceptance of contract.
– Messages will be much larger than binary (10x or more) [can compress]


Common messaging model

XML over HTTP
– Use HTTP to transport XML messages

POST /path/to/interface.pl HTTP/1.1
Referer: http://www.foo.org/myClient.html
User-agent: db-server-olk
Accept-encoding: gzip
Accept-charset: iso-8859-1, utf-8, ucs
Content-type: application/xml; charset=utf-8
Content-length: 13221
. . .

<?xml version="1.0" encoding="utf-8" ?>
<message> . . . Markup in message . . . </message>
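A client could assemble such a request along these lines. This is a sketch, assuming Python's standard-library urllib.request; the host www.example.org is a placeholder, the message body is a stand-in, and the request is only built here, not sent.

```python
import urllib.request

# The XML message, encoded to bytes in the charset named in the header.
body = '<?xml version="1.0" encoding="utf-8"?><message>hello</message>'.encode("utf-8")

request = urllib.request.Request(
    "http://www.example.org/path/to/interface.pl",  # placeholder URL
    data=body,
    headers={
        "Content-Type": "application/xml; charset=utf-8",
        "Accept-Charset": "iso-8859-1, utf-8",
    },
    method="POST",
)

# Sending is left out of this sketch; it would be:
#   response = urllib.request.urlopen(request)
print(request.get_method(), request.full_url)
```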


Some standards for message format

Define dialects designed to “wrap” remote invocation messages

XML-RPC http://www.xmlrpc.com
– Very simple way of encoding function/method call name, and passed parameters, in an XML message.

SOAP (Simple Object Access Protocol) http://www.soapware.org
– More complex wrapper, which lets you specify schemas for interfaces; more complex rules for handling/proxying messages, etc. This is a core component of Microsoft’s .NET strategy, and is integrated into more recent versions of Websphere and other commercial packages. The W3C activity that sets the SOAP spec is outlined at: http://www.w3.org/2000/xp/Group/
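The XML-RPC wrapping is small enough to show whole. A minimal sketch, assuming Python's standard-library xmlrpc.client; the method name placeOrder and its parameters are hypothetical.

```python
import xmlrpc.client

# Encode a method call name and its parameters as an XML-RPC message.
payload = xmlrpc.client.dumps(("23-23221-a12", 12, "gross"), methodname="placeOrder")
print(payload)

# The receiving side decodes the same XML back into parameters and name.
params, method = xmlrpc.client.loads(payload)
print(method, params)
```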


XML Messaging + Processing

• XML as a universal format for data exchange

[Diagram: a Factory places an order (XML/EDI) with a Supplier using SOAP over HTTP, and the response (XML/EDI) comes back using SOAP over HTTP. Several Suppliers are shown. Layers labelled on the slide: Application, SOAP API, SOAP interface, XML/EDI, SOAP, Transport (HTTP(S), SMTP, other ...).]


Web “Services” Model

SOAP plus higher-level modeling for how services are ‘advertised’, ‘exposed’ and ‘found’

– Uses an XML dialect, WSDL (Web Services Description Language), to define a service
• WSDL can use XML Schema to define how data is passed between a service provider and requestor

– Uses an XML dialect, UDDI (Universal Description, Discovery and Integration), for
• Describing services (high-level)
• Discovering services (registry services, metadata)
• UDDI is defined using XML Schema

– Core technology for application integration
• Microsoft .NET
• IBM Websphere
• Oracle
• …. Many others


Web Services Code Development

[Diagram. Labels on the slide: WSDL and XML schema; automated code generator; WS/SOAP proxy; WS/SOAP skeleton; client code; SOAP requests/responses; middle tier code (validation, business logic, routing, logging, more…); adapter; MECH adapter; product system code; “Write the Application!”]


XML (and related) Specifications

[Diagram: specifications grouped by area (XML Core, APIs, Style, Protocols, Web Services, Application areas) and marked by status (W3C rec, W3C draft, industry std, ‘Open’ std).]

Specifications named on the slide: XML 1.0, XML names, Infoset, XML base, XPath, XLink, XPointer, XFragment, Canonical, XML signature, XML schema, XML query, RDF, SAX 1, SAX 2, DOM 1, DOM 2, DOM 3, JDOM, JAXP, XSL, XSLT, CSS 1, CSS 2, CSS 3, SOAP, XML-RPC, UDDI, WSDL, ebXML, Biztalk, WDDX, XMI, XHTML 1.0, Modularized XHTML, XHTML basic, XHTML events, XForms, MathML, SMIL 1 & 2, SVG, FinXML, dirXML, IFX, FpML, ... and 100's more.


The End.

Ian GRAHAM

IT Strategy, IBS, Technology and Solutions, BMO Financial Group

E: <ian.graham@bmo.com>

T: (416) 513.5656 / F: (416) 513.5590

XML 101: A Technical Introduction to XML