Page 1: XML:Managing data exchange

XML:Managing data exchange

Words can have no single fixed meaning. Like wayward electrons, they can spin away from their initial orbit and enter a wider magnetic field. No one owns them or has a proprietary right to dictate how they will be used.

David Lehman, End of the Word, 1991.

Page 2: XML:Managing data exchange

Central problems of data management

Capture

Storage

Retrieval

Exchange

Page 3: XML:Managing data exchange

EDI

Electronic exchange of standard documents

In use for some 20 years

Standards

• ANSI X12 (US and Canada)
• EDIFACT (International)

Page 4: XML:Managing data exchange

EDI: Advantages

Paper handling is reduced, saving time and money

Data can be exchanged in real time

There are fewer errors since data are keyed only once

Enhanced data sharing enables greater coordination of activities between business partners

Money flows are accelerated and payments received sooner

Page 5: XML:Managing data exchange

EDI: Adoption

Much information flow is still on paper

Electronic exchange is the exception rather than the rule

The Internet is a lower cost solution than EDI using value added networks (VANs)

Page 6: XML:Managing data exchange

SGML

Document management consumes

• 15% of company revenue
• 25% of labor costs
• 10 to 60% of an office worker’s time

Standard generalized markup language (SGML) was designed to reduce the cost of document management

Page 7: XML:Managing data exchange

Markup language

Embedded information within text about the meaning of the text

<cdliner>This uniquely creative collaboration between Miles Davis and Gil Evans has already resulted in two extraordinary albums: <cdtitle>Miles Ahead</cdtitle> <cdid>CL 1041</cdid> and <cdtitle>Porgy and Bess</cdtitle> <cdid>CL 1274</cdid>.</cdliner>

Page 8: XML:Managing data exchange

SGML

A vendor independent standard for publication of all media

Cross system

Portable

Defines the structure of a document

The parent of HTML and XML

Page 9: XML:Managing data exchange

SGML: Advantages

Re-use

• Same advantage as with word processing

Flexibility

• Generate output for multiple media

Revision

• Version control

Page 10: XML:Managing data exchange

SGML code

<chapter>
  <no>16</no>
  <title>XML: Managing Data Exchange</title>
  <section>
    <quote><emph type = "2">Words can have no single fixed meaning. Like wayward electrons, they can spin away from their initial orbit and enter a wider magnetic field. No one owns them or has a proprietary right to dictate how they will be used.</emph></quote>
    …
  </section>
  …
</chapter>

Page 11: XML:Managing data exchange

HTML code

<html>
  <body>
    <h1><b>16</b></h1>
    <h1><b>XML: Managing Data Exchange</b></h1>
    <p><i>Words can have no single fixed meaning. Like wayward electrons, they can spin away from their initial orbit and enter a wider magnetic field. No one owns them or has a proprietary right to dictate how they will be used.</i></p>
  </body>
</html>

Page 12: XML:Managing data exchange

The problem with HTML

Presentation not meaning

Reader has to infer meaning

Machines are not very good at inferring meaning

Page 13: XML:Managing data exchange

XML

Extensible markup language

SGML for e- and m-commerce

A meta-language

• A language to generate languages

Will steadily replace HTML

Page 14: XML:Managing data exchange

XML vs. HTML

XML

• Structured text
• User-definable structure
• Context-sensitive retrieval
• Greater hypertext linkage

HTML

• Formatted text
• Pre-defined format
• Limited retrieval
• Limited hypertext linking

Page 15: XML:Managing data exchange

XML rules

• Elements must have both an opening and a closing tag
• Elements must follow a strict hierarchy with only one root element
• Elements may not overlap other elements
• Element names must obey XML naming conventions
• XML is case sensitive
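These rules are what a parser checks when it decides whether a document is well-formed. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree parser, which is not part of the original slides. The helper is_well_formed and the sample fragments are hypothetical and chosen only to exercise one rule each.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Illustrative fragments: one valid document, then one violation per rule.
samples = {
    "valid": "<course><code>MIST7600</code></course>",
    "missing closing tag": "<course><code>MIST7600</course>",
    "two root elements": "<course/><course/>",
    "overlapping elements": "<a><b></a></b>",
    "invalid element name": "<1course/>",
    "case mismatch": "<Course></course>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first fragment should be accepted; a parser rejects the rest before any application logic sees the data.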

Page 16: XML:Managing data exchange

HTML vs. XML

HTML

<p><b>MIST7600</b> Data Management<br>3 credit hours</p>

XML

<course>
  <code>MIST7600</code>
  <title>Data Management</title>
  <credit>3</credit>
</course>
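Because the XML version names each piece of data, a program can read the course record without guessing which text is the code and which is the title. A small sketch, assuming Python's standard xml.etree.ElementTree module (not mentioned in the slides); the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The XML fragment from the slide.
course_xml = (
    "<course>"
    "<code>MIST7600</code>"
    "<title>Data Management</title>"
    "<credit>3</credit>"
    "</course>"
)

course = ET.fromstring(course_xml)

# Each child element's tag names the data it holds, so no inference is needed.
record = {child.tag: child.text for child in course}
print(record)
```

Doing the same with the HTML version would require rules such as "the bold text is the course code", which break as soon as the page layout changes.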

Page 17: XML:Managing data exchange

Processing shift

From server to browser

• Browser can ‘read’ meaning of the data

Less data transmitted

HTML

• Retrieve shirt data with prices in $US
• Retrieve shirt data with prices in euros

XML

• Retrieve shirt data with prices in $US
• Retrieve conversion rate of $US to euro
• Retrieve Java program to convert currencies
• Compute prices in euros
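The XML column amounts to computing on the client: the shirt data arrives once, and the euro prices are derived locally from a conversion rate. The sketch below illustrates that step in Python rather than the Java program the slide mentions. The element names, the sample prices, and the rate are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical shirt data as it might arrive from the server, priced in $US.
shirts_xml = (
    "<shirts currency='USD'>"
    "<shirt><style>Oxford</style><price>40.00</price></shirt>"
    "<shirt><style>Polo</style><price>25.50</price></shirt>"
    "</shirts>"
)

# Hypothetical conversion rate, retrieved separately in the slide's scenario.
USD_TO_EUR = Decimal("0.90")


def prices_in_euros(xml_text: str, rate: Decimal) -> dict:
    """Convert each shirt price locally, without asking the server again."""
    root = ET.fromstring(xml_text)
    return {
        shirt.findtext("style"): (
            Decimal(shirt.findtext("price")) * rate
        ).quantize(Decimal("0.01"))
        for shirt in root.findall("shirt")
    }


print(prices_in_euros(shirts_xml, USD_TO_EUR))
```

Only the rate needs to be fetched again if it changes; the shirt data itself is not retransmitted.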

Page 18: XML:Managing data exchange

Searching

Search engines look for appropriate tags in the XML code

Faster

More precise
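To see why tag-aware searching is more precise, consider the word "Pyramid", which in the CD library is both a CD title and a track title. The sketch below assumes Python's standard xml.etree.ElementTree module and a hypothetical helper find_by_tag; it is not how any particular search engine works.

```python
import xml.etree.ElementTree as ET

# A cut-down CD library in which "Pyramid" appears under two different tags.
library_xml = (
    "<cdlibrary><cd><cdtitle>Pyramid</cdtitle>"
    "<track><trktitle>Vendome</trktitle></track>"
    "<track><trktitle>Pyramid</trktitle></track>"
    "</cd></cdlibrary>"
)


def find_by_tag(root, tag, text):
    """Return the elements named tag whose text equals text."""
    return [element for element in root.iter(tag) if element.text == text]


root = ET.fromstring(library_xml)

untagged_hits = library_xml.count("Pyramid")         # plain text search
cd_hits = find_by_tag(root, "cdtitle", "Pyramid")    # CDs called Pyramid
track_hits = find_by_tag(root, "trktitle", "Pyramid")  # tracks called Pyramid

print(untagged_hits, len(cd_hits), len(track_hits))
```

A plain text search cannot tell the two uses apart, while the tagged search returns exactly the kind of item that was asked for.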

Page 19: XML:Managing data exchange

Expected gains

Store once and format many times

Hardware and software independence

Capture once and exchange many times

Accelerated targeted searching

Less network congestion

Page 20: XML:Managing data exchange

XML language design

Designers must define

• Allowable tags
• Rules for nesting tags
• Which tagged elements can be processed

Page 21: XML:Managing data exchange

XML Schema

The schema defines

• The names and contents of all elements that are permissible in a certain document
• The structure of the document
• How often an element might appear
• The order in which the elements must appear
• The type of data the element contains

Page 22: XML:Managing data exchange

DOM

Document object model

The data model for an XML document

A tree (1:m)
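The tree structure can be made visible by walking a parsed document: each element has one parent and any number of children. A minimal sketch, assuming Python's standard xml.dom.minidom implementation of the DOM; the outline function and the sample fragment are illustrative.

```python
from xml.dom import minidom

# A small fragment of the CD library, written without whitespace
# so that the only text nodes are the element values.
doc = minidom.parseString(
    "<cdlibrary><cd><cdid>A2 1325</cdid><cdtitle>Pyramid</cdtitle>"
    "<track><trknum>1</trknum><trktitle>Vendome</trktitle></track>"
    "</cd></cdlibrary>"
)


def outline(node, depth=0):
    """Return one indented line per element, parents before children."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        lines.append("  " * depth + node.tagName)
        for child in node.childNodes:
            lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(doc.documentElement)
print("\n".join(tree_lines))
```

The indentation in the printed outline mirrors the 1:m relationships: cdlibrary has many cd elements, and each cd has many track elements.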

Page 23: XML:Managing data exchange

Schema (cdlib.xsd)

XML declaration and root of all schema documents

<?xml version="1.0" encoding="UTF-8"?>

<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>

Page 24: XML:Managing data exchange

Schema (cdlib.xsd)

CD library definition

<xsd:element name="cdlibrary">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="cd" type="cdType"
        minOccurs="1" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>

Page 25: XML:Managing data exchange

Schema (cdlib.xsd)

CD definition

<xsd:complexType name="cdType">
  <xsd:sequence>
    <xsd:element name="cdid" type="xsd:string"/>
    <xsd:element name="cdlabel" type="xsd:string"/>
    <xsd:element name="cdtitle" type="xsd:string"/>
    <xsd:element name="cdyear" type="xsd:integer"/>
    <xsd:element name="track" type="trackType"
      minOccurs="1" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>

Page 26: XML:Managing data exchange

Schema (cdlib.xsd)

Track definition

<xsd:complexType name="trackType">
  <xsd:sequence>
    <xsd:element name="trknum" type="xsd:integer"/>
    <xsd:element name="trktitle" type="xsd:string"/>
    <xsd:element name="trklen" type="xsd:time"/>
  </xsd:sequence>
</xsd:complexType>

Page 27: XML:Managing data exchange

Common datatypes

string

boolean

anyURI

decimal

float

integer

time

date
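A datatype constrains how a value may be written as well as what it means. For example, the time type expects hh:mm:ss, so a track length written as 2:30 would not validate while 00:02:30 would. The sketch below is a simplified check in Python, not a schema validator: it ignores the fractional seconds and time zone suffix that the time type also permits, and the function names are hypothetical.

```python
import re
from datetime import time

# Simplified lexical form for the time type: exactly hh:mm:ss.
TIME_PATTERN = re.compile(r"([0-9]{2}):([0-9]{2}):([0-9]{2})")
INTEGER_PATTERN = re.compile(r"[+-]?[0-9]+")


def is_xsd_time(value: str) -> bool:
    """Check the hh:mm:ss form and that the fields are in range."""
    match = TIME_PATTERN.fullmatch(value)
    if match is None:
        return False
    hours, minutes, seconds = (int(part) for part in match.groups())
    try:
        time(hours, minutes, seconds)  # raises ValueError if out of range
    except ValueError:
        return False
    return True


def is_xsd_integer(value: str) -> bool:
    """Check for an optional sign followed by decimal digits."""
    return INTEGER_PATTERN.fullmatch(value) is not None


print(is_xsd_time("00:02:30"), is_xsd_time("2:30"))
print(is_xsd_integer("1960"), is_xsd_integer("19.60"))
```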

Page 28: XML:Managing data exchange

XML (cd.xml)

<?xml version="1.0" encoding="UTF-8"?>
<cdlibrary xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:noNamespaceSchemaLocation="cdlib.xsd">
  <cd>
    <cdid>A2 1325</cdid>
    <cdlabel>Atlantic</cdlabel>
    <cdtitle>Pyramid</cdtitle>
    <cdyear>1960</cdyear>
    <track>
      <trknum>1</trknum>
      <trktitle>Vendome</trktitle>
      <trklen>00:02:30</trklen>
    </track>
    …
  </cd>
</cdlibrary>

Page 29: XML:Managing data exchange

XSL

Extensible stylesheet language

Defines how an XML document is rendered

An XML file

Page 30: XML:Managing data exchange

XSL

Results of applying cd.xsl

Pyramid, Atlantic, 1960 [A2 1325]

1 Vendome 00:02:30

2 Pyramid 00:10:46

Ella Fitzgerald, Verve, 2000 [D136705]

1 A tisket, a tasket 00:02:37

2 Vote for Mr. Rhythm 00:02:25

3 Betcha nickel 00:02:52

Page 31: XML:Managing data exchange

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output encoding="UTF-8" indent="yes" method="html" version="1.0" />
  <xsl:template match="/">
    <html>
      <head>
        <title>Complete List of Songs</title>
      </head>
      <body>
        <h2>Complete List of Songs</h2>
        <xsl:for-each select="cdlibrary/cd">
          <font color="maroon">
            <xsl:value-of select="cdtitle" />, <xsl:value-of select="cdlabel" />, <xsl:value-of select="cdyear" /> [<xsl:value-of select="cdid" />]
          </font>
          <br />

cd.xsl

Page 32: XML:Managing data exchange

          <table>
            <xsl:for-each select="track">
              <tr>
                <td align="left"><xsl:value-of select="trknum" /></td>
                <td><xsl:value-of select="trktitle" /></td>
                <td align="center"><xsl:value-of select="trklen" /></td>
              </tr>
            </xsl:for-each>
          </table>
          <br />
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

cd.xsl (continued)
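The logic of the stylesheet is two nested loops: one over each cd, and inside it one over each track. As a rough analogue only, the sketch below expresses the same loops in Python and produces plain text rather than HTML. It is not XSLT and not a substitute for an XSLT processor; the render function is hypothetical, and the sample data is taken from the Pyramid entry shown in the results.

```python
import xml.etree.ElementTree as ET

cd_xml = (
    "<cdlibrary><cd>"
    "<cdid>A2 1325</cdid><cdlabel>Atlantic</cdlabel>"
    "<cdtitle>Pyramid</cdtitle><cdyear>1960</cdyear>"
    "<track><trknum>1</trknum><trktitle>Vendome</trktitle>"
    "<trklen>00:02:30</trklen></track>"
    "<track><trknum>2</trknum><trktitle>Pyramid</trktitle>"
    "<trklen>00:10:46</trklen></track>"
    "</cd></cdlibrary>"
)


def render(xml_text: str) -> list:
    """Return one heading line per CD followed by one line per track."""
    lines = []
    for cd in ET.fromstring(xml_text).findall("cd"):  # outer for-each
        lines.append(
            f"{cd.findtext('cdtitle')}, {cd.findtext('cdlabel')}, "
            f"{cd.findtext('cdyear')} [{cd.findtext('cdid')}]"
        )
        for track in cd.findall("track"):  # inner for-each
            fields = ("trknum", "trktitle", "trklen")
            lines.append(" ".join(track.findtext(f) for f in fields))
    return lines


print("\n".join(render(cd_xml)))
```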

Page 33: XML:Managing data exchange

Converting XML

Transformation and manipulation

• XSLT
• One XML vocabulary to another
    • FPML to finML
• Re-ordering, filtering, and sorting

Rendering

• XSLT
• e.g., XML to WAP
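In practice these conversions are written as XSLT stylesheets. Purely to illustrate what "one vocabulary to another" and "re-ordering" mean, here is a sketch in Python using the standard xml.etree.ElementTree module. The target vocabulary in TAG_MAP (album, name, song) is invented for this example, as is the translate helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from the CD library vocabulary to another vocabulary.
TAG_MAP = {"cd": "album", "cdtitle": "name", "track": "song", "trktitle": "name"}


def translate(element, tag_map):
    """Copy element and its descendants, renaming tags found in tag_map."""
    copy = ET.Element(tag_map.get(element.tag, element.tag), element.attrib)
    copy.text = element.text
    for child in element:
        copy.append(translate(child, tag_map))
    return copy


source = ET.fromstring(
    "<cd><cdtitle>Pyramid</cdtitle>"
    "<track><trktitle>Vendome</trktitle></track>"
    "<track><trktitle>Pyramid</trktitle></track></cd>"
)

# Re-ordering: sort the tracks alphabetically by title.
tracks = sorted(source.findall("track"), key=lambda t: t.findtext("trktitle"))
sorted_titles = [t.findtext("trktitle") for t in tracks]

# Vocabulary conversion: same structure, different element names.
result = ET.tostring(translate(source, TAG_MAP), encoding="unicode")

print(sorted_titles)
print(result)
```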

Page 34: XML:Managing data exchange

XML and databases

XML is a data management tool

XML documents will have to be stored for the long-term

Need a DBMS

Page 35: XML:Managing data exchange

DBMS requirements

• Store a large number of documents
• Store large documents
• Support access to portions of a document (e.g., the data for a single CD in a library of 20,000 CDs)
• Concurrent access
• Version control
• Integrate data from other sources

Page 36: XML:Managing data exchange

RDBMS

Document-centric

• Store as CLOB

Data-centric

• Object-relational extensions to support element retrieval and update

Expect RDBMS vendors to offer extensions to support XML

Page 37: XML:Managing data exchange

Database to XML

A significant proportion of Web pages are generated from databases

Instead of converting to HTML these should be converted to XML

Render with XSL

Need tools for converting relational data to XML
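A conversion tool of this kind can be quite small. The sketch below assumes an in-memory SQLite database with a single hypothetical cd table, loaded with the Pyramid row from the CD library example, and turns each row into an element whose children are named after the columns. The function table_to_xml is illustrative, not a reference to any existing product.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical relational source: one table holding CD data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cd (cdid TEXT, cdlabel TEXT, cdtitle TEXT, cdyear INTEGER)"
)
conn.execute("INSERT INTO cd VALUES ('A2 1325', 'Atlantic', 'Pyramid', 1960)")


def table_to_xml(connection, table, root_tag):
    """Build one element per row and one child element per column.

    The table name is interpolated into the query, so it must come from
    trusted code, never from user input.
    """
    cursor = connection.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    root = ET.Element(root_tag)
    for row in cursor:
        record = ET.SubElement(root, table)
        for column, value in zip(columns, row):
            ET.SubElement(record, column).text = str(value)
    return root


xml_text = ET.tostring(table_to_xml(conn, "cd", "cdlibrary"), encoding="unicode")
print(xml_text)
```

The resulting XML can then be rendered with a stylesheet instead of being generated directly as HTML.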

Page 38: XML:Managing data exchange

OODBMS

A good fit to the DOM

Little development at this stage

• Vendors have fewer resources

Page 39: XML:Managing data exchange

XML database

Special-purpose XML database

• Tamino

This is a new area and you will need to monitor developments

Page 40: XML:Managing data exchange

Conclusion

XML is a significant technological development

Its main purpose is to support data exchange

It will lower the cost of business transactions

It will be a critical data management technology