M2 XML Schema Definition - Introduction

  • Module 2:

    XML Schema Definition

    a.Univ.-Prof. Dr. Werner Retschitzegger

    Lecture "IFS in Bioinformatics", summer semester 2011

    Johannes Kepler University Linz, www.jku.ac.at

    Institute of Bioinformatics, www.bioinf.jku.at

    IFS Information Systems Group, www.ifs.uni-linz.ac.at

    M2-2

    2011 JKU Linz, Institute of Bioinformatics, Information Systems Group (IFS)

    Outline

    • Introduction
        • Motivation for XML
        • Document Markup Languages
        • Application Areas for XML
    • XML 1.0
    • Namespaces
    • XML Schema

    The following slides are based (among others) on: Elliotte Rusty Harold, W. Scott Means, XML in a Nutshell: A Desktop Quick Reference, 3rd Edition, O'Reilly & Associates, 2005

  • M2-3

    Motivation for XML 1/5: From HTML to XML

    "If I invent another programming language, its name will contain the letter X."
    (N. Wirth, Software Pioniere Konferenz, Bonn 2001)

    Google Indicator (hits as of Sep/16/08):

    Love                    2.2 billion
    XML                     603 million
    ABC                     252 million
    Soccer                  237 million
    SQL                     223 million
    Werner Retschitzegger   20.6 thousand

    M2-4

    Motivation for XML 2/5: From HTML to XML

    Brian Kernighan: "The problem with HTML-WYSIWYG is that what you see is all you've got"

    HTML (HyperText Markup Language) is the "lingua franca" for representing hypertext documents on the Web
    • Introduced in 1989 by Tim Berners-Lee at CERN; later standardized by the W3C (World Wide Web Consortium)
    • Basic concept: "markup" in terms of "tags"

    Drawbacks
    • Restricted number of pre-defined tags, which leads to continual extensions with proprietary tags
    • Tags primarily describe layout aspects, which makes Web search harder

  • M2-5

    Motivation for XML 3/5: From HTML to XML

    HTML describes layout of content

    XML describes structure and semantics of content

    [Figure: the PDA-Catalog entry for the Nokia 8210 (Battery 900mAh, Weight 141g), marked up once in HTML and once in XML]

    Tim Bray, Co-Editor of XML 1.0: "XML will become the ASCII of the 21st century - basic, essential, unexciting"
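
The contrast between layout-oriented and structure-oriented markup can be sketched in a few lines of Python. This is only an illustration: the markup in both strings is assumed, not taken from the slide, and the element names `PDACatalog`, `PDA`, `Battery`, `Weight` and the attribute `name` are hypothetical. Only the values Nokia 8210, 900mAh and 141g come from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout-oriented markup: the tags say how to render the text,
# not what the text means.
html_fragment = (
    "<h1>PDA Catalog</h1>"
    "<b>Nokia 8210</b>"
    "<ul><li>Battery: 900mAh</li><li>Weight: 141g</li></ul>"
)

# Hypothetical structure-oriented markup: the tags name the data they contain.
xml_document = """
<PDACatalog>
  <PDA name="Nokia 8210">
    <Battery>900mAh</Battery>
    <Weight>141g</Weight>
  </PDA>
</PDACatalog>
"""

# In the XML version the values can be addressed by element name.
root = ET.fromstring(xml_document)
pda = root.find("PDA")
battery = pda.findtext("Battery")
weight = pda.findtext("Weight")
print(pda.get("name"), battery, weight)

# In the HTML version there is no element called "Battery"; the tag set
# consists of layout tags only (wrapped in <div> to give it a single root).
html_root = ET.fromstring("<div>" + html_fragment + "</div>")
html_tags = sorted({el.tag for el in html_root.iter()})
print(html_tags)
print(html_root.find(".//Battery") is None)
```

A program reading the HTML version would have to fall back on text matching such as searching for the string "Battery:", which is exactly the weakness the slide attributes to layout-oriented tags.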

    M2-6

    Motivation for XML 4/5: Features of XML

    • Layout independence: separation of structure and semantics of the content from its layout
    • Platform and vendor independence: endorsed by the W3C
    • Internationality: based on the Unicode standard
    • Extensibility: tags can be defined and named arbitrarily (XML is a meta language)
    • Structurability: tags can be nested arbitrarily
    • Semi-structured: content can contain fully structured parts and fully unstructured parts
    • Self-describing: tags describing structure and semantics of the content are ...
        • ... for humans: relatively easy to read and edit
        • ... for machines: easy to generate and parse
    • X-technology infrastructure: W3C provides a set of XML-based standards (the XML standards family)
    • Correctness check: optionally, XML documents can be checked for correctness

  • M2-7

    Motivation for XML 5/5: Properties of XML Documents and XML Processors

    Well-formedness: syntactical properties, e.g.:
    • At least one tag per document
    • Exactly one root tag
    • Tags must not overlap
    • Each start tag has to have an end tag
    • ...

    Validity
    • The XML document is well-formed and corresponds to a schema
    • The schema defines vocabulary and grammar
    • Alternatives: DTD or the XML Schema standard

    XML processors parse XML documents and check either solely well-formedness (non-validating processors) or also validity (validating processors)
    • Can be called from within an application (e.g., a browser)
    • Decompose an XML document into its parts, forming a tree that the application can use to access those parts

    [Figure: the XML processor (parser and entity manager) reads the XML document PDACatalog1.XML and the schema Catalog.DTD, and passes document parts and errors to the application]
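
The well-formedness rules listed on this slide can be tried out with a non-validating processor. The sketch below uses Python's standard `xml.etree.ElementTree` module, which checks well-formedness only and does not validate against a DTD or schema. The helper `is_well_formed` and the sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One sample per rule from the slide (hypothetical documents).
samples = {
    "ok": "<PDA><Battery>900mAh</Battery></PDA>",
    "no tag at all": "",
    "two root tags": "<PDA/><PDA/>",
    "overlapping tags": "<PDA><Battery></PDA></Battery>",
    "missing end tag": "<PDA><Battery>900mAh</PDA>",
}
results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {ok}")

# A well-formed document is decomposed into a tree whose parts the
# application can then access.
root = ET.fromstring(samples["ok"])
parts = [(el.tag, el.text) for el in root.iter()]
print(parts)
```

Only the first sample is accepted; the other four each violate one of the listed rules. Checking validity as well would require a validating processor, which this standard-library module is not.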

    M2-8

    Document Markup Languages 1/4: History

    1945         Vannevar Bush: Memex
    1962         Douglas Engelbart: Augment
    1965         Ted Nelson: Xanadu
    1967         William Tunnicliffe (GCA): GenCode
    1969         Goldfarb, Mosher, Lorie (IBM): GML (Generalized Markup Language)
    1978         ANSI, Charles Goldfarb: standardization (GenCode & GML)
    1986         ISO: SGML (Standard Generalized Markup Language - ISO 8879)
    1989         Tim Berners-Lee (CERN): HTML (Hypertext Markup Language)
    1993         Marc Andreessen (NCSA): HTML forms (XMosaic)
    1994         Netscape, Microsoft: HTML derivatives
    1996         Jon Bosak, Tim Bray, James Clark et al. (W3C): XML Working Group
    10 Feb 1998  XML 1.0
    29 Sep 2006  XML 1.1, 2nd Edition

  • M2-9

    Document Markup Languages 2/4: Memex

    http://www.ps.uni-sb.de/~duchier/pub/vbush/vbush-all.shtml

    M2-10

    Document Markup Languages 3/4: XML and OMG's Metadata Architecture

    M2   Meta level                   SGML, XML
    M1   Language level (e.g. DTDs)   HTML, XHTML, MathML, WML
    M0   Instance level (documents)   e.g. $e^{i\pi} + 1 = 0$, $f(n) = \sum_{k=1}^{n} k$

    [www.omg.org]

  • M2-11

    Document Markup Languages 4/4: XML versus ...

    ... SGML
    • XML vs. SGML specification: 60 pages vs. 600 pages
    • XML has 20% of SGML's complexity, but 80% of its functionality
    • XML documents conform to an ISO revision of SGML, WebSGML (annex to the SGML standard ISO 8879)

    ... HTML
    • XML is complementary to HTML (semantics and structure vs. layout)
    • XML is not backward compatible with HTML
    • Simple conversion from HTML documents to XML

    ... XHTML
    • = Extensible HTML
    • W3C Recommendation Aug. 2002 (2nd edition)
    • HTML 4.01 as an XML application, i.e., HTML was described by means of an XML DTD
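
Why XML is not backward compatible with HTML, and what XHTML changes, can be seen with one line break element. In the sketch below, both snippets and the helper `parses_as_xml` are assumptions for illustration. HTML 4.01 writes `<br>` without an end tag, whereas the XML syntax used by XHTML requires the empty-element form `<br/>`.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(text: str) -> bool:
    """Return True if an XML parser accepts `text` as well-formed."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# HTML style: <br> is never closed, so </p> does not match the open element.
html_snippet = "<p>Nokia 8210<br>900mAh</p>"

# XHTML style: <br/> is an empty-element tag, so the nesting is balanced.
xhtml_snippet = "<p>Nokia 8210<br/>900mAh</p>"

html_ok = parses_as_xml(html_snippet)
xhtml_ok = parses_as_xml(xhtml_snippet)
print(html_ok, xhtml_ok)
```

The conversion between the two forms is mechanical here, which matches the slide's remark that converting HTML documents to XML is simple.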

    M2-12

    Application Areas of XML 1/4: Three Main Application Areas

    • Data exchange ("portable data")
        • Using XML solely as an exchange format, or
        • Using also a common schema
    • Multi-delivery
        • One and the same content can be delivered to different end-user devices
    • Intelligent retrieval
        • Instead of a simple keyword search on the basis of HTML documents, structure-based search on the basis of XML documents
        • Example: "Mozart" - composer or chocolate ball?
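
The difference between keyword search and structure-based search can be illustrated with the Mozart example. The document below is hypothetical; the element names `catalog`, `composer`, `product` and `name` are assumptions chosen for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical document in which "Mozart" appears in two different roles.
catalog = ET.fromstring("""
<catalog>
  <composer><name>Wolfgang Amadeus Mozart</name></composer>
  <product><name>Mozart chocolate ball</name></product>
</catalog>
""")

# Keyword search: every piece of text containing "Mozart", whatever its role.
keyword_hits = [
    el.text for el in catalog.iter() if el.text and "Mozart" in el.text
]

# Structure-based search: only names that sit inside a <composer> element.
composer_hits = [
    el.text
    for el in catalog.findall("./composer/name")
    if "Mozart" in (el.text or "")
]

print(keyword_hits)
print(composer_hits)
```

The keyword search returns both entries, while the path `./composer/name` restricts the result to the composer.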

  • M2-13

    Application Areas of XML 2/4: Industrial Sectors, "Verticalisation of XML"

    XML-DTDs for ...
    • Literature: "Gutenberg"
    • Travel: "openTravel"
    • News: "NewsML"
    • Marketing: "adXML"
    • Weather: "OMF"
    • Human Resources: "XML-HR"
    • Voice Applications: "VoxML"
    • Vector Graphics: "SVG"
    • Mobile Applications: "WML"
    • Geo Applications: "ANZMETA"
    • Health Care: "HL7"
    • Mathematics: "MathML"
    • Banking: "MBA"
    • eGovernment: "eGovML"

    Electronic Commerce
    • CBL: Common Business Library (Commerce One)
    • BizTalk: Microsoft
    • cXML: Commerce XML
    • RosettaNet: format for online orders
    • ebXML: OASIS + XML/EDI
    • FnXML: Financial Products Markup Language
    • ...

    [http://www.oasis-open.org/cover/xml.html#applications]

    M2-14

    Application Areas of XML 3/4: Sources of XML Data

    • Inter-application and mobile-device communication data, e.g., Web Services
    • Logs and blogs, e.g., RSS
    • Metadata, e.g., Schema, WSDL, XMP
    • Presentation data, e.g., XHTML
    • Documents, e.g., Word
    • Views of other sources of data, e.g., relational, LDAP, CSV, Excel, etc.
    • Sensor data

  • M2-15

    • XML: XML language concepts incl. DTD
    • XML Namespaces: support of a global identification scheme for element names and attribute names
    • XPath (XML Path Language): path expressions for navigation in XML documents
    • XML Schema: XML-based language for the definition of XML schemata
    • XLink, XPointer: XML-based languages for the linking of (parts of) XML documents
    • XSL (Extensible Stylesheet Language)
        • XSLT: transformation of XML documents (declarative)
        • XSL-FO: rendering of XML documents (declarative)
    • DOM (Document Object Model): API for accessing XML documents in a procedural manner
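
Three of these standards (namespaces, path expressions and the DOM) can be tried out with Python's standard library. The sketch below rests on assumptions: the namespace URI `http://example.org/catalog`, the prefixes `cat` and `c`, and all element names are hypothetical. Note also that `xml.etree.ElementTree` supports only a subset of XPath, and `xml.dom.minidom` is a minimal DOM implementation.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

CATALOG_NS = "http://example.org/catalog"  # hypothetical namespace URI

doc = f"""<cat:PDACatalog xmlns:cat="{CATALOG_NS}">
  <cat:PDA name="Nokia 8210">
    <cat:Battery>900mAh</cat:Battery>
    <cat:Weight>141g</cat:Weight>
  </cat:PDA>
</cat:PDACatalog>"""

# Namespaces: the prefix is only shorthand; the element is identified
# globally by the pair (namespace URI, local name).
root = ET.fromstring(doc)
qualified_root_name = root.tag
print(qualified_root_name)

# Path expression: navigate to the battery of one particular PDA.
# The prefix "c" is chosen locally and mapped to the same URI.
ns = {"c": CATALOG_NS}
battery = root.findtext("./c:PDA[@name='Nokia 8210']/c:Battery", namespaces=ns)
print(battery)

# DOM: procedural access to the same document through node objects.
dom = minidom.parseString(doc)
weight_nodes = dom.getElementsByTagNameNS(CATALOG_NS, "Weight")
weight = weight_nodes[0].firstChild.data
print(weight)
```

The path expression uses the prefix `c` although the document uses `cat`; both resolve to the same URI, which is what makes the identification global rather than tied to a prefix.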

    W3C Standardization Levels:(1) Note(2)