
    W3C

    Extensible Markup Language (XML) 1.0 (Second Edition)

    W3C Recommendation 6 October 2000

    This version:
    http://www.w3.org/TR/2000/REC-xml-20001006 (XHTML, XML, PDF, XHTML review version with color-coded revision indicators)

    Latest version:
    http://www.w3.org/TR/REC-xml

    Previous versions:
    http://www.w3.org/TR/2000/WD-xml-2e-20000814
    http://www.w3.org/TR/1998/REC-xml-19980210

    Editors:
    Tim Bray, Textuality and Netscape
    Jean Paoli, Microsoft
    C. M. Sperberg-McQueen, University of Illinois at Chicago and Text Encoding Initiative
    Eve Maler, Sun Microsystems, Inc. - Second Edition

    Copyright © 2000 W3C (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use, and software licensing rules apply.

    Abstract

    The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML.

    Status of this Document

    This document has been reviewed by W3C Members and other interested parties and has been endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited as a normative reference from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.

    This document specifies a syntax created by subsetting an existing, widely used international text processing standard (Standard Generalized Markup Language, ISO 8879:1986(E) as amended and corrected) for use on the World Wide Web. It is a product of the W3C XML Activity, details of which can be found at http://www.w3.org/XML. The English version of this specification is the only normative version. However, for translations of this document, see http://www.w3.org/XML/#trans. A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR.

    This second edition is not a new version of XML (first published 10 February 1998); it merely incorporates the changes dictated by the first-edition errata (available at http://www.w3.org/XML/xml-19980210-errata) as a convenience to readers. The errata list for this second edition is available at http://www.w3.org/XML/xml-V10-2e-errata.

    Please report errors in this document to xml-editor@w3.org; archives are available at http://lists.w3.org/Archives/Public/xml-editor.

    Note:

    C. M. Sperberg-McQueen's affiliation has changed since the publication of the first edition. He is now at the World Wide Web Consortium, and can be contacted at cmsmcq@w3.org.

    Table of Contents

    1 Introduction
        1.1 Origin and Goals
        1.2 Terminology
    2 Documents
        2.1 Well-Formed XML Documents
        2.2 Characters
        2.3 Common Syntactic Constructs
        2.4 Character Data and Markup
        2.5 Comments
        2.6 Processing Instructions
        2.7 CDATA Sections
        2.8 Prolog and Document Type Declaration
        2.9 Standalone Document Declaration
        2.10 White Space Handling
        2.11 End-of-Line Handling
        2.12 Language Identification
    3 Logical Structures
        3.1 Start-Tags, End-Tags, and Empty-Element Tags
        3.2 Element Type Declarations
            3.2.1 Element Content
            3.2.2 Mixed Content
        3.3 Attribute-List Declarations
            3.3.1 Attribute Types
            3.3.2 Attribute Defaults
            3.3.3 Attribute-Value Normalization
        3.4 Conditional Sections
    4 Physical Structures
        4.1 Character and Entity References
        4.2 Entity Declarations
            4.2.1 Internal Entities
            4.2.2 External Entities
        4.3 Parsed Entities
            4.3.1 The Text Declaration
            4.3.2 Well-Formed Parsed Entities
            4.3.3 Character Encoding in Entities
        4.4 XML Processor Treatment of Entities and References
            4.4.1 Not Recognized
            4.4.2 Included
            4.4.3 Included If Validating
            4.4.4 Forbidden
            4.4.5 Included in Literal
            4.4.6 Notify
            4.4.7 Bypassed
            4.4.8 Included as PE
        4.5 Construction of Internal Entity Replacement Text
        4.6 Predefined Entities
        4.7 Notation Declarations
        4.8 Document Entity
    5 Conformance
        5.1 Validating and Non-Validating Processors
        5.2 Using XML Processors
    6 Notation

    Appendices

    A References
        A.1 Normative References
        A.2 Other References
    B Character Classes
    C XML and SGML (Non-Normative)
    D Expansion of Entity and Character References (Non-Normative)
    E Deterministic Content Models (Non-Normative)
    F Autodetection of Character Encodings (Non-Normative)
        F.1 Detection Without External Encoding Information
        F.2 Priorities in the Presence of External Encoding Information
    G W3C XML Working Group (Non-Normative)
    H W3C XML Core Group (Non-Normative)
    I Production Notes (Non-Normative)

    1 Introduction

    Extensible Markup Language, abbreviated XML, describes a class of data objects called XML documents and partially describes the behavior of computer programs which process them. XML is an application profile or restricted form of SGML, the Standard Generalized Markup Language [ISO 8879]. By construction, XML documents are conforming SGML documents.

    XML documents are made up of storage units called entities, which contain either parsed or unparsed data. Parsed data is made up of characters, some of which form character data, and some of which form markup. Markup encodes a description of the document's storage layout and logical structure. XML provides a mechanism to impose constraints on the storage layout and logical structure.
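The distinction between markup and character data can be seen by feeding a small document to an event-based parser. The following is a minimal sketch, not part of the specification: it assumes Python's standard `xml.parsers.expat` module, and the sample document and the helper name `split_markup_and_text` are invented for illustration.

```python
import xml.parsers.expat


def split_markup_and_text(xml_text):
    """Return (markup_events, character_data) as reported by the parser.

    Hypothetical helper: tags are recorded as markup, everything the
    parser delivers through the character-data callback is recorded as text.
    """
    markup = []
    text = []

    parser = xml.parsers.expat.ParserCreate()
    # Deliver each contiguous run of character data in a single callback.
    parser.buffer_text = True

    def on_start(name, attrs):
        markup.append(("start-tag", name))

    def on_end(name):
        markup.append(("end-tag", name))

    def on_text(data):
        # Skip runs that consist only of white space.
        if data.strip():
            text.append(data)

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return markup, text


# Illustrative document, not taken from the specification.
SAMPLE = "<greeting><to>World</to><body>Hello, XML!</body></greeting>"
markup, text = split_markup_and_text(SAMPLE)
print(markup)
print(text)
```

In this sketch the tags arrive through the element handlers and the words between them arrive through the character-data handler, which mirrors the split described above for parsed data.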

    [Definition: A software module called an XML processor is used to read XML documents and provide access to their content and structure.] [Definition: It is assumed that an XML processor is doing its work on behalf of another module, called the application.] This specification describes the required behavior of an XML processor in terms of how it must read XML data and the information it must provide to the application.
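The division of labor between processor and application can be illustrated with a short sketch. It assumes Python's standard `xml.etree.ElementTree` module standing in for the XML processor; the function names `processor_read` and `application_summary` and the sample document are hypothetical and are not defined by this specification.

```python
import xml.etree.ElementTree as ET


def processor_read(xml_text):
    """Stand-in for the XML processor: read the document and expose
    its content and structure as an element tree."""
    return ET.fromstring(xml_text)


def application_summary(root):
    """Stand-in for the application: it never touches the raw text and
    relies only on what the processor provides."""
    return {
        "root": root.tag,
        "children": [child.tag for child in root],
        "text": {child.tag: (child.text or "") for child in root},
    }


# Illustrative document, not taken from the specification.
DOC = "<memo><to>Reader</to><note>Check the errata list.</note></memo>"
summary = application_summary(processor_read(DOC))
print(summary)
```

The application here sees element names and text only through the tree returned by the processor, which is the relationship the two definitions describe.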

    1.1 Origin and Goals

    XML was developed by an XML Working Group (originally known as the SGML Editorial Review Board) formed under the auspices of the World Wide Web Consortium (W3C) in 1996. It was chaired by Jon Bosak of Sun Microsystems with the active participation of an XML Special Interest Group (previously known as the SGML Working Group) also organized by the W3C. The membership of the XML Working Group is given in an appendix. Dan Connolly served as the WG's contact with the W3C.

    The design goals for XML are:

    1. XML shall be straightforwardly usable over the Internet.

    2. XML shall support a wide variety of applications.

    3. XML shall be compatible with SGML.

    4. It shall be easy to write programs which process XML documents.

    5. The number of optional features in XML is to be kept to the absolute minimum, ideally zero.

    6. XML documents should be human-legible and reasonably clear.

    7. The XML design should be prepared quickly.

    8. The design of XML shall be formal and concise.

    9. XML documents shall be easy to create.

    10. Terseness in XML markup is of minimal i