37
XEP-0071: XHTML-IM Peter Saint-Andre mailto:xsf@stpeter.im xmpp:peter@jabber.org http://stpeter.im/ 2018-03-08 Version 1.5.4 Status Type Short Name Deprecated Standards Track xhtml-im This specification defines an XHTML 1.0 Integration Set for use in exchanging instant messages that contain lightweight text markup. The protocol enables an XMPP entity to format a message using a small range of commonly-used HTML elements, attributes, and style properties that are suitable for use in in- stant messaging. The protocol also excludes HTML constructs that may introduce malware and other vulnerabilities (such as scripts and objects) for improved security.

XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

XEP-0071 XHTML-IM

Peter Saint-Andremailtoxsfstpeterimxmpppeterjabberorghttpstpeterim

2018-03-08Version 154

Status Type Short NameDeprecated Standards Track xhtml-im

This specification defines an XHTML 10 Integration Set for use in exchanging instant messages thatcontain lightweight text markup The protocol enables an XMPP entity to format a message using a smallrange of commonly-used HTML elements attributes and style properties that are suitable for use in in-stant messaging The protocol also excludes HTML constructs that may introduce malware and othervulnerabilities (such as scripts and objects) for improved security

LegalCopyrightThis XMPP Extension Protocol is copyright copy 1999 ndash 2020 by the XMPP Standards Foundation (XSF)

PermissionsPermission is hereby granted free of charge to any person obtaining a copy of this specification (therdquoSpecificationrdquo) to make use of the Specification without restriction including without limitation therights to implement the Specification in a software program deploy the Specification in a networkservice and copy modify merge publish translate distribute sublicense or sell copies of the Specifi-cation and to permit persons to whom the Specification is furnished to do so subject to the conditionthat the foregoing copyright notice and this permission notice shall be included in all copies or sub-stantial portions of the Specification Unless separate permission is granted modified works that areredistributed shall not contain misleading information regarding the authors title number or pub-lisher of the Specification and shall not claim endorsement of the modified works by the authors anyorganization or project to which the authors belong or the XMPP Standards Foundation

Warranty NOTE WELL This Specification is provided on an rdquoAS ISrdquo BASIS WITHOUT WARRANTIES OR CONDI-TIONS OF ANY KIND express or implied including without limitation any warranties or conditions ofTITLE NON-INFRINGEMENT MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE

LiabilityIn no event and under no legal theory whether in tort (including negligence) contract or otherwiseunless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writingshall the XMPP Standards Foundation or any author of this Specification be liable for damages includ-ing any direct indirect special incidental or consequential damages of any character arising fromout of or in connection with the Specification or the implementation deployment or other use of theSpecification (including but not limited to damages for loss of goodwill work stoppage computer fail-ure or malfunction or any and all other commercial damages or losses) even if the XMPP StandardsFoundation or such author has been advised of the possibility of such damages

ConformanceThis XMPP Extension Protocol has been contributed in full conformance with the XSFrsquos IntellectualProperty Rights Policy (a copy of which can be found at lthttpsxmpporgaboutxsfipr-policygtor obtained by writing to XMPP Standards Foundation PO Box 787 Parker CO 80134 USA)

Contents1 Introduction 1

2 Choice of Technology 1

3 Requirements 2

4 Concepts and Approach 3

5 Wrapper Element 4

6 XHTML-IM Integration Set 561 Structure Module Definition 662 Text Module Definition 663 Hypertext Module Definition 764 List Module Definition 765 Image Module Definition 866 Style Attribute Module Definition 8

7 Recommended Profile 871 Structure Module Profile 872 Text Module Profile 973 Hypertext Module Profile 974 List Module Profile 975 Image Module Profile 976 Style Attribute Module Profile 10

761 Recommended Style Properties 1077 Recommended Attributes 11

771 Common Attributes 11772 Specialized Attributes 12773 Other Attributes 13

78 Summary of Recommendations 13

8 Business Rules 14

9 Examples 15

10 Discovering Support for XHTML-IM 20101 Explicit Discovery 20102 Implicit Discovery 21

11 Security Considerations 21111 Malicious Objects 21112 Phishing 22113 Presence Leaks 22

12 W3C Considerations 22121 Document Type Name 22122 User Agent Conformance 23123 XHTML Modularization 24124 W3C Review 24125 XHTML Versioning 24

13 IANA Considerations 25

14 XMPP Registrar Considerations 25141 Protocol Namespaces 25

15 XML Schemas 25151 XHTML-IM Wrapper 25152 XHTML-IM Schema Driver 26153 XHTML-IM Content Model 28

16 Acknowledgements 32

2 CHOICE OF TECHNOLOGY

1 IntroductionThis document defines methods for exchanging instant messages that contain lightweighttext markup In the context of this document rdquolightweight text markuprdquo is to be understoodas a combination of minimal structural elements and presentational styles that can easilybe rendered on a wide variety of devices without requiring a full rich-text rendering enginesuch as a web browser Examples of lightweight text markup include basic text blocks(eg paragraphs and blockquotes) structural elements (eg emphasis and strength) listshyperlinks image references and font styles (eg sizes and colors)Note This specification essentially defines a recommended set of XHTML elements andattributes for use in instant messaging by applying a series of rdquofiltersrdquo to the wealth of XHTMLfeatures however developers of JabberXMPP clients mainly need to pay attention to theSummary of Recommendations rather than the complexities of how the recommendationsare derived

2 Choice of TechnologyIn the past there have existed several incompatible methods within the Jabber communityfor exchanging instant messages that contain lightweight text markup The most notablesuch methods have included derivatives of XHTML 10 1 as well as of Rich Text Format (RTF) 2Although it is sometimes easier for client developers to implement RTF support (this isespecially true on certain Microsoft Windows operating systems) there are several reasons(consistent with the XMPP Design Guidelines (XEP-0134) 3) for the XMPP Standards Foun-dation (XSF) 4 to avoid the use of RTF in developing a protocol for lightweight text markupSpecifically

1 RTF is not a structured vocabulary derived from SGML (as is HTML 40 5) or more rele-vantly from XML (as is XHTML 10)

2 RTF is under the control of the Microsoft Corporation and thus is not an open standardmaintained by a recognized standards development organization therefore the XSF isunable to contribute to or influence its development if necessary and any protocol theXSF developed using RTF would introduce unwanted dependencies

Conversely there are several reasons to prefer XHTML for lightweight text markup

1XHTML 10 lthttpwwww3orgTRxhtml1gt2Rich Text Format (RTF) Version 15 Specification lthttpmsdnmicrosoftcomlibraryen-usdnrtfspechtmlrtfspecaspgt

3XEP-0134 XMPP Design Guidelines lthttpsxmpporgextensionsxep-0134htmlgt4The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

5HTML 40 lthttpwwww3orgTRREC-html40gt

1

3 REQUIREMENTS

1 XHTML is a structured format that is defined as an application of XML 10 6 making itespecially appropriate for sending over JabberXMPP which is at root a technology forstreaming XML (see XMPP Core 7)

2 XHTML is an open standard developed by the World Wide Web Consortium (W3C) 8 arecognized standards development organization

Therefore this document defines support for lightweight text markup in the form of an XMPPextension that encapsulates content defined by an XHTML 10 Integration Set that we labelrdquoXHTML-IMrdquo The remainder of this document discusses lightweight text markup in terms ofXHTML 10 only and does not further consider RTF or other technologies

3 RequirementsHTML was originally designed for authoring and presenting stuctured documents on theWorldWideWeb and was subsequently extended to handlemore advanced functionality suchas image maps and interactive forms However the requirements for publishing documents(or developing transactional websites) for presentation by dedicated XHTML user agentson traditional computers or small-screen devices are fundamentally different from therequirements for lightweight text markup of instant messages for this reason only a reducedset of XHTML features is needed for XHTML-IM In particular

1 IM clients are not XHTML clients their primary purpose is not to read pre-existingXHTML documents but to read and generate relatively large numbers of fairly smallinstant messages

2 The underlying context for XHTML content in JabberXMPP instant messaging isprovided not by a full XHTML document but by an XML stream and specifically by amessage stanza within that stream Thus the ltheadgt element and all its children areunnecessary Only the ltbodygt element and some of its children are appropriate foruse in instant messaging

3 The XHTML content that is read by onersquos IM client is normally generated on the fly byonersquos conversation partner (or to be precise by his or her IM client) Thus there is aninherent limit to the sophistication of the XHTML markup involved Even in normalXHTML documents fairly basic structural and rendering elements such as definitionlists abbreviations addresses and computer input handling (eg ltkbdgt and ltvargt)

6Extensible Markup Language (XML) 10 (Fourth Edition) lthttpwwww3orgTRREC-xmlgt7RFC 6120 ExtensibleMessaging and Presence Protocol (XMPP) Core lthttptoolsietforghtmlrfc6120gt8The World Wide Web Consortium defines data formats and markup languages (such as HTML and XML) for useover the Internet For further information see lthttpwwww3orggt

2

4 CONCEPTS AND APPROACH

are relatively rare There is little or no foreseeable need for such elements within thecontext of instant messaging

4 The foregoing is doubly true of more advanced markup such as tables frames andforms (however there exists an XMPP extension that provides an instant messagingequivalent of the latter as defined in Data Forms (XEP-0004) 9)

5 Although ad-hoc styles are useful for messaging (by means of the rsquostylersquo attribute) fullsupport for Cascading Style Sheets 10 (defined by the ltstylegt element or a standalonecss file and implemented via the rsquoclassrsquo attribute) would be overkill since many CSS1properties (eg box classification and text properties) were developed especially forsophisticated page layout

6 Background images audio animated text layers applets scripts and other multimediacontent types are unnecessary especially given the existence of XMPP extensions suchas SI File Transfer (XEP-0096) 11 Jingle (XEP-0166) 12 and Jingle File Transfer (XEP-0234)13

7 Content transformations such as those defined by XSL Transformations 14 must notbe necessary in order for an instant messaging application to present lightweight textmarkup to an end user

As explained below some of these requirements are addressed by the definition of theXHTML-IM Integration Set itself while others are addressed by a recommended rdquoprofilerdquo forthat Integration Set in the context of instant messaging applications

4 Concepts and ApproachThis document defines an adaptation of XHTML 10 (specifically an XHTML 10 IntegrationSet) that makes it possible to provide lightweight text markup of instant messages (mainly forJabberXMPP instant messages although the Integration Set defined herein could be used byother protocols) This pattern is familiar from email wherein the HTML-formatted version of

9XEP-0004 Data Forms lthttpsxmpporgextensionsxep-0004htmlgt10Cascading Style Sheets level 1 lthttpwwww3orgTRREC-CSS1gt11XEP-0096 SI File Transfer lthttpsxmpporgextensionsxep-0096htmlgt12XEP-0166 Jingle lthttpsxmpporgextensionsxep-0166htmlgt13XEP-0234 Jingle File Transfer lthttpsxmpporgextensionsxep-0234htmlgt14XSL Transformations lthttpwwww3orgTRxsltgt

3

5 WRAPPER ELEMENT

the message supplements but does not supersede the text-only version of the message 15

In JabberXMPP communications the meaning (as opposed to markup) of the message MUSTalways be represented as best as possible in the normal ltbodygt child element or elementsof the ltmessagegt stanza qualified by the rsquojabberclientrsquo (or rsquojabberserverrsquo) namespaceLightweight text markup is then provided within an lthtmlgt element qualified by thersquohttpjabberorgprotocolxhtml-imrsquo namespace 16 However this lthtmlgt element is usedsolely as a rdquowrapperrdquo for the XHTML content itself which content is encapsulated via one ormore ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespace alongwith appropriate child elements thereofThe following example illustrates this approach

Listing 1 A simple exampleltmessage gt

ltbodygthiltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -weightbold rsquogthiltpgt

ltbodygtlthtmlgt

ltmessage gt

Technically speaking there are three aspects to the approach taken herein

1 Definition of the lthtmlgt rdquowrapperrdquo element which functions as an XMPP extensionwithin XMPP ltmessagegt stanzas

2 Definition of the XHTML-IM Integration Set itself in terms of supported XHTML 10mod-ules using the concepts defined in Modularization of XHTML 17

3 A recommended rdquoprofilerdquo regarding the specific XHTML 10 elements and attributes tobe supported from each XHTML 10 module

These three aspects are defined in the three document sections that follow

5 Wrapper ElementThe root element for including XHTML content within XMPP stanzas is lthtmlgt Thiselement is qualified by the rsquohttpjabberorgprotocolxhtml-imrsquo namespace From the

15The XHTML is merely an alternative version of the message body or bodies and the semantic meaning is to bederived from the textual message body or bodies rather than the XHTML version

16It might have been better to use an element name other than lthtmlgt for the wrapper element however chang-ing it would not be backwards-compatible with the older protocol and existing implementations

17Modularization of XHTML lthttpwwww3orgTR2004WD-xhtml-modularization-20040218gt

4

6 XHTML-IM INTEGRATION SET

perspective of XMPP the wrapper element functions as an XMPP extension element fromthe perspective of XHTML it functions as a wrapper for XHTML 10 content qualified by thersquohttpwwww3org1999xhtmlrsquo namespace Such XHTML content MUST be contained inone or more ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespaceand MUST conform to the XHTML-IM Integration Set defined in the following section If morethan one ltbodygt element is included in the lthtmlgt wrapper element each ltbodygt elementMUST possess an rsquoxmllangrsquo attribute with a distinct value where the value of that attributeMUST adhere to the rules defined in RFC 5646 18 A formal definition of the lthtmlgt elementis provided in the XHTML-IM Wrapper SchemaThe XHTML ltbodygt element is not to be confused with the XMPP ltbodygt element which is achild of a ltmessagegt stanza and is qualified by the rsquojabberclientrsquo or rsquojabberserverrsquo namespaceas described in XMPP IM 19 The lthtmlgt wrapper element is intended for inclusion only as adirect child element of the XMPP ltmessagegt stanza and only in order to specify a marked-upversion of the message ltbodygt element or elements but MAY be included elsewhere inaccordance with the rdquoextended namespacerdquo rules defined in the XMPP IM specificationUntil and unless (1) additional integration sets are defined and (2) mechanisms are specifiedfor discovering or negotiating which integration sets are supported the XHTML markupcontained within the lthtmlgt wrapper element

1 MUST NOT include elements and attributes that are not part of the XHTML-IM Integra-tion Set defined in Section 6 of this document any such elements and attributes MUSTbe ignored if received

2 SHOULD NOT include elements and attributes that are not part of the recommendedprofile of the XHTML-IM Integration Set defined in Section 7 of this document any suchelements and attributes SHOULD be ignored if received

Note In the foregoing restrictions the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

6 XHTML-IM Integration SetThis section defines an XHTML 10 Integration Set for use in the context of instant messagingGiven its intended usage we label it rdquoXHTML-IMrdquoModularization of XHTML provides the ability to formally define subsets of XHTML 10 viathe concept of rdquomodularizationrdquo (which may be familiar from XHTML Basic 20) Many of thedefined modules are not necessary or useful in the context of instant messaging and in the

18RFC 5646 Tags for Identifying Languages lthttptoolsietforghtmlrfc5646gt19RFC 6121 Extensible Messaging and Presence Protocol (XMPP) Instant Messaging and Presence lthttptool

sietforghtmlrfc6121gt20XHTML Basic lthttpwwww3orgTRxhtml-basicgt

5

6 XHTML-IM INTEGRATION SET

context of JabberXMPP instant messaging specifically some modules have been supersededby well-defined XMPP extensions This document specifies that XHTML-IM shall be based onthe following XHTML 10 modules

bull Core Modulesndash Structure Modulendash Text Modulendash Hypertext Modulendash List Module

bull Image Module

bull Style Attribute Module

Modularization of XHTML defines many additional modules such as Table Modules FormModules ObjectModules and FrameModules None of thesemodules is part of the XHTML-IMIntegration Set If support for such modules is desired it MUST be defined in a separate anddistinct integration set

61 Structure Module DefinitionThe Structure Module is defined as including the following elements and attributes 21

Element Attributesltbodygt class id title styleltheadgt profilelthtmlgt versionlttitlegt

62 Text Module DefinitionThe Text Module is defined as including the following elements and attributes

Element Attributesltabbrgt class id title style

21The rsquostylersquo attribute is specified herein where appropriate because the Style Attribute Module is included inthe definition of the XHTML-IM Integration Set whereas the event-related attributes (eg rsquoonclickrsquo) are notspecified because the Implicit Events Module is not included

6

6 XHTML-IM INTEGRATION SET

Element Attributesltacronymgt class id title styleltaddressgt class id title styleltblockquotegt class id title style citeltbrgt class id title styleltcitegt class id title styleltcodegt class id title styleltdfngt class id title styleltdivgt class id title styleltemgt class id title stylelth1gt class id title stylelth2gt class id title stylelth3gt class id title stylelth4gt class id title stylelth5gt class id title stylelth6gt class id title styleltkbdgt class id title styleltpgt class id title styleltpregt class id title styleltqgt class id title style citeltsampgt class id title styleltspangt class id title styleltstronggt class id title styleltvargt class id title style

63 Hypertext Module DefinitionThe Hypertext Module is defined as including the ltagt element only

Element Attributesltagt class id title style accesskey charset href hreflang rel rev tabindex type

64 List Module DefinitionThe List Module is defined as including the following elements and attributes

Element Attributesltdlgt class id title style

7

7 RECOMMENDED PROFILE

Element Attributesltdtgt class id title styleltddgt class id title styleltolgt class id title styleltulgt class id title styleltligt class id title style

65 Image Module DefinitionThe Image Module is defined as including the ltimggt element only

Element Attributesltimggt class id title style alt height longdesc src width

66 Style Attribute Module DefinitionThe Style Attribute Module is defined as including the style attribute only as included in thepreceding definition tables

7 Recommended ProfileEven within the restricted set of modules specified as defining the XHTML-IM Integration Set(see preceding section) some elements and attributes are inappropriate or unnecessary forthe purpose of instant messaging although such elements and attributes MAY be included inaccordance with the XHTML-IM Integration Set further recommended restrictions regardingwhich elements and attributes to include in XHTML content are specified below

71 Structure Module ProfileThe intent of the protocol defined herein is to support lightweight text markup of XMPPmessage bodies only Therefore the ltheadgt lthtmlgt and lttitlegt elements SHOULD NOTbe generated by a compliant implementation and SHOULD be ignored if received (wherethe meaning of rdquoignorerdquo is defined by the conformance requirements of Modularization ofXHTML as summarized in the User Agent Conformance section of this document) Howeverthe ltbodygt element is REQUIRED since it is the root element for all XHTML-IM content

8

7 RECOMMENDED PROFILE

72 Text Module ProfileNot all of the Text Module elements are appropriate in the context of instant messagingsince the XHTML content that one views is generated by onersquos conversation partner in whatis often a rapid-fire conversation thread Only the following elements are RECOMMENDED inXHTML-IM

bull ltblockquotegt

bull ltbrgt

bull ltcitegt

bull ltemgt

bull ltpgt

bull ltspangt

bull ltstronggt

The other Text Module elements MAY be generated by a compliant implementation butMAY be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

73 Hypertext Module ProfileThe only recommended attributes of the ltagt element are specified in the RecommendedAttributes section of this document

74 List Module ProfileBecause it is unlikely that an instant messaging user would generate a definition list onlyordered and unordered lists are RECOMMENDED Definition lists SHOULD NOT be generatedby a compliant implementation and SHOULD be ignored if received (where the meaningof rdquoignorerdquo is defined by the conformance requirements of Modularization of XHTML assummarized in the User Agent Conformance section of this document)

75 Image Module ProfileThe only recommended attributes of the ltimggt element are specified in the RecommendedAttributes section of this document

9

7 RECOMMENDED PROFILE

The XHTML specification allows a rdquodatardquo URL RFC 2397 22 as the value of the rsquosrcrsquo attributeThis is NOT RECOMMENDED for use in XHTML-IM because it can significantly increase thesize of the message stanza and XMPP is not optimized for large stanzas If the image datais small (less than 8 kilobytes) clients MAY use Bits of Binary (XEP-0231) 23 in coordinationwith XHTML-IM if the image data is large the value of the rsquosrcrsquo SHOULD be a pointer to anexternally available file for the image (or the sender SHOULD use a dedicated file transfermethod such as In-Band Bytestreams (XEP-0047) 24 or SOCKS5 Bytestreams (XEP-0065) 25)For security reasons or because of display constraints a compliant client MAY choose todisplay rsquoaltrsquo text only not the image itself (for details see the Malicious Objects section of thisdocument)

76 Style Attribute Module ProfileThis module MUST be supported in XHTML-IM if possible although clients written for certainplatforms (eg console clients mobile phones and handheld computers) or for certain classesof users (eg text-to-speech clients) might not be able to support all of the recommendedstyles directly they SHOULD attempt to emulate or translate the defined style propertiesinto text or other presentation styles that are appropriate for the platform or user base inquestionA full list of recommended style properties is provided below

761 Recommended Style Properties

CSS1 defines 42 rdquoatomicrdquo style properties (which are categorized into font color andbackground text box and classification properties) as well as 11 rdquoshorthandrdquo properties(rdquofontrdquo rdquobackgroundrdquo rdquomarginrdquo rdquopaddingrdquo rdquoborder-widthrdquo rdquoborder-toprdquo rdquoborder-rightrdquordquoborder-bottomrdquo rdquoborder-leftrdquo rdquoborderrdquo and rdquolist-stylerdquo) Many of these properties are notappropriate for use in text-based instant messaging for one or more of the following reasons

1 The property applies to or depends on the inclusion of images other than those han-dled by the XHTML ImageModule (eg the rdquobackground-imagerdquo rdquobackground-repeatrdquordquobackground-attachmentrdquo rdquobackground-positionrdquo and rdquolist-style-imagerdquo properties)

2 The property is intended for advanced document layout (eg the rdquoline-heightrdquo propertyand most of the box properties with the exception of rdquomargin-leftrdquo which is useful forindenting text and rdquomargin-rightrdquo which can be useful when dealing with images)

3 The property is unnecessary since it can be emulated via user input or recommendedXHTML stuctural elements (eg the rdquotext-transformrdquo property can be emulated by the

22RFC 2397 The data URL scheme lthttptoolsietforghtmlrfc2397gt23XEP-0231 Bits of Binary lthttpsxmpporgextensionsxep-0231htmlgt24XEP-0047 In-Band Bytestreams lthttpsxmpporgextensionsxep-0047htmlgt25XEP-0065 SOCKS5 Bytestreams lthttpsxmpporgextensionsxep-0065htmlgt

10

7 RECOMMENDED PROFILE

userrsquos keystrokes or use of the caps lock key)

4 The property is otherwise unlikely to ever be used in the context of rapid-fire con-versations (eg the rdquofont-variantrdquo rdquoword-spacingrdquo rdquoletter-spacingrdquo and rdquolist-style-positionrdquo properties)

5 The property is a shorthand property but some of the properties it includes are not ap-propriate for instant messaging applications according to the foregoing considerations(in fact this applies to all of the shorthand properties)

Unfortunately CSS1 does not include mechanisms for defining profiles thereof (as doesXHTML 10 in the form of XHTML Modularization) While there exist reduced sets of CSS2these introduce more complexity than is desirable in the context of XHTML-IM Therefore wesimply provide a list of recommended CSS1 style propertiesXHTML-IM stipulates that only the following style properties are RECOMMENDED

bull background-color

bull color

bull font-family

bull font-size

bull font-style

bull font-weight

bull margin-left

bull margin-right

bull text-align

bull text-decoration

Although a compliant implementationMAY generate or process other style properties definedin CSS1 such behavior is NOT RECOMMENDED by this document

77 Recommended Attributes771 Common Attributes

Section 51 of Modularization of XHTML describes several rdquocommonrdquo attribute collections ardquoCorerdquo collection (rsquoclassrsquo rsquoidrsquo rsquotitlersquo) an rdquoI18Nrdquo collection (rsquoxmllangrsquo not shown below sinceit is implied in XML) an rdquoEventsrdquo collection (not included in the XHTML-IM Integration Setbecause the Intrinsic Events Module is not selected) and a rdquoStylerdquo collection (rsquostylersquo) The

11

7 RECOMMENDED PROFILE

following table summarizes the recommended profile of these common attributes within theXHTML 10 content itself

Attribute Usage Reasonclass NOT RECOMMENDED External stylesheets (which rsquoclassrsquo

would typically reference) are notrecommended

id NOT RECOMMENDED Internal links and message fragmentsare not recommended in IM contentnor are external stylesheets (whichalso make use of the rsquoidrsquo attribute)

title NOT RECOMMENDED Granting of titles to elements in IMcontent seems unnecessary

style REQUIRED The rsquostylersquo attribute is required since itis the vehicle for presentational styles

xmllang NOT RECOMMENDED Differentiation of language identifica-tion SHOULD occur at the level of theltbodygt element only

772 Specialized Attributes

Beyond the rdquocommonrdquo attributes certain elements within the modules selected for theXHTML-IM Integration Set are allowed to possess other attributes such as eight attributes forthe ltagt element and five attributes for the ltimggt element The recommended profile forsuch attributes is provided in the following table

Element Scope Attribute Usageltagt accesskey NOT RECOMMENDEDltagt charset NOT RECOMMENDEDltagt href REQUIREDltagt hreflang NOT RECOMMENDEDltagt rel NOT RECOMMENDEDltagt rev NOT RECOMMENDEDltagt tabindex NOT RECOMMENDEDltagt type RECOMMENDEDltcitegt uri NOT RECOMMENDEDltimggt alt REQUIREDltimggt height RECOMMENDEDltimggt longdesc NOT RECOMMENDEDltimggt src REQUIREDltimggt width RECOMMENDED

12

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 2: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

LegalCopyrightThis XMPP Extension Protocol is copyright copy 1999 ndash 2020 by the XMPP Standards Foundation (XSF)

PermissionsPermission is hereby granted free of charge to any person obtaining a copy of this specification (therdquoSpecificationrdquo) to make use of the Specification without restriction including without limitation therights to implement the Specification in a software program deploy the Specification in a networkservice and copy modify merge publish translate distribute sublicense or sell copies of the Specifi-cation and to permit persons to whom the Specification is furnished to do so subject to the conditionthat the foregoing copyright notice and this permission notice shall be included in all copies or sub-stantial portions of the Specification Unless separate permission is granted modified works that areredistributed shall not contain misleading information regarding the authors title number or pub-lisher of the Specification and shall not claim endorsement of the modified works by the authors anyorganization or project to which the authors belong or the XMPP Standards Foundation

Warranty NOTE WELL This Specification is provided on an rdquoAS ISrdquo BASIS WITHOUT WARRANTIES OR CONDI-TIONS OF ANY KIND express or implied including without limitation any warranties or conditions ofTITLE NON-INFRINGEMENT MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE

LiabilityIn no event and under no legal theory whether in tort (including negligence) contract or otherwiseunless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writingshall the XMPP Standards Foundation or any author of this Specification be liable for damages includ-ing any direct indirect special incidental or consequential damages of any character arising fromout of or in connection with the Specification or the implementation deployment or other use of theSpecification (including but not limited to damages for loss of goodwill work stoppage computer fail-ure or malfunction or any and all other commercial damages or losses) even if the XMPP StandardsFoundation or such author has been advised of the possibility of such damages

ConformanceThis XMPP Extension Protocol has been contributed in full conformance with the XSFrsquos IntellectualProperty Rights Policy (a copy of which can be found at lthttpsxmpporgaboutxsfipr-policygtor obtained by writing to XMPP Standards Foundation PO Box 787 Parker CO 80134 USA)

Contents1 Introduction 1

2 Choice of Technology 1

3 Requirements 2

4 Concepts and Approach 3

5 Wrapper Element 4

6 XHTML-IM Integration Set 561 Structure Module Definition 662 Text Module Definition 663 Hypertext Module Definition 764 List Module Definition 765 Image Module Definition 866 Style Attribute Module Definition 8

7 Recommended Profile 871 Structure Module Profile 872 Text Module Profile 973 Hypertext Module Profile 974 List Module Profile 975 Image Module Profile 976 Style Attribute Module Profile 10

761 Recommended Style Properties 1077 Recommended Attributes 11

771 Common Attributes 11772 Specialized Attributes 12773 Other Attributes 13

78 Summary of Recommendations 13

8 Business Rules 14

9 Examples 15

10 Discovering Support for XHTML-IM 20101 Explicit Discovery 20102 Implicit Discovery 21

11 Security Considerations 21111 Malicious Objects 21112 Phishing 22113 Presence Leaks 22

12 W3C Considerations 22121 Document Type Name 22122 User Agent Conformance 23123 XHTML Modularization 24124 W3C Review 24125 XHTML Versioning 24

13 IANA Considerations 25

14 XMPP Registrar Considerations 25141 Protocol Namespaces 25

15 XML Schemas 25151 XHTML-IM Wrapper 25152 XHTML-IM Schema Driver 26153 XHTML-IM Content Model 28

16 Acknowledgements 32

2 CHOICE OF TECHNOLOGY

1 IntroductionThis document defines methods for exchanging instant messages that contain lightweighttext markup In the context of this document rdquolightweight text markuprdquo is to be understoodas a combination of minimal structural elements and presentational styles that can easilybe rendered on a wide variety of devices without requiring a full rich-text rendering enginesuch as a web browser Examples of lightweight text markup include basic text blocks(eg paragraphs and blockquotes) structural elements (eg emphasis and strength) listshyperlinks image references and font styles (eg sizes and colors)Note This specification essentially defines a recommended set of XHTML elements andattributes for use in instant messaging by applying a series of rdquofiltersrdquo to the wealth of XHTMLfeatures however developers of JabberXMPP clients mainly need to pay attention to theSummary of Recommendations rather than the complexities of how the recommendationsare derived

2 Choice of TechnologyIn the past there have existed several incompatible methods within the Jabber communityfor exchanging instant messages that contain lightweight text markup The most notablesuch methods have included derivatives of XHTML 10 1 as well as of Rich Text Format (RTF) 2Although it is sometimes easier for client developers to implement RTF support (this isespecially true on certain Microsoft Windows operating systems) there are several reasons(consistent with the XMPP Design Guidelines (XEP-0134) 3) for the XMPP Standards Foun-dation (XSF) 4 to avoid the use of RTF in developing a protocol for lightweight text markupSpecifically

1 RTF is not a structured vocabulary derived from SGML (as is HTML 40 5) or more rele-vantly from XML (as is XHTML 10)

2 RTF is under the control of the Microsoft Corporation and thus is not an open standardmaintained by a recognized standards development organization therefore the XSF isunable to contribute to or influence its development if necessary and any protocol theXSF developed using RTF would introduce unwanted dependencies

Conversely there are several reasons to prefer XHTML for lightweight text markup

1XHTML 10 lthttpwwww3orgTRxhtml1gt2Rich Text Format (RTF) Version 15 Specification lthttpmsdnmicrosoftcomlibraryen-usdnrtfspechtmlrtfspecaspgt

3XEP-0134 XMPP Design Guidelines lthttpsxmpporgextensionsxep-0134htmlgt4The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

5HTML 40 lthttpwwww3orgTRREC-html40gt

1

3 REQUIREMENTS

1 XHTML is a structured format that is defined as an application of XML 10 6 making itespecially appropriate for sending over JabberXMPP which is at root a technology forstreaming XML (see XMPP Core 7)

2 XHTML is an open standard developed by the World Wide Web Consortium (W3C) 8 arecognized standards development organization

Therefore this document defines support for lightweight text markup in the form of an XMPPextension that encapsulates content defined by an XHTML 10 Integration Set that we labelrdquoXHTML-IMrdquo The remainder of this document discusses lightweight text markup in terms ofXHTML 10 only and does not further consider RTF or other technologies

3 RequirementsHTML was originally designed for authoring and presenting stuctured documents on theWorldWideWeb and was subsequently extended to handlemore advanced functionality suchas image maps and interactive forms However the requirements for publishing documents(or developing transactional websites) for presentation by dedicated XHTML user agentson traditional computers or small-screen devices are fundamentally different from therequirements for lightweight text markup of instant messages for this reason only a reducedset of XHTML features is needed for XHTML-IM In particular

1 IM clients are not XHTML clients their primary purpose is not to read pre-existingXHTML documents but to read and generate relatively large numbers of fairly smallinstant messages

2 The underlying context for XHTML content in JabberXMPP instant messaging isprovided not by a full XHTML document but by an XML stream and specifically by amessage stanza within that stream Thus the ltheadgt element and all its children areunnecessary Only the ltbodygt element and some of its children are appropriate foruse in instant messaging

3 The XHTML content that is read by onersquos IM client is normally generated on the fly byonersquos conversation partner (or to be precise by his or her IM client) Thus there is aninherent limit to the sophistication of the XHTML markup involved Even in normalXHTML documents fairly basic structural and rendering elements such as definitionlists abbreviations addresses and computer input handling (eg ltkbdgt and ltvargt)

6Extensible Markup Language (XML) 10 (Fourth Edition) lthttpwwww3orgTRREC-xmlgt7RFC 6120 ExtensibleMessaging and Presence Protocol (XMPP) Core lthttptoolsietforghtmlrfc6120gt8The World Wide Web Consortium defines data formats and markup languages (such as HTML and XML) for useover the Internet For further information see lthttpwwww3orggt

2

4 CONCEPTS AND APPROACH

are relatively rare There is little or no foreseeable need for such elements within thecontext of instant messaging

4 The foregoing is doubly true of more advanced markup such as tables frames andforms (however there exists an XMPP extension that provides an instant messagingequivalent of the latter as defined in Data Forms (XEP-0004) 9)

5 Although ad-hoc styles are useful for messaging (by means of the rsquostylersquo attribute) fullsupport for Cascading Style Sheets 10 (defined by the ltstylegt element or a standalonecss file and implemented via the rsquoclassrsquo attribute) would be overkill since many CSS1properties (eg box classification and text properties) were developed especially forsophisticated page layout

6 Background images audio animated text layers applets scripts and other multimediacontent types are unnecessary especially given the existence of XMPP extensions suchas SI File Transfer (XEP-0096) 11 Jingle (XEP-0166) 12 and Jingle File Transfer (XEP-0234)13

7 Content transformations such as those defined by XSL Transformations 14 must notbe necessary in order for an instant messaging application to present lightweight textmarkup to an end user

As explained below some of these requirements are addressed by the definition of theXHTML-IM Integration Set itself while others are addressed by a recommended rdquoprofilerdquo forthat Integration Set in the context of instant messaging applications

4 Concepts and ApproachThis document defines an adaptation of XHTML 10 (specifically an XHTML 10 IntegrationSet) that makes it possible to provide lightweight text markup of instant messages (mainly forJabberXMPP instant messages although the Integration Set defined herein could be used byother protocols) This pattern is familiar from email wherein the HTML-formatted version of

9XEP-0004 Data Forms lthttpsxmpporgextensionsxep-0004htmlgt10Cascading Style Sheets level 1 lthttpwwww3orgTRREC-CSS1gt11XEP-0096 SI File Transfer lthttpsxmpporgextensionsxep-0096htmlgt12XEP-0166 Jingle lthttpsxmpporgextensionsxep-0166htmlgt13XEP-0234 Jingle File Transfer lthttpsxmpporgextensionsxep-0234htmlgt14XSL Transformations lthttpwwww3orgTRxsltgt

3

5 WRAPPER ELEMENT

the message supplements but does not supersede the text-only version of the message 15

In JabberXMPP communications the meaning (as opposed to markup) of the message MUSTalways be represented as best as possible in the normal ltbodygt child element or elementsof the ltmessagegt stanza qualified by the rsquojabberclientrsquo (or rsquojabberserverrsquo) namespaceLightweight text markup is then provided within an lthtmlgt element qualified by thersquohttpjabberorgprotocolxhtml-imrsquo namespace 16 However this lthtmlgt element is usedsolely as a rdquowrapperrdquo for the XHTML content itself which content is encapsulated via one ormore ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespace alongwith appropriate child elements thereofThe following example illustrates this approach

Listing 1 A simple exampleltmessage gt

ltbodygthiltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -weightbold rsquogthiltpgt

ltbodygtlthtmlgt

ltmessage gt

Technically speaking there are three aspects to the approach taken herein

1 Definition of the lthtmlgt rdquowrapperrdquo element which functions as an XMPP extensionwithin XMPP ltmessagegt stanzas

2 Definition of the XHTML-IM Integration Set itself in terms of supported XHTML 10mod-ules using the concepts defined in Modularization of XHTML 17

3 A recommended rdquoprofilerdquo regarding the specific XHTML 10 elements and attributes tobe supported from each XHTML 10 module

These three aspects are defined in the three document sections that follow

5 Wrapper ElementThe root element for including XHTML content within XMPP stanzas is lthtmlgt Thiselement is qualified by the rsquohttpjabberorgprotocolxhtml-imrsquo namespace From the

15The XHTML is merely an alternative version of the message body or bodies and the semantic meaning is to bederived from the textual message body or bodies rather than the XHTML version

16It might have been better to use an element name other than lthtmlgt for the wrapper element however chang-ing it would not be backwards-compatible with the older protocol and existing implementations

17Modularization of XHTML lthttpwwww3orgTR2004WD-xhtml-modularization-20040218gt

4

6 XHTML-IM INTEGRATION SET

perspective of XMPP the wrapper element functions as an XMPP extension element fromthe perspective of XHTML it functions as a wrapper for XHTML 10 content qualified by thersquohttpwwww3org1999xhtmlrsquo namespace Such XHTML content MUST be contained inone or more ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespaceand MUST conform to the XHTML-IM Integration Set defined in the following section If morethan one ltbodygt element is included in the lthtmlgt wrapper element each ltbodygt elementMUST possess an rsquoxmllangrsquo attribute with a distinct value where the value of that attributeMUST adhere to the rules defined in RFC 5646 18 A formal definition of the lthtmlgt elementis provided in the XHTML-IM Wrapper SchemaThe XHTML ltbodygt element is not to be confused with the XMPP ltbodygt element which is achild of a ltmessagegt stanza and is qualified by the rsquojabberclientrsquo or rsquojabberserverrsquo namespaceas described in XMPP IM 19 The lthtmlgt wrapper element is intended for inclusion only as adirect child element of the XMPP ltmessagegt stanza and only in order to specify a marked-upversion of the message ltbodygt element or elements but MAY be included elsewhere inaccordance with the rdquoextended namespacerdquo rules defined in the XMPP IM specificationUntil and unless (1) additional integration sets are defined and (2) mechanisms are specifiedfor discovering or negotiating which integration sets are supported the XHTML markupcontained within the lthtmlgt wrapper element

1 MUST NOT include elements and attributes that are not part of the XHTML-IM Integra-tion Set defined in Section 6 of this document any such elements and attributes MUSTbe ignored if received

2 SHOULD NOT include elements and attributes that are not part of the recommendedprofile of the XHTML-IM Integration Set defined in Section 7 of this document any suchelements and attributes SHOULD be ignored if received

Note In the foregoing restrictions the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

6 XHTML-IM Integration SetThis section defines an XHTML 10 Integration Set for use in the context of instant messagingGiven its intended usage we label it rdquoXHTML-IMrdquoModularization of XHTML provides the ability to formally define subsets of XHTML 10 viathe concept of rdquomodularizationrdquo (which may be familiar from XHTML Basic 20) Many of thedefined modules are not necessary or useful in the context of instant messaging and in the

18RFC 5646 Tags for Identifying Languages lthttptoolsietforghtmlrfc5646gt19RFC 6121 Extensible Messaging and Presence Protocol (XMPP) Instant Messaging and Presence lthttptool

sietforghtmlrfc6121gt20XHTML Basic lthttpwwww3orgTRxhtml-basicgt

5

6 XHTML-IM INTEGRATION SET

context of JabberXMPP instant messaging specifically some modules have been supersededby well-defined XMPP extensions This document specifies that XHTML-IM shall be based onthe following XHTML 10 modules

bull Core Modulesndash Structure Modulendash Text Modulendash Hypertext Modulendash List Module

bull Image Module

bull Style Attribute Module

Modularization of XHTML defines many additional modules such as Table Modules FormModules ObjectModules and FrameModules None of thesemodules is part of the XHTML-IMIntegration Set If support for such modules is desired it MUST be defined in a separate anddistinct integration set

61 Structure Module DefinitionThe Structure Module is defined as including the following elements and attributes 21

Element Attributesltbodygt class id title styleltheadgt profilelthtmlgt versionlttitlegt

62 Text Module DefinitionThe Text Module is defined as including the following elements and attributes

Element Attributesltabbrgt class id title style

21The rsquostylersquo attribute is specified herein where appropriate because the Style Attribute Module is included inthe definition of the XHTML-IM Integration Set whereas the event-related attributes (eg rsquoonclickrsquo) are notspecified because the Implicit Events Module is not included

6

6 XHTML-IM INTEGRATION SET

Element Attributesltacronymgt class id title styleltaddressgt class id title styleltblockquotegt class id title style citeltbrgt class id title styleltcitegt class id title styleltcodegt class id title styleltdfngt class id title styleltdivgt class id title styleltemgt class id title stylelth1gt class id title stylelth2gt class id title stylelth3gt class id title stylelth4gt class id title stylelth5gt class id title stylelth6gt class id title styleltkbdgt class id title styleltpgt class id title styleltpregt class id title styleltqgt class id title style citeltsampgt class id title styleltspangt class id title styleltstronggt class id title styleltvargt class id title style

63 Hypertext Module DefinitionThe Hypertext Module is defined as including the ltagt element only

Element Attributesltagt class id title style accesskey charset href hreflang rel rev tabindex type

64 List Module DefinitionThe List Module is defined as including the following elements and attributes

Element Attributesltdlgt class id title style

7

7 RECOMMENDED PROFILE

Element Attributesltdtgt class id title styleltddgt class id title styleltolgt class id title styleltulgt class id title styleltligt class id title style

65 Image Module DefinitionThe Image Module is defined as including the ltimggt element only

Element Attributesltimggt class id title style alt height longdesc src width

66 Style Attribute Module DefinitionThe Style Attribute Module is defined as including the style attribute only as included in thepreceding definition tables

7 Recommended ProfileEven within the restricted set of modules specified as defining the XHTML-IM Integration Set(see preceding section) some elements and attributes are inappropriate or unnecessary forthe purpose of instant messaging although such elements and attributes MAY be included inaccordance with the XHTML-IM Integration Set further recommended restrictions regardingwhich elements and attributes to include in XHTML content are specified below

71 Structure Module ProfileThe intent of the protocol defined herein is to support lightweight text markup of XMPPmessage bodies only Therefore the ltheadgt lthtmlgt and lttitlegt elements SHOULD NOTbe generated by a compliant implementation and SHOULD be ignored if received (wherethe meaning of rdquoignorerdquo is defined by the conformance requirements of Modularization ofXHTML as summarized in the User Agent Conformance section of this document) Howeverthe ltbodygt element is REQUIRED since it is the root element for all XHTML-IM content

8

7 RECOMMENDED PROFILE

72 Text Module ProfileNot all of the Text Module elements are appropriate in the context of instant messagingsince the XHTML content that one views is generated by onersquos conversation partner in whatis often a rapid-fire conversation thread Only the following elements are RECOMMENDED inXHTML-IM

bull ltblockquotegt

bull ltbrgt

bull ltcitegt

bull ltemgt

bull ltpgt

bull ltspangt

bull ltstronggt

The other Text Module elements MAY be generated by a compliant implementation butMAY be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

73 Hypertext Module ProfileThe only recommended attributes of the ltagt element are specified in the RecommendedAttributes section of this document

74 List Module ProfileBecause it is unlikely that an instant messaging user would generate a definition list onlyordered and unordered lists are RECOMMENDED Definition lists SHOULD NOT be generatedby a compliant implementation and SHOULD be ignored if received (where the meaningof rdquoignorerdquo is defined by the conformance requirements of Modularization of XHTML assummarized in the User Agent Conformance section of this document)

75 Image Module ProfileThe only recommended attributes of the ltimggt element are specified in the RecommendedAttributes section of this document

9

7 RECOMMENDED PROFILE

The XHTML specification allows a rdquodatardquo URL RFC 2397 22 as the value of the rsquosrcrsquo attributeThis is NOT RECOMMENDED for use in XHTML-IM because it can significantly increase thesize of the message stanza and XMPP is not optimized for large stanzas If the image datais small (less than 8 kilobytes) clients MAY use Bits of Binary (XEP-0231) 23 in coordinationwith XHTML-IM if the image data is large the value of the rsquosrcrsquo SHOULD be a pointer to anexternally available file for the image (or the sender SHOULD use a dedicated file transfermethod such as In-Band Bytestreams (XEP-0047) 24 or SOCKS5 Bytestreams (XEP-0065) 25)For security reasons or because of display constraints a compliant client MAY choose todisplay rsquoaltrsquo text only not the image itself (for details see the Malicious Objects section of thisdocument)

76 Style Attribute Module ProfileThis module MUST be supported in XHTML-IM if possible although clients written for certainplatforms (eg console clients mobile phones and handheld computers) or for certain classesof users (eg text-to-speech clients) might not be able to support all of the recommendedstyles directly they SHOULD attempt to emulate or translate the defined style propertiesinto text or other presentation styles that are appropriate for the platform or user base inquestionA full list of recommended style properties is provided below

761 Recommended Style Properties

CSS1 defines 42 rdquoatomicrdquo style properties (which are categorized into font color andbackground text box and classification properties) as well as 11 rdquoshorthandrdquo properties(rdquofontrdquo rdquobackgroundrdquo rdquomarginrdquo rdquopaddingrdquo rdquoborder-widthrdquo rdquoborder-toprdquo rdquoborder-rightrdquordquoborder-bottomrdquo rdquoborder-leftrdquo rdquoborderrdquo and rdquolist-stylerdquo) Many of these properties are notappropriate for use in text-based instant messaging for one or more of the following reasons

1 The property applies to or depends on the inclusion of images other than those han-dled by the XHTML ImageModule (eg the rdquobackground-imagerdquo rdquobackground-repeatrdquordquobackground-attachmentrdquo rdquobackground-positionrdquo and rdquolist-style-imagerdquo properties)

2 The property is intended for advanced document layout (eg the rdquoline-heightrdquo propertyand most of the box properties with the exception of rdquomargin-leftrdquo which is useful forindenting text and rdquomargin-rightrdquo which can be useful when dealing with images)

3 The property is unnecessary since it can be emulated via user input or recommendedXHTML stuctural elements (eg the rdquotext-transformrdquo property can be emulated by the

22RFC 2397 The data URL scheme lthttptoolsietforghtmlrfc2397gt23XEP-0231 Bits of Binary lthttpsxmpporgextensionsxep-0231htmlgt24XEP-0047 In-Band Bytestreams lthttpsxmpporgextensionsxep-0047htmlgt25XEP-0065 SOCKS5 Bytestreams lthttpsxmpporgextensionsxep-0065htmlgt

10

7 RECOMMENDED PROFILE

userrsquos keystrokes or use of the caps lock key)

4 The property is otherwise unlikely to ever be used in the context of rapid-fire con-versations (eg the rdquofont-variantrdquo rdquoword-spacingrdquo rdquoletter-spacingrdquo and rdquolist-style-positionrdquo properties)

5 The property is a shorthand property but some of the properties it includes are not ap-propriate for instant messaging applications according to the foregoing considerations(in fact this applies to all of the shorthand properties)

Unfortunately CSS1 does not include mechanisms for defining profiles thereof (as doesXHTML 10 in the form of XHTML Modularization) While there exist reduced sets of CSS2these introduce more complexity than is desirable in the context of XHTML-IM Therefore wesimply provide a list of recommended CSS1 style propertiesXHTML-IM stipulates that only the following style properties are RECOMMENDED

bull background-color

bull color

bull font-family

bull font-size

bull font-style

bull font-weight

bull margin-left

bull margin-right

bull text-align

bull text-decoration

Although a compliant implementationMAY generate or process other style properties definedin CSS1 such behavior is NOT RECOMMENDED by this document

77 Recommended Attributes771 Common Attributes

Section 51 of Modularization of XHTML describes several rdquocommonrdquo attribute collections ardquoCorerdquo collection (rsquoclassrsquo rsquoidrsquo rsquotitlersquo) an rdquoI18Nrdquo collection (rsquoxmllangrsquo not shown below sinceit is implied in XML) an rdquoEventsrdquo collection (not included in the XHTML-IM Integration Setbecause the Intrinsic Events Module is not selected) and a rdquoStylerdquo collection (rsquostylersquo) The

11

7 RECOMMENDED PROFILE

following table summarizes the recommended profile of these common attributes within theXHTML 10 content itself

Attribute Usage Reasonclass NOT RECOMMENDED External stylesheets (which rsquoclassrsquo

would typically reference) are notrecommended

id NOT RECOMMENDED Internal links and message fragmentsare not recommended in IM contentnor are external stylesheets (whichalso make use of the rsquoidrsquo attribute)

title NOT RECOMMENDED Granting of titles to elements in IMcontent seems unnecessary

style REQUIRED The rsquostylersquo attribute is required since itis the vehicle for presentational styles

xmllang NOT RECOMMENDED Differentiation of language identifica-tion SHOULD occur at the level of theltbodygt element only

772 Specialized Attributes

Beyond the rdquocommonrdquo attributes certain elements within the modules selected for theXHTML-IM Integration Set are allowed to possess other attributes such as eight attributes forthe ltagt element and five attributes for the ltimggt element The recommended profile forsuch attributes is provided in the following table

Element Scope Attribute Usageltagt accesskey NOT RECOMMENDEDltagt charset NOT RECOMMENDEDltagt href REQUIREDltagt hreflang NOT RECOMMENDEDltagt rel NOT RECOMMENDEDltagt rev NOT RECOMMENDEDltagt tabindex NOT RECOMMENDEDltagt type RECOMMENDEDltcitegt uri NOT RECOMMENDEDltimggt alt REQUIREDltimggt height RECOMMENDEDltimggt longdesc NOT RECOMMENDEDltimggt src REQUIREDltimggt width RECOMMENDED

12

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 3: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

Contents1 Introduction 1

2 Choice of Technology 1

3 Requirements 2

4 Concepts and Approach 3

5 Wrapper Element 4

6 XHTML-IM Integration Set 561 Structure Module Definition 662 Text Module Definition 663 Hypertext Module Definition 764 List Module Definition 765 Image Module Definition 866 Style Attribute Module Definition 8

7 Recommended Profile 871 Structure Module Profile 872 Text Module Profile 973 Hypertext Module Profile 974 List Module Profile 975 Image Module Profile 976 Style Attribute Module Profile 10

761 Recommended Style Properties 1077 Recommended Attributes 11

771 Common Attributes 11772 Specialized Attributes 12773 Other Attributes 13

78 Summary of Recommendations 13

8 Business Rules 14

9 Examples 15

10 Discovering Support for XHTML-IM 20101 Explicit Discovery 20102 Implicit Discovery 21

11 Security Considerations 21111 Malicious Objects 21112 Phishing 22113 Presence Leaks 22

12 W3C Considerations 22121 Document Type Name 22122 User Agent Conformance 23123 XHTML Modularization 24124 W3C Review 24125 XHTML Versioning 24

13 IANA Considerations 25

14 XMPP Registrar Considerations 25141 Protocol Namespaces 25

15 XML Schemas 25151 XHTML-IM Wrapper 25152 XHTML-IM Schema Driver 26153 XHTML-IM Content Model 28

16 Acknowledgements 32

2 CHOICE OF TECHNOLOGY

1 IntroductionThis document defines methods for exchanging instant messages that contain lightweighttext markup In the context of this document rdquolightweight text markuprdquo is to be understoodas a combination of minimal structural elements and presentational styles that can easilybe rendered on a wide variety of devices without requiring a full rich-text rendering enginesuch as a web browser Examples of lightweight text markup include basic text blocks(eg paragraphs and blockquotes) structural elements (eg emphasis and strength) listshyperlinks image references and font styles (eg sizes and colors)Note This specification essentially defines a recommended set of XHTML elements andattributes for use in instant messaging by applying a series of rdquofiltersrdquo to the wealth of XHTMLfeatures however developers of JabberXMPP clients mainly need to pay attention to theSummary of Recommendations rather than the complexities of how the recommendationsare derived

2 Choice of TechnologyIn the past there have existed several incompatible methods within the Jabber communityfor exchanging instant messages that contain lightweight text markup The most notablesuch methods have included derivatives of XHTML 10 1 as well as of Rich Text Format (RTF) 2Although it is sometimes easier for client developers to implement RTF support (this isespecially true on certain Microsoft Windows operating systems) there are several reasons(consistent with the XMPP Design Guidelines (XEP-0134) 3) for the XMPP Standards Foun-dation (XSF) 4 to avoid the use of RTF in developing a protocol for lightweight text markupSpecifically

1 RTF is not a structured vocabulary derived from SGML (as is HTML 40 5) or more rele-vantly from XML (as is XHTML 10)

2 RTF is under the control of the Microsoft Corporation and thus is not an open standardmaintained by a recognized standards development organization therefore the XSF isunable to contribute to or influence its development if necessary and any protocol theXSF developed using RTF would introduce unwanted dependencies

Conversely there are several reasons to prefer XHTML for lightweight text markup

1XHTML 10 lthttpwwww3orgTRxhtml1gt2Rich Text Format (RTF) Version 15 Specification lthttpmsdnmicrosoftcomlibraryen-usdnrtfspechtmlrtfspecaspgt

3XEP-0134 XMPP Design Guidelines lthttpsxmpporgextensionsxep-0134htmlgt4The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

5HTML 40 lthttpwwww3orgTRREC-html40gt

1

3 REQUIREMENTS

1 XHTML is a structured format that is defined as an application of XML 10 6 making itespecially appropriate for sending over JabberXMPP which is at root a technology forstreaming XML (see XMPP Core 7)

2 XHTML is an open standard developed by the World Wide Web Consortium (W3C) 8 arecognized standards development organization

Therefore this document defines support for lightweight text markup in the form of an XMPPextension that encapsulates content defined by an XHTML 10 Integration Set that we labelrdquoXHTML-IMrdquo The remainder of this document discusses lightweight text markup in terms ofXHTML 10 only and does not further consider RTF or other technologies

3 RequirementsHTML was originally designed for authoring and presenting stuctured documents on theWorldWideWeb and was subsequently extended to handlemore advanced functionality suchas image maps and interactive forms However the requirements for publishing documents(or developing transactional websites) for presentation by dedicated XHTML user agentson traditional computers or small-screen devices are fundamentally different from therequirements for lightweight text markup of instant messages for this reason only a reducedset of XHTML features is needed for XHTML-IM In particular

1 IM clients are not XHTML clients their primary purpose is not to read pre-existingXHTML documents but to read and generate relatively large numbers of fairly smallinstant messages

2 The underlying context for XHTML content in JabberXMPP instant messaging isprovided not by a full XHTML document but by an XML stream and specifically by amessage stanza within that stream Thus the ltheadgt element and all its children areunnecessary Only the ltbodygt element and some of its children are appropriate foruse in instant messaging

3 The XHTML content that is read by onersquos IM client is normally generated on the fly byonersquos conversation partner (or to be precise by his or her IM client) Thus there is aninherent limit to the sophistication of the XHTML markup involved Even in normalXHTML documents fairly basic structural and rendering elements such as definitionlists abbreviations addresses and computer input handling (eg ltkbdgt and ltvargt)

6Extensible Markup Language (XML) 10 (Fourth Edition) lthttpwwww3orgTRREC-xmlgt7RFC 6120 ExtensibleMessaging and Presence Protocol (XMPP) Core lthttptoolsietforghtmlrfc6120gt8The World Wide Web Consortium defines data formats and markup languages (such as HTML and XML) for useover the Internet For further information see lthttpwwww3orggt

2

4 CONCEPTS AND APPROACH

are relatively rare There is little or no foreseeable need for such elements within thecontext of instant messaging

4 The foregoing is doubly true of more advanced markup such as tables frames andforms (however there exists an XMPP extension that provides an instant messagingequivalent of the latter as defined in Data Forms (XEP-0004) 9)

5 Although ad-hoc styles are useful for messaging (by means of the rsquostylersquo attribute) fullsupport for Cascading Style Sheets 10 (defined by the ltstylegt element or a standalonecss file and implemented via the rsquoclassrsquo attribute) would be overkill since many CSS1properties (eg box classification and text properties) were developed especially forsophisticated page layout

6 Background images audio animated text layers applets scripts and other multimediacontent types are unnecessary especially given the existence of XMPP extensions suchas SI File Transfer (XEP-0096) 11 Jingle (XEP-0166) 12 and Jingle File Transfer (XEP-0234)13

7 Content transformations such as those defined by XSL Transformations 14 must notbe necessary in order for an instant messaging application to present lightweight textmarkup to an end user

As explained below some of these requirements are addressed by the definition of theXHTML-IM Integration Set itself while others are addressed by a recommended rdquoprofilerdquo forthat Integration Set in the context of instant messaging applications

4 Concepts and ApproachThis document defines an adaptation of XHTML 10 (specifically an XHTML 10 IntegrationSet) that makes it possible to provide lightweight text markup of instant messages (mainly forJabberXMPP instant messages although the Integration Set defined herein could be used byother protocols) This pattern is familiar from email wherein the HTML-formatted version of

9XEP-0004 Data Forms lthttpsxmpporgextensionsxep-0004htmlgt10Cascading Style Sheets level 1 lthttpwwww3orgTRREC-CSS1gt11XEP-0096 SI File Transfer lthttpsxmpporgextensionsxep-0096htmlgt12XEP-0166 Jingle lthttpsxmpporgextensionsxep-0166htmlgt13XEP-0234 Jingle File Transfer lthttpsxmpporgextensionsxep-0234htmlgt14XSL Transformations lthttpwwww3orgTRxsltgt

3

5 WRAPPER ELEMENT

the message supplements but does not supersede the text-only version of the message 15

In JabberXMPP communications the meaning (as opposed to markup) of the message MUSTalways be represented as best as possible in the normal ltbodygt child element or elementsof the ltmessagegt stanza qualified by the rsquojabberclientrsquo (or rsquojabberserverrsquo) namespaceLightweight text markup is then provided within an lthtmlgt element qualified by thersquohttpjabberorgprotocolxhtml-imrsquo namespace 16 However this lthtmlgt element is usedsolely as a rdquowrapperrdquo for the XHTML content itself which content is encapsulated via one ormore ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespace alongwith appropriate child elements thereofThe following example illustrates this approach

Listing 1 A simple exampleltmessage gt

ltbodygthiltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -weightbold rsquogthiltpgt

ltbodygtlthtmlgt

ltmessage gt

Technically speaking there are three aspects to the approach taken herein

1 Definition of the lthtmlgt rdquowrapperrdquo element which functions as an XMPP extensionwithin XMPP ltmessagegt stanzas

2 Definition of the XHTML-IM Integration Set itself in terms of supported XHTML 10mod-ules using the concepts defined in Modularization of XHTML 17

3 A recommended rdquoprofilerdquo regarding the specific XHTML 10 elements and attributes tobe supported from each XHTML 10 module

These three aspects are defined in the three document sections that follow

5 Wrapper ElementThe root element for including XHTML content within XMPP stanzas is lthtmlgt Thiselement is qualified by the rsquohttpjabberorgprotocolxhtml-imrsquo namespace From the

15The XHTML is merely an alternative version of the message body or bodies and the semantic meaning is to bederived from the textual message body or bodies rather than the XHTML version

16It might have been better to use an element name other than lthtmlgt for the wrapper element however chang-ing it would not be backwards-compatible with the older protocol and existing implementations

17Modularization of XHTML lthttpwwww3orgTR2004WD-xhtml-modularization-20040218gt

4

6 XHTML-IM INTEGRATION SET

perspective of XMPP the wrapper element functions as an XMPP extension element fromthe perspective of XHTML it functions as a wrapper for XHTML 10 content qualified by thersquohttpwwww3org1999xhtmlrsquo namespace Such XHTML content MUST be contained inone or more ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespaceand MUST conform to the XHTML-IM Integration Set defined in the following section If morethan one ltbodygt element is included in the lthtmlgt wrapper element each ltbodygt elementMUST possess an rsquoxmllangrsquo attribute with a distinct value where the value of that attributeMUST adhere to the rules defined in RFC 5646 18 A formal definition of the lthtmlgt elementis provided in the XHTML-IM Wrapper SchemaThe XHTML ltbodygt element is not to be confused with the XMPP ltbodygt element which is achild of a ltmessagegt stanza and is qualified by the rsquojabberclientrsquo or rsquojabberserverrsquo namespaceas described in XMPP IM 19 The lthtmlgt wrapper element is intended for inclusion only as adirect child element of the XMPP ltmessagegt stanza and only in order to specify a marked-upversion of the message ltbodygt element or elements but MAY be included elsewhere inaccordance with the rdquoextended namespacerdquo rules defined in the XMPP IM specificationUntil and unless (1) additional integration sets are defined and (2) mechanisms are specifiedfor discovering or negotiating which integration sets are supported the XHTML markupcontained within the lthtmlgt wrapper element

1 MUST NOT include elements and attributes that are not part of the XHTML-IM Integra-tion Set defined in Section 6 of this document any such elements and attributes MUSTbe ignored if received

2 SHOULD NOT include elements and attributes that are not part of the recommendedprofile of the XHTML-IM Integration Set defined in Section 7 of this document any suchelements and attributes SHOULD be ignored if received

Note In the foregoing restrictions the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

6 XHTML-IM Integration SetThis section defines an XHTML 10 Integration Set for use in the context of instant messagingGiven its intended usage we label it rdquoXHTML-IMrdquoModularization of XHTML provides the ability to formally define subsets of XHTML 10 viathe concept of rdquomodularizationrdquo (which may be familiar from XHTML Basic 20) Many of thedefined modules are not necessary or useful in the context of instant messaging and in the

18RFC 5646 Tags for Identifying Languages lthttptoolsietforghtmlrfc5646gt19RFC 6121 Extensible Messaging and Presence Protocol (XMPP) Instant Messaging and Presence lthttptool

sietforghtmlrfc6121gt20XHTML Basic lthttpwwww3orgTRxhtml-basicgt

5

6 XHTML-IM INTEGRATION SET

context of JabberXMPP instant messaging specifically some modules have been supersededby well-defined XMPP extensions This document specifies that XHTML-IM shall be based onthe following XHTML 10 modules

bull Core Modulesndash Structure Modulendash Text Modulendash Hypertext Modulendash List Module

bull Image Module

bull Style Attribute Module

Modularization of XHTML defines many additional modules such as Table Modules FormModules ObjectModules and FrameModules None of thesemodules is part of the XHTML-IMIntegration Set If support for such modules is desired it MUST be defined in a separate anddistinct integration set

61 Structure Module DefinitionThe Structure Module is defined as including the following elements and attributes 21

Element Attributesltbodygt class id title styleltheadgt profilelthtmlgt versionlttitlegt

62 Text Module DefinitionThe Text Module is defined as including the following elements and attributes

Element Attributesltabbrgt class id title style

21The rsquostylersquo attribute is specified herein where appropriate because the Style Attribute Module is included inthe definition of the XHTML-IM Integration Set whereas the event-related attributes (eg rsquoonclickrsquo) are notspecified because the Implicit Events Module is not included

6

6 XHTML-IM INTEGRATION SET

Element Attributesltacronymgt class id title styleltaddressgt class id title styleltblockquotegt class id title style citeltbrgt class id title styleltcitegt class id title styleltcodegt class id title styleltdfngt class id title styleltdivgt class id title styleltemgt class id title stylelth1gt class id title stylelth2gt class id title stylelth3gt class id title stylelth4gt class id title stylelth5gt class id title stylelth6gt class id title styleltkbdgt class id title styleltpgt class id title styleltpregt class id title styleltqgt class id title style citeltsampgt class id title styleltspangt class id title styleltstronggt class id title styleltvargt class id title style

63 Hypertext Module DefinitionThe Hypertext Module is defined as including the ltagt element only

Element Attributesltagt class id title style accesskey charset href hreflang rel rev tabindex type

64 List Module DefinitionThe List Module is defined as including the following elements and attributes

Element Attributesltdlgt class id title style

7

7 RECOMMENDED PROFILE

Element Attributesltdtgt class id title styleltddgt class id title styleltolgt class id title styleltulgt class id title styleltligt class id title style

65 Image Module DefinitionThe Image Module is defined as including the ltimggt element only

Element Attributesltimggt class id title style alt height longdesc src width

66 Style Attribute Module DefinitionThe Style Attribute Module is defined as including the style attribute only as included in thepreceding definition tables

7 Recommended ProfileEven within the restricted set of modules specified as defining the XHTML-IM Integration Set(see preceding section) some elements and attributes are inappropriate or unnecessary forthe purpose of instant messaging although such elements and attributes MAY be included inaccordance with the XHTML-IM Integration Set further recommended restrictions regardingwhich elements and attributes to include in XHTML content are specified below

71 Structure Module ProfileThe intent of the protocol defined herein is to support lightweight text markup of XMPPmessage bodies only Therefore the ltheadgt lthtmlgt and lttitlegt elements SHOULD NOTbe generated by a compliant implementation and SHOULD be ignored if received (wherethe meaning of rdquoignorerdquo is defined by the conformance requirements of Modularization ofXHTML as summarized in the User Agent Conformance section of this document) Howeverthe ltbodygt element is REQUIRED since it is the root element for all XHTML-IM content

8

7 RECOMMENDED PROFILE

72 Text Module ProfileNot all of the Text Module elements are appropriate in the context of instant messagingsince the XHTML content that one views is generated by onersquos conversation partner in whatis often a rapid-fire conversation thread Only the following elements are RECOMMENDED inXHTML-IM

bull ltblockquotegt

bull ltbrgt

bull ltcitegt

bull ltemgt

bull ltpgt

bull ltspangt

bull ltstronggt

The other Text Module elements MAY be generated by a compliant implementation butMAY be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

73 Hypertext Module ProfileThe only recommended attributes of the ltagt element are specified in the RecommendedAttributes section of this document

74 List Module ProfileBecause it is unlikely that an instant messaging user would generate a definition list onlyordered and unordered lists are RECOMMENDED Definition lists SHOULD NOT be generatedby a compliant implementation and SHOULD be ignored if received (where the meaningof rdquoignorerdquo is defined by the conformance requirements of Modularization of XHTML assummarized in the User Agent Conformance section of this document)

75 Image Module ProfileThe only recommended attributes of the ltimggt element are specified in the RecommendedAttributes section of this document

9

7 RECOMMENDED PROFILE

The XHTML specification allows a rdquodatardquo URL RFC 2397 22 as the value of the rsquosrcrsquo attributeThis is NOT RECOMMENDED for use in XHTML-IM because it can significantly increase thesize of the message stanza and XMPP is not optimized for large stanzas If the image datais small (less than 8 kilobytes) clients MAY use Bits of Binary (XEP-0231) 23 in coordinationwith XHTML-IM if the image data is large the value of the rsquosrcrsquo SHOULD be a pointer to anexternally available file for the image (or the sender SHOULD use a dedicated file transfermethod such as In-Band Bytestreams (XEP-0047) 24 or SOCKS5 Bytestreams (XEP-0065) 25)For security reasons or because of display constraints a compliant client MAY choose todisplay rsquoaltrsquo text only not the image itself (for details see the Malicious Objects section of thisdocument)

76 Style Attribute Module ProfileThis module MUST be supported in XHTML-IM if possible although clients written for certainplatforms (eg console clients mobile phones and handheld computers) or for certain classesof users (eg text-to-speech clients) might not be able to support all of the recommendedstyles directly they SHOULD attempt to emulate or translate the defined style propertiesinto text or other presentation styles that are appropriate for the platform or user base inquestionA full list of recommended style properties is provided below

761 Recommended Style Properties

CSS1 defines 42 rdquoatomicrdquo style properties (which are categorized into font color andbackground text box and classification properties) as well as 11 rdquoshorthandrdquo properties(rdquofontrdquo rdquobackgroundrdquo rdquomarginrdquo rdquopaddingrdquo rdquoborder-widthrdquo rdquoborder-toprdquo rdquoborder-rightrdquordquoborder-bottomrdquo rdquoborder-leftrdquo rdquoborderrdquo and rdquolist-stylerdquo) Many of these properties are notappropriate for use in text-based instant messaging for one or more of the following reasons

1 The property applies to or depends on the inclusion of images other than those han-dled by the XHTML ImageModule (eg the rdquobackground-imagerdquo rdquobackground-repeatrdquordquobackground-attachmentrdquo rdquobackground-positionrdquo and rdquolist-style-imagerdquo properties)

2 The property is intended for advanced document layout (eg the rdquoline-heightrdquo propertyand most of the box properties with the exception of rdquomargin-leftrdquo which is useful forindenting text and rdquomargin-rightrdquo which can be useful when dealing with images)

3 The property is unnecessary since it can be emulated via user input or recommendedXHTML stuctural elements (eg the rdquotext-transformrdquo property can be emulated by the

22RFC 2397 The data URL scheme lthttptoolsietforghtmlrfc2397gt23XEP-0231 Bits of Binary lthttpsxmpporgextensionsxep-0231htmlgt24XEP-0047 In-Band Bytestreams lthttpsxmpporgextensionsxep-0047htmlgt25XEP-0065 SOCKS5 Bytestreams lthttpsxmpporgextensionsxep-0065htmlgt

10

7 RECOMMENDED PROFILE

userrsquos keystrokes or use of the caps lock key)

4 The property is otherwise unlikely to ever be used in the context of rapid-fire con-versations (eg the rdquofont-variantrdquo rdquoword-spacingrdquo rdquoletter-spacingrdquo and rdquolist-style-positionrdquo properties)

5 The property is a shorthand property but some of the properties it includes are not ap-propriate for instant messaging applications according to the foregoing considerations(in fact this applies to all of the shorthand properties)

Unfortunately CSS1 does not include mechanisms for defining profiles thereof (as doesXHTML 10 in the form of XHTML Modularization) While there exist reduced sets of CSS2these introduce more complexity than is desirable in the context of XHTML-IM Therefore wesimply provide a list of recommended CSS1 style propertiesXHTML-IM stipulates that only the following style properties are RECOMMENDED

bull background-color

bull color

bull font-family

bull font-size

bull font-style

bull font-weight

bull margin-left

bull margin-right

bull text-align

bull text-decoration

Although a compliant implementationMAY generate or process other style properties definedin CSS1 such behavior is NOT RECOMMENDED by this document

77 Recommended Attributes771 Common Attributes

Section 51 of Modularization of XHTML describes several rdquocommonrdquo attribute collections ardquoCorerdquo collection (rsquoclassrsquo rsquoidrsquo rsquotitlersquo) an rdquoI18Nrdquo collection (rsquoxmllangrsquo not shown below sinceit is implied in XML) an rdquoEventsrdquo collection (not included in the XHTML-IM Integration Setbecause the Intrinsic Events Module is not selected) and a rdquoStylerdquo collection (rsquostylersquo) The

11

7 RECOMMENDED PROFILE

following table summarizes the recommended profile of these common attributes within theXHTML 10 content itself

Attribute Usage Reasonclass NOT RECOMMENDED External stylesheets (which rsquoclassrsquo

would typically reference) are notrecommended

id NOT RECOMMENDED Internal links and message fragmentsare not recommended in IM contentnor are external stylesheets (whichalso make use of the rsquoidrsquo attribute)

title NOT RECOMMENDED Granting of titles to elements in IMcontent seems unnecessary

style REQUIRED The rsquostylersquo attribute is required since itis the vehicle for presentational styles

xmllang NOT RECOMMENDED Differentiation of language identifica-tion SHOULD occur at the level of theltbodygt element only

772 Specialized Attributes

Beyond the rdquocommonrdquo attributes certain elements within the modules selected for theXHTML-IM Integration Set are allowed to possess other attributes such as eight attributes forthe ltagt element and five attributes for the ltimggt element The recommended profile forsuch attributes is provided in the following table

Element Scope Attribute Usageltagt accesskey NOT RECOMMENDEDltagt charset NOT RECOMMENDEDltagt href REQUIREDltagt hreflang NOT RECOMMENDEDltagt rel NOT RECOMMENDEDltagt rev NOT RECOMMENDEDltagt tabindex NOT RECOMMENDEDltagt type RECOMMENDEDltcitegt uri NOT RECOMMENDEDltimggt alt REQUIREDltimggt height RECOMMENDEDltimggt longdesc NOT RECOMMENDEDltimggt src REQUIREDltimggt width RECOMMENDED

12

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 4: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

12 W3C Considerations 22121 Document Type Name 22122 User Agent Conformance 23123 XHTML Modularization 24124 W3C Review 24125 XHTML Versioning 24

13 IANA Considerations 25

14 XMPP Registrar Considerations 25141 Protocol Namespaces 25

15 XML Schemas 25151 XHTML-IM Wrapper 25152 XHTML-IM Schema Driver 26153 XHTML-IM Content Model 28

16 Acknowledgements 32

2 CHOICE OF TECHNOLOGY

1 IntroductionThis document defines methods for exchanging instant messages that contain lightweighttext markup In the context of this document rdquolightweight text markuprdquo is to be understoodas a combination of minimal structural elements and presentational styles that can easilybe rendered on a wide variety of devices without requiring a full rich-text rendering enginesuch as a web browser Examples of lightweight text markup include basic text blocks(eg paragraphs and blockquotes) structural elements (eg emphasis and strength) listshyperlinks image references and font styles (eg sizes and colors)Note This specification essentially defines a recommended set of XHTML elements andattributes for use in instant messaging by applying a series of rdquofiltersrdquo to the wealth of XHTMLfeatures however developers of JabberXMPP clients mainly need to pay attention to theSummary of Recommendations rather than the complexities of how the recommendationsare derived

2 Choice of TechnologyIn the past there have existed several incompatible methods within the Jabber communityfor exchanging instant messages that contain lightweight text markup The most notablesuch methods have included derivatives of XHTML 10 1 as well as of Rich Text Format (RTF) 2Although it is sometimes easier for client developers to implement RTF support (this isespecially true on certain Microsoft Windows operating systems) there are several reasons(consistent with the XMPP Design Guidelines (XEP-0134) 3) for the XMPP Standards Foun-dation (XSF) 4 to avoid the use of RTF in developing a protocol for lightweight text markupSpecifically

1 RTF is not a structured vocabulary derived from SGML (as is HTML 40 5) or more rele-vantly from XML (as is XHTML 10)

2 RTF is under the control of the Microsoft Corporation and thus is not an open standardmaintained by a recognized standards development organization therefore the XSF isunable to contribute to or influence its development if necessary and any protocol theXSF developed using RTF would introduce unwanted dependencies

Conversely there are several reasons to prefer XHTML for lightweight text markup

1XHTML 10 lthttpwwww3orgTRxhtml1gt2Rich Text Format (RTF) Version 15 Specification lthttpmsdnmicrosoftcomlibraryen-usdnrtfspechtmlrtfspecaspgt

3XEP-0134 XMPP Design Guidelines lthttpsxmpporgextensionsxep-0134htmlgt4The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

5HTML 40 lthttpwwww3orgTRREC-html40gt

1

3 REQUIREMENTS

1 XHTML is a structured format that is defined as an application of XML 10 6 making itespecially appropriate for sending over JabberXMPP which is at root a technology forstreaming XML (see XMPP Core 7)

2 XHTML is an open standard developed by the World Wide Web Consortium (W3C) 8 arecognized standards development organization

Therefore this document defines support for lightweight text markup in the form of an XMPPextension that encapsulates content defined by an XHTML 10 Integration Set that we labelrdquoXHTML-IMrdquo The remainder of this document discusses lightweight text markup in terms ofXHTML 10 only and does not further consider RTF or other technologies

3 RequirementsHTML was originally designed for authoring and presenting stuctured documents on theWorldWideWeb and was subsequently extended to handlemore advanced functionality suchas image maps and interactive forms However the requirements for publishing documents(or developing transactional websites) for presentation by dedicated XHTML user agentson traditional computers or small-screen devices are fundamentally different from therequirements for lightweight text markup of instant messages for this reason only a reducedset of XHTML features is needed for XHTML-IM In particular

1 IM clients are not XHTML clients their primary purpose is not to read pre-existingXHTML documents but to read and generate relatively large numbers of fairly smallinstant messages

2 The underlying context for XHTML content in JabberXMPP instant messaging isprovided not by a full XHTML document but by an XML stream and specifically by amessage stanza within that stream Thus the ltheadgt element and all its children areunnecessary Only the ltbodygt element and some of its children are appropriate foruse in instant messaging

3 The XHTML content that is read by onersquos IM client is normally generated on the fly byonersquos conversation partner (or to be precise by his or her IM client) Thus there is aninherent limit to the sophistication of the XHTML markup involved Even in normalXHTML documents fairly basic structural and rendering elements such as definitionlists abbreviations addresses and computer input handling (eg ltkbdgt and ltvargt)

6Extensible Markup Language (XML) 10 (Fourth Edition) lthttpwwww3orgTRREC-xmlgt7RFC 6120 ExtensibleMessaging and Presence Protocol (XMPP) Core lthttptoolsietforghtmlrfc6120gt8The World Wide Web Consortium defines data formats and markup languages (such as HTML and XML) for useover the Internet For further information see lthttpwwww3orggt

2

4 CONCEPTS AND APPROACH

are relatively rare There is little or no foreseeable need for such elements within thecontext of instant messaging

4 The foregoing is doubly true of more advanced markup such as tables frames andforms (however there exists an XMPP extension that provides an instant messagingequivalent of the latter as defined in Data Forms (XEP-0004) 9)

5 Although ad-hoc styles are useful for messaging (by means of the rsquostylersquo attribute) fullsupport for Cascading Style Sheets 10 (defined by the ltstylegt element or a standalonecss file and implemented via the rsquoclassrsquo attribute) would be overkill since many CSS1properties (eg box classification and text properties) were developed especially forsophisticated page layout

6 Background images audio animated text layers applets scripts and other multimediacontent types are unnecessary especially given the existence of XMPP extensions suchas SI File Transfer (XEP-0096) 11 Jingle (XEP-0166) 12 and Jingle File Transfer (XEP-0234)13

7 Content transformations such as those defined by XSL Transformations 14 must notbe necessary in order for an instant messaging application to present lightweight textmarkup to an end user

As explained below some of these requirements are addressed by the definition of theXHTML-IM Integration Set itself while others are addressed by a recommended rdquoprofilerdquo forthat Integration Set in the context of instant messaging applications

4 Concepts and ApproachThis document defines an adaptation of XHTML 10 (specifically an XHTML 10 IntegrationSet) that makes it possible to provide lightweight text markup of instant messages (mainly forJabberXMPP instant messages although the Integration Set defined herein could be used byother protocols) This pattern is familiar from email wherein the HTML-formatted version of

9XEP-0004 Data Forms lthttpsxmpporgextensionsxep-0004htmlgt10Cascading Style Sheets level 1 lthttpwwww3orgTRREC-CSS1gt11XEP-0096 SI File Transfer lthttpsxmpporgextensionsxep-0096htmlgt12XEP-0166 Jingle lthttpsxmpporgextensionsxep-0166htmlgt13XEP-0234 Jingle File Transfer lthttpsxmpporgextensionsxep-0234htmlgt14XSL Transformations lthttpwwww3orgTRxsltgt

3

5 WRAPPER ELEMENT

the message supplements but does not supersede the text-only version of the message 15

In JabberXMPP communications the meaning (as opposed to markup) of the message MUSTalways be represented as best as possible in the normal ltbodygt child element or elementsof the ltmessagegt stanza qualified by the rsquojabberclientrsquo (or rsquojabberserverrsquo) namespaceLightweight text markup is then provided within an lthtmlgt element qualified by thersquohttpjabberorgprotocolxhtml-imrsquo namespace 16 However this lthtmlgt element is usedsolely as a rdquowrapperrdquo for the XHTML content itself which content is encapsulated via one ormore ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespace alongwith appropriate child elements thereofThe following example illustrates this approach

Listing 1 A simple exampleltmessage gt

ltbodygthiltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -weightbold rsquogthiltpgt

ltbodygtlthtmlgt

ltmessage gt

Technically speaking there are three aspects to the approach taken herein

1 Definition of the lthtmlgt rdquowrapperrdquo element which functions as an XMPP extensionwithin XMPP ltmessagegt stanzas

2 Definition of the XHTML-IM Integration Set itself in terms of supported XHTML 10mod-ules using the concepts defined in Modularization of XHTML 17

3 A recommended rdquoprofilerdquo regarding the specific XHTML 10 elements and attributes tobe supported from each XHTML 10 module

These three aspects are defined in the three document sections that follow

5 Wrapper ElementThe root element for including XHTML content within XMPP stanzas is lthtmlgt Thiselement is qualified by the rsquohttpjabberorgprotocolxhtml-imrsquo namespace From the

15The XHTML is merely an alternative version of the message body or bodies and the semantic meaning is to bederived from the textual message body or bodies rather than the XHTML version

16It might have been better to use an element name other than lthtmlgt for the wrapper element however chang-ing it would not be backwards-compatible with the older protocol and existing implementations

17Modularization of XHTML lthttpwwww3orgTR2004WD-xhtml-modularization-20040218gt

4

6 XHTML-IM INTEGRATION SET

perspective of XMPP the wrapper element functions as an XMPP extension element fromthe perspective of XHTML it functions as a wrapper for XHTML 10 content qualified by thersquohttpwwww3org1999xhtmlrsquo namespace Such XHTML content MUST be contained inone or more ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespaceand MUST conform to the XHTML-IM Integration Set defined in the following section If morethan one ltbodygt element is included in the lthtmlgt wrapper element each ltbodygt elementMUST possess an rsquoxmllangrsquo attribute with a distinct value where the value of that attributeMUST adhere to the rules defined in RFC 5646 18 A formal definition of the lthtmlgt elementis provided in the XHTML-IM Wrapper SchemaThe XHTML ltbodygt element is not to be confused with the XMPP ltbodygt element which is achild of a ltmessagegt stanza and is qualified by the rsquojabberclientrsquo or rsquojabberserverrsquo namespaceas described in XMPP IM 19 The lthtmlgt wrapper element is intended for inclusion only as adirect child element of the XMPP ltmessagegt stanza and only in order to specify a marked-upversion of the message ltbodygt element or elements but MAY be included elsewhere inaccordance with the rdquoextended namespacerdquo rules defined in the XMPP IM specificationUntil and unless (1) additional integration sets are defined and (2) mechanisms are specifiedfor discovering or negotiating which integration sets are supported the XHTML markupcontained within the lthtmlgt wrapper element

1 MUST NOT include elements and attributes that are not part of the XHTML-IM Integra-tion Set defined in Section 6 of this document any such elements and attributes MUSTbe ignored if received

2 SHOULD NOT include elements and attributes that are not part of the recommendedprofile of the XHTML-IM Integration Set defined in Section 7 of this document any suchelements and attributes SHOULD be ignored if received

Note In the foregoing restrictions the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

6 XHTML-IM Integration SetThis section defines an XHTML 10 Integration Set for use in the context of instant messagingGiven its intended usage we label it rdquoXHTML-IMrdquoModularization of XHTML provides the ability to formally define subsets of XHTML 10 viathe concept of rdquomodularizationrdquo (which may be familiar from XHTML Basic 20) Many of thedefined modules are not necessary or useful in the context of instant messaging and in the

18RFC 5646 Tags for Identifying Languages lthttptoolsietforghtmlrfc5646gt19RFC 6121 Extensible Messaging and Presence Protocol (XMPP) Instant Messaging and Presence lthttptool

sietforghtmlrfc6121gt20XHTML Basic lthttpwwww3orgTRxhtml-basicgt

5

6 XHTML-IM INTEGRATION SET

context of JabberXMPP instant messaging specifically some modules have been supersededby well-defined XMPP extensions This document specifies that XHTML-IM shall be based onthe following XHTML 10 modules

bull Core Modulesndash Structure Modulendash Text Modulendash Hypertext Modulendash List Module

bull Image Module

bull Style Attribute Module

Modularization of XHTML defines many additional modules such as Table Modules FormModules ObjectModules and FrameModules None of thesemodules is part of the XHTML-IMIntegration Set If support for such modules is desired it MUST be defined in a separate anddistinct integration set

61 Structure Module DefinitionThe Structure Module is defined as including the following elements and attributes 21

Element Attributesltbodygt class id title styleltheadgt profilelthtmlgt versionlttitlegt

62 Text Module DefinitionThe Text Module is defined as including the following elements and attributes

Element Attributesltabbrgt class id title style

21The rsquostylersquo attribute is specified herein where appropriate because the Style Attribute Module is included inthe definition of the XHTML-IM Integration Set whereas the event-related attributes (eg rsquoonclickrsquo) are notspecified because the Implicit Events Module is not included

6

6 XHTML-IM INTEGRATION SET

Element Attributesltacronymgt class id title styleltaddressgt class id title styleltblockquotegt class id title style citeltbrgt class id title styleltcitegt class id title styleltcodegt class id title styleltdfngt class id title styleltdivgt class id title styleltemgt class id title stylelth1gt class id title stylelth2gt class id title stylelth3gt class id title stylelth4gt class id title stylelth5gt class id title stylelth6gt class id title styleltkbdgt class id title styleltpgt class id title styleltpregt class id title styleltqgt class id title style citeltsampgt class id title styleltspangt class id title styleltstronggt class id title styleltvargt class id title style

63 Hypertext Module DefinitionThe Hypertext Module is defined as including the ltagt element only

Element Attributesltagt class id title style accesskey charset href hreflang rel rev tabindex type

64 List Module DefinitionThe List Module is defined as including the following elements and attributes

Element Attributesltdlgt class id title style

7

7 RECOMMENDED PROFILE

Element Attributesltdtgt class id title styleltddgt class id title styleltolgt class id title styleltulgt class id title styleltligt class id title style

65 Image Module DefinitionThe Image Module is defined as including the ltimggt element only

Element Attributesltimggt class id title style alt height longdesc src width

66 Style Attribute Module DefinitionThe Style Attribute Module is defined as including the style attribute only as included in thepreceding definition tables

7 Recommended ProfileEven within the restricted set of modules specified as defining the XHTML-IM Integration Set(see preceding section) some elements and attributes are inappropriate or unnecessary forthe purpose of instant messaging although such elements and attributes MAY be included inaccordance with the XHTML-IM Integration Set further recommended restrictions regardingwhich elements and attributes to include in XHTML content are specified below

71 Structure Module ProfileThe intent of the protocol defined herein is to support lightweight text markup of XMPPmessage bodies only Therefore the ltheadgt lthtmlgt and lttitlegt elements SHOULD NOTbe generated by a compliant implementation and SHOULD be ignored if received (wherethe meaning of rdquoignorerdquo is defined by the conformance requirements of Modularization ofXHTML as summarized in the User Agent Conformance section of this document) Howeverthe ltbodygt element is REQUIRED since it is the root element for all XHTML-IM content

8

7 RECOMMENDED PROFILE

72 Text Module ProfileNot all of the Text Module elements are appropriate in the context of instant messagingsince the XHTML content that one views is generated by onersquos conversation partner in whatis often a rapid-fire conversation thread Only the following elements are RECOMMENDED inXHTML-IM

bull ltblockquotegt

bull ltbrgt

bull ltcitegt

bull ltemgt

bull ltpgt

bull ltspangt

bull ltstronggt

The other Text Module elements MAY be generated by a compliant implementation butMAY be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

73 Hypertext Module ProfileThe only recommended attributes of the ltagt element are specified in the RecommendedAttributes section of this document

74 List Module ProfileBecause it is unlikely that an instant messaging user would generate a definition list onlyordered and unordered lists are RECOMMENDED Definition lists SHOULD NOT be generatedby a compliant implementation and SHOULD be ignored if received (where the meaningof rdquoignorerdquo is defined by the conformance requirements of Modularization of XHTML assummarized in the User Agent Conformance section of this document)

75 Image Module ProfileThe only recommended attributes of the ltimggt element are specified in the RecommendedAttributes section of this document

9

7 RECOMMENDED PROFILE

The XHTML specification allows a rdquodatardquo URL RFC 2397 22 as the value of the rsquosrcrsquo attributeThis is NOT RECOMMENDED for use in XHTML-IM because it can significantly increase thesize of the message stanza and XMPP is not optimized for large stanzas If the image datais small (less than 8 kilobytes) clients MAY use Bits of Binary (XEP-0231) 23 in coordinationwith XHTML-IM if the image data is large the value of the rsquosrcrsquo SHOULD be a pointer to anexternally available file for the image (or the sender SHOULD use a dedicated file transfermethod such as In-Band Bytestreams (XEP-0047) 24 or SOCKS5 Bytestreams (XEP-0065) 25)For security reasons or because of display constraints a compliant client MAY choose todisplay rsquoaltrsquo text only not the image itself (for details see the Malicious Objects section of thisdocument)

76 Style Attribute Module ProfileThis module MUST be supported in XHTML-IM if possible although clients written for certainplatforms (eg console clients mobile phones and handheld computers) or for certain classesof users (eg text-to-speech clients) might not be able to support all of the recommendedstyles directly they SHOULD attempt to emulate or translate the defined style propertiesinto text or other presentation styles that are appropriate for the platform or user base inquestionA full list of recommended style properties is provided below

761 Recommended Style Properties

CSS1 defines 42 rdquoatomicrdquo style properties (which are categorized into font color andbackground text box and classification properties) as well as 11 rdquoshorthandrdquo properties(rdquofontrdquo rdquobackgroundrdquo rdquomarginrdquo rdquopaddingrdquo rdquoborder-widthrdquo rdquoborder-toprdquo rdquoborder-rightrdquordquoborder-bottomrdquo rdquoborder-leftrdquo rdquoborderrdquo and rdquolist-stylerdquo) Many of these properties are notappropriate for use in text-based instant messaging for one or more of the following reasons

1 The property applies to or depends on the inclusion of images other than those han-dled by the XHTML ImageModule (eg the rdquobackground-imagerdquo rdquobackground-repeatrdquordquobackground-attachmentrdquo rdquobackground-positionrdquo and rdquolist-style-imagerdquo properties)

2 The property is intended for advanced document layout (eg the rdquoline-heightrdquo propertyand most of the box properties with the exception of rdquomargin-leftrdquo which is useful forindenting text and rdquomargin-rightrdquo which can be useful when dealing with images)

3 The property is unnecessary since it can be emulated via user input or recommendedXHTML stuctural elements (eg the rdquotext-transformrdquo property can be emulated by the

22RFC 2397 The data URL scheme lthttptoolsietforghtmlrfc2397gt23XEP-0231 Bits of Binary lthttpsxmpporgextensionsxep-0231htmlgt24XEP-0047 In-Band Bytestreams lthttpsxmpporgextensionsxep-0047htmlgt25XEP-0065 SOCKS5 Bytestreams lthttpsxmpporgextensionsxep-0065htmlgt

10

7 RECOMMENDED PROFILE

userrsquos keystrokes or use of the caps lock key)

4 The property is otherwise unlikely to ever be used in the context of rapid-fire con-versations (eg the rdquofont-variantrdquo rdquoword-spacingrdquo rdquoletter-spacingrdquo and rdquolist-style-positionrdquo properties)

5 The property is a shorthand property but some of the properties it includes are not ap-propriate for instant messaging applications according to the foregoing considerations(in fact this applies to all of the shorthand properties)

Unfortunately CSS1 does not include mechanisms for defining profiles thereof (as doesXHTML 10 in the form of XHTML Modularization) While there exist reduced sets of CSS2these introduce more complexity than is desirable in the context of XHTML-IM Therefore wesimply provide a list of recommended CSS1 style propertiesXHTML-IM stipulates that only the following style properties are RECOMMENDED

bull background-color

bull color

bull font-family

bull font-size

bull font-style

bull font-weight

bull margin-left

bull margin-right

bull text-align

bull text-decoration

Although a compliant implementationMAY generate or process other style properties definedin CSS1 such behavior is NOT RECOMMENDED by this document

77 Recommended Attributes771 Common Attributes

Section 51 of Modularization of XHTML describes several rdquocommonrdquo attribute collections ardquoCorerdquo collection (rsquoclassrsquo rsquoidrsquo rsquotitlersquo) an rdquoI18Nrdquo collection (rsquoxmllangrsquo not shown below sinceit is implied in XML) an rdquoEventsrdquo collection (not included in the XHTML-IM Integration Setbecause the Intrinsic Events Module is not selected) and a rdquoStylerdquo collection (rsquostylersquo) The

11

7 RECOMMENDED PROFILE

following table summarizes the recommended profile of these common attributes within theXHTML 10 content itself

Attribute Usage Reasonclass NOT RECOMMENDED External stylesheets (which rsquoclassrsquo

would typically reference) are notrecommended

id NOT RECOMMENDED Internal links and message fragmentsare not recommended in IM contentnor are external stylesheets (whichalso make use of the rsquoidrsquo attribute)

title NOT RECOMMENDED Granting of titles to elements in IMcontent seems unnecessary

style REQUIRED The rsquostylersquo attribute is required since itis the vehicle for presentational styles

xmllang NOT RECOMMENDED Differentiation of language identifica-tion SHOULD occur at the level of theltbodygt element only

772 Specialized Attributes

Beyond the rdquocommonrdquo attributes certain elements within the modules selected for theXHTML-IM Integration Set are allowed to possess other attributes such as eight attributes forthe ltagt element and five attributes for the ltimggt element The recommended profile forsuch attributes is provided in the following table

Element Scope Attribute Usageltagt accesskey NOT RECOMMENDEDltagt charset NOT RECOMMENDEDltagt href REQUIREDltagt hreflang NOT RECOMMENDEDltagt rel NOT RECOMMENDEDltagt rev NOT RECOMMENDEDltagt tabindex NOT RECOMMENDEDltagt type RECOMMENDEDltcitegt uri NOT RECOMMENDEDltimggt alt REQUIREDltimggt height RECOMMENDEDltimggt longdesc NOT RECOMMENDEDltimggt src REQUIREDltimggt width RECOMMENDED

12

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 5: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

2 CHOICE OF TECHNOLOGY

1 IntroductionThis document defines methods for exchanging instant messages that contain lightweighttext markup In the context of this document rdquolightweight text markuprdquo is to be understoodas a combination of minimal structural elements and presentational styles that can easilybe rendered on a wide variety of devices without requiring a full rich-text rendering enginesuch as a web browser Examples of lightweight text markup include basic text blocks(eg paragraphs and blockquotes) structural elements (eg emphasis and strength) listshyperlinks image references and font styles (eg sizes and colors)Note This specification essentially defines a recommended set of XHTML elements andattributes for use in instant messaging by applying a series of rdquofiltersrdquo to the wealth of XHTMLfeatures however developers of JabberXMPP clients mainly need to pay attention to theSummary of Recommendations rather than the complexities of how the recommendationsare derived

2 Choice of TechnologyIn the past there have existed several incompatible methods within the Jabber communityfor exchanging instant messages that contain lightweight text markup The most notablesuch methods have included derivatives of XHTML 10 1 as well as of Rich Text Format (RTF) 2Although it is sometimes easier for client developers to implement RTF support (this isespecially true on certain Microsoft Windows operating systems) there are several reasons(consistent with the XMPP Design Guidelines (XEP-0134) 3) for the XMPP Standards Foun-dation (XSF) 4 to avoid the use of RTF in developing a protocol for lightweight text markupSpecifically

1 RTF is not a structured vocabulary derived from SGML (as is HTML 40 5) or more rele-vantly from XML (as is XHTML 10)

2 RTF is under the control of the Microsoft Corporation and thus is not an open standardmaintained by a recognized standards development organization therefore the XSF isunable to contribute to or influence its development if necessary and any protocol theXSF developed using RTF would introduce unwanted dependencies

Conversely there are several reasons to prefer XHTML for lightweight text markup

1XHTML 10 lthttpwwww3orgTRxhtml1gt2Rich Text Format (RTF) Version 15 Specification lthttpmsdnmicrosoftcomlibraryen-usdnrtfspechtmlrtfspecaspgt

3XEP-0134 XMPP Design Guidelines lthttpsxmpporgextensionsxep-0134htmlgt4The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

5HTML 40 lthttpwwww3orgTRREC-html40gt

1

3 REQUIREMENTS

1 XHTML is a structured format that is defined as an application of XML 10 6 making itespecially appropriate for sending over JabberXMPP which is at root a technology forstreaming XML (see XMPP Core 7)

2 XHTML is an open standard developed by the World Wide Web Consortium (W3C) 8 arecognized standards development organization

Therefore this document defines support for lightweight text markup in the form of an XMPPextension that encapsulates content defined by an XHTML 10 Integration Set that we labelrdquoXHTML-IMrdquo The remainder of this document discusses lightweight text markup in terms ofXHTML 10 only and does not further consider RTF or other technologies

3 RequirementsHTML was originally designed for authoring and presenting stuctured documents on theWorldWideWeb and was subsequently extended to handlemore advanced functionality suchas image maps and interactive forms However the requirements for publishing documents(or developing transactional websites) for presentation by dedicated XHTML user agentson traditional computers or small-screen devices are fundamentally different from therequirements for lightweight text markup of instant messages for this reason only a reducedset of XHTML features is needed for XHTML-IM In particular

1 IM clients are not XHTML clients their primary purpose is not to read pre-existingXHTML documents but to read and generate relatively large numbers of fairly smallinstant messages

2 The underlying context for XHTML content in JabberXMPP instant messaging isprovided not by a full XHTML document but by an XML stream and specifically by amessage stanza within that stream Thus the ltheadgt element and all its children areunnecessary Only the ltbodygt element and some of its children are appropriate foruse in instant messaging

3 The XHTML content that is read by onersquos IM client is normally generated on the fly byonersquos conversation partner (or to be precise by his or her IM client) Thus there is aninherent limit to the sophistication of the XHTML markup involved Even in normalXHTML documents fairly basic structural and rendering elements such as definitionlists abbreviations addresses and computer input handling (eg ltkbdgt and ltvargt)

6Extensible Markup Language (XML) 10 (Fourth Edition) lthttpwwww3orgTRREC-xmlgt7RFC 6120 ExtensibleMessaging and Presence Protocol (XMPP) Core lthttptoolsietforghtmlrfc6120gt8The World Wide Web Consortium defines data formats and markup languages (such as HTML and XML) for useover the Internet For further information see lthttpwwww3orggt

2

4 CONCEPTS AND APPROACH

are relatively rare There is little or no foreseeable need for such elements within thecontext of instant messaging

4 The foregoing is doubly true of more advanced markup such as tables frames andforms (however there exists an XMPP extension that provides an instant messagingequivalent of the latter as defined in Data Forms (XEP-0004) 9)

5 Although ad-hoc styles are useful for messaging (by means of the rsquostylersquo attribute) fullsupport for Cascading Style Sheets 10 (defined by the ltstylegt element or a standalonecss file and implemented via the rsquoclassrsquo attribute) would be overkill since many CSS1properties (eg box classification and text properties) were developed especially forsophisticated page layout

6 Background images audio animated text layers applets scripts and other multimediacontent types are unnecessary especially given the existence of XMPP extensions suchas SI File Transfer (XEP-0096) 11 Jingle (XEP-0166) 12 and Jingle File Transfer (XEP-0234)13

7 Content transformations such as those defined by XSL Transformations 14 must notbe necessary in order for an instant messaging application to present lightweight textmarkup to an end user

As explained below some of these requirements are addressed by the definition of theXHTML-IM Integration Set itself while others are addressed by a recommended rdquoprofilerdquo forthat Integration Set in the context of instant messaging applications

4 Concepts and ApproachThis document defines an adaptation of XHTML 10 (specifically an XHTML 10 IntegrationSet) that makes it possible to provide lightweight text markup of instant messages (mainly forJabberXMPP instant messages although the Integration Set defined herein could be used byother protocols) This pattern is familiar from email wherein the HTML-formatted version of

9XEP-0004 Data Forms lthttpsxmpporgextensionsxep-0004htmlgt10Cascading Style Sheets level 1 lthttpwwww3orgTRREC-CSS1gt11XEP-0096 SI File Transfer lthttpsxmpporgextensionsxep-0096htmlgt12XEP-0166 Jingle lthttpsxmpporgextensionsxep-0166htmlgt13XEP-0234 Jingle File Transfer lthttpsxmpporgextensionsxep-0234htmlgt14XSL Transformations lthttpwwww3orgTRxsltgt

3

5 WRAPPER ELEMENT

the message supplements but does not supersede the text-only version of the message 15

In JabberXMPP communications the meaning (as opposed to markup) of the message MUSTalways be represented as best as possible in the normal ltbodygt child element or elementsof the ltmessagegt stanza qualified by the rsquojabberclientrsquo (or rsquojabberserverrsquo) namespaceLightweight text markup is then provided within an lthtmlgt element qualified by thersquohttpjabberorgprotocolxhtml-imrsquo namespace 16 However this lthtmlgt element is usedsolely as a rdquowrapperrdquo for the XHTML content itself which content is encapsulated via one ormore ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespace alongwith appropriate child elements thereofThe following example illustrates this approach

Listing 1 A simple exampleltmessage gt

ltbodygthiltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -weightbold rsquogthiltpgt

ltbodygtlthtmlgt

ltmessage gt

Technically speaking there are three aspects to the approach taken herein

1 Definition of the lthtmlgt rdquowrapperrdquo element which functions as an XMPP extensionwithin XMPP ltmessagegt stanzas

2 Definition of the XHTML-IM Integration Set itself in terms of supported XHTML 10mod-ules using the concepts defined in Modularization of XHTML 17

3 A recommended rdquoprofilerdquo regarding the specific XHTML 10 elements and attributes tobe supported from each XHTML 10 module

These three aspects are defined in the three document sections that follow

5 Wrapper ElementThe root element for including XHTML content within XMPP stanzas is lthtmlgt Thiselement is qualified by the rsquohttpjabberorgprotocolxhtml-imrsquo namespace From the

15The XHTML is merely an alternative version of the message body or bodies and the semantic meaning is to bederived from the textual message body or bodies rather than the XHTML version

16It might have been better to use an element name other than lthtmlgt for the wrapper element however chang-ing it would not be backwards-compatible with the older protocol and existing implementations

17Modularization of XHTML lthttpwwww3orgTR2004WD-xhtml-modularization-20040218gt

4

6 XHTML-IM INTEGRATION SET

perspective of XMPP the wrapper element functions as an XMPP extension element fromthe perspective of XHTML it functions as a wrapper for XHTML 10 content qualified by thersquohttpwwww3org1999xhtmlrsquo namespace Such XHTML content MUST be contained inone or more ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespaceand MUST conform to the XHTML-IM Integration Set defined in the following section If morethan one ltbodygt element is included in the lthtmlgt wrapper element each ltbodygt elementMUST possess an rsquoxmllangrsquo attribute with a distinct value where the value of that attributeMUST adhere to the rules defined in RFC 5646 18 A formal definition of the lthtmlgt elementis provided in the XHTML-IM Wrapper SchemaThe XHTML ltbodygt element is not to be confused with the XMPP ltbodygt element which is achild of a ltmessagegt stanza and is qualified by the rsquojabberclientrsquo or rsquojabberserverrsquo namespaceas described in XMPP IM 19 The lthtmlgt wrapper element is intended for inclusion only as adirect child element of the XMPP ltmessagegt stanza and only in order to specify a marked-upversion of the message ltbodygt element or elements but MAY be included elsewhere inaccordance with the rdquoextended namespacerdquo rules defined in the XMPP IM specificationUntil and unless (1) additional integration sets are defined and (2) mechanisms are specifiedfor discovering or negotiating which integration sets are supported the XHTML markupcontained within the lthtmlgt wrapper element

1 MUST NOT include elements and attributes that are not part of the XHTML-IM Integra-tion Set defined in Section 6 of this document any such elements and attributes MUSTbe ignored if received

2 SHOULD NOT include elements and attributes that are not part of the recommendedprofile of the XHTML-IM Integration Set defined in Section 7 of this document any suchelements and attributes SHOULD be ignored if received

Note In the foregoing restrictions the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

6 XHTML-IM Integration SetThis section defines an XHTML 10 Integration Set for use in the context of instant messagingGiven its intended usage we label it rdquoXHTML-IMrdquoModularization of XHTML provides the ability to formally define subsets of XHTML 10 viathe concept of rdquomodularizationrdquo (which may be familiar from XHTML Basic 20) Many of thedefined modules are not necessary or useful in the context of instant messaging and in the

18RFC 5646 Tags for Identifying Languages lthttptoolsietforghtmlrfc5646gt19RFC 6121 Extensible Messaging and Presence Protocol (XMPP) Instant Messaging and Presence lthttptool

sietforghtmlrfc6121gt20XHTML Basic lthttpwwww3orgTRxhtml-basicgt

5

6 XHTML-IM INTEGRATION SET

context of JabberXMPP instant messaging specifically some modules have been supersededby well-defined XMPP extensions This document specifies that XHTML-IM shall be based onthe following XHTML 10 modules

bull Core Modulesndash Structure Modulendash Text Modulendash Hypertext Modulendash List Module

bull Image Module

bull Style Attribute Module

Modularization of XHTML defines many additional modules such as Table Modules FormModules ObjectModules and FrameModules None of thesemodules is part of the XHTML-IMIntegration Set If support for such modules is desired it MUST be defined in a separate anddistinct integration set

61 Structure Module DefinitionThe Structure Module is defined as including the following elements and attributes 21

Element Attributesltbodygt class id title styleltheadgt profilelthtmlgt versionlttitlegt

62 Text Module DefinitionThe Text Module is defined as including the following elements and attributes

Element Attributesltabbrgt class id title style

21The rsquostylersquo attribute is specified herein where appropriate because the Style Attribute Module is included inthe definition of the XHTML-IM Integration Set whereas the event-related attributes (eg rsquoonclickrsquo) are notspecified because the Implicit Events Module is not included

6

6 XHTML-IM INTEGRATION SET

Element Attributesltacronymgt class id title styleltaddressgt class id title styleltblockquotegt class id title style citeltbrgt class id title styleltcitegt class id title styleltcodegt class id title styleltdfngt class id title styleltdivgt class id title styleltemgt class id title stylelth1gt class id title stylelth2gt class id title stylelth3gt class id title stylelth4gt class id title stylelth5gt class id title stylelth6gt class id title styleltkbdgt class id title styleltpgt class id title styleltpregt class id title styleltqgt class id title style citeltsampgt class id title styleltspangt class id title styleltstronggt class id title styleltvargt class id title style

63 Hypertext Module DefinitionThe Hypertext Module is defined as including the ltagt element only

Element Attributesltagt class id title style accesskey charset href hreflang rel rev tabindex type

64 List Module DefinitionThe List Module is defined as including the following elements and attributes

Element Attributesltdlgt class id title style

7

7 RECOMMENDED PROFILE

Element Attributesltdtgt class id title styleltddgt class id title styleltolgt class id title styleltulgt class id title styleltligt class id title style

65 Image Module DefinitionThe Image Module is defined as including the ltimggt element only

Element Attributesltimggt class id title style alt height longdesc src width

66 Style Attribute Module DefinitionThe Style Attribute Module is defined as including the style attribute only as included in thepreceding definition tables

7 Recommended ProfileEven within the restricted set of modules specified as defining the XHTML-IM Integration Set(see preceding section) some elements and attributes are inappropriate or unnecessary forthe purpose of instant messaging although such elements and attributes MAY be included inaccordance with the XHTML-IM Integration Set further recommended restrictions regardingwhich elements and attributes to include in XHTML content are specified below

71 Structure Module ProfileThe intent of the protocol defined herein is to support lightweight text markup of XMPPmessage bodies only Therefore the ltheadgt lthtmlgt and lttitlegt elements SHOULD NOTbe generated by a compliant implementation and SHOULD be ignored if received (wherethe meaning of rdquoignorerdquo is defined by the conformance requirements of Modularization ofXHTML as summarized in the User Agent Conformance section of this document) Howeverthe ltbodygt element is REQUIRED since it is the root element for all XHTML-IM content

8

7 RECOMMENDED PROFILE

72 Text Module ProfileNot all of the Text Module elements are appropriate in the context of instant messagingsince the XHTML content that one views is generated by onersquos conversation partner in whatis often a rapid-fire conversation thread Only the following elements are RECOMMENDED inXHTML-IM

bull ltblockquotegt

bull ltbrgt

bull ltcitegt

bull ltemgt

bull ltpgt

bull ltspangt

bull ltstronggt

The other Text Module elements MAY be generated by a compliant implementation butMAY be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

73 Hypertext Module ProfileThe only recommended attributes of the ltagt element are specified in the RecommendedAttributes section of this document

74 List Module ProfileBecause it is unlikely that an instant messaging user would generate a definition list onlyordered and unordered lists are RECOMMENDED Definition lists SHOULD NOT be generatedby a compliant implementation and SHOULD be ignored if received (where the meaningof rdquoignorerdquo is defined by the conformance requirements of Modularization of XHTML assummarized in the User Agent Conformance section of this document)

75 Image Module ProfileThe only recommended attributes of the ltimggt element are specified in the RecommendedAttributes section of this document

9

7 RECOMMENDED PROFILE

The XHTML specification allows a rdquodatardquo URL RFC 2397 22 as the value of the rsquosrcrsquo attributeThis is NOT RECOMMENDED for use in XHTML-IM because it can significantly increase thesize of the message stanza and XMPP is not optimized for large stanzas If the image datais small (less than 8 kilobytes) clients MAY use Bits of Binary (XEP-0231) 23 in coordinationwith XHTML-IM if the image data is large the value of the rsquosrcrsquo SHOULD be a pointer to anexternally available file for the image (or the sender SHOULD use a dedicated file transfermethod such as In-Band Bytestreams (XEP-0047) 24 or SOCKS5 Bytestreams (XEP-0065) 25)For security reasons or because of display constraints a compliant client MAY choose todisplay rsquoaltrsquo text only not the image itself (for details see the Malicious Objects section of thisdocument)

76 Style Attribute Module ProfileThis module MUST be supported in XHTML-IM if possible although clients written for certainplatforms (eg console clients mobile phones and handheld computers) or for certain classesof users (eg text-to-speech clients) might not be able to support all of the recommendedstyles directly they SHOULD attempt to emulate or translate the defined style propertiesinto text or other presentation styles that are appropriate for the platform or user base inquestionA full list of recommended style properties is provided below

761 Recommended Style Properties

CSS1 defines 42 rdquoatomicrdquo style properties (which are categorized into font color andbackground text box and classification properties) as well as 11 rdquoshorthandrdquo properties(rdquofontrdquo rdquobackgroundrdquo rdquomarginrdquo rdquopaddingrdquo rdquoborder-widthrdquo rdquoborder-toprdquo rdquoborder-rightrdquordquoborder-bottomrdquo rdquoborder-leftrdquo rdquoborderrdquo and rdquolist-stylerdquo) Many of these properties are notappropriate for use in text-based instant messaging for one or more of the following reasons

1 The property applies to or depends on the inclusion of images other than those han-dled by the XHTML ImageModule (eg the rdquobackground-imagerdquo rdquobackground-repeatrdquordquobackground-attachmentrdquo rdquobackground-positionrdquo and rdquolist-style-imagerdquo properties)

2 The property is intended for advanced document layout (eg the rdquoline-heightrdquo propertyand most of the box properties with the exception of rdquomargin-leftrdquo which is useful forindenting text and rdquomargin-rightrdquo which can be useful when dealing with images)

3 The property is unnecessary since it can be emulated via user input or recommendedXHTML stuctural elements (eg the rdquotext-transformrdquo property can be emulated by the

22RFC 2397 The data URL scheme lthttptoolsietforghtmlrfc2397gt23XEP-0231 Bits of Binary lthttpsxmpporgextensionsxep-0231htmlgt24XEP-0047 In-Band Bytestreams lthttpsxmpporgextensionsxep-0047htmlgt25XEP-0065 SOCKS5 Bytestreams lthttpsxmpporgextensionsxep-0065htmlgt

10

7 RECOMMENDED PROFILE

userrsquos keystrokes or use of the caps lock key)

4 The property is otherwise unlikely to ever be used in the context of rapid-fire con-versations (eg the rdquofont-variantrdquo rdquoword-spacingrdquo rdquoletter-spacingrdquo and rdquolist-style-positionrdquo properties)

5 The property is a shorthand property but some of the properties it includes are not ap-propriate for instant messaging applications according to the foregoing considerations(in fact this applies to all of the shorthand properties)

Unfortunately CSS1 does not include mechanisms for defining profiles thereof (as doesXHTML 10 in the form of XHTML Modularization) While there exist reduced sets of CSS2these introduce more complexity than is desirable in the context of XHTML-IM Therefore wesimply provide a list of recommended CSS1 style propertiesXHTML-IM stipulates that only the following style properties are RECOMMENDED

bull background-color

bull color

bull font-family

bull font-size

bull font-style

bull font-weight

bull margin-left

bull margin-right

bull text-align

bull text-decoration

Although a compliant implementationMAY generate or process other style properties definedin CSS1 such behavior is NOT RECOMMENDED by this document

77 Recommended Attributes771 Common Attributes

Section 51 of Modularization of XHTML describes several rdquocommonrdquo attribute collections ardquoCorerdquo collection (rsquoclassrsquo rsquoidrsquo rsquotitlersquo) an rdquoI18Nrdquo collection (rsquoxmllangrsquo not shown below sinceit is implied in XML) an rdquoEventsrdquo collection (not included in the XHTML-IM Integration Setbecause the Intrinsic Events Module is not selected) and a rdquoStylerdquo collection (rsquostylersquo) The

11

7 RECOMMENDED PROFILE

following table summarizes the recommended profile of these common attributes within theXHTML 10 content itself

Attribute Usage Reasonclass NOT RECOMMENDED External stylesheets (which rsquoclassrsquo

would typically reference) are notrecommended

id NOT RECOMMENDED Internal links and message fragmentsare not recommended in IM contentnor are external stylesheets (whichalso make use of the rsquoidrsquo attribute)

title NOT RECOMMENDED Granting of titles to elements in IMcontent seems unnecessary

style REQUIRED The rsquostylersquo attribute is required since itis the vehicle for presentational styles

xmllang NOT RECOMMENDED Differentiation of language identifica-tion SHOULD occur at the level of theltbodygt element only

772 Specialized Attributes

Beyond the rdquocommonrdquo attributes certain elements within the modules selected for theXHTML-IM Integration Set are allowed to possess other attributes such as eight attributes forthe ltagt element and five attributes for the ltimggt element The recommended profile forsuch attributes is provided in the following table

Element Scope Attribute Usageltagt accesskey NOT RECOMMENDEDltagt charset NOT RECOMMENDEDltagt href REQUIREDltagt hreflang NOT RECOMMENDEDltagt rel NOT RECOMMENDEDltagt rev NOT RECOMMENDEDltagt tabindex NOT RECOMMENDEDltagt type RECOMMENDEDltcitegt uri NOT RECOMMENDEDltimggt alt REQUIREDltimggt height RECOMMENDEDltimggt longdesc NOT RECOMMENDEDltimggt src REQUIREDltimggt width RECOMMENDED

12

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 6: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

3 REQUIREMENTS

1 XHTML is a structured format that is defined as an application of XML 10 6 making itespecially appropriate for sending over JabberXMPP which is at root a technology forstreaming XML (see XMPP Core 7)

2 XHTML is an open standard developed by the World Wide Web Consortium (W3C) 8 arecognized standards development organization

Therefore this document defines support for lightweight text markup in the form of an XMPPextension that encapsulates content defined by an XHTML 10 Integration Set that we labelrdquoXHTML-IMrdquo The remainder of this document discusses lightweight text markup in terms ofXHTML 10 only and does not further consider RTF or other technologies

3 RequirementsHTML was originally designed for authoring and presenting stuctured documents on theWorldWideWeb and was subsequently extended to handlemore advanced functionality suchas image maps and interactive forms However the requirements for publishing documents(or developing transactional websites) for presentation by dedicated XHTML user agentson traditional computers or small-screen devices are fundamentally different from therequirements for lightweight text markup of instant messages for this reason only a reducedset of XHTML features is needed for XHTML-IM In particular

1 IM clients are not XHTML clients their primary purpose is not to read pre-existingXHTML documents but to read and generate relatively large numbers of fairly smallinstant messages

2 The underlying context for XHTML content in JabberXMPP instant messaging isprovided not by a full XHTML document but by an XML stream and specifically by amessage stanza within that stream Thus the ltheadgt element and all its children areunnecessary Only the ltbodygt element and some of its children are appropriate foruse in instant messaging

3 The XHTML content that is read by onersquos IM client is normally generated on the fly byonersquos conversation partner (or to be precise by his or her IM client) Thus there is aninherent limit to the sophistication of the XHTML markup involved Even in normalXHTML documents fairly basic structural and rendering elements such as definitionlists abbreviations addresses and computer input handling (eg ltkbdgt and ltvargt)

6Extensible Markup Language (XML) 10 (Fourth Edition) lthttpwwww3orgTRREC-xmlgt7RFC 6120 ExtensibleMessaging and Presence Protocol (XMPP) Core lthttptoolsietforghtmlrfc6120gt8The World Wide Web Consortium defines data formats and markup languages (such as HTML and XML) for useover the Internet For further information see lthttpwwww3orggt

2

4 CONCEPTS AND APPROACH

are relatively rare There is little or no foreseeable need for such elements within thecontext of instant messaging

4 The foregoing is doubly true of more advanced markup such as tables frames andforms (however there exists an XMPP extension that provides an instant messagingequivalent of the latter as defined in Data Forms (XEP-0004) 9)

5 Although ad-hoc styles are useful for messaging (by means of the rsquostylersquo attribute) fullsupport for Cascading Style Sheets 10 (defined by the ltstylegt element or a standalonecss file and implemented via the rsquoclassrsquo attribute) would be overkill since many CSS1properties (eg box classification and text properties) were developed especially forsophisticated page layout

6 Background images audio animated text layers applets scripts and other multimediacontent types are unnecessary especially given the existence of XMPP extensions suchas SI File Transfer (XEP-0096) 11 Jingle (XEP-0166) 12 and Jingle File Transfer (XEP-0234)13

7 Content transformations such as those defined by XSL Transformations 14 must notbe necessary in order for an instant messaging application to present lightweight textmarkup to an end user

As explained below some of these requirements are addressed by the definition of theXHTML-IM Integration Set itself while others are addressed by a recommended rdquoprofilerdquo forthat Integration Set in the context of instant messaging applications

4 Concepts and ApproachThis document defines an adaptation of XHTML 10 (specifically an XHTML 10 IntegrationSet) that makes it possible to provide lightweight text markup of instant messages (mainly forJabberXMPP instant messages although the Integration Set defined herein could be used byother protocols) This pattern is familiar from email wherein the HTML-formatted version of

9XEP-0004 Data Forms lthttpsxmpporgextensionsxep-0004htmlgt10Cascading Style Sheets level 1 lthttpwwww3orgTRREC-CSS1gt11XEP-0096 SI File Transfer lthttpsxmpporgextensionsxep-0096htmlgt12XEP-0166 Jingle lthttpsxmpporgextensionsxep-0166htmlgt13XEP-0234 Jingle File Transfer lthttpsxmpporgextensionsxep-0234htmlgt14XSL Transformations lthttpwwww3orgTRxsltgt

3

5 WRAPPER ELEMENT

the message supplements but does not supersede the text-only version of the message 15

In JabberXMPP communications the meaning (as opposed to markup) of the message MUSTalways be represented as best as possible in the normal ltbodygt child element or elementsof the ltmessagegt stanza qualified by the rsquojabberclientrsquo (or rsquojabberserverrsquo) namespaceLightweight text markup is then provided within an lthtmlgt element qualified by thersquohttpjabberorgprotocolxhtml-imrsquo namespace 16 However this lthtmlgt element is usedsolely as a rdquowrapperrdquo for the XHTML content itself which content is encapsulated via one ormore ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespace alongwith appropriate child elements thereofThe following example illustrates this approach

Listing 1 A simple exampleltmessage gt

ltbodygthiltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -weightbold rsquogthiltpgt

ltbodygtlthtmlgt

ltmessage gt

Technically speaking there are three aspects to the approach taken herein

1 Definition of the lthtmlgt rdquowrapperrdquo element which functions as an XMPP extensionwithin XMPP ltmessagegt stanzas

2 Definition of the XHTML-IM Integration Set itself in terms of supported XHTML 10mod-ules using the concepts defined in Modularization of XHTML 17

3 A recommended rdquoprofilerdquo regarding the specific XHTML 10 elements and attributes tobe supported from each XHTML 10 module

These three aspects are defined in the three document sections that follow

5 Wrapper ElementThe root element for including XHTML content within XMPP stanzas is lthtmlgt Thiselement is qualified by the rsquohttpjabberorgprotocolxhtml-imrsquo namespace From the

15The XHTML is merely an alternative version of the message body or bodies and the semantic meaning is to bederived from the textual message body or bodies rather than the XHTML version

16It might have been better to use an element name other than lthtmlgt for the wrapper element however chang-ing it would not be backwards-compatible with the older protocol and existing implementations

17Modularization of XHTML lthttpwwww3orgTR2004WD-xhtml-modularization-20040218gt

4

6 XHTML-IM INTEGRATION SET

perspective of XMPP the wrapper element functions as an XMPP extension element fromthe perspective of XHTML it functions as a wrapper for XHTML 10 content qualified by thersquohttpwwww3org1999xhtmlrsquo namespace Such XHTML content MUST be contained inone or more ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespaceand MUST conform to the XHTML-IM Integration Set defined in the following section If morethan one ltbodygt element is included in the lthtmlgt wrapper element each ltbodygt elementMUST possess an rsquoxmllangrsquo attribute with a distinct value where the value of that attributeMUST adhere to the rules defined in RFC 5646 18 A formal definition of the lthtmlgt elementis provided in the XHTML-IM Wrapper SchemaThe XHTML ltbodygt element is not to be confused with the XMPP ltbodygt element which is achild of a ltmessagegt stanza and is qualified by the rsquojabberclientrsquo or rsquojabberserverrsquo namespaceas described in XMPP IM 19 The lthtmlgt wrapper element is intended for inclusion only as adirect child element of the XMPP ltmessagegt stanza and only in order to specify a marked-upversion of the message ltbodygt element or elements but MAY be included elsewhere inaccordance with the rdquoextended namespacerdquo rules defined in the XMPP IM specificationUntil and unless (1) additional integration sets are defined and (2) mechanisms are specifiedfor discovering or negotiating which integration sets are supported the XHTML markupcontained within the lthtmlgt wrapper element

1 MUST NOT include elements and attributes that are not part of the XHTML-IM Integra-tion Set defined in Section 6 of this document any such elements and attributes MUSTbe ignored if received

2 SHOULD NOT include elements and attributes that are not part of the recommendedprofile of the XHTML-IM Integration Set defined in Section 7 of this document any suchelements and attributes SHOULD be ignored if received

Note In the foregoing restrictions the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

6 XHTML-IM Integration SetThis section defines an XHTML 10 Integration Set for use in the context of instant messagingGiven its intended usage we label it rdquoXHTML-IMrdquoModularization of XHTML provides the ability to formally define subsets of XHTML 10 viathe concept of rdquomodularizationrdquo (which may be familiar from XHTML Basic 20) Many of thedefined modules are not necessary or useful in the context of instant messaging and in the

18RFC 5646 Tags for Identifying Languages lthttptoolsietforghtmlrfc5646gt19RFC 6121 Extensible Messaging and Presence Protocol (XMPP) Instant Messaging and Presence lthttptool

sietforghtmlrfc6121gt20XHTML Basic lthttpwwww3orgTRxhtml-basicgt

5

6 XHTML-IM INTEGRATION SET

context of JabberXMPP instant messaging specifically some modules have been supersededby well-defined XMPP extensions This document specifies that XHTML-IM shall be based onthe following XHTML 10 modules

bull Core Modulesndash Structure Modulendash Text Modulendash Hypertext Modulendash List Module

bull Image Module

bull Style Attribute Module

Modularization of XHTML defines many additional modules such as Table Modules FormModules ObjectModules and FrameModules None of thesemodules is part of the XHTML-IMIntegration Set If support for such modules is desired it MUST be defined in a separate anddistinct integration set

61 Structure Module DefinitionThe Structure Module is defined as including the following elements and attributes 21

Element Attributesltbodygt class id title styleltheadgt profilelthtmlgt versionlttitlegt

62 Text Module DefinitionThe Text Module is defined as including the following elements and attributes

Element Attributesltabbrgt class id title style

21The rsquostylersquo attribute is specified herein where appropriate because the Style Attribute Module is included inthe definition of the XHTML-IM Integration Set whereas the event-related attributes (eg rsquoonclickrsquo) are notspecified because the Implicit Events Module is not included

6

6 XHTML-IM INTEGRATION SET

Element Attributesltacronymgt class id title styleltaddressgt class id title styleltblockquotegt class id title style citeltbrgt class id title styleltcitegt class id title styleltcodegt class id title styleltdfngt class id title styleltdivgt class id title styleltemgt class id title stylelth1gt class id title stylelth2gt class id title stylelth3gt class id title stylelth4gt class id title stylelth5gt class id title stylelth6gt class id title styleltkbdgt class id title styleltpgt class id title styleltpregt class id title styleltqgt class id title style citeltsampgt class id title styleltspangt class id title styleltstronggt class id title styleltvargt class id title style

63 Hypertext Module DefinitionThe Hypertext Module is defined as including the ltagt element only

Element Attributesltagt class id title style accesskey charset href hreflang rel rev tabindex type

64 List Module DefinitionThe List Module is defined as including the following elements and attributes

Element Attributesltdlgt class id title style

7

7 RECOMMENDED PROFILE

Element Attributesltdtgt class id title styleltddgt class id title styleltolgt class id title styleltulgt class id title styleltligt class id title style

65 Image Module DefinitionThe Image Module is defined as including the ltimggt element only

Element Attributesltimggt class id title style alt height longdesc src width

66 Style Attribute Module DefinitionThe Style Attribute Module is defined as including the style attribute only as included in thepreceding definition tables

7 Recommended ProfileEven within the restricted set of modules specified as defining the XHTML-IM Integration Set(see preceding section) some elements and attributes are inappropriate or unnecessary forthe purpose of instant messaging although such elements and attributes MAY be included inaccordance with the XHTML-IM Integration Set further recommended restrictions regardingwhich elements and attributes to include in XHTML content are specified below

71 Structure Module ProfileThe intent of the protocol defined herein is to support lightweight text markup of XMPPmessage bodies only Therefore the ltheadgt lthtmlgt and lttitlegt elements SHOULD NOTbe generated by a compliant implementation and SHOULD be ignored if received (wherethe meaning of rdquoignorerdquo is defined by the conformance requirements of Modularization ofXHTML as summarized in the User Agent Conformance section of this document) Howeverthe ltbodygt element is REQUIRED since it is the root element for all XHTML-IM content

8

7 RECOMMENDED PROFILE

72 Text Module ProfileNot all of the Text Module elements are appropriate in the context of instant messagingsince the XHTML content that one views is generated by onersquos conversation partner in whatis often a rapid-fire conversation thread Only the following elements are RECOMMENDED inXHTML-IM

bull ltblockquotegt

bull ltbrgt

bull ltcitegt

bull ltemgt

bull ltpgt

bull ltspangt

bull ltstronggt

The other Text Module elements MAY be generated by a compliant implementation butMAY be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

73 Hypertext Module ProfileThe only recommended attributes of the ltagt element are specified in the RecommendedAttributes section of this document

74 List Module ProfileBecause it is unlikely that an instant messaging user would generate a definition list onlyordered and unordered lists are RECOMMENDED Definition lists SHOULD NOT be generatedby a compliant implementation and SHOULD be ignored if received (where the meaningof rdquoignorerdquo is defined by the conformance requirements of Modularization of XHTML assummarized in the User Agent Conformance section of this document)

75 Image Module ProfileThe only recommended attributes of the ltimggt element are specified in the RecommendedAttributes section of this document

9

7 RECOMMENDED PROFILE

The XHTML specification allows a rdquodatardquo URL RFC 2397 22 as the value of the rsquosrcrsquo attributeThis is NOT RECOMMENDED for use in XHTML-IM because it can significantly increase thesize of the message stanza and XMPP is not optimized for large stanzas If the image datais small (less than 8 kilobytes) clients MAY use Bits of Binary (XEP-0231) 23 in coordinationwith XHTML-IM if the image data is large the value of the rsquosrcrsquo SHOULD be a pointer to anexternally available file for the image (or the sender SHOULD use a dedicated file transfermethod such as In-Band Bytestreams (XEP-0047) 24 or SOCKS5 Bytestreams (XEP-0065) 25)For security reasons or because of display constraints a compliant client MAY choose todisplay rsquoaltrsquo text only not the image itself (for details see the Malicious Objects section of thisdocument)

76 Style Attribute Module ProfileThis module MUST be supported in XHTML-IM if possible although clients written for certainplatforms (eg console clients mobile phones and handheld computers) or for certain classesof users (eg text-to-speech clients) might not be able to support all of the recommendedstyles directly they SHOULD attempt to emulate or translate the defined style propertiesinto text or other presentation styles that are appropriate for the platform or user base inquestionA full list of recommended style properties is provided below

761 Recommended Style Properties

CSS1 defines 42 rdquoatomicrdquo style properties (which are categorized into font color andbackground text box and classification properties) as well as 11 rdquoshorthandrdquo properties(rdquofontrdquo rdquobackgroundrdquo rdquomarginrdquo rdquopaddingrdquo rdquoborder-widthrdquo rdquoborder-toprdquo rdquoborder-rightrdquordquoborder-bottomrdquo rdquoborder-leftrdquo rdquoborderrdquo and rdquolist-stylerdquo) Many of these properties are notappropriate for use in text-based instant messaging for one or more of the following reasons

1 The property applies to or depends on the inclusion of images other than those han-dled by the XHTML ImageModule (eg the rdquobackground-imagerdquo rdquobackground-repeatrdquordquobackground-attachmentrdquo rdquobackground-positionrdquo and rdquolist-style-imagerdquo properties)

2 The property is intended for advanced document layout (eg the rdquoline-heightrdquo propertyand most of the box properties with the exception of rdquomargin-leftrdquo which is useful forindenting text and rdquomargin-rightrdquo which can be useful when dealing with images)

3 The property is unnecessary since it can be emulated via user input or recommendedXHTML stuctural elements (eg the rdquotext-transformrdquo property can be emulated by the

22RFC 2397 The data URL scheme lthttptoolsietforghtmlrfc2397gt23XEP-0231 Bits of Binary lthttpsxmpporgextensionsxep-0231htmlgt24XEP-0047 In-Band Bytestreams lthttpsxmpporgextensionsxep-0047htmlgt25XEP-0065 SOCKS5 Bytestreams lthttpsxmpporgextensionsxep-0065htmlgt

10

7 RECOMMENDED PROFILE

userrsquos keystrokes or use of the caps lock key)

4 The property is otherwise unlikely to ever be used in the context of rapid-fire con-versations (eg the rdquofont-variantrdquo rdquoword-spacingrdquo rdquoletter-spacingrdquo and rdquolist-style-positionrdquo properties)

5 The property is a shorthand property but some of the properties it includes are not ap-propriate for instant messaging applications according to the foregoing considerations(in fact this applies to all of the shorthand properties)

Unfortunately CSS1 does not include mechanisms for defining profiles thereof (as doesXHTML 10 in the form of XHTML Modularization) While there exist reduced sets of CSS2these introduce more complexity than is desirable in the context of XHTML-IM Therefore wesimply provide a list of recommended CSS1 style propertiesXHTML-IM stipulates that only the following style properties are RECOMMENDED

bull background-color

bull color

bull font-family

bull font-size

bull font-style

bull font-weight

bull margin-left

bull margin-right

bull text-align

bull text-decoration

Although a compliant implementationMAY generate or process other style properties definedin CSS1 such behavior is NOT RECOMMENDED by this document

77 Recommended Attributes771 Common Attributes

Section 51 of Modularization of XHTML describes several rdquocommonrdquo attribute collections ardquoCorerdquo collection (rsquoclassrsquo rsquoidrsquo rsquotitlersquo) an rdquoI18Nrdquo collection (rsquoxmllangrsquo not shown below sinceit is implied in XML) an rdquoEventsrdquo collection (not included in the XHTML-IM Integration Setbecause the Intrinsic Events Module is not selected) and a rdquoStylerdquo collection (rsquostylersquo) The

11

7 RECOMMENDED PROFILE

following table summarizes the recommended profile of these common attributes within theXHTML 10 content itself

Attribute Usage Reasonclass NOT RECOMMENDED External stylesheets (which rsquoclassrsquo

would typically reference) are notrecommended

id NOT RECOMMENDED Internal links and message fragmentsare not recommended in IM contentnor are external stylesheets (whichalso make use of the rsquoidrsquo attribute)

title NOT RECOMMENDED Granting of titles to elements in IMcontent seems unnecessary

style REQUIRED The rsquostylersquo attribute is required since itis the vehicle for presentational styles

xmllang NOT RECOMMENDED Differentiation of language identifica-tion SHOULD occur at the level of theltbodygt element only

772 Specialized Attributes

Beyond the rdquocommonrdquo attributes certain elements within the modules selected for theXHTML-IM Integration Set are allowed to possess other attributes such as eight attributes forthe ltagt element and five attributes for the ltimggt element The recommended profile forsuch attributes is provided in the following table

Element Scope Attribute Usageltagt accesskey NOT RECOMMENDEDltagt charset NOT RECOMMENDEDltagt href REQUIREDltagt hreflang NOT RECOMMENDEDltagt rel NOT RECOMMENDEDltagt rev NOT RECOMMENDEDltagt tabindex NOT RECOMMENDEDltagt type RECOMMENDEDltcitegt uri NOT RECOMMENDEDltimggt alt REQUIREDltimggt height RECOMMENDEDltimggt longdesc NOT RECOMMENDEDltimggt src REQUIREDltimggt width RECOMMENDED

12

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 7: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

4 CONCEPTS AND APPROACH

are relatively rare There is little or no foreseeable need for such elements within thecontext of instant messaging

4 The foregoing is doubly true of more advanced markup such as tables frames andforms (however there exists an XMPP extension that provides an instant messagingequivalent of the latter as defined in Data Forms (XEP-0004) 9)

5 Although ad-hoc styles are useful for messaging (by means of the rsquostylersquo attribute) fullsupport for Cascading Style Sheets 10 (defined by the ltstylegt element or a standalonecss file and implemented via the rsquoclassrsquo attribute) would be overkill since many CSS1properties (eg box classification and text properties) were developed especially forsophisticated page layout

6 Background images audio animated text layers applets scripts and other multimediacontent types are unnecessary especially given the existence of XMPP extensions suchas SI File Transfer (XEP-0096) 11 Jingle (XEP-0166) 12 and Jingle File Transfer (XEP-0234)13

7 Content transformations such as those defined by XSL Transformations 14 must notbe necessary in order for an instant messaging application to present lightweight textmarkup to an end user

As explained below some of these requirements are addressed by the definition of theXHTML-IM Integration Set itself while others are addressed by a recommended rdquoprofilerdquo forthat Integration Set in the context of instant messaging applications

4 Concepts and ApproachThis document defines an adaptation of XHTML 10 (specifically an XHTML 10 IntegrationSet) that makes it possible to provide lightweight text markup of instant messages (mainly forJabberXMPP instant messages although the Integration Set defined herein could be used byother protocols) This pattern is familiar from email wherein the HTML-formatted version of

9XEP-0004 Data Forms lthttpsxmpporgextensionsxep-0004htmlgt10Cascading Style Sheets level 1 lthttpwwww3orgTRREC-CSS1gt11XEP-0096 SI File Transfer lthttpsxmpporgextensionsxep-0096htmlgt12XEP-0166 Jingle lthttpsxmpporgextensionsxep-0166htmlgt13XEP-0234 Jingle File Transfer lthttpsxmpporgextensionsxep-0234htmlgt14XSL Transformations lthttpwwww3orgTRxsltgt

3

5 WRAPPER ELEMENT

the message supplements but does not supersede the text-only version of the message 15

In JabberXMPP communications the meaning (as opposed to markup) of the message MUSTalways be represented as best as possible in the normal ltbodygt child element or elementsof the ltmessagegt stanza qualified by the rsquojabberclientrsquo (or rsquojabberserverrsquo) namespaceLightweight text markup is then provided within an lthtmlgt element qualified by thersquohttpjabberorgprotocolxhtml-imrsquo namespace 16 However this lthtmlgt element is usedsolely as a rdquowrapperrdquo for the XHTML content itself which content is encapsulated via one ormore ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespace alongwith appropriate child elements thereofThe following example illustrates this approach

Listing 1 A simple exampleltmessage gt

ltbodygthiltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -weightbold rsquogthiltpgt

ltbodygtlthtmlgt

ltmessage gt

Technically speaking there are three aspects to the approach taken herein

1 Definition of the lthtmlgt rdquowrapperrdquo element which functions as an XMPP extensionwithin XMPP ltmessagegt stanzas

2 Definition of the XHTML-IM Integration Set itself in terms of supported XHTML 10mod-ules using the concepts defined in Modularization of XHTML 17

3 A recommended rdquoprofilerdquo regarding the specific XHTML 10 elements and attributes tobe supported from each XHTML 10 module

These three aspects are defined in the three document sections that follow

5 Wrapper ElementThe root element for including XHTML content within XMPP stanzas is lthtmlgt Thiselement is qualified by the rsquohttpjabberorgprotocolxhtml-imrsquo namespace From the

15The XHTML is merely an alternative version of the message body or bodies and the semantic meaning is to bederived from the textual message body or bodies rather than the XHTML version

16It might have been better to use an element name other than lthtmlgt for the wrapper element however chang-ing it would not be backwards-compatible with the older protocol and existing implementations

17Modularization of XHTML lthttpwwww3orgTR2004WD-xhtml-modularization-20040218gt

4

6 XHTML-IM INTEGRATION SET

perspective of XMPP the wrapper element functions as an XMPP extension element fromthe perspective of XHTML it functions as a wrapper for XHTML 10 content qualified by thersquohttpwwww3org1999xhtmlrsquo namespace Such XHTML content MUST be contained inone or more ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespaceand MUST conform to the XHTML-IM Integration Set defined in the following section If morethan one ltbodygt element is included in the lthtmlgt wrapper element each ltbodygt elementMUST possess an rsquoxmllangrsquo attribute with a distinct value where the value of that attributeMUST adhere to the rules defined in RFC 5646 18 A formal definition of the lthtmlgt elementis provided in the XHTML-IM Wrapper SchemaThe XHTML ltbodygt element is not to be confused with the XMPP ltbodygt element which is achild of a ltmessagegt stanza and is qualified by the rsquojabberclientrsquo or rsquojabberserverrsquo namespaceas described in XMPP IM 19 The lthtmlgt wrapper element is intended for inclusion only as adirect child element of the XMPP ltmessagegt stanza and only in order to specify a marked-upversion of the message ltbodygt element or elements but MAY be included elsewhere inaccordance with the rdquoextended namespacerdquo rules defined in the XMPP IM specificationUntil and unless (1) additional integration sets are defined and (2) mechanisms are specifiedfor discovering or negotiating which integration sets are supported the XHTML markupcontained within the lthtmlgt wrapper element

1 MUST NOT include elements and attributes that are not part of the XHTML-IM Integra-tion Set defined in Section 6 of this document any such elements and attributes MUSTbe ignored if received

2 SHOULD NOT include elements and attributes that are not part of the recommendedprofile of the XHTML-IM Integration Set defined in Section 7 of this document any suchelements and attributes SHOULD be ignored if received

Note In the foregoing restrictions the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

6 XHTML-IM Integration SetThis section defines an XHTML 10 Integration Set for use in the context of instant messagingGiven its intended usage we label it rdquoXHTML-IMrdquoModularization of XHTML provides the ability to formally define subsets of XHTML 10 viathe concept of rdquomodularizationrdquo (which may be familiar from XHTML Basic 20) Many of thedefined modules are not necessary or useful in the context of instant messaging and in the

18RFC 5646 Tags for Identifying Languages lthttptoolsietforghtmlrfc5646gt19RFC 6121 Extensible Messaging and Presence Protocol (XMPP) Instant Messaging and Presence lthttptool

sietforghtmlrfc6121gt20XHTML Basic lthttpwwww3orgTRxhtml-basicgt

5

6 XHTML-IM INTEGRATION SET

context of JabberXMPP instant messaging specifically some modules have been supersededby well-defined XMPP extensions This document specifies that XHTML-IM shall be based onthe following XHTML 10 modules

bull Core Modulesndash Structure Modulendash Text Modulendash Hypertext Modulendash List Module

bull Image Module

bull Style Attribute Module

Modularization of XHTML defines many additional modules such as Table Modules FormModules ObjectModules and FrameModules None of thesemodules is part of the XHTML-IMIntegration Set If support for such modules is desired it MUST be defined in a separate anddistinct integration set

61 Structure Module DefinitionThe Structure Module is defined as including the following elements and attributes 21

Element Attributesltbodygt class id title styleltheadgt profilelthtmlgt versionlttitlegt

62 Text Module DefinitionThe Text Module is defined as including the following elements and attributes

Element Attributesltabbrgt class id title style

21The rsquostylersquo attribute is specified herein where appropriate because the Style Attribute Module is included inthe definition of the XHTML-IM Integration Set whereas the event-related attributes (eg rsquoonclickrsquo) are notspecified because the Implicit Events Module is not included

6

6 XHTML-IM INTEGRATION SET

Element Attributesltacronymgt class id title styleltaddressgt class id title styleltblockquotegt class id title style citeltbrgt class id title styleltcitegt class id title styleltcodegt class id title styleltdfngt class id title styleltdivgt class id title styleltemgt class id title stylelth1gt class id title stylelth2gt class id title stylelth3gt class id title stylelth4gt class id title stylelth5gt class id title stylelth6gt class id title styleltkbdgt class id title styleltpgt class id title styleltpregt class id title styleltqgt class id title style citeltsampgt class id title styleltspangt class id title styleltstronggt class id title styleltvargt class id title style

63 Hypertext Module DefinitionThe Hypertext Module is defined as including the ltagt element only

Element Attributesltagt class id title style accesskey charset href hreflang rel rev tabindex type

64 List Module DefinitionThe List Module is defined as including the following elements and attributes

Element Attributesltdlgt class id title style

7

7 RECOMMENDED PROFILE

Element Attributesltdtgt class id title styleltddgt class id title styleltolgt class id title styleltulgt class id title styleltligt class id title style

65 Image Module DefinitionThe Image Module is defined as including the ltimggt element only

Element Attributesltimggt class id title style alt height longdesc src width

66 Style Attribute Module DefinitionThe Style Attribute Module is defined as including the style attribute only as included in thepreceding definition tables

7 Recommended ProfileEven within the restricted set of modules specified as defining the XHTML-IM Integration Set(see preceding section) some elements and attributes are inappropriate or unnecessary forthe purpose of instant messaging although such elements and attributes MAY be included inaccordance with the XHTML-IM Integration Set further recommended restrictions regardingwhich elements and attributes to include in XHTML content are specified below

71 Structure Module ProfileThe intent of the protocol defined herein is to support lightweight text markup of XMPPmessage bodies only Therefore the ltheadgt lthtmlgt and lttitlegt elements SHOULD NOTbe generated by a compliant implementation and SHOULD be ignored if received (wherethe meaning of rdquoignorerdquo is defined by the conformance requirements of Modularization ofXHTML as summarized in the User Agent Conformance section of this document) Howeverthe ltbodygt element is REQUIRED since it is the root element for all XHTML-IM content

8

7 RECOMMENDED PROFILE

72 Text Module ProfileNot all of the Text Module elements are appropriate in the context of instant messagingsince the XHTML content that one views is generated by onersquos conversation partner in whatis often a rapid-fire conversation thread Only the following elements are RECOMMENDED inXHTML-IM

bull ltblockquotegt

bull ltbrgt

bull ltcitegt

bull ltemgt

bull ltpgt

bull ltspangt

bull ltstronggt

The other Text Module elements MAY be generated by a compliant implementation butMAY be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

73 Hypertext Module ProfileThe only recommended attributes of the ltagt element are specified in the RecommendedAttributes section of this document

74 List Module ProfileBecause it is unlikely that an instant messaging user would generate a definition list onlyordered and unordered lists are RECOMMENDED Definition lists SHOULD NOT be generatedby a compliant implementation and SHOULD be ignored if received (where the meaningof rdquoignorerdquo is defined by the conformance requirements of Modularization of XHTML assummarized in the User Agent Conformance section of this document)

75 Image Module ProfileThe only recommended attributes of the ltimggt element are specified in the RecommendedAttributes section of this document

9

7 RECOMMENDED PROFILE

The XHTML specification allows a rdquodatardquo URL RFC 2397 22 as the value of the rsquosrcrsquo attributeThis is NOT RECOMMENDED for use in XHTML-IM because it can significantly increase thesize of the message stanza and XMPP is not optimized for large stanzas If the image datais small (less than 8 kilobytes) clients MAY use Bits of Binary (XEP-0231) 23 in coordinationwith XHTML-IM if the image data is large the value of the rsquosrcrsquo SHOULD be a pointer to anexternally available file for the image (or the sender SHOULD use a dedicated file transfermethod such as In-Band Bytestreams (XEP-0047) 24 or SOCKS5 Bytestreams (XEP-0065) 25)For security reasons or because of display constraints a compliant client MAY choose todisplay rsquoaltrsquo text only not the image itself (for details see the Malicious Objects section of thisdocument)

76 Style Attribute Module ProfileThis module MUST be supported in XHTML-IM if possible although clients written for certainplatforms (eg console clients mobile phones and handheld computers) or for certain classesof users (eg text-to-speech clients) might not be able to support all of the recommendedstyles directly they SHOULD attempt to emulate or translate the defined style propertiesinto text or other presentation styles that are appropriate for the platform or user base inquestionA full list of recommended style properties is provided below

761 Recommended Style Properties

CSS1 defines 42 rdquoatomicrdquo style properties (which are categorized into font color andbackground text box and classification properties) as well as 11 rdquoshorthandrdquo properties(rdquofontrdquo rdquobackgroundrdquo rdquomarginrdquo rdquopaddingrdquo rdquoborder-widthrdquo rdquoborder-toprdquo rdquoborder-rightrdquordquoborder-bottomrdquo rdquoborder-leftrdquo rdquoborderrdquo and rdquolist-stylerdquo) Many of these properties are notappropriate for use in text-based instant messaging for one or more of the following reasons

1 The property applies to or depends on the inclusion of images other than those han-dled by the XHTML ImageModule (eg the rdquobackground-imagerdquo rdquobackground-repeatrdquordquobackground-attachmentrdquo rdquobackground-positionrdquo and rdquolist-style-imagerdquo properties)

2 The property is intended for advanced document layout (eg the rdquoline-heightrdquo propertyand most of the box properties with the exception of rdquomargin-leftrdquo which is useful forindenting text and rdquomargin-rightrdquo which can be useful when dealing with images)

3 The property is unnecessary since it can be emulated via user input or recommendedXHTML stuctural elements (eg the rdquotext-transformrdquo property can be emulated by the

22RFC 2397 The data URL scheme lthttptoolsietforghtmlrfc2397gt23XEP-0231 Bits of Binary lthttpsxmpporgextensionsxep-0231htmlgt24XEP-0047 In-Band Bytestreams lthttpsxmpporgextensionsxep-0047htmlgt25XEP-0065 SOCKS5 Bytestreams lthttpsxmpporgextensionsxep-0065htmlgt

10

7 RECOMMENDED PROFILE

userrsquos keystrokes or use of the caps lock key)

4 The property is otherwise unlikely to ever be used in the context of rapid-fire con-versations (eg the rdquofont-variantrdquo rdquoword-spacingrdquo rdquoletter-spacingrdquo and rdquolist-style-positionrdquo properties)

5 The property is a shorthand property but some of the properties it includes are not ap-propriate for instant messaging applications according to the foregoing considerations(in fact this applies to all of the shorthand properties)

Unfortunately CSS1 does not include mechanisms for defining profiles thereof (as doesXHTML 10 in the form of XHTML Modularization) While there exist reduced sets of CSS2these introduce more complexity than is desirable in the context of XHTML-IM Therefore wesimply provide a list of recommended CSS1 style propertiesXHTML-IM stipulates that only the following style properties are RECOMMENDED

bull background-color

bull color

bull font-family

bull font-size

bull font-style

bull font-weight

bull margin-left

bull margin-right

bull text-align

bull text-decoration

Although a compliant implementationMAY generate or process other style properties definedin CSS1 such behavior is NOT RECOMMENDED by this document

77 Recommended Attributes771 Common Attributes

Section 51 of Modularization of XHTML describes several rdquocommonrdquo attribute collections ardquoCorerdquo collection (rsquoclassrsquo rsquoidrsquo rsquotitlersquo) an rdquoI18Nrdquo collection (rsquoxmllangrsquo not shown below sinceit is implied in XML) an rdquoEventsrdquo collection (not included in the XHTML-IM Integration Setbecause the Intrinsic Events Module is not selected) and a rdquoStylerdquo collection (rsquostylersquo) The

11

7 RECOMMENDED PROFILE

following table summarizes the recommended profile of these common attributes within theXHTML 10 content itself

Attribute Usage Reasonclass NOT RECOMMENDED External stylesheets (which rsquoclassrsquo

would typically reference) are notrecommended

id NOT RECOMMENDED Internal links and message fragmentsare not recommended in IM contentnor are external stylesheets (whichalso make use of the rsquoidrsquo attribute)

title NOT RECOMMENDED Granting of titles to elements in IMcontent seems unnecessary

style REQUIRED The rsquostylersquo attribute is required since itis the vehicle for presentational styles

xmllang NOT RECOMMENDED Differentiation of language identifica-tion SHOULD occur at the level of theltbodygt element only

772 Specialized Attributes

Beyond the rdquocommonrdquo attributes certain elements within the modules selected for theXHTML-IM Integration Set are allowed to possess other attributes such as eight attributes forthe ltagt element and five attributes for the ltimggt element The recommended profile forsuch attributes is provided in the following table

Element Scope Attribute Usageltagt accesskey NOT RECOMMENDEDltagt charset NOT RECOMMENDEDltagt href REQUIREDltagt hreflang NOT RECOMMENDEDltagt rel NOT RECOMMENDEDltagt rev NOT RECOMMENDEDltagt tabindex NOT RECOMMENDEDltagt type RECOMMENDEDltcitegt uri NOT RECOMMENDEDltimggt alt REQUIREDltimggt height RECOMMENDEDltimggt longdesc NOT RECOMMENDEDltimggt src REQUIREDltimggt width RECOMMENDED

12

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 8: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

5 WRAPPER ELEMENT

the message supplements but does not supersede the text-only version of the message 15

In JabberXMPP communications the meaning (as opposed to markup) of the message MUSTalways be represented as best as possible in the normal ltbodygt child element or elementsof the ltmessagegt stanza qualified by the rsquojabberclientrsquo (or rsquojabberserverrsquo) namespaceLightweight text markup is then provided within an lthtmlgt element qualified by thersquohttpjabberorgprotocolxhtml-imrsquo namespace 16 However this lthtmlgt element is usedsolely as a rdquowrapperrdquo for the XHTML content itself which content is encapsulated via one ormore ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespace alongwith appropriate child elements thereofThe following example illustrates this approach

Listing 1 A simple exampleltmessage gt

ltbodygthiltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -weightbold rsquogthiltpgt

ltbodygtlthtmlgt

ltmessage gt

Technically speaking there are three aspects to the approach taken herein

1 Definition of the lthtmlgt rdquowrapperrdquo element which functions as an XMPP extensionwithin XMPP ltmessagegt stanzas

2 Definition of the XHTML-IM Integration Set itself in terms of supported XHTML 10mod-ules using the concepts defined in Modularization of XHTML 17

3 A recommended rdquoprofilerdquo regarding the specific XHTML 10 elements and attributes tobe supported from each XHTML 10 module

These three aspects are defined in the three document sections that follow

5 Wrapper ElementThe root element for including XHTML content within XMPP stanzas is lthtmlgt Thiselement is qualified by the rsquohttpjabberorgprotocolxhtml-imrsquo namespace From the

15The XHTML is merely an alternative version of the message body or bodies and the semantic meaning is to bederived from the textual message body or bodies rather than the XHTML version

16It might have been better to use an element name other than lthtmlgt for the wrapper element however chang-ing it would not be backwards-compatible with the older protocol and existing implementations

17Modularization of XHTML lthttpwwww3orgTR2004WD-xhtml-modularization-20040218gt

4

6 XHTML-IM INTEGRATION SET

perspective of XMPP the wrapper element functions as an XMPP extension element fromthe perspective of XHTML it functions as a wrapper for XHTML 10 content qualified by thersquohttpwwww3org1999xhtmlrsquo namespace Such XHTML content MUST be contained inone or more ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespaceand MUST conform to the XHTML-IM Integration Set defined in the following section If morethan one ltbodygt element is included in the lthtmlgt wrapper element each ltbodygt elementMUST possess an rsquoxmllangrsquo attribute with a distinct value where the value of that attributeMUST adhere to the rules defined in RFC 5646 18 A formal definition of the lthtmlgt elementis provided in the XHTML-IM Wrapper SchemaThe XHTML ltbodygt element is not to be confused with the XMPP ltbodygt element which is achild of a ltmessagegt stanza and is qualified by the rsquojabberclientrsquo or rsquojabberserverrsquo namespaceas described in XMPP IM 19 The lthtmlgt wrapper element is intended for inclusion only as adirect child element of the XMPP ltmessagegt stanza and only in order to specify a marked-upversion of the message ltbodygt element or elements but MAY be included elsewhere inaccordance with the rdquoextended namespacerdquo rules defined in the XMPP IM specificationUntil and unless (1) additional integration sets are defined and (2) mechanisms are specifiedfor discovering or negotiating which integration sets are supported the XHTML markupcontained within the lthtmlgt wrapper element

1 MUST NOT include elements and attributes that are not part of the XHTML-IM Integra-tion Set defined in Section 6 of this document any such elements and attributes MUSTbe ignored if received

2 SHOULD NOT include elements and attributes that are not part of the recommendedprofile of the XHTML-IM Integration Set defined in Section 7 of this document any suchelements and attributes SHOULD be ignored if received

Note In the foregoing restrictions the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

6 XHTML-IM Integration SetThis section defines an XHTML 10 Integration Set for use in the context of instant messagingGiven its intended usage we label it rdquoXHTML-IMrdquoModularization of XHTML provides the ability to formally define subsets of XHTML 10 viathe concept of rdquomodularizationrdquo (which may be familiar from XHTML Basic 20) Many of thedefined modules are not necessary or useful in the context of instant messaging and in the

18RFC 5646 Tags for Identifying Languages lthttptoolsietforghtmlrfc5646gt19RFC 6121 Extensible Messaging and Presence Protocol (XMPP) Instant Messaging and Presence lthttptool

sietforghtmlrfc6121gt20XHTML Basic lthttpwwww3orgTRxhtml-basicgt

5

6 XHTML-IM INTEGRATION SET

context of JabberXMPP instant messaging specifically some modules have been supersededby well-defined XMPP extensions This document specifies that XHTML-IM shall be based onthe following XHTML 10 modules

bull Core Modulesndash Structure Modulendash Text Modulendash Hypertext Modulendash List Module

bull Image Module

bull Style Attribute Module

Modularization of XHTML defines many additional modules such as Table Modules FormModules ObjectModules and FrameModules None of thesemodules is part of the XHTML-IMIntegration Set If support for such modules is desired it MUST be defined in a separate anddistinct integration set

61 Structure Module DefinitionThe Structure Module is defined as including the following elements and attributes 21

Element Attributesltbodygt class id title styleltheadgt profilelthtmlgt versionlttitlegt

62 Text Module DefinitionThe Text Module is defined as including the following elements and attributes

Element Attributesltabbrgt class id title style

21The rsquostylersquo attribute is specified herein where appropriate because the Style Attribute Module is included inthe definition of the XHTML-IM Integration Set whereas the event-related attributes (eg rsquoonclickrsquo) are notspecified because the Implicit Events Module is not included

6

6 XHTML-IM INTEGRATION SET

Element Attributesltacronymgt class id title styleltaddressgt class id title styleltblockquotegt class id title style citeltbrgt class id title styleltcitegt class id title styleltcodegt class id title styleltdfngt class id title styleltdivgt class id title styleltemgt class id title stylelth1gt class id title stylelth2gt class id title stylelth3gt class id title stylelth4gt class id title stylelth5gt class id title stylelth6gt class id title styleltkbdgt class id title styleltpgt class id title styleltpregt class id title styleltqgt class id title style citeltsampgt class id title styleltspangt class id title styleltstronggt class id title styleltvargt class id title style

63 Hypertext Module DefinitionThe Hypertext Module is defined as including the ltagt element only

Element Attributesltagt class id title style accesskey charset href hreflang rel rev tabindex type

64 List Module DefinitionThe List Module is defined as including the following elements and attributes

Element Attributesltdlgt class id title style

7

7 RECOMMENDED PROFILE

Element Attributesltdtgt class id title styleltddgt class id title styleltolgt class id title styleltulgt class id title styleltligt class id title style

65 Image Module DefinitionThe Image Module is defined as including the ltimggt element only

Element Attributesltimggt class id title style alt height longdesc src width

66 Style Attribute Module DefinitionThe Style Attribute Module is defined as including the style attribute only as included in thepreceding definition tables

7 Recommended ProfileEven within the restricted set of modules specified as defining the XHTML-IM Integration Set(see preceding section) some elements and attributes are inappropriate or unnecessary forthe purpose of instant messaging although such elements and attributes MAY be included inaccordance with the XHTML-IM Integration Set further recommended restrictions regardingwhich elements and attributes to include in XHTML content are specified below

71 Structure Module ProfileThe intent of the protocol defined herein is to support lightweight text markup of XMPPmessage bodies only Therefore the ltheadgt lthtmlgt and lttitlegt elements SHOULD NOTbe generated by a compliant implementation and SHOULD be ignored if received (wherethe meaning of rdquoignorerdquo is defined by the conformance requirements of Modularization ofXHTML as summarized in the User Agent Conformance section of this document) Howeverthe ltbodygt element is REQUIRED since it is the root element for all XHTML-IM content

8

7 RECOMMENDED PROFILE

72 Text Module ProfileNot all of the Text Module elements are appropriate in the context of instant messagingsince the XHTML content that one views is generated by onersquos conversation partner in whatis often a rapid-fire conversation thread Only the following elements are RECOMMENDED inXHTML-IM

bull ltblockquotegt

bull ltbrgt

bull ltcitegt

bull ltemgt

bull ltpgt

bull ltspangt

bull ltstronggt

The other Text Module elements MAY be generated by a compliant implementation butMAY be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

73 Hypertext Module ProfileThe only recommended attributes of the ltagt element are specified in the RecommendedAttributes section of this document

74 List Module ProfileBecause it is unlikely that an instant messaging user would generate a definition list onlyordered and unordered lists are RECOMMENDED Definition lists SHOULD NOT be generatedby a compliant implementation and SHOULD be ignored if received (where the meaningof rdquoignorerdquo is defined by the conformance requirements of Modularization of XHTML assummarized in the User Agent Conformance section of this document)

75 Image Module ProfileThe only recommended attributes of the ltimggt element are specified in the RecommendedAttributes section of this document

9

7 RECOMMENDED PROFILE

The XHTML specification allows a rdquodatardquo URL RFC 2397 22 as the value of the rsquosrcrsquo attributeThis is NOT RECOMMENDED for use in XHTML-IM because it can significantly increase thesize of the message stanza and XMPP is not optimized for large stanzas If the image datais small (less than 8 kilobytes) clients MAY use Bits of Binary (XEP-0231) 23 in coordinationwith XHTML-IM if the image data is large the value of the rsquosrcrsquo SHOULD be a pointer to anexternally available file for the image (or the sender SHOULD use a dedicated file transfermethod such as In-Band Bytestreams (XEP-0047) 24 or SOCKS5 Bytestreams (XEP-0065) 25)For security reasons or because of display constraints a compliant client MAY choose todisplay rsquoaltrsquo text only not the image itself (for details see the Malicious Objects section of thisdocument)

76 Style Attribute Module ProfileThis module MUST be supported in XHTML-IM if possible although clients written for certainplatforms (eg console clients mobile phones and handheld computers) or for certain classesof users (eg text-to-speech clients) might not be able to support all of the recommendedstyles directly they SHOULD attempt to emulate or translate the defined style propertiesinto text or other presentation styles that are appropriate for the platform or user base inquestionA full list of recommended style properties is provided below

761 Recommended Style Properties

CSS1 defines 42 rdquoatomicrdquo style properties (which are categorized into font color andbackground text box and classification properties) as well as 11 rdquoshorthandrdquo properties(rdquofontrdquo rdquobackgroundrdquo rdquomarginrdquo rdquopaddingrdquo rdquoborder-widthrdquo rdquoborder-toprdquo rdquoborder-rightrdquordquoborder-bottomrdquo rdquoborder-leftrdquo rdquoborderrdquo and rdquolist-stylerdquo) Many of these properties are notappropriate for use in text-based instant messaging for one or more of the following reasons

1 The property applies to or depends on the inclusion of images other than those han-dled by the XHTML ImageModule (eg the rdquobackground-imagerdquo rdquobackground-repeatrdquordquobackground-attachmentrdquo rdquobackground-positionrdquo and rdquolist-style-imagerdquo properties)

2 The property is intended for advanced document layout (eg the rdquoline-heightrdquo propertyand most of the box properties with the exception of rdquomargin-leftrdquo which is useful forindenting text and rdquomargin-rightrdquo which can be useful when dealing with images)

3 The property is unnecessary since it can be emulated via user input or recommendedXHTML stuctural elements (eg the rdquotext-transformrdquo property can be emulated by the

22RFC 2397 The data URL scheme lthttptoolsietforghtmlrfc2397gt23XEP-0231 Bits of Binary lthttpsxmpporgextensionsxep-0231htmlgt24XEP-0047 In-Band Bytestreams lthttpsxmpporgextensionsxep-0047htmlgt25XEP-0065 SOCKS5 Bytestreams lthttpsxmpporgextensionsxep-0065htmlgt

10

7 RECOMMENDED PROFILE

userrsquos keystrokes or use of the caps lock key)

4 The property is otherwise unlikely to ever be used in the context of rapid-fire con-versations (eg the rdquofont-variantrdquo rdquoword-spacingrdquo rdquoletter-spacingrdquo and rdquolist-style-positionrdquo properties)

5 The property is a shorthand property but some of the properties it includes are not ap-propriate for instant messaging applications according to the foregoing considerations(in fact this applies to all of the shorthand properties)

Unfortunately CSS1 does not include mechanisms for defining profiles thereof (as doesXHTML 10 in the form of XHTML Modularization) While there exist reduced sets of CSS2these introduce more complexity than is desirable in the context of XHTML-IM Therefore wesimply provide a list of recommended CSS1 style propertiesXHTML-IM stipulates that only the following style properties are RECOMMENDED

bull background-color

bull color

bull font-family

bull font-size

bull font-style

bull font-weight

bull margin-left

bull margin-right

bull text-align

bull text-decoration

Although a compliant implementationMAY generate or process other style properties definedin CSS1 such behavior is NOT RECOMMENDED by this document

77 Recommended Attributes771 Common Attributes

Section 51 of Modularization of XHTML describes several rdquocommonrdquo attribute collections ardquoCorerdquo collection (rsquoclassrsquo rsquoidrsquo rsquotitlersquo) an rdquoI18Nrdquo collection (rsquoxmllangrsquo not shown below sinceit is implied in XML) an rdquoEventsrdquo collection (not included in the XHTML-IM Integration Setbecause the Intrinsic Events Module is not selected) and a rdquoStylerdquo collection (rsquostylersquo) The

11

7 RECOMMENDED PROFILE

following table summarizes the recommended profile of these common attributes within theXHTML 10 content itself

Attribute Usage Reasonclass NOT RECOMMENDED External stylesheets (which rsquoclassrsquo

would typically reference) are notrecommended

id NOT RECOMMENDED Internal links and message fragmentsare not recommended in IM contentnor are external stylesheets (whichalso make use of the rsquoidrsquo attribute)

title NOT RECOMMENDED Granting of titles to elements in IMcontent seems unnecessary

style REQUIRED The rsquostylersquo attribute is required since itis the vehicle for presentational styles

xmllang NOT RECOMMENDED Differentiation of language identifica-tion SHOULD occur at the level of theltbodygt element only

772 Specialized Attributes

Beyond the rdquocommonrdquo attributes certain elements within the modules selected for theXHTML-IM Integration Set are allowed to possess other attributes such as eight attributes forthe ltagt element and five attributes for the ltimggt element The recommended profile forsuch attributes is provided in the following table

Element Scope Attribute Usageltagt accesskey NOT RECOMMENDEDltagt charset NOT RECOMMENDEDltagt href REQUIREDltagt hreflang NOT RECOMMENDEDltagt rel NOT RECOMMENDEDltagt rev NOT RECOMMENDEDltagt tabindex NOT RECOMMENDEDltagt type RECOMMENDEDltcitegt uri NOT RECOMMENDEDltimggt alt REQUIREDltimggt height RECOMMENDEDltimggt longdesc NOT RECOMMENDEDltimggt src REQUIREDltimggt width RECOMMENDED

12

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 9: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

6 XHTML-IM INTEGRATION SET

perspective of XMPP the wrapper element functions as an XMPP extension element fromthe perspective of XHTML it functions as a wrapper for XHTML 10 content qualified by thersquohttpwwww3org1999xhtmlrsquo namespace Such XHTML content MUST be contained inone or more ltbodygt elements qualified by the rsquohttpwwww3org1999xhtmlrsquo namespaceand MUST conform to the XHTML-IM Integration Set defined in the following section If morethan one ltbodygt element is included in the lthtmlgt wrapper element each ltbodygt elementMUST possess an rsquoxmllangrsquo attribute with a distinct value where the value of that attributeMUST adhere to the rules defined in RFC 5646 18 A formal definition of the lthtmlgt elementis provided in the XHTML-IM Wrapper SchemaThe XHTML ltbodygt element is not to be confused with the XMPP ltbodygt element which is achild of a ltmessagegt stanza and is qualified by the rsquojabberclientrsquo or rsquojabberserverrsquo namespaceas described in XMPP IM 19 The lthtmlgt wrapper element is intended for inclusion only as adirect child element of the XMPP ltmessagegt stanza and only in order to specify a marked-upversion of the message ltbodygt element or elements but MAY be included elsewhere inaccordance with the rdquoextended namespacerdquo rules defined in the XMPP IM specificationUntil and unless (1) additional integration sets are defined and (2) mechanisms are specifiedfor discovering or negotiating which integration sets are supported the XHTML markupcontained within the lthtmlgt wrapper element

1 MUST NOT include elements and attributes that are not part of the XHTML-IM Integra-tion Set defined in Section 6 of this document any such elements and attributes MUSTbe ignored if received

2 SHOULD NOT include elements and attributes that are not part of the recommendedprofile of the XHTML-IM Integration Set defined in Section 7 of this document any suchelements and attributes SHOULD be ignored if received

Note In the foregoing restrictions the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

6 XHTML-IM Integration SetThis section defines an XHTML 10 Integration Set for use in the context of instant messagingGiven its intended usage we label it rdquoXHTML-IMrdquoModularization of XHTML provides the ability to formally define subsets of XHTML 10 viathe concept of rdquomodularizationrdquo (which may be familiar from XHTML Basic 20) Many of thedefined modules are not necessary or useful in the context of instant messaging and in the

18RFC 5646 Tags for Identifying Languages lthttptoolsietforghtmlrfc5646gt19RFC 6121 Extensible Messaging and Presence Protocol (XMPP) Instant Messaging and Presence lthttptool

sietforghtmlrfc6121gt20XHTML Basic lthttpwwww3orgTRxhtml-basicgt

5

6 XHTML-IM INTEGRATION SET

context of JabberXMPP instant messaging specifically some modules have been supersededby well-defined XMPP extensions This document specifies that XHTML-IM shall be based onthe following XHTML 10 modules

bull Core Modulesndash Structure Modulendash Text Modulendash Hypertext Modulendash List Module

bull Image Module

bull Style Attribute Module

Modularization of XHTML defines many additional modules such as Table Modules FormModules ObjectModules and FrameModules None of thesemodules is part of the XHTML-IMIntegration Set If support for such modules is desired it MUST be defined in a separate anddistinct integration set

61 Structure Module DefinitionThe Structure Module is defined as including the following elements and attributes 21

Element Attributesltbodygt class id title styleltheadgt profilelthtmlgt versionlttitlegt

62 Text Module DefinitionThe Text Module is defined as including the following elements and attributes

Element Attributesltabbrgt class id title style

21The rsquostylersquo attribute is specified herein where appropriate because the Style Attribute Module is included inthe definition of the XHTML-IM Integration Set whereas the event-related attributes (eg rsquoonclickrsquo) are notspecified because the Implicit Events Module is not included

6

6 XHTML-IM INTEGRATION SET

Element Attributesltacronymgt class id title styleltaddressgt class id title styleltblockquotegt class id title style citeltbrgt class id title styleltcitegt class id title styleltcodegt class id title styleltdfngt class id title styleltdivgt class id title styleltemgt class id title stylelth1gt class id title stylelth2gt class id title stylelth3gt class id title stylelth4gt class id title stylelth5gt class id title stylelth6gt class id title styleltkbdgt class id title styleltpgt class id title styleltpregt class id title styleltqgt class id title style citeltsampgt class id title styleltspangt class id title styleltstronggt class id title styleltvargt class id title style

63 Hypertext Module DefinitionThe Hypertext Module is defined as including the ltagt element only

Element Attributesltagt class id title style accesskey charset href hreflang rel rev tabindex type

64 List Module DefinitionThe List Module is defined as including the following elements and attributes

Element Attributesltdlgt class id title style

7

7 RECOMMENDED PROFILE

Element Attributesltdtgt class id title styleltddgt class id title styleltolgt class id title styleltulgt class id title styleltligt class id title style

65 Image Module DefinitionThe Image Module is defined as including the ltimggt element only

Element Attributesltimggt class id title style alt height longdesc src width

66 Style Attribute Module DefinitionThe Style Attribute Module is defined as including the style attribute only as included in thepreceding definition tables

7 Recommended ProfileEven within the restricted set of modules specified as defining the XHTML-IM Integration Set(see preceding section) some elements and attributes are inappropriate or unnecessary forthe purpose of instant messaging although such elements and attributes MAY be included inaccordance with the XHTML-IM Integration Set further recommended restrictions regardingwhich elements and attributes to include in XHTML content are specified below

71 Structure Module ProfileThe intent of the protocol defined herein is to support lightweight text markup of XMPPmessage bodies only Therefore the ltheadgt lthtmlgt and lttitlegt elements SHOULD NOTbe generated by a compliant implementation and SHOULD be ignored if received (wherethe meaning of rdquoignorerdquo is defined by the conformance requirements of Modularization ofXHTML as summarized in the User Agent Conformance section of this document) Howeverthe ltbodygt element is REQUIRED since it is the root element for all XHTML-IM content

8

7 RECOMMENDED PROFILE

72 Text Module ProfileNot all of the Text Module elements are appropriate in the context of instant messagingsince the XHTML content that one views is generated by onersquos conversation partner in whatis often a rapid-fire conversation thread Only the following elements are RECOMMENDED inXHTML-IM

bull ltblockquotegt

bull ltbrgt

bull ltcitegt

bull ltemgt

bull ltpgt

bull ltspangt

bull ltstronggt

The other Text Module elements MAY be generated by a compliant implementation butMAY be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

73 Hypertext Module ProfileThe only recommended attributes of the ltagt element are specified in the RecommendedAttributes section of this document

74 List Module ProfileBecause it is unlikely that an instant messaging user would generate a definition list onlyordered and unordered lists are RECOMMENDED Definition lists SHOULD NOT be generatedby a compliant implementation and SHOULD be ignored if received (where the meaningof rdquoignorerdquo is defined by the conformance requirements of Modularization of XHTML assummarized in the User Agent Conformance section of this document)

75 Image Module ProfileThe only recommended attributes of the ltimggt element are specified in the RecommendedAttributes section of this document

9

7 RECOMMENDED PROFILE

The XHTML specification allows a rdquodatardquo URL RFC 2397 22 as the value of the rsquosrcrsquo attributeThis is NOT RECOMMENDED for use in XHTML-IM because it can significantly increase thesize of the message stanza and XMPP is not optimized for large stanzas If the image datais small (less than 8 kilobytes) clients MAY use Bits of Binary (XEP-0231) 23 in coordinationwith XHTML-IM if the image data is large the value of the rsquosrcrsquo SHOULD be a pointer to anexternally available file for the image (or the sender SHOULD use a dedicated file transfermethod such as In-Band Bytestreams (XEP-0047) 24 or SOCKS5 Bytestreams (XEP-0065) 25)For security reasons or because of display constraints a compliant client MAY choose todisplay rsquoaltrsquo text only not the image itself (for details see the Malicious Objects section of thisdocument)

76 Style Attribute Module ProfileThis module MUST be supported in XHTML-IM if possible although clients written for certainplatforms (eg console clients mobile phones and handheld computers) or for certain classesof users (eg text-to-speech clients) might not be able to support all of the recommendedstyles directly they SHOULD attempt to emulate or translate the defined style propertiesinto text or other presentation styles that are appropriate for the platform or user base inquestionA full list of recommended style properties is provided below

761 Recommended Style Properties

CSS1 defines 42 rdquoatomicrdquo style properties (which are categorized into font color andbackground text box and classification properties) as well as 11 rdquoshorthandrdquo properties(rdquofontrdquo rdquobackgroundrdquo rdquomarginrdquo rdquopaddingrdquo rdquoborder-widthrdquo rdquoborder-toprdquo rdquoborder-rightrdquordquoborder-bottomrdquo rdquoborder-leftrdquo rdquoborderrdquo and rdquolist-stylerdquo) Many of these properties are notappropriate for use in text-based instant messaging for one or more of the following reasons

1 The property applies to or depends on the inclusion of images other than those han-dled by the XHTML ImageModule (eg the rdquobackground-imagerdquo rdquobackground-repeatrdquordquobackground-attachmentrdquo rdquobackground-positionrdquo and rdquolist-style-imagerdquo properties)

2 The property is intended for advanced document layout (eg the rdquoline-heightrdquo propertyand most of the box properties with the exception of rdquomargin-leftrdquo which is useful forindenting text and rdquomargin-rightrdquo which can be useful when dealing with images)

3 The property is unnecessary since it can be emulated via user input or recommendedXHTML stuctural elements (eg the rdquotext-transformrdquo property can be emulated by the

22RFC 2397 The data URL scheme lthttptoolsietforghtmlrfc2397gt23XEP-0231 Bits of Binary lthttpsxmpporgextensionsxep-0231htmlgt24XEP-0047 In-Band Bytestreams lthttpsxmpporgextensionsxep-0047htmlgt25XEP-0065 SOCKS5 Bytestreams lthttpsxmpporgextensionsxep-0065htmlgt

10

7 RECOMMENDED PROFILE

userrsquos keystrokes or use of the caps lock key)

4 The property is otherwise unlikely to ever be used in the context of rapid-fire con-versations (eg the rdquofont-variantrdquo rdquoword-spacingrdquo rdquoletter-spacingrdquo and rdquolist-style-positionrdquo properties)

5 The property is a shorthand property but some of the properties it includes are not ap-propriate for instant messaging applications according to the foregoing considerations(in fact this applies to all of the shorthand properties)

Unfortunately CSS1 does not include mechanisms for defining profiles thereof (as doesXHTML 10 in the form of XHTML Modularization) While there exist reduced sets of CSS2these introduce more complexity than is desirable in the context of XHTML-IM Therefore wesimply provide a list of recommended CSS1 style propertiesXHTML-IM stipulates that only the following style properties are RECOMMENDED

bull background-color

bull color

bull font-family

bull font-size

bull font-style

bull font-weight

bull margin-left

bull margin-right

bull text-align

bull text-decoration

Although a compliant implementationMAY generate or process other style properties definedin CSS1 such behavior is NOT RECOMMENDED by this document

77 Recommended Attributes771 Common Attributes

Section 51 of Modularization of XHTML describes several rdquocommonrdquo attribute collections ardquoCorerdquo collection (rsquoclassrsquo rsquoidrsquo rsquotitlersquo) an rdquoI18Nrdquo collection (rsquoxmllangrsquo not shown below sinceit is implied in XML) an rdquoEventsrdquo collection (not included in the XHTML-IM Integration Setbecause the Intrinsic Events Module is not selected) and a rdquoStylerdquo collection (rsquostylersquo) The

11

7 RECOMMENDED PROFILE

following table summarizes the recommended profile of these common attributes within theXHTML 10 content itself

Attribute Usage Reasonclass NOT RECOMMENDED External stylesheets (which rsquoclassrsquo

would typically reference) are notrecommended

id NOT RECOMMENDED Internal links and message fragmentsare not recommended in IM contentnor are external stylesheets (whichalso make use of the rsquoidrsquo attribute)

title NOT RECOMMENDED Granting of titles to elements in IMcontent seems unnecessary

style REQUIRED The rsquostylersquo attribute is required since itis the vehicle for presentational styles

xmllang NOT RECOMMENDED Differentiation of language identifica-tion SHOULD occur at the level of theltbodygt element only

772 Specialized Attributes

Beyond the rdquocommonrdquo attributes certain elements within the modules selected for theXHTML-IM Integration Set are allowed to possess other attributes such as eight attributes forthe ltagt element and five attributes for the ltimggt element The recommended profile forsuch attributes is provided in the following table

Element Scope Attribute Usageltagt accesskey NOT RECOMMENDEDltagt charset NOT RECOMMENDEDltagt href REQUIREDltagt hreflang NOT RECOMMENDEDltagt rel NOT RECOMMENDEDltagt rev NOT RECOMMENDEDltagt tabindex NOT RECOMMENDEDltagt type RECOMMENDEDltcitegt uri NOT RECOMMENDEDltimggt alt REQUIREDltimggt height RECOMMENDEDltimggt longdesc NOT RECOMMENDEDltimggt src REQUIREDltimggt width RECOMMENDED

12

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 10: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

6 XHTML-IM INTEGRATION SET

context of JabberXMPP instant messaging specifically some modules have been supersededby well-defined XMPP extensions This document specifies that XHTML-IM shall be based onthe following XHTML 10 modules

bull Core Modulesndash Structure Modulendash Text Modulendash Hypertext Modulendash List Module

bull Image Module

bull Style Attribute Module

Modularization of XHTML defines many additional modules such as Table Modules FormModules ObjectModules and FrameModules None of thesemodules is part of the XHTML-IMIntegration Set If support for such modules is desired it MUST be defined in a separate anddistinct integration set

61 Structure Module DefinitionThe Structure Module is defined as including the following elements and attributes 21

Element Attributesltbodygt class id title styleltheadgt profilelthtmlgt versionlttitlegt

62 Text Module DefinitionThe Text Module is defined as including the following elements and attributes

Element Attributesltabbrgt class id title style

21The rsquostylersquo attribute is specified herein where appropriate because the Style Attribute Module is included inthe definition of the XHTML-IM Integration Set whereas the event-related attributes (eg rsquoonclickrsquo) are notspecified because the Implicit Events Module is not included

6

6 XHTML-IM INTEGRATION SET

Element Attributesltacronymgt class id title styleltaddressgt class id title styleltblockquotegt class id title style citeltbrgt class id title styleltcitegt class id title styleltcodegt class id title styleltdfngt class id title styleltdivgt class id title styleltemgt class id title stylelth1gt class id title stylelth2gt class id title stylelth3gt class id title stylelth4gt class id title stylelth5gt class id title stylelth6gt class id title styleltkbdgt class id title styleltpgt class id title styleltpregt class id title styleltqgt class id title style citeltsampgt class id title styleltspangt class id title styleltstronggt class id title styleltvargt class id title style

63 Hypertext Module DefinitionThe Hypertext Module is defined as including the ltagt element only

Element Attributesltagt class id title style accesskey charset href hreflang rel rev tabindex type

64 List Module DefinitionThe List Module is defined as including the following elements and attributes

Element Attributesltdlgt class id title style

7

7 RECOMMENDED PROFILE

Element Attributesltdtgt class id title styleltddgt class id title styleltolgt class id title styleltulgt class id title styleltligt class id title style

65 Image Module DefinitionThe Image Module is defined as including the ltimggt element only

Element Attributesltimggt class id title style alt height longdesc src width

66 Style Attribute Module DefinitionThe Style Attribute Module is defined as including the style attribute only as included in thepreceding definition tables

7 Recommended ProfileEven within the restricted set of modules specified as defining the XHTML-IM Integration Set(see preceding section) some elements and attributes are inappropriate or unnecessary forthe purpose of instant messaging although such elements and attributes MAY be included inaccordance with the XHTML-IM Integration Set further recommended restrictions regardingwhich elements and attributes to include in XHTML content are specified below

71 Structure Module ProfileThe intent of the protocol defined herein is to support lightweight text markup of XMPPmessage bodies only Therefore the ltheadgt lthtmlgt and lttitlegt elements SHOULD NOTbe generated by a compliant implementation and SHOULD be ignored if received (wherethe meaning of rdquoignorerdquo is defined by the conformance requirements of Modularization ofXHTML as summarized in the User Agent Conformance section of this document) Howeverthe ltbodygt element is REQUIRED since it is the root element for all XHTML-IM content

8

7 RECOMMENDED PROFILE

72 Text Module ProfileNot all of the Text Module elements are appropriate in the context of instant messagingsince the XHTML content that one views is generated by onersquos conversation partner in whatis often a rapid-fire conversation thread Only the following elements are RECOMMENDED inXHTML-IM

bull ltblockquotegt

bull ltbrgt

bull ltcitegt

bull ltemgt

bull ltpgt

bull ltspangt

bull ltstronggt

The other Text Module elements MAY be generated by a compliant implementation butMAY be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

73 Hypertext Module ProfileThe only recommended attributes of the ltagt element are specified in the RecommendedAttributes section of this document

74 List Module ProfileBecause it is unlikely that an instant messaging user would generate a definition list onlyordered and unordered lists are RECOMMENDED Definition lists SHOULD NOT be generatedby a compliant implementation and SHOULD be ignored if received (where the meaningof rdquoignorerdquo is defined by the conformance requirements of Modularization of XHTML assummarized in the User Agent Conformance section of this document)

75 Image Module ProfileThe only recommended attributes of the ltimggt element are specified in the RecommendedAttributes section of this document

9

7 RECOMMENDED PROFILE

The XHTML specification allows a rdquodatardquo URL RFC 2397 22 as the value of the rsquosrcrsquo attributeThis is NOT RECOMMENDED for use in XHTML-IM because it can significantly increase thesize of the message stanza and XMPP is not optimized for large stanzas If the image datais small (less than 8 kilobytes) clients MAY use Bits of Binary (XEP-0231) 23 in coordinationwith XHTML-IM if the image data is large the value of the rsquosrcrsquo SHOULD be a pointer to anexternally available file for the image (or the sender SHOULD use a dedicated file transfermethod such as In-Band Bytestreams (XEP-0047) 24 or SOCKS5 Bytestreams (XEP-0065) 25)For security reasons or because of display constraints a compliant client MAY choose todisplay rsquoaltrsquo text only not the image itself (for details see the Malicious Objects section of thisdocument)

76 Style Attribute Module ProfileThis module MUST be supported in XHTML-IM if possible although clients written for certainplatforms (eg console clients mobile phones and handheld computers) or for certain classesof users (eg text-to-speech clients) might not be able to support all of the recommendedstyles directly they SHOULD attempt to emulate or translate the defined style propertiesinto text or other presentation styles that are appropriate for the platform or user base inquestionA full list of recommended style properties is provided below

761 Recommended Style Properties

CSS1 defines 42 rdquoatomicrdquo style properties (which are categorized into font color andbackground text box and classification properties) as well as 11 rdquoshorthandrdquo properties(rdquofontrdquo rdquobackgroundrdquo rdquomarginrdquo rdquopaddingrdquo rdquoborder-widthrdquo rdquoborder-toprdquo rdquoborder-rightrdquordquoborder-bottomrdquo rdquoborder-leftrdquo rdquoborderrdquo and rdquolist-stylerdquo) Many of these properties are notappropriate for use in text-based instant messaging for one or more of the following reasons

1 The property applies to or depends on the inclusion of images other than those han-dled by the XHTML ImageModule (eg the rdquobackground-imagerdquo rdquobackground-repeatrdquordquobackground-attachmentrdquo rdquobackground-positionrdquo and rdquolist-style-imagerdquo properties)

2 The property is intended for advanced document layout (eg the rdquoline-heightrdquo propertyand most of the box properties with the exception of rdquomargin-leftrdquo which is useful forindenting text and rdquomargin-rightrdquo which can be useful when dealing with images)

3 The property is unnecessary since it can be emulated via user input or recommendedXHTML stuctural elements (eg the rdquotext-transformrdquo property can be emulated by the

22RFC 2397 The data URL scheme lthttptoolsietforghtmlrfc2397gt23XEP-0231 Bits of Binary lthttpsxmpporgextensionsxep-0231htmlgt24XEP-0047 In-Band Bytestreams lthttpsxmpporgextensionsxep-0047htmlgt25XEP-0065 SOCKS5 Bytestreams lthttpsxmpporgextensionsxep-0065htmlgt

10

7 RECOMMENDED PROFILE

userrsquos keystrokes or use of the caps lock key)

4 The property is otherwise unlikely to ever be used in the context of rapid-fire con-versations (eg the rdquofont-variantrdquo rdquoword-spacingrdquo rdquoletter-spacingrdquo and rdquolist-style-positionrdquo properties)

5 The property is a shorthand property but some of the properties it includes are not ap-propriate for instant messaging applications according to the foregoing considerations(in fact this applies to all of the shorthand properties)

Unfortunately CSS1 does not include mechanisms for defining profiles thereof (as doesXHTML 10 in the form of XHTML Modularization) While there exist reduced sets of CSS2these introduce more complexity than is desirable in the context of XHTML-IM Therefore wesimply provide a list of recommended CSS1 style propertiesXHTML-IM stipulates that only the following style properties are RECOMMENDED

bull background-color

bull color

bull font-family

bull font-size

bull font-style

bull font-weight

bull margin-left

bull margin-right

bull text-align

bull text-decoration

Although a compliant implementationMAY generate or process other style properties definedin CSS1 such behavior is NOT RECOMMENDED by this document

77 Recommended Attributes771 Common Attributes

Section 51 of Modularization of XHTML describes several rdquocommonrdquo attribute collections ardquoCorerdquo collection (rsquoclassrsquo rsquoidrsquo rsquotitlersquo) an rdquoI18Nrdquo collection (rsquoxmllangrsquo not shown below sinceit is implied in XML) an rdquoEventsrdquo collection (not included in the XHTML-IM Integration Setbecause the Intrinsic Events Module is not selected) and a rdquoStylerdquo collection (rsquostylersquo) The

11

7 RECOMMENDED PROFILE

following table summarizes the recommended profile of these common attributes within theXHTML 10 content itself

Attribute Usage Reasonclass NOT RECOMMENDED External stylesheets (which rsquoclassrsquo

would typically reference) are notrecommended

id NOT RECOMMENDED Internal links and message fragmentsare not recommended in IM contentnor are external stylesheets (whichalso make use of the rsquoidrsquo attribute)

title NOT RECOMMENDED Granting of titles to elements in IMcontent seems unnecessary

style REQUIRED The rsquostylersquo attribute is required since itis the vehicle for presentational styles

xmllang NOT RECOMMENDED Differentiation of language identifica-tion SHOULD occur at the level of theltbodygt element only

772 Specialized Attributes

Beyond the rdquocommonrdquo attributes certain elements within the modules selected for theXHTML-IM Integration Set are allowed to possess other attributes such as eight attributes forthe ltagt element and five attributes for the ltimggt element The recommended profile forsuch attributes is provided in the following table

Element Scope Attribute Usageltagt accesskey NOT RECOMMENDEDltagt charset NOT RECOMMENDEDltagt href REQUIREDltagt hreflang NOT RECOMMENDEDltagt rel NOT RECOMMENDEDltagt rev NOT RECOMMENDEDltagt tabindex NOT RECOMMENDEDltagt type RECOMMENDEDltcitegt uri NOT RECOMMENDEDltimggt alt REQUIREDltimggt height RECOMMENDEDltimggt longdesc NOT RECOMMENDEDltimggt src REQUIREDltimggt width RECOMMENDED

12

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 11: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

6 XHTML-IM INTEGRATION SET

Element Attributesltacronymgt class id title styleltaddressgt class id title styleltblockquotegt class id title style citeltbrgt class id title styleltcitegt class id title styleltcodegt class id title styleltdfngt class id title styleltdivgt class id title styleltemgt class id title stylelth1gt class id title stylelth2gt class id title stylelth3gt class id title stylelth4gt class id title stylelth5gt class id title stylelth6gt class id title styleltkbdgt class id title styleltpgt class id title styleltpregt class id title styleltqgt class id title style citeltsampgt class id title styleltspangt class id title styleltstronggt class id title styleltvargt class id title style

63 Hypertext Module DefinitionThe Hypertext Module is defined as including the ltagt element only

Element Attributesltagt class id title style accesskey charset href hreflang rel rev tabindex type

64 List Module DefinitionThe List Module is defined as including the following elements and attributes

Element Attributesltdlgt class id title style

7

7 RECOMMENDED PROFILE

Element Attributesltdtgt class id title styleltddgt class id title styleltolgt class id title styleltulgt class id title styleltligt class id title style

65 Image Module DefinitionThe Image Module is defined as including the ltimggt element only

Element Attributesltimggt class id title style alt height longdesc src width

66 Style Attribute Module DefinitionThe Style Attribute Module is defined as including the style attribute only as included in thepreceding definition tables

7 Recommended ProfileEven within the restricted set of modules specified as defining the XHTML-IM Integration Set(see preceding section) some elements and attributes are inappropriate or unnecessary forthe purpose of instant messaging although such elements and attributes MAY be included inaccordance with the XHTML-IM Integration Set further recommended restrictions regardingwhich elements and attributes to include in XHTML content are specified below

71 Structure Module ProfileThe intent of the protocol defined herein is to support lightweight text markup of XMPPmessage bodies only Therefore the ltheadgt lthtmlgt and lttitlegt elements SHOULD NOTbe generated by a compliant implementation and SHOULD be ignored if received (wherethe meaning of rdquoignorerdquo is defined by the conformance requirements of Modularization ofXHTML as summarized in the User Agent Conformance section of this document) Howeverthe ltbodygt element is REQUIRED since it is the root element for all XHTML-IM content

8

7 RECOMMENDED PROFILE

72 Text Module ProfileNot all of the Text Module elements are appropriate in the context of instant messagingsince the XHTML content that one views is generated by onersquos conversation partner in whatis often a rapid-fire conversation thread Only the following elements are RECOMMENDED inXHTML-IM

bull ltblockquotegt

bull ltbrgt

bull ltcitegt

bull ltemgt

bull ltpgt

bull ltspangt

bull ltstronggt

The other Text Module elements MAY be generated by a compliant implementation butMAY be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

73 Hypertext Module ProfileThe only recommended attributes of the ltagt element are specified in the RecommendedAttributes section of this document

74 List Module ProfileBecause it is unlikely that an instant messaging user would generate a definition list onlyordered and unordered lists are RECOMMENDED Definition lists SHOULD NOT be generatedby a compliant implementation and SHOULD be ignored if received (where the meaningof rdquoignorerdquo is defined by the conformance requirements of Modularization of XHTML assummarized in the User Agent Conformance section of this document)

75 Image Module ProfileThe only recommended attributes of the ltimggt element are specified in the RecommendedAttributes section of this document

9

7 RECOMMENDED PROFILE

The XHTML specification allows a rdquodatardquo URL RFC 2397 22 as the value of the rsquosrcrsquo attributeThis is NOT RECOMMENDED for use in XHTML-IM because it can significantly increase thesize of the message stanza and XMPP is not optimized for large stanzas If the image datais small (less than 8 kilobytes) clients MAY use Bits of Binary (XEP-0231) 23 in coordinationwith XHTML-IM if the image data is large the value of the rsquosrcrsquo SHOULD be a pointer to anexternally available file for the image (or the sender SHOULD use a dedicated file transfermethod such as In-Band Bytestreams (XEP-0047) 24 or SOCKS5 Bytestreams (XEP-0065) 25)For security reasons or because of display constraints a compliant client MAY choose todisplay rsquoaltrsquo text only not the image itself (for details see the Malicious Objects section of thisdocument)

76 Style Attribute Module ProfileThis module MUST be supported in XHTML-IM if possible although clients written for certainplatforms (eg console clients mobile phones and handheld computers) or for certain classesof users (eg text-to-speech clients) might not be able to support all of the recommendedstyles directly they SHOULD attempt to emulate or translate the defined style propertiesinto text or other presentation styles that are appropriate for the platform or user base inquestionA full list of recommended style properties is provided below

761 Recommended Style Properties

CSS1 defines 42 rdquoatomicrdquo style properties (which are categorized into font color andbackground text box and classification properties) as well as 11 rdquoshorthandrdquo properties(rdquofontrdquo rdquobackgroundrdquo rdquomarginrdquo rdquopaddingrdquo rdquoborder-widthrdquo rdquoborder-toprdquo rdquoborder-rightrdquordquoborder-bottomrdquo rdquoborder-leftrdquo rdquoborderrdquo and rdquolist-stylerdquo) Many of these properties are notappropriate for use in text-based instant messaging for one or more of the following reasons

1 The property applies to or depends on the inclusion of images other than those han-dled by the XHTML ImageModule (eg the rdquobackground-imagerdquo rdquobackground-repeatrdquordquobackground-attachmentrdquo rdquobackground-positionrdquo and rdquolist-style-imagerdquo properties)

2 The property is intended for advanced document layout (eg the rdquoline-heightrdquo propertyand most of the box properties with the exception of rdquomargin-leftrdquo which is useful forindenting text and rdquomargin-rightrdquo which can be useful when dealing with images)

3 The property is unnecessary since it can be emulated via user input or recommendedXHTML stuctural elements (eg the rdquotext-transformrdquo property can be emulated by the

22RFC 2397 The data URL scheme lthttptoolsietforghtmlrfc2397gt23XEP-0231 Bits of Binary lthttpsxmpporgextensionsxep-0231htmlgt24XEP-0047 In-Band Bytestreams lthttpsxmpporgextensionsxep-0047htmlgt25XEP-0065 SOCKS5 Bytestreams lthttpsxmpporgextensionsxep-0065htmlgt

10

7 RECOMMENDED PROFILE

userrsquos keystrokes or use of the caps lock key)

4 The property is otherwise unlikely to ever be used in the context of rapid-fire con-versations (eg the rdquofont-variantrdquo rdquoword-spacingrdquo rdquoletter-spacingrdquo and rdquolist-style-positionrdquo properties)

5 The property is a shorthand property but some of the properties it includes are not ap-propriate for instant messaging applications according to the foregoing considerations(in fact this applies to all of the shorthand properties)

Unfortunately CSS1 does not include mechanisms for defining profiles thereof (as doesXHTML 10 in the form of XHTML Modularization) While there exist reduced sets of CSS2these introduce more complexity than is desirable in the context of XHTML-IM Therefore wesimply provide a list of recommended CSS1 style propertiesXHTML-IM stipulates that only the following style properties are RECOMMENDED

bull background-color

bull color

bull font-family

bull font-size

bull font-style

bull font-weight

bull margin-left

bull margin-right

bull text-align

bull text-decoration

Although a compliant implementationMAY generate or process other style properties definedin CSS1 such behavior is NOT RECOMMENDED by this document

77 Recommended Attributes771 Common Attributes

Section 51 of Modularization of XHTML describes several rdquocommonrdquo attribute collections ardquoCorerdquo collection (rsquoclassrsquo rsquoidrsquo rsquotitlersquo) an rdquoI18Nrdquo collection (rsquoxmllangrsquo not shown below sinceit is implied in XML) an rdquoEventsrdquo collection (not included in the XHTML-IM Integration Setbecause the Intrinsic Events Module is not selected) and a rdquoStylerdquo collection (rsquostylersquo) The

11

7 RECOMMENDED PROFILE

following table summarizes the recommended profile of these common attributes within theXHTML 10 content itself

Attribute Usage Reasonclass NOT RECOMMENDED External stylesheets (which rsquoclassrsquo

would typically reference) are notrecommended

id NOT RECOMMENDED Internal links and message fragmentsare not recommended in IM contentnor are external stylesheets (whichalso make use of the rsquoidrsquo attribute)

title NOT RECOMMENDED Granting of titles to elements in IMcontent seems unnecessary

style REQUIRED The rsquostylersquo attribute is required since itis the vehicle for presentational styles

xmllang NOT RECOMMENDED Differentiation of language identifica-tion SHOULD occur at the level of theltbodygt element only

772 Specialized Attributes

Beyond the rdquocommonrdquo attributes certain elements within the modules selected for theXHTML-IM Integration Set are allowed to possess other attributes such as eight attributes forthe ltagt element and five attributes for the ltimggt element The recommended profile forsuch attributes is provided in the following table

Element Scope Attribute Usageltagt accesskey NOT RECOMMENDEDltagt charset NOT RECOMMENDEDltagt href REQUIREDltagt hreflang NOT RECOMMENDEDltagt rel NOT RECOMMENDEDltagt rev NOT RECOMMENDEDltagt tabindex NOT RECOMMENDEDltagt type RECOMMENDEDltcitegt uri NOT RECOMMENDEDltimggt alt REQUIREDltimggt height RECOMMENDEDltimggt longdesc NOT RECOMMENDEDltimggt src REQUIREDltimggt width RECOMMENDED

12

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 12: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

7 RECOMMENDED PROFILE

Element Attributesltdtgt class id title styleltddgt class id title styleltolgt class id title styleltulgt class id title styleltligt class id title style

65 Image Module DefinitionThe Image Module is defined as including the ltimggt element only

Element Attributesltimggt class id title style alt height longdesc src width

66 Style Attribute Module DefinitionThe Style Attribute Module is defined as including the style attribute only as included in thepreceding definition tables

7 Recommended ProfileEven within the restricted set of modules specified as defining the XHTML-IM Integration Set(see preceding section) some elements and attributes are inappropriate or unnecessary forthe purpose of instant messaging although such elements and attributes MAY be included inaccordance with the XHTML-IM Integration Set further recommended restrictions regardingwhich elements and attributes to include in XHTML content are specified below

71 Structure Module ProfileThe intent of the protocol defined herein is to support lightweight text markup of XMPPmessage bodies only Therefore the ltheadgt lthtmlgt and lttitlegt elements SHOULD NOTbe generated by a compliant implementation and SHOULD be ignored if received (wherethe meaning of rdquoignorerdquo is defined by the conformance requirements of Modularization ofXHTML as summarized in the User Agent Conformance section of this document) Howeverthe ltbodygt element is REQUIRED since it is the root element for all XHTML-IM content

8

7 RECOMMENDED PROFILE

72 Text Module ProfileNot all of the Text Module elements are appropriate in the context of instant messagingsince the XHTML content that one views is generated by onersquos conversation partner in whatis often a rapid-fire conversation thread Only the following elements are RECOMMENDED inXHTML-IM

bull ltblockquotegt

bull ltbrgt

bull ltcitegt

bull ltemgt

bull ltpgt

bull ltspangt

bull ltstronggt

The other Text Module elements MAY be generated by a compliant implementation butMAY be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

73 Hypertext Module ProfileThe only recommended attributes of the ltagt element are specified in the RecommendedAttributes section of this document

74 List Module ProfileBecause it is unlikely that an instant messaging user would generate a definition list onlyordered and unordered lists are RECOMMENDED Definition lists SHOULD NOT be generatedby a compliant implementation and SHOULD be ignored if received (where the meaningof rdquoignorerdquo is defined by the conformance requirements of Modularization of XHTML assummarized in the User Agent Conformance section of this document)

75 Image Module ProfileThe only recommended attributes of the ltimggt element are specified in the RecommendedAttributes section of this document

9

7 RECOMMENDED PROFILE

The XHTML specification allows a rdquodatardquo URL RFC 2397 22 as the value of the rsquosrcrsquo attributeThis is NOT RECOMMENDED for use in XHTML-IM because it can significantly increase thesize of the message stanza and XMPP is not optimized for large stanzas If the image datais small (less than 8 kilobytes) clients MAY use Bits of Binary (XEP-0231) 23 in coordinationwith XHTML-IM if the image data is large the value of the rsquosrcrsquo SHOULD be a pointer to anexternally available file for the image (or the sender SHOULD use a dedicated file transfermethod such as In-Band Bytestreams (XEP-0047) 24 or SOCKS5 Bytestreams (XEP-0065) 25)For security reasons or because of display constraints a compliant client MAY choose todisplay rsquoaltrsquo text only not the image itself (for details see the Malicious Objects section of thisdocument)

76 Style Attribute Module ProfileThis module MUST be supported in XHTML-IM if possible although clients written for certainplatforms (eg console clients mobile phones and handheld computers) or for certain classesof users (eg text-to-speech clients) might not be able to support all of the recommendedstyles directly they SHOULD attempt to emulate or translate the defined style propertiesinto text or other presentation styles that are appropriate for the platform or user base inquestionA full list of recommended style properties is provided below

761 Recommended Style Properties

CSS1 defines 42 rdquoatomicrdquo style properties (which are categorized into font color andbackground text box and classification properties) as well as 11 rdquoshorthandrdquo properties(rdquofontrdquo rdquobackgroundrdquo rdquomarginrdquo rdquopaddingrdquo rdquoborder-widthrdquo rdquoborder-toprdquo rdquoborder-rightrdquordquoborder-bottomrdquo rdquoborder-leftrdquo rdquoborderrdquo and rdquolist-stylerdquo) Many of these properties are notappropriate for use in text-based instant messaging for one or more of the following reasons

1 The property applies to or depends on the inclusion of images other than those han-dled by the XHTML ImageModule (eg the rdquobackground-imagerdquo rdquobackground-repeatrdquordquobackground-attachmentrdquo rdquobackground-positionrdquo and rdquolist-style-imagerdquo properties)

2 The property is intended for advanced document layout (eg the rdquoline-heightrdquo propertyand most of the box properties with the exception of rdquomargin-leftrdquo which is useful forindenting text and rdquomargin-rightrdquo which can be useful when dealing with images)

3 The property is unnecessary since it can be emulated via user input or recommendedXHTML stuctural elements (eg the rdquotext-transformrdquo property can be emulated by the

22RFC 2397 The data URL scheme lthttptoolsietforghtmlrfc2397gt23XEP-0231 Bits of Binary lthttpsxmpporgextensionsxep-0231htmlgt24XEP-0047 In-Band Bytestreams lthttpsxmpporgextensionsxep-0047htmlgt25XEP-0065 SOCKS5 Bytestreams lthttpsxmpporgextensionsxep-0065htmlgt

10

7 RECOMMENDED PROFILE

userrsquos keystrokes or use of the caps lock key)

4 The property is otherwise unlikely to ever be used in the context of rapid-fire con-versations (eg the rdquofont-variantrdquo rdquoword-spacingrdquo rdquoletter-spacingrdquo and rdquolist-style-positionrdquo properties)

5 The property is a shorthand property but some of the properties it includes are not ap-propriate for instant messaging applications according to the foregoing considerations(in fact this applies to all of the shorthand properties)

Unfortunately CSS1 does not include mechanisms for defining profiles thereof (as doesXHTML 10 in the form of XHTML Modularization) While there exist reduced sets of CSS2these introduce more complexity than is desirable in the context of XHTML-IM Therefore wesimply provide a list of recommended CSS1 style propertiesXHTML-IM stipulates that only the following style properties are RECOMMENDED

bull background-color

bull color

bull font-family

bull font-size

bull font-style

bull font-weight

bull margin-left

bull margin-right

bull text-align

bull text-decoration

Although a compliant implementationMAY generate or process other style properties definedin CSS1 such behavior is NOT RECOMMENDED by this document

77 Recommended Attributes771 Common Attributes

Section 51 of Modularization of XHTML describes several rdquocommonrdquo attribute collections ardquoCorerdquo collection (rsquoclassrsquo rsquoidrsquo rsquotitlersquo) an rdquoI18Nrdquo collection (rsquoxmllangrsquo not shown below sinceit is implied in XML) an rdquoEventsrdquo collection (not included in the XHTML-IM Integration Setbecause the Intrinsic Events Module is not selected) and a rdquoStylerdquo collection (rsquostylersquo) The

11

7 RECOMMENDED PROFILE

following table summarizes the recommended profile of these common attributes within theXHTML 10 content itself

Attribute Usage Reasonclass NOT RECOMMENDED External stylesheets (which rsquoclassrsquo

would typically reference) are notrecommended

id NOT RECOMMENDED Internal links and message fragmentsare not recommended in IM contentnor are external stylesheets (whichalso make use of the rsquoidrsquo attribute)

title NOT RECOMMENDED Granting of titles to elements in IMcontent seems unnecessary

style REQUIRED The rsquostylersquo attribute is required since itis the vehicle for presentational styles

xmllang NOT RECOMMENDED Differentiation of language identifica-tion SHOULD occur at the level of theltbodygt element only

772 Specialized Attributes

Beyond the rdquocommonrdquo attributes certain elements within the modules selected for theXHTML-IM Integration Set are allowed to possess other attributes such as eight attributes forthe ltagt element and five attributes for the ltimggt element The recommended profile forsuch attributes is provided in the following table

Element Scope Attribute Usageltagt accesskey NOT RECOMMENDEDltagt charset NOT RECOMMENDEDltagt href REQUIREDltagt hreflang NOT RECOMMENDEDltagt rel NOT RECOMMENDEDltagt rev NOT RECOMMENDEDltagt tabindex NOT RECOMMENDEDltagt type RECOMMENDEDltcitegt uri NOT RECOMMENDEDltimggt alt REQUIREDltimggt height RECOMMENDEDltimggt longdesc NOT RECOMMENDEDltimggt src REQUIREDltimggt width RECOMMENDED

12

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 13: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

7 RECOMMENDED PROFILE

72 Text Module ProfileNot all of the Text Module elements are appropriate in the context of instant messagingsince the XHTML content that one views is generated by onersquos conversation partner in whatis often a rapid-fire conversation thread Only the following elements are RECOMMENDED inXHTML-IM

bull ltblockquotegt

bull ltbrgt

bull ltcitegt

bull ltemgt

bull ltpgt

bull ltspangt

bull ltstronggt

The other Text Module elements MAY be generated by a compliant implementation butMAY be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

73 Hypertext Module ProfileThe only recommended attributes of the ltagt element are specified in the RecommendedAttributes section of this document

74 List Module ProfileBecause it is unlikely that an instant messaging user would generate a definition list onlyordered and unordered lists are RECOMMENDED Definition lists SHOULD NOT be generatedby a compliant implementation and SHOULD be ignored if received (where the meaningof rdquoignorerdquo is defined by the conformance requirements of Modularization of XHTML assummarized in the User Agent Conformance section of this document)

75 Image Module ProfileThe only recommended attributes of the ltimggt element are specified in the RecommendedAttributes section of this document

9

7 RECOMMENDED PROFILE

The XHTML specification allows a rdquodatardquo URL RFC 2397 22 as the value of the rsquosrcrsquo attributeThis is NOT RECOMMENDED for use in XHTML-IM because it can significantly increase thesize of the message stanza and XMPP is not optimized for large stanzas If the image datais small (less than 8 kilobytes) clients MAY use Bits of Binary (XEP-0231) 23 in coordinationwith XHTML-IM if the image data is large the value of the rsquosrcrsquo SHOULD be a pointer to anexternally available file for the image (or the sender SHOULD use a dedicated file transfermethod such as In-Band Bytestreams (XEP-0047) 24 or SOCKS5 Bytestreams (XEP-0065) 25)For security reasons or because of display constraints a compliant client MAY choose todisplay rsquoaltrsquo text only not the image itself (for details see the Malicious Objects section of thisdocument)

76 Style Attribute Module ProfileThis module MUST be supported in XHTML-IM if possible although clients written for certainplatforms (eg console clients mobile phones and handheld computers) or for certain classesof users (eg text-to-speech clients) might not be able to support all of the recommendedstyles directly they SHOULD attempt to emulate or translate the defined style propertiesinto text or other presentation styles that are appropriate for the platform or user base inquestionA full list of recommended style properties is provided below

761 Recommended Style Properties

CSS1 defines 42 rdquoatomicrdquo style properties (which are categorized into font color andbackground text box and classification properties) as well as 11 rdquoshorthandrdquo properties(rdquofontrdquo rdquobackgroundrdquo rdquomarginrdquo rdquopaddingrdquo rdquoborder-widthrdquo rdquoborder-toprdquo rdquoborder-rightrdquordquoborder-bottomrdquo rdquoborder-leftrdquo rdquoborderrdquo and rdquolist-stylerdquo) Many of these properties are notappropriate for use in text-based instant messaging for one or more of the following reasons

1 The property applies to or depends on the inclusion of images other than those han-dled by the XHTML ImageModule (eg the rdquobackground-imagerdquo rdquobackground-repeatrdquordquobackground-attachmentrdquo rdquobackground-positionrdquo and rdquolist-style-imagerdquo properties)

2 The property is intended for advanced document layout (eg the rdquoline-heightrdquo propertyand most of the box properties with the exception of rdquomargin-leftrdquo which is useful forindenting text and rdquomargin-rightrdquo which can be useful when dealing with images)

3 The property is unnecessary since it can be emulated via user input or recommendedXHTML stuctural elements (eg the rdquotext-transformrdquo property can be emulated by the

22RFC 2397 The data URL scheme lthttptoolsietforghtmlrfc2397gt23XEP-0231 Bits of Binary lthttpsxmpporgextensionsxep-0231htmlgt24XEP-0047 In-Band Bytestreams lthttpsxmpporgextensionsxep-0047htmlgt25XEP-0065 SOCKS5 Bytestreams lthttpsxmpporgextensionsxep-0065htmlgt

10

7 RECOMMENDED PROFILE

userrsquos keystrokes or use of the caps lock key)

4 The property is otherwise unlikely to ever be used in the context of rapid-fire con-versations (eg the rdquofont-variantrdquo rdquoword-spacingrdquo rdquoletter-spacingrdquo and rdquolist-style-positionrdquo properties)

5 The property is a shorthand property but some of the properties it includes are not ap-propriate for instant messaging applications according to the foregoing considerations(in fact this applies to all of the shorthand properties)

Unfortunately CSS1 does not include mechanisms for defining profiles thereof (as doesXHTML 10 in the form of XHTML Modularization) While there exist reduced sets of CSS2these introduce more complexity than is desirable in the context of XHTML-IM Therefore wesimply provide a list of recommended CSS1 style propertiesXHTML-IM stipulates that only the following style properties are RECOMMENDED

bull background-color

bull color

bull font-family

bull font-size

bull font-style

bull font-weight

bull margin-left

bull margin-right

bull text-align

bull text-decoration

Although a compliant implementationMAY generate or process other style properties definedin CSS1 such behavior is NOT RECOMMENDED by this document

77 Recommended Attributes771 Common Attributes

Section 51 of Modularization of XHTML describes several rdquocommonrdquo attribute collections ardquoCorerdquo collection (rsquoclassrsquo rsquoidrsquo rsquotitlersquo) an rdquoI18Nrdquo collection (rsquoxmllangrsquo not shown below sinceit is implied in XML) an rdquoEventsrdquo collection (not included in the XHTML-IM Integration Setbecause the Intrinsic Events Module is not selected) and a rdquoStylerdquo collection (rsquostylersquo) The

11

7 RECOMMENDED PROFILE

following table summarizes the recommended profile of these common attributes within theXHTML 10 content itself

Attribute Usage Reasonclass NOT RECOMMENDED External stylesheets (which rsquoclassrsquo

would typically reference) are notrecommended

id NOT RECOMMENDED Internal links and message fragmentsare not recommended in IM contentnor are external stylesheets (whichalso make use of the rsquoidrsquo attribute)

title NOT RECOMMENDED Granting of titles to elements in IMcontent seems unnecessary

style REQUIRED The rsquostylersquo attribute is required since itis the vehicle for presentational styles

xmllang NOT RECOMMENDED Differentiation of language identifica-tion SHOULD occur at the level of theltbodygt element only

772 Specialized Attributes

Beyond the rdquocommonrdquo attributes certain elements within the modules selected for theXHTML-IM Integration Set are allowed to possess other attributes such as eight attributes forthe ltagt element and five attributes for the ltimggt element The recommended profile forsuch attributes is provided in the following table

Element Scope Attribute Usageltagt accesskey NOT RECOMMENDEDltagt charset NOT RECOMMENDEDltagt href REQUIREDltagt hreflang NOT RECOMMENDEDltagt rel NOT RECOMMENDEDltagt rev NOT RECOMMENDEDltagt tabindex NOT RECOMMENDEDltagt type RECOMMENDEDltcitegt uri NOT RECOMMENDEDltimggt alt REQUIREDltimggt height RECOMMENDEDltimggt longdesc NOT RECOMMENDEDltimggt src REQUIREDltimggt width RECOMMENDED

12

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 14: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

7 RECOMMENDED PROFILE

The XHTML specification allows a rdquodatardquo URL RFC 2397 22 as the value of the rsquosrcrsquo attributeThis is NOT RECOMMENDED for use in XHTML-IM because it can significantly increase thesize of the message stanza and XMPP is not optimized for large stanzas If the image datais small (less than 8 kilobytes) clients MAY use Bits of Binary (XEP-0231) 23 in coordinationwith XHTML-IM if the image data is large the value of the rsquosrcrsquo SHOULD be a pointer to anexternally available file for the image (or the sender SHOULD use a dedicated file transfermethod such as In-Band Bytestreams (XEP-0047) 24 or SOCKS5 Bytestreams (XEP-0065) 25)For security reasons or because of display constraints a compliant client MAY choose todisplay rsquoaltrsquo text only not the image itself (for details see the Malicious Objects section of thisdocument)

76 Style Attribute Module ProfileThis module MUST be supported in XHTML-IM if possible although clients written for certainplatforms (eg console clients mobile phones and handheld computers) or for certain classesof users (eg text-to-speech clients) might not be able to support all of the recommendedstyles directly they SHOULD attempt to emulate or translate the defined style propertiesinto text or other presentation styles that are appropriate for the platform or user base inquestionA full list of recommended style properties is provided below

761 Recommended Style Properties

CSS1 defines 42 rdquoatomicrdquo style properties (which are categorized into font color andbackground text box and classification properties) as well as 11 rdquoshorthandrdquo properties(rdquofontrdquo rdquobackgroundrdquo rdquomarginrdquo rdquopaddingrdquo rdquoborder-widthrdquo rdquoborder-toprdquo rdquoborder-rightrdquordquoborder-bottomrdquo rdquoborder-leftrdquo rdquoborderrdquo and rdquolist-stylerdquo) Many of these properties are notappropriate for use in text-based instant messaging for one or more of the following reasons

1 The property applies to or depends on the inclusion of images other than those han-dled by the XHTML ImageModule (eg the rdquobackground-imagerdquo rdquobackground-repeatrdquordquobackground-attachmentrdquo rdquobackground-positionrdquo and rdquolist-style-imagerdquo properties)

2 The property is intended for advanced document layout (eg the rdquoline-heightrdquo propertyand most of the box properties with the exception of rdquomargin-leftrdquo which is useful forindenting text and rdquomargin-rightrdquo which can be useful when dealing with images)

3 The property is unnecessary since it can be emulated via user input or recommendedXHTML stuctural elements (eg the rdquotext-transformrdquo property can be emulated by the

22RFC 2397 The data URL scheme lthttptoolsietforghtmlrfc2397gt23XEP-0231 Bits of Binary lthttpsxmpporgextensionsxep-0231htmlgt24XEP-0047 In-Band Bytestreams lthttpsxmpporgextensionsxep-0047htmlgt25XEP-0065 SOCKS5 Bytestreams lthttpsxmpporgextensionsxep-0065htmlgt

10

7 RECOMMENDED PROFILE

userrsquos keystrokes or use of the caps lock key)

4 The property is otherwise unlikely to ever be used in the context of rapid-fire con-versations (eg the rdquofont-variantrdquo rdquoword-spacingrdquo rdquoletter-spacingrdquo and rdquolist-style-positionrdquo properties)

5 The property is a shorthand property but some of the properties it includes are not ap-propriate for instant messaging applications according to the foregoing considerations(in fact this applies to all of the shorthand properties)

Unfortunately CSS1 does not include mechanisms for defining profiles thereof (as doesXHTML 10 in the form of XHTML Modularization) While there exist reduced sets of CSS2these introduce more complexity than is desirable in the context of XHTML-IM Therefore wesimply provide a list of recommended CSS1 style propertiesXHTML-IM stipulates that only the following style properties are RECOMMENDED

bull background-color

bull color

bull font-family

bull font-size

bull font-style

bull font-weight

bull margin-left

bull margin-right

bull text-align

bull text-decoration

Although a compliant implementationMAY generate or process other style properties definedin CSS1 such behavior is NOT RECOMMENDED by this document

77 Recommended Attributes771 Common Attributes

Section 51 of Modularization of XHTML describes several rdquocommonrdquo attribute collections ardquoCorerdquo collection (rsquoclassrsquo rsquoidrsquo rsquotitlersquo) an rdquoI18Nrdquo collection (rsquoxmllangrsquo not shown below sinceit is implied in XML) an rdquoEventsrdquo collection (not included in the XHTML-IM Integration Setbecause the Intrinsic Events Module is not selected) and a rdquoStylerdquo collection (rsquostylersquo) The

11

7 RECOMMENDED PROFILE

following table summarizes the recommended profile of these common attributes within theXHTML 10 content itself

Attribute Usage Reasonclass NOT RECOMMENDED External stylesheets (which rsquoclassrsquo

would typically reference) are notrecommended

id NOT RECOMMENDED Internal links and message fragmentsare not recommended in IM contentnor are external stylesheets (whichalso make use of the rsquoidrsquo attribute)

title NOT RECOMMENDED Granting of titles to elements in IMcontent seems unnecessary

style REQUIRED The rsquostylersquo attribute is required since itis the vehicle for presentational styles

xmllang NOT RECOMMENDED Differentiation of language identifica-tion SHOULD occur at the level of theltbodygt element only

772 Specialized Attributes

Beyond the rdquocommonrdquo attributes certain elements within the modules selected for theXHTML-IM Integration Set are allowed to possess other attributes such as eight attributes forthe ltagt element and five attributes for the ltimggt element The recommended profile forsuch attributes is provided in the following table

Element Scope Attribute Usageltagt accesskey NOT RECOMMENDEDltagt charset NOT RECOMMENDEDltagt href REQUIREDltagt hreflang NOT RECOMMENDEDltagt rel NOT RECOMMENDEDltagt rev NOT RECOMMENDEDltagt tabindex NOT RECOMMENDEDltagt type RECOMMENDEDltcitegt uri NOT RECOMMENDEDltimggt alt REQUIREDltimggt height RECOMMENDEDltimggt longdesc NOT RECOMMENDEDltimggt src REQUIREDltimggt width RECOMMENDED

12

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 15: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

7 RECOMMENDED PROFILE

userrsquos keystrokes or use of the caps lock key)

4 The property is otherwise unlikely to ever be used in the context of rapid-fire con-versations (eg the rdquofont-variantrdquo rdquoword-spacingrdquo rdquoletter-spacingrdquo and rdquolist-style-positionrdquo properties)

5 The property is a shorthand property but some of the properties it includes are not ap-propriate for instant messaging applications according to the foregoing considerations(in fact this applies to all of the shorthand properties)

Unfortunately CSS1 does not include mechanisms for defining profiles thereof (as doesXHTML 10 in the form of XHTML Modularization) While there exist reduced sets of CSS2these introduce more complexity than is desirable in the context of XHTML-IM Therefore wesimply provide a list of recommended CSS1 style propertiesXHTML-IM stipulates that only the following style properties are RECOMMENDED

bull background-color

bull color

bull font-family

bull font-size

bull font-style

bull font-weight

bull margin-left

bull margin-right

bull text-align

bull text-decoration

Although a compliant implementationMAY generate or process other style properties definedin CSS1 such behavior is NOT RECOMMENDED by this document

77 Recommended Attributes771 Common Attributes

Section 51 of Modularization of XHTML describes several rdquocommonrdquo attribute collections ardquoCorerdquo collection (rsquoclassrsquo rsquoidrsquo rsquotitlersquo) an rdquoI18Nrdquo collection (rsquoxmllangrsquo not shown below sinceit is implied in XML) an rdquoEventsrdquo collection (not included in the XHTML-IM Integration Setbecause the Intrinsic Events Module is not selected) and a rdquoStylerdquo collection (rsquostylersquo) The

11

7 RECOMMENDED PROFILE

following table summarizes the recommended profile of these common attributes within theXHTML 10 content itself

Attribute Usage Reasonclass NOT RECOMMENDED External stylesheets (which rsquoclassrsquo

would typically reference) are notrecommended

id NOT RECOMMENDED Internal links and message fragmentsare not recommended in IM contentnor are external stylesheets (whichalso make use of the rsquoidrsquo attribute)

title NOT RECOMMENDED Granting of titles to elements in IMcontent seems unnecessary

style REQUIRED The rsquostylersquo attribute is required since itis the vehicle for presentational styles

xmllang NOT RECOMMENDED Differentiation of language identifica-tion SHOULD occur at the level of theltbodygt element only

772 Specialized Attributes

Beyond the rdquocommonrdquo attributes certain elements within the modules selected for theXHTML-IM Integration Set are allowed to possess other attributes such as eight attributes forthe ltagt element and five attributes for the ltimggt element The recommended profile forsuch attributes is provided in the following table

Element Scope Attribute Usageltagt accesskey NOT RECOMMENDEDltagt charset NOT RECOMMENDEDltagt href REQUIREDltagt hreflang NOT RECOMMENDEDltagt rel NOT RECOMMENDEDltagt rev NOT RECOMMENDEDltagt tabindex NOT RECOMMENDEDltagt type RECOMMENDEDltcitegt uri NOT RECOMMENDEDltimggt alt REQUIREDltimggt height RECOMMENDEDltimggt longdesc NOT RECOMMENDEDltimggt src REQUIREDltimggt width RECOMMENDED

12

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 16: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

7 RECOMMENDED PROFILE

following table summarizes the recommended profile of these common attributes within theXHTML 10 content itself

Attribute Usage Reasonclass NOT RECOMMENDED External stylesheets (which rsquoclassrsquo

would typically reference) are notrecommended

id NOT RECOMMENDED Internal links and message fragmentsare not recommended in IM contentnor are external stylesheets (whichalso make use of the rsquoidrsquo attribute)

title NOT RECOMMENDED Granting of titles to elements in IMcontent seems unnecessary

style REQUIRED The rsquostylersquo attribute is required since itis the vehicle for presentational styles

xmllang NOT RECOMMENDED Differentiation of language identifica-tion SHOULD occur at the level of theltbodygt element only

772 Specialized Attributes

Beyond the rdquocommonrdquo attributes certain elements within the modules selected for theXHTML-IM Integration Set are allowed to possess other attributes such as eight attributes forthe ltagt element and five attributes for the ltimggt element The recommended profile forsuch attributes is provided in the following table

Element Scope Attribute Usageltagt accesskey NOT RECOMMENDEDltagt charset NOT RECOMMENDEDltagt href REQUIREDltagt hreflang NOT RECOMMENDEDltagt rel NOT RECOMMENDEDltagt rev NOT RECOMMENDEDltagt tabindex NOT RECOMMENDEDltagt type RECOMMENDEDltcitegt uri NOT RECOMMENDEDltimggt alt REQUIREDltimggt height RECOMMENDEDltimggt longdesc NOT RECOMMENDEDltimggt src REQUIREDltimggt width RECOMMENDED

12

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 17: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

7 RECOMMENDED PROFILE

773 Other Attributes

Other XHTML 10 attributes SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

78 Summary of RecommendationsThe following table summarizes the elements and attributes that are recommended withinthe XHTML-IM Integration Set

Element Attributesltagt href style typeltblockquotegt styleltbodygt style xmllang When contained within the lthtml

xmlns=rsquohttpjabberorgprotocolxhtml-imrsquogt element a ltbodygtelement is qualified by the rsquohttpwwww3org1999xhtmlrsquo namespacenaturally this is a namespace declaration rather than an attribute per seand therefore is not mentioned in the attribute enumeration

ltbrgt -none-ltcitegt styleltemgt -none-ltimggt alt height src style widthltligt styleltolgt styleltpgt styleltspangt styleltstronggt -none-ltulgt style

Any other elements and attributes defined in the XHTML 10 modules that are included in theXHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation andSHOULD be ignored if received (where the meaning of rdquoignorerdquo is defined by the conformancerequirements of Modularization of XHTML as summarized in the User Agent Conformancesection of this document)

13

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 18: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

8 BUSINESS RULES

8 Business RulesThe following rules apply to the generation and processing of XHTML content by Jabberclients or other XMPP entities

1 XHTML-IM content is designed to provide a formatted version of the XML characterdata provided in the ltbodygt of an XMPP ltmessagegt stanza if such content is includedin an XMPP message the lthtmlgt element MUST be a direct child of the ltmessagegtstanza and the XHTML-IM content MUST be understood as a formatted version of themessage body XHTML-IM content MAY be included within XMPP ltiqgt stanzas (orchildren thereof) but any such usage is undefined In order to preserve bandwidthXHTML-IM content SHOULD NOT be included within XMPP ltpresencegt stanzas how-ever if it is so included the lthtmlgt element MUST be a direct child of the ltpresencegtstanza and the XHTML-IM content MUST be understood as a formatted version of theXML character data provided in the ltstatusgt element

2 The sending client MUST ensure that if XHTML content is sent its meaning is the sameas that of the plaintext version and that the two versions differ only in markup ratherthan meaning

3 XHTML-IM is a reduced set of XHTML 10 and thus also of XML 10 Therefore all openingtags MUST be completed by inclusion of an appropriate closing tag

4 XMPP Core specifies that an XMPP ltmessagegt MAY contain more than one ltbodygtchild as long as each ltbodygt possesses an rsquoxmllangrsquo attribute with a distinct value Inorder to ensure correct internationalization if an XMPP ltmessagegt stanza containsmore than one ltbodygt child and is also sent as XHTML-IM the lthtmlgt elementSHOULD also contain more than one ltbodygt child with one such element for eachltbodygt child of the ltmessagegt stanza (distinguished by an appropriate rsquoxmllangrsquoattribute)

5 Section 111 of XMPP Core stipulates that character entities other than the five generalentities defined in Section 46 of the XML specification (ie amplt ampgt ampamp ampaposand ampquot) MUST NOT be sent over an XML stream Therefore implementations ofXHTML-IMMUST NOT include predefined XHTML 10 entities such as ampnbsp -- insteadimplementations MUST use the equivalent character references as specified in Section41 of the XML specification (even in non-obvious places such as URIs that are includedin the rsquohrefrsquo attribute)

6 For elements and attributes qualified by the rsquohttpwwww3org1999xhtmlrsquo names-pace user agent conformance is guided by the requirements defined in Modularization

14

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 19: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

9 EXAMPLES

of XHTML for details refer to the User Agent Conformance section of this document

7 Where strictly presentational style are desired (eg colored text) it might be necessaryto use rsquostylersquo attributes (eg ltspan style=rsquofont-color greenrsquogtthis is greenltspangt)However where possible it is instead RECOMMENDED to use appropriate structuralelements (eg ltstronggt and ltblockquotegt instead of say style=rsquofont-weight boldrsquo orstyle=rsquomargin-left 5rsquo)

8 Nesting of block structural elements (ltpgt) and list elements (ltdlgt ltolgt ltulgt) is NOTRECOMMENDED except within ltdivgt elements

9 It is RECOMMENDED for implementations to replace line breaks with the ltbrgt elementand to replace significant whitepace with the appropriate number of non-breakingspaces (via the NO-BREAK SPACE character or its equivalent) where rdquosignificantwhitespacerdquo means whitespace that makes some material difference (eg one or morespaces at the beginning of a line or more than one space anywhere else within a line)not rdquonormalrdquo whitespace separating words or punctuation

10 When rendering XHTML-IM content a receiving user agent MUST NOT render asXHTML any text that was not structured by the sending user agent using XHTML ele-ments and attributes if the sender wishes text to be structured (eg for certain wordsto be emphasized or for URIs to be linked) the sending user agent MUST represent thetext using the appropriate XHTML elements and attributes

9 ExamplesThe following examples provide an insight into the inclusion of XHTML content in XMPPltmessagegt stanzas but are by no means exhaustive or definitive(Note The examples might not render correctly in all web browsers since not all webbrowsers comply fully with the XHTML 10 and CSS1 standards Markup in the examples mayinclude line breaks for readability Example renderings are shown with a colored backgroundto set them off from the rest of the text)

Listing 2 Emphasis font colors strengthltmessage gt

ltbodygtWow Iampaposm green with envyltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltp style=rsquofont -sizelarge rsquogt

15

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 20: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

9 EXAMPLES

ltemgtWowltemgt Iampaposm ltspan style=rsquocolorgreen rsquogtgreen ltspangtwith ltstrong gtenvyltstrong gt

ltpgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsWow Irsquom green with envy

Listing 3 Blockquote citeltmessage gt

ltbodygtAs Emerson said in his essay Self -Reliance

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtAs Emerson said in his essay ltcitegtSelf -Reliance ltcitegtltpgtltblockquote gt

ampquotA foolish consistency is the hobgoblin of little mindsampquot

ltblockquote gtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsAs Emerson said in his essay Self-ReliancerdquoA foolish consistency is the hobgoblin of little mindsrdquo

Listing 4 An image and a hyperlinkltmessage gt

ltbodygtHey are you licensed to Jabber

http wwwxmpporgimagespsa -licensejpgltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHey are you licensed to lta href=rsquohttp wwwjabberorgrsquogt

Jabber ltagtltpgtltpgtltimg src=rsquohttp wwwxmpporgimagespsa -licensejpgrsquo

alt=rsquoALicensetoJabber rsquoheight=rsquo261rsquowidth=rsquo537rsquogtltpgt

ltbodygtlthtmlgt

16

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 21: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

9 EXAMPLES

ltmessage gt

This could be rendered as followsHey are you licensed to Jabber

Note the large size of the image Including the rsquoheightrsquo and rsquowidthrsquo attributes is thereforequite friendly since it gives the receiving application hints as to whether the image is toolarge to fit into the current interface (naturally these are hints only and cannot necessarilybe relied upon in determining the size of the image)Rendering the rsquoaltrsquo value rather than the image would yield something like the followingHey are you licensed to JabberIMG rdquoA License to Jabberrdquo

Listing 5 Two listsltmessage gt

ltbodygtHereampaposs my plan for today1 Add the following examples to XEP -0071

- ordered and unordered lists- more styles (eg indentation)

2 Kick back and relaxltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtHereampaposs my plan for today ltpgtltolgt

ltligtAdd the following examples to XEP -0071ltulgt

ltligtordered and unordered lists ltligtltligtmore styles (eg indentation)ltligt

ltulgtltligtltligtKick back and relax ltligt

ltolgtltbodygt

lthtmlgtltmessage gt

This could be rendered as followsHerersquos my plan for today

1 Add the following examples to XEP-0071bull ordered and unordered lists

17

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 22: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

9 EXAMPLES

bull more styles (eg indentation)

2 Kick back and relax

Listing 6 Quoted textltmessage gt

ltbodygtYou wrote

I think we have consensus on the following

1 Remove ampltdivampgt2 Nesting is not recommended3 Donrsquotpreservewhitespace

Yes no maybe

Thatseemsfinetome ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtYouwrote ltpgtltblockquote gtltpgtIthinkwehaveconsensusonthefollowing ltpgtltol gtltli gtRemoveampltdivampgtltli gtltli gtNestingisnotrecommended ltli gtltli gtDonrsquot preserve whitespace ltligt

ltolgtltpgtYes no maybeltpgt

ltblockquote gtltpgtThat seems fine to meltpgt

ltbodygtlthtmlgt

ltmessage gt

Although quoting receivedmessages is relatively uncommon in IM it does happen This couldbe rendered as followsYou wroteI think we have consensus on the following

1 Remove ltdivgt

2 Nesting is not recommended

3 Donrsquot preserve whitespace

18

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 23: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

9 EXAMPLES

Yes no maybeThat seems fine to me

Listing 7 Multiple bodiesltmessage gt

ltbody xmllang=rsquoen -USrsquogtawesomeltbodygtltbody xmllang=rsquode -DErsquogtausgezeichnetltbodygtlthtml xmlns=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltbody xmllang=rsquoen -USrsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogtltpgtltstrong gtawesomeltstrong gtltpgt

ltbodygtltbody xmllang=rsquode -DErsquo xmlns=rsquohttp wwww3org 1999 xhtml rsquogt

ltpgtltstrong gtausgezeichnetltstrong gtltpgtltbodygt

lthtmlgtltmessage gt

How multiple bodies would best be rendered will depend on the user agent and relevant ap-plication For example a specialized Jabber client that is used in foreign language instructionmight show two languages side by side whereas a dedicated IM client might show contentonly in a human userrsquos preferred language as captured in the client configuration

Listing 8 Unrecognized Elements and Attributesltmessage gt

ltbodygtThe XHTML user agent conformance requirements say to ignoreelements and attributes you donrsquotunderstand towit

4Ifauseragentencountersanelementitdoesnotrecognize itmustcontinuetoprocessthechildrenofthatelementIfthecontentistext thetextmustbepresentedtotheuser

5Ifauseragentencountersanattributeitdoesnotrecognize itmustignoretheentireattributespecification(ietheattributeanditsvalue) ltbody gtlthtmlxmlns=rsquohttp jabberorgprotocolxhtml -imrsquogtltbodyxmlns=rsquohttpwwww3org 1999 xhtml rsquogtltpgtTheltacronym gtXHTML ltacronym gtuseragentconformancerequirementssaytoignoreelementsandattributesyoudonrsquot understand to witltpgt

ltol type=rsquo1rsquo start=rsquo4rsquogtltligtltpgt

If a user agent encounters an element it doesnot recognize it must continue to process thechildren of that element If the content is text

19

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 24: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

10 DISCOVERING SUPPORT FOR XHTML-IM

the text must be presented to the userltpgtltligtltligtltpgt

If a user agent encounters an attribute it doesnot recognize it must ignore the entire attributespecification (ie the attribute and its value)

ltpgtltligtltolgt

ltbodygtlthtmlgt

ltmessage gt

Let us assume that the recipientrsquos user agent recognizes neither the ltacronymgt element(which is discouraged in XHTML-IM) nor the rsquotypersquo and rsquostartrsquo attributes of the ltolgt element(which after all were deprecated in HTML 40) and that it does not render nested elements(eg the ltpgt elements within the ltligt elements) in this case it could render the content asfollows (note that the element value is shown as text and the attribute value is not rendered)The XHTML user agent conformance requirements say to ignore elements and attributes youdonrsquot understand to wit

1 If a user agent encounters an element it does not recognize it must continue to processthe children of that element If the content is text the text must be presented to theuser

2 If a user agent encounters an attribute it does not recognize it must ignore the entireattribute specification (ie the attribute and its value)

10 Discovering Support for XHTML-IMThis section describes methods for discovering whether a Jabber client or other XMPP entitysupports the protocol defined herein

101 Explicit DiscoveryThe primary means of discovering support for XHTML-IM is Service Discovery (XEP-0030) 26

(or the dynamic profile of service discovery defined in Entity Capabilities (XEP-0115) 27)

Listing 9 Seeking to Discover XHTML-IM Supportltiq type=rsquogetrsquo

from=rsquojulietshakespearelitbalcony rsquoto=rsquoromeoshakespearelitorchard rsquo

26XEP-0030 Service Discovery lthttpsxmpporgextensionsxep-0030htmlgt27XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

20

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 25: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

11 SECURITY CONSIDERATIONS

id=rsquodisco1 rsquogtltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogt

ltiqgt

If the queried entity supports XHTML-IM it MUST return a ltfeaturegt element with a rsquovarrsquoattribute set to a value of rdquohttpjabberorgprotocolxhtml-imrdquo in the IQ result

Listing 10 Contact Returns Disco Info Resultsltiq type=rsquoresult rsquo

from=rsquoromeoshakespearelitorchard rsquoto=rsquojulietshakespearelitbalcony rsquoid=rsquodisco1 rsquogt

ltquery xmlns=rsquohttp jabberorgprotocoldiscoinforsquogtltfeature var=rsquohttp jabberorgprotocolxhtml -imrsquogt

ltquery gtltiqgt

Naturally support can also be discovered via the dynamic presence-based profile of servicediscovery defined in Entity Capabilities (XEP-0115) 28

102 Implicit DiscoveryA Jabber userrsquos client MAY send XML ltmessagegt stanzas containing XHTML-IM extensionswithout first discovering if the conversation partnerrsquos client supports XHTML-IM If the userrsquosclient sends a message that includes XHTML-IMmarkup and the conversation partnerrsquos clientreplies to that message but does not include XHTML-IM markup the userrsquos client SHOULDNOT continue sending XHTML-IM markup

11 Security Considerations111 Malicious ObjectsWhile scripts applets binary objects and other potentially executable code is excluded fromthe profiles used in XHTML-IM malicious entities still may inject those and thus exploitentities which rely on this exclusion Entities thus MUST assume that inbound XHTML-IMmay be mailicious and MUST sanitize it according to the profile used by ignoring elementsand removing attributes as neededTo further reduce the risk of such exposure an implementation MAY choose to

28XEP-0115 Entity Capabilities lthttpsxmpporgextensionsxep-0115htmlgt

21

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 26: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

12 W3C CONSIDERATIONS

bull Not make hyperlinks clickable

bull Not fetch or present images but instead show only the rsquoaltrsquo text

In addition an implementation MUST make it possible for a user to prevent the automaticfetching and presentation of images (rather than leave it up to the implementation)

112 PhishingTo reduce the risk of phishing attacks 29 an implementation MAY choose to

bull Display the value of the XHTML rsquohrefrsquo attribute instead of the XML character data of theltagt element

bull Display the value of XHTML rsquohrefrsquo attribute in addition to the XML character data of theltagt element if the two values do not match

113 Presence LeaksThe network availability of the receiver may be revealed if the receiverrsquos client automaticallyloads images or the receiver clicks a link included in a message Therefore an implementationMAY choose to

bull Not fetch images offered by senders that are not authorized to view the receiverrsquos pres-ence

bull Warn the receiver before allowing the user to visit a URI provided by the sender

12 W3C ConsiderationsThe usage of XHTML 10 defined herein meets the requirements for XHTML 10 IntegrationSet document type conformance as defined in Section 3 (rdquoConformance Definitionrdquo) ofModularization of XHTML

121 Document Type NameThe Formal Public Identifier (FPI) for the XHTML-IM document type definition is

-JSFDTD Instant Messaging with XHTMLEN

29Phishing has been defined by the Financial Services Technology Consortium Counter-Phishing Initiative as rdquoabroadly launched social engineering attack in which an electronic identity is misrepresented in an attempt totrick individuals into revealing personal credentials that can be used fraudulently against themrdquo

22

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 27: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

12 W3C CONSIDERATIONS

The fields of this FPI are as follows

1 The leading field is rdquo-rdquo which indicates that this is a privately-defined resource

2 The second field is rdquoJSFrdquo (an abbreviation for Jabber Software Foundation the formername for the XMPP Standards Foundation (XSF) 30) which identifies the organizationthat maintains the named item

3 The third field contains two constructsa) The public text class is rdquoDTDrdquo which adheres to ISO 8879 Clause 10221b) The public text description is rdquoInstant Messaging with XHTMLrdquo which contains

but does not begin with the string rdquoXHTMLrdquo (as recommended for an XHTML 10Integration Set)

4 The fourth field is rdquoENrdquo which identifies the language (English) in which the item isdefined

122 User Agent ConformanceA user agent that implements this specificationMUST conform to Section 35 (rdquoXHTML FamilyUser Agent Conformancerdquo) of Modularization of XHTML Many of the requirements definedtherein are already met by Jabber clients simply because they already include XML parsersHowever rdquoignorerdquo has a special meaning in XHTML modularization (different from its mean-ing in XMPP) Specifically criteria 4 through 6 of Section 35 ofModularization of XHTML state

1 W3C TEXT If a user agent encounters an element it does not recognize it must continueto process the children of that element If the content is text the textmust be presentedto the userXSF COMMENT This behavior is different from that defined by XMPP Core and inthe context of XHTML-IM implementations applies only to XML elements qualifiedby the rsquohttpwwww3org1999xhtmlrsquo namespace as defined herein This crite-rion MUST be applied to all XHTML 10 elements except those explicitly includedin XHTML-IM as described in the XHTML-IM Integration Set and RecommendedProfile sections of this document Therefore an XHTML-IM implementation MUSTprocess all XHTML 10 child elements of the XHTML-IM lthtmlgt element even if suchchild elements are not included in the XHTML 10 Integration Set defined herein andMUST present to the recipient the XML character data contained in such child elements

30The XMPP Standards Foundation (XSF) is an independent non-profit membership organization that developsopen extensions to the IETFrsquos Extensible Messaging and Presence Protocol (XMPP) For further informationsee lthttpsxmpporgaboutxmpp-standards-foundationgt

23

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 28: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

12 W3C CONSIDERATIONS

2 W3C TEXT If a user agent encounters an attribute it does not recognize it must ignorethe entire attribute specification (ie the attribute and its value)XSF COMMENT This criterion MUST be applied to all XHTML 10 attributes ex-cept those explicitly included in XHTML-IM as described in the XHTML-IM Inte-gration Set and Recommended Profile sections of this document Therefore anXHTML-IM implementation MUST ignore all attributes of elements qualified by thersquohttpwwww3org1999xhtmlrsquo namespace if such attributes are not explicitly in-cluded in the XHTML 10 Integration Set defined herein

3 W3C TEXT If a user agent encounters an attribute value it doesnrsquot recognize it must usethe default attribute valueXSF COMMENT Since not one of the attributes included in XHTML-IM has a default valuedefined for it in XHTML 10 in practice this criterion does not apply to XHTML-IMimplementations

123 XHTML ModularizationFor information regarding XHTML modularization in XML schema for the XHTML 10 In-tegration Set defined in this specification refer to the Schema Driver section of this document

124 W3C ReviewThe XHTML 10 Integration Set defined herein has been reviewed informally by an editorof the XHTML Modularization in XML Schema specification but has not undergone formalreview by the W3C Before the XHTML-IM specification proceeds to a status of Final withinthe standards process of the XMPP Standards Foundation the XMPP Council 31 is encouragedto pursue a formal review through communication with the Hypertext Coordination Groupwithin the W3C

125 XHTML VersioningThe W3C is actively working on XHTML 20 32 and may produce additional versions of XHTMLin the future This specification addresses XHTML 10 only but it may be superseded orsupplemented in the future by a XMPP Extension Protocol specification that defines methodsfor encapsulating XHTML 20 content in XMPP

31The XMPP Council is a technical steering committee authorized by the XSF Board of Directors and elected byXSF members that approves of new XMPP Extensions Protocols and oversees the XSFrsquos standards process Forfurther information see lthttpsxmpporgaboutxmpp-standards-foundationcouncilgt

32XHTML 20 lthttpwwww3orgTRxhtml2gt

24

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 29: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

15 XML SCHEMAS

13 IANA ConsiderationsThis document requires no interaction with the Internet Assigned Numbers Authority (IANA)33

14 XMPP Registrar Considerations141 Protocol NamespacesThe XMPP Registrar 34 includes rsquohttpjabberorgprotocolxhtml-imrsquo in its registry ofprotocol namespaces

15 XML Schemas151 XHTML-IM WrapperThe following schema defines the XMPP extension element that serves as a wrapper forXHTML content

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquoxmlnsxhtml=rsquohttp wwww3org 1999 xhtml rsquotargetNamespace=rsquohttp jabberorgprotocolxhtml -imrsquoxmlns=rsquohttp jabberorgprotocolxhtml -imrsquoelementFormDefault=rsquoqualified rsquogt

ltxsimport namespace=rsquohttp wwww3org 1999 xhtml rsquoschemaLocation=rsquohttp wwww3org 200208 xhtmlxhtml1 -

strictxsdrsquogt

ltxsannotation gtltxsdocumentation gt

This schema defines the lthtmlgt element qualified bythe rsquohttp jabberorgprotocolxhtml -imrsquo namespaceThe only allowable child is a ltbodygt element qualified

33The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique pa-rameter values for Internet protocols such as port numbers and URI schemes For further information seelthttpwwwianaorggt

34The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used inthe context of XMPP extension protocols approved by the XMPP Standards Foundation For further informa-tion see lthttpsxmpporgregistrargt

25

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 30: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

15 XML SCHEMAS

by the rsquohttp wwww3org 1999 xhtml rsquo namespace Referto the XHTML -IM schema driver for the definition of theXHTML 10 Integration Set

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxselement name=rsquohtmlrsquogtltxscomplexType gt

ltxssequence gtltxselement ref=rsquoxhtmlbody rsquo

minOccurs=rsquo0rsquomaxOccurs=rsquounbounded rsquogt

ltxssequence gtltxscomplexType gt

ltxselement gt

ltxsschema gt

152 XHTML-IM Schema DriverThe following schema defines the modularization schema driver for XHTML-IM

ltxml version=rsquo10rsquo encoding=rsquoUTF -8rsquogt

ltxsschemaxmlnsxs=rsquohttp wwww3org 2001 XMLSchema rsquotargetNamespace=rsquohttp wwww3org 1999 xhtml rsquoxmlns=rsquohttp wwww3org 1999 xhtml rsquoelementFormDefault=rsquoqualified rsquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema driver for XHTML -IM an XHTML 10Integration Set for use in exchanging marked -up instantmessages between entities that conform to the ExtensibleMessaging and Presence Protocol (XMPP) This IntegrationSet includes a subset of the modules defined for XHTML 10but does not redefine any existing modules nor does it

26

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 31: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

15 XML SCHEMAS

define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

The Formal Public Identifier (FPI) for this IntegrationSet is

-JSFDTD Instant Messaging with XHTMLEN

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation which is available at thefollowing URL

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -blkstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -hypertext

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -image -1xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlphras

-1 xsdrdquogtltxsinclude

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstruct-1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -inlstyle

-1 xsdrdquogtltxsinclude

27

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 32: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

15 XML SCHEMAS

schemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -list -1 xsdrdquogt

ltxsincludeschemaLocation=rdquohttp wwww3orgMarkUpSCHEMAxhtml -struct -1

xsdrdquogt

ltxsschema gt

153 XHTML-IM Content ModelThe following schema defines the content model for XHTML-IM

ltxml version=rdquo10rdquo encoding=rdquoUTF -8rdquogtltxsschema xmlnsxs=rdquohttp wwww3org 2001 XMLSchemardquo

targetNamespace=rdquohttp wwww3org 1999 xhtmlrdquoxmlns=rdquohttp wwww3org 1999 xhtmlrdquogt

ltxsannotation gtltxsdocumentation gt

This is the XML Schema module of named XHTML 10 content modelsfor XHTML -IM an XHTML 10 Integration Set for use in exchangingmarked -up instant messages between entities that conform tothe Extensible Messaging and Presence Protocol (XMPP) ThisIntegration Set includes a subset of the modules defined forXHTML 10 but does not redefine any existing modules nordoes it define any new modules Specifically it includes thefollowing modules only

- Structure- Text- Hypertext- List- Image- Style Attribute

Therefore XHTML -IM uses the following content models

Blockmix Block -like elements eg paragraphsFlowmix Any block or inline elementsInlinemix Character -level elementsInlineNoAnchorclass Anchor elementInlinePremix Pre element

XHTML -IM also uses the following Attribute Groups

CoreextraattribI18nextraattrib

28

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 33: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

15 XML SCHEMAS

Commonextra

Full documentation of this Integration Set is contained inrdquoXEP -0071XHTML -IMrdquo a specification published by theXMPP Standards Foundation

http wwwxmpporgextensionsxep -0071 html

ltxsdocumentation gtltxsdocumentation source=rdquohttp wwwxmpporgextensionsxep -0071

htmlrdquogtltxsannotation gt

lt-- BEGIN ATTRIBUTE GROUPS --gt

ltxsattributeGroup name=rdquoCoreextraattribrdquogtltxsattributeGroup name=rdquoI18nextraattribrdquogtltxsattributeGroup name=rdquoCommonextrardquogt

ltxsattributeGroup ref=rdquostyleattribrdquogtltxsattributeGroup gt

lt-- END ATTRIBUTE GROUPS --gt

lt-- BEGIN HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoAnchorclassrdquogtltxssequence gt

ltxselement ref=rdquoardquogtltxssequence gt

ltxsgroup gt

lt-- END HYPERTEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN IMAGE MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoImageclassrdquogtltxschoice gt

ltxselement ref=rdquoimgrdquogtltxschoice gt

ltxsgroup gt

lt-- END IMAGE MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN LIST MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoListclassrdquogtltxschoice gt

ltxselement ref=rdquoulrdquogtltxselement ref=rdquoolrdquogt

29

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 34: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

15 XML SCHEMAS

ltxselement ref=rdquodlrdquogtltxschoice gt

ltxsgroup gt

lt-- END LIST MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN TEXT MODULE rdquoPRIMITIVESrdquo --gt

ltxsgroup name=rdquoBlkPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoaddressrdquogtltxselement ref=rdquoblockquoterdquogtltxselement ref=rdquoprerdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoBlkStructclassrdquogtltxschoice gt

ltxselement ref=rdquodivrdquogtltxselement ref=rdquoprdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoHeadingclassrdquogtltxschoice gt

ltxselement ref=rdquoh1rdquogtltxselement ref=rdquoh2rdquogtltxselement ref=rdquoh3rdquogtltxselement ref=rdquoh4rdquogtltxselement ref=rdquoh5rdquogtltxselement ref=rdquoh6rdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlPhrasclassrdquogtltxschoice gt

ltxselement ref=rdquoabbrrdquogtltxselement ref=rdquoacronymrdquogtltxselement ref=rdquociterdquogtltxselement ref=rdquocoderdquogtltxselement ref=rdquodfnrdquogtltxselement ref=rdquoemrdquogtltxselement ref=rdquokbdrdquogtltxselement ref=rdquoqrdquogtltxselement ref=rdquosamprdquogtltxselement ref=rdquostrongrdquogtltxselement ref=rdquovarrdquogt

ltxschoice gtltxsgroup gt

30

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 35: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

15 XML SCHEMAS

ltxsgroup name=rdquoInlStructclassrdquogtltxschoice gt

ltxselement ref=rdquobrrdquogtltxselement ref=rdquospanrdquogt

ltxschoice gtltxsgroup gt

lt-- END TEXT MODULE rdquoPRIMITIVESrdquo --gt

lt-- BEGIN BLOCK COMBINATIONS --gt

ltxsgroup name=rdquoBlockclassrdquogtltxschoice gt

ltxsgroup ref=rdquoBlkPhrasclassrdquogtltxsgroup ref=rdquoBlkStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END BLOCK COMBINATIONS --gt

lt-- BEGIN INLINE COMBINATIONS --gt

lt-- Any inline content --gtltxsgroup name=rdquoInlineclassrdquogt

ltxschoice gtltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- Inline content contained in a hyperlink --gtltxsgroup name=rdquoInlNoAnchorclassrdquogt

ltxschoice gtltxsgroup ref=rdquoImageclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END INLINE COMBINATIONS --gt

lt-- BEGIN TOP -LEVEL MIXES --gt

ltxsgroup name=rdquoBlockmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogt

31

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 36: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

16 ACKNOWLEDGEMENTS

ltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoFlowmixrdquogtltxschoice gt

ltxsgroup ref=rdquoBlockclassrdquogtltxsgroup ref=rdquoHeadingclassrdquogtltxsgroup ref=rdquoInlineclassrdquogtltxsgroup ref=rdquoListclassrdquogt

ltxschoice gtltxsgroup gt

ltxsgroup name=rdquoInlinemixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlineclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlNoAnchormixrdquogtltxschoice gt

ltxsgroup ref=rdquoInlNoAnchorclassrdquogtltxschoice gt

ltxsgroup gt

ltxsgroup name=rdquoInlinePremixrdquogtltxschoice gt

ltxsgroup ref=rdquoAnchorclassrdquogtltxsgroup ref=rdquoInlPhrasclassrdquogtltxsgroup ref=rdquoInlStructclassrdquogt

ltxschoice gtltxsgroup gt

lt-- END TOP -LEVEL MIXES --gt

ltxsschema gt

16 AcknowledgementsThis specification formalizes and extends earlier work by Jeremie Miller and Julian Mis-sig on XHTML formatting of Jabber messages Many thanks to Shane McCarron for hisassistance regarding XHTML modularization and conformance issues Thanks also to con-tributors on the discussion list of the XSFrsquos Standards SIG 35 for their feedback and suggestions

35The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension ProtocolsThe discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions as

32

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements
Page 37: XEP-0071:XHTML-IM - XMPPContents 1 Introduction 1 2 ChoiceofTechnology 1 3 Requirements 2 4 ConceptsandApproach 3 5 WrapperElement 4 6 XHTML-IMIntegrationSet 5 6.1 StructureModuleDefinition

16 ACKNOWLEDGEMENTS

well as for announcements by the XMPP Extensions Editor and XMPP Registrar To subscribe to the list or viewthe list archives visit lthttpsmailjabberorgmailmanlistinfostandardsgt

33

  • Introduction
  • Choice of Technology
  • Requirements
  • Concepts and Approach
  • Wrapper Element
  • XHTML-IM Integration Set
    • Structure Module Definition
    • Text Module Definition
    • Hypertext Module Definition
    • List Module Definition
    • Image Module Definition
    • Style Attribute Module Definition
      • Recommended Profile
        • Structure Module Profile
        • Text Module Profile
        • Hypertext Module Profile
        • List Module Profile
        • Image Module Profile
        • Style Attribute Module Profile
          • Recommended Style Properties
            • Recommended Attributes
              • Common Attributes
              • Specialized Attributes
              • Other Attributes
                • Summary of Recommendations
                  • Business Rules
                  • Examples
                  • Discovering Support for XHTML-IM
                    • Explicit Discovery
                    • Implicit Discovery
                      • Security Considerations
                        • Malicious Objects
                        • Phishing
                        • Presence Leaks
                          • W3C Considerations
                            • Document Type Name
                            • User Agent Conformance
                            • XHTML Modularization
                            • W3C Review
                            • XHTML Versioning
                              • IANA Considerations
                              • XMPP Registrar Considerations
                                • Protocol Namespaces
                                  • XML Schemas
                                    • XHTML-IM Wrapper
                                    • XHTML-IM Schema Driver
                                    • XHTML-IM Content Model
                                      • Acknowledgements