Scientific Applications of XML

Arvind Hulgeri, Shantanu Godbole

aru@cse.iitb.ernet.in shantanu@it.iitb.ernet.in

MathML

Mathematical Markup Language

MathML Objectives

Encode mathematical material for teaching and scientific communication

Encode both mathematical notation and mathematical meaning

Facilitate conversion to and from other math formats, both presentational and semantic, e.g., TeX, braille

Allow the passing of information intended for specific renderers and applications

Provide for extensibility

Be human legible, and simple for software to generate and process

Presentation and Content Markup

Presentation markup: captures notational structure; facilitates rendering to various media

Content markup: captures mathematical structure; facilitates the assignment of mathematical meaning to an expression

Can be mixed together

Example: a + b

Presentation:

<mrow>

<mi>a</mi>

<mo>+</mo>

<mi>b</mi>

</mrow>

Content:

<apply>

<plus/>

<ci>a</ci>

<ci>b</ci>

</apply>

Example: (a + b)^2

Presentation:

<msup> <mfenced>

<mrow> <mi>a</mi> <mo>+</mo> <mi>b</mi> </mrow>

</mfenced> <mn>2</mn></msup>

Content:

<apply> <power/> <apply> <plus/> <ci>a</ci> <ci>b</ci> </apply> <cn>2</cn> </apply>
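
Because content markup records the operator and its operands, software can compute with it directly. The following is a minimal Python sketch, not a full MathML processor: the `OPERATORS` table and the `evaluate` function are hypothetical names introduced for illustration, and only the elements used in the example above are handled.

```python
import xml.etree.ElementTree as ET

# Content markup for (a + b)^2, copied from the example above.
CONTENT = """
<apply> <power/>
  <apply> <plus/> <ci>a</ci> <ci>b</ci> </apply>
  <cn>2</cn>
</apply>
"""

# Hypothetical operator table covering only <plus/> and <power/>.
OPERATORS = {
    "plus": lambda args: sum(args),
    "power": lambda args: args[0] ** args[1],
}


def evaluate(node, env):
    """Recursively evaluate a content MathML element against variable bindings."""
    if node.tag == "cn":  # numeric constant
        return float(node.text)
    if node.tag == "ci":  # identifier, looked up in env
        return env[node.text.strip()]
    if node.tag == "apply":  # first child is the operator, the rest are operands
        op, *operands = list(node)
        return OPERATORS[op.tag]([evaluate(child, env) for child in operands])
    raise ValueError(f"unsupported element: {node.tag}")


result = evaluate(ET.fromstring(CONTENT.strip()), {"a": 1.0, "b": 2.0})
print(result)
```

The same traversal would not work on the presentation markup, where the superscript 2 is only a position on the page and not an exponent.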

Annotations

Example: \int_0^t \frac{dx}{x}

Presentation:

<mrow> <msubsup> <mo>&int;</mo> <mn>0</mn> <mi>t</mi> </msubsup> <mfrac> <mrow> <mo>&dd;</mo> <mi>x</mi> </mrow> <mi>x</mi> </mfrac> </mrow>

Annotations

Content:

<apply> <int/> <bvar><ci>x</ci></bvar> <lowlimit><cn>0</cn></lowlimit> <uplimit><ci>t</ci></uplimit> <apply> <divide/> <cn>1</cn> <ci>x</ci> </apply> </apply>

Annotations

<semantics>

Content encoding

<annotation-xml encoding="MathML-Presentation">

Presentation encoding

</annotation-xml>

</semantics>
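
To make the skeleton above concrete, here is a small Python sketch that fills it in for the earlier a + b example. The helper name `with_presentation` is an assumption made for illustration; the element names and the `encoding` value are the ones shown on the slide.

```python
import xml.etree.ElementTree as ET


def with_presentation(content_xml, presentation_xml):
    """Wrap content markup in <semantics> and attach presentation markup as an annotation."""
    semantics = ET.Element("semantics")
    # The content encoding comes first, as in the skeleton above.
    semantics.append(ET.fromstring(content_xml))
    # The presentation encoding goes inside <annotation-xml>.
    annotation = ET.SubElement(
        semantics, "annotation-xml", {"encoding": "MathML-Presentation"}
    )
    annotation.append(ET.fromstring(presentation_xml))
    return semantics


content = "<apply><plus/><ci>a</ci><ci>b</ci></apply>"
presentation = "<mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow>"
combined = with_presentation(content, presentation)
print(ET.tostring(combined, encoding="unicode"))
```

A consumer that only understands content markup can read the first child and ignore the annotation, while a renderer can use the annotation.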

Why Two Markups?

Same notation may represent several mathematical ideas

x^i = x to the power i

= i-th element of vector x

Same mathematical idea often has several notations, e.g., nCm
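
The first point can be sketched in Python: one presentation encoding, two candidate content encodings. This is an illustration under assumptions: the `<selector/>` element for picking a vector element comes from the MathML 2.0 content vocabulary and is not shown in these slides, and the helper name `operator` is hypothetical.

```python
import xml.etree.ElementTree as ET

# A single notation: x with a superscript i.
PRESENTATION = "<msup><mi>x</mi><mi>i</mi></msup>"

# Two different meanings that the notation above could carry.
AS_POWER = "<apply><power/><ci>x</ci><ci>i</ci></apply>"
# Assumption: <selector/> as defined in MathML 2.0 content markup.
AS_ELEMENT = "<apply><selector/><ci type='vector'>x</ci><ci>i</ci></apply>"


def operator(content_xml):
    """Return the operator name of an <apply> element (its first child)."""
    return ET.fromstring(content_xml)[0].tag


meanings = sorted({operator(AS_POWER), operator(AS_ELEMENT)})
print(meanings)
```

Nothing in `PRESENTATION` distinguishes the two cases, which is why a separate content encoding is useful.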

CML

Chemical Markup Language

CML – What it does

Universal, platform- and application-independent format for storing and exchanging chemical information

Publishing, querying, communicating chemical information for both humans and machines

Facilitate conversion to and from legacy formats used by popular chemical editing and display programs

CML – The need

Absence of mechanisms in HTML for directly handling chemical information e.g. molecular structures and spectra

Difficulties in automatically recognizing and extracting chemical data

Development and extension of a Chemical Markup Language (CML) and techniques to allow the display of molecules, spectra and reactions within a web browser

CML Objectives

A present-day online chemical paper might consist of HTML text, static bitmap images, diagrams and molecular structures from an external legacy data file (e.g. MOL, PDB)

The external data files become isolated from the text and from each other

Need for a single, human-readable format combining both textual and non-textual information within a single document

CML – Chemical components

Chemical Components (e.g. <molecule>, <reaction>, <crystal>) used to indicate chemical 'objects'.

E.g. a <molecule> will contain a <list> of <atom>s, which in turn have three <float>s specifying Cartesian coordinates for each atom

Partial XML file - “ethanol”

<cml title="ethanol" id="cml_ethanol">
  <molecule title="ethanol" id="mol_ethanol"
            xmlns="x-schema:cml_schema_ie_02.xml" convention="mol">
    <formula>C2 H6 O </formula>
    <string title="CAS">64-17-5 </string>
    <float title="molecular weight">46.07 </float>
    <list title="atoms">
      <atom id="ethanol_a_1" convention="mol">
        <integer builtin="atomId">1 </integer>
        <float builtin="x3" units="A">1.0303 </float>
        <float builtin="y3" units="A">0.8847 </float>
        <float builtin="z3" units="A">0.9763 </float>
        <string builtin="elementType">C </string>
      </atom>
      …
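
As a sketch of how a program might read such a file, the Python below extracts atom coordinates with the standard library. It makes two assumptions for the sake of a runnable example: the fragment is cut down to the single atom shown above and closed so that it is well-formed, and the `xmlns` attribute is omitted to keep element lookups simple. The function name `read_atoms` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment: one atom from the slide, closing tags added,
# xmlns attribute omitted (assumption, see the text above).
CML = """
<cml title="ethanol" id="cml_ethanol">
  <molecule title="ethanol" id="mol_ethanol" convention="mol">
    <formula>C2 H6 O </formula>
    <list title="atoms">
      <atom id="ethanol_a_1" convention="mol">
        <integer builtin="atomId">1 </integer>
        <float builtin="x3" units="A">1.0303 </float>
        <float builtin="y3" units="A">0.8847 </float>
        <float builtin="z3" units="A">0.9763 </float>
        <string builtin="elementType">C </string>
      </atom>
    </list>
  </molecule>
</cml>
"""


def read_atoms(cml_text):
    """Return (atom id, element symbol, (x, y, z)) for every <atom> in the document."""
    root = ET.fromstring(cml_text.strip())
    atoms = []
    for atom in root.iter("atom"):
        element = atom.find("string[@builtin='elementType']").text.strip()
        # Each coordinate is a <float> identified by its builtin attribute.
        xyz = tuple(
            float(atom.find(f"float[@builtin='{axis}']").text)
            for axis in ("x3", "y3", "z3")
        )
        atoms.append((atom.get("id"), element, xyz))
    return atoms


atoms = read_atoms(CML)
print(atoms)
```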

Some XSL processing

<xsl:template match="molecule">
  <!-- Pull out @id="" etc -->
  <table>
    <tr><td>Molecule ID:</td><td>Formula:</td><td>CAS:</td></tr>
    <tr>
      <td><xsl:value-of select="@id"/></td>
      <td><xsl:value-of select="formula"/></td>
      <td><xsl:value-of select="*[@title = 'CAS']"/></td>
    </tr>
    <tr>
      <td>Alternate Names:</td>
      <td colspan="6">
        <xsl:for-each select="list[@title = 'alternate names']/string[@title='name']">
          <xsl:value-of select="text()"/>, </xsl:for-each>
      </td>
    </tr>
  </table>
</xsl:template>
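
For readers without an XSLT processor at hand, the same selections can be tried in Python. This is a rough equivalent, not the authors' tooling: the function name `summarize` is hypothetical, the `xmlns` attribute is left out, and the "alternate names" list in the sample is made up for illustration, since the partial file shown earlier does not include one.

```python
import xml.etree.ElementTree as ET

# Illustrative molecule; the alternate-names list is an assumption.
MOLECULE = """
<molecule title="ethanol" id="mol_ethanol" convention="mol">
  <formula>C2 H6 O </formula>
  <string title="CAS">64-17-5 </string>
  <list title="alternate names">
    <string title="name">ethyl alcohol</string>
  </list>
</molecule>
"""


def summarize(molecule_xml):
    """Collect the fields that the XSL template places in its table."""
    mol = ET.fromstring(molecule_xml.strip())
    names = mol.findall("list[@title='alternate names']/string[@title='name']")
    return {
        "id": mol.get("id"),                               # select="@id"
        "formula": mol.findtext("formula").strip(),        # select="formula"
        "cas": mol.find("*[@title='CAS']").text.strip(),   # select="*[@title = 'CAS']"
        "alternate_names": [n.text.strip() for n in names],
    }


summary = summarize(MOLECULE)
print(summary)
```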

Web Resources

MathML http://www.w3c.org/math

CML http://www.xml-cml.org

Other Scientific Applications

Wireless Markup Language (WML)

http://www.oasis-open.org/cover/wap-wml.html

Bioinformatics Sequence Markup Language (BSML)

http://www.visualgenomics.com/bsml/index.html

The BIOpolymer Markup Language (BIOML)

http://www.proteometrics.com/BIOML/

Other Scientific Applications (contd…)

Vector Markup Language (VML)

http://www.w3.org/TR/NOTE-VML

Precision Graphics Markup Language (PGML)

http://www.oasis-open.org/cover/pgmlDTD19980410.html

XML Digital Signature (Signed XML)

http://www.oasis-open.org/cover/xmlSig.html

smartX ['SmartCard'] Markup Language (SML)

http://www.smartxml.com/

Other Scientific Applications (contd…)

Web Interface Definition Language (WIDL)

http://www.w3.org/TR/NOTE-widl

Weather Observation Markup Format (OMF)

http://www.oasis-open.org/cover/omfDesc19980610.html

X-ACT - XML Active Content Technologies Council

http://www.x-act.org/

Some More Links…

http://www.xml.com

http://www.w3c.org/

http://www.oasis-open.org/cover/xml.html

Last but not the least!!!

http://www.cse.iitb.ernet.in/~dbms/Data/Conferences/XMLWorkshop/

http://www.google.com/
