SDPL 2003 Notes 7: XML Wrapping 1
7 Translating Data to XML
How to translate existing data formats to XML?
  – (and why?)
XW (XML Wrapper)
  – an "XML wrapper description language"
  – developed in the XRAKE project, Univ. of Kuopio, 2001–02
  – Ek, Hakkarainen, Kilpeläinen, Kuikka, Penttinen: Describing XML Wrappers for Information Integration. In Proc. of XML Finland 2001, Tampere, Finland, Nov. 2001, 38–51.
  – Ek, Hakkarainen, Kilpeläinen, Penttinen: Declarative XML Wrapping of Data. Report A/2002/2, Dept. of CS & Appl. Math, Univ. of Kuopio.

XRAKE Project
"XML-rajapintojen kehittäminen" (Developing XML-based interfaces)
Studied definition and implementation of XML-based interfaces, and their application in
  – integration of heterogeneous data sources
  – management of mass printing
  – assembly and manipulation of electronic patient records

XRAKE - Support
National Technology Agency of Finland (TEKES) and seven local IT companies/organizations
  – DEIO IS
  – Enfo Group
  – JSOP Interactive
  – Kuopio University Hospital
  – Medigroup
  – SysOpen
  – TietoEnator

XW: Motivation
XML-based protocols developed for e-business, medical messages, …
Legacy data formats need to be converted to XML
  – How?

XML-wrapping
Need "XML-wrappers" (aka extractors)
  – interface/conversion program to produce an XML representation for source data
[Figure: source1, source2 and source3 are converted by wrapper1, wrapper2 and wrapper3 into XML-form-1 and XML-form-2]

How to wrap?
1. With an interface integrated to source
  – E.g. XML-interfaces of database systems
  – OK, if available
2. With an ad-hoc written translator
  – E.g. JDBC+Java or separator-encoded text form + Perl
  – OK; conversion possibly efficient
  – Development and maintenance tedious :-(

How to wrap? (2)
3. Generic source-independent wrapping
  – requires a file/message/report produced by the system
    » normally available
  – development and maintenance of wrappers should become easier
=> Wrapper description language XW
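For contrast with option 2, here is a minimal sketch of an ad-hoc translator for separator-encoded text. It is written in Python rather than the Perl or Java named on the slide, and the function name, the `rows`/`row` element names and the `;` separator are all illustrative assumptions. Every new source format would need another hand-written function like this one, which is the maintenance burden XW aims to remove.

```python
import csv
import io
from xml.sax.saxutils import escape


def rows_to_xml(text, fields, sep=";"):
    """Convert separator-encoded rows into a small XML document (ad-hoc style)."""
    out = ["<rows>"]
    for record in csv.reader(io.StringIO(text), delimiter=sep):
        out.append("  <row>")
        # The field names are hard-wired by the caller; nothing is declarative.
        for name, value in zip(fields, record):
            out.append(f"    <{name}>{escape(value)}</{name}>")
        out.append("  </row>")
    out.append("</rows>")
    return "\n".join(out)


print(rows_to_xml("1;Smith & Co\n", ["id", "name"]))
```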

XW (XML Wrapper)
XML-based, declarative wrapper description language
To convert from a
  – textual or binary source
    » currently (XW 1.59) only text sources supported
to XML form

XW: Design principles
A concise and natural XML syntax
  – description of simple and typical conversion tasks should be simple
Solving the key problem: Initial conversion of a legacy data format to XML
  – more general post-processing with XSLT/SAX/DOM
  – necessary for being able to apply XML techniques

XW: Influences
XML Namespaces
  – for separating XW commands and result elements
XML Schema
  – description of alternative and repetitive structures (CHOICE, minoccurs, maxoccurs)
  – data types of binary source data (string, byte, int, …)
XSLT
  – template-based description of result documents
  – variables for storing result fragments
xmlns:xw="http://www.cs.uku.fi/XW/2001"

What does XW look like?
<xw:wrapper xw:sourcetype="text"
    xmlns:xw="http://www.cs.uku.fi/XW/2001"
    xw:inputencoding="Cp850" … >
  <invoice note="XW-generated" xw:starter="\^INVOICE">
    <identifierdata ...>Inserted result text ...
      ...
    </identifierdata>
    <specification xw:starter="\^PHONE SPECIFICATION" ...>
      ...
    </specification>
    <data xw:starter="\^---" xw:maxoccurs="unbounded" ...>
      ...
    </data>
  </invoice>
</xw:wrapper>

XW-architecture (1)
[Figure: the XW-engine reads the source data and the wrapper description (<xw:wrapper … > … </xw:wrapper>) and produces the result document, which can be post-processed with XSLT, SAX or DOM]
Source data:
  AA x1 x2
  BB y1 y2
  z1 z2
Result document:
  <part-a>
    <e1>x1</e1>
    <e2>x2</e2>
  </part-a>
  <part-b>
    <line-1>
      <d1>y1</d1>
      <d2>y2</d2>
    </line-1>
    <d3>z2</d3>
  </part-b>
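A minimal sketch of DOM post-processing on the result document, using Python's standard `xml.dom.minidom`. The enclosing `<result>` root element and the helper name `element_texts` are assumptions made so that the fragment parses as one document; they do not come from XW.

```python
from xml.dom.minidom import parseString

# The result document from the figure, wrapped in a hypothetical <result> root.
RESULT = (
    "<result>"
    "<part-a><e1>x1</e1><e2>x2</e2></part-a>"
    "<part-b><line-1><d1>y1</d1><d2>y2</d2></line-1><d3>z2</d3></part-b>"
    "</result>"
)


def element_texts(xml_text, tag):
    """Return the direct text content of every element named `tag`."""
    doc = parseString(xml_text)
    texts = []
    for node in doc.getElementsByTagName(tag):
        texts.append("".join(child.data for child in node.childNodes
                             if child.nodeType == child.TEXT_NODE))
    return texts


print(element_texts(RESULT, "d3"))
```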

XW-architecture (2)
[Figure: the XW-engine reads the source data and the wrapper description (<xw:wrapper … > … </xw:wrapper>) and emits SAX events - to use as a program component]
Source data:
  AA x1 x2
  BB y1 y2
  z1 z2
SAX events:
  startElement(part-a, …)
  startElement(e1, …)
  characters("x1")
  …
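A sketch of what "use as a program component" can mean: any SAX content handler can receive the events. Python's `xml.sax` stands in for the Java SAX API here. The function `emit_part_a` is a hypothetical stand-in for the engine that simply replays the events listed above, and `TextCollector` is an invented example handler.

```python
import io
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def emit_part_a(handler, x1, x2):
    """Stand-in for the engine: report the <part-a> fragment as SAX events."""
    no_attrs = AttributesImpl({})
    handler.startDocument()
    handler.startElement("part-a", no_attrs)
    for name, text in (("e1", x1), ("e2", x2)):
        handler.startElement(name, no_attrs)
        handler.characters(text)
        handler.endElement(name)
    handler.endElement("part-a")
    handler.endDocument()


class TextCollector(ContentHandler):
    """Application-side handler that only keeps the character data."""

    def __init__(self):
        super().__init__()
        self.texts = []

    def characters(self, content):
        self.texts.append(content)


collector = TextCollector()
emit_part_a(collector, "x1", "x2")

# The same events can instead be serialized into a result document.
out = io.StringIO()
emit_part_a(XMLGenerator(out, encoding="utf-8"), "x1", "x2")
serialized = out.getvalue()
```

The producer does not need to know which of the two consumers it is feeding, which is the point of the event interface.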

XW-architecture (3)
[Figure: the XW-engine, connected to an application through SAX, reads the source data and the wrapper description (<xw:wrapper … > … </xw:wrapper>); the result document is the same as in XW-architecture (1)]

XW: Basic Ideas
Wrapper description ~ a grammar for source
Wrapping ~ parsing the source data
  – split data into parts according to the description
  – Result document = XML for the parse tree of the source
[Figure: the source data split into nested parts: <whole> contains <a> and <b>; <b> contains <b1>, <b2> and <b3>]

XW Syntax
Splitting of source content into parts (-> elements)
<xw:wrapper xw:sourcetype="text"
    xmlns:xw="http://www.cs.uku.fi/XW/2001">
  <invoice … >
    <identifierdata ...>
      ...
    </identifierdata>
    <specification ...>
      ...
    </specification>
  </invoice>
</xw:wrapper>

Recognition of content parts (1)
by separators; For example:
  <invoice xw:starter="\^INVOICE" …
  <identifierdata xw:childterminator="\n" … >   (for sub-parts)
by position (within surrounding part):
  <invoicenumber xw:position="53 64"/>
  (Invoice number is in positions 53..64 of the first row of an identifierdata-part)
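A small sketch of the two recognition styles in Python. It assumes that positions are 1-based and inclusive, as "positions 53..64" suggests. The helper names and the sample row, including the invoice number in it, are made up for illustration.

```python
def split_rows(part, child_terminator="\n"):
    """Split a part into sub-parts at the child terminator (cf. xw:childterminator)."""
    rows = part.split(child_terminator)
    if rows and rows[-1] == "":
        rows.pop()  # a trailing terminator does not start another row
    return rows


def by_position(row, first, last):
    """Cut columns first..last out of a row (assumed 1-based, inclusive)."""
    return row[first - 1:last].strip()


# Hypothetical identifierdata part: the number sits in columns 53..64 of row 1.
part = "INVOICE".ljust(52) + "2001-0042".ljust(12) + "\nsecond row\n"
first_row = split_rows(part)[0]
print(f"<invoicenumber>{by_position(first_row, 53, 64)}</invoicenumber>")
```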

Recognition of content parts (2)
In binary data by content data types; For example:
  <xw:wrapper xw:sourcetype="binary" ...>
    <A xw:type="byte"/>
    <B xw:type="string" xw:stringLength="20"/>
    <C xw:type="int"/>
  </xw:wrapper>
  – Split input to a byte, a string of 20 characters, and an integer (-> elements A, B and C)
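The same split can be sketched with Python's `struct` module. The slide does not give the byte order, the width of `int` or the string encoding, so the choices below (big-endian, 4-byte signed integer, Latin-1, NUL/space padding) are assumptions. Binary wrapping was also still unimplemented in the XW prototype.

```python
import struct
from xml.sax.saxutils import escape

# byte, 20-byte string, 4-byte signed int; ">" = big-endian, no padding (assumed)
RECORD = struct.Struct(">B20si")


def wrap_record(data):
    """Split one binary record into the elements A, B and C."""
    a, b, c = RECORD.unpack_from(data)
    text = b.decode("latin-1").rstrip("\x00 ")  # drop padding (assumed convention)
    return f"<A>{a}</A><B>{escape(text)}</B><C>{c}</C>"


sample = RECORD.pack(7, b"hello", 1234)  # made-up test record
print(wrap_record(sample))
```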

Recognition of content parts (3)
Repetition:
  <line xw:terminator="\n" xw:minoccurs="2" xw:maxoccurs="2"/>
  – 2 input lines -> 2 line elements
Alternative parts:
  <xw:CHOICE xw:maxoccurs="unbounded">
    <A xw:starter="\^aa" xw:terminator="\n" />
    <B xw:starter="\^bb" xw:terminator="\n" />
  </xw:CHOICE>
  – arbitrary number (at least 1) of lines starting with "aa" or "bb" -> elements A or B
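A sketch of how the CHOICE example behaves, in plain Python. It assumes that `\^` anchors the starter at the beginning of a line and that the starter itself is left out of the element content. The function name and the error handling are illustrative, not XW behaviour.

```python
from xml.sax.saxutils import escape

CHOICES = {"aa": "A", "bb": "B"}  # starter string -> result element name


def wrap_choice(text):
    """Wrap consecutive lines starting with "aa" or "bb" as A or B elements."""
    elements = []
    for line in text.splitlines():
        for starter, name in CHOICES.items():
            if line.startswith(starter):
                content = escape(line[len(starter):])
                elements.append(f"<{name}>{content}</{name}>")
                break
        else:
            break  # first line matching no alternative ends the repetition
    if not elements:
        raise ValueError("at least one matching line is required")
    return elements


print(wrap_choice("aa1\nbb2\naa3\n"))
```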

XW: Modifying the structure of data
Limited modification possible:
  – discarding parts of data
  – collapsing levels of hierarchy
  – adding levels of hierarchy
  – generating verbatim content and attributes
  – re-arranging existing data

Discarding parts of data
<spec xw:starter="SPEC" xw:childterminator="\n">
  <!-- Split the "SPEC" into rows: -->
  <!-- Ignore the first three rows: -->
  <xw:ignore xw:minoccurs="3" xw:maxoccurs="3" />
  . . .
</spec>
Input parts not matched by wrapper elements are ignored

Collapsing hierarchy
<data xw:starter="START" xw:terminator="END" xw:childterminator="\n">
  <!-- 'data' is made of rows -->
  <xw:collapse>
    <date xw:position="5 14"/>
    <sum xw:position="16 21"/>
  </xw:collapse>
  . . .
</data>

Collapsing hierarchy (2)
  – Split source data into parts according to specified separators
Source data:
  START
  17.8.1996 95.50
  END
<data>
  . . .
</data>

Collapsing hierarchy (3)
  – split parts into sub-parts, according to sub-elements
<data>
  <xw:collapse>17.8.1996 95.50</xw:collapse>
  . . .
</data>

Collapsing hierarchy (4)
<data>
  <xw:collapse>
    <date>17.8.1996</date>
    <sum>95.50</sum>
  </xw:collapse>
  . . .
</data>
->
<data>
  <date>17.8.1996</date>
  <sum>95.50</sum>
  . . .
</data>

Collapsing hierarchy (5)
Input part + wrapper element -> result
  17.8.1996 + <data /> -> <data>17.8.1996</data>
  17.8.1996 + <xw:collapse /> -> 17.8.1996
default: discardwhitespace="true"
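The difference between an ordinary result element and xw:collapse can be imitated in a few lines of Python. The helper names are invented, and so is the exact column layout of the sample row: the slides give only the positions 5..14 and 16..21, read here as 1-based and inclusive.

```python
from xml.sax.saxutils import escape


def element(name, content):
    """Ordinary result element: the matched part is wrapped in tags."""
    return f"<{name}>{content}</{name}>"


def collapse(content):
    """xw:collapse: the part is matched, but no element of its own is produced."""
    return content


def wrap_row(row):
    # Whitespace is stripped, mirroring the default discardwhitespace="true".
    date = escape(row[4:14].strip())   # positions 5..14
    total = escape(row[15:21].strip())  # positions 16..21
    return collapse(element("date", date) + element("sum", total))


row = "    17.8.1996  95.50"  # assumed layout of the row between START and END
print(element("data", wrap_row(row)))
```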

Adding levels of hierarchy
Example: Recognizing IP addresses in binary data
<xw:ELEMENT xw:name="IP-address">
  <a xw:type="byte"/>
  <b xw:type="byte"/>
  <c xw:type="byte"/>
  <d xw:type="byte"/>
</xw:ELEMENT>

Adding levels of hierarchy (2)
  – Binary data = string of bytes
Input bytes: 193 167 232 253
Matched sub-elements:
  <a>193</a> <b>167</b> <c>232</c> <d>253</d>
Result:
  <IP-address>
    <a>193</a>
    <b>167</b>
    <c>232</c>
    <d>253</d>
  </IP-address>
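A sketch of the same step in Python: four bytes are matched by a, b, c and d, and the enclosing IP-address element is added around them without consuming any input of its own. The function name is an assumption.

```python
def wrap_ip(data):
    """Wrap the first four bytes as <a>..<d> inside an added <IP-address> level."""
    if len(data) < 4:
        raise ValueError("need at least four bytes")
    # Iterating over bytes yields integers, one per xw:type="byte" element.
    inner = "".join(f"<{name}>{value}</{name}>"
                    for name, value in zip("abcd", data[:4]))
    return f"<IP-address>{inner}</IP-address>"


print(wrap_ip(bytes([193, 167, 232, 253])))
```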

Adding levels of hierarchy (3)
NB: an xw:ELEMENT does not correspond to parts of input data (like ordinary result elements do):
<!-- Wrap first two lines as INTRO: -->
<data xw:childterminator="\n">
  <xw:ELEMENT xw:name="INTRO">
    <!-- lines are matched by these elements: -->
    <xw:collapse />
    <xw:collapse />
  </xw:ELEMENT>
  …
</data>

Rearranging content
Content can be rearranged by storing results temporarily in variables:
<data xw:childterminator="\n">
  <xw:STORE xw:name="lines">
    <!-- lines are matched by these elements: -->
    <line1 /><line2 />
  </xw:STORE>
  …
  <xw:COPY-OF xw:select="lines" />
</data>

Rearranging result structures
Wrapper description:
<whole>
  <xw:STORE xw:name="xx">
    <a xw:starter="A" xw:terminator="$"/>
  </xw:STORE>
  <b xw:starter="B">
    <b1 xw:starter="1"/>
    <b2 xw:starter="2"/>
    <xw:COPY-OF xw:select="xx"/>
    <b3 xw:starter="3"/>
  </b>
</whole>
Source data:
  Axy..z$???
  B..1one 2two
  3three
Result document:
<whole>
  <b>
    <b1>one</b1>
    <b2>two</b2>
    <a>xy..z</a>
    <b3>three</b3>
  </b>
</whole>

Rearranging result content
Wrapper description:
<whole>
  <xw:STORE xw:name="xx">
    <a xw:starter="A" xw:terminator="$"/>
  </xw:STORE>
  <b xw:starter="B">
    <b1 xw:starter="1"/>
    <b2 xw:starter="2"/>
    <xw:VALUE-OF xw:select="xx"/>
    <b3 xw:starter="3"/>
  </b>
</whole>
Source data:
  Axy..z$???
  B..1one 2two
  3three
Result document:
<whole>
  <b>
    <b1>one</b1>
    <b2>two</b2>
    xy..z
    <b3>three</b3>
  </b>
</whole>
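The difference between COPY-OF and VALUE-OF can be imitated with `xml.etree.ElementTree`. This is a hand-built sketch of the two result documents, not the XW engine: the function name and the `mode` argument are assumptions, and the matched content is written in directly instead of being parsed from the source.

```python
import copy
import xml.etree.ElementTree as ET


def rearrange(mode):
    """Build the result for the STORE example; mode is "copy-of" or "value-of"."""
    store = {}
    a = ET.Element("a")
    a.text = "xy..z"
    store["xx"] = a  # STORE: the fragment is kept aside, not written to the result

    whole = ET.Element("whole")
    b = ET.SubElement(whole, "b")
    ET.SubElement(b, "b1").text = "one"
    b2 = ET.SubElement(b, "b2")
    b2.text = "two"
    if mode == "copy-of":
        b.append(copy.deepcopy(store["xx"]))        # whole stored structure
    else:
        b2.tail = "".join(store["xx"].itertext())   # text content only
    ET.SubElement(b, "b3").text = "three"
    return ET.tostring(whole, encoding="unicode")


print(rearrange("copy-of"))
print(rearrange("value-of"))
```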

XW: Implementation
Prototype implemented with Java
Apache Xerces 2.0.1 used as a SAX parser
  – to read the wrapper description, which is represented internally as ..
a wrapper tree
  – guides the parsing of source data

Wrapper Tree
Wrapper tree node
  – corresponds to an element of wrapper description
  – used for matching parts of source data
  – includes sets S, B, T and F of strings
    » computed from wrapper description
    » S: element's own starter strings
    » B: strings that can begin part of element = S ∪ starters of subelements that can begin the part of the element
    » T: terminating delimiters for the part of element
    » F: strings that can follow the part of element

<xw:wrapper xw:name="Wrapper tree example"
    xw:sourcetype="text"
    xmlns:xw="http://www.cs.uku.fi/XW/2001">
  <doku xw:childterminator="\n" xw:terminator="$">
    <a xw:starter="\^A" xw:minoccurs="0"/>
    <b xw:starter="\^B" />
    <c xw:starter="\^C"/>
    <xw:CHOICE xw:minoccurs="0" xw:maxoccurs="unbounded">
      <d xw:starter="\^D"/>
      <e xw:starter="\^E"/>
    </xw:CHOICE>
  </doku>
</xw:wrapper>
Wrapper tree (sets of each node; "-" = empty):
  doku        S: -     B: \^A, \^B   T: $    F: -
  a           S: \^A   B: \^A        T: \n   F: \^B
  b           S: \^B   B: \^B        T: \n   F: \^C
  c           S: \^C   B: \^C        T: \n   F: \^D, \^E, $
  xw:CHOICE   S: -     B: \^D, \^E   T: -    F: \^D, \^E, $
  d           S: \^D   B: \^D        T: \n   F: \^D, \^E, $
  e           S: \^E   B: \^E        T: \n   F: \^D, \^E, $
Source data:
  Aaaa
  Bbbb
  Cccc
  Eeee
  Dddd
  Dddd
  $
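One way the four sets could be derived from the description is sketched below in Python. The rules are inferred from the example values above and are not taken from the XW implementation. The `Node` class and both function names are assumptions, and the delimiters are kept as the literal strings used in the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

UNBOUNDED = float("inf")


@dataclass
class Node:
    name: str
    starter: Optional[str] = None
    terminator: Optional[str] = None
    child_terminator: Optional[str] = None
    min_occurs: int = 1
    max_occurs: float = 1
    is_choice: bool = False
    children: List["Node"] = field(default_factory=list)
    S: Set[str] = field(default_factory=set)
    B: Set[str] = field(default_factory=set)
    T: Set[str] = field(default_factory=set)
    F: Set[str] = field(default_factory=set)


def compute_begin(node):
    """S = own starter; B = S plus the B sets of children that can come first."""
    node.S = {node.starter} if node.starter else set()
    node.B = set(node.S)
    for child in node.children:
        compute_begin(child)
    for child in node.children:
        node.B |= child.B
        if not node.is_choice and child.min_occurs > 0:
            break  # a required child hides the children after it


def compute_end(node, row_end, follow):
    """T = own terminator or the enclosing child terminator; F = what may follow."""
    if node.is_choice:
        node.T, passed_down = set(), row_end
    else:
        node.T = {node.terminator} if node.terminator else set(row_end)
        passed_down = {node.child_terminator} if node.child_terminator else set()
    node.F = set(follow)
    for i, child in enumerate(node.children):
        if node.is_choice:
            child_follow = set(node.F)
        else:
            child_follow = set()
            for sibling in node.children[i + 1:]:
                child_follow |= sibling.B
                if sibling.min_occurs > 0:
                    break
            else:  # only optional siblings remain: the parent may end here
                child_follow |= node.T | node.F
            if child.max_occurs > 1:
                child_follow |= child.B  # a repeatable part can follow itself
        compute_end(child, passed_down, child_follow)


a = Node("a", starter=r"\^A", min_occurs=0)
b = Node("b", starter=r"\^B")
c = Node("c", starter=r"\^C")
d = Node("d", starter=r"\^D")
e = Node("e", starter=r"\^E")
choice = Node("xw:CHOICE", min_occurs=0, max_occurs=UNBOUNDED,
              is_choice=True, children=[d, e])
doku = Node("doku", terminator="$", child_terminator=r"\n",
            children=[a, b, c, choice])
compute_begin(doku)
compute_end(doku, set(), set())
```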

Executing a wrapper (simplified)
Traverse the wrapper tree; In each node:
  – scan input until the start of corresponding part found (= a delimiter belonging to set B)
  – report startElement(…)
  – Either
    » process child nodes recursively, or
    » report characters(…) for a leaf-level element
  – scan input until the end of the part (using sets T and F)
  – report endElement(…)
  – if node iterative, and a string in B found, reprocess node

Development status
Fall 2001: language designed from concrete examples
2002: Design of implementation principles, implementation
  – wrapping of separator-based and positional text data implemented
  – wrapping of binary data (and a few other details) unimplemented

XW: Some possible extensions
Evaluation of expressions
  – for generating computed attributes (implemented recently)
  – for guiding repetition (min/maxoccurs) by content values
Namespace support for results
Describing recursive (unlimited nesting) source structures
  => recognizing LL(k) languages
  (Usefulness for wrapping data formats?)

XW: Summary
XW: a convenient "XML wrapper description language"
  – for translating legacy data to XML
  – declarative wrapper description
  – easier than procedural ad-hoc conversion programs
  – working prototype implementation
    » to be available at www.cs.uku.fi/research/XRAKE