
Krextor – An Extensible XML→RDF Extraction Framework


Workshop Scripting for the Semantic Web, ESWC 2009


Sem. Markup and RDF Krextor Framework Applications Examples Related Conclusion

Krextor – An Extensible XML→RDF Extraction Framework

Scripting for the Semantic Web, 5th Workshop

Christoph Lange

Jacobs University, Bremen, Germany

KWARC – Knowledge Adaptation and Reasoning for Content

May 31, 2009

Ch. Lange (Jacobs University) Krextor – An Extensible XML→RDF Extraction Framework May 31, 2009 1/15


Overview

Want XML applications to contribute to the Semantic Web?
1. Define a schema→ontology mapping for your XML language
2. Extract RDF from XML

Krextor:
- Specify XML→ontology mappings (as extraction rules)
- Perform extraction (XSLT-based implementation)

http://kwarc.info/projects/krextor/


XML vs. RDF

Two slices of the infamous Layer Cake: RDF, sitting on top of XML.

The Layer Cake doesn't tell much about the role of XML:
1. XML only for encoding higher-layer formalisms like RDF or OWL?
2. Or XML as a metalanguage in its own right?

In case (2), we need a semantics for XML-based languages!


XML languages

Advantages of using XML for knowledge representation (and not just RDF):
1. Sequential order out of the box
2. Style languages (CSS, XSL)

Given any domain, we can define an XML schema for a domain-specific language:
- concise syntax for domain experts
- no need to think in triples (compare OWL XML vs. RDF/XML)


What about the semantics?

    <workshop xml:id="SFSW09"
              conference="#ESWC09"
              number="5"
              date="2009-05-31">
      <title short="SFSW">Scripting for the Semantic Web</title>
    </workshop>

Usual approach: write a human-readable specification, then hard-code it
Semantic approaches: RDFa, Microformats

Open questions:
1. How to give the above language a direct RDF-based semantics?
2. How to implement the XML→RDF translation?
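To make the two open questions concrete, here is a minimal hand-written sketch in Python (not Krextor, which is XSLT-based) of what a direct translation of the workshop element above could produce. The base URI, the vocabulary namespace, and the property names are all hypothetical; the slides do not specify an ontology for this example.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
BASE = "http://example.org/events"    # hypothetical document URI
VOCAB = "http://example.org/vocab#"   # hypothetical ontology namespace

SOURCE = """
<workshop xml:id="SFSW09" conference="#ESWC09" number="5" date="2009-05-31">
  <title short="SFSW">Scripting for the Semantic Web</title>
</workshop>
"""

def extract_workshop(xml_text):
    """Return (subject, predicate, object) triples for one <workshop> element."""
    root = ET.fromstring(xml_text)
    subject = BASE + "#" + root.get(XML_ID)
    return [
        (subject, "rdf:type", VOCAB + "Workshop"),
        # conference="#ESWC09" is a fragment reference, so resolve it against BASE
        (subject, VOCAB + "conference", BASE + root.get("conference")),
        (subject, VOCAB + "number", root.get("number")),
        (subject, VOCAB + "date", root.get("date")),
        (subject, VOCAB + "title", root.findtext("title")),
    ]

for triple in extract_workshop(SOURCE):
    print(triple)
```

Every mapping decision here is buried in code, which is the maintenance problem that rule-based extraction is meant to avoid.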


Making an XML language semantic

We focus on practical implementation, not on a formal semantics bridging XML and RDF.
We want to benefit from existing XML and RDF tools.

Our approach:
1. Provide rules that translate XML to RDF
2. If needed, supply an ontology as vocabulary for the extracted RDF


Krextor’s History

1. Origin: OMDoc (Open Mathematical Documents; XML schema and ontology), to be managed in a semantic wiki
2. Hard-coded Java implementation: too inflexible to maintain
3. More lightweight approach: XSLT coded from scratch (OMDoc→RXR→Java)
4. Needed support for other languages
5. Created Krextor, a generic XSLT-based framework
6. . . . and provided some more translations ("extraction modules")

http://kwarc.info/projects/krextor/


The Framework

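[Figure: Krextor architecture. Input formats (OMDoc+RDFa, OMDoc/OWL+RDFa, XHTML+RDFa, OpenMath, "my XML+RDFa?", "my Microformat") are translated into a generic representation, which is serialized into output formats (RXR, RDF/XML, Turtle, Java callback, "your format", "??").]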

Collection of XSLT stylesheets, Java wrapper, Shell frontend
Output targeted at machines, not humans


Adding Input and Output Modules

Input module (for a new XML language):
- very simple declarative mappings (element↦class)
- otherwise, pattern-match the XML structure, then call a predefined template: create resource, add property, etc.
- several ways of generating URIs for XML elements: xml:id, auto-generated, custom
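The three URI generation options can be pictured as interchangeable strategies tried in order. The following Python sketch illustrates that idea only; it is not how Krextor implements it in XSLT, and the function names, the `auto-N` fragment scheme, and the example base URI are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def uri_from_xml_id(element, base, position):
    """Use the element's xml:id as a fragment identifier, if it has one."""
    ident = element.get(XML_ID)
    return f"{base}#{ident}" if ident else None

def uri_auto_generated(element, base, position):
    """Fall back to an identifier derived from the element's document position."""
    return f"{base}#auto-{position}"

def assign_uris(xml_text, base, strategies):
    """Try each strategy in order for every element; the first non-None result wins."""
    uris = {}
    elements = ET.fromstring(xml_text).iter()  # document order, root first
    for position, element in enumerate(elements, start=1):
        for strategy in strategies:
            uri = strategy(element, base, position)
            if uri is not None:
                uris[position] = (element.tag, uri)
                break
    return uris

doc = '<doc xml:id="d1"><section/><section xml:id="s2"/></doc>'
result = assign_uris(doc, "http://example.org/doc",
                     [uri_from_xml_id, uri_auto_generated])
for position, (tag, uri) in result.items():
    print(position, tag, uri)
```

A custom scheme would be one more function with the same signature, placed at the front of the strategy list.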

Output module (for a new RDF serialization):
- implement a low-level "triple generation template"
- or post-process the output of an existing module
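Both output-module options can be sketched in a few lines of Python. The class below plays the role of a low-level triple generator that is called once per extracted triple, and the helper after it post-processes the lines an existing module produced. The names `NTriplesOutput`, `emit`, and `sorted_unique` are hypothetical, and the serialization is a simplified N-Triples-style form without datatypes or language tags.

```python
def term(value, literal=False):
    """Serialize one term: quoted and escaped if a literal, else an <IRI>."""
    if literal:
        escaped = (value.replace("\\", "\\\\")
                        .replace('"', '\\"')
                        .replace("\n", "\\n"))
        return f'"{escaped}"'
    return f"<{value}>"

class NTriplesOutput:
    """Output module: receives one triple at a time and serializes it."""

    def __init__(self):
        self.lines = []

    def emit(self, subject, predicate, obj, literal=False):
        self.lines.append(
            f"{term(subject)} {term(predicate)} {term(obj, literal)} ."
        )

def sorted_unique(lines):
    """Post-processing step: drop duplicate statements and sort the rest."""
    return sorted(set(lines))

out = NTriplesOutput()
out.emit("http://example.org/a", "http://example.org/knows", "http://example.org/b")
out.emit("http://example.org/a", "http://example.org/label", 'say "hi"', literal=True)
out.emit("http://example.org/a", "http://example.org/knows", "http://example.org/b")
print("\n".join(sorted_unique(out.lines)))
```

Keeping the generator interface this narrow is what makes a new serialization cheap to add: the extraction side never needs to know which syntax is being written.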


Our own applications

Semantic wiki: SWiM (http://swim.kwarc.info)
- mathematical documents (OMDoc, OpenMath)
- extract RDF outline from documents
- use it for navigation, querying, problem-solving assistance

Documented ontologies:
- write ontologies in OMDoc (better documentability → poster session)
- Krextor translates to OWL


Example: hCalendar Microformat (1)

Input:

    <div class="vevent">
      <a class="url" href="http://www.eswc2009.org">ESWC</a>
      starts on <span class="dtstart">2009-05-31</span>.
    </div>

Desired output:

    <http://www.eswc2009.org>
        a <http://www.w3.org/2002/12/cal/ical#Vevent> ;
        <http://www.w3.org/2002/12/cal/ical#dtstart>
            "2009-05-31"^^<http://www.w3.org/2001/XMLSchema#date> .

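The mapping from the hCalendar input to the desired output can be reproduced with a short Python sketch. It is an illustration of the extraction logic only, not Krextor's XSLT module: it handles just the `vevent`, `url`, and `dtstart` classes that appear in the example, and it assumes, as the desired output does, that the event's URL serves as the subject.

```python
import xml.etree.ElementTree as ET

ICAL = "http://www.w3.org/2002/12/cal/ical#"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

SNIPPET = (
    '<div class="vevent">'
    '<a class="url" href="http://www.eswc2009.org">ESWC</a> '
    'starts on <span class="dtstart">2009-05-31</span>.'
    '</div>'
)

def has_class(element, name):
    """True if the space-separated class attribute contains the given name."""
    return name in element.get("class", "").split()

def first_with_class(root, name):
    """First element (in document order) carrying the given class, or None."""
    return next((e for e in root.iter() if has_class(e, name)), None)

def extract_vevents(xhtml_text):
    """Return triples; typed literals are (lexical form, datatype) pairs."""
    triples = []
    for event in ET.fromstring(xhtml_text).iter():
        if not has_class(event, "vevent"):
            continue
        url = first_with_class(event, "url")
        if url is None or url.get("href") is None:
            continue  # this sketch only handles events that carry a URL
        subject = url.get("href")
        triples.append((subject, RDF_TYPE, ICAL + "Vevent"))
        start = first_with_class(event, "dtstart")
        if start is not None:
            date = (start.text or "").strip()
            triples.append((subject, ICAL + "dtstart", (date, XSD_DATE)))
    return triples

for triple in extract_vevents(SNIPPET):
    print(triple)
```

Matching on the `class` attribute rather than on element names means the same code works whether or not the input declares the XHTML namespace.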

Example: hCalendar Microformat (2)

Usage: krextor hcalendar..turtle infile.xhtml


Example: Declarative Mapping (OpenMath)

Resources:

    <xsl:variable name="krextor:resources">
      <CD type="&omo;ContentDictionary"/>
      <CDDefinition type="&omo;SymbolDefinition"
                    related-via-properties="&omo;containsSymbolDefinition"/>
      <Example type="&omo;Example"
               related-via-properties="&omo;hasExample"/>
    </xsl:variable>

    <xsl:template match="CD|CDDefinition|Example">
      <xsl:apply-templates select="." mode="krextor:create-resource"/>
    </xsl:template>

Properties:

    <xsl:variable name="krextor:literal-properties">
      <Name property="&dc;identifier" normalize-space="true"/>
      <Description property="&dc;description" normalize-space="true"/>
      <Title property="&dc;title" normalize-space="true"/>
      <Role property="&omo;role" normalize-space="true"/>
    </xsl:variable>

    <xsl:template match="Name|Description|Title|Role">
      <xsl:apply-templates select="." mode="krextor:add-literal-property"/>
    </xsl:template>
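The table-driven idea behind these two variables can be mimicked in Python to show what such a mapping yields. This is a sketch under several assumptions: `related-via-properties` is taken to link the enclosing resource to the newly created one, the `omo:` and `dc:` prefixes are left unexpanded because the slides do not show what the entities resolve to, resources get blank-node-style labels instead of real URIs, and the sample input is a simplified, namespace-free stand-in for an OpenMath content dictionary.

```python
import xml.etree.ElementTree as ET

# Element name -> class, plus the property linking parent to new resource
RESOURCES = {
    "CD": {"type": "omo:ContentDictionary"},
    "CDDefinition": {"type": "omo:SymbolDefinition",
                     "related_via": "omo:containsSymbolDefinition"},
    "Example": {"type": "omo:Example", "related_via": "omo:hasExample"},
}

# Element name -> literal-valued property of the enclosing resource
LITERAL_PROPERTIES = {
    "Name": "dc:identifier",
    "Description": "dc:description",
    "Title": "dc:title",
    "Role": "omo:role",
}

def extract(xml_text):
    """Walk the tree once and apply whichever table entry matches each element."""
    triples = []
    counter = 0

    def visit(element, subject):
        nonlocal counter
        rule = RESOURCES.get(element.tag)
        if rule is not None:
            counter += 1
            resource = f"_:r{counter}"  # placeholder label, not a real URI
            triples.append((resource, "rdf:type", rule["type"]))
            if subject is not None and "related_via" in rule:
                triples.append((subject, rule["related_via"], resource))
            subject = resource
        elif element.tag in LITERAL_PROPERTIES and subject is not None:
            # Equivalent of normalize-space="true": collapse whitespace runs
            text = " ".join("".join(element.itertext()).split())
            triples.append((subject, LITERAL_PROPERTIES[element.tag], text))
            return
        for child in element:
            visit(child, subject)

    visit(ET.fromstring(xml_text), None)
    return triples

SAMPLE = """
<CD>
  <CDDefinition>
    <Name>plus</Name>
    <Role>application</Role>
    <Description>  An n-ary commutative
        function plus. </Description>
    <Example>1 + 2</Example>
  </CDDefinition>
</CD>
"""

for triple in extract(SAMPLE):
    print(triple)
```

Supporting another element then means adding one dictionary entry rather than writing new traversal code, which is the appeal of the declarative style.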


Related Work

Swignition: extensive support for "standard" semantics (RDFa, microformats, GRDDL), but harder to add a new input language

XSDL: declarative XML→OWL-DL mapping. Not (?) implemented; would make a nice frontend to Krextor

XSPARQL: combines SPARQL and XQuery, breaks boundaries between XML and RDF. Currently geared toward one-time queries rather than complete translations.


Conclusion

Krextor supports many XML→RDF conversion tasks
Easy to extend, easy to integrate into applications

Possible integration into engineering workflows:
- Ontology engineering: First design the ontology, then a convenient XML syntax for domain-specific knowledge
- Language engineering: Specify the semantics while engineering the schema
