KYOTO (ICT-211423): Knowledge Yielding Ontologies for Transition-Based Organization
Intelligent Content and Semantics
The First KYOTO Workshop, February 2-3 2009
Overall Kyoto Architecture and Kyoto Annotation Format
Carlo Aliprandi - SyNTHEMA
The First KYOTO Workshop, Amsterdam, February 2-3 2009 ICT-211423
Kyoto Architecture - Baselines
• KYOTO: an information sharing system that enables the extraction of deep semantics (Web 3.0) from texts, for a selected domain, anchoring meaning across cultures and languages
• KYOTO: a social platform (Web 2.0) for knowledge sharing and transfer, supporting people and organizations in building, maintaining and improving knowledge
• Baselines for KYOTO architecture:
  – Strong backbone for data exchange among components
  – Adopt and adapt existing standards
  – Open and public system
  – Synchronize across versions/languages/NLP tools/research groups
  – API to connect to sources and services
  – Services to plug and unplug different knowledge sources (Lexicon, Wordnets, Ontologies)
  – Tradeoff between generic and domain resources
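The "plug and unplug" baseline can be pictured with a small registry. This is only an illustrative sketch: `KnowledgeSourceRegistry`, its method names, and the toy sense labels are hypothetical and are not taken from the KYOTO API.

```python
from typing import Callable, Dict, List

# A knowledge source is modelled here as a function from a lemma to sense labels.
Lookup = Callable[[str], List[str]]


class KnowledgeSourceRegistry:
    """Hypothetical registry of pluggable knowledge sources (not the KYOTO API)."""

    def __init__(self) -> None:
        self._sources: Dict[str, Lookup] = {}

    def plug(self, name: str, lookup: Lookup) -> None:
        """Register a knowledge source under a name."""
        self._sources[name] = lookup

    def unplug(self, name: str) -> None:
        """Remove a knowledge source; unknown names are ignored."""
        self._sources.pop(name, None)

    def lookup(self, lemma: str) -> Dict[str, List[str]]:
        """Query every plugged source and collect the answers per source."""
        return {name: fn(lemma) for name, fn in self._sources.items()}


# Toy data standing in for a generic and a domain wordnet (made-up labels).
generic_wordnet = {"population": ["generic-sense-1", "generic-sense-2"]}
domain_wordnet = {"population": ["domain-sense-1"]}

registry = KnowledgeSourceRegistry()
registry.plug("generic-wordnet", lambda lemma: generic_wordnet.get(lemma, []))
registry.plug("domain-wordnet", lambda lemma: domain_wordnet.get(lemma, []))

with_domain = registry.lookup("population")
registry.unplug("domain-wordnet")
without_domain = registry.lookup("population")
```

Keeping each source behind the same lookup signature is one way to make the generic versus domain tradeoff a configuration choice rather than a code change.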
System components
• Capture Server
  system for selecting, converting and storing documents into the Kyoto document DB.
• Linguistic Processors
  linguistic processors producing KAF annotations
• Wikyoto system
  wiki system for yielding wordnets and ontologies. Main interface for concept and fact users
  – Document Manager
  – Term Editor
  – Kybot Editor
System components
• Tybot Server
  Automatic term and relation extraction from KAF documents and population of the term database. Validation of terms, and population of and mapping to domain wordnets (D-WNs), via Wikyoto
• Kybots Server
  Semi-automatic fact annotation on KAF documents, using patterns (Kybots)
• Kyoto Search system
  Main interface for end-users
  – Fact search system
  – Fact alert system
(Simplified) architecture: domain expert point-of-view
Overall architecture
[Figure: overall architecture of the Kyoto System. Labels recoverable from the diagram:]
• Servers: Capture Server, Indexing Server, Tybot Server, Kybot Server
• Linguistic Processor (L.P.): Dutch, English, Basque, Italian
• Databases and stores: Document Base, Kybots DB, Extracted Terms, Basque Term DB, Japan Term DB
• Wordnets: Japanese, Dutch, Spanish, Chinese
• Ontologies and related resources: Kyoto Ontology, DOLCE, SUMO, FrameNet
• Domain resources: Domain Wordnet, Domain Ontology
• Wikyoto: Term Editor, Doc. Manager, Kybot Editor
• End-user applications: Search App., Browse
• Users: Concept User, Fact User, End User
Data formats: KAF
• Kyoto Annotation Format (Level 1)
  a multi-layered annotation format for:
  – Tokenization and word form segmentation
  – POS tagging
  – Lemmatization and term extraction
  – Constituency tagging
  – Dependency tagging
ENG-3.0-107695012-N
Semantic Annotation
• Semantic Annotation Format for:
  – Named Entity Recognition (time, events, quant. …)
  – Word Sense Disambiguation (D-WSD)
  – Semantic Role Labeling (SRL)
no synsets
KAF level 2 (SemKAF)
ENG-3.0-107630294-N
Data formats
Level of annotation              Standard format
1. Morpho-syntax annotation      KAF
2. Semantic annotation           KAF
3. Terms representation          TMF
4. Facts annotation              KAF
5. Wordnets                      LMF
6. Ontologies                    OWL
KAF annotation : words
<text>
  <wf wid="w1" sent="1" para="1">Tropical</wf>
  <wf wid="w2" sent="1" para="1">terrestrial</wf>
  <wf wid="w3" sent="1" para="1">species</wf>
  <wf wid="w4" sent="1" para="1">populations</wf>
  <wf wid="w5" sent="1" para="1">declined</wf>
  <wf wid="w6" sent="1" para="1">by</wf>
  <wf wid="w7" sent="1" para="1">55</wf>
  <wf wid="w8" sent="1" para="1">per</wf>
  <wf wid="w9" sent="1" para="1">cent</wf>
  <wf wid="w10" sent="1" para="1">on</wf>
  <wf wid="w11" sent="1" para="1">average</wf>
  <wf wid="w12" sent="1" para="1">from</wf>
  <wf wid="w13" sent="1" para="1">1970</wf>
  <wf wid="w14" sent="1" para="1">to</wf>
  <wf wid="w15" sent="1" para="1">2003</wf>
</text>

Tropical terrestrial species populations declined by 55 per cent on average from 1970 to 2003.
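As a rough illustration of how a consumer might read the word-form layer, the sketch below parses a shortened copy of the `<text>` element with Python's standard `xml.etree.ElementTree`. The function name `read_word_forms` is an assumption made for this example and is not part of any KYOTO tool.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the <text> layer from the slide (first five word forms).
KAF_TEXT = """
<text>
  <wf wid="w1" sent="1" para="1">Tropical</wf>
  <wf wid="w2" sent="1" para="1">terrestrial</wf>
  <wf wid="w3" sent="1" para="1">species</wf>
  <wf wid="w4" sent="1" para="1">populations</wf>
  <wf wid="w5" sent="1" para="1">declined</wf>
</text>
"""


def read_word_forms(text_layer_xml):
    """Return (wid, sentence number, paragraph number, token) for each <wf>."""
    root = ET.fromstring(text_layer_xml.strip())
    return [
        (wf.get("wid"), int(wf.get("sent")), int(wf.get("para")), wf.text)
        for wf in root.iter("wf")
    ]


word_forms = read_word_forms(KAF_TEXT)
# Rebuild the surface string by joining the tokens in document order.
surface = " ".join(token for _, _, _, token in word_forms)
print(surface)
```

The word identifiers (`wid`) are what the higher layers point back to, so keeping them alongside the tokens is the useful part of this step.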
KAF annotation : terms
<term tid="t5" type="open" lemma="decline" pos="V">
  <spans>
    <target id="w5"/>
  </spans>
</term>
<term tid="t7" type="open" lemma="55 per cent" pos="N">
  <spans>
    <target id="w7"/>
    <target id="w8"/>
    <target id="w9"/>
  </spans>
</term>

Tropical terrestrial species populations declined by 55 per cent on average from 1970 to 2003.
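Terms refer to word forms by identifier, so a term's surface text can be recovered by following its `<target>` elements. The sketch below shows one way to do that. The wrapping `<KAF>` and `<terms>` elements and the helper name `resolve_terms` are assumptions made so the fragment parses as a single document.

```python
import xml.etree.ElementTree as ET

# Word forms and terms from the slides; the <KAF> and <terms> wrappers are
# assumed here so that the fragment has a single root element.
KAF_FRAGMENT = """
<KAF>
  <text>
    <wf wid="w5" sent="1" para="1">declined</wf>
    <wf wid="w6" sent="1" para="1">by</wf>
    <wf wid="w7" sent="1" para="1">55</wf>
    <wf wid="w8" sent="1" para="1">per</wf>
    <wf wid="w9" sent="1" para="1">cent</wf>
  </text>
  <terms>
    <term tid="t5" type="open" lemma="decline" pos="V">
      <spans><target id="w5"/></spans>
    </term>
    <term tid="t7" type="open" lemma="55 per cent" pos="N">
      <spans><target id="w7"/><target id="w8"/><target id="w9"/></spans>
    </term>
  </terms>
</KAF>
"""


def resolve_terms(kaf_xml):
    """Map each term id to (lemma, pos, surface text of the spanned word forms)."""
    root = ET.fromstring(kaf_xml.strip())
    tokens = {wf.get("wid"): wf.text for wf in root.iter("wf")}
    resolved = {}
    for term in root.iter("term"):
        wids = [target.get("id") for target in term.iter("target")]
        surface = " ".join(tokens[wid] for wid in wids)
        resolved[term.get("tid")] = (term.get("lemma"), term.get("pos"), surface)
    return resolved


terms = resolve_terms(KAF_FRAGMENT)
for tid, (lemma, pos, surface) in terms.items():
    print(tid, lemma, pos, surface)
```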
KAF annotation : constituents
<chunks>
  <!-- terrestrial species -->
  <chunk cid="2" head="t3" phrase="NP">
    <spans>
      <target id="t2"/>
      <target id="t3"/>
    </spans>
  </chunk>
  <!-- terrestrial species populations -->
  <chunk cid="3" head="t4" phrase="NP">
    <spans>
      <target id="t2"/>
      <target id="t3"/>
      <target id="t4"/>
    </spans>
  </chunk>
  <!-- Tropical terrestrial species -->
  <chunk cid="4" head="t3" phrase="NP">
    <spans>
      <target id="t1"/>
      <target id="t2"/>
      <target id="t3"/>
    </spans>
  </chunk>
</chunks>

Tropical terrestrial species populations declined by 55 per cent on average from 1970 to 2003.
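Chunks point at terms, and terms point at word forms, so resolving a chunk takes two hops. The following sketch assumes that terms t1 to t4 each span the single word form with the same number (the slides only show this for t4) and wraps the layers in assumed `<KAF>` and `<terms>` elements. `resolve_chunks` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: t1..t4 each span exactly one word form (w1..w4).
KAF_FRAGMENT = """
<KAF>
  <text>
    <wf wid="w1" sent="1" para="1">Tropical</wf>
    <wf wid="w2" sent="1" para="1">terrestrial</wf>
    <wf wid="w3" sent="1" para="1">species</wf>
    <wf wid="w4" sent="1" para="1">populations</wf>
  </text>
  <terms>
    <term tid="t1" lemma="tropical" pos="G"><spans><target id="w1"/></spans></term>
    <term tid="t2" lemma="terrestrial" pos="G"><spans><target id="w2"/></spans></term>
    <term tid="t3" lemma="species" pos="N"><spans><target id="w3"/></spans></term>
    <term tid="t4" lemma="population" pos="N"><spans><target id="w4"/></spans></term>
  </terms>
  <chunks>
    <chunk cid="2" head="t3" phrase="NP">
      <spans><target id="t2"/><target id="t3"/></spans>
    </chunk>
    <chunk cid="3" head="t4" phrase="NP">
      <spans><target id="t2"/><target id="t3"/><target id="t4"/></spans>
    </chunk>
    <chunk cid="4" head="t3" phrase="NP">
      <spans><target id="t1"/><target id="t2"/><target id="t3"/></spans>
    </chunk>
  </chunks>
</KAF>
"""


def resolve_chunks(kaf_xml):
    """Map each chunk id to (phrase type, head lemma, surface text)."""
    root = ET.fromstring(kaf_xml.strip())
    tokens = {wf.get("wid"): wf.text for wf in root.iter("wf")}
    term_lemma = {}
    term_words = {}
    for term in root.iter("term"):
        tid = term.get("tid")
        term_lemma[tid] = term.get("lemma")
        term_words[tid] = [tokens[t.get("id")] for t in term.iter("target")]
    chunks = {}
    for chunk in root.iter("chunk"):
        words = []
        for target in chunk.iter("target"):  # chunk targets are term ids
            words.extend(term_words[target.get("id")])
        head_lemma = term_lemma[chunk.get("head")]
        chunks[chunk.get("cid")] = (chunk.get("phrase"), head_lemma, " ".join(words))
    return chunks


chunks = resolve_chunks(KAF_FRAGMENT)
for cid, info in chunks.items():
    print(cid, info)
```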
KAF annotation : dependencies
<deps>
  <dep from="t4" to="t5" rfunc="subj"/>
  <dep from="t4" to="t1" rfunc="mod"/>
  <dep from="t4" to="t2" rfunc="mod"/>
  <dep from="t4" to="t3" rfunc="mod"/>
</deps>

<term tid="t1" type="open" lemma="tropical" pos="G"> ..
<term tid="t2" type="open" lemma="terrestrial" pos="G"> ..
<term tid="t3" type="open" lemma="species" pos="N"> ..
<term tid="t4" type="open" lemma="population" pos="N"> ..
<term tid="t5" type="open" lemma="decline" pos="V"> ..

Tropical terrestrial species populations declined by 55 per cent on average from 1970 to 2003.
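The dependency layer links term identifiers, so it becomes readable once the identifiers are replaced by lemmas. A minimal sketch follows. It assumes a wrapping `<KAF>` element and a `<terms>` container, omits the term spans that the slide elides, and uses the hypothetical helper name `read_dependencies`.

```python
import xml.etree.ElementTree as ET

# Terms are reduced to the attributes shown on the slide; their spans are
# elided there and omitted here. The <KAF>/<terms> wrappers are assumed.
KAF_FRAGMENT = """
<KAF>
  <terms>
    <term tid="t1" type="open" lemma="tropical" pos="G"/>
    <term tid="t2" type="open" lemma="terrestrial" pos="G"/>
    <term tid="t3" type="open" lemma="species" pos="N"/>
    <term tid="t4" type="open" lemma="population" pos="N"/>
    <term tid="t5" type="open" lemma="decline" pos="V"/>
  </terms>
  <deps>
    <dep from="t4" to="t5" rfunc="subj"/>
    <dep from="t4" to="t1" rfunc="mod"/>
    <dep from="t4" to="t2" rfunc="mod"/>
    <dep from="t4" to="t3" rfunc="mod"/>
  </deps>
</KAF>
"""


def read_dependencies(kaf_xml):
    """Return (rfunc, lemma of 'from' term, lemma of 'to' term) for each <dep>."""
    root = ET.fromstring(kaf_xml.strip())
    lemma = {term.get("tid"): term.get("lemma") for term in root.iter("term")}
    return [
        (dep.get("rfunc"), lemma[dep.get("from")], lemma[dep.get("to")])
        for dep in root.iter("dep")
    ]


relations = read_dependencies(KAF_FRAGMENT)
for rfunc, source, target in relations:
    print(f"{rfunc}({source}, {target})")
```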
KAF annotation: WSD
<term tid="t4" type="open" lemma="population" pos="N">
  <spans> <target id="w4"/> </spans>
  <senseAlt>
    <sense sensecode="EN-17-00861095-n"/>
    <sense sensecode="EN-17-00859568-n"/>
    .......

<term tid="t4" type="open" lemma="population" pos="N">
  <spans> <target id="w4"/> </spans>
  <senseAlt>
    <sense sensecode="EN-17-00859568-n" confidence="0.80"/>
    <sense sensecode="EN-17-00257849-n" confidence="0.13"/>
    <sense sensecode="EN-17-00962397-n" confidence="0.07"/>
  </senseAlt>
</term>
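Once WSD has attached confidences to the sense alternatives, a consumer can rank them. The sketch below picks the highest-scoring sense for the term `population`. It is an illustration only, and the function name `ranked_senses` is an assumption.

```python
import xml.etree.ElementTree as ET

# The disambiguated term from the slide, with the confidence attributes
# written as well-formed quoted values.
KAF_TERM = """
<term tid="t4" type="open" lemma="population" pos="N">
  <spans> <target id="w4"/> </spans>
  <senseAlt>
    <sense sensecode="EN-17-00859568-n" confidence="0.80"/>
    <sense sensecode="EN-17-00257849-n" confidence="0.13"/>
    <sense sensecode="EN-17-00962397-n" confidence="0.07"/>
  </senseAlt>
</term>
"""


def ranked_senses(term_xml):
    """Return (sensecode, confidence) pairs sorted by descending confidence."""
    term = ET.fromstring(term_xml.strip())
    senses = [
        (sense.get("sensecode"), float(sense.get("confidence")))
        for sense in term.iter("sense")
    ]
    return sorted(senses, key=lambda pair: pair[1], reverse=True)


senses = ranked_senses(KAF_TERM)
best_code, best_confidence = senses[0]
print(best_code, best_confidence)
```

Keeping the full ranked list, rather than only the top sense, leaves room for downstream components to apply their own confidence threshold.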
Kyoto open-ness
– The kernel of the system.
– Core components available as Open Source
– Integrating existing resources
– Usable by anybody in the 7 Kyoto languages
– Fast delivery: beta available at M12 for several components (Capture Server, LPs, Tybot server, Wikyoto …)
– Third-party resources as plug-ins
– Third-party (open source) linguistic processors
– New languages
– Search Interface
– Fact Alert System - News Monitoring System
Thanks