12
Using Semantic Mapping to Manage Heterogeneity in XLIFF Interoperability by Dave Lewis, Rob Brennan, Alan Meehan, Declan O’Sullivan CNGL Centre for Global Intelligent Content at Trinity College Dublin

Using Semantic Mapping to Manage Heterogeneity in XLIFF Interoperability by Dave Lewis, Rob Brennan, Alan Meehan, Declan O’Sullivan CNGL Centre for Global

Embed Size (px)

Citation preview

Page 1: Using Semantic Mapping to Manage Heterogeneity in XLIFF Interoperability by Dave Lewis, Rob Brennan, Alan Meehan, Declan O’Sullivan CNGL Centre for Global

Using Semantic Mapping to Manage Heterogeneity in

XLIFF Interoperabilityby

Dave Lewis, Rob Brennan, Alan Meehan, Declan O’Sullivan

CNGL Centre for Global Intelligent Content at Trinity College Dublin

Page 2: Using Semantic Mapping to Manage Heterogeneity in XLIFF Interoperability by Dave Lewis, Rob Brennan, Alan Meehan, Declan O’Sullivan CNGL Centre for Global

Outline

• Localization industry – interoperability issues

• Linked Data representation of localization content• Still has interoperability issues

• Language Technology retraining workflow - use case

• Our mapping representation

• Evaluation

• Conclusions

Page 3: Using Semantic Mapping to Manage Heterogeneity in XLIFF Interoperability by Dave Lewis, Rob Brennan, Alan Meehan, Declan O’Sullivan CNGL Centre for Global

Localization IndustryDocument Store

Extract & Segment

Named EntityRecognition

Identify terms and translation

Prioritise PE based on QE

Post edit

Machine Translate

HTML source

Annotated XLIFF

source

Src XLIFF + glossary

Src/Tgt XLIFF

Prioritised XLIFF

PE‘d XLIFF

XLIFF source

Translation Workflow

Page 4: Using Semantic Mapping to Manage Heterogeneity in XLIFF Interoperability by Dave Lewis, Rob Brennan, Alan Meehan, Declan O’Sullivan CNGL Centre for Global

Linked Data Representation – L3 Data

Document Store

Triple Store

Extract & Segment

Named EntityRecognition

Identify terms and translation

Prioritise PE based on QE

Post edit

Machine Translate

HTML source

Annotated XLIFF

source

Src XLIFF + glossary

Src/Tgt XLIFF

Prioritised XLIFF

PE‘d XLIFF

L3 data

XLIFF source

Translation Workflow

XSLT Mapper

Page 5: Using Semantic Mapping to Manage Heterogeneity in XLIFF Interoperability by Dave Lewis, Rob Brennan, Alan Meehan, Declan O’Sullivan CNGL Centre for Global

LT Retraining WorkflowDocument Store

Triple Store

Extract & Segment

Named EntityRecognition

Identify terms and translation

Prioritise PE based on QE

Post edit

Machine Translate

HTML source

Annotated XLIFF

source

Src XLIFF + glossary

Src/Tgt XLIFF

Prioritised XLIFF

PE‘d XLIFF

L3 data (GLOBIC)

New training

data

Train & deploy MT Tool

(GLOBIC unaware)

Analyse and select

Retrain?

XLIFF source

Retraining Workflow

Translation Workflow

Mapping (GLOBIC to ITS)L3 data (ITS)

XSLT Mapper

Page 6: Using Semantic Mapping to Manage Heterogeneity in XLIFF Interoperability by Dave Lewis, Rob Brennan, Alan Meehan, Declan O’Sullivan CNGL Centre for Global

Architecture Diagram of the Process

Triple Store

ApplicationSPARQL

processor

SPIN API

1. Application search for resources in the Triple Store2. None in application’s vocabulary, search for mappings3. If mappings exist, then retrieve the SPIN representation4. Convert the SPIN representation to SPARQL syntax via a call to the

SPIN API5. Execute the SPARQL query via the SPARQL processor6. Consume the newly created data

Page 7: Using Semantic Mapping to Manage Heterogeneity in XLIFF Interoperability by Dave Lewis, Rob Brennan, Alan Meehan, Declan O’Sullivan CNGL Centre for Global

Mapping Requirements

1. A mapping entity must be expressed as RDF, with a unique URI, allowing it to be published as Linked Data

2. The executable statement must be a SPARQL query

3. The executable statement must be expressed as RDF and linked to a mapping entity

4. A mapping entity is to be modeled with associated meta-data

Page 8: Using Semantic Mapping to Manage Heterogeneity in XLIFF Interoperability by Dave Lewis, Rob Brennan, Alan Meehan, Declan O’Sullivan CNGL Centre for Global

Meta-data and SPIN• Meta-data properties from the GLOBIC and W3C PROV vocabularies:

gic:wasCreatedBy, gic:mapDescription, prov:generatedAtTime, prov:wasRevisionOf

• SPIN vocabulary to express SPARQL queries as RDF:

SELECT ?subject ?predicate ?objectWHERE { ?subject ?predicate ?object }

[] a sp:Select ; sp:templates ([ sp:object _:b1 ; sp:predicate _:b2 ; sp:subject _:b3 ]); sp:where ([ sp:object _:b1 ; sp:predicate _:b2 ; sp:subject _:b3 ]). _:b3 sp:varName “subject”^^xsd:string ._:b2 sp:varName “predicate"^^xsd:string . _:b1 sp:varName “object"^^xsd:string .

SPARQL Query

SPIN Representation

Page 9: Using Semantic Mapping to Manage Heterogeneity in XLIFF Interoperability by Dave Lewis, Rob Brennan, Alan Meehan, Declan O’Sullivan CNGL Centre for Global

Mapping Representation Example

ex:globic_to_its_mtScore_map_1_1 a gic:Mapping ; gic:hasRepresentation ex:globic_to_its_mtScore_sp_2 ; gic:wasCreatedBy ex:person_1 ; prov:generatedAtTime “2014-01-01”^^xsd:date ; gic:mapDescription “Used to map MT confidence data from ------------------------------------ GLOBIC to ITS vobabulary” ; gic:version “1.1”^^xsd:float ; prov:wasRevisionOf ex:globic_to_its_mtScore_map_1 .

ex:globic_to_its_mtScore_sp_2 a sp:Construct ; sp:templates ([ sp:object _:b1 ; sp:predicate itsrdf:mtConfidence ; sp:subject _:b2 ]) ; sp:where ([ sp:object _:b1 ; sp:predicate gic:qualityAssessment ; sp:subject _:b2 ]) .

_:b2 sp:varName "s"^^xsd:string . _:b1 sp:varName "val"^^xsd:string .

Mapping Entity + Meta-data SPIN Representation of SPARQL Query

Page 10: Using Semantic Mapping to Manage Heterogeneity in XLIFF Interoperability by Dave Lewis, Rob Brennan, Alan Meehan, Declan O’Sullivan CNGL Centre for Global

Evaluation

• Two initial experiments:

1. Test the mapping capabilities of SPARQL construct queries• R2R Framework – 70* test mappings• Reproduced R2R Evaluation• R2R test mappings as SPARQL construct queries• Compared results – SPARQL construct queries as expressive as R2R Framework

2. Test the expressiveness of SPIN vocabulary with regard to expressing SPARQL construct queries as RDF• Carried out using online SPIN RDF Converter and TopBraid composer• Input the SPARQL construct queries from first evaluation• SPIN could represent all queries in RDF• Suitable vocabulary to use

Page 11: Using Semantic Mapping to Manage Heterogeneity in XLIFF Interoperability by Dave Lewis, Rob Brennan, Alan Meehan, Declan O’Sullivan CNGL Centre for Global

Conclusions

• Mapping representation to increase interoperability within heterogeneous workflows

• All aspects of mapping representation published as Linked Data

• Discovery of the mappings through SPARQL queries - ultimately executed through SPARQL processor

• Evaluation – Capabilities of SPARQL construct queries and expressiveness of SPIN

• Not just relevant to localization workflows, useful in other Linked Data scenarios

Page 12: Using Semantic Mapping to Manage Heterogeneity in XLIFF Interoperability by Dave Lewis, Rob Brennan, Alan Meehan, Declan O’Sullivan CNGL Centre for Global

Thank You

Questions?