CRICOS No. 00213J XLIFF in the Localisation of Open Source Software One step forward, two steps back? Asgeir Frimannsson <[email protected]>

XLIFF in the Localisation of

Open Source Software

One step forward, two steps back?

Asgeir Frimannsson <[email protected]>

Overview

• Motivation

• Background

– GNU Gettext and XLIFF

• One step forward…

– XLIFF in Open Source Localisation

• …And two steps back?

– Tool Support

– Challenges in Open Source Localisation

– Where do we go from here?

• Questions?

Motivation

Open Source Software

Open Source Localisation

• Open Source What?

– Localisation of Open Source Software?

– Localisation using Open Source Tools?

                      Open Source Tools    Proprietary Tools
Proprietary Project   ☺                    Ask someone else
Open Source Project   ☺                    Not Interested

Gnome 2.18 (Latest Release)

• User Interface

– 36 000 Translation Units

– 46 Languages >85% Translated

– 62 Languages >50% Translated

– 170 Language Teams

• Documentation

– 23 000 Translation Units

– 2 Languages >85% Translated

– 4 Languages >50% Translated

KDE 3.x (Latest Release)

• User Interface

– 107 000 Translation Units

– 27 Languages >85% Translated

– 47 Languages >50% Translated

– 107 Language Teams

• Documentation

– 68 000 Translation Units

– 8 Languages >85% Translated

– 13 Languages >50% Translated

Not just statistics…

• Some teams focus on specific applications or software distributions

• E.g. the KhmerOS initiative has been targeting the OpenSUSE Linux distribution:

[ From: http://i18n.opensuse.org/stats/ ]

Language Initiatives

Achinese | Afar | Afrikaans | Akan | Albanian | Amharic | Arabic | Aragonese | Armenian | Assamese |

Asturian | Azerbaijani | Basque | Belarusian | Bengali | Berber (Other) | Blin; Bilin | Bosnian | Brazilian

Portuguese | Breton | Bulgarian | Buriat | Burmese | Catalan | Cebuano | Chinese (Hong Kong) |

Cornish | Corsican | Croatian | Czech | Danish | Divehi | Dutch | Dzongkha | English | English

(Australia) | English (Canada) | English (United Kingdom) | Esperanto | Estonian | Faroese | Filipino |

Finnish | French | Frisian | Friulian | Gaelic; Scottish | Galician | Ganda | Georgian | German |

German, Low | Greek | Greenlandic (Kalaallisut) | Guarani | Gujarati | Haitian; Haitian Creole | Hausa

| Hawaiian | Hebrew | Hiligaynon | Hindi | Hungarian | Icelandic | Indonesian | Interlingua | Inuktitut

| Irish | Italian | Japanese | Javanese | Kabyle | Kannada | Kashubian | Kazakh | Khmer | Kinyarwanda

| Kirghiz | Klingon; tlhIngan-Hol | Konkani | Korean | Kurdish | Kurdish (Sorani) | Lao | Latin | Latvian |

Limburgian | Lingala | Lithuanian | Lojban | Lower Sorbian | Luxembourgish | Macedonian | Malagasy

| Malay | Malayalam | Maltese | Manx | Maori | Marathi | Mongolian | Navaho | Ndebele, South |

Neapolitan | Nepali | Northern Sami | Norwegian Bokmal | Norwegian Nynorsk | Occitan (post 1500) |

Oriya | Oromo | Pampanga | Papiamento | Persian | Polish | Portuguese | Punjabi | Pushto | Quechua

| Raeto-Romance | Romanian | Romany | Russian | Sanskrit | Sardinian | Scots | Serbian | Sidamo |

Simplified Chinese | Sindhi | Sinhalese | Slovak | Slovenian | Somali | Sotho, Northern | Sotho,

Southern | Spanish | Swahili | Swati | Swedish | Syriac | Tagalog | Tajik | Tamashek | Tamil | Tatar |

Telugu | Tetum | Thai | Tibetan | Tigre | Tigrinya | Traditional Chinese | Tsonga | Tswana | Turkish |

Turkmen | Uighur | Ukrainian | Urdu | Uzbek | Venda | Vietnamese | Walamo | Walloon | Welsh |

Wolof | Xhosa | Yiddish | Yoruba | Zulu

[ From: https://translations.launchpad.net/ubuntu/feisty/+translations ]

Motivating Factors

• Community-driven Translation

• Enabling end-users to contribute to the

localisation process

– Domain experts

– Knowledge of Language and Culture

• “Crowdsourcing” of translations

• Not strictly confined to open source software

– Google, Microsoft

Motivating Factors

• Allowing Users to embrace software without letting

go of their language and cultural identity

– Translation Initiatives driven by native language speakers

– E.g. KhmerOS, Translate.org.za

• Technology enabling community-driven Localisation

of Software and E-content

– Localisation Tools

– Processes

– Enabling Technologies

Background

GNU Gettext

• De facto standard for i18n support in GNU

based open source applications

• Based around two file formats:

– Portable Object (PO) – Bi-lingual String Table

containing original extracted (English US) strings

and translations

– Machine Object (MO) – Binary representation of

String table for retrieving strings at run-time.
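To make the MO side concrete, here is a minimal sketch in Python of writing a tiny catalogue in the MO binary layout and reading it back with the standard library `gettext` module. The helper `build_mo` is hypothetical and heavily simplified (no hash table, no plural forms); in a real project the MO file would be produced by Gettext's `msgfmt` tool.

```python
import gettext
import io
import struct

def build_mo(messages):
    """Serialise a {msgid: msgstr} dict into a little-endian MO byte string.

    Simplified sketch: singular entries only, no hash table.
    """
    ids = b""
    strs = b""
    offsets = []
    for key in sorted(messages):
        k = key.encode("utf-8")
        v = messages[key].encode("utf-8")
        offsets.append((len(ids), len(k), len(strs), len(v)))
        ids += k + b"\0"
        strs += v + b"\0"
    count = len(offsets)
    key_start = 7 * 4 + 16 * count      # header plus two offset tables
    value_start = key_start + len(ids)
    # magic, version, string count, original table offset,
    # translation table offset, hash table size, hash table offset
    out = struct.pack("<7I", 0x950412DE, 0, count, 7 * 4, 7 * 4 + 8 * count, 0, 0)
    for id_off, id_len, _, _ in offsets:
        out += struct.pack("<2I", id_len, key_start + id_off)
    for _, _, str_off, str_len in offsets:
        out += struct.pack("<2I", str_len, value_start + str_off)
    return out + ids + strs

catalog = {
    # The empty msgid carries the header; the charset is needed for decoding.
    "": "Content-Type: text/plain; charset=UTF-8\n",
    "Open Catalog..": "Åpne Katalog..",
}
translations = gettext.GNUTranslations(io.BytesIO(build_mo(catalog)))
print(translations.gettext("Open Catalog.."))
```

Strings that are not in the catalogue are returned unchanged, which is why untranslated applications still display the original English text.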

Gettext Overview

[Figure: Gettext workflow diagram. Surviving step labels: 3. Translate; 4. Convert to Machine Object (MO) file; 5. Retrieve Messages in the Application at runtime.]

Gettext PO – Overview

# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
msgid ""
msgstr ""
"Project-Id-Version: Project Name and Version\n"
"PO-Revision-Date: YYYY-MM-DD HH:MM+ZZZZ\n"
"POT-Creation-Date: YYYY-MM-DD HH:MM+ZZZZ\n"
"Language-Team: Language Team <email@addr>\n"
"Last-Translator: Translator Name <email@addr>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

# translator-comments
#. extracted-comments
#: filename:linenumber
#, flag...
msgid untranslated-string
msgstr translated-string

Callouts: the leading # lines are Comments; the empty msgid with its msgstr is the Header; blank lines (White-space) separate entries; each following entry is a Translation Unit, and its # lines are the Segment Meta data.

Gettext PO – Header

# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
msgid ""
msgstr ""
"Project-Id-Version: Project Name and Version\n"
"Report-Msgid-Bugs-To: Name <email@addr>\n"
"Language-Team: Language Team <email@addr>\n"
"Last-Translator: Translator Name <email@addr>\n"
"PO-Revision-Date: YYYY-MM-DD HH:MM+ZZZZ\n"
"POT-Creation-Date: YYYY-MM-DD HH:MM+ZZZZ\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n != 1)\n"
"X-User-Defined-Variable: value\n"

Callouts: Comments (the # lines); Informative meta data (Project-Id-Version to POT-Creation-Date); Technical Meta-data (MIME-Version to Plural-Forms); Custom fields (X-User-Defined-Variable).
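The header is itself a small key/value table stored in the msgstr of the empty msgid. A minimal sketch in Python of reading it, assuming one "Key: value" pair per line; `parse_po_header` is a hypothetical helper and not part of Gettext.

```python
def parse_po_header(msgstr):
    """Split the header msgstr into a dict of field name -> value."""
    fields = {}
    for line in msgstr.split("\n"):
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields

HEADER = (
    "Project-Id-Version: Project Name and Version\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Plural-Forms: nplurals=2; plural=(n != 1)\n"
    "X-User-Defined-Variable: value\n"
)

fields = parse_po_header(HEADER)
# Technical fields such as the charset drive how the rest of the file is read.
charset = fields["Content-Type"].split("charset=")[1]
print(charset, "|", fields["Plural-Forms"])
```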

Gettext PO – Translation Units

# translator-comments
#. extracted-comments
#: filename:linenumber
#, flag...
msgid untranslated-string
msgstr translated-string

# Not sure if 'Katalog' is the right word to use
#. Menu entry, as in File->open..
#: example.c:23
#, fuzzy
msgid "Open Catalog.."
msgstr "Åpne Katalog.."

In example.c:

22 /* Menu entry, as in File->open.. */
23 gui_set_text(menuitem, gettext("Open Catalog.."));
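The comment prefixes can be told apart mechanically, which is what PO tooling relies on. A minimal sketch in Python, assuming single-line msgid and msgstr values and no escape handling; `parse_po_entry` is a hypothetical name, not a Gettext API.

```python
def parse_po_entry(text):
    """Parse one simple PO entry into a dict (illustrative only)."""
    entry = {"translator_comments": [], "extracted_comments": [],
             "references": [], "flags": [], "msgid": None, "msgstr": None}
    for line in text.strip().splitlines():
        line = line.strip()
        if line.startswith("#."):        # comment extracted from the source code
            entry["extracted_comments"].append(line[2:].strip())
        elif line.startswith("#:"):      # filename:linenumber references
            entry["references"].extend(line[2:].split())
        elif line.startswith("#,"):      # flags such as fuzzy or c-format
            entry["flags"].extend(f.strip() for f in line[2:].split(","))
        elif line.startswith("#"):       # free-form translator comment
            entry["translator_comments"].append(line[1:].strip())
        elif line.startswith("msgid "):
            entry["msgid"] = line[len("msgid "):].strip().strip('"')
        elif line.startswith("msgstr "):
            entry["msgstr"] = line[len("msgstr "):].strip().strip('"')
    return entry

SAMPLE = '''
# Not sure if 'Katalog' is the right word to use
#. Menu entry, as in File->open..
#: example.c:23
#, fuzzy
msgid "Open Catalog.."
msgstr "Åpne Katalog.."
'''

entry = parse_po_entry(SAMPLE)
print(entry["references"], entry["flags"], entry["msgstr"])
```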

Gettext PO – Plural Forms

# translator-comments
#. extracted-comments
#: filename:linenumber
#, flag...
msgid untranslated-string
msgid_plural untranslated-string-plural
msgstr[0] translated-string-case-0
msgstr[1] translated-string-case-1
...
msgstr[n] translated-string-case-n

Polish Example:

msgid "%d file"
msgid_plural "%d files"
msgstr[0] "%d plik"
msgstr[1] "%d pliki"
msgstr[2] "%d plików"
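Which msgstr[] index is used for a given count is decided by the Plural-Forms expression in the header. The slide does not show the Polish expression; the sketch below assumes the rule commonly published for Polish (nplurals=3; plural = n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2) and writes it out in Python. The function names are hypothetical.

```python
def polish_plural_index(n):
    """Return the msgstr[] index for count n (assumed Polish rule)."""
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2

MSGSTR = ["%d plik", "%d pliki", "%d plików"]

def format_file_count(n):
    """Pick the plural form for n and substitute the count."""
    return MSGSTR[polish_plural_index(n)] % n

for n in (1, 2, 5, 12, 22):
    print(format_file_count(n))
```

The counts 12 and 22 show why a two-way singular/plural split is not enough: they end in the same digit but take different forms.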

Recent Updates to the PO format

• Preserve changes in the source string

• Disambiguate by adding context.

# translator-comments
#. extracted-comments
#: filename:linenumber
#, flag...
#| msgctxt previous-msgctxt
#| msgid previous-msgid
#| msgid_plural previous-msgid-plural
msgctxt message-context
msgid untranslated-string
msgid_plural untranslated-string-plural
msgstr[0] translated-string-case-0
msgstr[1] translated-string-case-1
...
msgstr[n] translated-string-case-n

Limitations of PO

• Limited support for meta-data

• Very limited pre-translation support

• No support for binary content such as images and icons

• No segmentation & alignment support

• PO is a simple string table format, and not fit for paragraph-based text and inline elements

– PO is exploited for translation of XML-based formats such as Docbook and SVG

• So we need a replacement…

XLIFF Overview

• XML Localisation Interchange File Format

“…A specification for the lossless interchange of localizable data and its related information, which is tool-neutral, has been formalized as an XML vocabulary, and features an extensibility mechanism.”

[XLIFF FAQ]

XLIFF Overview

• Extract localisable content to a common file format

• Extract – Localise – Merge

[Figure: Extract, Localise, Merge diagram. Surviving labels: Original Material; Extract (convert); Localised Data (Translation Units).]

XLIFF Overview

<xliff version='1.2'>
  <file original='example.txt' source-language='en' target-language='nb-NO'>
    <header>
      Meta-data on file and localisation process
    </header>
    <body>
      <trans-unit id='#1'>
        <source>Hello World!</source>
        <target>Hei Verden!</target>
        <alt-trans>
          Translation suggestions from TM, MT...
        </alt-trans>
      </trans-unit>
      <group>
        <trans-unit> ... </trans-unit>
      </group>
    </body>
  </file>
</xliff>

Callouts: Header (the <header> element); Body (the <body> element).
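Because XLIFF is plain XML, a generic XML parser is enough to get at the translation units. A minimal sketch in Python using the standard library, based on a cut-down copy of the sample above. Like the slide, it omits the XML namespace; real XLIFF 1.2 files declare `urn:oasis:names:tc:xliff:document:1.2`, and the tag lookups would then need that namespace. `translation_pairs` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

XLIFF = """<xliff version='1.2'>
  <file original='example.txt' source-language='en' target-language='nb-NO'>
    <header/>
    <body>
      <trans-unit id='#1'>
        <source>Hello World!</source>
        <target>Hei Verden!</target>
      </trans-unit>
    </body>
  </file>
</xliff>"""

def translation_pairs(xliff_text):
    """Return (id, source, target) for every trans-unit in the document."""
    root = ET.fromstring(xliff_text)
    pairs = []
    for file_el in root.iter("file"):
        for unit in file_el.iter("trans-unit"):
            pairs.append((unit.get("id"),
                          unit.findtext("source"),
                          unit.findtext("target")))
    return pairs

pairs = translation_pairs(XLIFF)
print(pairs)
```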

XLIFF Overview

• Support for features PO is lacking

– abstraction of inline codes and markup

– advanced context information

• Through <context-group> elements

– Workflows

• Through <phase> elements

– Pre-translation and Translation suggestions

• Through <alt-trans> elements

– Other meta-data

One step forward…

From PO to XLIFF

• A deliverable for XLIFF 1.2 was a set of Representation Guides describing how common file formats could be represented in XLIFF

• Our Goal: To create a standard XLIFF

representation of the PO file format

• The XLIFF Representation Guide for Gettext

PO is part of XLIFF 1.2

The XLIFF Tools Project

• Aimed to develop tools to support XLIFF in Open Source Localisation

– Guide for representing PO in XLIFF

– Input from key people in various Open Source Communities

• Started in January 2005

• Hosted on freedesktop.org

• >200 Messages between January and July 2005

• The XLIFF representation Guide for PO was transferred to the XLIFF TC in July 2005

PO to XLIFF

• A PO file maps to an XLIFF <file> element

• PO Header stored in skeleton or treated as a translation unit

• A PO translation unit maps to an XLIFF <trans-unit> element

– Each plural form also maps to a <trans-unit> element, but contained within a <group> element

• Inline codes such as parameterized strings abstracted when possible

PO Translation Units

<trans-unit id='messages_1' approved='no'>
  <source>untranslated-string</source>
  <target>translated-string</target>
  <note from='po-file'>translator-comments</note>
  <context-group name='po-reference#1' purpose='location'>
    <context context-type='sourcefile'>sourcefile</context>
    <context context-type='linenumber'>linenumber</context>
  </context-group>
  <context-group name='po-entry-header' purpose='information'>
    <context context-type='x-po-autocomment'>extracted-comments</context>
  </context-group>
</trans-unit>
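The mapping above can be sketched as a small conversion function. The Python below builds a <trans-unit> from the fields of one PO entry using the standard library. The function name, its parameters, and the rule that fuzzy or untranslated entries get approved='no' are assumptions made for illustration; the Representation Guide remains the authority on the details.

```python
import xml.etree.ElementTree as ET

def po_entry_to_trans_unit(unit_id, msgid, msgstr, translator_comment=None,
                           source_file=None, line_number=None, fuzzy=False):
    """Build an XLIFF trans-unit element from one PO entry (sketch)."""
    approved = "no" if fuzzy or not msgstr else "yes"   # assumed mapping
    unit = ET.Element("trans-unit", {"id": unit_id, "approved": approved})
    ET.SubElement(unit, "source").text = msgid
    if msgstr:
        ET.SubElement(unit, "target").text = msgstr
    if translator_comment:
        note = ET.SubElement(unit, "note", {"from": "po-file"})
        note.text = translator_comment
    if source_file is not None:
        group = ET.SubElement(unit, "context-group",
                              {"name": "po-reference#1", "purpose": "location"})
        ET.SubElement(group, "context",
                      {"context-type": "sourcefile"}).text = source_file
        if line_number is not None:
            ET.SubElement(group, "context",
                          {"context-type": "linenumber"}).text = str(line_number)
    return unit

unit = po_entry_to_trans_unit(
    "messages_1", "Open Catalog..", "Åpne Katalog..",
    translator_comment="Not sure if 'Katalog' is the right word to use",
    source_file="example.c", line_number=23, fuzzy=True)
print(ET.tostring(unit, encoding="unicode"))
```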

Plural Forms

<group restype='x-gettext-plurals'>
  <trans-unit id='messages_1[0]'>
    <source>untranslated-string-singular</source>
    <target>translated-string-form-0</target>
  </trans-unit>
  <trans-unit id='messages_1[1]'>
    <source>untranslated-string-plural</source>
    <target>translated-string-form-1</target>
  </trans-unit>
  ...
  <trans-unit id='messages_1[n]'>
    <source>untranslated-string-plural</source>
    <target>translated-string-form-n</target>
  </trans-unit>
  ...additional context information...
</group>
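A hedged sketch of generating this group structure in Python: form 0 takes the singular source and every other form takes the plural source, as in the template above. `plural_group` is a hypothetical helper, and the Polish strings are reused from the earlier PO example.

```python
import xml.etree.ElementTree as ET

def plural_group(base_id, msgid, msgid_plural, msgstrs):
    """Wrap one trans-unit per plural form in a <group> element."""
    group = ET.Element("group", {"restype": "x-gettext-plurals"})
    for index, translation in enumerate(msgstrs):
        unit = ET.SubElement(group, "trans-unit",
                             {"id": "%s[%d]" % (base_id, index)})
        # Form 0 carries the singular source; all other forms carry the plural.
        source_text = msgid if index == 0 else msgid_plural
        ET.SubElement(unit, "source").text = source_text
        ET.SubElement(unit, "target").text = translation
    return group

group = plural_group("messages_1", "%d file", "%d files",
                     ["%d plik", "%d pliki", "%d plików"])
print(ET.tostring(group, encoding="unicode"))
```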

PO Header

• PO header stored in skeleton in workflows where variables are not modified by translators

<trans-unit id='messages_1' approved='yes'>
  <source>
    Project-Id-Version: Project Name and Version
    ...
    POT-Creation-Date: YYYY-MM-DD HH:MM+ZZZZ
  </source>
  <target>
    Project-Id-Version: Project Name and Version
    ...
    POT-Creation-Date: YYYY-MM-DD HH:MM+ZZZZ
  </target>
  <note from='po-file'>
    SOME DESCRIPTIVE TITLE.
    Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
  </note>
</trans-unit>

Abstraction of inline codes

PO:

#, c-format
msgid "My name is %s"
msgstr ""

becomes

<trans-unit id='messages_1' approved='no'>
  <source>My name is <ph id='#1' ctype='x-c-param'>%s</ph></source>
</trans-unit>
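A minimal sketch in Python of this abstraction step: each C format specifier in the msgid is wrapped in a <ph> element so an editor can protect it. The regular expression is a deliberate simplification (single-letter conversions only, no flags, widths, or %% handling), and `abstract_c_params` is a hypothetical name. The sketch writes the code type as `ctype`, which is the attribute XLIFF 1.2 defines on <ph>.

```python
import re
import xml.etree.ElementTree as ET

# Simplified pattern: %s, %d and a few other single-letter conversions.
C_PARAM = re.compile(r"%[sdiufxc]")

def abstract_c_params(msgid):
    """Return a <source> element with format specifiers wrapped in <ph>."""
    source = ET.Element("source")
    source.text = ""
    last = None          # most recently created <ph>, if any
    pos = 0
    for count, match in enumerate(C_PARAM.finditer(msgid), start=1):
        text = msgid[pos:match.start()]
        if last is None:
            source.text = text       # text before the first placeholder
        else:
            last.tail = text         # text between two placeholders
        last = ET.SubElement(source, "ph",
                             {"id": "#%d" % count, "ctype": "x-c-param"})
        last.text = match.group(0)
        pos = match.end()
    rest = msgid[pos:]
    if last is None:
        source.text = rest
    else:
        last.tail = rest
    return source

source = abstract_c_params("My name is %s")
print(ET.tostring(source, encoding="unicode"))
```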

XLIFF in Open Source Localisation

Workflows

How do we handle XLIFF-based

localisation within present build

systems and development processes?

Common PO-based Workflow

• Uses the PO file format throughout the localisation process

– PO files stored in Version Control System

• Uses PO Compendiums as Translation Memories

• Translators work with PO editors (like KBabel) or text editors

• Other formats converted to PO for localisation and follow a similar process

1) Optional XLIFF-based Workflow

• PO files optionally converted to XLIFF for

translation in XLIFF-based editors

• Translators can choose to use PO or XLIFF

• PO files still used as persistent file format

• XLIFF meta-data lost on back-conversion to PO
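The loss on back-conversion is easy to see in a sketch. The Python below turns a <trans-unit> back into a PO entry. PO has no slot for <alt-trans> suggestions, so they are simply dropped. The function name, the rule that a unit that is not approved becomes fuzzy, and the alternative translation in the sample are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

UNIT = ET.fromstring("""<trans-unit id='messages_1' approved='no'>
  <source>Open Catalog..</source>
  <target>Åpne Katalog..</target>
  <alt-trans><target>Åpne katalogen..</target></alt-trans>
</trans-unit>""")

def trans_unit_to_po(unit):
    """Serialise a trans-unit as a PO entry, keeping only what PO can hold."""
    lines = []
    if unit.get("approved") != "yes":
        lines.append("#, fuzzy")          # assumed mapping for unapproved units
    lines.append('msgid "%s"' % unit.findtext("source"))
    # Only the direct <target> child is used; <alt-trans> has no PO equivalent.
    lines.append('msgstr "%s"' % (unit.findtext("target") or ""))
    return "\n".join(lines)

po_text = trans_unit_to_po(UNIT)
print(po_text)
```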

2) Native XLIFF-based Workflow

• Eliminate the need for PO in the localisation process

• Convert to XLIFF in the build system

• Store XLIFF files – not PO files, in the repository

• Uses PO in the build systems as an intermediate format

• Needs additional gettext-like tools to merge and initialize XLIFF files
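As an illustration of what such a gettext-like merge tool would do, here is a minimal Python sketch that carries existing translations over to a freshly extracted XLIFF body by exact source-text match. It is only a rough analogue of what msgmerge does for PO files (no fuzzy matching, no obsolete entries), and `merge_translations` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

def merge_translations(old_body, new_body):
    """Copy targets from old_body into new_body where the source text matches.

    Returns the number of trans-units that received a reused translation.
    """
    known = {}
    for unit in old_body.iter("trans-unit"):
        target = unit.find("target")
        if target is not None and target.text:
            known[unit.findtext("source")] = target.text
    reused = 0
    for unit in new_body.iter("trans-unit"):
        translation = known.get(unit.findtext("source"))
        if translation is not None and unit.find("target") is None:
            ET.SubElement(unit, "target").text = translation
            reused += 1
    return reused

OLD = ET.fromstring(
    "<body><trans-unit id='1'><source>Open</source>"
    "<target>Åpne</target></trans-unit></body>")
NEW = ET.fromstring(
    "<body><trans-unit id='1'><source>Open</source></trans-unit>"
    "<trans-unit id='2'><source>Close</source></trans-unit></body>")

reused = merge_translations(OLD, NEW)
print(reused, ET.tostring(NEW, encoding="unicode"))
```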

3) Gettext-integrated XLIFF Workflow

• Same as previous, but:

• XLIFF support implemented within the Gettext

toolkit

• Extract resources directly to XLIFF, eliminating

need for PO

• Only works for GNU Gettext based processes

– What about 3rd party tools?

…and two steps back again?

Status Quo

• Two years have passed since these discussions, and there is little or no uptake of XLIFF in Open Source Localisation processes

• Why is there so little interest in XLIFF?

Status Quo

“I don't expect XLIFF based open-source translation editors in the next 3 years: It took ca. 2 years until KBabel was built, which is so far the only good open-source translation editor. (….) …An editor which not only has to accommodate a hundred of different elements and attributes, but also a configurable GUI around it, is not going to be seen in the open-source world soon.”

Bruno Haible (GNU Gettext maintainer)

xliff-tools mailing list, Feb 2005

Tool Support

Sun’s Open Language Tools

• XLIFF 1.0 Editor

• Integrated “Mini-TM”

• XLIFF converters for

– HTML

– Docbook SGML

– JSP

– XML

– OpenOffice.org

– Plaintext

– Software Messages

The Wordforge Foundation

• Translate.org.za & KhmerOS Collaboration

• The Translate Toolkit

– Converters for e.g. Mozilla and OpenOffice.org formats to PO and XLIFF

– Uses XLIFF and PO as common Resource Containers

– QA Tools

• Pootle

– Web-based Translation Environment

• Pootling

– Rich Client Translation Environment

Wordforge: Pootle

• Web based Translation and Project

Management

Wordforge: Pootling

• XLIFF and PO editor

• Supports TBX glossaries

• Integrated TM

• Uses the Translate Toolkit Internally

• Still in very early development

KBabel & Kaider

• KBabel has been the most advanced Translation Editor for PO

• Part of the KDE project

• No longer maintained

• New tool on the block: Kaider

• TM and TBX support

• XLIFF support on the TODO-list

Okapi & OmegaT

• Use the Okapi Framework for processing files for translation

• Translate using OmegaT

• .NET and Java combination

XLIFF Tool Support

• Tools only implement the basic features of

XLIFF

• In reality, there is not even a single mature

open source XLIFF (1.1 or 1.2) translation

environment for the GNU/Linux platform

Vertical Solutions

• Project-specific Localisation Tools

– KDE: KBabel/Kaider (PO based)

– Mozilla: Mozilla Translator

– GNU/Gnome: GTranslator (PO based)

– Eclipse: Eclipse Babel (planned!)

• Significant challenges in creating a cross-project solution based around XLIFF

Current Challenges

• Little separation between Developer and

Translator

– Need to know Source Control systems like CVS

– No abstraction

• Need to know source formats like C and C++ format

strings

• No protection of inline markup

• Only ad-hoc glossary management

• Only ad-hoc translation reuse

• Wiki-based glossaries at best

• Language-based, not project based

• New Tools starting to consider Terminology

Management

– Kaider and Pootling support TBX based glossaries

– Still only retro-fitted Terminology Management

Where to from here?

What is XLIFF support?

• XLIFF is only the resource container…

• There is a pressing need to build an eco-system of

tools around XLIFF, similar to the rich set of tools

currently existing for the PO format

– Merging, word-count, QA checks…

– XLIFF without the tools support is really two steps back…

• Perhaps a case for an XLIFF ‘reference

implementation’?
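Two of the tools named above, word count and QA checks, are small enough to sketch. The Python below counts source words in an XLIFF body and flags trans-units whose target does not carry the same format placeholders as the source. Both function names are hypothetical, the placeholder pattern is simplified, and the sample body is made up for the example.

```python
import re
import xml.etree.ElementTree as ET

PLACEHOLDER = re.compile(r"%[sdiufxc]")   # simplified C format specifiers

def source_word_count(body):
    """Count whitespace-separated words in all <source> elements."""
    return sum(len("".join(src.itertext()).split())
               for src in body.iter("source"))

def placeholder_mismatches(body):
    """Return ids of trans-units whose target placeholders differ from the source."""
    bad = []
    for unit in body.iter("trans-unit"):
        target_el = unit.find("target")
        if target_el is None:
            continue                      # untranslated units are not checked
        source = "".join(unit.find("source").itertext())
        target = "".join(target_el.itertext())
        if sorted(PLACEHOLDER.findall(source)) != sorted(PLACEHOLDER.findall(target)):
            bad.append(unit.get("id"))
    return bad

BODY = ET.fromstring("""<body>
  <trans-unit id='u1'><source>Hello World!</source><target>Hei Verden!</target></trans-unit>
  <trans-unit id='u2'><source>%d files</source><target>filer</target></trans-unit>
</body>""")

words = source_word_count(BODY)
problems = placeholder_mismatches(BODY)
print(words, problems)
```

Using `itertext()` means text inside inline elements such as <ph> is still counted and checked.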

Towards an XLIFF Standard

• The XLIFF 1.2 Specification is a significant improvement over previous versions

– Less ambiguity

– More consistent

– Support for segmentation

– Representation Guides for HTML, Java, PO

Towards an XLIFF Standard

• Some challenges with the current standard

– “Poor” white-space handling

– Canonicalisation

• XML representation

• Resource representation

• Tool processing

– Complexity and Separation of Concerns

• XLIFF 2.0 should address these issues

What we are doing at QUT with XLIFF

• Using XLIFF extensively in our research

• Java-based API for manipulating XLIFF

– Thin layer upon the XOM (XML object model) library

– Every element is an object (File, Group, TransUnit …)

– XPath, XSLT support

– Property change support

• Converter for PO

• Will be made available as open source “at some

point, or sooner if you bug me”™

Thank You!

• James M. Hogan (QUT)

• XLIFF Tools Project Contributors

• Red Hat team, Brisbane

• Dwayne Bailey (translate.org.za & Wordforge)

Questions / Discussion