2003-11-20 | Conférence de l’ACCTI Conference © 2003 NMi Consulting
Selecting a Computer-Aided Translation (CAT) tool for XML content — A cookbook —
Normand Montour – Principal Consultant
Selecting a Computer-Aided Translation (CAT) tool for XML content | A cookbook | © 2003 NMi Consulting
Agenda
§ Introduction
§ What is XML? (How much XML do I need to know?)
§ Approach to selecting your CAT tool for XML content
§ Conclusion
Introduction
§ Warning!
– This method was used in customer situations
– Its scope covers the selection of CAT tools to translate and manage translation of XML-based content
– It is not a general-purpose method for evaluating CAT tools
– Additional XML-related standards are continually emerging; ensure that your deployment is aligned with the latest trends.
Introduction
§ Context
– Web-oriented (Internet/Intranet/Extranet) data constitutes a much larger (and fast-growing) part of each organization’s translatable content
– This content takes the form of many new file types and document formats: HTML, XML, SGML, XSL (XSLT, XSL-FO), PHP, Java, etc.
– Traditional fax, print, cut-and-paste and PDF methods are no longer efficient enough to meet the demand for quick-turnaround, cost-effective translations
– More one-to-many language translations than ever before
– CAT tools need to provide leverage by translating the material directly in its document format, without post-editing or reformatting afterwards.
Where did XML come from?
§ 1996: some 80+ SGML experts form the W3C SGML WG to
– Support generalized markup on the Web
– Produce ideally valid SGML documents
– Provide URL (as in HTML) compatible hyperlinking
§ XML
– is a (meta)language, a profile of SGML
– brings generalized markup to the Web
§ XML documents
– are self-describing (document specific DTD)
– can be validated against a reference structure (the DTD)
– are platform and software neutral
OK, so What is XML ?
“XML is the Extensible Markup Language. It is designed to improve the functionality of the Web by providing more flexible and adaptable information identification.
It is called extensible because it is not a fixed format like HTML (a single, predefined markup language). Instead, XML is actually a `metalanguage' —a language for describing other languages—which lets you design your own customized markup languages for limitless different types of documents. XML can do this because it's written in SGML, the international standard metalanguage for text markup systems (ISO 8879).”
From WWW.XML.org
XML is…
Extensible Markup Language
W3C: http://www.w3.org/TR/REC-xml
XML is about
– Descriptive Markup, not Procedural Markup
– Structure Definition (DTD) or XML Schema
– Documents Conforming to Structure (Instance)
– Software and Platform Independent Format
XML States Syntactic Rules
§ XML is not
– A uniform document structure
– A standard list of markup tags
§ XML is
– A language for defining hierarchic document structure: Document Type Definition (the DTD), and
– For descriptive markup of text: the XML instance
Markup
§ Markup is adding non-textual information into a text to make it more meaningful
§ Traditional Examples:
– spaces between words
– emphases (italics, bold, underlined)
– layout (new lines, new pages, bullets)
Expressing Structure in XML
§ Elements
– Structural and logical components: Titles, Chapters, Airline, Airport…
– Chosen with respect to structural and logical nature of information components
– Assembled in model groups, ex. (A, B, (C | D))
– Delimited by a Start Tag and an End Tag
Expressing Structure in XML
§ Elements - Hierarchy
– Structural elements contain sub-elements or model groups
– Sub-elements may contain their own sub-elements or model groups
– Nesting of elements and model groups defines a hierarchy
– The highest level element (the root element) is referred to as the Document Type
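Expressing Structure in XML
Using an element requires three components: Start Tag, Semantic Content of Element, End Tag
<Title>An Introduction to SGML</Title>
– Start Tag: <Title> (Title is the Generic Identifier)
– Semantic Content of Element: An Introduction to SGML
– End Tag: </Title> (the "/" is the End Tag Differentiation)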
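The three components can also be seen from a program's point of view. This is a minimal sketch using Python's standard xml.etree.ElementTree module; the choice of Python is an assumption made for illustration and is not part of the original method.

```python
import xml.etree.ElementTree as ET

# The element from the slide: a start tag, its content and an end tag.
element = ET.fromstring("<Title>An Introduction to SGML</Title>")

generic_identifier = element.tag  # the name shared by the start and end tags
content = element.text            # the text found between the two tags

print(generic_identifier)
print(content)
```

A CAT tool works with the same split: the content is what gets translated, while the tags are protected.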
Two types of XML
§ Well-formed XML:
– Minimally meets some structural criteria
– Tags have to be balanced
– Naming of elements has to be correct
– Typically, instances extracted from databases or converted from word processors are well-formed XML
§ Valid XML:
– Meets all structural requirements
– Must conform to a DTD or a Schema
– Can be parsed for validity by a standard parser
– Typically, instances produced by standardized publishing software are found in this category
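The well-formedness half of this distinction is easy to try out. The sketch below assumes Python and its standard xml.etree.ElementTree parser, which checks well-formedness only. Checking validity against a DTD or a Schema needs a validating parser and is not shown. The helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True when the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

balanced = "<Book><Title>An Introduction to SGML</Title></Book>"
unbalanced = "<Book><Title>An Introduction to SGML</Book>"  # </Title> is missing
bad_name = "<1Book/>"  # an element name cannot start with a digit

for sample in (balanced, unbalanced, bad_name):
    print(is_well_formed(sample), sample)
```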
Expressing Meta-Data in XML
§ Attributes
– Represent “meta-information” - information about information
– Qualify elements
– Comprise a name and a value
– Used in the start tag of elements
<Book status="revision" version='4.1.2'>
– Element: Book
– Attributes: status, version
– Value delimiter LIT ("): attribute value revision
– Value delimiter LITA ('): attribute value 4.1.2
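A parser treats both value delimiters the same way. This short sketch assumes Python's standard xml.etree.ElementTree module and closes the Book start tag as an empty element so that the fragment is well-formed on its own.

```python
import xml.etree.ElementTree as ET

# One value uses the double-quote delimiter, the other the single-quote one.
book = ET.fromstring("<Book status=\"revision\" version='4.1.2'/>")

attributes = dict(book.attrib)  # attribute name -> attribute value
for name, value in attributes.items():
    print(name, "=", value)
```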
Expressing Objects in XML
§ Entities
“A collection of characters that can be referenced as a unit”
Character strings assembled into an information object, or non character data (graphics, multimedia) assembled in a storage object, with a name that can be used for referencing
Used for
– storing XML document instances (fragments)
– recalling long strings through short names (macro)
– inserting external objects into an XML document (e.g. graphics, multimedia)
– inserting special characters not available on a keyboard (e.g. é, À, etc.)
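Two of these uses, the short-name macro and the special character, can be sketched as follows. The example assumes Python's standard xml.etree.ElementTree module. The entity name cat and the Note element are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# The internal subset declares one entity (a short name for a long string).
# The content then uses it, along with a numeric character reference for é.
document = """<!DOCTYPE Note [
  <!ENTITY cat "Computer-Aided Translation">
]>
<Note>&cat; caf&#233;</Note>"""

note = ET.fromstring(document)
expanded = note.text  # the parser has replaced both references

print(expanded)
```

This is why a CAT tool must handle entities at import and at export: the translator sees either the reference or its expansion, and the file has to come back with the references intact.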
Three dimensions of information
§ Structure:
– Hierarchic organization
– Expressed in terms of semantic content
– Independent of platform and software
– Traditional publishing software expresses format rather than structure, in a proprietary manner!
§ Content:
– Source information (text, data, graphics, multimedia)
§ Format:
– Appearance of published content
– Specific to platform and software
Logical Components
§ Elements: Titles, chapters, sections, fielded data, etc.
§ Attributes: ID, version, language, security, etc.
§ Entities: Standard text, special characters, external objects, graphics, etc.
Approach to selecting your CAT tool
§ Gather requirements and selection criteria
§ Inventory the major vendors in this arena
§ Determine how vendor products respond to selection criteria
§ Establish the software pricing and related costs
§ Recommend a solution
§ Lay out an implementation strategy
§ Define a rollout plan
§ Create the training material, deliver training and provide on-going support
Approach - Gather requirements and selection criteria
§ XML Criteria:
– Can validate the document (i.e. check that it strictly adheres to the DTD) before the translation (so that you won’t be told later that you made the document invalid) and after the translation (to ensure that you didn’t make it invalid)
– Can properly respect tag content and placement at import and at export (tag locking, hiding, showing)
– Can properly deal with character entities, both for import and export
– Allows the translator to move tags within a segment
– Allows the translator to unlock/modify the content of tags (attributes, URLs, etc.) when necessary
– When aligning existing translations, the software should make use of the tags and attributes to improve the alignment quality and the subsequent segmentation
What does that mean in practice ?
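One way to picture the tag criteria is a check that a translated segment carries the same tags as its source, wherever the translator placed them. This is a rough sketch in Python, not how any particular CAT tool works. The regular expression is a simplification, and the names markup_inventory and markup_preserved are hypothetical.

```python
import re
from collections import Counter

# Simplified pattern for a start tag or an end tag (no comments, no CDATA).
TAG = re.compile(r"</?[A-Za-z_][^<>]*>")

def markup_inventory(segment: str) -> Counter:
    """Count each tag in a segment, ignoring where it sits."""
    return Counter(TAG.findall(segment))

def markup_preserved(source: str, target: str) -> bool:
    """True when the target carries exactly the same tags as the source."""
    return markup_inventory(source) == markup_inventory(target)

source = 'See <RefInt id="s5">Section 5</RefInt> for details.'
good = 'Pour plus de détails, voir <RefInt id="s5">la section 5</RefInt>.'
bad = 'Voir <RefInt id="s5">la section 5 pour plus de détails.'

print(markup_preserved(source, good))
print(markup_preserved(source, bad))
```

The comparison ignores order on purpose, since the translator is allowed to move tags within a segment. It only catches tags that were lost, added or altered.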
Other criteria to consider
§ Alignment tool should utilize the markup to improve on the quality of the alignment
§ XML-based exchange standards (LISA) for Translation Memory (TMX) and Terminology (TBX) mean you can get XML to work for YOU!
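Because TMX is itself XML, a translation memory can be read with ordinary XML tooling. The sketch below assumes Python's standard xml.etree.ElementTree module and a tiny hand-written TMX sample. The sample content and the function name read_translation_units are hypothetical.

```python
import xml.etree.ElementTree as ET

# The xml:lang attribute lives in the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

TMX_SAMPLE = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="xml"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Save the file.</seg></tuv>
      <tuv xml:lang="fr"><seg>Enregistrez le fichier.</seg></tuv>
    </tu>
  </body>
</tmx>"""

def read_translation_units(tmx_text: str) -> list:
    """Return one dict per translation unit, mapping language to segment text."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        units.append({tuv.get(XML_LANG): tuv.findtext("seg")
                      for tuv in tu.findall("tuv")})
    return units

units = read_translation_units(TMX_SAMPLE)
print(units)
```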
CAT tools for XML content should…
§ Be designed to respect and protect XML markup during the translation process
§ Allow the translator to translate « between the tags » so they don’t have to worry about the proper use of tags, or their place in the document.
§ Allow the translator to insert missing tags/entities when they are present in the source segment but not in the Translation Memory
§ Allow the translator to show or hide tags/entities as required
§ For example…
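The idea of translating « between the tags » can be sketched as a three-step round trip: swap each tag for a numbered placeholder, translate the text, then put the tags back. This Python sketch is an assumption about the general technique, not a description of any vendor's implementation. The Key element and the function names protect and restore are hypothetical.

```python
import re

# Simplified pattern for a start tag or an end tag.
TAG = re.compile(r"</?[A-Za-z_][^<>]*>")

def protect(segment: str):
    """Replace each tag with a numbered placeholder and remember the tags."""
    tags = []

    def stash(match):
        tags.append(match.group(0))
        return "{%d}" % len(tags)

    return TAG.sub(stash, segment), tags

def restore(text: str, tags: list) -> str:
    """Put the remembered tags back in place of their placeholders."""
    for number, tag in enumerate(tags, start=1):
        text = text.replace("{%d}" % number, tag)
    return text

source = "Press <Key>Enter</Key> to continue."
masked, tags = protect(source)

# The translator only sees and moves the placeholders.
translated = "Appuyez sur {1}Entrée{2} pour continuer."
final = restore(translated, tags)

print(masked)
print(final)
```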
Trados Tag Editor: Full-Tag View
• Element Section5 with id and rev attributes
• Start Tag for element Heading
• End Tag for element Heading
• RefExt and RefInt elements are « inline »
• Entities are used for special characters
Trados Tag Editor: No-Tag View
• Contains less information about the structure
• Provides a clearer view of the context
• Translator can easily flip between the two views
Approach - Inventory of major vendors that claim XML support
§ Atril – DéjàVu
§ Cypresoft TransSuite
§ MultiTrans
§ SDLX
§ StarTransit
§ SynchoTerm
§ Trados
§ Wordfast
Approach - Determine how vendor products respond to selection criteria
Vendor | Criteria 1 | Criteria 2 | Criteria 3 | Criteria 4…
Atril – DéjàVu | | | |
Cypresoft TransSuite | | | |
MultiTrans | | | |
SDLX | | | |
StarTransit | | | |
SynchoTerm | | | |
Trados | | | |
Wordfast | | | |
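A grid like this can be turned into a ranking by weighting each criterion. This is a minimal Python sketch of one possible scoring scheme. The weights, the 0 to 3 scores and the vendor labels are all hypothetical and say nothing about any real product.

```python
# Hypothetical weights: how much each criterion matters to the organization.
WEIGHTS = {"validation": 3, "tag_protection": 3, "entities": 2, "alignment": 1}

# Hypothetical scores from 0 (not supported) to 3 (fully supported).
SCORES = {
    "Vendor A": {"validation": 2, "tag_protection": 3, "entities": 1, "alignment": 2},
    "Vendor B": {"validation": 3, "tag_protection": 2, "entities": 3, "alignment": 0},
}

def weighted_total(scores: dict, weights: dict) -> int:
    """Sum of score times weight over every criterion."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)

def rank(vendors: dict, weights: dict) -> list:
    """Return (total, vendor) pairs, best total first."""
    totals = [(weighted_total(scores, weights), name)
              for name, scores in vendors.items()]
    return sorted(totals, reverse=True)

ranking = rank(SCORES, WEIGHTS)
for total, name in ranking:
    print(name, total)
```

Agreeing on the weights before scoring keeps the grid from being bent toward a favourite product.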
Approach - Establish the software pricing and related costs
§ Things to remember:
– Software cost is not everything…
– You may need to upgrade your hardware and network
– You may need to consider the support costs (current year and ongoing)
– Consider the one-time costs (training, installation, configuration, etc.)
– Consider the cost of aligning your legacy documents
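These cost items add up to a multi-year total rather than a licence price. The arithmetic can be sketched in Python as below. Every figure is a hypothetical placeholder, to be replaced with your own quotes and estimates.

```python
def total_cost(one_time: dict, yearly: dict, years: int) -> int:
    """One-time costs plus recurring costs over the given number of years."""
    return sum(one_time.values()) + years * sum(yearly.values())

# Hypothetical figures, in any single currency.
one_time = {
    "licences": 10000,
    "hardware_upgrade": 4000,
    "training": 3000,
    "legacy_alignment": 5000,
}
yearly = {"support": 2000}

three_year = total_cost(one_time, yearly, 3)
print(three_year)
```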
Approach – cont’d
§ Recommend a solution
– Make a short list of vendors
– Invite those vendors to participate in a pilot project
§ Lay out an implementation strategy
– Leverage “champions” and other enthusiasts
– Avoid attitude pitfalls
– Acquire top-management endorsement early in the project
§ Define a rollout plan
– Use a pilot project to show early benefits
– Don’t run until you can walk
– Choose a visible portion of your corpus for your pilot
§ Create the training material, deliver training, provide on-going support and follow up with your users
§ Measure your costs and benefits
Conclusion
§ XML content needs to be treated with special attention and with special tools
§ Make sure you clearly establish your criteria for the selection of your CAT tool
§ Use a rigorous selection process
§ Test, test and re-test
§ Measure your costs and benefits – they will get you the proper management attention