23rd Internationalization and Unicode Conference, Prague, Czech Republic – March, 2003
Common XML Locale Repository
Dr. Mark Davis
Steven R. Loomis
IBM San José Globalization Center of Competency
Copyright © 2003 IBM Corporation
Locale Data Confusion
Variations in localized data can irritate or confuse users…
OS #1: 2003-02-17 (févr. )
OS #2: 03-02-17 (fév)
Locale Data Problems
But, over the network, mismatched data can be catastrophic.
OS #1: 2003-02-17 (févr. )
OS #2: 03-02-17 (fév)
What is Locale Data?
• Locale = identifier string referring to linguistic and cultural preferences
• Typical data
– Dates/times
– Numbers
– Measurement
– Currency
– Sorting (Collation)
– Translated country and language names
Where is locale data found?
• International Components for Unicode (ICU)
• OpenOffice.org
• Operating Systems
– Linux, Solaris, AIX, Windows, …
• Java
Common XML Locale Repository Team
• Li18nux is now OpenI18N (part of the Free Standards Group)
– Linux Application Development Environment subgroup
• Common XML Locale Repository project
http://www.openi18n.org/subgroups/lade/locale/
Repository Objectives
• Common XML format for locale data
• Collect data from platforms
• Make repository available to the public
• Validate and release corrected data
• Enable W3C Web Services
– Exchange and display of data in localized form
Repository Features
• Version controlled database
• HTTP-based: browsing or custom tools
• Compare data between platforms
– (Comparisons available now)
Repository Structure
• Contents
– Common
– ICU
– OpenOffice.org
– Windows?
– …
• Migrate to Common over time
Locale Data Markup Language
• XML “vocabulary” for locale data interchange
• Data stored in separate files (fr.xml or cs_CZ.xml)
• Inheritance is used: ‘root.xml’ is the root locale, ‘fr.xml’ holds French, and ‘fr_CA.xml’ holds French (Canada)
Locale Naming
• ISO-639 + ISO-3166 + Variant (POSIX-like)
– en: English
– fr_BE: French in Belgium
– de_DE: German in Germany
– sv_FI_AALAND: Swedish in Finland (Åland region)
de_DE@collation=phonebook,currency=pre-euro
German in Germany, Phonebook collation, pre-Euro Currency.
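
The naming scheme above can be illustrated with a small parser. This is a minimal sketch, not part of the LDML specification: the function name `parse_locale_id` is hypothetical, and it assumes that the part after `@` is a comma-separated list of `key=value` pairs.

```python
def parse_locale_id(locale_id):
    """Split a POSIX-like locale ID into its parts (hypothetical helper)."""
    # Everything after the first "@" is treated as keyword options.
    base, _, keyword_part = locale_id.partition("@")
    fields = base.split("_")
    parsed = {
        "language": fields[0],
        "territory": fields[1] if len(fields) > 1 else "",
        "variant": "_".join(fields[2:]),
        "keywords": {},
    }
    if keyword_part:
        # Assumption: options are comma-separated key=value pairs.
        for pair in keyword_part.split(","):
            key, _, value = pair.partition("=")
            parsed["keywords"][key] = value
    return parsed


print(parse_locale_id("sv_FI_AALAND"))
print(parse_locale_id("de_DE@collation=phonebook,currency=pre-euro"))
```

The same helper would also split an ID such as cs_CZ@numberFormatStyle=percent into its base locale and a single keyword.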
<identity> Element
<localeData>
  <identity>
    <version number="1.1">Various notes and changes</version>
    <generation date="2002-08-28"/>
    <language type="sv"/>
    <territory type="FI"/>
    <variant type="AALAND"/>
  </identity>
</localeData>
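
As a rough illustration of how a tool might read the <identity> element, the sketch below rebuilds the locale ID from it using Python's standard `xml.etree.ElementTree`. The helper name `locale_id_from_identity` and the embedded sample are illustrative assumptions, not repository code.

```python
import xml.etree.ElementTree as ET

# Sample document modelled on the <identity> slide.
SAMPLE = """<localeData>
  <identity>
    <version number="1.1">Various notes and changes</version>
    <generation date="2002-08-28"/>
    <language type="sv"/>
    <territory type="FI"/>
    <variant type="AALAND"/>
  </identity>
</localeData>"""


def locale_id_from_identity(xml_text):
    """Join language, territory and variant into a POSIX-like locale ID."""
    identity = ET.fromstring(xml_text).find("identity")
    parts = []
    for tag in ("language", "territory", "variant"):
        element = identity.find(tag)
        if element is not None:  # territory and variant are optional
            parts.append(element.get("type"))
    return "_".join(parts)


print(locale_id_from_identity(SAMPLE))  # sv_FI_AALAND
```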
Inheritance
fr
• Janvier, Février…
• 1,234.56
• …
fr_CA
• 1 234,57 $
• …
fr_LX
• 1.234,57 €
• …
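
Inheritance can be sketched as a lookup that walks from the most specific locale to ‘root’. The code below is an assumption-laden illustration: `fallback_chain`, `lookup`, and the hand-written `DATA` table are hypothetical stand-ins for parsed locale files, not the repository's actual data or API.

```python
def fallback_chain(locale_id):
    """Return the lookup order for a locale, most specific first, ending at root."""
    parts = locale_id.split("_")
    chain = ["_".join(parts[:n]) for n in range(len(parts), 0, -1)]
    chain.append("root")
    return chain


# Hypothetical values standing in for parsed root.xml, fr.xml and fr_CA.xml.
DATA = {
    "root": {"decimal": ".", "group": ","},
    "fr": {"months": ["janvier", "février"]},
    "fr_CA": {"currencySymbol": "$"},
}


def lookup(locale_id, key):
    """Return the first value found along the fallback chain."""
    for name in fallback_chain(locale_id):
        if key in DATA.get(name, {}):
            return DATA[name][key]
    raise KeyError(key)


print(fallback_chain("fr_CA"))           # ['fr_CA', 'fr', 'root']
print(lookup("fr_CA", "currencySymbol"))  # found in fr_CA itself
print(lookup("fr_CA", "months"))          # inherited from fr
print(lookup("fr_CA", "decimal"))         # inherited from root
```

With this layout a child locale only needs to store the values that differ from its parent.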
Aliasing
zh (Chinese)
– Simplified: zh_CN
– Traditional: zh_HK, zh_TW
<alias> element
<localeData>
  <identity>
    <language type="zh"/>
    <territory type="HK"/>
  </identity>
  <collations>
    <alias source="zh_TW"/>
  </collations>
</localeData>
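
A consumer of this data has to notice the <alias> and fetch the section from the source locale instead. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `alias_target` and the embedded sample are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Sample modelled on the zh_HK slide: collation data is aliased to zh_TW.
ZH_HK = """<localeData>
  <identity><language type="zh"/><territory type="HK"/></identity>
  <collations><alias source="zh_TW"/></collations>
</localeData>"""


def alias_target(xml_text, section):
    """Return the locale a section is aliased to, or None if there is no alias."""
    root = ET.fromstring(xml_text)
    alias = root.find(f"{section}/alias")
    return None if alias is None else alias.get("source")


print(alias_target(ZH_HK, "collations"))  # zh_TW
print(alias_target(ZH_HK, "numbers"))     # None
```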
type attribute
<numberFormatStyle type="decimal">
1 234,57
<numberFormatStyle type="percent">
123%
cs_CZ
type attribute in Locale
<numberFormatStyle type="percent">
123%
cs_CZ@numberFormatStyle=percent
Standard Keys/Types
• Collation
– Traditional, Pinyin, Stroke, Direct (Hindi), posix
• Currency
– Pre-Euro
• Calendar
– Gregorian, Arabic (Religious and Civil), Chinese, Hebrew, Japanese, Thai (Buddhist)
draft and standard
• Unverified data may be marked with draft=true
<localeData draft="true">
• Standard-conforming data may be marked with standard=…
– Name: <collation standard="MSA 200:2002">
– URL: <dateFormatStyle standard="ISO 8601, http://www.iso.ch/iso/…CatalogueDetail?…ICS3=30,DIN 5008">
Data Access
• Normal HTTP request
http://openi18n.org/locale/icu/de_DE.xml?version=2.2&currency=pre-euro
• Accessible by web browser or programmatically.
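
For programmatic access, a client would first assemble a request URL of the shape shown above. This sketch only builds the URL with Python's standard `urllib.parse`; the function name `locale_url` is hypothetical, the path layout is inferred from the single example on the slide, and whether the server still answers such requests is not something this example assumes.

```python
from urllib.parse import urlencode

BASE = "http://openi18n.org/locale"  # base taken from the slide's example


def locale_url(platform, locale_id, **options):
    """Build a repository URL for one platform's copy of a locale."""
    url = f"{BASE}/{platform}/{locale_id}.xml"
    query = urlencode(options)  # e.g. version=2.2&currency=pre-euro
    return f"{url}?{query}" if query else url


print(locale_url("icu", "de_DE", version="2.2", currency="pre-euro"))
```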
Calendars
• Non-Gregorian calendars supported
– Gregorian data is the ‘root’ for inheritance
• Calendars distinguished by ‘class’ (class="japanese", class="arabic", …)
<calendars>
<calendar class="gregorian">
  <monthNames>
    <month type="1">January</month>
    <month type="2">February</month>
  </monthNames>
  <dayNames>
    <day type="sun">Sunday</day>
    <day type="mon">Monday</day>
  </dayNames>
<calendars> (cont’d)
  <dateFormats>
    <default type="medium"/>
    <dateFormatStyle type="full">
      <dateFormat>
        <pattern>EEEE, MMMM d, yyyy</pattern>
      </dateFormat>
    </dateFormatStyle>
    <dateFormatStyle type="medium">
      <default type="DateFormatsKey2"/>
      <dateFormat type="DateFormatsKey2">
        <pattern>MMM d, yyyy</pattern>
      </dateFormat>
      <dateFormat type="DateFormatsKey3">
        <pattern>MMM dd, yyyy</pattern>
        …
<calendars> <eras>
(gregorian, continued)
  <eras>
    <eraAbbr>
      <era type="0">BC</era>
      <era type="1">AD</era>
    </eraAbbr>
  </eras>
</calendar>
<calendar class="japanese">
  <eras>
    <eraAbbr>
      <era type="0">Taika</era>
      <era type="1">Hakuchi</era>
    </eraAbbr>
  </eras>
</calendar>
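
Because each calendar is distinguished by its class attribute, a reader can select one calendar's data by that attribute. The sketch below pulls the era abbreviations for a given class; the helper name `era_abbreviations` and the trimmed sample are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Trimmed sample combining the two <calendar> fragments shown on the slides.
SAMPLE = """<calendars>
  <calendar class="gregorian">
    <eras><eraAbbr><era type="0">BC</era><era type="1">AD</era></eraAbbr></eras>
  </calendar>
  <calendar class="japanese">
    <eras><eraAbbr><era type="0">Taika</era><era type="1">Hakuchi</era></eraAbbr></eras>
  </calendar>
</calendars>"""


def era_abbreviations(xml_text, calendar_class):
    """Map era type to abbreviation for the calendar with the given class."""
    root = ET.fromstring(xml_text)
    for calendar in root.findall("calendar"):
        if calendar.get("class") == calendar_class:
            return {era.get("type"): era.text for era in calendar.iter("era")}
    return {}  # no calendar of that class in this locale


print(era_abbreviations(SAMPLE, "gregorian"))
print(era_abbreviations(SAMPLE, "japanese"))
```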
<numbers>
• <symbols> - digits, separators, signs
• <numberFormats> - Patterns
• <currencies> - Monetary patterns, symbols
<symbols>
<decimal> . </decimal>
<group> , </group>
<list> ; </list>
<percentSign> % </percentSign>
<nativeZeroDigit> 0 </nativeZeroDigit>
<patternDigit> # </patternDigit>
<plusSign> + </plusSign>
<minusSign> - </minusSign>
<exponential> E </exponential>
<perMille> ‰ </perMille>
<infinity> ∞ </infinity>
<nan> _ </nan>
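
To show what the <decimal> and <group> symbols are for, the sketch below formats a number with ASCII separators and then swaps in a locale's symbols. It is a simplification under stated assumptions: `format_decimal` is a hypothetical helper, grouping is fixed at three digits, and the other symbols (minus sign, native digits, and so on) are ignored.

```python
def format_decimal(value, decimal=".", group=",", fraction_digits=2):
    """Format a number, then substitute the locale's decimal and group symbols."""
    text = f"{value:,.{fraction_digits}f}"  # e.g. "1,234.57"
    # Translate both separators in one pass so they cannot clobber each other.
    return text.translate(str.maketrans({",": group, ".": decimal}))


print(format_decimal(1234.57))                          # 1,234.57
print(format_decimal(1234.57, decimal=",", group=" "))  # 1 234,57
print(format_decimal(1234.57, decimal=",", group="."))  # 1.234,57
```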
<numberFormats>
<numberFormats>
  <numberFormatStyle type="decimal">
    <numberFormat type="long">
      <pattern type="positive">#,##0.###</pattern>
      <pattern type="negative">-#,##0.###</pattern>
    </numberFormat>
  </numberFormatStyle>
  <numberFormatStyle type="percent">
    <numberFormat type="short">
      <pattern type="positive">#,##0%</pattern>
    </numberFormat>
  </numberFormatStyle>
<numberFormats> currency
  <numberFormatStyle type="currency">
    <numberFormat type="medium">
      <pattern type="positive"> #,##0.00;</pattern>
      <pattern type="negative"> ( #,##0.00)</pattern>
    </numberFormat>
  </numberFormatStyle>
</numberFormats>
<currencies>
<currencies>
  <default type="USD"/>
  <currency type="USD">
    <displayName>dollar</displayName>
    <symbol>$</symbol>
  </currency>
  <currency type="JPY">
    <displayName>yen</displayName>
    <symbol>¥</symbol>
  </currency>
</currencies>
<collations>
• ‘root’ locale behavior = UCA
• Sub locales defined in terms of tailorings to the UCA
<collations>: Swedish
<collation>
  <base UCA='3.1.1'/>
  <settings caseLevel="on"/>
  <rules>
    <reset>Z</reset>
    <p>æ</p>
    <t>Æ</t>
    <t>aa</t>
    <t>aA</t>
    <t>Aa</t>
    <t>AA</t>
    ...
  </rules>
</collation>
<special>
• Can appear anywhere in the locale
• Denotes data that is not part of the LDML specification.
• Used to store data specific to OpenOffice.org, ICU, or other sources
– Single source.
<special> example
<special owner="http://oss.software.ibm.com/icu/">
  <transforms>
    <transform type="Latin">
      <> a ; <> v ;
    </transform>
  </transforms>
</special>
Other Elements
• <displayName>
• <localizedPatternChars>
• <timeZoneNames>
• <delimiters>
• <encodings>
• <layout>
• <localeDisplayNames>
• <measurement>
Open Issues
• Vetting process not defined
• Versioning and release of Repository not finalized
Current Status
• LDML 1.0 Specification released and approved by the OpenI18N steering committee
• Preliminary data available by CVS (Source code repository)
• Newsgroup available for discussions
• Database available for reporting bugs or feature requests