San Jose, California September 2002 What is ICU? Roadmap and Myths Helena Shih Chapman ICU Development Manager IBM Globalization Center of Competency

San Jose, California, September 2002
22nd International Unicode Conference

What is ICU? Roadmap and Myths

Helena Shih Chapman
ICU Development Manager
IBM Globalization Center of Competency

Agenda
• What is ICU?
• Architecture Overview
• Significant New ICU Features
• Near Future Features
• Common Misunderstandings about ICU
• ICU – what you should know
• Globalization Gotcha’s
• References
• Q and A

What is ICU?
• Internationalization libraries for C, C++, Java*
  – Open source, non-viral
  – Sponsored by IBM
  * Sun’s Java licenses an earlier ICU version; ICU4J updates it.
• Unicode Standard compliant
  – Full supplementary support
• Cross-platform; extensible and customizable
• High performance and thread-safe
  – Multiple locales in the same thread, simultaneously
• http://oss.software.ibm.com/icu/
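The "multiple locales in the same thread" point can be sketched with the JDK's own java.text classes, which the slide notes derive from an earlier ICU version. This is only an illustration under that assumption: the class and method names are hypothetical, and ICU4J's counterparts in com.ibm.icu.text are deliberately not used so that the sketch runs on a plain JDK.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical sketch: the locale is passed explicitly to each call, so one
// thread can serve several locales without touching a global default.
final class TwoLocalesOneThread {

    /** Formats a number for the given locale, independent of the default locale. */
    static String format(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    public static void main(String[] args) {
        double value = 1234567.891;
        System.out.println(format(value, Locale.US));      // 1,234,567.891
        System.out.println(format(value, Locale.GERMANY)); // 1.234.567,891
    }
}
```

The design point is that the locale travels with the request rather than living in process-wide state.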

ICU Features
• Unicode text handling
• Character set conversions (700+)
• Collation & Searching
• Locales (170+)
• Resource Bundles
• Calendar & Time zones

ICU Features
• Breaks: character, word, line, & sentence
• Formatting
  – Date & time
  – Messages
  – Numbers & currencies
• Transforms
  – Normalization
  – Casing
  – Transliterations
• Complex Text Layout Engine
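Two of the transforms listed above, normalization and locale-sensitive casing, can be tried with JDK classes alone. The sketch below is an illustration, not ICU's API: the class name is hypothetical, and the JDK's java.text.Normalizer and String.toUpperCase(Locale) stand in for the ICU services.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical sketch of two transforms: Unicode normalization and casing.
final class TransformSketch {

    /** Composes base letters and combining marks into precomposed form (NFC). */
    static String toNfc(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    /** Upper-cases using the rules of the given locale. */
    static String upper(String text, Locale locale) {
        return text.toUpperCase(locale);
    }

    public static void main(String[] args) {
        // "e" followed by COMBINING ACUTE ACCENT composes to U+00E9.
        System.out.println(toNfc("e\u0301").equals("\u00e9")); // true

        // Turkish maps "i" to dotted capital I (U+0130); English maps it to "I".
        Locale turkish = Locale.forLanguageTag("tr");
        System.out.println(Integer.toHexString(upper("i", turkish).charAt(0)));        // 130
        System.out.println(Integer.toHexString(upper("i", Locale.ENGLISH).charAt(0))); // 49
    }
}
```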

Architecture Overview
• Locale Based Services
  – Locale is an identifier, not a container
    • Object in C++ and Java, char* in C
  – Default locale is set to the platform locale
• Resource inheritance (diagram levels, top to bottom)
  – Root Locale: root
  – Language: en, de, ja, ru
  – Country: US, IE, DE, AT, JP, RU
  – Variant: PREEURO
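The inheritance levels above (variant, country, language, root) imply a lookup order in which a missing resource is sought in progressively more general locales. A minimal sketch of that order follows, assuming underscore-separated identifiers; the helper name and the sample ID de_AT_PREEURO (assembled from names on the slide) are illustrative, and real ICU lookup involves more than this string truncation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the locales consulted, most specific first,
// by dropping one trailing "_part" at a time and ending at "root".
final class LocaleFallback {

    static List<String> chain(String localeId) {
        List<String> result = new ArrayList<>();
        String current = "root".equals(localeId) ? "" : localeId;
        while (!current.isEmpty()) {
            result.add(current);
            int cut = current.lastIndexOf('_');
            current = (cut < 0) ? "" : current.substring(0, cut);
        }
        result.add("root"); // every chain ends at the root locale
        return result;
    }

    public static void main(String[] args) {
        System.out.println(chain("de_AT_PREEURO")); // [de_AT_PREEURO, de_AT, de, root]
        System.out.println(chain("ja"));            // [ja, root]
    }
}
```

Because the locale is only an identifier, the chain can be computed from the string itself without loading any data.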

Architecture Overview
• Open and Close Service Model
  – Better performance by avoiding setup costs per operation
• ICU Threading Model
  – Multiple versions in use simultaneously
  – Large resources shared in read-only cache
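The open and close model amounts to paying the setup cost once and reusing the service object for many operations. The sketch below shows the idea with the JDK's java.text.Collator as a stand-in; the class name is hypothetical. Java relies on garbage collection, so there is no explicit close step here, whereas ICU4C pairs calls such as ucol_open and ucol_close.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical sketch: the collator is created ("opened") once per sort,
// not once per comparison.
final class OpenOnceReuse {

    static String[] sortAll(String[] words, Locale locale) {
        Collator collator = Collator.getInstance(locale); // setup cost paid once
        String[] sorted = words.clone();
        Arrays.sort(sorted, collator);                    // reused for every comparison
        return sorted;
    }

    public static void main(String[] args) {
        String[] words = {"banana", "apple", "Cherry"};
        // Linguistic order, unlike binary order, does not put "Cherry" first.
        System.out.println(Arrays.toString(sortAll(words, Locale.ENGLISH)));
        // [apple, banana, Cherry]
    }
}
```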

Architecture Overview
• Data Driven Services
  – Customize at build-time or run-time
  – Interchange with other platforms
    • Same results on each
  – Rule-based
    • Collation, Word-breaks, Transforms
  – Pattern-based
    • Formats, UnicodeSet
  – Table-based
    • Character Conversion
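"Data driven" means behavior is supplied as data (a rule string or a pattern) at run time instead of being compiled in. The sketch below illustrates the rule-based and pattern-based cases with the JDK's RuleBasedCollator and DecimalFormat, used here as stand-ins for the ICU services; the class name, the deliberately reversed rule string, and the pattern are assumptions chosen for the example.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Locale;

// Hypothetical sketch: the same code behaves differently depending on the
// rule or pattern string handed to it.
final class DataDrivenSketch {

    /** Compares two strings under collation rules given as text. */
    static int compareWithRules(String rules, String left, String right) {
        try {
            return new RuleBasedCollator(rules).compare(left, right);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Invalid collation rules: " + rules, e);
        }
    }

    /** Formats a number under a pattern given as text, with US symbols. */
    static String formatWithPattern(String pattern, double value) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(Locale.US);
        return new DecimalFormat(pattern, symbols).format(value);
    }

    public static void main(String[] args) {
        // Rules that reverse the usual order: c sorts before b, b before a.
        System.out.println(compareWithRules("< c < b < a", "a", "c") > 0); // true
        System.out.println(formatWithPattern("#,##0.00", 1234.5));          // 1,234.50
    }
}
```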

Architecture Overview – ICU4C
• Simple Error Handling
  – C++ subset for portability
  – Support for multi-threaded environment
• Version Management
  – Multiple versions at the same time
  – Data and library versioning
• String Buffer Management
  – Preflighting and overflow protection

New Common ICU Features
• Unicode 3.2 Update
• Supplementary Character Support
• Dual Currency Support
• UCA Compliant Collation
• Fast Unicode Normalization
• Customizable RuleBasedBreakIterator
• Many Other Enhancements…
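Supplementary character support matters because a code point above U+FFFF occupies two UTF-16 code units, so counting or indexing by code unit gives wrong answers. The sketch below shows the difference using only java.lang methods; the class name and the sample code point U+10400 are chosen for illustration.

```java
// Hypothetical sketch: code points versus UTF-16 code units.
final class SupplementarySketch {

    /** Returns the code points of a string, pairing surrogates correctly. */
    static int[] codePoints(String text) {
        return text.codePoints().toArray();
    }

    public static void main(String[] args) {
        // U+10400 is outside the Basic Multilingual Plane.
        String text = new String(Character.toChars(0x10400)) + "a";

        System.out.println(text.length());            // 3 code units
        System.out.println(codePoints(text).length);  // 2 code points

        // The supplementary character is stored as a surrogate pair.
        System.out.println(Integer.toHexString(text.charAt(0))); // d801
        System.out.println(Integer.toHexString(text.charAt(1))); // dc00
    }
}
```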

New ICU4C Features
• Character Set Conversion Enhancements
  – Alias Management
  – Tighter Unicode conformance
  – Better compression scheme for Unicode
• Memory Management
  – Load and unload ICU libraries
  – Root UObject: new and delete overrides
• Complete Unicode 3.2 Properties
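Alias management means that several names resolve to one converter with a single canonical name. The sketch below shows the general idea with the JDK's java.nio.charset.Charset, not ICU's converter API; the class name is hypothetical, and the alias tables consulted are the JDK's, which need not match ICU's.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: alias lookup and a table-based conversion.
final class ConverterAliasSketch {

    /** Resolves a converter alias to its canonical name. */
    static String canonicalName(String alias) {
        return Charset.forName(alias).name();
    }

    /** Number of bytes the text occupies in the named encoding. */
    static int encodedLength(String text, String alias) {
        return text.getBytes(Charset.forName(alias)).length;
    }

    public static void main(String[] args) {
        System.out.println(canonicalName("latin1")); // ISO-8859-1
        System.out.println(canonicalName("UTF8"));   // UTF-8

        // U+00E9 is one byte in ISO-8859-1 and two bytes in UTF-8.
        System.out.println(encodedLength("\u00e9", "latin1")); // 1
        System.out.println(encodedLength("\u00e9", "UTF8"));   // 2

        // Round trip through bytes restores the original text.
        byte[] bytes = "\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(bytes, StandardCharsets.UTF_8).equals("\u00e9")); // true
    }
}
```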

Near-Future Plans
• POSIX Style Compatibility Interface
• Plug-in APIs for Linguistics Engines (runtime)
• Service Registration
• Flexible User Data Loading Mechanism
• ICU4C and ICU4J Feature Sync-up
• Further Performance and Robustness Enhancements

Please note that this list is subject to change!

Misconceptions about ICU
• The X license agreement is viral and too open.
• ICU4C distribution size is very large.
• ICU4C converter names and aliases are ambiguous and misleading.
• I cannot use multiple versions of ICU4C on the same machine.
• ICU Unicode type is not the same as platform native Unicode type.
• There is no way to process Unicode characters with Standard Input/Output in ICU4C.
• ICU is not ported to the platforms I use.

ICU – what you should know
• ICU default locale is not connected to OS locale utilities
• ICU resource bundle is platform independent
• Multiple versions of ICU can be linked to the same application

Globalization: Be Aware
• wchar_t is not consistently defined across platforms
• Lack of native Unicode string literal support in C
• Migration issues from proprietary implementation to ICU
• Different levels of Unicode support between OS and ICU

References
• ICU main site:
  – http://oss.software.ibm.com/icu/
  – Links to
    • Download ICU
    • User Guide, Technical FAQ, Support, Bug Reports
• Unicode Consortium
  – http://www.unicode.org
    • Unicode glossary, Unicode character database
• IBM Developerworks
  – http://www.ibm.com/developerworks/unicode

Questions and Answers