1998. Yuri Demchenko. TNC'98, Dresden.

ML MUA Testing - TERENA Pilot Project ML_MUA_1

Testing multilingual support in Mail User Agents

TERENA Pilot Project

Yuri Demchenko, TERENA <[email protected]>

TNC’98 Dresden October 5-8, 1998

TERENA Pilot Project on Testing Multilingual MUAs

• Officially ran from April 1998 to September 1998

• The project objectives can be described as:

– Develop benchmarking methodology for Multilingual MUAs, and specify templates for collecting the results in a coherent way.

– Design a set of composite multilingual test messages

– Configure each MUA for all supported national character sets and send the test messages to other MUAs and to themselves.

– Compile the results, analyzing how the MUA composes, sends, receives and displays the test messages.

– Prepare recommendations for users on the correct setup and operation of popular multilingual MUAs

The list of mail clients to be tested

• Derived from TERENA MUAs usage statistics based on analysis of more than 3000 messages from TERENA Mail archives collected during the period August 1997 - March 1998

Microsoft Windows (NT, 3.11, 95)

• Microsoft Outlook Express

• Netscape Mail 3.x and 4.x

• Netscape Messenger

• Qualcomm Eudora 3.0 and 4.0 beta

• Pegasus Mail

• The Bat!

• ESYS Simeon

• Alis Tango Mailer

UNIX Terminal

• Elm

• MH

• Pine

UNIX GUI (with X11R6)

• Netscape Mail

• EXMH

• Z-Mail

Activity and Projects in i18n and Multilingual Support

• i18n activity (ISO, IETF, ECMA, TERENA, Unicode Consortium)

• CEN/TC304 work on European character sets and keyboards

• MAITS project

• Internet Mail Consortium - Report on using International Characters in Internet Mail

• TERENA Pilot Project on Testing Multilingual support in MUAs

Internet Mail Consortium - i18n Report

Summary of recommendations

1. Explicit charset parameter

2. Sending UTF-8

3. Displaying UTF-8

4. Choosing charsets on creation

5. Specifying languages

6. Multi-language text

7. Non-ASCII headers

8. Handling all common charsets

9. MTAs and 8-bit content

The report strongly recommends that all mail-creating and mail-displaying programs created or revised after January 1, 1999 be able to create and display mail using UTF-8, and be able to handle all common charsets in addition to UTF-8.

Standards on i18n and Character Set Technologies

• ISO standards

– ISO 2022 Character Set Concept and Terminology

– ISO 8859-x Character Sets

– ISO Standards on APIs i18n and FDCC

• Unicode standards

• RFC 2277 IETF Policy on Character Sets and Languages

• Recommendation of IAB Workshop on character sets technology (RFC 2130)

• MIME format of messages (Using MIME in Internet Mail) RFC 2045-RFC 2049

• RFC 822 - Standard for the format of Internet text messages (message syntax)

Standards in i18n and Multilingual Support in Internet Mail

• RFC 2045 - RFC 2049, RFC 2231 - MIME

– Coded Character Set

– Character Encoding Scheme specified by the Charset parameter to the Content-Type header field

– Transfer Encoding Syntax like Base64, QP specified by the Content-Transfer-Encoding header field

• RFC 2277 - IETF Policy on Character Sets and Languages

– main definitions and requirements for language tagging

• RFC 2130 - Recommendation of IAB Workshop on character sets technology

– framework for interoperability between the many character sets in use

– an architecture model for on-the-wire transmission of text

– recommendations for tagging transmitted (and stored) text
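
The three on-the-wire layers named above can be illustrated with a small sketch. It is not part of the project's tooling: it only assumes Python's standard `base64` and `quopri` modules, and the helper name `on_the_wire` and the sample word are hypothetical.

```python
import base64
import quopri


def on_the_wire(text, charset):
    """Show how one text passes through the layers named on the slide.

    `text` is a sequence of abstract characters; `charset` (the MIME
    charset parameter) selects the mapping from characters to octets;
    the transfer encoding then makes those octets safe for 7-bit mail.
    """
    octets = text.encode(charset)  # character encoding scheme -> octets
    return {
        "octets": octets,
        "quoted-printable": quopri.encodestring(octets),  # Content-Transfer-Encoding: quoted-printable
        "base64": base64.b64encode(octets),               # Content-Transfer-Encoding: base64
    }


if __name__ == "__main__":
    # The same Cyrillic word yields different octets under different charsets.
    for charset in ("koi8-r", "cp1251", "utf-8"):
        print(charset, on_the_wire("Привет", charset))
```

The receiving side has to undo the layers in reverse order: first the transfer encoding, then the charset mapping. This is why a wrong or missing charset label garbles the text even when the transfer encoding is handled correctly.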

RFC 2130 Architecture model

• User interface issues (OS, GUI, API)

– Layout

– Culture

– Locale

– Language

• On-the-wire

– The Coded Character Set

– The Character Encoding Scheme

– The Transfer Encoding Syntax

The testing and the evaluation scheme

[Diagram: a sending MUA and a receiving MUA connected through two MTAs. The set of ML test messages is the input on the sending side.]

Sending side - Message Composer

• OS Environment (Language, KBD, TTFs, l10n, etc.)

• MUA: Message Editor and Message Sender

– Compose Settings (Font (A, S, B, Q), Mapping)

– Change Settings (Language/Encoding)

– Send Settings (MIME (QP, Base64), uuencode)

– Compose Message (Type, Cut&Paste, Reply, Forward, Attachment)

• Sending Message

Receiving side - Message Reader (Human, User)

• OS Environment (Language, KBD, TTFs, l10n, etc.)

• MUA: Message Reader and Message Receiver

– Read Settings (Font (A, S, B, Q), Mapping)

– Change Settings (Language/Encoding)

– Receiving Settings (MIME (QP, Base64), uuencode)

– Read Message (Replied Msg, Forwarded Msg, Attachment)

• Receiving Message

Testing of Multilingual support in MUAs

• Includes the following phases:

– Evaluation of Multilingual features/settings of MUAs

– Testing Message Reading procedure

– Testing Message Composing procedure

– Testing Message Sending and Receiving procedure

Evaluation of Multilingual features/settings of MUAs

• READ operation mode

– choose Language/Encoding

– choose Fonts (optional for Address, Subject, Message Body, Quoted Text)

– Optional - Font mapping

• COMPOSE operation mode

– choose Language/Encoding Settings

– Optional - Possibility to switch Language/Encoding during composition/typing

– choose Fonts (optional for Address, Subject, Message Body, Quoted Text)

– Optional - choose Spelling/Language/Dictionary

• SEND operation mode

– set MIME encoding (Quoted Printable, Base64)

– Optional - select/disable Uuencode mode (non standard)

– Allow/disallow 8-bit in Header Fields

– select/disable HTML in body parts

Message Reading procedure

• Multilingual MUAs should support the following features:

– Reading/Displaying non-ASCII characters in Message Body

– Reading/Displaying non-ASCII characters in Message Header (Address, Subject Lines)

– Reading Forwarded Message with non-ASCII characters in Address, Subject, Message Body, using the same or different MIME character set attributes

– Reading Attached non-ASCII Text File (Document)

• Possible problems are detected by comparing the appearance of the original and the delivered test messages

– This includes evaluating whether the MUA processes the MIME attributes of the test message correctly.
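
As an illustration of what correct header processing means, the sketch below decodes a MIME encoded-word Subject into readable text. It relies only on Python's standard `email.header` module; the helper name `decode_subject` and the sample subject are hypothetical and are not taken from the project's test set.

```python
from email.header import Header, decode_header, make_header


def decode_subject(raw_subject):
    """Turn a raw Subject header value into displayable text.

    Each encoded-word carries its own charset label, so the header can
    legitimately use a different charset from the message body.
    """
    return str(make_header(decode_header(raw_subject)))


if __name__ == "__main__":
    # Build a sample raw header the way a sending MUA might.
    raw = Header("Тест", "koi8-r").encode()
    print(raw)                  # the ASCII-only form that travels on the wire
    print(decode_subject(raw))  # the text a reader should see
```

An MUA that shows the raw encoded-word, or that decodes it with the body charset instead of the charset named in the header, would fail the corresponding reading test.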

Message Composing procedure

• Message composition operations to be tested

– Typing message from keyboard

– Copy and Paste operations

– Text/File attachments

– Quoted text/message

– Edit different parts of message

– Charset/Encoding processing by Message Composer/Editor

• Real Message composition also includes operations like:

– Typing non-ASCII text in Message Body and Message Header

– Pasting non-ASCII-Text into Body and Header fields

– Reply to message with non-ASCII Text

– Forward message with non-ASCII content

– Attach text documents containing non-ASCII characters

Test messages set

Each test is performed in at least two character sets: one is US-ASCII (or ISO 8859-1), and the other contains characters that are not part of US-ASCII or ISO 8859-1.

• Mandatory

– tmsg1 - Message with non-ASCII characters/text in the Subject line

– tmsg2 - Message with non-ASCII characters/text in Mail Address free-form name

– tmsg3 - Message with non-ASCII characters/text in the Message Body text (single part)

– tmsg4 - Message with non-ASCII characters/text in text/plain attachment

• Optional

– tmsg6* - Message with UTF-7/UTF-8 Character set in Message Body and Header
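
A minimal sketch of how the four mandatory test messages could be generated is given below. It is not the project's Test Messages Constructor: it assumes Python's standard `email` package, and the addresses, the filler text, the attachment file name and the function name `build_tmsg` are all hypothetical.

```python
from email.header import Header
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import formataddr

SENDER = "tester@example.org"   # hypothetical address
RECIPIENT = "iut@example.org"   # hypothetical address
FILLER = "Plain ASCII filler text."


def build_tmsg(number, sample, charset):
    """Build tmsg1..tmsg4, placing `sample` where the slide says it goes."""
    if number == 4:
        # tmsg4: non-ASCII text in a text/plain attachment
        msg = MIMEMultipart()
        msg.attach(MIMEText(FILLER, "plain", "us-ascii"))
        part = MIMEText(sample, "plain", charset)
        part.add_header("Content-Disposition", "attachment", filename="sample.txt")
        msg.attach(part)
    elif number == 3:
        # tmsg3: non-ASCII text in a single-part body
        msg = MIMEText(sample, "plain", charset)
    else:
        msg = MIMEText(FILLER, "plain", "us-ascii")
    # tmsg1: non-ASCII Subject; tmsg2: non-ASCII free-form name in the address
    msg["Subject"] = Header(sample, charset) if number == 1 else "tmsg%d" % number
    msg["From"] = formataddr((sample, SENDER), charset) if number == 2 else SENDER
    msg["To"] = RECIPIENT
    return msg


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(build_tmsg(n, "Привет", "koi8-r").as_string())
        print("-" * 40)
```

Each message carries an explicit charset label for the part that holds the non-ASCII sample, so a failure on the receiving side can be attributed to the MUA under test rather than to an unlabeled message.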

Testing program map

Tests:

• test-1 - display

• test-2 - print

• test-3 - reply to tmsg1, tmsg2

• test-4 - reply to tmsg3

• test-5 - reply to tmsg3 using Cut&Paste

• test-6 - forward all

• test-7 - type from keyboard

• test-8 - exchange tmsg5

• test-9 - tests 1-5 with tmsg6

Test messages:

• tmsg1 - non-ASCII Subject

• tmsg2 - non-ASCII Address

• tmsg3 - non-ASCII Body

• tmsg4 - non-ASCII Attachment

• tmsg5 - non-Latin1 default

• tmsg6 - UTF-8 in Body, Header

Testing Methodology - The tests to be performed

• test-1 - Receive all 4 test messages tmsg1-tmsg4 and display them correctly (Change Language/Alphabet/Encoding Options if needed)

• test-2 - Print all 4 messages tmsg1-tmsg4 to the standard printer

• test-3 - Reply to messages tmsg1 and tmsg2, and check that information is returned in the same character set as it arrived in

• test-4 - Reply to message tmsg3 using "reply including quote of body"

• test-5 - Reply to message tmsg3 using the environment's "cut and paste" function to insert the non-ASCII characters into the outgoing message

• test-6 - Forward all 4 messages to the originator address

• test-7 - Generate, as completely as possible, the same messages from the keyboard of the IUT (implementation under test)

• test-8* - Check for possible text distortion when exchanging tmsg1, tmsg2 and tmsg3 with a non-ASCII default Language/Alphabet/Encoding

• test-9* - Perform tests 1-5 for message tmsg6* with UTF-7/UTF-8
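
The check in test-3 (the reply comes back in the same character set as the original) could be partly automated along the following lines. This is only a sketch under stated assumptions: it uses Python's standard `email` package, compares only the body charset label and the charsets used in the Subject, and the function names are hypothetical.

```python
import email
from email.header import decode_header


def subject_charsets(msg):
    """Return the set of charsets used by encoded-words in the Subject."""
    parts = decode_header(msg.get("Subject", ""))
    return {charset.lower() for _, charset in parts if charset}


def reply_keeps_charset(original_raw, reply_raw):
    """True if the reply uses the same body and Subject charsets as the original."""
    original = email.message_from_string(original_raw)
    reply = email.message_from_string(reply_raw)
    same_body = original.get_content_charset() == reply.get_content_charset()
    # The reply may add an ASCII "Re: " prefix, so only require that every
    # charset of the original Subject still appears in the reply Subject.
    same_subject = subject_charsets(original) <= subject_charsets(reply)
    return same_body and same_subject
```

A reply that silently re-labels KOI8-R text as another charset would be reported as a failure here, even if it happens to display correctly on the replying side.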

Testing Results Presentation

Example results sheet: MS Outlook Express 97 for Windows 95 (URL: http://www.microsoft.com/outlook/)

[Table: one row per Language/Encoding setting; cells hold "+" marks with footnote references such as *4, *5, *6 and **]

Rows (Language/Encoding Setting): Central European (ISO, Windows); Cyrillic (ISO, Windows, KOI8-R, KOI8-RU); ...; Universal Alphabet (UTF-7, UTF-8)

Columns:

• Examination: non-ASCII text (8-bit) Send/Receive/Attachment - Compose (As is, MIME (QP, Base64), UTF7/UTF8, HTML, HTML (Multipart/Alternative))

• Support of non-ASCII text in RFC 822 message parts/fields - Body, Subject, Address Free-form

• Testing: Support of non-ASCII text - Read, Type, Paste, Send, Forward message, Attached text, Messages List

• Notes, Problems, Recommendations

** You can't change encoding for Cyrillic text when reading message

ML MUAs Testing Results and Data Analysis

• Testing results are documented and presented at

– http://park.kiev.ua/multiling/ml-mua/prjdocs/mlmua-repv1.html

• Standards overview on Internationalisation and Multilinguality

– http://park.kiev.ua/multiling/ml-mua/mldoc-review.html

• Test messages constructor pilot version

– http://park.kiev.ua/multiling/ml-mua/testcon.html

Evaluation of ML MUAs

• First group - MUAs that support multiple languages/alphabets through support for multiple charsets and use internal language/charset transformation

– Microsoft Outlook Express

– Netscape Messenger 4.04 and its predecessor Netscape Mail 3

– exmh for X Windows

• Second group - MUAs that provide ML support by selecting the proper font for creating and displaying messages

– Eudora Pro 3.0

– Pegasus

– Forte Agent

– The Bat!

– Simeon

– UNIX terminal products: pine, elm

First group - Full Multilingual Support

• Microsoft Outlook Express

– has the best and richest multilingual support

– uses an effective internal conversion scheme that users can control well via setup and the Alphabet/Charset selection menu

• Netscape Messenger 4.04 and Netscape Mail 3.04

– provide rich multilingual support for many charsets/encodings

– but are very inflexible for languages that have many charsets in use (for example, Cyrillic Windows CP-1251 and KOI8-R/U for Russian/Ukrainian, or ISO 8859-2 and Windows CP-1250 for Central European languages)

– Netscape products for X Windows have the same features

• exmh for X Windows

– provides good support for the main groups of European languages using Latin 1, Latin 2 and Cyrillic charsets

Second group – Simplified Multilingual Support

• Popular in the Latin1 (ISO 8859-1) and English-speaking communities

• Support for languages and charsets/encodings is provided by selecting the proper font for creating and displaying messages

– Eudora Pro 3.0

– Pegasus

– Forte Agent

– The Bat! - provides simple conversion between Cyrillic encodings (ISO 8859-5, Windows CP-1251, KOI8-R)

– Simeon

– pine and elm for UNIX

Common problems of multilingual support in MUAs

• Conversion between different Encodings/Charsets for the same language

• Correct processing of MIME tags in message Header fields (Subject and Address lines) when displaying a message whose header charset differs from the charset of the Message Body

• The same problems occur when the user tries to change Charset/Encoding while displaying or composing a message, or uses Copy&Paste operations between different Charsets

• Viewing message source and/or message info (charset/encoding for the Header and Body, Multipart MIME structure, and so on)

• Using common and correct terminology for language/charset settings in MUAs
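
The first problem in the list, conversion between different charsets for the same language, can be sketched in a few lines. The example assumes Python's built-in codecs for KOI8-R, Windows CP-1251 and ISO 8859-5; the function names are hypothetical and the sketch does not describe how any of the tested MUAs implement conversion.

```python
def convert_octets(octets, source_charset, target_charset):
    """Re-encode text octets from one charset to another via Unicode."""
    return octets.decode(source_charset).encode(target_charset)


def misread(octets, assumed_charset):
    """Show what a reader sees when the octets are decoded with the wrong charset."""
    return octets.decode(assumed_charset, errors="replace")


if __name__ == "__main__":
    koi8 = "Привет".encode("koi8-r")
    # Correct conversion keeps the text intact.
    print(convert_octets(koi8, "koi8-r", "cp1251").decode("cp1251"))
    print(convert_octets(koi8, "koi8-r", "iso8859-5").decode("iso8859-5"))
    # Decoding KOI8-R octets as CP-1251 yields valid but wrong Cyrillic letters.
    print(misread(koi8, "cp1251"))
```

The conversion itself is mechanical once the source charset is known. The failures described in the list above typically arise because the source charset is mislabeled, missing, or overridden by a user setting.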

Project’s Main Results

• The international environment of the project made it possible to identify the main problems of multilingual support in MUAs

• Multilingual test messages set

• Evaluation scheme for the forthcoming ML MUAs

• Project activity was conducted in coordination with other multilingual related projects:

– IMC MAIL-I18N report on Internationalization and Character Set technologies

– Mozilla i18n project (Netscape 5.0): project team members have contributed to the new Ukrainian Language enabled Mozilla, and the proposed model of multilingual support in MUAs was discussed

– ESYS Simeon IMAP Mail multilingual features testing

Follow-on Projects and Activity

• Testing new products using the proposed methodology

– New releases: Outlook Express 98, Netscape Messenger 4.5 and 5.0

– New products of 1999 that are expected to implement the IETF/IMC recommendations

• Other areas of further activity

– Establishing a charsets repository for ML/i18n, providing online support for multilingual mail (download of mapping reference tables, translation, configuration, etc.)

– Creating a Web-based ML test messages Constructor; a pilot version is demonstrated on the project's page: http://park.kiev.ua/multiling/ml-mua/testcon.html

Test Messages Constructor http://park.kiev.ua/multiling/ml-mua/testcon.html

Test Messages Constructor - Creating test message

Project Team

Yuri Demchenko, TERENA

Konstantin Chuguev, Ural Technical University, Russia

Janja Faganel, Jozef Stefan Institute, Slovenia

Vadim Shevchenko, Kiev Polytechnic Institute

Alexey Medvedev, Kiev Polytechnic Institute

Acknowledgments

• Borka Jerman-Blazic, Jozef Stefan Institute, Slovenia

• Claudio Allocchio, Sincrotrone Trieste & INFN Trieste, Italy

• Peter Heijmens Visser, TERENA, for providing the MUA usage statistics

• Harald T. Alvestrand, Maxware Norway

IMPORTANT NOTE

The Multilingual page will be moved to, and maintained on, the TERENA web server:

http://www.terena.nl/multiling/
