Overview of technologies for translators and language service providers Belinda Maia University of Porto

Translator as Language Services Provider

• MUST HAVE KNOWLEDGE OF:
  – Science and Technology
  – National and International Economics, Politics, Law and Current Affairs
  – Multimedia
  – Human Language Technologies - HLT
  – Information Society Technologies - IST

• MUST BE:
  – A Multidisciplinary Communicator
  – A Multimedia Communicator
  – AND an Intercultural Communicator

Translator as Intercultural Communicator

• MUST HAVE KNOWLEDGE OF:
  – Psycholinguistics
  – Contrastive linguistics
  – Sociolinguistics
  – Cultural theory
  – Literary theory

• MUST BE:
  – Multi-lingual and multi-culturally sensitive

Translator as Multimedia Communicator

• MUST HAVE KNOWLEDGE OF:
  – General IT as user
  – Special IT for translators – MT, CAT etc
  – Subtitling and Dubbing programmes
  – Web Pages
  – ETC

• MUST BE:
  – Computer literate and aware of new media

Information Society Technologies

• European Programme at: http://cordis.europa.eu/ist/

• Focus on:
  – Technology for providing information
  – Language as vehicle of information
  – Language as structuring knowledge
  – Knowledge management

HLT (1) Calls for (research) proposals

• 1999/2000

• MLIS (Multi Lingual Information Society)
  – the provision of multilingual language resources over global networks
  – the development of multilingual networked services

HLT (2) Calls for (research) proposals

• 2000/2001

• Multilingual communication services and appliances
  – Multilingual e-service and e-commerce
  – Natural and multilingual interactivity
  – Multilingual web
  – Multimodal and multi-sensorial dialogue modes

HLT (3) Calls for (research) proposals

• 2002/6 – Focus on
  – Knowledge and Interface Technologies
    • Multi-modal interfaces
    • Semantic-based knowledge systems
  – Cognitive systems
  – Bio-inspired Intelligent Information Systems

Multimodal Interfaces

• Multilingual Communication
  – Facilitating translation for unrestricted domains, especially for spontaneous (unrestricted) or ill-formed (speech) inputs in task-oriented settings.

• Areas to be addressed include:

Multimodal Interfaces

• human-to-human;

• human-to-things;

• human-to-self;

• human-to-content;

• device-to-device;

• human-to-embodied robots.

Multimodal Interfaces

• Areas to be addressed include:

• speech-to-speech translation;

• statistical/mixed approaches to translation;

• adaptive techniques, incorporating learning;

• robustness of approach.

Don’t forget

• HLT research proposals are for cutting-edge technology

• The results will be in the future

• But the future is coming!

Technology FOR Translators

• Machine translation (MT)

• Machine assisted translation (MAT)

• Internet for information retrieval

• Corpora use

• Terminology Management

• Multimedia tools

• Summarisation and Revision

MT – a threat, a solution or a tool?

• A threat?
• Under present circumstances - No
• A solution?
• Partially > ‘gist’ translation
• A tool?
• Increasingly > + pre- and post-editing
• OR
• Human Assisted MT (HAMT)

Online MT - uses

• Training in awareness of lexical and syntactic difficulties for both human and machine translation

• Our experiment with METRA

• It gets hundreds of hits per day, so who is using it?

• A lot of translators!

MAHT Commercial Programmes

• SDL + TRADOS - Check

• http://www.sdl.com/

• http://www.trados.com/

• DÉJÀ VU http://www.atril.com/

• STAR - TRANSIT http://www.star-group.net/eng/home.html

• WORDFAST - http://www.wordfast.net/

MAHT Basic tools

• Translation memories (TMs) + concordancer

• TM created:
  – As translator works
  – Using text aligner on previous texts + translations

• Terminology database created:
  – Pre-translation by terminologist / company / translator
  – Post-translation by aligning terms in text and translation
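To make the idea of a translation memory lookup concrete, here is a minimal sketch in Python. It is not how any of the commercial programmes above work internally; it assumes a TM is simply a list of (source, target) pairs and uses the standard library's `difflib.SequenceMatcher` as a stand-in for a fuzzy-match score. The function name `best_tm_match`, the 0.75 threshold and the EN > PT example sentences are all invented for illustration.

```python
from difflib import SequenceMatcher


def best_tm_match(segment, memory, threshold=0.75):
    """Return (score, source, target) for the closest stored segment, or None.

    The score is difflib's similarity ratio (0.0 to 1.0), used here as a
    rough stand-in for the 'fuzzy match' percentage shown by TM tools.
    """
    best = None
    for source, target in memory:
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best


# Toy memory of (source, target) pairs; the sentences are invented examples.
memory = [
    ("Click the Save button.", "Clique no botão Guardar."),
    ("Close the file.", "Feche o ficheiro."),
]

match = best_tm_match("Click the Cancel button.", memory)
if match is not None:
    score, source, target = match
    print(f"{score:.0%} match: {source!r} > {target!r}")
```

The translator would then edit the suggested target rather than start from scratch; a new sentence with no close neighbour in the memory returns `None` and is translated normally.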

MAHT Additional tools

• Spelling and grammar checkers – in Word
• Machine Translation
• File formatting facilities
• Terminology > knowledge databases
• Project Management facilities
• ETC
• For further details come to the commercial sessions on Wednesday!

eCoLoRe Training kits for TM technology

• Problem: OK – we have bought the TM software for our university – but it is empty!

• Solutions?

• Make your own TMs

• eCoLoRe at http://ecolore.leeds.ac.uk/

Translation technology - Needs

• To find, keep and re-use information

• To work within multimedia technology

• Good understanding of Linguistics

• Understanding of how/why spelling and grammar checkers, MT, and other HLTs do(n’t) work

Using the Internet

• To find information
  • Understanding how the internet works
  • Using browsers intelligently

• To keep information
  • Collecting site links
  • Downloading useful information

• To convert information to knowledge
  • Studying special subjects

Internet information

• Eurodicautom, online terminology, glossaries, dictionaries

• On-line encyclopedias – e.g. Wikipedia

• Translators’ pages

• Translators’ forums and mailing lists

• Systematic finding, analysing and storage of relevant information / knowledge

Monolingual Corpora as tools

• Large quantities of varied types of text

• British National Corpus (BNC) – online at: http://sara.natcorp.ox.ac.uk/lookup.html

• Linguateca – Portuguese corpora – online at: http://www.linguateca.pt

• PLEASE inform of others!

Multilingual Corpora as tools

• EU documents at: http://europa.eu.int/

• Parallel corpora (Translation Memories?)
  – E.g. COMPARA > EN & PT (literary) online at: http://www.linguateca.pt – 1 million x 2

• Comparable corpora – originals in different languages, but same domain and/or genre

Corpora - uses

• Monolingual corpora – finding the right word or collocation

• Multilingual / parallel corpora – finding terminology and translation suggestions

• Comparable corpora – discovering expert terminology and local text conventions
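The first use above, finding the right word or collocation, is usually done with a concordancer that shows a keyword in context (KWIC). The following Python sketch shows the principle only; it assumes the corpus is a plain string, and the function name `kwic`, the `width` parameter and the sample sentences are invented for illustration rather than taken from the BNC or Linguateca interfaces.

```python
import re


def kwic(text, keyword, width=30):
    """Return keyword-in-context lines: left context, [keyword], right context."""
    lines = []
    pattern = r"\b" + re.escape(keyword) + r"\b"
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines


# Invented sample text standing in for a monolingual corpus.
sample = ("The translator made a decision. Making a decision takes time. "
          "She took a decision quickly.")

for line in kwic(sample, "decision"):
    print(line)
```

Reading down the aligned column shows which verbs occur with the noun (here "made", "making", "took"), which is the kind of evidence a translator looks for when checking a collocation.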

Terminology > Knowledge

From:

• The ‘right word’
• Glossaries / dictionaries
• Databases
• Thesauri
• Conceptual organization
• Ontologies
• Knowledge databases

Corpógrafo – integrated suite of online tools

• Corpora construction and analysis
• Semi-automatic term extraction
• Concept databases
• Traditional terminology fields
• Semi-automatic extraction of definitions and semantic relations
• Visualization of concept systems / ontologies
• Produced by Linguateca – PoloCLUP and freely available at: http://www.linguateca.pt/corpografo
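As a rough illustration of what "semi-automatic term extraction" can mean, the Python sketch below counts repeated word pairs that do not begin or end with a function word and offers them as term candidates for a human to accept or reject. This is not a description of Corpógrafo's own method, which is not given here; the function name `candidate_terms`, the small stopword list and the sample text are assumptions made for the example.

```python
import re
from collections import Counter

# A deliberately tiny English stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are",
             "for", "with", "on", "by"}


def candidate_terms(text, n=2, min_freq=2):
    """Return (n-gram, frequency) pairs that recur and are not stopword-edged."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter()
    for i in range(len(tokens) - n + 1):
        gram = tokens[i:i + n]
        if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
            continue
        counts[" ".join(gram)] += 1
    return [(term, freq) for term, freq in counts.most_common() if freq >= min_freq]


# Invented sample text standing in for a specialised corpus.
text = ("A translation memory stores segments. The translation memory is searched "
        "for each new segment, and the translation memory grows as the translator works.")

print(candidate_terms(text))
```

The "semi-automatic" part is the human step that follows: the terminologist reviews the candidate list, discards noise and records the accepted terms in the database.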

Multimedia translation

• Localization
• XML
• Sub-titling
• Dubbing
• Web-pages
• Software for interpreters
• Speech-to-speech machine translation & interpretation?

Other skills ± software

• Revision

• Translation evaluation

• Summarization

• Terminology management

• Information retrieval

• Project management

Linguistics

• Essential training for translators
  – General linguistics
  – Contrastive linguistics

• Translators > language experts > new specializations
  – Natural language processing
  – Translation and terminology tools

Group work

How much of this technology do you

• Use?
• Find useful?
• Know about?
• Believe to be useful?
• Don’t know about?
• Want to find out more about?
• Believe to be (ir)relevant to translating as a profession?