Translating into the L2: Corpus tools and resources Federico Zanettin Università di Perugia


Translating into the L2: Corpus tools and resources

Federico Zanettin

Università di Perugia


Outline

• Translation into the L2
• Corpus resources and tools
• Sample translation activity
  • Role of corpora
  • Role of students
  • Role of teacher
• Conclusions


Translation into the L2

• Is translation into L2 to be avoided?
  • standard practice in translator training textbooks and manuals
  • actual professional practice
  • actual practice in L2 learning environments
• Should the teacher be a native speaker of the L2?
  • Many are not…
• A number of studies challenge these views, e.g. Campbell 1998; Stewart 1999, 2000, forthcoming; Grosman et al. 2000; Kelly et al. 2003; Pokorn 2005; Kearns forthcoming; …


Corpus resources and tools

• Corpora
  • The Web as corpus
  • The Web as a source for DIY corpora
  • Online corpora
    • Monolingual
    • Parallel
  • ‘Traditional’ corpora (i.e. non-native electronic texts)
    • Monolingual
    • Multilingual/Parallel
• Tools
  • General purpose search engines (e.g. Google)
  • Online corpus analysis services
  • Stand-alone corpus analysis software (e.g. WordSmith Tools, TextSTAT, ParaConc)
  • Custom software (e.g. Xaira, ENPC Explorer, etc.)


The Web as corpus

• Search engines: advanced options
• Specialized “sub-webs”
  • Google Scholar
  • Google Books
• Online concordancers
  • WebCorp
  • WebCONC
  • KwikFinder
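
What a concordancer returns can be illustrated with a minimal keyword-in-context (KWIC) sketch. This is not how WebCorp, WebCONC or KwikFinder work internally; the function name `kwic`, the `width` parameter and the sample text are all assumptions made for illustration.

```python
import re

def kwic(text, node, width=30):
    """Return keyword-in-context lines for every occurrence of `node`."""
    lines = []
    pattern = r"\b" + re.escape(node) + r"\b"
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        # Take up to `width` characters of context on each side of the node.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the node words line up vertically.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

# Invented sample text, for illustration only.
sample = (
    "The grains are barely visible to the naked eye. "
    "Under a microscope, what the naked eye misses becomes clear."
)

for line in kwic(sample, "naked eye"):
    print(line)
```

Aligning the node in a single column is what lets a reader scan the left and right context for recurring patterns.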


The Web as source of DIY corpora

• Manual DIY corpora
  • Download + corpus analysis software (e.g. Wordsmith Tools, TextStat, etc.)
• (Semi) automatic DIY corpora
  • Sketch Engine


Sketch Engine

• Create your instant DIY web corpora
• Add linguistic annotation to your corpora
• Consult very large corpora for many languages
  • Word lists
  • Concordances
  • Word profiles (Word Sketch)
  • …


Word Sketch for ‘Disease’


Word Sketch

A Word Sketch is a corpus-based summary of a word's grammatical and collocational behaviour.

Each column shows the words that typically combine with disease in a particular grammatical relation. For example, “object_of” lists, in order of statistical significance rather than raw frequency, the verbs that most typically occupy the verb slot in cases where disease is the object of a verb.

Switching between Concordance mode and Word Sketch mode is a useful way of getting more information about a particular word combination. Thus, if you want to look at examples of the string “transmit + disease”, simply click on the number next to “transmit” in the object_of list (93) and you will be taken directly to a concordance showing all instances of this combination.

Adapted from the Sketch Engine website
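
The difference between ranking by statistical significance and ranking by raw frequency can be sketched with logDice, an association score described in the Sketch Engine documentation: 14 + log2(2 · f(xy) / (f(x) + f(y))). The slide does not say which measure produced the list shown, so treat the choice of logDice as an assumption. The count of 93 for “transmit” echoes the slide; every other figure below is invented.

```python
import math

def log_dice(pair_freq, node_freq, collocate_freq):
    """logDice association score: 14 + log2(2 * f_xy / (f_x + f_y))."""
    return 14 + math.log2(2 * pair_freq / (node_freq + collocate_freq))

NODE_FREQ = 10_000  # hypothetical corpus frequency of "disease"

# verb: (co-occurrences with "disease" as object, overall frequency of the verb)
# Only the 93 comes from the slide; the remaining counts are made up.
candidates = {
    "have": (400, 2_000_000),
    "cause": (300, 60_000),
    "transmit": (93, 3_000),
}

by_frequency = sorted(candidates, key=lambda v: candidates[v][0], reverse=True)
by_score = sorted(
    candidates,
    key=lambda v: log_dice(candidates[v][0], NODE_FREQ, candidates[v][1]),
    reverse=True,
)

print("raw frequency:", by_frequency)
print("logDice:      ", by_score)
```

With these invented counts, a very common verb such as “have” tops the frequency list but drops to the bottom of the logDice list, because most of its occurrences have nothing to do with “disease”.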


Online corpora

• Monolingual
  • Leeds Internet corpora
  • The Corpus of Contemporary American English (COCA)
  • etc.
• Bilingual
  • OPUS
  • Compara
  • …


Internet Corpora at Leeds

al-luġatu l-’arabiyyatu l-fuṣḥā


OPUS (Europarl parallel corpus)

• “in modo sistematico”
• “systematically” vs. “in a systematic way”
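
A parallel-corpus lookup of this kind can be sketched as a filter over aligned sentence pairs followed by a tally of candidate renderings. This is not the OPUS interface; the function names and the four sentence pairs are invented for illustration.

```python
def parallel_concordance(pairs, source_phrase):
    """Return the aligned (source, target) pairs whose source contains the phrase."""
    phrase = source_phrase.lower()
    return [(src, tgt) for src, tgt in pairs if phrase in src.lower()]

def count_renderings(hits, candidates):
    """Count how many target sentences contain each candidate rendering."""
    return {
        cand: sum(1 for _, tgt in hits if cand.lower() in tgt.lower())
        for cand in candidates
    }

# Invented Italian-English pairs, standing in for aligned corpus segments.
aligned_pairs = [
    ("Dobbiamo affrontare il problema in modo sistematico.",
     "We must tackle the problem systematically."),
    ("I dati vengono raccolti in modo sistematico.",
     "The data are collected in a systematic way."),
    ("La questione è stata discussa ieri.",
     "The issue was discussed yesterday."),
    ("Occorre verificare in modo sistematico i risultati.",
     "The results need to be checked systematically."),
]

hits = parallel_concordance(aligned_pairs, "in modo sistematico")
counts = count_renderings(hits, ["systematically", "in a systematic way"])
print(counts)
```

Substring matching is crude: it misses paraphrases and ignores word order, so the tally is a starting point for reading the concordance lines, not a replacement for it.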


The Web vs. well-constructed corpora

• Corpora = reliability, core patterns of language use
• The Web
  • Lexical and terminological richness
  • Multi-word expressions
    • “naked eye”


“to the naked eye”

• Google = 2.5 million hits
• BNC = 884 hits


“visible to the naked eye”

• Google = 1.2 million hits
• BNC = 18 hits


“barely visible to the naked eye”

• Google = 83,000 hits
• BNC = none


“be barely visible to the naked eye”

• Google = 49,000 hits
• BNC = none


“grains that are so small as to be barely visible to the naked eye”

Google = 5 hits (2 different results, duplicated)
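
The pattern in the preceding slides, where hits fall away as the search string grows, can be reproduced on any text collection by counting progressively longer phrases. The sketch below uses an invented four-sentence mini-corpus; the function name `phrase_hits` is hypothetical and the counts say nothing about Google or the BNC.

```python
import re

def phrase_hits(text, phrase):
    """Count case-insensitive occurrences of an exact word sequence."""
    words = [re.escape(word) for word in phrase.split()]
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

# Invented mini-corpus, for illustration only.
mini_corpus = (
    "Some stars are visible to the naked eye. "
    "The cracks were barely visible to the naked eye. "
    "To the naked eye the surface looks smooth. "
    "Bacteria cannot be seen with the naked eye."
)

queries = [
    "naked eye",
    "to the naked eye",
    "visible to the naked eye",
    "barely visible to the naked eye",
    "be barely visible to the naked eye",
]

hits = [phrase_hits(mini_corpus, query) for query in queries]
for query, count in zip(queries, hits):
    print(f"{count:>2}  {query}")
```

The smaller the collection, the sooner a longer string returns nothing, which is one reason the sheer size of the Web is useful for checking multi-word expressions.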


Sample activity

• Revise the output of an online machine translation system
• Source text: specialized text in a curricular field (e.g. history, economics, politics)
• Tools for revision
  • Dictionaries
  • Corpus resources and tools


An example

• Source text: I puritani della Nuova Inghilterra furono i primi fra tutti i coloni inglesi d'America ad elaborare in modo sistematico una teoria originale dello Stato e della società.

• MT output: The puritani of New England were the first between all coloni English of America to elaborate in systematic way a theory originate them of the State and the society.

• Revised version: New England Puritans were the first among all English colonists of America to elaborate systematically an original theory of State and society.


Google advanced search

• New England Puritans
• The New England Puritans
• (The) Puritans of New England
• (The) New England’s Puritans


How the students worked

• Doubts about the MT outcome
  • Unknown words and expressions
  • Too literal renderings

“in some cases, it was just a matter of verifying the accuracy of the MT output, whereas in others there were good reasons to improve the overall quality of the text.”


Use of corpus resources

• Does something exist? Are there better alternatives?
  • Doubts confirmed (MT wrong)
  • Doubts disconfirmed (MT right)
• Specific terminology
• Need to
  • ask the right questions
  • formulate queries properly
  • analyse results successfully


Example 1

• Can globalization “exercise an effect” on income redistribution?
• Google search = no results
• Search for “an effect”
  • Something can “have” or “produce” an effect
• “globalization has (a number of) effects on income redistribution”
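
The step from “no results” to “search for the noun phrase and see what precedes it” can be sketched as a count of left-hand collocates. The function name `left_collocates` and the sample sentences are invented; a real check would run over a corpus or over search results.

```python
import re
from collections import Counter

def left_collocates(text, node):
    """Count the word immediately preceding each occurrence of `node`."""
    pattern = r"\b(\w+)\s+" + re.escape(node) + r"\b"
    matches = re.findall(pattern, text, flags=re.IGNORECASE)
    return Counter(word.lower() for word in matches)

# Invented sample sentences, for illustration only.
sample_text = (
    "Trade can have an effect on wages. "
    "Tariffs had an effect on prices. "
    "The reform produced an effect within a year. "
    "Such policies have an effect on growth."
)

verbs = left_collocates(sample_text, "an effect")
for verb, count in verbs.most_common():
    print(f"{count}  {verb} an effect")
```

The counter works on word forms, so “have” and “had” are tallied separately; grouping them would need lemmatization, which this sketch leaves out.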


Example 2

• “the central theme of the debate”
• Very literal: “il tema centrale del dibattito”
• Google search = many results
• EU proceedings parallel corpus = many results


Example 3

• “processi (economici) in corso” = “(economic) processes Corsican”?
• Search for “processes” (COCA)
  • ongoing + processes (frequent collocates)
• “ongoing (economic) processes”
  • Attested in comparable texts (sources of concordance lines)


Example 4

• “la diffusione di nuove tecnologie” = “the spread of new technologies”?
• Google search = attested expression
• But: what about “diffusion”?
• Search for “spread” vs “diffusion” (Web + COCA)
• Search for “the spread of * technologies” vs “the diffusion of * technologies”
• Spread = general English
• Diffusion = academic English
• “the diffusion of new technologies”
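
A wildcard query such as “the diffusion of * technologies” can be approximated offline with a regular expression in which the asterisk stands for exactly one word. The helper name `slot_fillers` and the sample text are assumptions; search engines and COCA each have their own wildcard syntax and behaviour.

```python
import re
from collections import Counter

def slot_fillers(text, template):
    """Count the words filling the first * slot in `template` (one word per *)."""
    parts = [r"(\w+)" if token == "*" else re.escape(token)
             for token in template.split()]
    pattern = r"\b" + r"\s+".join(parts) + r"\b"
    return Counter(
        match.group(1).lower()
        for match in re.finditer(pattern, text, flags=re.IGNORECASE)
    )

# Invented sample text, for illustration only.
sample_text = (
    "The diffusion of new technologies reshaped the sector. "
    "Scholars study the diffusion of digital technologies. "
    "The spread of new technologies was rapid. "
    "The diffusion of new technologies is uneven."
)

diffusion = slot_fillers(sample_text, "the diffusion of * technologies")
spread = slot_fillers(sample_text, "the spread of * technologies")
print("diffusion:", dict(diffusion))
print("spread:   ", dict(spread))
```

Comparing the two counters side by side mirrors the “spread” vs “diffusion” check, although on real data the register of the source texts matters as much as the raw counts.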


Example 5

“avere i requisiti per votare” = “have the requirements to vote”?

Dictionary: “fulfil/satisfy/comply with/suit/match the requirements”

Corpora: “meet the requirements”


Role of corpora

• "dictionary items + combinatory rules" VS "corpora + rules for querying and analyzing them"
• focus on language units larger than the single word
• Multiple local grammars
  • grammars for 1, 2, 3… word combinations
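
Frequency lists for one-, two- and three-word combinations are one simple way to look at units larger than the single word. The sketch below is an assumption about how such lists might be produced, not a description of any tool named in these slides; the function name `ngram_counts` and the sample text are invented.

```python
import re
from collections import Counter

def ngram_counts(text, n):
    """Count every run of n consecutive word tokens (lowercased)."""
    tokens = re.findall(r"\w+", text.lower())
    # Note: punctuation is discarded, so n-grams may span sentence boundaries.
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )

# Invented sample text, for illustration only.
sample_text = (
    "Applicants must meet the requirements. "
    "Voters who meet the requirements can register. "
    "They meet the requirements."
)

for n in (1, 2, 3):
    top, freq = ngram_counts(sample_text, n).most_common(1)[0]
    print(f"{n}-word list, top entry: {top!r} ({freq})")
```

Recurrent three-word strings such as the one in this sample are the kind of unit a single-word dictionary entry does not make visible.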


• Unanalyzed knowledge
• Acquisition vs learning
• Corpora used to produce generalizations
  • Gerund + “is not a duty”


Role of students

• Serendipity/discovery learning
• A corpus is not necessarily “expected to provide the right answers … but constantly presents new challenges and stimulates new questions, renewing the user’s curiosity and offering ample opportunity for researching aspects of language and culture” (Bernardini 2002:166).


Role of teacher

• Guide, facilitator vs “walking dictionary”
• Can only L2 native speakers be good translation teachers?
• Native speakers
  • More knowledgeable about target language
• Non-native speakers
  • More knowledgeable about source language
  • Same directionality of translation
  • Better understanding of translation difficulties
  • Better able to evaluate translation process


Risks

• Insufficient expertise in the use of software will result in clumsy and superfluous searches
  • so enough time should be devoted to teaching search techniques, which are often specific to the corpora used
• Insufficient expertise in the analysis of the data (concordances) will result in wrong conclusions ... and in turn in bad translations
  • so enough time should be devoted to teaching how to manipulate and interpret corpus data
• However, something can also be learned from less successful learners, whose comments highlight areas of difficulty.


Conclusions

• By using corpora in L2 translation, learners can heighten their awareness of contrastive aspects and of varieties of possible translations.

• Even if equipped with limited formal linguistic knowledge, learners are given the opportunity to discover language rules and conventions by themselves.

• The use of corpus resources in a translation task fosters reading and writing skills and encourages self-confidence and autonomy.

• Teachers do not necessarily have to be native speakers of the target language; rather, they need to be experts in using resources, formulating queries and evaluating findings.