
Comparable Corpora for Terminology


Page 1: Comparable Corpora for Terminology

Comparable Corpora for Terminology

Stella E. O. Tagnin - USP
Corpus Linguistics, Translation and Terminology

New Technologies in Translation - CAPES
Universitat Rovira i Virgili-Universidade de São Paulo

Tarragona, June 9-10, 2009

Page 2: Comparable Corpora for Terminology

Comparable Corpora

Natural language in both corpora
  • Phraseologies (Conventionality)
  • Terminology
  • Discourse

Make for acquaintance with research area
  • Basic notions
  • “Clues” for further research

Page 3: Comparable Corpora for Terminology

Online corpora with built-in tools

Page 4: Comparable Corpora for Terminology

Results of your search - BNC
Your query was: salt
Here is a random selection of 50 solutions from the 2943 found...

ABB 131   Add the pimentos, salame and boiling water or stock to the pan with most, but not all of, the parsley and a little salt and pepper.
ABB 105   A pinch of salt is taken for granted in many cake recipes and is added simply to bring out the flavour of the other ingredients.
ABB 1332  Return the veal to the pan, add the fresh and dried tomatoes, rosemary, wine, stock, salt and pepper.
AMU 1667  The sea churned the banalities of his life into flotsam: sheets, shirts, sandals, books, charts, salt cellar…
B77 931   But it is possible to reduce salt consumption further by ‘placing the salt shaker at some distance from the table’.
BPG 1548  freshly ground black pepper and salt
C97 618   SALT is the spice of life.
CFS 1681  Substitute LoSalt for common salt, at the table and in cooking to reduce your family's salt intake.
G36 1259  Sieve flour into a bowl with pinch salt.

Page 5: Comparable Corpora for Terminology

Query Results - Cobuild
NOTE: no more than 40 lines will be displayed here, since a threshold has been implemented. If there were more than 40 instances found, a random selection will have been applied.

are more effective than a pinch of salt. [p] Fold in with a metal spoon,
chutney [/h] 2 oz walnuts [p] 1/4 tsp salt [p] 1/4 tsp cayenne pepper [p] 2
sea `vegetables" of all types. Sea salt also provides some as does
sw3 (tel: 071-276 5599). 4 Topiary salt and pepper pots by Swid
add the saffron and stir. Season with salt and pepper.
markets and collected cartoon animal salt and pepper shakers, plastic cuckoo
entertaining even Hollywood moguls to salt-beef sandwiches in
mainly with boiled water, sugar and salt, can save most diarrhea victims'
salt. [p] Herb, vegetable and spice salt: compounds of salt with other
few leaves crisp iceberg lettuce [p] salt, freshly ground black pepper [p] a
great lover of liberally sprinkling salt on her food at the table, thereby
served fried egg and crisp slices of salt pancetta. [p] Caesar salad was
tray at my head. A large pinch of salt should be applied to this story.
mousse-like. Sift over the flour and salt, then fold in to the eggs and
Lanzarote round potatoes and rock salt. Tomatoes, sweet potatoes,
I took her words with a grain of salt, went home, put the sample on a

Page 6: Comparable Corpora for Terminology

T-Score - salt

Collocate   Corpus Freq   Joint Freq   Significance
And         1369241       813          17.533492
Pepper      903           285          16.869713
With        364279        237          9.984592
Lake        2689          93           9.579897
Sugar       2472          90           9.427256
Water       15678         86           8.887077
Black       16881         84           8.744025
City        20496         85           8.711252
Pinch       405           73           8.533166
Ground      8804          64           7.748380
Sea         5756          59           7.509810
Tsp         290           53           7.271002
Flour       676           51           7.119786
Add         5006          52           7.052378
Freshly     404           45           6.694434
Season      16627         51           6.609096

Page 7: Comparable Corpora for Terminology

Mutual Information - salt

Collocate    Corpus Freq   Joint Freq   Significance
Pepper       903           285          10.431903
Monosodium   12            3            10.095633
Glutamate    13            3            9.980145
Dampier      18            4            9.925691
Tsp          290           53           9.643600
Pinch        405           73           9.623633
Teaspoon     151           27           9.612068
Paprika      51            9            9.593083
Crinkle      17            3            9.593083
Vinegar      272           40           9.330022
Nutmeg       82            12           9.322967
Sodium       150           18           9.036634
Oregano      43            5            8.991187
Freshly      404           45           8.929159
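The T-Score and Mutual Information lists for "salt" rank collocates differently: T-Score puts very frequent words such as "and" at the top, while Mutual Information promotes rare but exclusive partners such as "monosodium". The sketch below shows why, assuming the common textbook definitions of both measures. The slides do not state the exact formulas, span, or corpus size used by the Cobuild tool, so the function names and all counts here are hypothetical.

```python
import math

def expected_joint(node_freq, coll_freq, corpus_size):
    """Joint frequency expected if node and collocate were independent."""
    return node_freq * coll_freq / corpus_size

def t_score(joint, node_freq, coll_freq, corpus_size):
    """Textbook t-score: (observed - expected) / sqrt(observed)."""
    e = expected_joint(node_freq, coll_freq, corpus_size)
    return (joint - e) / math.sqrt(joint)

def mutual_information(joint, node_freq, coll_freq, corpus_size):
    """Textbook MI: log2(observed / expected)."""
    e = expected_joint(node_freq, coll_freq, corpus_size)
    return math.log2(joint / e)

# Hypothetical counts: (joint, node_freq, coll_freq, corpus_size)
rare = (3, 1000, 12, 1_000_000)            # rare, exclusive collocate
frequent = (300, 1000, 100_000, 1_000_000)  # very frequent collocate

for label, counts in (("rare", rare), ("frequent", frequent)):
    print(label,
          "t-score:", round(t_score(*counts), 2),
          "MI:", round(mutual_information(*counts), 2))
```

With these toy counts the rare collocate gets the higher MI and the frequent one the higher t-score, which is the pattern visible in the two lists for "salt".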

Page 8: Comparable Corpora for Terminology

Equivalence in Translation

Pragmatic definition:
  • a term that “works” in target text as it “works” in source text
Translator aims at
  • fluent translation to ensure better understanding on the reader’s part
Therefore he should not
  • use an unusual term which might sound strange to reader or cause ambiguity

Page 9: Comparable Corpora for Terminology

Dictionaries

  • lack of criteria in compiling terms
  • lack of usage examples
  • no updating due to rapid development of scientific and technological research
  • no coverage of various technical areas

Page 10: Comparable Corpora for Terminology

Advantages of a corpus

  • built according to translator’s needs
  • can be constantly updated
  • offers authentic examples of usage
  • therefore, translator feels secure in his choice of term to use

Page 11: Comparable Corpora for Terminology

CorTec
http://www.fflch.usp.br/dlm/comet/consulta_cortec.html

CorTec (Technical Corpus), part of COMET – Corpus Multilíngüe para Ensino e Tradução (Multilingual Corpus for Teaching and Translation)

  • Fifteen comparable corpora English-Portuguese
  • Each corpus approx. 200,000 words in each language
  • Description of content

Page 12: Comparable Corpora for Terminology

CorTec
http://www.fflch.usp.br/dlm/comet/consulta_cortec.html

CorTec (Technical Corpus), part of COMET – Corpus Multilíngüe para Ensino e Tradução

  • Kidney failure
  • Computing I & II
  • Ecotourism
  • Contracts
  • Cooking Recipes I & II
  • Hypertension
  • Linguistics
  • Nutritional Supplements
  • Football (soccer)
  • Coffee
  • Cultural Tourism
  • Astronomy
  • Electromagnetic flowmeters

Page 13: Comparable Corpora for Terminology

CorTec built-in tools
http://www.fflch.usp.br/dlm/comet/consulta_cortec.html

Wordlist
  • by frequency and alphabetical
Concordancer
  • by word or expression (Expression or word equal to)
  • by prefixes or beginning of word (Beginning with)
  • by suffix or endings (Ending in)
  • by parts of words (Containing)
N-gram generator
  • combinations of 2, 3 or 4 words
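The three tools listed on this slide can be approximated in a few lines. This is a minimal sketch and not the CorTec implementation: the function names, the tokenizer, and the sample sentence are all assumptions, and the concordancer here matches single words only, not multi-word expressions.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return re.findall(r"\w+", text.lower())

def wordlist(tokens, by="frequency"):
    """Wordlist sorted by frequency (default) or alphabetically."""
    counts = Counter(tokens)
    if by == "alphabetical":
        return sorted(counts.items())
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# The four search modes named on the slide.
MATCHERS = {
    "equal": lambda tok, q: tok == q,
    "beginning": lambda tok, q: tok.startswith(q),
    "ending": lambda tok, q: tok.endswith(q),
    "containing": lambda tok, q: q in tok,
}

def concordance(tokens, query, mode="equal", width=3):
    """Return (left context, hit, right context) for every match."""
    match = MATCHERS[mode]
    lines = []
    for i, tok in enumerate(tokens):
        if match(tok, query):
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

def ngrams(tokens, n):
    """Count all contiguous n-word combinations."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Hypothetical sample text
sample = "Add salt and pepper. Season with salt and pepper, then add a pinch of salt."
tokens = tokenize(sample)

print(wordlist(tokens)[:3])
print(concordance(tokens, "pep", mode="beginning"))
print(ngrams(tokens, 3).most_common(1))
```

Keeping the match modes in a dictionary mirrors the menu of search options on the slide and makes it easy to add further modes.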

Page 14: Comparable Corpora for Terminology
Page 15: Comparable Corpora for Terminology

Identifying equivalents

  • by Wordlist
  • by Collocates
  • by Translation of collocates
  • by Concordances
  • by Context

Page 16: Comparable Corpora for Terminology

Identifying equivalents by Wordlist

http://www.fflch.usp.br/dlm/comet/consulta_cortec.html

Page 17: Comparable Corpora for Terminology

Contracts

Wordlist in both languages
Portuguese: most frequent content word
  • contrato - 1832 occurrences
  • contract (in the English corpus): just 186 times
  • contrato ≠ contract
English: most frequent content word
  • agreement - 1724 occurrences
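The wordlist comparison above can be sketched as follows: filter out function words in each corpus and compare the top-ranked content words. The stopword lists and token samples below are tiny hypothetical stand-ins, not the CorTec data, and a shared rank is only a clue that still has to be checked in concordances.

```python
from collections import Counter

# Tiny illustrative stopword lists (assumption; real lists are much longer).
STOPWORDS = {
    "en": {"the", "of", "this", "and", "by", "may", "be", "to"},
    "pt": {"o", "a", "do", "de", "deste", "e", "os", "as"},
}

def top_content_words(tokens, lang, n=1):
    """Most frequent tokens after removing function words."""
    counts = Counter(t for t in tokens if t not in STOPWORDS[lang])
    return counts.most_common(n)

# Hypothetical token samples modelled on contract language
en_tokens = ("the term of this agreement the purposes of this agreement "
             "this agreement may be cancelled").split()
pt_tokens = ("o objeto do presente contrato a rescisão deste contrato "
             "o presente contrato").split()

print(top_content_words(en_tokens, "en"))
print(top_content_words(pt_tokens, "pt"))
```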

Page 18: Comparable Corpora for Terminology

agreement vs. contrato

1a For the purposes of this Agreement, all merchantable Logs 6" in diameter
1b Constitui objeto do presente contrato o intercâmbio eletrônico de documentos

2a 12.2 This Agreement may be cancelled by either party, at it
2b 13.2 A rescisão deste contrato implicará retenção de créditos decorren

3a The term of this Agreement shall expire June 30, 2001, (the “Term”
3b O presente contrato terá prazo de (xxx), iniciando-se no di

Relinquished Property Contract
Replacement Property Contract

Adhesion contract
  • 42,000 “adhesion contract”
  • 890 “adhesion agreement” (mostly translated sites)

Page 19: Comparable Corpora for Terminology

Identifying equivalents by Translation of Collocates

Page 20: Comparable Corpora for Terminology

Cooking Recipes

finely + chopped / sliced / diced / grated / shredded

Page 21: Comparable Corpora for Terminology

86 Soy sauce - 1 tbsp Onion - 1 medium, finely chopped Celery - 3 sticks, finely chopp
87 apples - 450g (1 lb), peeled, cored and finely chopped Onions - 225g (8 oz), finely ch
88 Milk - 600 ml (1 pint) Onion - 2 tbsp, finely chopped Celery - 2 tbsp, finely chopped

chop = picar
29 occurrences with “fino” or derived form:
  • “picad* fin*” (18 occurrences)
  • “finamente picad*” (11 occurrences)

Page 22: Comparable Corpora for Terminology

Results for picad*

  • most frequent adverb with “picad*” is “bem” (79 occurrences): “bem picada, bem picado” etc.
  • “picadinh*” (96 occurrences, out of which 10 are “bem picadinha”)
  • best equivalences for finely chopped: “bem picad*” or “picadinh*”
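Wildcard searches such as “picad*” or “picad* fin*” can be reproduced with regular expressions. This is a rough sketch: the helper names are hypothetical, and the sample text is made up from recipe-style phrases, so the counts it produces are not the corpus figures quoted on the slide.

```python
import re
from collections import Counter

def wildcard_to_regex(pattern):
    """Turn a query like 'picad* fin*' into a regex ('*' = any word ending)."""
    parts = [re.escape(p).replace(r"\*", r"\w*") for p in pattern.split()]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)

def count_matches(pattern, text):
    """Number of occurrences of the wildcard pattern in the text."""
    return len(wildcard_to_regex(pattern).findall(text))

def left_collocates(word_pattern, text):
    """Count the word immediately preceding each match of word_pattern."""
    word = re.escape(word_pattern).replace(r"\*", r"\w*")
    rx = re.compile(r"\b(\w+)\s+" + word + r"\b", re.IGNORECASE)
    return Counter(m.lower() for m in rx.findall(text))

# Hypothetical sample text
sample = ("2 cebolas médias bem picadas. ½ dente de alho bem picado. "
          "100 g de bacon picadinho. Polvilhar salsa bem picadinha. "
          "1 cebola picada fina.")

print(count_matches("bem picad*", sample))
print(count_matches("picadinh*", sample))
print(left_collocates("picad*", sample).most_common(1))
```

Counting the word to the left of each hit is one simple way to find the most frequent adverb, as was done for “bem”.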

Page 23: Comparable Corpora for Terminology

Results for picad*

  • 2 cebolas médias bem picadas
  • ½ dente de alho bem picado
  • junte os tomates pelados bem picados.
  • Calabresa picadinha
  • 100 g de bacon picadinho
  • 2 dentes de alho picadinhos
  • Polvilhar salsa bem picadinha
  • ½ cebola bem picadinha

Page 24: Comparable Corpora for Terminology

finely sliced
Slice = cortar em fatias (?* fatiar)

  • Calda 4 laranjas descascadas cortadas em fatias finas
  • 200 g de cebola cortada em fatias finas
  • 1 pepino sem sementes cortado em fatias finas
  • 6 rabanetes, cortados em fatias finas
  • Juntar as batatas cortadas em fatias finas.
  • Decore a quiche com um alho-poró cru cortado em rodelas finas.
  • 1 cebola média cortada em rodelas finas
  • 400 g de lingüiça portuguesa cortada em rodelas finas

Page 25: Comparable Corpora for Terminology

finely diced

  • 50 g de bacon em cubinhos
  • 500 g de peito de frango cozido e cortado em cubinhos
  • 1 tomate sem pele e sementes cortado em cubinhos
  • 1 abacaxi médio cortado em cubinhos
  • 150 g de presunto cozido cortado em cubinhos
  • 1/2 xícara (chá) de queijo prato cortado em cubinhos
  • 1 cebola grande, cortada em cubinhos
  • 100 g de bacon em cubos pequenos
  • 300g de abóbora moranga cortada em cubos pequenos
  • 200 g de bacon cortado em cubos pequenos

Page 26: Comparable Corpora for Terminology

finely grated

2 occurrences for cheese
  • 2 col. (sopa) de queijo parmesão ralado fino
  • 80g de queijo gruyère ralado fino
  • 2 colheres (sopa) de parmesão ralado
  • 50 g de queijo parmesão, ralado

32 occurrences for cheese
  • 1 xícara de queijo prato ralado grosso
  • 2 xícaras de queijo mussarela ralado grosso (200 g)
  • Para polvilhar 50 g de queijo parmesão ralado grosso

Vegetables and chocolate
  • 1 cebola ralada fino
  • 4 xícaras de repolho ralado fino
  • Cobertura de chocolate ralado bem fino

Page 27: Comparable Corpora for Terminology

Identifying equivalents by Translation of Collocate

Page 28: Comparable Corpora for Terminology

Hypertension
heart → coração & cardi-

Heart: 768 occurrences
heart failure & heart disease

dy fou 154 th ISH reduced the incidence of stroke, heart failure, and
si 255 AG increases with advanced age, stroke, heart failure,
as had 706 ol subjects,12 and patients with severe heart failure.18
ysfunctio 281 l infarction or in patients with severe heart failure. These
inuria 365 ); acute pulmonary edema, congestive heart failure, left
e was si 312 ocardial infarction, stroke, congestive heart failure,
hows 495 h hypertension, obesity, and congestive heart failure, the

mHpresença de insuficiência cardíaca congestiva, hemorragia cere-
is quando existe insuficiência cardíaca congestiva associada. Podem causar
tensão severa ou insuficiência cardíaca congestiva associada. Limitações do

Page 29: Comparable Corpora for Terminology

Identifying functional equivalents by Concordances

marcar um gol

Page 30: Comparable Corpora for Terminology
Page 31: Comparable Corpora for Terminology

kidney vs. renal

rim vs. renal

Page 32: Comparable Corpora for Terminology
Page 33: Comparable Corpora for Terminology

3-word clusters

Page 34: Comparable Corpora for Terminology
Page 35: Comparable Corpora for Terminology
Page 36: Comparable Corpora for Terminology
Page 37: Comparable Corpora for Terminology
Page 38: Comparable Corpora for Terminology
Page 39: Comparable Corpora for Terminology

Identifying functional equivalents by Context

gol contra

Page 40: Comparable Corpora for Terminology
Page 41: Comparable Corpora for Terminology
Page 42: Comparable Corpora for Terminology
Page 43: Comparable Corpora for Terminology
Page 44: Comparable Corpora for Terminology
Page 45: Comparable Corpora for Terminology

Discovering new terms

overtime

injury time

Page 46: Comparable Corpora for Terminology
Page 47: Comparable Corpora for Terminology
Page 48: Comparable Corpora for Terminology

overtime???

49 eam that lost the penalty kicks scores a goal in the overtime period,
1 as the game extended from regulation to a pair of 15-minute overtime periods.
2 Too many games decided on the free kicks that follow the overtime period.
3 In the 110th minute, early in the second overtime session of a 1-1 tie at a sold-out Olympic Stadium,
4 score, 1-1, in the 19th minute with a header off a corner kick. In overtime, the two became involved again, this time with Zidane
5 One minute into the injury time added on to the 30-minute overtime,
6 Their patience almost paid off at the start of the 30-minute overtime as reserve

Page 49: Comparable Corpora for Terminology

acréscimos - prorrogação

5 Quando todos esperavam a prorrogação, os italianos definiram a vitória nos acréscimos.
6 nos últimos minutos, mas acabou levando mais um gol nos acréscimos
7 mas conseguiu um gol de pênalti marcado por Totti, nos acréscimos da partida.
8 Van Bronckhorst acabou expulso nos acréscimos por entrada dura em Tiago.

1 Ahn marcou o gol que eliminou, aos 12min do 2º tempo da prorrogação, os italianos nas
2 e sacramentou a vitória argentina sobre o México por 2 a 1, já na prorrogação,
3 mas nada disso foi suficiente para evitar que a partida fosse para a prorrogação

Page 50: Comparable Corpora for Terminology
Page 51: Comparable Corpora for Terminology

injury time

1 John Aloisi added one in injury time.
2 Zinedine Zidane's substitution in injury time could mark his World Cup farewell -- he will m
3 Rahdi Jaidi's header in injury time Wednesday gave Tunisia a 2-2 tie with Saudi
4 nded dramatically on reserve Oliver Neuville's goal in injury time.
5 Nadj was ejected in first-half injury time, and Domoraud got his second yellow card in s
6 and Domoraud got his second yellow card in second-half injury time.

Aloisi deu números finais ao placar já nos acréscimos

Page 52: Comparable Corpora for Terminology

acréscimos vs. prorrogação

1 os minutos, mas acabou levando mais um gol nos acréscimos. Zidane recebeu livre na esquerda, deu um corte em Puyol, e ba
2 do parecia que mais uma partida seria decidida na prorrogação, Zidane cobrou uma falta da direita, a zaga espanhola desviou e

E quando parecia que mais uma partida seria decidida na prorrogação, Zidane cobrou uma falta da direita, a zaga espanhola desviou e Vieira, livre na segunda trave, cabeceou firme, colocando a França na frente. A Espanha partiu para uma pressão nos últimos minutos, mas acabou levando mais um gol nos acréscimos. Zidane recebeu livre na esquerda, deu um corte em Puyol, e bateu firme, vencendo o goleiro Casillas, seu companheiro de Real Madrid.

Page 53: Comparable Corpora for Terminology

How to identify equivalents?

1. Wordlist: most frequent words: contrato – agreement;
2. Collocates of search word: marcar um gol → score a goal; kidney/renal vs. rim/renal
3. Translation of collocates: finely + chopped/diced/sliced/grated/shredded
4. Cognate collocate: congestive heart failure → insuficiência cardíaca congestiva
5. Context: gol contra: Zaccardo, Aloisi; injury time vs. overtime

Page 54: Comparable Corpora for Terminology

Parallel corpora

Page 55: Comparable Corpora for Terminology

Studies with parallel corpora

Contrastive Studies
www.linguateca.pt > Catálogo de Publicações > Procura de Publicações > COMPARA

Page 56: Comparable Corpora for Terminology

Naturalness in language

Page 57: Comparable Corpora for Terminology

A contrastive methodology to avoid “translationese”

Page 58: Comparable Corpora for Terminology

Possible Studies

Originals: English - EO       Translations: Portuguese - PT
Originals: Portuguese – PO    Translations: English – ET

1. Contrastive Linguistics (EO vs. PO)
2. Contrasting translations (ET vs. PT)
3. Translation strategies and norms (EO vs. PT; PO vs. ET)
4. “Translationese” (EO vs. ET; PO vs. PT)

Page 59: Comparable Corpora for Terminology

Possible Studies

1. EO vs. PO – Contrastive Linguistics: natural forms in both languages > similarities and differences
2. ET vs. PT – contrasting translations: differences between translations into various languages
3. EO vs. PT; PO vs. ET: translators’ options: strategies and norms
4. EO vs. ET; PO vs. PT: “translationese” – peculiarities of translated language which do not occur in original texts, or do so with different frequency (over/underuse)

Page 60: Comparable Corpora for Terminology

Methodology
Starting always with the original

1. EO → PT: survey of translations into Portuguese of study item
2. PO → ET: survey of these equivalents and their translations into English
3. EO → PT: survey of these equivalents and their translations into Portuguese
4. PO → ET: and so on...

Page 61: Comparable Corpora for Terminology

Results

said EO (310) = 0.4%       disse PT (203) = 0.25%

disse PO (936) = 0.23%     said ET (772) = 0.18%
                           told ET (59) = 0.013%
                           said + told ET = 0.193%

Page 62: Comparable Corpora for Terminology

Conclusions

1. PT has a greater variety of elocution verbs but sticks to natural form in target language
2. ET has low variety of elocution verbs, but even so falls short of naturalness in target language

ET – 1106 said:
  • 772 disse – 69.8%
  • 334 outras (others) – 30.19%

1106/421,725 = 0.26%
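The percentages on this slide are plain relative frequencies and can be checked directly. The sketch below assumes that 421.725 on the slide is the token count of the ET subcorpus written with a thousands separator (421,725); the variable names are illustrative.

```python
def percent(part, whole):
    """Relative frequency as a percentage."""
    return 100 * part / whole

ET_TOKENS = 421_725      # assumption: size of the ET subcorpus in tokens
SAID_TOTAL = 1106        # all occurrences of "said" in ET
SAID_FROM_DISSE = 772    # "said" translating "disse"
SAID_FROM_OTHER = 334    # "said" translating other verbs ("outras")

print(round(percent(SAID_FROM_DISSE, SAID_TOTAL), 1))   # share of "said" from "disse"
print(round(percent(SAID_FROM_OTHER, SAID_TOTAL), 1))   # share from other verbs
print(round(percent(SAID_TOTAL, ET_TOKENS), 2))         # "said" per 100 tokens of ET
print(round(percent(SAID_FROM_DISSE, ET_TOKENS), 2))    # matches "said ET (772)" in Results
```

Normalising by corpus size is what makes the figures comparable across subcorpora of different lengths, such as EO and ET.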

Page 63: Comparable Corpora for Terminology

Comparable Corpora

Natural language in both corpora
  • Phraseologies (Conventionality)
  • Terminology
  • Discourse

Acquaintance with research area
  • Basic notions
  • “Clues” for further research

Page 64: Comparable Corpora for Terminology

Triple Corpus

L1 Corpus

L2 Translation Corpus

Parallel Corpus

L2 Corpus

Comparable Corpus

Monolingual Comparable Corpus

Page 65: Comparable Corpora for Terminology

UNIVERSIDADE DE SÃO PAULO
FACULDADE DE FILOSOFIA, LETRAS E CIÊNCIAS HUMANAS
Departamento de Letras Modernas

Lourdes Bernardes Gonçalves
Universidade Federal do Ceará (Depto Letras Estrangeiras)

Advisor: Dra. Stella Esther Ortweiler Tagnin

DUBLINERS SOB A LUPA DA LINGÜÍSTICA DE CORPUS: Uma contribuição para a análise e a avaliação da tradução literária
(Dubliners under the lens of Corpus Linguistics: a contribution to the analysis and evaluation of literary translation)

Page 66: Comparable Corpora for Terminology

Ch. III: The Analysis of the Literary Text

Research results:
Semantic field of Music: Keywords:
tenor (45.4); concert (43.8); artistes (34.0); concerts (24.9); baritone (22.7); clapping (22.7); song (20.5); opera (17.0); piano (15.7); music (14.6); sing (14.4); musical (14.4); artiste (14.2); accompanist (14.2); waltz (13.8); melody (11.8); singers (11.3).

Conclusion: Importance of music in defining characters, the tone of the narrative, and commentary on the action.

The word SHE: (684 occurrences as subject, 207 distinct verbs)
Concordances:
  • Volitional verbs (deliberate action): 99.4% do not place the woman in a position of submission or oppression;
  • Intellective verbs (mental processes): none points to subservience
  • Affective verbs (emotions): 99.03% do not point to oppression

Page 67: Comparable Corpora for Terminology

Examples of Research driven by Corpus Linguistics: Keywords

Comparison of dubjj with refcor (positive keywords):

N     WORD      FREQ. DUBJJ.LST   %      FREQ. REFCOR.LST   %      KEYNESS
1     MR        573               0.84   173                0.08   916.1
2     GABRIEL   141               0.21   0                         400.1
3     AUNT      125               0.18   31                 0.01   216.3
4     HIS       1,158             1.70   2,158              1.02   193.0
5     KERNAN    66                0.10   0                         187.2

Comparison of dubjj with refcor (negative keywords):

N     WORD      FREQ. DUBJJ.LST   %      FREQ. REFCOR.LST   %      KEYNESS
129   MOTHER    30                0.04   295                0.14   48.6
130   IT        581               0.86   2,808              1.32   100.9
131   HER       790               1.16   3,674              1.73   112.5
132   OH        3                        320                0.15   152.0
133   SHE       695               1.02   4,471              2.10   376.6
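Keyness scores like those in the tables above compare a word's frequency in the study corpus (dubjj) with its frequency in a reference corpus (refcor). The slides do not say which statistic produced the values, so the sketch below assumes one common choice, a two-term log-likelihood, with a sign added to separate positive from negative keywords. Corpus sizes and counts in the example are hypothetical, and the output is not expected to reproduce the figures in the tables.

```python
import math

def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Two-term log-likelihood comparing one word across two corpora."""
    total = freq_study + freq_ref
    combined = size_study + size_ref
    expected_study = size_study * total / combined
    expected_ref = size_ref * total / combined
    ll = 0.0
    for observed, expected in ((freq_study, expected_study),
                               (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll

def keyness(freq_study, size_study, freq_ref, size_ref):
    """Signed score: positive if overused in the study corpus, else negative."""
    ll = log_likelihood(freq_study, size_study, freq_ref, size_ref)
    overused = freq_study / size_study >= freq_ref / size_ref
    return ll if overused else -ll

# Hypothetical corpus sizes and counts
STUDY_SIZE, REF_SIZE = 1_000, 2_000
print(round(keyness(50, STUDY_SIZE, 10, REF_SIZE), 1))   # overused word
print(round(keyness(1, STUDY_SIZE, 50, REF_SIZE), 1))    # underused word
print(round(keyness(10, STUDY_SIZE, 20, REF_SIZE), 1))   # same relative frequency
```

Words absent from the reference corpus, like GABRIEL or KERNAN in the table, are handled by skipping the zero term.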

Page 68: Comparable Corpora for Terminology

Examples of Research driven by Corpus Linguistics: Concordances with Mrs. Kearney

N Concordance
1 about Mr Fitzpatrick," repeated Mrs Kearney. "I have my cont
2 asked could she do anything. Mrs Kearney looked searchingl
3 er in the language movement. Mrs Kearney was well content
4 other corner of the room were Mrs Kearney and her husband,
5 t way did you treat me?" asked Mrs Kearney. Her face was i
6 a discreet part of the corridor. Mrs Kearney asked him when
7 Everything went on smoothly. Mrs Kearney bought some lovel
8 ing. Miss Devlin had become Mrs Kearney out of spite. She
9 excited. He spoke volubly, but Mrs Kearney said curtly at int
10 r the first year of married life, Mrs Kearney perceived that su

Business and argumentation verbs (22): take note, speak, take into consideration, explain, see (=understand), determine, learn

Page 69: Comparable Corpora for Terminology

Examples of Research driven by Corpus Linguistics: Translation Possibilities

Concordances of man in “A Mother”:

N Concordance
1 e smiled and shook his hand. He was a little man, with a white, vacant face. She notice
2 o have treated her like that if she had been a man. But she would see that her daughter
3 ld see that it went in. He was a grey-haired man with a plausible voice and careful mann
4 ell, the second tenor, was a fair-haired little man who competed every year for prizes at t
5 occasion to say to some friend: `My good man is packing us off to Skerries for a few w
6 The bass, Mr Duggan, was a slender young man with a scattered black moustache. He
7 e room by instinct. He was a suave, elderly man who balanced his imposing body, whe
8 ried life, Mrs Kearney perceived that such a man would wear better than a romantic pers
9 ly and the baritone. They were the Freeman man and Mr O'Madden Burke. The Freeman
10 man and Mr O'Madden Burke. The Freeman man had come in to say that he could not
11 ile Mr Holohan was entertaining the Freeman man Mrs Kearney was speaking so animat

Page 70: Comparable Corpora for Terminology

Examples of Research driven by Corpus Linguistics: Text Aligner

JJ: When we knew him first he used to be rather interesting, talking of faints and worms; but I soon grew tired of him and his endless stories about the distillery.

HT: No princípio, quando o conhecemos, costumava ser interessante com suas conversas sobre vermes e desmaios, mas logo cansara-me dele e de suas intermináveis histórias a respeito da destilaria.

OS: Quando o conhecemos era um sujeito cativante, que falava de bagaço e de serpentinas; mas logo cansei-me dele e de suas histórias intermináveis a respeito do alambique.

Page 71: Comparable Corpora for Terminology

Final Considerations

Contribution of the Keyword Lists:
  • Mr. (keyness = 916.1) and she (keyness = -376.6)
  • Function of music in the text Dubliners

Contribution of the Concordances:
  • Concordances with female subjects
  • Concordances with Mrs. Kearney as subject (Mrs. Kearney or she)

Contribution of the Alignment:
  • Visualization of the original and translations
  • Sentence-by-sentence analysis of translations