32_Euralex_R.R.K. Hartmann - The Use of Parallel Text Corpora in the Generation of Translation Eq



R.R.K. Hartmann
University of Exeter

The Use of Parallel Text Corpora in the Generation of Translation Equivalents for Bilingual Lexicography

Abstract

The paper is intended to demonstrate the practical applicability of the theoretical notion of 'contrastive textology' (Hartmann 1980) to bilingual lexicography. By means of a systematic analysis of parallel texts from corresponding genres in particular pairs of languages it is possible to generate matching words and their collocations which can be codified as translation equivalents in bilingual dictionaries. Promising work has been done to develop computer-aided techniques for utilizing such parallel text corpora in the search for lexical equivalence. Examples are drawn from English and German.

1. Introduction

Enormous strides have been made in the last few years in applying the findings of text linguistics and corpus technology to the field of lexicography. The former (sometimes under the heading of 'combinatorics') covers such phenomena as 'collocation' within sentences and 'cohesion' between sentences as well as the 'pragmatic' embedding of discourse in context, with interesting ramifications into sociolinguistics. The latter (under the title of 'corpus linguistics') is concerned with techniques for compiling and exploiting textual databases in an effort to document the whole range of linguistic structure and extra-linguistic knowledge, which overlaps with the territory of artificial intelligence.

The aim of this paper is to relate these developments to a particular problem in bilingual lexicography, viz. translation equivalence. Starting with the theoretical idea of 'contrastive textology', I trace some of the possibilities and practical problems of computer-aided parallel text analysis for the benefit of compilers (and users) of bilingual dictionaries.

2. Contrastive textology

In a thought-provoking paper on the theme of data-gathering, Bujas (1975) asked what is needed in the efficient updating of a bilingual dictionary for a language pair like English and Serbo-Croat. To answer the question, he had to employ a large number of student helpers to manually excerpt texts from newspapers and magazines and check their appropriacy for a revised edition of the dictionary. Today much of this work can be done by relying on existing text corpora or by computer scanning and concordancing (cf. Flowerdew & Tong 1994).

292 Euralex 1994
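As a rough illustration of the kind of concordancing that can stand in for manual excerpting, the sketch below lists keyword-in-context (KWIC) lines for a search word. It is a minimal Python example, not a description of any tool cited here: the function name `kwic`, the `width` parameter and the sample sentence are assumptions made for illustration only.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per occurrence of `keyword`."""
    lines = []
    # Match the keyword as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right:<{width}}")
    return lines

# Invented sample text, used only to exercise the function.
sample = ("The party promises educational provision for all. "
          "Provision of housing matters.")
for line in kwic(sample, "provision"):
    print(line)
```

Sorting such lines by the word to the right or left of the keyword is the usual next step when looking for recurrent collocations.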

However, while the use of text corpora is fairly firmly established in monolingual general-purpose, pedagogical and terminological lexicography, much remains to be done in bilingual lexicography. The stumbling block here is the problem of translation equivalence, which requires an interlingual approach.

My own book Contrastive Textology (Hartmann 1980) was intended as a programmatic plea for a systematic combination of contrastive analysis and discourse analysis. Its double purpose was the improved description of the linguistic facts (at the level of the text) of pairs of languages and improved problem-solving in practical domains such as translation, foreign-language teaching and bilingual lexicography.

Examples of problems awaiting solutions are the following: What are the means available in different languages for anaphora and other forms of cross-reference in text? What are the signals that delimit successive discourse blocs? What are the factors that determine register and genre ranges in different languages? What shifts are required in the translation of texts from one language to another? (cf. Hartmann forthcoming)

Examples of different approaches to these problems include 'comparative stylistics', 'contrastive rhetoric', 'cross-cultural discourse grammar', 'comparative discourse analysis', and many others (cf. Péry-Woodley 1990). For the field of bilingual lexicography, the idea of collecting and comparing 'parallel texts' seems particularly promising - see below.

3. Translation equivalence

The traditional notion of equivalence was to relate words to their counterparts as corresponding formal units in parallel linguistic systems, a view that was strengthened by the apparent ease with which bilingual dictionaries can supply ready-made lexical equations for insertion into the appropriate portion of a text (cf. Zgusta 1984).

However, the semantic abstraction that is built into the lexical inventory of the dictionary has deprived each of these words of its natural context, and the translator must compensate for the lack of contextual information from his/her own bilingual discourse competence, particularly in that most intractable area of 'culture-specific' vocabulary. More recent research (cf. Hartmann 1985 and 1992a, Hatim & Mason 1990) has stressed the approximative nature of these equivalence creation processes.

From this vantage point, the contrastive textologist will want to go beyond the mere comparison of given parallel texts as translation products and search instead for the actual code-switching operations that allowed the competent translator to 'find' a suitable target-language equivalent in the first place. This is of direct relevance to bilingual lexicography: the dictionary maker needs not only to codify the results of past translation acts - however they may have been achieved - but also to have an awareness of the techniques that can be used to bring about such translation equivalence.

The way words work together / combinatorics 293

4. Dictionary equivalents

The coverage of lexical equivalents in the bilingual dictionary is a hit-and-miss, trial-and-error task capable of empirical observation and systematization (and thus improvement). I have consulted a number of monolingual and bilingual dictionaries (Hartmann 1992b) to check how they treat a range of 14 regionally marked lexical items in British English and Austrian German and their equivalents - see Fig. 1.

The conclusion I came to was that the coverage of the translation equivalents in bilingual dictionaries, while in general quite reasonable and no worse than that of even more specialized monolingual dictionaries, is based on an element of chance which we should attempt to reduce in future.

5. Parallel text corpora

One solution to the problem of systematizing the discovery of translation equivalents, as suggested by the proponents of the various approaches to contrastive textology, lies in the comparison of so-called parallel texts, i.e. bits of discourse from corresponding varieties or text types in the two languages in question. If we knew, or so the argument goes, what the semantic ranges and collocational restrictions of words were in the textual contexts of one language, then we could match them in parallel texts from the other language. This is exactly what John Laffling (1991) attempted. He built up a corpus of parallel texts of party political manifestoes in English and German, in other words, the political programmes of the British Labour Party and the German Socialists, the British Conservatives and the CDU-CSU in Germany, and the Greens in both countries. By means of an algorithm which computer-matched words and phrases in these parallel texts, he managed to extract from them the naturally occurring translation equivalents.
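The argument in this paragraph (know a word's collocational behaviour in one language, then look for a word with matching behaviour in parallel texts of the other) can be sketched in code. The following Python fragment is only an illustration under stated assumptions: it is not Laffling's algorithm, which is not specified here, and the function names, the seed lexicon, the stop list and the two toy sentences are all invented for the example.

```python
from collections import Counter

def collocates(tokens, node, span=2):
    """Count words occurring within `span` tokens of each occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts

def rank_candidates(src_tokens, tgt_tokens, src_word, seed, stop=frozenset(), span=2):
    """Rank target words by the number of seed-translated collocates
    they share with `src_word` (hypothetical scoring, for illustration)."""
    src_profile = collocates(src_tokens, src_word, span)
    # Carry the source collocates into the target language via the seed lexicon.
    translated = {seed[w] for w in src_profile if w in seed}
    scores = {}
    for cand in set(tgt_tokens) - set(stop):
        shared = translated & set(collocates(tgt_tokens, cand, span))
        if shared:
            scores[cand] = len(shared)
    # Highest score first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Invented toy data: two independently worded sentences and a tiny seed lexicon.
de = "die politische auseinandersetzung im parlament".split()
en = "the political debate in parliament".split()
seed = {"politische": "political", "parlament": "parliament"}

print(rank_candidates(de, en, "auseinandersetzung", seed, stop={"the", "in"}))
# prints [('debate', 2)]
```

On real corpora the raw counts would need weighting by an association measure and a much larger seed lexicon; the sketch only shows the shape of the matching step.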

In Fig. 2 I present a small portion of Laffling's results in relation to dictionary coverage. The information is arranged in four columns. In the first, on the left, there are four phrases (you might call them political clichés) from the German corpus.

The second column contains English equivalents of these phrases as found in the official and unofficial translations of the texts, which on the whole are fairly literal renderings. Column 3 is the most interesting: here we have 'real' textual equivalents not found in translations, but in separately formulated parallel texts, i.e. the party manifestoes of the corresponding political party in the United Kingdom.

[Figure 1: Treatment of 14 regionally marked lexical items in British English and Austrian German in monolingual and bilingual dictionaries]


[...] matches the naturally occurring phrases from the English parallel texts, although they come reasonably close.

German corpus                     | translation                    | parallel texts                   | dictionary
politische Auseinandersetzung     | political argument             | political debate                 | [political] debate
Bildungsangebot                   | educational opportunities      | educational provision            | [educational] offer
Not und Elend                     | want and poverty               | misery and hardship              | poverty (and hardship); misery
breite Schichten der Bevölkerung  | broad layers of the population | large sections of the population | broad sections of the population

Figure 2: Parallel text analysis.
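One simple way to quantify how far the literal translations diverge from the naturally occurring parallel-text phrases is a word-overlap score. The Python sketch below is an illustration only: the Jaccard measure and the name `word_overlap` are assumptions, not part of the analysis reported in the paper, and the phrase pairs are transcribed from the translation and parallel-text columns of Figure 2 as far as they are legible.

```python
def word_overlap(a, b):
    """Jaccard overlap between the word sets of two phrases (0.0 to 1.0)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)

# (German phrase, translation, parallel-text equivalent), read from Figure 2.
FIGURE_2 = [
    ("politische Auseinandersetzung", "political argument", "political debate"),
    ("Bildungsangebot", "educational opportunities", "educational provision"),
    ("Not und Elend", "want and poverty", "misery and hardship"),
    ("breite Schichten der Bevölkerung",
     "broad layers of the population", "large sections of the population"),
]

for german, translated, parallel in FIGURE_2:
    score = word_overlap(translated, parallel)
    print(f"{german}: {score:.2f}")
```

A score below 1.0 for every row reflects the point made about column 3: none of the literal renderings coincides exactly with the phrase found in the corresponding English manifestoes.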

[The full version of this paper will elaborate on the problems and possibilities of the methodology of computer-assisted generation of translation equivalents from other parallel text corpora, based on the results of research to be undertaken at Macquarie University in August 1994.]

6. Implications for bilingual lexicography

The existing literature on bilingual dictionary-making (cf. Bartholomew & Schoenhals 1983, Marello 1989, Svensén 1993) is strangely silent on these issues. However, recent impulses have come from machine translation (see Laffling 1991 as discussed in Section 5 above), artificial intelligence and computer technology. Kenneth Church and William Gale (1991), for example, have explored the use of parallel text concordances, such as those based on the French-English Canadian Hansard, if in remarkable ignorance of the pioneering theoretical work mentioned above. Church & Gale claim that translation equivalents can be extracted from such bilingual corpora by aligning the parallel texts at the sentence level.

Eugenio Picchi and his Pisa colleagues (1992) have proposed a 'workstation' for lexicographers intent on monitoring and processing lexical equivalents derived from English-Italian text corpora. Each of these sets of bilingual texts from different language varieties can be first 'synchronized', using morphological procedures and information from an electronic bilingual dictionary, and then searched for 'direct links' between the texts, which produces a choice of potential translation equivalents. I would venture to suggest that we are not far off the time when these techniques will not only become more widely available but could also help us design bilingual thesauruses (cf. Hartmann 1994) from text corpora taken from corresponding genres in selected pairs of languages and thus benefit dictionary compilers and dictionary users, especially translators.

References

(a) Cited dictionaries

Collins Dictionary of the English Language. Ed. L. Urdang. London & Glasgow: Collins, 2nd ed. 1986.
Collins-Klett [Pons] Grosswörterbuch Deutsch-Englisch Englisch-Deutsch. Comp. P. Terrell et al. Glasgow: Collins & Stuttgart: E. Klett, 1980/1983.
Concise Oxford Dictionary of Current English. Ed. R.E. Allen. Oxford UP, 8th ed. 1990.
Dictionary of Britain. An A to Z of British Life. Comp. A. Room. Oxford UP, 1986/1990.
Duden-Oxford Grosswörterbuch [Englisch-Deutsch]. Eds. W. Scholze-Stubenrecht & J. Sykes. Mannheim: Dudenverlag & Oxford UP, 1990.
Kleines Österreich-Lexikon. Wissenswertes über Land und Leute. Comp. Gassner & W. Simonitsch. München: C.H. Beck.
Österreichisches Wörterbuch. Wörterbuchstelle/Bundesministerium für Unterricht [et al.]. Vienna: Österreichischer Bundesverlag, 37th ed. 1990.
Wahrig Deutsches Wörterbuch. Comp. G. Wahrig [et al.]. Gütersloh: Bertelsmann & Mosaik, 4th ed. 1980.

(b) Other literature

Bartholomew, D.A. & Schoenhals, L.C. 1983. Bilingual Dictionaries for Indigenous Languages. Mexico: Summer Institute of Linguistics.
Bujas, Ž. 1975. "Testing the performance of a bilingual dictionary on topical current texts" Studia Romanica et Anglica Zagrabiensia 39: 193-204.
Church, K. & Gale, W. 1991. "Concordances for parallel text" in Using Corpora. Proceedings of the 7th Annual Conference of the Centre for the New Oxford English Dictionary and Text Research [Oxford] ed. L.M. Jones. Oxford UP, 40-62.
Flowerdew, L. & Tong, A.K. eds. 1994. Entering Text. Papers from the HKUST/GIFL Joint Seminar on Corpus Linguistics and Lexicology. Hong Kong: HKUST Language Centre.
Hartmann, R.R.K. 1980. Contrastive Textology. Comparative Discourse Analysis in Applied Linguistics (Studies in Descriptive Linguistics 5). Heidelberg: J. Groos.
Hartmann, R.R.K. 1985. "Contrastive text analysis and the search for equivalence in the bilingual dictionary" in Symposium on Lexicography II ed. K. Hyldgaard-Jensen & A. Zettersten (Lexicographica Series Maior 5). Tübingen: M. Niemeyer, 121-132.
Hartmann, R.R.K. 1992a. "300 years of English-German language contact and contrast: The translation of culture-specific information in the general bilingual dictionary" in Language and Civilization. A Concerted Profusion of Essays and Studies in Honour of Otto Hietsch ed. C. Blank. Frankfurt: P. Lang, 300-327.
Hartmann, R.R.K. 1992b. "Contrastive linguistics: (How) is it relevant to bilingual lexicography?" in New Departures in Contrastive Linguistics ed. C. Mair & M. Markus (Innsbrucker Beiträge zur Kulturwissenschaft, Anglistische Reihe 4 & 5). Innsbruck: Universität, Institut für Anglistik, I: 293-299.
Hartmann, R.R.K. 1994. "The onomasiological dictionary in English and German. A contrastive textological perspective" in Die Welt in einer Liste von Wörtern/The World in a List of Words ed. W. Hüllen (Lexicographica Series Maior). Tübingen: M. Niemeyer, 172-185.
Hartmann, R.R.K. forthcoming. "Contrastive textology, bilingual lexicography and translation" in Encyclopedic Dictionary of Chinese-English/English-Chinese Translation ed. Chan Sin-wai. Hong Kong: Chinese University Press.
Hatim, B. & Mason, I. 1990. Discourse and the Translator (Language in Social Life Series). London: Longman.
Laffling, J. 1991. Towards High-Precision Machine Translation, Based on Contrastive Textology (Distributed Language Translation 7). Berlin: Foris Publications.
Marello, C. 1989. Dizionari bilingui con schede sui dizionari italiani per francese, inglese, spagnolo, tedesco (Fenomeni Linguistici 6). Bologna: Zanichelli.
Péry-Woodley, M.-P. 1990. "Contrasting discourses: contrastive analysis and a discourse approach to writing" Language Teaching 23: 143-151.
Picchi, E. et al. 1992. "The Pisa Lexicographic Workstation: the bilingual component" in EURALEX '92 Proceedings ed. H. Tommola et al. (Studia Translatologica A.2). Tampere: Yliopisto, I: 277-285.
Svensén, B. 1993. Practical Lexicography. Principles and Methods of Dictionary-making. Oxford: Oxford U.P.
Zgusta, L. 1984. "Translational equivalence in the bilingual dictionary" in LEXeter '83 Proceedings ed. R.R.K. Hartmann (Lexicographica Series Maior 1). Tübingen: M. Niemeyer, 147-154.