
Language Translation - Letters


6 Computer

LANGUAGE TRANSLATION

In “Languages and the Computing Profession” (The Profession, Mar. 2004, pp. 104, 102-103), Neville Holmes describes a method of automated language translation using a standardized “completely unnatural” intermediate language and discusses various problems. This method may work well for translating the literature of various technical fields because they have well-defined vocabularies.

The problems Holmes discusses are more serious in fields of human discourse outside the technical areas. Because human languages do not match well with regard to vocabulary, phrases, puns, and so forth, any translation that a human creates involves making subjective choices in translating words and other elements of the source language. These choices depend on the particular translator’s biases.

Even if computers perform the translations, a degree of subjectivity will be present in the translation software since it is unlikely that there could be a one-to-one mapping of the words, phrases, and so on in all human languages to the intermediate language.

In addition, for general literature, the characteristic of literality is problematic. Idioms, clichés, hackneyed phrases, and the like cannot be excluded without preventing the richness of expression in source language documents from being conveyed in the destination language—and these are the areas where translator biases are the most evident.

Holmes’s discussion of work to be done shows that he has thought about these matters. However, he does not explicitly discuss subjectivity. I am interested in knowing if he expects that subjectivity can be eliminated from the process.

Martin Sachs
Westport, Conn.
[email protected]

Neville Holmes responds:

The implication that there can be no subjectivity in the actual machine translation is well made. The machine processes data; the information, and thus the subjectivity, can only be in the minds of the people using or making the software.

To avoid, or at least lessen, the building of bias into the software was why I emphasized the importance of having philosophers (I had ethicists particularly in mind) and semanticists central to the project. Indeed, it is another good reason for such a project to be under the aegis of the United Nations.

On the other hand, the bias that an author or reader inevitably imposes on text, even in technical fields, is wonderfully human, and the last thing I would want to do is eliminate it. That is why I suggested that departures from literality, perhaps the most obvious source of bias, might be encoded punctuationally in the intermediate language so that translation from the intermediary could—when we’ve worked out how—deal with it appropriately.

Furthermore, my suggestion of adding “parameters that allow selection [and detection] of styles, periods, regionalities, and other variations” to translation programs would, for instance, provide for a document in English with one spectrum of biases to be translated into the intermediate language and then back into English with a completely different spectrum of biases.

What I am suggesting merges with interpretation in the long term, but there will be some texts that cannot be interpreted, only mimicked. One example is the kind of “Wockerjabby” doggerel that went the rounds quite a few years ago:

Eye halve a spell ling check err.
Eat came whither peace see.
Eat plane lea marques form I revue.
Mist ache sigh mite knot sea.

I’ve run this pome threw eat,
Aim shore yawp least two no.
Its let err perfect inn it’s weigh.
My chequer tolled miso.

COMPILER ENHANCEMENTS

The techniques that Peter Maurer outlines in “Metamorphic Programming: Unconventional High Performance” (Mar. 2004, pp. 30-38) indeed have a successful history among software engineers emulating CPUs (or virtual machines) and creating fast state machines. The sources below provide additional explanations of the techniques as they are employed in various tasks:

• A. Ertl, “Threaded Code;” www.complang.tuwien.ac.at/forth/threaded-code.html.

• E. Gagnon and L. Hendren, “SableVM: A Research Framework for Efficient Execution of Java Bytecode,” Proc. Java Virtual Machine Research and Technology Symp., Usenix 2001; www.usenix.org/publications/library/proceedings/jvm01/gagnon/gagnon.pdf.

• E. Miranda, “Portable Fast Direct Threaded Code,” 29 Mar. 1991; compilers.iecc.com/comparch/article/91-03-121.

• B. Hoff, “High-Speed Finite State Machines,” Dr. Dobb’s J., Nov. 1997; www.grouse.com.au/ggrep/.

• GCC Manual, “Labels as Values;” gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html#Labels%20as%20Values.

As Maurer explains, there is performance to be gained by using procedural code. There may be two explanations for this. First, the label-as-value technique treats the compiler as a macro assembler, better matching how the underlying hardware works. Second, the performance ratios may be larger when using the GNU Compiler Collection.
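For readers unfamiliar with the label-as-value technique, the following is a minimal sketch of direct threaded dispatch for a toy stack machine. It is not taken from Maurer's article or from any of the sources listed above: the opcodes, the function name run_threaded, and the stack size are hypothetical choices made for illustration. It relies on the GNU C extensions `&&label` and `goto *ptr`, so it assumes GCC or a compatible compiler such as Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy stack machine (illustration only). */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define STACK_SIZE 16

/* Runs a program and returns the value on top of the stack at OP_HALT.
   Uses the GNU C "labels as values" extension (&&label, goto *ptr). */
static long run_threaded(const long *code)
{
    /* One label address per opcode, indexed by opcode number. */
    static void *const dispatch[] = {
        &&do_push, &&do_add, &&do_mul, &&do_halt
    };
    long stack[STACK_SIZE];
    size_t sp = 0;              /* number of values currently on the stack */
    const long *pc = code;      /* next instruction word to fetch */

    /* Each handler ends by jumping directly to the next handler,
       so there is no central loop or switch statement to return to. */
#define NEXT() goto *dispatch[*pc++]

    NEXT();

do_push:
    assert(sp < STACK_SIZE);
    stack[sp++] = *pc++;        /* operand follows the opcode */
    NEXT();

do_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    NEXT();

do_mul:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] *= stack[sp];
    NEXT();

do_halt:
    assert(sp >= 1);
    return stack[sp - 1];

#undef NEXT
}
```

Calling run_threaded on the program { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT } evaluates (2 + 3) * 4. Whether this style actually outperforms a plain switch statement depends on the compiler, its optimization settings, and the target processor, so any speedup should be measured rather than assumed.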
