
Chapter One: Introduction - HKUST Institutional Repositoryrepository.ust.hk/ir/bitstream/1783.1-1056/3/Jacqch14.pdf · Chapter One: Introduction ... the interactive approach to reading


Chapter One: Introduction

... [K]nowing the technical terms ... is not a sufficient condition for successful reading of specialised material. It was, in fact, the non-technical terms which created more of a problem. (Cohen et al. 1988: 162)

Prologue

Reading, for students of English for specific purposes (ESP), is probably the most important skill in terms of acquiring new knowledge. It does, however, often pose learning problems, especially with respect to vocabulary. The psycholinguistic model of reading widely favoured in linguistics and cognitive psychology in the 1960s and 1970s considered that the main constructs underlying reading are making predictions and deducing meaning from context (cf. Goodman 1967: 127). However, during the 1980s, the interactive approach to reading became dominant, in which it was proposed that successful comprehension is achieved by the interactive use of two reading strategies: the top-down approach (i.e. making use of the readers’ previous knowledge, expectations and experience in reading the text) and the bottom-up approach (i.e. understanding a text mainly by analysing the words and sentences in the text itself: cf. Sanford & Garrod 1981; van Dijk & Kintsch 1983; Carrell 1988). Research in ESP reading (e.g. Selinker & Trimble 1974; Cohen et al. 1988) provides empirical support for the interactive framework, finding morphographemic word-processing skills to be a major component of reading.

It has also, since the 1980s, been broadly agreed among researchers (cf. Kennedy & Bolitho 1984; Trimble 1985; Cohen et al. 1988) that for non-native ESP readers the most problematic element in comprehending scientific and technical (ST) texts is a set of vocabulary items that has been variously labelled sub-technical, non-technical and semi-technical. This set is commonly said to be shared across academic disciplines and to be context-dependent. Most notably, when such items appear in scientific or technical contexts they take on extended meanings.

Another way of defining ‘sub-technical’ is to say that it refers to those words that have one or more ‘general’ English meanings and which in technical contexts take on extended meanings. (Trimble 1985: 129)

Whatever the name given to the words in this group, if they appear to hinder students of ESP in comprehending texts in their discipline, it is worthwhile for language teachers and ESP practitioners to seek ways in which learners’ lexical repertoires can be raised to at least the threshold level of skilled readership in their chosen fields.

Context

The Hong Kong University of Science and Technology (UST) opened in October 1991 as the third university in Hong Kong and its first technological university, with special emphasis on research in science and technology and close collaboration with business and industry. The medium of instruction is English. Three of the University’s schools – Science, Engineering, and Business and Management – provide both undergraduate and postgraduate education, while the School of Humanities and Social Sciences offers postgraduate education, a minor programme and general education for all undergraduates.

The Language Centre, in which the practical work for this study was undertaken, has a pan-University role in the provision of language courses. Its English programmes seek to help students acquire the necessary language skills to gain the maximum benefit from their undergraduate curriculum and to ensure that they can cope with the level of English required for their studies. The Language Centre also provides Business Communication courses for the School of Business and Management, and Technical Communication courses for the Schools of Science and Engineering, all of which cater for the career needs of prospective graduates.

UST students generally attribute their difficulties in reading ST texts to not knowing enough words, and this does indeed appear to be one of their major problems in comprehending texts in their subject areas, especially during the first year. All students are exposed to computer science-related reading materials, especially, of course, those working in the Departments of Computer Science (CS), Computer Engineering (CE) and Information and Systems Management (ISMT), who are required to read a wide range of CS textbooks and reference works.
In addition to course requirements, no matter which subjects students study, there is a need for them to read CS-related texts, and there is inevitably therefore a considerable ‘vocabulary gap’ between secondary school and university. Students are expected, by the end of their first semester, to be familiar with at least one word-processing program, and to accustom themselves to using the Internet. Consulting manuals becomes one of the means through which they can directly learn to operate the necessary computer programs and procedures accurately and efficiently. When students have learned to communicate via electronic mail and to use the World Wide Web, they are required to know how to deal with the language used by professionals working in various areas of information technology (IT). In short, students at UST need to learn enough vocabulary to understand successively three ‘levels’ of increasingly complex and extensive CS-related texts.

Nature of the investigation

The English proficiency of most UST students at entry is below the threshold level for their subjects. Many of the problems that these students encounter in using English are related to comprehension, and are caused by their limited knowledge of vocabulary, including, crucially, a lack of awareness of polysemy. The present study was undertaken to determine whether a lack of the knowledge of the distinct additional meanings in CS texts of general words would adversely affect first-year university undergraduates’ comprehension of such texts, and hinder their effective use of new computer programs. Further, it incorporated an assessment of whether a subject-specific glossary with various relevant features would help second-language learners of English to read CS-related literature, to understand CS textbooks, and to use computer programs successfully in their own time and at their own pace. In this regard, John Swales (1985: 214) has noted:

Unfortunately, it is precisely that interest in process and interaction which is missing in vocabulary work; interactions between specialised vocabulary support materials (technical dictionaries, glossaries, lexical fields, etc.) and the language learner and user are under-researched, and the process of technical and sub-technical vocabulary acquisition imagined rather than investigated.

His remarks helped this study to focus on two aspects of what has been variously called sub-, non-, or semi-technical vocabulary: the investigations of how such items affect students’ understanding of CS-related texts, and of whether a corpus-based subject-specific glossary with various features, such as Chinese explanations, interactive demonstrations and frequency information, is a more efficient consulting tool for such students than a general learner’s dictionary (GLD), in terms of ease and time.

The computer literature I chose for this study consists of the texts already collected in the UST CS Corpus (mainly passages from CS textbooks) and the program manual Running Word 6 for Windows, which covers a word-processing program widely used in Hong Kong.

Aim and scope of the study

The aim of this study is threefold: first, to establish as clearly as possible the existence and nature of an intermediate lexical category in CS texts, between the general and the fully technical; second, to investigate whether the presence of such intermediate and often ‘technicalised’ vocabulary adversely affects students’ understanding of CS texts; and third, to assess the extent to which a corpus-based subject-specific glossary of such vocabulary is a more efficient consulting tool for students than a GLD.

The study comprises an empirical investigation of the problems which a first-year student at UST is likely to encounter in reading CS texts, conducted in three different forms together with two oral data collection methods: a decontextualised word-level glossing exercise (DWLGE); a multiple-choice questionnaire survey (MCQS) and hands-on computer tasks (HOCTs) on the understanding of semi-technical vocabulary, with or without using the thinking-aloud technique (TAT) or the retrospective interview (RI) to elicit an immediate physical and psychological response from subjects throughout the experiments; and a prototype electronic glossary of semi-technical vocabulary (EGSV), incorporating such features as Chinese explanations and frequency-based information relating to a specialised corpus.

The subject respondents were 283 first-year CS and CE students in the School of Engineering, and 162 first-year students studying ISMT in the School of Business & Management, at UST. Students from these Departments were chosen because it was felt that the findings of this study could have implications for the design and implementation of ESP materials for the use of English language enhancement courses not only in the University generally, but in these Departments particularly.
The 1,001,895-word UST CS Corpus (James et al. 1994) is assessed with respect to its contributions in helping ESP learners and practitioners. Lexicographers should be able to benefit from the information provided in a specialised corpus in dictionary production, and in this study, a process will be described through which the specialised corpus determines how CS texts are categorised by high-frequency semi-technical vocabulary that falls into distinctive lexical categories, along with its possible effects on first-year university students’ understanding of CS-related texts.

Learners should benefit from whatever modern technology can provide to help them understand English, and an electronic corpus-based subject-specific glossary operated on a pen-based computer platform is investigated for its possible use in the dictionary consultation process.

The usefulness of including semi-technical vocabulary items selected uniquely from the Microsoft Word 6.0 manual is assessed. The findings may shed light on future planning and the production of a glossary/dictionary with similar features and with semi-technical vocabulary items chosen from the wider language of IT professionals.

The TAT and the RI are examined to determine their effectiveness in extracting data from subjects, which may reveal how well their successive attempts indicate their understanding of such semi-technical vocabulary in the MCQS and in the HOCT format.

In particular, this study aims at investigating the concept of a semi-technical (and often ‘technicalised’) vocabulary, which covers such items as insert and tool (two terms on the command line of the Microsoft Word 6.0 program) and top and card (as used in naming items of computer hardware, as in laptop, palmtop, card-address backplane, card modules). In the process, it will also contribute to resolving three issues:

whether the semi-technical vocabulary of CS is semantically and stylistically distinct from the same vocabulary as it appears in general texts and from fully technical vocabulary, and if so, in what ways it is distinct;

whether such semi-technical vocabulary creates comprehension problems for first-year university students in reading CS-related texts;

whether a prototype glossary of semi-technical vocabulary with various features helps students understand CS texts, in terms of both ease of use and the time required for consultation.

Sub-technical, non-technical or semi-technical vocabulary


Many ESP teachers have found that vocabulary can be one of the major problems that affect students’ understanding of scientific and technical (ST) texts. According to Kennedy & Bolitho (1984), Trimble (1985) and Nation (1990), the difficulty lies not with technical vocabulary as such but, as Cohen et al. (1988: 153) put it:

… even students with mastery over the technical terms become so frustrated in reading technical English that they seek native-language summaries of the English texts, or native-language books covering roughly the same material, or do not read the material at all, but concentrate rather on taking verbatim lecture notes.

As researchers have begun to investigate the reading problems of non-natives, it has become clear that the difficulties of ESP learners extend beyond technical vocabulary as such. Many of them find ST texts difficult to comprehend because they have problems even in the mastery of general English words that appear in those texts. The situation is further complicated by a category of vocabulary items which are neither technical, nor, paradoxically, strictly non-technical.

In the relevant literature there have been, in effect, three levels of vocabulary under discussion with regard to the needs of students dealing with technical English: (i) the general and basic vocabulary of English, usually taken to be best represented by Michael West’s classic General service list of English words (GSL, 1953); (ii) the entirely technical and specific vocabulary of a particular subject, such as CS, engineering, biology, chemistry or information technology; and (iii) a hazier range between these two, variously described as sub-technical or semi-technical and even non-technical. Commentators have broadly concluded that the difficulties for students lie less in either the first entirely non-technical level or the fully technical level, and more in the intermediate area. This level of words, according to Kennedy & Bolitho (1984: 58), has created problems for students in comprehending ST materials:

One reason why sub-technical vocabulary can prove a problem to the learner is that words commonly met in ‘general’ English take on a specialized meaning within a scientific or technical context. The learner may know the ‘general’ meaning already and may be confused when he meets it in a context with a different meaning. Examples of such words are cycle (cf. its use in the carbon cycle or a cycle of electricity), conductor (in electricity), and resistance (in an electrical circuit).

Increasingly researchers have favoured the view that such an area of vocabulary creates significant barriers to students’ understanding of ST texts, but the discussion has been complicated by the use of several different terms for what appears to be the same intermediate-level area of difficulty, for which commentators such as Cowan (1974), Robinson (1980), Trimble (1985) and Tong (1993a, 1993b) use the term sub-technical vocabulary, while others use non-technical with or without the hyphen (cf. Barber 1962; Nation 1990; Tao 1994), and still others use semi-technical (cf. Johns & Dudley-Evans 1980; Farrell 1990; McArthur 1996b). Are they all in fact talking about the same category of words? Are they referring to the same basic concept in defining sub-, non- and semi-technical vocabulary? If they are indeed doing so, are the three different terms all necessary and useful? How are these terms defined? And are the definitions adequate?

In his pioneering work on technical, as opposed to general, vocabulary, C. L. Barber (1962) discussed the development of a frequency-based list of technical terms that would complement and extend Michael West’s GSL (1953), which comprised the 2,000 most frequently used words of everyday English. Barber’s research consisted mainly of an analysis of three texts from North American books (taken to be particularly representative of the subject): a 7,500-word excerpt from a university-level textbook on the engineering applications of electronics (Fink 1938); a 5,300-word text from an article concerned with basic research in biochemistry (Michaelis 1947); and a 9,600-word chapter on instrumental optics from an elementary university textbook on astronomy (Russell et al. 1945). Focusing on two main areas – sentence structure and verb forms, and vocabulary – Barber took the view that the teaching of specialised technical terms does not fall within the responsibility or the competence of the English teacher as such, and considered that the notion of a general vocabulary of science is important because words that are broadly useful to ST students occur frequently in all kinds of ST literature, and teachers can teach such words. He noted (op. cit.: 37):

We cannot teach our scientific and technical students the whole of the scientific vocabulary: this is beyond the capacity of any individual. Nor do we normally want to teach them the specialized technical terms of their own subject … What the English teacher can usually hope to do is to teach a vocabulary which is generally useful to students of science and technology – words that occur frequently in scientific and technical literature of different types. Some of these words will be technical ones, but many will not.

Barber attempted to compile a list of such words, which he referred to as non-technical vocabulary, basing his work on the three texts, to bridge the gap between the basic GSL and lists of strictly technical vocabulary items. This list of non-technical words is therefore meant to be an extension of the GSL, providing a vocabulary for explaining scientific ideas to the layman, which is not necessarily the same as the technical vocabulary needed by scientists. Barber further pointed out that one special feature of the general vocabulary of science is its high frequency of occurrence in a 23,400 ‘word-pool’ extracted from his three texts. In terms of ST registers, however, he did not seek to be either comprehensive or detailed, but sought instead a general ST vocabulary of non-technical words. In doing so, he provided the basis for future more specific lists, and opened the way to a discussion of words that have both general and technical aspects. For this reason his work is cited by more specifically focused commentators such as Sager et al. (1980) and Farrell (1990).

Ronayne Cowan, working in a research programme organised by the University of Illinois and Teheran University, appears to have been the first to use the term sub-technical in this context. The primary focus of his research was the creation of reading materials to train Iranian university students to read ST English prose at an advanced level. He believed that effective EFL reading materials could be constructed by accurately analysing the lexicon and syntax of ST prose. Although Cowan does not seem to have known of Barber’s work, his views are similar. He notes, for example, in his paper ‘Lexical and syntactic research for the design of EFL reading materials’ (1974), that highly technical words, such as sickle-cell anemia, myocardium or carbohydrate, are usually taken care of by subject teachers at university; it is sub-technical vocabulary, such words as function, isolate or basis, that EFL practitioners need to focus on.
Thus, he took a wider view than Barber, describing sub-technical vocabulary as context-independent words which occur with high frequency across disciplines (op. cit.: 391) and are typically included in Thorndike & Lorge’s The teacher’s wordbook of 30,000 words (1944), West’s GSL (1953) and Kučera & Francis’ corpus, later known as the Brown Corpus (1967), an orthographically transcribed corpus of 1,014,294 words, comprising five hundred c. 2,000-word samples of American English writing published in 1961, in fifteen categories, compiled in 1963–64 at Brown University. The sub-technical vocabulary to which Cowan refers consists basically of Latinate words that form part of the vocabulary of general educated usage; that is, they are part of standard English at large. However, he also includes words that he classifies as semi-technical and non-technical (this latter term also used by Barber) and interprets these as high-frequency general English words which appear in specialised texts, such as hospital, medicine, patient, disease or clinic in medical texts. In Cowan’s approach, semi-technical and non-technical vocabulary are different from his coined term, sub-technical vocabulary, within which some very general words can on occasion be ‘technicalised’ [my term] for the purposes of certain specific academic disciplines, words such as function, inference, relation or simulate.

Phillips et al. (1974) reported on a study of four agriculture textbooks, in which they identified two categories of vocabulary: “verbs for the organisation of knowledge”, such as examine, ascertain, determine; and “semi-technical” vocabulary. Their statistics showed that 15% of verbs in these textbooks were specifically associated with agriculture, whereas 60–70% of the overall vocabulary was semi-technical, a term already used by Cowan to mean high-frequency general English words. Phillips et al., however, used this to refer to an area of vocabulary that Cowan had called sub-technical. Such words, whatever they are called, are commonly found in ST texts and may include general English words that have acquired specialised senses (i.e. have been ‘technicalised’).

Like Barber, Stig Johansson (1975) sought to be scientific in defining a range of words which he called the ‘general vocabulary of science’: in effect, the lexical zone between general vocabulary and highly technical vocabulary. He carried out an empirical study of the Brown Corpus, in which he found that c. 160,000 words belonged in Category J: ‘learned and scientific writings’.1

1 There are 80 texts in Category J of the Brown Corpus, in the following disciplines:
  J01–12 Natural sciences
  J13–17 Medicine
  J18–21 Mathematics
  J22–35 Social, behavioural sciences
    a J22–25 Social, behavioural sciences
    b J26–30 Sociology
    c J31 Demography
    d J32–35 Linguistics
  J36–50 Political science, law, education
    a J36–39 Education
    b J40–47 Politics and economics
    c J48–50 Law
  J51–68 Humanities
    a J51–54 Philosophy
    b J55–59 History
    c J60–63 Literary criticism
    d J64–67 Art education
    e J68 Music
  J69–80 Technology and engineering

From this study, Johansson also established a list of items characteristic of academic learned and scientific English (SE), based on a ‘differential ratio’ which relates the frequency of occurrence of items in SE to the frequency with which they occurred in the Brown Corpus (Johansson op. cit.: 6), according to the formula:

\[
\text{differential ratio} = \frac{\text{the number of occurrences of an item in Category J} \times 100}{\text{the number of occurrences of the item in the corpus}}
\]

To illustrate this, let us consider the word sample. It occurs 48 times in Category J out of a total of 57 occurrences in the Brown Corpus as a whole. Thus, the differential ratio is calculated as follows:

\[
\frac{48 \times 100}{57} = 84.21052
\]
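The calculation can also be expressed as a short Python sketch. The function name and parameter names below are illustrative assumptions and are not taken from Johansson (1975); the only figures used are the two counts for sample quoted above.

```python
def differential_ratio(category_count: int, corpus_count: int) -> float:
    """Share (in per cent) of an item's corpus occurrences found in one category.

    Hypothetical helper: category_count is the number of occurrences in
    Category J, corpus_count the number of occurrences in the whole corpus.
    """
    if corpus_count <= 0:
        raise ValueError("corpus_count must be positive")
    if not 0 <= category_count <= corpus_count:
        raise ValueError("category_count must lie between 0 and corpus_count")
    return category_count * 100 / corpus_count


# 'sample': 48 occurrences in Category J out of 57 in the Brown Corpus.
ratio = differential_ratio(48, 57)
print(f"{ratio:.2f}")  # prints 84.21
```

A word found only in Category J would score 100 on this measure, and a word never found there would score 0.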

Johansson commented on words with various differential ratios, establishing their categorial range:

Table 1: Categorial range of vocabulary in Category J in the Brown Corpus, from Johansson (1975).

Category                        Differential ratio   Examples
‘General’ lexicon               0–11                 he, I, what, who
General vocabulary of science   23–66                information, process, analysis, surface
Words with a technical sense    89–100               bronchial, detergent, polynomial, hypothalamic, optimal

According to Johansson, the differential ratio can be an objective means of classifying words in ST texts. Many of the words categorised as ‘general vocabulary of science’ fall in the middle-frequency range, between 23 and 66, and seem to be the largest group in the composition of ST texts. Cf. Johansson (op. cit.: 9):

The division based on differential ratio seems intuitively satisfying. Many of the words which are characteristic of SE according to this criterion denote matters which we associate with scientific exposition (discussion, argument, result, conclusion, etc) or procedure (analysis, experiment, measurement, observations, test, etc)
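The ranges in Table 1 could be applied mechanically along the following lines. This is only a sketch: it assumes the three ranges exactly as tabulated, and because the table assigns no label to ratios lying between those ranges, the function returns None for them. The names BANDS and categorial_range are hypothetical.

```python
from typing import Optional

# Ranges as given in Table 1; ratios between the ranges are left unlabelled.
BANDS = [
    (0, 11, "'General' lexicon"),
    (23, 66, "General vocabulary of science"),
    (89, 100, "Words with a technical sense"),
]


def categorial_range(ratio: float) -> Optional[str]:
    """Return the Table 1 label for a differential ratio, or None if no range applies."""
    for low, high, label in BANDS:
        if low <= ratio <= high:
            return label
    return None


print(categorial_range(50))             # General vocabulary of science
print(categorial_range(48 * 100 / 57))  # None: about 84.21 lies between 66 and 89
```

On this reading, the ratio worked out earlier for sample falls between the second and third ranges, so the table alone does not assign it to a category.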


Like Cowan’s, Johansson’s list of ‘general vocabulary of science’ consists of words typical of educated usage. Many of them belong to ‘general technical’ usage and some of them are technicalised in subject-specific texts. Although Johansson’s study was not undertaken to identify sub-technical vocabulary, as were those of Cowan and Phillips et al., his differential ratio can be regarded as an objective means of demarcating it. Unlike Barber or Cowan, he did not further define, or comment on, the characteristics of a ‘general vocabulary of science’.

Anne Martin (1976) defined and compared the terms ‘technical’, ‘scientific’, and ‘sub-technical’ vocabulary, making it clear that she preferred to use the term academic vocabulary rather than Cowan’s (1974) term sub-technical vocabulary, and noting (ibid.: 92):

Technical and academic vocabulary are not synonymous terms … Ewer and Latorre wrote A Course in Basic Scientific English (1969). They use the term science to embrace some of the social sciences as well as the physical sciences and thus come close at times to what I call academic vocabulary. Cowan (1974: 391) uses the term “sub-technical vocabulary” for high-frequency context independent words occurring across disciplines. Academic and sub-technical vocabulary appear to be equivalent terms. I prefer the term academic vocabulary.

In addition to applying Cowan’s definition of sub-technical, for which she used the term academic, Martin developed four largely intuitive pedagogical criteria for selecting the academic vocabulary items included in her study. Such words: are unfamiliar to, or incorrectly used by, many students; help students not only to recognise familiar items but also to extend their knowledge to include unfamiliar items; are useful in all four areas of language use – listening comprehension, speech, reading comprehension, and writing; and reinforce, and are reinforced by, a wide range of essential academic skills including outlining, paraphrasing, taking examinations, note-taking, writing papers, and giving seminars.

Marianne Inman, in her 1978 study of 114,460 words of ST prose taken from the disciplines of biology, mathematics, physics, chemistry, chemical engineering, geology, mining engineering, electrical engineering, civil engineering, mechanical engineering and metallurgical engineering, distinguished technical vocabulary from a class of vocabulary items that she called sub-technical, the term first adopted by Cowan. Inman identified sub-technical vocabulary as “context-independent words which occur with high frequency across disciplines” (op. cit.: 244). She also stated that the method she used to distinguish this vocabulary from technical lexis was essentially intuitive, but it appears nonetheless to be comparable to the statistical range discussed by Johansson. According to her data, the mean percentage across the total sample for sub-technical vocabulary was 70%, while the proportion of function words (such as prepositions, articles, pronouns and particles) was 9%, and that of technical words, 21%. Inman was explicit that if ESP teachers focused on teaching strictly technical vocabulary, they would probably need to exclude nearly 80% of the texts they taught (ibid.: 248):

Based on frequency and range of occurrence in authentic scientific and technical texts, it is obviously subtechnical vocabulary which should be focused on in teaching scientific and technical English. Technical vocabulary, on the other hand, is best left to presentation through the discipline itself.
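As a quick arithmetic check on the proportions reported above, the “nearly 80%” figure can be recovered by removing the technical share from the whole. The sketch below uses only the three percentages cited from Inman; the variable names are illustrative and are not drawn from her study.

```python
# Mean proportions of running words reported by Inman (1978), as cited above.
# The dictionary keys are illustrative labels, not Inman's own terminology.
proportions = {"sub-technical": 70, "function": 9, "technical": 21}

# Her scheme has only three categories, so the shares should account for all words.
total = sum(proportions.values())

# Share of text lying outside strictly technical vocabulary.
non_technical = total - proportions["technical"]

print(f"total = {total}%")
print(f"share outside technical vocabulary = {non_technical}%")
```

On these figures, sub-technical and function words together make up 79% of the sample, which is the portion a strictly technical syllabus would leave untaught.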

Sub-technical vocabulary was therefore, for her, the area on which ESP teachers needed to focus. Interestingly, she used only three word categories (technical words, sub-technical words and function words) to analyse the scientific prose she chose for her study. It seems that for her any word that does not belong to the categories of technical and function words will automatically fall into the sub-technical category. This category of vocabulary will therefore of necessity include words of general educated usage that are sometimes technicalised in specific academic disciplines. This is essentially the same standpoint as Cowan’s.

Like Cowan (1974) and Inman (1978), Pauline Robinson (1980) used the term sub-technical. Although she did not intentionally set out to define ‘sub-technical’, she indicated that sub-technical words are items of academic vocabulary which occur across disciplines, and further that this group of words is often incorrectly used by students because they take it for granted that they already know them. She also agreed with Barber, Cowan and Inman that sub-technical vocabulary is the area that learners of ESL need to pay particular attention to (op. cit.: 71):

Coursebooks …, especially for in-service students, perhaps do not need to concentrate on the very specialised vocabulary items as students will get these from other sources. Rather it if [sic] the sub-technical level which is often difficult.

Quoting Phillips et al. (1974), Robinson claimed that statistical evidence supported her contention that ST texts are composed of at least two kinds of vocabulary, one technical and the other sub-technical, but did not elaborate on possible frequency ranges.

In their book English special languages (1980), the three British scholars Juan Sager, David Dungworth and Peter McDonald did not use the terms sub-technical or semi-technical. Instead, they stated that the lexicon of such special subjects as chemistry, engineering, biology and mathematics consists of the following three major groups of words (op. cit.: 242):

General language words used in all disciplines without distinction, e.g. note, observe, demonstrate and prove, and general language words appropriate to a particular discipline, e.g. stir, shake, boil, and freeze in chemistry.

General language words used specifically with some restriction or modification of meaning in a particular discipline, e.g. segregate, precipitate, and suspend in chemistry, and current in electrical engineering.

The terms specific to a discipline and therefore normally used only by specialists. These are either words created for a particular subject or general language words redefined for the purpose and thus ‘terminologised’, e.g. age as in to age a dye, and the phrase conversational device.

The first group would appear to include West’s GSL and words in general educated use; the second, words in ‘general technical’ use; and the third, highly technical words as used in specialised subject areas such as chemistry and CS. In addition, Sager et al. conceive of many words in the third group as having been ‘terminologised’ (or ‘technicalised’) from the first or second group. These three groups of words therefore necessarily flow into each other, forming a cline, and the first and second groups of words would appear to belong to the area of vocabulary that Cowan, Inman and Robinson called sub-technical. For Sager et al., one feature of this group of vocabulary items is the high frequency of occurrence of its members in ST texts, and they cite Barber (1962) as having attempted to define such words as not contained in GSL. Thus they identify a class of frequently occurring words of general use which have modified or restricted meanings when used in ST literature. The specialised senses of some of these words can often be found in such large recent general dictionaries as the New Oxford dictionary of English (1998) and the Encarta world English dictionary (1999). Sager et al., however, made no comment on Cowan’s view that sub-technical vocabulary items are context-independent across disciplines, and some of them (especially those which have been technicalised) clearly are not.

Tim Johns and Tony Dudley-Evans (1980) devoted a whole section to semi-technical vocabulary. They did not give any definition of semi-technical vocabulary, nor did they mention frequency as a criterion to qualify such a group of words. Instead, they quoted examples of semi-technical vocabulary which fall into Cowan’s (1974) category of ‘sub-technical’ vocabulary and Martin’s ‘academic vocabulary’. They commented on the importance of semi-technical vocabulary in ESP and noted that such terms are technicalised (op. cit.: 147):

Here the emphasis is on terms drawn from the ‘common core’ of English which take on a special significance in a number of different subjects, either in description or as a result of the nature of the teaching/learning interaction.

They further stressed that a thorough understanding of semi-technical vocabulary items which have extended meanings in ST texts requires close co-operation among language teachers, students and subject specialists.

At about this time, the term sub-technical appears to have become generally accepted in this field. Michael Wallace (1982), without mentioning any previous research in the field, straightforwardly uses it to describe a group of general English words that are typical of academic discourse. He further describes these vocabulary items as words attached to concepts that learners might already be familiar with. Wallace (ibid.: 17) argues that ESP practitioners need to take such words into account in their work because:

The serious problem for the EFL learner … is probably not technical language as such, but the language framework in which the technical expressions are placed … sub-technical words and expressions typical of academic discourse … which the subject specialist may assume that the student should already know.

Wallace argued, firstly, that sub-technical vocabulary items are words which learners would have already encountered in other literature before meeting them in ST texts; and secondly, that they would already have had a preconception of the meanings of the words as used in the written language generally, which might not, however, extend into ST texts, thus creating obstacles to their understanding. He did not, however, propose frequency as a means of defining this category of words, as others had done, nor did he discuss whether sub-technical vocabulary items are context-independent across various disciplines.

Chris Kennedy & Rod Bolitho (1984) adopted without discussion the term sub-technical to describe one of the five lexical features they observed in ST texts. They offer five lexical categories in specialised ST texts: technical abbreviations, symbols and formulæ, highly technical vocabulary, sub-technical vocabulary and other specialist vocabulary (in areas such as law, computing and sociology). They define sub-technical vocabulary as words which “are not specific to a subject speciality but which occur regularly in scientific and technical texts” (ibid.: 57). Examples they cite include reflection, intense, accumulate, tendency, isolate and dense. Such words, they suggest, provide a basis for the organisation of specialist knowledge and categories for teaching purposes. Kennedy & Bolitho further quoted Inman (1978: 248), who had estimated that the occurrence of sub-technical items in scientific texts was as high as 70%, to lend statistical support to what they saw as the important rôle played by such items in ST texts. They agreed with the view that words such as conductor (in electricity) and resistance (in an electrical circuit), which are “commonly met in ‘general’ English, take on a specialised meaning within a scientific or technical context” (op. cit.: 58). Learners may already know the ‘general’ meaning of such words, and may be confused when they meet them in an ST context. These words consequently appear in most specialised literature as sub-technical and ‘sub-specialist’ [their term] usage. In line with Sager et al., Kennedy & Bolitho define sub-technical vocabulary as any general words which take on a specialised meaning (in Sager et al.’s words: “restriction and modification of meaning”) in an ST context.

In a 1985 study, J. R. T. Cassels & A. H. Johnstone indicate that their work was influenced by that of Paul Gardner (1972), who had investigated vocabulary for the Australian Science Education Project. Gardner had shown around 500 of what Cassels & Johnstone called ‘normal’ English words to pupils at different stages in secondary schools, using the multiple-choice question research technique. He had used his results to compile lists of words accessible to pupils at different stages. Although Cassels & Johnstone agreed that there is a group of vocabulary items which are general English words used in a science context, they did not use such terms as sub-technical or semi-technical. Like Barber, Sager et al. and Kennedy & Bolitho, however, they discussed ‘general vocabulary items’, which they called “normal” English words used in a science context, and commented on their significance (op. cit.: 1):

The problem lay, not so much in the technical language of science, but in the vocabulary and usage of normal English in a science context. Pupils and teachers saw familiar words and phrases which both “understood”, but the assumption that both understandings were identical was just not tenable.

What Cassels & Johnstone had in mind was not the technical vocabulary which has specific meaning in certain disciplines, nor the ordinary vocabulary used in everyday life, but a group of words which have commonly known meanings in general texts, but different senses in science texts, as with volatile, partial, converge, spontaneous, all of which take on different meanings in chemistry contexts. Cassels & Johnstone did not discuss any criteria, such as frequency counts or context-independence, for establishing words of this type, but stressed that ESP practitioners should pay special attention to the problems such vocabulary items may generate. Particular difficulties occur “when both learner and teacher know the meaning of a word and each assumes that the other shares the same meaning … and the sad thing is that neither group (teachers or pupils) is aware that a problem exists” (ibid.: 15). Here, we are once again in the under-appreciated area of ‘technicalised’ usage.

Also in 1985, Louis Trimble noted that sub-technical vocabulary was at that time a newly-named field in ESP, in which further research should be encouraged. He acknowledged Cowan as the coiner of the term, and as defining sub-technical vocabulary as context-independent words which occur with high frequency across disciplines. Trimble, however, added another feature to such vocabulary, indicating that such words are “common” but occur with special meanings in specific ST fields. He therefore proposed that sub-technical vocabulary be defined as words that have one or more “general” meanings and in technical contexts take on extended specialised technical meanings; see Trimble (1985: 129):

[Cowan] defines sub-technical vocabulary as ‘context-independent words which occur with high frequency across disciplines’. This definition applies to those words that have the same meaning in several scientific or technical disciplines. To these words we have added ‘those “common” words that occur with special meanings in specific scientific and technical fields’.

Trimble concluded that English sub-technical vocabulary is basically of two kinds: words that have the same meaning in several scientific or technical disciplines, and words that are “common” but take on extended meanings in specific ST fields. This implies that the sub-technical vocabulary of ST texts is a broad category (comparable to general and technical vocabulary), embracing both context-independent words across disciplines, such as function, isolate, basis, stir, shake, boil, freeze (as suggested by Cowan, Inman, Robinson and Sager et al.), and context-dependent words such as the examples (with definitions) given below, from Trimble (1985: 131):

1. base
Botany: ‘The end of a plant member nearest the point of attachment to another member, usually of a different type.’
Chemistry: ‘A substance which tends to gain a proton.’ ‘A substance which reacts with acids to form salts.’
Electronics: ‘Part of a valve [US “tube”] where the pins that fit into holes in another electronic part are located … The middle region of a transistor.’

2. dog
Construction: ‘A steel securing piece for fastening together two timbers.’
Machining: ‘A lathe carrier.’
Mechanical engineering: ‘An adjustable stop used in gears.’
Petroleum engineering: ‘A clutching attachment for withdrawing well-digging tools.’
Railroading: ‘A spike for securing rails to sleepers [US “ties”].’

For Trimble (ibid.: 128), “[t]echnical vocabulary by itself does not pose enough of a problem for the majority of non-native students to need special attention in the classroom”. However, he conducted class discussions on sub-technical vocabulary in order to “make students see very quickly that familiar words may have very unfamiliar meanings [in various subject areas]” (ibid.: 130). Apart from adding an extra feature to Cowan’s definition of sub-technical vocabulary, Trimble did not comment on whether frequency should be considered as a criterion. He, however, used semi-technical (the term used by Phillips et al. to identify the category of words which most of the researchers mentioned in this study have referred to as sub-technical) to describe the word fast. Such a word, he noted, has a “semi-technical use” because it carries different senses in different texts such as “an arsenic-fast virus” in medicine and “to make something fast” on ships.

By 1988, when Mona Baker carried out an analysis of rhetorical items in medical journal articles, the concept of sub-technical vocabulary had become important in ESP research. Baker endorsed the claim that the real difficulty in understanding ST texts, as far as learners are concerned, lies in this area, which she perceived as covering a range of items neither highly technical and specific to a certain field of knowledge nor obviously general in the sense of being everyday words which are not used in a distinctive way in specialised texts. Nevertheless, Baker also observed (ibid.: 91):

… “subtechnical” as a category has proved to be elusive and confusing for many teachers, the reason being that the term has neither been clearly nor consistently defined in the literature.

She therefore described the sub-technical “territory” as a grey area between specialised and general vocabulary and, like Trimble, extended Cowan’s definition of sub-technical vocabulary. According to Baker’s description, each feature seems to flow into the next, forming a continuum:

1. Items which express notions general to all or several specialised disciplines. This type of word includes items such as factor, method, function which can be found in specialised disciplines like mathematics, chemistry and physics.

2. Items which have a specialised meaning in one or more disciplines, in addition to their meaning in general language, such as bug in CS and solution in mathematics and chemistry.

3. Items not used in general language but which have different meanings in several specialised disciplines, such as morphological in disciplines like linguistics and botany.

4. Items which are traditionally viewed as general vocabulary but which have restricted meanings in certain specialised disciplines, such as effective and expressed in botany.

5. General language items which are used in preference to other semantically equivalent items, to describe or comment on technical processes and functions, such as take, place and occur in biology.

6. Items used in specialised texts to perform specific rhetorical functions. She cites the terms discussed by Johns & Dudley-Evans (1980) as examples of general words, such as explanation, others and pointed out, which, when used in, for example, lectures on plant biology, carry a specific rhetorical load.

Baker thus expanded the boundary of the term sub-technical to include items that require varied teaching techniques, and therefore require more attention from ESP practitioners. She also prioritised the types of definitions into a sequence of difficulties for learners, from Type 1 to Type 6. Type 1 seems to be the easiest and falls into the category of Cowan’s (1974) sub-technical and Martin’s (1976) academic vocabulary. Types 2, 3, 4 and 5 are in the middle range in terms of complexity, with Types 2 and 3 causing “little or no problem” whilst Types 4 and 5 are stable when the meanings of the vocabulary items are defined. Type 6, in her opinion, is the most difficult area to teach and acquire. Interestingly, the features of sub-technical vocabulary on which Baker focuses encompass both context-independent words (as in Type 1) and context-dependent words (Types 2, 3, 4, 5 and 6). She has not, however, commented on whether these items can be identified by frequency count.

Li Lan (1989) appears to agree with Baker that sub-technical vocabulary constitutes a significantly large group of words, requiring the attention of ESP practitioners. She adopted, without explanation, the term semi-technical to refer to words common to every branch of science and therefore relevant to the students’ future careers (ibid.: 63). She further claimed that such semi-technical words are at the core of ESP. Like Trimble, she also suggests that this category of vocabulary embraces two sets of words: (1) academic words, which are context-independent and occur across disciplines, such as estimate, value, assess, as in Wilkins (1976); and (2) context-dependent, discipline-based general words, which commonly “occur with special meanings in specific academic fields” (op. cit.: 70), such as base, dog, fast in Trimble’s list. In short, Li highlighted a feature of discipline-based general words that other scholars had not discussed: a group of words with extended meanings in ST texts that are at times specialised and used metaphorically. An example is the word operation: for doctors, it means ‘surgery’; for military men, it means ‘a piece of organised and concerted activity’; and for finance professionals, it means ‘a business organisation’. In certain circumstances, these technicalised vocabulary items are used metaphorically. This can be seen from metaphorical use based on parts of the body, such as:

head        tool head, index head
mouth       mouth of a crusher, mouth of a furnace
shoulder    shoulder of a bell, shoulder of a curve, shoulder nipple

Like Cowan (1974), Li also mentioned frequency counts and considered their strengths and weaknesses as a determinant for a ‘core vocabulary’ for teaching, but she did not herself use a count to define the category of sub-technical vocabulary.

Paul Farrell (1990) traced the historical changes in the interpretation of the notion of what he termed semi-technical vocabulary. His study was concerned mainly with the English of electronics and used examples from Category J of the LOB² corpus. This category contains 80 text samples of 2,000 words each, giving a corpus of 160,000 words in learned and scientific writing; the text samples are taken from the areas of natural sciences, medicine, mathematics, psychology, sociology, demography, linguistics, education, politics, economics, law, philosophy, history, literary criticism, art, music, technology and engineering. Farrell claimed that four features distinguish semi-technical vocabulary from general and technical words of English: a wide range and/or a high frequency; context-independence; non-presence of words not found in general courses; and formality (ibid.: 12). His list of semi-technical vocabulary, drawn from his 20,000-word Electronics English Corpus, includes words that also appear in the LOB semi-technical list. Where Cassels & Johnstone (1985), Trimble (1985), Baker (1988) and Li (1989) had extended the boundary of semi- or sub-technical vocabulary to cover both context-dependent and context-independent words, Farrell narrowed the scope of such vocabulary items to only the context-independent words that Cowan, Inman and Robinson had referred to as general academic/educated vocabulary that happens to appear in ST texts.

Farrell, however, does not seem to be entirely satisfied with his own definition of semi-technical vocabulary. First, although he set out objective criteria to identify semi-technical vocabulary, he admitted that he would sometimes base his judgements on subjective “intuitive evaluation” to determine which word should be kept on the list or deleted from it. Second, he noted that some semi-technical vocabulary items are actually polysemous and homonymous, in which case they cannot be completely context-independent. He pointed out that a number of semi-technical words from his study of the English of electronics have one or more general English meanings, and take on extended meanings in technical contexts.
Two examples are current (the technical sense is a flow of electricity but in a general text it means a flow of water) and resist (the technical sense is obstructing or slowing a flow of electricity but in a general text it refers to opposing or standing firm against somebody or something). Although Farrell did not include such words in his list of semi-technical vocabulary, he frankly admitted that “[a] number of such items were noted in the present study, but it was felt that they belonged more properly under the heading of ‘technical vocabulary’, as they usually seemed to require a precise technical explanation by the subject teacher” (op. cit.). It is clear from this that the line between the kinds of vocabulary being discussed is by no means easily established. We seem to be dealing with fluid tendencies rather than hard-and-fast categories.

² The London-Oslo/Bergen (LOB) Corpus is an orthographically transcribed corpus of 1,006,901 words, comprising 500 samples of c. 2,000 words each of British English writing published in 1961, in fifteen categories. This corpus is intended to be a British English parallel to the Brown Corpus. The compilation was undertaken in 1970–78 at the Universities of Lancaster and Oslo, in collaboration with the Norwegian Computing Centre for the Humanities, Bergen.

Stewart Marshall, Marion Gilmour and Don Lewis (1990) reported a replication of the study by Cassels & Johnstone (1985), using the term non-technical to describe a group of academic words that ESP practitioners need to pay attention to (op. cit.: ii):

Many of the problems of comprehension experienced by students in science classes arise not from the technical words used but from the non-technical words used. Science teachers usually define the scientific and technical words when they are first used, but the non-technical words are assumed to be understood from their use in ‘normal’ English. Unfortunately, this assumption is not correct. Studies in the UK and Australia have revealed that many students, for whom English is a first language, do not thoroughly comprehend the non-technical words used in science.

In their study, however, they also used the term sub-technical, and adopted Cowan’s (1974) definition, claiming that sub-technical vocabulary consists of context-independent words with a high frequency across several disciplines: that is, general (or ‘normal’: Cassels & Johnstone) words used in ST texts. Such words are important in such texts because, as Marshall & Gilmour (1993: 70) note, they are used to modify or to express the relations that exist between the key concepts of any discipline, that is, they define, quantify and qualify. Marshall & Gilmour agreed with Cowan (1974) and others that sub-technical vocabulary consists of context-independent words with high frequency across disciplines, but they also point out that such a group of words can be polysemous and homonymous. And, like Cassels & Johnstone (1985), Trimble (1985), Baker (1988) and Li (1989), they express the view that such a category of vocabulary could, at one and the same time, embrace context-dependent words with different meanings in different contexts, noting (1990: 1):

Although students may appear to understand the words used, it is often the case that their understanding of the meaning of the words is not the same as that of their teachers.

Paul Nation (1990) used the term academic vocabulary when discussing this area of vocabulary, and with regard to the difficulty of words in technical texts quoted from a study by Robert Lynn, who mentions non-technical terms from the academic register as presenting the greatest problems to students but does not go so far as to define the area:

Perhaps the most striking features of the resulting list is the absence of technical terms. One might expect to find terms like debenture, blue-collar, inflation and debit, and indeed terms of this type make up the bulk of the vocabulary items in nearly all TEFL texts for commercial students. But instead we find “textbook English” words – non-technical terms from the academic register – presenting the greatest problems to our students. Even such apparently commercial terms as “appraise” and “compensate” were not, in fact, encountered in a commercial context, but in such academic phrases as “appraising the significance of…” and “factors which compensate for…”

(Lynn, in Nation 1990: 140)

Although Nation did not explicitly equate academic vocabulary with non-technical vocabulary, he implied that these two groups overlap. He was more concerned about frequency counts and noted that those non-technical vocabulary items which appeared in Lynn’s study were middle-frequency words. He also pointed out that such words as substantial, consistent and significant, whose occurrence usually lies in the middle-frequency range, do not belong to the category of technical vocabulary, but are part of academic vocabulary at large. Nation (op. cit.: 140) also noted that Lynn’s study “does not show that most of the words that learners do not know are non-technical. It only shows that the words that most learners do not know are the middle-frequency words”. This is a point which none of the earlier commentators had raised. Nation, however, did not go so far as to discuss the definition of general academic vocabulary, nor did he elaborate on the relations between general academic and non-technical vocabulary.

Stephen Pickersgill and Roger Lock (1991) replicated the studies of both Gardner (1972) and Cassels & Johnstone (1985), using Gardner’s term non-technical, but without defining it. In their study, they noted two important features of this category of words. First, many non-technical words are in common use but are not readily understood by students, who often take the opposite meaning to that intended; in addition, students’ usage can lack precision with regard to the meanings of ST words. Second, following Cassels & Johnstone (1985), Trimble (1985), Baker (1988), Li (1989) and Marshall et al. (1990), they expected that students would have difficulty in understanding words that are heavily context-dependent. The results of their study show that (op. cit.: 71):

… the use of such terms [non-technical vocabulary], without some form of explanation, could adversely affect a student’s understanding of the particular concept being taught … This misconception may be difficult to counter simply because, if the word is commonly used, the misunderstanding is likely to have built up over a number of years and, as a result of this reinforcement, be firmly fixed in the mind.

Pickersgill & Lock further concluded that ESP teachers should devote time within science lessons to teaching the meaning of non-technical terms. Teachers’ misconceptions regarding commonly used non-technical terms hinder students’ understanding of ST texts.

Keith Tong Sai-tao (1993a), following Trimble (1985), adopted the term sub-technical without discussion. He agreed that words have one or more general meanings but in a technical context may take on an extended meaning. He also agreed with Kennedy & Bolitho (1984) that such words are commonly found in general non-scientific writing, operate across more than one academic discipline, and may adopt special context-dependent meanings when used in ST texts (in Tong’s study: CS texts). Based on the frequency count of the UST CS Corpus, his report shows that words such as disk-drive or database, which occur frequently in CS, are not sub-technical vocabulary, nor are they words with a frequency of less than 20 in a million. He suggests, intuitively, that words with a frequency of occurrence in the region of 50 to 100 in the UST CS Corpus constitute items of sub-technical vocabulary and carry the feature of “[g]eneral language words used specially with some restriction or modification of meaning in a particular discipline…” (Sager et al. 1980: 242). Like Li (1989), he highlights an important and distinct feature of sub-technical vocabulary in CS texts: that the extended meaning they carry is metaphorical. Tong claimed that sub-technical vocabulary consists of words which lie in the middle frequency range of a specific text corpus, such as the UST CS Corpus, which are general vocabulary items that take on a specialised meaning in ST texts, and which are context-dependent.

In an investigation of Hong Kong Chinese students’ knowledge of academic and sub-technical vocabulary in English, Edward Li Siu-leung & Richard Pemberton (1994) also used frequency to categorise types of vocabulary. They adopted Cohen et al.’s (1988: 162) view that non-technical vocabulary (also academic vocabulary: Nation 1990: 138) is the category that includes words mainly from the middle-frequency range and may include some high-frequency words: a set of vocabulary items shared across various academic disciplines and context-independent. Li & Pemberton further argued that within this type of academic vocabulary,
there are words which, when they appear in a technical context, take on an extended meaning (cf. Trimble 1985: 129) and which can be identified as sub-technical. They did not make any specific attempt to define sub-technical vocabulary; instead, they borrowed Cohen et al.’s description of academic vocabulary, which includes three common features: middle/high frequency; availability across different academic disciplines; and context-independence. In their view, sub-technical vocabulary (including items such as batch, cluster, corrupt) is a subset of academic vocabulary, with the additional feature that such words usually take on extended meanings in technical contexts.

Louis P. K. Tao (1994) also replicated Cassels & Johnstone’s (1985) study, but he adopted Nation’s (1990) term non-technical rather than Cassels & Johnstone’s broader, but more ambiguous, term “normal English words in a science context”. He also noted that students’ problems in understanding ST terms go beyond the technical words and include the use of normal English in a scientific context: words such as account, achieve, conserve in the Physics, Chemistry and Biology papers of the Hong Kong Certificate of Education Examination. He shares the view of those researchers who hold that sub-technical vocabulary comprises general English words used in a science context, and that students find such words difficult to understand, concluding (op. cit.: 22):

[Students] failed to comprehend a large proportion of the non-technical words tested, confused them with words that were phonetically or graphologically similar and in some cases even took them as their antonyms. Hong Kong students also had problems with the correct usage of the words in sentences.

After considering my account of the uses of the terms sub-technical, semi-technical etc., Tom McArthur (personal communication) commented that, of the various terms used within this area of research, semi-technical appears to be the most appropriate. In the Oxford English dictionary, 2nd edition (OED 1989) there are nine pages covering eight categories³ containing 30 senses of the prefix sub-, and two categories carrying 11 senses⁴ in the New shorter Oxford English dictionary (NSOED 1993). The term sub-technical is not, however, listed in either of these works, nor in leading current US and UK desk dictionaries, while the prefix sub- may well mean more than simply ‘under(neath)’, as in:

‘incomplete(ly)’, ‘imperfect(ly)’, or ‘partial(ly)’, examples sub-historical, sub-literate, sub-mature, sub-moral, sub-solid, etc. (see OED, 2nd ed. 1989, at sub-, in Vol. XVII, p. 9, col. 3, § IV)

‘nearly’, ‘approximately’, ‘partially’, examples subcylindrical, subglabrous, subaquatic, subdelirium, etc. (see NSOED 1993, Vol. 2, p. 3114)

³ OED defines the prefix sub- in eight main categories: (1) Meaning under, underneath, below, at the bottom (of); (2) Meaning subordinate, subsidiary, secondary; subordinately, subsidiarily, secondarily; (3) Senses related to next below, near or close (to); (4) Meaning incomplete(ly), imperfect(ly), partial(ly); (5) Meaning secretly and covertly; (6) Meaning from below, up (hence) away; (7) With the sense of being in place of another, as in substitute; (8) Relating to the sense of in addition, by way of or as an addition, as in subinsert. (OED, 2nd ed. 1989, at sub-, in Vol. XVII, pp. 3, 5, 8, 9, §§ I, II, III, IV).

In McArthur’s view, one can, however, use semi-technical to mean “sometimes general, sometimes technical (depending on circumstances)”. In NSOED, for example, the senses of semi- include:

‘partly, partially, to some extent’, examples semi-official, semi-detached etc. ‘imperfectly, incompletely’, examples semi-liquid etc. (NSOED 1993, Vol. 2, p. 2769)

McArthur has further commented to me that many terms ‘float’, as it were, between general and technical, their use depending on context, as for example with bullet, directory and frame in everyday life and in CS texts. From McArthur’s viewpoint as a lexicographer, since the vocabulary being investigated consists of words which are not always, or even usually, technical, the term sub-technical seems not to be the best choice, for two reasons. First, the prefix sub-, as listed in dictionaries such as OED, NSOED and COBUILD 2 (1995), can have negative or pejorative implications, as in substandard, subnormal etc. Second, the CS terms used in this research, such as cell, dialog box etc., are not non-technical words ‘masquerading’ as technical words. They are entirely successful as technical terms in their own right, even though they happen also to have a wider meaning from which the technical sense may well have been derived by analogy. As a result, they are necessarily classifiable as technical on occasion and non-technical on occasion, their interpretations varying accordingly. Neither sub- nor semi- is the perfect vehicle for describing this state of affairs, but semi- is certainly more positive and probably more accurate.

⁴ NSOED defines sub- in two ways: (1) Senses as a living prefix, like ‘denoting a lower spatial position’; (2) Appearing usually in words adopted or derived from Latin, like those prefixed to numbers, adjectives, and terms with the sense ‘secret(ly), covert(ly)’, as suborn and subreption etc. (NSOED 1993, Vol. 2, pp. 3113–14)

The foregoing discussion has demonstrated that often when researchers use such different terms as general English words used in science, non-technical vocabulary, sub-technical vocabulary and semi-technical vocabulary, they are talking about much the same thing. Some may have a single absolute definition for such words, while others consider that the category has open boundaries which will flexibly allow for any general word that becomes technicalised and also any technical vocabulary item that becomes generalised. However, having so many different terms for the same category of words obviously leads to confusion. This is especially true when such a group of words attracts considerable attention from scholars in the field, and when a considerable amount of research has already been done. Although many of the definitions are, or seem to be, very different, they can nonetheless be compared and consolidated in order to obtain a clearer profile of the group of words that warrant the attention of ESP practitioners.

We may conclude, then, that there are, by and large, non-technical words (the general mass of the vocabulary, such as mouse, green, love) and technical words (items which belong only or almost wholly to specialised fields, such as astrophysics, retroflex, software), but that many of the general words of the language have several senses, one or more of which are non-technical while one or more others have to be labelled ‘technical’ in terms of one or more distinct registers (as, for example, the general and the CS meanings of view and field). The ‘technicalised’ senses of these general words can then be said to occupy “a special kind of space between the really-truly general and the really-truly technical” (McArthur, personal communication). Such extended senses (the computer’s mouse, the golfer’s green, the tennis player’s love etc.) can reasonably be called technical, allowing us to think of the word itself as semi-technical, belonging as it were in both lexical worlds – as illustrated in Table 2:

Table 2: Examples of general, semi-technical (CS) and technical (CS) vocabulary.

General vocabulary    Semi-technical CS vocabulary    Technical CS vocabulary
employee              child                           fortran
is                    dialog                          gotrue
wish                  mouse                           hipo
maintain              parent                          keyboard
six                   tree                            megabyte
target                view                            tuple

In this study, therefore, following Phillips et al. (1974), Johns & Dudley-Evans (1980), Li (1989) and Farrell (1990), the term semi-technical, rather than sub-technical or non-technical, has been adopted. There are few negative connotations involved in the use of semi-, while there is an implication that words such as mouse, tree and dialog may or may not be technical. They are ‘half-and-half’ words that can sometimes be general, sometimes technical, depending on context, and the list of such words is open to further members, as CS professionals creatively adapt other terms as needed from the language at large.

As outlined above, many researchers have pointed out that it is not specialised or technical vocabulary that necessarily creates obstacles for students in understanding ST texts, but often items of vocabulary from normal English operating within a science context, the semi-technical vocabulary. The perspective that semi-technical vocabulary consists largely of general English words which have taken on specialised meanings highlights three problems faced by non-native-English students of ST. First, if learners are already familiar with the general meaning of the word, they are likely to become confused when the same word is used in a scientific or technical context with a different sense (cf. Kennedy & Bolitho 1984: 57–8). Second, as Wallace (1982: 17) has claimed, the serious problem in understanding a ST text is not the technical language itself, but “the language framework in which the technical expressions are placed”. Since one of the elements within the language framework is the “sub-technical words and expressions typical of academic discourse (that is, words such as ratio, approximate, hence) which the subject specialist may assume that the student should already know” (ibid.), the specialist may build on such
an assumption and try to transfer more complex concepts to the learner, not appreciating that the learner may be ignorant of the extended meaning of a particular word in a designated ST context. Third, students and their subject teachers may encounter a word that looks familiar, and both may think that they share an understanding of its meaning. However, such an assumption is not always warranted, and may eventually cause miscommunication between students and teachers. This may finally weaken the learner’s understanding of a particular text (Cassels & Johnstone 1985: 1). To illustrate this point, we may consider the word volatile. In a normal English context relating to people, it means ‘changing quickly from one mood or interest to another’, in Chinese, ‘情緒或興趣多變的,無常性的’ (Oxford advanced learners’ English–Chinese dictionary [OALECD] 1994: 1697). However, when it appears in a chemistry context and refers to a substance, its definition applied to people can be mistakenly transferred to mean ‘unstable’, ‘inflammable’ or even ‘explosive’ instead of ‘easily changed into gas’, in Chinese, ‘容易揮發’ (Cassels & Johnstone 1985: 1–2).

It has been estimated that up to some 80% of all research material in science and technology worldwide, depending on the discipline, is published in English (cf. Hyland 1996). Rapid and effective comprehension of such reference material becomes vital for students at the tertiary level. However, research conducted in the UK (Wang 1988), Papua New Guinea (Marshall et al. 1990) and Hong Kong (Tao 1994) suggests that when students use a dictionary as an aid to reading, they often accept the first definition given in the entry, if it is plausible in the context, without examining the variety of other senses that a word may have, or considering that words may also take on special meanings in different contexts (cf. Tong 1993a). The format of the dictionary entry for a semi-technical word may also have a significant effect on learners when they have to decide which sense should be accepted for a particular word in a reading context.

In view of this and the urgent needs of students studying CS and CE in the School of Engineering, and ISMT in the School of Business & Management, at UST, among others, in comprehending their textbooks, the present research seeks to harness computer technology to investigate the linguistic features of ST writing in English, so as to isolate areas of potential difficulty and to generate materials, e.g. a syllabus and an electronic dictionary, to help students overcome problems in comprehending their textbooks and program manuals.

Chapter Two: Semi-technical vocabulary in Computer Science

Introduction

As we have seen, at least 23 authors have, in the past thirty years, defined the related terms non-technical, sub-technical and semi-technical vocabulary in various ways, twelve of them using frequency as a criterion. The majority consider that (non-, sub-,) semi-technical vocabulary items are of high frequency in science texts, while others (Johansson 1975; Tong 1993a) claim that only words that fall into the middle range of frequency counts are sub-/semi-technical. All 23 researchers agree that, in many cases, semi-technical vocabulary items carry extended meanings in ST texts. Tong (1993a, 1993b) and Li & Pemberton (1994) cite examples from the UST CS Corpus, both borrowing Trimble’s (1985) definition of sub-technical vocabulary. Tong (1993a) stresses two features while Li & Pemberton, also taking Cohen et al.’s (1988) view of academic vocabulary into account, indicate four features of semi-technical vocabulary in the context of CS in tertiary education in Hong Kong. However, any such system is necessarily imprecise because of the fluidity of word uses and contexts both through time within a subject area and across subject boundaries, as a consequence of which classifying semi-technical vocabulary items according to any simple criterion is a debatable matter.

At present, there is no formal theory that clearly describes this category of vocabulary. Rather than using any arbitrary criterion to describe the items selected for the present study, I have therefore chosen Laurie Bauer’s English word-formation (EWF 1983), Randolph Quirk et al.’s Comprehensive grammar of the English language (CGEL 1985), Tom McArthur’s Oxford companion to the English language (OCELang 1992) and Sidney Greenbaum’s Oxford English grammar (OEG 1996) as a basis for developing a framework for such a category. These works provide comprehensive descriptions of word formation and use, and have attempted to systematise the word-formation process by means of such major categories as affixation, compounding and conversion, and such minor categories as back-formation, reduplicatives, abbreviations and blends. We note, however, Bauer’s (1983: 1) observation:

There is, at the moment, no single “theory of word-formation”, nor even agreement on the kind of data that is relevant for the construction of such a theory…. [G]iven the confusion that reigns at the moment, it should be borne in mind that virtually any theoretical statement about word-formation is controversial.

Instead of directly adopting the conventional classification of word-formation categories suggested in EWF, CGEL, OCELang and OEG, I have developed a provisional model to describe the creative aspect of semi-technical vocabulary, as found in such works as the Online hacker jargon file (Version 4.0.0: 24 July 1996), Gregory James cum al.’s English in Computer Science (1994), the program manual Running Word 6 for Windows (1994) and dictionaries of computer terms. Such a model will serve as a base for a set of significant lexical categories that will sufficiently describe the highly metaphorical nature of the vocabulary items selected for the purpose of this study.

The creative aspects of semi-technical vocabulary

The language registers of CS investigated here, more than most academic and professional registers, are best described as informal, playful and inventive. The most significant feature that distinguishes CS language registers from other English varieties is their use of metaphorical language.

Although every scientific discipline uses metaphor, there is probably no field that uses metaphor quite as pervasively and idiosyncratically as does computer science. (Johnson 1991: 273)

Vocabulary items used by computer professionals are usually spontaneous and fashionable, and in traditional terms many new words are formed in an unconventional and unstructured manner. They do, however, exhibit an enjoyment of language play. The words created are therefore usually uninhibited, at times unrestricted in range, and often transient. Although figurative usages are also found in other English registers, those features, when they appear in the registers of CS, tend to be more playful, fashionable and inventive, for two reasons. The first is the so-called ‘hacker culture’ which has developed in the last three or four decades. Since computer professionals, enthusiasts and the like are generally highly creative people who define themselves partly by the rejection of ‘normal’ social values and working habits, their linguistic inventions are consciously part of “a game to be played for pleasure and as a result display an almost unique combination of the neotenous
enjoyment of language-play with the discrimination of educated and powerful intelligence” (Raymond 1996: 2–3). The second factor that contributes to the creative and inventive aspects of computer-related language is the working environment, including the electronic media that connect enthusiasts, and the continuous upgrading of hardware and software, whose development in the last two decades has created a dynamic environment in which computer users seem to feel the need to communicate in fluid, vivid and vigorous language. This, in turn, generates a uniquely intense and accelerated view of linguistic evolution in action. The creative element of ‘computerese’ is often reflected in the vocabulary used in this genre, indicated by its highly figurative, and especially metaphorical, nature (James cum al. 1994: 49ff.).

Adapting the structure of the word-formation categories suggested in EWF, CGEL, OCELang and OEG, I will use two broad categories that I identify here as lexical, because they relate to both the use and creation of words and the forms that they take: rhetorical (e.g. allusion, figurativeness, neologism, semantic change and word play) and morphological (e.g. derivation, compounding, blending, abbreviation and phrasal-verb formation). I then use these categories to describe the significant features of the vocabulary items under investigation. These two sets of lexical categories are not conceived as mutually exclusive; rather, they flow into each other. Because language is a complex, ever-changing social activity, any list of categories used to describe it is unlikely to be exhaustive or sufficiently subtle. The ten rhetorical and morphological lexical categories selected here for the discussion of semi-technical vocabulary in CS are sufficient for our purposes, however, as a means of accounting for, and describing, significant features of the creative aspects of the items of semi-technical vocabulary chosen for the present study.
The rhetorical dimension

OCELang broadly defines rhetoric, in traditional terms, as “the craft of speaking”, and categorises its present-day use as: (1) the study and practice of effective communication; (2) the art of persuasion; and (3) an insincere eloquence intended to win points and get people what they want. Within this broad range, such devices as analogy, metaphor and metonymy have been used and commented on since the time of the ancient Greeks. The term ‘rhetoric’ is chosen here because it covers such matters as presentation, persuasion, position-taking, and self- and group-identification, as, for example, the ways in which computer specialists,
enthusiasts and general users identify themselves. In order to maintain a ‘club spirit’, such users tend to communicate with fellow ‘in-group’ members by means of novel technical and lexical usages, to help them manage their affairs and show how they operate – as they see themselves – at the cutting edge of technology. By analysing the vocabulary items used in this genre, I have established five significant inter-related categories which help highlight the highly metaphorical nature of the semi-technical vocabulary items discussed in this study. These (necessarily overlapping) categories are allusion, figurative usage, neologism, semantic change and word play.

Category 1: Allusion

This term formerly included metaphors, parables and puns, but is now generally used to refer to the implicit or unacknowledged use of someone else’s words. Allusions are indirect and often cryptic, but are used in such a way that a speaker or writer can share an understanding with certain listeners and readers, who are likely to know who or what is being alluded to. Four examples, taken from James cum al. (1994), of such indirect and usually incomplete reference are:

catch-up and the systems game, as in You’ll have to play catch-up, and in the systems game, that may be difficult. In this context, to play catch-up means to compete and win even if it means using illegitimate tactics. The systems game refers to the circumstances in which computer systems work.

turbulence and airspace, as in The turbulence in the British telecommunications airspace is caused by the political desire to deregulate a monopolistic industry. In this context, an indirect punning reference is made to the unstable environment in which an aircraft flies and the political difficulties that British telecommunications encounter.

Las Vegas, as in This question is an example of what I call the “Las Vegas” mentality exhibited by some designers, alluding obliquely to an inclination to gamble.

Robin Hood, as in Another possibility is to play Robin Hood, taking space from the tables with much room and giving it to the tables with little room, alluding to the mythical outlaw’s action of stealing from the rich in order to help the poor.

Category 2: Figurative usage

I take this category to cover both the free use of such figures of speech as metaphor and simile, especially anthropomorphic and animative metaphors, which arise from a tendency to treat hardware and software as human or animate, and the use of figurative extensions that often create polysemes, as in the case of common biological metaphors (cf. James cum al. 1994: 41–2).

Anthropomorphisation and animation

Candidate, as in Note the use of a composite key in Fig. 8.2. Composite keys are candidate keys …

Protocol handler, as in ‘The protocol handler got confused.’

One may say of a computer routine that ‘its goal in life is to do x or y’. Sometimes, when the computer does not perform as expected, one may hear such comments as ‘... and its poor little brain couldn’t understand this algorithm, and it died’.

Figurative extension from real life

Bug, as in Because they were not formally designed, they cannot be precisely repeated, and no one is sure whether there was a bug or not.

Doctor and tree, as in To implement the doctor-imitating program, we begin with DOCTOR, the procedure that couples you to the DOCTOR transition tree.

Category 3: Neologism

Defined as a new word or phrase, or a new sense of a word or phrase, such as the compounds toolbar (< tool + bar) and video-conferencing, formed initially from video-conference (< video + conference); the acronyms WYSIWYG (< what you see is what you get) and GIGO (< garbage in, garbage out); and the hacker’s term P-convention, through which a
question is built into a word by adding ‘p’ to denote a predicate (following a convention in the programming language LISP), for example Foodp, meaning ‘Shall we have lunch/some food now?’, and State-of-the-world-p, meaning ‘What are you up to?’.

Neologisms are formed spontaneously. They are uninhibited and unrestricted because computer enthusiasts do not necessarily look up linguistic origins when they create a new term, and are not particularly aware of following (or breaking) any word-formation rules. Their aim is to communicate, in an interesting but special ‘in-group’ manner. This category also flows into the morphological categories, such as abbreviation and compounding, discussed below.

Category 4: Semantic change

We identify here three kinds of semantic change: the adaptation of meanings and uses of vocabulary items from language at large into computer usage, from computer usage into language at large, and from one register of computer usage to another, involving specialisation (new uses for old words), generalisation (uses of computer jargon extended into daily life) and contextual transfer (the movement of terms from other disciplines to computer usage). Such categories are not always sharply distinguishable and can shade into one another or develop from another type of linguistic classification such as conversion (the shift from one grammatical category to another). Semantic change is probably best discussed in terms of webs of shifting forms and relationships rather than words on their own. In most cases, the reference of a word extends figuratively or idiomatically.

Examples of specialisation (including conversion)

1. The term architecture generally means the art and science of designing and constructing buildings, while in CS it means the arrangement of complex hardware and software.

2. The term chip generally means a sliver of wood, whereas in CS it means a tiny wafer of silicon on which is engraved a minute circuit.

3. The noun library is used in CS as a verb, as in, ‘Library a set of programs for common tasks’.

4. The noun protocol is used as a verb, as in ‘Protocol the rules by which two computers or programs communicate’.

Examples of generalisation (including conversion)

The noun mode can be used as part of a compound, as in, ‘I attended the meeting in sponge mode [= listened but said nothing]’, or ‘He immediately switched himself into supervisor mode [= took on a leading role]’.

The noun e-mail or email (which is a blend of e for electronic and mail) is used as a verb, as in ‘I will email him tonight [= send a message through a computer network]’.

Examples of overgeneralisation

In general use, the suffix -ity is often added to adjectives in -ous (among others) to form nouns in -osity, as in porous/porosity, generous/generosity. Overgeneralised coinages are found in CS contexts, such as: mysteriosity (< mysterious), ferrosity (< ferrous), obviosity (< obvious), dubiosity (< dubious). In CS contexts, certain analogical coinages can also be found, such as, by analogy with gratitude, latitude, magnitude etc., lossitude (< lose), cruftitude (< cruft ‘an unpleasant substance’).
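The -ous to -osity pattern described above is regular enough to be written down as a rule, and a short sketch may make the notion of overgeneralisation more concrete. The Python function below is only an illustration: the name `osity_noun` and the one-line rule are assumptions introduced here, not taken from any of the works cited. It replaces a final -ous with -osity and does nothing else, so it yields the established nouns for porous and generous and the CS coinages for the other adjectives quoted in the text.

```python
def osity_noun(adjective: str) -> str:
    """Form a noun in -osity from an adjective in -ous (hypothetical helper)."""
    suffix = "ous"
    if not adjective.endswith(suffix):
        raise ValueError(f"{adjective!r} does not end in -{suffix}")
    # Replace the final -ous with -osity; no dictionary is consulted.
    return adjective[: -len(suffix)] + "osity"


# Adjectives taken from the examples quoted in the text.
established = ["porous", "generous"]
coinages = ["mysterious", "ferrous", "obvious", "dubious"]

for word in established + coinages:
    print(f"{word} > {osity_noun(word)}")
```

Because the function never consults a word list, it cannot tell an established noun from a playful coinage. That distinction lies in usage rather than in the rule, which is arguably what makes such coinages available for language play.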

Contextual transfer

1. Spam (< ‘spiced ham’), used as a noun in CS texts meaning ‘irrelevant or inappropriate messages sent on the Internet to a large number of newsgroups or users’, and as a verb meaning ‘to send the same message indiscriminately to large numbers of newsgroups or individuals on the Internet’. The Internet sense derives from a sketch by the cult British ‘Monty Python’ comedy group, set in a café in which every item on the menu includes spam.

2. Virus, a medical term, used in CS for a planted program that copies itself from machine to machine, using up memory or corrupting or deleting files.

Category 5: Word play

My definition here: any adaptation or use of words to achieve a humorous, ironic, satirical, dramatic, critical, or other comparable effect. Word play can involve form (both sound and spelling) and grammar. Deliberate word distortion is common (in what hackers call ‘soundalike slang’), as in: Boston Horrid for ‘Boston Herald’, New York Slime for ‘New York Times’, dirty genitals for ‘Data General’, hysterical raisins for ‘historical reasons’. Verb doubling is an example of grammatical word play, as with lose in ‘The disk heads just crashed, lose, lose [= a comment on an undesirable situation]’, and chomp as in ‘Boy, what a bagbiter! Chomp, chomp [= to say “You chomper/loser!”]!’

Such word play is common in CS registers and may therefore be encountered by native or non-native CS students, neither of whom might immediately grasp its meaning. The way words are formed reflects the creative inclinations of computer enthusiasts and their desire to communicate vividly. For example, unlike the rather opaque Cockney rhyming slang, the ‘soundalike slang’ that computer enthusiasts use seeks to be transparent. Computer slang expressions are included in the discussion here because they are words or phrases converted from an ordinary word or phrase in an interesting and dynamic manner.

The morphological dimension

As can be seen from the preceding discussion, the creation of terms in CS can be discussed in terms of traditional word-formation categories as described in EWF, CGEL, OCELang and OEG. I have chosen here what appear to be the five most significant categories with respect to the vocabulary items investigated in this study. These are derivation, compounding (including fixed phrases), blending, abbreviation and phrasal-verb formation.

Category 1: Derivation

The process of creating a more complex word from a simpler one, according to a set of rules, as in complexity for complex. Hacker, hacking, hackintosh, hackish, hackishness, hackitude are all coined from hack, both a noun (originally meaning ‘a quick job that produces what is needed, but not necessarily well’) and its verb equivalent. The jocular
noun bogosity and verb bogosify derive from bogus (an adjective, meaning ‘non-functional, useless, false, incorrect, unbelievable or silly’). The noun cruft is a back-formation from the already CS-specific adjective crufty (‘poorly built, possibly over-complex’). Category 2: Compounding (including fixed phrases) The process of forming a further word from two or more existing words (compound words or simply compounds) as in database < data + base, toolbar < tool + bar, watermark < water + mark, tooltip < tool + tip, lightpen < light + pen, lookalike < look + alike, pressbutton < press + button, flowchart < flow + chart, download < down + load. Some compounds can be classified as fixed phrases (a phrase, often consisting of an adjective and a noun, which functions as a word, either with unique reference or as an idiom; also a category in phraseology), which is a significant phenomenon in CS language. It is not categorised as a separate word-formation category in EWF, CGEL, OCELang or OEG but is dis-cussed under compound in CGEL. The line between fixed phrases and compound words is not easy to draw. Some examples are:

black art: a collection of arcane, unpublished, and mostly ad-hoc techniques developed for a particular application or systems area;

dangling pointer: a reference that does not actually lead anywhere;

finger-pointing syndrome: an all-too-frequent consequence of bugs, especially in new or experimental configurations, in which the hardware vendor points a finger at the software, and the software vendor points a finger at the hardware.

Category 3: Blending

A process which collapses two words into one. The result is known as a blend word, word blend, amalgam or fusion, in which a compound is made by ‘blending’ one word with another. As noted in OEG, blends, like other word-formation categories, are generally not well defined, and blending tends to shade off into compounding, affixation, clipping and acronyming. It is also related to abbreviation, but is distinct from them all, as in: fortran fusing formula and translator, LISP fusing list and processing, CompuServe fusing computer and server, computron fusing computer and electron, Macintrash fusing Macintosh and trash.

Category 4: Abbreviation

The shortening of words and phrases and any result of such shortening, such as: the initialisms CD < compact disc, PPN < project-programmer number and VR < virtual reality; the acronyms WYSIWYG (pronounced ‘wizzywig’) < what you see is what you get and YAUN (cf. ‘yawn’) < yet another UNIX nerd; the blend of initialism and acronym CD-ROM < compact disc – read only memory, and the blends of letter and word e-mail < electronic mail and p-mail < physical mail.

Category 5: Phrasal-verb formation

The phrasal verb is a common type of verb that operates more like a phrase than a word, as in come up, go down, move out, take up with. EWF discusses phrasal verbs as marginal cases of conversion, while Anthony Cowie (personal communication) considers them part of phraseology. Regardless of how it is best classified, the phrasal verb is a significant element in CS texts, often having specialised meanings and uses. Examples are:

blow away: to remove (files and directories) from permanent storage, generally by accident;

blow up: to become unstable;

comment out: to surround a section of code with comment delimiters or to prefix every line in a section with a comment marker, preventing it in this way from being compiled or interpreted;

log on, log in: to begin a period of using a computer system by performing a fixed set of operations;

log off, log out: to finish a period of using a computer system by performing a fixed set of operations.


Phrasal verbs are abundant in everyday usage such as in conversation, newspapers and magazines, but are less common in ST usage. Many are informal, vividly metaphorical, and slangy. Computer professionals frequently coin and use them in both speech and writing to create a playful and humorous effect for conscious pleasure.

Words adapted, created, and used in such ways can pose significant comprehension problems, mainly because non-native students are not at all aware of such possibilities. Indeed, for learners the problem may emerge at an earlier stage than native users of English might suppose, because extensions of meaning and use over even a relatively ‘short’ figurative distance can be opaque to learners (though probably transparent enough to most native speakers, especially after a short period of adjustment). The following usages have proved to be difficult for UST students:

candidate, as in Note the use of a composite key in Fig. 8.2. Composite keys are candidate keys that contain more than one attribute. They are needed when there is no single attribute candidate key.

bug, as in Because they were not formally designed, they cannot be precisely repeated, and no one is sure whether there was a bug or not. After the bug has been ostensibly corrected, no one is sure that the retest was identical to the test that found the bug.

postamble, as in When a data record is recorded, it is often preceded by a preamble and followed by a postamble.

catch-up, as in You’ll have to play catch-up, and in the systems game, that may be difficult.

The problem intensifies when students encounter a word that, in a computational context, has a different sense from the one they are broadly familiar with, especially in combination, as with title and bar in title bar; mouse and pointer in mouse pointer; control, menu and box in control-menu box; hot and spot in hot spot; history and button in history button; formatting and inspector in formatting inspector. Interestingly, the process of extending the general meaning of a word in the CS context will, in some cases, create fully technical words that do not normally occur in non-CS contexts such as AutoCAD (a fused compound of Auto Computer-Aided Design), bitmap (< bit + map), CompuServe (< computer + server), metafile (< meta + file), micrografx (a fused compound of micro and graphics). These, however, will not be discussed further in this study.

In the following sections, I will consider the results of the experiments conducted to establish whether a selection of vocabulary items which satisfy our criteria used to describe semi-technical vocabulary creates problems for students in understanding CS-related texts.


Chapter Three: Semi-technical vocabulary in the comprehension of CS texts

Introduction

Despite its internationalisation, CS language can often be difficult, even for native speakers of English. Research into this genre is especially valuable for ESP learners, including students who have to deal with the unfamiliar meanings of otherwise familiar words. As we have seen, significant linguistic features in the language of CS include: the use of metaphor; kinds of lexical modification; the manipulation of allusions; facetious expressions; and verbal humour.

To examine whether these features also affect the comprehension of CS texts, a cohort of 120 students studying on the Language Enhancement Course at UST in the Fall semester of 1995 was chosen as a group from which data could be collected. All were first-year students who had attained grade D or below in the Hong Kong Examinations Authority’s Use of English examination. The 120 subjects were from eight classes, viz. four Computer Science (CS), two Information and Systems Management (ISMT), and two Biology (BIOL).

During the experiments, subjects were asked to use the thinking-aloud technique to verbalise their thoughts while they were writing down the explanations of the words given. The thinking-aloud, or introspective, technique is a method borrowed from experimental psychology, which has recently been applied in research into language learning, language processing and learners’ lexical inferencing procedures (cf. Lam 1991). In the first week of their language classes, the subjects were asked to work on three different tests, designed to investigate whether the areas of metaphor, linguistic variation (in the form of lexical modification), verbal humour, and allusive usage created unnecessary confusion in their understanding of CS texts.

Test designs

Test 1: Metaphor in CS texts

As we have seen, figurative usage is pervasive in the language of CS, especially metaphors such as anthropomorphism and the representation of the inanimate or abstract as being alive. For example, such words in the UST CS Corpus as subordinate and superior, candidate and selector, host and server, manager and client, and parent and child are applied to non-living things and processes, as are tree, vegetable, bug, dinosaur, mouse, worm and virus. The aim of Test 1 is to investigate whether the use of such metaphors affects students’ comprehension of CS texts.

The test was carried out in a two-stage schema. In the first stage, the subjects were asked to write down the meanings of the four individual metaphors (Appendix 1) candidate, subordinate, bug and tree, first in English, but with the freedom to express themselves in Chinese or both languages together (cf. Wong 1995). During the glossing process, students were asked to engage in thinking-aloud protocols. If they found that verbal communication was not effective enough, they could also use diagrams and examples to illustrate their thoughts, or give all possible meanings of each word they could think of, and state the contexts in which those words would be likely to occur.

In the second stage of the test, subjects were asked to explain in writing the meanings of the words in context (Appendix 2). All four samples were presented in context, texts being taken from the UST CS Corpus. If subjects felt that they were unable to grasp the meanings of the words within the given text, a longer paragraph of up to 999 characters,⁵ with the examined word embedded in it, would be given to them (Appendix 3). This procedure was based on the assumption that longer texts may provide more information and background details to help students’ comprehension of the meanings of the word in context. In both stages of the test, subjects were interviewed immediately after they had submitted their answers.
These immediate retrospective interviews were conducted to elicit qualitative information about the subjects’ understanding of the words (and to confirm the data collected from the thinking-aloud technique). For example, information was elicited on: where they had first learned a word (to establish whether their first encounters with the words were in non-technical texts, or whether they had never come across them before); how certain they were of the meanings (to establish whether they had clearly understood the meanings of the words); and what associations they might have experienced when the word was recognised (to establish whether their previous experience of the words had had any effect on their understanding of the words this time). Each interview lasted from fifteen to twenty-five minutes, and was conducted in English, with subjects being allowed to answer in English or Cantonese, or a mixture of both.

⁵ 999 is the maximum number of characters that can be saved for printing in MicroConcord, a concordancing program that allows users to search large amounts of computer-readable text for words and combinations of words for the purpose of studying their meanings and the ways in which they are used (cf. Murison-Bowie 1993: 7).

Test 2: Lexical modification in CS texts

Sager et al. (1980), among others, have highlighted a significant feature of English through which speakers and writers can convert a word from one category into another without change in form. Conversion is most common from verb to noun and vice versa. Examples from the online Jargon File are “All nouns can be verbed ... all verbs can be nouned”, “I’ll mouse it up”, “Hang on while I clipboard it over” (Jargon File, v. 4.0.0, 24th July 1996). In view of this, Test 2 aims at investigating whether linguistic variation in the form of conversion causes problems for ESP learners in understanding CS-related texts. The same 120 subjects were asked to locate the subject of the verb is in the following sentence:

Selectively disabling interrupts on one or more levels is called interrupt masking.

(UST CS Corpus, AR037-2.ASC, from James cum al. 1994: 45)

When conversion occurs in certain words, the unusual results are likely to surprise learners. One example of a verb converted into a noun is interrupts,⁶ as above. Subjects were again asked to verbalise their thinking during the process, so as to reveal what factors influenced their decisions. A retrospective interview was held immediately after the written test.

⁶ There are altogether 307 occurrences of interrupt(s) in the UST CS Corpus. Some 299, or over 97%, are functionally nominal or adjectival (James cum al. 1994: 44).


Test 3: Allusions and facetious words in CS texts

With the advent of computer technology, a new domain-specific variety of English has emerged. What traditional prescriptivist grammarians have generally regarded as “bad English” is sometimes perfectly acceptable in the domain of CS (James cum al. 1994: 50). Although the novel word formations, the grammatical deviations and syntactic re-formations that accompany this recently developed technology to some extent represent the stylishness and modernity of technological usage, these changes in the language impose additional burdens on second language learners. Test 3 aims at investigating the extent to which allusions and facetious words create confusion for ESP learners when reading CS texts.

The same two-stage written explanation exercise as used in Test 1 was carried out to assess the students’ knowledge of those expressions. Each subject was asked to write the meanings of six words, postamble, underlaps, catch-up, turbulence, Las Vegas and Robin Hood, identified in James cum al. (1994: 49, 52): first giving the meaning of each item in isolation, and then its meaning in a given CS context from the UST CS Corpus (Appendix 4). Longer texts with the words-in-test embedded were available on request for the subjects to examine (Appendix 5).

Discussion of results

The success rates for subjects’ ability to recognise the correct lexical sense of the four metaphors used in Test 1 are shown in Table 3:

Table 3: Students’ success rates in recognising the correct lexical senses of the four selected words.

| Vocabulary items | Stage 1: Word-level recognition, number of correct responses (n = 120) | Stage 2: Sentence-level recognition in CS context, number of correct responses (n = 120) |
|---|---|---|
| candidate | 114 | 36 |
| tree | 120 | 56 |
| bug | 16 | 12 |
| subordinate | 4 | 4 |


In the first stage, the words candidate and tree were successfully explained in their general senses by most subjects. Chinese equivalents such as 申請人 and 考生 were given for candidate and 樹 for tree. The success rates were 95% and 100% respectively. No subject used any diagram or other means to illustrate the words. The results in the second stage, however, are different: only 36 and 56 subjects (fewer than half of the cohort) successfully explained candidate and tree respectively in the CS context.

The difference between the two stages was explained in the retrospective interviews. Some 112 individuals (93% of the cohort) admitted that they immediately replaced the examined words with the Chinese equivalents they had used in the first stage of the test. All 112 subjects concluded that knowing the general meaning of the words did not help them to understand the items in the given CS sentences, nor did it help them to understand the sentences taken from CS contexts. Around 85% of the 112 subjects indicated that this preconception of the words’ general meaning hindered their understanding of the CS texts.

When subjects were asked in Test 2 to locate the subject of the verb is in the sentence Selectively disabling interrupts on one or more levels is called interrupt masking, around 114 (95%) of the 120 answered wrongly. However, when they were asked to reconsider the question more carefully, some of them managed to answer correctly. Such a delay in comprehending a text may have been occasioned by the form interrupts being used anomalously as a nominal object instead of a verb; or the position of the participle and the collocation of a singular verb occurring right after an adverbial phrase.
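
The Test 1 percentages quoted above follow directly from the counts in Table 3. As a minimal sketch (the dictionary layout and the helper names `rate` and `summarise` are illustrative assumptions, not part of the study's own tooling), the success rates at each stage and the drop between stages can be recomputed as follows:

```python
# Correct-response counts transcribed from Table 3 (n = 120 subjects).
N_SUBJECTS = 120
table3 = {
    "candidate": (114, 36),
    "tree": (120, 56),
    "bug": (16, 12),
    "subordinate": (4, 4),
}


def rate(count, n=N_SUBJECTS):
    """Return a count as a whole-number percentage of n."""
    return round(100 * count / n)


def summarise(table):
    """Map each word to its Stage 1 rate, Stage 2 rate and the drop in points."""
    summary = {}
    for word, (stage1, stage2) in table.items():
        summary[word] = {
            "stage1_pct": rate(stage1),
            "stage2_pct": rate(stage2),
            "drop_points": rate(stage1 - stage2),
        }
    return summary


if __name__ == "__main__":
    for word, row in summarise(table3).items():
        print(word, row)
```

Run on the Table 3 counts, this gives 95% and 100% at word level for candidate and tree, falling to 30% and 47% respectively in the CS context.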
The results of Test 3, supported by the data collected from the thinking-aloud technique and the observations gathered during the retrospective interviews (as illustrated in Table 4 below), indicate that 98 subjects (82%) gave acceptable one-word definitions, during the word-level recognition glossing exercise, in English or Chinese, such as 追上, follow, or 抓緊 for catch-up, while only 6 (5%) of them mentioned that the word could be related to a game or the name of a game. Similarly, with respect to the words with cultural allusions, Las Vegas and Robin Hood, most subjects, 74 (62%) and 72 (60%), simply translated these (in their written texts) as 拉斯維加斯 and 羅賓漢 respectively.

The results of the sentence-level recognition exercise show that subjects found it easier to apply an allusive meaning for Robin Hood to CS texts than to understand the use of Las Vegas. Data collected from the thinking-aloud protocol and retrospective interview confirm that, since subjects were more familiar with stories or films related to the person of Robin Hood than with the renowned gambling city of Las Vegas, they could more easily understand the meaning of Robin Hood in the CS context. Most of them used a four-word Chinese proverb, 劫富濟貧 [‘taking from the rich to help the poor’], to describe the functional use of the word in such a context. As regards Las Vegas, no subjects used any phrases like 賭仔心態 [‘gambling mentality’] or 賭場風氣 [‘gambling atmosphere’] to elaborate the meaning of the word in context.

Table 4: Students’ success rates in recognising the correct lexical sense of the six selected words.

| Vocabulary items | Stage 1: Word-level recognition, correct/near-correct responses (n = 120) | Stage 2: Sentence-level recognition in CS context, sensible guesses (n = 120) |
|---|---|---|
| catch-up | 98 | 6 |
| turbulence | 30 | 0 |
| Las Vegas | 74 | 0 |
| Robin Hood | 72 | 6 |
| postamble | 0 | 0 |
| underlaps | 0 | 0 |

These results indicate that linguistic features such as the use of metaphor, linguistic variation in the form of lexical modification and the manipulation of allusions, facetious words and verbal humour can affect university students’ understanding of CS texts. A proportion of CS language consists of such casually invented expressions as postamble (< preamble) and underlaps (< overlaps), which can be classified as humorous neologisms. As a result, it is necessary to provide a tool that can help learners overcome problems with such formations. One suggested solution is a glossary of semi-technical vocabulary with such features as Chinese explanations, examples of use (in the form of interactive demonstrations) and frequency information based on a specialised corpus.

Trimble (1985) pointed out that semi-technical vocabulary poses a major problem to students in their comprehension of ST texts. Although computer senses of certain entries such as mouse, file, bug, hack and menu appear in such dictionaries as the Oxford advanced learner’s dictionary of current English (5th edition, 1995) (OALD 5) and the Longman dictionary of contemporary English (3rd edition, 1995) (LDOCE 3), the coverage is not extensive enough to satisfy the needs of users, and in any case such works are not specifically designed to help with this kind of vocabulary. My own experience of reading CS-related texts, especially when learning a new program, suggests that semi-technical vocabulary can be an important day-to-day problem for ESP learners in comprehending these texts, and it therefore merits the attention of specialised lexicographers and lexicologists.

A glossary of semi-technical vocabulary in CS should be able to give meaningful definitions to ESP learners, and so help them understand what they are reading. My experience also suggests that it would be even more useful if the glossary were compiled as an appendix to each CS textbook, program manual or other related CS product. An appropriately designed dictionary/glossary would, therefore, be a timely product for learners who need to use certain computer programs, such as Microsoft Word and PowerPoint. Such a glossary should be both subject-specific and corpus-based so as to be more efficient for such students than a general learner’s dictionary (GLD).

A useful outcome of frequency counts based on the UST CS Corpus is a ranked A–Z wordlist or glossary, which could first be used to enable the relevant university staff to devise appropriate tests or examinations to help them consider the intake of suitable candidates for their courses, or to identify students with a deficient awareness of the vocabulary of CS. It could also be used to assign students to the most appropriate level for their studies. A wordlist or glossary of CS terms could, in addition, be added as an appendix to general EAP lecture notes or subject-specific textbooks, in order to alert students to what sorts of words in CS they should know and why, and provide key information to reduce their problems.
Finally, a wordlist or glossary with frequency counts in CS can also act as a means of widening students’ knowledge in related vocabulary areas, which they might or might not have picked up in their general reading. A systematic wordlist or glossary, generated from a specialised corpus such as the UST CS Corpus, could encourage students to make a deliberate and regular effort to study words lying in a particular frequency range once they are alerted to the problems that semi-technical items may create. Furthermore, this wordlist/glossary could help to determine the level of difficulty of different CS textbooks, which could then be used at the appropriate stages of the students’ tertiary education. Hence, students’ reading and writing skills could be highlighted and improved through such a resource in which frequency counts are a necessary feature.

My proposed glossary of semi-technical vocabulary was designed with these advantages in mind. It was built around a pen-based device, an alternative input system to the mouse, so that Chinese and other users could enter words in their own language directly onto the screen to initiate a search. A sample of the entry close is suggested below:

[Screen display of the entry. WORD SEARCH: Close 關閉. EXPLANATION: You can close a document without quitting Word. If you changed the document since you last saved it, Word asks if you want to save the document. To activate this command, you can click File on the Menu bar at the top of the screen and click Close. 在 檔案F 功能表 上,按一下 關閉檔案[C]。 FREQUENCY: 115 in 1,001,895 words in the HKUST CS CORPUS. Other screen labels: ABOUT HKUST COMPUTER SCIENCE CORPUS; PREVIOUS; DEMO; FURTHER EXPLANATION; NEXT; WORD STARTUP WINDOW.]

Figure 1: A sample entry from the electronic glossary of semi-technical vocabulary.
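
The fields visible in Figure 1 suggest a simple record structure for each glossary entry. The sketch below is illustrative only: the class `GlossaryEntry`, its field names and the `lookup` helper are assumptions introduced here, not the implementation of the prototype, and the reading of 115 as the corpus frequency of close is inferred from the figure. It shows how an entry might be retrieved by either its English headword or its Chinese gloss, and how a raw count could be normalised to occurrences per million words.

```python
from dataclasses import dataclass


@dataclass
class GlossaryEntry:
    """One glossary entry; all field names are hypothetical."""
    headword: str
    chinese_gloss: str
    explanation: str
    frequency: int    # occurrences in the corpus (assumed reading of Figure 1)
    corpus_size: int  # running words in the corpus

    def per_million(self) -> float:
        """Frequency normalised to occurrences per million running words."""
        return round(self.frequency * 1_000_000 / self.corpus_size, 1)


def lookup(query, entries):
    """Find an entry by English headword (case-insensitive) or Chinese gloss."""
    wanted = query.strip()
    for entry in entries:
        if wanted.lower() == entry.headword.lower() or wanted == entry.chinese_gloss:
            return entry
    return None


close = GlossaryEntry(
    headword="close",
    chinese_gloss="關閉",
    explanation="You can close a document without quitting Word.",
    frequency=115,
    corpus_size=1_001_895,
)

entries = [close]
```

A linear scan is enough for a sketch; a glossary of realistic size would more plausibly index entries by headword and gloss.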


Chapter Four: Methodology of experimentation

Introduction

The present study was designed and implemented using both qualitative and quantitative methods (Young 1953; Denzin 1970), together with the currently popular learner-centred problem-solving approach to the teaching of ESP and the user-perspective approach to pedagogical lexicography (Hartmann 1989, 1998). The use of different methods is aimed at gathering reliable data to investigate whether semi-technical vocabulary adversely affects students’ understanding of CS-related texts, and to assess the extent to which a corpus-based subject-specific glossary of such vocabulary is a more efficient consulting tool for students than a general learner’s dictionary (GLD). The multiple measures of investigation are designed to demonstrate how the prototype glossary can affect subjects’ understanding of CS-related texts, such as those which appear in CS textbooks, computer programs and program manuals, in terms of ease of use and time.

Research techniques and test design

Five data collection methods were used: a decontextualised word-level glossing exercise (DWLGE); a multiple-choice questionnaire survey (MCQS); a list of tasks to be performed in the Microsoft Word 6.0 program (hands-on computer tasks: HOCTs); the thinking-aloud technique (TAT); and the retrospective interview (RI). The DWLGE and the MCQS served as preliminary tools for gauging how well 425 subjects (60 in the pilot study and 365 in the main study) knew semi-technical vocabulary in general English and CS-specific contexts. This was followed by an empirical study which invited 100 subjects (20 in the pilot study and 80 in the main study) to complete 21 HOCTs, which provided the means to investigate further the effect of semi-technical vocabulary on students’ performance in using a new computer program. The TAT and RI were used to collect qualitative data, such as why and how the subjects came to their decisions in the DWLGEs, the MCQSs and the HOCTs.


Decontextualised word-level glossing exercise (DWLGE)

This exercise was chosen as the first means of confirming subjects’ knowledge of selected semi-technical vocabulary items in a general English context, in both the pilot study and the main study. Four DWLGEs were conducted (see Appendices 6 and 7 for samples): DWLGEs 1 and 3 contained 40 words and were the bases for constructing the MCQSs; and DWLGEs 2 and 4 contained 30 words and were used for constructing the HOCTs. In both the pilot and main studies, subjects were asked to gloss and/or explain orally (in the pilot study) or in written form (in the main study) the meanings of the words under investigation, either by a single word or a phrase, in English and Cantonese (for oral feedback) or Chinese (for written feedback), or in both languages. They were also allowed to use examples or diagrams to illustrate the meanings of the words when they had difficulties in expressing them directly. For each word, they were encouraged to provide all possible meanings that they could think of and state the context in which the word was likely to occur. Only items that were accurately explained by all subjects were selected for the MCQSs and the HOCTs.

A multiple-choice questionnaire survey (MCQS)

A vocabulary test, in a multiple-choice format, was chosen as the primary means of investigating whether semi-technical vocabulary adversely affects students’ understanding of CS-related texts. Although multiple-choice questions cannot give data on the logical process through which subjects analyse given vocabulary to deduce correct meaning in a single-word format or word-in-context format, they do allow a greater number of points to be covered in a given time than other types of test question. Such a format also allows the researcher to focus responses and limit them to particular kinds of feedback, thus saving subjects’ time and energy when responding.
The semi-technical vocabulary items used to construct the MCQS for the pilot study were selected from the list of 40 words in DWLGE 1, and those used to construct the MCQS for the main study were selected from DWLGE 3 (see Appendix 7). To facilitate data analysis and interpretation, only words that all subjects gave correct answers for were used to design the MCQSs. Based on this rationale, 21 words from DWLGE 1 (with oral feedback) and DWLGE 3 (with written feedback) were chosen to form the core vocabulary test items for the MCQSs in both the pilot and main studies. The multiple-choice questionnaire comprises questions arranged in three different formats:

Type 1: Finding a synonymous expression, in CS texts, for the independent vocabulary item.

Type 2: Selecting a sentence, in a CS context, that uses the word correctly.

Type 3: Selecting a synonymous expression for the given vocabulary item embedded in a CS context.
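
The item-selection rule stated above, under which only words glossed correctly by every subject go forward to the MCQS, can be sketched as a simple filter. The function name `select_items` and the sample marks are hypothetical and are not data from the study.

```python
def select_items(responses):
    """Keep only words that every subject glossed correctly.

    `responses` maps each word to a list of booleans, one per subject
    (True = correct gloss). Words with no responses are excluded.
    """
    return sorted(
        word for word, marks in responses.items() if marks and all(marks)
    )


# Hypothetical marks for three subjects; not data from the study.
demo = {
    "close": [True, True, True],
    "window": [True, False, True],
    "menu": [True, True, True],
}

if __name__ == "__main__":
    print(select_items(demo))
```

Filtering in this way means that any later error on an MCQS item cannot be attributed to ignorance of the word's general English meaning.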

Four versions of the same questionnaire, with different arrangements of questions and distracters, were trialled in the pilot study. The intention was to determine a questionnaire format that would help to solicit comparatively reliable data. See Appendix 8 for the final versions of the MCQS used in the pilot and the main studies. Fifty subjects handed in the completed multiple-choice questionnaire survey; another ten were asked to engage in the thinking-aloud protocol, in my presence, while they were working on the survey, and were then invited to attend the RIs, in which I clarified and obscurities and lapses of data provided during the thinking-aloud process. Feedback of the test design was collected, and only one format from the four versions of MCQS was selected. A revised MCQS, based on subjects’ feedback, was used in the main study. All 365 subjects returned the questionnaire, 60 of whom used the TAT in my presence and attended the RIs afterwards. All data from the questionnaire survey were in written form; oral data were gathered from the TAT and the RI. Hands-on computer tasks (HOCTs) Another vocabulary test, in the form of HOCTs, was chosen, first to investigate the effect of semi-technical vocabulary items on subjects’ understanding of CS text which will affect their academic performance when the use of computer programs is involved; and second to establish whether a prototype glossary of semi-technical vocabulary would help students’ understanding of CS texts in terms of ease of use and the time for consultation. The HOCTs are a set of instructions in the style of the Running Word 6 for Windows manual (1994), which subjects were asked to format, adding new information to the article ‘Introduction to Microsoft Word 6’ which appears in the Microsoft Word 6.0 word-processing program (see


Appendix 9). These HOCTs were an alternative means of gathering empirical evidence to investigate the effect of semi-technical vocabulary on students’ academic study. The 21 semi-technical vocabulary items used to construct the HOCTs were taken from the 30-word DWLGE 2, for which oral feedback was obtained, and DWLGE 4, for which written feedback was gathered. All 21 items were correctly explained by all subjects who participated.

For the investigation of the effect of semi-technical vocabulary items, twenty subjects participated in the pilot study, ten of whom produced the thinking-aloud protocol in my presence, and then attended the RIs. In the main study, 80 subjects were asked to participate, 40 of whom used the TAT during the experiments and attended the RIs immediately after the tasks were finished.

With respect to establishing the superiority of the prototype glossary over a GLD, subjects were given one GLD of their own choice for consultation in their first attempt at the HOCTs, and the electronic glossary of semi-technical vocabulary (EGSV) for consultation during their second attempt. Twenty subjects participated in the pilot study, ten of whom used the TAT and attended the RIs. In the main study, 80 subjects worked on the HOCTs, 40 of them producing the thinking-aloud protocol and subsequently providing retrospective data.

Thinking-aloud technique (TAT)

This technique was regarded as a supporting research technique in this study. It was used to solicit unquantifiable data, such as feelings and intuitions, that could not be traced through the results gathered from the MCQSs and the HOCTs. The most pressing reason which led me to use the TAT was the need to obtain information on strategies that subjects used to understand semi-technical vocabulary in both general and CS contexts. In the later stage of this research, the TAT appeared to be not only appropriate for the tasks, but also a workable method of allowing the subjects to reveal their successive attempts to understand a word. This also allowed me to observe the process and analyse the material produced in the form of subjects’ verbalisations (see Appendices 10 and 11).


In this study, the first subject of the cohort provided a relatively pure ‘thinking-aloud’ protocol by verbalising his test-taking activity, as he sat alone in a room with the tape-recorder. However, his RI feedback and the recorded data show that it was difficult for a subject to go through the whole process entirely on his own. Thus, later, in both the pilot and main studies, the subjects attempting the tasks worked through the test in my presence. I reminded them where necessary to keep verbalising their thinking as they proceeded with the tests. Prompts such as “What are you thinking about?”, “Keep talking” and “Speak up” were constantly used to remind subjects to continue with their oral reports. At certain points during the sessions, I also used what Cohen et al. (1988) call “immediate retrospection”, to clarify strategies that subjects employed to formulate the answers for their surveys. It was recognised that this more interventionist approach would have had an influence on the way the subjects took the test, but this disadvantage was outweighed by the more substantial data elicited. Before any thinking-aloud protocol was made in either the pilot or main study, subjects were asked to trial the technique. They were required to give simultaneous reports on whatever was in their mind while they were working on the survey questionnaire and computer tasks. The use of this method in studying the process of how subjects determine what a vocabulary item means in a CS context provides a valuable source of data about the sequence of events that occurs while the subjects are performing their cognitive tasks. Thus, in the instructions which I gave to my subjects, I wrote:

Please think in a loud voice, say everything that passes through your mind during your work searching for equivalents ... Do not plan what to say or speak after the thought, but rather let your thoughts speak, as though you were really thinking out loud.

By understanding the procedures thoroughly and following the instructions carefully, all subjects were able to produce concurrent verbal reports. As with all experimental methods, I did not expect the TAT to be completely foolproof, but I had no doubt as to its effectiveness in offering a reliable and adequate account of some aspects of the subjects’ receptive lexical competence, which could not be obtained from written data. The TAT makes it possible to externalise the content of the subjects’ short-term memory, and so to elicit the problem-solving strategies they may have used during the MCQS and the HOCTs.


Retrospective interview (RI)

The RI was also used as an auxiliary research tool in this study. Such interviews are usually conducted when introspective enquiry methods, such as TATs, are used. Major uses of the RI are, for example, to identify strategies for translation (Al-Besbasi 1990; Lam 1991), or to identify strategies for comprehending unknown words in reading (Haastrup 1987), because a retrospective account contains value judgements and opinions that help one to solicit data left out of the thinking-aloud protocol. It also helps the researcher to clarify ambiguity caused by the broken discourse recorded during the thinking-aloud process. This technique is of particular value for researchers in understanding subjects’ behaviour and the reasons for their actions. The data obtained from retrospective accounts are superior to data obtained from recorded audio tapes since they reveal a subject’s thoughts, opinions and attitudes.

The RIs were conducted immediately after the thinking-aloud protocol had taken place in both the MCQSs and the HOCTs. Their value was confirmed during the data analysis, when lapses of information were found in the fragmented discourse recorded during the introspective session. The retrospective account reveals some valuable information, such as why subjects chose to use certain translation equivalents, or what factors affected their understanding of the meanings of semi-technical vocabulary items, independently or in context.

During the interviews, subjects’ opinions of the survey questions were solicited. A list of questions was prepared to serve as the framework of the interviews. In some cases, questions were modified to suit individual subjects’ circumstances. The framework of the interviewing questions is categorised as follows:

1. Questions related to the MCQS:

a. How do you feel about this test? Easy? Difficult? Any other comments?

b. Which part of the test did you find most difficult? Why?

c. Which question did you find most difficult to answer? Why?

2. Questions related to the HOCTs:

a. Have you used any word-processing programs before? If yes, which?

b. Did you find the instructions in the hands-on computer tasks difficult? If yes, which question? Why?


c. Did you find the electronic glossary (EGSV) useful? If yes, how would you describe it? Easier to use? Quicker to find the meanings of the word(s) in question? Are the explanations clear?

d. What feature(s) of the EGSV did you find most helpful? Chinese and English explanations? Frequency? Demo facility? Pen-based device?

e. Did you find it easier to search for word meanings by using the EGSV compared to the paperback or online dictionary? If so, in what way?

f. Did you find it quicker to search for word meanings by using the EGSV compared to the paperback or online dictionary? If so, in what way?

3. Questions related to the dictionary used:

a. Did you use any dictionary to help you answer the questions?

b. If your answer to 3a is “yes”, which dictionary did you find most helpful? In what way?

c. What dictionaries did you usually use when you found a word you did not understand? (Try to relate to your past experience, or refer to what you have just done during the experiment, not to what you may do in the future.)

d. What dictionaries did you usually use when you found a word you did not understand in Computer Science texts? (Try to relate to your past experience, or refer to what you have just done during the experiment, not to what you may do in the future.)

All subjects who undertook the thinking-aloud protocol attended the RIs. Revisions, based on the recommendations collected from the data, were made in the main study.

Selection of subjects

The choice of subjects is generally determined by the nature and purposes of the study and by the practical constraints envisaged. In this study, the selection of subjects was based on two major considerations: first, the practical value of this research to the subjects themselves; second, the availability of the subjects at the moment when the research was conducted.


In terms of practical value, I intentionally involved students whom I considered to be most affected by semi-technical vocabulary in their work, especially in understanding textbooks and computer program manuals. Three groups of students studying in two schools, that is, the School of Engineering and the School of Business and Management, from three departments, the Department of Computer Science (CS), the Department of Computer Engineering (CE) and the Department of Information and Systems Management (ISMT), were thus chosen for the survey. CS and CE students were selected because they are frequently exposed to CS-related literature; ISMT students, although their main study areas are related to business, all need to acquire certain computer skills, especially the use of different computer program packages. Since this investigation aimed at generating results that can be useful and meaningful for course design, it required a large number of subjects: first-year students who had just arrived at the University and who did not have very much knowledge of using Microsoft Word 6.0. A total of 365 students in the Fall semesters in the academic years 1994–5 and 1995–6 were invited to participate in the research. An overview of how subjects were involved in the experiments is given in Tables 5 and 6.

Selection of semi-technical vocabulary

The semi-technical vocabulary items selected to form the core for investigation in the multiple-choice survey questionnaire and the HOCTs were taken from the Running Word 6 for Windows manual, each with a frequency of occurrence of over 100 in the UST CS Corpus. Examples of the selected semi-technical vocabulary items include:

link, as in The up-link and down-link frequency to permit full duplex operation. (UST CS Corpus: NE066-1.ASC)

sequence, as in The rules for digit-sequence above may be reexpressed as *CL*. (UST CS Corpus: PR069-1.ASC)

format, as in Format values displayed as percents, format currency amounts with commas and a dollar sign, format numeric data with a fixed number of places to the right of the decimal point. (UST CS Corpus: MI096-C.ASC)


directory, as in This is, of course, crude, but is better than winding up in an infinite loop following directory cycles. (UST CS Corpus: DB024-C.ASC)

update, as in The Sequential Update Logic, or variations of it, is one of the most frequently encountered logic requirements which the program will find and, therefore, should be thoroughly understood. (UST CS Corpus: PR095-2.ASC)

mode, as in The fopen mode argument determines the way in which the program opens a file. (UST CS Corpus: PR122-2.ASC)

The rationale for using a frequency of occurrence of over 100 in the UST CS Corpus as the yardstick for selecting vocabulary items for this study is that two lexical studies had already been conducted at UST, viz. Tong (1993a) and Li & Pemberton (1994), both of which had selected words below the frequency of 100 occurrences per million from the UST CS Corpus as test items. It was therefore my intention to determine whether words in this frequency range (≥100) in the same specialised corpus have the same negative effects on students’ comprehension of CS-related texts.
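The frequency criterion described above can be illustrated with a short sketch. It is not the procedure actually applied to the UST CS Corpus: the tokeniser, the function names (`per_million`, `select_items`) and the toy text are assumptions made purely for illustration, and the threshold is read here as occurrences per million running words.

```python
import re
from collections import Counter


def tokenise(text):
    # Hypothetical tokeniser: lower-cased alphabetic strings only.
    return re.findall(r"[a-z]+", text.lower())


def per_million(tokens):
    """Return each word's frequency normalised to occurrences per million tokens."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count * 1_000_000 / total for word, count in counts.items()}


def select_items(tokens, candidates, threshold=100.0):
    """Keep the candidate words whose normalised frequency reaches the threshold."""
    freqs = per_million(tokens)
    return sorted(w for w in candidates if freqs.get(w, 0.0) >= threshold)


# Toy text standing in for a corpus (one example sentence, not corpus data).
sample = "The fopen mode argument determines the way in which the program opens a file."
tokens = tokenise(sample)
selected = select_items(tokens, ["mode", "format", "link"])
print(selected)
```

With so small a text, every word that occurs at all exceeds the threshold, so the sketch only separates candidates that are present (`mode`) from those that are absent (`format`, `link`); on a corpus of realistic size the same comparison would separate higher-frequency from lower-frequency items.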


Selection of dictionaries

In order to assess the extent to which a corpus-based subject-specific glossary of semi-technical vocabulary is a more efficient consulting tool for students than a GLD, four learners’ dictionaries generally available in Hong Kong were selected for experimental purposes:

1. Collins COBUILD English language dictionary (COBUILD 1, 1987)

2. Longman dictionary of contemporary English (LDOCE 2, 1987)

3. Oxford advanced learner’s dictionary (OALD 4, 1989)

4. Oxford advanced learner’s English-Chinese dictionary 牛津高階英漢雙解詞典 (OALECD 4, 1994)

OALECD 4 is popular among secondary school students in Hong Kong because it provides Chinese equivalents. COBUILD 1 is not so well known in Hong Kong as the other three. Electronic and online dictionaries were not chosen, for two reasons. First, this study set out to compare the subject-specific specialised glossary with the most popular GLDs on the Hong Kong market. During the experimentation period electronic and online dictionaries had not gained popularity; they were still expensive and not easily accessible. Second, it is generally expected that electronic or online dictionaries could be potentially more advanced consulting tools than paperback dictionaries. It was the aim of this study to gain evidence to establish whether there are any grounds for such a belief.

Selection of the electronic glossary of semi-technical vocabulary

I have designed a glossary which has a different format from traditional paperback dictionaries and has explanations that are more specific and functional to CS-related texts. This glossary, which is in an electronic form, consists of semi-technical vocabulary items taken from both the Microsoft Word 6.0 manual and the UST CS Corpus (with the addition of Chinese equivalents and frequency information). The glossary I created for the experiments consists of 21 semi-technical vocabulary items, forming 21 HOCTs to test subjects’ knowledge of the vocabulary items concerned. I argue that the effectiveness of such a


glossary lies not only in the English and Chinese meanings it provides, but also in other information, such as the frequency information from a subject-specific corpus, the interactive demo feature and its easy-to-use pen-based device.

Selection of venues

Two different types of venue were selected for carrying out the experiments: the general classroom (including the lecture theatres) and the language laboratory. Since no electronic or electrical equipment was required, all the MCQSs were conducted either in a general classroom or a lecture theatre immediately after the close of the lectures which the subjects had attended. For the TAT, the language laboratories were used, where audio-visual equipment, such as tape-recorders, television sets and computers, is available, and where there is a control console from which language instructors can observe and listen quietly to the progress of any individual language practice performance without students being fully aware of when they are being observed. Each language laboratory consists of a console and accommodates 24 students at a time. In practice, the number of subjects involved in each session varied from one to twenty. I usually sat in the console area using the audio equipment to listen to individual subjects verbalising their thoughts. Oral hints or prompts were given via the microphone and the headphones whenever necessary. Subjects’ thinking-aloud protocols and RI conversations were all audio taped.

Selection of time

It was not easy to find an appropriate time to conduct research at UST. First, the students in this University are required to spend more time in attending lectures and completing assignments (not to mention that most of them are also engaged in part-time jobs) than students studying at other Hong Kong universities. Second, on top of the daily routine work, students are encouraged to participate in different research activities, which take up a considerable amount of their free time. Third, the fact that the Language Centre is a service centre and does not have its own degree course affects the availability of subjects. To overcome these problems, I had to explain clearly the aims of the research to those lecturers who had kindly allowed me to enter their classrooms to conduct the investigation with their students. I also had to convince the audiences


that the research results would be used as a prerequisite for producing useful language tools, such as a glossary of semi-technical vocabulary, which would facilitate students’ understanding of CS-related literature. Most of the MCQSs were therefore conducted in the evening when students had finished their daily commitments, under my direct supervision so as to maintain a consistent and reliable survey environment. Any tasks that required the use of audio or computer equipment had to be conducted either in the evenings or on Saturdays and Sundays, to ensure the availability of room facilities. However, during the RIs, subjects observed that the time selected for the study was helpful because they felt less pressurised when they did not need to rush to lectures or to perform other duties in the University.

Experimentation

Pilot study

A small-scale pilot study was carried out prior to the main study. The main aim of this study was to trial the test designs to determine whether there were any possible errors or mismatches that might occur within the research frameworks, noting the subjects’ opinions on: research variables (such as the selection of semi-technical vocabulary test items, the survey designs, venues and times for the experiments, dictionaries provided); and the research tools DWLGE, MCQS, HOCTs, TAT and RI. The research procedures were then improved and adapted for the main study.

Revision of the research design after the pilot study

Since the findings of the pilot study were largely satisfactory, only minor revisions were made, to the DWLGEs, the MCQSs and the HOCTs. The only change made to the DWLGEs in the main study was the form of feedback. In the pilot study, 80 subjects were interviewed and oral feedback from individuals was solicited. Since 445 subjects were involved in the main study, one-to-one interviews were not possible, and written feedback was obtained: 365 subjects were given DWLGE 3, and 60 subjects were given DWLGE 4 to complete before they were asked to fill out the MCQS or attempt the HOCTs.