Some uses of natural language interfaces in computer assisted language learning

Instructional Science 18:45--61 (1989) © Kluwer Academic Publishers, Dordrecht - Printed in the Netherlands


R. D. WARD Department of Computing Science, University of Aberdeen, Aberdeen AB9 2UB, Scotland

Abstract. It has often been proposed that computer programs simulating written conversation could be effective in language teaching and remediation. This paper presents a theoretical rationale for this approach, and reports empirical studies of its potential. Although the studies were concerned mainly with language-impaired children, their findings should have some relevance for the wider field of computer assisted language learning in general.

Several microcomputer programs were developed to hold written dialogue with children about screen graphics. Studies of the software in use over several months by two different groups of language-impaired children produced evidence to suggest that experiences associated with the software led to improved skills in the language covered by the programs. The studies also produced new ideas about the kinds of language learning activities which might be promoted by this kind of software.

The paper concludes with suggestions about how these ideas might form the basis of future intelligent tutoring systems able to prescribe a variety of language learning activities, over a range of language materials.

Introduction

It has often been proposed that computer programs simulating written conversation could be effective in language teaching and remediation. Dialogue programs like Eliza (Weizenbaum, 1966), SHRDLU (Winograd, 1972) and adventure games, which involve the use of language as the currency of interaction rather than its subject matter, have been cited as suitable examples (e.g. Goldenberg, Russell and Carter, 1984).

In practice, most developers of computer assisted language learning software have tended to avoid activities in which learners are required to produce open-ended language. In fact Ahmad, Corbett, Rogers and Sussex (1985, page 59) explicitly advise against this because of problems such as form and meaning, structural variety, ellipsis, inference, world knowledge, humour, and so on, which despite being easily dealt with by human beings, pose tremendous difficulties for computers. Where natural language processing techniques have been used, it has been mainly for handling students' answers in tutorial question and answer dialogue where language remains the subject matter of exercises (e.g. Cerri and Breuker, 1981; Markosian and Ager, 1984).

Little empirical evidence has therefore been produced to support the hypothesis that simulated conversation with a computer is effective in language learning in so far as it can be implemented. Indeed, it is not clear what kinds of learning activities might be based upon this kind of interaction. Eliza, SHRDLU and adventure games would all appear to promote qualitatively different kinds of activity, with differing degrees of problem solving, user initiative, input constraint and domain explicitness. The importance of such factors in computer assisted language learning remains largely unresearched.

This paper reports an empirical investigation of computer simulated conversation in language remediation, and describes several remedial language activities which emerged during the investigation. Although the paper is ostensibly concerned with teaching English to language-impaired children who are having problems with early multiple-word language (a stage of development at which many deaf, dysphasic, autistic and mentally handicapped children run into difficulties), we believe that it also has relevance for the wider field of computer assisted language learning in general.

Human and computer based language remediation

Theoretical justification for using natural-language dialogue programs in computer assisted language remediation can be found in the literature on language acquisition and development. Research conducted during the 1970s towards resolving the conflict between behaviorist and innatist theories led to a now generally accepted cognitive interactionalist account of language development. Evidence suggests that normal language development depends upon the infant experiencing meaningful and purposeful linguistic interaction with others, who use increasingly sophisticated language, pitched at levels just ahead of the infant's developing skills (Bruner, 1983).

Over the years a great many remedial language schemes, including computer based schemes, have drawn their ideas from research into the acquisition and development of language. Thus early computer based learning, in all topic areas, was strongly influenced by programmed instruction, a direct application of behaviorism. Similarly the innatist influence can be directly traced into the structuring of language teaching materials, for example in the computer based ILIAD system (Bates, Beinashowitz, Ingria and Wilson, 1981).

However, remediation drawn from the more recent research has been almost wholly designed for human administration, and not, as yet, expressed in computer based form. One obvious reason for this is the difficulty of implementation. For example, in the 'Living Language' scheme (Locke, 1985), teachers or speech therapists are required to present remedial materials through two-way conversation. Even though the scheme uses a sequenced syntax programme and a developmentally sequenced vocabulary of objects, events, properties and relationships, it is stressed that dialogue must take place within the context of real activities possessing inherent meaning and purpose. Similarly the Cooper, Moodley and Reynell (1978) scheme uses a kit of toys and other objects through which language can be related to symbolic skills, for example the classification of objects by colour, type, shape, use, size, quantity and relative position. Activities are encouraged which make relationships between nouns (e.g. "Put the spoon in the box"), between nouns and verbs (e.g. "Show me the man sitting down") and between nouns and adjectives (e.g. "Show me the longest pencil").

In many ways, these activities resemble the mother-infant games, called formats, which Bruner (1983) believes to be extremely important in normal language development throughout the first few years of life. Formats are games in which the same subset of language is used repetitively, within well-defined domains of meaning, on many successive occasions. For example, in picture-book reading games a mother might utter the four part sequence "Look! What is that? It's a rabbit. Yes!". Gradually, over several months, infants themselves will begin to utter single parts of the sequence, and their mothers will then carry on with the next part as if a spontaneous conversation were taking place. Formats can also change over time in keeping with the infant's developing skills. Bruner believes that formats may be significant across the full range of language functions throughout infancy, early childhood and later.

One to one interactive language teaching and remediation of this kind is very expensive in human resources, but it might be possible to moderate its demands through the use of computer programs which simulate conversation, even in a limited way. The essential requirements would appear to be that the software should promote the expressive and receptive use of a functional range of language as a natural tool for communication, rather than in exercises, whilst retaining the repetition and the structured, restricted domains of remedial schemes and formats.

All of these elements can be found in a wide variety of computer games and educational programs, but hardly ever together. A great many drill exercises involve repetitive structured language at the sub-sentence, sentence and multiple sentence levels, but do not involve language as a communicative tool. Adventure games do involve language as a tool, but tend to restrict the user to terse imperative expression such as "open door" and "take lamp". Also in adventure games, whilst the balance of initiative is usually with the user, most of the language tends to be produced by the program, often in the form of complex, figurative descriptions unsuited to children with language difficulties, but with likely potential in foreign language teaching. Finally, adventure games, in common with Eliza, are situated mainly within a written, abstract context with little concrete meaning.

The approach used in SHRDLU appears to have greater remedial potential. This program was able to converse in language containing many of the structures which language-impaired children find difficult, within a concrete, dynamic, blocks-world domain reminiscent of many remedial language materials. Also, SHRDLU's receptive and expressive language was reasonably symmetrical - in fact its language understanding was more sophisticated than its language generation. It did not, however, possess symmetry of initiative; it was a passive program which did not, for example, ask questions to be answered by the user, but this facility could easily be added.

Working towards the requirements derived from formats above, and drawing loosely from the ideas and materials of remedial language schemes and programs like SHRDLU, we began to develop a set of microcomputer programs devised to allow users to hold written English dialogues with a computer about screen graphics. Developing the software was an iterative process involving consultations with teachers and several small studies of children working with prototype programs. This led to the development of 12 microcomputer based linguistic microworlds, pitched at the level of early multiple-word language, containing language and concepts known to be found difficult by many language-impaired children.

In developing the programs, our aims were realistic: no attempt was made to simulate Brunerian formats in all their real-life richness, and the linguistic abilities of the software were limited, based upon a simple finite-state grammar. The aim was to provide a sufficient level of dialogue to investigate the hypothesis that language-impaired children can benefit from written, format-like dialogue with a computer. Thus the software resembled formats to the extent that the dialogue occurred in well-defined domains of meaning, was repetitive, and contained meaningful, purposeful, yet well-bounded language. Software capable of more sophisticated interaction might then be developed later, should the approach be found viable.

One further caveat is required. Clearly, normal language learning situations, such as formats, take place through spoken, and not written language. However, this turns out to be a far less serious difficulty than it at first appears. Some language-impaired children, especially those with impaired hearing or auditory processing problems, find language quite accessible in its written form. Also, although written language is not usually interactive, written dialogue with a computer is, and therefore assumes some of the qualities of spoken language such as immediacy of feedback. At the same time, written dialogue with a computer may retain a pace of interaction that leaves language open to inspection, allowing time for reflection in a way that speech never can.

Description of the software

The requirements of the software, as defined above, were that it should promote expressive and receptive use of language as a vehicle for meaningful and purposeful communication between computer and user, whilst retaining the elements of structure and repetition. Through trial and error, and through observations of prototypes in use by both handicapped and normal children, several SHRDLU-like microcomputer programs were devised to allow children to exchange limited, written English dialogue with a computer.

[Figure 1 screen contents: a display of coloured shapes, a colour key (yellow, green, blue), and the dialogue below.]

Your turn.
> Take away three yellow diamond
I do not understand.
> Take away three yellow diamonds.
There are too few yellow diamonds.

Figure 1. Screen layout of "Shapes"

The subject of dialogue was screen graphics, and each program presented a different graphics environment. One of the simplest programs was concerned with the relative positions of just two objects, a square and a cross, in which the square could be over, under, to the left of, or to the right of the cross. Another simple program displayed three objects, a triangle, a square and a diamond, which could be coloured red, blue or green independently. These programs had vocabularies of around 10 words or phrases, allowing around 40 "acceptable" sentences to be constructed by the user. Other programs possessed more complex screen environments and greater vocabularies. One of them, which held a dialogue about the numbers, shapes and colours of up to 32 objects displayed on the screen (Figure 1), had a vocabulary of 43 words or phrases, allowing over 3,000 different "acceptable" sentences to be constructed. Programs also varied in cognitive complexity. One of the most difficult programs discussed the relative lengths of six lines, three of which were labelled as "belonging" to the computer, and three as "belonging" to the user. In dialogue with this program, the normal reversal rules for possessive pronouns applied, thus the user's "My lines" were referred to as "Your lines" by the computer, and vice versa.
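The relationship between a small vocabulary and the number of acceptable sentences can be illustrated with a finite-state sketch. The transition table below is hypothetical: it is not the grammar used in the study, and the state names, the choice of ten units, and the helper names `accepts` and `enumerate_sentences` are assumptions made purely for illustration of a square-and-cross domain.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical finite-state grammar over ten overlay units.
# Each state maps an input unit to the next state; "END" means a complete sentence.
OBJECTS = ("the square", "the cross")
RELATIONS = ("over", "under", "to the left of", "to the right of")

TRANSITIONS: Dict[str, Dict[str, str]] = {
    "START": {"put": "I_OBJ", "where": "Q_WHERE", "what": "Q_WHAT",
              **{obj: "S_IS" for obj in OBJECTS}},
    "I_OBJ": {obj: "REL" for obj in OBJECTS},        # put <object> ...
    "S_IS": {"is": "REL"},                           # <object> is ...
    "Q_WHERE": {"is": "LAST_OBJ"},                   # where is <object>
    "Q_WHAT": {"is": "REL"},                         # what is <relation> <object>
    "REL": {rel: "LAST_OBJ" for rel in RELATIONS},
    "LAST_OBJ": {obj: "END" for obj in OBJECTS},
}


def accepts(units: List[str]) -> bool:
    """Return True if the sequence of units traces a path from START to END."""
    state: Optional[str] = "START"
    for unit in units:
        state = TRANSITIONS.get(state, {}).get(unit)
        if state is None:
            return False
    return state == "END"


def enumerate_sentences(state: str = "START",
                        prefix: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """List every unit sequence the table accepts (the table has no loops)."""
    if state == "END":
        return [prefix]
    sentences: List[Tuple[str, ...]] = []
    for unit, next_state in TRANSITIONS[state].items():
        sentences.extend(enumerate_sentences(next_state, prefix + (unit,)))
    return sentences
```

With this particular table the enumeration yields 42 sentences, which is of the same order as the figure quoted for the simplest programs. A sentence such as "put the cross over the cross" passes this purely syntactic check, so a separate semantic check would still be needed.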

Dialogues with the software could include instructions, statements, questions and answers, either as complete sentences or as elliptical answers to questions. Exchanges could be initiated either by the user or the computer. Figure 2 shows examples from four programs.

1) Square and cross

Your turn.
> Put the cross to the left of the square.
(The computer does this)
My turn.
Where is the square?
> To the right of the cross.
Well done.

2) Colours

My turn.
What colour is the triangle? Is the triangle red? Yes.
Well done.
Your turn.
> What is blue?
The square is blue.

3) Shapes

Your turn.
> Give me twenty one shapes.
(The computer does this, filling the first 21 vacant positions with objects of random colour and shape).
My turn.
Do you have four blue shapes?
> No, I have five.
Well done.
(The computer outlines the 5 blue shapes).

4) Lines

My turn.
Is my blue line longer than your green line? No, my green line is longer.
Well done.
Your turn.
What is shorter than my red line? My red line, my blue line and your red line.

Figure 2. Example program dialogues (The user's input follows the ">" prompt)

[Figure 3 overlay: legible key labels are "what", "where", "is", "put", "move", "give", "me", "take away", "show me", "white", "red", "green", "yellow", "blue", "square", "cross", "triangle", "diamond", "star", "Have I finished?" and "ERASER".]

Figure 3. Keyboard overlay for "Hidden Shapes"

Input was constructed word-by-word or phrase-by-phrase, using units of language marked out upon paper overlays to a peripheral, A4-sized, touch-sensitive pad (Figure 3)¹. This conveniently precluded attempts to use vocabulary unknown to the program, but its main purpose was to help focus the user's attention upon language, rather than upon typing or spelling. Figure 4 shows the syntax diagram associated with the vocabulary of Figure 3.

Figure 4. Syntax diagram for "Hidden Shapes". This permits 11,311 different input constructions

The software provided feedback about inputs it was unable to process. For syntactically unacceptable inputs, the software attempted to identify the erroneous parts of the input, and overprint them in reverse field. Procedures were also written to provide short explanatory messages about simple omission, insertion, substitution and transposition errors, for example "PUT ... GREEN: word missing?", "TO OVER: extra word?", but although this worked satisfactorily in tests, and although these categories accounted for a high proportion of the errors in the language of hearing-impaired children (Myklebust, 1965), the 32K microcomputers used did not possess sufficient RAM for these procedures to be included.² Thus in the final programs, syntactic feedback was provided simply by the message "I do not understand", together with the reverse field overprinting (Figure 1).
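The four error categories named above (omission, insertion, substitution and transposition) can be illustrated with a short sketch. This is not the procedure written for the original programs, which is not described in detail; it is one plausible approach that compares the input against a list of acceptable sentences and reports the first single-step difference found. The toy sentence list and the names `one_removed` and `classify` are assumptions.

```python
from typing import List, Sequence, Tuple

Sentence = Tuple[str, ...]

# Toy list of acceptable sentences (hypothetical, for illustration only).
ACCEPTABLE: List[Sentence] = [
    ("put", "the", "green", "box", "over", "the", "red", "box"),
    ("where", "is", "the", "red", "box"),
]


def one_removed(longer: Sentence, shorter: Sentence) -> bool:
    """True if deleting exactly one unit from `longer` yields `shorter`."""
    return len(longer) == len(shorter) + 1 and any(
        longer[:i] + longer[i + 1:] == shorter for i in range(len(longer))
    )


def classify(units: Sequence[str], acceptable: List[Sentence]) -> str:
    """Label an input as accepted, or as one of four single-step error types."""
    attempt = tuple(units)
    if attempt in acceptable:
        return "accepted"
    for target in acceptable:
        if one_removed(target, attempt):
            return "omission"        # a unit of the target is missing
    for target in acceptable:
        if one_removed(attempt, target):
            return "insertion"       # the input carries one extra unit
    for target in acceptable:
        if len(target) != len(attempt):
            continue
        diffs = [i for i in range(len(target)) if target[i] != attempt[i]]
        if (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                and attempt[diffs[0]] == target[diffs[1]]
                and attempt[diffs[1]] == target[diffs[0]]):
            return "transposition"   # two adjacent units are swapped
    for target in acceptable:
        if len(target) == len(attempt):
            if sum(a != t for a, t in zip(attempt, target)) == 1:
                return "substitution"  # exactly one unit differs
    return "unclassified"
```

A label such as "omission" could then be mapped to a short message of the "word missing?" kind. Inputs that differ from every acceptable sentence by more than one step fall through to "unclassified", which corresponds to the plain "I do not understand" response.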

In response to other errors the software provided specific messages. For example, if the user attempted to give a semantically impossible instruction, the message "You can't do that" would be produced, or possibly a more specific message such as "There are too few yellow diamonds" (Figure 1). If the user ignored a question asked by the computer, say by giving an instruction, then this would result in "It is MY turn!". If the user answered a question about "the red box" with a statement about "the blue box", then the computer would respond "I asked about the red box". The environments used in the programs were fairly small, allowing all such eventualities to be programmed in.
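Because the environments were small, responses of this kind reduce to a handful of explicit checks. The sketch below is an illustration under stated assumptions, not the original code: the screen is modelled as a counter of (colour, shape) pairs, the function names `take_away` and `check_reply` and the reply labels "answer" and "instruction" are invented, and plurals are formed naively by adding "s".

```python
from collections import Counter
from typing import Optional


def take_away(screen: Counter, count: int, colour: str, shape: str) -> str:
    """Remove `count` objects if enough are on screen, otherwise explain why not."""
    if screen[(colour, shape)] < count:
        return f"There are too few {colour} {shape}s."
    screen[(colour, shape)] -= count
    return "OK"  # placeholder for carrying out the instruction on screen


def check_reply(asked_about: str, reply_kind: str,
                reply_about: Optional[str]) -> Optional[str]:
    """Return a corrective message for a reply to the computer's question,
    or None when the reply addresses the question that was asked."""
    if reply_kind != "answer":
        return "It is MY turn!"
    if reply_about != asked_about:
        return f"I asked about {asked_about}"
    return None
```

For example, with two yellow diamonds on screen, `take_away(screen, 3, "yellow", "diamond")` returns the "too few" message and leaves the screen unchanged.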

Evaluating the software in use

Two studies of the software in use in schools, over several months, by two different groups of language-impaired children were conducted. The results suggested that experiences associated with the software did lead to improvements in ability to use the language covered by the programs.

The subjects of the first study were six profoundly deaf teenagers, with a mean chronological age of 14 years 2 months, and a mean reading age of 7 years 1 month, attending a special unit for the hearing-impaired in a comprehensive school. Subjects worked with 6 programs over a period of twelve weeks, for about one hour each week. For most of the time subjects worked individually, but in the presence, and with the support, of either a teacher or an experimenter. Subjects did not work through the programs in any imposed or predefined order as the intention was that the software be used by subjects and teachers in school as naturally as possible.

Before and after the 12-week period, subjects took part in a non-computer based referential communication task designed to act as a pretest and posttest of linguistic ability within the domain of language covered by the software. The task was a game which used two sets of six small drawers, each arranged in two rows of three, and two sets of twelve items for placing in the drawers (string, a screwdriver, a large and a small eraser, a large and a small battery, and three crayons and three sweets coloured red, yellow and blue).


The game was played by the experimenter and one subject at a time, with the drawers positioned so that each player could see only one set. One player hid six items, one in each drawer, and the other player then had the task of replicating this hidden arrangement in the second set of drawers using only information obtained by communicating with the first player. Communication between the experimenter and the hearing-impaired subjects was accomplished by pointing, word by word, to a vocabulary card, with the experimenter reading and audio taping the words indicated for later transcription.

There were four conditions to the procedure. In condition 1, the experimenter hid six items, and then issued six "put" instructions to the subject. This condition served mainly to introduce the game to subjects. Condition 2 was similar to condition 1, but with the roles of the experimenter and subject reversed. In condition 3 the subject again hid the items, but this time the experimenter obtained information about their positions by asking "What" questions. Condition 4 was similar to condition 3, but with the roles of hider and questioner reversed. Thus conditions 2 and 4 required subjects to construct instructions and questions in the form of complete sentences, and condition 3 required subjects to answer questions in whatever form they chose.

All subjects completed the tasks of all four conditions successfully. Within each subject's best six sentences of conditions 2 and 4 there were a total of 98 syntax errors in the pretest, and 53 in the posttest. This represented 27 qualitatively different subject-errors in the pretest, and 16 in the posttest. Only one subject, Subject 3, did not improve by these measures (Table 1). Across all subjects, most reduction in syntax errors occurred in the use of prepositions and object noun phrases.

Clearly, 53 syntax errors in 72 sentences in the posttest is still a high error rate. In part, this may have been due to the way in which subjects were asked to compose sentences. Pointing word by word to a vocabulary card probably requires real-time language skills, and is likely to be very difficult for people who are not fluent users of language.

Table 1. Numbers of syntax errors in each subject's best six sentences of conditions 2 and 4

           Total number of errors     Number of different types of error
Subject    Pretest     Posttest       Pretest     Posttest
1          8           0              3           0
2          29          7              6           3
3          20          23             6           7
4          17          9              4           2
5          13          9              2           2
6          11          5              6           3
Totals     98          53             27          16

To give an idea of what the numbers represent in terms of the actual language produced, the four different types of error made by Subject 4 in the pretest were:

- "to the" omitted from "to the left of" or "to the right of", (e.g. "What is right of the large battery"), (5 occurrences).

- "the" omitted from the object noun phrase, (e.g. "Put the yellow crayon left of red crayon"), (4 occurrences).

- "of" added in "under of" or "over of", (e.g. "What is under of the blue sweet"), (6 occurrences).

- completely omitting the object noun phrase, (e.g. "Put the string in of right"), (2 occurrences).

The two types of error in this subject's posttest were 1) the "under of" construction remained, but with only 4 occurrences, so it might have been moving towards extinction, and 2) the omission of "to the" from phrases such as "to the left of" was replaced by the omission of just "the", producing phrases such as "to left of" (5 occurrences), arguably a less severe form of the same error.

Errors of a similar nature, and similar improvements, were observed in subjects' interactions with the software itself. For example, Subject 2, early in the study, made many simple omissions and insertions of "is" and "in" in statements and instructions, and also constructed several highly convoluted statements such as "The red box is where in nothing the triangle". Towards the end of the study the convoluted forms had completely disappeared, and other syntax errors had almost been eliminated. Another error which Subject 2 made early in the study, but which later disappeared, was the reversal of the direction of meaning of prepositions, for example "The red box is in the cross". All subjects continued to give factually incorrect answers to questions asked by the computer throughout the study, apparently by carelessness rather than deliberation. Activities in which fewer careless errors occurred will be described in the next section.

It was mentioned that the pretest-posttest data, in order to allow comparisons to be made, cover only the best six sentences produced by each subject in each condition: the logical minimum requirement to complete the referential communication task. All subjects in the pretest, and three subjects in the posttest, needed more than six communications in at least one of conditions 2, 3, or 4, either because they gave inadequate or contradictory information in their instructions or answers, or because they misunderstood the experimenter's answers. These extra communications may be seen as a measure of subjects' understanding of the function of language. In the pretest, subjects required a total of 28 extra communications over and above the logical minimum of 18, but only 12 in the posttest. All subjects showed improvement, most of which occurred in the use of questions in Condition 4.

Another indicator of functional language use was the type of communications constructed by subjects in condition 3 where they were free to answer the experimenter's "What" questions in any way they chose. Responses included elliptical noun phrases, complete descriptive statements, "put" instructions and deviant forms such as "What is put red the crayon right" (Subject 2, Pretest). In the posttest the deviant forms had disappeared, and subjects' responses had begun to change towards more acceptable conversational norms, for example from "put" instructions to elliptical noun phrases. Thus subjects' answers contained 14 elliptical noun phrases (4 subjects), no statements, 11 "what" repetitions (2 subjects) and 11 "put" instructions (2 subjects) in the pretest, as compared with 28 elliptical noun phrases (6 subjects), 3 statements (1 subject), 8 "put" instructions (1 subject) and no "what" repetitions in the posttest.

Comparable gains were observed with a different group of language-impaired children in the second study. The subjects of the second study were nine 10-13 year olds, with a mean language age of 6 years 8 months, attending a residential school for speech- and language-impaired children. Their language difficulties were diverse, and associated with a variety of conditions, but included, either alone or in combination, problems with receptive language, problems with expressive language, comprehension problems suggesting cognitive deficiency and severe language delay with an environmental component. They differed from the subjects of the first study in that impaired hearing was not their main handicap, although three were hearing-impaired to a degree.

The subjects worked with 8 programs over 15 weeks, again often individually, but with the support of the teacher or experimenter as in the first study. The programs were sequenced in approximate order of difficulty, and subjects did not begin to use a new program until they had become fairly familiar with the previous program. As one of the aims of the second study was to demonstrate that subjects improved in their skills at using the software itself, the pretest-posttest measure adopted was a single program (the "Square and Cross" program) presented to subjects for a timed 20 minutes about two months before the study began, and for a timed 15 minutes in the fifteenth week, but unseen by subjects in between. It had previously been established in a pilot study that when similar subjects used this program for the very first time, the learning curve, in terms of the rate at which inputs were made, increased considerably during the first five-minute time band, but then much more slowly over the next three five-minute time bands. This pattern also occurred in the real pretest. In accordance with observers' impressions, the gains during the first five minutes were interpreted as being largely the result of learning how to operate the program and the equipment, and the later gains as being constrained by subjects' language deficiencies. Subjects' performance during the pretest session, excluding the first five minutes, could therefore be compared with their performance over the whole posttest session.

In the posttest, subjects' overall mean time per input had fallen to 22 seconds, from 35 seconds in the pretest. This decrease occurred across all categories of input, but was greatest in question and answer forms. Also, subjects' inputs were more accurate in the posttest, where 36 out of 265 inputs were not accepted by the computer (13.6%), than in the pretest, where 31 out of 147 inputs (21.1%) were not accepted. Most of the gain in accuracy was due to a decrease in syntactic errors. These data are for 8 subjects. One subject (Subject 9) became increasingly uncooperative after the pretest session, and would not participate in the posttest.
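The percentages quoted for the second study can be checked directly from the reported counts. The short calculation below uses only figures given in the text; the helper name `percent` is an assumption, and the relative fall in mean time per input is a derived value that the paper itself does not state.

```python
def percent(part: float, whole: float) -> float:
    """Express part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Inputs not accepted by the computer, as reported for 8 subjects.
pretest_rejected = percent(31, 147)    # pretest: 31 of 147 inputs
posttest_rejected = percent(36, 265)   # posttest: 36 of 265 inputs

# Derived (not stated in the paper): relative fall in mean time per input,
# from 35 seconds in the pretest to 22 seconds in the posttest.
time_reduction = percent(35 - 22, 35)
```

The two rejection rates come out at 21.1% and 13.6%, matching the reported values, and the mean time per input falls by roughly 37%.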

Exploring the use of the software

A further main aim of the second study was to explore more widely the kinds of language learning activities that might be promoted with the aid of a computer simulated conversant. From our earliest discussions during the software development period, it became obvious that teachers produce a great many imaginative ideas for using particular items of educational software in a variety of learning situations, and that the programs we were developing could be used in ways other than one child working alone exchanging utterances with the computer.

One teacher liked to set children goals, extrinsic to the software, which required the construction of two or three successive inputs for completion, for example to produce a given arrangement of screen objects. These activities extended the scope of interaction with the computer beyond a single exchange of utterances towards some longer-term goal. Later, during the first study, we began to suspect that subjects made fewer errors, and became more involved with programs during these activities, especially when working in groups of two or three together. The second study aimed to develop some of these ideas further with a view to incorporating them more directly into the software at a later stage.

It was found that the "Shapes" program, which conversed about the numbers, colours and shapes of up to 32 objects on the screen, had interesting possibilities for pattern-making activities. For example, with the aid of diagrams, subjects could be asked to produce a border of yellow diamonds surrounding an empty central area, requiring a minimum sequence of six instructions (although most subjects needed more). Some subjects could set their own patterns, or set patterns for others to construct.

"Shapes" was also the only program of interest to Subject 9. Although she was, both linguistically and cognitively, the least able subject, she did learn, after patient demonstration and guidance by one of her teachers, to construct simple "Give me" and "Take away" sentences requiring three key presses (e.g. "Give me five squares", "Take away two triangles"). Difficulties still occurred with singular and plural distinctions and with two-word numbers such as "twenty seven", so a special keyboard overlay was devised with these words removed. Subject 9 was then found to be capable of working alone with the program for ten to fifteen minutes at a time, repeatedly filling the screen with shapes and then removing them all again, with apparent enjoyment. This surprised the class teacher and the subject's speech therapist who had not thought her capable of this, or of working alone at any linguistic task at all. It was thought that her preference for this particular program resulted from her liking for numbers and counting activities.

It was found that another program, which discussed and altered the sizes of three crosses on the screen, could be used as a competitive game for two players in which, taking alternate turns, one player aimed to make the crosses as large as possible, whilst the other aimed to reduce the crosses to their minimum size. The three crosses, coloured red, blue and green, could each assume any one of ten graded sizes. The best strategy for the player attempting to maximise the sizes was to always take the smallest cross and make it larger than the largest cross by means of an instruction such as "Make the red cross larger than the blue cross", unless the largest cross was already of size 10, in which case the smallest cross should be made "... the same size as the ..." largest cross. The best opposing strategy was, of course, the converse. Two subjects, both of language age 6 years 10 months, using the program in this way, were observed to make 63 inputs with only one error each, neither being a syntax error, a much lower error content than in their individual work with the same program. Because, in the competition, making an input error was equivalent to missing a turn, subjects took great care not to do so.
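The maximising player's strategy described above can be written as a small decision rule. The sketch below is illustrative only: the function name `maximiser_move`, the dictionary representation of the three crosses, and the tie-breaking behaviour are assumptions, and the result is returned as a (cross, relation, reference cross) triple instead of a full sentence.

```python
from typing import Dict, Tuple

MAX_SIZE = 10  # each cross could assume any one of ten graded sizes


def maximiser_move(sizes: Dict[str, int]) -> Tuple[str, str, str]:
    """Choose the move for the player trying to make the crosses large.

    Takes the smallest cross and relates it to the largest of the others:
    "larger than" normally, "the same size as" if that cross is already at
    the maximum size. Ties are broken by dictionary order (an assumption).
    """
    smallest = min(sizes, key=sizes.get)
    others = {colour: size for colour, size in sizes.items() if colour != smallest}
    largest = max(others, key=others.get)
    relation = "the same size as" if sizes[largest] == MAX_SIZE else "larger than"
    return smallest, relation, largest
```

For sizes of red 2, blue 7 and green 5, the rule selects the red cross and makes it larger than the blue cross. The opposing player's rule would be the converse, taking the largest cross and reducing it relative to the smallest.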

These were extrinsic goals and activities imposed by teachers from outside the software. Ways were also sought to incorporate intrinsic goals and activities directly into the software. This led to the development of a program called "Hidden Shapes" (Figures 3, 4 and 5). Several versions of the program were implemented, and one, designed for use by language-impaired children, was used successfully with the subjects of the second study. The version described here presented the task in a more difficult form which proved popular with adults as a computer game. It begins to show the kind of language and activity that might be suited to language teaching more generally.

"Hidden Shapes" required the user to reproduce a pattern of up to nine objects (e.g. a blue triangle, a green cross) which had been "hidden" randomly in a 3 x 3 grid. Usually there would be about 5 objects and 4 empty spaces. The hidden grid was initially drawn as being "obscured" by question marks on the left of the screen. The hidden pattern was to be reconstructed in the user's workspace, initially blank, which can be seen on the right in Figure 5. Although this was essentially a non-linguistic task, it could be completed only by using language. The user could gather information about the relative positions of the hidden objects only by asking the computer questions, and could obtain and manipulate the second set of objects in the workspace only by giving the computer instructions which made use of the information obtained.

The program began by stating the identity, but not the position, of one of the hidden objects (e.g. "I have used the blue triangle"). This named object could then be used as a referent in formulating the first question (e.g. "What is above the blue triangle?"). Following this, the user could either ask further questions, or begin to arrange objects in the workspace. Eventually the question "Have I finished?" would either produce a statement of the number of positions in the two grids which still differed, or, if the user had completed the task, "reveal" the hidden objects. Figure 6 shows the dialogue leading up to the position shown in Figure 5.
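The question answering and the "Have I finished?" check can be sketched in a few lines of Python. This is a hedged illustration rather than the program's actual implementation. It assumes, as the answers in Figure 6 suggest, that "above" and "below" compare rows only and "left of" and "right of" compare columns only. The names `what_is` and `positions_differing`, and the example layout (one arrangement consistent with Figure 6, since the real hidden pattern is not given), are hypothetical.

```python
def what_is(relation, target, grid):
    """List the objects standing in `relation` to `target`.

    `grid` maps (row, col) to an object name; row 0 is the top row and
    col 0 the left-hand column. Empty cells are simply absent.
    """
    t_row, t_col = next(pos for pos, name in grid.items() if name == target)
    tests = {
        "above": lambda r, c: r < t_row,
        "below": lambda r, c: r > t_row,
        "left of": lambda r, c: c < t_col,
        "right of": lambda r, c: c > t_col,
    }
    test = tests[relation]
    return sorted(name for (r, c), name in grid.items() if test(r, c))


def positions_differing(hidden, workspace):
    """Count the cells of the 3 x 3 grids whose contents differ."""
    cells = [(r, c) for r in range(3) for c in range(3)]
    return sum(hidden.get(cell) != workspace.get(cell) for cell in cells)


# Hypothetical hidden layout, chosen only to agree with Figure 6.
hidden = {
    (0, 1): "green square", (0, 2): "green cross",
    (1, 0): "yellow square", (1, 1): "blue triangle",
    (2, 0): "yellow diamond", (2, 1): "green triangle",
}

print(what_is("above", "blue triangle", hidden))
print(what_is("left of", "blue triangle", hidden))
print(positions_differing(hidden, {(1, 1): "blue triangle"}))
```

Representing each grid as a mapping from cell to object keeps the finishing check to a single comparison per cell, which matches the program's report of the number of positions still differing.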

[Figure 5: screen image showing the hidden 3 x 3 grid, obscured by question marks, on the left and the user's workspace on the right, together with a key to the yellow, green and blue shadings and the most recent dialogue lines:]

 7> Put the yellow diamond below the blue triangle.
 8> What is to the left of the blue triangle?
    The yellow diamond and the yellow square.
 9> Move the blue triangle to the right.
10>

Figure 5. Screen layout of "Hidden Shapes"

    I have used the blue triangle.
 1> What is above the blue triangle?
    The green square. The green cross.
 2> Give me the green square.
 3> Give me the green cross.
 4> Put the blue triangle below the green square.
 5> What is below the blue triangle?
    The green triangle. The yellow diamond.
 6> Put the green triangle below the blue triangle.
 7> Put the yellow diamond below the blue triangle.
 8> What is to the left of the blue triangle?
    The yellow diamond. The yellow square.
 9> Move the blue triangle to the right.
10>

Figure 6. Dialogue from "Hidden Shapes" prior to the positions shown in Figure 5


At the point shown in Figure 5, an expert at the task would have worked out that once the yellow square has been positioned on the same line as the blue triangle, to its left, then we have all the objects needed, and they are on their correct rows. However, we are still not certain into which columns the objects should go, but we know that the green square must be moved out of the left hand column, that the blue triangle is now in the correct position, and that the green triangle needs to be moved from the left hand column and replaced by the yellow diamond.

Experts could usually complete the "Hidden Shapes" task in under 20 inputs. Novices were found to need as many as 100 in extreme cases. Some gave up in despair. A Lecturer in Education, after struggling for more than an hour to complete the task, stated that it was clearly well beyond the abilities of pre-teenage children. Later the same day, a bright ten year old completed the task without difficulty at the first attempt.

Reactions to "Hidden Shapes" were mainly enthusiastic, and it seems likely that further programs could be devised to intrinsically promote other activities covering a variety of language. Counting activities, and competitive games of the form produced by the "Shapes" and "Sizes" programs, might be better promoted by software designed specifically to do the job, even though the existing programs did produce high motivation and involvement. One program that was implemented, called "Hidden Sizes", presented a similar kind of task to "Hidden Shapes", except that it required the user to match the heights and widths of two sets of objects, using language such as "What is shorter than my green cross?" and "Make my blue rectangle as wide as my yellow triangle".

Future possibilities

Although the studies suggest that language learning activities based upon simulated conversation with a computer could be effective in language learning and remediation, it must be said that this approach will probably only gain wide acceptance when the available software covers a more extensive range of language materials and activities than in the programs implemented. As no single item of language learning software could ever hope to cover a full language syllabus, any computer based scheme for language remediation would need to consist of a large number of separate but related programs, with a range of organised materials and activities, just as can be found in non computer based schemes. This would appear to be true for other areas of language learning too.

For language remediation, requisite computer based materials and activities can be readily derived from the literature which describes in detail the kinds of language which language-impaired children find difficult to master. Lexical-semantic and syntactic difficulties include the use of negation, conditional questions, "some" and "all" quantifiers and temporal and kinship relations which appear so often in language assessment instruments. The more powerful graphics capabilities of newer microcomputers, able to represent more lifelike objects than the abstract geometric figures described above, might promote discussions about items such as furniture or fruit, about actions such as moving, looking or eating, and about actors such as people or animals. Some programs like this could be implemented now without any major computational differences from the software developed in this project.

At times, and especially in foreign language learning applications, there would be a need for more sophisticated natural language processing than that used here. This could still be computationally tractable within the now widely known body of natural-language processing techniques (e.g. see Noble, 1988), evidenced for example in SHRDLU's ability to accept new concepts such as "steeple", to refer to previous dialogue and to consider hypothetical situations. More recent dialogue systems research has extended the possibilities by moving on from simple question answering to the maintenance of cohesive discourse, for example in simulating the behaviour of a hotel manager whose goal is to fill the hotel with guests (e.g. see Kobsa and Wahlster, 1988).

Remedial conversations with programs might therefore be developed to cover differences between "to be" and "to have", various subordinate clauses, pronouns, possessive inflections, tense inflections, adverbs, auxiliaries and conjunctions. Again, every one of these structures is known to cause difficulties for language-impaired children. It should also be possible to devise conversational activities which involve functions of language other than just the declarative, imperative and interrogative functions implemented here. Perhaps if, rather than simulating dialogue between the user and the computer, the dialogue took place between the user and figures depicted on the screen, then interactional and personal functions of language might be promoted, such as greetings and the expression of feelings. Perhaps these could be placed in an everyday context by simulating situations such as shopping.

A scheme consisting of many items of inter-related software would be administratively demanding. These demands might be met by software which recorded each student's progress through the scheme, and recommended further computer based activities appropriate to each student's needs. Such a system would, in effect, be an intelligent teaching system within the kind of framework proposed by Hartley and Sleeman (1973), with components representing the domain, the student, alternative teaching operations, and the selection of teaching operations for students. The many teaching programs within the scheme would comprise the alternative teaching operations. Developing the other components, and identifying the kinds of information that would need to pass between them and the teaching programs, would be a longer term research goal. The ultimate goal would be to develop a system able to deliver a variety of learning activities, using a range of language materials, with facilities to incorporate new materials and activities in a flexible way.
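To make the idea of recording progress and recommending activities more concrete, the following Python sketch shows one very simple way the student record and the selection component might interact. It is speculative and not a description of any implemented system: the class names, the structure labels and the selection rule (recommend the program covering the structure with the highest recorded error rate) are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TeachingProgram:
    """One alternative teaching operation (hypothetical representation)."""
    name: str
    structures: frozenset  # language structures the program exercises


@dataclass
class StudentRecord:
    """Student model: per-structure (errors, inputs) counts."""
    counts: dict = field(default_factory=dict)

    def log(self, structure, error):
        errors, inputs = self.counts.get(structure, (0, 0))
        self.counts[structure] = (errors + int(error), inputs + 1)

    def error_rate(self, structure):
        errors, inputs = self.counts.get(structure, (0, 0))
        # Assumption: a structure never attempted is treated as needing practice.
        return errors / inputs if inputs else 1.0


def recommend(record, programs):
    """Selection rule: choose the program whose weakest structure is weakest."""
    def need(program):
        return max(record.error_rate(s) for s in program.structures)
    return max(programs, key=need)


programs = [
    TeachingProgram("Hidden Shapes", frozenset({"spatial prepositions"})),
    TeachingProgram("Hidden Sizes", frozenset({"comparatives"})),
]
record = StudentRecord()
for i in range(10):
    record.log("comparatives", error=(i < 1))          # 1 error in 10 inputs
    record.log("spatial prepositions", error=(i < 4))  # 4 errors in 10 inputs

print(recommend(record, programs).name)
```

A fuller system would also need the domain component and the information flow between components, which the paper identifies as a longer term research goal.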


The simulated conversations described here represent just one possible kind of learning activity. They successfully stimulated children's motivation. This is very important in learning - students using computer assisted language learning must want to answer a question and must want to know an answer.

Acknowledgements

The author would like to thank his Ph.D. supervisors, Dr. Andrew Rostron and Dr. David Sewell of the Department of Psychology, University of Hull, U.K.; several teachers, including Richard Cubic and the late Dennis Sewell for correspondence, discussions and access to their classrooms; and the many children who allowed him to observe their interactions with the software described in this paper.

Parts of this research were supported by the Leverhulme Trust, the Joseph Rowntree Memorial Trust, the Department of Education and Science and Trent Polytechnic, Nottingham.

Notes

1. It would be nice to have a flat, touch-sensitive keypad with a programmable display. This would facilitate switching between programs and vocabularies, and could be used for providing help to the user by selectively highlighting input units as and when appropriate.

2. In order to be able to collect data in classroom settings, it was important that the software could run on microcomputers in British schools, and we were therefore constrained by the hardware available in 1983 when the project began.

References

Ahmad, K., Corbett, G., Rogers, M. and Sussex, R. (1985). Computers, language learning and language teaching. Cambridge: Cambridge University Press.

Bates, M., Beinashowitz, J., Ingria, R. and Wilson, K. (1981). Generative tutorial systems. Paper presented to the annual meeting of the Association for the Development of Computer-Based Instructional Systems (ADCIS), Atlanta, Georgia, March 3-5.

Bruner, J. S. (1983). Child's talk: learning to use language. Oxford: Oxford University Press.

Cerri, S. and Breuker, J. (1981). A rather intelligent language teacher. Studies in Language Learning, 3, 182-192.

Cooper, J., Moodley, M. and Reynell, J. (1978). Helping language development: a developmental programme for children with early language handicap. London: Edward Arnold.

Goldenberg, E. P., Russell, S. J. and Carter, L. (1984). Computers, education and special needs. Reading, MA: Addison-Wesley.

Hartley, J. R. and Sleeman, D. H. (1973). Towards intelligent teaching systems. International Journal of Man-Machine Studies, 5, 215-236.

Kobsa, A. and Wahlster, W. (1988). User models in dialogue systems. New York: Springer-Verlag.

Locke, A. (1985). Living language. Windsor: NFER-Nelson.

Markosian, L. Z. and Ager, T. A. (1984). Applications of parsing theory to computer-assisted instruction. In D. H. Wyatt (Ed.), Computer-assisted language instruction. Oxford: Pergamon Press.

Myklebust, H. (1965). The psychology of deafness (2nd edition). New York: Grune and Stratton.

Noble, H. M. (1988). Natural language processing. Oxford: Blackwell Scientific Publications.

Ward, R. D. (1987). Natural language, computer-assisted learning and language-impaired children. Unpublished Ph.D. Thesis, Department of Psychology, University of Hull.

Weizenbaum, J. (1966). ELIZA - A computer program for the study of natural language communication between man and machine. Communications of the ACM, 9, 36-45.

Winograd, T. (1972). Understanding natural language. New York: Academic Press.
