
English letter-combination frequencies can be used to generate random text that mimics the frequencies found in a sample. Though nonsensical, these pseudo-texts have a haunting plausibility, preserving as they do many recognizable mannerisms of the texts from which they are derived. For example, the following text was generated by the first sentences of this article:

English letter-combination frequencies from text was generived. For example. Though nonsentencies from text was the text was generated to generisms of that mimics the first sentencies from text the texts have a have a sample, they article:

The nature of such texts has been little explored, in part because it's been difficult to get samples. Claude Shannon generated "approximations to English" by hand in 1948, but the laborious calculation it involved prevented extensive study. This is clearly a task for a computer, but programs have been hampered by the need for impractical amounts of memory.

We offer a Pascal program, Travesty, to fabricate pseudo-text quickly from any input text. Students of style and linguistics will see possibilities. So may programmers, since Travesty contains a feature that can greatly speed up general pattern-matching procedures. We add a special-case version that is even speedier. To make clear what Travesty does, we'll first discuss language statistics and what they imply.

Hugh Kenner and Joseph O'Rourke teach English and computer science, respectively, at The Johns Hopkins University, Baltimore, MD 21218.

LANGUAGE STATISTICS

Finish typing a page of English prose, and the key you hit most often will have been the space bar. Either "e" or "t" will rank second. You did not make those decisions, the language did. In fact, the language makes three-quarters of your writing decisions for you.

Not only do the letters observe preferred frequencies, they keep preferred company. A familiar example: write "Q", and (unless you are drafting a QANTAS ad or some comments on Iraq) the next character is almost sure to be "u". If probability coerces the successor to a single letter, what follows a letter pair is even more tightly bound. Write "th", and the probability is very high that what follows will be "e". If it is, then the character after "e" is most likely to be either a space or an "r".

Pairs like "th" are called digrams; triplets like "the" are trigrams. They have frequencies, like letters. The most common English digram is "he"; you will find it three times in the sentence you are reading now, 15 times in this paragraph. And you will guess correctly that as we move up from single letters to digrams and trigrams, the probabilities that govern the next character grow ever more rigorous. By the time we've reached, say, pentagrams, has the author any choice at all?
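The letter, digram, and trigram frequencies discussed above are easy to tabulate. What follows is a minimal sketch in Python rather than the article's Pascal; the function name ngram_counts and the sample sentence are illustrative assumptions, not part of the Travesty listing.

```python
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count every run of n consecutive characters, spaces included."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# Hypothetical sample text, chosen only because it is rich in "th" and "he".
sample = "the theory that the other author wrote there"

letters = ngram_counts(sample, 1)
digrams = ngram_counts(sample, 2)
trigrams = ngram_counts(sample, 3)

print(letters.most_common(3))
print(digrams.most_common(3))
print(trigrams.most_common(3))
```

Run on a longer text, the same counts show how sharply the choice of next character narrows as n grows.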

Yes, he has; otherwise Henry James could have had no way to be Henry James, or James Joyce to be James Joyce. At a fairly low level, the statistics of English would have taken over from both of them, and neither would have been distinguishable from The New York Times.

But that is not what happens. True, even with a James or a Joyce holding the pen, the statistics do not lie dormant. However, they no longer derive from the undifferentiated language, i.e., from a large sample of everything we can find. The significant statistics derive from the personal habits of James, or Joyce, or Jack London, or J. D. Salinger. Each of these writers, amazingly, had his own way with trigrams, tetragrams, pentagrams, matters to which he surely gave no thought.




This line of reasoning brings us to the unexpected fact that essentially random nonsense can preserve many "personal" characteristics of a source text. Travesty (listing 1), a program suitable for small systems, will scan a sample text and generate, from the sample's n-gram statistics, a "nonsense" imitation through which the original text, and even its authorship, is disconcertingly recognizable.

For example, we provided Travesty with 29 names of towns taken from a gazetteer of England and called for third-order (trigram) analysis. It promptly churned out a couple thousand characters. These letter groups included (1) many input words regurgitated; (2) some uninteresting letter strings that we agreed to call "garbage" (on the principle that a weed is a flower you don't want); and (3) some wondrously plausible names for English towns that don't exist but ought to. They included Bambudge, Nettlewett, Gidge, Hample, Bognorton, Chire, Clop, Tootinton, Bleweth, and Eastle. (If any of these is a real name, that's by accident; none was on our input list.) And fancy being Mayor of Clop!

The connection of the output to the source can be stated exactly: for an order-n scan, every n-character sequence in the output occurs somewhere in the input, and at about the same frequency. That is all, yet it is enough to account for an eerie similarity. Every string of three letters in our pseudo-place-names, "ttl" or "dge", for instance, was lifted out of a string of characters and spaces that consisted simply of the 29 input words typed one after another with one space after each.

Figure 1 shows one of the thousands of machine-generated derivations Travesty can extract from a 75-word sample of James Joyce's Ulysses. This passage is an order-4 scan; every four-character sequence in the output comes from somewhere in the input.

FREQUENCY ARRAYS

There is a lot of fun to be had here. There is also much for the student of language and literature to investigate. To what degree can personal "style" be described as a manifestation of letter frequencies? Such a question, though not new, was merely tantalizing before the modern computer; even more so before procedures were discovered, quite recently, that didn't demand impossible amounts of machine memory.

REMARKS ON THE TRAVESTY LISTING

Pascal input/output (I/O) conventions are, to say the least, poorly standardized. We have three Pascal systems available (Turbo Pascal for CP/M and MS-DOS, Lucidata Pascal for CP/M and HDOS, and Berkeley Pascal running under UNIX), and we haven't been able to write a version of Travesty that will run on all three unmodified. Judging that Turbo is the rising young comer, we list the Turbo Pascal version, with notes on such problem areas as we know about. This version might run on UCSD Pascal too, but we've not been able to try it. Since it avoids features unique to Turbo and UCSD, it ought to be transportable to any decent Pascal system at the cost of a little attention to input and output.

Line numbers are, of course, for reference only: don't type them into your Pascal listing.

23 This value is safe and may even be increased, but remember that you'll have two arrays this size. How big you can make ArraySize depends on your system's memory requirements. Turbo Pascal, when compiled to disk to get the compiler itself out of the way, permitted ArraySize = 14,000 on a 64K-byte CP/M system. That's about 2300 words of input text. On an MS-DOS system with 196K bytes, maximum ArraySize increased to 21,000, or 3500 words of text, independent of whether compilation was to memory or to disk.

33 If your Pascal doesn't know about the TEXT type, change this line to f : file of char.

40 If your Pascal system has a RANDOM function, you can drop lines 40 to 44 altogether. Then change line 239 to read toss := random(total) + 1;. You should also delete lines 38, 52, and 53.

49 Many versions of Pascal don't recognize STRING types unless they have been declared: TYPE STRING = PACKED ARRAY[1 .. 12] OF CHAR; Then change line 49 to InFile : STRING.

62 Some Pascals will require you to declare a variable i and say, FOR i := 1 TO 12 DO READ (InFile[i]);.

63 Berkeley Pascal doesn't use the ASSIGN command. You'd omit this line and change line 64 to reset (f, infile);.

Also, you will probably want output to a disk file, and you'll have to set that up yourself. Add a second TEXT variable, g, to line 33 and a second STRING variable, OutFile, to line 49. Then insert after line 64 a request for the name of the OutFile, and ASSIGN it to g in whatever way your system provides. And if your system requires files to be explicitly closed, add a statement line, CLOSE (g), just before the final END. (Don't forget the semicolon at the end of the line above it.)

NOTES ON HELLBAT

To change Travesty into Hellbat, procedures InitSkip and Match are replaced by the versions given in listing 2, and numerous lines are deleted as shown below. Note that WriteCharacter now receives its characters from Match and has only formatting duties to perform. If your Pascal has its own RANDOM function, make the deletions listed in the section on Travesty for line 40; the major change, applied above to the WriteCharacter procedure, should instead be made to the line in the new Match procedure that invokes Random. Lines to delete for Hellbat include 28, 72 to 80, 269, 273 (all references to FreqArray), and 232 to 245 (process for getting a character).

Listing 1: Travesty, a program for generating pseudo-text. The program will scan a sample text and generate a "nonsense" imitation. For an order-n scan, every n-character sequence in the output occurs somewhere in the input.

  3 PROGRAM travesty (input, output); {Kenner / O'Rourke, 5/9/84}
    (* This is based on Brian Hayes's article in Scientific American,
       November 1983. It scans a text and generates an n-order
       simulation of its letter combinations. For order n, the relation
       of output to input is exactly: "Any pattern n characters long in
       the output has occurred somewhere in the input, and at about the
       same frequency."
       Input should be ready on disk. Program asks how many characters
       of output you want. It next asks for the "Order", i.e. how long
       a string of characters will be cloned to output when found. You
       are asked for the name of the input file, and offered a "Verse"
       option. If you select this, and if the input has a "|" character
       at the end of each line, words that end lines in the original
       will terminate output lines. Otherwise, output lines will
       average 50 characters in length. *)
 22 CONST
 23   ArraySize = 3000; {maximum number of text chars}
 24   MaxPat = 9;       {maximum Pattern length}

    PROCEDURE FirstPattern;
    (* User selects "order" of operation, an integer, n, in the range
       1 .. 9. The input text will henceforth be scanned in n-sized
       chunks. The first n-1 characters of the input file are placed in
       the "Pattern" Array. The Pattern is written at the head of
       output. *)
    VAR
      j : INTEGER;
    BEGIN
      FOR j := 1 TO PatLength DO        {Put opening chars into Pattern}
        Pattern[j] := BigArray[j];
      CharCount := PatLength;
      NearEnd := false;
      IF Verse THEN WRITE ('     ');    {Align first line}
      FOR j := 1 TO PatLength DO
        WRITE (Pattern[j])
    END; {Procedure FirstPattern}

168 PROCEDURE InitSkip;
169 (* The i-th entry of SkipArray contains the smallest index    *)
170 (* j > i such that BigArray[j] = BigArray[i]. Thus SkipArray  *)
171 (* links together all identical characters in BigArray.       *)
172 (* StartSkip contains the index of the first occurrence of    *)
173 (* each character. These two arrays are used to skip the      *)
174 (* matching routine through the text, stopping only at        *)
175 (* locations whose character matches the first character      *)
176 (* in Pattern.                                                *)
177 VAR
178   ch : CHAR;
179   j : INTEGER;
180 BEGIN
181   FOR ch := ' ' TO '|' DO
182     StartSkip[ch] := TotalChars + 1;
183   FOR j := TotalChars DOWNTO 1 DO
184   BEGIN
185     ch := BigArray[j];
186     SkipArray[j] := StartSkip[ch];
187     StartSkip[ch] := j
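The backward pass in InitSkip can be sketched as follows. This is a hedged illustration in Python, not a transcription of the Pascal: indexes are 0-based, a dictionary stands in for the character-indexed StartSkip array, and the names init_skip and positions are assumptions introduced here.

```python
def init_skip(text: str):
    """Build (start_skip, skip) for text, in the spirit of InitSkip.

    skip[j] is the smallest index k > j with text[k] == text[j], or
    len(text) when no later occurrence exists. start_skip[ch] is the
    first index at which ch occurs.
    """
    n = len(text)
    start_skip = {}
    skip = [n] * n
    for j in range(n - 1, -1, -1):       # walk backward, as the listing does
        ch = text[j]
        skip[j] = start_skip.get(ch, n)  # next occurrence seen so far
        start_skip[ch] = j               # j is now the earliest occurrence
    return start_skip, skip


def positions(text: str, ch: str):
    """Follow one character's chain through skip, visiting only its positions."""
    start_skip, skip = init_skip(text)
    found = []
    j = start_skip.get(ch, len(text))
    while j < len(text):
        found.append(j)
        j = skip[j]
    return found


print(positions("a rare red berry", "r"))
```

Because the text is walked from the end toward the start, each position learns its successor in a single pass, and both arrays are complete when the loop reaches the first character.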


251   IF ((NearEnd) AND (NewChar = ' ')) THEN
252   BEGIN {If NearEnd}
253     WRITELN;
254     IF Verse THEN WRITE ('     ');
255     NearEnd := false
256   END {If NearEnd}
257 END; {Procedure WriteCharacter}
258
259 PROCEDURE NewPattern;
260 (* This removes the first character of the Pattern and   *)
261 (* appends the character just printed. FreqArray is      *)
262 (* zeroed in preparation for a new scan.                 *)
263 VAR
264   j : INTEGER;
265 BEGIN
266   FOR j := 1 TO PatLength - 1 DO
267     Pattern[j] := Pattern[j+1];      {Move all chars leftward}
268   Pattern[PatLength] := NewChar;     {Append NewChar}
269   ClearFreq
270 END; {Procedure NewPattern}
271
272 BEGIN {Main Program}
273   ClearFreq;
274   NullArrays;
275   InParams;
276   FillArray;
277   FirstPattern;
278   InitSkip;
279   REPEAT
280     Match;
281     WriteCharacter;
282     NewPattern
283   UNTIL CharCount >= OutChars;
284 END. {Main Program}
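The main loop above (Match, WriteCharacter, NewPattern) can be condensed into a short sketch. It is written in Python rather than Pascal, uses a plain full scan instead of SkipArray, omits the line-formatting and Verse logic, and assumes the source is longer than the pattern. The names close_loop and travesty, and the dictionary standing in for FreqArray, are illustrative assumptions.

```python
import random


def close_loop(source: str, pat_len: int) -> str:
    """Collapse whitespace runs, end with one space, then append the
    opening pat_len characters so the text behaves as a closed loop."""
    text = " ".join(source.split()) + " "
    return text + text[:pat_len]


def travesty(source: str, order: int, out_chars: int, seed: int = 0) -> str:
    rng = random.Random(seed)
    pat_len = order - 1                   # pattern holds n-1 characters
    text = close_loop(source, pat_len)
    total = len(text) - pat_len           # length before the wrapped tail
    pattern = text[:pat_len]
    out = list(pattern)                   # the first pattern starts the output
    while len(out) < out_chars:
        freq = {}                         # successor counts for this scan
        for i in range(total):
            if text.startswith(pattern, i):
                nxt = text[i + pat_len]
                freq[nxt] = freq.get(nxt, 0) + 1
        # Weighted choice by repeated subtraction from a random toss.
        toss = rng.randint(1, sum(freq.values()))
        for ch in sorted(freq):
            toss -= freq[ch]
            if toss <= 0:
                break
        out.append(ch)
        pattern = (pattern + ch)[1:]      # drop first char, append new one
    return "".join(out)


print(travesty("ALL IN ALL THE CHANCES MAY WELL BE ENHANCED.", 3, 60))
```

Since every pattern-plus-successor was read directly out of the looped text, each order-length window of the output is guaranteed to occur in the input.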

Brian P. Hayes, associate editor of Scientific American, explained in the November 1983 issue of that publication ("Computer Recreations," pages 18-28) how the obvious approach to an order-n scan used n-dimensional arrays. Let Array1 count all occurrences of all characters. Let Array2 keep track of the character that follows each character. Let Array3 keep track of the character that follows each pair. Now generate a random printout that reflects the probabilities recorded in the arrays. The result of this order-3 scan will be what we saw in the place-names, a scrambled impression that preserves, to a surprising degree, many idiosyncrasies of the input text.

A higher order would be even more interesting, especially if the input sample were long. But until quite recently order-4 was an exceptional achievement and no one had ever seen an order-5. Getting even as far as order-3 (trigrams) gobbles up memory if you use arrays of arrays. For example, even if you restrict the character set to a mere 28 (the uppercase letters, a space, and an apostrophe) you have to store 21,952 characters to create three arrays. Most of the character combinations would actually be blanks, because most trigram possibilities don't occur: think of "cnx". A fourth array would entail over half a million places of storage. A fifth is nearly unthinkable (and almost empty).

Sparse arrays are generally a sign that a new approach is needed. It was Hayes himself who took the next step, which was to discard multiplied arrays altogether. He perceived that you could get the same result by scanning the text for patterns and simply recording the character that follows each pattern.

Consider this brief text: ALL IN ALL THE CHANCES MAY WELL BE ENHANCED. To run a trigram scan, take any two letters and see what comes next. "AL" is followed only by "L"; "LL" is followed only by a space, which we'll represent as "_": so the pair "AL" can only lead to the string "ALL_". But the next pair, "L_", may be followed by "I", "T", or "B". We list these characters as we come to them, then choose one from the list at random. Let's say we choose "B" so that we have "ALL_B". The next pattern is "_B", and the only thing that can follow is "E". Continue in this way, and by the time you've convinced yourself that one possible result is ALL BE ENHANCES MAY WELL THE ENHANCED, you'll have understood the method. You'll also see how it produced a word, ENHANCES, that was not in the input.

One further principle isn't illustrated by an example this short. A long text may yield a fairly long list of characters from which to make the random choice, and many of these will have turned up over and over. In such a case, frequent appearance on the list should improve a character's chance of being chosen. What we want our output to reflect is n-gram frequency, not just n-gram presence. Hayes's method provides for this too.

Gleaming harnesses, petticoats on slim ass rain. Had to adore. Gleaming silks, spicy fruits, pettled. Perfume all. Had to go back. Had to back. Had to back. His braces all him ass to adore. Gleaming harness rain yielded. A warm silver, rays of the mutely craved Born on slim ass rays of the woman plumpnesses. Uselesh obscurely, he mutely craved down on him assailed. A warm hungered down on him assailed to adore, Gleaming silks, silver, rich from Jaffa. A warm hungered down on slim braces all. Hig

Figure 1: An order-4 scan taken from a 75-word sample of James Joyce's Ulysses.

His method has several advantages. It is applicable to programs that fit a small computer. It makes feasible order-4, order-5, order-6 scans; in fact, scans of any order. Nor need it conserve memory by restricting the character set; it can accommodate uppercase and lowercase letters, numerals, and all punctuation. But it does have one disadvantage. Since with his method you have to scan the entire input text to generate each character, the process can be very slow. We'll describe a partial remedy for that.

IMPLEMENTATION

In our first attempts to implement the Hayes algorithm, we left the source text on disk to be read through over and over. Though it's tempting to let source length be limited only by disk capacity, the amount of disk reading can be altogether unreasonable: one full read per output character. Not only is the process slow but, if you try to run the program to get a meaningful amount of output from a long text, it might consume an appreciable fraction of a disk drive's service life. We settled for a limited input sample, to be read from the disk just once and stored in an array.

In the program in listing 1, this is BigArray, indexed by a range of integers from 1 to ArraySize. As the source text is read in, linefeeds, carriage returns, and tabs are stripped out, and runs of blanks are condensed to a single blank. The cleaned-up result is stored, character by character, in BigArray. Next, for an order-n scan, the first n-1 characters of the source are put into a small array called Pattern. They are also printed, to get the output started. All this may take a second or a few.

The rest is simple in principle. A Match procedure runs through BigArray, checking for matches with the contents of Pattern. Each time a match is found, the next character is stored in a way we'll explain in a moment. At the end of the scan, one of the stored characters is randomly chosen for printing, the more frequent ones being the more likely choices. We now make a new Pattern, by dropping the old Pattern's first character and appending the character just printed. And we keep this up until we have generated as many characters of output as we want.

To store the characters from which to choose at random, we used a method suggested by Hayes. FreqArray is an array of integers, indexed by 93 ASCII (American Standard Code for Information Interchange) characters, from " " (space, ASCII 32) to "|" (ASCII 124). (We might have stopped with "z" [ASCII 122], but we had a special use for "|", as will appear.) Before each scan, all of FreqArray's elements are zeroed. After each match, the element indexed by the found character is incremented. Thus, after four "e"s have been found, FreqArray['e'] contains four. At the end of the scan, the contents of all FreqArray elements are totaled, and a random number in the range "1 ... Total" is generated. The contents of the individual FreqArray elements (most of them zero) are then subtracted, one by one, from this number; the subtraction that drops the remainder to zero or less chooses the character that will be printed. That way, characters that index some fairly large FreqArray number stand a better chance of being chosen.

You decide how much output you want by setting the variable MaxChars when the starting menu prompts you. You also set the Order, which determines the pattern length. As the Order increases, so does the likelihood of every pattern being unique, in which case the output would be simply the unaltered input. We set the constant MaxPat to 9, but that's an option.

The output is formatted by counting the characters printed (TotalChars) and watching for a chance to break the line whenever this total passes a multiple of 50. The next space after that will trigger a new line. To make verse look more like verse, we type a bar ("|") at the end of each line of input text. No bar is ever printed, but if the user has selected the Verse option from the starting menu, a new line will commence whenever a bar is encountered. On top of that, any line breaks created by the "TotalChars MOD 50" test will be followed by a 5-character indent. Figure 2 shows what the result can look like. The input text was the whole of T. S. Eliot's The Hollow Men.

We are not appear Sight kingdom These do not appear: This very long Between the wind behaving alone And the Kingdom Remember us, if at all, not appear Sightless, unless In a fading as lost kingdom Remember us, if at all, not me also wear And voices are thine images Are raised, here, is that five o'clock in dreams In the Kingdom. This is the world ends There, the stuffed meaning places Are quiet and more solent souls, but on a bang but a whimper. Let me also wear Prickly pear:

Figure 2: A pseudo-text version, in verse form, of T. S. Eliot's The Hollow Men.

The smooth running of this system depends on the fact that a match will be found, and a new character returned, somewhere on every scan. There is one potential bug: the new character isn't valid when the matching process has stepped right off the end of BigArray. Suppose the last four characters of the input text are "it._". Now suppose an order-4 scan has come clear to the end, seeking the pattern "t._"; though it finds a match, the character after the match is undefined. If that character happens to get selected for printing, something unauthorized will turn up in the output. Worse, something undetermined will go into the next pattern. Whether the Match procedure can find the new pattern in the input text is now doubtful. If it can't, the program will lock into an endless loop.

Our solution was to wrap the end of BigArray back to the beginning, by appending a space (if there wasn't one) plus as many of the opening characters as there are in the search pattern. Thus the input text is, in effect, a closed loop and has no end that the matching routine can step off.

A FASTER VERSION

As first set up, with a simple matching algorithm, all this worked perfectly but slowly, new characters trickling onto the screen like drops of molasses. Then a way to get a dramatic speedup presented itself. Consider that the Match procedure is spending most of its time checking out cases that don't match at even the first character. They are the majority of the cases, and they are a total waste of time. Suppose we are looking for the string "rep"; is it possible to skip through BigArray, investigating only sequences that begin with "r"?

Yes, it is, at the cost of a second array the size of BigArray. Called SkipArray, it works as if it were many linked lists braided together. In our example, we have only to follow SkipArray's linked skein of "r"s. With its guidance, the Match procedure can skip swiftly through BigArray from one "r" to the next, ignoring all other checkpoints. If all letters were equally frequent, that would hasten the matching process by a factor of 26. In the program run that generated figure 2, the number of searches was cut by a factor of 15. If Match were the only determinant, we'd speed up execution by that much. However, other overhead stays constant and hogs time. In practice, the overall speedup approaches a limit of about 7; for big jobs, that is the difference between five minutes and half an hour.

SkipArray is accompanied by a much smaller array called StartSkip, which is indexed by characters and simply records the first location of each character in BigArray. The two of them are set up, once and for all and very quickly, by the procedure InitSkip. Once they are in place, we start each search from the information in StartSkip, and then use SkipArray as follows.

Consider our example, a pattern that begins with "r". Let's suppose we have learned from StartSkip that the first "r" is at BigArray[3]. Then the "r" entries in SkipArray might begin like this:

Index:       3   4   5   6   7   8   9  10  11  12  13 ...
SkipArray:   7              13                      22

At 3, we learn that the next "r" is at 7; at 7, that the next "r" is at 13; at 13, that the next "r" is at 22; and so on. Of course, the other positions in SkipArray would be filled with information about the other characters. That information is ignored this time through but stays available for other scans.

Inspection of the listing will show how the Match procedure consults StartSkip to see where, in BigArray, the first character of the Pattern to be matched first appears. Thereafter, guided by SkipArray, it can safely leave most locations of BigArray unvisited. In a count that includes the spaces, "r" occurs in English about once every 18 characters. So if BigArray were 2000 characters long, only about 110 of its locations would be checked. "P", with a frequency of about 2 percent, would trigger a mere 40 visits.

The improvement in speed over our preceding versions is dramatic. Scanning a small input file on a fast system, we've seen new characters patter onto the screen faster than we could read them. The analogy is no longer with molasses but with raindrops.

TIMING ANALYSIS

Some statistics yielded by the "time" function on the UNIX system suggest two timing considerations. First, the time consumed seems independent of pattern length. (Most matches fail well before the pattern length is reached.) Second, time is largely determined by the product of two factors: the number of characters in the input file and the number of output characters requested. For large inputs and outputs, doubling either doubles the time, and doubling both quadruples the time. But on a small scale, the program works more slowly than that.

More exactly, the time in seconds follows quite closely an equation of the form

T = K(i/10 + o + io)

where i is thousands of characters supplied as input, o is thousands wanted in the output, and K is a system constant, obtained by trial. Thus, 1000 characters both in and out (i and o both = 1) took 21 seconds on one system, a VAX 750 running Berkeley Pascal. The same job took 130 seconds with a different compiler on a 2-MHz Heath H-89. System constants were thus 10 and 62, respectively.

For large inputs and outputs, the last term, the product of i and o, predominates. On a smaller scale, the increasing weight of the first two terms will slow things down.

It is possible to obtain a measure of the speed contributed by the SkipArray process. With SkipArray deactivated so that the Match procedure must check every character, the final term gets multiplied by about 7. This factor represents an empirical weighted average of the frequency of the characters that get checked, and it means that for big jobs, where the final term predominates, SkipArray speeds up the matching of English text by 7 times. For smaller samples, the improvement factor may be closer to 4.

SHANNON'S ALGORITHM

At this point, we wondered if there wasn't still more speed to be gained. In 1948, working without a computer, Claude Shannon had not bothered to build up frequency tables at all. He opened a book at random, selected a letter, and recorded it. He then opened the book again, read till he found that letter, and recorded its successor. On another random page, he found the first successor to that letter ... and so on.

This process amounts to hopping into BigArray at random and letting the first occurrence of Pattern you encounter be the one to define the following character. Why build a table, only to select one element from it at random? Why not acknowledge that the text itself is a frequency table and make random entries into it? So we implemented a new method and got a further speedup factor of better than 3 (listing 2). After molasses and raindrops, this was a torrent! The new version flew like a bat out of hell and got nicknamed Hellbat.

In the short run Hellbat can be disablingly input-sensitive.
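The random-jump idea described above can be sketched as follows. This is a hedged Python illustration, not the Pascal of listing 2: it steps forward one position at a time from the random entry point instead of following SkipArray, and the names close_loop and hellbat are assumptions introduced here.

```python
import random


def close_loop(source: str, pat_len: int) -> str:
    """Collapse whitespace runs, end with one space, then append the
    opening pat_len characters so the text behaves as a closed loop."""
    text = " ".join(source.split()) + " "
    return text + text[:pat_len]


def hellbat(source: str, order: int, out_chars: int, seed: int = 0) -> str:
    rng = random.Random(seed)
    pat_len = order - 1
    text = close_loop(source, pat_len)
    total = len(text) - pat_len
    pattern = text[:pat_len]
    out = list(pattern)
    while len(out) < out_chars:
        start = rng.randrange(total)      # hop in at a random location
        new_char = None
        for step in range(total):         # read on, wrapping past the end
            i = (start + step) % total
            if text.startswith(pattern, i):
                new_char = text[i + pat_len]   # first match decides
                break
        if new_char is None:
            raise ValueError("pattern not found; is the source too short?")
        out.append(new_char)
        pattern = (pattern + new_char)[1:]
    return "".join(out)


print(hellbat("ALL IN ALL THE CHANCES MAY WELL BE ENHANCED.", 3, 60))
```

No frequency table is built here. An occurrence of the pattern is chosen with probability proportional to the gap of text in front of it, so the spacing of occurrences, and not only their number, affects the outcome.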

PROCEDURE Match;
VAR
  i : INTEGER;      {one location BEFORE start of Match in BigArray}
  j : INTEGER;      {index into pattern}
  Found : BOOLEAN;  {true if there is a match from i + 1 to i + j - 1}
  ch1 : CHAR;       {first character in Pattern; used for skipping}
BEGIN
  ch1 := Pattern[1];
  {Hop into BigArray at a random location}
  i := TRUNC((TotalChars - PatLength) * Random(Seed));
  {Search for an instance of ch1 at location i+1}
  WHILE BigArray[i+1] <> ch1 DO
    i := (i + 1) MOD (TotalChars - PatLength);
  Found := false;
  WHILE (NOT Found) DO
  BEGIN
    j := 1;
    Found := true;
    WHILE (Found AND (j <= PatLength)) DO
      IF BigArray[i + j] <> Pattern[j]
        THEN Found := false
        ELSE j := j + 1;
    IF Found
      THEN NewChar := BigArray[i + PatLength + 1]
      ELSE i := SkipArray[i+1] - 1
  END
END;

silk-h- would be considered equally likely; the letters "s" and "h" would ring up the same increments in FreqArray; and the upshot, all else being equal, would be truly 50-50. Travesty, in short, makes probability independent of the spacing of items. Though slower than Hellbat, it is less particular about peculiar input structures.

Hellbat, on the other hand, is temptingly fast; the io term disappears altogether from its timing equation, which takes the form

T = K(i/5 + o)

For order-5, and input and output of 1,000 characters each, the two machines that took 21 and 130 seconds to run Travesty ran Hellbat in 4 and 50 seconds, respectively.
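The two timing equations can be checked against the reported figures with a little arithmetic. The Python sketch below uses function names that are assumptions. The Hellbat constants it prints are not stated in the article; they are only what the reported 4 and 50 seconds would imply at order 5 if the equation held exactly.

```python
def travesty_time(k: float, i: float, o: float) -> float:
    """T = K(i/10 + o + io), with i and o in thousands of characters."""
    return k * (i / 10 + o + i * o)


def hellbat_time(k: float, i: float, o: float) -> float:
    """T = K(i/5 + o), with i and o in thousands of characters."""
    return k * (i / 5 + o)


# Reported Travesty constants: 10 (VAX 750) and 62 (Heath H-89).
for k in (10, 62):
    print(k, round(travesty_time(k, 1, 1), 1))

# Hellbat constants implied by the reported 4 and 50 seconds at i = o = 1.
for seconds in (4, 50):
    print(seconds, round(seconds / hellbat_time(1, 1, 1), 1))

# How the two equations diverge for a larger job on the first machine,
# using the implied (not reported) Hellbat constant.
k_hellbat = 4 / hellbat_time(1, 1, 1)
print(round(travesty_time(10, 10, 10)), round(hellbat_time(k_hellbat, 10, 10)))
```

With i = o = 1 the Travesty equation reduces to 2.1K, which reproduces the 21 and roughly 130 seconds quoted earlier. The larger job shows why dropping the io term matters most for long inputs and outputs.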

Because a long pattern is apt to entail more tries before a match is found, Hellbat's time, unlike Travesty's, is order-dependent. The dependency could be incorporated into K, which could then stand for both the computer system and the order.

So, if the input text is fairly free of repeated words clumped together, or if it is long and varied enough for such words to be scattered elsewhere also, then Hellbat (jump in with closed eyes) is the algorithm of choice. Otherwise, opt for the inefficiencies of Travesty and its frequency table.

We print herewith (listing 1) the Pascal code for Travesty (modified Hayes) and append instructions for converting it into random-jump Hellbat (Shannon) (listing 2). The nature of your input must decide which version is for you.

BIBLIOGRAPHY

1. Bennett, William R. Jr. "How Artificial is Intelligence?" American Scientist. November-December 1977, page 694.
2. Hayes, Brian. "Computer Recreations." Scientific American. November 1983, page 18.
3. Shannon, Claude E. "The Mathematical Theory of Communication." Bell System Technical Journal. July and October 1948, pages 379, 623.
4. Shannon, Claude E. "Prediction and Entropy of Printed English." Bell System Technical Journal. January 1951, page 50.

NOVEMBER 1984 BYTE 469