
The C programming language and a C compiler



In the last few years, the C programming language has become one of the most widely used languages for applications and systems software on microcomputer systems. This paper describes the C language and its history. It then presents a specific implementation of C, the Microsoft C Compiler, which runs on the IBM Personal Computer.

In 1970, Thompson, while working on the earliest version of the UNIX operating system at the Bell Telephone Laboratories, Inc., wrote an interpreter for a language patterned on BCPL, which he called B. [1] Ritchie then wrote a compiler for a derivative of B that he called C. Thompson and Ritchie wrote the next version of the UNIX operating system in the C language. Until that time, most operating systems had been written in assembly language, but Thompson and Ritchie found it so much easier working in C that they were able to add many new functions to UNIX. The first version of C was used to generate code for the then-new DEC PDP-11. The compiler was small and fast and did not attempt any ambitious optimizations. Instead, the C language provided register variables and low-level operations that allowed the programmer to guide the compiler in generating useful code.

by R. R. Ryan and H. Spiller
IBM Systems Journal, Vol. 24, No. 1, 1985

Through the 1970s, C remained closely associated with UNIX. Most of the system is written in the C language. Compilers were written for C for use on the IBM System/370 and the Interdata 32, as well as a large number of portable programs written in the C language. AT&T began to use C to develop their switching systems, and eventually it became the main development language at Bell Laboratories. In the mid-1970s, UNIX was introduced into the academic environment at a number of universities, and the popularity of C started to spread. By the late 1970s, C compilers were becoming available for a large number of processors and operating systems, and papers began to appear comparing C with other languages for various applications.

In the original language, there was little in the way of type checking and few restrictions as to type. The C language was invented by experienced programmers for their own use, and it provided them with a great deal of power but very little protection from data-type errors. A program called "lint" for checking types in C programs was the only way to provide such protection. As the language has evolved, more attention has been paid to types and to type checking within both the language and the compilers, but the philosophy of C remains unchanged. Even as the language has incorporated stronger type checking, there has continued to be a mechanism to override its restrictions.

At first, the typical C user was a systems programmerwho needed to be close to the machine and who

© Copyright 1985 by International Business Machines Corporation. Copying in printed form for private use is permitted without payment of royalty provided that (1) each reproduction is done without alteration and (2) the Journal reference and IBM copyright notice are included on the first page. The title and abstract, but no other portions, of this paper may be copied or distributed royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be obtained from the Editor.

RYAN AND SPILLER 37


required the efficient code generation of assembly language. At the same time, such programmers wanted the fluency of expression inherent in a high-level language. However, they were doing their development on the target machine, which was a minicomputer and thus not large enough to support large optimizing compilers. Therefore, the early C compilers saved space by allowing the programmer to do some of the optimization.

The 16-bit personal computers of today are about the same size as the minicomputers of the early days of C. Application writers working on personal computers are under very much the same constraints as systems programmers using C in the early 1970s. Because of this, C has proved to be a popular choice among applications programmers, although it is rarely one's first programming language. Experienced programmers appreciate the ability to write powerful programs economically in C.

In 1983, an ANSI standards committee (X3J11) for C was formed. Kernighan and Ritchie had earlier written a reference manual, which was the closest thing to a standard. However, the manual was not precise and did not contain later features in the language, nor did it document the standard library. The major job of the committee has been to codify common understanding, because the C language has had remarkably few local extensions added over the years. Among those few extensions are argument checking, structure copying, and an enumerated type. There has also been an effort by UNIFORUM, a UNIX users' organization, to create a standard for the C library. Although this standard is oriented toward UNIX, most of it is sufficiently general that the ANSI committee is using it as a basis for their library standard.


Concepts of the C language

The C programming language is expression-oriented, with a rich set of operators and a block structure, encouraging well-structured programs. It includes constructs for decision making (if statements), several types of looping (for, while, and do-while), and selection of cases (switch). In keeping with the spirit of low-level languages, there is also support for goto.

The C language is relatively low-level, providing a set of operators that work on the same kinds of objects that most computers do. This allows the programmer to work close to the underlying machine. Yet C is of a sufficiently high level to make programs portable across a wide variety of machines. The language views the underlying machine as a single monolithic array, allowing objects to be referenced anywhere in that array. Pointer arithmetic and array subscripting both manipulate the same kind of object, a memory address.

The fundamental data types that C supports are characters, integers of several sizes, and floating point numbers of both single and double precision. Pointers can be declared to all data types, including derived data types and functions. New data types can be derived through the use of arrays, pointers, structures, unions, and functions.

C is a relatively simple language that provides few mechanisms for dealing directly with complex data structures such as strings, lists, and arrays. Similarly, there is no direct language support for dealing with the environment (no input-output, no memory allocation, etc.). Instead, these capabilities are provided through function calls. In the evolution of the C language, a particular set of function calls (the so-called standard library) has come to be considered part of the language. But because the language compiler does not recognize anything special about these function calls, a programmer can replace standard functions with special implementations if necessary.

Another powerful feature of the language is a text processor, called the C preprocessor, which allows the use of macros, conditional compilation, and the inclusion of other text files. These features facilitate the writing of C programs that are portable to a variety of machines and operating systems.

The simplicity of the language has allowed compilers to be written easily and quickly. The expression orientation of the language and the rich set of operators enable the programmer to create efficient programs using simple compilers.

Two examples

C is attractive to systems programmers because of the number of low-level operators and data types provided. One task these low-level operators are especially suited to is manipulating linked lists. The following are two somewhat dense examples illustrating a number of the operators used in initializing and scanning a linked list. The function is presented in C, followed by a discussion of the operators involved. Note that anything occurring between /* ... and ... */ is a comment.

Example 1. The following are some basic functions of the C language.

#define SIZE 100                /* Preprocessor directives: replace all occurrences */
#define NULL 0                  /* of 'SIZE' with 100, 'NULL' with 0 */

struct thing {                  /* Declare a structure type, 'thing', */
    struct thing *link;         /* consisting of a pointer to a 'thing' */
    char val;                   /* and an 8-bit signed integer */
};

struct thing Table[SIZE];       /* declare an array of 'things' */
struct thing *Freething;        /* a pointer to a thing */

init()                          /* initialize a freelist of things */
{
    register struct thing *p;   /* declare a local register pointer variable */

    for (p = Table; p < &Table[SIZE]; p += 1) {   /* make each element */
        p->link = p - 1;        /* point to the previous element */
        p->val = 0;
    }
    Freething = p - 1;          /* initialize freelist head */
    Table[0].link = NULL;       /* and tail */
}

A register declaration instructs the compiler that the variable should be placed in a register if possible. Registers are allocated in the order declared as long as they are available; otherwise, they are treated as local variables. Our implementation for the Intel 8086/80286 provides two register variables, SI and DI; other processors may have more or fewer. The code produced using register variables is almost always smaller and faster than that using ordinary local variables.

A for-loop is of the form

for (expression1; expression2; expression3)
    BODY OF LOOP

which can be expressed in a pseudocode as follows:

evaluate expression1
while (expression2 is TRUE)
    BODY OF LOOP
    evaluate expression3
end-of-loop

The for-loop in the initialization example just given illustrates several features of C. The first clause sets a pointer p to point to the base of the array Table. In C, the name of an array is synonymous with the address of the first element. A pointer is a variable containing an address.

Following the first semicolon is the for-loop's end condition. In this case, the value of the pointer is being tested against the address of the end of the array. Following the second semicolon there is an expression to be evaluated as the last action each time through the loop. Any expression can be put there.


In Example 1, expression3 of the for-loop uses the += operator, which has the following form:

expression1 += expression2

is equivalent to

expression1 = expression1 + expression2

with the exception that expression1 is evaluated only once; thus any side effects occur only once. In this case, the += operator is adding a constant value to the pointer. In C, pointer arithmetic is scaled to the size of the object pointed to. Because we are adding to a pointer to a "thing" in the example, the constant 1 is scaled to the size of the "thing."

Doing this scaling based on pointer type has an interesting and useful effect. Because an array name is equivalent to a pointer to the base of the array, and adding an index to a pointer causes the index to be scaled by the element size, the resulting operation is the same arithmetic used for array subscripting. Incidentally, this means that array subscripting is commutative (i.e., array[i] == i[array]) because addition is commutative. This makes pointer arithmetic machine independent.

The braces { ... } delimit a compound statement that is to be executed each pass through the loop. The first statement assigns the address of the previous element to its own link field, thereby taking advantage of the pointer scaling operation. The -> operator references a field of the structure pointed to.

The second statement inside the braces initializes the integer field. The two statements after the for-loop clean up the boundary conditions. This is not intended to be a stylistically "nice" routine but rather a dense illustration.


Example 2. The following example illustrates further expressions, using the same data structure as that in Example 1:

/* Return the nth element of the list, or NULL if
 * there aren't that many. The function is de-
 * clared to return a pointer to a 'thing'
 */
struct thing *nth(list, n)
register struct thing *list;
register int n;
{
    while (list && n--)              /* while there is list left, count out elements */
        list = list->link;
    return (n >= 0 ? NULL : list);   /* use count to decide if too many */
}

In C, the value 0 is treated as a boolean false; anything else is true. Thus, the statement while(expr) is the same as the statement while(expr != 0). (The symbol != means "not equals.")

The && operator is a short-circuit logical AND. If the condition before the && is false, the truth value is known and evaluation is stopped. This has a benefit other than performance in that it can be used to protect illegal evaluations from being made. Thus, for example, if (list && list->val) ... does not try to dereference a null pointer.

The expression n-- is a use of the post-decrement operator that has the same effect as a function containing

temp = n
n = n - 1
return(temp)

Thus, in Example 2, the value tested is the value before the variable is decremented. Like many of the special low-level operators in C, this gives the compiler a very good hint about generating good code, especially because many machines have an addressing mode that does exactly this. An idiom which is often seen is the following:

while(*p++ = *q++)


This copies a null-terminated string. On some machines, it is easy for the compiler to recognize this idiom and produce an optimal machine instruction sequence.

The return statement sets the return value of the function. In this case, the return statement is using a ternary conditional operator, a C language construct of the following form:

expression1 ? expression2 : expression3

which is equivalent to

if (expression1 is TRUE)
    evaluate expression2
else
    evaluate expression3

The result of the whole expression is the value of whichever expression has been evaluated.

Microsoft C compiler

The Microsoft C compiler is part of a project of the Microsoft Corporation to create a set of common compiler phases for several languages and machines. Like many language producers, we realized that writing good compilers is expensive and that sharing compiler technology across languages and processors both lowers the cost and increases the overall quality. Thus, we designed an intermediate language (IL) that allows language processors to communicate with a machine-independent optimizer, which in turn communicates through another IL to the machine-dependent portions of the compiler. At present we have language front ends for C, Pascal, FORTRAN, and BASIC. Code generators have been written for the Intel 8088/8086/286, [9] as well as for other machines. Written in C, the compilers are very portable and run on IBM PCs and other systems.

Because C continues to be an evolving language, the first question we asked was which C we should implement. We started with the UNIX System V C compilers, which we used as a base language definition. When the ANSI standards effort began, we reviewed the extensions proposed by the committee and incorporated the ones that seemed likely to become part of the eventual standard.

It has been important to keep the compilers consistent across machines and operating systems and to provide strong cross-development tools. For example, the relocatable object format is identical for PC DOS and for XENIX on the 8086 and 286.

Memory models. One of the difficulties of creating a C compiler for the 8088/8086/286 is the segmented architecture of the processor. The fact that the C language views the underlying machine as a flat address space creates a conflict. The usual approach to this problem is to construct the following memory models to approximate the C assumptions:

• Small. All code and data addresses are 16 bits. This limits the program to two 64K-byte segments, one for code and one for data.

• Medium. Code addresses are 32 bits, but data addresses are 16 bits. This model allows an arbitrary number of code segments, but it is limited to a single data segment.

• Compact. The inverse of the medium model, the code addresses of the compact model are 16 bits and data addresses are 32 bits. A single code segment can access an arbitrary number of data segments.

• Large. Both code and data addresses are 32 bits, and both can have an arbitrary number of segments.

The major problem with pure memory models is that a program must pay a severe performance penalty to access large quantities of code or data. It is more expensive to manipulate 32-bit pointers than 16-bit pointers. The problem is inherent in the architecture, which is really that of a 16-bit machine. For many applications, the ability to address large amounts of memory is not needed for the whole program but only for selected portions. To address this need, we have added the following two keywords to the language:


Figure 1 General architecture of the C compiler
[Block diagram: phases P1, P2 (with the PGO), and P3, producing an assembly listing and a linkable object module; the remaining diagram labels are not recoverable from the scan.]

• near denotes an item with a 16-bit address.
• far denotes an item with a 32-bit address.

Regardless of the memory model, an object declared with a near or far attribute is given the appropriate allocation. The programmer now has a way to escape from the memory model. Realistically, programmers tend to compile programs using the small or medium model and then use far data pointers to access the few data structures that need to extend beyond the size of a segment. The converse is less useful, but it is nonetheless available to provide near calls to local procedures in medium-model programs or access to near data in large-model programs.

General architecture. The Microsoft C compiler consists of three phases, P1, P2, and P3, as shown in Figure 1. The P1 phase, or front end, reads the source program and does the C language preprocessing. This includes macro substitution, conditional compilation, and inclusion of named files. P1 then translates the program into expression trees, doing semantic analysis, checking for syntax and type correctness, and writes an intermediate language (IL) temporary file. This intermediate language is both machine- and language-independent and contains information about such things as control structures, expressions, symbols, and data initialization. P1 also emits OIL, which contains data-type information to be used by a symbolic debugger.

P2 is the first pass of the back end. It translates the intermediate language (IL) to a form of machine instruction. It first reads the IL and builds expression trees. Thus P2 assigns specific storage to data and does various optimizations: constant folding, strength reduction (e.g., converting a multiplication by a power of two into a shift), type optimizations (e.g., changing a 32-bit addition to a 16-bit addition, if that is the precision required in the result), common subexpression elimination, and value propagation. Aho and Ullman [10] describe these optimizations in more detail. After P2 has done these optimizations, the expression trees are processed by the code generator. This part of P2 is generated automatically from a pattern language processor called XACC and is described further in the next section. P2 translates the optimized expression trees into Assembly Intermediate Language (AIL), also described further in the next section. AIL is very similar to assembly language but in a form that is more useful to the rest of the compiler.

The AIL is read in by P3, the last part of the back end. Based on command line switches, P3 optionally passes the code through the Post-Generation Optimizer (PGO), which does further code improvement. The PGO is described in a later section of this paper. After it has completed its optimizations, P3 builds instructions and emits one or more of the following:

• Link text for the 8086/80286. This relocatable format is based on the Intel Object Module Format (OMF).

• Assembly language source for the code being generated.

• A listing that looks like the listing file of the typical assembler. The listing contains the offsets and binary values of the emitted instructions, as well as the assembly language.

Phase P1, the early phases of P2, and much of P3 are fairly conventional. Aho and Ullman [10] and many of the other compiler construction textbooks are good references for most of these processes. The code generator and the PGO, which are more interesting, are now presented in greater detail.

Code generation. The code generator translates intermediate language (IL) into an assembly language (AIL). The IL consists of trees representing the various expressions from the source program, and the AIL is a set of instructions and addressing modes for the target machine. Because the translation is usually complex, writing high-quality code generators is a difficult task. The problem is largely one of recognizing patterns in the input representation and translating them to the output representation in an optimal way. Glanville [11] observed that since the code generator, like the front end, is a translator, much of the theoretical work on pattern matching for front ends can be applied to the code generator.

The Glanville approach uses a tool called a LALR parser generator, which takes a formal description of the input language and produces a parsing program that recognizes patterns in the input language and makes them available for translation. [10] When a pattern is recognized, it is a simple matter to write out the appropriate output language. Much of the power of the Glanville technique comes from the ability to have the parser recognize arbitrary patterns that correspond to machine instructions.

However, a number of workers have discovered that register targeting is a problem with this approach and have used various techniques to deal with it. [12-15] That is, if an intermediate calculation is needed for some purpose, say as a pointer, it may be necessary to perform the calculation in a special-purpose register. Glanville's approach recognizes patterns by collecting smaller patterns into larger ones (i.e., in a bottom-up manner). Thus, it does not recognize the context in which a pattern is used.

Our approach also uses a parser built by a parser generator, but the matching process is steered by the eventual use of the pattern, in addition to the component patterns (i.e., a top-down approach). We now illustrate how our code generator works by stepping through the translation of a C program fragment.

The pattern description language is of the followingform:

production : pattern
           | optional alternate pattern

A pattern can consist of terminal symbols, other productions, conditions to be met, or actions to be taken. If the first pattern fails to match, the second is tried, and so on. A pattern description is processed by the XACC pattern language processor, which produces a parser that is part of P2. In the following example, each time a pattern is recognized a code template is associated with the pattern. This template has the following three parts:

• CODE represents some number of machine instructions (possibly zero). Code templates look much like assembly language, but have substitution macros (indicated in the example by a leading % symbol) and other operators embedded in them.

• DEFER is analogous to the return value of a function. The deferred strings allow us to build up a single machine instruction out of component patterns. As an example, an instruction typically uses the deferred strings of its subpatterns for operand addressing modes. The addressing modes have, in turn, been built up from simpler components such as register names and constants.

• ATTRIB contains miscellaneous attributes of the template.

1. The first field contains the number of child templates or subpatterns.

2. The second field consists of the following two parts:

Rewrite rule. Whenever the actions of a template are complete and AIL has been generated, the original expression tree is rewritten to reflect the current state, most commonly as REREG (in a register) or REFAD (an effective address).

Path. When machine code is eventually emitted, the lower regions of the tree come out first. Each template in the code sequence may have some restrictions on allowed registers. This attribute, termed the path, is used to pass register constraint information downward. These constraints are accumulated so that the eventual selection of a register satisfies the tightest constraint on its use along a path.

3. The third field contains operators that determine the order in which code is emitted. This is described in more detail later in this paper.

4. The remaining fields contain register allocation information. Each of these fields has three subfields:


Link attaches the other two subfields to a particular child template.

Constraint places restrictions on which registers may be selected. CAREG, for example, states that there must be an address register. Note that the child template can further propagate this constraint using the path parameter previously described.

Allocation attributes declare what to do with a register: NEW to allocate a new register; FREE to free all registers used by the subpattern or others.

Usually there are only one or two of these subfields that are nonzero. The following is the example for which we have been setting the stage. We first give an example of an XACC input grammar. Although it is not complete, it is sufficient to compile the C expression that follows it.

start    : ASSIGN address reg
           { CODE("mov %bw %1, %2");      /* template 1 */
             DEFER("");
             ATTRIB("2, 0, 0, 1|0|FREE, 2|0|FREE, 0,0"); }

reg      : EXTRACT address
           { CODE("mov %bw %1, %2");      /* template 2 */
             DEFER("%1");
             ATTRIB("1, REREG|1, SU_SCR, 0|0|NEW, 1|0|FREE, 0,0"); }
         | PLUS reg constant
           { CODE("add %bw %1, %2");      /* template 3 */
             DEFER("%1");
             ATTRIB("1, REREG|1, SU_SCR, 0|0|0, 1|0|FREE, 0,0"); }

address  : NAME
           { CODE("");                    /* template 4 */
             DEFER("[%s]");
             ATTRIB("0,0,0, 0,0,0,0"); }
         | PLUS addr_reg constant
           { CODE("");                    /* template 5 */
             DEFER("[%reg1 + %2]");
             ATTRIB("2, REFAD|0, SU_ADR, 1|CAREG|0, 2|0|0, 0,0"); }

constant : CONSTANT
           { CODE("");                    /* template 6 */
             DEFER("%c");
             ATTRIB("0,0,0, 0,0,0,0"); }

addr_reg : EXTRACT address
           { CODE("mov %bw %1, %2");      /* template 7 */
             DEFER("%1");
             ATTRIB("1, REREG|1, SU_ADR, 0|NEW|CAREG, 1|FREE, 0,0"); }
         | reg

The macros used in the example are defined as follows:

%1     Replace with the results of the first link field in the ATTRIB. This could be either the DEFER string of a subpattern or the result of allocating a new register.

%2     Similarly, replace with the results of the second link field.

%c     Substitute the value of the constant in the associated expression tree.

%s     Substitute the symbol name from the associated expression tree.

%reg1  Replace with the register name returned in the first link field.

Here is an example C expression to be compiled:

i = p->x + 3;


This expression means to get the value of the x field of the structure pointed to by p, add 3, and assign the result to the variable i.

After P1 and the optimizer stage of P2 are done, the expression tree looks as shown in Figure 2.


Note that the NAME node represents a memory address. The EXTRACT node is used to get the contents of that address. The parser generated from the pattern grammar given previously tries to match this tree by doing a preorder or forward Polish walk of this tree. This is written as follows:

ASSIGN NAME(i) PLUS EXTRACT PLUS EXTRACT NAME(p) CONSTANT(offset x) CONSTANT(3)

In the pattern to be matched, start consists of ASSIGN (which matches) and two subpatterns: address and reg. The address pattern matches the NAME node. Similarly, the rest of the input matches the reg pattern and its associated subpatterns. Each time a pattern is recognized, the associated template (if it exists) is pushed onto a stack. When the entire matching process is complete, a new tree is created parallel to the original expression tree. Figure 3 shows the parallel tree for this example. The instructions from the CODE fields are used to identify all the operands. The number in parentheses is the number of the template producing the node.

To generate AIL from this parallel tree, we must determine the optimal order in which to walk the tree, and we must assign registers where required. The first walk of the parallel tree analyzes the order operators (the third field in each ATTRIB record). These operators have the prefix SU_, after Sethi and Ullman, who created an ancestor of the technique;16 such operators are used to decide the order in which to walk the tree. The ordering heuristic deals with overlapping and conflicting register classes. For each node in the parallel tree, a held vector (the registers left busy from this calculation) and a used vector (the total set of registers used by this entire subtree) are computed on the basis of the SU_ operator. Each component of the vector contains the number of registers held or used in that register class by the expression. The sum of held and used registers in each class predicts where the register allocator will run out of registers and spill code will have to be generated. The ordering phase attempts to minimize the number of spills by considering whether there are fewer spills in first walking the left side of the tree in Figure 3, then the right side, or vice versa. In the example, there are no conflicts, so that any order is as good as any other order.

After deciding the order in which to generate code for the parallel tree nodes, we do another walk in the established order, passing register constraints from the ATTRIB fields down the tree. These constraints


Figure 2  Example expression tree after P1 and P2 processing

are accumulated so that the ultimate selection of a register satisfies the strictest constraint used in the expression. In the example, the addressing mode template (5) constrains its first child to CAREG (Constrain to Address REGister), which, on the 8086, forces one of BX, SI, or DI to be chosen. If there had been other constraints, say, that the register must be convertible to a byte register, we would have restricted it further to the intersection of the two sets. That is, the intersection of {AX,CX,DX,BX} and {BX,SI,DI} is {BX}.

As the leaf nodes are reached, we back out of the tree, allocating registers on the fly according to the accumulated constraints and emitting code to the AIL stream. The original expression trees are rewritten according to the rewrite field of the ATTRIB in the template in preparation for potential spills (discussed below).

Here is the example in C again:



Figure 3  New expression tree after completion of the matching process

i = p->x + 3;

and the code produced is the following:

mov bx,p        ; move the contents of p into BX
                ; template 2 with operand from template 4
mov ax,[bx+x]   ; move the contents of address BX+x into AX
                ; template 2 with operand from template 5 (which was built from 2 and 6)
add ax,3        ; add 3 to AX
                ; template 3 with operands from 2 and 6
mov i,ax        ; move the contents of AX into i
                ; template 1, with operands from 4 and 3

When there are unresolvable register conflicts, it is necessary to spill an intermediate value from the demanded register to another location in order to proceed with the calculation. On machines like the 8086, it is often more effective to spill to a different register than to spill to memory. Regardless of the choice, the spill sufficiently changes the pattern of code to be generated that a reanalysis is in order. Thus, we emit code to make the required register available, rewrite the original tree to represent the current state of the evaluation, and run the whole grammar/order/register allocation algorithm again.

The Post-Generation Optimizer. The Post-Generation Optimizer (PGO) is based largely on the FINAL phase of the Bliss compiler.17 The PGO performs a variety of optimizations that the code generator cannot do or does not do because the PGO can do a better job. These include peephole optimizations, which involve replacing sequences of code with more efficient sequences that produce the same result. The PGO also performs a simple flow analysis to eliminate redundant register loads. Branches are shortened where possible, and several simple code movement optimizations are applied. One of the more interesting of these code movement optimizations, which is called cross-jumping, involves sharing the instructions of a common sequence before a common label.

Consider the following C statements: if (x) x = y+z; else x = w+z;. P2 generates the following 8086 instructions. The comments after each line describe each instruction rather than explaining the general flow.

    cmp x,0     ; compare x and 0
    je L1       ; branch if x == 0
    mov ax,y    ; load y into register ax
    add ax,z    ; add z to register ax
    mov x,ax    ; store register ax into x
    j L2        ; unconditional branch
L1: mov ax,w    ; load w into register ax
    add ax,z    ; add z to register ax
    mov x,ax    ; store register ax into x
L2:

Observe that the last two instructions of each path are the same. Cross-jumping takes advantage of this and rewrites the whole sequence as

    cmp x,0     ; compare x and 0
    je L1       ; branch if x == 0
    mov ax,y    ; load y into register ax
    j L2        ; branch
L1: mov ax,w    ; load w into register ax
L2: add ax,z    ; add z to register ax
    mov x,ax    ; store register ax into x

Because our target machines are small, our objective has been to concentrate on generating small code. Small code is usually fast code. Cross-jumping does not change execution speed, but it reduces code size. Such sequences are surprisingly common in generated code.

P1 passes line numbers to P2, which passes them on to P3. This allows the creation of code offset-line-number tables in the object module. Data allocation information is passed from P2 to P3 in the AIL, and P3 merges it with the type information from the OIL to create a symbol table in the object module. A symbolic debugger capable of using this information is currently being studied.

Concluding remarks

In the past two years, there have been over ten new C compilers offered for the PC. The availability of so many compilers has made more people aware of the language. The hardware requirements are basic. A programmer can adequately use Microsoft C on the PC with two floppy disks and as little as 192K bytes of memory. For a realistic development environment, a minimum of 256K bytes of memory and a hard disk are preferable. A growing number of development tools are becoming available for use with C: special-purpose libraries, symbolic debuggers, and program-oriented editors. C has proved to be a fine systems and applications language for small computers. The greatest advantage of C stems from its original concept, which is to give the programmer both low-level control of the machine and high-level expression of programming concepts.

Cited references

1. D. M. Ritchie, S. C. Johnson, M. E. Lesk, and B. W. Kernighan, "The C programming language," The Bell System Technical Journal 57, No. 6, Part 2, 1991-2020 (July-August 1978).

2. S. C. Johnson and D. M. Ritchie, "Portability of C programs and the UNIX System," The Bell System Technical Journal 57, No. 6, Part 2, 2021-2048 (July-August 1978).

3. A. Feuer and N. Gehani, "A comparison of the programming languages C and Pascal," ACM Computing Surveys 14, No. 1, 73-92 (March 1982).

4. A. Feuer and N. Gehani, Comparing and Assessing Programming Languages: Ada, C, and Pascal, Prentice-Hall, Inc., Englewood Cliffs, NJ (1984).

5. J. Gilbreath, "A high-level language benchmark," Byte Magazine 6, No. 9, 180-198 (September 1981).

6. B. W. Kernighan and D. M. Ritchie, The C Programming Language, Prentice-Hall, Inc., Englewood Cliffs, NJ (1978).

7. Draft Proposed Standard - Programming Language C, ANSI X3J11 Language Subcommittee, L. Rosler, Chairman, Computer and Business Equipment Manufacturers Association, Washington, DC (1984).

8. Proposed Standard, UNIFORUM, /usr/group Standards Committee (August 31, 1983).

9. Intel MCS-86 User's Manual, Intel Corporation, Santa Clara, CA (February 1979).

10. A. V. Aho and J. D. Ullman, Principles of Compiler Design, Addison-Wesley Publishing Co., Reading, MA (1977).

11. R. S. Glanville and S. L. Graham, "A new method for compiler code generation," Proceedings of the ACM Principles of Programming Languages Conference (January 1978), pp. 509-514.

12. P. Aigrain, S. L. Graham, R. R. Henry, M. K. McKusick, and E. Pelegri-Llopart, "Experience with a Graham-Glanville style code generator," Proceedings of the ACM SIGPLAN Symposium on Compiler Construction (June 1984), pp. 13-24.

13. T. W. Christopher, P. J. Hatcher, and R. C. Kukuk, "Using dynamic programming to generate optimized code in a Graham-Glanville style code generator," Proceedings of the ACM SIGPLAN Symposium on Compiler Construction (June 1984), pp. 25-36.

14. J. Crawford, "Engineering a production code generator," Proceedings of the ACM SIGPLAN Symposium on Compiler Construction (June 1982), pp. 205-225.

15. M. Ganapathi and C. N. Fischer, "Description driven code generation using attribute grammars," Proceedings of the ACM Principles of Programming Languages Conference (January 1982), pp. 108-119.

16. R. Sethi and J. D. Ullman, "The generation of optimal code for arithmetic expressions," Journal of the ACM 17, No. 4, 715-728 (October 1970).

17. W. Wulf, R. K. Johnsson, C. B. Weinstock, S. O. Hobbs, and C. M. Geschke, The Design of an Optimizing Compiler, Elsevier North-Holland Publishing Co., New York (1975).


Reprint Order No. G321-5236.

Ralph R. Ryan Microsoft Corporation, 10700 Northrup Way, Box 97200, Bellevue, Washington 98009. From 1978 to 1981, Mr. Ryan worked as a contract software engineer in the area of distributed databases. In 1981, he worked for the Boeing Aerospace Corporation on a portable compiler project. Mr. Ryan joined the Microsoft Corporation in 1982 and is currently manager of the Compiler Technology (CMERGE) group. Mr. Ryan received a B.S.E.E. from Rutgers University in 1977 and an M.S.E. in electrical engineering from Princeton University in 1978.

Hans Spiller Microsoft Corporation, 10700 Northrup Way, Box 97200, Bellevue, Washington 98009. From 1978 to 1981, Mr. Spiller supported and developed compilers at the Data General Corporation, including compilers for Pascal and DG/L compilers for the Nova, Eclipse, and Eagle (MV/8000 series). He joined the Microsoft Corporation in 1981. There he has developed the XENIX language products, including C compilers, assemblers, and debuggers for the 68000, the Z8000, and the 8086. He is now responsible for the CMERGE 8086/286 code generator. Mr. Spiller received an A.B. in computer science from the University of California at Berkeley in 1978.
