On the generation of compilers from language definitions

INFORMATION PROCESSING LETTERS 18 March 1980

ON THE GENERATION OF COMPILERS FROM LANGUAGE DEFINITIONS

Frank G. PAGAN, Computer Science Department, Southern Illinois University, Carbondale, IL, U.S.A.

Received 22 March 1979; revised version received 30 October 1979

Programming languages, compiler generation, formal semantics, compilers, interpreters, metalanguages, Algol 68

This note points out some extensions to the semantic specification techniques advocated by the author in two earlier papers [8,9]. These extensions pertain to automatic compiler generation and were inspired by certain related concepts developed by others [1,2,3,7].

The main aim of the two earlier papers was to advance and illustrate the principle that a general-purpose programming language might feasibly and profitably be employed as a metalanguage in formal semantic definitions of programming languages, one paper [8] dealing with operational semantics and the other [9] with functional (denotational, ‘mathematical’) semantics. The main potential advantages of such an approach are

(a) easier and better understanding of formal language specifications by a wider range of people and

(b) ready amenability of the specifications to computer checking and processing.

On the latter point, since semantic definitions of the two types studied can be regarded as abstract interpreters for the languages being defined, a language definition expressed in an executable metalanguage can, a priori, be an (interpretive, prototype) implementation, so that the problem of converting it into such disappears. The problem of making the implementation reasonably efficient reduces, in principle at least, to a program optimization problem. However, the question of providing prototype translational implementations was not addressed.

The abstract syntax of the subject language (language to be defined) is specified by means of the type-definition facilities of the metalanguage. For the sake of definiteness, suppose that Algol 68 is the metalanguage (albeit an imperfect one). Then the abstract syntax specifications of a particular subject language may be formulated as a set of mutually recursive definitions of modes, mostly structures and unions, the most general of which (prog, say) specifies the major components of a complete program. A particular abstract program pr is then a value (structure) of mode prog. It may, if desired, be regarded as resulting from the execution of the identity-declaration

prog pr = trans(s) ,

where s is the textual form of the program (a character string) and trans is a function, of mode proc(string) prog, which transforms textual programs into abstract programs. (Those who prefer to have compilers described in terms of their action on source programs may substitute trans(s) wherever pr appears in this note.)

Let the mode file describe the sets of input values and output values manipulable by the subject language. In the case of a language for manipulating arbitrary quantities of integer data, for example, file could be defined as ref [ ] int or ref flex [ ] int [8]. A definitional interpreter may now be written as a function procedure of the following mode:

mode intr = proc(prog, file) file .

In particular, suppose that the procedure interpret of the following form defines the operational semantics of the subject language associated with prog and file:

proc interpret = (prog p, file input) file:
   ( declarations for the components of an abstract machine;
     proc int prog = (prog p, file input) file: . . . ;
     declarations of other, mutually recursive, procedures such as int command and eval expr;
     int prog (p, input)
   ) .
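The shape of such a definitional interpreter can be sketched in Python. This is only an illustration under stated assumptions: the three-command subject language (Read, Add, Write), the class names, and the helpers int_command and int_prog are hypothetical stand-ins and do not come from the paper; File and Prog play the roles of the modes file and prog.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

File = List[int]          # stands in for the mode `file`

@dataclass
class Read:               # read the next input integer into `var`
    var: str

@dataclass
class Add:                # var := var + const
    var: str
    const: int

@dataclass
class Write:              # append the value of `var` to the output
    var: str

Command = Union[Read, Add, Write]

@dataclass
class Prog:               # stands in for the mode `prog`
    body: List[Command]

def interpret(p: Prog, input_file: File) -> File:
    # Components of the (hypothetical) abstract machine.
    store: Dict[str, int] = {}
    pending = list(input_file)
    output: File = []

    def int_command(c: Command) -> None:
        if isinstance(c, Read):
            store[c.var] = pending.pop(0)
        elif isinstance(c, Add):
            store[c.var] += c.const
        elif isinstance(c, Write):
            output.append(store[c.var])
        else:
            raise TypeError(f"unknown command: {c!r}")

    def int_prog(q: Prog) -> File:
        for c in q.body:
            int_command(c)
        return output

    return int_prog(p)

pr = Prog([Read("x"), Add("x", 1), Write("x")])
print(interpret(pr, [41]))  # prints [42]
```

The toy language has no nested constructs, so the helper procedures here are not mutually recursive as they would be for a realistic subject language.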

The body of interpret could instead have specified a functional semantics for the subject language [9], provided that the metalanguage is extended to include the partial parametrization feature [5,6]. This feature permits a procedure to be called with arguments for only some of its parameters, resulting in another procedure with a smaller number of parameters. For example, given the declaration

proc add = (int y, x) int : x + y

(so that the mode of add is proc(int, int) int), the call add(1, ) yields a procedure value equivalent to

(int x) int : x + 1

(of mode proc(int) int).

It should be noted that, as in earlier papers [8,9], the use of Algol 68 as a metalanguage here is for expository purposes only and is not intended to imply that this language is very suitable, in an absolute sense, for the purposes to which it is being applied. The adverse effect of partial parametrization on readability and the fact that functions cannot always be programmed in a purely applicative style, for example, impair its viability for the specification of denotational semantics [10]. In relation to other extant, general-purpose programming languages, however, Algol 68 is far superior for this kind of use, and the results are good enough to justify the conclusion that a sufficiently well-designed and powerful programming language could outdo the purely formal metalanguages in all important respects.
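For readers more familiar with present-day languages, the partial parametrization of add described earlier can be approximated with Python's functools.partial. This is an analogy rather than the Algol 68 proposal itself; the name add_one is introduced here for illustration only.

```python
from functools import partial

def add(y: int, x: int) -> int:
    return x + y

# Supplying only the first argument, like the call add(1, ):
# the result behaves like (int x) int : x + 1.
add_one = partial(add, 1)

print(add_one(5))  # prints 6
```

Unlike the proposed feature, functools.partial binds leading positional arguments only; a later parameter has to be bound by keyword.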

Turning now to the compiler generation problem, instead of regarding the output of a compiler (an object program) as a program expressed in some sort of machine language, we adopt the more abstract view [7] that it consists of a function that maps input files into output files:

mode objprog = proc(file) file .

A compiler is then itself a function of mode cmplr, defined as

mode cmplr = proc(prog) objprog

(i.e., proc(prog) proc(file) file). The key idea now is that an object program corresponding to an abstract program pr can be obtained simply by partially parametrizing the definitional interpreter with pr; i.e., the call interpret(pr, ) yields the proc(file) file function which is the ‘meaning’ of pr. Hence, a compiler can be constructed simply by abstracting this call into a procedure of mode cmplr:

cmplr compile = (prog p) objprog : interpret (p, ) .
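The same construction can be sketched in Python, with functools.partial standing in for partial parametrization. The toy subject language (a program is a list of ("add", n) or ("mul", n) steps applied to every input value) is an assumption made purely for illustration, and compile_ is spelled with a trailing underscore only to avoid shadowing Python's built-in compile.

```python
from functools import partial
from typing import Callable, List, Tuple

File = List[int]                      # stands in for the mode `file`
Prog = List[Tuple[str, int]]          # hypothetical toy `prog`
ObjProg = Callable[[File], File]      # mode objprog = proc(file) file

def interpret(p: Prog, input_file: File) -> File:
    out = []
    for v in input_file:
        for op, n in p:
            v = v + n if op == "add" else v * n
        out.append(v)
    return out

def compile_(p: Prog) -> ObjProg:
    # cmplr compile = (prog p) objprog : interpret (p, )
    return partial(interpret, p)

pr = [("add", 1), ("mul", 2)]
objpr = compile_(pr)                  # "compile" once
print(objpr([1, 2, 3]))               # prints [4, 6, 8]
```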

The role played by partial parametrization in this technique is a crucial one, and is analogous to that of partial evaluation [2,3] or mixed computation [1] in other studies of the generation of compilers from interpreters. An important difference is that here partial parametrization is being regarded as an integral feature of the metalanguage rather than as a separate process.

To compile the program pr and name the resulting object program objpr, we may elaborate the phrase

objprog objpr = compile (pr) .

Then to execute the object program on a file input and name the resulting file output, we elaborate

file output = objpr (input).

Alternatively, to compile the program and execute it immediately, these two steps can be combined:

file output = compile (pr) (input).

The runtime efficiency of this process depends on the degree of optimization of the partial parametrization feature in the implementation of the metalanguage. In the worst case, creation of an object program by partial parametrization of the interpreter would be tantamount to building a pair of the form (interpret, p) and subsequent execution would, in effect, be nothing more than pure interpretation. On the other hand, an implementation of the metalanguage could perform a true partial evaluation of the interpreter in the sense of the other studies [1,2,3], so that execution would be considerably faster than interpretation. It is an interesting side benefit of the present approach that the burden of optimizing a generated compiler is, in part, already borne by the (presumed optimized) implementation of the metalanguage.
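Python's functools.partial happens to be an instance of the worst case described above: the object it returns simply records the function and the supplied arguments. The sketch below shows this, and contrasts it with a hand-written specialization that decodes the program once at compile time. The toy language and the names compile_pair and compile_specialized are hypothetical, and the second function is written by hand; it is not an automatic partial evaluator, and no timing claim is made.

```python
from functools import partial
from typing import Callable, List, Tuple

File = List[int]
Prog = List[Tuple[str, int]]          # hypothetical toy language

def interpret(p: Prog, input_file: File) -> File:
    out = []
    for v in input_file:
        for op, n in p:               # operators are decoded for every value
            v = v + n if op == "add" else v * n
        out.append(v)
    return out

def compile_pair(p: Prog) -> Callable[[File], File]:
    # Worst case: the "object program" is just the pair (interpret, p).
    return partial(interpret, p)

def compile_specialized(p: Prog) -> Callable[[File], File]:
    # Hand-specialized: each operator is decoded once, before any input is seen.
    steps = [(lambda v, n=n: v + n) if op == "add" else (lambda v, n=n: v * n)
             for op, n in p]
    def objprog(input_file: File) -> File:
        out = []
        for v in input_file:
            for step in steps:
                v = step(v)
            out.append(v)
        return out
    return objprog

pr = [("add", 1), ("mul", 2)]
objpr = compile_pair(pr)
print(objpr.func is interpret, objpr.args == (pr,))   # prints True True
print(objpr([1, 2, 3]) == compile_specialized(pr)([1, 2, 3]))  # prints True
```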

Consider now how we might write a general routine for generating compilers from interpreters. To begin with, the following routine of mode proc(intr) cmplr will convert any interpreter for the subject language under consideration into a compiler:

proc cgen = (intr int) cmplr :
   ((prog p, intr int) objprog : int (p, )) (, int) .

(The body of cgen is conceptually equivalent to

(prog p) objprog : int (p, )

except that the appearance of int in the latter gives rise to an Algol 68 scope violation. The corrective technique of adding an extra parameter to the inner routine-text and immediately partially parametrizing it is one which is used heavily in the specification of functional semantics in extended Algol 68 [9,10].) For example, the call cgen(interpret) will yield the function compile defined above. The procedure cgen could be useful if one wished to produce an efficient compiler by first optimizing the definitional interpreter.
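A Python rendering of cgen is sketched below, again over a hypothetical toy language. Python closures may capture intr freely, so the direct form raises no scope problem. The second function, cgen_extra_parameter, mirrors the paper's technique of passing the interpreter as an extra parameter and binding it immediately. Both function names are introduced here for illustration.

```python
from functools import partial
from typing import Callable, List, Tuple

File = List[int]
Prog = List[Tuple[str, int]]                  # hypothetical toy language
Intr = Callable[[Prog, File], File]           # mode intr
Cmplr = Callable[[Prog], Callable[[File], File]]   # mode cmplr

def interpret(p: Prog, input_file: File) -> File:
    out = []
    for v in input_file:
        for op, n in p:
            v = v + n if op == "add" else v * n
        out.append(v)
    return out

def cgen(intr: Intr) -> Cmplr:
    # Direct form: (prog p) objprog : int (p, )
    def compile_(p: Prog) -> Callable[[File], File]:
        return partial(intr, p)
    return compile_

def cgen_extra_parameter(intr: Intr) -> Cmplr:
    # Paper's form: ((prog p, intr int) objprog : int (p, )) (, int)
    def inner(p: Prog, i: Intr) -> Callable[[File], File]:
        return partial(i, p)
    return partial(inner, i=intr)             # bind the second parameter only

pr = [("add", 1), ("mul", 2)]
print(cgen(interpret)(pr)([1, 2, 3]))                  # prints [4, 6, 8]
print(cgen_extra_parameter(interpret)(pr)([1, 2, 3]))  # prints [4, 6, 8]
```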

The usual meaning of the term ‘compiler generator’, however, is a program to convert a definition of any subject language into a compiler for that language. The procedure cgen is tied to one language because of the occurrence of prog in its body and the implicit occurrences of prog and file in the modes intr and cmplr. To be truly general, the compiler generator should accept such modes as parameters. This requires a further extension to the metalanguage to include a type parametrization feature as found in a few experimental programming languages and known as the modal feature in one proposal concerned with Algol 68 [4]. With this extension, the following procedure will generate a compiler for any language from any interpreter expressed as a procedure in this metalanguage:

proc gencmplr = (mode pmode, fmode, proc(pmode, fmode) fmode int)
      proc(pmode) proc(fmode) fmode :
   ((pmode p, proc(pmode, fmode) fmode int) proc(fmode) fmode : int (p, )) (, int) .

Here pmode, fmode, and int are the three formal parameters of gencmplr, proc(pmode, fmode) fmode is the generalized mode for an interpreter, and proc(pmode) proc(fmode) fmode is the generalized mode for a compiler. Now the call

gencmplr (prog, file, interpret)

will yield the function compile.

The preceding observations provide further support for the principle of using some advanced and expressive general-purpose programming language as a metalanguage for semantic specifications: compiler generation can be so conceptually simple a process that it is almost true to say that an (interpretive) language definition is a compiler. The technique could be used now in a practical translator writing system if there happened to exist a sufficiently powerful (meta-) programming language capable of adequately expressing interpretive language definitions and of manipulating functions as truly first-class values. (It would seem that features equivalent to partial parametrization, modals, and so forth ought to be available in advanced languages for general use anyway.) It would then be possible to investigate the relative efficiency of object programs produced by the generated compilers. For the present, all that can be said is that the efficiency should be substantially better than that of definitional interpreters, depending on how much optimization is performed within implementations of the metalanguage.
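The language-independent generator gencmplr discussed earlier can be approximated in Python, where types are ordinary values. In this sketch the type variables P and F play the roles of pmode and fmode; passing the types explicitly is kept only to mirror the paper's signature, and the runtime check on pmode is an added assumption. The two toy interpreters are hypothetical.

```python
from functools import partial
from typing import Callable, List, Tuple, Type, TypeVar

P = TypeVar("P")   # plays the role of pmode
F = TypeVar("F")   # plays the role of fmode

def gencmplr(pmode: Type[P], fmode: Type[F],
             intr: Callable[[P, F], F]) -> Callable[[P], Callable[[F], F]]:
    # fmode matters only to a static type checker in this sketch.
    def compile_(p: P) -> Callable[[F], F]:
        if not isinstance(p, pmode):          # illustrative check, not in the paper
            raise TypeError("program is not of the expected mode")
        return partial(intr, p)
    return compile_

# First hypothetical language: arithmetic steps over lists of integers.
def interpret(p: List[Tuple[str, int]], input_file: List[int]) -> List[int]:
    out = []
    for v in input_file:
        for op, n in p:
            v = v + n if op == "add" else v * n
        out.append(v)
    return out

# Second hypothetical language: a program is a suffix appended to the input text.
def interpret_suffix(p: str, input_file: str) -> str:
    return input_file + p

compile_ = gencmplr(list, list, interpret)
print(compile_([("add", 1), ("mul", 2)])([1, 2, 3]))   # prints [4, 6, 8]
print(gencmplr(str, str, interpret_suffix)("!")("hi"))  # prints hi!
```

The same gencmplr serves both languages without modification, which is the generality that the type parametrization feature is meant to provide.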

References

[1] A.P. Ershov, On the essence of compilation, in: E.J. Neuhold, Ed., Formal Description of Programming Concepts (North-Holland, Amsterdam, 1978) 391-420.

[2] Y. Futamura, Partial evaluation of computer programs: an approach to a compiler-compiler, Trans. Inst. Electronics Comm. Engrs. Japan 54 (8) (1971) 32-33.

[3] A. Haraldsson, A partial evaluator and its use for compiling iterative statements in Lisp, Conf. Record of the 5th Annual Symp. on Principles of Programming Languages (ACM, New York, 1978).

[4] C.H. Lindsey, Modals, Algol Bull. 37 (1974) 26-29.

[5] C.H. Lindsey, Partial parametrization, Algol Bull. 37 (1974) 24-26.

[6] C.H. Lindsey, Specification of partial parametrization proposal, Algol Bull. 39 (1976) 6-9.

[7] P.D. Mosses, Compiler generation using denotational semantics, Proc. 5th Symp. on Mathematical Foundations of Computer Science, Lecture Notes in Computer Science 45 (Springer, Berlin, 1976) 436-441.

[8] F.G. Pagan, On interpreter-oriented definitions of programming languages, Comput. J. 19 (1976) 151-155.

[9] F.G. Pagan, Algol 68 as a metalanguage for denotational semantics, Comput. J. 22 (1979).

[10] F.G. Pagan, Studies in the metalinguistic use of a general-purpose programming language for the specification of denotational semantics, Technical Report 79-01, Dept. of Computer Science, Southern Illinois Univ., Carbondale (1979).