Chapter 2: Language Processors
Fall 2013
Chart 2
• Translators and Compilers
• Interpreters
• Real and Abstract Machines
• Interpretive Compilers
• Portable Compilers
• Bootstrapping
• Case Study: The Triangle Language Processor
Chart 3
Translator: a program that accepts any text expressed in one language (the translator’s source language) and generates a semantically equivalent text expressed in another language (its target language)
o Chinese-into-English
o Java-into-C
o Java-into-x86
o x86 assembler
Chart 4
Assembler: translates from an assembly language into the corresponding machine code
o Generates one machine-code instruction per source instruction
Compiler: translates from a high-level language into a low-level language
o Generates several machine-code instructions per source command
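The one-to-one versus one-to-many distinction can be sketched with a toy example. The opcodes, the variable addresses, and the tiny `x = a + b` statement form below are all invented for illustration; they do not correspond to any real machine or language.

```python
# Hypothetical three-instruction machine; the opcode values are made up.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03}
# Hypothetical memory addresses for three variables.
ADDRESSES = {"a": 0, "b": 1, "c": 2}


def assemble(lines):
    """Assembler: each assembly line becomes exactly one machine instruction."""
    code = []
    for line in lines:
        mnemonic, operand = line.split()
        code.append((OPCODES[mnemonic], int(operand)))
    return code


def compile_assignment(statement):
    """Compiler: one high-level command 'x = a + b' becomes several instructions."""
    target, expression = [part.strip() for part in statement.split("=")]
    left, right = [part.strip() for part in expression.split("+")]
    return [
        f"LOAD {ADDRESSES[left]}",
        f"ADD {ADDRESSES[right]}",
        f"STORE {ADDRESSES[target]}",
    ]


assembly = compile_assignment("c = a + b")   # 1 command -> 3 assembly lines
machine_code = assemble(assembly)            # 3 assembly lines -> 3 instructions
print(assembly)
print(machine_code)
```

Here the compiler expands one source command into three lines, while the assembler preserves the line count.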
Chart 5
Disassembler: translates machine code into the corresponding assembly language
Decompiler: translates a low-level language into a high-level language
Question: Why would you want a disassembler or decompiler?
Chart 6
Source Program: the source language text
Object Program: the target language text
[Figure: the Compiler takes a Source Program and produces an Object Program; labeled stages: Syntax Check, Context Constraints, Semantic Analysis, Generate Object Code]
• The object program is semantically equivalent to the source program, provided the source program is well-formed
Chart 7
Question: Why would you want each of the following?
o Java-into-C translator
o C-into-Java translator
o Assembly-language-into-Pascal decompiler
Chart 8
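[Tombstone diagrams: machine M; program P expressed in language L; program P expressed in L running on machine M]
P = Program Name
L = Implementation Language
M = Target Machine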
For this to work, L must equal M, that is, the implementation language must be the same as the machine language
[Tombstone diagram: S-into-T translator expressed in language L]
S = Source Language
T = Target Language
L = Translator’s Implementation Language
The S-into-T translator is itself a program expressed in language L, so it runs only on machine L
Chart 9
• Translating a source program P
• Expressed in language S
• Using an S-into-T translator
• Running on machine M
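[Tombstone diagram: program P expressed in S is fed to an S-into-T translator expressed in M, running on machine M, which produces program P expressed in T]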
Chart 10
• Translating a source program sort
• Expressed in language Java
• Using a Java-into-x86 translator
• Running on an x86 machine
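[Tombstone diagram: sort expressed in Java is fed to a Java-into-x86 compiler expressed in x86, running on an x86 machine, which produces sort expressed in x86]
The object program runs on the same machine as the compiler
[Tombstone diagram: sort expressed in x86 running on an x86 machine]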
Chart 11
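• Translating a source program sort
• Expressed in language Java
• Using a Java-into-PPC translator
• Running on an x86 machine
• Downloaded to a PPC machine
[Tombstone diagram: sort expressed in Java is fed to a Java-into-PPC compiler expressed in x86, running on an x86 machine, which produces sort expressed in PPC]
Cross Compiler: the object program runs on a different machine than the compiler
[Tombstone diagram: sort expressed in PPC is downloaded to, and runs on, a PPC machine]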
Chart 12
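Two-stage Compiler: the source program is translated to another language before being translated into the object program
• Translating a source program sort
• Expressed in language Java
• Using a Java-into-C translator
• Running on an x86 machine
[Tombstone diagram: sort expressed in Java is fed to a Java-into-C translator expressed in x86, running on an x86 machine, which produces sort expressed in C]
• Then translating the C program
• Using a C-into-x86 compiler
• Running on an x86 machine
• Into an x86 object program
[Tombstone diagram: sort expressed in C is fed to a C-into-x86 compiler expressed in x86, running on an x86 machine, which produces sort expressed in x86; sort then runs on an x86 machine]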
Chart 13
Translator Rules
o Can run on machine M only if it is expressed in machine code M
o Source program must be expressed in translator’s source language S
o Object program is expressed in the translator’s target language T
o Object program is semantically equivalent to the source program
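The translator rules can be checked mechanically. The sketch below is a minimal model of tombstone diagrams, not part of the original slides: the class names `Program` and `Translator` and the function `translate` are hypothetical, and semantic equivalence is simply assumed rather than verified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Program:
    name: str
    language: str      # language the program is expressed in


@dataclass(frozen=True)
class Translator:
    source: str        # S: language it accepts
    target: str        # T: language it generates
    language: str      # L: language the translator itself is expressed in


def translate(program, translator, machine):
    """Apply the translator rules; return the object program."""
    # Rule 1: the translator runs on machine M only if expressed in machine code M.
    if translator.language != machine:
        raise ValueError(f"translator is expressed in {translator.language}, not {machine}")
    # Rule 2: the source program must be expressed in the source language S.
    if program.language != translator.source:
        raise ValueError(f"{program.name} is not expressed in {translator.source}")
    # Rule 3: the object program is expressed in the target language T.
    return Program(program.name, translator.target)


sort_java = Program("sort", "Java")
native = Translator("Java", "x86", "x86")      # ordinary compiler
cross = Translator("Java", "PPC", "x86")       # cross compiler
java_to_c = Translator("Java", "C", "x86")     # first stage of a two-stage compiler
c_to_x86 = Translator("C", "x86", "x86")       # second stage

print(translate(sort_java, native, "x86"))
print(translate(sort_java, cross, "x86"))
print(translate(translate(sort_java, java_to_c, "x86"), c_to_x86, "x86"))
```

Trying to run the cross compiler on the PPC machine itself, `translate(sort_java, cross, "PPC")`, raises `ValueError`, because that compiler is expressed in x86 code.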
Chart 14
Interpreter: accepts any program (the source program) expressed in a particular language (the source language) and runs that source program immediately
o Does not translate the source program into object code prior to execution
Chart 15
[Figure: the Interpreter reads the Source Program in a loop of Fetch Instruction, Analyze Instruction, Execute Instruction, repeated until Program Complete]
• Source program starts to run as soon as the first instruction is analyzed
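The fetch, analyze, execute cycle can be sketched for a toy command language. The three instructions `SET`, `ADD`, and `PRINT` are invented for this example and are not taken from any real interpreter.

```python
def interpret(source_lines):
    """Run a toy program one instruction at a time; return the printed values."""
    variables = {}
    output = []
    pc = 0
    while pc < len(source_lines):            # loop until the program is complete
        instruction = source_lines[pc]       # fetch
        op, *args = instruction.split()      # analyze
        if op == "SET":                      # execute
            variables[args[0]] = int(args[1])
        elif op == "ADD":
            variables[args[0]] += variables[args[1]]
        elif op == "PRINT":
            print(variables[args[0]])
            output.append(variables[args[0]])
        else:
            raise ValueError(f"line {pc + 1}: unknown instruction {op!r}")
        pc += 1
    return output


program = ["SET x 2", "SET y 3", "ADD x y", "PRINT x"]
result = interpret(program)
```

Because each instruction is analyzed only when it is reached, a malformed instruction late in the program is not reported until all earlier instructions have already run.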
Chart 16
When to Use Interpretation
o Interactive mode: want to see the results of an instruction before entering the next instruction
o Only use the program once
o Each instruction expected to be executed only once
o Instructions have simple formats
Disadvantages
o Slow: up to 100 times slower than in machine code
Chart 17
Examples
o Basic
o Lisp
o Unix Command Language (shell)
o SQL
Chart 18
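[Tombstone diagram: S interpreter expressed in language L]
S interpreter expressed in language L
[Tombstone diagram: program P expressed in S, on an S interpreter expressed in M, on machine M]
Program P expressed in language S, run by an S interpreter, running on machine M
[Tombstone diagram: program graph expressed in Basic, on a Basic interpreter expressed in x86, on an x86 machine]
Program graph written in Basic, running on a Basic interpreter executed on an x86 machine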
Chart 19
Hardware emulation: using software to execute one set of machine code on another machine
o Can measure everything about the new machine except its speed
o Abstract machine: emulator
o Real machine: actual hardware
An abstract machine is functionally equivalent to a real machine if they both implement the same language L
Chart 20
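[Tombstone diagrams: the nmi interpreter expressed in C is fed to a C-into-M compiler expressed in M, running on machine M, which produces the nmi interpreter expressed in M; program P expressed in nmi code then runs on the nmi interpreter on machine M, which behaves like P running on an nmi machine]
New Machine Instruction (nmi) interpreter written in C
Compiler to translate a C program into M machine code
The nmi interpreter is translated into machine code M using the C compiler
nmi interpreter expressed in machine code M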
Chart 21
Interpretive compiler: a combination of compiler and interpreter
o Translates the source program into an intermediate language
o The intermediate language is intermediate in level between the source language and ordinary machine code
o Its instructions have simple formats, and therefore can be analyzed easily and quickly
o Translation from the source language into the intermediate language is easy and fast
An interpretive compiler combines fast compilation with tolerable running speed
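A minimal sketch of the idea, assuming a made-up stack-based intermediate language with `PUSH`, `ADD`, `SUB`, and `MUL` instructions (these names are illustrative and are not JVM bytecodes). Python's standard `ast` module does the parsing; the compile step emits the simple instructions and a separate loop interprets them.

```python
import ast

# Hypothetical intermediate-code mnemonics for three operators.
OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}


def compile_expr(source):
    """Compile step: translate an arithmetic expression into stack code."""
    code = []

    def emit(node):
        if isinstance(node, ast.Constant):
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp) and type(node.op) in OPS:
            emit(node.left)
            emit(node.right)
            code.append((OPS[type(node.op)], None))
        else:
            raise ValueError("unsupported expression")

    emit(ast.parse(source, mode="eval").body)
    return code


def run(code):
    """Interpret step: each instruction has a simple format, so analysis is trivial."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            if op == "ADD":
                stack.append(left + right)
            elif op == "SUB":
                stack.append(left - right)
            else:  # "MUL"
                stack.append(left * right)
    return stack.pop()


code = compile_expr("2 + 3 * 4")
print(code)
print(run(code))
```

The intermediate code can be kept and run repeatedly without re-parsing the source, which is where the tolerable running speed comes from.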
Chart 22
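[Tombstone diagrams: Java-into-JVM translator expressed in M; JVM interpreter expressed in M]
Java-into-JVM translator running on machine M
JVM-code interpreter running on machine M
[Tombstone diagrams: P expressed in Java is fed to the Java-into-JVM translator, running on machine M, which produces P expressed in JVM code; P expressed in JVM code then runs on the JVM interpreter on machine M]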
A Java program P is first translated into JVM-code, and then the JVM-code object program is interpreted
Chart 23
A program is portable if it can be compiled and run on any machine, without change
o A portable program is more valuable than an unportable one, because its development cost can be spread over more copies
o Portability is measured by the proportion of code that remains unchanged when it is moved to a dissimilar machine
  • Language affects portability
  • Assembly language: 0% portable
  • High-level language: approaches 100% portability
Chart 24
Language Processors
o Valuable and widely used programs
o Typically written in a high-level language
  • Pascal, C, Java
o Part of a language processor is machine dependent
  • Code generation part
  • Language processor is only about 50% portable
o A compiler that generates intermediate code is more portable than a compiler that generates machine code
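The portability measure is a simple proportion. In the sketch below the function name and the line counts are hypothetical, chosen only so that the result matches the "about 50%" figure quoted for a language processor whose code generator must be rewritten.

```python
def portability(total_lines, changed_lines):
    """Proportion of code left unchanged when moving to a dissimilar machine."""
    if total_lines <= 0 or not 0 <= changed_lines <= total_lines:
        raise ValueError("line counts are inconsistent")
    return (total_lines - changed_lines) / total_lines


# Hypothetical compiler: 20,000 lines, of which a 10,000-line code generator
# has to be rewritten for the new machine.
print(portability(20_000, 10_000))

# Hypothetical assembly-language program: every line must be rewritten.
print(portability(5_000, 5_000))
```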
Chart 25
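1. Start with the following
[Tombstone diagrams: Java-into-JVM translator expressed in Java; Java-into-JVM translator expressed in JVM code; JVM interpreter expressed in Java]
2. Rewrite the interpreter in C
[Tombstone diagrams: JVM interpreter expressed in C; Java-into-JVM translator expressed in JVM code]
Note: a C-into-M compiler exists; rewrite the JVM interpreter from Java to C
3. Compile the interpreter
[Tombstone diagram: the JVM interpreter expressed in C is fed to the C-into-M compiler expressed in M, running on machine M, which produces the JVM interpreter expressed in M]
4. Java program P is translated into JVM program P and run using the JVM interpreter
[Tombstone diagrams: P expressed in Java is fed to the Java-into-JVM translator expressed in JVM code, running on the JVM interpreter on machine M, which produces P expressed in JVM code; P expressed in JVM code then runs on the JVM interpreter on machine M]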
Chart 26
The language processor is used to process itself
o Implementation language is the source language
Bootstrapping a portable compiler
o A portable compiler can be bootstrapped to make a true compiler (one that generates machine code) by writing an intermediate-language-into-machine-code translator
Full bootstrap
o Writing the compiler in itself
o Using the latest version to upgrade the next version
Half bootstrap
o Compiler expressed in itself but targeted for another machine
Bootstrapping to improve efficiency
o Upgrade the compiler to optimize code generation as well as to improve compile efficiency
Chart 27
Bootstrap an interpretive compiler to generate machine code
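1. First, write a JVM-code-into-M translator in Java
[Tombstone diagram: JVM-into-M translator expressed in Java]
2. Next, compile the translator using the existing Java-into-JVM translator, which runs on the JVM interpreter
[Tombstone diagram: the JVM-into-M translator expressed in Java is fed to the Java-into-JVM translator expressed in JVM code, running on the JVM interpreter on machine M, which produces the JVM-into-M translator expressed in JVM code]
3. Use the translator to translate itself
[Tombstone diagram: the JVM-into-M translator expressed in JVM code is fed to itself, running on the JVM interpreter on machine M, which produces the JVM-into-M translator expressed in M]
4. Translate the Java-into-JVM-code translator into machine code
[Tombstone diagram: the Java-into-JVM translator expressed in JVM code is fed to the JVM-into-M translator expressed in M, running on machine M, which produces the Java-into-JVM translator expressed in M]
5. The result is a two-stage Java-into-M compiler
[Tombstone diagram: P expressed in Java is fed to the Java-into-JVM translator on machine M, giving P expressed in JVM code, which is fed to the JVM-into-M translator on machine M, giving P expressed in M]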
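The five steps can be traced with a small model of tombstones. Everything here is a hypothetical sketch: `Translator`, `compile_with`, and the `runnable` set (the languages that can execute on machine M, either directly or through the existing JVM interpreter) are names introduced for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Translator:
    source: str
    target: str
    language: str    # language the translator itself is expressed in


# Machine M runs M code directly and JVM code through the existing interpreter.
runnable = {"M", "JVM"}


def compile_with(subject, compiler):
    """Re-express `subject` by feeding its text through `compiler`."""
    if compiler.language not in runnable:
        raise ValueError(f"cannot run a translator expressed in {compiler.language}")
    if subject.language != compiler.source:
        raise ValueError(f"compiler does not accept {subject.language}")
    return replace(subject, language=compiler.target)


java_to_jvm = Translator("Java", "JVM", "JVM")           # existing interpretive compiler

step1 = Translator("JVM", "M", "Java")                   # 1. write JVM-into-M in Java
step2 = compile_with(step1, java_to_jvm)                 # 2. now expressed in JVM code
step3 = compile_with(step2, step2)                       # 3. translates itself into M
step4 = compile_with(java_to_jvm, step3)                 # 4. Java-into-JVM now in M

print(step3)   # JVM-into-M translator expressed in M
print(step4)   # Java-into-JVM translator expressed in M
```

After step 4 both stages are expressed in M, so the two-stage compiler of step 5 no longer needs the interpreter.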
Chart 28
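Full bootstrap
1. Write the Ada-S compiler in C (v1)
[Tombstone diagram: Ada-S-into-M compiler v1 expressed in C]
2. Compile v1 using the C compiler
[Tombstone diagram: v1 expressed in C is fed to the C-into-M compiler expressed in M, running on machine M, which produces v1 expressed in M]
3. Convert the C version of the Ada-S compiler into an Ada-S version (v2)
[Tombstone diagram: Ada-S-into-M compiler v2 expressed in Ada-S]
4. Use v1 to compile v2
[Tombstone diagram: v2 expressed in Ada-S is fed to v1 expressed in M, running on machine M, which produces v2 expressed in M]
5. Extend the Ada-S compiler to a (full) Ada compiler (v3)
[Tombstone diagram: Ada-into-M compiler v3 expressed in Ada-S]
6. Compile the full Ada compiler (v3) using the Ada-S compiler (v2)
[Tombstone diagram: v3 expressed in Ada-S is fed to v2 expressed in M, running on machine M, which produces v3 expressed in M]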
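The version chain of the full bootstrap can be sketched as three builds, each using the compiler produced by the previous one. The `Compiler` class, the `build` function, and the name `cc` for the existing C compiler are assumptions made for this illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Compiler:
    version: str
    source: str      # language it accepts
    target: str      # language it generates
    language: str    # language it is expressed in


def build(subject, using, machine="M"):
    """Compile `subject` with `using`, which must already run on `machine`."""
    if using.language != machine:
        raise ValueError(f"{using.version} is not expressed in {machine} code")
    if subject.language != using.source:
        raise ValueError(f"{using.version} does not accept {subject.language}")
    return replace(subject, language=using.target)


cc = Compiler("cc", "C", "M", "M")                          # existing C compiler

v1 = build(Compiler("v1", "Ada-S", "M", "C"), using=cc)     # steps 1 and 2
v2 = build(Compiler("v2", "Ada-S", "M", "Ada-S"), using=v1) # steps 3 and 4
v3 = build(Compiler("v3", "Ada", "M", "Ada-S"), using=v2)   # steps 5 and 6

for compiler in (v1, v2, v3):
    print(compiler)
```

From v2 onward the C compiler is no longer needed: every later version is written in Ada-S and built by its predecessor.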
Chart 29
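Half bootstrap
1. Start with an Ada compiler that generates machine code for machine HM, expressed in Ada
[Tombstone diagram: Ada-into-HM compiler expressed in Ada]
2. The same Ada compiler that generates machine code for machine HM, expressed in HM
[Tombstone diagram: Ada-into-HM compiler expressed in HM]
3. Rewrite the Ada compiler that generates HM code into an Ada compiler that generates machine code for machine TM
[Tombstone diagram: Ada-into-TM compiler expressed in Ada]
4. Use the existing Ada compiler for HM to compile the new compiler, giving an Ada compiler that runs on HM but generates code for TM
[Tombstone diagram: the Ada-into-TM compiler expressed in Ada is fed to the Ada-into-HM compiler expressed in HM, running on machine HM, which produces the Ada-into-TM compiler expressed in HM]
5. Make sure the compiler works properly
[Tombstone diagram: P expressed in Ada is fed to the Ada-into-TM compiler expressed in HM, running on machine HM, which produces P expressed in TM; P then runs on machine TM]
6. Use the compiler to compile itself. We now have an Ada compiler that generates TM code and runs on machine TM
[Tombstone diagram: the Ada-into-TM compiler expressed in Ada is fed to the Ada-into-TM compiler expressed in HM, running on machine HM, which produces the Ada-into-TM compiler expressed in TM]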
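A half bootstrap compiles the same retargeted source twice: once with the old compiler and once with the result of that first build. The sketch below models this with hypothetical names (`Translator`, `compile_on`); the testing of step 5 is not modeled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Translator:
    source: str
    target: str
    language: str    # language the translator itself is expressed in


def compile_on(machine, subject, compiler):
    """Run `compiler` on `machine` to re-express `subject`."""
    if compiler.language != machine:
        raise ValueError(f"compiler is expressed in {compiler.language}, not {machine}")
    if subject.language != compiler.source:
        raise ValueError(f"compiler does not accept {subject.language}")
    return replace(subject, language=compiler.target)


ada_hm = Translator("Ada", "HM", "HM")        # steps 1 and 2: existing compiler on HM
ada_tm_src = Translator("Ada", "TM", "Ada")   # step 3: retargeted, still in Ada

cross = compile_on("HM", ada_tm_src, ada_hm)  # step 4: runs on HM, generates TM code
native = compile_on("HM", ada_tm_src, cross)  # step 6: the compiler compiles itself

print(cross)
print(native)
```

Both builds happen on machine HM; only the final result, `native`, is expressed in TM code and can be moved to machine TM.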
Chart 30
Bootstrap to improve efficiency
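1. Start with v1, targeted to M-slow, written in Ada and compiled to M-slow code
[Tombstone diagrams: Ada-into-Ms compiler v1 expressed in Ada; the same compiler v1 expressed in Ms]
2. Rewrite v1, in Ada, to generate M-fast code (v2)
[Tombstone diagram: Ada-into-Mf compiler v2 expressed in Ada]
3. Use v1 to compile v2
[Tombstone diagram: v2 expressed in Ada is fed to v1 expressed in Ms, running on machine M, which produces v2 expressed in Ms]
4. Compile a program using v2; it will compile slowly but run fast
[Tombstone diagram: P expressed in Ada is fed to v2 expressed in Ms, running on machine M, which produces P expressed in Mf; P then runs on machine M]
5. Use v2 to compile v2 to produce v3, which will be the fast compiler
[Tombstone diagram: v2 expressed in Ada is fed to v2 expressed in Ms, running on machine M, which produces the Ada-into-Mf compiler v3 expressed in Mf]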