Chapter 3 Program translation 1
Chapter 3: Language Translation
• Syntax and Semantics
• Translation phases
• Formal translation models
Syntax
• What is a valid string of the language?
– First pass of a compiler
• Error messages (are they helpful?)
– A compiler-compiler (parser generator) such as YACC can automatically generate a parser from a BNF grammar
Good syntax criteria
– Assist in readability
• COBOL as self-documenting
• Comments
• Length of identifiers
• Overloading of names
– Examples of poor features for readability
• Blank as the concatenation operation in SNOBOL
• Identifier names in BASIC
– X1, Y
• Implicit typing
• Late binding
Good syntax criteria (cont.)
• Assist in writeability
– Few and concise statements
– Rich library, created by language and user
– Support of abstraction
– Orthogonality
• Examples of poor features for writeability
– Large number of constructs
– Lack of necessary constructs
– Redundancy
– Ambiguity
• Ex: if statement
– Case sensitivity??
Syntactic elements
• Character set
– 5, 6, 7, 8, 16 bit encoding schemes
• Identifiers
– Symbols such as letters, digits, $, _, blank
– Length limitation
• Operation symbols: various examples
– LISP: prefix identifiers (ex: PLUS)
– APL: special Greek characters
– FORTRAN: .EQ., .GT.
– C: &&, ==
– Java: & and &&, | and ||
Syntactic elements (cont.)
• Keyword
– Identifier used as a fixed part of a primitive program unit (ex: if, then, else, case)
• Reserved word
– Keyword that cannot be used as a programmer-defined identifier
• READ is not a reserved word in Pascal
– Adding new reserved words in an update of a language can make old programs incorrect (upward compatibility)
Syntactic elements (cont.)
• Noise words
– Used to improve readability; optional
• Ex: perform 5 [times]
• Comments
– Used for documentation; readability
• Blanks
– Completely ignored in FORTRAN
• Ex: DO 10 I = 1.5 is read as an assignment to the variable DO10I, whereas DO 10 I = 1,5 starts a loop
• Delimiters and brackets
– Spaces, ; , paired () [] {} begin end
• Fixed format vs free format
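The effect of ignoring blanks can be sketched in a few lines. This is only an illustration: `classify_fortran_statement` is a hypothetical helper, not part of any real FORTRAN front end, and it recognizes just the two statement shapes needed for the example.

```python
import re

def classify_fortran_statement(line: str) -> str:
    """Classify a statement the way a blank-insensitive scanner might.

    Hypothetical helper for illustration only.
    """
    # Blanks carry no meaning, so they are removed before any decision is made.
    squeezed = line.replace(" ", "").upper()
    # A DO loop needs: DO <label> <variable> = <start> , <end>
    if re.fullmatch(r"DO\d+[A-Z][A-Z0-9]*=[^,]+,.+", squeezed):
        return "loop"
    # Otherwise <name> = <expression> is an ordinary assignment.
    if re.fullmatch(r"[A-Z][A-Z0-9]*=.+", squeezed):
        return "assignment to " + squeezed.split("=")[0]
    return "unknown"
```

With this sketch, `DO 10 I = 1,5` is classified as a loop, while `DO 10 I = 1.5` becomes an assignment to `DO10I`: a single character decides the meaning of the whole statement.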
Program Structure
• Expression
– Precedence rules
• Statements
– Structured programming
• Modules / functions / subprograms / classes
– Nested units
• Static checks, efficient code for nonlocal references
– Separate unit compilation
– Data and operations are compiled as a unit in classes
– Interface issues: function specifications (prototypes) allow static checks
– Specifications (.h files) separate from implementations
Translation I- Lexical Analysis
– Byte stream organized into lexemes, each of which is identified (tagged)
– Numbers may be converted to binary
– Identifiers are stored in symbol table
– Tokens are output for syntactic analysis
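The steps above can be sketched as a tiny scanner. The token categories (`NUMBER`, `IDENT`, `OP`) and the function name `tokenize` are assumptions made for this illustration; a real language would define its own lexical rules.

```python
import re

# Token categories are assumptions for this sketch, not a real language definition.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Split source into tagged tokens and build a symbol table."""
    symbol_table = {}   # identifier name -> index in the table
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"illegal character {source[pos]!r} at position {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "SKIP":
            continue                                  # whitespace produces no token
        if kind == "NUMBER":
            tokens.append((kind, int(lexeme)))        # convert the digits to a binary integer
        elif kind == "IDENT":
            index = symbol_table.setdefault(lexeme, len(symbol_table))
            tokens.append((kind, index))              # token refers to the symbol table entry
        else:
            tokens.append((kind, lexeme))
    return tokens, symbol_table
```

For the input `x1 = x1 + 42`, both occurrences of `x1` map to the same symbol table entry, and the parser only ever sees tagged tokens rather than raw characters.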
Translation II parsing – syntactic analysis
• Tokens organized into expressions, statements, etc.
• Is the input a valid string in the language?
• Generates parse tree, tables
• Produces error messages for invalid strings
Translation III semantic analysis
• Produces error messages for invalid constructs
– Ex: identifier not declared; type mismatch
• Compiled languages use and discard symbol table
– Reference to variable as offset from data sections
• Information must be stored together with identifier (ex: type, range limitations)
• Macro substitutions
• Compiler directives
– #define
– #ifndef
– Pragma suppress range_checks
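The two example errors (identifier not declared, type mismatch) can be sketched as a lookup against the symbol table. The function `check_assignments` and its simplified inputs (a dictionary of declared types and a list of target/type pairs) are assumptions for this illustration, not how any particular compiler represents a program.

```python
def check_assignments(declarations, assignments):
    """Return semantic error messages for a list of assignments.

    declarations: dict mapping identifier -> declared type name
    assignments:  list of (target identifier, type of the assigned value)
    Both shapes are simplifications assumed for this sketch.
    """
    errors = []
    for target, value_type in assignments:
        if target not in declarations:
            errors.append(f"identifier '{target}' not declared")
        elif declarations[target] != value_type:
            errors.append(
                f"type mismatch: '{target}' is {declarations[target]}, got {value_type}"
            )
    return errors
```

Both checks depend on information stored with the identifier, which is why the symbol table has to record more than the name.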
Translation IV optimization
• Semantic analysis output is typically one statement at a time
• Compiler can optimize code to obtain results as efficient as assembly code
– Ex: Save intermediate results in registers
– Remove constant operations from loop
– Change 2-dimensional array storage
• Code generation
• Linking and loading
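Removing constant operations from a loop can be shown as a before/after pair. The transformation is done by hand here to show what an optimizing compiler would do automatically; the function names are made up for the example.

```python
def scale_unoptimized(values, a, b):
    """Straightforward translation: the factor is recomputed on every iteration."""
    result = []
    for v in values:
        factor = a * b + 1          # does not depend on v, so it is loop-invariant
        result.append(v * factor)
    return result

def scale_hoisted(values, a, b):
    """Same computation after moving the invariant expression out of the loop."""
    factor = a * b + 1              # computed once, before the loop
    result = []
    for v in values:
        result.append(v * factor)
    return result
```

Both functions return the same list; the second simply performs the multiplication and addition once instead of once per element.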
BNF (Backus Normal/Naur Form)
• Metalanguage
::= defined as
| alternative
<> nonterminal
{} later introduced for iteration
[] for optional
sequence is implicit
ex:
<unsigned integer> ::= <digit> | <unsigned integer> <digit>
<digit> ::= 0|1|2|3|4|5|6|7|8|9
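The two rules for `<unsigned integer>` translate almost directly into a recursive recognizer. This is a minimal sketch that mirrors the grammar rule by rule; the function names are chosen for the example, and a practical scanner would use a loop instead of recursion.

```python
DIGITS = set("0123456789")

def is_digit(s):
    # <digit> ::= 0|1|2|3|4|5|6|7|8|9
    return len(s) == 1 and s in DIGITS

def is_unsigned_integer(s):
    # <unsigned integer> ::= <digit> | <unsigned integer> <digit>
    if is_digit(s):
        return True                      # first alternative
    # second alternative: a shorter unsigned integer followed by one digit
    return len(s) > 1 and is_unsigned_integer(s[:-1]) and is_digit(s[-1])
```

The left recursion in the grammar shows up as the recursive call on `s[:-1]`: each step peels off the last digit until a single digit remains.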
Context Free Grammars
For balanced parentheses:
S → SS | (S) | ()
Problem: generate a parse tree for a string such as (()(()))((())()) using the grammar above
Some language definition issues are context sensitive, such as: each identifier must be declared before use
Implementation issues, such as pass by value or reference
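One way to approach the parse tree problem is a small recursive-descent parser for the balanced-parentheses grammar. This is a sketch under stated assumptions: the tree is represented as nested tuples, the function names are made up, and because the rule S → SS is ambiguous for three or more adjacent groups, the sketch arbitrarily groups them from left to right.

```python
def parse_balanced(text):
    """Return a parse tree for S -> SS | (S) | (), or raise ValueError."""
    tree, pos = _parse_s(text, 0)
    if pos != len(text):
        raise ValueError(f"unexpected {text[pos]!r} at position {pos}")
    return tree

def _parse_s(text, pos):
    # One or more parenthesized groups, combined left to right with S -> SS.
    tree, pos = _parse_group(text, pos)
    while pos < len(text) and text[pos] == "(":
        right, pos = _parse_group(text, pos)
        tree = ("S", tree, right)
    return tree, pos

def _parse_group(text, pos):
    if pos >= len(text) or text[pos] != "(":
        raise ValueError(f"expected '(' at position {pos}")
    pos += 1
    if pos < len(text) and text[pos] == ")":
        return ("S", "(", ")"), pos + 1            # S -> ()
    inner, pos = _parse_s(text, pos)
    if pos >= len(text) or text[pos] != ")":
        raise ValueError(f"expected ')' at position {pos}")
    return ("S", "(", inner, ")"), pos + 1         # S -> (S)
```

Calling `parse_balanced("(()(()))((())())")` yields a root node built with S → SS whose two children are the two outer groups; an unbalanced string such as `(()` raises `ValueError`.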
Syntax Charts
Term at top left is defined by the following graph
Graph branches for alternative
Empty branch for optional
Box around string for nonterminal
Circle for terminal
Arrow back for iteration
ex: p. 96 in text
Sequence is explicit
Finite-State Automata
Table used for lexical analysis
Ex: valid floating point number
(note that limitations on range and precision are not specified)
States: (whole part) (decimal) (fractional) (exp) (exp value)
Whole part, fractional, and exp value each have a looping arrow
Digit is input to whole part
. is input leading to decimal
Digit leads from decimal to fractional
E leads from fractional to exp
Digit leads from exp to exp value
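The automaton described above can be written as a transition table and a short driver loop. Two things are assumptions of this sketch because the slide does not state them: an explicit `start` state before the first digit, and the choice of `fractional` and `exp_value` as the accepting states.

```python
# (current state, input symbol) -> next state; missing entries mean "reject".
TRANSITIONS = {
    ("start", "digit"): "whole",              # assumed start state
    ("whole", "digit"): "whole",              # looping arrow
    ("whole", "."): "decimal",
    ("decimal", "digit"): "fractional",
    ("fractional", "digit"): "fractional",    # looping arrow
    ("fractional", "E"): "exp",
    ("exp", "digit"): "exp_value",
    ("exp_value", "digit"): "exp_value",      # looping arrow
}
ACCEPTING = {"fractional", "exp_value"}       # assumption: not given on the slide

def is_float(text):
    """Run the table-driven automaton over text and report acceptance."""
    state = "start"
    for ch in text:
        symbol = "digit" if ch in "0123456789" else ch
        state = TRANSITIONS.get((state, symbol))
        if state is None:
            return False                      # no transition defined for this input
    return state in ACCEPTING
```

Under these assumptions `3.14` and `6.02E23` are accepted, while `42`, `3.`, `.5`, and `1.5E-3` are rejected, the last because the automaton as described has no transition for a sign in the exponent.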