Chapter 10: Compilers and Language Translation

  • Published on
    05-Feb-2016


DESCRIPTION

Chapter 10: Compilers and Language Translation. Invitation to Computer Science, C++ Version, Third Edition. Objectives: in this chapter, you will learn about the compilation process; Phase I: lexical analysis; Phase II: parsing; Phase III: semantics and code generation.

Transcript

Chapter 10: Compilers and Language Translation
Invitation to Computer Science, C++ Version, Third Edition

Objectives

In this chapter, you will learn about:
  • The compilation process
  • Phase I: Lexical analysis
  • Phase II: Parsing
  • Phase III: Semantics and code generation
  • Phase IV: Code optimization

Introduction

  • High-level language instructions must be translated into machine language prior to execution
  • Compiler: a piece of system software that translates high-level languages into machine language
  • Goals of a compiler when performing a translation
      • Correctness
      • Producing reasonably efficient and concise machine language code

Figure 10.1: General Structure of a Compiler

The Compilation Process

  • Phase I: Lexical analysis
      • The compiler examines the individual characters in the source program and groups them into syntactical units called tokens
  • Phase II: Parsing
      • The sequence of tokens formed by the scanner is checked to see whether it is syntactically correct
  • Phase III: Semantic analysis and code generation
      • The compiler analyzes the meaning of the high-level language statement and generates the machine language instructions to carry out these actions
  • Phase IV: Code optimization
      • The compiler takes the generated code and sees whether it can be made more efficient
  • Final step: the object program is written to an object file
  • Source program: the original high-level language program
  • Object program: the machine language translation of the source program

Figure 10.2: Overall Execution Sequence on a High-level Language Program

Phase I: Lexical Analysis

  • Lexical analyzer
      • The program that performs lexical analysis
      • More commonly called a scanner
  • Job of the lexical analyzer
      • Group input characters into tokens
      • Tokens: syntactical units that are treated as single, indivisible entities for the purposes of translation
      • Classify tokens according to their type
  • Input to a scanner
      • A high-level language statement from the source program
  • Scanner's output
      • A list of all the tokens in that statement
      • The classification number of each token found

Figure 10.3: Typical Token Classifications

Phase II: Parsing, Introduction

  • Parsing phase
      • The compiler determines whether the tokens recognized by the scanner form a syntactically legal statement
      • Performed by a parser
  • Output of a parser
      • A parse tree, if such a tree exists
      • An error message, if a parse tree cannot be constructed
  • Successful construction of a parse tree is proof that the statement is correctly formed
  • Example high-level language statement: a = b + c

Grammars, Languages, and BNF

  • Syntax
      • The grammatical structure of the language
      • The parser must be given the syntax of the language
  • BNF (Backus-Naur Form)
      • Most widely used notation for representing the syntax of a programming language
      • The syntax of a language is specified as a set of rules (also called productions)
  • A grammar: the entire collection of rules for a language
  • Structure of an individual BNF rule
      left-hand side ::= definition
  • BNF rules use two types of objects on the right-hand side of a production
      • Terminals
          • The actual tokens of the language
          • Never appear on the left-hand side of a BNF rule
      • Nonterminals
          • Intermediate grammatical categories used to help explain and organize the language
          • Must appear on the left-hand side of one or more rules
  • Goal symbol
      • The highest-level nonterminal
      • The nonterminal object that the parser is trying to produce as it builds the parse tree
  • All nonterminals are written inside angle brackets

Parsing Concepts and Techniques

  • Fundamental rule of parsing
      • If, by repeated applications of the rules of the grammar, a parser can convert the sequence of input tokens into the goal symbol, then that sequence of tokens is a syntactically valid statement of the language
      • If the parser cannot convert the input tokens into the goal symbol, then the sequence is not a syntactically valid statement of the language
  • One of the biggest problems in building a compiler is designing a grammar that:
      • Includes every valid statement that we want to be in the language
      • Excludes every invalid statement that we do not want to be in the language
  • Another problem in constructing a compiler: designing a grammar that is not ambiguous
      • An ambiguous grammar allows the construction of two or more distinct parse trees for the same statement

Phase III: Semantics and Code Generation

  • Semantic analysis
      • The compiler makes a first pass over the parse tree to determine whether all branches of the tree are semantically valid
      • If they are valid, the compiler can generate machine language instructions
      • If not, there is a semantic error; machine language instructions are not generated
  • Code generation
      • The compiler makes a second pass over the parse tree to produce the translated code

Phase IV: Code Optimization

  • Two types of optimization: local and global
  • Local optimization
      • The compiler looks at a very small block of instructions and tries to determine how it can improve the efficiency of this local code block
      • Relatively easy; included as part of most compilers
      • Examples of possible local optimizations
          • Constant evaluation
          • Strength reduction
          • Eliminating unnecessary operations
  • Global optimization
      • The compiler looks at large segments of the program to decide how to improve performance
      • Much more difficult; usually omitted from all but the most sophisticated and expensive production-level optimizing compilers
  • Optimization cannot make an inefficient algorithm efficient

Summary

  • A compiler is a piece of system software that translates high-level languages into machine language
  • Goals of a compiler: correctness, and producing efficient and concise code
  • Source program: the high-level language program
  • Object program: the machine language translation of the source program
  • Phases of the compilation process
      • Phase I: Lexical analysis
      • Phase II: Parsing
      • Phase III: Semantic analysis and code generation
      • Phase IV: Code optimization
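
To make Phases I and II concrete, here is a minimal sketch in Python of a scanner and a parser applied to the example statement a = b + c. Everything named here is an assumption for illustration and does not come from the slides: the token class names (VARIABLE, ASSIGN, PLUS), the functions tokenize and parse_assignment, and the two-rule BNF grammar in the comments. The slides' own token classification numbers (Figure 10.3) and grammar are not reproduced in this transcript.

```python
import re

# Hypothetical token classes (stand-ins for the classifications in Figure 10.3).
TOKEN_PATTERNS = [
    ("VARIABLE", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("ASSIGN", r"="),
    ("PLUS", r"\+"),
]


def tokenize(statement):
    """Phase I: group input characters into (class, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(statement):
        if statement[pos].isspace():
            pos += 1  # whitespace separates tokens but is not a token itself
            continue
        for name, pattern in TOKEN_PATTERNS:
            match = re.compile(pattern).match(statement, pos)
            if match:
                tokens.append((name, match.group()))
                pos = match.end()
                break
        else:
            raise SyntaxError(f"unrecognized character {statement[pos]!r}")
    return tokens


# Assumed grammar, with <assignment> as the goal symbol:
#   <assignment> ::= <variable> = <expression>
#   <expression> ::= <variable> | <variable> + <expression>

def parse_assignment(tokens):
    """Phase II: return a parse tree as nested tuples, or raise SyntaxError."""
    pos = 0

    def expect(kind):
        # Consume the next token if it belongs to the expected class.
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != kind:
            found = tokens[pos][1] if pos < len(tokens) else "end of statement"
            raise SyntaxError(f"expected {kind}, found {found}")
        text = tokens[pos][1]
        pos += 1
        return text

    def expression():
        nonlocal pos
        left = ("variable", expect("VARIABLE"))
        if pos < len(tokens) and tokens[pos][0] == "PLUS":
            pos += 1
            return ("expression", left, "+", expression())
        return ("expression", left)

    target = ("variable", expect("VARIABLE"))
    expect("ASSIGN")
    tree = ("assignment", target, "=", expression())
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos][1]!r}")
    return tree


tokens = tokenize("a = b + c")
print(tokens)
print(parse_assignment(tokens))
```

A statement such as a = + c makes parse_assignment raise SyntaxError, because no sequence of rule applications converts its tokens into the goal symbol. That corresponds to the parser producing an error message instead of a parse tree.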
