23
COMP 144 Programming Language Concepts Felix Hernandez-Campos 1 Lecture 5: Lecture 5: Syntax Analysis Syntax Analysis COMP 144 Programming Language COMP 144 Programming Language Concepts Concepts Spring 2002 Spring 2002 Felix Hernandez-Campos Felix Hernandez-Campos Jan 18 Jan 18 The University of North Carolina at The University of North Carolina at Chapel Hill Chapel Hill

1 COMP 144 Programming Language Concepts Felix Hernandez-Campos Lecture 5: Syntax Analysis COMP 144 Programming Language Concepts Spring 2002 Felix Hernandez-Campos

  • View
    215

  • Download
    1

Embed Size (px)

Citation preview

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

11

Lecture 5: Lecture 5: Syntax AnalysisSyntax Analysis

COMP 144 Programming Language ConceptsCOMP 144 Programming Language Concepts

Spring 2002Spring 2002

Felix Hernandez-CamposFelix Hernandez-Campos

Jan 18Jan 18

The University of North Carolina at Chapel HillThe University of North Carolina at Chapel Hill

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

22

Review: Compilation/InterpretationReview: Compilation/Interpretation

Compiler or InterpreterCompiler or Interpreter

Translation Translation ExecutionExecution

Source CodeSource Code

Target CodeTarget Code

Interpre-Interpre-tationtation

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

33

Review: Syntax AnalysisReview: Syntax Analysis

Compiler or InterpreterCompiler or Interpreter

Translation Translation Execution Execution

Source CodeSource Code• Specifying the Specifying the formform

of a programming of a programming

languagelanguage

– TokensTokens» Regular Regular

ExpressionExpression

– SyntaxSyntax» Context-FreeContext-Free

GrammarsGrammars

Target CodeTarget Code

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

44

Phases of CompilationPhases of Compilation

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

55

Syntax AnalysisSyntax Analysis

• Syntax:Syntax:– Webster’s definition: Webster’s definition: 1 a : the way in which linguistic 1 a : the way in which linguistic

elements (as words) are put together to form constituents elements (as words) are put together to form constituents (as phrases or clauses)(as phrases or clauses)

• The syntax of a programming languageThe syntax of a programming language– Describes its formDescribes its form

» Organization of tokensOrganization of tokens (elements)(elements)» Context Free Grammars (CFGs)Context Free Grammars (CFGs)

– Must be Must be recognizablerecognizable by compilers and interpreters by compilers and interpreters» ParsingParsing» LL and LR parsersLL and LR parsers

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

66

Context Free GrammarsContext Free Grammars

• CFGsCFGs– Add recursion to regular expressionsAdd recursion to regular expressions

» Nested constructionsNested constructions

– NotationNotationexpressionexpression identifieridentifier | | numbernumber | | -- expressionexpression | | (( expressionexpression )) | | expressionexpression operatoroperator expressionexpressionoperator operator ++ | | -- | | ** | | //

» Terminal symbolsTerminal symbols» Non-terminal symbolsNon-terminal symbols» Production rule (i.e. substitution rule)Production rule (i.e. substitution rule)

terminal symbol terminal symbol terminal and non-terminal symbols terminal and non-terminal symbols

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

77

ParsersParsers

• ScannersScanners– Task: recognize language tokensTask: recognize language tokens– Implementation: Deterministic Finite AutomatonImplementation: Deterministic Finite Automaton

» Transition based on the next characterTransition based on the next character

• ParsersParsers– Task: recognize language syntax (organization of tokens)Task: recognize language syntax (organization of tokens)– Implementation:Implementation:

» Top-down parsingTop-down parsing» Bottom-up parsingBottom-up parsing

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

88

Parse TreesParse Trees

• A parse is graphical representation of a derivationA parse is graphical representation of a derivation

• ExampleExample

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

99

Parsing exampleParsing example

• Example: comma-separated list of identifierExample: comma-separated list of identifier

– CFGCFG

id_list id_list idid id_list_tailid_list_tailid_list_tail id_list_tail ,, id_list_tailid_list_tailid_list_tail id_list_tail ;;

– ParsingParsing

A, B, C;A, B, C;

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

1010

Top-down derivation of Top-down derivation of A, B, C;A, B, C;

CFGCFG

Left-to-right,Left-to-right,Left-most derivationLeft-most derivation

LL(1) parsingLL(1) parsing

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

1111

Top-down derivation of Top-down derivation of A, B, C;A, B, C;

CFGCFG

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

1212

Bottom-up parsing of Bottom-up parsing of A, B, C;A, B, C;

CFGCFG

Left-to-right,Left-to-right,Right-most derivationRight-most derivation

LR(1) parsingLR(1) parsing

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

1313

Bottom-up parsing of Bottom-up parsing of A, B, C;A, B, C;

CFGCFG

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

1414

Bottom-up parsing of Bottom-up parsing of A, B, C;A, B, C;

CFGCFG

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

1515

ParsingParsing

• Parsing an arbitrary Context Free GrammarParsing an arbitrary Context Free Grammar– O(nO(n33))– Too slow for large programsToo slow for large programs

• Linear-time parsingLinear-time parsing– LL parsers LL parsers

» Recognize LL grammarRecognize LL grammar» Use a top-down strategyUse a top-down strategy

– LR parsersLR parsers» Recognize LR grammarRecognize LR grammar» Use a bottom-up strategyUse a bottom-up strategy

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

1616

Hierarchy of Linear ParsersHierarchy of Linear Parsers

• Basic containment relationshipBasic containment relationship– All CFGs can be recognized by LR parserAll CFGs can be recognized by LR parser– Only a subset of all the CFGs can be recognized by LL Only a subset of all the CFGs can be recognized by LL

parsersparsers

LL parsingLL parsing

CFGsCFGs LR parsingLR parsing

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

1717

Recursive Descent Parser ExampleRecursive Descent Parser Example

• LL(1) grammarLL(1) grammar

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

1818

Recursive Descent Parser ExampleRecursive Descent Parser Example

• Outline of Outline of

recursive parserrecursive parser

– This parser onlyThis parser onlyverifies syntaxverifies syntax

– matchmatch is isthe scannerthe scanner

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

1919

Recursive Descent Parser ExampleRecursive Descent Parser Example

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

2020

Recursive Descent Parser ExampleRecursive Descent Parser Example

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

2121

Recursive Descent Parser ExampleRecursive Descent Parser Example

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

2222

Semantic AnalysisSemantic Analysis

Compiler or InterpreterCompiler or Interpreter

Translation Translation Execution Execution

Source CodeSource Code• Specifying the Specifying the meaningmeaning

of a programming of a programming

languagelanguage

– Attribute GrammarsAttribute Grammars

Target CodeTarget Code

COMP 144 Programming Language ConceptsFelix Hernandez-Campos

2323

Reading AssignmentReading Assignment

• Scott’s Chapter 2Scott’s Chapter 2– Section 2.2.2Section 2.2.2– Section 2.2.3Section 2.2.3