Syntax Analysis (Section 2.2-2.3)

CSCI 431 Programming Languages

Fall 2003

A modification of slides developed by Felix Hernandez-Campos at UNC Chapel Hill

Review: Compilation/Interpretation

[Figure: diagram labeled Source Code, Compiler or Interpreter (Translation, Execution), Target Code, Interpretation]

Review: Syntax Analysis

• Specifying the form of a programming language
  – Tokens
    » Regular Expressions (also F.A.s & Reg. Grammars)
  – Syntax
    » Context-Free Grammars (also P.D.A.s)

[Figure: diagram labeled Source Code, Compiler or Interpreter (Translation, Execution), Target Code]
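
To make the "tokens are specified by regular expressions" point concrete, here is a minimal scanner sketch. The token names and patterns are assumptions chosen for a small calculator-like language; they are not taken from the slides.

```python
import re

# Hypothetical token classes; each one is described by a regular expression.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("OP",     r"[+\-*/]"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SKIP",   r"\s+"),
]
# One combined pattern with a named group per token class.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (kind, lexeme) pairs; raise ValueError on an unknown character."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":      # whitespace separates tokens but is not one
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

The parser in the following slides would consume the list returned by `tokenize`, looking only at the token kinds.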

Phases of Compilation

Syntax Analysis

• Syntax:
  – Webster’s definition: 1 a : the way in which linguistic elements (as words) are put together to form constituents (as phrases or clauses)
• The syntax of a programming language
  – Describes its form
    » Organization of tokens
    » Context Free Grammars (CFGs)
  – Must be recognizable by compilers and interpreters
    » Parsing
    » LL and LR parsers

Context Free Grammars

• CFGs
  – Add recursion to regular expressions
    » Nested constructions
  – Notation
      expression → identifier | number | - expression | ( expression ) | expression operator expression
      operator → + | - | * | /
    » Terminal symbols
    » Non-terminal symbols
    » Production rule (i.e. substitution rule)
        non-terminal symbol → terminal and non-terminal symbols
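
One way to see the notation at work is to store the grammar as data. The sketch below encodes the two productions from this slide as a Python dictionary; the helper names (`terminals`, `derive`) are illustrative, not part of the course material.

```python
# The expression grammar from the slide: one entry per non-terminal,
# each alternative written as a list of symbols.
GRAMMAR = {
    "expression": [
        ["identifier"],
        ["number"],
        ["-", "expression"],
        ["(", "expression", ")"],
        ["expression", "operator", "expression"],
    ],
    "operator": [["+"], ["-"], ["*"], ["/"]],
}

def terminals(grammar):
    """Symbols that appear on a right-hand side but never on a left-hand side."""
    symbols = {s for alts in grammar.values() for rhs in alts for s in rhs}
    return symbols - set(grammar)

def derive(grammar, start, choices):
    """Apply the chosen alternative to the left-most non-terminal at each step."""
    form = [start]
    steps = [form]
    for choice in choices:
        i = next(k for k, s in enumerate(form) if s in grammar)
        form = form[:i] + grammar[form[i]][choice] + form[i + 1:]
        steps.append(form)
    return steps
```

Calling `derive(GRAMMAR, "expression", [3, 4, 0, 2, 0])` substitutes one production per step and ends with the nested form `( identifier * identifier )`, which is the kind of nesting regular expressions alone cannot describe.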

Parsing

• Parsing an arbitrary Context Free Grammar
  – O(n^3)
  – Too slow for large programs
• Linear-time parsing
  – LL parsers (a ‘Left-to-right, Left-most’ derivation)
    » Recognize LL grammar
    » Use a top-down strategy
  – LR parsers (a ‘Left-to-right, Right-most’ derivation)
    » Recognize LR grammar
    » Use a bottom-up strategy
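
The slide does not name a general-purpose algorithm. As an illustration of where a cubic bound can come from, here is a sketch of the CYK recognizer, one well-known O(n^3) method that requires a grammar in Chomsky normal form. The grammar below is a hypothetical normal-form encoding of a comma-separated identifier list ending in `;`, and the non-terminal names are made up for this example.

```python
# Hypothetical Chomsky-normal-form grammar for lists such as "id , id , id ;".
UNARY = [("I", "id"), ("C", ","), ("T", ";")]                  # A -> terminal
BINARY = [("L", "I", "T"), ("T", "C", "U"), ("U", "I", "T")]   # A -> B C

def cyk(tokens, unary=UNARY, binary=BINARY, start="L"):
    """Return True if the token list can be derived from `start`."""
    n = len(tokens)
    if n == 0:
        return False
    # table[i][length] holds the non-terminals that derive tokens[i:i+length].
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, tok in enumerate(tokens):
        table[i][1] = {lhs for lhs, terminal in unary if terminal == tok}
    # Three nested loops over length, start and split point give the n^3 factor.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for lhs, b, c in binary:
                    if b in left and c in right:
                        table[i][length].add(lhs)
    return start in table[0][n]
```

LL and LR parsers avoid the table of all substrings by committing to one production at a time, which is why they run in linear time but accept only restricted grammars.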

Parsing example

• Example: comma-separated list of identifiers
  – CFG
      id_list → id id_list_tail
      id_list_tail → , id id_list_tail
      id_list_tail → ;
  – Parsing
      A, B, C;

Top-down derivation of A, B, C;

[Figure: CFG]

Left-to-right, Left-most derivation

LL(1) parsing
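
Since the derivation figure did not survive, the following sketch reconstructs the idea with a table-driven predictive parser. It assumes the grammar id_list → id id_list_tail, id_list_tail → , id id_list_tail | ; and a scanner that has already turned A, B, C; into the tokens `id , id , id ;`. The table layout and function name are illustrative.

```python
# LL(1) parse table: (non-terminal on top of stack, lookahead token) -> right-hand side.
PARSE_TABLE = {
    ("id_list", "id"): ["id", "id_list_tail"],
    ("id_list_tail", ","): [",", "id", "id_list_tail"],
    ("id_list_tail", ";"): [";"],
}
NON_TERMINALS = {"id_list", "id_list_tail"}

def ll1_parse(tokens, start="id_list"):
    """Return the predicted productions in order, i.e. a left-most derivation."""
    stack = [start]          # what the parser still expects to see; top is the list end
    pos = 0
    derivation = []
    while stack:
        top = stack.pop()
        lookahead = tokens[pos] if pos < len(tokens) else None
        if top in NON_TERMINALS:
            rhs = PARSE_TABLE.get((top, lookahead))
            if rhs is None:
                raise ValueError(f"no production for {top} on {lookahead!r}")
            derivation.append((top, rhs))
            stack.extend(reversed(rhs))   # left-most symbol ends up on top
        else:
            if top != lookahead:
                raise ValueError(f"expected {top!r}, found {lookahead!r}")
            pos += 1                      # terminal matched: consume the token
    if pos != len(tokens):
        raise ValueError("input continues after a complete id_list")
    return derivation
```

Each prediction is made from a single token of lookahead, which is what the "1" in LL(1) refers to.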

Bottom-up parsing of A, B, C;

[Figure: CFG]

Left-to-right, Right-most derivation

LR parsing (a shift-reduce parser)
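
The bottom-up figures are missing, so here is a rough sketch of the shift-reduce idea on the same input. It is a naive greedy parser (reduce whenever the top of the stack matches a right-hand side, trying longer right-hand sides first), not a real table-driven LR parser, and it assumes the grammar id_list → id id_list_tail, id_list_tail → , id id_list_tail | ;.

```python
# Productions ordered so that longer right-hand sides are tried first.
PRODUCTIONS = [
    ("id_list_tail", [",", "id", "id_list_tail"]),
    ("id_list", ["id", "id_list_tail"]),
    ("id_list_tail", [";"]),
]

def shift_reduce(tokens, start="id_list"):
    """Return (accepted, trace) where trace lists the shift and reduce actions."""
    stack = []               # what the parser has already seen
    remaining = list(tokens)
    trace = []
    while True:
        for lhs, rhs in PRODUCTIONS:
            if stack[-len(rhs):] == rhs:
                del stack[-len(rhs):]     # pop the right-hand side...
                stack.append(lhs)         # ...and push its left-hand side
                trace.append(("reduce", lhs))
                break
        else:
            if not remaining:
                break
            stack.append(remaining.pop(0))
            trace.append(("shift", stack[-1]))
    return stack == [start], trace
```

With this right-recursive grammar the parser has to shift all six tokens before the first reduction becomes possible, so the stack grows with the length of the list.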

LR Parsing vs. LL Parsing

• LL
  – A ‘top-down’ or ‘predictive’ parser
  – Predict needed productions based on the current left-most non-terminal in the tree and the current input token
  – The top-of-stack contains the left-most non-terminal
  – The stack contains a record of what the parser expects to see
• LR
  – A ‘bottom-up’ or shift-reduce parser
  – Shifts tokens onto the stack until it recognizes a right-hand side then reduces those tokens to their left-hand side
  – The stack contains a record of what the parser has already seen

An appropriate LR Grammar

    id_list → id_list_prefix ;
    id_list_prefix → id_list_prefix , id
    id_list_prefix → id

This grammar can’t be parsed top-down!

Problems for LL grammars:
- left recursion, example above
- common prefixes, example:
    stmt → id := expr | id ( arg_list )
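
Both problems have standard grammar rewrites. The sketch below shows removal of immediate left recursion and one level of left factoring, applied to the two examples on this slide. The function names and the generated `_tail` non-terminal names are assumptions made for illustration.

```python
def eliminate_immediate_left_recursion(lhs, alternatives):
    """Rewrite A -> A alpha | beta as A -> beta A_tail, A_tail -> alpha A_tail | epsilon."""
    recursive = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == lhs]
    others = [rhs for rhs in alternatives if not rhs or rhs[0] != lhs]
    if not recursive:
        return {lhs: alternatives}
    tail = lhs + "_tail"
    return {
        lhs: [beta + [tail] for beta in others],
        tail: [alpha + [tail] for alpha in recursive] + [[]],   # [] stands for epsilon
    }

def left_factor(lhs, alternatives):
    """Pull out a first symbol shared by several alternatives (one level only)."""
    groups = {}
    for rhs in alternatives:
        groups.setdefault(rhs[0] if rhs else None, []).append(rhs)
    result = {lhs: []}
    for first, group in groups.items():
        if first is None or len(group) == 1:
            result[lhs].extend(group)
        else:
            tail = f"{lhs}_tail{len(result)}"      # fresh name for the remainder
            result[lhs].append([first, tail])
            result[tail] = [rhs[1:] for rhs in group]
    return result
```

Applied to id_list_prefix, the first function yields a right-recursive grammar; applied to stmt, the second yields stmt → id stmt_tail1 with stmt_tail1 → := expr | ( arg_list ), so one token of lookahead after id is enough to choose.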

LL(1) Grammar for the Calculator Language

LR(1) Grammar for the Calculator Language

Hierarchy of Linear Parsers

• Basic containment relationship
  – LR parsers recognize a larger class of grammars than LL parsers, but not every CFG (ambiguous grammars, for example, are not LR)
  – Only a subset of the grammars handled by LR parsers can be recognized by LL parsers

[Figure: containment diagram labeled LL parsing, LR parsing, CFGs]

Bigger Picture

• Chomsky Hierarchy of Grammars
  – Regular Grammar
  – Context Free Grammar
  – Context Sensitive Grammar
  – Unrestricted Grammar

Implementation of an LL Parser

• Two options:
  – A recursive descent parser (section 2.2.3)
    » For LL grammars only
  – Parse table and a driver (section 2.2.5)
    » LR parsers covered in section 2.2.6

Recursive Descent Parser Example

• LL(1) grammar

Recursive Descent Parser Example

• Outline of recursive parser
  – This parser only verifies syntax
  – match is the scanner
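
The parser outline on the original slide was an image and is not reproduced here. As a stand-in, this is a minimal recursive descent sketch for a small LL(1) expression grammar chosen for illustration (expr → term term_tail, term_tail → add_op term term_tail | ε, term → factor factor_tail, factor_tail → mult_op factor factor_tail | ε, factor → ( expr ) | id | number). The grammar, the class name and the `$$` end marker are assumptions, and the input is taken to be an already scanned list of token kinds.

```python
class Parser:
    """One method per non-terminal; only verifies syntax, builds no tree."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$$"]   # "$$" marks end of input (assumed convention)
        self.pos = 0

    @property
    def token(self):
        return self.tokens[self.pos]

    def match(self, expected):
        # Consume the current token if it is the expected one.
        if self.token != expected:
            raise ValueError(f"expected {expected!r}, found {self.token!r}")
        self.pos += 1

    def parse(self):
        self.expr()
        self.match("$$")

    def expr(self):
        self.term()
        self.term_tail()

    def term_tail(self):
        if self.token in ("+", "-"):
            self.match(self.token)
            self.term()
            self.term_tail()
        # otherwise take the epsilon production and consume nothing

    def term(self):
        self.factor()
        self.factor_tail()

    def factor_tail(self):
        if self.token in ("*", "/"):
            self.match(self.token)
            self.factor()
            self.factor_tail()

    def factor(self):
        if self.token == "(":
            self.match("(")
            self.expr()
            self.match(")")
        elif self.token in ("id", "number"):
            self.match(self.token)
        else:
            raise ValueError(f"unexpected token {self.token!r}")


def accepts(tokens):
    """Return True if the token list is a syntactically valid expression."""
    try:
        Parser(tokens).parse()
        return True
    except ValueError:
        return False
```

Each method inspects only the current token to decide which production to use, which works because the grammar is LL(1).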

Recursive Descent Parser Example

A program that generates recursive descent parsers: JavaCC

Semantic Analysis

• Specifying the meaning of a programming language
  – Attribute Grammars

[Figure: diagram labeled Source Code, Compiler or Interpreter (Translation, Execution), Target Code]