CSci210.BA4. Chapter 4 Topics Introduction Lexical and Syntax Analysis The Parsing Problem ...

LEXICAL AND SYNTAX ANALYSIS

CSci210.BA4

Chapter 4 Topics

Introduction
Lexical and Syntax Analysis
The Parsing Problem
Recursive-Descent Parsing
Bottom-Up Parsing

Introduction

Syntax analyzers are almost always based on a formal description of the syntax of the source language (a grammar)

Almost all compilers separate analyzing syntax into:
  Lexical Analysis – low-level
  Syntax Analysis – high-level

Reasons to Separate Syntax and Lexical Analysis

Simplicity – lexical analysis is less complex, so the process is simpler when separated

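Efficiency – allows for selective optimization: the lexical analyzer takes a significant share of compile time, so it pays to optimize it on its own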

Portability – lexical analyzer is somewhat platform dependent whereas the syntax analyzer is more platform independent

Lexical Analysis

A pattern matcher for character strings

Performs syntax analysis at the lowest level of the program structure

Extracts lexemes from a given input string and produces the corresponding tokens

Lexical Analysis (continued)

result = oldsum - value / 100;

Token       Lexeme
IDENT       result
ASSIGN_OP   =
IDENT       oldsum
SUB_OP      -
IDENT       value
DIV_OP      /
INT_LIT     100
SEMICOLON   ;
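
A small hand-written scanner can produce the token/lexeme pairs listed above. The following is a minimal sketch in C, not the course's implementation. The `Token` struct, the `next_token` function, and the `END_OF_INPUT` and `UNKNOWN` codes are assumptions made for illustration. The remaining token names come from the table.

```c
#include <ctype.h>
#include <stddef.h>

/* Token codes; names follow the table above, except END_OF_INPUT and
   UNKNOWN, which are assumed here for illustration. */
typedef enum {
    IDENT, INT_LIT, ASSIGN_OP, SUB_OP, DIV_OP, SEMICOLON,
    END_OF_INPUT, UNKNOWN
} TokenCode;

typedef struct {
    TokenCode code;
    char lexeme[32];   /* lexemes longer than 31 characters are truncated */
} Token;

/* Scan one token starting at *src and advance *src past it. */
Token next_token(const char **src) {
    Token tok = { UNKNOWN, "" };
    const char *p = *src;
    size_t n = 0;

    while (isspace((unsigned char)*p)) p++;          /* skip white space */

    if (*p == '\0') {
        tok.code = END_OF_INPUT;
    } else if (isalpha((unsigned char)*p)) {         /* letter (letter | digit)* */
        tok.code = IDENT;
        while (isalnum((unsigned char)*p)) {
            if (n < sizeof tok.lexeme - 1) tok.lexeme[n++] = *p;
            p++;
        }
    } else if (isdigit((unsigned char)*p)) {         /* digit+ */
        tok.code = INT_LIT;
        while (isdigit((unsigned char)*p)) {
            if (n < sizeof tok.lexeme - 1) tok.lexeme[n++] = *p;
            p++;
        }
    } else {                                         /* single-character tokens */
        switch (*p) {
        case '=': tok.code = ASSIGN_OP; break;
        case '-': tok.code = SUB_OP;    break;
        case '/': tok.code = DIV_OP;    break;
        case ';': tok.code = SEMICOLON; break;
        default:  tok.code = UNKNOWN;   break;
        }
        tok.lexeme[n++] = *p++;
    }
    tok.lexeme[n] = '\0';
    *src = p;
    return tok;
}
```

Calling `next_token` repeatedly on a pointer to the statement text yields one token per call until it returns `END_OF_INPUT`.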

Building a Lexical Analyzer

Write a formal description of the tokens and use a software tool that constructs lexical analyzers when given such a description

Design a state transition diagram that describes the tokens and write a program that implements the diagram

Design a state transition diagram that describes the tokens and hand-construct a table-driven implementation of the state diagram

State (Transition) Diagram Design

A directed graph with nodes labeled with state names and arcs labeled with input characters

Including states and transitions for each and every token pattern would make the diagram too large and complex

Transitions can be combined to simplify the state diagram, for example one arc labeled "letter" instead of a separate arc for each of the 52 upper- and lowercase letters
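
Combining transitions amounts to classifying each character before looking up the next state. Below is a minimal table-driven sketch in C under that assumption. The state names, the `classify` and `run_diagram` functions, and the choice to recognize only identifiers and integer literals are illustrative, not taken from the slides.

```c
#include <ctype.h>

/* Character classes: one arc per class instead of one arc per character. */
typedef enum { LETTER, DIGIT, OTHER, NUM_CLASSES } CharClass;

/* States of a small diagram that recognizes identifiers and integer literals. */
typedef enum { START, IN_IDENT, IN_INT, REJECT, NUM_STATES } State;

static CharClass classify(char c) {
    if (isalpha((unsigned char)c)) return LETTER;
    if (isdigit((unsigned char)c)) return DIGIT;
    return OTHER;
}

/* transition[state][class] is the next state. */
static const State transition[NUM_STATES][NUM_CLASSES] = {
    /*               LETTER    DIGIT     OTHER  */
    /* START    */ { IN_IDENT, IN_INT,   REJECT },
    /* IN_IDENT */ { IN_IDENT, IN_IDENT, REJECT },
    /* IN_INT   */ { REJECT,   IN_INT,   REJECT },
    /* REJECT   */ { REJECT,   REJECT,   REJECT }
};

/* Run the diagram over a whole string and return the final state:
   IN_IDENT for an identifier, IN_INT for an integer literal,
   START for the empty string, REJECT otherwise. */
State run_diagram(const char *s) {
    State st = START;
    for (; *s != '\0'; s++)
        st = transition[st][classify(*s)];
    return st;
}
```

With three character classes the table has three columns, whatever the size of the character set.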

The Parsing Problem

Two goals of syntax analysis:
  Check the input program for any syntax errors, produce a diagnostic message if an error is found, and recover
  Produce the parse tree, or at least a trace of the parse tree, for the program

Two classes of parsers:
  Top-down
  Bottom-up

Top-Down Parsers

Traces or builds a parse tree in preorder (leftmost derivation)

The most common top-down parsing algorithms:
  Recursive descent
  LL parsers

Bottom-Up Parsers

Produce the parse tree by beginning at the leaves and progressing towards the root

Most common bottom-up parsers are in the LR family

Complexity of Parsing

Parsing algorithms that work for any unambiguous grammar are complex and inefficient: O(n^3)

Compilers use parsers that only work for a subset of all unambiguous grammars, but do it in linear time: O(n)

Recursive-Descent Parsing

A top-down parser

EBNF is ideal as the basis of a recursive-descent parser

Each nonterminal maps to a function

For a nonterminal with more than one RHS, look at the next token to determine which RHS to choose

If no RHS matches the next token, it is a syntax error

Recursive-Descent Parsing

Grammar for an expression:

<expr> → <term> {+ <term>}

<term> → <factor> {* <factor>}

<factor> → id | int_constant | ( <expr> )

How do we parse? Expression: 1 + 2

<expr> → <term> + <term>

→ <factor> + <term>

→ 1 + <term>

→ 1 + <factor>

→ 1 + 2

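Recursive-Descent Parsing

Grammar for an expression:

<expr> → <term> {+ <term>}

<term> → <factor> {* <factor>}

<factor> → id | int_constant | ( <expr> )

What does the code look like?

void expr() {
    term();                        /* parse the first <term> */
    while (nextToken == ADD_OP) {  /* as long as the next token is + */
        lex();                     /* get the next token */
        term();                    /* parse the next <term> */
    }
}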
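
The expr() fragment follows the <expr> rule directly, and the same pattern extends to <term> and <factor>. The sketch below is one possible complete recognizer for the grammar, written under assumptions. `nextToken`, `lex`, and `ADD_OP` appear on the slide. The other token codes, the `input` and `errors` globals, the simplified `lex`, and the `parse` wrapper are hypothetical names added to make the example self-contained.

```c
#include <ctype.h>

typedef enum { IDENT, INT_LIT, ADD_OP, MULT_OP, LEFT_PAREN, RIGHT_PAREN,
               END_OF_INPUT, UNKNOWN } TokenCode;

static const char *input;     /* remaining source text (assumed global) */
static TokenCode nextToken;   /* one token of lookahead, as on the slide */
static int errors;            /* number of syntax errors seen */

/* lex: put the code of the next token in nextToken. */
static void lex(void) {
    while (isspace((unsigned char)*input)) input++;
    if (*input == '\0') { nextToken = END_OF_INPUT; return; }
    if (isalpha((unsigned char)*input)) {
        while (isalnum((unsigned char)*input)) input++;
        nextToken = IDENT;
    } else if (isdigit((unsigned char)*input)) {
        while (isdigit((unsigned char)*input)) input++;
        nextToken = INT_LIT;
    } else {
        switch (*input++) {
        case '+': nextToken = ADD_OP;      break;
        case '*': nextToken = MULT_OP;     break;
        case '(': nextToken = LEFT_PAREN;  break;
        case ')': nextToken = RIGHT_PAREN; break;
        default:  nextToken = UNKNOWN;     break;
        }
    }
}

static void expr(void);

/* <factor> -> id | int_constant | ( <expr> ) */
static void factor(void) {
    if (nextToken == IDENT || nextToken == INT_LIT) {
        lex();
    } else if (nextToken == LEFT_PAREN) {
        lex();
        expr();
        if (nextToken == RIGHT_PAREN) lex();
        else errors++;            /* missing ')' */
    } else {
        errors++;                 /* no RHS starts with this token */
    }
}

/* <term> -> <factor> {* <factor>} */
static void term(void) {
    factor();
    while (nextToken == MULT_OP) { lex(); factor(); }
}

/* <expr> -> <term> {+ <term>} */
static void expr(void) {
    term();
    while (nextToken == ADD_OP) { lex(); term(); }
}

/* Return 1 if src is a complete <expr>, 0 otherwise. */
int parse(const char *src) {
    input = src;
    errors = 0;
    lex();
    expr();
    return errors == 0 && nextToken == END_OF_INPUT;
}
```

Each function corresponds to one nonterminal, and each `while` loop corresponds to one pair of EBNF braces. For example, `parse("1 + 2")` accepts, while `parse("1 +")` reports an error because no <factor> follows the operator.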

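Recursive-Descent Parsing

The Left Recursion Problem (LL grammars)

If the RHS of a rule begins with the rule's own LHS, the function for that nonterminal calls itself before consuming any token and never terminates

How do we fix it? Modify the grammar to remove left recursion

Before: <expr> → <expr> + <term> | <term>

After: <expr> → <term> <expr'>

<expr'> → + <term> <expr'> | ø (ø denotes the empty string)

In EBNF: <expr> → <term> {+ <term>}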

Recursive-Descent Parsing

The Pairwise Disjointness Problem

If two RHSs of the same nonterminal can begin with the same token, the grammar is not pairwise disjoint. How do you know which RHS to pick based on the next token?

<variable> → identifier | identifier[<expr>]

How do we fix it? Left Factoring

<variable> → identifier<new>

<new> → ø | [<expr>]
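
After left factoring, the choice between the two RHSs of <new> depends only on whether the next token is a left bracket. The following C sketch illustrates this under assumptions. The token codes, the `peek`/`advance` helpers, the `new_part` and `parse_variable` names, and the stand-in `expr` that accepts a single identifier or integer literal are all hypothetical.

```c
typedef enum { IDENTIFIER, LEFT_BRACKET, RIGHT_BRACKET, INT_LIT,
               END_OF_INPUT } TokenCode;

static const TokenCode *tokens;  /* token stream, ended by END_OF_INPUT */
static int errors;               /* number of syntax errors seen */

static TokenCode peek(void) { return *tokens; }
static void advance(void) { if (*tokens != END_OF_INPUT) tokens++; }

/* Stand-in for <expr>: accepts a single identifier or integer literal. */
static void expr(void) {
    if (peek() == IDENTIFIER || peek() == INT_LIT) advance();
    else errors++;
}

/* <new> -> empty | [<expr>] */
static void new_part(void) {
    if (peek() == LEFT_BRACKET) {   /* only the second RHS starts with '[' */
        advance();
        expr();
        if (peek() == RIGHT_BRACKET) advance();
        else errors++;              /* missing ']' */
    }
    /* otherwise choose the empty RHS and consume nothing */
}

/* <variable> -> identifier <new> */
static void variable(void) {
    if (peek() == IDENTIFIER) { advance(); new_part(); }
    else errors++;
}

/* Return 1 if the stream is exactly one <variable>, 0 otherwise. */
int parse_variable(const TokenCode *stream) {
    tokens = stream;
    errors = 0;
    variable();
    return errors == 0 && peek() == END_OF_INPUT;
}
```

The common prefix `identifier` is consumed once in `variable`, so no decision is needed until the token that distinguishes the two alternatives is the lookahead.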

Bottom-Up Parsing

Parsing is based on reduction

Reverse of a rightmost derivation

At each step, find the correct RHS that reduces to the previous step in the derivation

Example

Grammar:
  <S> → <A>b
  <A> → a

Input: ab

Step 1: <A>b (reduce a to <A>)

Step 2: <S> (reduce <A>b to <S>)

Bottom-Up Parsing

Most bottom-up parsers are shift-reduce algorithms:
  Shift – move the next token onto the stack
  Reduce – replace the RHS on top of the stack with its LHS

Bottom-Up Parsing

Handles

Def: $\beta$ is the handle of the right sentential form $\gamma = \alpha\beta w$ if and only if $S \Rightarrow^{*}_{rm} \alpha A w \Rightarrow_{rm} \alpha\beta w$

The handle of a right sentential form is its leftmost simple phrase

Bottom-Up Parsing is essentially looking for handles and replacing them with their LHS

Bottom-Up Parsing

Advantages of Shift-Reduce (LR) Parsers

They can be built for all programming languages

They can detect syntax errors as soon as it is possible in a left-to-right scan

The LR class of grammars is a proper superset of the class parsable by LL parsers (for example, many left recursive grammars are LR, but none are LL)

Bottom-Up Parsing

Shift-Reduce Algorithms

Input Sequence – input to be parsed
Parse Stack – input is shifted onto the parse stack
ACTION Table – what the parser does
GOTO Table – holds state symbols to be pushed onto the stack when a reduction is completed

Bottom-Up Parsing

ACTION Table (part of the Parse Table)

Rows = State Symbols
Columns = Terminal Symbols

Values:
  Shift – push token on stack
  Reduce – replace handle with LHS
  Accept – stack only has start symbol and input is empty
  Error – original input is invalid

Bottom-Up Parsing

GOTO Table (part of the Parse Table)

Rows = State Symbols
Columns = Nonterminal Symbols

Values indicate which state symbol should be pushed onto the parse stack after a reduction has been completed
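
To show how the ACTION and GOTO tables drive a parse, here is a minimal C sketch for the earlier example grammar (<S> → <A>b, <A> → a). The tables were built by hand for illustration. The state numbering, the rule numbering, and all identifiers (`lr_parse`, `action`, `goto_table`, and so on) are assumptions, not taken from the slides. The stack holds only state symbols, which is enough to decide every action.

```c
#include <stddef.h>

enum { T_A, T_B, T_END, NUM_TERMINALS };      /* terminals: a, b, end of input */
enum { N_S, N_A, NUM_NONTERMINALS };          /* nonterminals: <S>, <A> */
enum { NUM_STATES = 5, MAX_STACK = 16 };

typedef enum { ERROR, SHIFT, REDUCE, ACCEPT } ActionKind;
typedef struct { ActionKind kind; int arg; } Action;  /* arg: state or rule */

/* Rules: 1: <S> -> <A> b   2: <A> -> a   (index 0 is unused) */
static const int rule_lhs[] = { 0, N_S, N_A };
static const int rule_len[] = { 0, 2, 1 };

/* ACTION table: rows are states, columns are terminals. */
static const Action action[NUM_STATES][NUM_TERMINALS] = {
    /*          a            b            end        */
    /* 0 */ { {SHIFT, 3},  {ERROR, 0},  {ERROR, 0}  },
    /* 1 */ { {ERROR, 0},  {ERROR, 0},  {ACCEPT, 0} },
    /* 2 */ { {ERROR, 0},  {SHIFT, 4},  {ERROR, 0}  },
    /* 3 */ { {ERROR, 0},  {REDUCE, 2}, {ERROR, 0}  },
    /* 4 */ { {ERROR, 0},  {ERROR, 0},  {REDUCE, 1} }
};

/* GOTO table: rows are states, columns are nonterminals; -1 means no entry. */
static const int goto_table[NUM_STATES][NUM_NONTERMINALS] = {
    { 1, 2 }, { -1, -1 }, { -1, -1 }, { -1, -1 }, { -1, -1 }
};

static int terminal_of(char c) {
    if (c == 'a') return T_A;
    if (c == 'b') return T_B;
    if (c == '\0') return T_END;
    return -1;                                /* not in the alphabet */
}

/* Return 1 if input is a sentence of the grammar, 0 otherwise. */
int lr_parse(const char *input) {
    int stack[MAX_STACK];
    int top = 0;
    stack[0] = 0;                             /* start state */
    for (;;) {
        int t = terminal_of(*input);
        if (t < 0) return 0;
        Action act = action[stack[top]][t];
        switch (act.kind) {
        case SHIFT:
            if (top + 1 >= MAX_STACK) return 0;
            stack[++top] = act.arg;           /* push the new state */
            input++;                          /* consume the token */
            break;
        case REDUCE: {
            top -= rule_len[act.arg];         /* pop the handle */
            int next = goto_table[stack[top]][rule_lhs[act.arg]];
            if (next < 0) return 0;
            stack[++top] = next;              /* push the GOTO state */
            break;
        }
        case ACCEPT:
            return 1;
        default:                              /* ERROR */
            return 0;
        }
    }
}
```

On the input `ab` the parser shifts `a`, reduces it to <A>, shifts `b`, reduces <A>b to <S>, and accepts, which matches the two reduction steps of the example.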
