AN IMPLEMENTATION OF A REGULAR EXPRESSION PARSER

SANDESH KANGONDI, BASSEM ABUEIN


INTRODUCTION

A regular expression parser processes a regular expression in the following steps:

1. Takes the regular expression as input.
2. Converts the input regular expression to an NFA.
3. Converts the resulting NFA to a DFA.
4. Reduces the DFA to a minimum DFA.

Goal: implementation of a regular expression parser.
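The NFA-to-DFA step can be illustrated with the textbook subset construction. The following is a minimal sketch in Python rather than the project's C#, and it is not the authors' code: the dictionary-based NFA encoding, the `EPS` marker, and all function names are assumptions made for this example. The NFA is built by hand for the expression `ab*` instead of being generated from a pattern.

```python
from collections import deque

EPS = None  # label for epsilon transitions (an assumption of this sketch)


def epsilon_closure(nfa, states):
    """Return every NFA state reachable from `states` using epsilon moves only."""
    stack = list(states)
    closure = set(states)
    while stack:
        s = stack.pop()
        for t in nfa.get(s, {}).get(EPS, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def nfa_to_dfa(nfa, start, accepting):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    start_set = epsilon_closure(nfa, {start})
    dfa = {}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        if current in dfa:
            continue  # already expanded
        dfa[current] = {}
        symbols = {sym for s in current for sym in nfa.get(s, {}) if sym is not EPS}
        for sym in sorted(symbols):
            moved = {t for s in current for t in nfa.get(s, {}).get(sym, ())}
            target = epsilon_closure(nfa, moved)
            dfa[current][sym] = target
            queue.append(target)
    # A DFA state accepts if it contains at least one accepting NFA state.
    dfa_accepting = {d for d in dfa if d & accepting}
    return dfa, start_set, dfa_accepting


def dfa_accepts(dfa, start, accepting, text):
    """Run the DFA over `text`; a missing transition means rejection."""
    state = start
    for ch in text:
        if ch not in dfa[state]:
            return False
        state = dfa[state][ch]
    return state in accepting


# Hand-built NFA for ab*: 0 -a-> 1, 1 -eps-> 2, 2 -b-> 1; state 2 accepts.
NFA = {0: {"a": {1}}, 1: {EPS: {2}}, 2: {"b": {1}}}
DFA, DFA_START, DFA_ACCEPTING = nfa_to_dfa(NFA, 0, {2})

print(dfa_accepts(DFA, DFA_START, DFA_ACCEPTING, "abb"))  # True
print(dfa_accepts(DFA, DFA_START, DFA_ACCEPTING, "ba"))   # False
```

Representing each DFA state as a frozenset keeps the correspondence with the NFA visible, which is convenient when the states are displayed. Minimization would be a further pass over the resulting `DFA` table and is not shown here.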


Specification

- Has a GUI for visualizing the states and transitions.
- Supports the ^ and $ tokens to require a match at the beginning and at the end of the searched string, respectively.
- Implemented in C#, in an object-oriented style.
- Lets you control the greediness of the parser, so that you can observe how greedy and non-greedy matching behave.

Example: with greediness set to false, the expression "a_*p" applied to the string "appleandpotato" matches "ap" and not "appleandp".
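The greediness example can be reproduced with Python's standard `re` module, used here only as a stand-in for the project's own C# parser. Two assumptions apply: the `_` in the pattern above is taken to be an any-character wildcard, which Python writes as `.`, and non-greedy matching is requested per quantifier with `*?` because `re` has no global greediness switch.

```python
import re

text = "appleandpotato"

# Greedy: the quantifier consumes as much as possible, then backs off to the last "p".
greedy = re.search(r"a.*p", text).group()

# Non-greedy: the quantifier consumes as little as possible, stopping at the first "p".
lazy = re.search(r"a.*?p", text).group()

print(greedy)  # appleandp
print(lazy)    # ap

# ^ anchors the match at the start of the string, $ at the end.
starts_with_apple = re.search(r"^apple", text) is not None
ends_with_potato = re.search(r"potato$", text) is not None
starts_with_potato = re.search(r"^potato", text) is not None

print(starts_with_apple, ends_with_potato, starts_with_potato)  # True True False
```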


FEATURES


APPLICATIONS

Applications of parsing range from simple phrase finding, e.g. for proper name recognition, to full semantic analysis of text, e.g. for information extraction or machine translation.


DESIGN


Class diagram for the parser


DESIGN

- The Set class is a simple representation of a mathematical set.
- The Map class maps a key to one or more objects.
- The State class holds the data structure of the automaton.
- RegEx is the main class, which uses the other classes.
- The RegExValidator class validates a pattern string, using recursive descent parsing. Besides validating the pattern, it performs two other tasks: inserting implicit tokens to make them explicit, and expanding character classes.
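The two preprocessing tasks attributed to RegExValidator can be sketched as follows. This is an illustrative Python sketch, not the class's actual C# code: the function names, the choice of `&` as the explicit concatenation token, and the rewriting of a class such as `[a-c]` into `(a|b|c)` are all assumptions. Escape sequences and negated classes are not handled.

```python
CONCAT = "&"  # hypothetical explicit concatenation token


def expand_classes(pattern):
    """Rewrite each character class, e.g. [a-c], as an alternation, e.g. (a|b|c)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch != "[":
            out.append(ch)
            i += 1
            continue
        end = pattern.find("]", i + 1)
        if end == -1:
            raise ValueError("unterminated character class")
        body = pattern[i + 1:end]
        if not body:
            raise ValueError("empty character class")
        members = []
        j = 0
        while j < len(body):
            if j + 2 < len(body) and body[j + 1] == "-":
                # A range such as a-c expands to every character in between.
                lo, hi = body[j], body[j + 2]
                members.extend(chr(c) for c in range(ord(lo), ord(hi) + 1))
                j += 3
            else:
                members.append(body[j])
                j += 1
        out.append("(" + "|".join(members) + ")")
        i = end + 1
    return "".join(out)


def insert_concat(pattern):
    """Insert CONCAT wherever two operands are adjacent, making concatenation explicit."""
    out = []
    for i, ch in enumerate(pattern):
        out.append(ch)
        if i + 1 == len(pattern):
            break
        nxt = pattern[i + 1]
        left_ends_operand = ch not in "(|"
        right_starts_operand = nxt not in ")|*+?"
        if left_ends_operand and right_starts_operand:
            out.append(CONCAT)
    return "".join(out)


print(expand_classes("[a-c]x"))                  # (a|b|c)x
print(insert_concat("a_*p"))                     # a&_*&p
print(insert_concat(expand_classes("[ab]c*d")))  # (a|b)&c*&d
```

Making concatenation explicit gives every operator a visible token, which simplifies the later conversion of the pattern to an NFA.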


RECURSIVE DESCENT PARSER

A recursive descent parser is a top-down parser built from a set of mutually recursive procedures (or a non-recursive equivalent), where each procedure usually implements one of the production rules of the grammar. As a result, the structure of the resulting program closely mirrors that of the grammar it recognizes.


RECURSIVE DESCENT PARSER

One easy way to do recursive descent parsing is to have each parse method take the tokens it needs, build a parse tree, and put the parse tree on a global stack.

- Write a parse method for each nonterminal in the grammar.
- Each parse method should get the tokens it needs, and only those tokens. Those tokens (usually) go on the stack.
- Each parse method may call other parse methods, and expect those methods to leave their results on the stack.
- Each (successful) parse method should leave one result on the stack.
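The stack discipline described above can be sketched for a small regular expression grammar. This Python sketch is an illustration only: the grammar, the `RegexParser` class, and the tuple-based tree nodes are assumptions for the example and are not taken from the project's RegExValidator. The assumed grammar is: expr → term ('|' term)*, term → factor factor*, factor → atom ('*' | '+' | '?')*, atom → literal | '(' expr ')'.

```python
class RegexParser:
    """Recursive descent parser with one parse method per nonterminal."""

    def __init__(self, pattern):
        self.tokens = list(pattern)  # one token per character
        self.pos = 0
        self.stack = []  # shared stack; each successful parse method leaves one result

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"unexpected {tok!r} at position {self.pos}")
        self.pos += 1
        return tok

    def parse(self):
        self.parse_expr()
        if self.peek() is not None:
            raise ValueError(f"unexpected {self.peek()!r} at position {self.pos}")
        return self.stack.pop()

    def parse_expr(self):  # expr -> term ('|' term)*
        self.parse_term()
        while self.peek() == "|":
            self.take("|")
            self.parse_term()
            right = self.stack.pop()
            left = self.stack.pop()
            self.stack.append(("alt", left, right))

    def parse_term(self):  # term -> factor factor*
        self.parse_factor()
        while self.peek() is not None and self.peek() not in "|)":
            self.parse_factor()
            right = self.stack.pop()
            left = self.stack.pop()
            self.stack.append(("cat", left, right))

    def parse_factor(self):  # factor -> atom ('*' | '+' | '?')*
        self.parse_atom()
        while self.peek() is not None and self.peek() in "*+?":
            op = self.take()
            self.stack.append((op, self.stack.pop()))

    def parse_atom(self):  # atom -> literal | '(' expr ')'
        tok = self.peek()
        if tok == "(":
            self.take("(")
            self.parse_expr()  # the nested result stays on the stack
            self.take(")")
        elif tok is None or tok in "|)*+?":
            raise ValueError(f"unexpected {tok!r} at position {self.pos}")
        else:
            self.stack.append(("lit", self.take()))


tree = RegexParser("a_*p").parse()
print(tree)
# ('cat', ('cat', ('lit', 'a'), ('*', ('lit', '_'))), ('lit', 'p'))
```

Each method pops the results its callees left behind and pushes exactly one combined node, so validation and tree construction happen in the same pass: a malformed pattern such as `(a` raises `ValueError` instead of returning a tree.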


Running and TESTING

The following slide shows a snapshot of the running program.

- The input is the regular expression a_*p.
- The output is as shown in the snapshot.
- On the left side of the snapshot, a string can be searched for a specific regular expression pattern.


Running and TESTING

