ANTLR 4
ANother Tool for Language Recognition
by Alexander Vasiltsov
What ANTLR can do
● Generates a parser from a formal language description called a grammar
● The grammar describes the language in an EBNF-like notation
● Automatically generates classes for walking the syntax tree
● Includes a powerful error recovery mechanism
● Handles left-recursive rules (direct left recursion)
Notable uses
● Twitter search engine
● Hadoop (Hive & Pig)
● Oracle (SQL Developer IDE, Migration Tools)
● NetBeans IDE
How it works
Target Languages
ANTLR 4 Target languages:
● Java
● C#
● Python
ANTLR 3 also supports the following languages: C,
C#, Java, JavaScript, ActionScript, Objective-C,
Perl, Python, Ruby and others.
Setup for Java
Java 1.6 or newer is required.
1) Download the latest ANTLR 4 package (antlr-4.4-complete.jar) from
http://www.antlr.org/download.html
That's all!
Setup for C#
Java 1.6 or newer is required!
1) Add the ANTLR reference to the project:
PM> Install-Package Antlr4
2) Install the ANTLR Language Support extension
ANTLRWorks
http://tunnelvisionlabs.com/products/demo/antlrworks
Lexing
Lexing (tokenizing) is the process of grouping
the input character stream into words (tokens).
A token contains at least two pieces of data: its
type and the matched text.
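To make the idea of a token concrete, here is a minimal hand-written lexer sketch in plain Java. It is not code generated by ANTLR; the class name LexerSketch, the token types and the array-initializer input are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative hand-written lexer; this is NOT what ANTLR generates.
class LexerSketch {
    // Hypothetical token types for a small array-initializer language.
    enum TokenType { LBRACE, RBRACE, COMMA, INT }

    // A token carries at least its type and the text it matched.
    static final class Token {
        final TokenType type;
        final String text;
        Token(TokenType type, String text) { this.type = type; this.text = text; }
        @Override public String toString() { return type + "('" + text + "')"; }
    }

    // Groups the input characters into tokens, skipping whitespace.
    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '{') {
                tokens.add(new Token(TokenType.LBRACE, "{")); i++;
            } else if (c == '}') {
                tokens.add(new Token(TokenType.RBRACE, "}")); i++;
            } else if (c == ',') {
                tokens.add(new Token(TokenType.COMMA, ",")); i++;
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < input.length() && Character.isDigit(input.charAt(i))) i++;
                tokens.add(new Token(TokenType.INT, input.substring(start, i)));
            } else {
                throw new IllegalArgumentException(
                        "Unexpected character '" + c + "' at position " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [LBRACE('{'), INT('1'), COMMA(','), INT('23'), RBRACE('}')]
        System.out.println(tokenize("{1, 23}"));
    }
}
```

Note how the two characters "2" and "3" end up in a single INT token: the lexer groups characters, and the parser later works only with the resulting tokens.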
Parsing
Parsing is the process of matching a linear
sequence of tokens against the language’s formal
grammar.
The result of parsing is a parse tree (syntax tree).
Syntax tree
A syntax tree represents the structure of the
recognized sentence; each node gives
an abstract name to its child nodes.
Inner nodes represent grammar rules.
Leaves represent tokens.
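The relationship between parsing and the syntax tree can be sketched with a small hand-written recursive-descent parser in plain Java. This is not ANTLR output: the two-rule grammar in the comment is an assumed example (borrowing the ArrayInit name used in these slides), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hand-written recursive-descent sketch; NOT ANTLR-generated code.
// Assumed grammar (illustration only):
//   init  : '{' value (',' value)* '}' ;
//   value : init | INT ;
class ParseTreeSketch {
    // Inner nodes are named after grammar rules; leaves hold token text.
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();
        Node(String label) { this.label = label; }
        // LISP-style rendering: (rule child child ...)
        String toStringTree() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node c : children) sb.append(' ').append(c.toStringTree());
            return sb.append(')').toString();
        }
    }

    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    ParseTreeSketch(String input) {
        // Simplified lexing: integers and three punctuation tokens; other characters are skipped.
        Matcher m = Pattern.compile("\\d+|[{},]").matcher(input);
        while (m.find()) tokens.add(m.group());
    }

    // Consumes the expected token and returns it as a leaf node.
    private Node match(String expected) {
        if (pos >= tokens.size() || !tokens.get(pos).equals(expected))
            throw new IllegalStateException("Expected '" + expected + "' at token " + pos);
        return new Node(tokens.get(pos++));
    }

    // One method per grammar rule; each returns the subtree it recognized.
    Node init() {
        Node node = new Node("init");
        node.children.add(match("{"));
        node.children.add(value());
        while (pos < tokens.size() && tokens.get(pos).equals(",")) {
            node.children.add(match(","));
            node.children.add(value());
        }
        node.children.add(match("}"));
        return node;
    }

    Node value() {
        Node node = new Node("value");
        if (pos < tokens.size() && tokens.get(pos).equals("{")) {
            node.children.add(init());
        } else if (pos < tokens.size() && tokens.get(pos).matches("\\d+")) {
            node.children.add(new Node(tokens.get(pos++)));
        } else {
            throw new IllegalStateException("Expected '{' or INT at token " + pos);
        }
        return node;
    }

    static Node parse(String input) {
        ParseTreeSketch p = new ParseTreeSketch(input);
        Node tree = p.init();
        if (p.pos != p.tokens.size())
            throw new IllegalStateException("Unexpected trailing input at token " + p.pos);
        return tree;
    }

    public static void main(String[] args) {
        // prints (init { (value 1) , (value (init { (value 2) , (value 3) })) })
        System.out.println(parse("{1, {2, 3}}").toStringTree());
    }
}
```

In the printed tree, every parenthesized group starts with a rule name (init, value), while the braces, commas and integers appear only as leaves.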
Parsing process
Parser generation by ANTLR4
● ArrayInitParser.java (.cs): parser class definition for the grammar named ArrayInit
● ArrayInitLexer.java (.cs): lexer class definition for the same grammar
● ArrayInit.tokens, ArrayInitLexer.tokens: internal ANTLR files containing the token dictionary (tokens with their corresponding identifiers)
● ArrayInitListener.java (.cs): listener interface for walking and processing the syntax tree
● ArrayInitBaseListener.java (.cs): base listener class with empty methods
● ArrayInitVisitor.java (.cs): visitor interface, also for walking the syntax tree, using the Visitor design pattern
● ArrayInitBaseVisitor.java (.cs): base visitor class with empty methods
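The listener mechanism listed above (an interface, a base class with empty methods, and a walker that drives the traversal) can be sketched in plain Java as follows. All names here (ListenerSketch, enterRule, exitRule, visitToken, StatsListener) are hypothetical and are not the API of the generated ArrayInit classes; the tree is built by hand for the assumed input {1, {2, 3}}.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration of the listener idea; names are hypothetical, not ANTLR's API.
class ListenerSketch {
    static final class Node {
        final String label;
        final List<Node> children;
        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }
        boolean isLeaf() { return children.isEmpty(); }
    }

    interface Listener {
        void enterRule(Node node);
        void exitRule(Node node);
        void visitToken(Node node);
    }

    // Base class with empty methods, so subclasses override only what they need.
    static class BaseListener implements Listener {
        @Override public void enterRule(Node node) { }
        @Override public void exitRule(Node node) { }
        @Override public void visitToken(Node node) { }
    }

    // The walker owns the traversal: depth-first, firing callbacks along the way.
    static void walk(Listener listener, Node node) {
        if (node.isLeaf()) {
            listener.visitToken(node);
            return;
        }
        listener.enterRule(node);
        for (Node child : node.children) walk(listener, child);
        listener.exitRule(node);
    }

    // Example listener: tracks nesting depth of "init" rules and sums integer tokens.
    static final class StatsListener extends BaseListener {
        int depth = 0, maxDepth = 0, sum = 0;
        @Override public void enterRule(Node node) {
            if (node.label.equals("init")) maxDepth = Math.max(maxDepth, ++depth);
        }
        @Override public void exitRule(Node node) {
            if (node.label.equals("init")) depth--;
        }
        @Override public void visitToken(Node node) {
            if (node.label.matches("\\d+")) sum += Integer.parseInt(node.label);
        }
    }

    // Hand-built tree for the input {1, {2, 3}}.
    static Node sampleTree() {
        Node inner = new Node("init", new Node("{"),
                new Node("value", new Node("2")), new Node(","),
                new Node("value", new Node("3")), new Node("}"));
        return new Node("init", new Node("{"),
                new Node("value", new Node("1")), new Node(","),
                new Node("value", inner), new Node("}"));
    }

    public static void main(String[] args) {
        StatsListener stats = new StatsListener();
        walk(stats, sampleTree());
        // prints maxDepth=2 sum=6
        System.out.println("maxDepth=" + stats.maxDepth + " sum=" + stats.sum);
    }
}
```

The listener never decides where to go next; it only reacts to events, which is why any state it needs (here the depth counter) lives in its own fields.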
Syntax tree structure
Walker
Listener
Visitor
“Visitor” design pattern
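A minimal plain-Java sketch of the Visitor design pattern follows. Unlike a listener, a visitor controls the traversal itself and can return a value from each node. The node classes and visitor names are assumptions made for this illustration; they are not the classes ANTLR generates.

```java
import java.util.Arrays;
import java.util.List;

// Classic Visitor pattern with double dispatch; names are hypothetical, not ANTLR's API.
class VisitorSketch {
    interface Visitor<T> {
        T visitInit(InitNode node);
        T visitInt(IntNode node);
    }

    interface Node {
        <T> T accept(Visitor<T> visitor);
    }

    // Array initializer holding nested values.
    static final class InitNode implements Node {
        final List<Node> values;
        InitNode(Node... values) { this.values = Arrays.asList(values); }
        @Override public <T> T accept(Visitor<T> visitor) { return visitor.visitInit(this); }
    }

    // Integer literal.
    static final class IntNode implements Node {
        final int value;
        IntNode(int value) { this.value = value; }
        @Override public <T> T accept(Visitor<T> visitor) { return visitor.visitInt(this); }
    }

    // Visitor returning the sum of all integers in the tree.
    static final class SumVisitor implements Visitor<Integer> {
        @Override public Integer visitInit(InitNode node) {
            int sum = 0;
            for (Node child : node.values) sum += child.accept(this); // visitor decides to descend
            return sum;
        }
        @Override public Integer visitInt(IntNode node) { return node.value; }
    }

    // Visitor translating the tree into bracket notation.
    static final class BracketVisitor implements Visitor<String> {
        @Override public String visitInit(InitNode node) {
            StringBuilder sb = new StringBuilder("[");
            for (int i = 0; i < node.values.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(node.values.get(i).accept(this));
            }
            return sb.append("]").toString();
        }
        @Override public String visitInt(IntNode node) { return Integer.toString(node.value); }
    }

    // Hand-built tree for the input {1, {2, 3}}.
    static Node sampleTree() {
        return new InitNode(new IntNode(1), new InitNode(new IntNode(2), new IntNode(3)));
    }

    public static void main(String[] args) {
        System.out.println(sampleTree().accept(new SumVisitor()));     // prints 6
        System.out.println(sampleTree().accept(new BracketVisitor())); // prints [1, [2, 3]]
    }
}
```

Adding a new operation means adding a new visitor class; the node classes stay unchanged.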
Parser generation step by step
● Java target language:
> java -jar antlr-4.4-complete.jar <grammar-file-name>
● C# target language: add the grammar file to the
project and build it. The generated classes
are placed in the obj\Debug directory
Common grammar structure
Typical Grammar