03 Regular Expressions

Embed Size (px)

Citation preview

  • 8/3/2019 03 Regular Expressions

    1/26

    BasicsMiscellaneous

    Regular Expressions

    Scripting and Computer Environment - Lecture 3

    Saurabh Barjatiya

    International Institute Of Information Technology, Hyderabad

    26 July, 2011

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    2/26

    BasicsMiscellaneous

    Matching single characterUsing special charactersRepetition match

    Contents

    1 BasicsMatching single character

    Using special charactersRepetition match

    2 Miscellaneous

    OR of regular expressionsBackreference

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    3/26

    BasicsMiscellaneous

    Matching single characterUsing special charactersRepetition match

    Sample text file

    For all future examples following will be contents of sample textfile (say test.txt)

    Contents

    ABCD

    XYZxyz

    abcd

    340

    6$there are two elfs in this room

    there are two shelves in this room

    341 + 24 = 356

    !@#$%

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    4/26

    BasicsMiscellaneous

    Matching single characterUsing special charactersRepetition match

    Matching single character - 01

    All alphabets and numbers match themselves. Hence nothingfancy is required to match simple words or numbers.

    Example 1 - Matching alphabets and numbersCommand: grep this test.txt

    Output:

    there are two elfs in this room

    there are two shelves in this room

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    M hi i l h

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    5/26

    BasicsMiscellaneous

    Matching single characterUsing special charactersRepetition match

    Matching single character - 02

    Following characters are treated differently when used in regularexpressions:

    Plus (+)

    Pipe (|)Brackets ({}[]())

    Dot (.)

    Question mark (?)

    Caret (^)Dollar ($)

    Backslash (\)

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    M t hi i l h t

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    6/26

    BasicsMiscellaneous

    Matching single characterUsing special charactersRepetition match

    Matching single character - 03

    Special characters can be escaped using backslash (\) so thatthey match themselves and do not have any special significance.

    Example 2 - Matching special charactersCommand: grep \$ test.txt

    Output:

    6$

    !@#$%

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    Matching single character

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    7/26

    BasicsMiscellaneous

    Matching single characterUsing special charactersRepetition match

    Using special characters - Brackets - 01

    Square brackets ([]) are used to specify range / set of charactersthat can be matched.

    Example 3 - Matching set of characters

    Command: grep [iou] test.txt

    Output:

    abcd

    there are two elfs in this roomthere are two shelves in this room

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    Matching single character

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    8/26

    BasicsMiscellaneous

    Matching single characterUsing special charactersRepetition match

    Using special characters - Brackets - 02

    Example 4 - Matching range of characters

    Command: grep [0-9] test.txt

    Output:

    340

    6$

    341 + 24 = 356

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    Matching single character

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    9/26

    BasicsMiscellaneous

    Matching single characterUsing special charactersRepetition match

    Using special characters - Brackets - 03

    [a-z] Any small alphabet between a and z (bothinclusive)

    [A-Z] Any capital alphabet between A and Z (bothinclusive)

    [a-Z] Any alphabet capital or small[A-z] Wont match with anything, not even capital A or

    small z

    Note that above description is based on grep behavior. Otherregular expression matching programs or libraries can interpret thesame sequences differently. Hence it is best to test regularexpression when using new library / application or use long formlike [abcdefghijklmnopqrstuvwxyz] to ensure that regular

    expression is portable.Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    B iMatching single character

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    10/26

    BasicsMiscellaneous

    Matching single characterUsing special charactersRepetition match

    Using special characters - Brackets - 04

    Following and many other named classes can be used within squarebrackets ([]):

    [:alpha:] Any alphabet small or capital

    [:alnum:] Any alphabet or number

    [:digit:] Any digit[:lower:] Any lower case letter

    [:upper:] Any upper case letter

    [:space:] Whitespace

    Usage of named classes

    Command: grep [[:alpha:]] test.txt

    The symbol \w is a synonym for [[:alnum:]]

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    B iMatching single character

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    11/26

    BasicsMiscellaneous

    g gUsing special charactersRepetition match

    Using special characters - Brackets - 05

    If square brackets range begins with caret (^) then it negates therange and all characters which are not specified within squarebrackets get matched.

    Example 05 - Negation of range with square brackets

    Command: grep [^[:alpha:] ] test.txt

    Output:

    340

    6$

    there are two elfs in this room

    341 + 24 = 356

    !@#$%

    The symbol \W is a synonym for [^[:alnum:]]

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    BasicsMatching single character

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    12/26

    BasicsMiscellaneous

    g gUsing special charactersRepetition match

    Using special characters - Description

    Period (.) Matches any single character

    Caret (^) Matches empty character at start of line

    Dollar ($) Matches empty character at end of line

    (\) Matches empty character at end of word

    (\b) Matches empty character at edge of word

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    BasicsMatching single character

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    13/26

    BasicsMiscellaneous

    Using special charactersRepetition match

    Using special characters - Period

    Example 06 - Usage of period

    Command: grep 6. test.txt

    Output:

    6$

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    BasicsMatching single character

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    14/26

    BasicsMiscellaneous

    Using special charactersRepetition match

    Using special characters - Caret

    Example 07 - Usage of caret

    Command: grep ^a test.txt

    Output:

    abcd

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    BasicsMatching single characterU i i l h

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    15/26

    BasicsMiscellaneous

    Using special charactersRepetition match

    Using special characters - Dollar

    Example 08 - Usage of dollar

    Command: grep 6$ test.txt

    Output:

    341 + 24 = 356

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    Basics Matching single characterU i i l h t

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    16/26

    MiscellaneousUsing special charactersRepetition match

    Using special characters - \

    Command: grep s\> test.txt

    Output:

    there are two elfs in this room

    there are two shelves in this room

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    Basics Matching single characterUsing special characters

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    18/26

    MiscellaneousUsing special charactersRepetition match

    Using special characters - \b

    Example 11 - Usage of \b

    Command: grep \b3 test.txt

    Output:

    340

    341 + 24 = 356

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    BasicsMi ll

    Matching single characterUsing special characters

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    19/26

    MiscellaneousUsing special charactersRepetition match

    Repetition match - Description

    ? The preceding item is optional and matched atmost once.

    * The preceding item will be matched zero or more

    times.+ The preceding item will be matched one or more

    times.

    {n} The preceding item is matched exactly n times.

    {n,} The preceding item is matched n or more times.{n,m} The preceding item is matched at least n times,

    but not more than m times.

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    BasicsMi ll

    Matching single characterUsing special characters

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    20/26

    MiscellaneousUsing special charactersRepetition match

    Repetition match examples

    Example 12 - Usage of ?

    Command: grep -o \ test.txt

    Output:

    two

    this

    two

    this

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    BasicsMiscellaneo s

    Matching single characterUsing special characters

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    21/26

    Miscellaneousg p

    Repetition match

    Repetition match examples

    Example 13 - Usage of *

    Command: grep -o b.* test.txt

    Output:

    bcd

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    BasicsMiscellaneous

    Matching single characterUsing special characters

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    22/26

    Miscellaneousg

    Repetition match

    Repetition match examples

    Example 14 - Usage of +

    Command: grep -o th[a-z]\+ test.txt

    Output:

    there

    this

    there

    this

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    BasicsMiscellaneous

    Matching single characterUsing special characters

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    23/26

    MiscellaneousRepetition match

    Repetition match examples

    Example 15 - Usage of{}

    Commands: grep -o ro\{2\}m test.txt

    grep -o ro\{1,\}m test.txtgrep -o ro\{1,2\}m test.txt

    Output:

    room

    room

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    BasicsMiscellaneous

    OR of regular expressionsBackreference

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    24/26

    Miscellaneous Backreference

    Contents

    1 BasicsMatching single characterUsing special charactersRepetition match

    2 MiscellaneousOR of regular expressionsBackreference

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    BasicsMiscellaneous

    OR of regular expressionsBackreference

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    25/26

    u

    OR of regular expressions

    AND on regular expressions can be performed by justconcatenating two regular expressions. For OR of regularexpressions | operator can be used.

    Example 16 - Usage of |Commands: grep -o s[a-z]\+\|[a-z]\+s test.txt

    Output:

    this

    shelves

    this

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    BasicsMiscellaneous

    OR of regular expressionsBackreference

    http://find/http://goback/
  • 8/3/2019 03 Regular Expressions

    26/26

    Backreference

    Backreferences can be used to refer to matched regular expressionwhen using regular expressions for search and replace operations.

    Example 17 - Usage of backreferenceCommands: sed s/\/I\1/g test.txt | grep I

    Output:

    there are two elfs In this room

    there are two shelves In this room

    Saurabh Barjatiya Regular Expressions Scripting and Computer Environment - Lecture 3 IIIT Hyderabad

    http://find/http://goback/