
Unix grep Utility

CS465

The grep utility

• grep stands for “globally search for a regular expression and print the results”

• It is one of the most used Unix tools. It has even been added to the Unix user's vocabulary:

– Verb: “Grep through the files to see what should be changed.”

– Adjective: “The projx file is grepped source code.”

– Noun: “Grepping is the best way to find that information.”

grep commands

• grep is actually a family of commands:

– fgrep

– grep

– egrep

• All three search files for strings which match specified patterns:

fgrep – pattern must be a fixed string

grep – pattern can include regular expressions

egrep – pattern can include extended regular expressions
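A minimal sketch of the difference between the three, assuming a system where fgrep and egrep are also available under their POSIX spellings grep -F and grep -E. The sample file and its contents are made up for illustration.

```shell
# Hypothetical sample data: three short lines to search.
tmp=$(mktemp -d)
printf '%s\n' 'a.c' 'abc' 'a+c' > "$tmp/sample"

# Fixed string (fgrep): the period is an ordinary character,
# so only the line "a.c" is counted.
fixed=$(grep -F -c 'a.c' "$tmp/sample")

# Regular expression (grep): the period matches any single character,
# so all three lines are counted.
regex=$(grep -c 'a.c' "$tmp/sample")

# Extended regular expression (egrep): ( ) and | are metacharacters,
# so only "abc" is counted.
extended=$(grep -E -c 'a(b|x)c' "$tmp/sample")

echo "fixed=$fixed regex=$regex extended=$extended"
rm -rf "$tmp"
```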

Simplest grep

• The simplest grep command is fgrep:

$ fgrep fixed-pattern [file-list]

– Searches all files in the file-list

– Displays the name of each file that contains the fixed-pattern, along with each line on which the pattern was found

– If you list only ONE filename in the file-list, fgrep will NOT include the filename in the results
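The filename behaviour can be checked with a small sketch. The two files below are invented stand-ins, and grep -F is used as the POSIX spelling of fgrep.

```shell
# Hypothetical files in a scratch directory.
tmp=$(mktemp -d)
printf 'main()\n' > "$tmp/new.c"
printf 'int helper;\nmain()\n' > "$tmp/prog1.c"

# One file in the file-list: matching lines only, no filename prefix.
one=$(cd "$tmp" && grep -F main new.c)

# Two files in the file-list: every matching line is prefixed with "filename:".
many=$(cd "$tmp" && grep -F main new.c prog1.c)

echo "$one"
echo "$many"
rm -rf "$tmp"
```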

fgrep Example

• Search for all files in the current directory that contain the string "main".

• Example:

$ fgrep main *

memo: The main point is that the

new.c:main()

prog1.c:main()

$

More fgrep examples

Display all lines in file prog.c containing “num”:

$ fgrep num prog.c

num = 0;

while ( num < 5 ) {

num = num + 1;

$

Display the /etc/passwd lines of all users whose entries contain “small”:

$ fgrep small /etc/passwd

small000:x:1164:102:Faculty – Pam Smallwood:/export/home/small000:/bin/ks

$

grep format

$ grep [options] pattern [filelist]

– Search for the specified pattern in each line of the specified files

– Send lines containing the pattern (or other info) to the standard output (i.e. display them)

• Options:

-c display only a count of matching lines

-h output matched lines but not filenames

-i ignore case when matching

-l display names of files only (no matching lines)

-n display line numbers

-s suppress error messages for nonexistent or unreadable files

-v display only non-matching lines

-w restrict pattern to matching a whole word

grep examples

$ cat soccer.txt

In Soccer,

There are no time outs.

There are no helmets,

no shoulder pads,

no commercial breaks,

no warm dugouts,

no halftime extravaganzas.

So if that’s what you need,

play another sport.

$ grep -n are soccer.txt

2:There are no time outs.

3:There are no helmets,

$ grep are soccer.txt

There are no time outs.

There are no helmets,

$ grep -c are soccer.txt

2

grep examples

$ cat soccer.txt

In Soccer,

There are no time outs.

There are no helmets,

no shoulder pads,

no commercial breaks,

no warm dugouts,

no halftime extravaganzas.

So if that’s what you need,

play another sport.

Are you ready?

$ grep -v no soccer.txt

In Soccer,

So if that’s what you need,

Are you ready?

$ grep Are soccer.txt

Are you ready?

$ grep -vc no soccer.txt

3

$ grep -i Are soccer.txt

There are no time outs.

There are no helmets,

Are you ready?

grep examples

$ cat soccer.txt

In Soccer,

There are no time outs.

There are no helmets,

no shoulder pads,

no commercial breaks,

no warm dugouts,

no halftime extravaganzas.

So if that’s what you need,

play another sport.

Are you ready?

$ grep -vw no soccer.txt

In Soccer,

So if that’s what you need,

play another sport.

Are you ready?

$ grep -l soccer *

$ grep -l Soccer *

soccer.txt

$ grep -li soccer *

soccer.txt

grep examples

$ cat team1

Rob Murray

Scott Stewart

Martin Jones

Scott Smith

$ cat team2

Scott Jones

Richard Shepard

Doug Stringfellow

John English

$ grep Scott team1 team2

team1:Scott Stewart

team1:Scott Smith

team2:Scott Jones

$ grep -l Scott team1 team2

team1

team2

$ grep -h Scott team1 team2

Scott Stewart

Scott Smith

Scott Jones

grep examples

$ cat team1

Rob Murray

Scott Stewart

Martin Jones

Scott Smith

$ cat team2

Scott Jones

Richard Shepard

Doug Stringfellow

John English

$ grep Scott team1 taem2

team1:Scott Stewart

team1:Scott Smith

grep: can’t open taem2

$ grep -s Scott team1 taem2

team1:Scott Stewart

team1:Scott Smith

grep examples

$ cat team1

Rob Murray

Scott Stewart

Martin Jones

Scott Smith

$ cat team2

Scott Jones

Richard Shepard

Doug Stringfellow

John English

$ grep Scott team*

team1:Scott Stewart

team1:Scott Smith

team2:Scott Jones

$ grep -c Do team* | grep ":0"

team1:0

$ grep –c Doug team*

team1:0

team2:1

More grep examples

Print non-commented lines in file myfile (i.e. lines that do NOT start with the string "#"):

$ egrep -v "^#" myfile

name=bill

echo $name

$

Search the files in the sub subdirectory for the string “test” (ignoring case):

$ grep -i test `ls /sub`

ltr: Test for today

mbox:Subject: test

makefile: test1: test1.c

makefile: gcc test1.c -o test1

$

More grep example

Determine the number of users in the projectX group:

$ grep projectX /etc/group

projectX:x:507:Plin_9318,Fyusuf_9287,Rlee_8656,Rdeich_1254,Njuwal_5960,Mmelto_8858,Wbucki_6698,Tespin_0604,Psmallwo_000

$

-c shows matching count

$ grep -c 507 /etc/passwd

9

$
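Note that grep -c 507 counts every passwd line containing 507 anywhere (for example inside a user ID), not only lines with that group ID. As a rough alternative sketch, the names on the group line itself can be counted. A temporary file stands in for /etc/group here, and cut and tr are standard utilities brought in only for this illustration.

```shell
tmp=$(mktemp -d)
# Stand-in for /etc/group, holding the projectX line from the slide.
printf '%s\n' 'projectX:x:507:Plin_9318,Fyusuf_9287,Rlee_8656,Rdeich_1254,Njuwal_5960,Mmelto_8858,Wbucki_6698,Tespin_0604,Psmallwo_000' > "$tmp/group"

# Anchor the pattern so only the group named projectX is selected,
# keep the member field (4th colon-separated field),
# put one name per line, and let grep -c count the non-empty lines.
members=$(grep '^projectX:' "$tmp/group" | cut -d: -f4 | tr ',' '\n' | grep -c .)

echo "$members"
rm -rf "$tmp"
```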

Searching for Multiple Strings

-f option

– If you have multiple strings that you want to search for, you can put all the strings into a string file, one per line, and use the -f option:

$ cat stringfile

pattern1

pattern2

$ grep -f stringfile filelist
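For a short list of strings, the POSIX -e option can be repeated on the command line instead of creating a string file. The sketch below uses made-up file names and contents to show that both forms select the same lines.

```shell
tmp=$(mktemp -d)
# Hypothetical text file and string file.
printf '%s\n' 'In Soccer,' 'no warm dugouts,' 'play another sport.' > "$tmp/text"
printf '%s\n' 'Soccer' 'dugouts' > "$tmp/stringfile"

# Patterns read from a file, one per line.
from_file=$(grep -f "$tmp/stringfile" "$tmp/text")

# The same patterns given directly, one -e per pattern.
from_args=$(grep -e 'Soccer' -e 'dugouts' "$tmp/text")

echo "$from_file"
rm -rf "$tmp"
```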

grep examples

$ cat soccer.txt

In Soccer,

There are no time outs.

There are no helmets,

no shoulder pads,

no commercial breaks,

no warm dugouts,

no halftime extravaganzas.

So if that’s what you need,

play another sport.

Are you ready?

$ cat words

Soccer

dugouts

helmets

$ grep -f words soccer.txt

In Soccer,

There are no helmets,

no warm dugouts,

$ grep -v -f words soccer.txt

There are no time outs.

no shoulder pads,

no commercial breaks,

no halftime extravaganzas.

So if that’s what you need,

play another sport.

Are you ready?

grep Exercises

• Display all lines of file “test” that do not contain the string “and”, ignoring case

$ grep -iv "and" test

• Display a count of all of the lines of each “.c” file in current directory that contain the strings “num” or “number”

• Answer:

$ cat strings

num

number

$ grep -c -f strings *.c

Advanced grep

• grep is much more powerful when used with regular expressions to match more complex strings.

• Regular expressions are strings of characters and special symbols that are used to match other strings.

Regular Expressions

• A pattern-matching string is called a regular expression (RE)

• grep (and other Unix utilities) can use REs

• Regular Expression Metacharacters:

. (period) match any single character, except newline (similar to wildcard ?)

* (asterisk) match any number (including zero) of the preceding character

.* match any number of any character

Not Filename Expansion!

• Although there are similarities to the metacharacters used in filename expansion, this is different!

• Filename expansion is done by the shell.

• Regular expressions are used by commands (programs).
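A small sketch of the distinction, using invented file names: the unquoted re* is expanded by the shell into filenames before any command runs, while the quoted 're*d' reaches grep unchanged and is read as a regular expression.

```shell
tmp=$(mktemp -d)
# Hypothetical directory contents: two files whose names start with "re",
# plus a text file to search.
: > "$tmp/red"
: > "$tmp/reed"
printf '%s\n' 'rd' 'red' 'ref' > "$tmp/notes"

# Shell filename expansion: re* means "names starting with re".
expanded=$(cd "$tmp" && echo re*)

# Regular expression: re*d means "r, zero or more e, then d",
# so the lines "rd" and "red" are counted but "ref" is not.
count=$(grep -c 're*d' "$tmp/notes")

echo "expanded: $expanded"
echo "count: $count"
rm -rf "$tmp"
```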

More RE Metacharacters

^ (caret) match start of line

$ (dollar) match end of line

Character Sets:

[ ] match any of enclosed characters

[^ ] match anything BUT the enclosed

NOTE: If the caret (^) is anywhere inside a character set except right after the opening bracket, it has no special meaning

Character Set Ranges

• The hyphen (-) character can be used with the square brackets to indicate a range of characters:

[0-9] is the same as [0123456789]

[a-z] is the same as [abcd...wxyz]

• If the hyphen is placed at the beginning or end of the character set, it has no special meaning (and will match a hyphen in the string)
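The two hyphen positions can be compared with a short sketch. The sample file is made up, and LC_ALL=C is set as an assumption so that the range x-z means plain ASCII order regardless of the local language settings.

```shell
tmp=$(mktemp -d)
# Hypothetical data: one character per line.
printf '%s\n' 'x' 'y' 'z' '-' > "$tmp/chars"

# Hyphen between two characters: a range, so x, y and z match.
range=$(LC_ALL=C grep '[x-z]' "$tmp/chars")

# Hyphen at the end of the set: an ordinary character, so x, z and - match.
literal=$(LC_ALL=C grep '[xz-]' "$tmp/chars")

echo "$range"
echo "$literal"
rm -rf "$tmp"
```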

Other Characters

• Any character other than a metacharacter will accept one of itself:

– A single letter a in a regular expression will accept a single letter a in a string

– This is assumed to be case-sensitive; lowercase a doesn't accept uppercase A

• Use a backslash to turn OFF metacharacter processing (i.e. make a metacharacter match itself literally)

• Use quotes around Regular Expressions to prevent SHELL metacharacter interpretation.
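The two rules work together, as the sketch below tries to show with a made-up four-line file. The backslash is meant for grep, but without quotes the shell removes it first, so grep only sees a bare period.

```shell
tmp=$(mktemp -d)
# Hypothetical sample file; two of its four lines contain a period.
printf '%s\n' 'Background is black, and white.' 'I love red,' \
    'and I love blue,' 'but not yellow.' > "$tmp/pattern"

# Quoted: grep receives \. and matches only a literal period.
quoted=$(grep -c '\.' "$tmp/pattern")

# Unquoted: the shell strips the backslash, grep receives . and
# matches any character, so every non-empty line is counted.
unquoted=$(grep -c \. "$tmp/pattern")

echo "quoted=$quoted unquoted=$unquoted"
rm -rf "$tmp"
```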

Example Pattern Matches

• Some sample regular expressions and what they match:

"abc" matches the string abc

"^abc" abc at the beginning of a line

"abc$" abc at the end of a line

"^abc$" abc as the entire line

"[Aa]bc" abc or Abc

"a[aeiuo]c" a, lowercase vowel, c

"a[^aeiou]c" a, not lowercase vowel, c

Example Pattern Matches

• More regular expressions and what they match:

"[x-z]" matches x or y or z

"[-xz]" matches x or - or z

"[xz-]" matches x or - or z (same)

".c" matches any character followed by a c

"\.c" matches .c

"[a-zA-Z0-9]" matches any letter or digit

"[^0-9]" matches any non-digit

"[^^]" matches any single character except ^

NOTE: Inside a character set, the period and the backslash have no special meaning

More Pattern Matches

"[Pp][Aa][Mm]"

– Matches "Pam" or "pam" or "pAM"

– Does not match "am" or "pa"

"[abc]*"

– matches "aaaaa" or "acbca"

"0*10"

– matches "010" or "0000010" or "10"
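Patterns like these can be tried out without creating a file by piping candidate strings into grep. A minimal sketch, with the candidate list chosen only for illustration; note that grep looks for the pattern anywhere in the line, not only as the whole line.

```shell
# Collect the candidate strings that contain a match for 0*10.
matches=""
for s in 010 0000010 10 1 00; do
    # grep -q prints nothing; its exit status says whether a match was found.
    if printf '%s\n' "$s" | grep -q '0*10'; then
        matches="$matches $s"
    fi
done

echo "matched:$matches"
```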

grep examples

$ cat pattern

Background is black, and white.

I love red,

and I love blue,

but not yellow.

$ grep -i "^b" pattern

Background is black, and white.

but not yellow.

$ grep -i b pattern

Background is black, and white.

and I love blue,

but not yellow.

$ grep "^b" pattern

but not yellow.

$ grep "\." pattern

Background is black, and white.

but not yellow.

$ grep "." pattern

Background is black, and white.

I love red,

and I love blue,

but not yellow.

grep examples

$ cat pattern

Background is black, and white.

I love red,

and I love blue,

but not yellow.

$ grep "," pattern

Background is black, and white.

I love red,

and I love blue,

$ grep ",$" pattern

I love red,

and I love blue,

grep examples

$ cat pattern2

rd

reed

red

reef

ref

reep

$ grep "re[df]" pattern2

red

ref

$ grep "re*d" pattern2

rd

reed

red

$ grep "f$" pattern2

reef

ref

$ grep "re*[dp]" pattern2

rd

reed

red

reep

More grep examples (using RE)

Display names of all files in this directory that refer to Unix (or unix)

$ grep -l "[Uu]nix" *

mbox

myfile.txt

script2

$

List soft-linked files only:

$ ls -l | grep "^l"

lrwx------ 2 small000 faculty 512 Jun 4 13:04 t1 -> t

lrwx------ 2 small000 faculty 512 Jun 2 13:43 t2 -> t

$

grep in a script

Save a long listing of the files, then display the number of "old" files (files whose listing shows a date in 2007):

$ cat oldfiles.ksh

#! /bin/ksh

ls -l > listfile

num=`grep 2007 listfile | wc -l`

echo Number of old files: $num

rm listfile

exit 0

$

Extended Regular Expression

Additional Metacharacters (available only with egrep)

+ match one or more of the preceding character

? match zero or one instances of preceding character

| combines REs with either-or matching

( ) groups pattern matching characters
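A short sketch of why the command choice matters, using an invented word list and grep -E as the POSIX spelling of egrep: plain grep treats + as an ordinary character, so the same pattern selects different lines.

```shell
tmp=$(mktemp -d)
# Hypothetical word list; the last line contains a literal plus sign.
printf '%s\n' 'rd' 'red' 'reed' 're+d' > "$tmp/words"

# Basic RE: + is an ordinary character, so only the line "re+d" matches.
basic=$(grep 're+d' "$tmp/words")

# Extended RE: + means "one or more e", so "red" and "reed" match.
extended=$(grep -E 're+d' "$tmp/words")

echo "basic: $basic"
echo "extended:"
echo "$extended"
rm -rf "$tmp"
```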

egrep examples

$ cat pattern2

rd

reed

red

reef

ref

reep

$ egrep "re+d" pattern2

reed

red

$ egrep "re?d" pattern2

rd

red

$ egrep "re+[dp]" pattern2

reed

red

reep

$ egrep "re?[df]" pattern2

rd

red

ref

grep/egrep examples

$ cat pattern

Background is black, and white.

I love red,

and I love blue,

but not yellow.

$ egrep "I|a" pattern

Background is black, and white.

I love red,

and I love blue,

$ grep "red|blue" pattern

$ egrep "red|blue" pattern

I love red,

and I love blue,

$ egrep "^I|^a" pattern

I love red,

and I love blue,

Extended RE Examples

[abc]+d matches "aaaaad" or "acbcad"

but does NOT match "d"

0+10 matches "010" or "0000010"

but does NOT match "10"

x[abc]?x matches "xax", "xbx", "xcx" or "xx"

A[0-9]?B matches "A8B" or "AB"

but does NOT match "a8b" or "A123B"

Grouping

• The parentheses special characters ( and ) can be used to group several subexpressions together and apply a suffix to them as a group:

ba+d accepts bad, baad, baaad, etc

(ba)+d accepts bad, babad, bababad, etc

(ba)+(cd)+ accepts bacd, babacd, bacdcd, bacdcdcdcd, etc
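The grouped and ungrouped forms can be compared with a small sketch. The candidate strings are chosen for illustration, grep -E stands in for egrep, and the POSIX -x option is added as an assumption so that a string counts only when the whole line matches (otherwise ba+d would also find the "bad" at the end of "babad").

```shell
# Candidate strings, one per line.
candidates=$(printf '%s\n' bad baad babad bd)

# + applies to the single character a: b, one or more a, then d.
single=$(printf '%s\n' "$candidates" | grep -E -x 'ba+d')

# + applies to the group (ba): one or more "ba", then d.
group=$(printf '%s\n' "$candidates" | grep -E -x '(ba)+d')

echo "ba+d:"
echo "$single"
echo "(ba)+d:"
echo "$group"
```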

Alternatives

• The (|) "either-or" choice can also be "grouped":

aa|bb will accept either aa or bb

fr(ie|ei)nd will accept friend or freind, and nothing else

• There can be any number of choices:

m(a|e|ai|oo)n will accept man, men, main, or moon

• If all the choices are single characters, then you might as well use a character set:

p(a|e|i)n will accept pan, pen, or pin, and is equivalent to p[aei]n
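A quick way to check these claims, sketched with a made-up word list. grep -E is used in place of egrep and -x limits matches to whole lines.

```shell
# Candidate words, one per line.
words=$(printf '%s\n' friend freind frond pan pen pun)

# Grouped alternatives: only the two spellings are accepted.
alt=$(printf '%s\n' "$words" | grep -E -x 'fr(ie|ei)nd')

# Single-character alternatives and the equivalent character set.
paren=$(printf '%s\n' "$words" | grep -E -x 'p(a|e|i)n')
set_form=$(printf '%s\n' "$words" | grep -x 'p[aei]n')

echo "$alt"
echo "$paren"
```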

More egrep Examples

• You must use egrep in order to have access to the EXTENDED regular expressions:

$ egrep 'ab+c?d' file

match lines with a followed by one or more b’s and an optional c followed by a d

$ egrep '(ab)+c?(de)+' file

match lines with one or more ab’s and an optional c followed by one or more de’s

Handout

• See handout for more fgrep, grep, and egrep examples

grep/egrep Exercise

• Display all lines of file “test” that end in the letters x, y, or z

$ grep '[xyz]$' test

OR

$ egrep 'x$|y$|z$' test
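Both answers can be compared on a scratch file. In this sketch the file contents are invented and grep -E is used as the POSIX spelling of egrep.

```shell
tmp=$(mktemp -d)
# Hypothetical contents for the file "test".
printf '%s\n' 'box' 'boxes' 'lazy' 'quiz' 'plain' > "$tmp/test"

# Character set anchored at the end of the line.
with_set=$(grep '[xyz]$' "$tmp/test")

# Three anchored alternatives; should select the same lines.
with_alt=$(grep -E 'x$|y$|z$' "$tmp/test")

echo "$with_set"
rm -rf "$tmp"
```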