UNIX UNIT 1

Commands and shell programming (3)


Page 1: Commands and shell programming (3)

UNIX UNIT 1

Page 2: Commands and shell programming (3)

Features of UNIX
1. Multiuser System
2. Multitasking System
3. The Building Block Approach
4. The UNIX tool kit
5. Pattern Matching
6. Programming Facility
7. Documentation

Page 3: Commands and shell programming (3)

Locating Commands
• type: To find the location of an executable program, use the type command.
$ type ls
ls is /bin/ls

• echo $PATH: displays the list of directories the shell searches for commands.

Page 4: Commands and shell programming (3)

Internal and External Commands

A program or file that has an independent existence in the /bin directory (or /usr/bin) is called an external command.

Most commands are external, but some are not found anywhere on disk, and some are not executed even when they exist in one of the directories specified by PATH. These are called internal commands; they are built into the shell.
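The distinction can be checked from the prompt. A minimal sketch, assuming a POSIX-style shell in which cd is a builtin and ls is an external file; the variable names are illustrative only:

```shell
# type reports how the shell would resolve each name.
type cd    # cd is built into the shell (an internal command)
type ls    # ls is normally an external file such as /bin/ls

# command -v is the POSIX counterpart; keep the results for inspection.
builtin_name=$(command -v cd)     # a builtin is reported by name only
external_path=$(command -v ls)    # an external command is reported by its path
echo "cd resolves to: $builtin_name"
echo "ls resolves to: $external_path"
```

The exact wording printed by type differs between shells, so only the resolved name and path are stored.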

Page 5: Commands and shell programming (3)

Command Structure
“Command arguments”
Commands and arguments have to be separated by spaces or tabs so that the system can interpret them as words.
• Options
• Filename Arguments
• Exceptions

Page 6: Commands and shell programming (3)

General purpose utilities:
cal produces the calendar of a month or year.
cal [ [ month ] year ]
cal 03 2006
date can display any component of the system date and time in a number of formats.
date +%m
date +%h
date +"%h %m"

Page 7: Commands and shell programming (3)

echo: displays a message on the screen.
printf: works like echo but can use format specifiers such as %d, %s, …
bc: the calculator (xcalc is its graphical counterpart); settings include scale, ibase, and obase.
script: the UNIX system's recorder, which logs all activities of a user in a separate file.
passwd: changes a user's password; the password is not displayed on the screen as it is typed. The command prompts for the old password before the new one.

Page 8: Commands and shell programming (3)

who shows the users working on the system and the time of logging in.
who am i
uname: reports the machine's characteristics; by default it displays the name of the operating system. It reveals the operating system release (-r) and the host name (-n) that is used by networking commands.
tty: (teletype) displays the device name of your terminal.
stty: displays and sets various terminal attributes. Use stty sane to set the terminal to some standard values.

Page 9: Commands and shell programming (3)

File
Ordinary File – also known as a regular file. It contains only data as a stream of characters.
Directory File – It’s commonly said that a directory contains files and other directories, but strictly speaking, it contains their names and a number associated with each name.
Device File – All devices and peripherals are represented by files. To read or write a device, you have to perform these operations on its associated file.

Page 10: Commands and shell programming (3)

File name

• A file name can consist of up to 255 characters.
• Files may or may not have extensions.
• It can consist of practically any ASCII character except the / and the NULL character.
• Users are permitted to use control characters.
• It is recommended that only the following characters should be used:
  ◦ Alphabetic characters and numerals.
  ◦ The period (.), hyphen (-), and underscore (_).
• A file can have as many dots embedded in its name as you like; a file name can also begin with a dot or end with a dot.
• UNIX is case sensitive.

Page 11: Commands and shell programming (3)

The Parent Child Relationship

The file system is a hierarchical structure, and the top-most directory is called root. Files and directories have a parent-child relationship.

Page 12: Commands and shell programming (3)

Directory Related Commands

pwd: Print Working Directory; shows the present working directory.
cd: Change Directory.
mkdir: Make Directory.
rmdir: Remove Directory. A subdirectory cannot be removed with rmdir unless it is empty and you are positioned in its parent directory or above. A directory can also be removed without rmdir: rm -r removes a directory tree recursively even if it is not empty.

Page 13: Commands and shell programming (3)

Options of ls command

Page 14: Commands and shell programming (3)

THE UNIX FILE SYSTEM

Page 15: Commands and shell programming (3)
Page 16: Commands and shell programming (3)

File Related Commands

cat is used not only to display one or more files but also to create a file.
cp copies files.
rm removes files.
mv (move) renames a file or moves it to another directory.
more is a pager that supports a repeat factor. It can search for a pattern (/) and repeat the search (n).
lp prints a file and can directly print PostScript documents.
file identifies the file type beyond the normal three categories.
wc counts the number of lines, words, and characters.
od displays the octal value of each character and is used to display invisible characters.
cmp tells us where the first difference was encountered.
comm shows the lines that are common and optionally shows the lines unique to either or both of the sorted files.
diff lists the differences as a sequence of instructions.

Page 17: Commands and shell programming (3)

Compressing and Archiving files

gzip and gunzip compress and decompress individual files (extension .gz).

tar always works recursively to archive a group of files into a single archive. tar and gzip are often used together to create compressed archives (extension .tar.gz).

zip and unzip can perform all functions that are found in gzip, gunzip, and tar. zip alone can create a compressed archive from directory structures(-r).

Page 18: Commands and shell programming (3)

ls -l (Listing file attributes)

• File type and permissions
• Links
• Ownership
• Group ownership
• File size
• Last modification time
• File name

Page 19: Commands and shell programming (3)

Permissions

There are three types of file access supported by UNIX.
r – read, view the contents of a file or a directory
w – write, edit file/directory contents
x – execute, run executable file

Page 20: Commands and shell programming (3)

Permissions

Here’s an example. Suppose you type ls -l and the result is:

- rwx r-x r-- 1 s v 858 Aug 22 22:28 test.sh

What do all these symbols mean?


Page 21: Commands and shell programming (3)

Permissions

- rwx r-x r-- 1 sv sv 555 Jan 12 20:12 Test

Fields, left to right: type, user permissions, group permissions, other permissions, links, owner, group, size, modification date/time, file name

Page 22: Commands and shell programming (3)

Permissions
User – the person who created the file.
Group – the group that owns the file.
Others – the rest of the world.
“754” looks like a decimal number, but each digit is an octal digit that can be represented with a three-bit binary number.
4 => read permission
2 => write permission
1 => execute permission

Page 23: Commands and shell programming (3)

Permissions

read=4; write=2; execute=1

rwx         r-x         r--
4 + 2 + 1   4 + 0 + 1   4 + 0 + 0
7           5           4
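The arithmetic can be confirmed on a scratch file. A small sketch, assuming mktemp is available; demo_file and the mode_* variables are hypothetical names, and cut keeps only the type character and the nine permission characters:

```shell
# Work on a scratch file so nothing real is touched.
demo_file=$(mktemp)

chmod 754 "$demo_file"                        # 7=rwx 5=r-x 4=r--
mode_754=$(ls -l "$demo_file" | cut -c1-10)   # file type + 9 permission characters
echo "$mode_754"

chmod 640 "$demo_file"                        # 6=rw- 4=r-- 0=---
mode_640=$(ls -l "$demo_file" | cut -c1-10)
echo "$mode_640"

rm -f "$demo_file"
```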


Page 24: Commands and shell programming (3)

Permissions
rwx r-x r-- is a symbolic way to specify file modes, while 754 is a numeric way (remember 7 = 111, 5 = 101, 4 = 100?).

How would you represent this file mode numerically? --x --x -wx

How would you represent this bit string symbolically? 6 1 4

Page 25: Commands and shell programming (3)

Permissions
chmod mode file(s): Change the access mode of one or more files.
Examples:
chmod 751 my_file
The owner of my_file has rwx (7) permission, the group has r-x (5) permission, others have --x (1) permission.

What will the following command do?
chmod u=rwx,g=r,o=wr my_file

Page 26: Commands and shell programming (3)

u : User
g : Group
o : Others
a : All (ugo)

+ : Assigns permission
- : Removes permission
= : Assigns absolute permission

r : Read permission
w : Write permission
x : Execute permission
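The class letters and operators above combine into symbolic modes. A minimal sketch on a scratch file (the file and variable names are hypothetical) that applies one absolute assignment, one removal, and one addition in turn:

```shell
demo_file=$(mktemp)

chmod u=rwx,g=rx,o=r "$demo_file"    # = assigns absolute permissions per class
step1=$(ls -l "$demo_file" | cut -c1-10)

chmod o-r "$demo_file"               # - removes read from others
step2=$(ls -l "$demo_file" | cut -c1-10)

chmod a+w "$demo_file"               # + adds write for user, group, and others
step3=$(ls -l "$demo_file" | cut -c1-10)

printf '%s\n' "$step1" "$step2" "$step3"
rm -f "$demo_file"
```

Note that the clauses of a symbolic mode are separated by commas with no spaces between them.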

Page 27: Commands and shell programming (3)

File Permissions
UNIX also provides a way to protect files based on users and groups.
Three types of permissions:
  read, process may read contents of file
  write, process may write contents of file
  execute, process may execute file
Three sets of permissions:
  permissions for owner
  permissions for group
  permissions for other

Page 28: Commands and shell programming (3)

Directory permissions
Same types and sets of permissions as for files
  read: process may read the directory contents (i.e., list files)
  write: process may add/remove files in the directory
  execute: process may open files in the directory or its subdirectories

Page 29: Commands and shell programming (3)

Utilities for Manipulating file attributes
chmod change file permissions
chown change file owner
chgrp change file group
only owner or super-user can change file attributes

Page 30: Commands and shell programming (3)

Chmod command
Symbolic access modes {u,g,o} / {r,w,x}
example: chmod +r file

Octal access modes
octal  read  write  execute
0      no    no     no
1      no    no     yes
2      no    yes    no
3      no    yes    yes
4      yes   no     no
5      yes   no     yes
6      yes   yes    no
7      yes   yes    yes

Page 31: Commands and shell programming (3)

Directory Permissions
The default permissions of a directory on any system will usually be 755. A directory should never be writable by group and others; if it is, every user can remove files in the directory.

Page 32: Commands and shell programming (3)

Changing File Ownership

chown : Changing File Ownership

chown options owner [:group] file(s)

Changing ownership requires superuser permission.

$ su
Password: *********
# _

Page 33: Commands and shell programming (3)

chgrp: Changing Group Owner

The group owner of a file is the group to which the owner belongs. The chgrp command changes a file’s group owner.

A user can change the group owner of a file, but only to a group to which the user belongs.

A user can belong to multiple groups.

Page 34: Commands and shell programming (3)

UNIX – The vi Editor
vi Basics: operates in 3 modes

Command Mode: the default mode, in which every key pressed is interpreted as a command to run on the text. Navigation, copying, and deleting text are performed in this mode. You cannot use this mode to enter or replace text. Press h, j, k, or l (el) to move the cursor (h left, l right, k up, j down).

Insert Mode: ready to input text. Press <Enter> at the end of each line, use backspace to wipe out unwanted text, and use [Ctrl-w] to erase the entire word. Press [Esc] to return to Command Mode. Text entered in Insert Mode is held in a buffer and is not saved; to save it, switch to ex Mode.

ex Mode or Last Line Mode: while in Command Mode, enter : (colon), then enter x and press <Enter> (:x <Enter>). The file is now saved and you are back at the $ prompt.

07/20/11 34

Page 35: Commands and shell programming (3)

UNIX – The vi Editor
• Insert Mode
• Invoke this mode by pressing one of the following keys:
  • i – inserts text to the left of the cursor
  • a – appends text to the right of the cursor
  • I – inserts text at the beginning of the line
  • A – appends text at the end of the line
  • o – opens a line below
  • O – opens a line above
  • rch – replaces a single character with ch (no Esc required)
  • R – replaces text from the cursor to the right
  • s – replaces a single character with any number of characters
  • S – replaces the entire line
• The File .exrc
  • vi reads the file $HOME/.exrc (same as ~/.exrc in some shells) on startup. You can use it to create abbreviations and redefine your keys to behave differently.

Page 36: Commands and shell programming (3)

UNIX – The vi Editor
Input Mode – Entering and Replacing Text
Insert and append (i, a, I, A)
  I Inserts text at beginning of line
  A Appends text at end of line
Replace (r, R, s, S)
Open a line (o, O)
  o Opens line below
  O Opens line above

:set showmode : messages like INSERT MODE, REPLACE MODE, CHANGE MODE will now appear on the last line. You can add this to the .exrc file to make the setting permanent.

Saving Text and Quitting – ex mode
:w saves file and remains in editing mode
:wq saves file and quits editing mode
:x saves file and quits editing mode
:q quits editing mode when no changes are made to the file
:q! quits editing mode after abandoning changes

Page 37: Commands and shell programming (3)

UNIX – The vi Editor
• Saving Text and Quitting
• :w – save and continue
• :w anotherfile – save to a different file
• It is common practice to ignore the readonly label on the file
  • what happens when trying :w?
• :x and :wq – save and exit
• :q – quit when no changes have been made; :q! – abandon all changes and quit
• You can also press [Ctrl-z] to suspend the process

Page 38: Commands and shell programming (3)

UNIX – The vi Editor
• Escape to the Shell (:sh and [Ctrl-z])
• :sh
  • returns to shell prompt
  • execute cc to compile C program
  • use [Ctrl-d] or exit to return to the editor
• use [Ctrl-z] to suspend the current vi session. Run your commands and use fg to return to the editor (job control).

Page 39: Commands and shell programming (3)

UNIX – The vi Editor
• Deleting Text
  • x – deletes one or more characters
  • X – deletes to the left
  • dd – deletes the current line
• Joining Lines
  • J – joins two lines: the current line and the line following it
  • 4J joins the following 3 lines with the current one
• Changing Case (~)
  • changes the case of the text

Page 40: Commands and shell programming (3)

UNIX – The vi Editor
• Copying lines
  • yy – copy current line
  • y3y – will copy 3 lines
  • "ayy and "ap to copy lines from one file to another
    • open the new file in ex Mode :e newfile
• Paste
  • after yy; p will paste the lines
• Undoing Last Editing
  • u – will undo the last change
• Substitution – Search and Replace (:s)

Page 41: Commands and shell programming (3)

Shell
The shell is a process that runs when a user logs in and terminates when she logs out. It scans the command line for metacharacters and rebuilds it before turning it over to the kernel for execution.

Shells are grouped into two categories here:
• The Bourne family comprising the Bourne shell (/bin/sh) and its derivatives – the Korn shell (/bin/ksh) and Bash (/bin/bash).
• The C shell (/bin/csh) and its derivative, Tcsh (/bin/tcsh)

The shell matches filenames with wild-cards that have to be expanded before the command is executed. It can match any character (*) or a single one (?). It can also match a range ([ ]) and negate a match ([!]). The * and ? don’t match a filename beginning with a dot.
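These wild-card rules can be tried in a scratch directory. The sketch below assumes mktemp -d is available; the file names are invented purely for illustration:

```shell
scratch_dir=$(mktemp -d)
cd "$scratch_dir" || exit 1
touch chap01 chap02 chapx notes.txt .hidden

all_visible=$(echo *)            # * matches any string, but not a leading dot
single_char=$(echo chap0?)       # ? matches exactly one character
in_range=$(echo chap0[1-2])      # [ ] matches one character from the range
negated=$(echo chap[!0]*)        # [! ] matches one character NOT in the set

printf '%s\n' "$all_visible" "$single_char" "$in_range" "$negated"
cd - > /dev/null && rm -rf "$scratch_dir"
```

The shell performs the expansion before echo runs, so echo only ever sees the resulting file names.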

Page 42: Commands and shell programming (3)

Redirection of input/output
• Redirection of output: >
  – example: $ ls -l > my_files
• Redirection of input: <
  – example: $ cat < input.data
• Append output: >>
  – example: $ date >> logfile
• Arbitrary file descriptor redirection: fd>
  – example: $ ls -l 2> error_log

Page 43: Commands and shell programming (3)

Multiple Redirection
• cmd 2> file
  – send standard error to file
  – standard output remains the same
• cmd > file 2>&1
  – send both standard error and standard output to file
• cmd > file1 2> file2
  – send standard output to file1
  – send standard error to file2
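A short sketch of the two multi-stream forms. emit is a hypothetical stand-in for any command that writes to both streams, and the temporary files exist only for the demonstration:

```shell
out_file=$(mktemp)
err_file=$(mktemp)
both_file=$(mktemp)

# Hypothetical command that writes one line to each stream.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

emit > "$out_file" 2> "$err_file"    # cmd > file1 2> file2
emit > "$both_file" 2>&1             # cmd > file 2>&1

stdout_only=$(cat "$out_file")
stderr_only=$(cat "$err_file")
combined_lines=$(wc -l < "$both_file" | tr -d ' ')

echo "$stdout_only / $stderr_only / $combined_lines lines combined"
rm -f "$out_file" "$err_file" "$both_file"
```

Order matters in the second form: 2>&1 must follow > file so that standard error is pointed at the already redirected standard output.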

Page 44: Commands and shell programming (3)

Aliases
• Like macros (#define in C)
• Shorter to define than functions, but more limited
• Not recommended for scripts
• Example: alias rm='rm -i'

Page 45: Commands and shell programming (3)

Introduction to Filters

A class of Unix tools called filters.
◦ Utilities that read from standard input, transform the file, and write to standard out
Using filters can be thought of as data oriented programming.
◦ Each step of the computation transforms the data stream.

Page 46: Commands and shell programming (3)

Examples of Filters

Sort
◦ Input: lines from a file
◦ Output: lines from the file sorted
Grep
◦ Input: lines from a file
◦ Output: lines that match the argument
Awk
◦ Programmable filter

Page 47: Commands and shell programming (3)

cat: The simplest filter

The cat command copies its input to output unchanged (identity filter). When supplied a list of file names, it concatenates them onto stdout.

Some options:
◦ -n number output lines (starting from 1)
◦ -v display control characters in visible form (e.g. ^C)

Page 48: Commands and shell programming (3)

head

Display the first few lines of a specified file

Syntax: head [-n] [filename...]
◦ -n - number of lines to display, default is 10
◦ filename... - list of filenames to display
When more than one filename is specified, the start of each file's listing displays
==>filename<==

Page 49: Commands and shell programming (3)

tail

Displays the last part of a file
Syntax: tail +|-number [lbc] [f] [filename]
    or: tail +|-number [l] [rf] [filename]
◦ +number - begins copying at distance number from beginning of file, if number isn’t given, defaults to 10
◦ -number - begins from end of file
◦ l,b,c - number is in units of lines/blocks/characters
◦ r - print in reverse order (lines only)
◦ f - if input is not a pipe, do not terminate after end of file has been copied but loop. This is useful to monitor a file being written by another process

Page 50: Commands and shell programming (3)

head and tail examples

head /etc/passwd
head *.c
tail +20 /etc/passwd
ls -lt | tail -3
head -100 /etc/passwd | tail -5
tail -f /usr/local/httpd/access_log
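The head-then-tail idiom above selects a window of lines. The sketch below uses ten generated lines in place of a real file and the -n spelling of the line count, which current systems accept alongside the older -100 form; the variable names are illustrative:

```shell
# Ten numbered lines stand in for a real file.
sample=$(printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10)

first_three=$(echo "$sample" | head -n 3)
last_two=$(echo "$sample" | tail -n 2)
middle=$(echo "$sample" | head -n 6 | tail -n 2)   # lines 5 and 6

echo "$middle"
```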

Page 51: Commands and shell programming (3)

tee

Copy standard input to standard output and one or more files
◦ Captures intermediate results from a filter in the pipeline

[Diagram: a Unix command feeds tee, which passes the data to standard output and writes a copy to file-list]

Page 52: Commands and shell programming (3)

tee con’t

Syntax: tee [ -ai ] file-list
◦ -a - append to output file rather than overwrite, default is to overwrite (replace) the output file
◦ -i - ignore interrupts
◦ file-list - one or more file names for capturing output
Examples
ls | head -10 | tee first_10 | tail -5
who | tee user_list | wc
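A self-contained variant of the first example, with generated input instead of ls so the result is predictable. capture_file is a hypothetical scratch file:

```shell
capture_file=$(mktemp)

# tee saves the intermediate three lines while the pipeline continues to tail.
final=$(printf '%s\n' alpha beta gamma delta | head -n 3 | tee "$capture_file" | tail -n 1)
saved=$(cat "$capture_file")

echo "final: $final"
echo "saved: $saved"
rm -f "$capture_file"
```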

Page 53: Commands and shell programming (3)

cut: select columns

The cut command prints selected parts of input lines.
◦ can select columns (assumes tab-separated input)
◦ can select a range of character positions
Some options:
◦ -f listOfCols: print only the specified columns (tab-separated) on output
◦ -c listOfPos: print only chars in the specified positions
◦ -d c: use character c as the column separator
Lists are specified as ranges (e.g. 1-5) or comma-separated (e.g. 2,4,5).

Page 54: Commands and shell programming (3)

cut examples

cut -f 1 < data
cut -f 1-3 < data
cut -f 1,4 < data
cut -f 4- < data
cut -d'|' -f 1-3 < data
cut -c 1-4 < data
Unfortunately, there's no way to refer to "last column" without counting the columns.
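To see what the field and character lists select, here is a sketch run against two made-up records that use | as the separator:

```shell
# Two hypothetical records with '|' as the column separator.
records=$(printf '%s\n' 'ann|sales|london|42' 'bob|dev|paris|37')

first_three=$(echo "$records" | cut -d'|' -f 1-3)    # a range of fields
name_and_age=$(echo "$records" | cut -d'|' -f 1,4)   # a comma-separated list
first_chars=$(echo "$records" | cut -c 1-3)          # character positions

printf '%s\n' "$first_three" "$name_and_age" "$first_chars"
```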

Page 55: Commands and shell programming (3)

paste: join columns

The paste command displays several text files "in parallel" on output.

If the inputs are files a, b, c
◦ the first line of output is composed of the first lines of a, b, c
◦ the second line of output is composed of the second lines of a, b, c
Lines from each file are separated by a tab character.
If files are different lengths, output has all lines from longest file, with empty strings for missing lines.

Page 56: Commands and shell programming (3)

paste example

cut -f 1 < data > data1
cut -f 2 < data > data2
cut -f 3 < data > data3
paste data1 data3 data2 > newdata
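The tab joining and the handling of files of unequal length can be observed with two tiny files. The names a and b are placeholders created in a scratch directory:

```shell
work_dir=$(mktemp -d)
printf '%s\n' a1 a2 a3 > "$work_dir/a"
printf '%s\n' b1 b2    > "$work_dir/b"

# Lines are joined with a tab; the shorter file contributes an empty field.
pasted=$(paste "$work_dir/a" "$work_dir/b")
echo "$pasted"
rm -rf "$work_dir"
```

The third output line is a3 followed by a tab and nothing else, because b has no third line.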

Page 57: Commands and shell programming (3)

sort: Sort lines of a file

The sort command copies input to output but ensures that the output is arranged in ascending order of lines.
◦ By default, sorting is based on ASCII comparisons of the whole line.
Other features of sort:
◦ understands text data that occurs in columns (can also sort on a column other than the first)
◦ can distinguish numbers and sort appropriately
◦ can sort files "in place" as well as behaving like a filter
◦ capable of sorting very large files

Page 58: Commands and shell programming (3)

sort: Options

Syntax: sort [-dftnr] [-o filename] [filename(s)]
-d Dictionary order, only letters, digits, and whitespace are significant in determining sort order
-f Ignore case (fold into lower case)
-t Specify delimiter
-n Numeric order, sort by arithmetic value instead of first digit
-r Sort in reverse order
-o filename - write output to filename, filename can be the same as one of the input files
Many more options…

Page 59: Commands and shell programming (3)

sort: Specifying fields

Delimiter: -td
Old way:
◦ +f[.c][options] [-f[.c][options]]
  +2.1 -3    +0 -2    +3n
◦ Exclusive
◦ Start from 0 (unlike cut, which starts at 1)
New way:
◦ -k f[.c][options][,f[.c][options]]
  -k2.1    -k0,1    -k3n
◦ Inclusive
◦ Start from 1

Page 60: Commands and shell programming (3)

sort Examples

sort +2nr < data
sort -k2nr data
sort -t: +4 /etc/passwd
sort -o mydata mydata
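A sketch of the -k form on made-up name:score records, contrasting the default whole-line comparison with a numeric sort on the second field:

```shell
scores=$(printf '%s\n' 'carol:9' 'alice:25' 'bob:100')

by_name=$(echo "$scores" | sort)                     # whole-line comparison
by_score=$(echo "$scores" | sort -t: -k2,2n)         # field 2, numeric, ascending
by_score_desc=$(echo "$scores" | sort -t: -k2,2nr)   # field 2, numeric, reversed

printf '%s\n' "$by_name" "$by_score" "$by_score_desc"
```

Writing the key as -k2,2 limits the comparison to field 2 alone; a bare -k2 would extend the key to the end of the line.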

Page 61: Commands and shell programming (3)

uniq: list UNIQue items

Remove or report adjacent duplicate lines
Syntax: uniq [ -cdu ] [ input-file ] [ output-file ]
-c Supersede the -u and -d options and generate an output report with each line preceded by an occurrence count
-d Write only the duplicated lines
-u Write only those lines which are not duplicated
◦ The default output is the union (combination) of -d and -u

Page 62: Commands and shell programming (3)

wc: Counting results

The word count utility, wc, counts the number of lines, characters or words
Options:
-l Count lines
-w Count words
-c Count characters
Default: count lines, words and chars

Page 63: Commands and shell programming (3)

wc and uniq Examples

who | sort | uniq -d
wc my_essay
who | wc
sort file | uniq | wc -l
sort file | uniq -d | wc -l
sort file | uniq -u | wc -l
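Because uniq only compares adjacent lines, the input is sorted first. The sketch below uses an invented list of login names in place of who output; tr strips the padding some wc implementations add:

```shell
logins=$(printf '%s\n' root ann ann bob root ann)

distinct_count=$(echo "$logins" | sort | uniq | wc -l | tr -d ' ')
repeated=$(echo "$logins" | sort | uniq -d)      # lines that occur more than once
singletons=$(echo "$logins" | sort | uniq -u)    # lines that occur exactly once

echo "$distinct_count distinct"
echo "repeated: $repeated"
echo "single: $singletons"
```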

Page 64: Commands and shell programming (3)

tr: TRanslate Characters

Copies standard input to standard output with substitution or deletion of selected characters

Syntax: tr [ -cds ] [ string1 ] [ string2 ]
• -d delete all input characters contained in string1
• -c complements the characters in string1 with respect to the entire ASCII character set
• -s squeeze all strings of repeated output characters in the last operand to single characters

Page 65: Commands and shell programming (3)

tr (continued)

tr reads from standard input.
◦ Any character that does not match a character in string1 is passed to standard output unchanged
◦ Any character that does match a character in string1 is translated into the corresponding character in string2 and then passed to standard output
Examples
◦ tr s z replaces all instances of s with z
◦ tr so zx replaces all instances of s with z and o with x
◦ tr a-z A-Z replaces all lower case characters with upper case characters
◦ tr -d a-c deletes all a-c characters

Page 66: Commands and shell programming (3)

tr uses

Change delimiter
tr '|' ':'
Rewrite numbers
tr ,. .,
Import DOS files
tr -d '\r' < dos_file
Find printable ASCII in a binary file
tr -cd '\n[a-zA-Z0-9 ]' < binary_file
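A few of these uses, run on literal input so the effect is visible. The sample strings are made up for illustration:

```shell
upper=$(echo 'unix tools' | tr 'a-z' 'A-Z')             # translate a range
colon_sep=$(echo 'a|b|c' | tr '|' ':')                   # change delimiter
no_cr=$(printf 'dos line\r\n' | tr -d '\r')              # delete carriage returns
squeezed=$(echo 'too    many   spaces' | tr -s ' ')      # squeeze repeats

printf '%s\n' "$upper" "$colon_sep" "$no_cr" "$squeezed"
```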

Page 67: Commands and shell programming (3)

find utility and xargs

find . -type f -print | xargs wc -l
◦ -type f for files
◦ -print to print them out
◦ xargs invokes wc 1 or more times
    wc -l a b c d e f g
    wc -l h i j k l m n o
    …
Compare to: find . -type f -exec wc -l {} \;
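The two forms reach the same files but launch the command a different number of times. The sketch below builds a small scratch tree and, as a simplification, runs cat rather than wc -l on each file so that the totals are easy to compare; it assumes file names without spaces:

```shell
tree=$(mktemp -d)
mkdir "$tree/sub"
printf 'one\ntwo\n' > "$tree/a.txt"
printf 'three\n'    > "$tree/sub/b.txt"

# Batch form: xargs passes many file names to a single cat.
lines_via_xargs=$(find "$tree" -type f -print | xargs cat | wc -l | tr -d ' ')

# Per-file form: -exec runs cat once for each file found.
lines_via_exec=$(find "$tree" -type f -exec cat {} \; | wc -l | tr -d ' ')

echo "$lines_via_xargs $lines_via_exec"
rm -rf "$tree"
```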

Page 68: Commands and shell programming (3)

Regular Expressions
◦ Allow you to search for text in files
◦ grep command
Utilities that let you write high level programs for stream manipulation:
◦ sed

Page 69: Commands and shell programming (3)

grep and sed

Regular Expressions
◦ Allow you to search for text in files
◦ grep command
Stream manipulation:
◦ sed

Page 70: Commands and shell programming (3)

Regular Expressions

Page 71: Commands and shell programming (3)

What Is a Regular Expression?

A regular expression (regex) describes a set of possible input strings.

Regular expressions descend from a fundamental concept in Computer Science called finite automata theory

Regular expressions are endemic to Unix
◦ vi, ed, sed, and emacs
◦ awk, tcl, perl and Python
◦ grep, egrep, fgrep
◦ compilers

Page 72: Commands and shell programming (3)

Regular Expressions

The simplest regular expressions are a string of literal characters to match.

The string matches the regular expression if it contains the substring.

Page 73: Commands and shell programming (3)

regular expression: d
UNIX Tools bad       (match)
UNIX Tools good      (match)
UNIX Tools is okay.  (no match)

Page 74: Commands and shell programming (3)

Regular Expressions

A regular expression can match a string in more than one place.

regular expression: apple
Greenapple and the apple.  (match 1: the apple in Greenapple; match 2: the final apple)

Page 75: Commands and shell programming (3)

Regular Expressions

The . regular expression can be used to match any character.

regular expression: o.
For me to sit on  (slide labels: match 1, match 2)

Page 76: Commands and shell programming (3)

Character Classes

Character classes [] can be used to match any specific set of characters.

regular expression: .[ie]n
Ten men in a den  (slide labels: match 1, match 2, match 3)

Page 77: Commands and shell programming (3)

Negated Character Classes

Character classes can be negated with the [^] syntax.

regular expression: [^td]en
Ten men in a den  (slide label: match)

Page 78: Commands and shell programming (3)

More About Character Classes

[aeiou] will match any of the characters a, e, i, o, or u
[kK]orn will match korn or Korn
Ranges can also be specified in character classes
[1-9] is the same as [123456789]
[abcde] is equivalent to [a-e]
You can also combine multiple ranges
[abcde123456789] is equivalent to [a-e1-9]
Note that the - character has a special meaning in a character class but only if it is used within a range; [-123] would match the characters -, 1, 2, or 3
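Character classes are case sensitive, which is easy to overlook. In the sketch below each word of the sample sentence is placed on its own line (an arrangement chosen only for this illustration) so that grep reports matches word by word:

```shell
# Each word of the sample sentence on its own line.
words=$(printf '%s\n' Ten men in a den)

lower_td=$(echo "$words" | grep '[td]en')       # t or d, then en
not_td=$(echo "$words" | grep '[^td]en')        # any character except t or d, then en
named=$(echo "$words" | grep '^[[:upper:]]')    # named class: starts with an uppercase letter

printf '%s\n' "$lower_td" "$not_td" "$named"
```

Ten is selected by [^td]en because the uppercase T is neither t nor d.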

Page 79: Commands and shell programming (3)

Named Character Classes

Commonly used character classes can be referred to by name (alpha, lower, upper, alnum, digit, punct, cntrl)

Syntax [:name:]
◦ [a-zA-Z] is equivalent to [[:alpha:]]
◦ [a-zA-Z0-9] is equivalent to [[:alnum:]]
◦ [45a-z] is equivalent to [45[:lower:]]

Important for portability across languages

Page 80: Commands and shell programming (3)

Anchors

Anchors are used to match at the beginning or end of a line (or both).

^ means beginning of the line
$ means end of the line

Page 81: Commands and shell programming (3)

regular expression: ^T[ie]n
Ten men in a den  (match: the leading Ten)

regular expression: d[eor]n$
Ten men in a den  (match: the trailing den)

^$ matches an empty line
^word$ matches a line consisting only of word
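The effect of each anchor can be counted with grep -c. The four input lines below, one of them empty, are invented for the demonstration:

```shell
text=$(printf '%s\n' 'word' 'a word here' '' 'words')

exact=$(echo "$text" | grep '^word$')        # the whole line is word
starts=$(echo "$text" | grep -c '^word')     # lines beginning with word
ends=$(echo "$text" | grep -c 'word$')       # lines ending with word
empty=$(echo "$text" | grep -c '^$')         # empty lines

echo "$exact $starts $ends $empty"
```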

Page 82: Commands and shell programming (3)

Repetition

The * is used to define zero or more occurrences of the single regular expression preceding it.

Page 83: Commands and shell programming (3)

regular expression: ya*y
I got mail, yaaaaaaaaaay!  (match: yaaaaaaaaaay)

regular expression: .o*.
For me to look on.  (match highlighted on the slide)

.*

Page 84: Commands and shell programming (3)

Repetition Ranges

Ranges can also be specified
◦ { } notation can specify a range of repetitions for the immediately preceding regex
◦ {n} means exactly n occurrences
◦ {n,} means at least n occurrences
◦ {n,m} means at least n occurrences but no more than m occurrences
Example:
◦ .{0,} same as .*
◦ a{2,} same as aaa*

Page 85: Commands and shell programming (3)

Subexpressions

If you want to group part of an expression so that * or { } applies to more than just the previous character, use ( ) notation
Subexpressions are treated like a single character
◦ a* matches 0 or more occurrences of a
◦ abc* matches ab, abc, abcc, abccc, …
◦ (abc)* matches abc, abcabc, abcabcabc, …
◦ (abc){2,3} matches abcabc or abcabcabc
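The difference between abc* and (abc){2,3} shows up when both are run over the same candidates. This sketch uses grep -E (the standard spelling of egrep) and adds ^ and $ anchors, an assumption made here so that each pattern must account for the whole line:

```shell
candidates=$(printf '%s\n' ab abc abccc abcabc abcabcabc)

# * applies only to the preceding c: ab, abc, and abccc qualify.
star_on_c=$(echo "$candidates" | grep -E -c '^abc*$')

# The parentheses make {2,3} apply to the whole group abc.
group_repeat=$(echo "$candidates" | grep -E '^(abc){2,3}$')

echo "$star_on_c"
echo "$group_repeat"
```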

Page 86: Commands and shell programming (3)

grep

grep comes from the ed (Unix text editor) search command g/re/p, which stands for “globally search for a regular expression and print the matching lines”

This was such a useful command that it was written as a standalone utility

There are two other variants, egrep and fgrep that comprise the grep family

grep is the answer to the moments where you know you want the file that contains a specific phrase but you can’t remember its name

Page 87: Commands and shell programming (3)

Family Differences

grep - uses regular expressions for pattern matching

fgrep - fixed-string grep, does not use regular expressions, only matches fixed strings but can get search strings from a file

egrep - extended grep, uses a more powerful set of regular expressions but does not support backreferencing, generally the fastest member of the grep family

agrep – approximate grep; not standard

Page 88: Commands and shell programming (3)

Syntax

Regular expression concepts we have seen so far are common to grep and egrep.
grep and egrep have different syntax
◦ grep: BREs
◦ egrep: EREs (enhanced features we will discuss)
Major syntax differences:
◦ grep: \( and \), \{ and \}
◦ egrep: ( and ), { and }

Page 89: Commands and shell programming (3)

Protecting Regex Metacharacters

Since many of the special characters used in regexes also have special meaning to the shell, it’s a good idea to get in the habit of single quoting your regexes
◦ This will protect any special characters from being operated on by the shell
◦ If you habitually do it, you won’t have to worry about when it is necessary

Page 90: Commands and shell programming (3)

Escaping Special Characters

Even though we are single quoting our regexes so the shell won’t interpret the special characters, some characters are special to grep (e.g. * and .)

To get literal characters, we escape the character with a \ (backslash)

Suppose we want to search for the character sequence 'a*b*'
◦ Unless we do something special, this will match zero or more ‘a’s followed by zero or more ‘b’s, not what we want
◦ 'a\*b\*' will fix this - now the asterisks are treated as regular characters
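The difference is easy to see on three made-up lines, only one of which contains the literal text a*b*:

```shell
lines=$(printf '%s\n' 'a*b*' 'aab' 'xyz')

# Unescaped, the pattern can match the empty string, so every line is selected.
unescaped_count=$(echo "$lines" | grep -c 'a*b*')

# Escaped, the asterisks are literal and only the first line is selected.
escaped=$(echo "$lines" | grep 'a\*b\*')

echo "$unescaped_count"
echo "$escaped"
```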

Page 91: Commands and shell programming (3)

Egrep: Alternation

Regex also provides an alternation character | for matching one or another subexpression
◦ (T|d)en will match ‘Ten’ or ‘den’
◦ ^(From|Subject): will match the From and Subject lines of a typical email message
  It matches a beginning of line followed by either the characters ‘From’ or ‘Subject’ followed by a ‘:’
Subexpressions are used to limit the scope of the alternation
◦ At(ten|nine)tion then matches “Attention” or “Atninetion”, not “Atten” or “ninetion” as would happen without the parentheses - Atten|ninetion
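The email-header pattern can be tried on a few invented header lines. grep -E is used as the standard spelling of egrep:

```shell
headers=$(printf '%s\n' 'From: ann' 'To: bob' 'Subject: hi' 'Reply-From: x')

# The anchor applies to both alternatives because of the parentheses.
picked=$(echo "$headers" | grep -E '^(From|Subject):')

echo "$picked"
```

Reply-From: is rejected because ^ requires From to start the line.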

Page 92: Commands and shell programming (3)

Egrep: Repetition Shorthands

The * (star) has already been seen to specify zero or more occurrences of the immediately preceding character
+ (plus) means “one or more”
◦ abc+d will match ‘abcd’, ‘abccd’, or ‘abccccccd’ but will not match ‘abd’
◦ Equivalent to {1,}

Page 93: Commands and shell programming (3)

Egrep: Repetition Shorthands cont

The ‘?’ (question mark) specifies an optional character, the single character that immediately precedes it
◦ July? will match ‘Jul’ or ‘July’
◦ Equivalent to {0,1}
◦ Also equivalent to (Jul|July)
The *, ?, and + are known as quantifiers because they specify the quantity of a match
Quantifiers can also be used with subexpressions
◦ (a*c)+ will match ‘c’, ‘ac’, ‘aac’ or ‘aacaacac’ but will not match ‘a’ or a blank line
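Both shorthands can be checked against a handful of candidate strings. The ^ and $ anchors are an addition made here so that each pattern has to cover the entire line:

```shell
samples=$(printf '%s\n' abd abcd abccd Jul July Julyy)

one_or_more=$(echo "$samples" | grep -E '^abc+d$')   # at least one c is required
optional=$(echo "$samples" | grep -E '^July?$')      # the final y may be absent

echo "$one_or_more"
echo "$optional"
```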

Page 94: Commands and shell programming (3)

Grep: Backreferences

Sometimes it is handy to be able to refer to a match that was made earlier in a regex

This is done using backreferences
◦ \n is the backreference specifier, where n is a number
◦ Looks for nth subexpression
For example, to find if the first word of a line is the same as the last:
◦ ^\([[:alpha:]]\{1,\}\) .* \1$
◦ The \([[:alpha:]]\{1,\}\) matches 1 or more letters
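The first-word-equals-last-word pattern can be run as written with plain grep, since it uses BRE syntax. The three test lines are made up:

```shell
lines=$(printf '%s\n' 'hello big world hello' 'hello big world' 'no match here')

# \1 must repeat exactly what the first \( \) group matched.
same_ends=$(echo "$lines" | grep '^\([[:alpha:]]\{1,\}\) .* \1$')

echo "$same_ends"
```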

Page 95: Commands and shell programming (3)

Practical Regex Examples

Variable names in C
◦ [a-zA-Z_][a-zA-Z_0-9]*
Dollar amount with optional cents
◦ \$[0-9]+(\.[0-9][0-9])?
Time of day
◦ (1[012]|[1-9]):[0-5][0-9] (am|pm)
HTML headers <h1> <H1> <h2> …
◦ <[hH][1-4]>
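Three of these patterns are exercised below with grep -E. The ^ and $ anchors are added for this sketch so that a pattern must describe the whole candidate, and the candidate strings themselves are invented:

```shell
# C identifiers: 2fast is rejected because it starts with a digit.
identifiers=$(printf '%s\n' count_1 2fast _tmp |
    grep -E -c '^[a-zA-Z_][a-zA-Z_0-9]*$')

# Dollar amounts: cents, when present, must be exactly two digits.
amounts=$(printf '%s\n' '$5' '$5.99' '$5.9' |
    grep -E -c '^\$[0-9]+(\.[0-9][0-9])?$')

# 12-hour times: 13:00 is rejected by the hour alternation.
times=$(printf '%s\n' '9:05 am' '13:00 pm' '12:59 pm' |
    grep -E -c '^(1[012]|[1-9]):[0-5][0-9] (am|pm)$')

echo "$identifiers $amounts $times"
```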

Page 96: Commands and shell programming (3)

grep Family

Syntax
grep [-hilnv] [-e expression] [filename]
egrep [-hilnv] [-e expression] [-f filename] [expression] [filename]
fgrep [-hilnxv] [-e string] [-f filename] [string] [filename]
◦ -h Do not display filenames
◦ -i Ignore case
◦ -l List only filenames containing matching lines
◦ -n Precede each matching line with its line number
◦ -v Negate matches
◦ -x Match whole line only (fgrep only)
◦ -e expression Specify expression as option
◦ -f filename Take the regular expression (egrep) or a list of strings (fgrep) from filename

Page 97: Commands and shell programming (3)

grep Examples

grep 'men' GrepMe grep 'fo*' GrepMe egrep 'fo+' GrepMe egrep -n '[Tt]he' GrepMe fgrep 'The' GrepMe egrep 'NC+[0-9]*A?' GrepMe fgrep -f expfile GrepMe

• Find all lines with signed numbers $ egrep ’[-+][0-9]+\.?[0-9]*’ *.c

bsearch. c: return -1;compile. c: strchr("+1-2*3", t-> op)[1] - ’0’, dst,convert. c: Print integers in a given base 2-16 (default 10)convert. c: sscanf( argv[ i+1], "% d", &base);strcmp. c: return -1;strcmp. c: return +1;

• egrep has its limits: for example, there is no practical pattern that matches exactly the lines containing a number divisible by 7.
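The options and commands above can be tried on a small sample file. The sketch below builds its own input (the contents are made up) and uses grep -E and grep -F, the POSIX spellings of egrep and fgrep; it assumes mktemp is available.

```shell
file=$(mktemp)          # temporary sample file (hypothetical contents)
printf '%s\n' 'The food' 'the foot' 'f' 'Then' > "$file"

numbered=$(grep -n -E '[Tt]he' "$file")   # -n prefixes each match with its line number
no_match=$(grep -v 'fo*' "$file")         # -v negates; 'fo*' is f followed by zero or more o
fixed=$(grep -F 'The' "$file")            # fixed string, no metacharacters
plus=$(grep -E 'fo+' "$file")             # f followed by one or more o

printf '%s\n' "$numbered" "$no_match" "$fixed" "$plus"
rm -f "$file"
```

Note that 'fo*' also matches the line holding a lone "f", while 'fo+' does not.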

Page 98: Commands and shell programming (3)

Quick Reference

Pattern      Meaning                                                     Supported by
x            Ordinary characters match themselves                        fgrep, grep, egrep
             (NEWLINES and metacharacters excluded)
xyz          Ordinary strings match themselves                           fgrep, grep, egrep

\m           Matches literal character m                                 grep, egrep
^            Start of line                                               grep, egrep
$            End of line                                                 grep, egrep
.            Any single character                                        grep, egrep
[xy^$z]      Any of x, y, ^, $, or z                                     grep, egrep
[^xy^$z]     Any one character other than x, y, ^, $, or z               grep, egrep
[a-z]        Any single character in given range                         grep, egrep
r*           Zero or more occurrences of regex r                         grep, egrep
r1r2         Matches r1 followed by r2                                   grep, egrep

\(r\)        Tagged regular expression, matches r                        grep
\n           Set to what matched the nth tagged expression (n = 1-9)     grep
\{n,m\}      Repetition                                                  grep

r+           One or more occurrences of r                                egrep
r?           Zero or one occurrences of r                                egrep
r1|r2        Either r1 or r2                                             egrep
(r1|r2)r3    Either r1r3 or r2r3                                         egrep
(r1|r2)*     Zero or more occurrences of r1|r2,                          egrep
             e.g., r1, r1r1, r2r1, r1r1r2r1, …
{n,m}        Repetition                                                  egrep

[Figure: the regular expression o.*o applied to the input line "This is one line of text"]

Page 99: Commands and shell programming (3)

Sed: Stream-oriented, Non-Interactive, Text Editor

• Look for patterns one line at a time, like grep
• Change lines of the file
• Non-interactive text editor
◦ Editing commands come in as script
◦ There is an interactive editor ed which accepts the same commands
• A Unix filter
◦ Superset of previously mentioned tools

Page 100: Commands and shell programming (3)

Sed Architecture

[Figure: sed architecture, with parts labeled script file, Input, Input line (Pattern Space), Hold Space, and Output]

Page 101: Commands and shell programming (3)

Conceptual overview

• All editing commands in a sed script are applied in order to each input line.

• If a command changes the input, subsequent command addresses are applied to the current (modified) line in the pattern space, not the original input line.

• The original input file is unchanged (sed is a filter), and the results are sent to standard output (but can be redirected to a file).
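These points can be seen with a two-command script. The following is a minimal sketch using a made-up two-line input file; it assumes mktemp is available for the scratch file.

```shell
file=$(mktemp)                      # temporary input file (hypothetical)
printf '%s\n' 'cat' 'dog' > "$file"

# Two commands, applied in order to every line. On the first line the
# second command sees "dog" (the result of the first), not "cat".
result=$(sed -e 's/cat/dog/' -e 's/dog/bird/' "$file")
original=$(cat "$file")             # the input file itself is untouched

printf 'output:\n%s\ninput file:\n%s\n' "$result" "$original"
rm -f "$file"
```

Both output lines read "bird", while the file still holds "cat" and "dog".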

Page 102: Commands and shell programming (3)
Page 103: Commands and shell programming (3)

The rules for building shell variables are as follows:

1. A variable name is any combination of letters, digits and underscores.

2. No commas or blanks are allowed within a variable name.

3. The first character of a variable name must be either a letter or an underscore.

4. Variable names can be of any reasonable length.

5. Variable names are case sensitive.

Page 104: Commands and shell programming (3)

echo      if        until     trap      else      read      case
wait      set       fi        esac      eval      unset     while
break     exec      readonly  do        continue  ulimit    shift
done      exit      umask     export    for       return

Page 105: Commands and shell programming (3)

Unix-defined or System Variables

• PS1 (System prompt 1), e.g. $ PS1="BScH"
• PS2 (The system prompt 2)
• PATH
• HOME
• LOGNAME
• MAIL
• MAILCHECK
• IFS (Internal Field Separator)
• SHELL
• TERM (Defines the name of the terminal on which you are working)
• TZ (Defines the name of the time zone)
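As an illustration of one of these variables, IFS controls how the shell splits an unquoted value into words. The sketch below uses a made-up colon-separated record; the names record, user and uid are hypothetical.

```shell
record="alice:x:1001"     # made-up, colon-separated record

old_ifs=$IFS              # remember the current separator
IFS=:                     # split on colons instead of white space
set -- $record            # unquoted on purpose so the shell splits it
IFS=$old_ifs              # restore the previous value

user=$1 uid=$3
echo "user=$user uid=$uid"
```

Restoring IFS afterwards matters, since later commands in the same script rely on the default white-space splitting.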

Page 106: Commands and shell programming (3)

Tips and Traps
• All shell variables are string variables.
• A variable can contain more than one word if the value is enclosed in quotes (" ").
• More than one assignment is possible in one line.
• All variables die the moment the execution of the script is over.
• A variable which has been defined but has not been given any value is known as a null variable. (d="", d='', d= )
• On echoing a null variable, only a blank line appears on the screen.
• If a null variable is used anywhere in a command, the shell manages to ignore it.
• Not only the system variables but also the user-defined variables can be displayed using the set command.
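The behaviour of null variables can be checked by counting arguments. This is a small sketch; the variable names are arbitrary.

```shell
d=""                      # null variable
e=                        # also null

echo "[$d] [$e]"          # prints [] []

set -- first $d last      # unquoted null variable disappears from the command
unquoted=$#
set -- first "$d" last    # quoted, it survives as an empty argument
quoted=$#
echo "$unquoted $quoted"
```

The shell ignores the null variable only when it is unquoted; inside double quotes it still counts as an (empty) argument.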

Page 107: Commands and shell programming (3)

• Unchanging variables ($ readonly var): When the variables are made readonly, the shell does not allow us to change their values.

• All such variables can be listed by entering readonly at the $ prompt.

• Wiping out variables ($ unset var) : If we want the shell to forget about a variable altogether, we use the unset command.

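• $ unset PS1 is not allowed in the Bourne shell (the same holds for PS2, PATH, MAILCHECK and IFS); some newer shells, such as bash, permit it.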
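A short sketch of both commands follows. The names rate and temp are made up, and the failed assignment is tried inside a subshell because some shells abort a script on an attempt to change a readonly variable.

```shell
rate=5
readonly rate

# Try the assignment in a subshell so a failure cannot stop this script.
if ( rate=10 ) 2>/dev/null; then
    changed=yes
else
    changed=no
fi

temp=hello
unset temp
state=${temp-unset}       # expands to "unset" when temp does not exist

echo "rate=$rate changed=$changed temp=$state"
```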

Page 108: Commands and shell programming (3)

Positional Parameters

Positional parameters ($1 to $9)
$ sh test.sh a b c d e f g h i

Page 109: Commands and shell programming (3)

Setting Values of Positional Parameters.

$ set a b c d e f g h i j k
$ echo $1 $2 $3 $4 $5 $6 $7 $8 $9 $10 $11
a b c d e f g h i a0 a1
$ shift 5
$ echo $1 $2 $3 $4 $5 $6
f g h i j k
$ echo $*
(Here $* stands for all positional parameters, including those beyond $9. It takes all the parameters together, not individually.)
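The a0 in the output appears because $10 is read as $1 followed by the character 0. The sketch below repeats the session in script form and, as an addition not shown on the slide, uses the POSIX brace form ${10} to reach the tenth parameter directly.

```shell
set -- a b c d e f g h i j k

plain="$10"        # read as $1 followed by a literal 0
braced="${10}"     # braces select the tenth parameter
all_before="$*"    # every parameter as one string

shift 5            # drop the first five; the old $6 becomes $1
first_after="$1"
count_after=$#

echo "$plain $braced | $first_after $count_after"
```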

Page 110: Commands and shell programming (3)

• A comment line begins with #.
• More than one assignment can be done in a single statement. (a=20 b=30)
• Along with the usual arithmetic operators, expr supports the modular division operator %.
• The multiplication symbol must always be preceded by a \.
• Terms of the expression provided to expr must be separated by blanks.

Page 111: Commands and shell programming (3)

• expr performs operations with priorities.
    /, *, %   First Priority
    +, -      Second Priority
• In case of a tie between operations of the same priority, preference is given to the operator which occurs first (from left).
• A pair of parentheses changes the normal priority.
• Innermost parentheses have the higher priority.
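The expr rules can be tried directly. This is a sketch with arbitrary numbers; note that parentheses, like the multiplication symbol, have to be escaped so that the shell passes them on to expr.

```shell
a=20 b=30                          # two assignments on one line

sum=`expr $a + $b`
product=`expr $a \* $b`            # \* keeps the shell from expanding *
remainder=`expr $b % 7`            # modular division
mixed=`expr 2 + 3 \* 4`            # * is done before +
grouped=`expr \( 2 + 3 \) \* 4`    # parentheses change the priority
halved=`expr 7 / 2`                # integer arithmetic only

echo "$sum $product $remainder $mixed $grouped $halved"
```

The last value shows the integer-only limitation: 7 / 2 yields 3, not 3.5.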

Page 112: Commands and shell programming (3)

• Since the expr command has been put within accent graves (backquotes), it is substituted with the output of expr, which is promptly displayed by the echo statement on the screen.

• expr is capable of carrying out only integer arithmetic.

• To carry out arithmetic on real numbers bc is used.
a=10.5 b=3.5
c=`echo $a + $b | bc`

Page 113: Commands and shell programming (3)

Read and echo

echo -e '\E[34;47mThis is in blue.'; tput sgr0
echo -e '\E[33;44m'"yellow text on blue bkgrd"; tput sgr0
echo -e "\033[4mThis is underlined text.\033[0m"
echo -e "\033[1mThis is bold text.\033[0m"
echo -e "\033[31m I am in Red \033[0m"
echo -e "\033[32m I am in Green \033[0m"

Page 114: Commands and shell programming (3)

Escape Sequence
\033[0m   Normal Characters.
\033[1m   Bold Characters.
\033[4m   Underlined Characters.
\033[7m   Reverse Video Characters.

The tput Command
clear   Clear the screen
bold    Bold display
rev     Reverse Video Characters.

Page 115: Commands and shell programming (3)

Control Instructions in Shell
• Sequence control Instruction.
• Selection or Decision Control Instruction.
• Repetition or Loop Control Instruction.
• Case Control Instruction.

Page 116: Commands and shell programming (3)

Wildcards (patterns)
*          matches any string of characters
?          matches any single character
[list]     matches any character in list
[low-up]   matches any character in range low-up inclusive
[!list]    matches any character not in list
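The wildcards can be tried safely in a scratch directory. The sketch below creates four made-up files and lets the shell expand each pattern; it assumes mktemp -d is available.

```shell
dir=$(mktemp -d)                       # scratch directory (hypothetical files)
touch "$dir/a1.c" "$dir/a2.c" "$dir/b1.c" "$dir/notes.txt"

# Each pattern is expanded by the shell before echo runs.
all_c=$(cd "$dir" && echo *.c)         # any name ending in .c
a_one=$(cd "$dir" && echo a?.c)        # a, exactly one character, then .c
ranged=$(cd "$dir" && echo [a-b][1-2].c)
not_a=$(cd "$dir" && echo [!a]*)       # first character is not a

printf '%s\n' "$all_c" "$a_one" "$ranged" "$not_a"
rm -rf "$dir"
```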

Page 117: Commands and shell programming (3)

Chapter 10

Page 118: Commands and shell programming (3)

• if … then … fi
• if … then … else … fi
• Nested if … then … elif … else … fi
• else + if equals elif
• case … in … esac
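The if … elif … else … fi form can be sketched as a small function. The name classify is hypothetical, and the function assumes it is given one integer argument.

```shell
# classify is a hypothetical helper; it expects one integer argument.
classify() {
    if [ "$1" -lt 0 ]
    then
        echo "negative"
    elif [ "$1" -eq 0 ]
    then
        echo "zero"
    else
        echo "positive"
    fi
}

classify -4
classify 0
classify 17
```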

Page 119: Commands and shell programming (3)

test Summary

• String based tests
    -z string            Length of string is 0
    -n string            Length of string is not 0
    string1 = string2    Strings are identical
    string1 != string2   Strings differ
    string               String is not NULL

• Numeric tests
    int1 -eq int2        First int equal to second
    int1 -ne int2        First int not equal to second
    -gt, -ge, -lt, -le   greater, greater/equal, less, less/equal

• File tests
    -r file              File exists and is readable
    -w file              File exists and is writable
    -f file              File is regular file
    -d file              File is directory
    -s file              File exists and is not empty

• Logic
    !                    Negate result of expression
    -a, -o               and operator, or operator
    ( expr )             groups an expression

Page 120: Commands and shell programming (3)

String based tests
    -z string            Length of string is 0
    -n string            Length of string is not 0
    string1 = string2    Strings are identical
    string1 != string2   Strings differ
    string               String is not NULL

Page 121: Commands and shell programming (3)

File Tests
    -s file   File exists and is not empty
    -f file   File is regular file
    -d file   File is directory
    -c file   File exists and is a character special file
    -b file   File exists and is a block special file
    -r file   File exists and is readable
    -w file   File exists and is writable
    -x file   File exists and is executable
    -k file   File exists and its sticky bit is set

Page 122: Commands and shell programming (3)

Logic
    !          Negate result of expression
    -a, -o     and operator, or operator
    ( expr )   groups an expression

Numeric tests
    int1 -eq int2 : First int equal to second
    int1 -ne int2 : First int not equal to second
    -gt, -ge      : greater than, greater/equal
    -lt, -le      : less than, less/equal
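The three kinds of tests and the logic operators can be combined as in the sketch below. The scratch file and the variable names are made up, and mktemp is assumed to be available.

```shell
file=$(mktemp)                 # scratch file (hypothetical)
echo "some data" > "$file"

# File tests combined with -a (and)
if [ -f "$file" -a -s "$file" ]; then file_ok=yes; else file_ok=no; fi

# String test
name=""
if [ -z "$name" ]; then name_state=empty; else name_state=set; fi

# Numeric tests combined with -o (or), and ! (negate)
n=7
if [ "$n" -lt 5 -o "$n" -gt 6 ]; then range=outside; else range=inside; fi
if [ ! -d "$file" ]; then kind="not a directory"; fi

echo "$file_ok $name_state $range $kind"
rm -f "$file"
```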

Page 123: Commands and shell programming (3)

• Any UNIX command. Evaluates to true if the exit code is 0, false if the exit code > 0

• Special command /bin/test exists that does most common expressions
  – String compare
  – Numeric comparison
  – Check file properties

Page 124: Commands and shell programming (3)

• Sample switch statement:
case $var in
  opt1) command1
        command2
        ;;
  opt2) command
        ;;
  *)    command
        ;;
esac
• * is a catch all condition

Page 125: Commands and shell programming (3)

Case Example

echo "Say something."
while true
do
  read INPUT_STRING
  case $INPUT_STRING in
    hello) echo "Hello" ;;
    bye) echo "Bye"
         break ;;
    *) echo "I'm sorry?" ;;
  esac
done
echo "Take care."

Page 126: Commands and shell programming (3)

• opt can be a shell pattern, or a list of shell patterns delimited by |

• Example:
case $name in
  *[0-9]*) echo "That doesn't seem like a name." ;;
  S*|V*) echo "Your name starts with S or V." ;;
  *) echo "You're different." ;;
esac

Page 127: Commands and shell programming (3)

• You can use cases in any order (they need not be sorted)
• Example:
case $num in
  121) echo "I am in case 121" ;;
  7) echo "I am in case 7" ;;
  22) echo "I am in case 22" ;;
  *) echo "I am in default case..." ;;
esac

Page 128: Commands and shell programming (3)

• Value of case can be a shell var, a shell script argument or output of a command:

case $1 in
  cat) echo "cat at command line" ;;
  dog) echo "dog at command line" ;;
  parrot) echo "parrot at command line" ;;
  *) echo "Incorrect argument at command line..." ;;
esac

Page 129: Commands and shell programming (3)

• Options can be combined using the or operator |

case $1 in
  cat | dog) echo "Animal name" ;;
  parrot | crow) echo "Bird" ;;
  whale | shark) echo "fish" ;;
  *) echo "Incorrect argument at command line..." ;;
esac

Page 130: Commands and shell programming (3)

Case example....

echo "Enter any char"
read char
case $char in
  [a-z]) echo "You entered a small case letter";;
  [A-Z]) echo "Capital letter";;
  [0-9]) echo "Number" ;;
  ?) echo "Symbol";;
  *) echo "More than one character";;
esac

Page 131: Commands and shell programming (3)

Case example....

echo "Enter any word"
read word
case $word in
  [aeiou]*) echo "You entered a word which begins with a small case vowel letter";;
  [AEIOU]*) echo "Vowel capital letter";;
  *[0-9]) echo "Ends with a Number" ;;
  ???) echo "Entered 3 letter word";;
esac
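Because case stops at the first pattern that matches, the order of the patterns matters. The sketch below wraps a case in a hypothetical function, describe, so that several inputs can be tried without typing them at a prompt.

```shell
# describe is a hypothetical helper; the first matching pattern wins.
describe() {
    case $1 in
        *[0-9]*)   echo "contains a digit" ;;
        [aeiou]*)  echo "starts with a small vowel" ;;
        ???)       echo "three characters" ;;
        yes | no)  echo "an answer" ;;
        *)         echo "something else" ;;
    esac
}

describe a1
describe apple
describe dog
describe yes
describe no
describe house
```

Here "yes" is reported as "three characters", since ??? appears before the yes | no pattern; only "no" reaches "an answer".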

Page 132: Commands and shell programming (3)
Page 133: Commands and shell programming (3)

loop statements:
while … do … done
until … do … done
for … do … done

Page 134: Commands and shell programming (3)

• while statement
while control command
do
  command1
  command2
done

The while loop executes as long as the exit status of the control command is true and terminates when the exit status becomes false.

Page 135: Commands and shell programming (3)

• until statement
until control command
do
  this
  and this
  and this
done

The until loop executes as long as the exit status of the control command is false and terminates when this status becomes true.

Page 136: Commands and shell programming (3)

• for statement
for control-variable in value1 value2 value3…
do
  command1
  command2
  command3
done

Page 137: Commands and shell programming (3)

• Just as if statements can be nested, while, until and for loops can also be nested.

Page 138: Commands and shell programming (3)

• break - We often come across situations where we want to jump out of a loop instantly, without waiting to get back to the control command.

• When the keyword break is encountered inside any loop, control automatically passes to the first statement after the loop.

• A break is usually associated with an if.

Page 139: Commands and shell programming (3)

• When the keyword continue is encountered inside any loop, control automatically passes to the beginning of the loop.

• continue is usually associated with an if.
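Both keywords can be seen in one loop. This is a minimal sketch with arbitrary values: continue skips the rest of the body for one value, while break leaves the loop altogether.

```shell
out=""
for i in 1 2 3 4 5 6
do
    if [ "$i" -eq 2 ]; then
        continue          # skip 2 and go on with the next value
    fi
    if [ "$i" -eq 5 ]; then
        break             # leave the loop; 5 and 6 are never added
    fi
    out="$out$i "
done
echo "$out"
```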

Page 140: Commands and shell programming (3)

for loops allow the repetition of a command for a specific set of values.

Syntax:
for var in value1 value2 ...
do
  command_set
done

command_set is executed with each value of var (value1, value2, ...) in sequence

Page 141: Commands and shell programming (3)

Example: Listing all files in a directory.
for i in *
do
  echo $i
done

NOTE: * is a wild card that stands for all files in the current directory, and for will go through each value in *, which is all the files, and $i has the filename.

Page 142: Commands and shell programming (3)

Example: Renaming all C files in a directory.
for file in *.c
do
  mv "$file" "$file.cpp"
done

NOTE: This loop picks up all C program files from the current directory and adds the extension “.cpp” at the end of each such file.
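A variant of this loop replaces the .c extension instead of appending to it. The sketch below is not the slide's method: it relies on the POSIX parameter expansion ${file%.c}, works in a scratch directory with made-up file names, and assumes mktemp -d is available.

```shell
dir=$(mktemp -d)                      # scratch directory (hypothetical files)
touch "$dir/main.c" "$dir/util.c"

for file in "$dir"/*.c
do
    # ${file%.c} is the name with the trailing .c removed
    mv "$file" "${file%.c}.cpp"
done

listing=$(cd "$dir" && echo *)
echo "$listing"
rm -rf "$dir"
```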

Page 143: Commands and shell programming (3)

Conditionals are used to “test” something.
In Java or C, they test whether a Boolean variable is true or false.
In a Bourne shell script, the only thing you can test is whether or not a command is “successful”.

Page 144: Commands and shell programming (3)

Every well-behaved command returns a return code.
• 0 if it was successful
• Non-zero if it was unsuccessful (actually 1..255)
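The return code of the most recent command is available in $?. The sketch below records it for the standard true and false commands and then lets if branch on the status of a pipeline; the variable names are arbitrary.

```shell
true && ok=$?              # status of a command that succeeded
false || failed=$?         # status of a command that failed

# if runs the command and branches on its exit status.
if echo "hello world" | grep -q "world"
then
    found=yes
else
    found=no
fi

echo "$ok $failed $found"
```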

Page 145: Commands and shell programming (3)

Use for checking validity.
Three kinds:
• Check on files.
• Check on strings.
• Check on integers.

Page 146: Commands and shell programming (3)

The test command has an alias ‘[ ]’. Each bracket must be surrounded by spaces.

smallest=100
for i in 5 8 19 8 7 3
do
  if [ $i -lt $smallest ]
  then
    smallest=$i
  fi
done
echo $smallest

Page 147: Commands and shell programming (3)

While loops repeat statements as long as the next Unix command is successful.
Works similar to the while loop in C.

Page 148: Commands and shell programming (3)

i=1
sum=0
while [ $i -le 100 ]
do
  sum=`expr $sum + $i`
  i=`expr $i + 1`
done
echo The sum is $sum.

NOTE: The value of i is tested in the while to see if it is less than or equal to 100.

Page 149: Commands and shell programming (3)

Until loops repeat statements until the next Unix command is successful.
Unlike the do-while loop in C, the condition is tested before each pass, so the body may not run at all.

Page 150: Commands and shell programming (3)

x=1
until [ $x -gt 3 ]
do
  echo x = $x
  x=`expr $x + 1`
done

NOTE: The value of x is tested in the until to see if it is greater than 3.
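Since until tests its condition before every pass, a condition that is already true means the body never runs. The sketch below shows that case and then writes the same count once with until and once with while and a negated test; the variable names are arbitrary.

```shell
# Condition already true: the body of until is skipped entirely.
runs=0
until [ 5 -gt 3 ]
do
    runs=`expr $runs + 1`
done

# The same count written both ways.
x=1 seen_until=""
until [ $x -gt 3 ]
do
    seen_until="$seen_until$x"
    x=`expr $x + 1`
done

y=1 seen_while=""
while [ ! $y -gt 3 ]
do
    seen_while="$seen_while$y"
    y=`expr $y + 1`
done

echo "$runs $seen_until $seen_while"
```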

Page 151: Commands and shell programming (3)

while [ $# -le 5 ]

while who | grep $logname

while [ -r $file -a -w $file ]

Page 152: Commands and shell programming (3)

There is a minor difference between the working of while and until loops.

The while loop executes as long as the exit status of the control command is true and terminates when the exit status becomes false.

By contrast, the until loop executes as long as the exit status of the control command is false and terminates when this status becomes true.

Page 153: Commands and shell programming (3)

As a rule, the while must have a control command that will eventually return a non-zero exit status (false); otherwise the loop would be executed indefinitely.

Page 154: Commands and shell programming (3)

As a rule, the until must have a control command that will eventually return an exit status 0 (true); otherwise the loop would be executed indefinitely.

Page 155: Commands and shell programming (3)

When the keyword break is encountered inside any loop, control automatically passes to the first statement after the loop.

(A break is usually associated with an if)

Page 156: Commands and shell programming (3)

When the keyword continue is encountered inside any loop, control automatically passes to the beginning of the loop.

(A continue is usually associated with an if)

Page 157: Commands and shell programming (3)

• Filename Substitution Metacharacters
    ? * [..] [!..]

• I/O Redirection Metacharacters
    > < >> << m> m>&n

• Process Execution Metacharacters
    ; () & && ||

• Quoting Metacharacters
    \ " " ' ' ` `

• Positional Parameters $1…$9
• Special Parameters $0 $* $@ $# $! $$ $-

Page 158: Commands and shell programming (3)

$#     Number of positional parameters
$-     Options currently in effect
$?     Exit value of last executed command
$$     Process number of current process
$!     Process number of background process
$*     All arguments on command line
"$@"   All arguments on command line individually quoted "$1" "$2" ...
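The difference between $* and "$@" shows up as soon as an argument contains a blank. The sketch below uses a hypothetical helper, count_args, that simply reports how many arguments it received.

```shell
count_args() { echo $#; }      # hypothetical helper: prints how many arguments it got

set -- "one two" three         # two positional parameters, the first contains a blank

total=$#
via_star=`count_args $*`       # unquoted $* is split again on blanks
via_at=`count_args "$@"`       # "$@" keeps each parameter intact

true
status=$?                      # exit value of the last executed command

echo "$total $via_star $via_at $status"
```

With "$@" the helper sees the original two arguments; with unquoted $* it sees three.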