Introduction To Perl
Dulal C. Kar
Perl
• Perl: Practical Extraction and Report Language
• General-purpose programming language
• Downloads available from:
– www.perl.com for Unix or Unix-like systems
– www.activestate.com for Windows
– www.macperl.com for Mac OS
An Example Perl Program
• Example
#!/usr/bin/perl -w
print "Hello world\n";
• Perl is a free-format scripting language. It acts like both a compiler and an interpreter: the whole program is compiled to an internal form, then run immediately
• Whitespace separates tokens
• Whitespace includes spaces, tabs, newlines, returns, formfeeds
• Pound sign (#) is used to begin a comment line
• Semicolon (;) is used to separate statements
• There is no “main” routine as in C
Running a Perl Program
• Several ways to run a Perl program
$ perl progfile
$ progfile # requires execute permission and a #! first line
Scalar Data
• Scalar data
– Numbers and Strings
• Numbers
– Integer literals and Float literals
– Integer literals: 10, -2005, 0245 (octal), -0x0e (negative hex number)
– Float literals: 1.45, -1.2e20, 2.34e12
• Strings
– Single-Quoted Strings: 'hello', 'don\'t'
– Double-Quoted Strings: "hello world\n"
Escape Characters
• Double-quoted string escapes:
\n – newline
\t – tab
\\ – backslash
\r – carriage return
\b – backspace
\f – form feed
\" – double quote
Scalar Operators
• Addition (+), subtraction (-), Multiplication (*)
• Division (/): Perl always does floating point division
• Exponentiation (**): 3 ** 2 is 9
• Modulus (%): operands are truncated to integers first
8 % 3 is 2
8.5 % 3.5 is computed as 8 % 3, which is 2
• Logical comparison operators: <, <=, ==, >=, >, !=
Operators for Strings
• Concatenation operator (.)
"welcome"."home" # is "welcomehome"
• String repetition operator (x)
"dog" x 4 # is "dogdogdogdog"
• String comparison operators
– Use FORTRAN-like logical operators: eq, ne, lt, gt, le, ge
• Precedence of operations is similar to C/C++
Conversion Between Numbers and Strings
• In most cases, Perl performs all conversions automatically
• If a string value is an operand in a numeric operator, Perl automatically converts it to its equivalent numeric value
– Trailing nonnumerics ignored, if any
– Leading whitespace ignored, if any
– Zero, if it is not a number at all
• Numeric value is converted to string when a string operator is encountered
"x".(5+5) is same as "x"."10" or "x10"
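The conversion rules above can be checked with a short script. This is only a sketch: the variable names are illustrative, and the `no warnings 'numeric'` block is an addition so that the deliberately non-numeric strings do not produce warnings.

```perl
use strict;
use warnings;

# A numeric operator converts a string operand to a number.
my $z = '50';
my $w = $z * 2;            # 100

# A string operator converts a numeric operand to a string.
my $s = "x" . (5 + 5);     # "x10"

# Leading whitespace and trailing non-numerics are ignored; a string that
# is not a number at all converts to 0. These conversions warn under
# "use warnings", so that warning category is switched off locally.
my ($t, $u);
{
    no warnings 'numeric';
    $t = "  12abc" + 1;    # 13
    $u = "abc" + 1;        # 1
}

# % truncates its operands to integers: 8.5 % 3.5 is evaluated as 8 % 3.
my $m = 8.5 % 3.5;         # 2

print "$w $s $t $u $m\n";
```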
Variables
• Scalar variable
– To hold a single value
• Array
– To hold a list of scalars indexed by numbers/subscripts
• Hash
– To maintain a list of scalars indexed by strings
• By default, all variables are global
Scalar Variables
• Scalar variable names begin with a dollar sign ($) followed by a letter, then any number of letters, digits, or underscores
• Names are case sensitive
• Scalars are typeless
– There are no such things as integer variables, string variables, floating-point variables, etc.
• No declaration of scalars needed before use
Scalar Operators and Functions
• Assignment and arithmetic operators
$x = 23;
$y = 'Welcome';
$z = '50';
$w = $z * 2; # $w is 100
$v = "Welcome home!";
• Similar to C/C++, it supports ++ and -- operators
• Binary assignment operators: +=, -=, *=, /=, .=, %=, **=
Scalar Operators and Functions (cont’d)
• chop function
– Built-in function that removes the last character from the string value of a variable and returns the removed character
$x = "Welcome";
chop($x); # $x is "Welcom" now
$x = chop($x); # But $x is "m" now
• chomp function
– Removes only a trailing newline character
$x = "Welcome\n";
chomp($x); # $x is now "Welcome"
chomp($x); # $x is still "Welcome"
Interpolation of Scalars and Strings
• A double-quoted string can be used for variable interpolation
$x = "home!";
$y = "Welcome $x"; # $y is now "Welcome home!"
• No substitution is performed in single-quoted strings
• To prevent substitution
$x = '$dude'; # $x is now the literal text $dude
$y = "Hey $x"; # $y is now 'Hey $dude'
• Case shifting
$x = "\Ucams"; # $x is now "CAMS"
$x = "\LCAMS"; # $x is now "cams"
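A small sketch of chop, chomp, and interpolation side by side. The variable names are made up for illustration, and the `my` declarations are an addition so the script runs under `use strict`.

```perl
use strict;
use warnings;

# chop removes the last character, whatever it is, and returns it.
my $x = "Welcome";
my $removed = chop($x);          # $x is "Welcom", $removed is "e"

# chomp removes only a trailing newline and returns how many
# characters it removed.
my $line = "Welcome\n";
my $count = chomp($line);        # $line is "Welcome", $count is 1
my $again = chomp($line);        # nothing left to remove, $again is 0

# Double quotes interpolate variables; single quotes do not.
my $place  = "home!";
my $double = "Welcome $place";   # Welcome home!
my $single = 'Welcome $place';   # Welcome $place

# \U and \L shift the case of the text that follows them.
my $upper = "\Ucams";            # CAMS
my $lower = "\LCAMS";            # cams

print "$x $removed $count $again\n";
print "$double | $single | $upper $lower\n";
```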
<STDIN> as a Scalar Value
• <STDIN> is a complete text line from standard input including the newline character
$a = <STDIN>; # get text and save in $a
chomp($a); # removes newline character
Or
chomp($a = <STDIN>);
Output with print Function
print ("Welcome home\n");
print "Welcome home\n";
$str = "home";
print "Welcome $str.\n"; # Welcome home.
print 'Welcome $str.\n'; # Welcome $str.\n
List Data
• List
– Ordered scalar data
• List Examples
(1, 5, 7) # List of 1, 5, and 7
($a+$b, 6, 7, 9, $a)
() # Empty list
(1 .. 4) # same as (1, 2, 3, 4)
(1.5 .. 4.5) # same as (1, 2, 3, 4): Perl 5 truncates range endpoints to integers
(1.3 .. 6.1) # same as (1, 2, 3, 4, 5, 6)
Arrays
• Array
– Variable that holds a list
– Array name begins with @ character
– Can have any number of elements; dynamically allocated
– Mixed values allowed
$x = 10; $y = 20; $z = 30;
@p = (5, 10, 15);
@q = ('bat', 'cat', 'dog');
@r = (12, 'red', 56, 'dog'); # mixed values
@s = ($x, $y, $z); # contains 10, 20, and 30
@t = (@p, @r); # @t contains elements of @p and @r
More Array Examples
($y, $x) = ($x, $y); # swaps $y and $x
($x, $y, $z) = (1, 2, 3);
($w, @p) = ($x, 2, 5); # $w = 1, @p = (2, 5)
($q, @p) = @p; # makes $q = 2, @p = (5)
@p = (5, 9, 8);
$q = @p; # $q gets 3, number of items in @p
($q) = @p; # $q gets 5, first element of @p
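The difference between scalar and list assignment is easy to get wrong, so here is a minimal sketch. The names `$count`, `$first`, `$head`, and `@rest` are illustrative and do not come from the slides.

```perl
use strict;
use warnings;

# List assignment swaps two scalars without a temporary variable.
my ($x, $y) = (10, 20);
($y, $x) = ($x, $y);       # $x is 20, $y is 10

my @p = (5, 9, 8);

# Scalar context: an array yields its element count.
my $count = @p;            # 3

# List context (note the parentheses): the first element is assigned.
my ($first) = @p;          # 5

# An array on the left-hand side takes all remaining values.
my ($head, @rest) = @p;    # $head is 5, @rest is (9, 8)

print "$x $y $count $first $head @rest\n";
```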
Accessing Array Elements
• Elements are indexed as 0, 1, . . ., n-1 sequentially
• Elements are also reverse-indexed beginning with the last element as -1, -2, -3, . . ., -n
• Reference to an element begins with $ sign instead of @
• Part of an array can be accessed, called slicing
• Subscript for a slice is a list expression enclosed in square brackets ([ ])
• Example
@f = (5, 3, 8);
$c = $f[1]; # $c gets 3
$f[2] = 10; # now @f = (5, 3, 10)
# Examples on slices (use [ , , ])
@f[1,2] = (7, 0); # changes last two values in @f to 7 and 0
@f[0,1,2] = @f[0, 0, 0]; # all elements set to first
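The indexing and slicing rules can be tried out as follows. This is a sketch with `my` declarations added; `@copy` is an illustrative name.

```perl
use strict;
use warnings;

my @f = (5, 3, 8);

my $second = $f[1];      # 3: a single element is written with $
my $last   = $f[-1];     # 8: negative subscripts count from the end

$f[2] = 10;              # @f is now (5, 3, 10)

# A slice is written with @ and a list of subscripts.
@f[1, 2] = (7, 0);       # @f is now (5, 7, 0)
my @copy = @f[0, 0, 0];  # (5, 5, 5): the same subscript may repeat

print "$second $last | @f | @copy\n";
```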
Array Functions
• push and pop functions (LIFO behavior)
– push appends and pop removes from end
push (@mylist, $item);
push (@mylist, 3, 6, 7);
@newlist = (2, 3, 4);
push (@mylist, @newlist); # push accepts a list of values
$item = pop(@mylist);
• shift and unshift functions
– unshift prepends and shift removes from front
unshift(@mylist, 5);
unshift(@mylist, $x, $y);
$z = shift(@mylist);
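A short trace of the four functions on one array, with the contents shown after each step. It is a sketch; `@list` and `@new` are illustrative names.

```perl
use strict;
use warnings;

my @list = (1, 2);

push(@list, 3, 6, 7);          # (1, 2, 3, 6, 7)
my @new = (2, 3, 4);
push(@list, @new);             # (1, 2, 3, 6, 7, 2, 3, 4)

my $popped = pop(@list);       # 4; @list is (1, 2, 3, 6, 7, 2, 3)

unshift(@list, 5);             # (5, 1, 2, 3, 6, 7, 2, 3)
my $shifted = shift(@list);    # 5; @list is (1, 2, 3, 6, 7, 2, 3)

print "$popped $shifted | @list\n";
```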
Array Functions (cont’d)
• sort function, reverse function, and chomp function (removes the terminating newline character, if any, from all elements)
• join function
@a = ("bat", "cat", "dog");
print join(':', @a); # "bat:cat:dog"
• split function
$a = 'bat:cat:dog';
@x = split(':', $a); # Now @x is ('bat', 'cat', 'dog')
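The functions on this slide combine naturally. The sketch below also shows one point the slide does not state: sort compares as strings by default, so numbers need an explicit `<=>` comparison block. Variable names are illustrative.

```perl
use strict;
use warnings;

my @animals  = ("dog", "bat", "cat");
my @sorted   = sort @animals;              # bat cat dog
my @backward = reverse @sorted;            # dog cat bat

my $joined = join(':', @sorted);           # "bat:cat:dog"
my @fields = split(/:/, $joined);          # bat cat dog

# Default sort is by string comparison, so "100" sorts before "9".
my @as_strings = sort (10, 9, 100);               # 10 100 9
my @as_numbers = sort { $a <=> $b } (10, 9, 100); # 9 10 100

print "@backward | $joined | @as_strings | @as_numbers\n";
```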
<STDIN> as an Array
• In a list context, <STDIN> returns all lines up to end of file (Ctrl-D or Ctrl-Z, depending on the system)
• Each line is returned as an element of the list
@a = <STDIN>;
• If you type 5 lines and then press Ctrl-D, the array @a has 5 elements, each ending with a newline character
Variable Interpolation of Arrays
• In a double-quoted string with an array:
@a = ("dog", "cat", "bat");
print "Animals: @a"; # "Animals: dog cat bat"
• Portion of an array with a slice:
print "Animals: @a[1,2]"; # "Animals: cat bat"
@b = ("bat", "cat", "dog", 1, 2);
print "Animals: @b[@b[3,4]]"; # "Animals: cat dog"
Hashes
• Also called associative arrays
• A hash variable name starts with % character
• A hash stores a list of scalars indexed by arbitrary scalars (called keys) instead of numbers
• Hash Initialization:
%team = ( 'Bill' => 'William Burns', 'Jim' => 'James Brown', 'Tim' => 'Timothy Johnson');
• Insertion
$team{'Liz'} = "Elizabeth Taylor";
Hash Access
# Hash creation
%team = ( 'Bill' => 'William Burns', 'Jim' => 'James Brown', 'Tim' => 'Timothy Johnson');
# Accessing data
$a = $team{'Bill'}; # $a contains 'William Burns'
($c,$d) = @team{'Bill','Tim'}; # $c contains 'William Burns'
# $d contains 'Timothy Johnson'
Hash Functions
• keys (hash) function
– Yields a list of all the keys in the hash
– In a scalar context, keys function returns number of key-value pairs in hash
– Consider %team as the example hash
$n = keys(%team); # $n is 3
@k = keys(%team); # @k is ('Bill', 'Jim', 'Tim'), in no guaranteed order
• values (hash) function
– Yields a list of the values in the hash, in the same order as keys
@v = values(%team); # @v = ('William Burns',
# 'James Brown',
# 'Timothy Johnson')
Hash Functions (cont’d)
• each (hash) function
– Returns a key-value pair as a two-element list
– Example:
while (($first, $last) = each(%names)) {
    print "The last name of $first is $last\n";
}
• delete function
– Removes hash elements
delete $team{"Liz"}; # removes the key-value pair for 'Liz'
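The hash operations from the last few slides, put together in one runnable sketch. The `exists` test is an addition not covered on the slides, and keys are sorted only because hash order is not guaranteed.

```perl
use strict;
use warnings;

my %team = (
    Bill => 'William Burns',
    Jim  => 'James Brown',
    Tim  => 'Timothy Johnson',
);

$team{Liz} = 'Elizabeth Taylor';     # insertion

my $n = keys %team;                  # scalar context: 4 pairs
my @k = sort keys %team;             # Bill Jim Liz Tim

delete $team{Liz};                   # remove the pair again
my $has_liz = exists $team{Liz} ? 1 : 0;   # 0

# each returns one (key, value) pair per call until the hash is exhausted.
my @pairs;
while (my ($nick, $full) = each %team) {
    push @pairs, "$nick=$full";
}
@pairs = sort @pairs;

print "$n | @k | $has_liz\n";
print "$_\n" for @pairs;
```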
Relational Operators
Numeric (C-like)    String (Fortran-like)
==                  eq
!=                  ne
<                   lt
>                   gt
<=                  le
>=                  ge
Control Structures
if (some_expression) { . . . }
if (some_expression) { . . . } else { . . . }
if (some_expression) { . . . }
elsif (some_expression) { . . . }
. . .
else { . . . }
Looping Constructs
• while statement
while (some_expression) { . . . }
• until statement
until (some_expression) { . . . }
• do-while statement
do { . . . } while some_expression;
• do-until statement
do { . . . } until some_expression;
• for statement
for (initial_exp; test_exp; re_init_exp) { . . . }
foreach Statement
• foreach $i (@some_list) { . . . }
@animals = ('dog', 'cat', 'bat');
foreach $a (reverse @animals) {
    print $a;
}
@animals = ('dog', 'cat', 'bat');
foreach (@animals) { # use $_ by default
    print;
}
@a = (2, 3, 4);
foreach $item (@a) {
    $item++;
} # @a is now (3, 4, 5)
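The control structures above can be exercised together. This is a sketch: the thresholds and the names `$sum`, `$n`, and `$label` are invented for illustration.

```perl
use strict;
use warnings;

# foreach aliases each element, so changing $item changes the array.
my @a = (2, 3, 4);
foreach my $item (@a) {
    $item++;
}                                   # @a is now (3, 4, 5)

# C-style for loop; @a in numeric context is the element count.
my $sum = 0;
for (my $i = 0; $i < @a; $i++) {
    $sum += $a[$i];
}                                   # $sum is 12

# until runs while its condition is false.
my $n = 0;
until ($n >= 3) {
    $n++;
}                                   # $n is 3

# if / elsif / else chain.
my $label;
if    ($sum < 10) { $label = 'small'  }
elsif ($sum < 20) { $label = 'medium' }
else              { $label = 'large'  }

print "@a | $sum | $n | $label\n";
```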
File I/O
• Special filehandles: STDIN, STDOUT, STDERR
$a = <STDIN>; # read next line or undef
@a = <STDIN>; # read all remaining lines
# To process one line at a time
while (defined($line = <STDIN>)) {
    # process $line
}
# To read into $_ variable and remove newline char
while (<STDIN>) {
    chomp;
    # process $_ here
}
Diamond Operator: <>
• Reads lines from the files given on the command line, or from STDIN if no files are given
• Suppose you have a program named test containing this code:
#!/usr/bin/perl
while (<>) {
    print $_;
}
• Suppose you invoke it as test file1 file2. The diamond operator reads each line of file1 and then file2. If you do not specify any file names, it reads from the standard input automatically.
@ARGV array
• The <> operator uses the @ARGV array, which holds the command line arguments
• @ARGV can also be set inside the program and then used by the <> operator
@ARGV = ("file1", "file2");
# process lines of file1 and file2
while (<>) {
    print $_;
}
More File I/O
• To open a file for reading
open (INFILE, "<filename"); # < indicates input
open (INF, "filename"); # By default, input
• To open a file for writing
open (OUTFILE, ">filename");
• To open a file for append
open (APPFILE, ">>filename");
• To close a file
close (FILEHANDLE);
File I/O Error Checking
• Check the return status of your open() statement
# File copy program with error checking
open (INFILE, "file.in") ||
    die "Failed to open file for reading\n";
open (OUTFILE, ">file.out") ||
    die "Failed to open file for writing\n";
while ($line = <INFILE>) { # read from file.in
    print OUTFILE $line; # write to file.out
}
close (OUTFILE);
close (INFILE);
Reading File into an Array
open(INFILE, "<file.in") ||
    die "Failed to open file\n";
@lines = <INFILE>;
close (INFILE);
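The slides use bareword filehandles and two-argument open. A common alternative in current Perl, shown here only as a sketch, is three-argument open with a lexical filehandle and `$!` in the error message. The temporary file is created with the core File::Temp module so the example is self-contained; its contents are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway input file (removed automatically at exit).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "first\nsecond\n";
close($tmp_fh) or die "Failed to close $path: $!";

# Three-argument open: mode and file name are separate arguments.
open(my $in, '<', $path) or die "Failed to open $path for reading: $!";
my @lines = <$in>;       # list context: one element per line
close($in);

chomp(@lines);           # chomp works on every element of an array
print scalar(@lines), " lines: @lines\n";
```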
File for Next Examples
• Assume “input.dat” file contains:
Brown:Mark:Corpus Christi:78413
Smith:Barney:Fargo:58105
Wayne:John:Corpus Christi:78412
Williams:Cathy:New York:10075
Example Application
• Program to read and parse file:
open (FILE, "<input.dat") || die "Open failed.";
while ($line = <FILE>) {
    chomp($line);
    ($lname, $fname, $city, $zip) = split(':', $line);
    print "$fname $lname lives in $city\n";
}
close(FILE);
• Program Output………….
Reading File into a Hash
open (INFILE, "input.dat") || die "open failed.\n";
while ($line = <INFILE>) {
    chomp ($line);
    ($lname, $fname, $city, $zip) = split(':', $line);
    $addr{$lname} = $city;
    $fname{$lname} = $fname;
}
close (INFILE);
# Output where everybody lives
foreach $n (keys(%addr)) {
    print "$fname{$n} $n lives in $addr{$n}\n";
}
# Output sorted by last name
foreach $n (sort(keys(%addr))) {
    print "$fname{$n} $n lives in $addr{$n}\n";
}
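To try the hash-building loop without creating input.dat, the same records can be read from an in-memory filehandle (open on a reference to a string). That substitution, and the `@report` array, are assumptions made for this sketch; the parsing logic follows the slide.

```perl
use strict;
use warnings;

# The four sample records from the slides, held in a string.
my $data = join("\n",
    "Brown:Mark:Corpus Christi:78413",
    "Smith:Barney:Fargo:58105",
    "Wayne:John:Corpus Christi:78412",
    "Williams:Cathy:New York:10075",
) . "\n";

# Opening a reference to a scalar reads from the string like a file.
open(my $in, '<', \$data) or die "open failed: $!";

my (%addr, %fname);
while (my $line = <$in>) {
    chomp($line);
    my ($lname, $fname, $city, $zip) = split(/:/, $line);
    $addr{$lname}  = $city;      # last name => city
    $fname{$lname} = $fname;     # last name => first name
}
close($in);

# Report sorted by last name.
my @report = map { "$fname{$_} $_ lives in $addr{$_}" } sort keys %addr;
print "$_\n" for @report;
```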
User Defined Subroutines
• A simple subroutine
sub Hi { print "Hello World!"; }
Hi (); # Hello World!
• Passing parameters
sub Hi {
    ($name) = @_;
    print "Hello $name!";
}
Hi('Scott'); # Hello Scott!
Hi('Dawn'); # Hello Dawn!
Defining Variable Scope
• A problem with parameters
sub Hi {
    ($name) = @_;
    print "Hello $name!";
}
$name = 'George';
Hi('Scott'); # Hello Scott!
print "$name\n"; # Scott, because Hi overwrote the global $name
• Fixing the problem
sub Hi {
    my ($name) = @_;
    print "Hello $name!\n";
}
$name = 'George';
Hi ('Scott'); # Hello Scott!
print "$name\n"; # George
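Both versions of the subroutine can be compared in one script. In this sketch the subroutines return their greeting instead of printing it, and they are given distinct, hypothetical names (`hi_global`, `hi_lexical`) so that both can coexist; `our` declares the global so the code runs under `use strict`.

```perl
use strict;
use warnings;

our $name = 'George';            # package (global) variable

# Assigns to the global $name, clobbering the caller's value.
sub hi_global {
    ($name) = @_;
    return "Hello $name!";
}

# "my" creates a private $name that exists only inside the subroutine.
sub hi_lexical {
    my ($name) = @_;
    return "Hello $name!";
}

my $greet1 = hi_lexical('Scott');
my $after_lexical = $name;       # still George

my $greet2 = hi_global('Scott');
my $after_global = $name;        # now Scott

print "$greet1 $after_lexical | $greet2 $after_global\n";
```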
Regular Expressions
• A regular expression is a pattern to be matched against a string
• A regular expression is enclosed in slashes
• Example 1
if (/abc/) { # regular expression abc is tested against $_
    print $_; # variable $_ containing a text line
}
• Example 2
while (<>) { # prints all lines containing reg. expr. abc
    if (/abc/) {
        print $_;
    }
}
Regular Expressions (cont’d)
• To print lines containing an a followed by zero or more b's, followed by a c:
while (<>) {
    if (/ab*c/) {
        print $_;
    }
}
• Substitute operator
s/ab*c/def/; # replaces the first match of ab*c in $_ by def
Single Character Patterns
• Dot (".")
– Matches any single character except newline (\n).
• Pattern-matching character class
– Enclosed in [] and only one of the characters must be present
– Examples
[abcde] # match any character a, b, c, d, or e
[aeiouAEIOU] # match any vowel, upper-case or lower-case
[0123456789] # match any digit
[0-9] # same thing
[0-9\-] # match 0-9, or minus
[a-z0-9] # match any single lower case letter or digit
[a-zA-Z0-9_] # match any single letter, digit, or underscore
[^0-9] # match any single non-digit
[^aeiouAEIOU] # match any single non-vowel
[^\^] # match any single character except a caret (^)
Predefined Character Class Abbreviations
Construct         Equivalent Class   Negated Construct   Equivalent Negated Class
\d (a digit)      [0-9]              \D (digits, not)    [^0-9]
\w (word char)    [a-zA-Z0-9_]       \W (words, not)     [^a-zA-Z0-9_]
\s (space char)   [ \r\t\n\f]        \S (space, not)     [^ \r\t\n\f]
Abbreviated classes can be used as part of other character classes as well:
[\da-fA-F] # match one hex digit
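Character classes are easy to test with grep, which keeps the list elements that match a pattern. The word lists below are invented for this sketch.

```perl
use strict;
use warnings;

my @words = ('cat', 'c4t', 'CAT', '-7', 'a_b');

# Keep the strings that contain at least one digit.
my @has_digit = grep { /[0-9]/ } @words;            # c4t -7

# Keep the strings that contain no vowel at all.
my @no_vowel = grep { !/[aeiouAEIOU]/ } @words;     # c4t -7

# \d inside a class: strings made up only of hex digits.
my @hex = grep { /^[\da-fA-F]+$/ } ('ff', '1A', 'xyz', '09');  # ff 1A 09

# \w matches letters, digits, and underscore, so a_b is all word chars.
my @all_word = grep { /^\w+$/ } @words;             # cat c4t CAT a_b

print "@has_digit | @no_vowel | @hex | @all_word\n";
```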
Grouping Patterns
• Sequence (Ex: abc)
• Multiplier
– Asterisk (*) indicates zero or more of immediate previous character or class
– Question mark (?) means zero or one of immediate previous character
– Plus sign (+) means one or more of immediate previous character
– Example: /fo+ba?r/ # at least one o, optional a
– General multiplier: x{5}, x{5,10}, x{5,}
– Notice that * is equivalent to {0,}, + is equivalent to {1,}, and ? is equivalent to {0,1}.
• For more information on regular expressions, see the Perl documentation (perldoc perlre).
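The multipliers can be checked against a few sample strings. This sketch uses invented test strings; the substitution at the end reuses the ab*c pattern from the earlier slide.

```perl
use strict;
use warnings;

# /fo+ba?r/: f, one or more o, b, optional a, r.
my @candidates = ('fbar', 'fobar', 'fooobr', 'fobaar');
my @matched = grep { /fo+ba?r/ } @candidates;     # fobar fooobr

# General multiplier: between two and three x's, and nothing else.
my @runs = grep { /^x{2,3}$/ } ('x', 'xx', 'xxx', 'xxxx');   # xx xxx

# * and {0,} are interchangeable.
my $star  = ('ac' =~ /^ab*c$/)    ? 1 : 0;        # 1
my $brace = ('ac' =~ /^ab{0,}c$/) ? 1 : 0;        # 1

# s/// replaces the first match of the pattern.
my $text = "xabbbcx";
$text =~ s/ab*c/def/;                             # xdefx

print "@matched | @runs | $star $brace | $text\n";
```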
More Information
• Local online documentation
man perl
perldoc perl
perldoc -f <built-in function name> (e.g. perldoc -f chomp)
• Web sites
http://perl.oreilly.com/
http://effectiveperl.com/
• Books
"Learning Perl" by Randal Schwartz & Tom Christiansen
"Programming Perl" by Larry Wall, Randal Schwartz & Tom Christiansen