What is Perl?
  Practical Extraction and Report Language
  Interpreted Language
  Optimized for String Manipulation and File I/O
  Full support for Regular Expressions
Running Perl Scripts
  Windows
    Download ActivePerl from ActiveState
    Just run the script from a 'Command Prompt' window
  UNIX – Cygwin
    Put the following in the first line of your script
      #!/usr/local/bin/perl
    Make the script executable
      % chmod +x script_name
    Run the script
      % ./script_name
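As a minimal sketch of the steps above, a first script might look like the following. The file name hello.pl and the variable $greeting are made up for illustration, and the interpreter path on the first line is the one from the slide; it may differ on your system.

```perl
#!/usr/local/bin/perl
# hello.pl: hypothetical example script, not one of the course files.
use strict;
use warnings;

my $greeting = "Hello from Perl";   # a scalar holding a string
print "$greeting\n";                # statements end with a semicolon
```

On UNIX or Cygwin this would be made executable with chmod +x hello.pl and run as ./hello.pl; on Windows, perl hello.pl from a Command Prompt should work once Perl is installed.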
Basic Syntax
  Statements end with a semicolon
  Comments start with '#'
    Only single-line comments
  Variables
    You don't have to declare a variable before you access it (unless 'use strict' is in effect)
    You don't have to declare a variable's type
Scalars and Identifiers
  Identifiers
    A variable name
    Case sensitive
  Scalar
    A single value (string or numerical)
    Accessed by prefixing an identifier with '$'
    Assignment with '='
      $scalar = expression
Strings
  Quoting Strings
    With ' (apostrophe)
      Everything is interpreted literally
    With " (double quotes)
      Variables get expanded
    With ` (backtick)
      The text is executed as a separate process, and the output of the command is returned as the value of the string
  Check 01_printDate.pl
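The three quoting styles can be compared side by side. This is only a sketch: the variable names are illustrative, and the backtick example assumes an echo command is available on the system.

```perl
use strict;
use warnings;

my $name   = "world";
my $single = 'Hello, $name';   # apostrophes: $name is kept literally
my $double = "Hello, $name";   # double quotes: $name is expanded
my $output = `echo hello`;     # backticks: runs the command, returns its output
chomp $output;                 # drop the trailing newline from the command output

print "$single\n";   # Hello, $name
print "$double\n";   # Hello, world
print "$output\n";   # hello
```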
Comparison Operators
  String   Operation                     Arithmetic
  lt       less than                     <
  gt       greater than                  >
  eq       equal to                      ==
  le       less than or equal to         <=
  ge       greater than or equal to      >=
  ne       not equal to                  !=
  cmp      compare, return 1, 0, -1      <=>
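The string and arithmetic operators can give different answers for the same operands, because the string forms compare character by character. A short sketch, with illustrative variable names:

```perl
use strict;
use warnings;

my $num_less = (10 < 9)      ? 1 : 0;   # 0: numerically, 10 is not less than 9
my $str_less = ("10" lt "9") ? 1 : 0;   # 1: as strings, "1" sorts before "9"

my $num_cmp = 2 <=> 10;       # -1: 2 is numerically smaller
my $str_cmp = "2" cmp "10";   #  1: "2" sorts after "1"

print "$num_less $str_less $num_cmp $str_cmp\n";   # 0 1 -1 1
```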
Logical Operators
  Operator   Operation
  ||, or     logical or
  &&, and    logical and
  !, not     logical not
  xor        logical xor
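A small sketch of the logical operators in use. Note that || and && return one of their operands rather than a plain 1 or 0, which is why || is often used to supply a default. Variable names are illustrative.

```perl
use strict;
use warnings;

my $default = 0 || "fallback";      # "fallback": left side is false
my $both    = 1 && "yes";           # "yes": both sides are true
my $negated = !1;                   # "" (the empty string counts as false)
my $flipped = (not 0) ? 1 : 0;      # 1
my $either  = (1 xor 1) ? 1 : 0;    # 0: xor is true only when exactly one side is

print "$default $both [$negated] $flipped $either\n";   # fallback yes [] 1 0
```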
String Operators
  Operator   Operation
  .          string concatenation
  x          string repetition
  .=         concatenation and assignment

  $string1 = "potato";
  $string2 = "head";
  $newstring = $string1 . $string2;   #"potatohead"
  $newerstring = $string1 x 2;        #"potatopotato"
  $string1 .= $string2;               #"potatohead"

  Check concat_input.pl
Perl Functions
  Perl functions are identified by their unique names (print, chop, close, etc.)
  Function arguments are supplied as a comma-separated list in parentheses
    The commas are necessary
    The parentheses are often not
    Be careful! You can write some nasty and unreadable code this way!
  Check 02_unreadable.pl
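To illustrate the point about optional parentheses, the two calls below are equivalent. This is a sketch with made-up variable names, not the content of 02_unreadable.pl.

```perl
use strict;
use warnings;

my @words = ("b", "a", "c");

my $with    = join(", ", sort(@words));   # parentheses spelled out
my $without = join ", ", sort @words;     # same call, parentheses omitted

print "$with\n";      # a, b, c
print "$without\n";   # a, b, c
```

A classic trap in the other direction is print (1 + 2) * 3; which prints 3, because the parentheses are taken as the argument list of print and the multiplication is applied to its return value.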
Lists
  Ordered collection of scalars
    Zero indexed (first item in position '0')
    Elements addressed by their positions
  List Operators
    (): list constructor
    , : element separator
    []: take slices (single or multiple element chunks)
List Operations
  sort(LIST)
    a new list, the sorted version of LIST
  reverse(LIST)
    a new list, the reverse of LIST
  join(EXPR, LIST)
    a string version of LIST, delimited by EXPR
  split(PATTERN, EXPR)
    create a list from the portions of EXPR that are separated by matches of PATTERN
  Check 03_listOps.pl
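The list operations above can be sketched as follows. The data and variable names are illustrative and are not taken from 03_listOps.pl.

```perl
use strict;
use warnings;

my @sorted   = sort("pear", "apple", "fig");    # string sort
my @reversed = reverse(1, 2, 3);
my $joined   = join("-", "a", "b", "c");
my @fields   = split(/,/, "x,y,z");             # pieces between the commas
my @slice    = ("a", "b", "c", "d")[1, 2];      # slice of a literal list

print "@sorted\n";     # apple fig pear
print "@reversed\n";   # 3 2 1
print "$joined\n";     # a-b-c
print "@fields\n";     # x y z
print "@slice\n";      # b c
```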
Arrays
  A named list
    Dynamically allocated, can be saved
    Zero-indexed
    Shares list operations, and adds to them
  Array Operators
    @: refers to the array (or a portion of it, with [])
    $: refers to an element (used with [])
Array Operations
  push(@ARRAY, LIST)
    add the LIST to the end of the @ARRAY
  pop(@ARRAY)
    remove and return the last element of @ARRAY
  unshift(@ARRAY, LIST)
    add the LIST to the front of @ARRAY
  shift(@ARRAY)
    remove and return the first element of @ARRAY
  scalar(@ARRAY)
    return the number of elements in the @ARRAY
  Check 04_arrayOps.pl
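A short walk through the array operations, with the array contents shown after each step. The array @queue is a made-up example, not the content of 04_arrayOps.pl.

```perl
use strict;
use warnings;

my @queue = (2, 3);
push(@queue, 4, 5);            # (2, 3, 4, 5)
unshift(@queue, 1);            # (1, 2, 3, 4, 5)
my $last  = pop(@queue);       # 5, leaving (1, 2, 3, 4)
my $first = shift(@queue);     # 1, leaving (2, 3, 4)
my $count = scalar(@queue);    # 3

my $second = $queue[1];        # $ with []: a single element, here 3
my @pair   = @queue[0, 1];     # @ with []: a slice, here (2, 3)

print "$last $first $count $second @pair\n";   # 5 1 3 3 2 3
```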
Associative Arrays - Hashes
  Arrays indexed on arbitrary string values
    Key-Value pairs
    Use the "Key" to look up the element that holds the "Value"
  Hash Operators
    % : refers to the hash
    {}: denotes the key
    $ : the value of the element indexed by the key (used with {})
Hash Operations
  keys(%ARRAY)
    return a list of all the keys in the %ARRAY
  values(%ARRAY)
    return a list of all the values in the %ARRAY
  each(%ARRAY)
    iterates through the key-value pairs of the %ARRAY
  delete($ARRAY{KEY})
    removes the key-value pair associated with {KEY} from the %ARRAY
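The hash operations can be sketched as follows. The hash %age and its contents are invented for illustration. Keys come back in no particular order, so the example sorts them before printing.

```perl
use strict;
use warnings;

my %age = (alice => 30, bob => 25, carol => 41);

$age{dave} = 19;        # add a key-value pair
delete($age{bob});      # remove the pair stored under "bob"

my @names = sort(keys %age);    # alice carol dave
my $pairs = scalar(keys %age);  # 3

# each() hands back one key-value pair per call until the hash is exhausted.
my $total = 0;
while (my ($name, $years) = each(%age)) {
    $total += $years;
}

print "@names\n";            # alice carol dave
print "$pairs $total\n";     # 3 90
```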
Pattern Matching
  A pattern is a sequence of characters to be searched for in a character string
    /pattern/
  Match operators
    =~: tests whether a pattern is matched
    !~: tests whether a pattern is not matched
Patterns
  Pattern      Matches       Pattern       Matches
  /def/        "define"      /d.f/         dif
  /\bdef\b/    a def word    /d.+f/        dabcf
  /^def/       def word      /d.*f/        df, daffff
  /^def$/      def           /de{1,3}f/    deef, deeef
  /de?f/       df, def       /de{3}f/      deeef
  /d[eE]f/     def, dEf      /de{3,}f/     deeeeef
  /d[^eE]f/    daf, dzf      /de{0,3}f/    up to deeef
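A few of these patterns applied with the match operators. This is a sketch with illustrative strings and variable names; each test is reduced to 1 or 0 so the result is easy to print.

```perl
use strict;
use warnings;

my $text = "a def word";

my $has_word = ($text =~ /\bdef\b/) ? 1 : 0;   # 1: "def" stands alone as a word
my $at_start = ($text =~ /^def/)    ? 1 : 0;   # 0: the string starts with "a"
my $no_xyz   = ($text !~ /xyz/)     ? 1 : 0;   # 1: !~ is true when nothing matches

# Keep only the candidates with one to three e's between d and f.
my @hits = grep { /de{1,3}f/ } ("df", "def", "deef", "deeeef");

print "$has_word $at_start $no_xyz\n";   # 1 0 1
print "@hits\n";                         # def deef
```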
Character Ranges
  Escape Sequence   Pattern            Description
  \d                [0-9]              Any digit
  \D                [^0-9]             Anything but a digit
  \w                [_0-9A-Za-z]       Any word character
  \W                [^_0-9A-Za-z]      Anything but a word char
  \s                [ \r\t\n\f]        White-space
  \S                [^ \r\t\n\f]       Anything but white-space
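The escape sequences are mostly used as shorthand inside larger patterns. A brief sketch, with made-up input strings:

```perl
use strict;
use warnings;

my @fields = split(/\s+/, "order 66\tshipped");   # split on runs of white-space

my $is_number   = ("66"       =~ /^\d+$/) ? 1 : 0;   # 1: digits only
my $is_word     = ("order_66" =~ /^\w+$/) ? 1 : 0;   # 1: letters, digits, underscore
my $has_nonword = ("order-66" =~ /\W/)    ? 1 : 0;   # 1: "-" is not a word character

print "@fields\n";                             # order 66 shipped
print "$is_number $is_word $has_nonword\n";    # 1 1 1
```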
Backreferences
  Memory of matched portion of input
    /[a-z]+(.)[a-z]+\1[a-z]+/
    Matches asd-eeed-sdsa and sd-sss-ws, but NOT as.eee-dfg
  They can even be accessed immediately after the pattern is matched
    (.) in the previous pattern is $1
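The backreference pattern from the slide can be tried directly. In this sketch the pattern is stored with qr// so it can be reused; the variable names are illustrative.

```perl
use strict;
use warnings;

my $pattern = qr/[a-z]+(.)[a-z]+\1[a-z]+/;

my $separator = "";
if ("asd-eeed-sdsa" =~ $pattern) {
    $separator = $1;    # what (.) captured, here "-"
}

# The first separator is "." but the second is "-", so \1 cannot match.
my $mismatch = ("as.eee-dfg" =~ $pattern) ? 1 : 0;

print "$separator $mismatch\n";   # - 0
```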
Pattern Matching Options
  Option   Description
  g        Match all possible patterns
  i        Ignore case
  m        Treat string as multiple lines
  o        Only evaluate once
  s        Treat string as single line
  x        Ignore white-space in pattern
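Options are written after the closing slash of the pattern and can be combined. A sketch of the more common ones, with illustrative strings:

```perl
use strict;
use warnings;

my $text = "Cat cat CAT";
my @all  = ($text =~ /cat/gi);    # g: every match, i: ignore case

my $multi  = ("one\ntwo" =~ /^two$/m) ? 1 : 0;   # 1: with m, ^ and $ match at line breaks
my $plain  = ("one\ntwo" =~ /^two$/)  ? 1 : 0;   # 0: without m, ^ matches only at the start
my $dot_nl = ("a\nb"     =~ /a.b/s)   ? 1 : 0;   # 1: with s, "." also matches a newline
my $spaced = ("def"      =~ / d e f /x) ? 1 : 0; # 1: with x, spaces in the pattern are ignored

print scalar(@all), "\n";                      # 3
print "$multi $plain $dot_nl $spaced\n";       # 1 0 1 1
```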