AE6382
Perl
Practical Extraction and Report Language
Originally designed as a text processing and “glue” language
Perl is a scripting language
Each invocation of a Perl script compiles and then executes the code
Uses a C-like syntax
Has object-oriented programming features
Highly portable between operating systems

Running Perl
On Unix
– Typically set line 1 to #!/usr/bin/perl (or wherever Perl is installed)
On Windows, set the file extension to
– .pl for standard Perl
– .pls for PerlScript (ActiveX scripting engine)
Run from the command line with the perl command

Variables
Perl is not a strongly typed language; the contents of a variable are converted as necessary
The first character of a variable name indicates the type of the variable
– $name
The name part of a variable can also be enclosed in { }
– ${name}
– @{$reference_to_array}

Type        Character  Meaning
Scalar      $          An individual value
Array       @          List of values, keyed by index
Hash        %          List of values, keyed by string
Subroutine  &          Callable Perl code
Typeglob    *          Everything

Variables - Scalar
A scalar represents a single value
– Integer
– Floating point
– String
– Reference
The data held by the variable is converted as necessary
Scalar names start with a $
– $name
As an lvalue
– $name = "george burdell";

Variables - Arrays
An array is an ordered list of scalars
Arrays are indexed by number, starting at 0
Negative indices count backwards from the end of the array
The indexing operator is [ ]
An array name starts with @
To refer to the full array (or a slice)
– @names
– @names[1,3,5]     (slice)
– @names[2 .. 6]    (slice)
A single element of an array starts with $
– $names[4]
– $names[$value]
As an lvalue
– $names[4] = 345;
– @names = (1,2,3,4,5);
– @names = 1 .. 5;
Reading the last element with a negative index
– $last_value = $names[-1];
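The array forms above can be tried together in a short script. This is only a sketch; the contents of @names are made-up sample data.

```perl
use strict;
use warnings;

my @names = qw(ann bob cal dee eve fay gus);   # hypothetical sample data

my $first  = $names[0];        # single element: $ sigil
my $last   = $names[-1];       # negative index counts from the end
my @odd    = @names[1,3,5];    # slice: @ sigil with a list of indices
my @middle = @names[2 .. 6];   # slice using the range operator
my $count  = @names;           # array in scalar context gives its length

$names[4] = 345;               # element used as an lvalue

print "first=$first last=$last count=$count\n";
print "odd=@odd\n";
print "middle=@middle\n";
```

Note that the slices were taken before the assignment to $names[4], so they still hold the original strings.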

Variables - Hashes
A hash, or associative array, is an unordered collection of scalars
Hashes are indexed by strings
The indexing operator is { }
A hash name starts with %
To refer to the entire hash
– %months
A single element of a hash starts with $
– $months{'Mar'}
– $months{$some_string}
As an lvalue
– $months{'Mar'} = 'March';
– %months = ('Jan' => 'January', 'Feb' => 'February');

Variables - Namespaces
Two types of namespace
– Global
– Lexical
Global variables are kept in symbol tables that are named and accessible
– Are created in the context of a package (the default package is main)
– Can be referenced from another package using $package::variable
Lexical variables are created and exist only in the context of a Perl block (normally a region enclosed in { })
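A small sketch of the two kinds of namespace. The package name Counter and all variable names are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

{
    package Counter;           # hypothetical package name
    our $total = 10;           # package variable, full name $Counter::total
}

our $total = 1;                # package variable in main, full name $main::total

my $seen_inside;
{
    my $scratch = 'block only';    # lexical: visible only inside this block
    $seen_inside = $scratch;
}
# $scratch no longer exists here

print "main:    $main::total\n";
print "Counter: $Counter::total\n";
print "short:   $::total\n";       # $::name is shorthand for $main::name
```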

Literals – Numeric
Numeric literals can take several formats
– 12345        integer
– 12345.67     floating point
– 1.23e06      scientific notation
– 1_234_567    underscores for legibility
– 0123         octal
– 0xffff       hexadecimal
– 0b101010     binary

Literals - String
There are several ways to quote a string
Substitution of variables in a string is known as interpolation
– print "The value is $value\n";
– print 'The value is ', $value, "\n";
Interpolation occurs for variables and backslash escapes

Usual     General   Meaning                Interpolates
' '       q/ /      Literal string         No
" "       qq/ /     Literal string         Yes
` `       qx/ /     Command execution      Yes
( )       qw/ /     Word list              No
/ /       m/ /      Pattern match          Yes
s/ / /    s/ / /    Pattern substitution   Yes
y/ / /    tr/ / /   Character translation  No

Literals - String
Special additions to the character set
Backslash escape characters
– \n          newline
– \r          carriage return
– \t          tab
– \033        character represented by octal 033
– \cX         Control-X
– \x{263a}    Unicode character
– \\          backslash
Translation escapes
– \u    force next character to uppercase
– \l    force next character to lowercase
– \U    force all following characters to uppercase
– \L    force all following characters to lowercase
– \E    end \U or \L switch

Literals - String
There is flexibility in choosing quotes
– $string = qq[This method allows inclusion of ' and "];
– $string = qq{This method allows inclusion of ' and "};
– $string = qq/This method allows inclusion of ' and "/;
The following executes a command using the OS shell and returns its output as a string
– $result = qx(ls);
The word list form does not require tedious quoting
– @months = qw(January February March April);

Interpolation
Interpolation is the process of expanding a variable in a string literal (the " form of the string)
Scalars are resolved in place; numeric values are converted to characters
Arrays are interpolated by joining all the elements of the array, separated by the value of the special $" variable
– $" = '~';
– @months = qw(jan feb mar apr may jun);
– $string = "The months are: @months";
– Result: The months are: jan~feb~mar~apr~may~jun
A whole hash (%hash) is not interpolated; individual hash elements such as $hash{key} are
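The interpolation rules can be checked with a few lines. This is a sketch with made-up data; it uses local to limit the change to $" to one block, which is a choice made here rather than something the slide prescribes.

```perl
use strict;
use warnings;

my @months = qw(jan feb mar);
my %days   = (jan => 31);          # hypothetical sample data

my $default = "Months: @months";   # $" defaults to a single space
my $custom;
{
    local $" = '~';                # change the list separator for this block only
    $custom = "Months: @months";
}
my $element = "jan has $days{jan} days";   # a hash element interpolates
my $whole   = "hash: %days";               # a whole hash does not

print "$default\n$custom\n$element\n$whole\n";
```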

List Values
A list consists of values enclosed in ( ) and separated by commas
– @array = (1,3,5,7,9,11);
In list context the above example loads the array with the values
In scalar context, each value is evaluated and the last value is returned; $value == 11 below
– $value = (1,3,5,7,9,11);
There is an important difference between a list and an array: when an array is evaluated in scalar context it returns its length; $length == 6 below
– $length = @array;
– $length = scalar @array;
– $length = @array + 0;

List Values
List interpolation
– (@array1, @array2, 1)
– Each element above is evaluated and inserted into the list that is generated
– There are no lists of lists
Lists can be indexed using [ ]
– ($day,$month,$year) = (localtime())[3..5];
Lists may be used as lvalues (see above)
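A brief sketch of list behaviour: flattening, indexing a list directly, and list assignment. The variable names are illustrative only.

```perl
use strict;
use warnings;

my @array  = (1, 3, 5, 7, 9, 11);
my $length = @array;                      # array in scalar context: element count
my $last   = (1, 3, 5, 7, 9, 11)[-1];     # a list can be indexed directly
my @both   = (@array, @array, 1);         # nested arrays flatten into one list

# A list slice of localtime: day of month, month (0-11), years since 1900
my ($day, $month, $year) = (localtime())[3 .. 5];
$year += 1900;

# List assignment swaps two values without a temporary
my ($x, $y) = (10, 20);
($x, $y) = ($y, $x);

print "length=$length last=$last flattened=", scalar(@both), "\n";
print "date fields: $day $month $year\n";
print "x=$x y=$y\n";
```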

Context
Every operation in Perl is evaluated in one of two contexts: scalar or list
Assignment to a scalar lvalue causes the right side to be evaluated in scalar context
Assignment to an array, hash, or slice lvalue causes the right side to be evaluated in list context
Assignment to a list on the left causes the right side to be evaluated in list context
Use the scalar function to force evaluation in scalar context
Some operations return different values depending on the context in which they are evaluated
– $matched = m/([^,]+)*/;    (scalar context: true if the match succeeded)
– @numbers = m/([^,]+)*/;    (list context: the captured substrings)

Arrays and Context
An array referenced using @ operates in list context
An array element operates in scalar context
When a list is assigned to an array each value is inserted into the next element
Special forms of arrays
– $length = scalar @array;    (scalar is not required here)
– $last_index = $#array;
– scalar @array == $#array + 1    (an identity)

Hashes and Context
A hash referenced in the % form operates in list context
A hash element operates in scalar context
When a list is assigned to a hash each pair of values in the list is taken as a key-value pair
– %colors = ('red',0xff0000,'green',0x00ff00,'blue',0x0000ff);
There is a special syntax available for this
– %colors = (red => 0xff0000, green => 0x00ff00, blue => 0x0000ff);
Use the keys function to generate a list of keys for a hash
To find the number of keys in a particular hash
– $number_of_keys = scalar keys %hash;
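The two hash assignment forms and the effect of context can be compared directly. A minimal sketch, reusing the slide's color table:

```perl
use strict;
use warnings;

# Both assignments build identical hashes; => simply quotes the word on its left.
my %plain  = ('red', 0xff0000, 'green', 0x00ff00, 'blue', 0x0000ff);
my %colors = (red => 0xff0000, green => 0x00ff00, blue => 0x0000ff);

my $number_of_keys = scalar keys %colors;   # keys in scalar context: a count
my @flat           = %colors;               # hash in list context: key/value pairs

# Hash order is not defined, so sort the keys for repeatable output.
for my $name (sort keys %colors) {
    printf "%-5s 0x%06x\n", $name, $colors{$name};
}
print "keys=$number_of_keys flat=", scalar(@flat), "\n";
```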

Filehandles and Input
A filehandle refers to a file
Filehandles are, by convention, all upper case
STDIN, STDOUT, STDERR are predefined
Use the <> operator to read from a filehandle
– $line = <STDIN>;     # read one line from STDIN
– @lines = <STDIN>;    # read all lines from STDIN
Read and print all input (files named on the command line, or STDIN if there are none)
– while (<>) { print; }
– Reads each line into the special variable $_, which is used implicitly by both the <> operator and print

Operators
Operators can be overloaded when using objects
Operator precedence, highest first
– Terms and list operators
– ->
– ++ --
– **
– ! ~ \ unary + unary -
– =~ !~
– * / % x
– + - .
– << >>
– Named unary operators
– < > <= >= lt gt le ge
– == != <=> eq ne cmp
– &
– | ^
– &&
– ||
– .. ...
– ? : (ternary)
– = += -= *= (etc)
– , =>
– List operators
– not
– and
– or xor

Simple Statements
A simple statement is an expression that is evaluated
A simple statement is terminated with a ;
A simple statement may be followed by a modifier
– if expr
– unless expr
– while expr
– until expr
– foreach list
Examples
– print "Value is $i\n" if $i > 5;
– print "i=", $i--, "\n" while $i != 0;

Compound Statements
Expressions containing blocks
A block is normally contained in { }
if statement
– if (expr) block
– if (expr) block else block
– if (expr) block elsif (expr) block
– if (expr) block elsif (expr) block else block
unless statement is similar

$i = $max;
if ($i == $max) {
    print "The max is five\n";
    exit;
} else {
    $i++;
}

$i = $max;
unless ($i == $max) {
    $i++;
} else {
    print "The max is five\n";
    exit;
}

Compound Statements
while statement
– label while (expr) block
– label while (expr) block continue block
until statement
– label until (expr) block
– label until (expr) block continue block
The continue block is executed before starting the next iteration of the loop

while (<STDIN>) {
    chomp;
    @fields = split(/:/);
    print "Field 1: $fields[0]\n";
}

Compound Statements
for loop
– label for (expr1 ; expr2 ; expr3) block
– expr1 is the initialization
– expr2 is the test; the loop ends when it becomes false
– expr3 is evaluated after each iteration

for (my $i = 0; $i < 10; $i++) {
    print "i=$i\n";
}

Compound Statements
foreach statement
– label foreach (list) block
– label foreach var (list) block
– label foreach var (list) block continue block
Loops over each entry in the list
When var is omitted, $_ is used

foreach my $key (sort keys %people) {
    print "Key: $key, Value=$people{$key}\n";
}

foreach my $entry (@items) {
    print "Item: $entry\n";
}

Compound Statements
Labeled block
– label block
– label block continue block
Equivalent to a loop that executes once
Can be used with last, next, and redo

Loop Control
These statements can be used with blocks
The optional label further refines their effect
last label
– Exit the loop (block)
– The continue block is not executed
next label
– Skip the rest of this iteration and start the next iteration
– Execute the continue block before the next iteration begins
redo label
– Restart the current iteration without re-evaluating the loop condition
– The continue block is not executed
The label parameter enables multi-level block control
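Labels, next, last, and the continue block can be seen working together in a nested loop. This is a sketch; the label ROW and the sample data are assumptions made for illustration.

```perl
use strict;
use warnings;

my @rows = ([1, 2, 3], [4, -1, 6], [7, 8, 9]);   # hypothetical sample data
my @kept;
my $rows_finished = 0;

ROW: for my $row (@rows) {
    for my $cell (@$row) {
        next ROW if $cell < 0;     # abandon this row, move to the next one
        last ROW if $cell > 7;     # leave both loops entirely
        push @kept, $cell;
    }
}
continue {
    $rows_finished++;              # runs after a normal pass and after next, not after last
}

print "kept=@kept rows_finished=$rows_finished\n";
```

The second row is cut short by next ROW and the third by last ROW, so only the first two rows reach the continue block.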

Declarations
Subroutine declaration is a global declaration
– A subroutine must be declared before use if it is to be called without parentheses
– sub count;
Can define a subroutine at declaration
– sub count { … }
Pragmas are directives to the Perl compiler
– use strict;
– use integer;
– use warnings;
– use English;

Declarations
Variable declarations
Lexically scoped declarations
– my $var;
– my ($var1, $var2);
– my $value = function();
Lexically scoped global declarations
– our $var;
Dynamically scoped global declarations
– local $var;

Pattern Matching
Regular Expressions
– Rule-based pattern matching mechanism
Simple pattern
– m/Class/
Complex pattern
– m/AE[0-9]+[A-Z]/

Regular Expressions
Meta-characters
– \ | ( ) [ { ^ $ * + ? .
– Have special meanings inside patterns
– \ is the escape character, used to match one of the meta-characters as itself in a pattern, e.g., \\ or \.
Quantifiers
– * + ? {3} {2,5}
– Quantifiers normally match the maximal text
– Add ? after a quantifier to match the minimal text
Character classes
– [ ] or [^ ]
Grouping
– ( )

Regular Expressions
The pattern matching operators
– m//      match
– s///     substitute
– tr///    transliterate
Binding operators
– =~ binds a string to a pattern operator
– !~ does the same but negates the result
Examples
– $string =~ m/AE[0-9]{4}[A-Z]/;
– $string =~ s/old/new/;
– $string =~ s(old)(new);    (can use arbitrary delimiters)
– $string =~ s'old'new';

Regular Expressions
Maximal and minimal matches
– "exasperate" =~ m/e(.*)e/     captures "xasperat" in $1
– "exasperate" =~ m/e(.*?)e/    captures "xasp" in $1
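The maximal and minimal matches can be confirmed by capturing into variables. A short sketch; the $text string is invented for the example.

```perl
use strict;
use warnings;

my $word = 'exasperate';
my ($greedy) = $word =~ m/e(.*)e/;    # .* takes as much as it can
my ($lazy)   = $word =~ m/e(.*?)e/;   # .*? takes as little as it can

# A course-number pattern applied to a made-up string.
my $text = 'Enrolled in AE6382A this term';
my ($course) = $text =~ m/(AE[0-9]{4}[A-Z])/;

(my $changed = $text) =~ s/this/next/;   # copy, then substitute in the copy

print "greedy=$greedy lazy=$lazy course=$course\n$changed\n";
```

A match in list context returns its captures, which is why each result is assigned to a parenthesized list.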

Functions
There are many built-in functions
Can be used with or without parentheses around arguments
– With parentheses it will be parsed as a function
– Without parentheses it will be parsed as a prefix operator (preferred)
– Use the -w switch on the #!/usr/bin/perl -w line to flag when it is being parsed as a function
– Example
  • print 1+2*4;      # prints 9
  • print (1+2)*4;    # prints 3
For details see the Perl documentation or the Camel book
Users may define functions
– sub name { code }
User functions are called with parentheses around arguments

Functions - Arguments
Arguments are passed to functions in the built-in array @_
The elements of @_ can be accessed by any of several techniques

sub func {
    my $arg1 = $_[0];
    my $arg2 = $_[1];
}

sub func {
    my $arg1 = shift;
    my $arg2 = shift;
}

sub func {
    my $arg1 = shift;
    my @rest = @_;
}

shift is a built-in function that removes and returns the first element of an array, moving the remaining elements down
shift operates in a manner similar to a stack pop, but at the front of the array

sub func {
    my $nargs = @_;
    my $arg1 = shift;
    my @rest = @_;
}
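The argument-handling techniques above, shown as a runnable sketch. The subroutine names total and area are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: first argument is a label, the rest are numbers to total.
sub total {
    my $nargs = @_;          # count the arguments before shift removes any
    my $label = shift;       # shift defaults to @_ inside a subroutine
    my @rest  = @_;          # whatever is left
    my $sum   = 0;
    $sum += $_ for @rest;
    return ($nargs, "$label=$sum");
}

# List assignment unpacks all of @_ at once.
sub area {
    my ($width, $height) = @_;
    return $width * $height;
}

my ($nargs, $report) = total('sum', 1, 2, 3);
my $area = area(4, 5);
print "nargs=$nargs $report area=$area\n";
```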

eval Function
The eval function is normally used to trap runtime errors
The eval function has two forms
– eval block
  • Executes the code enclosed by the block
– eval expr
  • Compiles and executes the code in expr
  • The code in expr can be dynamically created
The special variable $@ reports the outcome of execution
– $@ is set to the error message if there is an error
– $@ is set to an empty string if there is no error

eval { … };        # execute block of code
if ($@) { … }      # handle error
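A minimal sketch of both eval forms. The safe_divide helper is hypothetical and exists only to raise an error on demand.

```perl
use strict;
use warnings;

# Hypothetical helper that dies on bad input.
sub safe_divide {
    my ($num, $den) = @_;
    die "division by zero\n" if $den == 0;
    return $num / $den;
}

my $ok = eval { safe_divide(10, 2) };       # block form; returns the block's value
my $ok_error = $@;                          # empty string: no error

my $bad = eval { safe_divide(1, 0) };       # the die is trapped, eval returns undef
my $bad_error = $@;                         # holds the die message

my $computed = eval "2 + 3 * 4";            # string form: compiled at run time

print "ok=$ok computed=$computed\n";
print "trapped: $bad_error";
```

Copying $@ into a variable right after each eval avoids losing it if later code runs another eval.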

References
A reference in Perl is a scalar that contains a pointer to some data in memory
Perl has two types: symbolic and hard
– Symbolic: the scalar contains the name of another variable
– Hard: the scalar contains the address of the data in memory
Use the $ prefix to dereference a reference
– $ref is the scalar that contains the reference
– $$ref       # dereference
– ${$ref}     # dereference
Hard references are generally more common

References
The \ (backslash) operator is used to create a hard reference
$ref = \$sample
– In this example $ref refers to $sample; $$ref and $sample name the same location in memory
– Use $$ref to refer to that memory location: $$ref == $sample and ${$ref} == $sample
$ref = \@array
– In this example $ref refers to @array
– To access an array element: $$ref[1] or ${$ref}[1] or $ref->[1]
– To access the whole array: @$ref or @{$ref}

Data Structures
References are useful in accessing anonymous data structures
Anonymous array
– [ element1, element2, … , elementN ]
– $ref = [0,1,2,3,4];
– $$ref[0] or ${$ref}[0] or $ref->[0]
Anonymous hash
– { key1=>element1, key2=>element2, … , keyN=>elementN }
– $ref = { Jan=>1, Feb=>2, Mar=>3, Apr=>4 };
– $$ref{Jan} or ${$ref}{Jan} or $ref->{Jan}
The -> operator is syntactic shorthand that removes the extra $ dereference
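References to named and anonymous data can be exercised side by side. A sketch with illustrative variable names:

```perl
use strict;
use warnings;

my $sample = 42;
my @array  = (10, 20, 30);

my $sref = \$sample;          # hard reference to a scalar
my $aref = \@array;           # hard reference to a named array
my $anon = [0, 1, 2, 3, 4];   # reference to an anonymous array
my $href = { Jan => 1, Feb => 2, Mar => 3, Apr => 4 };   # anonymous hash

$$sref = 43;                  # writes through to $sample
$aref->[1] = 99;              # writes through to $array[1]

# Three equivalent spellings of the same element access
my @same = ($$anon[0], ${$anon}[0], $anon->[0]);

print "sample=$sample array=@array\n";
print "Feb=$href->{Feb} size=", scalar(@$anon), " kind=", ref($href), "\n";
```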
Data Structures
Creating arbitrarily complex data structures is relatively easy using references
Create any number of anonymous structures placing their address into a scalar (reference)
Store the resulting scalars into other structures

Arrays of Arrays
An array of arrays is how to create a multi-dimensional array in Perl
In each cell of one array save a reference to another array
There is no requirement that each secondary array be the same length

my @array;
for (my $i=0;$i<4;$i++) {
    my $ref;
    for (my $j=$i;$j<$i+4;$j++) {
        push @{$ref},$j;
    }
    $array[$i] = $ref;
}
print $array[0]->[0],"\n";

my $array_ref;
for (my $i=0;$i<4;$i++) {
    my $ref;
    for (my $j=$i;$j<$i+4;$j++) {
        push @{$ref},$j;
    }
    $array_ref->[$i] = $ref;
}
print $array_ref->[0]->[0],"\n";

Hash of Arrays
In each cell of a hash table save a reference to an array

my %months = ( Jan=>[1..31],
               Feb=>[1..28]);
foreach my $month (keys %months) {
    print "$month: ", join(', ', @{$months{$month}}), "\n";
}

Output (the order of the keys is not guaranteed)
Jan: 1, 2, 3, 4, … 27, 28, 29, 30, 31
Feb: 1, 2, 3, 4, … 27, 28

Complex Structures
Data structures can be created to any level of complexity
Can mix all types to any depth
– Arrays of hashes of hashes of arrays
– Hashes containing references to user defined functions
  • &{$func_list{$member}}(…arguments…)

sub startup {
    print "Startup\n";
}
sub shutdown {
    my $code = shift;
    print "Shutdown: $code\n";
}
%func_list = (Startup=>\&startup,
              Shutdown=>\&shutdown);
&{$func_list{Shutdown}}(99);
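A hash of code references is often used as a dispatch table. The sketch below assumes hypothetical handler names and request data, and records messages in @log rather than printing them immediately.

```perl
use strict;
use warnings;

my @log;   # collects messages so the result can be inspected

# Hypothetical handlers; the names are illustrative only.
sub start_up  { push @log, 'Startup' }
sub shut_down { my $code = shift; push @log, "Shutdown: $code" }

my %func_list = (
    Startup  => \&start_up,
    Shutdown => \&shut_down,
    Status   => sub { push @log, 'Status: ' . scalar(@log) . ' events' },   # anonymous sub
);

# Look the handler up by name and call it through the reference.
for my $request (['Startup'], ['Status'], ['Shutdown', 99], ['Reboot']) {
    my ($name, @args) = @$request;
    my $handler = $func_list{$name};
    if ($handler) {
        $handler->(@args);       # same call as &{$handler}(@args)
    } else {
        push @log, "Unknown request: $name";
    }
}

print "$_\n" for @log;
```

Hash keys are case sensitive, so a lookup must use exactly the spelling used when the table was built.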

Packages
A package is the way to isolate code in its own namespace
This is particularly useful for re-usable code (libraries)
As generally used, the scope of a package declaration is the file in which it appears
Usually package is the first line of a file that is processed by require or use
To refer to a variable in another package use $package::variable
The default package is main: $main::variable or $::variable

Modules
The module is the basic unit of re-usable Perl code
Module files end with the .pm file extension
Modules come in two forms
– Traditional: functions and variables
– Object-oriented: methods and properties
Modules are accessed with the use keyword
– use Module;
A module file contains a package declaration with the same name as the file
A module may export a list of functions and variables to the namespace that contains the use statement (do not export OO methods)

Modules
Module names should begin with a capital letter, and the module file name ends with .pm
The last statement of a module must return a true value, conventionally 1;

File Sample.pm
package Sample;
sub func1 {
}
sub func2 {
}
1;

Script that uses the module
use Sample;
my $result = Sample::func1;

Modules
Beyond the simple form there is additional support for modules
– The Exporter module can be used to place selected symbols into the Perl code that uses the module
– There is a version checking mechanism
– There is an autoload feature

File Sample.pm
package Sample;
require Exporter;
our @ISA = qw(Exporter);
our @EXPORT = qw(func1 func2);
sub func1 {
}
sub func2 {
}
1;

Script that uses the module
use Sample;
my $result = func1;
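To try exporting without creating a separate file, a module can be defined inline. This sketch departs from the slide in two labeled ways: it imports Exporter's import method instead of setting @ISA, and it uses @EXPORT_OK (export on request) instead of @EXPORT. The function bodies are made up.

```perl
use strict;
use warnings;

# Hypothetical module defined inline; normally this would live in Sample.pm.
BEGIN {
    package Sample;
    use Exporter 'import';            # borrow Exporter's import method
    our @EXPORT_OK = qw(func1);       # exported only on request
    sub func1 { return 'from func1' }
    sub func2 { return 'from func2' }
    $INC{'Sample.pm'} = __FILE__;     # mark the module as already loaded
}

use Sample qw(func1);                 # import func1 into the current package

my $short = func1();                  # imported name
my $long  = Sample::func2();          # fully qualified name, never imported

print "$short / $long\n";
```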

Objects
The module forms the basis of the object-oriented features of Perl
The package name is the class name (type)
The function definitions in the module are the methods
A class may inherit methods from parent classes
A class may be sub-classed
Perl classes inherit methods, not data
An object is a reference to an instance of a class
All Perl classes are sub-classes of the UNIVERSAL class

Objects – Method Invocation
Assume a class named Sample with an instance named $instance
Invoking a class method
– Sample->class_method(…arguments…);
Invoking an instance method
– $instance->instance_method(…arguments…);
The first argument of a method invocation is hidden and is either the class name (class method) or a reference to an object (instance method)
Methods can override super class methods

Objects – Method Invocation (2)
There is an alternate invocation syntax using indirect objects
It looks like
– method object (list)
– method object list
– method object
This syntax is less common as it suffers from some syntactic ambiguity
Frequently used in calling a constructor
– $q = new CGI;     (indirect object form)
– $q = CGI->new;    (equivalent arrow form)

Objects - Constructors
A constructor method is an ordinary method, usually named new
Constructors for sub-classable classes need to be designed carefully (Camel Book 3rd ed, p 318)
The instance properties are usually kept in an anonymous hash that is saved in the instance variable
The bless function associates the reference variable with the class

# Constructor for class named Sample
sub new {
    my $obj = shift;
    my $class = ref($obj) || $obj;
    my $self = { @_ };
    bless($self,$class);
    return $self;
}

$object = Sample->new(alpha=>1,beta=>2);

Objects - Constructors
In the previous example the instance data are stored in an anonymous hash
The ref built-in function returns the class name of the object that is referred to
Any reference can be used; hashes are common and convenient
The use fields …; pragma is useful for creating object field storage; use it together with the use base …; pragma
Objects – Properties
The instance data can be referenced as hash entries when the object is hash based
my $prop1 = $object->{alpha};
my $prop2 = $object->{beta};
Instance data should normally be accessed using accessor methods
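A complete, minimal class in the style described, with an accessor method in place of direct hash access. The accessor and the describe method are hypothetical additions; the constructor is a simplified version that handles class-method calls only.

```perl
use strict;
use warnings;

package Sample;    # hypothetical class

sub new {
    my $class = shift;
    my $self  = { @_ };              # instance data in an anonymous hash
    return bless $self, $class;
}

# Combined getter/setter accessor for the "alpha" property
sub alpha {
    my $self = shift;
    $self->{alpha} = shift if @_;
    return $self->{alpha};
}

sub describe {                        # instance method
    my $self = shift;
    return ref($self) . " alpha=" . $self->alpha;
}

package main;

my $object = Sample->new(alpha => 1, beta => 2);   # class method call
$object->alpha(5);                                 # set through the accessor
my $summary = $object->describe;

print "$summary\n";
print "can describe\n" if $object->can('describe');   # can is inherited from UNIVERSAL
```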

Objects - Overloading
Perl provides a mechanism to overload operators
– use overload implements this
– There is a handler (method/function) associated with each operator that has been overloaded; Perl takes care of the details
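A small sketch of use overload. The Vec2 class and its handlers are hypothetical; only the + and string-conversion operators are overloaded.

```perl
use strict;
use warnings;

package Vec2;    # hypothetical two-component vector class

use overload
    '+'  => \&add,          # handler for $a + $b
    '""' => \&to_string;    # handler used when the object is interpolated or printed

sub new {
    my ($class, $x, $y) = @_;
    return bless { x => $x, y => $y }, $class;
}

sub add {
    my ($left, $right) = @_;    # a third "swapped" argument is ignored here
    return Vec2->new($left->{x} + $right->{x}, $left->{y} + $right->{y});
}

sub to_string {
    my $self = shift;
    return "($self->{x}, $self->{y})";
}

package main;

my $sum  = Vec2->new(1, 2) + Vec2->new(3, 4);
my $text = "$sum";
print "sum = $text\n";
```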
Tied Variables
In Perl the tie function associates an object with a normal Perl variable (scalar, array, hash)
For example, a file can be accessed as if it were a simple array
The store and fetch accesses to the variable are provided by methods, Perl handles the details
There are numerous available modules that create tied variables to access more complex data sources
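A minimal sketch of a tied scalar. The UpperCase class is hypothetical; TIESCALAR, FETCH, and STORE are the method names Perl calls for tied scalars.

```perl
use strict;
use warnings;

package UpperCase;    # hypothetical tie class: stores every value in upper case

sub TIESCALAR {
    my ($class, $initial) = @_;
    my $value = uc $initial;
    return bless \$value, $class;
}

sub FETCH {                     # called whenever the tied variable is read
    my $self = shift;
    return $$self;
}

sub STORE {                     # called whenever the tied variable is assigned
    my ($self, $new) = @_;
    $$self = uc $new;
}

package main;

tie my $shout, 'UpperCase', 'hello';
my $before = $shout;            # read goes through FETCH
$shout = 'perl';                # assignment goes through STORE
my $after = $shout;

print "$before $after\n";
```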

Extending Perl
There are several ways to extend Perl
– Create modules (object oriented or traditional)
– Create native code (C code) that is added to the Perl interpreter
Hundreds of modules are available at http://www.cpan.org/
Perl is available for almost every OS
– Generally pre-compiled for Linux
– Windows version from http://www.activestate.com/
The Perl interpreter can be embedded in native code programs