Hans Petter Langtangen
Simula Research Laboratory
Department of Informatics
University of Oslo
Table of Contents
1 Introduction to Perl
  1.1 A Scientific Hello World Script
    1.1.1 Reading and Writing Data Files
    1.1.2 The Complete Code
    1.1.3 Dissection
    1.1.4 The Concept of Context in Perl
  1.2 Automating Simulation and Visualization
    1.2.1 The Complete Code
    1.2.2 Dissection
  1.3 There’s More Than One Way To Do It
    1.3.1 A Script for Perl Beginners
    1.3.2 Using the Underscore Variable
    1.3.3 A Script Written in Typical Perl Style
    1.3.4 Shorter Scripts for Lazy Programmers
    1.3.5 The Ultimate Goal: Getting Rid of the Script File
    1.3.6 Perl Has a Grep Function Too
  1.4 Frequently Encountered Tasks
    1.4.1 Basic Control Statements
    1.4.2 File Reading and Writing
    1.4.3 Running an Application
    1.4.4 One-Line Perl Scripts
    1.4.5 Array and List Operations
    1.4.6 Hash Operations
    1.4.7 Splitting and Joining Text
    1.4.8 Text Processing
    1.4.9 String Operations
    1.4.10 Environment Variables
    1.4.11 Subroutines
    1.4.12 Nested, Heterogeneous Data Structures
    1.4.13 Testing a Variable’s Type
    1.4.14 Numerical Expressions
    1.4.15 Listing of Files in a Directory
    1.4.16 Testing File Types
    1.4.17 Copying and Renaming Files
    1.4.18 Creating and Moving to Directories
    1.4.19 Removing Files and Directories
    1.4.20 Splitting Pathnames
    1.4.21 Traversing Directory Trees
    1.4.22 Downloading Internet Files
    1.4.23 CPU-Time Measurements
    1.4.24 Programming with Classes
    1.4.25 Debugging Perl Scripts
    1.4.26 Regular Expressions
    1.4.27 Exercises
    1.4.28 Building and Using Modules
    1.4.29 Binary Input/Output
  1.5 Installing Perl and Additional Modules
    1.5.1 Installing Basic Perl
    1.5.2 Manual Installation of Perl Modules
    1.5.3 Automatic Installation of Perl Modules
    1.5.4 The Required Perl Modules
  1.6 Perl Versus Python
    1.6.1 Python’s Advantages
    1.6.2 Perl’s Advantages
    1.6.3 Efficiency
  1.7 GUI Programming with Perl/Tk
    1.7.1 The First Perl/Tk Encounter
    1.7.2 The Similarity of Python/Tkinter and Perl/Tk
    1.7.3 Binding Events
  1.8 Web Interfaces and CGI Programming
    1.8.1 Web Versions of the Scientific Hello World Program
    1.8.2 Debugging CGI Scripts in Perl with CGI::Debug
    1.8.3 Using Perl’s CGI Module to Construct Forms
2 Introduction to Tcl/Tk
  2.1 A Scientific Hello World Script
    2.1.1 The Complete Code
    2.1.2 Dissection
  2.2 Reading and Writing Data Files
    2.2.1 The Complete Code
    2.2.2 Dissection
    2.2.3 Double Quotes, Braces, Brackets, and Variable Substitution
  2.3 Automating Simulation and Visualization
    2.3.1 The Complete Code
    2.3.2 Dissection
  2.4 Frequently Encountered Tasks
    2.4.1 File Reading and Writing
    2.4.2 Running an Application
    2.4.3 List Operations
    2.4.4 Associative Array Operations
    2.4.5 Splitting and Joining Text
    2.4.6 Text Processing
    2.4.7 String Operations
    2.4.8 Numerical Expressions
    2.4.9 Environment Variables
    2.4.10 Procedures
    2.4.11 Listing of Files in a Directory
    2.4.12 Testing File Types
    2.4.13 Copying and Renaming Files
    2.4.14 Creating and Moving to Directories
    2.4.15 Removing Files and Directories
    2.4.16 Splitting Pathnames
    2.4.17 Traversing Directory Trees
    2.4.18 CPU-Time Measurements
    2.4.19 Programming with Classes
    2.4.20 Building and Using Packages
    2.4.21 Regular Expressions
    2.4.22 Exercises
  2.5 GUI Programming with Tcl/Tk
    2.5.1 The First Tcl/Tk Encounter
    2.5.2 Binding Events
    2.5.3 Widget Name Hierarchy
    2.5.4 The Similarity of Python/Tkinter and Tcl/Tk
    2.5.5 Using Variables in Widget Names
    2.5.6 Configuring Widgets
    2.5.7 The Grid Geometry Manager
Preface
The purpose of this document is to show how the introductory
programming examples from the book “Python Scripting for
Computational Science” [?] can be implemented in Perl and Tcl. In
addition, we list some core functionality of these scripting
languages, typically corresponding to the same information and
examples as in Chapter 3 (Basic Python) in [?]. If you know the
examples in a Python context from Chapters 2 (Getting Started with
Python Scripting) and 3 (Basic Python) in [?], it is quite easy to
pick up basic Perl and Tcl from the present note. The Perl and Tcl
chapters can be read independently.
The author would like to include other scripting languages, e.g.,
Ruby and Scheme. Potential authors of such (independent) chapters,
with the same structure as the Perl and Tcl chapters, are
encouraged to send an email to the author ([email protected]).
The present printing of the document contains the Perl part
only.
Chapter 1
Introduction to Perl
This chapter gives a quick introduction to the Perl language for
readers who are familiar (at least to some extent) with the Python
scripts from Chapters 2.1 (A Scientific Hello World Script) to 2.3
(Gluing Stand-Alone Applications) and 3 (Basic Python) in the book
[?]. We shall look at the same sample scripts and show how the
syntax changes when we program in Perl.
Recommended Documentation. As a companion to the introductory
examples and the overview of basic Perl functionality provided in
this appendix, you need the Perl man pages. These come along with
the Perl distribution. I find it convenient to read the man pages
in plain text format using the perldoc tool. Some common ways of
looking up information with perldoc are exemplified below.
perldoc perl      # overview of all Perl man pages
perldoc perlsub   # read about subroutines
perldoc Cwd       # look up a special module, here 'Cwd'
perldoc -f open   # look up a special function, here 'open'
perldoc -q cgi    # search the FAQ for the text 'cgi'
A Web version of the man pages can be found in the doc.html file.
There you can also find the Perl FAQ and a quick reference.
Having grasped the basic introduction to Perl from this appendix,
you will find the definitive Perl reference, the famous “Camel book”
[?], very useful. However, much of the text in [?] coincides with
the Perl man pages. If you feel that a more comprehensive
introduction to Perl is needed, “Learning Perl” [?] and [?] are
recommended. Ready-made recipes for numerous common tasks in
scripting are collected in the highly recommended “Perl Cookbook”
[?]. Advanced features of Perl are well discussed in [?] and [?].
Some Web resources regarding Perl topics are listed in
doc.html.
The first Perl encounter consists of three of the examples from the
introduction to Python in Chapter 2 (Getting Started with Python
Scripting) in [?]. We start out with a Hello World script, before
continuing with a script concerning file handling and array
processing. Thereafter we present a script gluing a simulation and
a visualization program. All the scripts referred to in this
section are found in src/perl. In Chapter 1.4 we then list, in an
example-oriented way, some basic and useful Perl functionality for
quick reference. Chapter 1.5 explains how to install Perl and
additional modules. A brief comparison of Perl versus Python
appears in Chapter 1.6, while Chapters 1.7 and 1.8 deal with
graphical user interfaces: standard GUIs and dynamic Web pages,
respectively.
1.1 A Scientific Hello World Script
Our first look at Perl will be the Scientific Hello World script
from Chapter 2.1 (A Scientific Hello World Script) in [?].
This script reads a real number from the command line, takes the
sine of the number, and writes “Hello, World! sin(r)=s” with the
appropriate values of the numbers r and s. In Perl, we can write
the script like this:
#!/usr/bin/perl
$r = $ARGV[0];  # fetch the first ([0]) command-line argument
$s = sin($r);   # compute sin(r) and store in variable s
print "Hello, World! sin($r)=$s\n";  # print to standard output
Comments in Perl start with # and continue for the rest of the
line. However, the first line #!/usr/bin/perl has a special
meaning: under Unix it specifies that the script, if run as an
executable file, is to be interpreted by the program /usr/bin/perl.
If the Perl interpreter is stored in another path on your system,
you must write the correct full path in the top line of the script
or (usually better) use a different header to be presented in
Chapter 1.1.1.
Scalar variables in Perl are always preceded by a $ sign, i.e., $r
and $s are scalar variables in the present script. The command-line
arguments to a Perl script are automatically stored in the array
@ARGV. Subscripting this array is done as in $ARGV[0] (which
extracts the first entry; array indices in Perl start at 0, as in C
and Python). The length of the array is $#ARGV+1, i.e.,
$ARGV[$#ARGV] is the last entry of the array. The array itself as a
variable is reached with the syntax @ARGV (and one can say, e.g.,
print "ARGV=@ARGV").
Variables can be directly inserted into a text string, a convenient
feature called variable interpolation:
print "Hello, World! sin($r)=$s\n"; # print to screen
Such variable interpolation works only if the string is surrounded
by double quotes. With single quotes the text is written out
literally, dollar characters included.
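The subscripting and interpolation rules above can be tried out in a few lines. This is only a minimal sketch: the array @args and its contents are made up for illustration and stand in for @ARGV.

```perl
use strict;
use warnings;

# hypothetical stand-in for @ARGV (assumption: three command-line arguments)
my @args = ("0.1", "infile", "outfile");

my $first  = $args[0];        # first entry; indices start at 0
my $last   = $args[$#args];   # $#args is the last valid index
my $length = $#args + 1;      # number of entries

my $double = "first=$first, all=@args";   # double quotes: variables are interpolated
my $single = 'first=$first, all=@args';   # single quotes: text is taken literally

print "$double\n";
print "$single\n";
print "length=$length last=$last\n";
```

Note that interpolating an array in a double-quoted string inserts a space between the entries.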
Perl’s syntax is much inspired by C. For example, the newline
character is \n and all statements are terminated by a
semicolon.
As usual in scripting, variables need not be declared; the context
determines the type. In contrast to Python, a variable can be used
both as a string and as a floating-point number. For example, $r is
initialized to a string, but can be sent to the sine function, which
expects a floating-point variable, without any explicit type
conversion.
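A small sketch of this string/number duality follows; the variable names are chosen for illustration only.

```perl
use strict;
use warnings;

my $r = "0.5";             # a string, as it would arrive from the command line
my $twice  = 2 * $r;       # numeric context: the string is converted to the number 0.5
my $joined = $r . "0";     # string context: concatenation gives the text "0.50"

print "twice=$twice joined=$joined\n";
```

The operator, not the variable, decides how the value is used: * treats its operands as numbers, while . treats them as strings.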
Perl’s printf function gives good control of the output format of
numbers and strings:
printf "Hello, World! sin(%g)=%12.5e\n", $r, $s;
Variable interpolation offers no control over the output format
(i.e., there is no counterpart to Python’s %(s)12.5e).
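One possible workaround, sketched here under the assumption that a formatted string is all that is needed, is to format the number with sprintf first and interpolate the result. The variable $s_formatted is introduced for this illustration only.

```perl
use strict;
use warnings;

my $r = 0.1;
my $s = sin($r);

# sprintf uses the same format syntax as printf but returns the string
my $s_formatted = sprintf("%12.5e", $s);
my $msg = "Hello, World! sin($r)=$s_formatted";

print "$msg\n";
```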
If the script is stored in a file hw.pl, you can execute the script
by typing
perl hw.pl 0.1
or you can make the file executable under Unix (chmod a+x hw.pl)
and then just write
./hw.pl 0.1
1.1.1 Reading and Writing Data Files
Chapter 2.2 (Working with Files and Data) in [?] deals with
a script for reading a file with (x, y) data points in two columns
and writing a new two-column file with transformed data points (x,
f(y)). On the next pages we shall present and explain a Perl
counterpart to the Python scripts. This case study demonstrates how
to work with files, subroutines, and arrays in Perl.
1.1.2 The Complete Code
: # *-*-perl-*-*
  eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0;  # if running under some shell

die "Usage: $0 infilename outfilename\n" if $#ARGV < 1;
($infilename, $outfilename) = @ARGV;

open(INFILE, "<$infilename");    # open for reading
open(OUTFILE, ">$outfilename");  # open for writing

# read one line at a time:
while (defined($line=<INFILE>)) {
    ($x, $y) = split(' ', $line);  # extract x and y value
    $fy = myfunc($y);              # transform y value
    printf(OUTFILE "%g %12.5e\n", $x, $fy);
}
close(INFILE); close(OUTFILE);

sub myfunc {
    my ($y) = @_;
    ...
}
1.1.3 Dissection
The Perl script starts with a header
: # *-*-perl-*-*
  eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0;  # if running under some shell
This header ensures that executing the script as
./datatrans1.pl infile outfile
implies interpreting the code by the first perl program encountered
in the directories listed in your PATH environment variable. The
explanation of all the details in our Perl header is intricate, but
it can be found in the file src/perl/headerfun.sh. (This is
actually a document written in Bash (!) so you need to run the file
to get the document printed.)
In the case where the user has failed to provide two command-line
arguments, we want to write a usage message and abort the script.
This is accomplished by Perl’s die statement: die prints a string
on standard error and terminates the script. In the present example
the script dies if there are fewer than two command-line
arguments:
die "Usage: $0 infilename outfilename\n" if $#ARGV < 1;
Recall that $#ARGV is the last legal index in @ARGV, i.e., the
length of @ARGV is $#ARGV+1, so the test is $#ARGV+1 < 2,
leading to $#ARGV < 1.
Extracting the first two command-line arguments can be performed by
standard subscripting:
$infilename = $ARGV[0];
$outfilename = $ARGV[1];
However, it is more common (and elegant) to use Perl’s list
assignment construction:
($infilename, $outfilename) = @ARGV;
The list on the left-hand side is set equal, entry by entry, to the
entries in the array on the right-hand side. We refer to the remark
at the end of this section for an explanation of the difference
between list and array in Perl terminology.
Opening files in Perl is done with the open function:
open(INFILE, "<$infilename");    # open for reading
open(OUTFILE, ">$outfilename");  # open for writing
The first argument to open is a file handle, which is used for
accessing the file in the Perl code. Input files are recognized by
< in front of the name (see footnote 1), > signifies an output
file, and >> implies that text will be appended to the file.
Reading from a file handle, line by line, is accomplished by
while (defined($line=<INFILE>)) {
    # process $line
}
In the present script we want to split the line into an array of
words, separated by whitespace. The split function performs this
task:
($x, $y) = split(' ', $line);  # extract x and y value
Having the coordinates $x and $y available, we can transform the y
value by calling a function myfunc,
$fy = myfunc($y); # transform y value
One way of printing the transformed coordinate pair to the output
file is to apply the printf function:
printf(OUTFILE "%g %12.5e\n", $x, $fy);
The core of a printf call is the format string, which follows the
same syntax as in C and Python (and all other languages that
support C’s printf style of formatting). Perl’s ordinary print
function can also be used for writing to files, e.g.,
print OUTFILE "$x $fy\n";
The myfunc function is defined as
sub myfunc {
    my ($y) = @_;
    ...
}
Functions are referred to as subroutines in Perl. Their look is
typically
sub name {
    statements
}
The most striking difference from subprograms in other languages is
that the argument list is not a part of the subroutine heading.
Instead, all arguments are available in an array @_. The first step
is normally to store the arguments in local variables:
my ($y) = @_;    # list assignment
# or
my $y = $_[0];   # subscripting
Footnote 1: If there is no < symbol, the file is also opened for
reading. In fact, open(F,"<$name"), open(F,"$name"), and
open(F,$name) all lead to opening a file with name $name for
reading.
The my keyword specifies that all variables on the left-hand side
are declared as local variables in the subroutine. This is a good
habit, as unintended use of global variables inside a subroutine
may have undesired effects in other parts of the script.
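To make the role of @_ and my concrete, here is a minimal, self-contained sketch. The subroutine scale_and_shift and its formula are hypothetical and chosen only for illustration; they are not the myfunc transformation used in datatrans1.pl.

```perl
use strict;
use warnings;

# hypothetical transformation, chosen only for illustration
sub scale_and_shift {
    my ($y, $factor) = @_;                # copy the arguments into local variables
    $factor = 2 unless defined $factor;   # default value if the argument is omitted
    return $factor * $y + 1;
}

my $fy  = scale_and_shift(3);       # 2*3 + 1
my $fy2 = scale_and_shift(3, 10);   # 10*3 + 1

print "$fy $fy2\n";
```

Because the argument list is just the array @_, optional arguments come for free: an argument that is not supplied simply leaves the corresponding local variable undefined.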
As in Chapter 2.2 (Working with Files and Data) in [?], we
can modify datatrans1.pl such that (i) the file is loaded into an
array of lines, (ii) the x and y coordinates are stored in two
arrays, and (iii) the output file is written by a for loop over the
array entries.
We start with making the open statement a bit more robust. Perl
does not by default write any error message if the file we try to
open does not exist. This can be quite annoying, but the problem is
solved by a “try something or die” construction:
open(INFILE, "<$infilename")
    or die "unsuccessful opening of $infilename; $!\n";
The $! variable is a special variable in Perl containing the last
error message issued by the operating system.
Loading a file into an array of lines is enabled by the
syntax
@lines = <INFILE>;
One can then process the array @lines, line by line:
for $line (@lines) {
    # process $line
}
In the present case we want to create two arrays, @x and @y,
containing the x and y coordinates:
@x = (); @y = ();  # start with empty arrays
for $line (@lines) {
    ($xval, $yval) = split(' ', $line);
    push(@x, $xval); push(@y, $yval);
}
The x and y coordinates are extracted by splitting the line with
respect to whitespace, exactly as we did in the datatrans1.pl code.
The push function appends new array entries.
Creating the output file can now be performed by a C-like for loop
over the array indices:
open(OUTFILE, ">$outfilename")
    or die "unsuccessful opening of $outfilename; $!\n";
for ($i = 0; $i <= $#x; $i++) {
    $fy = myfunc($y[$i]);  # transform y value
    printf(OUTFILE "%g %12.5e\n", $x[$i], $fy);
}
close(OUTFILE);
Recall that $#x is the last valid index in the array @x. The
complete code is found in src/perl/datatrans2.pl.
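The three steps (load the lines, fill the coordinate arrays, loop over the indices when writing) can be collected in one runnable sketch. Several things here are assumptions made for the demonstration and are not taken from datatrans2.pl: the input data, the temporary files created with File::Temp, the stand-in transformation square, and the use of the three-argument form of open with lexical file handles, a variant often seen in newer Perl code.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# hypothetical transformation standing in for myfunc
sub square { my ($y) = @_; return $y * $y; }

# create a small input file for the demonstration (made-up data)
my ($tmp_in, $infilename) = tempfile(UNLINK => 1);
print $tmp_in "0.0 1.0\n0.5 2.0\n1.0 3.0\n";
close($tmp_in);
my ($tmp_out, $outfilename) = tempfile(UNLINK => 1);
close($tmp_out);

# (i) load the file into an array of lines
open(my $in, "<", $infilename)
    or die "unsuccessful opening of $infilename; $!\n";
my @lines = <$in>;
close($in);

# (ii) store the x and y coordinates in two arrays
my (@x, @y);
for my $line (@lines) {
    my ($xval, $yval) = split(' ', $line);
    push(@x, $xval);
    push(@y, $yval);
}

# (iii) write the output file in a loop over the array indices
open(my $out, ">", $outfilename)
    or die "unsuccessful opening of $outfilename; $!\n";
for (my $i = 0; $i <= $#x; $i++) {
    printf {$out} "%g %12.5e\n", $x[$i], square($y[$i]);
}
close($out);

# read the result back and show it
open(my $chk, "<", $outfilename) or die "open error; $!\n";
my @result = <$chk>;
close($chk);
print @result;
```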
Remarks on Terminology. Perl distinguishes between the terms array
and list. Roughly speaking, an array is a variable having a list
as value [?, Ch. 4.0]. For example, in the assignment @a =
("a","b","c"), @a is an array, whereas its value ("a","b","c") is a
list. The function push operates on array variables and not on
lists, meaning that push(@a,"q") works well, while
push(("a","b","c"),"q") does not make sense.
1.1.4 The Concept of Context in Perl
Operations in Perl are evaluated in a specific context. For
newcomers to the language the context concept can be quite
confusing. A thorough explanation of context is provided in the
“Camel” book [?, Ch. 2] or the perldata man page (invoke perldoc
perldata and search for “Context”). Here we shall only exemplify
the two major contexts: scalar and list. The assignment
@a = ("a","b","c");
evaluates the list on the right-hand side in a list context, and @a
becomes an array variable having its entries equal to the three
scalars in the list ("a","b","c"). When assigning the list to a
scalar,
$a = ("a","b","c");
the list on the right-hand side is evaluated in a scalar context.
In this case, the value of the list is the value of the last
element (as with the C comma operator). Therefore, $a becomes "c".
On the other hand,
$b = @a;
evaluates the array variable @a in a scalar context, and its value
is then the length of the array. That is, $b becomes 3.
These examples show that an array variable can have a list as value
in a list context and its length as value in a scalar context. A
hash evaluated in a scalar context becomes true if there are
elements in the hash, and false otherwise (see footnote 2).
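The scalar and list contexts described above can be checked with a short script. This is a sketch only; the variable names $count, $last, and %h are introduced here for illustration (the names $a and $b are avoided because Perl's sort function uses them).

```perl
use strict;
use warnings;

my @a = ("a", "b", "c");    # list context: @a gets three entries
my $count = @a;             # scalar context: the length of the array

my $last;
{
    no warnings 'void';     # the discarded entries "a" and "b" would trigger a warning
    $last = ("a", "b", "c");   # scalar context: comma operator, last element wins
}

my %h = (x => 1);
my $nonempty = %h ? "true" : "false";   # a hash with entries is true

print "count=$count last=$last hash=$nonempty\n";
```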
The property that an array evaluates to its length in a scalar
context is often taken advantage of by Perl programmers. Common
applications include tests like
die "Usage: $0 file" unless @ARGV;
die "Usage: $0 -f file" unless @ARGV == 2;
Such one-line tests have an attractive readability.
Footnote 2: There is more information in the scalar value; see the
perldata man page.
The return value of many Perl functions depends on the context. One
example is localtime:
$t = localtime();
yields the date as a string; $t is "Sun May 13 09:02:27 2001", for
instance. In a list context,
@t = localtime();
localtime returns a list of nine values containing the time, day,
month, year, etc. (see perldoc -f localtime), and @t becomes an
array of numbers (say) (27, 2, 9, 13, 4, 101, 0, 132, 1).
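Both calling conventions of localtime can be compared directly. The sketch below also shows how the list form is typically unpacked; the actual values depend on when and where the script is run, so no output is shown.

```perl
use strict;
use warnings;

my $t = localtime();        # scalar context: a date string
my @t = localtime();        # list context: nine numbers
my ($sec, $min, $hour, $mday, $mon, $year) = @t;

# the month is counted from 0 and the year from 1900
printf("%s\n", $t);
printf("%04d-%02d-%02d\n", $year + 1900, $mon + 1, $mday);
```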
1.2 Automating Simulation and Visualization
Chapter 2.3 (Gluing Stand-Alone Applications) in [?] describes a
simple simulation code, called oscillator, for solving a
differential equation modeling an oscillating system. Using a
script, we can improve the user friendliness of the simulation code
and also launch a visualization of the solution. A Python version
of such a script is explained in detail in Chapter 2.3 in [?], and
the purpose of the present section is to present the Perl version
of that script.
1.2.1 The Complete Code
: # *-*-perl-*-*
  eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0;  # if running under some shell

# default values of input parameters:
$m = 1.0; $b = 0.7; $c = 5.0; $func = "y"; $A = 5.0;
$w = 2*3.14159; $y0 = 0.2; $tstop = 30.0; $dt = 0.05;
$case = "tmp1"; $screenplot = 1;

# read variables from the command line, one by one:
while (@ARGV) {
    $option = shift @ARGV;   # load cmd-line arg into $option
    if ($option eq "-m") {
        $m = shift @ARGV;    # load next command-line arg
    }
    elsif ($option eq "-b")     { $b = shift @ARGV; }
    elsif ($option eq "-c")     { $c = shift @ARGV; }
    elsif ($option eq "-func")  { $func = shift @ARGV; }
    elsif ($option eq "-A")     { $A = shift @ARGV; }
    elsif ($option eq "-w")     { $w = shift @ARGV; }
    elsif ($option eq "-y0")    { $y0 = shift @ARGV; }
    elsif ($option eq "-tstop") { $tstop = shift @ARGV; }
    elsif ($option eq "-dt")    { $dt = shift @ARGV; }
    elsif ($option eq "-noscreenplot") { $screenplot = 0; }
    elsif ($option eq "-case")  { $case = shift @ARGV; }
    else {
        die "$0: invalid option '$option'\n";
    }
}

# create a subdirectory with name equal to case and generate
# all files in this subdirectory:
$dir = $case;
use File::Path;    # contains the rmtree function
if (-d $dir) {     # does $dir exist?
    rmtree($dir);  # remove directory (old files)
}
mkdir($dir, 0755) or die "Could not create $dir; $!\n";
chdir($dir) or die "Could not move to $dir; $!\n";

# make input file to the program:
open(F,">$case.i") or die "open error; $!\n";
print F "
$m
$b
$c
$func
$A
$w
$y0
$tstop
$dt
";
close(F);

# run simulator:
$cmd = "oscillator < $case.i";  # command to run
$failure = system($cmd);
die "running the oscillator code failed\n" if $failure;

# make gnuplot script:
open(F, ">$case.gnuplot");
print F "
set title '$case: m=$m b=$b c=$c f(y)=$func A=$A w=$w y0=$y0 dt=$dt';
";
if ($screenplot) {
    print F "plot 'sim.dat' title 'y(t)' with lines;\n";
}
print F <<EOF;  # print multiple lines using a "here document"
set size ratio 0.3 1.5, 1.0;
# define the postscript output format:
set term postscript eps monochrome dashed 'Times-Roman' 28;
# output file containing the plot:
set output '$case.ps';
# basic plot command:
plot 'sim.dat' title 'y(t)' with lines;
# make a plot in PNG format as well:
set term png small;
set output '$case.png';
plot 'sim.dat' title 'y(t)' with lines;
EOF
close(F);

# make plot:
$cmd = "gnuplot -geometry 800x200 -persist $case.gnuplot";
$failure = system($cmd);
die "running gnuplot failed\n" if $failure;
The complete source code appears in src/perl/simviz1.pl.
1.2.2 Dissection
The script starts with a safe Perl header, which ensures
interpretation of the script by the first Perl interpreter found in
the user’s path. After having assigned default values to the input
parameters to the oscillator code, we encounter an important part
of many scripts, namely parsing of command-line arguments. The
idea is that we “eat” the entries in @ARGV one by one using the
shift operator:
$option = shift @ARGV;
This statement implies setting $option equal to the first element
in @ARGV and then removing this element from @ARGV (see footnote
3). We search for options on the command line until the @ARGV array
is empty:
while (@ARGV) {              # while @ARGV is non-empty
    $option = shift @ARGV;   # load command-line arg. into $option
    if ($option eq "-m") {
        $m = shift @ARGV;    # load next command-line arg
    }
    elsif ($option eq "-b") { $b = shift @ARGV; }
    ...
    else {
        die "$0: invalid option '$option'\n";
    }
}
As an alternative to this explicit grabbing of command-line
arguments, we can use a special Perl utility called GetOptions [?,
p. 445]:
use Getopt::Long;  # load module with GetOptions function
GetOptions("m=f" => \$m, "b=f" => \$b, "c=f" => \$c,
           "func=s" => \$func, "A=f" => \$A, "w=f" => \$w,
           "y0=f" => \$y0, "tstop=f" => \$tstop, "dt=f" => \$dt,
           "case=s" => \$case, "screenplot!" => \$screenplot);
Footnote 3: Experienced Perl programmers will often write just
$option = shift; because shift without arguments implies shifting
@ARGV (outside subroutines). More examples regarding such shortcuts
in Perl are provided in Chapter 1.3.
The syntax m=f means searching for the command-line argument --m
and loading the next argument as a floating-point number (=f)
into the Perl variable $m. A single hyphen as in -m works too.
Similarly, func=s specifies --func to take a string argument. The
specification of the flag screenplot allows us to use either
--screenplot for setting $screenplot to a true value or
--noscreenplot for setting $screenplot to a false value (note that
to get this on/off behavior, the exclamation mark is required in
"screenplot!" => \$screenplot). The GetOptions function has rich
functionality; the purpose here is just to make the reader aware of
such a handy function. Instructive information is obtained from
perldoc Getopt::Long. There are several other modules in the Getopt
family. For example: Getopt::Simple for a simplified interface to
Getopt::Long, Getopt::Std for single-character options,
Getopt::Mixed for long and single-character options, and
Getopt::Declare for handling command-line options or configuration
files with associated help text and initialization code.
The next step in our script is to move to the prescribed directory.
However, we should first check whether the directory exists, and if
so, we should delete it and recreate it to avoid mismatch between
old and new result files. Checking if a directory exists is done by
the command if (-d $directoryname) in Perl. Removing a non-empty
directory can be conveniently done by first loading an external
Perl module, use File::Path, and then calling the function rmtree
in that module:
use File::Path;    # has the rmtree function
if (-d $dir) {     # does $dir exist?
    rmtree($dir);  # remove directory (old files)
}
mkdir($dir, 0755) or die "Could not create $dir; $!\n";
chdir($dir) or die "Could not move to $dir; $!\n";
Observe that we test for success of mkdir. For example,
insufficient permission to create a new directory will not be
noticeable when running the script unless we include the or die
statement (see footnote 4).
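The remove-and-recreate logic can be exercised safely in a scratch area. In the sketch below, the temporary base directory (from File::Temp), the file names result1.dat and result2.dat, and the two-pass loop are assumptions made for the demonstration; the second pass finds the directory left by the first one and removes it together with its contents.

```perl
use strict;
use warnings;
use Cwd;                        # getcwd
use File::Path;                 # rmtree
use File::Spec;                 # portable path construction
use File::Temp qw(tempdir);

my $start = getcwd();
my $base  = tempdir(CLEANUP => 1);             # scratch area, assumed for this demo
my $dir   = File::Spec->catdir($base, "tmp1");

for my $run (1, 2) {
    rmtree($dir) if -d $dir;                   # second pass removes the old directory
    mkdir($dir, 0755) or die "Could not create $dir; $!\n";
    chdir($dir)       or die "Could not move to $dir; $!\n";
    open(my $f, ">", "result$run.dat") or die "open error; $!\n";
    close($f);
    chdir($start)     or die "Could not move to $start; $!\n";
}

# list what is left in the directory (skipping . and ..)
opendir(my $dh, $dir) or die "Could not read $dir; $!\n";
my @files = grep { !/^\.\.?$/ } readdir($dh);
closedir($dh);

print "files left: @files\n";
```

Only the file from the second pass survives, which is the point of deleting the directory first: old and new result files cannot get mixed.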
The next task is to write an input file for the oscillator program.
Multi-line output can easily be created through an ordinary string
with embedded newlines (see footnote 5):
print F "
$m
$b
$c
$func
$A
$w
$y0
$tstop
$dt
";
Footnote 4: Python will in such cases abort the script and write a
“Permission denied” message to standard error. See Exercise 1.8.
Footnote 5: Python requires a triple quoted string for this purpose.
Alternatively, we can use a special Perl construction (stemming
from Unix shells), known as a here document:
print F <<EOF;
$m
$b
$c
$func
$A
$w
$y0
$tstop
$dt
EOF
Everything between the two EOF marks is treated as output text. The
closing EOF must start in the first column of the script file.
The Gnuplot script later in the simviz1.pl code is actually written
as a here document.
Perl’s system function is used for running applications:
$cmd = "oscillator < $case.i";  # command to run
$failure = system($cmd);
die "running the oscillator code failed\n" if $failure;
Visualization of the solution in Gnuplot requires writing a small
script with the proper Gnuplot commands:
open(F, ">$case.gnuplot");
print F <<EOF;  # print multiple lines using a "here document"
...
# output file containing the plot:
set output '$case.ps';  # variable interpolation
...
EOF
close(F);
# make plot:
$failure = system("gnuplot $case.gnuplot");
die "running gnuplot failed\n" if $failure;
Never forget to close files before continuing with system commands
involving the generated files!
1.3 There’s More Than One Way To Do It
A famous Perl slogan is “There’s More Than One Way To Do It” (often
abbreviated TIMTOWTDI, pronounced “Tim Toady”). The goal of the
present section is to exemplify this slogan and demonstrate different
Perl programming styles. We shall develop scripts for finding files
containing a specified string and show that there might be many
different Perl solutions to a programming problem.
When working with computers, you have probably often tried to find
a file containing some particular text without remembering what
the filename is. If you remember parts of the text, the Unix grep
command is handy. For example,
grep superLibFunc *
searches all files (*) in the current working directory for the
text string superLibFunc and writes out the matches. This can help
you find the file you are looking for. We shall present a
cross-platform Perl script that implements the grep functionality.
1.3.1 A Script for Perl Beginners
A verbose, easy-to-read grep script in Perl can take the following
form.
: # *-*-perl-*-*
  eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0;  # if running under some shell

die "Usage: $0 pattern file1 file2 ...\n" if $#ARGV < 1;

# first command-line argument is the pattern to search for:
$pattern = shift @ARGV;
# run through the next command-line arguments, i.e. files, and grep:
while (@ARGV) {
    $file = shift @ARGV;
    if (-f $file) {
        open(FILE,"<$file");
        @lines = <FILE>;  # read all lines
        foreach $line (@lines) {
            if ($line =~ /$pattern/) {
                print "$file: $line";
            }
        }
        close(FILE);
    }
}
The central statement in this script is
if ($line =~ /$pattern/)
which tests whether the variable $line matches the regular
expression contained in $pattern. If so, we write out this line.
1.3.2 Using the Underscore Variable
The Perl program can be written more compactly using the implicit
$_ variable. Let us present the code first and then explain what
the syntax means.
#!/usr/bin/perl
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($pattern, @files) = @ARGV;
foreach (@files) {
    if (-f) {
        open(FILE,"<$_");
        foreach (<FILE>) {
            if (/$pattern/) {
                print;
            }
        }
        close(FILE);
    }
}
The extraction of command-line arguments is elegantly performed by
dividing the arguments into the leading search string and an
array holding the filenames:
($pattern, @files) = @ARGV;
Many Perl commands can be issued without an explicit variable to
work with. One example is foreach (@files). In such cases the
“invisible” variable is $_. That is, foreach (@files) actually
means foreach $_ (@files).
The previous code is best explained by showing the equivalent Perl
statements where the $_ appears explicitly:
#!/usr/bin/perl
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($pattern, @files) = @ARGV;
foreach $_ (@files) {
    if (-f $_) {
        open(FILE,"<$_");
        foreach $_ (<FILE>) {
            if ($_ =~ /$pattern/) {
                print $_;
            }
        }
        close(FILE);
    }
}
1.3.3 A Script Written in Typical Perl Style
A more modern Perl style could be introduced in the script that
makes use of the implicit $_ variable:
}
The next unless -f statement means that one jumps to the next
iteration in the loop unless the test if (-f $_) is true, i.e.,
unless the current filename ($_) is an existing file.
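The listing for this style is not reproduced here. A minimal sketch consistent with the description (a loop over the files that starts with next unless -f) might read as follows. The helper name grep_files, the lexical file handle, and the filename prefix in the output are assumptions of this sketch, not the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: returns "file: line" strings for all matching lines.
sub grep_files {
    my ($pattern, @files) = @_;
    my @hits;
    foreach (@files) {
        next unless -f;                 # jump to next file unless $_ is a plain file
        open(my $fh, '<', $_) or next;  # silently skip unreadable files
        my $file = $_;                  # remember the name; $_ is reused below
        foreach (<$fh>) {
            push @hits, "$file: $_" if /$pattern/;
        }
        close($fh);
    }
    return @hits;
}

# Used as a script: perl thisfile.pl pattern file1 file2 ...
print grep_files(@ARGV) if @ARGV >= 2;
```

The statement modifiers (next unless ..., push ... if ...) replace the nested if blocks of the previous versions.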
1.3.4 Shorter Scripts for Lazy Programmers
There are many shortcuts in Perl aimed at lazy programmers. Here is
an example of a grep script equivalent to those above, but with a
much more compact file reading construction:
#!/usr/bin/perl
$pattern = shift;          # shift; means shift @ARGV
while (<>) {               # read line by line in file by file
    print if /$pattern/o;  # o increases the efficiency
}
The while (<>) loop implies reading all lines in all files
whose names are in @ARGV. (If there are no filenames on the command
line, <> reads from standard input.) Since processing a list
of files in a line-oriented fashion is a frequently encountered
task in scripts, while (<>) is a popular and widely used
construction that saves quite some typing. It goes without saying
that each line is available in the $_ variable.
1.3.5 The Ultimate Goal: Getting Rid of the Script File
We can also do the grep operation with a command-line Perl
script:
perl -n -e 'print if /superLibFunc/;' file1 file2 file3
Here, the -n option tells Perl to invoke a loop over all lines in
all files specified on the command line (equivalent to while
(<>)) and execute the string after -e as a Perl script
applied to each line. Implicit here is that the line is stored in
the $_ variable.
1.3.6 Perl Has a Grep Function Too
The grep operation is so common that Perl has in fact a built-in
grep function:
} }
The grep function searches for $string in a list of all the lines
in the file and returns a list with the lines that contain $string.
Of course, this readable script can be condensed to two lines if
desired, using the <> notation:
#!/usr/bin/perl
$pattern = shift;
print grep /$pattern/, <>;
Observe that we here do not easily print the filename.
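To make the behavior of the built-in grep function concrete, here is a small sketch that works on a made-up list of lines rather than on a file. The sample lines and variable names are assumptions for illustration.

```perl
use strict;
use warnings;

# Made-up sample lines standing in for the contents of a file.
my @lines = (
    "call superLibFunc(x)\n",
    "nothing to see here\n",
    "y = superLibFunc(z) + 1\n",
);
my $pattern = 'superLibFunc';

# In list context grep returns the elements that match.
my @matches = grep { /$pattern/ } @lines;

# In scalar context grep returns the number of matching elements.
my $count = grep { /$pattern/ } @lines;

print "$count matching lines:\n", @matches;
```

Inside the block (or expression) given to grep, each list element is available as $_.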
Remark. We should mention that reading the whole file into memory
at once, which is implied by @lines=<FILE> and also by the <>
operator when used in list context, may cause memory problems if
you work with large data files. Line-by-line reading can then be
more appropriate.
Exercise 1.1. Modify a very Perl-ish grep script. Consider a grep
script in typical modern Perl style:
}
Extend this script such that the filename and the line number are
printed at the beginning of the lines that match the given string.
You can count the number of lines in the last foreach loop, or you
can make use of Perl’s special variable $., which holds the line
number of the current line. Write the line number in a field of
width (say) 5 characters such that the output is nicely aligned
in three columns (filename, line number, line); see Exercise 8.4
on page 349 in [?] for a sample output.
Observe how simple such an extension would have been if we had used
named variables instead of $_. In other words, readability and
extensibility are seldom well served by extensive use of $_.
1.4 Frequently Encountered Tasks
Frequently encountered tasks in Perl scripts have been collected
and organized in the present section, with the aim of providing a
kind of example-oriented quick reference for the reader. The
following tasks are covered:
– basic control structures,
– executing other programs,
– splitting, joining, searching, and replacing text,
– writing and calling Perl subroutines,
– checking a file’s type, size, and age,
– listing and removing files,
– creating and removing directories,
– measuring CPU time,
1.4.1 Basic Control Statements
if ($answer eq "copy") {
    $copy = 1;
} elsif ($answer == 0) {
    $quit = 1;
} elsif ($answer eq 'run' or $answer eq 'execute') {
    $run = 1;
} else {
    print "Invalid answer $answer\n";
}
Perl has numerous ways of writing if tests. Some examples are
if ($pen ne "up") { $pen = "up"; }
if (not $pen eq "up") { $pen = "up"; }
if (!($pen eq "up")) { $pen = "up"; }
$pen = "up" if $pen ne "up";
$pen = "up" if not $pen eq "up";
$pen = "up" if ! ($pen eq "up");
The for or foreach statement visits the entries in an array, entry
by entry:
# convert some PostScript files to GIF:
@somelist = ('file1.ps', 'file2.ps', 'file3.ps');
for $psfile (@somelist) {
    $giffile = $psfile;
    $giffile =~ s/\.ps$/.gif/;
    system("convert ps:$psfile gif:$giffile");
}
There is both a while loop and a do-while loop in Perl:
$r = 0; $dr = 0.1;
while ($r <= 10) {
    $s = sin($r); print "$s\n"; $r += $dr;
}
do {
    $s = sin($r); print "$s\n"; $r += $dr;
} while ($r <= 10);
}
The next statement continues with the next iteration in the
loop:
# print lines not starting with ’#’: for $file (@files) {
}
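A minimal sketch of how next skips the rest of a loop pass, here applied to a made-up list of lines instead of lines read from the files. The sample data and the array @kept are assumptions for illustration.

```perl
use strict;
use warnings;

# Made-up input lines; in a script they would be read from a file.
my @lines = ("# a comment\n", "x = 1\n", "  # not at line start\n", "y = 2\n");

my @kept;
for my $line (@lines) {
    next if $line =~ /^#/;   # continue with the next line if this one starts with '#'
    push @kept, $line;       # reached only for lines not starting with '#'
}
print @kept;
```

The companion statement last leaves the loop altogether instead of continuing with the next pass.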
1.4.2 File Reading and Writing
The following code segments demonstrate opening a file and reading
it line by line or loading it into a list of lines:
$infilename = "myprog.cpp";
open(INFILE, "<$infilename")  # open for reading
    or die "Cannot read file $infilename; $!\n";
@lines = <INFILE>;  # load file into a list of lines

# alternative reading, line by line:
while (defined($line = <INFILE>)) {
    # process $line
}
# or, using the implicit $_ variable:
while (<INFILE>) {
    # process current line, stored in $_
}
close(INFILE);
The recipe for opening a file for writing a list of lines is given
next.
$outfilename = "myprog2.cpp";
open(OUTFILE, ">$outfilename")  # open for writing
    or die "Cannot write to file $outfilename; $!\n";
$line_no = 0;  # count the line number in @lines
foreach $line (@lines) {
    $line_no++;
    print OUTFILE "$line_no: $line";
}
close(OUTFILE);
We can proceed with appending text to a file, using Perl’s features
for writing (large) blocks of text in one output statement, with
embedded variables if desired:
open(OUTFILE, ">>$filename")  # open for appending
    or die "Cannot append to file $filename; $!\n";

# print multiple lines at once, using a "here document":
print OUTFILE <<EOF;
/*
This file, "$outfilename", is a version
of "$infilename" where each line is numbered.
*/
EOF

# equivalent output using a string instead:
print OUTFILE
"/*
This file, \"$outfilename\", is a version
of \"$infilename\" where each line is numbered.
*/
";
close(OUTFILE);
If you need to treat a file handle, such as OUTFILE, like a
variable, e.g., when sending it to a function, you should use
Perl’s FileHandle objects, see perldoc FileHandle.
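The following sketch illustrates passing a file handle to a function by means of a FileHandle object. The subroutine name write_numbered, the scratch directory, and the file name are assumptions made for this example.

```perl
use strict;
use warnings;
use FileHandle;
use File::Temp qw(tempdir);

# Hypothetical helper: writes numbered lines to any file handle object.
sub write_numbered {
    my ($fh, @lines) = @_;
    my $line_no = 0;
    foreach my $line (@lines) {
        $line_no++;
        print $fh "$line_no: $line";
    }
}

my $dir      = tempdir(CLEANUP => 1);   # scratch directory for the demo
my $filename = "$dir/numbered.txt";     # made-up file name

my $out = FileHandle->new($filename, "w")   # "w": open for writing
    or die "Cannot write to file $filename; $!\n";
write_numbered($out, "first\n", "second\n");
$out->close;
```

Since $out is an ordinary scalar holding an object, it can be stored in arrays and hashes or passed around like any other variable.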
1.4.3 Running an Application
Any operating system command can be executed by calling the system
function. Here is an example involving running an application
myprog:
$cmd = "myprog -c file.1 -p -f -q";
$failure = system("$cmd > res");  # output goes to file res
die "$0: running $cmd failed\n" if $failure;
A different way of testing for failure is
system("$cmd > res") == 0 or die "$0: running $cmd failed\n";
The return value from system is also available in the special Perl
variable $?:
system("$cmd > res");
die "$0: running $cmd failed\n" if $?;
To redirect the output from the application into a list of lines,
one can use back quotes:
$cmd = "myprog -c file.1 -p -f -q";
@res = `$cmd`;
Alternatively, one can open a pipe to the application and read the
output as if it were a file:
open(APP, "$cmd |");
@res = <APP>;
# or read line by line:
while (<APP>) {
    # process the current line, stored in $_
}
close(APP);
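A runnable sketch of both techniques, in which the running Perl interpreter (available in the special variable $^X) stands in for the application myprog. The command string is made up for this example, and its quoting assumes a Unix-like shell.

```perl
use strict;
use warnings;

# Stand-in for "myprog": let the Perl interpreter itself print 1, 2, 3.
my $cmd = '"' . $^X . '" -le "print for 1..3"';

# Back quotes in list context give one list entry per output line.
my @res = `$cmd`;

# The same output read through a pipe, line by line.
my @piped;
open(my $app, '-|', $cmd) or die "Cannot run $cmd; $!\n";
while (my $line = <$app>) {
    push @piped, $line;
}
close($app);

print "back quotes: ", scalar(@res), " lines, pipe: ", scalar(@piped), " lines\n";
```

Back quotes are convenient for short outputs, while the pipe lets the script process the output as it arrives.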
Pipes can also be used for running interactive applications. After
having opened a write pipe to a program, we can issue various
commands, which are executed upon closing the pipe. Here is an
example involving the interactive Gnuplot program:
open (GNUPLOT, "| gnuplot -persist");  # open a pipe to Gnuplot
print GNUPLOT "set xrange [0:10]; set yrange[-2:2]\n";
print GNUPLOT "plot sin(x)\n";  # draw a sine function
print GNUPLOT "quit\n";
close(GNUPLOT);  # run Gnuplot with the commands
1.4.4 One-Line Perl Scripts
Perl supports some command-line options for wrapping a script with
a loop over all lines in a series of files. This is very convenient
for creating one-line scripts on the fly. For example,
perl -p -i.bak -e '...' file1 file2 file3
runs a loop over all lines in file1, file2, and file3. For each
line, the Perl commands provided inside the quotes (after the -e
option) are executed, and the -p option implies that the line is
printed after execution of the commands. Without the -i option the
printing goes to standard output, but with -i the files are
modified in-place, i.e., the original file is replaced by the new
output. With -i.bak the file file1 is first copied to file1.bak
before it is overwritten. The -p and -i.bak options are
normally combined into -pi.bak. Each line in the files is stored in
$_. As an illustration we can let the script specified by the -e
option be s/float/double/g; meaning that float is replaced by
double in some files (here file1, file2, and file3):
perl -pi.bak -e 's/float/double/g;' file1 file2 file3
To avoid automatic printing of each line, we can replace the -p
option by -n. Suppose a data file has numbers in a series of
columns, separated by whitespace, and you want to extract the first
and the fourth column. The relevant one-liner is then
perl -ne '@s=split; print "$s[0]\t$s[3]\n"' datafile
Calling split without an argument implies splitting $_ with respect
to whitespace. The equivalent Perl script, stored in a file, in
this latter example can also be made very short:
while (<>) {@s=split; print "$s[0]\t$s[3]\n";}
1.4.5 Array and List Operations
The most common statements for creating and traversing arrays are
listed next. Creating an array with three entries goes like
this:
@arglist = ($myarg1, "displacement", "tmp.ps");
An array can also appear in the list on the right-hand side,
@arr = ($var1, $var2);
@arglist = ($myarg1, "displacement", @arr, "tmp.ps");
but @arglist does not have an array as the third element; the @arr
array’s entries are simply inserted in @arglist, i.e., @arglist now
contains
($myarg1, "displacement", $var1, $var2, "tmp.ps");
To force the third entry to be the @arr array, this entry must be a
reference to @arr, obtained by prefixing @arr with a backslash (see
page 29):
@arglist = ($myarg1, "displacement", \@arr, "tmp.ps");
New entries can be appended to an array using the push function,
e.g.,
push(@arglist, $myvar2);
push(@arglist, @arr2);
# individual entries are accessed by an index (starting at 0):
$arglist[2] = "displacement";
# visit all entries in turn:
foreach $entry (@arglist) {
    print "entry is $entry\n";
}
# or
for $entry (@arglist) {
    print "entry is $entry\n";
}
Index-based traversal is also possible:
for ($i = 0; $i <= $#arglist; $i++) {
    print "entry is $arglist[$i]\n";
}
# or
for ($i = 0; $i < @arglist; $i++) {
    print "entry is $arglist[$i]\n";
}
A widely used shortcut for creating a list of strings is the qw
operator:
@strlist = qw/item1 item2 item3/;
# equivalent to:
@strlist = ("item1", "item2", "item3");
The qw operator is frequently used in Perl/Tk programming.
Extracting entries from an array is often performed by a list
assignment, e.g.,
($filename, $plottitle, $psfile) = @arglist;
This assignment works regardless of the length of @arglist
(footnote 6). If @arglist has (say) two elements, $psfile becomes
an undefined variable. The final entry on the left-hand side can be
an array, e.g.,
($filename, $plottitle, @rest) = @arglist;
# @rest becomes $arglist[2], $arglist[3] and so on
The shift function returns and removes the first array
element:
$first_entry = shift @arglist;
The pop function returns and removes the last array element:
$last_entry = pop @arglist;
Footnote 6: Similar list assignments in Python require that the
lists on each side of the assignment operator have equal lengths.
Without arguments, shift and pop work on @ARGV in the main program
and @_ in subroutines, e.g.,
$file = shift;  # same as shift @ARGV;
}
Array items can be changed in-place:
# @A is some array of numbers
for ($i=0; $i<=$#A; $i++) {
    if ($A[$i] < 0.0) { $A[$i] = 0.0; }
}
# @A does not contain negative numbers
The following construction also works, i.e., entries in @A are
changed (footnote 7):
for $r (@A) {
    if ($r < 0.0) { $r = 0.0; }
}
Perl arrays allow slicing: @arglist[1..3] returns the second up to
and including the fourth entry, that is, 1..3 denotes the indices
1-3.
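A short sketch of array slices, using a made-up five-element array. Besides ranges, a slice accepts any list of indices, including negative ones that count from the end, and a slice can also appear on the left-hand side of an assignment.

```perl
use strict;
use warnings;

my @arglist = ('a', 'b', 'c', 'd', 'e');   # made-up sample data

my @middle = @arglist[1..3];     # second up to and including fourth entry
my @picked = @arglist[0, -1];    # first and last entry

@arglist[0, 1] = @arglist[1, 0]; # slices can be assigned to: swap two entries

print "middle: @middle\npicked: @picked\nafter swap: @arglist\n";
```

Note the @ prefix in @arglist[1..3]: a slice is a list of values, whereas $arglist[1] with a dollar prefix is a single scalar entry.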
Unlike in Python, an array assignment like
@a = @b;
creates a new array @a where each element is a copy of the
corresponding array element in @b. To make a variable refer to the
array @b itself, as the Python assignment a = b does, we need to
let the variable be a reference:
$a = \@b;
See page 29 for more information about references and how to access
the values referred to by $a.
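The difference between copying an array and referring to it can be seen in a small sketch. All variable names and values here are made up for illustration.

```perl
use strict;
use warnings;

my @b    = (1, 2, 3);
my @copy = @b;      # element-by-element copy of @b
my $bref = \@b;     # reference to @b; no elements are copied

$b[0] = 100;        # change the original array

print "copy:      $copy[0]\n";       # the copy keeps the old value
print "reference: $bref->[0]\n";     # the reference sees the change

push(@$bref, 4);    # modifying through the reference changes @b itself
print "length of \@b: ", scalar(@b), "\n";
```

Here $bref->[0] and ${$bref}[0] are two ways of writing the same element access through a reference.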
Reversing the order of the entries in an array is performed by the
reverse function:
@reversed_list = reverse(@list);
Sorting is performed by the sort function:
@sorted_list = sort(@list);  # sort in ascending ASCII order
The sort order can be controlled by a user-defined function,
e.g.,
sub numeric_sort {
    if ($a < $b) { return -1; }
    elsif ($a == $b) { return 0; }
    else { return 1; }
}
@sorted_list = sort numeric_sort @list;
Footnote 7: The similar construction does not work in Python (cf.
the example starting on page 87 in [?]).
The arguments $a and $b in sort criteria routines are automatically
initialized by Perl and used instead of the @_ array for speed. The
numeric_sort routine is often required, but writing a separate
subroutine is actually not necessary because Perl already has a
compound comparison operator <=> that works with numbers:
@sorted_list = sort { $a <=> $b } @list;  # numeric sort
The expression $a <=> $b evaluates to -1, 0 or 1, depending on
whether $a is less than, equal to, or greater than $b,
respectively. The corresponding operator for text is cmp. We refer
to the description of the sort function in perldoc perlfunc
(or write just perldoc -f sort) for numerous examples on writing
customized sort functions, e.g., case-insensitive text
comparison.
The perlfunc man page is very useful; if you wonder about the Perl
function name for doing a specific task, write perldoc perlfunc
and search for keywords in this man page.
1.4.6 Hash Operations
A hash, also known as associative array in other languages, or
dictionary in Python, is a kind of array where the index, called
key, can be an arbitrary text. For example, all command-line
options to a script could be stored in a hash with the name of the
option (without any hyphens) as key:
$cmlargs{'m'} = 1.2;      # or $cmlargs{m} = 1.2;
$cmlargs{'tstop'} = 6.0;  # or $cmlargs{tstop} = 6.0;
This allows for easy processing of a large number of command-line
arguments and corresponding script variables. Here is a possible
code segment:
# init the entire hash with default values:
# (the entire hash is preceded by %)
%cmlargs = (
    'tstop' => 6.0,
    'm'     => 1.2
);
while (@ARGV) {  # run through all command-line arguments
    $option = shift @ARGV;
    $option = substr($option, 2);  # strip off hyphens (--)
    if (exists($cmlargs{$option})) {
        # next command-line argument is the value:
        $value = shift @ARGV;
        $cmlargs{$option} = $value;
    } else {
        die "The option $option is not registered\n";
    }
}
# traverse the hash structure, key by key:
foreach $option (keys %cmlargs) {
    print "cmlargs{'$option'}=$cmlargs{$option}\n";
}
With this technique you could develop various tools for
initializing and processing command-line options, and each time
you need to add a new variable and a corresponding option to the
script, you can simply add one new line to the initialization of
the default values in the hash cmlargs.
1.4.7 Splitting and Joining Text
The split function splits a string according to a delimiter string
or a regular expression. A common use of split is to split a text
into words:
$files = "case1.ps case2.ps case3.ps";
@filenames = split(' ', $files);  # split wrt whitespace
The entries in @filenames become
("case1.ps", "case2.ps", "case3.ps")
The behavior of split(' ', $str) is equivalent to str.split() in
Python, i.e., whitespace surrounding the words is ignored. Any
string delimiter can be used, e.g.,
$files = "case1.ps, case2.ps, case3.ps";
@filenames = split(', ', $files);
results in @filenames as
("case1.ps", "case2.ps", "case3.ps")
The split function can also split with respect to a regular
expression, just as re.split in Python, e.g.,
$files = "case1.ps, case2.ps, case3.ps";
@filenames = split(/,\s*/, $files);
This results in the correct split of $files. (There is a slight
difference between Perl and Python when splitting a
string with respect to whitespace using the regular expression \s+.
Leading and trailing blanks result in an empty string as first and
last element in the returned list when using Python, whereas
Perl’s split function does not produce an array element
corresponding to the trailing blanks.)
The join command is the inverse of split:
@filenames = ("case1.ps", "case2.ps", "case3.ps");
$cmd = "print " . join(" ", @filenames);
yields $cmd as the string "print case1.ps case2.ps case3.ps".
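The treatment of leading and trailing blanks by split can be checked with a small sketch; the test string is made up for the purpose.

```perl
use strict;
use warnings;

my $str = "  case1.ps   case2.ps  ";   # made-up string with surrounding blanks

# The special delimiter ' ' ignores leading and trailing whitespace.
my @words = split(' ', $str);

# The regular expression \s+ yields an empty leading field,
# but no field for the trailing blanks.
my @fields = split(/\s+/, $str);

my $joined = join(':', @words);        # join is the inverse operation

print scalar(@words), " words, ", scalar(@fields), " fields, joined: $joined\n";
```

Here @words holds two entries, whereas @fields holds three, the first being the empty string.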
1.4.8 Text Processing
A basic issue in text processing is recognizing and replacing parts
of a text. Recognizing text can be done in several ways:
# exact string match:
if ($line eq "double") {  # is $line equal to "double"?
}
# matching with full regular expressions:
if ($line =~ /double/) {  # does $line contain double?
    # (here, double can be replaced by any valid regular expression)
}
Note that in Perl, the comparison operators for strings and numbers
are different (footnote 8), e.g., eq and ne for strings vs. == and
!= for numbers; see also Chapter 1.4.14.
Here is an example of substituting float by double everywhere in a
file:
$copyfilename = "$filename.old~~";
rename($filename, $copyfilename);  # keep the original under a new name
open(FILE, "<$copyfilename") or die "$0: couldn't open file; $!\n";
$filestr = join("", <FILE>);  # read lines and join them to a string
close(FILE);

$filestr =~ s/float/double/g;  # substitute

open(FILE, ">$filename");  # write to the orig file
print FILE $filestr;       # print the whole (modified) file
close(FILE);
Since the need for such types of file substitutions often arises,
Perl offers a one-line statement for accomplishing the task:
perl -pi.old~~ -e 's/float/double/g;' *.c
See page 20 for an explanation of the various parts of this
command.
1.4.9 String Operations
Strings in Perl are enclosed in single or double quotes, but the
type of quotes affects the string contents, as illustrated next.
Double quotes enable variable interpolation:
$w = 'World';
$s1 = "Hello, $w!";  # becomes "Hello, World!"
Single quotes preserve $, @, and other special Perl characters:
$s2 = 'Hello, $w!';  # becomes "Hello, $w!"
Multi-line strings are also possible:
$s3 = "ordinary strings can be used
for multi-line text";
Footnote 8: Python applies == as well as <, <=, >, >= for all
data types.
String concatenation is enabled by the dot operator:
$myfile = $filename . '_tmp' . '.dat';
The $myfile variable becomes case1_tmp.dat if $filename is the
string case1. Substrings can be extracted by the substr function,
e.g.,
$teststr = '0123456789';
# extract 5 characters, starting
# from the beginning of the string:
$strpart = substr($teststr, 0, 5);  # result: '01234'
# another example:
$strpart = substr($teststr, 3, 5);  # result: '34567'
# skipping the first two characters:
$strpart = substr($teststr, 2);     # result: '23456789'
# skipping up to the last three characters:
$strpart = substr($teststr, -3);    # result: '789'
Stripping away leading and trailing blanks in a string is easily
carried out by regular expressions:
$line1 =~ s/^\s*//;
$line1 =~ s/\s*$//;
1.4.10 Environment Variables
The environment variables are stored in a Perl hash called ENV. You
can modify, e.g., $ENV{PATH} in the script and it has effect on all
child processes (started by calls to the system function, for
instance). Here is an example how we can read the PATH environment
variable, split it into its various directories, and check each
directory if it contains the executable file vtk:
$program = "vtk";
$path = $ENV{PATH};  # /usr/bin:/usr/local/bin:/usr/X11/bin etc.
@paths = split(/:/, $path);
foreach $dir (@paths) {
    if (-d $dir) {
        if (-x "$dir/$program") {
            $program_path = $dir; last;  # found, stop searching
        }
    }
}
if (defined($program_path)) {
    print "$program found in $program_path\n";
} else {
    print "$program not found\n";
}
Note that splitting with respect to colon is Unix specific. On
Windows the directories are separated by a semicolon instead (note
that /[:;]/ does not give a cross-platform solution since colon is
used in Windows paths, e.g., C:\). Also note the need for double
quotes in the second if test; writing $dir/$program without double
quotes would be an invalid mixture of variables and text (the
slash), or division of two text variables. What we need
is to construct a new string using variable interpolation.
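One way of avoiding the platform-specific separator, sketched here as an alternative to the code above and not as the approach of the original script, is to let the standard File::Spec module split PATH and build the file names. The subroutine name find_in_path is hypothetical.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: returns the first directory in PATH that holds an
# executable file named $program, or undef if no such directory exists.
sub find_in_path {
    my ($program) = @_;
    foreach my $dir (File::Spec->path()) {       # PATH split by platform rules
        my $candidate = File::Spec->catfile($dir, $program);
        return $dir if -f $candidate && -x $candidate;
    }
    return undef;
}

my $program      = "vtk";
my $program_path = find_in_path($program);
if (defined($program_path)) {
    print "$program found in $program_path\n";
} else {
    print "$program not found\n";
}
```

File::Spec->catfile joins directory and file name with the separator of the current platform, so the slash need not be hardcoded either.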
1.4.11 Subroutines
Functions in Perl are called subroutines. Subroutines take the
form
}
The arguments are not part of the subroutine heading. Instead, they
are available in the array @_. Output variables are transferred to
the calling code by returning an appropriate data structure, e.g.,
a list of the various output quantities. The return statement can
be omitted.
A Simple Example of a Subroutine. A subroutine for finding the
maximum value of two numbers can be written straightforwardly as
follows:
}
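A minimal sketch of such a subroutine is given below. The name max2 and the variable names are assumptions of this sketch; the arguments are copied from @_ into local variables declared with my.

```perl
use strict;
use warnings;

# Hypothetical name: returns the larger of two numbers.
sub max2 {
    my ($x, $y) = @_;            # copy the two arguments into local variables
    return $x > $y ? $x : $y;    # conditional expression picks the larger one
}

my $larger = max2(1.2, -3.4);
print "max is $larger\n";
```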
The my keyword makes variables local to the subroutine (footnote 9).
Unless you specify a variable with my it is treated as a global
variable whose
value is visible outside the routine as well. Frequently, one maps
the @_ array onto suitable local values using convenient list
techniques, e.g.,
my ($a, $b) = @_;
This allows working with scalars, such as $a and $b, instead of the
array entries $_[0] and $_[1]. Alternatively, we can extract $a and
$b using the shift operator:
my $a = shift;  # same as shift @_;
my $b = shift;
Footnote 9: See [?] for a precise explanation of the my keyword.
Variable Number of Arguments. Here is a subroutine statistics, with
a variable number of arguments, which returns a list containing the
average and the minimum and maximum value of all the
arguments:
($avg, $min, $max) = statistics($v1, $v2, $v3, $b);  # usage

sub statistics {
    # arguments are available in the array @_
    my $avg = 0; my $n = 0;  # local variables
    foreach $term (@_) { $n++; $avg += $term; }
    $avg = $avg / $n;

    my $min = $_[0]; my $max = $_[0];
    shift @_;  # swallow first arg., it's already treated
    foreach $term (@_) {
        if ($term < $min) { $min = $term; }
        if ($term > $max) { $max = $term; }
    }
    return ($avg, $min, $max);
}
Call by Reference. Modifying the arguments inside the subroutine,
i.e., call by reference, is enabled by working directly on the @_
array. For example,
swap($v1, $v2);  # swap the values of $v1 and $v2

sub swap {
    my $tmp = $_[0];
    $_[0] = $_[1];
    $_[1] = $tmp;
}
That is, @_ contains references to the variables used in the
subroutine call (footnote 10). We remark that the swap function is
just an example of call by reference; the elegant Perl way of
swapping two variables reads ($v2,$v1)=($v1,$v2).
One can also pass references to variables to subroutines and in
this way get the effect of call by reference. A reference to a
variable $a reads \$a. Having the reference as a variable $a_ref,
we can extract its value by ${$a_ref}. We may then write the swap
function as
}
swap(\$v1, \$v2);
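A swap routine based on explicit references might be sketched as follows. The name swap_refs and the variable names are assumptions of this sketch, chosen to keep it apart from the swap routine shown earlier.

```perl
use strict;
use warnings;

# Hypothetical name: swaps the values that two scalar references point to.
sub swap_refs {
    my ($x_ref, $y_ref) = @_;   # the arguments are references to scalars
    my $tmp  = ${$x_ref};       # dereference to get the value
    ${$x_ref} = ${$y_ref};      # assign through the references
    ${$y_ref} = $tmp;
}

my ($v1, $v2) = (1.2, 'text');  # made-up values
swap_refs(\$v1, \$v2);
print "v1=$v1 v2=$v2\n";
```

The caller must remember the backslashes; in return, it is visible at the call site that the variables may be modified.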
Footnote 10: Perl applies call by reference, and copying the
arguments in @_ into local variables in a my statement simulates
call by value.
Alternatively, we can just swap the references themselves:
}
Another example on using references in Perl appears on page
31.
Keyword Arguments. By using a hash to hold the arguments passed to
a subroutine, one can obtain a very readable syntax and the
possibility for assigning default values to an arbitrary set of the
arguments (footnote 11). Here is an example, where we call a
subroutine with two parameters, message and file:
$filename = "my.tmp";
print2file(message => "testing hash args", file => $filename);

sub print2file {
    my %args = (message => "no message",  # default
                file    => "tmp.tmp",     # default
                @_);                      # assign and override
    open(FILE,">$args{file}");
    print FILE "$args{message}\n\n";
    close(FILE);
}
Inside the subroutine we first assign default values to the hash
entries and thereafter we insert the argument list @_, which can be
interpreted as a hash as well. This latter hash might then override
our default values. For example, calling
print2file(file => $filename);
leaves $args{message} as no message, but $args{file} is overwritten
by the $filename variable inside the print2file subroutine. The use
of a hash in subroutine calls also makes the sequence of
arguments irrelevant. The technique is used throughout Perl’s Tk
module for creating graphical user interfaces (see Chapter 1.7).
Omitting Parentheses in a Call. If a subroutine is declared before
you call it, you can omit the parentheses in the call statement,
e.g.,
sub myproc {
    my $file1 = shift;  # implicit shift on @_
    my $file2 = shift;
    ...
}
# call myproc without parentheses:
myproc $myfile, "$yourdir/$yourfile";
Footnote 11: This is the counterpart to Python’s keyword arguments,
see page 111 in [?].
All the subroutines in the Perl libraries are declared before you
use them so you can omit parenthesis if you desire. Here are some
examples:
print "No of iterations=$iter\n";
print("No of iterations=$iter\n");

open TMPFILE, ">$tmpfile";
open(TMPFILE, ">$tmpfile");

system "simulator -q 1.2";
system("simulator -q 1.2");
Multiple Arrays as Arguments. If you want to send several arrays to
a subroutine, you need to explicitly pass references to the
arrays. Otherwise, one cannot detect where one array stops and the
next starts in @_. We shall now show an example where we transfer
two arrays to a subroutine and print them out simultaneously in a
nice format:
@curvelist = ('curve1', 'curve2', 'curve3');
@explanations = ('initial shape of u',
                 'initial shape of H',
                 'shape of u at t=2.5');
# send the two arrays to displaylist, using references
# (\@list is a reference to the array @list):
displaylist(list => \@curvelist, help => \@explanations);
The implementation of the displaylist routine, taking two array
arguments transferred by references, is listed next.
sub displaylist {
    my %args = (@_);
    # extract the two lists from the two references:
    my $list_ref = $args{'list'};  # extract reference
    my @list = @$list_ref;         # extract array from reference
    my $help_ref = $args{'help'};  # extract reference
    my @help = @$help_ref;         # extract array from reference

    my $index = 0; my $item;
    for $item (@list) {
        printf("item %d: %-20s description: %s\n",
               $index, $item, $help[$index]);
        $index++;
    }
    # Alternative, without lots of local variables:
    $index = 0;
    for $item (@{$args{'list'}}) {
        printf("item %d: %-20s description: %s\n",
               $index, $item, ${$args{'help'}}[$index]);
        $index++;
    }
}
The output of displaylist looks like this:
item 0: curve1               description: initial shape of u
item 1: curve2               description: initial shape of H
item 2: curve3               description: shape of u at t=2.5
We refer to the Pass by Reference section of perldoc perlsub (or
the equivalent text in [?, p. 116-118]) for more information.
1.4.12 Nested, Heterogeneous Data Structures
The problems with displaylist and the need for references also
occur in nested, heterogeneous data structures. Say we want a list
such as the curves1 list on page 88 in [?]. In Perl we
could build some of its components first, which are straight
arrays:
@point1 = (0,0); @point2 = (0.1,1.2); @point3 = (0.3,0); @point4 =
(0.5,-1.9);
A list of these points must be a list of references to @point1,
@point2, etc.:
@points = (\@point1, \@point2, \@point3, \@point4);
Now, suppose we have an array @xy1 similar to @points. The curves1
array is supposed to contain a string, @points, another string, and
@xy1. Again, references are required to avoid “flattening” the
structure:
@curves1 = ("u1.dat", \@points, "H1.dat", \@xy1);
It is tedious to write the sublist as separate variables so we can
do with
@curves1 = ("u1.dat", [[0,0], [0.1,1.2], [0.3,0], [0.5,-1.9]],
"H1.dat", \@xy1);
That is, lists in square brackets provides a reference to an array.
Indexing is performed with a syntax similar to Python. For
example,
$a = $curves1[1][1][0];
yields $a as 0.1.
Nested data structures in Perl must make use of references, and it
can be troublesome to debug such structures. The Data::Dumper module
converts Perl data structures to readable strings: print
Dumper(@curves1) results in the present case in
$VAR1 = 'u1.dat';
$VAR2 = [
The Data::Dumper module supports lots of output formats, see
perldoc Data::Dumper. More information about references can be
found in perldoc perlreftut.
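The following sketch shows one way to inspect such a nested structure. The contents of @xy1 are made-up sample data, and passing a reference to Dumper (rather than the array itself) is a choice made here so that the whole structure is printed as one nested value.

```perl
use strict;
use warnings;
use Data::Dumper;

my @xy1 = ([0, 0], [1, 2]);          # assumed sample data
my @curves1 = ("u1.dat", [[0,0], [0.1,1.2], [0.3,0], [0.5,-1.9]],
               "H1.dat", \@xy1);

# Passing a reference makes Dumper print one nested structure ($VAR1)
# instead of one $VARn per top-level element:
$Data::Dumper::Indent = 1;           # more compact indentation
my $dump = Dumper(\@curves1);
print $dump;

# Individual elements are reached by chained indexing:
my $x = $curves1[1][1][0];               # second point, first coordinate
my $npoints = scalar @{ $curves1[1] };   # number of points in the sublist
print "x=$x, npoints=$npoints\n";
```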
1.4.13 Testing a Variable’s Type
An ordinary Perl variable is either a scalar, an array, or a hash.
The prefix determines the type of the variable, so the variable
name together with its prefix shows its type; there is no need to
test the variable’s type (as in Python). Writing
$var = 1;                           # scalar
@var = (1, 2);                      # array
%var = (key1 => 1, key2 => 'two');  # hash
creates three different Perl variables. Every time we use one of
the variables, the prefix immediately shows its type.
However, when working with references the prefix is always a
dollar. The function ref can be used to test what kind of
underlying data structure the reference is pointing to. The return
value in a scalar context is a string, like ’SCALAR’, ’ARRAY’, or
’HASH’. In a boolean context, ref returns true if its argument is a
reference:
if (ref($r) eq "HASH") {   # test return value
    print "r is a reference to a hash.\n";
}
unless (ref($r)) {         # use in boolean context
    print "r is not a reference at all.\n";
}
The ref function is handy when you work with nested, heterogeneous
data structures. See perldoc -f ref and perldoc perlref for more
information.
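A small sketch of how ref might be used when walking through a heterogeneous list is given below. The helper name describe and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper that describes what kind of thing it was given.
sub describe {
    my ($r) = @_;
    my $type = ref($r);      # empty string (false) if $r is no reference
    return "not a reference" unless $type;
    return "array with " . scalar(@$r) . " elements" if $type eq 'ARRAY';
    return "hash with "  . scalar(keys %$r) . " keys" if $type eq 'HASH';
    return "reference to a scalar" if $type eq 'SCALAR';
    return "reference to a subroutine" if $type eq 'CODE';
    return "other reference ($type)";
}

my @curves1 = ("u1.dat", [[0,0], [0.1,1.2]], { color => 'red' });
my @descriptions = map { describe($_) } @curves1;
print "$_\n" for @descriptions;
```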
1.4.14 Numerical Expressions
Perl supports the same numerical expressions as C. Strings are
automatically transformed to numbers when required:
$b = 1.2;       # b is a number
$b = "1.2";     # b is a string
$a = 0.5 * $b;  # b is converted to a real number before mult.
if ($b < 100) { print "ok\n"; } else { print "error!\n"; }  # prints "ok"
In the last test, the < operator works on numbers, and $b is
interpreted as a number (<, >, ==, !=, etc. are the
comparison operators for numbers, whereas strings must be compared
with lt, gt, eq, ne, etc.).
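The difference between the two operator families can be seen in a short sketch. The variable names are chosen for this example only.

```perl
use strict;
use warnings;

my ($x, $y) = ("10", "9");

my $num_less = ($x <  $y) ? 1 : 0;   # numeric: 10 < 9 is false
my $str_less = ($x lt $y) ? 1 : 0;   # string: "10" sorts before "9"

my $num_equal = ("1.0" == "1") ? 1 : 0;   # numerically equal
my $str_equal = ("1.0" eq "1") ? 1 : 0;   # different strings

print "numeric <: $num_less, string lt: $str_less\n";
print "numeric ==: $num_equal, string eq: $str_equal\n";

# sort compares as strings by default; use <=> for numeric order:
my @as_strings = sort (10, 9, 100);
my @as_numbers = sort { $a <=> $b } (10, 9, 100);
print "@as_strings | @as_numbers\n";
```

Choosing the wrong operator family gives no error message, so the operator must match the intended interpretation of the data.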
1.4.15 Listing of Files in a Directory
The following statements return a list of files (in the current
working directory) having extensions .ps or .gif:
@filelist = glob("*.ps *.gif");
# alternative: @filelist = <*.ps *.gif>;
A more sophisticated glob function is also available, see perldoc
File::Glob.
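To try glob without touching existing files, one can work in a scratch directory, as in this sketch. The file names are made up, and File::Temp is used only to keep the example self-contained.

```perl
use strict;
use warnings;
use Cwd;
use File::Temp qw(tempdir);

my $origdir = cwd();
my $dir = tempdir(CLEANUP => 1);      # scratch directory, removed at exit
chdir($dir) or die "cannot chdir to $dir: $!\n";

# Create a few empty sample files (names are made up for the demo):
for my $name (qw(plot1.ps plot2.gif notes.txt)) {
    open(my $fh, '>', $name) or die "cannot create $name: $!\n";
    close($fh);
}

my @filelist = glob("*.ps *.gif");    # notes.txt does not match
@filelist = sort @filelist;
print "matched: @filelist\n";

chdir($origdir);                      # leave the scratch directory again
```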
1.4.16 Testing File Types
Perl supports a range of tests for classifying files:
if (-f $myfile) { print "$myfile is a plain file\n"; }
if (-d $myfile) { print "$myfile is a directory\n"; }
if (-x $myfile) { print "$myfile is executable\n"; }
if (-z $myfile) { print "$myfile is empty(zero size)\n"; }
if (-T $myfile) { print "$myfile is a text file\n"; }
if (-B $myfile) { print "$myfile is a binary file\n"; }
There are also tests for the size and age of a file:
$size = -s $myfile;
$days_since_last_access = -A $myfile;
$days_since_last_modification = -M $myfile;
See perldoc perlfunc and search for -f, -d, and so on for
information about file tests.
The stat function gives more detailed results about a file:
($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
$atime,$mtime,$ctime,$blksize,$blocks) = stat($myfile);
A quote from the description of stat in the man page perlfunc
explains what the various list entries above mean:
 0 dev      device number of filesystem
 1 ino      inode number
 2 mode     file mode (type and permissions)
 3 nlink    number of (hard) links to the file
 4 uid      numeric user ID of file's owner
 5 gid      numeric group ID of file's owner
 6 rdev     the device identifier (special files only)
 7 size     total size of file, in bytes
 8 atime    last access time since the epoch
 9 mtime    last modify time since the epoch
10 ctime    inode change time (NOT creation time!) since the epoch
11 blksize  preferred block size for file system I/O
12 blocks   actual number of blocks allocated
There is an alternative stat function in the File::stat module, see
perldoc File::stat.
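A minimal sketch comparing the two interfaces follows. The scratch file created with File::Temp is an assumption made to keep the example runnable on its own.

```perl
use strict;
use warnings;
use File::stat;                       # replaces stat with an object version
use File::Temp qw(tempfile);

# Create a small scratch file so the example is self-contained:
my ($fh, $myfile) = tempfile(UNLINK => 1);
print $fh "hello";                    # 5 bytes
close($fh);

# Object interface from File::stat: fields are accessed by name
my $st = stat($myfile) or die "cannot stat $myfile: $!\n";
my $size_obj = $st->size;
my $mtime    = $st->mtime;

# The built-in list version is still reachable as CORE::stat:
my $size_list = (CORE::stat($myfile))[7];

printf("size=%d bytes, last modified %s\n",
       $size_obj, scalar localtime($mtime));
```

The named accessors avoid having to remember that the size is entry number 7 in the list.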
1.4.17 Copying and Renaming Files
Renaming a file is simple:
rename($myfile, "tmp.1"); # rename $myfile to tmp.1
Moving files across file systems is reliably done with the move
function in Perl’s File::Copy library:
use File::Copy;
move($myfile, "/work/temp") or die "Could not rename file\n";
Copying a file $file to a file $tmpfile is performed with the copy
function in the File::Copy library:
use File::Copy;
copy($file, $tmpfile);
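Both copy and move return a true value on success, so their results can be tested just like the return value of rename. The sketch below uses a scratch directory and made-up file names.

```perl
use strict;
use warnings;
use File::Copy;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);      # scratch directory for the demo
my ($file, $tmpfile, $moved) =
    ("$dir/data.txt", "$dir/data.bak", "$dir/data.old");

open(my $fh, '>', $file) or die "cannot create $file: $!\n";
print $fh "some data\n";
close($fh);

# copy and move return true on success and set $! on failure:
copy($file, $tmpfile) or die "copy failed: $!\n";
move($file, $moved)   or die "move failed: $!\n";

# After the move, the original name is gone but the two others exist:
my @status = map { -e $_ ? 1 : 0 } ($file, $tmpfile, $moved);
print "exists: @status\n";
```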
1.4.18 Creating and Moving to Directories
Creating a directory and moving to a directory are tasks performed
with the mkdir and chdir functions, respectively:
use Cwd;
$origdir = cwd;           # remember where we are
$dir = "../mynewdir";
mkdir($dir, 0755) or die "$0: couldn't create dir; $!\n";
chdir($dir);
...
chdir($origdir);          # move back to the original directory
chdir;                    # move to your home directory ($ENV{HOME})
Suppose you want to create a new directory perl/projects/test1 in
your home directory, but none of perl, projects, and test1 exists.
Instead of issuing repeated mkdir commands, you can use the mkpath
function from the File::Path module to create the whole path in one
statement:
use File::Path;
mkpath("$ENV{HOME}/perl/projects/test1");
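The behavior of mkpath can be explored safely in a scratch directory, as sketched here. The temporary directory stands in for the home directory and is an assumption of this example.

```perl
use strict;
use warnings;
use File::Path;
use File::Temp qw(tempdir);

my $root = tempdir(CLEANUP => 1);     # stands in for $ENV{HOME} here
my $path = "$root/perl/projects/test1";

# mkpath returns the list of directories it actually created:
my @created = mkpath($path);
print "created ", scalar(@created), " directories\n";

# Calling it again is harmless; nothing new is created:
my @again = mkpath($path);
print "second call created ", scalar(@again), " directories\n";
```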
1.4.19 Removing Files and Directories
Single files are removed by the unlink statement, e.g.,
unlink("myfile") or die "Could not remove file\n";
A list of files can also be passed to unlink:
unlink(@files);
unlink(glob("*.ps *.gif"));
unlink "myfile", 'yourfile', @thosefiles, "$file.tmp"
    or die "Could not remove files\n";
Frequently, one wants to remove a directory tree, possibly full of
files, an action that requires the rmtree function from the
File::Path library:
use File::Path;
rmtree("mydir");
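Since unlink returns the number of files it actually removed, an or die test only fires when nothing at all was removed. A sketch of a stricter check, using made-up file names in a scratch directory, could look as follows.

```perl
use strict;
use warnings;
use File::Path;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my @files = map { "$dir/tmp$_.dat" } 1 .. 3;
for my $f (@files) {
    open(my $fh, '>', $f) or die "cannot create $f: $!\n";
    close($fh);
}

# unlink returns the number of files it removed, so comparing with
# the number requested detects partial failures:
my $removed = unlink(@files, "$dir/no_such_file");
print "removed $removed of ", scalar(@files) + 1, " files\n";

# rmtree removes a whole directory tree, files included:
mkpath("$dir/mydir/sub");
rmtree("$dir/mydir");
my $still_there = -d "$dir/mydir" ? 1 : 0;
print "mydir still there: $still_there\n";
```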
1.4.20 Splitting Pathnames
Let $fname be a filename containing a possibly long path,
e.g.,
$fname = "/usr/home/hpl/scripting/perl/intro/hw2a.pl";
Occasionally, one wants to split this filename into the basename
hw2a.pl and the directory name
/usr/home/hpl/scripting/perl/intro/:
use File::Basename;
$basename = basename($fname);
$dirname = dirname($fname);
One can also extract the base of the basename, hw2a, either by
substituting the file extension by an empty string,
$base = $basename;
$base =~ s/\.pl$//;
or by the fileparse function:
($base, $dirname, $extension) = fileparse($fname,".pl");
The fileparse function can take an arbitrary number of possible
extensions.
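The sketch below calls fileparse with several candidate extensions. Note that the suffix arguments are treated as regular expressions, which is why the dots are escaped here with qr; the extra extensions .py and .sh are chosen for illustration only, and the expected results assume Unix-style paths.

```perl
use strict;
use warnings;
use File::Basename;

my $fname = "/usr/home/hpl/scripting/perl/intro/hw2a.pl";

# The suffix arguments are regular expressions, so the dot is escaped:
my ($base, $dirname, $extension) =
    fileparse($fname, qr/\.pl/, qr/\.py/, qr/\.sh/);
print "base=$base dir=$dirname ext=$extension\n";

# If no suffix matches, the extension is empty and the base is unchanged:
my ($base2, $dir2, $ext2) = fileparse("/tmp/data.tar.gz", qr/\.pl/);
print "base=$base2 dir=$dir2 ext='$ext2'\n";
```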
1.4.21 Traversing Directory Trees
The very useful Unix find command can be implemented in a
cross-platform fashion in Perl using the File::Find library and its
find function. The basic recipe for using Perl’s find goes as
follows.
use File::Find;
# run through directory trees dir1, dir2, and dir3, and
# for each file call the user-provided subroutine ourfunc:
find(\&ourfunc, "dir1", "dir2", "dir3");
We shall now implement a script that lists all files larger than
1Mb in the home directory tree. The easiest way to extract the size
of a file is to write
$size = -s $file;
#!/usr/bin/perl
use File::Find;
find(\&printsize, $ENV{HOME});  # traverse home-directory tree

sub printsize {
    $file = $_;  # more descriptive variable name...
    if (-f $file) {  # is $file a plain file, not a directory?
        $size = -s $file;  # or $size = (stat($file))[7];
        if ($size > 1000000) {
            printf("%.1fMb %s in %s\n", $size/1000000.0, $file,
                   $File::Find::dir);
        }
    }
}
We recommend reading perldoc File::Find to see the many
possibilities that Perl’s find function offers.
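A variant that collects the matching files in an array instead of printing them is sketched below. It runs on a small scratch tree with made-up names and a size limit of 1000 bytes, so that it can be tried without scanning a whole home directory.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a small scratch tree (names and sizes are made up):
my $root = tempdir(CLEANUP => 1);
mkdir("$root/sub") or die "cannot create $root/sub: $!\n";
my %sizes = ("$root/small.dat" => 10, "$root/sub/big.dat" => 2000);
for my $name (keys %sizes) {
    open(my $fh, '>', $name) or die "cannot create $name: $!\n";
    print $fh "x" x $sizes{$name};
    close($fh);
}

my $limit = 1000;                     # bytes, a small limit for the demo
my @large;
find(sub {
         # find has changed to $File::Find::dir; $_ is the bare name
         return unless -f $_;
         push @large, $File::Find::name if (-s $_) > $limit;
     }, $root);

print "large files: @large\n";
```

An anonymous subroutine is used here so that it can add to the @large array directly; $File::Find::name holds the path including the directory.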
There is a program find2perl that translates a Unix find command
into the equivalent Perl program. The resulting program is not
always easy to read for newcomers to Perl, so writing the Perl
script yourself gives better control of what you want to do. In the
present example you can try
find2perl $HOME -name '*' -type f -size +2000 -exec ls -s {} \;
and realize that the resulting code has 55 (!) lines and is less
cross-platform than our hand-coded version.
1.4.22 Downloading Internet Files
The libwww-perl package contains numerous modules and scripts for
working with the World Wide Web. You can easily test if libwww-perl
is already installed on your system by trying
perl -e 'use LWP::Simple'
If this one-liner gives an error message, you need to get
libwww-perl from CPAN (see page 54).
The Perl script lwp-download (from the libwww-perl package) fetches
a single file whose URL is known:
lwp-download http://www.ifi.uio.no/~hpl/downloadme.dat
The script looks at the file contents and creates a suitable local
filename for the copy. In this case, downloadme.dat is a text file
that lwp-download stores as downloadme.dat.txt. A second argument
to lwp-download can be used to specify a local filename.
Inside a Perl script we can easily copy a file, given as a URL, to
a local file:
use LWP::Simple;
$URL = "http://www.ifi.uio.no/~hpl/downloadme.dat";
getstore($URL, "downloadme.dat");
# copy only if local file is not up-to-date:
mirror($URL, "downloadme.dat.pl");
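or we can load the remote file directly into a string (get returns
undef on failure) and split it into an array of lines:
$content = get($URL);            # the whole file as one string
@lines = split(/^/, $content);   # one array entry per line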
The URL in these examples could also have been an ftp address,
e.g.,
ftp://ftp.ifi.uio.no/pub/blab/xite/xite3_4.tar.gz
1.4.23 CPU-Time Measurements
Measurement of elapsed time in Perl can be done with the time
function:
$t0 = time;  # elapsed time in seconds since the epoch
# do tasks...
$elapsed_time = time - $t0;
Because time is measured in seconds, you need to perform efficiency
tests that last several seconds. Timing with finer resolution is
possible, see the Perl FAQ: perldoc -q 'time under a second'.
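One common way to obtain finer resolution is the Time::HiRes module, which is distributed with recent Perl versions. The sketch below assumes that this module is available; the sleep call merely stands in for the tasks to be timed.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);   # replaces time/sleep by float versions

my $t0 = time;                    # seconds since the epoch, with fraction
sleep(0.25);                      # stands in for the tasks to be timed
my $elapsed_time = time - $t0;

printf("elapsed: %.3f s\n", $elapsed_time);
```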
Throughout this section we assume that the reader is familiar with
terms like epoch, elapsed time, system time, CPU time, and the
difference between children and parent processes, as briefly
explained in Chapter 8.10.1 (CPU-Time Measurements) in [?].
A more sophisticated function times returns an array with four
entries. The first two represent the user and system times of the
current process while the next two contain the user and system
times of the current process’ child processes.
@t0 = times;
# do tasks...
system "$time_consuming_command";  # child process
@t1 = times;
$user_time = $t1[0] - $t0[0];
$system_time = $t1[1] - $t0[1];
$cpu_time = $user_time + $system_time;
$cpu_time_system_call = $t1[2] - $t0[2] + $t1[3] - $t0[3];
There is also a higher-level module Benchmark, based on the time
and times functions, with various support for timing of Perl
scripts. The usage goes as follows.
use Benchmark;
$t0 = new Benchmark;
# do some tasks...
$t1 = new Benchmark;
$td = timediff($t1, $t0);  # time difference between $t0 and $t1
$nice_td_formatting = timestr($td, 'noc');
print "tasks: $nice_td_formatting\n";
The output looks like this:
tasks: 9 wallclock secs( 3.12 usr + 0.10 sys = 3.22 CPU)
The Benchmark module also has a function timeit that runs a piece
of Perl code a specified number of times:
use Benchmark;
print "100 runs took ", timestr(timeit(100,\&somefunc)), "\n";
We refer to perldoc Benchmark for more details about this module.
From a pedagogical point of view it might be instructive to write a
function like timeit in the Benchmark module. Doing this we also
have the possibility of tailoring such a timing function to suit our
needs. The function, here called timer, can take four arguments:
(i) a function to call, (ii) a list of arguments to be used in the
function to call, (iii) the number of call repetitions, and (iv)
the name of the function to call. In Perl we would represent the
first two arguments by a function reference and a reference to a
list. The complete function could then take the following form:
sub timer {
    my ($func_ref, $args_ref, $repetitions, $func_name) = @_;
    my $t0 = time;                 # initial elapsed time
    my ($u0, $s0, $rest) = times;  # initial user and system time
    for (my $i = 0; $i < $repetitions; $i++) {
        &$func_ref(@$args_ref);
    }
    my @t1 = times;
    printf("$func_name: elapsed=%g, CPU=%g\n",
           time - $t0, $t1[0] - $u0 + $t1[1] - $s0);
}
The similar Python function is presented in Chapter 8.10.1
(CPU-Time Measurements) in [?].
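To show how such a timing function might be called, here is a self-contained sketch. It uses a variant of timer that returns the measurements instead of printing them, and a hypothetical function sum_sqrt as the code to be timed; both choices are assumptions of this example.

```perl
use strict;
use warnings;

# Variant of the timer function that returns the measurements
# (elapsed seconds, CPU seconds) instead of printing them.
sub timer {
    my ($func_ref, $args_ref, $repetitions) = @_;
    my $t0 = time;
    my ($u0, $s0) = times;
    for (my $i = 0; $i < $repetitions; $i++) {
        $func_ref->(@$args_ref);
    }
    my ($u1, $s1) = times;
    return (time - $t0, ($u1 - $u0) + ($s1 - $s0));
}

# Hypothetical function to be timed: sum of square roots of 1..n
my $calls = 0;
sub sum_sqrt {
    my ($n) = @_;
    $calls++;
    my $s = 0;
    $s += sqrt($_) for 1 .. $n;
    return $s;
}

# A function reference and a reference to the argument list:
my ($elapsed, $cpu) = timer(\&sum_sqrt, [10000], 20);
printf("sum_sqrt: elapsed=%g, CPU=%g\n", $elapsed, $cpu);
```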
1.4.24 Programming with Classes
Classes are implemented in Perl using quite advanced concepts like
references and packages. Although Perl fans claim that classes in
Perl are much more flexible than those in C++ and Java, there is no
doubt that programming with classes is weirder in Perl than in
C++, Java, and Python. Explaining Perl classes in a couple of pages
without first covering references and packages is difficult and
therefore omitted here.
1.4.25 Debugging Perl Scripts
Unfortunately, Perl is by default quite silent about errors. The
following short script, which tries to open a non-existing file,
illustrates the point:
perl -w -e 'open(F,"<mynonexistingfile"); close(F);'
Perl executes this script without any error message (see footnote
12). The Fatal module can be used for letting Perl speak up about
run-time errors:
perl -e 'use warnings; use strict; use diagnostics;
         use Fatal qw/open/; local *F;
         open(F,"<mynonexistingfile"); close(F);'
12 Python provides instructive run-time messages by default in
similar examples (and the messages can be turned off by, e.g.,
appropriate exception handling in the script).
Note that you must list the functions you want to be verbose, here
open. The reported error message now contains the helpful
message
Can't open(F, <mynonexistingfile): No such file or directory
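A widespread alternative, which does not rely on the Fatal module, is to test the return value of open explicitly and include the variable $! in the message. The sketch below wraps the test in eval only so that the example can continue after the failure; the exact wording of the system error text depends on the platform.

```perl
use strict;
use warnings;

my $filename = "mynonexistingfile";

# open returns false on failure and stores the reason in $!;
# eval traps the die so that the script can continue:
my $ok = eval {
    open(my $fh, '<', $filename)
        or die "Can't open $filename: $!\n";
    close($fh);
    1;
};
my $message = $ok ? "" : $@;
print "error caught: $message" unless $ok;
```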
The use warnings, use strict, and use diagnostics commands can help
you detect statements that are candidates for trouble. However,
applying use strict to (most of) the Perl scripts in this appendix
will result in lots of error messages about lack of the main::
prefix for all global variables or an explicit my or local operator
(to make variables local). For quick scripting this can be a bit
annoying. When writing larger scripts, on the other hand, use
strict is a good habit. Here is a sample code demonstrating some
implications of use strict:
use strict;
# introduce the global variable $counter for the first time:
$counter = 1;        # generates error message
$main::counter = 1;  # ok, explicit indication of package name
my $counter = 1;     # ok, localizing $counter with the my operator
my $counter; $counter = 1;  # equiv. with the line above
The reader is encouraged to take a look at the man pages for the
Fatal, strict, and diagnostics modules. For details on warnings, see
perldoc warnings and man perllexwarn.
Inserting print statements on the fly in the code is an efficient
and widely used debugging method among Perl programmers.
Alternatively, the -d option to a Perl script enables you to
interactively debug the script through a command-line debugger,
perl -d -w mybuggyscript.pl
The -w option turns on many useful warnings about, e.g., unused
variables. The most important commands inside the debugger are s
for single step, n for single step without stepping into
subroutines, x for pretty-print of data structures and variables,
and b 85 for setting a break point at line 85. More detailed
information is provided by perldoc perldebug.
There is a Perl/Tk GUI for the Perl debugger, available in the
module ptkdb. Invoke the debugger by
perl -d:ptkdb -w mybuggyscript.pl
There are several Perl debuggers with graphical interfaces, check
out the links in the Perl resources section in doc.html.
Another Perl module is Devel::Trace, which prints each statement
prior to executing it (the same effect as the -x option to Unix
shell scripts).
1.4.26 Regular Expressions
The material on regular expressions explained in a Python context
in Chapter 8.2 (Regular Expressions and Text Processing) in [?]
carries over to Perl, but the surrounding Perl code is different. To
test if a string $str matches a regular expression contained in a
string $pattern, one writes
if ($str =~ /$pattern/) { ... }
$str = "myfile.tmp";
if ($str =~ /\.tmp$/) { print "$str has extension .tmp"; }
Backslashes and special symbols are preserved in text enclosed in
forward slashes /.../, as in Python raw strings. However, if the
regular expression is to be stored in a double-quoted string,
backslashes and special Perl characters must be preceded by a
backslash:
$pattern = "\\.tmp\$";
if ($str =~ /$pattern/) { print "$str has extension .tmp"; }
With single-quoted strings a backslash is a backslash, but Perl’s
variable interpolation cannot be used.
Pattern-Matching Modifiers. Perl offers pattern-matching modifiers
to adjust the meaning of the dot, ^, $, whitespace, etc. The syntax
for applying a pattern-matching modifier is like
if ($str =~ /$pattern/q) { ... }
where q denotes one or more single-character pattern-matching
modifiers from the following list:
i   case-insensitive matching
g   match globally, i.e., find all occurrences
s   let . match newline as well
m   treat string as multiple lines, i.e., change ^ and $ from
    matching at only the very start or end of the string to the
    start or end of any line anywhere within the string (a line
    is from a newline to the next newline)
x   extend the pattern’s legibility by permitting whitespace and
    comments
o   compile pattern once only (for increased efficiency)
The o modifier is a counterpart to compiling regular expressions in
Python.
We can use other delimiters than forward slashes if the /.../ group
is preceded by an m, e.g.,
$found = 1 if $path =~ m#/usr/local/bin#;
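The effect of some of the modifiers can be checked with a short sketch. The sample text and variable names are made up, and the use of the qr operator for storing a compiled pattern is an addition for illustration, not something covered in the text above.

```perl
use strict;
use warnings;

my $text = "Title: Perl\nauthor: hpl\n";

# i: ignore case, so 'title' matches 'Title'
my $has_title = ($text =~ /^title/i) ? 1 : 0;

# m: let ^ match at the start of every line, not only the string
my $has_author_line = ($text =~ /^author/m) ? 1 : 0;
my $without_m       = ($text =~ /^author/)  ? 1 : 0;

# x: whitespace and comments inside the pattern are ignored
my ($key, $value) = $text =~ /
    ^ (\w+)      # key at the start of a line
    : \s*        # colon and optional whitespace
    (\w+) $      # value at the end of the line
    /mx;

# qr stores a compiled pattern; the # delimiters avoid escaping slashes
my $bindir = qr#/usr/local/bin#;
my $found = ("/usr/local/bin/perl" =~ $bindir) ? 1 : 0;

print "$has_title $has_author_line $without_m $key=$value $found\n";
```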
Extracting Multiple Matches. Suppose you have a string with several
numbers. To extract all numbers from this string, without knowing
how many numbers there may be, we can apply the following Perl
construct (see footnote 13):
13 This construct is a counterpart to Python’s findall function in
the re module.
$s = "3.29 is a number, 4.2 and 0.5 too";
@n = $s =~ /\d+\.\d*/g;
The array @n now contains the ent