Scripting with Perl and Tcl Hans Petter Langtangen Simula Research Laboratory and Department of Informatics University of Oslo

Table of Contents
1 Introduction to Perl
  1.1 A Scientific Hello World Script
    1.1.1 Reading and Writing Data Files
    1.1.2 The Complete Code
    1.1.3 Dissection
    1.1.4 The Concept of Context in Perl
  1.2 Automating Simulation and Visualization
    1.2.1 The Complete Code
    1.2.2 Dissection
  1.3 There’s More Than One Way To Do It
    1.3.1 A Script for Perl Beginners
    1.3.2 Using the Underscore Variable
    1.3.3 A Script Written in Typical Perl Style
    1.3.4 Shorter Scripts for Lazy Programmers
    1.3.5 The Ultimate Goal: Getting Rid of the Script File
    1.3.6 Perl Has a Grep Function Too
  1.4 Frequently Encountered Tasks
    1.4.1 Basic Control Statements
    1.4.2 File Reading and Writing
    1.4.3 Running an Application
    1.4.4 One-Line Perl Scripts
    1.4.5 Array and List Operations
    1.4.6 Hash Operations
    1.4.7 Splitting and Joining Text
    1.4.8 Text Processing
    1.4.9 String Operations
    1.4.10 Environment Variables
    1.4.11 Subroutines
    1.4.12 Nested, Heterogeneous Data Structures
    1.4.13 Testing a Variable’s Type
    1.4.14 Numerical Expressions
    1.4.15 Listing of Files in a Directory
    1.4.16 Testing File Types
    1.4.17 Copying and Renaming Files
    1.4.18 Creating and Moving to Directories
    1.4.19 Removing Files and Directories
    1.4.20 Splitting Pathnames
    1.4.21 Traversing Directory Trees
    1.4.22 Downloading Internet Files
    1.4.23 CPU-Time Measurements
    1.4.24 Programming with Classes
    1.4.25 Debugging Perl Scripts
    1.4.26 Regular Expressions
    1.4.27 Exercises
    1.4.28 Building and Using Modules
    1.4.29 Binary Input/Output
  1.5 Installing Perl and Additional Modules
    1.5.1 Installing Basic Perl
    1.5.2 Manual Installation of Perl Modules
    1.5.3 Automatic Installation of Perl Modules
    1.5.4 The Required Perl Modules
  1.6 Perl Versus Python
    1.6.1 Python’s Advantages
    1.6.2 Perl’s Advantages
    1.6.3 Efficiency
  1.7 GUI Programming with Perl/Tk
    1.7.1 The First Perl/Tk Encounter
    1.7.2 The Similarity of Python/Tkinter and Perl/Tk
    1.7.3 Binding Events
  1.8 Web Interfaces and CGI Programming
    1.8.1 Web Versions of the Scientific Hello World Program
    1.8.2 Debugging CGI Scripts in Perl with CGI::Debug
    1.8.3 Using Perl’s CGI Module to Construct Forms
2 Introduction to Tcl/Tk
  2.1 A Scientific Hello World Script
    2.1.1 The Complete Code
    2.1.2 Dissection
  2.2 Reading and Writing Data Files
    2.2.1 The Complete Code
    2.2.2 Dissection
    2.2.3 Double Quotes, Braces, Brackets, and Variable Substitution
  2.3 Automating Simulation and Visualization
    2.3.1 The Complete Code
    2.3.2 Dissection
  2.4 Frequently Encountered Tasks
    2.4.1 File Reading and Writing
    2.4.2 Running an Application
    2.4.3 List Operations
    2.4.4 Associative Array Operations
    2.4.5 Splitting and Joining Text
    2.4.6 Text Processing
    2.4.7 String Operations
    2.4.8 Numerical Expressions
    2.4.9 Environment Variables
    2.4.10 Procedures
    2.4.11 Listing of Files in a Directory
    2.4.12 Testing File Types
    2.4.13 Copying and Renaming Files
    2.4.14 Creating and Moving to Directories
    2.4.15 Removing Files and Directories
    2.4.16 Splitting Pathnames
    2.4.17 Traversing Directory Trees
    2.4.18 CPU-Time Measurements
    2.4.19 Programming with Classes
    2.4.20 Building and Using Packages
    2.4.21 Regular Expressions
    2.4.22 Exercises
  2.5 GUI Programming with Tcl/Tk
    2.5.1 The First Tcl/Tk Encounter
    2.5.2 Binding Events
    2.5.3 Widget Name Hierarchy
    2.5.4 The Similarity of Python/Tkinter and Tcl/Tk
    2.5.5 Using Variables in Widget Names
    2.5.6 Configuring Widgets
    2.5.7 The Grid Geometry Manager
Preface
The purpose of this document is to show how the introductory programming examples from the book “Python Scripting for Computational Science” [?] can be implemented in Perl and Tcl. In addition, we list some core functionality of these scripting languages, typically corresponding to the same information and examples as in Chapter 3 (Basic Python) in [?]. If you know the examples in a Python context from Chapters 2 (Getting Started with Python Scripting) and 3 (Basic Python) in [?], it is quite easy to pick up basic Perl and Tcl from the present note. The Perl and Tcl chapters can be read independently.
The author has a desire to include other scripting languages, e.g., Ruby and Scheme. Potential authors of such (independent) chapters, with the same structuring as the Perl and Tcl chapters, are encouraged to drop me an email ([email protected]).
The present printing of the document contains the Perl part only.
Chapter 1
Introduction to Perl
This chapter gives a quick introduction to the Perl language for readers who are familiar (at least to some extent) with the Python scripts from Chapters 2.1 (A Scientific Hello World Script) to 2.3 (Gluing Stand-Alone Applications) and Chapter 3 (Basic Python) in the book [?]. We shall look at the same sample scripts and show how the syntax changes when we program in Perl.
Recommended Documentation. As a companion to the introductory examples and the overview of basic Perl functionality provided in this appendix, you need the Perl man pages. These come along with the Perl distribution. I find it convenient to read the man pages in plain text format using the perldoc tool. Some common ways of looking up information with perldoc are exemplified below.
perldoc perl       # overview of all Perl man pages
perldoc perlsub    # read about subroutines
perldoc Cwd        # look up a special module, here 'Cwd'
perldoc -f open    # look up a special function, here 'open'
perldoc -q cgi     # search the FAQ for the text 'cgi'
A Web version of the man pages can be found in the doc.html file. There you can also find the Perl FAQ and a quick reference.
Having grasped the basic introduction to Perl from this appendix, you will find the definitive Perl reference, the famous “Camel book” [?], very useful. However, much of the text in [?] coincides with the Perl man pages. If you feel that a more comprehensive introduction to Perl is needed, “Learning Perl” [?] and [?] are recommended. Ready-made recipes for numerous common tasks in scripting are collected in the highly recommended “Perl Cookbook” [?]. Advanced features of Perl are well discussed in [?] and [?]. Some Web resources regarding Perl topics are listed in doc.html.
The first Perl encounter consists of three of the examples from the introduction to Python in Chapter 2 (Getting Started with Python Scripting) in [?]. We start out with a Hello World script, before continuing with a script concerning file handling and array processing. Thereafter we present a script gluing a simulation and a visualization program. All scripts referred to in this section are found in src/perl. Thereafter, in Chapter 1.4 we list, in an example-oriented way, some basic and useful Perl functionality for quick reference. Chapter 1.5 explains how to install Perl and additional modules. A brief comparison of Perl versus Python appears in Chapter 1.6, while Chapters 1.7 and 1.8 deal with graphical user interfaces: standard GUIs and dynamic Web pages, respectively.
1.1 A Scientific Hello World Script
Our first look at Perl will be the Scientific Hello World script from Chapter 2.1 (A Scientific Hello World Script) in [?]. This script reads a real number from the command line, takes the sine of the number, and writes “Hello, World! sin(r)=s” with the appropriate values of the numbers r and s. In Perl, we can write the script like this:
#!/usr/bin/perl
$r = $ARGV[0];   # fetch the first ([0]) command-line argument
$s = sin($r);    # compute sin(r) and store in variable s
print "Hello, World! sin($r)=$s\n";   # print to standard output
Comments in Perl start with # and continue for the rest of the line. However, the first line, #!/usr/bin/perl, has a special meaning: under Unix it tells the system that the script, if run as an executable file, is to be interpreted by the program /usr/bin/perl. If the Perl interpreter is stored in another path on your system, you must write the correct full path in the top line of the script or (usually better) use a different header to be presented in Chapter 1.1.1.
Scalar variables in Perl are always preceded by a $ sign, i.e., $r and $s are scalar variables in the present script. The command-line arguments to a Perl script are automatically stored in the array ARGV. Subscripting this array is done as in $ARGV[0] (which implies extracting the first entry; arrays in Perl start with 0 as in C and Python). The length of the array is $#ARGV+1, i.e., $ARGV[$#ARGV] is the last entry of the array. The array itself as a variable is reached with the syntax @ARGV (and one can say, e.g., print "ARGV=@ARGV").
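The subscripting rules above can be tried out in a minimal sketch. The argument values are hypothetical stand-ins assigned to @ARGV by hand, and the use strict/use warnings lines and my declarations are additions that the scripts in this chapter do not use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for real command-line arguments.
@ARGV = ("0.1", "0.2", "0.3");

my $first  = $ARGV[0];        # first entry (index 0)
my $last   = $ARGV[$#ARGV];   # last entry; $#ARGV is the last valid index
my $length = $#ARGV + 1;      # number of entries

print "first=$first last=$last length=$length\n";
print "ARGV=@ARGV\n";         # the whole array interpolated in a string
```

When an array is interpolated in a double-quoted string, its entries are separated by a single space by default.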
Variables can be directly inserted into a text string, a convenient feature called variable interpolation:
print "Hello, World! sin($r)=$s\n"; # print to screen
Such variable interpolation works only if the string is surrounded by double quotes. Single quotes just lead to output of the text as it stands, with the dollar characters and variable names intact.
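The difference between the two kinds of quotes may be easier to see side by side. The following is a small sketch; the value of $r is chosen arbitrarily, and the my declarations are an addition for use with strict.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $r = 0.1;
my $s = sin($r);

my $double = "sin($r)=$s";   # double quotes: $r and $s are replaced by their values
my $single = 'sin($r)=$s';   # single quotes: the text is kept literally

print "$double\n";
print "$single\n";
```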
Perl’s syntax is much inspired by C. For example, the newline character is \n and all statements are terminated by a semicolon.
As usual in scripting, variables need not be declared; the context determines the type. Contrary to Python, a variable can be used both as a string and a floating-point number. For example, $r is initialized to a text string, but can be sent to the sine function, which expects a floating-point variable, without any explicit type conversion.
Perl’s printf function gives good control of the output format of numbers and strings:
printf "Hello, World! sin(%g)=%12.5e\n", $r, $s;
Variable interpolation offers no control of the output format (i.e., a counterpart to Python’s %(s)12.5e is not supported).
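One possible workaround, sketched here as a suggestion rather than as part of the original script, is to format the number with Perl's sprintf function first and then interpolate the resulting string. The variable name $s_formatted is made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $r = "0.1";       # a text string, as it would arrive from the command line
my $s = sin($r);     # used as a number without explicit conversion

# Format the number first, then interpolate the formatted string.
my $s_formatted = sprintf("%12.5e", $s);
my $line = "Hello, World! sin($r)=$s_formatted";

print "$line\n";
```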
If the script is stored in a file hw.pl, you can execute the script by typing
perl hw.pl 0.1
or you can make the file executable under Unix (chmod a+x hw.pl) and then just write
./hw.pl 0.1
1.1.1 Reading and Writing Data Files
Chapter 2.2 (Working with Files and Data) in [?] deals with a script for reading a file with (x, y) data points in two columns and writing a new two-column file with transformed data points (x, f(y)). On the next pages we shall present and explain a Perl counterpart to the Python scripts. This case study demonstrates how to work with files, subroutines, and arrays in Perl.
1.1.2 The Complete Code
: # *-*-perl-*-*
  eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0;  # if running under some shell

die "Usage: $0 infilename outfilename\n" if $#ARGV < 1;

($infilename, $outfilename) = @ARGV;

open(INFILE, "<$infilename");    # open for reading
open(OUTFILE, ">$outfilename");  # open for writing

# read one line at a time:
while (defined($line=<INFILE>)) {
    ($x, $y) = split(' ', $line);   # extract x and y value
    $fy = myfunc($y);               # transform y value
    printf(OUTFILE "%g %12.5e\n", $x, $fy);
}
close(INFILE); close(OUTFILE);

[The definition of the subroutine myfunc is missing from this copy of the listing.]
1.1.3 Dissection
The Perl script starts with a header
: # *-*-perl-*-*
  eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0;  # if running under some shell
This header ensures that executing the script as
./datatrans1.pl infile outfile
implies interpreting the code by the first perl program encountered in the directories listed in your PATH environment variable. The explanation of all the details in our Perl header is intricate, but it can be found in the file src/perl/headerfun.sh. (This is actually a document written in Bash (!) so you need to run the file to get the document printed.)
In the case where the user has failed to provide two command-line arguments, we want to write a usage message and abort the script. This is accomplished by Perl’s die statement: die prints a string on standard error and terminates the script. In the present example the script dies if there are fewer than two command-line arguments:
die "Usage: $0 infilename outfilename\n" if $#ARGV < 1;
Recall that $#ARGV is the last legal index in @ARGV, i.e., the length of @ARGV is $#ARGV+1, so the test is $#ARGV+1 < 2, leading to $#ARGV < 1.
Extracting the first two command-line arguments can be performed by standard subscripting:
$infilename = $ARGV[0];
$outfilename = $ARGV[1];
However, it is more common (and elegant) to use Perl’s list assignment construction:
($infilename, $outfilename) = @ARGV;
The list on the left-hand side is set equal, entry by entry, to the entries in the array on the right-hand side. We refer to the remark at the end of this section for an explanation of the difference between list and array in Perl terminology.
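A short sketch may clarify what "entry by entry" means when the two sides differ in length. The array @args and its values are hypothetical, chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @args = ("in.dat", "out.dat", "extra");   # hypothetical values

# Entry by entry: surplus entries on the right-hand side are ignored.
my ($infilename, $outfilename) = @args;

# Variables without a matching entry become undefined.
my ($p, $q, $r, $missing) = @args;

# List assignment also allows swapping two variables without a temporary.
($infilename, $outfilename) = ($outfilename, $infilename);

print "infilename=$infilename outfilename=$outfilename\n";
print "fourth variable is undefined\n" unless defined $missing;
```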
Opening files in Perl is done with the open function:
open(INFILE, "<$infilename");    # open for reading
open(OUTFILE, ">$outfilename");  # open for writing
The first argument to open is a file handle, which is used for accessing the file in the Perl code. Input files are recognized by < in front of the name (see footnote 1), > signifies an output file, and >> implies that text will be appended to the file.
Reading from a file handle, line by line, is accomplished by
while (defined($line=<INFILE>)) {
    # process $line
}
In the present script we want to split the line into an array of words, separated by whitespace. The split function performs this task:
($x, $y) = split(' ', $line);   # extract x and y value
Having the coordinates $x and $y available, we can transform the y value by calling a function myfunc,
$fy = myfunc($y); # transform y value
One way of printing the transformed coordinate pair to the output file is to apply the printf function:
printf(OUTFILE "%g %12.5e\n", $x, $fy);
The core of a printf call is the format string, which follows the same syntax as in C and Python (and all other languages that support C’s printf style of formatting). Perl’s ordinary print function can also be used for writing to files, e.g., print OUTFILE "$x $fy\n";
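The myfunc function is defined as a subroutine. [The listing of myfunc is missing from this copy of the text.]
Functions are referred to as subroutines in Perl. Their typical look is the keyword sub, the subroutine name, and a block of statements enclosed in braces. [The generic listing is missing from this copy of the text.]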
The most striking difference from subprograms in other languages is that the argument list is not a part of the subroutine heading. Instead, all arguments are available in an array @_. The first step is normally to store the arguments in local variables:
my ($y) = @_;    # list assignment
# or
my $y = $_[0];   # subscripting
Footnote 1: If there is no < symbol, the file is opened for reading. In fact, open(F,"<$name"), open(F,"$name"), and open(F,$name) all lead to opening a file with name $name.
The my keyword declares all variables on the left-hand side as local variables in the subroutine. This is a good habit, as unintended use of global variables inside a subroutine may have undesired effects in other parts of the script.
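The general layout of a subroutine can be sketched as follows. The subroutine transform, its second argument, and its formula are hypothetical and invented for this illustration; they are not the myfunc function of datatrans1.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical subroutine: scale non-negative values, map negative ones to 0.
sub transform {
    my ($y, $factor) = @_;                   # copy the arguments into local variables
    $factor = 2.0 unless defined $factor;    # default when the caller omits it
    if ($y >= 0.0) { return $factor*$y; }
    else           { return 0.0; }
}

my $fy = transform(1.5);       # one argument; $factor defaults to 2.0
my $gy = transform(1.5, 4);    # two arguments
my $hy = transform(-1);        # negative input

print "$fy $gy $hy\n";
```

Since the argument list is not part of the subroutine heading, a subroutine can be called with fewer arguments than it extracts; the remaining local variables are then undefined, which the sketch uses to supply a default value.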
As in Chapter 2.2 (Working with Files and Data) in [?], we can modify datatrans1.pl such that (i) the file is loaded into an array of lines, (ii) the x and y coordinates are stored in two arrays, and (iii) the output file is written by a for loop over the array entries.
We start with making the open statement a bit more robust. Perl does not by default write any error message if the file we try to open does not exist. This can be quite annoying, but the problem is solved by a “try something or die” construction:
open(INFILE, "<$infilename") or die "unsuccessful opening of $infilename; $!\n";
The $! variable is a special variable in Perl containing the last error message issued by the operating system.
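To see the “try something or die” construction in action without aborting the whole script, one can wrap it in an eval block and inspect the message afterwards. This is only a sketch: the file name is hypothetical and assumed not to exist, and the eval wrapper is an addition for demonstration purposes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical file name, assumed not to exist on the system.
my $infilename = "no_such_directory_for_demo/input.dat";

# eval catches the die, so the message can be examined afterwards.
my $ok = eval {
    open(INFILE, "<$infilename")
        or die "unsuccessful opening of $infilename; $!\n";
    close(INFILE);
    1;
};
my $error = $ok ? "" : $@;   # $@ holds the text passed to die

print "caught: $error" unless $ok;
```

The part of the message after the semicolon comes from $! and is supplied by the operating system, so its exact wording depends on the platform.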
Loading a file into an array of lines is enabled by the syntax
@lines = <INFILE>;
One can then process the array @lines, line by line:
for $line (@lines) {
    # process $line
}
In the present case we want to create two arrays, @x and @y, containing the x and y coordinates:
@x = (); @y = ();   # start with empty arrays
for $line (@lines) {
    ($xval, $yval) = split(' ', $line);
    push(@x, $xval); push(@y, $yval);
}
The x and y coordinates are extracted by splitting the line with respect to whitespace, exactly as we did in the datatrans1.pl code. The push function appends new array entries.
Creating the output file can now be performed by a C-like for loop over the array indices:
open(OUTFILE, ">$outfilename")
    or die "unsuccessful opening of $outfilename; $!\n";

for ($i = 0; $i <= $#x; $i++) {
    $fy = myfunc($y[$i]);   # transform y value
    printf(OUTFILE "%g %12.5e\n", $x[$i], $fy);
}
close(OUTFILE);
Recall that $#x is the last valid index in the array @x. The complete code is found in src/perl/datatrans2.pl.
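The array-based steps can be tried without any files in the following sketch. The contents of @lines are made-up stand-ins for what @lines = <INFILE> would deliver, the doubling of y is a hypothetical transformation replacing myfunc, and the formatted lines are collected in an array @out instead of being written to OUTFILE.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up lines, as they would look after @lines = <INFILE>.
my @lines = ("0.1 1.0\n", "0.2 2.0\n", "0.3 3.0\n");

my (@x, @y);                                   # start with empty arrays
for my $line (@lines) {
    my ($xval, $yval) = split(' ', $line);     # split on whitespace
    push(@x, $xval); push(@y, $yval);          # append new entries
}

my @out;
for (my $i = 0; $i <= $#x; $i++) {             # C-like loop over the indices
    my $fy = 2*$y[$i];                         # hypothetical transformation
    push(@out, sprintf("%g %12.5e", $x[$i], $fy));
}

print "$_\n" for @out;
```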
Remarks on Terminology. Perl distinguishes between the terms array and list. Roughly speaking, an array is the variable having a list as value [?, Ch. 4.0]. For example, in an assignment @a = ("a","b","c"), a is an array, whereas its value ("a","b","c") is a list. The function push operates on array variables and not on lists, meaning that push(@a,"q") works well, while push(("a","b","c"),"q") does not make sense.
1.1.4 The Concept of Context in Perl
Operations in Perl are evaluated in a specific context. For newcomers to the language the context concept can be quite confusing. A thorough explanation of context is provided in the “Camel” book [?, Ch. 2] or the perldata man page (invoke perldoc perldata and search for “Context”). Here we shall only exemplify the two major contexts: scalar and list. The assignment
@a = ("a","b","c");
evaluates the list on the right-hand side in a list context, and @a becomes an array variable having its entries equal to the three scalars in the list ("a","b","c"). When assigning the list to a scalar,
$a = ("a","b","c");
the list on the right-hand side is evaluated in a scalar context. In this case, the value of the list is the value of the last element (as with the C comma operator). Therefore, $a becomes "c". On the other hand,
$b = @a;
evaluates the array variable @a in a scalar context, and its value is then the length of the array. That is, $b becomes 3.
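The effect of context on an array variable can be checked with a few lines. This is a small sketch; the variable names are chosen for the illustration, and the built-in scalar function is used to force a scalar context explicitly.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @a = ("a", "b", "c");

my $count   = @a;           # scalar context: the length of the array
my ($first) = @a;           # list context (note the parentheses): first entry
my $forced  = scalar(@a);   # scalar() forces a scalar context explicitly

# String concatenation also imposes a scalar context on @a.
print "The array has " . @a . " entries\n";
print "count=$count first=$first forced=$forced\n";
```

The parentheses around $first make the left-hand side a list, which is why $first receives the first entry rather than the length.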
These examples show that an array variable can have a list as value in a list context and its length as value in a scalar context. A hash evaluated in a scalar context becomes true if there are elements in the hash, and false otherwise (see footnote 2).
The property that an array evaluates to its length in a scalar context is often taken advantage of by Perl programmers. Two common applications are testing whether an array is non-empty and comparing its length with a number:
die "Usage: $0 file" unless @ARGV;
die "Usage: $0 -f file" unless @ARGV == 2;
Both statements have an attractive readability.
Footnote 2: There is more information in the scalar value; see the perldata man page.
The return value of many Perl functions depends on the context. One example is localtime:
$t = localtime();
yields the date as a string; $t is "Sun May 13 09:02:27 2001", for instance. In a list context,
@t = localtime();
localtime returns a list of nine values containing the time, day, month, year, etc. (see perldoc -f localtime), and @t becomes an array of numbers (say) (27, 2, 9, 13, 4, 101, 0, 132, 1).
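Both calls can be compared in a short sketch. The names of the unpacked variables and the date format are choices made for this illustration; the offsets (years counted from 1900, months from 0) agree with the sample values above, where 101 and 4 correspond to 2001 and May.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $t = localtime();   # scalar context: the date as a string
my @t = localtime();   # list context: nine numbers

# The first six entries, unpacked by list assignment.
my ($sec, $min, $hour, $mday, $mon, $year) = @t;

# Years are counted from 1900 and months from 0.
my $stamp = sprintf("%04d-%02d-%02d", $year + 1900, $mon + 1, $mday);

print "scalar context: $t\n";
print "list context:   @t\n";
print "date:           $stamp\n";
```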
1.2 Automating Simulation and Visualization
Chapter 2.3 (Gluing Stand-Alone Applications) in [?] describes a simple simulation code, called oscillator, for solving a differential equation modeling an oscillating system. Using a script, we can improve the user friendliness of the simulation code and also launch a visualization of the solution. A Python version of such a script is explained in detail in the same chapter of [?], and the purpose of the present section is to present the Perl version of that script.
1.2.1 The Complete Code
: # *-*-perl-*-*
  eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0;  # if running under some shell

# default values of input parameters:
$m = 1.0; $b = 0.7; $c = 5.0; $func = "y"; $A = 5.0;
$w = 2*3.14159; $y0 = 0.2; $tstop = 30.0; $dt = 0.05;
$case = "tmp1"; $screenplot = 1;

# read variables from the command line, one by one:
while (@ARGV) {
    $option = shift @ARGV;   # load cmd-line arg into $option
    if ($option eq "-m") {
        $m = shift @ARGV;    # load next command-line arg
    }
    elsif ($option eq "-b")            { $b = shift @ARGV; }
    elsif ($option eq "-c")            { $c = shift @ARGV; }
    elsif ($option eq "-func")         { $func = shift @ARGV; }
    elsif ($option eq "-A")            { $A = shift @ARGV; }
    elsif ($option eq "-w")            { $w = shift @ARGV; }
    elsif ($option eq "-y0")           { $y0 = shift @ARGV; }
    elsif ($option eq "-tstop")        { $tstop = shift @ARGV; }
    elsif ($option eq "-dt")           { $dt = shift @ARGV; }
    elsif ($option eq "-noscreenplot") { $screenplot = 0; }
    elsif ($option eq "-case")         { $case = shift @ARGV; }
    else {
        die "$0: invalid option '$option'\n";
    }
}

# create a subdirectory with name equal to case and generate
# all files in this subdirectory:
$dir = $case;
use File::Path;    # contains the rmtree function
if (-d $dir) {     # does $dir exist?
    rmtree($dir);  # remove directory (old files)
}
mkdir($dir, 0755) or die "Could not create $dir; $!\n";
chdir($dir) or die "Could not move to $dir; $!\n";

# make input file to the program:
open(F,">$case.i") or die "open error; $!\n";
print F "
$m
$b
$c
$func
$A
$w
$y0
$tstop
$dt
";
close(F);

# run simulator:
$cmd = "oscillator < $case.i";   # command to run
$failure = system($cmd);
die "running the oscillator code failed\n" if $failure;

# make gnuplot script:
open(F, ">$case.gnuplot");
print F "
set title '$case: m=$m b=$b c=$c f(y)=$func A=$A w=$w y0=$y0 dt=$dt';
";
if ($screenplot) {
    print F "plot 'sim.dat' title 'y(t)' with lines;\n";
}
print F <<EOF;  # print multiple lines using a "here document"
set size ratio 0.3 1.5, 1.0;
# define the postscript output format:
set term postscript eps monochrome dashed 'Times-Roman' 28;
# output file containing the plot:
set output '$case.ps';
# basic plot command:
plot 'sim.dat' title 'y(t)' with lines;
# make a plot in PNG format as well:
set term png small;
set output '$case.png';
plot 'sim.dat' title 'y(t)' with lines;
EOF
close(F);
# make plot:
$cmd = "gnuplot -geometry 800x200 -persist $case.gnuplot";
$failure = system($cmd);
die "running gnuplot failed\n" if $failure;
The complete source code appears in src/perl/simviz1.pl.
1.2.2 Dissection
The script starts with a safe Perl header, which ensures interpretation of the script by the first Perl interpreter found in the user’s path. After having assigned default values to the input parameters to the oscillator code, we encounter an important part of many scripts, namely parsing of command-line arguments. The idea is that we “eat” the entries in @ARGV one by one using the shift operator:
$option = shift @ARGV;
This statement implies setting $option equal to the first element in @ARGV and then removing this element from @ARGV (see footnote 3). We search for options on the command line until the @ARGV array is empty:
while (@ARGV) {              # while @ARGV is non-empty
    $option = shift @ARGV;   # load command-line arg. into $option
    if ($option eq "-m") {
        $m = shift @ARGV;    # load next command-line arg
    }
    elsif ($option eq "-b") { $b = shift @ARGV; }
    ...
    else {
        die "$0: invalid option '$option'\n";
    }
}
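A reduced version of this loop, with only three of the options, can be run on its own. The argument values assigned to @ARGV are hypothetical stand-ins for a real command line, and the my declarations are an addition for use with strict.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Default values, overridden by the options below.
my ($m, $func, $screenplot) = (1.0, "y", 1);

# Hypothetical stand-in for a real command line.
@ARGV = ("-m", "2.5", "-noscreenplot", "-func", "siny");

while (@ARGV) {                        # while @ARGV is non-empty
    my $option = shift @ARGV;          # remove and return the first entry
    if    ($option eq "-m")            { $m = shift @ARGV; }
    elsif ($option eq "-func")         { $func = shift @ARGV; }
    elsif ($option eq "-noscreenplot") { $screenplot = 0; }
    else {
        die "$0: invalid option '$option'\n";
    }
}

print "m=$m func=$func screenplot=$screenplot\n";
```

Options with a value consume two entries of @ARGV per pass through the loop, while a pure flag such as -noscreenplot consumes only one.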
As an alternative to this explicit grabbing of command-line arguments, we can use a special Perl utility called GetOptions [?, p. 445]:
use Getopt::Long;  # load module with GetOptions function
GetOptions("m=f" => \$m, "b=f" => \$b, "c=f" => \$c,
           "func=s" => \$func, "A=f" => \$A, "w=f" => \$w,
           "y0=f" => \$y0, "tstop=f" => \$tstop, "dt=f" => \$dt,
           "case=s" => \$case, "screenplot!" => \$screenplot);
Footnote 3: Experienced Perl programmers will often write just $option = shift; because shift without arguments, when used outside a subroutine, implies shifting @ARGV. More examples regarding such shortcuts in Perl are provided in Chapter 1.3.
The syntax m=f means searching for the command-line argument --m and loading the next argument as a floating-point number (=f) into the Perl variable $m. A single hyphen as in -m works too. Similarly, func=s specifies --func to take a string argument. The specification of the flag screenplot allows us to use either --screenplot for setting $screenplot to a true value or --noscreenplot for setting $screenplot to a false value (note that to get this on/off behavior, the exclamation mark is required in "screenplot!" => \$screenplot).
The GetOptions function has a rich functionality; the purpose here is just to notify the reader about the existence of such a handy function. Instructive information is obtained from perldoc Getopt::Long. There are several other modules in the Getopt family. For example: Getopt::Simple for a simplified interface to Getopt::Long, Getopt::Std for single-character options, Getopt::Mixed for long and single-character options, and Getopt::Declare for handling command-line options or configuration files with associated help text and initialization code.
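The option specifications described above can be exercised in a small stand-alone sketch. The argument values assigned to @ARGV are hypothetical, only four of the options are included, and the string-valued case option is given the =s specification here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;   # provides the GetOptions function

# Default values, overridden by the options below.
my ($m, $func, $case, $screenplot) = (1.0, "y", "tmp1", 1);

# Hypothetical stand-in for a real command line.
@ARGV = ("--m", "2.5", "--func", "siny", "--case", "test1", "--noscreenplot");

GetOptions("m=f"         => \$m,            # floating-point value
           "func=s"      => \$func,         # string value
           "case=s"      => \$case,         # string value
           "screenplot!" => \$screenplot)   # on/off flag
    or die "$0: invalid command-line options\n";

print "m=$m func=$func case=$case screenplot=$screenplot\n";
```

GetOptions returns a false value if it encounters an unknown option or a value of the wrong type, which is why the call is combined with or die.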
The next step in our script is to move to the prescribed directory. However, we should first check whether the directory exists, and if so, we should delete it and recreate it to avoid mismatch between old and new result files. Checking if a directory exists is done by the command if (-d $directoryname) in Perl. Removing a non-empty directory can be conveniently done by first loading an external Perl module, use File::Path, and then calling the function rmtree in that module:
use File::Path;    # has the rmtree function
if (-d $dir) {     # does $dir exist?
    rmtree($dir);  # remove directory (old files)
}
mkdir($dir, 0755) or die "Could not create $dir; $!\n";
chdir($dir) or die "Could not move to $dir; $!\n";
Observe that we test for success of mkdir. For example, insufficient permission to create a new directory will not be noticeable when running the script unless we include the or die statement (see footnote 4).
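The directory handling can be tested safely inside a scratch directory, as in the sketch below. The use of the File::Temp module and the file name old.dat are assumptions made for this illustration; simviz1.pl works directly in the current directory.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Path;               # has the rmtree function
use File::Temp qw(tempdir);   # scratch directory, removed when the script exits

my $parent = tempdir(CLEANUP => 1);
my $dir = "$parent/tmp1";

# Simulate results left over from an earlier run.
mkdir($dir, 0755) or die "Could not create $dir; $!\n";
open(F, ">$dir/old.dat") or die "open error; $!\n";
print F "old result\n";
close(F);

if (-d $dir) {     # does $dir exist?
    rmtree($dir);  # remove the directory together with its files
}
mkdir($dir, 0755) or die "Could not create $dir; $!\n";

# A second mkdir fails because the directory exists; without a test
# of the return value, the failure would pass silently.
my $second = mkdir($dir, 0755);
print "second mkdir failed: $!\n" unless $second;
```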
The next task is to write an input file for the oscillator program. Multi-line output can easily be created through an ordinary string with embedded newlines (see footnote 5):
print F "
$m
$b
$c
$func
$A
$w
$y0
$tstop
$dt
";
Footnote 4: Python will in such cases abort the script and write a “Permission denied” message to standard output. See Exercise 1.8.
Footnote 5: Python requires a triple quoted string for this purpose.
Alternatively, we can use a special Perl construction (stemming from Unix shells), known as a here document:
print F <<EOF;
    $m
    $b
    $c
    $func
    $A
    $w
    $y0
    $tstop
    $dt
EOF
Everything between the two EOF marks is treated as output text. The closing EOF must start in the first column of the script file. The Gnuplot script later in the simviz1.pl code is actually written as a here document.
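A here document interpolates variables just like a double-quoted string, unless the end mark is written in single quotes. The sketch below illustrates the difference; the variable names and values are assumptions chosen for the example.

```perl
use strict;
use warnings;

my ($m, $c) = (2, 5);   # sample values, for illustration only

# Plain (or double-quoted) end mark: variables are interpolated
my $interpolated = <<EOF;
$m
$c
EOF

# Single-quoted end mark: the text is taken literally
my $literal = <<'EOF';
$m
$c
EOF

print $interpolated;
print $literal;
```

The single-quoted form is handy when the generated file itself contains dollar signs, e.g., a shell script.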
Perl’s system function is used for running applications:
$cmd = "oscillator < $case.i";   # command to run
$failure = system($cmd);
die "running the oscillator code failed\n" if $failure;
Visualization of the solution in Gnuplot requires writing a small script with the proper Gnuplot commands:
open(F, ">$case.gnuplot");
print F <<EOF;   # print multiple lines using a "here document"
...
# output file containing the plot:
set output '$case.ps';   # variable interpolation
...
EOF
close(F);

# make plot:
$failure = system("gnuplot $case.gnuplot");
die "running gnuplot failed\n" if $failure;
Never forget to close files before continuing with system commands involving the generated files!
1.3 There’s More Than One Way To Do It
A famous Perl slogan is “There’s More Than One Way To Do It” (often abbreviated TIMTOWTDI, pronounced “Tim Toady”). The goal of the present section is to exemplify this slogan and demonstrate different Perl programming styles. We shall develop scripts for finding files containing a specified string and show that there might be many different Perl solutions to a programming problem.
When working with computers, you have probably often tried to find a file containing some particular text, but you have a hard time figuring out what the filename is. If you remember parts of the text, the Unix grep command is handy. For example,
grep superLibFunc *
searches all files (*) in the current working directory for the text string superLibFunc and writes out the matches. This can help you find the file you are looking for. We shall present a cross-platform Perl script that implements the grep functionality.
1.3.1 A Script for Perl Beginners
A verbose, easy-to-read grep script in Perl can take the following form.
: # *-*-perl-*-*
  eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0;  # if running under some shell

die "Usage: $0 pattern file1 file2 ...\n" if $#ARGV < 1;

# first command-line argument is the pattern to search for:
$pattern = shift @ARGV;
# run through the next command-line arguments, i.e. files, and grep:
while (@ARGV) {
    $file = shift @ARGV;
    if (-f $file) {
        open(FILE,"<$file");
        @lines = <FILE>;   # read all lines
        foreach $line (@lines) {
            if ($line =~ /$pattern/) {
                print "$file: $line";
            }
        }
        close(FILE);
    }
}
The central statement here is
    if ($line =~ /$pattern/)
which tests whether the variable $line matches the regular expression contained in $pattern. If so, we write out this line.
1.3.2 Using the Underscore Variable
The Perl program can be written more compactly using the implicit $_ variable. Let us present the code first and then explain what the syntax means.
#!/usr/bin/perl
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($pattern, @files) = @ARGV;
foreach (@files) {
    if (-f) {
        open(FILE,"<$_");
        foreach (<FILE>) {
            if (/$pattern/) {
                print;
            }
        }
        close(FILE);
    }
}
The extraction of command-line arguments is elegantly performed by dividing the arguments into the leading search string and an array holding the filenames:
($pattern, @files) = @ARGV;
Many Perl commands can be issued without an explicit variable to work with. One example is foreach (@files). In such cases the “invisible” variable is $_. That is, foreach (@files) actually means foreach $_ (@files).
The previous code is best explained by showing the equivalent Perl statements where the $_ appears explicitly:
#!/usr/bin/perl
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($pattern, @files) = @ARGV;
foreach $_ (@files) {
    if (-f $_) {
        open(FILE,"<$_");
        foreach $_ (<FILE>) {
            if ($_ =~ /$pattern/) {
                print $_;
            }
        }
        close(FILE);
    }
}
1.3.3 A Script Written in Typical Perl Style
A more modern Perl style could be introduced in the script that makes use of the implicit $_ variable:
}
The next unless -f statement means that one jumps to the next iteration in the loop unless the test if (-f $_) is true, i.e., unless the current filename ($_) is an existing file.
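The early-exit idiom with next unless can be combined with statement modifiers to keep the loop body flat. The following is a hedged sketch of that style, not the script from the text: the helper name grep_files is hypothetical, and named variables and lexical filehandles are used instead of $_ so that the function can be reused and tested.

```perl
use strict;
use warnings;

# Hypothetical helper: return the matching lines, each prefixed by its filename
sub grep_files {
    my ($pattern, @files) = @_;
    my @hits;
    foreach my $file (@files) {
        next unless -f $file;               # skip anything that is not a plain file
        open(my $fh, '<', $file) or next;   # skip unreadable files as well
        while (my $line = <$fh>) {
            push @hits, "$file: $line" if $line =~ /$pattern/;
        }
        close($fh);
    }
    return @hits;
}

# Act as a grep script when called with a pattern and at least one file
print grep_files(@ARGV) if @ARGV >= 2;
```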
1.3.4 Shorter Scripts for Lazy Programmers
There are many shortcuts in Perl aimed at lazy programmers. Here is an example of a grep script equivalent to those above, but with a much more compact file reading construction:
#!/usr/bin/perl
$pattern = shift;   # shift; means shift @ARGV
while (<>) {        # read line by line in file by file
    print if /$pattern/o;   # o increases the efficiency
}
The while (<>) loop implies reading all lines in all files whose names are in @ARGV. (If there are no filenames on the command line, <> reads from standard input.) Since processing a list of files in a line-oriented fashion is a frequently encountered task in scripts, while (<>) is a popular and widely used construction that saves quite some typing. It goes without saying that each line is available in the $_ variable.
1.3.5 The Ultimate Goal: Getting Rid of the Script File
We can also do the grep operation with a command-line Perl script:
perl -n -e 'print if /superLibFunc/;' file1 file2 file3
Here, the -n option tells Perl to invoke a loop over all lines in all files specified on the command line (equivalent to while (<>)) and execute the string after -e as a Perl script applied to each line. Implicit here is that the line is stored in the $_ variable.
1.3.6 Perl Has a Grep Function Too
The grep operation is so common that Perl has in fact a built-in grep function:
} }
The grep function searches for $string in a list of all the lines in the file and returns a list with the lines that contain $string. Of course, this readable script can be condensed to two lines if desired, using the <> notation:
#!/usr/bin/perl
$pattern = shift;
print grep /$pattern/, <>;
Observe that we here do not easily print the filename.
Remark. We should mention that reading the whole file into memory at once, which is implied by @lines=<FILE> and also by the <> operator in list context, may cause memory problems if you work with large data files. Line-by-line reading can then be more appropriate.
Exercise 1.1. Modify a very Perl-ish grep script. Consider a grep script in typical modern Perl style:
}
Extend this script such that the filename and the line number are printed at the beginning of the lines that match the given string. You can count the number of lines in the last foreach loop, or you can make use of Perl’s special variable $., which holds the line number of the current line. Write the line number in a field of width (say) 5 characters such that the output is nicely aligned in three columns (filename, line number, line); see Exercise 8.4 on page 349 in [?] for a sample output.
Observe how simple such an extension would have been if we had used named variables instead of $_, or in other words, readability and extendability are seldom well supported by extensive use of $_.
1.4 Frequently Encountered Tasks
Frequently encountered tasks in Perl scripts have been collected and organized in the present section, with the aim of providing a kind of example-oriented quick reference for the reader. The following tasks are covered:
– basic control structures,
– executing other programs,
– splitting, joining, searching, and replacing text,
– writing and calling Perl subroutines,
– checking a file’s type, size, and age,
– listing and removing files,
– creating and removing directories,
– measuring CPU time,
1.4.1 Basic Control Statements

if ($answer eq "copy") {
    $copy = 1;
} elsif ($answer eq "0") {   # compare as text: with ==, any non-numeric answer equals 0
    $quit = 1;
} elsif ($answer eq 'run' or $answer eq 'execute') {
    $run = 1;
} else {
    print "Invalid answer $answer\n";
}
Perl has numerous ways of writing if tests. Some examples are
if ($pen ne "up") { $pen = "up"; }
if (not $pen eq "up") { $pen = "up"; }
if (!($pen eq "up")) { $pen = "up"; }   # parentheses needed: ! binds tighter than eq
$pen = "up" if $pen ne "up";
$pen = "up" if not $pen eq "up";
$pen = "up" if ! ($pen eq "up");
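A point worth checking when negating a string comparison is operator precedence: ! binds tighter than eq, whereas not binds looser. The following small sketch (variable names chosen for illustration) prints how three spellings of the negated test evaluate when $pen is "down".

```perl
use strict;

my $pen = "down";

# ! binds tighter than eq, so this parses as (!$pen) eq "up"
my $without_parens = (! $pen eq "up") ? 1 : 0;

# not has lower precedence than eq, so this parses as not($pen eq "up")
my $with_not = (not $pen eq "up") ? 1 : 0;

# explicit parentheses make ! behave like not
my $with_parens = (!($pen eq "up")) ? 1 : 0;

print "$without_parens $with_not $with_parens\n";   # prints "0 1 1"
```

Without parentheses, the ! version compares the negated (empty) value of $pen with "up" and is therefore false regardless of the pen state.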
The for or foreach statement visits the entries in an array, entry by entry:
# convert some PostScript files to GIF:
@somelist = ('file1.ps', 'file2.ps', 'file3.ps');
for $psfile (@somelist) {
    $giffile = $psfile;
    $giffile =~ s/\.ps$/.gif/;   # replace the .ps extension by .gif
    system("convert ps:$psfile gif:$giffile");
}
There is both a while loop and a do-while loop in Perl:
$r = 0; $dr = 0.1;
while ($r <= 10) {
    $s = sin($r); print "$s\n";
    $r += $dr;
}

do {
    $s = sin($r); print "$s\n";
    $r += $dr;
} while ($r <= 10);
}
The next statement continues with the next iteration in the loop:
# print lines not starting with ’#’: for $file (@files) {
}
1.4.2 File Reading and Writing
The following code segments demonstrate opening a file and reading it line by line or loading it into a list of lines:
$infilename = "myprog.cpp";
open(INFILE, "<$infilename")   # open for reading
    or die "Cannot read file $infilename; $!\n";
@lines = <INFILE>;   # load file into a list of lines

# alternative reading, line by line:
while (defined($line = <INFILE>)) {
    # process $line
}

# alternative reading, using the implicit $_ variable:
while (<INFILE>) {
    # process current line, stored in $_
}
close(INFILE);
The recipe for opening a file for writing a list of lines is given next.
$outfilename = "myprog2.cpp";
open(OUTFILE, ">$outfilename")   # open for writing
    or die "Cannot write to file $outfilename; $!\n";
$line_no = 0;   # count the line number in @lines
foreach $line (@lines) {
    $line_no++;
    print OUTFILE "$line_no: $line";
}
close(OUTFILE);
We can proceed with appending text to a file, using Perl’s features for writing (large) blocks of text in one output statement, with embedded variables if desired:
open(OUTFILE, ">>$filename")   # open for appending
    or die "Cannot append to file $filename; $!\n";

# print multiple lines at once, using a "here document":
print OUTFILE <<EOF;
/*
  This file, "$outfilename", is a version
  of "$infilename" where each line is numbered.
*/
EOF

# equivalent output using a string instead
# (embedded double quotes must then be escaped):
print OUTFILE
"/*
  This file, \"$outfilename\", is a version
  of \"$infilename\" where each line is numbered.
*/\n";

close(OUTFILE);
If you need to treat a file handle, such as OUTFILE, like a variable, e.g., when sending it to a function, you should use Perl’s FileHandle objects, see perldoc FileHandle.
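Besides FileHandle objects, a file handle stored in an ordinary scalar variable (a lexical file handle, created by open with an undefined my variable) can be passed to a function like any other value. The sketch below is an illustration of that alternative, not code from the text; the helper name write_numbered is hypothetical, and an in-memory file is used so that nothing is written to disk.

```perl
use strict;
use warnings;

# Hypothetical helper: write numbered lines to an already opened file handle
sub write_numbered {
    my ($fh, @lines) = @_;
    my $line_no = 0;
    foreach my $line (@lines) {
        $line_no++;
        print $fh "$line_no: $line";
    }
    return $line_no;
}

# Open an in-memory "file" (a reference to a scalar instead of a filename)
my $buffer = "";
open(my $out, '>', \$buffer) or die "Cannot open in-memory file; $!\n";
write_numbered($out, "first\n", "second\n");
close($out);

print $buffer;
```

Replacing \$buffer by a filename gives the usual behavior of writing to a file on disk.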
1.4.3 Running an Application
Any operating system command can be executed by calling the system function. Here is an example involving running an application myprog:
$cmd = "myprog -c file.1 -p -f -q";
$failure = system("$cmd > res");   # output goes to file res
die "$0: running $cmd failed\n" if $failure;
A different way of testing for failure is
system("$cmd > res") == 0 or die "$0: running $cmd failed\n";
The return value from system is also available in the special Perl variable $?:
system("$cmd > res");
die "$0: running $cmd failed\n" if $?;
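The value in $? is not the plain exit status of the program; it packs the exit status together with signal information, as described in perldoc -f system. The following sketch decodes it. To stay self-contained, the program being run is the Perl interpreter itself ($^X), asked to exit with status 3, which is an assumption made for the example.

```perl
use strict;
use warnings;

# $^X is the path of the running perl; the child simply exits with status 3
system($^X, "-e", "exit 3");
my $raw = $?;   # save the value before anything else can change it

if ($raw == -1) {
    print "failed to start the program: $!\n";
} elsif ($raw & 127) {
    printf "program died from signal %d\n", $raw & 127;
} else {
    printf "program exited with status %d\n", $raw >> 8;
}
```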
To redirect the output from the application into a list of lines, one can use back quotes:
$cmd = "myprog -c file.1 -p -f -q";
@res = `$cmd`;
Alternatively, one can open a pipe to the application and read the output as if it were a file:
open(APP, "$cmd |");
@res = <APP>;
# or read the output line by line:
while (<APP>) {
    # process the current line, stored in $_
}
close(APP);
Pipes can also be used for running interactive applications. After having opened a write pipe to a program, we can issue various commands, which are executed upon closing the pipe. Here is an example involving the interactive Gnuplot program:
open(GNUPLOT, "| gnuplot -persist");   # open a pipe to Gnuplot
print GNUPLOT "set xrange [0:10]; set yrange [-2:2]\n";
print GNUPLOT "plot sin(x)\n";         # draw a sine function
print GNUPLOT "quit\n";
close(GNUPLOT);                        # run Gnuplot with the commands
1.4.4 One-Line Perl Scripts
Perl supports some command-line options for wrapping a script with a loop over all lines in a series of files. This is very convenient for creating one-line scripts on the fly. For example,
perl -p -i.bak -e '...' file1 file2 file3
runs a loop over all lines in file1, file2, and file3. For each line, the Perl commands provided inside the quotes (after the -e option) are executed, and the -p option implies that the line is printed after execution of the commands. Without the -i option the printing goes to standard output, but with -i the files are modified in-place, i.e., the original file is replaced by the new output. With -i.bak the file file1 is first copied to file1.bak before it is being overwritten. The -p and -i.bak options are normally combined into -pi.bak. Each line in the files is stored in $_. As an illustration we can let the script specified by the -e option be s/float/double/g; meaning that float is replaced by double in some files (here file1, file2, and file3):
perl -pi.bak -e 's/float/double/g;' file1 file2 file3
To avoid automatic printing of each line, we can replace the -p option by -n. Suppose a data file has numbers in a series of columns, separated by whitespace, and you want to extract the first and the fourth column. The relevant one-liner is then
perl -ne '@s=split; print "$s[0]\t$s[3]\n"' datafile
Calling split without an argument implies splitting $_ with respect to whitespace. The equivalent Perl script, stored in a file, can in this latter example also be made very short:
while (<>) {@s=split; print "$s[0]\t$s[3]\n";}
1.4.5 Array and List Operations
The most common statements for creating and traversing arrays are listed next. Creating an array with three entries goes like this:
@arglist = ($myarg1, "displacement", "tmp.ps");
@arr = ($var1, $var2);
@arglist = ($myarg1, "displacement", @arr, "tmp.ps");
but @arglist does not have an array as the third element; the @arr array’s entries are simply inserted in @arglist, i.e., @arglist now contains
($myarg1, "displacement", $var1, $var2, "tmp.ps");
To force the third entry to be the @arr array, this entry must be a reference to @arr, obtained by prefixing @arr with a backslash (see page 29):
@arglist = ($myarg1, "displacement", \@arr, "tmp.ps");
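The difference between inserting an array and inserting a reference to it is easy to verify by counting entries. The sketch below uses made-up array contents for illustration.

```perl
use strict;
use warnings;

my @arr = ("a", "b");

my @flat   = ("x", "y", @arr,  "z");   # @arr is flattened: 5 entries
my @nested = ("x", "y", \@arr, "z");   # third entry is a reference: 4 entries

print scalar(@flat), " ", scalar(@nested), "\n";   # prints "5 4"

# Access the referenced array through the third entry
print "second element of the inner array: $nested[2][1]\n";
print "all inner elements: @{$nested[2]}\n";
```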
New entries can be appended to an array using the push function, e.g.,
push(@arglist, $myvar2);
push(@arglist, @arr2);
$arglist[2] = "displacement";
foreach $entry (@arglist) {
    print "entry is $entry\n";
}
# or
for $entry (@arglist) {
    print "entry is $entry\n";
}
Index-based traversal is also possible:
for ($i = 0; $i <= $#arglist; $i++) {
    print "entry is $arglist[$i]\n";
}
# or
for ($i = 0; $i < @arglist; $i++) {
    print "entry is $arglist[$i]\n";
}
A widely used shortcut for creating a list of strings is the qw operator:
@strlist = qw/item1 item2 item3/;
# equivalent to:
@strlist = ("item1", "item2", "item3");
The qw operator is frequently used in Perl/Tk programming.
Extracting entries from an array is often performed by a list assignment, e.g.,
($filename, $plottitle, $psfile) = @arglist;
This assignment works regardless of the length of @arglist (footnote 6). If @arglist has (say) two elements, $psfile becomes undefined. The final entry on the left-hand side can be an array, e.g.,
($filename, $plottitle, @rest) = @arglist;
# @rest becomes $arglist[2], $arglist[3] and so on
The shift function returns and removes the first array element:
$first_entry = shift @arglist;
The pop function returns and removes the last array element:
$last_entry = pop @arglist;
Without arguments, shift and pop work on @ARGV in the main program and on @_ in subroutines, e.g.,
6 Similar list assignments in Python require that the lists on each side of the assignment operator have equal lengths.
$file = shift; # same as shift @ARGV;
}
Array items can be changed in-place:
# @A is some array of numbers
for ($i=0; $i<=$#A; $i++) {
    if ($A[$i] < 0.0) { $A[$i] = 0.0; }
}
# @A does not contain negative numbers
The following construction also works, i.e., entries in @A are changed (footnote 7):
for $r (@A) {
    if ($r < 0.0) { $r = 0.0; }
}
Perl arrays allow slicing: @arglist[1..3] returns the second up to and including the fourth entry, that is, 1..3 denotes the indices 1-3.
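A few slice expressions, applied to a made-up five-element array, may serve as a sketch of the notation. Note that a slice is written with the @ prefix since it yields a list.

```perl
use strict;
use warnings;

my @arglist = ("a0", "a1", "a2", "a3", "a4");

my @middle = @arglist[1..3];           # ("a1", "a2", "a3")
my @picked = @arglist[0, -1];          # first and last entry
my @tail   = @arglist[2..$#arglist];   # from index 2 to the end

print "@middle | @picked | @tail\n";
```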
Unlike Python, an array assignment like
@a = @b;
creates a new array @a where each element is a copy of the corresponding array element in @b. To make a refer to the array b, as in the Python assignment a = b, we need to let a be a reference:
$a = \@b;
See page 29 for more information about references and how to access the values referred to by $a.
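The distinction between copying an array and referring to it can be demonstrated by changing the original afterwards. In this sketch the names @copy and $ref are chosen for illustration.

```perl
use strict;
use warnings;

my @b = (1, 2, 3);

my @copy = @b;    # independent copy of the elements
my $ref  = \@b;   # reference to the same array

$b[0] = 100;      # change the original array

print "copy: $copy[0]\n";               # the copy is unaffected: 1
print "ref:  $ref->[0]\n";              # the reference sees the change: 100
print "also: $$ref[0], ${$ref}[0]\n";   # equivalent dereferencing syntaxes
```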
Reversing the order of the entries in an array is performed by the reverse function:
@sorted_list = sort(@list);   # sort in ascending ASCII order
The sort order can be controlled by a user-defined function, e.g.,
7 The similar construction does not work in Python (cf. the example starting on page 87 in [?]).
sub numeric_sort {
    if ($a < $b) { return -1; }
    elsif ($a == $b) { return 0; }
    else { return 1; }
}
@sorted_list = sort numeric_sort @list;
The arguments $a and $b in sort criteria routines are automatically initialized by Perl and used instead of the @_ array for speed. The numeric_sort routine is often required, but writing a separate subroutine is actually not necessary because Perl already has a compound comparison operator <=> that works with numbers:
@sorted_list = sort { $a <=> $b } @list; # numeric sort
The expression $a <=> $b evaluates to −1, 0 or 1, depending on whether $a is less than, equal to, or greater than $b, respectively. The corresponding operator for text is cmp. We refer to the description of the sort function in perldoc perlfunc (or write just perldoc -f sort) for numerous examples on writing customized sort functions, e.g., case-insensitive text comparison.
The perlfunc man page is very useful; if you wonder about the Perl function name for doing a specific task, write perldoc perlfunc and search for keywords in this man page.
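The effect of the sort criterion is easiest to see on a small list. The sketch below, with made-up data, compares the default (string) order with numeric and case-insensitive orders.

```perl
use strict;
use warnings;

my @numbers = (10, 9, 100, 1);
my @default = sort @numbers;                 # string order
my @numeric = sort { $a <=> $b } @numbers;   # ascending numeric order
my @reverse = sort { $b <=> $a } @numbers;   # descending numeric order

my @words  = ("banana", "Apple", "cherry");
my @nocase = sort { lc($a) cmp lc($b) } @words;   # case-insensitive text order

print "@default\n@numeric\n@reverse\n@nocase\n";
```

The default order treats the numbers as text, which is why 100 ends up before 9 in the first list.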
1.4.6 Hash Operations
A hash, also known as associative array in other languages, or dictionary in Python, is a kind of array where the index, called key, can be an arbitrary text. For example, all command-line options to a script could be stored in a hash with the name of the option (without any hyphens) as key:
$cmlargs{'m'} = 1.2;       # or $cmlargs{m} = 1.2;
$cmlargs{'tstop'} = 6.0;   # or $cmlargs{tstop} = 6.0;
This allows for easy processing of a large number of command-line arguments and corresponding script variables. Here is a possible code segment:
# init the entire hash with default values:
# (the entire hash is preceded by %)
%cmlargs = (
    'tstop' => 6.0,
    'm'     => 1.2
);

while (@ARGV) {   # run through all command-line arguments
    $option = shift @ARGV;
    $option = substr($option, 2);   # strip off hyphens (--)
    if (exists($cmlargs{$option})) {
        # next command-line argument is the value:
        $value = shift @ARGV;
        $cmlargs{$option} = $value;
    }
    else {
        die "The option $option is not registered\n";
    }
}
# traverse the hash structure, key by key:
foreach $option (keys %cmlargs) {
    print "cmlargs{'$option'}=$cmlargs{$option}\n";
}
With this technique you could develop various tools for initializing and processing command-line options, and each time you need to add a new variable and a corresponding option to the script, you can simply add one new line to the initialization of the default values in the hash cmlargs.
1.4.7 Splitting and Joining Text
The split function splits a string according to a delimiter string or a regular expression. A common use of split is to split a text into words:
$files = "case1.ps case2.ps case3.ps";
@filenames = split(' ', $files);   # split wrt whitespace
The entries in @filenames become
("case1.ps", "case2.ps", "case3.ps")
The behavior of split(’ ’, $str) is equivalent to str.split() in Python, i.e., whitespace surrounding the words is ignored. Any string delimiter can be used, e.g.,
$files = "case1.ps, case2.ps, case3.ps";
@filenames = split(', ', $files);
results in @filenames as
("case1.ps", "case2.ps", "case3.ps")
The split function can also split with respect to a regular expression, just as re.split in Python, e.g.,
$files = "case1.ps, case2.ps, case3.ps";
@filenames = split(/,\s*/, $files);
This results in the correct split of $files.
(There is a slight difference between Perl and Python when splitting a string with respect to whitespace using the regular expression \s+. Leading and trailing blanks result in an empty string as first and last element in the returned list when using Python, whereas Perl’s split function does not produce an array element corresponding to the trailing blanks.)
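The treatment of leading and trailing whitespace by the two forms of split can be checked with a short sketch; the test string is made up for the example.

```perl
use strict;
use warnings;

my $str = "  case1.ps   case2.ps  ";

my @words  = split(' ', $str);     # leading and trailing whitespace is ignored
my @fields = split(/\s+/, $str);   # leading whitespace yields an empty first field

print scalar(@words), " ", scalar(@fields), "\n";   # prints "2 3"
```

In both cases the trailing blanks produce no element, since split drops trailing empty fields by default.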
The join command is the inverse of split:
@filenames = ("case1.ps", "case2.ps", "case3.ps");
$cmd = "print " . join(" ", @filenames);
yields $cmd as the string "print case1.ps case2.ps case3.ps".
1.4.8 Text Processing
A basic issue in text processing is recognizing and replacing parts of a text. Recognizing text can be done in several ways:
# exact string match:
if ($line eq "double") {   # is $line equal to "double"?

# matching with full regular expressions:
if ($line =~ /double/) {   # does $line contain double?
# (here, double can be replaced by any valid regular expression)
Note that in Perl, the comparison operators for strings and numbers are different (footnote 8), e.g., eq and ne for strings vs. == and != for numbers; see also Chapter 1.4.14.
Here is an example of substituting float by double everywhere in a file:
$copyfilename = "$filename.old~~";
rename($filename, $copyfilename);   # take a copy of the file
open(FILE, "<$copyfilename") or die "$0: couldn't open file; $!\n";
$filestr = join("", <FILE>);   # read lines and join them to a string
close(FILE);

$filestr =~ s/float/double/g;   # substitute

open(FILE, ">$filename");   # write to the orig file
print FILE $filestr;        # print the whole (modified) file
close(FILE);
Since the need for such types of file substitutions often arises, Perl offers a one-line statement for accomplishing the task:
perl -pi.old~~ -e 's/float/double/g;' *.c
See page 20 for an explanation of the various parts of this command.
1.4.9 String Operations
Strings in Perl are enclosed in single or double quotes, but the type of quotes affects the string contents, as illustrated next. Double quotes enable variable interpolation:
$w = 'World';
$s1 = "Hello, $w!";   # becomes "Hello, World!"
Single quotes preserve $, @, and other special Perl characters:
$s2 = 'Hello, $w!';   # becomes "Hello, $w!"
Multi-line strings are also possible:
$s3 = "ordinary strings can be used
for multi-line text";
8 Python applies == as well as <, <=, >, >= for all data types.
String concatenation is enabled by the dot operator:
$myfile = $filename . '_tmp' . '.dat';
The $myfile variable becomes case1_tmp.dat if $filename is the string case1. Substrings can be extracted by the substr function, e.g.,
$teststr = '0123456789';
# extract 5 characters, starting
# from the beginning of the string:
$strpart = substr($teststr, 0, 5);   # result: '01234'

# another example:
$strpart = substr($teststr, 3, 5);   # result: '34567'

# skipping the first two characters:
$strpart = substr($teststr, 2);      # result: '23456789'

# skipping up to the last three characters:
$strpart = substr($teststr, -3);     # result: '789'
Stripping away leading and trailing blanks in a string is easily carried out by regular expressions:
$line1 =~ s/^\s*//;
$line1 =~ s/\s*$//;
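When the stripping is needed in several places, the two substitutions can be collected in a small subroutine. The following is a sketch; the name trim is hypothetical and not a Perl built-in in the versions assumed here.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the string without surrounding whitespace
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+//;   # strip leading whitespace
    $s =~ s/\s+$//;   # strip trailing whitespace
    return $s;
}

my $line1 = "   some text \t\n";
print "[", trim($line1), "]\n";   # prints "[some text]"
```

Since the argument is copied into $s, the caller's string is left unchanged.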
1.4.10 Environment Variables
The environment variables are stored in a Perl hash called ENV. You can modify, e.g., $ENV{PATH} in the script and it has effect on all child processes (started by calls to the system function, for instance). Here is an example how we can read the PATH environment variable, split it into its various directories, and check each directory if it contains the executable file vtk:
$program = "vtk";
$path = $ENV{PATH};   # /usr/bin:/usr/local/bin:/usr/X11/bin etc.
@paths = split(/:/, $path);
foreach $dir (@paths) {
    if (-d $dir) {
        if (-x "$dir/$program") {
            $program_path = $dir;   # remember where the program was found
            last;                   # stop searching
        }
    }
}
if (defined($program_path)) {
    print "$program found in $program_path\n";
}
else { print "$program not found\n"; }
Note that splitting on a colon is Unix specific. On Windows the separator is a semicolon instead (note that /[:;]/ does not give a cross-platform solution since the colon is used in Windows paths, e.g., C:\). Also note the need for double quotes in the second if test; writing $dir/$program without double quotes would be an invalid mixture of variables and text (the slash), or a division of two text variables. What we need is to construct a new string using variable interpolation.
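One way to avoid hardcoding the separator is to let the standard File::Spec module split the PATH variable and join directory and filename. The sketch below is an illustration of that approach, not the code from the text; the helper name find_program is hypothetical.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the first directory in PATH that holds an
# executable file called $program, or undef if there is none
sub find_program {
    my ($program) = @_;
    foreach my $dir (File::Spec->path()) {   # PATH split in a portable way
        my $candidate = File::Spec->catfile($dir, $program);
        return $dir if -f $candidate and -x $candidate;
    }
    return;
}

my $program = "vtk";
my $program_path = find_program($program);
if (defined($program_path)) {
    print "$program found in $program_path\n";
} else {
    print "$program not found\n";
}
```

On Windows, executables normally carry an extension such as .exe, so the name passed to the helper would have to include it.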
1.4.11 Subroutines
Functions in Perl are called subroutines. Subroutines take the form
}
The arguments are not part of the subroutine heading. Instead, they are available in the array @_. Output variables are transferred to the calling code by returning an appropriate data structure, e.g., a list of the various output quantities. The return statement can be omitted.
A Simple Example of a Subroutine. A subroutine for finding the maximum value of two numbers can be written straightforwardly as follows:
}
The my keyword makes variables local to the subroutine9. Unless you specify a variable with my it is treated as a global variable whose value is visible outside the routine as well. Frequently, one maps the @_ array onto suitable local values using convenient list techniques, e.g.,
my ($a, $b) = @_;
This allows working with scalars, such as $a and $b, instead of the array entries $_[0] and $_[1]. Alternatively, we can extract $a and $b using the shift operator:
my $a = shift;   # same as shift @_;
my $b = shift;
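As a sketch of how the pieces fit together, here is a complete two-argument maximum function that maps @_ onto local variables. The name max2 and the variable names $x and $y are assumptions made for this example; $x and $y are used instead of $a and $b to avoid confusion with the special variables of sort.

```perl
use strict;
use warnings;

# Hypothetical sketch of a two-argument maximum function
sub max2 {
    my ($x, $y) = @_;   # copy the arguments into local variables
    return $x > $y ? $x : $y;
}

print max2(3, 7.5), "\n";   # prints 7.5
```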
9 See [?] for a precise explanation of the my keyword.
Variable Number of Arguments. Here is a subroutine statistics, with a variable number of arguments, which returns a list containing the average and the minimum and maximum value of all the arguments:
($avg, $min, $max) = statistics($v1, $v2, $v3, $b);   # usage

sub statistics {
    # arguments are available in the array @_
    my $avg = 0; my $n = 0;   # local variables
    foreach $term (@_) { $n++; $avg += $term; }
    $avg = $avg / $n;

    my $min = $_[0]; my $max = $_[0];
    shift @_;   # swallow first arg., it's already treated
    foreach $term (@_) {
        if ($term < $min) { $min = $term; }
        if ($term > $max) { $max = $term; }
    }
    return ($avg, $min, $max);
}
Call by Reference. Modifying the arguments inside the subroutine, i.e., call by reference, is enabled by working directly on the @_ array. For example,
swap($v1, $v2);   # swap the values of $v1 and $v2

sub swap {
    my $tmp = $_[0];
    $_[0] = $_[1];
    $_[1] = $tmp;
}
That is, @_ contains references to the variables used in the subroutine call (footnote 10). We remark that the swap function is just an example of call by reference; the elegant Perl way of swapping two variables reads ($v2,$v1)=($v1,$v2).
One can also pass references to variables to subroutines and in this way get the effect of call by reference. A reference to a variable $a reads \$a. Having the reference as a variable $a_ref, we can extract its value by ${$a_ref}. We may then write the swap function as
}
swap(\$v1, \$v2);
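A swap function working on explicit references can be sketched as follows. This is an illustration under our own naming: swap_refs, $x_ref, and $y_ref are hypothetical names, and the sample values are made up.

```perl
use strict;
use warnings;

# Hypothetical sketch: swap two scalars through references to them
sub swap_refs {
    my ($x_ref, $y_ref) = @_;
    my $tmp = ${$x_ref};     # value referred to by $x_ref
    ${$x_ref} = ${$y_ref};   # assign through the references
    ${$y_ref} = $tmp;
    return;
}

my ($v1, $v2) = (1.3, "some text");
swap_refs(\$v1, \$v2);
print "v1=$v1 v2=$v2\n";   # prints "v1=some text v2=1.3"
```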
10 Perl applies call by reference, and copying the arguments in @_ into local variables in a my statement simulates call by value.
Alternatively, we can just swap the references themselves:
}
Another example on using references in Perl appears on page 31.
Keyword Arguments. By using a hash to hold the arguments passed to a subroutine, one can obtain a very readable syntax and the possibility for assigning default values to an arbitrary set of the arguments (footnote 11). Here is an example, where we call a subroutine with two parameters, message and file:
$filename = "my.tmp";
print2file(message => "testing hash args", file => $filename);

sub print2file {
    my %args = (message => "no message",   # default
                file => "tmp.tmp",         # default
                @_);                       # assign and override

    open(FILE,">$args{file}");
    print FILE "$args{message}\n\n";
    close(FILE);
}
Inside the subroutine we first assign default values to the hash entries and thereafter we insert the argument list @_, which can be interpreted as a hash as well. This latter hash might then override our default values. For example, calling
print2file(file => $filename);
leaves $args{message} as no message, but $args{file} is overwritten by the $filename variable inside the print2file subroutine. The use of a hash in subroutine calls also makes the sequence of arguments irrelevant. The technique is used throughout Perl’s Tk module for creating graphical user interfaces (see Chapter 1.7).
Omitting Parentheses in a Call. If a subroutine is declared before you call it, you can omit the parentheses in the call statement, e.g.,
sub myproc {
    my $file1 = shift;   # implicit shift on @_
    my $file2 = shift;
    ...
}
# call myproc without parentheses:
myproc $myfile, "$yourdir/$yourfile";
11 This is the counterpart to Python’s keyword arguments, see page 111 in [?].
All the subroutines in the Perl libraries are declared before you use them so you can omit parentheses if you desire. Here are some examples:
print "No of iterations=$iter\n";
print("No of iterations=$iter\n");

open TMPFILE, ">$tmpfile";
open(TMPFILE, ">$tmpfile");

system "simulator -q 1.2";
system("simulator -q 1.2");
Multiple Arrays as Arguments. If you want to send several arrays to a subroutine, you need to explicitly pass references to the arrays. Otherwise, one cannot detect where one array stops and the next starts in @_. We shall now show an example where we transfer two arrays to a subroutine and print them out simultaneously in a nice format:
@curvelist = ('curve1', 'curve2', 'curve3');
@explanations = ('initial shape of u',
                 'initial shape of H',
                 'shape of u at t=2.5');

# send the two arrays to displaylist, using references
# (\@list is a reference to the array @list):
displaylist(list => \@curvelist, help => \@explanations);
The implementation of the displaylist routine, taking two array arguments transferred by references, is listed next.
sub displaylist {
    my %args = (@_);
    # extract the two lists from the two references:
    my $list_ref = $args{'list'};   # extract reference
    my @list = @$list_ref;          # extract array from reference
    my $help_ref = $args{'help'};   # extract reference
    my @help = @$help_ref;          # extract array from reference

    my $index = 0; my $item;
    for $item (@list) {
        printf("item %d: %-20s description: %s\n",
               $index, $item, $help[$index]);
        $index++;
    }

    # Alternative, without lots of local variables:
    $index = 0;
    for $item (@{$args{'list'}}) {
        printf("item %d: %-20s description: %s\n",
               $index, $item, ${$args{'help'}}[$index]);
        $index++;
    }
}
The output of displaylist looks like this:
item 0: curve1               description: initial shape of u
item 1: curve2               description: initial shape of H
item 2: curve3               description: shape of u at t=2.5
We refer to the Pass by Reference section of perldoc perlsub (or the equivalent text in [?, p. 116-118]) for more information.
1.4.12 Nested, Heterogeneous Data Structures
The problems with displaylist and the need for references also occur in nested, heterogeneous data structures. Say we want a list such as the curves1 list on page 88 in [?]. In Perl we could build some of its components first, which are straight arrays:
@point1 = (0,0);
@point2 = (0.1,1.2);
@point3 = (0.3,0);
@point4 = (0.5,-1.9);
A list of these points must be a list of references to @point1, @point2, etc.:
@points = (\@point1, \@point2, \@point3, \@point4);
Now, suppose we have an array @xy1 similar to @points. The curves1 array is supposed to contain a string, @points, another string, and @xy1. Again, references are required to avoid “flattening” the structure:
@curves1 = ("u1.dat", \@points, "H1.dat", \@xy1);
It is tedious to write the sublists as separate variables, so we can instead write
@curves1 = ("u1.dat", [[0,0], [0.1,1.2], [0.3,0], [0.5,-1.9]], "H1.dat", \@xy1);
That is, a list in square brackets provides a reference to an array. Indexing is performed with a syntax similar to Python. For example,
$a = $curves1[1][1][0];
yields $a as 0.1.
Nested data structures in Perl must make use of references, and it can be troublesome to debug such structures. The Data::Dumper module converts Perl data structures to readable strings: print Dumper(@curves1) results in the present case in
$VAR1 = 'u1.dat';
$VAR2 = [
The Data::Dumper module supports lots of output formats, see perldoc Data::Dumper. More information about references can be found in perldoc perlreftut.
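As a small sketch of indexing and dumping such a structure, the script below rebuilds curves1 with made-up values for @xy1 (the original contents of @xy1 are not given). It assumes only the standard Data::Dumper module and its $Data::Dumper::Indent setting.

```perl
use strict;
use warnings;
use Data::Dumper;

my @xy1 = ([0, 1], [1, 2]);      # made-up data for illustration
my @curves1 = ("u1.dat",
               [[0,0], [0.1,1.2], [0.3,0], [0.5,-1.9]],
               "H1.dat",
               \@xy1);

# The arrow between subscripts is optional; these are equivalent:
my $a1 = $curves1[1][1][0];
my $a2 = $curves1[1]->[1]->[0];

# Number of points in the first curve (dereference the array ref):
my $npoints = scalar @{ $curves1[1] };

# Passing a reference yields a single $VAR1 holding the whole
# structure, instead of one $VARn per top-level element:
$Data::Dumper::Indent = 1;       # more compact indentation
my $dump = Dumper(\@curves1);
print $dump;
```

Dumping a reference rather than the array itself often gives output that is easier to paste back into a script.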
1.4.13 Testing a Variable’s Type
An ordinary Perl variable is either a scalar, an array, or a hash. The prefix determines the type of the variable, so the variable name together with its prefix shows its type; there is no need to test the variable’s type (as in Python). Writing
$var = 1;                            # scalar
@var = (1, 2);                       # array
%var = (key1 => 1, key2 => 'two');   # hash
creates three different Perl variables. Every time we use one of the variables, the prefix immediately shows its type.
However, when working with references the prefix is always a dollar. The function ref can be used to test what kind of underlying data structure the reference is pointing to. The return value in a scalar context is a string, like ’SCALAR’, ’ARRAY’, or ’HASH’. In a boolean context, ref returns true if its argument is a reference:
if (ref($r) eq "HASH") {   # test return value
    print "r is a reference to a hash.\n";
}
unless (ref($r)) {         # use in boolean context
    print "r is not a reference at all.\n";
}
The ref function is handy when you work with nested, heterogeneous data structures. See perldoc -f ref and perldoc perlref for more information.
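A typical use of ref is a recursive walk through a nested structure. The sketch below uses a hypothetical helper, leaves, that collects all ordinary scalars found inside nested array and hash references; the sample data are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: returns all non-reference values found in a
# nested structure of array and hash references.
sub leaves {
    my ($item) = @_;
    my $type = ref($item);
    if ($type eq 'ARRAY') {
        return map { leaves($_) } @$item;
    } elsif ($type eq 'HASH') {
        return map { leaves($item->{$_}) } sort keys %$item;
    } elsif ($type) {
        return ();            # other reference types are skipped here
    }
    return ($item);           # ordinary scalar
}

my $data = ["u1.dat", [[0, 0], [0.1, 1.2]], { file => "H1.dat" }];
my @flat = leaves($data);
print join(", ", @flat), "\n";
```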
1.4.14 Numerical Expressions
Perl supports the same numerical expressions as C. Strings are automatically transformed to numbers when required:
$b = 1.2;       # b is a number
$b = "1.2";     # b is a string
$a = 0.5 * $b;  # b is converted to a real number before mult.

if ($b < 100) { print "ok\n"; } else { print "error!\n"; }  # prints "ok"
In the last test, the < operator works on numbers, and $b is interpreted as a number (<, >, ==, !=, etc. are the comparison operators for numbers, whereas strings must be compared with lt, gt, eq, ne, etc.).
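The choice of operator, not the variable, decides whether a comparison is numerical or textual. A short sketch with made-up values shows how the two families of operators can disagree on the same data:

```perl
use strict;
use warnings;

my ($x, $y) = ("10", "9");

my $numeric_less = ($x < $y)  ? 1 : 0;   # 10 < 9 as numbers: false
my $string_less  = ($x lt $y) ? 1 : 0;   # "10" lt "9" as strings: true

# Sorting shows the same distinction:
my @values     = ("10", "9", "100");
my @as_strings = sort { $a cmp $b } @values;   # 10 100 9
my @as_numbers = sort { $a <=> $b } @values;   # 9 10 100

print "numeric: $numeric_less, string: $string_less\n";
print "@as_strings\n@as_numbers\n";
```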
1.4.15 Listing of Files in a Directory
The following statements return a list of files (in the current working directory) having extensions .ps or .gif:
@filelist = glob("*.ps *.gif");
# alternative:
@filelist = <*.ps *.gif>;
A more sophisticated glob function is also available, see perldoc File::Glob.
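The sketch below tries the file listing in a scratch directory, so it can be run anywhere. The file names are made up, and the brace pattern and the readdir alternative are our additions to the glob call shown above.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create a scratch directory with a few empty files (made-up names):
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(plot1.ps plot2.gif notes.txt)) {
    open(my $fh, '>', "$dir/$name") or die "Cannot create $name: $!";
    close($fh);
}

# glob accepts csh-style braces, and a leading directory may be given.
# (glob splits its argument on whitespace, so this assumes that
# $dir contains no blanks.)
my @by_glob = sort { $a cmp $b } glob("$dir/*.{ps,gif}");

# Alternative: read the directory and filter the names with a regex.
opendir(my $dh, $dir) or die "Cannot open $dir: $!";
my @by_readdir = sort { $a cmp $b } grep { /\.(ps|gif)$/ } readdir($dh);
closedir($dh);

print "$_\n" for @by_glob;
print "@by_readdir\n";
```

Note that glob returns the names with the directory prefix, whereas readdir returns bare file names.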
1.4.16 Testing File Types
Perl supports a range of tests for classifying files:
if (-f $myfile) { print "$myfile is a plain file\n"; }
if (-d $myfile) { print "$myfile is a directory\n"; }
if (-x $myfile) { print "$myfile is executable\n"; }
if (-z $myfile) { print "$myfile is empty (zero size)\n"; }
if (-T $myfile) { print "$myfile is a text file\n"; }
if (-B $myfile) { print "$myfile is a binary file\n"; }
There are also tests for the size and age of a file:
$size = -s $myfile;
$days_since_last_access = -A $myfile;
$days_since_last_modification = -M $myfile;
See perldoc perlfunc and search for -f, -d, and so on for information about file tests.
The stat function gives more detailed results about a file:
($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
 $atime,$mtime,$ctime,$blksize,$blocks) = stat($myfile);
A quote from the description of stat in the man page perlfunc explains what the various list entries above mean:
 0 dev      device number of filesystem
 1 ino      inode number
 2 mode     file mode (type and permissions)
 3 nlink    number of (hard) links to the file
 4 uid      numeric user ID of file’s owner
 5 gid      numeric group ID of file’s owner
 6 rdev     the device identifier (special files only)
 7 size     total size of file, in bytes
 8 atime    last access time since the epoch
 9 mtime    last modify time since the epoch
10 ctime    inode change time (NOT creation time!) since the epoch
11 blksize  preferred block size for file system I/O
12 blocks   actual number of blocks allocated
There is an alternative stat function in the File::stat module, see perldoc File::stat.
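A brief sketch of the File::stat interface follows. It works on a temporary file created for the occasion (the file contents are made up) and compares the object-oriented accessors with the file test operator and the built-in list form.

```perl
use strict;
use warnings;
use File::stat;
use File::Temp qw(tempfile);

# Write 11 bytes to a scratch file:
my ($fh, $myfile) = tempfile(UNLINK => 1);
print $fh "hello world";
close($fh);

my $is_plain = (-f $myfile) ? 1 : 0;
my $size     = -s $myfile;              # size in bytes

# File::stat overrides stat to return an object with named fields:
my $st = stat($myfile) or die "Cannot stat $myfile: $!";
printf("size=%d bytes, modified %s\n",
       $st->size, scalar localtime($st->mtime));

# The built-in list form remains available as CORE::stat:
my $size_from_list = (CORE::stat($myfile))[7];
```

Named accessors such as size and mtime avoid having to remember the position of each entry in the 13-element list.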
1.4.17 Copying and Renaming Files
Renaming a file is simple:
rename($myfile, "tmp.1"); # rename $myfile to tmp.1
Moving files across file systems is reliably done with the move function in Perl’s File::Copy library:
use File::Copy;
move($myfile, "/work/temp") or die "Could not rename file\n";
Copying a file $file to a file $tmpfile is performed with the copy function in the File::Copy library:
use File::Copy;
copy($file, $tmpfile);
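The three operations can be tried safely in a scratch directory. In the sketch below the file names are made up, and every call is checked, since copy, move, and rename all return false and set $! on failure.

```perl
use strict;
use warnings;
use File::Copy;
use File::Temp qw(tempdir);

my $dir     = tempdir(CLEANUP => 1);
my $file    = "$dir/data.txt";        # made-up file names
my $tmpfile = "$dir/data.bak";

open(my $fh, '>', $file) or die "Cannot write $file: $!";
print $fh "1 2 3\n";
close($fh);

# copy, move, and rename return true on success and set $! on failure:
copy($file, $tmpfile)      or die "copy failed: $!";
move($tmpfile, "$dir/old") or die "move failed: $!";
rename($file, "$dir/new")  or die "rename failed: $!";

my @present = grep { -e "$dir/$_" } qw(data.txt data.bak old new);
print "files now present: @present\n";
```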
1.4.18 Creating and Moving to Directories
Creating a directory and moving to a directory are tasks performed with the mkdir and chdir functions, respectively:
use Cwd;
$origdir = cwd;        # remember where we are
$dir = "../mynewdir";
mkdir($dir, 0755) or die "$0: couldn't create dir; $!\n";
chdir($dir);
...
chdir($origdir);       # move back to the original directory
chdir;                 # move to your home directory ($ENV{HOME})
Suppose you want to create a new directory perl/projects/test1 in your home directory, but none of perl, projects, and test1 exists. Instead of issuing repeated mkdir commands, you can use the mkpath command from the File::Path module to create the whole path in one statement:
use File::Path;
mkpath("$ENV{HOME}/perl/projects/test1");
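The following sketch exercises mkpath and chdir without touching the real home directory: a temporary directory (our assumption, created with File::Temp) stands in for $ENV{HOME}.

```perl
use strict;
use warnings;
use Cwd;
use File::Path;
use File::Temp qw(tempdir);

my $origdir = cwd;
my $root    = tempdir(CLEANUP => 1);   # stand-in for $ENV{HOME}
my $dir     = "$root/perl/projects/test1";

# mkpath creates all missing intermediate directories and returns
# the list of directories it actually created:
my @created = mkpath($dir);

chdir($dir) or die "$0: couldn't move to $dir; $!\n";
my $inside = cwd;
chdir($origdir) or die "$0: couldn't move back; $!\n";

print "created ", scalar(@created), " directories\n";
print "visited $inside\n";
```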
1.4.19 Removing Files and Directories
Single files are removed by the unlink statement, e.g.,
unlink("myfile") or die "Could not remove file\n";
A list of files can also be transferred to unlink:
unlink(@files);
unlink(glob("*.ps *.gif"));
unlink "myfile", 'yourfile', @thosefiles, "$file.tmp" or
    die "Could not remove files\n";
Frequently, one wants to remove a directory tree, possibly full of files, an action that requires the rmtree function from the File::Path library:
use File::Path;
rmtree("mydir");
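Because unlink returns the number of files it removed, a partial failure can be detected by comparing that number with the number of files requested. The sketch below, with made-up file names in a scratch directory, shows this check together with rmtree.

```perl
use strict;
use warnings;
use File::Path;
use File::Temp qw(tempdir);

my $root = tempdir(CLEANUP => 1);
mkpath("$root/mydir/sub");
my @files = map { $root . "/mydir/$_" } qw(a.ps b.gif sub/c.ps);
for my $f (@files) {
    open(my $fh, '>', $f) or die "Cannot create $f: $!";
    close($fh);
}

# unlink returns the number of files removed, so comparing with the
# number requested detects partial failures ("or die" alone only
# triggers when nothing at all was removed):
my @doomed  = ($files[0], "$root/mydir/no_such_file");
my $removed = unlink(@doomed);
warn "removed $removed of ", scalar(@doomed), " files\n"
    if $removed != @doomed;

# rmtree removes the directory and everything below it:
rmtree("$root/mydir");
print "mydir is gone\n" unless -e "$root/mydir";
```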
1.4.20 Splitting Pathnames
Let $fname be a filename containing a possibly long path, e.g.,
$fname = "/usr/home/hpl/scripting/perl/intro/hw2a.pl";
Occasionally, one wants to split this filename into the basename hw2a.pl and the directory name /usr/home/hpl/scripting/perl/intro:
use File::Basename;
$basename = basename($fname);
$dirname = dirname($fname);
One can also extract the base of the basename, hw2a, either by substituting the file extension by an empty string,
$base = $basename;
$base =~ s/\.pl$//;
or by the fileparse function:
($base, $dirname, $extension) = fileparse($fname,".pl");
The fileparse function can take an arbitrary number of possible extensions.
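The suffixes given to fileparse are matched as patterns against the end of the filename, so a regular expression can stand in for a fixed extension. The sketch below uses the filename from above plus a made-up archive name to show two variants.

```perl
use strict;
use warnings;
use File::Basename;

my $fname = "/usr/home/hpl/scripting/perl/intro/hw2a.pl";

# Suffixes are patterns; qr/\.[^.]*/ matches any final extension.
my ($base, $dirname, $extension) = fileparse($fname, qr/\.[^.]*/);

# Several candidate suffixes may be listed (escape the dot, since
# an unescaped "." would match any character):
my ($base2, undef, $ext2) =
    fileparse("archive.tar.gz", qr/\.tar\.gz/, qr/\.gz/);

print "$base | $dirname | $extension\n";
print "$base2 | $ext2\n";
```

Unlike dirname, fileparse returns the directory part with its trailing slash.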
1.4.21 Traversing Directory Trees
The very useful Unix find command can be implemented in a cross-platform fashion in Perl using the File::Find library and its find function. The basic recipe for using Perl’s find goes as follows.
use File::Find;
# run through directory trees dir1, dir2, and dir3, and
# for each file call the user-provided subroutine ourfunc:
find(\&ourfunc, "dir1", "dir2", "dir3");

sub ourfunc {
    # process the current file; its name is available in $_
}
We shall now implement a script that lists all files larger than 1Mb in the home directory tree. The easiest way to extract the size of a file is to write
$size = -s $file;
#!/usr/bin/perl
use File::Find;

find(\&printsize, $ENV{HOME});  # traverse home-directory tree

sub printsize {
    $file = $_;  # more descriptive variable name...
    if (-f $file) {  # is $file a plain file, not a directory?
        $size = -s $file;  # or $size = (stat($file))[7];
        if ($size > 1000000) {
            printf("%.1fMb %s in %s\n", $size/1000000.0, $file,
                   $File::Find::dir);
        }
    }
}
We recommend reading perldoc File::Find to see the many possibilities that Perl’s find function offers.
There is a program find2perl that translates a Unix find command into the equivalent Perl program. The resulting program is not always easy to read for newcomers to Perl, so writing the Perl script yourself gives better control of what you want to do. In the present example you can try
find2perl $HOME -name '*' -type f -size +2000 -exec ls -s {} \;
and realize that the resulting code has 55 (!) lines and is less cross-platform than our hand-coded version.
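Instead of printing inside the callback, one can pass find an anonymous subroutine that stores its results in a variable of the surrounding scope. The sketch below is one such variant. It runs on a small scratch tree with made-up file names and a made-up threshold of 1000 bytes, so it does not depend on the contents of the home directory.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a small scratch tree (made-up names and sizes):
my $root = tempdir(CLEANUP => 1);
mkdir("$root/sub") or die "mkdir failed: $!";
my %bytes = ("small.dat" => 10, "sub/big.dat" => 5000);
for my $name (keys %bytes) {
    open(my $fh, '>', "$root/$name") or die "Cannot create $name: $!";
    my $content = "x" x $bytes{$name};
    print $fh $content;
    close($fh);
}

# An anonymous subroutine (closure) can collect results in a
# lexical variable instead of printing inside the callback.
my $limit = 1000;                   # size threshold in bytes
my %large;                          # full path => size
find(sub {
         return unless -f $_;       # $_ is the name within the current dir
         my $size = -s $_;
         $large{$File::Find::name} = $size if $size > $limit;
     }, $root);

for my $path (sort { $large{$b} <=> $large{$a} } keys %large) {
    printf("%8d bytes  %s\n", $large{$path}, $path);
}
```

Collecting the results first makes it easy to sort or filter them before any output is produced.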
1.4.22 Downloading Internet Files
The libwww-perl package contains numerous modules and scripts for working with the World Wide Web. You can easily test if libwww-perl is already installed on your system by trying
perl -e 'use LWP::Simple'
If this one-liner gives an error message, you need to get libwww-perl from CPAN (see page 54).
The Perl script lwp-download (from the libwww-perl package) fetches a single file whose URL is known:
lwp-download http://www.ifi.uio.no/~hpl/downloadme.dat
The script looks at the file contents and creates a suitable local filename for the copy. In this case, downloadme.dat is a text file that lwp-download stores as downloadme.dat.txt. A second argument to lwp-download can be used to specify a local filename.
Inside a Perl script we can easily copy a file, given as a URL, to a local file:
use LWP::Simple;
$URL = "http://www.ifi.uio.no/~hpl/downloadme.dat";
getstore($URL, "downloadme.dat");
# copy only if local file is not up-to-date:
mirror($URL, "downloadme.dat.pl");
or we can load the remote file directly into a string and split it into an array of lines:
$content = get($URL);
@lines = split(/\n/, $content);
The URL in these examples could also have been an ftp address, e.g.,
ftp://ftp.ifi.uio.no/pub/blab/xite/xite3_4.tar.gz
1.4.23 CPU-Time Measurements
Measurement of elapsed time in Perl can be done with the time function:
$t0 = time;  # elapsed time in seconds since the epoch
# do tasks...
$elapsed_time = time - $t0;
Because time is measured in whole seconds, you need to perform efficiency tests that last several seconds. Timing with finer resolution is possible, see the Perl FAQ: perldoc -q 'time under a second'.
Throughout this section we assume that the reader is familiar with terms like epoch, elapsed time, system time, CPU time, and the difference between children and parent processes, as briefly explained in Chapter 8.10.1 (CPU-Time Measurements) in [?].
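One way to obtain finer resolution is the standard Time::HiRes module, which can replace time by a floating-point version. A minimal sketch, where a short sleep stands in for the task to be timed:

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);   # replaces time and sleep in this file

my $t0 = time;                    # floating-point seconds since the epoch
sleep(0.25);                      # stand-in for the task to be timed
my $elapsed_time = time - $t0;

printf("elapsed: %.3f s\n", $elapsed_time);
```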
A more sophisticated function times returns an array with four entries. The first two represent the user and system times of the current process while the next two contain the user and system times of the current process’ child processes.
@t0 = times;
# do tasks...
system "$time_consuming_command";  # child process
@t1 = times;
$user_time = $t1[0] - $t0[0];
$system_time = $t1[1] - $t0[1];
$cpu_time = $user_time + $system_time;
$cpu_time_system_call = $t1[2] - $t0[2] + $t1[3] - $t0[3];
There is also a higher-level module Benchmark, based on the time and times functions, with various support for timing of Perl scripts. The usage goes as follows.
use Benchmark;
$t0 = new Benchmark;
# do some tasks...
$t1 = new Benchmark;
$td = timediff($t1, $t0);  # time difference between $t0 and $t1
$nice_td_formatting = timestr($td, 'noc');
print "tasks: $nice_td_formatting\n";
The output looks like this:
tasks: 9 wallclock secs( 3.12 usr + 0.10 sys = 3.22 CPU)
The Benchmark module also has a function timeit that runs a piece of Perl code a specified number of times:
use Benchmark;
print "100 runs took ", timestr(timeit(100, \&somefunc)), "\n";
We refer to perldoc Benchmark for more details about this module.
From a pedagogical point of view it might be instructive to write a function like timeit in the Benchmark module. Doing this we also have the possibility of tailoring such a timing function to suit our needs. The function, here called timer, can take four arguments: (i) a function to call, (ii) a list of arguments to be used in the function to call, (iii) the number of call repetitions, and (iv) the name of the function to call. In Perl we would represent the first two arguments by a function reference and a reference to a list. The complete function could then take the following form:
sub timer {
    my ($func_ref, $args_ref, $repetitions, $func_name) = @_;
    my $t0 = time;                 # initial elapsed time
    my ($u0, $s0, $rest) = times;  # initial user and system time
    for (my $i = 0; $i < $repetitions; $i++) {
        &$func_ref(@$args_ref);
    }
    my @t1 = times;
    printf("$func_name: elapsed=%g, CPU=%g\n",
           time - $t0, $t1[0] - $u0 + $t1[1] - $s0);
}
The similar Python function is presented in Chapter 8.10.1 (CPU-Time Measurements) in [?].
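To illustrate how such a timing function is called, the sketch below defines a variant, here named timer2, that returns the elapsed and CPU times instead of printing them. Both timer2 and the function being timed, sum_squares, are made up for this example.

```perl
use strict;
use warnings;

# Variant of the timer idea that returns (elapsed, CPU) instead of
# printing; the name timer2 and the test function are made up.
sub timer2 {
    my ($func_ref, $args_ref, $repetitions) = @_;
    my $t0 = time;
    my ($u0, $s0) = times;
    for (my $i = 0; $i < $repetitions; $i++) {
        $func_ref->(@$args_ref);      # same as &$func_ref(@$args_ref)
    }
    my ($u1, $s1) = times;
    return (time - $t0, ($u1 - $u0) + ($s1 - $s0));
}

# A function to be timed: sum of squares of 1..n
sub sum_squares {
    my ($n) = @_;
    my $s = 0;
    $s += $_ * $_ for 1 .. $n;
    return $s;
}

# The arguments are passed as a reference to an anonymous array:
my ($elapsed, $cpu) = timer2(\&sum_squares, [1000], 200);
printf("sum_squares: elapsed=%g, CPU=%g\n", $elapsed, $cpu);
```

Returning the numbers rather than printing them lets the caller compare several candidate functions in the same script.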
1.4.24 Programming with Classes
Classes are implemented in Perl using quite advanced concepts like references and packages. Although Perl fans claim that classes in Perl are much more flexible than those in C++ and Java, there is no doubt that programming with classes is more awkward in Perl than in C++, Java, and Python. Explaining Perl classes in a couple of pages without first covering references and packages is difficult and therefore omitted here.
1.4.25 Debugging Perl Scripts
Unfortunately, Perl is by default quite silent about errors. The following short script, which tries to open a non-existing file, illustrates the point:
perl -w -e 'open(F,"<mynonexistingfile"); close(F);'
Perl executes this script without any error message (see footnote 12). The Fatal module can be used for letting Perl speak up about run-time errors:
perl -e 'use warnings; use strict; use diagnostics; \
         use Fatal qw/open/; local *F; \
         open(F,"<mynonexistingfile"); close(F);'
12 Python provides instructive run-time messages by default in similar examples (and the messages can be turned off by, e.g., appropriate exception handling in the script).
Note that you must list the functions you want to be verbose, here open. The reported error message now contains the helpful message
Can't open(F, <mynonexistingfile): No such file or directory
The use warnings, use strict, and use diagnostics commands can help you detect statements that are candidates for trouble. However, applying use strict to (most of) the Perl scripts in this appendix will result in lots of error messages about lack of the main:: prefix for all global variables or an explicit my or local operator (to make variables local). For quick scripting this can be a bit annoying. When writing larger scripts, on the other hand, use strict is a good habit. Here is a sample code demonstrating some implications of use strict:
use strict;
# introduce the global variable $counter for the first time:
$counter = 1;          # generates error message
$main::counter = 1;    # ok, explicit indication of package name
my $counter = 1;       # ok, localizing $counter with the my operator
my $counter; $counter = 1;  # equiv. with the line above
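The main practical benefit of use strict is that misspelled variable names are caught at compile time. In the sketch below the faulty code is kept in a string and compiled with eval, an arrangement chosen only so that the error can be captured and shown; the variable names are made up.

```perl
use strict;
use warnings;

# The code to be checked is kept in a string so that the failure can
# be caught with eval instead of aborting this script.
my $code = <<'END';
use strict;
my $counter = 1;
$countr = $counter + 1;    # typo: $countr was never declared
1;
END

my $ok = eval $code;
my $message = $ok ? "no error\n" : $@;
print "strict reports: $message";
```

Without use strict, the misspelled $countr would silently become a new global variable.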
The reader is encouraged to take a look at the man pages for the Fatal, strict, and diagnostics modules. For details on warnings, see perldoc warnings and man perllexwarn.
Inserting print statements on the fly in the code is an efficient and widely used debugging method among Perl programmers. Alternatively, the -d option to a Perl script enables you to interactively debug the script through a command-line debugger,
perl -d -w mybuggyscript.pl
The -w option turns on many useful warnings about, e.g., unused variables. The most important commands inside the debugger are s for single step, n for single step without stepping into subroutines, x for pretty-print of data structures and variables, and b 85 for setting a break point at line 85. More detailed information is provided by perldoc perldebug.
There is a Perl/Tk GUI for the Perl debugger, available in the module ptkdb. Invoke the debugger by
perl -d:ptkdb -w mybuggyscript.pl
There are several Perl debuggers with graphical interfaces, check out the links in the Perl resources section in doc.html.
Another Perl module is Devel::Trace, which prints each statement prior to executing it (the same effect as the -x option to Unix shell scripts).
1.4.26 Regular Expressions
The material on regular expressions explained in a Python context in Chapter 8.2 (Regular Expressions and Text Processing) in [?] carries over to Perl, but the surrounding Perl code is different. To test if a string $str matches a regular expression contained in a string $pattern, one writes
if ($str =~ /$pattern/) { ... }
$str = "myfile.tmp";
if ($str =~ /\.tmp$/) { print "$str has extension .tmp"; }
Backslashes and special symbols are preserved in text enclosed in forward slashes /.../, as in Python raw strings. However, if the regular expression is to be stored in a double-quoted string, backslashes and special Perl characters must be preceded by a backslash:
$pattern = "\\.tmp\$";
if ($str =~ /$pattern/) { print "$str has extension .tmp"; }
With single-quoted strings a backslash is a backslash, but Perl’s variable interpolation cannot be used.
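An alternative that avoids the doubled backslashes is Perl's qr// operator, which stores a compiled regular expression in a scalar. The sketch below applies it to a few made-up filenames.

```perl
use strict;
use warnings;

my $pattern = qr/\.tmp$/;        # compiled regex; no doubled backslashes

my @names   = ("myfile.tmp", "tmp.dat", "a.tmp.old");
my @matches = grep { $_ =~ $pattern } @names;

# A qr// object can be interpolated into a larger expression:
my @stems = map { /^(.*)$pattern/ ? $1 : () } @names;

print "@matches\n@stems\n";
```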
Pattern-Matching Modifiers. Perl offers pattern-matching modifiers to adjust the meaning of the dot, ^, $, whitespace, etc. The syntax for applying a pattern-matching modifier is like
if ($str =~ /$pattern/q) { ... }
where q denotes one or more single-character pattern-matching modifiers from the following list:
i   case-insensitive matching
g   match globally, i.e., find all occurrences
s   let . match newline as well
m   treat string as multiple lines, i.e., change ^ and $ from
    matching at only the very start or end of the string to the
    start or end of any line anywhere within the string (a line
    is from a newline to the next newline)
x   extend the pattern’s legibility by permitting whitespace
    and comments
o   compile pattern once only (for increased efficiency)
The o modifier is a counterpart to compiling regular expressions in Python.
We can use other delimiters than forward slashes if the /.../ group is preceded by an m, e.g.,
$found = 1 if $path =~ m#/usr/local/bin#;
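Several modifiers can be combined. As a small sketch on a made-up multi-line string, the first pattern below uses g, i, m, and x together with curly braces as delimiters, and the second pair of tests shows the effect of s.

```perl
use strict;
use warnings;

my $text = "Time: 0.5\nTIME: 2.25\nsteps: 40\n";

# i: ignore case, m: let ^ match at the start of every line,
# g: find all matches, x: allow whitespace and comments in the pattern
my @times = $text =~ m{
    ^time: \s*      # the keyword at the start of a line
    (\d+\.\d*)      # capture a decimal number
}gimx;

# s: let the dot match newlines, so .* can span several lines
my $spans_lines = ($text =~ /Time.*steps/s) ? 1 : 0;
my $single_line = ($text =~ /Time.*steps/)  ? 1 : 0;

print "@times\n$spans_lines $single_line\n";
```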
Extracting Multiple Matches. Suppose you have a string with several numbers. To extract all numbers from this string, without knowing how many numbers there may be, we can apply the following Perl construct (see footnote 13):
13 This construct is a counterpart to Python’s findall function in the re module.
$s = "3.29 is a number, 4.2 and 0.5 too";
@n = $s =~ /\d+\.\d*/g;
The array @n now contains the ent