Programming Part 3 Introduction to Perl

Programming Part 3 · 2019-09-19


Perl

• Like so many names in computer science, Perl is commonly expanded as an acronym: Practical Extraction and Report Language (you may now forget this)

• Perl has several advantages for us:

– can handle large amounts of data

– includes a rich set of functions for analysis of string data, and in particular pattern detection

– syntax (language rules) is relatively flexible; more forgiving of variation than many other programming languages


A simple Perl program

#!/usr/bin/perl -w
# Chapter 1 - Exercise 1
print "Enter single DNA strand: ";
my $dnaseq = <STDIN>;
chomp $dnaseq;
print "\nOpposite strand: ";
for (my $i=0;$i<length($dnaseq);$i++) {
    my $nucleo = substr($dnaseq, $i, 1);
    if ($nucleo eq "A") {print "T";}
    elsif ($nucleo eq "C") {print "G";}
    elsif ($nucleo eq "G") {print "C";}
    else {print "A";}
}


Running the program


What’s going on here?

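• Lines that begin with the ‘#’ character are comments:

– they provide opportunity to explain something

– they are not code; the computer ignores them

– examples:

#!/usr/bin/perl -w

# Chapter 1 - Exercise 1

– the first example is a special case: Perl treats it as a comment, but on Unix-like systems this first line also tells the operating system which interpreter to run, and Perl reads the -w switch (turn on warnings) from it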


Output statements

• The print command sends output to the screen

• The data to be printed appears in quotes after the name of the command; examples:

print "Enter single DNA strand: ";

print "\nOpposite strand: ";

print "T";


Input statements and assignment

• Input statements read data from external sources; in our example, we’re reading input from the keyboard; example:

my $dnaseq = <STDIN>;

• Assignment statements assign values to variables; the statement above includes an assignment operation, as do the statements below:

my $i=0;

my $nucleo = substr($dnaseq, $i, 1);


Declaring variables

• There are three kinds of variables in Perl:

– Scalar variables (declared with $)

– Arrays (declared with @)

– Hashes (declared with %)

• Variables may be declared and assigned values in the same statement; this is the case with all of the examples so far
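
As a minimal sketch of the three kinds of variables, the block below declares one of each and assigns it a value in the same statement. The names @offsets and %complement and all of the values are made up for illustration; they do not come from the example program.

```perl
use strict;
use warnings;

# Scalar: holds a single value (a string or a number).
my $dnaseq = "ATTAGCAG";

# Array: holds an ordered list of values, indexed from 0.
# (Hypothetical values, for illustration only.)
my @offsets = (3, 17, 42);

# Hash: holds key/value pairs, looked up by key.
my %complement = (A => "T", C => "G", G => "C", T => "A");

print "Sequence: $dnaseq\n";
print "Second offset: $offsets[1]\n";
print "Complement of G: $complement{G}\n";
```

Note that a single element of an array or hash is itself a scalar, which is why it is written with $ rather than @ or %.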


Declaring variables

• The line of code below declares a local scalar variable named dnaseq, then assigns it the line typed in at the keyboard (read by the <STDIN> input operator):

my $dnaseq = <STDIN>;

– “my” makes the variable local to the enclosing block (or, outside any block, to the file)

– “$” makes the variable a scalar, that is, a variable that holds a single value (like a Scratch variable)

– “=” is the assignment operator; we read the symbol as “gets”

– the instruction ends with the “;” character


Data types in Perl

• A scalar variable in Perl can store two kinds of data:

– strings

– numbers (integers and real numbers)

• We can assign either kind of data to any scalar variable, although it is useful to store only one type or the other in an individual variable, which is given a name that reflects the kind of data it will hold - this makes the code less confusing
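
The point about naming can be sketched as follows. The variable names $value, $seq_length and $gc_fraction are hypothetical, and the G+C count of 3 is simply read off the sample string by hand.

```perl
use strict;
use warnings;

# The same scalar can hold a string, then an integer, then a real number.
# This is legal, but the name tells the reader nothing about the contents.
my $value = "ATTAGCAG";
$value = 8;
$value = 0.375;

# Clearer: one variable per kind of data, named for what it holds.
my $dnaseq      = "ATTAGCAG";        # string
my $seq_length  = length($dnaseq);   # integer
my $gc_fraction = 3 / $seq_length;   # real number (3 of the letters are G or C)

print "$dnaseq has $seq_length letters; G+C fraction is $gc_fraction\n";
```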


The chomp command

• When a Perl program reads data from the keyboard, every character entered by the user is read, including the newline character created by pressing the Enter key

• The “chomp” command removes the extraneous newline character from the end of the data
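
To see the effect of chomp without typing anything, the sketch below assumes the user entered ATTAGCAG and pressed Enter, and stores that text (newline included) directly in the variable. The names $length_before and $length_after are made up for this illustration.

```perl
use strict;
use warnings;

# Simulate a line as read from the keyboard:
# the typed text plus the newline produced by the Enter key.
my $dnaseq = "ATTAGCAG\n";

my $length_before = length($dnaseq);   # counts the newline too
chomp $dnaseq;                         # removes the trailing newline
my $length_after = length($dnaseq);

print "Before chomp: $length_before characters\n";
print "After chomp:  $length_after characters\n";
```

Without chomp, the loop in the example program would also visit the newline character and print an extra letter for it.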


Perl control structures

• Perl supports both loops and selection structures

• Our example program contains both; in this case, a multiway selection structure contained within a loop:

for (my $i=0;$i<length($dnaseq);$i++) {
    my $nucleo = substr($dnaseq, $i, 1);
    if ($nucleo eq "A") {print "T";}
    elsif ($nucleo eq "C") {print "G";}
    elsif ($nucleo eq "G") {print "C";}
    else {print "A";}
}


Operations on strings

• Perl is well suited for bioinformatics programming because of its rich set of built-in operations on string data; two of these operations are used in the loop:

length($dnaseq) and substr($dnaseq, $i, 1)

• The length operation tells the program the number of characters in the string; we use this to tell when the loop should end

• The substr operation tells the program the content of a segment of the original string


Substrings

• A substring is a section of a string

• Substrings can be any length, from 1 character to the entire length of the original string

• The substr operation (or function) takes in 3 data items (the original string, the starting position of the substring, counted from 0, and the length of the substring) and gives back one: the actual substring found at the given position, of the given length


Examples

• Suppose we have the following variable:

my $name = "Cathleen Mary Ruth Sheller";   # it really is!

then these expressions:     represent these substrings:

substr($name, 1, 3)         "ath"
substr($name, 9, 4)         "Mary"
substr($name, 12, 5)        "y Rut"
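
The three expressions above can be checked with a short script. The variable names $first, $second and $third are made up for this sketch; the square brackets in the output only make the space inside the third substring visible.

```perl
use strict;
use warnings;

my $name = "Cathleen Mary Ruth Sheller";

# Positions are counted from 0, so position 1 is the "a" in "Cathleen".
my $first  = substr($name, 1, 3);
my $second = substr($name, 9, 4);
my $third  = substr($name, 12, 5);

print "[$first] [$second] [$third]\n";   # prints: [ath] [Mary] [y Rut]
```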


The for loop

• A for loop is an example of a count-controlled loop; that is, one that repeats a certain number of times

• The structure of the loop is as follows:

for (my $i=0;$i<length($dnaseq);$i++) {
    # body of loop here
}

– we start by declaring and initializing the counter, or control variable: my $i=0;

– we then test the loop condition; in this case, we want to know whether the counter is still less than the number of characters in the dnaseq string: $i<length($dnaseq);

– if the test succeeds, we perform the code in the body of the loop, that is, the statements between the two braces: { … }

– finally, we increment the counter: $i++, and return to the test; the loop ends the first time the test fails
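
The order of the four steps may be easier to see when the same loop is written as a while loop, with each step on its own line. This is only an equivalent sketch, not part of the example program; the $passes counter is added here just to show how many times the body runs.

```perl
use strict;
use warnings;

my $dnaseq = "ATTAGCAG";
my $passes = 0;

my $i = 0;                        # 1. initialize the counter (once)
while ($i < length($dnaseq)) {    # 2. test: continue while the counter is less than the length
    $passes++;                    # 3. body of the loop
    $i++;                         # 4. increment the counter, then go back to the test
}

print "Body ran $passes times; counter ended at $i\n";
```

One difference from the for version: here $i is declared outside the loop, so it is still available after the loop ends.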


The selection structure

• The code below:

if ($nucleo eq "A") {print "T";}
elsif ($nucleo eq "C") {print "G";}
elsif ($nucleo eq "G") {print "C";}
else {print "A";}

represents a selection structure; the first part of the expression: if ($nucleo eq "A") tests to see if the value in variable nucleo is equal to the string value “A” (eq compares strings; == is used to compare numbers)

if the expression tests true, a T is output to the screen; otherwise, the next expression: elsif ($nucleo eq "C") is tested and, if true, a G is printed

and so on, until the last: else {print "A";} prints out an A if none of the previous expressions tested true


The logic

• As the loop runs, different statements within the selection structure execute

• The next slide shows the loop and selection structure as it executes on the following input string: ATTAGCAG


The logic

dnaseq: ATTAGCAG

for (my $i=0;$i<length($dnaseq);$i++) {
    my $nucleo = substr($dnaseq, $i, 1);
    if ($nucleo eq "A") {print "T";}
    elsif ($nucleo eq "C") {print "G";}
    elsif ($nucleo eq "G") {print "C";}
    else {print "A";}
}

Variable values:     Output:

i    nucleo
0    A               T
1    T               A
2    T               A
3    A               T
4    G               C
5    C               G
6    A               T
7    G               C


Making improvements

• The program correctly produces a DNA string’s complement, provided it is given good data

• What happens if the user types a letter that isn’t A, C, G or T?
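
In the program as written, any character other than A, C or G falls through to the final else, so an X (or a lowercase a) is treated as if it were a T and an A is printed. One possible improvement, sketched below, tests for T explicitly and marks anything else with a "?". The subroutine names complement_base and complement_strand, and the choice of "?" as the marker, are assumptions made for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: complement of one nucleotide,
# or "?" for anything that is not A, C, G or T.
sub complement_base {
    my ($nucleo) = @_;
    if    ($nucleo eq "A") { return "T"; }
    elsif ($nucleo eq "C") { return "G"; }
    elsif ($nucleo eq "G") { return "C"; }
    elsif ($nucleo eq "T") { return "A"; }
    else                   { return "?"; }
}

# Hypothetical helper: builds the complement of a whole strand,
# using the same loop as the example program.
sub complement_strand {
    my ($dnaseq) = @_;
    my $result = "";
    for (my $i = 0; $i < length($dnaseq); $i++) {
        $result = $result . complement_base(substr($dnaseq, $i, 1));
    }
    return $result;
}

print complement_strand("ATTAGCAG"), "\n";   # prints: TAATCGTC
print complement_strand("ATXG"), "\n";       # prints: TA?C
```

Other designs are equally reasonable, for example rejecting the whole input with an error message, or converting lowercase letters to uppercase before the loop.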


Another example

#!/usr/bin/perl -w
# Source: Gibas, Cynthia and Per Jambeck, Developing Bioinformatics Computer Skills,
# O'Reilly, 2001, page 334

my $target = "ACCCTG";
my $search_string =
    'CCACACCACACCCACACACCCACACACCACACCACACACCACACCACACCCACACACA'.
    'CATCTAACACTACCCTAACACAGCCCTAATCTAACCCTGGCCACCTGTCTCTCAACTT'.
    'ACCCTCCATTACCCTGCCTCCACTCGTTACCCTGTCCCATTCAACCATACCATCCGAAC';

my @matches;

foreach my $i (0..length $search_string) {
    if ($target eq substr($search_string, $i, length $target)) {
        push @matches, $i;
    }
}
print "My matches occurred at the following offsets: @matches.\n";
print "done\n";
exit;


Output from example


Extending strings

• This example introduces some new aspects of Perl programming; consider this line:

my $search_string =

'CCACACCACACCCACACACCCACACACCACACCACACACCACACCACACCCACACACA'.

'CATCTAACACTACCCTAACACAGCCCTAATCTAACCCTGGCCACCTGTCTCTCAACTT'.

'ACCCTCCATTACCCTGCCTCCACTCGTTACCCTGTCCCATTCAACCATACCATCCGAAC';

• A string that is too long to fit on a single line can be built by concatenation: three quoted pieces (in this case) are glued together with the ‘.’ operator to make a single string
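
The same technique on a much shorter string, as a sketch: the three fragments below are made up so that the result is easy to check by eye.

```perl
use strict;
use warnings;

# Three short pieces joined with the '.' operator.
# The line breaks and indentation are not part of the string.
my $search_string =
    'ACCC' .
    'TGAC' .
    'CCTG';

print "$search_string\n";                        # prints: ACCCTGACCCTG
print "Length: ", length($search_string), "\n";  # prints: Length: 12
```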


Arrays

• You may remember the term “array” from our brief discussion of their use in Scratch

– an array is a variable that holds a collection of data

– each individual data element can be accessed using a subscript, or index number (although that isn’t done here)

– index values start at 0, so an array with n elements has indexes 0 .. n-1

• The array variable in this program is declared in the following line of code: my @matches;


Arrays

• Two array operations are illustrated in this program:

– The push operation adds a value to the array:

push @matches, $i;

– The print operation, when given an entire array inside a double-quoted string, prints the array contents as a list of values separated by spaces:

print "My matches occurred at the following offsets: @matches.\n";
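
A small sketch of both operations, using made-up offsets rather than the real results of the search. It also shows the built-in join function, which is one way to get commas between the values, and access to a single element by index.

```perl
use strict;
use warnings;

my @matches;

# push adds each value to the end of the array.
# (Hypothetical offsets, for illustration only.)
push @matches, 4;
push @matches, 19;
push @matches, 33;

# Inside double quotes, the elements are separated by spaces.
print "Offsets: @matches.\n";                     # prints: Offsets: 4 19 33.

# join lets us choose a different separator.
print "Offsets: ", join(", ", @matches), ".\n";   # prints: Offsets: 4, 19, 33.

# A single element is accessed with $ and an index starting at 0.
print "First offset: $matches[0]\n";              # prints: First offset: 4
```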