Linux & Shell Scripting Small Group, Lecture 4. How to Learn to Code Workshop. http://onish.web.unc.edu/how-to-learn-how-to-code-linux-small-group/ Erin Osborne Nishimura


Linux & Shell Scripting Small Group

Lecture 4

How to Learn to Code Workshop

http://onish.web.unc.edu/how-to-learn-how-to-code-linux-small-group/

Erin Osborne Nishimura

Questions & Comments

Group pop quiz

You start to execute the following code on killdevil, but it takes a long time to execute. What do you do? Also, what does this code do?

grep -v '>' genome_file_Celegans.fa | wc

How would you load the module called blast?

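Group pop quiz

You start to execute the following code on killdevil, but it takes a long time to execute. What do you do? Also, what does this code do?

$ grep -v '>' Celegans_TFs.csv | wc

The code prints every line that does not contain '>' and pipes those lines to wc, which counts their lines, words, and bytes. Rather than waiting on the login node, submit the command to a compute node with bsub:

$ bsub -q week -n 1 -o %J_counter.txt "grep -v '>' Celegans_TFs.csv | wc"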

Note: “ and " are different characters. So are – and -, and ‘ and '. The shell only understands the plain versions ( " - ' ), so retype them if code copied from slides or a word processor fails.

How would you load the module called blast?

$ module add blast

OR

$ module load blast

Learning objectives week 3

• Understand the difference between working on the login node and on a compute node

• Use ‘bsub’ to execute jobs on killdevil/kure compute nodes

• Load and use modules

What we will do for week 4

• Write our first shell script!
– An intro to bash scripts/shell scripts
– Variables

How can we execute multiple commands at once?

• Piping using the | symbol chains commands together

• We can also string commands together using ;

$ wc file1.txt; wc file2.txt; wc file3.txt

• The best way to chain commands together is by putting the commands into a script!
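As a rough sketch of the first two approaches, the lines below build three throwaway files and then count their lines with ; and with a pipe. The file names follow the slide; their contents and the temporary directory are made up for illustration.

```shell
# Work in a throwaway directory so no real files are touched (illustrative setup)
workdir=$(mktemp -d)
cd "$workdir"

# Made-up contents for the three example files
printf 'one\n' > file1.txt
printf 'one\ntwo\n' > file2.txt
printf 'one\ntwo\nthree\n' > file3.txt

# ; runs the commands one after another; each prints its own result
wc -l file1.txt; wc -l file2.txt; wc -l file3.txt

# | sends the output of one command into the next:
# cat joins the files, wc -l counts the lines of the joined text
total_lines=$(cat file1.txt file2.txt file3.txt | wc -l)
echo "Total lines: $total_lines"
```

With ; the commands are independent of each other. With | the second command works on whatever the first one printed.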

What is a bash script or shell script?

• A text file
• Starts with a shebang
– #!/bin/bash
– (Or #! followed by the answer to $ which bash)

• Contains a list of commands

How to write a bash script?

1. Write the script. Choose a text editor. Choose one with syntax highlighting.

2. Put the script where the shell can find it.

3. Execute the script

Text editors

1) Choose a text editor. Choose one with syntax highlighting. Write the script.

– I like Komodo Edit (not to be confused with Komodo IDE, which is a paid version).
   • It’s free, pretty, and works well with Cyberduck & Secure FTP.
   • http://komodoide.com/komodo-edit/
– Notepad++
   • Windows
– TextWrangler
   • Mac
– emacs, vim, nano, and vi are probably not a good choice for this class.

Text editors

• Point your FTP client to your text editor
• In Cyberduck:
– Go to Preferences (on a Mac it is under the Cyberduck menu)
– Select Editor
– Select your text editor
– This will allow you to interactively edit documents on kure/killdevil in your local text editor.
• This should work pretty similarly in Secure FTP (right click on the document)

Let’s start a test program

• Start a shell script:

$ touch test.sh

• To test how your script should start, type:

$ which bash

• This will find where the bash program is located.

• Now, open your script using your ftp client to edit it with your text editor of choice.

Writing your first script

• Start by writing the shebang.
• Everything after any other # is a comment.
• Write code
• Save

#!/bin/bash

#This is our first script

pwd
date

How to write a bash script?

1. Choose a text editor. Choose one with syntax highlighting. Write the script.

2. Put the script somewhere the shell can find it. Make sure your working directory contains the script! OR put it in your path (week after next)

3. Execute the script: $ bash test.sh OR make the script executable (next week)
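The three steps can be sketched end to end as follows. This is only an illustration: a heredoc stands in for the text editor, and the temporary directory is an assumption so that nothing in your own folders is overwritten.

```shell
# Work in a throwaway directory (illustrative; any directory you can write to works)
workdir=$(mktemp -d)
cd "$workdir"

# Step 1: write the script. A heredoc stands in for the text editor here.
cat > test.sh <<'EOF'
#!/bin/bash

#This is our first script

pwd
date
EOF

# Step 2: the script is in the working directory, so the shell can find it
ls test.sh

# Step 3: execute the script by handing it to bash
bash test.sh
```

The script prints the working directory and then the current date and time, exactly as if pwd and date had been typed one after the other.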

Commenting and formatting

• You can print your own things using echo and printf

echo "1) this is a statement"
echo "\t2) this is a statement\n"
echo -e "\t3) this is a statement\n"

printf "4) this is a statement"
printf "\t5) this is a statement\n"

\t is a tab
\n is a newline

• echo puts a newline at the end; printf does not.
• echo expands \t and \n only when -e is used, but printf does it automatically.
• printf has a lot of fancy options.
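A small script makes the differences visible. It is a sketch that assumes bash's built-in echo with default settings (other shells may expand \t without -e). The file name sample.fa and the count of 3 are made-up values used only to show one of printf's placeholder options.

```shell
#!/bin/bash
# Assumes bash's built-in echo with default settings

echo "1) plain echo adds a newline for you"
echo "\t2) without -e the backslash sequences are printed literally\n"
echo -e "\t3) with -e they become a real tab and newline\n"

printf "4) printf adds no newline of its own"
printf "\t5) but it always expands backslash sequences\n"

# printf can also fill in placeholders: %s is a string, %d is an integer
gene_count=3   # made-up value for illustration
formatted=$(printf "%s has %d genes" "sample.fa" "$gene_count")
echo "$formatted"
```

Because statement 4 ends without a newline, statement 5 appears on the same line, separated by a tab.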

sed – a stream editor

• sed is a stream editor. It reads a file line by line and prints each line with the requested replacement applied. The original file is not changed.

Command line usage:

sed 's/<searchstring>/<replacestring>/' <file.txt>

sed 's/_____/_____/' <file.txt>

Exercise:

– Copy the file Celegans_TFs.csv to your working directory.
– Try to replace the character string Caenorhabditis with the letter C.
– How would you capture the output?

sed 's/Caenorhabditis/C/' Celegans_TFs.csv > new_Celegans_TFs.csv
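To see what the command does without touching the real data, here is a sketch on a tiny made-up file. The file sample_TFs.csv and its two lines are assumptions for illustration, not the contents of Celegans_TFs.csv. It also shows the g flag, which the exercise does not need but which matters when a line contains more than one match.

```shell
# Work in a throwaway directory with a tiny made-up file (not the real Celegans_TFs.csv)
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' \
  '"geneA","Caenorhabditis elegans"' \
  '"geneB","Caenorhabditis elegans; Caenorhabditis briggsae"' > sample_TFs.csv

# Replace the first match on each line and capture the output in a new file
sed 's/Caenorhabditis/C/' sample_TFs.csv > first_match.csv

# Adding the g flag replaces every match on each line
sed 's/Caenorhabditis/C/g' sample_TFs.csv > every_match.csv

cat first_match.csv every_match.csv
```

In first_match.csv the second line still contains one Caenorhabditis; in every_match.csv it does not. sample_TFs.csv itself is untouched in both cases.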

Performing sed in bash

• Let’s imagine putting the sed command into a bash file

Execute this with:

$ bash searchReplace.sh

#!/bin/bash

#Let's replace Caenorhabditis with "C" in the file called Celegans_TFs.csv

sed 's/Caenorhabditis/C/' Celegans_TFs.csv > script_generated.csv

Giving bash an argument

• What if we wanted to make this generalizable?
• Execute with:

$ bash searchReplace.sh Celegans_TFs.csv newfile.csv

#!/bin/bash

#Let's replace Caenorhabditis with "C" in the file given as the first argument ($1)
#and write the result to the file given as the second argument ($2)

sed 's/Caenorhabditis/C/' "$1" > "$2"
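One possible extension, shown here only as a sketch, is to have the script complain when it is not given two arguments. The check uses $#, the number of arguments, which has not been introduced in this lecture. The input file input.csv and its contents are made up for the demonstration.

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' 'geneA,Caenorhabditis elegans' > input.csv   # made-up input

cat > searchReplace.sh <<'EOF'
#!/bin/bash

# Stop with a usage message unless exactly two arguments were given
if [ "$#" -ne 2 ]; then
    echo "Usage: bash searchReplace.sh <input file> <output file>" >&2
    exit 1
fi

# $1 is the first argument (input file), $2 the second (output file).
# The double quotes keep file names with spaces in one piece.
sed 's/Caenorhabditis/C/' "$1" > "$2"
echo "Wrote $2"
EOF

bash searchReplace.sh input.csv newfile.csv
cat newfile.csv
```

Running bash searchReplace.sh with no arguments prints the usage line instead of an empty or confusing result.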

Homework today

• This is a very common problem.
– UCSC Genome Browser contains all the chromosomes written as follows:
   • chrI, chrII, chrIII, chrIV, … chrM
– However, Ensembl uses the following nomenclature:
   • chromosome1, chromosome2, chromosome3, MtDNA

• Your assignment:
– Make a test fasta file of the yeast genome that contains each chromosome header line and the 5 lines of sequence after it. Use grep (use man grep) to make this test file. It should start looking like this:

>chrI
CCACACCACACCCACACACCCACACACCACACCACACACCACACCACACC
CACACACACACATCCTAACACTACCCTAACACAGCCCTAATCTAACCCTG
GCCAACCTGTCTCTCAACTTACCCTCCATTACCCTGCCTCCACTCGTTAC
CCTGTCCCATTCAACCATACCACTCCGAACCACCATCCATCCCTCTACTT
ACTACCACTCACCCACCGTTACCCTCCAATTACCCATATCCAACCCACTG
--
>chrII
AAATAGCCCTCATGTACGTCTCCTCCAAGCCCTGTTGTCTCTTACCCGGA
TGTTCAACCAAAAGCTACTTACTACCTTTATTTTATGTTTACTTTTTATA
GGTTGTCTTTTTATCCCACTTCTTCGCACTTGTCTCTCGCTACTGCCGTG
CAACAAACACTAAATCAAAACAATGAAATACTACTACATCAAAACGCATT
TTCCCTAGAAAAAAAATTTTCTTACAATATACTATACTACACAATACATA

– Write a shell script called UCSCtoEnsemble.sh that converts the headers of a file from “chrI”, “chrII”, etc. to “chromosome1”, “chromosome2”, etc. You will need to use sed.
– Start by writing pseudocode and comments.
– Use echo commands to report to the user what is happening.
– It is OK if the script feels kind of clunky or uses temporary files. Just make sure it works! We’ll learn how to make it more elegant later.

Homework today

• Remember our ‘.csv’ files that contained transcription factors? They contained a lot of quotes (") and commas. Can you write a script to convert one of those .csv files to a tab-delimited text file?

• Remove the quotes
• Replace commas with tabs
• Write pseudocode and comments
• Use echo and printf statements to report to the user what is happening.
• Use Google to help you if you get stuck.

Homework

• You learned about $1 and $2. Write a script to try to learn what the following special variables represent:
– $0
– $@
– $*

Homework

• Start to explore how to store information as a variable in bash.

Assigning values to variables

_____=_____
(name) (value)

No spaces around the = sign!

echo $______

Example

flower=dahlia
(name) (value)

No spaces around the = sign!

echo $flower
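The "no spaces" rule is easiest to believe after watching it fail. The sketch below assumes there is no program called flower on your system. The value rose is made up for the demonstration, and the shell's own error message is hidden so the script can report the failure itself.

```shell
#!/bin/bash

# name=value with no spaces around = stores the value
flower=dahlia
echo $flower

# With spaces, bash reads "flower" as a command name and "=" and "rose"
# as its arguments, so nothing is assigned. The error message is hidden
# here and the failure is reported with echo instead.
flower = rose 2>/dev/null || echo "flower = rose did not assign anything"

# The variable still holds the original value
echo "$flower"
```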

Assigning values to variables

#!/bin/bash

#Let's explore variables

variable1=24          # this variable points to an integer
variable2=abc         # this one points to a character string
variable3=abc def     # this assignment probably doesn't work as the user expected:
                      # bash tries to run a command called def
variable4="abc def"   # this one points to a character string with embedded spaces

echo $variable1
echo $variable2
echo $variable3
echo $variable4
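The sketch below isolates the variable3 and variable4 cases so the difference is easy to see. It assumes there is no program called def on your system. The extra variable spaced is a made-up name, added to show that quotes also matter when a variable is read, not only when it is assigned.

```shell
#!/bin/bash

variable3=""          # start empty so the result is easy to see
variable4="abc def"   # the quotes keep the space inside the value

# Without quotes, bash treats abc as a temporary setting of variable3
# for a command named def; def does not exist, so nothing is stored.
variable3=abc def 2>/dev/null || echo "variable3 was not set"

echo "variable3 is: '$variable3'"
echo "variable4 is: '$variable4'"

# Quoting when you read a variable keeps its spaces intact too
spaced="abc    def"
echo $spaced      # unquoted: runs of spaces collapse to one
echo "$spaced"    # quoted: spaces are preserved
```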