16
Exercise made by Jon K. Lærdahl [email protected] 1 Unix basics exercise – MBV-INFX410 In order to start this exercise, you need to be logged in on a UNIX computer with a terminal window open on your computer. It is best if you are logged in on freebee.abel.uio.no. If you do not know how to open a terminal window, like the one below, on your desktop, please contact one of the instructors and get help. If you are working on a Mac or Linux computer, make sure you are logged in on freebee (with ssh) before you start! In the terminal window you have a Unix shell open with a command-line (a shell prompt) where you can type your commands. The command-line interpreter will execute your commands. There are several alternative and widely used Unix shells, for example C shell (csh), Bourne shell (sh), tcsh (used to be default at UiO), and Bourne-again shell (bash) (now default at UiO). In this exercise we will use bash. Hopefully you already have a bash shell running. If you are uncertain, type “bash” at the command prompt and press enter. Now you have opened a bash shell inside the other shell, and you are ready to start. 1. Your first command will be pwd (short for print working directory). It will tell you exactly where in the file system hierarchy you are, i.e. the full pathname of the directory you are in. Type pwd and press enter. The command-line interpreter have interpreted you command. It knows that pwd is a program and runs that program. The output from the program was sent to your

Unix basics exercise – MBV-INFX410 - Universitetet i …folk.uio.no/jonbra/MBV-INF4410_2017/exercises/2017-11-13_UNIX... · Unix basics exercise – MBV-INFX410 ... It is best if

  • Upload
    lamdien

  • View
    215

  • Download
    0

Embed Size (px)

Citation preview

ExercisemadebyJonK.Lærdahl [email protected]

1

Unixbasicsexercise–MBV-INFX410Inordertostartthisexercise,youneedtobeloggedinonaUNIXcomputerwithaterminalwindowopenonyourcomputer.Itisbestifyouareloggedinonfreebee.abel.uio.no.Ifyoudonotknowhowtoopenaterminalwindow,liketheonebelow,onyourdesktop,pleasecontactoneoftheinstructorsandgethelp.IfyouareworkingonaMacorLinuxcomputer,makesureyouareloggedinonfreebee(withssh)beforeyoustart!

In the terminalwindowyouhaveaUnix shellopenwitha command-line (a shellprompt)where you can type your commands. The command-line interpreter will execute yourcommands. There are several alternative andwidely usedUnix shells, for exampleC shell(csh),Bourneshell(sh),tcsh(usedtobedefaultatUiO),andBourne-againshell(bash)(nowdefault atUiO). In this exercisewewill usebash.Hopefully youalreadyhaveabash shellrunning.Ifyouareuncertain,type“bash”atthecommandpromptandpressenter.Nowyouhaveopenedabashshellinsidetheothershell,andyouarereadytostart.

1. Your first commandwill bepwd (short forprintworking directory). It will tell youexactly where in the file system hierarchy you are, i.e. the full pathname of thedirectoryyouarein.Typepwdandpressenter.

Thecommand-lineinterpreterhaveinterpretedyoucommand.Itknowsthatpwdisaprogram and runs that program. The output from the program was sent to your

ExercisemadebyJonK.Lærdahl [email protected]

2

screen and the interpreter is waiting for a new command. What is your “currentdirectory”?Thisis“Unix-speak”for“thedirectoryyouareinrightnow”.

2. Your secondUnix command isman. It will show themanual pages for other Unixcommands.Typemanpwd.Thenpressenter. (Youalwayspressenter tomaketheinterpreter start working on your command. Therefore we will not repeat thisbelow!)

Yougeta lotof informationabout thepwd command.Press the letter“q”onyourkeyboardtoleavetheso-calledman-page(themanualpageforpwd).

3. Thereisalsoaman-pageforthecommandman.Trytyping“manman”.Youwillseethatwhilepwdwasrathersimple,manhasmanyoptionsandyoucangive itmanydifferentarguments.Youcangoupanddown in themanualpagebyusing“arrowup”,“arrowdown”,“pageup”and“pagedown”onyourkeyboard.Trythis.Thereisa“-h”optionforman.Whatwillitdo?Leavetheman-pages(bypressing“q”)andtryitoutbytyping“man-h”onthecommand-line.Whatdoyouget?Theman-pagesareveryuseful ifyouknowthecommandyouwanttouse,butyouhaveforgottensomedetailsabouthowtouseitorwhatoptionsyoumayuse.Makesureyouusethemancommandoftenlaterinthisexercise!

ExercisemadebyJonK.Lærdahl [email protected]

3

4. Veryusefulandimportantcommandsarels(listsallfilesinadirectory),cd(changedirectory, go to a different directory), and mkdir (make directory, create a newdirectory)

Writethesecommands(withenteraftereachline)

mkdirMBV-INFX410_UnixcdMBV-INFX410_UnixmkdirsubDir1mkdirsubDir2subDir3ls

You have made a new directory called MBV-INFX410_Unix, you went into thatdirectoryandcreatedfirst1andthen2newdirectories.Finallyyoulistedallfilesinyour“currentdirectory”,i.e.thedirectoryyouarein.

5. Herearesomeothercommandswithexplanationscd..(youwillgofromthedirectoryyouareinanduponelevelinthedirectorytree)cd../..(youwillgoup2levelsinthedirectorytree)cd.(nothinghappens!Youchangeddirectorytothedirectoryyouarein) cd~/(gotoyour“homedirectory”,thedirectoryyoucametowhenyouloggedin)cd(thiswilldoexactlythesameasthepreviousline.Ifyou“getlost”youcanalwaystypecdtogohome...)ls-l(showsallfilesinyourcurrentdirectory,butwithmoreinformation) ls-la(alsoshows“hiddenfiles”,i.e.fileswithanamethatstartswitha“.”.Theseareoftenconfigurationfilesthatyoudonotwanttosee)Tryoutthesecommands.Experimentwithwalkingaroundinthedirectorytreeandseewhatyoucanfindout.

6. TypecdtogettoyourhomedirectoryandtypelsMBV-INFX410_Unixtoseewhatisin thisdirectory. It is a lotofwork,andawasteof time, to type thewhole “MBV-INFX410_Unix”. Ifyoutype lsMBVandpresstheTabkey,theshellwilltrytofinishwhatyouarewriting.Mostlikelyyouonlyhaveonedirectoryand“lsMBV”turnsinto“lsMBV-INFX410_Unix”afteryouhavepressedTab.

ExercisemadebyJonK.Lærdahl [email protected]

4

Nowgo into theMBV-INFX410_Unixdirectorybyusingcd,butdonot type the fulldirectoryname.UsetheTabbing-trickinstead.Ok?TypecdsandpressTab.Nowthereareseveraloptions,andtheshellwillnotknowifyouwanttogointosubDir1,subDir2,orsubDir3.IfyoupressTabtwice,yougettheoptionslisted.Type2andenterandyouarein~/MBV-INFX410_Unix/subDir2.Typepwd to findoutwhere youare in the completedirectory tree. This showsyou the“fullpath”or“absolutepath”.MakesureyouusetheTabkeyalot.Itwillsaveyoualotoftimeandtyping!

7. Otherusefulthingsyoucandoare:Press“Ctrl”keyandthen(withoutlettinggoofCtrl)“e”.Thiswilltakeyoutotheendofthecommandline.Makesureyouhavewrittensomethingaftertheprompt.Ifnot,youarealreadyattheend!Thiscombinationofkeystrokeswouldusuallybewritten“<ctrl>-e”or just“^e”.Youcanalsogobackandforth inthetextonthecommandlinebyusing the“arrow left”and“arrowright”keys.Andyoucandelete textwiththe“Delete”and“Backspace”keys.Trythisout!Otherusefulstuff:<ctrl>-a:Gotostartofline,justaftertheprompt<ctrl>-k:Deleteeverythingafterthecursor<ctrl>-c:Kills(stops)ajobthatisrunning<ctrl>-z:Suspendsajobthatisrunning.Asuspendedjobcanberestarted.

8. Go into the directory ~/MBV-INFX410_Unix/subDir3 andmake sure it is empty byrunningthecommandls-l-a.

ExercisemadebyJonK.Lærdahl [email protected]

5

Above,youseethatsubDir3isempty.Theremightappeartobetwo“files”there,buttheyare“.”,thedirectoryyouarein,and“..”,thedirectoryabove.Alsonotethatthecommandsls-laandls-l-awilldoexactlythesamething.Forlsyoucancombinetheargumentslikethis.Thisisnotthecaseforallcommands.Nowwewanttoopenatexteditor.Typenano. Insidethetexteditoryoucantypetext, go up and down with the arrow keys and use delete and backspace. At thebottomoftheeditor,youseesomekeystrokesyoumayusetodovarioustasks.Forexample, “^x” or <ctrl>-x will exit the text editor and bring you back to the shellcommandline.Trythis(butdonotsaveanyfilesyet)!

9. Nanoisanextremelysimpletexteditor.IfyouaremoreseriousaboutusingUnixyoushould learn something else, most likely emacs or vi. We will use nano for thisexercise,butifyouhaveadifferentfavouritetexteditor,usethat.Opennanoandtypeinthetextshownbelow.

Save this text as a text file by typing <ctrl>-o and calling the file “fileOne”. Save itagain,butthistimecallit“fileTwo”.Nowexitnano(<ctrl>-x).Usels-ltolistthefilesyoujustwroteandtheinformationaboutthem.

ExercisemadebyJonK.Lærdahl [email protected]

6

Intheabovewindowyouseeinformationaboutthesetwofilesinmycurrentshell.Hereiswhatitallmeans:

Number/position Value Explanation

1,1stletter - -meansregularfiledmeansdirectory

1,2nd-4thletter rw- Accessrights forthefileowner(Icanbothreadthefileandmodify/writetoit,aswellasdeleteit)

1,5th-7thletter r-- Accessforallmembersofthegroup(here:mikrobi).Thegroupmemberscanopenthefilebutnotmodifyordeleteit

1,8th-10thletter --- Access for everyone else. Users that are notmembersoftheusergroupmikrobicannotopenthefile(ormodifyit)

2 1 Numberofhardlinkstothefile(ignorethis)

3 jonkl Usernameofthefileowner

4 mikrobi Groupofusersthatthefilebelongstwo

5 35 Sizeofthefileinbytes

6 Nov1813:26 Dateandtimethefilewaslastmodified

7 fileOne Filename

Bymodifying the accession rights you canmake files and directories accessible toeveryoneonthecomputer,toonlycertaingroupsortoonlyyourself.Thecommandgroupswillshowwhichgroupsyouareamemberof.Tryit!Thesearealsouseful:

ExercisemadebyJonK.Lærdahl [email protected]

7

• chmodwillmodifytheaccessions,i.e.thestring“-rw-r-----“tosomethingelse• chownwillchangetheownerand/orgroupofafile

Wedonothavetimetotrythisnow,butyoucanusetheman-pagesandexperimentlater.Note how useful this can be! For example, if you are involved in two (top secret)projects,youcanmaketwousergroups,group1andgroup2.Youcanthenhavefilesanddirectoriesthatcanbeviewed(ormodified)byonlyyourself,onlymembersofgroup1,onlymembersofgroup2,orallusersonthecomputer.Maybeyouwillonlyallow group1members to run a certain program,while only group2members areallowedtoseeyourholidayphotos?Thesystemisveryflexible!

10. Thecommandscp(makeacopyofafile),mv(moveafile),andrm(removeafile)areveryuseful.Bytheway,toremoveadirectoryusermdir.Go to~/MBV-INFX410_Unix/subDir3where you still have the two files fileOneandfileTwo.Trythecommandsinthewindowbelow.

Make sure you understoodwhat you did! Alsomake sure you use the Tab-key inordertoavoidalotoftyping.

11. Learntousethewildcards“*”(matcheszeroormorecharacters)and“?”(matchesanysinglecharacter).Forexamples,seethewindowbelow.

ExercisemadebyJonK.Lærdahl [email protected]

8

12. Createsomemoredirectories,copyandmovefiles,andfinallyremovesomeofthedirectoriesandfilesyoucreated.Beverycarefulwithrmandrmdir.Ifyouremovefilesanddirectories,theyaregone,andyouwillnotgetthemback!

13. Copying and pasting text is of course very useful. In order to copy text fromsomewhere in the terminal window, click and drag tomark the text you want tocopy. Then right click on the command line to paste in this text. (Thismightworkslightlydifferentlydependingonyourlaptopand/ormouse.Ifyouarestuck,askoneoftheteachers)Whenyouhavelearnedtocopyandpastewithintheterminalwindowyouarereadytocontinue!

14. Openthiswebpageinawebbrowser:http://www.uniprot.org/uniprot/O15527Clickon the “Format” box near the top, and choose “FASTA (canonical)” to get to thefollowingpage:http://www.uniprot.org/uniprot/O15527.fastaCopy the whole Fasta-sequence including the header. On a Microsoft Windowslaptopyouwouldmarkthetextandpress<ctrl>-c.Openthenanotexteditorinyourterminalwindowandpaste in the text by right clicking in thatwindow. Youmighthave to experiment a bit, as copying andpastingwill dependon computer/mouseandothersettings.Ifyougetstuck,asateacher!Save the file by pressing <ctrl>-x, type “y” (yes) and type a file name (call it“O15527.fasta”). Exit nanoandmake sure youhave createda filewith the correctname.Repeattheprocedurewiththetexthere:http://www.uniprot.org/uniprot/Q92736.fastaSavethisasafilenamedRYR2.fasta.

ExercisemadebyJonK.Lærdahl [email protected]

9

15. Ausefulcommandtojustlistallthetextinatextfileiscat.TrycatO15527.fastaandcatRYR2.fasta.Whathappens?

16. Notethatyou,onthecommandline,cangobackandforthinoldcommandsbyusingthe “arrowup”and “arrowdown”keys. In thiswayyou canaccessold commandsandrunthemagain.Oryoucanmodifythemandrun.Tryitout.Thisalsosavesalotoftyping!

17. The cat commandwill show you thewhole file. If the file is long, the topwill justdisappear out of view at the top of the screen. A useful alternative is themorecommand. Trymore RYR2.fasta.When you press “enter”more will show you onemore lineof text.Pressing thespacebar,givesyouoneextrapageof text.Does itwork?

18. In theUnixshellwehavesomethingcalled“standard input” (STDIN)and“standardoutput” (STDOUT). Put simply, STDIN is what you type on the keyboard, whileSTDOUTiswhat isshownonthescreen.This isthedefault.YoucanredirectSTDINandSTDOUT.Firsttry,asbefore,catO15527.fasta.ThentrycatO15527.fasta>OGG1.fastaApparently nothing happens, but this is because the output wasn’t sent to thescreen.InsteaditwasredirectedintothenewfileOGG1.fasta.Docatonthatfiletocheck.Ok?

19. Open the file OGG1.fasta in nano and change the C-terminal G (glycine) to W(tryptophan).Amutation,right!Nowsavethefileagainwiththemodification.

20. Anotheruseful command isdiff. TrydiffO15527.fastaOGG1.fasta.Whathappens?Usethemanualpagesfordiffifyoudidn’tgetitoryouwanttoknowmore.

21. Thecommandgrepwillshowyouallthelinesinoneorseveraltextfilesthatcontainsapattern,forexamplelikethis:

ExercisemadebyJonK.Lærdahl [email protected]

10

22. Wewant to get access to someGenBank data from the GenBank ftp site. Google“genbankftpsite”tofindwhereitis.Googlecanbeuseful...Hopeyoufoundit?Itishereftp://ftp.ncbi.nih.gov/genbankOpenthislinkinyourbrowser.Youfindanawfullotoffiles,someofthemarequitebig.Youcanfindinformationaboutthedatahereftp://ftp.ncbi.nih.gov/genbank/gbrel.txtftp://ftp.ncbi.nih.gov/genbank/README.genbankClickonthefilegbphg1.seq.gzintheGenBankftpdirectoryinyourwebbrowserandthen save it to ~/MBV-INFX410_Unix/subDir1. This means you have to find thisdirectory on your laptop. Hope you have mounted your UiO home area on yourlaptop...Ifyouarehavingproblems,askoneoftheteachers!This file is in a “packed” format. It has been packed with the gzip program. Tounpack, typegunzip gbphg1.seq.gz. Type ls -lh to see some information about thefile.The-hoptionmeansthatyougetthefilesizeinamoreconvenientformat.Thefile isverybig,239Mb!Thisvaluewas204Mb in2012andwillchangeeverytimeGenBankisupdated.PLEASE!Whenyouhavefinisheddoingthisexercise,deletethisfile(andotherbigfiles).WedonotwanttofillupyourhomeareaandtheUiOcomputerswithdatathatyouabsolutelydonotneed!!ThefilecontainsalargepartofthephagedatainGenBank.Therestisingbphg2.seqand gbphg3.seq, butwewill not look into that now. Domore gbphg1.seq to startreadingfromthetop.Whatistheaccessionnumberofthefirstentry?(AB000833)

23. Are there anyone calledHansen that submitted any of the data? TrygrepHansengbphg1.seq.

ExercisemadebyJonK.Lærdahl [email protected]

11

24. Thecommandshead-n20andtail-n50willprintoutthefirst20andthelast50linesofagivenfile(butyoucanofcoursechoosethenumberoflinesyouwant).Tryhead-n200gbphg1.seqtail-n60gbphg1.seqhead-n10000gbphg1.seq>first10000LinesThe command wc counts the number of words in a file, for example wcfirst10000Lines:

There are 10000 lines, 41690 words and 621150 bytes of information in the file.Actually, it is41687and621140inNovember2013,andthiswillmost likelychangewhenGenBankisupdatednexttime.The “|” (the “pipe”)will send the STDOUTof one command into the STDINof thenext.Whathappenshere?:

ExercisemadebyJonK.Lærdahl [email protected]

12

First,allthelineswith“ACCESSION”isprintedout.Then,insteadofprintingthisoutto the screen, the output is sent to thewc command. It counts 36 lines. It is 36“ACCESSION” in the file. Easier than counting manually! How many lines have“ACCESSION”inthemingbphg1.seq?

25. Use theman-pages forgrep command to findoutwhat “grep -n”will do.Canyoufind the amino acid string/sequence PEECPEE (Pro-Glu-Glu-Cys-Pro-Glu-Glu)anywhereinthefilegbphg1.seq?Yes,itisfoundononeline.Onwhichline?

26. Whatdoes theoption“-B”do for thegrep commands?Use theman-pages to findout. Use the commands you have learnt to find out which species/phage thatencodes a protein with the sequence PEECPEE (Solution: Salmonella phagevB_SenS_AG11).

27. Finallyinthisexercisewewilltrysomeverysimpleshellscripts.Shellscriptsaresmallprograms that are doing tasks that also could have been done step by step by apersonon the command line.Go into ~/MBV-INFX410_Unix/subDir2,which shouldbeempty(orelseremovethefilesthatarethere).CopyfileOnefrom../subDir3here.

ExercisemadebyJonK.Lærdahl [email protected]

13

28. Opennanoandtypethefollowing,

Whatdoesthismean?#!/bin/sh Therestofthisscriptshouldbeinterpretedbytheshshellr=$1 Thevariabler(thatis$r)getsthecontentsofthefirst

argumentonthecommandlineechoHei$r “Hei”andthenthevariablecontentiswrittentoSTDOUTSavethescriptas“scriptOne”.The“#!”onthefirstline,twofirstcharacters,iscalledashebang(from“hash”and“bang”,i.e.exclamationmark).Ittellstheshellthattherestof the script shouldbe interpretedby theprogram/bin/sh (Bourne shell). If ithad been “#!/usr/bin/python” or “#!/usr/bin/perl” that would have indicated thatthisinsteadwasapythonorperlscript,respectively.Most likelyyouhaveaproblem,sincescriptOne isnotanexecutableprogramthattheshellthinksitcanrunasaprogram.However,thecommandchmodu+xscriptOnewillfixthat:

ExercisemadebyJonK.Lærdahl [email protected]

14

The“rw-“inposition2-4ischangedtorwx,fortheuser(thatiswherethe“u”comesfrom).Theusercannow,inadditiontoreadingfromandwritingtothefile,executeit(i.e.runitasaprogram).Nowletusdothat:

You have to put the “./” in front of the name of your script. If you do not, (mostlikely)theshellwillnotknowwheretolookforit.Asyousee,youtypethenameoftheexecutablefile(includingthefilepathtoit)andsomewords.Itwillrepeatbacktoyou“Hei”andthefirstword.Notveryadvanced,butyoufirstscript!

29. Nowtrytypinginthisscript#!/bin/shprefix=$1echoAllfileswillstartwith$prefixfor((i=0;i<5;i++));doechoMakingfile$prefix$icpfileOne$prefix$idoneSaveitasscriptTwo,makeitexecutable(withchmod)andrunitwithanargument.Whatdoyouthinkitwilldo?

ExercisemadebyJonK.Lærdahl [email protected]

15

Thescriptmakes5copiesoffileOne,eachcalledJalla0,Jalla1etc,orwhateverprefixyouenteredonthecommandline.Afinalexamplehere,

Hereitruns...

Whatdoesitdo?(WhenIrun itwiththeargumentsHeiandHopp, itcreatesdirectoryHopp0withasinglecopyoffileOnecalledHei0,Hopp1withHei1andsoon...Finallyitdoesanls-R)Checkthemanpagesforlstofindoutwhat-Risdoing.

30. Toexittheshell,typeexitorpress<ctrl>-d.

ExercisemadebyJonK.Lærdahl [email protected]

16

Getmoreinspirationhere

• TrondHasleAmundsensLinux/UNIXguideforUiO:http://www.uio.no/tjenester/it/maskin/linux/hjelp/tips/guide.html

• bashprogramminginINF3331byHansPetterLangtangenandco-workers:http://www.uio.no/studier/emner/matnat/ifi/INF3331/h12/bash.pdf

• EMBnet“AQuickGuideUNIX”:http://www.embnet.org/sites/default/files/quickguides/guideUNIX.pdf

• Or just search theweb. There is lots of useful tutorials for basic and advancedUnixandscripting

Thanks to Trond Hasle Amundsens Linux/UNIX guide andMereteMoltonWorrens “BasicUnixcommands”forinspirationforthisexercise!