Assignment 6: Motif Finding
Bio5488
2/24/17
Slide Credits: Nicole Rockweiler
Assignment 6: Motif finding
• Input
    • Promoter sequences
    • PWMs of DNA-binding proteins
• Goal
    • Find putative binding sites in the sequences by scanning the sequences for matches to the PWM
• Output
    • List of the locations and scores of putative binding sites
[Figure labels: PWM; putative binding sequence; promoter]
Input files
• Promoter sequences
    • Just the sequence, i.e., not a FASTA file
• PWMs of DNA-binding proteins
    • Whitespace-delimited
    • a_ij = score for base i at position j
    • Rows correspond to A, C, G, & T
    • Columns correspond to positions
    • The higher the score, the better the match
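The file layout above can be read and used to score one candidate site roughly as follows. This is a minimal sketch, not the code in scan_sequence.py: the names `parse_pwm` and `score_site` are hypothetical, the 3-position matrix holds made-up values, and it assumes a site's score is the sum of its per-position scores.

```python
def parse_pwm(text):
    """Parse whitespace-delimited PWM text into a dict of base -> scores.

    Assumes four rows in the order A, C, G, T and one column per position.
    """
    rows = [[float(v) for v in line.split()]
            for line in text.strip().splitlines()]
    if len(rows) != 4:
        raise ValueError("expected 4 rows (A, C, G, T)")
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all rows must have the same number of columns")
    return dict(zip("ACGT", rows))


def score_site(pwm, site):
    """Sum the score a_ij of each base i at its position j in the site."""
    return sum(pwm[base][j] for j, base in enumerate(site))


# Hypothetical 3-position PWM (illustrative values only)
example = """
 2 -1  0
-3  4 -2
 0 -2  5
-1  0 -4
"""
pwm = parse_pwm(example)
print(score_site(pwm, "ACG"))
```

Storing the matrix as a dict keyed by base keeps the lookup `pwm[base][j]` close to the a_ij notation on the slide.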
Example PWM
-5-945-326-510-1010-10-14310-460-110-31
Example PWM file
Assignment TODOs
• Determine the highest affinity binding site for each PWM
    • Calculate by hand or write a script :)
• Comment the starter script scan_sequence.py
    • Comment the existing code blocks
    • Comment the user-defined functions with function docstrings
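One way to script the first TODO, assuming a site's score is the sum of independent per-position scores: the highest-scoring site then takes the best base in each column. The function name `best_site`, the dict layout, and the matrix values are hypothetical and only for illustration.

```python
def best_site(pwm):
    """Return (sequence, score) of the highest-scoring site for a PWM.

    pwm: dict mapping each base in "ACGT" to a list of per-position scores.
    Ties are broken in favor of the first base in A, C, G, T order.
    """
    bases = "ACGT"
    width = len(pwm["A"])
    sequence = []
    total = 0
    for j in range(width):
        # Pick the base with the largest score in column j
        best = max(bases, key=lambda b: pwm[b][j])
        sequence.append(best)
        total += pwm[best][j]
    return "".join(sequence), total


# Made-up 3-position PWM
pwm = {"A": [2, -1, 0], "C": [-3, 4, -2], "G": [0, -2, 5], "T": [-1, 0, -4]}
print(best_site(pwm))
```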
Function docstrings
• Purpose: tells the reader how to use the function
• Guidelines for what to include
    • Describe what the function does
    • Describe the input argument(s)
    • Describe the output value(s)
• Where to learn more:
    • PEP 257: https://www.python.org/dev/peps/pep-0257/
    • Google’s Python style guide: http://google-styleguide.googlecode.com/svn/trunk/pyguide.html?showone=Comments#Comments
Example of a function docstring
[Figure annotations: summary line; description of arguments; description of return value]
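A docstring with those three parts might look like the sketch below. The function `gc_content` is a hypothetical example, not part of the starter script, and the `Args:`/`Returns:` layout is one common convention among several.

```python
def gc_content(sequence):
    """Calculate the fraction of G and C bases in a DNA sequence.

    Args:
        sequence: A string of DNA bases (A, C, G, T), upper or lower case.

    Returns:
        A float between 0 and 1: the fraction of bases that are G or C.
        Returns 0.0 for an empty sequence.
    """
    if not sequence:
        return 0.0
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


print(gc_content("ATGC"))
```

The first line is the summary line, the `Args:` block describes the arguments, and the `Returns:` block describes the return value.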
Retrieving a function’s docstring
[Figure annotations: call help; the function’s docstring is returned]
Docstrings are also used by third-party programs to create user-friendly documentation for your project
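As a small illustration of retrieving a docstring, the sketch below uses a hypothetical function `complement`. In an interactive session, `help(complement)` prints a formatted help page; the same text is available programmatically through the `__doc__` attribute and `inspect.getdoc`.

```python
import inspect


def complement(base):
    """Return the complementary DNA base for A, C, G, or T."""
    return {"A": "T", "C": "G", "G": "C", "T": "A"}[base]


# help(complement) would print the docstring in an interactive session.
# The raw docstring is stored on the function object itself:
print(complement.__doc__)

# inspect.getdoc returns the docstring with its indentation cleaned up
print(inspect.getdoc(complement))
```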
Assignment TODOs (cont.)
• Determine the highest affinity binding site for each PWM
    • Calculate by hand or write a script :)
• Comment the existing code
    • Comment the user-defined functions with function docstrings
• Modify the script to scan the reverse complement of the input sequence
• Modify the script to report only hits that have scores above a given threshold
• Scan promoters (n=2) to find putative binding sites for each DNA-binding protein (n=2)
• Answer follow-up questions
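The two script modifications can be sketched in isolation as below. This is an illustrative outline under stated assumptions, not the structure of scan_sequence.py: the names `reverse_complement` and `scan`, the dict-based PWM, the strict `>` comparison, the 0-based start offsets, and all values are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def scan(seq, pwm, threshold):
    """Return (start, score) for each window scoring above threshold.

    start is a 0-based offset into seq; pwm maps each base to a list of
    per-position scores.
    """
    width = len(pwm["A"])
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(pwm[base][j] for j, base in enumerate(window))
        if score > threshold:
            hits.append((start, score))
    return hits


# Made-up PWM and sequence
pwm = {"A": [2, -1, 0], "C": [-3, 4, -2], "G": [0, -2, 5], "T": [-1, 0, -4]}
seq = "TTACGTT"
forward_hits = scan(seq, pwm, threshold=5)
reverse_hits = scan(reverse_complement(seq), pwm, threshold=5)
print(forward_hits, reverse_hits)
```

Offsets in `reverse_hits` are relative to the reverse-complemented string, so they still need converting before being reported in forward-strand coordinates.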
Indexing
• Indexing is somewhat arbitrary; however, it’s important to follow conventions:
    • The start position of a feature is smaller than the stop position
    • The coordinates are relative to the forward strand
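To make the two conventions concrete, here is one way to convert a hit found on the reverse complement back to forward-strand coordinates. The slides do not say whether positions are 0-based or 1-based; this sketch assumes 0-based, half-open intervals (as in Python slicing), and `to_forward_coords` is a hypothetical helper.

```python
def to_forward_coords(rc_start, width, seq_len):
    """Convert a reverse-complement hit to forward-strand (start, stop).

    rc_start: 0-based offset of the hit in the reverse-complemented sequence.
    width:    length of the hit (number of PWM columns).
    seq_len:  length of the scanned sequence.
    Returns 0-based, half-open coordinates with start < stop.
    """
    start = seq_len - rc_start - width
    stop = seq_len - rc_start
    return start, stop


forward = "TTACGTT"
# A 3-base hit at offset 1 of the reverse complement ("AACGTAA")
start, stop = to_forward_coords(rc_start=1, width=3, seq_len=len(forward))
print(start, stop, forward[start:stop])
```

Here the hit "ACG" on the reverse strand corresponds to forward-strand bases "CGT", and the reported start stays smaller than the stop.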
Python list comprehensions
• Purpose: create lists in 1 line of code
• There are also dictionary comprehensions that work similarly

As a for loop
    Code template:
        for <item> in <list>:
            <expression>
    Example:
        x = []
        for i in range(5):
            x.append(i**2)

List comprehension
    Code template:
        [<expression> for <item> in <list>]
    Example:
        x = [i**2 for i in range(5)]
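For the dictionary comprehensions mentioned above, a parallel sketch using the same squares example; the variable names are arbitrary.

```python
# As a for loop
squares_loop = {}
for i in range(5):
    squares_loop[i] = i**2

# As a dictionary comprehension: {<key>: <value> for <item> in <list>}
squares = {i: i**2 for i in range(5)}

print(squares)
```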
Python list comprehensions with filtering

As a for loop
    Code template:
        for <item> in <list>:
            if <conditional>:
                <expression>
    Example:
        x = []
        for i in range(5):
            if i % 2 == 0:  # if i is even
                x.append(i**2)

List comprehension
    Code template:
        [<expression> for <item> in <list> if <conditional>]
    Example:
        x = [i**2 for i in range(5) if i % 2 == 0]

• Where to learn more:
    • List comprehension PEP: https://www.python.org/dev/peps/pep-0202/
    • Dict comprehension PEP: https://www.python.org/dev/peps/pep-0274/
Python’s zip function
• Purpose: “zip” together lists
• Returns a list* of tuples where the ith tuple contains the ith element from each of the input lists
*It’s really an iterator, one of list’s close cousins

Code template:
    <zipped_list> = list(zip(<list1>, <list2>, ...))
Example:
    >>> x = [0, 1, 2]
    >>> y = [0, 1, 4]
    >>> coords = list(zip(x, y))
    >>> coords
    [(0, 0), (1, 1), (2, 4)]

• Zipped lists can be unzipped (zip(*coords))
• Where to learn more
    • Python.org documentation: https://docs.python.org/3.4/library/functions.html#zip
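A short sketch of the unzipping idiom, reusing the x and y lists from the example; the names `x_again` and `y_again` are arbitrary. Note that unzipping yields tuples, not lists.

```python
x = [0, 1, 2]
y = [0, 1, 4]

# zip returns an iterator; list() collects its tuples
coords = list(zip(x, y))

# The * operator passes each tuple as a separate argument,
# so zip regroups the first elements and the second elements
x_again, y_again = zip(*coords)

print(coords)
print(x_again, y_again)
```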
Printing formatted strings in Python with format
• Purpose: make your print statements print “pretty” output, e.g., tables
• format transforms a “template string” by substituting placeholders with formatted values
• Placeholders are enclosed in {} and specify how the value should be formatted

Not so pretty
    >>> score = 1/300
    >>> print("The score was " + str(score))
    The score was 0.0033333333333333335

Pretty
    >>> print("The score was {s:.3f}".format(s=score))
    The score was 0.003
    >>> print("The score was {s:.3E}".format(s=score))
    The score was 3.333E-03

• Where to learn more:
    • Python.org tutorial: https://docs.python.org/3.4/tutorial/inputoutput.html#fancier-output-formatting
    • Python.org documentation: https://docs.python.org/3.4/library/string.html#formatstrings
    • Python Course tutorial: http://www.python-course.eu/python3_formatted_output.php
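Since tables are the motivating case, here is a small sketch of printing aligned columns with format. The hit values are invented for illustration, and the column widths are an arbitrary choice.

```python
# Hypothetical (start, stop, score) hits
hits = [(12, 20, 8.53172), (105, 113, 11.0), (7, 15, 0.00333)]

# ">6" right-aligns within 6 characters; ".3f" keeps 3 decimal places
lines = ["{:>6} {:>6} {:>8}".format("start", "stop", "score")]
for start, stop, score in hits:
    lines.append("{:>6d} {:>6d} {:>8.3f}".format(start, stop, score))

print("\n".join(lines))
```

Giving every row the same field widths is what keeps the columns lined up regardless of how many digits each value has.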
Assignment 6: requirements
• Due in 1 week (3/3/17) at 10 AM
• Your submission directory should contain
    • A modified scan_sequence.py that is well commented and contains a docstring for each user-defined function
    • A README.txt with the answers to the questions and the commands/work you used to arrive at the answer