Slide 1
M.M. Dalkilic, PhD
Monday, September 08, 2008, Class III
Indiana University, Bloomington, IN
Sequence Homology
Sequence Similarity (Computation), M.M. Dalkilic, PhD, SoI, Indiana University, Bloomington, IN 2008
Slide 2
Outline
o New programming and written homework Friday
o New reading posted on website
o Readings: [R] Chaps 5
o Most important aspect of bioinformatics: homology search through sequence similarity (contd.)
o Some vocabulary snuck in
Slide 3
Computation (review)
Algorithm: process or rules for (esp. machine) calculations. The execution of an algorithm must not include any subjective decisions, nor must it require the use of intuition or creativity [Brassard & Bratley]
Slide 5
General Technique of Dynamic Programming
But what if data needs to be shared, or the cost of redundancy is too high?
Rethink computation: Dynamic Programming, or Recursive Optimization
o Reduce the cost of sharing, and thereby reduce the cost of recursion
Slide 6
General Technique of Dynamic Programming
Dynamic programming reduces the running time of a recursive function to be at most the time required to evaluate the function for all arguments less than or equal to the given argument, treating the cost of a recursive call as a constant [Sedgewick]
o Top-down DP
o Bottom-up DP
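As a minimal sketch of the two styles listed above, the following Python computes Fibonacci numbers top-down (recursion plus a cache) and bottom-up (filling a table from the smallest argument upward). Fibonacci is an illustrative choice, not an example taken from the slides, and the function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib_top_down(n: int) -> int:
    """Top-down DP: ordinary recursion, but each argument is evaluated once and cached."""
    if n < 2:
        return n
    return fib_top_down(n - 1) + fib_top_down(n - 2)


def fib_bottom_up(n: int) -> int:
    """Bottom-up DP: evaluate the function for every argument <= n, smallest first."""
    table = [0, 1] + [0] * max(0, n - 1)
    for i in range(2, n + 1):
        # Both smaller results are already in the table, so this is a constant-time step.
        table[i] = table[i - 1] + table[i - 2]
    return table[n]


if __name__ == "__main__":
    print(fib_top_down(30), fib_bottom_up(30))
```

Both versions evaluate each argument once, which is the sense in which a recursive call is treated as a constant-cost lookup.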
Slide 7
Vocabulary
There are about a dozen words that you will encounter when engaging in bioinformatics or computational biology. It's important to know what they mean. I'm not going to provide a listing of all the important words, only the ones that I believe are important now.
ENZYME: typically a protein (a molecule made from amino acids) that enables or catalyzes a phenomenon; this could be changing one molecule to another.
Slide 8
Vocabulary
The IUBMB has developed six categories tying together nomenclature with function:
o EC1 oxidoreductase (moving around hydrogen)
o EC2 transferase (move a functional unit)
o EC3 hydrolase (involves H2O)
o EC4 lyase (cleave, or cut, without using EC1 or EC3)
o EC5 isomerase (change in conformation)
o EC6 ligase (join functional units with covalent bonds)
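Because the first field of an EC number names one of the six classes above, a lookup can be sketched in a few lines of Python. The table and the helper name `ec_class` are hypothetical, introduced only for illustration.

```python
# Top-level EC classes as listed on the slide.
EC_CLASSES = {
    1: "oxidoreductase",
    2: "transferase",
    3: "hydrolase",
    4: "lyase",
    5: "isomerase",
    6: "ligase",
}


def ec_class(ec_number: str) -> str:
    """Return the top-level class name for an EC number such as '3.1.21.4'."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()  # tolerate an optional 'EC' prefix
    first_field = text.split(".")[0]
    return EC_CLASSES[int(first_field)]


if __name__ == "__main__":
    print(ec_class("3.1.21.4"))
```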
Slide 9
Vocabulary
Restriction endonuclease: cleaves DNA at specific sites
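To make "specific sites" concrete, here is a rough Python sketch that cuts one strand of a DNA string wherever a recognition site occurs. It assumes the EcoRI site GAATTC, cut between G and A, as the default; the function names and the sample sequence are hypothetical, and only the top strand is modelled.

```python
def find_cut_positions(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Return 0-based positions at which the strand is cut (site start + cut_offset)."""
    dna = dna.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start + cut_offset)
        start = dna.find(site, start + 1)
    return positions


def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Split the strand into fragments at every cut position."""
    dna = dna.upper()
    fragments = []
    previous = 0
    for cut in find_cut_positions(dna, site, cut_offset):
        fragments.append(dna[previous:cut])
        previous = cut
    fragments.append(dna[previous:])
    return fragments


if __name__ == "__main__":
    print(digest("AAGAATTCTTGAATTCGG"))
```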
Slide 10
Vocabulary
1. Replication
2. Transcription
3. Reverse Transcription
4. Translation
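The four steps above can be sketched as string operations in Python. This is a simplification under stated assumptions: sequences are uppercase strings, transcription is modelled from the coding strand (T replaced by U), and translation uses the standard genetic code and stops at the first stop codon. All function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: DNA codon -> one-letter amino acid ('*' marks a stop codon).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def replicate(dna: str) -> str:
    """Replication: return the complementary DNA strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def transcribe(coding_dna: str) -> str:
    """Transcription: coding-strand DNA -> mRNA."""
    return coding_dna.replace("T", "U")


def reverse_transcribe(rna: str) -> str:
    """Reverse transcription: RNA -> DNA with the same base order."""
    return rna.replace("U", "T")


def translate(mrna: str) -> str:
    """Translation: read codons from position 0 until a stop codon or the end."""
    dna = reverse_transcribe(mrna)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGGCCTAA")
    print(mrna, translate(mrna))
```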
Slide 11
Vocabulary
3 letters of DNA become 3 letters of RNA, which become 1 letter of protein
http://citnews.unl.edu/croptechnology/lessonImages/960324911.gif
Figure labels: codon, six reading frames
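The six reading frames come from three possible starting offsets on each of the two strands. A minimal Python sketch, assuming an uppercase DNA string and using hypothetical function names and frame labels (+1..+3, -1..-3), could look like this:

```python
def reverse_complement(dna: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def codons(dna: str, offset: int) -> list[str]:
    """Split a strand into complete 3-letter codons starting at the given offset."""
    return [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]


def six_reading_frames(dna: str) -> dict[str, list[str]]:
    """Map frame labels to codon lists: three forward frames, three reverse frames."""
    dna = dna.upper()
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = codons(dna, offset)
        frames[f"-{offset + 1}"] = codons(reverse, offset)
    return frames


if __name__ == "__main__":
    for label, frame in six_reading_frames("ATGGCCTAA").items():
        print(label, frame)
```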
Slide 12
Multiple Sequence Alignment of Proteins
http://www.mad-cow.org/00/annotation_frames/tools/genbrow/sulfatases/sulf_diagnostic_early.gif
Figure labels: protein; A (amino acid); gap
Slide 13
Why Alignment of Proteins?
http://www.mad-cow.org/00/annotation_frames/tools/genbrow/sulfatases/sulf_diagnostic_early.gif
Conjecture: structure imparts function, and similar functions should have similar structures. Therefore, align proteins to look for regions that are similar in sequence, since sequence determines structure and like sequences will (likely) produce similar function.
Slide 14
Domains and Motifs
o Domains: collections of motifs that perform a function
o Structural motifs
o Functional motifs
The principle is how percent identity (similarity) and homology play out: above 40% (25%) percent identity, one may infer that homology is plausible.
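Percent identity can be computed directly from two already-aligned sequences. The sketch below is one possible definition, assuming gaps are written as "-" and that the denominator is the number of alignment columns (other conventions divide by the shorter sequence length or by ungapped columns only). The function name and the sample sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both rows are gaps; they carry no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(percent_identity("ACDEFGHI", "ACDE-KLM"))
```

With this definition the sample pair shares 4 of 8 columns, so it sits above the 40% threshold mentioned above.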
Slide 15
Domains and Motifs
o Primary structure (the sequence itself)
o Secondary structure [most common] (alpha-helix, beta-sheet)
o Tertiary structure is a collection of secondary structures interlaced with loops
o Quaternary structure is a combination of tertiary structures
http://www.amazon.com
Slide 16
Recurrence of Aligning Two Sequences
http://www.space.gov.za/pics/hubble_image01.jpg
# of elementary particles in the universe
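The slide's own recurrence is not present in the extracted text. As an assumption, the comparison with the number of particles suggests counting all possible alignments of two sequences, where each alignment column pairs two residues, or a residue with a gap. Under that assumption the count satisfies f(i, j) = f(i-1, j) + f(i, j-1) + f(i-1, j-1) with f(i, 0) = f(0, j) = 1, which the hypothetical function below fills in bottom-up.

```python
def count_alignments(n: int, m: int) -> int:
    """Count alignments of sequences of lengths n and m (assumed counting recurrence)."""
    # Row 0 and column 0 stay at 1: only one way to align against an empty prefix.
    table = [[1] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            table[i][j] = (
                table[i - 1][j]        # last column: residue over gap
                + table[i][j - 1]      # last column: gap over residue
                + table[i - 1][j - 1]  # last column: residue over residue
            )
    return table[n][m]


if __name__ == "__main__":
    for length in (1, 2, 3, 150):
        print(length, count_alignments(length, length))
```

For two sequences of length 150 the count already exceeds 10^80, an order-of-magnitude figure often quoted for the particles in the observable universe, so enumerating alignments one by one is not feasible.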
Slide 17
Better Recurrences
[Waterman]
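The recurrence shown on this slide did not survive extraction. As a stand-in, and not necessarily the formulation cited from [Waterman], here is a sketch of a standard dynamic-programming recurrence for the best global alignment score with a linear gap penalty (Needleman-Wunsch style). The scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions chosen for illustration.

```python
def global_alignment_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Best global alignment score of a and b, filled bottom-up in O(len(a) * len(b))."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(
                diagonal,                 # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,    # a[i-1] against a gap
                score[i][j - 1] + gap,    # b[j-1] against a gap
            )
    return score[-1][-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

Instead of enumerating alignments, the table stores one best score per pair of prefixes, so the work grows with the product of the two sequence lengths.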