37
Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019

Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

  • Upload
    others

  • View
    10

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

Lecture5,6Localsequencealignment

Chapter6inJonesandPevzner

Fall2019September12,17,2019

Page 2: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

Evolutionasatoolforbiologicalinsight

•  “Nothinginbiologymakessenseexceptinthelightofevolution”-TheodosiusDobzhansky.

•  Thefunctionalityofmany

genesisvirtuallythesameamongmanyorganisms:Canunderstandbiologyinsimplerorganismsthanourselves(“modelorganisms”).

Page 3: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

Localalignment:rationale

•  Proteinsareoftenmulti-functional,andarecomposedofregions(domains),eachofwhichcontributesaparticularfunction

•  Example:

² Homeoboxgeneshaveashortregioncalledthehomeodomainthatishighlyconservedbetweenspecies.

² AglobalalignmentmightnotfindthehomeodomainbecauseitwouldtrytoaligntheENTIREsequence

Drosophila homeodomain PDB: 1ZQ3

Page 4: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

Localvs.GlobalAlignment(cont’d)

Global Alignment Local Alignment—better for finding a conserved segment

--T—-CC-C-AGT—-TATGT-CAGGGGACACG—A-GCATGCAGA-GAC | || | || | | | ||| || | | | | |||| | AATTGCCGCC-GTCGT-T-TTCAG----CA-GTTATG—T-CAGAT--C

tccCAGTTATGTCAGgggacacgagcatgcagagac ||||||||||||

aattgccgccgtcgttttcagCAGTTATGTCAGatc

Page 5: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

TheLocalAlignmentProblem

•  Goal:Findthebestlocalalignmentbetweentwostrings

•  Input:Stringsv,wandscoringmatrixδ•  Output:Alignmentofsubstringsofvandwwhosealignmentscoreismaximumamongallpossiblealignmentofallpossiblesubstrings

Page 6: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

Localvs.globalalignment

Global alignment

Local alignment

Compute a “mini” global alignment to get a local alignment

Page 7: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

Localvs.GlobalAlignment

•  TheGlobalAlignmentProblemtriestofindthelongestpathbetweenvertices(0,0)and(n,m)intheeditgraph.

•  TheLocalAlignmentProblemtriestofindthelongestpathamongpathsbetweenarbitraryvertices(i,j)and(i’,j’)intheeditgraph.

•  Inaneditgraphwithnegatively-scorededges,alocalalignmentmayscorehigherthanaglobalalignment

Page 8: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

TheProblemwiththisProblem

•  Naïvemethod(runtimeO(n4)):

-Inagridofsizenxntherearen2nodes(i,j)thatmayserveasasource.

-Foreachsuchnodecomputingalignmentsfrom(i,j)to(i’,j’)takesO(n2)time.

Page 9: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

LocalAlignment:Example

Page 10: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

LocalAlignment:Example

Page 11: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

LocalAlignment:Example

Page 12: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

LocalAlignment:Example

Page 13: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

LocalAlignment:Example

Page 14: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

LocalAlignment:FreeRides

(0,0)

The dashed edges represent the free rides from (0,0) to every other node.

The image cannot be displayed. Your computer may not have enough memory to open the image, or the image may have been corrupted. Restart your computer, and then open the file again. If the red x still appears, you may have to delete the image and then insert it again.

Yeah, a free ride!

Page 15: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

LocalAlignment:Recurrence

0 si-1,j-1 + δ (vi, wj)

s i-1,j + δ (vi, -) s i,j-1 + δ (-, wj)

Power of ZERO: this is the only change from the original recurrence of a global alignment, representing the “free ride” edge

si,j = max

Page 16: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

LocalAlignment:Backtrace

•  Scoreofbestlocalalignmentisthemaximumentryofsij

•  Thealignmentisfoundbyabacktracefromthemaximumnode,toanodeforwhichthescoreis0.

Page 17: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

LocalAlignment:SW

•  Thislocalalignmentalgorithmisknownasthe“Smith-Waterman”algorithm1.

•  T.F.SmithandM.W.Waterman.Identificationofcommonmolecularsubsequences.J.Mol.Biol.147:195-197,1981.

1TheSmith-Watermanalgorithmconsidersamoresophisticatedgappenaltyscheme

Page 18: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

ScoringIndels:NaiveApproach

•  Afixedpenaltyσisgiventoeveryindel:•  -σfor1indel,•  -2σfor2consecutiveindels•  -3σfor3consecutiveindels,etc.

Canbetooseverepenaltyforaseriesof100consecutiveindels

Page 19: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

AffineGapPenalties

•  Innature,aseriesofkindelsoftencomesasasingleeventratherthanaseriesofksinglenucleotideevents:

Normal scoring would give the same score for both alignments

This is more likely.

This is less likely.

Page 20: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

Affinegappenalty

•  Scoreforagapoflengthxis:-(ρ+σx)where:ρ>0isthegapopeningpenalty σ>0isthegapextensionpenalty

•  ρislargerelativetoσbecauseyoudonotwanttoadd

toomuchofapenaltyforextendingthegap.

Page 21: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

AffineGapPenaltiesandEditGraph

Toreflectaffinegappenaltieswehavetoadd“long”horizontalandverticaledgestotheeditgraph.Eachsuchedgeoflengthxshouldhaveweight -ρ - x *σ

Page 22: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

Adding“AffinePenalty”EdgestotheEditGraph

•  Therearemanysuchedges!

•  Addingthemtothegraphincreasestherunningtimeofthealignmentalgorithmbyafactorofn.

•  ThecomplexityincreasesfromO(n2)toO(n3)

Page 23: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

The3-leveledManhattanGrid

gaps in w

matches/mismatches

gaps in v

Page 24: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

Manhattanin3Layers

ρ

ρ

σ

σ δ δ

δ

δ δ

Page 25: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

AffineGapPenaltiesand3LayerManhattanGrid

• We’llhavethreerecurrencesina3-layeredgraph.

•  Thetoplevelcreates/extendsgapsinthesequencew.

•  Thebottomlevelcreates/extendsgapsinsequencev.

•  Themiddlelevelextendsmatchesandmismatches.

Page 26: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

SwitchingBetweentheLayers

•  Levels:•  Themainlevelisfordiagonaledges•  Thelowerlevelisforhorizontaledges•  Theupperlevelisforverticaledges

•  Ajumpingpenaltyisassignedtomovingfromthemainleveltoeithertheupperlevelorthelowerlevel(-ρ- σ)

•  Thereisagapextensionpenaltyforeachcontinuationonalevelotherthanthemainlevel(-σ)

Page 27: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

AffineGapPenaltyRecurrences

si,j = s i-1,j - σ max s i-1,j - (ρ+σ) si,j = s i,j-1 - σ max s i,j-1 - (ρ+σ) si,j = si-1,j-1 + δ (vi, wj) max s i,j s i,j

Continue gap in w Start gap in w : from middle

Continue gap in v Start gap in v : from middle

Match or Mismatch End gap: from top End gap: from bottom

Page 28: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

ShouldwecompareDNAorproteinsequences?

• DNAsequenceislessconservedthanproteinsequence

•  TheproteinsequencecontainsmoreinformationthantheDNAsequence

⇒Lesseffectivetocomparecodingregionsatthenucleotidelevel

Page 29: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

ScoringMatrices

Page 30: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

MakingaScoringMatrix

•  Scoringmatricesarecreatedbasedontheintuitionthatsomemutationshaveasmallereffectonthefunctionofaprotein⇒Suchmismatchpenaltiesshouldbelessharshthanothers.

Page 31: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

ScoringMatrix:Example

A R N K

A 5 -2 -1 -1

R - 7 -1 3

N - - 7 0

K - - - 6

•  AlthoughRandKaredifferentaminoacids,theyhaveapositivescore.

•  Why?Theyarebothpositivelychargedaminoacids,thereforesubstitutionwillnotgreatlychangethefunctionoftheprotein.

Page 32: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

SubstitutionsofAminoAcidsMutationratesbetweenaminoacidshavedramaticdifferences!

Page 33: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

Conservation

•  Aminoacidchangesthatpreservethephysico-chemicalpropertiesoftheoriginalresidueshouldreceivehigherscores•  polartopolar•  aspartateàglutamate

•  hydrophobictohydrophobic•  alanineàvaline

Page 34: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

PercentSequenceIdentity

•  Ameasureoftheextenttowhichtwonucleotideoraminoacidsequencesaresimilar

A C C T G A G – A G A C G T G – G C A G

70% identical mismatch

indel

Page 35: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

BLOSUM

•  BlocksSubstitutionMatrix•  Scoresderivedfromobservationsofthefrequenciesofsubstitutionsinalignmentsofrelatedproteins

•  Matrixnameindicatesevolutionarydistance:•  BLOSUM62wascreatedusingsequencessharingnomorethan62%identity

Henikoff, S. and Henikoff, J. (1992) Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA. 89(biochemistry): 10915 - 10919. 1992

Page 36: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

AnentryfromtheBLOCKSdatabase

Block PR00851A ID XRODRMPGMNTB; BLOCK AC PR00851A; distance from previous block=(52,131) DE Xeroderma pigmentosum group B protein signature BL adapted; width=21; seqs=8; 99.5%=985; strength=1287 XPB_HUMAN|P19447 ( 74) RPLWVAPDGHIFLEAFSPVYK 54XPB_MOUSE|P49135 ( 74) RPLWVAPDGHIFLEAFSPVYK 54P91579 ( 80) RPLYLAPDGHIFLESFSPVYK 67XPB_DROME|Q02870 ( 84) RPLWVAPNGHVFLESFSPVYK 79RA25_YEAST|Q00578 ( 131) PLWISPSDGRIILESFSPLAE 100Q38861 ( 52) RPLWACADGRIFLETFSPLYK 71O13768 ( 90) PLWINPIDGRIILEAFSPLAE 100O00835 ( 79) RPIWVCPDGHIFLETFSAIYK 86//

The BLOCKS database is at: http://blocks.fhcrc.org/

Page 37: Lecture 5,6 Local sequence alignmentcs425/fall19/slides/... · Lecture 5,6 Local sequence alignment Chapter 6 in Jones and Pevzner Fall 2019 September 12,17, 2019 Evolution as a tool

TheBlosum50ScoringMatrix