Protein Evolution: Structure, Function, and Human Health

11/28/2013, Dr. Daniel Gaston, Department of Pathology

DESCRIPTION

Guest Lecture, Protein Biochemistry course on basics of evolution at the protein level and some applications.

Citation preview

Page 1: Protein Evolution: Structure, Function, and Human Health

Protein Evolution

Structure, Function, and Human Health

11/28/2013, Dr. Daniel Gaston, Department of Pathology

Page 2: Protein Evolution: Structure, Function, and Human Health

So, about this evolution thing?

Why should I care? What use is it?

Page 3: Protein Evolution: Structure, Function, and Human Health

Lots of reasons

•  Knowledge for its own sake is good
   –  Otherwise, why do science at all?

Page 4: Protein Evolution: Structure, Function, and Human Health

Lots of reasons

•  Knowledge for its own sake is good
   –  Otherwise, why do science at all?

•  Shapes our understanding of ecology and biological diversity

Page 5: Protein Evolution: Structure, Function, and Human Health

Lots of reasons

•  Knowledge for its own sake is good
   –  Otherwise, why do science at all?

•  Shapes our understanding of ecology and biological diversity

•  Practical reasons
   –  Antibiotic resistance
   –  Microbiome: Fecal transplantation
   –  Cancer
   –  Predicting gene/protein function
   –  Predicting the impact of mutations for potential to cause human disease (Genotype:Phenotype)

Page 6: Protein Evolution: Structure, Function, and Human Health

Evolution of Life on Earth

A  (Very)  Brief  Overview  

Page 7: Protein Evolution: Structure, Function, and Human Health

[Figure: rooted tree of life with branches Eubacteria, Archaebacteria, and Eukaryota; ROOT position from Iwabe et al. 1989 and Gogarten et al. 1989]

Page 8: Protein Evolution: Structure, Function, and Human Health


Page 9: Protein Evolution: Structure, Function, and Human Health


Page 10: Protein Evolution: Structure, Function, and Human Health
Page 11: Protein Evolution: Structure, Function, and Human Health

You  are  here  

Page 12: Protein Evolution: Structure, Function, and Human Health

A Brief History of Cells and Molecules

•  Origin of the earth: ~4.5 billion years ago
•  Origin of life: ~3.0-4.0 billion years ago
   –  Origin of self-replicating entities
   –  The RNA world (?)
   –  Origin of the first genes, proteins & membranes
   –  Gave rise to the first cells
   –  The Last Universal Common Ancestor (LUCA) of all cells
   –  Probably had 500-1000 genes
•  First microfossils of bacteria: ~3.5 billion years ago (controversial), ~2.7 billion years ago (for certain)
•  Oxygenation of the atmosphere: 2.3-2.4 billion years ago (by photosynthetic bacteria)
•  Origin of eukaryotes: ~1.0-2.2 billion years ago (probably 1.5)
•  Origin of animals: ~0.6-1.0 billion years ago

Page 13: Protein Evolution: Structure, Function, and Human Health

Some Definitions

•  Homology = descent from a common ancestor
   –  Homology is all or nothing: sequences are either homologous (related) or not homologous (not related)
   –  Not the same as “similarity” (degrees of similarity are possible)

Page 14: Protein Evolution: Structure, Function, and Human Health

Some Definitions

•  Divergence = change in two sequences over time (after splitting from a common ancestor)

•  Convergence = similarity due to independent evolutionary events
   –  On the amino acid sequence level, it is relatively rare & difficult to prove (but see an example later)

[Figure: an ancestral sequence splitting into Sequence 1 and Sequence 2]

Page 15: Protein Evolution: Structure, Function, and Human Health

How does evolutionary change happen in proteins?

Page 16: Protein Evolution: Structure, Function, and Human Health

Evolution: Two Groups of Processes

•  Mutation
   –  Many different processes that generate mutations
   –  Mutations are the raw materials needed for evolution to happen

•  Selection and Drift
   –  Mutations happen in individuals
   –  Evolution happens in populations of organisms
   –  Selection and Genetic Drift affect the frequency of mutations in a population over time

Page 17: Protein Evolution: Structure, Function, and Human Health

Mutations

Page 18: Protein Evolution: Structure, Function, and Human Health

Point Mutations

Unrepaired mispaired base:
AGGTTCCAATTAA
TCCAAGGTCAATT

REPLICATION (meiotic or mitotic division)

Wild-type allele --> Wild-type Gamete (for multicellular org.):
AGGTTCCAATTAA
TCCAAGGTTAATT

Mutant allele --> Mutant Gamete (for multicellular org.):
AGGTTCCAGTTAA
TCCAAGGTCAATT

Page 19: Protein Evolution: Structure, Function, and Human Health

point mutation:  AGTCCAAGGCCTTAA --> AGTTCAAGGCCTTAA

insertion (CCTTA):  AGTCCAAGGCCTTAA --> AGTCCAAGGCCTTACCTTAA

deletion (AAGG):  AGTCCAAGGCCTTAA --> AGTCC-CCTTAA

inversion:  AGTCCAAGGCCTTAA --> AGTCCCCTTCCTTAA

translocation:  AGTCCAAGGCCTTAA + GGTCCTGGAATTCAG --> AGTCCAAGGCC + GGTCCTGGAATTCAGTTAA

duplication:  AGTCCAAGGCC --> AGTCCAAGGCCAGTCCAAGGCC

recombination (AAGG / AGGC):  AGTCCAAGGCCTTAA --> AGTCCAAAGGCTTAA
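The simpler mutation types on this slide can be written as string operations. This is a minimal sketch: the function names and the 0-based positions are assumptions chosen so that the calls reproduce the slide's example sequences; they are not taken from the lecture.

```python
def point_mutation(seq: str, pos: int, base: str) -> str:
    """Replace the base at 0-based index pos with a new base."""
    return seq[:pos] + base + seq[pos + 1:]


def insertion(seq: str, pos: int, fragment: str) -> str:
    """Insert a fragment before 0-based index pos."""
    return seq[:pos] + fragment + seq[pos:]


def deletion(seq: str, start: int, length: int) -> str:
    """Remove `length` bases starting at 0-based index start."""
    return seq[:start] + seq[start + length:]


def duplication(seq: str) -> str:
    """Tandem duplication of the whole sequence."""
    return seq + seq


original = "AGTCCAAGGCCTTAA"
print(point_mutation(original, 3, "T"))   # C -> T at index 3 (position assumed)
print(insertion(original, 14, "CCTTA"))   # CCTTA inserted before the last base (position assumed)
print(deletion(original, 5, 4))           # removes AAGG
print(duplication("AGTCCAAGGCC"))
```

Note that the slide draws the deletion with a gap character (AGTCC-CCTTAA); the function returns the ungapped sequence.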

Page 20: Protein Evolution: Structure, Function, and Human Health

Larger Scale Mutations

Page 21: Protein Evolution: Structure, Function, and Human Health

Exon  shuffling  and  Protein  Domains  

Exon1   Exon  2   Exon  3  

Page 22: Protein Evolution: Structure, Function, and Human Health

Exon  shuffling  and  Protein  Domains  

Exon1   Exon  2   Exon  3  

Domain  1   Domain  2  

Page 23: Protein Evolution: Structure, Function, and Human Health

Exon  shuffling  and  Protein  Domains  

Exon1  Exon  2   Exon  3  

Page 24: Protein Evolution: Structure, Function, and Human Health

Exon  shuffling  and  Protein  Domains  

Exon1  Exon  2   Exon  3  

Domain  2  Domain  A  

Page 25: Protein Evolution: Structure, Function, and Human Health

Genomic Scale Mutations

Gene  1   Gene  2  

Page 26: Protein Evolution: Structure, Function, and Human Health

Genomic Scale Mutations

Gene  1   Gene  2  

Page 27: Protein Evolution: Structure, Function, and Human Health

Gene Duplication

Gene  1   Gene  2  

Page 28: Protein Evolution: Structure, Function, and Human Health

Gene Duplication

Gene  1   Gene  2  Gene  1a  

Page 29: Protein Evolution: Structure, Function, and Human Health

Genetic Drift and Selection

Page 30: Protein Evolution: Structure, Function, and Human Health

Mutations vs. substitutions

•  Mutations happen in individual organisms

•  A nucleotide ‘substitution’ occurs IF after many generations, all individuals in the population harbour the ‘mutation’

•  This process is called “fixation of mutations”

•  substitution = fixed mutation

•  When comparing homologous protein sequences between species, we are looking at amino acid substitutions

Page 31: Protein Evolution: Structure, Function, and Human Health

Fixation of alleles

Population with two alleles: proportion of one allele = 1/14 (7.1%); proportion of the other = 13/14 (93%)

After N generations: proportion of one allele = 1.0 (100%)

This is the same as saying that the allele was fixed in the population in N generations. The ‘mutation’ became a ‘substitution’ after it was fixed in the population.

Page 32: Protein Evolution: Structure, Function, and Human Health

Natural selection and Neutral drift

•  Positive selection
   –  Mutation confers fitness advantage (more offspring that survive)
   –  RARE
•  Purifying selection (negative selection)
   –  Mutation confers fitness disadvantage (fewer offspring or ‘no’ viable offspring, e.g. lethal)
   –  FREQUENT
•  Neutral evolution (genetic drift)
   –  Mutation has very little fitness effect
   –  Will drift in frequency in the population due to random sampling effects
   –  VERY FREQUENT

Page 33: Protein Evolution: Structure, Function, and Human Health

Nearly-neutral theory

Page 34: Protein Evolution: Structure, Function, and Human Health

Common Examples of Positive Selection

•  MHC Genes
   –  Diversity = Good
   –  Very polymorphic in humans

•  Envelope (gp120) of HIV
   –  Immune system evasion

•  Enzymes involved in human dietary metabolism
   –  Accelerated positive selection over last ~10,000 years

Page 35: Protein Evolution: Structure, Function, and Human Health

Genetic Drift

Select a marble randomly from a jar and “copy” it into the next jar.

Fixation of the plain blue allele in 5 generations
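The marble-jar picture can be simulated directly. The sketch below assumes a constant population size, neutral alleles, and sampling with replacement each generation (a Wright-Fisher-style model, which the slides do not name). The function names, allele labels, and seeds are illustrative only.

```python
import random


def drift_to_fixation(counts, rng, max_generations=100_000):
    """Resample the population each generation until one allele is left.

    counts maps an allele label to its number of copies.
    Returns (fixed_allele, generations_elapsed).
    """
    population = [allele for allele, n in counts.items() for _ in range(n)]
    for generation in range(max_generations):
        if len(set(population)) == 1:
            return population[0], generation
        # "Pick a marble and copy it into the next jar", once per slot.
        population = [rng.choice(population) for _ in population]
    raise RuntimeError("no allele fixed within max_generations")


def fixation_fraction(counts, allele, trials, seed=0):
    """Fraction of independent runs in which `allele` is the one that fixes."""
    rng = random.Random(seed)
    wins = sum(drift_to_fixation(counts, rng)[0] == allele for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    start = {"mutant": 1, "wild-type": 13}  # the 1/14 vs 13/14 population
    print(drift_to_fixation(start, random.Random(1)))
    print(fixation_fraction(start, "mutant", trials=2000))
```

Under pure drift, the chance that an allele eventually fixes equals its starting frequency, so the second number should come out close to 1/14 for the population on the fixation slide.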

Page 36: Protein Evolution: Structure, Function, and Human Health

Polymorphism

•  Polymorphisms are sites with more than one allele present in a population
   –  Mutations that have not yet been fixed

Page 37: Protein Evolution: Structure, Function, and Human Health

Mutation and Codons

Not all mutations are created equal

Page 38: Protein Evolution: Structure, Function, and Human Health

Point mutations in protein genes are classified according to the genetic code:

The genetic code is degenerate: more than one codon often specifies a single amino acid. E.g. Serine has 6 codons, Tyrosine has 2 codons and Tryptophan has one codon!

Page 39: Protein Evolution: Structure, Function, and Human Health

Point mutations in protein-coding genes

•  synonymous (silent) substitutions: cause interchange between two codons that code for the same amino acid:

e.g. CTG --> CTA = Leu --> Leu

Mostly invisible to selection

•  non-synonymous (replacement) mutations: cause change between codons that code for different amino acids (missense) or stop codons (nonsense)

e.g. CTG --> ATG = Leu --> Met (missense)

TGG --> TGA = Trp --> Stop (nonsense)
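The classification above can be checked against the standard genetic code. This is a minimal sketch: the table is built from the usual TCAG ordering, and the names `CODON_TABLE` and `classify` are illustrative, not from the lecture. The "stop-loss" label for a stop codon changing to an amino acid is an added case the slide does not mention.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify(before: str, after: str) -> str:
    """Classify a codon change as synonymous, missense, nonsense, or stop-loss."""
    old, new = CODON_TABLE[before], CODON_TABLE[after]
    if old == new:
        return "synonymous"
    if new == "*":
        return "nonsense"
    if old == "*":
        return "stop-loss"  # case not covered on the slide
    return "missense"


print(classify("CTG", "CTA"))  # synonymous (Leu -> Leu)
print(classify("CTG", "ATG"))  # missense (Leu -> Met)
print(classify("TGG", "TGA"))  # nonsense (Trp -> Stop)

# Degeneracy: number of codons per amino acid.
codons_per_aa = Counter(CODON_TABLE.values())
print(codons_per_aa["S"], codons_per_aa["Y"], codons_per_aa["W"])  # 6 2 1
```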

Page 40: Protein Evolution: Structure, Function, and Human Health
Page 41: Protein Evolution: Structure, Function, and Human Health

8 kinds of 1st codon-position synonymous mutation: R-->R and L-->L

Page 42: Protein Evolution: Structure, Function, and Human Health

126 kinds of 3rd-codon position synonymous mutation:
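The counts on these two slides (8 at the first codon position, 126 at the third) can be reproduced by enumerating every single-base change in the standard genetic code. The sketch assumes that changes are counted with direction (TTA --> CTA and CTA --> TTA are two kinds) and that stop codons are left out; the slides do not state this, but under these assumptions the totals match.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_counts():
    """Count directed synonymous single-base changes at each codon position."""
    counts = [0, 0, 0]
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # assumption: stop codons are not counted
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == aa:
                    counts[pos] += 1
    return counts


print(synonymous_counts())  # [8, 0, 126]
```

The eight first-position changes are exactly the Leu (TTA/TTG and CTA/CTG) and Arg (AGA/AGG and CGA/CGG) pairs, which is why the slide lists only R-->R and L-->L.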

Page 43: Protein Evolution: Structure, Function, and Human Health

A Note on Indels

•  Ignored because indels are far more likely to be deleterious
   –  More likely to result in frame shifts

•  Can still be non-deleterious
   –  Particularly if in multiples of three
   –  Over evolutionary time indels more often observed in loops than more constrained structural elements

Page 44: Protein Evolution: Structure, Function, and Human Health

Evolutionary Rates

Speed of Evolution

Page 45: Protein Evolution: Structure, Function, and Human Health

Rates of protein evolution (i.e. rates at which individual amino acids are substituted)

•  Different regions in proteins have different rates of evolution (functional constraints)

•  Different proteins have different overall rates of evolution

Page 46: Protein Evolution: Structure, Function, and Human Health
Page 47: Protein Evolution: Structure, Function, and Human Health

Enolase

•  Ubiquitous glycolytic enzyme, highly conserved throughout evolution

•  TIM Barrel family doing an α-proton abstraction

[Figure: tree with labels cMLE, MLE, Archaea, Bacteria, Euks, α, β, γ]

Page 48: Protein Evolution: Structure, Function, and Human Health

All Eukaryotes site rates (63 taxa) mapped on Lobster Enolase

low rates blue high rates red

Page 49: Protein Evolution: Structure, Function, and Human Health

Site rate categories 1 and 2 (slowest sites)

Page 50: Protein Evolution: Structure, Function, and Human Health

Site rates Categories 3 and 4

Page 51: Protein Evolution: Structure, Function, and Human Health

Site rates Categories 5 and 6

Page 52: Protein Evolution: Structure, Function, and Human Health

Site rates Categories 7 and 8 (fastest sites)

Page 53: Protein Evolution: Structure, Function, and Human Health

Evolutionary rates as a function of enolase structure/function

•  Rates of evolution increase from the centre of the molecule (slow) to the surface (fast)

•  The pattern is probably due to:
   –  Distance from the catalytic centre --> catalytic residues don’t change (slowest), residues that interact with catalytic residues are constrained (slow)
   –  Geometric constraints: residues in the centre of the molecule have restricted ‘space’ around them that constrains them. At the surface, there are fewer such constraints
   –  Hydrophobic core in centre
   –  More loops and alpha helices on surface

•  NOTE: this pattern seems to work for soluble globular enzymes with the catalytic centre at the centre of mass. It does not hold for structural proteins like tubulin, actin etc.

Page 54: Protein Evolution: Structure, Function, and Human Health

Rates of evolution of sites versus their structural position

•  There are no completely general rules!
   –  It depends on what the protein is doing and where.

•  Functional sites (catalytic sites) or sites at interfaces (protein-protein interactions) are conserved

•  Geometric, chemical, folding and functional constraints (catalysis, binding) determine evolutionary constraints

Page 55: Protein Evolution: Structure, Function, and Human Health

Detecting and Quantifying Evolutionary Relationships

Page 56: Protein Evolution: Structure, Function, and Human Health

How do we know if two proteins are homologous?

(A) If sequences >100 amino acids long are >25% identical --> they are probably significantly similar and very likely to be homologous
   -  BLAST, FASTA, Smith-Waterman algorithms are likely to find them “significantly similar” (E-value << 1×10^-4)

(B) If they are >100 long and 15-25% identical (Twilight Zone) --> probably homologous BUT need to rigorously test it
   -  a number of methods are available: permutation test

(C) If they are <15% identical --> difficult to prove homology
   -  test it
   -  if it's not significant, look for motifs in multiple alignments
   -  look at tertiary structure
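The three cases can be written down as a small helper. This is a sketch under stated assumptions: percent identity is computed over aligned columns where neither sequence has a gap (other denominators are in use), and the function names are illustrative. It applies only the rules of thumb from the slide and is no substitute for an E-value or a permutation test.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def homology_zone(identity: float, length: int) -> str:
    """Map identity (%) and sequence length onto the slide's cases (A), (B), (C)."""
    if length <= 100:
        return "too short for these rules of thumb"
    if identity > 25:
        return "(A) very likely homologous"
    if identity >= 15:
        return "(B) twilight zone: probably homologous, test rigorously"
    return "(C) difficult to prove: test, look for motifs, look at tertiary structure"


print(percent_identity("ACDE", "ACDF"))   # 75.0
print(homology_zone(30.0, 150))
print(homology_zone(18.0, 150))
print(homology_zone(10.0, 150))
```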

Page 57: Protein Evolution: Structure, Function, and Human Health

15-23% identity

Page 58: Protein Evolution: Structure, Function, and Human Health
Page 59: Protein Evolution: Structure, Function, and Human Health

Applications

•  Evolutionary methods for studying protein function
   –  Annotating novel proteins
   –  Functional divergence

•  Predicting pathogenicity of mutations
   –  Mendelian disease
   –  Cancer

•  Informing protein structure prediction

Page 60: Protein Evolution: Structure, Function, and Human Health

Applications of Evolutionary Biology to Medicine

Inherited Genetic Diseases and Cancer

Page 61: Protein Evolution: Structure, Function, and Human Health

Lynch Syndrome

•  Autosomal dominant cancer syndrome
•  Increased risk for many cancers, mostly colorectal cancer, due to mismatch repair defects

Page 62: Protein Evolution: Structure, Function, and Human Health


Page 63: Protein Evolution: Structure, Function, and Human Health

Mutator Phenotype

•  Inactivation of mismatch repair (MMR) genes led to mutator phenotypes in E. coli and yeast
•  Included Microsatellite instability

Page 64: Protein Evolution: Structure, Function, and Human Health

Mutator Phenotype

•  Inactivation of mismatch repair (MMR) genes led to mutator phenotypes in E. coli and yeast
•  Included Microsatellite instability

•  Careful research identified human homologs
   –  MLH1 and MSH2
   –  Defects in these genes cause Lynch Syndrome

Page 65: Protein Evolution: Structure, Function, and Human Health

Mismatch Repair

•  Mismatch Repair ->
•  Microsatellite Instability ->
•  Cancer

Most microsatellites are spread throughout the genome in non-genic regions

But some are found in important tumor suppressor genes

Page 66: Protein Evolution: Structure, Function, and Human Health

Applications of Evolutionary Biology to Medicine

Predicting Pathogenicity and Impact of Human Mutations

Page 67: Protein Evolution: Structure, Function, and Human Health

The Sequencing Revolution

Page 68: Protein Evolution: Structure, Function, and Human Health

Problem

•  Often left with hundreds to thousands of potential mutations in a family that “track” with the disease
   –  Needle in a “stack of needles” problem

•  Must discriminate neutral missense mutations from pathogenic ones

Page 69: Protein Evolution: Structure, Function, and Human Health

Evolution at Work

•  Many programs exist to make these predictions:
   –  PolyPhen
   –  Mutation Taster
   –  EvoD
   –  SIFT
   –  PROVEAN
   –  FATHMM
   –  etc

Page 70: Protein Evolution: Structure, Function, and Human Health

Evolution at Work

•  Important amino acids have low evolutionary rates
   –  Higher conservation

•  The more important the protein, the more likely it is to be broadly found among eukaryotes
   –  Also higher overall conservation

•  However, many important proteins in humans are only found in primates, mammals, or animals

Page 71: Protein Evolution: Structure, Function, and Human Health

Evolution at Work

Reference Sequence:
…RPLAHTY…

Multiple Sequence Alignment:
…RPLAHTY…
…RPLVHTY…
…RPIAHTY…
…RPIGHTY…
…RPIICTY…
…RPLACTY…
…RPLLCTY…

Page 72: Protein Evolution: Structure, Function, and Human Health

Evolution at Work

Reference Sequence:
…RPLAHTY…

Multiple Sequence Alignment:
…RPLAHTY…
…RPLVHTY…
…RPIAHTY…
…RPIGHTY…
…RPIICTY…
…RPLACTY…
…RPLLCTY…

Compute an Evolutionary Conservation Score for Each Position
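One simple way to score each position is Shannon entropy over the alignment column, rescaled so that 1 means invariant. This is a sketch of one common choice, not the score used by any of the programs listed earlier; the function names and the scaling by log2(20) are assumptions. The alignment is the seven-sequence example from the slide.

```python
import math
from collections import Counter

MSA = [
    "RPLAHTY",
    "RPLVHTY",
    "RPIAHTY",
    "RPIGHTY",
    "RPIICTY",
    "RPLACTY",
    "RPLLCTY",
]


def column_entropy(column) -> float:
    """Shannon entropy (bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conservation_scores(msa):
    """Score each column from 0 (all 20 residues equally common) to 1 (invariant)."""
    max_entropy = math.log2(20)
    return [1.0 - column_entropy(col) / max_entropy for col in zip(*msa)]


scores = conservation_scores(MSA)
for position, score in enumerate(scores, start=1):
    print(position, round(score, 2))
```

In this example the invariant columns (R, P, T, Y) score 1, the L/I and H/C columns score lower, and the fourth column (A, V, G, I, L) scores lowest.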

Page 73: Protein Evolution: Structure, Function, and Human Health

Evolution at Work

Reference Sequence:
…RPLACTY…

Multiple Sequence Alignment:
…RPLAHTY…
…RPLVHTY…
…RPIAHTY…
…RPIGHTY…
…RPIICTY…
…RPLACTY…
…RPLLCTY…

Conservative changes more likely to be neutral

Page 74: Protein Evolution: Structure, Function, and Human Health

Evolution at Work

Reference Sequence:
…RPLACTP…

Multiple Sequence Alignment:
…RPLAHTY…
…RPLVHTY…
…RPIAHTY…
…RPIGHTY…
…RPIICTY…
…RPLACTY…
…RPLLCTY…

Radical changes more likely to be deleterious
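A crude reading of the last two slides is: a residue that already occurs at that position somewhere in the alignment is more likely to be tolerated than one that never occurs there. The sketch below flags only the never-observed residues. It is a deliberately simplified heuristic with a hypothetical function name; it is not how PolyPhen, SIFT, or the other listed programs compute their predictions.

```python
MSA = [
    "RPLAHTY",
    "RPLVHTY",
    "RPIAHTY",
    "RPIGHTY",
    "RPIICTY",
    "RPLACTY",
    "RPLLCTY",
]


def unobserved_residues(msa, query):
    """Return (position, residue) pairs, 1-based, where the query residue never appears in the column."""
    flagged = []
    for position, (residue, column) in enumerate(zip(query, zip(*msa)), start=1):
        if residue not in column:
            flagged.append((position, residue))
    return flagged


print(unobserved_residues(MSA, "RPLACTY"))  # [] : C is seen at position 5
print(unobserved_residues(MSA, "RPLACTP"))  # [(7, 'P')] : P is never seen at position 7
```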

Page 75: Protein Evolution: Structure, Function, and Human Health

Applications of Evolutionary Biology to Protein Function

Functional Divergence

Page 76: Protein Evolution: Structure, Function, and Human Health

Functional Divergence

Gene 1   Gene 2   Gene 1a

Gene 1 and Gene 1a are known as paralogs, a subset of homologs.

Over evolutionary time scales they can diverge from one another in sequence, as well as function.

Page 77: Protein Evolution: Structure, Function, and Human Health

Types of Functional Divergence

•  Subfunctionalization
   –  Paralog specializes and retains only a subset of ancestral function

•  Neofunctionalization
   –  Paralog gains a new function, and loses old function(s)

•  Subneofunctionalization
   –  Paralog undergoes rapid subfunctionalization but then undergoes neofunctionalization

Page 78: Protein Evolution: Structure, Function, and Human Health

Functional Divergence

[Figure: tree with labels Gene A, Family A, Family B]

Page 79: Protein Evolution: Structure, Function, and Human Health

Functional Divergence

…A L H… Species 1
…A L H… Species 2
…A L H… Species 3
…A L H… Species 4
…A L H… Species 5
…A L H… Species 6

…R A H… Species 1
…R R H… Species 2
…R C H… Species 3
…R A H… Species 4
…R A H… Species 5
…R Y H… Species 6

Family B

Family A
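The two alignment blocks on this slide can be compared column by column to spot candidate functionally divergent sites. In this sketch the blocks are called `group_1` and `group_2` because the transcript does not make clear which block is Family A and which is Family B; the category labels returned by `compare_site` are illustrative wording, not terminology from the lecture.

```python
group_1 = ["ALH", "ALH", "ALH", "ALH", "ALH", "ALH"]
group_2 = ["RAH", "RRH", "RCH", "RAH", "RAH", "RYH"]


def column_residues(seqs, index):
    """Set of residues seen at one alignment position within a group."""
    return {seq[index] for seq in seqs}


def compare_site(first, second, index) -> str:
    """Describe how conservation at one site differs between two paralog families."""
    res_1 = column_residues(first, index)
    res_2 = column_residues(second, index)
    conserved_1, conserved_2 = len(res_1) == 1, len(res_2) == 1
    if conserved_1 and conserved_2:
        if res_1 == res_2:
            return "conserved in both, same residue"
        return "conserved in both, different residue"
    if conserved_1 or conserved_2:
        return "conserved in one family only"
    return "variable in both"


for i in range(3):
    print(i + 1, compare_site(group_1, group_2, i))
```

For the slide's example, site 1 is conserved in both families but with a different residue (A versus R), site 2 is conserved in one family and variable in the other, and site 3 (H) is conserved in both. The first two patterns are the ones usually read as hints of functional divergence after duplication.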

Page 80: Protein Evolution: Structure, Function, and Human Health

Glyceraldehyde-3-Phosphate Dehydrogenase

Glyceraldehyde-3-Phosphate + NAD+ + Pi <--> 1,3-Bisphosphoglycerate + NADH + H+

Cytosol: Glycolysis

Page 81: Protein Evolution: Structure, Function, and Human Health

Glyceraldehyde-3-Phosphate Dehydrogenase

Glyceraldehyde-3-Phosphate + NADP+ + Pi <--> 1,3-Bisphosphoglycerate + NADPH + H+

Plastid: Calvin Cycle

Page 82: Protein Evolution: Structure, Function, and Human Health

GAPDH Evolution

[Figure: GAPDH tree with labels Green Plants, Cyanobacteria, ‘Chromalveolates’, Cytosolic GapC (two clades)]

Page 83: Protein Evolution: Structure, Function, and Human Health

GAPDH  Structure  

Page 84: Protein Evolution: Structure, Function, and Human Health

NADPH Binding Necessary for Calvin Cycle Function