Approach         Component examined   Techniques
Genomics         Genes                Sequencing Programs
Transcriptomics  mRNA                 DNA arrays, GeneChip
Proteomics       Proteins             2D PAGE, MALDI-MS, ESI-MS
Metabolomics     Metabolites          GC-MS
The mRNA level does not equal the expressed protein level, nor does it indicate the nature of the functional protein product
Genomic Sequence
-> Transcriptional Control -> mRNA
-> Translational Control -> Protein Product
-> Post-Translational Control -> Functional Protein Product
Temporal Changes in mRNA and protein
When you measure expression affects what you find
[Figure: gene and protein expression plotted against time t]
Does mRNA level correlate with protein level?
Anderson & Seilhamer, Electrophoresis 1997, 18:533-537
Anderson & Anderson, Electrophoresis 1998, 19:1853-1861
From Tew et al. 1996
[Figure: two scatter plots on log-scaled axes]
• 20 liver proteins and corresponding mRNAs: mRNA (EST clones) vs. protein (2D gels), R = 0.48
• Glutathione-S-transferase in 60 human cell lines (lung, ovarian, CNS, leukemia, renal, melanoma, breast): mRNA (Northern) vs. protein (affinity HPLC), R = 0.43
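The R values quoted for these plots are correlation coefficients between mRNA and protein abundance. A minimal sketch of how such a value could be computed is given here, assuming a Pearson correlation on log10-transformed abundances (an assumption suggested by the log-scaled axes; the slides do not state how R was obtained). The function names and the numbers are made up for illustration and are not data from the cited studies.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def log_pearson_r(mrna, protein):
    """Correlate abundances after a log10 transform (all values must be > 0)."""
    return pearson_r([math.log10(v) for v in mrna],
                     [math.log10(v) for v in protein])

# Hypothetical abundances in arbitrary units, one entry per gene.
mrna_levels = [0.5, 2.0, 8.0, 40.0, 300.0]
protein_levels = [3.0, 0.8, 60.0, 5.0, 90.0]

r = log_pearson_r(mrna_levels, protein_levels)
print(f"R = {r:.2f}")
```

A value near 1 would mean mRNA level predicts protein level well; values around 0.4 to 0.5, as reported on the slide, indicate only a weak relationship.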
Challenges of proteins vs DNA
DNA
• Static
• Can be amplified
• Little complexity: single component
• Good solubility characteristics
Protein
• Very dynamic
• Cannot be amplified
• Very complex: post-translational modification
• Variable solubility
Identifying new protein complexes:
Isolation of proteins using:
• Classical Purification + 1D PAGE
• Tag Purification + 1D PAGE
Phenotypic Complexity of the Eukaryotic Proteome
Domain Accretion
Protein Architecture
Protein Diversity
Functional Diversity
Domain Expansion
Somatic Rearrangement
Alternative Splicing
Horizontal Transfer
Modifications
Biological Processes
Paralogous Expansion
Protein Interactions
Evolution Somatic
de novo Systems
• Duplication
• Divergence
• Recombination
Recombination
Eukaryotic Proteomes
Proteome           Human    Fly      Worm     Yeast   Mustard Weed
Number of Genes    31,778   13,338   18,266   6,144   25,706
% of DB Matches*   51       56       50       50      52
(* Similarity search of protein sequences in the database)
Comparative Analysis of Proteomic Pheno-Complexity
Eubacteria
Archaea
Eukarya
Unicellular Organisms
Invertebrates Vertebrates Mammals Human
Conserved Core Proteins
Lineage-Specific Proteins
Domain Accretion
Protein Architecture
Vertebrate-Specific Proteins
Protein Diversity
Functional Diversity
Protein Sequence Homology
(1) Protein Match with Known or Unknown Function: Query -> Match
(2) Domain Match with Known or Unknown Function: Query -> Match
Orthologs: Evolutionarily conserved genes in different species that diverged through speciation
Paralogs: Genes that arose by duplication within the genome of a species
Protein Sequence Comparison
(I) Homology
• > 40 %: Same Function
• 25-40 %: Similar Function
• < 25 %: Different Function
(II) Distance
• Phylogenetic Tree
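The percentage thresholds under (I) can be read as a rule of thumb applied to pairwise sequence identity. The sketch below assumes the percentages refer to percent identity over a pre-computed alignment, which the slide does not state explicitly; `percent_identity` and `function_from_identity` are hypothetical helper names, and counting identical columns over the full alignment length is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns with identical residues (gaps written as '-')."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)

def function_from_identity(pct):
    """Apply the slide's thresholds: > 40 same, 25-40 similar, < 25 different."""
    if not 0 <= pct <= 100:
        raise ValueError("percent identity must be between 0 and 100")
    if pct > 40:
        return "same function"
    if pct >= 25:
        return "similar function"
    return "different function"

# Two made-up, already aligned peptide sequences differing at one position.
pct = percent_identity("MKTAYIAK", "MKTAHIAK")
print(pct, function_from_identity(pct))  # 87.5 same function
```

The cutoffs are heuristics: proteins above 40 % identity can still differ in function, so a call like this is a starting hypothesis rather than an annotation.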
Comparative Proteomics
Domain/Protein*       Yeast   Worm   Fly   Weed   Human
Eukaryote-specific    1       1      1     1      1
Metazoan-specific     0       1      1     1      1
Animal-specific       0       1      1     0      1
Vertebrate-specific   0       0      0     0      1
*: The domain/protein is present (1) or absent (0) in the proteome.
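The presence/absence rows above are phylogenetic profiles. As a rough sketch, a profile can be derived from per-proteome hit counts and looked up against the four patterns in the table. The species order and labels are taken from the slide; the hit counts, the `classify` function, and the "unclassified" fallback are assumptions made for illustration.

```python
SPECIES = ("Yeast", "Worm", "Fly", "Weed", "Human")

# Presence (1) / absence (0) patterns as listed in the table.
PROFILE_LABELS = {
    (1, 1, 1, 1, 1): "Eukaryote-specific",
    (0, 1, 1, 1, 1): "Metazoan-specific",
    (0, 1, 1, 0, 1): "Animal-specific",
    (0, 0, 0, 0, 1): "Vertebrate-specific",
}

def profile(hit_counts):
    """Turn {species: number of matches} into a 0/1 tuple in SPECIES order."""
    return tuple(1 if hit_counts.get(s, 0) > 0 else 0 for s in SPECIES)

def classify(hit_counts):
    """Return the table's label for a profile, or 'unclassified' if not listed."""
    return PROFILE_LABELS.get(profile(hit_counts), "unclassified")

# Hypothetical search results for two domains.
print(classify({"Human": 3}))                       # Vertebrate-specific
print(classify({"Worm": 2, "Fly": 5, "Human": 9}))  # Animal-specific
```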
Eukaryotic Proteomes Shared with Humans
[Figure: human proteome compared with fly, yeast, and worm; the shared fractions are labeled 61%, 43%, and 46%]
Conserved Core Groups in Eukaryotes
Conserved Core Proteins in 1,308 Groups
• Human (3,109 Proteins)
• Fly (1,445 Proteins)
• Yeast (1,441 Proteins)
• Worm (1,503 Proteins)
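One way to read these numbers is as the average number of proteins each species contributes per conserved core group. The short calculation below uses only the figures given on the slide; interpreting the ratio as a rough measure of paralog expansion is an assumption, not a claim made in the source.

```python
CORE_GROUPS = 1308
CORE_PROTEINS = {"Human": 3109, "Fly": 1445, "Yeast": 1441, "Worm": 1503}

# Average number of proteins per conserved core group, per species.
ratios = {species: n / CORE_GROUPS for species, n in CORE_PROTEINS.items()}

for species, ratio in ratios.items():
    print(f"{species}: {ratio:.2f} proteins per group")
# Human: 2.38, Fly: 1.10, Yeast: 1.10, Worm: 1.15
```

Human stands out with more than two proteins per group on average, while the other three proteomes stay close to one.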
Vertebrate-Specific Proteins
Unicellular Organisms
Invertebrates Vertebrates Mammals Human
[Pie chart: human proteins grouped by where homologs are found]
• Vertebrate-specific Proteins: 22%
• Vertebrates and Other Animals: 24%
• Other Eukaryotes and Animals: 32%
• Eukaryote and Prokaryote: 21%
Comparative Pheno-Complexity
Bacteria
Archaea
Eukarya
Unicellular Organisms
Invertebrates Vertebrates Mammals Human
Conserved Core Proteins
Lineage-Specific Proteins
Domain Accretion
Protein Architecture
Vertebrate-Specific Proteins
Protein Diversity
Housekeeping Functions
• Energy/Metabolism
• DNA Replication/Repair
• Translation
Physiological Differences
• Defense & Immunity
• Cell-Cell Communications
• Nervous System
Functional Diversity
Protein Diversity in Eukaryotes
• Horizontal Gene Transfer
• Invention of Protein Domain
• Expansion of Protein/Domain Families
• Evolution of New Protein Architectures
Lateral Gene Transfer
Human / Bacteria: 223 Genes
• Hydrolase
• Oxidoreductase
• Dehydrogenase
• Monoamine Oxidase
• Transporter
• Lineage Specific
• Intron Acquisition
Protein Function Assignment
12 Function Categories (Gene Ontology Project)
1. Cellular Processes
2. Metabolism
3. DNA Replication/Modification
4. Transcription/Translation
5. Intracellular Signaling
6. Cell-Cell Communication
7. Protein Folding/Degradation
8. Transport
9. Multifunctional Proteins
10. Cytoskeletal/Structural
11. Defense and Immunity
12. Miscellaneous Function
Classification of Proteome
(1) Functional Categories
(2) Evolutionary Conservation
(3) Structural Classification
Protein Sequence
-> Domain/Motif Databases: PRINTS, Prosite, Pfam, Prosite Profile
-> Functional Annotation
-> Cellular Function
~50% of Eukaryotes
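The flow from protein sequence through domain/motif matches to a functional category can be sketched as a simple lookup. Everything in the mapping below is an illustrative assumption: the domain names come from elsewhere in these slides, but which category each domain implies, and the rule that proteins hitting several categories count as "Multifunctional Proteins", are not specified by the source. Real annotation relies on the domain databases themselves.

```python
# Hypothetical domain -> functional category assignments (illustration only).
DOMAIN_TO_CATEGORY = {
    "Immunoglobulin": "Defense and Immunity",
    "C2H2 zinc finger": "Transcription/Translation",
    "RasGAP": "Intracellular Signaling",
    "EGF-like": "Cell-Cell Communication",
}

def assign_function(domains):
    """Map a protein's domain hits to one functional category.

    Returns None when no domain is recognized, i.e. the protein
    stays unannotated.
    """
    categories = {DOMAIN_TO_CATEGORY[d] for d in domains if d in DOMAIN_TO_CATEGORY}
    if not categories:
        return None
    if len(categories) > 1:
        return "Multifunctional Proteins"
    return next(iter(categories))

print(assign_function(["Immunoglobulin", "Immunoglobulin"]))  # Defense and Immunity
print(assign_function(["RasGAP", "C2H2 zinc finger"]))        # Multifunctional Proteins
print(assign_function(["uncharacterized"]))                   # None
```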
Bacteria
Archaea
Eukarya
Unicellular Organisms
Invertebrates Vertebrates Mammals Human
Vertebrate-Specific Proteins
Physiological Differences
• Defense & Immunity
• Cell-Cell Communications
• Nervous System
Functions
94 (7%) of 1,262 InterPro Families
• 70 Proteins
• 24 Domains
New Proteins and Domains in Vertebrates
Yeast Worm Fly
• Few new protein domains invented
• Common ancestral domains in animals
Protein Domain
• An evolutionary unit
• The coding sequence can be duplicated and/or recombined
• ~100 to 250 residues
• Found in small proteins or as parts of large ones within a domain family
• Descended from a common ancestor
• Duplication: gives rise to one or more domains
• Divergence: generates modified proteins by mutations or insertions/deletions
• Recombination: produces new domain arrangements
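The three processes listed above can be illustrated on a toy model in which a protein is just an ordered tuple of domain names. This is a sketch under that simplifying assumption; the function names and the single-letter domain names are hypothetical, and real events act on coding sequence rather than on labels.

```python
def duplicate(arch, index):
    """Duplication: copy the domain at `index` and insert the copy right after it."""
    return arch[:index + 1] + (arch[index],) + arch[index + 1:]

def diverge(arch, index, new_name):
    """Divergence: replace one domain by a modified descendant (mutation or indel)."""
    return arch[:index] + (new_name,) + arch[index + 1:]

def recombine(left, right, cut_left, cut_right):
    """Recombination: join the start of one architecture to the end of another."""
    return left[:cut_left] + right[cut_right:]

ancestor = ("A", "B")
step1 = duplicate(ancestor, 1)                  # ('A', 'B', 'B')
step2 = diverge(step1, 2, "B'")                 # ('A', 'B', "B'")
step3 = recombine(step2, ("C", "D"), 2, 1)      # ('A', 'B', 'D')
print(step1, step2, step3)
```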
Protein Domain Architecture
Domain A B C D
(I) Single-domain Protein
(II) Multi-domain Protein
• Prokaryotic Proteome: 2/3 of proteins have 2 or more domains
• Eukaryotic Proteome: 4/5 of proteins are multi-domain
Invention of Protein Domain
Number of Proteins
Domain             Yeast   Worm   Fly   Human   Weed
C2H2 zinc finger   48      151    357   706     115
Leu-rich repeats   7       54     115   188     392
EGF-like           0       113    81    222     17
TIR                0       2      8     18      131
Immunoglobulin     0       64     140   765     0
KRAB box           0       0      0     171     0
Q14 repeats        0       0      0     1       0
• Expansion of paralogous proteins in metazoans
• Invention of new domains in eukaryotic genome evolution
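Both bullet points can be checked directly against the counts: a domain was "invented" in a lineage if it is absent from the other proteomes, and it has "expanded" if its count grows between proteomes. The sketch below uses a subset of the rows from the table above; the helper names and the choice of fly as a baseline are assumptions made for illustration.

```python
SPECIES = ("Yeast", "Worm", "Fly", "Human", "Weed")

# Number of proteins containing each domain (subset of the slide's table).
DOMAIN_COUNTS = {
    "C2H2 zinc finger": (48, 151, 357, 706, 115),
    "EGF-like": (0, 113, 81, 222, 17),
    "TIR": (0, 2, 8, 18, 131),
    "Immunoglobulin": (0, 64, 140, 765, 0),
}

def species_with_domain(domain):
    """List the proteomes in which the domain occurs at least once."""
    return [s for s, n in zip(SPECIES, DOMAIN_COUNTS[domain]) if n > 0]

def fold_expansion(domain, species, baseline):
    """Ratio of counts between two proteomes; None if the baseline lacks the domain."""
    counts = dict(zip(SPECIES, DOMAIN_COUNTS[domain]))
    if counts[baseline] == 0:
        return None
    return counts[species] / counts[baseline]

print(species_with_domain("Immunoglobulin"))               # ['Worm', 'Fly', 'Human']
print(round(fold_expansion("Immunoglobulin", "Human", "Fly"), 2))  # 5.46
```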
Domain Expansion: Duplication
Number of Proteins (Domains)
Domain   Yeast    Worm      Fly        Human      Weed
RasGAP   3        8         5          11         0
RhoGAP   9        20        19         59         8
ArfGAP   6        8         9          16         15
Ig       0        67(323)   125(291)   381(930)   0
PH       24       65(68)    72(78)     193(212)   23
SH3      23(27)   46(61)    55(75)     143(182)   4
Ank      12(20)   75(223)   72(269)    145(404)   66(111)
Domains are expandable in metazoans!
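Entries of the form "proteins(domains)" make it possible to estimate how many copies of a domain an average protein carries. The sketch below parses that notation and computes the ratio; reading a bare number as "domain count not given" is an assumption, as are the helper names.

```python
import re

ENTRY = re.compile(r"^(\d+)(?:\((\d+)\))?$")

def parse_entry(text):
    """Parse 'proteins' or 'proteins(domains)' into (proteins, domains or None)."""
    match = ENTRY.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized table entry: {text!r}")
    proteins = int(match.group(1))
    domains = int(match.group(2)) if match.group(2) else None
    return proteins, domains

def domains_per_protein(text):
    """Average domain copies per protein, or None when it cannot be computed."""
    proteins, domains = parse_entry(text)
    if domains is None or proteins == 0:
        return None
    return domains / proteins

# Human entries from the table above.
print(round(domains_per_protein("381(930)"), 2))  # Ig:  2.44
print(round(domains_per_protein("145(404)"), 2))  # Ank: 2.79
print(domains_per_protein("11"))                  # RasGAP: None (no domain count)
```

Ratios above 1 mean the domain is repeated within individual proteins, which is the within-protein side of domain expansion.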
Rosetta Stone
Protein A: Function 1
Protein B: Function 2
Protein X: Functions 1 and 2 due to domain recombination
Similarity Search of Protein Databases
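The Rosetta Stone idea is that two separate proteins are likely to be functionally linked if, in another proteome, a single protein contains both of them as fused parts. The sketch below is a simplified stand-in: it compares domain labels instead of running a sequence similarity search, and the function name, the dictionaries, and the domain labels are all hypothetical.

```python
from itertools import combinations

def rosetta_stone_links(fusion_proteome, query_proteome):
    """Return (protein_a, protein_b, fusion_protein) triples.

    Two query proteins with non-overlapping domain sets are linked when a
    single protein in the other proteome contains the domains of both.
    """
    links = []
    for fusion_name, fusion_domains in sorted(fusion_proteome.items()):
        fusion_set = set(fusion_domains)
        for (a, doms_a), (b, doms_b) in combinations(sorted(query_proteome.items()), 2):
            set_a, set_b = set(doms_a), set(doms_b)
            if (set_a and set_b and set_a.isdisjoint(set_b)
                    and set_a <= fusion_set and set_b <= fusion_set):
                links.append((a, b, fusion_name))
    return links

query = {"Protein A": ["domain1"], "Protein B": ["domain2"], "Protein C": ["domain3"]}
other = {"Protein X": ["domain1", "domain2"]}
print(rosetta_stone_links(other, query))
# [('Protein A', 'Protein B', 'Protein X')]
```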
Domain Accretion: Recombination
Ancestral Domains in Different Proteins
A B C D
A C B D
A C B D?
Combinatorial Architecture
[Figure: combinatorial domain architectures of multi-domain proteins built from the Rho, X, PH, ArfGAP, Ank, SH3, and PBS domains]
Superdomain: Domain recombination in sequential order
Classification of Multi-domain ArfGAP Gene Family
[Figure: ArfGAP family members grouped by class according to their domain architecture]
Domain legend
Rho: Ras-like GTPases
X: Domain X
PH: Pleckstrin homology domain
ArfGAP: Zinc finger domain
Ank: Ankyrin repeat
SH3: Src homology domain
PBS: Paxillin-binding subdomain
[Figure: exon structures of KIAA1099.1 and KIAA1099.0, with exons 1, 11, 12, 15, 16, 17 marked; labels C1 and C6; AW993140 (159)]
Expression of Variants in Multiple Human Tissues: KIAA1099.0 and .1
[Gel image: bands for KIAA1099.1 and KIAA1099.0 in lanes 1-16. Tissue labels: Leukocytes, LN, Spleen, Amygdala, Brain, S. Muscle, Heart, S. I., Stomach, M. Gland, Liver, Kidney, Lung, Uterus, Testis, Placenta]
[Figure: exon structures of KIAA1099.2 and KIAA1099.3, with exons 1, 11, 12, 15 marked; labels C1 and C5; BE780934 (395); AW993140 (159)]
Expression of Variants in Multiple Human Tissues: KIAA1099.2 and .3
[Gel image: bands for KIAA1099.3 and KIAA1099.2 in lanes 1-16. Tissue labels: Leukocytes, LN, Spleen, Amygdala, Brain, S. Muscle, Heart, S. I., Stomach, M. Gland, Liver, Kidney, Lung, Uterus, Testis, Placenta]
Expressed Diversities of Functional Domains
Rho X PH ArfGAP Ank Ank
Rho X PH ArfGAP
Class II ArfGAP: KIAA 1099
Transcription
Alternatively Spliced Variant Transcripts
• One alternatively spliced transcript lacks ankyrin repeats.
• Other variants have an altered PH domain.
Rho X PH ArfGAP Ank Ank
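The effect of alternative splicing on domain content can be described by comparing each variant's architecture with the full-length one. The sketch below uses the architectures shown on this slide; marking an altered domain with a trailing "*" is a hypothetical notation introduced here, and `compare_variant` is an illustrative helper, not part of any published analysis.

```python
from collections import Counter

# Full-length Class II ArfGAP architecture as drawn on the slide.
REFERENCE = ("Rho", "X", "PH", "ArfGAP", "Ank", "Ank")

def compare_variant(reference, variant):
    """Report which reference domains a splice variant lacks or carries in altered form.

    Altered domains are written with a trailing '*' (hypothetical notation).
    """
    altered = sorted(d.rstrip("*") for d in variant if d.endswith("*"))
    present = Counter(d.rstrip("*") for d in variant)
    # Counter subtraction keeps only domains that occur more often in the reference.
    ablated = sorted((Counter(reference) - present).elements())
    return {"ablated": ablated, "altered": altered}

print(compare_variant(REFERENCE, ("Rho", "X", "PH", "ArfGAP")))
# {'ablated': ['Ank', 'Ank'], 'altered': []}
print(compare_variant(REFERENCE, ("Rho", "X", "PH*", "ArfGAP", "Ank", "Ank")))
# {'ablated': [], 'altered': ['PH']}
```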
Eukaryotic Protein Diversity
(I) Genome Evolution (Germ-line)
• Lateral Gene Transfer: Bacterial Genes
• Domain Invention: Vertebrate-specific Proteins
• New Architecture: Combinatorial Domain Accretion
• Domain Expansion: Multiple Domains in a Protein
• Paralogous Expansion: Gene Duplication
(II) Gene Expression (Somatic)
• Somatic Rearrangement: Ig & TCR Gene Families
• Alternative Splicing: Protein Isoforms
Alternative Splicing: Domain Ablation or Alteration
Phenotypic Complexity of the Eukaryotic Proteome
Domain Accretion
Protein Architecture
Protein Diversity
Functional Diversity
Domain Expansion
Somatic Rearrangement
Alternative Splicing
Horizontal Transfer
Modifications
Biological Processes
Paralogous Expansion
Protein Interactions
Evolution Somatic
de novo Systems
• Duplication
• Divergence
• Recombination
Recombination
• Domain ablation
• Domain alteration
Protein Diversity Functional Diversity
Biological Processes
Physiome Patholome
Cellome Metabolome
Proteome
Genome
Integrated Life Sciences in the Post-Genomic Era
Functional Proteomics Structural Proteomics
Gene Repertoire
Protein Repertoire
Systems Biology