6 November 2007 © ETH Zürich | Genevestigator
Gene expression analysis and network discovery:
Genevestigator
Philip Zimmermann, Genevestigator Team, ETH Zurich
6 November 2007 P. Zimmermann / ETH Zurich / [email protected] 2
Presentation flow
Gene networks – biological context
Microarray compendium: how, and what for?
Meta-profile analysis: concepts and validation
Genevestigator® V3
Data integration
Summary & conclusion
Gene networks - biological context
What is the interpretational value of a gene network derived by graphical modeling or correlation analysis?
a snapshot in time?
a snapshot in space?
an average trend?
Gene networks - biological context
From what experiment(s) was this network derived?
time-course?
cell culture, whole organism?
stimulus, drug response?
anatomy part?
stage of development?
genetic modification?
Context and dynamics of networks
Hypothesis: networks are dynamic and context-dependent
=> networks evolve!
=> networks may have different functions in different contexts!
Question: how can we quantify the role of the context in shaping the network?
Context: the time-space-response dimensions
Time => time-course, development
Space => anatomy parts, intracellular localization
Response => response to external perturbations
=> response to modifications in the genome
Context and dynamics of networks
Modeling the time, space and response dimensions requires:
experiments testing time, space and response variables
storage of measurement data and its metadata
analysis methods that incorporate these dimensions (→ meta-profiles)
Analysis versus meta-analysis
Data storage
Data analysis
100 genes – what to do next?
10 billion data points – what to do next?
Microarray experiment
Data repositories
Data (heterogeneous datasets) + Annotations (unsystematic or poor annotation) = meta-analysis impossible!
Data warehouses
Data quality control → ordered datasets
Expert annotation with systematic ontologies (anatomy, development, stimulus, mutation) → systematic annotation
Ordered datasets + systematic annotation = meta-analysis possible!
Data quality control
RLE
NUSE
Border elements
Correlation matrix
Affy QC metrics
RNA degradation
Unprocessed values
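Of the checks listed on this slide, RLE (relative log expression) is simple enough to sketch. The following is a minimal illustration, not the Genevestigator pipeline: for each probe set, the median log2 signal across all arrays is subtracted, and an array whose RLE values are not centered near zero (or are unusually spread out) becomes a candidate for exclusion. The function name, the toy values and the 0.5 flagging threshold are assumptions made for this example.

```python
from statistics import median, quantiles

def rle_summary(log2_signals):
    """Summarize RLE per array.

    log2_signals: dict mapping array name -> list of log2 signals,
    with probe sets in the same order for every array.
    Returns dict mapping array name -> (median RLE, interquartile range).
    """
    arrays = list(log2_signals)
    n_probes = len(log2_signals[arrays[0]])
    # Reference value per probe set: median across all arrays.
    probe_medians = [
        median(log2_signals[a][i] for a in arrays) for i in range(n_probes)
    ]
    summary = {}
    for a in arrays:
        rle = [v - m for v, m in zip(log2_signals[a], probe_medians)]
        q1, _, q3 = quantiles(rle, n=4)
        summary[a] = (median(rle), q3 - q1)
    return summary

# Hypothetical toy data: three arrays, four probe sets.
toy = {
    "array_1": [8.0, 10.0, 6.0, 12.0],
    "array_2": [8.1, 9.9, 6.1, 12.1],
    "array_3": [9.5, 11.4, 7.6, 13.5],  # globally shifted upwards
}
summary = rle_summary(toy)
# Assumed rule of thumb for this sketch: flag arrays with |median RLE| > 0.5.
flagged = [a for a, (med, _) in summary.items() if abs(med) > 0.5]
print(flagged)
```

NUSE additionally needs standard errors from a probe-level model fit, so it is not covered by this sketch.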
Ontologies – example of Anatomy
Mouse / Rat: Edinburgh Mouse Atlas
Human: mapping to Mouse and Rat anatomy tree
Arabidopsis / Barley: terms from Plant Ontology, tree created by Genevestigator
Expert annotation with systematic ontologies
anatomy
development
stimulus
mutation
Ontologies – example of Development
Mouse: Theiler stages
Rat: Witschi stages
Human: Carnegie table
Arabidopsis: Boyes key
Meta-analysis tools
• Who is most interested in mining this data?
• Who can best interpret the results?
THE BIOLOGIST!
Genevestigator® – a tool for biologists
Expression meta-profiles
[space] [time] [response] [response]
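The slides do not spell out how a meta-profile is computed. One plausible reading, assumed here, is that the signal of a probe set is averaged over all arrays that share the same annotation category (an anatomy part, a developmental stage, a stimulus). The sketch below illustrates that reading only; the function name, array identifiers and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def meta_profile(signals, annotations):
    """Average one probe set's signal per annotation category.

    signals: dict array_id -> log2 signal of the probe set on that array.
    annotations: dict array_id -> category label (e.g. an anatomy term).
    Returns dict category -> (mean signal, number of arrays).
    Arrays without an annotation are skipped.
    """
    groups = defaultdict(list)
    for array_id, value in signals.items():
        category = annotations.get(array_id)
        if category is not None:
            groups[category].append(value)
    return {c: (mean(v), len(v)) for c, v in groups.items()}

# Hypothetical example for the [space] dimension.
signals = {"a1": 12.0, "a2": 11.0, "a3": 6.0, "a4": 7.0, "a5": 8.0, "a6": 9.0}
anatomy = {
    "a1": "heart ventricle", "a2": "heart ventricle",
    "a3": "liver", "a4": "liver", "a5": "liver",
    # "a6" has no anatomy annotation and is ignored
}
profile = meta_profile(signals, anatomy)
print(profile)
```

The same function serves the [time] and [response] dimensions if a development or stimulus annotation is passed instead of the anatomy one.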
Data validation
Category type, e.g. heart ventricle
Probe set, e.g. Mm.23432
[space] [time]
[response]
Data validation
Category type, e.g. heart ventricle
Probe set
[space] [time]
[response]
Mouse anatomy meta-profiles [space]
Data validation
Category type
Probe set, e.g. Mm.23432
[space] [time]
[response]
Rnf33: transcription has been shown to occur already in the mouse oocyte, but not beyond the eight-cell stage nor in adult tissues
Hoxa1: expression starts at E7.5 and begins to retreat caudally by E8.5
Hemopexin (hx): known to be expressed only at low levels in embryos and newborn mice; it does not reach its highest expression level until the first year of age
a – f: pre-natal; g – l: post-natal
light-harvesting chlorophyll a/b binding protein (AT4G14690)
protochlorophyllide reductase A (At5g54190)
Development of Genevestigator®
14,500 Affymetrix arrays (Nov 2007)
Human, mouse, rat, Arabidopsis, barley
Metabolic and regulatory pathway maps for mouse and Arabidopsis
> 10,000 registered users
> 500 citations in peer-reviewed journals
Anatomy, Development, Stimulus, Mutation
Microarray data
Public repositories
Genevestigator database
Curation & Quality control
Biological experiments
Application server
Client Java application
Genevestigator
Genevestigator® V3
Website Java Client Application
Database and Application Server Cluster
Toolsets and tools
Biomarker Search toolset
Abiotic stresses and hormonal responses
salt (+), osmotic (+)
cold (+)
ABA (+)
2,4-D, glucose
salt (+), osmotic (+)
ABA (+)
norflurazon (-), mycorrhiza (-)
anoxia (-), hypoxia (-)
BL / H3BO3 (+)
syringolin (-), cycloheximide (-)
H2O2 (-)
salt (-), osmotic (-)
---
ozone (-), genotoxic (-)
salt (+), drought (+)
MeJA (+)
syringolin (-), P. syringae (+)
ozone (+), B. cinerea (+)
hypoxia (-)
ethylene (+)
AVG (+), chitin (+)
Biclustering
Searches for subsets of genes coexpressed across subsets of conditions
BiMax algorithm: finds all maximal bicliques
[space] [time]
[response]
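To make "maximal bicliques" concrete: if the expression matrix is binarized (1 = gene responds in a condition), it can be read as a bipartite graph between genes and conditions, and a bicluster is a group of genes that are all 1 across a group of conditions, such that no gene or condition can be added. The sketch below enumerates these on a tiny matrix with a naive closure over row intersections. It is not the BiMax algorithm, which uses a divide-and-conquer strategy; it only illustrates what is being searched for, and it is practical for small inputs only. The function name and the toy matrix are assumptions.

```python
def maximal_biclusters(matrix, min_genes=1, min_conditions=1):
    """Return all inclusion-maximal all-ones submatrices of a 0/1 matrix.

    Rows are genes, columns are conditions. Each result is a pair
    (gene_indices, condition_indices). Naive enumeration, small inputs only.
    """
    # Set of conditions in which each gene is "on".
    row_sets = [frozenset(j for j, v in enumerate(row) if v) for row in matrix]

    # The condition set of every maximal bicluster is an intersection of
    # one or more gene rows, so collect all non-empty intersections.
    closed = set()
    frontier = {s for s in row_sets if s}
    while frontier:
        closed |= frontier
        frontier = (
            {a & r for a in frontier for r in row_sets} - closed - {frozenset()}
        )

    results = []
    for conditions in closed:
        # All genes that are "on" in every condition of this set.
        genes = tuple(i for i, s in enumerate(row_sets) if conditions <= s)
        if len(genes) >= min_genes and len(conditions) >= min_conditions:
            results.append((genes, tuple(sorted(conditions))))
    return sorted(results)

# Hypothetical binarized matrix: 3 genes x 3 conditions.
binarized = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
biclusters = maximal_biclusters(binarized, min_genes=2, min_conditions=2)
print(biclusters)  # [((0, 1), (0, 1)), ((1, 2), (1, 2))]
```

The minimum-size arguments mirror the usual practice of discarding trivial biclusters, such as a single gene with all of its conditions.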
ABA response
Beta-alanine
Starch / sucrose
Inositol phosphate
Cold response
Phenylalanine / Tyrosine
Proline
ABA biosynthesis
[space] [time]
[response]
Biomarker search [time]
Genes expressed specifically in seeds and germinating seedlings
De-novo identification of cis-regulatory elements
Biomarker search [space]
z = 18.2
z = 5.8
z = 5.4
Biomarker search [response]
"Supervised biclustering"
isoxaben (+)
norflurazon (-)
light (+)
nitrate_low (-)
Anatomy clustering and promoter analysis
Clusters of genes expressed specifically in:
cell suspension
petals
roots
seeds
stamen
xylem
z > 5.0
Development clustering and promoter analysis
Clusters of Arabidopsis genes expressed specifically at:
dev. stage 1
dev. stage 3
dev. stage 9
z > 5.0
Stimulus clustering and promoter analysis
"Supervised biclustering" of stimulus meta-profiles:
cluster 1
cluster 2
cluster 4
cluster 5
cluster 7
z > 5.0
Data integration: transcriptome - proteome
[Figure: Transcripts and Proteins compared across cell suspension, cotyledons, flowers, leaves, roots and seeds]
Arabidopsis leaf transcripts and proteins
[Figure: transcript quantification measure plotted against protein quantification measure, with frequency distributions]
Figure annotations:
general background range for transcript quantification measure
proteins detected in leaves
proteins not detected in leaves but for which there is a probe set on the ATH1 array
Protein detection and transcript abundance
[Figure: number of transcripts/proteins (0 to 4000) against transcript abundance measure (log2 signal, 1 to 15)]
Series: probe sets called "absent" on ATH1 (p >= 0.05); probe sets called "present" on ATH1 (p < 0.05); leaf proteins detected by peptide identification
[Figure: fraction of "present" transcripts that were detected at the protein level (0 to 1.2), for log2 signal 6 to 15]
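The quantity shown on this slide can be sketched as follows, assuming each probe set comes with a log2 signal, a detection p-value (present if p < 0.05, as in the figure legend) and a flag saying whether the corresponding protein was identified. Binning by integer log2 signal, the function name and all data values are assumptions made for this illustration; the presenters' actual procedure is not given in the slides.

```python
import math
from collections import defaultdict

def detected_fraction_by_bin(probe_sets, p_cutoff=0.05):
    """Fraction of "present" transcripts with a detected protein, per bin.

    probe_sets: iterable of (log2_signal, detection_p, protein_detected).
    Probe sets with detection_p >= p_cutoff are called "absent" and skipped.
    Bins are integer log2 signal levels (floor of the signal).
    """
    present = defaultdict(int)
    detected = defaultdict(int)
    for log2_signal, detection_p, protein_detected in probe_sets:
        if detection_p >= p_cutoff:
            continue  # "absent" call
        signal_bin = math.floor(log2_signal)
        present[signal_bin] += 1
        if protein_detected:
            detected[signal_bin] += 1
    return {b: detected[b] / present[b] for b in sorted(present)}

# Hypothetical probe sets: (log2 signal, detection p-value, protein detected?)
example = [
    (6.4, 0.20, False),    # absent, ignored
    (7.2, 0.01, False),
    (7.8, 0.03, False),
    (10.1, 0.001, True),
    (10.9, 0.002, False),
    (13.2, 0.0001, True),
    (13.5, 0.0001, True),
]
fractions = detected_fraction_by_bin(example)
print(fractions)  # {7: 0.0, 10: 0.5, 13: 1.0}
```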
GO analysis
GO Cellular Component
n = 221 specific probe sets with average signal in leaves > 13
Compared: ATH1 array (control) vs. proteins not detected but transcripts have high abundance (> 13)
Categories: cell wall, chloroplast, cytosol, ER, extracellular, Golgi apparatus, mitochondria, nucleus, other cellular components, other cytoplasmic components, other intracellular components, other membranes, plasma membrane, plastid, ribosome
GO analysis
GO Biological Process
n = 221 specific probe sets with average signal in leaves > 13
Compared: ATH1 array (control) vs. proteins not detected but transcripts have high abundance (> 13)
Categories: cell organization and biogenesis, developmental processes, DNA or RNA metabolism, electron transport or energy pathways, other biological processes, other cellular processes, other metabolic processes, protein metabolism, response to abiotic or biotic stimulus, response to stress, signal transduction, transcription, transport
GO analysis
GO Molecular Function
n = 221 specific probe sets with average signal in leaves > 13
Compared: ATH1 array (control) vs. proteins not detected but transcripts have high abundance (> 13)
Categories: DNA or RNA binding, hydrolase activity, kinase activity, nucleic acid binding, nucleotide binding, other binding, other enzyme activity, other molecular functions, protein binding, receptor binding or activity, structural molecule activity, transcription factor activity, transferase activity, transporter activity
Data integration – pathway analysis
[Figure axes: Protein abundance, Transcript abundance]
Carotenoid biosynthesis
Phenylpropanoid metabolism
Chlorophyll / Porphyrin metabolism
Riboflavin metabolism
Mevalonate biosynthesis
Relative protein-to-transcript ratio
Calvin cycle
Fatty acid biosynthesis
serine, glycine, cysteine
starch and sucrose metabolism
Relative protein-to-transcript ratio
Chlorophyll / Porphyrin metabolism
Fatty acid biosynthesis
Glycolysis / Gluconeogenesis
Purine metabolism
Pyrimidine metabolism
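The slides do not define the "relative protein-to-transcript ratio". As a hedged illustration only, the sketch below assumes both measures are on a log2 scale, takes the per-gene difference (log2 protein minus log2 transcript), and reports each pathway's median difference relative to the median over all genes. A positive value then means the pathway has more protein than its transcript level would suggest. The function name and all numbers are hypothetical.

```python
from statistics import median

def relative_ratio_by_pathway(genes):
    """Pathway-level protein-to-transcript log ratio, relative to all genes.

    genes: list of (pathway, log2_protein, log2_transcript).
    Returns dict pathway -> median log ratio minus the overall median.
    """
    log_ratios = [(pathway, prot - trans) for pathway, prot, trans in genes]
    overall = median(r for _, r in log_ratios)
    by_pathway = {}
    for pathway, ratio in log_ratios:
        by_pathway.setdefault(pathway, []).append(ratio)
    return {p: median(rs) - overall for p, rs in by_pathway.items()}

# Hypothetical measurements (log2 scale), not data from the presentation.
example = [
    ("Calvin cycle", 14.0, 12.0),
    ("Calvin cycle", 13.0, 12.0),
    ("Purine metabolism", 8.0, 10.0),
    ("Purine metabolism", 9.0, 10.0),
]
ratios = relative_ratio_by_pathway(example)
print(ratios)  # {'Calvin cycle': 1.5, 'Purine metabolism': -1.5}
```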
Proteomic and transcriptomic biomarkers
"Root-specific" expression
Search by scoring the proteomic dataset
Search by scoring the Genevestigator dataset
Proteomic and transcriptomic biomarkers
Search by scoring the proteomic dataset
Search by scoring the Genevestigator dataset
Summary and conclusions
Biological networks: importance of the biological context
Meta-profiles: context-driven analysis
Biological validation of meta-profiles and clusters
Genevestigator – a tool for biologists!
Data integration: challenging biological complexity
Experimental context?
Organism?
Data type?
Modes of interactions?
Network dynamics?
Reproducibility?
Acknowledgements
ETH Zurich
Prof. Gruissem
Developer Team: Tomas Hruz, Oliver Laule, Stefan Bleuler, Philip Zimmermann
Gabor Szabo, Frans Wessendorp, Lukas Oertle, Dominique Dümmler, Matthias Hirsch-Hoffmann
Thanks for your attention!