How to deal with the microarray results…
Britt Gabrielsson PhD
RCEM, Div of metabolism and cardiovascular research
Department of Medicine
The Sahlgrenska Academy at Göteborg University
…and then we will perform microarray analysis.
Project design
DNA microarray technology
Data analysis
Results/verification
Biological/functional relevance
Follow-up studies
Publications - present to future trends
PubMed: “DNA array” + “DNA microarray” + “oligonucleotide array” + “oligonucleotide microarray”
Publications - present to future trends
PubMed: previous search terms AND “cancer”, “clinical” or “yeast”
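The two searches can be written out as explicit query strings. The sketch below only builds the strings. It assumes that the "+" on the slide means a logical OR between the four array terms, and the helper name `build_term` is hypothetical.

```python
# Assumption: "+" on the slide combines the four array terms with OR.
ARRAY_TERMS = [
    "DNA array",
    "DNA microarray",
    "oligonucleotide array",
    "oligonucleotide microarray",
]


def build_term(extra=None):
    """Return a PubMed-style query string, optionally narrowed by one topic."""
    base = " OR ".join(f'"{term}"' for term in ARRAY_TERMS)
    query = f"({base})"
    if extra:
        # Narrow the array search to one topic, e.g. "cancer".
        query += f' AND "{extra}"'
    return query


if __name__ == "__main__":
    for topic in (None, "cancer", "clinical", "yeast"):
        print(build_term(topic))
```

Each printed line can be pasted into the PubMed search box; restricting by publication year is left to the search interface.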
Common microarray work flow
[Figure: workflow diagram. Stages shown: Planning, Pilot, Data analysis, Verification, Revised planning, Extended study, Data analysis, Story selection, Follow-up studies]
Advantages of a pilot study
Estimate experimental variability
Refine the experimental design
Optimize selection of time points or doses
Triplicate biological replicates per experimental group in the pilot
Possibility to add on to extend the study
Provides preliminary data for project funding
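As a minimal sketch of how triplicate biological replicates give an estimate of experimental variability, the snippet below computes a per-gene mean and standard deviation. The gene names and log2 values are made up for illustration, and `summarize` is a hypothetical helper, not part of any microarray package.

```python
from statistics import mean, stdev

# Hypothetical log2 expression values, three biological replicates per gene.
pilot = {
    "gene_A": [8.1, 8.3, 7.9],
    "gene_B": [5.0, 6.5, 4.1],
}


def summarize(replicates):
    """Return {gene: (mean, sample standard deviation)} across replicates."""
    return {gene: (mean(vals), stdev(vals)) for gene, vals in replicates.items()}


def most_variable(replicates):
    """Name of the gene with the largest replicate standard deviation."""
    stats = summarize(replicates)
    return max(stats, key=lambda gene: stats[gene][1])


if __name__ == "__main__":
    for gene, (avg, sd) in summarize(pilot).items():
        print(f"{gene}: mean={avg:.2f} sd={sd:.2f}")
    print("most variable:", most_variable(pilot))
```

Genes with a large replicate standard deviation need a larger group size, or a larger fold change, before a difference between groups can be trusted.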
From list of genes to biological context
Once again: if you have favourite genes which are your main interest, use an alternative method.
There will only be a few genes that you already know the function of.
“If you don’t get the genes you love, love the genes you get!”
The list of genes ⇒ mental vertigo
Reduced list of genes ⇒ manageable
The reductionist’s point of view
From the list of regulated genes to future projects
- identification of genes with a commonality e.g. pathway, biological process, chromosomal region, upstream regulation
- verification and extended data mining
Aim: to define future possible lines of investigation
1st screen to get an overview of the data
– Add own keywords to auto-annotations (e.g. apoptosis, lipid metabolism)
– Use of “overview” databases
2nd screen for more detailed information
– Use of more specialized databases
From list of genes to biological context
First a word about Gene Ontology
The Gene Ontology Consortium (www.geneontology.org) aims to describe the function of gene products in a cell and provides a controlled vocabulary for these attributes.
The three organizing principles of GO are molecular function, biological process and cellular component. A gene product has one or more molecular functions and is used in one or more biological processes; it might be associated with one or more cellular components.
Example; insulin-like growth factor 1 (IGF1)
Expressed in most tissues with liver as the main production site
Is secreted and has both paracrine and endocrine actions
Detected in circulation bound to one of many IGF-binding proteins
The main regulators are growth hormone and insulin
Acts via the IGF1-receptor dimer or a hybrid composed of IGF1- and insulin receptor monomers
Has growth-promoting and metabolic effects
Example; IGF1 (insulin-like growth factor 1) defined by GO classification
Biological process; 1501 // skeletal development // traceable author statement /// 6260 // DNA replication // traceable author statement /// 6928 // cell motility // traceable author statement /// 7165 // signal transduction // traceable author statement /// 7265 // Ras prot
Cellular component; 5615 // extracellular space // inferred from electronic annotation
Molecular function; 5159 // insulin-like growth factor receptor binding // traceable author statement /// 5179 // hormone activity // traceable author statement /// 8083 // growth factor activity // inferred from electronic annotation /// 18445 // prothoracicotrophic hormone ---
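Annotation fields in this layout ("id // term // evidence", with entries separated by "///") are easier to filter once they are split into records. The following is a minimal sketch: `parse_go_field` is a hypothetical helper, and it simply skips entries that are truncated or lack one of the three parts, as happens at the end of the fields shown on the slide.

```python
def parse_go_field(field):
    """Split 'id // term // evidence /// id // term // evidence' into tuples."""
    records = []
    for entry in field.split("///"):
        parts = [part.strip() for part in entry.split("//")]
        # Keep only complete entries that start with a numeric GO identifier.
        if len(parts) == 3 and parts[0].isdigit():
            go_id = f"GO:{int(parts[0]):07d}"  # GO identifiers are zero-padded to 7 digits
            records.append((go_id, parts[1], parts[2]))
    return records


if __name__ == "__main__":
    molecular_function = (
        "5159 // insulin-like growth factor receptor binding // "
        "traceable author statement /// 5179 // hormone activity // "
        "traceable author statement"
    )
    for go_id, term, evidence in parse_go_field(molecular_function):
        print(go_id, term, evidence, sep="\t")
```

Keeping the evidence code in each record makes it possible to separate curated annotations (traceable author statement) from those inferred from electronic annotation.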
1st screen databases
AIM: to get an overview of the main functions of the gene products and, if possible, to add your own keywords
NCBI: Entrez Gene (OMIM)
SwissProt
GeneCard
NCBI
The importance of knowing the different aliases
Example: AdipoQ = Adiponectin = APM1 = Acrp30 = Acdc
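One way to keep aliases from splitting a single gene into several entries is to map every name to one key before searching or merging lists. The sketch below uses only the aliases from the example; treating ADIPOQ as the preferred key is a choice made for illustration, and `canonical` is a hypothetical helper.

```python
# Aliases taken from the slide; the choice of "ADIPOQ" as the key is an assumption.
ALIASES = {
    "ADIPOQ": ["Adiponectin", "APM1", "Acrp30", "Acdc"],
}


def build_lookup(aliases):
    """Map every alias (case-insensitive) to its preferred symbol."""
    lookup = {}
    for symbol, names in aliases.items():
        for name in [symbol, *names]:
            lookup[name.upper()] = symbol
    return lookup


LOOKUP = build_lookup(ALIASES)


def canonical(name):
    """Return the preferred symbol, or the input unchanged if it is unknown."""
    return LOOKUP.get(name.strip().upper(), name)


if __name__ == "__main__":
    for name in ("Acrp30", "apm1", "IGF1"):
        print(name, "->", canonical(name))
```

In practice the alias table would be filled from a database export rather than typed by hand.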
GeneCard
ExPASy/Swiss-Prot
For non-reductionists - other resources
There are several web-based resources that can be used to group regulated genes according to GO classifications, i.e. biological process, molecular function and cellular component.
The analysis tests whether the observed number of genes in a GO category differs from the number expected by chance.
Examples of such web-sites are FatiGO and GOTreeMachine.
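The observed-versus-expected comparison can be illustrated with a hypergeometric calculation. This is a sketch under stated assumptions, not necessarily the statistic that FatiGO or GOTM apply: it tests over-representation only, does no multiple-testing correction, and the function names and counts are hypothetical.

```python
from math import comb


def expected_count(n_regulated, n_in_term, n_total):
    """Genes expected in the GO term if regulated genes were drawn at random."""
    return n_regulated * n_in_term / n_total


def enrichment_p(k_observed, n_regulated, n_in_term, n_total):
    """Upper-tail hypergeometric probability P(X >= k_observed)."""
    upper = min(n_regulated, n_in_term)
    total = comb(n_total, n_regulated)
    tail = sum(
        comb(n_in_term, i) * comb(n_total - n_in_term, n_regulated - i)
        for i in range(k_observed, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 100 genes on the array, 10 annotated to the term,
    # 20 regulated genes, 5 of which carry the annotation.
    print("expected:", expected_count(20, 10, 100))
    print("p-value:", enrichment_p(5, 20, 10, 100))
```

The reference set matters: the total should be the genes represented on the array, not the whole genome, otherwise the expected number is distorted.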
GO Tree Machine (GOTM)
Selecting for possible lines of research
• What was the question again?
  – Identifying putative susceptibility genes
    • Are there linkage studies to this chromosomal region?
    • Does the ethical licence allow genomic studies?
  – Identifying biomarkers for diagnostic purposes
    • Limited to secreted proteins
    • Are there assays available?
  – Identifying disease mechanisms
    • How to differentiate pathology from mechanism leading to disease?
External influences such as funding or competition in the field
The pragmatic view on selecting putative projects
General knowledge of the gene/genes
Novelty - knowledge of the gene in your field
What types of follow-up experiments can be performed (techniques in the lab, collaborators, sample availability)
Time/cost/resources
2nd screen databases
Aim: To get more detailed knowledge of the gene products of the reduced list
NCBI: PubMed, OMIM, Gene Expression Omnibus
GeneCard
Applied Biosystems: Panther
GNF SymAtlas (Tissue distribution)
Nucleic Acids Research publishes an updated database issue in January each year.
NCBI PubMed and OMIM
NCBI OMIM
GENE FUNCTION
By RNase protection and Western blot analysis, Schaffler et al. (1999) showed that APM1 is expressed by differentiated adipocytes as a 33-kD protein that is also detectable in serum…
MOLECULAR GENETICS
…In 253 nondiabetic Italian subjects, Filippi et al. (2004) found that the 276G-T SNP of the adiponectin gene was associated with higher body mass index (BMI) (p less than 0.01), plasma insulin (p less than 0.02), and homeostasis model assessment-estimated insulin resistance (HOMA-IR) (p less than 0.02)…
ANIMAL MODEL
Maeda et al. (2002) generated mice deficient in adiponectin/ACRP30 by targeted disruption. Homozygous mutant mice showed delayed clearance of free fatty acid in plasma, low levels of fatty acid transport protein-1 (FATP1; 600691) mRNA in muscle, high levels of TNF-alpha (191160) mRNA in adipose tissue, and high plasma TNF-alpha concentrations…
GeneCard - a gateway with several links to other sites
Chromosomal location (HGNC and/or Entrez Gene, NCBI; genomic views according to UCSC and Ensembl)
Protein info (UniProt/Swiss-Prot, Ensembl)
Phenotype (Jackson lab - Mouse Genome Informatics)
Ontologies/Pathways (Gene Ontology and KEGG)
Transcripts (NCBI, link to Applied Biosystems for assays)
Tissue distribution (Affymetrix-based, SAGE)
Applied Biosystems (www.pantherdb.org/)
Tissue distribution - GNF SymAtlas
Genomics Institute of the Novartis Foundation (http://symatlas.gnf.org/SymAtlas/)
Expression databases
Large-scale analysis of gene expression has led to a proliferation of databases for storing the vast quantities of expression data. Most are web-based and comply with the MIAME* standard and the Gene Ontology Consortium vocabulary. A few examples:
• Gene Expression Omnibus (GEO, NCBI)
• ArrayExpress (EMBL)
• Stanford Microarray Database (non-public)
• Expression Array Manager
*Minimum Information About a Microarray Experiment
Gene expression omnibus
Data download
Cluster analysis
Verification of main findings;
At transcript level (real-time PCR or Northern blot)
At protein level (immunohistochemistry, Western blot, ELISA/RIA, FACS)
Staining and characterizations of cells (tissues)
Cell culture/animal studies
RNAi/transgenic experiments
Follow-up studies
Summary
The importance of asking a precise question, i.e. the project design limits the interpretation of the output data
Initial reduction of data to identify possible future lines of research. Use overview databases to annotate the regulated genes
In view of the present knowledge in your field and possible follow-up studies, select a few putative lines of research. Go back to further data mining and more detailed bioinformatics
Final words about working in multidisciplinary collaborations
• Who has the overall responsibility?
• Who performs specific parts of the project?
• How to report forward and to whom?
• How to report backward and to whom?
• Understanding each other/communication
• Decision making
• How do we publish?
Present to future trends
• Diagnostics (cancer, identification of biomarkers)
• Functional studies, where the microarray data constitutes a minor part of the article
• Cross-species comparisons and translational research. Shared transcriptional profiles between species to identify conserved pathways and mechanisms (longevity).
Web-sites
Information;
NCBI (incl Gene, OMIM, PubMed) http://www.ncbi.nlm.nih.gov/
ExPASy http://www.expasy.org/
GeneCards http://www.genecards.org/
TIGR http://www.tigr.org/
Gene Ontology http://www.geneontology.org/
Panther/Applied Biosystems http://www.pantherdb.org/
Affymetrix http://www.affymetrix.com/index.affx
GNF SymAtlas http://symatlas.gnf.org/SymAtlas/
Nucleic Acids Research db 2006 http://nar.oxfordjournals.org/content/vol34/suppl_1/index.dtl
Data mining of gene expression data;
GO Tree Machine http://genereg.ornl.gov/gotm/
FatiGO http://www.fatigo.org
Databases for expression data;
GEO (NCBI) http://www.ncbi.nlm.nih.gov/geo
Stanford Microarray Database (SMD) http://genome-www5.stanford.edu/MicroArray/SMD/
ArrayExpress http://www.ebi.ac.uk/arrayexpress/index.html
Expression Array Manager http://expression.microslu.washington.edu/expression/
NB: a number of microarray expression data sets are linked to the publications