How to deal with the microarray results…

Britt Gabrielsson, PhD
RCEM, Div. of Metabolism and Cardiovascular Research
Department of Medicine
The Sahlgrenska Academy at Göteborg University

Source: bio.lundberg.gu.se/courses/vt06/MicroarrayRes_BG.pdf


…and then we will perform microarray analysis.

Project design

DNA microarray technology

Data analysis

Results/verification

Biological/functional relevance

Follow-up studies


Publications - present to future trends

PubMed search: “DNA array” + “DNA microarray” + “oligonucleotide array” + “oligonucleotide microarray”


Publications - present to future trends

PubMed search: previous search terms AND “cancer”, “clinical” or “yeast”
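The two PubMed searches can be reproduced programmatically. The sketch below only builds the query strings and the request URLs for NCBI's E-utilities ESearch service; it does not contact the server. It assumes that the "+" on the slide means a logical OR between the phrases. The function names and the example year are illustrative and not part of the original slides.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

ARRAY_PHRASES = [
    "DNA array",
    "DNA microarray",
    "oligonucleotide array",
    "oligonucleotide microarray",
]

def build_term(phrases, extra=None):
    """Join quoted phrases with OR (assumed meaning of '+' on the slide).

    If `extra` is given, the combined phrases are ANDed with that keyword.
    """
    term = " OR ".join(f'"{p}"' for p in phrases)
    if extra:
        term = f'({term}) AND "{extra}"'
    return term

def count_url(term, year):
    """URL asking ESearch for the number of PubMed records published in `year`."""
    params = {
        "db": "pubmed",
        "term": term,
        "rettype": "count",   # return only the hit count
        "datetype": "pdat",   # filter on publication date
        "mindate": str(year),
        "maxdate": str(year),
    }
    return f"{ESEARCH}?{urlencode(params)}"

if __name__ == "__main__":
    print(count_url(build_term(ARRAY_PHRASES), 2005))
    for keyword in ("cancer", "clinical", "yeast"):
        print(count_url(build_term(ARRAY_PHRASES, keyword), 2005))
```

Fetching each URL for a range of years and plotting the counts would give the kind of trend curve the slides refer to.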


Common microarray workflow

[Workflow diagram with the boxes: Planning, Pilot, Data analysis, Revised planning, Extended study, Data analysis, Verification, Story selection, Follow-up studies]


Advantages of a pilot study

Estimate experimental variability

Refine the experimental design

Optimize selection of time points or doses

Triplicate biological replicates per experimental group in the pilot

Possibility to add on to extend the study

Provides preliminary data for project funding
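As an illustration of how pilot triplicates can feed into the revised design, the sketch below estimates per-gene variability and turns a typical standard deviation into a rough number of replicates per group. The expression values are made up, and the sample-size formula is the standard normal approximation for comparing two group means; neither is taken from the slides.

```python
import math
from statistics import NormalDist, median, stdev

# Hypothetical pilot data: log2 expression per gene, three biological replicates
pilot = {
    "gene_a": [8.1, 8.4, 7.9],
    "gene_b": [5.0, 6.2, 5.5],
    "gene_c": [10.3, 10.1, 10.4],
}

def per_gene_sd(data):
    """Sample standard deviation across replicates for each gene."""
    return {gene: stdev(values) for gene, values in data.items()}

def replicates_needed(sd, log2_difference, alpha=0.05, power=0.8):
    """Replicates per group from the normal-approximation formula
    n = 2 * ((z_(1 - alpha/2) + z_power) * sd / difference) ** 2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / log2_difference) ** 2)

if __name__ == "__main__":
    typical_sd = median(per_gene_sd(pilot).values())
    print(f"median per-gene SD (log2): {typical_sd:.2f}")
    # A log2 difference of 1.0 corresponds to a 2-fold change
    print("replicates per group:", replicates_needed(typical_sd, 1.0))
```

The approximation ignores multiple testing across thousands of genes, so it should be read as a lower bound rather than a design recommendation.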


From list of genes to biological context

Once again: if your main interest is a few favourite genes, use an alternative method.

There will only be a few genes whose function you already know.

“If you don’t get the genes you love, love the genes you get!”


The list of genes ⇒ mental vertigo


Reduced list of genes ⇒ manageable


The reductionist’s point of view

From the list of regulated genes to future projects

- identification of genes with a commonality, e.g. pathway, biological process, chromosomal region, upstream regulation

- verification and extended data mining


From list of genes to biological context

Aim: to define possible future lines of investigation

1st screen: to get an overview of the data

– Add own keywords to auto-annotations (e.g. apoptosis, lipid metabolism)

– Use of “overview” databases

2nd screen: more detailed information

– Use of more specialized databases
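Adding own keywords to the auto-annotations can be as simple as matching search patterns against each gene description. The sketch below is a minimal illustration; the keyword table and its patterns are hypothetical and would be chosen to fit the research question.

```python
# Hypothetical keyword table: own keyword -> text patterns that trigger it
KEYWORDS = {
    "apoptosis": ["apoptosis", "programmed cell death", "caspase"],
    "lipid metabolism": ["lipid", "fatty acid", "cholesterol"],
}

def tag(description, keywords=KEYWORDS):
    """Return the sorted list of own keywords whose patterns occur in the description."""
    text = description.lower()
    return sorted(
        keyword
        for keyword, patterns in keywords.items()
        if any(pattern in text for pattern in patterns)
    )

if __name__ == "__main__":
    for description in (
        "fatty acid transport protein 1",
        "caspase 3, apoptosis-related cysteine peptidase",
        "insulin-like growth factor 1",
    ):
        print(description, "->", tag(description))
```

Genes that receive no keyword are the ones that most need a manual look in the overview databases.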


First a word about Gene Ontology

The Gene Ontology Consortium (www.geneontology.org) aims to describe gene product function in a cell and provides a controlled vocabulary to describe these attributes.

The three organizing principles of GO are molecular function, biological process and cellular component. A gene product has one or more molecular functions and is used in one or more biological processes; it might be associated with one or more cellular components.


Example: insulin-like growth factor 1 (IGF1)

Expressed in most tissues, with liver as the main production site

Is secreted and has both paracrine and endocrine actions

Detected in circulation bound to one of many IGF-binding proteins

The main regulators are growth hormone and insulin

Acts via the IGF1-receptor dimer or a hybrid composed of IGF1- and insulin-receptor monomers

Has growth-promoting and metabolic effects


Example: IGF1 (insulin-like growth factor 1) defined by GO classification

Biological process: 1501 // skeletal development // traceable author statement /// 6260 // DNA replication // traceable author statement /// 6928 // cell motility // traceable author statement /// 7165 // signal transduction // traceable author statement /// 7265 // Ras prot

Cellular component: 5615 // extracellular space // inferred from electronic annotation

Molecular function: 5159 // insulin-like growth factor receptor binding // traceable author statement /// 5179 // hormone activity // traceable author statement /// 8083 // growth factor activity // inferred from electronic annotation /// 18445 // prothoracicotrophic hormone ---
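Annotation strings in this layout separate GO terms with "///" and the fields of each term (numeric ID, name, evidence) with "//". The sketch below splits such a string into structured records. It assumes exactly three fields per term and silently skips entries that are truncated or empty, such as the cut-off "Ras prot" entry; the function and class names are illustrative.

```python
from typing import NamedTuple

class GoTerm(NamedTuple):
    go_id: str      # formatted as GO: plus seven digits
    name: str
    evidence: str

def parse_go_field(field):
    """Split an 'id // name // evidence /// ...' string into GoTerm records."""
    terms = []
    for entry in field.split("///"):
        parts = [part.strip() for part in entry.split("//")]
        if len(parts) != 3 or not parts[0].isdigit():
            continue  # truncated or empty entry, e.g. "---"
        terms.append(GoTerm(f"GO:{int(parts[0]):07d}", parts[1], parts[2]))
    return terms

if __name__ == "__main__":
    biological_process = (
        "1501 // skeletal development // traceable author statement /// "
        "6260 // DNA replication // traceable author statement /// "
        "7265 // Ras prot"
    )
    for term in parse_go_field(biological_process):
        print(term.go_id, term.name, f"[{term.evidence}]")
```

Keeping the evidence code makes it easy to filter out terms that are only inferred from electronic annotation.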


1st screen databases

Aim: to get an overview of the main functions of the gene products and, if possible, to add your own keywords

NCBI: Entrez Gene (OMIM)

Swiss-Prot

GeneCards


NCBI


The importance of knowing the different aliases

Example: AdipoQ = Adiponectin = APM1 = Acrp30 = Acdc
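When gene lists from different sources are merged, aliases like these should be mapped to a single symbol first. A minimal sketch, using only the aliases from the example; in practice the table would be filled from a gene database rather than typed by hand, and the function names are illustrative.

```python
# Alias table taken from the adiponectin example; one official symbol per entry
ALIASES = {
    "ADIPOQ": ["AdipoQ", "Adiponectin", "APM1", "Acrp30", "Acdc"],
}

def build_lookup(aliases):
    """Map every alias (case-insensitive) to its official symbol."""
    lookup = {}
    for symbol, names in aliases.items():
        lookup[symbol.lower()] = symbol
        for name in names:
            lookup[name.lower()] = symbol
    return lookup

def canonical(name, lookup):
    """Return the official symbol for a name, or None if the name is unknown."""
    return lookup.get(name.strip().lower())

if __name__ == "__main__":
    lookup = build_lookup(ALIASES)
    for name in ("Acrp30", "apm1", "IGF1"):
        print(name, "->", canonical(name, lookup))
```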


GeneCards


ExPASy/Swiss-Prot


For non-reductionists - other resources

There are several web-based resources that can be used to group regulated genes according to GO classifications, i.e. biological process, molecular function and cellular component.

The analysis tests whether the observed number of genes in a GO category differs from the expected number of genes.

Examples of such web sites are FatiGO and GO Tree Machine.
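The observed-versus-expected comparison can be illustrated with a one-sided hypergeometric test, which is one common choice for this kind of over-representation analysis. Whether FatiGO or GO Tree Machine compute exactly this statistic is not stated on the slide, and the numbers in the usage example are made up.

```python
from math import comb

def expected_count(total_genes, genes_in_category, list_size):
    """Number of category genes expected in the list by chance."""
    return list_size * genes_in_category / total_genes

def enrichment_p(total_genes, genes_in_category, list_size, observed):
    """P(X >= observed) for X ~ Hypergeometric(total_genes, genes_in_category, list_size)."""
    upper = min(genes_in_category, list_size)
    favourable = sum(
        comb(genes_in_category, i) * comb(total_genes - genes_in_category, list_size - i)
        for i in range(observed, upper + 1)
    )
    return favourable / comb(total_genes, list_size)

if __name__ == "__main__":
    # Hypothetical numbers: 10,000 genes on the array, 200 annotated to one GO process,
    # 150 regulated genes, 12 of which belong to that process
    print("expected:", expected_count(10_000, 200, 150))
    print("p-value :", enrichment_p(10_000, 200, 150, 12))
```

Because many GO categories are tested at once, the resulting p-values need a multiple-testing correction before they are interpreted.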


GO Tree Machine (GOTM)


Selecting for possible lines of research

• What was the question again?
  – Identifying putative susceptibility genes
    • Are there linkage studies to this chromosomal region?
    • Does the ethical licence allow genomic studies?
  – Identifying biomarkers for diagnostic purposes
    • Limited to secreted proteins
    • Are there assays available?
  – Identifying disease mechanisms
    • How to differentiate pathology from the mechanism leading to disease?

• External influences such as funding or competition in the field


The pragmatic view on selecting putative projects

General knowledge of the gene/genes

Novelty - knowledge of the gene in your field

What types of follow-up experiments can be performed (techniques in the lab, collaborators, sample availability)

Time/cost/resources


2nd screen databases

Aim: to get more detailed knowledge of the gene products on the reduced list

NCBI: PubMed, OMIM, Gene Expression Omnibus

GeneCards

Applied Biosystems: Panther

GNF SymAtlas (tissue distribution)

Nucleic Acids Research publishes an updated database issue in January each year.


NCBI PubMed and OMIM


NCBI OMIM

GENE FUNCTION

By RNase protection and Western blot analysis, Schaffler et al. (1999) showed that APM1 is expressed by differentiated adipocytes as a 33-kD protein that is also detectable in serum…

MOLECULAR GENETICS

…In 253 nondiabetic Italian subjects, Filippi et al. (2004) found that the 276G-T SNP of the adiponectin gene was associated with higher body mass index (BMI) (p less than 0.01), plasma insulin (p less than 0.02), and homeostasis model assessment-estimated insulin resistance (HOMA-IR) (p less than 0.02)…

ANIMAL MODEL

Maeda et al. (2002) generated mice deficient in adiponectin/ACRP30 by targeted disruption. Homozygous mutant mice showed delayed clearance of free fatty acid in plasma, low levels of fatty acid transport protein-1 (FATP1; 600691) mRNA in muscle, high levels of TNF-alpha (191160) mRNA in adipose tissue, and high plasma TNF-alpha concentrations…


GeneCards - a gateway with several links to other sites

Chromosomal location (HGNC and/or Entrez Gene NCBI; genomic views according to UCSC and Ensembl)

Protein info (UniProt/Swiss-Prot, Ensembl)

Phenotype (Jackson lab - Mouse Genome Informatics)

Ontologies/Pathways (Gene Ontology and KEGG)

Transcripts (NCBI, link to Applied Biosystems for assays)

Tissue distribution (Affymetrix-based, SAGE)


Applied Biosystems (www.pantherdb.org/)


Tissue distribution - GNF SymAtlas

Genomics Institute of the Novartis Research Foundation (http://symatlas.gnf.org/SymAtlas/)


Expression databases

Large-scale analysis of gene expression has led to a proliferation of databases for storing the vast quantities of expression data. Most are Web-based and comply with MIAME* and the Gene Ontology Consortium vocabulary. A few examples:

• Gene Expression Omnibus (GEO, NCBI)
• ArrayExpress (EMBL)
• Stanford Microarray Database (non-public)
• Expression Array Manager

*Minimum Information About a Microarray Experiment


Gene Expression Omnibus


Data download

Cluster analysis


Follow-up studies

Verification of main findings:

At transcript level (real-time PCR or Northern blot)

At protein level (immunohistochemistry, Western blot, ELISA/RIA, FACS)

Staining and characterization of cells (tissues)

Cell culture/animal studies

RNAi/transgenic experiments
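For verification at transcript level, real-time PCR results are often expressed as a fold change and compared with the direction of the array result. The sketch below uses the widely used 2^-ΔΔCt calculation, which the slides do not name; it assumes roughly 100% amplification efficiency and a stable reference gene. All Ct values and function names are hypothetical.

```python
import math

def fold_change_ddct(ct_target_treated, ct_ref_treated, ct_target_control, ct_ref_control):
    """Relative expression (treated vs control) by the 2^-ddCt calculation."""
    d_ct_treated = ct_target_treated - ct_ref_treated   # normalise to reference gene
    d_ct_control = ct_target_control - ct_ref_control
    return 2 ** -(d_ct_treated - d_ct_control)

def same_direction(array_log2_ratio, pcr_fold_change):
    """True if the array and the PCR agree on up- versus down-regulation."""
    return (array_log2_ratio > 0) == (math.log2(pcr_fold_change) > 0)

if __name__ == "__main__":
    # Hypothetical Ct values for one gene and one reference gene
    fold = fold_change_ddct(22.0, 18.0, 24.0, 18.0)
    print("PCR fold change:", fold)
    print("agrees with array log2 ratio of 1.5:", same_direction(1.5, fold))
```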


Summary

The importance of asking a precise question, i.e. the project design limits the interpretation of the output data

Initial reduction of data to identify possible future lines of research. Use overview databases to annotate the regulated genes

In view of the present knowledge in your field and possible follow-up studies, select a few putative lines of research. Go back to further data mining and more detailed bioinformatics


Final words about working in multidisciplinary collaborations

• Who has the overall responsibility?
• Who performs specific parts of the project?
• How to report forward and to whom?
• How to report backward and to whom?
• Understanding each other/communication
• Decision making
• How do we publish?


Present to future trends

• Diagnostics (cancer, identification of biomarkers)

• Functional studies, where the microarray data constitute a minor part of the article

• Cross-species comparisons and translational research. Shared transcriptional profiles between species to identify conserved pathways and mechanisms (longevity).


Web-sites

Information:
NCBI (incl. Gene, OMIM, PubMed): http://www.ncbi.nlm.nih.gov/
ExPASy: http://www.expasy.org/
GeneCards: http://www.genecards.org/
TIGR: http://www.tigr.org/
Gene Ontology: http://www.geneontology.org/
Panther/Applied Biosystems: http://www.pantherdb.org/
Affymetrix: http://www.affymetrix.com/index.affx
GNF SymAtlas: http://symatlas.gnf.org/SymAtlas/
Nucleic Acids Research database issue 2006: http://nar.oxfordjournals.org/content/vol34/suppl_1/index.dtl

Data mining of gene expression data:
GO Tree Machine: http://genereg.ornl.gov/gotm/
FatiGO: http://www.fatigo.org

Databases for expression data:
GEO (NCBI): http://www.ncbi.nlm.nih.gov/geo
Stanford Microarray Database (SMD): http://genome-www5.stanford.edu/MicroArray/SMD/
ArrayExpress: http://www.ebi.ac.uk/arrayexpress/index.html
Expression Array Manager: http://expression.microslu.washington.edu/expression/

NB: a number of microarray expression data sets are linked to the publications.