Genomics - DNA Microarrays

Prof. Ulises Urzúa ICBM, Facultad de Medicina,

Universidad de Chile

uurzua@med.uchile.cl

DNA microarrays in genomics and cancer

ToxGen class, Nov 2008

(1982 - 2002)

Updated 31 Dec 2008

One gene…or many genes?

Environment and life-style are major contributors to the pathogenesis of complex diseases

Legal Issues in Genomic Medicine

"We won't be able to offer you a position with our company. The results of our genetic tests suggest that you have a predisposition to attention deficit disorder. Mr. Jones? Mr. Jones?"

Personalized medicine

http://www.jyi.org/features/ft.php?id=1047

Major microarray applications

Approach: Application
• Transcriptional analysis: drug development, drug response, therapy development
• Mutation & polymorphism analysis: drug development, therapy development, disease progression
• Microarray CGH: tumor classification, risk assessment, prognosis prediction

• Microarray: ordered arrangement of known DNA sequences on a solid, planar substrate, which allows hybridization (binding) of labeled sample RNAs or DNAs.

• A single microarray contains from a few hundred to 400,000 microscopic elements of uniform size and spacing.

• Immobilized DNAs are oligonucleotides (20-80 mer), cDNAs (0.5-5 Kb) or BAC clones (10-50 Kb). Substrates are rigid, thermostable, optically flat surfaces like nylon, glass or silica.

• DNAs are spotted onto chemically modified substrates and then immobilized using UV. Oligonucleotides can be either spotted or synthesized in situ.

Affymetrix GeneChip® microarrays

[Figure: GeneChip probe array (1.28 cm) and image of a hybridized probe array. Each 20 µm hybridized probe cell holds millions of copies of a specific oligonucleotide probe; the array carries >400,000 different complementary probes, which bind single-stranded, labeled RNA target.]

Suited for both expression profiling and genotyping

Affymetrix GeneChip®

[Figure: oligo arrays. Labels: gene (5´ to 3´), multiple oligo probes, perfect match, mismatch, on, off, 24 µm.]

• The photolithographic technique used in Affymetrix GeneChips™ yields ultra-high-density microarrays (up to 10^6 probes/cm²)

GeneChip workstation

• Glass slide microarrays

• Up to 48,000 spotted "off-line" DNA probes
• Spotted cDNA clones, ESTs or oligos
• Gene expression, CGH, and SNPs

A comparative hybridization experiment

Mouse NIA 15K cDNA microarray, block 15 (of 32 total)

- Cy5: mouse ovarian cell line (total RNA)

- Cy3: reference, whole newborn mouse (total RNA)

Microarrays allow only comparative (relative) measurements

Genes up-regulated in mouse ovarian cells

Genes up-regulated in the reference RNA

Genes equally expressed in both samples
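The three spot classes in the comparative experiment can be illustrated with a minimal sketch. It assumes background-corrected Cy5 (test) and Cy3 (reference) intensities and a two-fold cutoff (|log2 ratio| >= 1); the cutoff and the function name `classify_spot` are illustrative assumptions, not values taken from the slides.

```python
import math

def classify_spot(cy5, cy3, threshold=1.0):
    """Label one spot from its two channel intensities.

    cy5: test-sample intensity, cy3: reference intensity (both > 0).
    threshold: assumed cutoff on the log2 ratio (1.0 = two-fold).
    """
    log_ratio = math.log2(cy5 / cy3)
    if log_ratio >= threshold:
        return "up in test (Cy5)"
    if log_ratio <= -threshold:
        return "up in reference (Cy3)"
    return "similar in both"

if __name__ == "__main__":
    for cy5, cy3 in [(4000, 1000), (1000, 4000), (1200, 1000)]:
        print(cy5, cy3, classify_spot(cy5, cy3))
```

Because only the ratio between channels enters the calculation, the result is relative to the reference sample, which is why the measurements are comparative rather than absolute.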

BioRobotics Arrayer

Plate loader and lid remover

Refrigerated Biobank (holds up to 24 microtiter plates)

Wash baths for cleaning the pins

The four platforms are capable of holding 120 slides

A 32 pin holder with pins loaded

Telechem pins

- Deposits ~0.4 nl per spot
- Each spot ~100 µm in diameter

Total uptake volumes: 0.25, 0.6, 2.5 µl

Contact deposition

Spotting solutions

50% DMSO
Advantages: denatures the DNA; low evaporation rate; interacts well with GAPS coating, thus generating uniform spots.
Disadvantages: strong irritant; tends to form spots of large diameter, sometimes causing them to merge; DNA aggregates when DMSO concentration is above 70%.

3X SSC
Advantages: aqueous solvent; produces spots of small diameter, allowing high printing density.
Disadvantages: does not denature the DNA; evaporates quickly, so a carefully controlled printing environment is required.

150 mM NaPO4, pH 8.5
Similar to 3X SSC in terms of advantages and disadvantages.

Crosslinking of DNA to polylysine coated glass

- GAPS (gamma aminopropyl silane) coating.

Hybridization

Manual hybridization chambers (TELECHEM-Arrayit)

- 20 to 50 µl of hyb cocktail - prone to significant experimental variability.

Automatic hybridization station:

- Over 120 µl of hyb cocktail - less variability between replicates - washing is also automated

Fluorescence scanners

ScanArray Lite (Perkin-Elmer) GenePix 4000B (Axon)

Exercise # 1

- GenePix Pro 3.0 (Axon), local

- mAdb (NCI's Microarray database)

http://nciarray.nci.nih.gov/

Experimental design and variability

Sources of variability:

• Due to attributes or conditions

• Biological variation is intrinsic; it is influenced by genetic and environmental factors, and by whether samples come from populations or individuals

• Technical variation arises during sample extraction (quality), labeling and hybridization

• Due to fluorophore stability during laser scanning and fluorescence detection

Microarray data workflow

Experimental
• Labeling
• Hybridization
• Scanning
• Image analysis

Numerical analysis
• Normalization
• Missing values
• Imputation
• Distribution
• Clustering
• t-test
• ANOVA
• Correlation
• PCA
• Classification

Interpretation
• Pathways
• Gene ontology
• PubMed
• Gene networks
• Validation (?)

Technical (experimental) correction

Print-tip Loess normalization

[Figure: print-tip displays and "MA" plots for arrays #3 and #6, followed by scale adjustment]

M = log2(R) - log2(G), A = (log2(R) + log2(G)) / 2, where R and G are the red and green channel intensities of a spot.
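The M and A definitions can be computed directly. The sketch below also shows a deliberately simplified within-array correction: it centers M on the median of each print tip. This is a stand-in chosen for brevity, not the loess fit named on the slide, and the function names and the `(print_tip_id, red, green)` input layout are illustrative assumptions.

```python
import math
from statistics import median

def ma_values(red, green):
    """Return (M, A) for one spot from its red and green intensities."""
    m = math.log2(red) - math.log2(green)
    a = (math.log2(red) + math.log2(green)) / 2
    return m, a

def median_center_by_tip(spots):
    """Center M within each print tip (simplified; not a loess fit).

    spots: list of (print_tip_id, red, green) tuples.
    Returns a list of (print_tip_id, centered_M, A) in the input order.
    """
    m_by_tip = {}
    for tip, red, green in spots:
        m_by_tip.setdefault(tip, []).append(ma_values(red, green)[0])
    tip_median = {tip: median(ms) for tip, ms in m_by_tip.items()}

    centered = []
    for tip, red, green in spots:
        m, a = ma_values(red, green)
        centered.append((tip, m - tip_median[tip], a))
    return centered

if __name__ == "__main__":
    demo = [(1, 8.0, 1.0), (1, 2.0, 1.0), (1, 4.0, 1.0)]
    for row in median_center_by_tip(demo):
        print(row)
```

A loess fit would instead subtract an intensity-dependent curve M(A) per print tip; median centering removes only a constant offset per tip.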

Microarray data workflow (continued)

Technical (experimental) correction; statistical correction

Differentially expressed genes: the problem of multiple testing

- ANOVA test, 40 arrays, 7 samples

- FWER (family-wise error rate): the probability of at least one type I error (false positive) across all tests.
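Two standard procedures that control the family-wise error rate are Bonferroni and Holm adjustment. The sketch below shows how each one turns raw p-values into adjusted p-values. Which procedure was used for the ANOVA results is not stated on the slide, so this is a generic illustration, and the function names are assumptions.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply each p by the number of tests."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]

def holm(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p is multiplied by n, the next by n - 1, and so on.
        candidate = min(1.0, (n - rank) * p_values[i])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted

if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03]
    print("Bonferroni:", bonferroni(raw))
    print("Holm:      ", holm(raw))
```

With thousands of genes tested per array set, an unadjusted 0.05 cutoff would be expected to pass many genes by chance alone, which is the problem these adjustments address. Holm is never more conservative than Bonferroni.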

Urzúa et al. (2006) J. Cell. Physiol. 206, 594-602

Dataset structure: filtering hierarchy

Raw dataset → Processed and normalized subset → Candidate genes

• Statistical tests, co-expression, correlation, etc. (with multiple-test control)
• Functional groups, pathway analysis
• Interaction gene/group networks
• Text-mining (interpretation)

Case # 1

- An in vitro mouse model of ovarian cancer

Ovarian cancer: risk factors and possible etiology

• There is no reliable screening test for early detection. Over 75% of cases are detected late (5-year survival is below 30%).

• Symptoms are often vague and easily confused with those of other diseases.

↑ risk: no children, continuous ovulation (never used birth control pills)

↓ risk: pregnancies, lactation

Generation of a mouse model

Roby et al., Carcinogenesis 21, 585-594, 2000.

pass 5

MOSE (mouse ovarian surface epithelial) cells

MOSE cells: highest ranking differentially expressed genes

Gene symbol, description | Mean difference | Adjusted p-value a | Reported in ovarian cancer b [References]

Up-regulated
Spp1, osteopontin | +3.77 | 0.0e0 | Yes [18, 19]
Cyp1b1, cytochrome P450, family 1, subfamily b, polypeptide 1 | +2.85 | 3.1e-3 | Yes [22]
Nupr1, nuclear protein 1 | +2.83 | 9.0e-4 | Pancreas c
Gsto1, glutathione S-transferase ω1 | +2.52 | 0.0e0 | Breast d
Btg1, B-cell translocation gene | +2.41 | 0.0e0 | Yes [24]

Down-regulated
Mt1, metallothionein 1 | −4.77 | 0.0e0 | Yes [27]
Mt2, metallothionein 2 | −4.66 | 0.0e0 | Yes [27]
Igfbp5, insulin-like growth factor binding protein 5 | −3.93 | 2.0e-4 | Yes [28, 30]
Gstm1, glutathione S-transferase, µ1 | −3.32 | 0.0e0 | Yes [32, 33]
Alcam, activated leukocyte cell adhesion molecule | −3.08 | 1.7e-2 | Breast i

MOSE clonal cells produce tumors in immunocompetent mice

Roby et al., Carcinogenesis 21, 585-594, 2000.

Self-organizing tree algorithm (SOTA) clustering

Urzúa et al. (2006) J. Cell. Physiol. 206, 594-602

Human-Mouse Comparison

• SOM clustering of the IG10 and IF5 cell lines compared to human ovarian tumors, based on 872 genes with equivalent biological function. Samples:

1.- LMP
2.- Stage III
3.- Serous BOT
4.- Mucinous BOT
5.- Mouse IG-10
6.- Mouse IF-5

Urzúa et al. (2006) J. Cell. Physiol. 206, 594-602

Microarray-CGH concept

Gene expression array: RNA (cDNA) hybridized
Microarray-CGH: genomic DNA hybridized

Microarray-CGH… How to deal with the genome complexity?

RNA vs DNA raw data distribution

• Test RNA and DNA obtained from the same source were hybridized against their respective reference RNA and DNA. Statistical values are shown for 13,417 clones from the NIA-15K cDNA mouse clone set. Upper and lower box boundaries indicate the 75th and 25th percentiles, respectively. Whiskers above and below the box indicate the 90th and 10th percentiles. A line within the box marks the median.
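The five values behind each box in such a plot (10th, 25th, 50th, 75th and 90th percentiles) can be computed as in the sketch below. It assumes linear interpolation between ranked values, which is one common convention; the plotting software used for the slide may apply a different one, and the function names are illustrative.

```python
def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    data = sorted(values)
    if not data:
        raise ValueError("percentile() needs at least one value")
    position = (len(data) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(data) - 1)
    fraction = position - lower
    return data[lower] + (data[upper] - data[lower]) * fraction

def box_summary(values):
    """Whisker, box and median values as described for the box plot."""
    return {q: percentile(values, q) for q in (10, 25, 50, 75, 90)}

if __name__ == "__main__":
    log_ratios = [-1.2, -0.4, -0.1, 0.0, 0.1, 0.3, 0.9, 2.0]
    print(box_summary(log_ratios))
```

A narrower box for the DNA hybridization than for the RNA hybridization would indicate a tighter spread of log ratios, which is the comparison the figure is meant to show.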

Microarray-CGH, experimental optimization

Urzúa et al. (2005) Tumor Biol. 26, 236-44

Conventional-CGH vs microarray-CGH

Z-score

Frankenberger et al. (2006) Appl. Bioinformatics 5, 125-30
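As a generic illustration of a Z-score, the sketch below standardizes a list of log2 ratios against their own mean and sample standard deviation. How the cited work defines its reference distribution is not given on the slide, so the choice of "all values on the array" as the reference, and the function name, are assumptions.

```python
from statistics import mean, stdev

def z_scores(log_ratios):
    """Standardize each log2 ratio: (value - mean) / sample standard deviation."""
    mu = mean(log_ratios)
    sd = stdev(log_ratios)  # requires at least two values
    if sd == 0:
        raise ValueError("all values are identical; Z-score is undefined")
    return [(x - mu) / sd for x in log_ratios]

if __name__ == "__main__":
    print(z_scores([1.0, 2.0, 3.0]))
```

Expressing copy-number ratios in standard-deviation units makes gains and losses comparable across arrays with different noise levels.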

Exercise # 2

Web-aCGH, microarray CGH data analysis and display.

http://129.43.22.27/WebaCGH/welcome.htm

Focal lip sialadenitis in Sjögren's syndrome

Focus: cluster of > 50 lymphocytes

Focus score: number of foci per 4 mm² of glandular tissue (0-4)

[Figure labels: duct, acinus]

Case # 2

LSG expression pattern in Sjögren's syndrome patients: correlation with clinical parameters

Isolation of epithelial cells

Epithelial gene expression vs focus score

[Figure panels: positively correlated gene expression; negatively correlated gene expression]

Top ranked LSG acini expressed genes correlated to focus score a

Gene symbol, description | Direction of correlation, R2 value b | Gene expression shift (log2) c | Gene expression shift (ratio) | Previously reported in SS, PMID

LAPTM5, Lysosomal associated multispanning membrane protein 5 | (+), 0.86 | 1.69 | 3.23 | Azuma et al. (2002), 11947921
RAC2, ras-related C3 botulinum toxin substrate 2 | (+), 0.83 | 1.39 | 2.62 | n.r.d. d
HLA-A, major histocompatibility complex, class I, A | (+), 0.82 | 2.47 | 5.54 | Loiseau et al. (2001), 11423179
LCP1, L plastin, actin binding protein | (+), 0.85 | 1.63 | 3.10 | n.r.d.
HLA-DMA, major histocompatibility complex, class II, DM alpha | (+), 0.89 | 2.28 | 4.86 | n.r.d.
HLA-DQA1, major histocompatibility complex, class II, DQ alpha 1 | (+), 0.77 | 1.72 | 3.29 | Fei et al. (1991), 1685512
CXCL9, chemokine (C-X-C motif) ligand 9 | (+), 0.83 | 2.37 | 5.17 | Ogawa et al. (2002), 12384933
NQO2, NAD(P)H dehydrogenase, quinone 2 | (+), 0.84 | 1.76 | 3.39 | n.r.d.
TRBV2, T cell receptor beta variable 2 | (+), 0.84 | 1.49 | 2.81 | Kay et al. (1995), 7558918
HLA-DRA, major histocompatibility complex, class II, DR alpha | (+), 0.82 | 2.61 | 6.11 | Dimitriou et al. (2002), 11876766
SERPINB1, serpin peptidase inhibitor, clade B (ovalbumin), member 1 | (+), 0.89 | 1.27 | 2.41 | n.r.d.
CD69, CD69 antigen (p60, early T-cell activation antigen) | (+), 0.82 | 1.90 | 3.73 | Ohyama et al. (1995), 7621031
IL10RA, interleukin 10 receptor, alpha | (+), 0.79 | 0.98 | 1.97 | n.r.d.
FYB, FYN binding protein (FYB-120/130) | (+), 0.85 | 1.22 | 2.33 | n.r.d.
RING1, ring finger protein 1 | (+), 0.74 | 1.09 | 2.13 | n.r.d.
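The two "gene expression shift" columns are related by ratio = 2^(log2 shift); for example, a log2 shift of 1.09 corresponds to a ratio of about 2.13. The sketch below shows that conversion together with a plain Pearson correlation between a gene's expression and the focus score. Whether the tabulated R2 values come from a Pearson fit is an assumption, and the function names and the example data are made up for illustration.

```python
def log2_to_ratio(log2_shift):
    """Convert a log2 expression shift to a fold-change ratio."""
    return 2.0 ** log2_shift

def pearson_r_squared(x, y):
    """Return (r, r squared) for paired observations x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    r = sxy / (sxx * syy) ** 0.5
    return r, r * r

if __name__ == "__main__":
    print(round(log2_to_ratio(1.09), 2))
    # Hypothetical data: focus score per patient vs. log2 expression of one gene.
    focus_score = [0, 1, 2, 3, 4]
    expression = [0.1, 0.6, 0.9, 1.7, 2.1]
    r, r2 = pearson_r_squared(focus_score, expression)
    print("direction:", "(+)" if r > 0 else "(-)", "R2:", round(r2, 2))
```

The sign of r gives the direction of correlation reported in the table, while R2 alone does not distinguish positive from negative association.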

Gene expression phenotype correlation (1)

[Figure panels: positively correlated gene expression; negatively correlated gene expression]

Gene expression phenotype correlation (2)

Functional (GO, KEGG, BioCarta) and literature (PubMed) linked genes

Strength of correlation respective to transcriptional activity

Urzúa et al. (2008) in preparation

Correlation between expression profiles and multiple phenotypes

[Figure: phenotypes compared are ovarian tumor frequency, number of litters and litter size. Values shown: 89, 280, 73, 0, 0, 0, 145.]

Case # 3

Gene expression profiling in ovarian carcinomas

actin cytoskeleton

regulation of protein metabolism

blood coagulation

response to wounding

apoptosis

Gene expression differences - IOSE vs EOC III (t-test)

Gene expression differences - IOSE vs EOC III (Anova)

NGF signaling pathway and related genes

[Figure: log2 ratio (axis from -2.5 to 2.5) for IOSE and CaOv. Genes shown: VEGF, TGFb1, TrkA, FSHR, ER alpha, ER beta, AKT1, AKT2, AKT3, PTEN, MAPK1, GSK3b, E-cad, FOXL2, HIF1A. Some bars carry asterisks (* or **).]

Case # 4

Wine yeast genomics

L846 (native strain)

L846 ura- (spontaneous uracil mutant, product of sporulation)

Spt2 overexpression is concomitant with repression of transposable elements

⇑ Spt2, negative regulator of Ty transcription

⇓ transposon Ty2 protein b

⇓ transposon Ty2 protein a

⇓ transposon Ty1 protein a

⇓ Ty1a protein

⇓ transposon Ty1 protein a

⇓ mevalonate pyrophosphate decarboxylase

⇓ phosphomevalonate kinase

⇓ farnesyltransferase (essential for yeast?)

⇓ lanosterol 14-alpha demethylase (cytochrome P450 family)

?

The yeast genome may undergo genomic changes when exposed to the environment

[Figure panels: S288C vs S288C; EC1118 vs S288C; L-1333 vs S288C; L-957 vs S288C]

S288C = standard laboratory strain

EC1118 = French commercial strain

L-1333 = strain isolated in Casablanca

L-957 = strain isolated in Mendoza

Case # 5

Norovirus genotyping

SNPs in the viral genome allow virus classification

Real-time Q-PCR validation

Spp1 Mt1

———— 18S rRNA ——●—— Rps16

Urzúa et al. (2006) J. Cell. Physiol. 206, 594-602

Real-time Q-PCR validation (2)

Urzúa et al. (2008) in preparation

Just as a house is built with bricks, science is built on facts... But a collection of facts is not in itself science, just as a pile of bricks is not a house.

Henri Poincaré, La Science et l'Hypothese, Paris, 1908.

Acknowledgments

• Dra Carmen Romero (Hosp Clinico U de Chile)

• Dr Luigi Devoto (IDIMI, U de Chile)

• Dr. David Munroe (NCI Frederick)

• Dra Julieta Gonzalez (Prog Biol Cel Mol, ICBM)

• Dr Claudio Martínez (USACH)

• Dra Sandra Ampuero (Virología, ICBM)

• VID, U de Chile, Initiation Project (Proyecto de Iniciación)

Collaborators

Thank you!

Dr. Ulises Urzúa

uurzua@med.uchile.cl

Phone 978-6877

Systems biology integrates different levels of information to understand how biological organisms function.

In contrast to molecular biology, systems biology does not break down a system into all of its parts and study one part of the process at a time. Systems biologists argue that this reductionist approach is not robust, either because of nature's redundancy and complexity, or because we have not understood all the parts of the processes.

The ultimate goal of systems biology is to mathematically model biological processes. Such models are used to predict how different changes affect the phenotype of a cell, and can be iteratively tested to prove or disprove the model.

Adapted from http://en.wikipedia.org/

Seminars, 9 December 2008

http://www.ncbi.nlm.nih.gov/pubmed/18596974

http://www.ncbi.nlm.nih.gov/pubmed/17766027
