Chip-Assisted Analysis of Epithelial Transporter Proteins
Pascale Anderle, ISREC Lausanne
Overview
1. Introduction
   1. Transporters in the context of the whole genome
   2. Classification of transporters
   3. Introduction to microarray technology
   4. Overview of various microarray platforms
2. Strategies to select transporter genes and example studies
   1. First example: Custom Array
      1. Evaluation of transporter and channel genes in the intestine
      2. Use of Hidden Markov Models
      3. Summary
   2. Second example: Affymetrix Platform
      1. Genomic profiling of membrane transporters in the intestine
      2. Gene Ontology Project
      3. Importance of annotation
      4. Isrec Ontologizer
      5. Conclusions
3. Acknowledgment
Venter et al., Science 2001
Transporters in the Context of the Whole Genome
Membrane Transporter Proteins: Classification
Membrane Transport Proteins
   Selective Channels
   Specific Carriers
      Facilitated Diffusion
         Uniporters: GluT1-5
      Primary Active Transport
         ATP-powered pumps
         ATPases: P-type, F-type and ABC-type ATPases (ABC transporters)
      Secondary Active Transport
         Symporters: hPept1
         Antiporters: SLC18A1*
Facilitated diffusion: Transport of substances across the membrane by means of uniporters. Transport runs from an area of higher concentration to one of lower concentration. It is passive transport, powered by the potential energy of a concentration gradient, and does not require the expenditure of metabolic energy.
Primary active transport: Energy is derived from the hydrolysis of ATP to ADP, which liberates energy from a high-energy phosphate bond.
Secondary active transport: Energy comes from another source, a secondary diffusion gradient set up across the membrane using another ion. Because this secondary gradient is initially established by an ion pump, as in primary active transport, the energy is ultimately derived from the same source, ATP hydrolysis.
*Monoamine transporter, carrier of doxorubicin
http://tcdb.ucsd.edu/tcdb http://lab.digibench.net/transporter/
Introduction to Microarray Technology
Probes: Oligomers, PCR products
Spotting: Photolithography, printing
Physical support: Glass slide, nylon membrane
Affymetrix: Short oligo chip, single labeling
cDNA chip: Oligos or PCR products, dual-labeling
Sample preparation and hybridization:
cRNA vs. cDNA
Single-labeling vs. dual-labeling
Fluorescence vs. radioactivity
Different Microarray Platforms
Definition of biological questions
Experimental design
Chip preparation: Probe design, probe preparation, printing
Custom array: PCR products, oligomers
Commercial array: Short oligos (Affymetrix), long oligos (Agilent)
Sample preparation: cRNA/cDNA, labeling
Hybridization
Scanning
Data Acquisition and Data Analysis
Goals:
• Caco-2 cells: Differentiated cells vs. undifferentiated cells
• Small intestinal and colonic tissues vs. Caco-2 cells
Evaluation of Transporter and Channel Genes in the Intestine
Anderle et al., Pharm Res 2003
Probe Design for Custom Array
Protein seed sequence
Converged PSI-Blast
Core Protein Family
Blast humanEST db
EST nucleotide sequence
Remove vector and characterized ESTs
Assemble contigs
236 Contigs and singlets
HMM Models
Search Pfam HMM db
Run hmmsearch against GenPept db
Putative new genes
Filter genes (human only, set cut-off, eliminate redundant genes)
Transporters: 670
Channels: 263
Keywords, seed sequences
Multiple alignment and selection of repr. genes
Run Pick70
Run Pick70
Tm = 70, Palindrome Uniqueness = 15 bp
Multiple alignment and selection of repr. genes
Run Pick70
Transporters: 316
Channels: 151
Contigs: 156
Positive Controls: 9
Negative Controls: 3
Controls (diff. Oligos): 9
RGS: 75
FGF/RGF-like: 7
ADAM family: 18
Brown et al., AAPS PharmSci 2003; Anderle et al., Pharm Res 2003
Differentiation of Caco-2 cells
[Figure: M-values (y-axis, -3 to 3) plotted against A-values (x-axis, 7 to 15) over time. Panels: 5 days vs. 5 days, 5 days vs. 1 week, 5 days vs. 2 weeks, 5 days vs. 3 weeks.]
Summary
Differentiation of Caco-2 cells:
• During differentiation, the expression pattern changes
• Up- and down-regulation usually < 2-fold
• Significant changes between 5 days and 1 week and between 1 week and 2 weeks
• No significant changes between 2 weeks and 3 weeks
• Regulated genes are in general related to ion homeostasis
• No major differences between flasks and filters (except GLUT3)
• Typical small intestinal transporters not especially up-regulated in differentiated cells
Comparison Tissue vs. Caco-2 Cells:
• Changes more pronounced between tissue and cell line than between undifferentiated and differentiated cells
• Tissue vs. Caco-2 cells: More ratios > 2-fold
• No trend observed: undiff. cells to diff. cells = colon-like to small intestinal-like cells
Genomic Profiling of Membrane Transporters in the Intestine
12 x Mu74Av2
12 x Mu74Bv2
12 x Mu74Cv2
Objective:
Identification of putative segment-specific and non-specific drug carriers
Study Design:
Duodenum Jejunum Ileum Colon
Triplicates → 3 Pools of 10 mice
Gene Ontology Project
GO Output
Cellular Component Molecular Function Biological processes
[Diagram: GO terms (GO:X, GO:Y, GO:Z) placed at hierarchical levels L2 to L4 under the three ontologies; GO:Y appears at both L3 and L4.]
Ontologies are structured vocabularies in the form of directed acyclic graphs (DAGs) that represent a network in which each term may be a “child” of one or more “parents”.
Two pragmatic purposes of ontology:
1. Facilitate communication between people and organizations
2. Improve interoperability between systems
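The DAG structure described above, where a term can have several parents and therefore sit at more than one hierarchical level, can be sketched as follows. This is a minimal illustration only: the term identifiers and the `PARENTS` mapping are hypothetical placeholders, not real GO accessions or the GO file format.

```python
from collections import deque

# Hypothetical toy ontology: each term maps to the list of its parents.
# Identifiers are placeholders, not real GO accessions.
PARENTS = {
    "GO:root": [],
    "GO:A": ["GO:root"],
    "GO:B": ["GO:root"],
    "GO:X": ["GO:A", "GO:B"],  # one child with two parents
    "GO:Y": ["GO:A", "GO:X"],  # parents at different levels
}

def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    seen = set()
    queue = deque(parents[term])
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(parents[current])
    return seen

def levels(term, parents=PARENTS):
    """Return the set of hierarchical levels a term can occupy (root = 1)."""
    if not parents[term]:
        return {1}
    return {lvl + 1 for p in parents[term] for lvl in levels(p, parents)}

print(sorted(ancestors("GO:Y")))
print(sorted(levels("GO:Y")))
```

Because `GO:Y` is reachable through paths of different length, it occupies two levels at once, which is why a classification tool has to let the user select the hierarchical level explicitly.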
ABCB1
Annotation
Affymetrix Representative Sequence
Representative sequence Consensus sequence
Comparison with UG DB
BLAT against assembly sequence from UCSC
Probes
Ensembl DB
Tagger
Exact mapping to UG and RefSeq DB
Exact mapping to temp cDNA DB
EnsMart DB
SIB annotation: 4 quality levels
NetAffx
Unigene
Representative Sequence: Chosen during chip design as the sequence that is best associated with the transcribed region being interrogated.
BLAT threshold: Only records with match / Qsize >= 75 % and with score >= 0.70 are kept, where score = (match - mismatch - gap# x 5 - gap_size x 2) / Qsize. If a record has several mapping locations with score > 0.70, the highest one is chosen; if several mapping locations share the same highest score, all of them are kept.
EnsMart Approach: cDNA sequence plus an additional length of downstream sequence immediately following the most 3' exon. The individual probe sequences are mapped by exact matching. If more than 50 % of the probes map, the probe set is listed as a hit.
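The two filtering rules above can be written down as a short sketch. The score formula and the 75 %, 0.70 and 50 % cut-offs come from the slide; the function names, the record layout (a dict with `loc`, `match`, `mismatch`, `gap_count`, `gap_size`, `q_size`) and the example numbers are assumptions made for illustration.

```python
def blat_score(match, mismatch, gap_count, gap_size, q_size):
    """score = (match - mismatch - gap# x 5 - gap_size x 2) / Qsize"""
    return (match - mismatch - gap_count * 5 - gap_size * 2) / q_size

def passes_blat_threshold(match, mismatch, gap_count, gap_size, q_size):
    """Keep a record only if match/Qsize >= 75 % and score >= 0.70."""
    coverage = match / q_size
    score = blat_score(match, mismatch, gap_count, gap_size, q_size)
    return coverage >= 0.75 and score >= 0.70

FIELDS = ("match", "mismatch", "gap_count", "gap_size", "q_size")

def best_locations(records):
    """Among passing mapping locations keep the top score; ties are all kept."""
    scored = []
    for rec in records:
        args = [rec[f] for f in FIELDS]
        if passes_blat_threshold(*args):
            scored.append((blat_score(*args), rec["loc"]))
    if not scored:
        return []
    top = max(score for score, _ in scored)
    return [loc for score, loc in scored if score == top]

def ensmart_hit(mapped_probes, total_probes):
    """EnsMart rule: a hit requires more than 50 % of probes to map exactly."""
    return mapped_probes / total_probes > 0.5

# Hypothetical mapping locations for one representative sequence.
records = [
    {"loc": "chr1", "match": 95, "mismatch": 2, "gap_count": 1, "gap_size": 3, "q_size": 100},
    {"loc": "chr7", "match": 95, "mismatch": 2, "gap_count": 1, "gap_size": 3, "q_size": 100},
    {"loc": "chr9", "match": 80, "mismatch": 5, "gap_count": 1, "gap_size": 2, "q_size": 100},
]
print(best_locations(records))
```

In this example the first two locations tie on the highest score and are both retained, while the third falls below the 0.70 score cut-off even though its coverage exceeds 75 %.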
Comparison of Various Annotations
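[Figure: two Venn diagrams (Mouse MOE A and B; Human U133 A and B) comparing the probe sets annotated by Tagger, EnsMart and NetAffx. Counts without a source label are diagram regions, listed in their original order.]
First diagram:
Tagger A: 20882; EnsMart A: 15421; NetAffx A: 21545
A regions: 2686, 796, 5085, 3209, 11269, 4381, 147
B regions: 8473, 499, 2533, 904, 4027, 8610, 77
B: 15247, 22014, 5507
Second diagram:
Tagger A: 21675; EnsMart A: 14220; NetAffx A: 22446
A regions: 2384, 418, 2657, 1193, 12460, 6409, 149
B regions: 7300, 355, 1728, 169, 1853, 12790, 85
B: 16456, 22112, 2462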
Quality of Probe Sets
Mapped on: RefSeqs
Chip       High    Medium  Low     Undefined
HG-133A    13792   1663    1103    5657
HG-133B    3795    790     519     17473
Mu74v2A    5340    1283    1697    4102
Mu74v2B    2587    969     1190    7665
Mu74v2C    756     302     982     9828
MOE-A      12683   2395    1194    6354
MOE-B      2453    620     592     18846

Mapped on: RefSeqs, mRNAs, ESTs, HTCs
Chip       High    Medium  Low     Undefined
HG-133A    15703   1196    3983    1333
HG-133B    10096   2026    3125    7330
Mu74v2A    8015    615     2127    1665
Mu74v2B    7010    1421    2306    1674
Mu74v2C    2600    780     2555    5933
MOE-A      18070   1222    2383    951
MOE-B      11602   2376    2478    6055
Distribution: UGs per Probe Set
[Figure: log-log plot of the number of probe sets (y-axis, 1 to 100000) against the number of UniGenes per probe set (x-axis, 1 to 100). Series: EnsMart A, EnsMart B, Tagger A, Tagger B, NetAffx A, NetAffx B.]
Distribution: Probe Sets per UG
[Figure: log-log plot of the number of UniGenes (y-axis, 1 to 100000) against the number of probe sets per UniGene (x-axis, 1 to 100). Series: U133A, U133B, U133AB, U74Av2, U74Bv2, U74Cv2, U74ABCv2, U74ABCv3_NA, MOE430A, MOE430B, MOE430AB.]
Io: Isrec Ontologizer
Selection of hierarchical level
Classification of probe sets
Classification of UniGenes
Classification of RefSeqs
Flagging of ambiguous results
Multiple probe sets per UniGene: addressed via flagging
Multiple UniGenes per probe set: addressed via quality threshold (user defined annotation)
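The two mechanisms listed above, a quality threshold on the annotation and flagging of UniGenes represented by several probe sets, might look roughly like the sketch below. It is not Io's actual implementation: the data layout, the function names, the probe set and UniGene identifiers, and the rule that a UniGene is "ambiguous" when its probe sets disagree are all assumptions for illustration.

```python
# Ranking of the four annotation quality levels (order assumed).
QUALITY_RANK = {"high": 3, "medium": 2, "low": 1, "undefined": 0}

def filter_by_quality(annotation, threshold="medium"):
    """Keep probe sets whose annotation quality is at or above the threshold."""
    minimum = QUALITY_RANK[threshold]
    return {
        ps: rec for ps, rec in annotation.items()
        if QUALITY_RANK[rec["quality"]] >= minimum
    }

def flag_unigenes(annotation, regulated):
    """Group probe sets by UniGene and flag UniGenes whose probe sets disagree."""
    calls_by_ug = {}
    for ps, rec in annotation.items():
        calls_by_ug.setdefault(rec["unigene"], []).append(ps in regulated)
    flags = {}
    for ug, calls in calls_by_ug.items():
        if all(calls):
            flags[ug] = "regulated"
        elif any(calls):
            flags[ug] = "ambiguous"
        else:
            flags[ug] = "unchanged"
    return flags

# Hypothetical annotation: probe set -> UniGene cluster and quality level.
annotation = {
    "ps_1": {"unigene": "Mm.1", "quality": "high"},
    "ps_2": {"unigene": "Mm.1", "quality": "medium"},
    "ps_3": {"unigene": "Mm.2", "quality": "high"},
    "ps_4": {"unigene": "Mm.3", "quality": "low"},
}
regulated = {"ps_1", "ps_3", "ps_4"}  # hypothetical differentially regulated probe sets

kept = filter_by_quality(annotation, threshold="medium")
print(flag_unigenes(kept, regulated))
```

Here `ps_4` is dropped by the quality threshold before classification, and `Mm.1` is flagged because only one of its two probe sets was called as regulated.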
Io: Overview
Ontology Files: GO Consortium
Annotation Files: Affymetrix (Custom)
Io engine independent from data structure: Can classify anything hierarchical, provided well structured files are given to the program. (E.g.: Simple extension to spotted arrays.) Flexibility improved by a single configuration file (v0.1.2).
Quality Files
Results File
Probesets
Io: Annotation Organization
[Diagram elements: Probe sets of interest; annotation sources UniGeneTagger, RefSeqTagger and NetAffx; identifiers Probe Set ID, UG ID, RefSeq ID and GO term; Quality Filter; lookup tables Loc2UG, Loc2ref and Loc2GO; IO classification.]
Functional classification of differentially regulated UGs along the Intestine
Depth 2 terms are flush left; depth 3 terms (children of transporter activity) are indented. A dash marks a percentage that is undefined because the total is 0.
Function | # UG (1/F/G) | # total UG | % of all UG | # UG filt. (1/F/G) | # total UG filt. | % of all filt. UG
molecular function | 2167 (881/1096/190) | 6794 | 31.9 | 1341 (1094/167/80) | 4685 | 28.6
anticoagulant activity | 0 (0/0/0) | 2 | 0.0 | 0 (0/0/0) | 2 | 0.0
antifreeze activity | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
antioxidant activity | 9 (5/4/0) | 10 | 90.0 | 5 (4/1/0) | 7 | 71.4
apoptosis regulator activity | 22 (11/11/0) | 57 | 38.6 | 16 (13/3/0) | 42 | 38.1
binding | 1056 (386/586/84) | 3697 | 28.6 | 610 (486/87/37) | 2522 | 24.2
catalytic activity | 24 (10/12/2) | 89 | 27.0 | 11 (8/2/1) | 59 | 18.6
cell adhesion molecule activity | 24 (10/14/0) | 72 | 33.3 | 13 (13/0/0) | 50 | 26.0
chaperone activity | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
chaperone regulator activity | 1 (0/1/0) | 4 | 25.0 | 1 (0/1/0) | 3 | 33.3
cytoskeletal regulator activity | 38 (16/17/5) | 79 | 48.1 | 17 (14/2/1) | 47 | 36.2
defense/immunity protein activity | 992 (421/477/94) | 2642 | 37.5 | 651 (538/79/34) | 1833 | 35.5
enzyme regulator activity | 68 (34/32/2) | 195 | 34.9 | 33 (27/3/3) | 123 | 26.8
ice nucleation activity | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
molecular_function unknown | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
motor activity | 16 (4/11/1) | 57 | 28.1 | 10 (7/3/0) | 29 | 34.5
nutrient reservoir activity | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
obsolete | 109 (47/47/15) | 313 | 34.8 | 71 (62/2/7) | 205 | 34.6
protein stabilization activity | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
protein tagging activity | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
reg. of establishment of comp. for transf. activity | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
signal transducer activity | 272 (113/137/22) | 1188 | 22.9 | 160 (134/19/7) | 855 | 18.7
structural molecule activity | 126 (47/66/13) | 358 | 35.2 | 83 (72/7/4) | 233 | 35.6
surfactant activity | 0 (0/0/0) | 4 | 0.0 | 0 (0/0/0) | 3 | 0.0
toxin activity | 3 (3/0/0) | 8 | 37.5 | 1 (1/0/0) | 6 | 16.7
transcription regulator activity | 137 (51/73/13) | 539 | 25.4 | 79 (58/15/6) | 371 | 21.3
translation regulator activity | 17 (3/12/2) | 58 | 29.3 | 7 (5/1/1) | 33 | 21.2
transporter activity | 233 (100/106/27) | 718 | 32.5 | 149 (119/14/16) | 532 | 28.0
   amine/polyamine transporter activity | 14 (9/3/2) | 30 | 46.7 | 11 (9/1/1) | 24 | 45.8
   auxiliary transport protein activity | 0 (0/0/0) | 1 | 0.0 | 0 (0/0/0) | 0 | -
   boron transporter activity | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
   carbohydrate transporter activity | 5 (3/1/1) | 10 | 50.0 | 3 (1/1/1) | 8 | 37.5
   carrier activity | 101 (45/44/12) | 227 | 44.5 | 65 (52/6/7) | 154 | 42.2
   channel/pore class transporter activity | 45 (19/23/3) | 217 | 20.7 | 26 (21/4/1) | 170 | 15.3
   drug transporter activity | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
   electron transporter activity | 15 (7/7/1) | 29 | 51.7 | 11 (10/0/1) | 22 | 50.0
   group translocator activity | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
   intracellular transporter activity | 1 (0/1/0) | 9 | 11.1 | 1 (1/0/0) | 6 | 16.7
   ion transporter activity | 42 (18/18/6) | 112 | 37.5 | 30 (25/1/4) | 78 | 38.5
   lipid transporter activity | 3 (0/1/2) | 7 | 42.9 | 1 (1/0/0) | 6 | 16.7
   neurotransmitter transporter activity | 7 (2/4/1) | 15 | 46.7 | 5 (3/0/2) | 12 | 41.7
   nitric oxide transporter activity | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
   nucleob/nucleos/nucleot./nucl.a. transp. activity | 3 (2/1/0) | 6 | 50.0 | 2 (2/0/0) | 5 | 40.0
   organic acid transporter activity | 16 (10/4/2) | 37 | 43.2 | 13 (10/2/1) | 29 | 44.8
   organic alcohol transporter activity | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
   oxygen transporter activity | 1 (0/1/0) | 9 | 11.1 | 1 (1/0/0) | 7 | 14.3
   peptide transporter activity | 3 (1/2/0) | 5 | 60.0 | 0 (0/0/0) | 2 | 0.0
   peptidoglycan transporter activity | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
   permease activity | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
   protein transporter activity | 38 (13/23/2) | 110 | 34.5 | 25 (17/3/5) | 84 | 29.8
   toxin transporter activity | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
   vitamin/cofactor transporter activity | 0 (0/0/0) | 5 | 0.0 | 0 (0/0/0) | 4 | 0.0
   water transporter activity | 0 (0/0/0) | 1 | 0.0 | 0 (0/0/0) | 0 | -
triplet codon-amino acid adaptor activity | 0 (0/0/0) | 0 | - | 0 (0/0/0) | 0 | -
Self Organizing Maps
[Figure: self-organizing maps for the four segments (Duodenum, Jejunum, Ileum, Colon), shown for all genes at pGEA = 0.01 and for transporters at pGEA = 0.05.]
Pair-wise Comparison: M vs A Plots
M (log2 of fold change) vs. A (log2 of absolute average intensity) plots of the pair-wise comparisons of the four intestinal segments. Highlighted are genes for which a significant difference was measured between the two segments of interest and for which the annotation was of “high” or “medium” quality.
• differentially regulated genes, p (GEA) ≤ 0.05
• differentially regulated transporters, p (GEA) ≤ 0.05
• differentially regulated transporters, p (GEA) ≤ 0.01
2 * SD according to lowess fitting
3 * SD according to lowess fitting
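For reference, the M and A coordinates of such a plot can be computed as in the sketch below. It assumes the common convention in which A is the mean of the two log2 intensities; the slide does not state which averaging was used, and the function name and the example intensities are hypothetical. The lowess fit and the SD bands are not reproduced here.

```python
import math

def ma_values(intensity_1, intensity_2):
    """Return (M, A) for one gene measured in two segments.

    M: log2 fold change between the two intensities.
    A: mean of the two log2 intensities (assumed convention).
    """
    log_1 = math.log2(intensity_1)
    log_2 = math.log2(intensity_2)
    return log_1 - log_2, 0.5 * (log_1 + log_2)

# Hypothetical signal of one probe set in two intestinal segments.
m, a = ma_values(1024.0, 256.0)
print(m, a)
```

A gene with M = 2 is four-fold higher in the first segment, and plotting M against A makes it visible whether large fold changes occur only at low intensities, where the noise is greatest.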
Conclusions I
Bioinformatic aspects:
Annotation provided by NetAffx does not capture the entire complexity of Affymetrix-based microarray experiments
Heterogeneous representation of genes on GeneChips: 1 unique probe set ≠ 1 unique gene
Need for coherent and comparable annotation when comparing results of microarray experiments
Filtering of genes using an annotation quality threshold
No significant bias in general regarding the distribution of the selected probe sets into the different molecular functions for the top hierarchical levels
Possible influence regarding the distribution of the selected probe sets into the different molecular functions at lower hierarchical levels
Functional classification of genes on the UniGene level and on the RefSeq level yields very similar results
Flagged genes are ambiguous because of technical issues rather than because splice variants may be differentially expressed
Conclusions II
Biological aspects:
About 28 % of genes with transporter activity are differentially regulated along the intestine, indicating that the majority of transporter genes are not segment specific.
Some transporters, however, or genes involved in transport activity*, may be used as locally specific drug targets, such as:
• Apoa1*, Fabp1*, Xtrp3s1, CNT2 for the small intestine
• GLUT1 (Slc2a1), the amino acid transporter B0+ (Slc6a14) and the multidrug-resistance associated protein Abcc6 for the colon
• Fabp1* might be an interesting target for absorption of fatty acid type drugs in the proximal small intestine
• The tumor suppressor gene SLC5A8 seems to be highly expressed in the more distal part of the intestine, namely the ileum and the colon
The mRNA levels need to be quantified by quantitative RT-PCR.
The expression of SLC34A2, Xtrp3s1, CNT2, SLC10A2, SLC5A8, GLUT1, AI648912 will be measured in the villi, FAE and crypts using LDM and quantitative RT-PCR.
Acknowledgments I
UCSF/OSU
Wolfgang Sadée, Vera Rakhmanova, Shoshana Brown
Joe DeRisi, Adam Carroll, Jingchun Zhu
Xenoport
Katie Woodford, Noa Zerangue
National Cancer Institute
John Weinstein, Kimberly Bussey
Acknowledgments II
ISREC
Jean-Pierre Kraehenbuhl, Martin Rumbo
Bioinformatics Core Facility
Mauro Delorenzi, Thierry Sengstag
Nestlé
Gary Williamson, Muriel Fiaux, Robert Mansourian, David Mutch, Matthew-Alan Roberts
Swiss Institute of Bioinformatics
Philipp Bucher, Viviane Praz, Christian Iseli
Statistical analysis of data
Identification of differentially regulated probe sets
Classical ANOVA or GEA?
GEA: SD a function of A
Mapping to GO terms?
Functional annotation
Identification of genes with similar functions
Loess, quantile or others?
Normalization across chips
Comparability of chips
What clustering method?
Which measurement of similarity?
Clustering of genes with similar expression profiles
Identification of similarly regulated genes
Fluorescence signal of 22 probes
1 numeric value
MAS5 or RMA ?
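Of the normalization options named on this slide, quantile normalization is the simplest to illustrate. The sketch below is a plain-Python version under stated assumptions: every chip is a list of the same length, ties are broken by position, and the function name and intensities are hypothetical. It is not the procedure used in the study, which the slide leaves open.

```python
def quantile_normalize(chips):
    """Force all chips to share the same intensity distribution.

    chips: list of equal-length lists, one list of intensities per chip.
    Each value is replaced by the mean, across chips, of the values
    holding the same rank. Ties are broken by position (a simplification).
    """
    n_values = len(chips[0])
    sorted_chips = [sorted(chip) for chip in chips]
    rank_means = [
        sum(chip[rank] for chip in sorted_chips) / len(chips)
        for rank in range(n_values)
    ]
    normalized = []
    for chip in chips:
        order = sorted(range(n_values), key=lambda i: chip[i])
        out = [0.0] * n_values
        for rank, index in enumerate(order):
            out[index] = rank_means[rank]
        normalized.append(out)
    return normalized

# Two hypothetical chips with three probe sets each.
print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
```

After normalization both chips contain the same set of values, so remaining differences between them reflect changes in rank rather than chip-wide intensity shifts.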
GO Classification Programs
Name | Input | GO annotation | Quality threshold | Assessment of ambiguity | Statistics | Selection of level | Classification on UG basis | Classification on RefSeq basis | Comments
IO | PS | NetAffx, LocusLink | Yes | Yes | No | Yes | Yes | Yes |
GenMapp/MappFinder | GenBank/SwissProt, Trembl | GOA | No | No | Yes: z-score | No/Yes | No | No | Linked to pathway maps
Onto-Express | PS and others | NetAffx | No | No | No | Yes | No | No | Included in Onto-Tools: Onto-Translate etc.
Affymetrix GO Mining Tool | PS | NetAffx | No | No | No | No | No | No |
GoMiner | HUGO | ? | No | No | Yes: Fisher | Predeterm. | No | No | Linked to other DBs
FatiGo | Depending on species: e.g. UG, SP | GOA | No | No | Yes: Fisher, rel. Enr. factor | Yes | No | No |
GeneSpring | PS | NetAffx* | No | No | No | No | No | No |
David | PS, GB, LL, RefSeq, UG | NetAffx, UM associations | No | No | No | Prechosen | No | No | Linked to other DBs