Constructions and Applications of Alternative Splicing Databases
Feng Chia University, Bioinformatics Research Center
Speaker: 許芳榮 (Fang-Rong Hsu)
Outline
Introduction
Construction of alternative splicing database
Survey of existing solutions
Applications
Introduction
RNA Splicing
Alternative Splicing
Definition: splicing the same pre-mRNA in two or more ways to yield two or more different mRNAs that produce two or more different protein products.
Types of alternative splicing
The Troponin T (muscle protein) pre-mRNA is alternatively spliced to give rise to 64 different isoforms of the protein.
Constitutively spliced exons (exons 1-3, 9-15, and 18)
Mutually exclusive exons (exons 16 and 17)
Alternatively spliced exons (exons 4-8)
Exons 4-8 are spliced in every possible way, giving rise to 32 different possibilities.
Exons 16 and 17, which are mutually exclusive, double the possibilities; hence 64 isoforms.
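The count on this slide can be checked by enumeration. The sketch below is only an illustration of the arithmetic (2^5 choices for exons 4-8, times 2 for exons 16/17); the exon numbers come from the slide, while the function and variable names are made up for this example.

```python
from itertools import product

# Exon groups as listed on the slide for Troponin T.
CONSTITUTIVE = [1, 2, 3, 9, 10, 11, 12, 13, 14, 15, 18]
ALTERNATIVE = [4, 5, 6, 7, 8]      # each exon may be included or skipped
MUTUALLY_EXCLUSIVE = [16, 17]      # exactly one of the two is used


def enumerate_isoforms():
    """Return every exon combination as a sorted tuple of exon numbers."""
    isoforms = []
    # Every include/skip pattern over the five alternative exons: 2**5 = 32.
    for pattern in product([False, True], repeat=len(ALTERNATIVE)):
        chosen = [exon for exon, keep in zip(ALTERNATIVE, pattern) if keep]
        # Each pattern is paired with exon 16 or exon 17, never both.
        for exclusive in MUTUALLY_EXCLUSIVE:
            isoforms.append(tuple(sorted(CONSTITUTIVE + chosen + [exclusive])))
    return isoforms


isoforms = enumerate_isoforms()
print(len(isoforms))  # 32 * 2 = 64
```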
Expressed Sequence Tags (ESTs)
What are the relationships of Genome, mRNA and ESTs?
[Figure: a genome region with Exons 1-4 and Introns 1-3, oriented 5' to 3'; a transcript ending in a poly(A) tail (AAA...); EST 1 to EST 7 drawn as short fragments.]
Construction of alternative splicing database
[Figure: dbEST (5 million ESTs) and genome sequences (3 billion bp) are aligned to build an exons/introns database, which feeds gene discovery, SNP, and alternative splicing analysis.]
Methods of Alternative Splicing Detection
mRNA-EST alignment (or EST consensus)
  Without knowledge of genomic sequence
Genomic sequence to EST alignment
  Informative
How to cluster ESTs?
UniGene cluster
  Consider the ESTs in the same UniGene cluster
  Saves time but is not informative
Genome template
  Genomic sequence to EST alignment
  Informative but time-consuming
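To illustrate why genome-anchored EST alignments are informative, here is a minimal sketch of exon-skipping detection. It assumes each EST has already been aligned to the genome and reduced to the ordered list of exon numbers it covers; that input format, the function name, and the toy data are assumptions for this example, not the method used by any of the databases discussed here.

```python
def find_exon_skipping(est_exon_chains):
    """Return (upstream, skipped_exons, downstream) tuples.

    est_exon_chains maps an EST name to the exons its alignment covers,
    numbered in genomic order. A junction that jumps over an exon seen in
    some other EST is reported as exon-skipping evidence.
    """
    junctions = set()
    observed_exons = set()
    for chain in est_exon_chains.values():
        observed_exons.update(chain)
        # Consecutive exons in one EST were joined by splicing.
        junctions.update(zip(chain, chain[1:]))

    events = []
    for upstream, downstream in sorted(junctions):
        skipped = [exon for exon in range(upstream + 1, downstream)
                   if exon in observed_exons]
        if skipped:
            events.append((upstream, skipped, downstream))
    return events


# Hypothetical alignments: EST C joins exon 2 directly to exon 4.
toy_chains = {
    "EST A": [1, 2],
    "EST B": [2, 3, 4],
    "EST C": [2, 4],
}
print(find_exon_skipping(toy_chains))  # [(2, [3], 4)]
```

Without the genome as a common coordinate system, the two junctions 2-3 and 2-4 could not be compared this directly, which is the trade-off the slide describes.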
The Approaches of EST Clustering
UniGene-like approach
  1. Overlapping ESTs are grouped into a cluster, as in UniGene.
  2. Generate a consensus sequence for each cluster.
  3. Align the consensus sequences to the genome sequence.
Genome template
  1. Cut the human genome sequence into 20 kb segments.
  2. Screen the ESTs for similarity with BLAST.
  3. Detect exons with sim4.
Direct alignment
UniGene-like approach
  1. Overlapping ESTs are grouped into a cluster, as in UniGene.
  2. Generate a consensus sequence for each cluster.
  3. Align the consensus sequences to the genome sequence.
[Figure: workflow with the labels genomic seq, candidates of gene location, STS, BLAST, gene, consensus sequence, report exons.]
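Step 1 of the UniGene-like approach can be sketched as a connected-components problem. The example assumes that pairwise overlaps between ESTs have already been computed by some alignment tool; the union-find grouping, the function name, and the EST names are illustrative assumptions and are not how UniGene itself is built. Steps 2 and 3 (consensus and genome alignment) are not covered.

```python
def cluster_by_overlap(est_ids, overlapping_pairs):
    """Group ESTs so that any two linked by a chain of overlaps share a cluster."""
    parent = {est: est for est in est_ids}

    def find(est):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[est] != est:
            parent[est] = parent[parent[est]]
            est = parent[est]
        return est

    for a, b in overlapping_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for est in est_ids:
        clusters.setdefault(find(est), []).append(est)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical ESTs and overlaps.
ests = ["est1", "est2", "est3", "est4", "est5"]
overlaps = [("est1", "est2"), ("est2", "est3"), ("est4", "est5")]
print(cluster_by_overlap(ests, overlaps))
# [['est1', 'est2', 'est3'], ['est4', 'est5']]
```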
Genome template
  1. Cut the human genome sequence into 20 kb segments.
  2. Screen the ESTs for similarity with BLAST.
  3. Detect exons with sim4.
[Figure: workflow: genomic template and EST DB, WU-BLAST, ESTs with similarity, sim4, exons.]
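Step 1 of the genome template approach is simple enough to sketch directly. The example below cuts a sequence into non-overlapping 20,000 bp templates; whether the original pipeline used overlapping windows is not stated on the slide, so non-overlapping windows are an assumption, as are the function name and the toy sequence. The BLAST screening and sim4 steps rely on external tools and are not shown.

```python
def cut_into_templates(sequence, size=20_000):
    """Yield (start_offset, segment) pairs covering the sequence in order."""
    for start in range(0, len(sequence), size):
        yield start, sequence[start:start + size]


# Toy 50,000 bp sequence standing in for a chromosome.
toy_genome = "ACGT" * 12_500
templates = list(cut_into_templates(toy_genome))
print([(start, len(segment)) for start, segment in templates])
# [(0, 20000), (20000, 20000), (40000, 10000)]
```

Keeping the start offset with each segment lets exon coordinates reported per template be mapped back to genome coordinates later.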
Direct alignment
Using UniGene Clusters is not Informative
Many ESTs in different UniGene clusters are aligned to the same genome area.
UniGene cluster IDs 101131, 100437, 100738, 101182 and 100143 should be grouped together to detect alternative splicing.
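One way to find clusters that belong together is to merge them whenever their genomic alignments overlap. The sketch below assumes each cluster has been reduced to a single (chromosome, start, end) region; the cluster names and coordinates are invented for illustration and are not the coordinates of the UniGene clusters named above.

```python
def group_by_genome_region(cluster_regions):
    """Group cluster IDs whose aligned regions overlap on the same chromosome.

    cluster_regions is a list of (cluster_id, chromosome, start, end) tuples.
    """
    groups = []
    current = None  # [chromosome, furthest_end, [cluster_ids]]
    # Sorting by chromosome and start lets one pass merge overlapping regions.
    for cluster_id, chrom, start, end in sorted(cluster_regions,
                                                key=lambda r: (r[1], r[2])):
        if current and current[0] == chrom and start <= current[1]:
            current[1] = max(current[1], end)
            current[2].append(cluster_id)
        else:
            current = [chrom, end, [cluster_id]]
            groups.append(current)
    return [ids for _, _, ids in groups]


# Hypothetical clusters and coordinates.
toy_regions = [
    ("c1", "chr1", 100, 500),
    ("c2", "chr1", 450, 900),
    ("c3", "chr1", 2000, 2500),
    ("c4", "chr2", 100, 300),
]
print(group_by_genome_region(toy_regions))  # [['c1', 'c2'], ['c3'], ['c4']]
```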
Resource: ASAP
Description: Performed genome-wide detection of human alternative splicing and ...
Approach: UniGene-like
Quantity: 6,201 AS sites in all genes
Species: Human
Institute: UCLA

Resource: TAP
Description: Performed an EST-based gene structure prediction in genomic sequences and also collected splicing information
Approach: Genome template
Quantity: 669 AS sites in 365 of the 1,007 multi-exon genes
Species: Human
Institute: WU

Resource: PALSdb
Description: PALSdb is a collection of Putative Alternative Splicing information. Alternative splicing sites were predicted by using the longest mRNA sequence in each UniGene cluster as the reference sequence
Approach: UniGene-like
Quantity: 9,952/19,936
Species: Human, mouse
Institute: Yang Ming

Resource: Compugen
Description: ESTs and all mRNA sequences were aligned with the human genome sequence using LEADS, Compugen's alternative splicing modeling platform
Approach: UniGene-like
Quantity: not given
Species: Human
Institute: Compugen

Resource: SpliceNest
Description: Web-based graphical tool to explore gene structure, including alternative splicing, based on a mapping of the EST consensus from GeneNest to the complete human genome
Approach: UniGene-like
Quantity: 26,880 introns and 32,348 exons from 5,468 genes
Species: Human, mouse, Arabidopsis, zebrafish
Institute: Max-Planck-Gesellschaft

Resource: STACK
Description: STACK can provide putative tissue-specific transcripts for each gene
Approach: UniGene-like
Quantity: not given
Species: Human
Institute: Egenetic, South Africa
Avatar: a value added transcriptome database
Align entire dbEST to genome using PCs
Number of alternative splicing events
Organism                  5' AS    3' AS    Exon skipping  Mutually exclusive  Intron retention
Homo sapiens              14,989   22,969   11,188         330                 7,481
Mus musculus              7,479    13,075   4,850          127                 3,493
Rattus norvegicus         531      900      401            4                   373
Caenorhabditis elegans    162      28       263            5                   174
Drosophila melanogaster   351      117      221            6                   221
Arabidopsis thaliana      83       4        77             1                   32
Applications
Cross-species analysis
Tissue-specific analysis
SNP and alternative splicing
Quantity analysis
Splicing enhancer
Gene prediction through dbEST
SNP finding through dbEST
Tissue distribution of 51 tumor-specific alternative splicing sites
Bone marrow: 1
Brain: 15
Eye: 1
Liver: 15
Lung: 5
Lymph node: 1
Mammary gland: 1
Placenta: 4
Stomach: 5
Testis: 1
Whole blood: 2
1,598 SNP-dependent alternative splicing events
Comparison of human and mouse
Exon skipping
Conserved alternative splicing events (CES events)
[Figure: exon-skipping diagrams in which the two splice forms, F1 and F2, are labeled in each panel.]
Non-conserved alternative splicing events (NCES events): if NCES.F1 > K and NCES.F2 == 0
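The rule above can be sketched as a small classifier. The slide does not define F1, F2, or K in text, so this example assumes that F1 and F2 are the EST counts supporting the two splice forms, that the event is alternative in a reference species, and that the rule is applied to the counts in the other species. All names are hypothetical, and the "undetermined" outcome is an addition for cases the slide does not cover.

```python
def classify_event(ref_f1, ref_f2, other_f1, other_f2, k):
    """Label an exon-skipping event as conserved (CES) or non-conserved (NCES).

    ref_*   : EST counts for forms F1 and F2 in the reference species.
    other_* : EST counts for the same two forms in the other species.
    k       : minimum F1 support required before calling the event non-conserved.
    """
    if ref_f1 == 0 or ref_f2 == 0:
        return "not alternative in reference"
    if other_f1 > 0 and other_f2 > 0:
        # Both forms are seen in both species.
        return "CES"
    if other_f1 > k and other_f2 == 0:
        # The slide's rule: F1 is well supported and F2 is never observed.
        return "NCES"
    # Too little EST evidence to decide either way.
    return "undetermined"


print(classify_event(5, 3, 7, 2, k=10))    # CES
print(classify_event(5, 3, 41, 0, k=10))   # NCES
print(classify_event(5, 3, 4, 0, k=10))    # undetermined
```

The threshold K guards against calling an event non-conserved merely because the second species has too few ESTs for the rarer form to have been sampled.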
Discovering the different constitutive splicing events
[Figure: gene structures of human SNX3 and mouse Snx3. Labels recovered from the slide: 94, 91; ME12713588-1, ME12751459-1; MR12705131-1 (EST support: 41); ME2231614-2; ME2238811-1 (EST support: 90); EST frequency >= 10; EST frequency >= 1.]
[Figure: PSMD13 and Psmd13, splice forms F1 and F2, with a human exon and a mouse exon compared. Exon labels: MR178998-1, ME184041-1, ME184161-1, ME579264-1, ME582152-1, ME582275-1.]
Exon sequences shown on the slide:
GGTGAACCCTTTGTCCCTCGTGGAAATCATTCTTCATGTAGTTAGACAGATGACTG
GGTAAACCCTCTGTCCCTGGTAGAAATAATTCTCCATGTGGTTAGACAGATGACCG
SNP records shown on the slide:
184167,C,T,D,2,2,48,0,0.00452488687782805
184171,T,C,D,2,2,48,0,0.00452488687782805
Finding SNP from dbEST
[Figure: Exons 1-4 and Introns 1-3, oriented 5' to 3', with a poly(A) tail (AAA...) and EST 1 to EST 7 aligned to the exons.]
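A minimal sketch of the idea behind SNP finding from dbEST: once ESTs are aligned to the genome, each genomic position has a column of observed bases, and positions where a second base recurs are SNP candidates. The input format, the `min_minor_count` threshold, and the toy data are assumptions for this example; it does not reproduce the scoring behind the SNP records shown earlier, and it ignores sequencing-error models entirely.

```python
from collections import Counter


def call_snp_candidates(columns, min_minor_count=2):
    """Return (position, major, minor, major_count, minor_count) tuples.

    columns maps a genome position to the string of bases that the aligned
    ESTs show at that position.
    """
    candidates = []
    for position in sorted(columns):
        counts = Counter(columns[position]).most_common()
        # Require a second allele seen in several ESTs, not a single stray base.
        if len(counts) >= 2 and counts[1][1] >= min_minor_count:
            (major, n_major), (minor, n_minor) = counts[0], counts[1]
            candidates.append((position, major, minor, n_major, n_minor))
    return candidates


# Hypothetical alignment columns (positions and bases are invented).
toy_columns = {
    101: "CCCCCC",   # no variation
    102: "CCCCTT",   # T seen in two ESTs: candidate
    103: "AAAAAG",   # G seen once: treated as a likely sequencing error
}
print(call_snp_candidates(toy_columns))  # [(102, 'C', 'T', 4, 2)]
```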
EST to genome alignment with profile
[Figure: Exons 1-4 and Introns 1-3, oriented 5' to 3', with EST 3 to EST 7 aligned and a poly(A) tail (AAA...); one label reads "Translocation".]
Finding gene from dbEST
[Figure: Exons 1-4 and Introns 1-3, oriented 5' to 3', with a poly(A) tail (AAA...) and EST 1 to EST 7 aligned to the exons.]
Transcriptome Genomics
Where
What
Why
How
Conclusion