Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06


Page 1: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Alternative Splicing (a review by Liliana Florea, 2005)

CS 498 SS

Saurabh Sinha

11/30/06

Page 2: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

What is alternative splicing?

• The first result of transcription is “pre-mRNA”

• This undergoes “splicing”, i.e., introns are excised out, and exons remain, to form mRNA

• This splicing process may involve different combinations of exons, leading to different mRNAs, and different proteins

• This is alternative splicing
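The idea can be illustrated with a minimal sketch. The sequence and exon coordinates below are invented for illustration (uppercase marks exons, lowercase marks introns); real splicing is carried out by the spliceosome, not by coordinate lookup.

```python
# Toy pre-mRNA (made-up sequence): uppercase = exon, lowercase = intron.
pre_mrna = "AUGGCC" "guaaguuuag" "GAUUAC" "guccguacag" "UGGUAA"

# Hypothetical exon coordinates as half-open (start, end) intervals.
exons = {"E1": (0, 6), "E2": (16, 22), "E3": (32, 38)}

def splice(pre_mrna, exons, keep):
    """Join the chosen exons in genomic order, dropping everything else."""
    chosen = sorted(exons[name] for name in keep)
    return "".join(pre_mrna[start:end] for start, end in chosen)

full = splice(pre_mrna, exons, ["E1", "E2", "E3"])   # all exons included
skipped = splice(pre_mrna, exons, ["E1", "E3"])      # exon E2 skipped

print(full)     # AUGGCCGAUUACUGGUAA
print(skipped)  # AUGGCCUGGUAA
```

The same precursor yields two different mRNAs depending on which exons are kept, which is the essence of alternative splicing.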

Page 3: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Alternative splicing

• Important regulatory mechanism, for modulating gene and protein content in the cell

• Large-scale genomic data today suggests that as many as 60% of the human genes undergo alternative splicing

Page 4: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Significance

• Number of human genes has recently been estimated to be about 20-25 K.

• Not significantly greater than in much less complex organisms

• Alternative splicing is a potential explanation of how a large variety of proteins can be achieved with a small number of genes

• Errors in splicing mechanism implicated in diseases such as cancers

Page 5: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

What happens in alternative splicing?

• Different combinations of exons within a gene are spliced from the RNA precursor, to be included in mRNA

• The combination depends on tissue type, developmental stage, disease etc.

• Thus different proteins are produced under these different conditions

• Different types of alternative splicing on next slide

Page 6: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

http://bib.oxfordjournals.org/cgi/content/full/7/1/55/F1

exon inclusion/exclusion

alternative 5’ exon

alternative 3’ exon

intron retention

5’ alternative UTR

3’ alternative UTR

Page 7: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Bioinformatics of Alt. splicing

• Two main goals:

– Find out cases of alt. splicing

• What are the different forms (“isoforms”) of a gene?

– Find out how alt. splicing is regulated

• What are the sequence motifs controlling alt. splicing, and deciding which isoform will be produced?

Page 8: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Identification of splice variants

• All cells have the same genome

• But not all cells have the same “transcriptome” (i.e., set of transcripts)

– Different cells may express different (alternative) transcripts of the same gene

• Goal of bioinformatics is to find “splice forms”, i.e., what are the alternative splicing events?

Page 9: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Identification of splice variants

• Direct comparison between sequences of different cDNA isoforms

– Q: What is cDNA? How is this different from a gene’s DNA?

– cDNA is “complementary DNA”, obtained by reverse transcription from mRNA. It has no introns

• Direct comparison reveals differences in the isoforms

• But this difference could be part of an exon, a whole exon, or a set of exons
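A rough sketch of direct comparison, using Python's generic `difflib.SequenceMatcher` as a stand-in for a proper sequence aligner (an assumption made for brevity; it is not a biological alignment tool). The two isoform sequences are invented.

```python
from difflib import SequenceMatcher

def isoform_differences(cdna_a, cdna_b):
    """Return the non-matching segments between two cDNA sequences."""
    matcher = SequenceMatcher(None, cdna_a, cdna_b, autojunk=False)
    return [
        (tag, cdna_a[i1:i2], cdna_b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

# Hypothetical isoforms: the short one lacks a 6-base stretch.
long_isoform = "ATGGCCGATTACTGGTAA"
short_isoform = "ATGGCCTGGTAA"

diffs = isoform_differences(long_isoform, short_isoform)
for tag, seg_a, seg_b in diffs:
    print(tag, repr(seg_a), repr(seg_b))
```

The comparison reports that a segment is missing from one isoform, but on its own it cannot say whether that segment is part of an exon, a whole exon, or several exons; that requires the genomic sequence.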

Page 10: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06


Florea, L. Brief Bioinform 2006 7:55-69; doi:10.1093/bib/bbk005

Bioinformatics methods for identifying alternative splicing: direct comparison

Page 11: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Identification of splice variants

• Comparison of exon-intron structures (the gene’s architecture)

• Where do the exon-intron structures come from?

– Align cDNA (no introns) with genomic sequence (with introns)

– This gives us the intron and exon structure
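As a small sketch of comparing exon-intron structures: assume each cDNA has already been aligned to the genome and the alignment is summarized as a list of exon blocks in genomic coordinates (the coordinates below are hypothetical). Introns are then the gaps between consecutive blocks, and an exon of one isoform that falls entirely inside an intron of another is a candidate skipped exon.

```python
def introns_from_exons(exons):
    """Introns are the gaps between consecutive aligned exon blocks."""
    exons = sorted(exons)
    return [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]

def skipped_exons(isoform_a, isoform_b):
    """Exons of isoform A lying wholly inside an intron of isoform B."""
    introns_b = introns_from_exons(isoform_b)
    return [
        (start, end)
        for start, end in sorted(isoform_a)
        if any(i_start <= start and end <= i_end for i_start, i_end in introns_b)
    ]

# Hypothetical genomic coordinates, half-open (start, end).
iso_a = [(100, 200), (300, 380), (500, 600)]
iso_b = [(100, 200), (500, 600)]

print(introns_from_exons(iso_a))    # [(200, 300), (380, 500)]
print(introns_from_exons(iso_b))    # [(200, 500)]
print(skipped_exons(iso_a, iso_b))  # [(300, 380)]
```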

Page 12: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06


Florea, L. Brief Bioinform 2006 7:55-69; doi:10.1093/bib/bbk005

Bioinformatics methods for identifying alternative splicing: comparison of exon-intron structures

Page 13: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Identification of splice variants

• Alignment tools.

• Align cDNA sequence to genomic sequence

• Why shouldn’t this be a perfect match with gaps (introns)?

– Sequencing errors, polymorphisms, etc.

• Special-purpose alignment programs exist for this task

Page 14: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Identifying full length alt. spliced transcripts

• Previous methods identified parts of alt. spliced transcript

• Much more difficult to identify full length alternatively spliced transcripts

• Such methods include “gene indices”

Page 15: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Gene indices

• Compare all EST (expressed sequence tag) sequences against one another

• Identify significant overlaps

• Group and assemble sequences with compatible overlaps into clusters
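The three steps can be sketched as follows. This is a toy version under strong assumptions: overlaps are exact suffix-prefix matches of at least `min_len` bases (real gene indices tolerate mismatches and use proper alignment), the EST strings are invented, and only the grouping step is shown, not assembly.

```python
from itertools import combinations

def overlap(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b, or 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

def cluster_ests(ests, min_len=4):
    """Group ESTs connected by overlaps, using union-find."""
    parent = list(range(len(ests)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # All-against-all comparison: quadratic in the number of ESTs.
    for i, j in combinations(range(len(ests)), 2):
        a, b = ests[i], ests[j]
        if overlap(a, b, min_len) or overlap(b, a, min_len):
            parent[find(i)] = find(j)

    clusters = {}
    for i, est in enumerate(ests):
        clusters.setdefault(find(i), []).append(est)
    return sorted(clusters.values())

# Hypothetical ESTs: the first three chain together, the last two form a second gene.
ests = ["ATGGCCGA", "CCGATTAC", "TTACTGGT", "GGGAAACC", "AACCTTTG"]
clusters = cluster_ests(ests)
for cluster in clusters:
    print(cluster)
```

The pairwise loop makes the quadratic cost of this approach explicit.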

Page 16: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Gene indices

Page 17: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Problems with gene indices

• Overclustering: paralogs may get clustered together.

– What are paralogs?

– Related but distinct genes in the same species

• Underclustering: if number of ESTs is not sufficient

• Computationally expensive:

– Quadratic time complexity

Page 18: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Splice graphs

• Nodes: Exons

• Edges: Introns

• Gene: directed acyclic graph

• Each path in this DAG is an alternative transcript
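A minimal sketch of this representation, with a made-up gene whose exon names and edges are purely illustrative. The graph is an adjacency list, and candidate transcripts are enumerated as all paths from the first exon to the last.

```python
def enumerate_transcripts(graph, source, sink):
    """List every path from source to sink in a directed acyclic graph."""
    def walk(node, path):
        path = path + [node]
        if node == sink:
            yield path
            return
        for nxt in graph.get(node, []):
            yield from walk(nxt, path)
    return list(walk(source, []))

# Hypothetical splice graph: nodes are exons, edges are observed splice junctions.
# E2 can be skipped; E4a and E4b are mutually exclusive alternatives.
splice_graph = {
    "E1": ["E2", "E3"],
    "E2": ["E3"],
    "E3": ["E4a", "E4b"],
    "E4a": ["E5"],
    "E4b": ["E5"],
}

transcripts = enumerate_transcripts(splice_graph, "E1", "E5")
for t in transcripts:
    print(" - ".join(t))
```

Two independent binary choices already give four candidate transcripts, so the number of paths can grow quickly with the number of alternative events.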

Page 19: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Splice graph

Page 20: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Splice graphs

• Combinatorially generate all possible alt. transcripts

• But not all such transcripts are going to be present

• Need scores for candidate transcripts, in order to differentiate between the biologically relevant ones and the artifactual ones

Page 21: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Splice variants from microarray data

• Affymetrix GeneChip technology uses 22 probes collected from exons or straddling exon boundaries

• When an exon is alternatively spliced, expression level of its probes will be different in different experiments
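One simple way to act on this observation, shown here only as a sketch: compute a per-exon log ratio of probe intensity between two experiments, take the median log ratio as the gene-level change, and flag exons that deviate strongly from it. The intensities, the threshold, and the use of the median are all assumptions for illustration, not the method described in the review.

```python
from math import log2
from statistics import median

def candidate_exons(tissue_a, tissue_b, threshold=1.0):
    """Flag exons whose log2 ratio differs from the gene-level (median) ratio."""
    ratios = {exon: log2(tissue_a[exon] / tissue_b[exon]) for exon in tissue_a}
    gene_level = median(ratios.values())
    return sorted(
        exon for exon, r in ratios.items() if abs(r - gene_level) >= threshold
    )

# Hypothetical probe intensities per exon in two tissues.
tissue_a = {"E1": 800, "E2": 820, "E3": 790, "E4": 810}
tissue_b = {"E1": 400, "E2": 100, "E3": 410, "E4": 395}

flagged = candidate_exons(tissue_a, tissue_b)
print(flagged)  # ['E2']
```

Here the whole gene is about twofold lower in the second tissue, but E2 drops about eightfold, which is consistent with E2 being skipped there.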

Page 22: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06


Florea, L. Brief Bioinform 2006 7:55-69; doi:10.1093/bib/bbk005

Bioinformatics methods for identifying alternative splicing: splice variants from microarray data

Page 23: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Part 2: Regulation of alternative splicing

Page 24: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Biological mechanism

• Splicing of pre-mRNA is a complex cellular process

• “Spliceosome” is a complex of several molecules that assembles onto each intron and catalyzes the excision of the intron

• Splice sites (5’ or donor splice site and 3’ or acceptor splice site) play a major role in splicing

• More sites, apart from the splice signals, in introns and exons, contribute to splicing

Page 25: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Biological mechanism

• Cis-regulatory elements (again !)

• Promote (“splicing enhancers”) or repress (“splicing silencers”) the inclusion of the exon in the mRNA

• Can be located in exons or introns

Page 26: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Bioinformatics methods

• Goal: find the cis-regulatory elements that mediate splicing (alternative splicing)

• Early work: find consensus sequences (motifs) of splicing enhancers

• More advanced work: Position weight matrices (PWMs)
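To make the PWM idea concrete, here is a small sketch that builds a log-odds PWM from a handful of aligned sites and scans a sequence for the best-scoring window. The sites and the sequence are invented, and the add-one pseudocount and uniform background (0.25 per base) are assumptions chosen for simplicity.

```python
from math import log2

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds PWM from equal-length aligned sites (one dict per position)."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def score(pwm, window):
    """Sum of per-position log-odds scores for a window of the PWM's length."""
    return sum(col[base] for col, base in zip(pwm, window))

def best_hit(pwm, seq):
    """Return (score, start) of the highest-scoring window in seq."""
    k = len(pwm)
    return max((score(pwm, seq[i:i + k]), i) for i in range(len(seq) - k + 1))

# Hypothetical aligned enhancer-like sites.
sites = ["GAAGAA", "GAAGAT", "GAAGAA", "GACGAA"]
pwm = build_pwm(sites)

best_score, best_start = best_hit(pwm, "TTCCGAAGAATTCC")
print(best_start, round(best_score, 2))
```

Unlike a consensus string, the PWM gives partial credit to windows that differ from the most common base at a variable position.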

Page 27: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06


Florea, L. Brief Bioinform 2006 7:55-69; doi:10.1093/bib/bbk005

Bioinformatics representations of splicing regulatory motifs: (a) consensus sequence and (b) position weight matrix (PWM)

Page 28: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Motif finding (again !)

• Statistical overrepresentation

• Find k-mers that occur more often in one class of sequences than in another;

• Should be statistically significant

• Exonic splicing enhancers (ESE) are more likely to occur in exons than in introns; hence find 6-mers (k=6) statistically overrepresented in exons compared to introns

• Calculate z-score of count

– (Count - mean)/(standard deviation)

– Homework 1
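One possible way to compute such a z-score, as a sketch: treat the exon count of a k-mer as binomial, with the per-position probability estimated from intron frequencies (add-one smoothed). This null model is an assumption; the slides do not say how the mean and standard deviation should be obtained. The sequences are invented, and k=3 is used only to keep the toy data small.

```python
from collections import Counter
from math import sqrt

def kmer_counts(seqs, k):
    """Count all overlapping k-mers across a list of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts

def z_scores(exons, introns, k):
    """z = (count - mean) / sd, with a binomial null estimated from introns."""
    ex, intr = kmer_counts(exons, k), kmer_counts(introns, k)
    n_ex, n_in = sum(ex.values()), sum(intr.values())
    scores = {}
    for kmer, count in ex.items():
        p = (intr[kmer] + 1) / (n_in + 4 ** k)  # add-one smoothing
        mean = n_ex * p
        sd = sqrt(n_ex * p * (1 - p))
        scores[kmer] = (count - mean) / sd
    return scores

# Hypothetical sequences: exons are purine-rich, introns pyrimidine-rich.
exon_seqs = ["GAAGAAGAAGAC", "TGAAGAAGCT"]
intron_seqs = ["TTTTCTTTCCTT", "CTTTTTCCTTTT"]

scores = z_scores(exon_seqs, intron_seqs, k=3)
for kmer, z in sorted(scores.items(), key=lambda kv: -kv[1])[:3]:
    print(kmer, round(z, 1))
```

k-mers with large positive z-scores are the candidates for exonic splicing enhancers; a significance cutoff would still have to be chosen.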

Page 29: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Motif finding

• Other standard approaches of motif finding also adopted:

– MEME & Gibbs sampling

• Comparative genomics

– Find conserved sites in introns

– Find conserved sites in exons. This has to be done carefully, because exons are already under selective pressure.

Page 30: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

Summary

• Alternative splicing is very important

• Bioinformatics for finding alternatively spliced forms

• Bioinformatics for finding regulatory mechanisms