1

Alternative Splicing

2

Eukaryotic genes

Splicing

Mature mRNA

3

The mechanism of RNA splicing

4

The mechanism of splicing

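[Figure: exon 1, intron, exon 2. Labels: 5’ splice site (intron begins GTRAGT), branch point (A), 3’ splice site (intron ends YYYYYYYYYNCAG, followed by G at the start of exon 2). The intermediate step shows the branch-point A-OH between exons 1 and 2.]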
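The consensus strings in the figure use IUPAC codes (R = A or G, Y = C or T, N = any base). As a minimal sketch, assuming strict matching against those two strings (real splice sites are degenerate, so this is an illustration rather than a splice-site predictor), the consensus can be turned into regular expressions and scanned against a sequence. The function names and the toy sequence are hypothetical.

```python
import re

# IUPAC nucleotide codes that appear in the consensus strings on the slide.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus such as 'GTRAGT' into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in consensus))

DONOR = consensus_to_regex("GTRAGT")            # first bases of the intron
ACCEPTOR = consensus_to_regex("YYYYYYYYYNCAG")  # last bases of the intron

def find_sites(seq, pattern):
    """Return 0-based start positions of all matches, including overlapping ones."""
    lookahead = re.compile("(?=" + pattern.pattern + ")")
    return [m.start() for m in lookahead.finditer(seq)]

# Toy sequence (made up): exon end, donor, intron body, acceptor, exon start.
toy = "CAG" + "GTAAGT" + "ACGTACTAACGA" + "TTCTCTTTCACAG" + "GAT"
donor_hits = find_sites(toy, DONOR)
acceptor_hits = find_sites(toy, ACCEPTOR)
print("donor sites:", donor_hits)
print("acceptor sites:", acceptor_hits)
```

The lookahead wrapper is used so that overlapping candidate sites are not missed; a position weight matrix would be the usual next step for scoring imperfect matches.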

5

Alternative splicing

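[Figure: a pre-mRNA with exons 1, 2, 3 and 4. Splicing yields a mature mRNA with exons 1-2-3-4; alternative splicing yields a mature mRNA with exons 1-3-4. The two products are labeled mature splice variant I and mature splice variant II.]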

Can be specific to tissue, developmental stage, or condition (stress, cell cycle).

50-70% of mammalian genes

6

Some types of alternative splicing

Exon skipping

Alternative Acceptor

Alternative Donor

Mutually exclusive

Intron retention

7

Sex determination in fly

8

Sex determination in fly

9

Sex determination in fly

10

Many variants in one gene

11

DSCAM

12

Antibody secretion

13

Antibody secretion

immunoglobulin μ heavy chain

14

Tissue-specific alternative splicing

15

Detection of alternative splicing

By sequencing of RNA

• Old methods (1995-2007): ESTs
• New methods:
  – Splicing-sensitive microarrays
  – RNA-seq

16

Expressed Sequence Tags (ESTs)

[Figure: polyadenylated mRNA (AAAAAAAAA) is primed with oligo(dT) (TTTTTTTTTT) and reverse-transcribed (RT) into cDNA, which is then cloned into a vector.]

17

EST preparation

5’ EST 3’ EST

Random-primed EST

Average size of an EST: ~450 bp

Picking a clone

18

Alignment of ESTs to the genome

[Figure: multiple ESTs aligned to the genomic DNA.]

8 million public human ESTs, collected over >10 years (NCBI)
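One way to read splicing out of EST-to-genome alignments is to compare the introns implied by different ESTs. The sketch below assumes each alignment has already been reduced to a list of genomic blocks `(start, end)`; that representation, the function names, and the coordinates are illustrative assumptions, not the format of any particular aligner. An exon is reported as skipped when one EST includes it and another EST splices straight over it.

```python
def junctions(blocks):
    """Introns implied by one spliced alignment: gaps between consecutive blocks."""
    return [(blocks[i][1], blocks[i + 1][0]) for i in range(len(blocks) - 1)]

def skipped_exons(alignments):
    """Return exon blocks that lie entirely inside an intron of another alignment."""
    exons = {block for blocks in alignments for block in blocks}
    introns = {j for blocks in alignments for j in junctions(blocks)}
    skipped = set()
    for start, end in exons:
        for intron_start, intron_end in introns:
            # The exon sits strictly inside a junction used by some EST.
            if intron_start < start and end < intron_end:
                skipped.add((start, end))
    return sorted(skipped)

# Toy alignments (made-up coordinates): est_a includes the middle exon, est_b skips it.
est_a = [(100, 200), (300, 400), (500, 600)]
est_b = [(100, 200), (500, 600)]
print(skipped_exons([est_a, est_b]))
```

Because ESTs are short (about 450 bp on average), the first and last blocks of an alignment are often partial exons; a real pipeline would have to account for that before calling an event.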

19

Splicing microarrays

20

Massive sequencing of RNA (RNA-seq)

21

RNA-seq on multiple tissues

Wang et al, Nature 2008

22

Splicing regulation

23

Tissue-specific alternative splicing

How is this process regulated?

24

Regulation of alternative splicing

• Splicing enhancers/silencers
• Specifically bind SR proteins

25

Model for ESE action

[Figure: an exon with a weak splice site (Y(n)AG) and an Exonic Splicing Enhancer (ESE) inside the exon. Other labels: SR, brain.]

26

SR protein structure

27

28

Discovery of ESEs

Exon

Silent mutations can cause exon skipping

29

Regulators of splicing

SR proteins (Splicing factors)

ESE/ESS, ISE, ISS

• Complex regulation usually exists
• Hard to find intronic elements
• For most alt exons – regulation unknown

Signal transduction

30

How can we break the regulatory code?

1. Comparative genomics
2. High-throughput methods

31

Comparative genomics: use the mouse genome to find sequences that regulate alternative splicing

32

Human-mouse comparisons

33

The mouse genome

• 100 million years of evolution
• Average conservation in exons: 85%
• Only 40% of intronic sequence is alignable
• Average conservation in alignable intronic sequences: 69%
• Average conservation in promoters: 77%
• Function => evolutionary conservation
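The slides do not define how "average conservation" is measured. A common reading, assumed here, is percent identity over the aligned (non-gap) columns of a pairwise human-mouse alignment. The sketch below computes that for two pre-aligned strings; the function name and the toy sequences are made up for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned (non-gap) columns in which both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are not alignable and are skipped
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Toy alignment (made up): 9 comparable columns, 8 of them identical.
human = "ACGTACGT-A"
mouse = "ACGAACGTTA"
identity = percent_identity(human, mouse)
print(f"{identity:.1f}% identity")
```

Skipping gap columns mirrors the slide's distinction between how much intronic sequence is alignable (40%) and how conserved the alignable part is (69%).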

34

Conservation of near introns

(from VISTA genome browser, http://pipeline.lbl.gov)

35

Collection of exons

AF217972

Human DNA

AF010316

AF217965

AI972259

BE616884

BE614743

36

Finding the mouse homolog

AF217972

Human DNA

AF010316

AF217965

AI972259

BE616884

BE614743

Mouse DNA

243 alternative exons (Alt.)

1,753 constitutive exons (Const.)

37

Conservation in the intronic sequence near exons

AF217972

Human DNA

AF010316

AF217965

AI972259

BE616884

BE614743

Mouse DNA

243 alternative exons (Alt.)

1,753 constitutive exons (Const.)

38

Results

Flanking introns conserved (~100 bp from each side of the exon):

• Alternative exons: 77% conserved, 23% not conserved
• Constitutive exons: 17% conserved, 83% not conserved

39

Conservation of introns

40

Alternative splicing regulatory sequences?

Could serve as binding sites for splicing regulatory proteins

41

Motif searching

• Top-scoring hexamer in conserved downstream regions: TGCATG (9-fold over expected)
• Not over-represented downstream of constitutive exons.
• Binding site for FOX1 (splicing regulatory protein)
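A hexamer search of this kind can be sketched as k-mer counting followed by a foreground-to-background frequency ratio. The slide does not say how the "expected" frequency was computed, so the background set, the pseudocount, and all names below are assumptions made for illustration; the toy sequences are invented and do not reproduce the 9-fold figure.

```python
from collections import Counter

def kmer_counts(seqs, k=6):
    """Count every overlapping k-mer across a collection of sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts

def enrichment(foreground, background, k=6, pseudocount=1.0):
    """Fold enrichment of each foreground k-mer frequency over the background frequency."""
    fg = kmer_counts(foreground, k)
    bg = kmer_counts(background, k)
    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    scores = {}
    for kmer, n in fg.items():
        fg_freq = n / fg_total
        # The pseudocount keeps k-mers absent from the background from dividing by zero.
        bg_freq = (bg[kmer] + pseudocount) / (bg_total + pseudocount)
        scores[kmer] = fg_freq / bg_freq
    return scores

# Toy data (made up): conserved regions downstream of alternative exons vs. a background set.
downstream_alt = ["AATGCATGAA", "CCTGCATGCC", "GGTGCATGTT"]
background = ["AACCGGTTAACC", "GGTTAACCGGTT", "ACGTACGTACGT"]
scores = enrichment(downstream_alt, background)
top = max(scores, key=scores.get)
print(top, round(scores[top], 2))
```

Running the same function with regions downstream of constitutive exons as the foreground would be the analogue of the slide's control ("not over-represented downstream of constitutive exons").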

42

Functional elements in the human genome

5% of the human genomic sequence is considered functional

43

Functional elements in the human genome

Composition of the functional 5% of the genomic sequence:

• Coding exons: 30%
• UTRs and promoters: 20%
• Unknown: 50%

44

Impact of splicing regulatory elements

• ~12,000 alternatively spliced exons in the genome
• 77% have conserved flanking intronic sequences
• ~100 bp conserved on each side
• 12,000 exons * 100 bp * 2 introns * 0.77 ≈ 1.85M bases
• ==> Roughly 2 million bases in the human genome might be involved in alternative splicing regulation.

>1% of all functional DNA in the genome may regulate alt splicing!
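The estimate on this slide can be checked with a few lines of arithmetic. The exon count, flank length, and fractions come from the slides; the genome size of about 3 billion bases is an assumption added here, since the slides give only the 5% functional fraction. The product comes to about 1.85 million bases, which the slide rounds to 2 million.

```python
# Figures taken from the slides.
ALT_EXONS = 12_000            # alternatively spliced exons in the genome
CONSERVED_BP_PER_SIDE = 100   # conserved intronic bases flanking each side
SIDES = 2                     # upstream and downstream intron
FRACTION_CONSERVED = 0.77     # share of alt. exons with conserved flanks
FUNCTIONAL_FRACTION = 0.05    # share of the genome considered functional

# ASSUMPTION: approximate human genome size; not stated in the slides.
GENOME_BP = 3.0e9

regulatory_bp = ALT_EXONS * CONSERVED_BP_PER_SIDE * SIDES * FRACTION_CONSERVED
functional_bp = GENOME_BP * FUNCTIONAL_FRACTION
share = regulatory_bp / functional_bp

print(f"candidate regulatory bases: {regulatory_bp:,.0f}")
print(f"functional bases:           {functional_bp:,.0f}")
print(f"share of functional DNA:    {share:.2%}")
```

Under these assumptions the share stays above 1%, consistent with the slide's conclusion.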

45

How can we break the regulatory code?

1. Comparative genomics
2. High-throughput methods

46

CLIP-seq

Licatalosi et al, Nature 2008: 412,686 sequences

Ule et al, Science 2003: 340 sequences

47

Nova, a brain-specific splicing regulator

Ule et al, Science 2003: 340 sequences

48

Ule et al, Science 2003: 340 sequences

49

Extracting the regulatory motifs

50

The power of deep sequencing (2008)

51

Mutations causing aberrant splicing

Exon

~15% of all point mutations linked to genetic disorders involve splicing alterations

52

Mutations causing aberrant splicing: SMN

53

Summary – alt splicing

Increases the coding capacity of genes

We have 25,000 genes but many more protein isoforms

54

RNA EDITINA

55

RNA EDITING

56

What is RNA editing?

Alters the RNA sequence encoded by DNA in a single-nucleotide, site-specific manner

If splicing is “cut and paste”, editing is the “spelling checker”.

57

Mode of operation: A-to-I editing

A -> I (inosine is read as G, so edited sites appear as A -> G)

Editing is performed by ADAR enzymes (dsRNA-specific adenosine deaminases)

Double-stranded RNA is required

58

Mechanism of RNA editing (A-to-I)

59

Functions of RNA editing

• Defense against dsRNA viruses
• Also involved in endogenous regulation

60

61

Functional consequences of RNA editing

• Protein change
• RNA stability
• Splicing

In humans, RNA editing is particularly pronounced in brain tissues, due to high ADAR expression in the brain.

Neural disorders (glioblastoma, epilepsy, ALS) are linked to changes in RNA-editing patterns.

Editing levels vary in other tissues (minimal editing in skeletal muscle, pancreas).

62

Finding RNA-editing sites

Theoretically easy: find mismatches between the genome and the RNA

• Huge number of sequencing errors
• Mutations
• Duplications
• SNPs

Signal drowns in noise

63

Computational approach for identification of editing sites

• Alignment of ESTs to genome
• Find potential intramolecular dsRNA
• Data cleaning

Levanon et al, Nature Biotech 2004

64

Intramolecular dsRNA

Exon

RNA

Intron

Levanon et al, Nature Biotech 2004

65

ESTs to genome

Levanon et al, Nature Biotech 2004

66

• dsRNA regions

Levanon et al, Nature Biotech 2004

67

• dsRNA regions

• Masking EST ends

Levanon et al, Nature Biotech 2004

68

• dsRNA regions

• Masking EST ends

• Masking poor sequence regions

69

• dsRNA regions

• Masking EST ends

• Masking poor sequence regions

• Removing known genomic SNPs

Levanon et al, Nature Biotech 2004

70

• dsRNA regions

• Masking EST ends

• Masking poor sequence regions

• Removing SNPs

• Collecting candidates

Levanon et al, Nature Biotech 2004
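The filtering steps listed on these slides can be sketched as a single pass over one aligned EST. This is a toy illustration under stated assumptions, not the pipeline of Levanon et al: the EST is taken to be aligned without gaps on the same strand as the genome, dsRNA regions and known SNPs are supplied as precomputed inputs, and the function name, parameters, and default thresholds are all hypothetical.

```python
def candidate_editing_sites(genome, est, est_start, dsrna_regions, known_snps,
                            end_mask=10, quals=None, min_qual=20):
    """Return genomic positions where the genome has A and the aligned EST has G.

    genome        : genomic sequence, same strand as the EST
    est           : EST sequence, aligned without gaps starting at est_start
    dsrna_regions : list of half-open (start, end) intervals predicted to form dsRNA
    known_snps    : set of genomic positions of known SNPs
    end_mask      : number of bases ignored at each end of the EST
    quals         : optional per-base quality scores for the EST
    """
    sites = []
    # Masking EST ends: only offsets away from both ends are examined.
    for offset in range(end_mask, len(est) - end_mask):
        pos = est_start + offset
        if genome[pos] != "A" or est[offset] != "G":
            continue  # not an A-to-G mismatch
        if quals is not None and quals[offset] < min_qual:
            continue  # masking poor sequence regions
        if pos in known_snps:
            continue  # removing known genomic SNPs
        if not any(start <= pos < end for start, end in dsrna_regions):
            continue  # keep only sites inside predicted dsRNA regions
        sites.append(pos)  # collecting candidates
    return sites

# Toy example (made up): A-to-G mismatches at positions 5, 10 and 15.
genome = "CCCCCACCCCACCCCACCCC"
est = "CCCCCGCCCCGCCCCGCCCC"
hits = candidate_editing_sites(genome, est, est_start=0,
                               dsrna_regions=[(4, 12)], known_snps={10},
                               end_mask=3)
print(hits)
```

In the toy run, position 10 is dropped as a known SNP and position 15 falls outside the dsRNA region, leaving position 5. Transcripts from the opposite strand would show the same events as T-to-C mismatches, which this sketch does not handle.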

71

Results

DNA

RNA (ESTs)

72

Levanon et al, Nature Biotech 2004

73

74

RNA editing – a source of human transcript diversity

• >12,000 editing sites in >1,600 human genes
• Vast majority of editing – in UTRs
• Vast majority of editing – in Alu (repetitive)
• A few editing sites in protein-coding regions

Levanon et al, Nature Biotech 2004

75

And the obligatory next-generation sequencing study… (Li, Levanon et al, Science 2009)

Editing sites in non-repetitive regions

76

Connection between editing and splicing

Negative feedback loop

ADAR gene (editing enzyme)

77

Evolution of a new exon

78

Summary – alt splicing and RNA editing

Increases the coding capacity of genes

We have 25,000 genes but many more protein isoforms
