Geuvadis RNAseq analysis at UNIGE Analysis plans Tuuli Lappalainen University of Geneva Geuvadis Analysis Group Meeting, April 16 2012, Geneva


Page 1

Geuvadis RNAseq analysis at UNIGE

Analysis plans

Tuuli Lappalainen

University of Geneva

Geuvadis Analysis Group Meeting, April 16 2012, Geneva

Page 2

What we will do: Overview

Coordinate everything

Get the data together: QC, normalization, data sharing

Regulatory quantitative trait loci (rQTLs): common and rare cis-regulatory variants

Participate in Loss-of-Function analyses

Functional annotation of both common and rare regulatory variants

Population and evolutionary genetic analyses

Page 3

Genetic effects on regulatory variation

common/rare cis-variants

independent effects

trans-eQTLs

splicing QTLs

miRNA/mRNA interactions

Fine-mapping the causal regulatory variants

eQTL analysis

ASE analysis

splicing QTL analyses

Page 4

Finding many needles and little hay

Technical variation reduces our power in eQTL analysis: correct for covariates such as library size, sequencing batch, GC content, % mapped reads…

Linear regression on the covariates

Linear regression on ~10 principal components (PCs), which are expected to summarize the technical covariates
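
The covariate correction above can be sketched as a least-squares regression whose residuals become the corrected expression values. This is only a minimal illustration under assumptions: the function names `dot` and `residualize` are hypothetical, the covariates are passed as plain lists (they could be measured covariates or precomputed PC scores), and nothing here is the pipeline actually used in the project.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def residualize(expression, covariates):
    """Return residuals of `expression` after least-squares regression
    on an intercept plus every covariate in `covariates`.

    expression: list of per-sample values for one gene/exon
    covariates: list of per-sample lists (e.g. library size, PC scores)
    """
    n = len(expression)
    # Build an orthogonal basis of the covariate space (Gram-Schmidt),
    # starting with the intercept column.
    basis = []
    for column in [[1.0] * n] + [[float(x) for x in c] for c in covariates]:
        v = column[:]
        for b in basis:
            coef = dot(v, b) / dot(b, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        if dot(v, v) > 1e-12:  # skip columns that add no new direction
            basis.append(v)
    # Subtract the projection of the expression vector on each basis vector.
    resid = [float(y) for y in expression]
    for b in basis:
        coef = dot(resid, b) / dot(b, b)
        resid = [ri - coef * bi for ri, bi in zip(resid, b)]
    return resid
```

A call such as `residualize(gene_expression, [library_size, gc_content])` would return values that are uncorrelated with the listed covariates; the same call works with ~10 PC score vectors in place of measured covariates.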

Population stratification may lead to false genetic associations

Analyze EUR & YRI separately and correct for population structure within EUR with EIGENSTRAT

Reference allele mapping bias

Reads carrying the ALT allele map to the reference genome worse, or not at all (SNP, INDEL, cSNP)

Simulation results of biased reads & sites: remove from ASE

Test: filter biased reads from the SAM files, redo quantifications & eQTL analysis

Page 5

eQTLs: genotype association with regulatory phenotypes

The classical cis-eQTL analysis:

all genetic variants with MAF > 5%

within 1 Mb of the transcription start site

Spearman rank correlation with (normalized) exon read counts

permutations to assess significance

Expect a few thousand genes with an eQTL
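
The classical cis-eQTL test listed above can be sketched for a single variant-gene pair as follows. This is a simplified illustration, not the project's implementation: all function names are hypothetical, genotypes are assumed to be coded as 0/1/2 alternative-allele counts, and a real analysis would typically run the permutations per gene across all variants in the cis window rather than per pair.

```python
import random


def minor_allele_frequency(genotypes):
    """MAF from 0/1/2 alternative-allele counts."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1 - alt_freq)


def in_cis_window(variant_pos, tss_pos, window=1_000_000):
    """True if the variant lies within `window` bp of the TSS."""
    return abs(variant_pos - tss_pos) <= window


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_pvalue(genotypes, expression, n_perm=1000, seed=0):
    """Empirical p-value: how often does a shuffled expression vector
    give an absolute correlation at least as large as the observed one?"""
    rng = random.Random(seed)
    observed = abs(spearman(genotypes, expression))
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(genotypes, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

In use, a variant would first be kept only if `minor_allele_frequency(g) > 0.05` and `in_cis_window(pos, tss)` holds, and then tested with `permutation_pvalue(g, normalized_exon_counts)`.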

Page 6

Taking the eQTL approach further

Other phenotypes:

Gene expression levels: exon read counts or transcript quantifications?

splicing variation: links between exons (Halit Ongen @ UNIGE), Barcelona’s transcript ratios

miRNA quantifications

Variation QTLs: variation between independent measures of an individual’s gene expression levels = stochastic variation in gene expression

Figure: expression variance by genotype

common/rare cis-variants

independent effects

trans-eQTLs

splicing QTLs

miRNA/mRNA interactions

Independent regulatory variants affecting the same gene

Regress out the first eQTL effect and redo the analysis
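
The "regress out and redo" step can be illustrated with a small conditional scan. This is a sketch under assumptions: the function names and the dictionary layout are hypothetical, genotypes are coded 0/1/2, and a plain Pearson correlation stands in for whatever test statistic the second pass would actually use.

```python
def residuals_after_snp(expression, genotype):
    """Residuals of expression after simple linear regression on one SNP."""
    n = len(expression)
    mg = sum(genotype) / n
    me = sum(expression) / n
    sgg = sum((g - mg) ** 2 for g in genotype)
    if sgg == 0:  # monomorphic SNP: nothing to regress out
        return [e - me for e in expression]
    slope = sum((g - mg) * (e - me) for g, e in zip(genotype, expression)) / sgg
    return [e - me - slope * (g - mg) for g, e in zip(genotype, expression)]


def correlation(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5


def conditional_scan(expression, genotypes_by_snp, lead_snp):
    """Remove the lead eQTL effect, then re-test every other SNP
    against the residual expression."""
    resid = residuals_after_snp(expression, genotypes_by_snp[lead_snp])
    return {
        snp: correlation(genotype, resid)
        for snp, genotype in genotypes_by_snp.items()
        if snp != lead_snp
    }
```

A SNP that merely tags the lead variant loses its signal after conditioning, while an independent regulatory variant keeps a non-zero correlation with the residuals.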

How to integrate eQTLs, sQTLs, vQTLs and miQTLs?

Page 7

ASE analysis

Figure: cis eQTL and coding SNP -> mRNA-sequencing -> statistical testing for ASE

Is the allelic ratio different from 0.5 / 0.5?

Thousands of data points per individual

Less noisy than expression levels

No direct information of the causal variant
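
One common way to ask whether the allelic ratio differs from 0.5 / 0.5 is an exact two-sided binomial test on the reference and alternative read counts at a heterozygous site. The sketch below assumes that choice of test; the slides do not name the test used, and the function name and the `expected_ref_ratio` parameter are hypothetical. The direct formula is only suitable for modest read depths (a few hundred reads), because the binomial coefficients grow too large for floating point beyond that.

```python
from math import comb


def ase_binomial_pvalue(ref_reads, alt_reads, expected_ref_ratio=0.5):
    """Exact two-sided binomial p-value for allelic imbalance.

    Sums the probabilities of all outcomes that are no more likely
    than the observed reference read count under the expected ratio.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence either way
    p = expected_ref_ratio
    pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance so ties are counted despite rounding.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)
```

Because each individual contributes thousands of heterozygous sites, the resulting p-values would need multiple-testing correction before sites are called as showing ASE.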

Page 8

ASE applications: population genetics of regulatory effects

Clustering of individuals (and populations)

Expression distance

ASE distance

Genetic distance

Epistasis between regulatory and coding variants

Deficiency of putatively deleterious coding variants with high expression of the derived allele

(Lappalainen et al. 2011)

Page 9

ASE applications: rare regulatory variants

Sharing of a rare ASE effect leads to excess sharing of the haplotype

We have developed a statistical method to look for ASE-genotype concordance to characterize rare regulatory variants (Montgomery et al. PLoS Genetics 2011)

Figure: pool of individuals, some with ASE and some with no ASE

Stephen Montgomery

Page 10

Functional annotation of regulatory variants

Functional annotation of the genome: 1000 Genomes annotations, ENCODE, conservation, etc.

-> overlap with rQTLs

Can we finally get the causal variants?
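
The overlap between rQTL variants and functional annotations can be sketched as a simple position-in-interval lookup. Everything here is illustrative: the function names, the tuple layouts and the half-open `[start, end)` coordinate convention are assumptions, and a real analysis would use an indexed interval structure rather than a linear scan.

```python
def overlapping_labels(chrom, pos, annotations):
    """Labels of all annotation intervals that contain the position.

    annotations: list of (chrom, start, end, label), with the interval
    treated as half-open: start <= pos < end.
    """
    return sorted(
        {label for c, start, end, label in annotations
         if c == chrom and start <= pos < end}
    )


def annotate_variants(variants, annotations):
    """Map each (chrom, pos) variant to the labels it overlaps."""
    return {
        (chrom, pos): overlapping_labels(chrom, pos, annotations)
        for chrom, pos in variants
    }
```

Variants with an empty label list fall outside every supplied annotation; comparing the annotated fraction of rQTL variants with that of matched non-rQTL variants would be one way to quantify enrichment.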