Lea Starita, PhD (@lea_starita)

Shendure and Fields Labs, Department of Genome Sciences

University of Washington

Massively parallel functional analysis of missense mutations in BRCA1 for interpreting variants of uncertain significance

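Variants of uncertain significance (VUS)

[Figure: frequency plot annotated with ~350 variants of uncertain significance (VUS)]

Approaches to interpreting variants, rated for throughput and validity:

Computational prediction: throughput +++, validity +

Genetic analysis or one-off experiments: throughput +, validity +++

Massively parallel functional analysis: throughput +++, validity +++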

How do we interpret the impact of genetic variation at scale?

Massively parallel functional assays for assessing function of missense variants

Generate a library with mutations in a sequence of interest

Multiplexed functional assay

Quantify effect sizes of individual variants

[Figure: workflow schematic. Array-synthesized mutations form an HDR library that is co-transfected with CRISPR into haploid cells, giving a population of cells with many different edits (var-1 to var-4). Variant abundances after selection give each variant's effect size.]

Sequence variants in input and selected populations

Biochemical functions of BRCA1

[Figure: BRCA1 RING domain (BRCA1 residues 1-102) together with the BARD1 RING domain]

BRCA1 is required for homology-directed dsDNA break repair (HDR)

BRCA1 HDR activity is required for tumor suppression

BRCA1 must dimerize with BARD1 to function in HDR

The BRCA1:BARD1 dimer has ubiquitin ligase activity

Multiplex assays for BRCA1 protein function and splicing

Experiments 1 and 2: BARD1-BRCA1-RING E3 ligase activity and BARD1-BRCA1-RING interaction

Experiment 3: Saturation genome editing to assess the effect of SNVs on splicing.

[Figure: BRCA1 RING and BARD1 RING domains; BRCA1 exons 17, 18 and 19 with CRISPR and an HDR library (N = 4,096)]

Massively parallel assays for the BRCA1-RING E3 ligase and BARD1-binding activities

A. E3 ligase activity

[Figure: BRCA1 variant library construct, with parts labelled T7 coat, BARD1(26-126) and BRCA1(2-304)]

Selection step, repeated 5×: ATP, E1, E2, Flag-Ub; capture; elute

Deep sequencing of the input: calculate variant frequency

Deep sequencing after each round: calculate selected/input ratio for each variant for each round

Calculate slope of log2 ratios over 5 rounds of selection
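The scoring steps on this slide (variant frequency, selected/input ratio per round, slope of the log2 ratios) can be sketched as below. This is a minimal illustration, not the pipeline used in the talk: the function names are hypothetical, the fit is an unweighted least-squares line over rounds 1 to n, and no pseudocounts or wild-type normalization are applied, all of which are assumptions.

```python
import math

def frequencies(counts):
    """Convert raw read counts per variant into frequencies."""
    total = sum(counts.values())
    return {variant: c / total for variant, c in counts.items()}

def log2_ratio_slope(input_freq, selected_freqs):
    """Least-squares slope of log2(selected/input) against round number.

    selected_freqs holds one frequency per selection round, in order.
    Assumption: rounds are numbered 1..n and weighted equally.
    """
    rounds = list(range(1, len(selected_freqs) + 1))
    ratios = [math.log2(f / input_freq) for f in selected_freqs]
    mean_x = sum(rounds) / len(rounds)
    mean_y = sum(ratios) / len(ratios)
    sxx = sum((x - mean_x) ** 2 for x in rounds)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(rounds, ratios))
    return sxy / sxx

# Made-up example: a variant whose frequency doubles in every round.
slope = log2_ratio_slope(0.01, [0.02, 0.04, 0.08, 0.16, 0.32])
```

A positive slope means the variant is enriched by selection round after round; a negative slope means it is depleted.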

B. BARD1-binding activity

Transformation (Time 0), then selection in -histidine, with deep sequencing at Time 0, Time 1, Time 2 and Time 3

How can we leverage these measurements to estimate the likelihood that a BRCA1 variant would be pathogenic?

How can we leverage these measurements to understand the homology-directed DNA repair (HDR) function of BRCA1?

[Figure: prediction model for HDR activity; HDR data from Ransburgh et al., Cancer Research 2010]

Experimental data build a better predictor of BRCA1 HDR function

[Figure: bar chart of leave-one-out cross-validation R² (axis 0.0 to 0.6) for PolyPhen-2, CADD, Grantham, Align-GVGD, E3 + BARD1-binding, and E3 + BARD1-binding + A-GVGD]
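Leave-one-out cross-validation R² can be illustrated with a small sketch. The slide does not say which model family was used, so ordinary least squares on two features (standing in for an E3 score and a BARD1-binding score) is an assumption, as are the function names and the choice of R² as 1 − SS_res/SS_tot computed on the held-out predictions.

```python
def fit_linear(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(coef, x):
    """Apply fitted coefficients (intercept first) to one feature vector."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

def loocv_r2(X, y):
    """Hold out each sample once, refit on the rest, score the held-out predictions."""
    preds = []
    for i in range(len(y)):
        coef = fit_linear(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(coef, X[i]))
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, preds))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot

# Made-up data that is exactly linear in both features: y = 1 + 2*x1 - x2
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [1, 3]]
y = [1.0, 3.0, 0.0, 2.0, 4.0, 0.0]
r2 = loocv_r2(X, y)
```

Because every prediction is made for a variant the model never saw, this score penalizes overfitting in a way that an in-sample R² does not.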

HDR predictions for clinical BRCA1 variants

[Figure: histograms of counts versus predicted HDR score (0.0 to 1.0, with markers at 0.33 and 0.77 and the ends of the axis labelled "Low HDR function" and "High HDR function") for pathogenic, VUS and benign variants; splice variants are indicated]

HDR predictions for 1,287 BRCA1 variants not yet seen in patients

[Figure: histogram of counts versus predicted HDR score (0.0 to 1.0, with markers at 0.33 and 0.77) for pathogenic, VUS, benign and not-yet-seen variants]

Multiplex genome editing to measure the effects of SNVs on splicing

1. CRISPR-Cas9 construct (Cas9/gRNA) targeting BRCA1 exon 18

2. Repair template library to substitute SNVs within the exon (edited repair templates, N = 4,096)

Co-transfection and multiplex editing give a heterogeneous population of edited cells.

After 5 days, collect gDNA and RNA. “Selective PCR” amplifies only edited gDNA and cDNA.

Sequence, then count each variant in gDNA and cDNA.

Calculate effects on splicing for each variant.

[Figure: enrichment scores (cDNA/gDNA) for rank-ordered variants]

Variants that create splice enhancers and silencers or trigger nonsense-mediated decay behave as expected

Findlay, Boyle et al., Nature (2014).

[Figure 1 panels: a, experimental schematic: co-transfection of CRISPR with a random hexamer HDR library, multiplex genome editing in many cells, transcription and splicing, gDNA and RNA prep (with RT) at 5 days, selective PCR, sequencing, and hexamer counts in gDNA and cDNA giving enrichment scores (cDNA/gDNA). Endogenous locus: ATGCTGAGTTTGTGTGTGAACGGAC; donor plasmids with random hexamers: ATGCNNNNNNTGTGTGGATATCCAC. b, log2 enrichment in replicate 1 versus replicate 2 (R = 0.659), with ESE, ESS and nonsense hexamers marked. c, log2 enrichment versus enrichment score rank for rank-ordered hexamers, with ESE, ESS and nonsense hexamers marked.]

Figure 1 | Saturation genome editing and multiplex functional analysis of a hexamer region influencing BRCA1 splicing. a, Experimental schematic. Cultured cells were co-transfected with a single Cas9-sgRNA construct (CRISPR) and a complex homology-directed repair (HDR) library containing an edited exon that harbours a random hexamer (blue, green, orange) and a fixed selective PCR site (red). CRISPR-induced cutting stimulated homologous recombination with the HDR library, inserting mutant exons into the genomes of many cells. At five days post-transfection, cells were harvested for gDNA and RNA. After reverse transcription, selective PCR was performed followed by sequencing of gDNA- and cDNA-derived amplicons. Hexamer enrichment scores were calculated by dividing cDNA counts by gDNA counts. b, Correlation of enrichment scores between biological replicates for hexamers observed in each experiment, with positions of previously identified (ref. 14) exonic splicing enhancers (ESEs), exonic splicing silencers (ESSs) and stop codons indicated. c, Rank-ordered plot of enrichment scores with positions of ESEs, ESSs and stop codons indicated.
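The enrichment score described in this caption (cDNA counts divided by gDNA counts, plotted on a log2 scale) can be sketched as follows. The function name, the scaling of each library by its total read count, the `min_gdna` filter and the hexamer sequences are illustrative assumptions, not details taken from the paper.

```python
import math

def enrichment_scores(gdna_counts, cdna_counts, min_gdna=1):
    """log2 of (cDNA frequency / gDNA frequency) for each variant.

    Assumption: counts are scaled by each library's total so that
    sequencing depth does not shift the scores.
    """
    g_total = sum(gdna_counts.values())
    c_total = sum(cdna_counts.values())
    scores = {}
    for variant, g in gdna_counts.items():
        c = cdna_counts.get(variant, 0)
        if g < min_gdna or c == 0:
            # log2 is undefined for zero counts; such variants are skipped
            # here rather than rescued with a pseudocount.
            continue
        scores[variant] = math.log2((c / c_total) / (g / g_total))
    return scores

# Made-up counts for two hypothetical hexamers.
gdna = {"AAAAAA": 100, "TAGGTA": 100}
cdna = {"AAAAAA": 300, "TAGGTA": 100}
scores = enrichment_scores(gdna, cdna)
```

A score below zero means the variant is under-represented in cDNA relative to gDNA, which is the behaviour expected of splicing silencers and nonsense variants.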

[Figure 2 panels: a, library R replicate 1 effect size versus library R replicate 2 effect size (axes -5 to 1). b, library R effect size versus library R2 effect size (axes -5 to 1). c, effect size (-5 to 1) by BRCA1 exon 18 position (axis labelled 1 to 77) for substitutions to A, C, G and T. Nonsense and sense SNVs are distinguished.]

Figure 2 | Multiplex homology-directed repair reveals effects of single nucleotide variants on transcript abundance. Three separate HDR libraries (R, R2, and L) containing a 3% mutation rate (97% WT, 1% each non-WT base) in either half of BRCA1 exon 18 were introduced to the genome via co-transfection with pCas9-sgBRCA1x18. Enrichment scores were calculated for each haplotype observed at least 10 times in the gDNA, and effect sizes of SNVs were determined by weighted linear regression modelling. ‘Sense’ includes both missense and synonymous SNVs. a, Effect sizes calculated from replicate transfections of HDR library R, consisting of a 3% per-nucleotide mutation rate in the 3′-most 39 bases and the same selective PCR site used in Fig. 1, were highly correlated (R = 0.846). b, Library R2 harboured a selective PCR site composed of 5 synonymous changes, none of which are present in library R. When effect sizes derived from experiments with library R2 were plotted against those from library R, there was a strong correlation (R = 0.847), indicating reproducibility and demonstrating that differences between selective PCR sites did not strongly influence scores. c, Effect sizes for SNVs across the exon are displayed. Data sets from libraries R and L were combined to span the entire exon. Dashed lines represent SNVs that introduce nonsense codons.
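The caption says SNV effect sizes came from weighted linear regression over haplotype enrichment scores. One way to picture that: each haplotype's score is modelled as a baseline plus the sum of the effects of the SNVs it carries, and haplotypes are weighted in the fit. The sketch below is an assumption-heavy illustration; the additive model, the use of gDNA counts as weights, the function names and the SNV labels are all hypothetical and not taken from the paper.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def snv_effect_sizes(haplotypes, scores, weights):
    """Weighted least squares: score = baseline + sum of effects of carried SNVs.

    haplotypes: list of sets of SNV labels (empty set = unmutated haplotype).
    Returns (baseline, {snv: effect}).
    """
    snvs = sorted(set().union(*haplotypes))
    rows = [[1.0] + [1.0 if s in h else 0.0 for s in snvs] for h in haplotypes]
    p = len(snvs) + 1
    # Weighted normal equations: (X^T W X) beta = X^T W y
    A = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(p)]
         for i in range(p)]
    b = [sum(w * r[i] * y for r, y, w in zip(rows, scores, weights))
         for i in range(p)]
    beta = solve(A, b)
    return beta[0], dict(zip(snvs, beta[1:]))

# Made-up, exactly additive example with hypothetical SNV labels.
haplotypes = [set(), {"pos10_A>G"}, {"pos20_C>T"}, {"pos10_A>G", "pos20_C>T"}]
scores = [0.0, -2.0, 0.5, -1.5]
weights = [500, 40, 30, 12]  # assumed to be gDNA read counts
baseline, effects = snv_effect_sizes(haplotypes, scores, weights)
```

Weighting lets well-sampled haplotypes dominate the fit, so a rare haplotype with a noisy score has little influence on the estimated effect of the SNVs it carries.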

RESEARCH LETTER | NATURE | VOL 000 | 00 MONTH 2014

Effects of SNVs across BRCA1 exon 18

[Figure: Figure 2c of Findlay, Boyle et al., showing effect sizes for SNVs by BRCA1 exon 18 position. Dashed lines (- - - -) represent nonsense mutations. Splice enhancers and splice silencers are marked, as defined from Ke et al. 2011.]

MutPred Splice annotations:

C49G “Splice Affecting Variant”

A53G “ESE Loss/ESS Gain”

A56G “ESE Loss”

G63T “Cryptic 5’ SS”

T67G “Cryptic 5’ SS”, aka VUS V1714G

Findlay, Boyle et al., Nature 513, p. 121 (4 September 2014).

In summary

Parallelized assays for the protein function of the RING domain of BRCA1

Saturation genome editing to understand the effect of missense variants on splicing

Next steps…

Suggestions?

Library construction and variant delivery

Parallelizable assays for protein function

Calculate likelihood estimates for pathogenicity

Sequencing of variants

Computational variant scoring pipeline

Challenges for scaling up

How the results from massively parallel assays could get to the bedside…

More scans

Better databases

Better variant effect prediction

Thanks to:

Shendure lab

Fowler lab

Parvin lab, The Ohio State University: Muhtadi Islam

Kitzman lab, University of Michigan

Fields lab

Dave Young, Justin Gullingsrud

Funding from the Yeast Resource Center, NIH P41

The effects of missense SNVs on splicing and protein function are difficult to predict

[Figure: BRCA1 exons 17, 18 and 19 with CRISPR and an HDR library (N = 4,096)]

Multiplex genome editing to determine effects of SNVs on splicing of exon 18 of BRCA1 (Findlay et al., Nature 2014)

Learning the Sequence Determinants of Alternative Splicing from Millions of Random Sequences (Rosenberg et al., Cell 2015)

Stop gain √, frameshift √, missense ?

Splicing effects

Scoring full-length BRCA1 variants for HDR function in human cells

HDR rescue assay (Muhtadi Islam and Jeff Parvin)

[Figure: reporter schematic with an I-SceI-interrupted GFP gene and a GFP donor. Add siRNA to target the BRCA1 3’UTR, and add I-SceI to induce a break. With a functional BRCA1 variant, HDR yields GFP. With a nonfunctional BRCA1 variant, error-prone break repair leaves GFP broken.]

[Figure: % HDR rescue (axis 0.00 to 1.25) for WT, vector only, R7C, M18T, L22S, C39Y, H41R, C44F, C44S, K45Q, C61G, C64G and D67Y, each marked as pathogenic, control or benign]

Construction of the barcoded single amino acid substitution BRCA1-RING library

Multiplex genome editing to measure the effects of SNVs on splicing

[Figure: schematic. BRCA1 locus: ATGCTGAGTTTGTGTGTGAACGGAC…; donor plasmids with random 6-mers: ATGCNNNNNNTGTGTGGATATCCAC…; CRISPR genome editing in many cells, then transcription and splicing at the endogenous loci]

Each edited exon receives:

1. A random SNV

2. A fixed mutation

3% of 10^6 cells = 30,000 events

Prospective functional map for 1,287 BRCA1 RING variants

[Figure, panels A and B: histograms of predicted HDR rescue score (0.0 to 1.0) for 59 BRCA1 RING variants (pathogenic, VUS, benign) and for 1,287 BRCA1 RING variants (with COSMIC and EVS variants indicated). Markers sit at 0.33, 0.53 and 0.77. Marker labels: max pathogenic HDR rescue score (experimental); midpoint between mean pathogenic and benign scores; min benign HDR rescue score (experimental). Regions are labelled "likely HDR non-functional" and "likely HDR functional".]

[Figure, panel C: heat map of predicted HDR rescue score (0 to 2.0) by substituting amino acid (A V I L M F Y W S T N Q C G P R H K D E *) and RING position (5 to 100), annotated with the Zn++ sites, N-terminal 4-helix bundle, loop 1, central helix, loop 2 and C-terminal 4-helix bundle. WT amino acids and positions with no data are marked.]
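The three markers on the histogram suggest a simple binning of predicted HDR rescue scores. The sketch below is one plausible reading of the figure, not the published rule. It assumes 0.33 is the max pathogenic score, 0.53 the midpoint and 0.77 the min benign score, and that the two "likely" labels apply to the bands on either side of the midpoint. The names of the outer bins and the function name are hypothetical.

```python
from bisect import bisect_right

# Assumed mapping of the three markers (see lead-in).
THRESHOLDS = [0.33, 0.53, 0.77]

LABELS = [
    "below max pathogenic score",   # hypothetical bin name
    "likely HDR non-functional",    # label from the slide; placement assumed
    "likely HDR functional",        # label from the slide; placement assumed
    "above min benign score",       # hypothetical bin name
]

def classify_hdr_score(score):
    """Return the bin for a predicted HDR rescue score.

    A score exactly equal to a threshold falls into the higher bin.
    """
    return LABELS[bisect_right(THRESHOLDS, score)]

# Made-up scores, one per bin.
calls = [classify_hdr_score(s) for s in (0.10, 0.40, 0.60, 0.90)]
```

Keeping the thresholds in one sorted list makes it easy to swap in different cut-offs if the markers are read differently.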

Massively parallel assays for BRCA1-RING E3 ligase activity

[Figure: heat map of E3 functional score (0 to 2.0) by substituting amino acid (A V I L M F Y W S T N Q C G P R H K D E *) and position. WT amino acids and positions with no data are marked, scores are labelled damaging, neutral or enhancing, and position M18 is highlighted.]

Massively parallel assays for the BRCA1-RING E3 ligase and BARD1-binding activities

BARD1-binding activity: yeast two-hybrid, with selection in -histidine and deep sequencing at Time 0 (transformation), Time 1, Time 2 and Time 3

Genetic testing is big business

More companies, lower costs, more genes*

*41.7% of tests revealed a VUS in at least one gene. Tung et al. Frequency of mutations in individuals with breast cancer referred for BRCA1 and BRCA2 testing using next-generation sequencing with a 25-gene panel. Cancer 2015

We need new technologies to deliver on the promises of genetic medicine

https://www.whitehouse.gov/precision-medicine

Massively parallel functional analyses are a possible solution

Starita et al. Genetics, 2015; Findlay et al. Nature, 2014; Rosenberg et al. Cell, 2015; Patwardhan et al. Nature Biotech, 2012; Fowler et al. Nature Methods, 2010; …
