
Mapping Human Multiprotein Complexes Following Targeted Genome Modification

Master's thesis (Mémoire)

Jérémy Loehr

Master's in Cellular and Molecular Biology

Master of Science (M.Sc.)

Québec, Canada

© Jérémy Loehr, 2017


SUMMARY

Affinity purification coupled with mass spectrometry (AP-MS) is a method of choice for studying protein-protein interactions in human cells. However, the technique is sensitive to perturbations caused by ectopic overexpression of the target proteins. Abnormal effects, such as aggregate formation and mislocalization of the target proteins, can lead to erroneous conclusions. It is therefore important to reproduce the normal physiological levels of the proteins under study as closely as possible. The work presented in this thesis describes the development of a robust and rapid system that couples genome editing with proteomics to isolate native protein complexes expressed at near-physiological levels. The approach served as a stepping stone toward the ultimate goal of characterizing proteins expressed from their natural genomic context. Using genome-editing tools, we performed the targeted insertion, at the AAVS1 locus, of a cassette that drives the expression of proteins of interest tagged with a sequence enabling affinity purification. We thereby purified numerous holoenzymes involved in DNA repair and chromatin modification. We identified new subunits and interactions within already well-characterized complexes and report the isolation of MCM8/9, underscoring the efficiency and robustness of our approach. The technique presented in this thesis improves and simplifies the exploration of protein interactions as well as the study of their biochemical, structural and functional activity.


ABSTRACT

Conventional affinity purification followed by mass spectrometry (AP-MS) analysis is a broadly applicable method to decipher molecular interaction networks and infer protein function. However, it is sensitive to perturbations induced by ectopically overexpressed target proteins and does not reflect multilevel physiological regulation in response to diverse stimuli. Here, we developed an interface between genome editing and proteomics to isolate native protein complexes produced from their natural genomic contexts. We used CRISPR/Cas9 and ZFNs to insert cDNA of interest in the endogenous genomic safe harbor locus AAVS1 and purified several DNA repair and chromatin modifying holoenzymes to near homogeneity. We uncovered novel subunits and interactions amongst well-characterized complexes and report the isolation of MCM8/9, highlighting the efficiency and robustness of the approach. These methods improve and simplify both small and large-scale explorations of protein interactions, as well as the study of biochemical activities and structure-function relationships.


TABLE OF CONTENTS

SUMMARY iii
ABSTRACT v
TABLE OF CONTENTS vii
LIST OF TABLES xi
LIST OF FIGURES xiii
LIST OF ABBREVIATIONS AND ACRONYMS xv
ACKNOWLEDGEMENTS xvii
FOREWORD xix
INTRODUCTION 1
1.0 Identification of cellular protein complexes 3
1.1 Techniques used to purify protein complexes 3
1.1.1 Tandem affinity purification 3
1.1.1.1 Limitations 5
1.1.2 Affinity-tag purification 6
1.1.2.1 The FLAG tag 7
1.1.2.2 The Strep tag 8
1.2 Tetracycline-inducible systems 9
1.2.1 The Tetracycline-Off system 9
1.2.2 The Tetracycline-On system 10
1.2.3 Improvement of the systems 11
1.2.4 Limitations 12
1.3 Limitations of the techniques currently in use 12
2.0 The genomic safe harbour concept 13
2.1 Insertion at the AAVS1 locus 14
3.0 Genome editing with engineered nucleases 14
3.1 Endonucleases 15
3.1.1 Zinc-finger nucleases 15
3.1.1.1 Origins 15
3.1.1.2 Recognition and cleavage of a specific DNA sequence 15
3.1.1.3 Use for genome modification 16
3.1.1.4 Limitations 17
3.1.2 Transcription activator-like effector nucleases 17
3.1.2.1 Origins 17
3.1.2.3 Use for genome modification 19
3.1.2.4 Limitations 19
3.1.3 The CRISPR/Cas9 system 19
3.1.3.1 Origins 19
3.1.3.3 Protospacer Adjacent Motif 21
3.1.3.4 Use for genome modification 22
3.1.3.5 Limitations 22
4.0 DNA repair mechanisms 23
4.1 Single-strand break repair 23
4.2 DNA damage response 24
4.2.1 Non-homologous end joining 26
4.2.1.1 Mechanism 26
4.2.1.2 Proteins involved 26
4.2.1.3 Importance of this mechanism for gene knockout 28
4.2.2 Homologous recombination 28
4.2.2.1 Mechanism 28
4.2.2.2 Proteins involved 28
4.2.2.3 Repair pathways leading to HR 29
4.2.2.3.1 Synthesis Dependent Strand Annealing 29
4.2.2.3.2 Double-Strand Break Repair 30
4.2.2.3.3 Break-induced repair 31
4.2.2.3.4 Single-Strand Annealing 31
4.2.2.4 Importance of the repair process for genome editing 32
4.2.3 Temporal use of the DNA repair mechanisms 33
PROBLEM STATEMENT AND OBJECTIVES OF THE WORK 35
CHAPTER 1 37
FOREWORD 39
SUMMARY 41
ABSTRACT 43
INTRODUCTION 45
EXPERIMENTAL PROCEDURES 47
RESULTS 50
DISCUSSION 68
REFERENCES 72
SUPPLEMENTAL EXPERIMENTAL PROCEDURES 79
SUPPLEMENTAL INFORMATION 83
GENERAL DISCUSSION 97
CONCLUSION 104
REFERENCES 107


LIST OF TABLES

Introduction
Table 1: Characteristics of commonly used affinity tags 7
Table 2: PAM binding motif in Cas9 orthologues 22

Chapter 1
Table S1, Related to Figures 1 and 2. NuA4 Complex Subunits Identified by Mass Spectrometry Analysis 54
Table S2, Related to Figure 3. PRC2 Complex Subunits Identified by Mass Spectrometry Analysis 60
Table S1, Related to Figures 1 and 2. NuA4 Complex Subunits Identified by Mass Spectrometry Analysis 93
Table S2, Related to Figure 3. PRC2 Complex Subunits Identified by Mass Spectrometry Analysis 94
Table S3, Related to Figure 4. FA Core and Anchor Complexes Subunits Identified by Mass Spectrometry Analysis 94
Table S4, Related to Figure 5. MCM8 Complex Subunits Identified by Mass Spectrometry Analysis 95

Discussion
Table 3: Advantages and disadvantages of previously used protein complex purification techniques and of the one described in this thesis 100
Table 4: Advantages and disadvantages of the Tet-On systems and of our auto-inducible system 102


LIST OF FIGURES

Introduction
Figure 1: An overview of the TAP purification strategy 4
Figure 2: Amino acid sequences of the FLAG and 3xFLAG tags 8
Figure 3: Schematic representation of the purification principle for a Strep-tag and a Twin-Strep-tag 9
Figure 4: Tetracycline-Off and Tetracycline-On 11
Figure 5: The Tet-On 3G system allows gene expression to be induced in the presence of doxycycline 12
Figure 6: Schematic of the mechanism of action and structure of a zinc-finger nuclease 16
Figure 7: Schematic representation of the structure and function of TALENs 18
Figure 8: Schematic of the RNA-guided Cas9 nuclease 21
Figure 9: Schematic of DSB repair by the NHEJ and HR pathways 25
Figure 10: Representation of repair by Synthesis Dependent Strand Annealing 30
Figure 11: Representation of Double-Strand Break Repair 30
Figure 12: Representation of Break-Induced Repair 31
Figure 13: Representation of Single-Strand Annealing 32

Chapter 1
Figure 1. ZFN-Driven Gene Addition to the AAVS1 Locus Simplifies Tandem Affinity Purification of Multisubunit Protein Complexes 51
Figure 2. CRISPR/Cas9-Driven Tagging of NuA4 Subunits Enables Reciprocal Tandem Affinity Purification of the Endogenous Native Complexes 57
Figure 3. Tandem Affinity Purification of the Native PRC2 Protein Complex 59
Figure 4. Tandem Affinity Purification of Endogenously Tagged Fanconi Anemia Core Complex 63
Figure 5. Tandem Affinity Purification of Endogenously Tagged Minichromosome Maintenance Complex Component 8 64
Figure 6. Efficient Complex Purification from Unselected Gene-Modified Cell Pools 66
Figure S1, Related to Figure 1. Determination of the Optimal Purification Steps for TAP and Protein Expression in Single Cell-Derived Clones After ZFN / CRISPR-Driven Gene Addition to the AAVS1 Locus 83
Figure S2, Related to Figure 2. Strategy for CRISPR/Cas9-Driven Insertion of the TAP Tag to the C-Terminus of the EPC1, EP400 and MBTD1 Proteins 85
Figure S3, Related to Figure 2. CRISPR/Cas9-Driven Insertion of the TAP Tag to the C-Terminus of the EPC1 and EP400 Proteins 86
Figure S4, Related to Figure 3. TALEN-Driven Insertion of the TAP Tag to the N-Terminus of the EZH2 Protein 87
Figure S5, Related to Figures 4 and 5. Strategy for CRISPR/Cas9-Driven Insertion of the TAP Tag to the N-Terminus of FANCF and to the C-Terminus of MCM8 89
Figure S6, Related to Figure 1. Tandem-Affinity Purification (TAP) of JADE1L, JADE1S, and the ATM kinase along with the Presentation of the Auto-Regulated Tet-On 3G System 92


LIST OF ABBREVIATIONS AND ACRONYMS

AAVS1 Adeno-associated virus integration site 1
AAV Adeno-associated virus
AP Apurinic or apyrimidinic
ATM Ataxia telangiectasia mutated
BER Base excision repair
BIR Break-induced repair
Cas CRISPR-associated
CBP Calmodulin-binding peptide
CRISPR Clustered regularly interspaced short palindromic repeats
crRNA CRISPR RNA
DBD DNA-binding domain
DDR DNA damage response
D-loop Displacement loop
DNA-PKcs DNA-dependent protein kinase catalytic subunit
DSB Double-strand break
DSBR Double-Strand Break Repair
EGTA Ethylene glycol tetraacetic acid
EK Enterokinase
ESCs Embryonic stem cells
gRNA Guide RNA
GSH Genomic safe harbor
HJ Holliday junction
HR Homologous recombination
HSPC Hematopoietic stem and progenitor cells
IgG Immunoglobulin G
Indels Insertions or deletions
iPSCs Induced pluripotent stem cells
KO Gene knockout
MMR Mismatch repair
NER Nucleotide excision repair
NHEJ Non-homologous end joining
PAM Protospacer Adjacent Motif
PPP1R12C Protein phosphatase 1 regulatory subunit 12C
ProtA Protein A
pTRE Tetracycline-Response Element promoter
pTRE-BI Bidirectional pTRE
RPA Replication protein A
rTetR Reverse Tet repressor
rtTA Reverse tetracycline-controlled transactivator
SDSA Synthesis Dependent Strand Annealing
sgRNA Single guide RNA
SpCas9 Streptococcus pyogenes Cas9
SSA Single-Strand Annealing
TALEN Transcription activator-like effector nuclease
TALEs Transcription activator-like effectors
TAP Tandem affinity purification
tetO Tetracycline operator
tet-Off Tetracycline-Off
tet-On Tetracycline-On
tetR Tetracycline repressor
TEV Tobacco Etch Virus
tracrRNA Trans-activating crRNA
tTA Tetracycline-controlled transactivator
ZFN Zinc-finger nuclease


ACKNOWLEDGEMENTS

This thesis is the result of nearly two years of research. I would like to thank everyone who made it possible and who helped me write it.

First, I wish to thank my research supervisor, Dr. Yannick Doyon, who guided me throughout my master's degree. Thank you for taking a risk by accepting me as your first student, for taking the time to teach me laboratory techniques, and for helping me develop scientific rigour in my work.

I would like to thank the people I had the chance to work with during my graduate studies. First of all, I want to thank the members of the Doyon team. I particularly wish to thank Caroline for answering my questions and creating a very pleasant working atmosphere, Sophie for your help, which was indispensable in writing this thesis, and Alexandre for your way with words.

I wish to thank the members of the many research teams on our floor for letting the members of the Doyon team (me) borrow laboratory resources while the laboratory was still in its embryonic stage. Above all, thank you for taking the time to answer my questions: Christine, Francis, Nicolas, Olivier, Philippe and Suzanne.

I wish to thank Jacques Côté for welcoming me into his research laboratory and allowing me to begin the first experiments of my master's. Thank you to Valérie and Céline for your advice and your professional technical support.

I would like to thank all the people who are part of my daily life. Among others, Ginette, Annie and Bertrand for all the Sunday dinners and especially for the next-day lunches. Guillaume, Mihnea and Roccio, whenever we managed to find the time, our evenings together were always greatly appreciated.

A big thank you to my family. Thank you to my parents for trying to understand what I do and for always being on my side. Thank you to my brother, you are my favourite big brother. My most sincere thanks go to my partner Katrine. Without you I would be lost; I love you.


FOREWORD

This work is submitted to the Faculté des études supérieures of Université Laval for the degree of Master of Science (M.Sc.). The thesis describes the development of a simple and rapid technique for purifying protein complexes in their native genomic context. The study focuses specifically on a method at the interface between gene editing and proteomics that enables the purification of protein complexes. The thesis contains a general introduction, written in French, on DNA repair and gene editing. Chapter 1 forms the core of the work and describes in detail the research carried out for this thesis. That chapter is written in English in the form of a scientific article, as submitted for publication, in accordance with editorial requirements. Finally, a conclusion closes the thesis by summarizing the main results, analyzing how they relate to one another, and discussing future directions.

The article presented in this thesis is the result of my master's work, carried out over the past two years in the laboratory of Dr. Yannick Doyon. I took full part in the cloning, transfection, cell culture and immunoblotting work of the study. This work, carried out under the supervision of Dr. Yannick Doyon, was made possible by the collaboration of Mathieu Dalvai (postdoctoral fellow with Dr. Jacques Côté, Université Laval), Karine Jacquet (doctoral student with Dr. Jacques Côté, Université Laval), Dr. Caroline Huard (research professional with Dr. Yannick Doyon, Université Laval), Céline Roques (research professional with Dr. Jacques Côté, Université Laval), Dr. Pauline Herst (postdoctoral fellow with Dr. Jacques Côté, Université Laval) and Dr. Jacques Côté.

Dalvai M*, Loehr J*, Jacquet K, Huard CC, Roques C, Herst P, Côté J, and Doyon Y. A Scalable Genome-Editing-Based Approach for Mapping Multiprotein Complexes in Human Cells. Cell Reports 13, 621–633, October 20, 2015. *Equal contribution

INTRODUCTION

1.0 Identification of cellular protein complexes

The protein machineries of the cell are responsible for coordinating and executing cellular functions [1, 2]. It is therefore imperative to identify and characterize the components of these complexes in order to better understand the pathways that are disrupted in disease.

1.1 Techniques used to purify protein complexes

Protein purification followed by mass spectrometry analysis remains the most commonly used method for identifying protein complexes [3, 4]. The technique requires a sufficient quantity of the protein under study [4]. The limiting step of this characterization is therefore the purification of the complex, not the identification of the proteins [4]. For weakly expressed proteins, this limitation can be circumvented by overexpressing the target protein [5]. However, this commonly used approach can promote non-specific interactions and lead to erroneous conclusions [5].

1.1.1 Tandem affinity purification

The method used to purify protein complexes in yeast is tandem affinity purification (TAP), described by Puig et al. in 2001 [6]. This technique addresses the problem described above because it is optimized to recover protein complexes under native conditions [6]. The TAP method consists of fusing the TAP tag to the target protein by homologous recombination (HR). The TAP tag is composed of three modules: i) two IgG-binding domains of protein A (ProtA) from Staphylococcus aureus and ii) a calmodulin-binding peptide (CBP), separated by iii) a cleavage site for the Tobacco Etch Virus (TEV) protease [6]. This construct, CBP-TEV-ProtA, is designed for C-terminal tagging of a target protein (Figure 1A) [6]. In a small proportion of cases, about 5%, adding a C-terminal tag can impair the native function of the protein, cause growth defects or induce cell death [6]. To overcome this problem, the same research group developed a strategy that places the TAP tag at the N-terminus of the protein by reversing the order of the modules, ProtA-TEV-CBP-EK (Figure 1A) [6]. An enterokinase (EK) cleavage site (DDDDK) was also added, making it possible to completely remove the remainder of the tag after purification [6].

Figure 1: An overview of the TAP purification strategy

A) Schematic representation of the C- and N-terminal TAP tags. B) Overview of the TAP purification strategy. [6]

From the prepared cell extracts, the fusion proteins and the proteins belonging to the same complex are purified [7]. The cell extract is incubated on a first column of IgG Sepharose beads, where the ProtA module of the TAP tag binds tightly to the matrix [6]. TEV protease is then used to cut the chimeric protein at the TEV cleavage site, allowing the target protein to dissociate from the IgG matrix [6]. The eluate, containing the target protein fused to the CBP module, is incubated on a second column containing calmodulin-coated microbeads. After washing steps that remove the TEV protease and contaminants, the protein complex is released by adding EGTA (ethylene glycol tetraacetic acid), which chelates the calcium required for binding to calmodulin [6]. When the N-terminal TAP-tagging strategy is used, an additional enterokinase treatment can be performed to completely remove the remainder of the tag (Figure 1B) [6].
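The two-step logic described above can be followed in a small Python sketch. This is only an illustrative model, not part of the original protocol: the module labels, the function names (`tev_cleave`, `ek_cleave`, `tap_purify`) and the list representation of a fusion protein are assumptions made for this example. It tracks which tag modules remain attached to the target after each step for the two layouts named in the text.

```python
# Module order of the two TAP tag layouts named in the text, written from
# the N-terminal side to the C-terminal side. "TARGET" stands for the
# protein of interest; all names here are illustrative labels only.
C_TERMINAL_FUSION = ["TARGET", "CBP", "TEV", "ProtA"]
N_TERMINAL_FUSION = ["ProtA", "TEV", "CBP", "EK", "TARGET"]


def _cut(modules, site):
    """Cut at the given site and keep the fragment that carries the target."""
    i = modules.index(site)
    left, right = modules[:i], modules[i + 1:]
    return left if "TARGET" in left else right


def tev_cleave(modules):
    """Release the target side of the fusion from the ProtA module."""
    return _cut(modules, "TEV")


def ek_cleave(modules):
    """Remove what is left of the tag at the enterokinase site."""
    return _cut(modules, "EK")


def tap_purify(modules, enterokinase=False):
    """Return (eluted modules, list of steps) for a tagged fusion."""
    if "ProtA" not in modules:
        raise ValueError("no ProtA module: nothing binds the IgG beads")
    steps = ["bind IgG beads via ProtA"]
    eluate = tev_cleave(modules)
    steps.append("release with TEV protease")
    if "CBP" not in eluate:
        raise ValueError("no CBP module left: nothing binds calmodulin beads")
    steps.append("bind calmodulin beads via CBP")
    steps.append("elute with EGTA")
    # The enterokinase step only applies when the layout carries an EK site.
    if enterokinase and "EK" in eluate:
        eluate = ek_cleave(eluate)
        steps.append("optional enterokinase treatment")
    return eluate, steps


for name, fusion in [("C-terminal", C_TERMINAL_FUSION),
                     ("N-terminal", N_TERMINAL_FUSION)]:
    eluate, steps = tap_purify(fusion, enterokinase=True)
    print(name, "->", "-".join(eluate), "|", "; ".join(steps))
```

In this model the C-terminal layout keeps the CBP module on the target after elution, whereas the N-terminal layout can be trimmed down to the target alone because it carries the enterokinase site.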

After purification, the protein complexes are separated on an SDS-polyacrylamide gel and silver stained. The bands are then analyzed by mass spectrometry to identify the proteins forming the purified complex. Because TAP purification is performed under mild conditions, in vitro activity assays can be carried out with the purified complexes [6-8]. The technique is designed to allow rapid purification of complexes from a relatively small number of cells. Moreover, no prior knowledge of the composition, activity or even function of the target protein and its complex is required [7].

1.1.1.1 Limitations

Although TAP-tag purification is widely used, it has certain limitations. The position of the tag within the 3D structure of the protein can reduce its exposure and thus its binding to the affinity beads. The presence of the tag can also affect the function and expression levels of the target protein [6]. Moving the TAP tag from the C- to the N-terminus can remedy these problems [6].


TAP purification has enabled the purification of more than 589 protein complexes, allowing the systematic study of the yeast proteome [9]. Despite its success in yeast, the technique transfers poorly to higher eukaryotes because HR is less efficient in these organisms [9]. Expressing human genes fused to the TAP tag in yeast has allowed the functional study of many proteins [9]. However, the conclusions drawn from these studies remain inferences, since they were not carried out under native conditions in human cells [10-18]. Unfortunately, this technique is difficult to apply in human cells [9]. The presence of the TAP tag can interfere with the folding of the protein into its three-dimensional structure and with its interactions with other proteins [19]. In addition, some mammalian proteins can interact with calmodulin, creating substantial background that defeats the main purpose of the TAP tag [20, 21]. Modifications that reduce perturbations in the stoichiometry of protein interactions and prevent aberrant localization, aggregation, dominant-negative effects and toxicity would be needed to allow better characterization of multiprotein complexes [22, 23].

1.1.2 Affinity-tag purification

An affinity tag is a polypeptide sequence fused to a protein of interest to facilitate its purification and detection. Affinity tags can consist of whole proteins, protein domains or short peptide sequences. Tags of this group all share certain characteristics: i) a simple purification procedure; ii) minimal effect on tertiary structure and biological activity; and iii) simple application [24]. As described above, large tags made of protein domains can negatively affect the solubility and function of some proteins; it is therefore important to remove them for functional studies [24]. A simpler approach relies on very short peptide sequences [25]. These tags need not be removed because, given their small size, they generally do not affect the function or interactions of the protein under study [25]. The most commonly used affinity tags are poly-Arg-, FLAG-, poly-His-, c-myc-, S- and Strep II-tag [25]. To optimize the study of a protein of interest, the characteristics of the affinity tag used must be taken into account (Table 1) [25]. As examples, the FLAG and Strep affinity tags are described in more detail.

Table 1: Characteristics of commonly used affinity tags

Table taken from Terpe et al. 2003 [25]

1.1.2.1 The FLAG tag

The FLAG affinity tag is a short peptide sequence of 8 amino acids (DYKDDDDK) [26]. The monoclonal anti-FLAG M2 antibody binds the N-terminal part of the FLAG peptide, allowing immunoprecipitation of FLAG-tagged proteins and their complexes [27]. The tag can be placed at the C- or N-terminus of the target protein [27]. The system is effective in many cell types, including bacteria, yeast and mammalian cells [26, 28-32]. Purification conditions are generally non-denaturing, which allows an active fusion protein to be purified [27]. An improved version of the FLAG system is the 3xFLAG system, made up of three FLAG epitopes in tandem (Figure 2). It is a hydrophilic tag of 22 amino acids (N-MDYKDHD-G-DYKDHD-I-DYKDDDDK-C, counted without the initiator methionine) that contains an enterokinase cleavage site [27]. The 3xFLAG system can detect as little as 10 fmol of a target protein, whereas the FLAG system detects 100 fmol [25].
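The sequence figures quoted above can be checked with a few lines of Python. This is a minimal sketch: the variable and function names are chosen for this example only, and the sequences are copied from the text. It shows that the 3xFLAG sequence as written, with its leading methionine, has 23 residues, so the 22-residue count corresponds to the tag without that methionine. It also locates the enterokinase site DDDDK, after which the protease cuts.

```python
FLAG = "DYKDDDDK"
# 3xFLAG as written in the text, including the leading methionine (M).
THREE_X_FLAG_WITH_MET = "MDYKDHD" + "G" + "DYKDHD" + "I" + "DYKDDDDK"
EK_SITE = "DDDDK"  # enterokinase recognition sequence


def ek_cut_position(sequence):
    """Return the index just after the EK site (the cut point), or None."""
    start = sequence.find(EK_SITE)
    return None if start == -1 else start + len(EK_SITE)


# Drop the initiator methionine to get the tag proper.
three_x_flag = THREE_X_FLAG_WITH_MET[1:]

print("FLAG length:", len(FLAG))
print("3xFLAG length with / without Met:",
      len(THREE_X_FLAG_WITH_MET), "/", len(three_x_flag))
print("EK cut after residue:",
      ek_cut_position(FLAG), "(FLAG),",
      ek_cut_position(three_x_flag), "(3xFLAG)")
```

In both tags the enterokinase site sits at the very C-terminal end of the tag sequence, so the cut falls right after the last tag residue; for an N-terminally tagged protein this is what allows the tag to be removed.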

Figure 2: Amino acid sequences of the FLAG and 3xFLAG tags

Figure taken from Sigma-Aldrich [33]

1.1.2.2 The Strep tag

Developed in 1993 as an affinity tool for purification, the Strep-tag is a short peptide sequence of 8 amino acids (WSHPQFEK) that binds tightly in the biotin-binding pocket of streptavidin [34]. Strep-Tactin is a streptavidin variant mutated at positions 44, 45, and 47 [35]. The Strep-tag has nearly 100-fold higher affinity for Strep-Tactin than for streptavidin [35-37]. The Strep-tagged fusion protein is eluted under mild conditions with a low concentration of d-desthiobiotin, yielding a purified product close to its native state [36]. This makes it possible to study the functions of protein complexes after purification [36]. An improved version of the Strep-tag II is the Twin-Strep-tag (WSHPQFEK-GGGSGGGSGG-SAWSHPQFEK) [38]. The Twin-Strep-tag consists of two Strep-tag II sequences in tandem joined by a linker region, and it has a higher affinity for Strep-Tactin (Figure 3) [38].
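The tandem arrangement described above can be checked directly on the quoted sequences. The sketch below locates each copy of the Strep-tag II motif inside the Twin-Strep-tag; the helper name `find_motif` is hypothetical and the code is only an illustration, not part of the purification protocol.

```python
# Sequences as quoted in the text (one-letter amino acid code).
STREP_TAG_II = "WSHPQFEK"
TWIN_STREP_TAG = "WSHPQFEK" + "GGGSGGGSGG" + "SAWSHPQFEK"


def find_motif(motif: str, sequence: str) -> list:
    """Return the 0-based start position of every occurrence of motif."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    print("Twin-Strep-tag length:", len(TWIN_STREP_TAG))
    print("Strep-tag II copies at:", find_motif(STREP_TAG_II, TWIN_STREP_TAG))
```

The two reported positions correspond to the two Strep-tag II copies, separated by the glycine/serine-rich linker.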

Figure 3: Schematic representation of the purification principle for a Strep-tag and a Twin-Strep-tag

Figure taken from Schmidt et al. 2007 [38].

1.2 Tetracycline-inducible systems

Despite the range of techniques available, in some cases none of the methods described above can be used to study the function of a target protein. For example, a cytotoxic protein cannot be constitutively overexpressed in order to study its functions. An alternative approach is to induce expression of the gene of interest in a time-controlled manner. The LacR/O inducible system from E. coli was the first to be used in mammalian cells; however, its inducer, isopropyl β-D-thiogalactopyranoside (IPTG), acted too slowly for adequate induction [39]. To address this problem, two inducible systems derived from E. coli were developed [40, 41]. The tetracycline-Off (Tet-Off) and tetracycline-On (Tet-On) systems are functional in human HeLa cells and induce expression of the gene of interest rapidly [40, 41]. Two key elements are nevertheless required for efficient induction in mammalian cells: i) a plasmid expressing a regulatory protein, the transactivator, and ii) a plasmid carrying the inducible promoter.

1.2.1 The Tetracycline-Off system

In the Tet-Off system, the tetracycline-controlled transactivator (tTA) is constitutively expressed and binds the Tetracycline Response Element promoter (pTRE), thereby activating the gene of interest (Figure 4) [40]. The pTRE consists of 7 repeats of the 19-bp tet operator sequence (tetO), followed by a minimal CMV promoter, and is recognized by the tetracycline repressor (tetR) [42]. The tTA was created by fusing tetR to the activation domain of viral protein 16 (VP16). When tetracycline is present, the tTA preferentially binds tetracycline rather than the pTRE, which switches the system off [40, 41]. In the absence of tetracycline, the tTA binds the pTRE and induces expression of the gene of interest.

1.2.2 The Tetracycline-On system

The Tet-On system was developed as the reverse of the Tet-Off system (Figure 4) [40]. It is therefore inactive under native conditions, and expression of the gene of interest is induced by adding tetracycline [41]. Through random mutagenesis, Gossen et al. identified the amino acids in tetR responsible for repression by tetracycline [41]. This discovery led to the development of a reverse Tet repressor (rTetR). The reverse tetracycline-controlled transactivator (rtTA) was created by fusing rTetR to the C-terminal domain of VP16, so that the system is activated by adding tetracycline to the medium.
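The opposite behaviour of the two systems can be summarized as a small truth table. The Python sketch below is a deliberately simplified on/off model (real induction is dose-dependent and can be leaky); the function name and the string labels are illustrative assumptions.

```python
def gene_expressed(system: str, tetracycline_present: bool) -> bool:
    """Simplified on/off model of the Tet-Off and Tet-On systems.

    Tet-Off: tTA binds the pTRE only when tetracycline is absent.
    Tet-On:  rtTA binds the pTRE only when tetracycline is present.
    """
    if system == "tet-off":
        return not tetracycline_present
    if system == "tet-on":
        return tetracycline_present
    raise ValueError(f"unknown system: {system!r}")


if __name__ == "__main__":
    for system in ("tet-off", "tet-on"):
        for tet in (False, True):
            state = "ON" if gene_expressed(system, tet) else "OFF"
            print(f"{system:8s} tetracycline={tet!s:5s} -> gene {state}")
```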

Figure 4: Tetracycline-Off and Tetracycline-On

Figure taken from Gossen et al. 1992 [40].

1.2.3 Improvements to the systems

Over the years, several improved versions of these systems have been designed, such as the bidirectional Tet-On 3G system [42, 43]. In Tet-On 3G, doxycycline, a synthetic derivative of tetracycline, replaces tetracycline (Figure 5). The concentration of doxycycline needed to induce the system is lower than that of tetracycline. This lower concentration reduces cytotoxic effects, a clear advantage for in vivo studies [44]. Doxycycline has a half-life of 24 hours, so it must be added to the culture medium frequently to maintain induction. It is also important to note that the Tet-On 3G system is inducible only by doxycycline, not by tetracycline as its predecessor was [41]. The bidirectional pTRE (pTRE-BI) allows induction of genes placed on its 5' and 3' sides [42], so that two transgenes of interest, one upstream and one downstream, can be induced simultaneously. Because control of the gene on the 5' side is considered "leaky", the upstream gene is often a visual marker, such as mCherry or ZsGreen1, that allows selection of cells that have integrated the system. The pTRE-BI contains no binding sites for endogenous mammalian transcription factors, which makes studies under native expression conditions impossible.
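To illustrate why doxycycline has to be replenished, the sketch below estimates the fraction of doxycycline remaining in the medium over time. It assumes simple first-order (exponential) decay with the 24-hour half-life quoted above; the decay model and the function name are assumptions made for illustration and are not taken from the cited sources.

```python
def fraction_remaining(hours: float, half_life_hours: float = 24.0) -> float:
    """Fraction of the initial doxycycline left after `hours`.

    Assumption: first-order decay, i.e. the amount halves every half-life.
    """
    if hours < 0 or half_life_hours <= 0:
        raise ValueError("hours must be >= 0 and half_life_hours must be > 0")
    return 0.5 ** (hours / half_life_hours)


if __name__ == "__main__":
    for t in (0, 12, 24, 48, 72):
        print(f"after {t:2d} h: {fraction_remaining(t):.3f} of the initial dose")
```

Under this assumption only a quarter of the initial dose is left after 48 hours, which is consistent with the need for frequent re-addition.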

Figure 5: The Tet-On 3G system induces expression of a gene in the presence of doxycycline.

Figure taken from Clontech [45]

1.2.4 Limitations

The techniques described above have certain limitations. As designed, these artificial systems require simultaneous transfection of two plasmids and rely on random integration into the genome. Aberrant localization of the transgene resulting from random integration can lead to undesirable effects in the cell [22]. These effects can lead to erroneous conclusions by altering the actual transcriptional control, by inactivating or overexpressing endogenous genes through the modification of regulatory regions, or through other negative effects such as the aggregation of spurious complexes or the creation of a dominant-negative effect [22, 23].

1.3 Limitations of currently used techniques

Tandem affinity purification followed by mass spectrometry remains a useful technique for studying protein complexes, but it is difficult to transpose to human cells [6, 9]. In particular, the TAP tag can interfere with the folding of the protein and give rise to protein interactions that are not found in the natural state in mammalian cells [19-21]. Functional studies of human proteins have been carried out by expressing human genes fused to the TAP tag in yeast [9]. However, the conclusions drawn from these studies are inferred, since they were not carried out under native conditions in human cells [10-18].

Affinity tags are powerful tools for purifying target proteins. However, the use of large tags made up of protein domains can negatively affect the solubility and function of some proteins [24]. Tags made of very short peptide sequences do not need to be removed and generally do not affect the function or interactions of the protein under study [25]. Nevertheless, the characteristics of the affinity tag used must be taken into account [25].

Tetracycline systems provide control over expression of the target protein. However, random integration of the two vectors remains a major limitation: it can cause many problems in the cell and can distort the conclusions drawn [22, 23].

Modifications that reduce perturbations in the stoichiometry of protein interactions and prevent aberrant localization, aggregation, dominant-negative effects, and toxicity are needed to allow better characterization of multiprotein complexes [22, 23].

2.0 The genomic safe harbor concept

A genomic safe harbor (GSH) is a region of the genome where genetic material can be integrated without disrupting the function, transcription, or regulation of native coding sequences, or the gene structure [46]. A GSH allows sufficient expression of a transgene without predisposing the cell to malignant transformation or altering its function [46, 47]. In addition, expression of the transgene from the GSH must be possible in different cell types [46].

2.1 Insertion at the AAVS1 locus

The Adeno-Associated Virus Integration Site 1 (AAVS1) locus, also known as Protein Phosphatase 1 Regulatory Subunit 12C (PPP1R12C), is a gene encoding a protein whose function is not yet known. In humans, it is located on chromosome 19 at position 19q13.42 and is the preferred integration site of adeno-associated virus (AAV) [48-50]. Because AAV infection is not associated with any known pathology and its integration at the AAVS1 locus has no harmful effect, integration at the AAVS1 site is considered harmless [51, 52]. Embryonic stem cells (ESCs) and induced pluripotent stem cells (iPSCs) retain their pluripotency after integration of a transgene at AAVS1 [52-56]. Moreover, transgene expression is maintained after their differentiation [52-56]. Integration at AAVS1 has been shown to be benign in cultured human T cells [57]. The integrated gene is transcribed in commonly used cell lines (K562, HeLa, HEK293, DU-145, and Hep3B), which facilitates its use [47]. The AAVS1 locus meets both criteria for a GSH: i) integration at the site does not result in harmful effects, and ii) the transgene is transcriptionally competent in several cell types [47].

3.0 Genome editing with engineered nucleases

Genome editing is a method for introducing desired modifications at a precise location in the genome. These modifications can take the form of gene knockout (KO), insertion of an exogenous sequence, or alteration of one or more nucleotides. The technique pushes back the limits of functional proteomics studies by making gene modification easier. Genome editing relies on the creation of double-strand breaks (DSBs) at a specific site in the DNA using endonucleases such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and CRISPR (clustered regularly interspaced short palindromic repeats)/Cas (CRISPR-associated). These systems, developed in 2005, 2011, and 2013 respectively, are easy to use and open the door to understanding biological phenomena that were previously impossible to characterize.

3.1 Endonucleases

3.1.1 Zinc finger nucleases

3.1.1.1 Origins

ZFNs are chimeric endonucleases composed of a DNA-binding domain (DBD) fused to a cleavage domain [58]. The DBD is composed of tandem Cys2His2 zinc finger modules, each engineered to recognize approximately 3 specific nucleotides. Each of these modules is derived from eukaryotic transcription factors [59-61]. The cleavage domain used in ZFNs is the non-specific cleavage domain of the FokI endonuclease [58].

3.1.1.2 Recognition and cleavage of a specific DNA sequence

Typically, 3 to 6 modules are used to build the DBD of a ZFN. Each module consists of approximately 30 amino acids in a typical ββα configuration [62]. Certain amino acids on the surface of the α-helix interact with approximately 3 base pairs in the major groove of the DNA [62]. Assembling several zinc fingers therefore allows a specific DNA sequence to be recognized [62].

The endonuclease function of a ZFN is provided by fusing the non-specific cleavage domain of the restriction enzyme FokI to the zinc finger protein modules. This cleavage domain must dimerize to be active [63]. A pair of ZFNs in the proper orientation, with binding sites spaced 4-7 bp apart, is required to create a DSB at a precise location in the genome (Figure 6) [63]. The requirement that ZFNs work in pairs increases the length of the DNA recognition sequence to 18 to 36 bp, which considerably improves the chance of obtaining a unique recognition site and thus limits potential off-target cleavage sites [64, 65].
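The arithmetic behind the 18 to 36 bp figure, and the reason a longer site is more likely to be unique, can be sketched as follows. The estimate assumes a genome of about 3.2 billion bp with uniformly random base composition and counts one strand only; these are simplifying assumptions introduced here, and the function names are illustrative.

```python
def zfn_pair_site_length(modules_per_zfn: int, bp_per_module: int = 3) -> int:
    """Total recognition length (bp) for a pair of ZFNs, spacer excluded."""
    return 2 * modules_per_zfn * bp_per_module


def expected_random_occurrences(site_length: int, genome_size: float = 3.2e9) -> float:
    """Expected number of chance matches of one specific site of given length.

    Assumptions: uniform random sequence, single strand, genome_size in bp.
    """
    return genome_size / (4 ** site_length)


if __name__ == "__main__":
    for modules in (3, 4, 5, 6):
        length = zfn_pair_site_length(modules)
        hits = expected_random_occurrences(length)
        print(f"{modules} modules per ZFN -> {length} bp site, "
              f"~{hits:.2e} chance matches expected")
```

Under these assumptions a single 9-bp half-site would be expected to occur thousands of times by chance, whereas an 18-bp paired site is expected less than once per genome.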

Figure 6: Schematic of the mechanism of action and structure of a zinc finger nuclease

Figure taken from Urnov et al. 2010 [64]

a) Schematic of a dimerized ZFN bound to its target. Each ZFN contains the FokI cleavage domain linked to 3 to 6 zinc finger modules. The modules are designed to recognize a specific sequence (blue and red boxes) on each DNA strand. A small number of bases (typically 5 to 6) separates the ZFN target sequences (Miller et al. 2007 [58]). b) Assembly of the modules of a ZFN. To generate a zinc finger protein specific for the sequence GGGGGTGAC, three modules that each specifically bind one base triplet are identified and then linked [63].

3.1.1.3 Use for genome modification

ZFNs were the first engineered nucleases to be developed and were used in pioneering studies in plants, fish, and mammals [64, 66-68]. ZFNs are currently the only nucleases used in clinical studies in humans. Sangamo Biosciences bases its gene-correction strategies for various diseases on ZFNs (clinicaltrials.gov). In particular, the company has a Phase 2 project aimed at modifying the CCR5 gene, which encodes a co-receptor used by HIV to infect T cells. It also currently has two Phase 1 projects, one targeting hematopoietic stem and progenitor cells (HSPCs) from HIV patients and the other targeting HSPCs in patients with sickle cell anemia.

3.1.1.4 Limitations

The use of ZFNs nevertheless has certain limitations. Specific proteins must be engineered for each cleavage site, a process that can be arduous, time-consuming, and costly. Only a limited number of nucleotide recognition combinations are possible [69]. Adjacent protein modules can influence one another, which complicates ZFN design [69]. To function, ZFNs must be used in pairs, which reduces off-target DSBs but does not eliminate them completely [69].

3.1.2 Transcription activator-like effector nucleases

3.1.2.1 Origins

TALENs were developed as an alternative to ZFNs because they are simpler to generate. Like ZFNs, they use a simple protein-DNA code and are composed of a DBD and a cleavage domain (Figure 7) [70]. The DBD is composed of repeated sequences from transcription activator-like effectors (TALEs), proteins secreted by Xanthomonas bacteria that alter gene transcription in the host plant and thereby promote infection [71]. TALENs cleave DNA through their fusion to the endonuclease domain of the restriction enzyme FokI.

Figure 7: Schematic representation of the structure and function of TALENs

a) Structure of a TALEN. b) Assembly of two TALENs to create a DSB. Cleavage by the FokI domain takes place in the sequence located between the two DNA regions bound by the two TALEN monomers. c) Each TALE repeat recognizes one specific base pair through two residues. d) Binding of a TALEN to DNA. Each repeat domain binds one base pair; note the thymine 5' of the first base bound by a TALE repeat. [70]


The TALE repeat domain is composed of several repeated sequences, each of which binds specifically to a single DNA base. Each repeat is 33-35 amino acids long. Among these, the residues at positions 12 and 13 confer the specific interaction with a DNA base [72]. The residue pairs mainly used at these positions are NN, NI, HD, and NG, which recognize the bases G, A, C, and T, respectively [70]. Co-crystallization of TALE repeat domains bound to their target DNA sequence showed that each individual repeat consists of two overlapping α-helices arranged in a V shape. The residues at positions 8 and 12 interact with each other to stabilize the structure [2, 73]. The repeats form a superhelix around the DNA, placing the residues at positions 12 and 13 in the major groove of the DNA [2, 73].
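The one-repeat-per-base code described above lends itself to a simple lookup. The following Python sketch lists, for a given target site, the residue pair (positions 12 and 13) that each repeat would carry, using only the four pairings quoted in the text. It also enforces the 5' thymine noted in Figure 7. The function name and the example site are hypothetical, and real TALEN design involves further considerations not modeled here.

```python
# Residue pairs at positions 12-13 of each repeat, as listed in the text.
RESIDUES_FOR_BASE = {"G": "NN", "A": "NI", "C": "HD", "T": "NG"}


def repeats_for_target(site: str) -> list:
    """Return the residue pair needed for each base that follows the 5' T.

    The site is written 5'->3' and must begin with the thymine that
    precedes the first base bound by a TALE repeat.
    """
    site = site.upper()
    if not site.startswith("T"):
        raise ValueError("target site must begin with a 5' thymine")
    return [RESIDUES_FOR_BASE[base] for base in site[1:]]


if __name__ == "__main__":
    example_site = "TGACT"  # hypothetical site, for illustration only
    print(example_site, "->", repeats_for_target(example_site))
```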

3.1.2.3 Use for genome modification

Several studies have shown that TALENs create DSBs with an efficiency similar to that of ZFNs [74-77]. TALE repeats can be assembled easily. Notably, a library of TALENs targeting 18,740 human protein-coding genes has been developed [78]. A preclinical study used TALENs targeting the CCR5 gene to induce small insertions or deletions (indels) and thereby confer protection against HIV-1 infection in vitro [79].

3.1.2.4 Limitations

The DNA binding site recognized by a TALEN must begin with a thymine (T), which limits the range of sequences that can be targeted [78]. Delivery of the TALEN components is also a problem, because their large size prevents them from being packaged into an adeno-associated virus, the vector of choice for gene therapy [80].

3.1.3 The CRISPR/Cas9 system

3.1.3.1 Origins

The CRISPR/Cas system is the most recent genome-editing system and is very easy to use [81]. The repeated sequences now called CRISPR were first identified as five 29-bp repeats separated by 32-bp spacers near the iap gene in the bacterium Escherichia coli [82]. In 2007, Barrangou, Horvath, and Moineau linked CRISPR to an immune response in prokaryotes when they demonstrated that Streptococcus thermophilus acquires resistance to phage attack upon introduction of CRISPR sequences [82]. Immunity is possible because an RNA produced from the CRISPR sequence guides the nuclease of the CRISPR/Cas system to specific sequences of the invading phage, allowing it to be cleaved and degraded [83]. The CRISPR/Cas system and its variants have been found in many species of bacteria and archaea [83]. CRISPR/Cas systems have been classified into three groups, type I, type II, and type III, each with its own subtypes [84]. In 2012, the type II CRISPR/Cas9 system was shown to be a powerful genetic engineering tool capable of producing a targeted DSB [83]. This system has been used successfully in several different organisms [85-89].


In its natural state, the CRISPR/Cas9 system consists of three elements: the crRNA (CRISPR RNA), the tracrRNA (trans-activating crRNA), and Cas9. Hybridization of the crRNA with the tracrRNA allows them to associate with the Cas9 nuclease [90-93]. A chimeric fusion of the tracrRNA and crRNA, called sgRNA (single guide RNA) or gRNA (guide RNA), is more commonly used for genome editing (Figure 8) [83]. The 20-nucleotide guide sequence, which is part of the crRNA, directs the nuclease precisely to the homologous target sequence in the genome [83, 90, 93]. This target sequence must lie directly upstream of a genomic sequence called the Protospacer Adjacent Motif (PAM) [83, 90, 93]. Cas9 undergoes a conformational change that positions its two nuclease domains, RuvC and HNH, on opposite strands of the target DNA, and cleaves it approximately 3 to 4 nucleotides upstream of the PAM [83, 90, 93].
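The targeting rule described above (a 20-nucleotide target lying directly upstream of a PAM, with cleavage about 3 nucleotides upstream of the PAM) can be sketched in a few lines of Python. The sketch assumes the 5'-NGG-3' PAM of SpCas9 and scans only the strand as given; the function name, the returned fields, and the example sequence are illustrative assumptions, not a published design tool.

```python
import re


def find_spcas9_sites(dna: str, guide_len: int = 20) -> list:
    """List candidate SpCas9 target sites on the given strand (5'->3').

    Each hit reports the protospacer immediately 5' of an NGG PAM and the
    0-based index of the first base after a cut placed 3 bp upstream of
    the PAM. The reverse strand is not scanned in this sketch.
    """
    dna = dna.upper()
    sites = []
    # The lookahead lets overlapping PAM candidates (e.g. in "GGG") be found.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        if pam_start >= guide_len:
            sites.append({
                "protospacer": dna[pam_start - guide_len:pam_start],
                "pam": match.group(1),
                "cut_index": pam_start - 3,
            })
    return sites


if __name__ == "__main__":
    example = "GAGTCCGAGCAGAAGAAGAAGGG"  # arbitrary example sequence
    for site in find_spcas9_sites(example):
        print(site)
```

A practical design tool would also scan the reverse complement and score potential off-target sites, which this sketch leaves out.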

Figure 8: Schematic of the RNA-guided Cas9 nuclease

The Cas9 nuclease from S. pyogenes (yellow) is targeted to genomic DNA (shown here for the human EMX1 locus) by a guide RNA consisting of a 20-nucleotide guide sequence (blue) and a scaffold (red). The guide sequence pairs with the DNA target (blue line), directly upstream of the 5'-NGG PAM (pink). Cas9 creates a DSB ~3 bp upstream of the PAM (red triangle). Taken from Jinek et al. [83].

3.1.3.3 Protospacer Adjacent Motif

The PAM is a short sequence in the genomic DNA that is required for a genomic sequence to be recognized by the sgRNA [94]. Given its small size, however, this requirement does not significantly limit the choice of guide sequences [94]. For example, the PAM of SpCas9 (Streptococcus pyogenes Cas9) is 5'-NGG-3', a sequence found approximately every 8 to 12 bp in the human genome [94, 95]. Nucleases from other species can have a different PAM, which gives more flexibility in selecting a cleavage site in the genome (Table 2) [96]. The PAM and the 8 to 12 PAM-proximal bases of the guide are the most important for site recognition [83]. This sequence, called the 'seed', is essential for binding of the guide RNA [83]. Cas9 first recognizes a PAM in the genome. It then opens the DNA double helix and tests base pairing with the sgRNA one base at a time, from the 3' end to the 5' end of the guide [83]. A mismatch in the seed region causes the complex to be released, whereas a mismatch outside the seed can be tolerated [83].
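The distinction between seed and non-seed mismatches can be illustrated with a short comparison function. In the sketch below the guide and the candidate site are both written 5'->3' as DNA letters, so the seed is taken to be the last bases at the 3' (PAM-proximal) end. The seed length of 12, the function name, and the example sequences are assumptions chosen for illustration; actual tolerance to mismatches varies between guides.

```python
def classify_mismatches(guide: str, site: str, seed_len: int = 12) -> dict:
    """Split mismatch positions (0-based, 5'->3') into seed and distal groups.

    Assumption: the seed is the `seed_len` bases at the 3' end of the guide,
    i.e. the bases closest to the PAM.
    """
    guide, site = guide.upper(), site.upper()
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    seed_start = len(guide) - seed_len
    result = {"seed": [], "distal": []}
    for index, (g, s) in enumerate(zip(guide, site)):
        if g != s:
            result["seed" if index >= seed_start else "distal"].append(index)
    return result


if __name__ == "__main__":
    guide = "GAGTCCGAGCAGAAGAAGAA"        # arbitrary 20-nt example
    distal_variant = "TAGTCCGAGCAGAAGAAGAA"  # mismatch at the 5' end
    seed_variant = "GAGTCCGAGCAGAAGAAGAT"    # mismatch next to the PAM
    print(classify_mismatches(guide, distal_variant))
    print(classify_mismatches(guide, seed_variant))
```

In this simplified picture, a site with any entry under "seed" would be expected to be released, while a site with only "distal" mismatches might still be cleaved.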

Table 2: PAM motifs of Cas9 orthologs

Cas9 orthologs with known PAM sequences. The PAM of Lactobacillus buchneri Cas9 was inferred from other known sequences but has not been validated experimentally. [96]

3.1.3.4 Use for genome modification

CRISPR/Cas9 is currently the most powerful tool for genome editing; the ease with which its components are synthesized and its low cost account for much of its popularity. CRISPR has facilitated several areas of research, including the generation of transgenic animals. Rudolf Jaenisch's research group showed that several mutations can be created in mice in a single generation [97]. The first clinical studies in humans were recently approved, in June, in the United States. CRISPR/Cas9 will be used for ex vivo gene therapy in T cells from cancer patients [98].

3.1.3.5 Limitations

The PAM is the major constraint on gRNA design because it is invariable. However, a growing number of newly discovered Cas9 orthologs have PAMs that differ from that of SpCas9, which increases the likelihood that a given sequence can be targeted [92, 99-102]. Other types of nucleases can also have different PAMs. Cpf1, a Class 2 endonuclease of the CRISPR-Cas system, has a thymine-rich PAM: 5'-TTN-3' [103].

Another limitation is the tolerance of the CRISPR/Cas system to mismatches [47, 83, 95]. The 10-12 bp at the 5' end of the gRNA are generally the most tolerant of mismatches [83, 95]. This leads to off-target DSBs, which can cause unintended mutations [104]. Strategies have been developed to reduce off-target DSBs, but they do not eliminate them completely. Shortening the guide RNA to 17 nucleotides reduces off-target DSBs by up to 5000-fold in some cases [47], apparently without affecting the efficiency with which gRNAs reach their target [47]. In addition, Cas9 D10A, a nickase created by mutating and inactivating the RuvC nuclease domain, must be used as a pair of CRISPR-Cas9-D10A complexes to create a DSB. This approach, reminiscent of how ZFNs and TALENs work, increases the specificity of the system [105]. Similarly, the FokI-dCas9 fusion must also be used in pairs: it combines a catalytically inactive Cas9 with the FokI nuclease, which must dimerize to cut DNA [106].

4.0 DNA repair mechanisms

4.1 Repair of single-strand breaks

When only one DNA strand is cut, the other strand serves as a template for the correction. Base excision repair (BER) repairs a single nucleotide: a DNA glycosylase recognizes and removes the damaged base, creating an apurinic or apyrimidinic (AP) site [107]. AP endonucleases then create a single-strand break at the AP site so that repair can proceed using the complementary strand as a template [107].

Nucleotide excision repair (NER) repairs DNA damage caused by UV radiation [108]. The damaged region is removed in a three-step procedure: i) recognition of the damage, ii) excision of the damaged DNA by incisions upstream and downstream of the lesion, and iii) resynthesis of the excised region using the complementary strand as a template [108]. Mismatch repair (MMR) recognizes base-base mismatches as well as small indels arising during DNA replication and recombination, thereby preventing the mutation from becoming permanent in the cell [109-111].

4.2 DNA damage response

A DNA DSB is potentially lethal to the cell. Rapid recognition of DNA damage and accurate DNA repair are the most important aspects of the DNA damage response (DDR) in the cell [57]. DSBs are repaired mainly through two DNA repair pathways: non-homologous end joining (NHEJ) and homologous recombination (HR) [112].

The earliest response to a DSB is the recruitment of the MRN complex (Mre11, Rad50, Nbs1), quickly followed by recruitment of the ataxia telangiectasia mutated (ATM) kinase, which in turn phosphorylates histone H2AX near the DSB [113-115]. Mre11 is a 3' to 5' double-strand exonuclease and a single-strand DNA endonuclease that aligns the broken DNA ends so that they can be joined [116]. Rad50 is a coiled-coil protein with ATP-dependent DNA-binding activity [116]. The MRN complex processes the DNA ends in preparation for HR or NHEJ [116-118]. The NuA4 complex is the key protein complex in histone acetylation and DSB repair in mammals. It contains at least 16 subunits and carries three catalytic activities: the acetyltransferase Tip60, the ATPase motor p400, and the helicases Ruvbl1 and Ruvbl2 [119-122]. Tip60 (also known as the acetyltransferase KAT5) is recruited to the DSB, autophosphorylates, and activates the ATM kinase [123-125]. Numerous studies show that ATM phosphorylates Kap1, a critical step for DSB repair in heterochromatin [126, 127]. Recruitment of the NuA4 complex to the DSB allows acetylation of histone H4, which, combined with the SWI/SNF-type DNA-dependent ATPase activity of p400, reduces histone-histone interactions [128]. This loosening of the histones makes the break site more accessible for DNA repair [128]. The cellular processes that govern the choice between the two repair pathways are not yet well understood, but the proteins 53BP1 and BRCA1 are known to play a key role [129, 130]. It has also been shown that CtIP interacts with the MRN complex and regulates resection, a key step in committing to HR repair in mammalian cells [131].

Figure 9: Schematic of DSB repair by the NHEJ and HR pathways.

The DNA that has undergone a DSB and the homologous template used for repair are represented by a pair of black lines and a pair of gray lines, respectively. A) The DSB ends are bound by MR(X)N and the Ku/DNA-PK complex. B) In NHEJ repair, the DSB ends are stabilized by MR(X)N and Ku/DNA-PK. C) MR(X)N and Ku/DNA-PK recruit the ligase complex and align the DSB ends. D) The DSB ends are ligated, or processed for subsequent ligation (repair). E) In HR, the 5' ends of the DSB are resected by MR(X)N and other nucleases. F) RPA binds the single-stranded DNA generated by resection. G) RPA on the single-stranded DNA is replaced through formation of the Rad51 filament, which involves Rad52, Rad55-Rad57, and Rad54. H) Homology search and strand invasion by the Rad51 filament lead to formation of the D-loop. I) From the D-loop, different HR pathways can lead to DSB repair. Taken from Pardo et al. [132]

4.2.1 Non-homologous end joining

4.2.1.1 Mechanism

NHEJ is one of the major pathways for DNA repair after a DSB. This repair pathway is active throughout the cell cycle but predominates during the G0/G1 and G2 phases [133-136].

NHEJ directly rejoins the broken DNA ends, which can introduce small indels that may lead to gene knockout. For this reason NHEJ is considered error-prone, although it does not always introduce errors. The repair of DSBs by NHEJ involves three major steps: a) removal of damaged DNA by nucleases, b) repair synthesis by polymerases, and c) ligation of the two DNA ends by ligases.
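One way to see why a small indel can knock out a gene is to check whether it shifts the reading frame. The sketch below applies the usual rule of thumb that a net change in length that is not a multiple of 3 shifts the frame of a coding sequence. This rule and the function name are additions made here for illustration; they ignore other outcomes such as in-frame loss of essential residues or effects on splicing.

```python
def causes_frameshift(deleted_nt: int, inserted_nt: int) -> bool:
    """Return True if an indel changes coding length by a non-multiple of 3.

    Assumption: the indel lies entirely within a protein-coding sequence.
    """
    if deleted_nt < 0 or inserted_nt < 0:
        raise ValueError("nucleotide counts must be non-negative")
    net_change = inserted_nt - deleted_nt
    return net_change % 3 != 0


if __name__ == "__main__":
    examples = [(1, 0), (3, 0), (0, 2), (2, 5)]
    for deleted, inserted in examples:
        outcome = "frameshift" if causes_frameshift(deleted, inserted) else "in-frame"
        print(f"-{deleted} nt / +{inserted} nt -> {outcome}")
```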

4.2.1.2 Proteins involved

In mammals, the main proteins involved in NHEJ repair are the Ku70/80 heterodimer, the DNA-dependent protein kinase catalytic subunit (DNA-PKcs), and the proteins of the ligation complex: X-ray cross-complementing protein 4 (XRCC4), XRCC4-like factor (XLF, also known as Cernunnos), and DNA ligase IV [137-140].

The Ku70 and Ku80 proteins, named for their molecular weights in kDa, form a heterodimer that binds the broken DNA ends and recruits DNA-PKcs, a serine/threonine kinase [141]. Once recruited to the DSB, DNA-PKcs autophosphorylates and thereby activates itself. Once active, it recruits Artemis and activates it by phosphorylation [134, 135]. The Artemis:DNA-PKcs complex has a 5' endonuclease activity that preferentially cleaves 5' single-stranded DNA to leave blunt ends, as well as a 3' endonuclease activity with a preference for 3' single-stranded DNA that leaves a 4-nt overhang [142]. In addition, the Artemis:DNA-PKcs complex can cleave DNA hairpin loops [142]. These activities prepare the DNA ends for ligation by the ligation complex [142, 143].

Polymerase mu is involved in preparing the DNA before ligation because it has multiple enzymatic activities. However, other polymerases, such as polymerase lambda, can contribute to NHEJ repair when mu is absent [144, 145]. Polymerase mu binds the DSB via Ku through the BRCT domains located at the N-terminus of these polymerases [146]. When Ku and XRCC4:DNA ligase IV are present with these polymerases, they acquire the ability to synthesize DNA beyond the broken ends without needing a template [147-150]. Like other polymerases, polymerase mu can slip on the template strand, which makes it error-prone [151-153]. Polymerase mu can also synthesize DNA independently of a template when it is alone or in the presence of XRCC4:DNA ligase IV [154].

Ligation of the DNA ends is carried out by ligase IV, which is bound by the XRCC4 and XLF proteins anchored on the Ku70/80 heterodimer [146, 155]. XRCC4 and XLF have similar properties: neither has enzymatic activity, but both stimulate the activity of ligase IV and its readenylation when bound to it [155]. Ligase IV can ligate compatible ends 4 nt in length [156]. The interaction of ligase IV with XRCC4 allows the ligation of ends sharing 2 bp of microhomology [155, 157, 158]. When ligase IV is anchored at the DSB site with Ku, ligation activity increases 10-fold [154]. When XLF is added, the XLF:XRCC4:DNA ligase IV complex ligates incompatible DNA ends more efficiently [159, 160]. Although NHEJ can introduce indels, it is necessary for the survival of the cell in an otherwise lethal situation [136].

4.2.1.3 Importance of this mechanism for gene knockout

Gene knockout is necessary to study the function of a gene in a given organism. Once the gene is knocked out, the functions of the knockout cell or organism are compared with those of a control to infer the functions of the gene. With an endonuclease that specifically targets the gene of interest, a gene can be knocked out quickly and efficiently and its effects on the cell studied [161]. Recently, Feng Zhang's research group developed a strategy to knock out multiple genes simultaneously in mammalian cells using the CRISPR-Cpf1 system with four guide RNAs [161].

4.2.2 Homologous recombination

4.2.2.1 Mechanism

HR is a DSB repair pathway that is considered error-free because it uses the homologous sequence of the sister chromatid to make a perfect correction [132]. Because sister chromatids are present only during the S and G2/M phases, repair by HR is restricted to these phases of the cell cycle [162]. Classical HR consists mainly of three successive steps: 1) resection of the 5' end at the DSB, followed by 2) strand invasion of the sister chromatid and the search for a homologous sequence, then 3) resolution of the intermediates [132].

4.2.2.2 Proteins involved

The first step of repair by HR is the degradation of the 5' DNA end. This step is governed by the MRE11, RAD50, NBS1, CtIP and EXO1 proteins, as well as by other complexes that remain to be identified [132]. Resection produces a 3' single-stranded DNA (ssDNA) that is rapidly bound by replication protein A (RPA), a heterotrimeric complex that binds ssDNA with high affinity [163-165]. RPA binding prevents the ssDNA from forming secondary structures and recruits Rad52 and the proteins encoded by the genes of the RAD52 epistasis group: RAD50, RAD51, RAD52, RAD54, RAD55, RAD57, RAD59, RDH54, MRE11 and XRS2. These proteins form the Rad51 filament, which replaces RPA on the 3' ssDNA [163-167]. The 3' ssDNA coated by the Rad51 filament can then invade the sister chromatid, creating a displacement loop (D-loop), and search for homologous sequences to allow DNA synthesis [168-170]. Rad54 is thought to play an important role in the homology search and in the maturation of recombination intermediates after D-loop formation [171-173].

4.2.2.3 Repair pathways leading to HR

Several repair pathways lead to repair by HR. How the cell chooses among these pathways is still poorly understood. However, each of the pathways presented below results in an exchange of genetic material between two homologous chromosomes.

4.2.2.3.1 Synthesis Dependent Strand Annealing

In the Synthesis Dependent Strand Annealing (SDSA) pathway, the invading strand identifies a homologous sequence in the sister chromatid, which serves as a template for elongation [174]. The newly synthesized sequence then moves to the other end of the DSB and allows the synthesis of a new complementary region [174]. This repair pathway does not produce genetic exchange by crossover [175].

Figure 10: Representation of repair by Synthesis Dependent Strand Annealing

A) Displacement of the strands by the invading strand and annealing to the other end of the DSB. B) Cleavage of the non-homologous sequences, elongation and ligation produce a crossover. Figure taken from Pardo et al. [132]

4.2.2.3.2 Double-Strand Break Repair

In repair by the classical Double-Strand Break Repair (DSBR) pathway, elongation takes place on both invading strands simultaneously, using the template from the same D-loop [174]. The result is the formation of Holliday junctions (HJs), which may or may not produce crossover events depending on the type of cleavage used to resolve the junction [174].

Figure 11: Representation of Double-Strand Break Repair

Elongation and ligation of the invading strand: formation of a double Holliday junction. Figure taken from Pardo et al. [132]

4.2.2.3.3 Break-induced repair

Break-induced repair (BIR) is a DNA repair pathway that is favored when only one end of the DSB can find a homologous sequence or when one DNA end is lost [176, 177]. The invading strand identifies a homologous sequence, forms a D-loop and initiates unidirectional DNA synthesis. The complementary strand is then synthesized discontinuously [178]. BIR results in loss of heterozygosity and a possible loss of sequence [178].

Figure 12: Representation of Break-Induced Repair

Repair of the DSB leads to duplication of the template chromosome arm. A) Strand invasion by the 3' end of a DSB and D-loop formation. B) Formation of the replication fork and DNA synthesis (grey dotted arrow). C) Complete replication of the homologous template arm and ligation of the strands: HJ formation. D) Repair of the DSB by HJ resolution (small black arrow). Figure taken from Pardo et al. [132]

4.2.2.3.4 Single-Strand Annealing

Repair by Single-Strand Annealing (SSA) can occur when the DSB lies between repeated sequences [179]. If the resulting 3' ssDNA does not find a homologous sequence on the sister chromatid, resection continues, which can lead to the loss of several kilobases of 3' ssDNA [179]. After resection, the identification of homologous DNA sequences on the two strands results in their alignment, followed by removal of the non-homologous sequence by Rad1-Rad10 and the Msh2, Msh3 and Slx4 proteins [180-182]. Repair is then completed by elongation and a final ligation [179]. This repair mechanism deletes the sequence located between the repeats, which may or may not be critical for the function of the cell.
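The deletion outcome of SSA can be illustrated with a small string model. The Python sketch below is only a toy under stated assumptions: the sequence, the repeat and the break position are invented for illustration, resection and flap removal are not modeled, and the function name `ssa_repair` is hypothetical. It only shows that annealing the two repeats flanking the break leaves a single repeat copy and removes everything between them.

```python
def ssa_repair(sequence: str, repeat: str, break_pos: int) -> str:
    """Toy SSA outcome: collapse the two repeat copies flanking break_pos into one."""
    # Closest repeat copy lying entirely upstream of the break.
    left = sequence.rfind(repeat, 0, break_pos)
    # Closest repeat copy starting at or downstream of the break.
    right = sequence.find(repeat, break_pos)
    if left == -1 or right == -1:
        raise ValueError("the break is not flanked by two copies of the repeat")
    # Keep the upstream flank, one repeat copy, and the downstream flank.
    return sequence[:left] + sequence[right:]


# Hypothetical locus: two GATTACA repeats separated by an 8-nt spacer.
locus = "AAAA" + "GATTACA" + "CCCCTTTT" + "GATTACA" + "GGGG"
repaired = ssa_repair(locus, "GATTACA", break_pos=15)
deleted_nt = len(locus) - len(repaired)  # spacer plus one repeat copy
print(repaired, deleted_nt)
```

In this toy case the product loses the spacer and one repeat copy, which is the kind of deletion that may or may not matter depending on what the intervening sequence encodes.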

Figure 13: Representation of Single-Strand Annealing

A) Resection of the 5' end of the DSB and annealing of the complementary DNA repeats (grey tandem arrow). B) Cleavage of the non-homologous sequence (black arrow). C) Elongation. D) Ligation. Figure taken from Pardo et al. [132]

4.2.2.4 Importance of the repair process for genome editing

When an exogenous donor DNA is supplied to the cell, HR can be used to integrate DNA sequences flanked by homology arms upstream and downstream of the DSB. This makes it possible to introduce one or more point mutations, perform gene correction, or integrate a specific sequence. For example, a gene of interest, an affinity tag and/or a functional domain from another gene can be integrated [183, 184].

4.2.3 Temporal use of DNA repair mechanisms

The two repair pathways presented above compete directly, depending on the phase of the cell cycle the cell is in. NHEJ is known to be active at all times because it does not require a template: the two ends created by the DSB are religated directly. In contrast, HR is active only in the late S and G2/M phases, when sister chromatids are present. This property was recently exploited by coupling Cas9 expression to the cell cycle so as to increase the frequency of HR in vitro [185].

RESEARCH PROBLEM AND OBJECTIVES

Research problem

The overexpression-based techniques currently used to purify protein complexes place significant limits on the study of native protein complexes in human cells. This widely used approach can cause abnormal effects, such as the formation of non-specific interactions, mislocalization of the target proteins and cytotoxicity, which can lead to erroneous conclusions. It is therefore important to reproduce the normal physiological levels of the proteins under study as closely as possible.

In this context, this thesis aimed to test the following hypothesis: targeted integration at the AAVS1 locus of a tagged cDNA of interest makes it possible to purify protein complexes in a near-native context, in the absence of overexpression.

Research objectives

1. Develop a technique based on genome modification that allows the purification of protein complexes under conditions optimal for the study of their function and structure.

2. Establish human cell lines able to express toxic proteins.

Specific objectives

1.a) Use genome editing techniques to create stable human cell lines that express our cDNAs of interest at the AAVS1 GSH.

1.b) Perform tandem purification of the proteins expressed by the stable lines using the 3xFLAG/TwinStrep tandem tag.

1.c) Purify the MCM8 and EZH2 proteins.

2.a) Use the Tet-On 3G system to create a self-inducible expression system for the tagged target proteins.

CHAPTER 1

A Scalable Genome Editing-Based Approach for Mapping the Human Protein Interactome

Mathieu Dalvai 1,2,3, Jeremy Loehr 1,3, Karine Jacquet 1,2, Caroline C. Huard 1, Céline Roques 1,2, Pauline Herst 1,2, Jacques Côté 1,2 and Yannick Doyon 1

1Centre Hospitalier Universitaire de Québec Research Center and Faculty of Medicine,

Laval University, Quebec City, QC, Canada.

2St-Patrick Research Group in Basic Oncology and Laval University Cancer Research

Center, Quebec City, QC, Canada.

3Co-first authors.

Copyright © Cell Press From Cell Reports ®, 13, 621–633, October 20, 2015

Reprinted with the permission from Cell Press

Article published in Cell Reports 13, 621–633, October 20, 2015

FOREWORD

This work, carried out under the supervision of Dr. Yannick Doyon and Dr. Jacques Côté, was made possible by the collaboration of Dr. Mathieu Dalvai (postdoctoral fellow with Dr. Jacques Côté, Université Laval). Karine Jacquet (PhD student with Dr. Jacques Côté, Université Laval), Dr. Caroline Huard (research professional with Dr. Yannick Doyon, Université Laval), Céline Roques (research professional with Dr. Jacques Côté, Université Laval) and Dr. Pauline Herst (postdoctoral fellow with Dr. Jacques Côté, Université Laval) contributed to the conceptual design of the study. Dr. Mathieu Dalvai worked on the endogenous tagging component. My contribution to this study consisted of constructing the plasmids, establishing the stable cell lines, and obtaining and analyzing the results for the gene editing component at the AAVS1 locus. The manuscript was written by Dr. Yannick Doyon.

SUMMARY

Affinity purification coupled with mass spectrometry (AP-MS) analysis is a method of choice for studying protein-protein interactions in human cells. However, this technique is sensitive to perturbations caused by ectopic overexpression of the target proteins. Abnormal effects, such as aggregate formation and mislocalization of the target proteins, can lead to erroneous conclusions. It is therefore important to reproduce the normal physiological levels of the proteins under study as closely as possible. The work presented in this thesis describes the development of a robust and rapid system at the interface between genome editing and proteomics that allows the isolation of native protein complexes in their natural genomic context. Using the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) and transcription activator-like effector nuclease (TALEN) systems, we tagged several endogenous genes and purified numerous holoenzymes involved in DNA repair and chromatin modification to near homogeneity. We identified subunits and interactions within already well-characterized complexes and report the isolation of MCM8/9, highlighting the efficiency and robustness of this approach. The technique presented improves and simplifies the exploration of protein interactions as well as the study of biochemical activity, structure and function.

ABSTRACT

Conventional affinity purification followed by mass spectrometry (AP-MS) analysis

is a broadly applicable method to decipher molecular interaction networks and infer protein

function. However, it is sensitive to perturbations induced by ectopically overexpressed

target proteins and does not reflect multilevel physiological regulation in response to

diverse stimuli. Here we developed an interface between genome editing and proteomics to

isolate native protein complexes produced from their natural genomic contexts. We used

CRISPR/Cas9 and TALENs to tag endogenous genes and purified several DNA repair and

chromatin modifying holoenzymes to near homogeneity. We uncovered novel subunits and

interactions amongst well-characterized complexes and report the isolation of MCM8/9,

highlighting the efficiency and robustness of the approach. These methods improve and

simplify both small and large-scale explorations of protein interactions, as well as the study

of biochemical activities and structure-function relationships.

INTRODUCTION

The cell is composed of a collection of protein machines responsible for the

coordinated execution of cellular functions (Alberts, 1998). Deciphering the components

and activities of these molecular assemblies is crucial to understand the cellular networks

that are perturbed under disease states (Gavin et al., 2011; Rolland et al., 2014). For

example, the identification of cancer-driving mutations amongst a background of passenger

mutations can be improved by tying combinations of rare mutations to specific protein

complexes (Krogan et al., 2015; Leiserson et al., 2015; Rolland et al., 2014). Besides, the

growing list of cancer related genes often comprises uncharacterized proteins for which

focused proteomic studies could help uncover function (Lawrence et al., 2014; Vogelstein

et al., 2013).

Affinity purification followed by mass spectrometry (AP-MS) analysis is a powerful

approach to characterize protein-protein interactions and multiprotein complexes which are

defined as sets of stably associated proteins isolated under standardized biochemical

conditions (Gavin et al., 2011). Landmark studies describing genome-wide identification of

complexes in budding yeast generated high quality datasets by relying on two major

technical innovations (Gavin et al., 2006; Krogan et al., 2006). First, the development of

standardized tandem affinity purification (TAP) protocols allowed the isolation of

complexes by sequential capture and elution using pairs of affinity tags (Rigaut et al.,

1999). Second, systematic tagging of open reading frames on the chromosomes via homologous recombination made it possible to retain the physiological regulation of gene expression for the bait proteins.

While pioneering studies in Arabidopsis, Drosophila, and human cells using ectopic

expression of tagged proteins have yielded important insights into biological pathways in

higher eukaryotes (Behrends et al., 2010; Goudreault et al., 2009; Guruharsha et al., 2011;

Hegemann et al., 2011; Hutchins et al., 2010; Huttlin et al., 2015; Marcon et al., 2014;

Sardiu et al., 2008; Sowa et al., 2009; Van Leene et al., 2010), there is a need to keep

improving these methods in order to reduce perturbations in the stoichiometry of protein

interactions and prevent aberrant localization, protein aggregation, dominant negative

effects and toxicity (Doyon et al., 2011; Ho et al., 2002). For example, chromatin

modifying complexes require meticulous biochemical characterization since they often

exist in different forms with paralogous subunits and only their native assembly can

recapitulate their specificity for a given histone residue within chromatin (reviewed in

(Lalonde et al., 2014)). Hence, we developed straightforward methods to streamline the

mapping of protein-protein interactions in human cells under settings minimizing

deviations from their natural context. We used engineered nucleases such as zinc-finger

nucleases (ZFNs), TAL effector nucleases (TALENs) and clustered regularly interspaced

short palindromic repeats (CRISPR)/Cas9 to simplify the generation of cell lines with

tailored modifications and enable the expression of tagged proteins from their endogenous

loci (Hsu et al., 2014; Joung and Sander, 2013; Sternberg and Doudna, 2015; Urnov et al.,

2010). First, we exploited a genomic safe harbor locus to rapidly and reliably generate cell lines expressing bait proteins at near physiological levels as a surrogate to classical plasmid- or virus-based methods for stable cell line generation. Second, we introduced an affinity tag at the N- or C-terminus of proteins encoded by endogenous genes in order to retain the dynamic control of gene expression specified by the native chromosomal context and the natural post-transcriptional regulation mechanisms. Under these conditions, we obtained near-homogeneous preparations of native protein complexes in sufficient amounts to perform biochemical assays and identify their subunits via mass spectrometry analysis.

The tagged cell lines were also used for immunoprecipitation of cross-linked chromatin

fragments (Chromatin Immunoprecipitation, ChIP) to study protein-DNA interactions in

vivo under normalized conditions. These tools were portable to numerous proteins and

represent a general solution for protein isolation, complex identification, and genome

location analysis under physiological conditions. Importantly, the scalability and

adaptability of this system will open avenues for the systematic and unbiased mapping of

protein-protein networks in a variety of organisms.

EXPERIMENTAL PROCEDURES

Cell culture and transfection

K562 cells were obtained from the ATCC and maintained at 37 °C under 5% CO2

in RPMI medium supplemented with 10% FBS, penicillin-streptomycin and GlutaMAX.

When cultivated in Erlenmeyer or spinner flasks, 25 mM HEPES-NaOH pH 7.4 was added.

Cells were transfected using the Amaxa 4D-Nucleofector (Lonza) per manufacturer's

recommendations.

ZFN, TALEN, and CRISPR/Cas9 reagents

The AAVS1-targeting ZFNs and EZH2 TALENs (Addgene #36775 and #36776)

have been described (Hockemeyer et al., 2009; Reyon et al., 2012). The CMV-driven

human codon optimized Cas9 nuclease and nickase (Cas9 D10A) vectors (Addgene #41815

and #41816) and the U6-driven guide RNA (gRNA) vector to target human AAVS1 (T2

target sequence; Addgene #41818) have been described (Mali et al., 2013). The two U6-

driven guide RNA (gRNA) vectors to target human FANCF have been described (Tsai et

al., 2014). All other gRNA expression vectors were built in the MLM3636 (Addgene

#43860) backbone. Target sequences for EPC1 (GTGACGTAGCTTCCTCCGAG), EP400

(TGCCCTGACTACTGGCACGG), MBTD1 (ATCAAACAAGAGCCATGAGG) and

MCM8 (CCAGCTTCAAACTATGTAAA) were chosen according to a web-based CRISPR

design tool (Hsu et al., 2013). The DNA sequences encoding the gRNAs for EP400, MBTD1 and MCM8 were modified at position 1 to encode a 'G' to satisfy the transcription initiation requirement of the human U6 promoter. When present on a nuclease construct, FLAG epitopes were removed by subcloning.
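The position 1 adjustment can be written out as a short Python sketch. It assumes that "modified at position 1" means substituting the first base with G rather than prepending an extra G; the helper name `force_5prime_g` is hypothetical and is not part of any design tool. The target sequences are those listed above.

```python
# Target sequences as listed in the text.
TARGETS = {
    "EPC1": "GTGACGTAGCTTCCTCCGAG",
    "EP400": "TGCCCTGACTACTGGCACGG",
    "MBTD1": "ATCAAACAAGAGCCATGAGG",
    "MCM8": "CCAGCTTCAAACTATGTAAA",
}


def force_5prime_g(target: str) -> str:
    """Return the target with position 1 set to G (assumed substitution, not insertion)."""
    target = target.upper()
    if target.startswith("G"):
        return target  # already compatible with U6 initiation, left unchanged
    return "G" + target[1:]


for gene, seq in TARGETS.items():
    print(gene, seq, "->", force_5prime_g(seq))
```

Under this assumption EPC1 is left untouched because it already starts with G, which is consistent with only EP400, MBTD1 and MCM8 being listed as modified.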

Construction of plasmid donors for recombination

The AAVS1-TAP tagging plasmid (Addgene #68375) was assembled in the AAVS1

SA-2A-puro-pA vector (Hockemeyer et al., 2009) by inserting a synthetic DNA fragment

containing a SV40 late polyadenylation sequence, hPGK1 promoter and TAP tag sequence

in between the puromycin resistance gene and the BGH polyadenylation site. The donor

plasmids for tagging EPC1, EP400, MBTD1, EZH2 and FANCF were synthesized as

gBlocks gene fragments (Integrated DNA Technologies) and assembled using Zero Blunt

TOPO cloning kit (Life Technologies) or cloned by restriction into pUC19. The homology

arms for the MCM8 donor plasmid were amplified by PCR from K562 genomic DNA.

Targeted integration to the AAVS1 locus

One million cells were transfected with 1 µg of ZFN expression vector and 4 µg of donor construct. Simultaneous selection and cloning were performed for 10 days in methylcellulose-based semi-solid RPMI medium supplemented with 0.25 µg/ml puromycin starting 3 days post transfection. Clones were picked and expanded in 96-well plates for 3 days and transferred to 12-well plates for another 3 days before cells were harvested for western blot.

CRISPR/Cas9 and TALEN-driven targeted integration

For targeting using the CRISPR/Cas9 system, one million cells were transfected with 2 µg of gRNA plasmid, 2 µg of Cas9 vector and 4 µg of donor. For TALEN-driven integration, 2.5 µg of each vector and 4 µg of donor were transfected. Limiting dilution cloning was performed 3 days post transfection and targeted clones were identified via out-out PCR. For the experiments shown in Figure 6, 3 million cells were transfected at the same DNA ratios and the cells were cultivated in 10 ml of RPMI media in a T-75 flask for 3 days. The cells were then diluted to 2E5 cells/ml and grown in Erlenmeyer flasks with agitation until they reached a density of 1E6 cells/ml. Under these conditions we typically obtained 2E8 cells (200 ml culture at saturation) 7 days post transfection and 1E9 cells (1 l culture at saturation) 10 days post transfection.
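The quantities in this paragraph follow from simple proportional arithmetic, sketched below in Python. Two points are assumptions rather than statements from the protocol: "the same DNA ratios" is read as scaling each plasmid amount linearly with cell number, and saturation is taken as the stated density of 1E6 cells/ml. The function names are hypothetical.

```python
# Plasmid amounts per one million cells, as stated for CRISPR/Cas9 targeting.
UG_PER_MILLION_CELLS = {"gRNA plasmid": 2.0, "Cas9 vector": 2.0, "donor": 4.0}


def scale_dna(n_cells: float) -> dict[str, float]:
    """Scale plasmid amounts (µg) linearly with cell number (assumed reading)."""
    factor = n_cells / 1e6
    return {name: ug * factor for name, ug in UG_PER_MILLION_CELLS.items()}


def cells_at_saturation(volume_ml: float, density_per_ml: float = 1e6) -> float:
    """Expected cell number for a culture volume at the saturation density."""
    return volume_ml * density_per_ml


dna_for_3_million = scale_dna(3e6)        # amounts for the Figure 6 experiments
day7_cells = cells_at_saturation(200)     # 200 ml culture
day10_cells = cells_at_saturation(1000)   # 1 l culture
print(dna_for_3_million, day7_cells, day10_cells)
```

With these assumptions the two culture volumes give 2E8 and 1E9 cells, matching the yields reported for days 7 and 10.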

Tandem affinity purification (TAP)

Typically, nuclear extracts (Abmayr et al., 2006) were prepared from 1E9 to 3E9

cells (1 l to 3 l cultures at saturation), adjusted to 0.1% Tween-20 and ultracentrifuged at

100 000 x g for 45 min. Extracts were precleared with 300 µl of Sepharose CL-6B (Sigma),

then 250 µl of anti-FLAG M2 affinity resin (Sigma) was added for 2 h at 4°C. The beads

were then washed in Poly-Prep columns (Bio-Rad) with 40 column volumes (CV) of buffer

# 1 (20 mM HEPES-KOH pH 7.9, 10 % glycerol, 300 mM KCl, 0.1% Tween 20, 1 mM

DTT, 1X Halt protease and phosphatase inhibitor cocktail without EDTA (Pierce))

followed by 40 CV of buffer # 2 (20 mM HEPES-KOH pH 7.9, 10 % glycerol, 150 mM

KCl, 0.1% Tween 20, 1 mM DTT, 1X Halt protease and phosphatase inhibitor cocktail

without EDTA (Pierce)). Complexes were eluted with 5 CV of buffer # 2 supplemented

with 150 µg/ml 3x FLAG peptide (Sigma) for 1 h at 4°C. Next, fractions were mixed with

125 µl Strep-Tactin sepharose (IBA) affinity matrix for 1 h at 4°C and the beads were

washed with 40 CV of buffer # 2 in Poly-Prep columns (Bio-Rad). Complexes were eluted

in 2 fractions with 4 CV of buffer # 2 supplemented with 2.5 mM D-biotin, flash frozen in

liquid nitrogen and stored at -80°C. Typically, 15 µl of the first elution (3% of total) was loaded on Bolt or NuPAGE 4–12% Bis-Tris gels (Life Technologies) and analyzed by silver staining. For purifications from whole cell extracts (WCE), cells were washed twice with

PBS and lysed in buffer A (20 mM HEPES-KOH pH 7.9, 10 % glycerol, 300 mM KCl,

0.1% IGEPAL CA-630, 1 mM DTT, 1X Halt protease and phosphatase inhibitor cocktail

without EDTA (Pierce)) (ratio of 100 µl of lysis buffer per 1E6 cells) for 30 min at 4°C.

Extracts were centrifuged for 30 min at 17 000 x g and the purifications were performed as

described above.
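The volumes implied by this protocol can be checked with a short Python sketch. It assumes that one column volume (CV) equals the stated resin volume and that each of the two Strep-Tactin elution fractions is 4 CV; neither point is spelled out in the protocol, although the second is consistent with the statement that 15 µl is 3% of the first elution. All names are hypothetical.

```python
FLAG_RESIN_UL = 250    # anti-FLAG M2 affinity resin
STREP_RESIN_UL = 125   # Strep-Tactin sepharose


def cv_to_ml(column_volumes: float, resin_ul: float) -> float:
    """Convert column volumes to ml, assuming 1 CV equals the resin volume."""
    return column_volumes * resin_ul / 1000


def lysis_buffer_ml(n_cells: float, ul_per_million: float = 100) -> float:
    """Lysis buffer for whole cell extracts at 100 µl per 1E6 cells."""
    return n_cells / 1e6 * ul_per_million / 1000


flag_wash_ml = cv_to_ml(40, FLAG_RESIN_UL)      # per wash buffer (#1, then #2)
flag_elution_ml = cv_to_ml(5, FLAG_RESIN_UL)    # 3x FLAG peptide elution
strep_wash_ml = cv_to_ml(40, STREP_RESIN_UL)    # buffer #2 wash
strep_elution_ul = 4 * STREP_RESIN_UL           # one biotin elution fraction
loaded_fraction = 15 / strep_elution_ul         # share of the first elution on the gel

print(flag_wash_ml, flag_elution_ml, strep_wash_ml, strep_elution_ul, loaded_fraction)
print(lysis_buffer_ml(1e9))
```

Under these assumptions the first elution is 500 µl, so the 15 µl loaded on the gel is 3% of it, in agreement with the figure given in the text.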

HAT and HMT assay

TAP purified fractions were assayed for enzymatic activity on short

oligonucleosomes isolated from HeLa S3 cells as described (Doyon et al., 2004;

Musselman et al., 2012).

RESULTS

Characterization of a Potent Tandem Affinity Tag

An ideal TAP tag leads to high recovery of a fusion protein present at low

concentration with minimal background contaminants. It should be functional as both N- and C-terminal fusions, and its size and amino acid (AA) sequence should have no impact on protein function. In preliminary experiments we evaluated several tags, including the AC-TAP, SBP and FLAG-HA tags, and settled on a combination of 3xFLAG and 2xSTREP tags because both can be eluted under gentle conditions and yield highly purified material (see below) ((Doyon et al., 2006) and data not shown). Our version contains 59

AA for a predicted molecular weight of 6 kilodaltons (kDa) and does not necessitate

proteolytic cleavage (Figure 1A).

To fine-tune our purification scheme we selected the KAT5/TIP60 tumor suppressor protein, the catalytic subunit of the NuA4 (Nucleosome acetyltransferase of histone H4) complex that regulates gene expression, promotes DNA repair via homologous recombination and is required for embryonic stem cell self-renewal and differentiation (Steunou, 2014). NuA4 is composed of more than 15 subunits ranging from 20 to 400 kDa in size (Cai et al., 2005; Doyon et al., 2006; Doyon et al., 2004; Ikura et al., 2000). We first established two independent pools of cells expressing KAT5 fused to a C-terminal TAP tag and performed purifications from whole cell extracts to determine the proper order of steps required to achieve high yields and purity. We observed that the high binding capacity of the anti-FLAG M2 affinity resin is best used as a first step to recover the maximal amount of complexes, while the Strep-Tactin resin dramatically increases the purity of the final samples, yielding near-homogeneous preparations from low amounts of starting material (Figure S1).

Figure 1. ZFN-Driven Gene Addition to the AAVS1 Locus Simplifies Tandem Affinity Purification of Multisubunit Protein Complexes

A) Schematic of the donor construct and of the AAVS1 locus following cDNA addition. The first two exons of the PPP1R12C gene are shown as open boxes. Also annotated are the locations of the splice acceptor site (SA), 2A self-cleaving peptide sequence (2A), puromycin resistance gene (Puro), polyadenylation sequence (pA), human phosphoglycerate kinase 1 promoter (hPGK1), and 3xFLAG-2xSTREP tandem affinity tag (Tag); homology arms left and right (HA-L, HA-R) are respectively 800 and 840 bp. Sequence of the TAP tag. The 3xFLAG sequence is in bold, and 2xSTREP is in bold and underlined. B) Silver-stained SDS-PAGE showing the purified EPC1 complex. K562 cells expressing the tag (Mock) and a clonal cell line expressing EPC1-tag (EPC1). C) Silver-stained SDS-PAGE showing the purified EP400 complex from a clonal cell line expressing EP400-tag (EP400). Proteins were identified from unfractionated protein samples and assigned to specific gel bands based on extensive western blotting analysis. D) Western blots of selected NuA4 subunits on purified fractions. E) Autoradiogram showing the results of HAT assays to determine the specificity of the EPC1 and EP400 complexes. Coomassie staining was used as a loading control for histones. See also Figure S1.

Tandem-Affinity Purification Following Nuclease-Driven Gene Addition to the

AAVS1 Genomic Safe Harbor Locus

In order to rapidly generate isogenic cell lines expressing TAP tagged cDNAs, we

used nuclease-driven targeted integration into the human PPP1R12C gene, a safe harbor

genomic locus known as AAVS1 that allows stable transgenesis and neutral marking of the

cell (DeKelver et al., 2010; Hockemeyer et al., 2009; Lombardo et al., 2011). This system

is composed of a nuclease that cleaves the first intron of PPP1R12C and a gene-trap vector

allowing puromycin selection of targeted cells. We adapted this system to integrate tagged

cDNAs and used the moderately active human PGK1 promoter to achieve slight

overexpression conditions (Figure 1A and see below). We targeted two subunits of NuA4

to the AAVS1 locus and purified the associated complexes. We chose the enhancer of

polycomb homolog 1 (EPC1) and the E1A binding protein p400 (EP400) as model proteins.

Because of their sizes, 100 and 400 kDa respectively, we reasoned that they would provide a substantial test of biochemical stability during purification. Likewise, targeting required

error-free ZFN-driven addition of their relatively large expression cassettes of 4.3 kb and

11.3 kb.

K562 cells were selected as our model cell line because they can be cultivated,

expanded, and transfected with high efficiency as suspension cultures. They are also

permissive to genome editing events and tolerate cloning via either limiting dilution or in

semi-solid media. Moreover, as a designated tier 1 cell line used by all investigators of the

Encyclopedia of DNA Elements (ENCODE) project, a myriad of genomic and epigenomic

data is available for this cell line (Consortium, 2012).

We first transfected K562 cells with AAVS1-targeting ZFNs and the EPC1 donor

and selected clones in methylcellulose-containing media supplemented with puromycin

(Figure 1A). Single cell-derived colonies were picked after 10 days, expanded, and

transgene expression was monitored by western blot. In a typical experiment, more than 90% of the clones expressed the transgene with little variability (see Figure S1 for an

example). Nuclear extracts were prepared from an EPC1-tag cell line, as well as from K562

cells expressing only the tag (Mock), and subjected to TAP. EPC1 complex subunits

separated by SDS–polyacrylamide gel electrophoresis (SDS–PAGE) could be

unambiguously identified on silver stained gels due to the very low protein background

observed in the mock purification after double-affinity purification (Figure 1B). Mass

spectrometry analysis identified all known components of NuA4, in addition to MBTD1,

which was not previously ascertained as a core complex subunit (Cai et al., 2005; Doyon et

al., 2006; Doyon et al., 2004; Ikura et al., 2000) (Table S1). Next, we performed a

reciprocal purification of the complex by repeating the process with EP400, coding for the

large SWI/SNF2-family ATPase that incorporates histone variant H2A.Z into chromatin

(Weber and Henikoff, 2014). We obtained a complex that was indistinguishable from the EPC1 assembly (Figures 1C and 1D; Table S1). In contrast to previously published data, the EP400

complex contains the KAT5 catalytic subunit (Fuchs et al., 2001). It appears that

overexpression of EP400 that was achieved using retroviral-based gene delivery in the

previous study resulted in the purification of a partially assembled complex.

An important characteristic of our system is that the purified fractions can be

assayed biochemically in vitro. To determine if the purified complexes are recovered in

sufficient amounts and concentration, we performed histone acetyltransferase (HAT)

assays. We observed robust enzymatic activity for both complexes with substrate

specificity consistent with published reports for KAT5/NuA4 (Doyon et al., 2004; Ikura et

al., 2000) (Figure 1E). Again, these observations contrast with the minimal traces of HAT

activity previously observed for the EP400 complex and with the reported disconnection between these two

enzymatic activities (Fuchs et al., 2001; Park et al., 2010; Tyteca et al., 2006). Taken

together, these data demonstrate that coupling our TAP approach with gene targeting at

AAVS1 creates a highly efficient surrogate method for the isolation of native protein

complexes under near physiological conditions.

Table S1, Related to Figures 1 and 2. NuA4 Complex Subunits Identified by Mass Spectrometry

Analysis

Predicted molecular weights, total spectral counts and # of unique peptides for

identified proteins are indicated. AAVS1-EPC1/EP400 and CRISPR-EPC1/EP400 indicate

whether the subunits were expressed ectopically from the AAVS1 locus or from the

endogenous loci, respectively.

Tandem-Affinity Purification of Native Multisubunit Protein Complexes

following CRISPR/Cas9-Driven Tagging of Endogenous Genes

Having demonstrated that our purification strategy worked efficiently and

eliminated most background contaminants at the low expression levels obtained by

targeting the AAVS1 locus, we aimed to determine if proteins expressed from their natural

chromosomal contexts could be isolated with high specificity as physiologically regulated

endogenous complexes. We designed CRISPR/Cas9-based nucleases to cleave DNA in the

vicinity of the stop codon of EPC1 in order to stimulate homology-directed integration of

the TAP tag using a donor molecule containing short homology arms (Figures 2A and S2).

Following transfection of K562 cells with the nuclease and recombination donor, we

readily detected the incorporation of the tag in the pool of cells by PCR (Figure S3).

Targeted clones, isolated by limiting dilution in the absence of any selection, were observed at

a frequency of 6% (10/165) and accurate gene modification was confirmed by western blot

analysis and sequencing (Figures 2B, 2C and S3). We did not obtain homozygous clones

for this gene, but we note that PCR analysis suggests that three copies of the locus are

present in K562 cells (Figure S3). Comparison of the EPC1-tag protein levels expressed

from the AAVS1 locus versus the endogenous gene revealed an approximately 2.5-fold

overexpression in the former case (Figures 2G and S3). We performed side-by-side TAP

from an endogenously tagged EPC1 cell line, and from a clone targeted at AAVS1, and

observed matching banding patterns on silver stained gels (Figure S3). Mass spectrometry

analysis confirmed the complete coverage of the subunits when the complex is purified

from the endogenous locus (Table S1). To confirm these observations, we used a similar

approach to perform a reciprocal purification of the complex by tagging the endogenous

EP400 gene at its C-terminus (Figures 2D and S2). The efficiency of targeting in this case

was 21% (13/61, including 2 homozygous clones). For the purification, a homozygous

clone expressing tagged EP400 was selected (Figures 2E, 2F and S3). This clone expressed

approximately 2-fold less protein than the one used in the AAVS1 purification (Figures 2G

and S3). Both complexes could be purified to near homogeneity and displayed strong

H4/H2A-specific HAT activity towards nucleosomes (Figures 2H and 2I; Table S1). Since

MBTD1 was reproducibly detected in these preparations, we performed reciprocal tagging

to confirm its stable association with NuA4. CRISPR/Cas9-mediated integration of the

TAP tag at its C-terminus was efficient, yielding 26% positive clones (20/78, including 6

homozygous clones) (Figure S2 and data not shown). Purification of MBTD1 using a

homozygous clone confirmed its association with NuA4 (Figures 2J and 2K). Thus, our

approach led to the most complete characterization of native NuA4 components to date.

Not only were we able to conclusively demonstrate that EPC1 and EP400 exclusively

associate with the complex, but we also identified MBTD1 as a novel subunit. These data

demonstrate that highly efficient purification of native protein complexes is achievable with

minimal chromosomal sequence disruption and preservation of endogenous physiological

regulation of the bait protein.

Figure 2. CRISPR/Cas9-Driven Tagging of NuA4 Subunits Enables Reciprocal Tandem Affinity

Purification of the Endogenous Native Complexes

A) Schematic of the EPC1 locus, Cas9 target site, and donor construct used to insert the TAP tag to

the C terminus of the EPC1 protein. Annotated are the positions of the stop codon (TAG), the protospacer

adjacent motif (PAM) that specifies the cleavage site, and homology arms left and right (HA-L, HA-R). B)

Schematic and results of a PCR-based assay (out-out PCR) to detect targeted integration (TI) of the tag

sequence in single-cell-derived K562 clones obtained by limiting dilution. Primers are located outside of the

homology arms and are designed to yield a longer PCR product if the tag is inserted. C) Western blot showing

EPC1-tag protein expression in K562 clones. Mock indicates cells treated with donor and Cas9 nuclease in

the absence of gRNA. The FLAG M2 antibody was used to detect EPC1, and the GAPDH antibody was used

as a loading control. D) Targeting scheme for EP400, depicted as in (A). E) Same as (B), but for EP400. F)

EP400 expression monitored as in (C). G) 2-fold serial dilutions of whole-cell extracts prepared from AAVS1-

EPC1/EP400 and CRISPR-EPC1/EP400 cell lines were analyzed by western blot to determine the relative

expression of EPC1-tag proteins. Error bars indicate the SD from two independently performed experiments.

H) Silver-stained SDS-PAGE showing the purified EPC1 and EP400 complexes. Wild-type K562 cells

(Mock) and clonal cell lines expressing EPC1-tag (#112) and EP400-tag (#43) from their endogenous loci. I)

Autoradiogram showing the results of HAT assays to determine the specificity of the complexes. Coomassie

staining was used as a loading control for histones. J) Silver-stained SDS-PAGE showing the purified

MBTD1 complex. K) Western blots of selected NuA4 subunits on purified fractions. Wild-type K562 cells

(Mock). Proteins were identified from unfractionated protein samples and assigned to specific gel bands

based on extensive western blotting analysis. See also Figures S2 and S3.
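
The out-out PCR readout described in (B) reduces to a band-size comparison: primers sit outside the homology arms, so a tagged allele yields a product longer than the untagged one by the length of the inserted tag. The sketch below illustrates that logic only. The amplicon size, tag length, and size tolerance are hypothetical placeholders, not the values used in this study, and the function names are invented for illustration.

```python
def expected_bands(wild_type_bp, tag_bp):
    """Expected out-out PCR product sizes for untagged and tagged alleles."""
    if wild_type_bp <= 0 or tag_bp <= 0:
        raise ValueError("amplicon and tag sizes must be positive")
    return {"untagged": wild_type_bp, "tagged": wild_type_bp + tag_bp}


def classify_clone(observed_bp, wild_type_bp, tag_bp, tolerance_bp=25):
    """Classify a clone from the band sizes seen on a gel (hypothetical helper).

    tolerance_bp is an assumed sizing error; it must be small enough that the
    tagged and untagged bands cannot be confused with each other.
    """
    if tag_bp <= 2 * tolerance_bp:
        raise ValueError("tag too short to resolve at this tolerance")
    bands = expected_bands(wild_type_bp, tag_bp)
    has_untagged = any(abs(b - bands["untagged"]) <= tolerance_bp for b in observed_bp)
    has_tagged = any(abs(b - bands["tagged"]) <= tolerance_bp for b in observed_bp)
    if has_tagged and has_untagged:
        return "mixed (tagged and untagged alleles)"
    if has_tagged:
        return "all alleles tagged"
    if has_untagged:
        return "untagged"
    return "no expected band"


# Hypothetical example: 1000 bp untagged amplicon, 200 bp tag.
print(classify_clone([1000, 1195], wild_type_bp=1000, tag_bp=200))
```

A band-size call of this kind is only a first screen; the clones in this study were additionally verified by western blot and sequencing.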

TALEN-Enabled Purification of the Endogenous Polycomb Repressive

Complex 2

To test the generality of the TAP procedure, we undertook the purification of the

enhancer of zeste homolog 2 (EZH2), the catalytic subunit of the polycomb repressive

complex 2 (PRC2) responsible for the di- and tri-methylation of histone H3 at lysine 27

(H3K27me2/3), a histone mark that correlates with silent or poorly transcribed genomic

regions (Margueron and Reinberg, 2011). EZH2 is a critical regulator of development that

controls stem cell pluripotency and differentiation, and its deregulation is at the center of

novel therapeutic strategies for a variety of cancers (Plass et al., 2013). First, EZH2 with an

N-terminal TAP tag was targeted to the AAVS1 locus (Figure S1). Next, we replaced the

natural ATG of EZH2 with the TAP tag via homology-directed repair at the chromosomal

locus using a pair of TAL effector nucleases (TALENs) (Figures 3A and S4) (Reyon et al.,

2012). Twenty-one percent (20/96, including 6 homozygous clones) of screened clones had

a tagged allele as assessed by PCR, western blotting and sequencing (Figures 3B, 3C and

S4). A homozygous clone expressing exclusively the tagged EZH2 protein was selected for

TAP. Interestingly, in this context, the endogenous locus drives higher protein expression

levels as compared to the ectopically expressed protein from the AAVS1 locus (Figure S4).

The complexes were isolated from nuclear extracts prepared from both cell lines and

analyzed by SDS-PAGE and silver staining. Both preparations appeared similar as a

specific protein band pattern could be clearly identified from the gels when compared to the

mock fractions (Figures 3D and 3E). Co-purified proteins were analyzed by mass

spectrometry leading to the unambiguous identification of all known PRC2 components

(Table S2) (Margueron and Reinberg, 2011). Sub-stoichiometric interactions between

PRC2 and the EHMT1/2 complex, an H3K9 mono- and di-methyltransferase, were also

detected in the TALEN-derived clone (Figure 3F and Table S2). The interaction is not

mediated by co-localization/co-purification on DNA since it is not sensitive to treatment

with benzonase (Figure S4). This association uncovers a direct physical link that supports

the concept of an H3K27me/H3K9me switch for the long-term repression of differentiation

genes (Mozzetta et al., 2014). It also indicates that biallelic targeting of endogenous genes,

leading to exclusive expression of the tagged bait protein in the cell, can reveal labile and

important interactions. Lastly, the purified fractions contained robust enzymatic activity as

determined by histone methyltransferase (HMT) assays (Figure 3G).

Figure 3. Tandem Affinity Purification of the Native PRC2 Protein Complex

A) Schematic of the EZH2 locus, TALEN target site, and donor construct used to insert an affinity tag to

the N terminus of the EZH2 protein. Annotated are the positions of the start codon (ATG), the left and right

TALEN target sites, and homology arms left and right (HA-L, HA-R). B) Schematic and results of a PCR-based

assay (out-out PCR) to detect targeted integration (TI) of the tag sequence in single-cell-derived K562 clones

obtained by limiting dilution following TALEN-driven gene targeting. Primers are located outside of the

homology arms and are designed to yield a longer PCR product if the tag is inserted. C) Western blots

showing tag-EZH2 protein expression in the K562 clones. Mock indicates cells treated with donor only. The

FLAG M2 antibody was used to detect EZH2 and the actin antibody was used as a loading control. D) Silver-

stained SDS-PAGE showing the purified EZH2 complex expressed from the AAVS1 locus. K562 cells

expressing the tag (Mock) and a clonal cell line ectopically expressing tag-EZH2. E) Silver-stained SDS-PAGE

showing the purified EZH2 complex expressed from the endogenous locus. Wild-type K562 cells

(Mock) and a clonal cell line expressing tag-EZH2 (#63). Proteins were identified from unfractionated protein

samples and assigned to specific gel bands based on western blotting analysis and predicted molecular

weights. F) Western blots of selected PRC2 subunits on purified fractions shown in (D) and (E). G)

Autoradiogram showing the results of an HMT assay used to determine the specificity of the EZH2 complex.

Coomassie staining was used as a loading control for histones. H) Binding of endogenously tagged EZH2 and

EPC1 to target gene promoters as determined by ChIP-qPCR analysis. Values are expressed as % of input

chromatin. Error bars indicate the SD from two independently performed experiments. See also Figure S4.

Table S2, Related to Figure 3. PRC2 Complex Subunits Identified by Mass Spectrometry Analysis

Predicted molecular weights, total spectral counts and # of unique peptides for identified

proteins are indicated. AAVS1-EZH2 and TALEN-EZH2 indicate whether EZH2 was expressed

ectopically from the AAVS1 locus or from its endogenous locus, respectively.

Interactions Between Endogenously Tagged Chromatin Modifying Complexes

and Genomic DNA Regions

Chromatin immunoprecipitation (ChIP) is a powerful technique for investigating

interactions between proteins and DNA in the cell, allowing the identification of the genomic

regions bound by a protein of interest. We tested whether the endogenously tagged cell lines could

be used for ChIP with the well-characterized anti-FLAG M2 antibody and analyzed the

occupancy of tag-EZH2 and EPC1-tag at the HOXA9 locus and at the RPL36AL ribosomal

protein gene. We observed strong binding of EZH2 at HOXA9, a bona fide PRC2 target,

and marginal enrichment at RPL36AL (Figure 3H). In contrast, EPC1 was specifically

recruited to the highly transcribed RPL36AL gene, as expected for NuA4 (Figure 3H). The

effectiveness of ChIP assays and genome-wide location analysis currently relies on validated

antibodies against target proteins, the quality of which varies greatly. Thus, the use of a common tag added to

endogenous proteins will greatly enhance reproducibility, and importantly, enable accurate

comparison between different sets of factors in distinct growth conditions (Bradbury and

Pluckthun, 2015; Venters et al., 2011).
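
The ChIP-qPCR values in Figure 3H are expressed as % of input chromatin. The text does not detail the calculation, so the sketch below uses a widely used convention as an assumption: the input Ct is first corrected for the fraction of chromatin set aside as input, and the IP signal is then expressed relative to that adjusted value, assuming near-perfect amplification efficiency (a doubling per cycle). The function name and all Ct values are hypothetical and are not data from this study.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction, efficiency=2.0):
    """Percent-of-input for one ChIP-qPCR amplicon (illustrative convention).

    ct_ip          -- Ct measured for the immunoprecipitated sample
    ct_input       -- Ct measured for the input sample
    input_fraction -- fraction of chromatin saved as input (e.g. 0.01 for 1%)
    efficiency     -- amplification factor per cycle (2.0 = perfect doubling)
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    if efficiency <= 1:
        raise ValueError("efficiency must be greater than 1")
    # Ct the input would have shown if 100% of the chromatin had been used.
    adjusted_input_ct = ct_input - math.log(1.0 / input_fraction, efficiency)
    return 100.0 * efficiency ** (adjusted_input_ct - ct_ip)


# Hypothetical Ct values for a 1% input sample.
print(f"{percent_input(ct_ip=30.0, ct_input=25.0, input_fraction=0.01):.5f} % of input")
```

Because every bait is detected with the same anti-FLAG antibody, values computed this way can be compared across tagged factors without correcting for antibody-specific differences.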

Purification of Endogenous DNA Repair Complexes

To further exemplify the robustness of our strategy, we extended our studies to

proteins involved in DNA repair. The Fanconi anemia (FA) core complex is a multisubunit

ubiquitin ligase that initiates DNA repair of interstrand crosslinks (Walden and Deans,

2014). Based on the work of several groups, its subunit composition has been well

established and the core was defined as containing FANCA, FANCB, FANCC, FANCE,

FANCF, FANCG, FANCL, FAAP20, and FAAP100 (Walden and Deans, 2014). However,

some variations in its architecture have been observed, suggesting that the complex is composed of

sub-modules (Ali et al., 2012; Rajendra et al., 2014; Walden and Deans, 2014). We

attempted to purify the FA core complex through the FANCF protein as it is suggested to

act as a scaffolding subunit. We tagged the N-terminus of FANCF using the CRISPR/Cas9

double nicking strategy and selected a homozygous clone expressing exclusively the tagged

FANCF protein for TAP (Ran et al., 2013) (Figures 4A-4C and S5). In this experiment, five

percent (5/96, including 2 homozygous clones) of screened clones had a tagged allele. Our

approach successfully resulted in the purification of a core complex that is highly similar to

previously reported assemblies (Figure 4D and Table S3) (Ali et al., 2012; Rajendra et al.,

2014). In addition, subunits of the anchor complex (FANCM, FAAP24 and FAAP16) were

detected with the core complex in cells under normal cycling conditions (Walden and

Deans, 2014) (Table S3). Next, we attempted to purify minichromosome maintenance

complex component 8 (MCM8), a protein evolutionarily related to members of the MCM2-

7 replicative helicase family (Maiorano et al., 2006). MCM8 has previously been shown to

promote homologous recombination via its interaction with MCM9, but it is unknown whether

this association is exclusive (Lutzmann et al., 2012; Nishimura et al., 2012; Park et al.,

2013). C-terminal tagging of MCM8 clearly revealed the strict heterodimeric nature of the

complex (Figures 5 and S5; Table S4). Taken together, these findings establish the value of

using nuclease-mediated endogenous gene tagging to refine the composition and enzymatic

activities of protein complexes and highlight the robustness of our TAP strategy.
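
The targeting frequencies quoted in the Results can be recomputed directly from the reported clone counts. The short sketch below does only that arithmetic: the counts are those given in the text for EPC1, EP400, MBTD1, EZH2 and FANCF, while the dictionary layout and function name are illustrative choices.

```python
# (positive clones, clones screened) as reported in the text.
CLONE_COUNTS = {
    "EPC1 (CRISPR/Cas9)": (10, 165),
    "EP400 (CRISPR/Cas9)": (13, 61),
    "MBTD1 (CRISPR/Cas9)": (20, 78),
    "EZH2 (TALENs)": (20, 96),
    "FANCF (Cas9 double nickase)": (5, 96),
}


def targeting_frequency(positive, screened):
    """Percentage of screened clones carrying at least one tagged allele."""
    if screened <= 0 or not 0 <= positive <= screened:
        raise ValueError("need 0 <= positive <= screened and screened > 0")
    return 100.0 * positive / screened


for bait, (positive, screened) in CLONE_COUNTS.items():
    freq = targeting_frequency(positive, screened)
    print(f"{bait}: {positive}/{screened} = {freq:.1f}%")
```

Rounded to the nearest percent, these values match the 6%, 21%, 26%, 21% and 5% stated in the text.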

Figure 4. Tandem Affinity Purification of Endogenously Tagged Fanconi Anemia Core Complex

A) Strategy for CRISPR/Cas9-driven insertion of the TAP tag to the N terminus of the FANCF

protein. Schematic of the FANCF locus, Cas9 double nickase target sites, and donor construct. The positions

of the start codon (ATG) and protospacer adjacent motif (PAM) that specify nicking sites are shown. B)

Schematic and results of a PCR-based assay (out-out PCR) to detect targeted integration (TI) of the tag

sequence in a single-cell-derived clone obtained by limiting dilution following CRISPR/Cas9-driven gene

targeting. Primers are located outside of the homology arms and are designed to yield a longer PCR product if

the tag is inserted. C) Western blots showing tag-FANCF protein expression in K562 clones. Mock indicates

cells treated with GFP expression vector. The FLAG M2 antibody was used to detect FANCF, and the

tubulin antibody was used as a loading control. D) Silver-stained SDS-PAGE showing the purified FANCF

complex. Wild-type K562 cells (Mock) and a clonal cell line expressing tag-FANCF (#12). Proteins were

identified from unfractionated protein samples and assigned to specific gel bands based on predicted

molecular weights. See also Figure S5.

Figure 5. Tandem Affinity Purification of Endogenously Tagged Minichromosome Maintenance

Complex Component 8

A) Strategy for CRISPR/Cas9-driven insertion of the TAP tag to the C terminus of the MCM8

protein. Schematic of the MCM8 locus, wild-type Cas9 target site, and donor construct. Annotated are the

positions of the stop codon (TAA), target site, and protospacer adjacent motif (PAM) that specifies the

cleavage site. B) Schematic and results of a PCR-based assay (in-out PCR) to detect targeted integration (TI)

of the tag sequence in a single-cell-derived clone obtained by limiting dilution following CRISPR/Cas9-driven

gene targeting. In this particular case, one primer is located upstream of the homology arm and one

binds the right homology arm to yield a longer PCR product if the tag is inserted. C) Western blots showing

MCM8-tag protein expression in K562 clone #75. Mock indicates cells treated with donor and Cas9 nuclease in

the absence of gRNA. The FLAG M2 antibody was used to detect MCM8, and the GAPDH antibody was

used as a loading control. D) Silver-stained SDS-PAGE showing the purified MCM8 complex. Wild-type

K562 cells (Mock) and a clonal cell line expressing MCM8-tag (#75). Proteins were identified from

unfractionated protein samples and assigned to specific gel bands based on predicted molecular weights. See

also Figure S5.

Towards High-Throughput Genome-Scale Purification of Native Endogenous

Protein Complexes

The combined robustness of the endogenous targeting and purification methods led

us to test whether it was possible to purify the native protein complexes from the pool of

cells shortly after transfection. If such a scheme were successful, it would offer the unique

opportunity to easily and rapidly tag all human genes at their genomic location and isolate

complexes. We transfected cells with the EP400 CRISPR/Cas9 nuclease and donor vectors

and expanded cells for 7 days before harvesting them for TAP (Figure 6A).

Over the course of the experiment, we monitored the presence of targeted alleles in the cell

population via PCR (Figures 6B and 6C). The EP400 complex could be efficiently purified

from whole cell extracts with all its associated subunits (Figure 6D). Then, we confirmed

this observation with the EZH2 reagents and obtained an almost pure complex 10 days post

transfection (Figures 6E-6G). It is critical to mention that no selection or cell sorting was

used to enrich for tagged cells in the population before attempting the purifications. Thus,

the requirement to generate genome-edited cell clones can be bypassed such that protein

complexes can be directly isolated from gene-modified cell pools. These data constitute a

blueprint for high-throughput genome-scale charting of physiological protein-protein

interactions in human cells.

Figure 6. Efficient Complex Purification from Unselected Gene-Modified Cell Pools

A) Timeline of the experiment. B) Schematic of a PCR-based assay (out-out PCR) to detect targeted

integration (TI) of the tag sequence at the C terminus of EP400. Primers are located outside of the homology

arms and are designed to yield a longer PCR product if the tag is inserted. C) Results of an out-out PCR assay

conducted on genomic DNA from K562 cells transfected with (1) EGFP expression vector and EP400 donor

(Mock) and (2) wild-type Cas9 expression vector, gRNA #3, and EP400 donor (EP400). Cells were collected

at various time points post-transfection. D) Silver-stained SDS-PAGE showing the purified EP400 complex

isolated from the cell pool 7 days post-transfection. E) Same as in (B) but for the N-terminal tagging of

EZH2. F) Same as in (C) but using the EZH2 TALENs and donor. G) Silver-stained SDS-PAGE showing the

purified EZH2 complex isolated from the cell pool 10 days post-transfection. Proteins were identified from

unfractionated protein samples and assigned to specific gel bands based on western blotting analysis and

predicted molecular weights.

DISCUSSION

USES AND LIMITATIONS OF THE AAVS1 SAFE HARBOR ECTOPIC

EXPRESSION SYSTEM

The methods described here offer unique advantages as compared to traditional

approaches used to study protein complexes in metazoan cells. The purification strategy not

only allows the identification of associated proteins by mass spectrometry, but the purity

and yields of the preparations are sufficient to perform enzymatic and mechanistic studies

in vitro. It is worth noting that gene targeting at AAVS1 can be performed in any human

cell line since the nuclease target site is naturally occurring, unlike the Flp-In

system, which requires prior integration of a Flp Recombination Target (FRT) site into the

genome. Biosafety issues associated with the use of lentiviral vectors (Biosafety level 2

containment) are also avoided. Thus, it offers greater flexibility of use than traditional

ectopic expression systems used for proteomic analysis. Note that ZFNs, TALENs and

CRISPR/Cas9 nucleases targeting AAVS1 can be used interchangeably (Figure S1)

(Hockemeyer et al., 2009; Hockemeyer et al., 2011; Mali et al., 2013). The generation of

cell lines expressing near physiological levels of bait proteins using the AAVS1 targeting

system is straightforward and can be used to rapidly isolate a protein of interest and to

perform reciprocal purifications in order to confirm the stable association of a subunit with

a protein complex. It can also be used to generate panels of variants under isogenic settings

and for the analysis of specific splicing isoforms. As an example, the long and short

isoforms of JADE1 regulate the presence of ING4/5 tumor suppressor proteins in the

KAT7/HBO1 histone acetyltransferase complex (Figure S6) (Saksouk et al., 2009). One

can also contemplate using this system to test the function of several mutants in the

background of a cellular gene knockout. This could be especially useful for the functional

study of essential genes. In addition, it offers a surrogate to study protein fusions resulting

from complex genomic rearrangements that cannot currently be modeled via genome

editing. As these protein fusions are often “toxic” when expressed in cells, we established

a single-vector autoregulated Tet-On expression system permitting tightly controlled

inducible expression of target proteins (Figure S6). Of course, this system requires

obtaining and subcloning the cDNA of interest, which could be troublesome for very large

proteins. However, this does not seem to be a major limitation as we successfully purified

the EP400 complex (400 kDa) and obtained highly purified 350 kDa Ataxia telangiectasia

mutated (ATM) kinase, a critical activator of the DNA damage response that is notoriously

challenging to purify (Figure S6) (Shiloh and Ziv, 2013). Importantly, the use of standard

and characterized nucleases can minimize the risk of confounding results due to off-target

mutagenesis.

ADVANTAGES AND LIMITATIONS OF ENDOGENOUS TAGGING

Seamless tagging of genes at their natural chromosomal locations preserves the

physiological regulation of the bait protein expression, and of its splicing variants. Since

the epitope tag inserted is of very small size and is not linked to a drug resistance gene or

other extraneous elements, the natural 5’ and 3’ UTRs are minimally perturbed and no sequence

is lost. Cis-acting regulatory elements are maintained within (introns) and outside

(promoters/enhancers) transcribed regions. Thus, native regulatory mechanisms of protein

expression are retained and the impact of various stimuli that modulate isoform ratios and

interactors can be more precisely studied. Current limitations linked to constitutive

overexpression of bait proteins leading to higher rates of false-positive and false-negative

interactions are therefore avoided. As examples, our approach settled the score on the

KAT5 and EP400 enzymatic activities, clearly establishing them as part of a single stable

macromolecular assembly in vivo. Moreover, this approach led to the identification of

MBTD1 as a novel subunit of NuA4. It also demonstrates a direct physical link, in solution,

between H3K27 and H3K9 histone methyltransferases, an interaction previously suggested

to occur only through colocalization on the genome during development (Alekseyenko et

al., 2014). Histone and residue specificity of chromatin modifying enzymes has been

marred on multiple occasions over the years by several conflicting and debated results in

the literature. Still today, the specificity of many enzymes is open to interpretation. We feel

that our approach using endogenous activities along with native substrates will lead to a

more coherent picture helping us to better understand the dynamic nature of the epigenome

during development and disease (Lalonde et al., 2014).

The wide availability of research reagents for CRISPR/Cas9 and TALENs greatly

facilitates the design and construction of custom reagents. All nuclease-based reagents

described here were obtained from the plasmid repository Addgene. We provide detailed

examples of donor design (Figures S2, S4 and S5) in order to facilitate the implementation

and adaptation of the strategy to user-specific contexts. The use of short homology arms

facilitates the construction of donor vectors as they can be synthesized as DNA fragments

smaller than 1 kb in length. For sequences that are AT or GC rich, PCR-based amplification

of the homology arms might be required. Potential problems can be encountered if highly

repetitive elements are found in proximity of the ATG or STOP codons, in which case the

length of the arms should be reduced to avoid the presence of repeats in the donor.

However, this is not essential as the MCM8 donor described in this study contains

repetitive elements and successful targeting was achieved. Apart from strict biochemical

considerations, one should take into account the structure of the gene, the various protein

isoforms produced and adjacent genetic elements when choosing to tag either the 5’ or 3’

end of genes.

A legitimate concern with genome editing techniques is the possibility of

off-target mutagenesis. In our experiments, we used TALENs, wild-type Cas9 and Cas9 D10A

(dual nickase) interchangeably and did not observe overt toxicity or loss of targeted cells.

Continuing progress in the field aiming to increase the precision of genome editing should

progressively decrease the risk of obtaining confounding results. However, we note that the

targeting specificity for the TAP tag using circular DNA donors benefits from the fact that

homology-directed repair (HDR) mechanisms are required for integration of donor

sequences. Thus, random integration of the donor is minimal because there are no

homologous sequences at these potential off-targets. Nevertheless, on-target mutagenesis

via non-homologous end joining (NHEJ) can result in small insertions and deletions

(indels) at the non-targeted allele and care should be taken to carefully genotype both

tagged and untagged alleles in clonally derived cell lines (see Figures S3 and S5 for

examples). We note that it is possible to design the targeting strategy to minimize the

potential impact of such mutagenic events (see Figures S2, S4 and S5). In this study, we

used clones with biallelic integrations of the TAP tag when possible. This has an additional

advantage as all molecules of the bait proteins are tagged in the cell.

Lastly, our data suggest that high-throughput characterization of protein-protein

interactions under physiological conditions is achievable since protein complexes can be

efficiently purified from mixed cell populations containing only a fraction of tagged alleles

(Figure 6). As genome-scale CRISPR-Cas9 knockout screening in human cells is now a

reality (Shalem et al., 2014; Wang et al., 2014), it will be possible to adapt these methods to

generate nucleases targeting either the N- or C-terminus of every human protein. Thus, a

genome-scale proteomic approach of endogenous human proteins using this strategy seems

eminently feasible.

REFERENCES

Abmayr, S.M., Yao, T., Parmely, T., and Workman, J.L. (2006). Preparation of

nuclear and cytoplasmic extracts from mammalian cells. Curr Protoc Mol Biol Chapter 12,

Unit 12 11.

Alberts, B. (1998). The cell as a collection of protein machines: preparing the next

generation of molecular biologists. Cell 92, 291-294.

Alekseyenko, A.A., Gorchakov, A.A., Kharchenko, P.V., and Kuroda, M.I. (2014).

Reciprocal interactions of human C10orf12 and C17orf96 with PRC2 revealed by BioTAP-

XL cross-linking and affinity purification. Proc Natl Acad Sci U S A 111, 2488-2493.

Ali, A.M., Pradhan, A., Singh, T.R., Du, C., Li, J., Wahengbam, K., Grassman, E.,

Auerbach, A.D., Pang, Q., and Meetei, A.R. (2012). FAAP20: a novel ubiquitin-binding

FA nuclear core-complex protein required for functional integrity of the FA-BRCA DNA

repair pathway. Blood 119, 3285-3294.

Behrends, C., Sowa, M.E., Gygi, S.P., and Harper, J.W. (2010). Network

organization of the human autophagy system. Nature 466, 68-76.

Bradbury, A., and Pluckthun, A. (2015). Reproducibility: Standardize antibodies

used in research. Nature 518, 27-29.

Cai, Y., Jin, J., Florens, L., Swanson, S.K., Kusch, T., Li, B., Workman, J.L.,

Washburn, M.P., Conaway, R.C., and Conaway, J.W. (2005). The mammalian YL1 protein

is a shared subunit of the TRRAP/TIP60 histone acetyltransferase and SRCAP complexes.

The Journal of biological chemistry 280, 13665-13670.

Consortium, E.P. (2012). An integrated encyclopedia of DNA elements in the

human genome. Nature 489, 57-74.

DeKelver, R.C., Choi, V.M., Moehle, E.A., Paschon, D.E., Hockemeyer, D.,

Meijsing, S.H., Sancak, Y., Cui, X., Steine, E.J., Miller, J.C., et al. (2010). Functional

genomics, proteomics, and regulatory DNA analysis in isogenic settings using zinc finger

nuclease-driven transgenesis into a safe harbor locus in the human genome. Genome

research 20, 1133-1142.

Doyon, J.B., Zeitler, B., Cheng, J., Cheng, A.T., Cherone, J.M., Santiago, Y., Lee,

A.H., Vo, T.D., Doyon, Y., Miller, J.C., et al. (2011). Rapid and efficient clathrin-mediated

endocytosis revealed in genome-edited mammalian cells. Nat Cell Biol 13, 331-337.

Doyon, Y., Cayrou, C., Ullah, M., Landry, A.J., Cote, V., Selleck, W., Lane, W.S.,

Tan, S., Yang, X.J., and Cote, J. (2006). ING tumor suppressor proteins are critical

regulators of chromatin acetylation required for genome expression and perpetuation. Mol

Cell 21, 51-64.

Doyon, Y., Selleck, W., Lane, W.S., Tan, S., and Cote, J. (2004). Structural and

functional conservation of the NuA4 histone acetyltransferase complex from yeast to

humans. Mol Cell Biol 24, 1884-1896.

Fuchs, M., Gerber, J., Drapkin, R., Sif, S., Ikura, T., Ogryzko, V., Lane, W.S.,

Nakatani, Y., and Livingston, D.M. (2001). The p400 complex is an essential E1A

transformation target. Cell 106, 297-307.

Gavin, A.C., Aloy, P., Grandi, P., Krause, R., Boesche, M., Marzioch, M., Rau, C.,

Jensen, L.J., Bastuck, S., Dumpelfeld, B., et al. (2006). Proteome survey reveals modularity

of the yeast cell machinery. Nature 440, 631-636.

Gavin, A.C., Maeda, K., and Kuhner, S. (2011). Recent advances in charting

protein-protein interaction: mass spectrometry-based approaches. Current opinion in

biotechnology 22, 42-49.

Goudreault, M., D'Ambrosio, L.M., Kean, M.J., Mullin, M.J., Larsen, B.G.,

Sanchez, A., Chaudhry, S., Chen, G.I., Sicheri, F., Nesvizhskii, A.I., et al. (2009). A PP2A

phosphatase high density interaction network identifies a novel striatin-interacting

phosphatase and kinase complex linked to the cerebral cavernous malformation 3 (CCM3)

protein. Molecular & cellular proteomics : MCP 8, 157-171.

Guruharsha, K.G., Rual, J.F., Zhai, B., Mintseris, J., Vaidya, P., Vaidya, N.,

Beekman, C., Wong, C., Rhee, D.Y., Cenaj, O., et al. (2011). A protein complex network

of Drosophila melanogaster. Cell 147, 690-703.

Hegemann, B., Hutchins, J.R., Hudecz, O., Novatchkova, M., Rameseder, J.,

Sykora, M.M., Liu, S., Mazanek, M., Lenart, P., Heriche, J.K., et al. (2011). Systematic

phosphorylation analysis of human mitotic protein complexes. Science signaling 4, rs12.

Ho, Y., Gruhler, A., Heilbut, A., Bader, G.D., Moore, L., Adams, S.L., Millar, A.,

Taylor, P., Bennett, K., Boutilier, K., et al. (2002). Systematic identification of protein

complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415, 180-183.

Hockemeyer, D., Soldner, F., Beard, C., Gao, Q., Mitalipova, M., DeKelver, R.C.,

Katibah, G.E., Amora, R., Boydston, E.A., Zeitler, B., et al. (2009). Efficient targeting of

expressed and silent genes in human ESCs and iPSCs using zinc-finger nucleases. Nature



Hockemeyer, D., Wang, H., Kiani, S., Lai, C.S., Gao, Q., Cassady, J.P., Cost, G.J.,

Zhang, L., Santiago, Y., Miller, J.C., et al. (2011). Genetic engineering of human

pluripotent cells using TALE nucleases. Nature biotechnology 29, 731-734.

Hsu, P.D., Lander, E.S., and Zhang, F. (2014). Development and applications of

CRISPR-Cas9 for genome engineering. Cell 157, 1262-1278.

Hsu, P.D., Scott, D.A., Weinstein, J.A., Ran, F.A., Konermann, S., Agarwala, V.,

Li, Y., Fine, E.J., Wu, X., Shalem, O., et al. (2013). DNA targeting specificity of RNA-

guided Cas9 nucleases. Nature biotechnology 31, 827-832.

Hutchins, J.R., Toyoda, Y., Hegemann, B., Poser, I., Heriche, J.K., Sykora, M.M.,

Augsburg, M., Hudecz, O., Buschhorn, B.A., Bulkescher, J., et al. (2010). Systematic

analysis of human protein complexes identifies chromosome segregation proteins. Science

328, 593-599.

Huttlin, E.L., Ting, L., Bruckner, R.J., Gebreab, F., Gygi, M.P., Szpyt, J., Tam, S.,

Zarraga, G., Colby, G., Baltier, K., et al. (2015). The BioPlex Network: A Systematic

Exploration of the Human Interactome. Cell 162, 425-440.

Ikura, T., Ogryzko, V.V., Grigoriev, M., Groisman, R., Wang, J., Horikoshi, M.,

Scully, R., Qin, J., and Nakatani, Y. (2000). Involvement of the TIP60 histone acetylase

complex in DNA repair and apoptosis. Cell 102, 463-473.

Joung, J.K., and Sander, J.D. (2013). TALENs: a widely applicable technology for

targeted genome editing. Nat Rev Mol Cell Biol 14, 49-55.

Krogan, N.J., Cagney, G., Yu, H., Zhong, G., Guo, X., Ignatchenko, A., Li, J., Pu,

S., Datta, N., Tikuisis, A.P., et al. (2006). Global landscape of protein complexes in the

yeast Saccharomyces cerevisiae. Nature 440, 637-643.

Krogan, N.J., Lippman, S., Agard, D.A., Ashworth, A., and Ideker, T. (2015). The

Cancer Cell Map Initiative: Defining the Hallmark Networks of Cancer. Mol Cell 58, 690-

698.

Lalonde, M.E., Cheng, X., and Cote, J. (2014). Histone target selection within

chromatin: an exemplary case of teamwork. Genes Dev 28, 1029-1041.

Lawrence, M.S., Stojanov, P., Mermel, C.H., Robinson, J.T., Garraway, L.A.,

Golub, T.R., Meyerson, M., Gabriel, S.B., Lander, E.S., and Getz, G. (2014). Discovery

and saturation analysis of cancer genes across 21 tumour types. Nature 505, 495-501.


Leiserson, M.D., Vandin, F., Wu, H.T., Dobson, J.R., Eldridge, J.V., Thomas, J.L.,

Papoutsaki, A., Kim, Y., Niu, B., McLellan, M., et al. (2015). Pan-cancer network analysis

identifies combinations of rare somatic mutations across pathways and protein complexes.

Nature genetics 47, 106-114.

Lombardo, A., Cesana, D., Genovese, P., Di Stefano, B., Provasi, E., Colombo,

D.F., Neri, M., Magnani, Z., Cantore, A., Lo Riso, P., et al. (2011). Site-specific integration

and tailoring of cassette design for sustainable gene transfer. Nature methods 8, 861-869.

Lutzmann, M., Grey, C., Traver, S., Ganier, O., Maya-Mendoza, A., Ranisavljevic,

N., Bernex, F., Nishiyama, A., Montel, N., Gavois, E., et al. (2012). MCM8- and MCM9-

deficient mice reveal gametogenesis defects and genome instability due to impaired

homologous recombination. Mol Cell 47, 523-534.

Maiorano, D., Lutzmann, M., and Mechali, M. (2006). MCM proteins and DNA

replication. Curr Opin Cell Biol 18, 130-136.

Mali, P., Yang, L., Esvelt, K.M., Aach, J., Guell, M., DiCarlo, J.E., Norville, J.E.,

and Church, G.M. (2013). RNA-guided human genome engineering via Cas9. Science 339,

823-826.

Marcon, E., Ni, Z., Pu, S., Turinsky, A.L., Trimble, S.S., Olsen, J.B., Silverman-

Gavrila, R., Silverman-Gavrila, L., Phanse, S., Guo, H., et al. (2014). Human-chromatin-

related protein interactions identify a demethylase complex required for chromosome

segregation. Cell reports 8, 297-310.

Margueron, R., and Reinberg, D. (2011). The Polycomb complex PRC2 and its

mark in life. Nature 469, 343-349.

Mozzetta, C., Pontis, J., Fritsch, L., Robin, P., Portoso, M., Proux, C., Margueron,

R., and Ait-Si-Ali, S. (2014). The histone H3 lysine 9 methyltransferases G9a and GLP

regulate polycomb repressive complex 2-mediated gene silencing. Mol Cell 53, 277-289.

Musselman, C.A., Avvakumov, N., Watanabe, R., Abraham, C.G., Lalonde, M.E.,

Hong, Z., Allen, C., Roy, S., Nunez, J.K., Nickoloff, J., et al. (2012). Molecular basis for

H3K36me3 recognition by the Tudor domain of PHF1. Nat Struct Mol Biol 19, 1266-1272.

Nishimura, K., Ishiai, M., Horikawa, K., Fukagawa, T., Takata, M., Takisawa, H.,

and Kanemaki, M.T. (2012). Mcm8 and Mcm9 form a complex that functions in

homologous recombination repair induced by DNA interstrand crosslinks. Mol Cell 47,

511-522.


Park, J., Long, D.T., Lee, K.Y., Abbas, T., Shibata, E., Negishi, M., Luo, Y.,

Schimenti, J.C., Gambus, A., Walter, J.C., et al. (2013). The MCM8-MCM9 complex

promotes RAD51 recruitment at DNA damage sites to facilitate homologous

recombination. Mol Cell Biol 33, 1632-1644.

Park, J.H., Sun, X.J., and Roeder, R.G. (2010). The SANT domain of p400 ATPase

represses acetyltransferase activity and coactivator function of TIP60 in basal p21 gene

expression. Mol Cell Biol 30, 2750-2761.

Plass, C., Pfister, S.M., Lindroth, A.M., Bogatyrova, O., Claus, R., and Lichter, P.

(2013). Mutations in regulators of the epigenome and their connections to global chromatin

patterns in cancer. Nature reviews. Genetics 14, 765-780.

Rajendra, E., Oestergaard, V.H., Langevin, F., Wang, M., Dornan, G.L., Patel, K.J.,

and Passmore, L.A. (2014). The genetic and biochemical basis of FANCD2

monoubiquitination. Mol Cell 54, 858-869.

Ran, F.A., Hsu, P.D., Lin, C.Y., Gootenberg, J.S., Konermann, S., Trevino, A.E.,

Scott, D.A., Inoue, A., Matoba, S., Zhang, Y., et al. (2013). Double nicking by RNA-

guided CRISPR Cas9 for enhanced genome editing specificity. Cell 154, 1380-1389.

Reyon, D., Tsai, S.Q., Khayter, C., Foden, J.A., Sander, J.D., and Joung, J.K.

(2012). FLASH assembly of TALENs for high-throughput genome editing. Nature


Rigaut, G., Shevchenko, A., Rutz, B., Wilm, M., Mann, M., and Seraphin, B.

(1999). A generic protein purification method for protein complex characterization and

proteome exploration. Nature biotechnology 17, 1030-1032.

Rolland, T., Tasan, M., Charloteaux, B., Pevzner, S.J., Zhong, Q., Sahni, N., Yi, S.,

Lemmens, I., Fontanillo, C., Mosca, R., et al. (2014). A proteome-scale map of the human

interactome network. Cell 159, 1212-1226.

Saksouk, N., Avvakumov, N., Champagne, K.S., Hung, T., Doyon, Y., Cayrou, C.,

Paquet, E., Ullah, M., Landry, A.J., Cote, V., et al. (2009). HBO1 HAT complexes target

chromatin throughout gene coding regions via multiple PHD finger interactions with

histone H3 tail. Mol Cell 33, 257-265.

Sardiu, M.E., Cai, Y., Jin, J., Swanson, S.K., Conaway, R.C., Conaway, J.W.,

Florens, L., and Washburn, M.P. (2008). Probabilistic assembly of human protein

interaction networks from label-free quantitative proteomics. Proceedings of the National

Academy of Sciences of the United States of America 105, 1454-1459.


Shalem, O., Sanjana, N.E., Hartenian, E., Shi, X., Scott, D.A., Mikkelsen, T.S.,

Heckl, D., Ebert, B.L., Root, D.E., Doench, J.G., et al. (2014). Genome-scale CRISPR-

Cas9 knockout screening in human cells. Science 343, 84-87.

Shiloh, Y., and Ziv, Y. (2013). The ATM protein kinase: regulating the cellular

response to genotoxic stress, and more. Nat Rev Mol Cell Biol 14, 197-210.

Sowa, M.E., Bennett, E.J., Gygi, S.P., and Harper, J.W. (2009). Defining the human

deubiquitinating enzyme interaction landscape. Cell 138, 389-403.

Sternberg, S.H., and Doudna, J.A. (2015). Expanding the Biologist's Toolkit with

CRISPR-Cas9. Mol Cell 58, 568-574.

Steunou, A.L., Rossetto, D., and Côté, J. (2014). Regulating Chromatin by Histone

Acetylation. In Fundamentals of Chromatin Workman, J.L., Abmayr, S.M., ed. (Springer-

Verlag New York), 147-212.

Tsai, S.Q., Wyvekens, N., Khayter, C., Foden, J.A., Thapar, V., Reyon, D.,

Goodwin, M.J., Aryee, M.J., and Joung, J.K. (2014). Dimeric CRISPR RNA-guided FokI

nucleases for highly specific genome editing. Nature biotechnology 32, 569-576.

Tyteca, S., Vandromme, M., Legube, G., Chevillard-Briet, M., and Trouche, D.

(2006). Tip60 and p400 are both required for UV-induced apoptosis but play antagonistic

roles in cell cycle progression. EMBO J 25, 1680-1689.

Urnov, F.D., Rebar, E.J., Holmes, M.C., Zhang, H.S., and Gregory, P.D. (2010).

Genome editing with engineered zinc finger nucleases. Nature reviews. Genetics 11, 636-

646.

Van Leene, J., Hollunder, J., Eeckhout, D., Persiau, G., Van De Slijke, E., Stals, H.,

Van Isterdael, G., Verkest, A., Neirynck, S., Buffel, Y., et al. (2010). Targeted

interactomics reveals a complex core cell cycle machinery in Arabidopsis thaliana.

Molecular systems biology 6, 397.

Venters, B.J., Wachi, S., Mavrich, T.N., Andersen, B.E., Jena, P., Sinnamon, A.J.,

Jain, P., Rolleri, N.S., Jiang, C., Hemeryck-Walsh, C., et al. (2011). A comprehensive

genomic binding map of gene and chromatin regulatory proteins in Saccharomyces. Mol

Cell 41, 480-492.

Vogelstein, B., Papadopoulos, N., Velculescu, V.E., Zhou, S., Diaz, L.A., Jr., and

Kinzler, K.W. (2013). Cancer genome landscapes. Science 339, 1546-1558.


Walden, H., and Deans, A.J. (2014). The Fanconi anemia DNA repair pathway:

structural and functional insights into a complex disorder. Annu Rev Biophys 43, 257-278.

Wang, T., Wei, J.J., Sabatini, D.M., and Lander, E.S. (2014). Genetic screens in

human cells using the CRISPR-Cas9 system. Science 343, 80-84.

Weber, C.M., and Henikoff, S. (2014). Histone variants: dynamic punctuation in

transcription. Genes Dev 28, 672-682.


SUPPLEMENTAL EXPERIMENTAL PROCEDURES

Surveyor nuclease (Cel-1) assay and Out-Out PCR

Genomic DNA from 2.5E5 cells was extracted with 250 μl of QuickExtract DNA

extraction solution (Epicentre) per manufacturer's recommendations. Cel-1 assays were

performed with the Surveyor mutation detection kit (Transgenomic) according to the

manufacturer’s protocol, with the exception that the reactions were incubated for 20 min at

42°C without enhancing solution. Samples were separated on 10% PAGE gels in TBE

buffer. To detect the targeted integration of the TAP tag, genomic DNA was subjected to

30 cycles of PCR using the Phusion polymerase (Thermo Scientific) and reactions were

loaded on 1% agarose gels in TAE buffer. Primer sequences are shown in Table S1.
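
Band intensities from a Cel-1 gel are commonly converted into an estimated indel frequency with the relation % indel = 100 x (1 - sqrt(1 - f_cut)), where f_cut is the cleaved fraction of the total signal. The procedure above does not state how frequencies were quantified, so the Python sketch below is only an assumption about one common way to do it; the function name and the intensity values are hypothetical.

```python
import math
from typing import Sequence


def indel_percent(uncut: float, cleaved: Sequence[float]) -> float:
    """Estimate % indels from Cel-1/Surveyor band intensities.

    uncut   -- intensity of the undigested (parental) PCR band
    cleaved -- intensities of the specific cleavage products
    Assumption: the common formula 100 * (1 - sqrt(1 - f_cut)) is used,
    which is not stated in the procedure itself.
    """
    total = uncut + sum(cleaved)
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    f_cut = sum(cleaved) / total  # fraction of signal in cleaved bands
    return 100.0 * (1.0 - math.sqrt(1.0 - f_cut))


if __name__ == "__main__":
    # Hypothetical densitometry values (arbitrary units).
    print(round(indel_percent(75.0, [15.0, 10.0]), 1))
```

The square root accounts for the fact that only heteroduplexes formed between a mutant and a wild-type strand are cleaved, so the cleaved fraction underestimates the mutant allele frequency.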

Western blot

Antibodies to Flag M2 (A8592, Sigma), Tubulin (DM1A, Santa Cruz), TRRAP

(SC5405, Santa Cruz), BRD8 (A300-219A, Bethyl), TIP60 (S2475, Epitomics), MRG15

(39361, Active Motif), EHMT2 (ab183889, Abcam), SUZ12 (3737, Cell Signaling), GAPDH

(1OR-G109a, Fitzgerald), and beta-Actin (TLC 002.100, BioShop) were used.

Chromatin Immunoprecipitation assays (ChIP)

ChIP assays were performed as previously described [186]. Briefly, 1 mg of

chromatin was sonicated to fragments of ~500 bp and immunoprecipitated using 10 μg of

FLAG M2 antibodies (F3165, Sigma), or an irrelevant IgG antibody (pp64K, Millipore)

and recovered using protein A/G magnetic beads. The precipitated DNA was amplified by

real-time qPCR, using primer sets designed to amplify regions of the RPL36AL and

HOXA9 genes. Real-time qPCR primers: RPL36AL, 5’-CCATGCCTGAGACCTTTTTC-3’ and

5’-TGCTCAGAATCCTGGGTAGG-3’; HOXA9, 5’-TGCCTTTCCTAACCAGTTCAGC-3’ and

5’-CGGGCAGAAGACGCACATCCCG-3’. ChIP data are shown as % of input

chromatin signal. Data presented are based on 2 biological replicates.
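
As an illustration of what "% of input" means, the Python sketch below converts qPCR Ct values into percent input. It assumes the widely used approach in which the input Ct is first corrected for the fraction of chromatin saved as input, and it assumes 100% amplification efficiency; neither the calculation nor the input fraction is specified above, and the function name and Ct values are hypothetical.

```python
import math


def percent_input(ct_ip: float, ct_input: float, input_fraction: float) -> float:
    """Return the ChIP signal as a percentage of input chromatin.

    ct_ip          -- Ct measured for the immunoprecipitated sample
    ct_input       -- Ct measured for the input sample
    input_fraction -- fraction of the chromatin kept as input (e.g. 0.01 for 1%)
    Assumption: perfect doubling per PCR cycle.
    """
    if not 0.0 < input_fraction <= 1.0:
        raise ValueError("input_fraction must be in (0, 1]")
    # Shift the input Ct to what it would be for 100% of the chromatin.
    adjusted_input = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input - ct_ip)


if __name__ == "__main__":
    # Hypothetical values: 1% input, FLAG ChIP versus IgG control.
    print(percent_input(ct_ip=27.0, ct_input=25.0, input_fraction=0.01))
    print(percent_input(ct_ip=32.0, ct_input=25.0, input_fraction=0.01))
```

Comparing the value obtained with the FLAG M2 antibody to the value obtained with the irrelevant IgG gives a sense of the enrichment over background at each amplified region.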

Sample preparation for mass spectrometry analysis

Preparative gels (12% NuPAGE Bis-Tris) for tandem mass spectrometry were run

over 1 cm to stack all proteins into 1 band and stained with Sypro Ruby (Bio-Rad). The gel

slice was digested with trypsin on a MassPrep liquid handling robot (Waters) according to

the manufacturer’s specifications. Briefly, proteins were reduced with 10 mM DTT and

alkylated with 55 mM iodoacetamide. Trypsin digestion was performed using 126 nM of

modified porcine trypsin (Sequencing grade, Promega) at 37°C for 18 h. Digestion products

were extracted using 1% formic acid, 2% acetonitrile followed by 1% formic acid, 50%

acetonitrile. The recovered extracts were pooled, vacuum centrifuge dried and then

resuspended in 15 μl of 2% acetonitrile, 0.05% trifluoroacetic acid, and 5 μl were analyzed

by mass spectrometry.

Protein identification by mass spectrometry

The analyses were performed at the proteomic platform of the Quebec Genomics

Center. Peptide samples were separated by online reversed-phase (RP) nanoscale capillary

liquid chromatography (nanoLC) and analyzed by electrospray mass spectrometry (ESI

MS/MS). The experiments were performed with a Dionex UltiMate 3000 nanoRSLC

chromatography system (Thermo Fisher Scientific) connected to an Orbitrap Fusion mass

spectrometer (Thermo Fisher Scientific) equipped with a nanoelectrospray ion source.

Peptides were trapped at 20 μl/min in loading solvent (2% acetonitrile, 0.05% TFA) on a

5 mm x 300 μm C18 PepMap cartridge pre-column (Thermo Fisher Scientific) for 5

minutes. Then, the pre-column was switched online with a self-made 50 cm x 75 μm internal

diameter separation column packed with ReproSil-Pur C18-AQ 3-μm resin (Dr. Maisch

HPLC) and the peptides were eluted with a linear gradient from 5-40% solvent B (A: 0.1%

formic acid, B: 80% acetonitrile, 0.1% formic acid) in 60 minutes, at 300 nL/min. Mass

spectra were acquired in data-dependent acquisition mode using Thermo XCalibur

software version 3.0.63. Full scan mass spectra (350 to 1800 m/z) were acquired in the

orbitrap using an AGC target of 4e5, a maximum injection time of 50 ms and a resolution

of 120 000. Internal calibration using lock mass on the m/z 445.12003 siloxane ion was

used. Each MS scan was followed by acquisition of fragmentation spectra of the most

intense ions for a total cycle time of 3 seconds (top speed mode). The selected ions were

isolated using the quadrupole analyzer in a window of 1.6 m/z and fragmented by

higher-energy collisional dissociation (HCD) at 35% collision energy. The resulting

fragments were detected by the linear ion trap in rapid scan rate with an AGC target of 1E4

and a maximum injection time of 50 ms. Dynamic exclusion of previously fragmented

peptides was set for a period of 20 s and a tolerance of 10 ppm.

For database searching, all MS/MS peak lists (MGF files) were generated using

Thermo Proteome Discoverer version 1.4.0.288 (Thermo Fisher). MGF sample files were

then analyzed using Mascot version 2.4.0 (Matrix Science). Mascot was set up to search the

UniProtKB Homo sapiens database (release 11/2014, 162831 sequences) assuming the

digestion enzyme trypsin. Mascot was searched with a fragment ion mass tolerance of 0.6

Da and a parent ion tolerance of 10 ppm. Oxidation of methionine and deamidation of

asparagine and glutamine were specified as variable modifications and

carbamidomethylation as a fixed modification. Two missed cleavages were allowed.
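
To make the 10 ppm parent ion tolerance concrete, the Python sketch below computes a relative mass error in parts per million and checks it against a tolerance. It is a generic illustration and not Mascot's internal matching code; the function names and the example masses are hypothetical.

```python
def ppm_error(observed: float, theoretical: float) -> float:
    """Relative mass error in parts per million (ppm)."""
    if theoretical <= 0:
        raise ValueError("theoretical mass must be positive")
    return (observed - theoretical) / theoretical * 1e6


def within_tolerance(observed: float, theoretical: float, tol_ppm: float = 10.0) -> bool:
    """True if the observed mass lies within +/- tol_ppm of the theoretical mass."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


if __name__ == "__main__":
    # Hypothetical precursor at a theoretical value of 500.0.
    print(within_tolerance(500.004, 500.0))  # about 8 ppm off
    print(within_tolerance(500.006, 500.0))  # about 12 ppm off
```

Because a ppm tolerance is relative, the absolute window widens with mass, whereas the 0.6 Da fragment tolerance is a fixed absolute window suited to ion trap detection.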

Scaffold (version 4.0.1, Proteome Software Inc., Portland, OR) was used to

validate MS/MS based peptide and protein identifications. The protein and peptide FDR was

set to 1% or less based on decoy database searching. The ProteinProphet algorithm

assigned protein probabilities. Proteins that contained similar peptides and could not be

differentiated based on MS/MS analysis alone were grouped to satisfy the principles of

parsimony.
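
The idea behind a decoy-based FDR threshold can be sketched as follows in Python. This is a simplified target-decoy estimate (decoy hits divided by target hits above a score cutoff) and is not the algorithm implemented in Scaffold or ProteinProphet; the function name and the scores are hypothetical.

```python
from typing import List, Optional, Tuple


def score_cutoff_for_fdr(
    matches: List[Tuple[float, bool]], max_fdr: float = 0.01
) -> Optional[float]:
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    matches -- (score, is_decoy) pairs, higher score meaning a better match
    FDR is estimated as decoys / targets among matches at or above the cutoff.
    Returns None if no cutoff satisfies the requested FDR.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score  # everything down to this score is acceptable
    return cutoff


if __name__ == "__main__":
    # Hypothetical search results: two targets, one decoy, one target.
    hits = [(90.0, False), (80.0, False), (70.0, True), (60.0, False)]
    print(score_cutoff_for_fdr(hits, max_fdr=0.01))
```

With the hypothetical hits above, the first decoy at score 70 raises the estimated FDR above 1%, so only the two best-scoring target matches are retained.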


Primers Used for the CEL1 and Out-Out PCR Assays

CEL1 primers

Gene     Reverse (R)                  Forward (F)
EPC1     GAGCATTGCTGTCAAGTCCC         TGATGGTAATGTAGTTGACTGTGG
P400     AAAGCACTACATGCTCACAAAGA      CAGCAGGTGCAGATGATCC
EZH2     TTCCATTATGCCTTGCTACTG        CAAAGACTGATTAATGTGCATGG

Out-Out PCR primers

Gene     Reverse (R)                  Forward (F)
EPC1     CCAAGGAGTCCACAGCTACC         AAGCCTGACACAAATCTCAGT
P400     AAGACCACGAGGCATTTTTC         CTCTCACCCTTTTCCCAAGA
MBTD1    GCGGATCACAAGGTCAAGAG         CCCACCTGAAAAATCTGGAA
EZH2     CGATTGCCATCCTTTCTTTG         GTGGCACAAGAGGCAAAAAT
FANCF    CAGATAGACAGGAGACAGCGC        GAGCGTTTCCTCACGTCACAG
MCM8*    ACTTTTGGGACATCATTTTTCAGAG    CAACAGGTCAACAGCGAAAA

* Due to the presence of highly repetitive sequences in the 3’ UTR and the use of a long (979 bp) right homology arm, the detection of integration was performed via "in-out" PCR (the reverse primer binds within the homology arm).

Supplemental References

Musselman, C.A., Avvakumov, N., Watanabe, R., Abraham, C.G., Lalonde, M.E.,

Hong, Z., Allen, C., Roy, S., Nunez, J.K., Nickoloff, J., et al. (2012). Molecular basis for

H3K36me3 recognition by the Tudor domain of PHF1. Nat Struct Mol Biol 19, 1266-1272.


SUPPLEMENTAL INFORMATION

Figure S1. Related to Figure 1. Determination of the Optimal Purification Steps for TAP and Protein

Expression in Single Cell-Derived Clones After ZFN / CRISPR-Driven Gene Addition to the AAVS1 Locus

A) Purification Scheme from total cell extracts. B) Silver stained SDS-PAGE showing the distinct

steps of purification of the KAT5 histone acetyltransferase complex. K562 cells expressing the tag (Mock)

and two pools of cells expressing KAT5-tag (KAT pool #1 and #2). C) Western blots showing tag-EZH2

protein expression in K562 clones obtained by simultaneous selection and cloning of cells in methylcellulose-

based semi-solid medium containing puromycin following ZFN-driven gene targeting. Also shown is a

sample from a pool of cells selected in suspension culture. D) Same as in C but using CRISPR system. The

FLAG M2 antibody was used to detect EZH2 and the tubulin antibody was used as a loading control.


Figure S2, Related to Figure 2. Strategy for CRISPR/Cas9-Driven Insertion of the TAP Tag to the C-

Terminus of the EPC1, EP400 and MBTD1 Proteins

A) Schematic of the EPC1 locus, wild-type Cas9 target site, and donor construct. Annotated are the

positions of the stop codon (TAG), the target site, and the protospacer adjacent motif (PAM) that specifies the

cleavage site. The 3’ end of the left homology arm (HA-L) starts just before the STOP and the 5’ end of the

right homology arm (HA-R) starts after the STOP. Note that in this case, the STOP codon in the donor was

changed from TAG to TGA to reduce the possibility of cleavage by the CRISPR/Cas9 enzyme. This way, the

donor sequence contains a truncated target sequence of just 11 bp, which should be sufficient to prevent DSB

formation in the episomal donor and in the chromosome after targeted integration of the tag. We avoid

repetitive sequences in HA-L and HA-R. The optimal guide RNA is selected based on several criteria: (i) high

activity based on CEL1 assay, (ii) induction of DSB in proximity to the STOP codon, (iii) option to prevent

donor cleavage (see above) and, if possible, (iv) cleavage in the 3’ UTR of the gene to prevent NHEJ-

mediated mutagenesis of the coding sequence of the untargeted allele. B) Same as A, but for the EP400 locus.

In this case, the donor sequence contains a truncated target sequence of just 9 bp, which should be sufficient

to prevent DSB formation in the episomal donor and in the chromosome after targeted integration of the tag.

It was not possible to select a cleavage site in the 3’ UTR of the gene to prevent NHEJ-mediated mutagenesis

of the coding sequence of the untargeted allele. C) Same as A, but for the MBTD1 locus. In this case, the

donor sequence contains a truncated target sequence of just 5 bp, which should be sufficient to prevent DSB

formation in the episomal donor and in the chromosome after targeted integration of the tag. It was not

possible to select a cleavage site in the 3’ UTR of the gene to prevent NHEJ-mediated mutagenesis of the

coding sequence of the untargeted allele.


Figure S3, Related to Figure 2. CRISPR/Cas9-Driven Insertion of the TAP Tag to the C-Terminus of the

EPC1 and EP400 Proteins

A) Result of a CEL1 assay at EPC1 to determine the frequency of CRISPR/Cas9-induced indels,

conducted on genomic DNA from K562 cells collected 3 days post-transfection with the indicated guide

RNAs. Arrows denote specific cleavage products. B) Schematic of a PCR-based assay (Out-Out PCR) to

detect targeted integration (TI) of the tag sequence. Primers are located outside of the homology arms and are

designed to yield a longer PCR product if the tag is inserted. C) Results of an out-out PCR assay conducted on

genomic DNA from K562 cells transfected with (i) wild-type Cas9 expression vector and EPC1 donor (Mock)

and (ii) wild-type Cas9 expression vector, gRNA #1 and EPC1 donor (Nuclease + Donor). Cells were

collected 3 days post-transfection. D) Same as in C, but on single-cell derived clones. E) Sequencing of the

untargeted allele reveals a CRISPR-Cas9 induced deletion in the 3’ UTR of EPC1. F) Two-fold serial

dilutions of whole cell extracts prepared from AAVS1-EPC1 and CRISPR-EPC1 cell lines were analyzed by

western blot to determine the relative expression of EPC1-tag proteins. The FLAG M2 antibody was used to

detect EPC1-tag and the GAPDH antibody was used as a loading control. ImageJ was used for quantification.

G) Silver stained SDS-PAGE showing the EPC1 complex obtained from the endogenous (CRISPR) and the

AAVS1 loci. Eluted fractions (E1, E2). (H, I, J, K) Same as in A, C, D, F but for EP400.


Figure S4, Related to Figure 3. TALEN-Driven Insertion of the TAP Tag to the N-Terminus of the

EZH2 Protein

A) Strategy for TALEN-Driven Insertion of the TAP Tag to the N-Terminus of the EZH2 Protein.

Schematic of the EZH2 locus, TALEN target sites, and donor construct. The position of the start codon

(ATG) is shown. The 3’ end of the left homology arm (HA-L) starts just before the ATG and the 5’ end of the

right homology arm (HA-R) starts after the ATG. A Kozak consensus sequence was placed upstream of the

tag sequence for efficient translation initiation at this site. Note that in this case, the TALEN binding sites are

separated by the tag sequence in the donor, which should be sufficient to prevent DSB formation in the

episomal donor and in the chromosome after targeted integration of the tag. We avoid repetitive sequences in

HA-L and HA-R. B) Result of a CEL1 assay to determine the frequency of TALEN-induced indels,

conducted on genomic DNA from K562 cells collected 3 days post-transfection. Arrows denote specific

cleavage products. C) Results of an out-out PCR assay conducted on genomic DNA from K562 cells

transfected with (i) EZH2 donor (Mock) and (ii) TALENs plus EZH2 donor (Nuclease + Donor). Cells were

collected 3 days post-transfection. D) Two-fold serial dilutions of whole cell extracts prepared from AAVS1-

EZH2 and TALEN-EZH2 (#63) cell lines were analyzed by western blot to determine the relative expression

of tag-EZH2 proteins. The FLAG M2 antibody was used to detect tag-EZH2 and the actin antibody was used

as a loading control. E) The relative density (intensity) of the various bands on the films was compared using

the software ImageJ (http://rsb.info.nih.gov/ij/index.html) and expression was normalized to the levels

obtained in the AAVS1 cell line. F) Benzonase treatment does not prevent interaction between PRC2 and

EHMT1/2 complexes. Purified fractions were diluted to 50 mM KCl in the presence of MgCl2 and incubated at

4°C for 5 hours with 12.5 U of benzonase. Treated fractions were then immunoprecipitated O/N with

anti-FLAG beads, washed, and analyzed by western blot.


Figure S5, Related to Figures 4 and 5. Strategy for CRISPR/Cas9-Driven Insertion of the TAP Tag to

the N-Terminus of FANCF and to the C-Terminus of MCM8

A) Schematic of the FANCF locus, Cas9 double nickase target sites, and donor construct. The position of

the start codon (ATG) and the protospacer adjacent motifs (PAM) that specify the nicking sites are shown.

The 3’ end of the left homology arm (HA-L) starts just before the ATG and the 5’ end of the right homology

arm (HA-R) starts after the ATG. A Kozak consensus sequence was placed upstream of the tag sequence for

efficient translation initiation at this site. Note that in this case, the nicking sites are separated by the tag

sequence in the donor, which should be sufficient to prevent DSB formation in the episomal donor and in the

chromosome after targeted integration of the tag. We avoid repetitive sequences in HA-L and HA-R. B)

Schematic of the MCM8 locus, wild-type Cas9 target site, and donor construct. Annotated are the positions of

the stop codon (TAA), the target site, and the protospacer adjacent motif (PAM) that specifies the cleavage

site. The 3’ end of the left homology arm (HA-L) starts just before the STOP and the 5’ end of the right

homology arm (HA-R) starts after the STOP. Note that in this case, the donor contains 2 stop codons and a

XhoI site upstream of the right homology arm. This way, the donor contains a truncated target sequence of

just 1 bp, which should be sufficient to prevent DSB formation in the episomal donor and in the chromosome

after targeted integration of the tag. This specific donor has longer homology arms of 782 and 979 bp that

were amplified from genomic DNA and contain repetitive sequences, as opposed to other donors used in this

study. C) Sequencing of the untargeted MCM8 allele in the #75 MCM8-Tag cell line reveals a CRISPR-Cas9

induced 7 bp deletion introducing a frameshift at the C-terminus of MCM8. D) Alignment of the wild-type

and predicted MCM8 mutant protein.


Figure S6, Related to Figure 1. Tandem-Affinity Purification (TAP) of JADE1L, JADE1S, and the

ATM kinase along with the Presentation of the Auto-Regulated Tet-On 3G System.

A) Silver stained SDS-PAGE showing the purified JADE1L-tag and JADE1S-tag complexes

expressed from the AAVS1 locus. JADE1L and JADE1S are splicing isoforms of PHF17. Nuclear extracts

from K562 cells expressing the tag (Mock) and clonal cell lines expressing JADE1L-tag and JADE1S-tag

were analyzed. B) Western blots of selected KAT7 (HBO1) subunits on purified fractions shown in A. C)

Silver stained SDS-PAGE showing the purified ATM after each purification step. K562 cells expressing the

tag (Mock) and a clonal cell line expressing tag-ATM (ATM). Eluted fractions (E1, E2) from the two

purification steps are shown. D) Schematic of donor construct and of the AAVS1 locus following ORF

addition for the robust induction of target proteins upon doxycycline treatment. The first two exons of the

PPP1R12C gene are shown as open boxes. Also annotated are the locations of the splice acceptor site (SA),

2A self-cleaving peptide sequence (2A), puromycin resistance gene (Puro), polyadenylation sequence (pA),

bi-directional tet-responsive promoter (pTRE3G-BI), Tet3G transactivator (Tet3G), 3xFLAG-2xSTREP

tandem affinity tag (Tag), homology arms left and right (HA-L, HA-R). E) Western blots showing MCM8-tag

protein expression in two K562 clones treated for 72 h with doxycycline (250 ng/ml). Expression of MCM8

and of the Tet-on transactivator was monitored by western blot with antibodies to FLAG (anti-FLAG) and to

TetR (anti-TetR), respectively. Blots with antibody to α-tubulin (anti-tubulin) were used as controls for

loading. F) Western blots showing tag-EZH2 protein expression in a K562 clone treated for 24 h with

increasing doses of doxycycline. Expression of EZH2 and of the Tet-on transactivator was monitored by

western blot with antibodies to FLAG (anti-FLAG) and to TetR (anti-TetR), respectively. Blots with antibody

to α-tubulin (anti-tubulin) were used as controls for loading.


Table S1, Related to Figures 1 and 2. NuA4 Complex Subunits Identified by Mass Spectrometry

Analysis

Predicted molecular weights, total spectral counts and # of unique peptides for

identified proteins are indicated. AAVS1-EPC1/EP400 and CRISPR-EPC1/EP400 indicate

whether the subunits were expressed ectopically from the AAVS1 locus or from the

endogenous loci, respectively.


Table S2, Related to Figure 3. PRC2 Complex Subunits Identified by Mass Spectrometry Analysis

Predicted molecular weights, total spectral counts and # of unique peptides for identified proteins are

indicated. AAVS1-EZH2 and TALEN-EZH2 indicate whether EZH2 was expressed ectopically from the

AAVS1 locus or from its endogenous locus, respectively.

Table S3, Related to Figure 4. FA Core and Anchor Complexes Subunits Identified by Mass

Spectrometry Analysis

Predicted molecular weights, total spectral counts and # of unique peptides for identified proteins

are indicated. CRISPR-FANCF indicates that FANCF is expressed from its endogenous locus.


Table S4, Related to Figure 5. MCM8 Complex Subunits Identified by Mass Spectrometry Analysis

Predicted molecular weights, total spectral counts and # of unique peptides for identified proteins

are indicated. CRISPR-MCM8 indicates that MCM8 is expressed from its endogenous locus.

GENERAL DISCUSSION

Thanks to the availability of databases, it is now possible to infer the function of a protein by comparing its amino acid sequence with those of proteins encoded by previously characterized genes. It is nevertheless always preferable to characterize the protein in its natural state rather than to rely on indirect means. To determine protein-protein interactions, affinity purification followed by mass spectrometry (AP-MS) is the standard [187]. However, the efficiency of the technique is greatly reduced in mammalian cells [9].

Affinity purification relies on affinity tags, which are very powerful tools, to purify the complex of interest [7]. One must nevertheless keep in mind the possible interactions of the affinity tag with other proteins in the organism under study, which can lead to high levels of background. TAP was optimized in yeast and enabled the characterization of more than 589 protein complexes [9]. The major limitations of the technique are the quantity and the quality of the proteins to be purified. When the protein is weakly expressed in the cell, overexpression is needed to obtain a sufficient amount of protein. However, this can lead to aberrant stoichiometry. TAP purification of the complexes formed by a target protein is well established in yeast. Unfortunately, the technique transfers poorly to higher eukaryotes, notably to human cells.

The work presented here uses genome engineering techniques to address this problem by integrating a tagged cDNA of interest into the AAVS1 genomic safe harbor (GSH), thereby allowing its expression at near-physiological levels. This technique provides a rapid and accurate overview of the protein interactions of the target protein under natural conditions. This is a clear advantage over traditional ectopic expression techniques.

The second technique we developed is based on integrating the 3xFLAG-Twin-Strep tandem affinity tag at either the N- or the C-terminus of the gene of interest. This method preserves the native regulation of the gene as well as its location in the genome. It is more laborious to carry out, but the results reflect more accurately the interactions of the target protein in the natural context of the cell.


Table 3: Advantages and disadvantages of previously used protein complex purification techniques and of the technique described in this thesis

Affinity purification (previously used technique)
  Advantages: Generic approach. Easy to reproduce. Can purify weakly expressed proteins and protein complexes.
  Disadvantages: Ectopic gene expression. The protein-tag fusion can influence the function of the protein. The characteristics of each affinity tag can introduce limitations specific to the technique. Requires a dedicated purification protocol for each affinity tag.

TAP (previously used technique)
  Advantages: Generic approach. Can purify weakly expressed proteins and protein complexes. Physiologically mild purification conditions. Massive reduction in background. Effective in small- and large-scale studies.
  Disadvantages: Ectopic gene expression. The protein-tag fusion can influence the function of the protein. The efficiency of the technique is greatly reduced in cells of higher eukaryotes. CBP (calmodulin-binding peptide) can interfere with calcium signaling pathways.

3xFLAG/TwinStrep TAP-tag integrated at AAVS1 (technique described here)
  Advantages: Generic and rapid approach. Can purify very weakly expressed proteins and protein complexes. Massive reduction in background. Near-physiological expression level of the target protein. Good alternative to endogenous tagging.
  Disadvantages: Expression of the cDNA from an ectopic promoter. The protein-tag fusion can influence the function of the protein. Requires subcloning of the cDNA of interest.

By combining previously available techniques with state-of-the-art techniques based on engineered nucleases, we developed a method applicable to the study of weakly expressed protein complexes in human cells. Note that the endonucleases targeting the AAVS1 GSH (ZFNs, TALENs and CRISPR/Cas9) can all be used and are interchangeable in the protocol. The technique is easy to perform, rapid, and makes it possible to observe protein interactions at near-physiological expression levels by integrating the appropriate cDNA at the AAVS1 GSH. Little prior information about the protein of interest is needed to apply it. To demonstrate the power of the technique, we isolated single clones that had integrated our construct carrying the EPC1 cDNA at AAVS1. Mass spectrometry following TAP identified all previously confirmed components of the NuA4 complex. In addition, because the proteins were extracted under mild conditions, we identified Malignant Brain Tumor domain-containing protein 1 (MBTD1), a protein important for developmental regulation but never before associated with the NuA4 complex [188]. MBTD1 binds mono- or dimethylated lysines on histones through one of its four MBT repeats [188, 189]. This binding provides a way to read the level of lysine methylation on H3 and H4 and to position the NuA4 complex correctly [190].

NuA4 is a well-characterized complex. Having found a new partner with our technique, we can reasonably expect that new protein interactions could be discovered in other complexes that are already well characterized. One example is the phosphatidylinositol 3-kinase (PI3K) cell survival signaling pathway. Abnormalities in this pathway are thought to be involved in many types of cancer, and increased activity of the pathway is often associated with resistance to cancer therapies [191, 192]. Mechanistic target of rapamycin (mTOR) is a serine/threonine kinase of the PI3K signaling pathway identified as the cellular target of rapamycin [193]. mTOR can form two different protein complexes, mTORC1 and mTORC2, which regulate the protein synthesis required for cell growth and proliferation [193]. Each of these complexes is made up of a large number of interacting partners that act, among other roles, as activators or inhibitors. To potentially discover new protein interactions that could lead to new cancer therapy strategies, it would be straightforward to purify the mTORC1 and mTORC2 complexes using our technique. By introducing the mTOR cDNA or, better still, the cDNAs of other known components of the complexes (such as Raptor, LST8, PRAS40 and DEPTOR for mTORC1, and Rictor, mSIN1 and GβL for mTORC2), it would be possible to purify the complexes and validate each of their components.

Because current techniques rely on gene overexpression, abnormal effects can arise in the cell, which can lead to erroneous conclusions. Overexpression of certain proteins can also be toxic, which makes them very difficult to study. This is why ATM had never been purified without overexpression. We initially considered using the auto-inducible Tet-On 3G system to purify ATM, but integration of the TAP-tagged cDNA at AAVS1 proved to be a very robust technique that allowed its purification, a first in the literature. This first purification of ATM opens the door to more in-depth studies of its function.

Table 4: Advantages and disadvantages of the Tet-On system and of our auto-inducible system

Tet-On system
  Advantages: Control over expression of the target protein. Rapid expression of the protein of interest.
  Disadvantages: Random integration, which can perturb the stoichiometry of the cell. Requires subcloning of the cDNA of interest. Variable expression of the target protein in each clone.

Auto-inducible Tet-On 3G system
  Advantages: Control over expression of the target protein. Rapid expression of the protein of interest. Easy to reproduce. Can purify very weakly expressed proteins and protein complexes. Massive reduction in background.
  Disadvantages: Expression of the target protein must be optimized.


The Tet-On system makes it possible to induce expression of the protein, a clear advantage for studying cytotoxic proteins. In its current design, however, the constructs integrate randomly, which increases the risk of perturbing the stoichiometry of the cell. It is therefore important to target integration of the sequence of interest to a GSH so as not to perturb the cell. The new endonuclease systems (ZFNs, TALENs and CRISPR/Cas9) are indispensable tools for this purpose, because they allow us to create a double-strand break (DSB) at a precise site in the genome and to integrate our sequences of interest there.

We created an auto-inducible Tet-On 3G system that expresses the cDNA of interest from the AAVS1 GSH. This system makes it possible to study cytotoxic proteins, such as fusion proteins resulting from genomic translocations (for example EPC1-PHF1), by inducing expression of the target protein shortly before the extraction step. Induction is robust, and expression of the target protein can be optimized by adjusting the doxycycline concentration used to induce the system. The doxycycline concentrations required for induction have no known harmful effect. Several translocations leading to different fusion proteins have been identified in malignant tumors, such as JAZF1-PHF1 and MEAF6-PHF1. It is not yet possible to select cells carrying these translocations in order to study the direct effect of the translocation. By using the cDNA of the fusion protein of interest with the auto-inducible Tet-On 3G system integrated at AAVS1, it would be possible to obtain these fusion proteins in TAP-tagged form and thus to identify their interactions by mass spectrometry. It would also be possible to obtain enough protein for enzymatic assays and for crystallography, and so to draw relevant conclusions about the fusion protein.

Perspectives

Using the AAVS1 site to integrate DNA into a genome without producing harmful effects is attractive for correcting genetic diseases by gene editing and for eliminating the need to take medication. Using the ROSA26 GSH, the technique can also be applied in mice. Ultimately, it would therefore be possible to compensate for an enzyme deficiency by introducing a functional gene in vivo.

An interesting possibility would be to integrate a known essential gene at AAVS1 and then to induce inactivation of its native copy in order to study the effects of the gene in the cell. The auto-inducible Tet-On 3G system was created to meet the need to express cytotoxic proteins and to facilitate their study.

It is important to note that the AAVS1 locus can be targeted in all human cell lines and that the integrated DNA is retained even after differentiation. It would therefore be interesting to integrate the DNA of interest into stem cells, such as hematopoietic stem cells, and to induce differentiation to obtain several human cell types all derived from the same genetic manipulation. This would make it possible to create cell lines, from a single event, in every cell type into which differentiation can be induced.

This technique would also allow us to tag proteins that could then be detected by immunohistochemistry or immunofluorescence when no specific antibody is available, for example to visualize the tissue distribution of a particular GPCR.

The methods developed in the course of our work offer a unique advantage over the traditional approach used to study protein complexes in mammalian cells. The purification strategy not only allows the associated proteins to be identified by mass spectrometry; the purity and quantity of the recovered material also allow enzymatic and mechanistic studies in vitro.


CONCLUSION

The work described in this thesis, together with the research on which it builds, demonstrates the clear potential of the 3xFLAG/TwinStrep TAP-tag technique integrated at AAVS1. The technique has in fact been used by several other groups since its publication [194-196]. This rapid and easy method provides a first overview of protein-protein interactions without requiring much initial information about the target protein. The results presented here identified a new interaction within NuA4, a complex that was already very well characterized, and allowed purification of the ATM protein, a feat never achieved before. Although this technique does not fully reflect natural conditions in the cell, because the promoter of the construct is ectopic, expression levels of the protein remain close to normal concentrations and bring our conclusions closer to reality. The results presented here greatly complement endogenous tagging techniques, making it possible to purify a functional protein of interest in sufficient quantity to study its protein-protein interactions as well as its enzymatic functions.


REFERENCES

1. Alberts, B., The cell as a collection of protein machines: preparing the next

generation of molecular biologists. Cell, 1998. 92(3): p. 291-4.

2. Mak, A.N., et al., The crystal structure of TAL effector PthXo1 bound to its DNA

target. Science, 2012. 335(6069): p. 716-9.

3. Blackstock, W.P. and M.P. Weir, Proteomics: quantitative and physical mapping of

cellular proteins. Trends Biotechnol, 1999. 17(3): p. 121-7.

4. Guide to Protein Purification 2nd edition. Academic Press ed. Vol. 436. 2009.

5. Swaffield, J.C., K. Melcher, and S.A. Johnston, A highly conserved ATPase protein

as a mediator between acidic activation domains and the TATA-binding protein.

Nature, 1995. 374(6517): p. 88-91.

6. Puig, O., et al., The tandem affinity purification (TAP) method: a general procedure

of protein complex purification. Methods, 2001. 24(3): p. 218-29.

7. Rigaut, G., et al., A generic protein purification method for protein complex

characterization and proteome exploration. Nat Biotechnol, 1999. 17(10): p. 1030-

2.

8. Shevchenko, A., et al., Linking genome and proteome by mass spectrometry: large-

scale identification of yeast proteins from two dimensional gels. Proc Natl Acad Sci

U S A, 1996. 93(25): p. 14440-5.

9. Gavin, A.C., et al., Functional organization of the yeast proteome by systematic

analysis of protein complexes. Nature, 2002. 415(6868): p. 141-7.

10. Behrends, C., et al., Network organization of the human autophagy system. Nature,

2010. 466(7302): p. 68-76.

11. Goudreault, M., et al., A PP2A phosphatase high density interaction network

identifies a novel striatin-interacting phosphatase and kinase complex linked to the

cerebral cavernous malformation 3 (CCM3) protein. Mol Cell Proteomics, 2009.

8(1): p. 157-71.

12. Guruharsha, K.G., et al., A protein complex network of Drosophila melanogaster.

Cell, 2011. 147(3): p. 690-703.

13. Hegemann, B., et al., Systematic phosphorylation analysis of human mitotic protein

complexes. Sci Signal, 2011. 4(198): p. rs12.

14. Hutchins, J.R., et al., Systematic analysis of human protein complexes identifies

chromosome segregation proteins. Science, 2010. 328(5978): p. 593-9.

15. Huttlin, E.L., et al., The BioPlex Network: A Systematic Exploration of the Human

Interactome. Cell, 2015. 162(2): p. 425-40.

16. Sardiu, M.E., et al., Probabilistic assembly of human protein interaction networks

from label-free quantitative proteomics. Proc Natl Acad Sci U S A, 2008. 105(5): p.

1454-9.

17. Sowa, M.E., et al., Defining the human deubiquitinating enzyme interaction

landscape. Cell, 2009. 138(2): p. 389-403.

18. Van Leene, J., et al., Targeted interactomics reveals a complex core cell cycle

machinery in Arabidopsis thaliana. Mol Syst Biol, 2010. 6: p. 397.

19. Gavin, A.C., et al., Proteome survey reveals modularity of the yeast cell machinery.

Nature, 2006. 440(7084): p. 631-6.


20. Agell, N., et al., Modulation of the Ras/Raf/MEK/ERK pathway by Ca(2+), and

calmodulin. Cell Signal, 2002. 14(8): p. 649-54.

21. Head, J.F., A better grip on calmodulin. Curr Biol, 1992. 2(11): p. 609-11.

22. Doyon, J.B., et al., Rapid and efficient clathrin-mediated endocytosis revealed in

genome-edited mammalian cells. Nat Cell Biol, 2011. 13(3): p. 331-7.

23. Ho, Y., et al., Systematic identification of protein complexes in Saccharomyces

cerevisiae by mass spectrometry. Nature, 2002. 415(6868): p. 180-3.

24. Bucher, M.H., A.G. Evdokimov, and D.S. Waugh, Differential effects of short

affinity tags on the crystallization of Pyrococcus furiosus maltodextrin-binding

protein. Acta Crystallogr D Biol Crystallogr, 2002. 58(Pt 3): p. 392-7.

25. Terpe, K., Overview of tag protein fusions: from molecular and biochemical

fundamentals to commercial systems. Appl Microbiol Biotechnol, 2003. 60(5): p.

523-33.

26. Blanar, M.A. and W.J. Rutter, Interaction cloning: identification of a helix-loop-

helix zipper protein that interacts with c-Fos. Science, 1992. 256(5059): p. 1014-8.

27. Einhauer, A. and A. Jungbauer, The FLAG peptide, a versatile fusion tag for the

purification of recombinant proteins. J Biochem Biophys Methods, 2001. 49(1-3):

p. 455-65.

28. Su, X., A.K. Prestwood, and R.A. McGraw, Production of recombinant porcine

tumor necrosis factor alpha in a novel E. coli expression system. Biotechniques,

1992. 13(5): p. 756-62.

29. Einhauer, A., et al., Expression and purification of homogenous proteins in

Saccharomyces cerevisiae based on ubiquitin-FLAG fusion. Protein Expr Purif,

2002. 24(3): p. 497-504.

30. Schuster, M., et al., Protein expression strategies for identification of novel target

proteins. J Biomol Screen, 2000. 5(2): p. 89-97.

31. Kunz, D., N.P. Gerard, and C. Gerard, The human leukocyte platelet-activating

factor receptor. cDNA cloning, cell surface expression, and construction of a novel

epitope-bearing analog. J Biol Chem, 1992. 267(13): p. 9101-6.

32. Zhang, X.K., et al., Novel pathway for thyroid hormone receptor action through

interaction with jun and fos oncogene activities. Mol Cell Biol, 1991. 11(12): p.

6016-25.

33. Sigma-Aldrich. FLAG System. [cited 2017 Feb 3]; Available from:
http://www.sigmaaldrich.com/life-science/proteomics/recombinant-protein-expression/purification-detection/flag-system.html.

34. Schmidt, T.G. and A. Skerra, The random peptide library-assisted engineering of a

C-terminal affinity peptide, useful for the detection and purification of a functional

Ig Fv fragment. Protein Eng, 1993. 6(1): p. 109-22.

35. Voss, S. and A. Skerra, Mutagenesis of a flexible loop in streptavidin leads to

higher affinity for the Strep-tag II peptide and improved performance in

recombinant protein purification. Protein Eng, 1997. 10(8): p. 975-82.

36. Schmidt, T.G., et al., Molecular interaction between the Strep-tag affinity peptide

and its cognate target, streptavidin. J Mol Biol, 1996. 255(5): p. 753-66.


37. Korndorfer, I.P. and A. Skerra, Improved affinity of engineered streptavidin for the

Strep-tag II peptide is due to a fixed open conformation of the lid-like loop at the

binding site. Protein Sci, 2002. 11(4): p. 883-93.

38. Schmidt, T.G. and A. Skerra, The Strep-tag system for one-step purification and

high-affinity detection or capturing of proteins. Nat Protoc, 2007. 2(6): p. 1528-35.

39. Wyborski, D.L. and J.M. Short, Analysis of inducers of the E.coli lac repressor

system in mammalian cells and whole animals. Nucleic Acids Res, 1991. 19(17): p.

4647-53.

40. Gossen, M. and H. Bujard, Tight control of gene expression in mammalian cells by

tetracycline-responsive promoters. Proc Natl Acad Sci U S A, 1992. 89(12): p.

5547-51.

41. Gossen, M., et al., Transcriptional activation by tetracyclines in mammalian cells.

Science, 1995. 268(5218): p. 1766-9.

42. Loew, R., et al., Improved Tet-responsive promoters with minimized background

expression. BMC Biotechnol, 2010. 10: p. 81.

43. Baron, U., et al., Co-regulation of two gene activities by tetracycline via a

bidirectional promoter. Nucleic Acids Res, 1995. 23(17): p. 3605-6.

44. Zhou, X., et al., Optimization of the Tet-On system for regulated gene expression

through viral evolution. Gene Ther, 2006. 13(19): p. 1382-90.

45. Clontech Laboratories, Inc. Tet-On 3G Inducible Expression Systems User Manual. [cited 2017 Jan 29]; Available from:
http://www.clontech.com/xxclt_ibcGetAttachment.jsp?cItemId=17569.

46. Sadelain, M., E.P. Papapetrou, and F.D. Bushman, Safe harbours for the integration

of new DNA in the human genome. Nat Rev Cancer, 2011. 12(1): p. 51-8.

47. DeKelver, R.C., et al., Functional genomics, proteomics, and regulatory DNA

analysis in isogenic settings using zinc finger nuclease-driven transgenesis into a

safe harbor locus in the human genome. Genome Res, 2010. 20(8): p. 1133-42.

48. Kotin, R.M., R.M. Linden, and K.I. Berns, Characterization of a preferred site on

human chromosome 19q for integration of adeno-associated virus DNA by non-

homologous recombination. EMBO J, 1992. 11(13): p. 5071-8.

49. Tan, I., et al., Phosphorylation of a novel myosin binding subunit of protein

phosphatase 1 reveals a conserved mechanism in the regulation of actin

cytoskeleton. J Biol Chem, 2001. 276(24): p. 21209-16.

50. Ogata, T., T. Kozuka, and T. Kanda, Identification of an insulator in AAVS1, a

preferred region for integration of adeno-associated virus DNA. J Virol, 2003.

77(16): p. 9000-7.

51. Henckaerts, E. and R.M. Linden, Adeno-associated virus: a key to the human

genome? Future Virol, 2010. 5(5): p. 555-574.

52. Smith, J.R., et al., Robust, persistent transgene expression in human embryonic stem

cells is achieved with AAVS1-targeted integration. Stem Cells, 2008. 26(2): p. 496-

504.

53. Zou, J., et al., Oxidase-deficient neutrophils from X-linked chronic granulomatous

disease iPS cells: functional correction by zinc finger nuclease-mediated safe

harbor targeting. Blood, 2011. 117(21): p. 5561-72.


54. Ramachandra, C.J., et al., Efficient recombinase-mediated cassette exchange at the

AAVS1 locus in human embryonic stem cells using baculoviral vectors. Nucleic

Acids Res, 2011. 39(16): p. e107.

55. Yang, L., et al., Human cardiovascular progenitor cells develop from a KDR+

embryonic-stem-cell-derived population. Nature, 2008. 453(7194): p. 524-8.

56. Hockemeyer, D., et al., Efficient targeting of expressed and silent genes in human

ESCs and iPSCs using zinc-finger nucleases. Nat Biotechnol, 2009. 27(9): p. 851-7.

57. Lombardo, A., et al., Site-specific integration and tailoring of cassette design for

sustainable gene transfer. Nat Methods, 2011. 8(10): p. 861-9.

58. Kim, Y.G., J. Cha, and S. Chandrasegaran, Hybrid restriction enzymes: zinc finger

fusions to Fok I cleavage domain. Proc Natl Acad Sci U S A, 1996. 93(3): p. 1156-

60.

59. Miller, J., A.D. McLachlan, and A. Klug, Repetitive zinc-binding domains in the

protein transcription factor IIIA from Xenopus oocytes. EMBO J, 1985. 4(6): p.

1609-14.

60. Wolfe, S.A., L. Nekludova, and C.O. Pabo, DNA recognition by Cys2His2 zinc

finger proteins. Annu Rev Biophys Biomol Struct, 2000. 29: p. 183-212.

61. Miller, J.C., et al., An improved zinc-finger nuclease architecture for highly specific

genome editing. Nat Biotechnol, 2007. 25(7): p. 778-85.

62. Beerli, R.R. and C.F. Barbas, 3rd, Engineering polydactyl zinc-finger transcription

factors. Nat Biotechnol, 2002. 20(2): p. 135-41.

63. Vanamee, E.S., S. Santagata, and A.K. Aggarwal, FokI requires two specific DNA

sites for cleavage. J Mol Biol, 2001. 309(1): p. 69-78.

64. Urnov, F.D., et al., Genome editing with engineered zinc finger nucleases. Nat Rev

Genet, 2010. 11(9): p. 636-46.

65. Kim, E., et al., Precision genome engineering with programmable DNA-nicking

enzymes. Genome Res, 2012. 22(7): p. 1327-33.

66. Gaj, T., C.A. Gersbach, and C.F. Barbas, 3rd, ZFN, TALEN, and CRISPR/Cas-

based methods for genome engineering. Trends Biotechnol, 2013. 31(7): p. 397-

405.

67. Perez-Pinera, P., D.G. Ousterout, and C.A. Gersbach, Advances in targeted genome

editing. Curr Opin Chem Biol, 2012. 16(3-4): p. 268-77.

68. Segal, D.J. and J.F. Meckler, Genome engineering at the dawn of the golden age.

Annu Rev Genomics Hum Genet, 2013. 14: p. 135-58.

69. Ramirez, C.L., et al., Unexpected failure rates for modular assembly of engineered

zinc fingers. Nat Methods, 2008. 5(5): p. 374-5.

70. Joung, J.K. and J.D. Sander, TALENs: a widely applicable technology for targeted

genome editing. Nat Rev Mol Cell Biol, 2013. 14(1): p. 49-55.

71. Boch, J. and U. Bonas, Xanthomonas AvrBs3 family-type III effectors: discovery

and function. Annu Rev Phytopathol, 2010. 48: p. 419-36.

72. Boch, J., et al., Breaking the code of DNA binding specificity of TAL-type III

effectors. Science, 2009. 326(5959): p. 1509-12.

73. Deng, D., et al., Structural basis for sequence-specific recognition of DNA by TAL

effectors. Science, 2012. 335(6069): p. 720-3.


74. Sander, J.D., et al., Targeted gene disruption in somatic zebrafish cells using

engineered TALENs. Nat Biotechnol, 2011. 29(8): p. 697-8.

75. Tesson, L., et al., Knockout rats generated by embryo microinjection of TALENs.

Nat Biotechnol, 2011. 29(8): p. 695-6.

76. Reyon, D., et al., FLASH assembly of TALENs for high-throughput genome editing.

Nat Biotechnol, 2012. 30(5): p. 460-5.

77. Hockemeyer, D., et al., Genetic engineering of human pluripotent cells using TALE

nucleases. Nat Biotechnol, 2011. 29(8): p. 731-4.

78. Kim, Y., et al., A library of TAL effector nucleases spanning the human genome.

Nat Biotechnol, 2013. 31(3): p. 251-8.

79. Shi, B., et al., TALEN-Mediated Knockout of CCR5 Confers Protection Against

Infection of Human Immunodeficiency Virus. J Acquir Immune Defic Syndr, 2017.

74(2): p. 229-241.

80. Ellis, B.L., et al., Zinc-finger nuclease-mediated gene correction using single AAV

vector transduction and enhancement by Food and Drug Administration-approved

drugs. Gene Ther, 2013. 20(1): p. 35-42.

81. Ishino, Y., et al., Nucleotide sequence of the iap gene, responsible for alkaline

phosphatase isozyme conversion in Escherichia coli, and identification of the gene

product. J Bacteriol, 1987. 169(12): p. 5429-33.

82. Barrangou, R., et al., CRISPR provides acquired resistance against viruses in

prokaryotes. Science, 2007. 315(5819): p. 1709-12.

83. Jinek, M., et al., A programmable dual-RNA-guided DNA endonuclease in adaptive

bacterial immunity. Science, 2012. 337(6096): p. 816-21.

84. Makarova, K.S., et al., Evolution and classification of the CRISPR-Cas systems. Nat

Rev Microbiol, 2011. 9(6): p. 467-77.

85. DiCarlo, J.E., et al., Genome engineering in Saccharomyces cerevisiae using

CRISPR-Cas systems. Nucleic Acids Res, 2013. 41(7): p. 4336-43.

86. Gratz, S.J., et al., Genome engineering of Drosophila with the CRISPR RNA-guided

Cas9 nuclease. Genetics, 2013. 194(4): p. 1029-35.

87. Friedland, A.E., et al., Heritable genome editing in C. elegans via a CRISPR-Cas9

system. Nat Methods, 2013. 10(8): p. 741-3.

88. Jiang, W., et al., Demonstration of CRISPR/Cas9/sgRNA-mediated targeted gene

modification in Arabidopsis, tobacco, sorghum and rice. Nucleic Acids Res, 2013.

41(20): p. e188.

89. Wang, H., et al., One-step generation of mice carrying mutations in multiple genes

by CRISPR/Cas-mediated genome engineering. Cell, 2013. 153(4): p. 910-8.

90. Deltcheva, E., et al., CRISPR RNA maturation by trans-encoded small RNA and

host factor RNase III. Nature, 2011. 471(7340): p. 602-7.

91. Brouns, S.J., et al., Small CRISPR RNAs guide antiviral defense in prokaryotes.

Science, 2008. 321(5891): p. 960-4.

92. Garneau, J.E., et al., The CRISPR/Cas bacterial immune system cleaves

bacteriophage and plasmid DNA. Nature, 2010. 468(7320): p. 67-71.

93. Gasiunas, G., et al., Cas9-crRNA ribonucleoprotein complex mediates specific DNA

cleavage for adaptive immunity in bacteria. Proc Natl Acad Sci U S A, 2012.

109(39): p. E2579-86.


94. Hsu, P.D., et al., DNA targeting specificity of RNA-guided Cas9 nucleases. Nat

Biotechnol, 2013. 31(9): p. 827-32.

95. Cong, L., et al., Multiplex genome engineering using CRISPR/Cas systems. Science,

2013. 339(6121): p. 819-23.

96. Anders, C., et al., Structural basis of PAM-dependent target DNA recognition by the

Cas9 endonuclease. Nature, 2014. 513(7519): p. 569-73.

97. Yang, H., H. Wang, and R. Jaenisch, Generating genetically modified mice using

CRISPR/Cas-mediated genome engineering. Nat Protoc, 2014. 9(8): p. 1956-68.

98. Reardon, S. First CRISPR clinical trial gets green light from US panel. Nature,

2016. DOI: 10.1038/nature.2016.20137.

99. Mali, P., et al., RNA-guided human genome engineering via Cas9. Science, 2013.

339(6121): p. 823-6.

100. Mali, P., et al., CAS9 transcriptional activators for target specificity screening and

paired nickases for cooperative genome engineering. Nat Biotechnol, 2013. 31(9):

p. 833-8.

101. Fonfara, I., et al., Phylogeny of Cas9 determines functional exchangeability of dual-

RNA and Cas9 among orthologous type II CRISPR-Cas systems. Nucleic Acids Res,

2014. 42(4): p. 2577-90.

102. Esvelt, K.M., et al., Orthogonal Cas9 proteins for RNA-guided gene regulation and

editing. Nat Methods, 2013. 10(11): p. 1116-21.

103. Zetsche, B., et al., Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR-

Cas system. Cell, 2015. 163(3): p. 759-71.

104. Haeussler, M., et al., Evaluation of off-target and on-target scoring algorithms and

integration into the guide RNA selection tool CRISPOR. Genome Biol, 2016. 17(1):

p. 148.

105. Ran, F.A., et al., Double nicking by RNA-guided CRISPR Cas9 for enhanced

genome editing specificity. Cell, 2013. 154(6): p. 1380-9.

106. Guilinger, J.P., D.B. Thompson, and D.R. Liu, Fusion of catalytically inactive Cas9

to FokI nuclease improves the specificity of genome modification. Nat Biotechnol,

2014. 32(6): p. 577-82.

107. Dianov, G.L. and U. Hubscher, Mammalian base excision repair: the forgotten

archangel. Nucleic Acids Res, 2013. 41(6): p. 3483-90.

108. Scharer, O.D., Nucleotide excision repair in eukaryotes. Cold Spring Harb Perspect

Biol, 2013. 5(10): p. a012609.

109. Li, G.M., Mechanisms and functions of DNA mismatch repair. Cell Res, 2008.

18(1): p. 85-98.

110. Kolodner, R.D. and G.T. Marsischky, Eukaryotic DNA mismatch repair. Curr Opin

Genet Dev, 1999. 9(1): p. 89-96.

111. Modrich, P. and R. Lahue, Mismatch repair in replication fidelity, genetic

recombination, and cancer biology. Annu Rev Biochem, 1996. 65: p. 101-33.

112. Valerie, K. and L.F. Povirk, Regulation and mechanisms of mammalian double-

strand break repair. Oncogene, 2003. 22(37): p. 5792-812.

113. Bonner, W.M., et al., GammaH2AX and cancer. Nat Rev Cancer, 2008. 8(12): p.

957-67.


114. Lavin, M.F., ATM and the Mre11 complex combine to recognize and signal DNA

double-strand breaks. Oncogene, 2007. 26(56): p. 7749-58.

115. Goedecke, W., et al., Mre11 and Ku70 interact in somatic cells, but are

differentially expressed in early meiosis. Nat Genet, 1999. 23(2): p. 194-8.

116. Haber, J.E., The many interfaces of Mre11. Cell, 1998. 95(5): p. 583-6.

117. Paull, T.T. and M. Gellert, The 3' to 5' exonuclease activity of Mre 11 facilitates

repair of DNA double-strand breaks. Mol Cell, 1998. 1(7): p. 969-79.

118. Paull, T.T. and M. Gellert, A mechanistic basis for Mre11-directed DNA joining at

microhomologies. Proc Natl Acad Sci U S A, 2000. 97(12): p. 6409-14.

119. Jha, S., E. Shibata, and A. Dutta, Human Rvb1/Tip49 is required for the histone

acetyltransferase activity of Tip60/NuA4 and for the downregulation of

phosphorylation on H2AX after DNA damage. Mol Cell Biol, 2008. 28(8): p. 2690-

700.

120. Doyon, Y. and J. Cote, The highly conserved and multifunctional NuA4 HAT

complex. Curr Opin Genet Dev, 2004. 14(2): p. 147-54.

121. Doyon, Y., et al., Structural and functional conservation of the NuA4 histone

acetyltransferase complex from yeast to humans. Mol Cell Biol, 2004. 24(5): p.

1884-96.

122. Cheng, X., et al., Eaf1 Links the NuA4 Histone Acetyltransferase Complex to Htz1

Incorporation and Regulation of Purine Biosynthesis. Eukaryot Cell, 2015. 14(6): p.

535-44.

123. Sun, Y., et al., A role for the Tip60 histone acetyltransferase in the acetylation and

activation of ATM. Proc Natl Acad Sci U S A, 2005. 102(37): p. 13182-7.

124. Sun, Y., et al., DNA damage-induced acetylation of lysine 3016 of ATM activates

ATM kinase activity. Mol Cell Biol, 2007. 27(24): p. 8502-9.

125. Sun, Y., et al., Histone H3 methylation links DNA damage detection to activation of

the tumour suppressor Tip60. Nat Cell Biol, 2009. 11(11): p. 1376-82.

126. Goodarzi, A.A., et al., ATM signaling facilitates repair of DNA double-strand

breaks associated with heterochromatin. Mol Cell, 2008. 31(2): p. 167-77.

127. Noon, A.T., et al., 53BP1-dependent robust localized KAP-1 phosphorylation is

essential for heterochromatic DNA double-strand break repair. Nat Cell Biol, 2010.

12(2): p. 177-84.

128. Chan, H.M., et al., The p400 E1A-associated protein is a novel component of the

p53 --> p21 senescence pathway. Genes Dev, 2005. 19(2): p. 196-201.

129. Bothmer, A., et al., 53BP1 regulates DNA resection and the choice between

classical and alternative end joining during class switch recombination. J Exp Med,

2010. 207(4): p. 855-65.

130. Bunting, S.F., et al., 53BP1 inhibits homologous recombination in Brca1-deficient

cells by blocking resection of DNA breaks. Cell, 2010. 141(2): p. 243-54.

131. Limbo, O., et al., Ctp1 is a cell-cycle-regulated protein that functions with Mre11

complex to control double-strand break repair by homologous recombination. Mol

Cell, 2007. 28(1): p. 134-46.

132. Pardo, B., B. Gomez-Gonzalez, and A. Aguilera, DNA repair in mammalian cells:

DNA double-strand break repair: how to fix a broken relationship. Cell Mol Life

Sci, 2009. 66(6): p. 1039-56.

133. Chiruvella, K.K., Z. Liang, and T.E. Wilson, Repair of double-strand breaks by end

joining. Cold Spring Harb Perspect Biol, 2013. 5(5): p. a012757.

134. Karanam, K., et al., Quantitative live cell imaging reveals a gradual shift between

DNA repair mechanisms and a maximal use of HR in mid S phase. Mol Cell, 2012.

47(2): p. 320-9.

135. Sonoda, E., et al., Differential usage of non-homologous end-joining and

homologous recombination in double strand break repair. DNA Repair (Amst),

2006. 5(9-10): p. 1021-9.

136. Moore, J.K. and J.E. Haber, Cell cycle and genetic requirements of two pathways of

nonhomologous end-joining repair of double-strand breaks in Saccharomyces

cerevisiae. Mol Cell Biol, 1996. 16(5): p. 2164-73.

137. Ochi, T., et al., DNA repair. PAXX, a paralog of XRCC4 and XLF, interacts with

Ku to promote DNA double-strand break repair. Science, 2015. 347(6218): p. 185-

8.

138. Xing, M., et al., Interactome analysis identifies a new paralogue of XRCC4 in non-

homologous end joining DNA repair pathway. Nat Commun, 2015. 6: p. 6233.

139. Radhakrishnan, S.K., N. Jette, and S.P. Lees-Miller, Non-homologous end joining:

emerging themes and unanswered questions. DNA Repair (Amst), 2014. 17: p. 2-8.

140. Lieber, M.R., The mechanism of double-strand DNA break repair by the

nonhomologous DNA end-joining pathway. Annu Rev Biochem, 2010. 79: p. 181-

211.

141. Cannan, W.J. and D.S. Pederson, Mechanisms and Consequences of Double-Strand

DNA Break Formation in Chromatin. J Cell Physiol, 2016. 231(1): p. 3-14.

142. Ma, Y., et al., Hairpin opening and overhang processing by an Artemis/DNA-

dependent protein kinase complex in nonhomologous end joining and V(D)J

recombination. Cell, 2002. 108(6): p. 781-94.

143. Yannone, S.M., et al., Coordinate 5' and 3' endonucleolytic trimming of terminally

blocked blunt DNA double-strand break ends by Artemis nuclease and DNA-

dependent protein kinase. Nucleic Acids Res, 2008. 36(10): p. 3354-65.

144. Bertocci, B., et al., Nonoverlapping functions of DNA polymerases mu, lambda, and

terminal deoxynucleotidyltransferase during immunoglobulin V(D)J recombination

in vivo. Immunity, 2006. 25(1): p. 31-41.

145. Wilson, T.E. and M.R. Lieber, Efficient processing of DNA ends during yeast

nonhomologous end joining. Evidence for a DNA polymerase beta (Pol4)-dependent

pathway. J Biol Chem, 1999. 274(33): p. 23599-609.

146. Ma, Y., et al., A biochemically defined system for mammalian nonhomologous DNA

end joining. Mol Cell, 2004. 16(5): p. 701-13.

147. Davis, B.J., J.M. Havener, and D.A. Ramsden, End-bridging is required for pol mu

to efficiently promote repair of noncomplementary ends by nonhomologous end

joining. Nucleic Acids Res, 2008. 36(9): p. 3085-94.

148. Nick McElhinny, S.A., et al., A gradient of template dependence defines distinct

biological roles for family X polymerases in nonhomologous end joining. Mol Cell,

2005. 19(3): p. 357-66.

149. Moon, A.F., et al., The X family portrait: structural insights into biological

functions of X family polymerases. DNA Repair (Amst), 2007. 6(12): p. 1709-25.

150. Nick McElhinny, S.A. and D.A. Ramsden, Polymerase mu is a DNA-directed

DNA/RNA polymerase. Mol Cell Biol, 2003. 23(7): p. 2309-15.

151. Dominguez, O., et al., DNA polymerase mu (Pol mu), homologous to TdT, could act

as a DNA mutator in eukaryotic cells. EMBO J, 2000. 19(7): p. 1731-42.

152. Tippin, B., et al., To slip or skip, visualizing frameshift mutation dynamics for

error-prone DNA polymerases. J Biol Chem, 2004. 279(44): p. 45360-8.

153. Ramadan, K., et al., De novo DNA synthesis by human DNA polymerase lambda,

DNA polymerase mu and terminal deoxyribonucleotidyl transferase. J Mol Biol,

2004. 339(2): p. 395-404.

154. Gu, J., et al., XRCC4:DNA ligase IV can ligate incompatible DNA ends and can

ligate across gaps. EMBO J, 2007. 26(4): p. 1010-23.

155. Riballo, E., et al., XLF-Cernunnos promotes DNA ligase IV-XRCC4 re-adenylation

following ligation. Nucleic Acids Res, 2009. 37(2): p. 482-92.

156. Grawunder, U., et al., Activity of DNA ligase IV stimulated by complex formation

with XRCC4 protein in mammalian cells. Nature, 1997. 388(6641): p. 492-5.

157. Palmbos, P.L., J.M. Daley, and T.E. Wilson, Mutations of the Yku80 C terminus and

Xrs2 FHA domain specifically block yeast nonhomologous end joining. Mol Cell

Biol, 2005. 25(24): p. 10782-90.

158. Koch, C.A., et al., Xrcc4 physically links DNA end processing by polynucleotide

kinase to DNA ligation by DNA ligase IV. EMBO J, 2004. 23(19): p. 3874-85.

159. Tsai, C.J., S.A. Kim, and G. Chu, Cernunnos/XLF promotes the ligation of

mismatched and noncohesive DNA ends. Proc Natl Acad Sci U S A, 2007. 104(19):

p. 7851-6.

160. Gu, J., et al., Single-stranded DNA ligation and XLF-stimulated incompatible DNA

end ligation by the XRCC4-DNA ligase IV complex: influence of terminal DNA

sequence. Nucleic Acids Res, 2007. 35(17): p. 5755-62.

161. Zetsche, B., et al., Multiplex gene editing by CRISPR-Cpf1 using a single crRNA

array. Nat Biotechnol, 2017. 35(1): p. 31-34.

162. Price, B.D. and A.D. D'Andrea, Chromatin remodeling at DNA double-strand

breaks. Cell, 2013. 152(6): p. 1344-54.

163. Song, B. and P. Sung, Functional interactions among yeast Rad51 recombinase,

Rad52 mediator, and replication protein A in DNA strand exchange. J Biol Chem,

2000. 275(21): p. 15895-904.

164. Hays, S.L., et al., Studies of the interaction between Rad52 protein and the yeast

single-stranded DNA binding protein RPA. Mol Cell Biol, 1998. 18(7): p. 4400-6.

165. Lisby, M., et al., Choreography of the DNA damage response: spatiotemporal

relationships among checkpoint and repair proteins. Cell, 2004. 118(6): p. 699-713.

166. Sung, P., Yeast Rad55 and Rad57 proteins form a heterodimer that functions with

replication protein A to promote DNA strand exchange by Rad51 recombinase.

Genes Dev, 1997. 11(9): p. 1111-21.

167. Krogh, B.O. and L.S. Symington, Recombination proteins in yeast. Annu Rev

Genet, 2004. 38: p. 233-71.

168. Sartori, A.A., et al., Human CtIP promotes DNA end resection. Nature, 2007.

450(7169): p. 509-14.

169. White, C.I. and J.E. Haber, Intermediates of recombination during mating type

switching in Saccharomyces cerevisiae. EMBO J, 1990. 9(3): p. 663-73.

170. Sun, H., D. Treco, and J.W. Szostak, Extensive 3'-overhanging, single-stranded

DNA associated with the meiosis-specific double-strand breaks at the ARG4

recombination initiation site. Cell, 1991. 64(6): p. 1155-61.

171. Heyer, W.D., et al., Rad54: the Swiss Army knife of homologous recombination?

Nucleic Acids Res, 2006. 34(15): p. 4115-25.

172. Bugreev, D.V., F. Hanaoka, and A.V. Mazin, Rad54 dissociates homologous

recombination intermediates by branch migration. Nat Struct Mol Biol, 2007.

14(8): p. 746-53.

173. Sugawara, N., X. Wang, and J.E. Haber, In vivo roles of Rad52, Rad54, and Rad55

proteins in Rad51-mediated recombination. Mol Cell, 2003. 12(1): p. 209-19.

174. Szostak, J.W., et al., The double-strand-break repair model for recombination. Cell,

1983. 33(1): p. 25-35.

175. Nassif, N., et al., Efficient copying of nonhomologous sequences from ectopic sites

via P-element-induced gap repair. Mol Cell Biol, 1994. 14(3): p. 1613-25.

176. Voelkel-Meiman, K. and G.S. Roeder, Gene conversion tracts stimulated by HOT1-

promoted transcription are long and continuous. Genetics, 1990. 126(4): p. 851-67.

177. Morrow, D.M., C. Connelly, and P. Hieter, "Break copy" duplication: a model for

chromosome fragment formation in Saccharomyces cerevisiae. Genetics, 1997.

147(2): p. 371-82.

178. Llorente, B., C.E. Smith, and L.S. Symington, Break-induced replication: what is it

and what is it for? Cell Cycle, 2008. 7(7): p. 859-64.

179. Lin, F.L., K. Sperle, and N. Sternberg, Model for homologous recombination during

transfer of DNA into mouse L cells: role for DNA ends in the recombination

process. Mol Cell Biol, 1984. 4(6): p. 1020-34.

180. Saparbaev, M., L. Prakash, and S. Prakash, Requirement of mismatch repair genes

MSH2 and MSH3 in the RAD1-RAD10 pathway of mitotic recombination in

Saccharomyces cerevisiae. Genetics, 1996. 142(3): p. 727-36.

181. Flott, S., et al., Phosphorylation of Slx4 by Mec1 and Tel1 regulates the single-

strand annealing mode of DNA repair in budding yeast. Mol Cell Biol, 2007.

27(18): p. 6433-45.

182. Fishman-Lobell, J. and J.E. Haber, Removal of nonhomologous DNA ends in

double-strand break recombination: the role of the yeast ultraviolet repair gene

RAD1. Science, 1992. 258(5081): p. 480-4.

183. Thomas, K.R. and M.R. Capecchi, Site-directed mutagenesis by gene targeting in

mouse embryo-derived stem cells. Cell, 1987. 51(3): p. 503-12.

184. Moehle, E.A., et al., Targeted gene addition into a specified location in the human

genome using designed zinc finger nucleases. Proc Natl Acad Sci U S A, 2007.

104(9): p. 3055-60.

185. Bajar, B.T., et al., Fluorescent indicators for simultaneous reporting of all four cell

cycle phases. Nat Methods, 2016. 13(12): p. 993-996.

186. Musselman, C.A., et al., Molecular basis for H3K36me3 recognition by the Tudor

domain of PHF1. Nat Struct Mol Biol, 2012. 19(12): p. 1266-72.

187. Gavin, A.C., K. Maeda, and S. Kuhner, Recent advances in charting protein-protein

interaction: mass spectrometry-based approaches. Curr Opin Biotechnol, 2011.

22(1): p. 42-9.

188. Eryilmaz, J., et al., Structural studies of a four-MBT repeat protein MBTD1. PLoS

One, 2009. 4(10): p. e7274.

189. Guo, Y., et al., Methylation-state-specific recognition of histones by the MBT repeat

protein L3MBTL2. Nucleic Acids Res, 2009. 37(7): p. 2204-10.

190. Kim, J., et al., Tudor, MBT and chromo domains gauge the degree of lysine

methylation. EMBO Rep, 2006. 7(4): p. 397-403.

191. Myers, A.P. and L.C. Cantley, Targeting a common collaborator in cancer

development. Sci Transl Med, 2010. 2(48): p. 48ps45.

192. McCubrey, J.A., et al., Targeting the RAF/MEK/ERK, PI3K/AKT and p53 pathways

in hematopoietic drug resistance. Adv Enzyme Regul, 2007. 47: p. 64-103.

193. Liu, P., et al., Targeting the phosphoinositide 3-kinase pathway in cancer. Nat Rev

Drug Discov, 2009. 8(8): p. 627-44.

194. Smith, R.J., et al., Ataxia telangiectasia mutated (ATM) interacts with p400 ATPase

for an efficient DNA damage response. BMC Mol Biol, 2016. 17(1): p. 22.

195. Jacquet, K., et al., The TIP60 Complex Regulates Bivalent Chromatin Recognition

by 53BP1 through Direct H4K20me Binding and H2AK15 Acetylation. Mol Cell,

2016. 62(3): p. 409-21.

196. Zee, B.M., et al., Streamlined discovery of cross-linked chromatin complexes and

associated histone modifications by mass spectrometry. Proc Natl Acad Sci U S A,

2016. 113(7): p. 1784-9.
