49
Genome-wide profiling of Epstein-Barr virus integration by targeted sequencing in Epstein-Barr virus associated malignancies Miao Xu 1,¶ , Wei-Long Zhang 2, ¶ , Qing Zhu 2, ¶ ,You-yuan Yao 1, ¶ , Qi-Sheng Feng 1 , Zhe Zhang 3 , Rou-Jun Peng 1 , Wei-Hua Jia 1 , Gui-Ping He 1 , Lin Feng 1 , Zhao-Lei Zeng 1 , Bing Luo 4 , Rui-Hua Xu 1 , Mu-Sheng Zeng 1 , Wei- Li Zhao 5 , Sai-Juan Chen 5 , Yi-Xin Zeng 1,2* , Yuchen Jiao 2,* 1 State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China. 2 State Key Lab of Molecular Oncology, National Cancer Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College; Collaborative Innovation Center for Cancer Medicine, Beijing, China. 3 Department of Otolaryngology/Head and Neck Surgery, First Affiliated Hospital of Guangxi Medical University, Nanning, China. 4 Department of Medical Microbiology, Qingdao University Medical College, Qingdao, China. 1 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 1 2

 · Web viewEpstein-Barr virus (EBV) is associated with multiple malignancies with expression of viral oncogenic proteins and chronic inflammation as major mechanisms contributing

  • Upload
    buicong

  • View
    215

  • Download
    0

Embed Size (px)

Citation preview

Genome-wide profiling of Epstein-Barr virus integration by targeted sequencing in

Epstein-Barr virus associated malignancies

Miao Xu1,¶, Wei-Long Zhang2, ¶, Qing Zhu2, ¶,You-yuan Yao1, ¶, Qi-Sheng Feng1, Zhe Zhang3,

Rou-Jun Peng1, Wei-Hua Jia1, Gui-Ping He1, Lin Feng1, Zhao-Lei Zeng1, Bing Luo4, Rui-Hua

Xu1, Mu-Sheng Zeng1, Wei-Li Zhao5, Sai-Juan Chen5, Yi-Xin Zeng1,2*, Yuchen Jiao2,*

1 State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer

Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China.

2 State Key Lab of Molecular Oncology, National Cancer Center/Cancer Hospital, Chinese

Academy of Medical Sciences and Peking Union Medical College; Collaborative Innovation

Center for Cancer Medicine, Beijing, China.

3 Department of Otolaryngology/Head and Neck Surgery, First Affiliated Hospital of Guangxi

Medical University, Nanning, China.

4 Department of Medical Microbiology, Qingdao University Medical College, Qingdao, China.

5 State Key Laboratory of Medical Genomics, Shanghai Institute of Hematology, Rui Jin

Hospital, Shanghai Jiao Tong University (SJTU) School of Medicine and Collaborative

Innovation Center of Systems Biomedicine, Shanghai, China.

* Corresponding author

E-mail: [email protected] (Y.-X.Z.)

[email protected] (Y.-C.J.)

¶ These authors contributed equally to the work.

1

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

2212

Abstract

Rationale: Epstein-Barr virus (EBV) is associated with multiple malignancies with expression

of viral oncogenic proteins and chronic inflammation as major mechanisms contributing to tumor

development. A less well-studied mechanism is the integration of EBV into the human genome

possibly at sites which may disrupt gene expression or genome stability. Methods: We

sequenced tumor DNA to profile the EBV sequences by hybridization-based enrichment.

Bioinformatic analysis was used to detect the breakpoints of EBV integrations in the genome of

cancer cells. Results: We identified 197 breakpoints in nasopharyngeal carcinomas and other

EBV-associated malignancies. EBV integrations were enriched at vulnerable regions of the

human genome and were close to tumor suppressor and inflammation-related genes. We found

that EBV integrations into the introns could decrease the expression of the inflammation-related

genes, TNFAIP3, PARK2, and CDK15, in NPC tumors. In the EBV genome, the breakpoints

were frequently at oriP or terminal repeats. These breakpoints were surrounded by

microhomology sequences, consistent with a mechanism for integration involving viral genome

replication and microhomology-mediated recombination. Conclusion: Our finding provides

insight into the potential of EBV integration as an additional mechanism mediating

tumorigenesis in EBV associated malignancies.

Keywords: Epstein-Barr virus (EBV), nasopharyngeal carcinoma (NPC), DNA integration

Graphical Abstract

2

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

34

Introduction

Epstein-Barr virus (EBV) is one of the first described human cancer viruses. EBV is

associated with ~ 1% of cancers worldwide, including Burkitt lymphoma, nasopharyngeal

carcinoma (NPC), Hodgkin lymphomas, NK/T cell lymphomas, and a subset of gastric

carcinomas [1, 2]. The EBV genome typically exists as an episome in infected cells. The most

well-described EBV carcinogenic mechanisms are mediated through EBV viral protein effects or

EBV infection. Expression of EBV proteins, EBNA-1, EBNA-2, EBNA-3A/3B/3C, LMP-1 and

LMP-2, causes B cell and epithelial cell proliferation, increases viability of Burkitt lymphoma

and NPC cells, and induces DNA damage and genomic instability [3-6], while EBV infection

promotes chronic inflammation and reduces anti-tumor immune surveillance in the epithelium

[4].

The first reports of the integration of the EBV genome into host genomes date back to the

1980s [7-10]. Subsequent studies confirmed the frequent integration of full-length EBV genomes

as well as DNA fragments in EBV-positive lymphoma and epithelial carcinomas including NPC

3

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

56

and gastric carcinoma [11-21]. These findings suggest that integrated and episomal EBV DNA

coexist in tumor cells in vivo and in vitro. The significance of viral integrations as they may

relate to host genome abnormalities and ultimately the development of cancer is not well

understood [22]. Thus, it is important to determine whether EBV randomly integrates into human

genomes or not [23-27]. The discoveries that integration sites occasionally co-localize with

regions containing cancer associated genes JAK2, PD-L1 and PD-L2 in gastric carcinomas and

BACH2, REL and BCL-11A in Burkitt lymphoma cells raise the possibility that EBV integration

can promote carcinogenesis [20, 28, 29]. However, these studies are limited by a small sample

size and the absence of a systematic investigation of the EBV integration landscape on a

genome-wide scale. To provide systematic insight into EBV integration in associated

malignancies, we performed EBV-targeted ultra-deep sequencing and conducted a

comprehensive survey of EBV integration in a variety of human malignancies. This work

provides the first unbiased, genome-wide analysis of EBV integrations, and reveals the

involvement of novel inflammation-related genes in NPC.

Results

To perform comprehensive profiling of EBV integration, we conducted EBV-targeted ultra-

deep sequencing on 177 NPCs, 39 gastric carcinomas, 25 NK/T cell lymphomas, 11 Hodgkin

lymphomas, one nasopharyngitis tissue and the EBV-positive NPC cell line C666-1. A total of

197 EBV integration breakpoints were identified from 33 tumors and the C666-1 cell line (Table

1 and Figure S1). The integration rates were higher in the gastric carcinomas (25.6%; 95%

confidence interval (CI): 13.0 - 42.1%) than in the NPC tumors (9.6%; 95% CI: 5.7 - 14.9%).

We observed slightly more EBV integration positive samples in late-stage NPC tumors (stage

4

60

61

62

63

64

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79

80

81

82

78

III-IV) and large-size gastric cancers (> 5 cm; Table S1). The EBV integration counts in positive

tumor samples varied widely among tumor types and individual cases. Twenty-seven of the 34

positive samples harbored 1-2 breakpoints. The remaining positive samples (n = 7) contained

more than two with one gastric cancer harboring an especially large number (118) of integration

breakpoints. At least 2 EBV integration breakpoints were consistent between matched primary

and metastatic NPC tumors from the same patient (Figure S2).

The 197 breakpoints were distributed over all 23 human chromosomes (Figure 1A). EBV

showed a strong tendency to integrate near common fragile sites in both NPC and gastric cancer

samples (Figure 1B). Similarly, two of the six breakpoints identified in NK/T cell lymphoma

samples were located at the same common fragile site (Table S2). EBV also tended to integrate

into microsatellite repeats in gastric carcinomas, but avoided SINE repeats in NPC and gastric

carcinomas (Figure 1C; For details, see Figure S3 and Table S2). Common fragile regions and

microsatellite repeats are vulnerable to DNA damage, which increases the chance for EBV DNA

insertion into host genomes through microhomology-mediated DNA repair.

We also tested the relative position of EBV integrations to genes in the human genome. We

found that 75 breakpoints (38.1%) were located within known UCSC-annotated genes (Figure

S4). The integration sites were slightly skewed toward gene body and promoter regions (26

breakpoints; 13.2%; Figure S4). Unlike HBV and HPV integrations, no associations were found

between EBV integration sites and CpG islands, the repetitive elements other than microsatellite

repeats, or the binding sites of the genome architecture regulator CCCTC-binding factor (CTCF;

Figure S5).

5

83

84

85

86

87

88

89

90

91

92

93

94

95

96

97

98

99

100

101

102

103

910

We detected EBV integration in the proximity of tumor suppressor genes: KANK1 in one

NK/T cell lymphoma, RB1CC1 in one Hodgkin lymphoma, and DLEC1 in one NPC tumor

(Table S2). Integrations in gastric carcinoma samples were associated with tumor suppressor

genes SETD2, KISS1, FHIT, PTEN and TET2 (Table S2). The integration breakpoints associated

with the histone methyltransferase, SETD2, were located in a region 22 kb upstream of the gene

(Figure 2A). In one NK/T cell lymphoma, a breakpoint was found 27 kb downstream of the

tumor suppressor gene KANK1. The other EBV integration breakpoint in the same sample was

identified 204 kb downstream of JAK2. Amplification was also detected in this region (Figure

2B). Common fragile regions were often co-localized with tumor suppressor genes; for example,

the common fragile site FRA3B lies within the tumor suppressor gene FHIT. A number of

studies have shown that tumor suppressor genes and common fragile regions are frequent targets

of viral DNA integration [30, 31]. The breakpoints associated with tumor suppressor genes in our

study showed significantly higher coverage in the targeted sequencing, which indicates that they

were more likely to be clonal in the tumorigenesis (P < 0.0001, unpaired, two-sided t-test).

Integration could alter the expression or function of tumor suppressor genes and provide host

cells with a selective advantage during tumorigenesis.

We also identified EBV integrations located within the introns of CDK15 in the primary

(Figures S2A-B) and metastatic (Figures 2C, S2C and D) NPC tumors from a single patient,

and TNFAIP3 and PARK2 in two additional NPC tumors from two other patients (Figures 2D-

E). These breakpoints were all supported by a high number of sequencing reads, suggesting

clonal expansion of cancer cells after EBV integration (Table S2). Notably, TNFAIP3, CDK15

and PARK2 are all inflammation-related genes involved in the regulation of TNF-alpha-induced

apoptosis/NF-κB pathways [32-34], and dysregulation of these pathways contributes to the

6

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

1261112

development of EBV-associated cancers, including NPC [35]. We performed the

immunohistochemistry staining of CDK15, TNFAIP3 and PARK2 proteins using the integrated

and non-integrated NPC samples. We found that the protein levels of CDK15, TNFAIP3 and

PARK2 were lower in the samples harboring EBV integrations into the introns of the respective

genes. (Figures 3A-C). Using qPCR of NF-κB targeted genes and a luciferase reporter gene

assay, we found that NF-κB activity was up-regulated in NPC cells with TNFAIP3 knockdown

(Figure S6A), confirming its role as an inhibitor of NF-κB pathway. In contrast, NF-κB activity

was down-regulated, and nuclear localization of p65 after TNF-α treatment was diminished in

NPC cells with CDK15 knockdown (Figure S6B), indicating that CDK15 is positively related to

the activation of the NF-κB pathway.

To investigate the mechanisms of EBV integration, we surveyed the distribution of the 197 EBV

integration breakpoints in the EBV genome. EBV breakpoints were spread over the entire viral

genome with multiple hotspots (Figure 4A). Breakpoints were enriched in the proximity of oriP

and terminal repeats, while no breakpoints were detected within the two long internal repeats

(Figure 4B). The tendency for EBV breakpoints to localize around oriP and terminal repeats

indicated that EBV integration was related to viral genome replication. We further analyzed the

microhomology (MH) sequences in the regions flanking integration sites. We found frequent

microhomologies between the human genome and the EBV genome near integration breakpoints

(Figure S2 and Figure S7). Insertions of 2-10 bp were also observed near the EBV integration

breakpoints (Figures S2 and Figure S7). Two EBV integrations containing MH sequences were

observed in matched primary and metastatic NPC tumors from a single patient (Figure S2).

Methods

7

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

1314

Ethics statement

This study was approved by the institutional ethics committees of the Sun Yat-sen

University Cancer Center (Guangzhou, China), the First Affiliated Hospital of Guangxi Medical

University (Nanning, China), the Affiliated Hospital of Qingdao University (Qingdao, China),

and Rui Jin Hospital (Shanghai, China). Written informed consent was obtained from each study

participant.

Sample collection and DNA extraction

Frozen samples were collected for isolation of genomic DNAs from the following tissue

types: NPC (n = 177; the Sun Yat-sen University Cancer Center and the First Affiliated Hospital

of Guangxi Medical University); gastric carcinoma (n = 39; the Sun Yat-sen University Cancer

Center and the Affiliated Hospital of Qingdao University); NK/T cell lymphoma (n = 25; the Sun

Yat-sen University Cancer Center and Rui Jin Hospital); Hodgkin lymphoma (n = 11; the Sun

Yat-sen University Cancer Center); and nasopharyngitis (n = 1; the Sun Yat-sen University

Cancer Center). All histopathological diagnoses were performed according to WHO

classifications and were reviewed by two pathologists. The clinicopathological characteristics of

the study subjects are listed in Table S1.

Genomic DNAs were extracted from the frozen tissue samples using the DNeasy Blood and

Tissue Kit (Qiagen; Germantown, MD, USA).

EBV DNA capture and sequencing

Genomic DNA was subjected to hybrid capture using an EBV-targeting single-stranded

DNA probe developed by MyGenostics (Beijing, China). Sequencing libraries were constructed

8

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

1516

by shearing genomic DNA into 150-200 bp fragments, followed by DNA purification, end

blunting and adaptor ligation according to the instructions provided by Illumina (San Diego, CA,

USA). The library concentrations were evaluated with a Bioanalyzer 2100 (Agilent

Technologies; Santa Clara, CA, USA). EBV DNA was captured from the genomic DNA

following the MyGenostics GenCap Target Enrichment Protocol (GenCap Enrichment,

MyGenostics). The libraries were hybridized with EBV probes at 65°C for 24 h and then washed

to remove uncaptured DNA. The eluted DNA fragments were amplified through 18 PCR cycles

to generate libraries for sequencing. The libraries were subsequently quantified and subjected to

paired-end sequencing (2 100 or 150 bp) on an Illumina HiSeq 2000 sequencer according to

the manufacturer’s instructions (Illumina).

Integration detection, validation and annotation

Quality assessment of the raw reads was conducted using TrimGalore to remove adaptor

sequences and low-quality reads. High-quality reads were aligned to the human (NCBI build 37,

hg19) and EBV genomes (NC_007605.1) using the Burrows-Wheeler Aligner (BWA, version

0.7.5a) [36]. Alignments were converted from a sequence alignment map format to sorted and

indexed binary alignment map (BAM) files [37]. The Picard tool was used to remove duplicate

reads. We developed a bioinformatic method based on LUMPY to identify EBV-human chimeric

reads[38]. Briefly, paired-end reads mapping solely to the human or EBV reference genomes

were removed. Next, two types of integration-supportive signals were extracted by LUMPY as

follows: 1) read pairs, in which one of the paired reads was mapped to the EBV genome and the

other to the human genome and 2) chimeric reads, in which one read covered both the EBV

genome sequence and the human genome sequence. Integration events supported by ≥ 3 read

9

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

1718

pairs or chimeric reads were retained. ANNOVAR [39] and the UCSC table browser [40] were

used to annotate the breakpoints using fragile regions [41, 42] and the UCSC hg19 CpG island

and RepeatMasker database. We randomly selected 12 integrations and performed PCR and

Sanger sequencing to validate. Ten (83.3%) breakpoints were validated with Sanger sequencing,

while PCR for two breakpoints failed to generate products (Table S3).

Copy number variation and gene expression

CNV data for three NK/T cell lymphomas were retrieved from the study by Jiang et al. [43].

Publicly available gene expression data for normal epithelium (n = 10) and NPC tumors (n = 31)

obtained on the Affymetrix Human Genome U133 Plus 2.0 Array were retrieved from the NCBI

GEO database (GSE12452). Probeset measures of all 41 arrays were calculated by robust

multiarray averaging. The relative RNA expression value was log-transformed using log2. Data

were analyzed with the unpaired t-test and presented as the mean ± SEM. A P-value of < 0.05

was considered to be statistically significant.

siRNA transfection

CDK15 and TNFAIP3 siRNAs were designed and synthesized by RIBOBIO (Guangzhou,

China). Non-targeting siRNA duplexes are denoted as “scr”. Knockdown efficiency was

confirmed with qPCR. Cells (7 × 105) were seeded into 6 well plates, and after incubation

overnight, transfected with the indicated siRNA duplexes using Lipofectiamine RNAiMAX

Transfection Reagent (Thermo Fisher Scientific; Waltham, MA, USA), according to the

manufacturer’s instructions. Sequences of the CDK15 siRNAs were the following:

CDK15-siRNA#1: 5’-GAGGAAGGAGTCCCATTTA-3’

10

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

213

214

215

216

217

218

1920

CDK15-siRNA#2: 5’-CCTCAGAACTTACTCATCA-3’

TNFAIP3-siRNA#1: 5’-GCGGAAAGCTGTGAAGATA-3’

TNFAIP3-siRNA#2: 5’-CAACTCACTGGAAGAAATA-3’

Quantitative Real-time PCR

Total RNA was isolated using Trizol reagent (Thermo Fisher Scientific), and cDNA was

prepared from RNA (1 µg) using the PrimeScript RT Reagent Kit (TaKaRa; Tokyo, Japan)

according to the manufacturer’s instructions. Quantitative real-time PCR was performed using

the Platinum SYBR Green qPCR SuperMix (Thermo Fisher Scientific) on a Light cycler 480

Real-time system (Roche; Indianapolis, IN, USA).

The primers sequences used were the following:

CDK15-F: 5’-TATGCGACAGTTTACAAGG-3’

CDK15-R: 5’-TCATGCAGGAGCACAATA-3’

IKBα-F: 5’-CTCCGAGACTTTCGAGGAAATAC-3’

IKBα-R: 5’-GCCATTGTAGTTGGTAGCCTTCA-3’

IL6-F: 5’-ACTCACCTCTTCAGAACGAATT-3’

IL6-R: 5’-CCATCTTTGGAAGGTTCAGGTTG-3’

MCP1-F: 5’-CAGCCAGATGCAATCAATGC-3’

MCP1-R: 5’-TGGAATCCTGAACCCACTTCT-3’

TNF-F: 5’-CCTCTCTCTAATCAGCCCTCTG-3’

TNF-R: 5’-GAGGACCTGGGAGTAGATGAG-3’

GAPDH-F: 5’-TTGCCATCAATGACCCCTTCA-3’

GAPDH-R: 5’-CGCCCCACTTGATTTTGGA-3’

11

219

220

221

222

223

224

225

226

227

228

229

230

231

232

233

234

235

236

237

238

239

240

241

2122

Western blot analysis

Western blot analysis was performed as previously described [44]. Cells were lysed in

RIPA lysis buffer containing protease inhibitors (Roche). The subcellular fractionation was

performed as previously described [45]. The proteins were separated with 10% SDS-PAGE and

transferred to PVDF membranes (Thermo Fisher Scientific). Membranes were blocked with 5%

BSA for 1 h and incubated with primary antibodies overnight in 4°C. Antibodies against p65

(ab53465; Abcam; Cambridge, MA, USA), ß-actin (#3700; Cell Signaling Technology; Danvers,

MA, USA), and GAPDH (60004-1-1 g; Protein-tech; Chicago, IL, USA) were used for western

blot analysis.

Luciferase reporter assay

(CAGA)12-Luc and the control vector pRL-TK (Promega) encoding Renilla luciferase were

cotransfected into HEK293T or NPC cells using PEI. Luciferase activity was measured 24 h

after transfection using the Dual-Luciferase Reporter Assay System (Promega). The firefly

luciferase activity values were normalized to those of Renilla, and the ratios of firefly/Renilla

activities were determined. The experiments were independently performed in triplicate.

Discussion

EBV integration during tumorigenesis has not yet been systematically investigated using

rigorous methods. In this study, we conducted the first large-scale analysis of EBV integration in

multiple malignancies using EBV genome-targeted sequencing. Our method, combining EBV

genome capture and ultra-deep sequencing, efficiently detected integrated EBV sequences from

background “noise” introduced by nuclear EBV episomes. Our results indicate that EBV can

12

242

243

244

245

246

247

248

249

250

251

252

253

254

255

256

257

258

259

260

261

262

263

264

2324

integrate into host genomes at a significant rate in multiple tumor types. We observed that EBV

integration frequencies varied among tumor types as well as the number of integrated EBV

genomes among tumor samples. The heterogeneity of these tumor genomes may underlie the

observed variation of EBV integration in these tumors.

Our study revealed that common fragile regions were preferred sites for EBV integration.

Common fragile regions are genomic hotspots for DNA damage and are susceptible to genome

rearrangement, thereby increasing the chance for EBV DNA insertion through microhomology-

mediated DNA repair, which has an important role in the integration of other tumorigenic

viruses, HBV and HPV [30, 31]. We observed EBV integrations into or near tumor suppressor

genes that were often colocalized with common fragile regions. Integration in the proximity of

tumor suppressor genes may provide host cells with a selective advantage. Moreover, integration

distribution in gastric carcinomas correlated with microsatellite repeats which are vulnerable to

DNA damage, one of the samples which has the 118 integration breakpoints, the whole-exome

sequencing showed the mutation including XRCC2, PARP3, SLX4, and PMS2, which are

involved in DNA repair, further suggesting that host genome stability has a strong impact on

EBV integration. Although genome instability and microhomology-mediated DNA repair are

involved in the integration of EBV, HPV and HBV DNA into the host genome, why does EBV

integration occur at a relatively lower rate (25.6% in GC, 9.6% in NPC) than HPV integration in

cervical cancer (76.3%) and head and neck squamous cell carcinoma (HNSCC; 60.7%), and

HBV in hepatocellular carcinoma (HCC; 92.6%)? [31, 46-49]. There are several possible

reasons. First, mechanisms underlying tumorigenesis and genetic backgrounds differ greatly

between these tumor types associated with the different viruses. The tumor suppressors, TP53

and RB, are inactivated by the expression of HPV-encoded oncogenes E6 and E7 in HPV-

13

265

266

267

268

269

270

271

272

273

274

275

276

277

278

279

280

281

282

283

284

285

286

2872526

associated cervical cancer and HNSCC. Dysfunction of the TP53 pathway has also been

frequently observed in HBV-associated HCC (~ 18-51.8%) [50-52]. The impairment of the TP53

pathway leads to increased genomic instability and accumulation of somatic mutations, possibly

also contributing to the high rate of HPV and HBV integration. However, the TP53 pathway is

not mutated as frequently in NPC (~ 7-10%) and EBV-associated gastric cancer (rarely) as in

HCC [53-55]. Moreover, large-scale whole-genome surveys indicate that NPC and EBV-

associated gastric carcinomas tend to have relatively stable genomes, compared to many other

carcinomas including cervical cancer, HNSCC and HCC [53]. Second, life cycles and genomic

features of the viruses themselves may also affect their integration into the host genome. In latent

infection state, EBV episomes are replicated along with chromosomes in host cells and therefore

are relatively stable. EBV has a much larger genome than HPV or HBV, which may make its

integration by microhomology-mediated DNA recombination more difficult.

In NPC, three integration events were localized to introns of the inflammation-related genes

PARK2, TNFAIP3 and CDK15, which regulate the TNF-alpha-induced apoptosis/NF-κB

pathways. PARK2 deficiency promotes inflammation and genome instability and has an

important role in the development of lung cancer [56]. Dysregulation of NF-κB activation

contributes to the development of various EBV-associated cancers, including NPC [35]. We

found lower expression of PARK2, TNFAIP3 and CDK15 proteins and also dysregulated NF-κB

activity in the integrated NPC tumors. Our results suggest that integration of EBV into these

genes may disrupt their function and contribute to tumorigenesis through the TNF-alpha-induced

apoptosis/NF-κB pathways. The identification of integrations with a high number of supporting

reads associated with inflammatory genes indicates that such EBV integration events are

potentially selected for during tumor development. If these genes are involved in NPC

14

288

289

290

291

292

293

294

295

296

297

298

299

300

301

302

303

304

305

306

307

308

309

3102728

development, they may be frequent targets of somatic mutation in NPC. TNFAIP3 gene as a

mutation hot spot has been already confirmed by previous studies in NPC [53-55]. We also

searched our unpublished data and confirmed that about 5% of NPC tumors harbor copy number

variation at CDK15 and PARK2 loci.

In the EBV genome, the breakpoints identified in this study were concentrated around oriP

and the terminal repeats. During latency, EBNA1 binding to oriP can recruit host cell replication

machinery to facilitate the formation of an efficient origin of replication for the EBV episome

[57]. The terminal repeats are responsible for the circularization of the EBV genome after it

enters the nucleus and cleavage/encapsulation of EBV DNA. Both EBV genome circularization

and cleavage involve recombination events. The microhomology sequences around breakpoints

indicate that EBV integration involves microhomology-mediated DNA repair pathways. These

integrations may be triggered by genomic vulnerability/fragility during genome replication and

the physical proximity of the oriP repeats to the host DNA bridged by EBNA1 and DNA

recombination, which underlies the EBV integration mechanism.

Till now, we still do not know the size of the EBV sequence integrated into the host genome.

Due to the limit of the read length in the second or third generation sequencing, we can only

identify a few hundred base pair to around 10 Kb of EBV sequence fused into the host genome.

Compared with the large size of the EBV genome, it is difficult to determine whether a portion

or the full EBV genome is integrated into human genome. Future development of sequencing

technology could help to map the landscape of EBV genome integrated into the human genome.

In summary, our work provides an unbiased large-scale genome-wide analysis of the EBV

integration landscape in multiple malignancies. EBV integration occurs preferentially within 15

311

312

313

314

315

316

317

318

319

320

321

322

323

324

325

326

327

328

329

330

331

3322930

unstable chromosomal regions of the host genome, surrounding oriP or terminal repeats of the

EBV genome. Several integration sites were located in the proximity of tumor suppressor genes

that are frequently disrupted during cancer progression. We detected multiple integrations into

genes regulating TNF-alpha-induced apoptosis/NF-κB pathways in NPC. These pathways are

closely related to EBV-associated diseases and indicate that EBV integration disrupts the

function of crucial genes, leading to the development of cancer in some cases of latent EBV

infection.

Abbreviations EBV: Epstein-Barr virus; NPC: nasopharyngeal carcinoma; CI: confidence

interval; MH: microhomology.

Acknowledgments

We would like to thank all the participants recruited for this study. We thank Dr. E.D. Kieff

for helpful discussions and suggestions on this manuscript. This work was supported by the

National Natural Science Foundation of China (81872228 and 81430059) and the National Key

R&D Program of China (No. 2016YF0902000).

Author contributions

Y.-X.Z. and Y.-C.J. were the overall principal investigators who conceived the study and

obtained financial support. Y.-X.Z., M.X. and Y.-C. J designed and oversaw the study. M.X.,

W.-L.Z. and Y.Y. performed sample preparation, sequencing and statistical analysis and

validation. S.-J.C., W-L.Z., R.-H.X., Z.Z. and B.L. contributed to sample collection. Q.Z. edited

16

333

334

335

336

337

338

339

340

341

342

343

344

345

346

347

348

349

350

351

352

353

354

3132

the tables and figures. The manuscript was drafted by M.X. and W.-L.Z. under the supervision of

Y.-X.Z. and Y.-C.J. All authors critically reviewed the article and approved the final manuscript.

Competing interests

The authors declare no competing interests.

References:

1. Parkin DM. The global health burden of infection-associated cancers in the year 2002. Int J Cancer. 2006; 118: 3030-44.2. Zur Hausen H, de Villiers EM. Reprint of: cancer "causation" by infections--individual contributions and synergistic networks. Semin Oncol. 2015; 42: 207-22.3. Arvey A, Tempera I, Tsai K, Chen HS, Tikhmyanova N, Klichinsky M, et al. An atlas of the Epstein-Barr virus transcriptome and epigenome reveals host-virus regulatory interactions. Cell Host Microbe. 2012; 12: 233-45.4. Tsao SW, Tsang CM, To KF, Lo KW. The role of Epstein-Barr virus in epithelial malignancies. J Pathol. 2015; 235: 323-33.5. Li R, Liao G, Nirujogi RS, Pinto SM, Shaw PG, Huang TC, et al. Phosphoproteomic Profiling Reveals Epstein-Barr Virus Protein Kinase Integration of DNA Damage Response and Mitotic Signaling. PLoS Pathog. 2015; 11: e1005346.6. Shumilov A, Tsai MH, Schlosser YT, Kratz AS, Bernhardt K, Fink S, et al. Epstein-Barr virus particles induce centrosome amplification and chromosomal instability. Nat Commun. 2017; 8: 14257.7. Henderson A, Ripley S, Heller M, Kieff E. Chromosome site for Epstein-Barr virus DNA in a Burkitt tumor cell line and in lymphocytes growth-transformed in vitro. Proc Natl Acad Sci U S A. 1983; 80: 1987-91.8. Matsuo T, Heller M, Petti L, O'Shiro E, Kieff E. Persistence of the entire Epstein-Barr virus genome integrated into human lymphocyte DNA. Science. 1984; 226: 1322-5.9. Lawrence JB, Villnave CA, Singer RH. Sensitive, high-resolution chromatin and chromosome mapping in situ: presence and orientation of two closely integrated copies of EBV in a lymphoma line. Cell. 1988; 52: 51-61.10. Anvret M, Karlsson A, Bjursell G. Evidence for integrated EBV genomes in Raji cellular DNA. Nucleic Acids Res. 1984; 12: 1149-61.11. Delecluse HJ, Bartnizke S, Hammerschmidt W, Bullerdiek J, Bornkamm GW. Episomal and integrated copies of Epstein-Barr virus coexist in Burkitt lymphoma cell lines. J Virol. 1993; 67: 1292-9.12. Wolf J, Pawlita M, Klevenz B, Frech B, Freese UK, Muller-Lantzsch N, et al.

17

355

356

357

358

359

360

361

362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393

3334

Down-regulation of integrated Epstein-Barr virus nuclear antigen 1 and 2 genes in a Burkitt lymphoma cell line after somatic cell fusion with autologous EBV-immortalized lymphoblastoid cells. Int J Cancer. 1993; 53: 621-7.13. Hurley EA, Agger S, McNeil JA, Lawrence JB, Calendar A, Lenoir G, et al. When Epstein-Barr virus persistently infects B-cell lines, it frequently integrates. J Virol. 1991; 65: 1245-54.14. Kripalani-Joshi S, Law HY. Identification of integrated Epstein-Barr virus in nasopharyngeal carcinoma using pulse field gel electrophoresis. Int J Cancer.1994; 56: 187-92.15. Chang Y, Cheng SD, Tsai CH. Chromosomal integration of Epstein-Barr virus genomes in nasopharyngeal carcinoma cells. Head Neck. 2002; 24: 143-50.16. Zhang HY, Qu G, Deng ZW, Yao TH, Glaser R. Epstein-Barr virus DNA in nasopharyngeal biopsies. Virus Res. 1989; 12: 53-9.17. Xiao K, Yu Z, Li X, Li X, Tang K, Tu C, et al. Genome-wide Analysis of Epstein-Barr Virus (EBV) Integration and Strain in C666-1 and Raji Cells. J Cancer. 2016; 7: 214-24.18. Morissette G, Flamand L. Herpesviruses and chromosomal integration. J Virol. 2010; 84: 12100-9.19. Cao S, Strong MJ, Wang X, Moss WN, Concha M, Lin Z, et al. High-throughput RNA sequencing-based virome analysis of 50 lymphoma cell lines from the Cancer Cell Line Encyclopedia project. J Virol. 2015; 89: 713-29.20. Gulley ML. Genomic assays for Epstein-Barr virus-positive gastric adenocarcinoma. Exp Mol Med. 2015; 47: e134.21. Ohshima K, Suzumiya J, Kanda M, Kato A, Kikuchi M. Integrated and episomal forms of Epstein-Barr virus (EBV) in EBV associated disease. Cancer Lett.1998; 122: 43-50.22. Ohshima K, Suzumiya J, Ohga S, Ohgami A, Kikuchi M. Integrated Epstein-Barr virus (EBV) and chromosomal abnormality in chronic active EBV infection. Int J Cancer. 1997; 71: 943-7.23. Lestou VS, De Braekeleer M, Strehl S, Ott G, Gadner H, Ambros PF. Non-random integration of Epstein-Barr virus in lymphoblastoid cell lines. Genes Chromosomes Cancer. 1993; 8: 38-48.24. Jox A, Rohen C, Belge G, Bartnitzke S, Pawlita M, Diehl V, et al. Integration of Epstein-Barr virus in Burkitt's lymphoma cells leads to a region of enhanced chromosome instability. Ann Oncol. 1997; 8 Suppl 2: 131-5.25. Wuu KD, Chen YJ, Wuu SW. Frequency and distribution of chromosomal integration sites of the Epstein-Barr virus genome. J Formos Med Assoc. 1996; 95: 911-6.26. Oh JH, Kim YJ, Moon S, Nam HY, Jeon JP, Lee JH, et al. Genotype instability during long-term subculture of lymphoblastoid cell lines. J Hum Genet. 2013; 58: 16-20.27. Gao J, Luo X, Tang K, Li X, Li G. Epstein-Barr virus integrates frequently into chromosome 4q, 2q, 1q and 7q of Burkitt's lymphoma cell line (Raji). J Virol Methods. 2006; 136: 193-9.

18

394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439

3536

28. Takakuwa T, Luo WJ, Ham MF, Sakane-Ishikawa F, Wada N, Aozasa K. Integration of Epstein-Barr virus into chromosome 6q15 of Burkitt lymphoma cell line (Raji) induces loss of BACH2 expression. Am J Pathol. 2004; 164: 967-74.29. Luo WJ, Takakuwa T, Ham MF, Wada N, Liu A, Fujita S, et al. Epstein-Barr virus is integrated between REL and BCL-11A in American Burkitt lymphoma cell line (NAB-2). Lab Invest. 2004; 84: 1193-9.30. Zhao LH, Liu X, Yan HX, Li WY, Zeng X, Yang Y, et al. Genomic and oncogenic preference of HBV integration in hepatocellular carcinoma. Nat Commun. 2016; 7: 12992.31. Hu Z, Zhu D, Wang W, Li W, Jia W, Zeng X, et al. Genome-wide profiling of HPV integration in cervical cancer identifies clustered genomic hot spots and a potential microhomology-mediated integration mechanism. Nat Genet. 2015; 47: 158-63.32. Muller-Rischart AK, Pilsl A, Beaudette P, Patra M, Hadian K, Funke M, et al. The E3 ligase parkin maintains mitochondrial integrity by increasing linear ubiquitination of NEMO. Mol Cell. 2013; 49: 908-21.33. Park MH, Kim SY, Kim YJ, Chung YH. ALS2CR7 (CDK15) attenuates TRAIL induced apoptosis by inducing phosphorylation of survivin Thr34. Biochem Biophys Res Commun. 2014; 450: 129-34.34. Wertz IE, Newton K, Seshasayee D, Kusam S, Lam C, Zhang J, et al. Phosphorylation and linear ubiquitin direct A20 inhibition of inflammation. Nature. 2015; 528: 370-5.35. Sun SC, Cesarman E. NF-kappaB as a target for oncogenic viruses. Curr Top Microbiol Immunol. 2011; 349: 197-244.36. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009; 25: 1754-60.37. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009; 25: 2078-9.38. Layer RM, Chiang C, Quinlan AR, Hall IM. LUMPY: a probabilistic framework for structural variant discovery. Genome Biol. 2014; 15: R84.39. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010; 38: e164.40. Karolchik D, Hinrichs AS, Furey TS, Roskin KM, Sugnet CW, Haussler D, et al. The UCSC Table Browser data retrieval tool. Nucleic Acids Res. 2004; 32: D493-6.41. Fungtammasan A, Walsh E, Chiaromonte F, Eckert KA, Makova KD. Corrigendum: A genome-wide analysis of common fragile sites: What features determine chromosomal instability in the human genome? Genome Res. 2016; 26: 1451.42. Fungtammasan A, Walsh E, Chiaromonte F, Eckert KA, Makova KD. A genome-wide analysis of common fragile sites: what features determine chromosomal instability in the human genome? Genome Res. 2012; 22: 993-1005.

19

440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485

3738

43. Jiang L, Gu ZH, Yan ZX, Zhao X, Xie YY, Zhang ZG, et al. Exome sequencing identifies somatic mutations of DDX3X in natural killer/T-cell lymphoma. Nat Genet. 2015; 47: 1061-6.44. Ma W, Feng L, Zhang S, Zhang H, Zhang X, Qi X, et al. Induction of chemokine (C-C motif) ligand 5 by Epstein-Barr virus infection enhances tumor angiogenesis in nasopharyngeal carcinoma. Cancer Sci. 2018; 109: 1710-22.45. Holden P, Horton WA. Crude subcellular fractionation of cultured mammalian cell lines. BMC Res Notes. 2009; 2: 243.46. Koneva LA, Zhang Y, Virani S, Hall PB, McHugh JB, Chepeha DB, et al. HPV Integration in HNSCC Correlates with Survival Outcomes, Immune Response Signatures, and Candidate Drivers. Mol Cancer Res. 2018; 16: 90-102.47. Sung WK, Zheng H, Li S, Chen R, Liu X, Li Y, et al. Genome-wide survey of recurrent HBV integration in hepatocellular carcinoma. Nat Genet. 2012; 44: 765-9.48. Zhao LH, Liu X, Yan HX, Li WY, Zeng X, Yang Y, et al. Genomic and oncogenic preference of HBV integration in hepatocellular carcinoma. Nat Commun. 2016; 7: 12992.49. Kawai-Kitahata F, Asahina Y, Tanaka S, Kakinuma S, Murakawa M, Nitta S, et al. Comprehensive analyses of mutations and hepatitis B virus integration in hepatocellular carcinoma with clinicopathological features. J Gastroenterol. 2016; 51: 473-86.50. Guichard C, Amaddeo G, Imbeaud S, Ladeiro Y, Pelletier L, Maad IB, et al. Integrated analysis of somatic mutations and focal copy-number changes identifies key genes and pathways in hepatocellular carcinoma. Nat Genet. 2012; 44: 694-8.51. Fujimoto A, Totoki Y, Abe T, Boroevich KA, Hosoda F, Nguyen HH, et al. Whole-genome sequencing of liver cancers identifies etiological influences on mutation patterns and recurrent mutations in chromatin regulators. Nat Genet. 2012; 44: 760-4.52. Cleary SP, Jeck WR, Zhao X, Chen K, Selitsky SR, Savich GL, et al. Identification of driver genes in hepatocellular carcinoma by exome sequencing. Hepatology. 2013; 58: 1693-702.53. Lin DC, Meng X, Hazawa M, Nagata Y, Varela AM, Xu L, et al. The genomic landscape of nasopharyngeal carcinoma. Nat Genet. 2014; 46: 866-71.54. Zheng H, Dai W, Cheung AK, Ko JM, Kan R, Wong BW, et al. Whole-exome sequencing identifies multiple loss-of-function mutations of NF-kappaB pathway regulators in nasopharyngeal carcinoma. Proc Natl Acad Sci U S A. 2016; 113: 11283-8.55. Li YY, Chung GT, Lui VW, To KF, Ma BB, Chow C, et al. Exome and genome sequencing of nasopharynx cancer identifies NF-kappaB pathway activating mutations. Nat Commun. 2017; 8: 14121.56. Lee S, She J, Deng B, Kim J, de Andrade M, Na J, et al. Multiple-level validation identifies PARK2 in the development of lung cancer and chronic obstructive pulmonary disease. Oncotarget. 2016; 7: 44211-23.57. Hung SC, Kang MS, Kieff E. Maintenance of Epstein-Barr virus (EBV) oriP-

20

486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531

3940

based episomes requires EBV-encoded nuclear antigen-1 chromosome-binding domains, which can be replaced by high-mobility group-I or histone H1. Proc Natl Acad Sci U S A. 2001; 98: 1865-70.

Table 1. EBV integrations detected in EBV-associated malignancies

No. of integration

positive samples

(total samples)

Integration rate

( 95% confidence

interval)

Total No. of

breakpoints

detected

No. of breakpoints

per sample

Gastric carcinoma 10 (39) 25.6% (13.0 - 42.1%) 153 0 - 118

Hodgkin lymphoma 2 (11) 18.2% (2.3 - 51.8%) 8 0 - 5

NK/T cell lymphoma 4 (25) 16.0% (4.5 - 36.1%) 6 0 - 2

NPC 17 (177) 9.6% (5.7 - 14.9%) 28 0 - 6

Nasopharyngitis 0 (1)

C666-1 cell line 1 2

21

532533534535

536

537

538

539

540

541

542

543

544

4142

Figure 1. Distribution of EBV integration breakpoints. (A) Distribution of the 197

integration breakpoints across the human genome. For each integration breakpoint,

each bar represents the total number of supporting reads at a specific locus in the

human genome. Gene annotations for 23 breakpoints in 10 cancers and the C666.1 cell 22

545

546

547

548

5494344

line supported by ≥ 9 EBV-DNA chimeric read pairs are labeled. (B) Distribution of 197

breakpoints in common fragile regions. The expected (assuming uniform and random

distribution, blue) and observed ratios of EBV integration breakpoints detected in all

samples (n=34, red, total), gastric carcinomas (n=10, green, GCT) and NPC (n=17,

purple, NPCT) in common fragile regions are shown. P-values were calculated using

the binomial exact test. (C) Significant enrichment of integration breakpoints with

microsatellite repeats in gastric carcinomas. The expected and observed ratios of

breakpoints co-localized with repeat elements LINE, SINE, LTR, DNA transposon and

microsatellite in NPC (green, NPCT) and gastric carcinoma (red, GCT) are shown (for

detailed frequencies, see Supplementary Figure. 3). P-values were calculated using the

binomial exact test. NS, non-significant; LINE, long interspersed nuclear element; SINE,

long interspersed nuclear element; LTR, long terminal repeat.

23

550

551

552

553

554

555

556

557

558

559

560

561

562

563

564

565

566

567

568

569

570

571

572

4546

Figure 2. Mapping of EBV integration breakpoints in proximity of tumor

suppressor and inflammation-related genes. (A-E) The EBV integration breakpoints

located in proximity of tumor suppressor genes: SETD2 in one gastric carcinoma (A),

KANK1 in one NK/T cell lymphoma (B), and inflammation-related genes CDK15 (C),

TNFAIP3 (D) and PARK2 (E) in NPC tumors. Red arrows indicate a transcription factor

binding site near the EBV integration breakpoint. Three of the transcription factor

binding sites are in the promoter, and one is in the EBV integration breakpoint.

24

573

574

575

576

577

578

579

580

581

582

583584

4748

Figure 3. Gene expression in normal epithelium and NPC tumors. (A-C)

Immunohistochemical images of CDK15 (A), PARK2 (B), TNFAIP3 (C) expression in

normal epithelium and NPC tumors with and without EBV integration. The red curve

marks the neoplasm or epithelium, the rest is inflammatoryinfiltration. EBV (+) indicates

positive for EBV integration; EBV (-) indicates negative for EBV integration.

25

585586

587

588

589

590

591

592

593

594

595

596

4950

Figure 4. Distribution of integration breakpoints in the EBV genome. (A)

Distribution of breakpoints across the EBV genome. Histogram of the frequency of

breakpoints was constructed for 1000 bp intervals. EBV genome annotation is shown.

(B) Breakpoints enriched in oriP and terminal repeats in the EBV genome. The 26

597

598

599

600

6015152

observed (blue) and expected (red) frequencies of breakpoints within fragments are

shown. P-values were calculated using the binomial exact test.

27

602

603

604

605

606

607

608

609

610

611

612

613

614

615

616

617

618

619

620

5354

Figure S1. Mapping of EBV integration breakpoint detected in C666-1 cell line.

EBV-human chimeric read pairs from bam file to detect EBV DNA integration in C666-1

cell line were mapped to chr20: 57967350-57967359 in the human hg19 reference

genome.

28

621622

623

624

625

626

627

628

629

630

631

632

633

634635

636

5556

Figure S2. Two EBV integrations were observed in both primary and metastatic

NPC tumors from the same patient. (A-D) Two EBV integrations were observed in

both primary (A-B) and metastatic NPC tumors (C-D) from the same patient. Sequence

alignment around the integration site between the human genome and EBV genome are

presented. All EBV and human sequences are from the reference strand. Human

sequences are in blue, and EBV sequences are in yellow. Nucleotides that align to both

reference sequences are micro-homologies and are highlighted in red.

29

637

638

639

640

641

642

643

644

645

5758

Figure S3. Significant enrichment of integration breakpoints with microsatellite

repeats in gastric carcinoma. The expected and observed frequency of breakpoints

from all gastric carcinomas (A) all gastric carcinomas except GCT015 (B) and NPC

tumors (C) co-localized with repeat elements LINE, SINE, LTR, DNA transposon and

microsatellite are shown. P-values were calculated using the binomial exact test. LINE,

long interspersed nuclear element; SINE, long interspersed nuclear element; LTR, long

terminal repeat.

30

646647

648

649

650

651

652

653

654

655

5960

Figure S4. Distribution of EBV integration breakpoints relative to genes. (A)

Proportions of EBV integration breakpoints located near genes and transcriptional start

sites. (B) The observed (upper panel) and expected (lower panel) random distributions

of breakpoints relative to transcriptional start sites (TSS) are shown. Different colored

blocks indicate the distance of breakpoints to TSS and are ordered according to the 5’

to 3’ transcriptional direction. The X-axis and width of colored blocks show the

proportion of breakpoints falling into each category indicated. The breakpoints tended to

occur adjacent (smaller proportion of breakpoints with distance to TSS > 100 kb, purple

block) to TSS more frequently than expected.

31

656

657

658

659

660

661

662

663

664

665

6162

Figure S5. Distribution of EBV integration breakpoints relative to CpG islands and

CTCF binding sites. The observed and expected proportions of EBV integration

breakpoints co-localized with CpG islands and CTCF binding sites are shown.

32

666667

668

669

670

671

672

673

674

6364

Figure S6. Regulation of NF-κB pathway activity in NPC cells. (A) TNFAIP3

knockdown in NPC cells up-regulates NF-κB pathway activity. NF-κB–specific dual

luciferase promoter assay to detect NF-κB activity in NPC cells with TNFAIP3

knockdown. The relative promoter activity value was compared with the control.

Quantitative PCR performed on RNA prepared from NPC cells with TNFAIP3

knockdown to detect expression levels of NF-κB-pathway-targeted genes. (B) CDK15

knockdown in NPC cells was down-regulated by NF-κB pathway activity. Western blot

performed with nuclear lysates prepared from NPC cells with CDK15 knockdown

treated with TNF-α (10 mg/mL) and incubated with antibody against p65. Histone H3

was used as a control for loading. Quantitative PCR to detect levels of NF-κB-pathway-

targeted genes in NPC cells with CDK15 knockdown.

33

675676

677

678

679

680

681

682

683

684

685

686

687

688

689

6566

Figure S7. Microhomology (MH) sequences in the regions flanking integration

sites and Sanger sequencing validation.

(A-F) Microhomology (MH) sequences in the regions flanking integration sites observed

in NPC (A-D) and NK/T cell lymphoma (E-F) samples. All EBV and human sequences

are from the reference strand. Human partner is in blue, and EBV partner is in yellow.

34

690691

692

693

694

695

6768

Nucleotides highlighted in red are micro-homologies aligning to both reference

sequences.

35

696

697

698699

6970