
Next-Generation Sequencing & Molecular Diagnostics || Next-generation sequencing in cancer research & diagnostics


© 2012 Future Medicine, www.futuremedicine.com

About the Authors

Chee-Seng Ku

Chee-Seng Ku completed his PhD at the National University of Singapore (Singapore) in 2011/2012. He then worked as a Research Associate at the Cancer Science Institute of Singapore. His research interests focus on applying high-throughput microarray and sequencing technologies for studies on human genetic variation, disease genetics (Mendelian and complex diseases) and for diagnostic applications. Currently, he is a Foreign Adjunct Faculty at the Department of Medical Epidemiology and Biostatistics, Karolinska Institutet (Sweden) and an Honorary Adjunct Research Fellow in the Saw Swee Hock School of Public Health, National University of Singapore. He is also serving on the editorial board of several international journals including Human Genetics, Journal of Medical Genetics and Human Genomics.

David N Cooper

David N Cooper is Professor of Human Molecular Genetics at Cardiff University (Cardiff, UK). His research interests are largely focused upon elucidating the mechanisms of mutagenesis underlying human genetic disease. He has published over 350 papers in the field of human molecular genetics and has coauthored/coedited a number of books on mutation in the context of inherited disease or molecular evolution. He curates the Human Gene Mutation Database [101] and is European Editor of Human Genetics.



© 2012 Future Medicine

doi:10.2217/EBO.12.46


Chapter 2: Next-generation sequencing in cancer research & diagnostics

Chee-Seng Ku & David N Cooper

Since the sequencing of the first whole cancer genome was completed in 2008, next-generation sequencing (NGS) technologies have introduced a new paradigm into cancer genetics. Numerous studies have now employed whole-genome sequencing (WGS) and whole-exome sequencing (WES) to determine the somatic mutational landscape of a range of different cancers. This endeavor has been very fruitful, leading to the identification of numerous recurrent somatic mutations, highly mutated genes and fusion genes resulting from structural rearrangements. New germline pathological mutations and genes have also been identified in hereditary cancers by WES. In parallel, the clinical utility of NGS technologies has also become increasingly evident. Indeed, the advent of bench-top NGS instrumentation has further broadened the application of NGS technologies into supporting smaller-scale molecular diagnostic analyses involving, for example, small panels of genes (e.g., the mismatch repair genes to detect germline mutations underlying Lynch syndrome). In addition to its affordability, the turnaround time of NGS (e.g., WGS) in a molecular diagnostic setting lies within a reasonable clinical timeframe. Pipelines for interpreting WGS and transcriptome sequencing data derived from tumors, involving professionals from different disciplines, have also been suggested. While the widespread adoption of NGS in cancer diagnostics is inevitable, tests must be properly regulated in a clinical setting and operated according to the Clinical Laboratory Improvement Amendments.

Introduction

The development of NGS technologies: from high-throughput to bench-top

NGS in deciphering cancer genetics

NGS in cancer diagnostics

Perspective & conclusion

Introduction

The advent of NGS technologies has advanced cancer genetics research in two distinct directions – research discovery and clinical application. Studies employing NGS technologies have been attempting to identify somatic driver mutations in various sporadic cancers [1], as well as high-penetrance causal germline variants in hereditary cancer syndromes [2–4]. In addition to providing new biological insights into basic mechanisms of tumorigenesis, these discoveries are important as a means of generating more biomarkers for use in diagnostic, prognostic and therapeutic prediction. Furthermore, NGS technologies are also promising diagnostic tools in cancer; for example, WGS has been used to identify a cryptic fusion oncogene in acute promyelocytic leukemia [5]. In terms of germline cancer variants, the clinical utility of NGS has also been evident: when coupled with custom-designed oligonucleotides in a targeted sequencing approach to enrich 21 genes responsible for conferring an inherited risk of breast and ovarian cancers, it demonstrated promising results in relation to the identification of known mutations in the tested samples [6].

Whole-genome sequencing: an approach to sequence the entire genome. Most of the whole-genome sequencing studies have been performed using massively parallel sequencing. There are two analytical methods for the huge number of sequence reads generated by massively parallel sequencing: a resequencing approach, where sequence reads are aligned against a reference genome, and a de novo genome assembly approach, where the sequence reads are assembled by bioinformatic means without mapping against a reference genome sequence. As such, the assembly process can identify new or previously unknown sequences.

Whole-exome sequencing: an approach to sequence the coding regions or the entire collection of exons in the genome. Exome sequencing requires several sequence-enrichment steps followed by massively parallel sequencing. The development of commercial whole human exome enrichment kits has been largely responsible for making this approach feasible in practical terms. During the enrichment steps, the genomic regions of interest (i.e., all exons) are captured through hybrid selection of DNA fragments using oligonucleotide probes, while the unwanted DNA sequences (i.e., the noncoding regions) are removed before sequencing, leading to a significant reduction in the proportion of the genome that needs to be sequenced.

Somatic mutations: acquired genetic alterations that occur during mitosis in somatic cells and thus are confined specifically to a cell population. These mutations are not transmitted to offspring. Somatic mutations encompass various types, such as base substitutions or point mutations, small indels, copy number alterations and other structural rearrangements.

Driver mutations: mutations that are responsible for cancer development or tumor growth. These mutations can be involved in multiple cellular processes such as cell cycle regulation, apoptosis, angiogenesis, tissue invasion and metastasis.

These recent advances in research and diagnostics would not have been possible with the traditional low-throughput PCR amplification and Sanger sequencing methods. By contrast, NGS technologies, such as the Illumina HiSeq™ and Life Technologies SOLiD™/5500 systems, are characterized by massively parallel sequencing of hundreds of millions of sequence reads. The high-throughput production of hundreds of gigabases of DNA sequence data (at a very low cost per nucleotide) has made WGS both technically feasible and affordable [7,8]. In addition, the introduction of several bench-top NGS machines such as the Life Technologies Ion Torrent™ sequencing platform, with a throughput ranging from 10 Mb to 1 Gb [9], has further widened the ability of NGS technologies to support different scales of sequencing applications. For example, conventional high-throughput NGS technologies constitute a more suitable platform for WGS and WES, whereas bench-top NGS machines could be more suited to the targeted sequencing of panels of genes in molecular diagnostics. In parallel, the development of multiple WES and custom-made enrichment (hybrid-selection or PCR-based enrichment) assays has greatly aided the technical feasibility of WES and targeted sequencing [10].

In the following sections, the authors summarize current knowledge and the latest developments in NGS technologies; NGS-based studies performed to elucidate the molecular genetic basis of tumorigenesis (i.e., to identify somatic driver mutations in sporadic cancers and germline variants underlying hereditary cancers); and the application of NGS to cancer diagnostics.

The development of NGS technologies: from high-throughput to bench-top

Currently, the available conventional NGS technologies such as Illumina HiSeq2000/2500 and Life Technologies SOLiD4/5500 are able to generate hundreds of millions of short sequence reads (up to ~150 bp) with a throughput of several hundred gigabases of sequencing data per instrument run. By contrast, Roche 454 GS FLX produces approximately 1 million longer sequence reads (~500 bp) [7,8]. All the NGS technologies support single-end and paired-end sequencing, and hence are able to identify a range of different types of genetic variants, from single-nucleotide variants (SNVs) and small indels to larger copy number variants and other structural rearrangements such as translocations and inversions. Sequencing of paired-end (sequencing both ends of DNA fragments of several hundred base pairs) or mate-paired (a larger insert size of several kilobases, meaning both ends of a larger DNA fragment can be sequenced) libraries is required for the ‘paired-end mapping’ approach to detect copy number variants and structural rearrangements of different sizes [11–13]. In terms of preparing the paired-end mapping libraries, all of the NGS technologies are able to generate both types of libraries, allowing for sequencing of short and longer insert sizes [14,15]. Thus, the NGS technologies have been widely used for various applications ranging from the targeted sequencing of candidate genes to WES and WGS. In addition to detecting genetic variants, NGS technologies have also been commonly used in functional genomics studies such as RNA-Seq, chromatin immunoprecipitation (ChIP)-Seq, bisulfite sequencing for DNA methylation and metagenomic sequencing (of collections of microbial genomes) [16]. Owing to the large number of sequence reads generated by the HiSeq and SOLiD systems, these platforms are more suitable for certain applications, such as ChIP-Seq and RNA-Seq, that require millions of reads. However, the authors focus exclusively on the application of NGS to identify somatic mutations and germline variants in cancer.

Paired-end sequencing or mapping: a sequencing-based method to detect structural variants based on the discrepancy or discordance in insert size and orientation of the paired-end sequences being aligned to the reference genome.
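The paired-end mapping idea can be illustrated with a small sketch. It is not taken from the chapter or from any particular structural-variant caller: the `ReadPair` record, the expected insert size of 400 bp, the tolerance of 100 bp and the mapping from discordance pattern to variant type are all simplifying assumptions that follow the commonly described signatures (pairs mapping too far apart suggest a deletion, too close an insertion, same-strand pairs an inversion, everted pairs a tandem duplication, and pairs on different chromosomes a translocation).

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Hypothetical, simplified record of one aligned read pair."""
    chrom1: str
    pos1: int     # leftmost aligned position of read 1
    strand1: str  # '+' or '-'
    chrom2: str
    pos2: int     # leftmost aligned position of read 2
    strand2: str


def classify_pair(pair, expected_insert=400, tolerance=100):
    """Label a read pair by how it disagrees with the expected library layout.

    Assumes a standard paired-end library in which a concordant pair maps
    as '+' (left read) followed by '-' (right read), roughly
    expected_insert bp apart. Read length is ignored for simplicity.
    """
    if pair.chrom1 != pair.chrom2:
        return "translocation candidate"
    # Order the two reads by genomic position.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    if left_strand == right_strand:
        return "inversion candidate"
    if (left_strand, right_strand) == ("-", "+"):
        return "tandem duplication candidate"
    span = right_pos - left_pos
    if span > expected_insert + tolerance:
        return "deletion candidate"
    if span < expected_insert - tolerance:
        return "insertion candidate"
    return "concordant"
```

In practice a caller would require several independent discordant pairs supporting the same event before reporting it, and would estimate the expected insert size and its spread from the library itself rather than fixing them in advance.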

In terms of their accuracy, all three NGS technologies (Illumina, Life Technologies and Roche 454 sequencing platforms) have higher raw base error rates than Sanger sequencing, ranging from <0.1 to 2% [17]. However, the accuracy of raw base calling has been improved with further developments in sequencing instrumentation and calling algorithms. The lowest raw base error rate was achieved by Life Technologies SOLiD because each of the nucleotides was interrogated twice by ligation of dinucleotide probes in repeated cycles of sequencing mediated by ligase enzymes. In addition, the dominant error type differs between the NGS technologies, owing to their different sequencing chemistries. Illumina sequencing and SOLiD platforms are more prone to nucleotide substitution errors, whereas the Roche 454 sequencing system is characterized by base insertion and deletion errors, especially in homopolymer regions of more than six identical nucleotides. These indel errors are attributable to the pyrosequencing chemistry in which the intensity of chemiluminescent light is proportional to the number of nucleotides incorporated into the DNA template in each cycle of sequencing [18]. Although NGS technologies achieved a lower raw base accuracy than Sanger sequencing, this can be improved through deeper sequencing to achieve a higher consensus accuracy rate. For WES and WGS of genomic DNA, an average sequencing depth of 50–100× (i.e., each nucleotide is covered by multiple sequence reads) is usually deemed sufficient to detect most germline SNVs accurately.
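The gain in consensus accuracy from deeper sequencing can be made concrete with a simple model. The sketch below is an illustration under stated assumptions rather than a description of any real base caller: reads at a position err independently with the same raw error rate, the site is homozygous, and the consensus is wrong only if a strict majority of reads are wrong (a pessimistic simplification, since real errors are spread over three alternative bases). Systematic, context-dependent errors are ignored.

```python
from math import comb


def consensus_error(per_base_error, depth):
    """Probability that a majority-vote consensus base is wrong.

    Binomial model: each of `depth` reads is independently wrong with
    probability `per_base_error`; the consensus fails when more than
    half of the reads are wrong. Use odd depths to avoid ties.
    """
    threshold = depth // 2 + 1  # smallest strict majority
    return sum(
        comb(depth, k) * per_base_error**k * (1 - per_base_error) ** (depth - k)
        for k in range(threshold, depth + 1)
    )


if __name__ == "__main__":
    # Raw error of 2% is the upper end of the range quoted in the text.
    for depth in (1, 3, 11, 51):
        print(depth, consensus_error(0.02, depth))
```

Under this model the consensus error falls very quickly as depth increases, which is the intuition behind the 50–100× depth quoted above; in real data the achievable accuracy is limited by errors that are correlated between reads.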

However, greater sequencing depth is required to detect somatic point mutations in primary cancer tissue in order to allow for both tissue contamination (i.e., admixture of cancer and noncancer cells) and genetic heterogeneity within the cancer tissue (i.e., the presence of multiple subclones) [15,19]. In general, NGS has a much higher sensitivity than Sanger sequencing with respect to the detection of somatic mutations under these conditions, since the presence of a mutation can be detected by individual sequence reads. Mutations occurring in a small proportion of cancer cells can still be detected, as long as the sequencing depth is adequate (e.g., up to several hundred-fold, depending upon the ‘rarity’ of the mutations in the tumor sample). In terms of different genetic variants, NGS has a much higher sensitivity and specificity for detecting SNVs (>90%) than small indels, copy number variants and structural rearrangements [20–25]. All of the high-throughput and bench-top NGS technologies (discussed later) rely on DNA polymerase-mediated amplification methods such as emulsion PCR and bridge amplification; the artifacts these methods introduce can adversely affect the detection of rare or new somatic mutations. However, PCR artifacts can be avoided by third-generation sequencing technologies characterized by single DNA molecule sequencing, without the need for amplification [26].
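Why rare subclonal mutations need several hundred-fold depth can be sketched with a binomial calculation. The model below is an assumption-laden illustration, not a published detection method: the mutation is taken to be heterozygous in a diploid region, so its expected variant allele fraction is purity × subclone fraction / 2; reads are sampled independently; sequencing error is ignored; and the threshold of three supporting reads is an arbitrary illustrative choice.

```python
from math import comb


def expected_vaf(purity, subclone_fraction):
    """Expected variant allele fraction of a heterozygous somatic mutation.

    purity: fraction of cells in the sample that are cancer cells.
    subclone_fraction: fraction of cancer cells carrying the mutation.
    Assumes a diploid, copy-neutral locus.
    """
    return purity * subclone_fraction / 2


def detection_probability(depth, vaf, min_alt_reads=3):
    """Probability of seeing at least `min_alt_reads` mutant reads at `depth`."""
    p_below = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1 - p_below


if __name__ == "__main__":
    # Hypothetical sample: 50% tumor cells, mutation present in 20% of them.
    vaf = expected_vaf(purity=0.5, subclone_fraction=0.2)
    for depth in (30, 100, 300):
        print(depth, round(detection_probability(depth, vaf), 3))
```

The detection probability rises with depth for a fixed allele fraction, so the lower the tumor purity or the smaller the subclone, the more depth is needed to reach the same sensitivity.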

Although higher sequencing throughputs are expected with further developments in NGS technology, the production of several hundred gigabases of data has also rendered these platforms less suitable for certain applications, such as the molecular diagnostic testing of a panel of genes. This requires a lower throughput and a much more rapid sequencing turnaround time than is required in a research context. Furthermore, either a single patient or a relatively small number of samples is often encountered in a clinical diagnostic context, which makes the barcoding or multiplexing of a large number of patient samples impractical as a means of achieving cost efficiency. For example, a genetic diagnostic test for Lynch syndrome (a hereditary colorectal cancer) involves the sequencing of four mismatch repair genes (i.e., MSH2, MLH1, MSH6 and PMS2), whose coding regions span a length of approximately 65 kb [27]. A sequencing run to an average depth of 100× would require only 6.5 Mb (or 13 Mb, assuming 50% of sequence reads were filtered during analysis steps) of sequencing data. As such, it is clear that the capacity of high-throughput NGS technologies greatly exceeds the requirement in this context. By contrast, Sanger sequencing does not meet the demand of this sequencing requirement either in terms of efficiency or cost–effectiveness. However, the development of multiple bench-top NGS machines has adequately filled the ‘gap’.
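The arithmetic behind the Lynch syndrome example is reproduced below as a small helper. The function name and the `usable_fraction` parameter are illustrative choices; the 65 kb target, the 100× depth and the 50% filtering loss come from the text, and the 300 Gb figure is the upper-end high-throughput yield listed in Table 2.1.

```python
def required_yield_bp(target_bp, mean_depth, usable_fraction=1.0):
    """Raw sequence (in bp) needed to cover a target at a given mean depth.

    usable_fraction is the share of reads that survives filtering;
    0.5 means half of the raw data is discarded during analysis.
    """
    return target_bp * mean_depth / usable_fraction


if __name__ == "__main__":
    panel_bp = 65_000  # coding regions of MSH2, MLH1, MSH6 and PMS2
    ideal = required_yield_bp(panel_bp, 100)          # 6.5 Mb
    realistic = required_yield_bp(panel_bp, 100, 0.5)  # 13 Mb
    print(ideal / 1e6, "Mb;", realistic / 1e6, "Mb")
    # Share of a 300 Gb high-throughput run that one such test would use.
    print(realistic / 300e9)
```

Even with half of the reads discarded, a single test would use well under a thousandth of one high-throughput run, which is the mismatch the bench-top instruments address.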

The first bench-top NGS instrument was introduced into the market by Roche in 2010 – the Roche 454 Genome Sequencer Junior Sequencing System [28]. This was subsequently followed by the launch of the Ion Torrent Personal Genome Machine Sequencer (Life Technologies) [9] and the MiSeq Personal Sequencing System (Illumina) [102]. The Roche 454 Genome Sequencer Junior and Illumina MiSeq Personal Sequencing System rely upon the same well-established sequencing chemistries (i.e., pyrosequencing and reversible terminator sequencing) as employed in their respective high-throughput NGS technologies [7,18]. These NGS technologies also share a common feature, namely that they are reliant on either fluorescent emission (Illumina GA/HiSeq™/MiSeq and ABI SOLiD) or chemiluminescent light emission (i.e., Roche 454 sequencing platforms) to detect and distinguish nucleotide incorporation. By contrast, the Ion Torrent Personal Genome Machine represents a new sequencing technology, and has been known as the world’s first ‘post-light’ and semiconductor sequencing technology because it relies on chemical/pH changes rather than light emission. The Ion Torrent sequencer sequentially supplies the chip with one type of nucleotide after another. This is also known as a ‘sequencing by synthesis’ approach mediated by polymerase; when a nucleotide is incorporated into a DNA template, a hydrogen ion is released, which causes pH changes that are subsequently converted to a digital output in terms of voltage changes [9].
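The flow-based principle described above, in which one nucleotide type is supplied at a time and the signal reflects how many bases were incorporated, can be mimicked with an idealized simulation. This is a teaching sketch only: the repeating flow order `TACG`, the function names and the noise-free integer signal are assumptions, and the input string stands for the strand being synthesized rather than the template. Real instruments measure an analog voltage or light signal that must be converted into a run length.

```python
def flowgram(sequence, flow_order="TACG", n_flows=8):
    """Simulate ideal, noise-free signals for a flow-based sequencer.

    For each flow, one nucleotide type is offered; the signal is the
    number of consecutive bases of that type incorporated at the
    current position (0 if the next base does not match).
    """
    signals = []
    pos = 0
    for i in range(n_flows):
        base = flow_order[i % len(flow_order)]
        run = 0
        while pos < len(sequence) and sequence[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
    return signals


def call_bases(signals):
    """Turn (base, run length) signals back into a sequence."""
    return "".join(base * run for base, run in signals)


if __name__ == "__main__":
    flows = flowgram("TTAGGGC")
    print(flows)
    print(call_bases(flows))
```

Because a homopolymer such as `GGG` is read in a single flow, any error in estimating the signal magnitude adds or drops bases, which is consistent with the indel errors in homopolymer regions listed for the flow-based platforms in Table 2.1.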

The throughput of these bench-top NGS machines ranges from 10 Mb to >1 Gb. For example, several Ion Torrent sequencing chips are available for different throughputs, from >10 Mb (chip314) to >100 Mb (chip316) and >1 Gb (chip318). Similarly, Illumina MiSeq produces sequencing data ranging from >120 Mb to >1 Gb depending on the read length and whether it is single-end or paired-end sequencing. By contrast, the Roche 454 Genome Sequencer Junior has a much lower throughput (>35 Mb) per instrument run but a longer read length of 400 bp on average compared with the other two platforms. These bench-top NGS machines have further enhanced technical and logistical flexibility (e.g., a smaller number of samples can be processed) while avoiding ‘redundant’ sequencing (i.e., sequencing to a higher than required depth), which in turn improves cost–effectiveness. In molecular diagnostics, the sequencing of a panel of genes in a small number of samples is both common and routine. Using Lynch syndrome once again as an example, the Ion Torrent chip314 would be sufficient to meet the sequencing demand for a single sample, whereas the Roche 454 Junior would be suitable if multiple samples are available for multiplexing [29].
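Matching a platform to a panel can be sketched as a simple capacity calculation. The yields below are the nominal lower bounds quoted in the text, treated here as fixed values for illustration; the per-sample requirement of 6.5 Mb is the Lynch syndrome estimate at 100× without filtering loss. The dictionary and function names are illustrative, and real runs would need headroom for uneven coverage and failed reads.

```python
# Nominal per-run yields in Mb, taken from the lower bounds quoted in the text.
PLATFORM_YIELD_MB = {
    "Ion Torrent chip314": 10,
    "Ion Torrent chip316": 100,
    "Ion Torrent chip318": 1000,
    "Roche 454 Junior": 35,
}


def samples_per_run(run_yield_mb, per_sample_mb=6.5):
    """Number of whole samples that fit in one run at the required yield."""
    return int(run_yield_mb // per_sample_mb)


if __name__ == "__main__":
    for platform, yield_mb in PLATFORM_YIELD_MB.items():
        print(platform, samples_per_run(yield_mb))
```

With these figures the chip314 accommodates a single sample while the Roche 454 Junior accommodates a handful, in line with the recommendation above; a more conservative per-sample requirement (e.g., 13 Mb) would shift the choice toward the larger chips.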

In parallel, the development of multiple custom-designed oligonucleotide assays based on hybrid selection (e.g., by Agilent and Nimblegen) and PCR-based amplification methods (such as Fluidigm and RainDance technologies and Illumina TruSeq™ Custom Amplicon sequencing) has made the targeted sequencing of a panel of genes with different sizes of genomic regions technically highly feasible. For example, Illumina TruSeq Custom Amplicon sequencing allows a multiplexing of up to 384 amplicons with a targeted size ranging from 4 to 96 kb [103]. By contrast, the Agilent and Nimblegen custom-designed oligonucleotide enrichment assays offer a larger target size ranging from hundreds of kilobases to several megabases [10]. More importantly, the latest development of a prehybridization barcoding protocol, where multiple samples can be pooled for a single hybrid selection experiment, has further reduced the cost of genomic enrichment. Taken together, through harnessing the power of recent technological developments in enrichment and sequencing methodologies, targeted sequencing, WES and WGS have not only been made more technically feasible and more accessible to the clinical diagnostic setting, but also more cost effective. Table 2.1 summarizes the technological features of high-throughput and bench-top NGS platforms. As these technologies are advancing very rapidly, the reader is encouraged to refer to the vendors’ websites for the latest information.

NGS in deciphering cancer genetics

Over the past several years, WES and WGS have been increasingly used to delineate the somatic mutational profiles of various cancers. The very first pioneering cancer WGS was performed on acute myeloid leukemia (AML) and succeeded in identifying numerous tumor-specific variants; this suggested the technical and analytical feasibility of applying NGS to interrogate somatic mutations in entire cancer genomes in parallel with paired constitutional DNA samples [30]. Subsequent studies have identified recurrent somatic mutations in the IDH1 and DNMT3A genes in AML [31,32]. Additional WGS studies have implicated novel candidate genes harboring putative pathological somatic mutations in hepatocellular carcinoma [33], melanoma [34], prostate cancer [35] and lung cancer [36]. A further advantage of WGS lies in its ability to detect chromosomal rearrangements and fusion genes. For example, in hepatocellular carcinoma, WGS identified 33 somatic rearrangements, of which 22 were validated by Sanger sequencing of the breakpoints in both the tumor and lymphocyte genomes. Four somatic fusion transcripts generated by different chromosomal rearrangements were also identified and validated – the BCORL1–ELF4 and CTNND1–STX5 fusion genes by intrachromosomal inversions (Xq25 and 11q12, respectively), the VCL–ADK fusion gene formed by an interstitial deletion in 10q22, and the CABP2–LOC645332 fusion gene resulting from a tandem duplication in 11q13 [33]. New insights into cancer metastasis were also provided through WGS of metastatic lobular breast cancer in comparison with the primary tumor derived from the same patient [37]. Clonal evolution in relapsed AML was also investigated by WGS to determine the mutational spectrum associated with relapse [38].

Table 2.1. Summary and comparison of technological features of high-throughput and bench-top next-generation sequencing platforms.

| Technological feature | Roche 454 Genome Sequencer FLX Titanium (high-throughput) | Illumina GAIIx™/HiSeq2000™ (high-throughput) | Life Technologies SOLiD4/5500™/5500XL™ (high-throughput) | Roche 454 Junior (bench-top) | Illumina MiSeq™ (bench-top) | Life Technologies Ion Torrent™ Personal Genome Machine (bench-top) |
| --- | --- | --- | --- | --- | --- | --- |
| Template preparation/amplification | Emulsion PCR | Bridge amplification (cluster generation) | Emulsion PCR | Emulsion PCR | Bridge amplification (cluster generation) | Emulsion PCR |
| Detection of nucleotide incorporation | Emission of chemiluminescent light (pyrosequencing) | Emission of fluorescent light | Emission of fluorescent light | Emission of chemiluminescent light (pyrosequencing) | Emission of fluorescent light | Release of hydrogen ion and pH changes |
| Throughput per run | Up to 700 Mb | Up to 300 Gb | Up to 300 Gb | >35 Mb | >1 Gb | >1 Gb |
| Read length | Up to 700 bp | Up to 150 bp | Up to 75 bp | Up to 400 bp | Up to 150 bp | Up to 200 bp |
| Single-end sequencing | Yes | Yes | Yes | Yes | Yes | Yes |
| Paired-end sequencing | Yes | Yes | Yes | Yes | Yes | Yes |
| Indexing or barcoding of samples | Yes | Yes | Yes | Yes | Yes | Yes |
| Dominant error type | Indel errors in homopolymer | Nucleotide substitution errors | Nucleotide substitution errors | Indel errors in homopolymer | Nucleotide substitution errors | Indel errors in homopolymer |
| Suitability for WGS, WES and targeted sequencing of human genome | Targeted sequencing; WES (multiple sequencing runs) | WGS and WES | WGS and WES | Targeted sequencing | Targeted sequencing; WES (multiple sequencing runs) | Targeted sequencing; WES (multiple sequencing runs) |

The technological information of the sequencing platforms summarized in this table (and discussed in the chapter) was derived from the company websites. As the information is being updated frequently, readers are encouraged to refer to the vendors’ websites for the latest information. NGS: Next-generation sequencing; WES: Whole-exome sequencing; WGS: Whole-genome sequencing. Reproduced with permission from [55].

By contrast, the delay in applying WES to cancer genome sequencing may be attributed to the initial technical difficulties inherent in enriching the collection of all exons in the human genome. Several early studies nevertheless attempted to apply traditional PCR and Sanger sequencing (in ‘brute-force’ mode) to sequence most of the consensus coding sequence and RefSeq genes [39,40]. For example, up to 125,624 PCR primers were needed to amplify 6196 RefSeq transcripts [40]. This technical obstacle was removed with the development of commercial whole-exome enrichment kits. Even so, the first WES studies of cancer genomes were not published until 2011 [41–44].

In comparison to WGS, WES is more cost–effective and analytically less challenging, and hence more affordable and feasible for a larger sample size. Employing a larger sample size is advantageous for prioritizing potential candidates (either recurrent mutations or highly mutated genes) for subsequent validation. This has been demonstrated in a WES study of melanoma by searching specifically for novel recurrent mutations that occurred in at least two of the 14 samples for further validation. Follow-up investigation in an additional 153 melanoma samples identified TRRAP, which harbored a recurrent mutation in approximately 4% of samples [44]. Taken together, numerous recurrent mutations and highly mutated genes have been identified for a number of cancers through WES; these are very likely to be driver mutations or candidate cancer genes with key roles in carcinogenesis. In addition, one of the most exciting findings from the recent cancer genome sequencing studies has probably been the identification of tumor-associated mutations in the genes involved in chromatin remodeling [43,45,46]. This also further highlights the importance of the interaction between genetic and epigenetic aberrations in cancer.
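The recurrence-based prioritization described for the melanoma study can be sketched as a counting step over per-sample mutation calls. The record layout, the gene and change labels and the function name are hypothetical placeholders; only the rule itself (keep mutations seen in at least two samples of the discovery set) comes from the text.

```python
from collections import defaultdict


def recurrent_mutations(calls, min_samples=2):
    """Return mutations observed in at least `min_samples` distinct samples.

    calls: iterable of (sample_id, gene, change) tuples, one per somatic
    mutation call. The same mutation called twice in one sample counts once.
    """
    samples_by_mutation = defaultdict(set)
    for sample_id, gene, change in calls:
        samples_by_mutation[(gene, change)].add(sample_id)
    return {
        mutation: len(samples)
        for mutation, samples in samples_by_mutation.items()
        if len(samples) >= min_samples
    }


if __name__ == "__main__":
    # Hypothetical calls; gene names and changes are placeholders.
    calls = [
        ("S01", "GENE_A", "p.X100Y"),
        ("S02", "GENE_A", "p.X100Y"),
        ("S02", "GENE_B", "p.X20Z"),
        ("S03", "GENE_C", "p.X5W"),
    ]
    print(recurrent_mutations(calls))
```

Candidates that pass this filter would then be genotyped in a larger independent series, as was done with the additional 153 melanoma samples.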

In parallel, NGS has been applied to familial cancer syndromes, with the successful identification of pathological germline mutations responsible for familial pancreatic cancer, hereditary pheochromocytoma and familial melanoma [2–4]. For example, a germline truncating mutation in the PALB2 gene was identified by WES of a patient with suspected familial pancreatic cancer [2]. In a similar vein, germline mutations in MAX were identified in three unrelated individuals with hereditary pheochromocytoma [3]. These studies applied a robust set of filtering criteria to identify the putative causal mutations. For example, the WES of hereditary pheochromocytoma focused on heterozygous SNVs and small indels because it was postulated that it would be very unlikely that homozygous variants could act as founder mutations. Over and above this strategy, several common filtering criteria were applied, as with other studies of Mendelian disorders [47], for example, selecting only those variants within coding regions (such as those with amino acid changes) and those that affected the same gene in all three samples. This resulted in the identification of a total of five SNVs, located within two genes (MAX and ADCY6). Segregation of two variants in MAX with hereditary pheochromocytoma was observed in the two families from whom DNA from affected relatives was available, whereas no such segregation was observed for ADCY6. Additional evidence to support the causative role of MAX came from screening 59 further cases of hereditary pheochromocytoma, which identified two additional truncating mutations and three missense variants in the MAX gene [3]. Taken together, these studies demonstrated the value of WES in discovering the underlying genetic causes of hereditary cancer syndromes.
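The filtering strategy described for the pheochromocytoma exomes (heterozygous variants, coding changes, same gene affected in every sample) can be expressed as a short sketch. The `Variant` record, its field names and the example genes are hypothetical; a real pipeline would also filter against databases of known polymorphisms and annotate functional consequences with dedicated tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical, minimal annotation of one germline variant call."""
    sample: str
    gene: str
    zygosity: str        # "het" or "hom"
    coding_change: bool  # True if the variant alters the protein sequence


def candidate_genes(variants, samples):
    """Genes with a heterozygous coding variant in every listed sample."""
    if not samples:
        return set()
    genes_per_sample = {sample: set() for sample in samples}
    for v in variants:
        if v.sample in genes_per_sample and v.zygosity == "het" and v.coding_change:
            genes_per_sample[v.sample].add(v.gene)
    # Keep only genes hit in all samples.
    return set.intersection(*genes_per_sample.values())


if __name__ == "__main__":
    # Placeholder data for three unrelated cases.
    variants = [
        Variant("P1", "GENE_A", "het", True),
        Variant("P2", "GENE_A", "het", True),
        Variant("P3", "GENE_A", "het", True),
        Variant("P1", "GENE_B", "het", True),
        Variant("P2", "GENE_B", "hom", True),
        Variant("P3", "GENE_B", "het", False),
    ]
    print(candidate_genes(variants, ["P1", "P2", "P3"]))
```

Filtering of this kind only narrows the list of candidates; as in the study above, segregation analysis in families and screening of additional cases are still needed to support causality.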

NGS in cancer diagnostics

The application of NGS in cancer diagnostics has become increasingly evident. This is exemplified by two recent studies using WGS [5,48]. More specifically, WGS has demonstrated its discovery and confirmatory role in cases characterized by an ambiguous diagnosis or clinical presentation. For example, it has been used to unravel the genetic aberration in a patient with a diagnosis of AML of unclear subtype [5]. Molecular characterization carries important clinical implications for the treatment and management of the patient. The ambiguity arose because the patient’s clinical presentation was consistent with acute promyelocytic leukemia (a subtype of AML with a favorable prognosis), but this was contradicted by cytogenetic analysis. The cytogenetic analysis revealed a different subtype associated with a poor prognosis, for which bone marrow transplantation in first remission is recommended. The diagnostic and treatment uncertainty was resolved by performing WGS on DNA from the original leukemic bone marrow and from a skin biopsy. The WGS analysis detected a novel insertional translocation on chromosome 17 that generated a pathogenic PML–RARA gene fusion, thereby confirming a diagnosis of acute promyelocytic leukemia. This type of complex rearrangement would not have been detected without WGS, further demonstrating that WGS represents a comprehensive analytical tool for the entire genome [5].

More importantly, this molecular confirmatory diagnosis had important clinical implications for the treatment administered to the patient. Following the molecular diagnosis, the patient was considered eligible to receive treatment with retinoic acid, which significantly improves the overall prognosis of patients suffering from acute promyelocytic leukemia. In addition, this avoided the risks inherent in bone marrow transplantation, since this treatment option was not considered further. The clinical significance was clear, as the results of the analysis were used in clinical decision making with regard to the patient’s therapy. In practical terms, the WGS and subsequent analysis were completed within a clinically reasonable timeframe (6 weeks) [5].

In a similar vein, WGS has been employed to resolve the genetic cause of a patient with a suspected cancer susceptibility syndrome based upon the early onset of several primary tumors [48]. The patient developed multiple cancers (specifically breast cancer and ovarian cancer) at an early age. In addition, the patient also developed treatment-related AML. These clinical presentations led to the genetic testing of BRCA1 and BRCA2 genes, which was unrevealing. However, the underlying genetic cause was resolved by WGS on leukemic and skin cells derived from the patient. The WGS analysis identified a novel heterozygous deletion of three exons of the TP53 gene, and the intact copy of TP53 had been lost in the leukemic cells due to uniparental disomy. This demonstrated the utility of WGS in a case with unexpected ‘genetic heterogeneity’, where mutations, other than in BRCA1 and BRCA2 genes, were not tested at the outset. Although this did not affect subsequent clinical decision-making, revealing the underlying genetic defect had important implications for screening family members. The successful applications of NGS/WGS in cancer diagnostics in these studies are likely to be the first examples of how the new technologies are proving their worth; the number of these reports is expected to increase rapidly in the coming year.

As noted above, in clinical oncology it is important to generate comprehensive genomic data, perform the analysis and interpret the data in a timeframe that is clinically relevant for the patient. In addition to the WGS studies, another study has also assessed the incorporation of WES and transcriptome sequencing in terms of their clinical utility from both technical and cost perspectives [49]. All three genomic approaches were applied to tumors in an effort to identify potentially pathological aberrations. This study showed that a ‘comprehensive genomic approach’ is both time- and cost-effective. In particular, the time from biopsy sampling and wet-lab experiments to computational analysis and initial results was streamlined to just 24 days. Furthermore, the total cost of sequencing for the three experiments and analysis was US$5400 per patient during the study, and this cost may be expected to decrease over time. A further advantage of this ‘integrative genomic approach’ is that the findings can be cross-validated in a more efficient way. For example, both WGS and WES detected an amplification event on chromosome 13q spanning the CDK8 gene in a metastatic colorectal carcinoma; the overexpression of CDK8 was confirmed by transcriptome sequencing.

One of the critical challenges in applying these technologies in a clinical setting is in the handling and interpreting of the huge volume of genomic data. To address this concern, the study introduced the notion of a multidisciplinary ‘sequencing tumor board’ (which included professionals from several disciplines such as clinicians, geneticists, pathologists, biologists, bioinformatic specialists and bioethicists) responsible for the clinical interpretation of the sequencing data obtained from each patient [49]. Personnel who are specialized in these high-throughput genomic technologies are clearly needed. Thus, given that these challenges (cost, analysis and interpretation) are being increasingly addressed, or even alleviated, it is widely perceived that the clinical utility of NGS will become commonplace in the near future. Figure 2.1 displays the workflow of a ‘sequencing everything’ approach in a clinical context.

NGS has also been assessed for its applicability as a diagnostic tool to detect known germline mutations for hereditary cancers. More specifically, by leveraging the technological advances in custom enrichment and NGS, Walsh et al. designed custom oligonucleotides in solution to capture 21 genes responsible for an inherited risk of breast and ovarian cancers [6]. The enrichment followed by NGS (using Illumina GA) was tested in 20 women diagnosed with breast or ovarian cancer and with a known mutation in one of the genes responsible for inherited predisposition to these cancers. The results were encouraging: all of the known point mutations, small indel mutations (1–19 bp), and large genomic duplications and deletions (160–101,013 bp) were detected in all of the samples. The large deletions and duplications were detected using a read-depth strategy and were in complete agreement with the multiplex ligation-dependent probe amplification assay [6]. In addition to being able to detect point mutations and small indels, the ability to detect larger deleted and duplicated regions is a further advantage of NGS compared with Sanger sequencing; this is important in a diagnostic test setting as some causal genes are affected by copy number variants. Similarly, the promise of NGS in genetic diagnosis of familial breast cancer has also been demonstrated by another study, which attempted to detect TP53, BRCA1 and BRCA2 mutations in tumor cell lines and DNA from patients with germline mutations. All of the known pathological mutations (including point mutations and small indels of up to 16 nucleotides) were identified [50].
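The idea behind a read-depth strategy can be illustrated with a short sketch: depth per exon is normalized to the total depth, compared with a reference sample, and exons whose log2 ratio falls well below or above zero are flagged. This is a simplified illustration, not the method of Walsh et al.; the function name, the cutoffs and the toy depths are assumptions chosen so that a heterozygous deletion (ratio about 0.5) and a duplication (ratio about 1.5) are separated from normal exons.

```python
import math


def classify_exons(sample_depth, reference_depth, loss_cutoff=-0.6, gain_cutoff=0.4):
    """Label each exon as 'deletion', 'duplication', 'normal' or 'no_coverage'.

    Both arguments map exon name to mean read depth. The cutoffs apply to the
    log2 ratio of normalized sample depth to normalized reference depth and
    are illustrative assumptions, not validated thresholds.
    """
    sample_total = sum(sample_depth.values())
    reference_total = sum(reference_depth.values())
    calls = {}
    for exon, depth in sample_depth.items():
        ref = reference_depth[exon]
        if depth == 0 or ref == 0:
            # A ratio cannot be computed without coverage in both samples.
            calls[exon] = "no_coverage"
            continue
        ratio = (depth / sample_total) / (ref / reference_total)
        log2_ratio = math.log2(ratio)
        if log2_ratio <= loss_cutoff:
            calls[exon] = "deletion"
        elif log2_ratio >= gain_cutoff:
            calls[exon] = "duplication"
        else:
            calls[exon] = "normal"
    return calls


# Toy, made-up depths: exon2 has half and exon4 one and a half times the reference depth.
reference = {"exon1": 100, "exon2": 100, "exon3": 100, "exon4": 100, "exon5": 100}
sample = {"exon1": 100, "exon2": 50, "exon3": 100, "exon4": 150, "exon5": 100}
cnv_calls = classify_exons(sample, reference)
print(cnv_calls)
```

In practice, depth also varies with GC content and capture efficiency, so real pipelines typically compare against several reference samples and confirm calls with an orthogonal assay.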

Furthermore, attempts have also been made to incorporate custom genomic enrichment and NGS methods into the genetic diagnostic testing of Lynch syndrome (hereditary nonpolyposis colorectal cancer). An enrichment assay was developed to capture every exon in a panel of 22 genes (most of which are associated with hereditary colorectal cancer), and this was followed by NGS using the Roche 454 GS-FLX and the Illumina GA in order to evaluate the performance of the two platforms [27].

Figure 2.1. Workflow of a ‘sequencing everything’ approach in a clinical context.

Patient after informed consent

Tumor biopsy (DNA, RNA extraction)

Genomic analysis (WGS, WES): point mutations, small indels, copy number alterations, structural rearrangements, uniparental disomy

Transcriptomics analysis (RNA-Seq): expression levels of coding RNAs and noncoding RNAs, fusion transcripts, alternative splicing

Epigenomics analysis (Bisulfite-Seq, ChIP-Seq): DNA methylation patterns, histone modifications, transcription factor binding sites

Combine information for integrative analysis and validation of the results by CLIA-certified laboratory for clinical decision making

Expert panel (clinicians, geneticists, pathologists, oncologists, biologists, bioinformatics specialists, bioethicists) responsible for data interpretation, report generation and disclosure of results

Personalized genomic medicine or personalized patient management and treatment

ChIP: Chromatin immunoprecipitation; CLIA: Clinical Laboratory Improvement Amendments; Seq: Sequencing; WES: Whole-exome sequencing; WGS: Whole-genome sequencing. Reproduced with permission from [55].

Although these technologies are promising in the context of molecular diagnostics, their technical limitations must also be recognized; for example, GC-rich regions were difficult to enrich. In the worst-case scenario, these GC-rich regions would not be captured at all. This was demonstrated in a targeted sequencing study of several ataxia genes, where two exons that lacked any sequence coverage contained a very high GC content (76.1 and 63.6%, respectively) compared with the average GC content of 37.6% for the 50 best covered exons [51].
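The GC content comparison reported for the ataxia study can be reproduced for any set of target sequences with a simple calculation. The sketch below is illustrative only: the 60% threshold, the function names and the toy sequences are assumptions, not values or data taken from the cited work.

```python
def gc_content(sequence):
    """Return the percentage of G and C bases among the A/C/G/T bases."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc_count = sum(base in "GC" for base in bases)
    return 100.0 * gc_count / len(bases)


def flag_gc_rich(exons, threshold=60.0):
    """Return names of exons whose GC content exceeds `threshold` percent.

    The default threshold is an arbitrary illustrative value.
    """
    return [name for name, seq in exons.items() if gc_content(seq) > threshold]


# Toy, made-up sequences (far shorter than real exons).
toy_exons = {
    "exon_rich": "GCGCGCGCAT",     # 8 of 10 bases are G or C
    "exon_typical": "ATATATGCAT",  # 2 of 10 bases are G or C
}
at_risk = flag_gc_rich(toy_exons)
print(at_risk)
```

Screening a capture design in this way before sequencing gives an early indication of which targets may need an alternative assay, such as Sanger sequencing, to fill coverage gaps.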

Off-target sequencing is also an issue that leads to redundant sequencing beyond the targeted regions. Moreover, uneven sequencing coverage across the targeted regions (due to several factors such as uneven enrichment, uneven sequencing and difficulty in aligning the sequence reads to repetitive regions) may result in poor sequence coverage in some of the regions, which subsequently affects the sensitivity and specificity of variant detection [27]. Last but not least, the comparison of the two NGS platforms also revealed discrepancies between the sequence variants called by each platform. This implies that if only one platform were used, further validation by Sanger sequencing may be needed. In a clinical setting, mutations that are deemed important to patients but were identified in a research setting require confirmation in a Clinical Laboratory Improvement Amendments (CLIA)-certified laboratory.
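A comparison of variant calls from two platforms, of the kind that exposes such discrepancies, can be sketched with simple set operations. This is a minimal illustration under assumed inputs: variants are represented as hypothetical (chromosome, position, reference allele, alternate allele) tuples, and the toy calls are made up.

```python
def compare_callsets(calls_a, calls_b):
    """Summarize agreement between two collections of variant calls.

    Each variant is a hashable key such as (chrom, pos, ref, alt).
    Concordance is the shared fraction of all distinct variants called.
    """
    a, b = set(calls_a), set(calls_b)
    shared = a & b
    union = a | b
    return {
        "concordant": shared,
        "only_a": a - b,
        "only_b": b - a,
        "concordance": len(shared) / len(union) if union else 1.0,
    }


# Toy, made-up calls from two hypothetical platforms.
platform_a = [("chr2", 1000, "A", "G"), ("chr3", 2000, "C", "T"), ("chr7", 3000, "G", "A")]
platform_b = [("chr2", 1000, "A", "G"), ("chr3", 2000, "C", "T"), ("chr7", 3500, "T", "C")]
summary = compare_callsets(platform_a, platform_b)
print(summary["concordance"])
print(sorted(summary["only_a"] | summary["only_b"]))
```

Variants found by only one platform are the natural candidates for confirmation by Sanger sequencing, in line with the validation step suggested above.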

The development of a diagnostic tool to accurately and cost-effectively detect different genetic aberrations in panels of genes is important for its adoption in a clinical setting. The lack of such a comprehensive tool would create a need for several diagnostic tests per patient. Thus, for example, in genetic testing for BRCA1/BRCA2 mutations, a separate diagnostic test has been offered in order to detect large exonic deletions and duplications that are undetectable by PCR Sanger-based approaches. Similarly, deletion/duplication analysis of the genes implicated in Lynch syndrome (MLH1, MSH2, MSH6, PMS2 and EPCAM) has been performed separately from gene sequencing analysis. NGS has shown its advantages in integrating analyses of different types of genetic variants in a single ‘convenient’ test. In addition, all of the known disease genes can be screened simultaneously owing to the cost–effectiveness and higher throughput of NGS, thereby obviating the need for Sanger sequencing-based testing on a one-by-one basis, which is both time consuming and costly. However, for molecular diagnostics involving a small panel of genes, such as in breast/ovarian cancer and Lynch syndrome, usually only a few samples are available at a time and bench-top NGS machines are the more suitable platforms. Although a limited number of studies have been performed thus far to develop and assess NGS-based diagnostic assays in hereditary cancers, many have been conducted for Mendelian disorders [47].

The clinical utility of next-generation sequencing technologies has been increasingly evident.

On top of its affordability, the turnaround time of next-generation sequencing (e.g., whole-genome sequencing) in a molecular diagnostic setting lies within a reasonable clinical timeframe.

Pipelines for interpreting whole-genome sequencing and transcriptome sequencing data derived from tumors involving professionals from different disciplines have also been suggested.

However, tests must be properly regulated in a clinical setting and operated according to the Clinical Laboratory Improvement Amendments.

Perspective & conclusion

NGS technologies have so far made significant progress in characterizing somatic mutations in cancer genomes. This endeavor will be further accelerated by international initiatives such as the International Cancer Genome Consortium [52]. The aim of this initiative is to interrogate the somatic mutational landscape of at least 50 different cancer types and subtypes in thousands of samples, and eventually integrate these genomic data with transcriptomic and epigenomic data. This integrative approach is critical in characterizing the genomic complexities of cancers. The acceleration in the discovery of novel driver mutations or candidate cancer genes in various cancers will not only lead to a better understanding of cancer pathogenesis, but should also potentiate personalized cancer medicine. Similarly, further studies of hereditary cancer syndromes, whose genetic etiologies have not yet been fully explained, would also be expected to identify new causal mutations that could be invaluable in diagnostic testing.

Although the technical and analytical feasibility and cost–effectiveness of NGS in cancer diagnostics have been amply demonstrated, attention should also be given to ethical issues pertinent to the use of these powerful information-generating tools, for example, their ability to reveal results that may be considered incidental. It is noteworthy that the application of NGS in cancer diagnostics has until now been primarily demonstrated in a research setting. All of the clinical (genetic) tests (whether NGS-based or not) must be validated in a heavily regulated clinical setting if the results are to be used to make a diagnosis or therapeutic recommendation to the patient. Although the adoption of NGS in cancer diagnostics is inevitable, the authors believe that targeted methods, such as PCR Sanger sequencing, might still be practical and could suffice for certain applications involving single candidate genes with hotspot mutations. For example, in the context of therapeutic prediction, the selection of patients who are eligible for anti-EGFR tyrosine kinase inhibitors in non-small-cell lung cancer or anti-EGFR monoclonal antibody in colorectal cancer is guided by the status of several hotspot mutations in the EGFR and KRAS genes [53]. By contrast, NGS possesses advantages in terms of the simultaneous sequencing of multiple genes and multiple samples (i.e., targeted sequencing as demonstrated in the genes implicated in breast cancer and Lynch syndrome), and has resulted in cost and time savings. Although not the focus of this chapter, therapeutic prediction has also benefited significantly from NGS as a powerful discovery tool. Most notably, a recent study that employed a targeted NGS approach to sequence 138 cancer genes in melanomas (before and after relapse) from a given patient succeeded in identifying the underlying genetic mutation in the MEK1 gene responsible for acquired resistance to PLX4032 (vemurafenib) after an initial dramatic response. The in vitro demonstration of increased kinase activity, which conferred resistance to both RAF and MEK inhibition of this mutant MEK1 protein, further supported this novel mechanism of acquired drug resistance [54]. Moreover, the discovery power of WGS in making a diagnosis in cases with an unsuspected genetic etiology is very evident. Finally, it is anticipated that continuing advances in NGS technologies and computational tools, and cost reduction will make them ever more accessible in clinical practice.

Financial & competing interests disclosure

The authors have no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.

No writing assistance was utilized in the production of this manuscript.

Summary.

• The advent of next-generation sequencing (NGS) technologies has advanced cancer genetics research in two distinct directions – research discovery and clinical application.

• Studies employing NGS technologies have been attempting to identify somatic driver mutations in various sporadic cancers as well as high-penetrance causal germline variants in hereditary cancer syndromes.

• NGS technologies are also a promising diagnostic tool for cancers.

• These recent advances in research and diagnostics would not have been possible with the traditional low-throughput PCR amplification and Sanger sequencing methods.

• NGS technologies are characterized by massively parallel sequencing of hundreds of millions of sequence reads, and the production of hundreds of gigabases of DNA sequence data (at a very low cost per nucleotide) has made whole-genome sequencing both technically feasible and affordable.

• Although the technical and analytical feasibility and cost–effectiveness of NGS in cancer diagnostics have been amply demonstrated, attention should also be given to ethical issues pertinent to the use of these powerful information-generating tools; for instance, its ability to reveal results that may be considered incidental.

• It is anticipated that continuing advances in NGS technologies and computational tools, and cost reduction will make them ever more accessible in clinical practice.

References

1 Wong KM, Hudson TJ, McPherson JD. Unraveling the genetics of cancer: genome sequencing and beyond. Annu. Rev. Genomics Hum. Genet. 12, 407–430 (2011).

2 Jones S, Hruban RH, Kamiyama M et al. Exomic sequencing identifies PALB2 as a pancreatic cancer susceptibility gene. Science 324(5924), 217 (2009).

3 Comino-Mendez I, Gracia-Aznarez FJ, Schiavi F et al. Exome sequencing identifies MAX mutations as a cause of hereditary pheochromocytoma. Nat. Genet. 43(7), 663–667 (2011).

4 Yokoyama S, Woods SL, Boyle GM et al. A novel recurrent mutation in MITF predisposes to familial and sporadic melanoma. Nature 480(7375), 99–103 (2011).

5 Welch JS, Westervelt P, Ding L et al. Use of whole-genome sequencing to diagnose a cryptic fusion oncogene. JAMA 305(15), 1577–1584 (2011).

6 Walsh T, Lee MK, Casadei S et al. Detection of inherited mutations for breast and ovarian cancer using genomic capture and massively parallel sequencing. Proc. Natl Acad. Sci. USA 107(28), 12629–12633 (2010).

7 Metzker ML. Sequencing technologies – the next generation. Nat. Rev. Genet. 11(1), 31–46 (2010).

8 Mardis ER. A decade’s perspective on DNA sequencing technology. Nature 470(7333), 198–203 (2011).

9 Rothberg JM, Hinz W, Rearick TM et al. An integrated semiconductor device enabling non-optical genome sequencing. Nature 475(7356), 348–352 (2011).

10 Mertes F, Elsharawy A, Sauer S et al. Targeted enrichment of genomic DNA regions for next-generation sequencing. Brief Funct. Genomics 10(6), 374–386 (2011).

11 Korbel JO, Urban AE, Affourtit JP et al. Paired-end mapping reveals extensive structural variation in the human genome. Science 318(5849), 420–426 (2007).

12 Medvedev P, Stanciu M, Brudno M. Computational methods for discovering structural variation with next-generation sequencing. Nat. Methods 6(11 Suppl.), S13–S20 (2009).

13 Medvedev P, Fiume M, Dzamba M, Smith T, Brudno M. Detecting copy number variation with mated short reads. Genome Res. 20(11), 1613–1622 (2010).

14 Koboldt DC, Ding L, Mardis ER, Wilson RK. Challenges of sequencing human genomes. Brief Bioinform. 11(5), 484–498 (2010).

15 Robison K. Application of second-generation sequencing to cancer genomics. Brief Bioinform. 11(5), 524–534 (2010).

16 Mardis ER. The impact of next-generation sequencing technology on genetics. Trends Genet. 24(3), 133–141 (2008).

17 Li Y, Wang J. Faster human genome sequencing. Nat. Biotechnol. 27(9), 820–821 (2009).

18 Shendure J, Ji H. Next-generation DNA sequencing. Nat. Biotechnol. 26(10), 1135–1145 (2008).

19 Meyerson M, Gabriel S, Getz G. Advances in understanding cancer genomes through second-generation sequencing. Nat. Rev. Genet. 11(10), 685–696 (2010).

20 Wheeler DA, Srinivasan M, Egholm M et al. The complete genome of an individual by massively parallel DNA sequencing. Nature 452(7189), 872–876 (2008).

21 Wang J, Wang W, Li R et al. The diploid genome sequence of an Asian individual. Nature 456(7218), 60–65 (2008).

22 Bentley DR, Balasubramanian S, Swerdlow HP et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456(7218), 53–59 (2008).

23 Ng SB, Turner EH, Robertson PD et al. Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461(7261), 272–276 (2009).

24 Ng SB, Buckingham KJ, Lee C et al. Exome sequencing identifies the cause of a mendelian disorder. Nat. Genet. 42(1), 30–35 (2010).

25 Ng SB, Bigham AW, Buckingham KJ et al. Exome sequencing identifies MLL2 mutations as a cause of Kabuki syndrome. Nat. Genet. 42(9), 790–793 (2010).

26 Schadt EE, Turner S, Kasarskis A. A window into third-generation sequencing. Hum. Mol. Genet. 19(R2), R227–R240 (2010).

27 Hoppman-Chaney N, Peterson LM, Klee EW, Middha S, Courteau LK, Ferber MJ. Evaluation of oligonucleotide sequence capture arrays and comparison of next-generation sequencing platforms for use in molecular diagnostics. Clin. Chem. 56(8), 1297–1306 (2010).

28 Artuso R, Fallerini C, Dosa L et al. Advances in Alport syndrome diagnosis using next-generation sequencing. Eur. J. Hum. Genet. 20(1), 50–57 (2012).

29 Ku CS, Wu M, Cooper DN et al. Technological advances in DNA sequence enrichment and sequencing for germline genetic diagnosis. Expert Rev. Mol. Diagn. 12(2), 159–173 (2012).

30 Ley TJ, Mardis ER, Ding L et al. DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome. Nature 456(7218), 66–72 (2008).

31 Mardis ER, Ding L, Dooling DJ et al. Recurring mutations found by sequencing an acute myeloid leukemia genome. N. Engl. J. Med. 361(11), 1058–1066 (2009).

32 Ley TJ, Ding L, Walter MJ et al. DNMT3A mutations in acute myeloid leukemia. N. Engl. J. Med. 363(25), 2424–2433 (2010).

33 Totoki Y, Tatsuno K, Yamamoto S et al. High-resolution characterization of a hepatocellular carcinoma genome. Nat. Genet. 43(5), 464–469 (2011).

34 Pleasance ED, Cheetham RK, Stephens PJ et al. A comprehensive catalogue of somatic mutations from a human cancer genome. Nature 463(7278), 191–196 (2010).

35 Berger MF, Lawrence MS, Demichelis F et al. The genomic complexity of primary human prostate cancer. Nature 470(7333), 214–220 (2011).

36 Lee W, Jiang Z, Liu J et al. The mutation spectrum revealed by paired genome sequences from a lung cancer patient. Nature 465(7297), 473–477 (2010).

37 Shah SP, Morin RD, Khattra J et al. Mutational evolution in a lobular breast tumour profiled at single nucleotide resolution. Nature 461(7265), 809–813 (2009).

38 Ding L, Ley TJ, Larson DE et al. Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing. Nature 481(7382), 506–510 (2012).

39 Sjoblom T, Jones S, Wood LD et al. The consensus coding sequences of human breast and colorectal cancers. Science 314(5797), 268–274 (2006).

40 Wood LD, Parsons DW, Jones S et al. The genomic landscapes of human breast and colorectal cancers. Science 318(5853), 1108–1113 (2007).

41 Yan XJ, Xu J, Gu ZH et al. Exome sequencing identifies somatic mutations of DNA methyltransferase gene DNMT3A in acute monocytic leukemia. Nat. Genet. 43(4), 309–315 (2011).

42 Wang K, Kan J, Yuen ST et al. Exome sequencing identifies frequent mutation of ARID1A in molecular subtypes of gastric cancer. Nat. Genet. 43(12), 1219–1223 (2011).

43 Varela I, Tarpey P, Raine K et al. Exome sequencing identifies frequent mutation of the SWI/SNF complex gene PBRM1 in renal carcinoma. Nature 469(7331), 539–542 (2011).

44 Wei X, Walia V, Lin JC et al. Exome sequencing identifies GRIN2A as frequently mutated in melanoma. Nat. Genet. 43(5), 442–446 (2011).

45 Dalgliesh GL, Furge K, Greenman C et al. Systematic sequencing of renal carcinoma reveals inactivation of histone modifying genes. Nature 463(7279), 360–363 (2010).

46 Li M, Zhao H, Zhang X et al. Inactivating mutations of the chromatin remodeling gene ARID2 in hepatocellular carcinoma. Nat. Genet. 43(9), 828–829 (2011).

47 Ku CS, Cooper DN, Polychronakos C, Naidoo N, Wu M, Soong R. Exome sequencing: dual role as a discovery and diagnostic tool. Ann. Neurol. 71(1), 5–14 (2012).

48 Link DC, Schuettpelz LG, Shen D et al. Identification of a novel TP53 cancer susceptibility mutation through whole-genome sequencing of a patient with therapy-related AML. JAMA 305(15), 1568–1576 (2011).

49 Roychowdhury S, Iyer MK, Robinson DR et al. Personalized oncology through integrative high-throughput sequencing: a pilot study. Sci. Transl. Med. 3(111), 111ra121 (2011).

50 Morgan JE, Carr IM, Sheridan E et al. Genetic diagnosis of familial breast cancer using clonal sequencing. Hum. Mutat. 31(4), 484–491 (2010).

51 Hoischen A, Gilissen C, Arts P et al. Massively parallel sequencing of ataxia genes after array-based enrichment. Hum. Mutat. 31(4), 494–499 (2010).

52 Hudson TJ, Anderson W, Aretz A et al. International network of cancer genome projects. Nature 464(7291), 993–998 (2010).

53 Ross JS, Cronin M. Whole cancer genome sequencing by next-generation methods. Am. J. Clin. Pathol. 136(4), 527–539 (2011).

54 Wagle N, Emery C, Berger MF et al. Dissecting therapeutic resistance to RAF inhibition in melanoma by tumor genomic profiling. J. Clin. Oncol. 29(22), 3085–3096 (2011).

55 Ku C-S, Cooper DN, Ziogas ED, Halkia E, Tzaphlidou M, Roukos DH. Research and clinical applications of cancer genome sequencing. Curr. Opin. Obstet. Gynecol. doi: 10.1097/GCO.0b013e32835af17c (2012) (Epub ahead of print).

Websites

101 The Human Gene Mutation Database. www.hgmd.org

102 Illumina. MiSeq™ Personal Sequencer. www.illumina.com/systems/miseq.ilmn

103 Illumina. TruSeq™ Custom Amplicon. www.illumina.com/products/truseq_custom_amplicon.ilmn