2
NEWS & VIEWS NATURE|Vol 437| 15 September 2005 326 fails to inspire. Pritiraj Mohanty is in the Department of Physics, Boston University, 590 Commonwealth Avenue, Boston, Massachusetts 02215, USA. e-mail: [email protected] 1. Strogatz, S. Sync: The Emerging Science of Spontaneous Order (Hyperion, New York, 2003). 2. Kaka, S. et al. Nature 437, 389–392 (2005). 3. Mancoff, F. B. et al. Nature 437, 393–395 (2005). labs to become involved in DNA sequencing, new approaches must decrease the cost and increase the throughput of sequencing sig- nificantly, while maintaining the high quality of data produced by the current approach. Rothberg and colleagues 4 have developed a highly parallel system capable of sequencing 25 million bases in a four-hour period — about 100 times faster than the current state- of-the-art Sanger sequencing and capillary- based electrophoresis platform. The method could potentially allow one individual to pre- pare and sequence an entire genome in a few days (Fig. 1). The sequencer itself, equipped with a simple detection device and liquid delivery system, and housed in a casing roughly the size of a microwave oven, is actu- ally relatively low-tech. The complexity of the system lies primarily in the sample preparation and in the microfabricated, massively parallel platform, which contains 1.6 million picolitre- sized reactors in a 6.4-cm 2 slide. Sample preparation starts with fragmenta- tion of the genomic DNA, followed by the attachment of adaptor sequences to the ends of the DNA pieces. The adaptors allow the DNA fragments to bind to tiny beads (around 28 Ȗm in diameter). This is done under conditions that allow only one piece of DNA to bind to each bead. The beads are encased in droplets of oil that contain all of the reactants needed to amplify the DNA using a standard tool called the polymerase chain reaction. The oil droplets form part of an emulsion so that each bead is kept apart from its neighbour, ensuring the amplification is uncontaminated. Each bead ends up with roughly 10 million copies of its initial DNA fragment. To perform the sequencing reaction, the DNA-template-carrying beads are loaded into the picolitre reactor wells — each well having space for just one bead. The technique uses a sequencing-by-synthesis 8 method developed by Uhlen and colleagues, in which DNA complementary to each template strand is synthesized. The nucleotide bases used for sequencing release a chemical group as the base forms a bond with the growing DNA chain, and this group drives a light-emitting reaction in the presence of specific enzymes and luciferin. Sequential washes of each of the four possible nucleotides are run over the plate, and a detector senses which of the wells emit light with each wash to determine the sequence of the growing strand. This new system shows great promise in several sequencing applications, including re- sequencing and de novo sequencing of smaller bacterial and viral genomes. It could potentially allow research groups with limited resources to enter the field of large-scale DNA sequencing and genomic research, as it provides a technol- ogy that is inexpensive and easy to imple- ment and maintain. However, this technology GENOMICS Massively parallel sequencing Yu-Hui Rogers and J. Craig Venter A sequencing system has been developed that can read 25 million bases of genetic code — the entire genome of some fungi — within four hours. The technique may provide an alternative approach to DNA sequencing. Since the publication of the first complete genome sequence of a living organism 1 in 1995, the field of genomics has changed dramatically. Fuelled by innovations in high-throughput DNA sequencing, high-performance comput- ing and bioinformatics, genomic science has expanded substantially and the rate of genomic discovery has grown exponentially. To date, the genomes of more than 300 organisms have been sequenced and analysed, including those of most major human pathogens, diverse microbes — and, of course, our own genome 2,3 . These advances have profoundly altered the landscapes of biological science and medicine. In this issue, Rothberg and colleagues (page 376) 4 describe a sequencing system that offers a much higher throughput than the current state- of-the-art methods. The system has some limi- tations to overcome before it can be used for all sequencing applications, but it is nonetheless one of the most promising sequencing tech- nologies to have emerged in recent years. For more than a decade, Sanger sequencing 5 and fluorescence-based electrophoresis tech- nologies 6 have dominated the DNA sequenc- ing field. Continued improvements in these techniques and in instrumentation, paired with advances in computing and informatics, have reduced the cost of sequencing by roughly two orders of magnitude and transformed genome projects from decade-long endeavours to projects of mere months (for mammalian- sized genomes), or even weeks (for microbial genomes). However, it still costs an estimated US$10 million to US$25 million to sequence a single human genome 7 and $20,000–$50,000 to sequence a microbial genome. Only a hand- ful of large genome centres worldwide have the resources and technical expertise to handle the sequencing of a mammalian-sized genome, perform large-scale sequencing of multiple organisms or conduct the resequencing of large numbers of genes. To ensure continued growth of genomic science and to enable more Figure 1 | Speeding up sequencing. Flow diagrams for a, traditional microlitre-scale Sanger DNA sequencing and electrophoresis, and b, the massively parallel picolitre-scale sequencing developed by Rothberg et al. 4 . The traditional microlitre-scale approach requires a longer processing time per production cycle, substantially more support equipment, a larger facility and more labour than the picolitre-scale approach. 4. Richardson, O. W. Phys. Rev. 26, 248–253 (1908). 5. Einstein, A. & de Haas, W. J. Verh. Dt. Phys. Ges. 17, 152–170 (1915). 6. Beth, R. A. Phys. Rev. 50, 115–125 (1936). 7. Scott, G. G. Phys. Rev. 82, 542–547 (1951). 8. Doll, R. & Nabauer, M. Phys. Rev. Lett. 7, 51–52 (1961). 9. Mohanty, P. et al. Phys. Rev. B 70, 195301 (2004). 10. Berger, L. Phys. Rev. B 54, 9353–9358 (1996). 11. Slonczewski, J. C. J. Magn. Magn. Mater. 159, L1–L7 (1996). 12. Huygens, C. Oeuvres Complètes de Christiaan Huygens (ed. Nijhoff, M.) (Société Hollandaise des Sciences, The Hague, 1893). Nature Publishing Group ©2005

Yu-Hui Rogers and J. Craig Venter- Massively parallel sequencing

Embed Size (px)

Citation preview

Page 1: Yu-Hui Rogers and J. Craig Venter- Massively parallel sequencing

NEWS & VIEWS NATURE|Vol 437|15 September 2005

326

fails to inspire. ■

Pritiraj Mohanty is in the Department of Physics,Boston University, 590 Commonwealth Avenue,Boston, Massachusetts 02215, USA.e-mail: [email protected]

1. Strogatz, S. Sync: The Emerging Science of Spontaneous Order(Hyperion, New York, 2003).

2. Kaka, S. et al. Nature 437, 389–392 (2005).3. Mancoff, F. B. et al. Nature 437, 393–395 (2005).

labs to become involved in DNA sequencing,new approaches must decrease the cost andincrease the throughput of sequencing sig-nificantly, while maintaining the high quality of data produced by the current approach.

Rothberg and colleagues4 have developed ahighly parallel system capable of sequencing25 million bases in a four-hour period —about 100 times faster than the current state-of-the-art Sanger sequencing and capillary-based electrophoresis platform. The methodcould potentially allow one individual to pre-pare and sequence an entire genome in a fewdays (Fig. 1). The sequencer itself, equippedwith a simple detection device and liquiddelivery system, and housed in a casingroughly the size of a microwave oven, is actu-ally relatively low-tech. The complexity of thesystem lies primarily in the sample preparationand in the microfabricated, massively parallelplatform, which contains 1.6 million picolitre-sized reactors in a 6.4-cm2 slide.

Sample preparation starts with fragmenta-tion of the genomic DNA, followed by theattachment of adaptor sequences to the ends ofthe DNA pieces. The adaptors allow the DNAfragments to bind to tiny beads (around 28 �min diameter). This is done under conditionsthat allow only one piece of DNA to bind toeach bead. The beads are encased in dropletsof oil that contain all of the reactants needed to amplify the DNA using a standard toolcalled the polymerase chain reaction. The oildroplets form part of an emulsion so that eachbead is kept apart from its neighbour, ensuringthe amplification is uncontaminated. Eachbead ends up with roughly 10 million copies ofits initial DNA fragment.

To perform the sequencing reaction, theDNA-template-carrying beads are loaded intothe picolitre reactor wells — each well havingspace for just one bead. The technique uses asequencing-by-synthesis8 method developedby Uhlen and colleagues, in which DNA

complementary to each template strand is synthesized. The nucleotide bases used forsequencing release a chemical group as thebase forms a bond with the growing DNAchain, and this group drives a light-emittingreaction in the presence of specific enzymesand luciferin. Sequential washes of each of thefour possible nucleotides are run over theplate, and a detector senses which of the wellsemit light with each wash to determine thesequence of the growing strand.

This new system shows great promise in several sequencing applications, including re-sequencing and de novo sequencing of smallerbacterial and viral genomes. It could potentiallyallow research groups with limited resources toenter the field of large-scale DNA sequencingand genomic research, as it provides a technol-ogy that is inexpensive and easy to imple-ment and maintain. However, this technology

GENOMICS

Massively parallel sequencingYu-Hui Rogers and J. Craig Venter

A sequencing system has been developed that can read 25 million bases ofgenetic code — the entire genome of some fungi — within four hours. Thetechnique may provide an alternative approach to DNA sequencing.

Since the publication of the first completegenome sequence of a living organism1 in 1995,the field of genomics has changed dramatically.Fuelled by innovations in high-throughputDNA sequencing, high-performance comput-ing and bioinformatics, genomic science hasexpanded substantially and the rate of genomicdiscovery has grown exponentially. To date, the genomes of more than 300 organisms have been sequenced and analysed, including those of most major human pathogens, diversemicrobes — and, of course, our own genome2,3.These advances have profoundly altered thelandscapes of biological science and medicine.In this issue, Rothberg and colleagues (page376)4 describe a sequencing system that offers amuch higher throughput than the current state-of-the-art methods. The system has some limi-tations to overcome before it can be used for allsequencing applications, but it is nonethelessone of the most promising sequencing tech-nologies to have emerged in recent years.

For more than a decade, Sanger sequencing5

and fluorescence-based electrophoresis tech-nologies6 have dominated the DNA sequenc-ing field. Continued improvements in thesetechniques and in instrumentation, paired withadvances in computing and informatics, havereduced the cost of sequencing by roughly twoorders of magnitude and transformed genomeprojects from decade-long endeavours to projects of mere months (for mammalian-sized genomes), or even weeks (for microbialgenomes). However, it still costs an estimatedUS$10 million to US$25 million to sequence asingle human genome7 and $20,000–$50,000to sequence a microbial genome. Only a hand-ful of large genome centres worldwide have theresources and technical expertise to handle thesequencing of a mammalian-sized genome,perform large-scale sequencing of multipleorganisms or conduct the resequencing oflarge numbers of genes. To ensure continuedgrowth of genomic science and to enable more

Figure 1 | Speeding up sequencing. Flow diagramsfor a, traditional microlitre-scale Sanger DNAsequencing and electrophoresis, and b, themassively parallel picolitre-scale sequencingdeveloped by Rothberg et al.4. The traditionalmicrolitre-scale approach requires a longerprocessing time per production cycle, substantiallymore support equipment, a larger facility andmore labour than the picolitre-scale approach.

4. Richardson, O. W. Phys. Rev. 26, 248–253 (1908).5. Einstein, A. & de Haas, W. J. Verh. Dt. Phys. Ges. 17, 152–170

(1915).6. Beth, R. A. Phys. Rev. 50, 115–125 (1936).7. Scott, G. G. Phys. Rev. 82, 542–547 (1951).8. Doll, R. & Nabauer, M. Phys. Rev. Lett. 7, 51–52 (1961).9. Mohanty, P. et al. Phys. Rev. B 70, 195301 (2004).10. Berger, L. Phys. Rev. B 54, 9353–9358 (1996).11. Slonczewski, J. C. J. Magn. Magn. Mater. 159, L1–L7 (1996).12. Huygens, C. Oeuvres Complètes de Christiaan Huygens

(ed. Nijhoff, M.) (Société Hollandaise des Sciences, The Hague, 1893).

15.9 News & Views 325 MH 9/9/05 5:42 PM Page 326

Nature Publishing Group© 2005

Page 2: Yu-Hui Rogers and J. Craig Venter- Massively parallel sequencing

NATURE|Vol 437|15 September 2005 NEWS & VIEWS

327

cannot yet replace the Sanger sequencingapproach for some of the more demandingapplications, such as sequencing a mammaliangenome, as it has several limitations.

First, the technique can only read compara-tively short lengths of DNA, averaging 80–120bases per read, which is approximately a tenthof the read-lengths possible using Sangersequencing. This means not only that morereads must be done to cover the samesequence, but also that stitching the resultstogether into longer genomic sequences is a lotmore complicated. This is particularly truewhen dealing with genomes containing longrepetitive sequences.

Second, the accuracy of each individual readis not as good as with Sanger sequencing —particularly in genomic regions in which single bases are constantly repeated. Third,because the DNA ‘library’ is currently pre-pared in a single-stranded format, unlike thedouble-stranded inserts of DNA libraries usedfor Sanger sequencing, the technique cannotgenerate paired-end reads for each DNA frag-ment. The paired-end information is crucialfor assembling and orientating the individualsequence reads into a complete genomic mapfor de novo sequencing applications. Finally,the sample preparation and amplificationprocesses are still quite complex and willrequire automation and/or simplification.

Church and colleagues9 also recently hitupon the idea of using massively parallel reac-tions to speed up sequencing, although theirmethod is still only at the proof-of-principlestage rather than being a full production system.They use a similar principle to Rothberg andcolleagues4, that is, sequencing-by-synthesis ona solid support. However, the two approachesdiverge in terms of library construction,sequencing chemistry, signal detection andarray platform. These differences greatly affectthe characteristics and reproducibility of thedata, as well as the scalability of the platform.For example, Church and colleagues’ methodcan read paired-end sequences; however, itsaverage read-lengths are approximately a fifth ofthose generated by Rothberg and colleagues’system. These differences are key factors indetermining the sequencing application forwhich each technique might be most suited.

It may be years before Rothberg and col-leagues’ system, or other similar approaches9,10,can tackle all three billion letters of the humangenome with the same reliability and accuracyas current methods. Nevertheless, it looksextremely promising, and it is certainly one ofthe most significant sequencing technologiesunder development. ■

Yu-Hui Rogers and J. Craig Venter are at the J. Craig Venter Institute, 9704 Medical CenterDrive, Rockville, Maryland 20850, USA.e-mail: [email protected]

1. Venter, J. C. et al. Science 269, 496–512 (1995).2. Venter, J. C. et al. Science 291, 1304–1351 (2001).3. International Human Genome Mapping Consortium.

Nature 409, 860–921 (2001).

4. Margulies, M. et al. Nature 437, 376–380 (2005).5. Sanger, F., Nicklen, S. & Coulson, A. R. Proc. Natl Acad. Sci.

USA 74, 5463–5467 (1977).6. Prober, J. M. et al. Science 238, 336–341 (1987).7. NIH News Release www.genome.gov/12513210

(2004).

8. Nyren, P., Pettersson, B. & Uhlen, M. Anal. Biochem. 208,171–175 (1993).

9. Shendure, J. et al. Science advance online publicationdoi:10.1126/science.1117389 (2005).

10. Quake, S. R. et al. Proc. Natl Acad. Sci. USA 100, 3960–3964(2003).

Sources of white light are found almost every-where — in lighting and signage generally, butincreasingly as backlights in all sorts of dis-plays, for example in laptop computers or smartphones. The development of organic light-emitting diodes (OLEDs) promises furtherinnovation in the field: such diodes are light-weight, provide high brightness at low power,and can be fabricated on flexible substrates toform thin devices at potentially low productioncost1. Writing in Advanced Materials, Gong et al.2 demonstrate the use of semiconductingelectrolytes to grow highly efficient, multi-layer, white-light OLEDs from solution.

The structure of an OLED is quite simple; ituses an organic material that either fluorescesor phosphoresces. (Both processes involve there-emission of absorbed light at a longer wave-length; in phosphorescence, the quantum-mechanical processes that lead to re-emissionare more complex, so the emission is delayed.)The light-emitting material is sandwiched as athin film, typically 70–100 nanometres thick,between two electrodes. Of these, the anode istypically transparent and the cathode acts as amirror that ideally reflects any incident pho-tons back towards the transparent side.

When a voltage is applied to the electrodes,positive and negative charges (‘holes’ and elec-trons, respectively) are injected into the filmand move towards each other, forming a bodyknown as an exciton on meeting. This excitoncan become de-excited by emitting a photon,which leaves the device through the transpar-ent anode. Excitons are divided into two cat-egories according to the alignment of the spins of the electrons involved relative to oneanother: if these are antiparallel, a ‘singlet’ witha total spin of zero is formed; if they are paral-lel, the state is a ‘triplet’ with a spin of one. Interms of quantum mechanics, three-quartersof excitons must be triplets and only one-quarter singlets. The prospect of achievinghigher efficiency with OLEDs emitting fromtriplets has led to the investment of consider-able effort2,3 in their development.

OLEDs are generally produced by one oftwo routes: the sublimation of small molecules

DEVICE PHYSICS

Enlightening solutionsKlaus Meerholz

White-light-emitting diodes are becoming increasingly important, but whatis the best way to build compact devices possessing high efficiency? Brightprospects are offered by multi-layer organic devices grown from solution.

Figure 1 | Cross-section through the multi-layerOLED developed by Gong and colleagues2.The staggered height of the layers indicates their different energies; electrons (blue) favourmoving to lower energies, whereas holes (red)tend towards higher energies. The emissive layer(EML) is sandwiched between a hole-transportlayer (HTL) and an electron-transport layer(ETL) consisting of transport agents (HTA/ETA)containing sulphonate groups (SO3

�). (M+ stands for a metal counter-ion.) Thisstructure serves three purposes: first, to facilitatethe injection of holes and electrons (A and C) into the emissive layer by reducing the energeticbarriers to their passage; second, to enhance therecombination efficiency (formation of excitons)by blocking the passage of one type of carrier (B or D) from the emissive layer into the oppositetransport layer through a large step in energy;and third, to avoid quenching reactions ofexcitons at the electrodes (+/�). The layers of the OLED are deposited alternately from water or ethanol (HTL and ETL) and fromorganic solvents (EML). The light emittedthrough recombination in the EML passesthrough the transparent anode to the left; thecolour of the emission depends on the material or mixture of materials that forms the EML.

15.9 News & Views 325 MH 9/9/05 5:42 PM Page 327

Nature Publishing Group© 2005