6
Rational design of self-assembly pathways for complex multicomponent structures William M. Jacobs 1 , Aleks Reinhardt, and Daan Frenkel 1 Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom Edited by Athanassios Z. Panagiotopoulos, Princeton University, Princeton, NJ, and accepted by the Editorial Board April 7, 2015 (received for review February 3, 2015) The field of complex self-assembly is moving toward the design of multiparticle structures consisting of thousands of distinct building blocks. To exploit the potential benefits of structures with such addressable complexity,we need to understand the factors that optimize the yield and the kinetics of self-assembly. Here we use a simple theoretical method to explain the key features responsible for the unexpected success of DNA-brick experiments, which are currently the only demonstration of reliable self-assembly with such a large number of components. Simulations confirm that our theory accurately predicts the narrow temperature window in which error-free assembly can occur. Even more strikingly, our theory predicts that correct assembly of the complete structure may require a time-dependent experimental protocol. Further- more, we predict that low coordination numbers result in non- classical nucleation behavior, which we find to be essential for achieving optimal nucleation kinetics under mild growth condi- tions. We also show that, rather surprisingly, the use of heteroge- neous bond energies improves the nucleation kinetics and in fact appears to be necessary for assembling certain intricate 3D struc- tures. This observation makes it possible to sculpt nucleation path- ways by tuning the distribution of interaction strengths. These insights not only suggest how to improve the design of structures based on DNA bricks, but also point the way toward the creation of a much wider class of chemical or colloidal structures with addressable complexity. self-assembly | free-energy landscapes | nucleation | DNA nanotechnology R ecent experiments with short pieces of single-stranded DNA (1, 2) have shown that it is possible to assemble well-defined molecular superstructures from a single solution with more than merely a handful of distinct building blocks. These experiments use complementary DNA sequences to encode an addressable structure (3) in which each distinct single-stranded brickbelongs in a specific location within the target assembly. A remarkable feature of these experiments is that even without careful control of the subunit stoichiometry or optimization of the DNA sequences, a large number of 2- and 3D designed structures with thousands of subunits assemble reliably (1, 2, 4, 5). The success of this approach is astounding given the many ways in which the assembly of an addressable structure could potentially go wrong (68). Any attempt to optimize the assembly yield or to create even more complex structures should be based on a better understanding of the mechanism by which DNA bricks manage to self-assemble robustly. The existence of a sizable nucleation barrier, as originally proposed in refs. 1, 2, would remedy two possible sources of error that were previously thought to limit the successful assembly of multicomponent nanostructures: the depletion of free monomers and the uncontrolled aggregation of partially formed structures. Slowing the rate of nucleation would suppress competition among multiple nucleation sites for available monomers and give the complete structure a chance to assemble before encountering other partial structures. Recent simulations of a simplified model of a 3D addressable structure have provided evidence of a free-energy barrier for nucleation (9), suggesting that the ability to control this barrier should enable the assembly of a wide range of complex nanostructures. We therefore need to be able to predict how such a barrier depends on the design of the target structure and on the choice of DNA sequences. Until now, however, there have been no reliable techniques to predict the existence, let alone the magnitude, of a nucleation barrier for self-assembly in a mixture of complementary DNA bricks. Here we show that the assembly of 3D DNA-brick nano- structures is indeed a nucleated process, but only in a narrow range of temperatures. The nucleation barrier in these systems is determined entirely by the topology of the designed interactions that stabilize the target structure. Controllable nucleation is therefore a general feature of addressable structures that can be tuned through the rational choice of designed interactions. We find that the reliable self-assembly of 3D DNA bricks is a direct consequence of their unusual nucleation behavior, which is not accounted for by existing theories that work for classical exam- ples of self-assembly, such as crystal nucleation. We are thus able to provide a rational basis for the rather unconventional protocol used in the recent DNA-brick experiments by showing that they exploit a narrow window of opportunity where robust multi- component self-assembly can take place. Structure Connectivity Determines Assembly In constructing a model for the self-assembly of addressable struc- tures, we note that the designed interactions should be much stronger than any attractive interactions between subunits that are not adja- cent in a correctly assembled structure. The designed interactions Significance Recent experiments have demonstrated that complex, three- dimensional nanostructures can be self-assembled out of thou- sands of short strands of preprogrammed DNA. However, the mechanism by which robust self-assembly occurs is poorly un- derstood, and the same feat has not yet been achieved using any other molecular building block. Using a new theory of address- ableself-assembly, we explain how the design of the target structure and the choice of interparticle interactions determine the self-assembly pathway, and, to our knowledge, for the first time predict that a time-dependent protocol, rather than merely a carefully tuned set of conditions, may be necessary to optimize the yield. With an understanding of these design principles, it should be possible to engineer addressable nanostructures using a much wider array of materials. Author contributions: W.M.J., A.R., and D.F. designed research; W.M.J. and A.R. per- formed research; W.M.J., A.R., and D.F. analyzed data; and W.M.J., A.R., and D.F. wrote the paper. The authors declare no conflict of interest. This article is a PNAS Direct Submission. A.Z.P. is a guest editor invited by the Editorial Board. Data deposition: Supporting data related to this publication are available at github.com/ wmjac/pygtsa. 1 To whom correspondence may be addressed. Email: [email protected] or df246@ cam.ac.uk. This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. 1073/pnas.1502210112/-/DCSupplemental. www.pnas.org/cgi/doi/10.1073/pnas.1502210112 PNAS | May 19, 2015 | vol. 112 | no. 20 | 63136318 CHEMISTRY Downloaded by guest on May 27, 2020

Rational design of self-assembly pathways for complex ...theory predicts that correct assembly of the complete structure may require a time-dependent experimental protocol. Further-more,

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Rational design of self-assembly pathways for complex ...theory predicts that correct assembly of the complete structure may require a time-dependent experimental protocol. Further-more,

Rational design of self-assembly pathways for complexmulticomponent structuresWilliam M. Jacobs1, Aleks Reinhardt, and Daan Frenkel1

Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom

Edited by Athanassios Z. Panagiotopoulos, Princeton University, Princeton, NJ, and accepted by the Editorial Board April 7, 2015 (received for review February3, 2015)

The field of complex self-assembly is moving toward the design ofmultiparticle structures consisting of thousands of distinct buildingblocks. To exploit the potential benefits of structures with such“addressable complexity,” we need to understand the factors thatoptimize the yield and the kinetics of self-assembly. Here we use asimple theoretical method to explain the key features responsiblefor the unexpected success of DNA-brick experiments, which arecurrently the only demonstration of reliable self-assembly withsuch a large number of components. Simulations confirm thatour theory accurately predicts the narrow temperature windowin which error-free assembly can occur. Even more strikingly, ourtheory predicts that correct assembly of the complete structuremay require a time-dependent experimental protocol. Further-more, we predict that low coordination numbers result in non-classical nucleation behavior, which we find to be essential forachieving optimal nucleation kinetics under mild growth condi-tions. We also show that, rather surprisingly, the use of heteroge-neous bond energies improves the nucleation kinetics and in factappears to be necessary for assembling certain intricate 3D struc-tures. This observation makes it possible to sculpt nucleation path-ways by tuning the distribution of interaction strengths. Theseinsights not only suggest how to improve the design of structuresbased on DNA bricks, but also point the way toward the creationof a much wider class of chemical or colloidal structures withaddressable complexity.

self-assembly | free-energy landscapes | nucleation | DNA nanotechnology

Recent experiments with short pieces of single-stranded DNA(1, 2) have shown that it is possible to assemble well-defined

molecular superstructures from a single solution with more thanmerely a handful of distinct building blocks. These experimentsuse complementary DNA sequences to encode an addressablestructure (3) in which each distinct single-stranded “brick” belongsin a specific location within the target assembly. A remarkablefeature of these experiments is that even without careful control ofthe subunit stoichiometry or optimization of the DNA sequences,a large number of 2- and 3D designed structures with thousands ofsubunits assemble reliably (1, 2, 4, 5). The success of this approachis astounding given the many ways in which the assembly of anaddressable structure could potentially go wrong (6–8).Any attempt to optimize the assembly yield or to create even

more complex structures should be based on a better understandingof the mechanism by which DNA bricks manage to self-assemblerobustly. The existence of a sizable nucleation barrier, as originallyproposed in refs. 1, 2, would remedy two possible sources of errorthat were previously thought to limit the successful assembly ofmulticomponent nanostructures: the depletion of free monomersand the uncontrolled aggregation of partially formed structures.Slowing the rate of nucleation would suppress competition amongmultiple nucleation sites for available monomers and give thecomplete structure a chance to assemble before encountering otherpartial structures. Recent simulations of a simplified model of a 3Daddressable structure have provided evidence of a free-energybarrier for nucleation (9), suggesting that the ability to control thisbarrier should enable the assembly of a wide range of complex

nanostructures. We therefore need to be able to predict howsuch a barrier depends on the design of the target structure and onthe choice of DNA sequences. Until now, however, there havebeen no reliable techniques to predict the existence, let alone themagnitude, of a nucleation barrier for self-assembly in a mixture ofcomplementary DNA bricks.Here we show that the assembly of 3D DNA-brick nano-

structures is indeed a nucleated process, but only in a narrowrange of temperatures. The nucleation barrier in these systems isdetermined entirely by the topology of the designed interactionsthat stabilize the target structure. Controllable nucleation istherefore a general feature of addressable structures that can betuned through the rational choice of designed interactions. Wefind that the reliable self-assembly of 3D DNA bricks is a directconsequence of their unusual nucleation behavior, which is notaccounted for by existing theories that work for classical exam-ples of self-assembly, such as crystal nucleation. We are thus ableto provide a rational basis for the rather unconventional protocolused in the recent DNA-brick experiments by showing that theyexploit a narrow window of opportunity where robust multi-component self-assembly can take place.

Structure Connectivity Determines AssemblyIn constructing a model for the self-assembly of addressable struc-tures, we note that the designed interactions should be much strongerthan any attractive interactions between subunits that are not adja-cent in a correctly assembled structure. The designed interactions

Significance

Recent experiments have demonstrated that complex, three-dimensional nanostructures can be self-assembled out of thou-sands of short strands of preprogrammed DNA. However, themechanism by which robust self-assembly occurs is poorly un-derstood, and the same feat has not yet been achieved using anyother molecular building block. Using a new theory of “address-able” self-assembly, we explain how the design of the targetstructure and the choice of interparticle interactions determinethe self-assembly pathway, and, to our knowledge, for the firsttime predict that a time-dependent protocol, rather than merely acarefully tuned set of conditions, may be necessary to optimizethe yield. With an understanding of these design principles, itshould be possible to engineer addressable nanostructures usinga much wider array of materials.

Author contributions: W.M.J., A.R., and D.F. designed research; W.M.J. and A.R. per-formed research; W.M.J., A.R., and D.F. analyzed data; and W.M.J., A.R., and D.F. wrotethe paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. A.Z.P. is a guest editor invited by the EditorialBoard.

Data deposition: Supporting data related to this publication are available at github.com/wmjac/pygtsa.1To whom correspondence may be addressed. Email: [email protected] or [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1502210112/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1502210112 PNAS | May 19, 2015 | vol. 112 | no. 20 | 6313–6318

CHEM

ISTR

Y

Dow

nloa

ded

by g

uest

on

May

27,

202

0

Page 2: Rational design of self-assembly pathways for complex ...theory predicts that correct assembly of the complete structure may require a time-dependent experimental protocol. Further-more,

that stabilize the target structure can be described by a connectivitygraph G, in which each vertex represents a distinct subunit and eachedge indicates a correct bond. This graph allows us to describe theconnectivity of the structure without specifying the geometric detailsand spatial organization of the building blocks. For structures con-structed from DNA bricks, the edges of G indicate the hybridizationof DNA strands with complementary sequences that are adjacent inthe target structure. An example 3D DNA-brick structure is shownalong with its connectivity graph in Fig. 1 A and B.

In an ideal solution with exclusively designed interactions, thesubunits assemble into clusters in which all allowed bonds areencoded in the connectivity graph of the target structure. Tocompute the free-energy difference between on-pathway clusters ofa particular size and the unbonded single-stranded bricks, we mustconsider all of the ways in which a correctly bonded cluster with agiven number of monomers can be assembled. These clusters canbe described by the distinct “fragments” of the target structure,which correspond to connected subgraphs of the connectivity

A B C

ED

F G

Fig. 1. Controlled nucleation is essential for robust self-assembly. (A) An example 86-strand DNA-brick structure and (B) its associated connectivity graph.(C) Incidental interactions between dangling ends, shown in orange, lead to incorrect associations between fragments. (D) Free energies of clusters ofV subunits with randomly chosen DNA sequences in units of kBT, where kB is the Boltzmann constant and T is the absolute temperature. The nucleationbarrier ΔF‡ and the target structure stability ΔFG are strongly temperature-dependent. (E) The equilibrium yield with exclusively designed interactions and thenucleation barrier as a function of temperature. Also shown are the target structure stability and the free-energy difference between all off-pathway and on-pathway intermediates, in the presence, ΔF†in, and absence, ΔFin, of a nucleation barrier. The nucleation window is shown in orange. (F) Representative latticeMonte Carlo simulation trajectories with (blue) and without (gray) a temperature ramp. (Inset) A typical malformed structure. (G) The size of the largestcorrectly bonded stable cluster in lattice Monte Carlo simulations using a temperature ramp (blue) and in constant-temperature simulations initialized fromfree monomers in solution (gray).

6314 | www.pnas.org/cgi/doi/10.1073/pnas.1502210112 Jacobs et al.

Dow

nloa

ded

by g

uest

on

May

27,

202

0

Page 3: Rational design of self-assembly pathways for complex ...theory predicts that correct assembly of the complete structure may require a time-dependent experimental protocol. Further-more,

graph. In a dilute solution with strong designed interactions, thenumbers of edges and vertices are the primary factors determiningthe stability of a particular fragment. We therefore identify allpossible assembly intermediates by grouping fragments into setswith the same number of edges and vertices and counting the totalnumber of fragments in each set. This calculation yields the“density of states” of fragments of the target structure, which is anintrinsic property of the connectivity graph. We combine thedensity of states with information about the subunit geometries,monomer concentrations, and designed bond strengths to computethe free energies of the on-pathway clusters. We can also estimatethe equilibrium probability of forming competing off-pathwaystructures, the overwhelming majority of which arises from un-desired incidental interactions between subunits. For further dis-cussion of the theory, please see SI Text, section S1 and ref. 10.This theoretical approach is powerful because it can predict

the free-energy landscape as a function of the degree of assemblybetween the monomers and the target structure. Furthermore,the predicted landscape captures the precise topology of thetarget structure, which is essential for understanding the as-sembly of addressable, finite-sized structures. In the case ofDNA-brick structures, we can assign DNA hybridization freeenergies to the edges of the target connectivity graph to de-termine the temperature dependence of the free-energy land-scape; for example, Fig. 1D shows the free-energy profile of the86-strand DNA-brick structure with random DNA sequences atthree temperatures. Our theoretical approach allows us to cal-culate the nucleation barrier ΔF‡ by examining the free energiesof clusters corresponding to fragments with exactly V vertices.The critical number of strands required for nucleation is V ‡:transient clusters with fewer than V ‡ strands are more likely todissociate than to continue incorporating additional strands. Thepresence of a substantial nucleation barrier therefore inhibits theproliferation of large, partially assembled fragments that sticktogether to form nontarget aggregates.

Assembly Requires a Time-Dependent ProtocolOver a significant range of temperatures, we find that the free-energy profiles of DNA-brick structures exhibit both a nucle-ation barrier and a thermodynamically stable intermediatestructure. The nucleation barrier is associated with the minimumnumber of subunits that must be assembled to complete one ormore cycles, i.e., closed loops of stabilizing bonds in a fragment.For example, the critical number of monomers in the examplestructure at 319 K, V ‡ = 8, is one fewer than the nine subunitsrequired to form a bicyclic fragment of the target structure.Under the conditions where nucleation is rate controlling, theminimum free-energy structure is not the complete 86-particletarget structure, but rather a structure with only V ’ 65 particles.This incomplete structure is favored by entropy, because it canbe realized in many more ways than the unique target structure.Hence, the temperature where nucleation is rate controlling ishigher than the temperature where the target structure is themost stable cluster. The existence of thermodynamically stableintermediates is a typical feature of DNA-brick structures and ofcomplex addressable, finite-sized structures in general. [We notethat thermodynamically stable partial structures have also beenobserved in previously reported simulations (9) of DNA-brickstructures consisting of approximately 1,000 distinct subunits.]This behavior is not compatible with classical nucleation the-

ory (CNT), which predicts that, beyond the nucleation barrier,large clusters are always more stable than smaller clusters. As aconsequence, in “classical” nucleation scenarios such as crystalli-zation, there is a sharp boundary in temperature and concentrationat which the largest-possible ordered structure, rather than themonomeric state, becomes thermodynamically stable. Typically, asimple fluid must be supersaturated well beyond this boundary toreduce the nucleation barrier, which arises due to the competition

between the free-energy penalty of forming a solid–liquid interfaceand the increased stability due to the growth of an ordered struc-ture (11–13). However, in the case of addressable self-assembly,and DNA bricks in particular, a nucleation barrier for the forma-tion of a stable partial structure may exist even when the targetstructure is unstable relative to the free monomers.An experiment to assemble such a structure requires a pro-

tocol: first nucleation at a relatively high temperature, and thenfurther cooling to complete the formation of the target structure.[DNA-brick structures have also been assembled at constanttemperature by changing the solution conditions during thecourse of the experiment (4). For simplicity, we assume that theexperimental control parameter is the temperature, but clearlyother methods of altering the DNA hybridization free energiesduring the assembly process may also be suitable.] This behaviorcan be seen in Fig. 1E, where we identify a narrow temperaturewindow in which there is a significant yet surmountable nucleationbarrier. Unlike CNT, the nucleation barrier does not diverge as thetemperature is increased. Instead, there is a well-defined temper-ature above which all clusters have a higher free energy than thefree monomers. As the temperature is lowered further, the nu-cleation barrier disappears entirely before the equilibrium yield,defined as the fraction of all clusters that are correctly assembled asthe complete target structure, increases measurably above zero.The equilibrium yield tends to 100% at low temperatures, becausewe have thus far assumed that only designed interactions arepossible. Therefore, because of the presence of stable intermediatestructures, it is typically impossible to assemble the target structurecompletely at any temperature where nucleation is rate controlling.To examine the importance of a nucleation barrier for pre-

venting misassembly, we estimate the free-energy difference be-tween off-pathway aggregates and all on-pathway intermediates,ΔFin, by calculating the probability of incidental interactions be-tween partially assembled structures (10). From the connectivitygraph of the example DNA-brick structure, we can calculate thetotal free energy of aggregated clusters by considering all of theways that partially assembled structures can interact via the dan-gling ends of the single-stranded bricks, as shown in Fig. 1C. Wealso estimate this free-energy difference in the case of slow nu-cleation, ΔF†

in, by only allowing one of the interacting clusters in amisassembled intermediate to have V >V ‡.The above analysis supports our claim that a substantial nu-

cleation barrier is essential for accurate self-assembly. Our cal-culations show that even with very weak incidental interactions,incorrect bonding between the multiple dangling ends of largepartial structures prevents error-free assembly at equilibrium,because ΔFin > 0. The presence of a nucleation barrier slows theapproach to equilibrium, maintaining the viability of the cor-rectly assembled clusters.These theoretical predictions are confirmed by extensive

Monte Carlo simulations of the structure shown in Fig. 1A. Inthese simulations, the DNA bricks are modeled as rigid particlesthat move on a cubic lattice, but otherwise the sequence com-plementarity and the hybridization free energies of the experi-mental system are preserved (9). Using realistic dynamics (14),we simulate the assembly of the target structure using a singlecopy of each monomer. In Fig. 1F, we compare a representativetrajectory from a simulation using a linear temperature ramp witha trajectory from a constant-temperature simulation starting fromfree monomers in solution. We also report the largest stablecluster size averaged over many such trajectories in Fig. 1G. Nu-cleation first occurs within the predicted nucleation window whereΔF‡ ’ 8  kBT. At 319 K, the size of the largest stable cluster co-incides precisely with the predicted average cluster size at the free-energy minimum in Fig. 1D [Incomplete assembly at a constanttemperature has also been observed in experiments (15) as well asin our simulations of more complex structures, such as a rect-angular slab with a raised checkerboard pattern.] Intermediate

Jacobs et al. PNAS | May 19, 2015 | vol. 112 | no. 20 | 6315

CHEM

ISTR

Y

Dow

nloa

ded

by g

uest

on

May

27,

202

0

Page 4: Rational design of self-assembly pathways for complex ...theory predicts that correct assembly of the complete structure may require a time-dependent experimental protocol. Further-more,

structures assembled via a temperature ramp continue to grow atlower temperatures, whereas clusters formed directly from a so-lution of free monomers become arrested in conformations thatare incompatible with further growth (Fig. 1F, Inset). In agreementwith our theoretical predictions, the simulation results demon-strate that a time-dependent protocol is essential for correctlyassembling a complete DNA-brick structure.

Coordination Number Controls the Nucleation BarrierIn the modular assemblies reported in ref. 2, the maximum co-ordination number of bricks in the interior of the structure is 4.However, one can envisage other building blocks, such as func-tionalized molecular constructs or nanocolloids, that have adifferent coordination number. To investigate the effect of the co-ordination number on the nucleation barrier, we compare the free-energy profile of a 48-strand DNA-brick structure with those of twohigher-coordinated structures (Fig. 2A): a simple cubic structurewith coordination number qc = 6 and a close-packed structure withqc = 12. (For a discussion of 2D structures, see SI Text, section S4.)In Fig. 2B, we show the free-energy profiles at 50% yield assumingidentical bond energies within each structure.One striking difference between the qc = 4 structure and the

higher-coordinated examples is the stability of the target at 50%yield. In the DNA-brick structure, the target structure coexists innearly equal populations with a partial structure that is missing asingle cycle. In the structures with higher coordination numbers,however, the target has the same free energy as the free mono-mers at 50% yield. Intermediate structures are therefore globallyunstable at all temperatures, as predicted by CNT.A second point of distinction among these structures lies in the

relative stability of intermediate cluster sizes. Whereas the DNA-brick structure assembles by completing individual cycles, the cubicstructure grows by adding one face at a time to an expanding cu-boid. [This equilibrium growth pathway is similar to that describedin ref. 16.] With qc = 12, the greater diversity of fragments with thesame number of vertices smooths out the free-energy profile nearthe top of the nucleation barrier. The fitted black line in Fig. 2Bshows that the assembly of this structure does in fact obey CNT (SIText, section S2).The differences among these free-energy profiles originate

from the topologies of the connectivity graphs of the example

structures. The most important determinant of the nucleationbehavior is simply the number of vertices required to completeeach additional cycle in the target connectivity graph, which iscontrolled by the maximum coordination number of the subunits.[Although the coordination number also affects the rotationalentropy in lattice-based simulations, altering the rotational en-tropy in our theoretical method has the same effect as changingthe monomer concentration (SI Text, section S3) and thus doesnot affect the shape of the free-energy profile.] Our findings implythat controlled self-assembly of 3D addressable structures is unlikelyto be achieved straightforwardly using subunits with coordinationnumbers higher than 4. In higher-coordinated structures, which arewell described by CNT, it would be necessary to go to high super-saturation to find a surmountable nucleation barrier; however, suchan approach is likely to fail due to kinetic trapping (17). However, inDNA-brick structures, the nucleation barrier is surmountable at lowsupersaturation and is relatively insensitive to the size of the targetstructure (Fig. 2C). The reliable self-assembly of large DNA-brickstructures is thus a direct consequence of the small number of bondsmade by each brick.

Heterogeneous Bond Energies Improve KineticsRecent publications have argued that equal bond energiesshould enhance the stability of the designed structure (18) andreduce errors during growth (19). By contrast, we find that thekinetics of DNA-brick assembly are actually worse if one selectsDNA sequences that minimize the variance in the bond energies.Our observation is consistent with the successful use of randomDNA sequences in the original experiments with DNA bricks (1,2). Here again, the nucleation behavior is responsible for thisunexpected result.To demonstrate the difference between random DNA se-

quences and sequences chosen to yield monodisperse bond en-ergies, we consider the relatively simple nonconvex DNA-brickstructure shown in Fig. 3A. This 74-brick structure, constructedby removing the interior strands and two faces from a cuboidalstructure, assembles roughly face-by-face when using randomDNA sequences. The relevant nucleation barrier, as predictedtheoretically in Fig. 3B and confirmed with Monte Carlo simu-lations in Fig. 3C, is the completion of the third face. Withmonodisperse bond energies and an equivalent mean interaction

A B C

Fig. 2. Dependence of the nucleation barrier on the coordination number of the target structure. (A) Example structures with tetrahedral coordination(qc = 4), octahedral coordination (qc = 6), and close-packed coordination (qc = 12), along with their associated connectivity graphs. (B) Free-energy profiles ofthese three structures at 50% yield assuming identical bond energies within each structure. The black line shows the fit of classical nucleation theory to thenucleation barrier of the qc = 12 structure. (C) The dependence of the nucleation barrier on the total number of strands, VðGÞ, in DNA-brick structures withrandomly chosen DNA sequences. The nucleation temperature does not increase monotonically with VðGÞ in these roughly cuboidal structures becausesurface effects are considerable.

6316 | www.pnas.org/cgi/doi/10.1073/pnas.1502210112 Jacobs et al.

Dow

nloa

ded

by g

uest

on

May

27,

202

0

Page 5: Rational design of self-assembly pathways for complex ...theory predicts that correct assembly of the complete structure may require a time-dependent experimental protocol. Further-more,

strength, a much larger nucleation barrier appears before thefirst face forms. Attempts to reduce this nucleation barrier by in-creasing the mean bond energy result in kinetic trapping andarrested growth. Despite promising fluctuations in the largestcluster size in the simulation trajectory with monodisperse en-ergies, multiple competing nuclei appear, and the largest clusterremains poorly configured for further assembly (Fig. 3C, Inset).The use of sequences with a broad distribution of hybridiza-

tion free energies results in a more suitable nucleation barrierbecause such a distribution selectively stabilizes small and floppyintermediate structures. This is a statistical effect: because thereare far fewer ways of constructing a maximally connected frag-ment with a given number of monomers, the chance that ran-domly assigned sequences concentrate the strongest bonds in acompact fragment is vanishingly small in a large structure. As aresult, the dominant nucleation pathways no longer need tofollow the maximally connected fragments. The use of a broaddistribution of bond energies therefore tends to reduce nucle-ation barriers, because unstable fragments near the top of abarrier contain fewer cycles and are thus affected more signifi-cantly by the variance in the bond-energy distribution.

Design Principles for (DNA) NanostructuresThe insights provided by our predictive theory allow us to un-derstand the general principles underlying the unexpected suc-cess of DNA-brick self-assembly. Slow, controlled nucleation atlow supersaturation is achieved for large structures because eachbrick can only make a small number of designed connections.Because of an appreciable nucleation barrier that appears in anarrow temperature window, monomer depletion does not posea significant problem for one-pot assembly. Surprisingly, complexstructures with randomly selected complementary DNA sequencesexperience enhanced nucleation, making larger intermediate struc-tures kinetically accessible at higher temperatures.The use of a temperature ramp plays a more crucial role than

previously thought. Cooling the DNA-brick solution slowly is notjust a convenient way of locating good assembly conditions, as inthe case of conventional crystals; rather, it is an essential non-equilibrium protocol for achieving error-free assembly of finite-sized structures. The explanation of slow nucleation and fastgrowth that was originally proposed in refs. 1, 2 is therefore in-complete: fast growth allows the DNA bricks to assemble into astable, on-pathway intermediate that must be annealed at lower

temperatures to complete the target structure. For nucleation tobe rate controlling, the monomers must diffuse sufficiently rap-idly for the solution of precritical clusters to reach equilibrium.With this assumption, our results suggest that the growth of super-critical clusters proceeds by monomer addition and that, as a result,incorrect assembly is rare. However, successful assembly also relieson the slow diffusion of large intermediates, whose rate of diffusionchanges approximately inversely with the cluster radius (20). Re-maining out of equilibrium throughout the assembly protocol isnecessary to avoid the aggregation of these partial structures, whichwe predict to be the globally favored thermodynamic state.Our approach also suggests how to improve the design of

DNA-brick nanostructures beyond the random selection ofuniformly distributed DNA sequences. For a given target struc-ture, it is easy to tune the nucleation barrier by adjusting thestatistical distribution of bond energies. Complementary DNAsequences can then be assigned to the structure to achieve thedesired distribution of hybridization free energies. Furthermore,with an understanding of the origin of the nucleation barrier in aparticular structure, it is possible to optimize the annealingprotocol rationally to increase the yield of the target assembly.Our approach also provides a means of systematically investi-gating how local modifications to the coordination numberthrough the fusing of adjacent strands affect the nucleation be-havior of DNA-brick structures (21).The theoretical method used here greatly simplifies the quanti-

tative prediction of nucleation barriers and intermediate structureswith widespread applications for controlling the self-assembly ofbiomolecular or synthetic building blocks. Addressable self-as-sembly holds great promise for building intricate 3D structures thatare likely to require optimization on a case-by-case basis. Becauseour predictive theory is sensitive to the details of a particular targetstructure, performing these calculations for nanostructures of ex-perimental interest will enable the precise engineering of assemblyproperties at the design stage. In order for potential users to per-form such experimental protocol design, we provide a user-friendlysoftware package online at github.com/wmjac/pygtsa.

MethodsDNA Hybridization Free Energies.We compute the hybridization free energiesof complementary 8-nucleotide DNA sequences using established empiricalformulas (22, 23) assuming salt concentrations of [Na+] = 1 mol dm−3 and[Mg2+] = 0.08 mol dm−3. For the calculations with monodisperse bond

A B C

Fig. 3. Random DNA sequences can improve the kinetics of self-assembly. (A) A nonconvex, 74-strand DNA-brick structure and its associated connectivitygraph. (B) Free-energy profiles and effective nucleation barriers ΔF‡, with both random and monodisperse designed interaction strengths. The temperaturesare chosen such that the mean hybridization free energies of these two distributions are equal. (C) Representative lattice Monte Carlo simulation trajectorieswith heterogeneous and monodisperse bond energies. Correct assembly proceeds by assembling one face of the structure at a time. The temperature of themonodisperse system must be lowered to achieve nucleation, but the largest cluster quickly becomes kinetically trapped (Inset).

Jacobs et al. PNAS | May 19, 2015 | vol. 112 | no. 20 | 6317

CHEM

ISTR

Y

Dow

nloa

ded

by g

uest

on

May

27,

202

0

Page 6: Rational design of self-assembly pathways for complex ...theory predicts that correct assembly of the complete structure may require a time-dependent experimental protocol. Further-more,

energies, we use the sequences provided in ref. 19. The strengths of in-cidental interactions are estimated based on the longest attractive overlapfor each pair of noncomplementary sequences. In calculations of the equi-librium yield and free-energy profiles, we report the average thermody-namic properties using 1,000 randomly chosen complete sets of DNAsequences. See SI Text, section S5 for further details.

Lattice Monte Carlo Simulations. Constant-temperature lattice Monte Carlosimulations are carried out using the virtualmoveMonte Carlo algorithm (14) toproduce physical dynamics. Rigid particles, each with four distinct patches fixedin a tetrahedral arrangement, are confined to a cubic lattice. A single copy of

each required subunit is present in the simulation box with 62× 62× 62 latticesites. Complete details are given in ref. 9. For comparison with the results ofthese simulations, the theoretical calculations reported here assume the samedimensionless monomer concentration, ρ= 62−3, lattice coordination number,qc = 4, and fixed number of dihedral angles, qd = 3 (SI Text, section S1).

ACKNOWLEDGMENTS. This work was carried out with support from theEuropean Research Council (Advanced Grant 227758) and the Engineer-ing and Physical Sciences Research Council Programme Grant EP/I001352/1.W.M.J. acknowledges support from the Gates Cambridge Trust and theNational Science Foundation Graduate Research Fellowship under GrantDGE-1143678.

1. Wei B, Dai M, Yin P (2012) Complex shapes self-assembled from single-stranded DNAtiles. Nature 485(7400):623–626.

2. Ke Y, Ong LL, Shih WM, Yin P (2012) Three-dimensional structures self-assembledfrom DNA bricks. Science 338(6111):1177–1183.

3. Rothemund PW, Winfree E (2000) The Program-Size Complexity of Self-AssembledSquares (ACM, New York), pp 459–468.

4. Zhang Z, Song J, Besenbacher F, Dong M, Gothelf KV (2013) Self-assembly of DNAorigami and single-stranded tile structures at room temperature. Angew Chem Int EdEngl 52(35):9219–9223.

5. Wei B, et al. (2013) Design space for complex DNA structures. J Am Chem Soc 135(48):18080–18088.

6. Rothemund PW, Andersen ES (2012) Nanotechnology: The importance of beingmodular. Nature 485(7400):584–585.

7. Gothelf KV (2012) Materials science. LEGO-like DNA structures. Science 338(6111):1159–1160.

8. Kim AJ, Scarlett R, Biancaniello PL, Sinno T, Crocker JC (2009) Probing interfacialequilibration in microsphere crystals formed by DNA-directed assembly. Nat Mater8(1):52–55.

9. Reinhardt A, Frenkel D (2014) Numerical evidence for nucleated self-assembly of DNAbrick structures. Phys Rev Lett 112(23):238103.

10. Jacobs WM, Reinhardt A, Frenkel D (2015) Communication: Theoretical prediction offree-energy landscapes for complex self-assembly. J Chem Phys 142(2):021101.

11. Gibbs JW (1878) On the equilibrium of heterogeneous substances. Am J Sci 16(96):441–458.12. Oxtoby DW (1992) Homogeneous nucleation: Theory and experiment. J Phys Condens

Matter 4(38):7627–7650.

13. Sear RP (2007) Nucleation: Theory and applications to protein solutions and colloidal

suspensions. J Phys Condens Matter 19(3):033101.14. Whitelam S, Geissler PL (2007) Avoiding unphysical kinetic traps in Monte Carlo

simulations of strongly attractive particles. J Chem Phys 127(15):154101.15. Sobczak JPJ, Martin TG, Gerling T, Dietz H (2012) Rapid folding of DNA into nanoscale

shapes at constant temperature. Science 338(6113):1458–1461.16. Shneidman VA (2003) On the lowest energy nucleation path in a supersaturated

lattice gas. J Stat Phys 112(1-2):293–318.17. Whitelam S, Jack RL (2015) The statistical mechanics of dynamic pathways to self-

assembly. Annu Rev Phys Chem 66(1):143–163.18. Hormoz S, Brenner MP (2011) Design principles for self-assembly with short-range

interactions. Proc Natl Acad Sci USA 108(13):5193–5198.19. Hedges LO, Mannige RV, Whitelam S (2014) Growth of equilibrium structures built

from a large number of distinct component types. Soft Matter 10(34):6404–6416.20. Einstein A (1905) Über die von der molekularkinetischen Theorie der Wärme ge-

forderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen. Ann Phys

322(8):549–560.21. Li W, Yang Y, Jiang S, Yan H, Liu Y (2014) Controlled nucleation and growth of DNA

tile arrays within prescribed DNA origami frames and their dynamics. J Am Chem Soc

136(10):3724–3727.22. SantaLucia J, Jr, Hicks D (2004) The thermodynamics of DNA structural motifs. Annu

Rev Biophys Biomol Struct 33(1):415–440.23. Koehler RT, Peyret N (2005) Thermodynamic properties of DNA sequences: Charac-

teristic values for the human genome. Bioinformatics 21(16):3333–3339.

6318 | www.pnas.org/cgi/doi/10.1073/pnas.1502210112 Jacobs et al.

Dow

nloa

ded

by g

uest

on

May

27,

202

0