15
IEEE TRANSACTIONS ON NANOTECHNOLOGY, VOL. 11, NO. 1,JANUARY 2012 105 Low Complexity Design of Ripple Carry and Brent–Kung Adders in QCA Vikramkumar Pudi and K. Sridharan, Senior Member, IEEE Abstract—The design of adders on quantum dot cellular au- tomata (QCA) has been of recent interest. While few designs ex- ist, investigations on reduction of QCA primitives (majority gates and inverters) for various adders are limited. In this paper, we present a number of new results on majority logic. We use these results to present efficient QCA designs for the ripple carry adder (RCA) and various prefix adders. We derive bounds on the number of majority gates for n-bit RCA and n-bit Brent–Kung, Kogge– Stone, Ladner–Fischer, and Han–Carlson adders. We further show that the Brent–Kung adder has lower delay than the best existing adder designs as well as other prefix adders. In addition, signal in- tegrity and robustness studies show that the proposed Brent–Kung adder is fairly well-suited to changes in time-related parameters as well as temperature. Detailed simulations using QCADesigner are presented. Index Terms—Area, Brent–Kung adder, cell count, Han– Carlson adder, inverters, Kogge–Stone adder, Ladner–Fischer adder, majority gates, quantum-dot cellular automata, ripple carry adder. I. INTRODUCTION C ONTEMPORARY microprocessors and application- specific integrated circuits are largely based on the com- plementary metal oxide semiconductor (CMOS) technology. It is believed that the performance of various circuits in current CMOS-based architectures is close to reaching the limit. When the feature size of transistors is reduced to a nanometer, quantum effects such as tunneling take place [1]. Further, when device scaling takes place, the interconnections do not scale automati- cally due to the effects of wire resistance and capacitance [2]. Alternatives to conventional CMOS technology are there- fore being explored. These include the quantum-dot cellular au- tomata (QCA) [3] and the single electron transistor (SET) [4]. The work described in this paper concerns designs in QCA. QCA is based on electron confinement in dots. The basic ele- ment in QCA, namely the QCA cell, represents a bit through the configuration of charge in the cell. The key aspect of QCA is that interaction between cells is purely Coulombic and there is no transport of charge between cells [5]. The cellular automata Manuscript received February 2, 2010; revised January 23, 2011 and May 5, 2011; accepted May 9, 2011. Date of publication May 27, 2011; date of current version January 11, 2012. The review of this paper was arranged by Associate Editor D. Hammerstrom. The authors are with the Department of Electrical Engineering, Indian Institute of Technology Madras, Chennai-600036, India (e-mail: [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNANO.2011.2158006 (CA) notion is due to the fact that state of a given cell at a particular time depends on the state of its neighbors during the previous clock cycle. Logic primitives in the QCA model are majority gate and inverter. The motivation for the research described in this paper stems from studies on AND-OR gate realizations of Boolean functions in combinational logic and the minimal realizations for three- variable functions in QCA described in [6]. Prior studies suggest that design complexity can be substantially influenced by the majority gate count. The number of majority gates (and to an extent inverters) also indirectly determines the cell count due to QCA wires in a design. It is therefore of interest to design methods that involve systematic reduction of majority gates and inverters. To the best of our knowledge, there has been no prior work on deriving simplified expressions involving majority gates and inverters for various types of multi-bit adders. The contributions of this paper are as follows. We present a number of new results on majority logic reduction and apply the same to multibit adders. In particular, we present efficient QCA designs for the ripple carry adder (RCA) and various prefix adders. We derive bounds on the number of majority gates for n-bit RCA and n-bit Brent–Kung, Kogge–Stone, Ladner– Fischer, and Han–Carlson adders. We further show that the Brent–Kung adder has lower delay than the best existing adder designs as well as other prefix adders. In addition, signal in- tegrity and robustness studies have been performed. The Brent– Kung adder yields the correct output for temperature varying from 1 to 23 K. Simulations using QCADesigner [7] support the theory presented. The rest of this paper is organized as follows. Basic notations pertaining to QCA are presented in the next section. Section III presents new results on majority gates. Section IV presents re- sults on the RCA. The efficient QCA design of the Brent–Kung prefix adder is presented in Section V. Section VI presents ma- jority gate results for Kogge–Stone, Han–Carlson, and Ladner– Fischer prefix adders. Section VII presents simulation results for the proposed adders. Section VIII presents comparisons of various adders including the best existing adder designs. Conclusions are presented in Section IX. II. TERMINOLOGY AND PRIOR WORK A. Basics of QCA Fig. 1 depicts QCA cells, each with four quantum dots. Each QCA cell is occupied by two electrons. The locations of the electrons determine the binary states. It is worth noting that Fig. 1 shows only a conceptual four-dot cell (the implementation technology is not fixed here). 1536-125X/$26.00 © 2011 IEEE

Adders

Embed Size (px)

Citation preview

Page 1: Adders

IEEE TRANSACTIONS ON NANOTECHNOLOGY, VOL. 11, NO. 1, JANUARY 2012 105

Low Complexity Design of Ripple Carry andBrent–Kung Adders in QCA

Vikramkumar Pudi and K. Sridharan, Senior Member, IEEE

Abstract—The design of adders on quantum dot cellular au-tomata (QCA) has been of recent interest. While few designs ex-ist, investigations on reduction of QCA primitives (majority gatesand inverters) for various adders are limited. In this paper, wepresent a number of new results on majority logic. We use theseresults to present efficient QCA designs for the ripple carry adder(RCA) and various prefix adders. We derive bounds on the numberof majority gates for n-bit RCA and n-bit Brent–Kung, Kogge–Stone, Ladner–Fischer, and Han–Carlson adders. We further showthat the Brent–Kung adder has lower delay than the best existingadder designs as well as other prefix adders. In addition, signal in-tegrity and robustness studies show that the proposed Brent–Kungadder is fairly well-suited to changes in time-related parameters aswell as temperature. Detailed simulations using QCADesigner arepresented.

Index Terms—Area, Brent–Kung adder, cell count, Han–Carlson adder, inverters, Kogge–Stone adder, Ladner–Fischeradder, majority gates, quantum-dot cellular automata, ripple carryadder.

I. INTRODUCTION

CONTEMPORARY microprocessors and application-specific integrated circuits are largely based on the com-

plementary metal oxide semiconductor (CMOS) technology. Itis believed that the performance of various circuits in currentCMOS-based architectures is close to reaching the limit. Whenthe feature size of transistors is reduced to a nanometer, quantumeffects such as tunneling take place [1]. Further, when devicescaling takes place, the interconnections do not scale automati-cally due to the effects of wire resistance and capacitance [2].

Alternatives to conventional CMOS technology are there-fore being explored. These include the quantum-dot cellular au-tomata (QCA) [3] and the single electron transistor (SET) [4].The work described in this paper concerns designs in QCA.QCA is based on electron confinement in dots. The basic ele-ment in QCA, namely the QCA cell, represents a bit through theconfiguration of charge in the cell. The key aspect of QCA isthat interaction between cells is purely Coulombic and there isno transport of charge between cells [5]. The cellular automata

Manuscript received February 2, 2010; revised January 23, 2011 and May 5,2011; accepted May 9, 2011. Date of publication May 27, 2011; date of currentversion January 11, 2012. The review of this paper was arranged by AssociateEditor D. Hammerstrom.

The authors are with the Department of Electrical Engineering,Indian Institute of Technology Madras, Chennai-600036, India (e-mail:[email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available onlineat http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TNANO.2011.2158006

(CA) notion is due to the fact that state of a given cell at aparticular time depends on the state of its neighbors during theprevious clock cycle. Logic primitives in the QCA model aremajority gate and inverter.

The motivation for the research described in this paper stemsfrom studies on AND-OR gate realizations of Boolean functionsin combinational logic and the minimal realizations for three-variable functions in QCA described in [6]. Prior studies suggestthat design complexity can be substantially influenced by themajority gate count. The number of majority gates (and to anextent inverters) also indirectly determines the cell count dueto QCA wires in a design. It is therefore of interest to designmethods that involve systematic reduction of majority gatesand inverters. To the best of our knowledge, there has been noprior work on deriving simplified expressions involving majoritygates and inverters for various types of multi-bit adders.

The contributions of this paper are as follows. We present anumber of new results on majority logic reduction and applythe same to multibit adders. In particular, we present efficientQCA designs for the ripple carry adder (RCA) and various prefixadders. We derive bounds on the number of majority gates forn-bit RCA and n-bit Brent–Kung, Kogge–Stone, Ladner–Fischer, and Han–Carlson adders. We further show that theBrent–Kung adder has lower delay than the best existing adderdesigns as well as other prefix adders. In addition, signal in-tegrity and robustness studies have been performed. The Brent–Kung adder yields the correct output for temperature varyingfrom 1 to 23 K. Simulations using QCADesigner [7] supportthe theory presented.

The rest of this paper is organized as follows. Basic notationspertaining to QCA are presented in the next section. Section IIIpresents new results on majority gates. Section IV presents re-sults on the RCA. The efficient QCA design of the Brent–Kungprefix adder is presented in Section V. Section VI presents ma-jority gate results for Kogge–Stone, Han–Carlson, and Ladner–Fischer prefix adders. Section VII presents simulation resultsfor the proposed adders. Section VIII presents comparisonsof various adders including the best existing adder designs.Conclusions are presented in Section IX.

II. TERMINOLOGY AND PRIOR WORK

A. Basics of QCA

Fig. 1 depicts QCA cells, each with four quantum dots. EachQCA cell is occupied by two electrons. The locations of theelectrons determine the binary states. It is worth noting thatFig. 1 shows only a conceptual four-dot cell (the implementationtechnology is not fixed here).

1536-125X/$26.00 © 2011 IEEE

Page 2: Adders

106 IEEE TRANSACTIONS ON NANOTECHNOLOGY, VOL. 11, NO. 1, JANUARY 2012

Fig. 1. QCA cells with electrons indicating possible polarizations.

Fig. 2. QCA Clock zones.

Fig. 3. Interdot barriers in a clocking zone.

Fig. 4. Array of QCA cells with four clocks applied functions like aD-latch [8].

QCA cells are clocked using a four-phase clocking scheme.Four clocks with a phase difference of 90◦ are shown in Fig. 2.Phases in QCA clocking, namely switch, hold, release, andrelax [9] are depicted in Fig. 3. From an unpolarized state, acell is polarized during the switch phase depending on the stateof the neighboring cells. In the hold phase, a cell retains itspolarization. During the release and relax phases, QCA cellsare unpolarized. In view of the polarization and unpolarization,each clock zone behaves like a D-latch. Hence, when a sequenceof clocks is applied to an array of QCA cells as shown in Fig. 4,the clocking information can be conveyed through a numberedD-latch system [8]. The convention followed for the clockingin Fig. 4 is the same as in [8], [10], [11]. The numbering distin-guishes one clock zone from another via appropriate subscripts.D0,D1,D2, and D3 represent the D-latches of clock zones 0,1, 2, and 3 respectively. During the switch phase of the clockzone 0 (D0 latch), electrons in the QCA cell polarize as per theinput polarization. In the hold phase of clock zone 0, clock zone

Fig. 5. QCA inverter and majority gate.

1(D1 latch) will be in the switch phase so electrons in QCAcell polarize as per clock zone 0, similarly, for clock zone 2(D2latch) and clock zone 3(D3 latch). Among the four clock zones,only one zone will be in the hold state and has valid data ata time. In the implementation, 16 cells are assumed per clockzone (to allow comparisons with prior work [11]). Primitives inthe QCA model consist of a wire, inverter and a majority gateand are depicted in Fig. 5.

A QCA design permits two options for crossover, termedcoplanar crossover and multilayer crossover. While the copla-nar crossover uses only one layer but involves usage of twocell types (termed regular and rotated), the multilayer crossoveruses more than one layer of cells (analogous to multiple metallayers in a conventional IC). The multilayer crossover is usedin this paper for wire crossings since we can effectively crosssignals over on another layer and the extra layers of QCA canbe used as active components of the circuit [8]. Further, multi-layer QCA circuits can potentially consume much less area ascompared to planar circuits [8]. Moreover, some studies [12]indicate that coplanar crossover is difficult to fabricate in themolecular implementation.

B. Prior Work

1) Majority Logic Algebraic Manipulation and Synthesis:Research on binary majority decision elements motivated by thedevelopment of devices such as parametrons and Esaki diodeshas been reported as early as 1960 [13]. The author in [13]describes how ordinary Boolean algebra can be augmented toinclude a majority decision operator. Realization of a one-bitadder using three majority gates and two inverters is shown in[13]. An extension to this work is presented in [14] where anew algebra for the logical design with majority decision ele-ments is derived and applied to the one-bit adder design. Anapproach based on decomposition and rearrangement, to major-ity element-based synthesis of networks of components havinglimited fan-in, is presented in [15]. Strategies of [14] and [15]are equally elegant with respect to synthesis of a broad class offunctions. Extensions to multibit adders are not, however, dis-cussed in these works. A distributive law pertaining to majoritylogic is described in [16].

While [14] presents an algebraic method for the logical de-sign, the authors in [17] present a geometric method that usesVeitch diagrams (and extensions) for synthesis using i-inputmajority gates of a variety of n-argument switching functions.

Page 3: Adders

PUDI AND SRIDHARAN: LOW COMPLEXITY DESIGN OF RIPPLE CARRY AND BRENT–KUNG ADDERS IN QCA 107

An approach based on the notion of logically passive self-dual(LPSD) has been presented in [18]. An extension to this workis presented in [19]. All of these are interesting from a theo-retical point of view and offer valuable insight for the majoritylogic-based design. However, their focus seems to be more onsimplification of Boolean expressions rather than on specialadder designs (which is of primary interest to us in this paper).

2) QCA-based Digital Design: A simulation tool for QCAhas been reported in [7] while a synthesis tool is describedin [20]. The authors in [6] present an evaluation of several three-variable Boolean functions and conclude with study of a 1-bitfull adder. A performance comparison of some quantum-dot CAadders is presented in [21]. The modular design of conditionalsum adders has been studied in [22]. The authors in [23] havepresented a QCA design methodology based on the traditionalCMOS circuits design flow and a SPICE model. Probabilisticmodeling of adder circuits using Bayesian networks is presentedin [24]. Cell count, area, and latency have been studied formultibit adders (especially RCA and CLA) in [10]. Robust QCAadder designs that exploit proper clocking schemes are proposedin [25] but they do not study special adders such as prefix adders.Probabilistic analysis of molecular quantum-dot cellular addersis presented in [26]. Reliability of magnetic QCA adders andelectrostatic QCA adders is studied via probabilistic transfermatrices in [27]. Robust adders based on QCA are describedin [28]. Energy dissipation per clock cycle in QCA adder circuitsis studied in [29]. An interesting extension of [10] to a new typeof adder called CFA is presented in [11]. A recent text on designand test of digital circuits in QCA is [30].

III. NEW RESULTS ON MAJORITY GATES

We present three new lemmas on reduction of the number ofmajority gates which are directly of interest in obtaining efficientdesigns of prefix adders. Given three binary inputs, a, b, and c,the majority voting logic function can be expressed in terms offundamental Boolean operators M(a, b, c) = ab + bc + ac [3],[6].

Lemma 1: If a, b and c are three binary inputs, thenM(a, b,M(a, b, c)) = M(a, b, c).

Proof:

M(a, b,M(a, b, c)) = ab + b(ab + bc + ac)

+ a(ab + bc + ac)

= ab + bc + ac + abc

= M(a, b, c) Q.E.D.

As a consequence of Lemma 1, we have M(a,M(a, b, c), c) =M(a, b, c) and M(M(a, b, c), b, c) = M(a, b, c).

Lemma 2: Let a, b and c be three binary inputs. Then

M(a, b,M(a, b, c)) = M(a, b, c).

Proof:

M(a, b,M(a, b, c)) = ab + a(ab + bc + ac)

+ b(ab + bc + ac)

= ab + abc + abc

= a(b + bc) + b(a + ac)

= a(b + c) + b(a + c)

= M(a, b, c) Q.E.D.

As a consequence of Lemma 2, we have M(a,M(a, b, c), c) =M(a, b, c) and M(M(a, b, c), b, c) = M(a, b, c).

Lemma 3. Let f1 , f2 , and f3 be three Boolean functionssuch that f1 and f2 satisfy f1f2 = f1 and f1 + f2 = f2 . Then

M(f1 , f2 , f3) = f1 + f2f3

Proof:

M(f1 , f2 , f3) = f1f2 + f1f3 + f2f3

= f1 + (f1 + f2)f3

= f1 + f2f3 Q.E.D.

IV. RIPPLE CARRY ADDER DESIGN IN QCA

An interesting consequence of Lemma 2 is a new Lemma 4described in the following. This lemma establishes that carrygeneration requires one majority gate and sum generation re-quires just two majority gates plus one inverter in a one-bit fulladder. Let ai, bi , and ci be inputs to a full adder and let si andci+1 be its outputs.

Lemma 4: A 1-bit full adder can be realized using threemajority gates and one inverter.

Proof: From the results in [6], we have

ci+1 = M(ai, bi , ci). (1)

The sum si can be expressed in three equivalent formats as givenby (3), (4), and the following:

si = M(M(ai, bi , ci),M(ai, bi , ci), ci). (2)

If bi in (2) is complemented instead of ci , we have

si = M(M(ai, bi , ci),M(ai, bi , ci), bi) (3)

Similarly,

si = M(M(ai, bi , ci),M(ai, bi , ci), ai). (4)

Using Lemma 2 on (4), we have

si = M(M(ai, bi , ci),M(M(ai, bi , ci), bi , ci), ai). (5)

Q.E.D.A one-bit full adder that incorporates appropriate clocking is

shown in Fig. 6. The D-latch convention presented in [8] enablesus to obtain the total circuit delay. One D-latch (namely, D0) isused to indicate that one-quarter of a clock is required to applythe inputs to the majority logic. One-fourth clock zone delay isassumed when a majority gate is immediately followed by aninverter or vice-versa (D1 is introduced at the output of inverterthat follows the majority gate [8]). Proceeding this way, we havea total circuit delay of 1 clock (4 clock zones) for generating Si

as well as Ci+1 for a 1-bit adder.

Page 4: Adders

108 IEEE TRANSACTIONS ON NANOTECHNOLOGY, VOL. 11, NO. 1, JANUARY 2012

Fig. 6. Full adder realization using three majority gates and one inverter;numbered D-latches enable delay determination.

The result presented in Lemma 4 improves upon a result in [6]that requires two inverters. We can use the result on a 1-bit fulladder to derive the following for an n-bit RCA.

Corollary 1: An n-bit RCA requires at most 3n majoritygates and n inverters.

From Corollary 1, we see that a 4-bit RCA requires 12 major-ity gates and four inverters. Note that the path from input to the“last” output contains seven clock zones as shown in Fig. 7. Sowe have a total delay of 1 3

4 clocks. While the RCA is simple,the delay increases (linearly) as the size of the adder increases.This suggests study of other types of adders.

V. BRENT–KUNG PREFIX ADDER IN QCA

Prefix adders constitute an interesting class of parallel adders[31]. They are based on reducing carry computation to a “prefix”computation. Among the various adders reported in the litera-ture [32]–[35], the Brent–Kung adder has been chosen first forefficient QCA realization in view of the (relatively) small growthin the number of associative operations as a function of the addersize. This is exemplified via the prefix graph of a 16-bit Brent–Kung adder depicted in Fig. 8. The small shaded circles in Fig. 8represent the associative operator “◦” defined in (6). The prefixgraph indicates 4, 11, and 26 associative operations respectivelyfor Brent–Kung adders of sizes 4, 8, and 16 bits, respectively.In general, for an n-bit Brent–Kung adder, the number of levels(of the corresponding prefix graph) is 2 log2(n) − 1 [31] whilethe cost in terms of the number of associative operations is2n − log2(n) − 2 (this can be inferred as the sum of associativeoperations at each level in the prefix graph). Substantially largernumber of associative operations are required for other prefix

Fig. 7. 4-bit RCA critical path composed of seven D-latches (including onefor input and one for each majority gate).

Fig. 8. 16-bit Brent–Kung adder prefix graph.

adders (as indicated in Table II) although other prefix adders toohave O(log2 n) levels. The number of associative operationsinfluences the amount of majority logic (as established later viaTheorems 1, 2, 3, and 4).

We begin by presenting a general formulation of prefix addersin terms of the associative operator ◦ defined as follows [34](Note that ◦ is also the fundamental carry operation [31]):

(Gi, Pi) ◦ (Gj , Pj ) = (Gi + (PiGj ), PiPj ). (6)

In particular, (7), (8), (9), and (10) apply to all forms of prefixadders. Square brackets are used in (8)–(10) to indicate theprevious step in recursion. Let gi = aibi and pi = ai + bi :

(c1 , 0) = (g0 , p0) ◦ (c0 , 0) (7)

(c2 , 0) = (g1 , p1) ◦ [(g0 , p0) ◦ (c0 , 0)] (8)

(c3 , 0) = (g2 , p2) ◦ [(g1 , p1) ◦ (g0 , p0) ◦ (c0 , 0)] (9)

(c4 , 0) = [(g3 , p3) ◦ (g2 , p2)] ◦[(g1 , p1) ◦ (g0 , p0) ◦ (c0 , 0)] . (10)

Page 5: Adders

PUDI AND SRIDHARAN: LOW COMPLEXITY DESIGN OF RIPPLE CARRY AND BRENT–KUNG ADDERS IN QCA 109

Fig. 9. Eight-bit Brent–Kung adder prefix graph.

To develop the equations for the Brent–Kung prefix adder, wedefine the following:

(gi:j , pi:j )=(gi, pi) ◦ (gi−1 , pi−1) . . . ◦ (gj−1 , pj−1) ◦ (gj , pj ).(11)

Suppose m is an integer defined as j < m < i; then we canrewrite (11) as shown in (12):

(gi:j , pi:j ) = (gi:m , pi:m ) ◦ (gm−1:j , pm−1:j ). (12)

Using (12), we can rewrite (8) as

c2 = (g1:0 , p1:0) ◦ (c0 , 0)

In general, ci+1 can be expressed as

ci+1 = (gi:0 , pi:0) ◦ (c0 , 0).

If initial carry c0 = 0, then ci+1 = gi:0 .

A. Majority Logic Optimization for an 8-bit Brent–Kung Adder

The prefix graph for an 8-bit Brent–Kung adder is given inFig. 9. The associative operators are indicated by filled circleswith the labels below indicating the corresponding outputs. Theequations corresponding to the 8-bit Brent–Kung adder prefixgraph (see Fig. 9) are given in Table I. This assumes the initialcarry c0 is 0 (and is in line with the assumptions in [34]). Thegenerate and propagate phase is not labelled explicitly as a stage.

We now present a lemma (Lemma 5) on the majority logicthat will be used to obtain an efficient QCA design for the Brent–Kung adder. Let gi = aibi and pi = ai + bi . Lemma 5 showsthat the calculation of sum requires just two majority gates andone inverter using gi , pi , ci , and ci+1 .

Lemma 5: At most two majority gates and one inverter arerequired for obtaining the sum at each stage (when gi and pi areincoming) of a Brent–Kung adder.

Proof: From (1) and (2), we have si =M(M(ai, bi , ci),M(ai, bi , ci), ci). However,

M(ai, bi , ci) = aibi + (ai + bi)ci

= gi + pici

= M(gi, pi , ci)(fromLemma3).

Hence, ci+1 = M(ai, bi , ci) = M(gi, pi , ci).Similarly

M(ai, bi , ci) = M(gi, pi , ci)

TABLE 1BRENT–KUNG PREFIX ADDER EQUATIONS

From Lemma 2,

M(gi, pi , ci) = M(gi, pi ,M(gi, pi , ci)) = M(gi, pi , ci+1).

Therefore

si = M(ci+1 ,M(gi, pi , ci+1), ci) Q.E.D. (13)

In the case of an 8-bit Brent–Kung adder in QCA, at most51 majority gates and eight inverters are required. This canbe explained as follows. The computation of gi, pi , i = 0, . . . 7requires 16 majority gates. c1 in Table I is g0 and there-fore does not require any majority gate. c2 , given by g1:0 =g1 + p1g0 requires one majority gate (from Lemma 3). c3 =g2:0 = g2 + p2g1:0 also requires one majority gate (Lemma 3).c4 = g3:0 = g3:2 + p3:2g1:0 requires four majority gates sinceg3:2 requires one, p3:2 requires one, the AND of p3:2g1:0 requiresone while the overall OR requires one (here, Lemma 3 does notapply). c5 = g4:0 = g4 + p4g3:0 requires one majority gate (us-ing Lemma 3). c6 = g5:0 = g5:4 + p5:4g3:0 requires four major-ity gates (similar to explanation for c4). c7 = g6:0 = g6 + p6g5:0requires one majority gate. Finally, c8 = g7:0 = g7:4 + p7:4g3:0requires seven majority gates (since g7:4 = g7:6 + p7:6g5:4 re-quires four majority gates, one each for g7:6 and p7:6 , one forAND and one for OR; p7:4 = p7:6p5:4 requires one majoritygate; the AND and OR in overall expression for c8 require oneeach). Sixteen majority gates and eight inverters are required forsum and hence the comment.

Fig. 10 shows the majority gate diagram for primarilycarry generation of an 8-bit Brent–Kung adder with numberedD-latches. This assumes that gis and pis are available. Further,p0 is not required for generation of the output carries and henceit is not shown in the diagram. The total (maximum) delay is2.5 clocks and can be obtained by counting the D-latches alongthe path that leads to sum S7 .

Page 6: Adders

110 IEEE TRANSACTIONS ON NANOTECHNOLOGY, VOL. 11, NO. 1, JANUARY 2012

Fig. 10. Brent-Kung 8-bit adder majority gate diagram for carry generationassuming gs and ps; the maximum delay of 2.5 clocks is for sum S7 .

Fig. 11. Sixteen-bit Kogge–Stone adder prefix graph.

Remark 1: c0 has been assumed to be 0 in the derivation ofBrent–Kung adder (in line with the description in [34]). If c0 isnon-zero, then c1 , given by M(x0 , y0 , c0), is calculated alongwith g0 and p0 . For calculation of remaining carries, namely c2to c8 , we simply can replace g0 by c1 . As a consequence, thereis no increase in the total delay. Further, only one extra majoritygate is required.

In the absence of Lemma 3, the requirement of majority gateswould increase by 7 (1 more for c2 , c3 , c4 , c5 , c6 , c7 , and c8)for an 8-bit adder.

B. Generalization to an n-bit Brent–Kung Adder

In this section, we present a bound on the number of major-ity gates required, denoted by I(n), for QCA realization of ann-bit Brent–Kung adder (we focus on majority gates since weknow that n inverters are required for an n-bit adder). I(n) cor-responds to the sum of Ic(n) (for generating n-carries), Is(n)(for generating n-sums) and Igp(n) (for obtaining the gis andpis). To obtain Ic(n), we present a recursive formulation. Inparticular, we show that Ic(n) = Ic(n : n

2 + 1) + Ic(n2 ), where

Ic(n : n2 + 1) is the number of majority gates required for gen-

erating the carries from n to n2 + 1 while Ic(n

2 ) is the number ofmajority gates required for generating carries of an n

2 -bit adder.To derive the general formula for Ic(n), we make some obser-

vations on the prefix graph shown in Fig. 8. In this case, n = 16.Let the number of majority gates required for a direct method(that does not use Lemma 3) be denoted by Id

c (16). Calculationof Id

c (16) involves following steps.(i) Consider the left-half corresponding to generation of C9

to C16 in Fig. 8. Stages 1, 2, and 3 involve four, two, and oneassociative operations respectively. Each associative operatorrequires three majority gates. Hence, the total number of major-ity gates is 21 (7×3).

(ii) Stage 4 involves one associative operation while Stages5, 6, and 7 involve one, two, and four associative operationsrespectively. From Stage 4 to Stage 7, only generation of carrytakes place so there is no need for majority logic for propagatehere. Therefore, each associative operation in Stages 4 to 7 re-quires two majority gates. Hence, a total of 16 (8×2) is required.We therefore have Id

c (16 : 9) = 21 + 16 = 37.(iii) For generating the remaining carries, namely C1 to C8 ,

we consider the left-half corresponding to C8 − C5 . In this case,two and one associative operations are present in Stages 1 and2, respectively, one associative operation in Stage 3 and oneand two associative operations in Stages 4 and 5, respectively.Idc (8 : 5) = 3 × 3 + 4 × 2 = 17.

(iv) This way, we can proceed recursively. Idc (16) can be

expressed as Idc (16 : 9) + Id

c (8), Idc (8) as Id

c (8 : 5) + Idc (4),

and Idc (4) as Id

c (4 : 3) + Idc (2). Id

c (2) is 2 and Idc (4 : 3) requires

seven majority gates. Hence, Idc (16) = 37 + 17 + 7 + 2 = 63.

(v) Ic(16) is considerably less than Idc (16) since Lemma

3 can be applied at Stage 1 and 7 leading to a saving of 15majority gates. That is, n − 1 majority gates are saved here.Hence, Ic(16) = 48. We now present a bound on the number ofmajority gates required for the n-bit case via Theorem 1.

Theorem 1: The number of majority gates required for ann-bit Brent–Kung adder is given by

I(n) = 8n − 3 log2(n) − 4

Proof : Computation of carries of an n-bit Brent–Kung adderrequires 2 log2(n) − 1 stages (assuming that n is a power of 2).The count for majority gates for an n-bit adder can be obtainedusing a recursive formulation noting that the lower order car-ries (namely, carries from 1 to n

2 ) are those of an n2 -bit adder.

We therefore obtain a general formula for calculating the ma-jority gates for carries from n

2 + 1 to n. The stages from 1 to(log2 n− 1) contain n

4 , n8 , n

16 , . . ., n2lo g 2 (n ) number of associative

Page 7: Adders

PUDI AND SRIDHARAN: LOW COMPLEXITY DESIGN OF RIPPLE CARRY AND BRENT–KUNG ADDERS IN QCA 111

operations respectively for carries from n to n2 + 1. Each as-

sociative operation requires three majority gates (in a directmethod). So total number of majority gates (for the left half),denoted by t1 , is given as

t1 = 3(

n

4+

n

8+

n

16+ · · · + n

2log2 (n)

)

= 3(2log2 (n)−2 + 2log2 (n)−3 + 2log2 (n)−4 + · · · + 2 + 1)

= 3(2log2 (n)−1 − 1)

= 3(

n

2− 1

).

Stages log2(n) to 2 log2(n) − 1 contain 1,1,2,. . ., n8 ,n

4 numberof associative operations respectively and each requires twomajority gates. The total number of majority required for this,denoted by t2 , is given by

t2 = 2(

1 + 1 + 2 + · · · + n

8+

n

4

)

= 2(1 + 1 + 2 + · · · + 2log2 (n)−3 + 2log2 (n)−2)

= 2(1 + 2log2 (n)−1 − 1)

= 2(2log2 (n)−1)

= 2log2 (n)

= n

Idc (n : n

2 ) is then expressed as the sum of t1 and t2 . So

Idc

(n :

n

2+ 1

)= 3

(n

2− 1

)+ n = 5

(n

2

)− 3

Hence, for a direct solution, we have

Idc (n) = 5

(n

2

)− 3 + Id

c

(n

2

)

= 5(

n

2+

n

4+ · · · + 2

)− 3(log2(n) − 1) + Id

c (2)

= 5(n − 2) − 3 log2(n) + 3 + 2 [Idc (2) = 2]

= 5n − 3 log2(n) − 5.

To obtain a reduction in majority gates in each stage, we canapply Lemma 3 to two stages, namely Stage 1 and Stage(2 log2(n) − 1) (Lemma 3 cannot be applied to other stages).This will lead to a reduction of one majority gate at each asso-ciative operator from carry n

2 + 1 to n (i.e., n4 + n

4 = n2 ). So

total reduction of majority gates, denoted by Ir (n), is sum ofreduction of majority gates from carry n

2 + 1 to n (denoted byIr (n : n

2 + 1)) and reduction of majority gates from carry n2 to

1 [denoted by Ir (n2 )]:

Ir (n) = Ir

(n :

n

2+ 1

)+ Ir

(n

2

)

=n

2+ Ir

(n

2

)

=n

2+

n

4+ · · · + 4 + 2 + Ir (2)

= n − 2 + 1 [Ir (2) = 1]

= n − 1.

Therefore,

Ic(n) = Idc (n) − Ir (n)

= 4n − 3 log2(n) − 4.

For an n-bit adder, gi and pi require one majority gate each,hence Igp(n) = 2n majority gates. Also, each si requires twomajority gates, so Is(n) = 2n majority gates. So overall major-ity gate requirement is

I(n) = Ic(n) + Igp(n) + Is(n)

= 8n − 3 log2(n) − 4 Q.E.D.

Remark 2: From Theorem 1, we note that Idc (n) is given by

5n − 3 log2(n) − 5 while application of Lemma 3 leads to Ic(n)given by 4n − 3 log2(n) − 4. This corresponds to savings ofapproximately 20%. Further, the direct Brent–Kung adder yieldsId(n) = Id

c (n) + 4n while application of Lemma 3 leads toI(n) = Ic(n) + 4n. This corresponds to majority gate savingsof roughly 11%.

In addition, using Theorem 1, we can infer that when n = 32at most 237 majority gates and 32 inverters are required whilefor n = 64, at most 490 majority gates and 64 inverters arerequired for Brent–Kung prefix adder.

VI. OTHER PREFIX ADDERS

In this section, we develop majority-gate based designs forother prefix adders and compare with the Brent–Kung prefixadder.

The number of majority gates required for an n-bit Kogge–Stone adder is given by Theorem 2. Its prefix adder is shown inFig. 11.

A. Kogge–Stone Adder

Theorem 2: The number of majority gates required for ann-bit Kogge–Stone adder is given by

I(n) = n(3 log2 n − 1) + 5.

Proof: The computation of carries of an n-bit Kogge–Stoneadder requires log2 n stages. The number of majority gates re-quired for an n-bit Kogge–Stone adder is also obtained via arecursive formulation (just like the Brent–Kung adder). Thegeneral formula for calculating the number of majority gates forcarries from n

2 + 1 to n is obtained as follows. Each stage in the

Page 8: Adders

112 IEEE TRANSACTIONS ON NANOTECHNOLOGY, VOL. 11, NO. 1, JANUARY 2012

Kogge–Stone adder requires n2 associative operations (namely

carries from n2 + 1 to n). Except the last stage, each associative

operator in all other stages requires three majority gates. In thelast stage, each associative operation requires only two majoritygates (since there is no calculation of the propagate term):

Idc

(n :

n

2+ 1

)= 3

(n

2

)(log2 n − 1) + 2

(n

2

)

=n

2[3 log2 n − 1]

Idc (n) =

n

2[3 log2 n − 1] + Id

c

(n

2

)

= 3[n

2log2 n +

n

4log2

n

2+ · · · + 2 · 2

]

−[n

2+

n

4+ · · · 2

]+ Id

c (2)

= 3[2n log2 n − (log2 n + 1)n] − [n − 2] + 2

= 3n log2 n − 4n + 4 [� Idc (2) = 2].

Lemma 3 can be applied to Stage 1 of each associative operator.We will have a reduction of one majority gate each and a totalof n

2 majority gates:

Ir (n) =n

2+ Ir

(n

2

)

= n − 1.

Hence, Ic(n) is given as

Ic(n) = Idc (n) − Ir (n)

= 3n log2 n − 5n + 5.

For an n-bit adder, gi and pi require one majority gate each;hence, Igp(n) = 2n majority gates. Also, each si requires twomajority gates, so Is(n) = 2n majority gates. So overall major-ity gate requirement is given by

I(n) = Ic(n) + Igp(n) + Is(n)

= n(3 log2 n − 1) + 5 Q.E.D.

B. Ladner–Fischer Adder

Another prefix adder reported in the literature is the Ladner–Fischer adder [33]. This also has O(log2 n) stages. Its prefixgraph is shown in Fig. 12.

A bound on the number of majority gates required for an n-bitLadner–Fischer is given in Theorem 3. The details of the proofare given briefly here (More details on the Ladner–Fischer adderitself are available in [36]).

Theorem 3: The number of majority gates required for ann-bit Ladner–Fischer adder is given by

I(n) =n

2(3 log2 n + 4) + 2.

Fig. 12. Ladner–Fischer 16-bit adder prefix graph.

Proof (Outline): The direct calculation of the carries, denotedby Id

c (n), is given by

Idc (n) =

n

4[3 log2 n + 1] + Id

c

(n

2

)

=32

[n

2log2 n +

n

4log2

n

2+ · · · + 2 · 2

]

−[n

4+

n

8+ · · · 2 + 1

]+ Id

c (2)

=32[2n log2 n − (log2 n + 1)n] − [n − 2] + 2

=32n log2 n − n + 1 [� Id

c (2) = 2].

We can apply Lemma 3 to associative operations in Stage 1and stage log2 n. This leads to a reduction of majority gates,denoted Ir (n), given by

Ir (n) =n

2+ Ir

(n

2

)= n − 1

Ic(n) = Idc (n) − Ir (n)

=32n log2 n − 2n + 2.

The overall majority gate requirement is given by

I(n) = Ic(n) + Igp(n) + Is(n)

=n

2(3 log2 n + 4) + 2 Q.E.D.

C. Han–Carlson Adder

A fourth prefix adder reported in the literature is the Han–Carlson adder [35]. Its prefix graph is shown in Fig. 13.

Theorem 4: The number of majority gates required for ann-bit Han–Carlson adder is given by

I(n) =n

2(3 log2 n + 4) + 2.

Proof: The computation of carries of n-bit Han–Carlson adderrequires log2 n + 1 stages. As before, we use a recursive for-mulation and the general formula for calculating the majoritygates for carries from ( n

2 + 1) to n is derived as follows. Eachstage in the Han–Carlson adder requires n

4 associative opera-tions (namely carries from ( n

2 + 1) to n). In the Stages labelled 1

Page 9: Adders

PUDI AND SRIDHARAN: LOW COMPLEXITY DESIGN OF RIPPLE CARRY AND BRENT–KUNG ADDERS IN QCA 113

Fig. 13. Sixteen-bit Han-Carlson adder prefix graph.

to log2 n − 1, each associative operator requires three majoritygates. However, in the last two stages, each associative requiresonly two majority gates (since there is no calculation of thepropagate term):

Idc

(n :

n

2+ 1

)= 3

(n

4

)(log2 n − 1) + 2

(n

2

)2

=n

4[3 log2 n + 1]

Idc (n) =

n

4[3 log2 n + 1] + Id

c

(n

2

)

=32

[n

2log2 n +

n

4log2

n

2+ · · · + 2 · 2

]

−[n

4+

n

8+ · · · 2 + 1

]+ Id

c (2)

=32[2n log2 n − (log2 n + 1)n]

− [n − 2] + 2

=32n log2 n − n + 1 [� Id

c (2) = 2].

We can apply Lemma 3 to associative operations in Stage 1and stage log2 n + 1. For each associative operation, there is asaving of 1 majority gate which implies a total reduction of n

2majority gates (for the half from n

2 + 1 to n):

Ir (n) =n

2+ Ir

(n

2

)= n − 1.

Hence, Ic(n) is given as

Ic(n) = Idc (n) − Ir (n)

TABLE IINUMBER OF LEVELS, ASSOCIATIVE OPERATIONS AND MAJORITY GATES FOR

VARIOUS PREFIX ADDERS

Fig. 14. Plot of majority gates versus adder size for various prefix adders.

=32n log2 n − 2n + 2.

For an n-bit adder, gi and pi require one majority gate each;hence, Igp(n) = 2n majority gates. Also, each si requires twomajority gates, so Is(n) = 2n majority gates. So overall major-ity gate requirement is

I(n) = Ic(n) + Igp(n) + Is(n)

=n

2(3 log2 n + 4) + 2 Q.E.D.

Table II summarizes the number of levels, associative oper-ations and majority gates for each of the four prefix adders (interms of the adder size, denoted by n). Graphs showing how ma-jority gates accumulate for different prefix adders, as the addersize grows are shown in Fig. 14.

From the graph (Fig. 14), we observe that the Brent–Kung pre-fix adder supports a very efficient solution (in terms of smallergrowth in number of majority gates). This is not unexpected,however, given the fact that the Brent–Kung graph has smallernumber of operators (as expressed via Table II) and thus gates.

VII. QCA IMPLEMENTATION

In this section, we present results of simulation in QCADe-signer [7]. We also present area and time complexity results forvarious adders.

Page 10: Adders

114 IEEE TRANSACTIONS ON NANOTECHNOLOGY, VOL. 11, NO. 1, JANUARY 2012

Fig. 15. QCADesigner layout for 8-bit ripple carry adder.

Fig. 16. QCADesigner layout for 8-bit Brent-Kung adder.

Fig. 17. QCADesigner layout for 8-bit Kogge–Stone adder.

A. Design Rules

Cells for our design are assumed to have a height of 18 nm,and width of 18 nm while the quantum dots have a diameter of5 nm (same as the assumptions of [11]). Further, the cells areplaced on a grid with a cell center-to-center distance of 20 nm.A maximum of 16 cells per clock zone is used (as in [11], it isto be noted that the number of cells per clock zone affects theoverall delay).

Fig. 18. QCADesigner layout for 8-bit Ladner–Fischer adder.

Fig. 19. QCADesigner layout for 8-bit Han–Carlson adder.

B. Simulation Engine

The coherence vector engine has been used for simulations.The options for the simulation were as follows (and are in agree-ment with the suggestions in [37]): temperature: 1 K; relaxationtime: 1 ×10−15 s; time step: 1 ×10−16 s; total simulation time:7 ×10−11 s; radius of effect: 80 nm; relative permittivity: 12.9;layer separation: 11.5 nm; Euler method; randomize simulationorder.

C. Layout Level Implementation of RCA and Prefix Adders

Fig. 15 shows the QCADesigner layout for an 8-bit ripplecarry adder. The layout is labelled to indicate majority gates aswell as the outputs, namely S0 , S1 . . . , S7 and C8 . A labelledQCADesigner layout for the proposed Brent–Kung adder(8-bit) is shown in Fig. 16. Figs. 17–19, show the layout ofan 8-bit Kogge–Stone adder, Ladner–Fischer adder, andHan–Carlson adder, respectively. The 16-bit ver-sions are available at http://www.ee.iitm.ac.in/∼sridhara/16 bit layouts/index.html.

Page 11: Adders

PUDI AND SRIDHARAN: LOW COMPLEXITY DESIGN OF RIPPLE CARRY AND BRENT–KUNG ADDERS IN QCA 115

Fig. 20. QCADesigner Layout for a 16-bit Brent-Kung adder.

Fig. 21. Simulation results for 16-bit Brent–Kung adder.

Fig. 22. Polarization versus relaxation time for a Brent–Kung adder with respect to: (a) C4; (b) S2; and (c) S0.

Fig. 20 gives the layout for the 16-bit Brent–Kung adder.Fig. 21 shows the simulation results of a 16-bit Brent–Kungadder (the complete sum output includes also the top carryout). The first set of inputs for simulation shown in Fig. 21corresponds to A[15 : 0] = 0;B[15 : 0] = 0 (since the initialcarry is set to 0 as in [34], this is retained for simulations).

The output, SUM[15 : 0] = 0 appears after four clock delays(this is also reflected in the “delay” column in Table III). Thesecond set of inputs corresponds to A[15 : 0] = 1024;B[15 :0] = 512. The output, SUM[15 : 0] = 1536 (this includes thecarry as well).

Page 12: Adders

116 IEEE TRANSACTIONS ON NANOTECHNOLOGY, VOL. 11, NO. 1, JANUARY 2012

TABLE IIICELL COUNT, AREA, OVERALL SIZE, DELAY, AND NUMBER OF TOTAL CLOCK

PHASES FOR DIFFERENT METHODS; PROP: PROPOSED; BK: BRENT–KUNG;HC: HAN–CARLSON; KS: KOGGE–STONE; LF: LADNER–FISCHER

D. Study of Signal Integrity as a Function of Relaxation Time

Given the advantages of the Brent–Kung adder in the QCAmodel with respect to majority gate requirements, it is of interestto explore other aspects that are equally important in the QCAparadigm. In this section, we study the robustness of the Brent–Kung adder. For this purpose, the coherence vector simulationengine is used since it allows for the verification of functionalityof QCA layouts as time-related parameters and temperature arevaried. The clock frequency for the studies is obtained using thecoherence vector engine total simulation time and the length ofsimulation input vector. In our case, this is 100 GHz.

The polarization of various outputs as a function of the relax-ation time is studied for a 4-bit Brent–Kung adder for three dif-ferent temperature settings. Thresholds are set as follows [38]:polarization <0.5 =⇒ logic 0; polarization >0.5 =⇒ logic1, otherwise the state is indeterminate. The range for the relax-ation time is chosen as follows. The coherence vector enginetime step has been fixed at the default value (1 × 10−16 s) andthis determines a value for the relaxation time where the de-sign breaks (i.e., gives incorrect polarization of the outputs).

The point where the design breaks at the other end is obtainedby systematically increasing the relaxation time from the de-fault value (of 1 × 10−15 s) and observing the outputs for eachcase. The plots of polarization of three of the outputs, namelyC4, S2, and S0, as a function of relaxation time when inputsare A = 15, B = 15, and Cin = 0 are shown in Figs. 22(a)– (c).(The plots are limited to these three due to space constraints.)The X-axis is plotted on logarithmic scale (base 10) to facilitatehandling a large range for the relaxation time (from 9.9 × 10−17

to 3.0 × 10−12 s in the case of C4 and slightly less in the case ofS2 and S4). The design breaks for each of the output bits whenthe relaxation time is 9.9 × 10−17 s (i.e., just less than the timestep). On the higher side, the design breaks for C4 when therelaxation time is 9 × 10−14 s for 1 Kelvin and for somewhathigher values as the temperature is increased (3 × 10−13 s for10 K and 1 × 10−12 s for 20 K). The design breaks for S2 at9 × 10−13 s for 1 Kelvin and around the same value as temper-ature is increased. Further, the design breaks for S0 when therelaxation time is 3 × 10−13 s at 1 Kelvin and at the same valueas temperature is increased to 10 Kelvin. These are illustratedin Figs. 22(a)–(c) (Note that the X-axis values are in naturallogarithm scale).

VIII. COMPARISON STUDIES

A. Cell Count, Area, and Delay for Various Adders

Table III gives the details of cell count, area and delay for theproposed RCA, Brent–Kung, Kogge–Stone, Ladner–Fischer,and Han–Carlson adders. Comparisons with prior work are alsopresented in Table III. The comparisons are primarily with theresults reported in [10] and [11] since they appear to be the mostrecent on RCA, CLA, and other other adders. The delay valueindicated for the proposed 8-bit Brent–Kung adder correspondsto the delay for output of sum S7 in Fig. 10. It is worth notingthat the proposed Brent–Kung adder has the lowest delay amongall existing adders and this can be attributed due to optimizationof majority logic (as well as wires in the design).

Plots of cell count, delay and area as a function of adder sizeare given in Figs. 23, 24, and 25. These plots are based on theanalysis of the QCA layouts.

We present time and space estimates for various adders usingthe order notation followed by the authors of [10] (this is basedon examining the growth of cell count, delay etc. as the adder sizedoubles). From the statistics, cell counts for a QCA adder withn-bit operands are roughly O(n1.33) for Brent–Kung, O(n1.49)for Han–Carlson, O(n1.56) for Kogge–Stone, O(n1.42) forLadner-Fischer, O(n1.24) for proposed RCA, O(n1.21) for CFA[11] and O(n1.35) for RCA [10]. Area estimates are O(n1.39) forBrent–Kung, O(n1.41) for Han–Carlson, O(n1.53) for Kogge–Stone, O(n1.48) for Ladner–Fischer, O(n1.56) for proposedRCA, O(n1.42) for CFA [11] and O(n1.72) for RCA [10]. Delayestimates are given by O(n0.68) for Brent–Kung, O(n0.83) forHan–Carlson, O(n0.78) for Kogge–Stone, O(n0.7) for Ladner–Fischer, O(n0.83) for proposed RCA, O(n0.87) for CFA [11]and O(n0.97) for RCA [10]. From the order results (using theBig Oh notation), we note that Brent–Kung has, in general,lower complexity than other adders in the QCA model. We

Page 13: Adders

PUDI AND SRIDHARAN: LOW COMPLEXITY DESIGN OF RIPPLE CARRY AND BRENT–KUNG ADDERS IN QCA 117

Fig. 23. Complexity in terms of cell count for various adders.

Fig. 24. Time complexity in terms of delay for various adders.

Fig. 25. Area complexity for various adders.

Fig. 26. Polarization versus temp. with input 1: (a) S0 and (b) C4.

further note that a feature of the Kogge–Stone adder in the QCAdomain is that the growth (ratio) in delay as well as area (whenthe adder width is doubled) remains nearly constant.

B. Polarization Versus Temperature Studies

In this section, we study robustness of the proposed Brent–Kung adder. We report studies on polarization as a functionof temperature for two different adders, namely the proposedBrent–Kung adder and carry flow adder (which has the leastcell count and delay [11] among existing adders). The coher-ence vector simulation in QCADesigner has been performedwith temperature set to various values starting from 1 Kelvin.Default settings (Euler Method and Randomize Simulation Or-der option) have been chosen for simulation of Brent–Kung aswell as the carry flow adder. The clock frequency for the studiesis set to 100 GHz as before. We interpret the waveforms result-ing from simulation in QCADesigner using a simple thresholdsystem as suggested in [38]: polarization <0.5 =⇒ logic 0;

Page 14: Adders

118 IEEE TRANSACTIONS ON NANOTECHNOLOGY, VOL. 11, NO. 1, JANUARY 2012

Fig. 27. Polarization versus temp. with input 2: (a) S0 and (b) S3.

polarization >0.5 =⇒ logic 1, otherwise the state is indeter-minate.

The results presented are for the 4-bit version of each of thetwo adders. The change in polarization has been studied for sumbits, Si, i = 0, . . . , 3 and carry bit C4 for two different inputs.The first set of input values (referred to as input 1 in figurecaptions) is: X = 15;Y = 15, Cin = 0. Plots of polarizationversus temperature for S0 and C4 for this input set are shownrespectively in Figs. 26(a) and (b). (Plots for the remainingoutputs are omitted due to space constraints.)

The second set of input values (referred to as input 2 infigure captions) is X = 7;Y = 7;Cin = 0. Plots of polarizationversus temperature for S0 and S3 for this input set are shownrespectively in Figs. 27(a) and (b). Since negative polarization(−1) corresponds to a binary value of 0, the “min” value istaken in the results of simulation in QCADesigner for both C4and S0 (for the remaining output bits, the “max” value is takensince these bits are 1). From the plots, we can infer that theproposed Brent-Kung adder has better performance than theCFA with respect to output polarization. S0 is in error (it is 0)

at higher temperatures (beyond 21 Kelvin) in the case of CFAleading to an error in overall sum for inputs given by: X =15, Y = 15, Cin = 0. All the outputs have the correct value upto 23 Kelvin for the proposed Brent-Kung adder. For the inputgiven by X = 7, Y = 7, Cin = 0, S3 is in error for temperaturesexceeding 6 Kelvin (the design breaks at this temperature asshown in Fig. 27(b) and is the first output that is broken) in thecase of CFA while accuracy of all outputs is maintained up to23 Kelvin in the case of the Brent–Kung adder.

IX. CONCLUSIONS

In this paper, we have considered primitives in QCA anddeveloped several results pertaining to majority logic optimiza-tion. We have also shown that a 1-bit full adder can be realizedusing at most three majority gates and one inverter.

Using the new results on majority logic optimization, we havepresented an efficient QCA design for an n-bit ripple carry adderand various prefix adders. We have shown that the proposedripple carry adder has substantially lower area and delay thanexisting RCA designs. We have also shown that the Brent–Kungadder has lower delay than all other adder designs studied here(and in prior work) for large word sizes. Further, the Brent–Kung adder performs best among the prefix adders in terms ofdelay and area.

The Brent–Kung adder design has also been studied for ro-bustness. It is observed that the Brent-Kung adder is fairly robustfor considerable variation in relaxation time as well as tempera-ture. Simulation results using QCADesigner with the coherencevector engine confirm the advantages of the Brent-Kung prefixadder in the QCA domain.

REFERENCES

[1] G. W. Hanson, Fundamentals of Nanoelectronics. Englewood Cliffs,NJ: Prentice-Hall, 2008.

[2] W. Porod, “Quantum-dot devices and quantum-dot cellular automata,” J.Franklin Inst., vol. 334B, no. 5/6, pp. 1147–1175, 1997.

[3] P. D. Tougaw and C. S. Lent, “Logical devices implemented using quan-tum cellular automata,” J. Appl. Phys., vol. 75, no. 3, pp. 1818–1825,1994.

[4] M. A. Kastner, “The single electron transistor,” Rev. Modern Phys.,vol. 64, no. 3, pp. 849–858, 1992.

[5] J. Timler and C. S. Lent, “Power gain and dissipation in quantum-dotcellular automata,” J. Appl. Phys., vol. 91, no. 2, pp. 823–831, 2002.

[6] R. Zhang, K. Walus, W. Wang, and G. A. Jullien, “A method of majoritylogic reduction for quantum cellular automata,” IEEE Trans. Nanotech-nol., vol. 3, no. 4, pp. 443–450, Dec. 2004.

[7] K. Walus, T. Dysart, G. Jullien, and R. Budiman, “QCADesigner: A rapiddesign and simulation tool for quantum-dot cellular automata,” IEEETrans. Nanotechnol., vol. 3, no. 1, pp. 26–29, Mar. 2004.

[8] K. Walus and G. A. Jullien, “Design tools for an emerging SOC tech-nology: Quantum-dot cellular automata,” Proc. IEEE, vol. 94, no. 6,pp. 1225–1244, Jun. 2006.

[9] C. S. Lent and P. D. Tougaw, “A device architecture for computing withquantum dots,” Proc. IEEE, vol. 85, no. 4, pp. 541–557, Apr. 1997.

[10] H. Cho and E. E. Swartzlander, “Adder designs and analyses for quantum-dot cellular automata,” IEEE Trans. Nanotechnol., vol. 6, no. 3, pp. 374–383, May 2007.

[11] H. Cho and E. E. Swartzlander, “Adder and multiplier designs in quantum-dot cellular automata,” IEEE Trans. Comput., vol. 58, no. 6, pp. 721–727,Jun. 2009.

[12] A. Chaudhary, D. Z. Chen, X. S. Hu, M. T. Niemier, R. Ravichandran,and K. Whitton, “Fabricatable interconnect and molecular QCA circuits,”IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 26, no. 11,pp. 1978–1991, Nov. 2007.

Page 15: Adders

PUDI AND SRIDHARAN: LOW COMPLEXITY DESIGN OF RIPPLE CARRY AND BRENT–KUNG ADDERS IN QCA 119

[13] R. Lindaman, “A theorem for deriving majority-logic networks within anaugmented Boolean algebra,” IEEE Trans. Electron. Comput., vol. EC-9,no. 3, pp. 338–342, Sep. 1960.

[14] M. Cohn and R. Lindaman, “Axiomatic majority-decision logic,” IEEETrans. Electron. Comput., vol. EC-10, no. 1, pp. 17 –21, Mar. 1961, 2012.

[15] F. Miyata, “Realization of arbitrary logical functions using majority ele-ments,” IEEE Trans. Electron. Comput., vol. EC-12, no. 3, pp. 183–191,Jun. 1963.

[16] S. B. Akers, “On the algebraic manipulation of majority logic,” IEEETrans. Electron. Comput., vol. EC-10, no. 4, pp. 779–779, Dec. 1961.

[17] H. S. Miller and R. O. Winder, “Majority-logic synthesis by geometricmethods,” IEEE Trans. Electron. Comput., vol. EC-11, no. 1, pp. 89–90,Feb. 1962.

[18] S. B. Akers, “Synthesis of combinational logic using three-input majoritygates,” in Proc. 3rd Annu. Symp. Switch. Circuit Theory Logical Design,1962, 7–12 1962, pp. 149–158.

[19] E. M. Riseman, “A realization algorithm using three-input majority ele-ments,” IEEE Trans. Electron. Comput., vol. EC-16, no. 4, pp. 456–462,Aug. 1967.

[20] R. Zhang, P. Gupta, and N. K. Jha, “Synthesis of majority and minoritynetworks and its applications to QCA-, TPL-, and SET-based nanotech-nologies,” in Proc. Int. Conf. VLSI Design, 2005, pp. 229–234.

[21] R. Zhang, K. Walus, W. Wang, and G. A. Jullien, “Performance compar-ison of quantum-dot cellular automata adders,” in Proc. IEEE Int. Symp.Circuits Syst., 2005, pp. 2522–2526.

[22] H. Cho and E. E. Swartzlander, “Modular design of conditional sumadders using quantum-dot cellular automata,” in Proc. 6th IEEE Conf.Nanotechnol. (IEEE-NANO 2006), pp. 363–366..

[23] R. Tang, F. Zheng, and Y.-B. Kim, “QCA-based nano circuits design[adder design example],” in Proc. IEEE Int. Symp. Circuits Syst., 2005,pp. 2527–2530.

[24] S. Bhanja and S. Sarkar, “Probabilistic modeling of QCA circuits usingBayesian networks,” IEEE Trans. Nanotechnol., vol. 5, no. 6, pp. 657–670, Nov. 2006.

[25] K. Kim, K. Wu, and R. Karri, “The robust QCA adder designs usingcomposable QCA building blocks,” IEEE Trans. Comput.-Aided DesignIntegr. Circuits Syst., vol. 26, no. 1, pp. 176–183, Jan. 2007.

[26] T. J. Dysart and P. M. Kogge, “Probabilistic analysis of a molecularquantum-dot cellular automata adder,” in Proc. IEEE Int. Symp. DefectFault-Tolerance VLSI Syst., 2007, pp. 478–486.

[27] T. J. Dysart and P. M. Kogge, “Analyzing the inherent reliability of mod-erately sized magnetic and electrostatic QCA circuits via probabilistictransfer matrices,” IEEE Trans. Very Large Scale Integrat. Syst., vol. 17,no. 4, pp. 507–516, Apr. 2009.

[28] I. Hanninen and J. Takala, “Robust adders based on quantum-dot cellu-lar automata,” in Proc. IEEE Int. Conf. Appl.-Specific Syst., Architect.Processors, 2007, pp. 391–396.

[29] S. Srivastava, S. Sarkar and S. Bhanja, “Estimation of upper bound ofpower dissipation in QCA circuits,” IEEE Trans. Nanotechnol., vol. 8,no. 1, pp. 116–127, Jan. 2009.

[30] F. Lombardi and J. Huang, Design and Test of Digital Circuits by Quantum-Dot Cellular Automata: Norwood, MA: Artech House, 2007.

[31] I. Koren, Computer Arithmetic Algorithms. Natick, MA: A.K. PetersLtd., 2002.

[32] P. M. Kogge and H. S. Stone, “A parallel algorithm for the efficientsolution of a general class of recurrent equations,” IEEE Trans. Comput.,vol. C-22, no. 8, pp. 786–793, Aug. 1973.

[33] R. E. Ladner and M. J. Fischer, “Parallel prefix computation,” J. Assoc.Comput. Mach., vol. 27, no. 4, pp. 831–838, Oct. 1980.

[34] R. P. Brent and H. T. Kung, “A regular layout for parallel adders,” IEEETrans. Comput., vol. C-31, no. 3, pp. 260–264, Mar. 1982.

[35] T. Han and D. A. Carlson, “Fast area-efficient VLSI adders,” in Proc. 8thIEEE Symp. Comput. Arithmet., 1987, pp. 49–56.

[36] V. Pudi and K. Sridharan, “Efficient design of a hybrid adder in quantum-dot cellular automata,” IEEE Trans. VLSI Syst., 2010, to be published.

[37] K. Walus, G. Schulhof, and G. Jullien, “Implementation of a simulationengine for clocked molecular QCA,” in Proc. IEEE Can. Conf. Electr.Comput. Eng., May 2006, pp. 2128–2131.

[38] G. Schulhof, K. Walus, and G. A. Jullien, “Simulation of random celldisplacements in QCA,” ACM J. Emerg. Technol. Comput. Syst., vol. 3,no. 1, pp. 1–14, 2007.

Vikramkumar Pudi received the B.Tech. degreein electronics and communication engineering fromNarayana Engineering College, Nellore, India in2008. He is currently working toward the Ph.D degreein the Department of Electrical Engineering in IndianInstitute of Technology Madras, Chennai, India.

His research interests include quantum dot cellularautomata, VLSI architectures, and hardware realiza-tion of DSP algorithms.

K. Sridharan (S’84–M’96–SM’01) received thePh.D degree from Rensselaer Polytechnic Institute,Troy, NY, in 1995.

He was an Assistant Professor at Indian Instituteof Technology (IIT), Guwahati, from 1996 to 2001.Since June 2001, he is with IIT Madras where he ispresently a Professor. He was a visiting staff memberat Nanyang Technological University, Singapore, in2000–2001 and 2006–2008. He has supervised threePh.D degrees and holds two patents. He is an au-thor of a book published by Springer in 2008 on

hardware-efficient algorithms for robotics. He has also authored/co-authoredapproximately 70 papers in various journals and conferences.

He received the Computer Engineering Division Prize for a paper publishedin the Journal of I.E(India) in 2002. He is also the recipient of the 2009 VikramSarabhai Research Award for his work in algorithms, architectures, and FPGA-based designs for robotics.