
A 240ps 64b Carry-Lookahead Adder in 90nm CMOS

Faezeh Montazeri
fmontazeri@ece.ut.ac.ir

Advanced VLSI Course Presentation
University of Tehran

December 2006

Based on: A 240ps 64b Carry-Lookahead Adder in 90nm CMOS

Sean Kao, Radu Zlatanovici, Borivoje Nikolić
University of California, Berkeley

What Is an Optimal Adder?

Optimal adder:

• Minimum delay for given energy

• Minimum energy for given delay

[Figure: 64-bit Adders on IEEE Xplore 1995-2005. Axis: Normalized Delay [90nm 1V FO4], 0 to 60. Technology nodes: 500 nm, 350 nm, 250 nm, 180 nm, 130 nm.]
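The two conditions for an optimal adder describe a Pareto front in the energy – delay plane: a design is optimal if no other design is at least as good in both delay and energy. A minimal sketch of that filter follows. The design names and the (delay, energy) values are invented for illustration and are not taken from the slides or the paper.

```python
def pareto_front(designs):
    """Return the designs not dominated in (delay, energy); lower is better for both."""
    front = []
    best_energy = float("inf")
    # Walk the designs from fastest to slowest. A design is kept only if it
    # needs strictly less energy than every faster design seen so far.
    for name, delay, energy in sorted(designs, key=lambda d: (d[1], d[2])):
        if energy < best_energy:
            front.append((name, delay, energy))
            best_energy = energy
    return front


# Hypothetical (name, delay in FO4, energy in pJ) points, for illustration only.
designs = [
    ("A", 8.0, 30.0),
    ("B", 9.0, 22.0),
    ("C", 9.5, 25.0),   # slower and more energy than B, so dominated
    ("D", 11.0, 15.0),
    ("E", 12.0, 16.0),  # slower and more energy than D, so dominated
]

print([name for name, _, _ in pareto_front(designs)])
```

Each point that survives the filter is both a minimum-delay design for its energy and a minimum-energy design for its delay.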

This Work

Multi-issue 64-bit microprocessor environment:

• Optimize a set of representative 64-bit adders in the energy – delay space

• Analyze the design tradeoffs

• Implement the optimal adder in 1.0V 90nm GP CMOS

Outline

• Energy – delay optimization

• Design tradeoffs for 64-bit adders

• Test chip implementation

• Measured results

• Summary

Energy – Delay Optimization

• Goal: obtain the energy – delay optimal adder

• CAD tool: optimize custom digital circuits in the energy – delay space [3]

[Figure: energy – delay curves. Labels: Static CLA Adder, Domino CLA Adder.]

Circuit Optimization Framework

[Figure: optimization framework block diagram. Blocks: Optimizer (Matlab), Static timer, Optimization Core. Labels: Models, Netlist, Optimization Goal, Design Variables, Delay, Energy, Optimal Design.]

Adder Optimization Setup

Minimize DELAY subject to maximum ENERGY

• CL = 27 fF

• CIN ≤ 27 fF

• tSLOPE ≤ 100 ps [1]

[Figure: adder structure. Labels: A, B, Cin; Generate subtree; Propagate subtree; Sum precompute; Critical path; Non-critical path.]

CLA: Full Tree Comparison

[Figure: R2 CLA and R4 CLA curves. Axis: Delay [FO4], 6 to 14.]

Radix-2:

• 6 stages

• Moderate branching

Radix-4:

• 3 stages

• Larger branching

Radix-4 is closer to the optimum number of stages
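The stage counts for a full 64-bit tree follow from how many bits each merge stage spans: a radix-r stage combines r groups, so the number of stages is the smallest k with r^k ≥ 64. A small sketch, in which the function name is an assumption made for illustration:

```python
def carry_tree_stages(width, radix):
    """Smallest number of radix-`radix` merge stages whose span covers `width` bits."""
    stages, span = 0, 1
    while span < width:
        span *= radix   # each stage multiplies the covered span by the radix
        stages += 1
    return stages


for radix in (2, 4):
    print(f"radix-{radix}: {carry_tree_stages(64, radix)} stages")
```

This gives 6 stages for radix-2 and 3 stages for radix-4, matching the slide.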

CLA vs. Ling

[Figure: R2 Ling, R2 CLA, R4 Ling and R4 CLA curves. Axis: Delay [FO4], 6 to 14.]

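$G[3:0] = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0$

$H[3:0] = g_3 + g_2 + t_2 g_1 + t_2 t_1 g_0$

$S_i = (a_i \oplus b_i)\,\overline{H_{i-1}} + (a_i \oplus b_i \oplus (a_{i-1} + b_{i-1}))\,H_{i-1}$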

Conventional CLA

• Higher stack in first stage

• Simple sum precompute

Ling CLA

• Lower stack in first stage

• Complex sum precompute

• Higher speed
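The tradeoff between the two schemes can be checked numerically. With g_i = a_i·b_i and t_i = a_i + b_i, the Ling pseudo-carry H[3:0] = g3 + g2 + t2·g1 + t2·t1·g0 has one fewer factor in its longest term than G[3:0], and the true carry is recovered as G[3:0] = t3·H[3:0]. The sum bit then needs a mux selected by H_{i-1}. The sketch below is a bit-serial reference model for checking these identities, not the parallel domino circuit from the paper. The function names are assumptions, and the carry-in is fixed at 0.

```python
from itertools import product


def group_signals(a, b):
    """Return (G, H, t3) for one 4-bit group, bit 0 = LSB, carry-in 0."""
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    g = [x & y for x, y in zip(A, B)]   # generate
    t = [x | y for x, y in zip(A, B)]   # transmit (OR), used as the propagate term
    G = g[3] | (t[3] & g[2]) | (t[3] & t[2] & g[1]) | (t[3] & t[2] & t[1] & g[0])
    H = g[3] | g[2] | (t[2] & g[1]) | (t[2] & t[1] & g[0])
    return G, H, t[3]


def ling_add(a, b, width):
    """Bit-serial model of Ling addition with carry-in 0; returns (sum, carry_out)."""
    s = 0
    h_prev = 0   # H_{i-1}: no pseudo-carry below bit 0
    t_prev = 0   # t_{i-1}
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        p = ai ^ bi
        # Sum bit as a mux selected by H_{i-1}.
        si = (p & (1 - h_prev)) | ((p ^ t_prev) & h_prev)
        s |= si << i
        g, t = ai & bi, ai | bi
        h_prev = g | (t_prev & h_prev)   # H_i = g_i + t_{i-1} * H_{i-1}
        t_prev = t
    return s, t_prev & h_prev            # carry out = t_{n-1} * H_{n-1}


# Exhaustive check of G = t3 * H over all 4-bit operand pairs.
identity_holds = all(
    G == (t3 & H)
    for G, H, t3 in (group_signals(a, b) for a, b in product(range(16), repeat=2))
)
print(identity_holds)
print(ling_add(13, 7, 6))
```

The extra XOR with t_{i-1} in the selected sum is the "complex sum precompute" cost, paid off the critical path in exchange for the shallower first stage.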

Full vs. Sparse Comparison

[Figure: Ling CLA, full vs. sparse trees. Curves: R2 FULL, R2 SP2, R2 SP4, R4 FULL, R4 SP2, R4 SP4. Axis: Delay [FO4], 6 to 14.]

      SP2   SP4
R2    +     +
R4    +     –[1]

Sparseness benefits adders with large carry trees

Optimal Adder

• Ling’s equations

• Radix-4 sparse-2

• Domino carry tree

• Static sum-precompute

• Delay of fastest adder: 7.3 FO4

Radix-4 Sparse-2 Carry Tree

• Computes every other Ling pseudo-carry: H0, H2, H4 …

• Each output selects two sums

[Figure: radix-4 sparse-2 carry tree from (A0, B0) and Cin to (A63, B63), with outputs s0 to s63 and Cout. Labels: G/T, H16/I16, SUMSEL. Legend: G/T gates, H gate, H/I gates, SUMSEL MUX.]
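The sparse-2 idea can be sketched behaviorally: the tree supplies a carry only at every other bit position, the sum logic precomputes both candidate results for each 2-bit group, and each carry selects one pair of sum bits. The model below makes several simplifications, all of them assumptions for illustration. It uses conventional carries rather than Ling pseudo-carries, fixes Cin at 0, and replaces the carry tree with a reference function `carry_into`.

```python
def carry_into(a, b, k):
    """Reference carry into bit k with carry-in 0; stands in for the carry tree output."""
    low = (1 << k) - 1
    return ((a & low) + (b & low)) >> k


def sparse2_add(a, b, width=64):
    """Behavioral sparse-2 adder: one carry per 2-bit group selects two sum bits."""
    result = 0
    for k in range(0, width, 2):
        a2, b2 = (a >> k) & 0b11, (b >> k) & 0b11
        # Sum precompute: both candidates, computed without waiting for the carry.
        sum0 = (a2 + b2) & 0b11       # assumes carry into bit k is 0
        sum1 = (a2 + b2 + 1) & 0b11   # assumes carry into bit k is 1
        # Sum select: the carry at the even position picks one precomputed pair.
        chosen = sum1 if carry_into(a, b, k) else sum0
        result |= chosen << k
    return result, carry_into(a, b, width)


print(sparse2_add(0xFFFFFFFFFFFFFFFF, 1))
```

Only 32 of the 64 carries are needed in this arrangement, which is what thins out the carry tree at the cost of a wider sum precompute.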

Adder Core Block Diagram

• Critical paths implemented in clock-delayed domino

• Non-critical paths implemented in static

• At-speed BIST

[Figure: adder core block diagram. Blocks: Clock Generator (pc1, pc2, pc3, pc4, psel), Sum precompute, Sum select MUX, MUX, Out FF, S1 Buffer, comparator, Scan chain (scan_in). Labels: inputs, H16, I16, Precomputed sums. Legend: footed domino, footless domino, static CMOS, hard edge.]

Timing Diagram

• 20 ps margin on all edges

• Adjustable hard edges

• Delay spread places precharge in critical path

[Figure: timing diagram. Labels: Hard edge, TCYCLE, DUTY CYCLE.]

Layout Floorplan

• Bitslice height: 24 metal tracks

• Aligned clock lines

• Sum precompute occupies space freed by sparse carry tree

[Figure: layout floorplan. Labels: TG, SUM SELECT, EVERY BITSLICE, SPARSE-2 CARRY TREE, SPARSE-2 SUM PRECOMP, 24 TRACKS. Legend: pc1, pc2, pc3, pc4, psel.]

90 nm Test Chip

• 90 nm GP 7M 1P

• SVT transistors

• VDD = 1V

• 8 adder cores + test circuitry

• Core 1: this work

• Cores 2-8: supply noise measurements and supply grid experiments [4]

• Adder core size: 417 x 75 µm2

[Figure: test chip. Labels: CK GEN, 1.7 mm.]

Chip Packaging

Chip-on-board:

• Bond wires 60% shorter

• Cleaner supply: 10 ps shorter delays

[Figure labels: Advance Program, Digest.]

Measured Results: Delay

CHIP-ON-BOARD:

• VDD = 1 V

– Average: 240 ps

– Fastest: 226 ps

• VDD = 1.3 V

– Average: 180 ps

Davg = 7.5 FO4
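As a rough cross-check, and assuming that Davg = 7.5 FO4 refers to the 240 ps average measured at VDD = 1 V (the slide does not say which measurement it normalizes), the implied FO4 delay is:

\[
t_{\mathrm{FO4}} \approx \frac{240\ \mathrm{ps}}{7.5} = 32\ \mathrm{ps}
\]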

Measured Results: Power

VDD = 1V: Pmax = 260 mW

VDD = 1.3V: Pmax = 606 mW

[Figure: power breakdown. Legend: Adder core, Clk gen, Leakage.]
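The two power figures can be sanity-checked against first-order dynamic power scaling, P ∝ C·V²·f. The sketch below assumes that each Pmax was taken at the maximum operating frequency for its supply, and that this frequency scales as the inverse of the measured average delay (240 ps at 1 V, 180 ps at 1.3 V). Neither assumption is stated in the slides, and leakage and short-circuit power are ignored.

```python
def dynamic_power_ratio(v1, v2, delay1_ps, delay2_ps):
    """Ratio P2/P1 assuming P ~ C * V^2 * f, with f proportional to 1/delay."""
    return (v2 / v1) ** 2 * (delay1_ps / delay2_ps)


predicted = dynamic_power_ratio(1.0, 1.3, 240.0, 180.0)
measured = 606.0 / 260.0   # Pmax at 1.3 V over Pmax at 1 V, both from the slide
print(f"predicted {predicted:.2f}, measured {measured:.2f}")
```

Under these assumptions the predicted ratio is about 2.25 against a measured 2.33, so most of the increase is consistent with voltage and frequency scaling alone.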

Conclusion

• 90 nm GP 7M 1P

• SVT transistors

• VDD = 1V

• 8 adder cores + test circuitry

• Adder core size: 417 x 75 µm2

[Figure: 64-bit Adders on IEEE Xplore 1995-2005. Axis: Normalized Delay [90nm 1V FO4], 0 to 60. Technology nodes: 500 nm, 350 nm, 250 nm, 180 nm, 130 nm, 90 nm. This work is marked.]

Summary

• Ling radix-4 sparse-2 domino carry tree

• 90nm GP CMOS: 240ps, 260mW @1V

References

• [1]. S. Kao, R. Zlatanovici, B. Nikolic, “A 240ps 64-bit Carry-Lookahead Adder in 90nm CMOS,” ISSCC 2006, Feb. 2006.

• [2]. H. Ling, “High Speed Binary Adder,” IBM J. R&D, vol. 25, no. 3, pp. 156-166, May 1981.

• [3]. R. Zlatanovici, B. Nikolic, “Power – Performance Optimization for Custom Digital Circuits,” Proc. PATMOS, pp. 404-414, Sept. 2005.

• [4]. V. Abramzon, E. Alon, M. Horowitz, Stanford University.
