Penn Engineering | Inventing the Futureese570/spring2015/ESE570...Kenneth R. Laker, University of Pennsylvania, updated 20Jan15 3 1. Apply principles of hierarchical digital CMOS VLSI,

Kenneth R. Laker, University of Pennsylvania, updated 20Jan15 1

http://www.seas.upenn.edu/~ese570/



TOPICS● The Course

● Industry Trends● Digital CMOS Basics

● Some VLSI Fundamentals● Illustrative Design Example


1. Apply principles of hierarchical digital CMOS VLSI, from the transistor up to the system level, to the understanding of CMOS circuits and systems that are suitable for CMOS fabrication.

2. Apply the models for state-of-the-art VLSI components, fabrication steps, hierarchical design flow and semiconductor business economics to judge the manufacturability of a design and estimate its manufacturing costs.

3. Design simulated experiments using Cadence to verify the integrity of a CMOS circuit and its layout.

4. Design digital circuits that are manufacturable in CMOS.

5. Apply the Cadence VLSI CAD tool suite layout digital circuits for CMOS fabrication and verify said circuits with layout parasitic elements.

6. Apply course knowledge and the Cadence VLSI CAD tools in a team based capstone design project that involves much the same design flow they would encounter in a semiconductor design industrial setting. Capstone project is presented in a formal report due at the end of the semester.


Classification of Digital CMOS Circuits

STATIC CIRCUIT In steady-state the output is always at a “1” or “0” via a low-impedance path between the output and VDD or GND, respectively.

DYNAMIC CIRCUITIn steady-state the output is at “1” or “0” due to the presence or absence of charge, respectively, stored on the output node capacitance.

STATICCIRCUITS

STATICCIRCUITS DYNAMIC

CIRCUITS


Course Introduction


Grading PoliciesHomework: 20 %Midterm 05Mar15: 40 %Project 07May15: 40 %

Homework Policies Homework assignments are combination of textbook problems, CADENCE tutorial & CADENCE lab exercises. Homework is assigned each week by Friday and will be due on Thursday the week after it is assigned. All homework assignments and due dates will be posted on the ESE 570 website http://www.seas.upenn.edu/~ese570/. Students are permitted up to THREE one-week latenesses without penalty.

a. On three occasions homework may be turned-in one week after the official due date, unless it falls on a university or religious holiday. http://provost.upenn.edu/policies/pennbook/2013/02/13/policy-on-secular-and-religious-holidays

b. Homework not turned in accordance with policy will receive zero grade.

COURSE PROTOCOLS



Industry Trends


Curve shows transistor count doubling every two years

Pentium

40048006

8080Mot 6800

8086

Mot 6800080286

80386

80486

MOS 6502Zilog Z80

80186

AMD K5Pentium II

Pentium IIIAMD K7

Pentium 4AMD K8

AMD K10 AMD 6-Core Opteron 24004-Core i7

2-Core Itanium 26-Core i76-Core i716-Core SPARC T3

10-Core Xenon IBM 4-Core z196

IBM 8-Core POWER74-Core Itanium Tukwilla

Microprocessor Transistor Count 1971 – 2015 & “Moore's Law”

2015: Oracle SPARC M7, 20 nm CMOS, 32-Core, 10B 3-D FinFET transistors.

2015


Process Node/”Minimum” Feature

Year1960 1980 2000 2020 2040

100 µm

10 µm

1 µm

0.1 µm

10 nm

1 nm

0.1 nm

Integrated Circuit History

0.18 µm in 1999

Distant Future

ITRS Roadmap

Transition Region

Quantum Devices

Atomic Dimensions

TREND – “Minimum” Feature Size vs Year

“Minimum” Feature Measure = line/gate conductor width or half-pitch (adjacent 1st metal layer lines or adjacent transistor gates)


Intel Cost Scaling

http://www.anandtech.com/show/8367/intels-14nm-technology-in-detail


Tri-Gate transistors with multiple fins connected together increases total drive strength for higher performance

22 nm 3-D FinFET Transistor

http://download.intel.com/newsroom/kits/22nm/pdfs/22nm-Details_Presentation.pdf

High-k gate dielectric


2010 YEAR

0.032'm

Serial data links operating at 10 Gbits/sec.

Increased reuse of logic IP, i.e. designs and cores.

0.022'm20112009

“Moore's Law” Impact on Intel Micro-Computers

2BT µP (Intel Itanium Tukwila) 4-Core chip (65 nm) introduced Q1 2010.3BT mP (Intel Itanium Poulson) 8-Core chip (32 nm) to be introduced 2012.Introduces 22 nm Tri-gate Transistor Tech.

Complexity - # transistorsDouble every Two Years


Moore's Law and MoreEquivalent Scaling

Geo

me t

rical

Sca

ling

“More-than-Moore”, International Road Map (IRC) White Paper, 2011.

International Technology Road Map for Semiconductors

Scal

ing


Geometrical Scaling - refers to the continued shrinking of horizontal and vertical physical feature sizes.

Equivalent Scaling - refers to 3-dimensional device structure improvements and new materials that affect the electrical performance of the chip even if no geometrical scaling.

Design Equivalent Scaling - refers to design technologies that enable high performance, low power, high reliability, low cost, and high design productivity even if neither geometrical nor equivalent scaling can be used.

Examples include: Design-for-variabilityLow power design (sleep modes, clock gating, multi-Vdd, etc.)Multi-core SOC architectures

More Moore => Scaling


Examples of “More Than Moore” Devices

Interacting with the outside worldElectromagnetic/Optical

- Radio-frequency domain up to the THz range - Optical domain from the infrared to the near ultraviolet - Hard radiation (EUV, X-ray, γ-ray)

Mechanical parameters (sensors/actuators)- MEMS/NEMS position, speed, acceleration, rotation,

pressure, stress, etc.Chemical composition (sensors/actuators)Biological parameters (sensors/actuators)

PoweringIntegration of renewable sourcesEnergy storageSmart meteringEfficient consumption

More than Moore => Functional Diversification


“More-than-Moore” Components Complement Digital Processing/Storage Elements in an Integrated System


Readout Integrated Circuit (ROIC)

Digital Signal Processing (DSP)

optical

electrical

Back-Side Illuminated (BSI) Photodiode Array

3D integration of “More-than-Moore” photodetector with “More Moore” ROIC and DSP


1010

109

108

107

106

105

104

103

102

10

Transistors/cm2

1010

109

108

107

106

105

104

103

102

10

Com

ponents/cm2

1970 1980 1990 2000 2010 2020

MultichipModule

System-in-package

(SIP)System-

on-package (SOP)

Semiconductor System Integration – More Than Moore's Law

R. Tummala, “Moore's Law Meets Its Match”, IEEE Spectrum, June, 2006

SOP law for system integration. As components shrink and boards all but disappear, component density will double every year or so.


System-on-Package Vision (Georgia Tech 3D Systems Packaging Research Center)


Improvement Trends for VLSI SoCs Enabled by Geometrical and Equivalent Scaling

TRENDS1. Higher Integration level -> exponentially increased number of components/transistors per chip/package.2. Performance Scaling -> combination of Geometrical (shrinking of dimensions) and Equivalent (innovation) Scaling.3. System implementation -> SoC + increased use of SiP -> SOP

CONSEQUENCES4. Higher Speed -> CPU clock rate at multiple GHz + parallel processing.5. Increased Compactness & less weight -> increasing system integration.6. Lower Power -> Decreasing energy requirement per function.7. Lower Cost -> Decreasing cost per function.


Digital CMOS Basics


Classification of Digital CMOS Circuits

STATIC CIRCUIT In steady-state the output is always at a “1” or “0” via a low-impedance path between the output and VDD or GND, respectively.

DYNAMIC CIRCUITIn steady-state the output is at “1” or “0” due to the presence or absence of charge, respectively, stored on the output node capacitance.

STATICCIRCUITS

STATICCIRCUITS DYNAMIC

CIRCUITS




Ideal nMOS and pMOS Characteristics

High Impedance or High Z

High Impedance or High Z

g g

g g

g = 0

g = 0 g = 0

g = 0

g = 1

g = 1

g = 1

g = 1

If N- & P-Switch are MOS transistors“a” = drain & “b” = source

ab b

bb

a a

aa


Complementary CMOS Switch22

g g

g g

-g

-g -g


Ideal CMOS Inverter

Inverter Truth Table

A

F=A

F=A

F=A

PUN

PDN

F=A

F=A

0 (GND)


N

P=?

A AF F

0 (GND) 0 (GND)


PDN

PUN

F = f(A,B,C,D)

VDD

Inputs

Output

When the PDN is conducting, the output F will be “0”. Hence,the PDN is determined by a Boolean expression for the complemented output F in terms of the un-complemented inputs (A,B,C,D).

When the PUN is conducting, the output F will be “1”. Hence,the PUN is determined by a Boolean expression for the un-complemented output F in terms of the complemented inputs (A,B,C,D).

PUN and PDN are Dual Nets

ABCD

ABCD

CMOS GATES

DeMorgan’s Theorem


Two-Input CMOS NAND Gate

F=%A⋅B&

F

FA

B

AB

F=%A⋅B&

F=%A⋅B&

DeMorgan's Theorem

B-INPUT

A-INPUTOUTPUT

Gate Circuit Symbols

%A⋅B &

%A$B&%A⋅B & %A$B&=

“AND”

“OR”

“NOT”

“NOT”

Z = open circuit

PDN

PUNF=%A$B&


Two-Input CMOS NOR Gate

DeMorgan's Theorem

A $ B

F

F

AB OR

AND

F=%A$B&

%A$B&=%A⋅B &%A$B&

%A⋅B &

F=%A⋅B&

F=%A$B&

PUN

PDN


Constructing Compound CMOS GatesF=%%A⋅B&$%C⋅D&&PUN (F = f(A,B,C,D)

F

0 0

PDN(F = f(A,B,C,D)F=%%A⋅B&$%C⋅D &&

F=%A$B&⋅%C$D &1 1

X

X

F X

F

F

0

F F

PUN

PDN


Combining the PDN and PUN =>

F=%%A⋅B &$%C⋅D&&


output=A⋅s$B⋅s

MULTIPLEXOR (MUX)

x = DON'T CARE


Some VLSI Fundamentals

Oracle SPARC M7Processor


VLSI Hierarchical Representations

fabricated?

- Circuit- Component


System Level

Algorithmic Level

Register-Transfer Level

Logic Level

Circuit Level

Behavioral Domain Structural Domain

Physical Domain

System SpecificationAlgorithm

Register-Transfer Spec.Boolean Expression

Transistor Layout

Standard-cell/Sub-cell

Macro-cell/Module

Chip/SoC/Board

Block/Die

Chip/SoC/Board

Block/Die Layout

Macro-cell/Module Layout

Standard-cell/Sub-cell Layout

Processor, Sub-systemALU, Register, MUX

Gate/Flip-flopTransistor symbols

Y-Chart: Consistent Abstractions in 3 Domains

Transistor Model Equation

CPU, ASIC

Boolean ExpressionBoolean ExpressionBoolean Expression


Goal of All VLSI Design Enterprises

Convert System Specs into an IC DESIGN in MINIMUM TIME and with MAXIMUM LIKLIHOOD that the Design will PEFORM AS SPECIFIED when fabricated.

MAX YIELD + MIN DEVELOPMENT TIME + MIN DIE AREA=> MIN COST

MIN DIE AREA ≠ MIN DEVELOPMENT TIME TRADEOFFS

MIN DIE AREA ≠ MAX YIELD


CMOS CHIP MANUFCTURING STEPS


Basic VLSI Chip Cost Model

mask set, all indirect costs.

# good die

# good die

where 0 ≤ yield ≤ 1 (100 %)

total # of IC dies fabricated# of IC dies satisfy ALL requirements

variable cost per packaged IC die

cost per packaged IC die = variable cost per IC packaged die +

FIXED COST: total cost due to expenditures that do not directly vary with sales, e.g. R&D, equipment depreciation, general operations and administration.

VARIABLE COST: total cost due to expenditures directly linked to making the product; e.g. engineering, wafer, mask set, test, package.

variable cost(total # of die fabricated) * yield

=


VLSI Design Cycle or Flow

Verilog/SPICE


Design a One-Bit Adder Circuit using 0.8 twin-well CMOS Technology. The design specifications are:

1. Propagation Delay Times of SUM and CARRY_Out signals: ≤ 1.2 ns2. Rise and Fall Times of SUM and CARRY_Out signals: ≤ 1.2 ns3. Circuit Die Area: ≤ 4. Dynamic Power Dissipation (@ V

DD = 5 V and f

max = 20 MHz): ≤ 1 mW

5. Functional

1500'm2

Illustrative Circuit Design Example

0 8µ( m twin-well CMOS technology. The


Register ADDER Shifter MultiplierData IN Data OUT

Control

Bit 0

Bit N

Bit-Sliced Data Path


Illustrative Circuit Design Example

Binary Full

Adder (BFA)

sum_out = A + B + C = ABC + ABC + ABC + ACB = ABC + (A + B + C) carry_outcarry_out + AB + AC + BC

Use of carry_out to realize sum_out reduces circuit complexity and die area.


Gate Level Schematic of One-Bit Full Adder Circuit

NOT directly realizable in CMOS - Why?


clkGND

cin

VDD

a<7:0>b<7:0>

sum<7:0>

cout

1-bit AdderCell

8-bit Ripple Adder

a b vdd

gnd sum

cin coutBFA

BFA(0) BFA(2) BFA(7)BFA(1)


42

Transistor Level Schematic of One-Bit Full Adder Circuit

%A$B &⋅C$A⋅BCOUT

%A$B$C &⋅COUT

A⋅B⋅C$%A$B$C &⋅COUT

COUTSUMOUT

CMOS Implementable


Initial Layout of One-Bit Full Adder Circuit

N1 N2

N1 N2N1 N2

SUMOUTCOUT

COUT


Initial Layout of One-Bit Full Adder Circuit

≤1500'm2

COUT

Dynamic Power Dissipation (@ VDD = 5V, f max

= 20 MHz): = 0.7 mW ≤ 1 mW


Simulated Performance of One-Bit Full Adder Circuit

Spec NOTmet.

Let all specs be met except tPLH, i.e.

Documents

Penn Engineering | Inventing the Futureese570/spring2015/ESE570...Kenneth R. Laker, University of Pennsylvania, updated 20Jan15 3 1. Apply principles of hierarchical digital CMOS VLSI,