
1

Moore’s Law in Microprocessors

[Figure: transistor count (millions of transistors, log scale from 0.001 to 1000) versus year (1970 to 2010) for lead microprocessors: 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® proc, P6. Annotation: 2X growth in 1.96 years!]

Transistors on lead microprocessors double every 2 years

2

Evolution in DRAM Chip Capacity

[Figure: DRAM capacity per chip (Kbit, log scale from 10 to 100,000,000) versus year (1980 to 2010). Data points: 64, 256, 1,000, 4,000, 16,000, 64,000, 256,000, 1,000,000, 4,000,000, 16,000,000, and 64,000,000 Kbit. Process generations: 1.6-2.4 µm, 1.0-1.2 µm, 0.7-0.8 µm, 0.5-0.6 µm, 0.35-0.4 µm, 0.18-0.25 µm, 0.13 µm, 0.1 µm, 0.07 µm.]

4X growth every 3 years!

3

Die Size Growth

[Figure: die size (mm, log scale from 1 to 100) versus year (1970 to 2010) for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® proc, and P6. Annotations: ~7% growth per year; ~2X growth in 10 years.]

Die size grows by 14% to satisfy Moore’s Law
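The growth figures quoted on these slides are compound rates, so a multiple over a number of years converts to an annual rate (and back to a doubling time) with a root or a logarithm. The following is a minimal sketch of that arithmetic; the function names are chosen here for illustration and do not come from the slides.

```python
import math

def annual_growth_from_multiple(multiple, years):
    """Annual compound growth rate that yields `multiple` after `years`."""
    return multiple ** (1.0 / years) - 1.0

def doubling_time(annual_rate):
    """Years needed to double at a given annual compound growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)

# Transistor count: 2X growth in 1.96 years (slide 1).
moore_rate = annual_growth_from_multiple(2, 1.96)
# DRAM capacity: 4X growth every 3 years (slide 2).
dram_rate = annual_growth_from_multiple(4, 3)
# Die size: ~7% growth per year (slide 3).
die_doubling = doubling_time(0.07)

print(f"transistors:   {moore_rate:.1%} per year")
print(f"DRAM capacity: {dram_rate:.1%} per year")
print(f"die size doubles in {die_doubling:.1f} years")
```

At 7% per year the doubling time comes out at a little over 10 years, which agrees with the "~2X growth in 10 years" annotation.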

4

Clock Frequency

Lead microprocessors frequency doubles every 2 years

[Figure: clock frequency (MHz, log scale from 0.1 to 10000) versus year (1970 to 2010) for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® proc, and P6. Annotation: 2X every 2 years. Courtesy, Intel]

5

Power Dissipation

[Figure: power (Watts, log scale from 0.1 to 100) versus year (1971, 1974, 1978, 1985, 1992, 2000) for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® proc, and P6.]

Lead Microprocessors power continues to increase

Power delivery and dissipation will be prohibitive

6

Power Density

[Figure: power density (W/cm2, log scale from 1 to 10000) versus year (1970 to 2010) for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® proc, and P6. Reference levels marked: Hot Plate, Nuclear Reactor, Rocket Nozzle.]

Power density too high to keep junctions at low temp

7

Design Productivity Trends

[Figure: two log-scale curves versus year (1981 to 2009). Left axis: Complexity, Logic Transistor per Chip (M), from 0.001 to 10,000. Right axis: Productivity (K) Trans./Staff - Mo., from 0.01 to 100,000. Curve labels: Logic Tr./Chip, 58%/Yr. compounded Complexity growth rate; Tr./Staff Month, 21%/Yr. compound Productivity growth rate. Courtesy, ITRS Roadmap]

Complexity outpaces design productivity
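The gap between the two growth rates on this slide compounds as well. As a rough sketch, assuming design effort is simply transistors per chip divided by transistors per staff-month (a simplification introduced here, not a claim from the slide), the relative effort per design grows by (1 + 0.58) / (1 + 0.21) each year:

```python
COMPLEXITY_GROWTH = 0.58    # logic transistors per chip, per year (from the slide)
PRODUCTIVITY_GROWTH = 0.21  # transistors per staff-month, per year (from the slide)

def relative_staff_effort(years):
    """Staff-months per design after `years`, relative to year 0.

    Assumes effort = transistors per chip / transistors per staff-month,
    so it grows by (1 + 0.58) / (1 + 0.21) every year.
    """
    yearly_ratio = (1 + COMPLEXITY_GROWTH) / (1 + PRODUCTIVITY_GROWTH)
    return yearly_ratio ** years

for n in (1, 5, 10):
    print(f"after {n:2d} years: {relative_staff_effort(n):.1f}x the staff-months")
```

Under that assumption the effort per design grows by about 30% per year, which is one way to read the statement that complexity outpaces design productivity.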

8

SIA Roadmap

Year                  1999   2002      2005   2008   2011   2014
Feature size (nm)     180    130       100    70     50     35
Mtrans/cm2            7      14-26     47     115    284    701
Chip size (mm2)       170    170-214   235    269    308    354
Signal pins/chip      768    1024      1024   1280   1408   1472
Clock rate (MHz)      600    800       1100   1400   1800   2200
Wiring levels         6-7    7-8       8-9    9      9-10   10
Power supply (V)      1.8    1.5       1.2    0.9    0.6    0.6
High-perf power (W)   90     130       160    170    174    183
Battery power (W)     1.4    2.0       2.4    2.0    2.2    2.4
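One pattern in the roadmap is easier to see with a little arithmetic: the linear shrink from one generation to the next. The sketch below uses only the Year and Feature size rows of the table; the squared term is the ideal area scaling for the same layout, which is an idealization added here rather than a roadmap figure.

```python
# Year and feature-size rows copied from the SIA Roadmap table.
years = [1999, 2002, 2005, 2008, 2011, 2014]
feature_nm = [180, 130, 100, 70, 50, 35]

# Linear shrink factor between consecutive generations.
shrink = [new / old for old, new in zip(feature_nm, feature_nm[1:])]

for (y0, y1), s in zip(zip(years, years[1:]), shrink):
    # s * s is the ideal area of an unchanged layout after the shrink.
    print(f"{y0} -> {y1}: feature size x{s:.2f}, ideal area x{s * s:.2f}")
```

Each step lands between roughly 0.70 and 0.77 of the previous feature size, so the ideal area of a fixed design roughly halves per generation.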


11

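Design Abstraction Levels

[Figure: hierarchy of abstraction levels: SYSTEM, MODULE (drawn as an adder, "+"), GATE, CIRCUIT (drawn with Vin and Vout), DEVICE (drawn as a transistor cross-section with G, S, D terminals and n+ regions).]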

12

Major Design Challenges

• Microscopic issues
– ultra-high speeds
– power dissipation and supply rail drop
– growing importance of interconnect
– noise, crosstalk
– reliability, manufacturability
– clock distribution

• Macroscopic issues
– time-to-market
– design complexity (millions of gates)
– high levels of abstractions
– reuse and IP, portability
– systems on a chip (SoC)
– tool interoperability

Year   Tech. (µm)   Complexity   Frequency   3 Yr. Design Staff Size   Staff Costs
1997   0.35         13 M Tr.     400 MHz     210                       $90 M
1998   0.25         20 M Tr.     500 MHz     270                       $120 M
1999   0.18         32 M Tr.     600 MHz     360                       $160 M
2002   0.13         130 M Tr.    800 MHz     800                       $360 M


22

Programmable Logic Technologies

• Fuse and anti-fuse
– Fuse makes or breaks link between two wires
– Typical connections are 50-300 ohm
– One-time programmable (testing before programming?)
– Very high density

• EPROM and EEPROM
– High power consumption
– Typical connections are 2K-4K ohm
– Fairly high density

• RAM-based
– Memory bit controls a switch that connects/disconnects two wires
– Typical connections are .5K-1K ohm
– Can be programmed and re-programmed in the circuit
– Low density

23

Altera EPLD (Erasable Programmable Logic Devices)

• Historical Perspective
– PALs: same technology as programmed once bipolar PROM
– EPLDs: CMOS erasable programmable ROM (EPROM) erased by UV light

• Altera building block = MACROCELL

[Figure: macrocell. An 8 Product Term AND-OR Array (AND ARRAY with CLK and pad inputs) plus programmable MUX's: Clk MUX, Output MUX, F/B MUX. Other labels: Invert Control (programmable polarity), Seq. Logic Block with output Q, programmable feedback, I/O Pin.]

24

Altera EPLD

Altera EPLDs contain 8 to 48 independently programmed macrocells

Personalized by EPROM bits:
• Synchronous Mode: flipflop controlled by global clock signal; local signal computes output enable
• Asynchronous Mode: flipflop controlled by locally generated clock signal

+ Seq Logic: could be D, T positive or negative edge triggered
+ product term to implement clear function

[Figure: one Clk MUX per mode, each with Global CLK and OE/Local CLK inputs and an EPROM Cell as the select, driving a flipflop with output Q.]

25

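Actel Logic Module

Basic Module is a Modified 4:1 Multiplexer

[Figure: three 2:1 MUXes. One has data inputs D0, D1 and select SOA; one has data inputs D2, D3 and select SOB; the third takes selects S0 and S1 and produces output Y.]

Example: Implementation of S-R Latch

[Figure: S-R latch built from the three 2:1 MUXes, with inputs R and S, constants "0" and "1", and output Q.]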
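To make the multiplexer-based module concrete, here is a small behavioural sketch. Two things in it are assumptions rather than facts read off the slide: the final select is taken to be S0 OR S1, and the latch wiring (which input goes to which pin) is one hypothetical arrangement that yields a set-dominant S-R latch, not necessarily the wiring drawn in the original figure.

```python
def mux2(d0, d1, sel):
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0

def actel_module(d0, d1, soa, d2, d3, sob, s0, s1):
    """Modified 4:1 multiplexer built from three 2:1 muxes.

    Assumption: the output mux is selected by (S0 OR S1).
    """
    a = mux2(d0, d1, soa)
    b = mux2(d2, d3, sob)
    return mux2(a, b, s0 | s1)

def sr_latch_step(q, s, r):
    """One evaluation of a set-dominant S-R latch (hypothetical wiring).

    First mux:  keeps Q, or forces "0" when R = 1.
    Second mux: constant "1".
    Output mux: S = 1 picks the constant "1", otherwise the first mux.
    """
    return actel_module(d0=q, d1=0, soa=r, d2=1, d3=1, sob=0, s0=s, s1=0)

q = 0
for s, r, label in [(1, 0, "set"), (0, 0, "hold"), (0, 1, "reset"), (0, 0, "hold")]:
    q = sr_latch_step(q, s, r)
    print(f"{label:5s} S={s} R={r} -> Q={q}")
```

The point of the example is that state comes from feeding the module's output back to one of its own data inputs; no dedicated storage element is needed.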

26

Actel Interconnect

[Figure: Interconnection Fabric. Labels: Logic Module, Horizontal Track, Vertical Track, Anti-fuse.]

27

Xilinx Programmable Gate Arrays

[Figure: array of CLBs separated by Wiring Channels and surrounded by IOBs.]

• CLB - Configurable Logic Block
– 5-input, 1 output function
– or 2 4-input, 1 output functions
– optional register on outputs

• Built-in fast carry logic
• Can be used as memory
• Three types of routing
– direct
– general-purpose
– long lines of various lengths

• RAM-programmable
– can be reconfigured
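The "5-input function, or two 4-input functions" choice can be pictured with lookup tables. The sketch below assumes each 4-input function is stored as a 16-entry table addressed by its inputs, and that a third stage selects between two such tables to form a 5-input function (a Shannon expansion on the fifth input). The helper names and the parity example are illustrative, and this is a behavioural model, not a description of the actual Xilinx circuit.

```python
from itertools import product

def make_lut(func, n_inputs):
    """Tabulate func over all input combinations (the table contents)."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=n_inputs)]

def read_lut(table, *bits):
    """Evaluate a lookup table: the inputs form the address, first bit is MSB."""
    addr = 0
    for b in bits:
        addr = (addr << 1) | b
    return table[addr]

def parity5(a, b, c, d, e):
    """Example 5-input function: odd parity."""
    return a ^ b ^ c ^ d ^ e

# Shannon expansion on e: one table holds the e=0 cofactor, the other e=1.
lut_f = make_lut(lambda a, b, c, d: parity5(a, b, c, d, 0), 4)
lut_g = make_lut(lambda a, b, c, d: parity5(a, b, c, d, 1), 4)

def five_input(a, b, c, d, e):
    f = read_lut(lut_f, a, b, c, d)
    g = read_lut(lut_g, a, b, c, d)
    return g if e else f  # third stage selects between the two tables

ok = all(five_input(*v) == parity5(*v) for v in product((0, 1), repeat=5))
print("5-input function reproduced from two 4-input tables:", ok)
```

Because a table like this is just 16 stored bits with a 4-bit address, the same structure also illustrates why such a block "can be used as memory".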

28

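[Figure: FPGA organization. Configurable Logic Blocks (CLBs) connected through Programmable Interconnect and Switch Matrix blocks, with I/O Blocks (IOBs) around them.]

[Figure: I/O block. Labels: Pad, Input Buffer, Output Buffer, two flip-flops (D, Q), Slew Rate Control, Passive Pull-Up/Pull-Down, Delay, Vcc.]

[Figure: CLB internals. Function generators F (inputs F1-F4), G (inputs G1-G4), and H (input H1); control inputs C1-C4 carrying H1, DIN, S/R, EC; clock K; S/R Control; two D flip-flops with SD, RD, and EC inputs, each fed from F', G', H', or DIN; outputs X and Y.]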

29

The Xilinx 4000 CLB

30

Xilinx 4000 Interconnect

31

Switch Matrix

32

Xilinx 4000 Interconnect Details

33

Computer-Aided Design

• Can't design FPGAs by hand
– Way too much logic to manage, hard to make changes

• Hardware description languages
– Specify functionality of logic at a high level

• Validation: high-level simulation to catch specification errors
– Verify pin-outs and connections to other system components
– Low-level to verify mapping and check performance

• Logic synthesis
– Process of compiling HDL program into logic gates and flip-flops

• Technology mapping
– Map the logic onto elements available in the implementation technology (LUTs for Xilinx FPGAs)
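The validation step, checking a low-level result against the high-level specification, can be illustrated on a tiny design. The sketch below compares a behavioural one-bit full adder with a gate-level version by exhaustive simulation. Both functions are written here for illustration; a real flow would run an HDL simulator on the synthesized netlist rather than Python.

```python
from itertools import product

def adder_spec(a, b, cin):
    """High-level specification: arithmetic sum of three bits."""
    total = a + b + cin
    return total & 1, total >> 1  # (sum, carry-out)

def adder_gates(a, b, cin):
    """Gate-level version using only XOR, AND, and OR."""
    p = a ^ b
    return p ^ cin, (a & b) | (p & cin)

# Exhaustive simulation: feasible here because there are only 8 input vectors.
mismatches = [v for v in product((0, 1), repeat=3)
              if adder_spec(*v) != adder_gates(*v)]
print("mismatching input vectors:", mismatches)
```

Exhaustive comparison only works for very small blocks; larger designs rely on selected test vectors, which is why specification errors are best caught early at the high level.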

34

CAD Tool Path (cont’d)

• Placement and routing
– Assign logic blocks to functions
– Make wiring connections

• Timing analysis - verify paths
– Determine delays as routed
– Look at critical paths and ways to improve

• Partitioning and constraining
– If design does not fit or is unroutable as placed, split into multiple chips
– If design is too slow, prioritize critical paths, fix placement of cells, etc.
– Few tools to help with these tasks exist today

• Generate programming files - bits to be loaded into chip for configuration
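Timing analysis of a routed design amounts to finding the longest-delay path through a graph of logic and routing delays. The sketch below shows that idea on a made-up netlist; the node names and delay values are hypothetical, and real tools work from extracted routing delays rather than a hand-written table.

```python
# Hypothetical netlist: node -> (delay in ns, list of fan-in nodes).
netlist = {
    "in_a":  (0.0, []),
    "in_b":  (0.0, []),
    "lut1":  (1.5, ["in_a", "in_b"]),
    "lut2":  (1.5, ["lut1", "in_b"]),
    "route": (2.0, ["lut2"]),
    "ff_d":  (0.5, ["route", "lut1"]),
}

def arrival_times(netlist):
    """Latest arrival time at each node's output (longest path in a DAG)."""
    arrival = {}

    def visit(node):
        if node not in arrival:
            delay, fanin = netlist[node]
            arrival[node] = delay + max((visit(f) for f in fanin), default=0.0)
        return arrival[node]

    for node in netlist:
        visit(node)
    return arrival

def critical_path(netlist, endpoint):
    """Walk back from the endpoint, always following the latest-arriving input."""
    arrival = arrival_times(netlist)
    path = [endpoint]
    while netlist[path[-1]][1]:
        path.append(max(netlist[path[-1]][1], key=arrival.get))
    return arrival[endpoint], path[::-1]

delay, path = critical_path(netlist, "ff_d")
print(f"critical path delay: {delay} ns via {' -> '.join(path)}")
```

Once the critical path is known, the "ways to improve" on the slide translate into shortening or re-placing the nodes along that path first.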

35

Xilinx CAD Tools

• Verilog (or VHDL) used to specify logic at a high level
– Combine with schematics, library components

• Synopsys
– Compiles Verilog to logic
– Maps logic to the FPGA cells
– Optimizes logic

• Xilinx APR - automatic place and route (simulated annealing)
– Provides controllability through constraints
– Handles global signals

• Xilinx Xdelay - measure delay properties of mapping and aid in iteration

• Xilinx XACT - design editor to view final mapping results
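The slide names simulated annealing as the technique behind automatic place and route. The sketch below shows the general shape of annealing-based placement on a small grid. Everything specific in it is an assumption made for illustration: the cost function (Manhattan wirelength of two-pin nets), the swap move, the cooling schedule, and all names. It is not a description of how Xilinx APR works internally.

```python
import math
import random

def wirelength(placement, nets):
    """Total Manhattan distance over two-pin nets."""
    return sum(abs(placement[a][0] - placement[b][0])
               + abs(placement[a][1] - placement[b][1]) for a, b in nets)

def anneal_placement(blocks, nets, grid, seed=0,
                     t_start=5.0, t_end=0.01, alpha=0.9, moves_per_t=50):
    rng = random.Random(seed)  # seeded so runs are repeatable
    sites = [(x, y) for x in range(grid) for y in range(grid)]
    rng.shuffle(sites)
    # Pad with empty cells so a swap can also move a block to a free site.
    cells = list(blocks) + [f"_empty{i}" for i in range(len(sites) - len(blocks))]
    placement = dict(zip(cells, sites))  # random initial placement
    cost = wirelength(placement, nets)
    best, best_cost = dict(placement), cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            a, b = rng.sample(cells, 2)
            placement[a], placement[b] = placement[b], placement[a]  # trial swap
            new_cost = wirelength(placement, nets)
            delta = new_cost - cost
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / t), which shrinks as t cools.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = dict(placement), cost
            else:
                placement[a], placement[b] = placement[b], placement[a]  # undo
        t *= alpha
    return {b: best[b] for b in blocks}, best_cost

blocks = ["a", "b", "c", "d", "e", "f"]
nets = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]
placement, cost = anneal_placement(blocks, nets, grid=3)
print("final wirelength:", cost)
print("placement:", placement)
```

Accepting some cost-increasing swaps at high temperature is what lets the search escape poor local arrangements; constraints such as fixed cells would be modelled by excluding those cells from the swap candidates.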