Upload
others
View
4
Download
0
Embed Size (px)
Citation preview
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 1
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 1
Reconfiguring The Future
Seth Copen GoldsteinCarnegie Mellon University
IEE FPGA Developers Forum10/22/03
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 2
Moore’s Law
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 2
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 3
HappyB’dayHappyB ‘ Day
HappyB’Day
Moore’s Law
Nanoscale components ⇒ Computers that are• Small enough to fit inside cells • Cheap enough to be disposable• Dense enough to embed a supercomputer• Smart enough to assemble themselves
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 4
What Comes Next?
Combination of Hans Moravac + Larry Roberts + Gordon Bell WordSize*ops/s/sysprice
From Gray Turing Award Lecture
1.E-06
1.E-03
1.E+00
1.E+03
1.E+06
1.E+09
1880 1900 1920 1940 1960 1980 2000
doubles every 7.5 years
doubles every 2.3 years
doubles every 1.0
years
Ops/sec/$
Mechanical/ Relays
Tubes/Transistor
CMOS
2010 2020 2030
1.E+10
1.E+11
nanometer
doubles every few months?
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 3
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 5
Technology Shifts• Size of Devices
⇒ Inches to Microns• Type of Interconnect
⇒ Rods to Lithowires• Method of Fabrication
⇒ Hammers to Light• Largest Sustainable System
⇒ 101 to 108
• Reliability⇒ Bad to Excellent
• Size of Devices⇒ Inches to Microns to Nanometers
• Type of Interconnect⇒ Rods to Lithowires to Nanowires
• Method of Fabrication⇒ Hammers to Light to Self-Assembly
• Largest Sustainable System⇒ 101 to 108 to 1012
• Reliability⇒ Bad to Excellent to Unknown
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 6
A trillion devices/cm2
… but many of them probably won’t work
… and those that do won’t be identical
200nm
15nm FINFET,end-of-roamap
approx CMOS size
Carbon Nanotubetransistor
~2nm width
Copper wires,predicted
~50nm pitch
1cm1cm
Nanowires,Already
30nm pitch
UC. Berkeley, MIT
IBM HP/UCLA
Delft
Size Matters
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 4
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 7
As we scale down:• Devices become
– more variable– more faulty (defects & faults)– numerous
• Fabrication becomes– More expensive– More constrained
• Design becomes– More complicated– More expensive
Requires:–Defect tolerant architectures–Higher level specification–Universal substrate
Drain Gate Source
1 dopant atom
IBM
NanoCMOS
MIT HP
Independent of Technology
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 8
Karen Brown, NIST
Affordable Total Cost / Wafer Level Exposure
020406080
100120140160180
0.35 um 0.25 um 0.18 um 0.13 umSIA Roadmap Generation
Mas
k Co
st /
Waf
er L
evel
Exp
osur
e ($
)
500
3000Wafer Exposures/Mask
Affordable Total Cost / Wafer Level Exposure
020406080
100120140160180
0.35 um 0.25 um 0.18 um 0.13 umSIA Roadmap Generation
Mas
k Co
st /
Waf
er L
evel
Exp
osur
e ($
)
500
3000Wafer Exposures/Mask
Karen Brown, NIST
Affordable Total Cost / Wafer Level Exposure
020406080
100120140160180
0.35 um 0.25 um 0.18 um 0.13 umSIA Roadmap Generation
Mas
k Co
st /
Waf
er L
evel
Exp
osur
e ($
)
500
3000Wafer Exposures/Mask
Affordable Total Cost / Wafer Level Exposure
020406080
100120140160180
0.35 um 0.25 um 0.18 um 0.13 umSIA Roadmap Generation
Mas
k Co
st /
Waf
er L
evel
Exp
osur
e ($
)
500
3000Wafer Exposures/Mask
'81
'83
'85
'87
'89
'93
'95
'97
'99
Logic
Ts (M
)/Ch
ip
Logic
Ts (K)
/Sta
ff M
on105
104
103
102
1010.110-210-3
10-2
0.11
10102
103104
'01
'03
'05
'07
'09
SEMATECH
Complexity
Productivity
'81
'83
'85
'87
'89
'93
'95
'97
'99
Logic
Ts (M
)/Ch
ip
Logic
Ts (K)
/Sta
ff M
on105
104
103
102
1010.110-2
105
104
103
102
1010.110-210-3
10-2
0.11
10102
103104
10-310-2
0.11
10102
103104
'01
'03
'05
'07
'09
SEMATECH
Complexity
Productivity
As we scale down:• Devices become
– more variable– more faulty (defects & faults)– numerous
• Fabrication becomes– More expensive– More constrained
• Design becomes– More complicated– More expensive
Requires:–Defect tolerant architectures–Higher level specification–Universal substrate
Independent of Technology
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 5
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 9
CMOS Too …• Precision is expensive• Current Abstractions have a major
impact on cost – Requires process engineers to deliver robust
devices– Requires perfect devices
• Process rules and mask costs are already eliminating arbitrary patterns (e.g., forbidden pitches)
• CMOS: just another nanoscale component
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 10
• Reliable Systems from reliable components
• Functionality invested at time of manufacture
• Behavior remains same as features scales down
• Reliable Systems from reliable componentsReliable systems from unreliable components
• Functionality invested at time of manufactureFunctionality modified after manufactureNew manufacturing: Bottom-up assembly
• Behavior remains same as features scales downExpect increased variability
Changes in functionalityRestrictions on connectivity
FutureManufacturing Paradigm Shift RequiredToday
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 6
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 11
• Reliable Systems from reliable components
• Functionality invested at time of manufacture
• Behavior remains same as features scales down
• Reliable Systems from reliable componentsReliable systems from unreliable components
• Functionality invested at time of manufactureFunctionality modified after manufactureNew manufacturing: Bottom-up assembly
• Behavior remains same as features scales downExpect increased variability
Changes in functionalityRestrictions on connectivity
FutureManufacturing Paradigm Shift RequiredToday
A CMOS RAM cell
NanoRAM cell.
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 12
Defect Tolerant Architectures• Features:
– Regular topology– Homogenous resources– Fine-grained?– Post-fabrication modification
• Example from today: DRAM– Requires external device for testing– Requires external device for repair
• Logic? FPGA
Key is redundancy
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 7
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 13
FPGA
Universal gates and/or
storage elements
Interconnectionnetwork
Programmable Switches
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 14
• Aides defect tolerance
Reconfigurability & DFT
• FPGA computing fabric– Regular– periodic– Fine-grained– Homogenous
• programs ⇒ circuits
Place& Route
Place& Route
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 8
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 15
Defect ToleranceTeramac works with mostly broken parts:
Defect mapping+
Fine-grained configurability⇒
Working device
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 16
Finding The Defects• Need to discover the characteristics of
the individual components, but• Can’t selectively stimulate or probe the
components• Download test machines
HOSTThin link
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 9
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 17
≈Built-In Self-Test
Configure testers vertically
Configure testers horizontally
Identify fault!
Configure the device to test itself!
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 18
Scaling to 1011
• Eliminate Host• After initial area is found,
configure chip to act as host• At end of complete test,
upload entire defect map.• For proposed device <1 day to map
HOSTThin link
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 10
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 19
Scaling to 1011
• Eliminate Host• After initial area is found,
configure chip to act as host• At end of complete test,
upload entire defect map.• For proposed device <1 day to map
HOSTHOST
HOST
HOST
HOSTHOST
HOST
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 20
Scaling to 1011
• Eliminate Host• After initial area is found,
configure chip to act as host• At end of complete test,
upload entire defect map.• For proposed device <1 day to map
Complete Map
HOSTThin link
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 11
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 21
Coping With DefectsFPGA
FPGA
FPGA
FPGA
X
XX
XX
XX
Customers
Perfect, $$$
Defects, $$
Defects, $
Defects, 0
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 22
DFT tool flow
Fabric
(Async.) CircuitNetlist
Tester
“Soft”Config.
DefectUnaware
Place-and-Route
Defect Map “Hard”Config.
Defect Aware
Place-and-Route
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 12
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 23
Reconfigurable Computing
Compiler
General-Purpose Custom Hardwareint reverse(int x){
int k,r=0;for (k=0; k<64; k++)
r |= x&1;x = x >> 1;r = r << 1;
}}int func(int* a,int *b){
int j,sum=0;for (j=0; *a>0; j++)
sum+=reverse(*b
General-PurposeCustom Hardware
Logic Blocks
Routing Resources
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 24
Advantages of Reconfigurable
• Flexibility of a processor• Performance of custom hardware
Near
You have to • Store and• Address
the configuration
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 13
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 25
Heart of an FPGA
Switch controlled by a 1-bit RAM cell
0001
Universal gate = RAM
a0a1a0
a1
dataa1 & a2
0data in
control
The cos
t of the
FPGA:
Increa
sed Area
-Delay Pro
duct
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 26
Key Component: Reconfigurable Switch
http://www.chem.ucla.edu/dept/Faculty/stoddart/research/mv.htm
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 14
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 27
Key Component: Reconfigurable Switch
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 28
Key Component: Reconfigurable Switch
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 15
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 29
Key Component: Reconfigurable Switch
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 30
Key Component: Reconfigurable Switch
Reconfigurable Molecular Switch:Eliminates overhead for reconfigurablecomputing
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 16
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 31
The Molecular Electronics Advantage:A Reconfigurable Switch
• Each crosspoint is a reconfigurable switch• Can be programmed using the signal wires
Eliminates overhead found in CMOS FPGAs!
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 32
Fabrication Is Different• Devices & wires alone are not useful• Key to nanoscale computing is
bottom-up assembly
Con
trol
, con
figu
ratio
n de
com
pres
sion
, &de
fect
map
ping
see
d C
ontr
ol, c
onfi
gura
tion
deco
mpr
essi
on, &
defe
ct m
appi
ng s
eed
Con
trol
, con
figu
ratio
n de
com
pres
sion
, &de
fect
map
ping
see
d C
ontr
ol, c
onfi
gura
tion
deco
mpr
essi
on, &
defe
ct m
appi
ng s
eed
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 17
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 33
Building a Computing Crystal
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 34
Building a Computing Crystal
Assembly ⇒ Computing CrystalsWith defects
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 18
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 35
→ Reconfigurable!
Implications
• Use only 2 terminal devices?• Only regular structures• Defect-tolerance required• Functionality must be added post-
fabrication
Remember, this all applies to future CMOS generations too!
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 36
Reconfigurable Computing
Compiler
General-Purpose Custom Hardwareint reverse(int x){
int k,r=0;for (k=0; k<64; k++)
r |= x&1;x = x >> 1;r = r << 1;
}}int func(int* a,int *b){
int j,sum=0;for (j=0; *a>0; j++)
sum+=reverse(*b
General-PurposeCustom Hardware
Logic Blocks
Routing Resources
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 19
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 37
NanoBlock
+Vdd
Gro
und
C
C
Φ
GndΦ
Stripped regions indicate connections from the CMOS layer.
Diode-matrix Molecular-latches
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 38
Computing with a nanoBlock• The active component is a diode.• Use diode-resistor logic
V AND
A
BA
B
A& B
A & B
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 20
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 39
Diode-resistor Logic
VDD
A * BA
B
A * B
Nano-implementation Electrical equivalent
Configured “ON”
Configured “OFF”V AND
BA
A ^ B
VVV AND
A
BA
B
A * B
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 40
You are here
How Do We Get There?
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 21
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 41
Abstractions Are The Key• Transistor
– An ideal switch and isolator– Identical all across the chip
• Circuit– Can be laid out on the chip– Manufactured without defects
• Instruction Set Architecture– Hides all details of the internal architecture– Mechanism for specifying programs
As Fabrication goes, so …
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 42
Necessary Device Abstraction• Transistor provides it all
– Switch– Fan-out– I/O isolation– Signal restoration
• Replace single abstraction with multiple abstractions: SIRM
• Enables an incremental path to direct nano-circuits
Necessary for logic
Necessary for large designs
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 22
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 43
SIRM• SIRM decomposes the abstraction into
its fundumental components.– Switch– Isolation– Restoration– Memory
• SIRM already in use• Allows circuit designers to explicitly
manage the different components of a scalable circuit technology
Necessary for logic
Necessary for large designs
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 44
SIRM Example
• Breaks up transistor abstraction into its components
• Allows creation of library of composable components
• Increases freedom of design
A
A
B
B
A&B
Old Way
A
B
A&B
New Way
Nano-switchNano-restorer
Nano-isolator
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 23
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 45
NanoBlock
+Vdd
Gro
und
C
C
Φ
GndΦ
Stripped regions indicate connections from the CMOS layer.
Diode-matrix Molecular-latches
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 46
Cluster of NanoBlocks
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 24
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 47
The NanoFabric
Con
trol
, con
figu
ratio
n de
com
pres
sion
, &de
fect
map
ping
see
d C
ontr
ol, c
onfi
gura
tion
deco
mpr
essi
on, &
defe
ct m
appi
ng s
eed
• Nanoscale layer put deterministically ontop of CMOS
• Highly regular
• ~108 long lines
• ~106 clusters
• Cluster has 128 blocks
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 48
NanoFabric Attributes• Hierarchical fabrication
1. Manufacture devices2. Non-deterministically align wires3. Self-assemble monolayers onto wires4. Create meshes5. Deterministically align w/CMOS
• Reconfigurable• Defect tolerant
How do we use it?
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 25
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 49
Spanning 10-orders of Magnitude1 Program
10 Billion Gates
Compilers
Architecture
Phoenix Theory
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 50
Bryant’s Law• Processor verification is
always 10 years behind• What to do?
Don’t use processors!
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 26
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 51
Circuits From Compilers
1. Program
2. Split-phase Abstract Machines
3. Configurations placed independently
4. Placement on chip
int reverse(int x){
int k,r=0;for (k=0; k<64; k++)
r |= x&1;x = x >> 1;r = r << 1;
}}
Computations& local storage
Unknown latency ops.
Certifying Circuits!
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 52
Phoenix
General: applicable to today’s software- programming languages- applications
Automatic: compiler-driven
Scalable: - run-time: with clock, hardware- compile-time: with program size
Parallelism: exploit application parallelism
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 27
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 53
Asynchronous Computation
+
data
datavalid
ack
• Tolerant of• Layout• Parametric variation• Variable Latency Operations
• Low-power• Natural implementation for
dataflow
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 54
Typical Program Graph (g721_e)
Control flow transfer
100% memory cluster
Memory reads
100% code cluster
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 28
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 55
Typical Program Graph (g721_e)
Control flow transfer
memory
Memory reads
code
memcpy
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 56
Program Graph After Inlining memcpy
memcpy
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 29
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 57
Another Consideration: Power
40048008
80808085
8086
286 386 486Pentium®
P6
1
10
100
1000
10000
1970 1980 1990 2000 2010
Powe
r Den
sity
(W
/cm2)
Hot Plate
NuclearReactor
RocketNozzle
q Power densities too high to keep junctions at low tempsSource: Irwin & Borkar, De Intel
Sun’sSurface
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 58
Power α 1/Parallel• P α CV2F• F α (V-Vth) 5/4, for CMOS
V• In relevant range: F α V• So, P α CF3
• Also, 1/A α Tn, early VLSI theory result(1 ≤ n ≤ 2)
• If we fix T, then F-1 α T, F α A-1/n
• If n=2, P α CA-3/2
• If C α A, P α A-1
FPGA Developers Forum IEE '03 (10/22/03)
© 2003 Seth Copen Goldstein 30
IEE '03 (10/22/03) © 2003 Seth Copen Goldstein 59
Conclusions• Reconfigurable Computing is inevitable
– Cost of manufacturing– Defect tolerance– Fabrication constraints
• Molecular switches are ideal for reconfigurable device
• EN: New constraints, but huge potential– Billions of devices per cm2
– Ultra-low power– Faster design time– Easier verification