CS/EE 5830/6830 VLSI ARCHITECTURE
Spring 2011 – Arithmetic Subsystem Design
VLSI Architecture
The overall purpose of this class is to study a particular VLSI-related topic in depth.
The class this year will focus on arithmetic circuits and systems, and their VLSI implementations.
VLSI Architecture
This will be a lab/homework based class that concludes with a project:
  ¾ of a semester of labs (individual)
  ¼ of a semester of project (either group or individual)
Subjects Covered
Arithmetic circuits (size vs. speed vs. power) for:
  Addition: ripple, CLA, CSA, prefix-tree, CCS, carry-save, higher-order compressors, multi-operand reduction
  Multiplication: serial, carry-save, higher-radix coding (Booth coding), array, tree
  Division: restoring vs. non-restoring, higher-radix coding, SRT
  Floating Point: add, sub, multiply, divide, etc.
Logical Effort: transistor sizing, optimizing for speed on the back of an envelope
Asynchronous Arithmetic: variable completion time circuits
CS/EE 5830/6830
VLSI Architecture
T Th 5:15-6:35pm, MEB 3105
Instructor: Prof. Erik Brunvand, MEB 3142
Office hours: After class, when my door is open, or by appointment
TA: Anand Venkat
Office hours: to be determined
CS/EE 5830/6830
Web page: all sorts of information!
http://www.eng.utah.edu/~cs6830
[email protected]: goes to everyone in the class
https://sympa.eng.utah.edu/sympa
[email protected]: goes to instructor and TA
Textbook
Digital Arithmetic
Miloš Ercegovac and Tomás Lang
Prerequisites
Digital design is essential! (i.e. CS/EE 3700)
Boolean algebra
Combinational circuit design and optimization
  K-map minimization, SOP, POS, DeMorgan, bubble-pushing, etc.
  Basic arithmetic circuits, 2’s complement numbers
Sequential circuit design and optimization
  Latch/flip-flop design
  Finite state machine design/implementation
  Communicating FSMs
  Using FSMs to control datapaths
Prerequisites
You should have used some sort of schematic design entry tool
You should be able to use Linux
If you’re going to build a chip, or do detailed mask-level design and evaluation, you need CS/EE 5710/6710 experience
If you’re going to target an FPGA, you need Xilinx experience (i.e. CS/EE 3710)
We’ll be using Verilog for some assignments, so experience with an HDL will be useful
Self-Evaluation!
On the class web site is a self-evaluation practice exam
If you can do these problems, you probably have the right background
If you can’t, you will struggle!
Please take this seriously! Give this exam a try and make sure you remember what you need to know!
Recommendations
Computer Architecture experience is helpful Instruction set architecture (ISA) Assembly language execution model Instruction encoding Simple pipelining
I assume you’ve used some sort of CAD tools for digital circuits Schematic capture Simulation
First Assignment
CAD Assignment #1
  Cadence Composer tutorial
  Simple circuit design with simulation
  Learn basic Verilog for writing testbenches
Available on the web site
Due on Tuesday, Jan 25th, 5:00pm
Assignments/Grading
Labs & Homework (40% for 5830, 35% for 6830)
Choosing and evaluating papers (10% for 6830)
Mid-term exam (15%)
Final Project (45% for 5830, 40% for 6830)
See the syllabus (web page) for more details about grading breakdown
There is a “flake-factor” that I can apply to group project grades based on your confidential evaluations…
Projects
Study and characterize the behavior (speed, power, size, etc.) of various arithmetic subsystems
Can be team-based if you like
We’ll use tools from Cadence and Synopsys (and possibly Xilinx)
These are installed in the CADE lab, so you’ll need a CADE account
I also assume you know something about Linux!
Arithmetic Units
Example: I did a quick study in 2005 to look at the relative sizes of arithmetic units using an Artisan standard cell library (0.25µ)
Cell Library
Commercial cell library from Artisan
Targets a 0.25µ CMOS process
5 layers of metal interconnect
441 cells in the library
Multiple drive strengths per logic function
Similar libraries available for 180nm, 130nm, 90nm, 65nm, 45nm, 32nm, etc.
Tools
Arithmetic units synthesized using Module Compiler from Synopsys, along with DesignWare for FP units
CPU synthesized using Design Compiler from a behavioral Verilog description
Place and route using Cadence Silicon Ensemble
Disclaimer
It took some work getting the library in a state that works with our back-end flow
I haven’t simulated or tested any of these circuits
I haven’t looked into timing details
I haven’t looked into power details
So, take these exact dimensions with a grain of salt, but they’re very close…
Circuits
32-bit ripple-carry adder
32-bit carry lookahead (cla) adder
32-bit ALU: + (cla), -, inc, dec, abs, neg, and, or, xor, inv, pass, 0, 1
32-bit multiplier (64-bit result)
32-bit divider
FP Circuits
All 32-bit FP format: 8-bit exponent, 23-bit mantissa
FP Add
FP Mult
FP Divide
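To make the 32-bit layout concrete, here is a minimal Python sketch that splits a value into sign, exponent, and mantissa fields. It assumes the format is IEEE 754 single precision (1 sign bit plus the 8-bit exponent and 23-bit mantissa listed above); the slides do not name the standard, and the function name `decode_float32` is made up for illustration.

```python
import struct

def decode_float32(x):
    """Round x to single precision and return (sign, exponent, mantissa).

    Assumes IEEE 754 binary32: 1 sign bit, 8-bit biased exponent,
    23-bit fraction. The slides only state the 8/23 split.
    """
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent (bias 127)
    mantissa = bits & 0x7FFFFF       # 23-bit fraction, hidden leading 1 not stored
    return sign, exponent, mantissa

sign, exponent, mantissa = decode_float32(-6.5)
print(sign, exponent - 127, hex(mantissa))
```

An FP adder or multiplier has to unpack operands into these fields, operate on exponent and mantissa separately, and repack the result, which is part of why the FP units are so much larger than the integer ones.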
CPU
OpenRISC 1200
32-bit CPU, 5-stage pipeline, 32 regs
MAC instruction (32x32 -> 48) (fully pipelined)
Performance reported for 0.18µ 6LM process: 300 Dhrystone 2.1 MIPS @ 300 MHz
Full system includes caches, MMU, I/O, etc. and uses around 1M transistors (all I synthesized was the CPU)
OpenRISC 1200
OpenRISC 1200 CPU OpenRISC 1200 Arith Ops
Number of Standard Cells Chip Area
Lightning Tour of VLSI Design
Start with HDL program (VHDL or Verilog usually)
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity traffic is
  port (CLK, go_green, go_red, go_yellow: in STD_LOGIC;
        l_green, l_red, l_yellow: out STD_LOGIC);
end;

architecture traffic_arch of traffic is
  -- SYMBOLIC ENCODED state machine: Sreg0
  type Sreg0_type is (green, red, yellow);
  signal Sreg0: Sreg0_type;
begin
  -- concurrent signal assignments
  -- state register: advances on the rising edge of CLK
  Sreg0_machine: process (CLK)
  begin
    if CLK'event and CLK = '1' then
      case Sreg0 is
        when green =>
          if go_yellow='1' then Sreg0 <= yellow; end if;
        when red =>
          if go_green='1' then Sreg0 <= green; end if;
        when yellow =>
          if go_red='1' then Sreg0 <= red; end if;
        -- when others => null;
      end case;
    end if;
  end process;

  -- assignment statements for combinatorial outputs
  l_green_assignment:
    l_green <= '1' when (Sreg0 = green) else
               '0' when (Sreg0 = red) else
               '0' when (Sreg0 = yellow) else
               '0';

  l_yellow_assignment:
    l_yellow <= '0' when (Sreg0 = green) else
                '0' when (Sreg0 = red) else
                '1' when (Sreg0 = yellow) else
                '1';

  l_red_assignment:
    l_red <= '0' when (Sreg0 = green) else
             '1' when (Sreg0 = red) else
             '0' when (Sreg0 = yellow) else
             '0';
end traffic_arch;
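As a quick way to check the behavior of the state machine above before synthesizing it, here is a small Python model of the same transitions and outputs. This is only a sketch: the `step` and `lights` helpers are invented for illustration, and the starting state is an assumption (a VHDL simulator initializes an enumerated signal to its leftmost value, here `green`; real hardware would need a reset).

```python
# Transition table copied from the case statement:
# (current state, asserted input) -> next state.
TRANSITIONS = {
    ("green", "go_yellow"): "yellow",
    ("red", "go_green"): "green",
    ("yellow", "go_red"): "red",
}

def step(state, asserted):
    """Return the state after one rising clock edge.

    `asserted` is the set of go_* inputs that are '1' at the edge.
    """
    for (current, signal), nxt in TRANSITIONS.items():
        if state == current and signal in asserted:
            return nxt
    return state  # no matching input: hold the current state

def lights(state):
    """Combinational outputs: exactly one light is on in each state."""
    return {name: int(name == "l_" + state)
            for name in ("l_green", "l_red", "l_yellow")}

state = "green"  # assumed start state (leftmost enum value in simulation)
for asserted in [{"go_red"}, {"go_yellow"}, {"go_red"}, {"go_green"}]:
    state = step(state, asserted)
    print(state, lights(state))
```

Note that an input that does not match the current state (such as `go_red` while green) is simply ignored, exactly as in the VHDL `case` branches.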
VLSI Design
Or start with a schematic (or a mix of both)
Convert Gates to Transistors Convert Transistors to Layout
Assemble Gates into a Circuit And Assemble Whole Chip
Example Class Chip (2001)
16-bit Processor, approx 27,000 transistors
Same Chip (no M2, M3)
1.5mm x 3.0mm, 72 I/O pads
Zoom In… Zoom In…
A Hair (100 microns)
Another Class Project (2001)
3.0mm x 3.0mm
84 I/O Pads
Standard-Cell Part
Standard-Cell Zoom Register File
Adder/Shifter Class project from 2002
16-bit CORDIC Processor
Class project from 2003
Basketball Scoreboard Display
Another class project (2003)
Simple processor (+, -, *, /) with ADC on the input
Back to the Arithmetic Units…
How they Look
@ 0.6u inside a single tiny-chip frame, and 4-TCU frame
Ripple Adder
FP Mult Inside a 6mm Chip (@0.25u)
Inside a 10mm Chip (@0.25u) Inside a 15mm Chip (@0.25u)
Inside a 10mm chip (@0.13u) Inside a 15mm chip (@0.13u)
Background - transistors
Because of the history of this class, VLSI (6710) isn’t a prerequisite…
BUT, understanding something about CMOS transistor-level design is important!
Hard to understand all the power/speed/size tradeoffs if you don’t understand the issues!
So: lightning review of CMOS transistors!
We’ll start looking at Chapter 1 next week…
Electronics Summary
Voltage is a measure of electrical potential energy per unit charge
Current is moving charge caused by voltage
Resistance reduces current flow
Ohm’s Law: V = I R
Power is work over time
P = V I = I²R
Capacitors store charge
It takes time to charge/discharge a capacitor
Time to charge/discharge follows an exponential with time constant RC
It takes energy to charge a capacitor
Energy stored in a capacitor is (1/2) C V²
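The capacitor relations above can be tried out numerically. The sketch below uses the standard first-order RC step response and the (1/2) C V² energy formula; the component values (10 kΩ, 10 fF, 2.5 V) are made-up numbers chosen only to look like a small on-chip node, not values from the class.

```python
import math

def capacitor_voltage(t, v_final, r, c, v_initial=0.0):
    """Voltage on a capacitor t seconds after it starts charging
    (or discharging) through resistance r toward v_final."""
    return v_final + (v_initial - v_final) * math.exp(-t / (r * c))

def stored_energy(c, v):
    """Energy in joules stored in capacitance c charged to voltage v."""
    return 0.5 * c * v * v

# Hypothetical values for illustration only.
R = 10e3     # ohms
C = 10e-15   # farads
VDD = 2.5    # volts

tau = R * C  # time constant in seconds
for n in (1, 2, 3, 5):
    v = capacitor_voltage(n * tau, VDD, R, C)
    print(f"t = {n} tau: {v / VDD:.1%} of VDD")
print("energy at VDD:", stored_energy(C, VDD), "J")
```

After one time constant the node has covered about 63% of the swing, which is why both delay and switching energy scale directly with the capacitance being driven.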
Reminder: Voltage Division
Find the voltage across any series-connected resistors
Example of Voltage Division
Find the voltage at point A with respect to GND
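The circuit figure for this example did not survive, so here is a minimal sketch of the general rule instead: in a series string, each resistor drops a share of the total voltage proportional to its resistance. The resistor values and the position of point A below are assumptions for illustration, not the numbers from the original slide.

```python
def divider_voltages(v_total, resistances):
    """Voltage drop across each resistor in a series string.

    Each drop is v_total * R_i / (sum of all R), listed in the same
    order as `resistances`.
    """
    total = sum(resistances)
    return [v_total * r / total for r in resistances]

# Hypothetical example: 5 V across R1 = 1 kOhm (top) and R2 = 4 kOhm (bottom).
# Point A is assumed to sit between them, so V(A) relative to GND is the
# drop across R2.
drops = divider_voltages(5.0, [1000.0, 4000.0])
v_a = drops[1]
print(drops, v_a)
```

The same rule shows up later when a transistor's on-resistance fights another resistance: the output voltage lands wherever the divider puts it.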
How Does This Relate to VLSI?
Model of a CMOS Transistor
Two Types of CMOS Transistors
CMOS Transistors
Complementary Metal Oxide Semiconductor
Two types of transistors
Built on silicon substrate
“majority carrier” devices
Field-effect transistors
  An electric field attracts carriers to form a conducting channel in the silicon…
For now, just some basic abstractions
Silicon Lattice
Transistors are built on a silicon substrate
Silicon is a Group IV material
Forms crystal lattice with bonds to four neighbors
Dopants
Silicon is a semiconductor
Pure silicon has no free carriers and conducts poorly
Adding dopants increases the conductivity
  Group V: extra electron (n-type)
  Group III: missing electron, called hole (p-type)
p-n Junctions
A junction between p-type and n-type semiconductor forms a diode.
Current flows only in one direction
[Figure: N-type transistor with gate (G), source (S), and drain (D) terminals; electron current i, voltages +Vgs and Vds]
nMOS Operation
Body is commonly tied to ground (0 V)
When the gate is at a low voltage:
  P-type body is at low voltage
  Source-body and drain-body diodes are OFF
  No current flows, transistor is OFF
nMOS Operation Cont.
When the gate is at a high voltage:
  Positive charge on gate of MOS capacitor
  Negative charge attracted to body
  Inverts a channel under gate to n-type
  Now current can flow through n-type silicon from source through channel to drain, transistor is ON
[Figure: P-type transistor with gate (G), source (S), and drain (D) terminals; hole current i, voltages -Vgs and Vsd]
pMOS Transistor
Similar, but doping and voltages reversed
Body tied to high voltage (VDD)
Gate low: transistor ON
Gate high: transistor OFF
Bubble indicates inverted behavior
A Cutaway View
CMOS structure with both transistor types
Transistors as Switches
For now, we’ll abstract away most analog details…
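[Figure: nMOS and pMOS transistors (terminals S, G, D) drawn as switches for G=0 and G=1]
[Figure: an nMOS switch passes a good 0 but a poor 1; a pMOS switch passes a good 1 but a poor 0; used together they pass a good 0 and a good 1]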
Not Perfect Switches!
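One way to see why the switches are imperfect is a first-order model in which a pass transistor cannot drive its output closer than one threshold voltage to the "wrong" rail. The sketch below is a simplification under stated assumptions: the values of `VDD` and `VT` are hypothetical, both transistor types are given the same threshold magnitude, and body effect and leakage are ignored.

```python
VDD = 2.5  # hypothetical supply voltage
VT = 0.5   # hypothetical threshold magnitude, same for nMOS and pMOS

def nmos_pass(gate, v_in):
    """Output of an nMOS pass switch, or None when it is OFF.

    ON when gate = 1 (gate at VDD). It passes low voltages fully but
    cannot pull the output above VDD - VT: good 0, poor 1.
    """
    if gate != 1:
        return None
    return min(v_in, VDD - VT)

def pmos_pass(gate, v_in):
    """Output of a pMOS pass switch, or None when it is OFF.

    ON when gate = 0 (gate at ground). It passes high voltages fully
    but cannot pull the output below VT: good 1, poor 0.
    """
    if gate != 0:
        return None
    return max(v_in, VT)

print("nMOS passing 0 and VDD:", nmos_pass(1, 0.0), nmos_pass(1, VDD))
print("pMOS passing 0 and VDD:", pmos_pass(0, 0.0), pmos_pass(0, VDD))
```

With these numbers the nMOS delivers 0 V and 2.0 V while the pMOS delivers 0.5 V and 2.5 V, which is the "good 0 / poor 1" and "good 1 / poor 0" pattern.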
“Switching Circuit”
For example, a switch can control when a light comes on or off
[Figure: a light wired between +5v and 0v through an open switch; with the switch open, no electricity can flow]
“AND” Circuit
Both switch X AND switch Y need to be closed for the light to light up
[Figure: switches X and Y in series with the light, between +5v and 0v]
“OR” Circuit
The light comes on if either X OR Y is closed
[Figure: switches X and Y in parallel, feeding the light between +5v and 0v]
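The two switch circuits map directly onto Boolean operators: switches in series conduct only when all are closed, and switches in parallel conduct when any one is closed. A minimal sketch follows, with helper names `series` and `parallel` invented for illustration.

```python
def series(*switches):
    """Current flows through switches in series only if ALL are closed."""
    return all(switches)

def parallel(*switches):
    """Current flows through switches in parallel if ANY is closed."""
    return any(switches)

print("X Y  AND-light  OR-light")
for x in (False, True):
    for y in (False, True):
        print(int(x), int(y), "     ", int(series(x, y)), "       ", int(parallel(x, y)))
```

CMOS gates are built from exactly these two arrangements, with transistors taking the place of the mechanical switches.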
CMOS Inverter
CMOS Inverter
A Y
0 1
1 0
Timing Issues in CMOS
Power Consumption
CMOS NAND Gate
CMOS NAND Gate
A B Y
0 0 1
0 1 1
1 0 1
1 1 0
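The truth table can be reproduced from the transistor networks themselves. The sketch below assumes the usual static CMOS NAND structure, which the missing schematic presumably showed: two nMOS transistors in series between Y and ground, and two pMOS transistors in parallel between VDD and Y. The function name `nand2` is made up for illustration.

```python
def nand2(a, b):
    """Evaluate a 2-input CMOS NAND from its switch networks."""
    pull_down = (a == 1) and (b == 1)  # series nMOS: both must be ON to reach GND
    pull_up = (a == 0) or (b == 0)     # parallel pMOS: either one connects Y to VDD
    # In static CMOS exactly one network conducts for every input combination.
    assert pull_up != pull_down
    return 1 if pull_up else 0

print("A B Y")
for a in (0, 1):
    for b in (0, 1):
        print(a, b, nand2(a, b))
```

Only the 1, 1 row turns on the pull-down path, so Y is 0 there and 1 everywhere else.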
CMOS NOR Gate
3-input NAND Gate
Y pulls low if ALL inputs are 1
Y pulls high if ANY input is 0
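The two rules above are complements of each other, which is what guarantees that the output is always driven and that VDD is never shorted to ground. A short sketch that checks this exhaustively for any number of inputs follows; the helper names are hypothetical.

```python
from itertools import product

def nand_networks(inputs):
    """Return (pull_up_on, pull_down_on) for an N-input CMOS NAND."""
    pull_down = all(x == 1 for x in inputs)  # Y pulls low if ALL inputs are 1
    pull_up = any(x == 0 for x in inputs)    # Y pulls high if ANY input is 0
    return pull_up, pull_down

def is_complementary(n):
    """True if exactly one network conducts for every n-bit input pattern."""
    for inputs in product((0, 1), repeat=n):
        pull_up, pull_down = nand_networks(inputs)
        if pull_up == pull_down:  # both on (short) or both off (floating)
            return False
    return True

print(is_complementary(3))
```

The same check works for a 2-input or 4-input NAND by changing the argument.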
N-type and P-type Uses
Because of the imperfect nature of the transistor switches:
ALWAYS use N-type to pull low
ALWAYS use P-type to pull high
If you need to pull both ways, use them both
S
In
Out
S S=0, In = Out S=1, In = Out
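Using both transistor types in parallel gives a switch (a transmission gate) that passes both logic levels well. The sketch below reuses the first-order threshold-drop idea: each device alone loses one threshold at one rail, but in parallel the device that can pass the level fully sets the output. Everything numeric here is an assumption: `VDD` and `VT` are hypothetical, and the control polarity (S=1 closes the switch, with the complement of S driving the pMOS gate) is assumed rather than taken from the figure.

```python
VDD = 2.5  # hypothetical supply voltage
VT = 0.5   # hypothetical threshold magnitude for both devices

def transmission_gate(s, v_in):
    """Output of an nMOS/pMOS pair in parallel, or None when open.

    Assumes S = 1 turns both devices ON (nMOS gate = S, pMOS gate = not S).
    """
    if s != 1:
        return None  # both transistors OFF: output is disconnected
    n_out = min(v_in, VDD - VT)  # nMOS alone: good 0, poor 1
    p_out = max(v_in, VT)        # pMOS alone: good 1, poor 0
    # In parallel, the device that tracks the input more closely wins.
    return min((n_out, p_out), key=lambda v: abs(v - v_in))

print(transmission_gate(1, 0.0), transmission_gate(1, VDD), transmission_gate(0, VDD))
```

Compared with a single pass transistor, the cost is a second device and the need for both polarities of the control signal.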
Switch to Whiteboard
Complex Gate
Tri-State
Latch
D-register