Upload
chad-phillips
View
219
Download
0
Embed Size (px)
DESCRIPTION
Penn ESE534 Spring DeHon 3 Today Addition –organization –design space –parallel prefix
Citation preview
Penn ESE534 Spring2010 -- DeHon1
ESE534:Computer Organization
Day 3: January 25, 2010Arithmetic
Work preclass exercise
Penn ESE534 Spring2010 -- DeHon2
Last Time
• Boolean logic computing any finite function• Saw gates…and a few properties of logic
Penn ESE534 Spring2010 -- DeHon3
Today
• Addition– organization– design space– parallel prefix
Penn ESE534 Spring2010 -- DeHon4
Why?
• Start getting a handle on – Complexity
• Area and time• Area-time tradeoffs
– Parallelism– Regularity
• Arithmetic underlies much computation– grounds out complexity
Preclass
Penn ESE534 Spring2010 -- DeHon5
Circuit 1
• Can the delay be reduced?• How?• To what?
Penn ESE534 Spring2010 -- DeHon6
Tree Reduce AND
Penn ESE534 Spring2010 -- DeHon7
Circuit 2
• Can the delay be reduced?
Penn ESE534 Spring2010 -- DeHon8
Circuit 3
• Can the delay be reduced?
Penn ESE534 Spring2010 -- DeHon9
Brute Force Multi-Output AND
• How big?• ~38 here • … in general about N2/2
Penn ESE534 Spring2010 -- DeHon10
Brute Force Multi-Output AND
• Can we do better?
Penn ESE534 Spring2010 -- DeHon11
Circuit 4
• Can the delay be reduced?
Penn ESE534 Spring2010 -- DeHon12
Addition
Penn ESE534 Spring2010 -- DeHon13
Penn ESE534 Spring2010 -- DeHon14
C: 00A: 01101101010B: 01100101100S: 0
C: 000A: 01101101010B: 01100101100S: 10
C: 0000A: 01101101010B: 01100101100S: 110
C: 10000A: 01101101010B: 01100101100S: 0110
C: 010000A: 01101101010B: 01100101100S: 10110
C: 1010000A: 01101101010B: 01100101100S: 010110
C: 11010000A: 01101101010B: 01100101100S: 0010110
C: 011010000A: 01101101010B: 01100101100S: 10010110
C: 1011010000A: 01101101010B: 01100101100S: 010010110
C: 11011010000A: 01101101010B: 01100101100S: 1010010110
C: 11011010000A: 01101101010B: 01100101100S: 11010010110
Example: Bit Level Addition• Addition
– Base 2 example
A: 01101101010B: 01100101100S:
C: 0A: 01101101010B: 01100101100S:
Penn ESE534 Spring2010 -- DeHon15
Addition Base 2• A = an-1*2(n-1)+an-2*2(n-2)+... a1*21+ a0*20
= (ai*2i)• S=A+B• What is the function for si … carryi?
• si= carryi xor ai xor bi
• carryi = ( ai-1 + bi-1 + carryi-1) 2
= ai-1*bi-1+ai-1*carryi-1 + bi-1*carryi-1
= MAJ(ai-1,bi-1,carryi-1)
Penn ESE534 Spring2010 -- DeHon16
Ripple Carry Addition• Shown operation of each bit• Often convenient to define logic for
each bit, then assemble:– bit slice
Penn ESE534 Spring2010 -- DeHon17
Ripple Carry Analysis
• Area: O(N) [6n]• Delay: O(N) [2n]
What is area and delay for N-bit RA adder? [unit delay gates]
Penn ESE534 Spring2010 -- DeHon18
Can we do better?
• Lower delay?
Penn ESE534 Spring2010 -- DeHon19
Important Observation
• Do we have to wait for the carry to show up to begin doing useful work?– We do have to know the carry to get the
right answer.– How many values can the carry take on?
Penn ESE534 Spring2010 -- DeHon20
Idea
• Compute both possible values and select correct result when we know the answer
Penn ESE534 Spring2010 -- DeHon21
Preliminary Analysis• Delay(RA) --Delay Ripple Adder• Delay(RA(n)) = k*n [k=2 this example]• Delay(RA(n)) = 2*(k*n/2)=2*DRA(n/2)• Delay(P2A) -- Delay Predictive Adder• Delay(P2A)=DRA(n/2)+D(mux2)• …almost half
the delay!
Penn ESE534 Spring2010 -- DeHon22
Recurse
• If something works once, do it again.• Use the predictive adder to implement
the first half of the addition
Penn ESE534 Spring2010 -- DeHon23
Recurse
Redundant (can share)
N/4
Penn ESE534 Spring2010 -- DeHon24
Recurse• If something works once, do it again.• Use the predictive adder to implement the
first half of the addition
• Delay(P4A(n))=Delay(RA(n/4)) + D(mux2) + D(mux2)
• Delay(P4A(n))=Delay(RA(n/4))+2*D(mux2)
Penn ESE534 Spring2010 -- DeHon25
Recurse• By know we realize we’ve been using the wrong
recursion– should be using the Predictive Adder in the recursion
• Delay(PA(n)) = Delay(PA(n/2)) + D(mux2)• Every time cut in half…? • How many times cut in half?• Delay(PA(n))=log2(n)*D(mux2)+C
– C = Delay(PA(1)) • if use FA for PA(1), then C=2
Penn ESE534 Spring2010 -- DeHon26
Another Way
(Parallel Prefix)
Penn ESE534 Spring2010 -- DeHon27
CLA
• Think about each adder bit as a computing a function on the carry in– C[i]=g(c[i-1])– Particular function f will
depend on a[i], b[i]– g=f(a,b)
Penn ESE534 Spring2010 -- DeHon28
Functions
• What functions can g(c[i-1]) be?– g(x)=1
• a[i]=b[i]=1– g(x)=x
• a[i] xor b[i]=1– g(x)=0
• a[i]=b[i]=0
Maybe better to show this as a specialization: g(c) = carry(a=0,b=0,c) = carry(a=1,b=0,c) = carry(a=0,b=1,c) = carry(a=1,b=1,c)
Penn ESE534 Spring2010 -- DeHon29
Functions
• What functions can g(c[i-1]) be?– g(x)=1 Generate
• a[i]=b[i]=1– g(x)=x Propagate
• a[i] xor b[i]=1– g(x)=0 Squash
• a[i]=b[i]=0
Penn ESE534 Spring2010 -- DeHon30
Combining
• Want to combine functions– Compute c[i]=gi(gi-1(c[i-2]))– Compute compose of two
functions
• What functions will the compose of two of these functions be?– Same as before
• Propagate, generate, squash
Penn ESE534 Spring2010 -- DeHon31
Compose Rules(LSB MSB)
• GG• GP • GS• PG• PP• PS
• SG• SP• SS
[work on board]
Penn ESE534 Spring2010 -- DeHon32
Compose Rules (LSB MSB)
• GG = G• GP = G• GS = S• PG = G• PP = P• PS = S
• SG = G• SP = S• SS = S
Penn ESE534 Spring2010 -- DeHon33
Combining
• Do it again…• Combine g[i-3,i-2] and g[i-1,i]• What do we get?
Penn ESE534 Spring2010 -- DeHon34
Reduce Tree
Penn ESE534 Spring2010 -- DeHon35
Reduce Tree
• Sq=/A*/B• Gen=A*B
• Sqout=Sq1+/Gen1*Sq0
• Genout=Gen1+/Sq1*Gen0
Penn ESE534 Spring2010 -- DeHon36
Reduce Tree
• Sq=/A*/B• Gen=A*B
• Sqout=Sq1+/Gen1*Sq0
• Genout=Gen1+/Sq1*Gen0
• Delay and Area?
Penn ESE534 Spring2010 -- DeHon37
Reduce Tree
• Sq=/A*/B• Gen=A*B
• Sqout=Sq1+/Gen1*Sq0
• Genout=Gen1+/Sq1*Gen0
• A(Encode)=2• D(Encode)=1• A(Combine)=4• D(Combine)=2• A(Carry)=2• D(Carry)=1
Penn ESE534 Spring2010 -- DeHon38
Reduce Tree: Delay?
• D(Encode)=1• D(Combine)=2• D(Carry)=1
Delay = 1+2log2(N)+1
Penn ESE534 Spring2010 -- DeHon39
Reduce Tree: Area?
• A(Encode)=2• A(Combine)=4• A(Carry)=2
Area= 2N+4(N-1)+2
Penn ESE534 Spring2010 -- DeHon40
Reduce Tree: Area & Delay
• Area(N) = 6N-2• Delay(N) = 2log2(N)+2
Intermediates
• Can we compute intermediates efficiently?
Penn ESE534 Spring2010 -- DeHon41
Penn ESE534 Spring2010 -- DeHon42
Prefix TreeP
refix
Tr
ee
Penn ESE534 Spring2010 -- DeHon43
Prefix Tree• Share terms
Intermediates
• Share common terms
Penn ESE534 Spring2010 -- DeHon44
Penn ESE534 Spring2010 -- DeHon45
Prefix Tree• Share terms• Reduce computes
spans– 0:3, 4:5
• Reverse tree– Combine spans for
missing 0:i (i<N) • E.g. 0:3+4:50:5
• Same size as reduce tree
Penn ESE534 Spring2010 -- DeHon46
Prefix TreeP
refix
Tr
ee
Parallel Prefix Area and Delay?
• Roughly twice the area/delay• Area= 2N+4N+4N+2N = 10N• Delay = 4log2(N)+2
Penn ESE534 Spring2010 -- DeHon47
Penn ESE534 Spring2010 -- DeHon48
Parallel Prefix
• Important Pattern• Applicable any time operation is associative– Or can be made assoc. as in MAJ case
• Examples of associative functions?– Non-associative?
• Function Composition is always associative
Penn ESE534 Spring2010 -- DeHon49
Note: Constants Matter• Watch the constants• Asymptotically this Carry-Lookahead Adder
(CLA) is great• For small adders can be smaller with
– fast ripple carry– larger combining than 2-ary tree– mix of techniques
• …will depend on the technology primitives and cost functions
Penn ESE534 Spring2010 -- DeHon50
Two’s Complement
• positive numbers in binary• negative numbers
– subtract 1 and invert– (or invert and add 1)
Penn ESE534 Spring2010 -- DeHon51
Two’s Complement
• 2 = 010• 1 = 001• 0 = 000• -1 = 111• -2 = 110
Penn ESE534 Spring2010 -- DeHon52
Addition of Negative Numbers?
• …just works
A: 111B: 001S: 000
A: 110B: 001S: 111
A: 111B: 010S: 001
A: 111B: 110S: 101
Penn ESE534 Spring2010 -- DeHon53
Subtraction• Negate the subtracted input and use adder
– which is:• invert input and add 1• works for both positive and negative input
–001 110 +1 = 111–111 000 +1 = 001–000 111 +1 = 000–010 101 +1 = 110–110 001 +1 = 010
Penn ESE534 Spring2010 -- DeHon54
Subtraction (add/sub)
• Note: you can use the “unused” carry input at the LSB to perform the “add 1”
Penn ESE534 Spring2010 -- DeHon55
Overflow?
• Overflow=(A.s==B.s)*(A.s!=S.s)
A: 111B: 001S: 000
A: 110B: 001S: 111
A: 111B: 010S: 001
A: 111B: 110S: 101
A: 001B: 001S: 010
A: 011B: 001S: 100
A: 111B: 100S: 011
Admin
• Office Hours: W2pm
Penn ESE534 Spring2010 -- DeHon56
Penn ESE534 Spring2010 -- DeHon57
Big Ideas[MSB Ideas]
• Can build arithmetic out of logic
Penn ESE534 Spring2010 -- DeHon58
Big Ideas[MSB-1 Ideas]
• Associativity • Parallel Prefix• Can perform addition
– in log time– with linear area