-1-
UC San Diego / VLSI CAD Laboratory
Accuracy-Configurable Adder for Approximate Arithmetic Designs
Andrew B. Kahng, Seokhyeong Kang
VLSI CAD Laboratory, UC San Diego
49th Design Automation Conference, June 6th, 2012
-2-
Outline
Background and Motivation
Accuracy Configurable Adder Design
Experimental Setup and Results
Conclusions and Ongoing Works
-3-
Why Approximate Designs?
Threats to the traditional IC design approach:
  Extreme variations: PVT variation and uncertainty lead to design overhead
  Reliability issues: hard errors (NBTI, latchup), soft errors (α-particles)
  Cost: the cost (power/performance) of perfect accuracy is too high
Approximate designs: relaxing the requirement of correctness can dramatically reduce the cost of a design
Example: what is the square root of 10?
  Approximate answer: “a little more than three”
  Exact answer: “3.162278...”
  Approximation can be faster and more powerful
-4-
Previous Approximate Adders
Lu et al., IEEE Computer 2004
  Faster adder with a shorter carry chain
  High performance with a small error rate
  Large area overhead: not applicable to low-energy design
Zhu et al., TVLSI 2010
  ETAI: accurate part + inaccurate part
  Reduces error size
  Error rate is high
Output accuracy is fixed, so the benefits can be limited by the required accuracy
-5-
Our Work: Accuracy-Configurable Approximate Adder
How power benefits can be achieved:
[Figure: normalized power (scale up to 1.0) over time for an accurate design and an accuracy-configurable design; required accuracy levels along the time axis are 80%, 100%, 90%, 80%; the configurable design switches between approximate mode and accurate mode when an event occurs]
Accuracy-configurable design adapts to changing requirements by using a different mode in each situation
-6-
Our Work: Accuracy-Configurable Approximate Adder
How power benefits can be achieved:
[Figure: normalized power (scale up to 1.0) over time for an accurate design and an accuracy-configurable design; required accuracy levels along the time axis are 80%, 100%, 90%, 80%; accurate mode vs. approximate mode]
Accuracy-configurable approximate adder: an approximate adder with two error correction blocks (ECC-1, ECC-2)
  Mode 1: turn off ECC-1 and ECC-2 (accuracy: 90%)
  Mode 2: turn off ECC-2 (accuracy: 95%)
  Mode 3: turn on all ECC (accuracy: 100%)
-8-
Approximate Adder Implementation
16-bit adder case
[Figure: inputs A[15:0] and B[15:0] feed three 8-bit sub-adders, with AH = A[15:8], AM = A[11:4], AL = A[7:0] (B is split the same way)]
  AL + BL gives SUML = SUM[7:0] (SUM[3:0] and SUM[7:4])
  AM + BM gives SUMM = SUM[11:8]
  AH + BH gives SUMH = SUM[15:12] and the carry SUM[16]
Carry chain is cut to reduce critical path delay
Sub-adders generate results of partial summation
Middle sub-adder improves accuracy (error 50% → 5.5%)
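The 16-bit case above can be sketched in software to see where the cut carry chain loses information. This is a behavioral model only, not the authors' RTL; the function names `aca16_add` and `error_rate` are hypothetical, and the bit slicing follows the AH/AM/AL split shown on the slide.

```python
import random


def aca16_add(a: int, b: int) -> int:
    """Approximate 16-bit addition with three overlapping 8-bit sub-adders."""
    a &= 0xFFFF
    b &= 0xFFFF
    sum_l = (a & 0xFF) + (b & 0xFF)                # AL + BL, input bits [7:0]
    sum_m = ((a >> 4) & 0xFF) + ((b >> 4) & 0xFF)  # AM + BM, input bits [11:4]
    sum_h = ((a >> 8) & 0xFF) + ((b >> 8) & 0xFF)  # AH + BH, input bits [15:8]
    result = sum_l & 0xFF                  # SUM[7:0]: all 8 bits of the low sum
    result |= ((sum_m >> 4) & 0xF) << 8    # SUM[11:8]: upper half of the middle sum
    result |= ((sum_h >> 4) & 0x1F) << 12  # SUM[16:12]: upper half of the high sum plus its carry
    return result


def error_rate(samples: int = 20000, seed: int = 0) -> float:
    """Fraction of uniformly random operand pairs whose approximate sum is wrong."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(samples):
        a, b = rng.getrandbits(16), rng.getrandbits(16)
        wrong += aca16_add(a, b) != a + b
    return wrong / samples
```

A result is wrong only when a carry has to travel across a full 4-bit overlap region, for example 0x00FF + 0x0001, where the carry generated at bit 0 never reaches the middle sub-adder. For uniformly random operands this happens in a few percent of additions.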
-9-
Approximate Adder Implementation
N-bit adder case
[Figure: A and B are split into k-bit segments A[N-1:N-k], A[N-k-1:N-2k], A[N-2k-1:N-3k], ... (likewise B); each sub-adder adds two adjacent segments and produces one k-bit result segment SUM[N-1:N-k], SUM[N-k-1:N-2k], SUM[N-2k-1:N-3k], ..., plus the carry]
N: bit width, k: ½ carry-chain depth (each sub-adder is 2k bits wide)
Probability of correct result:
\[ P(N,k) = \left(1 - \frac{2^{k}-1}{2^{2k+1}}\right)^{N/k-2} \]
Approximate adder can be configured with “k”
Estimation over CLA (N=16)
  k                  2      3      4      5      6
  Min. clock cycle   0.50   0.65   0.75   0.83   0.89
  Area               0.87   1.05   1.12   1.15   1.12
  Power              0.44   0.68   0.84   0.95   1.00
  Pass rate          0.554  0.829  0.942  0.982  0.995
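The pass rate row for N=16 can be checked numerically. The sketch below assumes the closed form P(N,k) = (1 - (2^k - 1)/2^(2k+1))^(N/k - 2); that expression is a reconstruction which matches the pass rates listed for k = 2 to 6, so treat it as an assumption rather than the authors' exact notation. The function name `pass_rate` is hypothetical.

```python
def pass_rate(n: int, k: int) -> float:
    """Estimated probability of a correct sum for an n-bit adder with parameter k.

    Assumed model: each of the (n/k - 2) approximated segment boundaries fails
    independently with probability (2**k - 1) / 2**(2*k + 1).
    """
    per_boundary_error = (2**k - 1) / 2 ** (2 * k + 1)
    return (1 - per_boundary_error) ** (n / k - 2)


if __name__ == "__main__":
    # Print the estimate next to each k used in the N=16 table.
    for k in (2, 3, 4, 5, 6):
        print(k, round(pass_rate(16, k), 3))
```

Larger k lengthens the carry chain each sub-adder sees, so the pass rate rises toward 1 while the clock-cycle and power advantages over the CLA shrink.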
-10-
Error Detection and Correction
[Figure: approximate adder (sub-adder i, sub-adder i+1) takes IN and produces SUMapprox; an EDC circuit with an incrementor produces SUMcorrect at OUT; signals shown: carry(i+1), sum(i), error(i), error, data stall]
Error can be detected and corrected with small overhead
  Error detection: AND gates
  Error correction: incrementor circuit
Error detection and correction can take more time than the critical path delay of a sub-adder, so throughput can be reduced
  Variable-latency operation
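For the 16-bit, three-sub-adder case, AND-gate detection and incrementor correction could look like the sketch below. The exact signals are an assumption, not taken from the slides: an error is flagged when the low sub-adder produces a carry out and all four bits of an overlap region are in propagate state (a XOR b = 1). All function names are hypothetical.

```python
def aca16_add(a: int, b: int) -> int:
    """Approximate 16-bit addition with three overlapping 8-bit sub-adders."""
    sum_l = (a & 0xFF) + (b & 0xFF)
    sum_m = ((a >> 4) & 0xFF) + ((b >> 4) & 0xFF)
    sum_h = ((a >> 8) & 0xFF) + ((b >> 8) & 0xFF)
    return (sum_l & 0xFF) | (((sum_m >> 4) & 0xF) << 8) | (((sum_h >> 4) & 0x1F) << 12)


def all_propagate(a: int, b: int, lo: int, width: int = 4) -> bool:
    """True when every bit in [lo+width-1:lo] would pass an incoming carry along."""
    mask = ((1 << width) - 1) << lo
    return ((a ^ b) & mask) == mask


def aca16_add_edc(a: int, b: int):
    """Return (corrected sum, error flag); the flag models the extra-latency case."""
    a &= 0xFFFF
    b &= 0xFFFF
    approx = aca16_add(a, b)
    carry_l = ((a & 0xFF) + (b & 0xFF)) >> 8          # exact carry into bit 8
    err_m = bool(carry_l) and all_propagate(a, b, 4)  # SUM[11:8] missed a carry
    err_h = bool(carry_l) and all_propagate(a, b, 8)  # SUM[16:12] missed a carry
    corrected = approx
    if err_m:
        # Incrementor on the middle result segment (wraps within 4 bits).
        nibble = ((corrected >> 8) + 1) & 0xF
        corrected = (corrected & ~(0xF << 8)) | (nibble << 8)
    if err_h:
        # Incrementor on the high result segment, including the carry-out bit.
        corrected += 1 << 12
    return corrected, err_m or err_h
```

The returned flag stands in for the stall signal: when it is set, the corrected sum needs the additional correction step, which is the variable-latency behavior noted above.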
-11-
Accuracy Configuration with Pipeline
[Figure: 4-stage pipeline. Stage 1: approximate adder on A and B produces SUM segments S3 S2 S1 S0. Stage 2: errors on S1, correction on S1. Stage 3: errors on S2, correction on S2. Stage 4: errors on S3, correction on S3, giving SUMcorrect. After each stage more segments are correct and fewer are approximate. Stages 2, 3 and 4 are each marked with power gating.]
  Config.   Power-gating      Accuracy   Power reduction
  Mode-1    None              1.000      -11.5%
  Mode-2    Stage 4           0.960      12.4%
  Mode-3    Stages 3, 4       0.925      31.0%
  Mode-4    Stages 2, 3, 4    0.900      51.6%
Each stage generates a result with different accuracy
Later stages can be turned off with power gating according to the accuracy requirement
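A runtime controller could pick the mode from the table above as follows. This is a minimal sketch: the accuracy and power-reduction figures are copied from the slide, while the selection policy (largest power reduction that still meets the requirement) and the names `MODES` and `select_mode` are assumptions for illustration.

```python
# (name, power-gated stages, accuracy, power reduction in percent), from the slide.
MODES = [
    ("Mode-1", (), 1.000, -11.5),
    ("Mode-2", (4,), 0.960, 12.4),
    ("Mode-3", (3, 4), 0.925, 31.0),
    ("Mode-4", (2, 3, 4), 0.900, 51.6),
]


def select_mode(required_accuracy: float):
    """Pick the mode with the largest power reduction that meets the accuracy."""
    candidates = [mode for mode in MODES if mode[2] >= required_accuracy]
    if not candidates:
        raise ValueError("no mode reaches the required accuracy")
    return max(candidates, key=lambda mode: mode[3])


if __name__ == "__main__":
    for target in (1.00, 0.95, 0.93, 0.90):
        name, gated, accuracy, saving = select_mode(target)
        print(target, name, gated, accuracy, saving)
```

Note that Mode-1 has a negative power reduction, so full accuracy costs more power than the reference in this table; the savings come from the time spent in Modes 2 to 4.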
-13-
Experimental Setup and Metrics
Experimental Setup
  Library: TSMC 65GP
  Implementation: Synopsys Design Compiler
  Simulation: Cadence NC-SIM
  Input patterns: random data and actual data
  Library preparation: Cadence Library Characterizer
Accuracy Metrics
  Metric    Definition          Data type
  ACCamp    1 - |Rc - Re|/Rc    Amplitude data
  ACCinf    1 - Be/Bw           Information data
  Rc and Re: correct and obtained results
  Be: number of error bits, Bw: bit-width of data
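The two metrics can be written directly from their definitions. In this sketch the number of error bits Be is assumed to be the count of bit positions where the correct and obtained results differ; the slides do not spell that out, and the function names are hypothetical.

```python
def acc_amp(rc: int, re: int) -> float:
    """ACCamp = 1 - |Rc - Re| / Rc, for amplitude data (Rc must be nonzero)."""
    if rc == 0:
        raise ValueError("ACCamp is undefined for Rc = 0")
    return 1 - abs(rc - re) / rc


def acc_inf(rc: int, re: int, bw: int) -> float:
    """ACCinf = 1 - Be / Bw, with Be assumed to be the number of differing bits."""
    diff = (rc ^ re) & ((1 << bw) - 1)  # keep only the Bw result bits
    be = bin(diff).count("1")
    return 1 - be / bw


if __name__ == "__main__":
    # One flipped high-order bit: large amplitude error, small information error.
    print(acc_amp(0x100, 0x000), acc_inf(0x100, 0x000, 16))
```

The two metrics can disagree sharply: a single wrong high-order bit (0x100 obtained as 0x000 in 16 bits) gives ACCamp = 0 but ACCinf = 0.9375, which is why the data type determines which metric applies.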
-14-
Approximate Adder Comparison
Accuracy vs. power consumption
Image smoothing (Gaussian filter)
  (a) Original image
  (b) Accurate adder
  (c) ACA (PSNR 24.5 dB)
  (d) ETAI (25.3 dB)
  (e) ETAII (16.2 dB)
  (f) LU (11.1 dB)
(c) to (f) use 50% of the power of the accurate adder (b)
* ETAI cannot detect and correct errors
[Figure: image panels (a) to (f)]
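The PSNR values quoted for panels (c) to (f) compare each filtered image with a reference image. The slides do not say how PSNR was computed, so the sketch below uses the standard definition for 8-bit pixels, 10·log10(MAX² / MSE); the function name `psnr` and the flat pixel lists are illustrative assumptions.

```python
import math


def psnr(reference, test, max_value: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value**2 / mse)


if __name__ == "__main__":
    # Hypothetical 4-pixel example, not data from the slides.
    print(psnr([10, 20, 30, 40], [12, 18, 30, 41]))
```

Higher PSNR means the approximate output is closer to the reference, so at equal power the ordering on the slide is ETAI, ACA, ETAII, LU.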
-15-
Approximate Adder Comparison
Accuracy vs. power consumption w/ voltage scaling
Voltage scaling (1.0V~0.6V)
[Figure, two plots: ACCamp vs. total power (W) and ACCinf vs. total power (W); accuracy axis 0.400 to 1.000, power axis 2.00E-04 to 8.00E-04; series in the ACCamp plot: ACA adder, CLA, Lu's adder, ETAI; series in the ACCinf plot: ACA adder, CLA, Lu's adder, ETAI, ETAIIM]
ACA adder shows good accuracy vs. power results on both the ACCamp and ACCinf metrics
-16-
Accuracy Configuration and Power Saving
Power saving from voltage scaling + mode change
4-stage 32-bit adder case
[Figure: total power consumption (W), 0.00E+00 to 4.00E-03, vs. ACCinf, 0.80 to 1.00; series: conventional pipelined adder, ACA adder (mode 1), ACA adder (mode 2), ACA adder (mode 3), ACA adder (mode 4); annotations: accurate result, mode change, voltage scaling, 4X reduction, accuracy 1.0 → 0.9]
Accuracy configuration w/ mode change is more effective than w/ voltage scaling
-17-
Accuracy Configuration and Power Saving
Power consumption when accuracy requirement is varying (w/ SPEC 2006 benchmarks)
[Figure: normalized power consumption (0 to 1) per benchmark (astar, bzip2, calculix, gcc, h264ref, mcf, sjeng, soplex); series: mode-4, mode-3, mode-2, mode-1; accuracy scale from 0.95 to 1.00 (high accuracy)]
\[ \text{Accuracy} = 1 - \mathrm{Avg}\left(\frac{|\text{result} - \text{reference}|}{\text{reference}}\right) \]
Average 30% power savings over no accuracy configuration
-19-
Conclusions and Ongoing Works
Conclusions
  We proposed an accuracy-configurable approximate (ACA) adder, which can adapt to changing accuracy requirements
  ACA can provide 30% power reduction with accuracy configuration during runtime
Ongoing Works
  Accuracy-configurable design for other arithmetic units (multiplier, divider)
  Automated synthesis flow (minimize power under the required accuracy)
[Figure: synthesis flow; inputs: RTL, required accuracy; blocks: exact adder, approximate adder, synthesis, accuracy estimation]
-20-
Thank You!
-21-
Accuracy-Configurable Approximate Design
Required accuracy can change during runtime
Idea of High-Efficiency Math, highlighted by Intel Labs at ISSCC 2012
  Variable-precision floating point unit w/ accuracy tracking: 24-bit, 12-bit, 6-bit as needed
[Figure: normalized power (scale up to 1.0) over time for an accurate design and an accuracy-configurable design; required accuracy levels along the time axis are 80%, 100%, 90%, 80%; accurate mode vs. approximate mode; label: variable-precision mantissa]
Accuracy-configurable design adapts to changing requirements, maximizing the benefits of the approximate design paradigm