A 0.5 V 2.5 µW/MHz Microcontroller with Analog-Assisted Adaptive Body Bias PVT
Compensation with 3.13 nW/kB SRAM Retention in 55 nm Deeply-Depleted Channel CMOS
Marc Pons∗, Christoph Thomas Muller∗†
David Ruffieux∗, Jean-Luc Nagel∗, Stephane Emery∗, Andreas Burg†, Shuuji Tanahashi‡, Yoshitaka Tanaka‡, Atsushi Takeuchi‡
∗CSEM SA, Neuchatel, Switzerland {christoph.mueller, marc.pons, david.ruffieux, jean-luc.nagel, stephane.emery}@csem.ch
†EPFL, Lausanne, Switzerland {christoph.mueller, andreas.burg}@epfl.ch
‡Mie Fujitsu Semiconductor Limited, Yokohama, Japan {tanahashi.shuuj, yoshi-tanaka, a-takeuchi}@jp.fujitsu.com
Abstract—Microcontroller systems operating at low supply voltage in the near- or sub-threshold regime suffer both from increased effects of PVT (Process, Voltage, Temperature) variation and from a larger share of leakage in the overall power due to the reduced frequency. We show how to overcome these effects for the core and memory by exploiting the strong body factor of deeply-depleted channel CMOS at 0.5 V, compensating frequency over PVT to ±6% and achieving 30x frequency and 20x leakage scaling in a 2.56 µW/MHz 32 bit RISC Core with 3.13 nW/kB 2.5 µW/MHz SRAM. Frequency-leakage configurability in core and SRAM through adaptive body bias at fixed supply voltage is implemented using a novel automatic analog-assisted ION-controlled approach.
Index Terms—Low Power, Microcontroller, SoC, PVT Compensation, Body Bias, Adaptive Body Bias
I. INTRODUCTION
IoT systems require battery-powered devices that operate autonomously for multiple years. Digital systems have attempted to accommodate this requirement by downscaling the supply voltage to reduce the active power consumption, but supply voltage reduction in general-purpose microcontroller (µC) systems (core+memory) poses the following challenges: 1) the increased impact of PVT variations limits the operating frequency, and 2) leakage-dominated consumption inhibits power scaling with frequency. State-of-the-art systems have attempted to overcome these shortcomings. The techniques in [1], [2] can eliminate frequency guard bands due to PVT variations, yet they require dual sampling when reading memory data or are based on a critical-path replica sensor. Dynamic leakage suppression logic limits the system frequency to the Hz range unless bypass transistors are added [3].
This work exploits the strong body factor ∆VT/∆VBB of 55 nm DDC (Deeply-Depleted Channel, transistor cross section depicted in Figure 1) CMOS. At 375 mV/V, the body factor of DDC by far exceeds the 85 mV/V back-gate tuning sensitivity of FDSOI. When sweeping the body terminal over −1 V < VBB < 0.6 V, a VT tuning range of 0.25 V < VT < 0.85 V is realised with negligible impact on power, as body currents are up to 3 orders of magnitude below the supply currents
[Figure 1: transistor cross section. Labeled regions: gate, gate oxide, source, drain, depleted channel, VT setting offset layer, screening layer, anti-punch-through layer, body.]
Fig. 1. Deeply-Depleted Channel (DDC) CMOS transistor cross section.
[Figure 2: bit-cell SRAM leakage (A, logarithmic axis from 1e-15 to 1e-7) versus VBB (V, from −1 to 0.6). Curves: TT, FF and SS bit-cell currents and TT, FF and SS well currents.]
Fig. 2. SRAM bit-cell measurements across three process corners at 0.5 V and 25 °C when sweeping the body bias over −1 V < VBB < 0.6 V. Well currents (in the body) are higher than bit-cell currents (in the DDC transistors' supply) only for the SS corner at maximum reverse bias, which is never used when considering PVT compensation.
(Figure 2). Compared with bulk, DDC transistors experience lower random dopant fluctuations due to the deeply depleted channel, resulting in a reduced sensitivity to local mismatch, which is also beneficial when lowering the supply voltage.
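As a quick consistency check of the numbers above, the following minimal sketch applies a first-order linear body-factor model (an assumption made here for illustration; the function and constant names are hypothetical). Under that assumption, the 375 mV/V body factor over the 1.6 V bias sweep gives a VT span matching the quoted 0.25 V to 0.85 V range. The FDSOI line reuses the same sweep purely for comparison and does not describe any specific FDSOI bias range.

```python
# First-order assumption: VT shifts linearly with body bias,
# delta_VT = -body_factor * delta_VBB (forward bias lowers VT).
DDC_BODY_FACTOR = 0.375       # V/V, from the text
FDSOI_BODY_FACTOR = 0.085     # V/V, from the text
VBB_MIN, VBB_MAX = -1.0, 0.6  # V, body-bias sweep from the text


def vt_tuning_span(body_factor, vbb_min=VBB_MIN, vbb_max=VBB_MAX):
    """Return the VT span (in V) covered by the VBB sweep, assuming linearity."""
    return body_factor * (vbb_max - vbb_min)


ddc_span = vt_tuning_span(DDC_BODY_FACTOR)
fdsoi_span = vt_tuning_span(FDSOI_BODY_FACTOR)

print(f"DDC VT span over the sweep:   {ddc_span:.3f} V")
print(f"Quoted VT range (0.85 - 0.25): {0.85 - 0.25:.3f} V")
print(f"Same sweep at 85 mV/V:        {fdsoi_span:.3f} V (comparison only)")
```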
Specifically, this paper presents a full 0.5 V µC system (core+memory) that exploits the strong body factor of DDC. The proposed technique differs from classical adaptive body bias techniques by its novel analog-assisted bias regulation. It relies on an ION-current-based bias regulation circuit that automatically adapts the bias voltages to 1) compensate for
* 978-1-5386-9395-7/19/$31.00 ©2019 IEEE
[Figure 3: SRAM SNM butterfly curves in two panels, "Forward biased" and "Reverse biased"; both axes are labeled VDD (V) and run from 0.0 to 0.4.]
Fig. 3. SRAM SNM butterfly curves exhibit clear windows of operation over the full range of body bias (from maximum forward to maximum reverse bias).
[Figure 4: two panels, PMOS and NMOS. Vertical axis: ION (A) of 100 parallel PMOS or NMOS transistors, logarithmic. Horizontal axis: the 27 PVT combinations of process (SS, TT, FF), supply (0.45 V, 0.50 V, 0.55 V) and temperature (−40 °C, 25 °C, 85 °C). Annotations: "slowest corner defines max ION", "fastest corner defines min ION", 203x (PMOS), 296x (NMOS).]
Fig. 4. Measured ION currents through 100 parallel W/L=205/90 nm NMOS and PMOS ULL DDC transistors: TT, SS, FF samples, VGS=VDS=0.5 V±10%, −40 °C, 25 °C, 85 °C. Considering PVT compensation, the VT tuning capability by body bias allows ION scaling of 200x for PMOS and 300x for NMOS.
PVT frequency variation, achieving as low as ±6% variability across PVT, 2) dynamically vary the leakage with operating frequency to optimise the power efficiency for the specific application, with 30x frequency scaling and 20x leakage scaling capability in the core, and 3) implement a memory retention mode to minimise the leakage contribution, reaching down to 3.13 nW/kB for the SRAM.
The remainder of this paper is structured as follows: Section II describes our approach to generating the body voltages based on ION-current regulation, and Section III describes the overall system implemented. Section IV then presents silicon measurement results of the whole system, and Section V concludes the paper.
II. ION BASED ADAPTIVE BODY BIAS GENERATION
We exploit the VT tuning capability to set a constant ION current in the transistors, which controls the frequency linearly according to f = ION/(C·V). At circuit level, this is enabled by an analog-assisted adaptive body bias loop that uses a current reference to adjust the transistors' VT by varying the body bias with respect to PVT conditions, without modifying the supply voltage. It enables >4x core and up to 10x SRAM dynamic power efficiency improvement over state-of-the-art systems.
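The linear relation between ION and frequency can be illustrated with a short sketch. It assumes the first-order model f = ION/(C·V) stated above and a hypothetical effective capacitance of 1 pF, chosen only to make the numbers concrete (it is not a value from the paper). The two target frequencies are the 10 MHz and 0.31 MHz operating points reported in the results section.

```python
import math

VDD = 0.5      # V, supply voltage from the text
C_EFF = 1e-12  # F, hypothetical effective switched capacitance (not from the paper)


def frequency_hz(i_on, c_eff=C_EFF, vdd=VDD):
    """First-order model from the text: f = I_ON / (C * V)."""
    return i_on / (c_eff * vdd)


def i_on_for(frequency, c_eff=C_EFF, vdd=VDD):
    """Invert the model: the I_ON needed to reach a target frequency."""
    return frequency * c_eff * vdd


i_fast = i_on_for(10e6)    # current needed for the 10 MHz operating point
i_slow = i_on_for(0.31e6)  # current needed for the 0.31 MHz operating point
ratio = i_fast / i_slow

print(f"I_ON for 10 MHz:   {i_fast:.3e} A")
print(f"I_ON for 0.31 MHz: {i_slow:.3e} A")
print(f"I_ON ratio between the two points: {ratio:.1f}x")
```

Because C and V cancel in the ratio, the required ION scaling between the two operating points does not depend on the assumed capacitance.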
Fig. 5. Full die photo with annotation of the presented system.
Figure 4 illustrates the ION variation for PMOS and NMOS in the −1 V < VBB < 0.6 V range as a function of PVT. Black dots indicate the case of no bias (VBB=0 V) and the bars span from maximum forward (VBB=0.6 V) to maximum reverse bias (VBB=−1 V). The PVT conditions SS 0.45 V −40 °C and FF 0.55 V 85 °C define the maximum and minimum ION that can be set for all PVT conditions by using body bias, and therefore define the range of frequency compensation.
The maximum and minimum ION correspond respectively to the highest frequency with worst-case leakage and to the lowest frequency with best-case leakage. This allows a “mode of operation” to be defined with an application-specific frequency and the minimum possible leakage consumption.
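The way the two extreme corners bound the usable ION range can be sketched as an interval intersection. The per-corner current ranges below are made-up placeholder values (not measurements from Figure 4), chosen only so that the resulting window is roughly 200x wide; the function name is hypothetical.

```python
# Hypothetical per-corner I_ON ranges in amperes:
# (I_ON at maximum reverse bias, I_ON at maximum forward bias).
# These are placeholder numbers, not data from the paper.
corner_ranges = {
    "SS 0.45V -40C": (1e-10, 2e-6),
    "TT 0.50V 25C": (1e-9, 2e-5),
    "FF 0.55V 85C": (1e-8, 2e-4),
}


def compensation_window(ranges):
    """Return the I_ON window reachable in every corner.

    The lower bound is set by the corner whose minimum current is highest
    (the fastest corner); the upper bound by the corner whose maximum
    current is lowest (the slowest corner).
    """
    low_corner = max(ranges, key=lambda c: ranges[c][0])
    high_corner = min(ranges, key=lambda c: ranges[c][1])
    i_min = ranges[low_corner][0]
    i_max = ranges[high_corner][1]
    if i_min > i_max:
        raise ValueError("no I_ON value is reachable in every corner")
    return (low_corner, i_min), (high_corner, i_max)


(low_corner, i_min), (high_corner, i_max) = compensation_window(corner_ranges)
print(f"min I_ON {i_min:.1e} A set by {low_corner}")
print(f"max I_ON {i_max:.1e} A set by {high_corner}")
print(f"common scaling range: {i_max / i_min:.0f}x")
```

Any reference current inside this window can be reached in all corners by body bias alone, which is what makes a single programmed current usable as a PVT-independent operating point.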
For optimum compensation, with VT tuning uniformly applied to the complete system, a dedicated standard cell library with a single transistor size of W/L=205/90 nm (chosen as a compromise between matching and area/power) was designed. For higher drive strengths, multiple fingers were used. Timing analysis including statistical on-chip variations (SOCV) was performed to derive the library files in all PVT conditions with appropriate biasing conditions for synthesis and timing closure.
A dedicated SRAM has also been designed for bias compensation. A 6T bit-cell from the foundry is used. The bit-cell sizing is optimised for 0.5 V operation with body bias. SNM butterfly curves illustrating clear windows are depicted in Figure 3. The bit-cell area is 0.4256 µm². In the SRAM periphery, sense amplifiers for bit-line reading are replaced by tristate buffers for column selection and robust PVT operation at low voltage. Single-sided bit-line read is used for improved SNM (Figure 6).
III. SYSTEM
Figure 6 shows the integrated system overview and the concept of the body biasing scheme: it uses two digitally programmable currents (8 bit DAC codes). They are fed individually to diode-connected NMOS and PMOS replicas (with the same W/L as in the design). Two OPAMPs adjust VBNW and VBPW respectively. The connections are such that all devices to be compensated, and the replicas, have identical VGS and VBS and therefore follow the same VT tuning. The proof-of-concept system comprises a low-power 32 bit RISC Core combined with 64 kB SRAM, 4 kB latch-based memory, and 4 kB ROM. The core peripherals include Timers, SPI, UART, GPIOs and JTAG on-chip debugging. Ring oscillator
[Figure 6: block diagram. Current Controlled Biasing System: two OPAMP loops, each driven by an 8 bit current DAC (rails labeled 2.5 V and −1 V), generating VBNW and VBPW. Biased blocks: Core Biasing System (32 bit RISC Core, 4 kB Latch RAM, GPIO, JTAG + OCD, SPI, UART, Timer), ROM Biasing System (4 kB ROM), SRAM BC Biasing System (64 kB SRAM in total, built from 4 kB bit-cell arrays with periphery), SRAM Periphery Biasing System, Ring Biasing System (50 rings constructed of standard cells). SRAM detail: bit-cell array biased by VBPW,B and VBNW,B; periphery (row decoder, column decoder, control) biased by VBPW,P and VBNW,P; single-sided read through tri-state buffer.]
Fig. 6. Integrated system overview: Body Bias generation, µC with periphery, and biased SRAM architecture.
[Figure 7, left: functional bias maps at TT 0.5 V 25 °C for the bit-cell (Bitcell VBPW versus Bitcell VBNW, in V) and for the periphery (Periphery VBPW versus Periphery VBNW, in V), with constant-frequency fronts at 0.31, 0.62, 1.25, 2.5, 5, 10 and 20 MHz. Right: for SS 0.45 V −40 °C, TT 0.5 V 25 °C and FF 0.55 V 85 °C, a bit-cell bias map (Bitcell VBPW versus Bitcell VBNW) and a plot of SRAM leakage (A, logarithmic) versus frequency (MHz, logarithmic). Annotated values: SS retention 0.1 µA, Slow mode 0.86 µA, Fast mode 0.08 mA; TT retention 0.4 µA, Slow mode 3.23 µA, Fast mode 0.19 mA; FF retention 4.2 µA, Slow mode 6.30 µA, Fast mode 0.19 mA.]
Fig. 7. Left: Functional SRAM bias conditions in TT 0.5 V 25 °C. The SRAM functionality is defined in a 4D bias design space including periphery and bit-cell biases. Black dots mark minimum leakage points. Right: 64 kB SRAM measurement in extreme and typical PVT. Green dots are working (VBNW, VBPW) pairs. Black lines indicate constant frequency fronts. Black dots indicate the lowest leakage for the given frequency line. Yellow dots are retention (VBNW, VBPW) configurations. The black star is the lowest retention leakage. Fast and Slow mode leakages are plotted against frequency by red and blue lines.
based test circuitry is also included. Figure 5 shows a photo of the chip integrated in 55 nm DDC technology. Five body biasing domains individually cover (1) SRAM bit-cells, (2) SRAM periphery, (3) RISC Core, its peripherals, and the latch memory, (4) ROM and (5) ring oscillators.
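The current-controlled bias loop of Figure 6 can be sketched behaviorally as follows. This is only an illustrative model under stated assumptions, not the implemented circuit: the OPAMP is replaced by a discrete-time integrator, only the NMOS loop is modeled, the replica follows a simple exponential current law, VT shifts linearly with VBB at the 375 mV/V body factor, and the per-corner zero-bias thresholds, the slope factor, the scale current and all names are hypothetical.

```python
import math

GAMMA = 0.375            # V/V, body factor from the text
VGS = 0.5                # V, replica gate-source voltage (nominal supply)
N_UT = 1.5 * 0.02585     # V, assumed slope factor times thermal voltage
I0 = 1e-6                # A, assumed replica current at VGS == VT
VBB_RANGE = (-1.0, 0.6)  # V, bias range from the text

# Hypothetical zero-bias threshold voltages per process corner (V).
VT0 = {"SS": 0.60, "TT": 0.55, "FF": 0.50}


def replica_current(vt0, vbb):
    """Assumed exponential replica model; forward bias (vbb > 0) lowers VT."""
    vt = vt0 - GAMMA * vbb
    return I0 * math.exp((VGS - vt) / N_UT)


def regulate(vt0, i_ref, gain=0.05, steps=200):
    """Integrate the log-current error until the replica carries i_ref."""
    vbb = 0.0
    for _ in range(steps):
        error = math.log(i_ref / replica_current(vt0, vbb))
        vbb += gain * error  # too little current -> more forward bias
        vbb = min(max(vbb, VBB_RANGE[0]), VBB_RANGE[1])
    return vbb


I_REF = 1e-7  # A, stands in for the DAC-programmed reference current
settled = {corner: regulate(vt0, I_REF) for corner, vt0 in VT0.items()}

for corner, vbb in settled.items():
    i_rep = replica_current(VT0[corner], vbb)
    print(f"{corner}: VBB = {vbb:+.3f} V, replica current = {i_rep:.3e} A")
```

In this toy model, each corner settles at a different body bias while the replica current converges to the same reference, which is the behavior the loop relies on for PVT compensation: the faster corner receives more reverse bias, the slower corner more forward bias.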
IV. RESULTS
For the measurements we have defined two extreme modes of operation of the system to illustrate its configurability depending on the application’s leakage versus frequency requirements. The Fast mode of operation accommodates up to 10 MHz operating frequency across all PVT, while the Slow mode reduces the frequency by 30x to 0.31 MHz and also
TABLE I
MEASUREMENTS FOR FAST AND SLOW MODES OF LEAKAGE AND FREQUENCY VARIATION FOR THE 32 BIT RISC CORE IN EXTREME AND TYPICAL PVT CONDITIONS.

Mode            | RISC core leakage (FF, 0.55 V, 85 °C) | RISC core leakage (TT, 0.5 V, 25 °C) | RISC core leakage (SS, 0.45 V, −40 °C) | Frequency variation over PVT
Slow (0.31 MHz) | 0.25 µA                               | 54 nA                                | 14 nA                                  | ±21 %
Fast (10 MHz)   | 4.08 µA                               | 1.19 nA                              | 0.24 nA                                | ±6 %
TABLE II
COMPARISON WITH STATE OF THE ART.

Parameter | This Work | ISSCC 2018 [2] | ISSCC 2018 [4] | ISSCC 2014 [5] | ISSCC 2015 [3] | JSSC 2017 [6]
Technology | 55 nm DDC | 28 nm FDSOI | 180 nm Bulk | 65 nm Bulk | 65 nm Bulk | 40 nm Bulk
Core | 32 bit RISC | LVT core | 16 bit MSP430 | n/a | 32 bit Cortex M0+ | 32 bit Cortex M0
Core dyn (µW/MHz) | 2.56 | n/a | 14 | n/a | 11.7 | 8.8
Core leak (nW) | 27 | n/a | <1 | n/a | 20 | n/a
SRAM (kB) | 64 ULL 6T | n/a | 2 SCM | 16 XLL 6T | 8 LV 10T | 64 6T
SRAM dyn (µW/MHz) | 2.5 | n/a | n/a | 25 | n/a | n/a
SRAM retention (nW/kB) | 3.13 | n/a | n/a | 0.26 | 15 | n/a
SRAM bitcell size (µm²) | 0.425 | n/a | n/a | 2.159 | 3.64 | n/a
Retention (Core + 4 kB SRAM) (nW) | 39.52 | n/a | n/a | n/a | 80 | n/a
Frequency (MHz) | 0.31-10 | 9 MHz target | NM: 0.016-2.8; LSM: 1 Hz-4 Hz | 142.9 | 0.029-66 | 0.8-50
Frequency variation in PVT (%) | Fast: ±6%; Slow: ±21% | ±3.5% | Only die-to-die process 3σ; NM: ±23%; LSM: ±36% | n/a | n/a | n/a
Process measured | SS, TT, FF | Single process | Single process | Single process | Single process | Single process
Supply range | 0.5 V±10% | 0.35-1 V | 0.2 V-1.1 V | 1.2 V | 0.19 V-1.2 V | 0.2 V-0.5 V Core; 0.6 V SRAM
Bias range | −1 V < VBB < 0.6 V | NW: 0 to 1.8 V; PW: −1.5 to 0 V | n/a | −0.3 V < VBB < −0.1 V | n/a | n/a
Temp. range | −40 to 85 °C | −40 to 125 °C | 0 to 45 °C | 25 to 125 °C | 25 to 70 °C | 0 to 70 °C
the leakage by 20x compared to the Fast mode. Measurement results for the RISC Core are given in Table I. PVT compensation enables a residual frequency variation as low as ±6% for the Fast mode and ±21% for the Slow mode. In Slow mode the variation is higher as the transistors' VT is lower and we operate in the sub-threshold region. Yet this is still lower than the 2x frequency guard bands reported for uncompensated designs [4].
The 64 kB SRAM measurements regarding frequency, leakage and retention are shown on the right side of Figure 7 for PVT extreme cases and in typical conditions. FF 0.55 V 85 °C presents the worst-case leakage; therefore, the VBB biases that minimize the leakage in this PVT condition are selected for the Fast and Slow modes. As a consequence, the leakages in TT 0.5 V 25 °C and SS 0.45 V −40 °C are shifted from the optimum (only up to 6% leakage increase in Fast mode compared to the optimum leakage indicated by the black dot). The effect worsens in Slow mode due to PVT variations (up to 60%). In retention mode, the leakage is as low as 0.4 µA for the 64 kB array in typical conditions, corresponding to 3.13 nW/kB. SRAM bias conditions are selected in a 4D design space, as VBB values for PMOS and NMOS in the bit-cell array and in the periphery are required (Figure 7). The current-based PVT compensation simplifies this selection as bias voltages are defined by fixed currents. Table II compares the performance of our system (core+memory) with state-of-the-art implementations. In addition, we have demonstrated the proposed technique on SS, TT and FF chips, showing the robustness of the approach.
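The retention figures quoted above follow from simple arithmetic, reproduced in the sketch below. All inputs are values stated in the text and tables (0.4 µA retention current, 0.5 V supply, 64 kB array, 54 nA Slow-mode core leakage at TT, 3.13 nW/kB); only the variable names are introduced here.

```python
import math

VDD = 0.5              # V, typical supply
I_RETENTION = 0.4e-6   # A, 64 kB SRAM retention current at TT 0.5 V 25 C
SRAM_KB = 64           # kB, SRAM size
I_CORE_SLOW = 54e-9    # A, Slow-mode core leakage at TT (Table I)
SRAM_RET_NW_PER_KB = 3.13  # nW/kB, rounded value quoted in the text

# SRAM retention power per kB, in nW.
sram_ret_nw_per_kb = I_RETENTION * VDD / SRAM_KB * 1e9

# Core leakage power in nW.
core_leak_nw = I_CORE_SLOW * VDD * 1e9

# Retention power of the core plus 4 kB of SRAM, using the rounded figure.
system_ret_nw = core_leak_nw + 4 * SRAM_RET_NW_PER_KB

print(f"SRAM retention: {sram_ret_nw_per_kb:.3f} nW/kB")
print(f"Core leakage:   {core_leak_nw:.1f} nW")
print(f"Core + 4 kB SRAM retention: {system_ret_nw:.2f} nW")
```

The results agree with the 3.13 nW/kB, 27 nW and 39.52 nW entries of Table II, up to the rounding of the per-kB figure.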
V. CONCLUSION
This work combines the benefits of the 55 nm CMOS DDC process with a novel analog-assisted current-based adaptive body bias to mitigate the VT-induced frequency variations caused by PVT variations. This enables wide 30x frequency and 20x leakage scaling without tuning the supply voltage. Further, the design achieves a power efficiency of 2.56 µW/MHz in the core and 2.5 µW/MHz in the SRAM, a >4x and up to 10x improvement respectively over state-of-the-art systems. Furthermore, the achieved 5x bit-cell density improvement and 3.13 nW/kB SRAM retention will be highly beneficial for emerging memory-intensive systems. Considering a core and 4 kB SRAM for comparison, the system retention power is 2x lower than in [3].
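The improvement factors quoted in the conclusion can be cross-checked against the Table II entries, as in the sketch below. The values are copied from Table II; which baseline underlies each quoted factor is an assumption made here, and the dictionary keys are just labels for the compared designs.

```python
# Values copied from Table II; keys are labels for the compared designs.
THIS_WORK = {"core_dyn": 2.56, "sram_dyn": 2.5, "bitcell": 0.425, "retention": 39.52}

core_dyn_others = {"ISSCC 2018, 180 nm": 14.0, "ISSCC 2015, 65 nm": 11.7, "JSSC 2017, 40 nm": 8.8}
sram_dyn_other = 25.0    # µW/MHz, ISSCC 2014 SRAM
bitcell_other = 2.159    # µm², ISSCC 2014 SRAM bit-cell
retention_other = 80.0   # nW, ISSCC 2015 core + 4 kB SRAM

core_ratios = {name: value / THIS_WORK["core_dyn"] for name, value in core_dyn_others.items()}
sram_ratio = sram_dyn_other / THIS_WORK["sram_dyn"]
bitcell_ratio = bitcell_other / THIS_WORK["bitcell"]
retention_ratio = retention_other / THIS_WORK["retention"]

for name, ratio in core_ratios.items():
    print(f"Core dynamic power vs {name}: {ratio:.2f}x")
print(f"SRAM dynamic power ratio: {sram_ratio:.1f}x")
print(f"Bit-cell area ratio:      {bitcell_ratio:.2f}x")
print(f"Retention power ratio:    {retention_ratio:.2f}x")
```

Under this reading, the 10x SRAM, 5x bit-cell and 2x retention factors correspond to the ISSCC 2014 and ISSCC 2015 entries, and the >4x core figure holds for the 65 nm and 180 nm baselines.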
REFERENCES
[1] M. Khayatzadeh, M. Saligane, J. Wang, M. Alioto, D. Blaauw, and D. Sylvester, “17.3 A reconfigurable dual-port memory with error detection and correction in 28nm FDSOI,” in 2016 IEEE International Solid-State Circuits Conference (ISSCC), pp. 310–312, Jan 2016.
[2] A. Quelen, G. Pillonnet, P. Flatresse, and E. Beigne, “A 2.5µW 0.0067mm² automatic back-biasing compensation unit achieving 50% leakage reduction in FDSOI 28nm over 0.35-to-1V VDD range,” in 2018 IEEE International Solid-State Circuits Conference (ISSCC), pp. 304–306, Feb 2018.
[3] J. Myers, A. Savanth, D. Howard, R. Gaddh, P. Prabhat, and D. Flynn, “8.1 An 80nW retention 11.7pJ/cycle active subthreshold ARM Cortex-M0+ subsystem in 65nm CMOS for WSN applications,” in 2015 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, pp. 1–3, Feb 2015.
[4] L. Lin, S. Jain, and M. Alioto, “A 595pW 14pJ/cycle microcontroller with dual-mode standard cells and self-startup for battery-indifferent distributed sensing,” in 2018 IEEE International Solid-State Circuits Conference (ISSCC), pp. 44–46, Feb 2018.
[5] T. Fukuda, K. Kohara, T. Dozaka, Y. Takeyama, T. Midorikawa, K. Hashimoto, I. Wakiyama, S. Miyano, and T. Hojo, “13.4 A 7ns-access-time 25µW/MHz 128kb SRAM for low-power fast wake-up MCU in 65nm CMOS with 27fA/b retention current,” in 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), pp. 236–237, Feb 2014.
[6] H. Reyserhove and W. Dehaene, “A differential transmission gate design flow for minimum energy sub-10-pJ/cycle ARM Cortex-M0 MCUs,” IEEE Journal of Solid-State Circuits, vol. 52, pp. 1904–1914, July 2017.