Kneip, Adrian
[UCL]
Bol, David
[UCL]
Ultra-low-power (ULP) microcontroller units (MCUs) are widely used in IoT edge devices for their versatility. Nonetheless, their embedded foundry high-density SRAM macros increasingly suffer from variability and leakage with technology scaling, making them unfit for ULP operation. Though many ULP SRAM macro architectures have been proposed to cope with these issues, they either fail to reach the MHz range required for effective edge computing, consume excessive access energy during Read/Write (R/W) operations, or achieve poor density. In this work, we propose a 32kB ULP SRAM macro in LVT 28nm FDSOI technology which uses adaptive forward body-biasing (FBB) to compensate for PVT variations and boost R/W performance when the MCU processes data in active mode. In sleep mode, the back-bias voltages are set to 0V to save power. The proposed ULP SRAM macro integrates a custom single-ended 7T-ULP bitcell which uses a latch structure with negative-differential resistances (NDRs) to leverage sub-threshold retention. Boosting the gate voltage of a single Write-side PMOS under retention allows the bitcell to be fully scaled down to a 32nm gate length while achieving 0.03ppm stability in sleep mode. A Write-assist mechanism handles the weak Write-0 operation. We obtain a post-layout 8.3x reduction in sleep-mode leakage power over previously proposed non-scaled NDR-latch bitcell architectures while reducing the bitcell area by 2.13x without any significant speed penalty. To integrate the 7T-ULP cell, we propose a custom SRAM macro architecture with a single 32nm gate length. Despite the threshold voltage roll-off at 32nm, we limit the idle power of the scaled SRAM macro to 32.1μW thanks to stacked bitline and wordline driver architectures, at the cost of a delay overhead.
We compensate for this delay by 2D-banking the SRAM macro into local arrays (LAs), yielding a 0.03ppm read access time of 3ns over the LA. In addition, this subdivision reduces the average R/W access energy to a mere 5.5fJ/bit at 80MHz. Finally, global power-gating of the periphery in sleep mode yields a total sleep-mode leakage of 0.49pW/bit over the full macro. Compared to state-of-the-art ULP SRAM macros, our design shows the lowest sleep leakage power per bit and the lowest access energy per bit at 80MHz, while bringing the area down by a factor of two compared to previous NDR-latch-based ULP SRAM macros.
Bibliographic reference
Kneip, Adrian. Design and optimization of an ultra-low power SRAM macro for the Internet-of-Things. Ecole polytechnique de Louvain, Université catholique de Louvain, 2019. Prom. : Bol, David.
Permanent URL
http://hdl.handle.net/2078.1/thesis:19513