

UNIVERSITÉ CATHOLIQUE DE LOUVAIN ECOLE POLYTECHNIQUE DE LOUVAIN ICTEAM INSTITUTE

## Switched-Capacitor DC/DC Converters in Nanometer CMOS Technologies for Micro-Power Energy Management

Julien De Vos

Thesis submitted in partial fulfillment of the requirements for the degree of *Docteur en Sciences de l'Ingénieur* 

Dissertation committee:

- Prof. Denis Flandre (ICTEAM Institute, UCL), advisor
- Prof. David Bol (ICTEAM Institute, UCL)
- Prof. Laurent Francis (ICTEAM Institute, UCL)
- Prof. Michiel Steyaert (KU Leuven, Leuven)
- Ir. Sylvain Clerc (STMicroelectronics, Crolles)
- Prof. Danielle Vanhoenacker-Janvier (ICTEAM Institute, UCL), President

December 2013




Université catholique de Louvain Louvain-la-Neuve (Belgium)

To my parents and my sister

# CONTENTS

- Acknowledgments
- Abstract
- Acronyms
- List of notations
- Introduction
  - I.1 Energy autonomous IC design for the Internet-of-Things
  - I.2 Thesis objectives
  - I.3 Thesis outline
- Author's publication list
- 1 Fundamentals of switched-capacitor DC/DC converters
  - 1.1 Introduction
  - 1.2 Overview of switched-capacitor DC/DC converters
    - 1.2.1 Voltage conversion mechanism
    - 1.2.2 Topologies of the switched-capacitor network
  - 1.3 Electrical equivalent model of the switched-capacitor network
    - 1.3.1 Model behavior
    - 1.3.2 Slow-switching frequency behavior
    - 1.3.3 Fast-switching frequency behavior
  - 1.4 Sources of power losses
    - 1.4.1 Linear losses
    - 1.4.2 Bottom-plate losses
    - 1.4.3 Switch-driving losses
    - 1.4.4 Control losses
    - 1.4.5 Voltage-ripple losses
  - 1.5 Voltage regulation with switched-capacitor networks
    - 1.5.1 Frequency control mechanisms
    - 1.5.2 Switch conductance control mechanisms
    - 1.5.3 Capacitor control mechanism
    - 1.5.4 Reconfiguration of the switched-capacitor network
    - 1.5.5 Converter interleaving
  - 1.6 Conclusion
- 2 A sizing methodology for on-chip switched capacitor DC/DC converters
  - 2.1 Introduction
  - 2.2 Proposed sizing methodology
    - 2.2.1 Topology selection
    - 2.2.2 Capacitor sizing
    - 2.2.3 Relative switch sizing
    - 2.2.4 Partitioning of $Z_{out}$ between $Z_{SSL}$ and $Z_{FSL}$
    - 2.2.5 Selection of switching frequency and total switch width
    - 2.2.6 Losses and efficiency calculation
  - 2.3 Validation of the methodology
    - 2.3.1 Validation of the output impedance and losses models
    - 2.3.2 Sizing methodology validation against an exhaustive search for the optimum
  - 2.4 Discussion and exploitation of the methodology
  - 2.5 Conclusion
- 3 Multi-mode DC/DC converters for dynamic power management
  - 3.1 Introduction
  - 3.2 A $0.13\,\mu$m dual-mode SC DC/DC converter for duty-cycled microcontrollers
    - 3.2.1 MP mode
    - 3.2.2 ULP mode
    - 3.2.3 Experimental validation
  - 3.3 A 28nm FD-SOI multi-mode SC DC/DC converter for DVFS microcontrollers
    - 3.3.1 Proposed SC DC/DC converter architecture
    - 3.3.2 Simulation results
  - 3.4 Conclusion
- 4 Pushing adaptive voltage scaling fully on chip
  - 4.1 Introduction
  - 4.2 Impact of PVT variations on low-power circuits
    - 4.2.1 Definition of the low-power benchmark circuits
    - 4.2.2 Impact of the PVT variations
  - 4.3 State-of-the-art of AVS systems
    - 4.3.1 The timing sensor
    - 4.3.2 The controller
    - 4.3.3 The clock generator
    - 4.3.4 The DC/DC converter
  - 4.4 Proposed on-chip AVS system with an SC DC/DC converter
    - 4.4.1 Description of the AVS system
    - 4.4.2 Jitter on the generated clock
    - 4.4.3 Ripple-induced $V_{DD}$ guard band
    - 4.4.4 Local variation induced $V_{DD}$ guard band
  - 4.5 Practical AVS implementation
    - 4.5.1 AVS controller
    - 4.5.2 DC/DC converter
    - 4.5.3 Variable-length ring oscillator
    - 4.5.4 AVS stability
  - 4.6 Experimental validation
  - 4.7 Conclusions
- 5 A single-converter power management unit for energy-harvesting wireless sensor nodes
  - 5.1 Introduction
  - 5.2 State of the art of energy-harvesting power management units
  - 5.3 A single-converter PMU for energy harvesting
    - 5.3.1 PMU architecture
    - 5.3.2 PMU controller
    - 5.3.3 Voltage reference
    - 5.3.4 Power on reset
    - 5.3.5 Voltage comparators
    - 5.3.6 SC DC/DC converter
  - 5.4 System validation
  - 5.5 Conclusions
- Conclusions and perspectives
- References
- Appendix A: KXOB22-12X1 macro model

## ACKNOWLEDGMENTS

I believe that, to complete a PhD, one must be able to dream and to follow a white rabbit with no sense of punctuality. A bit like Alice in Wonderland, one then falls down an endless well and wakes up in a strange world, that of one's research field, which is familiar to only a handful of people on Earth. Alone and feeling one's way, one sets out to explore this world and find the way back. Fortunately, like Alice, one meets many people who help along the adventure. I would like to thank all of them here, and to apologize in advance for anyone I may have forgotten.

First of all, I would like to thank my advisor, Professor Denis Flandre, for convincing me to embark on this adventure and to apply for an FNRS fellowship a little less than five years ago. I would also like to thank him for his proofreading and comments, even when I handed him a draft shortly before the deadline. Thank you as well for the freedom I was given in my research, and for letting me try my own experiments. I also wish to thank Professor David Bol, who likewise helped convince me to undertake this thesis (those who know me know that I am stubborn, and a PhD was not part of my plans after my engineering degree, but never say never...). Thank you for following my work day by day, for answering a multitude of questions and for making it possible for me to spend a few months in Crolles. Thank you again for the many proofreadings, for the discussions, technical or otherwise, and for your patience. Finally, I wish to express my gratitude to Professor Laurent Francis, who agreed to be part of my supervisory committee. Our meetings allowed me to take the necessary step back from my work.

I am also grateful to the members of my thesis jury: Professor Michiel Steyaert, who gave me the opportunity to confront my results with a true expert in the field, and Mr. Sylvain Clerc, who made sure I felt at home during my stay in Crolles. I would also like to thank Professor Danielle Vanhoenacker-Janvier, who agreed to chair my thesis defense.

I would then like to thank my office mates for the good atmosphere and the time spent together: Cédric Hocquet (perhaps our paths will cross a third time?), Angelo Kuti Lusala, who kept me company on some long weekend evenings, Dina Kamel, François Durvaux, Sébastien Bernard, Guerric de Streel (do not take this as an opportunity to spread your papers over my desk!) and François Botman (for proofreading my English and for letting me tease him more than was reasonable). I would also like to thank my colleagues at ICTEAM and UCL for the good working atmosphere and the good times shared: Adeline Decuyper, Aline Emplit, Andra Iordanescu, El Hafed Boufouss, Emilie Renard, François Baudart, Geoffroy Gosset, Guillaume Beckers, Guillaume Pollissard, Jonathan Denies, Numa Couniot, Pierre-Antoinne Haddad, Thomas Wallewyns. Thanks also to the administrative team: Anne Adant, Isabelle Dargent, Corinne de Potter d'Indoye, Christel Derzelle, Nathalie Ponet, Viviane Sauvage, and to the WELCOME team: Pascal Simon and Pierre Gérard. I also thank Myriam Banaï of the AILV for her listening and her advice during a period of doubt. Finally, I do not forget Brigitte Dupont for the always very fast IT support and for the network on which it was a pleasure to make a few test *flights*.

I thank the Fonds National de la Recherche Scientifique of Belgium for funding my research.

I would also like to thank my parents for their support during these four years of thesis work and, before that, for teaching me certain values that made me who I am today. Thanks also to my family and friends for their trust and encouragement.

Finally, I wish to thank Pia for her support, her encouragement, and her everlasting enthusiasm even in the more difficult moments. Thank you also for being who you are, and for this beautiful life we are building together.

Julien

# ABSTRACT

The development of wireless sensor nodes (WSNs) and the rise of the Internet-of-Things (IoT) are driving research on ultra-low-power integrated circuits. Together with the recent development of micro-energy harvesters, this enables energy-autonomous designs. Nevertheless, such energy harvesters provide only a small power budget, so the node circuits must be heavily duty-cycled.

The power management unit (PMU) of an energy-autonomous system must therefore be designed carefully. A high power-conversion efficiency is required in the DC/DC converters between the energy harvesters, the energy storage devices and the system components. Because of the highly duty-cycled operation, most functions of the system are idle most of the time, and the PMU must be power efficient not only at typical loads ($100\,\mu$W to 10 mW) but also at ultra-low loads ($\ll 10\,\mu$W). Furthermore, as the availability of environmental energy can hardly be predicted, robust power management units with a storage device must be designed to ensure that the system does not run out of energy. Finally, the PMU must use as few external components as possible to sustain the deployment of trillions of IoT nodes. To ease the development of such power management units, this dissertation focuses on answering two questions:

- How can we design efficient on-chip DC/DC converters to supply both ultra-low and multi-mode loads?
- How can we integrate these converters in power management units to address the requirements of energy-autonomous systems?

To answer these two questions, we propose analytical developments supported by simulation and experimental results. We first show that switched-capacitor (SC) DC/DC converters are good candidates to address the power-conversion issue because they can efficiently supply low loads and can be integrated on-chip. We then answer the first question in two steps.

First, we propose a model of both the output impedance and the losses of SC DC/DC converters. Based on this model, we develop a practical and systematic sizing methodology that gives an analytical solution for the optimum switching frequency, switch sizes and capacitor sizes, i.e. those that yield the best power-conversion efficiency for given die area and load power requirements. We further demonstrate that there is a design trade-off between the area occupied by the converter and its conversion efficiency.

Second, we propose specific designs that enable high conversion efficiency over an extremely wide load range of more than three orders of magnitude. We show that adaptive techniques such as body biasing, clock generation and switch sizing can cut down the energy consumed by the converter when the load power consumption drops because the load enters a low-workload or stand-by mode. We also show that a proper selection of the transistor type and of the switch gate length allows the converter to be used as a power-gating module.

The second question is answered in two parts. First, digital loads such as microcontrollers are important contributors to the total power consumption of energy-autonomous systems. One way to reduce their power consumption is to supply them at an ultra-low voltage generated on-chip. However, the minimum supply voltage at which a digital circuit sustains a target clock frequency depends strongly on the temperature and on the manufacturing process corner, which leads to a large supply-voltage guard band and thus to extra power consumption. We propose a power management unit that monitors the clock frequency and generates the supply voltage of such digital circuits. It uses an SC DC/DC converter with an adaptive voltage scaling controller to deliver the smallest supply voltage that allows the digital load to operate safely at its target frequency. The delivered supply voltage is automatically adapted to process corners and temperature fluctuations, which reduces the power consumption of the digital circuit by up to 44%.

Finally, energy-autonomous systems require several DC/DC converters to convert the energy provided by the energy harvesters, to send energy to the storage device, and to supply the loads. Multiplying the voltage converters adds to the energy losses and either threatens the energy robustness of the system or increases the energy harvester area. We therefore propose a complete power management unit that uses only one SC DC/DC converter both to regulate a 1 V supply from the energy harvesters and to send charge to the storage device.

# ACRONYMS

| Acronym | Meaning                                 |
|--------|-----------------------------------------|
| AVS    | Adaptive Voltage Scaling                |
| CMOS   | Complementary Metal-Oxide Semiconductor |
| CPR    | Critical-Path Replica                   |
| DCO    | Digitally-Controlled Oscillator         |
| DVFS   | Dynamic Voltage and Frequency Scaling   |
| FD-SOI | Fully Depleted Silicon-On-Insulator     |
| FSL    | Fast Switching Limit                    |
| IC     | Integrated Circuit                      |
| IoT    | Internet-of-Things                      |
| MPPT   | Maximum Power Point Tracking            |
| NOC    | Non-Overlapping Clock                   |
| PVT    | Process, Voltage and Temperature        |
| PFM    | Pulse Frequency Modulation              |
| PMU    | Power Management Unit                   |
| POR    | Power On Reset                          |
| PSM    | Pulse Skipped Modulation                |
| PWM    | Pulse Width Modulation                  |
| TC     | Transfer Capacitor                      |
| RO     | Ring Oscillator                         |
| SAR    | Successive Approximation Register       |
| SC     | Switched-Capacitor                      |
| SCN    | Switched-Capacitor Network              |
| SoC    | System-on-Chip                          |
| SOI    | Silicon-On-Insulator                    |
| SSL    | Slow Switching Limit                    |
| ULP    | Ultra-Low-Power                         |
| ULV    | Ultra-Low-Voltage                       |
| WSN    | Wireless Sensor Node                    |

# LIST OF NOTATIONS

| Symbol | Description | Unit |
|--------|-------------|------|
| $a_c$ | Vector of the normalized charge flow through the transfer capacitors | [/] |
| $a_r$ | Vector of the normalized charge flow through the power switches | [/] |
| $\alpha_{BP}$ | Ratio between parasitic capacitance and useful $C_{Ti}$ | [/] |
| $\beta$ | Variable regrouping all the terms in the expression of the driving losses that are technology- and topology-dependent: $\beta = \epsilon C_{unit} V_{dd}^2$ | $[J/\mu m]$ |
| $C_{cont}$ | Average capacitance switched every clock cycle in the control circuits | [F] |
| $C_T$ | Transfer capacitance | [F] |
| $C_{T,tot}$ | Sum of the transfer capacitances | [F] |
| $C_{out}$ | Output filtering capacitance | [F] |
| $C_{unit}$ | Power switch gate capacitance per width unit | $[F/\mu m]$ |
| $\Delta V$ | Difference between $V_{out}$ and $V_{nl}$ of an SC DC/DC converter | [V] |
| $\Delta V_{BPi}$ | Voltage swing at the bottom plate of the $i^{th}$ $C_{Ti}$ at a phase change | [V] |
| $\epsilon$ | Fitting parameter for $P_{driving}$ | [/] |
| $\eta$ | Efficiency of the DC/DC converter | [/] |
| $\eta_{max}$ | Maximal theoretical efficiency of the SC DC/DC converter | [/] |
| $E_{in}$ | Energy supplied by the energy source | [J] |
| $E_{out}$ | Energy supplied to the load | [J] |
| $\gamma$ | Variable regrouping all the terms in the expression of the bottom-plate losses that are technology- and topology-dependent: $\gamma = C_{T,tot} \alpha \sum_{i} (c_i \Delta V_{BPi}^2)$ | [J] |
| $G_{i0}$ | Conductance per $\mu m$ of the power switches | $[S/\mu m]$ |
| $f_{CLK}$ | Clock frequency of a microcontroller | [Hz] |
| $f_{MAX}$ | Maximal clock frequency of a microcontroller without timing failure | [Hz] |
| $f_t$ | Switching frequency for which the slow and fast switching limit impedances have the same value | [Hz] |
| $f_{TARGET}$ | Target frequency | [Hz] |
| $f_{sw}$ | Switching frequency | [Hz] |
| $I_{leak,control}$ | Leakage current in the control circuits | [A] |
| $I_{out}$ | Output current | [A] |
| $K_{int}$ | Number of interleaved converters | [/] |
| $k_n$ | Parameter for the evaluation of NMOS switch conductance: $\frac{C_{ox} \times \mu_n}{L}$ | $[F/(V \cdot s \cdot \mu m)]$ |
| $k_p$ | Parameter for the evaluation of PMOS switch conductance: $\frac{C_{ox} \times \mu_p}{L}$ | $[F/(V \cdot s \cdot \mu m)]$ |
| $\kappa$ | Variable regrouping all the terms in the expression of the slow switching limit impedance that are technology- and topology-dependent: $\kappa = \sum_{i} \frac{a_{ci}^2}{C_{Ti}}$ | $[F^{-1}]$ |
| $m$ | Ideal conversion ratio of an SC DC/DC converter | [/] |
| $\phi 1$ | Fraction of the switching period where the capacitor network is in its first phase | [/] |
| $\phi 2$ | Fraction of the switching period where the capacitor network is in its second phase | [/] |
| $P_{BP}$ | Bottom plate losses | [W] |
| $P_{control}$ | Controller losses | [W] |
| $P_{driving}$ | Driving losses | [W] |
| $P_{in}$ | Power provided by the energy source | [W] |
| $P_{lin}$ | Linear losses | [W] |
| $P_{out}$ | Output power | [W] |
| $P_{ripple}$ | Losses inherent to the ripple voltage | [W] |
| $\theta$ | Variable regrouping all the terms in the expression of the fast switching limit impedance that are technology- and topology-dependent: $\theta = 2 \times \sum_{i} \frac{a_{ri}^2}{G_{i0} \times s_i}$ | $[\Omega \cdot \mu m]$ |
| $s_i$ | Ratio between the width of the $i^{th}$ power switch and $W_{tot}$ | [/] |
| $T_{TARGET}$ | Target clock period | [s] |
| $V_{BAT}$ | Battery voltage | [V] |
| $V_{SC}$ | Voltage seen by the storage supercapacitor | [V] |
| $V_{DD,min}$ | Minimum $V_{out}$ to ensure safe operation of the load circuits | [V] |
| $V_{DDbuf}$ | Supply voltage of the switch drivers | [V] |
| $V_{DS}$ | Drain to source voltage of a MOSFET device | [V] |
| $V_{GS}$ | Gate to source voltage of a MOSFET device | [V] |
| $V_{in}$ | Input voltage | [V] |
| $V_{MAX}$ | Supply voltage of the PMU gate drivers; it is the highest of $V_{reg}$ and $V_{SC}$ | [V] |
| $V_{nl}$ | No-load voltage | [V] |
| $V_{reg}$ | Regulated voltage | [V] |
| $V_T$ | Threshold voltage of MOS device | [V] |
| $V_{out}$ | Output voltage | [V] |
| $W_{Si}$ | Width of a power switch | $[\mu m]$ |
| $W_{S,tot}$ | Sum of all the power switch widths | $[\mu m]$ |
| $Z_{FSL}$ | Output impedance in FSL regime | $[\Omega]$ |
| $Z_{SSL}$ | Output impedance in SSL regime | $[\Omega]$ |
| $Z_{out}$ | Output impedance | $[\Omega]$ |

## INTRODUCTION

In the early 1960s, Joseph Carl Robnett Licklider had the vision of a computer network easing information communication between people [1]. His vision became known years later as the "Internet" and is now ubiquitous in everyone's life. The Internet allows people across the world to communicate instantaneously and is today the world's largest database. However, the development of Licklider's vision has not yet come to an end: the future of the Internet is to enable communication between objects organized in wireless sensor networks (WSNs), making them able to gather and process information and to interact with their environment without human assistance. This new paradigm is known as the "Internet-of-Things" (IoT). The rise of the Internet-of-Things pushes ahead the research effort in ultra-low-power integrated circuit design. Both industries and end users are eager for new exciting applications such as structural health monitoring in remote areas where no power grid can easily be drawn, or battery-less controllers for home automation or multimedia applications [2], [3]. To implement this vision, a whole new family of electronic systems is coming to life: energy-autonomous systems [4, 5, 6, 7], for which new integrated circuits (ICs) are required.

In this general introduction, we briefly introduce the concepts that motivated this work: the Internet-of-Things and the development of energy autonomous systems, before defining the objectives of this thesis and giving the outline of the text.

## I.1 ENERGY AUTONOMOUS IC DESIGN FOR THE INTERNET-OF-THINGS

As shown in Fig. I.1, a typical IoT node has four main parts: sensors to collect data from the environment such as pressure, temperature, humidity, or movement; a radio transceiver to communicate with a base station or other nodes; a microcontroller to process the collected data, to control the node, and to handle the wireless communication protocol; and a power management unit to supply the node circuits from the battery or from environmental energy harvesters [2], [8]. In order to implement the future of Licklider's vision, trillions of IoT nodes are expected to be deployed in the near future [9].

To sustain such massive deployment, the design of an IoT node must meet three main constraints. The first constraint is that the node must have a small form factor [2] and a low bill of material to reduce its cost and its manufacturing carbon footprint for environmental sustainability [5, 9]. This requires integrating most of the node circuitry on-chip and limiting the use of bulky discrete components such as inductors.




Fig. I.1. Typical IoT node architecture.

The second constraint is that the node must be energy autonomous [2], [7]. This is required because such nodes are not connected to the grid, to ease their deployment, and their battery has to last for the device lifetime to avoid prohibitive maintenance costs. Energy-autonomous design is enabled by the recent development of micro-energy harvesters [10, 11, 12]. Nevertheless, such energy harvesters only provide a limited power budget for the sensors, the radio and the microcontroller. For example, commercial solar cells can extract up to $100\mu W$ per $cm^2$ with indoor lighting [12]. Therefore, the designers of energy-autonomous systems aim at minimizing their power consumption with two main design techniques. First, as the computational load of such applications is low, the digital circuits do not need high performance and can be supplied at an ultra-low voltage (0.3V-0.5V) that is dynamically adapted to the current workload, which reduces their dynamic power consumption [5, 13, 14]. Second, to avoid unnecessary switching, the node is highly duty cycled: all its circuitry beyond the always-on peripherals (such as retentive SRAM or the sleep controller) remains power gated most of the time [7]. When the sensor has acquired enough data, the microcontroller wakes up to process them by compression or feature extraction [9], in order to lower the data volume that must be transmitted by the power-hungry radio transceiver. Compressed information is stored into a memory (SRAM or FLASH). On request by the base station, when a critical feature is detected, or when the memory is full, the radio transmitter wakes up and sends the information to the base station. Thanks to its low supply voltage and to duty-cycled operation, the node power consumption is reduced to its minimum.

The third constraint is that the node must be able to start autonomously [2]. This implies that the node can operate without any pre-loaded battery (cold start), and that it must be able to automatically configure itself when inserted in a network. This automatic self-configuration allows the insertion of new nodes or the replacement of a node in an existing network without having to reconfigure all previously-deployed nodes.

## **I.2 THESIS OBJECTIVES**

The power management unit is a critical part of IoT nodes. Indeed, it extracts power from the energy harvester, manages the energy in the storage device and supplies the three other parts of the node. Without a high energy efficiency of the voltage conversions between the harvester, the storage element and the node circuits, the effort to reduce the node power consumption would be ruined. Furthermore, it must be adapted to the duty-cycled behavior of the node. Therefore, the three constraints on IoT node design explained in the previous section bring strong challenges on their power management unit:

- the DC/DC converters must be integrated on-chip, along with the node digital circuits to avoid the use of bulky external components, for low bill of material,
- to minimize their power consumption, the supply voltage of the node blocks must be individually lowered which leads to a multiplication of the power domains on a single die [5, 18],
- to efficiently supply duty-cycled loads or loads whose supply voltage is dynamically adapted to their workload, the PMU DC/DC converters must be able to deliver wide power ranges from tens of milliwatts to less than a microwatt with a high power-conversion efficiency above 75%,
- to allow the node to cold-start when deployed, its power management unit must be able to kick start the node when energy becomes available for the first time.

In order to ease the development of such power management units, this dissertation focuses on answering two questions:

- How can we design efficient on-chip DC/DC converters to supply both ultra-low and multi-mode loads (that are duty cycled and whose supply voltage is tuned to the workload)?
- How can we integrate these converters in power management units to address the requirements of energy-autonomous systems?




**Fig. I.2.** Performance of state-of-the-art SC DC/DC converters grouped into three load categories: emerging energy-autonomous systems, low-power microcontrollers and high-performance processors for mobile applications.

In this thesis, we first show that SC DC/DC converters are a valid candidate to address the first question. To this end, we study their behavior and review state-of-the-art SC DC/DC converters. The efficiency and load range of such converters are shown in Fig. I.2. Loads supplied by these converters can be grouped into three categories: (i) emerging energy-autonomous systems with power consumption below $1\mu$W [7, 15, 16, 17], (ii) low-power microcontrollers supplied at voltages below 1V [18, 19, 20, 21, 22], and (iii) high-performance processors for mobile applications [23, 24, 25, 26]. Fig. I.2 shows that SC DC/DC converters can efficiently supply low loads. This is not the case for inductive converters, which cannot supply low loads with small passive devices [4], nor for linear regulators, which have intrinsic efficiency limitations linked to the voltage conversion ratio [27, 28]. Furthermore, nanometer CMOS processes offer options to build dense capacitors. SC DC/DC converters can thus easily be integrated on-chip, alleviating the need for external components [24].

The first question of this thesis focuses on supplying low-power microcontrollers (second load category in Fig. I.2). State-of-the-art SC DC/DC converters achieve high conversion efficiency only over a narrow load range, as their power-conversion efficiency quickly drops at low loads because of losses in the converter controller. In order to address the first question, we thus elaborate for the first time a complete, practical and systematic sizing methodology for SC DC/DC converters providing all the design variables [JP1.], [CP5.]. We then propose specific designs to extend the power range that can efficiently be supplied to multi-mode loads with power consumption below $1\mu$W or above 1mW, thus achieving record ultra-wide power ranges [CP2.], [CP6.]. The second question focuses on supplying energy-autonomous systems (first load category in Fig. I.2). We answer it by proposing two examples of PMUs for IoT nodes based on such SC DC/DC converters. The first one contributes to the node power reduction by adaptively tuning its supply voltage to the process, voltage and temperature corners, using for the first time an on-chip SC DC/DC converter for that purpose [BC1.], [JP2.], [JP3.], [ITK1.], [CP3.], [CP4.]. The second one supplies an IoT node from environmental energy harvesters. State-of-the-art SC DC/DC converters are designed for $mm^2$ energy harvesters extracting power below $1\mu W$ [7, 16, 17]. This is not compatible with the power consumption of IoT nodes performing complex tasks such as radio communication with IPv6 routing [3]. Therefore, we propose a complete PMU for energy-autonomous applications. It uses only one DC/DC converter, with a minimum external component count, and can supply up to 20mW [CP1.].

## **I.3 THESIS OUTLINE**

The opportunity to integrate high-efficiency DC/DC converters on-chip opens new possibilities to solve the problem of energy-efficient IoT nodes. This dissertation explores these possibilities with two research focuses: theoretical analysis and system designs. In the first three chapters, we answer the question: "How can we design efficient on-chip DC/DC converters to supply both ultra-low and multi-mode loads?" by analytically studying the behavior of switched-capacitor converters, and by proposing specific designs. In Chapters 4 and 5, we then propose system designs using such converters to answer the question: "How can we integrate these converters in power management units to address the requirements of energy-autonomous systems?". The outline of the text is as follows.

**Chapter 1.** As a preliminary discussion, we review the behavior of switched-capacitor (SC) DC/DC converters. To do so we first explain the charge transfer mechanism leading to the voltage conversion, which is based on a two-phase reconfiguration of a capacitor network. We then review the loss mechanisms that reduce the power-conversion efficiency and show that the maximal efficiency achievable by SC DC/DC converters depends on the topology of the capacitor network used for the energy transfer. We also provide a convenient model of the converter and introduce the concept of its output impedance, which follows two asymptotic behaviors corresponding respectively to slow and fast switching frequency of the converter. We then review voltage regulation schemes in SC DC/DC converters. Most of them are based on the modulation of the output impedance of the converter by modifying the passive device size, power switch characteristics, or converter switching frequency. Other control mechanisms use smart (re)configuration of the converter topology, capacitors or switches to achieve better efficiency or smaller output voltage ripple. Several of these features are usually used together in the SC DC/DC converter controllers.


Fundamentals of SC DC/DC converters reviewed in this chapter are used as the foundations of the next chapters of this dissertation.

**Chapter 2.** In this second chapter, we develop a systematic sizing methodology for switched-capacitor DC/DC converters aimed at maximizing the converter efficiency under a die area constraint. To do so, we first evaluate the optimum transfer capacitor size and switch relative width. Then, we propose an analytical solution for the optimum switching frequency. It shows that when the parasitic capacitances are low, this solution leads to an identical contribution of the switches and transfer capacitors to the converter output impedance. As the parasitic capacitances increase, the optimum switching frequency decreases and the switch size increases because of the extra losses associated with these parasitic capacitances. Once the optimum switching frequency is known, the absolute switch sizes are determined. We show that the overdrive voltage strongly impacts the optimum switch width through the modification of the switch conductance.

To support the sizing methodology, the models of the behavior and efficiency of switched-capacitor DC/DC converters proposed in Chapter 1 are used. They are validated against simulation and measurement results in 65nm and $0.13\mu$m CMOS, respectively, and the sizing methodology is then validated by comparing its outputs against an exhaustive search for the optimum sizes. Finally, the proposed methodology shows how the converter efficiency can be traded off for die area reduction and how parasitic capacitances impact the converter sizing.

**Chapter 3.** Ultra-low-voltage microcontrollers for highly duty-cycled applications such as wireless sensor nodes must support several modes of operation: sleep mode and active modes adapted to the current workload. Even in sleep mode some critical blocks such as retentive SRAM, timer and interrupt controller must remain powered-on. The DC/DC converter thus needs to be able to supply ultra-wide load ranges from the sleep mode up to full workload mode. In this chapter, we develop specific designs for multi-mode switched-capacitor DC/DC converters to supply such ultra-low-voltage microcontrollers with high power-conversion efficiency in all modes by adapting the switch sizes (both L and W), their body bias, by adaptive internal clock generation supplied by the output voltage, and by reconfiguration of the SCN topology according to the voltage that must be supplied. Furthermore, we propose to use the DC/DC converter as a power-gating device when the microcontroller is in sleep mode.

To validate these techniques, we implemented two such SC DC/DC converters. The first one delivers a 0.3-0.4V output voltage from a 1-1.2V input source. The $0.12mm^2$ chip was manufactured in a $0.13\mu m$ CMOS technology. The efficiency reaches 74% with a $100\mu W$ load and 63% with a 100nW load, corresponding to the microcontroller active and sleep modes respectively. The converter correctly operates over a wide load range from 25nW to $125\mu W$, i.e. nearly 4 orders of magnitude, which is a record for such low power levels. The second one is manufactured in the STMicroelectronics 28nm fully-depleted SOI CMOS technology and occupies $0.135mm^2$. From a 1V input, it delivers either $40\mu W$ @ 0.3V with up to 79% efficiency or 1.3mW @ 0.45V with up to 87% efficiency, close to the theoretical 90% maximum efficiency, enabling dynamic voltage scaling of the microcontroller depending on the workload.

**Chapter 4.** In this chapter, we investigate the possibility for low-power applications to integrate an efficient adaptive voltage scaling (AVS) system on chip. To this end, the impact of process (both global and local), voltage and temperature variations is first studied on two typical low-power circuits, i.e., a CPU for mobile applications and a microcontroller for IoT nodes. In order to ensure safe operation under all conditions, it is required to increase the supply voltage by up to 22%, leading to a nearly 50% increase in dynamic power consumption when compared to nominal operating conditions. An AVS system allows for reducing this supply voltage guard band. To be able to include the AVS system on chip using standard CMOS, this chapter proposes to use a switched-capacitor network for DC/DC conversion from the higher battery voltage. A critical path replica is used for both sensing the circuit maximum operating frequency and generating its clock signal. We show that the voltage ripple induced by the DC/DC converter does not significantly contribute to the supply voltage guard band, and that overall the proposed AVS system allows for reducing this guard band by up to 80% while consuming less than 33% of the total circuit area.
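As a rough check of these figures, assume that the dynamic power scales quadratically with the supply voltage at fixed clock frequency and switched capacitance, $P_{dyn} \propto V_{DD}^2$ (a first-order assumption introduced here for illustration only; $P_{dyn}$ and $V_{DD}$ are illustrative symbols). A 22% guard band on the supply voltage then gives

$$\frac{P_{dyn}(1.22\,V_{DD})}{P_{dyn}(V_{DD})} = 1.22^2 \approx 1.49,$$

which is consistent with the nearly 50% increase in dynamic power consumption.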

Such an AVS system has been successfully integrated in the 65nm ultra-low-voltage microcontroller SoC SleepWalker [5]. Thanks to the AVS system, the microcontroller can operate at 25MHz instead of 10MHz over process and temperature variations from -40°C to 85°C, with a peak efficiency of the DC/DC converter above 80%.

**Chapter 5.** In this chapter, we propose a power management unit to enable energy-autonomous operation of an IoT node from solar energy. To this end, we first review the state of the art of power management units using environmental harvesters as energy source and an energy storage element to power the load when no power is available from the harvesters. Such power management units fail to predict the energy available from the storage element due to the flat battery voltage and have a relatively high bill of material.

The proposed power management unit is the first based on a single SC DC/DC converter to extract power from solar cells, generate a regulated 1V load voltage, and interface a supercapacitor as energy storage element. The use of only one DC/DC converter allows for reducing the die area and increasing the end-to-end conversion efficiency. The PMU requires only a supercapacitor and a filtering capacitor as external components. In order to enable management of the node tasks and to anticipate power downs, an SC\_status signal provides information about the voltage on the supercapacitor and thus on the stored energy. The single SC DC/DC converter has a gearbox with 7 voltage gains, gate-boosting drivers for the switches, and a pulse skipped modulation control. Furthermore, the PMU can operate in three frequency modes with a converter switching frequency of 1MHz, 10MHz, or 40MHz, thereby adapting the quiescent current in the PMU controller depending on the power available at the solar cells. This allows the PMU to efficiently handle input power ranging from $10\mu W$ up to 20mW. With the 1MHz switching frequency, the SC DC/DC converter has an average simulated efficiency of 70.7% when charging the supercapacitor from 0V to 2.7V, and of 74.9% when supplying $V_{reg}$ from the supercapacitor, leading to an average 53% end-to-end efficiency.
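The end-to-end figure is consistent with simply multiplying the two average efficiencies, assuming that all the harvested energy goes through the supercapacitor before reaching the load (an assumption made here for illustration; $\eta_{charge}$ and $\eta_{supply}$ are illustrative symbols, not part of the list of notations):

$$\eta_{end-to-end} \approx \eta_{charge} \cdot \eta_{supply} = 0.707 \times 0.749 \approx 0.53.$$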

A layout has been realized in a commercial 65nm CMOS process. It occupies $0.425mm^2$. Once the supercapacitor is charged, the low $7\mu$W standby power, including the supercapacitor leakage, and the $1.7\mu$W sleep power of the load allow the IoT node to survive 43 hours without light before the supercapacitor is empty.

**Conclusions and perspectives** We finally summarize the results and give concluding remarks with research perspectives in the general conclusion.

# AUTHOR'S PUBLICATION LIST

### **Book chapters**

BC1. M. Hayes, J. Rohan, W. Wang, N. Wang, A. Romani, E. Macrelli, M. Dini, M. Filippi, M. Tartagni, D. Flandre, D. Bol, <u>J. De Vos</u>, "Smart Energy Management and Conversion", to be published.

#### **Related journal papers**

- JP1. J. De Vos, D. Flandre, D. Bol, "A Sizing Methodology for On-Chip Switched-Capacitor DC/DC Converters", in *IEEE Transactions on Circuits and Systems I*, accepted.
- JP2. D. Bol, <u>J. De Vos</u>, C. Hocquet, F. Botman, F. Durvaux, S. Boyd, D. Flandre and J.-D. Legat, "SleepWalker: A 25-MHz 0.4-V Sub-mm<sup>2</sup> 7-µW/MHz Microcontroller in 65-nm LP/GP CMOS for Low-Carbon Wireless Sensor Nodes", in *IEEE J. Solid-State Circuits*, vol. 48 (1), pp. 20-32, 2013.
- JP3. J. De Vos, D. Flandre and D. Bol, "Pushing adaptive voltage scaling fully on chip", in ASP J. Low-Power Electronics, vol. 8, pp. 95-105, 2012.

#### Invited tutorials and keynotes

ITK1. D. Bol, J. De Vos, F. Botman, G. de Streel, S. Bernard, D. Flandre, and J.-D. Legat, "Green SoCs for a Sustainable Internet-of-Things", in Proc. Workshop Faible Tension Faible Consommation (FTFC), 4 p., 2013.

### **Related conference papers**

- CP1. F. Botman, J. De Vos, S. Bernard, J.D. Legat, D. Bol, "Bellevue: a 50MHz Variable-Width SIMD 32bit Microcontroller at 0.37V for Processing-Intensive Wireless Sensor Nodes", in *IEEE International Symposium on Circuits and Systems*, 4p., 2014, submitted paper.
- CP2. J. De Vos, D. Bol and D. Flandre, "A dual-mode DC/DC converter for ultra-low-voltage microcontrollers", in Proc. IEEE Subthreshold Microelectronics Conference, 3 p., 2012.
- CP3. D. Bol, <u>J. De Vos</u>, C. Hocquet, F. Botman, F. Durvaux, S. Boyd, D. Flandre, J.D. Legat, "A 25MHz 7µW/MHz ultra-low-voltage microcontroller SoC in 65nm LP/GP CMOS for low-carbon wireless sensor nodes", in *IEEE International Solid-State Circuits Conference*, pp. 490-492, 2012.

- CP4. J. De Vos, D. Flandre and D. Bol, "Variability and ripple analysis of an on-chip all-digital AVS system", in VARI Workshop on CMOS Variability, 4 p., 2011.
- CP5. J. De Vos, D. Bol and D. Flandre, "Design methodology for sizing DCDC converters supplying subthreshold circuits", in *IEEE Proc. Subthreshold Microelectronics Conference*, 1 p., 2011.
- CP6. J. De Vos, D. Bol, D. Flandre, "Dual-Mode Switched-Capacitor DC-DC Converter for Subthreshold Processors with Deep Sleep Mode", in *Fringe* poster session at the European Solid-State Circuits Conference, 4 p., 2010.
- CP7. J. De Vos, D. Bol, D. Flandre, "Génération d'horloge adaptative", in *FETCH Workshop*, 1 p, 2010.

#### **Unrelated papers**

- UP1. V. Kilchytska, D. Bol, <u>J. De Vos</u>, F. Andrieu, and D. Flandre, "Quasi-double gate regime to boost UTBB SOI MOSFET performance in analog and sleep transistor applications", in *Elsevier Solid-State Electronics*, vol. 84, pp. 28-37, 2013.
- UP2. D. Bol, V. Kilchytska, <u>J. De Vos</u>, F. Andrieu and D. Flandre, "Quasi-double gate mode for sleep transistors in UTBB FD SOI low-power high-speed applications", in *Proc. IEEE International SOI Conference*, 2 p., 2012.
- UP3. D. Bol, C. Hocquet, <u>J. De Vos</u>, F. Durvaux, F. Botman, D. Flandre and J.-D. Legat, "Design techniques for reliable timing closure in ULV SoCs", in *IEEE Proc. Subthreshold Microelectronics Conference*, 1 p., 2011.
- UP4. D. Bol, <u>J. De Vos</u>, D. Flandre and J.-D. Legat, "Ultra-low-power high-noise-margin logic with undoped FD SOI devices", in *Proc. IEEE International SOI Conference*, pp. 97-98, 2009.
- UP5. D. Bol, J. De Vos, R. Ambroise, D. Flandre, J.-D. Legat, "Building ultra-low-power high-temperature digital circuits in standard high-performance SOI technology", in *Elsevier Solid-State Electronics*, vol. 52, issue 10, pp. 1939-1945, 2008.
- UP6. J. De Vos, D. Bol, D. Flandre, "Cellule SRAM 12 transistors à ultra faible courant de fuite", in Proc. Workshop Faible Tension Faible Consommation (FTFC), 4 p., 2008.

I'd put my money on the sun and solar energy. What a source of power! I hope we don't have to wait until oil and coal run out before we tackle that.

Thomas Edison, 1931.

**CHAPTER** 1

# FUNDAMENTALS OF SWITCHED-CAPACITOR DC/DC CONVERTERS


## Abstract

As a preliminary discussion, we review the behavior of switched-capacitor (SC) DC/DC converters. To do so we first explain the charge transfer mechanism leading to the voltage conversion, which is based on a two-phase reconfiguration of a capacitor network. We then review the loss mechanisms that reduce the power-conversion efficiency and show that the maximal efficiency achievable by SC DC/DC converters depends on the topology of the capacitor network used for the energy transfer. We also provide a convenient model of the converter and introduce the concept of its output impedance, which follows two asymptotic behaviors corresponding respectively to slow and fast switching frequency of the converter. We then review voltage regulation schemes in SC DC/DC converters. Most of them are based on the modulation of the output impedance of the converter by modifying the passive device size, power switch characteristics, or converter switching frequency. Other control mechanisms use smart (re)configuration of the converter topology, capacitors or switches to achieve better efficiency or smaller output voltage ripple. Several of these features are usually used together in the SC DC/DC converter controllers.

Fundamentals of SC DC/DC converters reviewed in this chapter are used as the foundations of the next chapters of this dissertation.

## Contents

| Section | Title                                                         |
|---------|---------------------------------------------------------------|
| 1.1     | Introduction                                                  |
| 1.2     | Overview of switched-capacitor DC/DC converters               |
| 1.3     | Electrical equivalent model of the switched-capacitor network |
| 1.4     | Sources of power losses                                       |
| 1.5     | Voltage regulation with switched-capacitor networks           |
| 1.6     | Conclusion                                                    |




Fig. 1.1. Overview of an SC DC/DC converter.

### 1.1 INTRODUCTION

There are two main families of voltage regulators. First, linear regulators dissipate energy in a passive component such as a resistor, or in the channel resistance of a MOSFET device. The voltage regulation is achieved by modulating the resistor size, or the MOSFET channel resistance through its gate voltage. Unfortunately, the conversion efficiency of such voltage regulators is limited to the ratio between output and input voltage $V_{out}/V_{in}$ because of thermal dissipation occurring in the resistor or in the MOSFET device. Second, switching converters use switches and inductors or capacitors to convert an input voltage to an output voltage. Inductive switching converters perform the voltage conversion in two phases. First, the inductor is connected across the input energy source so that the energy stored in the inductor increases as the current flowing through it rises. Second, the inductor is connected to the output and delivers current to a buffer capacitor supplying the load. Inductive converters are well known, but they cannot easily meet the on-chip integration constraint because inductors cannot be integrated on-chip without poor quality factor [29] or additional fabrication steps [30]. Further, they lack efficiency when supplying a low-power load [25] because of the high equivalent series resistance, which leads to large conduction losses in the inductor. Capacitive switching converters also perform the voltage conversion in two phases, but they can be fully integrated on-chip thanks to the process features of new CMOS technologies, and their efficiency remains high while supplying low loads [31], [32].
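To give a feel for the $V_{out}/V_{in}$ limit of linear regulators, the following minimal Python sketch evaluates this upper bound for a few operating points. The function name is hypothetical and the voltages are only illustrative values taken from the ranges quoted in this dissertation (a 1V input and 0.3V to 0.45V outputs); the bound ignores the quiescent current of the regulator.

```python
def linear_regulator_max_efficiency(v_in, v_out):
    """Upper bound V_out/V_in on the efficiency of a linear step-down regulator.

    Hypothetical helper for illustration: the quiescent current of the
    regulator is neglected, so real efficiencies are lower than this bound.
    """
    if not 0 < v_out <= v_in:
        raise ValueError("expected 0 < v_out <= v_in")
    return v_out / v_in


if __name__ == "__main__":
    v_in = 1.0  # illustrative input voltage [V]
    for v_out in (0.3, 0.4, 0.45):  # illustrative output voltages [V]
        bound = linear_regulator_max_efficiency(v_in, v_out)
        print(f"V_in={v_in} V, V_out={v_out} V: efficiency <= {bound:.0%}")
```

With such large conversion ratios, the bound stays well below the 75% efficiency targeted for the PMU DC/DC converters.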

In this chapter we review the fundamentals of switched-capacitor (SC) DC/DC converters and present the state of the art of previous works on these converters. Fig. 1.1 shows a typical architecture of such a converter. A switched-capacitor network (SCN) between the input and output voltages performs the voltage conversion. To do so, it alternates between two phases $\phi 1$ and $\phi 2$. Regulation of the output voltage is performed by a controller that senses $V_{out}$. Control signals for power switch actuation are sent to a non-overlapping clock (NOC), which ensures that the switches ON during the first phase $\phi 1$ are never activated at the same time as the switches ON during the second phase $\phi 2$, which would result in unwanted short-circuit currents. Gate drivers amplify the switch control signals in order to keep small rise and fall times on the large power switch gates. Finally, an optional output filtering capacitor $C_{out}$ reduces the voltage ripple on $V_{out}$ due to the SCN switching.

This chapter is organized as follows: Section 1.2 explains the mechanism by which energy is transferred. Section 1.3 presents a model of SC DC/DC converters. The sources of energy loss are reviewed in Section 1.4, and finally Section 1.5 describes existing voltage regulation controllers of such converters.

#### 1.2 OVERVIEW OF SWITCHED-CAPACITOR DC/DC CONVERTERS

SC DC/DC converters use capacitors as passive energy transfer devices. Unlike inductive converters, which use only one inductor, SC DC/DC converters use several passive devices to perform the energy transfer, and these passive devices are connected in a network that is reconfigured every switching cycle.

This section describes how capacitive switching converters use capacitors to perform the voltage conversion and gives insights on the capacitor network configuration.

#### 1.2.1 Voltage conversion mechanism

In SC DC/DC converters, transfer capacitors (TCs) are used to transfer charges from the energy source at the input voltage $V_{in}$ to the load at the output voltage $V_{out}$ [33]. SC DC/DC converters generally operate in two phases $\phi 1$ and $\phi 2$, each with a specific configuration of the transfer capacitor network. The switching between these two configurations enables the charge transfer from the input terminal to the output terminal of the converter. After start-up, the converter reaches steady-state operation, in which the capacitors accumulate charges from the energy source during one of the two phases and deliver the same amount of charges to the load during the other phase. The voltage gain $V_{out}/V_{in}$ is fixed by the two configurations of the transfer capacitor network corresponding to $\phi 1$ and $\phi 2$.

To illustrate the voltage conversion mechanisms, Fig. 1.2 gives an example of a divide-by-two SC DC/DC converter. During  $\phi 1$ , the transfer capacitor is connected in series between  $V_{in}$  and  $V_{out}$  so that charges flow through it up to the load. Therefore the voltage across  $C_{T1}$  rises up to  $V_{in} - V_{out}$ . In the second phase  $\phi 2$ , the capacitor network is reconfigured to connect  $C_{T1}$  in parallel with  $V_{out}$  so that stored charges in  $C_{T1}$  are delivered to  $V_{out}$ . The charge transfer stops when the voltage across  $C_{T1}$  has lowered to the load voltage  $V_{out}$ . When the load does not consume any power,  $V_{out}$  stabilizes at the no-load voltage  $V_{nl}$ , which is here  $V_{in}/2$ . In this example, the SC DC/DC converter is thus said to use a "divide-by-two" topology.
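The two-phase mechanism described above can be sketched numerically. The following Python snippet is a minimal illustration, not the model used in this dissertation: it assumes ideal switches, complete charge redistribution in each phase, no load, and an output capacitor $C_{out}$ in parallel with the load. The function name and the component values are hypothetical.

```python
def simulate_divide_by_two(v_in, c_t, c_out, n_cycles):
    """Ideal, no-load divide-by-two SCN: return V_out after n_cycles.

    Assumptions (for illustration only): ideal switches, complete charge
    redistribution in each phase, both capacitors initially discharged.
    """
    v_t = 0.0    # voltage across the transfer capacitor C_T1 [V]
    v_out = 0.0  # voltage across the output capacitor C_out [V]
    c_sum = c_t + c_out
    for _ in range(n_cycles):
        # Phase 1: C_T1 is in series between V_in and V_out.
        # Charge conservation at the output node gives the new V_out.
        v_out = (c_out * v_out + c_t * (v_in - v_t)) / c_sum
        v_t = v_in - v_out
        # Phase 2: C_T1 is in parallel with C_out; their charges are shared.
        v_out = (c_t * v_t + c_out * v_out) / c_sum
        v_t = v_out
    return v_out


if __name__ == "__main__":
    # Hypothetical values: V_in = 1.2 V, C_T1 = 1 nF, C_out = 10 nF.
    for n in (1, 5, 20, 100):
        print(n, round(simulate_divide_by_two(1.2, 1e-9, 10e-9, n), 4))
```

Under these assumptions the output voltage rises cycle after cycle and settles at $V_{in}/2$, the no-load voltage of this topology.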


Fig. 1.2. Example of a divide-by-two SCN. In the first phase  $\phi_1$ , a transfer capacitor  $C_{T1}$  is connected between  $V_{in}$  and  $V_{out}$ . In the second phase  $\phi_2$ , it is connected between  $V_{out}$  and the ground.

A non-overlapping time between $\phi 1$ and $\phi 2$ is mandatory to ensure that no short-circuit current can flow through the switches closed during $\phi 1$ and those closed during $\phi 2$, which would threaten the converter functionality.
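The two-phase mechanism of this section can be checked numerically with a minimal sketch. It assumes ideal switches, complete settling in each phase, no load current, and an output capacitor `c_out` holding $V_{out}$; the function name and the component values are illustrative choices, not taken from a specific design.

```python
def simulate_divide_by_two(v_in, c_t, c_out, cycles):
    """Ideal two-phase charge sharing of the divide-by-two network, no load."""
    v_ct = 0.0   # voltage across the transfer capacitor C_T1
    v_out = 0.0  # voltage across the output capacitor
    total = c_t + c_out
    for _ in range(cycles):
        # Phase 1: C_T1 in series between V_in and V_out.
        # Charge conservation at the output node gives the settled V_out.
        v_out = (c_out * v_out + c_t * (v_in - v_ct)) / total
        v_ct = v_in - v_out
        # Phase 2: C_T1 in parallel with the output, charges are shared.
        v_out = (c_t * v_ct + c_out * v_out) / total
        v_ct = v_out
    return v_out


v_nl = simulate_divide_by_two(v_in=1.0, c_t=100e-12, c_out=1e-9, cycles=200)
print(f"no-load output voltage: {v_nl:.6f} V")
```

Under these assumptions the output settles at $V_{in}/2$ whatever the capacitor values, which is the no-load voltage $V_{nl}$ of the divide-by-two topology.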

#### 1.2.2 Topologies of the switched-capacitor network

The no-load voltage of an SC DC/DC converter can be obtained by inspection of the two SCN configurations. It is independent of the switch conductance, switching frequency, parasitic capacitances, etc. In [34], it is shown that all the achievable ratios  $m = V_{nl}/V_{in}$  are given by Eq. (1.1):

$$m = P/Q, \tag{1.1}$$

where P and Q are integers that belong to the first N + 2 elements of the Fibonacci series, with N the number of transfer capacitors in the converter. Table 1.1, adapted from [31], gives an overview of the achievable ratios m for a given number of transfer capacitors.

As an example, Fig. 1.3 illustrates the achievable m with two transfer capacitors with the so-called series-parallel topologies [34]. For the voltage down-conversion topologies (m < 1), the transfer capacitors are first serially connected between the $V_{in}$ and $V_{out}$ terminals during $\phi 1$ so that charges flow through the TCs, which increases their voltage. Secondly, they are connected in parallel with the load during $\phi 2$ so that they supply the stored charges at a lower voltage than $V_{in}$. For the voltage up-conversion topologies (m > 1), the TCs are first connected in parallel with the input voltage to accumulate charges before being serially connected between $V_{in}$ and $V_{out}$, which leads to a ratio $V_{nl}/V_{in}$ higher than one.
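The ratios for small N can be enumerated with a short sketch. It assumes one particular reading of Eq. (1.1), namely that P and Q are any integers between 1 and the (N + 2)-th Fibonacci number (with the series starting 1, 1, 2, 3, ...). This reading reproduces the ratios of Fig. 1.3 for N = 2, but it is an assumption and should be checked against [34] before being relied upon. The function names are illustrative.

```python
from fractions import Fraction


def fibonacci(k):
    """Return the k-th Fibonacci number with F(1) = F(2) = 1."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a


def achievable_ratios(n_tc):
    """List the ratios m = P/Q with 1 <= P, Q <= F(n_tc + 2) (assumed reading)."""
    limit = fibonacci(n_tc + 2)
    ratios = {Fraction(p, q)
              for p in range(1, limit + 1)
              for q in range(1, limit + 1)}
    return sorted(ratios)


for n in (1, 2):
    print(n, [str(m) for m in achievable_ratios(n)])
```

With this reading, the largest ratio for four TCs is the sixth Fibonacci number, i.e. eight.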

Note from Table 1.1 that specific topologies reach higher m than what is possible with the series-parallel topology. For example, it is possible to multiply $V_{in}$ by a factor of eight with only four TCs. In fact, there exist

| Number of TCs (N) | Achievable m |
|---|---|
| 1 | 1/2, 1, 2 |
| 2 | 1/3, 1/2, 2/3, 1, 3/2, 2, 3 |
| 3 | see [31] |
| 4 | see [31] |

**Table 1.1.** Achievable voltage conversion ratios $m = V_{nl}/V_{in}$ for a fixed number of transfer capacitors (adapted from [31]).

[Figure panels, each drawn for $\phi 1$ and $\phi 2$: m = 2/3, m = 1/3, m = 1/2, m = 1; m = 3/2, m = 2, m = 3.]

Fig. 1.3. Example of TC network configurations to achieve possible *m* with two TCs.

different SCN topologies leading to the same transfer ratio m [19], [34], [35]. Fig. 1.4 gives three possible ways to multiply $V_{in}$ by five. Fig. 1.4 (a) illustrates again the series-parallel topology. Fig. 1.4 (b) depicts the doubler topology, which requires one less TC than the series-parallel topology. Here, during $\phi 1$, $C_{T1}$ is charged up to $V_{in}$ while $C_{T2}$ and $C_{T3}$ supply the load by being serially connected with $V_{in}$ so that the voltage across them adds to $V_{in}$. Then, during $\phi 2$, $C_{T1}$ is connected in series with $V_{in}$ to distribute charges to $C_{T2}$ and $C_{T3}$. Therefore



**Fig. 1.4.** Three multiply-by-5 topologies: (a) the series parallel topology, (b) the doubler topology and (c) the Dickson charge pump topology.

the voltage across these capacitors is twice $V_{in}$, giving its name to this topology. Fig. 1.4 (c) presents another way to perform up-conversion with capacitor-based switching converters. It is the Dickson charge-pump topology, where charges are pumped from one capacitor to the next so that the voltage at each stage increases by $V_{in}$.

However, as will be explained in next sections, some topologies are more efficient to transfer charges to the load than others and thus require smaller TCs or switches. Further some topologies are best suited for operation at high switching frequencies by optimizing the switch characteristics, while others should be preferred for low switching frequencies because they optimize the use of the TCs [36].

# 1.3 ELECTRICAL EQUIVALENT MODEL OF THE SWITCHED-CAPACITOR NETWORK

In this section we explain how to model the voltage conversion of an SC DC/DC converter. Therefore we first show that a transformer associated with an output impedance is a good equivalent circuit of the transfer capacitor network. Then we review the asymptotic evolution of this output impedance with the converter switching frequency.

#### 1.3.1 Model behavior

The voltage conversion function of SC DC/DC converters is usually modeled by means of an ideal transformer with the addition of a series impedance connected between the transformer and the converter output [34] [36] [37], as depicted in Fig. 1.5. The conversion ratio of the transformer is the ratio  $V_{nl}/V_{in}$  which corresponds to the parameter m defined in Section 1.2.2.  $V_{nl}$  is fully determined by the SCN topology and  $V_{out}$  equals  $V_{nl}$  only if there is no current  $I_{out}$  flowing through the load. For a given conversion ratio m of the transformer and thus for a given ratio  $V_{nl}/V_{in}$ , it is possible to use various topologies of the SCN [19],

#### 8 FUNDAMENTALS OF SWITCHED-CAPACITOR DC/DC CONVERTERS



Fig. 1.5. Equivalent circuit of SC DC/DC converters.

[34], [36] as explained in Section 1.2.2. When a load is supplied, the converter output voltage  $V_{out}$  decreases below  $V_{nl}$ . This is modeled by the addition of an output impedance  $Z_{out}$  [24], [34]. Finally,  $C_{out}$  is the filtering output capacitor of the converter. It is either an on-chip or off-chip capacitor, or the output capacitor of other SC DC/DC converters [24] as will be explained in Section 1.5.5.

From Fig. 1.5, the power  $P_{out}$  supplied by the converter to the load can be derived from the equivalent circuit of SC DC/DC converters:

$$P_{out} = \frac{(V_{nl} - V_{out}) \times V_{out}}{Z_{out}}. \tag{1.2}$$

In Eq. (1.2),  $Z_{out}$  depends on the converter operating regime. At extremely slow and fast switching frequency,  $Z_{out}$  follows two asymptotes corresponding to  $Z_{SSL}$  and  $Z_{FSL}$  respectively [36] as illustrated in Fig. 1.6. Between these two asymptotes,  $Z_{out}$  is given by Eq. (1.3) [31], [38].

$$Z_{out} = \sqrt{Z_{SSL}^2 + Z_{FSL}^2}. \tag{1.3}$$

#### 1.3.2 Slow-switching frequency behavior

When the switching frequency is below a certain transition frequency  $f_t$ , which is the intersection frequency of the two asymptotes as shown in Fig. 1.6, the switches have enough time to transfer the charges to/from the transfer capacitors. Thus, the charge transfer from the input to the load is limited by the amount of charges that can be stored in and delivered by the TCs at every switching cycle [34]. The converter is said to operate in the slow switching limit (SSL) regime and the converter output impedance is equal to  $Z_{SSL}$  given by [34], [36]:

$$Z_{SSL} = \frac{\kappa}{f_{sw}},\tag{1.4}$$



**Fig. 1.6.** Asymptotic output impedance of an SC DC/DC converter.  $Z_{out}$  first scales down as  $f_{sw}$  increases but the scaling is limited by the switch conductance at high  $f_{sw}$ .

where $\kappa$ is inversely proportional to the TC values and depends on the SCN topology. An identical voltage conversion ratio $V_{nl}/V_{in}$ implemented with different topologies leads to different $\kappa$ and thus to different $Z_{SSL}$. It means that some topologies use the TCs more efficiently than others and can transfer more charges during one switching cycle with the same value of TCs [36]. An analytical expression of the $\kappa$ value is given in [34] and [36]:

$$\kappa = \sum_{i} \frac{a_{ci}^2}{C_{Ti}},\tag{1.5}$$

where $a_{ci}$ is the $i^{th}$ element of the vector $a_c$, which gives an image of the charge flow in the $i^{th}$ TC during each switching cycle. In order to evaluate $a_c$, [36] defines vectors $a_{\phi 1}$ and $a_{\phi 2}$ of the normalized charge flow through each component of the capacitor network for both phases $\phi 1$ and $\phi 2$:

$$a_{\phi 1} = [q_{\phi 1,out} \ q_{\phi 1,1} \ \dots \ q_{\phi 1,n} \ q_{\phi 1,in}]/q_{out} = [a_{\phi 1,out} \ a_{\phi 1,1} \ \dots \ a_{\phi 1,n} \ a_{\phi 1,in}], \tag{1.6}$$

$$a_{\phi 2} = [q_{\phi 2,out} \ q_{\phi 2,1} \ \dots \ q_{\phi 2,n} \ q_{\phi 2,in}]/q_{out} = [a_{\phi 2,out} \ a_{\phi 2,1} \ \dots \ a_{\phi 2,n} \ a_{\phi 2,in}], \tag{1.7}$$

where $q_{in}$, $q_{out}$ and $q_i$ represent the charges that flow during one switching cycle into the energy source, the converter load and the $i^{th}$ TC respectively. The convention is that elements of these vectors are positive if the charge flow enters the component by its positive terminal. These vectors can be evaluated using Kirchhoff's Current Law and by stating that the same amount of charge is stored in a TC at the end of a switching cycle ($\phi 1 + \phi 2$) and at the beginning of this switching cycle, so that the charge flows into a TC during the two phases must be of the same magnitude with opposite directions ($q_{\phi 1,i} = -q_{\phi 2,i}$) [34]. The $a_c$ vector is then defined as the magnitude of the charge flow through the



Fig. 1.7. Charge flow in a divide-by-three capacitor network

transfer capacitors.

$$a_c = [|q_{\phi 1,1}/q_{out}| \dots |q_{\phi 1,n}/q_{out}|] = [|a_{\phi 1,1}| \dots |a_{\phi 1,n}|]. \tag{1.8}$$

As an example, Fig. 1.7 illustrates computation of  $a_c$  for the divide-by-three topology of Fig. 1.3. The  $a_c$  vector can be computed knowing that  $a_{\phi 1,out} + a_{\phi 2,out} = 1$  and that  $a_{\phi 1,1} = -a_{\phi 2,1}$  and  $a_{\phi 1,2} = -a_{\phi 2,2}$  as mentioned earlier. In this example, it leads to:

$$a_c = \begin{bmatrix} 1/3 & 1/3 \end{bmatrix}. \tag{1.9}$$
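Eq. (1.9) can be turned into numbers with Eq. (1.4) and (1.5). In the divide-by-three case both TCs carry the same charge $q$ to the output during $\phi 1$ and each returns $q$ during $\phi 2$, so $q_{out} = 3q$ and each element of $a_c$ equals 1/3. The sketch below evaluates $\kappa$ and $Z_{SSL}$; the capacitor values and switching frequency are illustrative assumptions, and the function names are hypothetical.

```python
def kappa(a_c, c_t):
    """Eq. (1.5): sum of a_ci^2 / C_Ti over the transfer capacitors."""
    return sum(a * a / c for a, c in zip(a_c, c_t))


def z_ssl(a_c, c_t, f_sw):
    """Eq. (1.4): slow-switching-limit output impedance in ohms."""
    return kappa(a_c, c_t) / f_sw


# Divide-by-three example of Eq. (1.9), with assumed 100 pF TCs at 10 MHz.
a_c = [1 / 3, 1 / 3]
c_t = [100e-12, 100e-12]
print(f"Z_SSL = {z_ssl(a_c, c_t, 10e6):.1f} ohm")
```

Doubling every TC halves $\kappa$, which is why $\kappa$ scales inversely with the TC values.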

#### 1.3.3 Fast-switching frequency behavior

When the switching frequency is higher than the transition frequency  $f_t$ , the stored charges in the TC do not have enough time to be transferred by the switches before the reconfiguration of the SCN [36], [37]. In this case, the charge transfer is limited by the switch conductance. The converter is said to operate in the fast switching limit (FSL) regime where the converter output impedance is  $Z_{FSL}$  given by [36]:

$$Z_{FSL} = \frac{\theta}{W_{tot}},\tag{1.10}$$

where $\theta$ depends on the *ON*-resistance of the power switches and on the SCN topology, which means again that different topologies with identical conversion ratios $V_{nl}/V_{in}$ lead to different $\theta$ and thus to different $Z_{FSL}$. Therefore some topologies use the power switches more efficiently than others [36]. An analytical expression of $\theta$ is proposed in [36]:

$$\theta = 2 \times \sum_{i} \frac{a_{ri}^2}{G_{i0} \times s_i},\tag{1.11}$$

where  $a_r$  is a vector similar to  $a_c$  but defining the charge flow through the power switches while they are ON.

In Eq. (1.10),  $Z_{FSL}$  is in first approximation independent of  $f_{sw}$ . However the non-overlapping time of the 2 phase clocks and the rise and fall times in the buffer chain driving the switches reduce the effective charge transfer time and therefore increase  $Z_{FSL}$  at  $f_{sw}$  much higher than  $f_t$  [31]. We do not consider here such high  $f_{sw}$  values where  $Z_{FSL}$  becomes dependent on  $f_{sw}$  because they are not used in practical designs.
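The two asymptotes and their combination in Eq. (1.3) can be sketched as follows. The transition frequency is taken as the frequency where $Z_{SSL} = Z_{FSL}$, i.e. $f_t = \kappa \times W_{tot} / \theta$, which follows directly from Eq. (1.4) and (1.10). The numerical values of $\kappa$, $\theta$ and $W_{tot}$ below are assumptions chosen only to give round numbers.

```python
import math


def z_out(f_sw, kappa, theta, w_tot):
    """Eq. (1.3): combine the SSL and FSL asymptotes."""
    z_ssl = kappa / f_sw      # Eq. (1.4)
    z_fsl = theta / w_tot     # Eq. (1.10)
    return math.hypot(z_ssl, z_fsl)


def transition_frequency(kappa, theta, w_tot):
    """Frequency at which the two asymptotes intersect (Z_SSL = Z_FSL)."""
    return kappa * w_tot / theta


# Assumed values: kappa for a 100 pF divide-by-two TC, Z_FSL = 25 ohm.
KAPPA, THETA, W_TOT = 2.5e9, 25e-3, 1e-3
f_t = transition_frequency(KAPPA, THETA, W_TOT)
for f in (f_t / 100, f_t, f_t * 100):
    print(f"f_sw = {f:.2e} Hz -> Z_out = {z_out(f, KAPPA, THETA, W_TOT):.1f} ohm")
```

At $f_t$ the combined impedance is $\sqrt{2}$ times either asymptote, and far from $f_t$ it follows the dominant one.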

# 1.4 SOURCES OF POWER LOSSES

This section reviews the sources of efficiency loss in SC DC/DC converters. There are three main loss sources that are inherent to the voltage conversion in an SC DC/DC converter: the linear losses  $P_{lin}$ , the bottom-plate losses  $P_{BP}$ , and the switch driving losses  $P_{driving}$ . Two other efficiency loss sources come from the power consumed by the converter controller and peripheral circuits  $P_{control}$ , and the losses resulting from the output voltage ripple  $P_{ripple}$  due to the switching operation. All these loss sources are summarized in Fig. 1.8 with the example of a divide-by-two topology. The resulting SC DC/DC converter efficiency  $\eta$  can be estimated by combining these five terms:

$$\eta = \frac{P_{in} - P_{losses}}{P_{in}} = 1 - \frac{P_{lin} + P_{BP} + P_{driving} + P_{control} + P_{ripple}}{P_{in}}, \quad (1.12)$$

where  $P_{in}$  is the power provided by the energy source to the SC DC/DC converter.

#### 1.4.1 Linear losses

Firstly, when the load voltage $V_{out}$ is lower than the no-load voltage $V_{nl}$, thermal power dissipation occurs in the parasitic impedances of the converter: the metal sheet resistances, the equivalent series resistances of the TCs, and the channel resistances of the power switches, which are usually the main contributor. This dissipation is referred to as the *linear losses* $P_{lin}$. Let us illustrate them with the example of the divide-by-two converter of Fig. 1.9. For the following discussion we assume that the load voltage $V_{out}$ is a virtual net whose voltage does not vary, and is lower than $V_{nl}$ so that $V_{out} = V_{nl} - \Delta V$. During $\phi 1$, an amount of charge $Q_1 = 2 \times \Delta V \times C_{T1}$ is extracted from $V_{in}$ and flows through $C_{T1}$ to $V_{out}$. The factor 2 comes here from the SCN topology, where the voltage swing on the transfer capacitor is twice $\Delta V$. The energy $E_{in1}$ supplied by $V_{in}$ and the energy $E_{out1}$ delivered to $V_{out}$ are respectively given by Eq. (1.13) and (1.14):

$$E_{in1} = V_{in} \times 2\Delta V \times C_{T1}, \qquad (1.13)$$

$$E_{out1} = V_{out} \times 2\Delta V \times C_{T1}, \tag{1.14}$$

During  $\phi 2$ ,  $C_{T1}$  delivers the same amount of charges to  $V_{out}$  so that the same energy  $E_{out2} = E_{out1}$  is supplied to the load. Therefore, from Eq. (1.13) and




Fig. 1.8. The five loss sources in an SC DC/DC converter.



Fig. 1.9. When  $V_{out}$  is lower than  $V_{nl}$ , linear losses occur because of the power dissipated in the parasitic impedances.

(1.14), during a complete cycle  $(\phi 1 + \phi 2)$  the ratio  $\eta_{max}$  of the energy transferred to the load to the energy taken from the input is:

$$\eta_{max} = \frac{2 \times V_{out} \times 2\Delta V \times C_{T1}}{V_{in} \times 2\Delta V \times C_{T1}} = \frac{2 \times V_{out}}{V_{in}} = \frac{V_{out}}{V_{nl}}. \tag{1.15}$$

The linear losses leading to this efficiency loss are thus:

$$P_{lin} = P_{in} \times \frac{V_{nl} - V_{out}}{V_{nl}} = P_{in} \times \frac{\Delta V}{V_{nl}}. \tag{1.16}$$

This result can be extended to any conversion ratio: the maximal power-conversion efficiency that can be achieved by SC DC/DC converters scales linearly with the ratio $V_{out}/V_{nl}$. It sets a maximal limit $\eta_{max}$ to the converter efficiency that is independent of the technology or design parameters but is fully determined by $V_{out}$ and by the conversion ratio m of the $C_{Ti}$ network topology.
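The energy bookkeeping of Eq. (1.13) to (1.16) can be replayed numerically for the divide-by-two case. This is a minimal sketch with illustrative values ($V_{in}$ = 1 V, $V_{out}$ = 0.45 V, $C_{T1}$ = 100 pF); the function names are hypothetical.

```python
def eta_max(v_out, v_nl):
    """Upper bound of the efficiency set by the linear losses, Eq. (1.15)."""
    if not 0.0 < v_out <= v_nl:
        raise ValueError("expected 0 < v_out <= v_nl")
    return v_out / v_nl


def linear_losses(p_in, v_out, v_nl):
    """Eq. (1.16): linear losses for a given input power."""
    return p_in * (v_nl - v_out) / v_nl


# Divide-by-two cycle: energies of Eq. (1.13) and (1.14) over phi1 + phi2.
v_in, v_out, c_t1 = 1.0, 0.45, 100e-12
v_nl = v_in / 2
delta_v = v_nl - v_out
e_in = v_in * 2 * delta_v * c_t1          # drawn from V_in during phi1 only
e_out = 2 * v_out * 2 * delta_v * c_t1    # delivered during phi1 and phi2
print(f"cycle efficiency = {e_out / e_in:.3f}, eta_max = {eta_max(v_out, v_nl):.3f}")
```

Both figures agree, which illustrates that the bound depends only on $V_{out}/V_{nl}$ and not on the capacitor value.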

Let us mention here that the linear losses also correspond to the thermal dissipation in the output impedance $Z_{out}$ of the SCN electrical model described in Fig. 1.5. Indeed, computation of the electrical model efficiency shows that it is equal to $\eta_{max}$:

$$\eta_{model} = \frac{I_{out} \times V_{out}}{I_{out} \times (V_{out} + \Delta V)} = \frac{V_{out}}{V_{nl}} = \eta_{max}. \tag{1.17}$$

The linear losses  $P_{lin}$  may therefore be rewritten as a function of  $Z_{out}$  that will be reused in Chapter 2 to establish a sizing methodology for SC DC/DC converters:

$$P_{lin} = \frac{(V_{nl} - V_{out})^2}{Z_{out}}. \tag{1.18}$$
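As a short consistency check, Eq. (1.18) can be recovered from Eq. (1.16). The step assumes that $Z_{out}$ behaves as a purely resistive element in the model of Fig. 1.5 and that only the linear losses are considered, so that the lossless transformer draws an input power $P_{in} = V_{nl} \times I_{out}$ with $I_{out} = (V_{nl} - V_{out})/Z_{out}$:

$$P_{lin} = P_{in} \times \frac{V_{nl} - V_{out}}{V_{nl}} = V_{nl} \times I_{out} \times \frac{V_{nl} - V_{out}}{V_{nl}} = I_{out} \times (V_{nl} - V_{out}) = \frac{(V_{nl} - V_{out})^2}{Z_{out}}.$$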

#### 1.4.2 Bottom-plate losses

The second source of losses comes from the switching of the SCN internal nodes at each phase change, which leads to the charging and discharging of parasitic capacitances. As CMOS integrated capacitors usually have a planar orientation, parasitic capacitances between the bottom plate of the TCs and the substrate are often the main contributors, leading to the denomination *bottom-plate losses* $P_{BP}$ for these losses. Strictly speaking, however, the top-plate parasitic capacitances and the junction capacitances of the MOSFET power switches contribute to $P_{BP}$ as well. If we neglect the latter, $P_{BP}$ can be approximated by:

$$P_{BP} = f_{sw} \times \alpha_{BP} \sum_{i} \left( C_{Ti} \times \Delta V_{BPi}^2 \right), \qquad (1.19)$$

with  $\alpha_{BP}$  a technology-dependent parameter which is the ratio between parasitic capacitances (top and bottom) to the substrate and the useful TC, and  $\Delta V_{BPi}$  the voltage swing seen at the bottom plate of the  $i^{th}$  TC when a phase change occurs.

These losses depend on the SCN topology and on technological parameters. They can be an important source of efficiency loss since the parasitic capacitances are fully charged and discharged at each phase change, which is not the case for the useful TCs enabling the energy transfer. In order to reduce $P_{BP}$, [19] proposes a specific switching scheme that skips the reconfiguration of TCs with large $\Delta V_{BPi}$.
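Eq. (1.19) is straightforward to evaluate once the bottom-plate swings are known. The sketch below uses the divide-by-two network of Fig. 1.2, where the bottom plate of $C_{T1}$ sits at $V_{out}$ during $\phi 1$ and at ground during $\phi 2$, so that $\Delta V_{BP1} = V_{out}$. The value $\alpha_{BP}$ = 5% and the other numbers are assumptions for illustration only.

```python
def bottom_plate_losses(f_sw, alpha_bp, c_t, dv_bp):
    """Eq. (1.19): f_sw * alpha_BP * sum(C_Ti * dV_BPi^2), in watts."""
    return f_sw * alpha_bp * sum(c * dv * dv for c, dv in zip(c_t, dv_bp))


# Divide-by-two example: one 100 pF TC whose bottom plate swings by V_out.
p_bp = bottom_plate_losses(f_sw=10e6, alpha_bp=0.05,
                           c_t=[100e-12], dv_bp=[0.45])
print(f"P_BP = {p_bp * 1e6:.3f} uW")
```

Because the swing enters squared, topologies or switching schemes that avoid large $\Delta V_{BPi}$ reduce these losses most effectively.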


#### 1.4.3 Switch-driving losses

The power spent in the drivers of the power switches, $P_{driving}$, is the third source of losses: the *switch-driving losses*. It accounts for the charging and discharging of the gate capacitance of the large power switches, and for the power wasted in the buffer chain driving them. Thus these losses are proportional to the converter switching frequency and to the square of the supply voltage of the buffer chain $V_{DDbuf}$, which is usually $V_{in}$. They are also proportional to the sum of the power switch widths $W_i$:

$$P_{driving} = \epsilon \times C_{unit} \times V_{DDbuf}^2 \times f_{sw} \times \sum_i W_i, \qquad (1.20)$$

with $C_{unit}$ the switch gate capacitance per unit width, and $\epsilon$ a fitting parameter higher than 1 that includes the losses occurring in the switch driving buffer chain. $\epsilon$ depends on the sizing of the buffers. As a practical example, for a driving chain designed for an optimum power-delay product to avoid prohibitively long rise and fall times on the power switch gates [39], $\epsilon$ is approximately 1.25 [40].
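A minimal sketch of Eq. (1.20) follows. The value $\epsilon$ = 1.25 is the one quoted above; the gate capacitance per unit width (1 fF/µm) and the switch widths are assumed values used only to give an order of magnitude.

```python
def switch_driving_losses(epsilon, c_unit, v_dd_buf, f_sw, widths):
    """Eq. (1.20): epsilon * C_unit * V_DDbuf^2 * f_sw * sum(W_i), in watts."""
    return epsilon * c_unit * v_dd_buf ** 2 * f_sw * sum(widths)


# Four switches of 100 um each, C_unit = 1 fF/um = 1e-9 F/m (assumed).
p_driving = switch_driving_losses(epsilon=1.25, c_unit=1e-9, v_dd_buf=1.0,
                                  f_sw=10e6, widths=[100e-6] * 4)
print(f"P_driving = {p_driving * 1e6:.2f} uW")
```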

#### 1.4.4 Control losses

Peripheral circuits are needed in order to control the switches of the SCN:

- a primary clock to generate the switching frequency signal,
- a block to generate non-overlapping phases $\phi 1$ and $\phi 2$ to ensure that switches closed during one phase are opened before closing the switches of the other phase,
- circuits for the voltage regulation.

The power consumption of these circuits does not contribute to the power conversion and thus does not scale with the supplied power. Therefore it usually sets a lower limit to the power range that can be efficiently delivered by the converter, because the efficiency drops quickly when the load power gets smaller than the power consumption of the peripheral circuits [19], [21]. Digital circuits may be used to enhance the converter efficiency by replacing analog blocks [27] or by improving the switching scheme of the SCN [19]. A power estimate of the control circuitry $P_{control}$ is proposed in [19]:

$$P_{control} = C_{cont} \times f_{CLK} \times V_{in}^2 + I_{leak,control} \times V_{in}, \qquad (1.21)$$

where $C_{cont}$ is the average capacitance in the control circuits switched every period of the controller clock $f_{CLK}$, and $I_{leak,control}$ is the leakage current in the peripheral circuits.
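The efficiency floor set by the control losses can be illustrated with Eq. (1.21). The sketch below computes $P_{control}$ and the best efficiency reachable if the control circuitry were the only loss, $P_{out}/(P_{out} + P_{control})$, which follows from Eq. (1.12) with $P_{in} = P_{out} + P_{control}$. All numerical values are assumptions.

```python
def control_losses(c_cont, f_clk, v_in, i_leak):
    """Eq. (1.21): dynamic plus leakage power of the control circuits."""
    return c_cont * f_clk * v_in ** 2 + i_leak * v_in


def efficiency_ceiling(p_out, p_control):
    """Efficiency if the control power were the only loss."""
    return p_out / (p_out + p_control)


# Assumed controller: 1 pF switched at 10 MHz from 1 V, 10 nA leakage.
p_control = control_losses(c_cont=1e-12, f_clk=10e6, v_in=1.0, i_leak=10e-9)
for p_out in (1e-3, 1e-4, 1e-5, 1e-6):
    print(f"P_out = {p_out:.0e} W -> ceiling = {efficiency_ceiling(p_out, p_control):.3f}")
```

When the load power equals the control power the ceiling is 50%, which shows why the controller consumption bounds the usable load range from below.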



**Fig. 1.10.** Output voltage and current of an SC DC/DC converter supplying a digital load.

#### 1.4.5 Voltage-ripple losses

When a DC/DC converter supplies a load, it must ensure that $V_{out}$ never drops below a minimum value $V_{dd,min}$. Indeed, for lower $V_{out}$, faulty operation can occur in the load. For example, synchronous digital circuits can have timing failures if the data does not have enough time to propagate from one register to the following register in critical paths. However, the output voltage of SC DC/DC converters exhibits ripple due to the switching behavior of the converter. Therefore the average $V_{out}$ must be higher than $V_{dd,min}$ by at least half the maximal ripple magnitude, $V_{ripple}/2$, as illustrated in Fig. 1.10. The authors of [24] stated that this can be considered as a source of efficiency loss $P_{ripple}$, as it leads to a higher power consumption of the digital circuits, proportional to the square of $V_{ripple}/2$.

# 1.5 VOLTAGE REGULATION WITH SWITCHED-CAPACITOR NETWORKS

In this section we review existing ways to perform the voltage regulation in SC DC/DC converters. From Fig. 1.5, the current $I_{out}$ supplied to the load is given by the ratio $\Delta V/Z_{out}$, with $\Delta V = V_{nl} - V_{out}$. Most regulation techniques thus rely on the modulation of $Z_{out}$ or on the modification of the voltage transfer ratio of the SCN to adapt $V_{nl}$. Modulation of $Z_{out}$ aims at regulating $V_{out}$ while minimizing $P_{BP}$ and/or $P_{driving}$ by tuning the switch sizes, the TC sizes, or the switching frequency of the converter. The $V_{nl}$ modulation is performed by modification of the SCN topology, which reduces $P_{lin}$. Finally, the simultaneous use of several interleaved converters reduces the $V_{ripple}$ on the voltage delivered to the load, which reduces $P_{ripple}$.

#### 1.5.1 Frequency control mechanisms

The modulation of the switching frequency of an SC DC/DC converter is an efficient way to regulate its  $Z_{SSL}$  because it is inversely proportional to  $f_{sw}$  as



Fig. 1.11. Pulse frequency modulation control loop and associated waveforms: when  $V_{out}$  is above  $V_{ref}$  on the  $f_{sw}$  rising edge, the converter skips the next switching of the SCN.

stated in Eq. (1.4). As $Z_{FSL}$ is independent of $f_{sw}$, this control mechanism can only be used when the converter does not operate deep in the FSL regime. There are three actuation mechanisms of the switching frequency of SC DC/DC converters:

- The pulse frequency modulation (PFM) regulates the absolute value of the switching frequency: if the power consumed by the load increases, the switching frequency is increased as well to deliver more charges to the load [19].
- The pulse skipped modulation (PSM) always uses the same switching frequency. As illustrated by Fig. 1.11, a voltage comparator compares $V_{out}$ to a reference voltage. As long as $V_{out}$ is higher than the reference voltage, switching cycles are skipped. As the load still consumes charges, $V_{out}$ decreases, and when $V_{out}$ gets lower than the reference voltage, charge transfer is enabled until $V_{out}$ rises again above the reference voltage [20], [21], [25], [41].
- The duty cycle modulation or pulse width modulation (PWM) is based on the fact that $Z_{FSL}$ rises for a duty cycle of $f_{sw}$ that is not 50% [26], even though $Z_{FSL}$ is independent of $f_{sw}$ itself. PWM actually limits the charge transfer through the switches by reducing the ON period of switches that have a low conductance.

The PFM and PSM techniques have the advantage of scaling both $P_{BP}$ and $P_{driving}$ with the load current, as the effective $f_{sw}$ is reduced when $I_{out}$ lowers. However, their drawback is that the frequency of the noise due to the switching operation on $V_{in}$ and $V_{out}$ is not predictable and has a wide range, which may be unacceptable for some analog applications such as RF circuits [18].
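The pulse-skipping loop can be sketched with a behavioral model. It assumes a divide-by-two converter in the SSL regime, an output capacitor much larger than the TC, and a charge of $4 \times C_{T1} \times (V_{nl} - V_{out})$ delivered per enabled switching cycle (twice the per-phase charge of Section 1.4.1). The comparator is evaluated once per clock period. All component values and names are illustrative.

```python
def simulate_psm(v_in, v_ref, c_t, c_out, i_out, f_clk, cycles):
    """Behavioral pulse-skipping regulation of a divide-by-two converter."""
    v_nl = v_in / 2.0
    period = 1.0 / f_clk
    v_out = v_ref
    active = 0
    trace = []
    for _ in range(cycles):
        if v_out < v_ref:
            # Comparator enables one switching cycle: a charge packet of
            # 4 * C_T1 * (V_nl - V_out) is dumped on the output capacitor.
            v_out += 4.0 * c_t * (v_nl - v_out) / c_out
            active += 1
        # The load discharges the output capacitor during the whole period.
        v_out -= i_out * period / c_out
        trace.append(v_out)
    return active / cycles, min(trace), max(trace)


fraction, v_min, v_max = simulate_psm(v_in=1.0, v_ref=0.45, c_t=100e-12,
                                      c_out=10e-9, i_out=50e-6,
                                      f_clk=10e6, cycles=10000)
print(f"enabled cycles: {fraction:.1%}, V_out in [{v_min:.4f}, {v_max:.4f}] V")
```

In this model the fraction of enabled cycles tracks the load current, which is the mechanism by which the effective $f_{sw}$, and with it $P_{BP}$ and $P_{driving}$, scales with $I_{out}$.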

The PWM is not often used because it does not provide any $P_{BP}$ or $P_{driving}$ savings when the load power reduces [31], since $f_{sw}$ does not scale with the load current. However, as $f_{sw}$ stays constant, the noise pattern on $V_{out}$ is at a known frequency.

In this thesis we aim at supplying loads that have large power consumption fluctuations. Therefore in order to keep the power conversion efficiency high across the load power range, PFM (Chapters 3, 4 and 5) and PSM (Chapters 3 and 5) techniques will be used. Furthermore, we will analyze the impact of the converter output voltage ripple on the load behavior when it may threaten its functionality.

#### 1.5.2 Switch conductance control mechanisms

Modulation of the switch sizes also makes it possible to regulate the output voltage against line and load variations [42]. Indeed, $Z_{FSL}$ is inversely proportional to the width of the switches, as stated in Eq. (1.10). Therefore, by adapting this width, it is possible to modify $Z_{out}$ accordingly as long as the converter does not operate deep in the SSL regime, and thus to control the voltage drop on $Z_{out}$. To this end, the power switches are divided into arrays and only part of the arrays are activated each switching cycle depending on the load current [19], [43]. Unfortunately, this only allows a discrete control of the load [31]. Nevertheless, as the driven switch gate area is reduced with $I_{out}$, this control mechanism makes $P_{driving}$ scale with $P_{out}$. Thus it is also interesting to adapt the switch sizes even if the converter is not in the FSL regime: the switch width tuning does not trigger any line or load regulation but $P_{driving}$ still reduces [43].

Another way to use the switch to control  $Z_{FSL}$  is to control the magnitude of the gate to source voltage  $V_{GS}$  of the switches, but the *ON*-resistance of the switch is a non-linear function of the  $V_{GS}$  voltage making it difficult to implement [31].

#### 1.5.3 Capacitor control mechanism

Another way to regulate the load is by adapting the TC sizes to the line and load conditions. Indeed, as long as the converter does not operate deep in the FSL regime, the amount of charge transferred to the load at each switching cycle is proportional to the transfer capacitor sizes. Therefore each individual TC can be divided into several banks, and only some of these banks participate in the charge transfer depending on the amount of charge that needs to be transferred to the load [18], [20].

This technique has the advantage that the  $P_{BP}$  scales with the load current as only part of the transfer capacitors and thus of their parasitic capacitances are charged and discharged at every switching cycle. Furthermore as the switching frequency is fixed, the noise pattern of the output voltage has a known frequency [18]. Another advantage is that for low current loads the switching frequency can be maintained high without reducing dramatically the power-conversion efficiency thanks to the  $P_{BP}$  reduction. Therefore the output voltage ripple can be mitigated thanks to RC filtering via the ON resistance of the power switches



Fig. 1.12. (a) Building block to enable DC/DC converters with multiple $V_{nl}/V_{in}$ conversion ratios, (b) divide-by-two and divide-by-three topology example built with two elementary units [24].



Fig. 1.13. Trade-off between maximal power that can be supplied and $\eta_{max}$ of an SC DC/DC converter. Data for a divide-by-two topology (see Fig. 1.2), with $V_{in} = 1$ V, $f_{sw} = 10$ MHz, $C_{T1} = 100$ pF and ideal switches.

[20], [44]. However it is not possible to sweep continuously the transfer capacitor sizes.

#### 1.5.4 Reconfiguration of the switched-capacitor network

Linear losses occur because of the voltage gap  $\Delta V$  between  $V_{out}$  and  $V_{nl}$  as explained in Section 1.4.1. Therefore an SC DC/DC converter is inefficient to supply loads with a wide range of output voltages.

A solution to this limitation comes from controllers adapting the converter topology to the ratio  $V_{out}/V_{in}$  to dynamically change the  $V_{nl}$  and thus the  $\Delta V$  [18], [24]. Therefore, linear losses are kept low for the whole range of  $V_{out}$ . This kind of control does not modulate the  $Z_{out}$  value as other control mechanisms seen previously, and can thus be used both in the SSL and in the FSL regimes.

However, only discrete ratios $V_{nl}/V_{in}$ can be achieved for a given number of transfer capacitors, as shown in Table 1.1. Furthermore, the output impedance of the converter in both the SSL and FSL regimes depends on the SCN topology, as seen in Eq. (1.4) and (1.10). Both these concerns make this approach difficult to implement in practical designs. Controllers using this technique are thus often combined with other control mechanisms [18], [24] and are implemented by dividing the TCs into smaller elementary units, such as in Fig. 1.12 (a), that can be connected together depending on the controller request, as illustrated in Fig. 1.12 (b).

Selection of the proper conversion ratio $V_{nl}/V_{in}$ amongst an available set may seem straightforward: the closer $V_{nl}$ to $V_{out}$, the lower the linear losses, leading to a higher $\eta_{max}$. However, as $V_{nl}$ gets closer to $V_{out}$, the power that can be delivered to the load reduces [19], as illustrated in Fig. 1.13. This is because, when delivering charges to the load, (i) in the SSL regime the $\Delta V$ between $V_{out}$ and $V_{nl}$ is smaller, which quickly stops the charge transfer, and (ii) in the FSL regime the power switches have a small voltage $V_{DS}$ across the channel, which limits the current that flows through their ON resistance.

Even if increasing the number of available SCN topologies reduces the linear losses, it comes at the cost of an increased design complexity, as the controller must be able to select the best topology given the load conditions ($V_{out}$, $P_{out}$), and as the number of components (switches and transfer capacitors) increases. Therefore most designs use one to three SCN topologies [17], [20], [22], [24] fitting the typical load profiles of the targeted application. Some designs use up to five different SCN topologies to supply dynamic voltage scaled microcontrollers [19], [21] (i.e. loads whose voltage may have any value across a range of more than 200 mV). On top of the costs in terms of design complexity, the number of SCNs coexisting in a single design cannot be increased indefinitely, because the multiplication of the switches lowers the converter power-conversion efficiency due to their leakage and to their parasitic capacitance.
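The trade-off between $\eta_{max}$ and deliverable power can be expressed as a simple selection rule. The sketch below picks, among an available set of ratios, the one with the highest $\eta_{max} = V_{out}/V_{nl}$ whose maximum output power from Eq. (1.2) still covers the requirement. It assumes, for simplicity, the same $Z_{out}$ for every topology, which the text above notes is not true in general; the 250 Ω value corresponds to $Z_{SSL}$ of a 100 pF divide-by-two TC at 10 MHz. The function name is hypothetical.

```python
from fractions import Fraction


def select_ratio(ratios, v_in, v_out, p_required, z_out):
    """Return (m, eta_max, p_max) of the most efficient feasible ratio, or None."""
    best = None
    for m in ratios:
        v_nl = float(m) * v_in
        if v_nl <= v_out:
            continue  # this ratio cannot reach the requested output voltage
        p_max = (v_nl - v_out) * v_out / z_out   # Eq. (1.2)
        eta = v_out / v_nl                       # linear-loss bound
        if p_max >= p_required and (best is None or eta > best[1]):
            best = (m, eta, p_max)
    return best


ratios = [Fraction(1, 3), Fraction(1, 2), Fraction(2, 3), Fraction(1)]
for p_req in (50e-6, 200e-6):
    m, eta, p_max = select_ratio(ratios, 1.0, 0.45, p_req, 250.0)
    print(f"P_required = {p_req * 1e6:.0f} uW -> m = {m}, eta_max = {eta:.3f}")
```

With these numbers the light load is served by the divide-by-two ratio, whereas the heavier load forces the less efficient 2/3 ratio.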

#### 1.5.5 Converter interleaving

This section does not present another control mechanism of  $V_{out}$  but rather a way to reduce the voltage ripple  $V_{ripple}$  on  $V_{out}$ . The voltage ripple occurring at the output voltage of the converter is due to the switching behavior of SC






Fig. 1.14. (a) Example of a divide-by-two converter with a filtering capacitor and a current source load, (b) waveform of the current flow in the converter and resulting ripple on $V_{out}$, (c) example of a divide-by-two topology with two interleaved converters, (d) waveform of the current flow with two interleaved converters and of the resulting ripple on $V_{out}$.

DC/DC converters. To explain it, let us assume that the load is made of a filtering capacitor $C_{out}$ and of a current source $I_{out}$, and that the converter is built with a divide-by-two topology as shown in Fig. 1.14 (a). During $\phi 1$, $C_{T1}$ is serially connected between $V_{in}$ and $C_{out}$: there is first a charge surge flowing to the load, and $I_{out}$ consumes fewer charges than provided by the SC DC/DC converter. Therefore $C_{out}$ stores the excess charges, which leads to a rise of $V_{out}$ as shown in Fig. 1.14 (b). Then, as the voltage on the upper plate of $C_{T1}$ gets closer to $V_{in}$, the charge flow through $C_{T1}$ reduces so that $I_{out}$ consumes more charges than supplied by the SC DC/DC converter. Therefore part of the charges consumed by $I_{out}$ are provided by $C_{out}$, which leads to a lowering of $V_{out}$ as shown in Fig. 1.14 (b). Just after the phase change from $\phi 1$ to $\phi 2$, there is a new charge surge from $C_{T1}$ to the load when $C_{T1}$ is connected in parallel to the load, making $V_{out}$

rises again. As  $C_{T1}$  is discharged into the load, the charge flow reduces and when it gets lower than  $I_{out}$ ,  $V_{out}$  decreases. The alternate rise and fall of  $V_{out}$  during one switching cycle of the SC DC/DC converter sets the voltage ripple. From this example, its magnitude is proportional to the TC sizes.

A way to reduce the ripple magnitude is to split the SC DC/DC converter into several smaller converters and to apply a phase shift to their switching signals [24], [25] as illustrated in Fig. 1.14 (c). This causes the number of TCs to increase by a factor  $K_{int}$  corresponding to the number of interleaved converters. It leads to a reduction of  $V_{ripple}$  by approximately the factor  $K_{int}$  [24] as illustrated in Fig. 1.14 (d). It is even possible to remove  $C_{out}$  of Fig. 1.14 (a) as the transfer capacitors of each interleaved converter act as filtering devices for the other converters [24]. It significantly reduces the area and manufacturing costs of fully integrated SC DC/DC converters.
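The $1/K_{int}$ scaling of the ripple can be illustrated with a deliberately simplified model. The sketch below assumes that every phase change delivers its charge packet instantaneously (an idealization, not the waveform of Fig. 1.14), that each converter delivers charge in both phases, and that the load is a constant current $I_{out}$ on a filtering capacitor $C_{out}$. The function name and the numerical values are illustrative assumptions.

```python
def ripple_peak_to_peak(i_out, f_sw, c_out, k_int=1):
    """Peak-to-peak output ripple (V) of an idealized interleaved converter.

    Assumptions (not taken from the text): charge packets arrive
    instantaneously, each converter delivers two packets per switching
    period, and the interleaving phase shifts spread the packets evenly
    in time. Between two packets the load current discharges c_out linearly.
    """
    if k_int < 1:
        raise ValueError("k_int must be at least 1")
    time_between_packets = 1.0 / (2.0 * f_sw * k_int)
    return i_out * time_between_packets / c_out


if __name__ == "__main__":
    # Hypothetical operating point: 333 uA load, 5 MHz switching, 1 nF at the output.
    for k in (1, 2, 4, 10):
        print(k, ripple_peak_to_peak(333e-6, 5e6, 1e-9, k))
```

Under these assumptions the ripple is inversely proportional to the number of interleaved converters, which is consistent with the approximate factor $K_{int}$ reported in [24].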

For SC DC/DC converters using discrete components, interleaving increases the manufacturing costs. With on-chip integration, however, the area penalty is small as long as the converter area is dominated by the capacitors, because on-chip capacitors can easily be split into capacitor banks [25]. The design complexity increases with the number of interleaved converters, as more capacitors and switches must be interconnected and phase-shifted clocks must be generated and routed. According to [24], there is no real benefit in increasing $K_{int}$ above 10 because (i) the charge peaks are already reduced by one order of magnitude and (ii) the design complexity of the controller, and specifically of the clock propagation circuits, increases with $K_{int}$. Further, as some of the controller circuits must be duplicated for each interleaved converter, the controller area and power overhead increase with $K_{int}$.

#### 1.6 CONCLUSION

In this first chapter, we reviewed the operation principles as well as a behavioral model of switched-capacitor DC/DC converters. The energy transfer mechanism is based on the reconfiguration of a capacitor network every half cycle of the switching frequency. Thanks to the use of capacitors as only passive devices, this kind of DC/DC converter can be co-integrated on-chip with the supplied load.

We also reviewed the sources of loss occurring in such DC/DC converters. Besides the losses inherent to the switching behavior, the linear losses set an upper limit $\eta_{max}$ on the voltage-conversion efficiency. In most designs they are the main contributor to efficiency loss. However, smart selection and adaptation of the capacitor network topology keeps $\eta_{max}$ high, close to 90%. Other sources of efficiency loss scale with the power consumed by the supplied load. Proper selection of the transfer capacitance, the switch sizes and the switching frequency allows energy to be delivered efficiently to low- and ultralow-power loads. In Chapter 2, we develop a sizing methodology to find the optimum design point of the SC DC/DC converter, and we explore the trade-off space between converter area and efficiency. Then, in Chapter 3, we address the problem of supplying loads with a wide range of power consumption.

Finally, we analyzed the control mechanisms that can be used to perform line and load regulation. Most of these mechanisms modulate the output impedance of the converter to adapt to the line and load conditions. This is performed by tuning either the switching frequency, the transfer capacitor sizes or the switch characteristics. PWM control induces $P_{driving}$ and $P_{BP}$ losses that do not scale with the load power, which is not acceptable when a large load range must be supplied. PSM and PFM keep $P_{driving}$ and $P_{BP}$ proportional to the load power and thus maximize the load range that can be supplied efficiently by the converter. This comes at the cost of a variable voltage ripple frequency.

On top of an accurate control of the converter output impedance by modulation of its switching frequency, a rough adaptation of the transfer capacitor and switch sizes to the load power consumption keeps $P_{driving}$ and $P_{BP}$ low. However, load regulation based only on the modulation of the transfer capacitor or switch characteristics requires either operating the converter in the energy-inefficient fast switching limit or dividing each capacitor and switch into a large number of smaller components, thus increasing the design complexity.

Another control loop can be used to adapt the topology of the capacitor network to the ratio $V_{out}/V_{in}$. It increases the design complexity but reduces the linear losses when $V_{out}$ or $V_{in}$ have large fluctuations. Finally, interleaving several converters reduces the output voltage ripple. In most designs, several of these control mechanisms are used. In Chapters 4 and 5, we propose specific controllers to adapt the optimal output voltage of the converter to the environmental conditions and to integrate the converter in a power management unit interfaced with energy harvesters and storage devices.

**CHAPTER 2**

# SIZING METHODOLOGY FOR SWITCHED CAPACITOR DC/DC CONVERTERS

# Abstract

In this second chapter, we develop a systematic sizing methodology for switched-capacitor DC/DC converters aimed at maximizing the converter efficiency under a die area constraint. To do so, we first evaluate the optimum transfer capacitor size and switch relative width. Then, we propose an analytical solution for the optimum switching frequency. It shows that when the parasitic capacitances are low, this solution leads to an identical contribution of the switches and transfer capacitors to the converter output impedance. As the parasitic capacitances increase, the optimum switching frequency decreases and the switch size increases because of the extra losses associated with these parasitic capacitances. Once the optimum switching frequency is known, the absolute switch sizes are determined. We show that the overdrive voltage strongly impacts the optimum switch width through the modification of the switch conductance.

To support the sizing methodology, the models of the behavior and efficiency of switched-capacitor DC/DC converters proposed in Chapter 1 are used. They are validated against simulation and measurement results in 65nm and 0.13$\mu$m CMOS, respectively, and the sizing methodology is then validated by comparing its outputs against an exhaustive search for the optimum sizes. Finally, the proposed methodology shows how the converter efficiency can be traded off for die area reduction and what the impact of parasitic capacitances on the converter sizing is.

#### Contents

| Section | Title                                          |
|---------|------------------------------------------------|
| 2.1     | Introduction                                   |
| 2.2     | Proposed sizing methodology                    |
| 2.3     | Validation of the methodology                  |
| 2.4     | Discussion and exploitation of the methodology |
| 2.5     | Conclusion                                     |

# 2.1 INTRODUCTION

In the first chapter, we saw how it is possible to build DC/DC converters using only capacitors to supply ULP SoC circuits such as wireless sensor nodes for the Internet of Things at an internal voltage  $V_{DD}$  lower than the system supply voltage [5]. Efficient generation of this ultra-low voltage is thus as important as minimizing the energy of the supplied blocks.

Despite many publications on SC DC/DC converters, they are less straightforward to design than inductive or linear converters. For example, several topologies can be used to obtain the same transfer ratio [36]. Moreover, the converter behaves differently depending on the switching frequency [34], as seen in Chapter 1. When switching at a low frequency (slow switching limit regime), the converter output impedance is limited by the size of the transfer capacitors, while at a high switching frequency (fast switching limit regime), it is limited by the switch sizes. Designers thus need a systematic sizing methodology that maximizes the converter efficiency for given area and load power constraints, as presented in Fig. 2.1.

In [36], the authors propose a methodology to size the switches and capacitors of SC DC/DC converters based on the asymptotic behavior of the output impedance. The authors of [24] give a solution for the optimum switching frequency in the two cases where either the power used for driving the switches or the power dissipated in the parasitic capacitances of the transfer capacitors can be neglected. However, none of these references answers the question of finding the optimum operating point between the slow switching limit and fast switching limit regimes. Furthermore, previously proposed sizing studies [36] assumed an identical overdrive voltage on each switch. The switch design was mainly focused on area optimization to reduce switch driving losses. However, in the case of down-conversion to 0.3V-0.6V, the conductance of the switches can be dramatically lowered by the reduced overdrive voltage. The switch conductance thus also depends on the SC DC/DC converter topology and on the voltage levels seen by the converter.

In order to fully help designers to solve the problem illustrated in Fig. 2.1, a complete sizing methodology providing both (i) the optimum switching frequency in non-asymptotic regime and (ii) the optimum capacitor and switch sizes must be developed.

This chapter addresses this problem by:

- finding an analytical solution of the optimum switching frequency that maximizes the efficiency based on the models of conversion efficiency and output impedance developed in Chapter 1,
- proposing a sizing methodology exploiting this expression of the optimum switching frequency.

Fig. 2.1. Problem formulation.

Because switches have different overdrive voltages, the study of the optimum switch sizes includes a weight factor for each switch, representative of its conductance. The sizing methodology results allow for evaluating the impact of the bottom-plate capacitances on the converter design and for exploring the achievable converter efficiency under power and die area constraints. Let us mention that we focus here on voltage down-conversion. Stacked switches are thus not considered, and it is assumed that neither the input voltage nor the output voltage exceeds the maximum voltage of the CMOS technology. If there is a risk of having a higher voltage on $V_{out}$ or $V_{in}$, cascoded switches [45, 46] or high-voltage CMOS switches [47] must be used. Beyond these considerations, the results presented here can be extended to the case of voltage up-conversion.

This chapter is organized as follows. The proposed sizing methodology is detailed in Section 2.2. Section 2.3 validates this work through simulation and measurement results, and finally Section 2.4 uses the sizing methodology results to explore the trade-off between converter area and efficiency.

#### 2.2 PROPOSED SIZING METHODOLOGY

In this section, we use the models of the conversion efficiency and output impedance presented in Sections 1.4 and 1.3, respectively, to propose a practical and systematic sizing methodology to design SC DC/DC converters. It is assumed that the converter is fully integrated on-chip so that most of the converter die area is dedicated to the TCs. As ripple on the output voltage can be addressed either by the addition of an off-chip filtering capacitor or by using interleaved converters [24], it is not treated here. The sizing methodology, illustrated in Fig. 2.2, can be divided into six main steps:

- selection of the SC DC/DC converter topology,
- sizing of the transfer capacitors,
- relative sizing of the switches $s_i = W_{Si} / \sum W_{Si}$,
- partitioning of the output impedance between $Z_{SSL}$ and $Z_{FSL}$,
- selection of the switching frequency $f_{sw}$ and sizing of the total switch width $W_{S,tot}$,
- losses and efficiency calculation.

Fig. 2.2. Proposed SC DC/DC converter sizing methodology.

As load power is dependent on its workload and PVT corners,  $P_{out}$  must be selected at the worst case (highest value). However, large  $P_{out}$  fluctuations due to different operating modes of the load circuits must be addressed with specific design steps presented in Chapter 3 to ensure good conversion efficiency across the full converter  $P_{out}$  range. Simulation results presented in this section are from an industrial 65nm CMOS process. Model parameters are summarized in Tables 2.1 and 2.2. Technology parameters from a  $0.13\mu m$  CMOS technology used to manufacture a test-chip presented in Section 2.3 are also provided.

#### 2.2.1 Topology selection

**Table 2.1.** Extracted CMOS technology parameters.

| Parameter                                      | $65 \mathrm{nm}~\mathrm{GP}$ | $0.13\mu\mathrm{m~GP}$ |
|------------------------------------------------|------------------------------|------------------------|
| $V_T$ [V] NMOS                                 | 0.33                         | 0.44                   |
| $V_T$ [V] PMOS                                 | -0.27                        | -0.43                  |
| $k_n \left[ F/(V \cdot s \cdot \mu m) \right]$ | $5.21\times 10^{-3}$         | $2.52\times 10^{-3}$   |
| $k_p \ [F/(V \cdot s \cdot \mu m)]$            | $1.63\times 10^{-3}$         | $0.73\times 10^{-3}$   |
| $\alpha_{BP}$ [%]                              | 0.4                          | 1.2                    |
| $C_{unit}$ [fF]                                | 1.5                          | 1.66                   |
| $\epsilon$ [/]                                 | 1.24                         | 1.24                   |

**Table 2.2.** Capacitor types available in 65nm and $0.13\mu$m industrial CMOS processes (values from post-layout extraction).

|                  | C [fF/ $\mu m^2$ ] | Parasitics $(\alpha_{BP})$ | Specificities                |
|------------------|--------------------|----------------------------|------------------------------|
| MoM 65nm         | 1.5                | 5-10%                      | -                            |
| MiM 65nm         | 5                  | <1%                        | Requires extra mask option   |
| MiM $0.13 \mu m$ | $2^{\diamond}$     | NA                         | Requires extra mask option   |
| MOS 65nm         | 2-12               | 5-10%                      | C varies with the voltage    |
| eDRAM 65nm       | $105^{\diamond}$   | NA                         | Requires DRAM process option |

$^{\diamond}$ from design rules manuals

It is first required to select an SCN topology and the underlying conversion ratio $V_{nl}/V_{in}$. As the maximum efficiency of the converter is set by Eq. (1.15), $m \times V_{in}$ should be close to $V_{out}$. Once the conversion ratio is chosen, several SCN topologies can be implemented, as seen in Section 1.2.2. A topology with either good FSL or good SSL behavior can be selected depending on the CMOS process, but the SC DC/DC converter topology selection is out of the scope of this dissertation as it is already studied in depth in [36]. With a CMOS process that is switch limited (i.e. the switch conductance per $\mu m$ of width is poor, requiring large switches to enable the charge transfer), a topology with good FSL behavior should be preferred to minimize the switch sizes and the associated $P_{driving}$ losses. On the other hand, if the process is capacitor limited, a topology with good SSL performance limits the capacitor area and the losses associated with the parasitic (bottom-plate) capacitance. The proposed methodology can be used to quickly explore the efficiency achieved with different topologies.
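The choice of the conversion ratio in Section 2.2.1 can be sketched as a small selection routine. The sketch assumes that the linear-loss efficiency bound takes the usual form $\eta_{max} = V_{out}/(m \times V_{in})$; Eq. (1.15) is not reproduced in this chapter, so this form is an assumption, as are the function name and the list of candidate ratios.

```python
from fractions import Fraction


def select_conversion_ratio(v_in, v_out, ratios):
    """Pick the smallest ratio m whose no-load voltage m*v_in reaches v_out.

    Returns (m, eta_max) with eta_max = v_out / (m * v_in), the assumed
    form of the linear-loss efficiency limit.
    """
    feasible = [m for m in ratios if float(m) * v_in >= v_out]
    if not feasible:
        raise ValueError("no candidate ratio can reach the requested v_out")
    m = min(feasible)
    return m, v_out / (float(m) * v_in)


if __name__ == "__main__":
    # Hypothetical set of ratios offered by a reconfigurable SCN.
    candidates = [Fraction(1, 3), Fraction(1, 2), Fraction(2, 3)]
    print(select_conversion_ratio(1.0, 0.3, candidates))
```

For the 1V to 300mV example used later in this chapter, the routine selects the divide-by-three ratio, in line with the requirement that $m \times V_{in}$ be close to $V_{out}$.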

# 2.2.2 Capacitor sizing

Once the topology is selected, the TCs need to be sized to allow us to define the lowest achievable  $Z_{SSL}$  curve. As illustrated in Fig. 2.2, this is performed in three steps: (i) selection of the capacitor type, (ii) evaluation of  $C_{T,tot}$ , and (iii) computation of the optimum TCs.



**Fig. 2.3.** Available capacitors in CMOS process: (a) MoM capacitor, (b) MiM capacitor, (c) MOS capacitor, (d) Trench capacitor.

Depending on the process technology, several implementations of capacitors might be available. The two main properties to look for when choosing how to build the capacitors are the capacitor density and the ratio $\alpha_{BP}$ between the parasitic (bottom-plate) capacitance to the substrate and the effective capacitance. Indeed, the density defines the maximum $C_{T,tot}$ for a given die area constraint, and the bottom-plate losses $P_{BP}$ are proportional to $\alpha_{BP}$. Fig. 2.3 illustrates the capacitors available in CMOS processes.

- MoM or fringe capacitors are built with the standard metal layers. Therefore they require no additional mask or process step, but they have a poor density and a non-negligible parasitic capacitance.
- MiM capacitors use specific masks and process steps. In advanced CMOS technologies, they have a thin dielectric layer with a higher dielectric constant than the dielectric between standard metal layers, enabling a better capacitor density. Furthermore, they are built far from the substrate to reduce the bottom-plate capacitance.
- MOS capacitors are built using the gate of transistors. The thin gate oxide and the possible high-$\kappa$ dielectric of recent CMOS processes lead to a high density. However, the capacitors are close to the substrate and therefore have high bottom-plate capacitances, except in silicon-on-insulator (SOI) technologies. Moreover, the capacitance varies with the applied voltage.

#### **30** A SIZING METHODOLOGY FOR ON-CHIP SWITCHED CAPACITOR DC/DC CONVERTERS

- Embedded DRAM process options offer ultra-dense trench capacitors to build the DRAM storage capacitors and feature low relative bottom-plate parasitics [31]. They require specific masks and process steps for DRAM manufacturing.

Table 2.2 summarizes the available capacitors in an industrial 65nm CMOS process with post-layout estimates of capacitance densities and bottom-plate parasitic capacitance. Based on the choice of the capacitor type,  $C_{T,tot}$  is calculated to fill the available die area.

The ratio between each TC and $C_{T,tot}$ is then evaluated. To do so, we consider the asymptotic SSL behavior of the converter, in which the charge transfers are switch-independent and fully determined by the SCN. To find the optimum value of the TCs, [36] optimizes $Z_{SSL}$; in other words, it finds the relative TC sizing that maximizes the charge transfer to the load at every switching cycle:

$$\frac{C_{Ti}}{C_{T,tot}} = \frac{|a_{ci}|}{\sum_{j} |a_{cj}|},\tag{2.1}$$

with $C_{Ti}$ the $i^{th}$ TC.
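As an illustration of Eq. (2.1), the following sketch distributes a total transfer capacitance among the TCs. The coefficients $a_{ci}$ are taken here to be the per-capacitor coefficients of the SSL model of Chapter 1; the function names and the example values are hypothetical.

```python
def relative_tc_sizes(a_c):
    """Relative TC sizes C_Ti / C_T,tot following Eq. (2.1)."""
    total = sum(abs(a) for a in a_c)
    if total == 0:
        raise ValueError("at least one coefficient must be non-zero")
    return [abs(a) / total for a in a_c]


def absolute_tc_sizes(a_c, c_t_tot):
    """Absolute TC values for a total capacitance c_t_tot (same unit as c_t_tot)."""
    return [c * c_t_tot for c in relative_tc_sizes(a_c)]


if __name__ == "__main__":
    # Hypothetical coefficients for a three-capacitor network, 1 nF in total.
    print(absolute_tc_sizes([1.0, -2.0, 1.0], 1e-9))
```

Only the magnitudes of the coefficients matter, so a capacitor that carries twice the charge of another receives twice its capacitance.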

#### 2.2.3 Relative switch sizing

In this design step, the relative size of the switches $s_i = W_{Si}/W_{S,tot}$ is evaluated. It allows for evaluating the lowest achievable value of $Z_{FSL}$ as a function of $W_{S,tot}$. For sizing the switches, we consider the asymptotic FSL behavior of the converter. In the fast switching limit, the charge transfers are independent of the capacitors because the switches are too small to transfer all the charges before the phase change of the capacitor network. We therefore look for the relative switch sizes that maximize the charge transfer to the load.

The conductance of switch i is given by:

$$G_i = G_{i0} \times W_{S,tot} \times s_i, \tag{2.2}$$

with $s_i$ the ratio between $W_{Si}$ and $W_{S,tot}$, and $G_{i0}$ the switch conductance per unit width. $G_{i0}$ can be estimated by Eq. (2.3) as long as the MOSFET power switch is in triode mode while it is turned ON, i.e. as long as the voltage $V_{DS}$ across the switch is smaller than the saturation voltage $V_{Dsat}$. This is the case here as long as $\Delta V$ is small; for example, for the divide-by-two topology of Fig. 2.4, $V_{DS}$ is at most equal to $2 \times \Delta V$:

$$G_{i0} \approx k_{n/p} \times (V_{GS} - V_{th}), \qquad (2.3)$$

where $k_{n/p}$ is a parameter equal to $\frac{C_{ox} \times \mu_{n/p}}{L}$, with $C_{ox}$ the gate capacitance per unit area, $\mu_{n/p}$ the carrier mobility, and $L$ the switch gate length. Let us mention that $V_{th}$ depends on the body effect. In the 65nm and 0.13$\mu$m CMOS technologies considered in this chapter, $V_{th}$ can be considered constant to a first approximation. In advanced technologies, and especially in FD SOI technologies, special care must be taken with the body-effect dependence of $V_{th}$, as the body may act as a second gate. At this stage, $W_{S,tot}$ is unknown. It will be determined in the next steps of the sizing methodology. However, it is possible to find the optimum $s_i$ without knowing $W_{S,tot}$, and therefore to determine the relative switch sizes.

**Fig. 2.4.** (a) Divide-by-two SC DC/DC converter, (b) Modeled conductance of the power switches from Eq. (2.3) per $\mu$m for $V_{bs}=0$.

In [36], the authors performed the sizing of the switches based on a cost constraint related to the total switch area, assuming that the conductance of any switch in the capacitor network is only proportional to its width. In the case of down-conversion to voltages in the 0.3V to 0.6V range, the switch conductance must be weighted by a factor that depends on how small the overdrive voltage of the switch is. As an example, Fig. 2.4 (b) shows the conductance per $\mu$m of width of the switches of the divide-by-two converter shown in Fig. 2.4 (a). The factor 3.02 between S1 and S2 comes from the fact that a PMOS is used for S1 and an NMOS for S2. However, the factor 2.13 between S3 or S4 and S2 arises because S2 has a larger overdrive voltage than S3 or S4, which have their source terminal connected to $V_{out}$ instead of the ground. Furthermore, switches S3 and S4 have a negative body bias if their substrate is connected to the ground, which increases their threshold voltage and further reduces their overdrive voltage and conductance.
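The dependence of the switch conductance on the overdrive voltage can be sketched from Eq. (2.3) and the 65nm parameters of Table 2.1. The bias conditions below (gates driven between ground and a 1V rail, body effect ignored) are assumptions for illustration; the conditions of Fig. 2.4 (b) are not stated here, so the printed ratios are not expected to reproduce the factors quoted above.

```python
# 65nm parameters from Table 2.1 (threshold voltages used as magnitudes).
K_N = 5.21e-3   # F/(V.s.um), NMOS
K_P = 1.63e-3   # F/(V.s.um), PMOS
VT_N = 0.33     # V
VT_P = 0.27     # V, |V_T| of the PMOS


def conductance_per_um(kind, v_gate, v_source):
    """Triode-mode conductance per um of width, Eq. (2.3), in S/um.

    Body effect is ignored (assumption). Returns 0.0 when the overdrive
    voltage is not positive, where the simple model no longer applies.
    """
    if kind == "nmos":
        k, v_ov = K_N, (v_gate - v_source) - VT_N
    elif kind == "pmos":
        k, v_ov = K_P, (v_source - v_gate) - VT_P
    else:
        raise ValueError("kind must be 'nmos' or 'pmos'")
    return k * v_ov if v_ov > 0 else 0.0


if __name__ == "__main__":
    v_in, v_out = 1.0, 0.45  # hypothetical rail and output voltages
    g_gnd = conductance_per_um("nmos", v_gate=v_in, v_source=0.0)
    g_out = conductance_per_um("nmos", v_gate=v_in, v_source=v_out)
    g_pmos = conductance_per_um("pmos", v_gate=0.0, v_source=v_in)
    print(g_gnd / g_out, g_gnd / g_pmos)
```

An NMOS whose source sits at $V_{out}$ sees a smaller overdrive than one whose source is grounded, which is why its conductance per unit width is lower for the same gate drive.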

To find the optimum width of the switches, we use the method of Lagrange multipliers to find the combination of switch sizes $W_{Si}$ that leads to the lowest $Z_{FSL}$ for a given total switch width $W_{S,tot}$. We therefore define a constraint stating that $W_{S,tot}$ must be the sum of the widths of all the switches:

$$W_{S,tot} = \sum_{i} W_{Si} = W_{S,tot} \times \sum_{i} s_i.\tag{2.4}$$

To optimize the $Z_{FSL}$ impedance given by Eq. (1.10), we obtain the following Lagrange function, where $\lambda$ is the Lagrange multiplier:

$$\Lambda(W_{Si},\lambda) = Z_{FSL} + \lambda \times \left(\sum_{i} W_{Si} - W_{S,tot}\right).\tag{2.5}$$

By substituting $G_i$ in Eq. (1.10) by its expression in Eq. (2.2), Eq. (2.5) becomes:

$$\Lambda(W_{Si},\lambda) = 2\sum_{i} \frac{a_{ri}^2}{G_{i0} \times W_{Si}} + \lambda\left(\sum_{i} W_{Si} - W_{S,tot}\right).\tag{2.6}$$

In order to find the lowest $Z_{FSL}$, let us now compute the partial derivatives of Eq. (2.6) with respect to $\lambda$ and $W_{Si}$. Finding their roots after dividing them by $W_{S,tot}$ gives the optimum $s_i$:

$$s_{i,opt} = \frac{|a_{ri}|}{\sqrt{G_{i0}} \times \sum_{j} \frac{|a_{rj}|}{\sqrt{G_{j0}}}}.\tag{2.7}$$
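The result of Eq. (2.7) can be checked numerically against the $Z_{FSL}$ expression of Eq. (2.6). The sketch below is only an illustration: the function names are hypothetical, and the coefficients $a_{ri}$ and conductances $G_{i0}$ in the example are made-up values rather than those of a specific topology.

```python
from math import sqrt


def optimal_relative_widths(a_r, g0):
    """Relative switch widths s_i minimizing Z_FSL, following Eq. (2.7)."""
    if len(a_r) != len(g0):
        raise ValueError("a_r and g0 must have the same length")
    weights = [abs(a) / sqrt(g) for a, g in zip(a_r, g0)]
    total = sum(weights)
    return [w / total for w in weights]


def z_fsl(a_r, g0, s, w_s_tot):
    """FSL output impedance for relative widths s and total width w_s_tot."""
    return 2.0 * sum(a * a / (g * w_s_tot * si) for a, g, si in zip(a_r, g0, s))


if __name__ == "__main__":
    a_r = [0.5, 0.5, 0.5, 0.5]            # hypothetical coefficients
    g0 = [1.1e-3, 3.5e-3, 1.6e-3, 1.6e-3]  # hypothetical S/um values
    s_opt = optimal_relative_widths(a_r, g0)
    print(s_opt)
    print(z_fsl(a_r, g0, s_opt, 100.0), z_fsl(a_r, g0, [0.25] * 4, 100.0))
```

Switches with a poor conductance per unit width receive a larger share of the total width, and the resulting $Z_{FSL}$ is never higher than with equal widths.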

# 2.2.4 Partitioning of $Z_{out}$ between $Z_{SSL}$ and $Z_{FSL}$

As shown in the previous sections, the SC DC/DC converter sizing depends on $Z_{SSL}$ and $Z_{FSL}$. These two impedances are linked together by the output impedance of the converter, and thus by the power $P_{out}$ that must be supplied to the load, as shown in Eq. (1.2) and (1.3). Before going forward with the converter sizing, both $Z_{SSL}$ and $Z_{FSL}$ must be determined based on the obtained TC and relative switch sizes $s_i$. Let us therefore find the optimum value of these impedances under a given $Z_{out}$ constraint, so as to minimize the converter losses presented in Section 1.4 and thus maximize the converter efficiency for a given area constraint. We do not consider here the losses associated with the converter controller and with the output voltage ripple because they are independent of the capacitor network sizing.

The expressions of the converter output impedance modeled in Section 1.3 are recalled here:

$$Z_{out} = \sqrt{Z_{SSL}^2 + Z_{FSL}^2},\tag{2.8}$$

$$Z_{SSL}\left(f_{sw}\right) = \frac{1}{f_{sw} \times C_{T,tot}} \sum_{i} \frac{a_{ci}^2}{c_i} = \frac{\kappa}{f_{sw}},\tag{2.9}$$

$$Z_{FSL}(W_{S,tot}) = 2 \times \sum_{i} \frac{a_{ri}^2}{G_{i0} \times W_{S,tot} \times s_i} = \frac{\theta}{W_{S,tot}}.\tag{2.10}$$

Let us notice that the variables  $\kappa$  and  $\theta$  regroup all the terms that are technology- and topology-dependent and are thus already fixed because the capacitor network topology has been chosen in a previous design step. The same grouping can be performed on the bottom-plate losses and switch driving losses given in Eq. (1.19) and (1.20) respectively.

$$P_{BP} = f_{sw} \times C_{T,tot} \times \alpha_{BP} \times \sum_{i} \left( c_i \times \Delta V_{BPi}^2 \right) = \gamma \times f_{sw}, \qquad (2.11)$$

$$P_{driving} = \epsilon \times C_{unit} \times W_{S,tot} \times V_{dd}^2 \times f_{sw} = \beta \times f_{sw} \times W_{S,tot}, \qquad (2.12)$$

with  $\gamma$  and  $\beta$  new variables regrouping all the terms that are technology- and topology-dependent.

The losses that we are trying to minimize are equal to the sum of linear, bottom-plate, and switch driving losses:

$$P_{loss} = P_{lin} + \gamma \times f_{sw} + \beta \times f_{sw} \times W_{S,tot}.\tag{2.13}$$

In Eq. (2.13), $f_{sw}$ and $W_{S,tot}$ are the only remaining design variables. The $Z_{out}$ constraint from Eq. (2.8) gives us a relationship between $f_{sw}$ and $W_{S,tot}$ if we replace $Z_{SSL}(f_{sw})$ and $Z_{FSL}(W_{S,tot})$ by their expressions in Eq. (2.9) and (2.10):

$$Z_{out} = \sqrt{\frac{\kappa^2}{f_{sw}^2} + \frac{\theta^2}{W_{S,tot}^2}}.\tag{2.14}$$

Therefore $W_{S,tot}$ can be rewritten as a function of $f_{sw}$:

$$W_{S,tot} = \frac{\theta}{\sqrt{Z_{out}^2 - \frac{\kappa^2}{f_{sw}^2}}}.\tag{2.15}$$

There is an $f_{sw}$ and $W_{S,tot}$ combination leading to minimal converter losses. This can be explained by considering two scenarios, as shown in Fig. 2.5. First, if the converter operates deep in the SSL regime, the switch sizes can be reduced. Indeed, in this regime, the charge transfer between the capacitors is completed before the SCN reconfiguration. Thus smaller switches can be used without increasing the converter output impedance, which reduces the switch driving losses as seen in Eq. (2.12). Secondly, if the converter operates deep in the FSL regime, charges do not have enough time to be transferred before the SCN phase change. The transfer capacitors are not fully used, while the bottom-plate capacitances still lead to $P_{BP}$ losses that are proportional to $f_{sw}$. The optimum $f_{sw}$ that minimizes the losses is therefore somewhere between the SSL and FSL regimes.
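The existence of this optimum can be observed with a brute-force sweep of $f_{sw}$ under the $Z_{out}$ constraint, using Eq. (2.15) for $W_{S,tot}$ and the frequency-dependent terms of Eq. (2.13). This is a numerical sketch, not the analytical solution developed in this section; the function names, the logarithmic grid and the parameter values are assumptions.

```python
import math


def total_switch_width(f_sw, z_out, kappa, theta):
    """W_S,tot required to meet z_out at frequency f_sw, Eq. (2.15)."""
    z_ssl = kappa / f_sw
    if z_ssl >= z_out:
        raise ValueError("f_sw too low: Z_SSL alone already exceeds Z_out")
    return theta / math.sqrt(z_out ** 2 - z_ssl ** 2)


def switching_losses(f_sw, z_out, kappa, theta, gamma, beta):
    """Frequency-dependent part of Eq. (2.13): P_BP + P_driving."""
    w_s_tot = total_switch_width(f_sw, z_out, kappa, theta)
    return gamma * f_sw + beta * f_sw * w_s_tot


def sweep_optimum(z_out, kappa, theta, gamma, beta, points=2000, span=100.0):
    """Return (f_sw, losses) at the minimum found on a logarithmic grid."""
    f_min = kappa / z_out  # below this frequency the Z_out target is unreachable
    best = None
    for n in range(1, points + 1):
        f_sw = f_min * span ** (n / points)
        loss = switching_losses(f_sw, z_out, kappa, theta, gamma, beta)
        if best is None or loss < best[1]:
            best = (f_sw, loss)
    return best


if __name__ == "__main__":
    # Hypothetical values: ohm, ohm.Hz, ohm.um, J, J/um.
    print(sweep_optimum(100.0, 4.44e8, 1.2e3, 4e-13, 1.86e-15))
```

With $\gamma = 0$ the sweep settles at $f_{sw} = \sqrt{2}\,\kappa / Z_{out}$, i.e. $Z_{SSL} = Z_{FSL}$, and a non-zero $\gamma$ pushes the optimum towards lower frequencies.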

In order to find the minimum losses, let us now differentiate Eq. (2.13) with respect to $f_{sw}$, substituting $W_{S,tot}$ by its expression in Eq. (2.15). Knowing that the linear losses are frequency-independent, we obtain:

$$\frac{\partial P_{loss}}{\partial f_{sw}} = \gamma + \frac{\theta \beta \sqrt{Z_{out}^2 - \frac{\kappa^2}{f_{sw}^2}} - \frac{\theta \beta \kappa^2}{f_{sw}^2 \sqrt{Z_{out}^2 - \frac{\kappa^2}{f_{sw}^2}}}}{Z_{out}^2 - \frac{\kappa^2}{f_{sw}^2}}.\tag{2.16}$$



**Fig. 2.5.** Evolution of the converter losses with a varying partitioning of  $Z_{out}$  between  $Z_{SSL}$  and  $Z_{FSL}$ . There is an optimum when  $Z_{SSL}$  is close to  $Z_{FSL}$ .

This equation can be rewritten as a function of  $Z_{SSL}$  and  $Z_{FSL}$  by combination with Eq. (2.9) and (2.10):

$$\frac{\partial P_{loss}}{\partial f_{sw}} = \gamma + \frac{\theta \beta Z_{FSL} - \frac{\theta \beta Z_{SSL}^2}{Z_{FSL}}}{Z_{FSL}^2}.\tag{2.17}$$

Finding the zero of this equation gives us the optimum  $Z_{out}$  partitioning between  $Z_{SSL}$  and  $Z_{FSL}$  to minimize the losses:

$$Z_{SSL}^2 = Z_{FSL}^2 \times (1 + \frac{\gamma}{\theta\beta} Z_{FSL}), \qquad (2.18)$$

with $\gamma = \alpha_{BP}C_{T,tot} \sum_{i} (c_i \Delta V_{BPi}^2)$, $\theta = 2 \sum_{i} \frac{a_{r_i}^2}{G_{i0} \times s_i}$ and $\beta = \epsilon C_{unit} V_{dd}^2$.

The optimum numerical values of $Z_{SSL}$ and $Z_{FSL}$ can be found because they are bound to $Z_{out}$ by Eq. (2.8), and $Z_{out}$ is fixed by the $P_{out}$ target.
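Combining Eq. (2.8) and Eq. (2.18) gives $Z_{out}^2 = Z_{FSL}^2 \times (2 + \frac{\gamma}{\theta\beta} Z_{FSL})$, which has a single positive root in $Z_{FSL}$ between 0 and $Z_{out}/\sqrt{2}$. The sketch below finds it by bisection; the choice of bisection, the function name and the example values are assumptions made for illustration.

```python
import math


def partition_z_out(z_out, gamma, theta, beta, iterations=200):
    """Return (Z_SSL, Z_FSL) satisfying Eq. (2.8) and Eq. (2.18).

    Solves z_fsl**2 * (2 + c * z_fsl) = z_out**2 with c = gamma / (theta * beta).
    The left-hand side grows monotonically with z_fsl, so bisection converges.
    """
    c = gamma / (theta * beta)
    lo, hi = 0.0, z_out / math.sqrt(2.0)  # hi is the solution when gamma is zero
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid * mid * (2.0 + c * mid) > z_out ** 2:
            hi = mid
        else:
            lo = mid
    z_fsl = 0.5 * (lo + hi)
    z_ssl = math.sqrt(z_out ** 2 - z_fsl ** 2)
    return z_ssl, z_fsl


if __name__ == "__main__":
    # Hypothetical values: ohm, J, ohm.um, J/um.
    print(partition_z_out(100.0, 0.0, 1.2e3, 1.86e-15))
    print(partition_z_out(100.0, 4e-13, 1.2e3, 1.86e-15))
```

When $\gamma$ is zero the two impedances are equal, and as $\gamma$ grows $Z_{SSL}$ moves towards $Z_{out}$ while $Z_{FSL}$ shrinks.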

Let us illustrate this with the sizing example of a converter supplying 300mV to a load from a 1V input. The converter uses the divide-by-three topology depicted in Fig. 2.6 and supplies a $100\mu$W load. Fig. 2.7 shows the partitioning of $Z_{out}$ between $Z_{SSL}$ and $Z_{FSL}$ for varying $\alpha_{BP}$. When $\alpha_{BP}$ is zero, i.e. when there is no parasitic capacitance between the substrate and the TCs, the optimum partition is $Z_{SSL}$ equal to $Z_{FSL}$. However, when $\alpha_{BP}$ increases, bottom-plate losses add to the driving losses, so that each reconfiguration of the capacitor network carries an extra energy cost in terms of $P_{BP}$. At some point, this energy cost exceeds the $P_{driving}$ cost that would result from upsizing the switches. Therefore, in order to minimize the losses, $f_{sw}$ needs to be reduced by transferring as many charges as possible from the TCs, and therefore by increasing the switch sizes. Thus, as illustrated in Fig. 2.7, when $\alpha_{BP}$ increases, $Z_{SSL}$ is degraded (increased) towards $Z_{out}$. This moves the optimum design point towards the SSL regime. Fig. 2.8 also shows that with switches that have a better conductance, the converter enters the SSL regime faster. This is because smaller switches are used to achieve the same $Z_{FSL}$, which leads to lower $P_{driving}$. Therefore the net efficiency benefit of reducing $P_{BP}$ at the cost of increasing $P_{driving}$ rises faster with $\alpha_{BP}$.

Fig. 2.6. A divide-by-three converter and its design specifications.

**Fig. 2.7.** Optimum $Z_{out}$ partition between $Z_{SSL}$ and $Z_{FSL}$ for minimum losses. For large $\alpha_{BP}$, $Z_{SSL}$ gets close to $Z_{out}$.



Fig. 2.8. Optimum partition of $Z_{out}$ between $Z_{SSL}$ and $Z_{FSL}$ for minimum losses. With switches that have a better conductance, the converter enters the SSL regime faster as $\alpha_{BP}$ increases.

# 2.2.5 Selection of switching frequency and total switch width

The last design variables can easily be determined once the partitioning of $Z_{out}$ between $Z_{SSL}$ and $Z_{FSL}$ is found. The optimum $f_{sw}$ is deduced from Eq. (2.9):

$$f_{sw} = \frac{\kappa}{Z_{SSL}}.\tag{2.19}$$

Similarly, the optimum $W_{S,tot}$ is extracted from Eq. (2.10):

$$W_{S,tot} = \frac{\theta}{Z_{FSL}}.\tag{2.20}$$

This also sets the width of each switch, obtained by multiplying the $s_i$ found with Eq. (2.7) by $W_{S,tot}$. Fig. 2.9 shows the optimum $f_{sw}$ and $W_{S,tot}$ of the converter from Fig. 2.6 for varying parasitic capacitances. As $\alpha_{BP}$ increases, the optimum switching frequency decreases. This mitigates the increase of the bottom-plate losses due to the increasing parasitic capacitance. Meanwhile $W_{S,tot}$ increases, which allows a better charge transfer across the transfer capacitors. When the ratio $\alpha_{BP}$ between parasitic and effective capacitances increases from 0.1% to 1%, $f_{sw}$ is only reduced by 14% while $W_{S,tot}$ increases by 52%. The $W_{S,tot}$ increase gets even worse for higher parasitics, and the $f_{sw}$ reduction becomes negligible. This is because, as seen in Fig. 2.7, $Z_{SSL}$ tends towards $Z_{out}$. Thus for large parasitic capacitances it is not possible to go deeper into the SSL regime while keeping the same supplied power. This means that bottom-plate losses can be mitigated by tuning the switching frequency, but once the converter enters the SSL regime no more savings are possible.
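Eq. (2.19) and (2.20), together with the relative sizes of Eq. (2.7), translate directly into absolute design values. The following sketch assumes that the partition ($Z_{SSL}$, $Z_{FSL}$) and the constants $\kappa$ and $\theta$ are already known; the function name and the numbers are hypothetical.

```python
def size_converter(z_ssl, z_fsl, kappa, theta, s):
    """Return (f_sw, W_S,tot, [W_Si]) from Eq. (2.19), Eq. (2.20) and the s_i."""
    if z_ssl <= 0 or z_fsl <= 0:
        raise ValueError("impedances must be positive")
    f_sw = kappa / z_ssl        # Eq. (2.19)
    w_s_tot = theta / z_fsl     # Eq. (2.20)
    widths = [si * w_s_tot for si in s]
    return f_sw, w_s_tot, widths


if __name__ == "__main__":
    # Hypothetical partition and constants: ohm, ohm, ohm.Hz, ohm.um.
    print(size_converter(80.0, 60.0, 4.0e8, 1.2e3, [0.4, 0.2, 0.2, 0.2]))
```

A larger $Z_{SSL}$ share lowers $f_{sw}$, and the correspondingly smaller $Z_{FSL}$ share widens every switch in proportion to its $s_i$.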



Fig. 2.9. Evolution of the optimum  $f_{sw}$  and  $W_{S,tot}$  for minimum losses. Higher parasitic capacitances lead to lower switching frequency and wider switches to reduce the contribution of the bottom-plate losses.



Fig. 2.10. Evolution of the converter losses at the optimum $f_{sw}$ for minimum losses. Large parasitic capacitances prevent the converter from being efficient due to bottom-plate losses.

# 2.2.6 Losses and efficiency calculation

Once $f_{sw}$, the TC sizes and $W_{Si}$ are fixed, the converter is fully sized. It is thus possible to evaluate the converter losses and efficiency. Fig. 2.10 shows the evolution of the converter losses, normalized to $P_{out}$, with the parasitic capacitances, again for the example of Fig. 2.6. The linear losses $P_{lin}$ are fixed by the converter topology and output voltage as described in Section 1.4.1. The bottom-plate losses $P_{BP}$ increase dramatically with the parasitic capacitances. This is because, unlike the TCs, these capacitances are fully charged and discharged at every switching period. Reducing the switching frequency and increasing the switch conductance initially mitigates the loss increase. However, once in the SSL regime, as shown in Fig. 2.9, no further reduction of $f_{sw}$ is possible without increasing the converter output impedance, because $Z_{out}$ is close to $Z_{SSL}$ as seen in Fig. 2.7.

The driving losses $P_{driving}$ also increase slightly with the parasitic capacitances. This is due to the increasing ratio between $Z_{SSL}$ and $Z_{FSL}$, which leads to wider switches as seen in Fig. 2.9.

If the converter efficiency is too low to meet the efficiency specifications, new input data must be provided. The most obvious way to increase the efficiency is to increase the converter area, or to decrease the parasitic capacitances. The latter can be achieved for example by selecting other process options which allow building capacitors further away from the substrate.

# 2.3 VALIDATION OF THE METHODOLOGY

In this section, we first validate the efficiency and output impedance models presented in Sections 1.4 and 1.3.1 through simulation and measurement results. Then the sizing methodology is validated by comparing its outputs against an exhaustive search for the optimum sizing point. This is followed by a discussion of the sizing methodology, which allows exploring the trade-off between converter area and efficiency.

## 2.3.1 Validation of the output impedance and losses models

To validate the models on which the sizing methodology is based, we first designed two DC/DC converters and compared their behavior to the models.

Fig. 2.11 shows the considered topologies, output impedance and input / output powers of these SC DC/DC converters.

- For the divide-by-three topology of Fig. 2.11 (a), the models are compared to SPICE simulations in an industrial 65nm CMOS process.
- For the divide-by-two topology of Fig. 2.11 (b), the models are compared to measurements on an SC DC/DC converter manufactured in an industrial 0.13μm CMOS process [48]. This design will be discussed in detail in Chapter 3.

Technology parameters extracted from simulations are summarized in Table 2.1 for both cases. The models show good agreement with the simulation and measurement results. However, when deep in the FSL regime (here above 100MHz), the accuracy of the prediction is reduced because at high switching frequency the buffer chain driving the switches has rise and fall times which are not negligible when compared to the switching period. Furthermore, the impact of the non-overlapping clock time was not included in the $Z_{FSL}$ model described in Section 1.3.3. Both of these effects increase the actual $Z_{FSL}$ for very high $f_{sw}$.



**Fig. 2.11.** Validation of the models described in Sections 1.3.1 and 1.4: (a) simulated converter with a divide-by-three topology with $V_{in} = 1V$, $V_{out} = 300mV$, $C_{T,tot} = 50pF$, and $W_{S,tot} = 10\mu m$, (b) measured converter with a divide-by-two topology with $V_{in} = 1.2V$, $V_{out} = 500mV$, $C_{T,tot} = 200pF$, and $W_{S,tot} = 110\mu m$, (c) converter output impedance, and (d) input and output powers. Squares and dashed lines are for the divide-by-two DC/DC converter, circles and plain lines are for the divide-by-three DC/DC converter.



Fig. 2.12. Validation of the sizing methodology described in Section 2.2. When compared to exhaustive sizings with different  $Z_{out}$  partitioning between  $Z_{SSL}$  and  $Z_{FSL}$  and resulting  $f_{sw}$ , TC sizes, and  $W_{Si}$ , the sizing methodology achieves the maximal efficiency (area =  $0.1mm^2$ ,  $P_{out} = 100\mu W$ ).

# 2.3.2 Sizing methodology validation against an exhaustive search for the optimum

The models are then used to estimate the efficiency of divide-by-three converters using the topology of Fig. 2.11 (a), sized with a varying total switch width $W_{tot}$ for a given $0.1mm^2$ area and with the 65nm CMOS technology parameters of Table 2.1. Each sizing point sets a different value for $f_{sw}$. TCs are MiM capacitors described in Table 2.2. $C_{T,tot}$ is thus fixed at 500pF to use the whole $0.1mm^2$ die area. The converter must supply a 0.3V $100\mu$W load ($V_{out}$, $P_{out}$) from a 1V input voltage. Varying the switch sizes leads to a varying $Z_{out}$ partitioning between $Z_{SSL}$ and $Z_{FSL}$. The efficiency of these converters is compared in Fig. 2.12 to the efficiency of a converter sized with the proposed sizing methodology. As expected, the converter resulting from the sizing methodology has the best efficiency.
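The idea of such an exhaustive search can be sketched as a brute-force sweep over the $Z_{out}$ partitioning. The loss model below is a toy stand-in and not the model of Section 1.4: it assumes $Z_{out}^2 = Z_{SSL}^2 + Z_{FSL}^2$, bottom-plate losses proportional to $f_{sw} \propto 1/Z_{SSL}$, and driving losses proportional to $f_{sw} \times W_{S,tot} \propto 1/(Z_{SSL} Z_{FSL})$. The weights `a_bp` and `b_drv` are hypothetical.

```python
import math


def toy_losses(x, a_bp, b_drv):
    """Toy normalized losses versus x = Z_SSL / Z_out, with 0 < x < 1.

    Assumed model (illustrative only):
      Z_FSL / Z_out = sqrt(1 - x^2)
      P_BP      ~ a_bp / x
      P_driving ~ b_drv / (x * sqrt(1 - x^2))
    """
    z_fsl = math.sqrt(1.0 - x * x)
    return a_bp / x + b_drv / (x * z_fsl)


def exhaustive_optimum(a_bp, b_drv, n_points=999):
    """Sweep the partition on a uniform grid and return the best x."""
    grid = [(i + 1) / (n_points + 1) for i in range(n_points)]
    return min(grid, key=lambda x: toy_losses(x, a_bp, b_drv))


x_low_parasitics = exhaustive_optimum(a_bp=1.0, b_drv=0.2)
x_high_parasitics = exhaustive_optimum(a_bp=10.0, b_drv=0.2)
print(x_low_parasitics, x_high_parasitics)
```

Under these assumptions, a larger bottom-plate weight pushes the optimum towards $Z_{SSL} \approx Z_{out}$, which is the same qualitative trend as in Fig. 2.7.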

# 2.4 DISCUSSION AND EXPLOITATION OF THE METHODOLOGY

The purpose of the sizing is to achieve the best converter efficiency for a given area constraint. Iterating over the design steps allows the designer to choose the best trade-off between area and efficiency. Let us here again consider the example of the divide-by-three converter described in Fig. 2.6 with a sweep of the area constraint. Fig. 2.13 shows the evolution of the optimum $Z_{out}$ partitioning between $Z_{SSL}$ and $Z_{FSL}$, and of the converter losses with the area. The proposed sizing methodology tends to operate the converter in the SSL regime for large area. This limits the driving losses by reducing $f_{sw}$ and thus increases the converter efficiency.


Fig. 2.13. Evolution of the  $Z_{out}$  partitioning between  $Z_{SSL}$  and  $Z_{FSL}$  with the converter area. As the maximum area increases, the optimum sizing point is shifted deeper in the SSL regime to further reduce losses and increase efficiency.



**Fig. 2.14.** Evolution of  $f_{sw}$  and  $W_{S,tot}$  with the converter area.  $f_{sw}$  scales linearly with the area while  $W_{S,tot}$  increases at a lower rate. It allows for a better converter efficiency.

Fig. 2.14 shows the optimum $f_{sw}$ and $W_{S,tot}$ evolution with the maximal area. Maximal efficiency as well as the loss breakdown for a given area are shown in Fig. 2.15. The linear losses are fixed by the topology and by the converter output voltage. The bottom-plate losses are in first approximation constant over the area range. This is because they are proportional to the product of $f_{sw}$ and $C_{T,tot}$ as seen in Eq. (2.11). As the converter area, and thus $C_{T,tot}$, increases, the optimal $f_{sw}$ decreases as illustrated in Fig. 2.14. Thus the $f_{sw} \times C_{T,tot}$ product stays in first approximation constant in order to deliver the same power to the load. More precisely, there is a slow decrease of the bottom-plate losses as the area increases. This is because with larger area, the sizing results in operation deeper in the SSL regime as shown in Fig. 2.13. Therefore $f_{sw}$ decreases faster than $C_{T,tot}$ increases. Finally, the converter driving losses reduce significantly as the converter area increases. Fig. 2.14 shows the $W_{S,tot}$ increase with the converter area. As $W_{S,tot}$ is increased, $Z_{FSL}$ is reduced, which allows the converter to operate deeper in the SSL regime. However, the converter switching frequency drops faster than $W_{S,tot}$ increases. This leads first to a significant reduction of the driving losses as the area increases. For areas larger than $0.1mm^2$ the efficiency improvement slows down because the driving losses become negligible when compared to the bottom-plate and linear losses.

Fig. 2.15. Evolution of the converter losses and efficiency with the area. At low area, the driving losses dominate due to high $f_{sw}$. The bottom-plate losses do not scale because the $f_{sw} \times C_{T,tot}$ product is approximately constant.

## 2.5 CONCLUSION

A systematic and practical sizing methodology for designing switched-capacitor DC/DC converters was developed. The sizing methodology outputs are the optimum converter switching frequency, switch sizes and transfer capacitance values to maximize conversion efficiency under the die area constraint. In the proposed sizing methodology, we pointed out that the switch overdrive voltage, which is not equal for all switches in the converter, strongly impacts the optimum switch sizes as it affects the switch conductance. We also showed how the ratio between bottom-plate and effective transfer capacitances affects the optimal operation regime of the converter: the higher the parasitic capacitances, the deeper the converter should operate in the SSL regime. SPICE simulations in a 65nm CMOS process and measurements of an SC DC/DC converter in a $0.13\mu$m CMOS process validate the models used for the sizing methodology. These models have been used to compare the efficiency of a converter designed with the proposed sizing methodology and the efficiency of converters designed with exhaustive parameter sweeps. The results of the sizing methodology lead to the best conversion efficiency. The proposed sizing methodology can be used to quickly explore the design trade-off between die area and conversion efficiency.

Results presented in this chapter will be used in Chapter 3 to size SC DC/DC converters with wide load range, and in Chapters 4 and 5 to design complete power management units based on SC DC/DC converters.

**CHAPTER 3** 

# MULTI-MODE DC/DC CONVERTERS FOR DYNAMIC POWER MANAGEMENT

# Abstract

Ultra-low-voltage microcontrollers for highly duty-cycled applications such as wireless sensor nodes must support several modes of operation: sleep mode and active modes adapted to the current workload. Even in sleep mode some critical blocks such as retentive SRAM, timer and interrupt controller must remain powered on. The DC/DC converter thus needs to be able to supply ultra-wide load ranges from the sleep mode up to full workload mode. In this chapter, we develop specific designs for multi-mode switched-capacitor DC/DC converters to supply such ultra-low-voltage microcontrollers with high power-conversion efficiency in all modes. This is achieved by adapting the switch sizes (both L and W) and their body bias, by generating an adaptive internal clock supplied by the output voltage, and by reconfiguring the SCN topology according to the voltage that must be supplied. Furthermore, we propose to use the DC/DC converter as a power-gating device when the microcontroller is in sleep mode.

To validate these techniques, we implemented two such SC DC/DC converters. The first one delivers a 0.3-0.4V output voltage from a 1-1.2V input source. The $0.12mm^2$ chip was manufactured in a $0.13\mu m$ CMOS technology. The efficiency reaches 74% with a $100\mu W$ load and 63% with a 100nW load, corresponding to the microcontroller active and sleep modes respectively. The converter correctly operates over a wide load range from 25nW to $125\mu W$, i.e. nearly 4 orders of magnitude, which is a record for such low power levels. The second one is manufactured in the STMicroelectronics 28nm fully-depleted SOI CMOS technology, and occupies a $0.135mm^2$ area. It delivers from a 1V input either $40\mu W$ @ 0.3V with up to 79% efficiency or 1.3mW @ 0.45V with up to 87% efficiency, close to the theoretical 90% maximum efficiency, enabling dynamic voltage scaling of the microcontroller depending on the workload.

| Contents |                                                                             |
|----------|-----------------------------------------------------------------------------|
| 3.1      | Introduction                                                                |
| 3.2      | A $0.13\mu m$ dual-mode SC DC/DC converter for duty-cycled microcontrollers |
| 3.3      | A 28nm FD-SOI multi-mode SC DC/DC converter for DVFS microcontrollers       |
| 3.4      | Conclusion                                                                  |




**Fig. 3.1.** (a) Power consumption profile of DVFS microcontrollers. (b) Average power of duty-cycled microcontrollers. Energy spent in sleep mode is higher than in active mode for duty cycle larger than $1000 \times (P_{active} = 100 \mu W, P_{sleep} = 100 nW)$.

#### 3.1 INTRODUCTION

The trillions of wireless sensor nodes expected to be deployed to support the development of the Internet of Things push the research focus towards ultra-low-voltage (ULV) microcontrollers. A recent microcontroller of this kind consumes as little as $7\mu$W/MHz [5] to increase the battery lifetime. In energy-autonomous systems, microcontrollers are often highly duty cycled and are awakened only when a task needs to be performed in order to save energy. Fig. 3.1 (a) shows the power consumption profile of a typical IoT microcontroller that senses a signal and only performs data processing when a critical event is detected. As the radio transmitter power dominates the chip consumption, data pre-processing is performed to reduce the volume of data to be transmitted. In sleep mode, the power consumption is not zero because retentive SRAM, timer and interrupt controller must still be powered on.

|                 | Value                                                                                    |
|-----------------|------------------------------------------------------------------------------------------|
| Process         | Bulk $0.13 \mu m$                                                                        |
| Input voltage   | 1-1.2V                                                                                   |
| Operating modes | MP/ULP mode                                                                              |
| Output voltage  | 0.4V                                                                                     |
| Output power    | $100\mu W \pm 25\%$ (active microcontroller)<br>$100nW \pm 25\%$ (sleep microcontroller) |
| Ripple          | ${<}25\mathrm{mV}$ @ 3.3nF on $V_{out}$                                                  |
| Die area        | $< 0.15 mm^2$                                                                            |

Table 3.1. Specifications of the $0.13\mu m$ CMOS dual-mode SC DC/DC converter

Fig. 3.1 (b) shows that the total power spent in sleep mode can be higher than in active mode when the active time is much lower than the sleep time, i.e. when the duty cycle is very low. Therefore, DC/DC converters that efficiently supply the ultra-low voltage at low load currents are needed, as well as efficient power-gating devices to minimize the power consumption of unused components during the sleep periods. Of course, these converters must also be able to supply the system efficiently when it is active. Load-power variations of several orders of magnitude may thus occur between active and sleep modes, and the power range that must be delivered by such converters must be large.
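The crossover mentioned in the caption of Fig. 3.1 can be checked with a short calculation, sketched below. It assumes a simple two-state model in which the microcontroller is active for a fraction `duty_cycle` of the time and asleep otherwise. The function names are illustrative, and the power levels are those quoted in the caption.

```python
def average_power(p_active, p_sleep, duty_cycle):
    """Average power of a two-state load, duty_cycle = active time fraction."""
    return duty_cycle * p_active + (1.0 - duty_cycle) * p_sleep


def sleep_energy_share(p_active, p_sleep, duty_cycle):
    """Fraction of the total energy that is spent in sleep mode."""
    sleep_part = (1.0 - duty_cycle) * p_sleep
    return sleep_part / average_power(p_active, p_sleep, duty_cycle)


def crossover_time_ratio(p_active, p_sleep):
    """Sleep-to-active time ratio above which sleep energy dominates."""
    return p_active / p_sleep


P_ACTIVE = 100e-6   # W, from the caption of Fig. 3.1
P_SLEEP = 100e-9    # W, from the caption of Fig. 3.1

ratio = crossover_time_ratio(P_ACTIVE, P_SLEEP)
share_at_crossover = sleep_energy_share(P_ACTIVE, P_SLEEP, 1.0 / (1.0 + ratio))
print(f"crossover at sleep/active time ratio = {ratio:.0f}")
print(f"sleep energy share at crossover = {share_at_crossover:.2f}")
```

With these values, sleep energy exceeds active energy as soon as the node sleeps more than about 1000 times longer than it is active, which is why the converter efficiency at 100nW matters as much as at $100\mu W$.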

This chapter presents two multi-mode SC DC/DC converters whose design is based on the sizing methodology presented in Chapter 2, and that are both able to supply dynamic loads with a wide load range, and to deliver extremely low power. The chapter is organized as follows: Section 3.2 presents a dual-mode SC DC/DC converter in a $0.13\mu m$ CMOS technology to supply loads duty-cycled from 25nW to $125\mu W$, Section 3.3 presents a second multi-mode SC DC/DC converter in the STMicroelectronics 28nm fully depleted SOI (FD-SOI) CMOS technology to supply a dynamic voltage and frequency scaled (DVFS) microcontroller from $40\mu W @ 0.3V$ to 4mW @ 0.45V.

# 3.2 A $0.13\mu m$ DUAL-MODE SC DC/DC CONVERTER FOR DUTY-CYCLED MICROCONTROLLERS

In this section we propose a dual-mode converter to supply a ULV microcontroller whose power consumption is $100\mu$W in active mode and 100nW in sleep mode. In sleep mode only always-on peripherals such as the sleep controller or retentive SRAM are powered, the other blocks being power-gated internally. Fig. 3.2 shows the proposed converter architecture, Table 3.1 its specifications, and Fig. 3.3 the switched-capacitor network (SCN) performing the DC/DC conversion. The proposed converter uses an SCN with a divide-by-two topology, with two modes of operation: medium power (MP) and ultra low power (ULP), to supply the microcontroller in active and sleep mode, respectively.



Fig. 3.2. Architecture of the proposed dual mode converter designed in a  $0.13\mu m$  CMOS technology.

#### 3.2.1 MP mode

This mode aims at powering a ULV microcontroller in active mode with a $100\mu W$ typical load. In this mode, the DC/DC converter uses an external clock close to 20MHz coming from the microcontroller. The $V_{OUT}$ regulation is based on pulse skipped modulation (PSM). As explained in Section 1.5.1, PSM control keeps the power conversion efficiency high over a large range of output voltage, which is the target of the designed converter. Furthermore, the microcontroller functionality is not threatened by the frequency of the voltage ripple as long as its magnitude remains below 25mV. To implement the PSM, a clocked comparator first senses the output voltage. As long as $V_{OUT}$ is higher than the reference voltage $V_{REF}$, the TRIGGER signal remains low so that the switches S1 and S3 charge the transfer capacitors $C_{T1}$ and $C_{T2}$ to half $V_{DD-SCN}$. If $V_{OUT}$ falls below $V_{REF}$, the comparator asserts the TRIGGER signal, and the control signals for the switches of the capacitor network are generated by the non-overlapping clock. They close S2, S4 and S5 so that charges are transferred to the load. This makes $V_{OUT}$ rise above $V_{REF}$. After half a 20MHz clock cycle, the TRIGGER signal is reset low. The clock generator (Clock Gen) and level shifter (LS) blocks are not used in MP mode.
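The PSM loop described above can be illustrated with a coarse behavioral model. The sketch below is not the circuit itself: it assumes one comparator decision per clock period, a constant-current load, and a single charge-sharing step between a 200pF transfer capacitance precharged to $V_{in}/2$ and the 3.3nF output capacitor of Table 3.1. All function and parameter names are illustrative.

```python
def simulate_psm(v_in=1.0, v_ref=0.4, p_load=100e-6, c_t=200e-12,
                 c_out=3.3e-9, f_clk=20e6, n_cycles=20000):
    """Behavioral pulse-skipping loop (simplified, assumptions in lead-in)."""
    t_clk = 1.0 / f_clk
    i_load = p_load / v_ref          # load approximated as a constant current
    v_out = v_ref
    v_min = v_max = v_out
    pulses = 0
    for _ in range(n_cycles):
        # Clocked comparator: a charge transfer fires only if V_OUT < V_REF.
        if v_out < v_ref:
            # Assumed charge sharing between C_T (at V_in / 2) and C_out.
            v_out += c_t * (v_in / 2.0 - v_out) / (c_t + c_out)
            pulses += 1
            v_max = max(v_max, v_out)
        # The load discharges the output capacitor during one clock period.
        v_out -= i_load * t_clk / c_out
        v_min = min(v_min, v_out)
    return pulses / n_cycles, v_min, v_max


pulse_fraction, v_low, v_high = simulate_psm()
print(f"pulses fired in {100 * pulse_fraction:.0f}% of the clock cycles")
print(f"V_OUT stays between {v_low * 1e3:.1f} mV and {v_high * 1e3:.1f} mV")
```

In this model the converter skips the clock cycles in which $V_{OUT}$ is already above $V_{REF}$, so the fraction of fired pulses adapts to the load without changing the clock frequency.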

A look at the schematic of the capacitor network in Fig. 3.3 shows that NMOS switches S3, S4, and S5 are reverse body biased if the body contact is connected to the ground. Indeed, the source terminals of switches S3, S4 and S5 are connected to the upper plates of $C_{T1}$ and $C_{T2}$. This leads to a lower ON current for these devices. Furthermore, as explained in Section 2.2.3, these switches have a small overdrive voltage. Thus wide devices must be used to meet the $P_{out}$ constraint, which induces high driving losses as seen in Section 1.4. A way to reduce this is to connect the body of these transistors to the output voltage. This is done by the BB\_GEN block. Simulations show that, as a rough estimate, this reduces the size of these devices by a factor of $1.6 \times$, thus reducing $P_{driving}$ by the same factor.

#### 3.2.2 ULP mode

The purpose of ULP mode is to supply a ULV microcontroller in sleep mode, with a 100nW typical load for SRAM data retention and always-on control circuitry. This mode is largely similar to MP mode, with two key differences. Firstly, the converter output impedance $Z_{out}$ must increase to accommodate the 100nW load. As seen in Section 1.3.1, this can be achieved by increasing either $Z_{SSL}$ or $Z_{FSL}$. By increasing $Z_{FSL}$ only, the converter would operate deeply in the FSL regime, which would lead to inefficient voltage conversion because of high bottom-plate losses as seen in Section 2.3.2. To increase $Z_{SSL}$, either smaller transfer capacitors or a lower clock frequency can be used as seen in Eq. (1.4) and (1.5). Clocking the comparator with the external 20MHz clock signal would consume prohibitive power compared to the ultra-low load in this mode. Thus the choice was made to keep the same transfer capacitors and to operate at low frequency by adding an internal clock generator. There is thus a rough pulse frequency modulation (PFM) between MP and ULP modes as explained in Section 1.5.1. The clock generator is made of 57 inverters forming a ring oscillator (RO). It does not have to generate a precise frequency, nor to have a low jitter, because $V_{OUT}$ is precisely regulated by PSM as in MP mode. In order to reduce the frequency as well as the power consumption of the ring oscillator, it is supplied by the ULV output voltage. Moreover, $3.9\mu$m long-channel MOSFETs are used to further limit the clock frequency and thereby the power consumption. The generated clock frequency is approximately 65 kHz at 0.4V and typical conditions. As the clock generator is powered by $V_{OUT}$, the converter needs to be started in MP mode before entering the ULP mode.

**Fig. 3.3.** Topology of the SCN. Switches S3 to S5 have their body tied either to the ground or to $V_{OUT}$.

Secondly, the switches can be sized smaller as they have more time to transfer charges to the capacitor. As shown in Fig. 3.3, two transistors are used for each switch: the largest one is used in MP mode and a minimum-size transistor (130 to $200 \times$ smaller) in ULP mode. Finally, in order to reduce the leakage of the large MP-mode switches S3, S4 and S5, the $V_{OUT}$ body bias used in MP mode is not applied in ULP mode. The transistors of these switches are therefore reverse body biased, which efficiently reduces their leakage current.
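To give an order of magnitude for the ULP clock generator, the sketch below uses the textbook relation between the frequency of an $N$-stage ring oscillator and its average stage delay, $f = 1/(2 N t_d)$. The relation and the function names are assumptions of this illustration rather than the design equations of the thesis. The 57 stages and the 65 kHz frequency are the values quoted above.

```python
def ring_oscillator_frequency(n_stages, stage_delay):
    """Textbook estimate f = 1 / (2 * N * t_d) for an odd number of stages."""
    if n_stages % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of inverters")
    return 1.0 / (2.0 * n_stages * stage_delay)


def stage_delay_for(n_stages, f_target):
    """Average stage delay needed to reach f_target with n_stages inverters."""
    if n_stages % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of inverters")
    return 1.0 / (2.0 * n_stages * f_target)


N_STAGES = 57     # from the text
F_CLK = 65e3      # Hz, approximate measured frequency at 0.4V

t_d = stage_delay_for(N_STAGES, F_CLK)
print(f"average stage delay = {t_d * 1e9:.0f} ns")
```

A stage delay in the range of a hundred nanoseconds is far slower than a standard inverter at nominal supply, which is consistent with the combined use of a 0.4V supply and long-channel devices described above.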

#### 3.2.3 Experimental validation

The converter has been manufactured in a $0.13\mu$m CMOS process. The chip area is 0.12mm<sup>2</sup>, mostly occupied by the 200pF MIM capacitors. Fig. 3.4 shows the die microphotograph with a summary of the main converter blocks.

Fig. 3.4. Die microphotograph of the $0.12mm^2$ SC DC/DC converter.

Fig. 3.5 demonstrates that the use of a dual-mode converter yields a good efficiency over a wide range of loads from $125\mu$W down to 25nW. The peak efficiency (74%) in ULP mode is for a 400nW load. It is close to the maximal achievable efficiency $\eta_{max}$, defined in Section 1.4.1, of an SC converter in such a configuration (80% for $V_{IN} = 1$ V and $V_{OUT} = 0.4$ V). At 100nW, the efficiency remains above 60%. The bottom-plate losses $P_{BP}$ account for 10%, the driving losses $P_{driving}$ for 20%, the linear losses $P_{linear}$ for 46% and the control losses $P_{control}$ for 24% of the power losses. The efficiency drops below the theoretical efficiency of a linear regulator for loads below 25nW due to control losses (13nW) which do not scale with the supplied power. In MP mode the peak efficiency (75%) is obtained with a $125\mu$W load where $P_{BP}$ accounts for 19%, $P_{driving}$ for 13%, $P_{linear}$ for 66% and $P_{control}$ for 2% of the power losses. For higher loads, $V_{OUT}$ starts to drop because the transfer capacitors cannot supply enough charges at the 20MHz switching frequency. As stated in Section 1.4, efficiency thus falls as $V_{OUT}$ decreases because of linear losses. Ripple on $V_{OUT}$ is below 10mV in both modes. It increases the average $V_{OUT}$ above the 400mV target and thus leads to a slight increase of the digital load power consumption as stated in Section 1.4.5. However, the load power overhead is kept below 2%.

Fig. 3.5. Measured converter efficiency in ULP and MP mode from a 1V input voltage to a 0.4V output voltage at $20^{\circ}$C.

**Fig. 3.6.** Measured frequency generated by the clock generator based on a 57-stage ring oscillator, at 20°C.

Fig. 3.6 and 3.7 (a) show the benefits of using the output voltage to supply the ring oscillator in ULP mode, thus performing PFM. With a fixed 20MHz external clock and a 100$\mu$W load, MP mode fails to follow $V_{REF}$ above 420mV whereas ULP mode can go up to 485mV while supplying a 100nW load. Indeed, as $V_{OUT}$ increases, the frequency of the ring oscillator increases too, thereby allowing more charges to be transferred to the output. Similarly, when $V_{OUT}$ decreases, more charges are transferred to the load each time the *TRIGGER* signal is set because the difference between $V_{OUT}$ and $V_{IN}/2$ increases. Therefore, charges need to be transferred less often and the natural clock frequency scaling reduces losses in the control circuitry. Fig. 3.7 (b) shows the efficiency variations with the input voltage. Because of linear losses, it decreases as the input voltage lowers, following the theoretical maximal achievable efficiency $\eta_{max}$ given by Eq. (1.15).

A corner analysis was also performed to further validate the robustness of the proposed circuit. Simulation results in Fig. 3.8 show that the worst-case corner for functionality is with small capacitors at low temperature with slow devices. Low temperature and slow devices limit the subthreshold current [49] and thus the ring oscillator frequency and the conductance of switches S3, S4 and S5. Small capacitors increase $Z_{SSL}$ as stated in Eq. (1.4).



Fig. 3.7. Measured converter efficiency at  $20^{\circ}$ C in ULP (100nW load) and MP (100 $\mu$ W load) mode for (a) a varying output voltage at  $V_{in} = 1$ V and (b) a varying input voltage at  $V_{out} = 0.4$ V



**Fig. 3.8.** Maximal  $P_{OUT}$  for increasing  $V_{OUT}$  in (a) MP and (b) ULP mode. ULP mode fails at low temperature for loads bigger than 250nW.

# 3.3 A 28NM FD-SOI MULTI-MODE SC DC/DC CONVERTER FOR DVFS MICROCONTROLLERS

In this section, we present a second SC DC/DC converter in the STMicroelectronics 28nm FD-SOI technology, supplying a DVFS microcontroller operating either in high-performance mode (1.3mW @ 450mV) or in low-power mode (40  $\mu$ W @ 300mV).

The SC DC/DC converters described in this thesis take advantage of the possibility of being integrated on-chip next to the load, thus minimizing the system assembly complexity and cost. Therefore, porting their design to a nanometer technology node offers the opportunity to achieve higher power density thanks to the available $17 \text{fF}/\mu m^2$ MiM capacitors.

Table 3.2. Specifications of the 28nm FD-SOI CMOS multi-mode SC DC/DC converter

|                 | Value                                                                                                                                               |
|-----------------|-----------------------------------------------------------------------------------------------------------------------------------------------------|
| Process         | STMicroelectronics 28nm FD-SOI                                                                                                                      |
| Input voltage   | 0.9-1.1V                                                                                                                                            |
| Operating modes | MP/LP/power-gating                                                                                                                                  |
| Output voltage  | $0.45V \pm 10\%$ (MP)<br>$0.3V \pm 10\%$ (LP)                                                                                                       |
| Output power    | 1-5mW (microcontroller in high-performance mode)<br>$5-120\mu W$ (microcontroller in low-power mode)<br><50nW (microcontroller in power-gated mode) |
| Ripple          | ${<}5\mathrm{mV}$ @ 100nF on $V_{out}$                                                                                                              |
| Die area        | $< 0.15 mm^{2}$                                                                                                                                     |

# 3.3.1 Proposed SC DC/DC converter architecture

The specifications of the proposed converter are given in Table 3.2 and its block diagram is shown in Fig. 3.9. The SC DC/DC converter uses two operating modes plus a power-gating mode. In MP mode it supplies 1-5mW to a high-performance 450mV load. In LP mode the microcontroller voltage and clock frequency are reduced to lower its power consumption and the converter supplies $5-120\mu$W to a 300mV load. In power-gating mode, the leakage current in the microcontroller must be less than 50nA. The SC DC/DC converter was designed following the sizing methodology presented in Chapter 2. As this design has several features in common with the dual-mode converter presented in Section 3.2, we present here only the key design differences.

#### Regulation mechanism

The converter uses pulse skipped modulation (PSM), like the converter presented in Section 3.2, in order to keep the power conversion efficiency high over a large power range of the load. As long as $V_{OUT}$ is higher than the target voltage (0.45V or 0.3V, fixed by the mode controller of the supplied microcontroller), the converter does not send charges to the load, so that $V_{OUT}$ slowly decreases. When $V_{OUT}$ falls below the target voltage, the comparator output rises. It enables a sampler module that transmits pulses generated by an internal clock (Clock Gen) to the non-overlapping clock (NOC) and to the gate drivers of two interleaved SCNs.



Fig. 3.9. Architecture of the multi-mode converter designed in the STMicroelectronics 28nm FD-SOI CMOS technology.

On top of the PSM regulation, the MODE\_SEL signal allows switching between MP and LP mode, with pulse frequency and switch size adaptation. This keeps the efficiency high at low load. The EN\_DCDC signal allows the DC/DC converter to act as a power-gating device by turning off the Clock Gen block and by opening all the switches, as explained below in this section.

#### Switched-capacitor networks

To avoid large linear losses when the output voltage drops to 300mV, the SCN uses two topologies as shown in Fig. 3.10: a divide-by-two topology for the 450mV load in MP mode and a divide-by-three topology for the 300mV load in LP mode. Therefore the maximal achievable efficiency $\eta_{max}$ of the SC DC/DC converter remains high at 90% in both modes. Furthermore, two interleaved SCNs send charges alternately to the load. The use of two interleaved SCNs mitigates the $V_{OUT}$ voltage ripple as explained in Section 1.5.5. This is critical because of the ULV microcontroller sensitivity to the fluctuations of its low 300mV supply voltage in LP mode.
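The 90% figure, as well as the 80% limit quoted in Section 3.2.3, can be reproduced with the small sketch below. It assumes that $\eta_{max}$ is the ratio of the regulated output voltage to the ideal no-load output voltage of the topology, which is consistent with the numbers given in this chapter. The exact expression of Eq. (1.15) is not reproduced here and the function name is illustrative.

```python
def eta_max(v_in, v_out, conversion_ratio):
    """Upper efficiency bound set by linear losses (assumed definition).

    conversion_ratio is the ideal V_out / V_in of the topology,
    e.g. 1/2 for divide-by-two and 1/3 for divide-by-three.
    """
    v_ideal = v_in * conversion_ratio
    if v_out > v_ideal:
        raise ValueError("V_out cannot exceed the ideal no-load output voltage")
    return v_out / v_ideal


# 28nm FD-SOI converter, 1V input.
mp_mode = eta_max(1.0, 0.45, 1.0 / 2.0)       # divide-by-two, 450mV
lp_mode = eta_max(1.0, 0.30, 1.0 / 3.0)       # divide-by-three, 300mV
lp_if_div2 = eta_max(1.0, 0.30, 1.0 / 2.0)    # 300mV without reconfiguration

# 0.13um converter of Section 3.2, divide-by-two, 1V to 0.4V.
ulp_0p13 = eta_max(1.0, 0.40, 1.0 / 2.0)

print(f"MP: {mp_mode:.0%}, LP: {lp_mode:.0%}, LP with divide-by-two: {lp_if_div2:.0%}")
print(f"0.13um converter: {ulp_0p13:.0%}")
```

Keeping the divide-by-two topology at 300mV would cap the efficiency at 60%, which illustrates why the SCN is reconfigured when the output voltage is scaled down.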

Thanks to the FD-SOI CMOS technology, it is possible to apply a large forward body bias to particular MOSFETs without having prohibitive leakage currents. In order to take advantage of this, a 1V body bias is applied to the NMOS switches (S1-S8, S11-S15). It increases their conductance by approximately 20%, thus reducing their required width by the same factor. Similarly, the body of the PMOS switches (S1, S2, S9, S10) is connected to the ground.

As the converter must supply a relatively high power in MP mode, the sizing methodology produces large switch widths, which leads to high leakage when the divide-by-two converter is disabled (in LP mode or when the DC/DC converter is in power-gating mode). Fig. 3.11 (a) shows the leakage paths due to divide-by-two switches (S1-S8). It reduces the efficiency of the converter while operating in LP mode. In order to mitigate this, a 6nm to 9nm channel length upsize is applied on the divide-by-two switches that have large leakage. It reduces their conductance by 10%, so that their width must increase accordingly, which leads to a 10% increase of the driving losses because of the gate capacitance increase of those switches. However as shown in Fig. 3.11 (b), the leakage of the divide-by-two switches is reduced by approximately 75% allowing up to 4.5% efficiency increase in LP mode (@ $V_{OUT}$ =300mV,  $P_{OUT}$  = 40µW), at the cost of a less than 0.5% efficiency drop in MP mode.

#### Clock generation

The pulses enabling the charge transfer through the SCNs are generated on-chip by two ring oscillators as shown in Fig. 3.12. Their frequencies are either 13MHz in LP mode or 67MHz in MP mode at room temperature and typical conditions. These frequencies correspond to the optimal switching frequencies evaluated by the sizing methodology detailed in Chapter 2.

The ring oscillator consists of a chain with an odd number of inverting stages. The last stage output is connected to the input of the first stage. It can be modeled by current sources each charging and discharging a capacitor. These



**Fig. 3.10.** SCN of the proposed converter. Two TC are connected in a divide-by-two topology in MP mode, while a divide-by-three topology is used for LP mode.



Fig. 3.11. (a) Leakage paths from the divide-by-two SCN switches: large switches have leakage threatening the efficiency of the converter in LP mode. (b) Leakage reduction thanks to a channel length upsize  $(Lg_{S3,S7} = 30nm, Lg_{S1,S5} = 36nm, Lg_{S2,S4,S6,S8} = 39nm)$ .

current sources are controlled by the voltage on the previous stage capacitor. Therefore the ring oscillator dynamic power consumption is lower with few stages having a low ON current than with more and faster stages having a high ON



Fig. 3.12. Ring oscillator circuits generating the pulses of SC DC/DC converter.

current. Thus for a target oscillating frequency, the dynamic power consumption is minimized with few stages that have a low ON current.
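This argument can be illustrated with a first-order model. The sketch below assumes that each stage charges a capacitance `c_stage` over the full supply swing with a constant ON current, so that the stage delay is $C V_{DD}/I_{ON}$ and the period is $2N$ stage delays. The 1fF stage capacitance and the 1V supply are placeholder values, not figures from this work.

```python
def ring_oscillator(n_stages, f_target, c_stage, v_dd):
    """Return (ON current per stage, dynamic power) for a target frequency.

    First-order assumptions: stage delay t_d = c_stage * v_dd / i_on,
    oscillation period T = 2 * n_stages * t_d, and each stage dissipates
    c_stage * v_dd**2 per oscillation period.
    """
    i_on = 2 * n_stages * c_stage * v_dd * f_target
    p_dyn = n_stages * c_stage * v_dd ** 2 * f_target
    return i_on, p_dyn


# Placeholder values (assumptions): 1 fF per stage, 1 V supply, 13 MHz target.
i_few, p_few = ring_oscillator(9, 13e6, 1e-15, 1.0)
i_many, p_many = ring_oscillator(500, 13e6, 1e-15, 1.0)
print(f"9 stages:   I_ON = {i_few * 1e9:.2f} nA, P = {p_few * 1e9:.3f} nW")
print(f"500 stages: I_ON = {i_many * 1e9:.2f} nA, P = {p_many * 1e9:.3f} nW")
```

In this model both the required ON current and the dynamic power grow linearly with the number of stages at a fixed target frequency. Real cells add short-circuit and leakage power, so the simulated figures reported in this section do not follow this ratio exactly.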

The easiest way to create such an oscillator is by using logic cells from available libraries. However, in this case it requires more than 500 stages which results in an unacceptable power consumption  $(60\mu W)$  in LP mode of the ring oscillator. Custom inverters with stacked devices have thus been used. Each ring oscillator uses only 9 of these inverters for a simulated power consumption of  $2.2\mu W$  for the 67MHz MP mode clock, and  $0.9\mu W$  for the 13MHz LP-mode clock.

The MODE\_SEL signal enables only the ring oscillator whose pulses are used, canceling the power consumption of the unused ring oscillator. Furthermore, a multiplexer controlled by the EN\_DCDC and CLK\_DCDCSEL signals sets CLK\_DCDC either to the ring oscillator output for MP or LP mode, to an external clock (CLK\_DCDCIN) for SCN characterization purposes, or to a static signal (EN\_DCDC) for the power-gating mode. Finally, when the DC/DC converter is disabled (EN\_DCDC=0), both ring oscillators are disabled to reduce the controller power consumption when the load is in sleep mode.

#### Comparator & sampler

Constraints on the comparator design are different in LP mode and MP mode. In LP mode the control circuit power consumption is not negligible when compared to the low supplied power. Thus the comparator quiescent current must be low to avoid a degradation of the DC/DC converter efficiency. In MP mode, the supplied power is higher so that the power consumption of the comparator is not a concern. However, as shown by the SCN equivalent model described in Section 1.3.1, the converter output impedance must be lower in MP mode than in LP mode in order to supply a larger power to the load with a small  $\Delta V = V_{nl} - V_{OUT}$ , while the load equivalent capacitance, composed of decoupling capacitances and supply voltage routing parasitic capacitances, remains the same. It leads to a smaller time constant and thus to faster fluctuations of  $V_{OUT}$  in MP mode. Therefore the comparator must have a faster response time to avoid  $V_{OUT}$  oscillations.
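The order of magnitude of this effect can be sketched as follows. We assume a 1V input, so that the no-load voltages are 0.5V and 1/3V, and take the 100nF output filtering capacitor used in the simulations of Section 3.3.2 as the load capacitance. The equivalent output resistance is approximated by $\Delta V / I_{OUT}$, a simplification of the model of Section 1.3.1.

```python
def output_resistance(v_no_load, v_out, p_out):
    """Equivalent output resistance needed to deliver p_out at v_out."""
    i_out = p_out / v_out
    return (v_no_load - v_out) / i_out


C_LOAD = 100e-9  # assumption: output filtering capacitor of Section 3.3.2

# Assumed no-load voltages for a 1 V input: 1/2 V (MP) and 1/3 V (LP).
r_mp = output_resistance(0.5, 0.45, 1.3e-3)
r_lp = output_resistance(1.0 / 3.0, 0.30, 40e-6)
tau_mp = r_mp * C_LOAD
tau_lp = r_lp * C_LOAD

print(f"MP: R_out = {r_mp:.1f} ohm, tau = {tau_mp * 1e6:.2f} us")
print(f"LP: R_out = {r_lp:.1f} ohm, tau = {tau_lp * 1e6:.2f} us")
```

With these assumed values the MP-mode time constant is more than ten times shorter than in LP mode, which is consistent with the need for a faster comparator in MP mode.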

The quiescent current of the comparator is thus adapted to the operating mode. As shown in Fig. 3.13 (a), this is performed by a current mirror (M1-M3) with two copying transistors (M2, M3). In LP mode, M2 is used to bias the



Fig. 3.13. (a) Biasing circuit of the comparator and (b) sampling circuit of the comparator output.

comparator and M3 is OFF with its gate tied to the supply voltage. In MP mode, both M2 and M3 bias the comparator with a higher quiescent current.

Furthermore, the converter switching frequency is higher than for the dual mode converter described in Section 3.2. Therefore we do not use a clocked comparator, as a very short comparison time would be required to avoid changing the duty cycle of the trigger signal sent to the DC/DC converter, and thus its output impedance as explained in Section 1.5.1. Instead, a sampling circuit as shown in Fig. 3.13 (b) is used. At every rising edge of CLK\_DCDC, the comparator output is sampled by a flip flop. If it is high, a NAND gate enables the CLK\_DCDC propagation to the non-overlapping clock, while if it is low the pulse is skipped. A small buffer is added between the CLK\_DCDC signal and the NAND gate to avoid glitches due to the propagation time in the flip flop. If the comparator output changes at the CLK\_DCDC rising edge, the flip flop output is unknown, and the SC DC/DC controller may wrongly decide whether or not to send charges to the load. However, such a timing violation does not compromise the stability of the  $V_{OUT}$  regulation loop, as it is corrected on the next CLK\_DCDC rising edge.
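The pulse-skipping behavior of this sampler can be sketched at the cycle level. This is a behavioral illustration only: the voltage steps `dv_charge` and `dv_load` are hypothetical values, the comparator is idealized, and the gate is modeled by its logical function (the pulse passes when the sampled value is high).

```python
def simulate_pulse_skipping(v_ref, n_cycles, dv_charge, dv_load, v_start):
    """Cycle-level model of the sampled comparator gating CLK_DCDC.

    Each iteration is one CLK_DCDC rising edge. dv_charge is the rise of
    V_OUT when a pulse reaches the SCN, dv_load its droop per cycle.
    Both are illustrative assumptions.
    """
    v_out = v_start
    trace = []
    for _ in range(n_cycles):
        comp_out = v_out < v_ref   # comparator is high below the target
        q = comp_out               # flip flop samples at the rising edge
        pulse = q                  # pulse propagates only if q is high
        if pulse:
            v_out += dv_charge
        v_out -= dv_load
        trace.append((pulse, v_out))
    return trace


trace = simulate_pulse_skipping(v_ref=0.300, n_cycles=30,
                                dv_charge=0.003, dv_load=0.001,
                                v_start=0.295)
sent = sum(1 for pulse, _ in trace if pulse)
print(f"{sent} pulses sent, {len(trace) - sent} skipped")
```

A wrong decision on one edge only shifts the output by one step in this model and is corrected on a following edge, which matches the qualitative argument above.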

## Power-gating mode

The DC/DC converter can be used to power gate the load when in sleep mode. Therefore, as shown in Fig. 3.9, a multiplexer is added before each gate driver. It allows either transferring the pulse through the gate drivers when the SC DC/DC converter is enabled, or opening all the switches when the SC DC/DC converter is disabled. Together with the channel length upsize of the large switches described in Fig. 3.11, this keeps the simulated current flowing through the converter when acting as a power-gating device below 45nA while loaded by the microcontroller.

This current could further be reduced by selecting a transistor type for the PMOS input switches (S1, S5, S9) with higher  $I_{ON}/I_{OFF}$  ratio instead of choosing the flavor leading to the best switch conductance per width unit. It comes at the cost of a threshold voltage increase of those switches. However input switches have large overdrive voltage as their source terminal is connected to the supply



Fig. 3.14. Layout of the proposed SC DC/DC converter.

voltage, which mitigates the switch conductance sensitivity to their threshold voltage.

Furthermore, let us notice that in nanometer technology nodes with thin gate oxides, gate leakage may occur on large switches. Therefore, turning ON specific switches that have large gate leakage current when the DC/DC acts as a power-gating device may reduce the leakage current across the converter provided that subthreshold leakage is gated by one of the transistors in its leakage paths [50].

#### 3.3.2 Simulation results

A layout of the chip is shown in Fig. 3.14. It occupies  $0.135mm^2$  mainly consumed by the four 330pF MiM capacitors. The control circuits are placed between the two interleaved SCNs. Post-layout simulations as well as corner simulations have been performed to validate the chip functionality.

Fig. 3.15 shows the converter transient behavior when it is activated in LP mode. During the first  $5\mu s$ , the DC/DC is not enabled so that  $V_{OUT}$  remains at 0V. A leakage current (not shown) of 40nA flows to the load. Then the SC DC/DC is turned on, so that it starts to send charges to the load and the 100nF output filtering capacitor, thus making  $V_{OUT}$  rise. After approximately 17.5 $\mu s$ ,  $V_{OUT}$  reaches its target value. Then, when the comparator detects that the output voltage is above the 300mV target voltage, OUT\_COMP is lowered, which prevents pulses generated by the ring oscillator from being propagated through the gate drivers and thus stops charge transfer to the load. Therefore  $V_{OUT}$  decreases until it falls below the target voltage. The ripple magnitude is here 3mV for a  $40\mu W$  load. In MP mode with a 1.3mW 450mV load the ripple is 5mV and  $1\mu s$  is required to raise  $V_{OUT}$  to the target voltage when the converter is started up. This ripple induces an increase of the load dynamic power consumption by 1% and 1.1% in LP and MP mode respectively.
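One reading that reproduces these two overhead figures is sketched below. It assumes that the mean output voltage sits half a peak-to-peak ripple above the target, and that the load dynamic power scales with the square of its supply voltage. The function name is ours.

```python
def ripple_power_overhead(v_target, ripple_pp):
    """Relative increase of dynamic power caused by supply ripple.

    Assumptions: the mean supply is raised by half the peak-to-peak ripple
    and dynamic power is proportional to the supply voltage squared.
    """
    v_mean = v_target + ripple_pp / 2
    return (v_mean / v_target) ** 2 - 1


overhead_lp = ripple_power_overhead(v_target=0.300, ripple_pp=0.003)
overhead_mp = ripple_power_overhead(v_target=0.450, ripple_pp=0.005)
print(f"LP: {overhead_lp:.1%}, MP: {overhead_mp:.1%}")
```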





Fig. 3.15. Post layout simulation of the converter transient behavior when activated in LP mode ( $V_{DD-SCN} = 1V$ ,  $P_{OUT} = 40\mu W$ , typical conditions and 25°C).



**Fig. 3.16.** Converter simulated efficiency in MP and LP modes ( $V_{DD} = 1V$ , typical conditions and 25°C).

Fig. 3.16 shows the simulated power conversion efficiency of the SC DC/DC converter vs  $P_{OUT}$ . The peak efficiency in MP and LP modes is respectively 88% @  $P_{OUT} = 3.5mW$  and 82% @  $P_{OUT} = 100\mu W$ . In both modes, the main

efficiency loss (10%) comes from linear losses. Then in MP mode with  $P_{OUT} = 1.3mW$ ,  $P_{BP}$  contributes a 1.9%,  $P_{driving}$  a 1%, and  $P_{control}$  a 0.7% efficiency drop. In LP mode with  $P_{OUT} = 40\mu W$ , the drop due to  $P_{driving}$  is only 0.6% thanks to smaller switches. However the drop due to  $P_{control}$  rises to 5% because the controller power consumption does not scale linearly with the supplied power. Furthermore,  $P_{BP}$  and losses due to the leakage of MP-mode switches rise to 5%.  $P_{BP}$  is higher with the divide-by-three topology because the SCN must switch  $1.8\times$  more often than with the divide-by-two topology to bring the same amount of charge to the load. Indeed according to the definition of the SSL impedance seen in Section 1.3.2, the  $Z_{SSL}$  is  $380\Omega/MHz$  for the divide-by-two topology. Thus, to achieve the same  $Z_{SSL}$  with the divide-by-three topology, the SCN must switch more often, which leads to higher  $P_{BP}$ .
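As a rough consistency check, the contributions listed above can be summed. The sketch treats the efficiency drops as additive, which is only a first-order approximation, and the dictionary keys are our own labels.

```python
def efficiency_after_losses(drops_percent):
    """Efficiency (in %) left once all listed drops are subtracted."""
    return 100.0 - sum(drops_percent.values())


# Efficiency drops quoted in the text, in percent.
mp_drops = {"linear": 10.0, "bottom_plate": 1.9, "driving": 1.0, "control": 0.7}
lp_drops = {"linear": 10.0, "driving": 0.6, "control": 5.0,
            "bottom_plate_and_leakage": 5.0}

eta_mp_budget = efficiency_after_losses(mp_drops)
eta_lp_budget = efficiency_after_losses(lp_drops)
print(f"MP @ 1.3 mW: {eta_mp_budget:.1f} %")
print(f"LP @ 40 uW:  {eta_lp_budget:.1f} %")
```

Both results fall slightly below the peak efficiencies of Fig. 3.16, which are reached at higher output power than the nominal operating points used here.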

Let us note that supplying the 300mV load with the MP-mode divide-by-two topology would result in a maximal efficiency of 66%. Therefore, even though  $P_{BP}$  is higher with the divide-by-three topology, its use remains beneficial.

#### 3.4 CONCLUSION

In this chapter we proposed to design multi-mode switched-capacitor DC/DC converters to supply dynamic loads such as microcontrollers with DVFS and duty-cycled operation. In order to supply such loads the DC/DC converter must be able to:

- supply a wide load range spanning several orders of magnitude,
- adapt the supplied voltage to the circuit workload,
- enable power gating of the load.

We showed in this chapter that this can be achieved with SC DC/DC converters by reconfiguring the SCN topology according to the load voltage to keep linear losses low, by using a switching frequency scaled to the supplied power, and by body biasing the switches and upsizing their channel lengths to prevent leakage when the load is in sleep mode.

The first converter is designed and measured in a commercial  $0.13\mu$ m CMOS technology. It transfers charges from a 1V-1.2V DC input to a 0.3-0.4V DC output. The medium-power (MP) mode corresponds to the 100 $\mu$ W power consumption of a microcontroller in active mode, and the ultra-low-power (ULP) mode fits microcontroller sleep mode from 25nW to 450nW. In MP mode, the efficiency reaches 74% for 100 $\mu$ W loads, and in ULP mode it reaches 74% for 400nW loads and remains above 60% for loads as low as 100nW.

The second converter is designed and simulated in the STMicroelectronics 28nm FD-SOI CMOS technology. It supplies from a 1V input an ULV microcontroller in high performance mode (MP mode @  $V_{OUT}=0.45$ V,  $P_{OUT}=1.3$ mW) or in low-power mode (LP mode @  $V_{OUT}=0.3$ V,  $P_{OUT}=40\mu$ W). The achieved


efficiency is high in both modes (88% in MP mode and 82% in LP mode) thanks to the use of the sizing methodology proposed in Chapter 2.

Table 3.3 compares the performance of the two proposed converters with that of low-power SC DC/DC converters. In [19] and [21], the converter is designed to supply a DVFS microcontroller. The SCN can be reconfigured to tune the voltage gain with the load voltage. On top of a pulse-skipping control, they use pulse frequency modulation and varying switch size to reduce losses of the converter. The pulse frequency modulation is triggered by a comparator that detects when  $V_{OUT}$  cannot reach its target value. This comes at the cost of an increased control complexity and of a power consumption overhead for the supplied circuit as its voltage guard-band must increase to ensure that no timing violations will occur even if  $V_{OUT}$  remains below its target. In [15], an SC converter and an LDO regulator are used together to address low-power load requirements.

The table shows that for the  $0.13\mu$ m converter described in Section 3.2, the proposed converter has a similar efficiency for medium loads of tens of  $\mu$ W, whereas a significant improvement is obtained for ultra-low loads below 1  $\mu$ W. For the 28nm converter described in Section 3.3, higher efficiency is achieved for medium loads and higher power density is obtained thanks to dense 17fF/$\mu$m<sup>2</sup> MiM capacitors and to the switch conductance increase provided by the forward body biasing that SOI technology allows. As an SC DC/DC converter aims at being integrated next to its load, letting the digital load communicate its state (high-performance, low-power or sleep mode) to the DC/DC controller, as proposed in this chapter, reduces the controller complexity by reducing the number of comparators in the regulation loop. Furthermore, multi-mode converters allow for matching the converter switching frequency and switch size to the optimum point defined in Chapter 2 according to known load profiles corresponding to the converter modes.

In the following chapters we focus on the control circuit of such converters to implement complete power management units that can be integrated on-chip next to the load, reducing the system assembly complexity and cost. In Chapter 4, the PMU mitigates the PVT variations by tuning the converter output voltage. In Chapter 5, it interfaces the load with an energy harvester.

**Table 3.3.** Comparison with low-power SC DC/DC converters

|                  | Dual 0.13                      | Dual 28                  | [19]            | [21]             | [15]                  |
|------------------|--------------------------------|--------------------------|-----------------|------------------|-----------------------|
| Process          | $0.13\mu m$                    | 28nm FD-SOI              | $0.18\mu m$     | 65nm             | $0.13\mu m$           |
| Area             | $0.12mm^2$                     | $0.135mm^2$              | $0.57mm^2$      | $0.12mm^2$       | $0.26mm^2$            |
| $V_{IN}$         | 1-1.2V                         | 0.9-1.1V                 | 1.8V            | 1.2V             | 2.5-3.6V              |
| $V_{OUT}$        | 0.3-0.4V                       | 0.25-0.33V<br>0.33-0.45V | 0.3-1.1V        | 0.3-1.1V         | 0.44V                 |
| Load             | 25-450nW<br>$0.45$-$125\mu W$  | $5$-$125\mu W$<br>0.05-5mW | $5$-$1000\mu W$ | $1$-$500\mu W$   | 2-250nW               |
| Peak $\eta$      | 75%                            | 87%                      | 74%             | 78%              | 56%                   |
| Power density    | $1\frac{mW}{mm^2}$             | $37\frac{mW}{mm^2}$      | $1.6\frac{mW}{mm^2}$ | $4.2\frac{mW}{mm^2}$ | $1\frac{\mu W}{mm^2}$ |
| $\eta$ @100nW    | 63%                            | NA                       | NA              | 56% @ $0.5\mu W$ | 44%                   |
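The power density entries of the two proposed converters appear consistent with the maximum load divided by the converter area. The sketch below assumes this definition, which is our reading of the table rather than a stated one.

```python
def power_density_mw_per_mm2(max_load_w, area_mm2):
    """Power density in mW/mm^2, assuming max load divided by total area."""
    return max_load_w * 1e3 / area_mm2


density_dual_013 = power_density_mw_per_mm2(125e-6, 0.12)
density_dual_28 = power_density_mw_per_mm2(5e-3, 0.135)
print(f"Dual 0.13: {density_dual_013:.1f} mW/mm^2")
print(f"Dual 28:   {density_dual_28:.1f} mW/mm^2")
```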

**CHAPTER 4** 

# PUSHING ADAPTIVE VOLTAGE SCALING FULLY ON CHIP

# Abstract

In this chapter, we investigate the possibility for low-power applications to integrate an efficient adaptive voltage scaling (AVS) system on chip. Therefore the impact of process (both global and local), voltage and temperature variations is first studied on two typical low-power circuits, i.e., a CPU for mobile applications and a microcontroller for IoT nodes. In order to ensure safe operation under all conditions, it is required to increase the supply voltage by up to 22%, leading to a nearly 50% increase in dynamic power consumption when compared to nominal operating conditions. An AVS system allows for reducing this supply voltage guard band. To be able to include the AVS system on chip using standard CMOS, this chapter proposes to use a switched-capacitor network for DC/DC conversion from the higher battery voltage. A critical path replica is used for both sensing the circuit maximum operating frequency and generating its clock signal. We show that the voltage ripple induced by the DC/DC converter does not significantly contribute to the supply voltage guard band, and that overall the proposed AVS system allows for reducing this guard band by up to 80% while consuming less than 33% of the total circuit area.

Such an AVS system has been successfully integrated in the 65nm ultralow-voltage microcontroller SoC SleepWalker [5]. Thanks to the AVS system, the microcontroller can operate at 25MHz instead of 10MHz over process and temperature variations from -40°C to 85°C, with a peak efficiency of the DC/DC converter above 80%.

### Contents

| 4.1 | Introduction                                           |
|-----|--------------------------------------------------------|
| 4.2 | Impact of PVT variations on low-power circuits         |
| 4.3 | State-of-the-art of AVS systems                        |
| 4.4 | Proposed on-chip AVS system with an SC DC/DC converter |
| 4.5 | Practical AVS implementation                           |
| 4.6 | Experimental validation                                |
| 4.7 | Conclusions                                            |

# 4.1 INTRODUCTION

The reduction of transistor size driven by Moore's law allows the on-chip integration of increasingly complex circuits, which strongly boosts speed performances of high-end components such as PC processors. Low-power and ultra-low-power applications also benefit from this miniaturization, and it eases the integration of previously discrete components such as on-chip DC/DC converters as seen in previous chapters. This higher integration level offers the opportunity to reduce the bill-of-material for some consumer applications such as smartphones. It also contributes to the development of new applications such as wireless sensor nodes [51], [52] or body area sensor networks [53] for which circuit footprint is critical.

Unfortunately, the sensitivity of circuits to transistor variability is rising [54]. Transistor characteristics depend on process corners, temperature variations and even on the slow supply voltage decrease due to battery discharge. These are called the process, voltage and temperature (PVT) variations, or global variations. They have the same impact on every transistor within the circuit. But two transistors lying next to each other also behave differently, for example due to random dopant fluctuations or line edge roughness [54]. These phenomena are called the local variations. As CMOS technology scales down, both global and local variations have a higher impact on the circuit properties. In order to ensure safe operation, designers need to take larger guard bands on the supply voltage  $V_{DD}$ . Unfortunately this induces a higher dynamic power consumption, which is proportional to the square of the supply voltage. This may not be acceptable for low-power circuits.

A typical management system as depicted in Fig. 4.1 (a) is composed of a clock management system and a power management system. The clock management system provides a low-jitter clock with PVT insensitive frequency to the circuit through a phase-locked loop (PLL) which multiplies the frequency of a slow external crystal clock. The power management system provides a regulated voltage from the battery. In order to reduce the safety margin, adaptive voltage scaling (AVS) systems have been proposed as an alternative [53, 55, 56, 57, 58, 59, 60, 61]. As shown in Fig. 4.1 (b), such a system merges the clock and power management to ensure safe operation without setup time violation, by generating a supply voltage  $V_{DD}$  such that the critical path delay is shorter than the clock period. The evaluation of the supply voltage is included in a feedback loop so that it adapts to any PVT variations. The  $V_{DD}$  guard band can therefore be reduced.

We investigate in this chapter the opportunity to design an AVS system integrated on chip by exploiting the possibility given by CMOS technology scaling. Therefore two families of applications are first defined. For both families the impact of global and local variations is studied and the  $V_{DD}$  guard band to compensate them is evaluated. The main components of an AVS feedback loop that can be fully integrated on chip are then described. These are the timing sensor, the clock generator, the DC/DC converter and the controller. The impact of the converter output voltage on the clock is analyzed to ensure error-free operation. An estimation of the converter area for the two applications follows. Then the




**Fig. 4.1.** (a) Conventional power and clock management systems. (b) AVS system with only one feedback loop on the frequency of signal  $f_{CLK}$ . The supply voltage is adapted to ensure that no timing failure occurs.

 $V_{DD}$  guard band savings allowed by the AVS system are evaluated. Therefore the impact of local variations and of ripple on the circuit timing is studied. Practical implementation issues are then discussed and a test chip is proposed and measured while supplying an ULV microcontroller.

This chapter is organized as follows. Section 4.2 studies the impact of PVT variations on benchmark circuits for two low-power applications. Section 4.3 details the state of the art of AVS systems. The proposed AVS architecture and its benefits on the two low-power application benchmarks are explained in Section 4.4. Section 4.5 details the practical AVS implementation, and Section 4.6 provides measurement results.

# 4.2 IMPACT OF PVT VARIATIONS ON LOW-POWER CIRCUITS

To ease the discussion without loss of generality, we divide the low-power field into two typical applications. Conclusions may be extended to circuits with other characteristics by following the same reasoning. Processors for smartphones or tablet PCs belong to the first application, called low-power [62]. The second application, denominated ultra-low-power, is composed of devices whose battery has to last for years [63] or even the whole lifetime of the circuit [64]. Wireless sensor nodes or IoT nodes, for which this thesis aims at designing power management units, are examples of this second application. This section first gives a description of these two applications as well as the benchmark circuits used to characterize them. Then the impact of the variations on these circuits is studied.

#### 4.2.1 Definition of the low-power benchmark circuits

Table 4.1 shows key figures for both applications. Circuits of application A (low-power) typically operate at the standard voltage of the technology node, allowing a high clock frequency. The power consumption is reduced through the use of a low-power process flavour [62] and other design techniques such as multi- $V_t$  assignment [65], power-gating [66] and clock-gating [67]. In order to reach high speed performance, such applications use pipelining [62] as it reduces the critical path depth. These circuits consume a lot of area due to large on-chip cache memories. New technology nodes reduce the area consumption and are therefore cost effective. Circuits of application B (ultra-low-power) do not need high computing power so that their timing constraints are relaxed. This allows for reducing the supply voltage to near-threshold values, thereby minimizing the dynamic power consumption. As the applications feature long duty cycles, high- $V_t$  transistors are often selected to minimize stand-by leakage power. Moreover, as the timing constraints are relaxed, there are often fewer pipeline stages, which leads to longer critical paths. There are also fewer critical paths because there are fewer pipeline registers and because the data paths are narrower. These circuits are typically smaller, with fewer on-chip memories [5] and fewer cores or coprocessors [68, 69]. They therefore use older technology nodes, which are cheaper for small circuits.

|                              | Application A:<br>low-power<br>(Mobile processor) | Application B:<br>ultra-low-power<br>(Wireless sensor node) |
|------------------------------|---------------------------------------------------|-------------------------------------------------------------|
| Power consumption            | 1W                                                | 1mW                                                         |
| Battery voltage              | 3.6V                                              | 3.6V                                                        |
| Operating voltage            | 1.1V                                              | 0.75V                                                       |
| Crystal oscillator frequency | 32MHz                                             | 32kHz                                                       |
| Main clock frequency         | 1GHz                                              | 10MHz                                                       |
| Pipeline stages              | 10-20                                             | 2-5                                                         |
| Architecture                 | 32-64 bit                                         | 16-32 bit                                                   |
| Critical path depth          | 20FO4                                             | 50FO4                                                       |
| Number of critical paths     | 1000                                              | 100                                                         |
| CMOS technology              | 45nm LP                                           | 65nm LP                                                     |
| Die area                     | $25mm^2$                                          | $2.5mm^2$                                                   |

**Table 4.1.** Key figures of the low-power applications.

In order to determine the impact of global and local variations on these applications, two benchmark circuits are considered. They are made of a critical path replica of the low-power circuits. The critical path replica of the low-power applications (application A) is made of a 20-stage FO4 inverter chain. Four inverters are connected to the output of each stage in order to mimic logic gates connected at each stage of the critical path. It is simulated with industrial models of a 45 nm low-power CMOS technology. The critical path replica of the ultra-low-power applications (application B) is also built with an FO4 inverter chain. The chain





Fig. 4.2. Delay time of the benchmark circuits of application A and application B under varying corner (Slow Slow, Typical Typical and Fast Fast CMOS transistors), temperature and supply voltage.

counts 50 stages because there are fewer pipeline stages, and uses industrial models of a low-power 65 nm CMOS technology.

### 4.2.2 Impact of the PVT variations

The PVT variations are first studied. These variations correspond to:

- Process variations which alter the transistor behavior. They are unknown during the design flow but are fixed once the circuit is manufactured.
- The temperature variations which evolve slowly during circuit operation with the ambient temperature but also with the circuit heat dissipation.
- The low frequency supply voltage variations, for example due to battery discharge (high frequency ripple in the supply voltage is discussed below).

The evolution of the benchmark circuit delay under PVT variations is depicted in Fig. 4.2. For application A at 1.1 V supply voltage, there is a ratio of 2 between worst and best cases. This ratio rises to 12 for application B at 0.75 V supply voltage. This higher ratio is due to the behavior of the benchmark circuit operating at low temperature in the slowest corner and at low supply voltage  $V_{DD}$ . As explained in [49], the threshold voltage of the transistors increases as temperature decreases. In the strong inversion region, for the benchmark circuit of application A, this effect is negligible with regard to the carrier mobility increase at low temperature. However for the benchmark circuit of application B in the near-threshold region, the ON current of the transistors depends exponentially on the threshold voltage. This becomes the dominating effect and explains the higher delay at low voltage and low temperature [49]. In order to compensate the PVT variations, a safety guard band must be taken on  $V_{DD}$  to ensure that no



Fig. 4.3. Delay distribution of the benchmark circuits of application A @1.1 V and application B @0.75 V,  $25^{\circ}$ C and typical process corner for a 500-run Monte-Carlo simulation.

timing failure occurs. This guard band is 122 mV for application A and 135 mV for application B. It may seem surprising to get such close values, as the delay of the application B benchmark circuit is more sensitive to PVT variations. However, application B operates in the near-threshold region. It is thus more sensitive to an increase of its supply voltage. Indeed in the subthreshold region the ON current of the transistors also varies exponentially with the supply voltage [70]. Moreover applications A and B do not have the same supply voltage. The normalization of the guard band over the supply voltage leads to a value of 11% for application A and of 18% for application B. The normalized voltage guard band related to PVT variations is thus larger when the circuit operates at low voltage close to the device threshold voltage.

Local variations from transistor to transistor are an important concern in advanced CMOS technology nodes. These variations are due to the mismatch induced by random dopant fluctuations and line edge roughness [54]. As the CMOS technology scales to nanometer sizes these effects are becoming more important. Fig. 4.3 shows the spread of the delay of application A and B benchmark circuits for a 500-run Monte-Carlo simulation. The ratio of the standard deviation over the mean  $\sigma/\mu$  is 2.4% for application A and 4.1% for application B. Application A is designed in a smaller 45 nm CMOS technology node and its chain is shorter (20 stages), leading to less averaging between stages. It should thus be more sensitive to local variations. However, application B operates at a lower supply voltage, and the ON current of the transistors is thus more sensitive to a modification of their threshold voltage. This explains the higher spread in the delay time of the application B benchmark circuit. However, application A needs 40 mV of  $V_{DD}$  guard band to ensure safe operation versus local variations while application B needs only 16 mV. This corresponds respectively to 3.6% and 2.1% of their supply voltage. It means that even if application B is more prone to local variations, a smaller correction of its supply voltage is sufficient. This is again


because application B operates in the near-threshold region. Therefore, the ON current of its transistors varies exponentially with the supply voltage, which makes it easier to cancel the higher sensitivity of application B due to near-threshold operation. The normalized voltage guard band related to local variations is thus smaller when the circuit operates at low voltage close to the device threshold voltage even if the critical path delay is more sensitive to local variations.

Finally, the non-ideality of the supply voltage also requires a $V_{DD}$ guard band. More specifically, voltage ripple can be a concern. Indeed, the DC/DC converter has to guarantee that its output voltage never drops below a safety value to ensure error-free operation, as explained in Section 1.4.5. Therefore, the average voltage produced by the converter is typically increased depending on the ripple magnitude [24], [28]. This can be seen as an additional contributor to the total $V_{DD}$ guard band. A linear regulator can be added after a switching converter to cancel the ripple and isolate voltage-sensitive loads from the noise induced by other loads on the global power domain. However, the additional losses in the regulator offset the benefits on the $V_{DD}$ guard band. A typical 25 mV ripple leads to a $V_{DD}$ guard band corresponding to 1.1% and 1.7% of $V_{DD}$ for applications A and B, respectively. The contribution of the supply voltage ripple to the guard band is, to a first approximation, independent of the circuit characteristics but depends on the voltage regulator.

When these three contributors to the safety guard band are summed, the $V_{DD}$ guard band reaches 15.7% and 21.8% of the $V_{DD}$ of applications A and B respectively. The dynamic power consumption of a digital load is proportional to the square of the supply voltage because of the energy spent in switching the circuit capacitances. Therefore, the power budget of low-power applications (A) must be increased by 34%. This increase reaches 48% for ultra-low-power applications (B). The main contributor to the $V_{DD}$ guard band remains the global PVT variations. In order to reduce their effect, AVS systems have been proposed.
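The budget above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: it uses the guard bands and supply voltages quoted in this section, and it assumes that the ripple contribution equals half of the 25 mV peak-to-peak ripple, which is consistent with the 1.1% and 1.7% figures but is not stated explicitly in the text. All names are illustrative.

```python
# Guard-band budget for applications A and B (values quoted in Section 4.2).
# Assumption: the ripple guard band is half of the 25 mV peak-to-peak ripple.
APPS = {
    "A": {"vdd": 1.10, "pvt": 0.122, "local": 0.040},
    "B": {"vdd": 0.75, "pvt": 0.135, "local": 0.016},
}
RIPPLE_PP = 0.025  # V, peak-to-peak


def guard_band_budget(vdd, pvt, local, ripple_pp=RIPPLE_PP):
    """Return (relative V_DD guard band, relative dynamic power increase)."""
    total = pvt + local + ripple_pp / 2.0
    rel = total / vdd
    # Dynamic power scales with the square of V_DD.
    power_increase = (1.0 + rel) ** 2 - 1.0
    return rel, power_increase


for name, params in APPS.items():
    rel, dp = guard_band_budget(**params)
    print(f"Application {name}: guard band {100 * rel:.1f}% of VDD, "
          f"dynamic power +{100 * dp:.0f}%")
```

Small differences with the percentages quoted in the text can appear because the text sums rounded contributions.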

# 4.3 STATE-OF-THE-ART OF AVS SYSTEMS

An adaptive voltage scaling system aims at reducing the power consumption of a digital circuit. Instead of delivering a constant $V_{DD}$ to the circuit, the AVS feedback loop controls the $V_{DD}$ of the supplied circuit to keep the critical path delay just below the clock cycle time. AVS systems thus automatically adapt $V_{DD}$ to ensure just-in-time operation and avoid the $V_{DD}$ guard band for PVT variations. At the industrial level, fuse programming can limit the $V_{DD}$ guard band induced by process corners. However, circuits operating at a lower voltage than the main supply voltage require a voltage regulator anyway, so that the additional area (and cost) overhead of the AVS system remains low. Furthermore, fuse programming has a cost associated with circuit testing that is avoided with the automatic regulation of an AVS system. Finally, an AVS system compensates not only for process corners but also for temperature and main supply voltage variations.



Fig. 4.4. Architecture of an adaptive voltage scaling (AVS) system.



**Fig. 4.5.** Timing sensor based on a critical path replica of the supplied circuit. The critical path replica is mounted in a ring oscillator fashion. The ring frequency is voltage controlled by the supply voltage of the circuit.

As illustrated in Fig. 4.4, an AVS system is composed of four main blocks. These blocks are the timing sensor, the DC/DC converter, the clock generator and the controller. The sensor sends information about the maximal operating frequency  $f_{MAX}$  to the controller. The controller drives the DC/DC converter depending on the frequency error. The DC/DC converter modifies  $V_{DD}$  so that  $f_{MAX}$  tracks  $f_{CLK}$ . Finally, a clock generator can be added to the circuit for adapting the clock frequency to the actual workload. This section gives an overview of the state-of-the-art of these AVS blocks.

## 4.3.1 The timing sensor

The timing sensor evaluates the maximal circuit frequency  $f_{MAX}$ . It is composed of logic gates of the same family as the supplied circuit. Three main families of sensors can be found in the literature.

The first family consists of voltage-controlled ring oscillators [55, 56, 58]. As shown in Fig. 4.5, a ring oscillator is built with an odd number of inverting gates forming a chain whose output is connected to its input. A NAND gate and an enable signal ensure that no parasitic oscillation occurs. The ring oscillator frequency is tuned by its supply voltage. Its length is chosen so that its oscillation period matches the critical path delay of the supplied circuit.
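As a rough illustration of how the ring length can be chosen, the sketch below uses the usual first-order model in which a ring of N inverting stages oscillates with a period of 2·N·t_d, where t_d is the delay of one stage. The function name and the numerical values are hypothetical, and a real design would extract both delays from simulation at the same $V_{DD}$.

```python
import math


def ring_oscillator_stages(critical_path_delay, gate_delay):
    """Smallest odd number of inverting stages (at least 3) whose
    first-order oscillation period 2 * N * gate_delay is not shorter
    than the critical path delay."""
    n = math.ceil(critical_path_delay / (2.0 * gate_delay))
    if n % 2 == 0:
        n += 1  # a ring oscillator needs an odd number of inversions
    return max(n, 3)


# Hypothetical numbers: 100 ns critical path, 3 ns per inverting stage.
stages = ring_oscillator_stages(100e-9, 3e-9)
print(stages, "stages, period =", 2 * stages * 3e-9, "s")
```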




Fig. 4.6. Sensor based on a delay chain. At the edge of the supplied circuit clock, a signal propagates through the delay elements. The pulse is stopped after  $T_{TARGET}$ . The delay element reached at that time gives an image of the critical path delay.



Fig. 4.7. Sensor based on a delay chain mounted as a ring oscillator. At the edge of the supplied circuit clock, the circuit starts to oscillate. The frequency counter counts the number of times the pulse cycles through the whole chain. The oscillation is stopped after $T_{TARGET}$. The delay element reached at that time and the frequency count result give an image of the critical path delay.

The second family of sensors consists of delay chains [57, 53, 60]. As depicted in Fig. 4.6, when $f_{MAX}$ has to be evaluated, a rising edge is sent to the sensor input. After a period corresponding to the target clock period $T_{TARGET}$, the delay element outputs are stored. The delay element reached by the input signal is an image of the critical path delay. One of the delay elements therefore corresponds to $T_{TARGET}$. If the edge has not reached it, the supplied circuit operates at an insufficient $V_{DD}$. The delay chain can also be organized as a ring oscillator, as shown in Fig. 4.7 [57]. In this case only a few delay elements are used and a frequency counter tracks the number of oscillations. This technique reduces the consumed area and the number of logic gates, but it needs a more complex control to analyze the results because the polarity of the delay element outputs is inverted at each oscillation.
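The read-out of a plain delay chain can be sketched as follows. This is a minimal illustration, assuming that the stored outputs form a thermometer code (a 1 for every element already reached by the edge) and that the element corresponding to $T_{TARGET}$ is known by its position. Both function names are hypothetical.

```python
def edge_position(outputs):
    """Number of delay elements reached by the rising edge.

    `outputs` is the snapshot of the delay element outputs stored after
    T_TARGET, assumed to be thermometer coded (1 = edge has passed).
    """
    reached = 0
    for bit in outputs:
        if not bit:
            break
        reached += 1
    return reached


def vdd_is_sufficient(outputs, target_index):
    """True if the edge reached the element that corresponds to T_TARGET.

    `target_index` is the 1-based position of that element in the chain.
    """
    return edge_position(outputs) >= target_index


print(vdd_is_sufficient([1, 1, 1, 0, 0], target_index=3))  # edge reached element 3
print(vdd_is_sufficient([1, 1, 0, 0, 0], target_index=3))  # edge stopped too early
```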

The two previously described sensor families have the same behavior as the supplied circuit under PVT variations. However, they cannot monitor local variations.

Fig. 4.8. Razor flip-flop sensor. The shadow latch is enabled some time after the main flip-flop. If the two latches do not store the same value, the error flag is triggered, thereby enabling the error recovery circuit.

Another kind of sensor, named Razor, was proposed by [59] to monitor both PVT and local variations. It consists of shadow latches added to each flip-flop following a critical path, as depicted in Fig. 4.8. The shadow latches are enabled by a dedicated delayed clock. If a timing failure occurs in a critical path, the outputs of the main flip-flop and of the shadow latch differ. This triggers an error signal which launches a data recovery mechanism. This sensor can be seen as an ultimate solution, as both local and global variations are monitored. However, its practical implementation requires the identification of each critical path in the circuit and the elaboration of an error recovery scheme.

# 4.3.2 The controller

The controller of an AVS system can have two different tasks. For applications with varying workloads, the first task of the controller is to compute the appropriate operating frequency corresponding to the instantaneous workload. This is achieved by a software implementation and an analysis of the task patterns [55]. The second task is to drive the supply voltage in order to allow the circuit to operate safely at a specific clock frequency. This part of the controller is implemented in hardware. The controller compares $f_{MAX}$ provided by the sensor with the reference frequency evaluated by the first part of the controller. This comparison defines the error signal of the AVS feedback loop. This signal can be processed by different kinds of controllers. The bang-bang controller keeps the signal between two boundaries [58, 60]. The PID controller uses the derivative of the signal to cancel the system pole and obtain a faster transient response [56]. Finally, sliding mode controllers use state variables of the system. They define two boundaries for these state variables and force the monitored frequency to move between them, making it slide toward the desired value [57, 58].


Usually two different control schemes are adopted [55, 56, 57, 58]. A faster scheme is used to track the target operating point when it has been modified. A more power efficient mode is used to maintain the voltage and frequency once equilibrium is reached.

# 4.3.3 The clock generator

The main clock of a circuit controlled by an AVS system is not always provided by an external crystal clock. Instead, it can be generated on chip. Two families of clock generators for AVS systems can be identified. Firstly, when a voltage-controlled ring oscillator is used as a sensor, it can be reused to generate the circuit clock signal [55, 56]. This is possible when the oscillation period matches the delay of the critical path. Local variations cannot be captured by this sensor. Some margin on the clock period thus has to be taken to ensure that no timing failure can occur. This kind of clock generator suffers from clock jitter due to ripple in the circuit supply voltage.

When another kind of sensor is chosen, it is not possible to reuse it as a clock generator for the circuit. When several operating frequencies may be selected, the authors of [53] propose to use an external clock signal. This signal passes through a clock divider whose dividing ratio can be chosen. When a higher operating frequency is selected, the new clock is first sent to the sensor in order to adapt the circuit supply voltage. Once the supply voltage has stabilized, the new higher clock frequency is transmitted to the circuit. When the clock speed is decreased, the lower frequency clock is directly sent both to the sensor and to the circuit.

### 4.3.4 The DC/DC converter

AVS systems typically use switching converters because of their higher efficiency while delivering a large range of voltage. Indeed linear converters exhibit higher losses when their  $V_{OUT}/V_{IN}$  ratio decreases. The most popular converter is the buck converter [55, 56, 57] as the nominal voltage of digital circuits is smaller than the battery voltage. However [58] proposes a buck/boost converter for more exotic applications. To our knowledge AVS systems using SC DC/DC converters have not yet been proposed.

The switching losses are one concern of the buck converter. The switches must be sized to be able to deliver the maximal power consumed. This leads to large switches with a large gate capacitance. Charging and discharging the gate capacitance, as well as the buffer chains driving the switches, consumes a lot of power. These switching losses lower the converter efficiency, and thus the efficiency of the AVS system, when the circuits are idle or consume less power.

Fig. 4.9. Block diagram of the proposed AVS system. An SC DC/DC converter is used for the voltage conversion, the sensor and the clock generation units are built with the same ring oscillator.

Several techniques are proposed to scale the switching losses with the delivered power in order to maintain the converter efficiency at low loads. The first technique consists in dividing each switch into an array of switches. Depending on the delivered power, more or fewer switches in the arrays are used [55, 56]. Another technique is to reduce the switching frequency by driving the converter with a Pulse Frequency Modulation (PFM) instead of a Pulse Width Modulation (PWM) at low loads [55, 56, 57, 60]. With PFM, charges are sent to the load only when the output voltage decreases below a threshold value.
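The PFM principle can be illustrated with a very simple behavioral model: a fixed charge packet is pushed onto the output capacitor each time the output voltage falls below the threshold, so that the pulse rate scales with the load current. The sketch below is not taken from any of the cited designs, and all component values are hypothetical.

```python
def pfm_pulse_count(i_load, c_out=1e-9, q_packet=20e-12, v_th=0.75,
                    t_end=100e-6, dt=10e-9):
    """Count the charge packets sent during t_end for a constant load current.

    Hypothetical model: ideal output capacitor c_out, constant-current load,
    one packet of charge q_packet delivered whenever v_out drops below v_th.
    """
    v_out = v_th + q_packet / c_out  # start just after a pulse
    pulses = 0
    for _ in range(int(round(t_end / dt))):
        v_out -= i_load * dt / c_out      # the load discharges the capacitor
        if v_out < v_th:                  # threshold crossed: send one packet
            v_out += q_packet / c_out
            pulses += 1
    return pulses


for i_load in (10e-6, 100e-6):
    n = pfm_pulse_count(i_load)
    print(f"I_load = {i_load * 1e6:.0f} uA -> {n} pulses in 100 us")
```

In this model the average pulse frequency tends to I_load / q_packet, which is why the switching activity, and hence the switching losses, decrease at low loads.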

# 4.4 PROPOSED ON-CHIP AVS SYSTEM WITH AN SC DC/DC CONVERTER

This section describes a new AVS system that can be fully integrated on chip. It then analyzes the generated clock stability and the achievable benefits on the  $V_{DD}$  guard band. A design example and measurements will be provided in Section 4.5.

### 4.4.1 Description of the AVS system

The proposed architecture is shown in Fig. 4.9. An SC DC/DC converter operating in the SSL regime is used for the voltage conversion. The other blocks of the AVS system are fully digitally implemented. The sensor and the clock generator units are made of the same ring oscillator whose oscillation period matches the critical path delay (CPR RO). The external inputs to the AVS system are the battery voltage  $V_{BAT}$ , the digital code representing target operating frequency  $f_{TARGET}$ , and a low frequency crystal clock.

During the high phase of the external clock $f_{CRYSTAL}$, the frequency comparator counts the number of rising edges of the clock generated by the sensor and clock generation unit. On the falling edge of $f_{CRYSTAL}$, the difference between the count result and $f_{TARGET}$ is sent to the regulator. The regulator decides whether the circuit clock frequency $f_{CLK}$ must be increased or decreased, and therefore how the converter output voltage must be adjusted. It generates control signals which are sent to the decoder, which converts them into signals compatible with the DC/DC converter driver. The conversion ratio of SC DC/DC converters at no load is fixed by their topology. However, as the load current increases, their output voltage decreases below the no-load voltage $V_{nl}$, as explained in Chapter 1. As long as the converter operates in the SSL regime, from Eq. (1.2), the power $P_{out}$ that can be delivered by the converter is:

$$P_{out} = \frac{(V_{nl} - V_{out}) \times V_{out}}{Z_{SSL}} = \frac{f_{sw}}{\kappa} \times (V_{nl} - V_{out}) \times V_{out}. \tag{4.1}$$

where $V_{out}$ is here the $V_{DD}$ of the supplied circuit, and where $\kappa$, defined in Section 1.3.2, is inversely proportional to the transfer capacitor size. For a given $P_{out}$, this $V_{DD}$ can thus be tuned by controlling the switching frequency. This control mechanism follows a PFM scheme. The decoder signals are sent to the converter driving clock on the rising edge of $f_{CRYSTAL}$. They select the converter switching frequency. The output of the converter $V_{DD}$ is sent both to the supplied circuit and to the sensor and clock generation unit. As $V_{DD}$ varies, $f_{CLK}$ is thus automatically adjusted.
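Eq. (4.1) can be inverted to obtain the switching frequency needed for a given operating point. The sketch below simply evaluates the equation in both directions. The value of $\kappa$, the no-load voltage and the output power are hypothetical and only serve to show the trend: the closer $V_{out}$ is to $V_{nl}$, the higher the required $f_{sw}$.

```python
def ssl_output_power(f_sw, kappa, v_nl, v_out):
    """Eq. (4.1): power delivered by the converter in the SSL regime."""
    return f_sw / kappa * (v_nl - v_out) * v_out


def required_switching_frequency(p_out, kappa, v_nl, v_out):
    """Eq. (4.1) solved for f_sw. Only meaningful for 0 < v_out < v_nl."""
    if not 0.0 < v_out < v_nl:
        raise ValueError("v_out must lie strictly between 0 and v_nl")
    return kappa * p_out / ((v_nl - v_out) * v_out)


# Hypothetical operating point: kappa in 1/F, 50 uW load, 0.55 V no-load voltage.
KAPPA = 1.0e9
for v_out in (0.35, 0.40, 0.45):
    f_sw = required_switching_frequency(50e-6, KAPPA, 0.55, v_out)
    print(f"V_out = {v_out:.2f} V -> f_sw = {f_sw / 1e6:.2f} MHz")
```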

### 4.4.2 Jitter on the generated clock

The sensor is subject to the same process and temperature variations as the circuit because it is built with the same family of logic gates on the same die. It is also powered with the same supply voltage making the AVS feedback loop able to cancel global PVT variations. The sensor is used for the clock generation unit. Its frequency is determined by the output voltage of the DC/DC converter. As the output voltage of a switching converter suffers from ripple, jitter can appear on clock signal  $f_{CLK}$ .

Fig. 4.10 shows the evolution of the jitter with the converter switching frequency $f_{sw}$. The ripple magnitude is constant and equals 25 mV. The results are normalized to the clock period, which is 1 ns for low-power applications (application A) such as mobile processors and 100 ns for ultra-low-power applications (application B) such as wireless sensor nodes. For application A, the jitter is lower than 5% of the clock period for an $f_{sw}$ of 100 MHz. As $f_{sw}$ increases, the jitter is reduced. This is because the difference between the maximal and minimal average $V_{DD}$ over a 1 ns clock period is flattened. For application B, the normalized jitter can rise up to 20% for a low $f_{sw}$ of 1 MHz. This is because the ring oscillator operates under a lower $V_{DD}$ of 0.75 V. The ON currents of the transistors are thus more sensitive to a modification of $V_{DD}$. For the same reason as for application A, the jitter is reduced when the converter $f_{sw}$ increases up to 10 MHz. Moreover, the jitter shows minima when $f_{sw}$ is a multiple of the clock frequency. This is because the average $V_{DD}$ supplying the ring oscillator is then constant, as it is integrated over a whole number of ripple periods.
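The averaging argument can be checked numerically with a simplified model. The sketch below assumes a sinusoidal ripple, which is an idealization of the real converter waveform, and computes how much the $V_{DD}$ averaged over one clock period can vary depending on where the clock period starts within the ripple period. It does not model the sensitivity of the ring oscillator itself, so it only reproduces the trend (reduction at high $f_{sw}$, minima at multiples of the clock frequency), not the jitter values of Fig. 4.10.

```python
import math


def averaged_ripple_pp(f_sw, t_clk, ripple_pp=0.025):
    """Peak-to-peak variation of V_DD averaged over one clock period.

    Assumes a sinusoidal ripple of peak-to-peak magnitude `ripple_pp` at
    frequency f_sw. Averaging over a window t_clk scales the ripple by
    |sin(x) / x| with x = pi * f_sw * t_clk.
    """
    x = math.pi * f_sw * t_clk
    if x == 0.0:
        return ripple_pp
    return ripple_pp * abs(math.sin(x) / x)


# Application B: 100 ns clock period.
for f_sw in (1e6, 5e6, 10e6, 15e6, 20e6):
    pp = averaged_ripple_pp(f_sw, 100e-9)
    print(f"f_sw = {f_sw / 1e6:4.0f} MHz -> averaged ripple = {pp * 1e3:.2f} mV")
```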



Fig. 4.10. Jitter in the ring oscillator when supplied by the output of a switching converter. The ripple magnitude is 25 mV.

# 4.4.3 Ripple-induced $V_{DD}$ guard band

As seen in the previous section, the ripple induces jitter in the clock signal generated by the voltage-controlled ring oscillator. The ripple itself does not affect the average speed performance of the circuit: on average the circuit operates at the target frequency $f_{TARGET}$, a slower clock cycle being compensated by a faster one. The danger comes from the risk of a timing failure if the clock is too fast and the data does not have enough time to go through the critical path to a register before the next clock cycle. This is avoided if the supply voltage of the circuit is higher when the clock frequency is the highest. Unfortunately, there is a clock propagation time in the clock tree between the clock generator and the circuit logic gates. Therefore, for a given clock period, the average $V_{DD}$ differs between the time it supplies the clock generator and the time it supplies the digital circuit. If no $V_{DD}$ guard band is taken when the average $V_{DD}$ is higher for the clock generator than for the supplied circuit, a timing failure may occur.

In order to evaluate this $V_{DD}$ guard band, the test bench depicted in Fig. 4.11 has been used. The output voltage of a switching converter with 25 mV ripple supplies the clock generation unit. The clock signal is sent, with a propagation delay, to the input of the application A and B benchmark circuits that were used in Section 4.2. As a reminder, the application A benchmark circuit is a 20-stage FO4 inverter chain representative of a low-power application critical path, and the application B benchmark circuit is a 50-stage FO4 inverter chain representative of an ultra-low-power application critical path. The propagation delay block mimics the delay occurring in the clock tree. The maximal delay has been set to half the clock period. Finally, the benchmark circuits are supplied with the same $V_{DD}$ as the clock generation unit plus an additional $V_{DD}$ guard band. The purpose of the test bench is to evaluate the minimal $V_{DD}$ guard band ensuring that a clock edge can propagate through the benchmark circuit in less than a clock period.

Fig. 4.11. Test bench for the evaluation of $V_{DD}$ guard band required to compensate ripple in the supply voltage and clock jitter.

Fig. 4.12 shows the results. For application A, the worst case does not occur at the minimal $f_{sw}$. This is because for $f_{sw}$ below 500 MHz (i.e., half the clock frequency) the clock propagation delay is small compared to the switching period, leading to close-to-identical average supply voltages for the clock generation unit and the supplied circuit. For $f_{sw}$ above 500 MHz, the integration time of $V_{DD}$ during the clock generation increases relative to the ripple period, leading to an averaging effect that reduces the required $V_{DD}$ guard band.

Fig. 4.12. $V_{DD}$ guard band to compensate for ripple in the supply voltage for varying clock propagation delay and switching frequency. Ripple is set to 25 mV. $V_{DD}$ for application A and B benchmark circuits is 1.1 V and 0.75 V respectively.

Application B sees the same evolution of its supply voltage guard band, with the exception of $f_{sw}$ of 25 MHz and 50 MHz, where the guard band oscillates with increasing clock propagation delay. This is because the switching period is shorter than half the clock period. The guard band therefore rises for delay times up to half the switching period. At this point, the average $V_{DD}$ is minimal. Higher delays lead to a higher average $V_{DD}$ and therefore a smaller $V_{DD}$ guard band. Moreover, the $V_{DD}$ guard band magnitude is lower for an $f_{sw}$ of 10 MHz than for 25 MHz. This is because at 10 MHz the switching period equals the clock period, and the average voltage does not vary with the clock propagation delay. Fig. 4.13 shows the evolution of the maximal $V_{DD}$ guard band for a clock propagation delay between zero and half the clock period and for varying $f_{sw}$. The ripple magnitude is 25 mV. Application B shows minima for $f_{sw}$ multiples of $f_{CLK}$ for the reasons mentioned above.



Fig. 4.13. $V_{DD}$ guard band to compensate for ripple in the supply voltage for varying switching frequencies vs. converter area. Ripple is set to 25 mV. $V_{DD}$ = 1.1 V, $P_{out}$ = 1 mW, and $\kappa = 46.3 \cdot 10^6 F^{-1}/mm^2$ for application A; $V_{DD}$ = 0.75 V, $P_{out}$ = 100 $\mu$W, and $\kappa = 40 \cdot 10^6 F^{-1}/mm^2$ for application B. Capacitance density is 7.2 fF/$\mu m^2$ for application A and 5 fF/$\mu m^2$ for application B, in 45 nm and 65 nm CMOS technology nodes respectively.

The $V_{DD}$ guard band remains small even in the worst cases. It is 0.7% and 1% of the $V_{DD}$ of applications A and B respectively. Therefore, $f_{sw}$ should be chosen so as to minimize the converter area. The area dependence on $f_{sw}$ is also shown in Fig. 4.13. As stated in Eq. (4.1), for a given output power, $\kappa$ is proportional to the switching frequency, and the transfer capacitors are thus inversely proportional to it, leading to an unrealistic converter size for low $f_{sw}$. Fig. 4.13 shows the $f_{sw}$ values that lead to a converter area equal to half the supplied circuit area without AVS, corresponding to 33% of the circuit area with AVS included depending on the achieved power savings. The reduction of the converter size by increasing its $f_{sw}$ comes at the cost of a reduced conversion efficiency. The most straightforward example is the losses occurring in the driving of the large gate capacitances of the switches, but bottom plate capacitance losses and losses in the control circuit also increase with $f_{sw}$, thereby reducing the converter efficiency [19, 64].

### 4.4.4 Local variation induced $V_{DD}$ guard band

Local variations also occur in the sensor and clock generation unit, as it is built with logic gates. An increase in the $V_{DD}$ guard band is therefore needed to ensure that no timing failure occurs if the clock generator is faster than a critical path due to transistor-to-transistor variations. Monitoring the critical paths inside the circuit with a Razor-like sensor could avoid this guard band, but it would increase the AVS design complexity. Fig. 4.14 shows that, to ensure 99.9% yield ($3\sigma$ robustness), a $V_{DD}$ guard band of 70 mV and 28 mV is required for application A and application B respectively. It has been assumed that application A counts 1000 independent critical paths and application B only 100, as ultra-low-power circuits are usually less complex with relaxed timing constraints. Compensating the local variations of the clock generator thus increases the $V_{DD}$ guard band by 75% for both applications A and B, compared to the 40 mV and 16 mV of Section 4.2.

Fig. 4.14. Timing yield of application A and application B for a $V_{DD}$ guard band of 70 mV and 28 mV respectively.

The application B clock generator operates at a lower supply voltage and should therefore be much more prone to local variations. However, thanks to the larger number of stages in its critical path, and therefore in the clock generator, averaging between stages reduces the impact of local variations. These variations are also mitigated by a smaller number of independent critical paths than for application A. The $V_{DD}$ guard band induced by local variations remains small when compared to the benefit of cancelling the effect of the global variations.
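The influence of the number of independent critical paths can be illustrated with a deliberately simplified statistical model. The sketch below assumes independent, Gaussian-distributed path delays checked against a fixed clock period, and therefore ignores the variation of the clock generator itself, which the analysis above does take into account. It only shows how the per-path margin, expressed in standard deviations, grows with the number of paths for a given yield target. The function name is hypothetical.

```python
from statistics import NormalDist


def required_sigma_margin(yield_target, n_paths):
    """Per-path margin (in standard deviations) needed so that all
    n_paths independent Gaussian paths meet timing with probability
    yield_target. Simplified model: fixed clock period."""
    per_path_probability = yield_target ** (1.0 / n_paths)
    return NormalDist().inv_cdf(per_path_probability)


for name, n_paths in (("A", 1000), ("B", 100)):
    z = required_sigma_margin(0.999, n_paths)
    print(f"Application {name}: {n_paths} paths -> {z:.2f} sigma per path")
```

Under these assumptions, the circuit with fewer independent critical paths needs a smaller margin per path, which is consistent with the mitigation mentioned for application B.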

# 4.5 PRACTICAL AVS IMPLEMENTATION

In this section, we present an AVS system that has been designed to supply a ULV microcontroller within the SleepWalker SoC [5]. The microcontroller supply voltage is 0.4V in typical conditions with a target $f_{CLK}$ of 25MHz. At 0.4V, to ensure safe operation at the worst-case corner, the microcontroller $f_{CLK}$ would have to be reduced to 10MHz. This worst-case corner appears with an SS process at -40°C because low temperature has the most detrimental impact on speed at ULV [49]. To alleviate the PVT impact on gate delay, we designed an all-digital on-chip adaptive voltage scaling (AVS) system based on the theoretical principles from Section 4.4. Its purposes are:

- generation of the regulated 25MHz clock,
- generation of the regulated internal $V_{DD}$ for the ULV domain from the unregulated (1-1.2V) external $V_{DD}$,
- dynamic adaptation of the internal $V_{DD}$ to compensate delay variations from process and temperature changes.

The internal $V_{DD}$ needs to be adapted from 0.48V, to compensate the SS process corner at -40°C, to 0.32V, to save power in the FF corner at +85°C.

The AVS regulation loop, depicted in Fig. 4.15, uses the architecture described in Section 4.4.1. It uses a critical-path replica ring oscillator (CPR RO) both to generate the 25MHz clock and to sense the ULV critical path delay. For these purposes, a crystal clock serves as a time reference. The information about the CPR RO frequency is sent to the AVS controller, which decides to either increase or decrease the internal $V_{DD}$. This internal $V_{DD}$ is generated by an SC DC/DC converter controlled via a frequency modulation scheme by a variable-length RO.

An AVS system based on a comparable principle was recently proposed in [14] but our AVS system is fully designed for low-carbon WSNs in 65nm CMOS. This necessitates (i) an on-chip implementation with a switched-capacitor DC/DC converter to avoid both the cost and carbon footprint of external inductors, and (ii) an all-digital loop with frequency modulation and simple low-power regulation scheme for low die area.

# 4.5.1 AVS controller

The AVS controller monitors the frequency generated by the CPR RO. Critical path delays depend on gate delays and on RC delays due to interconnections [55, 71]. However, for the supplied ULV microcontroller, the critical path has a low dependency on the RC delays because (i) the microcontroller area is small (< $0.5mm^2$), which means short wires between gates of the critical path, and (ii) at ULV RC delays are proportionally smaller than gate delays [72]. Furthermore, the cell library used for the microcontroller synthesis contains gates with the same gate length and a stacking limited to three transistors. Therefore, a ring oscillator made of NAND2 and NOR2 gates taken from this library was considered to be an area and design cost effective solution to implement the CPR. A small guard band was included by adding some stages to the ring oscillator to ensure safe operation versus WID variations and behavior differences between the CPR RO and the actual critical paths, as explained in Section 4.4.4. A 10-bit counter counts the number of rising edges of the CPR RO during the low phase of the crystal clock. The result is compared to a target count corresponding to the 25MHz target frequency $f_{TARGET}$. If the CPR RO frequency is lower than $f_{TARGET} - 0.5$ MHz (resp. higher than $f_{TARGET} + 0.5$ MHz), the AVS controller requests an increase (resp. decrease) of the internal $V_{DD}$ to correct the CPR RO frequency.
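The decision rule described above can be sketched as follows. The target frequency and the ±0.5 MHz window come from the text. The crystal frequency (32.768 kHz) and the 50% duty cycle are assumptions made only for this illustration, since they are not specified in this section, and the function names are hypothetical.

```python
F_TARGET = 25e6      # Hz, target clock frequency (from the text)
TOLERANCE = 0.5e6    # Hz, half-width of the dead band (from the text)
F_CRYSTAL = 32768.0  # Hz, hypothetical crystal frequency (assumption)


def measured_frequency(count, f_crystal=F_CRYSTAL):
    """CPR RO frequency estimated from the number of rising edges counted
    during the low phase of the crystal clock (50% duty cycle assumed)."""
    gate_time = 0.5 / f_crystal
    return count / gate_time


def avs_decision(count, f_target=F_TARGET, tolerance=TOLERANCE):
    """Return +1 to request a higher internal V_DD, -1 to request a lower
    one, and 0 to hold when the frequency is inside the dead band."""
    f_cpr = measured_frequency(count)
    if f_cpr < f_target - tolerance:
        return +1
    if f_cpr > f_target + tolerance:
        return -1
    return 0


for count in (370, 381, 395):
    print(count, f"{measured_frequency(count) / 1e6:.2f} MHz", avs_decision(count))
```

With this hypothetical crystal, the target corresponds to a count of roughly 381, which fits in the 10-bit counter.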

A simple +/- regulation is chosen to keep both the power consumption and the area of the AVS loop low, at the cost of a slow loop response. This is acceptable because the AVS feedback loop has to track process and temperature variations, which are slow. Moreover, the CPR RO is supplied by the low internal  $V_{DD}$ , ensuring that no timing violation from fast transient voltage variations or workload fluctuations can occur in the ULV domain.

The AVS controller, together with the Dec and Sync registers of the variable-length RO described in Section 4.5.3, has a power consumption of $1.3\mu$W in typical conditions according to physical synthesis.



Fig. 4.15. Architecture of the proposed all-digital on-chip AVS system

# 4.5.2 DC/DC converter

The DC/DC converter performs the voltage down conversion from the external 1-1.2V voltage source to the 0.32-0.48V internal $V_{DD}$ used by the microcontroller. It is built with two interleaved switched-capacitor networks (SCNs) to limit the voltage ripple on the internal $V_{DD}$, as explained in Section 1.5.5. The SCNs use a divide-by-two topology with five power switches. The switches connected to the converter output (internal $V_{DD}$) and the switch between transfer capacitors $C_{Ta}$ and $C_{Tb}$ have their body biased at the internal $V_{DD}$. This alleviates the negative body bias normally seen by these devices when they are ON, leading to a 38% switch size reduction and thus lowering the associated gate drive losses. In stand-by mode, the DC/DC clock is stopped and the switches act as power-gating sleep transistors. Reducing the switch sizes thus leads to a direct reduction of the leakage power of the microcontroller in stand-by mode [73]. For leakage power reasons, LP MOSFETs are used to implement the switches.

The SCNs use MIM capacitors as transfer capacitors $C_{Ta}$ and $C_{Tb}$ for a compact on-chip implementation. Furthermore, $C_{Tb}$ is physically implemented at the layout level above the 1V power domain of the always-on peripheral circuits of the microcontroller. The switching noise from $C_{Tb}$ does not harm the operation of the always-on peripherals because (i) the bottom plate of $C_{Tb}$ is always connected to the ground, therefore acting as a shield, and (ii) the always-on peripherals have a low noise sensitivity thanks to both their 1-1.2V $V_{DD}$ and their loose timing constraints, as most logic paths operate on the crystal clock. This stacking of MIM capacitors allows for recycling 44% of the converter area. Thanks to this MIM stacking scheme, the converter area is only $0.07mm^2$, i.e. 40% smaller than the converter from [21], which was designed to supply a sub-MHz microcontroller at 0.5V. Further improvement in the converter area could be achieved through the study of other SCN topologies, as shown in [36]. $C_{out}$ results from a trade-off between a ripple lower than 20mV on the ULV supply voltage and a wake-up time smaller than 10$\mu$s.

### 4.5.3 Variable-length ring oscillator

As explained in Section 4.4.1, the voltage supplied by the AVS system can be controlled by modulation of the SC DC/DC converter switching frequency. A variable-length ring oscillator (RO) is thus implemented to generate the DC/DC clock. When the AVS controller requests a higher (resp. lower) internal  $V_{DD}$ , the RO length is reduced (resp. increased), thereby increasing (resp. reducing)  $f_{sw}$ .

To avoid interrupting the DC/DC converter, the RO length is adapted dynamically. On the crystal-clock falling edge, the controller sends the request for increasing or decreasing the internal $V_{DD}$. In order to avoid parasitic oscillation in the RO, control signals from the synchronization register are sent when the oscillating edge propagates in the fixed-length part of the oscillator, as shown in Fig. 4.15. The incremental parts of the RO are thus reconfigured by the time the oscillating signal reaches them.

Fig. 4.16. Simulated AVS response to a steep workload increase. The internal $V_{DD}$ remains above the functional limits before the AVS loop brings back the CPR frequency close to the target.

The control signals from the AVS controller thus cross clock domains. Therefore, a double register barrier is used to avoid metastability. The first register (Dec) decodes the signals from the controller into signals able to select the RO length. The second register (Sync) sends these signals to a multiplexer selecting the last stage of the RO and to NAND gates enabling the incremental parts of the ring oscillator, for low power. There are 31 different RO lengths arranged on a logarithmic scale to generate a DC/DC clock that can accommodate regulation in extreme cases: low $f_{sw}$ down to 1.65MHz for 1.2V to 0.32V conversion (FF corner at +85°C) and high $f_{sw}$ up to 17MHz for 1.0V to 0.48V conversion (SS corner at -40°C).
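To give an order of magnitude of the frequency steps, the sketch below generates 31 settings between 1.65MHz and 17MHz. It assumes an exactly geometric spacing, which is only one possible reading of "logarithmic scale": the actual RO lengths of the chip are not listed here. The step function mimics the +/- regulation, with a higher index standing for a shorter RO and thus a higher $f_{sw}$. All names are hypothetical.

```python
def dcdc_clock_frequencies(f_min=1.65e6, f_max=17e6, n_settings=31):
    """Geometrically spaced switching frequencies (assumed spacing)."""
    ratio = (f_max / f_min) ** (1.0 / (n_settings - 1))
    return [f_min * ratio ** i for i in range(n_settings)]


def next_setting(index, decision, n_settings=31):
    """Move one step up (+1, higher V_DD requested) or down (-1),
    saturating at both ends of the available RO lengths."""
    return min(max(index + decision, 0), n_settings - 1)


freqs = dcdc_clock_frequencies()
step_ratio = freqs[1] / freqs[0]
print(f"{len(freqs)} settings, {100 * (step_ratio - 1):.1f}% per step")
```

Under this assumption each step changes $f_{sw}$ by about 8%.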

### 4.5.4 AVS stability

The purpose of the AVS is to avoid any guard band due to PVT variations. The AVS feedback loop is much faster than a temperature change or a battery discharge, so that they do not threaten the system stability. However, a steep increase in the microcontroller workload may occur. In that case, the additional charges needed by the microcontroller are taken from the 3.3nF $C_{out}$ as long as the AVS feedback loop has not handled this power consumption increase. Because of the small amount of energy stored in $C_{out}$, this lowers the internal $V_{DD}$, which threatens the microcontroller functionality if the internal $V_{DD}$ drops under the functional voltage limit. In order to show that such a crash cannot occur, Fig. 4.16 shows the scenario of a maximum workload change. The microcontroller is first clock gated so that the ULV domain power consumption equals the microcontroller leakage power, which is $33\mu$W. Then, the microcontroller workload rises to its maximum, i.e. an extra $50\mu W$ of dynamic power consumption occurs. The internal $V_{DD}$ thus drops because of the small amount of energy stored in $C_{out}$. Meanwhile, the CPR frequency drops. These two factors reduce both the dynamic and leakage powers of the microcontroller. Furthermore, as the converter output voltage drops, the difference between $V_{nl}$ and the internal $V_{DD}$ increases, which increases the converter delivered power as stated in Eq. (4.1). Therefore, the system stabilizes at a lower internal $V_{DD}$, which remains higher than the microcontroller functional limit ($\approx 0.3V$ from measurements). Then the AVS regulation loop detects the lower CPR frequency and starts to regulate the system to handle the load change within $100\mu$s.

# 4.6 EXPERIMENTAL VALIDATION

The AVS test chip was integrated and manufactured within the SleepWalker SoC in a 7-metal 65nm LP/GP CMOS process with MIM capacitance option. Fig. 4.17 shows a die microphotograph. The 23 available dies from a TT wafer were successfully tested.

The open-loop efficiency of the switched-capacitor DC/DC converter is given in Fig. 4.18 (a). Results are shown for various internal  $V_{DD}$  values ensuring safe operation of the ULV power domain in the SS -40°C, TT 25°C and FF 85°C corners, with the length of the variable-length RO swept over the whole range. Each curve exhibits a different range of load current because the difference between  $V_{nl}$  and  $V_{out}$  from Eq. (4.1) is corner dependent, from 0.02V for the SS -40°C corner to 0.18V for the FF 85°C corner. However, the power consumption of the ULV domain also depends on the corners. As shown in Fig. 4.18 (a), the converter is able to deliver enough current to the load in all operating conditions, with a safety margin of at least 2×. The converter efficiency varies with the supplied internal  $V_{DD}$  for two reasons.

- The intrinsic efficiency limit of the switched-capacitor converter equals the ratio between  $V_{out}$  of the converter (the internal  $V_{DD}$  of ULV power domain) and  $V_{nl}$ . The maximum achievable efficiency is thus 96%, 80% and 64% for the following conditions: SS/-40°C at 0.48V, TT/25°C at 0.4V and FF/85°C at 0.32V, respectively.
- The power overhead of the converter from the variable-length RO, the non-overlapping clock, the gate drivers and the bottom-plate capacitances is proportional to the switching frequency $f_{sw}$. At a given $f_{sw}$ and thus a given power overhead, the load current varies with the internal $V_{DD}$. At high internal $V_{DD}$ this power overhead is associated with lower load currents, which lowers the efficiency.

Fig. 4.17. Die microphotograph of the sub-mm<sup>2</sup> microcontroller SoC.

The measured top efficiency peaks above 80% for 0.45V internal  $V_{DD}$  and 1V external  $V_{DD}$ .
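As a quick check of the first reason above, the intrinsic limit is simply the ratio $V_{out}/V_{nl}$. The sketch below reproduces the three quoted values under the assumption of an ideal 2:1 step-down network, so that $V_{nl}$ is half of the 1V external $V_{DD}$. The function name and its parameters are illustrative and do not come from the test chip.

```python
def intrinsic_efficiency_limit(v_out, v_ext=1.0, gain=0.5):
    """Upper bound on SC converter efficiency: V_out / V_nl.

    Assumes an ideal network whose no-load voltage is V_nl = gain * v_ext
    (gain = 0.5 for a 2:1 step-down, an assumption of this sketch).
    """
    v_nl = gain * v_ext
    return v_out / v_nl


# Internal V_DD values quoted in the text for the three corners.
corners = {"SS/-40C": 0.48, "TT/25C": 0.40, "FF/85C": 0.32}
limits = {name: intrinsic_efficiency_limit(v) for name, v in corners.items()}

for name, eta in limits.items():
    print(f"{name}: {eta:.0%}")
```

The measured peak efficiency sits below these bounds because the frequency-proportional overhead described in the second reason comes on top of the intrinsic loss.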

Fig. 4.18 (b) shows the measured  $V_{dd}$  guard band reduction achieved by the AVS system on 10 tested dies. An average power saving of 25-30% is achieved, thanks to  $V_{DD}$  reductions up to 110mV, when compared to the 0.48V worst-case  $V_{DD}$  for 25MHz operation at the -40°C SS corner.

Fig. 4.18. Measured DC/DC converter and AVS system results: (a) DC/DC converter efficiency for a 1V external $V_{DD}$, the arrows show the estimated ULV power domain consumption range for typical and extreme corners, (b) power consumption of the ULV domain (including the DC/DC converter) with and without AVS internal $V_{DD}$ regulation (10 tested dies, at 25MHz in all conditions).

**Fig. 4.19.** AVS (a) line (@ $I_{load}=155\mu$A, $25^{\circ}C$) and (b) load (@ external $V_{DD}=1V$, $25^{\circ}C$) regulation of internal $V_{DD}$ and main clock frequency.

Fig. 4.19 (a) and (b) show the closed-loop AVS line and load regulation of the main clock frequency and internal $V_{dd}$. The internal $V_{dd}$ is maintained between 398mV and 403.4mV for a load current ranging from 25$\mu$A to 713$\mu$A or an external $V_{DD}$ ranging from 0.88V to 1.22V. This corresponds to a main clock frequency between 24.42MHz and 25.81MHz. As the external $V_{DD}$ or the load current changes, the internal $V_{DD}$ varies, leading to a frequency variation of the main clock. Fig. 4.19 (a) and (b) thus show how the AVS feedback loop detects a deviation from the target frequency and adapts the length of the converter variable-length RO accordingly. The logarithmic scale on the length increment ensures that only 31 regulation steps can accommodate a wide $30 \times$ load current span. At a given output voltage, a logarithmic switching frequency scale leads to a fixed relative increase in the converter delivered power, as long as the converter operates in the slow switching limit. In contrast, a binary scheme leads to a linear switching frequency scale with a fixed absolute increase of the converter delivered power. Therefore, for a given granularity in the first steps, a linear scale would require many more settings to achieve the same load span.
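The argument above can be illustrated with a short calculation. The sketch below assumes that the delivered power scales proportionally with $f_{sw}$ (slow switching limit) and that the 31 settings are spread evenly on a logarithmic scale over the $30 \times$ span. Both function names are illustrative, and the actual RO length increments of the test chip are not reproduced here.

```python
import math


def log_step_ratio(span, n_settings):
    """Relative increase per step when n_settings cover `span` geometrically."""
    return span ** (1.0 / (n_settings - 1))


def linear_settings_needed(span, first_step_rel):
    """Settings needed by a linear scale whose fixed absolute step equals
    first_step_rel times the smallest value of the range."""
    return math.ceil((span - 1.0) / first_step_rel) + 1


# 31 logarithmic settings covering a 30x span (values from the text).
ratio = log_step_ratio(span=30.0, n_settings=31)

# Linear scale with the same granularity as the first logarithmic step.
n_linear = linear_settings_needed(span=30.0, first_step_rel=ratio - 1.0)

print(f"relative step: {ratio - 1.0:.1%}, linear settings needed: {n_linear}")
```

Under these assumptions each logarithmic step changes the delivered power by roughly 12%, whereas a linear scale with the same initial granularity needs well over 200 settings to cover the same span.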




**Fig. 4.20.** Measured transient behavior of the internal  $V_{DD}$  at start-up and for a load change: microcontroller going from performing NOP operations to FIR looping, at  $25^{\circ}C$ .

Fig. 4.20 shows the transient behavior of the internal $V_{DD}$ at start-up, with the start-up steps detailed for illustration purposes. First, when the microcontroller state request signal goes to active, the converter sends charges to the 3.3 nF $C_{out}$. At this stage, the ULV domain power consumption is limited to leakage through the devices. Therefore the internal $V_{DD}$ rises close to the no-load voltage of the converter, i.e. half of the external $V_{DD}$. Second, the CPR is started and switching power occurs in the ULV domain, resulting in a small drop of the internal $V_{DD}$. The generated clock is then synchronized with the crystal clock and the operation of the CPU is finally started. The AVS loop starts regulating the CPU frequency at this step. Fig. 4.20 also shows the AVS loop response to a steep workload increase. In Section 4.5.4, to prove system stability, we assumed a low-power mode with ideal clock gating (no switching at all) as a worst case. This could not be tested in measurement as this mode is not implemented in the test chip. We emulate this mode instead by performing only NOP operations for a 54 $\mu$W power consumption. When the workload signal goes high, the CPU runs at full workload by performing FIR looping and consumes 67 $\mu$W (with instruction cache power included). When the workload changes, the internal $V_{DD}$ only falls by 15mV before going back to 0.4V. This shows that the results from Section 4.5.4 are conservative and that workload fluctuations do not threaten the CPU operation, as the internal $V_{DD}$ stays 80 mV above the 0.3V functional limit.



Fig. 4.21. Comparison of the total  $V_{DD}$  guard band to ensure safe timing between a system using a typical voltage and clock management architecture and the proposed AVS architecture.

# 4.7 CONCLUSIONS

Advanced CMOS technology nodes suffer from local and global variations leading to large $V_{DD}$ guard bands for safe operation. However, thanks to higher integration levels, AVS systems can be integrated on chip to reduce this guard band. In this chapter, we proposed to build an on-chip AVS system with a switched-capacitor network as the DC/DC converter combined with a critical path replica of the monitored circuit. The critical path replica is used both as the sensor and as the clock generator of the circuit. We first investigated its feasibility for two applications: a 45 nm low-power mobile processor (A) and a 65 nm ultra-low-power wireless sensor node microcontroller (B). We showed that new technology nodes with higher capacitance density allow for building such converters using less than 33% of the total circuit area. Low-power applications need a larger guard band to compensate for local variations because they contain more critical paths and are designed in a smaller CMOS technology node (e.g., 45 nm in this chapter). Therefore they would benefit from a "Razor-like" sensor, which allows for canceling local variations as well [59]. This is not the case for ultra-low-power applications, where local variations are proportionally less important. Therefore, a critical path replica, which is less time consuming to implement, would be an acceptable and cost-effective solution. The ripple of the converter output voltage induces jitter, but it is mitigated by the AVS feedback loop. The corresponding contribution to the $V_{DD}$ guard band can be neglected when compared to the guard band induced by local variations. Simulation results summarized in Fig. 4.21 show that the use of an AVS system allows a reduction of the $V_{DD}$ guard band of a digital circuit by 55% for low-power applications and by 82% for ultra-low-power applications. It allows for reducing the dynamic power consumption by 17% and 37% respectively, which increases the battery autonomy or decreases its volume.

An AVS system supplying a microcontroller has been manufactured in a 65nm technology node. Thanks to the PVT canceling, it is possible to increase the microcontroller clock frequency from 10MHz to 25MHz with a 0.4V supply voltage. The SC DC/DC converter inside the AVS system reaches up to 81% efficiency and is able to supply the microcontroller under all PVT corners. The DC/DC converter is able to deliver more power to the load if its output voltage decreases, which ensures the AVS system stability against a steep workload increase.

**CHAPTER 5** 

# A SINGLE-CONVERTER POWER MANAGEMENT UNIT FOR ENERGY-HARVESTING WIRELESS SENSOR NODES

# Abstract

In this chapter, we propose a power management unit to enable energy-autonomous operation of an IoT node from solar energy. We first review the state of the art of power management units using environmental harvesters as energy source and an energy storage element to power the load when no power is available from the harvesters. Such power management units fail to predict the energy available from the storage element due to the flat battery voltage and have a relatively high bill of material.

The proposed power management unit is the first based on a single SC DC/DC converter to extract power from solar cells, generate a regulated 1V load voltage, and interface a supercapacitor as energy storage element. The use of only one DC/DC converter allows for reducing the die area and increasing the end-to-end conversion efficiency. The PMU requires only a supercapacitor and a filtering capacitor as external components. In order to enable management of the node tasks and to anticipate power downs, an SC\_status signal provides information about the voltage on the supercapacitor and thus on the stored energy. The single SC DC/DC converter has a gearbox with 7 voltage gains, gate-boosting drivers for the switches, and a pulse-skip modulation control. Furthermore, the PMU can operate in three frequency modes with a converter switching frequency of 1MHz, 10MHz, or 40MHz, thereby adapting the quiescent current in the PMU controller depending on the power available at the solar cells. It allows the PMU to efficiently handle input power ranging from $10\mu W$ up to 20mW. With the 1MHz switching frequency, the SC DC/DC converter has an average simulated efficiency of 70.7% when charging the supercapacitor from 0V to 2.7V, and of 74.9% when supplying $V_{reg}$ from the supercapacitor, leading to an average 53% end-to-end efficiency.

A layout has been realized in a commercial 65nm CMOS process. It occupies an area of $0.425mm^2$. Once the supercapacitor is charged, the low $7\mu$W standby power, including the supercapacitor leakage, and the $1.7\mu$W sleep power of the load allow the IoT node to survive 43 hours without light before the supercapacitor is empty.

| Section | Contents |
|---------|----------|
| 5.1 | Introduction |
| 5.2 | State of the art of energy-harvesting power management units |
| 5.3 | A single-converter PMU for energy harvesting |
| 5.4 | System validation |
| 5.5 | Conclusions |

# 5.1 INTRODUCTION

As explained in the general introduction, up to trillions of sensor nodes are expected to be deployed to implement the vision of the Internet-of-Things [9]. To enable such a massive deployment, design of an IoT node must address several challenges:

- the node must be energy autonomous (neither requiring battery replacement nor connection to the grid),
- the node must be able to configure itself automatically when inserted in a network,
- the node must have a low carbon footprint for environment sustainability.

Today's designs of IoT nodes do not address all the above challenges. In particular, there is no truly efficient solution for the power management unit (PMU) extracting the power from energy harvesters. As the amount of environmental power cannot be predicted, a storage element (a battery or a supercapacitor) is needed to store charges when the power consumed by the loads is smaller than the power harvested, and to supply it back to the loads when their power consumption exceeds the power harvested. The PMU must thus address the problem shown in Fig. 5.1, that is to:

- efficiently generate a regulated voltage for the IoT node circuitry, while interfacing an energy harvester and a storage element,
- be able to start when deployed without any energy previously stored (cold start),
- protect the node against overvoltage when the storage element is fully charged or when the harvester generates an excessive input power,
- provide information about the available energy in the storage element.

Furthermore, such PMUs must use as few external components as possible to reduce their bill of material and thus to meet the low carbon footprint constraint of IoT nodes.

PMUs for energy harvesting exist [75, 10, 11, 76, 77, 17, 78, 6, 7] but do not meet all the above requirements. Indeed, the PMUs described in [75, 10, 11] do not have a regulated-voltage output, which makes them unusable to supply circuits such as microcontrollers. The circuit described in [76] must start with a precharged battery and cannot reboot itself when the node energy falls below a certain threshold. In [78], the PMU has a $25\mu$A quiescent current that is too high to enable energy-autonomous operation, and in [7] the DC/DC conversion is inefficient (efficiency < 40%), thus wasting energy. Moreover, most of these PMUs use an inductive converter, which increases their bill of material and their carbon footprint because of the need for an off-chip inductor. Inductorless PMUs are proposed in [17, 7], but they suffer from poor end-to-end efficiency below 40% and the power that can be harvested is limited to a few $\mu$W.

**Fig. 5.1.** Problem to be addressed by power management units for autonomous IoT applications.

In this chapter we propose an alternative to generate a 1V regulated voltage  $(V_{reg})$  from stacked micro solar cells and store excess energy in a supercapacitor. This is achieved with a single switched-capacitor DC/DC converter with 7 different voltage gains to cope with the 0-3V voltage range on the supercapacitor. When no light power is available, the DC/DC converter supplies  $V_{reg}$  from the energy previously stored on the supercapacitor. The low  $7\mu A$  quiescent current (including  $1\mu A$  from the supercapacitor [74]) allows the PMU to start with input power as low as  $10\mu$ W, corresponding to indoor lighting with  $4 \times 1.5cm^2$  solar cells or outdoor lighting with  $3 \times 0.5cm^2$  solar cells, and to reach a total discharge time as long as 43 hours of a 350-mF supercapacitor [74], ensuring that the system will not lose stored data during the night.

This chapter is organized as follows. Section 5.2 reviews state-of-the-art energy-harvesting PMUs. Section 5.3 describes the proposed PMU behavior, controller and analog circuits. Finally, Section 5.4 shows measurement results.

# 5.2 STATE OF THE ART OF ENERGY-HARVESTING POWER MANAGEMENT UNITS

Energy-harvesting PMUs can be categorized in two families. PMUs from the first family, shown in Fig. 5.2(a), use a single power path with two voltage converters [75, 76, 77, 78]. The first converter supplies a storage element (often a battery) from the energy harvester when power is available. It also isolates the storage element from the energy harvester when there is no environmental power available, to avoid any reverse current. The second converter supplies the load from the storage element with or without power available from the harvester. Such a PMU architecture only requires a simple controller for each converter, but charges have to pass through two converters to reach the load, which reduces the end-to-end efficiency. In [17] a variation of the single power path architecture is proposed: two voltage converters are used with the load after the first converter and the storage device after the second one. Even though there is only one converter between the load and the harvester, the charges supplying the load from the storage element pass through three converters, reducing the end-to-end efficiency of the system.

**Fig. 5.2.** Typical PMU architectures with energy harvesters: (a) architecture with single power path, (b) architecture with parallel power paths.

The second PMU family, shown in Fig. 5.2(b), uses two parallel power paths [6, 7]. A voltage converter supplies the load from the harvester when there is environmental power available. At the same time, a second converter sends the charges from the harvester that are not consumed by the load to a storage element. When there is no (or not enough) power available from the harvester, a third converter supplies the load from the storage element, as in the single-path architecture. The benefit of the parallel power path architecture is that charges pass through only one voltage converter when there is environmental power available, which increases the system efficiency. However, it comes at the cost of a more complex control mechanism.

There are mainly three kinds of energy harvesters used to supply circuits: solar cells, thermoelectric generators (TEG), and piezoelectric transducers. Solar cells provide a DC open-circuit voltage ranging from 400mV with minimum indoor lighting to nearly 650mV with bright outdoor lighting [12]. PMUs having solar cells as energy source often use an inductive converter to boost the cell voltage up to the storage element voltage [75, 76, 78, 6]. Then, the load is supplied by an LDO when a single power path architecture is used [75, 76, 78] or by a second inductive converter for parallel power path architectures [6]. Some works propose to use SC DC/DC converters to reduce the external component count [7, 16, 17]. In [7, 16], an SC DC/DC converter interfaces the battery and the energy harvester. When active, the load is supplied through an LDO in [7]. In [16] the load is directly connected to the energy harvester. However, in [7, 16] the SC DC/DC converter has a fixed topology, which reduces its efficiency because of the linear losses that depend on the solar cell voltage fluctuations. Furthermore, the power range that can be handled by the DC/DC converter is limited to less than $1\mu W$.



Fig. 5.3. Proposed PMU principle. A single SC DC/DC converter regulates the load voltage and manages the storage supercapacitor by (a) sending excess charges to the storage supercapacitor (direct mode) when power is available from the solar cells, and by (b) bringing these charges back to the load (reverse mode) when there is no solar power.

PMU architectures with TEG or piezoelectric harvesters have similar implementations with key differences from solar-cell-based PMUs. Thermoelectric generators have a low open circuit voltage proportional to the temperature gradient between their inputs. Even though [11] proposes to use a charge pump to interface such a harvester, it is difficult to reach a high power-conversion efficiency with SC DC/DC when the temperature gradient is small because of the high voltage gain required. Therefore [75] and [6] use an inductive converter that can reach a high voltage gain, and [77] even uses a step-up transformer to be able to harvest power from very small voltage on the TEG. Piezoelectric transducers develop an AC voltage so that the PMU requires a rectifier to deliver the harvested power [6], [10].

Finally, PMUs combining different kinds of energy harvesters exist [6, 79, 80]. In [79] the harvesters supply storage capacitors that are stacked. In [80], the harvester that delivers the highest instantaneous power is selected to supply the circuit. However, these two solutions lack efficiency because either not all the energy harvesters are used simultaneously, or several converters are required to handle the load and the multiple inputs. To overcome this, [6] proposes to use a single converter to interface all the energy harvesters thanks to inductance sharing.

# 5.3 A SINGLE-CONVERTER PMU FOR ENERGY HARVESTING

Fig. 5.3 shows the principle of the proposed PMU. To increase the end-to-end efficiency of the system, the harvester is connected directly to the regulated voltage $V_{reg}$. To ensure that the 1V $V_{reg}$ is close to the harvester maximum power point with poor lighting, 4 (resp. 3) cells must be stacked for indoor (resp. outdoor) operation [12], [81]. When the lighting is stronger than in the worst-case conditions, more input power is available and operating away from the maximum power point is thus not a concern. In order to avoid any reverse current when there is no light, an input switch is inserted between the harvester and $V_{reg}$. As in [16], a single SC DC/DC converter is enough to extract power from the solar cells, to regulate $V_{reg}$, and to interface the energy storage element (here a supercapacitor). When solar power is available (Fig. 5.3 (a)), the input switch supplies the load from the solar cells and the SC DC/DC converter carries excess charges away to the storage supercapacitor. An adjustable voltage gain of the SC DC/DC converter ensures low linear losses whatever the voltage on the supercapacitor. When there is not enough light power to supply the load (Fig. 5.3 (b)), the SC DC/DC converter brings charges back from the supercapacitor to the load. The bill of material is low because there is no inductive converter.

Fig. 5.4. Proposed PMU architecture.

Maximum power point tracking (MPPT) is not used because it would require a DC/DC converter instead of the direct connection between the solar cells and the load, and the associated losses would cancel the MPPT benefits at low lighting. The solar cells achieve 22% power conversion efficiency at one sun [12]. With indoor lighting, the cell open circuit voltage can be below 330mV and 4 cells connected in series occupying a total surface of 6.2  $cm^2$  are required. When outdoor, more solar power is available, and higher open circuit voltage develops on each cell. Therefore only 3 solar cells are enough, and their total surface can be reduced to 1.6  $cm^2$  thanks to the use of e.g. [81] that already has 3 cells serially connected.

This section describes the PMU architecture, its control principles and peripheral circuits, as well as the implementation details of the SC DC/DC converter.

### 5.3.1 PMU architecture

Fig. 5.4 shows the PMU architecture built in a commercial 65nm CMOS process. It consists of an input PMOS (P1) with an SC DC/DC converter. When the circuit supplied by the PMU is deployed for the first time, P1 is diode connected so that when power becomes available from the $V_{PV}$ input, it charges $V_{reg}$. When sufficient voltage (0.8V) is detected on $V_{reg}$, a power-on-reset block (PoR) activates the PMU and pulls the gate of P1 to the ground, making it act as a closed switch which increases its conductance. If $V_{reg}$ gets higher than 1V the $V_{reg}$ comparator triggers the SC DC/DC converter in direct mode, so that it takes excess charges away from $V_{reg}$ to the supercapacitor with a pulse-skip modulation (PSM) control. The converter thus both regulates $V_{reg}$ and charges the storage supercapacitor $V_{SC}$ as long as light power is available.

In dark conditions, the solar cells and the input switch fail to supply $V_{reg}$, which drops below 1V. Therefore, the source controller pulls up the gate of P1 to put it back in diode connection. This prevents any reverse current from $V_{reg}$ to $V_{PV}$ when there is no light. The SC DC/DC converter then operates in reverse mode so that it supplies $V_{reg}$ from the supercapacitor. When power is again available from the solar cells, $V_{PV}$ rises so that the P1 diode is ON. It thus supplies $V_{reg}$, making it rise above 1V. When this is detected by the PMU, the gate of P1 is tied to the ground, and the SC DC/DC converter switches back to the direct mode, regulating $V_{reg}$ at 1V and charging the supercapacitor again.

Information about the energy available from the supercapacitor is provided by the SC\_status signal. It is extracted from the present voltage across the supercapacitor. When $V_{SC}$ falls below 1V, the PMU sets the LOW\_E flag to warn that it will run out of energy. Peripheral circuits that are discussed in the following sections also include a digitally-controlled oscillator (DCO), a voltage reference, a bias generator, offset-compensated comparators, an overvoltage protection, and switch drivers. The SC DC/DC converter uses a gearbox to select between 7 SCN configurations (both step-up and step-down) to adapt the voltage gain to the $V_{SC}$ level and avoid prohibitive linear conversion losses.

### 5.3.2 PMU controller

The PMU controller is built with a mix of analog and digital circuits. The digital part is synthesized automatically with low-power high-$V_T$ cell libraries to take advantage of the low logic area in advanced technology nodes, while avoiding the multiplication of quiescent-current-hungry analog circuits. The controller is divided into four parts. The DCO controller, the frequency-mode controller and the gearbox controller shown in Fig. 5.5 manage the SC DC/DC converter. With the $V_{reg}$ comparator, they enable pulse-skip modulation to manage charge transfer between $V_{SC}$ and $V_{reg}$ and to perform the voltage regulation of $V_{reg}$. The source controller shown in Fig. 5.6 controls the input PMOS and the operating mode (direct or reverse) of the SC DC/DC converter. Table 5.1 summarizes the input and output signals, as well as the time constants of the PMU controller digital parts. This section describes the four main parts of the PMU controller.

### Source controller

Fig. 5.5. Block diagram of the SC DC/DC controller: DCO controller, frequency mode controller, and gearbox controller. It uses pulse-skip modulation enabled by the $V_{reg}$ comparator.

Table 5.1. PMU controller signal summary

|                              | INPUT                                                              | OUTPUT                                | Time constant   |
|------------------------------|--------------------------------------------------------------------|---------------------------------------|-----------------|
| Source<br>controller         | source comp. output<br>en_oc<br>DC/DC clk / 256                    | en_source                             | DC/DC clk / 256 |
| Gearbox<br>controller        | V<sub>SC</sub> comp. output<br>en_source<br>32kHz clock            | SC_status[2:0]<br>LOW_E               | $32\mu s$       |
| Frequency-mode<br>controller | V<sub>reg</sub> comp. output<br>en_source<br>DC/DC clk             | freq. mode[1:0]<br>DC/DC clk / 256    | DC/DC clk / 256 |
| DCO<br>controller            | hard-coded f_target<br>freq. mode[1:0]<br>DC/DC clk<br>32kHz clock | speed[16:0]                           | $32\mu s$       |

Fig. 5.6. Source controller state chart and circuits.

The source controller state chart and circuit are shown in Fig. 5.6. It monitors whether power is available from the harvester and decides to supply $V_{reg}$ from the storage supercapacitor or from the harvester. To do so, it uses a hysteresis comparator (source comparator) whose positive input is a fraction of $V_{reg}$ that depends on the power source (harvester or supercapacitor) currently used. If $V_{reg}$ is supplied from the harvester ($\overline{en\_source}=0$, SC DC/DC in direct mode), the source controller checks that $V_{reg}$ is higher than 0.95V. If the light goes down, $V_{reg}$ falls below this low-threshold value because the load consumes more charges on $V_{reg}$ than supplied by the harvester. The source controller then sets $\overline{en\_source}$ to 1 and the SC DC/DC converter operates in reverse mode to supply $V_{reg}$ from the supercapacitor. The input PMOS gate rises to $V_{reg}$, making it act as a diode. This prevents any reverse charge flow from $V_{reg}$ to the harvester.

When $V_{reg}$ is supplied from the supercapacitor ($\overline{en\_source}=1$, SC DC/DC in reverse mode), the source controller checks that $V_{reg}$ does not rise above 1.05V. If light goes up and solar power becomes available from the harvester, $V_{PV}$ rises and the diode-connected PMOS lets charges flow to $V_{reg}$ when $V_{PV} > V_{reg}+0.28V$. If the power supplied by the harvester is higher than the $V_{reg}$ load power consumption, $V_{reg}$ rises. If it goes above the high 1.05V threshold, the source controller connects the input PMOS gate to the ground and the DC/DC converter is operated in the direct mode to transfer excess charges from $V_{reg}$ to the supercapacitor. The hysteresis comparator is offset compensated. This is because the reference voltage $V_{ref}$ is low (130mV), as will be seen in Section 5.3.3, so that any offset induces an error on the $V_{reg}$ level amplified by $V_{reg}/V_{ref}=7.7$. During the offset compensation process ($en\_oc=1$), the comparator output is undefined. Therefore the source comparator is not allowed to change the $en\_source$ state during this process.
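The two-threshold decision described above can be summarized by a small behavioural model. This is only a sketch under stated assumptions: the class and attribute names are hypothetical, the initial state (supplied from the supercapacitor) is an arbitrary choice, and the real circuit compares a fraction of $V_{reg}$ with $V_{ref}$ rather than $V_{reg}$ itself.

```python
class SourceController:
    """Behavioural sketch of the hysteresis decision on V_reg (illustrative)."""

    V_LOW = 0.95   # V, checked while V_reg is supplied from the harvester
    V_HIGH = 1.05  # V, checked while V_reg is supplied from the supercapacitor

    def __init__(self):
        # en_source_n models the overlined en_source signal of the text:
        # 0 = harvester / direct mode, 1 = supercapacitor / reverse mode.
        # Starting in reverse mode is an assumption of this sketch.
        self.en_source_n = 1

    def step(self, v_reg, en_oc=False):
        """Update the state from one V_reg sample.

        The state is held while en_oc is set, because the comparator
        output is undefined during offset compensation.
        """
        if en_oc:
            return self.en_source_n
        if self.en_source_n == 0 and v_reg < self.V_LOW:
            self.en_source_n = 1  # light went down: reverse mode, P1 as diode
        elif self.en_source_n == 1 and v_reg > self.V_HIGH:
            self.en_source_n = 0  # light is back: direct mode, P1 gate grounded
        return self.en_source_n


ctrl = SourceController()
trace = [ctrl.step(v) for v in (1.00, 1.06, 1.00, 0.96, 0.94, 1.00)]
print(trace)  # [1, 0, 0, 0, 1, 1]
```

The 100mV gap between the two thresholds is what prevents the controller from toggling between direct and reverse mode when $V_{reg}$ sits near 1V.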

### Gearbox controller

The gearbox controller adapts the SCN topology and thereby its voltage gain to the supercapacitor voltage and to the converter operating mode (direct or reverse). As explained in Section 1.4.1, linear losses are proportional to the voltage difference $V_{nl}$-$V_{out} = \Delta V$. The gearbox controller thus mitigates these losses by keeping $V_{nl}$ close to $V_{out}$. Possible voltage gains are shown in Fig. 5.7 (a).

Fig. 5.7. (a) Used voltage gains of the SCN topologies versus the supercapacitor voltage, in direct and reverse mode and (b) gearbox controller state chart.

Fig. 5.7 (b) shows how the gearbox controller chooses the voltage gain of the SC DC/DC converter. Its circuitry is shown in Fig. 5.5. A *dir* signal is inverted at each crystal clock cycle  $(32\mu s)$  as long as *SC\_status* has not reached its bound values (0 or 7). The gearbox controller selects the ratio of two voltage dividers at the inputs of the  $V_{SC}$  comparator so that:

- When dir=1, the comparator output is high if the  $V_{SC}$  voltage is higher than the upper limit of the  $V_{SC}$  range allowed by the current SCN voltage gain (e.g.  $V_{SC} > 0.9V$  in direct mode with a voltage gain of 1).
- When dir=0, the comparator output is low if the  $V_{SC}$  voltage is lower than the lower limit of the  $V_{SC}$  range allowed by the current SCN voltage gain (e.g.  $V_{SC} < 0.6V$  in direct mode with a voltage gain of 1).

Therefore, when dir=1 (resp. 0), if the comparator output is high (low), the gearbox controller selects an SCN topology with a higher (lower) voltage gain. When the highest (lowest) voltage gain is selected the dir signal remains at 0 (1) until the voltage gain decreases (increases).
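The alternating up/down test performed by the gearbox controller can be illustrated with the following Python sketch. It is a behavioral approximation only: the $V_{SC}$ windows are the direct-mode ranges of Table 5.2, the `index` variable is a hypothetical position in that list (it is not the SC\_status encoding), and the comparator and voltage dividers are replaced by ideal comparisons.

```python
# Direct-mode V_SC windows (V) per voltage gain, taken from Table 5.2.
DIRECT_WINDOWS = [
    ("1/3", 0.00, 0.30),
    ("1/2", 0.30, 0.45),
    ("2/3", 0.45, 0.60),
    ("1",   0.60, 0.90),
    ("3/2", 0.90, 1.35),
    ("2",   1.35, 1.80),
    ("3",   1.80, 3.00),
]


def gearbox_step(index: int, direction: int, v_sc: float) -> tuple[int, int]:
    """One crystal clock cycle of the up/down check.

    index     -- position of the current gain in DIRECT_WINDOWS (hypothetical)
    direction -- the dir signal: 1 tests the upper limit, 0 the lower limit
    Returns the new (index, direction).
    """
    top = len(DIRECT_WINDOWS) - 1
    _, low, high = DIRECT_WINDOWS[index]
    if direction == 1 and v_sc > high and index < top:
        index += 1                      # select a higher voltage gain
    elif direction == 0 and v_sc < low and index > 0:
        index -= 1                      # select a lower voltage gain

    if index == top:
        direction = 0                   # highest gain: only the lower limit is tested
    elif index == 0:
        direction = 1                   # lowest gain: only the upper limit is tested
    else:
        direction ^= 1                  # otherwise dir is inverted every cycle
    return index, direction
```

Calling `gearbox_step` repeatedly with a fixed $V_{SC}$ makes the index converge to the window that contains this voltage, one gain step at a time.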

### Frequency-mode controller

Power available from the solar cells fluctuates widely, from less than $50\mu W$ per cell in a warehouse up to 23mW in bright sunlight [12]. Therefore the PMU must be efficient when low power is available from the harvester, and it must be able to handle larger power in outdoor conditions to avoid an overvoltage on the node circuits.

Fig. 5.8. (a) SC DC/DC switching frequency, maximal power extracted from $V_{PV}$, and PMU quiescent current corresponding to the three frequency modes, (b) frequency mode controller state chart.

As shown in Fig. 5.8 (a), the frequency mode controller adjusts the DC/DC clock frequency, the switch size and the quiescent current of the PMU analog circuits to the amount of excess power delivered by the energy harvesters or to the load power consumption, as do the dual-mode converters presented in Chapter 3.

When low power is available from $V_{PV}$, a low clock frequency reduces the SC DC/DC $P_{BP}$ and $P_{driving}$. Furthermore, the comparators of the control circuits have more time between each switching of the SC DC/DC converter, so that their response time may be slower and their bias current reduced. A high clock frequency allows the converter to handle a larger input power, up to 14mW (with a voltage gain of 3 and $V_{SC}$=2.5V). An overvoltage protection circuit prevents $V_{reg}$ from rising if more power is available from the harvesters or if the supercapacitor is already fully charged at 3V.

Fig. 5.9. DCO controller state chart.

To do so, as shown in Fig. 5.8 (b), the frequency mode controller counts the pulses that are skipped by the $V_{reg}$ comparator. In direct (resp. reverse) mode, a pulse is skipped if $V_{reg}$ is lower (higher) than 1V. If nearly no pulse is skipped, the converter transfers the maximal amount of charges for the current frequency mode and conversely. Thus, when less (resp. more) than 5% (95%) of the pulses are skipped, the frequency mode signal is decreased (increased). There are 3 frequency modes with target frequencies $f_{TARGET}$ of 1MHz, 10MHz or 40MHz (for higher switching frequencies, the converter losses may threaten the system stability) for the DCO, which generates the SC DC/DC converter switching frequency.
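The pulse-skipping statistic used by the frequency mode controller can be sketched in Python as follows. This is an illustration under stated assumptions, not the implemented logic: the sketch works directly on the target frequency rather than on the encoded frequency mode signal (whose encoding is not modeled), and it assumes that a nearly saturated converter (almost no skipped pulse) calls for the next faster target frequency while a nearly idle one (almost every pulse skipped) calls for the next slower one. The function names and the observation window are hypothetical.

```python
F_TARGETS_HZ = [1e6, 10e6, 40e6]  # the three target frequencies


def pulse_skipped(v_reg: float, direct_mode: bool, v_target: float = 1.0) -> bool:
    """Direct mode skips a pulse when V_reg is below 1V, reverse mode when above."""
    return v_reg < v_target if direct_mode else v_reg > v_target


def next_target(current: int, skipped: int, total: int) -> int:
    """Index in F_TARGETS_HZ to use after a window of `total` pulses."""
    ratio = skipped / total
    if ratio < 0.05 and current < len(F_TARGETS_HZ) - 1:
        return current + 1      # assumption: converter saturated, go faster
    if ratio > 0.95 and current > 0:
        return current - 1      # assumption: converter mostly idle, go slower
    return current
```

Between the 5% and 95% bounds the target frequency is left unchanged, which avoids toggling between two modes when the load is moderate.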

### DCO controller

The DCO controller uses a frequency comparator to check that the DCO frequency is at the $f_{TARGET}$ provided by the frequency mode controller described in Section 5.3.2. Its purpose is to compensate for PVT variations and to track the frequency mode changes. As shown in Fig. 5.5, the DCO is a ring oscillator with a mix of NMOS and PMOS transistors between its supply voltage rail and $V_{reg}$. They starve the current in the DCO. A level shifter at the DCO output regenerates the pulse level to full $V_{reg}$ swing. Seven transistors starve the current for the 2 lower frequency modes ( $f_{TARGET}=1$MHz or 10MHz) while 3 transistors are enough to compensate for PVT variations in the higher frequency mode ( $f_{TARGET}=40$MHz). This is because, for the two lower frequency modes, the DCO supply voltage is close to the MOSFET threshold voltage to lower its frequency, making it more sensitive to PVT variations as explained in Section 4.2.

**Fig. 5.10.** (a) Circuit of the voltage reference, and (b) evolution of the voltage reference with the temperature and (c) with $V_{reg}$.

The DCO controller operates in a successive approximation register (SAR) fashion and its state chart is shown in Fig. 5.9. A counter counts the number of pulses generated by the DCO during half a crystal clock cycle (16$\mu$s) from the external 32kHz clock. Every 16$\mu$s the counter result is compared to predefined threshold values corresponding to $f_{TARGET} \pm 20\%$. If the count result is below (resp. above) the lower (upper) threshold value, the DCO speed is increased (decreased) by turning ON (OFF) the next starving transistor and thus by increasing (decreasing) the DCO supply voltage. The number of activated starving transistors also tracks the frequency mode changes.
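The counting scheme of the DCO controller can be illustrated with a short Python sketch. The function names are hypothetical and the DCO itself is not modeled: the sketch only converts $f_{TARGET} \pm 20\%$ into pulse counts over one 16$\mu$s window and updates the number of starving transistors that are ON.

```python
WINDOW_S = 16e-6  # counting window: half a 32kHz crystal clock cycle


def count_limits(f_target_hz: float, tolerance: float = 0.20) -> tuple[float, float]:
    """Pulse counts corresponding to f_TARGET -20% and +20% over one window."""
    nominal = f_target_hz * WINDOW_S
    return nominal * (1 - tolerance), nominal * (1 + tolerance)


def dco_step(count: int, f_target_hz: float, n_on: int, n_max: int) -> int:
    """Number of starving transistors turned ON after one counting window."""
    low, high = count_limits(f_target_hz)
    if count < low and n_on < n_max:
        return n_on + 1   # too slow: turn ON the next starving transistor
    if count > high and n_on > 0:
        return n_on - 1   # too fast: turn OFF one starving transistor
    return n_on
```

For the 1MHz mode the nominal count is 16 pulses per window, so the controller acts only when fewer than about 13 or more than about 19 pulses are counted; `n_max` would be 7 for the two lower frequency modes and 3 for the 40MHz mode.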

### 5.3.3 Voltage reference

The voltage reference used by the PMU is described in Fig. 5.10 (a). Compact voltage references that do not rely on a bandgap circuit, such as those presented in [82] or [83], use non-standard devices or devices with different channel doping levels. Non-standard devices require a specific process, while different channel doping levels lead to circuits prone to corner variations [82]. The selected 65nm CMOS process offers both general purpose devices (GP) with thin gate oxide and low-power devices (LP) with thick gate oxide. We therefore propose to exploit this versatility to build a voltage reference with only two standard devices that have the same channel doping level but different oxide thicknesses. This reduces the voltage reference sensitivity to corner variations as the device channel doping is performed in a single process step. The difference in oxide thickness makes the threshold voltage lower in M2 than in M1. Both M1 and M2 have their gate tied to ground. They are thus in the subthreshold (weak inversion) regime. As $V_T$ of M2 is lower, its $V_{GS}$ must be lower than that of M1 in order to have identical currents in M1 and M2. A voltage thus appears at node N1 to lower the $V_{GS}$ of M2. A second cascaded stage doubles this voltage to $V_{ref}$. A capacitor is added at the output of each stage to filter the noise on the supply voltage.

In order to estimate the level of N1, let us equate the subthreshold currents in the two devices. A similar approach then allows $V_{ref}$ to be deduced. The subthreshold current $I_{sub}$ of MOSFET devices is given by [84]:

$$I_{sub} = \mu_0 C_{ox} \frac{W}{L_{eff}} (n-1) U_{th}^2 \times 10^{\frac{V_{GS} - V_T}{S}} \times \left(1 - e^{\frac{-V_{DS}}{U_{th}}}\right), \qquad (5.1)$$

with $\mu_0$ the carrier mobility, $C_{ox}$ the gate oxide capacitance per unit area, W the channel width, $L_{eff}$ the effective channel length, n the body-effect factor, S the subthreshold swing, $U_{th}$ the thermal voltage, and $V_{DS}$ the drain to source voltage. $V_T$ can also be expressed as [85]:

$$V_T = V_{T0} - \gamma_{VT} \, V_{BS} - \eta_{VT} \, V_{DS}, \tag{5.2}$$

with $\gamma_{VT}$ the linearized body-effect coefficient, $V_{BS}$ the body to source voltage, and $\eta_{VT}$ the DIBL coefficient.

For the proposed circuit, the body terminal of each device is tied to its source terminal with deep N-well isolation so that  $V_{BS}=0$ . On top of this let us make two assumptions:

- the devices are assumed to be in the saturation regime of the weak inversion so that $1 - e^{\frac{-V_{DS}}{U_{th}}} \approx 1$ ,
- the DIBL effect is neglected because the devices are built with long channel length  $(>3 \times L_{min})$ , so that  $V_T \approx V_{T0}$ .

Eq. (5.1) can thus be rewritten as:

$$I_{sub} = I_0 \times 10^{\frac{V_{GS} - V_T}{S}}, \tag{5.3}$$

with $I_0 = \mu_0 C_{ox} \frac{W}{L_{eff}} (n-1) U_{th}^2$ . As $V_{GS}$ of M1 is zero and $V_{GS}$ of M2 is $-V_{N,1}$ , equating the currents in the two devices gives

$$I_{0,1} \times 10^{\frac{-V_{T,1}}{S_1}} = I_{0,2} \times 10^{\frac{-V_{N,1} - V_{T,2}}{S_2}},$$

and taking the base-10 logarithm leads to:

$$\log\left(\frac{I_{0,2}}{I_{0,1}}\right) = \frac{V_{N,1} + V_{T,2}}{S_2} - \frac{V_{T,1}}{S_1}. \tag{5.4}$$

Isolating $V_{N,1}$ gives:

$$V_{N,1} = \log\left(\frac{I_{0,2}}{I_{0,1}}\right) S_2 + V_{T,1}\frac{S_2}{S_1} - V_{T,2}. \tag{5.5}$$

Fig. 5.11. (a) Power on reset circuits, and (b) waveforms in the power on reset.

If a sizing can be found such that:

- $I_{0,1} = I_{0,2}$ ,
- $V_{T,1}$ and $V_{T,2}$ have the same temperature dependence,
- $S_1 = S_2$ ,

Eq. (5.5) simplifies to

$$V_{N,1} = V_{T,1} - V_{T,2}. \tag{5.6}$$

Post-layout simulations in Fig. 5.10 (b) and (c) show that the voltage reference is approximately 130mV. $V_{DS}$ of M1 is approximately 65mV. M1 is thus not completely saturated, and the saturation assumption above does not fully hold. Furthermore, gate leakage through M3 loads N1 at very low temperatures, below $-20^{\circ}C$. At higher temperatures, the voltage reference shows good stability versus temperature, with an average $46\mu V/^{\circ}C$ temperature coefficient between $20^{\circ}C$ and $100^{\circ}C$. The simulated current consumption at room temperature is 37nA. The voltage reference has a stable 32mV/V supply voltage dependence. Furthermore, it is able to start from an extremely low 200mV supply voltage, i.e. as soon as the devices enter the saturation regime, because it does not rely on a bandgap circuit but on the threshold voltage difference between two devices.
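The extent to which the saturation assumption is violated for M1 can be checked numerically. The Python sketch below evaluates the $\left(1 - e^{-V_{DS}/U_{th}}\right)$ term of Eq. (5.1) for $V_{DS}$=65mV, and the drift implied by the quoted temperature coefficient. A temperature of 300K is assumed for the thermal voltage; the function names are illustrative.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
Q_E = 1.602176634e-19  # elementary charge, C


def thermal_voltage(temp_k: float) -> float:
    """U_th = kT/q in volts."""
    return K_B * temp_k / Q_E


def saturation_factor(v_ds: float, temp_k: float = 300.0) -> float:
    """The (1 - exp(-V_DS/U_th)) term of Eq. (5.1)."""
    return 1.0 - math.exp(-v_ds / thermal_voltage(temp_k))


factor_m1 = saturation_factor(0.065)      # V_DS of M1 quoted in the text
drift_mv = 46e-6 * (100 - 20) * 1e3       # 46 uV/degC between 20 and 100 degC
print(f"saturation factor of M1: {factor_m1:.3f}")
print(f"V_ref drift over 80 degC: {drift_mv:.2f} mV")
```

With these assumptions the factor is roughly 0.92 instead of 1, which quantifies why the hypothesis does not fully hold, and the reference drifts by about 3.7mV over the 80°C range.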

### 5.3.4 Power on reset

The PMU must be able to cold start when deployed. It must therefore generate a reset signal to start its operation when power is available for the first time from the harvester. The power on reset (PoR) shown in Fig. 5.11 (a) generates the reset signal that starts the PMU controller. Fig. 5.11 (b) shows the associated waveforms. A voltage comparator compares $V_{ref}$ to $V_{DIN}$, which is a fraction of $V_{reg}$. $V_{DIN}$ is obtained by a highly resistive diode chain with low quiescent current.



**Fig. 5.12.** Offset compensated comparators to enable the SC DC/DC pulse skipped modulation and the selection of the energy source.

When solar power is available, $V_{reg}$ slowly rises as charges flow from the harvester through the input PMOS P1 of Fig. 5.4. When $V_{reg}$ gets higher than 780mV, the comparator sets the *out\_comp* signal. It is connected to the two inputs of an AND gate, but on one path, *out\_comp* is inverted and delayed by three clock pulses of the 32kHz crystal clock. Therefore, when *out\_comp* rises, the AND gate asserts the reset signal for three crystal clock periods before releasing it.
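The reset pulse generation can be reproduced with a cycle-based Python model. It is a minimal sketch that assumes one *out\_comp* sample per crystal clock cycle and a delayed path initialized at 0 (cold start); the function name is hypothetical.

```python
def power_on_reset(out_comp: list[int], delay: int = 3) -> list[int]:
    """reset[n] = out_comp[n] AND NOT out_comp[n - delay].

    out_comp holds one comparator sample per crystal clock cycle; samples
    before the start of the list are taken as 0 (assumption: cold start).
    """
    reset = []
    for n, sample in enumerate(out_comp):
        delayed = out_comp[n - delay] if n >= delay else 0
        reset.append(sample & (1 - delayed))
    return reset


# V_reg crosses 780mV at cycle 2: out_comp goes high and stays high.
trace = [0, 0, 1, 1, 1, 1, 1, 1]
print(power_on_reset(trace))  # prints [0, 0, 1, 1, 1, 0, 0, 0]
```

The reset is asserted for exactly three clock cycles after the rising edge of *out\_comp* and is then released, as described above.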

### 5.3.5 Voltage comparators

The $V_{reg}$ and source comparators of Fig. 5.4, respectively enabling the pulse skipping control of the SC DC/DC converter and selecting the PMU energy source, use the voltage reference $V_{ref}$ as input. A resistive voltage divider is used to compare $V_{RIN}$, a fraction of $V_{reg}$, to the 130mV $V_{ref}$. However, by doing so, any offset on the comparator translates to an error on the $V_{reg}$ level magnified by a factor $V_{reg}/V_{ref}=7.7$. To guarantee that $V_{reg}$ is regulated at $1V \pm 100$mV, and given the 0.95V and 1.05V threshold levels for the source comparator and the uncertainty on $V_{ref}$, the offset $V_{offset}$ must be in the few mV range.

To reach such a low offset value in 65nm CMOS, the offset compensated comparators shown in Fig. 5.12 were designed according to [86]. The offset compensation occurs when the $en\_oc$ signal is triggered. N1 is then opened and N4 is closed, connecting the comparator negative input to its positive input through $C_{off}$, a 1pF MiM offset storage capacitor. N3 also connects the comparator output to the positive input. By doing so, the comparator output stabilizes at $V_{RIN}$ and a voltage develops on $C_{off}$. It is the voltage that must be applied between the comparator terminals to reach the equilibrium at the comparator output, i.e. the comparator offset $V_{offset}$. When the $en\_oc$ signal is released, N1 closes while N3 and N4 open so that $V_{ref} - V_{offset}$ is applied to the comparator positive input, which cancels the comparator offset. The purpose of N2 is to counterbalance charge injection on $C_{off}$ from the N1 $V_{GS}$ capacitances when $en\_oc$ is released [86]. To this end, N2 has half the size of N1 so that charges injected by N1 are absorbed by the N2 $V_{GS}$ and $V_{DS}$ capacitances. Fig. 5.13 (a) shows the achieved offset reduction in the different process, voltage and temperature corners. Without compensation, the offset has a nearly 7.5mV standard deviation with an average value between 0.8mV and 6mV depending on the corner. $50\mu$s after the offset compensation, the offset standard deviation is reduced to 0.75mV (worst FF, 1.1V, $85^{\circ}C$ corner) and its average value is between -0.55mV and 0.90mV. Leakage currents modify the voltage on the storage capacitor so that the compensation degrades slowly with time as shown in Fig. 5.13 (b). These leakage currents are larger in the fast NMOS corner at high temperature. 0.5ms after the offset compensation, the offset standard deviation rises to 1.75mV (worst FF, 1.1V, $85^{\circ}C$ corner) and its average value is between -2.95mV and 1.03mV. The sampled offset on the storage capacitor is thus refreshed every 0.5ms. To this end, the controller counts the rising edges of the DC/DC clock. When the count result reaches a given value (510 in the 1MHz frequency mode, 4094 in the 10MHz and 40MHz frequency modes), en\_oc is triggered for 4 DC/DC clock cycles (in the 1MHz frequency mode) or for 16 clock cycles (in the 10MHz and 40MHz frequency modes), and the counter is reset.
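The impact of the comparator offset on $V_{reg}$ and the resulting refresh period can be estimated with a few lines of Python. This is an order-of-magnitude sketch: it assumes that the DCO runs exactly at its target frequency and it applies the $V_{reg}/V_{ref}$ magnification to the simulated standard deviations quoted above. The function names are illustrative.

```python
V_REG = 1.0    # V, regulated voltage
V_REF = 0.130  # V, reference voltage


def vreg_error_mv(offset_mv: float) -> float:
    """Error on V_reg caused by a comparator input offset, in mV."""
    return offset_mv * V_REG / V_REF


def refresh_period_us(count: int, f_clk_hz: float) -> float:
    """Time needed to count `count` rising edges of the DC/DC clock, in us."""
    return count / f_clk_hz * 1e6


for sigma in (7.5, 0.75, 1.75):   # offset standard deviations quoted above, in mV
    print(f"sigma {sigma} mV -> {vreg_error_mv(sigma):.1f} mV on V_reg")

for count, f_clk in ((510, 1e6), (4094, 10e6), (4094, 40e6)):
    period = refresh_period_us(count, f_clk)
    print(f"{count} edges at {f_clk / 1e6:.0f} MHz -> {period:.0f} us")
```

A 7.5mV standard deviation corresponds to about 58mV on $V_{reg}$, against about 6mV right after compensation and about 13mV after 0.5ms. Under the nominal-frequency assumption, the offset is refreshed every 510$\mu$s in the 1MHz mode and more often in the faster modes.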

Two mechanisms shown in Fig. 5.12 adapt the comparator bias current and its response time to the frequency mode. First, current mirrors with different sizes have been implemented. The bias current of the comparators is approximately 50nA, 220nA and $1.6\mu$A in the 1MHz, 10MHz and 40MHz frequency modes respectively. Second, three resistive dividers have been implemented to deliver $V_{RIN}$ to the comparator. The resistive divider must charge the parasitic capacitances of the comparator input terminal and of the routing metal layers. If its bias current is too low, $V_{RIN}$ cannot follow the $V_{reg}$ fluctuations as the SC DC/DC converter transfers charges from $V_{reg}$ to $V_{SC}$. There is thus a minimal bias current that must flow in the resistive divider, which depends on the SC DC/DC converter switching frequency and thus on the frequency mode. In the 1MHz frequency mode, the divider with the lowest bias current is enabled. In faster frequency modes, dividers with higher bias currents are enabled to track the fast $V_{reg}$ fluctuations when the SC DC/DC converter switching frequency increases.

### 5.3.6 SC DC/DC converter

This section describes the PMU SC DC/DC converter. The use of an SC DC/DC converter minimizes the number of external components required for the PMU operation. It regulates $V_{reg}$ by storing excess charges into the supercapacitor when $V_{reg}$ is supplied by the energy harvesters (SC DC/DC in direct mode), or by supplying $V_{reg}$ from the supercapacitor when there is no power available from the energy harvester (SC DC/DC in reverse mode).

Fig. 5.13. Simulated comparator offset (2 sigma intervals) before the offset compensation and (a) $50\mu$s, and (b) 0.5ms after it. Results from 100 Monte Carlo runs with the comparator in the 1MHz frequency mode.



Fig. 5.14. There are 7 SCN configurations leading to 7 voltage gain settings for the SCN that can be selected by the gearbox controller.

#### Switched-capacitor network

The capacitor network used for the DC/DC conversion is shown in Fig. 5.14. It has two 460pF MIM transfer capacitors mounted in series-parallel topologies. The capacitor network can be reconfigured by the gearbox controller to achieve voltage gains of 1/3, 1/2, 2/3, 1, 3/2, 2 or 3 in direct and reverse mode.

The use of several SCN topologies mitigates the converter linear losses, which are proportional to the voltage difference $\Delta V$ between the converter no-load voltage $V_{nl}$ and its output voltage. This increases the maximum achievable efficiency $\eta_{max}$ as explained in Section 1.4.1. However, for a given frequency mode, and thus switching frequency, the power that can be transferred by the SCN depends on the selected topology and on the $V_{SC}$ voltage. This is because, as explained in Section 1.3.1, the power delivered to the load depends on:

- the voltage difference $\Delta V = V_{nl} - V_{out}$ , and thus on the voltage gain of the current topology at no load,
- the converter output impedance, which depends on the capacitor network topology, clock frequency and switch conductance.

Table 5.2 recalls the $V_{SC}$ voltage range and provides the worst $\eta_{max}$ (for maximum $\Delta V$ ) and the worst simulated output power (for minimum $\Delta V$ ) associated with each gain setting in both direct and reverse modes.
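The worst-case efficiencies of Table 5.2 can be cross-checked with a short Python sketch. It assumes the usual ideal bound for SC converters, i.e. a maximum efficiency equal to the ratio of the output voltage to the no-load voltage, with $V_{reg}$=1V. The voltage gain is taken as the ratio between the supercapacitor side and the $V_{reg}$ side in both modes. The function names are hypothetical.

```python
from fractions import Fraction

V_REG = 1.0  # V, regulated voltage


def eta_direct(gain: Fraction, v_sc: float) -> float:
    """Direct mode (V_reg to V_SC): the no-load output is gain * V_reg."""
    return v_sc / (float(gain) * V_REG)


def eta_reverse(gain: Fraction, v_sc: float) -> float:
    """Reverse mode (V_SC to V_reg): the no-load output is V_SC / gain."""
    return V_REG / (v_sc / float(gain))


# (gain, lower bound of the direct-mode V_SC range), i.e. maximal delta V
direct_rows = [(Fraction(1, 2), 0.30), (Fraction(2, 3), 0.45), (Fraction(1), 0.60),
               (Fraction(3, 2), 0.90), (Fraction(2), 1.35), (Fraction(3), 1.80)]
# (gain, upper bound of the reverse-mode V_SC range), i.e. maximal delta V
reverse_rows = [(Fraction(1, 3), 0.55), (Fraction(1, 2), 0.74), (Fraction(2, 3), 1.10),
                (Fraction(1), 1.65), (Fraction(3, 2), 2.20), (Fraction(2), 3.00)]

for gain, v_sc in direct_rows:
    print(f"direct  gain {gain}: eta_max = {eta_direct(gain, v_sc):.1%}")
for gain, v_sc in reverse_rows:
    print(f"reverse gain {gain}: eta_max = {eta_reverse(gain, v_sc):.1%}")
```

Under this assumption the sketch gives 60% or 67.5% in direct mode and between about 61% and 68% in reverse mode, in line with the tabulated values.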

|              | Direct mode    |                        |                     | Reverse mode   |                        |                       |
|--------------|----------------|------------------------|---------------------|----------------|------------------------|-----------------------|
| Voltage gain | $V_{SC}$ range | $\eta_{max}^{\dagger}$ | $P_{SC}^{\ddagger}$ | $V_{SC}$ range | $\eta_{max}^{\dagger}$ | $P_{Vreg}^{\ddagger}$ |
| 1/3          | 0-0.3V         | 0%                     | $226 \mu W$         | 0.37-0.55V     | 61%                    | $43\mu W$             |
| 1/2          | 0.3-0.45V      | 60%                    | $0.88\mathrm{mW}$   | 0.55-0.74V     | 68%                    | $0.71 \mathrm{mW}$    |
| 2/3          | 0.45-0.6V      | 67.5%                  | $0.86\mathrm{mW}$   | 0.74-1.1V      | 61%                    | $0.82 \mathrm{mW}$    |
| 1            | 0.6-0.9V       | 60%                    | $0.87\mathrm{mW}$   | 1.1-1.65V      | 61%                    | $0.98 \mathrm{mW}$    |
| 3/2          | 0.9-1.35V      | 60%                    | $3.1\mathrm{mW}$    | 1.65-2.2V      | 68%                    | $2.17 \mathrm{mW}$    |
| 2            | 1.35-1.8V      | 67.5%                  | $1.02 \mathrm{mW}$  | 2.2-3V         | 67%                    | $3.8\mathrm{mW}$      |
| 3            | 1.8-3V         | 60%                    | $1.7\mathrm{mW}$    | -              | -                      | -                     |

Table 5.2. Theoretical performances of the available SCN topologies

<sup>†</sup> Worst conditions @maximal $\Delta V$ (@minimal $\Delta V$, $\eta_{max}$ is 90%).

<sup>‡</sup> Worst conditions @minimal $\Delta V$, with 10MHz frequency mode.

#### Supply voltage selector

Gate drivers of the SC DC/DC converter must drive the switches with a voltage swing equal to the maximal voltage seen by the converter, in order to maximize the $V_{GS}$ on the NMOS switches and to fully turn off the PMOS switches. As $V_{SC}$ can be lower or higher than $V_{reg}$, a voltage selector circuit shown in Fig 5.15 (a) is used to supply the gate drivers with the appropriate supply voltage $V_{MAX}$, which is the higher of $V_{reg}$ and $V_{SC}$.

**Fig. 5.15.** (a) Voltage selector between $V_{reg}$ and $V_{SC}$ to supply the gate drivers, and (b) waveforms in the voltage selector.

Fig 5.15 (b) shows the waveforms in the voltage selector circuit. If the SC DC/DC voltage gain is lower than 1, VDDSEL is set to 1 by the gearbox controller. The inverter chain driving P2 is supplied by $V_{SC}$ (which is lower than $V_{reg}$). Its output is a logical 0 as it has an odd number of stages. It thus connects $V_{MAX}$ to $V_{reg}$ through P2. The inverter chain driving P1 is supplied by $V_{reg}$. Its output is 1 because it has an even number of stages. In order to prevent charge flow from $V_{MAX}$ to the low $V_{SC}$, the logical level of this inverter chain must be at $V_{MAX}$, which is the case here as it is supplied by $V_{reg}$, which is higher than $V_{SC}$.

**Fig. 5.16.** (a) Gate drivers circuits with and without gate boosting assist, and (b) waveforms in the gate drivers.

When the SC DC/DC voltage gain is higher than or equal to 1, $V_{SC}$ is higher than $V_{reg}$. The gearbox controller sets VDDSEL to 0. The inverter chain driving P1 therefore has its output at a logical 0, thus connecting $V_{MAX}$ to $V_{SC}$ through P1. The other inverter chain has its output at a logical 1. As it is supplied by $V_{SC}$, the gate of P2 is at $V_{SC}$. This PMOS is thus OFF, preventing charge flow from $V_{MAX}$ to $V_{reg}$.

#### Gate drivers

As $V_{SC}$ may rise up to 3V, the switches are implemented with thick-oxide I/O MOSFETs. However, they have a higher threshold voltage than LP/GP core MOSFETs. As explained in Section 2.2.3, switches in SC DC/DC converters do not all have the same $V_{GS}$. In this specific case, when $V_{SC}$ is lower than 1V, some switches have a $V_{GS}$ close to their threshold voltage, which dramatically increases their size and thus the switch driving losses.

Fig. 5.17. Layout of the PMU and of the IoT node it supplies.

In order to alleviate this, two switch driver circuits shown in Fig. 5.16 (a) have been used. Switches with sufficient $V_{GS}$ are driven by conventional buffer chains. NMOS switches that have a low $V_{GS}$ when $V_{SC}$ is lower than 1V are driven by gate-boosting drivers. Waveforms in such gate-boosting drivers are shown in Fig. 5.16 (b). When they are turned ON, the switch gate is first driven to $V_{MAX}$ together with a boost capacitor $C_{boost}$ that is a smaller replica of the switch. Then the driver raises the $C_{boost}$ bottom plate up to $V_{MAX}$, thus raising its top plate to a voltage higher than $V_{MAX}$ and enabling charge redistribution between the switch gate and the boost capacitor. When $V_{SC}$ (and thus $V_{MAX}$) rises above 1V, all switches have a higher $V_{GS}$ so that there is no need for the gate-boosting assist. The bottom plate of $C_{boost}$ is thus left floating to avoid the extra driving losses caused by charging and discharging the boost capacitor.
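The boosted gate voltage resulting from the charge redistribution can be estimated with an ideal capacitor model. The Python sketch below ignores parasitic capacitances and the voltage dependence of the gate capacitance; the function name and the capacitance values are hypothetical, as the actual sizes are not given here.

```python
def boosted_gate_voltage(v_max: float, c_gate: float, c_boost: float) -> float:
    """Gate voltage after charge redistribution, ideal capacitors only.

    Both the switch gate and the top plate of C_boost start at V_MAX; the
    bottom plate of C_boost is then raised from 0 to V_MAX while the gate
    node floats.
    """
    return v_max * (1.0 + c_boost / (c_boost + c_gate))


# Hypothetical example: C_boost is half the switch gate capacitance.
print(boosted_gate_voltage(v_max=1.0, c_gate=2e-12, c_boost=1e-12))
```

With a boost capacitor half the size of the gate capacitance, the gate is raised to about 1.33 times $V_{MAX}$ in this ideal model; a larger $C_{boost}$ brings it closer to $2V_{MAX}$ at the cost of higher driving losses.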

Finally level shifters perform the logic level transition between the DCO power domain at  $V_{reg}$ , and the gate drivers power domain at  $V_{MAX}$ .

## 5.4 SYSTEM VALIDATION

The PMU was designed in a commercial 65nm CMOS technology and integrated within an SoC featuring a 32-bit microcontroller and a CMOS image sensor. The PMU area is $0.425\,mm^2$, and its layout is shown in Fig. 5.17. This section gives simulation results of the PMU. For this purpose, a model of the solar cells detailed in Appendix A has been used.

The efficiency of the SC DC/DC converter in direct and reverse modes for the three frequency modes is shown in Fig. 5.18 (a) and Fig. 5.19 (a), respectively. The maximal power that can be supplied by the converter in direct and reverse modes is shown in Fig. 5.18 (b) and Fig. 5.19 (b), respectively. The efficiency is the highest in the 1MHz frequency mode and decreases as the switching frequency increases in higher frequency modes. This is because the design was optimized to favor the power conversion efficiency when low power is available on the solar cell, considering that in bright sun conditions, enough power is available to guarantee the circuit operation. Furthermore, the active power switch width is adapted to the frequency mode, which reduces the driving losses. The simulation shows the efficiency gain obtained by using several SCN topologies with a gearbox controller. Starting from 0V in direct mode with the divide-by-three topology, the power conversion efficiency increases with $V_{SC}$. When $V_{SC}$ reaches 0.165V (i.e. half $V_{nl}$ of the divide-by-three topology), the maximal power that can be supplied by the converter starts to decrease because of the reduction of $\Delta V$, and thus of the amount of charges transferred each switching cycle by the DC/DC converter. When $V_{SC}$ reaches 0.3V (i.e. 90% of $V_{nl}$ of the divide-by-three topology), the gearbox controller switches to the divide-by-two topology. $\Delta V$ thus increases to 0.5V-0.3V=0.2V, which increases the maximal supplied power and makes the DC/DC efficiency drop because of the increased linear losses. Then $\Delta V$ reduces as $V_{SC}$ gets closer to 0.5V (i.e. the $V_{nl}$ of the divide-by-two topology), thus increasing the converter efficiency and reducing the maximal supplied power. When $V_{SC}$ reaches 0.45V, the gearbox controller changes to the topology with a voltage gain of 2/3. This scheme repeats as $V_{SC}$ increases to 3V. In reverse mode, starting from 3V with the multiply-by-two topology, the converter efficiency increases and the maximal supplied power decreases as $V_{SC}$ decreases to 2.2V. Then the gearbox controller switches to the topology with a voltage gain of 3/2 and this scheme repeats as $V_{SC}$ decreases. When $V_{SC}$ reaches approximately 0.4V, the SC DC/DC in the divide-by-three topology fails to supply $V_{reg}$ because not enough charges are transferred each switching cycle.

Fig. 5.18. (a) Simulated efficiency of the SC DC/DC converter in direct mode. Average efficiencies (assuming constant power supplied by the SC DC/DC) over the charging of the supercapacitor up to 2.7V are 70.7%, 64.9%, and 52.1% for the 1MHz, 10MHz, and 40MHz frequency modes respectively. (b) Maximal power supplied to the supercapacitor by the SC DC/DC converter.

Average efficiencies have been evaluated assuming a constant power supplied by the SC DC/DC converter. The efficiencies in direct mode are 70.7%, 64.9%, and 52.1% in the 1MHz, 10MHz, and 40MHz frequency modes respectively; in reverse mode they are 74.9%, 72.3%, and 66.5% in the 1MHz, 10MHz, and 40MHz frequency modes respectively. Therefore the system average end-to-end efficiency, i.e. the product of the direct and reverse mode efficiencies, peaks at 53% (70.7% × 74.9%) when supplying the load from the supercapacitor in the 1MHz frequency mode.
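The end-to-end figure is the product of the average direct and reverse efficiencies, since the energy first goes through the converter to be stored and then goes through it again to be delivered. The Python sketch below evaluates this product for the three frequency modes, assuming for simplicity that the same mode is used for storage and delivery; the function name is illustrative.

```python
ETA_DIRECT = {"1MHz": 0.707, "10MHz": 0.649, "40MHz": 0.521}
ETA_REVERSE = {"1MHz": 0.749, "10MHz": 0.723, "40MHz": 0.665}


def end_to_end(direct_mode: str, reverse_mode: str) -> float:
    """Harvester to supercapacitor to V_reg average efficiency."""
    return ETA_DIRECT[direct_mode] * ETA_REVERSE[reverse_mode]


for mode in ETA_DIRECT:
    print(f"{mode}: {end_to_end(mode, mode):.1%}")
```

This gives about 53% in the 1MHz mode, 47% in the 10MHz mode and 35% in the 40MHz mode.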

Fig. 5.20 shows a transient simulation of the $V_{reg}$ evolution at start-up and with power alternately available and unavailable on the solar cells. To keep the simulation duration reasonable, the supercapacitor size has been reduced from 0.35F to 50nF. When power becomes available for the first time, $V_{PV}$ slowly charges $V_{reg}$ through the input PMOS P1, which is diode connected. There is a 280mV drop across the diode. When $V_{reg}$ reaches 780mV, the reset is triggered. The source controller then detects that power is available on the solar cells (en\_source=1) and connects the gate of P1 to ground, short-circuiting $V_{PV}$ and $V_{reg}$. As $V_{reg}$ goes above 1V, the DC/DC converter starts to regulate it by sending excess charges from the solar cells to the supercapacitor. After 3.25ms, the light is switched off and power is no longer available on the solar cells. No more charges are therefore supplied to $V_{reg}$ by the solar cells, and $V_{reg}$ slowly decreases as the $V_{reg}$ load still consumes power. When $V_{reg}$ falls below 0.95V, the source controller detects that there is no more light and switches the DC/DC converter to reverse mode to supply $V_{reg}$ from the supercapacitor (en\_source=0). Furthermore, it connects P1 as a diode so that $V_{PV}$ is no longer short-circuited to $V_{reg}$. $V_{reg}$ rises again to 1V. After 4ms, the light is switched on and power is again available on the solar cells. $V_{PV}$ therefore rises above $V_{reg}$+280mV and supplies charges to $V_{reg}$ through the ON diode P1. When $V_{reg}$ rises above 1.05V, the source controller switches en\_source from 0 to 1, connecting the gate of P1 to ground and switching the DC/DC converter to direct mode.

Fig. 5.21 shows a transient simulation of the charging and discharge of the supercapacitor, and the corresponding SC\_status and LOW\_E signals, as well as the evolution of $V_{MAX}$ supplying the DC/DC gate drivers. To keep the simulation duration reasonable, the supercapacitor size has been reduced from 0.35F to 100nF.



Fig. 5.19. (a) Simulated efficiency of the SC DC/DC converter in reverse mode. Average efficiencies (assuming constant power supplied by the SC DC/DC) over discharge of the supercapacitor from 2.7V to 0.41V are 74.9%, 72.3%, and 66.5% for the 1MHz, 10MHz, and 40MHz frequency modes respectively. (b) Maximal power supplied to  $V_{reg}$  by the SC DC/DC converter.

$V_{reg}$ is first controlled at a voltage just above 1V to enable charge transfer by the DC/DC converter in direct mode. At start-up, $V_{MAX}$ is set to $V_{reg}$ because $V_{SC}$ is lower than 1V. SC\_status starts from its maximal value (7) to avoid risks of overvoltage on $V_{reg}$ if a reset occurs while the supercapacitor is charged at 3V. SC\_status then quickly drops to 0. $V_{SC}$ then increases, as the DC/DC converter transfers charges to the supercapacitor, and SC\_status tracks the supercapacitor voltage. When the gearbox selects the topology with a voltage gain of 3/2 ( $V_{SC} > 0.9$V), $V_{MAX}$ is connected to $V_{SC}$. After 10.5ms, $V_{reg}$ is lowered in order to switch the DC/DC converter to reverse mode. It thus supplies $V_{reg}$ from the supercapacitor, making $V_{SC}$ drop. When $V_{SC}$ falls below 0.74V, the LOW\_E flag rises to warn the IoT node microcontroller that the supercapacitor is running out of energy.



Fig. 5.20. Simulated transient behavior of the PMU: cold start and  $V_{reg}$  regulation with and without power available on the solar cells (room temperature and typical conditions with a 50nF supercapacitor).



Fig. 5.21. Simulated transient behavior of the PMU: charging and discharge of the supercapacitor and associated SC\_status, LOW\_E and $V_{MAX}$ signals (room temperature and typical conditions with a 100nF supercapacitor).

Fig. 5.22 (a) shows the stand-by power partitioning on $V_{SC}$ when no solar power is available, with a 0.35F supercapacitor [74], and when the PMU supplies the IoT node in sleep mode with a 1.7$\mu$W power consumption on $V_{reg}$. The power of each contributor supplied by $V_{reg}$ is divided by the average efficiency of the DC/DC converter in the 1MHz frequency mode (75%) to refer it to $V_{SC}$. The input PMOS has a large width (W=1mm) to increase its conductance while ON. Therefore, when it is diode connected with no light, it has a 1.1$\mu$A leakage. Most of the quiescent current of the comparators (source comparator, $V_{reg}$ comparator, and $V_{SC}$ comparator) is due to their input resistive dividers. According to its datasheet [74], the supercapacitor has a 1$\mu$A leakage. Fig. 5.22 (b) shows the supercapacitor discharge profile. Starting from 2.8V, it can supply $V_{reg}$ for 43.2 hours (with the load in sleep mode, and only the always-on peripherals powered) before being discharged to 0.4V and being unable to supply $V_{reg}$. Therefore the system is able to supply the always-on peripherals such as the sleep controller during the night. If the system runs out of energy, the power-on-reset allows the node to reboot itself as if it were just deployed. However, this reboot requires wireless configuration of the digital circuit by other nodes or a base station.

Fig. 5.22. (a) Partitioning of the contributors of the supercapacitor stand-by power when no power is available from the solar cells and when the IoT node is in sleep mode, (b) supercapacitor discharge profile: the PMU maintains $V_{reg}$ at 1V for 43.2 hours.
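As a sanity check of the discharge figures quoted above, the short sketch below computes the energy released by an ideal 0.35F capacitor between 2.8V and 0.4V and the average power this implies over 43.2 hours. It assumes an ideal capacitor ($E = \frac{1}{2}CV^2$); the function name is hypothetical and the resulting numbers are back-of-the-envelope estimates, not results reported in this work.

```python
def released_energy(c_farad: float, v_start: float, v_end: float) -> float:
    """Energy (J) released by an ideal capacitor discharged from v_start to v_end."""
    return 0.5 * c_farad * (v_start ** 2 - v_end ** 2)


# Values quoted in the text: 0.35 F, 2.8 V down to 0.4 V, 43.2 hours.
energy_j = released_energy(0.35, 2.8, 0.4)
discharge_time_s = 43.2 * 3600.0

# Average power drawn from the supercapacitor over the whole discharge,
# including its own leakage and every PMU contributor.
avg_power_w = energy_j / discharge_time_s

print(f"released energy: {energy_j:.3f} J")
print(f"average power:   {avg_power_w * 1e6:.2f} uW")
```

Under these assumptions the supercapacitor releases about 1.34J, which corresponds to an average draw of roughly 8.6$\mu$W, comparable to the stand-by power of the PMU.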

## 5.5 CONCLUSIONS

In this chapter, we proposed a power management system that supplies an energy-autonomous circuit from solar energy harvesters and that uses a supercapacitor as an energy storage element. Because of the unpredictable nature of environmental energy harvesters, an SC\_status signal provides information about the amount of energy available from the supercapacitor by monitoring its voltage so that the node can plan its tasks accordingly. Unlike existing PMUs, the proposed solution uses only one SC DC/DC converter to perform the three tasks of providing a regulated voltage, of storing energy into the supercapacitor and of supplying it back to the load when no power is available from the harvester. Thanks to the use of an SC DC/DC converter, few external components are required. High power conversion efficiency is reached thanks to an SCN that can be reconfigured in 7 different topologies, to switch drivers with a gate-boost assist for voltage down conversion, and to three frequency modes adapting the converter switching frequency and the PMU quiescent current to the load conditions and to the power available from the harvester. The PMU is able to cold start when the supercapacitor is empty thanks to a power-on-reset circuit and to a low-power low-area voltage reference. Table 5.3 compares the PMU with two other recent SCN-based PMUs. The proposed PMU can handle much larger input power while using on-chip MiM capacitors of similar size. Furthermore, thanks to the gearbox controller, it can supply the storage element across a wider voltage range, and the single-converter architecture doubles the end-to-end efficiency.

|                    | This work              | [7]                        | [17]                                   |
|--------------------|------------------------|----------------------------|----------------------------------------|
| Process            | 65nm                   | $0.18\mu$m                 | $0.18\mu$m                             |
| Area               | $0.425mm^2$            | $\approx 0.94mm^2$         | $0.95mm^2$                             |
| Transfer capacitor | 0.92nF MiM             | 1.46nF MiM                 | 1.03nF MiM                             |
| DC/DC power range  | $10\mu$W - 20mW        | 100nW - $2.5\mu$W          | 12.8nW - $19.4\mu$W                    |
| $V_{store}$ range  | 0-3V                   | 3.6V                       | 3-4.1V                                 |
| Standby power      | $7\mu$W                | $\approx 5.3$nW            | 3.63nW                                 |
| Converters used    | SC DC/DC               | SC DC/DC, linear regulator | $2\times$ SC DC/DC                     |
| Conversion ratio   | 1/3, 1/2, 2/3, 1, 2, 3 | 6                          | SCN 1: 2, 3; SCN 2: 5, 6               |
| End-to-end $\eta$  | 53%                    | NA                         | 26.9%                                  |
| Peak $\eta$        | 89%                    | 38% (SCN + LDO)            | $\approx 90\%$ (SCN 1); 63.8% (SCN 2)  |

Table 5.3. Comparison with PMUs using SC DC/DC converters

The PMU has been manufactured in a commercial 65nm CMOS technology. It occupies $0.425mm^2$. The PMU successfully supplies the node for harvester input power ranging from $10\mu$W up to 20mW. Once charged, the supercapacitor can supply the node for 43 hours. The SC DC/DC converter has an average simulated efficiency of 70.7% in direct mode and 74.9% in reverse mode with a 1MHz switching frequency, leading to a 53% end-to-end efficiency when supplying $V_{reg}$ from the supercapacitor.
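The 53% end-to-end figure is consistent with cascading the two average efficiencies quoted above, since energy stored in direct mode must later be retrieved in reverse mode. The sketch below only illustrates this arithmetic; the function name is hypothetical, and treating the end-to-end efficiency as a plain product of the two averages is an assumption.

```python
import math


def cascaded_efficiency(*stage_efficiencies: float) -> float:
    """Overall efficiency of conversion stages traversed one after the other."""
    return math.prod(stage_efficiencies)


eta_direct = 0.707   # harvester -> supercapacitor (average, 1MHz mode, from the text)
eta_reverse = 0.749  # supercapacitor -> V_reg (average, 1MHz mode, from the text)

eta_end_to_end = cascaded_efficiency(eta_direct, eta_reverse)
print(f"end-to-end efficiency: {eta_end_to_end * 100:.1f} %")
```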

# CONCLUSIONS AND PERSPECTIVES

The development of wireless sensor nodes and of the Internet-of-Things opens a window on a plethora of new application fields: taking healthcare monitoring out of the hospital, enabling human-less building automation for energy savings, monitoring structural health in dangerous areas, building smart cities with traffic-jam prevention, etc.

Researchers across the world are inspired by this vision, but many challenges must still be resolved before the rise of the Internet-of-Things. The development of micro energy harvesters able to supply power in the tens of $\mu$W range and of miniaturized storage elements, as well as the research focus on ultra-low-power circuits, makes it possible today to design energy-autonomous systems. The power consumption of such circuits is lowered thanks to duty-cycled operation and to the tuning of their supply voltage below 1V depending on the workload requirement.

This thesis contributes to the building of the Internet-of-Things by exploring the challenges associated with the design of power management units able to supply such energy-autonomous circuits. We therefore focused on answering two questions:

- How can we design efficient on-chip DC/DC converters to supply both ultra-low and multi-mode loads?
- How can we integrate these converters in power management units to address the requirements of energy-autonomous systems?

We summarize here the answers that this thesis brings to these questions.

## How can we design efficient on-chip DC/DC converters to supply both ultra-low and multi-mode loads?

In order to investigate this, we proposed an analysis of switched-capacitor DC/DC converters as a preliminary study. These converters use only capacitors as passive devices. The energy transfer mechanism is based on the reconfiguration of a capacitor network every half cycle of the switching period. They can easily be integrated on-chip with their load thanks to nanometer CMOS processes that have options for building dense capacitors. Furthermore, the power supplied by the converter scales with its switching frequency and with the capacitor size. They are therefore good candidates to supply low loads with high power-conversion efficiency.

To further validate the choice of SC DC/DC converters, we first reviewed their sources of losses. Gate-driving and bottom-plate losses scale with the power supplied to the load. The linear losses set an upper limit $\eta_{max}$ to the energy-conversion efficiency. However, smart selection and adaptation of the capacitor network topology keep $\eta_{max}$ close to 90%, so that linear losses do not prevent SC DC/DC converters from addressing the first question of this thesis. Second, we showed that SC DC/DC converters can have two operation regimes corresponding to slow switching and fast switching behaviors. Third, we analyzed the existing control mechanisms to perform line and load regulation. Most of these mechanisms modulate the output impedance of the converter to adapt it to the line and load conditions. This is performed by tuning either the switching frequency, the transfer capacitor sizes or the switch characteristics. Another control loop can be used to adapt the topology of the SCN to the ratio $V_{out}/V_{in}$, to mitigate the linear losses that threaten the power-conversion efficiency when $V_{out}$ or $V_{in}$ have large fluctuations. Finally, interleaving several converters reduces the output voltage ripple, which is critical when supplying digital loads at ultra-low voltage.
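The benefit of adapting the SCN topology to $V_{out}/V_{in}$ can be illustrated with a small sketch. It assumes the usual expression of the linear-loss limit, $\eta_{max} = V_{out}/(M \cdot V_{in})$, where $M$ is the ideal conversion ratio of the selected topology, and picks the smallest ratio that can still reach the target output voltage. The list of ratios is the one given for the PMU in Table 5.3; the function names and the selection rule itself are assumptions made for illustration, not the controller implemented in this work.

```python
from fractions import Fraction

# Ideal conversion ratios, taken from Table 5.3 for illustration.
RATIOS = [Fraction(1, 3), Fraction(1, 2), Fraction(2, 3),
          Fraction(1), Fraction(2), Fraction(3)]


def linear_efficiency_limit(v_in: float, v_out: float, ratio: Fraction) -> float:
    """Upper bound on efficiency set by linear losses for one topology."""
    return v_out / (float(ratio) * v_in)


def select_topology(v_in: float, v_out: float, ratios=RATIOS):
    """Return (ratio, eta_max) for the smallest ratio able to reach v_out,
    or None when no topology can produce the requested voltage."""
    feasible = [r for r in ratios if float(r) * v_in >= v_out]
    if not feasible:
        return None
    best = min(feasible)
    return best, linear_efficiency_limit(v_in, v_out, best)


ratio, eta_max = select_topology(v_in=1.2, v_out=0.5)
print(f"selected ratio {ratio}, eta_max = {eta_max:.3f}")
```

With a fixed divide-by-one topology the same operating point would be limited to $0.5/1.2 \approx 42\%$, which shows why a gearbox is needed when $V_{out}$ or $V_{in}$ varies widely.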

We then shaped an answer to the first part of the question, "how to design efficient on-chip DC/DC converters for ultra-low loads", by developing a systematic and practical methodology for sizing SC DC/DC converters. The sizing methodology provides the optimum converter switching frequency, switch sizes and transfer capacitance values to maximize conversion efficiency under the die area constraint. The preliminary study showed that the power-conversion efficiency depends on the switch size and on the converter parasitic capacitances. The proposed sizing methodology further pointed out that the switch overdrive voltage strongly impacts the optimum switch sizes as it affects the switch conductance. Maximal efficiency is thus achieved with a different size for each switch. We also showed the impact of the ratio between bottom-plate and effective transfer capacitances on the converter efficiency and on its optimal operation regime: the higher the parasitic capacitances, the deeper the converter should operate in the slow switching limit regime. The converter efficiency drops quickly with the parasitic capacitance because of the bottom-plate losses. Processes with options allowing for dense capacitors or for low parasitic capacitance are thus mandatory to achieve efficient power conversion. In addition to easing the design flow of SC DC/DC converters, the proposed sizing methodology can be used to quickly explore the design trade-off between die area and conversion efficiency.
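The notion of operation regime can be made concrete with a first-order sketch of the converter output impedance. It relies on an approximation that is common in the SC converter literature, $Z_{out} \approx \sqrt{Z_{SSL}^2 + Z_{FSL}^2}$ with $Z_{SSL} = k/(C f_{sw})$, and is not necessarily the exact model used by the sizing methodology. The coefficient `k_ssl`, the numerical values and the function names are hypothetical and depend on the topology.

```python
import math


def output_impedance(c_fly: float, f_sw: float, z_fsl: float, k_ssl: float = 0.25):
    """Return (z_out, z_ssl) in ohms for a given flying capacitance and frequency.

    ASSUMPTIONS: k_ssl is a topology-dependent coefficient (0.25 is only an
    example value), and z_fsl lumps the switch on-resistances together.
    """
    z_ssl = k_ssl / (c_fly * f_sw)
    return math.hypot(z_ssl, z_fsl), z_ssl


def crossover_frequency(c_fly: float, z_fsl: float, k_ssl: float = 0.25) -> float:
    """Switching frequency at which z_ssl equals z_fsl."""
    return k_ssl / (c_fly * z_fsl)


# Example values (hypothetical): 1nF of transfer capacitance, 20 ohm FSL impedance.
z_out, z_ssl = output_impedance(c_fly=1e-9, f_sw=1e6, z_fsl=20.0)
f_cross = crossover_frequency(c_fly=1e-9, z_fsl=20.0)
print(f"Z_SSL = {z_ssl:.0f} ohm, Z_out = {z_out:.1f} ohm, crossover at {f_cross / 1e6:.1f} MHz")
```

Well below the crossover frequency the impedance is set by the capacitors (slow switching limit); well above it, increasing the frequency only adds switching losses because the switches dominate (fast switching limit).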

The second part of the question, "How to efficiently supply both ultra-low and multi-mode loads?", requires the design of multi-mode switched-capacitor DC/DC converters to supply dynamic loads. Such a DC/DC converter must be able to supply a wide range of load power spanning over 3 orders of magnitude, to adapt the supplied voltage to the circuit workload, and to enable power gating of the load. We showed that this can be achieved by SC DC/DC converters that reconfigure their SCN topology according to the load voltage to keep linear losses low, that use a switching frequency scaled to the supplied power, and that body bias the switches and upsize their channel lengths to prevent leakage when power gating the load. This was illustrated with two practical design examples of multi-mode DC/DC converters supplying ULV microcontrollers. The first one, a dual-mode converter, was designed and measured in a commercial $0.13\mu$m CMOS technology. It transfers charge from a 1-1.2V DC input to a 0.3-0.4V DC output. It can operate either in medium-power (MP) mode to supply up to $125\mu$W to a microcontroller in active mode, or in ultra-low-power (ULP) mode to supply 5nW to 450nW to a retentive SRAM or to always-on peripherals when the microcontroller is in sleep mode. In MP mode, the efficiency peaks at 74% for 100$\mu$W loads, while in ULP mode it peaks at 74% for 400nW loads and remains above 60% for loads as low as 100nW. The second one, a multi-mode converter, was designed and simulated in STMicroelectronics 28nm FD-SOI technology. It can dynamically switch the supply voltage of a ULV microcontroller between 0.45V (MP mode) and 0.3V (LP mode) depending on the workload, and act as a power-gating device. The achieved efficiency is high in both modes (88% in MP mode with a 1.3mW load and 82% in LP mode with a 40$\mu$W load) and the load leakage in power-gating mode is reduced to 40nW.

Let us now see how to use such SC DC/DC converters to answer the second question raised by this thesis.

## How can we integrate these converters in power management units to address the requirements of energy-autonomous systems?

We elaborated the answer to this question in two steps. First, energy-autonomous systems are possible only by reducing the microcontroller power consumption to the power level that is available from energy harvesters. This load power reduction requires the PMU to use DC/DC converters that allow multi-mode operation and that can act as a power-gating device. Going one step further, we proposed a PMU that not only allows but also contributes to the reduction of the digital circuit power consumption. To this end, we observed that nanometer CMOS technology nodes suffer from local and global variations leading to large $V_{DD}$ guard bands for safe operation at ultra-low voltage, which increases the dynamic power consumption of digital circuits. Adaptive voltage scaling (AVS) systems can be integrated on chip to reduce this guard band. We developed such an AVS system with an SC DC/DC converter combined with a critical path replica of the monitored circuit. The critical path replica both senses the instantaneous maximal clock frequency of the circuit and serves as its clock generator. The voltage supplied by the SC DC/DC converter is adapted accordingly. We first investigated the feasibility of such an AVS system for two applications: a 45nm 1GHz low-power mobile processor and a 65nm 10MHz ultra-low-power WSN microcontroller. We showed that new technology nodes with higher capacitance density allow for building such converters using less than 33% of the total circuit area. Low-power mobile processors need a larger guard band to compensate for local variations because they have more critical paths with a shorter logic depth (less averaging), and they are designed in a smaller CMOS technology node (e.g., 45nm). For ultra-low-power WSN microcontrollers, local variations are proportionally less important. The ripple of the converter output voltage induces jitter, but it is mitigated by the AVS feedback loop. The corresponding contribution to the $V_{DD}$ guard band can be neglected when compared to the savings on the guard band induced by local variations. Simulation results show that the use of an AVS system reduces the $V_{DD}$ guard band of a digital circuit by 55% for low-power mobile processors and by 82% for ultra-low-power WSN microcontrollers. It reduces the load dynamic power consumption by 17% and 37% respectively.

An AVS system supplying the SleepWalker microcontroller [5] has been manufactured in a commercial 65nm CMOS technology to validate the concept of an AVS system using an SC DC/DC converter. Thanks to the PVT compensation, it is possible to increase the microcontroller clock frequency from 10MHz to 25MHz with a 0.4V supply voltage or to lower the supply voltage in typical conditions from 475mV to 400mV for safe 25MHz operation. The SC DC/DC converter inside the AVS system reaches up to 81% efficiency and is able to successfully supply the microcontroller under all PVT corners.
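To give an order of magnitude of what the 475mV to 400mV reduction represents, the sketch below applies the classical first-order model of dynamic power, $P_{dyn} \propto C V_{DD}^2 f$, at constant frequency. This estimate is an illustration only: it ignores leakage, short-circuit power and the converter efficiency, the function name is hypothetical, and the resulting number is not a result reported in this work.

```python
def dynamic_power_ratio(v_new: float, v_old: float,
                        f_new: float = 1.0, f_old: float = 1.0) -> float:
    """Ratio of dynamic power after/before a supply and frequency change,
    assuming P_dyn is proportional to C * V^2 * f with constant capacitance."""
    return (v_new / v_old) ** 2 * (f_new / f_old)


# Supply lowered from 475 mV to 400 mV at the same 25 MHz clock (from the text).
ratio = dynamic_power_ratio(v_new=0.400, v_old=0.475)
saving = 1.0 - ratio
print(f"estimated dynamic power saving: {saving * 100:.1f} %")
```

Under this simple model, removing the 75mV guard band saves roughly 29% of the dynamic power at 25MHz.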

In a second step, we proposed a power management system that supplies an energy-autonomous circuit from solar energy harvesters and that uses a supercapacitor as an energy storage element. Because of the unpredictable nature of environmental energy harvesters, an SC\_status signal provides information about the amount of energy available in the supercapacitor by monitoring its voltage so that the node can plan its tasks accordingly. Unlike existing PMUs, the proposed solution uses only one SC DC/DC converter to perform the three tasks of providing a regulated voltage, of storing energy into the supercapacitor and of supplying it back to the load when no energy is available from the harvester. Thanks to the use of an SC DC/DC converter, few external components are required. High energy conversion efficiency is reached thanks to an SCN that can be reconfigured in 7 different topologies, to switch drivers with a gate-boost assist for voltage down conversion, and to three frequency modes adapting the converter switching frequency and the PMU quiescent current to the load conditions and to the energy available from the harvester. The PMU is able to cold start when the supercapacitor is empty thanks to a power-on-reset circuit and to a low-power low-area voltage reference.

The PMU has been manufactured in a commercial 65nm CMOS technology. It occupies $0.425mm^2$. The PMU successfully supplies the node for harvester input power ranging from $10\mu$W up to 20mW. Once charged, the supercapacitor can supply the node for 43 hours. The SC DC/DC converter has an average simulated efficiency of 70.7% in direct mode and 74.9% in reverse mode with a 1MHz switching frequency, leading to a 53% end-to-end efficiency when supplying $V_{reg}$ from the supercapacitor.

#### **Contribution summary**

The energy conversion efficiency and the power density of the SC DC/DC converters designed during this thesis are summarized in Fig. O.1. If we consider neither the data points corresponding to the converters in the lowest power mode (square labeled ULP mode, triangle labeled LP mode and dot labeled 1MHz), nor the data point labeled 40MHz corresponding to a converter in deep FSL regime, both the power density and the efficiency of the converters increase with technology scaling. The power density increase is explained by the improved capacitor density between technology nodes ($2.5\times$ from the $0.13\mu$m to the 65nm technology node and $3\times$ from the 65nm to the 28nm technology node). The efficiency improvement is explained by the reduction of the ratio between parasitic capacitance and effective transfer capacitance. Indeed, the parasitic capacitance per $\mu m^2$ remains approximately the same, but the transfer capacitance per $\mu m^2$ increases. These technology benefits widen the application field of switched-capacitor converters to circuits requiring higher computing power. Such circuits generally require several power domains for analog, RF or digital circuits [18]. Digital circuits also have different optimum supply voltages to minimize their power, so that memories are not on the same power domain as the processor or the I/O [5, 25], and when several cores are parallelized, it is beneficial to separate their power domains to optimize the overall chip performance [24]. However, converters using external passive components, such as inductive converters, require package pins, whose count is limited and which increase the system cost, while linear regulators have intrinsic efficiency limitations [18]. Therefore new designs of SC DC/DC converters achieving high power density up to $0.86 \text{W/mm}^2$ have been proposed [24, 25]. The challenges that must be resolved by such SC DC/DC converters supplying large power are different from those met when targeting an ultra-low-power application. Achieving such a high power density requires both the high capacitor density offered by technology scaling and a high switching frequency. The latter also requires a high switch conductance in order to minimize the fast switching limit impedance. Furthermore, the controller power overhead can be neglected while the driving losses become a concern because of the high switching frequency. The sizing methodology developed in Chapter 2 gives a good starting point for such designs.

Fig. O.1 Performance summary of the SC DC/DC converter realizations. Data for the 65nm PMU (Chapter 5) are for a multiply-by-two gain in direct mode.





Fig. O.2 Comparison of the developed converters with state-of-the-art SC DC/DC converters grouped into three load categories: emerging energy-autonomous systems, low-power microcontrollers and high-performance processors for mobile applications.

Beyond these technology considerations, the design itself was improved from the dual-mode $0.13\mu$m converter to the 28nm multi-mode converter described in Chapter 3 thanks to the elaboration of the sizing methodology detailed in Chapter 2, and to the use of a more compact SCN topology. Topology improvement also increases the power density between two implementations of the AVS system (AVS1 described in Chapter 4, AVS2 [50]), but the efficiency drops because of a lower $V_{OUT}$ (370mV instead of 400mV for a divide-by-two topology), leading to an increase of the linear losses. Finally, the SC DC/DC converter used for the PMU described in Chapter 5 was optimized to reach the highest efficiency when little environmental energy is available or when supplying the load from the storage element, i.e. in the 1MHz frequency mode. In higher frequency modes, the power density increases thanks to the lowering of $Z_{FSL}$, but the efficiency decreases because the converter does not operate at the optimum design point given by the sizing methodology.

The converters developed during this work are compared to state-of-the-art converters in Fig. O.2. State-of-the-art converters supply loads that are grouped into the three categories defined in the introduction: (i) emerging energy-autonomous systems [7, 15, 16, 17], (ii) low-power microcontrollers supplied at voltages below 1V [18, 19, 20, 21, 22], and (iii) high-performance processors for mobile applications [23, 24, 25, 26]. The $0.13\mu$m dual-mode converter described in Chapter 3 as well as the converter used by the AVS system proposed in Chapter 4 aim at supplying low-power microcontrollers. The $0.13\mu$m dual-mode converter can supply a record load power range while the converter of the AVS system achieves an efficiency similar to the best efficiencies reported in the state of the art. The 28nm multi-mode converter described in Chapter 3 was designed to supply a high-performance microcontroller with dynamic voltage scaling. It achieves higher efficiency than state-of-the-art SC DC/DC converters while being able to supply a wide load range. Finally, the converter of the PMU described in Chapter 5 can successfully supply an energy-autonomous system with an unpredictable input power and with a voltage on the storage device varying between 0 and 3V. It is able to supply power orders of magnitude higher than state-of-the-art converters designed with such constraints.

#### Was it the right choice?

In the first chapter we bet on SC DC/DC converters. As we get close to the last lines of this thesis, it is worth wondering whether it was the right choice. There is a trend to integrate the converter on-chip in order to reduce the package pin count, the system form factor and the assembly cost. We showed that SC DC/DC converters can meet this constraint thanks to technology scaling, as explained above. However, inductive converters with on-chip inductors also exist.

Inductive converters do not suffer from the main limitations of SC DC/DC converters, which are the output voltage range limited by the SCN topology and the maximal achievable efficiency $\eta_{max}$ [32]. Integrated inductive converters therefore seem attractive because they are more flexible with respect to the values of the output and input voltages and can reach a theoretical 100% efficiency. However, integrated inductors require a large silicon area and have a poor quality factor [29]. The large area induces parasitic capacitance to the substrate [25], which leads to power losses as the voltage at the inductor terminals is switched. The poor quality factor induces conduction losses in the inductor equivalent series resistance. These losses are proportional to the square of the rms current in the inductor [25, 31]. This current can be lowered by increasing the switching frequency, but at high switching frequency the inductor equivalent series resistance (and thus the conduction losses) increases because of the skin effect [31]. With additional fabrication steps it is possible to improve the inductor quality factor with thick metals for the inductor wire or by using integrated magnetic materials [30].

It is difficult to give a firm answer to the question of which converter family is the best. An attempt in [22] compares the linear losses from SC DC/DC converters with the conduction losses from integrated inductive converters. It shows that SC DC/DC converters win for high voltage up conversion, but the comparison is valid only when the design is switch limited. In [31], the efficiency and maximal supplied power of state-of-the-art SC DC/DC converters and of integrated inductive converters are compared. It shows that designs from both converter families reach identical efficiencies. However, there is a trend for SC DC/DC converters to supply lower loads than inductive converters.

My opinion is that the answer is cost related. Indeed, integrated inductors are expensive because they consume a large silicon area [29, 30] and require specific process steps to improve their quality factor [30]. Therefore inductive converters cannot compete with capacitive converters when supplying an ultra-low load. Indeed, the cost of the capacitive converter is related to the capacitor size and thus to the power supplied. As the load power increases, the area and cost of capacitive converters increase. At some point inductive converters become cheaper because they can reach higher power densities [31]. Because the achievable power density of SC DC/DC converters is technology dependent, as explained in the previous section, so is the point where inductive converters become cheaper than capacitive converters. The same applies when choosing between fully integrated converters and converters using discrete passive devices: fully integrated converters have a lower cost as long as the passive devices are small enough, because discrete devices require package pins to connect them. However, when the area occupied by the on-chip passive devices increases beyond a given threshold, it becomes cost effective to use discrete devices.

## Perspectives

This work has contributed to the elaboration of power management units for energy-autonomous systems. However, there is still a long way to go before enabling energy-autonomous objects that communicate with each other. This last section presents new challenges revealed by this thesis that future research has to address.

- The proposed sizing methodology showed that the lower limit of the converter output impedance is highly dependent on the switch conductance through $Z_{FSL}$. New fully-depleted SOI technologies offer a virtual second gate (the back gate) that can be exploited to modulate the switch conductance because the BOX insulator layer prevents any leakage to the substrate. It offers two opportunities for SC DC/DC converter designs. First, for loads that are sensitive to the noise frequency on their supply voltage, the converter could be controlled by switch conductance modulation through this second gate while switching at a fixed frequency. This would allow a continuous tuning of the converter output impedance, as opposed to control based on transfer capacitor modulation at a fixed switching frequency [18]. Second, as a large range of voltages can be applied on this virtual second gate, the switches can be deeply reverse body biased when the converter acts as a power-gating device. This also raises the question of generating the body biasing voltage and of the associated trade-off between area overhead and time required to modulate this voltage.
- Even though the SC DC/DC converters used by the proposed PMUs are integrated on-chip, they still require a capacitor to filter the output voltage. At the same time, an efficient way to reduce the node power consumption is to dynamically adapt the supply voltage of digital circuits to their workload. To be energy efficient, this dynamic tuning requires fast transients between mode changes and thus a small filtering capacitor on the converter output. Some works proved that deep interleaving of about ten converters allows for removing the output filtering capacitor [24], [25]. Building a PMU fully on-chip thanks to deep interleaving dramatically reduces the size of the effective output capacitor, as it reduces to the equivalent output capacitance of the other converters. Such a PMU would thus be able to raise the load voltage much faster when the circuit workload increases. Lowering the load voltage must use a charge recycling scheme in order to be energy efficient. It can be achieved through recombination of the load decoupling capacitors, such as proposed in [87] for wake-to-sleep transitions.

- Finally, inductive converters that can handle multiple inputs or outputs through the sharing of a single inductor exist [6]. Having similar SC DC/DC converters would be interesting to build inductorless PMUs able to work simultaneously with several energy harvesters. Two strategies are possible. The first one is to successively supply the load from one energy harvester before moving to the next one. The second strategy would be to collect charge from all the energy harvesters before supplying the load. However, these two strategies raise hard challenges: the SCN topology must be adapted to the voltage level of each harvester, and the controller must prevent reverse current flows from the transfer capacitors to the energy harvesters when their open circuit voltage is low because of a lack of environmental power.

- 1. J. C R Licklider, "Man-Computer Symbiosis", in *IRE Transactions on Human Factors in Electronics*, vol. 1 (1), pp. 4-11, 1960.
- I.F. Akyildiz, W. Su,Y. Sankarasubramaniam, E. Cayirci, "Wireless sensor networks: a survey", in *ELSEVIER Computer Networks*, vol. 38 (4), pp. 393-422, 2002.
- L. Atzori, A. Iera, G. Morabito, "The internet of things: A survey", in *ELSEVIER Computer Networks*, vol. 54 (15), pp. 2787-2805, 2010.
- 4. M. Belleville, et al, "Energy autonomous sensor systems: Towards a ubiquitous sensor technology", in *Microelectronics Journal*, vol. 41 (11), 2010, pp. 740-745.
- D. Bol et al, "SleepWalker: A 25-MHz 0.4-V Sub-mm<sup>2</sup> 7-μW/MHz Microcontroller in 65-nm LP/GP CMOS for Low-Carbon Wireless Sensor Nodes", in *IEEE J. Solid-State Circuits*, vol. 48 (1), pp. 20-32, 2013.
- S. Bandyopadhyay, A.P. Chandrakasan, "Platform Architecture for Solar, Thermal, and Vibration Energy Combining With MPPT and Single Inductor", in *IEEE J. Solid-State Circuits*, vol.47 (9), pp. 2199-2215, 2012.
- M. Fojtik *et al.*, "A Millimeter-Scale Energy-Autonomous Sensor System With Stacked Battery and Solar Cells", in *IEEE J. Solid-State Circuits*, vol.48 (3), pp. 801-813, 2013.
- G. Chen, S. Hanson, D. Blaauw, D. Sylvester, "Circuit Design Advances for Wireless Sensing Applications", in *Proceedings of the IEEE*, vol.98, no.11, pp. 1808-1827, 2010.
- D. Bol, J. De Vos, F. Botman, G. de Streel, S. Bernard, D. Flandre, and J.-D. Legat, "Green SoCs for a Sustainable Internet-of-Things", in *Proc. Workshop Faible Tension Faible Consommation (FTFC)*, 4 p., 2013.
- E. E. Aktakka, R. L. Peterson, K. Najafi, "A Self-Supplied Inertial Piezoelectric Energy Harvester with Power-Management IC", in *IEEE International Solid-State Circuits Conference*, pp. 120-121, 2011.
- I. Doms, P. Merken, R Mertens, C. Van Hoof, "Integrated Capacitive Power-Management Circuit for Thermal Harvesters with Output Power 10 to 1000µW", in *IEEE International Solid-State Circuits Conference*, pp. 300-301, 2009.
- Ixys, "KXOB22-12X1 IXOLAR<sup>TM</sup> High Efficiency SolarBIT Datasheet", Rev. MAR 2011, 2011, Accessed on August 30, 2013, http://ixapps.ixys.com/DataSheet/20110302-KXOB22-12X1-DATA-SHEET.pdf.

- 13. M. Alioto, "Ultra-Low Power VLSI Circuit Design Demystified and Explained: A Tutorial", in *IEEE Transactions on Circuits and Systems I*, vol. 59 (1), pp. 3-29, 2012.
- 14. S. Sridhara *et al.*, "Microwatt embedded processor platform for medical system-on-chip applications", in *IEEE J. Solid-State Circuits*, vol. 46 (4), pp. 721-730, 2011.
- 15. M. Wieckowski *et al.*, "A Hybrid DC-DC Converter for Sub-Microwatt Sub 1-V Implantable Applications", in *Proc. IEEE Symp. Very Large Scale Integr. (VLSI) Circuits*, pp. 166-167, 2009.
- 16. M.H. Ghaed *et al.*, "Circuits for a Cubic-Millimeter Energy-Autonomous Wireless Intraocular Pressure Monitor", in *IEEE Transactions on Circuits and Systems I*, 11 p., accepted paper.
- 17. S. Bang, Y. Lee, I. Lee, Y. Kim, D. Blaauw, D. Sylvester, "A Fully Integrated Switched-Capacitor Based PMU with Adaptive Energy Harvesting Technique for Ultra-Low Power Sensing Applications", in *Proc. Int. Circuits and Systems Conf.*, 2013, pp. 709-712.
- 18. Y.K. Ramadass, A. A. Fayed and A.P. Chandrakasan, "A Fully-Integrated Switched-Capacitor Step-Down DC-DC Converter With Digital Capacitance Modulation in 45 nm CMOS", in *IEEE J. Solid-State Circuits*, vol. 45 (12), pp. 2557-2565, Dec. 2010.
- 19. Y.K. Ramadass, A.P. Chandrakasan, "Voltage Scalable Switched Capacitor DC-DC Converter for Ultra-Low-Power On-Chip Applications", in *Proc. Power Electronics Specialists Conf.*, 2007, pp. 2353-2359.
- 20. T. Van Breussegem and M. Steyaert, "A fully integrated gearbox capacitive DC/DC-converter in 90nm CMOS: Optimization, control and measurements", in *IEEE Control and Modeling for Power Electronics*, 5 p., 2010.
- 21. J. Kwong, Y.K. Ramadass, N. Verma, A.P. Chandrakasan, "A 65 nm Sub-V<sub>t</sub> Microcontroller With Integrated SRAM and Switched Capacitor DC-DC Converter", in *IEEE J. Solid-State Circuits*, vol. 44 (1), pp. 115-126, 2009.
- 22. M.D. Seeman, S.R. Sanders, J.M. Rabaey, "An Ultra-Low-Power Power Management IC for Wireless Sensor Nodes", in *Proc. IEEE Power Electronics Specialists Conference*, pp. 925-931, 2008.
- 23. D. Maksimović, S. Dhar, "Switched-capacitor DC-DC converters for low-power on-chip applications", in *Proc. IEEE Power Electronics Specialists Conference*, pp. 54-59, 1999.
- 24. H.-P. Le, S.R. Sanders, E. Alon, "Design Techniques for Fully Integrated Switched-Capacitor DC-DC Converters", in *IEEE J. Solid-State Circuits*, vol. 46 (9), pp. 2120-2131, 2011.
- 25. T. Van Breussegem and M. Steyaert, "Monolithic Capacitive DC-DC Converter With Single Boundary-Multiphase Control and Voltage Domain Stacking in 90 nm CMOS", in *IEEE J. Solid-State Circuits*, vol. 46 (7), pp. 1715-1727, 2011.
- 26. L. Su, D. Ma, and A.P. Brokaw, "A monolithic step-down SC power converter with frequency-programmable subthreshold z-domain DPWM control for ultralow power microsystems", in *IEEE European Solid-State Circuits Conference*, pp. 58-61, 2008.

- 27. Y. Okuma *et al.*, "0.5-V input digital LDO with 98.7% current efficiency and 2.7-µA quiescent current in 65nm CMOS", in *IEEE Custom Integrated Circuits Conference*, 4 p., 2010.
- 28. E. Alon and M. Horowitz, "Integrated Regulation for Energy-Efficient Digital Circuits", in *IEEE J. Solid-State Circuits*, vol. 43 (8), pp. 1795-1807, 2008.
- 29. M. Wens, M. Steyaert, "A fully-integrated 130nm CMOS DC-DC step-down converter, regulated by a constant on/off-time control system", in *Proc. European Solid-State Circuits Conference ESSCIRC*, pp. 62-65, 2008.
- 30. J. Wibben, R. Harjani, "A High-Efficiency DC-DC Converter Using 2 nH Integrated Inductors", in *IEEE J. Solid-State Circuits*, vol. 43 (4), pp. 844-854, 2008.
- 31. G. V. Piqué and H. J. Bergveld, "State-of-the-art of integrated switching power converters", in *Analog Circuit Design*, Springer Netherlands, pp. 259-281, 2012.
- 32. T. Van Breussegem, M. Wens, and M. Steyaert, "Control of Fully Integrated DC-DC Converters in CMOS", in *Analog Circuit Design*, Springer Netherlands, pp. 259-281, 2012.
- 33. S. V. Cheong, H. Chung, A. Ioinovici, "Inductorless DC-to-DC converter with high power density", in *IEEE Transactions on Industrial Electronics*, vol. 41 (2), pp. 208-215, Apr 1994.
- 34. M. Makowski and D. Maksimović, "Performance limits of switched-capacitor dc-dc converters", in *Proc. Power Electronics Specialists Conf.*, 1995, pp. 1215-1221.
- 35. J.-T. Wu and K.-L. Chang, "MOS charge pumps for low-voltage operation", in *IEEE J. Solid-State Circuits*, vol. 33 (4), pp. 592-597, 1998.
- 36. M. D. Seeman and S. R. Sanders, "Analysis and optimization of switched-capacitor dc-dc power converters", in *IEEE Trans. on Power Electronics*, vol. 23 (2), pp. 841-851, 2008.
- 37. I. Oota, N. Hara, F. Ueno, "A general method for deriving output resistances of serial fixed type switched-capacitor power supplies", in *Proc. ISCAS*, vol. 3, pp. 503-506, 2000.
- 38. B. Arntzen and D. Maksimović, "Switched-capacitor DC/DC converters with resonant gate drive", in *IEEE Trans. on Power Electronics*, vol. 13 (5), pp. 892-902, 1998.
- 39. G. V. Piqué *et al.*, "Energy optimization of tapered buffers for CMOS on-chip switching power converters", in *IEEE Int. Symposium on Circuits and Systems*, vol. 5, pp. 4453-4456, 2005.
- 40. J.S. Choi and K. Lee, "Design of CMOS Tapered Buffer for Minimum Power-Delay Product", in *IEEE J. Solid-State Circuits*, vol. 29 (9), pp. 1142-1145, 1994.
- 41. D. Maksimović and S. Dhar, "Switched-capacitor DC-DC converters for low-power on-chip applications", in *IEEE Power Electronics Specialists Conference*, vol. 1, pp. 54-59, 1999.
- 42. J. Zeng, S. Kotikalapoodi, and L. Burgyan, "Digital loop for regulating DC/DC converter with segmented switching", U.S. Patent 6,995,995, 2006.
- 43. G. Patounakis, Y.W. Li, and K.L. Shepard, "A fully integrated on-chip DC-DC conversion and power management system", in *IEEE J. Solid-State Circuits*, vol. 39 (3), pp. 443-451, 2004.

- 44. R. Perigny, U.-K. Moon, and G. Temes, "Area efficient CMOS charge pump circuits", in *IEEE International Symposium on Circuits and Systems*, vol. 1, pp. 492-495, 2001.
- 45. G. Schrom *et al.*, "Feasibility of monolithic and 3D-stacked DC-DC converters for microprocessors in 90nm technology generation", in *Proc. IEEE Low Power Electronics and Design ISLPED*, pp. 263-268, 2004.
- 46. V. Kursun, S.G. Narendra, V.K. De, E.G. Friedman, "High input voltage step-down DC-DC converters for integration in a low voltage CMOS process", in *Proc. IEEE Quality Electronic Design*, pp. 517-521, 2004.
- 47. A. Heringa, J. Sonsky, "Novel power transistor design for a process independent high voltage option in standard CMOS", in *IEEE Int. Symp. on Power Semiconductor Devices and IC's*, 4 p., 2006.
- 48. J. De Vos, D. Flandre, D. Bol, "A DC/DC Converter for Dual-Mode Ultra-Low-Voltage Microcontrollers", in *IEEE Subthreshold Microelectronics Conf.*, 3 p., 2012.
- 49. D. Bol *et al.*, "The Detrimental Impact of Negative Celsius Temperature on Ultra-Low-Voltage CMOS Logic", in *Proc. European Solid-State Circuits Conference ESSCIRC*, pp. 522-525, 2010.
- 50. F. Botman, J. De Vos, S. Bernard, J.D. Legat, D. Bol, "Bellevue: a 50MHz Variable-Width SIMD 32bit Microcontroller at 0.37V for Processing-Intensive Wireless Sensor Nodes", in *IEEE International Symposium on Circuits and Systems*, 4p., 2014, submitted paper.
- 51. G. Chen *et al*, "Millimeter-scale nearly perpetual sensor system with stacked battery and solar cells", in *IEEE International Solid-State Circuits Conference*, pp. 288-289, 2010.
- 52. M. Kuorilehto, M. Hännikäinen, and T. D. Hämäläinen, "A survey of application distribution in wireless sensor networks", in *Springer Open J. on Advances in Signal Processing EURASIP*, vol. 2005 (5), pp. 774-788, 2005.
- 53. S. Dhar, D. Maksimović, and B. Kranen, "Closed-loop adaptive voltage scaling controller for standard-cell ASICs", in *Proc. Low Power Electronics and Design ISLPED*, pp. 103-107, 2002.
- 54. A. Asenov *et al.*, "Simulation of intrinsic parameter fluctuations in decananometer and nanometer-scale MOSFETs", in *IEEE Trans. on Electron Devices*, vol. 50 (9), pp. 1837-1852, 2003.
- 55. T. D. Burd *et al.*, "A dynamic voltage scaled microprocessor system", in *IEEE J. Solid-State Circuits*, vol. 35 (11), pp. 1571-1580, 2000.
- 56. G.-Y. Wei and M. Horowitz, "A fully digital, energy-efficient, adaptive power-supply regulator", in *IEEE J. Solid-State Circuits*, vol. 34 (4), pp. 520-528, Apr 1999.
- 57. J. Kim and M. A. Horowitz, "An efficient digital sliding controller for adaptive power-supply regulation", in *IEEE J. Solid-State Circuits*, vol. 37 (5), pp. 639-647, 2002.
- 58. F. Luo, D. Ma, "Integrated adaptive step-up/down switching DC-DC converter with tri-band tri-mode digital control for dynamic voltage scaling", in *IEEE Int. Symposium on Industrial Electronics ISIE*, pp. 142-147, 2008.

- 59. D. Ernst et al., "Razor: circuit-level correction of timing errors for low-power operation", in *IEEE Micro*, vol.24 (6), pp. 10-20, 2004.
- 60. S. Dhar and D. Maksimović, "Switching regulator with dynamically adjustable supply voltage for low power VLSI", in *IEEE Proc. of the Industrial Electronics Society conf. IECON*, vol. 3, pp. 1874-1879, 2001.
- 61. A. Chandrakasan *et al.*, "Technologies for Ultradynamic Voltage Scaling", in *Proceedings of the IEEE*, vol. 98 (2), pp. 191-214, 2010.
- 62. G. Gerosa et al., "A Sub-2 W Low Power IA Processor for Mobile Internet Devices in 45 nm High-k Metal Gate CMOS", in *IEEE J. Solid-State Circuits*, vol.44 (1), pp. 73-82, 2009.
- 63. E. Le Roux et al., "A 1V RF SoC with an 863-to-928MHz 400kb/s radio and a 32b Dual-MAC DSP core for Wireless Sensor and Body Networks", in *IEEE International Solid-State Circuits Conference*, pp. 464-465, 2010.
- 64. X. Jiang, J. Polastre, and D. Culler, "Perpetual environmentally powered sensor networks", in *IEEE Proc. of the Symposium on Information Processing in Sensor Networks*, pp. 463-468, 2005.
- 65. L. Wei, Z. Chen, M. Johnson, K. Roy, Y. Ye, and V. De, "Design and optimization of dual-threshold circuits for low-voltage low-power applications", in *IEEE Trans. on Very Large Scale Integration Systems (VLSI)*, vol. 7 (1), pp. 16-24, 1999.
- 66. S. Mutoh et al., "1-V power supply high-speed digital circuit technology with multithreshold-voltage CMOS," in *IEEE J. Solid-State Circuits*, vol.30 (8), pp. 847-854, 1995.
- 67. P. Manet *et al.*, "Low power techniques applied to a 80C51 microcontroller for high temperature applications", in *ASP Journal of Low Power Electronics*, vol. 2, pp. 19-29, 2006.
- 68. Cortex-M0+ specifications, Accessed on November 06, 2013, http://www.arm.com/products/processors/cortex-m/cortex-m0plus.php.
- 69. Cortex-A9 specifications, Accessed on November 06, 2013, http://www.arm.com/products/processors/cortex-a/cortex-a9.php.
- 70. D. Bol, R. Ambroise, D. Flandre, and J.-D. Legat, "Interests and Limitations of Technology Scaling for Subthreshold Logic", in *IEEE Trans. on Very Large Scale Integration Systems (VLSI)*, vol.17 (10), pp. 1508-1519, 2009.
- 71. M. Nakai *et al.*, "Dynamic voltage and frequency management for a low power embedded microprocessor", in *IEEE J. Solid-State Circuits*, vol. 40 (1), pp. 28-35, 2005.
- 72. S. Hanson *et al.*, "Exploring variability and performance in a sub-200-mV processor", in *IEEE J. Solid-State Circuits*, vol. 43 (4), pp. 115-126, 2009.
- 73. D. Bol, C. Hocquet, D. Flandre and J.-D. Legat, "Robustness-aware sleep transistor engineering for power-gated nanometer subthreshold circuits", in *Proc. Int. Circuits and Systems Conf.*, 2010, pp. 1484-1487.
- 74. Tecate Group, "GW201F Cap-XX Supercapacitors Datasheet", 2013, Accessed on September 20, 2013, http://www.tecategroup.com/capacitors/datasheets/capxx/CAP-XX%20Data%20Sheets.pdf.

- 75. Texas Instruments, "bq25504 Ultra Low Power Boost Converter with Battery Management for Energy Harvester Applications Datasheet", Rev A, 2012, Accessed on July 29, 2013, http://www.ti.com/lit/ds/symlink/bq25504.pdf.
- 76. Maxim Integrated Circuits, "MAX17710 Energy-Harvesting Charger and Protector Datasheet", Rev 2, 2012, Accessed on July 29, 2013, http://datasheets.maximic.com/en/ds/MAX17710.pdf.
- 77. Linear Technology, "LTC 3108 Ultralow Voltage Step-Up Converter and Power Manager Datasheet", Rev. A, 2012, Accessed on July 29, 2013, http://cds.linear.com/docs/en/datasheet/31081fa.pdf.
- 78. Linear Technology, "LTC 3105 400mA Step-Up DC/DC Converter with Maximum Power Point Control and 250mV Start-Up Datasheet", Rev. A, 2011, Accessed on September 9, 2013, http://cds.linear.com/docs/en/datasheet/3105fa.pdf.
- 79. N.J. Guilar, R. Amirtharajah, P.J. Hurst, S.H. Lewis, "An energy-aware multiple-input power supply with charge recovery for energy harvesting applications", in *IEEE International Solid-State Circuits Conference*, pp. 298-299, 2009.
- 80. H. Lhermet *et al.*, "Efficient Power Management Circuit: From Thermal Energy Harvesting to Above-IC Microbattery Energy Storage", in *IEEE J. Solid-State Circuits*, vol. 43 (1), pp. 246-255, 2008.
- 81. Ixys, "KXOB22-04X3 IXOLAR<sup>TM</sup> High Efficiency SolarBIT Datasheet", Rev. AUG 2011, 2011, Accessed on August 30, 2013, http://ixapps.ixys.com/DataSheet/KXOB22-04X3-DATA-SHEET-20110808.pdf
- 82. M. Seok, G. Kim, D. Blaauw, D. Sylvester, "Variability analysis of a digitally trimmable ultra-low power voltage reference", in *Proc. European Solid-State Circuits Conference ESSCIRC*, pp. 110-113, 2010.
- 83. S. Adriaensen, V. Dessard, D. Flandre, "25 to 300°C ultra-low-power voltage reference compatible with standard SOI CMOS process", in *Electronics Letters*, vol. 38 (19), pp. 1103-1104, 2002.
- 84. K. Roy, S. Mukhopadhyay and H. Mahmoodi-Meimand, "Leakage current mechanisms and leakage reduction techniques in deep-submicrometer CMOS circuits", in *Proc. IEEE*, vol. 91, no 2, pp. 305-327, 2003.
- 85. D. Bol, "Pushing Ultra-Low-Power Digital Circuits into the Nanometer Era", UCL, PhD thesis, 2008.
- 86. B. Razavi, B.A. Wooley, "Design techniques for high-speed, high-resolution comparators", in *IEEE J. Solid-State Circuits*, vol. 27 (12), pp. 1916-1926, 1992.
- 87. M. Alioto, A. Consoli, J. M. Rabaey, "'ECHO' Reconfigurable Power Management Unit for Energy Reduction in Sleep-Active Transitions", in *IEEE J. Solid-State Circuits*, vol. 48 (8), pp. 1921-1932, 2013.

# APPENDIX A KXOB22-12X1 MACRO MODEL

In Chapter 5, we proposed the design of a power management unit for an energy-autonomous system that uses a solar cell as its energy source. A SPICE macro model of the KXOB22-12X1 solar cell was therefore created for simulation purposes. The netlist of the model shown in Fig. A.1 is:

    .SUBCKT SOL GND VPV CONTROL
    .param IPH=-50e-3
    .model dio d is=4e-10 n=1.3 rs=0 area=1
    gIph1 v1 gnd control gnd IPH
    d1 v1 gnd dio
    rs v1 VPV 1.5
    rsh v1 gnd 10000
    .ends

The value of the photocurrent source is set by the voltage applied to the CONTROL terminal, which models the light irradiance. The resistor and diode parameters were chosen to match the solar cell characteristics provided in the datasheet [12].
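As a reading aid, the netlist can be written as the usual single-diode equations. This is a sketch derived only from the element values in the subcircuit. The symbols $V_1$ (voltage of the internal node `v1`), $G_{ph}$ (magnitude of `IPH`), $I_{PV}$ (current delivered at the VPV terminal) and $V_T$ (thermal voltage) are notations introduced here, not taken from the thesis.

$$
I_{PV} = G_{ph}\,V_{CONTROL} - I_S\left(e^{V_1/(n V_T)} - 1\right) - \frac{V_1}{R_{sh}}, \qquad V_{PV} = V_1 - R_s\,I_{PV}
$$

with $G_{ph} = 50$ mA/V, $I_S = 0.4$ nA, $n = 1.3$, $R_s = 1.5\ \Omega$ and $R_{sh} = 10$ k$\Omega$. The negative sign of `IPH` in the netlist only reflects the SPICE convention that the current of a controlled source flows from its first node to its second node through the source, so that a positive CONTROL voltage injects current into `v1`.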

Fig. A.2 shows the comparison between the solar cell characteristics available in the datasheet and the model.
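The model curves can also be reproduced outside a SPICE simulator. The following Python sketch evaluates the equations implied by the netlist, using the element values of the subcircuit. The function names, the choice of 25°C for the thermal voltage, and the neglect of the temperature dependence of the diode saturation current are assumptions made for illustration; they are not part of the original model.

```python
import math

# Element values copied from the SOL subcircuit.
G_PH = 50e-3    # photocurrent per volt on CONTROL (A/V), magnitude of IPH
I_S = 4e-10     # diode saturation current (A)
N_IDEAL = 1.3   # diode emission coefficient
R_S = 1.5       # series resistance rs (ohm)
R_SH = 10e3     # shunt resistance rsh (ohm)

# Assumption: thermal voltage kT/q at 25 degC; the temperature
# dependence of I_S is ignored in this sketch.
V_T = 1.380649e-23 * 298.15 / 1.602176634e-19


def cell_current(v1, control=1.0):
    """Current delivered to the VPV terminal for an internal node voltage v1."""
    i_diode = I_S * (math.exp(v1 / (N_IDEAL * V_T)) - 1.0)
    return G_PH * control - i_diode - v1 / R_SH


def iv_point(v1, control=1.0):
    """Return (V_PV, I_PV); the drop across rs separates v1 from VPV."""
    i_pv = cell_current(v1, control)
    return v1 - R_S * i_pv, i_pv


def open_circuit_voltage(control=1.0):
    """Bisection on cell_current(v1) = 0; with no current, V_PV equals v1."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if cell_current(mid, control) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print("Voc = %.3f V" % open_circuit_voltage())
    # Sweep the internal node voltage to trace the I-V curve parametrically.
    for k in range(12):
        v_pv, i_pv = iv_point(0.05 * (k + 1))
        print("V_PV = %.3f V, I_PV = %.2f mA" % (v_pv, 1e3 * i_pv))
```

Sweeping the internal node voltage rather than the terminal voltage avoids solving the implicit equation introduced by the series resistance: each value of `v1` directly gives one point of the I-V curve.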



Fig. A.1. Model of the KXOB22-12X1 solar cell used for the SPICE simulations.



Fig. A.2. (a) Comparison between the open-circuit voltage from the datasheet and from the model at 25°C. (b) Comparison between the I-V curves from the datasheet and from the model at 25°C and a light irradiance of 1000 W/m<sup>2</sup>.