A Thesis for the Degree of Ph.D. in Engineering

### Design of High Efficiency Monolithic Switched-Capacitor DC-DC Converters for IoT Applications

January 2024

Graduate School of Science and Technology Keio University

Tan Yi

### Committee Members:

Hiroki Ishikuro, Cheng Huang, Nobuhiko Nakano, Makoto Takamiya, Kentaro Yoshioka

### Abstract

Switched capacitor power converters are widely used in Power Management Integrated Circuits (PMICs) in modern devices because on-chip capacitors enable a high level of integration. Recently, with the development of the Internet-of-Things (IoT), limited power sources have brought a series of new challenges to switched capacitor power converters.

On the one hand, using the energy harvester as a supplementary power source can extend the maintenance intervals for sensor nodes. However, the lower power harvested from the environment results in a low closed-loop voltage of the generator, raising design challenges in achieving high power conversion efficiency for energy-harvesting PMICs. On the other hand, because of the long standby times and high power consumption in the active state, designers aim for high power conversion efficiency of PMICs under different loading conditions and fast transient response to ensure sensor system availability while minimizing delays, which presents challenges for controller design.

Therefore, this research focuses on the low-power optimization of switched capacitor power converters (SCPCs). Using the thermoelectric energy harvester as an example, we introduce our efforts to optimize transistor performance under ultra-low power and low voltage conditions, enhancing the peak efficiency achievable by existing topologies. Additionally, we demonstrate dual lower-bound hysteretic control, which provides efficient operation over a wide output power range. The proposed techniques are expected to improve the overall performance of generic SCPCs in IoT applications.

## Contents

- Abstract
- 1 Introduction
- 2 Challenges in Always-On IoTs System
- 3 Efficiency Optimization in Low Voltage Low Power Applications
  - 3.1 Trade-offs in SCPC Design
    - 3.1.1 Conduction Losses
    - 3.1.2 Impact of Slow and Fast Switching Limit in Conduction Loss
    - 3.1.3 Switching Losses
    - 3.1.4 Trade-offs in Low Voltage Low Power Applications
  - 3.2 Gate Voltage Optimization for Target $R_{on}$
    - 3.2.1 Gate Voltage Optimization in Strong Inversion Region
    - 3.2.2 Gate Voltage Optimization in Weak Inversion Region
    - 3.2.3 Numerical Calculation
    - 3.2.4 Comparison with Dynamic Body Biasing
    - 3.2.5 Summary of Methodology
  - 3.3 Implementation and Verification
    - 3.3.1 Verification Circuits
    - 3.3.2 Controller
    - 3.3.3 Startup Process
    - 3.3.4 Measurement Results
    - 3.3.5 Comparison of Performance
  - 3.4 Discussions
- 4 Dual Lower-Bound-Hysteresis Control
  - 4.1 Concept Circuit
    - 4.1.1 Model of Transient Behavior in SCPC
    - 4.1.2 Behavior Analysis of the Proposed DLBHC Control
    - 4.1.3 DLBHC and Frequency Control Design
    - 4.1.4 Frequency Controller Design
  - 4.2 Model for Delay Time Design
    - 4.2.1 Simplified Model for Timing Design
    - 4.2.2 Light-load DLBHC Timing Analysis
    - 4.2.3 Heavy-load DLBHC Timing Analysis
    - 4.2.4 Impact of Transistor Resistance Mismatch
    - 4.2.5 Simulation Based Analysis Verification
    - 4.2.6 Discussion on $T_D$ Selection for DLBHC Design
  - 4.3 Implementation with Distributed Multi-Phase DLBHC Controller
    - 4.3.1 Distributed Multi-Phase DLBHC Design
    - 4.3.2 Simulated Transient Behavior of Proposed Design
    - 4.3.3 Measurement Result
  - 4.4 Implementation with Centralized Multi-Phase DLBHC Controller
    - 4.4.1 Circuit Design
    - 4.4.2 Delay Compensations
    - 4.4.3 Centralized DLBHC with Delay Compensation
    - 4.4.4 Dual-Mode Operations
    - 4.4.5 Frequency Control Design
    - 4.4.6 Measurement Results
  - 4.5 Comparison of the Performance
  - 4.6 Discussions
- 5 Conclusion
- Acknowledgements
- Bibliography

# **List of Figures**

| Figure | Caption |
|--------|---------|
| 1.1 | Statistic and prediction of the number of connected devices from IoT Analytics [1] and Statista [2] |
| 1.2 | Typical PMIC system in IoT devices |
| 1.3 | Example of Step Down PMIC: SIPC, SCPC and Hybrid Converters |
| 1.4 | Typical SCPC converters comparison |
| 2.1 | Challenges in Always-on IoT devices addressed by this thesis: energy harvesting (Chapter 3), standby efficiency and wake-up speed (Chapter 4) |
| 2.2 | Review of PMIC performance: efficiency vs input voltage |
| 2.3 | Design Trade-offs in the controller design |
| 3.1 | Charge pump design trade-offs |
| 3.2 | Conceptual graph of optimum gate voltage |
| 3.3 | Circuit setup for evaluating the transistor scaling under different $V_{gs}$ |
| 3.4 | Extraction of the $V_{th}$ from simulation results |
| 3.5 | Verification of the theory when $R_{on} = 3000\,\Omega$ |
| 3.6 | Verification of the theory when $R_{on} = 200\,\Omega$ |
| 3.7 | Optimum gate voltage at different $R_{on}$ and $C_c$ |
| 3.8 | Power losses at optimum gate voltage with different $R_{on}$ and $C_c$ |
| 3.9 | (a) Circuit setup of typical charge pumps with DBB. (b) Setup of gate voltage optimization for simulations. (c) Setup of DBB for simulations |
| 3.10 | Comparison between dynamic body biasing and gate voltage optimization |
| 3.11 | The general consideration of determining the optimum gate voltage |
| 3.12 | Charge pump structure |
| 3.13 | (a) Charge pump structure. (b) $V_{gs}$ in different stages |
| 3.14 | Minimal $P_{s,sw}$ of each stage at different $V_{gate,on}$ |
| 3.15 | Controlled oscillator |
| 3.16 | Feedback loop for $V_{fb1}$ and $V_{fb2}$ generation |
| 3.17 | (a) Startup circuit (b) Startup process |
| 3.18 | Chip micrograph |
| 3.19 | Load regulation of proposed 3-stage charge pump |
| 3.20 | Clock frequency of proposed 3-stage charge pump |
| 3.21 | Efficiency of proposed 3-stage charge pump |
| 3.22 | Load regulation of proposed 5-stage charge pump |
| 3.23 | Clock frequency of proposed 5-stage charge pump |
| 3.24 | Efficiency of proposed 5-stage charge pump |
| 3.25 | Startup Measurement |
| 4.1 | SC converters based on (a) LBHC, (b) frequency control, (c) LBHC SC converter with pulse skipping based frequency control [3], and (d) proposed design |
| 4.2 | Model of the SCPC with Frequency Control |
| 4.3 | Open-loop charge pump in the mixed domain |
| 4.4 | Closed-loop charge pump with synchronized controller |
| 4.5 | (a) Illustration of the proposed DLBHC control (b) flow chart |
| 4.6 | Comparison between open loop DLBHC and LBHC, assuming a heavy load |
| 4.7 | Operation of DLBHC in open-loop: (left) heavy load, (middle) optimum Load, (right) light load |
| 4.8 | Operation of LBHC in open-loop: (left) heavy load, (middle) optimum Load, (right) light load |
| 4.9 | Comparison of different conceptual implementations w/ frequency control (a) proposed DLBHC, (b) LBHC, frequency control is not practical, (c) oversample based LBHC [3] (d) frequency control w/ error amplifier |
| 4.10 | Operation of Oversampled LBHC in open-loop: (left) heavy load, (middle) optimum Load, (right) light load |
| 4.11 | Frequency controller design |
| 4.12 | Modeling for the analysis of an N-phase 2:1 SC converter |
| 4.13 | DLBHC operations: proper $T_D$ (black) and improper $T_D$ (red) |
| 4.14 | Simulated relationship between steady-state $I_{load}$ and $f_{eff}$ |
| 4.15 | Simulated relationship between $T_{cross2}$ and $T_{clk}$ |
| 4.16 | Simulated relationship between $T_{cross2}$ and $T_{clk}$ |
| 4.17 | Simulated relationship between $K_{cross1}$ and $K_{vco}$ (Simulation vs Model) |
| 4.18 | Process and temperature variation of selected $T_D$ (1.8 V to 0.8 V) |
| 4.19 | Relationship between the sampled $V_{out}$ and $T_D$ during the transient recovery |
| 4.20 | System block diagram of the proposed N-phase DLBHC SC Converter |
| 4.21 | Detailed circuits of (a) Level shifter (b) Pulse generator (c) Multiple-clock triggered DFF |
| 4.22 | Conceptual graph of DLBHC operations when (a) $f_{clk} = 2f_{opt}$, (b) $f_{clk} = f_{opt}/2$ and (c) $f_{clk} = f_{opt}$ |
| 4.23 | Simulated waveform of 6 $\mu$A - 6 mA up transient w/o turbo |
| 4.24 | Simulated waveform of 6 mA - 6 $\mu$A down transient w/o turbo |
| 4.25 | Simulated waveform of 6 $\mu$A - 6 mA up transient w/ turbo |
| 4.26 | Chip microphotograph |
| 4.27 | Comparison of the measured load regulation between the proposed DLBHC and its conventional LBHC mode |
| 4.28 | Comparison of the measured efficiency between the proposed DLBHC and its conventional LBHC mode during load regulation |
| 4.29 | Comparison of the measured VCO frequency between the proposed DLBHC and its conventional LBHC mode during load regulation |
| 4.30 | Comparison of the measured line regulation between the proposed DLBHC and its conventional LBHC mode |
| 4.31 | Comparison of the measured efficiency between the proposed DLBHC and its conventional LBHC mode during line regulation |
| 4.32 | Comparison of the measured VCO frequency between the proposed DLBHC and its conventional LBHC mode during line regulation |
| 4.33 | Measured transient response w/ turbo from 40 $\mu$A to 4 mA |
| 4.34 | Measured transient response w/ turbo from 10 $\mu$A to 6 mA |
| 4.35 | Measured transient response w/o turbo from 10 $\mu$A to 6 mA |
| 4.36 | Analysis and comparison between the measured step-up load transient w/o $V_{turbo}$ (left, from Fig. 4.34) and w/ $V_{turbo}$ (right, from Fig. 4.35) |
| 4.37 | Analysis of measured step-down load transient (from Fig. 4.34) |
| 4.38 | Proposed centralized DLBHC controller with delay compensation |
| 4.39 | Analysis of DLBHC operations in time-domain: (a) Flowchart of operation. (b) Steady-state with matched frequency. (c) Transient state |
| 4.40 | Details of delay compensation |
| 4.41 | Timing of delay compensation |
| 4.42 | Simulated timing diagram of delay compensation during end of load transient from 7 mA to 8 mA |
| 4.43 | (a) Mode control circuit (b) Clock timing |
| 4.44 | (a) State-detection (b) Frequency controller |
| 4.45 | Chip microphotograph |
| 4.46 | Load regulation and efficiency |
| 4.47 | Frequency and power consumption of controller |
| 4.48 | Transient response without external activation |
| 4.49 | Histogram of the response time |
| 4.50 | Transient response with external activation |
| 4.51 | Break down of power conversion efficiency at 1 mA load current |

# **List of Abbreviations**

| Abbreviation | Definition |
|-------|---------------------------------------|
| DLBHC | Dual Lower Bound Hysteretic Control   |
| DVFS  | Dynamic Voltage and Frequency Scaling |
| IoTs  | Internet-of-Things                    |
| LBHC  | Lower Bound Hysteretic Control        |
| PCB   | Printed Circuit Board                 |
| PMIC  | Power Management Integrated Circuit   |
| SCPC  | Switched Capacitor Power Converter    |
| SIPC  | Switched Inductor Power Converter     |
| VLSI  | Very Large Scale Integration          |

#### Chapter 1

### Introduction

The Internet is connecting almost every electronic device in the world (Fig. 1.1): according to the statistics, the number of connected devices has increased from around 3.6 billion in 2015 to more than 15 billion in 2023 [1, 2]. Technological innovations in communication techniques, process nodes, sensor techniques and artificial intelligence (AI) have driven these increases over the past few years. These domains are expected to continue catalyzing IoT device development in the future. The rapid growth of IoT-connected devices presents challenges for circuit design. Many research efforts are dedicated to Very Large Scale Integration (VLSI) designs and systems-on-chip. These efforts aim to integrate multiple functions into a single chip and optimize power efficiency and performance to meet rising demand from applications. In this scenario, because the Power Management Integrated Circuit (PMIC) supplies power to every block of an electronic device, its design and characteristics have a direct influence on overall device effectiveness.

Fig. 1.1. Statistic and prediction of the number of connected devices from IoT Analytics [1] and Statista [2].

The process of DC-DC voltage conversion holds significant importance in system power conversion. Voltage regulation techniques have been developed to tackle the challenges presented in this area. These techniques include inductive DC-DC converters, capacitive DC-DC converters, low dropout regulators, and hybrid converters. In practical applications, it is common to see VLSI chips operating with varying supply voltages across different subsystems. Hence, there is a need for multiple voltage regulators to facilitate voltage conversion across these domains. Considering Fig. 1.2 as an example, inductive or hybrid voltage regulators typically handle voltage conversion between the off-chip bus and load. Common scenarios include the charging of a battery using a Power Delivery (PD) charger that can manage voltages up to 48 V [4–14], or delivering power from battery voltages to the chips [15–23]. This is feasible because the Switched Inductor Power Converter (SIPC) can achieve high power conversion efficiency across a wide voltage conversion ratio. Generally, it offers high output power, making it suitable for powering the entire system.

Fig. 1.2. Typical PMIC system in IoT devices.

Meanwhile, for on-chip voltage regulation, the Switched-Capacitor Power Converter (SCPC) as a type of PMIC is a competitive choice in many modern electronic devices. Compared to SIPC approaches, SCPC offers highly integrated design solutions using on-chip capacitors [24, 25]. In addition to compact design, SCPC can also achieve fast transient performance for load circuits. This makes it a promising choice for compact devices like Internet-of-Things sensors [26–35]. However, SCPC has intrinsic drawbacks in terms of voltage conversion ratio, efficiency, and output power. Therefore, this chapter provides a brief review of the pros and cons from three crucial aspects of PMIC design (voltage conversion ratio, efficiency, and speed) to establish the background for the following chapters.

Fig. 1.3. Example of Step Down PMIC: SIPC, SCPC and Hybrid Converters.

Unlike SIPC, which regulates the output voltage by controlling duty cycles with a single topology, SCPC has a fixed voltage conversion ratio for a given topology. Therefore, strategies for effectively covering the target design space have garnered a great deal of research interest. Formulating a suitable topology to address the challenges of applications is one of the major research topics in this field. As shown in Fig. 1.4, basic SC topologies include Dickson, ladder, Fibonacci, series-parallel, and exponential converters. Exponential topologies provide the highest voltage conversion ratio but also magnify the parasitic losses from the bottom plate parasitic capacitance [36]. The Dickson topology scales the conversion ratio linearly as the number of stages increases, and the impact of the bottom plate parasitic is minimized by the lower voltage swing [36]. However, for a high voltage conversion ratio, the Dickson topology requires more stages. Therefore, the overall losses induced by the additional stages become non-negligible for this type of converter. Other topologies like the Fibonacci or ladder structure strike a trade-off between conversion ratio and parasitic losses, providing a more feasible choice for SC converter design. In addition to the basic topologies, techniques like multi-conversion ratio [37], successive approximation [38], and continuous ratio design are also being researched [39]. This endows SCPC with the potential to achieve competitive voltage regulation capability over other types of power converters.

Fig. 1.4. Typical SCPC converters comparison.
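The stage-count trade-off among the basic topologies can be illustrated with a short sketch. It assumes the commonly quoted ideal (lossless, no-load) step-up ratios as a function of the number of stages N: N + 1 for Dickson, the sequence 2, 3, 5, 8, ... for Fibonacci, and 2^N for exponential (cascaded doubler) converters; for step-down operation the ratio is the reciprocal. The function names are hypothetical, and parasitic losses are deliberately not modeled.

```python
def dickson_ratio(n_stages: int) -> int:
    """Ideal step-up ratio of an N-stage Dickson converter (linear in N)."""
    return n_stages + 1


def fibonacci_ratio(n_stages: int) -> int:
    """Ideal step-up ratio of an N-stage Fibonacci converter: 2, 3, 5, 8, ..."""
    a, b = 1, 1
    for _ in range(n_stages):
        a, b = b, a + b
    return b


def exponential_ratio(n_stages: int) -> int:
    """Ideal step-up ratio of N cascaded doublers (exponential in N)."""
    return 2 ** n_stages


def stages_needed(ratio_fn, target_ratio: int) -> int:
    """Smallest stage count whose ideal ratio reaches the target ratio."""
    n = 1
    while ratio_fn(n) < target_ratio:
        n += 1
    return n


if __name__ == "__main__":
    topologies = (
        ("Dickson", dickson_ratio),
        ("Fibonacci", fibonacci_ratio),
        ("Exponential", exponential_ratio),
    )
    for target in (4, 8, 16):
        counts = ", ".join(
            f"{name}: {stages_needed(fn, target)}" for name, fn in topologies
        )
        print(f"ratio >= {target} -> stages needed: {counts}")
```

For a given target ratio, the Dickson structure needs the most stages and the exponential structure the fewest, which is the trade-off against bottom plate parasitic losses described above.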

In terms of efficiency, the Switched Inductor Power Converter (SIPC) is renowned for its high efficiency, primarily due to its high-quality, though often bulky, inductors. The inclusion of external inductors facilitates soft-charging during the transfer of charges across different levels. Unlike the SIPC, the Switched Capacitor Power Converter (SCPC), which lacks an inductor, has to mitigate hard-charge sharing losses to achieve competitive efficiency. As a result, within the scope of SCPC research, techniques such as split-phase [40] and multiphase soft-charging [41] have been developed to address this problem. It has also been reported that methods like scalable parasitic redistribution [42] can significantly reduce the impact of bottom plate parasitics by decreasing the voltage fluctuations across different phases. The integration of these techniques can lead to highly competitive efficiency levels in comparison to other types of converters.

When it comes to transient regulation, an inductor tends to maintain a constant current, so the current delivered by an SIPC cannot change abruptly. An SCPC has no inductor in its power path, which allows it to achieve faster regulation speeds than its SIPC counterparts. In fact, one method to enhance transient performance in a hybrid converter is to add a capacitive path [43]. Within an SCPC, the controller regulates the output voltage by adaptively changing the switching frequency. The two most commonly used approaches are hysteretic control and frequency regulation. Hysteretic control compares the output voltage with the reference voltage and triggers a phase switch when the output voltage falls below the reference voltage. When controlled by a hysteretic controller, an SCPC can respond within one clock cycle and demonstrates robust stability. However, a mismatch between the optimum frequency and the clock frequency may result in excessive subharmonic ripple [37, 42]. Frequency regulation, on the other hand, directly adjusts the operating frequency, thus avoiding the issue of subharmonic ripple. However, unlike hysteretic control, its response speed is limited by the controller's bandwidth. Consequently, achieving a fast transient response with limited power becomes more challenging [44–46].
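The behavior of clocked hysteretic control can be sketched with a minimal behavioral simulation. The model below is an assumption made for illustration, not the circuit analyzed in this thesis: an idealized 2:1 converter whose flying capacitor shares charge with the output capacitor once per switching event, a comparator evaluated at every clock edge, and hypothetical component values (`c_fly`, `c_out`, `f_clk`, `v_in`, `v_ref`).

```python
def simulate_lbhc(i_load, v_in=1.8, v_ref=0.8, c_fly=1e-9, c_out=10e-9,
                  f_clk=10e6, n_cycles=2000):
    """Behavioral model of clocked lower-bound hysteretic control.

    Returns the effective switching frequency in Hz and the minimum
    output voltage observed during the run.
    """
    t_clk = 1.0 / f_clk
    v_ideal = v_in / 2.0      # no-load output of an ideal 2:1 converter
    v_out = v_ref             # start from the regulation target
    v_min = v_out
    switch_events = 0
    for _ in range(n_cycles):
        # The load discharges the output capacitor for one clock period.
        v_out -= i_load * t_clk / c_out
        v_min = min(v_min, v_out)
        # Comparator decision at the clock edge: switch only when the
        # output has fallen below the reference (the lower bound).
        if v_out < v_ref:
            # Simplified charge sharing between flying and output capacitor.
            v_out += (v_ideal - v_out) * c_fly / (c_fly + c_out)
            switch_events += 1
    f_eff = switch_events / (n_cycles * t_clk)
    return f_eff, v_min


if __name__ == "__main__":
    for i_load in (10e-6, 500e-6):
        f_eff, v_min = simulate_lbhc(i_load)
        print(f"I_load = {i_load * 1e6:.0f} uA: "
              f"f_eff = {f_eff / 1e3:.0f} kHz, V_min = {v_min * 1e3:.1f} mV")
```

In this sketch the effective switching frequency follows the load automatically because clock edges are skipped whenever the output stays above the reference, and a load step is answered at the next clock edge.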

Given the advantages and challenges of SCPC, it is clear that a high-performance SCPC must incorporate techniques to improve efficiency and increase the number of available voltage conversion ratios in order to mitigate its drawbacks. This would allow it to combine the benefits of fast transient response speed and high power integrity, showcasing its superiority over SIPC in on-chip voltage regulation scenarios. Specifically, in applications such as always-on IoT sensors where power sources are limited, the critical issues of ultra-low-power/voltage optimization and standby efficiency/speed become prominent and also need to be addressed through thoughtful SCPC design.

In this thesis, two topics for further enhancing SCPC performance are discussed. In Chapter 2, the challenges in always-on IoT sensor applications are discussed in detail. In Chapter 3, efficiency issues of the SCPC in emerging thermoelectric energy harvesting applications are addressed. Based on the critical trade-off between transistor performance and driving costs, our approach to optimizing the gate voltage is discussed and compared with other research to demonstrate why driving the transistor becomes one of the critical topics in energy harvesting applications. In Chapter 4, a dual-loop hysteretic control-based frequency control method is discussed to enhance the transient performance of SCPC for Internet-of-Things (IoT) applications; it minimizes the compromise between speed and power conversion efficiency over the full load range when implementing the controller. In Chapter 5, the results obtained from these two pieces of research are summarized, and the future development of SCPC is discussed.

#### Chapter 2

### **Challenges in Always-On IoTs System**

Recently, there has been increasing research interest in building always-on IoT terminals [47, 48]. As shown in Fig. 2.1, a typical always-on IoT system usually consists of an ultra-low-power processing system and sensors for monitoring devices throughout the day, as well as a high-performance system to handle complicated tasks like processing and transferring data. In addition to the processing and sensor system, the PMIC needs to manage the voltage regulation for both the low-power and high-power modes of the system.

Because the energy sources of IoT devices are limited, there is growing research interest in energy harvesting circuits as the main or supplemental power source of these devices, with the goal of extending the maintenance period or even eliminating battery replacement [26–35]. To achieve this goal, researchers are developing harvesters for different energy sources. The temperature gradient, which exists widely in the environment, is considered one of the promising energy sources for these devices. However, when thermoelectric materials are applied to IoT applications, new challenges arise. Although the output power of a thermoelectric generator (TEG) varies with its materials and device dimensions, its closed-loop voltage is usually limited to hundreds of millivolts. Therefore, low-input-voltage DC-DC converters are necessary for thermoelectric energy harvesting in IoT applications.



Fig. 2.1. Challenges in Always-on IoT devices addressed by this thesis: energy harvesting (Chapter 3), standby efficiency and wake-up speed (Chapter 4).

Extending the comparison between SIPC and SCPC to TEG energy harvesting, inductive DC-DC converters usually have high efficiency, but bulky high-quality inductors are often required. This drawback is especially significant in low-power energy harvesting applications: a larger inductance is typically desired to decrease the operating frequency and thus achieve better power efficiency, even at low power levels.

However, while it is desirable to maintain high power conversion efficiency to maximize operational stability, integration is also important in these cases. Fully integrated power converters offer two major benefits for IoT applications: they reduce manufacturing costs by reducing the number of components in the PCB-level design, and they contribute to smaller product sizes. Compact size is important for IoT sensors to minimize intrusion in people's daily lives. For example, in applications such as wearable sensors for Electrocardiography (ECG), a sensor with fewer inflexible components is critical for improving user comfort. Therefore, capacitive DC-DC converters, which can achieve fully integrated designs, are promising approaches for energy harvesting applications.

Fig. 2.2. Review of PMIC performance: efficiency vs input voltage.

Compared with photovoltaic energy harvesting, thermoelectric generators have the benefit of being able to work with temperature differences even at night. However, small temperature differences pose the challenge of low output voltage in the power management domain. For example, when the temperature difference is below 7 degrees, a typical 2.5 cm × 2.5 cm thermoelectric generator has a closed-loop output voltage of less than 0.2 V at the maximum power point [73] and an output power below 8 mW [73]. The power scales with sensor size, but the closed-loop voltage under the matching condition is less size-dependent; hence, the power converter needs to address the low-input-voltage energy harvesting problem. Fig. 2.2 reviews the power conversion efficiency of power converters under different input voltage conditions. It can be observed that for SCPC, when the input voltage drops below the 0.3 V reported by [61], existing approaches [26, 64] have significant difficulty in achieving high efficiency. The increased voltage conversion ratio due to the low input voltage, as well as the insufficient gate drive, causes the degraded performance. The degradation of power conversion efficiency must be addressed to meet application requirements, and this thesis addresses the efficiency loss induced by gate driving to boost the efficiency [59], which is discussed in detail in Chapter 3.

Fig. 2.3. Design Trade-offs in the controller design.

In addition to searching for energy sources from the environment, the efficiency of utilizing the stored energy also plays an important role in operating always-on IoT devices. Most IoT devices spend a significant amount of time in low-power standby mode and are only activated for short periods as needed. Even if the power in standby mode is much lower than in active mode, the much longer standby time still pushes the total consumed energy to a level comparable to that of the active mode. Therefore, the efficiency of both standby and active modes is critical for improving overall battery runtime. As a result, a converter with high efficiency over a wide range of loads is required. Additionally, fast transient speed is crucial for the responsiveness of IoT devices. Because the load circuit must stand by until the PMIC output is ready to start its operations, or until it can cope with a voltage drop during a transient event, a slow transition between the standby and active modes would significantly slow down the operation of the system. Unfortunately, just as high-performance cores require high power, a fast controller also requires more power. This leads to an inevitable trade-off between transient performance and light-load efficiency, as shown in Fig. 2.3. This limitation applies to almost all types of controllers, but it can potentially be addressed at the system level: the always-on subsystem can notify the PMIC about potential wake-up events. Due to stability considerations of a control loop, not all control methodologies are capable of scaling their performance level internally and externally. Hence, dual lower bound hysteretic control is proposed in Chapter 4 to address this issue.
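The energy argument above can be made concrete with a small calculation. The numbers are hypothetical and chosen only for illustration (10 µW standby power, 10 mW active power, one second of activity per hour, and the listed converter efficiencies); they are not measurements from this thesis, and the function names are likewise assumptions.

```python
def load_energy(p_standby, t_standby, p_active, t_active):
    """Energy consumed by the load in each mode (joules)."""
    return p_standby * t_standby, p_active * t_active


def battery_draw(e_standby, eta_standby, e_active, eta_active):
    """Energy drawn from the battery, given the converter efficiency per mode."""
    return e_standby / eta_standby + e_active / eta_active


if __name__ == "__main__":
    # Hypothetical duty cycle: 10 uW standby for 3600 s, 10 mW active for 1 s.
    e_sb, e_act = load_energy(10e-6, 3600.0, 10e-3, 1.0)
    print(f"standby energy: {e_sb * 1e3:.1f} mJ, active energy: {e_act * 1e3:.1f} mJ")

    # Same active-mode efficiency, two different light-load efficiencies.
    for eta_sb in (0.4, 0.8):
        total = battery_draw(e_sb, eta_sb, e_act, 0.8)
        print(f"standby efficiency {eta_sb:.0%}: battery draw {total * 1e3:.1f} mJ per cycle")
```

Under these assumed numbers the standby energy exceeds the active energy despite the thousandfold lower standby power, so the light-load efficiency dominates the battery draw.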

#### Chapter 3

# Efficiency Optimization in Low Voltage Low Power Applications

In previous research [27–32], different approaches are discussed to improve the efficiency of charge pumps in TEG energy harvesting applications. However, the degradation in efficiency and output power is still significant when the input voltage decreases. For example, reducing parasitic losses increases the peak efficiency [27, 28], but this approach cannot solve the problem of degraded on-resistance under low input voltage. Therefore, a significant drop in both efficiency and power appears. In addition to scaling the transistor, threshold modulation techniques like dynamic body biasing [29] also enhance the performance of transistors. However, external flying capacitors are still needed in [29], implying that this approach may not be effective in terms of improving the performance of the transistor to a level where a fully integrated design is allowed. Another approach is to directly boost the gate voltage. Some researchers use a bootstrapped ring-VCO (BTRO) to enlarge the amplitude of driving signals [32]. However, because the BTRO is not regulated, changes in input voltage can impact the frequency and clock amplitude, and hence degrade the power conversion efficiency significantly. Moreover, in [30], a dual-mode 10-stage charge pump successfully improved the power density with an increased gate voltage. In [31], a gate-boosted charge pump design is also introduced with a significant improvement in output power. However, detailed analyses of the effects of increasing the gate driving voltage are still missing. As the voltage increases, scaling the switching transistors reduces the power required to drive them, but the parasitic losses on other parts of the controllers increase. This trade-off implies that detailed analysis may reveal the existence of an optimum point in charge pump design, which has the potential to become a competitive and cost-effective approach to improving performance.

#### 3.1 Trade-offs in SCPC Design

#### 3.1.1 Conduction Losses

In a regulated SCPC converter, the transistor on-resistance determines the lowest conduction losses that can be achieved, and the conduction losses are a direct result of hard-charge sharing losses under slow switching limits (SSL).

Consider a circuit with two equal capacitors $C$ and a resistor, in which the capacitors hold initial voltages of $V_C + \Delta V$ and $V_C$. Assuming that the voltage difference $\Delta V$ is fully shared during one phase, the initial state has an energy of:

$$E_{init} = \frac{1}{2}C((V_C + \Delta V)^2 + V_C^2) \tag{3.1}$$

the final state has an energy of

$$E_{end} = \frac{1}{2}C((V_C + \frac{1}{2}\Delta V)^2 + (V_C + \frac{1}{2}\Delta V)^2) \tag{3.2}$$

the overall energy stored in the circuit is reduced by:

$$\Delta E_{loss} = \frac{1}{2}C(\frac{1}{2}\Delta V^2) \tag{3.3}$$

This is usually referred to as hard-charge sharing losses. Meanwhile, the energy transferred to  $C_2$  is:

$$\Delta E_{C2} = \frac{1}{2}C(V_C\Delta V + \frac{1}{4}\Delta V^2) \tag{3.4}$$

Because the conduction losses scale quadratically with the voltage change $\Delta V$, while the transferred energy scales linearly with $\Delta V$, this motivates techniques like split-phase [40] and multiphase soft-charging [41], which minimize the $\Delta V$ across each stage to enhance the power conversion efficiency.
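The energy bookkeeping of (3.1) to (3.4) can be checked numerically. The following is a minimal sketch; the capacitance and voltage values are illustrative assumptions and are not taken from any design in this thesis.

```python
def charge_sharing(c, v_c, dv):
    """Energy bookkeeping for two equal capacitors C shorted through a resistor.

    One capacitor starts at v_c + dv, the other at v_c; both settle at v_c + dv/2.
    Returns (energy lost in the resistor, energy gained by the second capacitor).
    """
    e_init = 0.5 * c * ((v_c + dv) ** 2 + v_c ** 2)              # (3.1)
    e_end = 0.5 * c * 2 * (v_c + dv / 2) ** 2                    # (3.2)
    e_loss = e_init - e_end                                      # (3.3): C*dv^2/4
    e_transferred = 0.5 * c * ((v_c + dv / 2) ** 2 - v_c ** 2)   # (3.4)
    return e_loss, e_transferred


# Assumed, illustrative values.
C = 50e-12   # farads
V_C = 0.1    # volts

for dv in (0.01, 0.02, 0.04):
    loss, moved = charge_sharing(C, V_C, dv)
    print(f"dV = {dv * 1e3:4.0f} mV: loss = {loss:.3e} J, "
          f"transferred = {moved:.3e} J, loss/transferred = {loss / moved:.3f}")
```

Doubling $\Delta V$ quadruples the loss, while the transferred energy grows only slightly faster than linearly, so the loss-to-transfer ratio worsens as $\Delta V$ grows.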

Regarding the impact of conduction losses in regulated SCPC, because the conduction losses are proportional to  $\Delta V^2$ , the overall conduction losses in SCPC are proportional to the deviation of the output voltage from the topology's ideal voltage conversion ratio,  $\Delta V_{out}$ . This result is trivial as  $\Delta V_{out}$  is accumulated from the voltage drop at each stage. Therefore, conduction losses are usually considered constant in regulated SCPC. However, this doesn't mean that conduction losses are irrelevant to the circuit design. In fact, the minimum achievable conduction losses are closely related to the circuit and are usually discussed alongside the Slow Switching Limit (SSL) and Fast Switching Limit (FSL) in the literature.

#### 3.1.2 Impact of Slow and Fast Switching Limit in Conduction Loss

In practice, because the overall conduction losses are proportional to $\Delta V_{out}^2$, it is desired to minimize the voltage drop over the whole operating range. One of the necessary conditions to guarantee that the voltage drop can be minimized is that the SCPC needs to have its charge transferred almost completely. This is usually referred to as the Slow Switching Limit (SSL) of the SCPC, where the voltage drop across the switches and interconnect is small enough for them to be considered almost ideal.

To ensure this condition applies, according to the principle of RC charging and discharging, the time constant needs to be short enough compared to the period of each phase. If the time constant is comparable to the period of each phase, then the SCPC will enter the Fast Switching Limit (FSL). As the voltage drop across each stage cannot be ignored, it causes additional voltage drops and consumes the voltage headroom of the SCPC. Taking the circuit in Section 3.1.1 as an example, due to the existence of a voltage drop over the resistor $R$, the $\Delta V$ that can be transferred is reduced by $I_{RC}R$. This reduces the transferred charge by $\Delta Q = I_{RC}RC$. In a regulated SCPC, the degraded charge transfer capability needs to be compensated for by increasing the operating frequency for a fixed load current. On the other hand, the increased frequency pushes the SCPC closer to the FSL. Hence, while the SCPC usually scales the frequency linearly with load current, the relationship is no longer linear, and the frequency increases rapidly when the operating region is close to the FSL. Moreover, if all the voltage headroom is consumed by the resistive voltage drop, the SCPC will fail to reach the designed output voltage.
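The rapid rise of the required frequency near the FSL can be illustrated with a deliberately simplified model. The sketch below assumes a single first-order RC loop, one charge transfer per cycle within a half-period phase, and a fixed available voltage step `dv`; these are modeling assumptions rather than the impedance model of a specific topology, and the component values are illustrative.

```python
import math


def delivered_current(f, r, c, dv):
    """Average current delivered when each phase lasts 1/(2f) (assumed model)."""
    settled = 1.0 - math.exp(-1.0 / (2.0 * f * r * c))
    return f * c * dv * settled


def required_frequency(i_load, r, c, dv):
    """Frequency needed to deliver i_load, found by bisection."""
    i_max = dv / (2.0 * r)  # limit of delivered_current as f grows without bound
    if i_load >= i_max:
        raise ValueError("load exceeds what this RC loop can deliver")
    lo = i_load / (c * dv)  # ideal SSL estimate, always slightly too low
    hi = lo
    while delivered_current(hi, r, c, dv) < i_load:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if delivered_current(mid, r, c, dv) < i_load:
            lo = mid
        else:
            hi = mid
    return hi


# Assumed, illustrative values.
R, C, DV = 3e3, 50e-12, 0.02

for i_load in (0.5e-6, 1e-6, 2e-6, 3e-6):
    f_ssl = i_load / (C * DV)
    f_req = required_frequency(i_load, R, C, DV)
    print(f"I = {i_load * 1e6:.1f} uA: SSL estimate {f_ssl / 1e6:.2f} MHz, "
          f"required {f_req / 1e6:.2f} MHz")
```

In this model the deliverable current saturates at `dv / (2 * r)`, so the required frequency departs from the linear SSL estimate and grows without bound as the load approaches that ceiling.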

In fact, these characteristics are predictable from the behavior of the RC charging and discharging curve. Defining a time constant $RC$, an initial current $I_{init}$, and a final voltage change $\Delta V_{end}$, the voltage change reaches about 86% of $\Delta V_{end}$ at $2RC$ and about 95% of $\Delta V_{end}$ at $3RC$. Meanwhile, the current flow, as well as the voltage drop across the resistor, falls to about 14% and 5% of its initial value, respectively. Hence, although the transition between FSL and SSL is not clearly defined, it is roughly at the level of the time constant. Overall, a regulated SCPC operates in SSL for most of its operating range, and the transition point to FSL determines the maximum achievable load current level. Therefore, given a target level of conduction losses ($\Delta V_{out}$), the maximum load current level is determined by the FSL and hence by the quality of the switches and interconnect ($R$). In other words, given a target load current level, the minimum conduction loss level is determined by the headroom to the FSL at the current operating conditions.
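The settling figures follow directly from the first-order response $1 - e^{-t/RC}$. A minimal sketch:

```python
import math


def settled_fraction(n_time_constants):
    """Fraction of the final voltage change reached after n time constants."""
    return 1.0 - math.exp(-n_time_constants)


for n in (1, 2, 3, 4, 5):
    done = settled_fraction(n)
    print(f"t = {n} RC: settled {done:.1%}, remaining current {1.0 - done:.1%}")
```

At $2RC$ the transfer is about 86.5% complete and at $3RC$ about 95.0%, with 13.5% and 5.0% of the initial current still flowing.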

Overall, the higher the switch and interconnect quality (which suggests a lower $R_{on}$), the higher the achievable performance. In fact, for a detailed topology, the impedance model, which models the SCPC with an ideal transformer and a resistor, provides a powerful approach to quantitatively analyze the issue. A smaller impedance in general leads to lower conduction losses and higher efficiency. In this thesis, however, the qualitative impact of $R_{on}$ on the SCPC is sufficient for the subsequent discussions, hence a detailed review is omitted here.

#### 3.1.3 Switching Losses

While the conduction loss is proportional to the voltage drop from the ideal output voltage  $\Delta V_{out}$  and can be considered constant when the output voltage is regulated by the controller, several other types of switching losses exist in the SCPC that will affect its overall power efficiency.

Bottom-plate parasitic capacitance is one of the representative sources of capacitive switching losses in SCPC. In on-chip capacitors, while the capacitance between the top and bottom plates is of interest to the circuit designer, both plates come with parasitic capacitance relative to ground nodes (or to other circuit nodes, depending on layout). The parasitic capacitance can vary significantly with the type of capacitor. The metal-insulator-metal (MIM) capacitor is one of the most commonly used on-chip capacitor types in modern CMOS technologies. As MIM capacitors are usually located at higher metal layers and are placed vertically, they come with smaller bottom-plate parasitic capacitance to the bulk silicon and negligible top-plate parasitic capacitance. Metal-oxide-metal (MOM) capacitors are fabricated in lower layers of the chip, in which the parasitic capacitances are much higher than in MIM capacitors. Using metal fingers also implies that the capacitance density of MOM capacitors heavily depends on the process node: a process with more metal layers usually allows higher-density MOM capacitors.

In practice, while on-chip capacitors usually can achieve higher power density compared with their off-chip counterparts, the limited area still puts significant limits on the overall capacitance available on-chip. Therefore, to maximize the capacitance available on chips, the MIM capacitors and MOM capacitors can be stacked together to further increase the capacitor density, but at the cost of increasing the bottom plate parasitic capacitance of the MIM capacitor. Additionally, techniques like scalable parasitic redistribution [42] are developed to reduce the impact of parasitic capacitance. Overall, the bottom plate parasitic capacitance-induced power losses are topology and process dependent.

Gate capacitive losses are also one of the representative power losses in the SCPC. In fully integrated SCPC, the gate parasitic is scaled with transistor sizes and can be approximated by:

$$C_{gate} \approx C_{ox}WL + C_{ov}W \tag{3.5}$$

where  $C_{ox}$  is the oxide capacitance between the gate and channel, and  $C_{ov}$  is the capacitance that exists between the gate and drain/source, which is dominated by the width of transistors. Because power transistors in SCPC usually take a minimal length, the gate parasitic capacitance scales linearly with the transistor width. Hence the power consumption of the gate scales linearly with the transistor width and quadratically with the gate driving voltage. Scaling the gate driving voltage and transistor width also changes the on-resistance  $R_{on}$  of switches, hence leading to a trade-off between the current and efficiency.
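The scaling stated above can be sketched numerically from (3.5) and the usual $CV^2f$ expression for dynamic power. The per-area and per-width capacitance values below are assumed, illustrative numbers for a 180-nm-class device and are not extracted from the PDK used in this work.

```python
def gate_capacitance(width_um, length_um, c_ox_ff_um2, c_ov_ff_um):
    """Gate capacitance in fF from (3.5): C_ox*W*L + C_ov*W."""
    return c_ox_ff_um2 * width_um * length_um + c_ov_ff_um * width_um


def gate_drive_power(c_gate_ff, v_gate, f_sw):
    """Dynamic gate-drive power in watts, C * V^2 * f."""
    return c_gate_ff * 1e-15 * v_gate ** 2 * f_sw


# Assumed, illustrative values (not from the thesis or a PDK).
C_OX = 8.6    # fF/um^2
C_OV = 0.3    # fF/um
L_MIN = 0.18  # um, minimal channel length

base = gate_drive_power(gate_capacitance(100, L_MIN, C_OX, C_OV), 0.6, 1e6)
wider = gate_drive_power(gate_capacitance(200, L_MIN, C_OX, C_OV), 0.6, 1e6)
higher = gate_drive_power(gate_capacitance(100, L_MIN, C_OX, C_OV), 1.2, 1e6)

print(f"W = 100 um, 0.6 V: {base * 1e9:.1f} nW")
print(f"W = 200 um, 0.6 V: {wider * 1e9:.1f} nW  (x{wider / base:.1f})")
print(f"W = 100 um, 1.2 V: {higher * 1e9:.1f} nW  (x{higher / base:.1f})")
```

At fixed length, doubling the width doubles the gate-drive power and doubling the gate voltage quadruples it.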

Other types of switching losses also include the reverse leakage current when SCPC changes the phases, but these losses are highly design specific and can be avoided by techniques like deadtime design. Also, for both the gate and parasitic capacitance induced losses, because the parasitic RC loop holds a smaller time constant due to its small parasitic capacitance, the charge transferred per phase is considered complete. Therefore, with a fixed flying node voltage and gate driving voltage in regulated SCPC, the parasitic losses mainly scale with the operating frequency.

#### 3.1.4 Trade-offs in Low Voltage Low Power Applications

Due to the existence of conduction losses and switching losses, there is a trade-off between the maximum output current and efficiency. As summarized in Fig. 3.1, for charge pumps with ideal switches, because the effective internal resistance generates a voltage drop $V_{drop}$ from the ideal voltage conversion ratio $(N + 1)V_{DD}$ and leads to conduction losses [27], the charge pump efficiency is limited by the trade-off between the output current $I_{out}$ and $V_{drop}$. However, when maximizing $I_{out}$ and minimizing $P_c$, the best achievable result is limited by the maximum frequency that can be achieved. Such a frequency limitation $f$ is set by the on-resistance $R_{on}$ of the switches. If the switching frequency is too high, the flying capacitors cannot be effectively charged or discharged, generating additional resistive losses on the transistors and hence degrading the overall efficiency.

A reduced input voltage and power further amplify the impacts of these trade-offs: while the reduced voltage increases the impact of conduction losses for the same voltage drop, the switching power required to reduce the on-resistance, and hence to maintain a small voltage drop, also cannot be ignored. Therefore, the on-resistance of the transistors is a critical parameter when the input voltage is reduced, and the effectiveness of the methods for reducing $R_{on}$ determines the final performance. As mentioned in the introduction, methods like body biasing, transistor scaling, and gate voltage boosting can effectively reduce $R_{on}$.



Fig. 3.1. Charge pump design trade-offs.

However, the effectiveness of these methods differs and can be investigated by comparing the required controller power $P_{control}$. If transistor scaling is the only method applied to the transistors, it is difficult to compensate for the exponential relationship between current and voltage in the subthreshold region, resulting in a significant $P_{control}$ and degraded efficiency. Meanwhile, because $V_{th}$ is only a weak function of the body-biasing voltage $V_b$, adjusting the body bias is less effective than directly changing $V_{gs}$. Therefore, optimizing $V_{gs}$, which reduces $R_{on}$ with a minimal increase in $P_{control}$, is a promising approach for designers to maximize charge pump efficiency. The concept of gate voltage optimization is therefore proposed.

#### 3.2 Gate Voltage Optimization for Target $R_{on}$

#### 3.2.1 Gate Voltage Optimization in Strong Inversion Region

Because the output voltage is specified by applications and regulated by feedback loops, an SCPC usually has fixed conduction losses $P_c$. Generally, charge-sharing losses, which are the losses generated during the charge redistribution process, are the main source of conduction losses. From this viewpoint, for any given load, the total loss can be written as:

$$P_{loss} = P_s + P_c = P_{s,sw} + P_{s,c} + P_{s,etc} + P_c \tag{3.6}$$

With a fixed $P_c$, the target is hence to minimize the switching losses $P_s$. Part of $P_s$, such as the bottom-plate parasitic capacitance-induced losses, is related to the operating voltage of each charge pump stage. Similar to the conduction loss, it can be considered constant because our method does not change the operating voltage of each stage during optimization. These $V_{gs}$-independent losses are hence summarized as $P_{s,etc}$. Meanwhile, the loss due to the gate parasitic capacitance of the switch transistors, $P_{s,sw}$, which is expected to scale with $V_{gs}$, can be written as:

$$P_{s,sw} = C_{sw} V_{gate}^2 f = C_{ox} W_{sw} L_{sw} V_{gate}^2 f \tag{3.7}$$

$C_{sw}$ is the gate parasitic capacitance of each transistor, which can be further expressed by the parasitic capacitance per unit area $C_{ox}$, the transistor width $W_{sw}$, and the length $L_{sw}$. $V_{gate}$ is the voltage applied to the gate of the transistors. Meanwhile, although the power consumption of the controller varies with different designs, the capacitive switching loss of the controller $P_{s,c}$ is selected to model the general behavior of the controller and derive the general considerations of charge pump designs:

$$P_{s,c} = C_c V_{gate}^2 f \tag{3.8}$$

where  $C_c$  is the effective controller parasitic capacitance to each switching transistor.

When scaling $V_{gs}$ and the transistor sizes, a scaling rule must be defined to maintain a constant charge pump RC-loop characteristic. This implies that key parameters, such as the frequency of the regulated charge pump, do not need to be changed during the optimization. To achieve this, the $R_{on}$ of the charge pump is selected as the unchanged parameter during the scaling process. While charge pump optimization is a multidimensional problem, encapsulating the charge pump performance in a single parameter helps us discuss the optimization problem in the extra dimension of controller and gate driving losses. Hence, discussing the optimum point within the optimization space becomes more convenient. To introduce our proposed method, let us first assume, for the sake of simplicity, that $R_{on}$ is the optimum choice for the charge pump. This assumption is essential in guaranteeing that the resulting $P_s$ is minimal, and the following analysis will verify whether it is reasonable. For the conceptual analysis, assume that the voltage is sufficient for the transistors to operate in the triode region. In the triode region, the relationship between the gate-to-source voltage $V_{gs}$ and the transistor width $W_{sw}$ can be derived as:

$$W_{sw} = \frac{L_{sw}}{\mu R_{on} C_{ox} (V_{gs} - V_{th} - V_{ds}/2)} \tag{3.9}$$

Replacing the  $W_{sw}$  in (3.7) hence leads to:

$$P_{s,sw} = \frac{L_{sw}^2 V_{gate}^2 f}{\mu R_{on} (V_{gs} - V_{th} - V_{ds}/2)} \tag{3.10}$$

For simplicity, a PMOS connected to the output node is chosen to demonstrate the concept. Other transistors can be discussed by modifying $V_{gate}$ and $V_{gs}$. In this case, the output is the highest voltage available in the system, and $V_{gs}$ hence equals $V_{gate}$. Therefore, the overall power consumption in the triode region can be written as:

$$P_{s} = V_{gs}^{2} f\left(\frac{L_{sw}^{2}}{\mu R_{sw,on}(V_{gs} - V_{th} - V_{ds}/2)} + C_{c}\right) \tag{3.11}$$

Observing (3.11), it is not difficult to notice that both the f and  $R_{sw,on}$  appear. If other parameters of the charge pump are already given, then each  $R_{sw,on}$  would correspond to a different f. Since the proposed scaling does not alter the  $R_{sw,on}$ , both  $R_{sw,on}$  and f can be considered as constants. Therefore, it is not difficult to compare the result obtained with different  $R_{sw,on}$  and determine the optimum  $R_{sw,on}$ . Assuming that  $R_{on}$  is the optimum choice of the charge pump at the beginning of the analysis is hence a reasonable choice.

Taking the partial derivative of (3.11) with respect to $V_{gs}$, the total switching losses can be reduced by increasing $V_{gs}$ if:

$$\frac{\partial P_s}{\partial V_{gs}} = V_{gs} f\left[2(C_{sw} + C_c) - \frac{C_{sw} V_{gs}}{V_{gs} - V_{th} - V_{ds}/2}\right] < 0 \tag{3.12}$$

To simplify the expression, define $C_c = kC_{sw}$. We observe that (3.12) is negative if:

$$V_{gs} < \left(1 + \frac{1}{2k+1}\right)\left(V_{th} + \frac{V_{ds}}{2}\right) \tag{3.13}$$

Therefore, an upper limit of  $V_{gs}$  is given by (3.13).
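The step from (3.12) to the bound on $V_{gs}$ can be written out explicitly. The sketch below treats $k$ as a constant at the operating point under consideration, which is an assumption, since $C_{sw}$ itself changes with $V_{gs}$ through (3.9). It also assumes $V_{gs} > V_{th} + V_{ds}/2$, so that the inequality can be multiplied through without changing its direction:

$$\begin{aligned}
2(C_{sw} + C_c) - \frac{C_{sw}V_{gs}}{V_{gs} - V_{th} - V_{ds}/2} &< 0 \\
2(1+k)\,(V_{gs} - V_{th} - V_{ds}/2) &< V_{gs} \\
(2k+1)\,V_{gs} &< 2(k+1)\,(V_{th} + V_{ds}/2) \\
V_{gs} &< \left(1 + \frac{1}{2k+1}\right)(V_{th} + V_{ds}/2)
\end{aligned}$$

For a negligible controller capacitance ($k \to 0$) the bound is $2(V_{th} + V_{ds}/2)$, and as the controller capacitance grows ($k \to \infty$) the bound approaches $V_{th} + V_{ds}/2$, that is, the optimum moves toward the threshold voltage.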

The concept of $V_{gs}$ optimization can thus be summarized as shown in Fig. 3.2. At high $V_{gs}$, the losses on the switches become less sensitive to changes in their gate driving voltage, and the controller losses $P_{s,c}$ gradually dominate the overall system losses. Conversely, at a lower voltage, the losses on the switches become more sensitive to changes in their gate driving voltage, and the overall losses are gradually dominated by $P_{s,sw}$. An optimum point thus exists.

Fig. 3.2. Conceptual graph of optimum gate voltage.
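The existence of the optimum can be illustrated by sweeping (3.11) directly. This is a minimal sketch of the triode-region model only. The mobility `mu` and the drain-source voltage `v_ds` are assumed values; the remaining numbers are picked to be of the same order as those used later in this chapter and are likewise illustrative.

```python
def switching_loss(v_gs, f, l_sw, mu, r_on, v_th, v_ds, c_c):
    """Total switching loss P_s from (3.11); valid for v_gs > v_th + v_ds/2."""
    c_sw = l_sw ** 2 / (mu * r_on * (v_gs - v_th - v_ds / 2))
    return v_gs ** 2 * f * (c_sw + c_c)


# Illustrative parameters; mu and v_ds are assumptions.
params = dict(f=1e6, l_sw=180e-9, mu=0.03, r_on=3e3,
              v_th=0.54, v_ds=0.02, c_c=30e-15)

v_min = params["v_th"] + params["v_ds"] / 2
grid = [v_min + 0.001 * i for i in range(1, 1000)]   # 1 mV steps above v_min
losses = [switching_loss(v, **params) for v in grid]

best = min(range(len(grid)), key=losses.__getitem__)
v_opt, p_opt = grid[best], losses[best]
print(f"optimum V_gs ~ {v_opt:.3f} V, P_s ~ {p_opt * 1e9:.2f} nW")
```

Below the optimum the switch term grows quickly because the transistor must be widened to hold $R_{on}$; above it the controller term $C_c V_{gs}^2 f$ takes over. With these assumed values the minimum lies a few tens of millivolts above $V_{th}$.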

In practice, the assumptions in (3.13) have limitations. For example, the theory of transistor scaling based on (3.9) becomes less convincing when the transistor operates around the transition region of weak and strong inversion. However, because the degradation of the transistor's conductivity occurs much faster in weak inversion regions, this would result in similar behavior as described in Fig. 3.2. The prediction made by (3.13), therefore, remains meaningful to the circuit designers. It allows them to utilize the conclusion to estimate the benefits of optimizing  $V_{gs}$  in their respective applications.

#### 3.2.2 Gate Voltage Optimization in Weak Inversion Region

To build a better understanding of gate voltage optimization, this section introduces an enhanced model to address the limitations presented in section 3.2.1 and aligns with the simulation results. To achieve this, effective gate-to-source voltage  $V_{gsteff}$  and effective channel length  $L_{eff}$  need to be introduced.

In the BSIM3 model [74], the drain-source current of different regions can be defined as:

$$I_{ds} = I_{dso} F(V_{ds}, V_{gs}) \tag{3.14}$$

where $F(V_{ds}, V_{gs})$ summarizes non-dominant effects such as drain-induced barrier lowering, substrate current body effect, velocity saturation, and parasitic drain-source resistances. $I_{dso}$ can be written as:

$$I_{dso} = \mu_{eff} \frac{W_{eff}}{L_{eff}} C_{ox} V_{gsteff} \left(1 - \frac{A_{bulk} V_{dseff}}{2(V_{gsteff} + 2v_t)}\right) V_{dseff} \tag{3.15}$$

where effective width  $W_{eff}$ , length  $L_{eff}$ , mobility  $\mu_{eff}$ , gate-to-source voltage  $V_{gsteff}$ , and drain-to-source voltage  $V_{dseff}$  are introduced to better represent transistor physics. Among these parameters,  $W_{eff}$  is very close to the actual transistor geometry because the transistor is large in the weak inversion region. The small voltage between terminals also does not generate significant impacts on effective mobility  $\mu_{eff}$ . However, the effective channel length  $L_{eff}$  and gate-to-source voltage  $V_{gsteff}$  have important impacts on the results.

For the 180-nm CMOS process, the difference between the effective channel length $L_{eff}$ and the drawn transistor length cannot be ignored. $L_{eff}$ is smaller than the drawn length, leading to higher current and smaller power losses in the simulation results. The effective gate overdrive $V_{gsteff}$ is given by:

$$V_{gsteff} = \frac{2nv_t \ln\left[1 + \exp\left(\frac{V_{gs} - V_{th}}{2nv_t}\right)\right]}{1 + 2nC_{ox}\sqrt{\frac{2\Phi_s}{q\epsilon_{si}N_{ch}}}\exp\left(-\frac{V_{gs} - V_{th} - 2V_{off}}{2nv_t}\right)} \tag{3.16}$$

Fig. 3.3. Circuit setup for evaluating the transistor scaling under different $V_{gs}$.

Detailed definitions of the parameters in (3.16) can be found in the BSIM3 manuals [74], and most parameters can be directly extracted or calculated from the PDK. The power losses on the transistor are hence modified as:

$$P_{s,sw} = \frac{L_{sw}L_{sw,eff}V_{gate}^{2}f}{\mu R_{on}(V_{gsteff} - \frac{V_{ds}}{2(V_{gsteff} + 2v_{t})})} \tag{3.17}$$

Generally, the  $V_{gsteff}$  is larger than the  $V_{gs} - V_{th}$  when  $V_{gs}$  is close to or smaller than the threshold voltage. As a result, smaller transistor sizes and power losses are expected in simulations.
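The behavior of $V_{gsteff}$ around the threshold can be sketched by evaluating the BSIM3 expression numerically. All model parameters below (`n`, `c_ox`, `phi_s`, `n_ch`, `v_off`) are assumed, order-of-magnitude values chosen for illustration; they are not extracted from the PDK used in this work.

```python
import math

Q = 1.602176634e-19                 # elementary charge, C
EPS_SI = 11.7 * 8.8541878128e-12    # permittivity of silicon, F/m


def v_gsteff(v_gs, v_th, n=1.5, v_t=0.02585, c_ox=8.6e-3,
             phi_s=0.8, n_ch=1.7e23, v_off=-0.08):
    """BSIM3-style effective gate overdrive; SI units, assumed parameters."""
    x = (v_gs - v_th) / (2 * n * v_t)
    numerator = 2 * n * v_t * math.log1p(math.exp(x))
    coeff = 2 * n * c_ox * math.sqrt(2 * phi_s / (Q * EPS_SI * n_ch))
    denominator = 1 + coeff * math.exp(
        -(v_gs - v_th - 2 * v_off) / (2 * n * v_t))
    return numerator / denominator


V_TH = 0.54
for v_gs in (0.44, 0.54, 0.64, 0.84, 1.54):
    print(f"V_gs = {v_gs:.2f} V: V_gs - V_th = {v_gs - V_TH:+.3f} V, "
          f"V_gsteff = {v_gsteff(v_gs, V_TH):.4f} V")
```

At and below the threshold, $V_{gs} - V_{th}$ is zero or negative while $V_{gsteff}$ stays positive, which is why the triode-only model overestimates the required transistor width there. Well above the threshold the two quantities converge.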

#### 3.2.3 Numerical Calculation

As previously mentioned, constant performance is necessary when scaling the transistors and optimizing the gate voltage for the charge pump. The setup of the experiment can be seen in Fig. 3.3. In this simulation, we define T = 3RC as the point where the two circuits are compared. The simulation includes the impact of changes in source voltage. For a 5-stage charge pump with 120 mV input voltage and 600 mV output voltage, the average voltage drop across each switching transistor is around 20 mV. Therefore, the initial capacitor voltage is set to 100 mV, and  $V_{DD}$  is set to 120 mV. Since the NMOS transistor source is shifted by a  $V_{DD}$ , (3.11) needs modification as follows:

$$P_{s} = V_{gate}^{2} f\left(\frac{L_{sw}^{2}}{\mu R_{sw,on}(V_{gs} - V_{th} - V_{ds})} + C_{c}\right) \tag{3.18}$$

where $V_{gate}$ can be expressed as $V_{gs} + V_{ds}$. To account for source voltage changes, $V_{ds}$ is set to half of the voltage drop, which equates to 10 mV in this calculation. This approximation sacrifices some accuracy to maintain simplicity. While process-related parameters can be obtained from the PDK, $V_{th}$ must be extracted using the $I_{ds}/\sqrt{g_m}$ method [75] for optimal fitting. Fig. 3.4 demonstrates that the typical $V_{th}$ is 540 mV.

Fig. 3.4. Extraction of the $V_{th}$ from simulation results.

Fig. 3.5 and Fig. 3.6 show the results for a normal-threshold NMOS transistor with a 180 nm channel length. If the designed minimum charge/discharge time is $3RC$, the time constant needs to be smaller than $1/(6f)$. Taking 1 MHz as the target frequency and 50 pF as the pumping capacitor, the on-resistance needs to be smaller than 3.3 k$\Omega$. Hence, a 3 k$\Omega$ on-resistance is taken for calculation and simulation in Fig. 3.5. At higher power, a 200 $\Omega$ on-resistance is taken in Fig. 3.6 for comparison. The simulated power loss on the transistor switches is plotted as $P_{s,sw,sim}$, the result calculated from (3.10) is plotted as $P_{s,sw}$, and the result calculated from (3.17) is plotted as $P_{s,sw}^*$. Because the overall optimum point is achieved by trading off $P_{s,sw}$ and $P_{s,c}$, $P_{s,c}$ needs to be modeled to calculate the optimum point. $P_{s,c}$ is the power loss of the part of the controller that does not drive the switches directly. Usually, $P_{s,c}$ heavily depends on the controller design. In this analysis, the power losses generated by parasitic capacitance are considered. Typically, the metal line capacitance ranges from 0.01 to 0.2 fF/um. It is hence reasonable to assume that $C_c$ is 30 fF for the analysis, and the corresponding controller losses are plotted as $P_{s,c}$ in Fig. 3.5 and Fig. 3.6. By adding $P_{s,c}$, the overall switching loss can be plotted as $P_s$, $P_s^*$, and $P_{s,sim}$, where $P_s$ is the result of (3.18), $P_s^*$ is the result based on (3.17), and $P_{s,sim}$ is the result of simulation.

Fig. 3.5. Verification of the theory when $R_{on} = 3000\Omega$.
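The on-resistance bound quoted above follows from requiring each phase to span three time constants. A minimal sketch, assuming two equal phases per cycle with no dead time (the function name is illustrative):

```python
def max_on_resistance(f_sw, c_fly, settle_time_constants=3.0):
    """Largest R_on for which one half-period phase spans the given number of RC."""
    phase_time = 1.0 / (2.0 * f_sw)   # two equal phases per cycle (assumption)
    return phase_time / (settle_time_constants * c_fly)


r_max = max_on_resistance(f_sw=1e6, c_fly=50e-12)
print(f"R_on must stay below about {r_max:.0f} ohm")
```

For 1 MHz and 50 pF this gives roughly 3.3 k$\Omega$, which is why 3 k$\Omega$ is a reasonable working value for the low-power case.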

Fig. 3.5 and Fig. 3.6 provide a basis for discussing the accuracy and implications of previous analyses. When identifying the optimal gate voltage, the method presented in section 3.2.1 can yield accurate results, provided the switches operate in the triode region. However, because (3.18) does not include the weak inversion region, errors are introduced when $V_{gs}$ is close to $V_{th}$. Despite the uncertainties that arise during the early stages of circuit design, the simple estimation offered by (3.18) proves useful, as it indicates the approximate range for the optimal gate voltage. For a more precise analysis, designers should use (3.17) to optimize performance effectively.

Fig. 3.6. Verification of the theory when $R_{on} = 200\Omega$.

Furthermore, the optimal gate voltage largely depends on the controller and $R_{on}$. To discuss the wide-ranging impact of $P_{s,c}$ at various output power levels, the optimal gate voltage and minimum power losses are computed based on (3.17). Fig. 3.7 shows the resulting optimum gate voltage. Unless a small $C_c$ and a small $R_{on}$ can be achieved simultaneously, which is difficult in thermoelectric energy harvesting, the optimum gate voltage is more likely to be located at a lower voltage. Meanwhile, as shown in Fig. 3.8, a high $C_c$ and $R_{on}$ are usually undesired due to the large power losses, so the resulting optimum gate voltage is generally located near the threshold voltage. Moreover, although the optimum $V_{gs}$ can be changed at different load current levels, which is equivalent to optimizing the circuit for the corresponding optimum $R_{on}$, the results of Fig. 3.7 and Fig. 3.8 show that the difference between these optimum points is not significant. Optimizing $V_{gs}$ for the maximum load current condition can hence be considered sufficient in most cases.

Fig. 3.7. Optimum gate voltage at different $R_{on}$ and $C_c$.
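The trend of the optimum with $R_{on}$ and $C_c$ can be sketched in closed form from the simpler triode model (3.11), not from (3.17) as used for the figures. Setting the derivative (3.12) to zero and writing $x = V_{gs} - V_{th} - V_{ds}/2$ and $A = L_{sw}^2/(\mu R_{on})$ gives $2C_c x^2 + Ax - A(V_{th} + V_{ds}/2) = 0$. The mobility and $V_{ds}$ below are assumed values, so the numbers only indicate direction, not the values in the figures.

```python
import math


def optimum_vgs(r_on, c_c, l_sw=180e-9, mu=0.03, v_th=0.54, v_ds=0.02):
    """Positive root of dP_s/dV_gs = 0 for the triode model (3.11)."""
    a_coef = l_sw ** 2 / (mu * r_on)   # C_sw times overdrive, in F*V
    v_min = v_th + v_ds / 2
    x = (-a_coef + math.sqrt(a_coef ** 2 + 8 * c_c * a_coef * v_min)) / (4 * c_c)
    return v_min + x


for r_on in (200.0, 1e3, 3e3):
    row = ", ".join(f"C_c = {c_c * 1e15:.0f} fF: {optimum_vgs(r_on, c_c):.3f} V"
                    for c_c in (10e-15, 30e-15, 100e-15))
    print(f"R_on = {r_on:6.0f} ohm -> {row}")
```

In this model a larger $C_c$ or a larger $R_{on}$ pulls the optimum toward the threshold, while only a small $C_c$ combined with a small $R_{on}$ pushes it noticeably higher.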

It's also worth mentioning that the power, being specific to controller design, may not always align with the ideal capacitive loss model. In general,  $P_{s,c}$  will increase with  $V_{gs}$  for most controllers, although the slope can differ. For instance, when level shifters are introduced, while most parts of the controller are supplied with a constant voltage, the level shifter still consumes extra power for larger  $V_{gs}$ .



Fig. 3.8. Power losses at optimum gate voltage with different  $R_{on}$  and  $C_c$ .

As the optimal $V_{gs}$ results from the trade-off between controller power and switch power, the optimum $V_{gs}$ settles at a higher value if $P_{s,c}$ is lower and its rate of change versus $V_{gs}$ is slower. Also, as the reduction of $P_{s,sw}$ at higher $V_{gs}$ is less significant and $P_{s,sw}$ might be insignificant compared to the output power, the actual benefits of these approaches need to be assessed on a case-by-case basis. Overall, the analysis process is the same as the previous one, except that $P_{s,c}$ needs to be replaced with the actual model.

#### 3.2.4 Comparison with Dynamic Body Biasing

In terms of improving the performance of charge pumps for energy harvesting applications, a competitive approach is dynamic body biasing. For example, dynamic body biasing (DBB) is introduced in [29] to enhance charge pump performance. In their research, forward body bias is applied to an NMOS transistor when charging the flying capacitors. Reverse bias is applied to the NMOS transistors dynamically when discharging the flying capacitors. The threshold voltage is hence dynamically modified for better performance.

Fig. 3.9. (a) Circuit setup of typical charge pumps with DBB. (b) Setup of gate voltage optimization for simulations. (c) Setup of DBB for simulations.

To compare the performance differences between gate voltage optimization and body biasing techniques, this section simulates and analyzes the impacts of applying these techniques. Fig. 3.9 shows the concept of DBB described in [29] and the simulation setup for this comparison. The target equivalent on-resistance is 200 $\Omega$. For gate voltage optimization, the body of the NMOS transistors is connected to the ground. The transistor width is scaled at different $V_g$ to maintain constant performance. In DBB-based circuits, cross-coupled designs make the effective $V_{gs}$ close to $V_{DD}$. Therefore, $V_g$ is set to $2V_{DD}$ when the transistor is turned on, and $V_{DD}$ when the transistor is turned off. To maintain constant performance, the transistor is scaled with different forward biasing voltages $V_b$ and is biased to the ground before the transistor is turned on. Therefore, the power consumed in switching the body-biasing voltage also needs to be considered. Taking $V_{DD} = 120$ mV, the result is plotted in Fig. 3.10. Both transistors are low-threshold transistors. The horizontal axis is $V_g$ for gate voltage optimization and $V_b$ for DBB.

Fig. 3.10. Comparison between dynamic body biasing and gate voltage optimization.

The graph illustrates that the overall power consumption of DBB is higher than that of the proposed approach at high biasing voltages, suggesting that gate voltage optimization is more effective than body biasing. This is consistent with the weak dependence of the threshold voltage on body biasing. Because gate voltage optimization can usually be achieved by careful design of the controller without modifying its structure significantly, the proposed method is a more cost-effective approach to improving the overall power conversion efficiency.
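The weak dependence of $V_{th}$ on the body bias can be illustrated with the textbook long-channel body-effect expression, $V_{th} = V_{th0} + \gamma(\sqrt{2\Phi_F - V_{bs}} - \sqrt{2\Phi_F})$. This expression and the parameter values below are assumptions introduced for illustration; they are not the model or the devices simulated for Fig. 3.10.

```python
import math


def vth_with_body_bias(v_bs, v_th0=0.54, gamma=0.5, two_phi_f=0.8):
    """Long-channel body effect for an NMOS; v_bs > 0 is forward body bias."""
    return v_th0 + gamma * (math.sqrt(two_phi_f - v_bs) - math.sqrt(two_phi_f))


V_BIAS = 0.12   # volts, same magnitude as V_DD in the comparison
shift = vth_with_body_bias(V_BIAS) - vth_with_body_bias(0.0)
print(f"{V_BIAS * 1e3:.0f} mV of forward body bias shifts V_th by {shift * 1e3:.1f} mV")
print(f"{V_BIAS * 1e3:.0f} mV of extra gate drive raises the overdrive by "
      f"{V_BIAS * 1e3:.0f} mV")
```

With these assumed parameters, a given voltage applied to the body moves the effective overdrive by only a fraction of what the same voltage applied to the gate would, which matches the direction of the simulated comparison.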



Fig. 3.11. The general consideration of determining the optimum gate voltage.

#### 3.2.5 Summary of Methodology

Overall, we can summarize a practical design flow as depicted in Fig. 3.11. Initially, it is desirable to determine whether the power level of the target applications could make the cost of implementing the controller a dominant factor. If so, optimizing the gate voltage becomes necessary for enhancing power conversion efficiency. Generally, power losses on the switches decrease when $V_{gs}$ increases. However, in energy harvesting applications, this effect becomes less influential compared to capacitive loss at high $V_{gs}$. Consequently, the capacitive loss $P_{s,c}$ gradually dominates the overall losses in the proposed model. Meanwhile, a small $R_{on}$ usually indicates that the power converter is designed for high-power applications, causing $P_{s,sw}$ to increase and dominate the overall losses even at higher $V_{gs}$ levels.

In terms of determining a practical $V_{gs}$, non-ideal effects must also be taken into account. For instance, leakage current also scales with $V_{gs}$, shifting the optimum $V_{gs}$ to a lower value. Despite this, in the subthreshold region, the benefits gained from increased speed at higher voltage typically outweigh the cost. Therefore, designing the gate-to-source voltage of the switches around the threshold voltage in energy harvesting applications usually results in performance close to the optimum of each stage. Likewise, designing a controller near the threshold voltage promotes a balanced trade-off between performance and power.

Thus, the designer should verify whether the target output voltage is close to the threshold voltage, which is often the case when supplying low-power circuits. A combination of different $V_{th}$ options and full-swing driving from the output is usually sufficient for achieving excellent performance, since the cost of the controller is minimized by its simplified design. If this condition does not apply, separate gate driving circuits might be necessary, requiring case-by-case discussion. For example, some middle stages could be used to supply the required voltage levels, thus minimizing the number of level shifters and their power consumption. Overall, balancing the transistor and the controller becomes critical in optimizing gate usage for energy harvesting SCPCs.

## 3.3 Implementation and Verification

#### 3.3.1 Verification Circuits

Based on the above general discussion of $V_{gs}$ optimization for a single stage of the charge pump, we discuss our design considerations in this section. The aim is to illustrate how the general conclusions from Section 3.2 apply to the design of charge pumps for energy harvesting applications. Fig. 3.12 depicts the system-level structure of the proposed charge pump.

In Fig. 3.13(a), we present one phase of the proposed 5-stage linear charge pump for a detailed demonstration of  $V_{gs}$  optimization. For a 120 mV input voltage, a 600 mV output voltage is selected for the proposed 5-stage charge pump. Due to the low  $V_{DD}$  of the thermoelectric generator, we desire to operate the controller under a higher voltage to achieve the target speed and accuracy of the control loop. We select the output voltage, which is close to the threshold voltage, as the power source of the controller, following the general conclusion derived from the previous discussion.



Fig. 3.12. Charge pump structure.



Fig. 3.13. (a) Charge pump structure. (b)  $V_{gs}$  in different stages.

Fig. 3.13(b) shows the available  $V_{gs}$  at different stages. Since the output stage  $M_{p6}$  has sufficient voltage swing, using a standard threshold transistor could lead to improved efficiency because it reduces reverse leakage. However, the advantages garnered by employing a low-threshold voltage are restricted, as analyzed in the preceding sections. For the middle stages  $M_{p3}$ ,  $M_{p4}$ , and  $M_{p5}$ , which exhibit



Fig. 3.14. Minimal  $P_{s,sw}$  of each stage at different  $V_{gate,on}$ .

smaller $V_{gs,on,P}$ due to source voltage shifts, we have chosen low-threshold transistors with a threshold voltage of around 300 mV to compensate for the diminished $V_{gs,on,P}$. Despite the higher mobility of NMOS transistors, they are not used in these stages because PMOS transistors can conduct leakage current before the controller operates and are therefore better suited for self-startup. Similarly, the first and second stages $M_{n1}$ and $M_{n2}$ use low-threshold NMOS transistors to exploit the large $V_{gs,on,N}$. We have also incorporated additional small PMOS transistors $M_{p1}$ and $M_{p2}$ to provide current during the startup process. Additionally, the buffers generating $\Phi_{clk}$ and $\Phi'_{clk}$ use further low-threshold NMOS transistors because their gate driving voltage is similar to that of the first stage, although this is not depicted in Fig. 3.13(a). To avoid latch-up, we bias the bulk of the NMOS transistors to ground and the bulk of the PMOS transistors to $V_{int}$. Non-overlapping clocks, $\Phi_1$, $\Phi'_1$, $\Phi_2$, and $\Phi'_2$, are introduced to mitigate the effects of reverse leakage currents. The timing of $\Phi_{clk}$ corresponds to $\Phi_1$, and $\Phi'_{clk}$ aligns with $\Phi'_1$. Both $\Phi'_{clk}$ and $\Phi_{clk}$ have an amplitude of $V_{in}$. Even though the reverse-biased $V_{gs}$ in those stages during the off-state can increase capacitive power losses due to larger voltage swings, it helps reduce leakage currents. As suggested in [29], lowering leakage currents can enhance the efficiency of the charge pump.

In order to validate the impact of such gate voltage optimization, we simulated and plotted the power required to drive each switch under different on-state gate voltages $V_{gate,on}$ for one phase. At a frequency of 250 kHz and a $V_{in}$ of 120 mV, the output power reaches 180 nW. As observed from Fig. 3.14, turning on the PMOS transistors with the ground voltage and the NMOS transistors with the output voltage avoids considerable power losses. The different threshold-voltage options sufficiently compensate for the source voltage shifts at each stage, ensuring an overall boost in efficiency. Although the gate voltage of some transistors, particularly $M_{n1}$ and $M_{p5}$, can be modified without incurring substantial power losses, using this voltage headroom (for example, turning on $M_{p5}$ with a 0.2 V on-state gate voltage) to lower the power consumption of the controller is challenging, because the controller power would need to be reduced sufficiently while maintaining speed and accuracy. If $P_{sw,c}$ is almost independent of $V_{gs}$, driving the charge pump with a full swing is expected to be the optimum choice. While further optimization may be possible, it demands innovative controller designs with presumably marginal benefits. This work primarily focuses on balancing controller and gate driving losses to enhance efficiency. Although it does not provide an optimization strategy for each transistor, it nonetheless provides a sufficient demonstration of the concept.

Regarding circuit design optimizations, the power conversion efficiency of the charge pump is a reflection of the efficiency of each stage. Therefore, any impact



Fig. 3.15. Controlled oscillator.

from scaling the transistor size will be evident in the overall charge pump power conversion efficiency, particularly under heavy load conditions. If $R_{on}$ exceeds its optimum value, the operating frequency must increase to counteract the output voltage drop. This adjustment continues until $R_{on}$ limits the output power and causes output voltage regulation to fail, which lowers efficiency. If $R_{on}$ falls below its optimum value, the charge pump operates in the SSL regime, where the required frequency is insensitive to $R_{on}$. In such cases, the system may suffer decreased power conversion efficiency due to the losses induced by the oversized switches. Because efficiency degrades on both sides of the optimum, the optimum can be found within a few iterations with the help of EDA tools.
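The two regimes described above can be related to the widely used two-asymptote approximation of the output impedance of a switched-capacitor converter. The expression below is a sketch under that standard model and is not derived in this chapter; $R_{out}$, $R_{SSL}$, and $R_{FSL}$ (the slow- and fast-switching-limit impedances) are symbols introduced here for illustration only.

$$R_{out} \approx \sqrt{R_{SSL}^2 + R_{FSL}^2}, \qquad R_{SSL} \propto \frac{1}{C_{fly} f_{clk}}, \qquad R_{FSL} \propto R_{on}$$

When $R_{on}$ is small, $R_{FSL} \ll R_{SSL}$ and $R_{out}$ is set almost entirely by $C_{fly}$ and $f_{clk}$, so enlarging the switches further only adds gate driving loss. When $R_{on}$ is large, $R_{FSL}$ dominates and raising $f_{clk}$ can no longer reduce $R_{out}$, which corresponds to the regulation failure described above.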

#### 3.3.2 Controller

In the proposed design, frequency modulation is introduced to regulate the output. Fig. 3.15 illustrates the structure of the proposed controlled oscillator. The wide frequency modulation range and low power characteristics of delay gate-based ring oscillators have been proven in [27]. In this paper, to reduce leakage currents, two groups of delay gates with balanced PMOS leakage paths are used. These control



Fig. 3.16. Feedback loop for  $V_{fb1}$  and  $V_{fb2}$  generation.



Fig. 3.17. (a) Startup circuit (b) Startup process

the non-overlapping clock and frequency separately. The oscillator can be divided into three inverting stages, each consisting of two delay gates. For instance, in inverting stage 1, the delay gate controlled by $V_{b1}$ dominates the delay of the stage, while the delay gate controlled by $V_{b2}$ dominates the delay for the non-overlapping clocks. When $V_{b2}$ is lower than $V_{b1}$, the current flowing through the leakage path is larger, which guarantees a shorter delay time. Proper non-overlapping control can hence be achieved. Overall, the oscillator generates six phases, which reduces the output ripple.

Meanwhile, the circuits for generating  $V_{b1}$  and  $V_{b2}$  are shown in Fig. 3.16. A comparator compares the feedback voltage  $V_{fb}$  with the reference voltage  $V_{ref}$ , then generates clock signals on  $\Phi_{up}$  or  $\Phi_{down}$ . By controlling charge pump CP<sub>up</sub> and CP<sub>down</sub>,

$V_{b1}$ is directly generated at the holding capacitor $C_h$. For a better transient response, $V_{b1}$ is also coupled with $V_{int}$. By controlling $V_{b1}$, the frequency can be controlled, hence achieving output regulation. As speed is usually not the focus, ensuring loop stability is not difficult. To track the changes in frequency, $V_{b2}$ must follow the changes in $V_{b1}$. $V_{b2}$ is generated from $V_{b1}$ using a reversed two-transistor voltage reference similar to [76]. By creating a constant voltage drop from $V_{b1}$, a proper non-overlapping period can be generated using $V_{b2}$. As the current flowing through the leakage path in the delay gates is small, the transistors controlling the leakage path operate in the subthreshold region. Meanwhile, because the voltage difference between $V_{b1}$ and $V_{b2}$ is close to one subthreshold swing, the non-overlapping time is much smaller than the clock period. Maintaining this voltage difference hence results in a good ratio between the clock period and the non-overlapping time. Because the circuit performance is relatively insensitive to variations in the difference between $V_{b1}$ and $V_{b2}$, the approach does not require calibration. Moreover, during the startup process, M<sub>s</sub> shunts $V_{b1}$ to $V_{in}$, which provides an initial bias for the delay gates.

#### 3.3.3 Startup Process

The controller, being powered by the output of the charge pump, necessitates a carefully designed startup process. In our proposed circuit, utilizing the leakage current for startup is a viable option. Therefore, additional startup circuits are introduced to manage this process and potential circuit failures, as shown in Fig. 3.17(a). During the startup process, transistor  $M_{isolate}$  isolates the external load and  $V_{int}$ . This action allows storage of charge into  $C_{int}$  and increases the voltage available to the controllers. Given that the oscillator can operate at a very low voltage, the accumulated charges at  $V_{int}$  can power the oscillator operation, which further pumps charges into  $V_{int}$ . With proper circuit design, the charges being pumped into  $C_{int}$ 

|                        | 5-stage   | 3-stage   |
|------------------------|-----------|-----------|
| Area (mm<sup>2</sup>)  | 1.924     | 1.170     |
| *C<sub>fly</sub>* (pF) | 30 × 50.4 | 18 × 50.4 |
| *C<sub>int</sub>* (pF) | 46.6      | 46.6      |
| *C<sub>fb</sub>* (pF)  | 2.12      | 2.12      |

**TABLE 3.1: Capacitor Parameters** 

can exceed the controller's consumption, thereby leading to a successful startup. Because the load is isolated from the power converter during startup, the startup process is determined by the circuit implementation alone. By counting clock cycles with a 6-bit asynchronous counter, sufficient startup time can be assured.

In case of circuit failure, this startup process needs to be activated if an unexpected drop is generated at  $V_{out}$ . In this design, the output voltage is the same as the controller supply voltage. By using both  $V_{out}$  and clocks as inputs to a NOR gate, the output status can be monitored. During normal circuit operations,  $V_{out}$  inhibits the input of clock signals, thereby locking the startup circuits. During circuit failure, characterized by a drop in  $V_{out}$ , the on-resistance of  $M_{isolate}$  increases, causing a voltage difference between  $V_{int}$  and  $V_{out}$ . Consequently, the asynchronous counter begins receiving clock signals, leading to the activation of the startup circuit. A summary of the overall startup process is presented in Fig. 3.17(b).
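The startup and failure-recovery behavior described above can be sketched as a small behavioral model. This is only an illustration under stated assumptions, not the fabricated circuit: the class and method names are hypothetical, and the choice to release the load when the 6-bit counter reaches 64 counts (and to restart counting after a drop in $V_{out}$) is an assumed reading of the text.

```python
COUNTER_BITS = 6
STARTUP_CYCLES = 2 ** COUNTER_BITS  # a 6-bit counter wraps after 64 counts


def nor(a: bool, b: bool) -> bool:
    """Two-input NOR gate."""
    return not (a or b)


class StartupController:
    """Behavioral sketch of the startup logic (hypothetical names)."""

    def __init__(self) -> None:
        self.count = 0
        self.isolating = True  # assumed: M_isolate disconnects the load at power-on

    def clock_cycle(self, vout_high: bool) -> bool:
        """Advance one clock period; return True while the load is isolated."""
        # NOR(V_out, clk): the counter sees a pulse only in the clk-low half
        # period, and only while V_out is low. A high V_out blocks the clock.
        pulses = sum(nor(vout_high, clk) for clk in (True, False))
        if pulses:
            self.isolating = True          # enter or re-enter the startup state
            self.count += pulses
            if self.count >= STARTUP_CYCLES:
                self.count = 0             # counter wraps around
                self.isolating = False     # assumed: the load is reconnected here
        return self.isolating
```

In this sketch, 64 clock periods with a low $V_{out}$ are needed before the load is reconnected; once $V_{out}$ is high the counter receives no pulses, and a later drop in $V_{out}$ restarts the count.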

#### 3.3.4 Measurement Results

The proposed design is implemented in a 3-stage version and a 5-stage version in a standard 180 nm CMOS process. In the 3-stage version, the second and fourth stages of the 5-stage charge pump are removed. Fig. 3.18 shows the microphotograph of the proposed charge pump. The capacitor parameters are summarized in Table 3.1.



Fig. 3.18. Chip micrograph.

$C_{fly}$ denotes the pumping capacitors, $C_{int}$ is the output capacitor at $V_{int}$, and $C_{fb}$ is the sum of the feedback loop capacitors. All capacitors are MIM capacitors.

Fig. 3.19 shows the measured load regulation of the proposed 3-stage charge pump under different input voltages. The corresponding clock frequency is given in Fig. 3.20. Load regulation with acceptable accuracy is achieved. As shown in Fig. 3.21, a 54.0% peak power conversion efficiency is achieved with a 0.18 V input voltage and a 1.9 µA load current by the proposed 3-stage charge pump.

Fig. 3.22 shows the measured load regulation of the proposed 5-stage charge



Fig. 3.19. Load regulation of proposed 3-stage charge pump.



Fig. 3.20. Clock frequency of proposed 3-stage charge pump.

pump under different input voltages. A regulated output is likewise generated over a wide load current range. Fig. 3.23 illustrates the corresponding clock frequency. Fig. 3.24 gives the measured power conversion efficiency.

Fig. 3.25 shows the measured startup process of the proposed 5-stage charge pump with



Fig. 3.21. Efficiency of proposed 3-stage charge pump.



Fig. 3.22. Load regulation of proposed 5-stage charge pump.

a 0.12 V input voltage and a load resistor connected. CLK is the buffered clock signal and does not represent the real amplitude. The result shows that the proposed charge pump is capable of self-startup under heavy load conditions, which avoids



Fig. 3.23. Clock frequency of proposed 5-stage charge pump.



Fig. 3.24. Efficiency of proposed 5-stage charge pump.

the complicated dual-mode startup mechanism described in [77].



Fig. 3.25. Startup Measurement.

#### 3.3.5 Comparison of Performance

Table 3.2 summarizes the performance of the proposed charge pump and compares it with other state-of-the-art designs. In [78–80], significant efficiency and output power degradation appear due to the subthreshold operation. Because [78] and the proposed design share a similar 3-stage linear charge pump structure, a direct comparison between them is meaningful. Compared with adaptive body biasing, a 20%

|                      | JSSC 14 [80] | JSSC 17 [79] | JSSC 14 [78] | ASP-DAC 12 [77] | This work | This work |
|----------------------|--------------|--------------|--------------|-----------------|-----------|-----------|
| Process              | 180nm | 180nm | 65nm | 65nm | 180nm | 180nm |
| Structure            | Self-Oscillating<br>Charge Pump | Discontinuous<br>Charge Pump | 3-stage Adaptive<br>Body Biasing | 10-stage<br>Dual-mode | 3-stage<br>V<sub>gs</sub> Opt. | 5-stage<br>V<sub>gs</sub> Opt. |
| Min. V<sub>in</sub>  | 0.14 V | 0.25 V | 0.15 V | 0.12 V | 0.13 V | 0.12 V |
| Output<br>Voltage    | 2.2 V - 5.2 V<br>@ $V_{in} = 0.35$ V | 3.8 V - 4 V | 0.619 V<br>@ $V_{in} = 0.18$ V | 1 V<br>@ $V_{in} = 0.12$ V | 0.6 V<br>@ $V_{in} = 0.18$ V | 0.6 V<br>@ $V_{in} = 0.12$ V |
| Max.<br>Efficiency   | 39%<br>@ $V_{in} = 0.25$ V\*<br>50%<br>@ $V_{in} = 0.45$ V | 60% | 34%<br>@ $V_{in} = 0.18$ V<br>72.5%<br>@ $V_{in} = 0.45$ V | 38%<br>@ $V_{in} = 0.12$ V<br>($V_{out} = 0.77$ V) | 54%<br>@ $V_{in} = 0.18$ V | 42.6%<br>@ $V_{in} = 0.12$ V |
| Max. P<sub>out</sub> | 5 µW<br>@ $V_{in} = 0.45$ V<br>80 nW<br>@ $V_{in} = 0.25$ V\* | 4 µW | 10.5 µW<br>@ $V_{in} = 0.18$ V | 3 µW<br>@ $V_{in} = 0.12$ V\* | 27 µW | 1 µW<br>@ $V_{in} = 0.12$ V<br>16.9 µW<br>@ $V_{in} = 0.18$ V |
| Fully<br>Integrated  | Yes | Yes | No | Yes | Yes | Yes |
| Area                 | 0.86 mm<sup>2</sup> | 2.72 mm<sup>2</sup> | 0.066 mm<sup>2</sup> | 0.783 mm<sup>2</sup> | 1.17 mm<sup>2</sup> | 1.84 mm<sup>2</sup> |

TABLE 3.2: Comparison of Performance

\* Estimated number from paper.

efficiency improvement is observed in the proposed $V_{gs}$ optimized 3-stage design. Moreover, [78] relies on high-quality off-chip capacitors, whereas the proposed design is fully integrated.

Comparing the proposed 5-stage charge pump with [77], the proposed design achieves a 4.6x voltage conversion ratio with higher efficiency, while the 10-stage charge pump in [77] achieves a 6.4x conversion ratio under the maximum load condition. While the controller power of [77] is not available for comparison, the closed-loop operation of the proposed design implies an optimized trade-off between the controller and the switches, demonstrating the importance of the proposed approach.
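The figures quoted in this comparison can be cross-checked against Table 3.2 with a few lines of arithmetic. The snippet below is only a sanity check using values copied from the table; the variable names are illustrative, and the output voltage of the 5-stage design at 4.6x is an implied value, not a measured number reported in the text.

```python
# Values copied from Table 3.2 (efficiencies in %, voltages in V).
eff_this_3stage = 54.0      # proposed 3-stage design at V_in = 0.18 V
eff_ref78 = 34.0            # adaptive body biasing [78] at V_in = 0.18 V

# The quoted "20%" improvement is a difference in percentage points.
gain_points = eff_this_3stage - eff_ref78

# Conversion ratio of [77] from its listed V_out = 0.77 V at V_in = 0.12 V.
ratio_ref77 = 0.77 / 0.12

# Output voltage implied by a 4.6x ratio at V_in = 0.12 V (derived, not measured).
implied_vout_5stage = 4.6 * 0.12

print(gain_points, round(ratio_ref77, 1), round(implied_vout_5stage, 3))
```

The 6.4x ratio quoted for [77] is consistent with the 0.77 V output listed in the table, and a 4.6x ratio at a 0.12 V input corresponds to an output of about 0.55 V.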

## 3.4 Discussions

In this chapter, the optimum $V_{gs}$ is analyzed for low input voltage charge pumps. A 3-stage and a 5-stage $V_{gs}$ optimized charge pump are fabricated for verification. Compared with [77], the efficiency and voltage conversion ratio show that optimization is important for thermoelectric designs when boosting the gate voltage. The results show a 20% efficiency improvement over a similar 3-stage linear charge pump [78], indicating that optimizing the gate voltage is more effective than body biasing. As $V_{gs}$ optimization yields a significant performance improvement, there is further research potential in applying this technique under wider input/output voltage conditions.

# Chapter 4

# **Dual Lower-Bound-Hysteresis Control**

As discussed in previous sections, when selecting a PMIC for IoT applications, capacitive DC-DC converters offer several advantages over inductive approaches. In particular, they do not require bulky and expensive power inductors, leading to better integration, a smaller form factor, and lower cost. Meanwhile, techniques have been developed to mitigate their drawbacks, such as limited voltage conversion ratios [38,72,81,82], and to reduce their losses [25,41,42]. These benefits make SCPCs a preferable choice for low-power IoT applications. However, beyond efficient power conversion, controlling the circuit remains challenging: it requires both a fast transient response and high efficiency over a wide power range simultaneously.

For instance, the conventional lower-bound hysteresis control (LBHC) illustrated in Fig. 4.1(a) has its transient speed limited by the fixed input clock frequency. The high-frequency clock input needed for a fast transient response results in higher controller power consumption and consequently degrades the light-load efficiency [37, 42]. Switched-capacitor (SC) DC-DC converters with frequency control (as shown in Fig. 4.1(b)) face a similar issue: a fast frequency scaling controller usually leads to poor controller power scaling and degrades the light-load efficiency [44–46]. Therefore, good controller power scaling is essential for high light-load efficiency.



Fig. 4.1. SC converters based on (a) LBHC, (b) frequency control, (c) LBHC SC converter with pulse skipping based frequency control [3], and (d) proposed design.

As seen in Fig. 4.1(c), oversampling-based frequency control is introduced to LBHC in [3]. While the load voltage regulation is comparable to the LBHC controller as displayed in Fig. 4.1(a), the controller power in [3] can be scaled with load currents. This is achieved by regulating the frequency until the skipped phase  $N_{skip}$  matches a target value  $N_{target}$ . However, a larger  $N_{skip}$  will still necessitate extra comparator operations which subsequently cause increased controller power consumption and restricted efficiency.

In this chapter, a dual lower-bound hysteretic control (DLBHC) mechanism is proposed to address the above limitations [83]. In addition to scaling power and frequency with only two comparator operations, it allows the transient performance to be boosted from the system level without interfering with the voltage regulation, which helps alleviate the slow response of low-power controllers. To describe the proposed concept, the circuit and principle of DLBHC will be analyzed in section 4.1. Moreover, modeling of the DLBHC behavior will be introduced in section 4.2 to describe its stability, which helps design the critical delay time required in DLBHC. The measurement results will be presented and analyzed in section 4.3.3, and a conclusion is given in section 4.6.

## 4.1 Concept Circuit

#### 4.1.1 Model of Transient Behavior in SCPC

To understand the behavior of an SCPC during a transient response, it is crucial to model the relationship between frequency and output. A mixed-domain model [84] facilitates an intuitive understanding of the transient dynamics. As illustrated in Fig. 4.2, the entire system comprises variable-time sampling and two discrete-domain transfer functions. The variable sampling frequency converts the output current into charge flows within the SCPC, and the SCPC's operation is nearly independent of frequency in the slow-switching limit (SSL). This results in a uniform mathematical representation of the large-signal behavior of the SCPC. Moreover, the sampled $V_{DD}$ and $I_L$ set the inputs of the discrete-model transfer functions, thus determining the transient response.

In more detail, as shown in Fig. 4.3, the open-loop transfer function of an SCPC can be expressed by:

$$V_n(z) = A_n(z)V_{DD}(z) - R_n(z)I_L(z)$$
(4.1)

$V_n(z)$ is the output voltage, $V_{DD}(z)$ is the input voltage, $I_L(z)$ is the load current, $A_n(z)$ is the z-domain transfer function from input voltage to output voltage, and $R_n(z)$ is the corresponding transfer function from load current to output voltage. For any given clock cycle *k*, the average load current over that cycle is:

$$I_L[k] = \frac{1}{T_s[k]} \int_{t[k-1]}^{t[k]} I_L(t) dt$$
(4.2)



Fig. 4.2. Model of the SCPC with Frequency Control.

where t[k] and t[k-1] are the clock switching instants, determined by the frequency sequence f[n]. $T_s[k]$ is the sampling period, defined as t[k] - t[k-1]. However, once f is introduced, the current $I_L[k]$ is converted to the charge $Q_L[k]$, which is determined by:

$$Q_L[k] = \frac{I_L[k]}{f[k]} = I_L[k]T_s[k] = \int_{t[k-1]}^{t[k]} I_L(t)dt$$
(4.3)
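Eqs. (4.2) and (4.3) can be evaluated numerically for an arbitrary load waveform and clock sequence. The following is a minimal sketch under stated assumptions: the function name `sample_load`, the midpoint-rule integration, and the example values are illustrative and not part of the original model.

```python
def sample_load(i_load, edges, steps=1000):
    """Per-cycle (T_s[k], I_L[k], Q_L[k]) for clock edges t[k], following Eqs. (4.2)-(4.3).

    i_load: function returning the load current I_L(t) in amperes
    edges:  increasing list of clock switching instants t[k] in seconds
    """
    samples = []
    for t_prev, t_now in zip(edges, edges[1:]):
        ts = t_now - t_prev                      # T_s[k] = t[k] - t[k-1]
        dt = ts / steps
        # Midpoint-rule approximation of the integral of I_L(t) over the cycle.
        q = sum(i_load(t_prev + (n + 0.5) * dt) for n in range(steps)) * dt
        samples.append((ts, q / ts, q))          # I_L[k] = Q_L[k] / T_s[k]
    return samples


# Example: a constant 2 uA load sampled with two different clock periods.
result = sample_load(lambda t: 2e-6, [0.0, 4e-6, 6e-6])
```

With a constant load, the average current $I_L[k]$ is the same in both cycles, while the sampled charge $Q_L[k]$ scales with the period $T_s[k]$. This is the frequency dependence that makes the mapping from $I_L$ to $Q_L$ nonlinear in a frequency-modulated loop.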

It can be observed from Fig. 4.3 that the major difficulty of designing a frequency-modulated SCPC comes from the conversion between $I_L(t)$ and $Q_L[n]$. In fact, this relationship can be written in the z-domain as well:

$$Q_L(z) = \frac{1}{2\pi j} \oint_C I_L(v) T_s(\frac{z}{v}) v^{-1} \mathrm{d}v$$
(4.4)

where *v* is the integration variable along the closed contour *C*, $I_L(z) = \mathcal{Z}\{I_L[n]\}$ and $T_s(z) = \mathcal{Z}\{T_s[n]\} = \mathcal{Z}\{1/f[n]\}$. Therefore, in the closed-loop configuration shown in Fig. 4.4, the sampling of $I_L$ to



Fig. 4.3. Open-loop charge pump in the mixed domain.



Fig. 4.4. Closed-loop charge pump with synchronized controller.

 $Q_L$  creates a convolution in the frequency domain, making conventional small signal linearized analysis less accurate, but numerical calculation is still practical.
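The step from Eq. (4.3) to Eq. (4.4) can be spelled out as follows. Since $Q_L[n] = I_L[n]\,T_s[n]$ is a product of two sequences, Eq. (4.4) is an instance of the complex convolution theorem of the z-transform. The generic sequences $x[n]$ and $y[n]$ below are introduced here only for illustration.

$$\mathcal{Z}\{x[n]\,y[n]\} = \frac{1}{2\pi j}\oint_C X(v)\,Y\!\left(\frac{z}{v}\right)v^{-1}\,\mathrm{d}v, \qquad x[n] = I_L[n],\quad y[n] = T_s[n]$$

As a sanity check, for a fixed clock period $T_s[n] = T_0$ the product becomes $Q_L[n] = T_0 I_L[n]$, so that $Q_L(z) = T_0\,I_L(z)$ by linearity. The convolution, and with it the difficulty for linearized analysis, appears only when the period varies from cycle to cycle.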

When the output voltage is regulated by manipulating the frequency, or equivalently the sampling period $T_s$, the controller's speed is the primary determinant of transient speed. This is because, qualitatively, the zeroth- and first-order terms dominate the transfer function response of $B_n(z)$. Therefore, the transient performance is significantly influenced by the speed of the controller.

Whereas frequency modulation control is restricted by its bandwidth and requires several clock cycles to react to a load transient, hysteretic control can respond to the transient event within a single clock cycle because it directly compares the output and determines $1/f_n$. In the proposed DLBHC design, additional clock cycles can be triggered with a resolution of $T_D$, further improving the recovery time, while a frequency controller is incorporated to optimize ripple and efficiency. However, while the discrete-time approach can model the behavior of an SCPC across multiple clock cycles, analyzing the transient behavior of controllers such as DLBHC requires modeling the behavior within a single clock cycle, given the short $T_D$.

#### 4.1.2 Behavior Analysis of the Proposed DLBHC Control

Fig. 4.1(d) and Fig. 4.5 show the concept of the proposed design. By introducing a secondary LBHC loop, the proposed DLBHC approach can facilitate output voltage regulation and generate the  $N_{pump-skip} = N_{pump} - N_{skip}$  signal for frequency regulation. This helps in scaling the controller power relative to load current and optimizing light-load efficiency.

In Fig. 4.5(a), the state of the DLBHC controller is detailed. The proposed DLBHC controller has a primary LBHC loop, similar to the conventional LBHC control loop. It is triggered by $\Phi_{clk}$ and samples $V_{out}$ as $V_{out,min}$. If $V_{out,min} > V_{ref}$, the primary LBHC loop will induce the SC power stage to skip phase switching in this clock cycle, resulting in $N_{skip} = 1$. If $V_{out,min} < V_{ref}$, the SC power stage will pump the charge into the output node, thereby incrementing the count of $N_{pump}$ to 1.

Fig. 4.6 shows the open-loop implementation that achieves the state flow described above and compares it with an open-loop LBHC. In fact, the behavior of the DLBHC (Fig. 4.7, right) under light-load and optimum load conditions is similar to that of the LBHC (Fig. 4.8, right). However, unlike conventional LBHC controllers, when  $V_{out,min} < V_{ref}$ , DLBHC also triggers the proposed secondary LBHC loop (state 3) and samples the output voltage  $V_{out}$  with a delay time  $T_D$ .



Fig. 4.5. (a) Illustration of the proposed DLBHC control (b) flow chart.



Fig. 4.6. Comparison between open loop DLBHC and LBHC, assuming a heavy load.



Fig. 4.7. Operation of DLBHC in open-loop: (left) heavy load, (middle) optimum Load, (right) light load.



Fig. 4.8. Operation of LBHC in open-loop: (left) heavy load, (middle) optimum Load, (right) light load.

Ideally, by properly designing  $T_D$ , the sampled voltage  $V_{out,max}$  will reflect the maximum voltage of  $V_{out}$  that can be charged by flying capacitors and is expected to be greater than  $V_{ref}$  in steady-state. Thus, if  $V_{out,max} < V_{ref}$ , the secondary LBHC will control the SC power stage to pump charges into the output node, resulting in  $N_{pump}$  being incremented by 1. If  $V_{out,max} > V_{ref}$ , phase switching of the SC power stage would not be triggered, and this event will be counted as  $N_{skip} = 1$ . Note that the secondary LBHC loop can be triggered multiple times during a load current transient, which enhances the response speed.
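The state flow of the primary and secondary LBHC loops can be summarized in a short behavioral sketch. It is an illustration under simplifying assumptions rather than the implemented controller: each pump raises the output by a fixed step, the load lowers it by a fixed droop during each $T_D$, the secondary loop can re-trigger a bounded number of times, and all names and example values (in millivolts) are hypothetical.

```python
def dlbhc_clock_cycle(v_out_mv, v_ref_mv, dv_pump_mv, dv_droop_mv, max_retrigger=4):
    """Simulate one clock cycle of the DLBHC state flow (all voltages in mV).

    Returns (n_pump, n_skip, v_out_mv) at the end of the cycle.
    """
    n_pump, n_skip = 0, 0

    # Primary LBHC loop: sample V_out,min on the clock edge.
    if v_out_mv > v_ref_mv:
        return n_pump, 1, v_out_mv        # output is high enough: skip this cycle

    v_out_mv += dv_pump_mv                # pump charge into the output node
    n_pump = 1

    # Secondary LBHC loop: re-sample the output after each delay T_D.
    for _ in range(max_retrigger):
        v_out_mv -= dv_droop_mv           # load discharges the output during T_D
        if v_out_mv > v_ref_mv:           # V_out,max is above the reference
            n_skip = 1
            break
        v_out_mv += dv_pump_mv            # extra pump within the same clock cycle
        n_pump += 1

    return n_pump, n_skip, v_out_mv


light = dlbhc_clock_cycle(620, 600, 30, 5)    # light load
optimum = dlbhc_clock_cycle(590, 600, 30, 5)  # optimum load
heavy = dlbhc_clock_cycle(540, 600, 30, 5)    # heavy load or load step
```

Under these assumptions, $N_{pump} - N_{skip}$ is negative at light load, zero at the optimum load, and positive under heavy load, which is the sign information the frequency loop needs.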

#### 4.1.3 DLBHC and Frequency Control Design

Based on the above discussion, in open-loop operation the relationship between $\Phi_{pump}$ and $\Phi_{skip}$ reveals the relationship between load current and frequency in the proposed DLBHC design; hence, it is possible to regulate the frequency. Considering a single-phase SC converter for simplicity, the DLBHC mechanism described in section 4.1.2 can be implemented as in Fig. 4.9(a). From the viewpoint of frequency, because the load current equals the charge $\Delta Q$ pumped into the output node in each clock cycle multiplied by the frequency, there exists an optimum frequency $f_{opt}$ that supplies the required load current, $I_L = \Delta Q f_{opt}$. Therefore, the voltage regulation by DLBHC generates an equivalent frequency $f_{DLBHC,eff}$, which is equal to the mismatch between the current operating frequency $f_{clk}$ and the optimum frequency $f_{opt}$:

$$f_{opt} = f_{clk} + f_{DLBHC,eff} = (1 + N_{pump-skip,eff})f_{clk} \tag{4.5}$$

where  $N_{pump-skip,eff}$  is equivalent to the average  $N_{pump-skip}$  generated in a multiple clock cycle sample window,  $f_{clk}$  is the frequency of VCO. Therefore, a holding capacitor  $C_h$ , which integrates the  $N_{pump}$  and  $N_{skip}$  with a pull-up and pull-down network, is one of the effective implementations that creates negative feedback to force  $N_{pump-skip}$  to reach zero to match the  $f_{opt}$  and  $f_{clk}$ . Additionally, applying a short pulse wave to  $V_t$  can enable turbo mode operation, which will maximize the frequency and shorten the transient response time. Load regulation will not be affected by  $V_t$  because the primary LBHC will regulate  $V_{out}$ . The aforementioned method provides an effective protocol to dynamically acquire a fast active transient response while maintaining good light-load efficiency provided by low-power controller operation.
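As a rough numerical illustration of (4.5), the following Python sketch treats the frequency loop as a discrete-time integrator. The function names, the loop gain `gain`, and the example load current and charge per cycle are illustrative assumptions, not parameters of the implemented circuit; the real loop integrates pump and skip events on $C_h$ and is bounded by the VCO range.

```python
import math

def optimum_frequency(i_load, delta_q):
    """f_opt from I_L = delta_Q * f_opt."""
    return i_load / delta_q

def n_pump_skip_eff(f_opt, f_clk):
    """Average pump-minus-skip count per clock, rearranged from (4.5)."""
    return f_opt / f_clk - 1.0

def settle(f_clk, f_opt, gain=0.5, steps=50):
    """Hypothetical integrator: nudge f_clk until N_pump-skip,eff reaches zero."""
    for _ in range(steps):
        f_clk *= 1.0 + gain * n_pump_skip_eff(f_opt, f_clk)
    return f_clk

f_opt = optimum_frequency(i_load=1e-3, delta_q=100e-12)  # assumed values
print(n_pump_skip_eff(f_opt, 2 * f_opt))   # clock twice too fast -> -0.5
print(n_pump_skip_eff(f_opt, f_opt / 2))   # clock twice too slow -> 1.0
print(math.isclose(settle(1e6, f_opt), f_opt, rel_tol=1e-9))
```

A negative value asks the loop to slow the clock and a positive value asks it to speed up, which is the sign convention the pull-up and pull-down network on $C_h$ relies on.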

However, for LBHC operation, between the light load and optimum load, samples of $V_{out,min}$ are all smaller than $V_{ref}$ (Fig. 4.8); hence, it is not practical to implement a similar frequency control mechanism (Fig. 4.9(b)). To identify the load information from the comparison result, oversampling, which reduces the load current level at a given frequency, has been proposed to conduct the frequency regulation [3] (Fig. 4.9(c)); its principle is illustrated in Fig. 4.10. This approach, however, reduces the maximum load current as well as the efficiency. Overall, introducing an LBHC in the loop can enhance transient performance compared with purely frequency-control-based approaches such as the one shown in Fig. 4.9(d).



Fig. 4.9. Comparison of different conceptual implementations w/ frequency control (a) proposed DLBHC, (b) LBHC, frequency control is not practical, (c) oversample based LBHC [3] (d) frequency control w/ error amplifier.



Fig. 4.10. Operation of oversampled LBHC in open-loop: (left) heavy load, (middle) optimum load, (right) light load.

#### 4.1.4 Frequency Controller Design

Because the rising edges of $\Phi_{skip,i}$, $\Phi_{pump,i}$ and $\Phi_{pump2,i}$ represent the effective operations of the DLBHC, the overall effective DLBHC clock $\Phi_{pump-skip,eff}$ can be obtained by adding them from all phases:

$$\Phi_{pump-skip,eff} = \sum_{i=1}^{N} (\Phi_{pump,i} + \Phi_{pump2,i} - \Phi_{skip,i}) \tag{4.6}$$

where N is the number of phases.

To implement this function, as shown in Fig. 4.11, an SC-based controller is introduced. Two multiple-clock-triggered DFFs, which are the same as in Fig. 4.21(c), are introduced to collect the outputs of $\Phi_{skip,[1...N]}$, $\Phi_{pump,[1...N]}$ and $\Phi_{pump2,[1...N]}$ separately. Each rising edge of $\Phi_{down,[1...N]}$ triggers a DFF output flip in $\Phi_{down}$, and creates a pulse on $TG_1$ and $TG_2$ with each rising and falling edge input to the pulse generator. The pulse generator used in Fig. 4.11 is modified from Fig. 4.21(b) by replacing the NAND gate with an XNOR gate. This pumps charge into $C_{out}$, decreases the current of the VCO, and hence reduces its frequency. Similarly, each rising edge from $\Phi_{pump,[1...N]}$ and $\Phi_{pump2,[1...N]}$ will increase $f_{clk}$. As a result, the rising edges from $\Phi_{skip,[1...N]}$, $\Phi_{pump,[1...N]}$ and $\Phi_{pump2,[1...N]}$ are added up in $C_{hold}$ as the voltage $V_f$, which controls the frequency through $M_3$. Generally, since the frequency control loop is much slower than the hysteresis control loop, it does not interfere with the operation of the DLBHC during transients.

Additionally, to ensure a minimum current during startup and to guarantee that the output of the VCO satisfies the input requirements of the level-shifter,  $M_1$  is introduced. A maximum frequency limiter, added by  $M_2$  and  $V_{b2}$ , protects the DLBHC from being driven by overspeed clocks. This prevents potential regulation failures caused by bottlenecks such as the DFF flipping speed in the frequency controller loop. In this design, both the  $V_{b1}$  and  $V_{b2}$  are externally biased using a constant current mirror, and the power used for biasing is also factored into the efficiency calculation.

Fig. 4.11. Frequency controller design.

Furthermore, while the proposed frequency controller can serve the purpose of optimizing the frequency in this design, it is also advisable to consider using a digitally controlled oscillator. This is because it offers better support for frequency range limitations and can more effectively fine-tune performance at the system level.

## 4.2 Model for Delay Time Design

#### 4.2.1 Simplified Model for Timing Design

$T_D$ is a critical parameter for good DLBHC performance. Ideally, a $T_D$ that detects exactly the maximum voltage of $V_{out}$ achieves the maximum DLBHC stability. However, in practice, designing with a fixed delay time is more feasible due to its simple, power-saving structure. Intuitively, a $T_D$ design based on the maximum current point might be a good choice, as the ripple sampled by DLBHC is smaller under heavy-load conditions than under light-load conditions. But if $T_D$ is incorrectly configured, as shown in Fig. 4.13, it can generate a wrong $N_{pump-skip}$, which can cause unpredictable behavior of the controller. Therefore, it is necessary to build a model and test the behavior over the whole load current range to ensure that selecting a fixed $T_D$ is a viable choice. While the model of the SC converter has been well-developed with various approaches, most of the works focus on describing the overall steady-state behaviors like conversion ratio, efficiency, and losses [85–90]. To analyze the dynamic behavior of the proposed DLBHC controller and obtain a proper $T_D$ timing range, a simplified model for transient behavior analysis is proposed in this section.

Fig. 4.12. Modeling for the analysis of an N-phase 2:1 SC converter.

Fig. 4.13. DLBHC operations: proper $T_D$ (black) and improper $T_D$ (red).

Consider the charge and discharge phase of a 2:1 SC converter, as it is shown in the upper part of Fig. 4.12. In steady-state, the  $V_{out}$  is equal to  $V_{out,min}$  at both the beginning and the end of each phase. At this moment, the voltage across the flying capacitor  $V_{fly}$  hence will cycle between  $V_{S1} = V_{out,min} + I_{R,min}R_{on}$  and  $V_{S2} =$  $V_{DD} - V_{out,min} - I_{R,min}R_{on}$ , where  $I_{R,min}R_{on}$  reflects the voltage drop on the switch's on-resistance.

Instead of solving for the voltage change of $V_{fly}$, the active phase in an N-phase 2:1 converter can be represented by a voltage bias $V_{out,min}$ and the effective charge that can be pumped into the output node, $Q_{pump} = C_{fly}V_{pump}$. From the previous analysis, $V_{pump} = V_{S2} - V_{S1}$. Hence, for an incomplete charge-discharge cycle, $V_{pump}$ can be defined as:

$$V_{pump} = V_{DD} - 2V_{out,min} - 2I_{R,min}R_{on} \tag{4.7}$$

which describes the voltage differences that can be used by flying capacitors to transfer charge.

Meanwhile, by observing the time constant of each RC loop, it is possible to estimate the circuit dynamics. When the charge and discharge phases have equal resistance, they behave symmetrically during each phase. Therefore, for the current flowing from the active phase to $C_{out}$, the following time constant holds:

$$T_{eq1} = R_{on}C_{eq1} = \frac{R_{on}C_{fly}C_{out}}{C_{fly} + C_{out}} \tag{4.8}$$

Meanwhile, for the current flowing from the active phase to other N-1 inactive phases, the following time constant holds due to symmetrical structure:

$$T_{eq2} = R_{on}C_{fly} \tag{4.9}$$

Considering the final state of charge transfer, the pumped charge  $Q_{pump}$  will be transferred to both the  $C_{fly}$  in inactive phases and the  $C_{out}$ . Therefore, by defining  $C_{out} = \alpha C_{fly}$ , an approximately  $V_{pump}/(N + \alpha)$  final voltage change is expected if no load is connected. Meanwhile, with  $T_{eq2}$  significantly larger than  $T_{eq1}$ , the charging speed of  $V_{out}$  node is dominated by  $T_{eq1}$  at the beginning of each phase. It hence is approximated by  $V_{pump}/(1 + \alpha)$ , assuming that  $T_{eq2}$  does not have a significant effect on the charging speed in this time frame. The RC discharge induced waveform changes on  $V_{out}$  hence can be modeled by  $V_{pump}K_1(t)$ , where  $K_1(t) = K'_1(t) + K''_1(t)$ , and is defined with:

$$K_1'(t) = \frac{1}{1+\alpha} \left( 1 - e^{\frac{-t}{T_{eq1}}} \right) \tag{4.10}$$

$$K_1''(t) = \left(\frac{1}{N+\alpha} - \frac{1}{1+\alpha}\right) \left(1 - e^{\frac{-t}{T_{eq2}}}\right) \tag{4.11}$$

As the voltage drop of  $V_{out}$  is mainly caused by  $I_{out}$ , its contribution can be approximated as:

$$I_{out}K_2(t) = \frac{I_{out}t}{(N+\alpha)C_{fly}} \tag{4.12}$$

Combining the (4.10), (4.11) and (4.12),  $V_{out}$  can be approximated as:

$$V_{out}(t) = V_{out,min} + V_{pump}K_1(t) - I_{out}K_2(t) \tag{4.13}$$

which predicts the output voltage $V_{out}$ in steady-state.
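The simplified model of (4.8) to (4.13) is straightforward to evaluate numerically. The following Python sketch is one possible transcription; the function names and the example operating point (`v_pump`, `i_out`, `r_on`) are assumptions chosen only for illustration, while the capacitor values follow the per-phase figures used later in this chapter.

```python
import math

def k1(t, n, alpha, t_eq1, t_eq2):
    """K1(t) = K1'(t) + K1''(t) from (4.10) and (4.11)."""
    fast = (1.0 - math.exp(-t / t_eq1)) / (1.0 + alpha)
    slow = (1.0 / (n + alpha) - 1.0 / (1.0 + alpha)) * (1.0 - math.exp(-t / t_eq2))
    return fast + slow

def k2(t, n, alpha, c_fly):
    """K2(t) from (4.12): load-induced droop per ampere of load current."""
    return t / ((n + alpha) * c_fly)

def v_out(t, v_out_min, v_pump, i_out, n, r_on, c_fly, c_out):
    """V_out(t) from (4.13) for one active phase."""
    alpha = c_out / c_fly
    t_eq1 = r_on * c_fly * c_out / (c_fly + c_out)   # (4.8)
    t_eq2 = r_on * c_fly                             # (4.9)
    return (v_out_min
            + v_pump * k1(t, n, alpha, t_eq1, t_eq2)
            - i_out * k2(t, n, alpha, c_fly))

# Assumed example values (not the measured design point).
params = dict(v_out_min=0.8, v_pump=0.1, i_out=1e-3,
              n=3, r_on=20.0, c_fly=445.4e-12, c_out=70.4e-12)
for t_ns in (0.0, 0.5, 1.0, 2.0, 5.0):
    print(f"t = {t_ns:4.1f} ns  V_out = {v_out(t_ns * 1e-9, **params):.4f} V")
```

Sweeping `t` this way shows the fast rise governed by $T_{eq1}$ followed by the slower redistribution governed by $T_{eq2}$ and the linear droop caused by $I_{out}$.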

#### 4.2.2 Light-load DLBHC Timing Analysis

When $f_{clk}$ is far from the optimal range, the stability of the DLBHC loop does not raise concerns. However, as shown in Fig. 4.13, the behavior of DLBHC becomes less straightforward when $f_{clk}$ is close to the optimal range. When $f_{clk}$ is just above $f_{opt}$, if the primary LBHC control skipped one clock cycle and the following phase switch cannot pump $V_{out}$ sufficiently to exceed $V_{ref}$, then the secondary LBHC will be triggered and generate a zero $N_{pump-skip}$ for the two clock cycles. If such a situation occurs repeatedly, the frequency control will be less likely to function correctly. Similar cases may also happen if noise is introduced into the output node. To prevent this, it is necessary to determine the boundary conditions, where $T_D$, as a design parameter, plays a very important role.

When $V_{out,min} = V_{ref} + \delta V$, which is equivalent to an $f_{clk}$ slightly higher than the optimal frequency range, the stability margin is the smallest, because increasing $\delta V$ will increase $V_{out,min2}$ and make it easier to pump $V_{out}$ above $V_{ref}$. As the primary LBHC skipped a clock cycle, the last active phase is discharged for $2T_{clk}$. Because $T_{clk}$ is usually much larger than $T_{eq1}$, $I_{out}$ dominates the voltage drop. The voltage drop from $V_{ref}$ can be approximated as:

$$\Delta V_{out1} = -I_{out}K_2(T_{clk}) \tag{4.14}$$

A sufficiently long period also makes  $I_{out}$  dominate  $I_{R,min}$ . Note that during the transient, the voltage cycle of flying capacitors is  $V_{DD} - V_{out,min2} - V_{ref}$ .  $V_{pump}$  hence can be derived as:

$$V_{pump} = V_{DD} - V_{out,min2} - V_{ref} - 2I_{out}\frac{1}{N+\alpha}R_{on} \tag{4.15}$$

where  $V_{out,min2} = V_{ref} + \Delta V_{out1}$ . Therefore, if we define the steady-state  $V_{pump,ss}$ :

$$V_{pump,ss} = V_{DD} - 2V_{ref} - 2I_{out}\frac{1}{N+\alpha}R_{on} \tag{4.16}$$

then (4.15) can be written as:  $V_{pump} = V_{pump,ss} - \Delta V_{out1}$ . Hence, after  $T_D$ , the  $V_{out}$  is pumped up by:

$$\Delta V_{out2} = (V_{pump,ss} - \Delta V_{out1})K_1(T_D) - I_{out}K_2(T_D) \tag{4.17}$$

To ensure the operation, it is desired that:

$$\Delta V_h = \Delta V_{out1} + \Delta V_{out2} > 0 \tag{4.18}$$

For light-load condition,  $I_{out}K_2(T_D)$  can be ignored. Also, the steady-state condition implies that  $V_{pump,ss}K_1(T_{clk}) = I_{out}K_2(T_{clk})$ . This leads to:

$$\Delta V_h = V_{pump,ss}[K_1(T_D) + (K_1(T_D) - 1)K_1(T_{clk})] \tag{4.19}$$

Hence, (4.18) is satisfied if:

$$K_1(T_D) > \left(1 - \frac{1}{1 + K_1(T_{clk})}\right) \tag{4.20}$$
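For readers who want the intermediate algebra, the step from (4.17) and (4.18) to (4.19) and (4.20) can be written out as follows. This only restates the light-load and steady-state assumptions given above, additionally assuming $V_{pump,ss} > 0$; no new quantity is introduced.

$$\begin{aligned}
\Delta V_{out1} &= -I_{out}K_2(T_{clk}) = -V_{pump,ss}K_1(T_{clk}), \\
\Delta V_{out2} &\approx (V_{pump,ss} - \Delta V_{out1})K_1(T_D) = V_{pump,ss}\left[1 + K_1(T_{clk})\right]K_1(T_D), \\
\Delta V_h &= V_{pump,ss}\left\{\left[1 + K_1(T_{clk})\right]K_1(T_D) - K_1(T_{clk})\right\} > 0
\;\Longleftrightarrow\;
K_1(T_D) > \frac{K_1(T_{clk})}{1 + K_1(T_{clk})} = 1 - \frac{1}{1 + K_1(T_{clk})}.
\end{aligned}$$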

As $K_1(T_{clk})$ approaches $1/(\alpha + N)$ under light-load conditions, this leads to:

$$K_1(T_D) > \frac{1}{N+\alpha+1} \tag{4.21}$$

This suggests a lower limit of $T_D$, which corresponds to $T_{cross1}$ in Fig. 4.13. Moreover, the upper boundary of $T_D$ under light-load conditions, which corresponds to $T_{cross2}$ in Fig. 4.13, can be approximately predicted by:

$$V_{out,min} + V_{pump} - I_{out}K_2(T_D) > V_{ref} \tag{4.22}$$

This is also due to the fact that  $T_{eq1} \ll T_{eq2} \ll T_{cross2}$ , hence  $Q_{pump}$  can be considered as fully charged at  $T_{cross2}$ . It suggests  $T_{cross2}$  will decrease linearly with the increasing load current in light-load conditions.
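The lower bound in (4.21) can be turned into a delay by solving $K_1(T_D) = 1/(N+\alpha+1)$ numerically. The following Python sketch does this by bisection on the rising edge of $K_1$. The helper name `min_delay`, the search bracket $[0, T_{eq1}]$ and the on-resistance of 20 $\Omega$ are assumptions made for illustration, and the result should not be read as the $T_{cross1}$ of the fabricated design.

```python
import math

def k1(t, n, alpha, t_eq1, t_eq2):
    """K1(t) = K1'(t) + K1''(t) from (4.10) and (4.11)."""
    fast = (1.0 - math.exp(-t / t_eq1)) / (1.0 + alpha)
    slow = (1.0 / (n + alpha) - 1.0 / (1.0 + alpha)) * (1.0 - math.exp(-t / t_eq2))
    return fast + slow

def min_delay(n, alpha, t_eq1, t_eq2, iters=60):
    """Delay in [0, t_eq1] at which K1 reaches 1/(N + alpha + 1), per (4.21).

    Assumes K1 is increasing on [0, t_eq1]; the bound must already be
    exceeded at t_eq1, which is checked before bisecting.
    """
    bound = 1.0 / (n + alpha + 1.0)
    lo, hi = 0.0, t_eq1
    if k1(hi, n, alpha, t_eq1, t_eq2) < bound:
        raise ValueError("bound not reached within one T_eq1")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if k1(mid, n, alpha, t_eq1, t_eq2) < bound:
            lo = mid
        else:
            hi = mid
    return hi

# Assumed example: N = 3, per-phase C_fly = 2 x 222.7 pF, C_out = 70.4 pF, R_on = 20 ohm.
c_fly, c_out, r_on, n = 445.4e-12, 70.4e-12, 20.0, 3
alpha = c_out / c_fly
t_eq1 = r_on * c_fly * c_out / (c_fly + c_out)
t_eq2 = r_on * c_fly
t_min = min_delay(n, alpha, t_eq1, t_eq2)
print(f"lower bound on T_D ~ {t_min * 1e9:.3f} ns")
```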

#### 4.2.3 Heavy-load DLBHC Timing Analysis

One of the main limitations of the model in Section 4.2.2 is that it does not consider the heavy-load condition. However, it must be analyzed to depict the full picture of the DLBHC operation. Interestingly, while a large $I_{out}K_2(T_D)$ is expected to have a larger impact, it is not the dominant effect that degrades the voltage headroom, because $T_D$ is short. According to (4.20), it is clear that an increased $K_1(T_{clk})$ will increase the required minimal $K_1(T_D)$. From the simplified model, assuming that (4.11) and (4.12) still have reasonable accuracy in this region, (4.11) suggests that $K_1''(T_D)$ will increase rapidly once $T_{clk}$ approaches $3T_{eq2}$, requiring a longer $T_D$ to compensate for it; this point will be verified by simulation in section 4.2.5.

#### 4.2.4 Impact of Transistor Resistance Mismatch

In practice, some charge-pump phases may have different $R_{on}$ in actual implementations. This is mainly because PMOS and NMOS transistors usually have different mobility and $V_{gs}$ in an SC converter, and fully balancing the resistance can lead to excessive switching power losses due to large transistor sizes or extra driver circuits, so it may not be adopted in the design.

In these cases, having different $R_{on}$ does not significantly impact DLBHC under light-load conditions, because each loop can be considered fully charged or discharged. However, the behavior under heavy-load conditions will be slightly different. If the charging/discharging loop has different switch resistances $R_{on1}$ and $R_{on2}$, there will be three variants of $T_{rc,eq2}$ in the SC converter. Different combinations of $R_{on1}$ and $R_{on2}$ will lead to different $T_{rc,eq2}$ that increase $T_D$ in the DLBHC headroom at different frequencies. For example, if $R_{on2} > R_{on1}$, then the loop with $R_{on2} + R_{on2}$ will partially impact the performance at lower frequencies, and the loop with $R_{on1} + R_{on1}$ will impact the performance starting from a higher frequency. Therefore, we can conclude that it will result in frequency degradation starting from a lower frequency, but the change will also be slower.

Fig. 4.14. Simulated relationship between steady-state $I_{load}$ and $f_{eff}$.

#### 4.2.5 Simulation Based Analysis Verification

To verify the accuracy of the analysis, simulations are conducted to reveal the required timing for the DLBHC controller. In the simulation, a 3-phase 2:1 switched capacitor stage is selected with constant on-resistance switches; each phase consists of two 222.72 pF flying capacitors with interleaved switching stages. The output capacitor is set to 70.4 pF.

The operating condition is first set by ensuring $V_{out,min}$ reaches the reference voltage, which is 800 mV with 0.1 mV error, and then the voltage dynamics following the action of skipping one clock cycle are extracted. For the $R_{on}$ value selection, combinations of 60 $\Omega$ + 12 $\Omega$, 30 $\Omega$ + 15 $\Omega$ and 20 $\Omega$ + 20 $\Omega$, which have 20 $\Omega$ equivalent resistance in parallel, as well as 60 $\Omega$ + 60 $\Omega$ and 30 $\Omega$ + 30 $\Omega$, are simulated. This allows a comparison between equal and unequal on-resistances.

Fig. 4.15. Simulated relationship between $T_{cross2}$ and $T_{clk}$.

Fig. 4.16. Simulated relationship between $T_{cross1}$ and $T_{clk}$.

Fig. 4.14 illustrates the relationship between steady-state frequency and current for various $R_{on}$ setups. It can be observed that the relationship between the effective frequency $1/T_{clk}$ and load current is almost linear, which meets the expectation because the converter operates in the well-known slow-switching limit for most of the load current range.

Fig. 4.17. Simulated relationship between $K_{cross1}$ and $K_{vco}$ (Simulation vs Model).

Fig. 4.18. Process and temperature variation of selected $T_D$ (1.8 V to 0.8 V).



Fig. 4.19. Relationship between the sampled  $V_{out}$  and  $T_D$  during the transient recovery.

Meanwhile, the relationship between the two crossover times and operating frequency can be plotted. As shown in Fig. 4.15, $T_{cross2}$ primarily depends on the load current, confirming the conclusion made in section 4.2.2. To facilitate comparison of $T_{cross1}$, the x-axis of Fig. 4.16 is further normalized using the following equation:

$$K_{vco} = \frac{T_{clk}}{T_{eq,3rc}} \tag{4.23}$$

For matched on-resistance, $T_{eq,3rc}$ is equivalent to $T_{eq2}$ defined in (4.9). For mismatched on-resistance, we evaluate $T_{eq2}$ with the highest series resistance because it affects lower frequencies more. Hence, $T_{eq,3rc} = \max(R_1, R_2)C_{fly}$.

Because of the strong relationship between  $T_{cross1}$  and the time constant, normalization of the y-axis based on (4.10) and (4.11) is used for  $T_{cross1}$ :

$$K_{cross1} = K_1'(T_{cross1}) + K_1''(T_{cross1}) \tag{4.24}$$

Here, for unbalanced on-resistance, the parallel resistance of the charging/discharging phases is adopted for normalizing $K'_1$ and $K''_1$. In addition, (4.20) is calculated with $R_{on1} = 60\Omega$ and $R_{on2} = 12\Omega$, as well as with $R_{on1} = 30\Omega$ and $R_{on2} = 30\Omega$, for comparison. Note that when calculating (4.20) with $R_{on1} = 60\Omega$ and $R_{on2} = 12\Omega$, $K''_1(T_{cross1})$ is further approximated by:

$$K_1'' = \frac{K_{1,R_1+R_2}''}{2} + \frac{K_{1,max(R_1,R_2)}''}{4} + \frac{K_{1,min(R_1,R_2)}''}{4} \tag{4.25}$$

which is designed to include the effect of branches with different  $R_{on}$ . The result hence can be plotted as Fig. 4.17. From (4.21), the boundary of the light-load condition is supposed to be:

$$K_{min} = \frac{1}{3 + 1 + \frac{70.4}{222.7 \times 2}} = 0.2405 \tag{4.26}$$

which is close to the $K_{cross1} = 0.23$ shown in Fig. 4.17. The model also gives a reasonably good prediction of the required $T_D$ in heavy-load conditions. It can be observed that $K_{cross1}$ starts to increase after $T_{clk}$ becomes shorter than $T_{eq,3rc}$. This confirms the assumption in section 4.2.3 and section 4.2.4 that $T_{eq2}$ has the dominant impact on heavy-load performance, and verifies our analysis that mismatched $R_{on}$ causes the voltage headroom degradation to occur at lower frequencies due to the larger $T_{eq,3rc}$, but with a smaller slope. Combining Fig. 4.15, Fig. 4.16 and Fig. 4.17, a $T_D$ selected between $T_{cross1}$ and $T_{cross2}$ can guarantee the DLBHC operation.
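The value in (4.26) can be reproduced with a few lines of Python. The function name `k_min` is only an illustrative label; the inputs are the phase count and capacitor values stated for the simulation setup.

```python
def k_min(n_phases, c_fly_per_phase, c_out):
    """Light-load bound 1/(N + alpha + 1) from (4.21), with alpha = C_out / C_fly."""
    alpha = c_out / c_fly_per_phase
    return 1.0 / (n_phases + alpha + 1.0)

# Values used in (4.26): N = 3, two 222.7 pF capacitors per phase, C_out = 70.4 pF.
print(round(k_min(3, 2 * 222.7e-12, 70.4e-12), 4))  # 0.2405
```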

#### 4.2.6 Discussion on $T_D$ Selection for DLBHC Design

While the proposed model generates an area between  $T_{cross1}$  and  $T_{cross2}$ , the selection of  $T_D$  also needs to consider the trade-off between performance and efficiency.

With the above discussion, it is demonstrated that the light-load stability is less of a concern than the maximum load-current point. Designing $T_D$ for the maximum load current condition will guarantee correct DLBHC operation over the full load current range. It also helps maintain high efficiency thanks to its simple structure. A series of inverters, which generates a $T_D$ of approximately 0.8 ns with PVT variation in the range of 0.53 ns to 1.2 ns from -25°C to 125°C across all corners, is adopted in the proposed design. Also, considering the impacts of non-ideal factors in actual circuit implementations, like the variation of the 1.7 ns typical delay between the activation of the comparator and the switching of the SC converter in this design, as well as effects like the input offset of different comparators, there is additional headroom to tolerate these impacts. From Fig. 4.19, it can be observed that the sampled voltage is also close to its maximum voltage under light-load conditions, proving that having a fixed $T_D$ is a reasonable choice for a balanced trade-off between performance and efficiency. $T_D$ hence can be selected by observing the peak voltage of $V_{out}$ under maximum load current conditions.

## 4.3 Implementation with Distributed Multi-Phase DLBHC Controller

#### 4.3.1 Distributed Multi-Phase DLBHC Design

To further reduce the output ripple, multi-phase SC design is one of the commonly used techniques in fully on-chip design. Fig. 4.20 shows the proposed N-phase implementation. In this design, a non-latched comparator CMP1 works as the primary LBHC stage, and a multi-clock-triggered D flip-flop (DFF) is used to receive the comparator output of the secondary LBHC comparator from the previous stage. Comparators CMP2 and CMP3 are the secondary LBHC stages, which can trigger the secondary LBHC stage in the next phase via the DFF to allow multiple activations of the secondary LBHC stage. The details of the multi-clock-triggered DFF used in this design are shown in Fig. 4.21(c) and 4.21(d), which is simplified from [91]. Hence the DFF triggers a flip of $\Phi_{1,cp}$ for every rising edge from either CMP1, CMP2 or CMP3. The dead-time generation mechanism is similar to [64] but does not include the adaptive dead-time, and the details of its implementation are omitted here as it is a commonly used structure. The inverter-based oscillator is similar to Fig. 4.9, except that all phases produce outputs. The level-shifter is based on [92], with its detailed structure illustrated in Fig. 4.21(a). The frequency controller also follows a principle similar to Fig. 4.9, but is designed to accept input from all phases, as shown in Fig. 4.11 and introduced in section 4.1.4. In this implementation, N is set to 3.

Fig. 4.20. System block diagram of the proposed N-phase DLBHC SC Converter.

As shown in Fig. 4.22(a), when $f_{clk} = 2f_{opt}$, assuming $V_{out}$ is smaller than $V_{ref}$ when the clock edge arrives at the previous phase, $\Phi_{i-1}$ will trigger an inversion of $\Phi_{i-1,cp}$ and pump charge into the capacitor $C_{out}$. This action increases $V_{out}$. When the next rising edge of $\Phi_i$ arrives, $V_{out}$ will be larger than $V_{ref}$ at phase $i$ due to the high $f_{clk}$. As a result, the rising edge of $\Phi_i$ is ignored and no inversion will be triggered on $\Phi_{i,cp}$. Hence, for every two $\Phi_i$ cycles, one is skipped, which leads to $N_{pump-skip,eff} = -1/2$. This also triggers a rising edge on $\Phi_{down,i}$, indicating that the frequency should be decreased.

Fig. 4.21. Detailed circuits of (a) Level shifter (b) Pulse generator (c) Multiple-clock triggered DFF.

When $f_{clk} = f_{opt}/2$, the maximum $V_{out}$ pumped by $\Phi_i$ is below $V_{ref}$. When $\Phi_{i,d}$ arrives, it triggers the secondary LBHC, as depicted in Fig. 4.22(b). A rising edge hence is generated at $\Phi_{pump,i}$ by CMP2 or at $\Phi_{pump2,i}$ by CMP3. As $\Phi_{pump,i}$ and $\Phi_{pump2,i}$ are the clock inputs of the DFF in phase $i + 1$, they will trigger the inversion of $\Phi_{i+1,cp}$ in advance. Additional switching hence is inserted by the secondary LBHC comparators, which makes $N_{pump-skip,eff} = 1$.

In Fig. 4.22(c), when the rising edge of  $\Phi_i$  arrives,  $\Phi_{i,cp}$  will be triggered because  $V_{out}$  is smaller than  $V_{ref}$ . When  $\Phi_{i,d}$  is inverted,  $V_{out}$  is pumped above  $V_{ref}$ . This generates a zero  $N_{pump-skip,eff}$  when  $f_{clk} = f_{opt}$ . Moreover, comparing this implementation with the single phase concept in Fig. 4.9,  $N_{pump}$  and  $N_{skip}$  cancel each other by one, hence eliminating unnecessary operations when regulating frequency.
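The three cases above can be summarised as a per-clock decision. The following Python sketch is a behavioural abstraction only: the function names, the two sampled voltages per cycle and the example voltage values are assumptions for illustration, and the circuit itself realises this decision with comparators and DFFs rather than software.

```python
from fractions import Fraction

def dlbhc_cycle(v_at_clock, v_after_delay, v_ref):
    """Return (n_pump, n_skip) for one clock edge of one phase.

    v_at_clock:    V_out seen by the primary LBHC at the clock edge
    v_after_delay: V_out sampled by the secondary LBHC after T_D
    """
    if v_at_clock >= v_ref:
        return 0, 1          # primary LBHC ignores the clock edge (skip)
    if v_after_delay < v_ref:
        return 1, 0          # secondary LBHC inserts an extra switching event
    return 0, 0              # one pump was enough: no correction

def n_pump_skip_eff(cycles, v_ref):
    """Average (pump - skip) count over a window of clock cycles."""
    total = sum(p - s for p, s in (dlbhc_cycle(a, b, v_ref) for a, b in cycles))
    return Fraction(total, len(cycles))

V_REF = 0.80  # volts, illustrative
too_fast = [(0.79, 0.82), (0.81, 0.81)] * 2   # every second edge is skipped
too_slow = [(0.77, 0.79)] * 4                 # V_out stays below V_ref after T_D
matched = [(0.79, 0.82)] * 4                  # each pump just restores V_out

print(n_pump_skip_eff(too_fast, V_REF))   # -1/2
print(n_pump_skip_eff(too_slow, V_REF))   # 1
print(n_pump_skip_eff(matched, V_REF))    # 0
```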

It is also worth mentioning here that $T_D$ defined by the analysis in section 4.1.2 is the delay between the switching of the SC stage and the sampling of the secondary LBHC, and hence is the delay between $\Phi_{i,cp}$ and $\Phi_{i,d}$ in Fig. 4.20. In this design, $T_D$ is set to around 0.8 ns to guarantee proper DLBHC operation, as discussed in detail in section 4.2.

Fig. 4.22. Conceptual graph of DLBHC operations when (a) $f_{clk} = 2f_{opt}$, (b) $f_{clk} = f_{opt}/2$ and (c) $f_{clk} = f_{opt}$.

Note that it is also practical to use a central DLBHC control to drive multiple phases via phase interleaving techniques [93]. But compared to the implementation in this paper, both the phase interleaving and the timing control require extra power, hence degrading the overall power conversion efficiency. Therefore, having a separate DLBHC unit in each phase is a more balanced approach.

#### 4.3.2 Simulated Transient Behavior of Proposed Design

Fig. 4.23, Fig. 4.24, and Fig. 4.25 show the simulated waveforms of the proposed DLBHC control with step-up/step-down load currents. With the selected $T_D = 0.8$ ns, the proposed DLBHC demonstrates robust voltage regulation. For the load transients without $V_{turbo}$, the proposed DLBHC has a worst-case response time proportional to $1/f_{sw}$, where $f_{sw}$ is the effective switching frequency, equal to $3f_{clk}$ in this implementation. Therefore, the worst-case response time is 5.93 $\mu s$ for $I_{out} = 60 \ \mu A$ and 5.27 ns for $I_{out} = 6 \ mA$. In turbo mode, the system can actively request a fast transient response by sending a pulse wave with $V_{turbo}$. The frequency controller will actively boost the frequency to its maximum value, which is 81.1 MHz in this design, hence reducing $1/f_{sw}$ significantly. In this design, 55 ns is sufficient for the frequency controller to boost the VCO frequency. With the proposed DLBHC regulating the output voltage, controlling the frequency via $V_{turbo}$ does not require considering the overall system stability, which is favorable for system design.

Fig. 4.24. Simulated waveform of 6 mA - 6 $\mu$A down transient w/o turbo.

Fig. 4.25. Simulated waveform of 6 $\mu$A - 6 mA up transient w/ turbo.

Fig. 4.26. Chip microphotograph.

#### 4.3.3 Measurement Result

Fig. 4.26 shows the microphotograph of the prototype chip. The proposed SC converter is designed in 180-nm CMOS technology. Three phases are implemented, each consisting of two 1/2 SC converters with reversed phases. Six 222.7 pF MIM-capacitor-based $C_{fly}$ are integrated. The controller is placed underneath the 70.4 pF output capacitor $C_{out}$.



Fig. 4.27. Comparison of the measured load regulation between the proposed DLBHC and its conventional LBHC mode.



Fig. 4.28. Comparison of the measured efficiency between the proposed DLBHC and its conventional LBHC mode during load regulation.



Fig. 4.29. Comparison of the measured VCO frequency between the proposed DLBHC and its conventional LBHC mode during load regulation.



Fig. 4.30. Comparison of the measured line regulation between the proposed DLBHC and its conventional LBHC mode.



Fig. 4.31. Comparison of the measured efficiency between the proposed DLBHC and its conventional LBHC mode during line regulation.

Fig. 4.27 presents the measurement results of the load regulation. Fig. 4.28 shows the measurement results of the power conversion efficiency. The peak power conversion efficiency reaches 80.4% with an input voltage of 1.8 V and a reference voltage of 0.81 V. The figure also indicates that the frequency is effectively regulated at different load current levels. The power conversion efficiency is over 75% for a 28 $\mu$A to 6.4 mA load current range with 1.8 V input, which is a $228\times$ current range. The load voltage is effectively regulated. To compare the proposed design with conventional LBHC mode control, a measurement comparison is made by disabling the secondary LBHC comparators (CMP2 and CMP3) and the frequency controller. With the frequency set to the same level as at the maximum current point, the primary LBHC comparator (CMP1) emulates the conventional LBHC control. The performance is measured and shown as LBHC mode in Fig. 4.27 and Fig. 4.28, showing a significant improvement of the light-load efficiency compared to conventional LBHC. Moreover, the linear relationship in Fig. 4.29 aligns with the simulation result in Fig. 4.14.

Fig. 4.32. Comparison of the measured VCO frequency between the proposed DLBHC and its conventional LBHC mode during line regulation.

Fig. 4.33. Measured transient response w/ turbo from 40 $\mu$A to 4 mA.

The line regulation is also measured, as shown in Fig. 4.30, Fig. 4.31 and Fig. 4.32. Increased voltage drop enlarges the ripple and causes $V_{out}$ to drift because DLBHC regulation relies on $V_{out,min}$ and the sampled $\Delta V_{out}$. A high voltage drop from the ideal conversion ratio can induce power losses, which are often referred to as conduction losses. On the other hand, an insufficient voltage drop from the ideal conversion ratio can lead to a faster clock speed. This forces the DLBHC to enter the fast-switching limit at a lower load current level, therefore reducing the efficiency. This result emphasizes the value of the multi-ratio SC converter design, which could be combined with the proposed DLBHC control for enhanced overall performance.

Fig. 4.34. Measured transient response w/ turbo from 10 $\mu$A to 6 mA.

Fig. 4.35. Measured transient response w/o turbo from 10 $\mu$A to 6 mA.

Fig. 4.36. Analysis and comparison between the measured step-up load transient w/o $V_{turbo}$ (left, from Fig. 4.35) and w/ $V_{turbo}$ (right, from Fig. 4.34).

Fig. 4.33, Fig. 4.34 and Fig. 4.35 show the measured transient response of the proposed SC converter. In Fig. 4.33, a pulse wave is applied to $V_{turbo}$ and covers the rising edge of the 40 $\mu A$ - 4 mA transient measurement. The measured current rise time is 25 ns and $V_{turbo}$ is applied 55 ns earlier than the rise of $I_{load}$. Therefore, an 80 ns transient response time is achieved. In Fig. 4.34, a pulse is applied to $V_{turbo}$ in advance but does not cover the current rising edge. It takes 157 ns from the rising edge of the pulse wave to the point where the load current reaches 6 mA. A very small voltage drop during this process is observed, with 168 mV $V_{pp}$ including the ripple. Compared with the intrinsic response, which is shown in Fig. 4.35 and Fig. 4.36, $V_{turbo}$ significantly enhances the transient performance by boosting the clock speed, and DLBHC ensures the voltage regulation. This can be observed by comparing the effective frequency extracted from the ripple, $1/T_{ripple}$, and the effective VCO frequency $3/T_{vco}$. Hence it allows a flexible dynamic trade-off between speed and efficiency at the system level. The maximum frequency limit is set to 34 MHz for the measurement at 1.8 V and has been demonstrated to operate correctly in both the steady state and the transient state. This frequency limit is about 10% higher than the steady-state frequency. Because of the high clock speed under heavy-load conditions, DLBHC can respond rapidly to step-down transients. As can be observed in Fig. 4.37, DLBHC skips multiple clock cycles to recover $V_{out}$. The larger overshoot compared to the simulation is caused by the parasitic inductance in the measurement setup. When not in steady state, the outputs of the DFFs in Fig. 4.11, $\Phi_{down}$ and $\Phi_{up}$, generate clock edges to control the frequency. When steady state is reached, $\Phi_{down}$ and $\Phi_{up}$ also stop, which means that $\Phi_{skip}$ and $\Phi_{pump}$ have stopped triggering the DFFs. Hence, the overall behavior is similar to the simulation results in Fig. 4.23, Fig. 4.24 and Fig. 4.25.

Fig. 4.37. Analysis of measured step-down load transient (from Fig. 4.34).

# 4.4 Implementation with Centralized Multi-Phase DLBHC Controller

#### 4.4.1 Circuit Design

In addition to the distributed design, a centralized DLBHC control loop is proposed. The overall structure of the proposed design is shown in Fig. 4.38. Delay compensation circuits are inserted between the VCO and the DLBHC controller to adjust $T_{D,VCO}$. Mode control is implemented to operate the charge pump in an 8-phase mode under light-load conditions, which guarantees a smaller ripple and maintains constant efficiency. Moreover, pattern detection is implemented in the frequency control loop to shorten the intrinsic transient response time to half of the DLBHC operating period, which is one clock cycle.

#### 4.4.2 Delay Compensations

It is not difficult to notice that the moment of measuring  $V_{o,1}$  defines a sampling window  $T_{sample}$ . In previous analysis, it is assumed that Loop 1 only triggers once for each clock cycle, as is the case shown in Fig. 4.39(b). This is because the frequency control is typically integrated with a VCO, which has a fixed phase for its output,



Fig. 4.38. Proposed centralized DLBHC controller with delay compensation.



Fig. 4.39. Analysis of DLBHC operations in time-domain: (a) Flowchart of operation. (b) Steady-state with matched frequency. (c) Transient state.

and having only one LBHC Loop 1 operation makes $T_{sample}$ equal to $T_{vco}$. A stable zone is thus defined for planning the design of the distributed DLBHC. In contrast, if LBHC Loop 1 is triggered multiple times, as in the case depicted in Fig. 4.39(c), $T_D$ introduces an error between $T_{sample}$ and $T_{vco}$, hence degrading



Fig. 4.40. Details of delay compensation.



Fig. 4.41. Timing of delay compensation.

the overall DLBHC performance. To explain this, note that at the end of one phase the charge-sharing current has decayed exponentially, so the load current dominates the variation of $V_o$. Therefore, for simplicity, the difference in sampled voltage caused by the additional $T_D$ can be modeled as:

$$\Delta V_{o,1} \approx \frac{\Delta Q}{C_o + \sum C_{fly}} = \frac{I_o N T_D}{C_o + \sum C_{fly}}$$
(4.27)



Fig. 4.42. Simulated timing diagram of delay compensation during end of load transient from 7 mA to 8 mA.



Fig. 4.43. (a) Mode control circuit (b) Clock timing.

where *N* represents the number of additional LBHC Loop 1 triggers. Evidently, $\Delta V_{o,1}$ reduces the effective $\Delta V_{deadzone}$ during the transient response under heavy-load conditions. Without an effective $\Delta V_{deadzone}$, the loop cannot produce a correct $F_h$ for the frequency controller. However, the DLBHC loop can determine the frequency difference relative to the ideal frequency whenever the definition of $T_{sample}$ matches $T_{vco}$, and this holds true even if multiple LBHC Loop 1 operations are triggered. Compensation techniques are hence proposed to address this issue in the centralized DLBHC.
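As a rough numerical illustration of Eq. (4.27), the following Python sketch evaluates the sampling error for a few values of *N*. The capacitances are taken from the centralized-design column of Table 4.1; the load current and the delay $T_D$ are assumed values chosen only for illustration, and the function name `delta_v_o1` is hypothetical.

```python
def delta_v_o1(i_load_a, n_extra, t_d_s, c_out_f, c_fly_total_f):
    """Eq. (4.27): sampled-voltage error caused by N extra Loop 1 triggers."""
    return i_load_a * n_extra * t_d_s / (c_out_f + c_fly_total_f)


C_OUT = 0.144e-9        # F, output capacitor (Table 4.1, centralized design)
C_FLY_TOTAL = 1.369e-9  # F, total flying capacitance (Table 4.1)
I_LOAD = 8e-3           # A, assumed heavy-load current (illustrative only)
T_D = 5e-9              # s, assumed delay (illustrative only)

for n in (1, 2, 3):
    dv = delta_v_o1(I_LOAD, n, T_D, C_OUT, C_FLY_TOTAL)
    print(f"N = {n}: delta V_o1 = {dv * 1e3:.1f} mV")
```

Under these assumptions the error grows linearly with *N*, which is why repeated Loop 1 triggers can consume the dead zone at heavy load.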

| State                             | Trigger         | Phase k-3       | Phase k-2       | Phase k-1       | Current phase k |
|-----------------------------------|-----------------|-----------------|-----------------|-----------------|-----------------|
| Steady-state                      | $V_{SST}$       | Any             | $\Phi_{pump}$   | $\Phi_{skip}$   | $\Phi_{pump}$   |
| Normal transient (load step-up)   | $V_{up}$        | $\Phi_{pump}$   | $\Phi_{skip}$   | $\Phi_{pump}$   | $\Phi_{pump}$   |
| Turbo transient                   | $V_{turbo,1}$   | Any             | $\Phi_{pump}$   | $\Phi_{pump}$   | $\Phi_{pump}$   |
| Normal transient (load step-down) | $V_{down}$      | $\Phi_{skip}$   | Any             | $\Phi_{skip}$   | $\Phi_{skip}$   |

Fig. 4.44. (a) State-detection (b) Frequency controller.

#### 4.4.3 Centralized DLBHC with Delay Compensation

Fig. 4.40 shows the detailed implementation of the delay compensation circuits. Using the states $C_1$ and $C_2$ detected from the four phases $\Phi_1$, $\Phi'_1$, $\Phi_2$, $\Phi'_2$, which are generated by the phase interleaver, the circuit selects the proper phase at the moment $\Phi_{skip}$ or $\Phi_{pump}$ is triggered to compensate for the delay, as shown in Fig. 4.41.

As shown in the timing diagram of Fig. 4.42, the output clock phase $\Phi_c$ is dynamically shifted at the rising edges of $\Phi_{pump}$ and $\Phi_{skip}$. The sampling time $T_{sample1}$ hence becomes close to the steady-state sampling time $T_{sample2}$. Although the compensation resolution is limited to half of $T_{vco}$, it effectively compensates the time offset $T_D$, hence supporting a higher load current. $T_D$ is implemented at the clock input port of the comparator, providing a time offset and eliminating potential glitches during delay compensation.
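The effect of the half-period compensation resolution can be illustrated with a small idealized model. The sketch below is not the circuit of Fig. 4.40: it simply assumes that the compensation picks the multiple of $T_{vco}/2$ nearest to $T_D$ (a selection rule that is an assumption made here) and reports the residual timing error. The function name and the numerical values are hypothetical.

```python
import math


def compensate_delay(t_d_ns, t_vco_ns):
    """Quantize the delay to the nearest multiple of T_vco / 2.

    Returns (number of half-period steps, residual error in ns).
    Nearest-step selection is an assumption of this sketch.
    """
    step = t_vco_ns / 2.0
    n_steps = math.floor(t_d_ns / step + 0.5)
    residual = t_d_ns - n_steps * step
    return n_steps, residual


# Illustrative values only: T_D = 40 ns with a 100 ns VCO period.
steps, residual = compensate_delay(40.0, 100.0)
print(f"shift by {steps} half-period(s), residual {residual:.1f} ns")
```

In this model the residual never exceeds a quarter of $T_{vco}$ in magnitude, so the benefit of compensation grows as $T_D$ becomes comparable to the clock period.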

#### 4.4.4 Dual-Mode Operations

Although a multiphase design can reduce ripple, it places stress on the controller because a higher frequency is required to operate the additional phases. However, it is possible to reduce the sampling rate of the DLBHC loop without degrading stability. This is because a single DLBHC sample can extract the relationship between pumped charge and consumed charge, hence generating a proper $F_h$. A dual-mode design is hence proposed in this work. As shown in Fig. 4.43, two D flip-flops monitor the phase changes and switch the charge pump to 4-phase mode under heavy-load conditions, when $T_D$ becomes comparable to $T_{vco}$, thus extending the load current range. Meanwhile, with a light-load current, a low $V_{mode}$ enables the 8-phase mode for a smaller ripple. The additional clock signals are passed directly through $V_{pump2}$ without triggering DLBHC operations, thus maintaining a constant sampling frequency.

#### 4.4.5 Frequency Control Design

The proposed frequency control is implemented by a charge pump that controls the biasing voltage of the VCO, as shown in Fig. 4.44(b). With a single clock input, the mismatch between different VCO phases is less significant, hence a delay-gate-based oscillator [27] is used to implement the VCO. This provides a much wider tuning range and is more efficient at low frequencies due to the removal of the level shifter. Ideally, the average transient response time of the proposed DLBHC control can be minimized to one clock cycle under light-load conditions. This is because the DLBHC operates once every 2 clock cycles in 8-phase mode, hence 1 clock cycle is the average time to respond to the transient. To achieve this response time, we implement flip-flop-based state detection, as shown in Fig. 4.44(a). As a light-to-heavy load transient triggers $V_{pump}$ multiple times in succession, it triggers $V_{turbo1}$ and hence $V_{turbo}$ to maximize the frequency. $V_{turbo,ext}$ also accepts external requests for active transient behavior. Moreover, the state detection filters out noise signals from $V_{pump}$ and $V_{skip}$ that do not follow a steady-state pattern, thus improving the accuracy of frequency regulation.
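The state detection can be read as a pattern match over the last four DLBHC decisions. The following Python sketch is a behavioral illustration only, using the patterns as read from Fig. 4.44(a); the state names, the `detect_state` function, and the choice to return `None` for unlisted sequences (treating them as noise) are assumptions of this sketch rather than a description of the flip-flop implementation.

```python
PUMP, SKIP, ANY = "pump", "skip", None

# Patterns over phases (k-3, k-2, k-1, k), as read from Fig. 4.44(a).
# ANY matches either decision.
PATTERNS = {
    "steady_state": (ANY, PUMP, SKIP, PUMP),
    "step_up": (PUMP, SKIP, PUMP, PUMP),    # normal load step-up
    "turbo": (ANY, PUMP, PUMP, PUMP),       # consecutive pumps
    "step_down": (SKIP, ANY, SKIP, SKIP),   # normal load step-down
}


def detect_state(history):
    """Classify the last four decisions (oldest first); None if no match."""
    if len(history) < 4:
        return None
    window = tuple(history[-4:])
    for name, pattern in PATTERNS.items():
        if all(p is ANY or p == e for p, e in zip(pattern, window)):
            return name
    return None  # unlisted sequence: no trigger is generated


print(detect_state([SKIP, PUMP, PUMP, PUMP]))
print(detect_state([PUMP, PUMP, SKIP, SKIP]))
```

The four listed patterns are mutually exclusive, so the order in which they are checked does not affect the result.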

#### 4.4.6 Measurement Results

Fig. 4.45 shows the microphotograph of the implemented chip. The proposed converter is implemented in 180-nm technology. There are a total of 16 flying capacitors, each with a capacitance of 86.6 pF. Two flying capacitors constitute one phase. Meanwhile, a 144 pF output capacitor is selected to store the charge at the output node.

Fig. 4.46 shows the load regulation of the proposed chips. The peak power efficiency is 68.1%. Over 65% efficiency is maintained from 5.3 $\mu$W to 6.9 mW output power thanks to the proper scaling of controller power with frequency, as shown in Fig. 4.47. The



Fig. 4.45. Chip microphotograph.



Fig. 4.46. Load regulation and efficiency.

$P_{DLBCH+DC}$ refers to the power consumed by the delay compensation circuit and the core DLBHC loop. $P_{FC}$ refers to the power consumed by the frequency control. The lower boundary is mainly extended by the delay-gate-based VCO, and the proposed delay compensation enhances the heavy-load performance.



Fig. 4.47. Frequency and power consumption of controller.

Fig. 4.48 and Fig. 4.49 show the transient performance of the DLBHC controller. The intrinsic response time scales with the operating frequency, which is an inevitable trade-off between light-load efficiency and speed. However, the DLBHC loop can reduce the average transient time to half of the DLBHC loop operating period, which is one clock cycle in 8-phase mode. For example, the frequency at 0.3 mA load current is measured to be 5.38 MHz, giving a 185.9 ns clock period. The statistics of the transient time for the 0.3 mA - 3 mA step, shown in Fig. 4.49, also support this point. For higher demands on transient speed, an external $V_{turbo}$ can be applied to boost the clock in advance, as shown in Fig. 4.50.
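The clock-period figure quoted above follows directly from the measured frequency, and the one-cycle average can be reproduced with a short calculation. The Python sketch below assumes that a load step arrives at a uniformly random time within the DLBHC sampling interval of two clock cycles, so that the mean wait is half of that interval; this uniform-arrival assumption and the function names are introduced here for illustration.

```python
def clock_period_ns(freq_mhz):
    """Clock period in ns for a frequency given in MHz."""
    return 1e3 / freq_mhz


def average_response_ns(freq_mhz, clocks_per_sample=2):
    """Mean wait until the next DLBHC sample, assuming uniform step arrival."""
    sampling_interval = clocks_per_sample * clock_period_ns(freq_mhz)
    return 0.5 * sampling_interval


f_meas = 5.38  # MHz, measured at 0.3 mA load current
print(f"clock period:     {clock_period_ns(f_meas):.1f} ns")
print(f"average response: {average_response_ns(f_meas):.1f} ns")
```

With two clock cycles per sample, the average response equals one clock period, which matches the 185.9 ns value stated in the text.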

Fig. 4.51 shows the detailed power consumption obtained from measurement. Of the power, 13% is calculated as the charge-sharing loss $P_{CS}$, and 6% is lost elsewhere in the charge-pump conversion, $P_{other}$, which includes the parasitic losses and the power of the phase interleaver. Meanwhile, 5% is consumed by



Fig. 4.48. Transient response without external activation.

the frequency control loop $P_{FC}$, including the VCO. A further 8%, $P_{DLBCH+DC}$, is required by the DLBHC controller loop and the delay compensation circuits.
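As a consistency check on this breakdown, the loss components can be summed and compared with the reported efficiency. The Python sketch below assumes that each percentage is a fraction of the input power at the 1 mA operating point of Fig. 4.51; the dictionary keys mirror the symbols used in the text, and the function name is hypothetical.

```python
# Loss components in percent, as stated for Fig. 4.51.
# Assumption: each value is a fraction of the input power.
LOSSES_PERCENT = {
    "P_CS": 13.0,        # charge-sharing loss
    "P_other": 6.0,      # parasitics and phase interleaver
    "P_FC": 5.0,         # frequency control loop, including the VCO
    "P_DLBCH+DC": 8.0,   # DLBHC loop and delay compensation
}


def efficiency_from_losses(losses_percent):
    """Remaining output power in percent after subtracting all losses."""
    return 100.0 - sum(losses_percent.values())


print(f"total loss: {sum(LOSSES_PERCENT.values()):.0f} %")
print(f"efficiency: {efficiency_from_losses(LOSSES_PERCENT):.0f} %")
```

The rounded components leave roughly 68% at the output, which is in line with the 68.1% peak efficiency reported for Fig. 4.46. The controller-related terms ($P_{FC}$ and $P_{DLBCH+DC}$) together account for 13 percentage points.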

Comparing the performance with state-of-the-art designs in Table 4.1, the power density is improved relative to the previous DLBHC controller [94]. Although controller complexity limits efficiency, this research still demonstrates a path to improving the current range and power density of DLBHC control, which might be further enhanced with an advanced technology node.

### 4.5 Comparison of the Performance

Table 4.1 compares the results with the state-of-the-art research. Without proper controller power scaling, it is challenging to maintain good efficiency over a wide load current range [42,72]. While the overall efficiency can be enhanced by large



Fig. 4.50. Transient response with external activation.



Fig. 4.51. Breakdown of power conversion efficiency at 1 mA load current.

flying capacitors and scalable parasitic redistribution techniques [42], or an accurate topology conversion ratio [72], constant controller power consumption degrades the light-load performance because of the fixed controller frequency. While [3,46] scale the power according to the load current, [3] requires eight comparator operations per phase, which is less efficient than our proposed DLBHC design, which uses 2 comparator operations per phase. The pseudo clock frequency of [46] works well at output levels of hundreds of milliwatts with the off-chip design, and shows a load range comparable with our design. However, switching between a limited number of fixed-frequency clock inputs cannot match the frequency perfectly and degrades the efficiency when the optimum frequency lies between the clock inputs. In [93], the centralized DLBHC controller with phase interleaving consumes excessive power, limiting the overall power conversion efficiency. Therefore, the proposed DLBHC loop delivers better performance in maintaining the efficiency under different load levels, which is

|                                               | TPE 19 [72]                                | TCAS-I 14 [46]                                | JSSC 16 [42]           | TVLSI 17 [3]                           | This work<br>Centralized<br>DLBHC [93]                                  | <b>This work</b><br>Distributed<br>DLBHC [60]                                                       |
|-----------------------------------------------|--------------------------------------------|-----------------------------------------------|------------------------|----------------------------------------|-------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------|
| Process                                       | 250 nm                                     | 55 nm                                         | 40 nm                  | 180 nm                                 | 180 nm                                                                  | 180 nm                                                                                              |
| C <sub>cfly</sub>                             | 10 nF                                      | 2×100 nF<br>(off-chip)                        | 10 nF                  | 1.5 nF                                 | 1.369 nF                                                                | 1.336 nF                                                                                            |
| Cout                                          | 0                                          | 1000 nF<br>(off-chip)                         | 0                      | 2 nF (off-chip)                        | 0.144 nF                                                                | 0.07 nF                                                                                             |
| Area<br>(mm <sup>2</sup> )                    | 7.14                                       | 1.32                                          | 2.2                    | 0.495                                  | 1.79                                                                    | 1.85                                                                                                |
| Topology                                      | Asymmetrical<br>Shunt: 187 Ratios          | 1/3                                           | 1/2                    | 1/2                                    | 1/2                                                                     | 1/2                                                                                                 |
| Input<br>Voltage (V)                          | 3.3                                        | 3.3                                           | 1.855 - 2.07           | 1.8                                    | 1.8                                                                     | 1.4 - 1.8                                                                                           |
| Output<br>Voltage (V)                         | 0.4 - 2.8                                  | 1                                             | 0.9                    | 0.8                                    | 0.75                                                                    | 0.62 - 0.81                                                                                         |
| P <sub>out,max</sub><br>(mW)                  | 14.5 *<br>(10 mA & 1.45 V)                 | 250                                           | 3.85                   | 5                                      | 6.9                                                                     | 5                                                                                                   |
| P <sub>out</sub> /C <sub>fly</sub><br>(mW/nF) | 1.45                                       | 2.5                                           | 0.385                  | 3.33                                   | 5.04                                                                    | 3.74                                                                                                |
| Power Density<br>(mW/mm <sup>2</sup> )        | 2.03                                       | 189.4<br>(excluding<br>off-chip caps.)        | 1.75                   | 10.1                                   | 3.85                                                                    | 2.7                                                                                                 |
| I <sub>load</sub> (mA)<br>w/ Eff. >75%        | 0.5 - 10 (20x)<br>@ 1.5 V V <sub>out</sub> | 1* - 250 (250x)                               | 0.1 - 4.28<br>(42.8x)* | 0.22 - 7<br>(31.8x)*                   | none                                                                    | 0.028 - 6.4 <b>(228x)</b>                                                                           |
| I <sub>load</sub><br>Step (mA)                | 0 - 7                                      | 25 - 70, 70 - 130                             | 0 - 4.25               | 0.2 - 2                                | 0.03 - 3                                                                | 0.04 - 4, 0.01 - 6                                                                                  |
| Response<br>Time                              | 1.5 μs                                     | 20 μs<br>@ 25 - 70 mA<br>7 μs @ 70 - 130 mA   | 8 ns                   | 250 ns* $\approx \frac{1}{f_{sw}}$ | 185.9 ns (w/ Turbo)<br>$\approx \frac{1}{f_{sw}}$ (w/o Turbo) | 80 ns @ 0.04 - 4 mA<br>157 ns @ 0.01 - 6 mA<br>(w/ Turbo)<br>$\approx \frac{1}{f_{sw}}$ (w/o Turbo) |
| V <sub>pp</sub> (mV)<br>@ Up Tran.            | 180*                                       | 100*<br>@ 25 - 70 mA<br>80*<br>@ 70 - 130 mA  | 48*                    | 66*                                    | 300*                                                                    | 151 @ 0.04 - 4 mA<br>168 @ 0.01 - 6 mA                                                              |
| FoM<br>(mV*nF/mA)                             | 257                                        | 2666<br>@ 25 - 70 mA<br>1600<br>@ 70 - 130 mA | 112.9                  | 128.3                                  | 152.8                                                                   | 53.6 @ 0.04 - 4 mA<br>39.4 @ 0.01 - 6 mA                                                            |

TABLE 4.1: Comparison of Performance

\* Estimated number from paper

 $FoM = \frac{C_{total}V_{pp,tran}}{I_{step}}$ 

over 75% efficiency across a 228× load current range. In terms of transient response, because the worst-case intrinsic transient response speed is proportional to the switching frequency $f_{sw}$, it is inevitably limited by the low-power constraint in light-load conditions. However, once the load requests a fast transient response by sending a

pulse  $V_{turbo}$ , the frequency is actively boosted. The overall Figure of Merit (FoM) of the turbo mode is better than the high-speed LBHC used in [42] and other designs [3,46,72,93].
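The FoM entries for this work in Table 4.1 can be reproduced from the other rows of the table. The Python sketch below assumes that $C_{total}$ is the sum of the flying and output capacitances and that $I_{step}$ is the difference between the two ends of the load step; both readings are assumptions made here, chosen because they reproduce the tabulated values. The function name `fom` is hypothetical.

```python
def fom(c_fly_nf, c_out_nf, vpp_mv, i_low_ma, i_high_ma):
    """FoM = C_total * V_pp,tran / I_step, in mV*nF/mA."""
    c_total = c_fly_nf + c_out_nf   # assumed definition of C_total
    i_step = i_high_ma - i_low_ma   # assumed definition of I_step
    return c_total * vpp_mv / i_step


# (C_fly [nF], C_out [nF], V_pp [mV], I_low [mA], I_high [mA]) from Table 4.1
rows = {
    "Distributed DLBHC, 0.04 - 4 mA": (1.336, 0.07, 151, 0.04, 4),
    "Distributed DLBHC, 0.01 - 6 mA": (1.336, 0.07, 168, 0.01, 6),
    "Centralized DLBHC, 0.03 - 3 mA": (1.369, 0.144, 300, 0.03, 3),
}

for name, args in rows.items():
    print(f"{name}: FoM = {fom(*args):.1f} mV*nF/mA")
```

Because the FoM multiplies capacitance by voltage deviation and divides by the current step, a design that holds a small $V_{pp}$ with little capacitance over a large step obtains a small value.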

### 4.6 Discussions

In this chapter, while keeping the robustness of conventional LBHC control, the idea of adding a secondary LBHC to detect the frequency with a minimum of two samples per clock cycle is analyzed and demonstrated. Because this mechanism decouples the stability of load voltage regulation from the operating frequency, it allows external adjustment of the performance to provide a fast active transient response. Also, by guiding the frequency and power scaling through the DLBHC operation, a consistent efficiency over a wide load current range is achieved. The delay time required for designing a stable DLBHC loop is also analyzed by studying the boundary condition of DLBHC operation. Overall, our prototype of the distributed DLBHC design demonstrated over 75% efficiency across a 30 $\mu$A - 6.4 mA load current range, which is significantly wider than other state-of-the-art designs, together with a fast active transient response. As this research mainly focuses on controller design, the proposed control is a competitive candidate for combination with other performance optimization techniques to build high-performance PMIC systems for always-on IoT applications.

### Chapter 5

## Conclusion

In this thesis, the design challenges that exist in always-on IoT applications are addressed from two aspects. First, for the efficient operation of SCPC under ultra-low voltage and power conditions, Chapter 3 pointed out that the transistor on-resistance is a crucial parameter in the overall performance of power converters. Because of the constraints on the power level, trade-offs across the entire system become crucial when investigating effective approaches for boosting performance. By comparing different approaches for optimizing transistor performance, it was found that the proposed gate voltage optimization methods can achieve the target on-resistance at minimal cost, hence achieving competitive overall power conversion efficiency in measurement even when conventional topologies are adopted to demonstrate the concept.

Meanwhile, to address the challenges of standby power conversion efficiency and transient speed during wake-up events in IoT sensors, a dual lower-bound hysteretic control method is described in Chapter 4. The proposed method reduces the power consumption of the hysteretic control loop to 2 comparisons per phase and converts the state of operation into the difference between the optimum state and the current state, hence guiding the frequency control loop to optimize the power conversion efficiency under light-load conditions by scaling the frequency. Also, owing to the isolation provided by the DLBHC loop, the frequency of the SCPC can be controlled and boosted adaptively by external circuits, hence achieving a fast transient response during wake-up events. To better demonstrate the concept, a centralized and a distributed DLBHC are implemented, and the measurements show that the distributed design has higher efficiency because its controller is less complicated.

Overall, by demonstrating the effectiveness of these two fundamental techniques for enhancing the performance of SCPC in IoT applications, we showed how they improve performance over conventional approaches. As they are compatible with many state-of-the-art efficiency optimization techniques, we hope that this thesis can enable wider use of SCPC in various IoT applications, which need to combine multiple techniques to form a sophisticated design.

# Acknowledgements

I would like to express my deep and sincere gratitude to my research supervisor, Prof. Hiroki Ishikuro, and my co-supervisor, Prof. Huang, for giving me the opportunity to conduct this research and providing me with invaluable guidance throughout this process. It is a great privilege and honor to study in the Ishikuro Laboratory under their instruction. I am very grateful for what they have offered.

My sincere thanks also go to the members of the Ishikuro Laboratory, for their friendship, for helping me with my life and studies, and for the enjoyment we shared during the past years.

Special thanks also go to Keio University, especially the Graduate School of Science and Technology, for the opportunity they have provided to me and the knowledge they have imparted.

I am extremely grateful to my parents as well, for their support of and sacrifices for my life and choices. It is they who push me to become the person I wish to be and to live the life I desire.

Yi Tan, January, 2024

## Bibliography

- [1] "State of IoT 2023: Number of Connected IoT Devices Growing 16% to 16.7 Billion Globally (accessed Jan. 2024)," https://iot-analytics.com/number-connected-iot-devices/, May 2023.
- [2] "IoT Connected Devices Worldwide 2019-2030 (accessed Jan. 2024)," https://www.statista.com/statistics/1183457/iot-connected-devices-worldwide/.
- [3] Z. Xiao, A. K. Bui, and L. Siek, "A Hysteretic Switched-Capacitor DC–DC Converter with Optimal Output Ripple and Fast Transient Response," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 25, no. 11, pp. 2995–3005, 2017.
- [4] K. Wei, Y. Ramadass, and D. B. Ma, "Direct 12V/24V-to-1V Tri-State Double Step-Down Power Converter With Online VCF Rebalancing and In-Situ Precharge Rate Regulation," *IEEE Journal of Solid-State Circuits*, vol. 56, no. 8, pp. 2416–2426, Aug. 2021.
- [5] W. Han, C. Chen, J. Liu, and H. Lee, "An 80A 48V-Input Capacitor-Assisted Dual-Inductor Hybrid Dickson Converter for Large-Conversion Ratio Applications," in 2022 IEEE Energy Conversion Congress and Exposition (ECCE), Oct. 2022, pp. 1–5.

- [6] D. Yan, X. Ke, and D. B. Ma, "Direct 48-/1-V GaN-Based DC–DC Power Converter With Double Step-Down Architecture and Master–Slave AO2T Control," *IEEE Journal of Solid-State Circuits*, vol. 55, no. 4, pp. 988–998, Apr. 2020.
- [7] Z. Xia and J. T. Stauth, "A Cascaded Hybrid Switched-Capacitor DC–DC Converter Capable of Fast Self Startup for USB Power Delivery," *IEEE Journal of Solid-State Circuits*, vol. 57, no. 6, pp. 1854–1864, Jun. 2022.
- [8] Y. Yamauchi, T. Sai, K. Hata, and M. Takamiya, "0.55 W, 88%, 78 kHz, 48 V-to-5 V Fibonacci Hybrid DC–DC Converter IC Using 66 mm<sup>3</sup> of Passive Components With Automatic Change of Converter Topology and Duty Ratio for Cold-Crank Transient," *IEEE Transactions on Power Electronics*, vol. 36, no. 8, pp. 9273–9284, Aug. 2021.
- [9] H. Cao, X. Yang, C. Xue, L. He, Z. Tan, M. Zhao, Y. Ding, W. Li, and W. Qu, "A 12-Level Series-Capacitor 48-1V DC–DC Converter With On-Chip Switch and GaN Hybrid Power Conversion," *IEEE Journal of Solid-State Circuits*, vol. 56, no. 12, pp. 3628–3638, Dec. 2021.
- [10] Z. Ye, R. A. Abramson, and R. C. N. Pilawa-Podgurski, "A 48-to-6 V Multi-Resonant-Doubler Switched-Capacitor Converter for Data Center Applications," in 2020 IEEE Applied Power Electronics Conference and Exposition (APEC), Mar. 2020, pp. 475–481.
- [11] M. Gong, H. Chen, X. Zhang, R. Jain, and A. Raychowdhury, "A 90.4% Peak Efficiency 48-to-1-V GaN/Si Hybrid Converter With Three-Level Hybrid Dickson Topology and Gradient Descent Run-Time Optimizer," *IEEE Journal of Solid-State Circuits*, vol. 58, no. 4, pp. 1002–1014, Apr. 2023.
- [12] G.-S. Seo, R. Das, and H.-P. Le, "A 95%-Efficient 48V-to-1V/10A VRM Hybrid Converter Using Interleaved Dual Inductors," in *2018 IEEE Energy Conversion Congress and Exposition (ECCE)*. Portland, OR, USA: IEEE, Sep. 2018, pp. 3825–3830.

- [13] P. Assem, W.-C. Liu, Y. Lei, P. K. Hanumolu, and R. C. N. Pilawa-Podgurski, "Hybrid Dickson Switched-Capacitor Converter With Wide Conversion Ratio in 65-nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 55, no. 9, pp. 2513–2528, Sep. 2020.
- [14] W.-C. Liu, P. Assem, Y. Lei, P. K. Hanumolu, and R. Pilawa-Podgurski, "10.3 A 94.2%-Peak-Efficiency 1.53A Direct-Battery-Hook-up Hybrid Dickson Switched-Capacitor DC-DC Converter with Wide Continuous Conversion Ratio in 65nm CMOS," in 2017 IEEE International Solid-State Circuits Conference (ISSCC). San Francisco, CA, USA: IEEE, Feb. 2017, pp. 182–183.
- [15] A. Abdulslam and P. P. Mercier, "A Symmetric Modified Multilevel Ladder PMIC for Battery-Connected Applications," *IEEE J. Solid-State Circuits*, vol. 55, no. 3, pp. 767–780, Mar. 2020.
- [16] Y. Li, M. John, Y. Ramadass, and S. R. Sanders, "AC-Coupled Stacked Dual-Active-Bridge DC–DC Converter for Integrated Lithium-Ion Battery Power Delivery," *IEEE J. Solid-State Circuits*, vol. 54, no. 3, pp. 733–744, Mar. 2019.
- [17] X. Yang, L. Zhao, M. Zhao, Z. Tan, L. He, Y. Ding, W. Li, and W. Qu, "A 5V Input 98.4% Peak Efficiency Reconfigurable Capacitive-Sigma Converter With Greater than 90% Peak Efficiency for the Entire 0.4~1.2V Output Range," in 2022 IEEE International Solid-State Circuits Conference (ISSCC). San Francisco, CA, USA: IEEE, Feb. 2022, pp. 108–110.
- [18] C. Hardy, Y. Ramadass, K. Scoones, and H.-P. Le, "A Flying-Inductor Hybrid DC–DC Converter for 1-Cell and 2-Cell Smart-Cable Battery Chargers," *IEEE J. Solid-State Circuits*, vol. 54, no. 12, pp. 3292–3305, Dec. 2019.

- [19] G. Cai, Y. Lu, and R. Martins, "A Battery-Input Sub-1V Output 92.9% Peak Efficiency 0.3A/mm<sup>2</sup> Current Density Hybrid SC-Parallel-Inductor Buck Converter with Reduced Inductor Current in 65nm CMOS," in 2022 IEEE International Solid-State Circuits Conference (ISSCC). San Francisco, CA, USA: IEEE, Feb. 2022, pp. 312–314.
- [20] J.-Y. Ko, Y. Huh, M.-W. Ko, G.-G. Kang, G.-H. Cho, and H.-S. Kim, "A 4.5V-Input 0.3-to-1.7V-Output Step-Down Always-Dual-Path DC-DC Converter Achieving 91.5%-Efficiency with 250mΩ-DCR Inductor for Low-Voltage SoCs," in 2021 Symposium on VLSI Circuits. Kyoto, Japan: IEEE, Jun. 2021, pp. 1–2.
- [21] Y. Huh, S.-W. Hong, and G.-H. Cho, "A Hybrid Structure Dual-Path Step-Down Converter With 96.2% Peak Efficiency Using 250-mΩ Large-DCR Inductor," *IEEE J. Solid-State Circuits*, vol. 54, no. 4, pp. 959–967, Apr. 2019.
- [22] C. Wang, Y. Lu, M. Huang, and R. P. Martins, "A Two-Phase Three-Level Buck Converter With Cross-Connected Flying Capacitors for Inductor Current Balancing," *IEEE Trans. Power Electron.*, vol. 36, no. 12, pp. 13855–13866, Dec. 2021.
- [23] X. Liu, C. Huang, and P. K. T. Mok, "A High-Frequency Three-Level Buck Converter With Real-Time Calibration and Wide Output Range for Fast-DVS," *IEEE J. Solid-State Circuits*, vol. 53, no. 2, pp. 582–595, Feb. 2018.
- [24] S. R. Sanders, E. Alon, H.-P. Le, M. D. Seeman, M. John, and V. W. Ng, "The Road to Fully Integrated DC–DC Conversion via the Switched-Capacitor Approach," *IEEE Trans. Power Electron.*, vol. 28, no. 9, pp. 4146–4155, Sep. 2013.
- [25] N. Butzen, H. Krishnamurthy, Z. Ahmed, S. Weng, K. Ravichandran, M. Zelikson, J. Tschanz, and J. Douglas, "14.4 A Monolithic 26A/mm<sup>2</sup> I<sub>max</sub>, 88.5% Peak-Efficiency Continuously Scalable Conversion-Ratio Switched-Capacitor DC-DC Converter," in 2023 IEEE International Solid-State Circuits Conference (ISSCC), Feb. 2023, pp. 232–234.

- [26] L. Intaschi, P. Bruschi, G. Iannaccone, and F. Dalena, "A 220-mV Input, 8.6 Step-up Voltage Conversion Ratio, 10.45-μW Output Power, Fully Integrated Switched-Capacitor Converter for Energy Harvesting," in 2017 IEEE Custom Integrated Circuits Conference (CICC). Austin, TX: IEEE, Apr. 2017, pp. 1–4.
- [27] W. Jung, S. Oh, S. Bang, Y. Lee, Z. Foo, G. Kim, Y. Zhang, D. Sylvester, and D. Blaauw, "An Ultra-Low Power Fully Integrated Energy Harvester Based on Self-Oscillating Switched-Capacitor Voltage Doubler," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 12, pp. 2800–2811, Dec. 2014.
- [28] X. Wu, Y. Shi, S. Jeloka, K. Yang, I. Lee, Y. Lee, D. Sylvester, and D. Blaauw, "A 20-pW Discontinuous Switched-Capacitor Energy Harvester for Smart Sensor Applications," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 4, pp. 972–984, Apr. 2017.
- [29] J. Kim, P. K. T. Mok, and C. Kim, "A 0.15 V Input Energy Harvesting Charge Pump With Dynamic Body Biasing and Adaptive Dead-Time for Efficiency Improvement," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 2, pp. 414–425, Feb. 2015.
- [30] P. Chen, K. Ishida, X. Zhang, Y. Okuma, Y. Ryu, M. Takamiya, and T. Sakurai, "A 120-mV Input, Fully Integrated Dual-mode Charge Pump in 65-nm CMOS for Thermoelectric Energy Harvester," in 17th Asia and South Pacific Design Automation Conference, Jan. 2012, pp. 469–470.
- [31] H. Fuketa, S.-i. O'uchi, and T. Matsukawa, "Fully Integrated, 100-mV Minimum Input Voltage Converter With Gate-Boosted Charge Pump Kick-Started by *LC* Oscillator for Energy Harvesting," *IEEE Trans. Circuits Syst. II*, vol. 64, no. 4, pp. 392–396, Apr. 2017.

- [32] H. Yi, J. Yin, P.-I. Mak, and R. P. Martins, "A 0.032-mm<sup>2</sup> 0.15-V Three-Stage Charge-Pump Scheme Using a Differential Bootstrapped Ring-VCO for Energy-Harvesting Applications," *IEEE Trans. Circuits Syst. II*, vol. 65, no. 2, pp. 146–150, Feb. 2018.
- [33] A. Ballo, A. D. Grasso, and G. Palumbo, "A Review of Charge Pump topologies for the power management of IoT nodes," *Electronics*, vol. 8, no. 5, p. 480, 2019.
- [34] Y. Tan, Y. Shiiki, and H. Ishikuro, "A 0.12V Fully Integrated Charge Pump with Gate Voltage Optimization for Energy Harvesting Applications," in 2020 IEEE International Symposium on Circuits and Systems (ISCAS), 2020, pp. 1–5.
- [35] S. Roy, A. N. M. W. Azad, S. Baidya, and F. Khan, "A Comprehensive Review on Rectifiers, Linear Regulators, and Switched-Mode Power Processing Techniques for Biomedical Sensors and Implants Utilizing in-Body Energy Harvesting and External Power Delivery," *IEEE Trans. Power Electron.*, vol. 36, no. 11, pp. 12721–12745, Nov. 2021.
- [36] W.-H. Ki, Y. Lu, F. Su, and C.-Y. Tsui, "Analysis and Design Strategy of On-Chip Charge Pumps for Micro-power Energy Harvesting Applications," in VLSI-SoC: Advanced Research for Systems on Chip, S. Mir, C.-Y. Tsui, R. Reis, and O. C. S. Choy, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012, vol. 379, pp. 158–186.
- [37] R. Jain, B. Geuskens, M. Khellah, S. Kim, J. Kulkarni, J. Tschanz, and V. De, "A 0.45-1V Fully Integrated Reconfigurable Switched Capacitor Step-down DC-DC Converter with High Density MIM Capacitor in 22nm Tri-gate CMOS," in 2013 Symposium on VLSI Circuits, 2013, pp. C174–C175.

- [38] S. Bang, D. Blaauw, and D. Sylvester, "A Successive-Approximation Switched-Capacitor DC–DC Converter With Resolution of V<sub>IN</sub>/2<sup>N</sup> for a Wide Range of Input and Output Voltages," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 2, pp. 543–556, Feb. 2016.
- [39] Y. Wang, M. Huang, Y. Lu, and R. P. Martins, "A Continuously Scalable-Conversion-Ratio SC Converter with Reconfigurable VCF Step for High Efficiency over an Extended VCR Range," in 2023 IEEE International Solid-State Circuits Conference (ISSCC), Feb. 2023, pp. 1–3.
- [40] Y. Lei, R. May, and R. Pilawa-Podgurski, "Split-Phase Control: Achieving Complete Soft-Charging Operation of a Dickson Switched-Capacitor Converter," *IEEE Transactions on Power Electronics*, vol. 31, no. 1, pp. 770–782, Jan. 2016.
- [41] N. Butzen and M. S. J. Steyaert, "Design of Soft-Charging Switched-Capacitor DC–DC Converters Using Stage Outphasing and Multiphase Soft-Charging," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 12, pp. 3132–3141, Dec. 2017.
- [42] N. Butzen and M. S. J. Steyaert, "Scalable Parasitic Charge Redistribution: Design of High-Efficiency Fully Integrated Switched-Capacitor DC–DC Converters," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 12, pp. 2843–2853, 2016.
- [43] Y. Lu, G. Cai, and J. Huang, "Favorable basic cells for hybrid DC–DC converters," *J. Semicond.*, vol. 44, no. 4, p. 040301, Apr. 2023.
- [44] Y. Lu, J. Jiang, and W. Ki, "A Multiphase Switched-Capacitor DC–DC Converter Ring with Fast Transient Response and Small Ripple," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 2, pp. 579–591, 2017.

- [45] U.-F. Chio, K.-C. Wen, S.-W. Sin, C.-S. Lam, Y. Lu, F. Maloberti, and R. P. Martins, "An Integrated DC–DC Converter With Segmented Frequency Modulation and Multiphase Co-Work Control for Fast Transient Recovery," *IEEE Journal of Solid-State Circuits*, vol. 54, no. 10, pp. 2637–2648, Oct. 2019.
- [46] W.-C. Chen, D.-L. Ming, Y.-P. Su, Y.-H. Lee, and K.-H. Chen, "A Wide Load Range and High Efficiency Switched-Capacitor DC-DC Converter With Pseudo-Clock Controlled Load-dependent Frequency," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 61, no. 3, pp. 911–921, Mar. 2014.
- [47] Z. Wang, L. Ye, Y. Liu, P. Zhou, Z. Tan, H. Fan, Y. Zhang, J. Ru, Y. Wang, and R. Huang, "12.1 A 148nW General-Purpose Event-Driven Intelligent Wake-Up Chip for AIoT Devices Using Asynchronous Spike-Based Feature Extractor and Convolutional Neural Network," in 2021 IEEE International Solid-State Circuits Conference (ISSCC), vol. 64, 2021, pp. 436–438.
- [48] D. Rossi, F. Conti, M. Eggiman, S. Mach, A. D. Mauro, M. Guermandi, G. Tagliavini, A. Pullini, I. Loi, J. Chen, E. Flamand, and L. Benini, "4.4 A 1.3TOPS/W @ 32GOPS Fully Integrated 10-Core SoC for IoT End-Nodes with 1.7μW Cognitive Wake-Up From MRAM-Based State-Retentive Sleep Mode," in 2021 IEEE International Solid-State Circuits Conference (ISSCC), vol. 64, 2021, pp. 60–62.
- [49] E. J. Carlson, K. Strunz, and B. P. Otis, "A 20 mV Input Boost Converter With Efficient Digital Control for Thermoelectric Energy Harvesting," *IEEE J. Solid-State Circuits*, vol. 45, no. 4, pp. 741–750, Apr. 2010.
- [50] P.-H. Chen and P. M.-Y. Fan, "An 83.4% Peak Efficiency Single-Inductor Multiple-Output Based Adaptive Gate Biasing DC-DC Converter for Thermoelectric Energy Harvesting," *IEEE Trans. Circuits Syst. I*, vol. 62, no. 2, pp. 405–412, Feb. 2015.

- [51] P.-S. Weng, H.-Y. Tang, P.-C. Ku, and L.-H. Lu, "50 mV-Input Batteryless Boost Converter for Thermal Energy Harvesting," *IEEE J. Solid-State Circuits*, vol. 48, no. 4, pp. 1031–1041, Apr. 2013.
- [52] Y. K. Ramadass and A. P. Chandrakasan, "A Battery-Less Thermoelectric Energy Harvesting Interface Circuit With 35 mV Startup Voltage," *IEEE J. Solid-State Circuits*, vol. 46, no. 1, pp. 333–341, Jan. 2011.
- [53] P.-H. Chen, X. Zhang, K. Ishida, Y. Okuma, Y. Ryu, M. Takamiya, and T. Sakurai, "An 80 mV Startup Dual-Mode Boost Converter by Charge-Pumped Pulse Generator and Threshold Voltage Tuned Oscillator With Hot Carrier Injection," *IEEE J. Solid-State Circuits*, vol. 47, no. 11, pp. 2554–2562, Nov. 2012.
- [54] M. Dini, A. Romani, M. Filippi, and M. Tartagni, "A Nanocurrent Power Management IC for Low-Voltage Energy Harvesting Sources," *IEEE Trans. Power Electron.*, vol. 31, no. 6, pp. 4292–4304, Jun. 2016.
- [55] Y. Qiu, C. Van Liempd, B. O. Het Veld, P. G. Blanken, and C. Van Hoof, "5μW-to-10mW Input Power Range Inductive Boost Converter for Indoor Photovoltaic Energy Harvesting with Integrated Maximum Power Point Tracking Algorithm," in 2011 IEEE International Solid-State Circuits Conference. San Francisco, CA, USA: IEEE, Feb. 2011, pp. 118–120.
- [56] G. Schrom, P. Hazucha, F. Paillet, D. J. Rennie, S. T. Moon, D. S. Gardner, T. Karnik, P. Sun, T. T. Nguyen, M. J. Hill, K. Radhakrishnan, and T. Memioglu, "A 100MHz Eight-Phase Buck Converter Delivering 12A in 25mm<sup>2</sup> Using Air-Core Inductors," in APEC 07 Twenty-Second Annual IEEE Applied Power Electronics Conference and Exposition. Anaheim, CA, USA: IEEE, Feb. 2007, pp. 727–730.

- [57] N. Sturcken, E. O'Sullivan, N. Wang, P. Herget, B. Webb, L. Romankiw, M. Petracca, R. Davies, R. Fontana, G. Decad, I. Kymissis, A. Peterchev, L. Carloni, W. Gallagher, and K. Shepard, "A 2.5D Integrated Voltage Regulator Using Coupled-Magnetic-Core Inductors on Silicon Interposer Delivering 10.8A/mm<sup>2</sup>," in 2012 IEEE International Solid-State Circuits Conference. San Francisco, CA, USA: IEEE, Feb. 2012, pp. 400–402.
- [58] X. Liu, B. H. Calhoun, and S. Li, "A Sub-nW 93% Peak Efficiency Buck Converter With Wide Dynamic Range, Fast DVFS, and Asynchronous Load-Transient Control," pp. 1–1. [Online]. Available: https://ieeexplore.ieee.org/document/9750405/
- [59] Y. Tan, Y. Shiiki, and H. Ishikuro, "Optimization of Gate Voltage in Capacitive DC–DC Converters for Thermoelectric Energy Harvesting," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 30, no. 4, pp. 463–473, Apr. 2022.
- [60] Y. Tan, C. Huang, and H. Ishikuro, "Design of Dual Lower Bound Hysteresis Control in Switched-Capacitor DC–DC Converter for Optimum Efficiency and Transient Speed in Wide Loading Range for IoT Application," pp. 1–12. [Online]. Available: https://ieeexplore.ieee.org/document/10319457/
- [61] Z. Chen, M.-K. Law, P.-I. Mak, and R. P. Martins, "A Single-Chip Solar Energy Harvesting IC Using Integrated Photodiodes for Biomedical Implant Applications," *IEEE Trans. Biomed. Circuits Syst.*, vol. 11, no. 1, pp. 44–53, Feb. 2017.
- [62] W. Jung, S. Oh, S. Bang, Y. Lee, Z. Foo, G. Kim, Y. Zhang, D. Sylvester, and D. Blaauw, "An Ultra-Low Power Fully Integrated Energy Harvester Based on Self-Oscillating Switched-Capacitor Voltage Doubler," *IEEE J. Solid-State Circuits*, vol. 49, no. 12, pp. 2800–2811, Dec. 2014.

- [63] Y.-C. Shih and B. P. Otis, "An Inductorless DC–DC Converter for Energy Harvesting With a 1.2-μW Bandgap-Referenced Output Controller," *IEEE Trans. Circuits Syst. II*, vol. 58, no. 12, pp. 832–836, Dec. 2011.
- [64] J. Kim, P. K. T. Mok, and C. Kim, "A 0.15 V Input Energy Harvesting Charge Pump With Dynamic Body Biasing and Adaptive Dead-Time for Efficiency Improvement," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 2, pp. 414–425, Feb. 2015.
- [65] O. Al-Terkawi Hasib, M. Sawan, and Y. Savaria, "A Low-Power Asynchronous Step-Down DC–DC Converter for Implantable Devices," *IEEE Trans. Biomed. Circuits Syst.*, vol. 5, no. 3, pp. 292–301, Jun. 2011.
- [66] T. Ozaki, T. Hirose, T. Nagai, K. Tsubaki, N. Kuroki, and M. Numa, "A 0.21-V Minimum Input, 73.6% Maximum Efficiency, Fully Integrated Voltage Boost Converter with MPPT for Low-Voltage Energy Harvesters," in ESSCIRC 2014 - 40th European Solid State Circuits Conference (ESSCIRC). Venice Lido, Italy: IEEE, Sep. 2014, pp. 255–258.
- [67] T. Ozaki, T. Hirose, H. Asano, N. Kuroki, and M. Numa, "Fully-Integrated High-Conversion-Ratio Dual-Output Voltage Boost Converter With MPPT for Low-Voltage Energy Harvesting," *IEEE J. Solid-State Circuits*, vol. 51, no. 10, pp. 2398–2407, Oct. 2016.
- [68] D. Maksimovic and S. Dhar, "Switched-capacitor DC-DC converters for low-power on-chip applications," in 30th Annual IEEE Power Electronics Specialists Conference. Record. (Cat. No.99CH36321), vol. 1. Charleston, SC, USA: IEEE, 1999, pp. 54–59.
- [69] L. George, G. D. Gargiulo, T. Lehmann, and T. J. Hamilton, "A 0.04 mm<sup>2</sup> Buck-Boost DC-DC Converter for Biomedical Implants Using Adaptive Gain and Discrete Frequency Scaling Control," *IEEE Trans. Biomed. Circuits Syst.*, vol. 10, no. 3, pp. 668–678, Jun. 2016.

- [70] G. V. Pique, "A 41-Phase Switched-Capacitor Power Converter with 3.8mV Output Ripple and 81% Efficiency in Baseline 90nm CMOS," in 2012 IEEE International Solid-State Circuits Conference. San Francisco, CA, USA: IEEE, Feb. 2012, pp. 98–100.
- [71] Y. Lu, J. Jiang, and W. Ki, "A Multiphase Switched-Capacitor DC–DC Converter Ring With Fast Transient Response and Small Ripple," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 2, pp. 579–591, 2017.
- [72] Y.-T. Lin, Y.-J. Lai, H.-W. Chen, W.-H. Yang, Y.-S. Ma, K.-H. Chen, Y.-H. Lin, S.-R. Lin, and T.-Y. Tsai, "A Fully Integrated Asymmetrical Shunt Switched-Capacitor DC–DC Converter With Fast Optimum Ratio Searching Scheme for Load Transient Enhancement," *IEEE Transactions on Power Electronics*, vol. 34, no. 9, pp. 9146–9157, Sep. 2019.
- [73] TXL Group, Inc., "TXL-127-02K (accessed Jan. 2024)," 2017, https://txlgroup.com/wp-content/uploads/2018/06/TXL-127-02K\_data\_sheet.pdf.
- [74] Y. Cheng, M. Chan, K. Hui, M.-c. Jeng, Z. Liu, J. Huang, K. Chen, J. Chen, R. Tu, P. K. Ko *et al.*, "BSIM3v3 manual," *University of California*, vol. 1996, 1995.
- [75] G. Ghibaudo, "New Method for the Extraction of MOSFET Parameters," *Electronics Letters*, vol. 24, no. 9, pp. 543–545, 1988.
- [76] H. Wang and P. P. Mercier, "A 14.5 pW, 31 ppm/°C resistor-less 5 pA current reference employing a self-regulated push-pull voltage reference generator," in 2016 IEEE International Symposium on Circuits and Systems (ISCAS), May 2016, pp. 1290–1293.

- [77] P. Chen, K. Ishida, X. Zhang, Y. Okuma, Y. Ryu, M. Takamiya, and T. Sakurai, "A 120-mV Input, Fully Integrated Dual-mode Charge Pump in 65-nm CMOS for Thermoelectric Energy Harvester," in 17th Asia and South Pacific Design Automation Conference, Jan. 2012, pp. 469–470.
- [78] J. Kim, P. K. T. Mok, and C. Kim, "A 0.15 V Input Energy Harvesting Charge Pump With Dynamic Body Biasing and Adaptive Dead-Time for Efficiency Improvement," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 2, pp. 414–425, Feb. 2015.
- [79] X. Wu, Y. Shi, S. Jeloka, K. Yang, I. Lee, Y. Lee, D. Sylvester, and D. Blaauw, "A 20-pW Discontinuous Switched-Capacitor Energy Harvester for Smart Sensor Applications," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 4, pp. 972–984, Apr. 2017.
- [80] W. Jung, S. Oh, S. Bang, Y. Lee, Z. Foo, G. Kim, Y. Zhang, D. Sylvester, and D. Blaauw, "An Ultra-Low Power Fully Integrated Energy Harvester Based on Self-Oscillating Switched-Capacitor Voltage Doubler," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 12, pp. 2800–2811, Dec. 2014.
- [81] N. Butzen and M. Steyaert, "Design of Single-Topology Continuously Scalable-Conversion-Ratio Switched-Capacitor DC–DC Converters," *IEEE Journal of Solid-State Circuits*, vol. 54, no. 4, pp. 1039–1047.
- [82] J. Jiang, W.-H. Ki, and Y. Lu, "Digital 2-/3-Phase Switched-Capacitor Converter With Ripple Reduction and Efficiency Improvement," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 7, pp. 1836–1848, Jul. 2017.
- [83] Y. Tan and H. Ishikuro, "A Switched-Capacitor DC-DC Converter with >77.3% Efficiency and 80 ns Active Transient Response in 40 μA - 4 mA Load Current Range," in ESSCIRC 2021 - IEEE 47th European Solid State Circuits Conference (ESSCIRC), Sep. 2021, pp. 355–358.

- [84] Y. Tan and H. Ishikuro, "A Discrete-Time Model for Frequency Modulated Charge Pumps with Synchronized Controller," in 2020 IEEE 63rd International Midwest Symposium on Circuits and Systems (MWSCAS), Aug. 2020, pp. 929–932.
- [85] T. Tanzawa and T. Tanaka, "A Dynamic Analysis of the Dickson Charge Pump Circuit," *IEEE Journal of Solid-State Circuits*, vol. 32, no. 8, pp. 1231–1240, Aug. 1997.
- [86] C. Kurth and G. Moschytz, "Nodal Analysis of Switched-Capacitor Networks," *IEEE Transactions on Circuits and Systems*, vol. 26, no. 2, pp. 93–105, 1979.
- [87] M. D. Seeman, "A Design Methodology for Switched-Capacitor DC-DC Converters," Ph.D. dissertation, EECS Department, University of California, Berkeley, May 2009.
- [88] J. M. Henry and J. W. Kimball, "Switched-Capacitor Converter State Model Generator," *IEEE Transactions on Power Electronics*, vol. 27, no. 5, pp. 2415–2425, May 2012.
- [89] T. Souvignet, B. Allard, and X. Lin-Shi, "Sampled-Data Modeling of Switched-Capacitor Voltage Regulator With Frequency-Modulation Control," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 62, no. 4, pp. 957–966, Apr. 2015.
- [90] S. Gregori and R. E. Rotunno, "Z-Domain Analysis of Dickson Charge Pumps," in 2016 IEEE International Conference on Electronics, Circuits and Systems (ICECS), Dec. 2016, pp. 185–188.

- [91] M. Imai and T. Yoneda, "Multiple-clock multiple-edge-triggered multiple-bit flip-flops for two-phase handshaking asynchronous circuits," in 2014 IEEE International Symposium on Circuits and Systems (ISCAS), Jun. 2014, pp. 141–144.
- [92] S. Kabirpour and M. Jalali, "A Low-Power and High-Speed Voltage Level Shifter Based on a Regulated Cross-Coupled Pull-Up Network," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 66, no. 6, pp. 909–913, Jun. 2019.
- [93] Y. Tan and H. Ishikuro, "A Dual-Mode 2:1 Switched Capacitor Converter with >65% Efficiency over 1000x Load Current Range and One Clock Cycle Transient Response," in 2022 29th IEEE International Conference on Electronics, Circuits and Systems (ICECS), pp. 1–4.
- [94] Y. Tan and H. Ishikuro, "A Switched-Capacitor DC-DC Converter with >77.3% Efficiency and 80 ns Active Transient Response in 40 μA - 4 mA Load Current Range," in ESSCIRC 2021 - IEEE 47th European Solid State Circuits Conference (ESSCIRC), Sep. 2021, pp. 355–358.