Low-Voltage Analog-to-Digital Converters
and Mixed-Signal Interfaces
Prakash Harikumar
Abstract

Analog-to-digital converters (ADCs) are crucial blocks that form the interface between the physical world and the digital domain. ADCs are indispensable in numerous applications such as wireless sensor networks (WSNs), wireless/wireline communication receivers and data acquisition systems. To achieve long-term, autonomous operation for WSNs, the nodes are powered by harvesting energy from ambient sources such as solar and vibrational energy. Since the signal frequencies in these distributed WSNs are often low, ultra-low-power ADCs with low sampling rates are required. The advent of new wireless standards with ever-increasing data rates and bandwidth necessitates ADCs capable of meeting the demands. Wireless standards such as GSM, GPRS, LTE and WLAN require ADCs with several tens of MS/s speed and moderate resolution (8–10 bits). Since these ADCs are incorporated into battery-powered portable devices such as cellphones and tablets, low power consumption for the ADCs is essential.

The first contribution is an ultra-low-power 8-bit, 1 kS/s successive approximation register (SAR) ADC that has been designed and fabricated in a 65-nm CMOS process. The target application for the ADC is an autonomously-powered soil-moisture sensor node. At $V_{DD} = 0.4$ V, the ADC consumes 717 pW and achieves a figure of merit (FoM) of 3.19 fJ/conv-step while meeting the targeted dynamic and static performance. The 8-bit ADC features a leakage-suppressed S/H circuit with boosted control voltage which achieves > 9-bit linearity. A binary-weighted capacitive array digital-to-analog converter (DAC) is employed with a very small, custom-designed unit capacitor of 1.9 fF. Consequently, both the area and the power consumption of the ADC are reduced. The ADC achieves an ENOB of 7.81 bits at near-Nyquist input frequency. The core area occupied by the ADC is only 0.0126 mm$^2$.
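
The quoted energy efficiency can be cross-checked from the power, ENOB and sampling rate given in this paragraph. The sketch below assumes the FoM is the commonly used Walden figure of merit, $P / (2^{\mathrm{ENOB}} \cdot f_s)$, which the abstract does not state explicitly; the function name `walden_fom` is illustrative only.

```python
def walden_fom(power_w: float, enob_bits: float, fs_hz: float) -> float:
    """Energy per conversion step in joules, P / (2**ENOB * fs).

    Assumption: the abstract's FoM follows this (Walden) definition.
    """
    return power_w / (2.0 ** enob_bits * fs_hz)


# Values quoted in the abstract for the 0.4 V, 8-bit, 1 kS/s SAR ADC.
power_w = 717e-12   # 717 pW
enob_bits = 7.81    # ENOB at near-Nyquist input
fs_hz = 1e3         # 1 kS/s

fom_j = walden_fom(power_w, enob_bits, fs_hz)
print(f"FoM = {fom_j * 1e15:.3f} fJ/conv-step")
```

With these inputs the result is roughly 3.2 fJ per conversion step, which agrees with the quoted 3.19 fJ/conv-step to within the rounding of the input values.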

The second contribution is a 1.2 V, 10-bit, 50 MS/s SAR ADC designed and implemented in 65 nm CMOS aimed at communication applications. For medium-to-high sampling rates, the DAC reference settling poses a speed bottleneck in charge-redistribution SAR ADCs due to the ringing associated with the parasitic inductances. Although SAR ADCs have been the subject of intense research in recent years, scant attention has been paid to the design of high-performance on-chip reference voltage buffers. The estimation of important design parameters of the buffer, as well as
critical specifications such as power-supply sensitivity, output noise, offset, settling
time and stability, is elaborated upon in this dissertation. The implemented
buffer consists of a two-stage operational transconductance amplifier (OTA) combined
with replica source-follower (SF) stages. The 10-bit SAR ADC utilizes split-array
capacitive DACs to reduce area and power consumption. In post-layout simulation
which includes the entire pad frame and associated parasitics, the ADC achieves an
ENOB of 9.25 bits at a supply voltage of 1.2 V, typical process corner and sampling
frequency of 50 MS/s for near-Nyquist input. Excluding the reference voltage buffer,
the ADC consumes 697 µW and achieves an energy efficiency of 25 fJ/conversion-step
while occupying a core area of 0.055 mm$^2$.

The third contribution comprises five disparate works involving the design of key
peripheral blocks of the ADC such as reference voltage buffer and programmable
gain amplifier (PGA) as well as low-voltage, multi-stage OTAs. These works are a)
Design of a 1 V, fully differential OTA which satisfies the demanding specifications
of a PGA for a 9-bit SAR ADC in 28 nm UTBB FDSOI CMOS. While consuming
2.9 µW, the PGA meets the various performance specifications over all process
corners and a temperature range of −20 °C to +85 °C. b) Since forward body bias (FBB) in the 28 nm
FDSOI process allows wide tuning of the threshold voltage and substantial boosting
of the transconductance, an ultra-low-voltage fully differential OTA with $V_{DD} = 0.4$ V
has been designed to satisfy the comprehensive specifications of a general-purpose
OTA while limiting the power consumption to 785 nW. c) Design and implementation
of a power-efficient reference voltage buffer in 1.8 V, 180 nm CMOS for a 10-bit,
1 MS/s SAR ADC in an industrial fingerprint sensor SoC. d) Comparison of two
previously-published frequency compensation schemes on the basis of unity-gain
frequency and phase margin on a three-stage OTA designed in a 1.1 V, 40-nm CMOS
process. Simulation results highlight the benefits of split-length indirect compensation
over the nested Miller compensation scheme. e) Design of an analog front-end (AFE)
satisfying the requirements for a capacitive body-coupled communication receiver in
a 1.1 V, 40-nm CMOS process. The AFE consists of a cascade of three amplifiers
followed by a Schmitt trigger and digital buffers. Each amplifier utilizes a two-stage
OTA with split-length compensation.
Analog-to-digital converters (ADCs) are important building blocks for converting signals in the physical world to the digital domain. ADCs are necessary in several applications such as wireless sensor networks (WSNs), wireless/wireline communication receivers and data acquisition systems. To give wireless sensor nodes a long autonomous lifetime, they can harvest energy from their surroundings, for example solar energy and kinetic energy. Since the frequency of the input signals in these distributed sensor nodes is often low, ADCs with ultra-low power consumption and low sampling rates are needed. The creation of new wireless standards with ever-increasing data rates and bandwidth also requires ADCs that can meet these demands. Wireless standards such as GSM, GPRS, LTE and WLAN need ADCs with sampling rates of several tens of millions of samples per second and moderate resolution (8–10 bits). Since ADCs are used in battery-powered portable devices such as mobile phones and tablets, low power consumption is also important.


The second contribution is a 1.2 V, 10-bit, 50 MS/s SAR ADC designed and implemented in 65 nm CMOS for communication applications. For medium-to-high sampling rates, the settling time of the DAC reference voltage limits the speed of charge-redistribution SAR ADCs because of ringing caused by parasitic inductances. Although SAR ADCs have

Preface

This dissertation presents the research work performed during the period August 2011 – September 2015 at the Department of Electrical Engineering, Linköping University, Sweden. The main contributions of this dissertation are as follows:

- Design and implementation of a 0.4 V, 717 pW, 8-bit 1 kS/s SAR ADC in 65 nm CMOS. The ADC features a leakage-reduced sampling switch with a multi-stage charge pump to guarantee sufficient linearity. A custom-designed unit capacitor achieves reduced area and power consumption for the capacitive DAC.

- Design and implementation of a 10-bit, 50 MS/s SAR ADC with an on-chip reference voltage buffer in 65 nm CMOS. The speed limitation for medium/high-speed SAR ADCs due to inaccurate DAC settling in the presence of bondwire parasitics is discussed. The performance specification and design details of a high-speed reference voltage buffer are elaborated upon in this work.

- Design of a low-power PGA for a 9-bit SAR ADC in 28 nm UTBB FDSOI CMOS process. The RBB feature of this CMOS process node was utilized to enhance the DC gain while avoiding large resistors in the CMFB circuit for the first stage.

- Design of an ultra-low-voltage fully differential OTA in 28 nm UTBB FDSOI CMOS. With a differential output swing of 0.8 FS, the OTA achieves an SNR of 8.7 bits and a THD of −60.3 dB while consuming 785 nW from a 0.4 V supply. The small-signal and large-signal performance as well as the matching-constrained specifications such as PSRR, CMRR and offset have been determined using exhaustive simulations.

- Design and implementation of a power-efficient reference voltage buffer in 180 nm CMOS for a 10-bit, 1 MS/s SAR ADC in a fingerprint sensor. The buffer meets the requirements on settling time, PSRR, output noise and stability, and also supports a low-power standby mode.

- Comparison of two previously-published frequency compensation schemes using a three-stage OTA designed in 40 nm CMOS. Utilizing metrics such as phase margin, unity-gain frequency and total compensation capacitance, the advantages of the reversed nested indirect compensation technique are illustrated for high-speed multi-stage OTAs.

- Design of a receiver AFE for capacitive body-coupled communication in 1.1 V, 40 nm CMOS. Three different AFE topologies were designed and compared in terms of noise, gain and power consumption.

The contents of this dissertation are based on the following publications:


- **Paper VIII** – S. A. Aamir, P. Harikumar and J. J. Wikner, “Frequency Compensation of High-Speed, Low-Voltage CMOS Multistage Amplifiers”, in
Contribution: I designed a three-stage OTA in 40 nm CMOS and employed two frequency compensation schemes on it. I carried out the relevant simulations and played a major role in manuscript preparation. Aamir designed and simulated the four-stage OTA in 65 nm CMOS and was involved in manuscript preparation.


  Contribution: I designed a two-stage OTA in 40 nm CMOS as well as the three AFE topologies. I carried out the necessary simulations and played a major role in manuscript preparation. The calculation of the noise level allowed for the AFE was provided by Jacob. Irfan was responsible for including the details of the human body electrical model and discrete transceiver realization.

The following papers were also published during this period which are outside the scope of this dissertation:

• K. Chen, *P. Harikumar* and A. Alvandpour, “Design of a 12.8 ENOB, 1 kS/s Pipelined SAR ADC in 0.35-µm CMOS”, *Analog Integr. Circ. Sig. Process.*, 2015 (Accepted).

Acknowledgments

It has been an arduous, oft-enervating trek. I am beholden to those who helped me prevail.

• My grandmother H. Balambal for her selfless, doting love. She cossets me from within and awaits me at the egress from this earthly sojourn.

• My parents Dr. V. Thankamani and M. Harikumar for imbuing me with tenacity and rectitude that have stood me in good stead throughout my Ph.D. studies. Their boundless affection, foursquare support and sagacious precepts efface despondency and whet my resolve.

• My supervisor Dr. J Jacob Wikner who extended me a carte blanche on research albeit with a caveat on robustness and testability of the circuits. He never shoehorned me into confined perspectives, yet spurred me on with the celerity of his pithy e-mails. Completing his exacting Ph.D. courses is a badge of honour, akin to earning the beret from Fort Bragg, NC, USA.

• My co-supervisor Prof. Atila Alvandpour who reposed faith in my abilities and proffered several opportunities to parlay my skills. His candid and germane feedback ameliorated the quality of my work and manuscripts.

• The profound philosophical depth of the Hindu religion which exhorts “Thy right is only to act and not to its fruits. Let not the fruit of action be thy motive; nor let thy attachment be to inaction.” (The Bhagavad Gita)

• My uncle V. Murali who edified me through his life. He was and still remains virtues incarnate. Yet he was flagrantly shortchanged by God and mortals alike.

• Dr. Oscar Gustafsson and Prof. Håkan Johansson for providing me an opportunity to pursue Ph.D. studies.

• Dr. Darius Jakonis (Acreo AB, Sweden) for the numerous discussions regarding the 28 nm UTBB FDSOI CMOS process. Despite having no stake in my research, he responded with alacrity and zeal to my emails, and galvanized me to investigate the potential of this advanced process node.

• *Tongzh* Dr. Dai Zhang for the countless technical discussions and resilient friendship. Her stellar accomplishments are snugly ensconced amidst her endearing humility and modesty. In the domain of ultra-low-power ADC design, I regard her as an exemplar worthy of emulation.

• Pavel Angelov (Fingerprint Cards AB, Sweden) who fortuitously shares his first name with the legendary Pavel O. Sukhoi, for his munificence in sparing time for technical discussions. Such rendezvous disabused me of several misconceptions regarding ADC design and widened my horizons.

• Dr. Anu Kalidas with whom I cherish a fraternal bond, a bond forged between compatriots confronting bitter winters and daunting exams/labs in an alien land. Conferring with A. K. Das (one among his myriad noms de guerre) has been the perfect salve for my ruffled mind during these years.

• Dr. Ameya Bhide for his valuable advice on a slew of issues and for sharing the LaTeX template for his thesis.

• Martin Nielsen-Lönn (Ph.D. student, ICS), whose metier spans the gamut of electronics from precision soldering and CAD to oscilloscope-based Flappy-Pong, for the fruitful collaboration on the SMS and smartMemphis projects. To me, his indefatigable verve for troubleshooting remains an enigma.

• Kairang Chen (Ph.D. student, ICS) from Henan, the land of the warrior monks, for the unencumbered co-operation on the pipelined ADC work.

• Arta Alvandpour, Principal Research Engineer, ICS whose assistance ensured that we were in lockstep with Cadence® on the CAD tool versions.

• Dr. Manil Dev for the camaraderie that has endured the inexorable passage of time.

• My wife Sruthi Kodoth for her unswerving support and encouragement. Her sangfroid was an indispensable equipoise to my angst during the chip measurements. Sruthi’s parents Sailaja Kodoth and M. Madhusoodhan buttressed my efforts with genuine concern and fervent prayers.

Prakash Harikumar
Linköping, Sep. 2015
# Contents

1 Introduction
  1.1 Background
  1.2 Objectives
  1.3 Methodology
  1.4 Organization and Scope of Dissertation

2 Design Considerations for SAR ADCs and OTAs
  2.1 SAR ADC Design Considerations
    2.1.1 Sample-and-Hold Circuit
    2.1.2 Capacitive DAC
    2.1.3 Comparator
  2.2 OTA Design Challenges
    2.2.1 Stabilization of OTAs
  2.3 Features of the 28 nm UTBB FDSOI CMOS Process
    2.3.1 Control of Threshold Voltage Using Body Bias
    2.3.2 Intrinsic Gain of Transistors
    2.3.3 Boosting Transconductance Using Forward Body Bias
  2.4 Summary

3 Design of a 0.4 V, sub-nW, 8-bit 1 kS/s SAR ADC
  3.1 Introduction
  3.2 ADC Architecture
  3.3 Circuit Implementation
    3.3.1 Input Sampling Switch
    3.3.2 Capacitive Array DAC
    3.3.3 Dynamic Latch Comparator
    3.3.4 SAR Logic
  3.4 Measurement Results
  3.5 Ultra-low-power RC Oscillator
  3.6 Summary

4 Design of a 10-bit 50 MS/s SAR ADC
  4.1 Introduction
  4.2 Limitations for DAC Settling
  4.3 ADC Architecture
  4.4 Implementation of ADC Building Blocks
    4.4.1 Reference Voltage Buffer
    4.4.2 Input Sampling Switches
    4.4.3 Dynamic Comparator
    4.4.4 Split Binary-Weighted Array DAC
    4.4.5 SAR Controller
    4.4.6 Layout of the ADC
  4.5 Simulation Results
  4.6 Summary

5 Mixed-Signal Interfaces
  5.1 Introduction
  5.2 A PGA for a 9 bit, 1 kS/s SAR ADC
    5.2.1 Performance Requirements
    5.2.2 Architecture
    5.2.3 Common-mode Feedback
    5.2.4 Simulation Results for the OTA
  5.3 An Ultra-Low-Voltage OTA in 28 nm UTBB FDSOI CMOS
    5.3.1 Ultra-low-voltage OTA Design
    5.3.2 OTA Architecture
    5.3.3 Simulation Results
  5.4 Reference Voltage Buffer for a 10-bit 1-MS/s SAR ADC
    5.4.1 Requirements on the RVBuffer
    5.4.2 OTA Topology and Simulation Results
    5.4.3 Re-design of the RVBuffer
  5.5 Frequency Compensation of a Three-Stage OTA in 40 nm CMOS
    5.5.1 RNIC Stabilization
    5.5.2 NMCNR Stabilization
  5.6 A Receiver AFE for Capacitive Body-Coupled Communication
    5.6.1 Requirements on the AFE
    5.6.2 AFE Architecture
    5.6.3 AFE Topologies and Simulation Results
  5.7 Summary

6 Conclusions and Future Work
  6.1 Future Work

References

A Paper Collection
List of Figures

1-1 Block diagram of a typical AFE and ADC.
1-2 Basic SAR ADC architecture.
2-1 Basic S/H circuit.
2-2 Noise of the S/H circuit over PVT corners.
2-3 Charge injection and clock feedthrough errors.
2-4 Impact of lower supply voltage on the $R_{ON}$ of a TG switch.
2-5 Variation of leakage current and threshold voltage with temperature.
2-6 Binary-weighted capacitive DAC.
2-7 Split-array capacitive DAC.
2-8 Schematic of a dynamic comparator.
2-9 Intrinsic gain vs. $V_{DS}$ in advanced CMOS process nodes.
2-10 Voltage headroom in a single-stage differential amplifier.
2-11 A single-stage OTA.
2-12 Simple Miller compensated (SMC) two-stage OTA.
2-13 Small-signal model of the SMC OTA.
2-14 Simple Miller compensation with nulling resistor (SMCNR).
2-15 A three-stage OTA with NMC.
2-16 Threshold voltage vs. forward body bias in low-$V_{TH}$ transistors.
2-17 Boosting $R_{ON}$ using reverse body bias.
2-18 Nominal intrinsic gain of minimum-sized LVT NMOS transistor.
2-19 Impact of forward body bias (FBB) on transconductance.
3-1 Block diagram of the proposed ADC.
3-2 Schematic of the multi-stage charge pump for the input S/H.
3-3 Schematic of the leakage-reduced S/H with multi-stage charge pump.
3-4 THD performance of the S/H over process and temperature corners.
3-5 Structure of the custom-designed unit capacitor.
3-6 Layout of the binary-weighted DAC.
3-7 Schematic of the dynamic latch comparator.
3-8 Schematic of the synchronous SAR logic.
3-9 Timing sequence for the synchronous SAR logic.
3-10 Chip microphotograph and layout of the 8-bit SAR ADC.
3-11 Measured DNL and INL errors of the ADC.
3-12 Measured $\mu_{\text{DNL}}$ and $\mu_{\text{INL}}$ for seven ADC chips.
3-13 Measured $\sigma_{\text{DNL}}$ and $\sigma_{\text{INL}}$ for seven ADC chips.
3-14 Measured FFT spectrum (2048-point) for the ADC at 1 kS/s with near-DC and near-Nyquist inputs.
3-15 Measured SNDR and SFDR of the ADC vs. input frequency.
3-16 ADC measurement set-up with solar panel.
3-17 Schematic of the RC oscillator.
3-18 Bias circuit for the RC oscillator.
3-19 Variation of current vs. temperature for the bias circuit.
3-20 Chip microphotograph and layout of the RC oscillator.
4-1 The proposed SAR ADC architecture.
4-2 Capacitive DAC during sampling phase of the SAR ADC.
4-3 Timing diagram for the sampling phase of the SAR ADC.
4-4 Topology of the reference voltage buffer.
4-5 Schematic of the reference voltage buffer.
4-6 Open-loop gain and phase plot for the RVBuffer.
4-7 Schematic of the constant-$g_m$ bias circuit.
4-8 Schematic of the bootstrapped switch.
4-9 Linearity performance of the bootstrapped switch.
4-10 Schematic of the double-tail dynamic comparator.
4-11 Split-array DAC.
4-12 Layout of the split-array DAC (single-side).
4-13 INL/DNL of the 10-bit fully differential split-array DAC.
4-14 Synchronous SAR logic.
4-15 Timing sequence of the SAR logic.
4-16 Layout of the SAR ADC.
4-17 Ringing on the DAC output due to inductance.
4-18 Output spectrum of the SAR ADC for low-frequency input.
4-19 Output spectrum of the SAR ADC for near-Nyquist input frequency.
4-20 Dynamic performance versus input frequency.
4-21 Power breakdown for the SAR ADC (typical PVT corner).
5-1 Schematic of the two-stage OTA.
5-2 CT CMFB for the second stage of the OTA.
5-3 Output signals and common-mode error for the OTA (closed-loop).
5-4 Differential input, output and output CM level of the OTA.
5-5 DFT of the differential output of the OTA.
5-6 THD vs. differential input voltage.
5-7 Full-scale pulse outputs of the OTA.
5-8 Impact of CMFF on the CM gain of a pseudo-differential OTA.
5-9 Schematic of the two-stage OTA with FBB.
5-10 Schematic of the second stage CT CMFB with FBB.
5-11 Output voltages and output CM error voltage of the OTA.
5-12 Gain and phase plots of the OTA in open-loop.
5-13 Differential input, output and output CM voltages.
5-14 THD and differential output voltage vs. input voltage.
5-15 Full-scale differential input and output pulse voltages.
5-16 Flicker noise comparison for 28 nm FDSOI and 65 nm bulk CMOS.
5-17 Block diagram of the SAR ADC.
5-18 Schematic of the PMOS-input RVBuffer.
5-19 Gain-phase plot of the PMOS-input RVBuffer.
5-20 Layout of the PMOS-input RVBuffer.
5-21 Schematic of the three-stage OTA with RNIC.
5-22 Gain-phase plots of the three-stage OTA.
5-23 Block diagram of the transceiver.
5-24 Schematic of the two-stage OTA.
5-25 Gain-phase plot of the two-stage OTA.
5-26 Schematic of the resistive-feedback AFE.
5-27 Schematic of the capacitive-feedback AFE.
5-28 Schematic of the capacitive-feedback AFE with SC bias.
5-29 Transient outputs for the capacitive feedback AFE.
5-30 Transient outputs for the SC bias AFE.
List of Tables

3-1 Multi-stage charge-pump performance over all PT corners.
3-2 ADC performance summary and comparison.
3-3 Simulated performance of the RC oscillator.
3-4 Measured performance of the RC oscillator.
4-1 Performance summary of the reference voltage buffer.
4-2 Performance summary of the dynamic comparator.
4-3 Comparison to state-of-the-art works.
5-1 Performance of the OTA over PT corners.
5-2 Comparison to low-power OTAs.
5-3 Simulated performance of the OTA over PVT corners.
5-4 Comparison to ultra-low-voltage OTAs.
5-5 Simulated RVBuffer performance with NMOS capacitor load.
5-6 Simulated performance of the RVBuffer.
5-7 Pole-zero locations of the three-stage OTA with RNIC.
5-8 Pole-zero locations of the three-stage OTA with NMCNR.
5-9 Comparison of NMCNR and RNIC schemes.
5-10 Two-stage OTA performance summary.
5-11 Simulation results for the AFE configurations.
List of Abbreviations

ADC  Analog-to-Digital Converter
AFE  Analog Front-End
BW   Bandwidth
CMFB Common Mode FeedBack
CMFF Common Mode FeedForward
CMOS Complementary Metal Oxide Semiconductor
CS   Common Source
CT   Continuous Time
DAC  Digital-to-Analog Converter
DFF  D-Flip Flop
DFT  Discrete Fourier Transform
DNL  Differential NonLinearity
ENOB Effective Number of Bits
ERBW Effective Resolution Bandwidth
FBB  Forward Body Bias
FDSOI Fully Depleted Silicon-On-Insulator
FFT  Fast Fourier Transform
FF   Flip Flop
FoM  Figure of Merit
FS   Full Scale
GBW  Gain-BandWidth product
$\text{HD}_n$  Harmonic Distortion of $n^{\text{th}}$-order
IA   Instrumentation Amplifier
INL  Integral NonLinearity
JLCC J-Leaded Chip Carrier
LHP  Left Half Plane
LSB  Least Significant Bit
MIM  Metal-Insulator-Metal
MOM  Metal-Oxide-Metal
MSB  Most Significant Bit
NMCNR Nested Miller Compensation with Nulling Resistor
NMC  Nested Miller Compensation
OTA  Operational Transconductance Amplifier
PDK  Process Design Kit
PGA  Programmable Gain Amplifier
PT   Process and Temperature
PVT  Process, Voltage and Temperature
RBB  Reverse Body Bias
RHP  Right Half Plane
RNIC Reversed Nested Indirect Compensation
RVBuffer Reference Voltage Buffer
S/H  Sample-and-Hold
SAR  Successive Approximation Register
SC   Switched Capacitor
SFDR Spurious-Free Dynamic Range
SF   Source Follower
SLCL Split Length Current mirror Load
SMCNR Simple Miller Compensation with Nulling Resistor
SMC  Simple Miller Compensation
SNDR Signal-to-Noise-and-Distortion Ratio
SNR  Signal-to-Noise Ratio
SoC  System-on-Chip
TG   Transmission Gate
THD  Total Harmonic Distortion
UTBB Ultra-Thin Box and Body
WSN  Wireless Sensor Network
Chapter 1

Introduction

1.1 Background

Analog-to-digital converters (ADCs) are crucial blocks which form the interface between the physical world and the digital domain. ADCs are indispensable in numerous applications such as sensor networks, wireless/wireline communication receivers and data acquisition systems. Wireless sensor networks (WSNs) are increasingly employed for environmental and structural health monitoring, military surveillance and personal health monitoring [1]. Each sensor node in the WSN consists of a sensor, ADC, digital control logic and storage as well as a radio to relay the data to a central base station. To achieve long-term, autonomous operation, the nodes are powered by harvesting energy from ambient sources such as solar energy, vibrational energy etc. Such energy-constrained operation makes it imperative for the WSN electronics including the ADC to have minimum power consumption. Since the signal frequencies in these distributed WSNs are often low, ultra-low-power ADCs with low sampling rates are required. In wireless communication receivers, ADCs convert the received analog signal for subsequent digital processing. The advent of new wireless standards with ever-increasing data rates and bandwidth necessitates ADCs capable of meeting the demands. Wireless standards such as GSM, GPRS, LTE and WLAN require ADCs with several tens of MS/s speed and moderate resolution (8-10 bits) [2]. Since these ADCs are incorporated into battery-powered portable devices such as cellphones and tablets, low power consumption for the ADCs is essential.

Figure 1-1 shows the block diagram of a typical analog front-end (AFE) along with the succeeding ADC. The instrumentation amplifier (IA) achieves rejection of common-mode disturbances with low input-referred noise [3]. The programmable gain amplifier (PGA) provides sufficient amplification to the input signal in order to maximize the dynamic range of the succeeding ADC. The PGA must possess fast settling time to drive the input sampling capacitance of the ADC. The low-pass filter
(LPF) performs the anti-aliasing function. For proper functionality, the ADC requires other peripheral blocks such as stable reference voltages and associated drivers, clock signal generator etc.

Figure 1-2 shows the block diagram of the basic SAR ADC architecture. It consists of a sample-and-hold block (S/H), a comparator, DAC and a successive approximation register (SAR). Each conversion consists of a sampling phase followed by the bit cycling phases. During the sampling phase, the input voltage is sampled. The successive-approximation register is set such that the output of the DAC is half of the reference voltage $V_{ref}$. In the initial bit cycle, the comparator compares the input voltage to $V_{ref}/2$ in order to determine the most significant bit (MSB). The comparator output is stored in the SAR logic. Simultaneously, the SAR controller generates the next bit approximation. The DAC forms the corresponding scaled value of $V_{ref}$ and the comparator compares the input voltage to the new value of the DAC output. The MSB-1 bit is thus determined. The bit cycles are repeated until all the bits up to the least significant bit (LSB) are determined.
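The bit-cycling procedure described above amounts to a binary search over the DAC codes. The following is a minimal behavioral sketch, assuming an ideal unipolar DAC, a noiseless comparator and a single-ended input; the function name `sar_convert` and its parameters are illustrative and do not come from the dissertation.

```python
def sar_convert(v_in, v_ref, n_bits):
    """Return the output code of an ideal n_bits SAR conversion of v_in.

    Assumes 0 <= v_in < v_ref, an ideal DAC and an ideal comparator.
    """
    code = 0
    # Cycle from the MSB down to the LSB, one comparison per bit.
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)              # tentatively set the current bit
        v_dac = v_ref * trial / (1 << n_bits)  # ideal DAC output for the trial code
        if v_in >= v_dac:                      # comparator decision
            code = trial                       # keep the bit, otherwise it stays 0
    return code


if __name__ == "__main__":
    # First comparison is against v_ref / 2, as in the MSB decision.
    print(sar_convert(0.3, 1.0, 8))
```

In the first iteration the trial code has only the MSB set, so the DAC output is $V_{ref}/2$, which corresponds to the MSB decision in the text.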

In Fig. 1-2, if the comparator is implemented by a regenerative latch [4], then no static bias currents are required in the ADC which leads to excellent power-efficiency. Due to the fully dynamic nature of the SAR ADC, the power consumption scales
1.2 Objectives

The dissertation involves the following objectives.

- Design and implement an ultra-low-voltage ADC with power consumption less than 1 nW for WSNs.
- Design and implement a 10-bit, 50 MS/s SAR ADC with on-chip reference voltage buffer (RVBuffer). Identify the design challenges and suitable circuit topology for the buffer. Ensure that the various performance specifications of the RVBuffer are satisfied.
- Design power-efficient operational transconductance amplifiers (OTAs) for applications such as RVBuffer, PGA and AFEs. Determine the impact of advanced CMOS process nodes and frequency compensation schemes on OTA performance.

1.3 Methodology

The adopted methodology consists of an initial literature survey to identify appropriate circuit topologies and state-of-the-art performance. Based on the targeted system specification, the performance parameters for the constituent circuit blocks were derived. Subsequently, the design of these circuit blocks was carried out. Circuit techniques from different publications were combined and conventional circuit topologies were modified to optimize the design. Simulation scenarios encompassed process corners, supply voltage variation and the relevant temperature range.

1.4 Organization and Scope of Dissertation

As delineated in Section 1.1, power-efficient ADCs are required in WSNs and in portable communication devices. Such ADCs will be part of Systems-on-Chip (SoCs) implemented in advanced CMOS process nodes to take advantage of the benefits of scaling for the digital blocks. As the supply voltage reduces concomitantly with the reduction in feature size, the ADCs should meet the targeted performance at low supply voltages ($V_{DD} \leq 1.2$ V). In this dissertation, an ultra-low-power 8-bit, 1 kS/s SAR ADC has been designed and fabricated in a 65 nm CMOS process. At $V_{DD} = 0.4$ V, the ADC consumes 717 pW and achieves an FoM = 3.19 fJ/conv-step while meeting the targeted dynamic and static performance. It is very challenging to achieve an FoM < 5 fJ/conv-step for SAR ADCs with sampling rates in the range of 1 kS/s to 10 kS/s due to the substantial leakage power consumption [8]. The 8-bit ADC features a leakage-suppressed S/H circuit with boosted control voltage. The proposed S/H circuit is superior to conventional bootstrap/boosted switches and achieves > 9-bit linearity over process and temperature corners. The capacitive DAC is a crucial block that determines linearity performance, power and area of a SAR ADC. Instead of employing DAC switching schemes which suffer from signal-dependent offset [9], additional voltages and switches [10], or increased number of capacitors and complex SAR logic [11], a conventional binary-weighted array DAC is employed with very low, custom-designed unit capacitors. Consequently the area of the ADC and power consumption are reduced while meeting INL/DNL specifications.
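The quoted figure of merit can be cross-checked numerically. The sketch below assumes the widely used Walden definition, $\text{FoM} = P / (2^{\text{ENOB}} \cdot f_s)$, which is not spelled out in this section, and takes the ENOB of 7.81 bits from the abstract; the function name `walden_fom` is illustrative.

```python
def walden_fom(power_w, enob_bits, f_s):
    """Walden figure of merit in J/conversion-step (assumed definition)."""
    return power_w / (2 ** enob_bits * f_s)


if __name__ == "__main__":
    # 717 pW at 1 kS/s with ENOB = 7.81 bits (ENOB taken from the abstract).
    fom = walden_fom(717e-12, 7.81, 1e3)
    print(f"FoM = {fom * 1e15:.2f} fJ/conv-step")
```

Under these assumptions the result is close to 3.2 fJ/conv-step, in line with the value reported for the 0.4 V ADC.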

In this dissertation, the SAR architecture has been utilized to design and implement a 1.2 V, 10-bit, 50 MS/s SAR ADC in 65 nm CMOS aimed at communication applications. Accurate settling of the DAC voltage is essential to meet the performance in a charge-redistribution SAR ADC. For medium-to-high sampling rates, the DAC reference settling poses a speed bottleneck due to ringing associated with the parasitic inductances. In SAR ADCs embedded in SoCs, a high-speed buffer has to be incorporated to supply a stable reference voltage for the DAC. Although SAR ADCs have been the subject of intense research in recent years, scant attention has been paid to the design of high-performance on-chip RVBuffers. The few existing works [12] emphasize only the dominant power consumption of the buffer. The estimation of important design parameters of the buffer as well as critical specifications such as PSRR, output noise, offset, settling time and stability have been elaborated upon in this dissertation. The implemented RVBuffer consists of a two-stage OTA combined with replica source-follower (SF) stages. The 10-bit SAR ADC utilizes split-array capacitive DACs to reduce area and power consumption. A constant-$g_{m}$ bias circuit with external resistor generates the biasing voltage for the buffer. A bootstrapped sampling switch maintains > 10-bit linearity over all PVT corners. In post-layout simulation which includes the entire pad frame and associated parasitics, the ADC achieves an ENOB of 9.25 bits at a supply voltage of 1.2 V, typical process corner and sampling frequency of 50 MS/s for near-Nyquist input. Excluding the reference voltage buffer, the ADC consumes 697 $\mu$W and achieves an energy efficiency of 25 fJ/conversion-step while occupying a core area of 0.055 mm$^2$.

The OTA is a key building block of mixed-signal processing systems. Due to the lower output resistance of transistors and low supply voltages in scaled CMOS technologies, the design of OTAs with adequate open loop DC gain, unity-gain frequency, output swing and linearity presents a formidable challenge. Techniques
such as cascoding become less feasible and hence cascading multiple stages to achieve the targeted DC gain has emerged as an attractive design choice. However, the frequency compensation technique in multi-stage OTAs has to be appropriately chosen to achieve power efficiency and higher speed. The useful features of advanced CMOS process nodes such as 28 nm UTBB FDSOI CMOS can be enlisted to overcome the constraints imposed by low supply voltage and reduced intrinsic gain. A 1 V, fully differential OTA has been designed that satisfies the demanding specifications of a PGA for a 9-bit SAR ADC. It supports rail-to-rail output swing and provides a DC gain of 70 dB, a unity-gain frequency of 4.3 MHz and a phase margin of 68° while consuming 2.9 $\mu$W. Crucial specifications for the PGA such as settling time and linearity are also satisfied over all process corners and a temperature range of $[-20^\circ\text{C}, +85^\circ\text{C}]$. Since forward body bias (FBB) in the 28 nm FDSOI process allows wide tuning of the threshold voltage and substantial boosting of the transconductance, an ultra-low-voltage fully differential OTA with $V_{DD} = 0.4$ V has been designed to satisfy the comprehensive specifications of a general-purpose OTA while limiting the power consumption to 785 nW. A power-efficient RVBuffer has been implemented in 1.8 V, 180 nm CMOS for a 10-bit, 1 MS/s SAR ADC in an industrial fingerprint sensor SoC. The RVBuffer utilizes a single-stage, cascoded current-mirror OTA which minimizes current consumption, enhances PSRR and obviates frequency compensation while meeting the requirements on DC gain and settling speed. Even though numerous frequency compensation schemes for OTAs driving large capacitive loads have been proposed, their impact on high-speed, low capacitive-load OTAs in advanced CMOS nodes has not been studied in detail. Hence two previously published frequency compensation schemes are compared on the basis of unity-gain frequency and phase margin on a three-stage OTA designed in a 1.1 V, 40 nm CMOS process. Simulation results highlight the benefits of split-length indirect compensation over the nested Miller compensation scheme. An AFE satisfying the requirements for a capacitive body-coupled communication receiver has been designed in 1.1 V, 40 nm CMOS. The AFE consists of a cascade of three amplifiers followed by a Schmitt trigger and digital buffers. The amplifiers utilize a two-stage OTA with split-length compensation. Three AFE topologies were simulated and compared in terms of noise, gain and power consumption.

The rest of the dissertation is organized as follows.

- **Chapter 2** discusses the design considerations for low-voltage SAR ADCs and OTAs.

- **Chapter 3** presents the design and implementation of a 0.4 V, 717 pW, 8-bit 1 kS/s SAR ADC in 65 nm CMOS for wireless sensor network applications. This chapter is based on **Paper I** and **Paper III**.

- **Chapter 4** presents the design and implementation of a 10-bit, 50 MS/s SAR ADC with an on-chip reference voltage buffer in 65 nm CMOS. This chapter is based on **Paper II** and **Paper IV**.

- **Chapter 5** presents the work on the design of mixed-signal interfaces with a focus on OTAs. This chapter is based on **Paper V** – **Paper IX**. Utilizing the beneficial features of the 28 nm UTBB FDSOI CMOS process, a PGA for a 9-bit SAR ADC (**Paper V**) and an ultra-low-voltage OTA (**Paper VI**) have been designed. A power-efficient reference voltage buffer has been designed for a 10-bit embedded SAR ADC in a fingerprint sensor (**Paper VII**). Frequency compensation techniques for a three-stage OTA and a receiver AFE for body-coupled communication in 40 nm CMOS are described in **Paper VIII** and **Paper IX** respectively.

- **Chapter 6** presents a conclusion and outlines the directions for future work.

Finally, Appendix A provides a copy of the published papers for a quick reference.
Chapter 2

Design Considerations for SAR ADCs and OTAs

This chapter describes the important design considerations for SAR ADCs and multi-stage amplifiers. The performance requirements of the crucial blocks in a SAR ADC and their associated design challenges are elaborated upon. Owing to the low intrinsic gain of transistors in advanced CMOS process nodes, multi-stage amplifiers with two or more stages are gaining popularity in mixed-signal systems. A discussion on frequency compensation of multi-stage amplifiers is included. The features of advanced process nodes such as the 28 nm ultra-thin box and body (UTBB) fully depleted silicon-on-insulator (FDSOI) CMOS process can be utilized to overcome performance bottlenecks and achieve significant trade-offs in power consumption vs. performance. Useful features of the 28 nm UTBB FDSOI CMOS process are also delineated.

2.1 SAR ADC Design Considerations

This section describes the non-idealities and design trade-offs in the sub-blocks of the SAR ADC such as the sample-and-hold (S/H) circuit, capacitive DAC and comparator.

2.1.1 Sample-and-Hold Circuit

The S/H circuit plays a critical role in determining the performance of the SAR ADC. The thermal noise associated with the sampling process degrades the SNR of the ADC. Nonlinear variation of the ON-resistance, signal dependent charge injection, and leakage are other non-idealities of the sampling switch that degrade the performance of the ADC.
2.1.1.1 Thermal Noise

The basic S/H circuit consists of a MOS transistor switch and a capacitor as shown in Fig. 2-1. During the tracking phase, when the switch is ON, the NMOS device approximates a linear resistor. The thermal noise of the MOS transistor is sampled on the capacitor $C_s$. For an $N$-bit ADC with a full-scale input voltage of $V_{FS}$, the quantization noise power is given by

$$P_Q = \frac{V_{FS}^2}{12 \cdot 2^{2N}}. \quad (2.1)$$

If the thermal noise of the sampler is designed to be equal to the quantization noise power, a 3 dB degradation in SNR will be incurred. In such a scenario, the value of the total sampling capacitance is given by

$$C_s = \frac{12kT \cdot 2^{2N}}{V_{FS}^2}. \quad (2.2)$$

Assuming $V_{FS} = 1$ V, and $N = 10$ bits, a minimum sampling capacitance of 52 fF will be needed to satisfy (2.2) at room temperature. However, in reality, the sampling capacitance is chosen such that the thermal noise contribution is much lower than the quantization noise so as to minimize the SNR degradation. It is worth noting that the S/H noise given by $kT/C_s$ is significantly impacted by the change in temperature. For the bootstrapped S/H circuit with $C_s = 480$ fF reported in [13], the simulated output noise over the entire set of process, supply voltage and temperature (PVT) corners is shown in Fig. 2-2. A supply voltage variation of $\pm 10\%$ and a temperature range of $[-40^\circ\text{C}, +125^\circ\text{C}]$ were utilized for the simulation. The computed value of the output noise $V_{noise,RMS} = \sqrt{\frac{kT}{C_s}}$ is $82\ \mu$V and $107\ \mu$V at $-40^\circ\text{C}$ and $+125^\circ\text{C}$ respectively. From Fig. 2-2, it is seen that the simulation results closely match the theoretical values. Also, the process and supply voltage variations have negligible influence on the output noise of the S/H circuit.
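The numbers quoted in this subsection can be reproduced with a few lines of arithmetic. The sketch below sets the sampled noise power $kT/C_s$ equal to the quantization noise power $P_Q$ of (2.1) and solves for $C_s$, then evaluates $\sqrt{kT/C_s}$ for the 480 fF case. Room temperature is assumed to be 300 K, and the function names are illustrative.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def min_sampling_cap(n_bits, v_fs, temp_k=300.0):
    """C_s for which kT/C_s equals the quantization noise power of (2.1)."""
    p_q = v_fs ** 2 / (12.0 * 2 ** (2 * n_bits))
    return K_BOLTZMANN * temp_k / p_q


def ktc_noise_rms(c_s, temp_k):
    """RMS value of the sampled thermal noise, sqrt(kT/C_s)."""
    return math.sqrt(K_BOLTZMANN * temp_k / c_s)


if __name__ == "__main__":
    print(f"C_s (N=10, V_FS=1 V): {min_sampling_cap(10, 1.0) * 1e15:.1f} fF")
    for temp_c in (-40.0, 125.0):
        v_n = ktc_noise_rms(480e-15, temp_c + 273.15)
        print(f"noise at {temp_c:+.0f} C: {v_n * 1e6:.1f} uV")
```

With these assumptions the capacitance evaluates to roughly 52 fF and the noise to roughly 82 µV and 107 µV, which agrees with the values given in the text.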

2.1.1.2 Charge Injection and Clock Feedthrough

Charge injection and clock feedthrough are error sources associated with the sampling switch. When the switch turns off at the start of the hold phase, the charge in the conduction channel of the MOS transistor is injected into the drain and source nodes which perturbs the sampled value on the capacitor. Clock feedthrough refers to the coupling of the gate control signal of the switch through the parasitic capacitance to the output node. Both these error sources are shown in Fig. 2-3. The combined error voltages due to the two phenomena in an NMOS and a PMOS switch are given by [14]

\[ \Delta V_{err,NMOS} = -\frac{k W_N L C_{ox}(V_{DD} - V_{THN} - V_{IN})}{C_s} - \frac{C_{GD,NMOS}}{C_s + C_{GD,NMOS}} V_{DD}, \quad (2.3) \]

\[ \Delta V_{err,PMOS} = \frac{k W_P L C_{ox}(V_{IN} - |V_{THP}|)}{C_s} + \frac{C_{GD,PMOS}}{C_s + C_{GD,PMOS}} V_{DD}, \quad (2.4) \]

where \( k \) is the fraction of the charge injected on the output node, \( C_{ox} \) is the gate-oxide capacitance, \( V_{THN} \) and \( V_{THP} \) are the threshold voltages, and \( C_{GD,NMOS}, C_{GD,PMOS} \) are the gate-drain overlap capacitance of the NMOS and PMOS respectively. In (2.3) and (2.4), the first part represents the charge-injection error. It is seen that the charge injection error has a linear dependency on the input signal which causes nonlinearity. An obvious way to reduce charge injection error is to use a larger sampling capacitor \( C_s \). However, this impacts the speed and power consumption adversely. Charge injection error can also be mitigated by circuit techniques such as dummy switch and bottom-plate sampling. Clock feedthrough error represented by the second part in (2.3) and (2.4) contributes an offset. It can be alleviated by adopting a fully-differential topology for the converter.

### 2.1.1.3 Tracking Bandwidth

During the tracking phase of the S/H circuit, the MOS transistor is turned ON in the linear region with an ON-resistance \( R_{ON} \) and the S/H circuit constitutes a simple RC filter. The primary source of non-linearity in the S/H circuit is the input-dependent variation of \( R_{ON} \). An \( N \)-bit SAR ADC with a sampling rate of \( f_s \) uses an internal clock frequency \( f_{sys} = (N + 1)f_s \). Assuming a half clock cycle period of \( f_{sys} \) for settling of the S/H circuit output gives the settling time as

\[ t_s = \frac{1}{2f_{sys}}, \quad (2.5) \]

For an \( N \)-bit ADC to achieve sufficient performance, the settling error at the output of the S/H circuit must be \( < \text{LSB}/2 \) [15] where LSB is the voltage corresponding to the least significant bit of the ADC. Utilizing the voltage settling expression for a single-pole RC filter with a time constant \( \tau \), we require

\[ e^{-\frac{t_s}{\tau}} < 2^{-(N+1)}. \quad (2.6) \]

The time-constant \( \tau \) can be expressed as

\[ \tau = \frac{1}{2\pi f_{3dB}}, \quad (2.7) \]

where \( f_{3dB} \) is the \(-3 \text{ dB} \) bandwidth of the RC filter formed by the S/H circuit. Let \( C_s \) be the sampling capacitance of the ADC that constitutes the load for the sampling switch. Then the $f_{3dB}$ of the S/H circuit can be expressed as

$$f_{3dB} = \frac{1}{2\pi R_{ON}C_s}. \quad (2.8)$$

![Figure 2-4: Impact of lower supply voltage on the $R_{ON}$ of a TG switch.](image)

Re-arranging (2.6) and substituting for $t_s$ and $\tau$ using (2.5) and (2.7) respectively results in

$$f_{3dB} \geq \frac{(N + 1) \ln 2}{\pi} f_{sys}. \quad (2.9)$$

From (2.8) and (2.9), it is evident that, for a given value of $C_s$, an upper-bound is set on the $R_{ON}$ of the S/H circuit to satisfy the targeted ADC performance.
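As a worked illustration, the sketch below derives the bandwidth requirement step by step from (2.5), (2.6) and (2.7), and then converts it into an upper bound on $R_{ON}$ through the RC time constant $\tau = R_{ON} C_s$. The 1 pF sampling capacitance used in the example is an arbitrary assumption chosen only for illustration, and the function names are hypothetical.

```python
import math


def min_tracking_bandwidth(n_bits, f_s):
    """Minimum -3 dB bandwidth (Hz) for settling to within LSB/2."""
    f_sys = (n_bits + 1) * f_s                     # internal clock frequency
    t_s = 1.0 / (2.0 * f_sys)                      # half clock period, per (2.5)
    tau_max = t_s / ((n_bits + 1) * math.log(2))   # from exp(-t_s/tau) < 2^-(N+1)
    return 1.0 / (2.0 * math.pi * tau_max)         # per (2.7)


def max_on_resistance(n_bits, f_s, c_s):
    """Largest R_ON (ohm) that still meets the bandwidth requirement."""
    return 1.0 / (2.0 * math.pi * min_tracking_bandwidth(n_bits, f_s) * c_s)


if __name__ == "__main__":
    n_bits, f_s, c_s = 10, 50e6, 1e-12  # c_s = 1 pF is an assumed value
    print(f"f_3dB >= {min_tracking_bandwidth(n_bits, f_s) / 1e9:.2f} GHz")
    print(f"R_ON  <= {max_on_resistance(n_bits, f_s, c_s):.0f} ohm")
```

For a 10-bit, 50 MS/s converter these assumptions give a bandwidth requirement slightly above 1.3 GHz, which shows why the switch resistance becomes critical at medium-to-high sampling rates.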

A lower supply voltage reduces the available gate-overdrive for the MOS switches in the S/H circuit leading to increased values for $R_{ON}$ and consequent degradation in linearity. The impact of lowering the supply voltage on the $R_{ON}$ of a transmission-gate (TG) S/H circuit in a 65 nm CMOS process is illustrated in Fig. 2-4. The TG switch uses standard-$V_{TH}$ devices with $(W/L)_N = (1.2 \mu m/1.2 \mu m)$ and $(W/L)_P = (2.4 \mu m/1.2 \mu m)$. For ultra-low-voltage applications, conventional bootstrapping [16] proves inadequate to overcome this limitation, and hence double-bootstrapping [17] or a cascade of charge pumps [18–20] is often employed. It is worth noting that there exists a trade-off between the number of charge pump stages in the cascade and the deterioration in voltage boosting due to parasitic capacitances [19].

2.1.1.4 Impact of Leakage

Even though the S/H circuit will remain OFF during the bit conversion cycles in the SAR ADC, subthreshold leakage in the transistors will cause the sampled voltage to droop. The subthreshold current is given by [21]

\[ I_{DS} = \mu_0 C_{ox} \frac{W}{L} (m - 1) V_T^2 e^{\frac{V_{GS} - V_{th}}{m V_T}} \left(1 - e^{-\frac{V_{DS}}{V_T}}\right), \quad (2.10) \]

where \( V_{GS} \) is the gate-source voltage, \( V_{th} \) is the threshold voltage, \( V_T = kT/q \) is the thermal voltage, \( C_{ox} \) is the gate oxide capacitance, \( \mu_0 \) is the zero-bias mobility and \( m \) is the subthreshold swing coefficient. Another contributor is the gate leakage current which occurs due to the high electric field across the gate oxide and the low oxide thickness. Major constituents of gate leakage are gate oxide tunneling and injection of hot carriers from substrate to the gate oxide [21]. In analog and mixed-signal circuits working at very low frequencies, the leakage power forms a significant portion of the total power consumption. The subthreshold leakage current depends on the voltage across the switch as seen in (2.10) and hence causes harmonic distortion at the ADC output [20, 22]. This problem is particularly acute in SAR ADCs with low sampling rates. With bottom-plate input sampling, the leakage of the S/H switches is not a major concern since the bottom-plate nodes of the DAC capacitors will be connected to the reference voltages. However, the bottom-plate input sampling technique is not suitable for ultra-low-voltage SAR ADCs since a large number of charge-pump based switches will be required. Leakage suppression is achieved in top-plate S/H circuits using device-stacking [22], employment of high-\( V_{TH} \) devices [20], negative body bias [19], negative gate bias [23] or a combination of these techniques. For the MOS transistors, the subthreshold leakage current increases with temperature while the threshold voltage reduces with temperature. For minimum-sized, standard-\( V_{TH} \) devices in 65 nm CMOS, the variation of leakage current and \( V_{TH} \) with temperature is shown in Fig. 2-5. Hence the worst-case leakage PVT condition must be considered to ensure robust S/H performance. **Paper I** implements a leakage-reduced switch with a multi-stage charge pump for a 0.4 V, 8-bit SAR ADC while **Paper II** describes a bootstrapped switch for a 1.2 V, 10-bit SAR ADC.

Figure 2-6: Binary-weighted capacitive DAC.

2.1.2 Capacitive DAC

The capacitive DAC in the SAR ADC provides feedback of the scaled reference voltage based on the control bits from the SAR logic. The capacitive array DAC is preferred to the resistor string DAC because of the improved matching properties of capacitors and the absence of static power dissipation. Figure 2-6 shows an $N$-bit conventional binary-weighted capacitive array DAC. Mismatches between the capacitors in the DAC as well as parasitic capacitances in the DAC layout cause nonlinearity at the ADC output and thus limit the INL, DNL performance of the SAR ADC. To reduce mismatch effects, the entire DAC capacitor array is constructed using multiples of a unit capacitor $C_u$. The capacitive DAC is usually laid out in common-centroid configuration to cancel global errors such as non-uniform oxide growth. To minimize the impact of parasitic capacitance due to the interconnections, adequate shielding is provided. In most SAR ADCs, the capacitive DAC also performs the sampling of the input signal. The choice of the unit capacitor is primarily determined by thermal noise and matching requirements. Limitations imposed by the process technology on the minimum capacitor value also have to be considered in the choice of $C_u$. The mismatch-limited $C_u$ value for the fully differential binary-weighted array DAC is given by

$$C_u \geq 9(2^N - 1)K_\sigma^2 K_c, \quad (2.11)$$

where $K_\sigma$ is the matching coefficient and $K_c$ is the capacitance density parameter. The detailed derivation of (2.11) is provided in Chapter 3.

The disadvantage of a conventional binary-weighted array DAC is that the capacitance increases exponentially with the resolution of the ADC. For medium/high speed SAR ADCs, the RC settling time for the DAC capacitor and associated switch poses a speed bottleneck. Furthermore the binary-weighted array entails increased chip area and power consumption. The split-array capacitive DAC aims to mitigate these drawbacks. Figure 2-7 shows an $N$-bit split array consisting of an $M$-bit main DAC and an $S$-bit sub DAC where $M + S = N$. The bridge capacitor $C_B$ is given by

$$C_B = \frac{2^S}{2^S - 1} C_u. \quad (2.12)$$

For a 10-bit split-array DAC with $M = S = 5$, $C_B = \frac{32}{31} C_u$. This fractional value of $C_B$ poses difficulty for layout and worsens mismatch. A technique to overcome this limitation is to remove the dummy $C_u$ in the sub DAC such that $C_B = C_u$ [24]. However this causes a gain error of $1/(1 - 2^{-N})$ [25] which can be calibrated in the digital domain if needed. It is shown in [26] that the parasitic capacitance $C_{P,A}$ causes code-dependent errors and thus degrades linearity of the ADC. In order to reduce $C_{P,A}$, the bottom plate node of $C_B$ which contributes larger parasitic capacitance should be connected to the main DAC. Reducing the number of bits in the sub DAC helps to decrease $C_{P,A}$. However, this results in a larger main DAC for a given $N$ and consequently larger spread in capacitor values. Hence the distribution of bits between the main DAC and sub DAC should consider the trade-off between nonlinearity and capacitance spread [26]. The mismatch-limited $C_u$ value for the fully differential $N$-bit split-array DAC is given by

$$C_u \geq 9 \cdot (2^{M} - 1) \cdot 2^{2(N-M)} \cdot K_\sigma^2 \cdot K_c. \quad (2.13)$$

The detailed derivation of (2.13) is provided in Chapter 4. The ratio of the mismatch-limited $C_u$ values for the split-array and binary-weighted DACs with $N = 10$, $M = S = 5$, assuming the same $K_\sigma$ and $K_c$, is given by

$$\frac{C_{u,split}}{C_{u,bw}} = \frac{(2^{M} - 1) \cdot 2^{2(N-M)}}{2^N - 1} \approx 31, \quad (2.14)$$

indicating that the split-array DAC imposes significantly larger $C_u$ to meet the desired linearity. If a particular ADC design uses only capacitors provided in the design kit and $C_{u,split} \ll C_{u,proc}$ where $C_{u,proc}$ is the minimum value of the capacitor available in the design kit, then selecting a split-array DAC offers benefits over the binary-weighted topology. However, if very low custom-designed unit capacitors much lower than $C_{u,proc}$ are used, then the binary-weighted DAC will be advantageous [27]. The 8-bit ADC in Paper I utilizes a binary-weighted capacitive array DAC with custom-designed unit capacitors while Paper II implements a 10-bit ADC using a split-array DAC.
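The comparison between the two DAC topologies can be checked numerically. The sketch below evaluates only the topology-dependent factors of the mismatch-limited unit capacitor, since $K_\sigma^2 K_c$ cancels in the ratio. It takes the main-DAC factor of the split array as $(2^M - 1)$, which is the reading that reproduces the ratio of about 31 quoted for $N = 10$, $M = S = 5$, and it uses the standard bridge-capacitor expression $C_B = \frac{2^S}{2^S - 1} C_u$, which matches the quoted $\frac{32}{31} C_u$. The function names are illustrative.

```python
def cu_factor_binary(n_bits):
    """Topology factor of the mismatch-limited C_u, binary-weighted DAC."""
    return 9 * (2 ** n_bits - 1)


def cu_factor_split(n_bits, m_bits):
    """Topology factor of the mismatch-limited C_u, split-array DAC."""
    return 9 * (2 ** m_bits - 1) * 2 ** (2 * (n_bits - m_bits))


def bridge_cap_ratio(s_bits):
    """Bridge capacitor C_B expressed in units of C_u."""
    return 2 ** s_bits / (2 ** s_bits - 1)


if __name__ == "__main__":
    ratio = cu_factor_split(10, 5) / cu_factor_binary(10)
    print(f"C_u,split / C_u,bw = {ratio:.2f}")
    print(f"C_B / C_u          = {bridge_cap_ratio(5):.4f}")
```

Sweeping `m_bits` for a fixed `n_bits` gives a quick view of how the bit partition affects the required unit capacitor, which complements the trade-off between nonlinearity and capacitance spread mentioned above.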

### 2.1.3 Comparator

The dynamic comparator commonly used in SAR ADCs consists of a differential pair loaded by a regenerative latch [28]. In some applications, a preamplifier is used before the dynamic comparator to attenuate the thermal noise and improve the speed. However, many recent works on SAR ADCs employ only the dynamic comparator to achieve moderate resolution with high power efficiency. Consider the dynamic comparator shown in Fig. 2-8. The operation of the dynamic comparator consists of two phases. During the reset phase (clk is LOW), the switches $M_8$-$M_{11}$ are ON and the outputs as well as the drain nodes of $M_1$, $M_2$ are charged to $V_{DD}$. In this phase, the comparator is cleared of the previous state. Since the tail-current device $M_3$ is OFF, no current is drawn. During the evaluation phase (clk is HIGH), the input voltage difference at the differential pair causes their drain nodes to be discharged from $V_{DD}$. The cross-coupled inverters are initially OFF. The input transistors have different drain currents and this causes their drain nodes and the outputs to be discharged at different speeds. Finally, one of the cross-coupled inverters is activated. Strong positive feedback of the regenerative latch amplifies the output voltage difference until one output reaches $V_{DD}$ and the other reaches ground. The important performance specifications of the comparator such as offset, noise, speed and metastability are discussed in the following subsections.
2.1.3.1 Offset

Offset in the dynamic comparator is caused by mismatches in the threshold voltages, device dimensions, and current factor $\mu C_{\text{ox}}$ [29]. Capacitive load imbalance on the output nodes also contributes to offset [30]. The offset voltage contribution is usually dominated by mismatches in the input differential pair of the dynamic comparator [29]. An obvious technique to reduce the offset voltage is to increase the size of the input pair. But this method entails high power consumption due to the parasitic capacitances in the input pair. Conventionally, a preamplifier is added before the dynamic comparator to reduce the input-referred offset. In such a case, the preamplifier provides sufficient gain to the comparator inputs so that the offset voltage is overcome. However, a high bandwidth preamplifier will consume large power. Also, attaining sufficient gain in the preamplifier becomes more challenging in scaled CMOS technologies. Intentional capacitor mismatch is often introduced at the comparator output nodes to cancel the input-referred offset [1].

2.1.3.2 Noise

Even though the dynamic comparator achieves high speed with excellent power efficiency, it suffers heavily from thermal noise. The input-referred noise of the comparator adds directly to the noise budget of the ADC and hence it should be minimized. Noise analysis of the dynamic comparator is rendered difficult by the fact that the operating regions are time-varying. Noise analysis in [31] uses stochastic differential equations. In [31], a number of design guidelines for mitigating noise have been outlined. A linear time-varying model is used in [32] to accurately predict the error probability. It is shown in [32] that the input-referred noise has the familiar $kT/C$ form scaled by $(g_{m}/I_d)^{-1}$. It is important to note that many design techniques for reducing noise degrade the comparator speed [32].

2.1.3.3 Speed

The speed of the comparator is determined by the input differential voltage at the start of the regeneration phase as well as the regeneration time constant. The regeneration time constant is given by

$$\tau_{\text{reg}} = \frac{C_c}{g_{m,\text{INV}}}, \hspace{1cm} (2.15)$$

where $C_c$ is the capacitive load on the regenerative nodes and $g_{m,\text{INV}}$ is the total transconductance of the inverter. The regeneration time constant is mainly determined by the transit frequency $f_T$ of the CMOS process. However, $\tau_{\text{reg}}$ is also impacted by the sizing of the devices in the comparator [33]. In a SAR ADC, the comparator should be fast enough to resolve a differential input voltage of $\text{LSB}/2$ within the allotted time under all PVT conditions.

2.1.3.4 Metastability
Metastability occurs when the input differential voltage is so small that the latch cannot produce acceptable logic levels within the requisite time. Metastability can be caused by low speed of the comparator and/or inadequate time allotted to the regeneration phase. Since the comparator outputs have not attained the proper logic levels, the succeeding digital logic will interpret them differently leading to large errors in the A/D conversion. For SAR ADCs, [34] provides analysis of metastability errors and derives the signal-to-metastability-error ratio (SMR).

2.2 OTA Design Challenges

Many applications such as low-dropout regulators, high-resolution ADCs, sensitive receiver AFEs, etc., require amplifiers with high gain. The design of high-performance OTAs in advanced CMOS process nodes ($L_{\text{min}} < 90$ nm) with low supply voltages is rendered difficult by a number of factors. The reduced output resistance of transistors in deep-submicron CMOS processes results in lower intrinsic gain ($g_{m}/g_{ds}$) of the transistor [35]. The plot of intrinsic gain vs. $V_{DS}$ for minimum-sized, standard-$V_{TH}$ NMOS transistors in 65 nm, 40 nm and 28 nm CMOS technologies is shown in Fig. 2-9. A reduced supply voltage results in limited voltage headroom. Consider the single-stage differential-input amplifier shown in Fig. 2-10. The minimum supply voltage and input common-mode voltage for a certain process variation $\Delta V_{TH}$ can be derived as [36]

$$V_{DD} \geq 3V_{DS,sat} + |\Delta V_{TH}|, \quad (2.16)$$

$$V_{in,CM} \geq V_{DS,sat} + V_{GS} + |\Delta V_{TH}| = V_{DS,sat} + V_{ov} + V_{TH} + |\Delta V_{TH}|. \quad (2.17)$$

From (2.16) and (2.17), it is seen that the minimum supply voltage is limited by $V_{DS,sat}$ while the input common-mode range is limited by $V_{TH}$. Since $V_{DS,sat}$ does not scale with technology [36] and $V_{TH}$ scales at a lower rate than the supply voltage, the design of analog circuits with large common-mode range and robust operation over PVT corners in low-voltage process nodes constitutes a formidable challenge. Techniques such as forward body biasing to lower $V_{TH}$ [37] and pseudo-differential stages [38] have been proposed to mitigate the voltage headroom problem. However, the range of body bias voltage is limited in bulk CMOS technologies due to the risk of forward biasing the source-to-bulk diodes. Though a pseudo-differential topology eliminates the voltage drop over the tail current source, it suffers from large common-mode gain which necessitates a common-mode feedforward (CMFF) circuit. Body-input OTAs which use the bulk of the MOS device as the input have been implemented for ultra-low-voltage applications [37]. A major impediment for the body-input topology is the reduction in $g_{mb}/g_{m}$ in advanced CMOS processes. For example, $g_{mb}/g_{m} = 0.12$ in 65 nm CMOS [39]. The lower $g_{mb}$ value results in increased noise and lower unity-gain frequency. Also, the bulk-input stage can draw substantial currents which will load the preceding stage.

Figure 2-9: Intrinsic gain vs. $V_{DS}$ in advanced CMOS process nodes.

Figure 2-10: Voltage headroom in a single-stage differential amplifier.
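
The headroom constraints (2.16) and (2.17) can be evaluated with a few lines of code. The sketch below is only an illustration: the function name and the device values are hypothetical and are not taken from any of the processes discussed here.

```python
def min_headroom(v_ds_sat, v_ov, v_th, delta_v_th):
    """Return (V_DD,min, V_in,CM,min) in volts from (2.16) and (2.17)."""
    v_dd_min = 3 * v_ds_sat + abs(delta_v_th)
    v_in_cm_min = v_ds_sat + v_ov + v_th + abs(delta_v_th)
    return v_dd_min, v_in_cm_min

# Hypothetical device values in volts, for illustration only.
v_dd_min, v_cm_min = min_headroom(v_ds_sat=0.1, v_ov=0.1, v_th=0.35, delta_v_th=0.05)
print(f"V_DD >= {v_dd_min:.2f} V, V_in,CM >= {v_cm_min:.2f} V")
```

With these assumed numbers the required input common-mode level is well above the minimum supply voltage, which shows why $V_{TH}$ rather than $V_{DS,sat}$ tends to dominate the common-mode constraint.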

Cascoding is not feasible when acceptable signal swings must be maintained under low supply voltages. Hence cascading of amplifier stages has emerged as a viable option to achieve high gain. Cascading two or more amplifier stages increases the gain and output swing at the cost of increased circuit complexity, stability issues, and higher power consumption. An overview of the commonly used frequency compensation techniques is provided in the following section.

2.2.1 Stabilization of OTAs

OTAs are mostly used in a feedback configuration. In a feedback loop, the Barkhausen criteria for oscillation must not be satisfied, so that the amplifier does not turn into an oscillator. Stability therefore requires that the OTA achieve sufficient phase margin (PM). To accomplish this, the OTA should have a single dominant pole, with the non-dominant poles placed at much higher frequencies than the unity-gain frequency. Designers achieve this through frequency compensation topologies which utilize capacitors and resistors for pole-splitting and pole-zero cancellation. The following assumptions are made to simplify the transfer function analysis of various amplifiers [40]:

- The gains of all stages are much greater than one.
- The loading and compensation capacitances are much larger than the lumped output parasitic capacitances of each stage.
- Inter-stage coupling capacitances are negligible.

For the different OTAs described in this section, $A_i$ and $g_{mi}$ represent the gain and transconductance of the $i$-th stage. $R_i$ and $C_i$ are the resistance and capacitance associated with the $i$-th stage. Also, the amplifiers have only capacitive loads and do not include any buffer stage at the output.

2.2.1.1 Single-Stage OTA

A single-stage fully differential OTA is shown in Fig. 2-11. The only high-impedance nodes are at the outputs. The gain-bandwidth product (GBW) is given by

$$\text{GBW} = \frac{g_{m1}}{C_L}. \quad (2.18)$$

The transfer function is given by [40]

$$A_{v,\text{single}}(s) = \frac{-g_{m1} R_L}{1 + s C_L R_L}, \quad (2.19)$$

where \( g_{m1} \) is the transconductance of \( M_1 \), \( R_L = (r_{o1} \parallel r_{o2}) \) is the output resistance and \( C_L \) is the load capacitance. From (2.19) it is clear that the amplifier has no zero and only one left-half plane (LHP) pole given by \( p_{3dB} = 1/R_L C_L \) in the frequency response. Hence this OTA is always stable. Assuming that the GBW is much higher than the pole, the phase margin (PM) is 90°. The gain of the single-stage amplifier is only \( g_{m1} R_L \) which proves inadequate for many applications. In switched-capacitor \( \Sigma \Delta \) ADCs, high gain OTAs are needed to reduce gain error in the integrator and minimize noise leakage. Achieving an open-loop DC gain > 60 dB in advanced CMOS process nodes with a single-stage OTA requires a telescopic cascode topology often enhanced by gain boosting [41]. However this topology severely restricts the output swing of the OTA and requires a feedforward \( \Sigma \Delta \) modulator to be employed.

### 2.2.1.2 Two-Stage OTA

In order to achieve high DC gain combined with large output swing, two-stage OTAs are used. In a two-stage OTA, gain is distributed between the two stages and the output stage is a common source (CS) stage which provides large output swing. However, it should be noted that the two-stage OTA will entail higher power consumption due to static bias currents in the two stages. The block diagram of a two-stage OTA is shown in Fig. 2-12. It has two high-impedance nodes denoted by \( V_1 \) and \( V_{out} \). A compensation capacitor \( C_m \) is connected between these nodes to provide pole splitting and to generate a dominant pole. The second stage must be an inverting amplifier to ensure that \( C_m \) provides negative feedback. This topology is referred to as the simple Miller compensated (SMC) OTA.

Figure 2-12: Simple Miller compensated (SMC) two-stage OTA.

The transfer function of the SMC OTA is given by [40]

$$A_{v, SMC}(s) = \frac{g_{m1} g_{mL} R_1 R_L \left( 1 - s \frac{C_m}{g_{mL}} \right)}{(1 + s C_m g_{mL} R_1 R_L) \left( 1 + s \frac{C_L}{g_{mL}} \right)}. \quad (2.20)$$

From (2.20) we find that there are two LHP poles and one right-half plane (RHP) zero. The dominant pole is given by

$$p_1 = \frac{1}{C_m g_{mL} R_1 R_L}.$$ (2.21)

The non-dominant pole and RHP zero are obtained as

$$p_2 = \frac{g_{mL}}{C_L},$$ (2.22)

$$z_1 = \frac{g_{mL}}{C_m},$$ (2.23)

respectively. The GBW of the two-stage amplifier is

$$\text{GBW} = \frac{g_{m1}}{C_m}.$$ (2.24)

In order to achieve closed-loop stability, $p_2$ and $z_1$ must be placed at higher frequencies than the GBW. If $C_m$ is increased to separate the poles further, the GBW is reduced as evident from (2.24). Hence overcompensation proves to be harmful. To achieve a PM of $\approx 60^\circ$, the GBW is set to half of $p_2$. Using (2.24) and (2.22), we get

$$C_m = 2 \left( \frac{g_{m1}}{g_{mL}} \right) C_L.$$ (2.25)
Since the GBW is set to half of $p_2$, we have

$$\text{GBW} = \frac{g_{m1}}{C_m} = \frac{1}{2} \left( \frac{g_{mL}}{C_L} \right),$$

(2.26)

which is half of that of the single-stage amplifier [40]. It is seen from (2.24) and (2.25) that the GBW of an SMC amplifier cannot be increased by increasing $g_{m1}$ since $C_m$ needs to be increased proportionally in order to maintain (2.26). The expression for the PM of the SMC OTA is given as [40]

$$PM = 180^\circ - \tan^{-1} \left( \frac{\text{GBW}}{p_1} \right) - \tan^{-1} \left( \frac{\text{GBW}}{p_2} \right) - \tan^{-1} \left( \frac{\text{GBW}}{|z_1|} \right)$$

$$= 63^\circ - \tan^{-1} \left( \frac{g_{m1}}{g_{mL}} \right).$$

(2.27)
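
The design relations (2.22) to (2.27) can be checked numerically. The following sketch sizes $C_m$ from (2.25) and evaluates the resulting phase margin, treating the dominant pole as contributing 90° at the GBW. The function name and the transconductance and load values are hypothetical.

```python
import math

def smc_design(gm1, gml, c_l):
    """Size C_m per (2.25) and return (C_m, GBW in rad/s, PM in degrees)."""
    c_m = 2.0 * (gm1 / gml) * c_l   # (2.25)
    gbw = gm1 / c_m                 # (2.24)
    p2 = gml / c_l                  # (2.22)
    z1 = gml / c_m                  # (2.23), RHP zero
    # Assumption: the dominant pole contributes ~90 degrees at GBW.
    pm = 90.0 - math.degrees(math.atan(gbw / p2)) - math.degrees(math.atan(gbw / z1))
    return c_m, gbw, pm

# Hypothetical values: g_mL = 1 mS, C_L = 1 pF, swept g_m1/g_mL ratio.
for ratio in (0.1, 0.25, 0.5):
    c_m, gbw, pm = smc_design(gm1=ratio * 1e-3, gml=1e-3, c_l=1e-12)
    print(f"gm1/gmL = {ratio:4.2f}: C_m = {c_m * 1e12:.2f} pF, PM = {pm:.1f} deg")
```

The sweep shows the phase margin falling as $g_{m1}/g_{mL}$ grows, in line with (2.27).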

Thus a low $g_{m1}/g_{mL}$ ratio provides higher PM. However, $g_{m1}$ is limited by the bias current and the size of the input differential pair. To achieve high slew rate, a large bias current is required while low offset necessitates wide input transistors. Hence a low value of $g_{m1}$ is often not achievable. In such a scenario, the SMC amplifier needs to be designed with large $g_{mL}$ to achieve sufficient PM. This leads to large currents in the second stage, degrading the power efficiency of the amplifier.

From (2.27) it is seen that the RHP zero degrades the PM of the SMC OTA. The small-signal model of the SMC OTA is shown in Fig. 2-13. The RHP zero occurs due to the feedforward small-signal current that flows from the input to the output through $C_m$. In Fig. 2-13, the feedforward current flowing into the output node $V_{out}$ is $i_{ff} = sC_mV_1$ while the current $g_{mL}V_1$ flows out of the output node [42]. The total current at $V_{out}$ is $i_v = (g_{mL} - sC_m)V_1$. A zero exists in the transfer function where $i_v$ becomes zero [42], as evident from (2.20). The RHP zero can be eliminated by blocking the feedforward current through $C_m$. A source follower in series with $C_m$ [43] or a common-gate stage [44] can be used to accomplish this. Connecting $C_m$ to the source node of a cascode device in the first stage of the OTA also helps to reduce the feedforward current [42].

The effect of the RHP zero can also be nullified by inserting a resistor $R_m$ in series with $C_m$ as shown in Fig. 2-14, which yields the simple Miller compensation with nulling resistor (SMCNR) topology. The transfer function of the SMCNR OTA is given by [40]

\[
A_{v,SMCNR}(s) = \frac{g_{m1}g_{mL}R_1R_L}{1 + sC_m(R_m + g_{mL}R_1R_L)} \left[ \frac{1 - sC_m \left( \frac{1}{g_{mL}} - R_m \right)}{1 + sC_L \left( \frac{1}{R_m + g_{mL}R_1R_L} \right)} \right].
\]  

(2.28)

The dominant pole for the SMCNR amplifier is same as that for the SMC amplifier and is given by

\[
p_1 = \frac{1}{C_m g_{mL} R_1 R_L}.
\]  

(2.29)

The non-dominant pole is given by

\[
p_2 \approx \frac{g_{mL}}{C_L}.
\]  

(2.30)

From (2.28), it is seen that the RHP zero is located at

\[
z_{RHP} = \frac{1}{C_m \left( \frac{1}{g_{mL}} - R_m \right)}. \tag{2.31}
\]

There are three ways to nullify the effect of the RHP zero. These are

* Move the zero to infinity. This is done by choosing

\[
R_m = \frac{1}{g_{mL}}. \tag{2.32}
\]

* Move the zero to the LHP. An LHP zero helps to improve PM. This can be done by selecting

\[
R_m > \frac{1}{g_{mL}}. \tag{2.33}
\]

* The final option is to move \( z_{RHP} \) to the LHP and use it to cancel \( p_2 \). By equating (2.30) with (2.31), we find that this can be achieved by choosing

\[ R_m = \frac{C_L + C_m}{g_{mL} C_m}. \]  

(2.34)
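
The nulling-resistor choices above can be sketched as follows. The code returns $R_m$ for the two equality cases (2.32) and (2.34) and evaluates the zero location from (2.31); any $R_m$ above the first value corresponds to (2.33). Function names and component values are hypothetical.

```python
def nulling_resistor_options(gml, c_m, c_l):
    """Return R_m in ohms for the equality cases (2.32) and (2.34)."""
    return {
        "zero_at_infinity": 1.0 / gml,            # (2.32)
        "cancel_p2": (c_l + c_m) / (gml * c_m),   # (2.34)
    }

def zero_frequency(gml, c_m, r_m):
    """Zero location in rad/s from (2.31); positive is RHP, negative is LHP."""
    denom = c_m * (1.0 / gml - r_m)
    return float("inf") if denom == 0 else 1.0 / denom

# Hypothetical values: g_mL = 1 mS, C_m = 0.5 pF, C_L = 1 pF.
gml, c_m, c_l = 1e-3, 0.5e-12, 1e-12
options = nulling_resistor_options(gml, c_m, c_l)
for name, r_m in options.items():
    print(f"{name}: R_m = {r_m:.0f} ohm, zero at {zero_frequency(gml, c_m, r_m):.3g} rad/s")
```

For the pole-cancelling choice, the zero lands in the LHP at a magnitude equal to $p_2 = g_{mL}/C_L$ from (2.30).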

Indirect feedback frequency compensation of a two-stage OTA using split-length transistors has been proposed in [45]. This technique does not involve the use of common-gate or cascode stages thus avoiding extra devices and bias currents as well as signal swing limitation. Paper II and Paper IX employ this compensation technique on two-stage OTAs designed in 65 nm and 40 nm CMOS respectively.

### 2.2.1.3 Three-Stage OTA

The three-stage OTA has three high-impedance nodes. These spawn additional poles which require more elaborate frequency compensation schemes than the SMC. The nested Miller compensation (NMC) scheme and its variants are often used to stabilize the three-stage OTA. The block diagram of a three-stage OTA with NMC is shown in Fig. 2-15. Before compensation, the poles associated with the nodes 1, 2 and 3 are close to each other. In order to separate these poles and generate a single dominant pole, compensation capacitors \( C_{c1} \) and \( C_{c2} \) are connected as shown in Fig. 2-15. The second stage should be non-inverting and the third stage inverting to ensure that the capacitors provide negative feedback and achieve pole-splitting. According to [46], the two non-dominant poles in the three-stage NMC OTA should be placed at \( 3 \times GBW \) and \( 5 \times GBW \) respectively in order to achieve a 60° phase margin. When resistors are added in series with \( C_{c1} \) and \( C_{c2} \), the scheme is called nested Miller compensation with nulling resistor (NMCNR). Numerous compensation schemes for three-stage amplifiers that alleviate the disadvantages of NMC have been proposed [47], [48], [49]. In all these works, the objective is to achieve sufficient PM and GBW with low values of compensation capacitance such that power efficiency is enhanced. Paper VIII compares the NMCNR and
2.3 Features of the 28 nm UTBB FDSOI CMOS Process

In the 28 nm ultra-thin body and buried oxide (UTBB) fully-depleted silicon-on-insulator (FDSOI) CMOS process, a 25 nm buried oxide (BOX) separates the thin silicon layer (7 nm) from the substrate [50]. This structure achieves total dielectric isolation and thus provides latch-up immunity. The undoped channel facilitates lower $V_{TH}$ variability. Use of a high-K metal gate reduces the gate leakage. Since the carriers are entirely confined to the thin SOI channel, excellent control of short channel effects (SCE) is achieved [50]. In this process, the drain and source regions are isolated from the bulk by the BOX and the body of the transistor can be used as a back-gate to tune the threshold voltage. Unlike in bulk CMOS technologies, the range of forward body bias (FBB) voltages is not restricted by junction forward biasing, since the BOX eliminates the parasitic diodes between the bulk and the source/drain terminals. The low-$V_{TH}$ (LVT) transistors in the 28 nm UTBB FDSOI CMOS process are fabricated as 'flip-well' devices where the NMOS and PMOS devices are placed in the N-well and P-well respectively [51]. For the LVT transistors, the range of FBB voltage is $[-0.3\,\text{V}, +3\,\text{V}]$ and $[-3\,\text{V}, +0.3\,\text{V}]$ for the NMOS and PMOS respectively. A strong body bias factor ($\Delta V_{TH} / \Delta FBB$) of 85 mV/V combined with the wide range of FBB voltages enables the $V_{TH}$ to be tuned by approximately 250 mV, which is significantly higher than that in bulk CMOS technologies. Reduction of $V_{TH}$ using FBB increases the gate overdrive of a transistor for a given $V_{GS}$, leading to improved linearity. FBB also boosts the transconductance of MOS devices. Similarly, reverse body bias (RBB) is supported for regular-$V_{TH}$ (RVT) devices, where the range of RBB voltage is $[-3\,\text{V}, +0.3\,\text{V}]$ and $[-0.3\,\text{V}, +3\,\text{V}]$ for the NMOS and PMOS respectively. Such features allow designers to trade off performance and power consumption effectively using the body bias (BB), which is applied to the wells beneath the BOX.

2.3.1 Control of Threshold Voltage Using Body Bias

The body bias voltage (BB) has a strong influence on the threshold voltage ($V_{TH}$) of transistors in the 28 nm UTBB FDSOI process. Figure 2-16 plots the tuning of $V_{TH}$ for minimum-sized low-$V_{TH}$ (LVT) MOS transistors using forward BB (FBB) at the nominal power supply voltage of 1 V. The body factor is defined as $\Delta V_{TH} / \Delta FBB$. From Fig. 2-16 it is found that the NMOS device has a body factor of 80 mV/V and the PMOS device has a body factor of 103 mV/V. Reverse BB (RBB) can be used to increase the $V_{TH}$ of regular-$V_{TH}$ (RVT) transistors. RBB helps to curb leakage and boosts the resistance of the MOS devices. The body factor for RBB is very similar.
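
As a quick numerical reading of the body factors quoted above, the sketch below estimates the $V_{TH}$ shift at the maximum FBB magnitude of 3 V. It assumes that the body factor stays constant over the whole FBB range, which is a simplification; the function name is hypothetical.

```python
def vth_shift_mv(body_factor_mv_per_v, fbb_v):
    """Linear estimate of the V_TH reduction in mV for an FBB magnitude in volts."""
    return body_factor_mv_per_v * fbb_v

# Body factors taken from the text (80 mV/V for NMOS, 103 mV/V for PMOS).
for device, k in (("NMOS", 80.0), ("PMOS", 103.0)):
    print(f"{device}: delta V_TH at 3 V FBB = {vth_shift_mv(k, 3.0):.0f} mV")
```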
2.3.2 Intrinsic Gain of Transistors

For analog design, the intrinsic gain of the MOS transistors defined as $g_m/g_{ds}$ is of significance. Figure 2-18 plots the intrinsic gain vs. $V_{DS}$ for a minimum-sized LVT NMOS transistor at different $V_{GS}$ voltages. For the NMOS, $V_{TH} \approx 300$ mV. It is seen that the intrinsic gain is increased when $V_{GS}$ is close to the threshold voltage. The maximum intrinsic gain obtained is 19.5 dB (9.5 V/V).

2.3.3 Boosting Transconductance Using Forward Body Bias

In the 28 nm FDSOI process, forward body bias can be used to boost the transconductance $g_m$. Figure 2-19 plots the $g_m$ vs. $V_{GS}$ of an LVT NMOS device with $W/L = 2\,\mu\text{m}/100\,\text{nm}$ and $V_{DD} = 1\,\text{V}$ for different FBB voltages. At $V_{GS} = 0.5\,\text{V}$, FBB = 3 V provides a $g_m$ value which is 3.94 times that obtained at FBB = 0 V. Enhancement of $g_m$ provides benefits such as increased unity-gain frequency and lower noise. In [51], FBB has been utilized to improve linearity over a wide supply voltage range by tuning $g_m$ while [52] employs FBB to lower the $V_{TH}$ and hence reduce the ON-resistance of a sampling switch. Only negligible currents, consisting of leakage currents in the reverse-biased diodes, are drawn from the body biasing voltage source. Paper V utilizes the RBB feature to boost the DC gain of a PGA by 6 dB while Paper VI exploits FBB to design a fully differential OTA with $V_{DD} = 0.4$ V.

2.4 Summary

In this chapter the design trade-offs in the sub-blocks of SAR ADCs were described with an emphasis on low-voltage implementation in advanced CMOS nodes. The various nonidealities in a S/H circuit as well as two different capacitive DAC topologies were discussed. Since multi-stage OTAs are relied upon to achieve adequate DC gain and output swing in scaled CMOS technologies, a brief overview of low-voltage OTA design and frequency compensation techniques was provided along with the useful features of the 28 nm UTBB FDSOI CMOS process.
Chapter 3

Design of a 0.4 V, sub-nW, 8-bit 1 kS/s SAR ADC

3.1 Introduction

Wireless sensor networks (WSNs) are increasingly employed in a wide range of applications including military surveillance and environmental monitoring. Soil moisture measurement using WSNs is used for desertification studies, efficient management of water resources and to provide adequate irrigation [53, 54]. To ensure long-term autonomous operation, the WSN is powered by photovoltaic cells which harvest ambient light energy. An energy storage element (e.g. supercapacitor) acts as the reservoir for the harvested energy. Small form-factors are required for the nodes to reduce cost and simplify deployment. For mm-scale photovoltaic cells, the output power can be as low as tens of nW [55]. Hence ultra-low power consumption is paramount for the WSN electronics.

This chapter presents a 0.4 V, 8-bit, 1 kS/s SAR ADC with sub-nW power consumption targeted at WSNs for soil-moisture sensing. Under such ultra-low supply voltages and low sampling rate, a formidable trade-off between the ON-resistance and subthreshold leakage of the input sampling switches occurs. Since traditional bootstrapping and charge-pumps [1, 8] prove inadequate for the task, a two-stage charge-pump has been implemented that generates > 2.5X boosted gate control voltage for the input sampling switches which have been designed to alleviate leakage. To minimize the DAC power consumption and area without using additional voltages and boosted switches, a binary-weighted capacitor array with a low-value, custom-designed unit capacitor $C_u = 1.9 \text{ fF}$ is used. This strategy allows $C_u$ to be chosen close to the thermal noise limited value. A dynamic latch comparator which does not entail static power consumption has been designed to achieve low input-referred noise. To minimize leakage power, the SAR logic utilizes minimum-sized

The ADC was designed and fabricated in 65 nm CMOS and uses a supply voltage of 0.4 V. In measurement, the prototype ADC achieves an ENOB of 7.81 bits at near-Nyquist input while consuming 717 pW. The DNL and INL are 0.35 LSB and 0.36 LSB respectively. The resulting FoM is 3.19 fJ/conv-step and the core area occupied is only 0.0126 mm$^2$. An ultra-low-power, 10 kHz RC oscillator has also been designed and fabricated on the same chip. In the envisaged operational scenario for the soil moisture sensor node, the RC oscillator will provide the clock cycles for the bit conversions in the SAR ADC and other digital blocks once it is activated by an external wake-up signal. Design details and measurement results for the RC oscillator are provided in this chapter.

3.2 ADC Architecture

The block diagram of the proposed ADC which utilizes a supply voltage $V_{DD} = 0.4$ V is shown in Fig. 3-1. The ADC consists of differential binary-weighted capacitive DACs, a dynamic latch comparator and synchronous SAR logic. The differential inputs are sampled on the top-plate node of the DAC capacitors using boosted sampling switches. For such low supply voltages, top-plate sampling is advantageous compared to bottom-plate sampling, since it helps to obviate the increased number of boosted switches as well as the associated power consumption. The DAC switches are simple inverters which switch between the high and low reference levels $V_{REF} = V_{DD}$ and ground respectively. In order to achieve full-range input sampling without the use of extra voltages, top-plate sampling with MSB preset [56] has been used.

Figure 3-1: Block diagram of the proposed ADC.

3.3 Circuit Implementation

The design details of the various circuit blocks in the ADC are described in this section.

3.3.1 Input Sampling Switch

When the sampling pulse is HIGH, the sampling switches close and the inputs are sampled on the DAC capacitors. During the bit cycling period, which comprises eight clock periods of a 10 kHz clock, the sampling switches are turned OFF (HOLD phase) and will then suffer from significant leakage. The leakage current causes a voltage droop in the held voltage which eventually degrades the performance. In addition, the subthreshold leakage current shows a nonlinear dependency on the voltage $V_{DS}$ across the sampling switch which causes harmonic distortion in the ADC [56]. In the ON state, the sample-and-hold (S/H) constitutes a low-pass filter (RC network) with the tracking bandwidth given by

$$f_{3dB} = \frac{1}{2\pi R_{ON} C_s}, \quad (3.1)$$

where $C_s$ is the sampling capacitance and $R_{ON}$ is the switch ON-resistance. For an $N$-bit ADC, $f_{3dB}$ must be sufficiently high so that the sampled voltage settles within one LSB which requires [15]

$$f_{3dB} > (N + 1) \ln(2) \frac{f_{sys}}{\pi}, \quad (3.2)$$

where $f_{sys}$ is the system clock frequency used by the SAR logic. For a sampling capacitance $C_s = 485\,\text{fF}$, $N = 8$ and $f_{sys} = 10\,\text{kHz}$, solving (3.1) and (3.2) results in

$$R_{ON} < 16 \text{ M\Omega}. \quad (3.3)$$

This upper bound on $R_{ON}$ has to be satisfied over all process and temperature (PT) corners for the entire input range of the ADC. PT corners encompass the entire set of process-defined corners and a temperature range of $[0\,^{\circ}\text{C}, +85\,^{\circ}\text{C}]$.
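
The bound in (3.3) follows directly from (3.1) and (3.2), as the short sketch below shows. It uses $C_s = 485$ fF, the single-side DAC capacitance quoted in Section 3.3.2; the function name is hypothetical.

```python
import math

def max_on_resistance(n_bits, f_sys, c_s):
    """Upper bound on R_ON in ohms from (3.1) and (3.2)."""
    f_3db_min = (n_bits + 1) * math.log(2) * f_sys / math.pi   # (3.2)
    return 1.0 / (2.0 * math.pi * f_3db_min * c_s)             # (3.1) solved for R_ON

r_on_max = max_on_resistance(n_bits=8, f_sys=10e3, c_s=485e-15)
print(f"R_ON < {r_on_max / 1e6:.1f} MOhm")
```

The result lies between 16 MΩ and 17 MΩ, consistent with the rounded bound in (3.3).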

A reduction of leakage in the switch can be achieved by using HVT devices, longer channel-lengths and device stacking. When two transistors are stacked, leakage reduction occurs due to negative $V_{GS}$ biasing and increased $V_{TH}$ in the stacked transistors [57]. However, increased $V_{TH}$ and channel-length will increase $R_{ON}$ consequently reducing the tracking bandwidth of the switch. Insufficient bandwidth causes settling errors and introduces nonlinearities. Thus a trade-off between leakage mitigation and speed emerges.

In this work, transmission-gate (TG) switches utilizing HVT devices and device stacking have been used for sampling the inputs. A conventional bootstrapped switch keeps $V_{GS} = V_{DD}$ providing low and almost-constant $R_{ON}$. However, this method proves inadequate at $V_{DD} = 0.4$ V necessitating the use of on-chip charge pumps to generate a control voltage higher than $V_{DD}$.

To achieve sufficiently low $R_{ON}$ for the TG switch, a two-stage charge pump as shown in Fig. 3-2 was implemented to generate the boosted control voltage for the TG switch [20]. Based on charge conservation analysis and ignoring charge sharing on parasitic capacitance as well as leakage, it can be found that the output node $SA$ in the second stage is at $3V_{DD}$ and node $SAb$ goes to 0 V when $Clk_{in}$ is HIGH. When $Clk_{in}$ is LOW, node $SA$ goes to 0 V and node $SAb$ becomes $3V_{DD}$. The sampling pulse generated by the SAR logic is applied to node $Clk_{in}$ of the charge pump while the output nodes $SA$ and $SAb$ control the gate nodes of the NMOS and PMOS devices respectively in the TG switch as shown in Fig. 3-3. Capacitor $C_s$ represents the capacitance of the DACs on which the inputs are sampled. In Fig. 3-2, the value of $C_s$ was chosen to be 250 fF which alleviates the impact of parasitic capacitance without incurring a large area penalty. Post-layout simulation results over all PT corners at $V_{DD} = 0.4$ V are provided in Table 3-1, where it is seen that the charge pump provides an output control voltage $> 2V_{DD}$ even for the worst PT corner.

In order to achieve an 8-bit linearity for the SAR ADC, the required total harmonic distortion (THD) of the sampling switch should be below $-(6.02 \cdot 8 + 1.76 + 6) = -55.9$ dB where a margin of 6 dB has been added. The THD performance of the fully-differential sampling switch was simulated using full-range differential sinusoids at near-Nyquist frequency as inputs. Figure 3-4 plots the simulated, post-layout THD performance of the switch over PT corners. To determine the impact of process variations and device mismatch on the switch performance, 250 Monte Carlo (MC) simulations were performed on the post-layout netlist of the sampling switch. From the Monte Carlo simulation results, it is seen that $\mu_{THD} = -73.8$ dB and $\sigma_{THD} = 3.75$ dB. Charge-injection and clock-feedthrough are the other error sources in the sampling switch. Owing to the use of small devices, the maximum error voltage due to charge-injection is only $490\,\mu\text{V}$ which is lower than the LSB/2 of the ADC while a fully-differential topology mitigates clock-feedthrough errors.
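
The linearity figures quoted above can be cross-checked with a short calculation. The sketch below recomputes the THD target, expresses the gap to the mean Monte Carlo THD in units of $\sigma_{THD}$, and compares the charge-injection error with LSB/2. It assumes a differential full-scale range of $2V_{DD} = 0.8$ V; the function names are hypothetical.

```python
def thd_target_db(n_bits, margin_db=6.0):
    """THD ceiling in dB for n_bits linearity: -(6.02*N + 1.76 + margin)."""
    return -(6.02 * n_bits + 1.76 + margin_db)

def half_lsb(v_fs_diff, n_bits):
    """LSB/2 in volts for a differential full-scale range v_fs_diff."""
    return v_fs_diff / 2 ** n_bits / 2

target = thd_target_db(8)
# Distance between the target and the mean Monte Carlo THD, in sigma units.
margin_sigma = (target - (-73.8)) / 3.75
# Assumption: differential full scale of 2 * V_DD = 0.8 V.
lsb_half = half_lsb(0.8, 8)

print(f"THD target = {target:.1f} dB, margin = {margin_sigma:.1f} sigma")
print(f"LSB/2 = {lsb_half * 1e3:.2f} mV vs. 0.49 mV charge-injection error")
```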
3.3.2 Capacitive Array DAC

The capacitive array DAC plays a crucial role in determining the linearity, power consumption and area of the SAR ADC. Although several energy-efficient DAC switching schemes such as monotonic [9] and Vcm-based [10] have been published, each has its disadvantages such as signal-dependent comparator offset, and the need for an additional voltage and switches respectively. Since the 8-bit ADC supply voltage $V_{DD} = 0.4$ V will eventually be generated by an energy harvesting source, it is beneficial to avoid the generation of a separate common-mode voltage and minimize the number of boosted switches. In the 65 nm process design kit which we have used, the minimum value of the capacitor is 10.5 fF. Designing a capacitive DAC using these capacitors will be wasteful in terms of area and power consumption as 10.5 fF is much higher than the minimum value required to meet noise and matching requirements of the 8-bit ADC. Hence custom-designed capacitors with a much lower unit capacitor value $C_u$ are preferred. In order to limit the thermal noise power of the sample-and-hold $P_{n,\text{sample}} \leq 0.03P_{n,\text{quant}}$, where $P_{n,\text{quant}}$ is the quantization noise power of the ADC, the minimum value of the unit capacitor required in a binary-weighted DAC is 1.32 fF. Unlike the split-array DAC [8], a conventional binary-weighted DAC utilizes the entire DAC capacitance for sampling. In this work, a conventional binary-weighted capacitive (BWC) DAC with $C_u$ value chosen to satisfy the thermal noise constraint has been implemented.
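
The thermal-noise-limited unit capacitor can be estimated as sketched below. The calculation rests on several assumptions that are not stated explicitly in the text: the sampled noise of the differential S/H is $2kT/C_{total}$, the differential full scale is $2V_{DD} = 0.8$ V, the temperature is 300 K, and the array contains $2^N - 1$ unit elements per side. The function name is hypothetical.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K

def min_unit_cap(n_bits, v_fs_diff, noise_ratio, temp_k=300.0):
    """Smallest C_u in farads such that P_n,sample <= noise_ratio * P_n,quant."""
    lsb = v_fs_diff / 2 ** n_bits
    p_quant = lsb ** 2 / 12.0
    # Assumption: differential sampling, so the noise power is 2kT/C_total.
    c_total = 2.0 * K_B * temp_k / (noise_ratio * p_quant)
    return c_total / (2 ** n_bits - 1)

c_u_min = min_unit_cap(n_bits=8, v_fs_diff=0.8, noise_ratio=0.03)
print(f"C_u,min = {c_u_min * 1e15:.2f} fF")
```

Under these assumptions the estimate comes out close to the 1.32 fF quoted above; the exact figure depends on the assumed temperature and on how the unit elements are counted.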

The two types of custom capacitors that can be used are (a) the inter-layer capacitor (sandwich capacitor) and (b) the lateral or fringe capacitor. It is shown in [58] that sandwich capacitors have lower mismatches than fringe capacitors by comparing 8 fF capacitors implemented in a 130 nm CMOS process. Single-layer 1.2 fF fringe capacitors in 32 nm SOI CMOS are found to have 0.8% matching in [59], which is sufficient to meet the INL and DNL specifications of an 8-bit ADC. The matching requirement for the unit capacitor was estimated as follows.

The unit capacitor is modeled with a nominal value of $C_u$ and a standard deviation of $\sigma_u$. For a binary-weighted capacitor array, the worst-case DNL and INL occur at the MSB code transition due to accumulation of the capacitor mismatch. According to [60], the worst-case DNL and INL for a fully-differential binary-weighted capacitor array are expressed in terms of LSBs as

$$\sigma_{DNL,\text{max}} = \sqrt{2 \cdot (2^N - 1)} \cdot \frac{\sigma_u}{2C_u} \text{ LSB}, \quad (3.4)$$

$$\sigma_{INL,\text{max}} = \sqrt{2 \cdot 2^{N-1}} \cdot \frac{\sigma_u}{2C_u} \text{ LSB}. \quad (3.5)$$

Since (3.5) is less than (3.4), the worst-case DNL will be used as the reference in the ensuing analysis. For capacitor matching, the mid-code transition is considered as the worst case, since all the capacitors in the BWC DAC are switched. For an $N$-bit SAR ADC, the maximum DNL error should be limited to [61], [27]

\[ 3\sigma_{DNL,max} < \frac{1}{2} \text{LSB}. \]  

(3.6)

An 8-bit fully-differential BWC DAC is composed of 510 \( C_u \) elements \((2 \cdot (2^8 - 1))\). The LSB step equals \( 2C_u \) where the factor 2 arises from the fully-differential implementation. At the MSB-code transition, all 510 capacitors are switched, thus leading to

\[ \sigma_{DNL,max} = \sqrt{2 \cdot (2^N - 1)} \cdot \frac{\sigma_u \cdot \text{LSB}}{2C_u} = \frac{\sqrt{510} \cdot \sigma_u \cdot \text{LSB}}{2C_u}. \]  

(3.7)

Combining (3.6) and (3.7), we get

\[ 3\sqrt{510} \cdot \frac{\sigma_u \cdot \text{LSB}}{2C_u} < \frac{1}{2} \text{LSB} \implies \frac{\sigma_u}{C_u} \leq 1.47\%. \]  

(3.8)

If we had a single-ended DAC implementation we would have required a matching of 1.04\% compared to 1.47\% obtained for a fully-differential BWC DAC. Thus the required matching is relaxed by \( \sqrt{2} \) due to the fully-differential implementation.
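
The two matching figures can be reproduced numerically. The sketch below applies the criterion (3.6) to the fully-differential expression (3.7) and to a single-ended array, for which it assumes $\sigma_{DNL,max} = \sqrt{2^N - 1} \cdot \sigma_u / C_u$ LSB. That single-ended expression is an assumption chosen to be consistent with the 1.04% figure above; the function name is hypothetical.

```python
import math

def max_unit_mismatch(n_bits, differential=True):
    """Largest sigma_u/C_u satisfying 3 * sigma_DNL,max < LSB/2."""
    units = 2 ** n_bits - 1
    if differential:
        # (3.7): sigma_DNL = sqrt(2 * units) * sigma_u / (2 * C_u)
        return 1.0 / (3.0 * math.sqrt(2 * units))
    # Assumed single-ended form: sigma_DNL = sqrt(units) * sigma_u / C_u
    return 1.0 / (6.0 * math.sqrt(units))

diff_limit = max_unit_mismatch(8, differential=True)
se_limit = max_unit_mismatch(8, differential=False)
print(f"differential: {diff_limit * 100:.2f} %, single-ended: {se_limit * 100:.2f} %")
print(f"relaxation factor: {diff_limit / se_limit:.3f}")
```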

The lower-bound for the mismatch-limited unit capacitor can be determined as follows. Combining (3.4) and (3.6), we have

\[ 3\sqrt{2 \cdot (2^N - 1)} \cdot \frac{\sigma_u}{C_u} < 1. \]  

(3.9)

For a typical MIM (or MOM) capacitor [61],

\[ \sigma \left( \frac{\Delta C}{C} \right) = \frac{K_{\sigma}}{\sqrt{A}}, \]  

(3.10)

\[ C_u = K_c \cdot A, \]  

(3.11)

where \( \sigma \left( \frac{\Delta C}{C} \right) \) is the standard deviation of the capacitor mismatch, \( K_{\sigma} \) is the matching coefficient, \( A \) is the capacitor area of \( C_u \) and \( K_c \) is the capacitor density parameter. Also we have [61]

\[ \frac{\sigma_u}{C_u} = \frac{1}{\sqrt{2}} \cdot \sigma \left( \frac{\Delta C}{C} \right). \]  

(3.12)

Combining (3.10) and (3.11) results in

\[ \sigma \left( \frac{\Delta C}{C} \right) = \frac{K_{\sigma} \cdot \sqrt{K_c}}{\sqrt{C_u}}. \]  

(3.13)
Combining (3.12) and (3.13) results in

\[
\frac{\sigma_u}{C_u} = \frac{1}{\sqrt{2}} \frac{K_\sigma \cdot \sqrt{K_c}}{\sqrt{C_u}}.
\]  

(3.14)

Using (3.14) in (3.9) gives

\[
\frac{3\sqrt{2^N - 1} \, K_\sigma \sqrt{K_c}}{\sqrt{C_u}} < 1.
\]  

(3.15)

Re-arranging (3.15), we get the mismatch-limited unit capacitor as

\[
C_u \geq 9(2^N - 1)K_\sigma^2 K_c.
\]  

(3.16)

If a capacitor provided in the process design kit (PDK) is used, the value of the matching coefficient \(K_\sigma\) and the capacitor density parameter \(K_c\) will be available in the PDK documentation thus allowing us to calculate the minimum value for \(C_u\) using (3.16). However, for a custom-designed metal plate capacitor, \(K_\sigma\) is not known. In this work \(N = 8\) bits, \(C_u = 1.88\) fF and \(K_c = 0.47\) fF/\(\mu m^2\). Substituting these values in (3.16) provides the maximum allowable value for \(K_\sigma\) as

\[
K_\sigma \leq 4.17\,\% \cdot \mu\text{m}.
\]  

(3.17)

The matching constraint imposed by (3.17) was deemed to be realizable in a 65 nm CMOS process.
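
As a quick numerical check of (3.16) and (3.17), the bound on \(K_\sigma\) can be evaluated directly from the values quoted above. The following is a minimal sketch; the function name is a hypothetical helper introduced only for illustration.

```python
import math

def max_matching_coefficient(n_bits, c_u_fF, k_c_fF_per_um2):
    """Largest K_sigma (as a fraction, times um) allowed by (3.16):
    C_u >= 9 * (2^N - 1) * K_sigma^2 * K_c."""
    return math.sqrt(c_u_fF / (9.0 * (2 ** n_bits - 1) * k_c_fF_per_um2))

# Values quoted in the text: N = 8, C_u = 1.88 fF, K_c = 0.47 fF/um^2
k_sigma_max = max_matching_coefficient(8, 1.88, 0.47)
print(f"K_sigma <= {100.0 * k_sigma_max:.2f} % um")
```

Because \(C_u\) enters under a square root, halving the unit capacitor tightens the allowed \(K_\sigma\) only by a factor of \(\sqrt{2}\).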

The structure of the custom-designed unit capacitor which is modified from [62] is shown in Fig. 3-5. It uses metal layers M2, M3, M4 and M5. The inner M4 plate as well as the three inner M3 fingers constitute the top plate node. The outer M3 layer, the two M3 fingers connected to it as well as the M2 layer, outer M4 layer and the M5 layer form the bottom plate node. The top plate is completely enclosed between the bottom plates except for the routing paths. The custom-designed unit capacitor combines the inter-layer capacitance and the lateral fringing capacitance. With \(C_u = 1.88\) fF and an area of 4 \(\mu m^2\), the unit capacitor has a capacitive density \(K_c = 0.47\) fF/\(\mu m^2\). A partial common-centroid layout was adopted for the capacitive array DAC and dummy unit capacitors were placed along the periphery of the array. The total DAC capacitance (single side) is \(C_{dac} \approx 485\) fF and the parasitic capacitance on the top-plate node \(C_{top,p} = 8.6\) fF. \(C_{top,p}\) attenuates the DAC output voltage which reduces the swing available at the comparator input. In this work, the attenuation factor \(H_{dac,atten}\) due to \(C_{top,p}\) is given by

\[
H_{dac,atten} = 1 - \frac{C_{dac}}{C_{dac} + C_{top,p}} = 0.017,
\]  

(3.18)

which illustrates the diminished impact of \(C_{top,p}\) on the DAC output voltage. The
layout of the binary-weighted DAC (single side) is shown in Fig. 3-6. The MSB capacitors follow a common-centroid arrangement to mitigate mismatch errors due to non-uniform oxide growth in the capacitors while the LSB capacitors are placed close to the switch network to simplify the interconnections. The unmarked capacitors in Fig. 3-6 are dummies. The area of each capacitive DAC is 44.5 µm × 44.5 µm. The DAC switches are simple inverters that use HVT devices.
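
The attenuation in (3.18) is a simple capacitive-divider ratio and can be reproduced as below. This is an illustrative sketch using the capacitance values quoted in the text; the function name is an assumption made for this example.

```python
def top_plate_attenuation(c_dac_fF, c_top_p_fF):
    """Fractional loss of DAC output swing caused by the top-plate
    parasitic capacitance, following (3.18)."""
    return 1.0 - c_dac_fF / (c_dac_fF + c_top_p_fF)

# Values quoted in the text: C_dac ~ 485 fF, C_top,p = 8.6 fF
h_dac_atten = top_plate_attenuation(485.0, 8.6)
print(f"H_dac,atten = {h_dac_atten:.3f}")
```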

3.3.3 Dynamic Latch Comparator

The dynamic latch comparator [8] is shown in Fig. 3-7. The Reset node of the comparator is connected to the 10 kHz system clock of the ADC. A succeeding SR latch consisting of cross-coupled NOR gates stores the output of the comparator for one full clock cycle. In order to ensure sufficient comparison speed at $V_{DD} = 0.4$ V, low-$V_{TH}$ (LVT) devices have been used for the input differential pair and cross-coupled inverters while HVT devices have been used for the reset switches $M_5$ and $M_6$. To mitigate comparator noise, balanced capacitance has been added on the output nodes of the comparator. In post-layout simulation over PT corners, the maximum value of the comparator input-referred noise is 311.7 µV (RMS), corresponding to 0.1 LSB. Consequently, the comparator noise power is $P_{n,comp} = 0.12P_{n,quant}$. Based on MC simulations, the input-referred offset of the comparator is $\sigma_{offset,comp} = 9.8$ mV. The fully-differential input range of the ADC is (0.4 V × 2) = 0.8 V. Taking the 3σ value of the input-referred offset of the comparator (29.4 mV ≈ 0.03 V), the comparator offset results in an SNR loss given by

$$SNR_{loss,offset} = 20 \log_{10}(0.8 - 0.03) - 20 \log_{10}(0.8) = -0.33 \text{ dB},$$  

(3.19)

which amounts to an ENOB loss of 0.05 bits.
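
The comparator noise and offset figures above follow from a few lines of arithmetic. The sketch below reproduces them under the assumptions stated in the text (8-bit resolution, 0.8 V differential range, 3σ offset rounded to 0.03 V); the 6.02 dB/bit conversion is the usual ENOB relation, and the variable names are illustrative only.

```python
import math

v_fs = 0.8                # fully-differential input range (V)
n_bits = 8
lsb = v_fs / 2 ** n_bits  # 3.125 mV

v_noise_rms = 311.7e-6    # comparator input-referred noise (V, RMS)

# Noise expressed in LSB and relative to the quantization noise power LSB^2/12
noise_in_lsb = v_noise_rms / lsb
noise_power_ratio = v_noise_rms ** 2 / (lsb ** 2 / 12.0)

# 3-sigma offset of 3 * 9.8 mV = 29.4 mV, rounded to 0.03 V as in the text
v_offset_3sigma = 0.03
snr_loss_db = 20.0 * math.log10((v_fs - v_offset_3sigma) / v_fs)
enob_loss_bits = -snr_loss_db / 6.02

print(f"noise = {noise_in_lsb:.2f} LSB, P_n,comp / P_n,quant = {noise_power_ratio:.2f}")
print(f"SNR loss = {snr_loss_db:.2f} dB, ENOB loss = {enob_loss_bits:.3f} bit")
```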

### 3.3.4 SAR Logic

The ADC utilizes a synchronous SAR controller as shown in Fig. 3-8.

The D flip-flops (DFFs) in the top row act as a ring counter while those in the bottom row provide the control signals for the DAC switches and store the comparison result during successive clock cycles. The signal samp generated by the leftmost DFF on the top row constitutes the sampling pulse for the ADC. It is provided as input to the charge pump to obtain the boosted gate control voltage. The signal setMSB generated as (samp OR s2) is used to set the MSB to HIGH during the input sampling and MSB approximation clock phases. Transmission-gate based DFFs with minimum-sized HVT devices have been implemented to achieve low power consumption and minimize leakage. The timing sequence for the SAR logic is shown in Fig. 3-9.

Post-layout simulation of the ADC included transient noise and the entire set of I/O pad schematics. A source impedance of 50 Ω was included on the differential inputs $V_{IP}$ and $V_{IN}$. Parasitic capacitances modeling the digital probes of the oscilloscope were added to the output pins of the ADC. In post-layout simulation, the ADC achieved an ENOB of 7.92 bits at near-Nyquist input and \( V_{DD} = 0.4 \) V while consuming 730 pW. The power breakdown of the ADC is 41.4% for the digital logic, 36.8% for the analog part (comprising the comparator, charge pump and sampling switch) and 21.8% for the DAC.
3.4 Measurement Results

The prototype SAR ADC with a core area of 105 $\mu$m $\times$ 112 $\mu$m was designed and fabricated in a 65 nm, 1-poly 7-metal (1P7M) CMOS process. It was packaged in a 42-pin J-Leaded Chip Carrier (JLCC) package. The chip microphotograph is shown in Fig. 3-10. A histogram test was conducted to determine the static linearity performance of the ADC. A full-scale differential sinusoid with near-DC frequency and an amplitude of 800 mVpp was applied to the 1 kS/s ADC. The plot of DNL and INL error is shown in Fig. 3-11. The peak DNL error is $+0.35/-0.35$ LSB and the peak INL error is $+0.36/-0.36$ LSB. In order to ascertain the statistical matching performance of the custom-capacitor based DAC, the DNL and INL of seven prototype chips were measured. The mean and standard deviation of the measured DNL and INL curves are shown in Fig. 3-12 and Fig. 3-13 respectively. The worst-case DNL$_{\text{max}}$ and INL$_{\text{max}}$ on the seven samples are both 0.4 LSB.

Figure 3-11: Measured DNL and INL errors of the ADC.

Figure 3-12: Measured $\mu_{DNL}$ and $\mu_{INL}$ for seven ADC chips.

Figure 3-13: Measured $\sigma_{DNL}$ and $\sigma_{INL}$ for seven ADC chips.

Figure 3-14: Measured FFT spectrum (2048-point) for the ADC at 1 kS/s with near-DC and near-Nyquist inputs.

The dynamic performance of the ADC was measured using the tone test. Fig. 3-14 shows the measured FFT spectrum of the 1 kS/s ADC for near-DC and near-Nyquist input frequencies. The amplitude of the test tone was set to $-0.4$ dBFS. At near-DC frequency, the measured SNDR of the ADC is 48.81 dB corresponding to an ENOB of 7.82 bits. At near-Nyquist frequency, the measured SNDR of the ADC is 48.78 dB corresponding to an ENOB of 7.81 bits. Fig. 3-15 shows SNDR and SFDR with respect to the input frequency. The SNDR remains almost constant over the entire bandwidth. The FoM of the ADC is calculated as

$$\text{FoM} = \frac{\text{Power}}{2^{\text{ENOB} @ \text{Nyquist}} \times f_s}$$

(3.20)

The 1 kS/s ADC consumes 717 pW at $V_{DD} = 0.4$ V resulting in an FoM of 3.19 fJ/conv.-step. Leakage power consumption of the ADC is 90 pW which constitutes 12.6% of the total power consumption. Table 3-2 summarizes the ADC performance and compares this work with previously published SAR ADCs having similar sampling rates. Among the compared designs, the FoM achieved by the proposed ADC is second only to that of [64] and its power consumption is the lowest.
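
The ENOB and FoM figures quoted above can be cross-checked with the short sketch below. It assumes the standard relation ENOB = (SNDR − 1.76)/6.02 together with (3.20); the helper names are hypothetical and only the measured values quoted in the text are used.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR in dB (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02

def walden_fom(power_w, enob_bits, f_s_hz):
    """FoM of (3.20) in joules per conversion step."""
    return power_w / (2.0 ** enob_bits * f_s_hz)

enob_dc = enob_from_sndr(48.81)    # near-DC input
enob_nyq = enob_from_sndr(48.78)   # near-Nyquist input
fom = walden_fom(717e-12, enob_nyq, 1e3)
leakage_share = 90e-12 / 717e-12

print(f"ENOB: {enob_dc:.2f} (near-DC), {enob_nyq:.2f} (near-Nyquist)")
print(f"FoM = {fom * 1e15:.2f} fJ/conv.-step, leakage share = {100 * leakage_share:.1f} %")
```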

Measurements were performed on the ADC when it was powered by a solar panel. The measurement set-up is shown in Fig. 3-16. When the analog and digital supply voltage of 0.4 V was supplied from the solar panel and the reference voltage from a voltage source, the ADC provided an ENOB of 7.7 bits for near-Nyquist input frequency. When the reference voltage was also provided from the solar panel, the ENOB degraded to 7.3 bits which can be attributed to the substantial fluctuation in the solar panel output voltage in the absence of any regulation.
Table 3-2: ADC performance summary and comparison.

<table>
<thead>
<tr>
<th>Specification</th>
<th>[63]</th>
<th>[1]</th>
<th>[8]</th>
<th>[64]</th>
<th>This work</th>
</tr>
</thead>
<tbody>
<tr>
<td>Technology (nm)</td>
<td>350</td>
<td>65</td>
<td>65</td>
<td>65</td>
<td>65</td>
</tr>
<tr>
<td>Supply voltage (V)</td>
<td>1</td>
<td>0.55</td>
<td>0.7</td>
<td>0.6</td>
<td>0.4</td>
</tr>
<tr>
<td>Sample rate (kS/s)</td>
<td>1</td>
<td>20</td>
<td>1</td>
<td>1.1</td>
<td>1</td>
</tr>
<tr>
<td>Resolution (bit)</td>
<td>12</td>
<td>10</td>
<td>10</td>
<td>10</td>
<td>8</td>
</tr>
<tr>
<td>DNL (LSB)</td>
<td>0.8</td>
<td>0.58</td>
<td>0.55</td>
<td>0.96</td>
<td>0.35</td>
</tr>
<tr>
<td>INL (LSB)</td>
<td>1.4</td>
<td>0.57</td>
<td>0.61</td>
<td>0.87</td>
<td>0.36</td>
</tr>
<tr>
<td>Power (nW)</td>
<td>230</td>
<td>206</td>
<td>3</td>
<td>1.1</td>
<td>0.717</td>
</tr>
<tr>
<td>ENOB (bit)</td>
<td>10.2</td>
<td>8.84</td>
<td>9.1</td>
<td>9.2</td>
<td>7.81</td>
</tr>
<tr>
<td>FoM (fJ/c.-s.)</td>
<td>196</td>
<td>22.4</td>
<td>5.5</td>
<td>1.7</td>
<td>3.19</td>
</tr>
<tr>
<td>Area (mm²)</td>
<td>N/A</td>
<td>0.212</td>
<td>0.037</td>
<td>N/A</td>
<td>0.0126</td>
</tr>
</tbody>
</table>

Figure 3-16: ADC measurement set-up with solar panel.
3.5 Ultra-low-power RC Oscillator

Integrated low-frequency oscillators can replace crystal oscillators to reduce the size, cost and power consumption in wireless sensors [65]. An integrated RC oscillator with a centre frequency of 10 kHz is implemented on the same chip as the ADC. The oscillator will provide the clock for on-chip digital logic and thus reduce the reliance on external components. In the operational scenario envisaged for our WSN node, an external pulse acts as the sampling signal for the ADC. The RC oscillator will provide 10 kHz clock cycles for bit approximation. Consequently there are no stringent demands on the jitter of the clock signal.

In this work, the RC oscillator topology reported in [66] has been implemented in 65 nm CMOS with a supply voltage $V_{DD} = 0.4$ V. The schematic of the oscillator is shown in Fig. 3-17. When $\phi = 1$, the current $I$ flows through the resistor $R$ causing a voltage drop $V_2 = IR$. During the same time, a matched current source $I$ charges capacitor $C_1$ so that $V_1$ crosses $V_2$ after a time $RC_1$. Once the crossing occurs, the comparator followed by the Schmitt trigger and digital buffer make $\bar{\phi} = 0, \phi = 1$. The capacitor $C_1$ is discharged and the roles of the comparator inputs are interchanged. The nominal period of oscillation is given by $2(RC_1 + t_{delay})$ where $t_{delay}$ corresponds to the delay of the comparator and digital buffers.
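
The period expression can be evaluated with the component values adopted in this design ($R = 5$ MΩ, $C_1 = 10$ pF, $I = 15$ nA). The sketch below is illustrative: the delay inferred from the simulated frequency of 9.97 kHz (Table 3-3) is a back-of-the-envelope estimate under the ideal model $T = 2(RC_1 + t_{delay})$, not a reported result, and the function name is an assumption.

```python
def oscillation_period(r_ohm, c_farad, t_delay_s=0.0):
    """Nominal period T = 2 * (R * C1 + t_delay) of the RC oscillator."""
    return 2.0 * (r_ohm * c_farad + t_delay_s)

R, C1, I = 5e6, 10e-12, 15e-9

v2 = I * R                                  # comparison threshold V2 = I * R
period_ideal = oscillation_period(R, C1)    # neglecting comparator/buffer delay
f_ideal = 1.0 / period_ideal

# Delay implied by the simulated 9.97 kHz, assuming the ideal period model
f_sim = 9.97e3
t_delay_implied = 0.5 / f_sim - R * C1

print(f"V2 = {v2 * 1e3:.0f} mV, T = {period_ideal * 1e6:.1f} us, f = {f_ideal / 1e3:.2f} kHz")
print(f"implied t_delay = {t_delay_implied * 1e6:.2f} us")
```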

To achieve a clock period of 100 $\mu$s ($f_{clk} = 10$ kHz), $R = 5$ MΩ and $C_1 = 10$ pF are used. The current $I$ in Fig. 3-17 is set to 15 nA. The resistor is realized by high-resistance poly (HIPO) resistors connected in series. The capacitors are realized as MOS-based capacitors enhanced by metal layers, which helps to achieve high capacitive density. The RC oscillator has two sub-blocks that require a bias voltage $V_{BiasP}$: the comparator and the RC network. In order to make the comparator delay independent of temperature, a PTAT current reference is required [66]. Figure 3-18 shows the constant-$g_m$ bias circuit (beta-multiplier) with an off-chip resistor of $R_{ext} = 5$ MΩ used to generate $V_{BiasP}$. At $V_{DD} = 400$ mV and temperature = 27°C, this bias circuit supplies an output current $I_{out} = 5$ nA. Figure 3-19 confirms the PTAT nature of the current generated by the bias circuit. A start-up circuit consisting of devices $M_5$, $M_6$ and $M_7$ has been incorporated to avoid the zero-current state. Transient simulations with I/O pad and PCB trace parasitic capacitance modeled on the node connected to $R_{ext}$ were undertaken to assess the stability of the circuit under different supply ramp-up conditions. Subsequently, MIM capacitors with value $C_{ext} = 100$ fF were added to the bias voltage nodes to ensure stability. The two important specifications of an oscillator, besides frequency and duty cycle, are temperature coefficient and line sensitivity.

Figure 3-17: Schematic of the RC oscillator.

Figure 3-18: Bias circuit for the RC oscillator.

Figure 3-19: Variation of current vs. temperature for the bias circuit.

- **Temperature Coefficient**: The temperature sensitivity of the RC oscillator is expressed in terms of its temperature coefficient. It is important to have a low value for this parameter over the operational temperature range. It is expressed in %/°C. Temperature accuracy is given by

\[
TC = \frac{(max(f_{clk}) - min(f_{clk})) \times 10^2}{f_{clk,27°C} \times (T_{max} - T_{min})} \%/°C. \quad (3.21)
\]

In this case, \(T_{max} = +85°C\) and \(T_{min} = 0°C\).

- **Line Sensitivity**: The supply voltage sensitivity of the RC oscillator is expressed in terms of its line sensitivity. It is important to have a low value for this parameter over the entire range of supply voltages. Voltage accuracy is expressed in %/V and is given by

\[
LS = \frac{(max(f_{clk}) - min(f_{clk})) \times 10^2}{f_{clk,nom} \times (V_{DD,max} - V_{DD,min})} \%/V, \quad (3.22)
\]

where \(f_{clk,nom}\) is the value of \(f_{clk}\) at the nominal supply voltage. In this case, a nominal supply voltage of 0.4 V has been used. \(V_{DD,max} = 1\) V and \(V_{DD,min} = 0.4\) V.
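
The two definitions (3.21) and (3.22) translate directly into code. The following is a minimal sketch; the function names are hypothetical, and the frequency sweeps in the example are made-up placeholder numbers chosen only to exercise the formulas. They are not simulated or measured data from this work.

```python
def temperature_coefficient(freqs_hz, f_ref_hz, t_max_c, t_min_c):
    """TC in %/degC following (3.21); f_ref_hz is f_clk at 27 degC."""
    return (max(freqs_hz) - min(freqs_hz)) * 100.0 / (f_ref_hz * (t_max_c - t_min_c))

def line_sensitivity(freqs_hz, f_nom_hz, vdd_max, vdd_min):
    """LS in %/V following (3.22); f_nom_hz is f_clk at the nominal supply."""
    return (max(freqs_hz) - min(freqs_hz)) * 100.0 / (f_nom_hz * (vdd_max - vdd_min))

# Placeholder sweeps (hypothetical values, for illustration only)
f_over_temp = [9.0e3, 10.0e3, 11.0e3]   # f_clk sampled between 0 degC and 85 degC
f_over_vdd = [10.0e3, 10.3e3]           # f_clk sampled between 0.4 V and 1 V

tc = temperature_coefficient(f_over_temp, 10.0e3, 85.0, 0.0)
ls = line_sensitivity(f_over_vdd, 10.0e3, 1.0, 0.4)
print(f"TC = {tc:.3f} %/degC, LS = {ls:.2f} %/V")
```

Both metrics normalize the total frequency spread by the sweep range, so they describe an average slope and do not capture local non-monotonic behaviour within the range.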

For typical process corner and temperature = +27°C, post-layout simulation results for the RC oscillator are provided in Table 3-3. The RC oscillator with a core area 142.5 μm × 103 μm was designed and fabricated in a 65 nm, 1-poly 7-metal (1P7M) CMOS process. The chip microphotograph is shown in Fig. 3-20. Table 3-4 summarizes the measured performance of the RC oscillator on four different chips. The FoM achieved ranks among the lowest reported for integrated low-frequency oscillators.
Table 3-3: Simulated performance of the RC oscillator.

<table>
<thead>
<tr>
<th>Specification</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Supply voltage</td>
<td>0.4 V</td>
</tr>
<tr>
<td>$I_{\text{tran}}$</td>
<td>56.2 nA</td>
</tr>
<tr>
<td>Power</td>
<td>22.5 nW</td>
</tr>
<tr>
<td>$f_{\text{clk}}$</td>
<td>9.97 kHz</td>
</tr>
<tr>
<td>FoM ($\text{Power}/f_{\text{clk}}$)</td>
<td>2.26 nW/kHz</td>
</tr>
<tr>
<td>Duty cycle (%)</td>
<td>50.16</td>
</tr>
<tr>
<td>TC</td>
<td>0.21%/°C</td>
</tr>
<tr>
<td>LS</td>
<td>5.85 %/V</td>
</tr>
<tr>
<td>Area</td>
<td>0.0147 mm$^2$</td>
</tr>
</tbody>
</table>

Figure 3-20: Chip microphotograph and layout of the RC oscillator.

Table 3-4: Measured performance of the RC oscillator.

<table>
<thead>
<tr>
<th>Specification</th>
<th>Chip 1</th>
<th>Chip 2</th>
<th>Chip 3</th>
<th>Chip 4</th>
</tr>
</thead>
<tbody>
<tr>
<td>Frequency (kHz)</td>
<td>10.8</td>
<td>11.5</td>
<td>11</td>
<td>11.8</td>
</tr>
<tr>
<td>Supply voltage (V)</td>
<td>0.4</td>
<td>0.4</td>
<td>0.4</td>
<td>0.4</td>
</tr>
<tr>
<td>Current (nA)</td>
<td>41</td>
<td>45</td>
<td>59</td>
<td>50</td>
</tr>
<tr>
<td>Power (nW)</td>
<td>16.4</td>
<td>18</td>
<td>23.6</td>
<td>20</td>
</tr>
<tr>
<td>FoM (nW/kHz)</td>
<td>1.52</td>
<td>1.56</td>
<td>2.14</td>
<td>1.69</td>
</tr>
</tbody>
</table>
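
The power and FoM rows of Table 3-4 follow from the measured current and frequency at $V_{DD} = 0.4$ V. The sketch below recomputes them as a consistency check; small differences in the last digit relative to the table are due to rounding. The dictionary layout is an assumption made for this example.

```python
VDD = 0.4  # supply voltage (V)

# (frequency in kHz, supply current in nA) per chip, from Table 3-4
measured = {
    "Chip 1": (10.8, 41.0),
    "Chip 2": (11.5, 45.0),
    "Chip 3": (11.0, 59.0),
    "Chip 4": (11.8, 50.0),
}

results = {}
for chip, (f_khz, i_na) in measured.items():
    power_nw = VDD * i_na          # V * nA gives nW
    fom = power_nw / f_khz         # nW/kHz
    results[chip] = (power_nw, fom)
    print(f"{chip}: P = {power_nw:.1f} nW, FoM = {fom:.3f} nW/kHz")
```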
3.6 Summary

A sub-nW, 1 kS/s 8-bit SAR ADC has been presented. In order to achieve the targeted static and dynamic performance under extremely low supply voltage, careful optimization of the different circuit blocks was essential. A multi-stage charge pump which provided $> 2.5\times$ boosted gate control voltage was combined with a leakage-reduced sampling switch. The S/H circuit achieved 9-bit linearity, highlighting the importance of leakage suppression and ensuring sufficient tracking bandwidth. The proposed MOM custom-capacitor combining inter-layer and fringe capacitance was crucial in reducing the area and power consumption of the ADC while achieving a worst-case INL/DNL = 0.4 LSB over seven measured samples. For low/medium resolution SAR ADCs, employment of such custom capacitors will yield significant savings. The reduced input-referred noise of the comparator and the low-leakage SAR logic implementation further enhanced ADC performance.
Chapter 4

Design of a 10-bit 50 MS/s SAR ADC

4.1 Introduction

The proliferation of mobile devices supporting applications such as DVB-T and DVB-H, and wireless standards such as GSM, GPRS, WLAN etc. has created the demand for power-efficient ADCs with resolutions of 8-10 bits and sampling speeds of several tens of MS/s [2, 67]. Medium-resolution, high-speed analog-to-digital conversion has traditionally been dominated by pipelined ADCs. But the continued scaling of CMOS technologies accompanied by the reduction of supply voltage poses significant hurdles for the design of power-efficient pipelined ADCs. Pipelined ADCs require high-gain, linear opamps which are power-hungry blocks. Due to the low output resistance of short channel MOS devices, multi-stage amplifiers are necessary to attain high DC gain which diminishes power efficiency. The reduced supply voltages in sub-100 nm CMOS processes result in lower signal swings in the amplifiers which further degrades the signal-to-noise ratio (SNR) for a given value of sampling capacitance. In contrast, SAR ADCs eliminate the use of opamps and achieve excellent power efficiency [6]. Dynamic comparators are commonly used in SAR ADCs [68]. Since the dynamic comparator does not require static bias currents, the power consumption of the SAR ADC scales linearly with the sampling frequency. Also the speed and power efficiency of the digital logic in the SAR ADC improve with CMOS process scaling. SAR ADCs have achieved sampling rates from several tens of MS/s to low GS/s with 10-bit resolution [69], [10], [70]. SAR-assisted pipelined ADCs [71] have achieved good power efficiency at speeds exceeding 200 MS/s.

In this work, a power-efficient SAR ADC implemented in 65 nm CMOS is presented. The design includes an on-chip reference voltage buffer (RVBuffer) which helps to eliminate the speed limitation posed by incomplete DAC settling.
The topology and design details of the RVBuffer which satisfies the performance requirements are discussed in detail. Comprehensive post-layout simulation results which verify the ADC performance are reported.

In medium-resolution, high-speed SAR ADCs, the speed limitation is caused by incomplete DAC settling. Further performance degradation occurs when the DAC reference voltage is provided off-chip due to the parasitic inductances on the reference voltage line. Achieving the targeted ADC performance under such conditions requires a high-speed on-chip reference voltage buffer. In addition to fast settling behaviour, the reference voltage buffer should possess sufficiently high power-supply rejection ratio (PSRR), low noise and must remain stable for all operating conditions. This chapter presents a 10-bit, 50 MS/s SAR ADC with power consumption of 697 µW implemented in 65 nm CMOS technology. To overcome the performance degradation due to ringing on the DAC reference caused by bondwire inductances, a high-speed on-chip reference voltage buffer has been designed and incorporated in the ADC. To reduce the area and capacitance values in the DAC, a split binary-weighted capacitive array has been used. A double-tail dynamic comparator has been optimized for noise and speed. Bootstrapped switches are used for input sampling in order to guarantee sufficient linearity for the ADC. A synchronous SAR controller has been implemented using static CMOS logic.

The rest of the chapter is organized as follows. Section 4.2 explains the DAC settling limitations in high-speed SAR ADCs and the necessity of a high-speed on-chip reference voltage buffer. Section 4.3 describes the architecture of the proposed SAR ADC. Section 4.4 presents the implementation of the important building blocks of the ADC. Section 4.5 provides the simulation results for the SAR ADC. Conclusions are drawn in Section 4.6.

4.2 Limitations for DAC Settling

In high-speed, medium-to-high resolution (> 9 bits) SAR ADCs, the speed bottleneck is caused by the settling time for the DAC. Assuming that one clock cycle is allocated for sampling the input, a conventional \( N \)-bit SAR ADC requires a minimum of \( (N + 1) \) clock cycles for one complete conversion. For a 10-bit, 50 MS/s SAR ADC, this results in a minimum system clock frequency of 550 MHz. In this work, a system clock frequency of \( f_{\text{clk}} = 600 \text{ MHz} \) is used, which corresponds to 12 system clock cycles per sampling period. Each period of the system clock is divided equally between the DAC settling phase and the comparison phase. Within the half-cycle time period, the DAC voltage has to charge/discharge to a new level and settle with an accuracy > 10 bits. Incomplete DAC settling will introduce conversion errors and severely degrade the performance of the ADC. When the DAC reference voltage is provided off-chip, the DAC settling is worsened by the effect of the parasitic inductances of the bondwires, PCB traces etc. Large charging currents drawn from the off-chip reference voltage through the inductances will cause ringing on the DAC capacitor node being charged. This in turn will cause the DAC output to ring [72]. The bondwire inductance depends on the dimensions of the gold wire used in bonding. The inductance \( L \) [nH] of a gold wire of length \( l \) [mm] and radius \( r \) [mm] is given by [73]

\[
L = \frac{l}{5} \left[ \ln \left( \frac{2l}{r} \right) - \frac{3}{4} + \frac{r}{l} \right].
\]  

(4.1)

Assuming a diameter of 1 mil (25.4 \( \mu \)m) and length of 3 mm for the bondwire, (4.1) results in an inductance \( L = 3.24 \) nH. With some margin for the PCB trace inductance, a minimum total inductance of 4 nH is allocated on the off-chip reference voltage line. It is shown in Section 4.5 that the magnitude of ringing on the DAC output voltage even for 4 nH inductance is many times higher than the least significant bit (LSB) of the ADC which unacceptably degrades performance.
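
The bondwire estimate from (4.1) can be reproduced as follows. This is a minimal sketch assuming, as is usual for this formula, that lengths in mm yield the inductance in nH; the function name is hypothetical.

```python
import math

def bondwire_inductance_nh(length_mm, radius_mm):
    """Bondwire inductance in nH following (4.1), with dimensions in mm."""
    return (length_mm / 5.0) * (
        math.log(2.0 * length_mm / radius_mm) - 0.75 + radius_mm / length_mm
    )

# 1 mil diameter (25.4 um) gives a radius of 12.7 um; wire length 3 mm
l_bond = bondwire_inductance_nh(3.0, 0.0254 / 2.0)
print(f"L = {l_bond:.2f} nH")
```

The result is close to 1 nH per mm of wire, a commonly used rule of thumb, which is why a 4 nH allocation leaves only a modest margin for PCB traces.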

One method to solve the DAC settling issue in the presence of parasitic inductances is to ensure sufficient timing margin such that the ringing effect diminishes to the required accuracy level. For high-speed SAR ADCs, this will considerably lower the maximum sampling frequency that can be employed. On-chip decoupling for the DAC reference is another technique to mitigate the ringing effect. But prohibitively large capacitance values will be required to limit the perturbations below 1 LSB rendering this technique impractical for most implementations. Incorporating a high-speed on-chip reference voltage buffer isolates the DAC reference voltage from ringing effects and helps to attain the targeted ADC performance. Design details and implementation of the reference voltage buffer will be elaborated in Section 4.4.1.

### 4.3 ADC Architecture

Figure 4-1 shows the proposed SAR ADC architecture. It consists of bootstrapped sampling switches, split binary-weighted capacitive DACs, a high-speed dynamic comparator, synchronous SAR logic and an on-chip reference voltage buffer. In a conventional SAR ADC [74], the input voltage is sampled on the bottom plates of the capacitor array and the top plates are connected to a fixed voltage \( V_{cm} \). In the redistribution mode for the MSB, the output voltage of the DAC is given by [74]

\[
V_{DAC} = V_{cm} - V_{in} + \frac{V_{REF}}{2}.
\]  

(4.2)

If one of the power rails is chosen as the value for the fixed voltage \( V_{cm} \), it is seen from (4.2) that \( V_{DAC} \) exceeds the supply rails during conversion when the input voltage ranges from 0 to \( V_{DD} \). A remedy is to use a lower input range which has a detrimental effect on the signal-to-noise ratio (SNR) of the converter. In this work top-plate sampling with preset MSB is used for rail-to-rail analog inputs without using additional fixed voltages [6], [56]. It is assumed that the input common-mode
level is $V_{DD}/2$. During the entire conversion, the DAC outputs are limited to the supply rails and the common-mode voltage of the DAC outputs is kept at $V_{DD}/2$.

During the sampling phase of the SAR ADC, the inputs $V_{inP}$ and $V_{inN}$ are connected to the top-plate node of the main DAC, the MSB is preset to high and all other bits are reset to low as shown in Fig. 4-2. At the end of the sampling phase, the sampling switches are opened and the differential input voltages are sampled on the top-plate node of the main DAC. During the first comparison cycle, the comparator compares $V_{DACP}$ and $V_{DACN}$. If $V_{DACP} > V_{DACN}$, the MSB is kept high. If $V_{DACP} < V_{DACN}$, the MSB is made low. Then MSB-1 is set to high and the comparator compares its differential inputs. The associated timing diagram is shown in Fig. 4-3. The process continues until all the ten bits are determined.
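
The bit-cycling procedure described above can be illustrated with an idealized behavioural model. The sketch below is an assumption-laden simplification: it uses a single-ended, unipolar input and an ideal DAC, and it replaces the differential comparison of $V_{DACP}$ and $V_{DACN}$ with a comparison of the sampled input against the trial DAC level. Charge redistribution, noise and settling are not modelled, and the function name is hypothetical.

```python
def sar_convert(v_in, v_ref=1.2, n_bits=10):
    """Ideal successive-approximation conversion with the bit under test preset high."""
    code = 0
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)               # preset the current bit to high
        v_dac = v_ref * trial / (1 << n_bits)   # ideal DAC level for the trial code
        if v_in >= v_dac:                       # comparator decision
            code = trial                        # keep the bit high
        # otherwise the bit is returned to low (code is left unchanged)
    return code

# One decision per bit: ten comparisons resolve a 10-bit code
for v in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(v, sar_convert(v, v_ref=1.0))
```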
4.4 Implementation of ADC Building Blocks

The important building blocks of the ADC are the RVBuffer, input sampling switches, dynamic comparator, capacitive DAC and SAR controller. The RVBuffer facilitates the precise settling of the DAC reference voltage. For the simulations of the various circuit blocks, the process, supply voltage and temperature (PVT) corners encompass all process-defined corners for MOS devices, resistors and capacitors, a temperature range of [−40°C, +125°C] and ±10% supply voltage variation. Monte Carlo (MC) simulations including process variation and device mismatch were performed to determine the offset voltage of the RVBuffer and comparator, the INL/DNL of the DACs and the ENOB of the full ADC. The design details of these building blocks are described in the following subsections.

4.4.1 Reference Voltage Buffer

When the DAC reference voltage is provided off-chip for SAR ADCs working at high sampling rates, the effect of bondwires and other parasitic inductances can severely degrade DAC settling [72]. The switching of a capacitor in the DAC array to the high reference voltage causes charge to be drawn from the off-chip reference. The charge transfer through the bondwire inductance on the reference line causes ringing on the DAC output [72]. For this ADC, the DAC output should settle with an accuracy higher than 10 bits within the half-cycle time period of the 600 MHz clock. Providing for 100 ps delay in the digital logic, this leads to a DAC settling time requirement of around 733 ps. Achieving such fast settling requires a voltage buffer with very low output impedance. Few published works on SAR ADCs include on-chip reference voltage buffers.
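
The timing budget behind the 733 ps settling requirement can be written out explicitly. The sketch below only restates the arithmetic implied by the text (one sampling cycle plus one cycle per bit, a 600 MHz system clock split equally between DAC settling and comparison, and a 100 ps allowance for logic delay); the variable names are illustrative.

```python
n_bits = 10
f_s = 50e6                 # sampling rate (S/s)
f_clk = 600e6              # system clock used in this work (Hz)
t_logic = 100e-12          # allowance for digital logic delay (s)

min_cycles = n_bits + 1                 # one sampling cycle + one cycle per bit
f_clk_min = min_cycles * f_s            # minimum system clock frequency
cycles_per_sample = f_clk / f_s         # cycles available per conversion

t_half = 0.5 / f_clk                    # half period reserved for DAC settling
t_settle = t_half - t_logic             # time left for the DAC to settle

print(f"f_clk,min = {f_clk_min / 1e6:.0f} MHz, cycles per sample = {cycles_per_sample:.0f}")
print(f"half period = {t_half * 1e12:.0f} ps, DAC settling time = {t_settle * 1e12:.0f} ps")
```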
4.4.1.1 Calculation of Design Parameters

To aid the design of the RVBuffer, estimation of important design parameters such as peak output current during slewing, unity-gain frequency and DC gain of the amplifier was carried out.

The settling time is a critical specification for the RVBuffer. The total settling time consists of the constant-slope (slewing) regime and the linear settling regime. Usually the linear settling time takes much longer than the slewing time [46]. It is difficult to precisely demarcate the two regimes and hence we allocate 10% of the total settling time for slewing and the remaining 90% for linear settling [75]. Based on this split-up, the minimum output current during slewing and unity-gain frequency are computed. From Fig. 4-1, it can be found that the total effective DAC capacitance that is connected to the RVBuffer at any point of time is $31C_u + C_u = 32C_u$. The worst case settling scenario for the RVBuffer occurs when the MSB-1 bit is preset to HIGH ($V_{REF}$) with the MSB bit determined to be LOW (0 V). In this case, $24C_u$ is switched from the low reference (GND) to the high reference ($V_{REF}$) while the total effective load capacitance of the RVBuffer is $32C_u$. From charge-conservation, it is found that this switching causes a voltage change $\Delta V$ at the RVBuffer output where

$$32C_u \cdot \Delta V = 24C_u(V_{REF} - GND).$$

(4.3)

Using $V_{REF} = 1.2$ V in (4.3), we have

$$\Delta V = \left(\frac{24C_u}{32C_u}\right)V_{REF} = 0.75V_{REF} = 900\text{ mV}.$$  

(4.4)

The slew-rate of the RVBuffer is given by

$$\frac{\Delta V}{t_{slew}} = \frac{0.9}{0.1 \cdot 733 \cdot 10^{-12}} = \frac{I_{out}}{C_{L, RVBuffer}},$$

(4.5)

where $C_{L, RVBuffer} = 32C_u$ and $I_{out}$ is the peak output current during slewing. The unit capacitor $C_u$ is chosen to be 15 fF as explained in Section 4.4.4. Solving (4.5), we get $I_{out} = 5.9$ mA.
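
The slewing estimate of (4.3) to (4.5) can be checked numerically. This is a minimal sketch using the values given in the text ($C_u = 15$ fF, $V_{REF} = 1.2$ V, 733 ps total settling time with 10% allocated to slewing); the variable names are assumptions made for this example.

```python
c_u = 15e-15               # unit capacitor (F)
c_load = 32 * c_u          # effective load seen by the RVBuffer
c_switched = 24 * c_u      # capacitance switched from GND to V_REF (worst case)
v_ref = 1.2                # high reference voltage (V)

# Charge conservation, (4.3) and (4.4)
delta_v = (c_switched / c_load) * v_ref

# Slew-rate requirement, (4.5): 10 % of the settling time is allocated to slewing
t_settle = 733e-12
t_slew = 0.1 * t_settle
slew_rate = delta_v / t_slew
i_out = slew_rate * c_load

print(f"delta_V = {delta_v * 1e3:.0f} mV")
print(f"slew rate = {slew_rate / 1e9:.2f} V/ns, I_out = {i_out * 1e3:.1f} mA")
```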

For a single-pole amplifier in closed-loop configuration, the step-response is given by

$$V_{out}(t) = V_{step}(1 - e^{-t/\tau}),$$

(4.6)

where $\tau = \frac{1}{\beta \omega_{ug}}$ and $\beta$ is the feedback factor of the RVBuffer and $\omega_{ug}$ is the unity-gain frequency of the RVBuffer in rad/s. For unity-gain feedback, $\beta = 1$. The settling error $\epsilon = e^{-t/\tau}$. Thus

$$t = -\tau \ln(\epsilon) = \frac{-\ln(\epsilon)}{\omega_{ug}} = \frac{-\ln(\epsilon)}{2\pi \cdot f_{ug}},$$

(4.7)
where \( f_{ug} \) is the unity-gain frequency of the buffer in Hz. Since the reference voltage has to settle with an accuracy \( > 10 \) bits, we have targeted 13-bit settling accuracy which provides sufficient design margin. From (4.7), for 13-bit settling, we find the minimum \( f_{ug} \) as

\[
f_{ug} = \frac{-\ln(\epsilon)}{2\pi \cdot t} = \frac{-\ln(1/2^{13})}{2\pi \cdot 0.9 \cdot 733 \cdot 10^{-12}} = 2.17 \text{ GHz}.
\]

(4.8)

A high open-loop DC gain (\( A_0 \)) is necessary to achieve good PSRR performance [76]. Since the PSRR is a key requirement for an RVBuffer, we have targeted an open-loop DC gain \( A_0 = 60 \) dB for the RVBuffer. A large DC gain also helps to minimize the static error on the buffered voltage. For a closed-loop amplifier, the finite \( A_0 \) results in a gain error factor of \( \zeta \) [46]

\[
\zeta = 1 - \left( \frac{A_0\beta}{1+A_0\beta} \right) \approx \frac{1}{A_0\beta}.
\]

(4.9)

For a unity-gain buffer, the feedback factor \( \beta = 1 \), making \( \zeta = 1/A_0 \). An open-loop DC gain = 60 dB helps to limit the gain error \( \zeta \) to 0.1%.
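
The linear-settling and gain-error estimates can be verified with the sketch below. It assumes the single-pole model of (4.6), with 90% of the 733 ps settling time allocated to linear settling and 13-bit target accuracy, as stated in the text; the function name is hypothetical.

```python
import math

def min_unity_gain_freq(settle_bits, t_lin_s, beta=1.0):
    """Minimum unity-gain frequency (Hz) for single-pole settling to within
    2^-settle_bits in time t_lin_s, with tau = 1 / (beta * w_ug)."""
    eps = 2.0 ** (-settle_bits)
    return -math.log(eps) / (2.0 * math.pi * beta * t_lin_s)

t_lin = 0.9 * 733e-12
f_ug = min_unity_gain_freq(13, t_lin)

# Static gain error of a unity-gain buffer with 60 dB open-loop DC gain, (4.9)
a0 = 10.0 ** (60.0 / 20.0)
beta = 1.0
zeta = 1.0 - (a0 * beta) / (1.0 + a0 * beta)

print(f"f_ug >= {f_ug / 1e9:.2f} GHz, gain error = {100.0 * zeta:.2f} %")
```

Each additional bit of settling accuracy raises the required $f_{ug}$ by the same fixed amount, $\ln 2 / (2\pi t)$, so the 13-bit target costs about 30% more bandwidth than a 10-bit one.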

### 4.4.1.2 Circuit Details of the Reference Voltage Buffer

In this work, the high reference voltage \( V_{REF} = 1.2 \text{ V} \) and the low reference voltage is ground. A single-ended reference voltage buffer for generating \( V_{REF} \) has been designed with 2.5 V thick oxide MOSFETs from the 65 nm design kit. The topology of the RVBuffer shown in Fig. 4-4 has been adapted from [77]. The selected RVBuffer topology is similar to a low-dropout (LDO) regulator. The source-follower stage achieves low output resistance and thus provides large output current when its gate-source voltage \( V_{GS} \) changes. In Section 4.4.1.1, it was estimated that \( f_{ug} \approx 2 \) GHz would be needed in a conventional single-stage opamp to meet the settling time requirement. However, use of the topology shown in Fig. 4-4 enables fast settling with a lower unity-gain frequency for the opamp \( OA1 \). In Fig. 4-4, the replica source-follower (SF) stage isolates the node \( V_{REF0} \) from the capacitive load of the DAC which is connected to node \( V_{REF} \). The feedback loop assures stability while the open-loop settling due to the replica SF stage enables fast operation. Hence the feedback loop can be designed with lower bandwidth which leads to a lower unity-gain frequency specification for \( OA1 \).

In Fig. 4-4, \( OA1 \) is a two-stage amplifier. This structure achieves an open-loop DC gain \( > 60 \) dB thus providing sufficient PSRR and low static gain error on the \( V_{REF} \) voltage. A two-stage amplifier with conventional Miller compensation requires large transconductance in the second stage for ensuring sufficient phase margin (PM) which increases power consumption. Also the presence of the right half plane zero degrades PM unless a nulling resistor is used. To circumvent these disadvantages, split-length compensation of op-amps has been proposed [45]. Figure 4-5 shows the schematic of the RVBuffer. The two-stage amplifier uses split-length current mirror load topology [45]. The open-loop gain and phase plot of the RVBuffer for nominal PVT condition is shown in Fig. 4-6. The open-loop DC gain = 70 dB, phase margin = 68° and unity-gain frequency = 600 MHz. The worst (i.e., minimum) values for these parameters across all PVT corners are provided in Table 4-1.

The supply nodes of the DAC driver inverters are connected to node $V_{REF}$ in Fig. 4-5. During capacitor switching in the DAC, the voltage change on $V_{REF}$ will couple through the $C_{gs}$ of $M_{10}, M_{11}$ and perturb the node $V_{2}$ thus degrading DAC settling. This effect is mitigated by the capacitance $C_{st}$ [78] at node $V_{2}$. To eliminate body-effect, the bulks of $M_{10}, M_{11}$ are connected to their respective sources through the use of deep-NWELL devices. The two SF stages in Fig. 4-5 are carefully laid out and dummy devices included to improve matching.

The post-layout simulation results for the RVBuffer are summarized in Table 4-1. For all performance specifications, the worst-case values over PVT corners and mismatch simulations are presented. The RVBuffer provides a worst-case settling time of 692 ps for $V_{REF}$ with 13-bit accuracy, which satisfies the DAC settling requirement for the SAR ADC. PSRRp signifies the rejection of disturbances on the 2.5 V supply line. The integrated output noise of the RVBuffer is only 0.086 LSB, with integration limits of [0 Hz, 180 GHz]. Due to the benefits of the replica SF topology, the RVBuffer meets the settling time specification with a lower unity-gain frequency of 600 MHz. The transient current consumption and open-loop DC gain for the implemented RVBuffer compare well with the estimated values in section 4.4.1.1. The total offset voltage of the RVBuffer is 4.18 mV. The offset voltage contribution due to mismatch in the input differential pair is 690 $\mu$V while mismatch in the replica SF stages contributes an offset of 1.72 mV. The static error on the RVBuffer output due to the finite loop gain of the amplifier causes an offset of 1 mV. It is seen that the mismatch in the replica SF stages makes the dominant contribution to the offset voltage.

In this work, the RVBuffer is implemented with 2.5 V MOS devices. Although a higher supply voltage increases the power consumption of the RVBuffer, this allows the use of $V_{REF} = 1.2$ V so that the SAR ADC can sample rail-to-rail inputs. Also, the DAC drivers are simple inverters instead of transmission-gate switches, which enhances speed and power efficiency. If the RVBuffer is to be implemented with core 1.2 V devices, the considerable $V_{GS}$ drop caused by the SF stages has to be surmounted so that an adequate input range for the ADC, i.e. $(V_{refp} - V_{refn})$, can be guaranteed. Techniques such as forward body biasing [79] or level shifting with boosted supply voltages and additional SF stages [80] will need to be employed in the RVBuffer, which will increase design complexity. With a lower input range for the ADC and assuming that the total DAC capacitance remains unchanged, a pre-amplifier for the dynamic comparator will become inevitable to reduce the noise, at the cost of increased power consumption [81].

In this work, the ADC works at the core voltage (1.2 V) while the RVBuffer works at the IO voltage (2.5 V). From a system-on-chip (SoC) implementation perspective, this does not constitute a significant drawback since SoCs often employ multiple supply voltages (e.g. high-voltage devices are used to reduce leakage in certain critical blocks of the SoC).

The bias voltage $V_{BiasP}$ in Fig. 4-5 is generated by the constant-$g_m$ bias circuit [82] shown in Fig. 4-7. The start-up circuitry consisting of $M_9-M_{11}$ precludes the possibility of a zero-current state. Since the constant-$g_m$ bias circuit uses positive feedback and $R_{Ext}$ is an off-chip resistor, parasitic capacitances on node $V_R$ can cause oscillations [82]. A sufficiently high value for $R_{Ext}$ was therefore chosen. Extensive DC and transient simulations of the bias circuit, including estimated off-chip parasitics at node $V_R$, were performed to confirm circuit stability over PVT corners. It is important to bear in mind that the circuit in Fig. 4-7 provides an output current that is stable only over supply voltage variations in a certain range. Significant variation in the output current was found for the extreme temperature corners, namely -40°C and +125°C.

### 4.4.2 Input Sampling Switches

The input sampling switches are crucial in determining the linearity of the ADC. Various non-ideal effects of the sampling switch such as signal-dependent on-resistance variation, charge-injection and clock feedthrough degrade linearity [14]. Also the tracking bandwidth of the sampling switch must be sufficiently high. The tracking bandwidth of a sampling circuit for an N-bit converter should satisfy [22]

$$f_{3dB} = \frac{1}{2\pi R_{ON}C_s} > \frac{(N + 1)\ln(2)}{\pi} f_s.$$  (4.10)

Figure 4-5: Schematic of the reference voltage buffer.

Figure 4-6: Open-loop gain and phase plot for the RVBuffer.
Table 4-1: Performance summary of the reference voltage buffer.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Supply voltage</td>
<td>2.5 V</td>
</tr>
<tr>
<td>Settling time</td>
<td>692 ps</td>
</tr>
<tr>
<td>PSRRp @ 10 kHz</td>
<td>68 dB</td>
</tr>
<tr>
<td>PSRRp @ 600 MHz</td>
<td>25 dB</td>
</tr>
<tr>
<td>$\sigma_{\text{noise, out}}$</td>
<td>201 $\mu$V</td>
</tr>
<tr>
<td>Offset voltage ($\sigma_{\text{offset}}$)</td>
<td>4.18 mV</td>
</tr>
<tr>
<td>Phase margin</td>
<td>58$^\circ$</td>
</tr>
<tr>
<td>DC gain</td>
<td>62 dB</td>
</tr>
<tr>
<td>Unity-gain frequency</td>
<td>511 MHz</td>
</tr>
<tr>
<td>Area</td>
<td>(55 $\mu$m $\times$ 104 $\mu$m)</td>
</tr>
<tr>
<td>Current ($I_{\text{tran}}$)</td>
<td>8 mA</td>
</tr>
</tbody>
</table>

Figure 4-7: Schematic of the constant-$g_m$ bias circuit.
In this case, the sampling/acquisition period is one clock cycle of the SAR clock, which sets $f_s = 12 \times 50 \text{ MHz} = 600 \text{ MHz}$. For the split binary-weighted array capacitive DAC with top-plate sampling as shown in Fig. 4-11, the total sampling capacitance is $C_s = 480 \text{ fF}$. For $N = 10$ bits, this results in an upper bound on the on-resistance of the sampling switch of $R_{ON} < 227 \Omega$. Attaining such a low $R_{ON}$ value over PVT corners without huge device sizes requires the use of bootstrapped sampling switches. In this work, the bootstrapped switch topology presented in [16] has been used. The transistor implementation of the bootstrapped switch is shown in Fig. 4-8. Similar to [83], the dummy switch PD in Fig. 4-8 helps to alleviate charge injection at node E. A MIM capacitor from the 65 nm design kit has been used to implement the bootstrap capacitor. The value of the bootstrap capacitor was chosen such that the effect of parasitic capacitances on the switch performance is minimized. In Fig. 4-8, the devices $N_3, P_4, N_5$ are OFF during the acquisition phase $\phi_1n$. Hence high-threshold-voltage devices were used to implement $N_3, P_4, N_5$ in order to reduce leakage.

Post-layout simulation results for the bootstrapped switch linearity (noise not included) over PVT corners are provided in Fig. 4-9. The simulation testbench for switch linearity includes $50 \Omega$ resistors for the analog inputs. The analog input frequency is $21 \text{ MHz}$ (near-Nyquist). The worst linearity occurs for the slowest MOS corner combined with the highest temperature of $+125^\circ\text{C}$. From Fig. 4-9, it is seen that the switch maintains 10-bit linearity even for the worst PVT conditions. Since the absolute gate voltage of the switch exceeds the supply voltage $V_{DD}$, sufficient precautions have to be taken to guarantee reliability of the switch. A long circuit lifetime is assured for the switch by ensuring that the critical terminal voltages $V_{gs}$, $V_{gd}$ and $V_{ds}$ are kept within the rated supply voltage $V_{DD}$ [84].
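
As a rough numerical check of (4.10), the following Python sketch evaluates the bandwidth bound and the resulting on-resistance limit for the values quoted above ($N = 10$, $f_s = 600$ MHz, $C_s = 480$ fF). The function names are illustrative only and are not part of the original design flow.

```python
import math

def min_tracking_bandwidth(n_bits: int, f_s: float) -> float:
    """Lower bound on f_3dB from (4.10), in Hz."""
    return (n_bits + 1) * math.log(2) / math.pi * f_s

def max_on_resistance(n_bits: int, f_s: float, c_s: float) -> float:
    """Upper bound on R_ON implied by (4.10), in ohms."""
    return 1.0 / (2.0 * math.pi * min_tracking_bandwidth(n_bits, f_s) * c_s)

f_3db_min = min_tracking_bandwidth(10, 600e6)      # values from the text
r_on_max = max_on_resistance(10, 600e6, 480e-15)
print(f"f_3dB > {f_3db_min / 1e9:.2f} GHz, R_ON < {r_on_max:.1f} ohm")
```

The computed bound lies between 227 Ω and 228 Ω, in line with the limit stated in the text.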

### 4.4.3 Dynamic Comparator

Dynamic comparators are a popular choice in SAR ADC implementations since they eliminate static bias currents and thus improve power efficiency. Figure 4-10 shows the schematic of the double-tail dynamic comparator [85]. Inverters have been added at the outputs to make the output loading identical. During the reset state ($Clk = 0 \ V$), transistors $M_4$ and $M_5$ charge the nodes $D_{i-}$ and $D_{i+}$ to $V_{DD}$, which causes the devices $M_{10}$ and $M_{11}$ to discharge the output nodes to ground. During the evaluation phase ($Clk = V_{DD}$), the cross-coupled inverters regenerate the input voltage difference to provide digital output levels. The devices $M_{10}$ and $M_{11}$ help to reduce kickback noise [85]. The double-tail architecture provides a number of benefits over the conventional sense-amplifier latch [4]. Since fewer devices are stacked, the double-tail comparator can operate at lower supply voltages. Due to the double-tail structure, the current in the input stage can be decoupled from that in the latching stage. According to [31], the input-referred noise of the dynamic comparator can be lowered by using a smaller current, i.e. a smaller tail-source transistor $M_3$, in the input stage. In the double-tail comparator, a lower current is used in the input stage to reduce noise and offset, while a higher current is used in the latching stage to meet the speed requirement. Increasing the capacitance on nodes $D_{i-}$, $D_{i+}$ helps to lower the comparator noise [31] but also results in increased comparator delay. As a trade-off, minimum-size metal-plate capacitors from the design kit were added to the nodes $D_{i-}$, $D_{i+}$.

Simulated performance of the dynamic comparator is summarized in Table 4-2. In post-layout simulation over PVT corners, the comparator achieves a worst-case standard deviation of the input-referred noise $\sigma_{\text{noise,in}} = 471 \ \mu V$, which is equivalent to 0.2 LSB. In this SAR ADC, the input common-mode level of the comparator remains at mid-rail ($V_{DD}/2$) throughout the conversion and hence the offset of the comparator appears as a static offset which does not affect linearity of the converter [9]. Taking $3\sigma_{\text{offset, in}}$, the input-referred offset will reduce SNR by 0.2 dB, which corresponds to an ENOB loss of 0.03 bits. The worst-case delay of the comparator working with an input voltage difference $< \text{LSB/2}$ is 641 ps, which fits well within the half clock cycle time period of 833 ps. Further reduction in input-referred noise and offset of the comparator can be achieved by including preamplifier stage(s) in front of the comparator. Since a high-speed preamplifier will significantly increase power consumption, it has been excluded in this work.
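
The LSB-referred noise and the timing margin quoted above can be reproduced with a short Python sketch. It assumes a differential full-scale range of $2V_{REF}$, which is an inference rather than a statement from the text; under that assumption the 201 $\mu$V RVBuffer output noise of Table 4-1 also maps to the quoted 0.086 LSB. All names are illustrative.

```python
V_REF = 1.2          # reference voltage from the text, in volts
N_BITS = 10          # ADC resolution
F_SAR_CLK = 600e6    # SAR clock from the text, in Hz

# Assumption: differential full-scale range of 2 * V_REF.
LSB = 2.0 * V_REF / 2**N_BITS

def in_lsb(voltage: float) -> float:
    """Express a voltage as a multiple of the assumed LSB."""
    return voltage / LSB

comparator_noise_lsb = in_lsb(471e-6)   # worst-case comparator noise
rvbuffer_noise_lsb = in_lsb(201e-6)     # RVBuffer output noise (Table 4-1)
half_cycle = 0.5 / F_SAR_CLK            # half SAR clock period, in seconds
timing_margin = half_cycle - 641e-12    # slack after the comparator delay

print(f"LSB = {LSB * 1e3:.3f} mV")
print(f"comparator noise = {comparator_noise_lsb:.2f} LSB")
print(f"RVBuffer noise = {rvbuffer_noise_lsb:.3f} LSB")
print(f"half cycle = {half_cycle * 1e12:.0f} ps, "
      f"margin = {timing_margin * 1e12:.0f} ps")
```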

### 4.4.4 Split Binary-Weighted Array DAC

The capacitive array DAC in a SAR ADC samples the input voltage and performs the DAC function of generating and subtracting the scaled reference voltage. Although the binary-weighted capacitor array provides high linearity, the exponential increase of the array capacitance with resolution imposes area and power penalties. The speed limitation in SAR ADCs stems from the large settling times for the DAC caused by the RC time constants of the DAC capacitors and driver switches. Charging large capacitors in a binary-weighted array at high speeds requires extremely fast reference voltage buffers that consume an enormous amount of power. Alternative DAC topologies are the C-2C ladder DAC [86] and the split binary-weighted capacitive DAC [24]. The C-2C ladder DAC suffers from poor linearity due to parasitic capacitance at the interconnection nodes and hence is not widely employed in medium-to-high resolution SAR ADCs. The split binary-weighted capacitive DAC is commonly used to reduce the total DAC capacitance and the spread of the capacitor values in the array, thus providing area and power savings and also relaxing the settling time requirements.

Figure 4-11 shows the 10-bit split array DAC which is composed of two binary-weighted capacitive arrays, a 5-bit main DAC and a 5-bit sub DAC separated by a bridge capacitor $C_B$ in the middle. The bridge capacitor $C_B$ is chosen to be a unit capacitor instead of a fractional value for better matching and ease of layout [24]. Also the dummy unit capacitor at the end of the sub DAC has been avoided. Such a modification introduces a gain error of approximately 1 LSB which can be readily corrected in the digital domain. The selection of the unit capacitor in the DAC involves important considerations such as thermal noise, mismatch and technology limitations.
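
The gain error of about 1 LSB mentioned above can be illustrated with an idealized charge-conservation model of the split array. The sketch below is not taken from the thesis: it assumes both arrays start discharged, ignores the sampled input and all parasitics, and expresses capacitances in units of $C_u$. The function name is hypothetical.

```python
from fractions import Fraction

M_BITS = 5   # main-DAC resolution
L_BITS = 5   # sub-DAC resolution

def split_dac_output(code: int) -> Fraction:
    """Ideal output at the main-DAC top plate, normalised to V_REF.

    Capacitances are in units of C_u; the bridge capacitor is one unit
    and there is no dummy capacitor in the sub DAC.
    """
    d_main, d_sub = divmod(code, 2**L_BITS)
    c_main = 2**M_BITS - 1        # total main-array capacitance
    c_sub = 2**L_BITS - 1         # total sub-array capacitance
    c_bridge = 1
    # Result of charge conservation at the two floating top-plate nodes.
    numerator = d_main * (c_sub + c_bridge) + d_sub * c_bridge
    denominator = c_main * c_sub + c_main * c_bridge + c_bridge * c_sub
    return Fraction(numerator, denominator)

n_codes = 2**(M_BITS + L_BITS)
# Set of all code-to-code step sizes; a single entry means a linear DAC.
steps = {split_dac_output(c + 1) - split_dac_output(c)
         for c in range(n_codes - 1)}
# Deviation from the ideal 10-bit level at the top code, in LSB.
full_scale_error_lsb = (split_dac_output(n_codes - 1)
                        - Fraction(n_codes - 1, n_codes)) * n_codes
print(steps, full_scale_error_lsb)
```

In this simplified model every step equals $V_{REF}/1023$, so the transfer curve stays linear and the deviation from the ideal 10-bit characteristic is a pure gain error that reaches 1 LSB at the top code.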

First we consider the thermal noise constraint. For an N-bit ADC with a full-scale range of $V_{REF}$, the quantization noise is given by

$$P_Q = \frac{V_{REF}^2}{12 \cdot 2^{2N}}. \quad (4.11)$$

If the thermal noise is designed to be equal to the quantization noise, a 3 dB loss in SNR will occur. For such a scenario, the minimum value of sampling capacitance is given by

$$C_s = \frac{12kT \cdot 2^{2N}}{V_{REF}^2}. \quad (4.12)$$

For $N = 10$ bits, $V_{REF} = 1.2 \, \text{V}$, the minimum sampling capacitance is $C_s = 36 \, \text{fF}$. In this work, since a split-array capacitive DAC with top-plate sampling is used, the total sampling capacitance is $480 \, \text{fF}$ which satisfies the thermal noise constraint with good margin.
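
The 36 fF figure follows directly from (4.12). A minimal Python sketch is given below; the temperature of 300 K is an assumption, since the text does not state the value used, and the function name is illustrative.

```python
BOLTZMANN = 1.380649e-23   # Boltzmann constant, in J/K

def min_sampling_cap(n_bits: int, v_ref: float, temp_k: float = 300.0) -> float:
    """Sampling capacitance from (4.12), in farads.

    temp_k defaults to 300 K, which is an assumed value.
    """
    return 12.0 * BOLTZMANN * temp_k * 4**n_bits / v_ref**2

c_s_min = min_sampling_cap(10, 1.2)
print(f"C_s,min = {c_s_min * 1e15:.1f} fF")
```

The result is roughly 36 fF, more than an order of magnitude below the 480 fF actually used.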

Next we determine the mismatch-limited minimum value for $C_u$ assuming a single-ended implementation. Due to the bridge capacitor, the effect of capacitor mismatch in the sub DAC is reduced by $1/2^M$ where $M$ is the main DAC resolution. For relatively large values of $M$, the main DAC dominates the total mismatch performance. Usually $M \geq N/2$ is chosen. Following the analysis given in [60], the worst-case standard deviation of the differential nonlinearity (DNL) for the $M$-bit main DAC in a fully-differential implementation is

$$\sigma_{\text{DNL,MAX}} = \sqrt{2(2^M - 1)} \left( \frac{\sigma_u}{2C_u} \right) LSB_M, \quad (4.13)$$

where $LSB_M = 2V_{REF}/2^M$, $C_u$ and $\sigma_u$ are the nominal value and standard deviation of the unit capacitor. For sufficient accuracy in the $N$-bit ADC [60],

$$3\sigma_{\text{DNL,MAX}} < \frac{LSB}{2}. \quad (4.14)$$

Combining (4.13) and (4.14), we have

$$\frac{\sigma_u}{C_u} < \frac{1}{3 \cdot 2^{N-M} \cdot \sqrt{2(2^M - 1)}}. \quad (4.15)$$

For the typical metal capacitor,

$$\sigma \left( \frac{\Delta C}{C} \right) = \frac{K_\sigma}{\sqrt{A}}, \quad (4.16)$$

$$C_u = K_c \cdot A, \quad (4.17)$$

where $\sigma \left( \frac{\Delta C}{C} \right)$ is the standard deviation of capacitor mismatch, $K_\sigma$ is the matching coefficient, $A$ is the capacitor area and $K_c$ is the capacitance density parameter. Also

$$\frac{\sigma_u}{C_u} = \frac{1}{\sqrt{2}} \cdot \sigma \left( \frac{\Delta C}{C} \right). \quad (4.18)$$

Combining (4.16)-(4.18) gives

$$\frac{\sigma_u}{C_u} = \frac{K_\sigma \sqrt{K_c}}{\sqrt{2C_u}}. \quad (4.19)$$

Substituting the value of $\sigma_u/C_u$ from (4.19) in (4.15) results in a mismatch-limited lower bound for the unit capacitor given by

$$C_u \geq 9 \cdot (2^M - 1) \cdot 2^{2(N-M)} \cdot K_\sigma^2 \cdot K_c. \quad (4.20)$$

Using $N = 10$, $M = 5$ and the $K_\sigma$, $K_c$ values from the design kit documentation in (4.20) gives $C_u \geq 3$ fF. Considering the minimum capacitor value defined by the design kit and also the layout parasitics, $C_u = 15$ fF has been chosen in this work.
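
The mismatch bound can be evaluated numerically as sketched below. The matching coefficient and capacitance density used here are hypothetical placeholders, not the design-kit values (which the text does not disclose), so the printed capacitance is illustrative only and is not expected to reproduce the 3 fF figure. The sketch also cross-checks that the mismatch ratio at the minimum unit capacitor, expressed through the matching coefficient and capacitance density, meets the bound (4.15) with equality.

```python
import math

def max_unit_cap_mismatch(n_bits: int, m_bits: int) -> float:
    """Upper bound on sigma_u / C_u from (4.15)."""
    return 1.0 / (3.0 * 2**(n_bits - m_bits)
                  * math.sqrt(2.0 * (2**m_bits - 1)))

def min_unit_cap(n_bits: int, m_bits: int, k_sigma: float, k_c: float) -> float:
    """Lower bound on C_u from (4.20), in the capacitance unit of k_c."""
    return 9.0 * (2**m_bits - 1) * 4**(n_bits - m_bits) * k_sigma**2 * k_c

# Hypothetical coefficients, NOT the design-kit values:
K_SIGMA = 0.005   # matching coefficient, in um (i.e. 0.5 % * um)
K_C = 1.0         # capacitance density, in fF/um^2

c_u_min = min_unit_cap(10, 5, K_SIGMA, K_C)
# Cross-check: sigma_u / C_u at C_u = c_u_min should equal the (4.15) bound.
ratio_at_min = K_SIGMA * math.sqrt(K_C) / math.sqrt(2.0 * c_u_min)
print(f"sigma_u/C_u < {max_unit_cap_mismatch(10, 5):.5f}, "
      f"C_u >= {c_u_min:.2f} fF (placeholder coefficients)")
```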

One potential drawback of the split-array architecture is its vulnerability to the parasitic capacitances connected to the nodes A and B as shown in Fig. 4-11. The parasitic capacitances \(C_{P,A}\) and \(C_{P,B}\) are caused by the top-plate and bottom-plate parasitics of \(C_B\) as well as the top-plate parasitic capacitance of the sub DAC and main DAC respectively. In [87], it is shown that \(C_{P,B}\) contributes only a gain error at the DAC output while \(C_{P,A}\) causes both gain error and code-dependent errors. The code-dependent errors degrade the linearity of the ADC. Hence it is beneficial to lower the value of \(C_{P,A}\). One design technique is to ensure that the bottom-plate of \(C_B\), which usually has higher parasitic capacitance, is connected to node B, i.e., the top-plate node of the main DAC thus lowering \(C_{P,A}\). The layout of the split-array capacitive DAC (single side) is shown in Fig. 4-12. Common-centroid layout is employed for the MSB capacitors in the main DAC and sub DAC to alleviate the mismatch errors due to non-uniform oxide growth. The LSB capacitors are placed close to the corresponding switches in order to simplify the interconnects. The unmarked capacitors in Fig. 4-12 are dummies. The linearity of the split-array DAC was verified by computing the DNL, INL over MC simulations on the post-layout netlist. The plot of \(3\sigma_{INL}\), \(3\sigma_{DNL}\) vs output code for 250 MC runs is shown in Fig. 4-13. From Fig. 4-13, it is seen that the requirements on DAC INL/DNL for 10-bit linearity are satisfied. Similar to [9], inverters have been used to switch between the high and low reference voltages of the DAC. The power supply nodes of these inverters are connected to the \(V_{REF}\) node (1.2 V) output from the RVBuffer. The sizes of the inverters have been scaled such that the different capacitors of the DAC constitute approximately the same RC constant.

### 4.4.5 SAR Controller

In this work, a synchronous binary-search successive approximation register which utilizes a ring counter and shift register has been implemented [88]. A simplified block diagram of the SAR logic is shown in Fig. 4-14. Static CMOS logic has been used to implement the SAR controller. The timing sequence of the SAR logic is shown in Fig. 4-15. An entire conversion requires twelve clock cycles of the SAR clock. A 600 MHz SAR clock is used and the 50 MHz sampling clock is generated by the SAR logic. Power consumption in the SAR block is mainly due to the 22 flip-flops (FFs). The lower chain of FFs in Fig. 4-14 drives the inverter switches of the DAC. The delay through the FF and the RC settling time of the DAC should fit within the half clock cycle period (833 ps). Hence the FFs were designed to minimize the clock-to-Q delay.

Figure 4-12: Layout of the split-array DAC (single-side).

Figure 4-13: INL/DNL of the 10-bit fully differential split-array DAC.

### 4.4.6 Layout of the ADC

The layout of the SAR ADC with on-chip RVBuffer is shown in Fig. 4-16. The core-area of the chip is \((225 \mu m \times 245 \mu m)\). Approximately 70\% of the core-area is taken up by the capacitive DAC arrays. NMOS capacitors enhanced by additional metal layers have been used for supply decoupling. The ADC uses two 1.2 V supplies (analog, digital) and a 2.5 V supply for the RVBuffer. Total decoupling capacitance for supplies used in the ADC is 85 pF. For the RVBuffer, a supply decoupling capacitance of 25 pF is used. Decoupling capacitance of 35 pF and 25 pF are used for the digital and analog supplies respectively.

## 4.5 Simulation Results

The simulation test-bench for the SAR ADC includes the entire pad frame, decoupling capacitors and source resistors for the signal sources. The impact of the IO pads manifests in the form of parasitic capacitance and resistance. An inductance of 4 nH is added in series to every IO pad connection to the ADC to mimic the bondwire. Since there is no on-chip bandgap reference generator in this work, the input reference voltage $V_{\text{RefIn}}$ shown in Fig. 4-5 is provided through an analog input pad to the SAR ADC. Figure 4-17 illustrates the necessity of the RVBuffer in the SAR ADC when an inductance of 4 nH is used on the $V_{\text{RefIn}}$ input pad. Without the RVBuffer, the DAC output voltage $V_{\text{DACP}}$ suffers large ringing due to the bondwire inductance. From Fig. 4-17, it is evident that the magnitude of ringing is much higher than 1 LSB. Inclusion of the RVBuffer sufficiently lessens the ringing. Without the RVBuffer, the SNDR of the ADC is a meagre 25 dB which corresponds to an ENOB of 4 bits further confirming the harmful impact of bondwire inductances. An ENOB > 9 bits is achieved after addition of the RVBuffer.

Simulations were performed to determine the impact of the bondwire inductance connected to the 2.5 V supply node of the RVBuffer. The inductance connected to the $V_{\text{DD}}$ node of the RVBuffer was increased from 4 nH up to 10 nH. Simulations show that the ringing voltage on the $V_{\text{DD}}$ node due to transient current flow in the transistor $M_{11}$ shown in Fig. 4-5 has a frequency of approximately 300 MHz. At 300 MHz, the RVBuffer has a PSRRp of 27 dB, which sufficiently suppresses the impact of the ringing on $V_{\text{DD}}$ at the RVBuffer output node. Thus the output voltage $V_{\text{REF}}$ of the RVBuffer is not significantly disturbed. With an inductance of 10 nH added to the power supply node of the RVBuffer, the ADC achieves a linearity corresponding to 9.6 bits in the nominal PVT corner for post-layout simulation, indicating that accurate settling of the DAC voltages is maintained.

Due to the varying current flow through the inductances, ringing will occur on the different supply domains of the SAR ADC. Based on simulation, the decoupling capacitance required for each supply domain has been estimated such that the ringing on the supplies is kept to acceptable limits. Digital output pads with in-built level-shifter and driver are used for the output bits of the ADC. Simulations with realistic capacitance value of the logic analyzer probe were done to determine the required drive strength of the digital output pads.

The output spectrum of the SAR ADC for a low input frequency of 1 MHz obtained using post-layout simulation is shown in Fig. 4-18. For a 1 MHz input, the ADC achieves an SNDR = 59.9 dB and the corresponding ENOB = 9.66 bits. The output spectrum of the SAR ADC for a near-Nyquist input frequency of 21 MHz obtained using post-layout simulation is shown in Fig. 4-19. The post-layout simulation result for the dynamic performance of the SAR ADC with varying input frequency and a sampling rate of 50 MS/s is shown in Fig. 4-20. At an input frequency of 23 MHz, the SNDR degrades by 2.7 dB from its value at 1 MHz. Hence the Effective Resolution Bandwidth (ERBW) is taken as 23 MHz. The FoM of the ADC has been calculated as [9]

\[
FoM = \frac{Power}{2^{ENOB} \cdot \min(2 \cdot ERBW, f_s)}. \tag{4.21}
\]
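
The ENOB and FoM values reported in this section can be cross-checked with the sketch below. It reads (4.21) as the power divided by $2^{ENOB} \cdot \min(2 \cdot ERBW, f_s)$ and uses the standard SNDR-to-ENOB conversion; the power (697 µW) and ENOB (9.25 bits) are the values listed for this work in Table 4-3. Function names are illustrative.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """Standard conversion ENOB = (SNDR - 1.76) / 6.02."""
    return (sndr_db - 1.76) / 6.02

def walden_fom(power_w: float, enob: float, erbw_hz: float, f_s_hz: float) -> float:
    """FoM from (4.21), in joules per conversion step."""
    return power_w / (2.0**enob * min(2.0 * erbw_hz, f_s_hz))

enob_1mhz = enob_from_sndr(59.9)               # SNDR for the 1 MHz input
fom = walden_fom(697e-6, 9.25, 23e6, 50e6)     # values from the text and Table 4-3
print(f"ENOB(59.9 dB) = {enob_1mhz:.2f} bits, "
      f"FoM = {fom * 1e15:.1f} fJ/conv-step")
```

Since $2 \cdot ERBW = 46$ MHz is below $f_s = 50$ MHz, the ERBW term sets the denominator, and the result rounds to the 25 fJ/conv-step listed in Table 4-3.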

The block-wise power breakdown for the SAR ADC in the typical PVT corner, with a total power consumption of 697 µW, is shown in Fig. 4-21. It is seen that the SAR logic forms the dominant source of power consumption, which agrees well with SAR ADCs of similar specification reported in [68], [9], [89], [90].

Figure 4-18: Output spectrum of the SAR ADC for low-frequency input.

Figure 4-19: Output spectrum of the SAR ADC for near-Nyquist input frequency.

Figure 4-20: Dynamic performance versus input frequency.

Figure 4-21: Power breakdown for the SAR ADC (typical PVT corner).

Post-layout simulation of the ADC including device noise was performed for the typical and worst PVT corners with near-Nyquist differential inputs and a sampling rate of 50 MS/s. Transient noise simulation, which has been previously used to determine ADC performance [91], was utilized to incorporate device noise. Table 4-3 compares the performance of the proposed ADC with other state-of-the-art SAR ADCs. It is seen that the proposed ADC is very competitive in terms of FoM and area when compared with SAR ADCs of similar specifications. To ascertain the impact of process variation and device mismatch on the ADC performance, MC simulations were run on the post-layout netlist of the full ADC including the on-chip RVBuffer. For 50 MC runs with transient noise enabled, the SAR ADC with on-chip RVBuffer achieves $\mu_{ENOB} = 9.2$ bits and $\sigma_{ENOB} = 0.46$ bits for sinusoidal inputs at near-Nyquist frequency.
Table 4-3: Comparison to state-of-the-art works.

<table>
<thead>
<tr>
<th>Specification</th>
<th></th>
<th></th>
<th></th>
<th>This work</th>
</tr>
</thead>
<tbody>
<tr>
<td>Technology (nm)</td>
<td>130</td>
<td>130</td>
<td>90</td>
<td>65</td>
</tr>
<tr>
<td>Supply Voltage (V)</td>
<td>1.2</td>
<td>1.2</td>
<td>1.2</td>
<td>1.2</td>
</tr>
<tr>
<td>Sampling rate (MS/s)</td>
<td>50</td>
<td>40</td>
<td>50</td>
<td>50</td>
</tr>
<tr>
<td>ENOB (bit)</td>
<td>9.18</td>
<td>8.11</td>
<td>10.5</td>
<td>9.25</td>
</tr>
<tr>
<td>Power (mW)</td>
<td>0.826</td>
<td>0.55</td>
<td>4.7</td>
<td>0.697</td>
</tr>
<tr>
<td>FoM (fJ/conv.-step)</td>
<td>29</td>
<td>50</td>
<td>63.9</td>
<td>25</td>
</tr>
<tr>
<td>Active area (mm²)</td>
<td>0.052</td>
<td>0.32</td>
<td>0.118</td>
<td>0.055</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Specification</th>
<th>ISSCC’13 [94]</th>
<th>ISSCC’12 [95]</th>
<th>TVLSI’13 [81]</th>
<th>This work</th>
</tr>
</thead>
<tbody>
<tr>
<td>Technology (nm)</td>
<td>90</td>
<td>40</td>
<td>90</td>
<td>65</td>
</tr>
<tr>
<td>Supply Voltage (V)</td>
<td>1.2</td>
<td>1.1</td>
<td>1</td>
<td>1.2</td>
</tr>
<tr>
<td>Sampling rate (MS/s)</td>
<td>50</td>
<td>40</td>
<td>30</td>
<td>50</td>
</tr>
<tr>
<td>ENOB (bit)</td>
<td>11.5</td>
<td>9.15</td>
<td>9.16</td>
<td>9.25</td>
</tr>
<tr>
<td>Power (mW)</td>
<td>4.2</td>
<td>NA</td>
<td>0.98</td>
<td>0.697</td>
</tr>
<tr>
<td>FoM (fJ/conv.-step)</td>
<td>36.1</td>
<td>63</td>
<td>57</td>
<td>25</td>
</tr>
<tr>
<td>Active area (mm²)</td>
<td>0.097</td>
<td>0.08</td>
<td>0.1</td>
<td>0.055</td>
</tr>
</tbody>
</table>

*: chip measurement results

## 4.6 Summary

In this chapter, a power-efficient SAR ADC with an on-chip reference voltage buffer implemented in 65 nm CMOS was presented. The limitation on the DAC reference voltage settling poses a major obstacle in high-speed SAR ADCs. Consequently, an RVBuffer becomes indispensable to achieve the desired performance. Based on the available settling time and desired accuracy, the important design parameters for the RVBuffer were computed. Selection of a suitable topology followed by careful device sizing was required to satisfy the numerous specifications of the RVBuffer. Monte Carlo simulations were used to verify PSRR and offset of the buffer. The output noise of the RVBuffer forms part of the ADC noise budget and hence it has to be minimized by proper choice of the output current in the RVBuffer. A bootstrapped sampling switch enabled the ADC to meet the targeted linearity performance while a split-array capacitive DAC yielded power and area savings.
Chapter 5

Mixed-Signal Interfaces

## 5.1 Introduction

Operational transconductance amplifiers (OTAs) constitute a crucial building block of mixed-signal interface circuits. A well-known application of the OTA is in providing programmable gain for the input signals of an ADC so that its dynamic range is maximized under different operating scenarios. OTAs also feature in ΣΔ ADCs as loop filters and in analog buffers such as the reference voltage buffer of an ADC. Body-area networks which facilitate communication between devices in close proximity to the human body have wide-ranging applications. The receiver for such networks employs an analog-front-end (AFE) based on OTAs [96]. Since the OTA is a major source of power consumption and must satisfy numerous specifications such as gain, bandwidth, linearity, noise etc., the design of power-efficient OTAs at low supply voltages is a major challenge. In addition to innovative design techniques, the features of new process technologies can also be enlisted to attain the desired performance.

This chapter compiles the disparate works done on power-efficient OTAs in different CMOS process technologies. The reported OTAs have been designed for applications such as PGA, reference voltage buffer and receiver AFE. The features of the advanced 28 nm UTBB FDSOI CMOS process have been used to design a PGA for a 9-bit, 1 kS/s SAR ADC as well as an ultra-low-voltage general-purpose OTA. A power-efficient reference voltage buffer for a 10-bit, 1 MS/s SAR ADC has been designed in 180 nm CMOS for an embedded SAR ADC in a fingerprint sensor. Proper choice of the frequency compensation scheme in multi-stage OTAs is needed to reduce power consumption and achieve large unity-gain frequency. Two previously-published compensation schemes were compared using a three-stage OTA designed in 40 nm CMOS. Finally, a two-stage OTA in 40 nm CMOS for the receiver AFE in body-coupled communication is reported.

## 5.2 A PGA for a 9 bit, 1 kS/s SAR ADC

This work presents a fully-differential operational transconductance amplifier (OTA) designed in a 28 nm UTBB FDSOI CMOS process. The OTA which features continuous-time CMFB circuits will be employed in the programmable gain amplifier (PGA) for a 9-bit, 1 kS/s SAR ADC. The reverse body bias (RBB) feature of the FDSOI process explained in Section 2.3.1 is used to enhance the DC gain by 6 dB. The OTA achieves rail-to-rail output swing and provides DC gain = 70 dB, unity-gain frequency = 4.3 MHz and phase margin = 68° while consuming 2.9 µW with a $V_{DD}$ = 1 V. A high linearity > 12 bits without the use of degeneration resistors and a settling time of 5.8 µs (11-bit accuracy) are obtained under nominal operating conditions. The OTA maintains satisfactory performance over all process corners and a temperature range of $[-20^\circ C, +85^\circ C]$.

5.2.1 Performance Requirements

A PGA should have very high input-impedance so that the preceding instrumentation amplifier (IA) is not loaded [3]. Hence a gate-input OTA is preferable to a body-input OTA. The PGA should have sufficient bandwidth (settling speed) to drive the capacitive load of the succeeding ADC. In this work, the PGA drives a 9-bit, 1 kS/s SAR ADC. The sampling capacitance (single-ended) for the SAR ADC is estimated to be 500 fF. Hence the OTA should achieve accurate settling with a capacitive load of 500 fF. Linearity and large output swing are also important specifications for the PGA. Consequently the OTA should meet 10-bit linearity for large output swings ($\geq 80\%$) over all process and temperature corners. Since the PGA realizes variable signal gains (e.g. 1x, 2x, 4x etc) so as to maximize the dynamic range of the ADC, the closed-loop gain error should be minimized. Hence the OTA should have a large DC gain which is maintained across all envisaged operating conditions. Adequately high CMRR is necessary for the OTA. Power consumption of the OTA should be minimized.
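
To indicate why a large DC gain matters for the closed-loop gain accuracy, the sketch below evaluates the static gain error $1/(1 + A_0\beta)$ for a few PGA gain settings. It assumes an inverting resistive-feedback configuration with feedback factor $\beta = 1/(1 + G)$, which is a textbook model and not a description of the actual PGA network; the 70 dB figure is the OTA DC gain quoted in this section.

```python
def closed_loop_gain_error(dc_gain_db: float, closed_loop_gain: float) -> float:
    """Relative static gain error 1 / (1 + A0 * beta).

    Assumes resistive feedback with feedback factor beta = 1 / (1 + G).
    """
    a0 = 10.0 ** (dc_gain_db / 20.0)
    beta = 1.0 / (1.0 + closed_loop_gain)
    return 1.0 / (1.0 + a0 * beta)

for gain in (1, 2, 4):
    err = closed_loop_gain_error(70.0, gain)
    print(f"G = {gain}: gain error = {err * 100:.3f} %")
```

Under these assumptions the error grows with the gain setting, which is why the DC gain has to be maintained across all operating conditions.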

5.2.2 Architecture

A two-stage Miller-compensated architecture has been chosen for the OTA as shown in Fig. 5-1. An important motivation for this choice is that the load capacitance of the OTA will vary during different phases of operation. During the input sampling phase of the SAR ADC, the sampling switches are turned ON and the capacitors connect to the OTA outputs. During the bit conversion cycles, the sampling switches are OFF and the capacitive load is disconnected from the OTA outputs. In a two-stage Miller-compensated OTA, the non-dominant pole is formed at the output nodes. Hence, if sufficient phase margin is ensured for the maximum expected load capacitance, any reduction of the load capacitance from that value will not imperil stability. A two-stage architecture helps to meet the requirements of high DC gain and large output swing. The compensation capacitor $C_c$ and the zero-nulling resistor $R_z$ have been realized by a 100 fF MIM capacitor and a 36 kΩ poly resistor respectively, while $C_L = 500$ fF. A constant-$g_{m}$ bias circuit generates the required bias voltages.

It is common practice to use a switched-capacitor (SC) CMFB in the second stage of an OTA. However, the maximum frequency of the input signal for an OTA with SC CMFB is constrained to be much lower than the clock frequency. An SC CMFB also requires non-overlapped clocks. For a specified unity-gain frequency, an OTA with continuous-time (CT) CMFB can support higher input signal frequencies. In the OTA, separate CT CMFB circuits are used for the two stages.

5.2.3 Common-mode Feedback

A resistor-divider CMFB has been employed in the first stage of the OTA. The bias current in the first stage is 200 nA. For such low bias currents, the output impedance of the stage is very high. To avoid degrading the DC gain, very large resistors need to be used in the CMFB circuit. In this work, the resistors $R_{MOS}$ shown in Fig. 5-1 have been realized using RVT NMOS devices biased in the linear region. The RBB feature has been used to boost the resistance of the NMOS devices to $\approx 7 \text{ MΩ}$. The RBB voltage applied equals $V_{DD} = 1$ V. Application of RBB helps to enhance the gain of the first stage by 6 dB. The CMFB circuit in the second stage plays a crucial role in determining the output common-mode level, output swing, linear output range and DC gain of the OTA. In a traditional CT CMFB circuit using resistors, source followers are often interposed between the OTA outputs and the common-mode sensing resistors to avoid resistive loading of the OTA outputs. However, the source followers restrict the output swing of the OTA. In this work, a CT CMFB circuit which combines high input impedance and rail-to-rail output swing is utilized. Figure 5-2 shows the CMFB circuit which has been adopted from [97]. Since all the nodes in Figure 5-2 except the output node vCmfb2 are low-impedance nodes,
the CMFB circuit does not create low-frequency poles which can degrade the stability of the OTA. Using resistive feedback, the OTA is set in closed-loop configuration with a gain of 1. Figure 5-3 plots the output voltages and output common-mode error voltage for the full range of differential input voltages. From Fig. 5-3, it is seen that the worst output common-mode error voltage is only 20.9 mV. The CMFB circuit consumes only 295 nW.

### 5.2.4 Simulation Results for the OTA

For the simulations, a total of 30 corners combining all process corners and temperature limits $[-20^\circ C, +85^\circ C]$ were utilized. The performance of the OTA over PT corners is summarized in Table 5-1.
For simulating the linearity and settling time, resistive feedback is employed to set the OTA with a closed-loop gain of 1. Linearity of the OTA was simulated using differential sine wave inputs. For an input amplitude of 800 mV p-p (single-ended) and frequency of 10 kHz, the differential input, output and output common-mode level are shown in Fig. 5-4. The DFT plot of the differential output of the OTA is shown in Fig. 5-5. The third-harmonic (HD3) is 80.4 dB below the fundamental. Even with an input amplitude of 900 mV p-p (90% of the full-range), the OTA achieves a THD of −59 dB. For 1 V p-p (single-ended), 10 kHz inputs (full range) and closed-loop gain of 1, the OTA achieves a differential output voltage of 1.96 V p-p indicating a closed-loop gain error of only 2% thus confirming rail-to-rail operation. For 1.96 V p-p differential output voltage, the THD is −78 dB which is equivalent to 12.65 bits.
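
The gain-error and effective-bits figures quoted above can be cross-checked with a few lines of Python. This is only an illustrative sketch: the helper names are hypothetical, and the conversion from THD to bits assumes the usual $(|\text{THD}| - 1.76)/6.02$ rule, which the text does not state explicitly.

```python
def thd_to_bits(thd_db: float) -> float:
    """Equivalent linearity in bits, assuming the (|dB| - 1.76) / 6.02 rule."""
    return (abs(thd_db) - 1.76) / 6.02


def closed_loop_gain_error(v_out_pp: float, v_ideal_pp: float) -> float:
    """Relative shortfall of the measured output swing versus the ideal one."""
    return (v_ideal_pp - v_out_pp) / v_ideal_pp


# Values taken from the text: 1 V p-p single-ended input, closed-loop gain of 1.
ideal_diff_pp = 2 * 1.0          # two single-ended 1 V p-p signals -> 2 V p-p differential
measured_diff_pp = 1.96          # simulated differential output swing (V p-p)

gain_err = closed_loop_gain_error(measured_diff_pp, ideal_diff_pp)
bits = thd_to_bits(-78.0)

print(f"closed-loop gain error: {100 * gain_err:.1f} %")
print(f"THD of -78 dB corresponds to about {bits:.2f} bits")
```

With the rounded THD of −78 dB the rule gives roughly 12.7 bits, in line with the 12.65 bits quoted above.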
Figure 5-4: Differential input, output and output CM level of the OTA.

Figure 5-5: DFT of the differential output of the OTA.

Figure 5-6: THD vs. differential input voltage.

Figure 5-7: Full-scale pulse outputs of the OTA.
Table 5-2: Comparison to low-power OTAs.

<table>
<thead>
<tr>
<th>Specification</th>
<th>[98]</th>
<th>[37]</th>
<th>[99]</th>
<th>[100]</th>
<th>This work</th>
</tr>
</thead>
<tbody>
<tr>
<td>Technology (nm)</td>
<td>180</td>
<td>180</td>
<td>65</td>
<td>65</td>
<td>28</td>
</tr>
<tr>
<td>Supply voltage (V)</td>
<td>0.5</td>
<td>0.5</td>
<td>1</td>
<td>1.2</td>
<td>1</td>
</tr>
<tr>
<td>DC gain (dB)</td>
<td>65</td>
<td>72</td>
<td>56</td>
<td>67.2</td>
<td>70</td>
</tr>
<tr>
<td>Unity-gain freq. (MHz)</td>
<td>0.55</td>
<td>15</td>
<td>450</td>
<td>321.5</td>
<td>4.3</td>
</tr>
<tr>
<td>Power (µW)</td>
<td>28</td>
<td>100</td>
<td>1600</td>
<td>240</td>
<td>2.9</td>
</tr>
<tr>
<td>Output amp. for 1% HD₃</td>
<td>0.75FS</td>
<td>0.71FS</td>
<td>-</td>
<td>-</td>
<td>FS</td>
</tr>
<tr>
<td>Output clipping level</td>
<td>0.79FS</td>
<td>0.75FS</td>
<td>0.56FS</td>
<td>0.75FS</td>
<td>FS</td>
</tr>
<tr>
<td>Settling time (µs)</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>5.8</td>
</tr>
<tr>
<td>IFoMₛ (MHz × pF/mA)</td>
<td>196</td>
<td>1500</td>
<td>562</td>
<td>402</td>
<td>741</td>
</tr>
<tr>
<td>IFoMₗ (V/µs × pF/mA)</td>
<td>107</td>
<td>270</td>
<td>-</td>
<td>224</td>
<td>106</td>
</tr>
</tbody>
</table>

FS : Diff. full-scale voltage of the OTA

Table 5-2 compares this work with other low-power OTAs. The widely used small-signal FoM (IFoMₛ) and large-signal FoM (IFoMₗ) indicate that this work achieves very competitive performance. The proposed OTA provides the highest linearity with rail-to-rail operation among the compared works while consuming the least power.

5.3 An Ultra-Low-Voltage OTA in 28 nm UTBB FDSOI CMOS

The proliferation of wireless sensor networks and biomedical circuits which are powered by batteries and/or energy harvesting sources has made it imperative for analog and mixed-signal circuits to adopt ultra-low supply voltages (< 0.6 V). Since energy-harvesting sources such as piezoelectric transducers often generate output power in the range of tens of µW, minimizing power consumption of the load circuits is paramount. In such scenarios, achieving sufficient analog performance in terms of gain, linearity and dynamic range poses a formidable challenge. Along with innovative circuit design techniques, advances in CMOS process technologies can be utilized to overcome these impediments. This work presents an ultra-low-voltage, sub-µW fully differential operational transconductance amplifier (OTA) designed in a 28 nm ultra-thin buried oxide (BOX) and body (UTBB) fully-depleted silicon-on-insulator (FDSOI) CMOS process. In this CMOS process, the BOX isolates the substrate from the drain and source and hence enables a wide range of body bias voltages. Forward body biasing (FBB), described in Section 2.3, is used extensively in this work to reduce the threshold voltage of the devices, boost the device transconductance ($g_{m}$) and improve the linearity. Under nominal process and temperature conditions at a supply voltage of 0.4 V, the OTA achieves −64 dB of total
harmonic distortion (THD) with 75% of the full scale output swing while consuming 785 nW. The two-stage OTA incorporates continuous-time common-mode feedback circuits (CMFB) and achieves DC gain = 72 dB, unity-gain frequency of 2.6 MHz and phase margin of 68°. Sufficient performance is maintained over process, supply voltage and temperature variations.

5.3.1 Ultra-low-voltage OTA Design

In order to support low supply voltages, pseudo-differential (PD) OTAs [101] and inverter-based subthreshold OTAs [102] have been proposed. Even though the PD OTA achieves wider signal swings by eliminating the tail current source transistor, it suffers from poor common-mode (CM) rejection unless a common-mode feedforward (CMFF) circuit is included [101]. Figure 5-8 shows the CM gain of a PD OTA with and without CMFF. Without CMFF, the CM gain equals the differential-mode (DM) gain implying a CMRR = 0 dB. Adding the CMFF circuitry reduces the CM gain to unity at low frequencies. However, implementation of CMFF requires separate transconductors and/or additional current mirrors in order to cancel the input CM signal at the OTA outputs which increases the power consumption of the OTA. Additionally a common-mode feedback (CMFB) circuit is required to stabilize the output CM voltage of the PD OTA. The inverter-based OTA in [102] also utilizes a PD topology. However the output swing is limited to <100 mV. Although the choice of a feedforward ΣΔ modulator topology in [102] alleviates the requirements on the OTA output swing, such flexibility regarding the system architecture cannot be exercised in many scenarios. Absence of a dedicated bias circuit in the classic PD OTA renders its performance vulnerable to process, supply voltage and temperature (PVT) variations.

For a general-purpose OTA which can be employed in low-power, ultra-low-voltage analog front-ends, large signal swing, sufficient linearity and robustness to PVT variations have to be ensured. Adequate CMRR and PSRR are required for the OTA. If the OTA is used as a programmable gain amplifier (PGA), it should have very high input impedance so that the preceding instrumentation amplifier is not loaded. In such cases, a gate-input topology is preferable to a body-input one. The OTA should possess fast settling time to drive the capacitive load of the succeeding ADC. Based on these considerations, we have selected a conventional two-stage fully differential architecture for the OTA. The FBB feature in the 28 nm UTBB FDSOI CMOS process has been utilized to achieve sufficient performance for the OTA at \( V_{DD} = 0.4 \text{ V} \) and sub-\( \mu \text{W} \) power consumption.

5.3.2 OTA Architecture

The schematic of the proposed OTA including FBB is shown in Fig. 5-9. To attain high PSRR and minimize the error in the closed-loop gain, the OTA requires large DC gain. In order to maximize the dynamic range for \( V_{DD} = 0.4 \text{ V} \), output swing has to be maximized. The two-stage architecture helps to meet the requirements of high DC gain and large output swing. The Miller compensation capacitor $C_c$ and the zero-nulling resistor $R_z$ have been realized by a 200 fF MIM capacitor and 32 kΩ poly resistor respectively while $C_L = 500$ fF. In addition to stabilizing the output CM voltage of the OTA, the CMFB circuit is crucial in determining the output swing, DC gain and linearity of the OTA. CMFB for the first stage is implemented by $M_0$ NMOS devices which act as resistors. Since a switched-capacitor (SC) CMFB limits the maximum input signal frequency and requires generation of non-overlapped clocks, a continuous-time (CT) CMFB has been employed in the second stage. Figure 5-10 shows the CT CMFB adopted from [97] along with FBB voltages. The use of complementary source-followers to detect the output CM level enables rail-to-rail output swing without loading the OTA outputs. Using resistive feedback, the OTA is configured with a closed-loop gain of 1.

Figure 5-8: Impact of CMFF on the CM gain of a pseudo-differential OTA.

Figure 5-9: Schematic of the two-stage OTA with FBB.

From Fig. 5-11, it is seen that the worst output common-mode error voltage is $< 4$ mV. The second stage CMFB provides a CM loop gain of 77 dB and consumes 131 nW. The bias voltages $v_{BiasP}$ and $v_{BiasN}$ are generated by a constant-$g_{m}$ bias circuit which uses a 476 kΩ resistor. The bias circuit uses FBB of 2 V and $-2$ V for the NMOS and PMOS devices respectively and has $I_{bias} = 80$ nA. In a chip implementation, the FBB voltages can be generated using SC-based charge pump circuits [103]. The power savings accrued by the use of a circuit-wide ultra-low $V_{DD} = 0.4$ V will outweigh the costs imposed by the charge pumps.

### 5.3.3 Simulation Results

In order to facilitate a robust design, simulations for the OTA encompassed process corners for the MOS devices (TT, SS, FF, FS, SF), capacitors (Cmax, Cmin) and resistors (Rmax, Rmin) combined with ±10% supply voltage variation and a temperature range of $[-20^\circ C, +80^\circ C]$. The open-loop gain and phase plots of the OTA
are shown in Fig. 5-12. To simulate the THD, settling time, and slew-rate, resistive feedback was employed to set the OTA with a closed-loop gain of 1. Figure 5-13 plots the differential input, output and output CM voltages for a 10 kHz sinusoidal input with $V_{in,diff} = 640$ mV. The plot of THD and differential output voltage vs. the differential input voltage is shown in Fig. 5-14. It is seen from Fig. 5-14 that the OTA achieves THD $= -40.3$ dB even with a differential output voltage of 758 mV.
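
To put the last figure in perspective, the short sketch below converts the THD into a percentage and expresses the output voltage as a fraction of the differential full scale $FS = 2V_{DD}$. The variable names are illustrative only; relating the result to an HD3 figure additionally assumes that the third harmonic dominates the THD.

```python
V_DD = 0.4                 # supply voltage (V)
FS = 2 * V_DD              # differential full-scale output swing (V)

thd_db = -40.3             # simulated THD at the largest plotted output (dB)
v_out_diff = 0.758         # corresponding differential output voltage (V)

# Distortion amplitude relative to the fundamental, in percent.
distortion_pct = 100 * 10 ** (thd_db / 20)

# Output swing expressed as a fraction of the differential full scale.
swing_fraction = v_out_diff / FS

print(f"distortion: {distortion_pct:.2f} % of the fundamental")
print(f"output swing: {swing_fraction:.3f} x FS")
```

The distortion is thus just under 1% at roughly 0.95FS.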

Table 5-3 summarizes the OTA performance over PVT corners. The THD values provided in Table 5-3 were obtained for a differential output voltage which is 75% of the differential full-scale output swing $FS = 2V_{DD}$. Figure 5-15 shows the differential input and output voltages of the OTA with 10 kHz, full-scale pulse inputs. For $\pm 20\%$ variation on the load capacitance $C_L$, the OTA achieves a minimum phase margin of 58$^\circ$ over PVT corners indicating robust stability.

In order to estimate the CMRR, PSRR and input-referred offset voltage of the OTA, 250 Monte Carlo simulations with process variation and device mismatch were used. The mean, standard deviation and minimum values of the CMRR at 10 kHz are $\mu_{CMRR} = 69$ dB, $\sigma_{CMRR} = 8.8$ dB, and $CMRR_{min} = 49$ dB respectively. For the supply noise rejection from $V_{DD}$ at 10 kHz, $\mu_{PSRR} = 79$ dB, $\sigma_{PSRR} = 8.8$ dB and $PSRR_{min} = 55$ dB. The standard deviation of the input-referred offset is $\sigma_{offset} = 6$ mV. The flicker noise contributed by the NMOS devices $M_3$ in Fig. 5-9 is the main cause of increased input-referred noise in the OTA. To assess the flicker noise levels of the MOS devices in the FDSOI process, NMOS- and PMOS-input common-source (CS) amplifiers were designed in 28 nm FDSOI and 65 nm bulk CMOS processes. All four CS amplifiers used LVT devices, a current-mirror load with $I_{bias} = 1$ $\mu$A and $V_{DD} = 0.5$ V. Gate area ($W \times L$) for the input device was 0.28 $\mu$m$^2$ in each

Figure 5-12: Gain and phase plots of the OTA in open-loop.

Figure 5-13: Differential input, output and output CM voltages.
Figure 5-14: THD and differential output voltage vs. input voltage.

Table 5-3: Simulated performance of the OTA over PVT corners.

<table>
<thead>
<tr>
<th>Specification</th>
<th>Nominal PVT [TT, 0.4 V, +27°C]</th>
<th>Min. (PVT corners)</th>
<th>Max. (PVT corners)</th>
</tr>
</thead>
<tbody>
<tr>
<td>DC gain (dB)</td>
<td>73</td>
<td>57</td>
<td>78</td>
</tr>
<tr>
<td>Unity-gain frequency (MHz)</td>
<td>2.63</td>
<td>1.4</td>
<td>3.8</td>
</tr>
<tr>
<td>Phase margin (°)</td>
<td>68</td>
<td>61</td>
<td>75</td>
</tr>
<tr>
<td>Gain margin (dB)</td>
<td>−26.5</td>
<td>−52</td>
<td>−20</td>
</tr>
<tr>
<td>Power (µW)@ V_DD = 0.4 V</td>
<td>0.785</td>
<td>0.394</td>
<td>1.5</td>
</tr>
<tr>
<td>Output CM level (mV)</td>
<td>200</td>
<td>136</td>
<td>275</td>
</tr>
<tr>
<td>THD (dB)</td>
<td>−64</td>
<td>−74</td>
<td>−41</td>
</tr>
<tr>
<td>Settling time (µs) (10-bit accuracy)</td>
<td>10.9</td>
<td>6.2</td>
<td>23.8</td>
</tr>
<tr>
<td>Slew rate (V/µs)</td>
<td>0.5</td>
<td>0.22</td>
<td>0.55</td>
</tr>
<tr>
<td>Input-referred noise @ 10 kHz (µV/√Hz)</td>
<td>1.57</td>
<td>1.4</td>
<td>1.95</td>
</tr>
<tr>
<td>Input-referred noise @ 1 MHz (µV/√Hz)</td>
<td>1.3</td>
<td>1.2</td>
<td>1.7</td>
</tr>
</tbody>
</table>
case. For the CS amplifiers in the 28 nm FDSOI process, |FBB| = 500 mV was used for all devices. Figure 5-16 plots the flicker noise of the input devices for all four cases. As expected, the NMOS contributes higher flicker noise than the PMOS [104]. However, the flicker noise of MOS devices in the 28 nm FDSOI process is considerably higher than that in the 65 nm bulk CMOS process, which confirms the general trend that flicker noise worsens with CMOS process scaling [104]. The RMS value of the total input-referred noise of the proposed OTA integrated over a frequency range of [0.1 Hz, 1 kHz] is 463 µV. With a differential input signal amplitude of 640 mV, i.e. 0.8FS, the OTA achieves a signal-to-noise ratio of 54 dB which corresponds to 8.7 bits. Table 5-4 employs the widely used figures of merit (FoM) for OTAs [105] to compare this work with other ultra-low-voltage OTAs. This work achieves the second-best FoMₛ and IFoMₛ values while its large-signal FoM values are very competitive.
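
The SNR figure can be reproduced with the sketch below. It assumes that the 640 mV amplitude is the peak-to-peak differential value of a sinusoid (consistent with 0.8FS and $FS = 2V_{DD} = 0.8$ V) and that the bit equivalent follows the $(\text{SNR} - 1.76)/6.02$ rule; the function names are hypothetical.

```python
import math


def sine_rms_from_pp(v_pp: float) -> float:
    """RMS value of a sinusoid with peak-to-peak amplitude v_pp."""
    return v_pp / (2 * math.sqrt(2))


def snr_db(signal_rms: float, noise_rms: float) -> float:
    """Signal-to-noise ratio in dB from RMS signal and noise voltages."""
    return 20 * math.log10(signal_rms / noise_rms)


def snr_to_bits(snr: float) -> float:
    """Equivalent resolution, assuming the (SNR - 1.76) / 6.02 rule."""
    return (snr - 1.76) / 6.02


signal_rms = sine_rms_from_pp(0.640)   # 0.8 x FS, assumed peak-to-peak differential
noise_rms = 463e-6                     # integrated input-referred noise (V RMS)

snr = snr_db(signal_rms, noise_rms)
bits = snr_to_bits(snr)

print(f"SNR = {snr:.1f} dB, about {bits:.1f} bits")
```

Under these assumptions the result is close to 54 dB; the 8.7 bits quoted above follows when the SNR is first rounded to 54 dB.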
Figure 5-16: Flicker noise comparison for 28 nm FDSOI and 65 nm bulk CMOS.

Table 5-4: Comparison to ultra-low-voltage OTAs.

<table>
<thead>
<tr>
<th>Specification</th>
<th>[98]</th>
<th>[37]</th>
<th>[106]</th>
<th>[39]†</th>
<th>This work</th>
</tr>
</thead>
<tbody>
<tr>
<td>Technology (nm)</td>
<td>180</td>
<td>180</td>
<td>130</td>
<td>130</td>
<td>28</td>
</tr>
<tr>
<td>Supply voltage (V)</td>
<td>0.5</td>
<td>0.5</td>
<td>0.5</td>
<td>0.25</td>
<td>0.4</td>
</tr>
<tr>
<td>DC gain (dB)</td>
<td>65</td>
<td>72</td>
<td>51</td>
<td>67.2</td>
<td>73</td>
</tr>
<tr>
<td>Unity-gain freq. (MHz)</td>
<td>0.55</td>
<td>15</td>
<td>112</td>
<td>0.0019</td>
<td>2.63</td>
</tr>
<tr>
<td>Power (µW)</td>
<td>28</td>
<td>100</td>
<td>600</td>
<td>0.018</td>
<td>0.785</td>
</tr>
<tr>
<td>Output amp. for 1% HD3</td>
<td>0.75FS</td>
<td>0.71FS</td>
<td>-</td>
<td>-</td>
<td>0.95FS</td>
</tr>
<tr>
<td>Output clipping level</td>
<td>0.79FS</td>
<td>0.75FS</td>
<td>-</td>
<td>-</td>
<td>FS</td>
</tr>
<tr>
<td>FoMₛ (MHz × pF/mW)</td>
<td>393</td>
<td>3000</td>
<td>1213</td>
<td>450</td>
<td>1675</td>
</tr>
<tr>
<td>FoMₗ (V/µs × pF/mW)</td>
<td>214</td>
<td>540</td>
<td>236</td>
<td>533</td>
<td>318</td>
</tr>
<tr>
<td>IFoMₛ (MHz × pF/mA)</td>
<td>196</td>
<td>1500</td>
<td>607</td>
<td>113</td>
<td>670</td>
</tr>
<tr>
<td>IFoMₗ (V/µs × pF/mA)</td>
<td>107</td>
<td>270</td>
<td>118</td>
<td>133</td>
<td>127</td>
</tr>
</tbody>
</table>

FS: Diff. full-scale voltage of the OTA; †: Single-ended
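
As a cross-check of the "This work" column, the sketch below recomputes the four figures of merit from the nominal values reported earlier (unity-gain frequency 2.63 MHz, slew rate 0.5 V/µs, $C_L = 500$ fF, 0.785 µW at 0.4 V). The definitions used here (unity-gain frequency or slew rate times load capacitance, divided by power or supply current) are assumed to match those of [105]; the function names are illustrative.

```python
def fom_small_signal(f_ug_mhz: float, c_load_pf: float, power_mw: float) -> float:
    """Small-signal FoM in MHz x pF / mW (assumed definition)."""
    return f_ug_mhz * c_load_pf / power_mw


def fom_large_signal(slew_v_per_us: float, c_load_pf: float, power_mw: float) -> float:
    """Large-signal FoM in V/us x pF / mW (assumed definition)."""
    return slew_v_per_us * c_load_pf / power_mw


# Nominal values for this work, taken from the text and Table 5-3.
f_ug_mhz = 2.63
slew_v_per_us = 0.5
c_load_pf = 0.5
power_mw = 0.785e-3            # 0.785 uW
v_dd = 0.4
current_ma = power_mw / v_dd   # supply current in mA (P / V_DD)

fom_s = fom_small_signal(f_ug_mhz, c_load_pf, power_mw)
fom_l = fom_large_signal(slew_v_per_us, c_load_pf, power_mw)
# The current-based variants reuse the same formulas with current instead of power.
ifom_s = fom_small_signal(f_ug_mhz, c_load_pf, current_ma)
ifom_l = fom_large_signal(slew_v_per_us, c_load_pf, current_ma)

print(f"FoM_S  = {fom_s:.0f}, FoM_L  = {fom_l:.0f}")
print(f"IFoM_S = {ifom_s:.0f}, IFoM_L = {ifom_l:.0f}")
```

Note that the tabulated current-based values are reproduced when the supply current is expressed in mA.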
5.4 Reference Voltage Buffer for a 10-bit 1-MS/s SAR ADC

In an ADC, incorporating an on-chip reference voltage buffer (RVBuffer) reduces the need for external components thus leading to reduced chip-area and system complexity [107]. RVBuffers play a critical role in ensuring high performance of ADCs. Power efficiency, fast settling, low output noise and high PSRR are key specifications for the RVBuffer [108]. The majority of publications on SAR ADCs do not incorporate on-chip RVBuffers. In the works on SAR ADCs which include RVBuffers, attention is drawn mostly to the dominant power consumption of the block while other important design metrics are not discussed in depth [107], [12]. This work discusses the comprehensive performance requirements on the RVBuffer, coarse estimation of design parameters and choice of a suitable amplifier topology. Simulation results for a PMOS-input RVBuffer optimized for current consumption and area are provided.

5.4.1 Requirements on the RVBuffer

The block diagram of the SAR ADC which includes the RVBuffer is shown in Fig. 5-17. In Fig. 5-17, $V_{\text{refHi}} = 0.5$ V. The hybrid DAC shown in Fig. 5-17 is the combination of a binary-weighted capacitor array and a C-2C ladder along with the necessary switches. It presents an equivalent capacitance of $16 \times C_u$ on each side where $C_u = 25$ fF is the unit capacitance. Since the SAR ADC employs a fully-differential architecture, the total load capacitance connected to the buffer output ($V_{\text{refHi}}$) during conversion will be $16 \times C_u$.

The main function of the RVBuffer is to charge the DAC capacitors to $V_{\text{refHi}}$ within the specified time. For the 10-bit 1-MS/s SAR ADC, the internal clock frequency is 10 MHz, i.e. the clock period is 100 ns. Ideally 50% of the clock period is available for DAC settling. Allocating 50% of the clock period for comparator evaluation and 10% of the clock period for the delay of the digital logic following the comparator, the remaining 40 ns is available for DAC settling.
Estimating the impact of layout and interconnect parasitics on the settling time is not straightforward. To account for this effect during schematic design phase itself, a settling-time requirement of 25 ns is placed on the RVBuffer. The reference voltage needs to settle within LSB/2 of the nominal voltage to avoid conversion errors in the ADC. In this case also, a design margin is provided and 13-bit settling accuracy for the RVBuffer is targeted.
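
The timing budget above can be written out as a small sketch. The last two quantities go beyond what the text states: they assume first-order (single-pole) exponential settling and take "N-bit accuracy" to mean a residual error of $2^{-N}$ of the step, so they are only a rough guide. All names are illustrative.

```python
import math

f_clk = 10e6                    # internal SAR clock frequency (Hz)
t_clk = 1 / f_clk               # clock period: 100 ns

t_comparator = 0.5 * t_clk      # share allocated to comparator evaluation
t_logic = 0.1 * t_clk           # share allocated to the digital logic delay
t_dac = t_clk - t_comparator - t_logic   # what remains for DAC settling

t_target = 25e-9                # tightened settling-time requirement (s)
n_bits = 13                     # targeted settling accuracy (bits)

# Assumption: single-pole settling, residual error exp(-t / tau) <= 2**-n_bits.
n_tau = n_bits * math.log(2)    # number of time constants needed
tau_max = t_target / n_tau      # largest admissible time constant

print(f"available for DAC settling: {t_dac * 1e9:.0f} ns")
print(f"time constants for {n_bits}-bit settling: {n_tau:.2f}")
print(f"maximum time constant: {tau_max * 1e9:.2f} ns")
```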

In an embedded ADC, the power-supply lines can suffer significant noise spikes due to the presence of oscillators and fast-switching digital blocks. The power-supply noise will perturb the output of the RVBuffer resulting in incorrect DAC estimation and consequent performance degradation. It can also result in harmonic distortion [109]. The RVBuffer must provide sufficient attenuation for the noise injected by the power-supply line, which demands a high PSRR from low-frequency up to the largest system clock frequency. The noise level at the RVBuffer output due to the power-supply must be less than LSB/2. For this fully-differential ADC with $V_{ref,hi} = 0.5 \text{ V}$ and $V_{ref,lo} = 0 \text{ V}$, $\text{LSB}/2 = 488 \mu \text{V}$. In this work, the output of the RVBuffer is referred to ground and hence only PSRRp of the positive supply, namely $V_{DD}$, is of relevance. To guarantee an attenuation factor of 100 for the power-supply noise appearing at the RVBuffer output, a minimum PSRRp of 40 dB from 10 kHz (low-frequency) to the system clock frequency of 10 MHz is required for the RVBuffer. Output noise of the RVBuffer is directly superimposed on the reference voltage. The output noise of the RVBuffer must be limited to a value less than the ADC quantization noise. For a fully-differential, 10-bit ADC with 0.5 V reference, the quantization noise level is $282 \mu \text{V}$ (RMS). In this work, the output noise of the RVBuffer integrated over the frequency range of 0.1 Hz to 20 MHz must be less than $140 \mu \text{V}$ (RMS). The output noise consists of flicker and thermal noise components and must be integrated over the entire band of interest.
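
The numerical limits in this paragraph follow directly from the ADC parameters, as the sketch below shows. It assumes the standard uniform-quantizer noise expression LSB/$\sqrt{12}$ and a differential full scale of twice the reference span; variable names are illustrative.

```python
import math

n_bits = 10
v_ref_hi = 0.5                  # V
v_ref_lo = 0.0                  # V

# Fully differential ADC: the full scale spans twice the reference range.
full_scale = 2 * (v_ref_hi - v_ref_lo)
lsb = full_scale / 2 ** n_bits
half_lsb = lsb / 2

# RMS quantization noise of an ideal uniform quantizer.
q_noise_rms = lsb / math.sqrt(12)

# PSRR needed to attenuate supply noise by a factor of 100.
attenuation = 100
psrr_min_db = 20 * math.log10(attenuation)

print(f"LSB/2 = {half_lsb * 1e6:.0f} uV")
print(f"quantization noise = {q_noise_rms * 1e6:.0f} uV (RMS)")
print(f"minimum PSRR = {psrr_min_db:.0f} dB")
```

The 140 µV (RMS) output-noise limit is therefore about half the quantization noise.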

For the RVBuffer, load capacitance variation during different operational phases can cause instability or oscillations. The SAR ADC in Fig. 5-17 incorporates a calibration mode. During self-calibration of the ADC, only a single DAC branch is connected to the RVBuffer output presenting a capacitive load as small as $C_u$. Hence sufficient load capacitance must be switched onto the buffer output in order to ensure adequate phase margin (PM). Power-down mode to reduce power consumption must be incorporated. Current consumption and area should be optimized. The buffer should maintain reliable performance over Process-Voltage-Temperature (PVT) and mismatch corners and a variation of $\pm20\%$ on $C_u$. In the PVT corner simulations, supply voltage variation of $\pm10\%$ and a temperature range from $-20^\circ\text{C}$ to $+80^\circ\text{C}$ are utilized.

### 5.4.2 OTA Topology and Simulation Results

Fast settling requires large unity-gain frequency $f_{ug}$ and a single-pole settling behaviour of the opamp while accurate settling requires high DC gain [110]. Though
multi-stage amplifiers can achieve high gains, the presence of additional poles necessitates frequency compensation for closed-loop stability. In such cases, the $f_{ug}$ is limited by the compensation capacitance. Also power consumption is higher due to the bias currents flowing in the different stages. Single-stage amplifiers are inherently stable and thus require no compensation. A symmetric (or current-mirror) OTA is a popular architecture to achieve high $f_{ug}$ and slew-rate. It is a single-stage amplifier with the high-impedance node at the output. Cascoding is a popular technique to enhance DC gain of opamps without degrading the high-frequency performance. In order to achieve fast-settling and sufficient DC gain, a symmetric OTA with cascodes [46] was selected as the topology for the RVBuffer. The PMOS-input RVBuffer shown in Fig. 5-18 was designed to buffer 0.5 V. The bias current for the RVBuffer is derived from a band-gap reference. Wide-swing cascode current mirrors are used to generate the voltages $V_{CascP}$ and $V_{CascN}$. An approach similar to that explained in Section 4.4.1.1 was used to estimate the minimum output current $I_{out,min}$, $f_{ug}$ and open-loop DC gain $A_0$ of the OTA. The resulting values are $I_{out,min} = 48 \mu A$, $f_{ug} = 63.7$ MHz and $A_0 = 60$ dB which were used to guide the design of the OTA.
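
The estimation procedure of Section 4.4.1.1 is not repeated here, but the $f_{ug}$ estimate can be sanity-checked against the 25 ns, 13-bit settling target. The sketch assumes a unity-gain buffer with single-pole behaviour, so that the closed-loop time constant is $1/(2\pi f_{ug})$, and it ignores slewing; these are simplifying assumptions, and the names are illustrative.

```python
import math

f_ug = 63.7e6          # estimated unity-gain frequency (Hz)
t_settle = 25e-9       # settling-time requirement (s)
n_bits = 13            # targeted settling accuracy (bits)

# Assumption: single-pole unity-gain buffer, closed-loop time constant 1/(2*pi*f_ug).
tau = 1 / (2 * math.pi * f_ug)
n_tau = t_settle / tau

# Residual relative error after t_settle of purely linear settling.
residual = math.exp(-n_tau)
target = 2.0 ** -n_bits

print(f"tau = {tau * 1e9:.2f} ns, i.e. {n_tau:.1f} time constants in 25 ns")
print(f"residual error {residual:.2e} vs. 13-bit target {target:.2e}")
```

Under these assumptions the estimate corresponds to about ten time constants within the settling window, which leaves some margin over the 13-bit target.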

The stability of the RVBuffer was simulated under two conditions

- During ADC operation, where the DAC capacitance forms the load of the buffer
- During self-calibration of the ADC, when the DAC is disconnected from the buffer

In a single-stage amplifier, the dominant pole is determined by the load capacitance at the output node. In the symmetric OTA, a non-dominant pole occurs at the mirroring nodes (nodes 1, 2) in Fig. 5-18. The capacitance at these nodes can be significant.
For example, the parasitic capacitance at node 1 of Fig. 5-18 is given by

\[ C_{p1} = C_{db,M_1} + C_{gs,M_3} + B \cdot C_{gs,M_3} \tag{5.1} \]

since \( W_{M_6} = W_{M_5} \cdot B \). Hence careful device sizing and prudent choice of the mirroring-ratio \( B \) were made to ensure that the non-dominant pole does not degrade stability. The gain-phase plot for the buffer amplifier in conversion mode, for nominal PVT corner, is shown in Fig. 5-19. Reducing/removing the load capacitance also imperils stability. During self-calibration of the ADC, since most of the capacitive load posed by the DAC is disconnected from the buffer output, an NMOS capacitance is connected to the buffer output through a transmission-gate switch to guarantee sufficient PM. The choice of the NMOS capacitor assumes significance since a PMOS capacitor will lead to degradation of PSRRp. The simulated performance of the RVBuffer with the NMOS capacitor load is given in Table 5-5.

The settling time of the buffer for the nominal PVT corner is 16.1 ns. For the RVBuffer in Fig. 5-18, PSRRp is determined largely by the output resistance and the parasitic capacitance \( C_{db} \) of the tail-current source \( M_5 \) and the mismatch between the input differential pair. Careful choice of device sizes and bias currents enabled the buffer to maintain the required PSRRp without degrading other performance metrics.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>DC gain</td>
<td>65 dB</td>
</tr>
<tr>
<td>Unity-gain frequency</td>
<td>70 MHz</td>
</tr>
<tr>
<td>Phase margin</td>
<td>64°</td>
</tr>
<tr>
<td>Settling time</td>
<td>19.3 ns</td>
</tr>
<tr>
<td>PSRRp @ 10 kHz</td>
<td>72 dB</td>
</tr>
<tr>
<td>PSRRp @ 10 MHz</td>
<td>48 dB</td>
</tr>
<tr>
<td>Output noise</td>
<td>132 $\mu$V (RMS)</td>
</tr>
<tr>
<td>Current (max)</td>
<td>66 $\mu$A @ 1.8 V</td>
</tr>
<tr>
<td>Current (power-down)</td>
<td>94 pA</td>
</tr>
<tr>
<td>Start-up time</td>
<td>64.6 ns</td>
</tr>
<tr>
<td>Area</td>
<td>19.2 $\mu$m x 19.2 $\mu$m</td>
</tr>
</tbody>
</table>

The integrated output noise of the buffer consists of thermal noise and flicker noise components. Proper choice of device sizes helped to alleviate flicker noise contribution while the bias currents through the different branches of the buffer were selected so as to reduce the thermal noise contribution.

The RVBuffer supports a power-down mode where the buffer is turned OFF by a control signal. In this mode, the devices $M_3$ to $M_{13}$ in Fig. 5-18 are turned OFF by NMOS/PMOS switches. Two performance criteria for the RVBuffer in the power-down mode are current consumption and start-up time. Since the 180-nm PDK lacks high-threshold voltage devices, leakage through the switches causes a minor current flow in the power-down mode. The start-up time is defined as the time required for the buffer output to settle to within 13-bit accuracy of its nominal output voltage once the power-down signal has been disabled. To determine the worst-case start-up time, the buffer output was initialized to 0 V in the simulation set-up before enabling the buffer. For each performance parameter, the worst-case value among all simulations (PVT corners, Monte-Carlo included) is provided in Table 5-6. From Table 5-6, it is clear that the buffer satisfies the settling time requirement with a good margin. It is possible to reduce current consumption further while maintaining settling performance within the specified limit. But such a reduction of current causes an unacceptable increase in the noise level, indicating that the noise specification sets the lower limit on power consumption. The value for the maximum current consumption in Table 5-6 differs from that computed in Section 5.4.2. There are multiple reasons for this discrepancy:

- The necessity to maintain performance over stringent PVT, mismatch corners and load variations entails larger current consumption.
- The minimum output current was calculated for a simple CMOS OTA [46] with five devices where the entire bias current of the amplifier flows to the output in the slewing regime. The symmetric OTA architecture entails higher current consumption during slewing.
- Noise analysis is often non-trivial and depends on amplifier topology, device sizing and bias currents. During simulations, it was found that the noise specification presented a limit on the minimum current consumption.

5.4.3 Re-design of the RVBuffer

The RVBuffer shown in Fig. 5-18 was re-designed to buffer voltages in the range [150 mV, 600 mV]. Lowering the input voltage, $V_{\text{refHi}}$, will cause the transistors $M_7$ and $M_9$ to go into the linear region, thus unacceptably degrading the performance. Low gate overdrive for the transistors $M_6, M_7, M_8$ and $M_9$ is required in order to avoid this issue, which in turn requires these devices to have large sizes (large $W/L$). To drive a large capacitive load of 3.7 pF while keeping the settling time below 40 ns, the current-mirror factor $B$ was chosen to be unequal for the two branches in Fig. 5-18. A mirror factor of $B = 1$ was used for the left branch while $B = 6$ was used for the right branch since this branch directly charges the load capacitance. Thus the chosen amplifier topology helps to achieve faster settling without excessive current consumption by optimizing the currents through the different branches of the amplifier. The layout of the redesigned PMOS-input RVBuffer is shown in Fig. 5-20. To simplify routing on the top level, only metal layers $M_1$ and $M_2$ have been used in the layout of the buffer. A symmetric arrangement of the transistors and dummy devices were used to enhance matching. For debugging purposes, current programmability has been incorporated in the RVBuffer. Two control bits are used to select four different values for the tail current. This is accomplished by connecting/disconnecting multiples of the tail current source device $M_5$ in Fig. 5-18.
5.5 Frequency Compensation of a Three-Stage OTA in 40 nm CMOS

In advanced CMOS process nodes with low supply voltages (≤ 1.2 V), cascoding to achieve higher gain in OTAs is no longer feasible. Hence, cascading multiple stages to achieve high gain emerges as a promising design paradigm. Additional amplifier stages, however, introduce poles and zeros that degrade stability and make advanced frequency compensation indispensable. Several compensation schemes such as Nested Miller Compensation [111] (NMC), Nested Gm-C Compensation [112] (NGCC), Active Feedback Frequency Compensation [113] (AFFC) and many others have been proposed. A majority of the publications on multi-stage amplifier compensation schemes target low-speed, high-capacitive load applications where the GBW is less than 10 MHz. However, detailed comparison of compensation schemes with respect to the amplifier specifications such as phase margin, unity-gain frequency \( f_{ug} \) and DC gain is often not available for high-speed amplifiers in deep submicron CMOS processes. In this work, the nested Miller compensation with nulling resistor (NMCNR) [114] and reversed nested indirect compensation (RNIC) [115] techniques are compared using transistor schematic level simulations for a three-stage OTA designed in 1.1 V, 40 nm CMOS. The simulation results illustrate that the RNIC scheme achieves much higher phase margin (PM) and unity-gain frequency for lower values of compensation capacitance compared to NMCNR.

5.5.1 RNIC Stabilization

Saxena et al. proposed a compensation technique which utilizes split-length transistors to create a low-impedance node for connecting the Miller capacitor [45]. The compensation scheme does not require an embedded cascode in the input pair to create a low-impedance node. A split-length device achieves the same topology, in which one "half" of the transistor is always in the triode region with a very low voltage drop across it. Therefore, it is suitable for low-voltage implementation. Indirect feedback through the low-impedance node eliminates the RHP zero and improves the PM. The three-stage OTA, designed in a 1.1-V, 40-nm CMOS process, utilizes differential pairs for the first and second stages followed by a common-source amplifier acting as the third stage [115]. Use of a differential pair as the second stage simplifies biasing.

The schematic of the three-stage OTA with reversed nested indirect compensation (RNIC) is shown in Fig. 5-21. The split-length differential pair in the first stage creates the low-impedance nodes (fbl, fbr). RNIC prevents the output node from being loaded by the two compensation capacitors and helps achieve a higher \( f_{ug} \). In NMC, the third stage must always be inverting and the second stage must be non-inverting as described in Section 2.2.1.3. However, such restrictions do not apply in the RNIC scheme. The only criterion to be ensured is that the capacitors must be connected across two nodes which move in opposite directions to guarantee negative feedback. There are two compensation loops, each with a capacitor and series resistor. The transfer function of the three-stage OTA has two LHP zeros \( f_{z1}, f_{z2} \) and five poles \( f_{p1} \) to \( f_{p5} \) [115]. Of these, \( f_{p4} \) and \( f_{p5} \) are caused by the low-impedance nodes and lie at very high frequencies, while \( f_{p2}, f_{p3} \) can be canceled using the LHP zeros by properly choosing \( R_{1c} \) and \( R_{2c} \) [115]. Thus the addition of the series resistors helps to accomplish pole-zero cancellation and improves the PM. The amplifier can also be stabilized without using the series resistors for pole-zero cancellation, but fixing the location of the poles and zeros then proves more tedious and the resulting PM might be lower. In transistor schematic simulations, the three-stage OTA achieves a DC gain of 80 dB, \( f_{ug} = 770 \text{ MHz} \) and PM \( = 81^\circ \). The gain-phase plot of the three-stage OTA with RNIC is shown in Fig. 5-22. A small-signal equivalent model was set up to analyze the impact of RNIC on pole-zero locations and the results are provided in Table 5-7. It highlights the pole splitting and shows that the non-dominant poles \( f_{p2} \) and \( f_{p3} \) are located close to the zeros \( f_{z1} \) and \( f_{z2} \), respectively. Such an arrangement of non-dominant poles and LHP zeros is optimal for a low-power three-stage amplifier [115].
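
The effect of placing LHP zeros on top of the non-dominant poles can be illustrated with a small numerical sketch. The pole and zero frequencies below are hypothetical round numbers chosen only for illustration (they are not the values of Table 5-7), and the helper names are assumptions introduced here. The model is the usual rational transfer function built from real poles and zeros, and the search for the unity-gain frequency assumes that the magnitude crosses unity only once.

```python
import math


def loop_gain_mag(f, a0, poles, lhp_zeros=(), rhp_zeros=()):
    """Magnitude of a0 * prod(1 +/- jf/fz) / prod(1 + jf/fp) at f (Hz)."""
    mag = a0
    for fz in tuple(lhp_zeros) + tuple(rhp_zeros):
        mag *= math.hypot(1.0, f / fz)  # LHP and RHP zeros share the same magnitude
    for fp in poles:
        mag /= math.hypot(1.0, f / fp)
    return mag


def unity_gain_and_pm(a0, poles, lhp_zeros=(), rhp_zeros=()):
    """Return (f_ug in Hz, phase margin in degrees).

    Bisection on a logarithmic frequency axis; assumes a single unity crossing.
    """
    lo, hi = 1.0, 1e13
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if loop_gain_mag(mid, a0, poles, lhp_zeros, rhp_zeros) > 1.0:
            lo = mid
        else:
            hi = mid
    f_ug = math.sqrt(lo * hi)
    # Summing the individual phase contributions avoids phase wrapping.
    phase = sum(math.atan(f_ug / fz) for fz in lhp_zeros)   # LHP zeros add phase lead
    phase -= sum(math.atan(f_ug / fz) for fz in rhp_zeros)  # RHP zeros add phase lag
    phase -= sum(math.atan(f_ug / fp) for fp in poles)
    return f_ug, 180.0 + math.degrees(phase)


# Hypothetical three-pole amplifier: 80 dB DC gain, poles at 10 kHz, 50 MHz, 200 MHz.
A0 = 1e4
POLES = [1e4, 5e7, 2e8]

f_plain, pm_plain = unity_gain_and_pm(A0, POLES)
f_canc, pm_canc = unity_gain_and_pm(A0, POLES, lhp_zeros=[5e7, 2e8])

print(f"no zeros:        f_ug = {f_plain / 1e6:.1f} MHz, PM = {pm_plain:.1f} deg")
print(f"zeros on p2, p3: f_ug = {f_canc / 1e6:.1f} MHz, PM = {pm_canc:.1f} deg")
```

In this idealized case the cancellation is exact, so the response collapses to a single-pole roll-off. In a real design the poles and zeros only lie close to each other, which is why the simulated PM stays below 90°.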

5.5.2 NMCNR Stabilization

The three-stage OTA was compensated using NMCNR. Since the feedforward current through the compensation capacitor creates a RHP zero, the PM is degraded. The locations of the poles and zeros before and after NMCNR, obtained using the small-signal equivalent model of the OTA, are shown in Table 5-8. \( f_{z2} \) is a RHP zero which degrades the phase margin. The gain-phase plot is shown in Fig. 5-22. A comparison of the two frequency compensation schemes is summarized in Table 5-9. The load capacitance for the three-stage OTA is 1 pF and \( V_{DD} = 1.1 \) V. \( C_t \) denotes the total compensation capacitance. It is evident from Table 5-9 that the RNIC scheme achieves a higher \( f_{ug} \) as well as PM, while requiring a lower value of compensation capacitance. Further, Table 5-9 also suggests that RNIC requires lower power compared to NMCNR for the same target design specifications. The commonly used small-signal FoM, \( \text{FoM}_S \) [105], indicates the superior performance of the RNIC technique.

Figure 5-22: Gain-phase plots of the three-stage OTA.

Table 5-8: Pole-zero locations of the three-stage OTA with NMCNR.

<table>
<thead>
<tr>
<th>Comp. scheme</th>
<th>\( f_{p1} \)</th>
<th>\( f_{p2} \)</th>
<th>\( f_{p3} \)</th>
<th>\( f_{p4} \)</th>
<th>\( f_{z1} \)</th>
<th>\( f_{z2} \)</th>
</tr>
</thead>
<tbody>
<tr>
<td>None</td>
<td>2.4e7</td>
<td>3.4e8</td>
<td>8.3e8</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>2C2R</td>
<td>3.4e4</td>
<td>3.6e7</td>
<td>1.2e9</td>
<td>4.5e9</td>
<td>4e7</td>
<td>4.1e8*</td>
</tr>
</tbody>
</table>

Table 5-9: Comparison of NMCNR and RNIC schemes.

<table>
<thead>
<tr>
<th>Comp. scheme</th>
<th>\( C_t \) (pF)</th>
<th>\( f_{ug} \) (MHz)</th>
<th>PM (°)</th>
<th>Gain (dB)</th>
<th>Power (mW)</th>
<th>\( \text{FoM}_S \) (MHz·pF/mW)</th>
</tr>
</thead>
<tbody>
<tr>
<td>NMCNR</td>
<td>8</td>
<td>396</td>
<td>64</td>
<td>75</td>
<td>3</td>
<td>132</td>
</tr>
<tr>
<td>RNIC</td>
<td>4.5</td>
<td>770</td>
<td>81</td>
<td>80</td>
<td>2.6</td>
<td>296</td>
</tr>
</tbody>
</table>
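
The \( \text{FoM}_S \) column of Table 5-9 can be cross-checked with a few lines of Python. This sketch assumes \( \text{FoM}_S = f_{ug} \cdot C_L / P \), which is consistent with the unit MHz·pF/mW used in the table, and takes the 1 pF load capacitance stated in the text. The function and variable names are introduced here for illustration only.

```python
def fom_s(f_ug_mhz, c_load_pf, power_mw):
    """Small-signal figure of merit in MHz*pF/mW (assumed form: f_ug * C_L / P)."""
    return f_ug_mhz * c_load_pf / power_mw


C_LOAD_PF = 1.0  # load capacitance of the three-stage OTA

# (f_ug in MHz, power in mW) taken from Table 5-9
schemes = {
    "NMCNR": (396.0, 3.0),
    "RNIC": (770.0, 2.6),
}

fom = {name: fom_s(f_ug, C_LOAD_PF, power) for name, (f_ug, power) in schemes.items()}

for name, value in fom.items():
    print(f"{name}: FoM_S = {value:.0f} MHz*pF/mW")
```

Under this assumption the computed values round to the tabulated 132 and 296 MHz·pF/mW.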
5.6 A Receiver AFE for Capacitive Body-Coupled Communication

Utilizing the human body as a communication channel is an emerging paradigm in body area networks (BAN). In this work, an AFE which satisfies the gain and noise requirements for a 10 Mbps, capacitive coupled body channel receiver has been designed in a 40 nm CMOS process.

5.6.1 Requirements on the AFE

The transceiver architecture based on digital baseband transmission is shown in Fig. 5-23. It uses the Manchester data encoding scheme with a data rate of 10 Mbps. The Manchester encoding scheme eliminates the need for separate transmission of a clock signal. This transceiver architecture has an inherent advantage of lower power consumption as it does not require an ADC. The gain requirement, i.e. the overall attenuation of the signal to be recovered in the receiver, depends on multiple factors such as the maximum distance travelled, lower supply voltage, lower value of the coupling capacitors due to a thicker stratum corneum and the requirement of a single signal electrode. The large amount of attenuation due to these factors (typically 60-80 dB) is compensated by ac-coupled multi-stage amplifiers with a total gain of 60 dB. From digital communication theory, the noise power spectral density for an AWGN channel with the Manchester encoding scheme can be derived. The minimum Euclidean distance of a signal constellation for unipolar Manchester codes in terms of signal energy can be approximated by

\[ d = \sqrt{2T \cdot V_p^2}, \] (5.2)

where \( d \) is expressed on a normalized scale, \( T \) is the pulse period and \( V_p \) is the peak voltage. The probability of error, \( \rho \), can be expressed using the Gaussian \( Q \)-function as

\[ \rho = Q \left( \frac{d}{\sqrt{2N_0}} \right), \] (5.3)

where \( N_0 \) is the noise power spectral density. Assuming a relatively constant body channel attenuation \( R \) (power/energy scale) in the 5-10 MHz band, \( d \) is modified to \( d' = \sqrt{R} \cdot d \) in (5.3), giving
\[ \rho = Q \left( \frac{d'}{\sqrt{2N_0}} \right) = Q \left( \frac{\sqrt{T \cdot V_p^2 \cdot R}}{\sqrt{N_0}} \right). \] (5.4)

Figure 5-23: Block diagram of the transceiver.

Therefore, the noise power spectral density from (5.4) is
\[ N_0 = \frac{T \cdot V_p^2 \cdot R}{(Q^{-1}(\rho))^2}. \] (5.5)

Assuming a modest bit error rate of 0.1\% (\( \rho = 10^{-3} \)), \( T = 1/(10 \text{ MHz}) = 100 \text{ ns} \), \( V_p = 0.5 \text{ V} \) (half the supply voltage) and a high attenuation of \( R = 10^{-8} \), we get a noise power spectral density of
\[ N_0 = 26 \text{ aV}^2/\text{Hz}, \quad \text{i.e.} \quad \sqrt{N_0} \approx 5 \text{ nV}/\sqrt{\text{Hz}}. \] (5.6)
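
The numbers in (5.6) can be reproduced by evaluating (5.5) directly. The sketch below uses only the Python standard library; \( Q^{-1} \) is obtained from the inverse CDF of the standard normal distribution, since \( Q(x) = 1 - \Phi(x) \). The function names are introduced here for illustration.

```python
import math
from statistics import NormalDist


def q_inverse(p):
    """Inverse of the Gaussian Q-function, Q(x) = 1 - Phi(x)."""
    return NormalDist().inv_cdf(1.0 - p)


def noise_psd(t_pulse, v_peak, attenuation, ber):
    """Noise power spectral density N0 in V^2/Hz according to (5.5)."""
    return t_pulse * v_peak**2 * attenuation / q_inverse(ber) ** 2


n0 = noise_psd(t_pulse=1 / 10e6, v_peak=0.5, attenuation=1e-8, ber=1e-3)
noise_density = math.sqrt(n0)  # V/sqrt(Hz)

print(f"N0       = {n0 / 1e-18:.1f} aV^2/Hz")
print(f"sqrt(N0) = {noise_density / 1e-9:.2f} nV/sqrt(Hz)")
```

Because \( N_0 \) scales linearly with \( R \), every additional 10 dB of channel attenuation tightens the noise density requirement by a factor of \( \sqrt{10} \).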

5.6.2 AFE Architecture

Digital pulses, when transmitted through the body, result in narrow wideband spikes at the receiver input [96]. The receiver AFE should recover the digital pulses. In order to accomplish this, the AFE should achieve the noise and gain specification given in Section 5.6.1. The AFE architecture described in this work consists of an amplifier chain followed by a Schmitt trigger [96]. Each amplifier uses a single-ended two-stage OTA and has a closed-loop gain of 20 dB. The single-ended OTA is designed in a 1.1 V, 40-nm CMOS technology. In such a process node, the low supply voltage makes it difficult to achieve high gain by cascoding of devices. Therefore, a two-stage OTA has been utilized which provides large output swing as well as moderately high gain. However, frequency compensation is essential in order to stabilize the OTA. Compared to the traditional Miller compensation, indirect compensation helps to achieve larger unity-gain frequencies at lower power [45]. The two-stage OTA used in this work is shown in Fig. 5-24.

PMOS transistors have been used for the input differential pair to lower the flicker noise and optimize the unity-gain frequency [116]. The split-length current mirror load (SLCL) topology [45] has been utilized to create low-impedance nodes (nodes A, B in Fig. 5-24) for indirect compensation. This technique helps to eliminate the right-half plane (RHP) zero which degrades phase margin. Only a left-half plane (LHP) zero occurs which enhances the phase margin. In the traditional Miller compensation topology for a two-stage OTA, the RHP zero warrants the use of a nulling resistor to either eliminate the zero or move it beyond the unity-gain frequency [116]. The SLCL topology obviates the need for the nulling resistor and improves performance over PVT corners. The biasing circuit for the OTA is the beta-multiplier circuit combined with a differential amplifier to reduce the sensitivity of the bias current to supply voltage variations [82]. The open-loop gain-phase plot of the OTA is
shown in Fig. 5-25. The FoMs for an OTA which are independent of supply voltage are $\text{IFoM}_S = (\text{GBW} \cdot \text{C}_L)/I_{dd}$ and $\text{IFoM}_L = (\text{SR} \cdot \text{C}_L)/I_{dd}$ [117]. The simulated performance of the OTA is summarized in Table 5-10. The output of the amplifier chain is provided to a Schmitt trigger [82] which recovers the digital pulses.
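
As a rough plausibility check, the two current-based FoMs can be recomputed from the rounded entries of Table 5-10. This sketch makes three assumptions that are not stated in the text: the supply current is taken as \( I_{dd} = P / V_{DD} \), the unity-gain frequency is used as the GBW, and the slew rate is taken as the average of the rise and fall values. The variable names are illustrative.

```python
VDD = 1.1          # supply voltage in V
POWER_MW = 2.0     # rounded power from Table 5-10
F_UG_MHZ = 979.0   # unity-gain frequency, used here as GBW (assumption)
C_LOAD_PF = 5.0    # load capacitance
SR_RISE = 100.0    # V/us
SR_FALL = 426.0    # V/us

i_dd_ma = POWER_MW / VDD            # assumption: I_dd = P / V_DD
sr_avg = 0.5 * (SR_RISE + SR_FALL)  # assumption: average of rise and fall slew rates

ifom_s = F_UG_MHZ * C_LOAD_PF / i_dd_ma  # (MHz * pF) / mA
ifom_l = sr_avg * C_LOAD_PF / i_dd_ma    # (V/us * pF) / mA

print(f"I_dd   = {i_dd_ma:.2f} mA")
print(f"IFoM_S = {ifom_s:.0f} (MHz*pF)/mA")
print(f"IFoM_L = {ifom_l:.0f} (V/us*pF)/mA")
```

With these assumptions the results land within a few percent of the tabulated 2646 and 711, a gap that is compatible with the power figure being rounded to 2 mW.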

5.6.3 AFE Topologies and Simulation Results

Three different AFE topologies, namely resistive feedback, capacitive feedback and capacitive feedback with SC bias, were designed and simulated. In the resistive-feedback AFE, each amplifier uses resistive feedback to achieve the required closed-loop gain as shown in Fig. 5-26. For a non-inverting amplifier with $R_{in}$ and $R_f$ representing the input and feedback resistances, the low-frequency closed-loop gain is given by [76]

$$A_{CL} = \left( 1 + \frac{R_f}{R_{in}} \right) \cdot \frac{|LG|}{1 + |LG|} \tag{5.7}$$

where $LG = A_0 \beta$, $A_0$ is the open-loop DC gain of the OTA and $\beta$ is the feedback factor. For each non-inverting amplifier stage in Fig. 5-26, $R_f/R_{in} = 9$ to attain a closed-loop gain of 20 dB. Consequently $\beta = 1/10$. Substituting $\beta = 0.1$ and the value of $A_0$ from Table 5-10 results in a gain error factor of 0.94 on the ideal closed-loop gain. The simulated total closed-loop gain achieved for the cascade is 58 dB for a pulse input at 10 MHz. The resistors add a significant amount of thermal noise, resulting in a total input-referred noise (RMS) of 41 $\mu$V.

In the capacitive-feedback AFE, capacitors are used to achieve feedback in each amplifier [118] as shown in Fig. 5-27. The capacitors help to reduce the input-referred noise of the amplifier chain compared to the resistive-feedback topology. However, this topology requires a resistor in each amplifier for DC biasing. The total input-referred noise (RMS) is 14 $\mu$V. The resistor in a constant-$g_m$ bias circuit can be replaced by a switched capacitor to attain increased robustness against process and temperature variations [118]. In the third AFE topology, the bias resistor in the capacitive-feedback amplifier was replaced with a switched capacitor as shown in Fig. 5-28. The SC circuit is driven by two non-overlapping clock phases $\phi_1$ and $\phi_2$.

Figure 5-25: Gain-phase plot of the two-stage OTA.

Table 5-10: Two-stage OTA performance summary.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Supply voltage</td>
<td>1.1 V</td>
</tr>
<tr>
<td>Process node</td>
<td>40-nm CMOS</td>
</tr>
<tr>
<td>DC gain</td>
<td>44.8 dB</td>
</tr>
<tr>
<td>Unity-gain frequency</td>
<td>979 MHz</td>
</tr>
<tr>
<td>Phase margin</td>
<td>61°</td>
</tr>
<tr>
<td>Power</td>
<td>2 mW</td>
</tr>
<tr>
<td>Slew rate (rise, fall)</td>
<td>(100, 426) V/µs</td>
</tr>
<tr>
<td>Load capacitance</td>
<td>5 pF</td>
</tr>
<tr>
<td>Input-referred noise PSD (10 MHz)</td>
<td>4.3 nV/√Hz</td>
</tr>
<tr>
<td>Input-referred noise (5 to 15 MHz)</td>
<td>13.7 µV RMS</td>
</tr>
<tr>
<td>PSRRp, PSRRn</td>
<td>(45.1, 47.8) dB</td>
</tr>
<tr>
<td>$\text{IFoM}_S$</td>
<td>2646 (MHz · pF)/mA</td>
</tr>
<tr>
<td>$\text{IFoM}_L$</td>
<td>711 (V/µs · pF)/mA</td>
</tr>
</tbody>
</table>
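
The gain error factor quoted for the resistive-feedback stage follows from the finite loop gain term $|LG|/(1 + |LG|)$ in (5.7) and can be checked numerically. The sketch below uses $A_0 = 44.8$ dB from Table 5-10 and $R_f/R_{in} = 9$. Treating the chain as three identical stages (3 × 20 dB = 60 dB) is an assumption made here, and inter-stage loading and frequency dependence are ignored. The function name is illustrative.

```python
import math


def closed_loop_gain(a0_db, rf_over_rin):
    """Return (ideal gain, gain error factor, actual gain) of a non-inverting stage."""
    a0 = 10 ** (a0_db / 20)     # open-loop DC gain as a linear ratio
    ideal = 1 + rf_over_rin     # ideal closed-loop gain
    beta = 1 / ideal            # feedback factor
    lg = a0 * beta              # loop gain
    error = lg / (1 + lg)       # finite-loop-gain error factor
    return ideal, error, ideal * error


ideal, error, actual = closed_loop_gain(a0_db=44.8, rf_over_rin=9.0)
stage_db = 20 * math.log10(actual)
cascade_db = 3 * stage_db  # assumption: three identical cascaded stages

print(f"gain error factor = {error:.3f}")
print(f"stage gain        = {stage_db:.2f} dB")
print(f"cascade gain      = {cascade_db:.1f} dB")
```

The error factor evaluates to about 0.946, in line with the value of 0.94 quoted in the text, and the resulting cascade gain of roughly 58.5 dB is close to the simulated 58 dB.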

Figure 5-26: Schematic of the resistive-feedback AFE.

Figure 5-27: Schematic of the capacitive-feedback AFE.

Figure 5-28: Schematic of the capacitive-feedback AFE with SC bias.

Table 5-11: Simulation results for the AFE configurations.

<table>
<thead>
<tr>
<th>AFE</th>
<th>Power</th>
<th>Input-noise</th>
<th>Gain</th>
</tr>
</thead>
<tbody>
<tr>
<td>Resistive-feedback</td>
<td>7.3 mW</td>
<td>13 nV/√Hz</td>
<td>58.6 dB</td>
</tr>
<tr>
<td>Capacitive-feedback</td>
<td>6.8 mW</td>
<td>4.4 nV/√Hz</td>
<td>57.6 dB</td>
</tr>
<tr>
<td>SC-bias</td>
<td>6.8 mW</td>
<td>4.9 nV/√Hz</td>
<td>57.2 dB</td>
</tr>
</tbody>
</table>

The clock \( \phi_2 \) has a large duty cycle and is used to charge the capacitor to the input common-mode level \( v_{CM} \). The clock \( \phi_1 \) has a sufficiently narrow duty-cycle to transfer the charge from the capacitor to the negative input node of the OTA, thus biasing the OTA. The cascaded gain of the amplifier chain is 57.6 dB for a 10 MHz pulse input.

Table 5-11 summarizes the simulated performance of the AFEs. Both the capacitive feedback AFE and the SC bias AFE satisfy the requirements on gain and input-referred noise. To simulate the effects of transmission through the human body, a 2 mVpp pulse at 10 MHz was passed through a passive differentiator to generate narrow wideband spikes. The outputs of transient simulation for the capacitive feedback and SC bias AFE topologies are shown in Fig. 5-29 and Fig. 5-30 respectively. It is seen that the AFE topologies retrieve the binary values corresponding to the received waveform.
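
The shape of the test stimulus can be visualized with a simple discrete-time model. The sketch below stands in for the passive differentiator with a first-order RC high-pass filter; the time constant, the time step and the 0 to 2 mV square-wave representation of the 2 mVpp, 10 MHz pulse are all hypothetical choices made here, since the component values of the actual test bench are not given.

```python
def rc_highpass(samples, dt, rc):
    """First-order RC high-pass filter (backward-difference discretization)."""
    alpha = rc / (rc + dt)
    out = [0.0]
    for prev_x, x in zip(samples, samples[1:]):
        out.append(alpha * (out[-1] + x - prev_x))
    return out


DT = 0.1e-9      # simulation time step (hypothetical)
RC = 2e-9        # differentiator time constant (hypothetical)
PERIOD = 100e-9  # 10 MHz pulse
AMPL = 2e-3      # 2 mVpp

n_per_period = round(PERIOD / DT)
half = n_per_period // 2

# Square wave: low for the first half-period, high for the second, and so on.
pulse = [AMPL if (n // half) % 2 == 1 else 0.0 for n in range(3 * n_per_period)]
spikes = rc_highpass(pulse, DT, RC)

print(f"positive spike peak = {max(spikes) * 1e3:.2f} mV")
print(f"negative spike peak = {min(spikes) * 1e3:.2f} mV")
print(f"level before the next edge = {abs(spikes[n_per_period - 1]):.1e} V")
```

Each edge of the pulse produces a short spike of alternating polarity that decays well before the next edge, which is the kind of waveform the amplifier chain and the Schmitt trigger have to convert back into digital levels.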
5.7 Summary

In advanced CMOS process nodes, two- and three-stage OTAs become necessary to achieve the desired DC gain. Such OTAs require frequency compensation to ensure stability, and the choice of the compensation scheme has a significant impact on the unity-gain frequency and phase margin. Lowering the required compensation capacitance helps to reduce area and power consumption. Application of indirect frequency compensation using split-length devices to two-stage and three-stage OTAs helped to eliminate the nulling resistor and achieve higher power efficiency. The RBB and FBB features of the 28 nm UTBB FDSOI CMOS process can be used to attain the desired performance in OTAs. Boosting the DC gain and enabling ultra-low-voltage operation for fully-differential OTAs using RBB and FBB, respectively, were illustrated in this chapter. For mature CMOS technology nodes such as 180 nm CMOS, proper choice of the OTA topology and careful design resulted in a power-efficient RVBuffer for a 10 bit, 1 MS/s SAR ADC.
Chapter 6

Conclusions and Future Work

Based on the objectives outlined in Section 1.2, this dissertation presented two SAR ADCs and five OTA designs. The circuit blocks were designed and implemented in advanced CMOS process nodes with low supply voltages. Design considerations for the various circuits and their performance specifications were elaborated upon in Chapters 2-5 and the following conclusions are drawn.

Chapter 2 described the design considerations for SAR ADCs and the trade-offs involved in the sub-blocks of the ADC. In order to ensure sufficient gate overdrive voltage for the sampling switches, bootstrapping and multi-stage charge pumps have to be used. Leakage-mitigation assumes significance in very low sampling rate ADCs which are often employed in biomedical implants and WSNs. Choice of the capacitive DAC topology as well as factors limiting the performance of comparators were elaborated upon. Since multi-stage OTAs are needed to meet the DC gain requirement in AFEs and data converters, frequency compensation techniques as well as the features of the 28 nm UTBB FDSOI CMOS process useful for enhancing the performance of analog blocks were discussed.

A sub-nW, 8-bit, 1 kS/s SAR ADC implemented in 65 nm CMOS was presented in Chapter 3. To overcome the formidable trade-off between leakage and $R_{ON}$, a reduced-leakage switch driven by a multi-stage charge pump was used. By constructing very low-valued unit capacitors with a well-shielded top plate in a binary-weighted capacitive DAC, the area and power consumption associated with the DAC were minimized while meeting the linearity performance. At near-Nyquist input frequency, the ADC achieves 7.81 ENOB while consuming only 717 pW, making it an ideal candidate for WSNs and biomedical circuits.

Chapter 4 presented a 10-bit, 50 MS/s SAR ADC with an on-chip reference voltage buffer. The speed limitation caused by accurate DAC settling for medium/high speed SAR ADCs and the need for a fast-settling reference voltage buffer were discussed. The designed buffer meets the requirements on settling time, PSRR, noise and stability. A bootstrapped sampling switch maintains > 10-bit linearity over all PVT corners. In post-layout simulation which includes the entire pad frame and associated parasitics, the ADC achieves an ENOB of 9.25 bits at a supply voltage of 1.2 V, typical process corner and sampling frequency of 50 MS/s for near-Nyquist input. Excluding the reference voltage buffer, the ADC consumes 697 µW and achieves an energy efficiency of 25 fJ/conversion-step while occupying a core area of 0.055 mm$^2$. The FoM achieved compares well with that of state-of-the-art ADCs with similar specifications.

Chapter 5 presented five different OTA designs which are closely linked with ADCs and AFEs. Two OTAs which utilize the beneficial features of the 28 nm UTBB FDSOI CMOS process were discussed. It is shown that the OTAs satisfy the demanding specifications with low power consumption. A reference voltage buffer for a 10-bit, 1 MS/s SAR ADC in an industrial SoC as well as the importance of choosing the appropriate frequency compensation technique in high-speed, multi-stage OTAs were presented. Finally, an AFE in 40 nm CMOS for a body-coupled communication receiver was described.

6.1 Future Work

As part of smart-MEMPHIS, an EU-funded project, I am working on the design of CMOS rectifier circuits for energy-harvesting applications. This project involves the implementation of autonomously-powered circuits for biomedical and structural health monitoring applications. For structural health monitoring, a medium resolution ADC is required. Since the supply voltage for the ADC is generated by harvesting vibrations, the ADC power consumption should be minimized. The experience that I have gained in implementing ultra-low-power ADCs will benefit me in this project.

Next-generation wireless standards such as LTE-Advanced will require ADCs with ENOB > 10 bits and sampling rates > 200 MS/s [71]. The pipelined SAR ADC, also known as the SAR-assisted pipeline ADC, harnesses the advantages of SAR and pipelined ADCs to realize high resolution and improved sampling rate. For a given resolution, the pipelined SAR ADC requires fewer stages than a conventional pipelined ADC, which translates into substantial power savings. The speed bottleneck caused by the sequential SAR algorithm is alleviated in the pipelined SAR ADC, especially for high resolutions. Use of a dynamic residue amplifier [71, 119] helps to reduce power consumption. Pipelining combined with time-interleaving holds significant potential to boost the sampling rate and resolution of SAR ADCs [71, 119]. More work could be done to exploit the benefits of this topology. The use of pipelined SAR ADCs with dynamic residue amplification for high-resolution, low-sampling-rate applications can also be investigated.
References

[1] M. Yip and A. P. Chandrakasan, “A resolution-reconfigurable 5-to-10-bit 0.4-
to-1 V power scalable SAR ADC for sensor applications,” *IEEE J. Solid-State
Circuits*, vol. 48, no. 6, pp. 1453–1464, June 2013.

70 dB dr 10 b 0-to-80 MS/s current-integrating SAR ADC with adaptive
dynamic range,” *IEEE J. Solid-State Circuits*, vol. 49, no. 5, pp. 1173–1183,
May 2014.

[3] N. Van Helleputte, M. Konijnenburg, J. Pettine, D.-W. Jee, H. Kim, A. Mor-
gado, R. Van Wegberg, T. Torfs, R. Mohan, A. Breeschoten, H. de Groot,
C. Van Hoof, and R. F. Yazicioglu, “A 345 µW multi-sensor biomedical SoC
with bio-impedance, 3-Channel ECG, motion artifact reduction, and integrated

latch sense amplifier and a static power-saving input buffer for low-power
1993.

[5] Shikata, R. Sekimoto, T. Kuroda, and H. Ishikuro, “A 0.5V 1.1MS/sec
6.3fJ/conversion-step SAR-ADC with tri-level comparator in 40nm CMOS,” in

B. Nauta, “A 1.9µw 4.4fJ/conversion-step 10b 1MS/s charge-redistribution

10b 200kS/s subranging SAR ADC in 40nm CMOS,” in *IEEE ISSCC Dig.

[8] D. Zhang and A. Alvandpour, “A 3-nW 9.1-ENOB SAR ADC at 0.7 V and




Appendix A

Paper Collection

Journals

  The above journal paper is not included in the appendix since it is currently under review.


Conferences



The articles associated with this thesis have been removed for copyright reasons. For more details about these see:

http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-122730