Implementation of Flash Analog-to-Digital Converters in Silicon-on-Insulator Technology

Erik Säll

Copyright © 2005 Erik Säll

eriks@isy.liu.se
http://www.es.isy.liu.se
Division of Electronics Systems
Department of Electrical Engineering
Linköping University
SE-581 83 Linköping
Sweden

ISBN 91-85457-79-5    ISSN 0280-7971

Printed by UniTryck, Linköping, Sweden 2005
Abstract

High-speed analog-to-digital converters (ADCs) used in, e.g., read channel and ultra wideband (UWB) applications are often based on a flash topology. Read channel applications are the intended application of this work, part of which covers the design of two different types of 6-bit flash ADCs. Another field of application is UWB receivers.

To optimize the performance of the whole system and derive the specifications for its sub-blocks, a top-down design methodology is often desired. To facilitate this methodology, the ADCs are modeled on the behavioral level. The models are simulated in MATLAB®. The results are used to verify the functionality of the proposed circuit topologies and serve as a basis for the circuit design phase.

The first flash ADC has a conventional topology. It has a resistor net connected to a number of latched comparators, but its thermometer-to-binary encoder is based on 2-to-1 multiplexers buffered with inverters. This gives a compact encoder with a regular structure and a short critical path. The main disadvantage is the code-dependent timing difference between the encoder outputs introduced by this topology. The ADC was simulated on the schematic level in Cadence® using the foundry-provided transistor models. The design obtained a maximum sampling frequency of 1 GHz, an effective resolution bandwidth of 390 MHz, and a power consumption of 170 mW.

The purpose of the second ADC is to demonstrate the concept of introducing dynamic element matching (DEM) into the reference net of a flash ADC. This design yields information about the performance improvements that DEM gives and the trade-offs involved in introducing it. Behavioral level simulations indicate that the spurious-free dynamic range (SFDR) is improved by 11 dB when introducing DEM, but the settling time of the reference net with DEM then limits the conversion speed of the converter. Further, the maximum input frequency is limited by the total resistance in the reference net, which is increased in this topology. The total resistance is the total switch on-resistance plus the total resistance of the resistors. To increase the conversion speed and the maximum input frequency, a new DEM topology is proposed in this work, which reduces the number of switches introduced into the reference net compared with earlier proposed DEM topologies. The transistor level simulations in Cadence® of the flash ADC with DEM indicate that the SFDR improves by 6 dB compared with the case without DEM, and the improvement is expected to be larger if more samples are used in the simulation. This was not possible in the current simulations due to the long simulation time. The improved SFDR is however traded for an increased chip area and a reduction of the maximum sampling frequency to 550 MHz for this converter. The average power consumption is 92 mW.

A goal of this work is to evaluate a 130 nm partially depleted silicon-on-insulator (SOI) complementary metal oxide semiconductor (CMOS) technology with respect to analog circuit implementation. The converters are therefore implemented in this technology. At the time of writing, the ADCs are still being manufactured. Since the technology evaluation will be based on the measurement results, the final results of the evaluation are not included in this thesis. The conclusions regarding the SOI CMOS technology are therefore based on a literature study of published scientific papers in the SOI area, information extracted during the design phase of the ADCs, and the transistor level circuit simulations. These inputs indicate that, to fully utilize the potential performance advantages of the SOI CMOS technology, the partially depleted SOI CMOS technology should be exchanged for a fully depleted SOI CMOS technology. The manufacturing difficulties regarding the control of the thin-film thickness must however be solved before the exchange can be done.
Preface

This licentiate thesis is a result of my research during the period from November 2002 through November 2005 at the Division of Electronics Systems, Department of Electrical Engineering, Linköping University, Sweden.

My main contributions to the field of analog-to-digital converters are the modeling and design of flash analog-to-digital converters, and the evaluation of the suitability of a partially depleted silicon-on-insulator complementary metal oxide semiconductor technology for implementing mixed-signal circuits. The choice of thermometer-to-binary encoder topology and its effect on the converter performance was evaluated during the converter design, and an encoder based on multiplexers is proposed in this work. An approach for incorporating dynamic element matching in the reference net of a flash analog-to-digital converter is also proposed and demonstrated in this work.

My research has resulted in the following publications related to the thesis.


I have also been involved in other research work generating the following publications falling outside the scope of this thesis.


Acknowledgments

First I thank my supervisor Prof. Mark Vesterbacka for giving me the opportunity to do this work and for his support and guidance along the path towards this thesis. I also thank him for the valuable help in proofreading this thesis.

I also thank my past and present colleagues at the Division of Electronics Systems for their support, friendship and for providing such an inspiring environment. For the same reasons I thank the rest of my colleagues I have had cooperation with at the Department of Electrical Engineering.

I am also grateful for the assistance by Jonas Carlsson, Erik Backenius, and Mattias Olsson in proofreading the drafts of this thesis. You really made my drafts much more colorful!

I thank all my friends for helping me to re-charge my batteries during my off-duty time. I hope to meet some of you under a Cumulus to the sound of a happy variometer.

Last I thank my parents Lennart and Birgitta, and my sisters Ann-Sofie and Josefine for their support through the years, but I think that you still do not really know what I am doing at work. Maybe this thesis can give you some more insight?

Thank you all!

Erik Säll

Linköping, November 18, 2005
# Contents

1 Introduction
  1.1 Read Channel Applications
  1.2 UWB Applications
  1.3 Scope of the Work
  1.4 Thesis Outline
  1.5 Abbreviations

2 Silicon-on-Insulator Technology
  2.1 Background
  2.2 Partially and Fully Depleted SOI
  2.3 SOI CMOS vs Bulk CMOS Devices
    2.3.1 Doping Density
    2.3.2 Effect of Scaling on Speed Performance
    2.3.3 Current Drive Capability
    2.3.4 Unity-Gain Frequency
    2.3.5 Latch-Up and Device Density
    2.3.6 Radiation Hardness
    2.3.7 Crosstalk and Passives
  2.4 Partially vs Fully Depleted SOI
    2.4.1 Kink and History Effect
    2.4.2 Self-heating and Thermal Coupling
  2.5 ESD Protection
    2.5.1 ESD Models
    2.5.2 ESD Protection Circuits for SOI CMOS

3 Analog-to-Digital Conversion
  3.1 ADC Function
  3.2 Quantization Noise
  3.3 ADC Performance Measures
    3.3.1 Resolution and Accuracy
    3.3.2 Signal-to-Noise Ratio
    3.3.3 Signal-to-Noise and Distortion Ratio
    3.3.4 Spurious-Free Dynamic Range
    3.3.5 Effective Number of Bits and Maximum Sampling Frequency
    3.3.6 Effective Resolution Bandwidth
    3.3.7 Figure of Merit
  3.4 Flash ADC Topology
    3.4.1 Reference Generator
    3.4.2 Comparator
    3.4.3 Thermometer-to-Binary Encoder
  3.5 Error Sources and Error Correction in Flash ADCs
    3.5.1 Sampling Time Uncertainty
    3.5.2 Resistive Reference Generator
    3.5.3 Comparator
  3.6 DEM in Flash ADCs

4 Proposed Circuits
  4.1 Folded Wallace Tree Encoder
  4.2 MUX-Based Encoder
  4.3 DEM Flash ADC
    4.3.1 The 1-of-M Decoder
    4.3.2 The Thermometer-to-Binary Encoder

5 Modeling of Flash ADCs
  5.1 Clock Skew
  5.2 Reference Generator
    5.2.1 Input-to-Reference Signal Feedthrough
    5.2.2 Resistor Mismatch and Reference Net Supply Fluctuations
  5.3 Comparator
  5.4 Thermometer-to-Binary Encoder
  5.5 DEM Flash ADC

6 ADC Designs
  6.1 Flash ADC with MUX-Based Encoder
    6.1.1 Reference Generator
    6.1.2 Thermometer-to-Binary Encoder
  6.2 DEM Flash ADC
    6.2.1 DEM Flash ADC Reference Generator
    6.2.2 PRBS
    6.2.3 1-of-126 Decoder
    6.2.4 Thermometer-to-Binary Encoder
  6.3 Comparator
    6.3.1 Preamplifier
    6.3.2 Latched Comparator
  6.4 Digital Circuits
    6.4.1 Full Adder
    6.4.2 2:1 MUX
    6.4.3 D Flip-Flop
  6.5 ESD Protection Circuit Design

7 Results and Discussion
  7.1 Flash ADC with MUX-Based Encoder
  7.2 DEM Flash ADC
  7.3 Comparator
  7.4 Discussion of the SOI CMOS Technology

8 Conclusions
  8.1 Future Work

A Notation

References

Chapter 1

Introduction

High-speed analog-to-digital converters (ADCs) are often based on a flash ADC topology [41, 85, 88], which is illustrated in Figure 1.1. These ADCs are used for a number of applications, e.g., the read channel applications presented in Section 1.1 and the ultra wideband (UWB) radio applications presented in Section 1.2. As seen in Figure 1.1 the input signal of a flash ADC is applied to the positive comparator inputs. Their negative inputs are connected to a resistor net that generates the reference voltages. The output pattern of the comparators corresponds to a thermometer code, which is encoded to, e.g., binary code by the \((2^N - 1)\)-to-\(N\) encoder in Figure 1.1, where \(N\) is the resolution of the ADC in number of bits. Since the number of comparators grows as \(2^N\) with the number of bits \(N\), the resolution is generally limited to at most eight bits for this type of converter. For larger resolutions, the large number of required comparators would limit the input bandwidth of the converter and consume a large chip area and high power [88]. However, for low resolutions the flash ADC topology can be used to yield fast ADCs [85, 88].
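
The conversion principle described above can be sketched in a few lines of Python. This is a minimal behavioral sketch of an ideal converter, not one of the models used in this work: the function name `flash_adc`, the reference range of 0 V to 1 V, and the ones-counting encoder are assumptions made for illustration.

```python
def flash_adc(v_in, n_bits=6, v_ref_low=0.0, v_ref_high=1.0):
    """Ideal flash conversion of one sample.

    Returns the thermometer code (list of comparator outputs) and the
    corresponding output code as an integer.
    """
    n_comp = 2 ** n_bits - 1                          # number of comparators
    lsb = (v_ref_high - v_ref_low) / 2 ** n_bits      # one quantization step
    # Taps of the (assumed ideal) resistor net: one reference per comparator.
    refs = [v_ref_low + (k + 1) * lsb for k in range(n_comp)]
    # A comparator outputs 1 when the input exceeds its reference voltage.
    thermometer = [1 if v_in > ref else 0 for ref in refs]
    # Ideal (2^N - 1)-to-N encoding: the code equals the number of ones.
    code = sum(thermometer)
    return thermometer, code
```

For example, `flash_adc(0.5, n_bits=3)` sets the three lowest of the seven comparator outputs and returns the code 3. A 6-bit converter already needs 63 comparators, which illustrates why the topology is restricted to low resolutions.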

A part of this work is the design of two different ADC topologies. One has an encoder based on multiplexers (MUXs). The other demonstrates an improved way of introducing dynamic element matching (DEM) into the reference net of the flash ADC, thereby improving the spectral properties of the converter output.

A top-down design methodology is often desired. Using this methodology the system is first modeled on a behavioral level, allowing fast simulation of the whole system. Although the simulation is not as accurate as a transistor level simulation, the results yield information that is valuable in the subsequent steps of the design flow. The first model is then gradually refined and finally the circuit topology of each sub-circuit in the system can be decided, e.g., what type of comparators to use in the flash ADC, or what encoder topology to use. A number of behavioral level models of the ADCs are therefore also presented in this work, and the results are used in the circuit design phase.

During the behavioral level modeling of the ADCs a study of the effect of the chosen thermometer-to-binary encoder topology was performed. Two new encoder topologies were studied, a modified Wallace tree encoder [41] and a MUX-based encoder [75]. The latter is used as the encoder in one of the ADC designs. At the time of writing, that converter is being manufactured.

Another proposal presented in this work is a flash ADC where DEM is introduced by modification of the reference net. This ADC was also first modeled on the behavioral level. The behavioral level simulation and the later transistor level simulation showed promising results. The flash ADC with DEM is therefore manufactured, and will later be characterized by measurements.

This work is a part of the industry-initiated pan-European program for advanced cooperative research and development in microelectronics, called MEDEA+. One of the around 30 projects in the MEDEA+ program is the T206 project, of which this work is a part. The purpose of the T206 project is to find out if the silicon-on-insulator (SOI) technology is suitable for implementation of a range of mobile and networking devices. The targeted applications of this work are presented in Section 1.1 and in Section 1.2.

**Figure 1.1:** Illustration of a flash ADC.

To facilitate the evaluation of SOI technology the program provides an SOI technology for circuit implementation. The technology is a 130 nm partially depleted (PD) SOI complementary metal oxide semiconductor (CMOS) technology. Our part of the project was to design and evaluate mixed-signal circuits implemented in this technology, with the focus on ADCs, and thereby evaluate the technology.

Even though SOI technology has been around since 1964 [49], it has so far mostly been used for digital [2, 30, 68, 70] or pure RF [61, 62, 81, 84] applications. Many of these designs are targeted for special applications, e.g., military, space, high-temperature, or high-radiation environments [15, 44]. Hence, it has been used mainly for low-volume production. In recent years, more circuits designed in SOI technologies have been used in high-volume production. Most of those are digital circuits. Only a few are analog RF circuits.

A study in this work showed that few analog baseband designs and implementations in SOI have been presented. The presented circuits are mostly targeted for special applications, e.g., low-voltage sensor circuits [83], and analog baseband circuits for high-radiation [19] or high-temperature environments [22]. There is also a GSM receiver implemented in a 0.25 μm SOI CMOS technology using a sigma-delta ADC [16], intended as a demonstrator of the SOI CMOS technology.

Another reason for choosing to design analog circuits in SOI CMOS technology is to investigate how the analog circuits are affected by the unwanted effects in the technology. It is also desired to develop design guidelines that consider these effects. Among the unwanted effects appearing when implementing circuits in this technology are the kink, history, and self-heating effects. These effects are presented in more detail in Chapter 2. The digital circuits are also affected by the kink, history, and self-heating effects, but they are especially problematic for analog circuits.

1.1 Read Channel Applications

The intended application of this work is the read channel. The data recovery circuit, i.e., the read channel, of, e.g., a DVD player, a CD player, or a hard disk drive (HDD) converts the stored data into digital bit streams [52]. A read channel is illustrated by the block diagram in Figure 1.2. Here the input to the read channel is first amplified by the variable gain amplifier (VGA). By adjusting its gain, the signal level of the VGA output can be adjusted so that it is within the range set by the ADC. If the ADC input signal magnitude were outside the proper range, either the ADC would saturate, if the magnitude were too large, or the quantization noise of the ADC would be too large relative to the signal, if the magnitude were too small. The next stage of the read channel adjusts the direct current (DC) offset of the signal. Before the signal enters the ADC it is filtered by the anti-aliasing low-pass filter (LPF). The output of the ADC undergoes digital signal processing (DSP) before the output of the read channel is generated. The DSP unit controls the VGA, the offset adjustment unit, and the ADC.

Figure 1.2: Illustration of a read channel application.

The high data rates in, e.g., hard disks, set high demands on the conversion speed of the read channel. The ADC in the read channel must therefore be able to operate at high sampling rates. The requirement on its resolution is about six bits [12]. These requirements make the flash ADC topology suitable for the converter in the read channel [87].

The above-mentioned CD, DVD, and HDD applications are now appearing in different hand-held applications, where low power consumption is important. The hand-held applications are therefore an application domain where the SOI technology can be valuable, since it is suited for low-power applications at a low power supply voltage, especially the fully depleted SOI technology [24]. In addition, the power consumption due to increased leakage current of the devices becomes dominant as the device technologies are continuously downscaled. The leakage current of fully depleted SOI CMOS is shown to be two to three times lower than for bulk CMOS in [51], where the bulk CMOS technologies are the mainstream CMOS technologies that are not SOI, bipolar CMOS (BiCMOS), or other more advanced technologies. The lower leakage current is an important reason why the SOI technology is expected to become increasingly common in the future [10], and therefore motivates a study on how to implement mixed-signal circuits in such a technology.

1.2 UWB Applications

Another application area for this work is UWB. UWB is defined as any signal that occupies a bandwidth of more than 500 MHz in the unlicensed band from 3.1 to 10.6 GHz, and that meets the requirements on UWB signal spectrum mask [1]. The block diagram of a UWB receiver is found in Figure 1.3 [53]. In this figure the input signal from the antenna is first amplified by the low noise amplifier (LNA) and then by the VGA. The latter is controlled by the baseband digital signal processor (DSP) via a digital-to-analog converter (DAC). The purpose of the VGA is the same as in the read channel, i.e., to ensure that the ADC is fed an appropriate input voltage. The output of the VGA is converted by the ADC, whose output is processed by the baseband DSP. The DSP also controls the clock (CLK) generator, which controls the sampling by the ADC.

Figure 1.3: Block diagram of a UWB receiver.

The expected applications of UWB are short-range wireless communications over a distance of about 10 m. Wireless transceivers employing the UWB technique are scalable and adaptive [1], which is why UWB is expected to be the technique used in the coming IEEE 802.15.3a standard for wireless personal area networks [1, 39]. The main requirements on the transceivers are low complexity, low cost, and low power consumption. The requirement of low cost generally restricts the technology used for the implementation to mainstream bulk CMOS. However, use of a bulk CMOS technology yields implementation challenges regarding, e.g., the wideband LNA, the phase locked loop (PLL), and the ADC [1]. Studies have shown that an ADC with a resolution of four bits is sufficient for reliable UWB reception [53]. Hence, a 6-bit flash ADC could be used for this application as well. Its input bandwidth should be more than 500 MHz.

1.3 Scope of the Work

This work focuses on design and implementation of flash ADCs in a partially depleted SOI CMOS technology provided in the framework of the MEDEA+ program. The targeted applications are read channel and UWB applications, which are in line with the goals of the earlier mentioned T206 project. The ADC used in read channel and UWB applications must have a high sampling rate and a resolution of four to six bits. Hence, the resolution of the ADCs in this work is six bits. The ADCs are designed using the top-down design methodology. To facilitate that methodology the ADCs in this work are modeled on the behavioral level. The modeling is described in this thesis. This work also demonstrates the concept of introducing DEM into the reference net of a flash ADC.

1.4 Thesis Outline

This thesis is organized as follows. An introduction to the SOI technology is given in Chapter 2. In Chapter 3 an introduction and background to analog-to-digital conversion is presented together with the performance measures used in this thesis. The circuit topologies proposed in this work are presented in Chapter 4. In Chapter 5 the modeling of the flash ADCs is presented. The simulation results of these models were used in the design of the converters presented in Chapter 6. The transistor level simulation results of the designed ADCs are given in Chapter 7, together with a discussion of the results. The SOI CMOS technology is also discussed there, based on what was presented in Chapter 2. The main conclusions are given in Chapter 8, together with a list of suggestions for future work.

1.5 Abbreviations

This section lists the abbreviations used in this thesis.
<table>
<thead>
<tr>
<th>Abbreviation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>ADC</td>
<td>Analog-to-digital converter</td>
</tr>
<tr>
<td>BiCMOS</td>
<td>Bipolar complementary metal oxide semiconductor</td>
</tr>
<tr>
<td>BOX</td>
<td>Buried oxide</td>
</tr>
<tr>
<td>BSIM3SOI</td>
<td>Berkeley short-channel IGFET model for SOI</td>
</tr>
<tr>
<td>CD</td>
<td>Compact disc</td>
</tr>
<tr>
<td>CLK</td>
<td>Clock</td>
</tr>
<tr>
<td>CMOS</td>
<td>Complementary metal oxide semiconductor</td>
</tr>
<tr>
<td>DAC</td>
<td>Digital-to-analog converter</td>
</tr>
<tr>
<td>DC</td>
<td>Direct current</td>
</tr>
<tr>
<td>DEM</td>
<td>Dynamic element matching</td>
</tr>
<tr>
<td>DSP</td>
<td>Digital signal processor, or digital signal processing (both are used)</td>
</tr>
<tr>
<td>DUT</td>
<td>Device under test</td>
</tr>
<tr>
<td>DVD</td>
<td>Digital versatile disc</td>
</tr>
<tr>
<td>ENOB</td>
<td>Effective number of bits</td>
</tr>
<tr>
<td>ERBW</td>
<td>Effective resolution bandwidth</td>
</tr>
<tr>
<td>ESD</td>
<td>Electrostatic discharge</td>
</tr>
<tr>
<td>FA</td>
<td>Full adder</td>
</tr>
<tr>
<td>FD</td>
<td>Fully depleted</td>
</tr>
<tr>
<td>FoM</td>
<td>Figure of merit</td>
</tr>
<tr>
<td>GaAs</td>
<td>Gallium arsenide</td>
</tr>
<tr>
<td>GSM</td>
<td>Global system for mobile communications</td>
</tr>
<tr>
<td>HDD</td>
<td>Hard disk drive</td>
</tr>
<tr>
<td>IGFET</td>
<td>Insulated-gate field effect transistor</td>
</tr>
<tr>
<td>LNA</td>
<td>Low noise amplifier</td>
</tr>
<tr>
<td>LPF</td>
<td>Low-pass filter</td>
</tr>
<tr>
<td>LSB</td>
<td>Least significant bit</td>
</tr>
<tr>
<td>MiM</td>
<td>Metal insulator metal</td>
</tr>
<tr>
<td>MOSFET</td>
<td>Metal oxide semiconductor field effect transistor</td>
</tr>
<tr>
<td>MSB</td>
<td>Most significant bit</td>
</tr>
<tr>
<td>MUX</td>
<td>Multiplexer</td>
</tr>
<tr>
<td>NMOS</td>
<td>Negative-channel metal oxide semiconductor</td>
</tr>
<tr>
<td>PD</td>
<td>Partially depleted</td>
</tr>
<tr>
<td>PLL</td>
<td>Phase locked loop</td>
</tr>
<tr>
<td>PMOS</td>
<td>Positive-channel metal oxide semiconductor</td>
</tr>
<tr>
<td>PRBS</td>
<td>Pseudo-random bit stream</td>
</tr>
<tr>
<td>Q</td>
<td>Quality factor</td>
</tr>
<tr>
<td>RF</td>
<td>Radio frequency</td>
</tr>
<tr>
<td>ROM</td>
<td>Read-only memory</td>
</tr>
<tr>
<td>SFDR</td>
<td>Spurious-free dynamic range</td>
</tr>
<tr>
<td>SH</td>
<td>Sample-and-hold</td>
</tr>
<tr>
<td>Si</td>
<td>Silicon</td>
</tr>
<tr>
<td>SiO₂</td>
<td>Silicon dioxide</td>
</tr>
<tr>
<td>SNDR</td>
<td>Signal-to-noise and distortion ratio</td>
</tr>
<tr>
<td>SNR</td>
<td>Signal-to-noise ratio</td>
</tr>
<tr>
<td>SOI</td>
<td>Silicon-on-insulator</td>
</tr>
<tr>
<td>SOS</td>
<td>Silicon-on-sapphire</td>
</tr>
<tr>
<td>SQNR</td>
<td>Signal-to-quantization noise ratio</td>
</tr>
<tr>
<td>TH</td>
<td>Track-and-hold</td>
</tr>
<tr>
<td>VGA</td>
<td>Variable gain amplifier</td>
</tr>
</tbody>
</table>
Chapter 2

Silicon-on-Insulator Technology

In SOI the thin silicon layer where the devices are laid out is placed on top of an insulating layer. This insulator can be, e.g., sapphire, as for silicon-on-sapphire (SOS) devices, or an oxide layer, as for the SOI CMOS technology used in this work. The different devices having this structure will be referred to as SOI devices, regardless of the type of insulator used. The devices used in this work are therefore referred to as SOI CMOS devices, i.e., manufactured in an SOI CMOS technology.

This chapter gives a brief background to the development of the SOI technology. Two different types of SOI CMOS devices, partially depleted (PD) and fully depleted (FD), are presented, followed by a comparison of the SOI CMOS technology with the bulk CMOS technology. A comparison between partially and fully depleted SOI CMOS devices is presented in Section 2.4. Electrostatic discharge (ESD) protection circuits for SOI CMOS are presented in Section 2.5.

2.1 Background

The first SOI substrates were of SOS type [49], but they were of poor quality and could therefore not be used for commercial manufacturing of devices. The development advanced thanks to the interest from the military and space industry in radiation-hard devices and devices that could operate at high temperatures. From the beginning of the 1980s through the 1990s, the SOI devices, based on silicon-on-sapphire substrates, were mostly used for radiation-hardened and high-temperature applications [15], i.e., low-volume production. At the end of the 1990s, the market for the silicon-on-sapphire based devices increased as they started to be used for RF applications [61, 62]. Consequently, silicon-on-sapphire has in recent years become cheaper, which is also due to improved wafer quality and wafer production efficiency. However, since the materials and manufacturing methods are different from what is used for low-cost bulk CMOS production, the cost is still significantly higher than for bulk CMOS.

In 1989 research on the possibility of converting cheap bulk CMOS technologies into SOI CMOS technologies was initiated [67]. One of the goals was to reduce the production cost of the SOI CMOS devices. As a result a few low-volume SOI CMOS technologies were in use in the mid 1990s for research purposes [67], using materials and manufacturing methods similar to those used for bulk CMOS production. The similarities are illustrated by Figure 2.1, which shows the cross sections of a bulk CMOS device and an SOI CMOS device. The main difference between these two types of devices is the buried oxide (BOX) layer that is used as an insulator below the device, as illustrated in Figure 2.1(b). The introduction of the buried oxide layer creates the thin-film structure that is characteristic of all SOI devices. In 1998 the SOI CMOS technology development took another important step towards the goal of becoming a mature technology, as the first microprocessor manufactured in an SOI CMOS technology became commercially available [67].

As the development advanced, promising properties of the SOI CMOS technology became obvious for future scaled technologies in the sub 100 nm region [4, 38, 56]. It became clear that the SOI CMOS technology really could be a strong competitor to bulk CMOS. In recent years, it has become widely accepted that the SOI CMOS technology will become one of the future mainstream technologies [4]. In fact, many of the latest mainstream CMOS technologies are SOI CMOS technologies [10], used for several mainstream products. Reasons why the SOI CMOS technologies are expected to be more common in the future are presented in this chapter.

A disadvantage of the technology is that the wafers are more expensive than the wafers used for bulk CMOS production [15]. This makes the total cost for manufacturing higher for SOI CMOS compared with bulk CMOS technologies, even though the processing steps are slightly fewer for SOI CMOS [15]. The difference in cost between the two technologies has however been reduced over the last years as the quality and yield of the wafer production have improved, reducing the price of the SOI wafers. As a result, a partially depleted SOI CMOS chip is today less than 10 % more expensive to manufacture compared with bulk CMOS [72]. Although the difference in cost now is small, the price must, of course, still be considered when choosing the technology to use.

2.2 Partially and Fully Depleted SOI

The SOI MOSFET device is a thin-film device that is either of partially depleted (PD) or fully depleted (FD) type. The cross sections of a partially depleted and a fully depleted SOI device are depicted in Figure 2.2(a) and Figure 2.2(b), respectively.

An undepleted region is present in the body region of the device if the thickness of the thin-film silicon layer where the devices are formed, i.e., the active silicon, is thicker than the depletion depth of the device, as indicated in Figure 2.2(a). The body is therefore only partially depleted. Hence, these devices are referred to as partially depleted SOI devices. The undepleted body region can be charged during operation, giving rise to several unwanted effects, of which the most important are presented later in this chapter, e.g., the kink and history effects.

If the thin-film is made thinner than the depletion depth, typically about 100 nm or less, the whole body region is depleted [51]. Devices of this type are referred to as fully depleted devices. Consequently, the body region of the device cannot be charged, and several of the unwanted effects appearing in the partially depleted devices are thereby avoided. However, other obstacles exist, of which the difficulty of manufacturing is the most important [51].

Since the active thin-film silicon layer must be very thin, the relative statistical variation of its thickness over the chip is large. Further, the threshold voltage of the device depends on the thin-film thickness. Consequently, the thin-film thickness variation results in varying threshold voltages over the chip [14]. The thickness must therefore be accurately controlled during manufacturing, which is difficult.

Figure 2.2: Cross section of (a) a partially depleted and (b) a fully depleted SOI MOSFET device.

## 2.3 SOI CMOS vs Bulk CMOS Devices

### 2.3.1 Doping Density

When downscaling the bulk CMOS technology the channel doping is increased to avoid the formation of a current path between the drain and source. The current path can be formed by electrostatic coupling between the junctions beneath the channel [10]. The increased doping density reduces the mobility, and as a result, the current drive capability of the device. To compensate for the reduced current drive capability the gate oxide thickness is reduced. The parasitic gate capacitance, which is inversely proportional to the oxide thickness, is thereby increased together with the gate leakage. If the gate leakage is too high, another gate-dielectric with higher permittivity than silicon dioxide ($\text{SiO}_2$) could be used. This would increase the gate oxide capacitance for a fixed gate oxide thickness, $t_{\text{ox}}$, thereby increasing the gate coupling to the channel. Because of the increased permittivity, a thicker gate oxide can be used, reducing the gate leakage [10].
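
The trade-off between permittivity and oxide thickness described above can be illustrated with the parallel-plate relation $C_{\text{ox}} = \varepsilon / t_{\text{ox}}$. The following is a minimal sketch, not taken from the original work. The 1.5 nm SiO2 thickness and the relative permittivity of 20 for the alternative dielectric are assumed values chosen only for illustration.

```python
# Sketch: physical thickness of a high-permittivity gate dielectric that
# gives the same capacitance per unit area as a thinner SiO2 layer.
# The numeric inputs below are illustrative assumptions.

EPS0 = 8.8541878128e-12   # vacuum permittivity in F/m
K_SIO2 = 3.9              # relative permittivity of SiO2


def cox_per_area(k_rel, thickness_m):
    """Gate oxide capacitance per unit area (F/m^2), parallel-plate model."""
    return k_rel * EPS0 / thickness_m


def equivalent_thickness(k_rel, t_sio2_m):
    """Thickness of a dielectric with relative permittivity k_rel that
    matches the capacitance of an SiO2 layer of thickness t_sio2_m."""
    return t_sio2_m * k_rel / K_SIO2


t_sio2 = 1.5e-9    # assumed SiO2 thickness in m
k_high = 20.0      # assumed relative permittivity of the alternative dielectric

t_high = equivalent_thickness(k_high, t_sio2)
print(f"SiO2:   t = {t_sio2 * 1e9:.2f} nm, Cox = {cox_per_area(K_SIO2, t_sio2):.4f} F/m^2")
print(f"high-k: t = {t_high * 1e9:.2f} nm, Cox = {cox_per_area(k_high, t_high):.4f} F/m^2")
```

With these assumed numbers the alternative dielectric can be roughly five times thicker for the same gate capacitance, which is the mechanism by which the gate leakage is reduced.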

Since the partially depleted SOI CMOS devices have an undepleted body region below the channel, the electrostatic coupling of the junctions under the channel occurring in the bulk CMOS technology as a result of downscaling can also occur in partially depleted SOI [10]. As in bulk CMOS, the electrostatic coupling in the downscaled partially depleted SOI CMOS technologies is compensated for by increasing the doping density and reducing the gate oxide thickness.

For fully depleted SOI CMOS technologies, the complete depletion of the channel region prevents the formation of the current path below the channel. Hence, the doping density does not need to be increased as much as for bulk CMOS and partially depleted SOI CMOS technologies [44]. Due to the lower doping density the mobility is higher in fully depleted SOI CMOS compared with bulk CMOS and partially depleted SOI CMOS. In addition, the gate leakage is lower for fully depleted SOI CMOS devices, since a thicker gate oxide can be used. As a result, the parasitic gate capacitance is reduced [10]. The fully depleted SOI CMOS technology is expected to become increasingly common in the future as the devices are scaled [11], largely due to the reduced gate leakage. The difficulties of manufacturing must however be addressed before the fully depleted SOI CMOS technology can be used for commercial production.

### 2.3.2 Effect of Scaling on Speed Performance

The maximum operating frequency of a bulk CMOS device is largely determined by the magnitude of the total parasitic drain and source junction capacitances, $C_{j,d}$ and $C_{j,s}$, respectively. These capacitances consist of the parasitic sidewall junction capacitance $C_{j,sw}$ and the bottom drain and source area parasitic junction capacitances, $C_{db}$ and $C_{sb}$, respectively, according to

$$C_{j,d} = C_{j,sw} + C_{db} \quad (2.1)$$

for the drain junction capacitance of the bulk CMOS devices.

In the case of a bulk CMOS device the distance between the charged areas of the parasitic bottom-area junction capacitances is equal to the width of the junction, where the charged areas are the areas where the charges accumulate in the drain, source, and substrate. When introducing the buried oxide the distance between these charged areas is increased by the thickness of the buried oxide. Consequently, the contribution from the bottom-area drain and source junction capacitances to the total drain and source parasitic junction capacitance is significantly reduced. The total drain and source parasitic capacitances for SOI CMOS devices are thereby reduced by a factor of four to seven compared with a bulk CMOS technology of the same dimensions [14, 44, 70]. The speed of the SOI CMOS devices is therefore increased compared with bulk CMOS devices of the same size. Hence, the SOI CMOS technology is about one generation ahead of the bulk CMOS technology in terms of device speed [15, 44].

Now consider partially depleted SOI CMOS devices with the body tied to the source, in relation to bulk CMOS devices. The speed enhancement of the partially depleted SOI CMOS devices is then entirely due to the reduction of the parasitic bottom-area drain and source junction capacitances [14, 56]. When the devices are downscaled, this area is reduced for both types of devices, which reduces the parasitic drain and source bottom junction area capacitances. As these capacitances are reduced, the relative performance enhancement of the partially depleted SOI CMOS devices diminishes. The performance advantage, in terms of speed for digital circuits over bulk CMOS, is thereby reduced to less than 10 % for body tied devices when the technology scaling approaches a gate length of 70 nm [56]. The minimum gate length at which partially depleted SOI CMOS no longer offers an advantage over bulk CMOS is therefore in the range of 100 nm down to 70 nm [44, 56]. Hence, when approaching a minimum gate length of 70 nm, fully depleted SOI CMOS should be preferred to utilize the advantages of SOI CMOS over bulk CMOS, due to, e.g., the improved current drive capability, as explained next.

### 2.3.3 Current Drive Capability

The body factor \( n \) of a bulk CMOS or SOI CMOS device is given by [13]

\[
n = 1 + \frac{C_{\text{ch,b}}}{C_{\text{g,ch}}}, \quad (2.2)
\]

where \( C_{\text{ch,b}} \) is the parasitic capacitance between the channel and body, and \( C_{\text{g,ch}} \) is the parasitic capacitance between the gate and channel. Hence, \( n \) indicates how much of the gate potential affects the channel, i.e., it is a measure of the efficiency of the coupling between the gate and the channel [13].

In fully depleted SOI devices, the introduction of the insulating layer and the complete depletion of the body reduce the channel to body capacitance \( C_{\text{ch,b}} \). As seen from (2.2) this reduces the body factor of the fully depleted SOI CMOS devices, compared with both bulk CMOS devices and partially depleted SOI CMOS devices. As an effect, the efficiency of the coupling between the gate and the channel improves.

Studies have shown that the body factor \( n \) can be as low as 1.05-1.1 for fully depleted SOI CMOS, compared with 1.3-1.5 for bulk CMOS and also partially depleted SOI CMOS devices [13, 14, 24, 25]. The current drive capability, given by the saturation current \( I_{D,\text{sat}} \), which is proportional to \( 1/n \), is thereby higher in fully depleted SOI CMOS compared with a bulk CMOS or partially depleted SOI CMOS technology [15].

Further, the ratio between the transconductance \( g_m \) and the DC drain current \( I_D \) is given by (2.3a) for weak inversion and (2.3b) for strong inversion [13]. In equation (2.3a) \( q \) is the electron charge, \( k \) is Boltzmann’s constant, and \( T \) is the temperature in Kelvin. The channel charge carrier mobility is denoted \( \mu \) in (2.3b). Further, \( C_{\text{ox}} \) is the gate oxide capacitance per unit area, \( W \) is the width of the transistor, and \( L \) is the channel length.

\[
\left. \frac{g_m}{I_D} \right|_{\text{weak inversion}} = \frac{1}{n} \frac{q}{kT} \quad (2.3a)
\]

\[
\left. \frac{g_m}{I_D} \right|_{\text{strong inversion}} = \frac{1}{\sqrt{n}} \sqrt{\frac{2\mu C_{\text{ox}} W}{L I_D}} \quad (2.3b)
\]

The lower body factor \( n \) of fully depleted SOI CMOS devices yields a larger \( g_m/I_D \) ratio compared with bulk CMOS and partially depleted SOI CMOS devices. Hence, for the same drain bias current $I_D$ the transconductance $g_m$ is larger for devices implemented in fully depleted SOI CMOS than in a bulk CMOS or partially depleted SOI CMOS technology. The larger transconductance also improves the device speed, in addition to the reduced parasitic drain and source bottom-area junction capacitances.

It is worth noting that the $g_m/I_D$ ratio is proportional to $1/\sqrt{n}$ in strong inversion, while it is proportional to $1/n$ in weak inversion. Due to the lower body factor $n$ in fully depleted SOI, its advantage over bulk CMOS and partially depleted SOI CMOS devices is even larger when they operate in weak inversion. Fully depleted SOI CMOS circuits are therefore especially suited for low-voltage and low-power applications [13, 24].
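
As a numerical illustration of the weak-inversion expression (2.3a), the sketch below evaluates $g_m/I_D = q/(nkT)$ for the ends of the body-factor ranges quoted above. The temperature of 300 K is an assumption made for this example, and the function name is introduced here only for illustration.

```python
# Sketch: weak-inversion gm/ID from (2.3a) for different body factors.
# The 300 K temperature is an assumed operating point.

Q_E = 1.602176634e-19   # electron charge in C
K_B = 1.380649e-23      # Boltzmann's constant in J/K


def gm_over_id_weak(n, temp_k=300.0):
    """gm/ID in weak inversion (unit 1/V) for body factor n."""
    return Q_E / (n * K_B * temp_k)


cases = [
    ("fully depleted SOI, n = 1.05", 1.05),
    ("fully depleted SOI, n = 1.10", 1.10),
    ("bulk / PD SOI,      n = 1.30", 1.30),
    ("bulk / PD SOI,      n = 1.50", 1.50),
]
for label, n in cases:
    print(f"{label}: gm/ID = {gm_over_id_weak(n):.1f} 1/V")
```

Under this assumption, the ideal limit for $n = 1$ is about 38.7 V$^{-1}$, and the fully depleted devices come considerably closer to it than the bulk and partially depleted devices.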

### 2.3.4 Unity-Gain Frequency

For analog circuits the DC open loop voltage gain $A_0$ and unity-gain frequency $f_T$ of the devices are of importance for the overall circuit performance. Consider the transistor in the common source configuration, illustrated in Figure 2.3, biased with the drain bias current $I_D$ delivered from an ideal current source. If the load is $C_L$, the expressions for the DC gain $A_0$ (2.4) and unity-gain frequency $f_T$ (2.5) become [69]

$$A_0 = -\frac{g_m}{I_D} V_A \quad (2.4)$$

and

$$f_T = \frac{1}{2\pi} \frac{g_m}{C_L}, \quad (2.5)$$

where $V_A$ is the Early voltage

$$V_A = \frac{I_D}{g_{ds}}, \quad (2.6)$$

i.e., the ratio between the DC current and the small-signal output conductance $g_{ds}$.

The DC gain and the unity-gain frequency are the same for partially depleted SOI CMOS as for bulk CMOS devices of the same dimension, since the body factor and consequently the ratio $g_m/I_D$ is the same.

For fully depleted SOI CMOS devices, the body factor is lower, yielding a larger $g_m/I_D$ ratio. The DC gain $A_0$ and unity-gain frequency $f_T$ are therefore improved when using fully depleted SOI CMOS compared with using bulk CMOS and partially depleted SOI CMOS devices. With the body factors given in Section 2.3.3, the improvement of $A_0$ and $f_T$ is between 9% and 20% in strong inversion, and between 18% and 43% in weak inversion. These are the main reasons why circuits implemented in a fully depleted SOI CMOS technology are expected to outperform the bulk CMOS counterpart in terms of gain, speed, and power consumption, especially for low-voltage and low-power applications [25].
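
The quoted improvement ranges can be reproduced from the body factors in Section 2.3.3. The sketch below assumes that $I_D$, $V_A$, and $C_L$ are the same for the compared devices, so that $A_0$ and $f_T$ scale with $g_m/I_D$, i.e., with $1/n$ in weak inversion and $1/\sqrt{n}$ in strong inversion. The helper name `improvement` is introduced here for illustration only.

```python
# Sketch: relative improvement of gm/ID (and thereby A0 and fT, assuming
# equal ID, VA and CL) of fully depleted SOI over bulk / PD SOI.

N_FD = (1.05, 1.10)     # body factor range, fully depleted SOI
N_REF = (1.30, 1.50)    # body factor range, bulk and partially depleted SOI


def improvement(n_ref, n_fd, exponent):
    """Relative gain of (1/n_fd)**exponent over (1/n_ref)**exponent."""
    return (n_ref / n_fd) ** exponent - 1.0


# Smallest improvement: best reference device against worst FD device.
# Largest improvement: worst reference device against best FD device.
weak_min = improvement(N_REF[0], N_FD[1], 1.0)
weak_max = improvement(N_REF[1], N_FD[0], 1.0)
strong_min = improvement(N_REF[0], N_FD[1], 0.5)
strong_max = improvement(N_REF[1], N_FD[0], 0.5)

print(f"weak inversion:   {100 * weak_min:.0f} % to {100 * weak_max:.0f} %")
print(f"strong inversion: {100 * strong_min:.0f} % to {100 * strong_max:.0f} %")
```

The exponent 0.5 for strong inversion reflects the $1/\sqrt{n}$ dependence, which is why the improvement there is roughly half of that in weak inversion.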

### 2.3.5 Latch-Up and Device Density

In an SOI CMOS transistor the drain and source junctions reach through the silicon thin-film to the buried oxide. Hence, the buried oxide isolates the thin-film silicon from the substrate. This SOI CMOS technology characteristic prevents the formation of parasitic bipolar transistors in the SOI CMOS technologies due to the isolation of the negative-channel metal oxide semiconductor (NMOS) and positive-channel metal oxide semiconductor (PMOS) devices [15, 35, 44]. The latch-up effect in bulk CMOS is therefore not present in SOI CMOS.

In bulk CMOS, the latch-up effect is avoided by ensuring that the minimum distance between the NMOS and PMOS devices is sufficiently large and by using substrate contacts. For SOI CMOS technologies, having no latch-up effect, the minimum distance between the devices can be smaller. In addition, the deep oxide filled trenches [35], often used in bulk CMOS technologies to get a good isolation between the devices, can be replaced by shallow trenches in SOI CMOS technologies [15, 35] due to the thin-film structure on top of the buried oxide. The minimum space between the devices in an SOI CMOS technology is thereby only dependent on the technology constraints on the minimum width of the shallow oxide filled trenches [44]. The device density can therefore be higher in SOI CMOS technologies than in bulk CMOS technologies, which is an advantage in digital circuits, especially in, e.g., memories.

For analog circuits the layout must be done so that the self-heating, presented in Section 2.4.2, and other thermal effects are minimized [82]. It is therefore harder to draw any general conclusions on the device density when comparing bulk CMOS technologies with SOI CMOS technologies for analog circuits.

### 2.3.6 Radiation Hardness

Radiation particles incident upon a silicon wafer ionize some of the silicon atoms. If the silicon is replaced by a thin-film silicon layer, like in SOI CMOS devices, the radiation particles are more likely to pass through the silicon thin-film without ionizing the silicon atoms in the thin-film.

In the case of an SOI CMOS technology, the particles may still ionize some of the silicon atoms in the substrate, but since the active silicon layer is separated from the substrate by the buried oxide, the devices implemented in an SOI CMOS technology are less affected by the incident radiation [19, 44]. For that reason, the radiation hardness is better for SOI CMOS than for bulk CMOS technologies [19]. Hence, using an SOI CMOS technology for implementing, e.g., memories, would reduce their soft-error rate, compared with if they were implemented in a bulk CMOS technology [15, 44].

There is the possibility that some of the incident particles are trapped in the buried oxide, which could turn on the back channel, and by this reduce the threshold voltage of the device [44]. The back channel is an unwanted channel that can be formed in the bottom of the active silicon [44]. The effect of the incident radiation on the SOI CMOS devices is however still much less, compared with its effect on bulk CMOS devices.

The incident radiation particles can also ionize the oxide atoms in the buried oxide layer. This can form a conducting trace through the oxide. Some of the incident radiation particles can then travel back to the silicon thin-film on top of the insulating oxide, thereby introducing soft-errors. To avoid this, the more radiation hard SOS substrates should be used. This is why SOS often is used for, e.g., space applications.

### 2.3.7 Crosstalk and Passives

One of the methods to reduce the latch-up effect in bulk CMOS technologies is to use a substrate with sufficiently low resistivity [40]. However, a substrate with too low resistivity would give too high crosstalk, poor noise performance, and reduced quality of the implemented passives, such as resistors, capacitors, and inductors. The resistivity is therefore chosen as a trade-off, and is often limited to below around 20 Ωcm [15].

In SOI CMOS technologies the latch-up effect is not present. This can be utilized by using different resistivity in the silicon above and below the insulator. The resistivity of the substrate below the insulator can be increased and the resistivity of the silicon on top of the insulator can be maintained on the same level as before, or even reduced [44]. If this scheme is applied, the crosstalk is significantly reduced, compared with bulk CMOS technologies. SOI CMOS with high resistivity substrate actually seems to be a good candidate for future high frequency mixed signal integrated circuits [60]. Further crosstalk improvements can be reached if guard rings are introduced into the silicon thin-film and the substrate [37].

In addition to a reduced crosstalk, the quality factor (Q) of on-chip inductors is also improved when increasing the resistivity of the substrate [20, 33, 48, 62]. For bulk CMOS with a regular substrate resistivity, about 20 Ωcm, the Q factor of inductors is about four to six [15]. If instead SOI CMOS with high resistivity substrate is used, the Q factors of the inductors can be significantly increased. Single metal layer inductors with a Q factor of 11 for a substrate resistivity of 10 kΩcm [21], and a Q factor of 50 for a substrate resistivity of 200 Ωcm and multiple metal layers have been reported [59]. The SOI CMOS technology therefore has the potential of being used for digital, mixed signal, and RF circuits, all implemented on the same chip [14]. Hence, the SOI CMOS technology is a good candidate as a mainstream technology for future system-on-chip implementations.

## 2.4 Partially vs Fully Depleted SOI

This section goes into more detail regarding the differences between fully depleted and partially depleted SOI CMOS technologies. The most important unwanted effects of SOI compared with bulk CMOS are also presented in more detail. Most of them appear only in partially depleted SOI technologies, but the thermal effects affect both types of devices.

### 2.4.1 Kink and History Effect

Due to the high electric field in the pinched-off region of the channel, impact ionization occurs when the SOI MOSFET is operated in the saturation region [15]. The impact ionization generates electron-hole pairs, which are separated by the electric field near the drain. In a partially depleted SOI device the electrons drift to the drain and the holes are injected into the undepleted body region [15], as illustrated in Figure 2.4(a).

Since there is a potential barrier between the body and source, the body potential increases rapidly to a value equal to the built-in potential of the body-source junction [44]. The increase in body potential reduces the threshold voltage of the device via the body effect. The reduction of threshold voltage can be observed in the $I_d/V_{ds}$ characteristic as a sharp increase, a kink, in the drain current at the onset of impact ionization, as illustrated in Figure 2.4(b) [44, 77]. This effect is referred to as the kink effect.

**Figure 2.4:** Partially depleted SOI device during (a) impact ionization giving rise to (b) the kink effect.

The partially depleted SOI device has an undepleted body region where charges generated by gate-to-body tunneling, impact ionization, or diode leakage, may accumulate, due to a limited recombination time constant [44]. The charge accumulation alters the body potential, and thereby the threshold voltage of the device. As a result, the threshold voltage, and hence gate delay for digital circuits, will vary depending on the frequency, bias conditions, and the switching history. This is the history effect [51]. The history effect will give different behavior in DC and transient operation, which is a cumbersome property for analog applications, where the DC bias should set the transient characteristics.

Since fully depleted SOI devices lack an undepleted body region, neither the history effect nor the kink effect exists in such devices [44, 95]. In partially depleted SOI, these effects can be avoided by using body contacts. The body contacts remove the accumulated holes in the body and stabilize the body potential, i.e., this is the same method as used in bulk CMOS technologies to avoid having a floating body. Two other methods for reducing the kink effect are either to use cascodes, or to optimize the bias point and transistor sizes so that the transistors that set the gain and current operate outside the kink region [8, 9]. However, as the devices are scaled the body contact area is reduced, which increases the body contact resistance. Hence, for future SOI CMOS technologies, other methods or fully depleted SOI CMOS instead of partially depleted must be used.

### 2.4.2 Self-heating and Thermal Coupling

Due to the buried oxide layer in SOI CMOS technologies the thermal conductivity is about 100 times lower for the devices compared with bulk CMOS devices [82]. Most of the heat generated in a bulk CMOS device is transferred to the substrate below the device and only little heat is transferred to neighboring devices. For an SOI CMOS device the situation is different. Due to the poor thermal conductivity of the insulating layer, more of the heat generated by the device will remain in the device, which increases its temperature. In addition, more heat is transferred to neighboring devices, which increases their temperature as well. This is illustrated by Figure 2.5. As a result, the local device temperature can vary significantly when the power dissipation of the device changes. This effect is known as self-heating and it depends on several parameters. The most important parameter is the power consumption of the device, since this corresponds to the generated heat. The increased device temperature leads to a decrease of the electron and hole mobility, which in turn results in a decrease of drain current. The variation of the drain current due to self-heating can be as much as 20-25 % [44]. For bulk CMOS, the device temperature is both lower and more even compared with SOI CMOS devices. The self-heating effect has a time constant on the order of a microsecond [45]. The variation therefore becomes less important for fast switching logic circuits, since the circuit will reach a thermal equilibrium [44, 82]. The characteristics of the devices are then constant.

**Figure 2.5**: Illustration of the heat transfer in (a) bulk CMOS and (b) SOI CMOS.

#### Thermal Effects in Analog Circuits

If analog circuits are considered, the effects of self-heating are more severe than for digital circuits. The reason is that the self-heating in combination with poor thermal coupling between the devices on the chip degrades the matching between the devices, and worse, the thermal coupling may introduce new feedback paths, i.e., thermal feedback [45]. Hence, unexpected instability can occur that may not be visible in circuit simulations during the design if the thermal coupling is not included. These issues must be considered when designing analog circuits, e.g., by finding methods to reduce the unwanted thermal effects. A few methods to reduce the thermal effects are touched upon in this section.

To improve the device matching, the devices can be placed in the same well to equalize the local temperature. This applies well to the general layout strategies for bulk CMOS, e.g., the layout of the input pair of a differential amplifier, or a current mirror. However, if the input and output transistors of a current mirror with several outputs are placed close together, the heat generated by the output transistors can more easily be transferred to the input transistors than if they are separated by a large distance. The output transistors closest to the input transistor therefore affect the input transistor more than the output transistors further away. This generates a mismatch between the output transistors since they do not have the same effect on the input transistors [82]. The heat flow of a current mirror is illustrated in Figure 2.6, which also is the floor plan of the current mirror. The heat flow $\Phi_1$ from the output transistor M1 introduces a voltage shift $\Delta V_{ds,0}$ of the input transistor M0. The voltage shift results in the current shift $\Delta I_1$ of the output transistor Mx. Since the transistor Mx is further away from the input than M1, the heat flow from Mx, $\Phi_x$, has less effect on the input transistor M0. Consequently, since Mx and M1 have different effects on M0, this results in a mismatch between the output transistors M1 and Mx.

The ability to simulate the thermal coupling in analog circuit design in SOI technology is important, especially when the technology is scaled down [45, 82]. The thermal effects must therefore be included in the models. Today the self-heating effect is easily included [71]. However, the thermal heat transfer is also important to include, if not even more important, since the heat transfer can result in the unwanted feedback paths [45, 82]. Models and simulators that can fully simulate the thermal effects, both self-heating and thermal heat transfer, are required. In addition, for accurate simulation of the thermal coupling the floor plan must be known.

**Figure 2.6:** The heat transfer in a current mirror.

During the manufacturing, transport, and later handling of the circuits by the end user, electrostatic charges may build up. An electrostatic discharge (ESD) can then occur which can seriously damage the circuits if they are not properly protected. The protection in bulk CMOS is commonly accomplished by bypassing the ESD currents through a diode network [44]. Hence, the purpose of the diodes is to prevent the ESD current from entering the core of the chip and damaging the circuits. Many ESD protection circuits implemented in SOI are also based on diode networks. Due to the thin-film structure the diodes must however be realized in different ways than for bulk CMOS. Some of the ESD protection circuits used in SOI CMOS technologies are presented in Section 2.5.2.

### 2.5.1 ESD Models

As long as the ESD currents are not large enough to damage the ESD protection circuits, the circuits in the core of the chip are protected if the ESD is within the specified values. To design the ESD protection circuits, the sources of the ESD have to be modeled to get an idea of the magnitude of the currents or voltage pulses that the protection circuits must withstand. The two models illustrated in Figure 2.7 are commonly used for this purpose [42].

The first model is the human body model, shown in Figure 2.7(a). In this model, the discharge from a human to the device is modeled by the discharge of a 100 pF capacitor through a 1.5 kΩ resistor. The capacitor is charged to the ESD voltage $V_{ESD}$.

The machine model represents the discharge of charge built up on machines during manufacturing, and is illustrated in Figure 2.7(b). It differs from the human body model in that the human body resistance is absent and the capacitance is doubled. The absence of the resistance implies that this model yields higher ESD currents compared with the human body model for the same $V_{ESD}$.
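
To give a feeling for the magnitudes involved, the sketch below evaluates the ideal RC discharge of the human body model and the energy stored in the capacitor of each model. The 2 kV test voltage is an assumed example value, and the function names are introduced here for illustration. The machine model peak current is not computed, since the model as described above contains no series resistance and the current is then set by parasitics that are not specified here.

```python
# Sketch: first-order quantities of the two ESD models.
# Component values follow the text (100 pF and 1.5 kOhm for the human body
# model, twice the capacitance for the machine model). V_ESD is assumed.

C_HBM = 100e-12       # human body model capacitance in F
R_HBM = 1.5e3         # human body model resistance in ohm
C_MM = 2 * C_HBM      # machine model capacitance in F


def hbm_peak_current(v_esd, r=R_HBM):
    """Initial (peak) current of the ideal RC discharge, in A."""
    return v_esd / r


def hbm_time_constant(r=R_HBM, c=C_HBM):
    """Time constant of the ideal RC discharge, in s."""
    return r * c


def stored_energy(v_esd, c):
    """Energy stored in the model capacitor before the discharge, in J."""
    return 0.5 * c * v_esd ** 2


v_esd = 2000.0  # assumed ESD voltage in V
print(f"HBM peak current:  {hbm_peak_current(v_esd):.2f} A")
print(f"HBM time constant: {hbm_time_constant() * 1e9:.0f} ns")
print(f"HBM stored energy: {stored_energy(v_esd, C_HBM) * 1e3:.2f} mJ")
print(f"MM stored energy:  {stored_energy(v_esd, C_MM) * 1e3:.2f} mJ")
```

For the same $V_{ESD}$ the machine model stores twice the energy of the human body model, and without the series resistance this energy is delivered to the protection circuit much faster.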

Figure 2.7: Illustration of (a) the human body model and (b) the machine model for ESD testing and design.

### 2.5.2 ESD Protection Circuits for SOI CMOS

The ESD protection circuits designed for bulk CMOS can in general not be used for SOI CMOS. In bulk CMOS technologies, it is easy to get large-area vertical junctions between the positively and negatively doped areas (pn-junctions) to remove the ESD currents. In SOI CMOS, the use of a thin-film separated from the substrate by the insulating layer results in a higher current density in the pn-junctions, since the pn-junction area is small due to the thin-film structure. In addition, no vertical pn-junctions are available due to the buried oxide layer. The power density is thereby higher, yielding a higher device temperature. The temperature rise is not desirable due to the poor heat dissipation of SOI CMOS compared with bulk CMOS. The higher peak temperature of the devices can do serious damage, which is one reason why many of the ESD protection approaches used for bulk CMOS technologies cannot directly be used in SOI CMOS technologies [44, 77, 78]. Another reason is the difficulty of achieving a sufficiently large pn-junction area on a small chip area in SOI CMOS compared with bulk CMOS technologies. The ESD protection circuits therefore consume a larger area in an SOI CMOS technology compared with a bulk CMOS technology if bulk CMOS ESD protection circuits are used. If the diodes in the bulk CMOS ESD protection circuits are modified and adapted to the thin-film structure of the SOI CMOS technology, the required chip area can be the same as in a bulk CMOS technology for at least the same ESD robustness [91, 92].

If partially depleted SOI CMOS is used, the gated double-diode networks using CMOS, PMOS, or NMOS devices, shown in Figure 2.8, can be used [92]. These ESD protection circuits are directly mapped from bulk CMOS ESD protection circuits, based on high perimeter diode structures [92]. The body contact of the transistors serves as the anode or cathode, and the device is used to realize a diode-like structure for the ESD protection.

When ESD occurs the channel is turned on, causing not only the body to source and drain path to conduct, but also the body to channel path [44]. The gated double-diode approach thereby provides a low resistive ESD bypass path, which can allow a higher ESD current than many of the diode structures used in bulk CMOS technologies. The performance is therefore close to what it is in bulk CMOS. The ESD robustness can be above 4 kV for 500 µm wide and about 4 µm long partially depleted SOI CMOS devices using the human body ESD model [92]. The gated double-diode approach can however not be used for fully depleted devices, since body contacts are required.

For fully depleted SOI CMOS technologies other approaches have to be used. One is to use lateral unidirectional bipolar type insulated gate transistors, also known as Lubistors [44, 91, 92]. No body contact is required using this approach and the lateral structure of the Lubistors makes them especially suitable in fully depleted SOI technologies. In Figure 2.9(a) a cross section of a Lubistor is shown, and its schematic symbol is shown in Figure 2.9(b). An SOI CMOS ESD protection circuit for an input pad is depicted in Figure 2.9(c) for a technology with a core supply voltage $V_{DD}$ equal to 2.5 V [91]. The circuit uses Lubistors instead of the diodes used in bulk CMOS technologies, or the gated double-diode networks in Figure 2.8. The circuit was in fact directly mapped from an ESD protection circuit used in a bulk CMOS technology [91]. The ESD protection diodes in the bulk CMOS technology were simply replaced by Lubistors.

The circuit in Figure 2.9(c), implemented in SOI CMOS, has an ESD robustness that is even better than many bulk CMOS implementations [91]. It can withstand ESD pulses exceeding 6.5 kV using the human body model, while the bulk CMOS implementation failed at 4.3 kV in the experiment presented in [91]. It is worth noting, though, that bulk CMOS ESD protection networks have been shown to withstand more than 8 kV using the human body model.

The ESD robustness of the Lubistor based circuits is not their only advantage. In addition, studies have shown that the ESD protection properties of the Lubistors improve with technology scaling [91]. Hence, it is a promising approach for the future as well.

Chapter 3

Analog-to-Digital Conversion

This chapter gives an introduction to analog-to-digital conversion and defines the performance measures used in this work to evaluate the ADC performance. The flash ADC topology is presented in more detail, followed by an introduction to some of the error sources encountered in flash ADCs, and methods to compensate for these error sources. The last part in this chapter explains how dynamic element matching (DEM) can be introduced in flash ADCs.

## 3.1 ADC Function

The ADC can be modeled as a sample-and-hold (SH) circuit followed by a quantizer, as illustrated by Figure 3.1(a) where the dashed frame illustrates the schematic symbol of an ADC. The SH circuit samples the input voltage $V_{in}$ and generates the sampled voltage $V_s$. The sampled voltage is held while the quantizer converts the sampled voltage $V_s$ to the digital output $D_{out}$. The conversion is illustrated by Figure 3.1(b), where a ramp input is converted. The result of the conversion is the digital output $D_{out}$ given by (3.1a) [88], where $q_s$ is the quantization step and $q_e$ is the quantization error. The number of bits in the digital output, denoted $N$, is the resolution of the ADC.

\[
\frac{V_s}{q_s} = D_{out} + q_e \quad (3.1a)
\]

\[
D_{out} = \sum_{i=0}^{N-1} d_i 2^i \quad (3.1b)
\]
The quantization step size $q_s$ is equivalent to one least significant bit (LSB) and is given by the ratio between the full-scale voltage of the ADC, $V_{FS}$, and the number of quantization steps, according to (3.2). The full-scale voltage is the maximum input voltage that can be applied to the converter input without saturating the converter.

$$q_s = \frac{V_{FS}}{2^N}$$  \hspace{1cm} (3.2)

The quantization error $q_e$ is the difference between $V_s/q_s$ and the quantized digital signal $D_{out}$, and is shown in Figure 3.1(b) for a ramp input. Expressed as a voltage, the magnitude of the quantization error should not exceed half an LSB for correct operation [88], i.e.,

$$-\frac{q_s}{2} < q_e \leq \frac{q_s}{2}.$$  \hspace{1cm} (3.3)
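
The relations (3.1a) to (3.3) can be illustrated with a minimal sketch of an ideal quantizer. The function name `ideal_quantize` and the saturation at the end codes are assumptions made for illustration; they are not taken from the behavioral models of this work.

```python
import math

def ideal_quantize(v_s, v_fs, n_bits):
    """Return (d_out, error_volts) for an ideal N-bit quantizer."""
    q_s = v_fs / 2 ** n_bits          # quantization step, eq. (3.2)
    # Choose the code so that the error lies in (-q_s/2, q_s/2], eq. (3.3).
    d_out = math.ceil(v_s / q_s - 0.5)
    # Assumption: inputs outside the code range saturate at the end codes.
    d_out = max(0, min(2 ** n_bits - 1, d_out))
    error = v_s - d_out * q_s         # quantization error expressed in volts
    return d_out, error

if __name__ == "__main__":
    for v in (0.0, 0.0625, 0.3, 0.7):
        print(v, ideal_quantize(v, v_fs=1.0, n_bits=3))
```

With a full-scale voltage of 1 V and three bits the step is 0.125 V, so an input of 0.3 V maps to code 2 with an error of 0.05 V.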

### 3.2 Quantization Noise

Assuming that the quantization error is uniformly distributed on the interval given by (3.3), the mean-squared value of $q_e$ can be calculated according to [88]

$$\langle q_e^2 \rangle = \frac{1}{q_s} \int_{-\frac{q_s}{2}}^{\frac{q_s}{2}} q_e^2 d(q_e) = \frac{1}{12} q_s^2.$$  \hspace{1cm} (3.4)

The mean-squared value of $q_e$, given by (3.4), can be used to derive the well known expression in (3.5b), which is the signal-to-quantization noise ratio (SQNR) for an $N$-bit ADC with a sinusoid input. The root mean-squared value of a sinusoid with an amplitude of $V_{FS}/2$ is $V_{FS}/(2\sqrt{2})$, where $V_{FS} = q_s2^N$ [40, 88]. The SQNR for an $N$-bit quantized full-scale sinusoid signal can then be calculated by the ratio of the root mean-squared value of the input over the root mean-squared value of the quantization noise, according to

$$\text{SQNR} = \frac{V_{FS}/(2\sqrt{2})}{q_s/\sqrt{12}} = \frac{\sqrt{12}\, q_s 2^N}{2\sqrt{2}\, q_s},$$  \hspace{1cm} (3.5a)

which in log-domain becomes

$$\text{SQNR} = 6.02N + 1.76 \text{ dB}.$$  \hspace{1cm} (3.5b)
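
As a quick numerical check of (3.5b), the ratio of the two root mean-squared values can be evaluated directly. This is a sketch only; the function name `sqnr_db` is hypothetical and $q_s$ is normalized to one.

```python
import math

def sqnr_db(n_bits):
    """SQNR in dB for a full-scale sinusoid and an ideal N-bit quantizer."""
    q_s = 1.0                                  # normalized quantization step
    v_fs = q_s * 2 ** n_bits                   # full-scale voltage
    signal_rms = v_fs / (2.0 * math.sqrt(2.0)) # RMS of the full-scale sinusoid
    noise_rms = q_s / math.sqrt(12.0)          # square root of eq. (3.4)
    return 20.0 * math.log10(signal_rms / noise_rms)

if __name__ == "__main__":
    for n in (4, 6, 8):
        print(n, round(sqnr_db(n), 2), round(6.02 * n + 1.76, 2))
```

The two printed columns agree to within the rounding of the constants 6.02 and 1.76.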
As mentioned, the resolution $N$ of an ADC is the number of bits $d_i$ in the digital output $D_{out}$ of the converter. This is not the same as the accuracy of the converter. Instead, the accuracy reveals how much of the output is significant after accounting for all the non-ideal behavior and errors introduced during the conversion [27]. Hence, the resolution is the goal and the accuracy is the result.
3.3.2 Signal-to-Noise Ratio

The signal-to-noise ratio (SNR) is the ratio of the power of the fundamental to the total noise power [32, 40, 88]. The total noise power is calculated by integrating the noise over the frequency band from DC up to the Nyquist frequency, i.e., half the sampling frequency $f_s$. The DC and harmonic components are excluded from the integration [31]. The SNR is often measured in decibels and is given by

$$SNR = 10 \log_{10} \left( \frac{P_{signal}}{P_{noise}} \right) \text{dB}. \quad (3.6)$$

For an ideal system in which quantization is the only noise source, the SNR is equal to the SQNR given by (3.5b). Hence the SQNR is the maximum achievable SNR for an $N$-bit ADC.

3.3.3 Signal-to-Noise and Distortion Ratio

The signal-to-noise and distortion ratio (SNDR) is the ratio of the signal power to the total noise and distortion power, i.e., the noise considered in the SNR plus the harmonics. As for the SNR, the SNDR is measured in the frequency band from DC to $f_s/2$, excluding the DC component [31]. In log-domain it becomes

$$SNDR = 10 \log_{10} \left( \frac{P_{signal}}{P_{noise} + P_{distortion}} \right) \text{dB}. \quad (3.7)$$

3.3.4 Spurious-Free Dynamic Range

The spurious-free dynamic range (SFDR) is the difference between the amplitude of the desired output signal and the amplitude of the largest output signal component that is not present in the input [31]. Expressed in terms of power instead of amplitude, the SFDR is the ratio between the power of the desired output signal component $P_{signal}$ and the power of the largest output component that is not present in the input, i.e., the power of the largest spurious tone $P_{spurious,max}$. Hence the SFDR is given by (3.8), and is illustrated in Figure 3.2 where an output spectrum is plotted. As seen in Figure 3.2 the SFDR is calculated by taking the difference between the magnitude of the fundamental tone and the largest spur.

$$SFDR = 10 \log_{10} \left( \frac{P_{signal}}{P_{spurious,max}} \right) \text{dB} \quad (3.8)$$
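
The definitions (3.6) to (3.8) can be collected in a short sketch. The function `dynamic_metrics` and its arguments are hypothetical; it assumes that the signal power, the noise power, and the powers of the individual spurious tones have already been extracted from an output spectrum, and that all listed spurs count as distortion.

```python
import math

def to_db(power_ratio):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(power_ratio)

def dynamic_metrics(p_signal, p_noise, p_spurs):
    """Return (SNR, SNDR, SFDR) in dB from signal, noise, and spur powers."""
    p_distortion = sum(p_spurs)                        # total distortion power
    snr = to_db(p_signal / p_noise)                    # eq. (3.6)
    sndr = to_db(p_signal / (p_noise + p_distortion))  # eq. (3.7)
    sfdr = to_db(p_signal / max(p_spurs))              # eq. (3.8)
    return snr, sndr, sfdr

if __name__ == "__main__":
    # Illustrative powers only, not simulation results from this work.
    print(dynamic_metrics(1.0, 1e-4, [1e-5, 1e-6]))
```

Since the distortion power is added to the noise power in the denominator, the SNDR can never exceed the SNR.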
The effective number of bits (ENOB) is a scaled version of the SNDR according to (3.9), i.e., the ENOB contains the same information as the SNDR. By rearranging (3.5b), with the SQNR replaced by the SNDR and $N$ by the ENOB, the ENOB is calculated as

$$\text{ENOB} = \frac{\text{SNDR} - 1.76}{6.02}, \quad (3.9)$$

which can be used to calculate the lower bound of the required ADC resolution for a specified SNDR.
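
A minimal sketch of the conversion in (3.9) is given here, with SNDR values chosen only for illustration. The function name `enob_from_sndr` is hypothetical.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from the SNDR in dB, eq. (3.9)."""
    return (sndr_db - 1.76) / 6.02

if __name__ == "__main__":
    for sndr in (37.88, 34.88, 31.86):
        print(sndr, round(enob_from_sndr(sndr), 2))
```

An SNDR of 37.88 dB corresponds to six effective bits, and lowering the SNDR by 3 dB lowers the ENOB by 3/6.02, i.e., approximately 0.5 bit.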

To find the maximum sampling frequency $f_{s,\text{max}}$ a low frequency input is applied and the sampling frequency is swept. The maximum sampling frequency is reached when the SNDR is 3 dB lower than for low sampling frequencies. The 3 dB reduction of SNDR is equivalent to a 0.5 bit lower ENOB. Hence, by plotting the ENOB as a function of the sampling frequency the maximum sampling frequency is found when the ENOB is 0.5 bit lower than for low sampling frequencies, as illustrated in Figure 3.3.

The ENOB used in this work was calculated by fitting a fixed frequency sine wave in MATLAB® to the sampled output data of the Eldo™ simulator in Cadence® [28, 36, 43]. The sinusoid to fit to the output data was

$$\tilde{y}_n = A \cos(\omega t_n) + B \sin(\omega t_n) + C, \quad (3.10)$$

where $\omega$ is the input angular frequency, $t_n$ are the sample times, and $A$, $B$, and $C$ are the fit parameters.

Figure 3.3: Illustration of how to derive the maximum sampling frequency.

The squared error $\Psi$ between the output samples $y_n$ and the sinusoid to fit, $\tilde{y}_n$, is

$$\Psi = \sum_{n=1}^{M} (y_n - \tilde{y}_n)^2 = \sum_{n=1}^{M} (y_n - A \cos(\omega t_n) - B \sin(\omega t_n) - C)^2, \quad (3.11)$$

where $M$ is the number of output samples.

The squared error given by (3.11) is then minimized by setting the partial derivatives with respect to the fit parameters $A$, $B$, and $C$ to zero. This yields a linear equation system that is solved using MATLAB®, which gives the fit parameters. The ENOB is then calculated according to the following expression [43], where $\Psi/M$ is the mean-squared fit error and $q_s^2/12$ is the ideal quantization noise power from (3.4).

$$\text{ENOB} = N - \frac{1}{2} \log_2 \left( \frac{12 \Psi}{q_s^2 M} \right) \quad (3.12)$$
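
The fixed-frequency sine fit described above can be sketched in a few lines. The work itself used MATLAB; the Python version below is only an illustration, and all function names are hypothetical. It assumes that the residual is normalized by $q_s^2$ so that the argument of the logarithm is dimensionless, and it uses an ideally quantized sinusoid as test data instead of simulator output.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def sine_fit(samples, times, omega):
    """Least-squares fit of A cos(wt) + B sin(wt) + C; returns (A, B, C, psi)."""
    basis = [(math.cos(omega * t), math.sin(omega * t), 1.0) for t in times]
    # Normal equations obtained by setting the partial derivatives to zero.
    normal = [[sum(r[i] * r[j] for r in basis) for j in range(3)]
              for i in range(3)]
    rhs = [sum(r[i] * y for r, y in zip(basis, samples)) for i in range(3)]
    a, b, c = solve_linear(normal, rhs)
    psi = sum((y - (a * r[0] + b * r[1] + c)) ** 2
              for r, y in zip(basis, samples))
    return a, b, c, psi

def enob_from_fit(psi, n_samples, q_s, n_bits):
    """ENOB from the squared fit error, with the residual scaled by q_s**2."""
    return n_bits - 0.5 * math.log2(12.0 * psi / (q_s ** 2 * n_samples))

def demo():
    """Fit an ideally quantized 6-bit sinusoid (illustrative data only)."""
    n_bits, v_fs, m = 6, 1.0, 1024
    q_s = v_fs / 2 ** n_bits
    omega = 2.0 * math.pi * 127 / m          # radians per sample
    times = list(range(m))                   # time measured in samples
    analog = [0.5 + 0.49 * math.sin(omega * t) for t in times]
    samples = [math.floor(v / q_s + 0.5) * q_s for v in analog]
    a, b, c, psi = sine_fit(samples, times, omega)
    return a, b, c, enob_from_fit(psi, m, q_s, n_bits)

if __name__ == "__main__":
    print(demo())
```

For the ideally quantized test signal the fitted ENOB is expected to be close to the nominal resolution, since the only error left after the fit is the quantization error.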

### 3.3.6 Effective Resolution Bandwidth

The effective resolution bandwidth (ERBW) is defined as the input frequency at which the ENOB is reduced by 0.5 bit compared with the ENOB at low input frequencies. Hence, by sweeping the input frequency instead of the sampling frequency the ERBW can be derived by the same method as for the maximum sampling frequency, as illustrated by Figure 3.4. In addition, it is measured at the maximum sampling frequency $f_{s, \text{max}}$.

### 3.3.7 Figure of Merit

To compare the efficiency of different ADC designs a figure of merit (FoM) is required. The figure of merit can be defined in several ways [29, 64, 93].
In this work, the FoM considers the power consumption of the converter as well as the ENOB and ERBW, according to

\[ \text{FoM} = \frac{\text{Power}}{\text{ERBW} \cdot 2^{\text{ENOB} + 1}} \ \text{J}, \quad (3.13) \]

which yields the energy per conversion step.
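
The figure of merit is straightforward to evaluate, as the sketch below shows. The function name `figure_of_merit` is hypothetical. The power and ERBW in the example are the values quoted in the abstract for the first ADC, whereas the ENOB of 5.0 bits is a placeholder assumed only for illustration.

```python
def figure_of_merit(power_w, erbw_hz, enob):
    """Energy per conversion step in joules: Power / (ERBW * 2**(ENOB + 1))."""
    return power_w / (erbw_hz * 2.0 ** (enob + 1.0))

if __name__ == "__main__":
    # 170 mW and 390 MHz are from the abstract; ENOB = 5.0 is hypothetical.
    fom = figure_of_merit(power_w=0.170, erbw_hz=390e6, enob=5.0)
    print(f"{fom * 1e12:.2f} pJ per conversion step")
```

A lower value indicates a more energy-efficient converter, and each additional effective bit at the same power and bandwidth halves the figure of merit.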

### 3.4 Flash ADC Topology

High speed ADCs are often based on a flash structure [41, 85, 88]. In an \( N \)-bit flash ADC, the input signal is applied to the inputs of \( 2^N - 1 \) comparators, where \( N \) is the resolution of the converter. Each comparator is connected to a reference voltage that commonly is generated by a resistor net. The generation of the reference voltages is described in more detail in Section 3.4.1.

The output of a comparator is high if the input voltage is larger than the reference voltage at the reference input of the comparator. Otherwise the output is low. Hence, the output pattern of the comparators corresponds to thermometer code. The comparators can be implemented in several different ways, which will be discussed further in Section 3.4.2 where the comparator topology used in this work is presented.
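
The comparator bank can be sketched as follows. The function names are hypothetical, and the placement of the reference levels at integer multiples of one LSB is an assumption corresponding to an ideal string of $2^N$ equal resistors; it is not the behavioral model of Chapter 5.

```python
def reference_levels(v_fs, n_bits):
    """Ideal reference voltages of the 2**N - 1 comparators (assumed taps)."""
    n_levels = 2 ** n_bits
    return [m * v_fs / n_levels for m in range(1, n_levels)]

def thermometer_code(v_in, refs):
    """Comparator outputs: 1 where the input exceeds the reference, else 0."""
    return [1 if v_in > v_ref else 0 for v_ref in refs]

if __name__ == "__main__":
    refs = reference_levels(v_fs=1.0, n_bits=3)
    print(thermometer_code(0.3, refs))
```

For a 3-bit converter with a 1 V full-scale voltage, an input of 0.3 V exceeds the two lowest reference levels, so the two lowest comparators output a one and the remaining five output a zero.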

In Figure 3.5 the thermometer code is encoded to the binary output code by the \( (2^N - 1) \)-to-\( N \) encoder, i.e., the thermometer-to-binary encoder. More details on the thermometer-to-binary encoder are found in Section 3.4.3. The comparators in Figure 3.5 are numbered, and this number corresponds to the thermometer output position \( m \) later used in Section 3.4.3.
The reference generator used in flash ADCs usually consists of one or two strings of resistors [32]. Two strings are required when differential comparators are used. In this design the input signal is single ended, hence only one string of resistors is used, as illustrated by Figure 3.5. The behavioral level modeling of the reference generator is presented in Section 5.2. The results of this model are used for the design of the reference generator for the flash ADC with MUX-based encoder in Section 6.1.1 and the reference generator for the DEM flash ADC in Section 6.2.1.

3.4.2 Comparator

The comparators compare two signals on their inputs. Since the ADC input is single ended, the comparators have only two inputs, i.e., the positive signal input and the negative reference input. If the input signal is larger than the reference signal the output should be logic one. In Figure 3.6 an illustration of the comparator used in this work is shown. As seen the comparator has a differential output, i.e., each comparator has two outputs. These outputs are connected to D-flip-flops that hold the outputs for a clock period, which gives the thermometer-to-binary encoder enough time to convert the outputs of the comparators to the binary output of the ADC. In this work, only the D-flip-flops connected to the positive outputs of the comparators serve as inputs to the encoder. The purpose of the D-flip-flops connected to the negative outputs is to give the same load on the two comparator outputs.

![ comparator diagram ](image)

**Figure 3.6:** Block diagram of a comparator with D-flip-flops on the outputs.

There are numerous different comparator topologies. For low requirements on resolution and speed, an operational amplifier can be used as the comparator [40]. For high-resolution applications the voltage corresponding to an LSB ($V_{\text{LSB}}$) is small. The gain of the comparator then has to be high to be able to amplify an input voltage of magnitude $V_{\text{LSB}}$ to an output voltage that can be interpreted as a logic one or zero. If in addition the requirement on the conversion rate of the ADC is high, the comparator has to detect if the difference between its inputs is positive or negative within a short time period. This implies that the bandwidth of the comparator must be high.

The requirements on high gain and high bandwidth of the comparators generally require the use of comparators based on cascaded amplifiers [40], as illustrated in Figure 3.7(a). At very high speed, the fast latched comparator topology is most commonly used [32, 88]. It is often combined with a number of cascaded preamplifiers as illustrated in Figure 3.7(b) [40]. Another advantage of latch-based comparators is that they can operate at a low supply voltage [87], which is the case in this work, where the maximum supply voltage is limited to 1.2 V by the SOI CMOS technology used.

In the targeted applications of this work the requirement on conversion speed is high, but the required resolution is only six bits. The speed of the comparators is therefore more important than their accuracy. Consequently, a latched comparator with only one preamplifier is used in this work. The chosen comparator topology is shown in Figure 3.8. As seen from this figure the preamplifier has a passive load. If it instead had an active load, it could be designed for a higher gain, which would improve the accuracy of the comparator. Since the speed of the comparator is more important than the accuracy in this work, gain is traded for speed by choosing a passive load instead of an active one [87, 88].
As mentioned earlier the output pattern of the comparators corresponds to thermometer code. This is generally encoded to binary code, but other output codes can also be used, such as gray code [64, 86, 87, 89]. There are also examples where the encoding is divided into two steps: thermometer-to-gray encoding followed by gray-to-binary encoding. This can improve the bit error rate by reducing the effect of so-called bubble errors [12, 47, 66], explained below.

For low-resolution and low-speed converters the input to the encoder will indeed be a perfect thermometer code. However, as the resolution is increased the bubble error rate increases, especially if the sampling rate of the ADC is increased as well. The “bubbles” in the thermometer code are digital zeroes introduced in the string of ones, or digital ones introduced in the string of zeroes. The bubbles are mainly introduced near the transition level in the thermometer code [40]. They are due to the uncertainty of the effective sampling instant, introduced by, e.g., the propagation of the input signals and clock signals over long distances. This propagation incurs a timing difference between the input signal lines and the clock lines. Hence, the clock and input signal paths should be closely matched by, e.g., using a buffer tree for the clock distribution. Other bubble error sources are the comparator metastability, the comparator offset, cross talk, noise, limited preamplifier bandwidth, etc. [40, 41, 75, 80].

The remainder of this section presents two different encoder topologies: the read-only memory (ROM) encoder topology and the ones-counter encoder topology.

**ROM Encoder**

A common approach to encode the thermometer code is to use a gray or binary-encoded ROM. In Figure 3.9 a flash ADC with a gray encoded ROM is depicted [85]. The appropriate row $m$ in the gray encoded ROM is selected by using a row encoder that has the output of comparator $m$ and the inverse of comparator $m+1$ as inputs. The output $m$ of the row encoder, connected to memory row $m$, is high if the output of comparator $m$ is high and the output of comparator $m+1$ is low. The row encoder can be realized by, e.g., a number of 2-input NAND gates, where one input to each NAND gate is inverted. This circuit selects multiple rows if a bubble error occurs, which introduces large errors in the output of the encoder [41, 85].
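
The row selection rule, and the way a bubble defeats it, can be illustrated with a short logic-level sketch. The function `select_rows` is hypothetical and uses zero-based list positions instead of the comparator numbering of Figure 3.5. It assumes that the position above the top comparator is treated as a logic zero.

```python
def select_rows(thermo):
    """Row m is selected if comparator m is high and comparator m+1 is low."""
    rows = []
    for m, bit in enumerate(thermo):
        # Assumption: the input above the top comparator is tied low.
        above = thermo[m + 1] if m + 1 < len(thermo) else 0
        rows.append(1 if bit == 1 and above == 0 else 0)
    return rows

if __name__ == "__main__":
    clean = [1, 1, 1, 0, 0, 0, 0]
    bubble = [1, 1, 1, 0, 1, 0, 0]   # a single one above the transition
    print(select_rows(clean))
    print(select_rows(bubble))
```

The clean code selects exactly one row, whereas the code with a bubble selects two rows at once, which is the multiple-row selection described above.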

If only single bubble errors occur, these errors can be corrected by using 3-input NAND gates, as shown in Figure 3.9. The 3-input NAND gates remove all bubble errors if they are separated by at least three bits in the thermometer scale.

The main advantage of the ROM encoder approach is its regular structure that is straightforward to design. A disadvantage is that as the conversion speed increases, more bubble errors are introduced and a more advanced bubble error correction scheme is required. As the complexity of the bubble error correction circuit increases, its propagation delay will in general also increase. The longer propagation delay reduces the speed of the overall encoder unless pipelining is applied. The increased complexity of the circuit also increases its chip area, and it will most likely consume more power [75, 80].
The output of a thermometer-to-binary encoder is the number of ones on the input represented in, e.g., gray or binary code. Hence, a circuit counting the number of ones in the thermometer code, i.e., a ones-counter, can be used as the encoder.

The use of a ones-counter gives global bubble error suppression [41, 80]. Another benefit of the approach is that a suitable ones-counter topology may be selected by trading speed for power. This is why the Wallace tree topology [94], illustrated in Figure 3.10, is a good candidate as an encoder for high-speed converters [41, 57, 80].
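
The principle of counting ones with full adders can be sketched as follows. This is only a logic-level illustration with hypothetical function names; it reduces the bits column by column with full adders and does not reproduce the exact tree structure of Figure 3.10 or its timing.

```python
def full_adder(a, b, c):
    """Return (sum, carry) of three one-bit inputs."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry

def ones_counter(thermo):
    """Count the ones with full adders; returns the bits, LSB first."""
    columns = [list(thermo)]          # columns[k] holds bits of weight 2**k
    k = 0
    while k < len(columns):
        bits = columns[k]
        while len(bits) > 1:
            a = bits.pop()
            b = bits.pop()
            c = bits.pop() if bits else 0   # pad with zero (half adder case)
            s, carry = full_adder(a, b, c)
            bits.append(s)                  # the sum keeps the same weight
            if k + 1 == len(columns):
                columns.append([])
            columns[k + 1].append(carry)    # the carry has twice the weight
        k += 1
    return [col[0] if col else 0 for col in columns]

def to_int(bits_lsb_first):
    """Convert a list of bits, LSB first, to an integer."""
    return sum(bit << k for k, bit in enumerate(bits_lsb_first))

if __name__ == "__main__":
    print(to_int(ones_counter([1, 1, 1, 0, 0, 0, 0])))
    print(to_int(ones_counter([1, 1, 1, 0, 1, 0, 0])))
```

A single bubble changes the count by only one, instead of selecting an unrelated output code, which illustrates the global bubble error suppression of the ones-counter approach.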

### 3.5 Error Sources and Error Correction in Flash ADCs

As mentioned earlier the input signal of the flash ADC is connected to the inputs of $2^N - 1$ comparators, where $N$ is the number of bits. Each comparator is also connected to a reference voltage, commonly generated by a resistor net. All these parts of the converter introduce errors due to their non-ideal behavior caused by, e.g., component mismatch. The errors introduced by these parts are discussed in more detail in Section 3.5.2 and Section 3.5.3. The effects of sampling time uncertainty are presented first.

### 3.5.1 Sampling Time Uncertainty

Consider an input that is sampled by an SH circuit. Due to clock jitter, the instant at which the clock changes varies from clock cycle to clock cycle. The exact sampling times are therefore unknown. This sampling time uncertainty, $\Delta t_s$, introduces an uncertainty in the sampled value, $\Delta V_s$ [88], as illustrated by Figure 3.11. If $\Delta V_s$ is too large, the ADC performance is significantly reduced. The maximum allowable sampling time uncertainty for a certain resolution $N$ and maximum input frequency $f_{\text{in, max}}$ will therefore be derived.

Further, consider a sinusoid input with an amplitude equal to half the full-scale voltage $V_{FS}$, and a frequency equal to the maximum input frequency $f_{\text{in, max}}$,

$$V_{in} = \frac{V_{FS}}{2} \sin (2\pi f_{\text{in, max}} t). \quad (3.14)$$

Figure 3.11: The uncertainty in the sampled value $\Delta V_s$ caused by the sampling time uncertainty $\Delta t_s$.

The maximum rate of change for the input signal occurs at the zero crossing of $V_{in}$, i.e., where the derivative of (3.14) is largest. The uncertainty in the sampled value is therefore largest at this point, since deviations of the sampling time instant give the largest deviation in the sampled value when the derivative has its maximum. The maximum ratio of $\Delta V_s$ to $\Delta t_s$ can be approximated by the maximum derivative of the input according to
\[
\max \left\{ \left| \frac{\Delta V_s}{\Delta t_s} \right| \right\} \approx \max \left\{ \left| \frac{dV_{in}}{dt} \right| \right\} = \pi f_{in,max} V_{FS}. \quad (3.15)
\]

Requiring an uncertainty in the sampled value of less than 0.5 LSB, i.e., $\Delta V_s$ should be less than half the quantization step $q_s$, yields the following relation for the maximum sampling time uncertainty by using (3.15) and $V_{FS} = q_s 2^N$ [88]
\[
\Delta t_s < \frac{0.5 q_s}{\pi f_{in,max} V_{FS}} = \frac{0.5 q_s}{\pi f_{in,max}\, q_s 2^N} = \frac{1}{\pi 2^{N+1} f_{in,max}}. \quad (3.16)
\]

In Table 3.1 the maximum sampling time uncertainty is listed for ADCs with resolutions of four, six, and eight bits. The maximum input frequency is assumed to be 500 MHz.

<table>
<thead>
<tr>
<th>$N$</th>
<th>$\Delta t_s$</th>
</tr>
</thead>
<tbody>
<tr>
<td>4 bits</td>
<td>20.0 ps</td>
</tr>
<tr>
<td>6 bits</td>
<td>5.0 ps</td>
</tr>
<tr>
<td>8 bits</td>
<td>1.2 ps</td>
</tr>
</tbody>
</table>

**Table 3.1:** Maximum sampling time uncertainty for different resolutions and an input frequency of 500 MHz.
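
The bound in (3.16) can be evaluated with a few lines of code. This is a minimal sketch; the function name `max_sampling_time_uncertainty` is hypothetical.

```python
import math

def max_sampling_time_uncertainty(n_bits, f_in_max):
    """Upper bound on the sampling time uncertainty in seconds, eq. (3.16)."""
    return 1.0 / (math.pi * 2 ** (n_bits + 1) * f_in_max)

if __name__ == "__main__":
    for n in (4, 6, 8):
        dt = max_sampling_time_uncertainty(n, 500e6)
        print(f"{n} bits: {dt * 1e12:.2f} ps")
```

For an input frequency of 500 MHz the bound evaluates to approximately 19.9 ps, 5.0 ps, and 1.2 ps for four, six, and eight bits, in line with Table 3.1. Each additional bit halves the allowable uncertainty.
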
3.5.2 Resistive Reference Generator

Due to mismatch, the resistance of each resistor in the reference net deviates from its nominal value by $dR$. The deviation can often be assumed to have a Gaussian distribution with zero mean and standard deviation $\sigma_R$, i.e.,

$$dR \sim N(0, \sigma_R). \quad (3.17)$$

An effect of the resistor mismatch is that the reference levels also deviate from their nominal levels. The reference level deviation results in a nonlinear transfer function of the ADC, which introduces harmonics in its output. These effects are included in the behavioral models of the ADCs designed in this work. The models are presented in Chapter 5.
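
How resistor mismatch shifts the reference levels can be illustrated with a small Monte Carlo sketch. It is not the behavioral model of Chapter 5. The function name, the string of $2^N$ nominally equal resistors, and the use of a relative standard deviation are assumptions made for illustration.

```python
import random

def reference_deviation_lsb(n_bits, sigma_rel, seed=0):
    """Deviation of each reference level from nominal, in LSB.

    Assumes 2**n_bits nominally equal resistors with Gaussian relative
    mismatch (zero mean, standard deviation sigma_rel) between two fixed
    supply rails.
    """
    rng = random.Random(seed)                 # fixed seed for repeatability
    n_res = 2 ** n_bits
    resistors = [1.0 + rng.gauss(0.0, sigma_rel) for _ in range(n_res)]
    total = sum(resistors)
    deviations = []
    partial = 0.0
    for m in range(1, n_res):
        partial += resistors[m - 1]
        actual = partial / total * n_res      # tap m voltage expressed in LSB
        deviations.append(actual - m)         # nominal level is m LSB
    return deviations

if __name__ == "__main__":
    dev = reference_deviation_lsb(n_bits=6, sigma_rel=0.01)
    print(max(abs(d) for d in dev))
```

Because the two ends of the string are fixed by the supply rails, the deviations are smallest near the ends and tend to be largest around the middle of the string.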

Another error source in flash ADCs is the signal feedthrough of the input signal to the reference generator outputs. The feedthrough occurs due to the parasitic capacitance between the inputs of the comparators. The effect of the feedthrough can be reduced by designing the reference net to have a sufficiently high bias current, which thereby yields reference output voltages that are stable enough to reduce the harmonics to acceptable levels. Hence, the total resistance of the reference net should be designed sufficiently low. However, if it is too low it consumes unnecessarily high power, i.e., there is a trade-off. More on this is found in Section 5.2, where models that include the above error sources are presented. These models also include the voltage fluctuations of the reference generator power supplies.

3.5.3 Comparator

The timing errors of an ADC mainly originate from four major timing error sources [88]. The first is the signal dependent delay, which is introduced by the comparators. The second is the sampling clock jitter, which depends on the quality of the off-chip clock signal. The third source is the rise and fall times of the on-chip clock signal. If these are too long, the noise from the clock buffers causes additional problems with clock jitter [88].

The last of the four error sources is the skew between the clock signal and the input signal. The skew originates from the routing of the signals over large distances on the chip. As an example, the distance from the bottom comparator to the top comparator of the converters designed in this work is about 1 mm. Hence, the timing difference between the clock signals of the bottom comparator compared to the top comparator is about 10 ps, assuming the on-chip signal propagation speed is about one third of the speed of light, and that the clock signal is routed from the bottom to
the top of the comparator array. It is therefore important to carefully route the clock and input signal so that these signal paths are closely matched in terms of propagation delay.
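
The skew estimate above follows from simple arithmetic, which can be reproduced as a sketch. The function name `propagation_delay` is hypothetical, and the propagation speed of one third of the speed of light is the assumption stated in the text.

```python
SPEED_OF_LIGHT = 299_792_458.0   # m/s

def propagation_delay(distance_m, speed_fraction=1.0 / 3.0):
    """Delay in seconds over a wire, for a given fraction of light speed."""
    return distance_m / (speed_fraction * SPEED_OF_LIGHT)

if __name__ == "__main__":
    print(f"{propagation_delay(1e-3) * 1e12:.1f} ps")
```

A distance of 1 mm gives a delay of about 10 ps, which is of the same order as the sampling time uncertainty allowed for a 6-bit converter in Table 3.1.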

Another method to reduce the effect of the signal dependent delay and the clock skew is to introduce an SH circuit on the converter input. Ideally the output of the SH circuit is constant. Hence, all comparators compare the same input independent of the exact sampling time of each comparator, i.e., the effect of clock skew is removed. Further, the signal dependent delay is reduced by increasing the ratio of the preamplifier bandwidth to the input signal frequency [88]. This ratio is maximized by applying a DC input signal, i.e., the SH output signal, which therefore minimizes the signal dependent delay. The drawback of this solution is that the SH circuit has the total input parasitic capacitance of the comparators as load, which usually is large, so the power consumption of the SH circuit will be high [55]. However, the results from MATLAB simulations of the model presented in Section 5.1 indicate that a clock skew of between 4 and 5 ps can be tolerated while still achieving an ENOB of 5.5 bits without an SH circuit. This clock skew requirement is acceptable, and the power consumption of the ADCs designed in this work should be low. No SH circuit is therefore used for the ADCs designed in this work. A study presented in [55] also shows that no SH circuit is required for flash ADCs with resolutions of up to six bits. That study also included restrictions on the maximum allowable differential nonlinearity, but allowed an ENOB degradation of 1 bit from the maximum.

The SH circuit reduces the effect of the signal dependent delay and the clock skew. Since no SH circuits are used in the converter designs in this work, the signal dependent delay will increase the third order distortion on the converter output. To reduce the signal dependent delay, and thereby the third order distortion, the preamplifiers must be designed to have a sufficiently high bandwidth [88], which is explained in the distortion section below.

**Distortion**

The differential output of the comparators used in this work should ideally remove the second order distortion [86]. In reality the mismatch between the comparators, introduced during manufacturing, causes second order distortion to be present. However, with a careful layout that minimizes the mismatch, the differential topology at least reduces the second order distortion compared with a single ended output topology. The emphasis during the design of the comparators presented in this work is therefore on minimizing the third order distortion.

As mentioned earlier the third order distortion of the converter can be reduced by designing the preamplifiers of the comparators to have a sufficiently high bandwidth. The expression for the third order distortion $D_3$ derived in [88] is given by (3.18). The expression is later used in Section 6.3, where the design of a comparator is presented. In (3.18) $f_{\text{amp}}$ is the $-3$ dB bandwidth of the preamplifier and $f_{\text{in}}$ is the input frequency of the ADC. The linear range of the preamplifier $V_{lr}$ is the difference between the gate-source voltage $V_{gs}$ of the input transistors and their threshold voltage $V_T$.

\[
D_3 = 20 \log_{10} \left( \frac{2}{3\pi \frac{f_{\text{amp}}}{f_{\text{in}}}} e^{-\left( \frac{V_{lr}}{V_{FS}} \frac{f_{\text{amp}}}{f_{\text{in}}} - 1 \right)} \right) \quad (3.18)
\]

**Offset Error**

The differential stage on the input of a comparator has one input connected to the reference input. The other input is connected to the single ended ADC input signal. Since the input stage of the comparator is differential, it is sensitive to mismatch between the two transistors to which the inputs are connected. The mismatch is due to the statistical variations during manufacturing and gives rise to the offset error of the comparator, as illustrated by the model in Figure 3.12. Careful layout of the input stage is therefore required to reduce the mismatch between the input transistors.

The input offset of the subsequent stages in the comparator, e.g., additional amplifiers or latches, also adds to the input offset of the comparator. This contribution is reduced by designing the first stages of the chain to have high gain. However, there is a trade-off with the $-3$ dB bandwidth of the preamplifier. The bandwidth should be as large as possible to reduce the signal dependent delay, and thereby the third order distortion of the converter. The gain of the preamplifiers is therefore generally limited to below 10 [32, 40].

**Latched Comparator**

This section presents methods to reduce the metastability errors and kick-back noise of latched comparators, illustrated by Figure 3.13. Consider a comparator that is based on cascaded amplifiers, as illustrated by Figure 3.7(a). If the output of a comparator is neither logic one nor logic zero the comparator is said to be in its metastable state [88]. This can happen if, e.g., the gain of the comparator is too low, or, equivalently, when the voltage difference between the inputs of the comparator is too low. The metastable states must be avoided, and hence the comparators must be designed accordingly.

Figure 3.12: Model of the offset error of the differential input stage of a comparator.

The latched comparators often encountered in high-speed ADCs also suffer from metastability errors, like comparators based on cascaded amplifiers. By proper comparator design the metastability errors can be reduced. One alternative is to design the preamplifiers to have a high enough gain, which thereby reduces the metastability error rate [85], i.e., the same method as for the comparators based on cascaded amplifiers. However, since the third order distortion is reduced by designing for a sufficiently high bandwidth, this generally limits the possibility to design the preamplifier for a gain that is high enough to obtain a low metastability error rate. Instead, note that the latch used in a latched comparator can be considered as two negative gain amplifiers in a positive feedback loop. From this observation the expression in (3.19) can be derived. That expression gives the number of metastable states per second $M_n$ as a function of the sampling frequency $f_s$, the unity gain frequency $f_T$ of the negative gain amplifiers in the latch, and their DC gain $A_0$ [88].

$$M_n = f_s e^{-\left(1 - \frac{1}{A_0}\right) \frac{f_T}{f_s} \pi} \quad (3.19)$$

From (3.19) it is seen that the other methods to reduce the metastability error rate are to increase the ratio of $f_T$ to $f_s$, to increase the DC gain $A_0$, or a combination of the two.
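
The sensitivity of (3.19) to the ratio of $f_T$ to $f_s$ can be explored with a minimal sketch. The function name `metastable_rate` is hypothetical, and the numerical values in the example are illustrative only, not design values from this work.

```python
import math

def metastable_rate(f_s, f_t, a_0):
    """Number of metastable states per second according to eq. (3.19)."""
    return f_s * math.exp(-(1.0 - 1.0 / a_0) * (f_t / f_s) * math.pi)

if __name__ == "__main__":
    f_s = 1e9                                  # hypothetical sampling frequency
    for f_t in (5e9, 10e9, 20e9):              # hypothetical unity gain frequencies
        print(f_t, metastable_rate(f_s, f_t, a_0=4.0))
```

Because the ratio appears in the exponent, doubling $f_T$ at a fixed sampling frequency squares the exponential factor, so the error rate falls very rapidly.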

A benefit of using fully depleted SOI CMOS technology compared with a bulk CMOS technology for the implementation of high-speed ADCs is found by studying the expression for the regeneration pole of the latch, $p_{reg}$ [86].

$$p_{reg} = \frac{g_{m5} + g_{m6}}{C_{gs5} + C_{gs6} + C_{db5} + C_{db6} + C_{db2}} \quad (3.20)$$

The maximum sampling frequency of the latch is related to the regeneration pole $p_{reg}$ of the latch. Hence (3.20) reveals a benefit of using a fully depleted SOI CMOS technology compared with a bulk CMOS technology for the implementation of high-speed ADCs. As mentioned in Chapter 2 the body factor of fully depleted SOI CMOS is larger than for a bulk CMOS technology, which yields a higher current drive capability and therefore a higher $g_m$. Hence $g_{m5}$ and $g_{m6}$ are larger in a fully depleted SOI CMOS technology than in a bulk CMOS technology. In addition, the parasitic capacitances $C_{db5}$, $C_{db6}$, and $C_{db2}$ are smaller for SOI CMOS than for bulk CMOS, which further improves the speed. The latter performance advantage diminishes as the technology is scaled. For a partially depleted SOI CMOS technology the body factor is the same as for bulk CMOS, and as the technology is scaled, the advantage of the reduced parasitic capacitances is reduced. There is therefore little advantage in using a partially depleted SOI CMOS technology over a bulk CMOS technology for the implementation of the flash ADC, considering the speed of the latch. Since the technology used for the implementation of the designed converters in this work has a minimum channel length of 130 nm, there should however still be an advantage in terms of reduced parasitic capacitance over bulk CMOS.
During the sample phase the outputs of the latched comparator are connected through the transistor switch M4 in Figure 3.13. The latch then enters its metastable state. To ensure that the outputs of the comparator are not in a metastable state during the sample phase, the outputs are buffered by inverters sized to have a higher threshold voltage than the latch [85]. Hence, the outputs of the comparator are logic one during the sample phase. To reduce the metastability errors further, the comparator output buffers are connected to D-flip-flops, which increase the regeneration gain of the comparator. Thereby the probability of having metastability errors is reduced [50].

The use of latched comparators introduces another error, namely the kick-back noise. When the latch goes from the sample phase to the evaluation phase the comparator outputs change rapidly. These rapidly varying signals are fed through to the inputs of the comparator, i.e., the reference net and the converter input, as illustrated by Figure 3.14. Hence, the kick-back noise affects the input signal and introduces noise to the reference net. The latter can however be somewhat reduced by further increasing the bias current in the reference net, which is the same approach used to reduce the input signal to reference net feedthrough.

Different circuit architectural approaches can be applied to reduce the effect of the kick-back noise on the input and further reduce its effect on the reference net [23, 87]. The approaches used in this work are to use a preamplifier and to introduce the NMOS transistors M2a and M2b in Figure 3.14. These transistors are connected to the clock signal. When the comparator is in its sample phase they are turned on. However, when the clock goes low and the comparator enters its evaluation phase, M2a and M2b are turned off. When they are turned off, they introduce a high-impedance path for the kick-back noise from the comparator outputs to the comparator inputs. This reduces the kick-back noise on the inputs of the comparators.

![Figure 3.14: Feedthrough of the kick-back noise from the outputs to the inputs.](image-url)

3.6 DEM in Flash ADCs

Due to process variations, mismatch errors are introduced during the manufacturing of the circuits. To compensate for mismatch errors both static as well as dynamic matching techniques can be employed [74].

When using static matching the components are placed close together in certain patterns and made large to yield small relative errors, i.e., to reduce the variation between the components [18, 82].

A complementary approach is DEM, which has been used for a couple of decades in digital-to-analog converters (DACs) [88]. It has been used in stand alone DACs as well as DACs used in ADCs [3, 26, 34, 46]. In the past years some attempts have been made to introduce DEM in ADCs as well, such as pipelined ADCs [63], sigma-delta ADCs [26], and flash ADCs [6, 7, 73]. As with DACs, the introduction of DEM in ADCs improves the spectral properties of the converters [7, 73].

The dynamic element matching in the flash ADC is accomplished by introducing a number of switches between each adjacent resistor in the reference generator of the converter [6, 7], as illustrated in Figure 3.15, where the reference output with the lowest index has the lowest reference voltage. Hence, the voltage on each output of the reference generator can be changed during operation by changing the state of the switches in the reference net. A certain comparator thereby compares different input levels in different sample instants. If the switches are connected to a random generator, which updates the reference voltages each sample, the spurious tones in the output introduced by the uncertainties in resistor values and offset voltages of the comparators are reduced.

The drawback with the solutions in [6, 7] is that a large number of switches are connected in series with the resistors in the reference generator. The resistance associated with the switches therefore adds to the total resistance of the reference generator. Hence the overall reference net resistance is increased, which increases the input signal to reference net feedthrough [76, 90], as will be shown in Section 5.2. The DEM circuits proposed in [6, 7] therefore limit the maximum input frequency of the converter more than the proposal in this work, presented in Section 4.3. The limitation of the maximum input frequency is due to the on-resistance of the introduced switches, as will be explained in Section 5.2.1. In addition, the large number of switches used in the reference net introduces large parasitic capacitances. Hence the settling time of the reference net is increased, which limits the conversion rate of the ADC.

To improve the speed of the ADC with DEM, a DEM circuit with lower complexity is presented in Section 4.3 [74]. This circuit should be able to operate at higher frequencies than the circuits proposed in [6, 7].

Figure 3.15: The reference generator of a flash ADC with DEM [7].

Chapter 4

Proposed Circuits

This chapter presents the new circuits proposed in this work. The first is the folded Wallace tree encoder. This approach reduces the hardware cost and the propagation delay of a Wallace tree encoder by reducing its size and using the same Wallace tree for several intervals of the thermometer code [80].

In Section 4.2 a thermometer-to-binary encoder based on multiplexers is presented, i.e., the MUX-based encoder [75]. This is followed by the proposed DEM flash ADC topology in Section 4.3 [74, 79].

4.1 Folded Wallace Tree Encoder

When a Wallace tree is used as the thermometer-to-binary encoder, the size and the delay of the Wallace tree depend on the number of added bits, i.e., the width of the base of the tree. A full adder (FA) may be built from three two-to-one (2:1) multiplexers (MUXs), according to Figure 4.1 [54]. Consider an $N$-bit flash ADC, where the resolution $N$ is larger than one. The hardware cost of its Wallace tree encoder, $\Gamma_{\text{Wallace}}$, in units of the hardware cost of a 2:1 MUX, $\Gamma_{\text{MUX}}$, then becomes [41, 80]

$$\Gamma_{\text{Wallace}} = 3 \sum_{i=1}^{N} (i - 1)2^{N-i} \Gamma_{\text{MUX}}. \quad (4.1)$$

The length of the critical path of the Wallace tree encoder in units of the propagation delay of a 2:1 MUX $t_{\text{MUX}}$ becomes [41, 80]

$$t_{CP,\text{Wallace}} = (4N - 6)t_{\text{MUX}}. \quad (4.2)$$

As expected, the hardware cost and the propagation delay of the Wallace tree encoder decrease as the resolution $N$ decreases. Now split the thermometer code into several intervals, where each interval is encoded by a Wallace tree encoder that is reduced in size compared with the original encoder. Each of those smaller Wallace tree encoders would have a shorter critical path and lower hardware cost than the encoder that encodes the whole thermometer code. If the same Wallace tree encoder were used for all intervals, the overall encoder hardware cost would be reduced. This is the idea behind the folded Wallace tree encoder, shown in Figure 4.2 [80].

Using the folded Wallace tree encoder approach, the thermometer code on the input of the encoder is split into $2^k$ different intervals. The different intervals of the thermometer code are multiplexed to a single Wallace tree encoder that is reduced in size compared with the original Wallace tree [80]. Its hardware cost and critical path are thereby reduced. The new costs can be calculated by replacing $N$ with $N - k$ in (4.1) and (4.2), under the assumption that $N - k$ is larger than one. To derive the total hardware cost of the folded encoder and its propagation delay, the hardware cost and propagation delay of the MUX in front of the Wallace tree in Figure 4.2 must also be considered, and added to $\Gamma_{\text{Wallace}}$ and $t_{\text{CP,Wallace}}$. The number of MUXs required for the folded Wallace tree encoder is $2^{N-k}$, where each MUX is of the type $2^k$-to-1 ($2^k$:1) [80], which can be built from $2^k - 1$ 2:1 MUXs. An example of how a 4:1 MUX can be realized from three 2:1 MUXs is shown in Figure 4.3. Including the MUX in the derivation of the hardware cost $\Gamma_{\text{folded}}$ and the critical path $t_{\text{CP,folded}}$ of the folded Wallace tree encoder yields the following expressions for $N - k$ larger than one.

![Full Adder](a)

![MUX Realization](b)

**Figure 4.1:** Illustration of (a) a full adder and (b) how it can be realized by three 2:1 MUXs.

$$\Gamma_{\text{folded}} = \left( 3 \sum_{i=1}^{N-k} (i-1) 2^{N-k-i} + 2^{N-k} \left( 2^k - 1 \right) \right) \Gamma_{\text{MUX}},$$

$$t_{\text{CP,folded}} = (4N - 3k - 6)\, t_{\text{MUX}}. \quad (4.3)$$

For folding above four the MUX control circuit, depicted in Figure 4.4 for a 4-level folded encoder, will limit the critical path of the encoder [80]. Folding of four therefore seems to be the practical folding limit. The hardware cost is however still reduced for folding above four. The encoder topology is evaluated by the behavioral level simulation presented in Section 5.4.
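As a quick numerical check of (4.1), (4.2), and the folded-encoder expressions, the following Python sketch evaluates them for a 6-bit converter with 4-level folding ($k = 2$). The function names are chosen here for illustration only and do not appear in the original work.

```python
def wallace_cost(n_bits: int) -> int:
    """Hardware cost of an N-bit Wallace tree encoder in 2:1 MUX units, per (4.1)."""
    return 3 * sum((i - 1) * 2 ** (n_bits - i) for i in range(1, n_bits + 1))


def wallace_critical_path(n_bits: int) -> int:
    """Critical path of the Wallace tree encoder in 2:1 MUX delays, per (4.2)."""
    return 4 * n_bits - 6


def folded_cost(n_bits: int, k: int) -> int:
    """Cost of the folded encoder: reduced tree plus 2^(N-k) input MUXs,
    each of type 2^k:1 and built from 2^k - 1 two-to-one MUXs."""
    if n_bits - k <= 1:
        raise ValueError("the expressions assume N - k > 1")
    return wallace_cost(n_bits - k) + 2 ** (n_bits - k) * (2 ** k - 1)


def folded_critical_path(n_bits: int, k: int) -> int:
    """Critical path of the folded encoder in 2:1 MUX delays."""
    if n_bits - k <= 1:
        raise ValueError("the expressions assume N - k > 1")
    return 4 * n_bits - 3 * k - 6


print(wallace_cost(6), wallace_critical_path(6))      # 171 18
print(folded_cost(6, 2), folded_critical_path(6, 2))  # 81 12
```

The printed values agree with the Wallace tree and 4-level folded Wallace tree rows of Table 4.1.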

### 4.2 MUX-Based Encoder

This section describes the idea behind the MUX-based encoder for an ADC with a resolution of four bits and how the encoder can be generalized for use in an \( N \)-bit flash ADC [75, 76].

Figure 4.3: Illustration of (a) a 4:1 MUX and (b) an example of how it can be realized by three 2:1 MUXs.

Figure 4.4: MUX control circuit for 4-level folding.

Consider a 4-bit flash ADC with the thermometer output code to the left in Figure 4.5. The most significant bit (MSB) of the thermometer-to-binary encoder output is logic one if more than half of the outputs in the thermometer scale are one. Hence, the MSB is the same as the thermometer output at level $2^{N-1}$, which is logic one in the example in Figure 4.5, i.e., $d_3 = 1$. To find the value of the second most significant bit, MSB $- 1$, the original thermometer scale is divided into two partial thermometer scales, separated by the output at level $2^{N-1}$, and the appropriate partial thermometer scale is decoded. If the MSB equals one the upper partial thermometer scale needs to be investigated, otherwise the lower partial thermometer scale is of interest. In the example in Figure 4.5 the upper scale is chosen. The MSB $- 1$ is decoded in the same way as the MSB, i.e., by assigning it the value of the middle output in the selected partial thermometer scale. In this example the middle output is a logic zero, i.e., $d_2 = 0$ in Figure 4.5. This is continued recursively, according to Figure 4.5, until all output bits $d_i$ are assigned. In the example in Figure 4.5, the binary output, $D_{out}$, becomes 1011$_2$, which equals 11$_{10}$, i.e., the number of ones on the encoder input.
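The recursive decoding described above can be sketched in a few lines of Python. This is a software illustration of the algorithm only, not a description of the circuit; the function name and the list representation of the thermometer code (lowest level first) are assumptions made for this example.

```python
def mux_encode(thermo):
    """Decode a thermometer code of length 2**N - 1 (lowest level first)
    into a list of N bits, MSB first, by repeatedly picking the middle output."""
    bits = []
    scale = list(thermo)
    while scale:
        mid = len(scale) // 2          # index of the middle output of the current scale
        bit = scale[mid]
        bits.append(bit)
        # Keep the upper partial scale if the middle output is one, else the lower one.
        scale = scale[mid + 1:] if bit else scale[:mid]
    return bits


# Example of Figure 4.5: eleven ones on the input of a 4-bit encoder.
thermo = [1] * 11 + [0] * 4
bits = mux_encode(thermo)
print(bits)                                # [1, 0, 1, 1]
print(int("".join(map(str, bits)), 2))     # 11
```

For a valid thermometer code the result equals the number of ones on the input, as stated in the text.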

Figure 4.5: Example of the MUX-based encoder algorithm.

The algorithm above can be realized by the MUX-based encoder in Figure 4.6. As seen from Figure 4.6 the MSB ($d_3$) is generated from the thermometer output at position $m_0 = 2^{N-1}$. The position equals eight when $N$ is four. The same output is also connected to the control inputs of the MUXs in the first encoder column. Hence, if $d_3$ is one the upper part of the thermometer scale is chosen as the partial thermometer scale, otherwise the lower part of the scale is chosen. This is continued recursively until only one MUX remains. Its output is the LSB of the binary output of the encoder, i.e., $d_0$.

Due to the regular structure of the encoder, it can easily be expanded to operate in a system of higher resolution than four bits [76], which is explained below. The regular structure is also a benefit when doing the physical layout of the circuit [75].

In general, the outputs $d_i$ are the bits in the binary output, $D_{out}$, of the thermometer-to-binary encoder, where $i = 0, 1, \ldots, N - 1$. Column one is the first MUX column, as shown in Figure 4.6. Further, the thermometer code outputs, i.e., the inputs to the encoder, and the MUX outputs in column $i$ are denoted $m_i$ for $i = 0$, and $i > 0$, respectively. For an $N$-bit flash ADC the thermometer output has $2^N - 1$ levels, i.e., $m_{i=0} = 0, 1, \ldots, 2^{N-i} - 1$. The thermometer outputs, $m_{i=0}$, or MUX outputs, $m_{i>0}$, are connected to the MUX at level $m_i$ modulo $2^{N-i-1}$ and column $i+1$. If $m_i < 2^{N-i-1}$ they are connected to the “0” input of the MUX, or to the
"1"-input if \( m_i > 2^{N-i-1} \). The remaining output \( m_i = 2^{N-i-1} \) is connected to the control input of the MUXs in column \( i+1 \), and is the encoder output \( d_{N-i-1} \). The encoder therefore has the hardware cost

$$\Gamma_{\text{MUX-encoder}} = \sum_{i=1}^{N-1} (2^{N-i} - 1)\Gamma_{\text{MUX}}. \quad (4.4)$$

The critical path \( t_{\text{CP,MUX-encoder}} \) in units of \( t_{\text{MUX}} \) is

$$t_{\text{CP,MUX-encoder}} = (N-1)t_{\text{MUX}}. \quad (4.5)$$

A comparison of the hardware cost and the critical path is shown in Table 4.1 for a 6-bit flash ADC. The comparison shows that the hardware cost is significantly reduced if the 4-level folded Wallace tree encoder is used instead of the Wallace tree encoder. In addition, the propagation delay is shorter. When using the MUX-based encoder both the hardware cost and the propagation delay are significantly reduced compared with the other encoders.
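The MUX-based encoder entries can be reproduced directly from (4.4) and (4.5). The short Python sketch below evaluates both expressions; the function names are illustrative. The sum in (4.4) also has the closed form $2^N - N - 1$, which the sketch uses as a cross-check.

```python
def mux_encoder_cost(n_bits: int) -> int:
    """Number of 2:1 MUXs in the MUX-based encoder, per (4.4)."""
    return sum(2 ** (n_bits - i) - 1 for i in range(1, n_bits))


def mux_encoder_critical_path(n_bits: int) -> int:
    """Critical path of the MUX-based encoder in 2:1 MUX delays, per (4.5)."""
    return n_bits - 1


for n in (4, 6):
    cost = mux_encoder_cost(n)
    assert cost == 2 ** n - n - 1      # closed form of the sum in (4.4)
    print(n, cost, mux_encoder_critical_path(n))
# 4 11 3
# 6 57 5
```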
## 4.3 DEM Flash ADC

In [7] DEM is introduced into the ADC by adding three switches at each reference level of the reference generator, as illustrated in Figure 3.15. Two of the switches are connected to the reference generator supply and the third is connected between adjacent resistors. The reference generator therefore consists of several resistors and switches in series.

Since the resistance of the reference generator should be small enough to reduce the feedthrough of the input signal [76, 90], which will be further discussed in Section 5.2, the switches must be designed to have a low resistance. For a MOSFET switch this implies that it should be made wide. For example, in [85] the total reference net resistance $R_{\text{tot}}$ of the single resistor string, given by

$$R_{\text{tot}} = R_u (2^N - 1), \quad (4.6)$$

is 120 $\Omega$, where $R_u$ is the unit resistance in the reference net, and the resolution $N$ is six. To approximate the maximum on-resistance of the switches the resistance $R_u$ is assumed zero. For a resolution of six bits, i.e., 63 reference levels and 65 switches, this yields a maximum on-resistance of each switch of less than 2 $\Omega$. Such a low switch on-resistance is not feasible in practice, since it would require a very wide transistor, which would introduce excessive parasitic capacitance and consume excessive chip area. The parasitic capacitance of the switches increases the settling time of the reference net, which reduces the maximum operating speed of the ADC. The increased area could require a larger physical separation of the reference generator outputs, depending on the physical separation prior to the introduction of the switches. The physical separation of each output of the reference generator is set by the pitch of the comparators, which is minimized to reduce the clock and input signal skew [85]. The introduction of the switches could therefore increase the effects of clock and input signal skew, which is one reason why the circuit in [7] is only usable for low-speed applications, where a large $R_{\text{tot}}$ can be tolerated.

<table>
<thead>
<tr>
<th>Type of encoder</th>
<th>Hardware cost</th>
<th>Critical path</th>
</tr>
</thead>
<tbody>
<tr>
<td>Wallace tree</td>
<td>171 $\Gamma_{\text{MUX}}$</td>
<td>18 $t_{\text{MUX}}$</td>
</tr>
<tr>
<td>4-level folded Wallace tree</td>
<td>81 $\Gamma_{\text{MUX}}$</td>
<td>12 $t_{\text{MUX}}$</td>
</tr>
<tr>
<td>MUX-based</td>
<td>57 $\Gamma_{\text{MUX}}$</td>
<td>5 $t_{\text{MUX}}$</td>
</tr>
</tbody>
</table>

**Table 4.1:** Hardware cost and length of the critical path of the Wallace tree encoder, the 4-level folded Wallace tree encoder, and the MUX-based encoder, for a 6-bit flash ADC.

The principle of the DEM proposed in [6] is the same as the one proposed in [7], but the resistors in the reference generator are replaced by the switch transistors. The switch transistors are charged by a fixed amount of charge at each sample instant, which determines the on-resistance of the switch transistors. However, the circuit in [6] has the same disadvantage as the one proposed in [7], i.e., in high-speed applications the size of the switches would be too large. The circuit is also somewhat complex and contains ten transistors for each reference level, which introduces a large parasitic capacitance at the reference generator outputs, thereby further reducing the speed of the circuit.

The proposed DEM reference circuit in this work aims at avoiding the switch connected between adjacent resistors [74]. This is accomplished by doubling the number of unit resistors and connecting them in a loop, as shown in Figure 4.7(a). The blocks $R$ are defined by Figure 4.7(b). By connecting $V_{\text{ref}+}$ and $V_{\text{ref}-}$, respectively, to different diagonally opposite pairs of blocks $R$, the output voltages of the reference generator can be interchanged. Doing this, e.g., every sample period, gives dynamic element matching of the resistors. In addition, each reference voltage is connected to the reference input of a comparator. Since the reference voltage changes every sample, the effect of the comparator offset on the output is then also reduced by the DEM.

To control which pair of blocks $R$ to connect to $V_{\text{ref}+}$ and $V_{\text{ref}-}$, respectively, a DEM control unit is included. The unit controls the signals $c_{p,i}$ and $c_{n,i}$, which open or close the switches in the blocks $R$. Since only one pair of diagonally opposite blocks $R$ is connected to $V_{\text{ref}+}$ and $V_{\text{ref}-}$ at the same time, there will only be two switches in series with the resistors between the reference supplies $V_{\text{ref}+}$ and $V_{\text{ref}-}$. This is a significant reduction of the number of switches introduced into the reference net compared with the solutions in [6, 7]. The DEM circuit proposed in this work should thereby allow a higher maximum input frequency and a higher sampling frequency compared with the circuits in [6, 7]. The DEM flash ADC topology will be described in more detail next.

In Figure 4.8(a) a block diagram of the DEM flash ADC is shown, listing its different building blocks. In Figure 4.8(b), the circular structure of the resistor net can be seen for a special case of five quantization levels. Figure 4.8(b) also shows that the topology uses twice as many unit resistors as a conventional flash ADC topology. Further, the reference net supply $V_{\text{ref}}$ is connected to one pair of nodes in the resistor net, dividing it into two strings of resistors with equal total resistance.

The reference net supply is distributed to the resistor loop by the set of MOSFET switches seen in Figure 4.8(b). Only one pair of switches is activated at the same time. To ensure this activation the switches are controlled by a 1-of-$M$ decoder, where $M$ equals eight in this example, and $2^{N+1} - 2$ in the general case. The 1-of-8 decoder activates one pair of switches by setting one of its outputs to logic zero, keeping the others at logic one. The 1-of-8 decoder is in turn controlled by a random generator. The random generator generates a random binary number that is decoded to a digital zero on one of the outputs of the 1-of-8 decoder. The position of the zero determines where on the circular resistor net $V_{\text{ref}+}$ and $V_{\text{ref}-}$ are connected, which determines the current set of reference voltages. Hence, the reference voltages can be randomly interchanged each sample. The
advantage of this solution is that the total resistance of the string of resistors, $R_{\text{tot}}$, still can be chosen sufficiently low to meet the requirements on input signal feedthrough, except for the series resistance of two switches, which is limited by restrictions on the physical size of the switches.
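The effect of moving the supply connection around the resistor loop can be illustrated with a small behavioral sketch in Python. It is a simplification under stated assumptions, not the circuit of Figure 4.8: the loop is modeled as an even number of plain resistors, the switch on-resistance is neglected, the internals of the blocks $R$ are not modeled, and the function name and node numbering are hypothetical.

```python
def ring_reference_voltages(resistors, position, v_plus=1.0, v_minus=0.0):
    """Node voltages of a resistor loop with an even number of resistors.

    resistors[j] connects node j and node (j + 1) % n (assumed numbering).
    V_ref+ is tied to node `position`, V_ref- to the diagonally opposite node.
    """
    n = len(resistors)
    if n % 2:
        raise ValueError("the loop must contain an even number of resistors")
    half = n // 2
    volts = [0.0] * n
    for direction in (+1, -1):
        # Resistors met when walking from V_ref+ towards V_ref- in this direction.
        if direction == 1:
            path = [resistors[(position + m) % n] for m in range(half)]
        else:
            path = [resistors[(position - m - 1) % n] for m in range(half)]
        total = sum(path)
        drop = 0.0
        for m in range(half + 1):
            node = (position + direction * m) % n
            volts[node] = v_plus - (v_plus - v_minus) * drop / total
            if m < half:
                drop += path[m]
    return volts


ideal = [1.0] * 8                       # eight matched unit resistors
print(ring_reference_voltages(ideal, 0))
print(ring_reference_voltages(ideal, 1))
```

With matched resistors, moving the supply connection by one position changes every node voltage by exactly one step of the ladder, which is the neighbor interchange used by the DEM. With mismatched resistors, a given node sees a slightly different voltage error for each supply position, which is what the randomization averages out.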

![Block diagram](image)

(a)

![Schematic](image)

(b)

**Figure 4.8:** (a) The block diagram of the proposed DEM flash ADC and (b) an example of the schematic of the proposed DEM flash ADC for five quantization levels.

A MATLAB® model of the proposed DEM architecture has been developed and the ADC incorporating DEM has been implemented in a 130 nm partially depleted SOI CMOS technology. The results of the MATLAB® simulations and the transistor-level circuit simulations are presented in Section 5.5 and Section 7.2, respectively.

### 4.3.1 The 1-of-$M$ Decoder

If the zero on one of the 1-of-$M$ decoder outputs is allowed to change position to any of the $M$ decoder outputs every sample the reference voltages could change by as much as the full-scale voltage $V_{FS}$ every sample. This large voltage fluctuation would limit the speed of the overall converter, since the reference voltages must settle every sample before the output of the ADC
can be used. To reduce this settling time only neighboring comparators should be allowed to interchange their reference voltages in high-speed applications. This reduces the reference voltage changes to one $V_{\text{LSB}}$, and therefore the settling time is also reduced [74]. In Section 5.5 it will be shown that restricting the interchange of reference voltage to neighboring comparators still reduces the spurious tones significantly compared with not using DEM.

The 1-of-$M$ decoder in Figure 4.8 is designed as an $M$-stage circular shift register capable of shifting one position in either direction. One of its stages is initialized to logic zero and the others to logic one. Hence, exactly one of the $M$ outputs of the 1-of-$M$ decoder will always be zero. The zero is shifted one position each clock cycle, in a direction that depends on the output of the random generator.
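A behavioral sketch of such a bidirectional circular shift register is given below in Python. The pseudo-random direction bit comes from Python's `random` module as a stand-in for the random generator of the design, whose implementation is not specified here; the function names are illustrative.

```python
import random


def step_one_of_m(outputs, direction):
    """Rotate the circular shift register by one position.

    direction = +1 moves the zero to the next higher index,
    direction = -1 moves it to the next lower index (both modulo M).
    """
    m = len(outputs)
    return [outputs[(i - direction) % m] for i in range(m)]


def run_decoder(m, n_cycles, seed=0):
    """Return the position of the single zero after each clock cycle."""
    rng = random.Random(seed)           # stand-in for the random generator
    outputs = [0] + [1] * (m - 1)       # one stage initialized to zero
    positions = [0]
    for _ in range(n_cycles):
        direction = 1 if rng.getrandbits(1) else -1
        outputs = step_one_of_m(outputs, direction)
        assert outputs.count(0) == 1    # exactly one active output at all times
        positions.append(outputs.index(0))
    return positions


print(run_decoder(8, 10))
```

Since the zero only moves to a neighboring position, each reference output changes by at most one $V_{\text{LSB}}$ per sample, in line with the settling-time argument above.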

### 4.3.2 The Thermometer-to-Binary Encoder

If a ROM encoder were used with the suggested DEM circuit, a switch net would have to be added between the comparators and the encoder to connect the comparator connected to the lowest reference voltage to the encoder input with the lowest weight, etc. A better approach is to use a ones-counter encoder [41, 74, 80]. The ones-counter encoder can be used since the number of ones in the thermometer code is the same before and after the introduction of DEM; only the order of the ones differs.

Chapter 5

Modeling of Flash ADCs

Behavioral simulations of a system yield information and input to the design phase and circuit specifications, e.g., for the comparators or the reference generator. The behavioral level models are gradually refined. In the end the design can be verified and fine-tuned by a few transistor level simulations [5, 17]. Hence, the behavioral level models enable the top-down methodology, which is crucial for designing a large system.

5.1 Clock Skew

As mentioned in Section 3.5.3 the problems of the clock skew can be alleviated by applying a SH circuit on the input of the ADC. This will however increase the power consumption, since the SH has to drive a large load capacitance and therefore will consume much power [55]. This section presents a behavioral model of an ADC with clock skew. The model is illustrated by Figure 5.1 where the input is sampled at the sample time instants $t_n$ plus the sampling time uncertainty $\Delta t_s$. The model is used to compute the yield as a function of the standard deviation of the clock skew by simulation in MATLAB®. The simulation result is then used to evaluate if a SH circuit is required.

When no SH circuit is present at the input, every comparator samples the input signal at inaccurate time instants due to the clock skew. This effect was modeled in MATLAB® assuming a Gaussian distributed clock skew with zero mean value. The model was simulated in MATLAB® with a 1 GHz 0.5 V full-scale sinusoid input and a sampling frequency of 2.1 GHz. The design target in this simulation was an ENOB larger than 5 bits. The result of the MATLAB® simulations is presented in Figure 5.2(a), where the yield is plotted as a function of the standard deviation of the clock skew, and in Figure 5.2(b), where the same is plotted, but zoomed in. These simulation results show that the standard deviation of the clock skew can be slightly above 4 ps and still result in an ENOB larger than 5 bits. This requirement on clock skew is possible to achieve without a SH circuit on the input, as was also concluded in [55]. Hence, since a low power consumption of the ADC is required, a SH circuit is not used in the test ADCs in this work. A higher resolution or input frequency of the ADC would require a SH circuit on the input.
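The kind of Monte Carlo experiment described above can be sketched in plain Python as follows. This is not the MATLAB® model of this work and is not expected to reproduce Figure 5.2. Several choices are assumptions made for the sketch: every comparator gets a fixed Gaussian timing error, the ENOB is estimated from the error between the reconstructed output and the ideal input rather than from a spectrum, the input frequency is set slightly below 1 GHz to obtain coherent sampling over 256 samples, and the number of simulated ADCs is kept small.

```python
import math
import random

N_BITS = 6
V_FS = 0.5                          # full-scale voltage from the text
F_S = 2.1e9                         # sampling frequency from the text
N_SAMPLES = 256                     # assumption: short record to keep the sketch fast
CYCLES = 121                        # odd number of periods gives coherent sampling
F_IN = CYCLES / N_SAMPLES * F_S     # about 0.99 GHz (assumption, close to 1 GHz)


def enob_with_skew(skews):
    """ENOB estimate for one ADC whose comparator k samples at t_n + skews[k]."""
    lsb = V_FS / 2 ** N_BITS
    amp = V_FS / 2
    refs = [(k - 2 ** (N_BITS - 1)) * lsb for k in range(1, 2 ** N_BITS)]
    noise = 0.0
    for n in range(N_SAMPLES):
        t = n / F_S
        # Ones-counter output: number of comparators that see the input above their reference.
        code = sum(amp * math.sin(2 * math.pi * F_IN * (t + dt)) > ref
                   for dt, ref in zip(skews, refs))
        recon = (code - (2 ** N_BITS - 1) / 2) * lsb
        ideal = amp * math.sin(2 * math.pi * F_IN * t)
        noise += (recon - ideal) ** 2
    sndr_db = 10 * math.log10((amp ** 2 / 2) / (noise / N_SAMPLES))
    return (sndr_db - 1.76) / 6.02


def yield_vs_skew(sigma, n_chips=10, target_enob=5.0, seed=1):
    """Fraction of simulated ADCs that reach the ENOB target for a skew std of sigma."""
    rng = random.Random(seed)
    n_comp = 2 ** N_BITS - 1
    passed = 0
    for _ in range(n_chips):
        skews = [rng.gauss(0.0, sigma) for _ in range(n_comp)]
        if enob_with_skew(skews) > target_enob:
            passed += 1
    return passed / n_chips


for sigma in (0.0, 2e-12, 4e-12, 8e-12):
    print(f"sigma = {sigma * 1e12:.0f} ps, yield = {yield_vs_skew(sigma):.2f}")
```

A larger number of simulated ADCs and a finer sweep of the standard deviation would be needed to obtain smooth yield curves of the kind shown in Figure 5.2.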

Figure 5.1: Model for the clock skew.

Figure 5.2: (a) The yield as a function of the clock skew where the design targeted ENOB should be larger than 5 bits for the 6-bit ADC, and (b) a magnification of a part of the plot.
5.2 Reference Generator

Two of the major error sources related to the resistive reference generator are the input-to-reference signal feedthrough and the effect of mismatch between the resistors in the reference net. Another error source is the fluctuations of the reference net supply. These error sources were modeled prior to the circuit design of the test ADCs of this work. In Section 5.2.1 a model of the input-to-reference signal feedthrough is presented and a method to reduce the power consumption of the reference net is discussed. In Section 5.2.2 the effect of mismatch between the resistors in the reference net is modeled together with the effect of the reference net supply fluctuations.

5.2.1 Input-to-Reference Signal Feedthrough

There is a parasitic capacitance between the comparator inputs, $C_{\text{comp,in}}$, due to the parasitic capacitance between the gate and the source, $C_{gs}$, of the preamplifier input transistors. This is indicated in the input stage in Figure 5.3.

![Input stage of the comparator](image)

**Figure 5.3:** (a) The input stage of the comparator with the parasitic capacitors between the gate and the source, and (b) the comparator symbol including the parasitic capacitor between the inputs.

In Figure 5.4 the reference net and comparators for a 2-bit flash ADC are shown. The figure includes the parasitic capacitors between the inputs of the comparators, which couple the input signal to the reference net. Hence, the input is fed through to the resistive net generating the reference voltages.

![Reference net of the 2-bit flash ADC between V_ref+ and V_ref-, with the input V_in](image)

**Figure 5.4:** Reference net and capacitors with the parasitic capacitors between the comparator inputs included.

The input-to-reference signal feedthrough causes a variation of the reference voltages. The reference voltage variation can be too large if the resistance of the resistors is not low enough. However, if the resistance is chosen too low, the reference net consumes unnecessarily high power [90]. An expression for the maximum allowable total reference net resistance was therefore derived. The model of the resistor net, derived from Figure 5.4, is shown in Figure 5.5 for three comparators and under the assumption that \( V_{\text{ref}^+} \) and \( V_{\text{ref}^-} \) are perfectly decoupled.

![RC ladder model with the middle node V_mid and the capacitor impedances 1/sC connected to V_in](image)

**Figure 5.5:** Model for the input-to-reference signal feedthrough for a resolution of two bits.

After expanding the model in Figure 5.5 to the general case of \( 2^N - 1 \)
comparators, $R$ and $C$ are given by

$$ R = \frac{R_{\text{tot}}}{2^N - 1} \quad (5.1) $$

and

$$ C = \frac{C_{\text{tot}}}{2^N} \quad (5.2) $$

where $R_{\text{tot}}$ is the total reference net resistance and $C_{\text{tot}}$ is the total $C_{\text{comp,in}}$ on the ADC input. The ratio between each reference output and the input signal $V_{\text{in}}$ was calculated using the symbolic solver in MATLAB®. The resolution of the ADC was assumed to be six bits in the derivation. From the results it was seen that the middle reference net output $V_{\text{mid}}$ is affected the most by the input signal feedthrough. The same result was obtained in [90]. Under the assumption that $2\pi f_{\text{in}}RC \ll 1$ the following expression for $R_{\text{tot}}$ is obtained

$$ R_{\text{tot}} \leq \frac{4.1}{\pi f_{\text{in}} C_{\text{tot}}} \frac{V_{\text{mid}}}{V_{\text{in}}}, \quad (5.3) $$

which also was validated by transistor level simulations in Cadence®.
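The constant in (5.3) can be cross-checked numerically. The Python sketch below is not the symbolic derivation used in this work; it solves the nodal equations of the ladder in Figure 5.5, generalized to $2^N - 1$ taps with $R$ and $C$ from (5.1) and (5.2), using a standard tridiagonal (Thomas) solver with complex arithmetic. Perfectly decoupled supplies are assumed, as in the text, and the component values are example numbers.

```python
import math


def feedthrough_ratio(n_bits, r_tot, c_tot, f_in):
    """|V_mid / V_in| of the RC ladder with 2^N - 1 taps and R/2 end resistors."""
    n = 2 ** n_bits - 1
    r = r_tot / n                       # (5.1)
    c = c_tot / 2 ** n_bits             # (5.2)
    s = 2j * math.pi * f_in
    # Conductance towards the lower and upper neighbor of each tap.
    g_left = [2 / r if k == 0 else 1 / r for k in range(n)]
    g_right = [2 / r if k == n - 1 else 1 / r for k in range(n)]
    # Nodal equations with V_in = 1:
    # -g_left*V[k-1] + (g_left + g_right + sC)*V[k] - g_right*V[k+1] = sC
    diag = [g_left[k] + g_right[k] + s * c for k in range(n)]
    rhs = [s * c] * n
    # Thomas algorithm, forward sweep.
    c_prime = [0j] * n
    d_prime = [0j] * n
    c_prime[0] = -g_right[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for k in range(1, n):
        lower = -g_left[k]
        denom = diag[k] - lower * c_prime[k - 1]
        c_prime[k] = -g_right[k] / denom
        d_prime[k] = (rhs[k] - lower * d_prime[k - 1]) / denom
    # Back substitution.
    v = [0j] * n
    v[-1] = d_prime[-1]
    for k in range(n - 2, -1, -1):
        v[k] = d_prime[k] - c_prime[k] * v[k + 1]
    return abs(v[n // 2])               # middle tap


f_in, r_tot, c_tot = 500e6, 31.0, 1.3e-12   # example values
ratio = feedthrough_ratio(6, r_tot, c_tot, f_in)
print(math.pi * f_in * r_tot * c_tot / ratio)   # constant to compare with 4.1 in (5.3)
```

Under these assumptions the printed constant lands close to the value 4.1 used in (5.3).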

To express (5.3) as a function of the feedthrough in number of LSBs the ratio of $V_{\text{mid}}$ over $V_{\text{in}}$ is assumed to be $q_{\text{LSB}}$ number of LSBs, i.e.,

$$ \frac{V_{\text{mid}}}{V_{\text{in}}} = \frac{q_{\text{LSB}}}{2^N}, \quad (5.4) $$

which yields the expression for the input-to-reference signal feedthrough,

$$ R_{\text{tot}} \leq \frac{4.1q_{\text{LSB}}}{\pi f_{\text{in}} C_{\text{tot}} 2^N}. \quad (5.5) $$

This is close to the expression derived in [90], where they approximate a reference net similar to the reference net in Figure 5.4 and then derive the expression for $R_{\text{tot}}$. In this work the reference net was not approximated before the derivation, but the derived expression is instead approximated after the calculation in MATLAB®.

The expression in (5.5) is used in the design of the reference net, presented in Section 6.1.1, after extracting the input capacitance of the designed comparators, discussed in Section 6.3.

**Decoupling of the Reference Net Outputs**

As mentioned above the expression in (5.5) can be used to calculate the maximum allowable total reference net resistance. For high-speed ADCs this resistance is generally low. Consider, e.g., a flash ADC with a resolution of six bits and a maximum input frequency \( f_{\text{in}} \) of 500 MHz. Assuming the input capacitance of each comparator is 21 fF yields a total input capacitance \( C_{\text{tot}} \) of 1.3 pF. Requiring a maximum feedthrough lower than one LSB then yields a maximum total reference net resistance of 31 Ω, which implies that each resistor in the reference net should be 0.5 Ω. Further, assuming a full-scale voltage of 1 V, i.e., the reference net supply \( V_{\text{ref}} \) is 1 V, yields a 32 mW power consumption of the reference net. To reduce this power consumption the total resistance of the reference net must be increased, but as mentioned above this would yield a higher input-to-reference signal feedthrough. The question now is how the reference net can be modified so that the feedthrough is not increased when the values of the resistors in the reference net are increased.

In the example above the resolution was six bits, i.e., the reference net has 63 outputs, as illustrated by Figure 5.6(a). Each of the outputs is connected to the reference input of a comparator. A model of the input-to-reference signal feedthrough of the 6-bit flash ADC is shown in Figure 5.6(b).

![Figure 5.6](image.png)

**Figure 5.6:** Illustration of (a) the reference net for a 6-bit flash ADC, and (b) the model of the input-to-reference signal feedthrough of the reference net in (a).

Now assume decoupling by an on-chip capacitor at every fourth reference output. This distance is defined as the decoupling period $p_{\text{dec}}$, which is equal to four in this example. Also assume that the decoupling is perfect. The modified reference net can now be illustrated by Figure 5.7(a) and its input-to-reference signal feedthrough can be modeled by Figure 5.7(b).

![Diagram](a)

**Figure 5.7:** Illustration of (a) the decoupled reference net for a 6-bit flash ADC, and (b) the model of the input-to-reference signal feedthrough of the decoupled reference net in (a).

From Figure 5.7(b) it is seen that the effect of the decoupling is that the original reference net model in Figure 5.6(b) is divided into 16 parts, each similar to the model in Figure 5.5. The difference is that the edge resistors of the middle sub-nets have the value $R$ instead of $R/2$, which is the case in Figure 5.5. With the decoupling the new worst-case feedthrough occurs at the $V_{\text{mid},m}$ nodes, i.e., at the reference outputs $m$, where $m = 2, 6, \ldots, 62$ in this case. The new maximum reference net resistance can therefore be calculated from the model in Figure 5.5.

As a result of the decoupling the $N$ in (5.5) is reduced from six to two, i.e., the reference net decoupling period $p_{\text{dec}}$ is reduced from 64 to four. Hence, $N$ in (5.5) can be exchanged for $\log_2(p_{\text{dec}})$, or $2^N$ can be exchanged for the reference net decoupling period $p_{\text{dec}}$. The new expression for the input-to-reference feedthrough then becomes

$$R_{\text{tot}} \leq \frac{4.1 q_{\text{LSB}}}{\pi f_{\text{in}} C_{\text{tot}} p_{\text{dec}}}, \quad (5.6)$$

where $p_{\text{dec}} \leq 2^N$.

Using (5.6) for a $p_{\text{dec}}$ of four yields that the maximum reference net resistance is increased to 500 $\Omega$, which reduces the reference net power consumption to 2 mW. This power consumption is significantly lower than the original 32 mW. However, the reduction of power consumption does not come for free, since the decoupling capacitors require additional chip area.
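The numbers in this example follow directly from (5.5) and (5.6). A small Python sketch of the arithmetic is given below; the function names are illustrative, and (5.5) is treated as the special case $p_{\text{dec}} = 2^N$ of (5.6).

```python
import math


def max_reference_resistance(f_in, c_tot, p_dec, q_lsb=1.0):
    """Maximum total reference net resistance per (5.6); p_dec = 2**N gives (5.5)."""
    return 4.1 * q_lsb / (math.pi * f_in * c_tot * p_dec)


def reference_power(v_ref, r_tot):
    """Static power dissipated in the resistor string."""
    return v_ref ** 2 / r_tot


f_in, c_tot, v_ref = 500e6, 1.3e-12, 1.0     # values from the example in the text

r_plain = max_reference_resistance(f_in, c_tot, p_dec=2 ** 6)
r_dec = max_reference_resistance(f_in, c_tot, p_dec=4)

print(f"no decoupling: R_tot = {r_plain:.0f} ohm, R = {r_plain / 63:.2f} ohm, "
      f"P = {reference_power(v_ref, r_plain) * 1e3:.0f} mW")
print(f"p_dec = 4:     R_tot = {r_dec:.0f} ohm, "
      f"P = {reference_power(v_ref, r_dec) * 1e3:.0f} mW")
```

The results round to the 31 $\Omega$, 0.5 $\Omega$, 32 mW, 500 $\Omega$, and 2 mW quoted in the text.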

### 5.2.2 Resistor Mismatch and Reference Net Supply Fluctuations

Due to mismatch, the resistances of the resistors in the reference net deviate from their nominal values. The reference levels thereby deviate from their nominal levels. To investigate the effects of the mismatch a model was developed in MATLAB®. In this model the deviation from the nominal resistor values, $dR$, was assumed to have a Gaussian distribution with zero mean value and a standard deviation $\sigma_R$.

$$dR \sim \mathcal{N}(0, \sigma_R) \quad (5.7)$$

In addition to the resistor mismatch, fluctuations of the reference net supply voltage $V_{\text{ref}}$ were also modeled. The fluctuations could be caused by, e.g., crosstalk. The $V_{\text{ref}}$ fluctuations were modeled by calculating the reference net supply voltage deviation, $dV_{\text{ref}}$, in each sample, and then adding $dV_{\text{ref}}$ to the nominal reference net supply voltage $V_{\text{ref}}$. New reference voltages were thereby calculated every sample. In this model, the reference net supply voltage deviation was assumed to have a Gaussian distribution with zero mean value and a standard deviation $\sigma_{\text{ref}}$ in units of LSB.

$$dV_{\text{ref}} \sim N(0, \sigma_{\text{ref}}). \quad (5.8)$$

The model of the resistor mismatch and the supply voltage fluctuations is illustrated in Figure 5.8. As seen, the reference net supply voltage variations are included on both the positive and the negative supply. The results of the simulations are presented in Figure 5.9. In Figure 5.9(a) and Figure 5.9(b) the average ENOB is plotted as a function of the standard deviation of the resistor mismatch, $\sigma_R$, and the standard deviation of the reference voltage supply fluctuations, $\sigma_{\text{ref}}$. In Figure 5.9(c) the yield is plotted as a function of $\sigma_R$ with an ENOB design target of 5.5 bits for a 6-bit flash ADC with a full-scale voltage of 0.5 V. Finally, Figure 5.9(d) shows the yield plotted with the same design target, but as a function of $\sigma_{\text{ref}}$.

As seen in Figure 5.9(a) the standard deviation of the resistor mismatch should be less than 5 % to yield an average ENOB equal to the maximum of six bits. Assuming an ENOB design target of 5.5 bits it is seen from the yield in Figure 5.9(c) that $\sigma_R$ should be less than 7 % to obtain a yield of about 99 %. For the reference supply fluctuations it is seen from Figure 5.9(b) that $\sigma_{ref}$ should be around 0.1 LSB for an average ENOB equal to the maximum of six. Finally, Figure 5.9(d) shows that a yield of about 99 % can be accomplished for a $\sigma_{ref}$ of 0.4 LSB at an ENOB design target of 5.5 bits.
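The resistor mismatch model in (5.7) can be illustrated with a minimal Python sketch (the thesis models were written in MATLAB®; this is not the author's code). It assumes a string of $2^N$ nominally equal resistors and ignores the supply fluctuations in (5.8). The function names `reference_levels` and `worst_deviation_lsb` are hypothetical.

```python
import random

def reference_levels(n_bits=6, v_fs=0.5, sigma_r=0.05, rng=None):
    """Return the 2**n_bits - 1 reference voltages of one mismatched resistor string."""
    rng = rng or random.Random()
    n_res = 2 ** n_bits
    # Nominal resistance is normalized to 1; dR ~ N(0, sigma_r) as in (5.7).
    resistors = [1.0 + rng.gauss(0.0, sigma_r) for _ in range(n_res)]
    total = sum(resistors)
    levels, acc = [], 0.0
    for r in resistors[:-1]:
        acc += r
        levels.append(v_fs * acc / total)   # voltage divider tap
    return levels

def worst_deviation_lsb(levels, n_bits=6, v_fs=0.5):
    """Largest deviation of any reference level from its nominal value, in LSB."""
    lsb = v_fs / 2 ** n_bits
    return max(abs(v - (k + 1) * lsb) for k, v in enumerate(levels)) / lsb

rng = random.Random(1)
for sigma in (0.0, 0.05, 0.10):
    worst = worst_deviation_lsb(reference_levels(sigma_r=sigma, rng=rng))
    print(f"sigma_R = {sigma:.2f}: worst reference deviation = {worst:.3f} LSB")
```

Repeating the draw many times and converting the resulting levels to an ENOB per instance would give average ENOB and yield curves of the kind shown in Figure 5.9.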

## 5.3 Comparator

After manufacturing, the process variations over the chip cause mismatch between different devices. This is especially serious for the differential inputs of amplifiers, since these generally are assumed to have identical properties during the design. The mismatch between the input transistors results in an input offset voltage of the differential amplifier. Since comparators have a differential pair in the input stage, they also suffer from mismatch, introducing an offset voltage between the inputs.

The input offset voltage is also modeled and simulated in MATLAB®. The offset voltage $V_{\text{offset}}$ was assumed to have a zero mean value Gaussian distribution with a standard deviation of $\sigma_{\text{offset}}$, and is modeled by adding this random voltage to one of the terminals of the comparators, as illustrated in Figure 5.10. The results of the simulations are presented in Figure 5.11, where the average ENOB and the yield are plotted as functions of $\sigma_{\text{offset}}$ in units of LSB.

Figure 5.11(a) shows the average ENOB as a function of $\sigma_{\text{offset}}$. From this plot it is seen that $\sigma_{\text{offset}}$ should be less than 0.1 LSB to get the maximum ENOB of six bits. Figure 5.11(b) further shows that for an ENOB design target of 5.5 bits the requirement on $\sigma_{\text{offset}}$ can be relaxed to 0.2 LSB while still obtaining a yield of 99 % or more.
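The offset model can be sketched in Python as follows. This is an illustration under stated assumptions, not the thesis model: each comparator gets one static Gaussian offset, the outputs are decoded by counting ones, and the ENOB is estimated in the time domain from the error between the reconstructed output and the input, which is more pessimistic than a spectral estimate because gain and offset errors are not removed. The function name `flash_adc_enob` and the input frequency are hypothetical.

```python
import math
import random

def flash_adc_enob(sigma_offset_lsb, n_bits=6, v_fs=0.5, n_samples=4096, seed=0):
    """Time-domain ENOB estimate of a flash ADC with Gaussian comparator offsets."""
    rng = random.Random(seed)
    n_comp = 2 ** n_bits - 1
    lsb = v_fs / 2 ** n_bits
    # Ideal thresholds plus one static Gaussian offset per comparator.
    thresholds = [(k + 1) * lsb + rng.gauss(0.0, sigma_offset_lsb * lsb)
                  for k in range(n_comp)]
    sig_pow = err_pow = 0.0
    for n in range(n_samples):
        # Full-scale sinusoid, shifted so that the input spans 0..v_fs.
        ac = 0.5 * v_fs * math.sin(2 * math.pi * 0.1234567 * n)
        vin = 0.5 * v_fs + ac
        code = sum(vin > t for t in thresholds)   # ones-counter decoding
        vout = (code + 0.5) * lsb - 0.5 * v_fs    # reconstructed AC value
        sig_pow += ac * ac
        err_pow += (vout - ac) ** 2
    sndr_db = 10 * math.log10(sig_pow / err_pow)
    return (sndr_db - 1.76) / 6.02

for sigma in (0.0, 0.1, 0.5):
    print(f"sigma_offset = {sigma:.1f} LSB: ENOB = {flash_adc_enob(sigma):.2f} bits")
```

Running the function over many seeds and counting the fraction of instances above 5.5 bits would give a yield estimate in the sense of Figure 5.11(b).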

Figure 5.9: The average ENOB plotted as a function of (a) $\sigma_R$ and (b) $\sigma_{\text{ref}}$ for a 6-bit flash ADC with a full-scale of 0.5 V. The yield is plotted as a function of the same variables for the same ADC with the design target ENOB of 5.5 bits, in (c) and (d) respectively.

Figure 5.10: The comparator model used in the MATLAB® simulations with a mismatch offset.
The bubble errors in the comparator outputs are due to the timing difference between the clock signal and the input signal. This introduces an uncertainty in the effective sampling instant of each comparator, which is of concern if no SH circuit is used. The resulting bubbles can have a significant effect on the ENOB of the overall ADC. Different thermometer-to-binary encoder topologies are able to suppress these bubble errors to a certain degree. To evaluate their performance in terms of bubble error suppression, a MATLAB® model was developed for each topology of interest, i.e., the ROM encoder [75, 87], ones-counter encoder [41, 75], folded Wallace tree encoder [80], and the MUX-based encoder [75]. These models are presented in this section, together with the simulation results.

The uncertainty in sampling instant due to the timing difference between the clock and input signal lines, $\Delta t$, was modeled by a Gaussian distribution with zero mean value and a standard deviation $\sigma_t$,

$$\Delta t \sim N(0, \sigma_t). \quad (5.9)$$

The input signal was assumed to be a sinusoid with the peak-to-peak magnitude equal to the full-scale voltage, and with the frequency $f_{\text{in}}$,

$$V_{\text{in}} = \frac{V_{\text{FS}}}{2} \sin(2\pi f_{\text{in}}t). \quad (5.10)$$

The maximum of the time derivative of the input can then be calculated in the same way as in (3.15). An approximation of the maximum time derivative of the input is
\[ \frac{\Delta V_{\text{in}}}{\Delta t} \approx \max \left\{ \left| \frac{dV_{\text{in}}}{dt} \right| \right\} = \pi f_{\text{in}} V_{\text{FS}}. \] (5.11)

The effect of the timing difference is therefore an uncertainty in the sampled input voltage, \( \Delta V_{\text{in}} \). Using (5.11), the maximum uncertainty in the sampled input voltage has a Gaussian distribution according to
\[ \Delta V_{\text{in}} \sim N(0, \sigma_t \pi f_{\text{in}} V_{\text{FS}}). \] (5.12)

The uncertainty \( \Delta V_{\text{in}} \) is added to the input \( V_{\text{in}} \) in the simulations. The sampling time uncertainty is therefore modeled as an offset voltage on the input of the comparators, given by (5.12), which yields the comparator model shown in Figure 5.12.

Note that the inherent input referred offset of the comparators due to mismatch is modeled in the same way as the effect of the timing difference. Compare, e.g., Figure 5.10 and Figure 5.12. Hence, the input referred offset of the comparators is also a source of bubble errors.
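As a rough numerical illustration, scaling the timing uncertainty by the maximum input slope in (5.11) gives the standard deviation of the equivalent voltage offset. The sketch below uses the 0.5 V full-scale voltage and 1 GHz input frequency that appear elsewhere in the text. The $\sigma_t$ values are arbitrary examples, and the function name `sampled_voltage_sigma` is hypothetical.

```python
import math

def sampled_voltage_sigma(sigma_t, f_in, v_fs):
    """Std. dev. of the sampled-voltage error: sigma_t times the maximum slope in (5.11)."""
    return sigma_t * math.pi * f_in * v_fs

v_fs, n_bits = 0.5, 6
lsb = v_fs / 2 ** n_bits
for sigma_t in (1e-12, 2e-12, 5e-12):   # example values only
    sigma_v = sampled_voltage_sigma(sigma_t, f_in=1e9, v_fs=v_fs)
    print(f"sigma_t = {sigma_t * 1e12:.0f} ps -> sigma_V = {sigma_v / lsb:.2f} LSB")
```

Because the slope in (5.11) is the worst case, the resulting offset overestimates the error for samples taken away from the zero crossings of the sinusoid.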

The comparator model in Figure 5.12 was used in the MATLAB® simulations of a flash ADC with the four different encoders, i.e., the ROM encoder with 3-input NAND gates for bubble error correction, the ones-counter encoder, the MUX-based encoder, and the 4-level folded Wallace tree encoder. The results of these simulations are shown in Figure 5.13 and Figure 5.14.

In Figure 5.13 the average ENOB of the ADC is plotted as a function of the standard deviation of the timing difference between the clock lines and the signal lines, i.e., \( \sigma_t \). As seen in the figure the performance of the MUX-based encoder is about the same as for the ROM encoder with 3-input NAND gates used for the bubble error correction. Note that the MUX-based encoder has no special bubble error correction circuits. It is also seen that the ones-counter encoder has better performance than both the ROM encoder and the MUX-based encoder. Last, the 4-level folded Wallace tree encoder has a slightly lower average ENOB than the ones-counter. The reason for this is that the folded Wallace tree topology is more sensitive to bubble errors at the thermometer input levels that are connected to the 3-input OR gates shown in Figure 4.2, since these levels control the MUX seen in the same figure.

In Figure 5.14 the yield of the ADC is plotted as a function of $\sigma_t$. The design target is an ENOB of 5.5 bits. As seen in this figure the differences in yield between the topologies are small. However, Figure 5.14 indicates that the ones-counter and folded Wallace tree encoders result in a slightly higher yield than the ROM and MUX-based encoders.

**Figure 5.13:** Average ENOB as a function of $\sigma_t$ for ROM encoder with 3-input NAND gates, ones-counter encoder, MUX-based encoder, and 4-level folded Wallace tree encoder.

## 5.5 DEM Flash ADC

To evaluate the performance of the proposed DEM flash ADC in Section 4.3, a behavioral model of the topology was developed in MATLAB® with one modification compared with the topology shown in Figure 4.8. The random generator is replaced by a pseudo-random bit stream (PRBS) generator, as shown in Figure 5.15. The PRBS generator consists of a 15 stage shift register where the first stage is initiated to logic one and the other stages to zero. The outputs of the last and the first stage of the shift register are connected to an XOR gate, whose output is connected to the input of the shift register.

**Figure 5.14:** The yield as a function of $\sigma_t$ for ROM encoder with 3-input NAND gates, ones-counter encoder, MUX-based encoder, and 4-level folded Wallace tree encoder. The design target is an ENOB of 5.5 bits.

As mentioned in Section 4.3.1 the 1-of-$M$ decoder consists of an $M$ stage circular shift register where one position is set to logic zero and the rest to one. The zero is shifted one position each clock cycle. The shift direction is set by the output of the random generator, which in this case is the PRBS generator. This ensures that the reference voltage changes are equal to $V_{LSB}$, which should reduce the settling time of the reference generator compared with the circuits proposed in [6, 7]. The reduced settling time improves the speed of the overall flash ADC.
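The PRBS generator and the restricted 1-of-$M$ decoder described above can be sketched in Python as below. This is a behavioral illustration only. It assumes that the XOR of the first and last stage is fed back into the first stage and that the last stage output selects the shift direction, neither of which is stated explicitly in the text. The function names are hypothetical.

```python
from collections import Counter

def lfsr_step(state):
    """Advance the 15-stage register one clock: XOR of first and last stage is fed back."""
    return [state[0] ^ state[-1]] + state[:-1]

def lfsr_period(start):
    """Number of clock cycles until the register returns to its start state."""
    state, steps = lfsr_step(start), 1
    while state != start:
        state, steps = lfsr_step(state), steps + 1
    return steps

def zero_positions(n_cycles, m=126):
    """Position of the single zero in an M-stage circular shift register, cycle by cycle."""
    state = [1] + [0] * 14          # first stage one, remaining stages zero
    pos, trace = 0, []
    for _ in range(n_cycles):
        direction = 1 if state[-1] else -1   # assumption: last stage selects direction
        pos = (pos + direction) % m          # the zero moves exactly one position
        trace.append(pos)
        state = lfsr_step(state)
    return trace

START = [1] + [0] * 14
print("PRBS period:", lfsr_period(START))
hist = Counter(zero_positions(2 ** 16))
print("positions visited:", len(hist), "of 126")
```

The `Counter` of visited positions corresponds to the kind of histogram shown in Figure 5.17. Since the zero moves a single step per cycle, its position performs a walk around the ring rather than jumping to an arbitrary output.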

In the behavioral model, an uncertainty in the resistor values of the reference net and the offset of the comparators are included. The resistor uncertainty is considered to be Gaussian distributed with a standard deviation $\sigma_R$ of 10 %. The comparator offset is also assumed to have a Gaussian distribution with a standard deviation $\sigma_{offset}$ of 15 mV. $V_{ref}$ is 1 V and the resolution is six bits. The output spectrum from a MATLAB® simulation is depicted in Figure 5.16. The number of clock cycles was $2^{16}$.

In Figure 5.16(a) the output spectrum of one simulation is shown when no DEM is applied. In this spectrum the spurious tones are clearly visible, and the SFDR is about 37 dB. If a fully random DEM is allowed, i.e., the zero on one of the 1-of-$M$ decoder outputs can be shifted to any of the $M$ decoder outputs each sample, the spurious tones are distributed over the spectrum.
This is shown in Figure 5.16(b). Note that when the spurious tones are distributed over the spectrum the noise floor is raised. The SFDR is however much improved, and is for the fully random DEM increased to 62 dB. However, as mentioned earlier, this is not a viable approach for a high-speed ADC due to the settling time of the reference voltages, which would reduce the conversion rate significantly. Instead the random generator is realized by a PRBS, with results in Figure 5.16(c).

A histogram of the position of the zero in the output of the 1-of-126 decoder is shown in Figure 5.17. As seen, the zero position is not uniformly distributed as it would be if the fully random DEM were applied. This is intentional, since the DEM is restricted to only shift the zero a single position. If the zero position had been uniformly distributed, it would mean that the decoder shifts the zero one step in the same direction each clock cycle, i.e., there would be no randomization of the shift direction.

The output spectrum of the DEM flash ADC with the PRBS generator is shown in Figure 5.16(c). This shows the spurious tones are still present, and the SFDR is 54 dB. However, the SFDR is still improved by as much as 17 dB, compared with the spectrum for the ADC without DEM in Figure 5.16(a).

The simulation results in Figure 5.16 are only from one single simulation, and do therefore not show the statistical variation. A simulation of the yield for the three different cases above was therefore also done. The results of the yield simulations are shown in Figure 5.18. In this figure the yield is plotted as a function of the SFDR design target, SFDR$_{\text{target}}$. The yield is therefore the probability that the SFDR is larger than SFDR$_{\text{target}}$. In the simulations the standard deviation of the comparator input offset voltage $\sigma_{\text{offset}}$ was 15 mV with a full-scale voltage of 1 V. The standard deviation of the resistor values $\sigma_R$ was 10 %. In addition, the number of clock cycles was $2^{16}$. As seen in Figure 5.18(a) the yield is significantly improved when applying DEM, especially if fully random DEM is applied. From Figure 5.18(b) it is seen that for a yield of 99 % the SFDR design target is increased by about 11 dB if PRBS DEM is applied, and close to 26 dB if fully random DEM is applied, compared with the ADC without DEM.

Figure 5.16: Simulated output spectrum (a) without DEM, (b) with fully random DEM, and (c) PRBS DEM for six bits resolution, a $\sigma_R$ of 10 %, and a $\sigma_{\text{offset}}$ of 15 mV.

**Figure 5.17:** Histogram of the position of the logic zero in the output of the 1-of-126 decoder.

Figure 5.18: In (a) the yield for the flash ADC is plotted as a function of the design target SFDR without DEM, with fully random DEM, and with PRBS DEM, for six bits resolution, a $\sigma_R$ of 10 %, and a $\sigma_{\text{offset}}$ of 15 mV. In (b) a part of the plot has been enlarged.

Chapter 6

ADC Designs

This chapter presents the design of the two test ADCs in this work. In Section 6.1 a design of a 6-bit flash ADC with a MUX-based thermometer-to-binary encoder is presented. In Section 6.2 a design of the proposed flash ADC with DEM is presented. Hence, the section contains the design of the reference generator, the PRBS, the 1-of-126 decoder, and the thermometer-to-binary encoder. The same comparator is used in both ADC designs. The design of the latched comparator is presented in Section 6.3. The circuit topologies of the digital circuits used in the ADC designs are presented in Section 6.4. Some comments on the design of the circuits are also included. In the last section, Section 6.5, the design of the ESD protection circuits is presented.

The results of the modeling in Chapter 5 were used in the design considerations. As an example, the modeling of the clock skew in Section 5.1 indicated that a flash ADC with resolution of up to six bits and an input frequency of 1 GHz can be operated without a SH circuit on the input. The ADCs in this chapter are therefore designed without SH circuits. The removal of the SH circuit saves power, but the comparators must be designed for a higher bandwidth, i.e., their power consumption is increased. The result of the modeling of the reference generator in Section 5.2 is used to calculate the maximum reference net resistance of the reference generator in the flash ADC as well as the DEM flash ADC. The behavioral level simulations of the ADC performance for different encoders in Section 5.4 indicated that the chosen encoder topology in the respective ADC design would not affect the overall ADC performance negatively. Finally, the simulation results of the DEM flash ADC behavioral models presented in Section 5.5 were used to demonstrate the concept of introducing DEM into the reference generator of flash ADCs and to estimate the SFDR improvement. The behavioral level model of the PRBS circuit is used to decide the type of PRBS circuit, whose design is presented in Section 6.2.2 as a part of the design of the DEM flash ADC in Section 6.2.

## 6.1 Flash ADC with MUX-Based Encoder

This section presents the design of the reference generator and the thermometer-to-binary encoder for the flash ADC with MUX-based encoder, starting with the reference generator. The ADC topology is illustrated by Figure 3.5. The design of the comparators used in this ADC is presented in Section 6.3. The results of the design of this ADC are presented and discussed in Section 7.1.

### 6.1.1 Reference Generator

As seen in Figure 3.5 the reference generator of the ADC with MUX-based encoder consists of a string of equally sized resistors, except for the resistors on the top and bottom of the string. These resistors have half the resistance of the other resistors. This choice of resistance centers the reference voltage range on the 850 mV input DC level. A ramp input would therefore yield the quantization error shown in Figure 3.1.

As mentioned in Section 3.5.2 the parasitic capacitors result in input signal feedthrough to the resistor ladder generating the reference voltages. This feedthrough may cause too large reference voltage variations if the resistance of the resistors is too large. The reference net was therefore modeled to find the reference net resistance that yields sufficiently low input-to-reference feedthrough at the lowest power consumption. The model was presented in Section 5.2 and the expression (5.6) was derived. That expression is used to calculate the maximum reference net resistance.

To have some design margin to other error sources the reference net was designed for a maximum feedthrough of 0.25 LSB, i.e., \( q_{\text{LSB}} \) in (5.6) is 0.25. The maximum input frequency was assumed to be 1 GHz. After designing the comparator its parasitic input capacitance was derived, which is 13 fF. The total number of comparators is 63, hence \( C_{\text{tot}} \) in (5.6) becomes 0.83 pF, i.e., the total converter input capacitance excluding the input routing capacitance. Inserting these values into (5.6) gives that the maximum total reference net resistance, \( R_{\text{tot}} \), should be 6 Ω. This low resistance would yield a reference net power consumption of about 42 mW. To reduce the power consumption the method discussed in Section 5.2.1 is applied, i.e., some of the reference net outputs are decoupled.
In Section 5.2.1 it was concluded that by decoupling a number of the reference net outputs the effect of input-to-reference net feedthrough can be further reduced, or the feedthrough can be traded for lower power consumption by increasing the total reference net resistance. This is utilized by decoupling every thirteenth reference net output by a 45 pF on-chip capacitor. The total reference net resistance $R_{tot}$ could thereby be increased to 30 Ω while still having a feedthrough of less than 0.25 LSB. This choice of total resistance would reduce the power consumption of the reference net for that feedthrough. However, to obtain some design margin the resistance was chosen to be 16 Ω. This still reduces the power consumption by more than 60 % compared with not decoupling the reference net outputs, for the same input-to-reference feedthrough.
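The resistance values quoted above can be reproduced from (5.6) with a few lines of Python. The sketch assumes that decoupling every thirteenth output corresponds to $p_{\text{dec}} = 13$, and that the reference string is supplied with 0.5 V when estimating the power as $V^2/R$. Both are assumptions chosen because they agree with the quoted 30 Ω and roughly 42 mW, and the function name `max_reference_resistance` is hypothetical.

```python
import math

def max_reference_resistance(q_lsb, f_in, c_tot, p_dec):
    """Upper bound on the total reference net resistance according to (5.6)."""
    return 4.1 * q_lsb / (math.pi * f_in * c_tot * p_dec)

C_TOT = 0.83e-12   # total comparator input capacitance quoted in the text
F_IN = 1e9         # maximum input frequency
V_REF = 0.5        # assumption: voltage across the resistor string

r_plain = max_reference_resistance(0.25, F_IN, C_TOT, p_dec=64)
r_dec = max_reference_resistance(0.25, F_IN, C_TOT, p_dec=13)   # assumed p_dec
for label, r in (("no decoupling", r_plain),
                 ("every 13th output decoupled", r_dec),
                 ("chosen value", 16.0)):
    print(f"{label}: R_tot = {r:.1f} ohm, P = {1e3 * V_REF ** 2 / r:.1f} mW")
```

Since the power scales as $1/R_{\text{tot}}$ for a fixed supply, going from about 6 Ω to 16 Ω corresponds to the reduction of more than 60 % stated above.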

In the implementation the decoupling capacitors are realized by 45 pF on-chip capacitors, but the decoupling is assumed ideal in the derivation of (5.6). A model of the reference net with an $R_{tot}$ of 16 Ω was therefore simulated on transistor level in Cadence® to verify that the worst-case feedthrough was still below the design target of 0.25 LSB. This model included the reference net with its decoupling capacitors and the parasitic capacitors of the comparators. The reference net, with only the reference net supplies decoupled, was also simulated. The results of these simulations are presented in Figure 6.1.

In Figure 6.1(a) the simulation result of the decoupled reference net is plotted. As seen from this figure the magnitude of the worst-case input-to-reference signal feedthrough is below 0.25 LSB at input frequencies up to more than 1 GHz. Hence the design target is fulfilled.

For comparison the input-to-reference feedthrough was simulated with only the reference net supply decoupled. In the simulation the decoupling capacitance was large, i.e., the decoupling can be considered ideal. Comparing the plot in Figure 6.1(a) with the plot in Figure 6.1(b) where only the reference net supplies are decoupled, we find the feedthrough to be nearly 0.5 LSB higher at the 1 GHz input frequency.

### 6.1.2 Thermometer-to-Binary Encoder

The thermometer-coded output of the comparators is converted to binary code by a thermometer-to-binary encoder. The encoder is implemented using the MUX-based encoder presented in Section 4.2. Since the 2:1 MUXs used in the encoder are buffered with one inverter on their output, as explained in Section 6.4.2, the encoder topology is slightly changed compared to the topology shown in Figure 4.6 for a resolution of four bits. The modi-

Figure 6.1: The worst-case input-to-reference signal feedthrough as a function of the input frequency with (a) every thirteenth reference output decoupled, and with (b) only the reference net supply decoupled.
yields that the different control signals will drive different load capacitances. This introduces a variation in the propagation delays of the encoder outputs that depends on the input signal. In an improved version of the MUX-based encoder the timing difference could be compensated by making the output buffers of the MUXs that drive the control inputs stronger, and by designing the MUXs to have a similar propagation delay from the input to the output as from the control input to the output.

## 6.2 DEM Flash ADC

This section presents the design of the DEM flash ADC, whose topology is illustrated by Figure 4.8(b). The purpose of this design is to demonstrate the DEM technique. The focus is therefore not on the speed performance in this design. As mentioned in Section 4.3 the requirement on maximum input frequency will affect the required MOSFET switch size. Hence the maximum input frequency is limited to 130 MHz, which yields switch transistors that are around 100 µm wide at minimum gate length.

The design of the reference generator will be presented first, followed by the design of the PRBS circuit, and then the DEM control circuit. The results of the design of the DEM flash ADC are presented and discussed in Section 7.2.

### 6.2.1 DEM Flash ADC Reference Generator

The reference generator of the DEM flash ADC consists of two resistor strings connected in a circular structure according to Figure 4.8(b). Although they are connected in a circular structure, they can each be seen as individual resistor strings connected in parallel with the same reference supply voltage \( V_{\text{ref}} \). The maximum total resistor value of each resistor string can therefore be calculated using the same method as for the flash ADC with MUX-based encoder, i.e., by using (5.6).

In this design, the same comparator is used as in the previously presented design. Hence \( C_{\text{tot}} \) is still equal to 0.83 pF, but the maximum feedthrough in this design is one LSB, i.e., \( q_{\text{LSB}} \) is one. The maximum input frequency is limited to 130 MHz. Using (5.6), this yields that the maximum total reference net resistance \( R_{\text{tot}} \) of each of the two resistor strings should be less than 190 \( \Omega \). Further, since the settling time of the reference net limits the speed of the DEM flash ADC the parasitic capacitances on the reference net output nodes should be minimized. This implies that decoupling of the reference net output cannot be used to reduce the input-to-reference feedthrough further, or to reduce the power consumption of the reference net.

### 6.2.2 PRBS

On-chip random generators can use the thermal noise of resistors to generate the random signals [58]. In this work a PRBS generator is used. Although it does not give a true random output signal, the results of the behavioral level models show that DEM with a PRBS generator as random signal source still yields a performance improvement compared with not using DEM, as shown in Figure 5.16 and Figure 5.18. From these behavioral level models the PRBS circuit is extracted. The floor plan of the PRBS consisting of 15 D flip-flops and an XOR gate is shown in Figure 6.3.

Connecting the D flip-flops of the shift register in one row during the layout would give a long routing distance from the last D flip-flop to the XOR gate, assuming the XOR gate is placed near the first D flip-flop. Having a single-row shift register floor plan would require a routing distance of about 700 µm in this design, i.e., the propagation delay of the signal along this path would be about 7 ps, assuming that the signal propagates with a speed equal to a third of the speed of light. In addition, the parasitic capacitance of the wire would also add to the signal propagation time. The propagation delay of the signal between the other stages would be much lower. To balance this delay the D flip-flops are placed in two rows, as illustrated by the floor plan of the PRBS in Figure 6.3. This floor plan yields similar signal propagation times between every stage of the PRBS.
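The 7 ps figure follows directly from the routing distance and the assumed propagation speed, as the short check below shows. The function name `wire_delay` is hypothetical, and the calculation covers only the time of flight, not the additional delay from the wire capacitance mentioned above.

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s

def wire_delay(length_m, velocity_fraction=1 / 3):
    """Time of flight along a wire for a signal travelling at a fraction of c."""
    return length_m / (velocity_fraction * C0)

print(f"single-row feedback path: {wire_delay(700e-6) * 1e12:.1f} ps")
```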

Figure 6.3: The floor plan of the PRBS generator.

### 6.2.3 1-of-126 Decoder

To minimize the settling time of the reference generator the DEM was restricted by only allowing neighboring comparators to exchange their reference voltages. In the behavioral level model of the 6-bit DEM flash ADC this was accomplished by implementing the 1-of-126 decoder by a 126-stage circular shift register. The shift register shifts a single zero, and it is capable of shifting its content one position in either direction. The results of the behavioral level simulations in Figure 5.16 and Figure 5.18 show that this restriction reduces the SFDR compared with the fully random DEM flash ADC. The performance enhancement of the restricted DEM flash ADC is however still significant, compared with the ADC without DEM.

As for the shift register of the PRBS generator, the shift register of the 1-of-126 decoder also has a floor plan where the shift register cells, i.e., the D flip-flops, are placed in multiple rows. The reason is the same as for the shift register of the PRBS, i.e., to get minimum and equal signal propagation times between each stage. The floor plan of the 1-of-126 decoder is shown in Figure 6.4. One of the D flip-flops has a reset input and the other 125 D flip-flops have a preset input. Hence, the 1-of-126 decoder can be initiated to have only one zero on its outputs.

Figure 6.4: The floor plan of the 1-of-126 decoder.

### 6.2.4 Thermometer-to-Binary Encoder

The thermometer-coded output of the comparators is converted to binary code by a thermometer-to-binary encoder, which can be implemented by various approaches, e.g., a ones-counter. In this case a ones-counter should be used, since the comparators are connected to different reference levels during each clock period, as determined by the PRBS and the 1-of-126 decoder. The reference level connected to each comparator therefore varies. The number of ones on the comparator outputs is however the same for the same input, only in a different order. A carry-save adder is therefore chosen as the encoder. It reduces the 63 inputs to 10 outputs, as illustrated by Figure 6.5. The numbers in parentheses in Figure 6.5 indicate the relative weight of the bits on the different nodes. The decoding of the 10 remaining outputs to the binary value is subsequently performed using MATLAB®. The depth of the tree is thereby limited to six levels, which ensures that the encoder will not limit the speed performance of the ADC for sample frequencies up to 800 MHz. In future designs the complete decoding to a binary output can be accomplished on-chip by introducing pipelining in the encoder. Further optimization during the sizing of each full adder can also improve the performance. To balance the propagation delay of the 10 output signals, each of them passes through the same number of full adders. This approach causes some of the full adders to always have a digital zero on one or two of their three inputs.

Figure 6.5: Illustration of the thermometer-to-binary encoder used for the 6-bit DEM flash ADC.
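Why a ones-counter is insensitive to the order of the comparator outputs can be shown with a small carry-save sketch in Python. The reduction schedule below is a generic one, where bits of equal weight are grouped three at a time in each level. It is not the exact tree of Figure 6.5, so the number of remaining outputs may differ from the 10 in the design. All function names are hypothetical.

```python
import random
from collections import defaultdict

def full_adder(a, b, c):
    """One-bit full adder: returns (sum, carry) with a + b + c == sum + 2 * carry."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry

def carry_save_reduce(bits, levels=6):
    """Reduce equally weighted input bits through a fixed number of full adder levels."""
    columns = defaultdict(list)
    columns[1] = list(bits)                 # all inputs have weight 1
    for _ in range(levels):
        nxt = defaultdict(list)
        for weight, col in columns.items():
            full = len(col) - len(col) % 3
            for i in range(0, full, 3):
                s, c = full_adder(*col[i:i + 3])
                nxt[weight].append(s)       # sum keeps the weight
                nxt[2 * weight].append(c)   # carry has twice the weight
            nxt[weight].extend(col[full:])  # leftover bits pass through
        columns = nxt
    return dict(columns)

def weighted_sum(columns):
    """Final decoding step, done off-chip in the design: add up the weighted bits."""
    return sum(w * sum(col) for w, col in columns.items())

outputs = [1] * 37 + [0] * 26               # 37 comparators above the input
random.shuffle(outputs)                     # DEM reorders the comparator outputs
print("decoded value:", weighted_sum(carry_save_reduce(outputs)))
```

Each full adder preserves the weighted sum of its inputs, so the decoded value equals the number of ones regardless of how the outputs are permuted.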

## 6.3 Comparator

This section describes the design of the comparators used in the flash ADC with MUX-based encoder, and in the DEM flash ADC. The measurement results of the designed comparator are presented in Section 7.3. The chosen comparator topology is shown in Figure 6.6. As seen from this figure the comparator consists of a latched comparator with a preamplifier having resistive loads. As mentioned in Section 3.5.3, the use of a preamplifier reduces the input referred offset of the comparator and reduces the kickback noise introduced by the latched comparator.

### 6.3.1 Preamplifier

In Figure 6.6 it is seen that the preamplifier consists of the bias transistor M8, the differential input M9a and M9b, and the two resistors $R$ as passive load. An amplifier with an active load can be designed for a higher gain than if a passive load is used. However, since the requirement on the bandwidth is high the gain should be low [87]. The differential preamplifier therefore has resistive loads, where the resistors are implemented with the unsilicided high value polysilicon layer available in the used partially depleted SOI CMOS technology. This choice yields a preamplifier gain less than three.

As mentioned in Section 3.5.3 the differential topology is chosen to reduce the second order distortion. The emphasis during the design of the comparator was therefore on minimizing the third order distortion given by the expression in (3.18). The third order distortion is plotted in Figure 6.7 as a function of the $f_{\text{amp}}/f_{\text{in}}$ ratio for different linear ranges $V_{lr}$, where $V_{lr} = V_{gs} - V_T$. The full-scale voltage $V_{FS}$ is 0.5 V.

Figure 6.6: The comparator topology used in the designed ADCs.

**Figure 6.7:** Third order distortion as a function of the $f_{\text{amp}}/f_{\text{in}}$ ratio for linear ranges $V_{lr} = 0.1$, $0.175$, $0.25$, and $0.5$ V.

The third order distortion should be below the quantization noise floor, which is given by (3.5b). Since the resolution $N$ of the ADCs is six, (3.5b) gives that the third order distortion should be below $-38$ dB. The linear range $V_{lr}$ was chosen to be $175$ mV. Figure 6.7 then gives that the third order distortion requirement is fulfilled for an $f_{\text{amp}}/f_{\text{in}}$ ratio of 2.5. An $f_{\text{amp}}/f_{\text{in}}$ ratio of three was chosen to obtain a design margin. A sampling frequency of 2 GHz yields a Nyquist frequency of 1 GHz, i.e., the maximum input frequency of the ADC. Hence, the $-3$ dB bandwidth of the preamplifier should be 3 GHz.
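The two numbers in this paragraph can be checked with a short Python sketch. It assumes that (3.5b), which is not reproduced in this section, is the usual expression $6.02N + 1.76$ dB for the signal-to-quantization-noise ratio of a full-scale sinusoid. The function names are hypothetical.

```python
def quantization_sqnr_db(n_bits):
    """Ideal SQNR for a full-scale sinusoid (assumed form of (3.5b))."""
    return 6.02 * n_bits + 1.76

def preamp_bandwidth(f_sample, amp_to_input_ratio):
    """Required -3 dB bandwidth: the chosen ratio times the Nyquist frequency."""
    return amp_to_input_ratio * f_sample / 2

print(f"distortion target: below -{quantization_sqnr_db(6):.1f} dB")
print(f"preamplifier bandwidth: {preamp_bandwidth(2e9, 3) / 1e9:.0f} GHz")
```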

To reduce the mismatch of the input transistors, M9a and M9b in Figure 6.6, due to self-heating, they are placed in the same well during layout. The temperature difference between the input transistors should thereby be smaller than if they were placed in separate wells, as explained in Section 2.4.2.

### 6.3.2 Latched Comparator

In Section 3.4.2 it is stated that for high speed ADCs the comparators generally are based on a regenerative latch, which gives a very fast comparator that can operate at a low power supply voltage. The comparator in this design, depicted in Figure 6.6, therefore consists of a differential input pair, a regenerative latch and inverters as buffers at each output. The differential input pair is the transistors M1a and M1b, which are connected to the bias transistor M0. M2a and M2b are used to reduce the kick-back noise, as explained in Section 3.5.3. The regenerative latch consists of the transistors M4, M5a, M5b, M6a, and M6b.

The transistor pairs M6a/M5a and M6b/M5b build up the two inverters, or negative gain amplifiers, of the regenerative latch. The output of each inverter is connected to the input of the other inverter. A fifth transistor, M4, is connected between the outputs of the two inverters. When the clock goes high, the transistor M4 sets the latch in its metastable state. The input stage of the comparator is at the same time turned on. The input stage will introduce an imbalance between the currents through the two metastable inverters. This imbalance is dependent on the signal on the input of the comparator. When the clock goes low, the transistor M4 is turned off and the latch enters its evaluation phase. The current imbalance introduced into the latch by the input stage then steers the latch to one of its stable states.

In Section 2.4.2 it is stated that fast switching circuits are less affected by the self-heating effect. The reason is that, since they are constantly switched on and off, they reach thermal equilibrium at a lower device temperature than if they were turned on the whole time. Since the input stage of the comparator is turned off when M2a and M2b are turned off, the self-heating could have less effect on the input stage. To further reduce the effect of the self-heating, the transistors M1a and M1b are placed in the same well for the same reason as for M9a and M9b, i.e., to reduce the device temperature difference between the input transistors of the differential input.

As mentioned in Section 3.5.3 the metastability error rate can be reduced by designing the inverters on the output of the latch to have a higher threshold voltage than the latch. This is accomplished by making the ratio between the size of the PMOS and NMOS transistors of the inverters larger than the ratio of the PMOS and NMOS transistors of the regenerative latch [86]. The output is then logic one when the latch is in the reset phase, which should reduce the metastability errors on the outputs of the comparator.

The buffers are each connected to a D flip-flop, which holds the output of the comparator for a clock period. This gives the thermometer-to-binary encoder more time to perform the conversion. These D flip-flops also increase the regeneration gain of the latched comparator. The increased regeneration gain further reduces the metastability error rate [50].

6.4 Digital Circuits

This section presents the different digital circuits used in this work, starting with the full adders.

6.4.1 Full Adder

The full adders are realized by the static CMOS logic circuit depicted in Figure 6.8 [42]. They are designed to have a propagation delay of about 200 ps. In addition, each of their two outputs is designed to drive a load equal to the input of a similar full adder. Hence, they are designed to drive a load of about 150 fF.

6.4.2 2:1 MUX

The depth of the MUX-based encoder is five 2:1 MUXs. Since the MUXs are based on transmission gates, this yields a slow encoder circuit if the MUXs are not buffered. To improve the maximum operation frequency of the encoder, an inverter is introduced as an output buffer in the MUXs. Only one inverter is introduced to keep the output-buffer propagation delay low. Using only one inverter as buffer is acceptable since it only leads to a minor modification of the original MUX-based encoder circuit, as seen by comparing Figure 4.6 and Figure 6.2. The symbol of a transmission gate and its transistor level implementation are depicted in Figure 6.9. The buffered 2:1 MUX schematic topology is shown in Figure 6.10, where the output buffer is included.
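
The depth of five MUX levels can be illustrated with a small behavioral sketch. It assumes the usual recursive MUX-based structure, where the middle thermometer bit is the most significant output bit and also selects which half of the code is passed on to the next level. The function name and list representation are illustrative, and the output inverters are not modeled.

```python
def mux_encode(thermo):
    """Behavioral sketch of a MUX-based thermometer-to-binary encoder.

    thermo: list of 0/1 of length 2**N - 1, thermo[0] is the lowest comparator.
    Returns (bits, depth) with bits MSB first and depth the number of MUX levels.
    """
    n = len(thermo)
    if n == 1:
        return [thermo[0]], 0
    mid = n // 2                      # index of the middle comparator output
    msb = thermo[mid]                 # middle bit is the current output bit
    lower, upper = thermo[:mid], thermo[mid + 1:]
    # One level of 2:1 MUXs, all controlled by the current output bit.
    selected = [u if msb else l for l, u in zip(lower, upper)]
    bits, depth = mux_encode(selected)
    return [msb] + bits, depth + 1


bits, depth = mux_encode([1] * 37 + [0] * 26)   # 63 comparator outputs, 37 ones
print(bits, depth)  # [1, 0, 0, 1, 0, 1] 5
```

For a 63-bit thermometer code the recursion passes through codes of length 31, 15, 7, 3, and 1, which gives the five MUX levels mentioned above.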

The MUXs were designed with another MUX as the load. The design yields a propagation delay of 50 ps from the “0” or “1”-input to the output, and 70 ps from the control input to the output.

### 6.4.3 D Flip-Flop

The D flip-flops are built with transmission gates as depicted in Figure 6.11. The topology is the same for the D flip-flops with preset as for those with reset. The only difference is the logic level of the “1”-input of the 2:1 MUX, which is logic one for preset and logic zero for reset. In Figure 6.11 the local clock buffer is also included.

Figure 6.10: (a) Schematic symbol of a 2:1 MUX including the output buffer, and (b) the MUX circuit.

The D flip-flops were designed for a 100 fF load, and have a propagation delay of 70 ps. In addition, the input must be stable at least 20 ps before the positive clock edge.

Figure 6.11: The D flip-flop circuit.

6.5 ESD Protection Circuit Design

The technology used for the ADC implementations is still largely at the test stage, so very little is included in the design kit. For example, parameterized transistor cells for generating the layout of the transistors were not included, nor were the pads. The ESD protection circuits of the pads therefore also had to be designed, as presented in this section.

The technology used is a partially depleted SOI CMOS technology. Hence two stages of the CMOS gated double-diode network discussed in Section 2.5.2 are used as the ESD protection for the input pads. The circuit is depicted in Figure 6.12. The purpose of the resistor $R$ in this figure is to limit the ESD current to the second stage.

The circuit was designed for a maximum $V_{\text{ESD}}$ of 1.5 kV using the human body model depicted in Figure 2.7(a). The $-3$ dB bandwidth of the ESD protection circuit is 8 GHz and should therefore not notably affect the performance of the ADCs.
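
To give a feeling for the stress level that a 1.5 kV human body model discharge represents, the sketch below computes the peak current, stored charge, and stored energy. It assumes the commonly used human body model values of 100 pF and 1.5 kΩ, which may differ from the component values in Figure 2.7(a). The function name is illustrative.

```python
def hbm_stress(v_esd, c_hbm=100e-12, r_hbm=1.5e3):
    """Peak current, charge, energy and time constant of an HBM discharge.

    c_hbm and r_hbm are assumed, commonly used HBM values and are not taken
    from the design described in the text.
    """
    i_peak = v_esd / r_hbm             # initial current into a shorted pin
    charge = c_hbm * v_esd             # charge stored on the body capacitance
    energy = 0.5 * c_hbm * v_esd ** 2  # energy stored before the discharge
    tau = r_hbm * c_hbm                # decay time constant of the discharge
    return i_peak, charge, energy, tau


i_peak, charge, energy, tau = hbm_stress(1.5e3)
print(i_peak)  # 1.0
```

Under these assumptions the protection circuit must handle a peak current of about 1 A that decays with a time constant of about 150 ns.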

![Circuit Diagram](image)

**Figure 6.12:** The circuit topology of the ESD protection circuits of the input pads.
Chapter 7

Results and Discussion

The designed ADCs are currently being manufactured. The ADC results presented in this chapter are therefore based on transistor level simulations in Cadence® using the foundry provided Berkeley short-channel insulated-gate field effect transistor (IGFET) model for SOI (BSIM3SOI) Eldo™ models. The provided models only contain the typical transistor parameters. Hence no corner simulation was possible. In the simulations only the transistor parasitic capacitances are included. The interconnect parasitics are not included, since the provided design kit does not yet support extraction of the interconnect parasitics. Hence, when designing the sub-circuits of the ADCs, the interconnect parasitic capacitance was approximated by the rule of thumb of 2 pF/cm. The approximate interconnect distances were extracted from the floor plan and from the layout.

The inductance of the bond wires introduces supply voltage variations, which can be large for high-speed circuits. The power supply was therefore simulated with the bond wires modeled as inductors. For the simulations of the power supply, the bond wires were assumed to have an inductance of 1 nH/mm.
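
The two rules of thumb used above can be sketched as follows. The per-length values are those stated in the text, while the wire lengths and the current slope in the usage example are hypothetical numbers chosen only for illustration.

```python
C_PER_CM = 2e-12  # interconnect capacitance rule of thumb, F/cm (from the text)
L_PER_MM = 1e-9   # bond-wire inductance assumption, H/mm (from the text)


def interconnect_capacitance(length_um):
    """Estimated interconnect capacitance in farads for a wire length in um."""
    return C_PER_CM * length_um * 1e-4  # 1 um = 1e-4 cm


def bond_wire_inductance(length_mm):
    """Estimated bond-wire inductance in henries for a wire length in mm."""
    return L_PER_MM * length_mm


def supply_bounce(length_mm, di_dt):
    """Supply voltage variation V = L * di/dt for a current slope in A/s."""
    return bond_wire_inductance(length_mm) * di_dt


# Hypothetical example values: a 500 um wire and a 2 mm bond wire
# carrying a current slope of 100 mA/ns.
c_wire = interconnect_capacitance(500.0)
v_bounce = supply_bounce(2.0, 1e8)
print(round(c_wire * 1e12, 3), round(v_bounce, 3))  # 0.1 0.2
```

A 500 µm interconnect thus corresponds to about 0.1 pF with this rule of thumb.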

This chapter is organized as follows. In Section 7.1 the simulation results for the flash ADC with MUX-based encoder are presented and discussed. Section 7.2 presents and discusses the simulation results of the DEM flash ADC. Some measurement results of an earlier manufactured comparator are presented and discussed in Section 7.3. Finally, in Section 7.4 the SOI CMOS technology is discussed based on what was presented in Chapter 2.
7.1 Flash ADC with MUX-Based Encoder

In Section 6.1 the design of a 6-bit flash ADC with a MUX-based encoder is presented. The encoder consists entirely of a number of 2:1 MUXs, giving a compact and regular structure. This was a benefit when doing the layout of the encoder.

To find the maximum sampling frequency $f_{s,max}$ the ENOB was plotted as a function of the sampling frequency for a 9 MHz 0.5 V full-scale sinusoid input. The plot is found in Figure 7.1(a). This plot yields a maximum sampling frequency $f_{s,max}$ of slightly above 1 GHz. The ENOB at low input frequencies for a sampling frequency of 1 GHz is 5.8 bits. The plotted ENOB was calculated from a sine wave curve fit, according to the method described in Section 3.3.5.
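
A minimal sketch of an ENOB estimate based on a sine wave curve fit is given below. It is not the procedure of Section 3.3.5, which is not reproduced here. It assumes coherent sampling, i.e., an integer number of input cycles in the record, and uses the standard relation ENOB = (SINAD − 1.76)/6.02. The ideal quantizer and all names are illustrative.

```python
import math


def enob_from_sine_fit(samples, cycles):
    """ENOB from a least-squares sine fit, assuming coherent sampling."""
    n = len(samples)
    w = 2.0 * math.pi * cycles / n
    # With coherent sampling the cos, sin and DC basis vectors are orthogonal,
    # so the least-squares coefficients reduce to simple projections.
    a = 2.0 / n * sum(y * math.cos(w * k) for k, y in enumerate(samples))
    b = 2.0 / n * sum(y * math.sin(w * k) for k, y in enumerate(samples))
    c = sum(samples) / n
    residual = [y - (a * math.cos(w * k) + b * math.sin(w * k) + c)
                for k, y in enumerate(samples)]
    noise_rms = math.sqrt(sum(r * r for r in residual) / n)
    signal_rms = math.hypot(a, b) / math.sqrt(2.0)
    sinad_db = 20.0 * math.log10(signal_rms / noise_rms)
    return (sinad_db - 1.76) / 6.02


def ideal_adc(v, bits=6, v_fs=0.5):
    """Ideal quantizer used only to exercise the fit (not the designed ADC)."""
    q = v_fs / 2 ** bits
    code = int(math.floor((v + v_fs / 2.0) / q))
    code = min(max(code, 0), 2 ** bits - 1)
    return (code + 0.5) * q - v_fs / 2.0


n, cycles = 1024, 9
samples = [ideal_adc(0.25 * math.sin(2.0 * math.pi * cycles * k / n + 0.3))
           for k in range(n)]
enob = enob_from_sine_fit(samples, cycles)  # expected to be close to 6 bits
```

For simulated or measured data the residual also contains noise and distortion, so the estimate drops below the nominal resolution, as with the 5.8 bits reported above.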

When designing the comparators the sampling frequency design target was 2 GHz to obtain some design margin. The maximum sampling frequency was thereby expected to be somewhat higher than the obtained 1 GHz. Limitations in either the MUX-based thermometer-to-binary encoder or the latches in the comparators are believed to be the reason for not reaching a maximum sampling frequency of more than 1 GHz. This should be investigated in future work. The real $f_{s,max}$ will be extracted from the future measurements.

In Figure 7.1(b) the ENOB is plotted as a function of the input frequency for a 1 GHz sampling frequency, i.e., close to the maximum sampling frequency. The input was also in this case a full-scale sinusoid. This test yields an ERBW of 390 MHz.

![Figure 7.1](image-url)

Figure 7.1: ENOB as a function of (a) the sampling frequency and (b) the input frequency for the flash ADC with MUX-based encoder.
To illustrate the ADC output spectrum a plot of the simulated output power spectrum at 1 GHz sampling frequency and a 359 MHz full-scale sinusoid input is shown in Figure 7.2. This plot shows that the SFDR is 39 dB. Further, the power consumption is 170 mW. Hence the figure of merit in terms of energy per conversion step, defined according to (3.13), becomes 3.9 pJ.
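
The quoted energy per conversion step can be checked with a short calculation. Equation (3.13) is not reproduced in this chapter, so the sketch assumes the common definition $P/(2^{\text{ENOB}} \cdot 2 \cdot \text{ERBW})$, which reproduces the 3.9 pJ above when the low-frequency ENOB of 5.8 bits and the ERBW of 390 MHz are used.

```python
def energy_per_conversion_step(power_w, enob, erbw_hz):
    """Figure of merit in joules, assuming FoM = P / (2**ENOB * 2 * ERBW)."""
    return power_w / (2.0 ** enob * 2.0 * erbw_hz)


fom = energy_per_conversion_step(170e-3, 5.8, 390e6)
print(round(fom * 1e12, 1))  # 3.9
```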

![Figure 7.2](image)

**Figure 7.2:** Simulated output power spectrum of the flash ADC with MUX-based encoder for a 359 MHz full-scale sinusoid input at $f_{s,\text{max}}$.

The ADC occupies a total chip area of 2.9 mm$^2$ and a core area of about 0.4 mm$^2$ without the decoupling capacitors. The layout is shown in Figure 7.3, where the different parts are indicated. On the top, indicated by (1), the MUX-based encoder is shown. Below the encoder the stacked comparators are indicated. The reference net and the decoupling capacitors for the reference net are indicated by (3) and (4) respectively. Of the remaining capacitors, some are used for decoupling the analog power supply and the others for decoupling the digital power supply.

The ADC performance is summarized in Table 7.1. The total input capacitance in this table only considers the parasitic input capacitance of the comparators. Adding the parasitic capacitance of the input interconnect increases the total input capacitance by about 0.1 to 0.2 pF.

The figure of merit in terms of energy per conversion step of the converter is 3.9 pJ based on the transistor level simulation results. For comparison Table 7.2 lists the efficiency of some previously published 6-bit flash ADCs implemented in bulk CMOS technologies, together with the efficiency of the flash ADC with MUX-based encoder discussed in this section. The reason for only comparing this design with flash ADCs implemented in bulk CMOS technology is that no publications on high sampling rate 6-bit flash ADCs implemented in SOI have been found. The comparison indicates that the flash ADC with MUX-based encoder could be as efficient as other state-of-the-art converters, without applying interpolating or averaging techniques as in [29, 65, 66]. Note that the other results are based on real measurements, while the results presented in this work are based on simulations. The results presented in Table 7.2 are therefore likely to be modified after the measurements have been performed.

7.2 DEM Flash ADC

In Section 6.2 the design of a 6-bit flash ADC with DEM is presented. The operation of the ADC with DEM is evaluated by simulation in Cadence® using foundry provided BSIM3SOI models for Eldo™. The ADC was allowed to settle for 1.25 µs at start-up before using the data and the input was 90% of full-scale, i.e., 225 mV amplitude. The long settling time is a major drawback of this topology. To reduce the settling time the size of the switches in the reference net must be reduced. The sizing of the switches should be investigated further in future work.

The ENOB as a function of the sampling frequency for a low frequency, 9 MHz, sinusoid input is plotted in Figure 7.4. From this estimate it is seen that the maximum sampling frequency $f_{s,\text{max}}$ is above 550 MHz. Allowing a longer settling time before using the data indicated an improved ENOB. However, due to the very long simulation time, more than two weeks on a Sun Fire 280R with UltraSPARC III processors, this could not be fully evaluated. The improvement of ENOB will be further investigated during the measurements. The measurements will also reveal the real maximum sampling frequency and the ERBW. The latter might be negatively affected by the switches introduced in the reference net, which will also be evaluated by measurements.

Table 7.1: Performance summary of the 6-bit flash ADC with MUX-based encoder.

Although the ADC was allowed to settle for 1.25 µs at start-up, the reference net did not completely settle within this time period. This is believed to be the reason for the reduction of the ENOB for low frequencies seen in Figure 7.4. The curve in Figure 7.4 is believed to be non-monotonic because not all input levels are converted within the simulation time in some of the cases. This will be investigated in the measurements by using more samples for the sine wave curve fit.

<table>
<thead>
<tr>
<th>Reference</th>
<th>Technology/Supply voltage</th>
<th>Figure of merit</th>
</tr>
</thead>
<tbody>
<tr>
<td>[29]</td>
<td>350 nm bulk CMOS/3.3 V</td>
<td>7.9 pJ</td>
</tr>
<tr>
<td>[87]</td>
<td>250 nm bulk CMOS/1.8 V</td>
<td>13.7 pJ</td>
</tr>
<tr>
<td>[66]</td>
<td>180 nm bulk CMOS/1.95 V</td>
<td>5.0 pJ</td>
</tr>
<tr>
<td>[65]</td>
<td>130 nm bulk CMOS/1.5 V</td>
<td>2.3 pJ</td>
</tr>
<tr>
<td>MUX-based encoder</td>
<td>130 nm PD SOI CMOS/1.2 V</td>
<td>3.9 pJ</td>
</tr>
</tbody>
</table>

Table 7.2: Comparison of the ADC with MUX-based encoder with previous work in terms of energy per conversion step.

![Figure 7.4: ENOB as a function of the sampling frequency for the DEM flash ADC with a 9 MHz sinusoid input at 90% of full-scale.](image)

During the circuit simulations, it was observed that when increasing the sampling frequency the simulation time increased for the same number of samples. More samples were used when simulating the output spectrum than for calculating the ENOB. To reduce the simulation time when simulating the output spectrum the sampling frequency of the ADC with DEM was limited to 130 MHz. Although the number of samples was about 1100, the simulation time still was nearly a month on a Sun Fire 280R with UltraSPARC III processors.

The spectra with and without DEM are plotted in Figure 7.5 for frequencies up to 65 MHz, i.e., the Nyquist frequency. The data used were taken without allowing the ADC to settle at start-up, which results in the low-frequency component seen in the plots in Figure 7.5. In addition, the uncertainty in the resistor values was assumed to be Gaussian distributed with a standard deviation of 10% in this simulation. The comparator offset standard deviation was scaled to 7.5 mV since $V_{ref}$ is 0.5 V in the implementation instead of the 1 V used in the MATLAB® simulations presented in Section 5.5.

Figure 7.5: Simulated output power spectrum at 130 MHz sampling frequency (a) without DEM and (b) with DEM.

The simulated output power spectrum in Figure 7.5(a) yields an SFDR of 39 dB when not using DEM. When DEM is used the SFDR is increased to 45 dB, as shown in Figure 7.5(b). Hence the SFDR improves by 6 dB when using DEM compared with not using DEM. The 6 dB improvement differs from the results of the behavioral level model simulations, which indicated an improvement by 11 dB according to the yield simulation, where 216 samples were used. Due to the long simulation time, a yield simulation on transistor level was not practical. Instead, this will be investigated by the coming measurements.
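
As an illustration of how an SFDR value is read from an output spectrum, the sketch below computes the ratio between the signal power and the power of the largest spurious tone from a sampled record. It assumes coherent sampling, so that every tone falls in a single DFT bin, and that the signal is the strongest tone. The function name and the test signal are illustrative and unrelated to the simulated ADC data.

```python
import cmath
import math


def sfdr_db(samples):
    """SFDR in dB from a plain DFT, assuming coherent sampling."""
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]  # remove the DC component
    power = []
    for k in range(1, n // 2):       # bins between DC and Nyquist
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        power.append(abs(acc) ** 2)
    ordered = sorted(power, reverse=True)
    # Strongest bin is taken as the signal, second strongest as the largest spur.
    return 10.0 * math.log10(ordered[0] / ordered[1])


n = 256
test_signal = [math.sin(2 * math.pi * 9 * i / n)
               + 0.01 * math.sin(2 * math.pi * 31 * i / n) for i in range(n)]
print(round(sfdr_db(test_signal), 1))  # 40.0
```

The spur in the test signal has 1% of the signal amplitude, which corresponds to 40 dB.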

The behavioral level simulations also indicated a 17 dB improvement of the SFDR in one of the simulations shown in Figure 5.16, compared with 6 dB in the transistor level simulation shown in Figure 7.5. The improvement of the SFDR is expected to be larger if more samples are used, i.e., by increasing the simulation time, since using more samples would further distribute the signal power of the spurious tones over the frequency range. For practical reasons this could not be evaluated, but the measurements will reveal if the SFDR improvement is larger when more samples are used. However, the design still demonstrates the concept of introducing DEM in the reference net of a flash ADC, and the transistor level simulation results indicate that the introduced DEM results in an improved SFDR, as expected from the behavioral level simulations.

As mentioned above the number of samples is limited to 1100 in the simulation of the spectrum. The resulting histogram of the position of the zero in the output of the 1-of-126 decoder is shown in Figure 7.6. As seen, several of the positions are never visited, and the histogram differs much from the histogram in Figure 5.17. The 1100 samples are just above 3% of the whole PRBS sequence, which is $2^{15} - 1$ long. This is one likely reason why the SFDR improvement is only 6 dB in the circuit simulations. Another reason may be that some of the nonideal behavior of the circuits is included when simulating on transistor level, in contrast to the behavioral level simulations.
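
The sequence length and the fraction covered by 1100 samples can be checked with a small shift-register sketch. The feedback taps used here (stages 15 and 14) are an assumption corresponding to a commonly used maximal-length 15-bit PRBS. The taps of the PRBS generator in the implemented ADC are not stated in this chapter.

```python
def lfsr15_period(seed=1):
    """Count the states of a 15-bit LFSR with feedback from stages 15 and 14.

    The tap choice is an assumed maximal-length configuration, not taken
    from the implemented design.
    """
    state = seed & 0x7FFF
    start = state
    count = 0
    while True:
        feedback = ((state >> 14) ^ (state >> 13)) & 1
        state = ((state << 1) | feedback) & 0x7FFF
        count += 1
        if state == start:
            return count


period = lfsr15_period()
fraction = 1100 / period
print(period, round(100 * fraction, 2))  # 32767 3.36
```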

![Histogram of the position of the zero in the output of the 1-of-126 decoder for the transistor level simulations.](image)

**Figure 7.6:** Histogram of the position of the zero in the output of the 1-of-126 decoder for the transistor level simulations.

The average power consumption for a 9 MHz 0.5 V full-scale sinusoid input and a 300 MHz sampling frequency is 92 mW. The ADC occupies a total chip area of 3.9 mm$^2$, including the pad frame. The core area is 0.9 mm$^2$. The layout is found in Figure 7.7 where the different parts of the ADC are indicated. In Figure 7.7 the upper half of the chip is missing. That part contains the ADC with MUX-based encoder, which is shown in Figure 7.3.

The same comparators are used in the design of the flash ADC with DEM as in the design of the flash ADC with MUX-based encoder. Hence the total input capacitance is the same in both designs, if the parasitic capacitance of the input interconnect is not considered. The simulated performance is summarized in Table 7.3.

After designing the DEM flash ADC it was obvious that the introduction of only two switches in series with the reference net also imposes a major limitation on the maximum input frequency. This result indicates that the solutions in [6, 7] would be even more limited in terms of maximum input frequency. The DEM circuits proposed in [6, 7] would also have a longer settling time since the larger number of switches introduced into the reference net would increase the parasitic capacitance. The larger number of switches would also yield a larger chip area of the circuit.

Figure 7.7: The layout of the DEM flash ADC. Indicated parts are (1) PRBS, (2) 1-of-126 decoder, (3) reference generator and switches, (4) comparators including the D flip-flops, and (5) ones-counter encoder.

### 7.3 Comparator

The comparator was designed and simulated using the foundry provided BSIM3SOI models. To reduce the third order distortion the preamplifier was designed for a $-3$ dB bandwidth of 3 GHz. The large bandwidth imposed a trade-off on the preamplifier DC gain, which is 9.2 dB.

The offset was derived by applying a ramp input to the simulated comparator for different sampling frequencies. These simulations yield that the offset is lower than 0.25 LSB up to sampling frequencies of 1.5 GHz. This offset is only due to the speed limitations of the comparator and not the mismatch of the input transistors.
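
To relate the offset limit to a voltage, the sketch below converts LSBs to volts for the 6-bit resolution and the 0.5 V full-scale voltage used in this work. The function name is illustrative.

```python
def lsb_voltage(v_fs=0.5, bits=6):
    """Size of one LSB in volts for a given full-scale voltage and resolution."""
    return v_fs / 2 ** bits


lsb = lsb_voltage()
offset_limit = 0.25 * lsb
print(lsb * 1e3, offset_limit * 1e3)  # 7.8125 1.953125
```

An offset below 0.25 LSB thus corresponds to less than about 2 mV.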

The comparators have been manufactured, but due to limitations in the measurement set-up the comparators have only been tested for sampling frequencies of up to 500 MHz. These measurements yield a power consumption of 9.6 mW, including the output buffers, measured at a sampling frequency of 400 MHz and a constant input signal. A chip photo of the comparator evaluation chip is shown in Figure 7.8.

The power supply fluctuation was reduced by having multiple power supply pads and by decoupling. The chip area was pad limited due to the large number of supply pads. Hence, after placing the comparator, the clock buffer, and the output buffers the remaining chip area was used for the decoupling capacitors. Those are the squares inside the pad frame in Figure 7.8. As seen from that figure the decoupling capacitors consume most of the chip area. The comparator is indicated in Figure 7.8 and has an area of 0.0018 mm$^2$. The total chip area is 0.7 mm$^2$.

<table>
<tbody>
<tr>
<td>Technology</td>
<td>130 nm PD SOI CMOS</td>
</tr>
<tr>
<td>Supply voltage</td>
<td>1.2 V</td>
</tr>
<tr>
<td>Full-scale voltage</td>
<td>0.5 V</td>
</tr>
<tr>
<td>Resolution</td>
<td>6 bit</td>
</tr>
<tr>
<td>$f_{s,\text{max}}$</td>
<td>&gt; 550 MHz</td>
</tr>
<tr>
<td>SFDR at $f_s = 130$ MHz and $f_{\text{in}} = 9$ MHz</td>
<td>45 dB</td>
</tr>
<tr>
<td>Total input capacitance</td>
<td>0.83 pF</td>
</tr>
<tr>
<td>Average power consumption at $f_s = 300$ MHz</td>
<td>92 mW</td>
</tr>
<tr>
<td>Total chip area</td>
<td>3.9 mm$^2$</td>
</tr>
<tr>
<td>Core area</td>
<td>0.9 mm$^2$</td>
</tr>
</tbody>
</table>

Table 7.3: Performance summary of the 6-bit DEM flash ADC.

Figure 7.8: Chip photo of the comparator evaluation chip, where (1) outlines the comparator.
As discussed in Chapter 2, the SOI CMOS technology has several advantages, but also several disadvantages, compared with bulk CMOS technologies. This section discusses the benefits and drawbacks of SOI CMOS compared with bulk CMOS, and the differences between the partially depleted and fully depleted SOI CMOS technologies. Based on this discussion, a few conclusions regarding the SOI technology are presented in Chapter 8.

Perhaps the most obvious advantage of SOI is the reduced parasitic capacitance, which implies that the devices and circuits can operate at a higher speed. This is also mentioned in Chapter 2 together with some other less obvious advantages, e.g., a reduced doping density and an improved current drive capability due to a reduced body factor $n$, which also implies that the DC gain and unity gain frequency improve. However, all of the properties listed above hold only for fully depleted SOI CMOS technologies, except for the reduction of parasitic capacitance. The latter also holds for partially depleted SOI CMOS, at least for technologies with minimum gate lengths larger than 100 nm. As the partially depleted SOI CMOS technology is scaled below 100 nm the advantage diminishes, and is predicted to be almost completely gone when approaching 70 nm, as mentioned in Chapter 2. Hence, there seems to be no apparent advantage of using a partially depleted SOI CMOS technology over a bulk CMOS technology for analog circuit design. In addition, considering the kink effect, the history effect, and the self-heating, the bulk CMOS technology even seems to be a better choice. The first two effects can however be avoided by using body contacts.

Apart from what was discussed above there are still a few advantages of using the partially depleted SOI CMOS technology. Due to the buried insulator and the thin-film structure of the partially depleted SOI CMOS technology, the latch-up effect is eliminated and the device density can be higher than in a bulk CMOS technology, at least for digital circuits. For analog circuits no general conclusions can be drawn about the device density, since other parameters must also be considered during layout, e.g., matching and self-heating. For digital circuits the self-heating is less important, and therefore generally does not impose restrictions on the device density, except for parts where close matching between the propagation delays is required, since the matching could be affected by temperature variations. Further, the radiation hardness is improved due to the buried insulator. This is obviously an advantage for radiation hard applications, and the radiation hardness reduces the soft error rate of radiation sensitive circuits, e.g., memories. In addition, the crosstalk is reduced and, e.g., inductors can be implemented with a higher Q factor than in bulk CMOS technologies.

For the fully depleted SOI CMOS technologies the situation is predicted to be different. The required doping density is lower, and a reduced body factor improves the DC gain and unity gain frequency, which are important properties in, e.g., analog circuits. Like the partially depleted SOI CMOS technology, the fully depleted SOI CMOS technology benefits from the elimination of the latch-up effect, improved radiation hardness, reduced crosstalk, and the possibility of designing inductors with higher Q factors than in bulk CMOS technologies. In addition, devices implemented in a fully depleted technology do not suffer from the kink and history effects due to the complete depletion of their bodies. In partially depleted technology these effects are present and must be considered.

However, the thermal effects are still present in fully depleted SOI, and will be more severe since the thin-film thickness is smaller than for partially depleted SOI. Another major obstacle is the difficulty of manufacturing, since the thin-film thickness must be controlled accurately to avoid too large threshold voltage variations over the chip. If this is solved and the thermal effects can be managed in the design flow, the fully depleted SOI CMOS technology seems to be a good candidate for the future mainstream sub-100 nm technologies compared with bulk CMOS and partially depleted SOI CMOS technologies.
Chapter 8

Conclusions

Behavioral level models of flash ADCs were presented. The models were used to facilitate the top-down design methodology. MATLAB® simulations of these models yield that the thermometer-to-binary encoder will have an effect on the overall ADC performance. Further, these models gave insight and helped during the design phase. The models were also used to explore the potential SFDR improvement of applying DEM in flash ADCs.

The implementation of two 6-bit flash ADCs was presented, one with a MUX-based encoder and the other with DEM. The layout was done in a 130 nm partially depleted SOI CMOS technology. The ADCs are currently being manufactured. Hence the ADC results presented in this thesis are based on simulations in Cadence® using the foundry provided BSIM3SOI models for Eldo™.

The 6-bit flash ADC with MUX-based encoder has a maximum sampling frequency of 1 GHz and an efficiency in terms of energy per conversion step of 3.9 pJ, based on the transistor level simulation results. A comparison with other published work shows that it could be as efficient as contemporary state-of-the-art converters. This will be verified by the future measurements. This design also demonstrates that the MUX-based encoder topology can be used for high-speed flash ADCs. The regular structure of this topology was a major benefit when doing the layout of the encoder, and it consumes little chip area. The major drawback of the MUX-based encoder is the input code dependent propagation delay of the output signals.

The second 6-bit flash ADC presented in this work demonstrated the concept of introducing DEM into the reference net. Due to the introduction of DEM the converter requires a ones-counter as encoder. The transistor level simulations indicated a 6 dB improvement of the SFDR compared with not using DEM. This shows that the proposed DEM topology is likely to have a positive effect on the SFDR. The maximum sampling frequency is above 550 MHz according to the simulations. The simulation results will be verified by future measurements on the manufactured circuits.

The main advantages of the SOI CMOS technology over bulk CMOS technology are the elimination of the latch-up effect, increased device density, reduced soft-error rate of circuits implemented in SOI CMOS, reduced crosstalk, and an improved quality factor Q of inductors. The partially depleted SOI CMOS technology is however predicted to show little advantage in speed over the bulk CMOS technology as the technologies are scaled below 100 nm gate length. Fully depleted SOI CMOS technology, on the other hand, has several advantages when it is scaled. It requires a lower doping density and has a lower body factor, which improves the DC gain and unity gain frequency of the devices. Further, it does not suffer from the kink and history effects. The effect of the self-heating and the difficulty of manufacturing must however be solved. If this is successful, the fully depleted SOI CMOS technology is a promising contender as a future mainstream technology.

8.1 Future Work

This section lists some suggestions for future work related to the presented ADCs and to the evaluation of the SOI CMOS technology.

First of all, the manufactured circuits will be characterized by measurements, which will show how well the simulated performance matches the real performance.

The source of the limitation of the maximum sampling frequency of the flash ADC with MUX-based encoder should be investigated further. The limitation is expected to be due to limitations in the latches of the comparators or the MUX-based thermometer-to-binary encoder. The MUX-based encoder should also be improved to reduce the input code dependent propagation delay of the output signal.

If the measurement results of the flash ADC with DEM are promising, the sizing of the MOSFET switches in the reference net should be investigated. The goal should be to find a good trade-off between the start-up settling time of the reference net, the maximum input frequency, and the maximum sampling frequency by proper sizing of the switches.

Further studies of the performance improvement of using SOI CMOS technology in analog design should also be performed. This can be accomplished by, e.g., optimization based design of common analog building blocks. These could be designed both in the partially depleted SOI CMOS technology and in the bulk CMOS technology that the partially depleted SOI CMOS technology is based on. The simulated performance of the designed circuits could then serve as a benchmark of the partially depleted SOI CMOS technology compared with the bulk CMOS technology. This would point out what, if any, performance enhancements there are for analog circuits from using the partially depleted SOI CMOS technology.

In the future, the implementation of analog circuits in fully depleted SOI CMOS technologies should also be investigated to verify the predicted good properties.
Appendix A

Notation

<table>
<thead>
<tr>
<th>Symbol</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>$A_0$</td>
<td>DC gain</td>
</tr>
<tr>
<td>$C_{ch,b}$</td>
<td>Parasitic capacitance between channel and body</td>
</tr>
<tr>
<td>$C_{comp,in}$</td>
<td>Parasitic capacitance between the comparator inputs</td>
</tr>
<tr>
<td>$C_{db}$</td>
<td>Parasitic bottom junction area capacitance at the drain</td>
</tr>
<tr>
<td>$C_{g,ch}$</td>
<td>Parasitic capacitance between gate and channel</td>
</tr>
<tr>
<td>$C_{gs}$</td>
<td>Parasitic capacitance between gate and source</td>
</tr>
<tr>
<td>$C_{j,d}$</td>
<td>Parasitic drain junction capacitance</td>
</tr>
<tr>
<td>$C_{j,s}$</td>
<td>Parasitic source junction capacitance</td>
</tr>
<tr>
<td>$C_{j,sw}$</td>
<td>Parasitic sidewall junction capacitance</td>
</tr>
<tr>
<td>$C_L$</td>
<td>Load capacitance</td>
</tr>
<tr>
<td>$C_{ox}$</td>
<td>Gate oxide capacitance per unit area</td>
</tr>
<tr>
<td>$C_{sb}$</td>
<td>Parasitic bottom junction area capacitance at the source</td>
</tr>
<tr>
<td>$C_{tot}$</td>
<td>Total $C_{comp,in}$ on the ADC input</td>
</tr>
<tr>
<td>$D_3$</td>
<td>Third order distortion</td>
</tr>
<tr>
<td>$d_i$</td>
<td>Bit $i$ in the digital output</td>
</tr>
<tr>
<td>$D_{out}$</td>
<td>Digital output</td>
</tr>
<tr>
<td>$dR$</td>
<td>Statistical deviation of the resistance</td>
</tr>
<tr>
<td>$dV_{ref}$</td>
<td>Statistical deviation of the reference net supply voltage</td>
</tr>
<tr>
<td>$f_{amp}$</td>
<td>Preamplifier $-3$ dB bandwidth</td>
</tr>
<tr>
<td>$f_{in}$</td>
<td>Input frequency</td>
</tr>
<tr>
<td>$f_{in,max}$</td>
<td>Maximum input frequency</td>
</tr>
<tr>
<td>$f_s$</td>
<td>Sampling frequency</td>
</tr>
</tbody>
</table>
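<table>
<thead>
<tr>
<th>Symbol</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>$f_{s,\text{max}}$</td>
<td>Maximum sampling frequency</td>
</tr>
<tr>
<td>$f_T$</td>
<td>Unity gain frequency</td>
</tr>
<tr>
<td>$g_{ds}$</td>
<td>Small signal drain-source conductance</td>
</tr>
<tr>
<td>$g_m$</td>
<td>Transistor transconductance</td>
</tr>
<tr>
<td>$I_{ch}$</td>
<td>Channel current</td>
</tr>
<tr>
<td>$I_d$</td>
<td>Total drain current</td>
</tr>
<tr>
<td>$I_D$</td>
<td>DC drain current</td>
</tr>
<tr>
<td>$I_{D,\text{sat}}$</td>
<td>Drain saturation current</td>
</tr>
<tr>
<td>$I_{\text{ref}}$</td>
<td>Reference bias current</td>
</tr>
<tr>
<td>$k$</td>
<td>Boltzmann’s constant</td>
</tr>
<tr>
<td>$L$</td>
<td>Channel length</td>
</tr>
<tr>
<td>$M$</td>
<td>Total number of output samples</td>
</tr>
<tr>
<td>$M_n$</td>
<td>Rate of metastable states</td>
</tr>
<tr>
<td>$n$</td>
<td>Body factor</td>
</tr>
<tr>
<td>$N$</td>
<td>Total number of output bits, i.e., ADC resolution</td>
</tr>
<tr>
<td>$N(0, \sigma)$</td>
<td>Gaussian distribution with zero mean and $\sigma$ standard deviation</td>
</tr>
<tr>
<td>$p_{\text{dec}}$</td>
<td>Reference net decoupling period, used in (5.6)</td>
</tr>
<tr>
<td>$P_{\text{distortion}}$</td>
<td>Distortion power</td>
</tr>
<tr>
<td>$P_{\text{noise}}$</td>
<td>Noise power</td>
</tr>
<tr>
<td>$P_{\text{signal}}$</td>
<td>Signal power</td>
</tr>
<tr>
<td>$P_{\text{spurious, max}}$</td>
<td>Power of the largest spurious</td>
</tr>
<tr>
<td>$q$</td>
<td>The charge of an electron, i.e., the elementary charge</td>
</tr>
<tr>
<td>$q_e$</td>
<td>Quantization error</td>
</tr>
<tr>
<td>$q_{\text{LSB}}$</td>
<td>Feedthrough from input to reference net measured in number of LSBs</td>
</tr>
<tr>
<td>$q_s$</td>
<td>Quantization step</td>
</tr>
<tr>
<td>$R_{\text{tot}}$</td>
<td>Total reference net resistance</td>
</tr>
<tr>
<td>$R_u$</td>
<td>Unit resistance of a resistor in the reference net</td>
</tr>
<tr>
<td>$\Delta t$</td>
<td>Timing difference between the clock and input signals wires</td>
</tr>
<tr>
<td>$T$</td>
<td>Temperature in Kelvin</td>
</tr>
</tbody>
</table>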
<table>
<thead>
<tr>
<th>Symbol</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>$t_{\text{MUX}}$</td>
<td>Propagation delay of a 2:1 MUX</td>
</tr>
<tr>
<td>$t_n$</td>
<td>Sample time instants</td>
</tr>
<tr>
<td>$t_{\text{ox}}$</td>
<td>Oxide thickness</td>
</tr>
<tr>
<td>$\Delta t_s$</td>
<td>Sampling time uncertainty</td>
</tr>
<tr>
<td>$t_{\text{CP,folded}}$</td>
<td>Critical path of the folded Wallace tree encoder</td>
</tr>
<tr>
<td>$t_{\text{CP,MUX-encoder}}$</td>
<td>Critical path of the MUX-based encoder</td>
</tr>
<tr>
<td>$t_{\text{CP,Wallace}}$</td>
<td>Critical path of the Wallace tree encoder</td>
</tr>
<tr>
<td>$t_{\text{XOR}}$</td>
<td>Propagation delay of an XOR gate</td>
</tr>
<tr>
<td>$V_A$</td>
<td>Early voltage</td>
</tr>
<tr>
<td>$V_{\text{bias}}$</td>
<td>Bias voltage</td>
</tr>
<tr>
<td>$V_{\text{DD}}$</td>
<td>Supply voltage</td>
</tr>
<tr>
<td>$V_{\text{ds}}$</td>
<td>Voltage between drain and source</td>
</tr>
<tr>
<td>$V_{\text{ESD}}$</td>
<td>ESD stress voltage</td>
</tr>
<tr>
<td>$V_{\text{FS}}$</td>
<td>Full-scale voltage</td>
</tr>
<tr>
<td>$V_{\text{gs}}$</td>
<td>Voltage between gate and source</td>
</tr>
<tr>
<td>$V_{\text{in}}$</td>
<td>Input signal</td>
</tr>
<tr>
<td>$\Delta V_{\text{in}}$</td>
<td>Uncertainty in the sampled input voltage</td>
</tr>
<tr>
<td>$V_{\text{lr}}$</td>
<td>Linear range of the preamplifier</td>
</tr>
<tr>
<td>$V_{\text{LSB}}$</td>
<td>Voltage equivalent to one LSB</td>
</tr>
<tr>
<td>$V_{\text{mid}}$</td>
<td>Magnitude of the middle reference net output voltage with respect to the input voltage magnitude</td>
</tr>
<tr>
<td>$V_{\text{offset}}$</td>
<td>Comparator input offset voltage</td>
</tr>
<tr>
<td>$V_{\text{ref}}$</td>
<td>Reference net supply voltage</td>
</tr>
<tr>
<td>$V_{\text{ref,m}}$</td>
<td>Reference generator output voltage $m$</td>
</tr>
<tr>
<td>$V_s$</td>
<td>Sampled input voltage</td>
</tr>
<tr>
<td>$\Delta V_s$</td>
<td>Uncertainty in the sampled voltage</td>
</tr>
<tr>
<td>$V_{\text{SS}}$</td>
<td>Negative supply voltage</td>
</tr>
<tr>
<td>$V_T$</td>
<td>Threshold voltage</td>
</tr>
<tr>
<td>$W$</td>
<td>Transistor width</td>
</tr>
<tr>
<td>$y_n$</td>
<td>Output sample $n$</td>
</tr>
<tr>
<td>$\tilde{y}_n$</td>
<td>Sinusoid to fit to the output samples at sample $n$</td>
</tr>
<tr>
<td>$\Gamma_{\text{folded}}$</td>
<td>Hardware cost of the folded Wallace tree encoder</td>
</tr>
<tr>
<td>$\Gamma_{\text{MUX}}$</td>
<td>Hardware cost of a 2:1 MUX</td>
</tr>
<tr>
<td>$\Gamma_{\text{MUX-encoder}}$</td>
<td>Hardware cost of the MUX-based encoder</td>
</tr>
<tr>
<td>$\Gamma_{\text{Wallace}}$</td>
<td>Hardware cost of the Wallace tree encoder</td>
</tr>
<tr>
<td>$\mu$</td>
<td>Surface channel charge carrier mobility</td>
</tr>
<tr>
<td>$\omega$</td>
<td>Angular frequency</td>
</tr>
<tr>
<td>$\Psi$</td>
<td>Squared error between fitted output samples and sinusoid</td>
</tr>
<tr>
<td>$\sigma_{\text{offset}}$</td>
<td>Standard deviation of the input referred comparator offset voltage</td>
</tr>
<tr>
<td>$\sigma_R$</td>
<td>Standard deviation of the reference net unit resistor deviation</td>
</tr>
<tr>
<td>$\sigma_{\text{ref}}$</td>
<td>Standard deviation of the reference net supply voltage deviation</td>
</tr>
<tr>
<td>$\sigma_t$</td>
<td>Standard deviation of the time difference between the clock and the input signal</td>
</tr>
<tr>
<td>$\langle \cdot \rangle$</td>
<td>Mean value</td>
</tr>
</tbody>
</table>
References


