Extreme ultraviolet (EUV) projection lithography is a proposed next-generation lithography technique for manufacturing integrated circuits at high volumes. It is targeted to print critical dimensions of 70 nm and below with a large depth of focus. An EUV microstepper using a 10x Schwarzschild objective is currently in use through modifications of an EUV PS/PDI station (phase-shifting/point diffraction interferometer). This EUV interferometer is located at the Advanced Light Source undulator beamline 12.0.1 (Lawrence Berkeley National Laboratory).
To evaluate resist materials for EUV lithography, it is necessary to expose test patterns with very high spatial resolution (less than 50 nm lines and spaces). Currently, there are a limited number of imaging systems that can achieve this fine feature printing at EUV wavelengths. By developing this synchrotron-based imager using the 10x reduction Schwarzschild optics system, the limits of test resists can be examined. One method to print these fine features is to double the spatial frequency of the object grating.
Placing a grating in the path of the coherent EUV source will create diffracted orders at angles determined by the pitch of the grating. By eliminating the DC term and recombining only the +1 and -1 orders, spatial frequency doubling at the image plane can be achieved with high contrast. The combination of the frequency doubling and the optical system creates a 20x reduction of the object grating pitch. Thus far, we have successfully used this technique to print equal line and space patterns with line widths as small as 30 nm. Line edge roughness measurements have been done on our 50 nm dense lines/space pattern with a three sigma rms value of 4 nm. The printing of even smaller features is currently under investigation. Simulations show that by using the fully extended NA, the system can achieve line widths as small as 12 nm. Using a suitably designed mask, spatially separate grating objects with differing pitch can be simultaneously imaged in a single exposure. A technique to print multiple contrasts during a single exposure is also being developed. These configurations will be of great use in evaluating the ultimate performance and extendibility of resist materials for EUV lithography.
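The pitch arithmetic above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the 13.4 nm wavelength is an assumption, since the text does not state the operating wavelength.

```python
import math

def image_pitch_nm(object_pitch_nm, optics_reduction=10.0, frequency_doubling=True):
    """Pitch printed at the image plane for a grating object of the given pitch."""
    factor = optics_reduction * (2.0 if frequency_doubling else 1.0)
    return object_pitch_nm / factor

def first_order_angle_deg(object_pitch_nm, wavelength_nm=13.4):
    """Angle of the +1/-1 orders at normal incidence: sin(theta) = wavelength / pitch.

    The default wavelength is an assumed value, not taken from the text.
    """
    return math.degrees(math.asin(wavelength_nm / object_pitch_nm))

# 30 nm equal lines and spaces correspond to a 60 nm image pitch,
# which the 20x total reduction maps back to a 1200 nm object pitch.
object_pitch = 60.0 * 20.0
print(image_pitch_nm(object_pitch))
print(first_order_angle_deg(object_pitch))
```

Under these assumptions, the object grating needed for the 30 nm lines is comparatively coarse, which is one practical attraction of the frequency-doubling approach.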
Applying drop-on-demand (DOD) inkjet printing technology to directly write photoresist or polymer patterns on a wafer surface is becoming a new and powerful tool for microprocessing. Compared with other lithography approaches, this maskless method offers low cost, a wide variety of patternable materials, and versatility in the substrates on which a circuit can be built.
Existing inkjet systems can only form liquid droplets with a volume larger than several picoliters, which limits their minimum printable feature size to tens of microns. To build a system capable of producing droplets in the micron to submicron regime, we first chose a thermal bi-membrane actuator structure to provide the high driving force for ejecting a small droplet. However, the bi-membrane system suffered from problems such as fragility of the membranes and irregularity of droplet generation. Therefore, we rebuilt our system around the actuation mechanism most commonly used in inkjet printers today: thermal vapor bubble formation, in which an extremely high thermal flux is applied to the liquid on top of a smooth heater surface. The pressure change generated by the rapid growth and collapse of a vapor bubble in a chamber pushes liquid through a nozzle and then breaks it off to form a single droplet.
We have successfully fabricated monolithic thermal bubble inkjet printheads by epoxy stamp bonding, in which we bonded a pair of wafers with nozzle, chamber, and heater structures using a thin film of epoxy transferred to the high areas of the top wafer by a dummy. The test chips were then plugged into our experimental system (as described in abstracts in previous Research Summaries), and we observed stable and continuous generation of 13 µm scale water droplets from those chips (Figure 1). Our next step is to shrink the droplet size further by reducing the nozzle diameter and changing the electrical driving conditions. We will also try to print patterns with different materials using the new printhead.

Figure 1: Water droplet generation sequence viewed at the nozzle of our new monolithic thermal bubble printhead
Free space laser communication is a promising candidate for a high bit-rate, interference-free data link at lower power consumption compared to radio and microwave frequency systems. On the other hand, an optical link has its own challenges: pointing, stabilization, and acquisition. For a directional laser beam with 1 milliradian (about 0.057 deg) divergence, the pointing accuracy must be a fraction of 1 milliradian, which is the beamwidth.
Vibrations on the hosting platform can easily be large enough to disturb the beam heading and interrupt the link. A small airplane, with a 1 meter wing span, is our main candidate as the host vehicle for the laser transceiver. Inertial sensors have demonstrated effectiveness in detecting and canceling the effects of undesired vehicle vibrations. Stabilization as described above, however, needs a steering capability of +/- 10 degrees optical.
The third challenge, acquisition, needs a laser beam to scan an area that the opposite end is likely to be in. An estimation of the target's position narrows the scan area, and its dimensions are limited by the errors in the estimation. In this particular application, we need another +/- 10 degrees of beam steering for acquisition. So, the total range of the mirror needs to be +/- 20 degrees optical.
Considering the requirements above, the beam steering element in the system must be precise, fast, and have a large dynamic range. A 2-DOF MEMS mirror built on SOI wafers will do the beam steering. Small dimensions give the mirror a reasonably large bandwidth to meet the speed requirement.
The goal of the research summarized here is to design and implement a feedback loop around the mirror in order to achieve the required resolution, which is more than 12 bits. The +/- 20 degree range requires high voltages in the actuator. Position sensing for the mirror is another essential component of the feedback system. In most cases, separate sense fingers are needed for better performance. Additional fingers bring layout and stability challenges for the mirror. Various methods for drive and sense have been investigated. The current research direction implements high-voltage actuation without the need for high-voltage circuitry. The same method may also allow the same set of fingers to be used for both sense and drive.
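As a rough check on the resolution requirement, the sketch below converts a bit count into an angular step. It assumes the full +/- 20 degree optical range is divided uniformly into 2^bits steps, which is a simplification; the function name is hypothetical.

```python
import math

def steering_step_deg(half_range_deg=20.0, bits=12):
    """Angular step when the full optical range is split into 2**bits uniform levels."""
    full_range_deg = 2.0 * half_range_deg
    return full_range_deg / (2 ** bits)

step_deg = steering_step_deg()
step_mrad = math.radians(step_deg) * 1e3   # degrees -> milliradians
print(step_deg, step_mrad)
```

With 12 bits the step comes out well below the 1 milliradian beamwidth, consistent with the requirement that pointing accuracy be a fraction of the beam divergence.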

Figure 1: A 2 DOF mirror built on SOI wafer
Digital signal processing in system-on-chip applications has created a need for high performance ADCs that are compatible with deep sub-micron technology. These applications typically demand high linearity, speed, and resolution while maintaining low power consumption. While digital circuits benefit from the aggressive downscaling of CMOS technology, the trend of decreasing supply voltage in deep sub-micron processes tends to increase the power dissipation of high resolution ADCs. This research focuses on a novel concept through which analog domain precision can be traded off for low power digital signal processing.
In most high-speed pipelined A/D converters, high-gain residue amplifiers dominate overall power dissipation. Substantial power savings are possible when precision amplifiers are replaced by simple open-loop gain stages. To take advantage of this opportunity, we propose a new digital calibration technique capable of correcting errors arising from amplifier nonlinearity and temperature drift. Our approach uses a statistics-based signal processing technique to measure and cancel gain and nonlinearity errors of the imprecise, low-power residue amplifiers. Critical converter stages are switched randomly between two transfer characteristics without interrupting normal A/D operation. Comparison of the two distinct cumulative distributions in the converter back-end allows estimation of the required calibration parameters.
To evaluate the proposed scheme, we have designed and implemented a 12-bit, 75 Msample/s prototype ADC in 0.35 µm CMOS technology [1]. For simplicity, the digital calibration is applied only to the first converter stage and implemented off chip. Compared to a state-of-the-art reference design [2], we achieve more than 60% power savings in this critical portion of the ADC. Measurement results show that the digital post-processing technique improves the signal-to-noise-and-distortion ratio (SNDR) of the converter from 48 dB to 67 dB. Future work will focus on expanding the concept to multi-stage calibration applied to a high-performance design in a low-voltage, deep-submicron technology.
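To put the SNDR figures in terms of bits, the standard effective-number-of-bits relation for a full-scale sine input can be applied. The conversion below is the textbook formula, not something stated in the abstract, and the function name is illustrative.

```python
def enob(sndr_db):
    """Effective number of bits from SNDR, using ENOB = (SNDR - 1.76 dB) / 6.02 dB."""
    return (sndr_db - 1.76) / 6.02

for sndr in (48.0, 67.0):
    print(sndr, round(enob(sndr), 2))
```

By this measure the calibration moves the prototype from under 8 effective bits to nearly 11.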

Figure 1: Chip micrograph
A typical MEMS gyroscope measures rotation rate by sensing the Coriolis acceleration of a vibrating proof-mass. The gyroscope design can be divided roughly into three parts: the proof-mass, the actuator for vibrating the proof-mass, and the sensor for detecting the Coriolis acceleration of the proof-mass. This research focuses on the design of 5V CMOS electronics and the mechanics of the actuator to increase sensitivity to rotation and reduce sensitivity to process variation and temperature.
Gyroscope sensitivity depends on the velocity of the proof-mass, which is affected by the size of the actuator, the mechanical spring, and the mechanical damping. Increasing the sensitivity requires more actuation, vibration at the mechanical resonance, and reduced damping. Reducing the sensitivity to process variation and temperature requires position sensing and feedback control to electronically adjust the spring constant and maintain a constant sinusoidal velocity.
The actuator design will use capacitive position sensing and electrostatic forces, which are easy to integrate but tend to be nonlinear for large motion of the proof-mass. The design will use parallel-plate actuators, which can generate significantly larger forces than the more common lateral comb drive and can generate an electrostatic negative spring that adjusts the mechanical spring constant. CMOS circuits will be designed to measure the position, stabilize the parallel-plate actuator over large motions, and reduce the nonlinearity of the negative spring effect. Additionally, the mechanical design of the actuator will minimize damping and maximize stiffness of undesired mechanical modes.
The actuator design will be demonstrated first in a z-axis gyroscope and later in a six-degree-of-freedom inertial measurement unit (6 DOF IMU), which includes three gyroscopes and three accelerometers. The actuator design will be applicable to other MEMS such as scanning mirror displays and micropositioners.
The objective of this research is to develop electronic interface circuits to measure strain with a silicon micromachined resonant sensor. These sensors are analogous to a guitar string. When the sensor is stretched (tensile strain) or compressed, its resonant frequency increases or decreases accordingly. Resonant sensors have several attributes that make them attractive. First, the information we want is contained in their output frequency and therefore sensor output is immune to AM noise. Second, their sensitivity to applied strains has been shown to be quite high [1].
While resonant sensors have the promise of high sensitivity, challenges remain in the development of the sensors. In fact, there are two components of the design that need to be improved from the current state of the art. The first component is the actual resonant sensor/oscillator, which is composed of a micromachined resonator and oscillator circuitry. The micromachined resonators have a high Q, but they suffer from high motional resistance [2]. This makes it difficult to make an oscillator, as we must match the impedance of the resonator with an equivalent negative resistance for oscillation to occur. The impedance match is performed by the oscillator circuitry. Improvements in the design of the resonator and oscillator circuitry can significantly improve the linearity and phase noise of the oscillator, resulting in better sensor resolution. The second component is the method used to measure the sensor's output. In many applications, the change in resonant frequency is measured by frequency counting [3]. With this method, high accuracy measurements can be obtained, but at the expense of bandwidth. Another method of measurement uses FM demodulation to measure the change in frequency at the sensor output and is usually done with a PLL. With this method, better bandwidth is achieved; however, the phase noise of the VCO and the resonant oscillator makes the DC measurement resolution quite poor.
This research will focus mainly on methods to improve oscillator circuit design and frequency measurement techniques to yield good sensor resolution over a large bandwidth.
The goal of this project is to develop a high-order sigma-delta sense interface for micromachined gyroscope sensors. The main priorities are providing sufficient stability margin and low power consumption while maintaining the intrinsically high resolution of the sensor element. Research so far has shown that implementing the gyroscope sense interface as a closed loop system using sigma-delta modulation has the advantage of low sensitivity to parameter variations and high linearity as well as an intrinsic digital output. In this work a fourth order sigma-delta loop is being developed. The advantage of the fourth-order topology is that it can provide high attenuation of the quantization noise in the signal band and allow for sufficient compensation of the loop as well as operation at low sampling frequency without introducing an additional noise penalty. The current status of the project is summarized below:
This work relates the potential energy savings to the energy profile of a circuit. These savings are obtained by using gate sizing and supply voltage optimization to minimize energy consumption subject to a delay constraint [1,2]. The sensitivity of energy to delay is derived from a linear delay model extended to multiple supplies. The optimizations are applied to a range of examples that span typical circuit topologies including inverter chains, SRAM decoders, and adders. At a delay 20% larger than the minimum, energy savings of 40% to 70% are possible, indicating that achieving peak performance is expensive in terms of energy. The analysis is extended to register files, minimizing energy across pipeline stages, and optimal parallelism.
In current attempts at low-power, single-chip, integrated radio solutions, the analog circuitry tends to consume a majority of the total power. Tight RF requirements on the front-end receivers, and the large transmit powers necessary for long distances and high signal-to-noise ratios, constrain a design with specifications that are difficult, if not impossible, to implement at very low power in low-cost CMOS technology. While low-power digital techniques for large-scale designs exist and are being actively applied, no comparable techniques have emerged yet for the analog design components. Current trends suggest that while the speed and energy efficiency of digital circuits will improve with lower supplies and smaller geometries, analog circuits are actually hampered by the supply reduction. This suggests a sort of "Holy Grail" for radio design, which eliminates as much as possible the necessity for analog components. This radio would ideally convert the incoming antenna signal to a binary value and then perform all processing digitally, yielding an implementation with all of the benefits digital design has to offer (full integration, lower power, cheaper technology, robustness, the ability to implement complex algorithms such as adaptation, maximum likelihood estimation, etc.). While current radio standards would require a very fast and high accuracy A/D, we believe that by using a pulse-based, ultra-wideband (UWB) signaling scheme we can approach this fully-digital, fully-integrated radio, reducing both transmit power and the receiver's analog complexity beyond simply scaling a traditional narrowband transceiver.
The focus of this research is the design of such a "fully-digital" single-chip radio transceiver. We assume no special or fixed building infrastructure; the radios will be able to communicate flexibly in either peer-to-peer or broadcast modes. The target cell size is approximately 5-10 meters with an estimated maximum of 32 active users at one time per cell. The anticipated bit rate will be around 100 kb/s (uncoded BER ~1e-3) with a total 1 mW (TX+RX) power budget for the transceiver. A narrow pulse (approximately 1 ns wide) is transmitted using simple digital switches, spreading energy over a gigahertz of bandwidth. Reception, after wideband gain and filtering, occurs in a bank of A/D converters which capture the received pulse in an adjustable window of 16 to 64 ns (shown in Figure 1). This window is composed of 32 to 128 data samples at a 2 GHz rate and is repeated at the pulse broadcast frequency, which may range from 62.5 MHz to roughly 1 MHz. The digital backend (shown in Figure 2) aggregates these windows into a block of 256 samples, which is fed into a bank of 128 parallel matched filters of length 128 samples each with 5-bit programmable taps. The outputs of these matched filters are sent to either an acquisition or a synchronization block. The synchronization block implements early-late correlation for tracking, and the acquisition block contains 11 de-spreading correlators in parallel as a compromise between area and search time. Once a correlation peak above the programmable threshold is found by the peak detector logic, the backend switches from acquisition to tracking mode. For flexibility, separate spreading codes may be used for acquisition and synchronization, and both may be of length 1 to 1024 chips.
In addition to communication, the ability to do some form of ranging or localization is considered a necessity. Due to the fine time resolution inherent to UWB, accuracy on the order of several feet is possible, and research into robust ranging algorithms has begun. Also, as extremely low cost and high integration are desired, we are investigating PCB/circuit co-design for the antenna and matching elements, and targeting a generic, digital CMOS IC process for fabrication.
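The window arithmetic and the matched-filter-plus-peak-detector idea can be illustrated with a toy model. This is a sketch under stated assumptions, not the chip's architecture: the bank of parallel filters is modeled as one template evaluated at successive offsets, and the pulse template, threshold, and function names are all hypothetical.

```python
def window_samples(window_ns, sample_rate_ghz=2.0):
    """Number of A/D samples captured in one receive window."""
    return round(window_ns * sample_rate_ghz)

def pulse_rate_mhz(period_ns):
    """Pulse repetition frequency for a given repetition period."""
    return 1000.0 / period_ns

def matched_filter(samples, template):
    """Correlate the samples against the pulse template at every full-overlap offset."""
    n = len(template)
    return [sum(s * t for s, t in zip(samples[i:i + n], template))
            for i in range(len(samples) - n + 1)]

def detect_peak(outputs, threshold):
    """Return the offset of the largest output if it exceeds the threshold, else None."""
    best = max(range(len(outputs)), key=outputs.__getitem__)
    return best if outputs[best] > threshold else None

template = [1, 3, -3, -1]        # hypothetical pulse shape, small integer taps
received = [0] * 16
received[5:9] = template         # noiseless pulse arriving at offset 5
outputs = matched_filter(received, template)
peak = detect_peak(outputs, threshold=10)
print(window_samples(16), window_samples(64), pulse_rate_mhz(16), peak)
```

In this noiseless example the correlation peaks at the true arrival offset; in a real receiver the threshold would trade false alarms against missed detections.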

Figure 1: Analog frontend block diagram

Figure 2: Digital backend block diagram
Because of the ultra-wide bandwidth of the transmitted signal, receiver design raises issues different from those in narrowband systems. Given that ultra-wideband has several possible application areas, this research will carry out system-level explorations. Our first work will focus on building a real-time Simulink model that includes both the analog and digital processing components in a UWB system. The simulation, combined with the future UWB test board, will allow us to better understand system tradeoffs as a basis for future ultra-wideband system design.
A digital back end with basic synchronization and tracking functionality was first implemented on the BEE emulation engine. With the help of FPGA testing, we can experiment with more sophisticated detection algorithms to improve system performance from a communication theory perspective. An ASIC version of the baseband using the new design flow in BWRC will also be done in this project. The final goal of this project is to propose a suitable architecture for a UWB system operating between 3 GHz and 10 GHz.
Ultra-wideband (UWB), as opposed to traditional narrowband radios, is a wireless digital communication system that exchanges data using short duration pulses. The complexity of the analog front-end in UWB is drastically reduced due to its intrinsic baseband transmission. Based on this simplification and the high spreading gain it possesses, UWB promises low-cost implementation with fine time resolution and high throughput at short distances without interfering with other existing wireless communication systems. However, the wideband nature of the front-end architecture leads to a totally different design methodology from traditional narrowband systems. For example, if one employed the conventional narrowband design approach, matching between the power amplifier and the antenna would be a big problem because it is extremely difficult to match accurately over such a wide range of frequencies. In addition, we desire a high degree of integration, which requires an antenna on the order of centimeters in size, but it is hard to attain efficient transmission bandwidth from DC to GHz with such a small antenna.
The focus of this research is to determine the methodology for co-designing an appropriate antenna suitable for efficient pulse transmission/generation and pulse reception with analog circuits that won’t induce signal dispersion (ISI, inter-symbol interference) or further complicate the digital back end. Finite-difference time-domain (FDTD) electromagnetic wave simulation will be used to characterize the antenna. While doing antenna/circuit co-design optimization, the method of combining FDTD and SPICE simulation will also be investigated.
Emerging low-power, embedded wireless sensor devices are targeting a wide range of applications, yet have very limited processing, storage, and energy resources. An architecture must be developed that can efficiently meet system demands while simultaneously remaining flexible to application-specific optimizations. To answer the demands of application-specific operations, we are building an integrated CMOS version of the Berkeley motes wireless sensor platform. A prototype chip that included CPU, ADC, communication accelerators, and memory was designed and fabricated by National Semiconductor as shown in Figures 1 and 2. Measuring just 2 mm x 2 mm, it represents a significant reduction in size, cost, and power over current generation motes. The test chip was not fully functional, but it could successfully execute instructions and demonstrate basic I/O capabilities. A second generation of this node has been designed and is currently being fabricated. In addition to fixing the minor bugs in the first prototype, this second generation chip includes support for multiple register sets and data encryption, and it is equipped with a CMOS RF transmitter. The transmitter architecture uses a 32 kHz crystal as a reference oscillator and frequency lock for a capacitor-array-based VCO to a 900 MHz transmission frequency.

Figure 1: TinyOS network stack accelerator

Figure 2: Floorplan for the mote chip
This project will investigate the possibilities of using digital signal processing techniques to enhance pipelined A/D converter performance. Specifically, we're currently applying the Wiener filtering concept to the correction of analog errors. With a slow-but-accurate helper A/D and a back end FIR digital filter, we have shown in simulation that capacitor mismatch, finite opamp gain, and various offset errors can be eliminated through the digital filtering. The analog signal paths involved are open-loop. Correction is performed solely in the digital domain, without feedback to the pipeline A/D to tweak analog parameters. The system is further made adaptive to track slow environmental changes (power supply voltage drift, ambient temperature change, etc.) by means of an LMS algorithm. The adaptation rate can be adjusted depending on the speed of the slow helper. With this approach, we're potentially looking at a very high conversion speed (> 200 MS/s) and high accuracy (>= 10 bits) where the stringent requirements on analog circuit components can be relaxed with the aid of digital techniques. Down the road, we will also investigate applications of Volterra filtering to correct nonlinearities and bandwidth limitations in analog circuits, where the complexity of digital filters will increase geometrically. The driving force behind this, however, is the inexorable power of scaling coming from digital CMOS technology. If strategically leveraged, it will revolutionize the performance and design of traditional analog circuits in the near future.
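The adaptive correction idea can be illustrated with a deliberately reduced model. The sketch below assumes the only analog errors are a static gain error and an offset, so the correction filter collapses to one gain tap plus a bias term; the helper A/D is modeled as an ideal reference available on every eighth sample. The error values, step size, and names are hypothetical and are not taken from the project.

```python
import random

def lms_calibrate(raw, reference, decimation=8, mu=0.05):
    """Adapt gain and offset so that gain * raw + offset tracks the reference.

    Only every `decimation`-th sample is compared, mimicking a helper A/D
    that is accurate but much slower than the main converter.
    """
    gain, offset = 1.0, 0.0
    for n in range(0, len(raw), decimation):
        error = reference[n] - (gain * raw[n] + offset)
        gain += mu * error * raw[n]     # LMS update for the gain tap
        offset += mu * error            # LMS update for the bias term
    return gain, offset

random.seed(0)
true_input = [random.uniform(-1.0, 1.0) for _ in range(16000)]
raw = [0.9 * v + 0.02 for v in true_input]   # hypothetical gain error and offset
gain, offset = lms_calibrate(raw, true_input)
corrected = [gain * r + offset for r in raw]
print(gain, offset)
```

Because the correction is purely feed-forward, the main converter keeps running at full speed while the coefficients settle at a rate set by the helper, which matches the qualitative behavior described above.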
The 4G Wireless LAN demands a high data rate, such as multi-gigabits/s. The need for such a high data rate Wireless LAN has prompted the Federal Communications Commission (FCC) to release 5 GHz of unlicensed spectrum, from 59 GHz to 64 GHz.
In 60 GHz radio systems, there is not only a high frequency operation issue but also a baseband processing issue. If we operate five channels in a 5 GHz band, each channel has 1 GHz of bandwidth. In order to process a 1 GHz bandwidth channel, we need a 4 GS/s A/D converter, which poses difficulties both for the realization of such an A/D converter and for the digital signal processing after A/D conversion.
One of the conventional approaches for meeting the high-speed requirement is a time-interleaved parallel A/D converter. At the sampling rate of 4 GS/s, it suffers from path mismatch, marring system performance, and adds the complexity of digital error calibration. Even after A/D conversion, the digital signal processing speed is still 4 GHz.
One of the promising communication modulation schemes for wideband applications is orthogonal frequency division multiplexing (OFDM). OFDM consists of multiple subcarriers, and each subcarrier is orthogonal to the other subcarriers. OFDM is considered a strong candidate for the modulation scheme in 60 GHz radio systems. The new idea came from the unique characteristics of OFDM.
The key idea of the parallel path receiver architecture is that a wideband OFDM channel can be split up into a number of uncorrelated narrowband OFDM subchannels. For example, a 1 GHz channel has 200 MHz of guard band and 800 MHz of information channel with 1024 OFDM subcarriers. When we split up the 800 MHz bandwidth channel into 8 subchannels, each subchannel has 100 MHz bandwidth with 128 OFDM subcarriers. We just need to process 100 MHz of bandwidth with a 400 MS/s A/D converter, but we need to have eight copies of the same blocks. This approach doesn't suffer from the path mismatch problem. A 400 MS/s A/D converter is easy to design, and the digital signal processing can be run at 400 MHz, which implies a low energy solution.
From the circuit design perspective, the important circuit building blocks are mixers and frequency synthesizers, because they are sensitive to crosstalk. Parallel mixers are exposed to possible cross modulation. Frequency synthesizers are vulnerable to cross harmonics. This research project places emphasis on the demonstration of parallel mixers and frequency synthesizers operating in the presence of these two forms of crosstalk. Another issue is achieving reasonable power consumption for the parallel mixers and multiple frequency synthesizers.
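The subchannel numbers in this example follow from simple division, sketched below. The 4x ratio between A/D rate and channel bandwidth is inferred from the figures quoted in the text (1 GHz with 4 GS/s, 100 MHz with 400 MS/s); the function and key names are illustrative.

```python
def split_channel(info_bw_mhz, subcarriers, paths, rate_to_bw_ratio=4):
    """Per-path bandwidth, subcarrier count, and A/D rate for a parallel-path receiver."""
    sub_bw_mhz = info_bw_mhz / paths
    return {
        "subchannel_bw_mhz": sub_bw_mhz,
        "subcarriers_per_path": subcarriers // paths,
        "adc_rate_msps": sub_bw_mhz * rate_to_bw_ratio,
    }

print(split_channel(800, 1024, 8))
```

The price of the lower per-path rate is hardware replication: the mixer, synthesizer, and A/D blocks are needed once per path.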
The test structure of mixers and frequency synthesizers will be designed with 0.13 µm CMOS. There are four parallel paths. The operating intermediate frequency is 2.5 GHz. The subchannel bandwidth is 50 MHz. The circuit topology of mixers is a parallel folded active mixer with current bleeding. This topology reduces the effective input loading and power consumption of parallel mixers and isolates the cross modulation. The circuit topology of frequency synthesizers is a mixer-based frequency synthesizer, which eliminates the frequency divider.
Steady trends in the personal portable communications market have demanded lower cost and better overall performance of the transceiver. Modern CMOS technologies, currently demonstrated to be feasible for implementing RF circuitry, offer the prospect of higher levels of integration, bringing low cost, smaller form factors and, with the elimination of many off-chip signal paths, the potential for reduced power consumption. At the system level, non-constant-envelope (non-CE) modulation schemes are attractive, as they offer better spectral efficiency than CE schemes, allowing higher data rates to be transmitted in a given bandwidth. The goal of this work is the design of an integrated CMOS radio transmitter for a non-CE modulation scheme.
Battery life is a major consideration in the design of portable radio units, and this is particularly so in the design of the RF power amplifier with its significant power consumption. At the same time, non-CE modulations require a linear transmit path for low distortion. There exists a fundamental tradeoff between linearity and power efficiency, and CMOS device characteristics only make this tradeoff worse.
Cartesian feedback is a linearization architecture that can provide a low-distortion output from a nonlinear amplifier, offering the potential for integration without needing off-chip delay lines or couplers. This architecture requires a power amplifier whose output envelope can be modulated by varying the input envelope. Though Class-C power amplifiers allow this modulation, while offering good power efficiency, they also have bad AM/PM distortion, which can introduce instability with Cartesian feedback.
We are investigating a modification to the normal class-C amplifier architecture to reduce the severity of AM/PM distortion. Simulated peak drain efficiencies for a 0.18 µm CMOS prototype amplifier are on the order of 55%. The larger goal of this work is to implement this modification together with the other elements of the Cartesian feedback architecture in an integrated transmitter targeting GSM EDGE specifications.
In many portable transceivers, the power amplifier (PA) is the most power-hungry block. In general, the maximum power efficiency can be achieved only when the PA is transmitting peak output power. The efficiency worsens as the output power decreases. Under typical operating conditions, the PA transmits less than peak output power, so the effective power efficiency is much lower than the maximum value.
One of the efficiency enhancement techniques under investigation is the Doherty amplifier. The merit of this technique is that it allows a power amplifier to achieve maximum or close-to-maximum efficiency over a wider range of output power. In the Doherty amplifier, an auxiliary amplifier is introduced. The auxiliary amplifier turns on when the output power is high and, by means of a passive impedance inverter, effectively lowers the impedance seen by the main amplifier, thus allowing higher output power while maintaining high efficiency. However, since the Doherty amplifier consists of two amplifiers, it is subject to phase mismatch between the two paths. Also, the main amplifier sees a different load impedance depending on the on or off state of the auxiliary amplifier. These effects can lead to a significant degradation in linearity. Different linearization techniques that can be applied to the Doherty amplifier will also be investigated.
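How quickly efficiency falls in back-off depends on the amplifier class, which the text does not specify. As an illustration only, the sketch below assumes an ideal class-B stage, whose drain efficiency is pi/4 at peak output and scales with output voltage amplitude below that; the function name is hypothetical.

```python
import math

def ideal_class_b_efficiency(backoff_db):
    """Drain efficiency of an ideal class-B amplifier at a given output power back-off.

    Efficiency is proportional to output amplitude, i.e. 10**(-backoff_db / 20),
    with a peak value of pi/4 at zero back-off. This is an idealized model.
    """
    return (math.pi / 4.0) * 10.0 ** (-backoff_db / 20.0)

for backoff in (0.0, 6.0, 10.0):
    print(backoff, round(ideal_class_b_efficiency(backoff), 3))
```

Under this assumption, efficiency roughly halves at 6 dB of back-off, which is the kind of penalty that efficiency enhancement techniques aim to reduce.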
The great success of the digital CMOS IC technology during the last ten years has firmly built up the dominance of CMOS as the mainstream silicon technology of the IC industry. The everlasting pursuit for high integration and low cost has also endorsed the design methodology known as the "system-on-chip" approach. But integrating noise-sensitive, high-sensitivity analog and RF circuits on the same die with noisy digital signal processing (DSP) circuits switching at high frequencies is a very challenging task for circuit designers. The demand for wide bandwidth of the analog front-end exacerbates the problem since the designer cannot benefit from most of the narrow-band techniques, such as noise shaping and filtering. This underlines the need for innovation of broadband, power-efficient analog circuit techniques that also feature high dynamic range.
Among these challenges, the analog-to-digital interface circuit is one of the most difficult to deal with and can often consume 50% of the total receiver power. This underlines the research effort for innovations of analog circuit techniques in the deep-submicron regime. This research project will investigate these limitations and propose new techniques that ameliorate strong tradeoffs among power, accuracy, and bandwidth of the A/D interface circuits.
Modern broadband communication systems require highly stable frequency references adjustable over a wide range of frequencies. One such system is a cable tuner, which requires a voltage-controlled oscillator (VCO) capable of tuning between 1.2 and 2.1 GHz and a phase noise below –85 dBc/Hz at a 10 kHz offset from the carrier [1]. Solutions addressing such specifications have traditionally relied on expensive high-end technologies and external components [2]. Today’s CMOS technologies are cost-effective and provide sufficient bandwidth for many RF communication applications. This research project focuses on the design of wideband low-phase-noise frequency VCOs in CMOS.
Although various types of VCOs can achieve a wide tuning range, LC VCOs are most suitable since they generally exhibit lower phase noise. To achieve a wide tuning range using a single resonator, we propose a VCO tuned using a mixed-signal scheme. Its frequency is digitally adjusted in coarse steps and subsequently fine-tuned to the desired value using a varactor device. An amplitude control scheme is also implemented.
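As a rough illustration of the mixed-signal tuning idea, the sketch below sizes an ideal LC tank for the 1.2 to 2.1 GHz range quoted above and splits the required capacitance into a digital coarse code plus a varactor fine value. The inductance, the 3-bit bank, and all function names are assumptions made for illustration; parasitics and varactor nonlinearity are ignored.

```python
import math

L_TANK = 2e-9                 # hypothetical tank inductance, henries
F_MIN, F_MAX = 1.2e9, 2.1e9   # tuning range quoted in the text, hertz
BITS = 3                      # hypothetical size of the switched-capacitor bank

def lc_frequency(l, c):
    """Resonant frequency of an ideal LC tank."""
    return 1.0 / (2 * math.pi * math.sqrt(l * c))

def capacitance_for(f, l=L_TANK):
    """Tank capacitance needed to resonate at frequency f."""
    return 1.0 / ((2 * math.pi * f) ** 2 * l)

C_MIN = capacitance_for(F_MAX)          # smallest total capacitance
C_MAX = capacitance_for(F_MIN)          # largest total capacitance
C_UNIT = (C_MAX - C_MIN) / 2 ** BITS    # one coarse step; the varactor spans one step

def tune(f_target):
    """Return (coarse digital code, varactor capacitance) for a target frequency."""
    c_needed = capacitance_for(f_target)
    code = min(int((c_needed - C_MIN) / C_UNIT), 2 ** BITS - 1)
    c_varactor = c_needed - C_MIN - code * C_UNIT
    return code, c_varactor

code, c_var = tune(1.6e9)
print(f"Cmax/Cmin = {C_MAX / C_MIN:.3f}, code = {code}, varactor = {c_var * 1e12:.3f} pF")
```

Because frequency scales as 1/sqrt(C), covering a 1.75:1 frequency range with one inductor needs roughly a 3:1 capacitance ratio, which is why a single varactor is split into coarse and fine sections here.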
The goal of this research project is to establish a framework for the analysis, design, and optimization of wideband low-phase-noise VCOs. Tradeoffs between different candidate LC VCO topologies and between available devices will be investigated. The effects of varactor nonlinearities on phase noise and tuning range will be analyzed. A set of analytical methods will be provided to aid in predicting these effects and any other quantities relevant to the overall VCO performance. Our proposed LC VCO will be integrated within a frequency synthesizer to demonstrate its practical feasibility.
One of the most important specifications in a wireless transmitter is the adjacent channel power ratio (ACPR), which is used to measure the nonlinear distortion in the transmitted signal. ACPR, together with the modulation scheme, determines the maximum allowable nonlinearity of the power amplifier, the last active circuit block before the antenna. Although other measures of distortion such as harmonic or intermodulation distortion have been analyzed before, the relationship between the physical mechanisms in the transistors and ACPR is not well understood. Designers are also hampered by the difficulty of simulating ACPR, as carrier frequencies are often two orders of magnitude higher than the channel bandwidth.
In this research project, we will try to predict ACPR in linear RF power amplifiers required in wireless systems using non-constant envelope modulation schemes, such as CDMA. At first, frequency domain Volterra kernels will be calculated to model the nonlinear behavior of the power amplifier. Then the baseband equivalent of the transmitted signal will be fed into this model in order to estimate ACPR using MATLAB. As all of the simulations will be done in the frequency domain, the simulations are not expected to take long. This method can easily be used for processes from different vendors, once the basic SPICE parameters are known. Besides, it will also help designers during the initial design phase, because it does not require any amplifier to be fabricated and measured beforehand, as opposed to the empirical methods that utilize some form of parameter fitting. Vendors can also modify their processes to design special transistors for power amplifiers, once the main contributors to distortion and tradeoffs are identified.
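The flow of feeding a baseband-equivalent signal through a nonlinear model and reading off ACPR can be sketched as follows. Note the substitutions: the project uses frequency-domain Volterra kernels and MATLAB, whereas this sketch uses a memoryless third-order envelope model, a multi-tone stand-in for the modulated signal, and a plain DFT in Python. The coefficient, tone plan, and channel definitions are all hypothetical.

```python
import cmath
import math

N = 64                  # samples per period (hypothetical)
TONES = range(-4, 5)    # in-channel bins -4..4 stand in for the modulated signal

def baseband_signal():
    """Multi-tone complex envelope with deterministic quadratic phases."""
    x = []
    for n in range(N):
        s = sum(cmath.exp(1j * (2 * math.pi * k * n / N + 0.7 * k * k)) for k in TONES)
        x.append(s / len(TONES))
    return x

def amplifier(x, a3):
    """Memoryless third-order envelope model: y = x + a3 * x * |x|^2."""
    return [v + a3 * v * abs(v) ** 2 for v in x]

def bin_power(y, k):
    """Power in DFT bin k of the sequence y."""
    acc = sum(v * cmath.exp(-2j * math.pi * k * n / N) for n, v in enumerate(y))
    return abs(acc / N) ** 2

def acpr_db(y):
    """Upper adjacent-channel power (bins 5..13) relative to in-channel power."""
    main = sum(bin_power(y, k) for k in TONES)
    adjacent = sum(bin_power(y, k) for k in range(5, 14))
    return 10 * math.log10(max(adjacent, 1e-30) / main)  # floor avoids log(0)

x = baseband_signal()
print("linear amplifier ACPR (dB):   ", acpr_db(amplifier(x, 0.0)))
print("nonlinear amplifier ACPR (dB):", acpr_db(amplifier(x, -0.1)))
```

The third-order term spreads energy out to three times the channel edge, so only the nonlinear case shows measurable adjacent-channel power. Working on the envelope rather than the RF waveform is what keeps the simulation short despite the large ratio between carrier frequency and channel bandwidth.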
The accuracy of this method will be tested by making measurements on single- and two-stage SiGe bipolar power amplifiers designed and fabricated using a commercially available BiCMOS process from Maxim Integrated Products. The measurements will be made using the IEEE 802.11b standard, which operates in the 2.4 GHz ISM band.
Modern cable and wireless communication systems require gain control in the signal path. For example, a CDMA phone needs adequate control of its output power in order to maintain an efficient link between the user and the base station. Depending on the distance between the receiver and the base station, the received power may vary by orders of magnitude. Hence, gain control circuitry is needed to limit the incident power to the receiver chain. Therefore, much effort has been put into the design of attenuators that can be used as a means of controlling the received signal strength.
Attenuators with a broad frequency response are desirable since they can be used in applications operating over different frequency bands. There are two main methods of building broadband attenuators. The first uses PIN diodes. Although PIN diode attenuators are very linear, broadband, and able to handle high power, their main drawbacks are constant power dissipation and the difficulty of integrating them on-chip. The second uses FETs, whose ability to act as voltage-controlled variable resistors in the triode region makes them another suitable choice. GaAs MESFETs have been the traditional choice in the design of attenuators due to their superior high-frequency performance compared to MOSFETs. However, the low cost and availability of CMOS make it an attractive technology for attenuator design. Furthermore, the downscaling of CMOS technology continues to provide transistors with higher fTs, which are more suitable for broadband RF-IF attenuators.
The goal of this research project is to analyze and design broadband CMOS attenuators, which can be integrated on-chip. The focus will be on possible advancements in the design of attenuators on the circuit level as well as on the device level. Device and circuit optimization will be studied for distortion, frequency response, power-handling capability, and insertion loss. Gain control circuitry will be designed for linear attenuation with the control voltage. Various attenuator ICs will be designed and tested for verification of the analysis.
The simplicity of the direct conversion receiver offers some important advantages over the superheterodyne architecture. The RF signal is filtered, amplified, and then converted to baseband directly. There is no IF signal, and hence the image reject filter, IF amplifier, and high-Q IF channel select filter are no longer required. Without these components, the homodyne receiver can achieve a higher level of integration than the heterodyne counterpart, and integration can translate into cost savings.
However, the direct conversion architecture also has some disadvantages that make its implementation difficult. DC offsets, 1/f noise, and second-order distortion from the mixer will all fall in the signal band. Another problem is LO self-mixing. Since the RF and LO are at the same frequency, any leakage from the LO port to the input of the mixer is going to produce a DC component at the output, adding to the DC-offset problem.
The goal of this project is to study the use of sub-harmonic mixers in the direct conversion receiver. Because the LO is now at a fraction of the RF frequency, any LO leakage to the input of the mixer will be mixed to a frequency outside the signal band. Currently, a sub-harmonic mixer in a 0.13-µm CMOS process is being designed.
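The frequency bookkeeping behind this argument can be checked with a few lines of arithmetic. The sketch below assumes an ideal mixer that multiplies its input by a single LO harmonic, a 2.4 GHz carrier, and a sub-harmonic LO at half the RF frequency; these values and the function name are illustrative assumptions, not the parameters of the mixer being designed.

```python
def mixing_products(f_in_hz, f_lo_hz, lo_harmonic):
    """First-order products of an input tone with one LO harmonic: |f_in -/+ n*f_LO|."""
    f_eff = lo_harmonic * f_lo_hz
    return sorted({abs(f_in_hz - f_eff), f_in_hz + f_eff})

F_RF = 2.4e9  # hypothetical carrier frequency, hertz

# Conventional direct conversion: LO at f_RF, so LO leakage at the mixer
# input mixes with the LO itself and lands at DC.
print("fundamental mixer, LO leakage:", mixing_products(F_RF, F_RF, 1))

# Sub-harmonic mixer: LO at f_RF / 2, mixing with its second harmonic.
print("sub-harmonic mixer, RF signal:", mixing_products(F_RF, F_RF / 2, 2))
print("sub-harmonic mixer, LO leakage:", mixing_products(F_RF / 2, F_RF / 2, 2))
```

In the sub-harmonic case the desired RF signal still converts to baseband, while leakage at the LO frequency lands at the LO frequency and its third harmonic, away from DC.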
Aberrations in the exposure tool have been shown to produce line-edge and line-end perturbations on phase-shifting masks that can result in design defects. A pattern matching system has been developed to locate areas in a phase shift mask most sensitive to these lens aberrations. The original prototype of the pattern matcher was developed in the SKILL programming language of Cadence's Design Framework II. Speed and memory limitations prompted the creation of a new C++ binary, which incorporated the core data structures and matching algorithms. Specialized algorithms for partitioning, prefiltering, and compression resulted in a fast and memory efficient matching process. The local layout geometry is automatically output to SPLAT for detailed image analysis with and without aberrations. The pattern matching C++ software supports multi-level compression, layer Booleans, pattern proximity calculations, and many other features. The pattern matching idea has recently been expanded to search for sensitivity to defects, misalignment, reflective notching, laser-assisted thermal processing, and chemical-mechanical polishing. The current software system includes a combined graphical and a text-based interface and can be run on any operating system, independent of Cadence. The web-based version of the pattern matcher allows a user to either input custom layouts and match patterns or select from a list and then perform matching runs from a web browser.

Figure 1: Screenshot of pattern matcher web applet

Figure 2: Large coma and trefoil aberration patterns matched on a small phase-shift mask layout

Figure 3: Coma aberration pattern matched on a complex phase-shift mask
Residual aberrations in optical lithography systems used to produce integrated circuits can significantly affect the image quality produced at the wafer. Thus, the development of a simple and reliable technique for quantifying aberrations is of great importance. A theoretical foundation has been given for the ability of programmed-probe-based aberration targets to measure individual Zernike aberration terms. The optimum targets are inverse Fourier transforms of the Zernike polynomials, which allows the main features of the family of targets to be predicted in advance. Simulation of discretized versions shows an impressive 27 to 36% increase, per 0.01 waves of rms aberration, in the intensity at the center of the target relative to the clear field intensity. The cross contamination by other targets is about 1/6th as large, and it is thus possible to measure spherical aberration independent of focus. The theoretical foundation of this work, as well as initial simulation results, are presented in [1]. An invention disclosure was filed on this work, and the university filed a provisional patent application in September 2001.
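For readers unfamiliar with the Zernike terms named above, the radial part of a Zernike polynomial can be evaluated with the standard closed-form sum. This is a minimal sketch of the pupil-side polynomials only; it does not compute the Fourier-transformed targets described in the text, and the function name is hypothetical.

```python
from math import factorial

def zernike_radial(n, m, rho):
    """Radial Zernike polynomial R_n^m(rho) for 0 <= rho <= 1."""
    m = abs(m)
    if m > n or (n - m) % 2:
        return 0.0  # R_n^m vanishes unless n - m is even and non-negative
    total = 0.0
    for k in range((n - m) // 2 + 1):
        coeff = ((-1) ** k * factorial(n - k)
                 / (factorial(k)
                    * factorial((n + m) // 2 - k)
                    * factorial((n - m) // 2 - k)))
        total += coeff * rho ** (n - 2 * k)
    return total

# Defocus (n=2, m=0) is 2*rho^2 - 1; spherical (n=4, m=0) is 6*rho^4 - 6*rho^2 + 1.
for rho in (0.0, 0.5, 1.0):
    print(rho, zernike_radial(2, 0, rho), zernike_radial(4, 0, rho))
```

Defocus and spherical aberration share the same azimuthal order but differ in radial order, and the orthogonality of these polynomials over the pupil is what makes it plausible for separate targets to respond to one term with limited cross contamination from the other.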
Once the aberration fingerprint of a lens or a family of lenses is determined, this data can be input into the powerful PatternMatch software created by Frank Gennari. Device designers can use this software to discover where their layouts will be affected by the aberrations present and design accordingly. This offers a significant link between designers and lithographers, which will only become increasingly important as low k1 lithography solutions are implemented.

Figure 1: An example of the digitized spherical aberration target and the response of the central probe image intensity on the wafer to no aberration, 0.05 waves (peak) of defocus, and 0.05 waves of spherical aberration.
Rigorous electromagnetic simulations are used to test the lithographic printing of novel technologies such as polarization masks. Polarization masks use small bars inserted into features to polarize the incident radiation, allowing features to be printed with the chosen polarization. Proximity effects from electric field spillovers between adjacent features can be reduced by passing opposite polarizations, resulting in spatially orthogonal electric field vectors that give a reduction in intensity of _. Additionally, polarization bars can help mitigate the effects of large phase transitions on phase-shifting masks. Several special-purpose monitors can be constructed with polarization bars to monitor polarization imbalances in the illumination, high numerical aperture effects on vector addition, and polarization-dependent resist coupling effects. Small, dense contacts may also be produced by using a combination of polarization and phase shifting to generate four-wave interference at the wafer. These polarization bars must have gap widths of about λ/8 to λ/3, with the bars themselves being about λ/8.
1Graduate Student (non-EECS), Applied Science and Technology
Immersion lithography offers the capability to further improve the resolution of printed features. For example, by placing deionized water (n=1.44) between the projection optics and the wafer, the current 193 nm stepper could achieve the same resolution as a 157 nm stepper while improving the depth of focus by 8%. Immersion liquids such as Fomblin pump-oil (n=1.36) and other liquids have been examined for suitability in 157 nm steppers [1]. The immersion layer also allows larger amounts of energy to be coupled into the resist due to a lower reflection coefficient and lower effective NA.
This research focuses on quantifying potential problems by performing analytical estimation and simulation. Intrinsic characteristics of the liquid likely to contribute to aerial image degradation include homogeneity and reactivity with optical and resist surfaces. Liquid dynamic effects such as local heating, relative motion of the liquid and projection lens, convection heating, and resist outgassing are being investigated.
The high resolution and superior coupling of light into photoresist are being investigated through generalizing the thin-film formulation developed by Michael Yeung for SPLAT [2]. This approach consists of modeling thin-film effects through viewing them as amplitude and phase effects in the lens pupil.
A maskless lithography system can replace expensive masks with a reusable, electronic mask. The maskless system requires extremely high throughput, around 10 Tbps, to match the wafer write speed of conventional lithography. We are focusing research on advanced data decompression techniques and a high-speed analog interface to the mask-writing mechanism.
Previous research has shown that Lempel-Ziv (LZ) and Burrows-Wheeler (BW) can compress layout data well. We analyzed these algorithms for their suitability for hardware decompression. Lempel-Ziv is a pattern-matching encoder followed by a Huffman encoder. Burrows-Wheeler is a block sort followed by a locally adaptive compression algorithm. Both algorithms are limited by the memory available during compression and decompression: the history buffer size for LZ and the block size for BW. We found that we could stretch the effectiveness of limited memory in both algorithms by precompressing with a simple run-length encoding (RLE). RLE fares well because layout data typically has homogeneous blocks. When sufficient memory is available, RLE has little effect on the compression ratio, but for memory-limited compression, RLE significantly improves the compression ratio. Burrows-Wheeler requires more memory than Lempel-Ziv to achieve the same compression ratio, and it also requires a more complex decompression architecture. Thus, we determined that RLE followed by LZ is best suited for hardware decompression of layout data.
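The RLE-then-LZ pipeline can be tried out in software with a few lines. The sketch below uses a simple byte-level RLE and Python's zlib (DEFLATE, that is, LZ77 followed by Huffman coding) with a deliberately small history window to mimic a memory-limited decoder. The RLE format, the 512-byte window, and the synthetic "layout" bytes are assumptions for illustration and are not the encoders or data used in this research.

```python
import zlib

def rle_encode(data: bytes) -> bytes:
    """Byte-level run-length encoding as (count, value) pairs, count in 1..255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)

def rle_decode(data: bytes) -> bytes:
    """Inverse of rle_encode."""
    out = bytearray()
    for i in range(0, len(data), 2):
        out += bytes([data[i + 1]]) * data[i]
    return bytes(out)

def lz_compress(data: bytes, window_bits: int = 9) -> bytes:
    """DEFLATE with a history window of 2**window_bits bytes (9 is zlib's minimum)."""
    comp = zlib.compressobj(9, zlib.DEFLATED, window_bits)
    return comp.compress(data) + comp.flush()

def lz_decompress(data: bytes) -> bytes:
    """The zlib header records the window size, so the default decoder suffices."""
    return zlib.decompress(data)

# Synthetic stand-in for layout data: long homogeneous runs of two values.
layout = (b"\x00" * 300 + b"\xff" * 200) * 50
lz_only = lz_compress(layout)
rle_lz = lz_compress(rle_encode(layout))
print(len(layout), len(lz_only), len(rle_lz))
```

Decompression runs the stages in reverse order, `rle_decode(lz_decompress(rle_lz))`, which mirrors how an RLE stage would sit behind the LZ decoder in hardware.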
This research focuses on architectures and algorithms for iterative decoders for error correction codes. Iterative decoding algorithms for turbo codes and low-density parity check (LDPC) codes have recently been discovered to achieve performance close to theoretical capacity bounds. These algorithms are based on message passing between modules using soft-input-soft-output decoding.
Currently, some of the decoders that have been implemented in silicon are based on algorithms that are serial in nature. Examples include the algorithms needed to decode convolutional codes or partial response channels, such as the BCJR algorithm and the soft-output Viterbi algorithm. In many applications, such as magnetic recording, both a source decoder and a channel decoder are necessary. While a parallel LDPC decoder can now be used for outer source decoding, channel decoding becomes the bottleneck in the system.
We investigate parallel descriptions of algorithms previously described serially. With the increasing number of transistors available on a single chip, it is now possible to investigate direct implementation of these parallel algorithms. As an example, this research investigates joint MAP and LDPC decoding algorithms and their implementations on a single chip.
This research addresses the algorithms and implementations of iterative decoders for error control in communication applications. The iterative codes are based on various concatenated schemes of convolutional codes [1], and low-density parity check (LDPC) codes [2]. The decoding algorithms are instances of message passing or belief propagation [3] algorithms, which rely on the iterative cooperation between soft-decoding modules known as soft-input-soft-output (SISO) decoders.
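To make the message-passing idea concrete, the sketch below shows the two node updates of the standard sum-product (belief propagation) rule for an LDPC code in the log-likelihood-ratio (LLR) domain. It is a generic textbook formulation rather than the decoder architecture developed in this research; the function names and the sign convention (positive LLR favors bit 0) are assumptions.

```python
import math

def check_node_update(llrs):
    """Extrinsic LLR sent on each edge of one parity check (tanh rule)."""
    out = []
    for i in range(len(llrs)):
        prod = 1.0
        for j, llr in enumerate(llrs):
            if j != i:  # exclude the edge's own incoming message
                prod *= math.tanh(llr / 2)
        # Clamp so atanh stays finite when inputs are very confident.
        prod = max(min(prod, 1 - 1e-12), -(1 - 1e-12))
        out.append(2 * math.atanh(prod))
    return out

def variable_node_update(channel_llr, check_llrs):
    """Extrinsic LLR sent on each edge of one bit node."""
    total = channel_llr + sum(check_llrs)
    return [total - c for c in check_llrs]

print(check_node_update([2.0, -3.0, 4.0]))
print(variable_node_update(1.0, [0.5, -0.25]))
```

Each output depends only on the other edges' inputs, so all edges of all nodes can be updated at once. That independence is the inherent parallelism a multi-processing-element architecture can exploit.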
Implementation constraints imposed on iterative decoders applying the message-passing algorithms are investigated. Serial implementations similar to traditional microprocessor datapaths are contrasted against architectures with multiple processing elements that exploit the inherent parallelism in the decoding algorithm. Turbo codes and low-density parity check codes, in particular, are evaluated in terms of their suitability for VLSI implementation in addition to the performance as measured by bit-error rate as a function of SNR.
In this research, the computational hardware and memory requirements of magnetic storage applications [4] provide a platform for evaluation of the iterative decoders. Past accomplishments include modification of known algorithms to accentuate the physical design considerations. A VLSI implementation of a soft-output Viterbi decoder suitable for high throughput Turbo applications has been demonstrated [5]. The ongoing efforts continue to study and demonstrate the traits of particular low-density parity check codes that lend themselves to efficient mapping on hardware architectures.
In variable-throughput digital systems, power dissipation can be reduced by adjusting the operating frequency, supply voltage, or MOSFET threshold voltage, so that the system throughput never exceeds the requirements. Supply voltage scaling (VS) has been one of the most effective power-reduction techniques [1,2]. Threshold voltage scaling (TS) has also been proposed to effectively curtail the leakage power of the system [3]. Minimizing the power dissipation for a given throughput requires a careful balance of active and static power contributions, which can be achieved by simultaneous control of both supply and threshold. This research investigates several power reduction scenarios through different technology generations, logic depths, and switching activities, and demonstrates the effectiveness of each power reduction technique on both an inverter chain-based calculation model and through simulation of a 20-bit adder circuit. A typical variable-throughput system, an inverse discrete cosine transformer for an MPEG decoder, is also designed for hardware demonstration of the effectiveness of supply and threshold voltage control.
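The balance between active and static power can be explored with a very small model. The sketch below assumes an alpha-power-law delay model, an exponential subthreshold leakage model, normalized units, and a brute-force grid search. Every constant and function name is a hypothetical placeholder, and this is not the inverter-chain calculation model used in this research.

```python
ALPHA = 1.3    # velocity-saturation exponent (assumed)
SWING = 0.1    # subthreshold swing in volts per decade (assumed)

def delay(vdd, vth):
    """Normalized gate delay from the alpha-power law."""
    return vdd / (vdd - vth) ** ALPHA

def power(vdd, vth, freq, activity=0.1, c_eff=1.0, i0=10.0):
    """Normalized total power: switching plus subthreshold leakage."""
    dynamic = activity * c_eff * vdd ** 2 * freq
    leakage = i0 * 10 ** (-vth / SWING) * vdd
    return dynamic + leakage

def best_operating_point(throughput_fraction, vdd_nom=1.2, vth_nom=0.3):
    """Grid-search the (Vdd, Vth) pair with the lowest power at the required rate."""
    f_req = throughput_fraction / delay(vdd_nom, vth_nom)
    best = None
    for vdd_mv in range(400, 1201, 10):
        for vth_mv in range(100, 501, 10):
            vdd, vth = vdd_mv / 1000, vth_mv / 1000
            if vdd - vth <= 0.05:
                continue  # too little gate overdrive for this model
            if 1.0 / delay(vdd, vth) < f_req:
                continue  # too slow for the required throughput
            p = power(vdd, vth, f_req)
            if best is None or p < best[0]:
                best = (p, vdd, vth)
    return best

for fraction in (1.0, 0.5, 0.25):
    p, vdd, vth = best_operating_point(fraction)
    print(f"throughput {fraction:.2f}: Vdd = {vdd:.2f} V, Vth = {vth:.2f} V, power = {p:.4f}")
```

Searching over both voltages at once reflects the point made above: lowering the supply cuts switching power but slows the gates, while the threshold trades speed against leakage, so the minimum depends on both.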
Simple, accurate short-channel MOSFET current and delay models are useful in low-power digital design for rapidly evaluating the effect of changing transistor width, supply voltage, and threshold voltage. As device channel lengths have scaled, effects such as mobility degradation and velocity saturation have made the Shockley square-law model insufficient for accurate characterization. Two accurate short-channel current models have been presented in [1] and [2]. In this research it is shown that the second model can be simplified such that only three extracted parameters are necessary to model the velocity saturation and mobility degradation behavior, covering both triode and saturation operating regions, and that these parameters can be easily extracted from transistor I-V curves. This simplified model is demonstrated both on a commercial 0.13 µm process technology and on a simulated 20 nm FinFET technology. Furthermore, it is demonstrated that the form of both models can be used in delay expressions that accurately capture inverter delays across a range of supply voltages and fan-outs.
We are investigating the performance and power-area-delay tradeoffs for CMOS arithmetic circuits in deep submicron technologies. This exploration is done on an example of high-performance 64-bit adders. A number of high-speed adder designs have been reported that increase speed and reduce power by: (1) architectural or logic transformations of carry look-ahead equations, and (2) advanced circuit styles in combination with advanced timing methodologies. The goal of this project is to determine minimum achievable delays for given adder topologies for varying output loads, to minimize the delay under energy and area constraints and to minimize energy and/or area under delay constraints. The main design knobs are gate sizes, supply voltage, and transistor threshold voltage. Furthermore, an optimum adder topology will be found for the given set of constraints. This goal is accomplished through a common method that allows performance comparison between different adder architectures (topologies) in early phases of the design. The methodology can be extended to optimization of various digital building blocks in the energy-delay space.
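As a reminder of what the carry look-ahead equations compute, the sketch below is a bit-level software model of a 64-bit parallel-prefix adder built from generate and propagate signals, using a Kogge-Stone style recurrence. It is a functional model only and says nothing about gate sizing, delay, or energy; the topology choice and the function name are illustrative assumptions, not the designs compared in this project.

```python
def prefix_adder(a, b, carry_in=0, width=64):
    """Add two width-bit integers using generate/propagate prefix logic.

    Returns (sum, carry_out).
    """
    mask = (1 << width) - 1
    g = a & b & mask            # generate: bit position creates a carry
    p = (a ^ b) & mask          # propagate: bit position passes a carry along
    half_sum = p
    g |= p & carry_in           # fold the carry-in into bit 0
    shift = 1
    while shift < width:        # log2(width) prefix levels
        g = (g | (p & (g << shift))) & mask
        p = p & (p << shift)
        shift <<= 1
    carries = ((g << 1) | carry_in) & mask   # carry into each bit position
    carry_out = (g >> (width - 1)) & 1
    return half_sum ^ carries, carry_out

print(prefix_adder(123456789, 987654321))
print(prefix_adder((1 << 64) - 1, 1))
```

The loop runs six times for 64 bits, which corresponds to the logarithmic logic depth that makes prefix topologies attractive for high-speed adders. Different topologies trade that depth against wiring and fan-out.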
As the continuous miniaturization of solid-state devices increases chip operating frequency and circuit density, it presents the circuit designer with a slew of new problems concerning the optimal design and robustness of high-speed circuits. Among the more pressing of these problems are device variation, matching concerns, and the fact that an increasing fraction of the clock cycle is required for non-computational tasks such as clock skew/jitter compensation and latch/flip-flop setup and hold times. One goal of this project is to examine the effect of increasing device variation on the components of high-speed link transceivers used for inter-chip communication, such as the interleaved sampling front ends, clock recovery, and decision circuits. In addition, we are analytically investigating the overall optimization of master-slave flip-flops based on the concept of a "sampling function" to obtain minimal setup and hold times, as a complement to time-consuming optimization via simulation.
The primary task of this project is to design, fabricate, and characterize the comb-actuated nanomirror array for EUV maskless lithography to print feature sizes below 100 nm. The technological challenges facing the nanomirror fabrication come from the need to fabricate sub-micron mirrors with sub-50 nm vertical comb gaps, for which a self-aligned process is necessary to build a hidden-hinge and double-comb structure. Moreover, the analog “gray-scaling” printing scheme requires a clear understanding of the difference between the posicast switching and biased switching in an analog scheme. A continuing research effort on this dynamic will follow in the future.
A preliminary process has been designed and tested successfully. Vertical sub-micron comb structures (without a hinge) with released 50 nm comb gaps have been achieved (see figure). We have identified some problems of the process and are currently optimizing the fabrication parameters for further improvement of device structure. We are also investigating the relevant process to form a mirror-compatible amorphous Si thin film with satisfactory photosensitive and resistive properties (for electrical damping).

Figure 1: The SEM pictures of fabricated nanocomb structures with 50-nm comb gaps
The goal of this project is to develop an ultra low power integrated circuit that will form the core of a self-contained, millimeter-scale sensing and communication platform for a massively distributed sensor network. The integrated circuit will contain sensor signal conditioning circuits, a temperature sensor, an A/D converter, microprocessor, SRAM, communications circuits, and power control circuits (Figure 1). The IC, together with the sensors, will operate from a power source integrated with the platform.
Smart Dust motes are millimeter-scale sensing and communication platforms [1,2] that compose a distributed sensor network capable of monitoring environmental conditions in both military and commercial applications. These networks consist of hundreds to thousands of dust motes and a few interrogating transceivers. The motes are built using integrated circuit and micromachining processes for low cost, low power consumption [3], and small size. Communication between the motes and the receiver is accomplished via a wireless optical communication link at 1 kb/s or less.
We have demonstrated a 138 mm³ autonomous uni-directional sensing/communication mote that optically transmits a measure of the incident light level and a 63 mm³ autonomous bi-directional communication mote [4].
We have also demonstrated a 16 mm³ [5] autonomous solar-powered sensor node with bi-directional optical communication (Figures 2-4). The device digitizes integrated sensor signals and transmits and receives data optically. The system consists of three die: a 0.25 µm CMOS ASIC, a trench-isolation SOI solar cell array, and a micromachined four-quadrant corner-cube retroreflector (CCR, see Lixia Zhou’s research abstract). A new MEMS process is being developed that will integrate the solar cells, CCR, and a capacitive accelerometer, yielding a 6.6 mm³ device.
A finite state machine (FSM) controls the system by multiplexing sensors, directing the ADC to take samples, and sending data to the CCR transmitter. The optical receiver operates at 375 kb/s and consumes 26 µW at 2.1 V (69 pJ/bit). The 8-bit serial ADC consumes 3.1 µW at 1 V and 100 ksamples/sec (31 pJ/sample, 4 pJ/bit). The ASIC also contains a 200 x 200 µm photosensor that provides a measure of the ambient light level.
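The energy figures quoted above follow directly from dividing power by rate, as the short check below shows. The function name is hypothetical; the numbers are the ones given in the text.

```python
def energy_per_operation_pj(power_w, rate_hz):
    """Energy per bit or per sample in picojoules: power divided by rate."""
    return power_w / rate_hz * 1e12

receiver_pj_per_bit = energy_per_operation_pj(26e-6, 375e3)   # optical receiver
adc_pj_per_sample = energy_per_operation_pj(3.1e-6, 100e3)    # 8-bit serial ADC
adc_pj_per_bit = adc_pj_per_sample / 8                        # 8 bits per sample

print(f"receiver: {receiver_pj_per_bit:.1f} pJ/bit")
print(f"ADC: {adc_pj_per_sample:.1f} pJ/sample, {adc_pj_per_bit:.1f} pJ/bit")
```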
We are just completing the testing of an ultra-low energy microprocessor that consumes less than 20 pJ/instruction (this is 1-2 orders of magnitude less than many "low power" microprocessors) and is tailored to distributed wireless sensor networks. It is 600 µm on a side. This will dramatically increase the intelligence of the mote and provide data storage and computational capability.

Figure 1: Smart Dust mote conceptual diagram

Figure 2: System diagram of the Golem Dust mote and annotated layout of the integrated circuit. Because light shields cover the active circuits, die photos are not very interesting.

Figure 3: 11.7 mm³ mock-up of Golem Dust system, showing a 0.25 µm CMOS ASIC, solar power array, accelerometer, and CCR, each on separate die. A new process is being developed to integrate everything but the ASIC into one die, which will decrease the circumscribed volume to 6.6 mm³.

Figure 4: Photograph of the mock-up in Figure 3.

Figure 5: Layout of a test chip containing the custom ultra-low energy microprocessor (large block in the upper left) and custom low power 1 k x 8 and 1 k x 17 SRAMs.
The scanning micromirror has attracted much attention due to its wide range of applications, such as free-space communication, projection display, and spatial light modulation. Scanners with a large deflection range, low-voltage actuation, and fast dynamic response are preferred. A large amount of work has been done on developing processes capable of realizing micromirrors with two degrees of freedom (2-DOF).
Figure 1 shows the principle of micromirror actuation. An off-axis lateral force acts on the torsional suspension beam and induces torsional movement of the micromirror. In the previous approach [1], multi-layer structures were realized through a timed STS etch. A timed etch has a few disadvantages, such as nonuniformity across the wafer and sensitivity to process condition variation. The thicknesses of the lower and upper SCS layers are hard to control, which reduces the process yield and makes the design difficult.
The new method involves the aligned bonding of patterned SOI/SOI wafers. Unlike the timed etch process, the thickness of the most critical layer, the upper SCS layer, is predetermined by the device layer of the SOI wafer. This approach has much better uniformity across the wafer, and the process yield is much higher. Figure 2 shows the process flow. First, two SOI wafers are patterned and etched individually. Then they are prebonded by Ksaligner and annealed at 1200 degrees for 24 hours. STS backside etch and handle wafer etch are performed afterwards. Finally, the bonded wafer is released in concentrated HF for a couple of minutes.
Figure 3 shows a fabricated 2-DOF scanning mirror. Preliminary testing shows that a 1-DOF mirror is deflected 11 degrees optically under an actuation voltage of 53 V, compared with 6 degrees at 56 V for the timed etch process. Finer tuning of the design parameters and more tests are under way.

Figure 1: Torsional movement of a micromirror induced by off-axis lateral actuation

Figure 2: Process flow of patterned SOI/SOI wafer bonding

Figure 3: Picture of a fabricated 2D scanning mirror
(left corner: a SEM picture of the multi-layer structure)
In the quest to implement a network of low-power, reconfigurable sensor nodes, it is necessary to aggressively scale the amount of energy each block requires while maintaining full functionality, network connectivity, and data throughput. The RF transceiver block is crucial, as the power dissipation of this block could easily eclipse the entire sensor node power budget if not properly designed. This research focuses on the implementation of an energy efficient RF transceiver for PicoRadio.
There are three key requirements of the RF transceiver. To facilitate low-power communication of bursty sensor node data, the chosen performance metric of the transceiver is the energy required to transmit each bit of data (energy/bit). Second, in order to achieve a low-cost and small form-factor sensor node, high integration of the RF transceiver is important. Finally, because indoor sensor-node environments typically present narrowband fading (varying degrees of attenuation of narrow frequency bands), some mechanism of fading immunity is required.
To meet these requirements, a break from the traditional low-power, narrowband radio design paradigm is necessary. Key enablers are recent developments in microelectromechanical systems (MEMS) technology. By utilizing MEMS resonators, it is possible to perform passive frequency translation and filtering. Traditionally, these operations are accomplished with active circuitry (i.e., mixers), consuming large amounts of power in the process. In addition, the ultimate goal of this MEMS technology is full integration with active circuitry. Finally, MEMS resonators allow detection of widely spaced frequency bands, facilitating the design of a fading-resistant architecture.
The ultimate goal of this research is a fully integrated, ultra low-power RF transceiver suitable for wireless sensor node applications.
Scaling of CMOS technology poses significant difficulties in precise process control and circuit operation noise reduction. In order to continue the silicon success in the nanometer regime, it is critical to explore design solutions to handle the performance variability at the early stage. Our work aims at building a cohesive process and design co-optimization framework for future technology generations. By developing a set of predictive technology and circuit performance models, current efforts are focused on investigating the impact of variations on different digital circuit designs at both gate level and micro-architecture level.
This research addresses the algorithms and implementations for digital baseband timing recovery in wireless receivers. Timing recovery refers to the estimation and tracking of several non-idealities in the received signal caused by (1) the wireless channel itself, and (2) the RF and analog circuits in the transmitter and receiver. Parameters to be estimated include: (1) frequency, (2) phase, (3) sampling instant, and (4) gain, including multipath and scattering effects. This research looks specifically at timing recovery performed on the baseband signal (after down-conversion from the carrier) in the digital domain (after the analog to digital converter) and is particularly concerned with lowering the power consumption of the total receiver.
Digital baseband timing recovery can ease the design of the analog and RF circuitry by correcting for non-idealities caused by sub-optimal implementations. This tradeoff becomes especially important in single-chip radios when the RF and analog circuitry needs to be implemented in an ostensibly digital process with low voltages--a difficult task. By transferring some of the complexity to the digital domain, it is conjectured that the entire system can consume less power. This work is taking place within the PicoRadio project where low power is the primary goal. We investigate the architectural and implementation issues related to building low power baseband timing recovery systems in VLSI.
In this research, the computational hardware requirements for timing recovery on the various PicoRadio physical layers provide a platform for evaluation of the digital baseband timing recovery systems. The past accomplishments and ongoing efforts include modification of algorithms, and the efficient mapping of these algorithms into architectures and VLSI implementations that provide the final measure of complexity and power consumption.
This research supports the PicoRadio project at the Berkeley Wireless Research Center. This project is focused on developing an extremely low-power, wireless sensor node capable of collecting data from the environment and transmitting it over an ad-hoc multihop network.
Phase III of this project involves system level architecture options to meet the aggressive 100 µW average power consumption requirement. Work to this end falls into two categories: power budgeting and architectural choices. A preliminary power budget for the PicoNode III has been created through analysis of the expected data rates, clock frequencies, projected 0.12 µm process characteristics, and discussions with the PicoNode subgroups. As more refined microarchitectures are explored for the various subcomponents, this power budget will be revised to provide a more accurate account of the power consumption.
The second category involves architectural options for low-power operation of the PicoNode. Known techniques for low-power design include clock gating, dynamic frequency scaling, multiple supply voltages, and dynamic threshold voltage scaling. Coupled with the event driven nature of sensor and communication networks, blocks can be powered down completely while waiting for events. Clearly the power management can no longer be an afterthought of the design, since it affects the partitioning of the system, design of the individual components, and interactions between them. One current vision uses a distributed power management scheme that would activate blocks upon the receipt of an event. However, while this strategy may minimize the active power of a particular block, it may not yield the globally minimal power consumption for the entire system due to the overhead of entering and exiting the power-down modes. In situations where data moves predictably through the system, a centralized controller can be used to optimize the power characteristics of these specific scenarios. Another current research topic involves maintaining the state of the system while using aggressive power reduction schemes in sleep mode.
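The overhead argument above can be made concrete with a break-even calculation. The following is a minimal sketch, not the PicoNode power manager: the function names, the simple two-state (idle/sleep) model, and all numbers are assumptions chosen for illustration, and the time spent in the transition itself is neglected.

```python
def break_even_idle_time(p_idle_w, p_sleep_w, e_transition_j):
    """Shortest idle interval (in seconds) for which powering down saves energy.

    Staying on for an idle interval t costs p_idle_w * t.
    Powering down costs e_transition_j + p_sleep_w * t, where e_transition_j
    is the (assumed) total energy to enter and exit the power-down mode.
    Powering down wins when t > e_transition_j / (p_idle_w - p_sleep_w).
    """
    if p_idle_w <= p_sleep_w:
        raise ValueError("sleep power must be lower than idle power")
    return e_transition_j / (p_idle_w - p_sleep_w)


def should_power_down(expected_idle_s, p_idle_w, p_sleep_w, e_transition_j):
    """Return True if the expected idle interval exceeds the break-even time."""
    threshold = break_even_idle_time(p_idle_w, p_sleep_w, e_transition_j)
    return expected_idle_s > threshold


# Hypothetical block: 50 uW idle, 1 uW asleep, 4.9 nJ per sleep/wake cycle.
t_be = break_even_idle_time(50e-6, 1e-6, 4.9e-9)
print(f"break-even idle time: {t_be * 1e6:.1f} us")
```

Under this model, a block that wakes on every event regardless of how soon the next event arrives can spend more energy on transitions than it saves, which is the case where a centralized controller with knowledge of the data flow could do better.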
In order to achieve the low power goals of PicoRadio, new architectures for the RF receiver must be researched. An important component of the receiver is the low noise amplifier. For our application, the low noise amplifier must provide high gain and adequate noise and linearity while consuming minimal power.
In this research, we explored a design based on an inductively degenerated common-source amplifier with an RF MEMS FBAR resonator. The FBAR resonator is capable of providing a high-Q tank and narrowband filtering. Another advantage of the resonator is that it can ultimately be integrated on-chip. In this architecture, the resonator is used to tune the output tank and to provide high impedance at resonance in order to generate gain. On-chip spiral inductors are used at the source for input impedance matching, and in parallel with the FBAR resonator at the output to provide DC bias current through the transistors. The gate inductor used to determine the resonant frequency is implemented off-chip. Simulations have shown that a voltage gain exceeding 30 dB can be achieved while drawing only 500 µA from a 1.2 V supply.
To characterize the performance of the LNA, a prototype was fabricated in a 0.13 µm CMOS process. A PCB was also made to test the LNA, and results will be available soon.
Low-power locationing systems are essential parts of distributed sensor networks. As part of the PicoNode 3 digital protocol processing chip, a locationing block is being implemented in silicon. Hop counts from certain sensor nodes with known positions (called anchors) are used to estimate the position of the node. The block executes the least-squares (LS) position estimation, also called triangulation, as well as encoding and decoding of the PicoRadio packets that contain locationing information. In future work, the actual distances between nodes, instead of hop counts, are to be measured using radio signals. This scheme is planned to utilize time-of-flight measurements of the radio signals with an accurate version of GPS-type signaling.
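As an illustration of least-squares position estimation from anchors, the sketch below works in two dimensions and linearizes the range equations by subtracting the last anchor's equation from the others. It is a software sketch under stated assumptions, not the hardware block described above: the function names are hypothetical, and converting hop counts to distances with a single average hop length is an assumption made here for illustration.

```python
def distances_from_hops(hop_counts, avg_hop_length):
    """Assumed conversion: distance is hop count times an average hop length."""
    return [h * avg_hop_length for h in hop_counts]


def ls_position(anchors, distances):
    """Least-squares 2-D position from anchor positions and range estimates.

    Each anchor i gives (x - xi)^2 + (y - yi)^2 = di^2. Subtracting the last
    anchor's equation removes the quadratic terms and leaves linear rows
    a*x + b*y = c, which are solved through the 2x2 normal equations.
    """
    if len(anchors) != len(distances) or len(anchors) < 3:
        raise ValueError("need at least three anchors with one distance each")
    xn, yn = anchors[-1]
    dn = distances[-1]
    s_aa = s_ab = s_bb = s_ac = s_bc = 0.0
    for (xi, yi), di in zip(anchors[:-1], distances[:-1]):
        a = 2.0 * (xn - xi)
        b = 2.0 * (yn - yi)
        c = di**2 - dn**2 - xi**2 + xn**2 - yi**2 + yn**2
        s_aa += a * a
        s_ab += a * b
        s_bb += b * b
        s_ac += a * c
        s_bc += b * c
    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) < 1e-12:
        raise ValueError("anchor geometry is degenerate (collinear anchors)")
    x = (s_bb * s_ac - s_ab * s_bc) / det
    y = (s_aa * s_bc - s_ab * s_ac) / det
    return x, y


# Four hypothetical anchors at the corners of a 10 x 10 area.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
ranges = distances_from_hops([2, 3, 3, 4], avg_hop_length=2.5)
print(ls_position(anchors, ranges))
```

With exact distances the estimate coincides with the true position. With hop-count-derived distances it is only as good as the assumed hop length, which is one reason for moving to measured distances.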
Digital controllers for pulse-width modulation (PWM) converters are enjoying growing popularity due to their low power, immunity to analog component variations, ease of integration with other digital systems, ability to implement sophisticated control schemes, and potentially faster design process [1].
We are developing IC implementations of digital controllers for power converters that find applications in areas such as microprocessor voltage regulation modules (VRM) [2,3] and mobile device power supplies. We explore various topologies for the modules contained in a digital controller in order to provide a high-performance, low-cost solution. We have developed a very low power digital PWM (DPWM) generation module, PID control modules, and a novel low power ADC that is insensitive to switching noise and partially synthesizable. In the past year, we implemented a digital controller system for a cell phone application with on-chip power switches. We are now working on a fast, low power ADC module and completing a digital controller for a microprocessor VRM.
Some modules we developed such as the partially synthesizable ADC might have broader applications in other designs that aim for low power and a moderate conversion frequency range.
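To show how the ADC, PID, and DPWM modules fit together in a control loop, the sketch below implements a textbook positional PID law on integer error codes and clamps the result to the range of a DPWM with a given number of bits. This is a generic behavioral model, not the controller described above: the class name, gain values, and bit width are assumptions, and anti-windup for the integrator is deliberately omitted for brevity.

```python
class DigitalPidController:
    """Behavioral model of a digital PID stage driving a DPWM (illustrative)."""

    def __init__(self, kp, ki, kd, dpwm_bits):
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.max_code = (1 << dpwm_bits) - 1  # largest duty-cycle code
        self.integral = 0
        self.prev_error = 0

    def update(self, error_code):
        """Take one signed ADC error code and return a DPWM duty-cycle code.

        error_code is assumed to be the quantized difference between the
        reference and the sensed output voltage, one sample per switching
        period.
        """
        self.integral += error_code
        derivative = error_code - self.prev_error
        self.prev_error = error_code
        command = (self.kp * error_code
                   + self.ki * self.integral
                   + self.kd * derivative)
        # Quantize to the DPWM resolution and clamp to the valid code range.
        return max(0, min(self.max_code, round(command)))


# Hypothetical gains and an 8-bit DPWM.
pid = DigitalPidController(kp=2, ki=1, kd=0, dpwm_bits=8)
for err in (3, 0, -1):
    print(pid.update(err))
```

The clamp models the finite duty-cycle range of the DPWM. A practical design would also limit the integrator so that it does not keep accumulating while the output is saturated.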
As feature sizes in modern integrated circuits continue to decrease below 100 nm, the physics of conventional deep ultraviolet (DUV) optical lithography imposes severe limitations. Past efforts to improve the resolution of lithography systems have been based upon reducing the wavelength of the light source. Current state-of-the-art lithography systems use a source wavelength of 193 nm. However, decreasing the wavelength below this level is problematic because at shorter wavelengths most materials become absorptive. Although much research has been done to develop 157 nm lithography tools, this has proven to be costly and difficult, and progress has been slower than expected.
Therefore, it seems that a "quantum leap" is needed in order to keep up with the relentless progress of Moore's Law. Extreme ultraviolet (EUV) lithography at a wavelength of about 13.5 nm has been proposed as a potential replacement for DUV lithography. In order to minimize problems with absorption, EUV lithography tools rely on reflective optics made of silicon/molybdenum multilayer mirrors instead of the refractive optics found in DUV systems. Early results have been encouraging [1], and this is now an area of active research.
A system for performing static lithographic exposures has been constructed at the Advanced Light Source at the Lawrence Berkeley National Laboratory [2]. This system is currently being used to characterize new high numerical aperture optics and to explore issues related to the future manufacturing possibilities for EUV lithography. These issues include photoresist development, mask fabrication and defect problems, and improved modeling of the EUV exposure process. We will be exploring these issues with emphasis on control and metrology to characterize and improve the EUV lithography process.
In this research project we investigate the feasibility of a class of sensors for semiconductor manufacturing applications. The variables that these sensors can measure include etch rate, temperature, and plasma induced potentials.
The common theme shared by this class of sensors is that they are based on electrical impedance tomography (EIT). EIT involves injecting currents into an object while measuring the induced potentials on the surface of the object. The internal conductivity distribution can be approximately deduced from these measurements. This estimation problem is in general non-linear and poorly conditioned. Simulations have been performed to assess the potential performance of EIT based sensors in semiconductor manufacturing.
In a semiconductor manufacturing context, chemical and physical effects can induce conductivity changes in the interior of the wafer being processed. By placing electrodes at the wafer periphery and measuring potentials across these electrodes, we can infer conductivity changes. This can, in turn, be related to physical and chemical effects through process models. We have built a prototype etch-rate sensor based on this technology and it is being tested in the UC Berkeley Microfabrication Laboratory. The figure below shows the estimated change in thickness after wet etch using EIT (left) and the optically measured change in thickness (right).

Figure 1: Estimated change in thickness using EIT (left) and optically measured change in thickness (right)
In DUV photolithography, mask patterns and processes are increasing in complexity, while IC critical dimensions continue to shrink at a rapid pace. As a result, the proportional variability of the process will increase to unacceptable levels unless a means of more advanced process control is introduced. Previously standard offline pilot-lot experiments now prove to be too costly and difficult. One attractive and potentially highly viable alternative is simulation-based advanced process control [1,2]. The proposed control framework exploits scatterometry, which provides in-situ, full-profile metrology [3]. The major obstacle to implementing scatterometry in a process control setting is profile inversion, that is, deriving estimated input conditions from the measured profile. In this work, a first-principle-based process simulator (Prolith [4]) is used to simulate the lithography process and create a library of profile-to-input-conditions pairs. These profiles are then used to generate simulated diffraction responses, resulting in a library of diffraction-responses-to-input-conditions pairs. Finally, the empirically measured diffraction response will be matched to a simulated diffraction response in this library, whose accompanying set of input conditions should estimate the actual input conditions well. Preliminary, simulation-only results suggest that the framework has the potential to be successful, particularly if approximate values of the input conditions are provided during the matching step. However, it is expected that the success of the framework in reality will hinge on how well Prolith models the actual lithography process, the levels of measurement noise, and the method of constructing the simulated library. The current focus of effort in this research is determining how to build a library with balanced sensitivity across all input parameters, as well as diagnosing the empirical performance of the framework.
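The matching step can be sketched as a nearest-neighbor search over the simulated library. The example below is a minimal illustration under stated assumptions: the sum-of-squared-differences metric, the function name, the parameter names (focus and dose), and the library values are all hypothetical placeholders, not Prolith output or the matching method used in this work. The optional prior restricts the search to entries near approximate input conditions, reflecting the observation that such approximate values help the matching.

```python
def match_response(measured, library, prior=None, tolerance=None):
    """Return (input_conditions, error) of the closest library entry.

    library is a list of (conditions, simulated_response) pairs, where
    conditions is a dict of input parameters and simulated_response is a
    list of diffraction values sampled at the same points as measured.
    If prior and tolerance are given, entries whose conditions differ from
    the prior by more than the tolerance in any parameter are skipped.
    """
    best_conditions, best_error = None, float("inf")
    for conditions, response in library:
        if prior is not None and tolerance is not None:
            if any(abs(conditions[k] - prior[k]) > tolerance[k] for k in prior):
                continue
        error = sum((m - s) ** 2 for m, s in zip(measured, response))
        if error < best_error:
            best_conditions, best_error = conditions, error
    return best_conditions, best_error


# Placeholder library: three entries with made-up responses.
library = [
    ({"focus_um": -0.1, "dose_mj_cm2": 20.0}, [0.30, 0.42, 0.51]),
    ({"focus_um": 0.0, "dose_mj_cm2": 20.0}, [0.32, 0.45, 0.55]),
    ({"focus_um": 0.1, "dose_mj_cm2": 22.0}, [0.31, 0.44, 0.56]),
]
measured = [0.318, 0.452, 0.549]
print(match_response(measured, library))
```

An exhaustive search like this scales linearly with library size. How the library is sampled determines how sensitive the match is to each input parameter, which is the library-construction question raised above.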
As the scaling efforts and complexity of circuit design continue to grow, interconnect variation becomes one of the limiting factors of circuit performance [1]. The systematic nature of the pattern-density dependency in chemical mechanical polishing (CMP) makes previously used approaches to statistical circuit analysis, such as worst-case analysis, insufficient and inaccurate. In this project, we will build models for the oxide and copper CMP process so that the systematic components of the interconnect variation can be decomposed from the total variability. The reduced randomness will enable more aggressive circuit (interconnect) design.
This project has two phases: during the first phase we will use library-based scatterometry as a novel metrology tool to monitor the oxide profile evolution [2,3]. Subsequently, we will use the profiles to build models for oxide CMP. During the second phase, based on the knowledge of oxide polishing processes, we will design test structures, perform characterization experiments, and develop physical or semi-empirical models for copper dishing and oxide erosion in the damascene process [4]. We also plan to integrate the CMP variation model into a circuit performance simulation tool, and study the effects of CMP variation on circuit performance. A long term objective of this project is to provide designers with the tools that will allow design optimization while properly accounting for CMP variability.
Josephson-CMOS hybrid random-access memories have the potential to remove the memory bottleneck faced by Josephson digital technology. The main idea is to use high-density, charge-storage CMOS gates as the memory and access them by high-speed superconductive devices. This takes advantage of the best features of each. CMOS devices using the 0.25 micron process were fabricated and tested at 4 K, and a 4 K MOS device model was established, based on low-temperature experimental data on discrete devices. We intend to include the capacitances at low temperature based on measurements. According to the 4 K model, operating sub-micron CMOS devices at 4 K will further increase memory circuit speed as well as allow operation at low voltage, resulting in reduced power dissipation. In realizing such a memory hybrid, an interface circuit is needed to amplify millivolt-level Josephson data signals to volt-level signals for CMOS circuits. The interface circuit includes a higher-voltage Josephson pre-amplifier using a dual series array and an ultra-fast hybrid Josephson-CMOS amplifier, which incorporates an N-type MOSFET loaded with a series array of 400 Josephson junctions. The whole circuit has been simulated with a 4 K CMOS model, and a delay time less than 60 ps has been calculated in the absence of parasitic inductances and capacitances. That delay may be as much as doubled when accounting for parasitics. We designed and fabricated the interface circuit using a 0.25 micron National Semiconductor Corporation (NSC) process for the CMOS chip and the UC Berkeley 6.5 kA/cm² Nb process for the Josephson chip. The circuit functionality has been experimentally verified by wire-bonding the CMOS chip to the Josephson chip.
We demonstrated the design and fabrication of a model 64-kbit Josephson-CMOS hybrid memory, which includes the ultra-high-speed interface, address buffers, word line decoders, 3T DRAM-type cells, and Josephson sensing circuits; these are fabricated using the 0.25 micron NSC CMOS process and the UC Berkeley Nb process. Subnanosecond access time is predicted by a conservative simulation that used a room-temperature model for the CMOS. We are working on a piggyback structure using very short wire bonding with which we will be able to measure subnanosecond access times.