A NOVEL HIGH RESOLUTION DELAY LOCKED LOOP

by

ARDESHIR SAGHAFI

B.Sc, The University of Science and Technology

Tehran, Iran, 1989

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF APPLIED SCIENCE

in

THE FACULTY OF GRADUATE STUDIES

(Electrical & Computer Engineering)

THE UNIVERSITY OF BRITISH COLUMBIA

July 2005

© Ardeshir Saghafi, 2005


Abstract

With the rapid advances in semiconductor technology, modern digital systems operating at GHz frequencies have been successfully developed for many years. As chips grow progressively larger and the number of logic gates and the operating frequencies increase, clock skew becomes increasingly important in ensuring the proper functioning of VLSI chips. With a synchronous methodology, it is impossible to increase the clock speed further without reducing the clock skew on the chip.

Phase Locked Loops (PLLs) and Delay Locked Loops (DLLs) have been widely adopted to solve the clock skew problem. In recent years, DLLs have been widely used for clock alignment due to their lower phase-error accumulation and faster locking time. In this thesis, a novel high resolution (less than 10 ps) DLL is proposed, which combines the coarse and fine delay lines into an efficient hybrid delay line. Consequently, it saves power and area.


Table of Contents

Abstract
Table of Contents
List of Figures
Acknowledgment

CHAPTER 1 Introduction
  1.1 Clock skew
  1.2 Delay Locked Loop
  1.3 DLL vs. PLL
  1.4 Applications
    1.4.1 Clock distribution
    1.4.2 SDRAM
    1.4.3 Time-to-Digital converter (TDC)
    1.4.4 Automatic Test Equipment (ATE)
    1.4.5 Clock synthesis
    1.4.6 Clock and data recovery (CDR)

CHAPTER 2 Background
  2.1 Analog DLL
  2.2 Digital DLL
  2.3 Double loop DLL
  2.4 Synchronous Mirror Delay (SMD)
  2.5 Register controlled DLL (RDLL)
  2.6 Vernier Delay Locked Loop (VDLL)

CHAPTER 3 Design of proposed DLL
  3.1 Block diagram
  3.2 DLL modules description
    3.2.1 Vernier delay line
    3.2.2 Vernier delay line controller
    3.2.3 High resolution phase detector
    3.2.4 Lock detector

CHAPTER 4 Analysis of proposed DLL
  4.1 Testbench
  4.2 Initial lock
  4.3 Lock re-entry
    4.3.1 Lock re-entry (case 1)
    4.3.2 Lock re-entry (case 2)
  4.4 Gate count of vernier unit delay
  4.5 Resolution of the proposed DLL
  4.6 Limitations of the proposed DLL

CHAPTER 5 Conclusion

Bibliography

Appendix A Design VHDL code

Appendix B Synthesis result


List of Figures

Figure 1.1 Possible hold violation due to clock skew
Figure 1.2 Possible setup violation due to clock skew
Figure 1.3 Typical DLL block diagram
Figure 1.4 Typical PLL block diagram
Figure 1.5 SDRAM output timing with and without a DLL
Figure 1.6 Block diagram of the laser range finder [101]
Figure 2.1 Conventional analog DLL
Figure 2.2 Analog DLL with duty-cycle correction
Figure 2.3 Analog multiphase DLL
Figure 2.4 Digital DLL block diagram
Figure 2.5 Dual loop DLL
Figure 2.6 Conventional SMD
Figure 2.7 Timing diagram of a conventional SMD
Figure 2.8 Block diagram of Direct SMD
Figure 2.9 Register Controlled DLL (RDLL)
Figure 2.10 Core circuit in RDLL
Figure 2.11 Core circuit in a RSDLL
Figure 2.12 Block diagram of a vernier delay line [73]
Figure 2.13 Schematic of vernier delay line [73]
Figure 3.1 Block diagram of proposed DLL
Figure 3.2 Circuit and timing diagram of a conventional unit delay
Figure 3.3 Circuit and timing diagram of a symmetrical unit delay
Figure 3.4 CMOS NAND gate
Figure 3.5 Phase Detector block
Figure 3.6 Lock Detector block
Figure 3.7 Controller block
Figure 3.8 Vernier delay line block
Figure 3.9 SAR DLL block diagram [33]
Figure 3.10 Flowchart for weighing sequence
Figure 3.11 Proposed unit delay circuit
Figure 3.12 State diagram of controller block
Figure 3.13 Shift registers in controller block
Figure 3.14 Phase detector in [50]
Figure 3.15 Proposed high resolution phase detector
Figure 3.16 Phase detector waveforms
Figure 3.17 Lock detector circuit
Figure 4.1 Initial lock mode waveform for a leading input clock
Figure 4.2 Initial lock mode waveform for a leading input clock (zoomed in)
Figure 4.3 Initial lock mode waveform for a leading output clock
Figure 4.4 Initial lock mode waveform for a leading output clock (zoomed in)
Figure 4.5 Lock re-entry mode waveform for small phase error
Figure 4.6 Lock re-entry mode waveform for a leading input clock
Figure 4.7 Lock re-entry mode waveform for a leading input clock (zoomed in)
Figure 4.8 Introduced glitch waveform for a leading input clock
Figure 4.9 Lock re-entry mode waveform for a leading output clock
Figure 4.10 Lock re-entry mode waveform for a leading output clock (zoomed in)
Figure 4.11 Introduced glitch waveform for a leading output clock

Acknowledgments

I would like to express my deepest gratitude to my academic and research advisor

Dr. Andre Ivanov for his guidance and constant support in helping me to conduct and

complete this work.

Also my wife has been supportive, not just tolerant, of my return to graduate school. She

is as pleased as I am that my dissertation is finished. She knows that I am grateful to her

for continuous support, but I take this opportunity for a public acknowledgment of my

debt to her.


Chapter 1

Introduction

This chapter introduces the research topic of this thesis. A quick review of the DLL circuit and its comparison with the Phase Locked Loop are also included. The chapter also describes the different applications in which the DLL is used.

1.1 Clock skew

As silicon fabrication technology develops, more logic can be packed on a die and, as a result, the chip size gets progressively bigger. As the number of logic gates and the chip operating frequencies increase, clock skew becomes increasingly important in ensuring the proper functioning of VLSI chips. With a synchronous communication protocol on and off the chip, it is impractical to increase the communication clock speed further without reducing the clock skew on the chip. In a synchronous design, the clock period determines the time available for any operation between two flip-flops. Any uncertainty such as skew or jitter reduces this available time.

Clock skew is caused by differing RC delays of the clock interconnections along different clock signal paths, differing delays of the clock buffers due to process and temperature variations across the same chip, and power supply differences caused by voltage drop on the power rails.


The clock skew problem can also exist in other situations. For example, the input clock driver in any chip introduces an uncertain time delay between the internal and external clocks. As a result, the internal clocks in a multi-chip system become asynchronous, and problems occur when data is transferred between chips.

Clock skew can lead to both setup and hold time violations. Consider the circuit in Figure 1.1(a), where the clock is routed in the direction of the data path. Delays in the clock path lead to skewed versions of the system clock arriving at the two flip-flops. If δ2 is greater than the sum of the clock-to-Q delay of FF1, the logic delay, and the setup time of FF2, then a hold time violation will occur. As shown in Figure 1.1(b), FF2 samples the wrong data. This can be prevented by adding delay to the data path from FF1 to FF2 (which increases the cycle time and is not preferred) or by reducing the clock skew.

[Figure: (a) circuit with flip-flops FF1 and FF2 connected through a logic block, with clock signals clk, clk1, and clk2 and data signals D1 and D2; (b) timing diagram of clk, clk1, D2, and clk2]

Figure 1.1 Possible hold violation due to clock skew


If the clock signal is routed in the opposite direction to the data flow, as shown in Figure 1.2(a), then clock skew will not cause a hold time violation. However, a setup time violation can occur, since clk2 might arrive earlier than clk1, as shown in Figure 1.2(b). The clock cycle has to be increased in order to prevent this violation, which also harms system performance.

[Figure: (a) circuit with the clock routed against the data flow; (b) timing diagram of clk, clk1, D2, and clk2]

Figure 1.2 Possible setup violation due to clock skew

1.2 Delay Locked Loop

To reduce clock skew, the clock distribution network should be designed with care. In addition, circuits such as Phase Locked Loops (PLLs) and Delay Locked Loops (DLLs) may be needed at several critical places in the clock distribution structure to reduce the total clock skew.


A basic DLL consists of a phase detector (PD) or phase comparator (PC) block, a variable delay line, and a controller that converts the PD's output into a control signal for the delay line, as shown in Figure 1.3(a). A basic DLL detects the phase error between the input clock and its output clock and adjusts the total delay of the variable delay line to a multiple of the input clock period. It introduces enough delay (Td) that the rising edge of the output clock coincides with the next rising edge of the input clock, as shown in Figure 1.3(b).

[Figure: (a) block diagram with the external reference clock, phase detector, low pass filter, error signal, clock buffer, and point of use; (b) waveforms of the DLL input clock and DLL output clock, marking the lock time and the point at which the DLL is locked]

Figure 1.3 Typical DLL block diagram

The correct timing of a synchronous circuit relies on clock edges and is affected by clock skew and jitter, so the introduced delay of one input clock period does not have any negative impact on the functionality of systems that use Delay Locked Loop circuits.


The output clock frequency of a standard Delay Locked Loop circuit is the same as that of the input clock, so DLLs are generally not used for clock synthesis. PLLs are widely used for synthesis and clock multiplication. While some applications use DLLs for clock synthesis, this is not common [45], [53] and [107].

DLL and PLL circuits are feedback circuits. They generally require several clock cycles to achieve lock, so they cannot be turned off in standby mode because of their slow locking operation, which results in a large standby power consumption. These circuits therefore cannot be used in clock deskewing applications that require low standby power consumption.

The Synchronous Mirror Delay (SMD) and Clock Synchronized Delay (CSD) circuits

were developed for applications requiring low standby currents [52], [88] and [94]. These

circuits have no feedback so their lock-in time is significantly less than that of DLLs or

PLLs. During the standby mode, it is possible to switch them off. When power is resumed,

it only takes two or three clock cycles for them to lock in, which is negligible for most

applications.

1.3 DLL vs. PLL

When it comes to choosing between a PLL and a DLL for a particular application, the differences in their architectures need to be understood. The oscillator used in the PLL inherently introduces instability and accumulation of phase errors (Figure 1.4). This in turn degrades the performance of the PLL when compensating for the delay of the clock distribution network. On the other hand, the unconditionally stable DLL architecture does not accumulate phase errors [20], [100], [103]. For this reason, the DLL architecture is widely used for delay compensation and clock conditioning.

[Figure: block diagram with the external reference clock, phase frequency detector, error signal, low pass filter, voltage controlled oscillator, clock tree, and point of use]

Figure 1.4 Typical PLL block diagram

The DLL's closed loop transfer function has only one pole (a first order system) [56] and

[57]. Therefore, it is naturally a stable system. On the other hand, a PLL's closed loop

transfer function has two or three poles. Therefore, stability is a major issue and needs to

be addressed during design. Normally, one needs to add zeroes to a PLL's transfer function

in order to stabilize the PLL circuit [103].
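The behaviour of a single-pole loop can be illustrated with a minimal discrete-time sketch. It assumes a simplified loop model that is not taken from the thesis: each update removes a fraction `gain` of the remaining phase error, so that e[n+1] = (1 - gain) * e[n]. The function name, the gain, and the initial error are hypothetical.

```python
def first_order_loop(initial_error_ps, gain, steps):
    """Simulate e[n+1] = (1 - gain) * e[n] and return the error trajectory."""
    errors = [initial_error_ps]
    for _ in range(steps):
        # The only state in the loop is the current phase error (one pole).
        errors.append((1.0 - gain) * errors[-1])
    return errors


# Hypothetical numbers: 500 ps initial phase error, 25% correction per update.
trajectory = first_order_loop(initial_error_ps=500.0, gain=0.25, steps=40)
print(f"error after 40 updates: {trajectory[-1]:.4f} ps")
```

In this model the single pole sits at 1 - gain, so the error decays monotonically for any gain between 0 and 1. A higher-order loop has additional state, and its poles and zeros have to be placed deliberately to obtain the same well-behaved settling.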

The input clock's jitter propagates through a DLL circuit (first order system) and can affect the performance of the system. A PLL filters out the jitter, so it is the best choice for applications with a high-jitter input. In a clock distribution system, the main clock is generated by a quartz crystal oscillator, which does not introduce a significant amount of jitter. Therefore, the DLL circuit is generally used for de-skewing purposes [33], [66], [68] and [98].


The main disadvantage of a conventional DLL compared to a PLL is its limited phase capture range [37]. At a given operating clock frequency, a DLL can delay its input clock by an amount bounded by a minimum and a maximum delay. As a consequence, the designer must take extra care to prevent the loop from trying to lock to a delay outside these limits. To extend the operating range, the number of delay cells or the gain of the delay line (analog DLL) should be increased. This not only consumes additional power, but also increases supply-induced jitter.

1.4 Applications

DLLs are used in many different applications as described in the following subsections.

1.4.1 Clock distribution

As previously mentioned, a DLL is mainly used in clock distribution circuits that do not require clock synthesis or multiplication. Because these systems operate at a fixed clock frequency, a DLL's narrow capture range is not an issue [35], [82], [33], [66], [68], [98] and [102].

1.4.2 SDRAM

In synchronous DRAM, the output data strobe (DQS) should be locked to the data outputs (DQ outputs) for high-speed performance. The clock-access and output-hold times of conventional DRAM designs are determined by the delay time of internal circuits such as clock input and output buffers. Variations in temperature and process change the access times and reduce the size of the valid data window. Several publications describe how a DLL can optimize and stabilize clock-access and output-hold times [26], [47], [65], [70], [72], [73], [74], [77], [79], [85], [87], [90], [93], [94] and [104]. An internal DLL can be used to adjust the time difference between the output and input clock signals in SDRAMs (Figure 1.5).

[Figure: Clk and DQ timing (a) without a DLL, where tAC = Td(max) and tOH = Td(min), and (b) with a DLL; each panel marks tAC, tOH, and the valid data window]

Figure 1.5 SDRAM output timing with and without a DLL

In Double Data Rate synchronous DRAM [1], [17], [21], [32], [44], [46], [50], [51] and [75], where read/write accesses can occur on both the rising and falling edges of the clock, clock synchronization is critical and is required for both clock edges. A symmetrical DLL is used for this application. The term "symmetrical" means that the delay line used in the DLL has the same delay for a high-to-low and a low-to-high logic transition.

1.4.3 Time-to-Digital converter (TDC)

High-resolution time-to-digital converters (TDCs) are used in a number of measurement systems such as time-of-flight (TOF) particle detectors, laser range finders (Figure 1.6), and logic analyzers. Laser range-finding is used in many industrial applications, for example measuring the dimensions of ship blocks in shipyards, inspecting the oil level in large tanks, and robot vision [4], [22], [23], [36], [54], [62], [69], [81], [83], [95], [96], [97], [101] and [106].

[Figure: block diagram with the laser diode transmitter, target, amplifier and timing discriminator, time interval measurement (DLL), and distance result]

Figure 1.6 Block diagram of the laser range finder [101].

Modern TOF systems used in particle physics experiments require TDCs with a resolution below 1 ns. A distance measurement accuracy of 2-3 cm corresponds to 100-200 ps of measurement time. A high-resolution measurement can be obtained by using a logic buffer delay as the time unit, with a DLL stabilizing the buffer delay against process variations and changes in temperature and power supply. The delay line is used in a closed loop controlled by the DLL. The time resolution is limited to the delay of each unit cell in the delay line.
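The relation between distance accuracy and measurement time can be checked with a short calculation. This sketch assumes a round-trip measurement (the laser pulse travels to the target and back), which the text does not state explicitly; the function names are hypothetical.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def round_trip_time_ps(distance_m):
    """Time for light to travel to a target at distance_m and back, in ps."""
    return 2.0 * distance_m / C_M_PER_S * 1e12


def distance_resolution_cm(time_resolution_ps):
    """Distance step resolved by one time step in a round-trip measurement."""
    return C_M_PER_S * time_resolution_ps * 1e-12 / 2.0 * 100.0


for d_cm in (2.0, 3.0):
    print(f"{d_cm} cm -> {round_trip_time_ps(d_cm / 100.0):.1f} ps")
print(f"100 ps -> {distance_resolution_cm(100.0):.2f} cm")
```

Under this assumption, 2 cm and 3 cm map to roughly 133 ps and 200 ps, which is consistent with the 100-200 ps range quoted above, and a unit cell delay of 100 ps would resolve about 1.5 cm.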

1.4.4 Automatic Test Equipment (ATE)

General purpose Automatic Test Equipment (ATE) requires fast devices, high tester bandwidth, high data rates, and high timing accuracy. At the heart of ATE is the timing event generation circuitry, which generates control signals for different parts of the ATE [99]. DLLs have been widely used in ATE to achieve the required precision and to cancel the effects of process, supply voltage, and temperature (PVT) variations on the time base generator.

1.4.5 Clock synthesis

PLLs have been used successfully in creating tapped ring oscillators for clock synthesis. A PLL's delay elements have two dependent variables controlled by the feedback system: the frequency and the phase. A DLL, however, has only a single dependent variable controlled by the feedback loop: the phase. A PLL integrates the error of all its noise sources, but a DLL integrates only the noise sources that cause jitter, such as power supply noise or thermal noise. This happens over only one delay period, so a DLL does not accumulate noise because it is a first order system. This is a desirable characteristic for every high performance clock generator [2], [3], [6], [7], [8], [9], [11], [12], [15], [18], [27], [28], [30], [39], [40], [41], [42], [45], [48], [64], [67], [80], [88], [103] and [105].


1.4.6 Clock and data recovery (CDR)

Clock and data recovery is a mechanism that allows a receiver to extract the clock from an incoming data stream; the recovered clock can then be used to extract the incoming data. The receiver also extracts the embedded clock in the data stream in order to transmit data back to the source. Both Delay Locked Loops (DLLs) and Phase Locked Loops (PLLs) can be used in clock and data recovery circuits, although DLLs are rarely used in CDR circuits [14], [19], [25], [55], [61] and [91].


Chapter 2

Background

In this chapter, we provide an overview of different DLL types and discuss the advantages and disadvantages of each. An extensive literature overview of the different types of DLLs is included, which covers papers from 1993 to 2005.

2.1 Analog DLL

Analog DLLs were first used in clock distribution applications [10] and [13]. A conventional analog DLL consists of four main blocks: a voltage controlled delay line (VCDL), a charge pump, a low pass filter, and a phase detector, as shown in Figure 2.1.

[Figure: block diagram showing the VCDL driven by RefClk]

Figure 2.1 Conventional analog DLL


The input reference clock drives the delay line, which comprises cascaded variable delay buffers. The output clock drives the loop phase detector. The output of the phase detector is integrated by the charge pump and the loop filter capacitor to generate a loop control voltage. The loop's negative feedback drives the control voltage to a value that ideally forces a zero phase error between the output clock and the reference clock.

The simple design of the DLL offers many advantages when compared to Voltage Controlled Oscillator (VCO) based PLLs. Due to frequency acquisition constraints, a PLL usually uses a specific type of phase detector, the state-machine based phase frequency detector (PFD). In contrast, a DLL's phase detector can be easily implemented using bang-bang control [109]. This means that the control signal of the loop can simply be a binary up or down signal rather than being proportional to the phase error magnitude. Additionally, since DLLs do not use a VCO, phase errors induced by supply or substrate noise do not accumulate over many clock cycles [108]. This improved noise immunity is the main reason for the increased usage of DLLs in applications that do not require clock synthesis [16], [19], [34] and [105].
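Bang-bang control can be sketched in a few lines. The model below is an illustration under assumptions that are not in the thesis: the phase detector reports only the sign of the phase error, and each update changes the delay by one fixed step. All names and values are hypothetical.

```python
def bang_bang_loop(initial_error_ps, step_ps, updates):
    """Binary up/down control: each update moves the delay by one fixed step
    in the direction given by the sign of the phase error."""
    error = initial_error_ps
    history = [error]
    for _ in range(updates):
        # The detector output carries no magnitude, only "down" or "up".
        error += -step_ps if error > 0 else step_ps
        history.append(error)
    return history


# Hypothetical numbers: 95 ps initial error, 10 ps correction step.
history = bang_bang_loop(initial_error_ps=95.0, step_ps=10.0, updates=30)
print(history[:12])
```

Once the error falls below one step, the loop no longer converges further and instead toggles between two values around zero. The step size therefore sets both the locking speed and the residual dither, which is one reason fine resolution matters in the delay line.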

An analog DLL is a relatively complex analog circuit requiring a process-specific implementation. It is difficult to reuse the same design in a different technology, which makes the analog DLL a non-portable architecture. For example, if an analog DLL is designed for 0.35 µm CMOS technology, then it is not practical to migrate it to 0.18 µm technology, as major changes in the layout of the design are required.


The output clock's duty cycle changes as it passes through many delay cells. The reason is that the propagation delay of each unit cell in the delay line is not the same for low-to-high and high-to-low input transitions, so even if the duty cycle of the reference clock is 50% at the input, the output duty cycle may be significantly different. A conventional solution is to attach duty-cycle correction circuits to all clock output drivers, which adds area and increases jitter.

An all-analog multiphase DLL is proposed in [34]. It achieves both wide range operation and low jitter performance. The DLL in [34] has the same benefits as a conventional analog DLL, such as jitter cancelling and multiphase clock generation. It also uses a dual controlled delay cell to correct the duty-cycle problem, as shown in Figure 2.2.

[Figure: block diagram with the reference clock, VCDL, two phase detector / charge pump / low pass filter chains producing the control voltages Vcp and Vduty, and the output Clk]

Figure 2.2 Analog DLL with duty-cycle correction


A second phase detector compares the inverted clock input with the inverted clock output and generates a control signal Vduty, as shown in Figure 2.2. This signal fine-tunes the cell current ratio and thereby aligns the falling edges of the reference clock and the output clock. In this way, the DLL maintains the reference clock's duty cycle.

A quadrature phase mixing DLL was proposed in [104] and [105], which eliminates the limited capture range deficiency of conventional analog DLLs (Figure 2.3). This approach is based on the fact that quadrature clocks (clocks phase shifted by 90 degrees) can be generated for a given clock frequency. The quadrature clocks are input to a phase mixer, which can produce a clock whose phase can span the complete 0-360 degree phase interval. This approach overcomes the limited phase range problem of the conventional DLL.
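The idea that two quadrature clocks can be mixed into any phase follows from a standard trigonometric identity. The sketch below assumes ideal sinusoidal clocks and signed mixing weights, which is a simplification and not necessarily how the mixers in [104] and [105] are built; the function names are hypothetical.

```python
import math


def mixer_weights(target_phase_deg):
    """Weights for the 0-degree and 90-degree clocks that yield the target phase."""
    phi = math.radians(target_phase_deg)
    return math.cos(phi), math.sin(phi)


def mixed_phase_deg(w_i, w_q):
    """Phase lag of w_i*cos(wt) + w_q*sin(wt), which equals R*cos(wt - phi)."""
    return math.degrees(math.atan2(w_q, w_i)) % 360.0


for target in (0, 45, 135, 225, 315):
    w_i, w_q = mixer_weights(target)
    print(target, round(w_i, 3), round(w_q, 3), round(mixed_phase_deg(w_i, w_q), 3))
```

Negative weights correspond to using the inverted version of a clock, so the four combinations of signs cover the four quadrants and the output phase is not bounded by the length of a delay line.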

[Figure: block diagram with the reference clock, a divide-by-2 stage producing 0° and 90° clocks, phase detector, and charge pump]

Figure 2.3 Analog multiphase DLL


2.2 Digital DLL

Both analog and digital DLLs have been used for clock alignment applications [35], [82], [33], [66], [68], [98] and [102]. An analog DLL generally provides better jitter performance at the expense of greater complexity. Although the digital DLL uses more area and power than the analog DLL, its greater simplicity and lower minimum required power supply voltage make it very attractive for many clock alignment applications.

Digital DLLs are characterized by their use of digital delay lines, which are typically made from simple digital circuit elements (Figure 2.4). This simplicity helps in designing a portable digital DLL that can be easily adapted to different technologies. Additionally, because phase information in a digital DLL is stored as a digital state, digital DLLs can provide very fast timing recovery after being placed in standby mode. However, conventional digital DLLs provide only moderate phase resolution and jitter performance [1], [21], [32], [48], [49], [71], [74], [76], [78], [92] and [94].

[Figure: block diagram with the external reference clock, demultiplexer, phase detector, shift register with right and left shift inputs, and error signal]

Figure 2.4 Digital DLL block diagram


Another benefit of digital DLLs is their ability to operate at lower voltages than analog DLLs. Because analog DLLs require saturated current sources, they run into minimum-voltage problems as the supply voltage decreases. Digital DLLs, on the other hand, only require enough voltage to ensure the proper operation of their digital gate elements.

Digital DLLs also exploit the power saving benefits of power supply scaling better than analog DLLs. The power consumption of an analog DLL is the sum of the static power consumed by the constant current sources in the circuit and the dynamic power $CV^2f$ (where C is capacitance, V is supply voltage, and f is frequency). The power consumption of a digital DLL, on the other hand, is determined primarily by the $CV^2f$ power, which decreases quadratically with supply voltage.
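A quick calculation shows the size of this effect. The sketch assumes the dynamic power term is C·V²·f, which matches the stated quadratic dependence on supply voltage; the capacitance, frequency, and supply values are hypothetical and not taken from the thesis.

```python
def dynamic_power_w(capacitance_f, supply_v, frequency_hz):
    """Dynamic switching power C * V^2 * f in watts."""
    return capacitance_f * supply_v ** 2 * frequency_hz


# Hypothetical operating point: 10 pF switched capacitance at 500 MHz.
p_33 = dynamic_power_w(10e-12, 3.3, 500e6)
p_18 = dynamic_power_w(10e-12, 1.8, 500e6)
print(f"3.3 V: {p_33 * 1e3:.2f} mW, 1.8 V: {p_18 * 1e3:.2f} mW")
print(f"ratio: {p_18 / p_33:.3f}")
```

Lowering the supply from 3.3 V to 1.8 V at the same capacitance and frequency cuts the dynamic power to about 30% of its original value. A static current-source term, by contrast, would scale at best linearly with the supply voltage.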

The delay elements can be implemented with almost any circuit block, but because the phase resolution of the delay line is determined by the propagation delay of each unit cell, delay elements that provide minimal delay are generally preferred. The delay line of a conventional digital DLL uses inverters, since they provide the shortest delay of any CMOS digital gate. Because an inverter inverts its input, the delay line is tapped only at every other inverter (two inverters in series form a unit cell), so that the output taps are not inverted and are shifted only by the total propagation delay of the two inverters.

Although conventional delay lines are attractive for their simplicity, DLLs based on such conventional delay elements suffer from several significant limitations. First, the delay line provides fairly coarse resolution. For example, a delay line with inverters as unit cells provides a minimum phase step corresponding to two inverter delays. Such coarse phase resolution is not sufficient for many clock alignment applications.

Second, conventional delay lines deliver only a limited phase range. In order to cover at least one full cycle of phase, the delay line length and unit cell delays are chosen to provide at least 360 degrees of phase under the fastest process, voltage, and temperature (PVT) conditions and at the minimum operating frequency. Consequently, covering this range requires a long delay line, which occupies more silicon area and dissipates additional power. Additionally, because inverters offer a poor power supply rejection ratio (PSRR), jitter induced by power supply noise accumulates as the signal propagates through the delay line. As a result, the signals from the later taps in the delay line carry more jitter than those from the earlier taps.
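The sizing rule in this paragraph can be written as a one-line calculation: the line must still span the longest clock period when every cell is at its fastest. The sketch below works in integer picoseconds to avoid rounding surprises; the function name and the example delays are hypothetical.

```python
def cells_for_full_cycle(max_period_ps, fastest_cell_delay_ps):
    """Smallest number of unit cells whose total delay covers one full period
    when each cell runs at its fastest-corner delay (ceiling division)."""
    return -(-max_period_ps // fastest_cell_delay_ps)


# Hypothetical numbers: 100 MHz minimum frequency (10 ns period) and a unit
# cell of two inverters that takes 120 ps at the fast corner.
print(cells_for_full_cycle(10_000, 120))
```

With these example values the line needs 84 unit cells. At a slow corner the same cells are slower, so most of the line goes unused, which illustrates the area and power cost of sizing for the fast corner and the lowest frequency at the same time.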

2.3 Double loop DLL

The key parameters in DLL design are locking time, power consumption, jitter, and phase error, which depend on the choice of proper delay elements and loop control methods. The phase adjustment is done through a variable delay line or a tapped delay line. The tapped delay line is used for digital control, where the locking characteristics are less sensitive to switching noise and crosstalk. The variable delay line, on the other hand, is used for reducing the static phase error, since its delay changes gradually. Therefore, the logical approach to obtaining a DLL with fast locking and a low phase error is to combine these two methods. This is called a dual loop DLL, sometimes referred to as a semi-digital DLL [26], [29], [31], [38], [45], [58], [60], [63], [84] and [89]. The locking procedure is done in two steps, coarse tuning and fine tuning, which are performed in the digital and analog domains, respectively (Figure 2.5). The dual loop DLL can be used in applications with a low power standby mode. Recovery from standby mode to regular operation is almost immediate because the digital information is kept in standby mode and the position of the output tap in the delay line is known at startup.

[Figure: block diagram with the external clock, a chain of delay cells tapped through multiplexers, analog delay, clock buffer, digital phase detector and digital control block, and PFD, charge pump, and loop filter]

Figure 2.5 Dual loop DLL

After the system is powered up, the coarse tuning mechanism starts. Normally, the middle tap in the delay line is selected and the output clock is compared to the input reference clock by a digital phase detector. Depending on which clock is leading and which is lagging, the output of the phase detector shifts the selected tap right or left. Finally, the tap with the minimum delay relative to the reference clock is selected. At that point, the coarse tuning phase is complete.
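The coarse tuning steps can be sketched as a simple search. This is an illustrative model, not the controller of any cited design: it assumes the tap delays increase monotonically along the line and that the target is a total delay of one clock period. The function name and all values are hypothetical.

```python
def coarse_tune(tap_delays_ps, period_ps):
    """Start at the middle tap and shift one tap at a time toward the target
    delay; stop when the neighbouring tap would not reduce the error."""
    tap = len(tap_delays_ps) // 2
    while True:
        error = tap_delays_ps[tap] - period_ps
        # Too much delay: shift left. Too little (or exact): try the right.
        step = -1 if error > 0 else 1
        nxt = tap + step
        if not 0 <= nxt < len(tap_delays_ps):
            return tap  # ran out of taps on this side
        if abs(tap_delays_ps[nxt] - period_ps) >= abs(error):
            return tap  # neighbour is no better: coarse lock reached
        tap = nxt


# Hypothetical line: 32 taps spaced 100 ps apart (100 ps .. 3200 ps).
taps = [100 * (i + 1) for i in range(32)]
locked = coarse_tune(taps, period_ps=2130)
print(locked, taps[locked] - 2130)
```

In this example the search stops at the tap with 2100 ps of delay, leaving a residual error of 30 ps. A residual of this kind, bounded by the tap spacing, is what a subsequent fine tuning stage has to remove.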

To avoid unwanted phase jitter, the digital block is then disabled and the shift registers in the controller block hold their positions. The analog control part is enabled to reduce the phase error. This hand-over is performed by a lock window mechanism. If the internal clock is outside the window, the digital block is enabled. Once the internal clock enters the lock window range, the analog block is enabled and the digital block is disabled. The range of the analog part must be large enough to cover the lock-detecting window. The analog control block consists of a PFD, a charge pump, and a loop filter. The operation of the analog loop is the same as that of the conventional analog DLL.

2.4 Synchronous Mirror Delay (SMD)

Conventional PLL and DLL circuits are feedback systems that require many clock cycles to achieve lock. Therefore, they cannot be turned off and are not used in clock-skew suppression applications requiring low standby currents, for example in a cell phone. On the other hand, Synchronous Mirror Delay (SMD) and Clock Synchronized Delay (CSD) circuits are non-feedback systems that can achieve lock in only two clock cycles [52], [88] and [94]. Therefore, these circuits can be disabled in standby mode, and they can lock to the reference clock in just two clock cycles when operation is resumed.


A conventional SMD circuit, as shown in Figure 2.6, consists of an input buffer with delay d1, a clock driver with delay d2, a replica delay line (a dummy input buffer plus a dummy clock driver, with total delay t_replica = d1 + d2), and two delay lines (a delay-measurement line and a variable-delay line arranged in parallel). When the circuit is activated, the first clock signal propagates through the input buffer, the replica delay line, and the delay-measurement line with delay [tCK - t_replica] until the second signal comes out of the input buffer. The delay time [tCK - t_replica] determines the length of the variable line. The second signal propagates through the variable-delay line and comes out of the clock driver. The resulting total delay time is d1 + t_replica + [tCK - t_replica] + [tCK - t_replica] + d2 = 2 tCK (Figure 2.7). In this manner, no feedback circuitry is used and clock skew is eliminated within two clock cycles. The simple structure of the SMD circuit also reduces design effort [52], [88] and [94].
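The delay bookkeeping of the conventional SMD can be checked numerically. The following is a minimal sketch: the function name `smd_total_delay` and the numeric values are illustrative assumptions, not taken from the thesis, and the code only sums the path segments described in the text.

```python
def smd_total_delay(t_ck, d1, d2):
    """Sum the path segments of a conventional SMD (all times in ps)."""
    t_replica = d1 + d2            # dummy input buffer + dummy clock driver
    if t_replica >= t_ck:
        raise ValueError("replica delay must be shorter than the clock period")
    t_measure = t_ck - t_replica   # time spent in the delay-measurement line
    t_variable = t_measure         # variable-delay line mirrors the measured delay
    return d1 + t_replica + t_measure + t_variable + d2


# Illustrative values only: 5 ns clock, 300 ps input buffer, 700 ps clock driver.
print(smd_total_delay(t_ck=5000, d1=300, d2=700))  # 10000, i.e. two clock periods
```

Whatever values are chosen for d1 and d2, the sum collapses to 2 tCK as long as the replica delay equals d1 + d2, which is why any mismatch between the real and dummy clock drivers shows up directly as phase error.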

[Figure: input buffer, replica delay, measurement delay line and variable delay line (each with delay tV = tCK - (d1 + d2)), clock driver (d2), and internal clock line]

Figure 2.6 Conventional SMD


[Figure: waveforms of Ext Clock, nodes A, B, C, and Int Clock over two clock periods tCK, with the intervals tV, t_replica, and d2 marked]

Figure 2.7 Timing diagram of a conventional SMD

Despite their advantages, SMD circuits are not widely used, because the dummy clock driver must replicate the actual clock driver circuit, which is known only after the placement and routing phases. Therefore, they are used for devices in which the clock driver circuits can be fixed during the circuit design stage, e.g., memory elements [94].

Furthermore, a delay difference between the original clock driver circuit and the dummy clock driver circuit exists due to process, supply voltage, and temperature (PVT) variations. This delay difference increases the phase error, which cannot be compensated for during operation because an SMD circuit has no feedback mechanism.


A direct-skew-detect synchronous mirror delay (direct SMD) achieves clock-skew suppression in only two clock cycles [43] and [52]. It can be used for application-specific integrated circuits (ASICs) with undefined clock paths, as shown in Figure 2.8. The direct SMD circuit detects both clock skew and clock cycle by using a direct-skew detector and clock suppression circuitry. The direct SMD circuit does not use a dummy clock driver circuit. Therefore, it does not experience the problems described above for a conventional SMD circuit.

[Figure: external clock, input buffer, dummy input buffer, skew detector with skew-detection signal, measurement delay line, variable delay line, switch, clock driver, and internal clock line]

Figure 2.8 Block diagram of direct SMD


2.5 Register controlled DLL (RDLL)

The RDLL belongs to the digital DLL family and is widely used in high-speed synchronous DRAM (SDRAM) applications [17], [51], [85] and [90]. In an SDRAM, the output data strobe (DQS) should be locked to the data outputs. To optimize and stabilize clock-access and output times, an internal RDLL is used in an SDRAM memory chip to adjust the time difference between the output and input clock signals.

The RDLL consists of a tapped delay line, a shift register, a phase detector, and a replica

input buffer dummy [85]. The replica input buffer dummy is used in the feedback path to

match the delay of the input clock buffer. The phase detector (PD) is used to compare the

relative timing of the edges of the input clock and the feedback clock signal, which comes

through the tapped delay line. The shift register controls the point of entry in the delay line

for the incoming external clock as shown in Figure 2.9.

[Figure: external clock, clock buffer, dummy clock buffer, phase comparator, shift register, delay line, and output clock]

Figure 2.9 Register Controlled DLL (RDLL)


The outputs of the phase detector, shift-right and shift-left, are used to control the shift register. In the conventional RDLL, only one bit of the shift register output is high, while the other bits are zero. The single high bit selects the point of entry for CLKIn in the delay line. When the rising edge of the input clock is within one resolution step of the rising edge of the output clock, both outputs of the PD, shift-right and shift-left, are low and the loop is locked, as shown in Figure 2.10.

[Figure: delay line between CLKIn and CLKOut, with the entry point selected by the single high (H) bit of the shift register; all other bits are low (L)]

Figure 2.10 Core circuit in RDLL
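The register-controlled search can be illustrated with a short behavioural model. This is only a sketch under stated assumptions: the function name `rdll_lock`, the mapping "tap k inserts k + 1 unit delays", and the numeric values are hypothetical and are not taken from [85] or from the thesis.

```python
def rdll_lock(t_clk_ps, t_unit_ps, n_stages, start_tap=None):
    """Step a one-hot tap selector until the line delay is within one unit of t_clk."""
    tap = n_stages // 2 if start_tap is None else start_tap
    for _ in range(n_stages):
        delay = (tap + 1) * t_unit_ps    # assumed: tap k inserts k + 1 unit delays
        error = t_clk_ps - delay         # > 0 means the line delay is too short
        if abs(error) < t_unit_ps:       # both phase-detector outputs low: locked
            return tap, 1 << tap, delay  # 1 << tap is the one-hot register value
        tap += 1 if error > 0 else -1    # one shift per phase comparison
        if not 0 <= tap < n_stages:
            break
    raise RuntimeError("clock period is outside the locking range")


tap, register, delay = rdll_lock(t_clk_ps=5000, t_unit_ps=100, n_stages=128)
print(tap, delay, bin(register).count("1"))  # 49 5000 1
```

The model makes the two limits discussed next visible: the lock condition cannot be tighter than one unit delay, and a clock period longer than the full line cannot be reached.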

The resolution of the RDLL is determined by the size of the unit delay used in the delay line. The locking range is determined by the number of delay stages used in the delay line. Since the DLL circuit inserts a delay time between CLKIn and CLKOut, making the output clock change simultaneously with the next rising edge of the input clock, the minimum operating frequency to which the RDLL can lock is the reciprocal of the product of the number of stages in the delay line and the delay per stage: Fmin = 1/(Td * N), where Td is the delay of one unit delay and N is the number of unit delays in the delay line. Adding more delay stages will increase the locking range of the RDLL at the cost of increased chip area and power consumption [17], [51], [85] and [90].


The conventional RDLL uses an AND gate as the unit-delay stage (NAND + inverter). The problem with using a NAND + inverter as the basic delay element is that the propagation delay through the unit delay for a high-to-low transition is not equal to the delay for a low-to-high transition, i.e., tPHL is not equal to tPLH. If the difference between tPHL and tPLH is 20 ps, for example, then the total skew of the falling edge through 50 stages is 1 ns. Because of this skew, the input clock's duty cycle is not preserved when the clock propagates through the delay line.
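The effect on the duty cycle can be worked through with a small calculation. The sketch below assumes identical non-inverting stages; the per-stage values of 80 ps and 100 ps and the 10 ns period are illustrative assumptions, chosen only so that their difference matches the 20 ps used in the text.

```python
def duty_cycle_after_line(period_ps, duty_in, t_plh_ps, t_phl_ps, n_stages):
    """Duty cycle at the end of a chain of identical non-inverting stages."""
    rise_delay = n_stages * t_plh_ps   # total delay seen by the rising edge
    fall_delay = n_stages * t_phl_ps   # total delay seen by the falling edge
    high_time = duty_in * period_ps + (fall_delay - rise_delay)
    return high_time / period_ps


# 50 stages with a 20 ps mismatch stretch the high phase by 1 ns.
print(duty_cycle_after_line(10000, 0.5, t_plh_ps=80, t_phl_ps=100, n_stages=50))  # 0.6
```

With tPLH equal to tPHL the same function returns the input duty cycle unchanged, which is the property a symmetrical delay element is meant to provide.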

A Register-Controlled Symmetrical DLL (RSDLL) is proposed in [51], which can be used for duty-cycle sensitive applications. For example, it meets the requirement of double-data-rate (DDR) SDRAM that read/write accesses occur on both rising and falling edges of the clock. In the RSDLL, a modified symmetrical delay element is used, with a NAND gate in place of the inverter (two NAND gates per delay stage).

[Figure: symmetrical delay line built from NAND pairs, with the entry point selected by the shift register outputs (Q)]

Figure 2.11 Core circuit in a RSDLL


This symmetrical unit delay guarantees that tPHL = tPLH independently of process variations, since when one NAND switches from HIGH to LOW, the other switches from LOW to HIGH. The schematic for a symmetrical DLL is shown in Figure 2.11.

2.6 Vernier Delay Locked Loop (VDLL)

The Vernier principle is based on the Vernier caliper [83]. The tool measures the length of

an object placed between its two jaws. On the sides of the jaws, an indicator mark shows

the distance between the jaws on a scale. Since the indicator usually falls between two tick

marks, additional accuracy is obtained by dividing the distance between tick marks.

An additional scale is included next to the indicator, which has ten divisions in a distance equal to nine divisions on the scale. Because of this mismatch it is possible to measure a subdivision of the primary scale ten times smaller than the distance between tick marks.

Based on this concept, a delay line with N = 10 delay elements can be designed to have a total delay of M = 9 clock periods. The minimum achievable time step is T/N = D/M, where T is the period of the input clock and D is the delay of each delay element. This technique was introduced and implemented for a time-to-digital converter (TDC) [36], [70], [83] and [99]. A TDC is mainly used to digitize time intervals, which has many potential applications in high-energy and nuclear physics experiments.
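The relation between the element delay and the time step can be verified with exact arithmetic. In this sketch the symbol M for the number of clock periods spanned by the N elements, the function name `vernier_step`, and the 1 ns period are assumptions made for illustration.

```python
from fractions import Fraction


def vernier_step(period, n_elements, m_periods):
    """Return (element delay D, time step T/N) when N elements span M periods."""
    t = Fraction(period)
    d = t * m_periods / n_elements   # from N * D = M * T
    step = t / n_elements            # minimum achievable time step
    assert step == d / m_periods     # the same step expressed as D / M
    return d, step


d, step = vernier_step(period=1000, n_elements=10, m_periods=9)  # times in ps
print(d, step)  # 900 100
```

For the caliper-like case M = N - 1, the step also equals T - D, the amount by which each delay element falls short of one clock period.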


In a conventional digital DLL, the quantization error is equal to the propagation delay of each unit in the delay line. In a 0.35 µm CMOS technology, the propagation delay of an inverter gate is about 40 ps. Thus, a unit delay consisting of two inverters in series presents a delay of 80 ps. For a 1 GHz operating frequency, the 80 ps quantization error accounts for 8% of the 1 ns clock period, an error that affects the functionality of a synchronous system. The Vernier technique is implemented to reduce this error in [5], [24], and [86].

A modified version of the Register-Controlled DLL (RDLL) that relies on the Vernier concept is proposed in [73]. It consists of two RDLLs in series. The first RDLL performs the coarse delay adjustment, with a 200 ps quantization error. The second RDLL, with a 40 ps quantization error, performs fine-tuning.

The coarse RDLL uses the conventional delay line, where each unit delay consists of a NAND gate and an inverter in a series configuration. The fine RDLL uses a different configuration, composed of two delay elements that have delay times of td and 1.2 td, where td is the unit delay time of the conventional delay element, as shown in Figure 2.12.

The delay elements are arranged in two parallel lines, the main and sub delay lines, which are connected by switches SW0 to SW4. In Figure 2.12, only one of the switches can be closed at any time. For example, if SW0 is closed, the delay line generates 5 td. Similarly, if SW1 is closed, the delay line generates 5.2 td. Thus, this delay arrangement can generate a 0.2 td delay step, which is considerably smaller than that of the conventional delay line.
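The switch settings can be tabulated with a short calculation. This sketch assumes a topology inferred from the two data points in the text: the signal passes through k sub-line elements of 1.2 td, crosses switch SWk, and then passes through the remaining 5 - k main-line elements of td. The function name `vernier_line_delay` is hypothetical.

```python
from fractions import Fraction


def vernier_line_delay(closed_switch, n_main=5, ratio=Fraction(6, 5)):
    """Total delay, in units of td, when SW<closed_switch> is the only closed switch."""
    k = closed_switch
    if not 0 <= k < n_main:
        raise ValueError("switch index out of range")
    sub = k * ratio      # k sub-line elements of 1.2 td before the switch
    main = n_main - k    # remaining main-line elements of td after the switch
    return sub + main


print([float(vernier_line_delay(k)) for k in range(5)])  # [5.0, 5.2, 5.4, 5.6, 5.8]
```

Each step replaces one td element by one 1.2 td element, so the step size is the difference between the two element delays rather than the delay of either element.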


[Figure: sub-delay line of four 1.2 td elements fed from IN, main delay line of five td elements driving OUT, and switches SW0 to SW4 connecting the two lines]

Figure 2.12 Block diagram of a Vernier delay line [73]

In Figure 2.13, the main and sub-delay lines are connected with switches SW0 to SW4. The fan-out of the main delay line is one, while that of the sub-delay line is two. Hence, the delay of the sub-delay line exceeds that of the main delay line. This delay difference becomes the unit delay time of the delay line, which is equal to the quantization error, as shown in Figure 2.13.

[Figure: one stage between switches SW(n-1) and SW(n); the sub-delay line element has delay td + Δ and fan-out 2, the main delay line element has delay td and fan-out 1]

Figure 2.13 Schematic of Vernier delay line [73].


Chapter 3

Design of proposed DLL

This chapter covers the block diagram of the proposed circuit and detailed circuit explanations of each module in the block diagram. The logic design is described thoroughly. The simulation results are covered in the next chapter. The design goal is to improve the resolution of the DLL to less than 10 ps and to reduce the area (gate count) of the Vernier delay line in the DLL by at least 10%. The power consumption is also reduced as a result of the gate reduction in the Vernier delay line. A resolution of less than 10 ps, an area reduction of 15%, and an operating frequency of up to 200 MHz are achieved in this design.

3.1 Block diagram

The block diagram consists of four modules: phase detector, lock detector, Vernier delay line, and controller, as shown in Figure 3.1.

[Figure: input clock feeding the Vernier delay line and the phase detector; the phase detector drives the controller and the lock detector; signals shown are the output, the error signal, the clock, and the lock indicator]

Figure 3.1 Block diagram of proposed DLL


The input clock is connected to two modules, the phase detector and the Vernier delay

line. The Vernier delay line propagates the input clock and provides N output taps where N

is the number of unit delays in the delay line.

In order to lock the output clock to the input clock for all input frequencies, the delay of the delay line should be greater than the period of the minimum operating frequency. For example, if a DLL's locking range is from 100 MHz to 200 MHz, then the delay line must be able to delay the input clock by 10 ns. Therefore, the input clock is delayed by 10 ns when it exits from the last output tap. If the delay of each unit is, for example, 50 ps, then the delay line needs 200 unit delays. Therefore, to reduce the minimum operating frequency, more unit delays are required, which leads to more area and power consumption.
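The sizing rule in this paragraph can be written as a pair of helper functions. This is a minimal sketch; the function names `units_required` and `min_frequency_mhz` are hypothetical, and exact fractions are used to avoid floating-point rounding at the boundary.

```python
import math
from fractions import Fraction


def units_required(f_min_mhz, unit_delay_ps):
    """Number of unit delays needed for the line to span one period of f_min."""
    period_ps = Fraction(1_000_000, f_min_mhz)   # 1 / (1 MHz) = 1,000,000 ps
    return math.ceil(period_ps / unit_delay_ps)


def min_frequency_mhz(n_units, unit_delay_ps):
    """Lowest lockable frequency, 1 / (N * Td), for a given line."""
    return float(Fraction(1_000_000, n_units * unit_delay_ps))


print(units_required(100, 50))      # 200
print(min_frequency_mhz(200, 50))   # 100.0
```

Halving the minimum frequency doubles the number of unit delays, which is the area and power cost noted above.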

The delay of each unit depends on the number of cascaded gates in each unit and the technology in which the circuit is implemented. The conventional unit delay consists of one NAND gate and one inverter in series, which in 0.18 µm technology generates a delay of approximately 70 ps. The same unit cell implemented in 0.35 µm technology generates approximately 100 ps. The delay estimates are based on commercial libraries.

The conventional unit delay has a drawback: its propagation delay is not symmetrical, so the total delay for the rising edge of the input signal is not the same as for the falling edge. Therefore, an input clock with a 50% duty cycle can result in a square wave that no longer has a 50% duty cycle, as shown in Figure 3.2. This asymmetry can cause problems in double-data-rate DRAMs, where read/write accesses can occur on both rising and falling edges of the clock [1], [17], [21], [32], [44], [46], [50] and [51].

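[Figure: conventional unit delay circuit (In, Out) and waveforms of In and Out, with rising-edge delay t1 and falling-edge delay t2, where t1 ≠ t2]

Figure 3.2 Circuit and timing diagram of a conventional unit delay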

The proposed DLL utilizes a unit delay consisting of two basic NAND gates in series. This configuration eliminates the non-symmetrical characteristic of a conventional unit delay. The total propagation delay t1 (TPHL + TPLH) for the input rising edge is equal to t2 (TPLH + TPHL) for the falling edge of the same input clock. TPHL and TPLH are the high-to-low and low-to-high delays of the NAND gate, respectively, as shown in Figure 3.3. Therefore, the duty cycle of the input clock is preserved throughout the delay line.

[Figure: symmetrical unit delay circuit (In, Out) and waveforms of In and Out, with rising-edge delay t1 and falling-edge delay t2, where t1 = t2]

Figure 3.3 Circuit and timing diagram of a symmetrical unit delay


The Vernier delay line is controlled by a finite state machine, or simply a controller. There are two modes of operation, coarse and fine. A system reset signal initiates the coarse tuning mode. In this mode, the phase detector compares the output clock signal from the center tap with the reference input clock. If the positive edge of the input reference clock is leading, then the controller shifts the output tap to the left and the total delay decreases. On the other hand, if the positive edge of the input reference clock is lagging, then the controller shifts the output tap to the right and the total effective delay increases.

The controller enters the fine tuning mode when the positive edge of the input reference clock and the output tap of the delay line are less than a unit delay apart. Therefore, the delay of each Vernier unit determines the resolution of the coarse tuning mode. In the fine tuning mode, each shift to the left or right changes the delay by a fraction of the coarse step. This enhanced resolution determines the final resolution of the system and sets the maximum phase jitter.

The phase detector compares the input clock reference with the output tap signal of the delay line. The resolution of the DLL depends not only on the fine resolution of each Vernier unit delay but also on the resolution of the phase detector. In this design, the phase detector's resolution is determined by the differential delay of a two-input NAND gate.

Generally, in CMOS gates the propagation delays from the input ports to the output port are not the same. For example, in a NAND gate, input A, which is connected to NMOS transistor T1, has a smaller propagation delay than input B, which is connected to NMOS transistor T2, because the capacitive load on the drain of T2 is larger than that of T1, as shown in Figure 3.4. This difference for a two-input NAND gate in 0.18 µm CMOS technology is less than 10 ps and varies with load and input signal transition time (slew rate).

[Figure: transistor-level schematic of a two-input NAND gate with inputs A and B and output Out]

Figure 3.4 CMOS NAND gate

The phase detector block has three outputs: increase_delay, decrease_delay, and controller_clk, as shown in Figure 3.5. At any time during the coarse and fine tuning modes, one of the increase_delay or decrease_delay outputs is active, and controller_clk is used to synchronize the controller with the phase detector, so any shift to the right or left is performed on the positive edge of the controller_clk output.

[Figure: phase detector block with inputs dll_clk_input, dll_clk_output, and reset, and outputs increase_delay, register_clk, and decrease_delay]

Figure 3.5 Phase Detector block


When the interval between the positive edge of the output clock and that of the input reference clock is within the resolution of the DLL, the DLL is in lock mode. All of the phase detector's outputs are disabled and the controller stays in standby mode. The lock detector block indicates when the DLL is in lock mode: its output goes high when the DLL is locked, as shown in Figure 3.6.

[Figure: lock detector block with inputs increase_delay, dll_clock_input, decrease_delay, and reset, and output lock indicator]

Figure 3.6 Lock Detector block

The controller block is a finite state machine (FSM) controlling the delay line, as shown in Figures 3.7 and 3.8. It controls the coarse and fine tuning modes. It also provides the mechanism to resume lock mode when the input clock's frequency or phase changes rapidly. The system reset pulse initializes the DLL, and the controller block goes into reset mode when the system is powered up. The detailed flowchart is shown in Figure 3.12.

[Figure: controller block with inputs reset, increase_delay, decrease_delay, and register_clk, and outputs fine_control, fine_control_inv, and coarse_control]

Figure 3.7 Controller block


[Figure: Vernier delay line block with inputs fine_control_inv, coarse_control, fine_control, and delay_line_input, and output delay_line_output]

Figure 3.8 Vernier delay line block

The delay line has 128 output taps controlled by the controller block. During initialization the center tap is selected as the output tap. The register-input bus is hardwired to a hex value of "00000000000000008000000000000000", which means that all the register-input bits except bit 63 are tied to logic zero. During system power up, the input load signal is asserted to logic one. Consequently, this number is loaded into a 128-bit shift register. After reset, the center tap corresponding to output control bit 63 of the controller's shift register is selected as the delay line output tap.

It is possible to load the shift register with any other number, so any output tap in the delay line can be selected. The center tap is, however, the best choice, because it gives the maximum dynamic range for both right and left shifts, so lock mode can be reached in the shortest time. In addition, choosing the center tap as the initial output tap leaves the maximum number of unit delays in both directions. Therefore, the controller output selector does not reach the boundary taps before entering lock mode.
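The one-hot load value for a given tap can be generated rather than typed by hand. The sketch below assumes a 128-bit register with bit 0 as the least significant bit; the function name `one_hot_load_value` is hypothetical.

```python
def one_hot_load_value(tap, width=128):
    """Hex string for a shift-register load value with only bit `tap` set."""
    if not 0 <= tap < width:
        raise ValueError("tap out of range")
    return format(1 << tap, "0{}x".format(width // 4))


center = one_hot_load_value(63)
print(center)                     # 00000000000000008000000000000000
print(63 - 0, 127 - 63)           # 63 64: taps available below and above bit 63
```

Starting at bit 63 leaves 63 shifts in one direction and 64 in the other, which is as close to balanced as a 128-tap line allows.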


At any time, only one bit of the shift register is active, selecting an output tap of the delay

line. In this design all the unit delays are the same and exhibit the same amount of delay. A

linear approach has been selected to achieve the lock mode in the design of DLL in this

thesis. Therefore, the controller linearly shifts the output tap to the right or left one step at

a time so the skew between the output clock and input reference clock is gradually

reduced to the minimum, which is less than the resolution of fine tuning delay units.

A Successive Approximation Register Delay Locked Loop (SARDLL) is proposed in

[33], which uses a counter instead of a shift register. Also, its delay line is designed in a

binary-weighted manner and no longer consists of delay units with equal delay time. The

N-bit control word from the up/down counter determines whether the input clock goes

through the delay stage or passes it as shown in Figure 3.9.

[Figure: binary-weighted delay line with stage delays 1, 2, 4, ..., 2^(N-1) between the input clock and the output; a phase comparator (Fast, Idle, Slow) compares the input and feedback clocks and drives an N-bit up/down counter whose N-bit control word selects the stages]

Figure 3.9 SARDLL block diagram [33].


For faster lock time, the binary search algorithm is incorporated into the SARDLL. This algorithm reduces the searching effort and shortens the lock time. The flowchart in Figure 3.10 demonstrates how this algorithm works for a three-bit control word. In the beginning, the most significant bit (MSB) of the controller output is set to one, and all the other bits are set to zero. A phase comparator examines whether the output clock leads the input clock or not. If it does, the MSB remains high. If not, it is set to low and held constant. In this way the MSB is determined, and the process is repeated for each following bit until the least significant bit (LSB) is determined. In this way, the DLL can be locked quickly.
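The bit-by-bit decision can be sketched as a successive-approximation loop. This is an illustrative model, not the circuit of [33]: it assumes the line delay equals the control word times a unit delay with no fixed offset, and it treats "output clock leads the input clock" as "the line delay does not exceed the target". The name `sar_search` is hypothetical.

```python
def sar_search(target_delay, unit_delay, n_bits):
    """Binary-search an n-bit control word for a binary-weighted delay line."""
    code = 0
    for bit in reversed(range(n_bits)):      # MSB first, LSB last
        trial = code | (1 << bit)            # tentatively set this bit
        output_leads = trial * unit_delay <= target_delay
        if output_leads:
            code = trial                     # delay still short enough: keep the bit
        # otherwise the bit stays low and is held constant
    return code


print(sar_search(target_delay=500, unit_delay=100, n_bits=3))  # 5
```

The loop makes exactly one comparison per bit, so an N-bit word is resolved in N comparisons, whereas a linear tap-by-tap search over the same range can need up to 2^N - 1 steps.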

[Figure: flowchart of the binary search over a three-bit control word, beginning at Start]

Figure 3.10 Flowchart for weighing sequence

A conventional linear approach has been implemented in this thesis. Devising the best

algorithm to speed up the lock time period is an independent topic which can be explored

in future research projects.


3.2 DLL modules description

In this section, all the modules of the proposed DLL are explained in detail. First, the circuit and all of its components are described. Then, the functionality and operation of each block are investigated in detail.

3.2.1 Vernier delay line

The delay line consists of N unit delays in a chain configuration. In this design N = 128, which establishes an approximate minimum operating frequency of 100 MHz based on the target specification. More unit delays are needed to lower the minimum operating frequency. Each unit delay consists of five two-input NAND gates. Therefore, a total of 640 NAND gates are used for this Vernier delay line.

In clock distribution applications, the clock frequency is fixed, so the minimum value of N is calculated for that frequency, automatically leading to minimal power and area consumption. In the clock recovery application, the DLL operates over a range of frequencies, so the value of N is determined by the lowest frequency component in the incoming data.

The output ports of all 128 delay units comprising the Vernier delay line are connected to a single-bit bus. This single-bit bus is the output of the DLL and is fed back to the phase detector block for phase comparison. If none of the tri-state output buffers in the Vernier delay line is enabled, then the DLL output floats, being neither low nor high.


In order to prevent the DLL output from floating, a small tri-state buffer is connected to the DLL output. The input and enable ports of this buffer are tied to logic high, so its output holds the DLL output at a weak high ('1') value. Because of the weak drive capability of this small buffer, a low output at any one of the 128 unit-delay buffers overrides this weak high value and pulls the DLL output down to the '0' logic value.

Each unit delay consists of five NAND gates and one tri-state buffer. U1 and U2 form the fine unit delay, and U3 and U4 form the coarse unit delay. U5 acts as a switch controlled by the fine_control input. The coarse_control input is connected to the enable port of the buffer gate (U6) and determines whether the unit_delay_out port is connected to the output of U4 or is in a high-impedance state, as shown in Figure 3.11.

[Figure: unit delay circuit with ports fine_input, fine_control, fine_control_inv, vernier_input, coarse_input, clk_output, fine_output, vernier_output, and coarse_output; unused gate inputs are tied to VDD]

Figure 3.11 Proposed unit delay circuit


The clk_output ports of all N unit delays are tied together and form a one-bit tri-state bus. A single tri-state buffer with weak output drive holds this bus at a weak high level, which guarantees that this single-bit bus never floats.

The fine and coarse delay units are each constructed from two NAND gates in series, forming a symmetrical delay line. The propagation delay is the same for both rising and falling edges, so the duty cycle is preserved along the line.

Each output port of U2 and U4 is connected to two other inputs, so the fan-out is two, and both U2 and U4 use the A1 port for the delay input. As a result, both U2 and U4 introduce the same amount of propagation delay. The difference between the fine and coarse unit delays is that fine_input is connected to port A2 of U1, whereas coarse_input is connected to port A1 of U3. In a NAND gate, the propagation delays from the A1 and A2 ports to output Z are not the same. The Vernier technique is based on this inherent characteristic of the NAND gate and uses this differential delay between the two inputs to achieve a fine step resolution.

In the DLLs proposed in [47], [59], [51], [85] and [90], the input clock is connected to all the unit delays in the delay chain, so there are N taps, where N is the number of unit delays in the delay line. This large fan-out requires a clock driver, which is large in area and consumes extra power. It also introduces an extra delay that has to be compensated for with another dummy clock driver in the feedback path.


In this design, the input clock is connected to only two NAND gates in the first unit delay,

so there is no need for the clock driver. This eliminates the phase shift between the input

reference clock and output clock due to delay mismatch between the clock and dummy

clock driver.

There are a total of five two-input NAND gates and one tri-state gate in each unit delay, which is fewer than the six two-input NAND gates and six inverters used in the previously described digital Vernier DLL circuit [73], as shown in Figure 2.13.

The coarse_input and fine_input ports of the first unit delay are tied to the input reference clk port. This is the entry port for both the fine and coarse chains, and from this point the reference clock propagates through two separate fine and coarse delay chains. The fine_output, coarse_output, and vernier_output ports in the last unit delay of the delay chain are not connected to any net.

3.2.2 Vernier delay line controller

The controller block consists of a finite state machine (FSM) and two shift registers that control the DLL operation. All the timing control for the delay line originates in the controller block. It determines which output tap in the delay line is connected to the DLL output and whether the DLL is in coarse or fine mode.


[Figure: state diagram entered on Reset, with transitions labeled decrease_delay & fine_control(N-1) and increase_delay & fine_control(0)]

Figure 3.12 State diagram of controller block

The finite state machine has four states: IDLE, INCREMENT, DECREMENT, and FINE, as shown in Figure 3.12. The FSM remains in the IDLE state while reset is asserted. The initial coarse_load_data value is loaded into the coarse shift register when reset is asserted. This value determines which output tap is selected as the output of the Vernier delay line. The default value of "00000000000000008000000000000000" selects the center tap.

The register_clock, increase_delay, and decrease_delay signals are generated by the phase detector block. The input register_clock signal is used to clock the shift register. The increase_delay and decrease_delay signals determine whether the shift is to the right or to the left, as shown in Figure 3.13.


[Figure: two shift registers clocked by register_clk with Right and Left inputs driven by increase_delay and decrease_delay; the coarse register loads coarse_load_data, is enabled when STATE == INCREMENT or STATE == DECREMENT, and outputs coarse_control; the fine register loads fine_load_data, is enabled when STATE == FINE, and outputs fine_control and fine_control_inv]

Figure 3.13 Shift registers in controller block

Depending on whether the increase_delay or decrease_delay signal is asserted, the state machine moves to the INCREMENT or DECREMENT state, respectively. The state machine stays in the DECREMENT state as long as decrease_delay is asserted and moves to the FINE state when increase_delay is asserted for the first time. The state machine stays in the INCREMENT state as long as increase_delay is asserted and moves to the DECREMENT state when decrease_delay is asserted for the first time. Subsequently, it moves to the FINE state on the next clock in which increase_delay is asserted.
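The transitions described so far can be written as a small next-state function. This is a behavioural sketch, not the thesis's logic design: each call stands for one register_clock edge, the exit from the FINE state is left out, and the names `State` and `next_state` are hypothetical.

```python
from enum import Enum


class State(Enum):
    IDLE = 0
    INCREMENT = 1
    DECREMENT = 2
    FINE = 3


def next_state(state, reset, increase_delay, decrease_delay):
    """Next controller state for one clock edge (FINE is terminal in this sketch)."""
    if reset:
        return State.IDLE
    if state is State.IDLE:
        if increase_delay:
            return State.INCREMENT
        return State.DECREMENT if decrease_delay else State.IDLE
    if state is State.INCREMENT:
        # keep adding coarse delay until the phase detector asks for less
        return State.DECREMENT if decrease_delay else State.INCREMENT
    if state is State.DECREMENT:
        # the first request for more delay ends coarse tuning
        return State.FINE if increase_delay else State.DECREMENT
    return State.FINE


state = State.IDLE
for inc, dec in [(True, False), (True, False), (False, True), (True, False)]:
    state = next_state(state, reset=False, increase_delay=inc, decrease_delay=dec)
print(state.name)  # FINE
```

Routing INCREMENT through DECREMENT means the controller always enters FINE from the same side of the target, so the fine steps start from a known direction.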


Therefore, regardless of whether it starts in the DECREMENT or INCREMENT state, the state machine ends up in the FINE state, where the coarse shift register is disabled and the fine shift register's output determines the amount of incremental fine delay needed for the DLL to lock its output clock to the input reference clock. The DLL stays in lock mode for as long as the input clock phase is steady and the phase difference between the output and input reference clocks is within the resolution of the phase detector.

If the input clock's frequency or phase changes at any time, the DLL exits lock mode. If the output clock's rising edge leads the input clock's rising edge, then increase_delay is asserted. On the other hand, if the input clock's rising edge leads the output clock's rising edge, then decrease_delay is asserted. In either case, register_clock is enabled. The state machine stays in the FINE state and the fine shift register shifts left or right depending on whether decrease_delay or increase_delay is asserted.

For example, suppose the resolution of the Vernier delay line is 10 ps and the fine shift register holds the hex value "00000000010000000000000000000000" when the DLL is in lock mode. The fine shift register can be shifted to the left until its most significant bit becomes "1", which requires 39 clock cycles. The delay of the delay line is then decreased by 390 ps. On the other hand, the shift register can be shifted to the right until its least significant bit becomes "1", which requires 88 clock cycles, and the delay of the delay line is increased by 880 ps. Therefore, if the phase error between the output clock's rising edge and the input clock's rising edge is within this window, then the state machine stays in the FINE state and lock mode is achieved.
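The shift counts in this example follow directly from the position of the single set bit. The sketch below recomputes them; the function name `fine_window` is hypothetical, and it assumes a 128-bit register in which a left shift moves the set bit toward the most significant bit.

```python
def fine_window(load_hex, resolution_ps, width=128):
    """Return (left shifts, right shifts, ps decrease, ps increase) for a one-hot value."""
    value = int(load_hex, 16)
    if value == 0 or value & (value - 1):
        raise ValueError("expected a one-hot value")
    bit = value.bit_length() - 1          # index of the single set bit
    left_shifts = (width - 1) - bit       # shifts until the MSB is reached
    right_shifts = bit                    # shifts until the LSB is reached
    return (left_shifts, right_shifts,
            left_shifts * resolution_ps, right_shifts * resolution_ps)


print(fine_window("00000000010000000000000000000000", resolution_ps=10))
# (39, 88, 390, 880)
```

The set bit sits at position 88, so the fine register can absorb 390 ps of delay reduction or 880 ps of delay increase before the controller has to fall back to coarse tuning.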


If the phase error is not within this window, then the state machine moves to either the INCREMENT or the DECREMENT state, depending on whether an increase or a decrease in the line delay is required. At this point, the fine shift register resets to "0" and is disabled. The coarse shift register, which controls the coarse delay line, is enabled, and each shift to the right or left increases or decreases the delay by one coarse unit delay (the delay of two NAND gates in series). The state machine finally moves into the FINE state when the phase error is less than the coarse unit delay, and then the fine incremental delay can reduce the phase error to less than the Vernier resolution.

In order to lower the power consumption in this DLL, only register_clock is used as the clock to the controller module. Therefore, while the DLL is in lock mode, both increase_delay and decrease_delay are deasserted and register_clock is not enabled. The controller module has 128 flip-flops in each of the coarse and fine shift registers, so turning off the clock to the shift registers when both are disabled lowers the power consumption. A flip-flop consumes power whenever it is clocked, regardless of whether its D input changes. Disabling a clock when it is not required saves power in digital circuits.

The Vernier delay line consists of 128 unit delays. Therefore, there are 128 flip-flops in each of the coarse and fine shift registers. The finite state machine has four independent states. Two flip-flops are required to encode the two bits representing these four states. In total, there are 258 flip-flops in the controller module, so clock gating (disabling a clock when it is not required) saves power when the DLL is in the lock mode.
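The flip-flop tally can be checked with a few lines of Python. This is only a sketch of the arithmetic; the constant names are illustrative and do not come from the design files.

```python
import math

TAPS = 128        # unit delays in the Vernier delay line
FSM_STATES = 4    # states of the controller state machine

# Bits (flip-flops) needed to encode the state machine in binary.
state_flops = math.ceil(math.log2(FSM_STATES))

# One flip-flop per tap in each of the coarse and fine shift registers.
total_flops = 2 * TAPS + state_flops
print(state_flops, total_flops)
```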


3.2.3 High-resolution phase detector

The phase detector in the DLL detects the phase error between the output and input reference clocks. The resolution of a Vernier DLL depends not only on the Vernier concept utilized in the delay line, but also on how its phase detector is designed. The minimum phase error that can be detected by the phase detector is defined as the phase detector's resolution. The resolution of a phase detector depends on many factors, including the design methodology and the CMOS technology used in chip fabrication.

A high-resolution phase detector is proposed in [50], where the delay of a buffer determines the resolution. A resolution of 70 ps is achieved when it is implemented in 0.18 µm technology. The phase detector has three outputs: Shift_Left, Shift_Right, and Clk, as shown in Figure 3.14. When the rising edge of the input clock is within one unit delay (the delay of U4) of the rising edge of the output clock, both outputs of the phase detector, Shift_Right and Shift_Left, go low and Clk is turned off.

A divide-by-two is included in the phase detector, so the phase detector is made to wait at least two clock cycles before making another decision and generating a high on either Shift_Right or Shift_Left. This provides enough time for the shift register in the design proposed in [50] to operate and for its output waveform to stabilize. On the other hand, it increases the lock time, because a decision is now made only once every two input clock cycles.


Figure 3.14 Phase detector in [50].

A modified version of the high-resolution phase detector [50] is proposed in this thesis, which can significantly improve resolution. The Vernier methodology is implemented in this design, which effectively reduces the amount of delay between the D inputs of U1 and U2. As explained previously, the delay from each of the two inputs of the AND gate to its output is not the same.

The delay difference is exploited in the Vernier delay line to achieve a very small fine incremental unit delay. The same concept is used in the high-resolution phase detector proposed in this thesis. The schematic of this phase detector is shown in Figure 3.15. U7 and U8 introduce the same delay because both gates are connected through pin A1 of the AND gate. The U3 gate introduces slightly more delay, because the A2 pin is used as its input. The 0.18 µm technology library used for simulation and synthesis introduces less than 10 ps of delay difference between the two inputs and the output of an AND gate.

Figure 3.15 Proposed high resolution phase detector

The decrease_delay and increase_delay signals are ORed to generate register_clk. The resolution of a phase detector is defined as the minimum detectable phase error between its two inputs. If the phase error is within the resolution of the phase detector, then decrease_delay, increase_delay and register_clk stay low.

The OR gate (U6) also delays register_clk relative to increase_delay and decrease_delay, which guarantees the required setup time for the flip-flops in the controller driven by register_clk. In a flip-flop, the data should not change within the setup and hold time window around the clock edge; otherwise the output is not predictable and can enter a metastable (unstable) condition.


If the output clock leads the input clock by a margin greater than the resolution, a delay difference is created between the A1 and A2 input pins to the output pin of the AND gate. Then, the Q pins of U1 and U2 go high, resulting in a high on the increase_delay output.

On the other hand, if the input clock leads the output clock by a margin greater than the resolution, then the Q pins of U1 and U2 go low (/Q goes high for both U1 and U2), resulting in a high on the decrease_delay output. In either case, register_clk goes high and generates the required clock edge for the logic in the controller module, as shown in Figure 3.16.

Figure 3.16 Phase detector waveforms (DLL input clock, DLL output clock, U1 /Q, U2 /Q, decrease_delay, increase_delay and register_clk for the output-leading and input-leading cases)

If neither of the two cases applies, the input and output clocks are within the resolution of the phase detector. In this case, the Q pin of U1 goes high and the Q pin of U2 goes low, resulting in a low on both the increase_delay and decrease_delay outputs. This happens when the output locks to the input and the DLL is in lock mode.
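The three outcomes described for the phase detector can be summarized in a small behavioral model. This is a sketch under stated assumptions: edge times are given in picoseconds, an earlier edge means that clock leads, and the 10 ps default resolution is only the upper bound quoted in the text. The function name `phase_detector` is hypothetical, and the model ignores the gate-level timing of the actual circuit.

```python
def phase_detector(input_edge_ps, output_edge_ps, resolution_ps=10):
    """Return (increase_delay, decrease_delay, register_clk) for one edge pair."""
    error_ps = input_edge_ps - output_edge_ps   # > 0: output clock leads
    increase_delay = error_ps > resolution_ps   # output leads by more than the resolution
    decrease_delay = -error_ps > resolution_ps  # input leads by more than the resolution
    register_clk = increase_delay or decrease_delay  # OR of the two outputs
    return increase_delay, decrease_delay, register_clk


print(phase_detector(100, 50))   # output leads by 50 ps
print(phase_detector(50, 100))   # input leads by 50 ps
print(phase_detector(100, 96))   # within the resolution: lock mode
```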


The divide-by-two logic (U3 and U7 in Figure 3.13) is not used in the proposed high-resolution phase detector. The delay of the Vernier delay line increases or decreases by a small differential amount equal to the resolution of the delay line. Therefore, the delay line can stabilize before the next decision is taken on the next edge of the input clock, and there is no need to wait for every other clock. This reduces the time required for the DLL to achieve the lock mode. The lock mode is detected by the lock detector module, which is described in the next section.

3.2.4 Lock detector

The lock detector is a very simple circuit, which outputs a high when the DLL is in the lock mode, as shown in Figure 3.17. If both increase_delay and decrease_delay are low on the falling edge of the DLL input clock, then the output lock_indicator goes high to indicate that the DLL is now in lock mode. The DLL input clock is used instead of register_clk because when the DLL goes into lock mode, register_clk is off and cannot clock in the low values on decrease_delay and increase_delay.

Figure 3.17 Lock detector circuit (inputs: increase_delay, decrease_delay and DLL input clock; output: lock_indicator)


Chapter 4

Analysis of proposed DLL

This chapter analyzes the simulation results, describes the testbench, and demonstrates how the DLL achieves the lock mode. The coarse and fine phases of the locking process are investigated and illustrated in the captured waveforms.

4.1 Testbench

A simple testbench instantiates the DLL design, the clock, and the reset generator. It also introduces a glitch in the clock in order to examine how the DLL re-enters the lock mode when its input clock phase changes abruptly. The lock_indicator signal is monitored; whenever this signal becomes high, the DLL has entered the lock mode. The target resolution is less than 10 ps for the operating frequency range of 100 MHz to 200 MHz.

In order to verify that the DLL can recover from any abrupt input phase change, a glitch is imposed on the input clock source after a set period of time. This drives the DLL out of the lock mode, from which the DLL mechanism guarantees recovery. After some time, the DLL locks to the input signal again. The time it takes for the DLL to lock depends on the input fluctuations, the DLL architecture, the length of the delay line, and the algorithm used in the controller module. The worst-case time is defined as the lock recovery period.


The DLL described in this thesis is in lock mode when the controller's state machine is in the FINE state and both the increase_delay and decrease_delay signals are inactive. Depending on the imposed glitch, the lock mode can be regained within the FINE state, provided that the glitch is smaller than the unit delay. Any variation larger than the unit delay forces the state machine to enter the INCREMENT or DECREMENT state; it later re-enters the FINE state, and the DLL finally regains lock status.

The testbench is configured for six different cases and exhaustively covers all the different operational modes of the DLL. The first two cases verify the general locking process after power-up and reset, considering both a leading and a lagging input clock with reference to the output clock. The other four cases verify the lock re-entry process when a glitch is applied to the input clock. Depending on the amount of glitch and the relative position of the input and output clocks (leading or lagging), the four possible cases are investigated in the testbench. The following sections detail all the cases. All the waveforms are included, and a description of the phase detector's and the controller's operation in every case clarifies the DLL's operating mechanism.

4.2 Initial lock

After power-up and reset, either the phase detector's increase_delay or decrease_delay output becomes high, depending on the polarity of the phase error. If the input clock leads the output clock, then decrease_delay is enabled. On the other hand, if the output clock leads the input clock, then increase_delay is enabled. In the case where the output clock is in phase with the input clock, both the increase_delay and decrease_delay signals (the phase detector outputs to the controller module) are disabled.


If decrease_delay is enabled, then the state machine transitions to the DECREMENT state. In this state, at every clock the coarse shift register shifts one unit to the left, which decreases the total delay by one unit. At some point the output clock starts leading the input clock, which means that the coarse action is completed and the state machine transitions to the FINE state. In this state, the fine shift register shifts to the right and at every cycle the total delay of the delay line increases by an incremental value. As described in the previous chapter, the incremental value is very small, 4 ps for the NAND gate used in this design. Finally, the output clock is within the DLL resolution (4 ps) of the input clock, and increase_delay is disabled. The lock_indicator signal becomes high, which indicates that the DLL is locked. The captured waveforms are shown in Figures 4.1 and 4.2. For clarity, only related signals are captured. The phase error between the output and input clocks is 2 ps after the DLL locks, where the LOCK_INDICATOR signal is high as shown in Figure 4.2.

Figure 4.1 Initial lock mode waveform for a leading input clock (state sequence: IDLE, DECREMENT, FINE)

Figure 4.2 Initial lock mode waveform for a leading input clock (zoomed in)

On the other hand, if increase_delay is enabled, then the state machine transitions to the INCREMENT state. In this state, at every clock edge the coarse shift register shifts one unit to the right, which increases the total delay by one unit delay. At some point, the input clock starts leading the output clock and decrease_delay is asserted, which means that the coarse action is completed. The state machine then moves to the DECREMENT state and after one clock cycle enters the FINE state, as shown in Figure 4.3. The reason behind this sequence is that initially the fine delay line output tap is set to the first tap, the leftmost tap position of the chain, so the fine delay can only be increased. Therefore, by going to the DECREMENT state the output clock leads the input clock again, but this time the phase error is less than one unit delay. After moving to the FINE state, the delay increases incrementally until the phase error falls within the resolution and the lock state is achieved. The phase error between the output and input clocks is 2 ps after the DLL locks, as shown in Figure 4.4.


Figure 4.3 Initial lock mode waveform for a leading output clock (state sequence: IDLE, INCREMENT, DECREMENT, FINE)

Figure 4.4 Initial lock mode waveform for a leading output clock (zoomed in)
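The two initial-lock sequences described in this section can be sketched as a simple linear search. The Python model below is only an illustration under stated assumptions: delays are idealized integers in picoseconds, the coarse and fine steps are taken as 78 ps and 4 ps (the values quoted for the simulation), and the function name `initial_lock` and its arguments are hypothetical. It does not model the shift registers or gate timing.

```python
def initial_lock(delay_ps, target_ps, coarse_ps=78, fine_ps=4):
    """Return (clock_cycles, residual_error_ps) for a coarse-then-fine search."""
    cycles = 0
    if delay_ps > target_ps:
        # Input clock leads: DECREMENT until the output clock starts leading.
        while delay_ps > target_ps:
            delay_ps -= coarse_ps
            cycles += 1
    else:
        # Output clock leads: INCREMENT until the input clock starts leading,
        # then take one DECREMENT step so that only an increase is still needed.
        while delay_ps <= target_ps:
            delay_ps += coarse_ps
            cycles += 1
        delay_ps -= coarse_ps
        cycles += 1
    # FINE: add fine steps while the remaining error is at least one fine step.
    while target_ps - delay_ps >= fine_ps:
        delay_ps += fine_ps
        cycles += 1
    return cycles, target_ps - delay_ps


print(initial_lock(1000, 500))  # leading input clock
print(initial_lock(100, 500))   # leading output clock
```

In both branches the coarse phase ends with the delay slightly too small, so the fine phase only ever increases the delay, which mirrors the reason given in the text for the extra DECREMENT step.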


4.3 Lock re-entry

Phase variations on the input clock due to jitter and glitches introduce phase error, which in turn causes the DLL to exit the lock mode. This initiates a re-entry process, after which the DLL resumes its lock status. Depending on the amount of phase error, the state machine can stay in the FINE state or move to the INCREMENT or DECREMENT state. The following sections explain these two possible cases in detail.

4.3.1 Lock re-entry (case 1)

If the phase error is within the dynamic range of the fine delay line, then the DLL re-enters the lock mode and the state machine stays in the FINE state. The dynamic range of the fine delay line is the range over which its delay can be increased or decreased without reaching the limit in either direction. The total delay of the fine delay line is (N × Tf), where N is the number of fine delay units in the chain and Tf is the delay of each fine unit. In this design N is 128 and the delay of each fine unit is 4 ps. The 4 ps is the difference between the input-to-output delays of the two inputs of a 2-input NAND gate in the library.

For example, if in lock mode the fine delay line's output is the middle tap of the chain, then the fine delay can be increased or decreased by half of the total delay of the fine delay line, or 256 ps. Any input phase error less than 256 ps is therefore compensated, and lock mode is resumed while the state machine is still in the FINE state. The simulation result is shown in Figure 4.5. The INCREASE_DELAY signal goes high for one clock, which increases the total delay by one fine unit delay, or 4 ps, and compensates for the added 6 ps input phase error. The remaining phase error is within the 4 ps resolution of the phase detector and the DLL is locked.

Figure 4.5 Lock re-entry mode waveform for small phase error
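The dynamic-range argument of this case can be checked numerically. The sketch below is a hedged illustration: it treats the fine line as N = 128 units of 4 ps, and counts `units_in_path` as the number of fine units currently selected, which is an assumed bookkeeping convention rather than the thesis's tap numbering. The function name `fine_correction` is hypothetical.

```python
def fine_correction(error_ps, units_in_path, taps=128, fine_ps=4):
    """Check whether a phase error can be absorbed without leaving the FINE state.

    error_ps > 0 means the delay must increase, error_ps < 0 that it must decrease.
    """
    if error_ps > 0:
        headroom_ps = (taps - units_in_path) * fine_ps  # delay that can still be added
    else:
        headroom_ps = units_in_path * fine_ps           # delay that can still be removed
    steps = abs(error_ps) // fine_ps                    # whole fine steps taken
    residual_ps = abs(error_ps) - steps * fine_ps       # error left after those steps
    return {
        "steps": steps,
        "residual_ps": residual_ps,
        "stays_in_fine": steps * fine_ps <= headroom_ps,
    }


small = fine_correction(6, units_in_path=64)     # the 6 ps example in the text
large = fine_correction(-500, units_in_path=64)  # a 500 ps error from the middle tap
print(small, large)
```

For the 6 ps example, one fine step leaves a 2 ps residual, below the 4 ps resolution. A 500 ps error exceeds the 256 ps available from the middle tap, so the coarse action is needed.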

4.3.2 Lock re-entry (case 2)

The phase error cannot be corrected by the fine action if the amount of error is larger than the dynamic range of the fine delay line. For example, if in the lock mode the fine delay's output tap is in the center of the fine delay line, then any phase error greater than half of the fine delay line, or 256 ps, cannot be corrected while the state machine is in the FINE state.

A phase error is generated if the input clock leads the output clock. The decrease_delay signal is enabled, and the fine delay line output tap shifts to the left until it reaches the first tap of the fine delay line. At this point the state machine moves to the DECREMENT state and the coarse action is enabled. At every clock, the total delay of the DLL is decremented by one unit delay, or 78 ps in the simulation. At a certain point, decrease_delay is deasserted and increase_delay is enabled. Then, the state machine moves to the FINE state and finally achieves the lock mode.

Figure 4.6 shows that the DLL originally locks at time 220 ns. A 500 ps glitch is applied at time 240 ns and the DLL locks again at time 550 ns. The final phase when the DLL locks again is shown in Figure 4.7. A 500 ps glitch is large enough that the DLL exits the lock mode and cannot re-lock within the dynamic range of the fine delay line, as described in lock re-entry (case 1). The introduced glitch is shown in Figure 4.8.

Figure 4.6 Lock re-entry mode waveform for a leading input clock


Figure 4.7 Lock re-entry mode waveform for a leading input clock (zoomed in)

Figure 4.8 Introduced glitch waveform for a leading input clock


A phase error is generated if the output clock leads the input clock. The increase_delay signal is enabled, and the fine delay line output tap shifts to the right until it reaches the last tap of the fine delay line. At this point, the state machine moves to the INCREMENT state and the coarse action is enabled. At every clock, the total delay of the DLL is incremented by one unit delay, or 78 ps in the simulation. At a certain point increase_delay is deasserted and decrease_delay is enabled. Then, the state machine moves to the FINE state and finally achieves the lock mode.

Figure 4.9 shows that the DLL originally locks at time 250 ns. A 2 ns glitch is applied at time 300 ns and the DLL locks again at time 1315 ns. The final phase when the DLL locks again is shown in Figure 4.10. The 2 ns input phase shift is introduced as a glitch, which causes the DLL to exit the lock mode and the LOCK_INDICATOR signal to go low, as shown in Figure 4.11.

Figure 4.9 Lock re-entry mode waveform for a leading output clock


Figure 4.10 Lock re-entry mode waveform for a leading output clock (zoomed in)

Figure 4.11 Introduced glitch waveform for a leading output clock


4.4 Gate count of the vernier unit delay

The proposed vernier unit delay line was mapped to a commercial 0.18 µm library. The total cell area is about 96 basic cells. The previously published unit delay [73] was also mapped to the same library, and its total cell area is about 122 basic cells. Therefore, the proposed unit delay saves about 20% of the gate count when implemented in the same library. The gate count reduction is significant considering that hundreds of unit delay blocks are needed in a typical delay line.

The static power consumption of a circuit is due to the leakage current and is proportional to the gate count. Therefore, the static power consumption of the delay line is reduced by 20%. The dynamic power consumption of the circuit depends not only on the gate size but also on the rate at which each gate is toggled in the circuit. The toggle rate is a function of the logic and the operating frequency. The dynamic power consumption can be measured using the dynamic test vectors that are generated during functional simulation.

Fabs provide practical formulas to estimate the dynamic power consumption. The general guideline is that dynamic power consumption increases proportionally with the gate count. Based on this rule of thumb, a 20% dynamic power saving is realized by the proposed delay line.
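The quoted savings follow from the two cell-area figures. The snippet below is a minimal check of that arithmetic; it assumes, as the rule of thumb above does, that both static and dynamic power scale in proportion to gate count, and the variable names are illustrative.

```python
proposed_cells = 96    # proposed vernier unit delay, in basic cells
previous_cells = 122   # previously published unit delay [73], in basic cells

# Relative reduction in gate count. Under the proportionality assumption,
# the same fraction applies to static and dynamic power.
saving = (previous_cells - proposed_cells) / previous_cells
print(f"{saving:.1%}")
```

The result is slightly above the rounded 20% figure used in the text.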

4.5 Resolution of the proposed DLL

The proposed vernier unit delay is based on the delay difference between the two inputs to the output of a dual-input NAND gate. The difference for a NAND gate in the 0.18 µm commercial library is measured at less than 10 ps (4 ps) in the functional simulation. The previously published unit delay [73] is based on the delay difference of a NAND gate with different fanout loads. The achieved resolution was one fifth of the delay of each unit block, i.e., about 20 ps if implemented in the same 0.18 µm library, considering that the delay of each unit delay block is 100 ps.

Therefore, the proposed design offers a 100% improvement in the resolution of the delay line (about 20 ps versus less than 10 ps). The higher resolution reduces the phase error between the output and input clocks of a DLL. At the same time, the cycle-to-cycle jitter is also reduced, because the output clock can be delayed by a smaller unit between two consecutive clock edges.

4.6 Limitations of the proposed DLL

The main limitation of the proposed DLL is that, depending on the phase error between the input and output clocks, it can take up to 128 clock cycles for the DLL to lock, which is considered relatively slow. For example, if the first output tap of the fine delay line is selected while the DLL is locked, then a 512 ps glitch at the input that causes the output clock to lead the input clock requires 128 input clock cycles before the DLL can lock again. The resolution of the fine delay line is 4 ps, so at every clock cycle the delay of the whole delay line is increased by 4 ps; therefore 128 input clock cycles are required to lock.

This example is considered the worst case, and normally the DLL locks in a shorter time. The thesis mainly concentrates on how to improve a DLL's resolution. Further research can be done to improve the lock time, for example by devising efficient algorithms to shorten the lock time period [33].
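The worst-case figure can be turned into a lock time for the two ends of the target frequency range. The sketch below is only an estimate under stated assumptions: one fine step per input clock cycle, and an idealized successive-approximation (binary) search over 128 taps for comparison. The binary-search count is an illustrative lower bound and is not a measured result from [33]; all function names are hypothetical.

```python
import math


def worst_case_linear_cycles(range_ps=512, fine_ps=4):
    """Clock cycles needed when the fine delay changes by one step per cycle."""
    return math.ceil(range_ps / fine_ps)


def lock_time_ns(cycles, clock_mhz):
    """Convert a cycle count to nanoseconds at the given clock frequency."""
    return cycles * 1000 / clock_mhz


def ideal_binary_search_steps(taps=128):
    """Decisions needed by an idealized binary search over the taps."""
    return math.ceil(math.log2(taps))


cycles = worst_case_linear_cycles()
print(cycles, lock_time_ns(cycles, 100), lock_time_ns(cycles, 200))
print(ideal_binary_search_steps())
```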


Chapter 5

Conclusion

Phase-locked loops (PLLs) and delay-locked loops (DLLs) have been widely adopted to solve the clock skew problem. In recent years, DLLs have been widely used for clock alignment due to their lower phase-error accumulation and faster locking time [35], [82]. A DLL is used in many other applications such as clock synthesis [2], [3], [6], clock recovery [14], [19], [25], SDRAM controllers [26], [47], [65], Automatic Test Equipment (ATE) [99] and Time to Digital Converters (TDC) [4], [22], [23].

The first DLLs were analog and mainly used for clock distribution applications [10], [13]. A conventional analog DLL consists of four main blocks: a voltage controlled delay line (VCDL), a charge pump, a low pass filter, and a phase detector. The simple design of the DLL offers many advantages when compared to VCO-based PLLs. It is still a relatively complex analog circuit, however, requiring process-specific implementation, which makes it very difficult to reuse the same design in a different technology. Basically, an analog DLL is a non-portable architecture, as major changes in the layout are required to port a design from one technology to another.

Digital DLLs are characterized by their use of digital delay lines. They are typically made from simple digital circuit elements. This simplicity helps in designing a portable digital DLL that can be easily adapted to different technologies. Although the digital DLL uses more area and power than the analog DLL, its greater simplicity and lower minimum required power supply voltage make it very attractive for many applications.

The Register Delay Locked Loop (RDLL) belongs to the digital DLL family and is widely used in high speed synchronous DRAM (SDRAM) applications [17], [51], [85], [90]. The RDLL consists of a tapped delay line, a shift register, a phase detector, and a replica input buffer dummy [85].

The Synchronous Mirror Delay (SMD) and Clock Synchronized Delay (CSD) circuits are non-feedback systems which can achieve lock in only two clock cycles [52], [88], [94]. Therefore, these circuits can be disabled in standby mode, and they can lock to the reference clock in just two clock cycles when the operation mode is resumed.

The latest DLLs use the Vernier principle, based on the Vernier caliper tool [83]. The Vernier technique implemented in the proposed design is based on the characteristics of a NAND gate and uses the delay difference between the inputs to the output of a dual-input NAND gate to achieve a fine step resolution. The previous technique [73] was based on the delay difference of a NAND gate with different fanout loads. The analysis in the previous chapter shows that the resolution of the DLL is doubled by the new technique implemented in the proposed design.

This thesis introduced a novel architecture for a high-resolution Vernier DLL with a resolution of less than 10 ps. It combines the coarse and fine unit delay blocks into one unit delay block in a way that effectively reduces the area of the delay line. This reduction is considered significant when taking into account the number of unit delay blocks required in a typical delay line. The combination of a smaller delay line and the integration of the fine and coarse controllers reduces the DLL power consumption. The analysis in the previous chapter shows that a 20% gate count reduction in the delay line is achieved by using the proposed unit delay block. It also shows that the total power consumed by the delay line is reduced by approximately 20%.

A testbench was written that exhaustively covers all the different operational modes of the DLL. The first two cases verify the general locking process after power-up and reset, considering both a leading and a lagging input clock with reference to the output clock. The other four cases verify the lock re-entry process when a glitch is applied to the input clock.

A linear control algorithm is used in this thesis to achieve lock mode. The controller linearly increases or decreases the total delay of the delay line. For faster lock time, a binary search algorithm is incorporated into the SARDLL [33]. This algorithm reduces the searching effort and speeds up the locking process. Various lock mechanisms can be explored in order to shorten the lock time of the DLL. This can be considered one of the topics for future research.


Bibliography

[1] T.Hamamoto, K.Furutani, T.Kubo, S.Kawasaki, H.Iga, T.Kono, Y.Konishi, T.Yoshihara, "A 667-Mb/s Operating Digital DLL Architecture for 512-Mb DDR SDRAM," IEEE J. Solid-State Circuits, vol. 39, NO.1, pp. 194-206, Jan 2004.

[2] C.C.Chung, C.Y.Lee, "A New DLL-Based Approach for All-Digital Multiphase Clock Generation," IEEE J. Solid-State Circuits, vol. 39, NO.3, pp. 469-471, Mar 2004.

[3] R.F.Rad, A.Nguyen, J.M.Tran, T.Greer, J.Poulton, W.J.Dally, J.H.Edmondson, R.Senthinathan, R.Rathi, M.E.Lee, H.T.Ng, "A 33-mw 8-Gb/s CMOS Clock Multiplier and CDR for Highly Integrated I/Os," IEEE J. Solid-State Circuits, vol. 39, NO.9, pp. 1553-1561, Sept 2004.

[4] C.S.Hwang, P.Chen, H.W.Tsao, "A High-Precision Time-to-Digital Converter Using a Two-Level Conversion Scheme," IEEE Transactions on Nuclear Science, vol. 51, NO.4, pp. 1349-1352, Aug 2004.

[5] A.H.Chan, G.W.Roberts, "A Jitter characterization system using a component-invariant Vernier delay line," Very Large Scale Integration (VLSI) Systems, IEEE Transactions on, vol. 12, NO.1, pp. 79-95, Jan 2004.

[6] C.S.Hwang, P.Chen, H.W.Tsao, "A wide-range and fast-locking clock synthesizer IP based on delay-locked-loop," ISCAS 2004, Proceedings of the 2004 International Symposium on, Vol.1, May 2004, pp. 352-361.

[7] K.Kim, N.Park, T.Kim, "An unlimited lock range DLL for clock generator," ISCAS 2004, Proceedings of the 2004 International Symposium on, Vol.1, May 2004, pp. 352-361.

[8] K.Cheng, Y.Lo, W.Fang, S.Hung, "A mixed-mode delay-locked loop for wide-range operation and multiphase clock generation," System-on-chip for Real-Time Applications, 2003 Proceedings, Jul 2003, pp. 90-93.

[9] A.Suzuki, S.Kawahito, D.Miyazaki, M.Furuta, "A digitally skew correctable multi-phase clock generator using a master-slave DLL," ISCAS 03, Proceedings of the 2003 International Symposium on, Vol.1, May 2003, pp. 105-108.

[10] K.Taesung, K.Beomsup, "Phase interpolator using delay locked loop," Mixed-Signal Design, 2003, Southwest Symposium on, Feb 2003, pp. 76-80.

[11] Z.Jingcheng, D.Qingjin, T.Kawasniewski, "A -107dBc, 10KHz Carrier offset 2-GHz DLL-based frequency synthesizer," Custom Integrated Circuits Conference, 2003, Proceedings of the IEEE 2003, Sept 2003, pp.301-304.

[12] G.Manganaro, S.Kwak, S.Bugeja, "A dual 10b 200MSPS pipeline D/A converter with DLL-based clock synthesizer," Custom Integrated Circuits Conference, 2003, Proceedings of the IEEE 2003, Sept 2003, pp.301-304.

[13] H.Chang, C.Sun, S.Liu, "A low-jitter and precise multiphase delay-locked loop using shifted averaging VCDL," in ISSCC 2003 Dig. Tech. Papers, Vol.1, 2003, pp. 434-505.

[14] W.Rhee, H.Ainspan, S.Rylov, A.Rylyakov, M.Beakes, D.Friedman, S.Gowada, M.Soyuer, "A 10-Gb/s CMOS clock and data recovery circuit using a secondary delay-locked-loop," Custom Integrated Circuits Conference, 2003, Proceedings of the IEEE 2003, Sept 2003, pp.81-84.

[15] G.Wei, J.Stonick, D.Weinlader, J.Sonntag, S.Searles, "A 500MHz MP/DLL Clock Generator for a 5Gb/s Backplane Transceiver in 0.25 CMOS," in ISSCC 2003 Dig. Tech. Papers, Vol.1, 2003, pp. 464-465.

[16] S.J.Kim, S.H.Hong, J.K.Wee, J.H.Ahn, J.Y.Chung, "A low Jitter, fast recoverable, fully analog DLL using tracking ADC for high speed and low stand-by power DDR I/O interface," VLSI Circuits, 2003, Digest of Technical Papers, 2003 Symposium on, June 2003, pp. 285-286.

[17] J.T.Kwak, C.K.Kwon, K.W.Kim, S.H.Lee, J.S.Kih, "A low cost high performance register-controlled digital DLL for 1 Gbps x32 DDR SDRAM," VLSI Circuits, 2003, Digest of Technical Papers, 2003 Symposium on, June 2003, pp. 283-284.

[18] K.H.Cheng, Y.L.Lo, W.F.Yu, S.Y.Hung, "A mixed-mode delay-locked loop for wide-range operation and multiphase clock generation," System-on-Chip for Real-Time Applications, 2003, Proceedings, The 3rd IEEE International Workshop on, Jul 2003, pp. 90-93.

[19] Z.Mao, T.H.Szymansli, "A 4Gb/s CMOS fully-differential analog dual delay-locked loop clock/data recovery circuit," Electronics, Circuit and Systems, 2003, ICECS 2003, Proceedings of the 2003 10th IEEE International Conference on, Vol.2, Dec 2003, pp. 559-562.

[20] M.E.Lee, W.J.Dally, T.Greer, H.T.Ng, R.F.Rad, J.Poulton, R.Senthinathan, "Jitter Transfer Characteristics of Delay-Locked Loops, Theories and Design Techniques," IEEE J. Solid-State Circuits, vol. 38, NO.4, pp. 614-621, Apr 2003.

[21] T.Matano, Y.Takai, T.Takahashi, Y.Sakito, I.Fujii, Y.Takaishi, H.Fujisawa, S.Kubouchi, S.Narui, K.Arai, M.Morino, M.Nakamura, S.Miyatake, T.Sekiguchi, K.Koyama, "A 1-Gb/s/pin 512-Mb DDRII SDRAM Using a Digital DLL and a Slew-Rate-Controlled Output Buffer," IEEE J. Solid-State Circuits, vol. 38, NO.5, pp. 762-768, May 2003.

[22] S.Tabatabaei, A.Ivanov, "Embedded Timing Analysis: A SOC Infrastructure," IEEE Design & Test Of Computers, vol. 19, NO.3, pp. 24-36, June 2002.

[23] S.Tabatabaei, A.Ivanov, "An embedded core for Sub-Picosecond timing measurements," Test Conference, 2002, Proceedings of ITC International, pp. 129-137, Oct 2002.

[24] A.H.Chan, G.W.Roberts, "A deep sub-micron timing measurement circuit using a single-stage Vernier delay line," Custom Integrated Circuits Conference, 2002, Proceedings of the IEEE 2002, May 2002, pp.77-80.

[25] X.Millard, F.Devisch, M.Kuijk, "A 900-Mb/s CMOS Data Recovery DLL using Half-Frequency Clock," IEEE J. Solid-State Circuits, vol. 37, NO.6, pp. 711-715, June 2002.

[26] S.J.Kim, S.H.Hong, J.K.Wee, J.H.Cho, P.S.Lee, J.H.Ahn, J.Y.Chung, "A Low-Jitter Wide-Range Slew-Calibrated Dual-Loop DLL Using Antifuse Circuitry for High-Speed DRAM," IEEE J. Solid-State Circuits, vol. 37, NO.6, pp. 726-734, June 2002.

[27] R.F.Rad, W.Dally, H.T.Ng, R.Senthinathan, M.E.Lee, R.Rathi, J.Poulton, "A Low-Power Multiplying DLL for Low-Jitter Multigigahertz Clock Generation in Highly Integrated Digital Chips," IEEE J. Solid-State Circuits, vol. 37, NO.12, pp. 1804-1812, Dec 2002.

[28] C.Kim, I.C.Hwang, S.M.Kang, "A Low-Power Small-Area +/-7.28-ps-Jitter 1-GHz DLL-Based Clock Generator," IEEE J. Solid-State Circuits, vol. 37, NO.11, pp. 1414-1420, Nov 2002.

[29] Y.J.Jung, S.W.Lee, D.Shim, W.Kim, C.Kim, S.I.Cho, "A Dual-Loop Delay-Locked Loop Using Multiple Voltage-Controlled Delay Lines," IEEE J. Solid-State Circuits, vol. 36, NO.5, pp. 784-791, May 2001.

[30] D.J.Foley, M.Flynn, "CMOS DLL-Based 2-V 3.2-ps Jitter 1-GHz Clock Synthesizer and Temperature-Compensated Tunable Oscillator," IEEE J. Solid-State Circuits, vol. 36, NO.3, pp. 417-423, Mar 2001.

[31] G.K.Dehng, J.W.Lyn, S.I.Liu, "A Fast-Lock Mixed-Mode DLL Using a 2-b SAR algorithm," IEEE J. Solid-State Circuits, vol. 36, NO.10, pp. 1464-1471, Oct 2001.

[32] J.B.Lee, K.H.Kim, C.Yoo, S.Lee, O.G.Na, C.Y.Lee, H.Y.Song, J.S.Lee, Z.H.Lee, K.W.Yeom, H.J.Chung, I.W.Seo, M.S.Chae, Y.H.Choi, S.I.Cho, "Digitally-Controlled DLL and I/O Circuits for 500Mb/S/Pin x16 DDR SDRAM," ISSCC Dig, Tech. Papers, Feb 2001, pp.68-70.

[33] G.K.Dehng, J.M.Hsu, C.Y.Yang, S.I.Liu, "Clock-Deskew Buffer Using a SAR-Controlled Delay-Locked Loop," IEEE J. Solid-State Circuits, vol. 35, pp. 1128-1136, Aug 2000.

[34] Y.Moon, J.Choi, K.Lee, D.K.Jeong, and M.K.Kim, "An All-Analog Multiphase Delay-Locked Loop Using a Replica Delay Line for Wide-Range Operation and Low-Jitter Performance," IEEE J. Solid-State Circuits, vol.35, pp. 377-384, Mar 2000.

[35] H.Lee, H.Q.Nguyen, D.W.Potter, "Design Self-Synchronized Clock Distribution Networks In An SOC ASIC Using DLL With Remote Clock Feedback," ASIC/SOC Conference, 2000, Proceedings, 13th Annual IEEE International, Sept 2000, pp.248-252.

[36] P.Dudek, S.Szczepanski, J.V.Hatfield, "A High-Resolution CMOS Time-to-Digital Converter Utilizing a Vernier Delay Line," Solid-State Circuits, IEEE Transactions on, vol 35, NO.2, pp. 240-247, Feb 2000.

[37] K.Minami, M.Mizuno, H.Yamaguchi, T.Nakano, Y.Matsushima, Y.Sumi, T.Sato, H.Yamashida, M.Yamashina, "A 1GHz Portable Digital Delay-Locked Loop with infinite Phase Capture Ranges," ISSCC Dig, Tech. Papers, Feb 2000, pp.350-351.

[38] Y.J.Jung, S.W.Lee, D.Shim, W.Kim, C.H.Kim, S.I.Cho, "A low Jitter Dual Loop DLL using Multiple VCDLs with a Duty Cycle Corrector," VLSI Circuits, 2000, Digest of Technical Papers, 2000 Symposium on, pp. 50-51.

[39] D.J.Foley, M.P.Flynn, "CMOS DLL Based 2V, 3.2ps Jitter, 1GHz Clock Synthesizer and Temperature Compensated Tunable Oscillator," Custom Integrated Circuits Conference, 2002, Proceedings of the IEEE 2002, May 2000, pp.371-374.

[40] S.S.Hwang, K.M.Joo, H.J.Park, J.W.Kim, P.Chung, "A DLL based 10-320 MHz Clock Synchronizer," ISCAS 2000, Proceedings of the 2000 International Symposium on, Vol.1, May 2000, pp.265-268.

[41] D.J.Foley, M.P.Flynn, "A 3.3V, 1.6GHz, Low-Jitter, Self-Correcting DLL Based Clock Synthesizer in 0.5 CMOS," ISCAS 2000, Proceedings of the 2000 International Symposium on, Vol.1, May 2000, pp.249-252.

[42] G.Chien, P.R.Gray, "A 900-MHz Local Oscillator Using a DLL-Based Frequency Multiplier Technique for PCS applications," IEEE J. Solid-State Circuits, vol.35, NO.12, pp. 1996-1999, Oct 2000.

[43] J.H.Lee, S.H.Han, H.J.Yoo, "A 330MHz Low-Jitter and Fast-Locking Direct Skew Compensation DLL," ISSCC Dig, Tech. Papers, Feb 2000, pp.352-353.

[44] S.Kuge, T.Kato, K.Furutani, S.Kikuda, K.Mitsui, T.Hamamoto, J.Setogawa, K.Hamade, Y.Komiya, S.Kawasaki, T.Kono, T.Amano, T.Kubo, M.Haraguchi, Y.Nakaoka, M.Akiyama, Y.Konishi, H.Ozaki, T.Yoshihara, "A 0.18 256-Mb DDR-SDRAM with Low-Cost Post-Mold Tuning Method for DLL Replica," IEEE J. Solid-State Circuits, vol.35, NO.11, pp. 1680-1689, Nov 2000.

[45] S.S.Hwang, "Dual-Loop DLL-based clock synthesizer," Electronics Letters, vol 36, NO.14, pp. 1173-1174, Jul 2000.

[46] T.Hamamoto, S.Kawasaki, K.Furutani, K.Yasuda, Y.Konishi, "A skew and jitter suppressed DLL architecture for high frequency DDR SDRAMs," VLSI Circuits, 2000, Digest of Technical Papers, 2000 Symposium on, Mar 2000, pp. 76-77.

[47] J.J.Kim, S.B.Lee, T.S.Jung, C.H.Kim, S.I.Cho, B.Kim, "A Low-Jitter Mixed-Mode DLL for High-Speed DRAM Applications," IEEE J. Solid-State Circuits, vol.35, NO.10, pp. 1430-1436, Oct 2000.

[48] C.S.Hwang, W.C.Chung, C.Y.Wang, H.W.Tsao, S.I.Liu, "A 2V Clock Synthesizer using Digital Delay-Locked Loop," ASIC, 2000, Proceeding, 2002 IEEE Asia-Pacific Conference on, Aug 2000, pp.91-94.

[49] S.Eto, H.Akita, K.Isobe, K.Tsuchida, H.Toda, T.Seki, "A 333MHz, 20mW, 18ps Resolution Digital DLL using Current-Controlled Delay with Parallel Variable Resistor DAC (PVR-DAC)," ASIC, 2000, Proceeding, 2002 IEEE Asia-Pacific Conference on, Aug 2000, pp.349-350.

[50] H.Yoon, G.Cha, C.Yoo, N.J.Kim, K.Y.Kim, C.H.Lee, K.N.Lim, K.Lee, J.Y.Jeon, T.S.Jung, H.Jeong, T.Y.Chung, K.Kim and S.I.Cho, "A 2.5-V, 333-Mb/s/pin, 1-Gbit, Double-Data-Rate Synchronous DRAM," IEEE J. Solid-State Circuits, vol. 34, NO.11, pp. 1589-1599, Nov 1999.

[51] F.Lin, J.Miller, A.Schoenfeld, M.Ma, and R.J.Baker, "A Register-Controlled Symmetrical DLL for Double-Data-Rate DRAM," IEEE J. Solid-State Circuits, vol. 34, pp. 565-568, Apr 1999.

[52] T.Saeki, K.Minami, H.Yoshida, H.Suzuki, "A Direct-Skew-Detect Synchronous Mirror Delay for Application-Specific Integrated Circuits," IEEE J. Solid-State Circuits, vol. 34, pp. 372-379, Mar 1999.

[53] W.Rhee, A.Ali, "An On-Chip Phase compensation technique in fractional-N-frequency synthesis," ISCAS 1999, Proceedings of the 1999 International Symposium on, Vol.3, June 1999, pp.363-366.

[54] A.Mantyniemi, T.Rahkonen, J.Kostamovaara, "A High Resolution digital CMOS Time-To-Digital converter based on nested Delay Locked Loops," ISCAS 1999, Proceedings of the 1999 International Symposium on, Vol.2, June 1999, pp.537-540.

[55] S.Nagavarapu, J.Yan, E.K.F.Lee, R.L.Geiger, "An asynchronous data recovery/retransmission technique with foreground DLL calibration," ISCAS 1999, Proceedings of the 1999 International Symposium on, Vol.6, June 1999, pp.354-357.

[56] R.L.Aguiar, D.M.Santos, "Simulation and modeling of digital Delay Locked Loops," ISCAS 1999, 42nd Midwest Symposium On, Vol.2, Aug 1999, pp.843-846.

[57] R.L.Aguiar, D.M.Santos, "Modeling Charge-Pump Delay Locked Loops," ICECS 1999, The 6th IEEE International Conference On, Vol.2, Sept 1999, pp.823-826.

[58] S.H.Han, J.H.Lee, H.J.Yoo, "A fast lock-on time Mixed Mode DLL with 10ps jitter," VLSI and CAD, 1999, ICVC 1999, The 6th IEEE International Conference On, Oct 1999, pp.564-565.

[59] M.Miyazaki, K.Ishibashi, "A 3-Cycle lock time Delay-Locked Loop with a parallel phase detector for low power mobile systems," ASICs, 1999, AP-ASIC 1999, The First IEEE Asia Pacific Conference On, Aug 1999, pp.396-399.

[60] Y.S.Song, J.K.Kang, "A Delay-Locked Loop circuit with Mixed-Mode tuning," ASICs, 1999, AP-ASIC 1999, The First IEEE Asia Pacific Conference On, Aug 1999, pp.347-350.

[61] P.D.Capofreddi, C.D.Baringer, J.F.Jenson, M.J.W.Rodwell, W.P.Posey, M.W.Yung, Y.M.Xie, "A Clock and Data recovery IC for communications and radar applications," Design Of Mixed-Mode Integrated Circuits and Applications, 1999, Third International Workshop On, Jul 1999, pp.88-90.

[62] T.Toifi, R.Vari, P.Moreira, A.Marchioro, "4-Channel Rad-Hard Delay Generation ASIC with 1ns Timing Resolution for LHC," Nuclear Science, IEEE Transactions On, Vol.46, NO.3, June 1999, pp.423-427.

[63] J.Park, Y.Koo, W.Kim, "A Semi-Digital Delay-Locked Loop for clock skew minimization," VLSI Design, 1999, Proceedings of 12th International Conference On, Jan 1999, pp.584-588.

[64] A.Balatsos, D.Lewis, "Low-Skew clock generator with dynamic impedance and delay matching," ISSCC Dig, Tech. Papers, Feb 1999, pp. 182-183.

[65] L.Paris, J.Benzreba, P.Demone, M.Dunn, L.Falkenhagen, P.Gillingham, I.Harrison, W.He, D.Macdonald, M.Macintosh, B.Millar, K.Wu, H.J.Oh, J.Stender, V.Chen, J.Wu, "A 800MB/s 72Mb SLDRAM with digitally calibrated DLL," ISSCC Dig, Tech. Papers, Feb 1999, pp.414-415.

[66] Y.Moon, D.K.Jeong, "A 1Gbps transceiver with Receiver-End deskewing capability using Non-Uniform Tracked Oversampling and a 250-750 MHz Four-Phase DLL," 1999 Symposium On VLSI Circuits, Dig, Tech. Papers, pp.47-48.

[67] F.Mu, A.Edman, C.Sevenson, "Digital Multiphase Clock/Pattern Generator," IEEE J. Solid-State Circuits, vol.34, NO.2, pp. 182-191, Feb 1999.

[68] S.I.Liu, J.H.Lee, H.W.Tsao, "Low-Power Clock-Deskew Buffer for High-Speed Digital Circuits," IEEE J. Solid-State Circuits, vol.34, NO.4, pp. 554-558, Apr 1999.

[69] M.Mota, J.Christiansen, "A High-Resolution Time Interpolator Based on a Delay Locked Loop and an RC Delay Line," IEEE J. Solid-State Circuits, vol.34, NO.10, pp. 1360-1366, Oct 1999.

[70] Y.Nakase, Y.Morooka, D.J.Perlman, D.J.Kolar, J.M.Choi, H.J.Shin, T.Yoshimura, N.Watanabe, Y.Matsuda, M.Kumanoya, M.Yamada, "Source-Synchronization and Timing Vernier Techniques for 1.2-GB/s SLDRAM interface," IEEE J. Solid-State Circuits, vol.34, NO.4, pp. 494-501, Apr 1999.

[71] W.Bruno, K.S.Donnelly, J.Kim, P.S.Chau, J.L.Zerbe, C.Huang, C.V.Tran, C.L.Portmann, D.Stark, Y.F.Chan, T.H.Lee, M.A.Horowitz, "A Portable Digital DLL for High-Speed CMOS Interface Circuits," IEEE J. Solid-State Circuits, vol.34, NO.5, pp. 632-644, May 1999.

[72] C.Kim, H.K.Kyung, W.P.Jeong, J.S.Kim, B.S.Moon, J.W.Chai, S.M.Yim, J.H.Choi, K.H.Han, C.J.Park, H.S.Hwang, H.Choi, S.B.Cho, C.L.Portmann, S.I.Cho, "A 2.5-V, 72-Mbit, 2.0-GByte/s Packet-Based DRAM with a 1.0-Gbps/pin Interface," IEEE J. Solid-State Circuits, vol.34, NO.5, pp. 645-652, May 1999.

[73] S.Eto, M.Matsumiya, M.Takita, Y.Ishii, T.Nakamurra, K.Kawabata, H.Kano, A.Kitamoto, T.Ikeda, T.Koga, M.Higashiro, Y.Serizawa, K.Itabashi, O.Tsuboi, Y.Yokoyama, and M.Taguchi, "A 1 Gb SDRAM with ground level precharged bit-line and non-boosted 2.1V word line," IEEE J. Solid-State Circuits, vol. 33, NO.11, pp. 1697-1702, Nov 1998.

[74] M.Hasegawa, M.Nakamura, S.Narui, S.Ohkuma, Y.Kawase, H.Endoh, S.Miyatake, T.Akiba, K.Kawakita, M.Yoshida, S.Yamada, T.Sekigguchi, I.Asano, Y.Tadaki, R.Nagai, S.Miyako, K.Kajigaya, M.Horiguchi, and Y.Nakagome, "A 256 Mb SDRAM with subthreshold leakage current suppression," in ISSCC 1998 Dig. Tech. Papers, Feb 1998, pp. 80-81.

[75] C.H.Kim, J.H.Lee, J.B.Lee, B.S.Kim, C.S.Park, S.B.Lee, S.Y.Lee, C.W.Park, J.G.Roh, H.S.Nam, D.G.Kim, D.Y.Lee, T.S.Jung, H.Yoon, S.I.Cho, "A 64-Mbit, 640-MByte/s bidirectional data strobed, Double-Data-Rate SDRAM with a 40-mW DLL for a 256-MByte memory system," IEEE J. Solid-State Circuits, vol.33, NO.11, pp. 1703-1710, Nov 1998.

[76] B.S.Kim, L.S.Kim, "100 MHz all-digital Delay-Locked Loop for low power application," Electronics Letters, vol 34, NO.18, pp. 1739-1740, Sept 1998.

[77] S.J.Jang, S.H.Han, C.S.Kim, Y.H.Jun, H.J.Yoo, "A compact ring delay line for high speed synchronous DRAM," VLSI Circuits, 1998, Digest of Technical Papers, 1998 Symposium on, pp. 60-61.

[78] B.W.Garlepp, K.S.Donnelly, J.Kim, P.S.Chau, J.L.Zerbe, C.Huang, C.V.Tran, C.L.Portmann, D.Stark, Y.F.Chan, T.H.Lee, M.A.Horwitz, "A portable digital DLL architecture for CMOS interface circuits," VLSI Circuits, 1998, Digest of Technical Papers, 1998 Symposium on, pp. 214-215.

[79] T.Yushimura, Y.Nakase, N.Watanabe, Y.Morooka, Y.Matsuda, M.Kumanoya, H.Hamano, "A Delay-Locked Loop and 90-degree phase shifter for 800Mbps Double Data Rate memories," VLSI Circuits, 1998, Digest of Technical Papers, 1998 Symposium on, pp. 66-67.

[80] D.Birru, "A novel Delay-Locked Loop based CMOS clock multiplier," IEEE J. Solid-State Circuits, vol.44, NO.4, pp. 1319-1322, Nov 1998.

[81] M.Mota, J.Christiansen, "A four channel, self-calibrating, high resolution, Time To Digital Converter," Electronics, Circuits and Systems, 1998 IEEE International Conference On, vol.1, pp. 409-412, Sept 1998.

[82] R.L.Aguiar, D.M.Santos, "Wide-Area clock distribution using controlled delay lines," Electronics, Circuits and Systems, 1998 IEEE International Conference On, vol.2, pp. 63-66, Sept 1998.

[83] M.S.Gorbics, J.Kelly, K.M.Roberts and R.L.Sumner, "A High Resolution Multihit Time to Digital Converter Integrated Circuit," IEEE Transactions on Nuclear Science, vol 44, pp. 379-384, June 1997.

[84] S.Sidiropoulos, M.Horwitz, "A Semidigital Dual Delay-Locked Loop," IEEE J. Solid-State Circuits, vol. 32, pp. 1683-1692, Nov 1997.

[85] A.Hatakeyama, H.Mochizuki, T.Aikawa, M.Takita, Y.Ishii, H.Tsuboi, S.Y.Fujioka, S.Yamaguchi, M.Koga, Y.Serizawa, K.Nishimura, K.Kawabata, Y.Okajima, M.Kawano, H.Koima, K.Mizutani, T.Anezaki, M.Hasegawa, and M.Taguchi, "A 256-Mb SDRAM using a register-controlled digital DLL," IEEE J. Solid-State Circuits, vol. 32, pp. 1728-1732, Nov 1997.

[86] G.C.Moyer, M.Clements, W.Liu, T.Schaffer, R.K.Cavin, "The Delay Vernier pattern generation technique," IEEE J. Solid-State Circuits, vol.32, NO.4, pp. 551-562, Apr 1997.

[87] K.Gotch, S.Wakayama, M.Saito, J.Ogawa, H.Tamura, Y.Okajima, M.Taguchi, "All-Digital Multi-Phase Delay Locked Loop for internal timing generation in embedded and/or high speed DRAMs," VLSI Circuits, 1997, Digest of Technical Papers, 1997 Symposium on, pp. 107-108.

[88] T.Saeki, H.Nakamura, J.Shimizu, "A 10ps jitter 2 clock cycle lock time CMOS digital clock generator based on an interleaved synchronous mirror delay scheme," VLSI Circuits, 1997, Digest of Technical Papers, 1997 Symposium on, pp. 109-110.

[89] S.Sidiropoulos, M.Horwitz, "A Semi-Digital DLL with unlimited phase shift capability and 0.08-400MHz operating range," ISSCC Dig, Tech. Papers, Feb 1997, pp.332-333.

[90] A.Hatakeyama, H.Mochizuki, T.Aikawa, M.Takita, Y.Ishi, H.Tsuboi, S.Fujioka, S.Yamaguchi, M.Koga, Y.Serizawa, K.Nishima, K.Kawabata, Y.Okajima, M.Kawano, H.Kojima, K.Mizutani, T.Anezaki, M.Hasegawa, M.Taguchi, "A 256Mb SDRAM using a Register-Controlled Digital DLL," ISSCC Dig, Tech. Papers, Feb 1997, pp.72-73.

[91] S.Gogaert, M.Steyaert, "A skew tolerant CMOS level-based ATM data-recovery system without PLL topology," Custom Integrated Circuits Conference, 1997, Proceedings of the IEEE 1997, Sept 1997, pp.453-456.

[92] B.S.Kim, L.S.Kim, "A low power 100MHz All Digital Delay-Locked Loop," ISCAS 2004, Proceedings of the 1997 International Symposium on, Vol.1, May 1997, pp. 1820-1823.

[93] V.Lines, M.A.Scido, C.Mar, A.Achyuthan, "High speed circuit techniques in a 150MHz 64M SDRAM," Memory Technology, Design and Testing, 1997, Proceedings International Workshop On, Aug 1997, pp.8-11.

[94] T.Saeki, Y.Nakaoka, M.Fujita, A.Tanaka, K.Nagata, K.Sakakibara, T.Matano, Y.Hoshino, K.Miyano, S.Isa, E.Kakehashi, J.Drynan, M.Komuro, T.Fukase, H.Iwasaki, J.Sekine, M.Igeta, N.Nakanishi, T.Itani, K.Yoshida, H.Yoshina, S.Hashimoto, T.Yshii, M.Ichinose, T.Imura, M.Uziie, K.Koyama, Y.Fukuzo, and T.Okuda, "A 2.5 ns clock access 250 MHz 256 Mb SDRAM with synchronous mirror delay," ISSCC 1996 Dig. Tech. Papers, Feb 1996, pp. 374-375.

[95] A.Chau, D.Deusschere, S.Dow, J.Flasck, M.E.Levi, F.Kristen, E.Su, "A Multi-Channel Time-to-Digital converter chip for drift chamber readout," Nuclear Science, IEEE Transactions On, Vol.43, NO.3, June 1996, pp. 1720-1724.

[96] D.M.Santos, S.F.Dow, M.E.Levi, "A CMOS Delay-Locked Loop and Sub-Nanosecond Time-to-Digital converter chip," Nuclear Science, IEEE Transactions On, Vol.43, NO.3, June 1996, pp.1717-1719.

[97] J.Christiansen, "An Integrated High Resolution CMOS Timing Generator Based on an Array of Delay Locked Loops," IEEE J. Solid-State Circuits, vol.31, NO.7, pp. 952-957, Jul 1996.

[98] S.Tanoi, T.Tanabe, K.Takahashi, S.Miyamoto, M.Uesugi, "A 250-622 MHz Deskew and Jitter-Suppressed Clock Buffer Using Two-Loop Architecture," IEEE J. Solid-State Circuits, vol.31, NO.4, pp. 487-493, Apr 1996.

[99] J.Chapman, J.Currin, S.Payne, "A Low-Cost High-Performance CMOS Timing Vernier for ATE," Test Conference, 1995, Proceedings International, pp. 459-468, Oct 1995.

[100] R.F.Ormondroyd, "The acquisition performance of Delay-Locked Loops in noise," Radio Receivers and Associated Systems, Sept 1995, pp.192-197.

[101] E.R.Ruotsalainen, T.Rahkonen, J.Kostamovaara, "A Low-Power CMOS Time-to-Digital converter," IEEE J. Solid-State Circuits, vol.30, NO.9, pp. 984-990, Sept 1995.

[102] H.Sutoh, K.Yamakoshi, M.Ino, "Circuit technique for Skew-Free Clock distribution," Custom Integrated Circuits Conference, 1995, Proceedings of the IEEE 1995, Sept 1995, pp.163-166.

[103] B.Kim, T.C.Weigandt, P.R.Gray, "PLL/DLL system noise analysis for Low-Jitter Clock synthesizer design," ISCAS 1994, Proceedings of the 1994 International Symposium on, Vol.4, June 1994, pp.31-34.

[104] T.Lee, "A 2.5 V CMOS delay-locked loop for an 18 Mbit, 500 MB/s DRAM," IEEE J. Solid-State Circuits, vol. 29, pp. 1491-1496, Dec 1994.

[105] M.Izzard, "Analog versus digital control of a clock synchronizer for a 3 Gb/s data with 3.0 V differential ECL," in Dig. Tech, Papers 1994 Symp. VLSI Circuits, June 1994, pp. 39-40.

[106] C.Ljuslin, J.Christiansen, A.Marchioro, O.Klingsheim, "An integrated 16 channel CMOS Time-to-Digital converter," Nuclear Science, IEEE Transactions On, Vol.41, NO.4, Aug 1994, pp.1104-1108.

[107] A.Waizman, "A Delay Line Loop for frequency synthesis of De-Skewed Clock," ISSCC Dig, Tech. Papers, Feb 1994, pp.298-299.

[108] T.Kuroda, T.Fujita, S.Mita, T.Mori, K.Matsuo, M.Kakumu, T.Sakurai, "Substrate noise influence on circuit performance in variable threshold-voltage scheme," IEEE J. Solid-State Circuits, vol.29, NO.3, pp. 309-312, Mar 1994.

[109] M.Ramezani, C.A.T.Salama, "An improved Bang-Bang phase detector for clock and data recovery applications," ISCAS, vol.1, NO.3, pp. 715-718, 1994.

Appendix A

Design VHDL code

library ieee;
use ieee.std_logic_1164.all;
-- library vst_n18_sc_tsm_c4_wc;
-- use vst_n18_sc_tsm_c4_wc.components.all;
-- library tpz973gtc;
-- use tpz973gtc.components.all;

entity vernier_unit_delay is
  port (
    coarse_control   : in  std_logic := '0';  -- control line for the coarse chain
    fine_control     : in  std_logic := '0';  -- control line for the fine chain
    fine_control_inv : in  std_logic := '0';  -- inverted version of fine_control
    coarse_input     : in  std_logic := '0';  -- input to coarse chain
    fine_input       : in  std_logic := '0';  -- input to fine chain
    vernier_input    : in  std_logic := '0';  -- input from previous stage
    coarse_output    : out std_logic := '0';  -- output of coarse chain
    fine_output      : out std_logic := '0';  -- output of fine chain
    vernier_output   : out std_logic := '0';  -- output to next stage
    clk_output       : out std_logic := '0'); -- output of vernier unit
end vernier_unit_delay;

architecture structural of vernier_unit_delay is

  signal A, B, coarse_output_int, fine_output_int : std_logic := '0';
  signal logic_one : std_logic := '1';

  component NAN2D0
    port (
      Z  : out STD_LOGIC;
      A1 : in  STD_LOGIC;
      A2 : in  STD_LOGIC);
  end component;

  component BUFTD1
    port (
      Z   : out STD_LOGIC;
      A   : in  STD_LOGIC;
      ENB : in  STD_LOGIC);
  end component;

  component NAN2M1D1
    port (
      Z  : out STD_LOGIC;
      A1 : in  STD_LOGIC;
      A2 : in  STD_LOGIC);
  end component;

begin  -- structural

  U1: NAN2D0   port map (A, logic_one, fine_input);
  U2: NAN2D0   port map (fine_output_int, A, logic_one);
  U3: NAN2D0   port map (B, coarse_input, vernier_input);
  U4: NAN2D0   port map (coarse_output_int, B, fine_control_inv);
  U5: NAN2M1D1 port map (vernier_output, fine_output_int, fine_control);
  U6: BUFTD1   port map (clk_output, coarse_output_int, coarse_control);

  fine_output   <= fine_output_int;
  coarse_output <= coarse_output_int;
  logic_one     <= '1';

end structural;

entity vernier_delay_line is
  generic (
    N : integer := 128);  -- number of delay elements
  port (
    delay_line_output : out std_logic := '0';
    delay_line_input  : in  std_logic := '0';
    fine_control      : in  std_logic_vector(N-1 downto 0) := (others => '0');
    fine_control_inv  : in  std_logic_vector(N-1 downto 0) := (others => '0');
    coarse_control    : in  std_logic_vector(N-1 downto 0) := (others => '0'));
end vernier_delay_line;

architecture structural of vernier_delay_line is

  signal fine, coarse, vernier : std_logic_vector(N-1 downto 1) := (others => '0');
  signal logic_one : std_logic := '1';

  component vernier_unit_delay
    port (
      coarse_control   : in  std_logic;  -- control line for the coarse chain
      fine_control     : in  std_logic;  -- control line for the fine chain
      fine_control_inv : in  std_logic;  -- inverted version of fine_control
      coarse_input     : in  std_logic;  -- input to coarse chain
      fine_input       : in  std_logic;  -- input to fine chain
      vernier_input    : in  std_logic;  -- input from previous stage
      coarse_output    : out std_logic;  -- output of coarse chain
      fine_output      : out std_logic;  -- output of fine chain

      vernier_output   : out std_logic;  -- output to next stage
      clk_output       : out std_logic); -- output of vernier unit
  end component;

begin  -- structural

  chain: for i in 0 to N-1 generate

    last_unit: if (i = 0) generate
      D1: vernier_unit_delay port map (
        coarse_control(0), fine_control(0), fine_control_inv(0),
        coarse(1), fine(1), vernier(1),
        open, open, open, delay_line_output);
    end generate last_unit;

    middle_units: if (i > 0 and i < N-1) generate
      D1: vernier_unit_delay port map (
        coarse_control(i), fine_control(i), fine_control_inv(i),
        coarse(i+1), fine(i+1), vernier(i+1),
        coarse(i), fine(i), vernier(i), delay_line_output);
    end generate middle_units;

    first_unit: if (i = N-1) generate
      D1: vernier_unit_delay port map (
        coarse_control(N-1), fine_control(N-1), fine_control_inv(N-1),
        delay_line_input, delay_line_input, logic_one,
        coarse(N-1), fine(N-1), vernier(N-1), delay_line_output);
    end generate first_unit;

  end generate chain;

  logic_one <= '1';
  delay_line_output <= 'H';  -- Should be commented for synthesis

end structural;

entity vernier_controller is
  generic (
    N : integer := 128);  -- number of delay elements
  port (
    reset            : in  std_logic := '0';
    register_clock   : in  std_logic := '0';
    increase_delay   : in  std_logic := '0';
    decrease_delay   : in  std_logic := '0';
    coarse_control   : out std_logic_vector(N-1 downto 0) := (others => '0');
    fine_control     : out std_logic_vector(N-1 downto 0) := (others => '0');
    fine_control_inv : out std_logic_vector(N-1 downto 0) := (others => '0'));
end vernier_controller;

architecture behavior of vernier_controller is

  signal coarse_load_data : std_logic_vector(N-1 downto 0) := (others => '0');

  signal fine_load_data     : std_logic_vector(N-1 downto 0) := (others => '0');
  signal fine_control_int   : std_logic_vector(N-1 downto 0) := (others => '0');
  signal coarse_control_int : std_logic_vector(N-1 downto 0) := (others => '0');
  signal coarse_enable : std_logic := '0';
  signal fine_enable   : std_logic := '0';
  signal logic_zero    : std_logic := '0';
  signal logic_one     : std_logic := '1';

  type state_type is (IDLE, INCREMENT, DECREMENT, FINE);
  signal nextstate : state_type;
  signal state     : state_type;

begin

  -- next-state logic (combinational): sensitive to the current state and inputs
  process (state, increase_delay, decrease_delay, fine_control_int)
  begin
    case state is
      when IDLE =>
        if increase_delay = '1' then
          nextstate <= INCREMENT;
        elsif decrease_delay = '1' then
          nextstate <= DECREMENT;
        else
          nextstate <= IDLE;
        end if;

      when INCREMENT =>
        if decrease_delay = '1' then
          nextstate <= DECREMENT;
        else
          nextstate <= INCREMENT;
        end if;

      when DECREMENT =>
        if increase_delay = '1' then
          nextstate <= FINE;
        else
          nextstate <= DECREMENT;
        end if;

      when FINE =>
        if increase_delay = '1' and fine_control_int(0) = '1' then
          nextstate <= INCREMENT;
        elsif decrease_delay = '1' and fine_control_int(N-2) = '1' then
          nextstate <= DECREMENT;
        else
          nextstate <= FINE;
        end if;

    end case;
  end process;

  -- state register with active-low asynchronous reset
  process (reset, register_clock)
  begin
    if reset = '0' then
      state <= IDLE;
    elsif (register_clock'event and register_clock = '1') then
      state <= nextstate;
    end if;
  end process;

  -- coarse control shift register
  process (reset, register_clock)
  begin
    if reset = '0' then
      coarse_control_int <= coarse_load_data;

    elsif (register_clock'event and register_clock = '1') then
      if increase_delay = '1' and coarse_enable = '1' then
        rightshift: for i in 0 to N-2 loop
          coarse_control_int(i) <= coarse_control_int(i+1);
        end loop;
        coarse_control_int(N-1) <= logic_one;
      elsif decrease_delay = '1' and coarse_enable = '1' then
        leftshift: for i in N-1 downto 2 loop
          coarse_control_int(i) <= coarse_control_int(i-1);
        end loop;
        coarse_control_int(0) <= logic_one;
      end if;
    end if;
  end process;

  process (reset, register_clock)
  begin
    if reset = '0' then
      fine_control_int <= fine_load_data;
    elsif (register_clock'event and register_clock = '1') then
      if increase_delay = '1' and fine_enable = '1' then
        rightshift: for i in 0 to N-2 loop
          fine_control_int(i) <= fine_control_int(i+1);
        end loop;
        fine_control_int(N-1) <= logic_zero;
      elsif decrease_delay = '1' and fine_enable = '1' then
        leftshift: for i in N-1 downto 2 loop
          fine_control_int(i) <= fine_control_int(i-1);
        end loop;
        fine_control_int(0) <= logic_zero;
      end if;
    end if;
  end process;

  logic_zero <= '0';
  logic_one  <= '1';
  coarse_load_data <= x"FFFFFFFFFFFFFFFF7FFFFFFFFFFFFFFF";
  fine_load_data   <= x"80000000000000000000000000000000";
  coarse_enable <= '1' when ((state = INCREMENT or state = DECREMENT) and (nextstate /= FINE)) else '0';
  fine_enable   <= '1' when (state = FINE) else '0';
  fine_control_inv <= not fine_control_int;
  fine_control     <= fine_control_int;
  coarse_control   <= coarse_control_int;

end behavior;

entity high_resoloution_phase_detector is
  port (
    dll_clock_output : in  std_logic := '0';  -- DLL's output clock
    dll_clock_input  : in  std_logic := '0';  -- Input clock to DLL

reset : in std_logic := '0'; — reset input register_clock : out stdlogic := '0'; ~ Clock for shift register decreasedelay : out stdlogic := '0'; — shift-left output increasedelay : out std_logic := '0'); — shift right output

end high_resoloution__phase_detector;

architecture structural of high_resoloution_phase_detector is

signal A,B,C,D,E,decrease_delay_int,increase_delay_int: stdlogic := '0'; signal F,G,H,I,dll_clock_input_int,reg_clk_l,reg_clk_2,reg_clk_3 : stdlogic := '0'; signal logicone : stdlogic := '1';

  component BUFBD4
    port(
      Z : out STD_LOGIC;
      A : in  STD_LOGIC);
  end component;

  component BUFBD16
    port(
      Z : out STD_LOGIC;
      A : in  STD_LOGIC);
  end component;

  component BUFBD32
    port(
      Z : out STD_LOGIC;
      A : in  STD_LOGIC);
  end component;

  component DFFRPB1
    port(
      Q  : out STD_LOGIC;
      QB : out STD_LOGIC;
      CK : in  STD_LOGIC;
      D  : in  STD_LOGIC;
      RB : in  STD_LOGIC);
  end component;

  component AND3D1
    port(
      Z  : out STD_LOGIC;
      A1 : in  STD_LOGIC;
      A2 : in  STD_LOGIC;
      A3 : in  STD_LOGIC);
  end component;

  component AND2D1
    port(
      Z  : out STD_LOGIC;
      A1 : in  STD_LOGIC;
      A2 : in  STD_LOGIC);
  end component;

  component OR2D1
    port(
      Z  : out STD_LOGIC;
      A1 : in  STD_LOGIC;
      A2 : in  STD_LOGIC);
  end component;

begin -- structural

  U1  : DFFRPB1 port map (E, F, dll_clock_input_int, B, reset);
  U2  : DFFRPB1 port map (G, H, dll_clock_input_int, A, reset);
  U3  : AND2D1  port map (A, logic_one, dll_clock_output);
  U4  : AND3D1  port map (increase_delay_int, E, G, dll_clock_input_int);
  U5  : AND3D1  port map (decrease_delay_int, F, H, dll_clock_input_int);
  U6  : OR2D1   port map (reg_clk_1, increase_delay_int, decrease_delay_int);
  U7  : AND2D1  port map (B, dll_clock_output, logic_one);
  U8  : AND2D1  port map (dll_clock_input_int, dll_clock_input, logic_one);
  u9  : BUFBD4  port map (reg_clk_2, reg_clk_1);
  u10 : BUFBD16 port map (reg_clk_3, reg_clk_2);
  u11 : BUFBD32 port map (register_clock, reg_clk_3);

  logic_one <= '1';
  decrease_delay <= decrease_delay_int;
  increase_delay <= increase_delay_int;

end structural;

entity lock_detector is
  port(
    lock_indicator  : out std_logic := '0';  -- high when DLL is locked
    dll_clock_input : in  std_logic := '0';  -- Input clock to DLL
    reset           : in  std_logic := '0';  -- reset input
    decrease_delay  : in  std_logic := '0';  -- shift-left output
    increase_delay  : in  std_logic := '0'); -- shift-right output
end lock_detector;

architecture behavior of lock_detector is

begin

  process(reset, dll_clock_input)
  begin
    if reset = '0' then
      lock_indicator <= '0';
    elsif dll_clock_input'event and dll_clock_input = '0' then
      lock_indicator <= not(increase_delay or decrease_delay);
    end if;
  end process;

end behavior;

entity vernier_dll is
  generic (
    N : integer := 128 );
  port(
    dll_clock_input  : in  std_logic := '0';
    dll_clock_output : out std_logic := '0';
    lock_indicator   : out std_logic := '0';
    reset            : in  std_logic := '0');
end vernier_dll;

architecture structural of vernier_dll is

  signal register_clock       : std_logic := '0';
  signal fine_control         : std_logic_vector(N-1 downto 0) := (others => '0');
  signal fine_control_inv     : std_logic_vector(N-1 downto 0) := (others => '0');
  signal coarse_control       : std_logic_vector(N-1 downto 0) := (others => '0');
  signal decrease_delay       : std_logic := '0';
  signal increase_delay       : std_logic := '0';
  signal dll_clock_output_int : std_logic := '0';
  signal dll_clock_input_int  : std_logic := '0';
  signal delay_line_output    : std_logic := '0';
  signal logic_zero           : std_logic := '0';
  signal reset_int            : std_logic := '0';
  signal lock_indicator_int   : std_logic := '0';
  signal dll_clock_output_pad : std_logic := '0';

  component PDCH3DGZ
    port(
      CLK : in  std_logic;
      CP  : out std_logic);
  end component;

  component PDD24DGZ
    port(
      I   : in    std_logic;
      OEN : in    std_logic;
      PAD : inout std_logic;
      C   : out   std_logic);
  end component;

  component PDIDGZ
    port(
      PAD : in  std_logic;
      C   : out std_logic);
  end component;

  component PDO02CDG
    port(
      I   : in  std_logic;
      PAD : out std_logic);
  end component;

  component vernier_delay_line
    generic (
      N : integer);  -- number of delay elements
    port(
      delay_line_output : out std_logic;
      delay_line_input  : in  std_logic;
      fine_control      : in  std_logic_vector(N-1 downto 0);
      fine_control_inv  : in  std_logic_vector(N-1 downto 0);
      coarse_control    : in  std_logic_vector(N-1 downto 0));
  end component;

  component high_resoloution_phase_detector
    port(
      dll_clock_output : in  std_logic;  -- DLL's output clock
      dll_clock_input  : in  std_logic;  -- Input clock to DLL
      reset            : in  std_logic;  -- reset input
      register_clock   : out std_logic;  -- Clock for shift register
      decrease_delay   : out std_logic;  -- shift-left output
      increase_delay   : out std_logic); -- shift-right output
  end component;

  component vernier_controller is
    generic (
      N : integer);  -- number of delay elements
    port(
      reset            : in  std_logic;
      register_clock   : in  std_logic;
      increase_delay   : in  std_logic;
      decrease_delay   : in  std_logic;
      coarse_control   : out std_logic_vector(N-1 downto 0);
      fine_control     : out std_logic_vector(N-1 downto 0);
      fine_control_inv : out std_logic_vector(N-1 downto 0));
  end component;

  component lock_detector is
    port(
      lock_indicator  : out std_logic;
      dll_clock_input : in  std_logic;
      reset           : in  std_logic;
      decrease_delay  : in  std_logic;
      increase_delay  : in  std_logic);
  end component;

begin -- structural

  u1 : vernier_delay_line
    generic map (N => 128)
    port map (delay_line_output, dll_clock_input_int, fine_control, fine_control_inv, coarse_control);

  u2 : high_resoloution_phase_detector
    port map (dll_clock_output_int, dll_clock_input_int, reset_int, register_clock, decrease_delay, increase_delay);

  u3 : vernier_controller
    generic map (N => 128)
    port map (reset_int, register_clock, increase_delay, decrease_delay, coarse_control, fine_control, fine_control_inv);

  u4 : lock_detector
    port map (lock_indicator_int, dll_clock_input_int, reset_int, decrease_delay, increase_delay);

  u5 : PDD24DGZ
    port map (delay_line_output, logic_zero, dll_clock_output_pad, dll_clock_output_int);

  u6 : PDO02CDG
    port map (lock_indicator_int, lock_indicator);

  u7 : PDIDGZ
    port map (reset, reset_int);

  u8 : PDCH3DGZ
    port map (dll_clock_input, dll_clock_input_int);

  logic_zero <= '0';
  dll_clock_output <= dll_clock_output_pad;

end structural;


library ieee;
use ieee.std_logic_1164.all;
library vst_n18_sc_tsm_c4_typ;
use vst_n18_sc_tsm_c4_typ.components.all;
library tpz973gtc;
use tpz973gtc.components.all;

entity vernier_testbench is
  generic(
    N : integer := 128);
end vernier_testbench;

architecture behavior of vernier_testbench is

  signal jitterl          : std_logic := '1';
  signal jitterh          : std_logic := '0';
  signal clock1_input     : std_logic := '0';
  signal clock2_input     : std_logic := '0';
  signal clock_enable     : std_logic := '1';
  signal dll_clock_input  : std_logic := '0';
  signal dll_clock_output : std_logic := '0';
  signal lock_indicator   : std_logic := '0';
  signal reset            : std_logic := '0';

  component vernier_dll
    generic (
      N : integer);
    port(
      dll_clock_input  : in  std_logic;
      dll_clock_output : out std_logic;
      lock_indicator   : out std_logic;
      reset            : in  std_logic);
  end component;

begin

  U1 : vernier_dll
    generic map (N => 128)
    port map (dll_clock_input, dll_clock_output, lock_indicator, reset);

  process
  begin
    clock1_input <= '0';
    wait for 4100 ps;
    clock1_input <= '1';
    wait for 4100 ps;
  end process;

  process
  begin
    clock2_input <= '1';
    wait for 3900 ps;
    clock2_input <= '0';
    wait for 3900 ps;
  end process;

  process
  begin
    clock_enable <= '1';
    wait for 291200 ps;
    clock_enable <= '0';
    wait for 2910000 ps;
  end process;

  process
  begin
    jitterl <= '1'; wait for 291100 ps;
    jitterl <= '0'; wait for 200 ps;
    jitterl <= '1'; wait for 8000 ps;
    jitterl <= '0'; wait for 200 ps;
    jitterl <= '1'; wait for 8000 ps;
    jitterl <= '0'; wait for 200 ps;
    jitterl <= '1'; wait for 8000 ps;
    jitterl <= '0'; wait for 200 ps;
    jitterl <= '1'; wait for 8000 ps;
    jitterl <= '0'; wait for 200 ps;
    jitterl <= '1'; wait for 8000 ps;
    jitterl <= '0'; wait for 200 ps;
    jitterl <= '1'; wait for 8000 ps;
    jitterl <= '0'; wait for 200 ps;
    jitterl <= '1'; wait for 8000 ps;
    jitterl <= '0'; wait for 200 ps;
    jitterl <= '1'; wait for 8000 ps;
    jitterl <= '0'; wait for 200 ps;
    jitterl <= '1'; wait for 8000 ps;
    jitterl <= '0'; wait for 200 ps;
    jitterl <= '1'; wait for 8000 ps;
    jitterl <= '0'; wait for 200 ps;
    jitterl <= '1'; wait for 8000 ps;
    jitterl <= '0'; wait for 200 ps;
    jitterl <= '1'; wait for 8000 ps;
    jitterl <= '0'; wait for 200 ps;
    jitterl <= '1';
  end process;

  process
  begin
    jitterh <= '0'; wait for 295200 ps;
    jitterh <= '1'; wait for 200 ps;
    jitterh <= '0'; wait for 8000 ps;
    jitterh <= '1'; wait for 200 ps;
    jitterh <= '0'; wait for 8000 ps;
    jitterh <= '1'; wait for 200 ps;
    jitterh <= '0'; wait for 8000 ps;
    jitterh <= '1'; wait for 200 ps;
    jitterh <= '0'; wait for 8000 ps;
    jitterh <= '1'; wait for 200 ps;
    jitterh <= '0'; wait for 8000 ps;
    jitterh <= '1'; wait for 200 ps;
    jitterh <= '0'; wait for 8000 ps;
    jitterh <= '1'; wait for 200 ps;
    jitterh <= '0'; wait for 8000 ps;
    jitterh <= '1'; wait for 200 ps;
    jitterh <= '0'; wait for 8000 ps;
    jitterh <= '1'; wait for 200 ps;
    jitterh <= '0'; wait for 8000 ps;
    jitterh <= '1'; wait for 200 ps;
    jitterh <= '0'; wait for 8000 ps;
    jitterh <= '1'; wait for 200 ps;
    jitterh <= '0'; wait for 8000 ps;
    jitterh <= '1'; wait for 200 ps;
    jitterh <= '0'; wait for 8000 ps;
    jitterh <= '1'; wait for 200 ps;
    jitterh <= '0';
  end process;

  dll_clock_input <= (((clock1_input and jitterl) or jitterh) and clock_enable) or (clock2_input and (not clock_enable));

  process
  begin
    reset <= '0';
    wait for 10000 ps;
    reset <= '1';
    wait;
  end process;

end behavior;
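The stimulus timing in the testbench can be cross-checked with a few lines of arithmetic. The Python sketch below is an illustrative aid rather than part of the thesis flow: the helper names are hypothetical, and the numbers are simply the `wait for` values read from the listing (half periods of 4100 ps and 3900 ps, and thirteen 200 ps jitter pulses separated by 8000 ps).

```python
CLOCK1_HALF_PS = 4100   # clock1_input half period from the listing
CLOCK2_HALF_PS = 3900   # clock2_input half period from the listing

def frequency_mhz(half_period_ps):
    """Frequency in MHz of a square wave with the given half period in ps."""
    period_ps = 2 * half_period_ps
    return 1e6 / period_ps      # a 1 MHz clock has a period of 1e6 ps

def pulse_windows(first_start_ps, width_ps, gap_ps, count):
    """(start, end) times in ps of `count` pulses separated by `gap_ps`."""
    windows = []
    start = first_start_ps
    for _ in range(count):
        windows.append((start, start + width_ps))
        start += width_ps + gap_ps
    return windows

# jitterl: 13 low pulses of 200 ps, the first at 291100 ps
jitterl_low = pulse_windows(291100, 200, 8000, 13)
# jitterh: 13 high pulses of 200 ps, the first at 295200 ps
jitterh_high = pulse_windows(295200, 200, 8000, 13)

print(round(frequency_mhz(CLOCK1_HALF_PS), 2), "MHz")
print(round(frequency_mhz(CLOCK2_HALF_PS), 2), "MHz")
print(jitterl_low[0], jitterh_high[0])
```

With these values the pulse spacing (200 ps + 8000 ps) equals the 8200 ps period of `clock1_input`, so each `jitterl` pulse starts on a rising edge of that clock and each `jitterh` pulse on a falling edge.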

configuration vernier_rtl of vernier_testbench is
  for behavior
    for u1 : vernier_dll use entity work.vernier_dll(structural);
      for structural
        for u1 : vernier_delay_line use entity work.vernier_delay_line(structural);
        end for;
        for u2 : high_resoloution_phase_detector use entity work.high_resoloution_phase_detector(structural);
        end for;
        for u3 : vernier_controller use entity work.vernier_controller(behavior);
        end for;
        for u4 : lock_detector use entity work.lock_detector(behavior);
        end for;
      end for;
    end for;
  end for;
end vernier_rtl;


Appendix B

Synthesis result

Report : cell
Design : vernier_unit_delay
Version: V-2004.06-SP1
Date   : Mon Jun 27 12:29:56 2005

Attributes:
    b - black box (unknown)
    h - hierarchical
    n - noncombinational
    r - removable
    u - contains unmapped logic

Cell    Reference    Library                 Area         Attributes
U1      NAN2D0       vst_n18_sc_tsm_c4_wc    12.197000
U2      NAN2D0       vst_n18_sc_tsm_c4_wc    12.197000
U3      NAN2D0       vst_n18_sc_tsm_c4_wc    12.197000
U4      NAN2D0       vst_n18_sc_tsm_c4_wc    12.197000
U5      NAN2M1D1     vst_n18_sc_tsm_c4_wc    16.261999
U6      BUFTD1       vst_n18_sc_tsm_c4_wc    28.459000    n
Total 6 cells                                93.508995


Report : area
Design : vernier_unit_delay
Version: V-2004.06-SP1
Date   : Thu Jun 23 14:31:56 2005

Library(s) Used:

    vst_n18_sc_tsm_c4_wc (File: /CMC/kits/cmosp18/synopsys/2004/syn/vst_n18_sc_tsm_c4_wc.db)

Number of ports:          10
Number of nets:           13
Number of cells:           6
Number of references:      3

Combinational area:       93.508995
Noncombinational area:    0.000000
Net Interconnect area:    undefined (Wire load has zero net area)

Total cell area:          93.508995
Total area:               undefined

****************************************
Report : area
Design : vernier_delay_line
Version: V-2004.06-SP1
Date   : Thu Jun 23 14:34:45 2005
****************************************

Library(s) Used:

    vst_n18_sc_tsm_c4_wc (File: /CMC/kits/cmosp18/synopsys/2004/syn/vst_n18_sc_tsm_c4_wc.db)

Number of ports:          386
Number of nets:           767
Number of cells:          128
Number of references:       1

Combinational area:       11969.153320
Noncombinational area:    0.000000
Net Interconnect area:    undefined (Wire load has zero net area)

Total cell area:          11969.151367
Total area:               undefined


Report : area
Design : high_resoloution_phase_detector
Version: V-2004.06-SP1
Date   : Thu Jun 23 14:39:20 2005

Library(s) Used:

    vst_n18_sc_tsm_c4_wc (File: /CMC/kits/cmosp18/synopsys/2004/syn/vst_n18_sc_tsm_c4_wc.db)

Number of ports:           6
Number of nets:           17
Number of cells:          11
Number of references:      7

Combinational area:       760.265991
Noncombinational area:    154.492004
Net Interconnect area:    undefined (Wire load has zero net area)

Total cell area:          914.757996
Total area:               undefined

Report : area
Design : vernier_controller
Version: V-2004.06-SP1
Date   : Thu Jun 23 14:50:12 2005

Library(s) Used:

    vst_n18_sc_tsm_c4_wc (File: /CMC/kits/cmosp18/synopsys/2004/syn/vst_n18_sc_tsm_c4_wc.db)

Number of ports:          388
Number of nets:           825
Number of cells:          567
Number of references:      19

Combinational area:       6244.804199
Noncombinational area:    25637.822266
Net Interconnect area:    undefined (Wire load has zero net area)

Total cell area:          31882.552734
Total area:               undefined


****************************************
Report : area
Design : lock_detector
Version: V-2004.06-SP1
Date   : Thu Jun 23 14:42:05 2005
****************************************

Library(s) Used:

    vst_n18_sc_tsm_c4_wc (File: /CMC/kits/cmosp18/synopsys/2004/syn/vst_n18_sc_tsm_c4_wc.db)

Number of ports:          5
Number of nets:           6
Number of cells:          2
Number of references:     2

Combinational area:       12.197000
Noncombinational area:    77.246002
Net Interconnect area:    undefined (Wire load has zero net area)

Total cell area:          89.443001
Total area:               undefined

****************************************
Report : area
Design : vernier_dll
Version: V-2004.06-SP1
Date   : Thu Jun 23 15:09:01 2005
****************************************

Library(s) Used:

    vst_n18_sc_tsm_c4_wc (File: /CMC/kits/cmosp18/synopsys/2004/syn/vst_n18_sc_tsm_c4_wc.db)
    tpz973gwc (File: /CMC/kits/cmosp18/synopsys/2004/syn/tpz973gwc.db)

Number of ports:            4
Number of nets:           397
Number of cells:            8
Number of references:       8

Combinational area:       65985.875000
Noncombinational area:    25869.562500
Net Interconnect area:    undefined (Wire load has zero net area)

Total cell area:          91855.906250
Total area:               undefined


****************************************
Report : cell
Design : vernier_dll
Version: V-2004.06-SP1
Date   : Thu Jun 23 15:10:29 2005
****************************************

Attributes:
    b - black box (unknown)
    h - hierarchical
    n - noncombinational
    p - parameterized
    r - removable
    u - contains unmapped logic

Cell    Reference                          Library      Area            Attributes
u1      vernier_delay_line                              11969.151367    h, n, p
u2      high_resoloution_phase_detector                 914.757996      h, n
u3      vernier_controller                              31882.552734    h, n, p
u4      lock_detector                                   89.443001       h, n
u5      PDD24DGZ                           tpz973gwc    9400.000000     n
u6      PDO02CDG                           tpz973gwc    9400.000000
u7      PDIDGZ                             tpz973gwc    9400.000000
u8      PDCH3DGZ                           tpz973gwc    18800.000000
Total 8 cells                                           91855.906250

HDL Parameter Information:
    u1 - N => 128
    u3 - N => 128
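The cell areas reported for vernier_dll can be summed to confirm the total and to see how the core area divides among the four blocks. The short Python sketch below is only a reader's cross-check, with hypothetical variable names; the figures are copied from the cell report, and the four pad cells are entered as a single sum (three at 9400 and one at 18800).

```python
# Cell areas copied from the vernier_dll cell report (library area units)
cell_areas = {
    "vernier_delay_line": 11969.151367,
    "high_resoloution_phase_detector": 914.757996,
    "vernier_controller": 31882.552734,
    "lock_detector": 89.443001,
}
PAD_AREA_TOTAL = 3 * 9400.0 + 18800.0   # four pad cells from tpz973gwc
REPORTED_TOTAL = 91855.906250           # "Total 8 cells" in the report

core_area = sum(cell_areas.values())
total = core_area + PAD_AREA_TOTAL

def share(area, whole):
    """Percentage of `whole` taken by `area`."""
    return 100.0 * area / whole

print("core area:", round(core_area, 3))
print("total    :", round(total, 3), "reported:", REPORTED_TOTAL)
for name, area in cell_areas.items():
    print(f"{name}: {share(area, core_area):.1f}% of core area")
```

The summed total agrees with the reported figure to within the rounding of the report, and the pads account for roughly half of the total cell area.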