# A Non-sequential Phase Detector for PLL-based High-Speed Data/Clock Recovery<sup>\*</sup>

Yonghui Tang, Randall L. Geiger
Dept. of Electrical and Computer Engineering, Iowa State University, Ames, IA 50011

*Abstract* – The Phase-Locked Loop (PLL) is a widely used block in data and clock recovery circuits, and the phase detector is a crucial part of the PLL. The requirements for phase detectors used in random data recovery are more stringent than those for phase detectors used in clock recovery, especially at high speed. This paper presents a new Phase Detector (PD) that can be used for high-speed random data/clock recovery. In contrast to most existing structures, which are speed-limited by sequential logic circuits, it exploits leading and lagging signals from the VCO, which greatly simplifies the PD structure. Using the HSPICE simulator and HP 0.35 µm standard CMOS process models, simulation results show that the PD can operate at 2GHz over the 0°C to 100°C temperature range and over fast and slow process corners.

# I. INTRODUCTION

Phase-Locked Loops (PLLs) and Data/Clock Recovery Circuits find wide applications in areas such as communications, wireless systems, digital circuits, and disk drive electronics. The Phase Detector is a key component of the PLL, but with existing circuit implementations, the phase detector is often the bottleneck that limits the data rates that can be achieved by the PLL.

Phase detectors can be categorized into two types. The first is used in PLLs that lock to a reference clock signal. Many phase detectors can perform this function, including XOR gates and RS latches. The second is used in PLLs that lock to random Non-Return-to-Zero (NRZ) data and recover the clock signal associated with the data stream. Because the spectrum of NRZ data has no energy at its data rate, data recovery is more difficult and places more severe restrictions on the performance of the phase detector. Often, a nonlinear operation is required at the front end of the phase detector circuit to generate some energy at the data rate frequency.

Several kinds of phase detectors that are applicable to random data applications have been reported [1], [3]. Probably the most widely used is Hogge's phase detector [3]. The Hogge phase detector can be used at speeds up to 1GHz in the referenced process, but its performance deteriorates rapidly at modestly higher frequencies. This limitation is due mainly to the inability of the flip-flops used in the circuit to settle fast enough.

The phase detector we present here is specifically designed to operate at higher data rates than what is achievable with the Hogge circuits. This should enable the data rate of corresponding data recovery circuits to be increased. The proposed circuits are simple and easy to implement. The structure of this new PD is partly determined by the number of stages in the VCO.

PD structures for VCOs with even and odd numbers of stages are described in Section II. Implementation details are provided in Section III. Simulation results are given in Section IV.

# II. STRUCTURE AND OPERATION

## A. Structure of Non-Sequential Phase Detector used with a VCO with an odd number of stages

The proposed PD is designed for use with a ring oscillator type VCO. The simplicity of this PD is achieved by introducing one or two signals from the VCO to help in finding the phase/frequency difference.



Figure 1. (a) Phase Detector based on VCO with an odd number of stages; (b) The structure of the whole PLL

The structure of the PD that is used with an odd number of VCO stages is shown in Fig. 1(a). The PLL structure that uses this PD is shown in Fig. 1(b) for the case where there are 3 delay stages in the VCO. If there are more than 3 delay stages in the VCO, the **CLK\_lead** signal can come from the non-inverting output of the stage immediately preceding the clock output stage and the **CLK\_lag** signal can come from the inverting output of the immediately following stage. Other signals can also be used for **CLK\_lead** and **CLK\_lag** when there are more than 3 delay stages, depending on the PLL design.

<sup>\*</sup>This work was supported, in part, by Texas Instruments Inc., RocketChips Inc. and the R. J. Carver trust.

This PD uses two signals extracted from the VCO: **CLK\_lead** and **/CLK\_lag**, the leading signal and the inverted lagging signal of the Clock (**CLK**) signal.

We use two delay cells and two XOR gates to detect the edges of the input random data. The **Up** and **Down** signals that are required to drive the charge pump are generated from signals **A**, **E** and **F**.

Fig.2 shows a timing diagram for the situation when the PLL is in lock. The input data is random in this figure.



Figure 2. The operations of the proposed phase detector

Signal **A** is generated from **CLK\_lead** and **/CLK\_lag** signals. It has a fixed relationship with the regenerated clock, i.e. the rising edges of the **CLK** are at the middle of signal **A**.
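The centering of **A** on the **CLK** rising edge can be checked with a small idealized model. This is only a sketch under stated assumptions: **CLK\_lead** and **CLK\_lag** are treated as ideal 50% duty-cycle square waves that lead and lag **CLK** by the same amount (the hypothetical parameter `skew`), **A** is taken as the AND of **CLK\_lead** and **/CLK\_lag**, and time is counted in integer steps (for example picoseconds). The function names are illustrative and do not come from the paper.

```python
def square(t, period, shift):
    """Ideal 50% duty-cycle square wave with rising edges at shift + k*period."""
    return ((t - shift) % period) < period // 2


def signal_a(t, period, skew):
    """A = CLK_lead AND /CLK_lag for ideal, symmetric lead and lag (assumption)."""
    clk_lead = square(t, period, -skew)        # rises `skew` before CLK
    clk_lag_bar = not square(t, period, skew)  # inverted lagging clock
    return clk_lead and clk_lag_bar


def a_pulse(period, skew):
    """Return (start, width) of the A pulse around the CLK rising edge at t = 0."""
    high = [t for t in range(-period // 2, period // 2)
            if signal_a(t, period, skew)]
    return high[0], len(high)


if __name__ == "__main__":
    # 2 GHz clock (500 ps period) with a quarter-period lead/lag.
    print(a_pulse(500, 125))
```

With a quarter-period skew the pulse runs from -125 to +125 around the clock edge, which corresponds to the 50% duty cycle case; a smaller skew gives a narrower pulse that is still centered on the edge.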

Signals **E** and **F** are generated from the signals **Data**, **Data\_delay1** and **Data\_delay2**. The falling edges of **F** and the rising edges of **E** are aligned at the dotted line, which, when the PLL is in lock, is right at the middle of the signal **A**.

When the PLL is in lock, the **Up** and **Down** signals are generated by using **E** and **F** to cut signal **A** in half. Therefore, the **Up** and **Down** signals have the same duty cycle, and the loop filter, which filters the difference in the duty cycles of the **Up** and **Down** signals, will not be driven up or down when in lock. From the preceding description, we note that when the PLL is in lock, the **CLK** is not locked to **Data**, but to **Data\_delay1** instead. However, this should not affect the operation of the PLL.



The scenarios with leading and lagging **Data** signals are shown in Fig.3. For simplicity, the emphasis is only on how the **Up** and **Down** signals relate to the **Data** signal.

When **Data** is leading the **CLK**, the dotted line in Fig.2 will move to the left relative to signal **A**. Therefore, the width of the **Down** pulse will decrease and the width of the **Up** pulse will increase. Conversely, when **Data** is lagging the **CLK**, the dotted line will move to the right relative to signal **A**. Therefore, the width of the **Down** pulse will increase and the width of the **Up** pulse will decrease. The changes in the duty cycles of the **Up** and **Down** signals will, in turn, decrease or increase the frequency of the **CLK** signal through the PLL.

In this phase detector, the **Up** and **Down** signals are only generated whenever there are transitions on the incoming data stream. This property guarantees its ability to handle random NRZ data.
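The behavior described above can be reproduced with a short behavioral model. This is a sketch, not the circuit of Fig. 1(a): the gate mapping `F = Data XOR Data_delay1`, `E = Data_delay1 XOR Data_delay2`, `Down = A AND F` and `Up = A AND E` is an assumption inferred from the timing description, **A** is modeled as an ideal pulse centered on the **CLK** rising edges, and all names and default values (500-step period, 250-step **A** width, 250-step delay) are hypothetical.

```python
def pd_pulse_widths(bits, phase_error=0, period=500, a_width=250, delay=250):
    """Total high time of (Up, Down) over one repetition of the NRZ pattern `bits`.

    Time is counted in integer steps. CLK rising edges sit at k*period, and
    phase_error > 0 means the data edges arrive late (Data lags CLK).
    """
    n = len(bits)

    def data(t):
        # Periodic NRZ data whose edges sit at k*period - delay + phase_error,
        # so that the Data_delay1 edges sit at k*period + phase_error.
        return bits[((t + delay - phase_error) // period) % n]

    up = down = 0
    for t in range(n * period):
        # A: ideal pulse of width a_width centered on each CLK rising edge.
        a = ((t + a_width // 2) % period) < a_width
        d0, d1, d2 = data(t), data(t - delay), data(t - 2 * delay)
        f = d0 ^ d1   # high between a Data edge and the Data_delay1 edge
        e = d1 ^ d2   # high between the Data_delay1 and Data_delay2 edges
        if a and f:
            down += 1  # assumed mapping: Down = A AND F
        if a and e:
            up += 1    # assumed mapping: Up = A AND E
    return up, down


if __name__ == "__main__":
    for phi in (-20, 0, 20):
        print(phi, pd_pulse_widths([0, 1], phase_error=phi))
```

In this model the two outputs are equal at zero phase error, move in opposite directions when the data edge shifts, and stay at zero for a pattern with no transitions, which matches the qualitative description in the text.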

It is well known that many phase detectors are plagued by a "dead zone", which causes jitter in the PLL. In order to achieve correct operation of the phase detector and eliminate the "dead zone" in the proposed phase detector, the following constraint on the phase detector's delay stage must be satisfied:

$$\max\left(\frac{1}{2}T_0, \frac{1}{2}(T-T_0)\right) < T_{delay} < \left(T-\frac{1}{2}T_0\right)$$

where **T** is the period of the signal **CLK**, $T_0$ is the pulse width of the signal **A**, and $T_{delay}$ is the delay of the phase detector's delay stage. Typically, the **A** signal has a 50% duty cycle.
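As a worked instance of the constraint, the sketch below evaluates the allowed delay window. The function names are illustrative only, and the numbers assume the 2GHz clock used in the simulations (a period of 500 ps) together with a 50% duty-cycle **A** signal (a pulse width of 250 ps).

```python
def delay_window(period, a_width):
    """Open interval (lower, upper) allowed for the PD delay stage.

    Implements max(T0/2, (T - T0)/2) < T_delay < T - T0/2,
    with period = T and a_width = T0 in the same time unit.
    """
    lower = max(a_width / 2, (period - a_width) / 2)
    upper = period - a_width / 2
    return lower, upper


def delay_ok(delay, period, a_width):
    """True if `delay` lies strictly inside the allowed window."""
    lower, upper = delay_window(period, a_width)
    return lower < delay < upper


if __name__ == "__main__":
    # 2 GHz clock: T = 500 ps; 50% duty-cycle A: T0 = 250 ps.
    print(delay_window(500, 250))
```

For these values the delay must lie between 125 ps and 375 ps, a window that is half a clock period wide.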

The transfer characteristics of the PD for a typical  $T_{delay}$  are shown in Fig.4. This relationship is typical of a good phase detector. Corresponding to different delay times, the shape of the curve will change modestly, but its functionality as a useful PD will be maintained.



Figure 4. The transfer characteristic of the proposed PD

Unlike many up-down PDs, whenever there is a phase shift in the input, the pulse widths of both the **Up** and **Down** signals change, in opposite directions, rather than only one of them changing in one direction. This property increases the gain of the PD and inherently enhances the acquisition behavior of the PLL.

## B. Structure of Non-Sequential Phase Detector used with a VCO with an even number of stages



Figure 5. (a) Phase Detector based on VCO with an even number of stages; (b) The structure of the whole PLL

In the case where the VCO has an even number of stages, the PD structure becomes even simpler. Here we can eliminate the AND gate that is used to generate the signal **A** in Fig. 1(a), since the signal **A** can be directly extracted from the VCO.

The PD structure is shown in Fig. 5(a). A PLL structure that uses this PD for the case where the VCO has 4 delay stages is shown in Fig. 5(b).

For more than 4 delay stages in the VCO, **CLK\_mid** must come from the inverting output of the stage that precedes or follows the **CLK** output stage by half the number of stages.

The operating principle of this PD is basically the same as that described previously for a VCO with an odd number of stages.

# III. THE IMPLEMENTATION OF THE PROPOSED PD

## A. Delay Cell

The delay cells are implemented simply as a series of inverters, as shown in Fig.6. Since the delay time requirement for them is not very critical, it is easy to control the delay time over the temperature range and the process corners.



Figure 6. Delay cells

## B. XOR and AND Gates

The initial goal in designing this PD was to eliminate the sequential logic circuits that are difficult to operate at high speeds. The proposed PD consists only of combinational logic, which tends to operate at higher speeds than sequential logic. A good choice of architecture for the XOR and AND gates is, however, crucial to achieving high-speed operation in the proposed PD.

Several styles of CMOS logic can be considered. One is complementary CMOS logic, which is built from NMOS pull-down and PMOS pull-up logic networks. Simple gates, such as NAND and NOR, can be realized very efficiently with only a few transistors and a few circuit nodes. Other gates, such as XOR and AND gates, require more complex circuit realizations.

Another choice is pass-transistor logic. Pass-transistor logic XOR gates are very simple and can operate at very high speeds. Pass-transistor logic was used in the implementation of the proposed PD. However, special care must be taken to circumvent the swing degradation problems which are of concern in pass-transistor logic.

Several styles of pass-transistor logic are available including Complementary Pass-transistor Logic (CPL), Swing Restored Pass-transistor Logic (SRPL), Double Pass-transistor Logic (DPL), and Single-Rail Pass-transistor logic (LEAP). We used CPL as the structure for both the XOR and AND gates. The circuit schematic is shown in Fig. 7. Two PMOS transistors are used to restore a logic "1" at the output. In order to increase the driving capability, an inverter is used between the XOR gate and the AND gate.



Figure 7. (a) CPL AND gate; (b) CPL XOR gate

All XOR and AND gates are implemented in the complementary mode, so both "signal" and "signal-bar" are present. Therefore, both "Up" and "Down" can be generated complementarily to drive the charge pump.

# IV. SIMULATION RESULTS

The simulation of the PD was done using the HSPICE simulator with the HP 0.35 µm standard CMOS process parameters. The clock signal is 2GHz. All schematic and anticipated layout parasitics are included. An additional 10fF capacitor was added at each connection node to model the interconnection capacitance. The simulation covers the temperature range from 0°C to 100°C and all process corners.

Three simulation results are shown in Fig.8(a). The input data stream is a 1GHz signal with a 50% duty cycle, which represents the NRZ data pattern "0101010 ......" (not random). One of the three results is at room temperature with the normal model. The other two are under extreme conditions: 100°C at the slow process corner and 0°C at the fast process corner. The PD operates correctly under all of these conditions.

From these simulations, it is apparent that very high gain is achieved around the zero phase error point. The "dead zone" that is inherent in many phase detectors is absent.

The simulation results with random input data are shown in Fig. 8(b). The performance of the PD was tested with two patterns of NRZ data. One pattern is a series of "1"s with one "0" ("11110"); the other is a series of "0"s with one "1" ("00001"). The results show that both have zero output at 0 degrees of phase shift and that the PD gain is reasonably constant.

# V. CONCLUSION

A new non-sequential Phase Detector structure has been introduced. It is not only applicable to random data recovery, but also has the advantages of simplicity and operability at high speed. Simulation results show that it can operate at up to 2GHz in an HP 0.35 µm standard CMOS process.



Figure 8. Simulation results of the proposed PD under several situations with (a) 50% duty cycle data; (b) random input data

# REFERENCES

- [1] J.D.H. Alexander, "Clock Recovery From Random Binary Signals," *Electronics Letters*, vol. 11, pp. 541-542, Oct. 1975.
- [2] Floyd M. Gardner, "Charge-Pump Phase-Lock Loops," *IEEE Trans. Comm.*, vol. COM-28, pp. 1849-1858, Nov. 1980.
- [3] Charles R. Hogge, "A Self Correcting Clock Recovery Circuit," *IEEE Journal of Lightwave Technology*, vol. LT-3, pp. 1312-1314, Dec. 1985.
- [4] Reto Zimmermann, Wolfgang Fichtner, "Low-Power Logic Styles: CMOS Versus Pass-Transistor Logic," *IEEE Journal of Solid-State Circuits*, vol. 32, no. 7, pp. 1079-1090, July 1997.
- [5] Behzad Razavi, *Monolithic Phase-Locked Loops and Clock Recovery Circuits - Theory and Design*, IEEE Press, 1996.