Posted: 2004.02.23
TechXclusives...
By Peter Alfke, Director, Applications Engineering, Xilinx, San Jose
- Publication Date: 10/28/2002 -
This TechXclusive analyzes the metastable behavior of Virtex-II Pro™ flip-flops. When a 300 MHz clock synchronizes an asynchronous input of ~50 MHz, the incremental metastable delay exceeds 2.5 ns only once per billion years. The input capture window causing 0.5 extra nanoseconds of metastable delay is calculated as 0.03 femtoseconds.
Whenever a clocked flip-flop synchronizes an asynchronous input, there is a small probability that the flip-flop output will exhibit an unpredictable delay. This happens when the input transition not only violates the setup and hold-time specifications, but actually occurs within the tiny timing window where the flip-flop accepts the new input. Under these circumstances, the flip-flop can enter a symmetrically balanced transitionary state, called "metastable" (meta = between).
The slightest deviation from perfect balance will then cause the output to revert to one of its two stable states, but the delay in doing so depends not only on the gain-bandwidth product of the flip-flop's master latch, but also on the noise level within the circuit. The delay can, therefore, only be described in probabilistic terms.
The problem for the system designer is not the undocumented logic level in the balanced state (it's easy enough to translate that to either a 0 or a 1), but rather the unpredictable timing of the eventual transition to a valid logic state. If the metastable flip-flop drives two destinations with different path delays, one destination might clock in the final data state, while the other clocks in the intermediate metastable state.
With the help of a self-contained circuit, Xilinx evaluated the Virtex-II Pro CLB and IOB flip-flops. The result shows their metastable recovery to be superior to that of any previously documented standard device.
Because metastability can only be measured statistically, this data was obtained by configuring the FPGA with the detector circuit shown in Figure 1. The flip-flop under test receives the asynchronous ~50-MHz signal on its D input while it is clocked by a much higher adjustable frequency. The output QA feeds two flip-flops in parallel, one (QB) being clocked by the same clock edge, the other (QC) being clocked by the opposite clock edge. When clocked at a low frequency, each input change is captured by the rising clock edge and appears first on QA, and is then captured by the falling clock edge and appears on QC, and finally, after the subsequent rising clock edge, on QB.
If a metastable event in the first flip-flop increases the settling time on QA so much that QC misses the change, but QB still captures it on the next rising clock edge, this error can be detected by feeding the XOR of QB and QC into a falling-edge triggered flip-flop. Its output (QD) is normally Low, but goes High for one clock period each time the asynchronous input transition caused such a metastable delay in QA. The frequency of metastable events can be observed with a 16-bit counter driven by QD.
Figure 1
By changing the clock frequency, and thus the clock half-period, the amount of acceptable metastable delay on the QA output can be varied, and the resulting frequency of metastable events may be observed on the counter outputs.
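The detection condition described above can be sketched as a simple timing check. This is only an illustrative model, not the Figure 1 netlist: the function name and the `nominal_path_ns` parameter (standing in for the QA clock-to-out delay, routing, and set-up time) are assumptions, and the same path delay is assumed for both QB and QC.

```python
def detector_flags_event(extra_delay_ns, clock_mhz, nominal_path_ns):
    """Return True if a QA transition would be flagged as a metastable event.

    Simplified model: QC samples QA on the falling edge (half a period after
    the rising edge), QB samples QA on the next rising edge (a full period).
    nominal_path_ns is a hypothetical lumped delay, not a measured value.
    """
    period_ns = 1000.0 / clock_mhz
    half_period_ns = period_ns / 2.0
    arrival_ns = nominal_path_ns + extra_delay_ns

    qc_misses = arrival_ns > half_period_ns   # falling edge still sees the old value
    qb_captures = arrival_ns <= period_ns     # next rising edge sees the new value
    return qc_misses and qb_captures          # QB XOR QC goes High, so QD pulses


# Raising the clock frequency shrinks the half period, so smaller
# extra delays on QA are enough to trigger the detector.
for mhz in (250, 300, 350):
    print(mhz, detector_flags_event(0.5, mhz, nominal_path_ns=1.0))
```

With the placeholder 1.0 ns path delay, an extra 0.5 ns is tolerated at 250 MHz and 300 MHz but flagged at 350 MHz, which mirrors how adjusting the clock frequency sets the amount of acceptable metastable delay.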
As expected, no metastable events were observed at clock rates below 300 MHz, since a half clock period at those frequencies is adequate to resolve most metastable delays. Increasing the clock rate slightly brought a sudden burst of metastable events. Careful adjustment of the clock frequency gave repeatable, reliable measurements. The measured MTBF ranged between one millisecond and one minute, and was extrapolated beyond a million years.
Our tests indicated that a metastable delay increment of 500 ps occurs about once per second. During that second, the ~50-MHz data presented about 100 million rising and falling edges for the 300-MHz clock to synchronize, and only one of them resulted in a 500 ps delay increment. Since data and clock are asynchronous, the capture window for this short metastable event is 3 ns ÷ 100 million = 0.03 femtoseconds (taking the clock period as roughly 3 ns). There is no practical way to zero in on such a narrow window, and random testing is the only successful methodology.
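The window arithmetic can be checked in a few lines. The sketch below simply restates the numbers in the text; the variable and function names are illustrative, and the 3 ns clock period is the article's rounded figure.

```python
def capture_window_seconds(clock_period_s, attempts_per_event):
    """Width of the input timing window that yields one event per N attempts."""
    return clock_period_s / attempts_per_event


data_hz = 50e6                       # ~50 MHz asynchronous input
edges_per_second = 2 * data_hz       # rising and falling edges: 100 million
seconds_per_event = 1.0              # one 500 ps event observed per second

attempts = edges_per_second * seconds_per_event
window_s = capture_window_seconds(3e-9, attempts)
window_fs = window_s * 1e15          # seconds to femtoseconds

print(f"{window_fs:.2f} fs")         # prints 0.03 fs
```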
The Mean Time Between Failures (MTBF) can only be defined statistically. It is inversely proportional to the product of the two frequencies involved, the clock frequency and the average frequency of the asynchronous data changes, provided that these two frequencies are independent and have no correlation.
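The generally accepted equation for MTBF is:

$$\mathrm{MTBF} = \frac{e^{K_2 \cdot t}}{F_1 \cdot F_2 \cdot K_1}$$

where F1 is the clock frequency, F2 is the average frequency of the asynchronous data changes, and t is the time allowed for the metastable state to resolve.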
K1 represents the metastability-catching set-up time window, which describes the likelihood of going metastable. K1 is difficult to calculate, as any measured value of t includes the normal clock-to-Q, plus routing delays, plus the actual set-up time. K1 has no practical significance once two values for MTBF have been measured.
K2 is an exponent that describes the speed with which the metastable condition is being resolved. K2 is an indication of the gain-bandwidth product in the feedback path of the master latch of the master-slave flip-flop. A small increase in K2 results in an enormous improvement in MTBF. Some researchers list 1/K2 as a time constant, tau.
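To illustrate why K1 drops out once two MTBF values are known, here is a minimal sketch that assumes the usual form MTBF = exp(K2 · t) / (F1 · F2 · K1). On a logarithmic scale the two points define a straight line whose slope is K2. All numeric values below are placeholders chosen for the example, not the article's measurements, and the function names are hypothetical.

```python
import math


def mtbf_seconds(t_s, k1_s, k2_per_s, f_clock_hz, f_data_hz):
    """MTBF under the assumed model exp(K2 * t) / (F1 * F2 * K1)."""
    return math.exp(k2_per_s * t_s) / (f_clock_hz * f_data_hz * k1_s)


def k2_from_two_points(t1_s, mtbf1_s, t2_s, mtbf2_s):
    """Slope of ln(MTBF) versus t; K1 and both frequencies cancel out."""
    return math.log(mtbf2_s / mtbf1_s) / (t2_s - t1_s)


# Placeholder parameters (NOT measured data): tau = 1/K2 = 50 ps, K1 = 0.1 ns.
K1, K2 = 1e-10, 2e10
F_CLOCK, F_DATA = 300e6, 50e6

# Two synthetic "measurements" at different settling times.
t1, t2 = 0.8e-9, 1.0e-9
m1 = mtbf_seconds(t1, K1, K2, F_CLOCK, F_DATA)
m2 = mtbf_seconds(t2, K1, K2, F_CLOCK, F_DATA)

k2_est = k2_from_two_points(t1, m1, t2, m2)
print(f"recovered K2 = {k2_est:.3e} per second, tau = {1e12 / k2_est:.1f} ps")
```

Because the estimate uses only the ratio of the two MTBF values, any error in the absolute value of t (clock-to-Q, routing, set-up) that is common to both points does not affect K2.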
The circuit of Figure 1 was implemented in an XC2VP4 device using 0.13 micron, 9-layer-metal technology. Two different implementations placed QA, the flip-flop under test, first in a CLB and then in an IOB. The results are listed in Table 1 and are plotted in Figure 2. Note that the time plotted on the horizontal axis includes the clock-to-out delay of QA, plus a short interconnect delay, plus the set-up time at the input of the QC flip-flop. The extrapolated results are outstanding, far superior to any metastable data published before. When granted two nanoseconds of extra settling delay, the problems caused by metastability are almost eliminated, as the MTBF exceeds millions of years.
Table 1 - Virtex-II Pro Metastability Measurements and Calculations
(1997 data for XC4005E added for comparison)
Table 1 lists the experimental results from which the exponential factor K2 was derived. The clock frequency was adjusted manually while counting errors. Measurements were taken at room temperature, but testing at VCC extremes gives an indication of performance at higher and lower temperature.
Figure 2: XC2VP4 Metastable Recovery
Xilinx Applications began measuring and publishing metastable results as early as 1988. The earliest data was published in 1989, and was expanded in 1997 to cover newer devices. To illustrate the improvement in the Virtex-II Pro flip-flop, Figure 3 combines the new results with the 1997 results, all normalized to 1 MHz asynchronous data and 100 MHz clock rate. (Recalculating MTBF for lower data and clock frequencies increases MTBF, but does not affect the slope on the logarithmic scale. It effectively moves the lines to the left: the product of the two frequencies is 150 times smaller than for ~50 MHz data and a 300 MHz clock, so the MTBF is 150 times longer.)
Figure 3: Metastable Progress
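For other operating conditions, scale the plotted MTBF inversely with the product of the two frequencies, relative to the 1 MHz × 100 MHz product used in Figure 3. For a ~10 MHz asynchronous input synchronized by a 200 MHz clock, the product is 20 times larger, so the MTBF is 20 times shorter than plotted; for a ~50 kHz signal synchronized by a 1 MHz clock, the product is 2000 times smaller, so the MTBF is 2000 times longer than plotted in Figure 3.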
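The scaling rule is easy to get backwards, so a short sketch may help. It assumes only that MTBF is inversely proportional to the frequency product and that Figure 3 is normalized to 1 MHz data and a 100 MHz clock; the function and constant names are illustrative.

```python
REF_CLOCK_HZ = 100e6   # Figure 3 normalization: 100 MHz clock
REF_DATA_HZ = 1e6      # Figure 3 normalization: 1 MHz asynchronous data


def mtbf_scale_factor(clock_hz, data_hz):
    """Factor by which to multiply the MTBF read from the normalized plot."""
    return (REF_CLOCK_HZ * REF_DATA_HZ) / (clock_hz * data_hz)


print(mtbf_scale_factor(200e6, 10e6))   # below 1: MTBF is shorter than plotted
print(mtbf_scale_factor(1e6, 50e3))     # above 1: MTBF is longer than plotted
```

A factor below 1 means the MTBF is shorter than the plotted value; a factor above 1 means it is longer.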
Asynchronous inputs can result in unpredictable delays at the synchronizer flip-flop output. Modern CMOS circuits are so fast that this metastable delay can safely be ignored for clock rates below 200 MHz. The user must, however, minimize the routing delay between the synchronizer and the next-level flip-flop, so as not to squander the advantage gained by the fast settling time. This article provides quantitative data to calculate the metastable MTBF.
(C) Copyright 1994-2003 Xilinx, Inc. All Rights Reserved