Another Synchronization Pitfall…

May 18, 2007

Multi-clock-domain designs give designers many headaches. The basics that everyone should know when working across clock domains are presented in this paper. In this post I would like to discuss a lesser-known problem that most designers overlook. As a small anecdote, this problem was encountered by a design team led by a friend of mine. The team was offered a 2-day vacation as a reward for anyone who tracked down and solved the weird failures they were seeing. I guess that is already a good reason to continue reading…

OK, we all know that when sending a control signal from one clock domain to another (it had better be a single signal; see the paper referenced above), we must synchronize it at the other end with a two-stage shift register (some libraries even have a “sync cell” specifically for this purpose).
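For readers who prefer code to schematics, here is a minimal behavioral sketch of such a two-stage synchronizer in Python. It is only a cycle-level model, not RTL: the class name `TwoFlopSynchronizer` and its methods are my own hypothetical names, and the model cannot represent metastability itself, only the two-edge latency the structure introduces.

```python
class TwoFlopSynchronizer:
    """Cycle-level model of a two-stage synchronizer (hypothetical helper class)."""

    def __init__(self, reset_value=0):
        self.stage1 = reset_value  # first flop: the one that may go metastable in silicon
        self.stage2 = reset_value  # second flop: the output the receiving logic uses

    def clock(self, async_in):
        """Apply one rising edge of the receiving clock and return the synchronized output."""
        # Both flops update on the same edge, so stage2 captures the OLD stage1 value.
        self.stage2, self.stage1 = self.stage1, async_in
        return self.stage2


sync = TwoFlopSynchronizer()
outputs = [sync.clock(bit) for bit in [0, 1, 1, 1, 0, 0]]
print(outputs)  # [0, 0, 1, 1, 1, 0]
```

The output only follows the input from the second receiving-clock edge onward, which is the price paid for giving the first flop a full cycle to resolve.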

Take a look at the hypothetical example below.

sync_pitfall1.png sync_pitfall2.png

Apparently all is well: the control signal, which is the output of some combinational logic, is synchronized at the other end.
So what is wrong?
In some cases the combinational logic might generate a hazard, depending on its inputs. Regardless of whether it is a static hazard (as depicted in the timing diagram) or a dynamic one, the receiving clock may sample the signal at exactly the moment of the glitch. Take a close look at the timing diagram: the glitch was captured as a “0” on clk_b’s side although no “0” was ever intended.
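The effect can be reproduced with a small time-stepped sketch in Python. Everything here is an assumption made for illustration and not taken from the figures: the logic is a gate-level 2:1 mux, every gate has one unit of delay, both data inputs are 1, and the select input falls at t = 10. The names `sel`, `mux_out` and `sample` are hypothetical.

```python
GATE_DELAY = 1  # hypothetical unit delay per gate


def sel(t):
    """Select input in the clk_a domain: falls from 1 to 0 at t = 10."""
    return 1 if t < 10 else 0


def mux_out(t, a=1, b=1):
    """Gate-level mux, out = (a AND sel) OR (b AND NOT sel), with unit gate delays."""
    and1 = a & sel(t - 2 * GATE_DELAY)        # path: AND gate, then OR gate
    and2 = b & (1 - sel(t - 3 * GATE_DELAY))  # path: inverter, AND gate, OR gate
    return and1 | and2


def sample(signal, first_edge, period, n_edges):
    """Values a flop would capture at successive rising edges of the sampling clock."""
    return [signal(first_edge + k * period) for k in range(n_edges)]


# With a = b = 1 the output should stay at 1, yet it dips to 0 for one time unit.
print([mux_out(t) for t in range(8, 16)])  # [1, 1, 1, 1, 0, 1, 1, 1]

# clk_b edges at t = 5, 12, 19 land on the glitch; edges at t = 4, 11, 18 miss it.
print(sample(mux_out, first_edge=5, period=7, n_edges=3))  # [1, 0, 1]
print(sample(mux_out, first_edge=4, period=7, n_edges=3))  # [1, 1, 1]
```

The extra inverter on one path is what opens the one-unit window, and whether clk_b ever lands in that window depends only on the phase relationship between the two clocks.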

The solution to this problem is relatively easy: add another sampling stage clocked by the sending clock, as depicted below. Notice that this time the control signal is not seen as a “0” at the other end. This is because the glitch has enough time to settle before the next rising edge of clk_a, so the added flip-flop never captures it and drives a clean signal across the domain boundary.

sync_pitfall3.png sync_pitfall4.png
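A rough Python sketch of the fix, under assumptions of my own: the same hypothetical unit-delay mux (which glitches to 0 at t = 12), a clk_a period of 10 time units, and an ideal flop with zero clock-to-output delay. The script tries every clk_b phase and lists the phases for which the receiver would capture a spurious 0, first on the raw logic output and then on the registered one.

```python
CLK_A_PERIOD = 10  # hypothetical; must exceed the 3-gate-delay settling time of the logic


def sel(t):
    """Select input launched by clk_a: falls from 1 to 0 at the clk_a edge at t = 10."""
    return 1 if t < 10 else 0


def mux_out(t, a=1, b=1):
    """Hypothetical unit-delay mux; with a = b = 1 it glitches to 0 at t = 12 only."""
    return (a & sel(t - 2)) | (b & (1 - sel(t - 3)))


def registered_out(t):
    """Ideal flop clocked by clk_a: holds the value mux_out had at the latest clk_a edge."""
    last_edge = (t // CLK_A_PERIOD) * CLK_A_PERIOD
    return mux_out(last_edge)


def phases_that_see_zero(signal, clk_b_period=7, n_edges=6):
    """Return every clk_b phase for which at least one clk_b edge samples a 0."""
    return [phase for phase in range(clk_b_period)
            if any(signal(phase + k * clk_b_period) == 0 for k in range(n_edges))]


print(phases_that_see_zero(mux_out))         # [5]
print(phases_that_see_zero(registered_out))  # []
```

The registered signal only changes at clk_a edges, by which time the logic has settled, so no clk_b phase can catch the glitch. The cost is one extra clk_a cycle of latency.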

In general, a control signal sent between two clock domains should switch cleanly: either 1–>0 or 0–>1. Static hazards (1–>0–>1 or 0–>1–>0) and dynamic hazards (1–>0–>1–>0 or 0–>1–>0–>1) are both a source of trouble.
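The distinction between the cases can be written down as a small Python helper that inspects a finely sampled waveform around one intended switching event. The function name `classify_switching` and the label strings are hypothetical, and the rule it applies (count the transitions, then compare the first and last values) is just the definition above restated.

```python
def classify_switching(samples):
    """Classify a list of 0/1 samples taken around one intended switching event."""
    # Count how many times the value changes between consecutive samples.
    transitions = sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)
    if transitions == 0:
        return "stable"
    if transitions == 1:
        return "clean transition"
    # More than one transition: the signal glitched on the way.
    if samples[0] == samples[-1]:
        return "static hazard"   # ends where it started, e.g. 1->0->1
    return "dynamic hazard"      # ends at the other value, e.g. 1->0->1->0


print(classify_switching([1, 1, 0, 0]))     # clean transition
print(classify_switching([1, 1, 0, 1, 1]))  # static hazard
print(classify_switching([0, 1, 0, 1, 1]))  # dynamic hazard
```

Only the first two results, "stable" and "clean transition", are safe to hand to a synchronizer in another clock domain.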

Just a few more lines on synchronization faults. Quite often they show up in only some of the parts. You might have 2 identical chips, where one shows the problem and the other does not. This can be due to slight process variations that make some logic faster or slower, which in turn generates a hazard at exactly the wrong moment.
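To illustrate how a small delay difference can separate a "good" chip from a "bad" one, here is a hedged Python sketch. It assumes a hypothetical gate-level mux whose two paths differ only by an inverter, so the glitch lasts exactly as long as the inverter delay. All names and delay numbers are invented for the example.

```python
def glitch_window(inv_delay, and_delay=1, or_delay=1, sel_fall=10):
    """Half-open interval [start, end) during which the mux output is wrongly 0."""
    start = sel_fall + and_delay + or_delay            # direct path has switched off
    end = sel_fall + inv_delay + and_delay + or_delay  # inverted path has switched on
    return start, end


def sees_glitch(inv_delay, clk_b_edge):
    """True if a clk_b rising edge at time clk_b_edge samples inside the glitch."""
    start, end = glitch_window(inv_delay)
    return start <= clk_b_edge < end


# Two "identical" chips whose inverters differ slightly in speed, same clk_b edge.
print(sees_glitch(inv_delay=1, clk_b_edge=13))  # False: glitch covers [12, 13)
print(sees_glitch(inv_delay=2, clk_b_edge=13))  # True:  glitch covers [12, 14)
```

Same netlist, same clocks, and yet only the chip with the slower inverter stretches the glitch far enough to reach the sampling edge.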

One comment

  1. One synchronization problem I’ve seen was that a control signal in one clock domain was unintentionally fed into a state machine in a different clock domain. Needless to say that once in a while that state machine was thrown into an invalid state.


