h1

Real World Examples #5 – Clock Divider by 5

August 26, 2009

Here is a neat little circuit that was used in an actual project a long, long time ago (in a galaxy far, far away…).

The requirement was to build a divide by 5 circuit for the clock with 50% duty cycle. The initial (on reset) behavior was not important – i.e. the circuit could wake up in an undefined state, but should have settled after a given time. The engineer produced the circuit below:

divby5

Basically, the circuit is made out of a 3-bit counter, that counts from 000 to 100 and then resets. Signal ‘X’ goes high when the value of the counter is either 000, 001 or 010. Signal ‘Y’ goes high when the counter equals its ‘middle’ state 010. ‘Z’ is a sample on the falling edge of ‘Y’ in order to generate the 50% duty cycle.

So far so good. The general thinking was OK, but there was a major problem with the circuit, can you discover what it was? How would you fix it in RTL? and more important, how would you fix it in an ECO (as it was eventually done)?

No extra flops are allowed!

h1

Dual Edge Binary Counters + Puzzle

June 24, 2009

I lately came across the need to use a dual edge counter, by this I mean a counter which is counting both on the rising and on the falling edge of the clock.
The limitation is that one has to use only normal single edge sensitive flops, the kind you find in each library.

There are several ways to do this, some easier than others. I would like to show you a specific design which is based on the dual edge flop I described in a previous post. This design is just used here to illustrate a point, I do not recommend you use it – there are far better ways. Please refer to the end of the post for more on that.

The figure below depicts the counter:
dual edge counter

The counter is made of 2 n-bit arrays of flops. The one operates on the rising edge, the other on the falling edge. The “+1″ logic is calculated from the final XOR output, which is the real output of the counter! The value in each of the n-bit arrays does not represent the true counting value, but is used to calculate the final counter value. Do not make the mistake and use the value directly from either set of flops.

This leads to a small puzzle – given the conditions above, can this counter be done with less flops?

h1

Reordering Nets for Low Power

May 10, 2009

As posts accumulate, you can see that low power design aspects is a big topic on this site. I try to bring more subtle design examples for lower power design that you can control and implement (i.e. in RTL and the micro architectural stage).

Identifying “glitchy” nets is not always easy. Some good candidates are wide parity or CRC calculations (deep and wide XOR trees), complicated arithmetic paths and basically most logic that originates in very wide buses and converges to a single output controlling a specific path (e.g. as a select pin of a MUX for a wide data path).

If you happen to identify a good candidate, it is advisable (when possible) that you feed the “glitchy” nets as late as possible in the calculation path. This way the total amount of toggling in the logic is reduced.

Sounds easy enough? Well the crux of the problem is identifying those opportunities – and it is far from easy. I hope this post at least makes you more aware of that possibility.

To sum up, here are two figures that illustrate the issue visually. The figure below depicts the situation before the transformation.. The nets which are highlighted represent high activity nets.
glitchy_net1

After the transformation – pushing the glitchy net calculation late in the path. The transformation is logically equivalent (otherwise there is no point…) and we see less high activity nets.

glitchy_net2

h1

Parametrized Reset Values

April 19, 2009

For some odd reason some designers refuse to use parametrized blocks. I have no idea what are the reasons for such an opinion, but here is a good example why one would want to decide for the usage of parameters.

Imagine you need to design a block, which will be used several times throughout the design. The problem is, that each instance might need to have different reset values to some of its internal flops.
One (wrong) possibility is to define an extra input, which will in turn be connected as the reset value – but this is not something you’d like to do (why??)

The better option is to send the reset value as a parameter, which if it wasn’t clear by now, is the way to go.

h1

Puzzle #14 – Multipliers

April 14, 2009

Here is an interview question that was circulating some of the message boards lately.
Can you create a 4×4 multiplier with only 2×2 multipliers at hand?

post your answers as a comment to this post.

h1

Reducing Power Through Retiming

February 23, 2009

Here is an interesting and almost trivial technique for (potential) power reduction, which I never used myself, nor seen used in others’ designs. Well… maybe I am doing the wrong designs… but I thought it is well worth mentioning. So, if any of my readers use this, please do post a short comment on how exactly did you implement it and if it really resulted in some significant savings.

We usually have many high activity nets in the design. They are in many cases toggling during calculation more than once per cycle. Even worse, they often drive long and high capacitive nets. Since, in a usual synchronous design (which 99% of us do), we only need the stable result once per cycle – when the calculation is done – we can just put a register to drive the high capacitive net. The register effectively blocks all toggling on that net (where it hurts) and allows it to change maximum one time per cycle.

The image below tells the whole story. (a) is before the insertion of the flop, (b) right after.

low_power_retiming

This is all nice, but just remember that in real life it can be quite hard to identify those special nets and those special high toggling logic clouds. Moreover, most of the time we cannot afford the flop for latency reasons. But if you happen to be in the early design phase and you know more or less your floor plan, think about moving some of those flops so they will reduce the toggling on those high capacitive nets.

h1

Transparent Pipelining

February 15, 2009

The nice thing about posting in such a site, is that one learns quite a bit with time. During the long pause I took, I tried to read quite a bit and look for some interesting papers. Yes, I am aware that most of my readers are not really interested in reading technical papers for fun, but the bunch that I collected are IMHO quite important and teach a lot.

The one for today is about a novel clocking scheme for latch based pipelines. I found it really interesting and important. I am sure that sometime I am going to implement this for a future design. The paper could have had more examples, and I bet that a few animations would only do good for that topic – but you can’t really ask for stuff like that in a technical paper, can you?

OK, enough of my words – you can find the paper here.

Follow

Get every new post delivered to your Inbox.

Join 108 other followers