New Updates Coming Soon

February 4, 2009

I know it has been a while since I added new posts. A lot has been going on here lately – a new addition to the family, a new job and a few smaller things, all of which keep me fairly busy.

Don’t give up on me just yet. I promise to keep the interesting posts coming, although maybe not on a weekly basis as I tried to do before.

Hope you guys understand…


Real World Examples #4 – More on “Thinking Hardware”

January 20, 2009

I was reviewing some code not long ago and noticed, together with the owner of the code, that we had some timing problems.
Part of the code looked something like this (Verilog):

wire [127:0] a;
wire [127:0] b;
wire [127:0] c;
assign c = select_register ? a : b;

For those not familiar with Verilog syntax, the code describes a MUX construct using the ternary operator. The two data inputs for the MUX are “a” and “b” and the select is “select_register”.

So why was this code translated into a relatively slow design? The answer is in the width of the signals. The code actually synthesizes to 128 parallel 2:1 MUX structures, so “select_register” drives 128 loads, and that fanout slows the select path down.
When a construct like this is buried in a large piece of code, our tendency is to dismiss it by saying it is “only” one 2:1 MUX deep, but we have to look more carefully than that – and always remember to consider the load.

Solving this problem is relatively easy with replication: creating several copies of “select_register”, each driving only part of the MUX bits, helped significantly.
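
For illustration, here is a minimal sketch of what such a replication could look like. It is not the original code: the clock “clk”, the register input “sel_d”, the module name and the choice of four copies (32 loads each) are all assumptions made for the example.

```verilog
// Sketch only: clk, sel_d, the module name and the 4-way split are assumed.
module wide_mux_replicated (
    input  wire         clk,
    input  wire         sel_d,   // value to be registered as the select
    input  wire [127:0] a,
    input  wire [127:0] b,
    output wire [127:0] c
);
    // Four copies of the select register; each copy drives 32 MUX bits
    // instead of one register driving all 128.
    reg [3:0] select_register_rep;

    always @(posedge clk)
        select_register_rep <= {4{sel_d}};

    genvar i;
    generate
        for (i = 0; i < 4; i = i + 1) begin : g_slice
            // Slice i of the output is selected by copy i of the register.
            assign c[32*i +: 32] = select_register_rep[i] ? a[32*i +: 32]
                                                          : b[32*i +: 32];
        end
    endgenerate
endmodule
```

Keep in mind that synthesis tools often merge registers they find to be logically equivalent, so the copies may need a tool-specific “keep” style constraint to survive. Many tools can also replicate high-fanout registers on their own when told to, which may be the cleaner route if yours supports it.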


A Message for the New Year

January 10, 2009

The holiday season is over, the new year is just starting and I am in a preaching mood.

I get many, many emails from people asking me to help them with their designs, interview questions or just give advice. Sometimes, if I am not fast enough in replying, I even get complaints and emails urging me to supply the answer “ASAP”. This is all OK and nice, but I would like you, the reader, to stop for a second and think about how much YOU are contributing to our community.

Not everyone can or likes to write a technical blog, but there are other options one can utilize – one of my favorites is posting on a forum. Even if you are a beginner in the field, post your questions; this alone is already a big help to many. I personally post from time to time on EDA board. Just go through that forum and have a quick look: some questions are very interesting while others can be extremely stupid (sorry) – who cares! What matters in my eyes is that the forum is building a database of questions and answers that can help you and others.

I assume that most of my readers are on the passive side of things (just a hunch). I hope this post will make you open an account on one of the forums and start posting.

P.S. Please use the comments section to recommend your favorite design-related forums or groups.


Interview Question – BCD Digit, Multiplied by 5

December 21, 2008

A while back, someone sent me the interview question I am about to describe, asking for help. I think it serves as a very good example of observing patterns and not rushing to conclusions.
I will post the answer immediately after describing the problem. However, I urge you to try and solve it on your own and see what you come up with. On we go with the question…

Design a circuit with minimum logic that receives a single BCD-coded digit (4 wires) and outputs that digit multiplied by 5 – also BCD coded (8 wires).

So, I hope you have a solution at hand and you didn’t cheat 😉 .

Let’s first get organized and present the inputs and the required outputs in a table (always a good habit).

Digit   Input (BCD)   x5   Tens (BCD)   Units (BCD)
  0        0000       00      0000         0000
  1        0001       05      0000         0101
  2        0010       10      0001         0000
  3        0011       15      0001         0101
  4        0100       20      0010         0000
  5        0101       25      0010         0101
  6        0110       30      0011         0000
  7        0111       35      0011         0101
  8        1000       40      0100         0000
  9        1001       45      0100         0101

Looking for some patterns we can see that we actually don’t need any logic at all to solve this problem!!
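
To spell the pattern out: the tens digit is always the input shifted right by one bit (the digit divided by 2, rounded down), and the units digit is 5 when the input is odd and 0 when it is even. A small sketch of the resulting wiring is below; the module and port names are made up for the example.

```verilog
// Sketch only: module and port names are assumed for illustration.
module bcd_times5 (
    input  wire [3:0] d,   // one BCD digit, 0..9
    output wire [7:0] p    // two BCD digits: p[7:4] = tens, p[3:0] = units
);
    // Tens digit = floor(d / 2): the upper three input bits, shifted down.
    assign p[7:4] = {1'b0, d[3:1]};

    // Units digit = 0101 (5) when d is odd, 0000 when d is even.
    assign p[3:0] = {1'b0, d[0], 1'b0, d[0]};
endmodule
```

For example, an input of 7 (0111) gives tens = 0011 and units = 0101, i.e. 35. The whole thing is just wires and three constant zeros, no gates at all.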

You would be amazed how many people get stuck on a certain solution and believe it is the minimal one, especially when the outcome is just one or two gates. Once you tell them it can be done with less, they easily find the solution. IMHO there is nothing really clever or sophisticated about this problem, but it demonstrates beautifully how hard it sometimes is for us to escape our initial ideas and approaches to a problem.

Come to think of it, this post was more about psychology and problem solving than digital design – please forgive…


A Coding Tip for Multi Clock Domain Designs

December 13, 2008

Multi clock domain designs are always interesting, but they almost always hide some synchronization problems that are far from trivial. There are tools on the market that identify all(??) clock domain crossings within a design. I personally have no experience with them, so I can’t give an opinion (although I have heard some unflattering remarks from fellow engineers).

It seems that each company has its own way of handling this problem. One of the oldest, easiest and IMHO most efficient ways is to keep strict naming guidelines for your signals, whether combinatorial or sequential!!

The most common way is to add a prefix to each signal that names its driving clock, e.g. clk_800_mux_32to1_out or clk_666_redge_acknowledge.
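
As a rough illustration of how the prefix makes a crossing visible, here is a sketch of a plain two-flop synchronizer written with this convention. Only clk_666_redge_acknowledge comes from the example names above; the module name and the other signal names are invented, and the sketch assumes the acknowledge is held long enough to be sampled by the 800 clock.

```verilog
// Sketch only: module name and the clk_800_* signal names are assumed.
// Assumes clk_666_redge_acknowledge stays stable long enough to be sampled.
module ack_sync (
    input  wire clk_800,
    input  wire clk_666_redge_acknowledge,  // driven in the clk_666 domain
    output wire clk_800_acknowledge         // usable in the clk_800 domain
);
    reg clk_800_acknowledge_meta;  // first stage, may go metastable
    reg clk_800_acknowledge_sync;  // second stage, settled value

    always @(posedge clk_800) begin
        // The only place a clk_666_ name appears inside a clk_800 block:
        // the crossing is obvious just from reading the prefixes.
        clk_800_acknowledge_meta <= clk_666_redge_acknowledge;
        clk_800_acknowledge_sync <= clk_800_acknowledge_meta;
    end

    assign clk_800_acknowledge = clk_800_acknowledge_sync;
endmodule
```

With such naming, any clk_666_ signal that shows up in clk_800 logic outside a synchronizer like this one is immediately suspicious.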

If you don’t use this simple technique yet, you won’t believe how useful it is. Many synchronization problems are actually discovered during the coding process itself, because a signal carrying the wrong clock prefix stands out as soon as you type it. Moreover, it makes life easier during code review.

If you have more tips on naming convention guidelines for signals in RTL – post them as a comment!


Another Reason to Add Hierarchies to Your Designs

November 30, 2008

We are usually very annoyed when the team leader insists on code restructuring and hierarchical design.
I know this very well from the other side too: trying to convince designers to restructure a design they already know so well.

Well, here is another small, yet important reason why you might want to do this more often.
Assume your design is more or less mature: you ran some simulations, went through a few synthesis runs, and see that you don’t meet timing.
You analyze the timing report just to find a huge timing path optimized by the tool and made of tons of NANDs, NORs, XORs and what not. Well, you see the starting point and the end point very clearly, but you find yourself asking whether this is the path that goes through the MUX or maybe through the adder.

Most logic designs are extremely complicated, and the circuit is not something you can just draw on paper. Moreover, when looking at a timing report of optimized logic, it is very hard to trace the exact path taken through the higher-level structures – or in other words, what part of the code am I really looking at here??!! Adding a hierarchy also adds its name to the optimized structures in the timing report, so you can easily pinpoint your problems.

I even saw an engineer who uses this technique as a debugging tool. If he has a very deep logic cloud, he intentionally builds a hierarchy around, say, a simple 2:1 MUX in the design and looks for it in the timing report. This enables him to “feel” how the synthesis tool optimizes the path and where manual optimization is needed.
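
A small sketch of what that might look like in code: the MUX gets its own module, so its instance name can show up in the cell names of the timing report. All module, instance and signal names here are invented for the example.

```verilog
// Sketch only: every name below is assumed for illustration.
// A 2:1 MUX wrapped in its own level of hierarchy.
module dbg_mux2 #(
    parameter WIDTH = 8
) (
    input  wire             sel,
    input  wire [WIDTH-1:0] in0,
    input  wire [WIDTH-1:0] in1,
    output wire [WIDTH-1:0] out
);
    assign out = sel ? in1 : in0;
endmodule

// Inside the deep logic cloud, the inline ternary
//   assign result = use_adder ? adder_out : bypass;
// is replaced by an instance with a recognizable name.
module datapath_example (
    input  wire       use_adder,
    input  wire [7:0] x,
    input  wire [7:0] y,
    input  wire [7:0] bypass,
    output wire [7:0] result
);
    wire [7:0] adder_out = x + y;

    dbg_mux2 #(.WIDTH(8)) u_dbg_result_mux (
        .sel (use_adder),
        .in0 (bypass),
        .in1 (adder_out),
        .out (result)
    );
endmodule
```

This only helps if the synthesis flow keeps the hierarchy instead of flattening it, which is a tool setting you will have to check. Keeping the boundary may also limit optimization across it, so for a pure debugging wrapper it is probably worth removing once the path is understood.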

Use this on your bigger blocks; it saves a lot of time and effort in the long run.


Challenge #3 – Counting the Number of “1”s

November 13, 2008

Time for a new challenge! The last two had some great responses and solutions. If you read through the comments you’d see there were some disagreements on which approach is best. Some claimed a hand crafted approach is the best, while others said it was more of a theoretical problem and we should use a synthesis tool to solve it.
Both have pros and cons, although for those specific challenges I personally tend to go with the hand crafted approach – you, of course, don’t have to agree with me.

This time we have a very practical problem that pops up again and again: counting the number of “1”s in a vector.
Use the metrics given in challenge #1 and find the minimal delay circuit for a combo cloud that counts the number of “1”s in an 8-bit vector. You get 8 bits in and supply 4 output bits which give a binary representation of the number of “1”s in the 8-bit vector.

Oh and don’t forget to mention how your method scales when counting 16-bit and 32-bit vectors.
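
To pin down the required function, here is a plain reference model you could check a candidate circuit against. It is a straightforward adder tree with made-up names, and it is not meant as an answer to the challenge: it says nothing about gate-level delay under the challenge #1 metrics.

```verilog
// Reference sketch only: names are assumed, and this is NOT claimed to be
// the minimal delay solution, just a functional model of the spec.
module popcount8_ref (
    input  wire [7:0] v,
    output wire [3:0] count   // 0..8
);
    // Level 1: add neighbouring bits, each result is 0..2.
    wire [1:0] pair0 = v[0] + v[1];
    wire [1:0] pair1 = v[2] + v[3];
    wire [1:0] pair2 = v[4] + v[5];
    wire [1:0] pair3 = v[6] + v[7];

    // Level 2: add neighbouring pairs, each result is 0..4.
    wire [2:0] quad0 = pair0 + pair1;
    wire [2:0] quad1 = pair2 + pair3;

    // Level 3: final sum, 0..8.
    assign count = quad0 + quad1;
endmodule
```

Written this way, every doubling of the input width adds one more adder level and one more output bit (a 16-bit input needs 5 output bits, a 32-bit input needs 6). How that compares with your own method is exactly what the scaling question is asking.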

Ready, set, go!