Archive for the ‘General’ Category


Null Convention Logic

October 11, 2007

It is extremely rare in our industry for a totally new approach to logic circuit design to be taken. I don’t know the exact reasons, and I really don’t want to get into the “fight” between tool vendors and engineers.

Null Convention Logic is a totally different approach to circuit design. It is asynchronous at its heart (I guess half of the readers of this post just dropped out now).
It is not new, and it is currently being pushed by its developers at Theseus Research.

They published a book, which I really recommend reading. It is not very practical with the current mainstream tools and flows, but it is a very interesting read that will open your eyes to new approaches in logic design.
You can get a good introduction to the book’s content by reading this paper. It is fairly technical and needs a few good hours to digest and to grasp the meaning behind it, especially given that it is so different from what we are used to - forget about AND, OR and NOT gates…

Book link here.


Going on Vacation…

September 15, 2007

I will be on vacation for a week and a half (until September 25th) and will have relatively limited access to the web. Therefore, I will leave you with a series of puzzles just for fun. I will try to make at least one of them hard enough…


10,000 Views …

September 3, 2007

Last week this blog crossed the 10,000 views mark, after 3+ months or so of being “on the air”. That number represents total views, not counting my own visits to the blog (if it included those, the number would surely triple).

Just wanted to say thanks to everybody who is reading, sending emails, suggesting and even complaining (though there are not many of those).
Please don’t be ashamed and continue sending emails. First, it is fun to get them and second, it helps me improve and write about what is really interesting for you guys.


The Coolest Binary Adder You Have Ever Seen…

August 26, 2007

I have to admit, I never thought I would ever link from this blog to YouTube, but given the nature of the following contraption I believe you will agree it was a must…

This is by far the coolest binary adder you have ever seen - link here.
It has almost everything: a reset “pin”, a carry-out “pin”, etc.
If you are into woodworking, you could visit the builder’s site and see exactly how this can be done - visit him here.

I also saw a mechanical binary adder in the Deutsches Museum, but it was based on water! I might try to get a video of that one running in the future, since the museum is 400 meters from my house. If you ever visit Munich and you don’t go there - shame on you!!!


Everything You Wanted to Know About Specman Verification and Never Dared to Ask

August 7, 2007

My friend Avidan Efody has a site full of advice, tips and tricks concerning verification with Specman. No, this is not a “plug your buddy’s blog” section, but if verification is what you do and you have never been there before - shame on you - you should visit it ASAP.

You can find it here.


Arithmetic Tips & Tricks #1

August 1, 2007

Every single one of us has, at one time or another, had to design a block that uses some arithmetic operations. Usually we use the necessary operator and forget about it, but since we are “hardware men” (should be said with pride and a full chest) we know there is much more going on under the hood. I intend to have a series of posts dealing specifically with arithmetic implementation tips and tricks. There are plenty of them; I don’t know them all, probably not even half. So if you have some interesting ones, please send them to me and I will post them with credits.

Let’s start. This post will explain 2 of the most obvious and simple ones.

  • Multiplying by a constant
  • Multipliers are extremely area hungry and thus when possible should be eliminated. One of the classic examples is when multiplying by a constant.
    Assume you need to multiply the value of register A by a constant factor, say 5. Instead of instantiating a multiplier, you could “shift and add”. 5 in binary is 101, so just add A to A00 (A with 2 trailing zeros, which has the effect of multiplying by 4) and you have the equivalent of multiplying by 5, since what you basically did was 4A+A = 5A.
    This is of course very simplistic, but when you write your code, make sure the constant is not passed on as an argument to a function. It might be that the synthesis tool knows how to handle it, but why take the risk.

  • Adding a bounded value
  • Sometimes (or even often), we need to add two values where one is much smaller than the other and bounded, for example adding a 3-bit value to a 32-bit register. The idea here is not to be neat by padding the 3-bit value with leading zeros and forcing it into a 32-bit register. Why? Adding two 32-bit values instantiates full adder logic on all 32 bits, while adding 3 bits to 32 will infer full adder logic on the 3 LSBs and increment logic (which is much faster and cheaper) on the rest of the bits. I am quite positive that today’s synthesis tools know how to handle this, but again, it is good practice to always check the synthesis result and see what came out. If you didn’t get what you wanted, it is easy enough to force it by coding it that way explicitly.
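
The two tricks above can be modeled in a few lines of software. This is only a behavioral sketch in Python, not HDL and not what a synthesis tool would produce; the function names `multiply_by_constant` and `add_bounded` and their parameters are made up for illustration.

```python
def multiply_by_constant(a: int, constant: int) -> int:
    """Multiply a by a non-negative constant using only shifts and adds.

    Every '1' bit in the constant contributes one shifted copy of a,
    so 5 (binary 101) becomes (a << 2) + a.
    """
    if constant < 0:
        raise ValueError("constant must be non-negative in this sketch")
    result = 0
    shift = 0
    while constant:
        if constant & 1:
            result += a << shift
        constant >>= 1
        shift += 1
    return result


def add_bounded(wide: int, small: int, wide_bits: int = 32, small_bits: int = 3) -> int:
    """Add a small_bits-wide value to a wide_bits-wide register value.

    The low small_bits bits go through a real addition (the "full adder"
    part); the remaining upper bits only ever see the carry, which is
    what an incrementer does. The result wraps at wide_bits.
    """
    low_mask = (1 << small_bits) - 1
    upper_mask = (1 << (wide_bits - small_bits)) - 1

    low_sum = (wide & low_mask) + (small & low_mask)    # full adder on the LSBs
    carry = low_sum >> small_bits                       # 0 or 1
    upper = ((wide >> small_bits) + carry) & upper_mask # incrementer on the rest

    return (upper << small_bits) | (low_sum & low_mask)
```

For example, `multiply_by_constant(7, 5)` adds `7 << 2` and `7`, and `add_bounded(10, 7)` returns the same value as a plain 32-bit addition of 10 and 7.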


    Replication

    July 25, 2007

    Replication is an extremely important technique in digital design. The basic idea is that under some circumstances it is useful to take the same logic cloud or the same flip-flops and produce more instances of them, even though only a single copy would normally be enough from a logical point of view.
    Why would I want to spend more area on my chip and create more logic when I know I could do without it?

    Imagine the situation in the picture below. The darkened flip-flop has to drive 3 other nets all over the chip, and due to the physical placement of the capturing flops it cannot be placed close to all of them. As a compromise, the layout tool finds some place in the middle, which in turn generates negative slack on all the paths.

    replication1.png

    We notice that in the above example the logic cloud just before the darkened flop has a positive slack or in other words, “some time to give”. We now use this and produce a copy of the darkened flop, but this time closer to each of the capturing flops.

    replication2.png

    Yet another option is to duplicate the entire logic cloud plus the sending flop, as pictured below. This will usually generate even better results.

    replication3.png

    Notice that we also reduce the fan out of the driving flop, thus further improving on timing.

    While writing the HDL code, it is important to make sure that the paths are really separated. This means that when you want to replicate flops and logic clouds, you give the replicated registers/signals/wires different names. It is a good idea to keep some sort of naming convention for replicated paths, so that in the future, when a change is made on one path, it is easy enough to mirror that change on the other replicas.

    There is no need to mention that when using this technique we pay in area and power - but I will still mention it :-)


    2 Lessons on PRBS Generators and Randomness

    July 10, 2007

    The topic of “what is random” is rather deep and complicated. I am far from an authority on the subject and must admit to being pretty ignorant about it. However, this post will deal with two very simple but rather common errors (or misbehaviors) in the use of random number generators.

      LFSR width and random numbers for your testbench

    Say you designed a pretty complicated block or even a system in HDL and you wish to test it by injecting some random numbers into the inputs (just for the heck of it). For simplicity, let’s assume your block receives an integer with a value between 1 and 15. You think to yourself that it would be pretty neat to use a 4-bit LFSR, which generates all possible values between 1 and 15 in a pseudo-random order, and just repeat the sequence over and over again. Together with the other types of noise you inject into the system, this should be pretty thorough, right? Well, not really!

    Imagine for a second what the sequence looks like: each number will always be followed by the same specific number! For example, you will never be able to verify a case where the same number is injected into the block twice in a row!
    To verify all other cases (at least all the different pairs of numbers) you would need to use an LFSR with a larger width (how much larger?). What you need to do then is to pick only 4 bits of this bigger LFSR and inject them into your block.

    I know this sounds very obvious, but I have seen this basic mistake made several times before - by me and by others as well (regardless of their experience level).
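
The effect described above is easy to reproduce in a small model. The sketch below is plain Python, not testbench code; the tap sets (4,3) and (16,14,13,11) are commonly published maximal-length choices, and the function names are invented for the example. One extra assumption of this sketch: the wide LFSR is clocked 4 times per sample, so that two consecutive 4-bit samples do not simply share shifted bits.

```python
from collections import defaultdict


def lfsr_step(state: int, width: int, taps: tuple) -> int:
    """Advance a Fibonacci LFSR by one step (shift left, feedback into bit 0)."""
    feedback = 0
    for tap in taps:                      # taps are 1-indexed bit positions
        feedback ^= (state >> (tap - 1)) & 1
    return ((state << 1) | feedback) & ((1 << width) - 1)


def narrow_samples(count: int, seed: int = 1) -> list:
    """Samples taken directly from a 4-bit LFSR (values 1..15)."""
    state = seed
    samples = []
    for _ in range(count):
        samples.append(state)
        state = lfsr_step(state, 4, (4, 3))
    return samples


def wide_samples(count: int, seed: int = 0xACE1) -> list:
    """4-bit samples cut out of a 16-bit LFSR, clocked 4 times per sample."""
    state = seed
    samples = []
    for _ in range(count):
        for _clock in range(4):
            state = lfsr_step(state, 16, (16, 14, 13, 11))
        samples.append(state & 0xF)       # keep only 4 bits of the wide LFSR
    return samples


def successors(samples: list) -> dict:
    """Map each value to the set of values that ever directly follow it."""
    follow = defaultdict(set)
    for current, nxt in zip(samples, samples[1:]):
        follow[current].add(nxt)
    return follow
```

With `successors(narrow_samples(1000))` every value has exactly one possible follower, which is the problem described above. With `successors(wide_samples(65535))` a value can be followed by many different values, including itself. Note that the 4 bits cut from the wide LFSR can also be 0, so a real testbench with a 1 to 15 input range would still have to skip or remap that value.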

      PRBS and my car radio “mix” function

    On sunny days I ride my bicycle to work, but on rainy days I chicken out and use the car for the 6km I have to go. Since I don’t often like what is on the radio, I decided to go through my collection of CDs, choose the 200 or so songs I would like to listen to in the car and burn them as MP3s on a single CD (don’t ask how much time this took). Unfortunately, if you just pop in the CD and press play, the songs play in alphabetical order. Luckily enough, my car CD player has a “mix” option. So far so good, but after a while I started to notice that when using the “mix” option, song 149 is always followed by song 148, which in turn is followed by song 18, and believe me this is annoying to the bone. The whole idea of “mixing” is that you don’t know what to expect next!

    I assume that the “mix” function is accomplished by some sort of PRBS generator, which explains the deterministic order of song playing. But my advice to you, if you design a circuit of this sort (for a CD player, or whatever), is to introduce some sort of true randomness into the system. For example, one could time the interval between power-up of the radio and the first human keystroke on the CD player and use this to load the PRBS generator with a seed value, thus producing a different starting song for the playlist each time. This, however, does not solve the problem of the song playing order being deterministic. But given such a “random” number from the user, one could use it to generate an offset for the PRBS generator, making it “jump” an arbitrary number of steps instead of the usual one step.

    My point was not to indicate that this is the most clever way to do things, but I do think that with little effort one could come up with slightly more sophisticated systems, that make a big difference.
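
As a rough illustration of the seed-and-jump idea, here is a small Python model. It is only a sketch under several assumptions: `elapsed_ticks` stands for a hypothetical counter value measured between power-up and the first key press, the generator is an 8-bit LFSR with the commonly published taps (8,6,5,4), and states above the number of songs are simply skipped. Restricting the jump to values that share no factor with the LFSR period is an addition of this sketch, made so that every song is still played exactly once.

```python
from math import gcd

WIDTH = 8
TAPS = (8, 6, 5, 4)          # commonly published maximal-length taps for 8 bits
PERIOD = (1 << WIDTH) - 1    # 255 non-zero states


def lfsr_step(state: int) -> int:
    """Advance the 8-bit Fibonacci LFSR by one step."""
    feedback = 0
    for tap in TAPS:
        feedback ^= (state >> (tap - 1)) & 1
    return ((state << 1) | feedback) & PERIOD


def make_playlist(num_songs: int, elapsed_ticks: int) -> list:
    """Return a play order for songs 1..num_songs derived from elapsed_ticks."""
    if not 1 <= num_songs <= PERIOD:
        raise ValueError("this sketch supports 1..255 songs")

    # Seed: any non-zero state, taken from the low part of the counter.
    seed = elapsed_ticks % PERIOD + 1

    # Jump: number of LFSR steps between songs, taken from the rest of the
    # counter and bumped until it is coprime with the period.
    jump = (elapsed_ticks // PERIOD) % PERIOD + 1
    while gcd(jump, PERIOD) != 1:
        jump += 1

    state = seed
    order = []
    while len(order) < num_songs:
        if state <= num_songs:            # skip states that map to no song
            order.append(state)
        for _ in range(jump):
            state = lfsr_step(state)
    return order
```

Calling `make_playlist(200, ticks)` with different `ticks` values changes both the starting song and the step between songs, so the "149 then 148 then 18" pattern no longer repeats on every drive.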


    Resource Sharing vs. Performance

    June 27, 2007

    I wanted to spend a few words on the issue of resource sharing vs. performance. I believe it is trivial for most engineers but a few extra words won’t do any harm I guess.
    The issue is relevant most evidently when there is a need to perform a “heavy” or “expensive” calculation on several inputs in a repeated way.

    The approaches usually in consideration are: building a balanced tree structure, sequencing the operations, or a combination of the two.

    A tree structure architecture is depicted below. The logic cloud represents the “heavy” calculation. One can see immediately that the operations on a,b and on c,d are done in parallel, which saves latency at the expense of instantiating the logic cloud twice.

    tree_structure.png

    The other common solution, depicted below, is to use the logic cloud only once but introduce a state machine which controls a MUX that determines which values will be calculated on the next cycle. The overhead of designing this FSM is minimal (and even less). The main saving is in using the logic cloud only once. Notice that we pay here in throughput and latency! With some more thinking, one could also save a calculation cycle by introducing another MUX in the feedback path, using one of the inputs just for the first calculation and always using the feedback path thereafter.

    resource_sharing.png
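
The trade-off between the two structures can be put into numbers with a toy model. The Python sketch below is not a hardware description; `tree_reduce` and `shared_reduce` are made-up names, and the "heavy" calculation is passed in as an arbitrary two-input function that is assumed to be associative (for example `max`), so that both orderings give the same result.

```python
from typing import Callable, Sequence, Tuple

Op = Callable[[int, int], int]


def tree_reduce(op: Op, inputs: Sequence[int]) -> Tuple[int, int, int]:
    """Balanced tree: returns (result, stages, widest_stage).

    stages is the number of tree levels the data passes through and
    widest_stage is the largest number of operations done in parallel.
    """
    if not inputs:
        raise ValueError("need at least one input")
    level = list(inputs)
    stages = 0
    widest = 0
    while len(level) > 1:
        pairs = len(level) // 2
        nxt = [op(level[2 * i], level[2 * i + 1]) for i in range(pairs)]
        if len(level) % 2:                # odd element passes through
            nxt.append(level[-1])
        level = nxt
        stages += 1
        widest = max(widest, pairs)
    return level[0], stages, widest


def shared_reduce(op: Op, inputs: Sequence[int]) -> Tuple[int, int, int]:
    """One shared logic cloud with a feedback path: (result, cycles, 1).

    The first input is loaded directly (the extra MUX in the feedback
    path), then one new input is combined with the feedback per cycle.
    """
    if not inputs:
        raise ValueError("need at least one input")
    acc = inputs[0]
    cycles = 0
    for value in inputs[1:]:
        acc = op(acc, value)              # same cloud reused every cycle
        cycles += 1
    return acc, cycles, 1
```

For four inputs, `tree_reduce(max, [3, 9, 4, 7])` finishes in 2 stages with 2 operations side by side in the first stage, while `shared_reduce(max, [3, 9, 4, 7])` needs 3 cycles but only ever uses one operation at a time.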


    Non-Readable Papers

    June 19, 2007

    I actually enjoy surfing the web and reading technical papers which are somewhat related to my work. A lot of the good stuff appears in books, but if you want to find the coolest techniques and breakthrough ideas, they naturally first appear in technical papers.

    I have to admit I don’t like the format used by the standard technical papers, some of them seem to be made non-readable on purpose. Here is a real paper that can compete for the dubious title of being the most non-readable paper around.

    Here is one of my papers. Before you continue, stop and try digesting what was written…

    If you made it through the first page, consider yourself a hero. That “technical paper” was generated automatically using SCIgen.

    I bet a lot of people would be impressed if you presented a list of papers generated by this service. A sort of high-tech “emperor’s new clothes” syndrome - no one wants to admit he doesn’t understand a technical paper describing some “major” work in his own field…