
Why Not Just Over-Constrain My Design?

June 25, 2008

This is a question often raised by beginners when trying to squeeze performance from their designs.
So, why doesn't over-constraining a design necessarily improve performance? The truth is that I don't really know. I assume it is connected to the internal variables and measuring algorithms inside the synthesis tool, and to the fact that it gives up trying to improve performance once it reaches a certain local minimum in some n-variable space (really!).

But empirically, I (and many others) have found that you cannot get the best performance just by over-constraining your design in an unrealistic manner. The constraint has to be fairly close to the actual maximum speed the design can reach. The graph below sums up this problem pretty neatly.

As seen above, there is a certain min-max range for the achievable frequency, and its peak is not reached at the highest constraint!
The flat region on the left of the figure is the speed reached without any optimization, that is, right after mapping your HDL into gates. As we move towards the right, we see actual speed improvement as we constrain for higher speeds. Then a peak is reached, and constraining for even higher speeds results in poorer performance.
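
If you want to see where that peak sits for a particular design, the crude empirical approach is simply to sweep the clock constraint and record the achieved frequency each time. The sketch below mimics such a sweep; the run_synthesis() function and its numbers are invented stand-ins for illustration only, and in a real flow it would wrap your actual synthesis/place-and-route scripts and parse the achieved Fmax from the timing report.

```python
# Sketch of the constraint-vs-achieved-frequency curve described above.
# run_synthesis() is a made-up stand-in that mimics the typical shape
# (flat, then tracking the constraint, then degrading); in practice it
# would write the clock constraint, run synthesis/P&R, and parse Fmax.

def run_synthesis(target_mhz):
    UNCONSTRAINED = 120.0   # speed right after mapping, no optimization (invented)
    REALISTIC_MAX = 200.0   # roughly the best the design can actually do (invented)
    if target_mhz <= UNCONSTRAINED:
        return UNCONSTRAINED                   # flat region on the left
    if target_mhz <= REALISTIC_MAX:
        return float(target_mhz)               # the tool keeps up with the constraint
    overshoot = target_mhz - REALISTIC_MAX     # over-constrained: results get worse
    return max(UNCONSTRAINED, REALISTIC_MAX - 0.5 * overshoot)

def sweep(start_mhz, stop_mhz, step_mhz):
    results = [(t, run_synthesis(t)) for t in range(start_mhz, stop_mhz + 1, step_mhz)]
    for target, achieved in results:
        print(f"constraint {target:4d} MHz -> achieved {achieved:6.1f} MHz")
    best_target, best_achieved = max(results, key=lambda r: r[1])
    print(f"peak: {best_achieved:.1f} MHz, reached when constraining to {best_target} MHz")

sweep(80, 320, 20)
```

Running this prints the flat region, the rising part, and the drop-off past the peak, i.e. the same shape as the curve above.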

I have worked relatively little with FPGAs in my career, but I have seen this phenomenon there as well. Keep it in mind.


10 comments

  1. I have seen exactly the same thing in FPGAs. Even if you don't care about +/- 5% of your maximum frequency (when prototyping in an FPGA, for example), it still makes sense to avoid over-constraining your design, as the runtime of the tools may increase dramatically.

    At http://www.da.isy.liu.se/~ehliar/stuff/place_and_route.html I have a small experiment which demonstrates that the runtime increases by almost an order of magnitude when a certain design is just barely over-constrained. (Which makes sense, as the tool suddenly has to spend a lot more time failing to arrange the circuit to meet timing.)


  2. Andreas,
    The point about the running time is a VERY important one I forgot to mention! Maybe it deserves a separate post. Thanks for pointing this out.

    N.


  3. Let’s assume you are trying to reach some frequency F and there are N violators when the design is properly constrained.

    If you increase the constraint to F + delta, then you increase the WNS (worst negative slack) of the true failing paths, but you also create false violators and increase the TNS (total negative slack).

    If delta is small, then the synthesis tool *might* be able to fix all the true violators, or at least more of the ones you care about.

    If delta is large, then you create many more false violators than true violators. Runtime ends up increasing dramatically and the tool finds it more beneficial to its cost function to reduce the large number of false violators (perhaps they are easier) than to reduce the true violators.

    Kind of simplistic, but basically what is happening.
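
To make the comment above concrete, here is a toy calculation with invented slack numbers. Tightening the clock by delta simply shifts every path's slack down by delta, so the WNS gets worse and the TNS fills up with "false" violators that were never real problems at the true target frequency.

```python
# Toy illustration of WNS/TNS under over-constraining.
# Slacks are in ns and entirely made up; a negative slack is a violation.
# Tightening the clock by `delta` ns subtracts delta from every path's slack,
# so paths that were comfortably passing become "false" violators.

true_slacks = [-0.30, -0.10, 0.05, 0.20, 0.40, 0.80]   # at the real constraint F

def report(slacks, label):
    violators = [s for s in slacks if s < 0]
    wns = min(slacks)        # worst negative slack
    tns = sum(violators)     # total negative slack
    print(f"{label}: violators={len(violators)}  WNS={wns:+.2f}  TNS={tns:+.2f}")

report(true_slacks, "properly constrained   ")

for delta in (0.1, 0.5):
    tightened = [s - delta for s in true_slacks]
    report(tightened, f"over-constrained +{delta:.1f}ns")
```

With the small delta the tool still sees mostly the real problems; with the large one, most of the negative slack comes from false violators, which is exactly what skews its cost function.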


  4. I agree with your analysis but still think it is a bit simplistic. I believe a lot hides under the term “cost function”.

    I am also somewhat amazed that the main synthesis vendors (they know who they are) didn't come up with an ***automatic*** incremental synthesis approach.
    It is true it can be done in incremental stages, but what I mean is to have a database automatically saved from the previous synthesis run.

    Most designs are synthesized (even at the block level) over and over again. For some extremely timing-critical blocks, I remember running synthesis on the order of tens of times. Sometimes modifying the frequency just a little solved the “previous” critical path only to generate a totally “new” critical path. This problem would be rather easily solved by the approach I described.

    But hey, I guess I am missing something, since they would have done it already…


  5. Couple of points:
    1. Running incremental synthesis many times is pointless, as the tool doesn't know what other constraints may be there. In short, it doesn't have the STA information, which usually includes some layout information.

    2. If changing the frequency by a small percentage (in either FPGA or ASIC design flows) causes the design to fail at a different point, you need to change the constraints the synthesis tool starts from.

    Most synthesis tools work by solving paths in order of priority (usually slack), and at some point, when the improvements become marginal, they stop. Therefore the initial conditions are very important, and our job as designers is to make sure the tool has as much information as possible to work efficiently (see the sketch below).

    E.
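
The priority-by-slack behaviour described in the previous comment can be caricatured in a few lines. The paths, the improvement amounts, and the give-up threshold below are all invented; the point is only to show the fix-the-worst-path-first loop and why the starting point handed to the tool matters.

```python
# Toy model of "fix the worst-slack path first, stop when gains become marginal".
# Path names, slacks, and gains are invented; real tools use far richer cost functions.

paths = {"ctrl_fsm": -0.45, "mult_stage2": -0.30, "bus_mux": -0.12}   # name -> slack (ns)

MARGINAL_GAIN = 0.02   # stop once a fix buys less than this (ns)

def try_to_improve(slack):
    """Pretend optimization step: deep violations are assumed easier to improve."""
    gain = 0.10 if slack < -0.2 else 0.01
    return slack + gain, gain

while True:
    worst = min(paths, key=paths.get)          # highest priority = worst slack
    if paths[worst] >= 0:
        print("timing met")
        break
    new_slack, gain = try_to_improve(paths[worst])
    if gain < MARGINAL_GAIN:
        print(f"giving up: improvement on {worst} is marginal ({gain:.3f} ns)")
        break
    paths[worst] = new_slack
    print(f"optimized {worst}: slack now {new_slack:+.2f} ns")
```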


  6. Nice to see you on board at last Erez!


  7. This happens with the Xilinx FPGA tools; Altera doesn't work that way. The constraint value doesn't seem to affect the effort of Altera's place-and-route.
    I've done some research on FPGA timing; we can use a Perl script (Xplorer) to arrive at the peak performance. Thanks 🙂 … Hope this is useful.
    In my experience, constraining a Xilinx device is like squeezing a balloon into a box: you squeeze in one dimension and it expands in the others. You have to balance the lengths. In FPGA designs you have a number of paths with different lengths. When you optimize one path to meet the constraint, you use up resources, and a later path will fail to route or will be routed with a very long interconnect delay.


  8. One reason why over-constraining does not help is that the tool will “over-fix” paths. By that, I mean it could use larger-than-needed gates, which will unnecessarily slow down the paths loaded by those gates. So your net critical path could end up being slower.

    Of course, placement-aware (wire-load) synthesis will fare much worse with larger gates, but I don't think you were looking at that.

    — r


  9. I agree with Rohit. I work with ASICs, and that curve is exactly what I see when over-constraining a circuit. The big jump in the curve means that the synthesis tool is performing logic optimizations, which may reduce the amount of logic in the critical path. But there comes a point where no more logic optimization is possible, and the tool starts using larger gates to improve timing by strengthening the drive current. However, these larger gates have a larger intrinsic delay as well as a larger input capacitance, which works against the desired frequency.
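
A back-of-the-envelope calculation with the standard logical-effort delay model (stage delay ≈ p + g·C_out/C_in, in units of tau) shows the same trade-off. The sizes and loads below are made up, and this simple model even keeps the intrinsic delay p constant, yet the increased input capacitance alone is enough to make oversizing hurt.

```python
# Logical-effort style delay model: stage delay = p + g * (C_out / C_in), in tau.
# A fixed-size driver feeds gate B, which drives a fixed load. Upsizing B speeds
# up B itself but loads the driver more, so past some size the total delay grows.
# All numbers are invented for illustration.

P_INV, G_INV = 1.0, 1.0      # inverter parasitic delay and logical effort
C_LOAD = 16.0                # fixed load driven by gate B (unit input capacitances)

def stage_delay(p, g, c_out, c_in):
    return p + g * (c_out / c_in)

def chain_delay(size_b):
    d_driver = stage_delay(P_INV, G_INV, c_out=size_b, c_in=1.0)   # driver sees B's input cap
    d_b = stage_delay(P_INV, G_INV, c_out=C_LOAD, c_in=size_b)     # B drives the fixed load
    return d_driver + d_b

for size in (1, 2, 4, 8, 16):
    print(f"gate B size x{size:2d}: total delay = {chain_delay(size):5.1f} tau")
```

The total delay here is 2 + size + 16/size, which bottoms out at size 4 and then climbs again, the same kind of reversal the curve in the post shows.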


  10. Rohit, you are right…
    If we target a higher frequency and the cells needed to meet our constraints are not available in the technology library, the synthesis tool picks larger cells from the library. These need more area, so our design becomes slower and we get negative slack, which is not desirable.


