The Qualis Library: Working with Behavioral Compiler -- Some Helpful Tips

By David Black, High-Level Design Wizard, Qualis Design Corporation

I've worked with many of our clients on Behavioral Compiler designs, and have accrued a few helpful tips for users of Synopsys BC. Many engineers have found BC to be extremely effective in reducing design time (5x) when used appropriately. Use these tips to help save you time and stress.

Tip Index

BC TIP: Initialize variables close to their usage site
BC TIP: Use bc_view
BC TIP: Misuse of Verilog 'disable' can make life tough
BC TIP: Make I/O stand out using smart naming conventions
BC TIP: Add a debug output register
BC TIP: Verilog tasks not useful with BC... yet
BC TIP: Appropriate designs for BC
BC TIP: Simplifying constraining fast handshakes
BC TIP: Checklist of common BC traps
BC TIP: Back-to-back loops need a clock
SYNOPSYS TIP: Synopsys manpages at the UNIX prompt

BC TIP: Initialize variables close to their usage site

Behavioral Compiler needs to have variables initialized properly to provide good results. If you fail to initialize the entire variable in one chunk, BC doesn't recognize the initialization as valid. Failure to initialize results in warning messages, may create simulation mismatches, and may create unnecessary registers.

    Warning: Variable 'MAIN/ERROR_CONDITON/x_loop_connect' is not initialized (HLS-155)

    reg [31:0] x;
    ...
    while (COND) begin :WHILE
      if (COND) begin
        x[31:8] = a;
        end //if
      else begin
        x[31:8] = b;
        end //else
      x[7:0] = 0;
      data_out <= x;
      @(posedge clock);
      end //while

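The fix is to add 'x = 0;' inside the 'while' loop, just before the 'if'. Although the assignment looks redundant, it initializes all of x in one chunk, right where x is used.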
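
As a minimal sketch of this tip, the loop above can be wrapped in a small module with the whole of x assigned in one statement just before the 'if'. The module name, the port list, and the sel input are placeholders invented for illustration, and whether this alone clears HLS-155 in a given BC release is an assumption to check against the tool.

```verilog
// Sketch only: module, port, and signal names are hypothetical.
module init_near_use (clock, sel, a, b, data_out);
  input         clock;
  input         sel;
  input  [23:0] a, b;
  output [31:0] data_out;
  reg    [31:0] data_out;
  reg    [31:0] x;

  always begin :MAIN
    forever begin :LOOP
      x = 0;               // initialize all of x in one statement, next to its use
      if (sel) begin
        x[31:8] = a;
        end //if
      else begin
        x[31:8] = b;
        end //else
      x[7:0] = 0;
      data_out <= x;
      @(posedge clock);
      end //LOOP
  end //MAIN
endmodule
```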

I recommend eliminating all BC warning messages. Only occasionally is it acceptable to leave the warnings.

BC TIP: Use bc_view

Behavioral Compiler supports a graphical analysis tool called bc_view. It's part of the BC package, but is not enabled by default. To turn on bc_view, be sure to put the following in your scheduling scripts:

   bc_enable_analysis_info = true       /* must occur before analyze & elaborate */

Two minor related tips:

Make sure your DISPLAY environment variable is correctly set. You can use the dc_shell command set_unix_variable to set it during a session if need be.
bc_view seems to work best if the display host is of the same architecture as the executing host. This is likely to be a problem only for folks using a load balancer or other remote queuing tools.

BC TIP: Misuse of Verilog 'disable' can make life tough

Behavioral Compiler supports the use of Verilog's 'disable' to emulate VHDL's 'exit' and 'next'. This works if you follow the examples closely; however, overuse can lead to problems. There are five common mistakes.

MISTAKE 1. Disabling the wrong block

    begin :LOOP forever begin :BODY
       @(posedge clock);
       if (COND1) disable BODY; // equivalent to VHDL 'NEXT'
       if (COND2) disable LOOP; // equivalent to VHDL 'EXIT'
    end end

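The label on the outer 'begin' (LOOP) encloses the whole loop, so disabling it exits the loop. The label on the inner 'begin' (BODY) encloses a single iteration, so disabling it skips to the next iteration. Swapping the two labels is the usual mistake.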

MISTAKE 2. Asserting outputs and disabling without an intervening clock

   request_out <= 1;
   @(posedge clock);
   begin :LOOP forever begin
     if (acknowledge_in == 1'b1) begin
       request_out <= 0;
       disable LOOP;
     end
     @(posedge clock);
   end end

As written, the cycle in which request_out goes low is unspecified. Place an @(posedge clock) between the output assignment and the disable.
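
A sketch of the handshake with the clock edge in place, following the same pattern used in the fast handshake tip later in this article. The module wrapper and its name are hypothetical and exist only to make the fragment complete.

```verilog
// Sketch only: module and port names are hypothetical.
module handshake_exit (clock, acknowledge_in, request_out);
  input  clock;
  input  acknowledge_in;
  output request_out;
  reg    request_out;

  always begin :MAIN
    request_out <= 1;
    @(posedge clock);
    begin :LOOP forever begin
      if (acknowledge_in == 1'b1) begin
        request_out <= 0;
        @(posedge clock);   // pins the cycle in which request_out drops
        disable LOOP;
      end
      @(posedge clock);
    end end
  end //MAIN
endmodule
```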

MISTAKE 3. Attempting to disable more than one level of hierarchy

Verilog allows disabling any block from a simulation point of view. Unfortunately, BC does not support exiting more than a single level of loop hierarchy. This should be addressed in a future BC version (time unspecified).

MISTAKE 4. Disabling a labeled block not associated with a loop

Verilog semantics allow for disabling many things. Intuitively, disabling a block is nice as an error escape mechanism. In code:

   begin :CODE_SEQUENCE
     ...
     if (ERROR_CONDITION) begin
       error_flag = 1;
       disable CODE_SEQUENCE;
       end
     ...
     if (ERROR_CONDITION) begin
       error_flag = 1;
       disable CODE_SEQUENCE;
       end
     ...
     if (ERROR_CONDITION) begin
       error_flag = 1;
       disable CODE_SEQUENCE;
       end
     ...
     if (ERROR_CONDITION) begin
       error_flag = 1;
       disable CODE_SEQUENCE;
       end
     ...
   end //CODE_SEQUENCE

A work-around involves setting a bogus variable to true and using a loop that ends with a test that unconditionally exits. Unfortunately, there is a drawback as discussed in Mistake #5 below.
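
One possible reading of that work-around is sketched below; the article does not show the author's actual code, so treat this as an assumption. The sequence is wrapped in a labeled loop, each error exit disables the loop, and a final test on a variable that is always true exits after the first pass. The module, the err1_i and err2_i inputs, and the always_true variable are all hypothetical names.

```verilog
// Sketch only: names are hypothetical and this is one reading of the work-around.
module error_escape (clock, err1_i, err2_i, error_flag_o);
  input  clock;
  input  err1_i, err2_i;
  output error_flag_o;
  reg    error_flag_o;
  reg    error_flag;
  reg    always_true;        // the "bogus" variable

  always begin :MAIN
    error_flag  = 0;
    always_true = 1;
    begin :CODE_SEQUENCE forever begin
      if (err1_i) begin
        error_flag = 1;
        @(posedge clock);
        disable CODE_SEQUENCE;
      end
      @(posedge clock);      // stands in for the first chunk of real work
      if (err2_i) begin
        error_flag = 1;
        @(posedge clock);
        disable CODE_SEQUENCE;
      end
      @(posedge clock);      // stands in for the second chunk of real work
      if (always_true) begin // always taken, so the "loop" runs only once
        @(posedge clock);
        disable CODE_SEQUENCE;
      end
    end end
    error_flag_o <= error_flag;
    @(posedge clock);
  end //MAIN
endmodule
```

Every escape here is a disable of the enclosing loop, which is a single level of loop hierarchy, but each one also adds a transition for the scheduler to consider.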

MISTAKE 5. Too many disables leads to long schedule times

BC runs into a complexity problem when the number of states and transitions gets too large. Because BC searches for the best place to put each operation, the search space can grow exponentially as states and transitions are added, and scheduling slows down. This is Synopsys' recommendation to keep the number of operations under 150 (an artificial number) viewed from another angle: a large number of operations OR transitions means a large number of combinations to consider, and the number of combinations directly affects the tool's performance.

BC TIP: Make I/O stand out using smart naming conventions

Behavioral Compiler places specific constraints on all I/O. Borrowing VHDL terminology, all I/O are referred to as 'signals', whereas internal temporaries and registers are referred to as 'variables'. Any net that crosses the process (always block) boundary is designated a 'signal'. Because all I/O is scheduled, it is important to be able to spot all I/O quickly in your source code. Outputs are assigned with non-blocking assignments, which helps. Inputs are identified merely by their appearance.

I recommend that BC designers combine a suffix naming convention with explicitly copying inputs into temporary variables. The naming convention is simply to append _O (oh) to output signals and _I to input signals.

Thus:

    data_O <= value;
    @(posedge clock);
    ack = ack_I;
    while (ack == 0) begin
      ack = ack_I;
      @(posedge clock);
      end

BC TIP: Add a debug output register

Behavioral Compiler code is sometimes difficult to debug if problems are found at the gate level. This often happens to teams that use emulation as part of their methodology. Part of the reason is the level of abstraction: BC designs an efficient FSM that can be hard to follow. I recommend adding a debug_State output port (registered by default) and assigning it unique values throughout the code. Ideally, this register should be conditionally compiled in (use of the m4 preprocessor is recommended). This allows the register to be used for emulation or early prototypes, and removed on subsequent "clean" compilations.

    // Verilog snippets...
    module Whopper (... , debug_State_o );
    ...
    output [7:0] debug_State_o; reg [7:0] debug_State_o; // COMMENT OUT FOR FINAL GATES
    ...
    debug_State_o <= 0; // COMMENT OUT FOR FINAL GATES
    @(posedge clock);
    ...
    if (CONDITION) begin
      ...
      debug_State_o <= 1; // COMMENT OUT FOR FINAL GATES
      @(posedge clock);
      end
    else begin
      ...
      debug_State_o <= 2; // COMMENT OUT FOR FINAL GATES
      @(posedge clock);
      end
    ...

BC TIP: Verilog tasks not useful with BC... yet

The latest release of Behavioral Compiler announced some really cool features. Not the least of these was the promise of using tasks as a level of hierarchy. What the announcement failed to point out was the uselessness of the current implementation. Yes, you can use tasks; however, you may not use signals (I/O) within the tasks. That is unfortunate because a natural application of tasks would be to create I/O handlers (e.g., Read_PCI, Write_PCI, Send_Packet, etc.).

The reason tasks don't handle I/O is twofold. First, Verilog specifies that task inputs are read statically at the time of invocation and outputs are computed statically when the task finishes (returns). So if you write:

    task Read;
    input  [15:0] addr;
    output  [7:0] data;
    output [15:0] addr_port_o;
    input   [7:0] data_port_i;
    ...
    endtask
    ...
    Read(fifo,result,real_addr_o,real_data_i);

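then real_addr_o is not updated until the task returns, and real_data_i is sampled only once, when the task is called. The task therefore cannot drive or watch the ports cycle by cycle.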
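
The first point can be seen in simulation. The testbench below is a sketch, not part of the original tip: the module name, the two-clock task body, and the stimulus values are made up for illustration. It calls a task shaped like Read, prints real_addr_o while the task is still waiting on the clock, and prints it again after the task returns.

```verilog
// Sketch only: testbench name, task body, and stimulus values are hypothetical.
module task_output_demo;
  reg        clock;
  reg [15:0] real_addr_o;
  reg  [7:0] real_data_i;
  reg  [7:0] result;

  task Read;
    input  [15:0] addr;
    output  [7:0] data;
    output [15:0] addr_port_o;
    input   [7:0] data_port_i;
    begin
      addr_port_o = addr;    // changes only the task's own copy for now
      @(posedge clock);
      @(posedge clock);
      data = data_port_i;    // the value that was copied in at the call
    end
  endtask

  // Free-running clock with rising edges at 5, 15, 25, ...
  initial begin
    clock = 0;
    forever #5 clock = ~clock;
  end

  // Caller: the actual arguments are written back only when Read returns.
  initial begin
    real_addr_o = 16'h0000;
    real_data_i = 8'hAA;
    Read(16'h1234, result, real_addr_o, real_data_i);
    $display("after return: addr=%h data=%h", real_addr_o, result);
    $finish;
  end

  // Observer: look at the port, and change the input, while Read is waiting.
  initial begin
    #7;
    $display("during task:  addr=%h", real_addr_o);
    real_data_i = 8'h55;
  end
endmodule
```

Under a standard Verilog simulator the first print should still show the address register's initial value, and the data returned should be the value present at the call rather than the one written in mid-task. That is why a task written this way cannot act as a cycle-by-cycle I/O handler.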

BC TIP: Appropriate designs for BC

This topic is a difficult one to approach, but there are some key ideas.

Does the design specification use algorithms or is it specified as state machines? BC does best with algorithmic descriptions. FSMs are probably best implemented using RTL.
Are there a lot of datapath constructs, and how specifically are these to be implemented? If you know the exact datapath implementation you want, BC may not be the right tool. On the other hand, if you need help finding the correct implementation, BC may be ideal.
Are you concerned with getting every last unneeded gate out of the design, or do you want a design implementation as fast as possible and don't mind a few extra gates? If the former, you may feel more comfortable with RTL. If you can spare a few gates, BC can help quite a bit once you master the coding style (which is quite simple, but does take learning).
Does your design have a lot of pipelining opportunities? If yes, BC has some really good mechanisms to automate pipelining datapaths.

BC TIP: Simplifying constraining fast handshakes

What follows is a technique to simplify application of cycle constraints on fast handshake loops. This is necessary as part of defensive coding to guard against possible future changes that could break proper scheduling of fast handshake loops.

STEP 1: In the source code, label all fast handshake loops with a unique and consistent convention. For example, begin each label with FASTLOOP_. In several existing designs, this has already been done for other reasons.

    EXAMPLE VERILOG SOURCE CODE:

    read_req_o <= 1'b1;
    @(posedge clock);
    begin: FASTLOOP_READ_BUS forever begin
        data = data_i;
        if (bus_grant_i == 1'b1) begin
            read_req_o <= 1'b0;
            @(posedge clock);
            disable FASTLOOP_READ_BUS;
        end
        @(posedge clock);
    end end

STEP 2: In the scheduling script, use find() to collect every loop whose name contains "FASTLOOP_", then apply set_cycles to each one.

    /* Immediately following elaboration */
    $LOOP_LIST = find("cell","*FASTLOOP_*",-hierarchy)
    foreach ($LOOP,$LOOP_LIST) {
        set_cycles 1 -from_begin $LOOP -to_end $LOOP
    }/*endforeach*/

That's all there is to it. A similar technique may be applied to other situations. The advantage of using naming conventions for this should be obvious: no scripting commands in the HDL source code.

BC TIP: Checklist of common BC traps

The following is a list of common traps encountered by folks using BC. Hopefully this list will be of use on future designs. Some are technical and others are psychological. These are presented in no particular order. The order does not reflect how often each trap is encountered; those I have encountered most often are marked with an asterisk (*).

    01. Disabling the wrong named block *
    02. Mixing blocking and non-blocking assignments *
    03. Leaving out clocks in one edge of a case/if branch (usually else) *
    04. Attempting to disable more than one level of hierarchy *
    05. Getting lost with indentation of begin/end blocks *
    06. Adding clocks unnecessarily
    07. Casual mixing of RTL/BC code
    08. Omitting simulation of behavior before scheduling
    09. Too many operations as a result of unrolled for loops
    10. Missing DesignWare-Foundation libraries
    11. Overlooking bc_time_design reports of multicycle candidates
    12. Missing clock edges on entering/leaving loops
    13. Failure to adhere to signal naming conventions
    14. Not reviewing clock edge usage
    15. Cognitive dissonance
    16. Wrong version of Synopsys tools
    17. Ignoring warnings/error messages from bc_check and/or schedule
    18. Overlooking clock edges in m4 macros
    19. Overzealous desire to save clock cycles (+)
    20. Overzealous desire to save registers and gates (+)
    21. Not labeling begin/end blocks rigorously
    22. Overlooking clocks because coding style hides them
    23. Mistakes due to lack of sleep from overworking
    24. Not running small experiments on new coding styles
    25. Using styles discouraged or not supported officially
    26. Back-to-back loops add a clock in superstate causing mismatches

BC TIP: Back-to-back loops need a clock

Sometimes folks like to code back-to-back loops without an intervening clock edge. Unfortunately, BC doesn't support this and in superstate_fixed scheduling mode will add the clock for you. If your code expects this to be a tight interface, the extra clock will cause simulation mismatches.

Example of problem:

  // Get two bytes from the interface
  request_o <= 1;
  @(posedge clock);
  begin :LOOP1 forever begin
    data1 = data_i;
    if (ack_i == 1) begin
      @(posedge clock);
      disable LOOP1;
      end
    @(posedge clock);
  end end //LOOP1
  // BC will require a clock here
  // Superstate will silently insert a clock
  begin :LOOP2 forever begin
    data2 = data_i;
    if (ack_i == 1) begin
      request_o <= 0;
      @(posedge clock);
      disable LOOP2;
      end
    @(posedge clock);
  end end //LOOP2

Example of the solution:

  // Get two bytes from the interface
  count = 2;
  request_o <= 1;
  @(posedge clock);
  begin :LOOP forever begin
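    if (ack_i == 1) begin
      if (count == 2) data1 = data_i; // first acknowledged byte
      else            data2 = data_i; // second acknowledged byte
      count = count - 1'b1;
      if (count == 0) begin
        request_o <= 0;
        @(posedge clock);
        disable LOOP;
        end
      end
    @(posedge clock);
  end end //LOOP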

SYNOPSYS TIP: Synopsys manpages at the UNIX prompt

Synopsys manpages are available and may be used at the UNIX command line quite easily. The following script will do the trick:

#!/bin/csh
#
# @(#)$Info: dcman script to display Synopsys manpages. $
#
# NOTE: This requires that $SYNOPSYS point to the Synopsys
#       root directory. In dc_shell use 'list synopsys_root'
#       to determine the correct setting if you don't know
#       this already.
#
setenv MANPATH "$SYNOPSYS/doc/syn/man"
man $*
exit 0

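For example, to read the manpage for the set_structure command:

  % dcman set_structure

To read the explanation of a message code such as (LINT-47):

  % dcman LINT/LINT-47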

-------

Brought to you by the Qualis Library
http://www.qualis.com/library/