10
VHDL modeling for synthesis

prepared by P. Bakowski


Introduction

High Level Synthesis tools allow the designer to translate a high level VHDL description into the corresponding structure. The elements of the generated structure are taken from the library of components incorporated into the synthesis tool. Preparing a synthesizable model requires knowledge of the synthesizable features of the language. If prepared components (synthetic components) are to be used in the overall structure of the final model, the modeler must know their input/output characteristics and the function they perform.

Synthesizable models do not contain any timing features at the delay level. For example, the clock cycle of a synthesized sequential model is not known until the circuit has been completely synthesized. Only after physical layout can the clock delay be back-annotated into the original high level description.

The synthesized (or synthetizable) VHDL processes are roughly classified as combinational or sequential. The combinational processes do not infer any internal memory elements. The figure below shows the overall synthesis path going from algorithmic and/or register level descriptions down to logic level VHDL code.

The algorithmic level descriptions are mainly used to model state machines and sequencers. The register transfer level descriptions are well suited to the modeling of data paths (data flows). Both the algorithmic level and the register transfer level descriptions are taken by high level synthesis tools and transformed into the corresponding netlists. The netlists incorporate logic-level components and the synthetic components (blocks) taken from the design library. In order to minimize the number of required components, the synthesis process uses optimization algorithms that allow resource sharing.

Content: high level synthesis, synthesizable subset, synthesis process, simple objects, arrays, expressions, arithmetic operations, combinational processes, sequential processes, latches, flip-flops, state-machines, three-state buffers, resource sharing, general modeling recommendations, structuring models, examples, exercises


VHDL synthesizable subset

The synthesizable subset may be briefly characterized by the following features.

Simple objects

The basic types and operators used for synthesis are defined in the std_logic_1164 package. This package specifies a 9-level logic system with several sub-systems: {0,1}, {0,1,X,Z}. The basic logical operators such as and, or, not are redefined for this logic system using operator overloading.
A characteristic fragment of the std_logic_1164 package is given below.

The std_logic_1164 package includes the resolution function. The std_logic_vector type is a resolved 9-state vector.

The user may also declare enumerated types. When an object of an enumerated type such as etat is synthesized, the tool produces an encoded object based on the basic logic types.

Integer-like types are defined as bounded subtypes. The synthesis process calculates the number of bits needed to represent the defined values. However, the model has no way to access the individual bits of such an object. In some cases negative integers are not supported.

The real and physical types are not supported by synthesis tools.
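The declarations described above can be sketched as follows. This is only an illustration: the state names of etat, the subtype small_count and the entity simple_objects_demo are assumptions introduced here, not taken from the original text.

```vhdl
package simple_objects_sketch is
  -- User-defined enumerated type; the four state names are hypothetical.
  -- A synthesizer typically encodes four literals on two bits.
  type etat is (idle, load, run, done);

  -- Bounded integer subtype: the values 0..15 fit into 4 bits.
  subtype small_count is integer range 0 to 15;
end package simple_objects_sketch;

library ieee;
use ieee.std_logic_1164.all;
use work.simple_objects_sketch.all;

entity simple_objects_demo is
  port (sel   : in  etat;
        count : in  small_count;
        flag  : out std_logic);
end entity simple_objects_demo;

architecture rtl of simple_objects_demo is
begin
  -- The integer is used as a whole value; its individual bits
  -- cannot be addressed from the model.
  flag <= '1' when (sel = run and count > 7) else '0';
end architecture rtl;
```

The actual encoding of etat (binary, one-hot, ...) is chosen by the synthesis tool and is not visible in the source.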

Arrays and multi-dimensional arrays

Arrays are usually used to represent vectors or multi-bit memory elements. One-dimensional arrays are easily synthesizable. Multidimensional arrays are generally not supported, because their use requires additional decoding schemes. Some synthesizers accept two-dimensional arrays; they are implemented with predefined components (RAMs) imported from the library.
Unconstrained array types are supported in some cases.

Array attributes

The synthesizers support the following array attributes:
'left, 'right, 'high, 'low, 'length, 'range, 'reverse_range
Example:
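A minimal sketch of how these attributes are typically used. The entity reverse_sketch and its ports are illustrative assumptions, not part of the original text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity reverse_sketch is
  port (d : in  std_logic_vector(7 downto 0);
        r : out std_logic_vector(7 downto 0));
end entity reverse_sketch;

architecture rtl of reverse_sketch is
begin
  -- For d declared as (7 downto 0):
  --   d'left = 7, d'right = 0, d'high = 7, d'low = 0, d'length = 8
  --   d'range is 7 downto 0, d'reverse_range is 0 to 7
  reverse_p : process (d)
  begin
    for i in d'range loop
      -- bit i of d is copied to the mirrored position of r
      r(d'high - i + d'low) <= d(i);
    end loop;
  end process reverse_p;
end architecture rtl;
```

Because the loop bounds come from the attributes, the same process body keeps working if the width of d and r is changed.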

Records

Record types are synthesizable in most cases.

Object classes: synthesis of constants, variables and signals

Constants of any synthesizable type are supported. In principle, simple constants do not generate any hardware. Complex constants such as two-dimensional arrays may infer ROM blocks.

Variables are declared locally within a process or a subprogram. Depending on the context, a variable may correspond to a simple wire or to a memory element. Signals, depending on the usage context, may likewise infer wires, latches or flip-flops.

Example: the following model describes a JK flip-flop. The flip-flop state register is inferred from the circular q_tmp signal assignment (q_tmp <= q_tmp;). In addition, the variable jk_var may infer a memory element. This is the case in the second architecture, where the input signals are read at the end of the process: the new output value is computed from the previous input values latched in the input memory.
architecture first of jk_ff is
signal q_tmp: std_logic;
begin
end first;
 
architecture second of jk_ff is
  signal q_tmp: std_logic;
begin
  process -- no sensitivity list: the process is controlled by the wait statement
    variable jk_var: std_logic_vector(1 downto 0);
  begin
    wait until clk'event and clk = '1';
    if rst = '1' then
      q_tmp <= '0';
    else
      case jk_var is
        when "00" => q_tmp <= q_tmp;
        when "01" => q_tmp <= '0';
        when "10" => q_tmp <= '1';
        when "11" => q_tmp <= not q_tmp;
        when others => q_tmp <= 'X'; -- note others
      end case;
    end if;
    jk_var := (j & k);
    -- inputs are read after the state decoding
    -- they must be memorized to be used later in the process
  end process;
  q <= q_tmp;
end second;
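The body of the first architecture is not reproduced above. The sketch below shows one plausible form of it, under the assumption that it differs from the second architecture only in reading j and k before the state decoding. The entity declaration is likewise an assumption derived from the port names used in the text, and the architecture is named first_sketch to mark it as a reconstruction.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Assumed entity: ports inferred from the names clk, rst, j, k and q.
entity jk_ff is
  port (clk, rst, j, k : in  std_logic;
        q              : out std_logic);
end entity jk_ff;

architecture first_sketch of jk_ff is
  signal q_tmp : std_logic;
begin
  process
    variable jk_var : std_logic_vector(1 downto 0);
  begin
    wait until clk'event and clk = '1';
    jk_var := j & k;   -- inputs are read before the state decoding
    if rst = '1' then
      q_tmp <= '0';    -- synchronous reset
    else
      case jk_var is
        when "00"   => q_tmp <= q_tmp;      -- hold
        when "01"   => q_tmp <= '0';        -- reset
        when "10"   => q_tmp <= '1';        -- set
        when "11"   => q_tmp <= not q_tmp;  -- toggle
        when others => q_tmp <= 'X';
      end case;
    end if;
  end process;
  q <= q_tmp;
end architecture first_sketch;
```

Here jk_var is always written before it is read within one activation of the process, so it should not need a memory element of its own; only q_tmp is stored.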


Expressions

Expressions represent arithmetic or logical computations performed by applying one or more operators to one or more operands. Logical operators are synthesized as a set of logic gates resulting from the simplification of the equivalent boolean equations. Array operands involved in an operation must be of the same size. A logical operator applied to two array operands is applied to each pair of corresponding array elements. If an expression combines several different operators, parentheses should be used to group the operands.
signal a, b,c : bit_vector(1 downto 0);
signal k,l,m,n: bit;
c<= b and a;
n <= (k xor l) and m;

Relational operators

Relational operators always return a boolean value (true, false). Operators such as = or > compare two operands of the same base type. For the equality operator, the result is true if the two operands represent the same value. Relational operators can be applied to scalar types and to one-dimensional arrays. Note that the order of the values of an enumerated type is determined by its declaration: the smallest value is given by the sc_type'left attribute and the greatest by the sc_type'right attribute (for any scalar type, sc_type'low and sc_type'high give the same bounds independently of the declared direction). The relative order of two array values is determined by comparing each pair of elements in turn, starting from the left bound of each array's index range. If the arrays are of different sizes and the shorter one matches the leading elements of the longer one, the shorter is ordered before the longer.

For example, the bit vector "1101" is less than "110100" .

signal a,b: bit_vector(3 downto 0);
signal k,l,m,n: boolean;
k <= (a=b);
n <= (l>m);
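The ordering rule for arrays can be checked with a small simulation-only sketch. The entity name and the constants are assumptions chosen for illustration. The second assertion shows that a shorter array is not always the smaller one: the elements are compared from the left first.

```vhdl
entity array_order_sketch is
end entity array_order_sketch;

architecture sim of array_order_sketch is
begin
  check : process
    constant short_v : bit_vector(3 downto 0) := "1101";
    constant long_v  : bit_vector(5 downto 0) := "110100";
    constant other_v : bit_vector(2 downto 0) := "111";
  begin
    -- short_v equals the first four elements of long_v,
    -- so the shorter array is ordered before the longer one.
    assert short_v < long_v
      report "1101 should be less than 110100" severity error;

    -- other_v differs from long_v at the third element ('1' > '0'),
    -- so it is the greater value although it is shorter.
    assert other_v > long_v
      report "111 should be greater than 110100" severity error;

    wait;  -- stop the process after the checks
  end process check;
end architecture sim;
```

This left-to-right comparison is not a numeric comparison, which is one reason to use the numeric package types when vectors represent numbers.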

Arithmetical operators

Arithmetical operators are synthesized using predefined components called synthetic operators. These operators are provided by the included numeric package. In some cases arithmetical operators working on short bit arrays (for instance 4-bit vectors) may be synthesized directly by the synthesis tool. Note that integer subtypes are coded as binary vectors. The use of full range integers is not recommended.
type mot is array(7 downto 0) of bit;
type mot_c is array(8 downto 0) of bit;
....
signal a,b: mot;
signal c:mot_c;
c <= a + b; -- + operator is overloaded by synthetic operator

The multiplication operator is implemented via a synthetic operator. Multiplying an operand by a power of 2 results in the corresponding shift operation.
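As a hedged illustration of the same addition written with the standard numeric_std package instead of user-defined array types, the sketch below widens both operands so that the carry is kept in the 9-bit result. The entity add8_sketch and its ports are assumptions made for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity add8_sketch is
  port (a, b : in  unsigned(7 downto 0);
        c    : out unsigned(8 downto 0));
end entity add8_sketch;

architecture rtl of add8_sketch is
begin
  -- In numeric_std the result of "+" is as wide as the wider operand,
  -- so both operands are resized to 9 bits to keep the carry.
  c <= resize(a, c'length) + resize(b, c'length);
end architecture rtl;
```

Without the resize calls the sum would be 8 bits wide and could not be assigned to the 9-bit output.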

Unary operators (sign operators)

Unary operators are predefined for integer types. The + operator has no effect, while the - operator negates its operand.
signal a,b: integer range -4 to 3;
a <= -b; -- binary coded signed integers

Synthesis of VHDL processes

From the synthesis point of view, VHDL processes may be classified as combinational or sequential.

Combinational processes

Pure combinational processes do not infer any memory elements. In a pure combinational process all signals read by the process must be declared in the sensitivity list. Simple combinational processes represent combinational circuits such as decoders/encoders and combinational logic/arithmetic units. The control operations of this kind of model may be described by if .. else or case .. when statements.

if .. elsif.. else

alu_unit: process(op_code,a,b) -- all input signals belong to sensitivity list
begin
  if op_code="00" then
    res <= a + b;
  elsif op_code="01" then
    res <= a - b;
  elsif op_code="10" then
    res <= a and b;
  else
    res <= a or b;
  end if;
end process alu_unit;
The same process may be written using a wait statement.

Problem!

Descriptions that do not provide complete information cannot be synthesized as combinational logic. If, for example, only the value '0' of op_code is interpreted, what should the circuit do when op_code equals '1'?
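The situation can be sketched as follows. The entity and signal names are assumptions, and the original example is not reproduced here. The first process leaves res_partial unspecified when op_code is '1', so a tool would typically either infer a latch or reject the description as combinational logic. The second process assigns the output in every branch.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity incomplete_if_sketch is
  port (op_code      : in  std_logic;
        a, b         : in  std_logic;
        res_partial  : out std_logic;
        res_complete : out std_logic);
end entity incomplete_if_sketch;

architecture rtl of incomplete_if_sketch is
begin
  -- Incomplete: nothing is specified for op_code = '1',
  -- so res_partial has to keep its previous value.
  partial_p : process (op_code, a, b)
  begin
    if op_code = '0' then
      res_partial <= a and b;
    end if;
  end process partial_p;

  -- Complete: every value of op_code leads to an assignment,
  -- so the process describes pure combinational logic.
  complete_p : process (op_code, a, b)
  begin
    if op_code = '0' then
      res_complete <= a and b;
    else
      res_complete <= a or b;
    end if;
  end process complete_p;
end architecture rtl;
```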
 

case .. when .. when ..

In some situations a simple regular decoder or multiplexer structure must be replaced by a more specific one. The following example shows the implementation of a case statement in which several choices are combined with the or symbol (|).
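A minimal sketch of such a case statement is given here. The entity case_or_sketch, its ports and the chosen encodings are assumptions made for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity case_or_sketch is
  port (sel : in  std_logic_vector(2 downto 0);
        y   : out std_logic_vector(1 downto 0));
end entity case_or_sketch;

architecture rtl of case_or_sketch is
begin
  decode_p : process (sel)
  begin
    case sel is
      when "000" | "001"         => y <= "00";  -- two choices share one branch
      when "010" | "011" | "100" => y <= "01";  -- three choices share one branch
      when "101"                 => y <= "10";
      when others                => y <= "11";  -- covers all remaining values
    end case;
  end process decode_p;
end architecture rtl;
```

The others branch keeps the description complete, so no memory element is inferred for y.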

loop .. exit; loop .. next

Iterative statements such as loops make it possible to describe regular combinational circuits efficiently. Recall that the exit statement terminates a loop, while the next statement skips to the next iteration of the loop. The following example shows the implementation of a comparator using both the exit and the next statements.
signal a,b: bit_vector (1 downto 0);
signal a_smaller_than_b: boolean;
...
cmp_p: process(a,b)
begin
  a_smaller_than_b <= false; -- default: a and b are equal
  for i in a'range loop -- 1 downto 0: the leftmost bit is compared first
    if (a(i) ='1' and b(i) ='0') then
      a_smaller_than_b <= false;
      exit;
    elsif (a(i) ='0' and b(i) ='1') then
      a_smaller_than_b <= true;
      exit;
    else
      next;
    end if;
  end loop;
end process cmp_p;

Sequential processes

When a process has no wait statement and no clock edge condition, it is normally synthesized with combinational logic. The computations performed by the process react immediately to changes in input signals. The process has no internal state.
The process is called sequential if it is synthesized with sequential logic. Typically sequential logic is activated by one or more clock signals. In a sequential process computations are performed only once for each specified clock edge. The results of these computations are saved in internal flip-flops until the next active edge of the clock.

The following states are stored in flip-flops:

Latches

A latch circuit may be inferred from a simple process activated by the clock signal and the input data. The following if statement infers a latch since it has no else branch.

simple_latch:
process (clk,data) -- waiting for clk or data signal to be latched if clk='1'
begin
if(clk ='1') then
q <= data;
end if;
end process simple_latch;
An additional asynchronous signal called clr may be added as follows:
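A minimal sketch of such a latch, assuming an active-high clr that forces the output to '0'. The entity name and the polarity of clr are assumptions, not taken from the original text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity latch_clr_sketch is
  port (clk, clr, data : in  std_logic;
        q              : out std_logic);
end entity latch_clr_sketch;

architecture rtl of latch_clr_sketch is
begin
  latch_clr : process (clk, clr, data)
  begin
    if clr = '1' then
      q <= '0';        -- asynchronous clear, independent of clk
    elsif clk = '1' then
      q <= data;       -- transparent while clk is high
    end if;            -- no else branch: q keeps its value, a latch is inferred
  end process latch_clr;
end architecture rtl;
```

The clr signal appears in the sensitivity list so that the clear takes effect as soon as it is asserted.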

Flip-flops

A simple flip-flop register may be inferred from a process that tests for a clock edge. Note the use of the 'event attribute required to detect the rising edge of the clk signal.

Current synthesis tools and algorithms synthesize logic circuits described with the standard logic type, which operates with nine logic values. In this case the detection of a rising/falling edge with a simple condition like:
signal clk: std_logic;
if (clk'event and clk='1') then .. -- rising edge 0 => 1 ?
if (clk'event and clk='0') then .. -- falling edge 1 => 0 ?
is simply insufficient. What happens if the previous state was 'Z' or 'X'?
We can try to write the following conditions:
signal clk: std_logic;
if (clk'event and clk'last_value = '0' and clk='1') then .. -- rising edge 0 => 1
if (clk'event and clk'last_value = '1' and clk='0') then .. -- falling edge 1 => 0
But the synthesizer has no way to determine statically the previous state ('last_value) of the clk signal before the event.
The solution to this problem is provided by the functions falling_edge(signal_name) and rising_edge(signal_name), which the std_logic_1164 package defines for std_logic signals.
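A minimal sketch of a flip-flop described with rising_edge is shown here. The entity dff_sketch and its ports are illustrative assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity dff_sketch is
  port (clk, d : in  std_logic;
        q      : out std_logic);
end entity dff_sketch;

architecture rtl of dff_sketch is
begin
  ff_p : process (clk)   -- only the clock is needed in the sensitivity list
  begin
    if rising_edge(clk) then
      q <= d;            -- d is sampled on the 0 => 1 transition of clk
    end if;
  end process ff_p;
end architecture rtl;
```

In simulation, rising_edge is true only for a transition from '0' (or 'L') to '1' (or 'H'), so a change from 'X' or 'Z' to '1' is not treated as a clock edge.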

Inferring memory elements for finite state machines

More complex sequential processes may infer several memory elements. Multi-state finite state machines require several flip-flops to memorize their internal state. The following is an example of a simple FSM entity with asynchronous reset.
 
entity FSM is
  port(rst,clk,inc,a,b: in bit; res: out bit);
end FSM;
architecture first of FSM is
  type etat_t is (s0, s1, s2, s3);
  signal etat_act, etat_suiv: etat_t;
begin
  sync_p:
  process(clk,rst)
  begin
    if (rst='1') then
      etat_act <= s0; -- asynchronous reset
    elsif (clk'event and clk='1') then
      etat_act <= etat_suiv;
    end if;
  end process sync_p;
  out_p:
  process(etat_act,inc,a,b) -- inc is read by the process, so it must be listed
  begin
    res <= b; -- default output is b signal
    etat_suiv <= s0; -- default next state
    if (inc ='1') then
      case etat_act is
        when s0 => etat_suiv <= s1;
        when s1 => etat_suiv <= s2;
          res <= a; -- in state s1 output takes value of a signal
        when s2 => etat_suiv <= s1;
        when others => null; -- remaining state: the defaults apply
      end case;
    end if;
  end process out_p;
end first;

Inference of three-state buffers

Multiple valued logic including high-impedance state is used to synthesize the bus structures. For example a simple four-valued logic defined as follows may be used to specify a three-state gate.
type mvl4 is ('0','1','x', 'z');
signal tsout, tsin: mvl4;
...
if(enbl='1') then
tsout <= tsin;
else
tsout <= 'z';
end if;
...
The following example illustrates the synthesis of a flip-flop with a three-state output gate. Note that all signals use four-state logic (mvl4). This may cause some problems when the (clk'event and clk='1') expression is evaluated (!).
entity ff_3st is
port(clk, enbl, din: in mvl4; dout: out mvl4; cond: in boolean);
end ff_3st;
architecture first of ff_3st is
signal tmp: mvl4;
begin
ff_p:
process(clk, din, cond)
begin
if(clk'event and clk='1') then
if (cond) then
tmp <= din; -- flip-flop inferred
end if;
end if;
end process ff_p;
out_p:
process(enbl,tmp)
begin
if(enbl ='1') then
dout <= tmp;
else
dout <= 'z';  -- three-state buffer inferred
end if;
end process out_p;
end first;
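For comparison, the same structure can be sketched with the standard std_logic type, where rising_edge avoids the edge-detection problem mentioned above. The entity name ff_3st_std is an assumption, and cond is modeled here as a std_logic port rather than a boolean.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity ff_3st_std is
  port (clk, enbl, din, cond : in  std_logic;
        dout                 : out std_logic);
end entity ff_3st_std;

architecture rtl of ff_3st_std is
  signal tmp : std_logic;
begin
  ff_p : process (clk)
  begin
    if rising_edge(clk) then
      if cond = '1' then
        tmp <= din;      -- flip-flop with enable inferred
      end if;
    end if;
  end process ff_p;

  -- three-state buffer: the output is released when enbl is not '1'
  dout <= tmp when enbl = '1' else 'Z';
end architecture rtl;
```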

Synthesis optimization through resource sharing

Resource sharing reduces the amount of hardware necessary to perform the required operations. Several identical operations may be assigned to the same component in the netlist. Without resource sharing, each operation is built with a separate circuit or component.

The following is a list of operators (components) which can be shared:

*  /
+ -  : best suited for resource sharing
> >= < <=
These operators may be shared only if they reside in the same process. Note that when a synthesis tool provides resource sharing, it adds multiplexers to the inputs and outputs of shared hardware resources. This makes it possible to channel data into, and out of, the shared resource.

Example:

shar_p:
process(cond,a,b,c)
begin
if (cond) then
z <= a + b;  -- adder circuit inferred
else
z <= b + c;
end if;
end process shar_p;

After synthesis without resource sharing:

After synthesis with resource sharing: solution 1

After synthesis with resource sharing: solution 2
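The effect of sharing can also be written out by hand. The sketch below is one possible shared form of shar_p, not necessarily what a given tool generates: a multiplexer selects the operand that differs between the two branches and a single adder is used. The entity name, the 8-bit unsigned ports and the variable operand are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity shared_adder_sketch is
  port (cond    : in  boolean;
        a, b, c : in  unsigned(7 downto 0);
        z       : out unsigned(7 downto 0));
end entity shared_adder_sketch;

architecture rtl of shared_adder_sketch is
begin
  shar_p : process (cond, a, b, c)
    variable operand : unsigned(7 downto 0);
  begin
    -- multiplexer on the adder input: a when cond is true, c otherwise
    if cond then
      operand := a;
    else
      operand := c;
    end if;
    -- single adder shared by both branches (b is common to both sums)
    z <= operand + b;
  end process shar_p;
end architecture rtl;
```

Compared with two adders followed by an output multiplexer, this form trades one adder for a multiplexer on an operand input.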

General modeling recommendations for synthesis

In general, synthesizable models must be completely specified before the synthesis process. No unknown states or operations may be assumed by synthesis algorithms.

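For combinational logic: declare every signal read by the process in the sensitivity list and provide a complete description, so that every output is assigned for every combination of input values.

Statements and assignments

For sequential logic, use an if statement rather than a wait statement to infer flip-flops.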

Basic objects:
For compatibility reasons, use the standard IEEE packages std_logic_1164 (IEEE 1164) and numeric_std (IEEE 1076.3), on which the IEEE 1076.6 RTL synthesis standard relies, to specify the simple objects.
For efficiency reasons, use only bounded integer types.


Structuring architectures for synthesis

Complex architectures need to be modeled in a modular fashion. Independent and smaller modules or blocks are easier to test and/or verify. The blocks should be neither too small nor too large; they should follow the functional partition of the overall architecture. Previously designed and verified modules can be easily assembled. Some of the existing blocks may be stored in a design base and provide a valuable source of blocks for new designs.
 
 
functional modules - procedural evaluation
function: expressions and control structures
procedure: expressions and control structures
process: sub-programs

structural modules - concurrent evaluation
block: signal assignments
component: processes and signal assignments
configuration: components
VHDL constructs allow different degrees and styles of modularization. Functional granularity is first provided through the use of subprograms and processes. Structural granularity of the description is supported by the use of blocks, components and configurations.
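The two kinds of granularity can be sketched in a small example: a function in a package provides the functional part, and a structural architecture assembles two instances of a component. All names (util_sketch, parity, parity_unit, top_sketch) are assumptions introduced for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package util_sketch is
  function parity (v : std_logic_vector) return std_logic;
end package util_sketch;

package body util_sketch is
  -- Functional module: procedural evaluation inside a function.
  function parity (v : std_logic_vector) return std_logic is
    variable acc : std_logic := '0';
  begin
    for i in v'range loop
      acc := acc xor v(i);
    end loop;
    return acc;
  end function parity;
end package body util_sketch;

library ieee;
use ieee.std_logic_1164.all;
use work.util_sketch.all;

entity parity_unit is
  port (d : in  std_logic_vector(7 downto 0);
        p : out std_logic);
end entity parity_unit;

architecture rtl of parity_unit is
begin
  p <= parity(d);
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

entity top_sketch is
  port (d0, d1 : in  std_logic_vector(7 downto 0);
        p0, p1 : out std_logic);
end entity top_sketch;

-- Structural module: concurrent evaluation of two component instances.
architecture structural of top_sketch is
  component parity_unit
    port (d : in  std_logic_vector(7 downto 0);
          p : out std_logic);
  end component;
begin
  u0 : parity_unit port map (d => d0, p => p0);
  u1 : parity_unit port map (d => d1, p => p1);
end architecture structural;
```

The component is bound to the entity of the same name by default; a configuration could be used to select another entity or architecture without changing the structural description.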

For reasons of design manageability and synthesis efficiency, the decomposition should follow specific functional units such as ALUs, sequencers, memory units, multipliers, etc. Different kinds of functionality require different and often quite specific description styles. Some of the blocks may not be available at source level, and only the interfaces (entities) may be visible to the VHDL modeler.
The units can be synthesized independently and interconnected afterwards to form the required complex architecture. The lowest level modules or components should correspond to a few thousand logic gates. This quantity of circuitry corresponds to a small hardware module to be implemented as an FPGA or ASIC block.

The decomposition requires the introduction of a well specified interconnection scheme. This scheme may be used for one-to-one communication or for many-to-many communication between modules. In the latter case a bus structure must be elaborated and modeled together with the overall architecture.

In some cases the bus structure may be specific to the given design; in others a standard bus structure can be applied. The advantage of a standard bus is the possibility of interconnecting and integrating future modules that comply with the standard.


Examples of synthesizable descriptions


Exercises