The Xilinx XC4000 series contains special-purpose hardware to efficiently implement fast carry logic as found in adders, subtracters, counters, and other related function blocks. When special-purpose circuitry such as this is available, an optimal solution is one that makes use of it.
Normally, no algorithmic advantage can be gained by substituting a more sophisticated description for such a function block, because the dedicated carry circuitry is built into the device itself. A brute-force description therefore has a significant advantage, as it maps directly to the hardware.
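As a minimal sketch of such a brute-force description, an adder can be written with nothing more than the VHDL "+" operator. The entity name adder_plus, the WIDTH generic, and the use of the ieee.numeric_std package are assumptions made for illustration; they are not taken from the test circuits measured here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;  -- assumption: standard arithmetic package

-- Hypothetical example entity: an adder described only by "+".
entity adder_plus is
  generic (WIDTH : positive := 8);  -- illustrative width
  port (
    a, b : in  std_logic_vector(WIDTH-1 downto 0);
    sum  : out std_logic_vector(WIDTH-1 downto 0)
  );
end entity adder_plus;

architecture behavioral of adder_plus is
begin
  -- The operator leaves the choice of implementation to the synthesis
  -- tool. The carry out of the top bit is discarded in this sketch.
  sum <= std_logic_vector(unsigned(a) + unsigned(b));
end architecture behavioral;
```

Nothing in this description names a carry structure, which leaves the tool free to substitute a library block for the operator.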
It is, however, difficult for VHDL compilers to exploit special-purpose FPGA features that can be used only under certain conditions, such as the fast-carry logic or the built-in RAM.
Xilinx provides a partial solution to this problem by supplying a DesignWare library for adders, subtracters, counters, and comparators. In Synopsys, DesignWare libraries are used to implement common, complex functional units which can be used by the design analyzer. In the X-BLOX DesignWare library, these functional units are not actually implemented by the Synopsys FPGA compiler. Instead, references to X-BLOX modules are inserted in the netlist. When the design is post-processed for final layout using XACT, X-BLOX is invoked as a module generator to synthesize the appropriate functional units. X-BLOX has intimate knowledge of Xilinx circuits, so it can generate logic geared to special features such as fast-carry logic.
To compare different modeling styles and their implementation in hardware, we have described an adder module with several different algorithms:
These adders have been described in different styles and compiled both with and without X-BLOX. The description style played little role in the final hardware efficiency, and only the VHDL "+" operator could be mapped to a Xilinx DesignWare block using fast-carry logic. When the circuits were compiled without the X-BLOX library, the only difference was the use of a Synopsys DesignWare fast-carry adder.
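To illustrate what an explicit description style looks like in contrast to the "+" operator, the following sketch spells out a ripple-carry adder bit by bit. Ripple-carry is chosen here only as a simple example and is not claimed to be one of the algorithms measured; the entity name adder_ripple, the WIDTH generic, and the signal names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity: an adder with its carry chain written out.
entity adder_ripple is
  generic (WIDTH : positive := 8);  -- illustrative width
  port (
    a, b : in  std_logic_vector(WIDTH-1 downto 0);
    sum  : out std_logic_vector(WIDTH-1 downto 0)
  );
end entity adder_ripple;

architecture gate_level of adder_ripple is
  -- c(i) is the carry into bit i; c(WIDTH) is the unused carry out.
  signal c : std_logic_vector(WIDTH downto 0);
begin
  c(0) <= '0';  -- no carry into the least significant bit

  stage : for i in 0 to WIDTH-1 generate
    -- Full-adder equations for one bit position.
    sum(i) <= a(i) xor b(i) xor c(i);
    c(i+1) <= (a(i) and b(i)) or (c(i) and (a(i) xor b(i)));
  end generate stage;
end architecture gate_level;
```

A description of this kind presents the synthesis tool with individual gates rather than an arithmetic operator, so there is no operator for it to map to a library block.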
Table 1: Size in CLBs of adders (widths from 8 to 12 bits) using different description methods. The test circuits were synthesized using FPGA compiler 3.3a, and routed using ppr (XACT 5.1). Area results are as reported by ppr.
Table 2: Timing (in ns) of adders (widths from 8 to 12 bits) using different description methods. The test circuits were synthesized using FPGA compiler 3.3a, and routed using ppr (XACT 5.1). Timing results are pad-to-pad delays as reported by xdelay and include the propagation delay of input and output pads. Thus, relative speed differences are more pronounced than they may appear here.