Posted: 2006.04.30
An industry expert examines field-programmable gate arrays (FPGAs), including current and forthcoming architectures, technologies, and software tools.
By Bob Zeidman (March 21, 2006)
This article examines field-programmable gate arrays (FPGAs) and their underlying architectures and technologies. We will also examine current and up-and-coming software tools that are designed to allow you to squeeze more functionality into these chips in less time, running at faster speeds, and using less power.
Introduction
The first section of this article deals with the
internal architecture and characteristics of typical FPGA devices, allowing you
to decide which particular device is right for your design. The next section
examines new FPGA architectures being offered by various vendors. The final
section looks at some new software tools to help you with your designs.
The basics of FPGAs
Field-programmable gate arrays (FPGAs) are
so-called because they are structured very much like the now-obsolete "gate
array" form of application-specific integrated circuit (ASIC). In fact,
FPGAs essentially killed the gate array ASIC business. In the not-so-distant
past, FPGAs were marketed primarily for two uses: (a) for prototyping ASICs and
(b) for use in systems to meet time-to-market goals, knowing that they would be
replaced with an ASIC implementation at the earliest opportunity.
With regard to this latter point, FPGAs can be programmed on your desktop in minutes, while ASICs require weeks to fabricate a new design. As FPGA speeds increased, power consumption decreased, and prices decreased, FPGAs began shipping in products without any intention of replacing them with equivalent ASICs. Of course, FPGAs are still good at prototyping ASICs, and they are still used that way.
FPGA architectures
Each FPGA vendor has its own FPGA architecture,
but in general terms they are all a variation of that shown in Fig 1. The
architecture consists of configurable logic blocks, configurable I/O blocks, and
programmable interconnect. Also, there will be clock circuitry for driving the
clock signals to each logic block. Additional logic resources such as ALUs,
memory, and decoders may also be available. The three basic types of
programmable elements for an FPGA are static RAM, anti-fuses, and flash EPROM.
1. Generic FPGA architecture.
Configurable Logic Blocks (CLBs): These blocks contain the logic for the FPGA. In the large-grain architecture used by all FPGA vendors today, these CLBs contain enough logic to create a small state machine as illustrated in Fig 2. The block contains RAM for creating arbitrary combinatorial logic functions, also known as lookup tables (LUTs). It also contains flip-flops for clocked storage elements, along with multiplexers in order to route the logic within the block and to and from external resources. The multiplexers also allow polarity selection and reset and clear input selection.
2. FPGA Configurable logic block (CLB) (courtesy of Xilinx).
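The idea of a lookup table feeding a flip-flop can be sketched in a few lines of Python. This is only a toy model: the class name, the helper function, and the choice of a single 4-input LUT per block are assumptions made for illustration and do not describe any particular vendor's CLB.

```python
class LogicBlock:
    """Toy model of a CLB: one 4-input LUT feeding one D flip-flop (assumed sizes)."""

    def __init__(self, truth_table):
        # The LUT is a 16-entry RAM; each entry holds the output for one input combination.
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs 16 entries")
        self.lut = list(truth_table)
        self.q = 0  # flip-flop state

    def combinational(self, a, b, c, d):
        # The inputs form the RAM address; the stored bit is the logic output.
        address = (a << 3) | (b << 2) | (c << 1) | d
        return self.lut[address]

    def clock(self, a, b, c, d):
        # On a clock edge the flip-flop captures the LUT output.
        self.q = self.combinational(a, b, c, d)
        return self.q


def lut_from_function(func):
    """Build the 16-entry table by evaluating func on every input combination."""
    return [func((i >> 3) & 1, (i >> 2) & 1, (i >> 1) & 1, i & 1) & 1
            for i in range(16)]


# "Program" the block as a 4-input XOR (parity) function.
parity_block = LogicBlock(lut_from_function(lambda a, b, c, d: a ^ b ^ c ^ d))
```

Programming the block means nothing more than writing a different 16-entry table, which is why the same silicon can implement any function of its inputs.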
Configurable I/O Blocks: A Configurable input/output (I/O) Block, as shown in Fig 3, is used to bring signals onto the chip and send them back off again. It consists of an input buffer and an output buffer with three-state and open collector output controls. Typically there are pull up resistors on the outputs and sometimes pull down resistors that can be used to terminate signals and buses without requiring discrete resistors external to the chip.
The polarity of the output can usually be programmed for active high or active low output, and often the slew rate of the output can be programmed for fast or slow rise and fall times. There are typically flip-flops on outputs so that clocked signals can be output directly to the pins without encountering significant delay, more easily meeting the setup time requirement for external devices. Similarly, flip-flops on the inputs reduce delay on a signal before reaching a flip-flop, thus reducing the hold time requirement of the FPGA.
3. FPGA Configurable I/O block (courtesy of
Xilinx).
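To see why a flip-flop in the I/O block makes the external setup requirement easier to meet, consider the arithmetic in the following Python sketch. The function name and every delay figure are illustrative assumptions, not values from any datasheet.

```python
def setup_slack_ps(clock_period_ps, clock_to_out_ps, internal_routing_ps,
                   board_delay_ps, external_setup_ps):
    """Slack left at the external device's input; negative means a setup violation."""
    arrival = clock_to_out_ps + internal_routing_ps + board_delay_ps
    return clock_period_ps - arrival - external_setup_ps


# Hypothetical numbers, in picoseconds, for a 100 MHz clock (10,000 ps period).
# Case 1: the signal is registered in a CLB and must cross the interconnect to reach the pin.
slack_from_clb = setup_slack_ps(10_000, 3_000, 4_000, 1_000, 2_500)

# Case 2: the signal is registered in the I/O block, so there is no internal routing delay.
slack_from_io_block = setup_slack_ps(10_000, 3_000, 0, 1_000, 2_500)
```

With these assumed figures, the first case misses the setup requirement while the second meets it with margin, simply because the routing term disappears.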
Programmable Interconnect: In Fig 4, a hierarchy of interconnect resources can be seen. There are long lines that can be used to connect critical CLBs that are physically far from each other on the chip without inducing much delay. These long lines can also be used as buses within the chip.
There are also short lines that are used to connect individual CLBs that are located physically close to each other. Transistors are used to turn on or off connections between different lines. There are also several programmable switch matrices in the FPGA to connect these long and short lines together in specific, flexible combinations.
Three-state buffers are used to connect many CLBs to a long line, creating a bus. Special long lines, called global clock lines, are specially designed for low impedance and thus fast propagation times. These are connected to the clock buffers and to each clocked element in each CLB. This is how the clocks are distributed throughout the FPGA, ensuring minimal skew between clock signals arriving at different flip-flops within the chip.
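The way three-state buffers share a long line can be modeled very simply. The following Python sketch is a simplification under stated assumptions: the function name is hypothetical, and it treats two enabled buffers driving different values as an error, which is how a designer would normally want a simulation to behave.

```python
def resolve_bus(drivers):
    """Resolve the value on a shared long line.

    drivers: list of (enable, value) pairs, one per three-state buffer.
    Returns the bus value, or None when no buffer is enabled (floating bus).
    """
    active = [value for enable, value in drivers if enable]
    if not active:
        return None  # every buffer is in its high-impedance state
    if len(set(active)) > 1:
        raise ValueError("bus contention: enabled buffers drive different values")
    return active[0]


# Three CLBs share one long line; only the second one is enabled.
bus_value = resolve_bus([(0, 1), (1, 0), (0, 1)])
```

The design must guarantee that only one buffer is enabled at a time; the model makes that obligation explicit by raising an error on contention.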
In an ASIC, the majority of the delay comes from the logic in the design, because logic is connected with metal lines that exhibit little delay. In an FPGA, however, most of the delay in the chip comes from the interconnect, because the interconnect, like the logic, is fixed on the chip. Connecting one CLB to another CLB in a different part of the chip often requires a connection through many transistors and switch matrices, each of which introduces extra delay.
4. FPGA programmable interconnect (courtesy of Xilinx).
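A rough feel for why routing dominates FPGA delay comes from adding up the contributions along a path. In the Python sketch below, the function name and the per-LUT and per-switch delay figures are made-up placeholders chosen only to illustrate the arithmetic; real values come from the vendor's timing analysis tools.

```python
def path_delay_ps(logic_levels, switch_hops, lut_delay_ps=500, switch_delay_ps=400):
    """Split a path's delay into logic and routing parts (all figures are assumed)."""
    logic = logic_levels * lut_delay_ps      # delay through the lookup tables
    routing = switch_hops * switch_delay_ps  # delay through pass transistors and switch matrices
    return logic, routing, logic + routing


# A path with two levels of logic whose signals cross six programmable switches.
logic, routing, total = path_delay_ps(logic_levels=2, switch_hops=6)
```

Under these assumptions the routing contributes 2,400 ps of the 3,400 ps total, so placing connected CLBs close together matters more than trimming a level of logic.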
Clock Circuitry: Special I/O blocks with special high drive clock buffers, known as clock drivers, are distributed around the chip. These buffers connect to clock input pads and drive the clock signals onto the global clock lines described above. These clock lines are designed for low skew times and fast propagation times. Note that synchronous design is a must with FPGAs, since absolute skew and delay cannot be guaranteed anywhere but on the global clock lines.
SRAM vs. Antifuse vs. Flash
There are three competing technologies
for programming FPGAs. SRAM programming involves a small static RAM bit for each
programming element. Writing the bit with a zero turns off a switch, while
writing with a one turns on a switch. Another method involves an antifuse that
consists of a microscopic structure that, unlike a regular fuse, normally makes
no connection. A large amount of current during programming of the device causes
the two sides of the antifuse to connect. A third and relatively new method uses
flash EPROM bits for each programming element.
SRAM is by far the most common programming technology. One advantage of SRAM-based FPGAs is that they use a standard fabrication process that chip fabrication plants are always optimizing for better performance. Because the SRAMs are reprogrammable, the FPGAs can be reprogrammed any number of times, even while they are in the system, just like writing to a normal SRAM. In addition, SRAM devices can easily use the internal SRAMs as small memories in the design.
One disadvantage of SRAM-based FPGAs is that they are volatile, which means a power glitch could potentially corrupt the contents of the device. SRAM devices have large routing delays and are slower than other technologies, in theory, but continually improving SRAM technology has effectively eliminated this disadvantage. SRAM FPGAs can consume more power and are less secure than other technologies because they must be reprogrammed upon power-up and the programming bitstream can be observed going into the device. Custom SRAM FPGAs with built-in keys that decrypt incoming programming bitstreams can be purchased from vendors, but this reduces the low cost and fast lead time advantage of the FPGA. Bit errors are also more likely with SRAM FPGAs than with the other devices. The market has decided that the advantages of SRAM FPGAs outweigh the disadvantages, as they are by far the dominant FPGA technology.
The advantages of antifuse FPGAs are that they are non-volatile and the delays due to routing are very small, so they tend to be faster. Antifuse FPGAs tend to require lower power and they are better for keeping your design information out of the hands of competitors because they do not require an external device to program them upon power-up as SRAM devices do. The disadvantages are that they require a complex fabrication process, they require an external programmer to program them, and once they are programmed, they cannot be changed. The complex, nonstandard fabrication process has turned out to be a key disadvantage as antifuse FPGAs have lower yields and the technology has improved more slowly than SRAM FPGAs.
Flash FPGAs seem to combine the best of both of the other methods. They are nonvolatile like antifuse FPGAs, yet reprogrammable like SRAM FPGAs. They use a standard fabrication process like SRAM FPGAs and they are lower power and secure like antifuse FPGAs. They are also relatively fast. Currently, one vendor supports flash FPGAs and another vendor has a hybrid flash/SRAM FPGA. They are not catching on as fast as I expected, though that could change in the future.
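The differences among the three programming technologies come down to two properties: whether a programming element keeps its value without power, and whether it can be rewritten. The Python sketch below models a single programming element of each kind. The class and method names are hypothetical, and modeling a powered-down SRAM bit as cleared to zero is a simplifying assumption (in reality its contents are simply lost).

```python
class SramCell:
    """Volatile and reprogrammable."""

    def __init__(self):
        self.bit = 0  # 0 = switch off, 1 = switch on

    def program(self, bit):
        self.bit = bit

    def power_cycle(self):
        # Contents are lost; modeled here as cleared until reloaded externally.
        self.bit = 0


class AntifuseCell:
    """Non-volatile and one-time programmable: open until blown, then connected."""

    def __init__(self):
        self.bit = 0  # 0 = no connection (the normal state of an antifuse)

    def program(self, bit):
        if self.bit == 1 and bit == 0:
            raise ValueError("a programmed antifuse cannot be opened again")
        self.bit = bit

    def power_cycle(self):
        pass  # the connection is physical, so nothing changes


class FlashCell:
    """Non-volatile and reprogrammable."""

    def __init__(self):
        self.bit = 0

    def program(self, bit):
        self.bit = bit

    def power_cycle(self):
        pass  # the stored value survives loss of power
```

Programming each cell to 1 and then calling `power_cycle()` leaves the antifuse and flash cells at 1 while the SRAM cell returns to 0; only the antifuse cell refuses a later attempt to program it back to 0.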
Example FPGA families
Examples of SRAM FPGA families include the
following:
Examples of antifuse FPGA families include the following:
Examples of flash FPGA families include the following:
Examples of hybrid flash/SRAM FPGA families include the following:
Emerging technologies
Cores: When I talk about a
"core" I am simply referring to a large self-contained function. There are two
basic types of cores. The soft core, known as an IP core, is a function that is
described by its logic function rather than by any physical implementation. Soft
cores usually consist of hardware description language (HDL) code. Hard cores,
on the other hand, consist of physical implementations of a function. With
respect to FPGAs, these hard cores are known as embedded cores because they are
physically embedded onto the chip die and surrounded by programmable logic.
Many FPGA vendors have begun offering cores. The density of programmable devices is increasing, enabling what is called a Programmable System on a Chip (PSOC). Whereas programmable devices were initially developed to replace glue logic, entire systems can now be placed on a single programmable device. SOCs include all kinds of complicated devices, such as processors. To place these complex functions within a programmable device, there are three options: (a) design the function yourself and place it in the programmable logic, (b) purchase the HDL code for the function and incorporate it into your HDL code, or (c) get the vendor to include the function as a cell embedded in the programmable device. The second option is the IP core or soft core, while the third option is the embedded core or hard core.
IP Cores: IP cores are often sold by third party vendors that specialize in creating these functions. Recently, FPGA vendors have begun offering their own soft cores. IP cores reduce the time and manpower requirements for the FPGA designer because they have already been designed, characterized, and verified. IP cores are also often modifiable, meaning that you can add or remove functionality to suit your needs. They are also portable from one vendor to another.
But IP cores may also be expensive. Electrical characteristics such as timing or power consumption for an IP core can be optimized to a limited degree, but the actual characteristics depend on its use in a particular device and on the logic to which it is connected. IP cores purchased from a third party may not be optimized for your particular FPGA vendor's technology. You may not be able to meet your speed or power requirements, especially after you have placed and routed the core.
Embedded Cores: The embedded core is ideal for many users, which is one reason why programmable device vendors are now offering embedded cores in their devices. The embedded core will be optimized for the vendor's process to give you good timing and power consumption numbers. The function will be placed as a single cell on the silicon die and so the performance of the function will not depend on the rest of your design since it will not need to be placed and routed.
Some embedded cores are analog devices that cannot be designed into an ordinary FPGA. By integrating these functions into the device, you can avoid the difficult process of designing analog devices, and you save the chips and components that would otherwise be required outside the programmable device.
Of course there is a drawback to embedded cores. By using an embedded core in your programmable device, you tie your design into a single vendor. Unless another vendor offers the same embedded core, switching to another vendor will require a large effort and will not be pleasant.
Processor Cores: Processor cores are one of the types of cores commonly available as IP cores or embedded cores. These processors tend to be those that are designed for embedded systems since, almost by definition, programmable devices are embedded systems.
If the processor core is embedded, you will be using a processor that has been optimized and has predictable timing and power consumption. For either type of core, tools will be readily available for software development. Off-the-shelf cross compilers and simulators can be used to debug code before the design has been completed and the programmable device is available.
An example of an FPGA with an embedded processor, along with other embedded cores, is shown in Fig 5.
5. FPGA with embedded processor core (courtesy of
Quicklogic).
DSP Cores: Digital Signal Processors (DSPs) are another common type of core that is offered as an IP core or an embedded core. These are essentially specialized processors that are used for manipulating analog signals. They are commonly used for filtering and compression of video or audio signals.
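Filtering is a good example of the kind of work a DSP core is built for. The Python sketch below shows a direct-form finite impulse response (FIR) filter, chosen here purely as an illustration; the function name and the sample data are assumptions, and integer coefficients are used to keep the arithmetic exact.

```python
def fir_filter(samples, coefficients):
    """Direct-form FIR filter: each output is a weighted sum of the most recent inputs."""
    history = [0] * len(coefficients)  # delay line, newest sample first
    output = []
    for sample in samples:
        history = [sample] + history[:-1]  # shift the new sample into the delay line
        acc = 0
        for coeff, value in zip(coefficients, history):
            acc += coeff * value  # multiply-accumulate, the core DSP operation
        output.append(acc)
    return output


# A 2-tap moving sum smooths the input: each output adds the current and previous sample.
smoothed = fir_filter([4, 8, 4, 0], [1, 1])  # [4, 12, 12, 4]
```

The inner loop is a chain of multiply-accumulate operations, which is the operation that dedicated DSP hardware performs in a single cycle per tap.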
Many engineers have argued that as general processors become faster, DSPs will be less useful because the same functions can be accomplished using the generic processors. However, video and audio digitization, compression, and filtering requirements have increased in recent years as millions of users connect to the Internet and regularly upload and download all kinds of information over relatively limited bandwidth connections. So far, DSP demand for use in networking and graphics devices has been increasing, not decreasing.
Analog Cores: FPGA vendors have begun to include analog cores in their FPGAs. For example, PHY cores are the analog circuitry that drives networks. Many companies are now integrating this functionality onto their devices. Because these devices include specialized analog circuitry, they are available only as embedded cores.
6. FPGA with embedded PHY core (courtesy of Actel).
A functional block diagram of an FPGA that includes an embedded processor core, embedded digital peripheral cores, and embedded analog cores is shown in Fig 6.
Special I/O Drivers: Special I/O drivers are also being embedded into programmable devices. The newer buses inside personal computers need to have very tightly controlled timing and must be driven by special high-drive, impedance-matched circuits. The I/O buffers need to have inputs with very specific voltage threshold values. Many vendors now offer programmable devices with I/O that meet these special requirements. Many times, this is the only way to design a programmable device that can interface with these buses without external chips and components.
New Architectures: New basic architectures are being developed for the logic blocks that comprise FPGAs. One new architecture has a logic block that is based on a DSP, as shown in Fig 7. This type of FPGA will be better for use in chips that need a significant amount of signal processing. I have certain doubts about this future path, though. First, the majority of programmable devices do not perform any DSP, so this architecture targets a relatively small market. Second, special tools will be needed to convert digital signal processing algorithms for use in such a specialized FPGA. These tools will need to optimize the algorithm very well so that the specialized FPGA actually performs better than a standard DSP, or a generic processor, running code that has been optimized using tools and compilers that have been available for years.
7. DSP core cell in an FPGA (courtesy of Altera).
New tools
The most significant area for the future, I believe, lies
in the creation of new development tools for FPGAs. As programmable devices
become larger, more complex, and include one or more processors, there is a huge
need for tools to take advantage of these features and optimize the designs.
As FPGAs come to incorporate processors, development tools are needed for software just as much as for hardware. Hardware synthesis tools allow hardware engineers to work at higher levels of abstraction, without the need to understand the details of the underlying hardware architectures. Similarly, software synthesis tools are needed to allow software engineers to work at a higher level of abstraction without the need to understand the details of the underlying software architecture.
Ultimately, there will have to be a melding of hardware and software expertise in an FPGA designer. System level issues must be understood and addressed. Future intelligent tools will work with libraries of pre-tested hardware objects and software functions, leaving "low-level" C and Verilog design necessary only for unique, specialized sections of hardware or software.
Eventually, platform FPGAs with embedded processors will become the dominant platform for embedded system design, and will finally allow the fulfillment of the promise of, and force further development of, hardware/software co-design tools.
Conclusion
This article has presented an overview of current and
emerging FPGA technologies, architectures, and tools. You are now prepared to
delve into your first or fiftieth FPGA design with the confidence that your
knowledge is up to date and that you have the ability to accurately evaluate the
various FPGA vendors and their families, and the software tools needed to ensure
your design works as required.
Bob Zeidman is the president of Zeidman Technologies (www.zeidman.biz), a company that develops hardware/software co-design tools. He is also president of Zeidman Consulting (www.ZeidmanConsulting.com), a contract research and development firm. Among his publications are technical articles on hardware and software design methods as well as three textbooks: Designing with FPGAs and CPLDs, Verilog Designer's Library, and Introduction to Verilog. Bob holds two patents and earned bachelor's degrees in physics and electrical engineering at Cornell University and a master's degree in electrical engineering at Stanford University. Bob can be contacted at Bob@ZeidmanConsulting.com.
All material on this site Copyright © 2006 CMP Media LLC. All rights reserved.