electronic design - "Tool Up For Alternatives To Standard ASICs"

[Technology Report]
Tool Up For Alternatives To Standard ASICs
Standard-cell ASIC NREs got your eyes bulging? Check out tools and methodologies for emerging alternatives like structured ASICs and, yes, FPGAs.

David Maliniak
ED Online ID #5715
September 15, 2003

Unfortunately for all of us, the electronics OEM remains in a funk. Yet design work must go on, or the industry's malaise will linger. Although some 4000 ASIC starts are projected for 2003, not all of them are going into high-volume applications, which can make it very difficult to justify standard-cell nonrecurring engineering charges (NREs).

It's no wonder that research firm Gartner/Dataquest expects only around 6% growth in the ASIC market this year but is projecting growth of over 10% for application-specific standard products (ASSPs) and over 14% for FPGAs and programmable logic devices. If you look behind the numbers and at the marketplace activity of late, it's clear that a shift is under way.

For cutting-edge IC designs, standard-cell and custom ASIC implementation is the ticket to the highest performance allowed by today's silicon fabrication processes. But given that most IC designs don't see huge unit volumes and most don't require cutting-edge performance, designers often seek alternatives to standard-cell ASIC implementation.

How you ultimately choose to implement your design is a matter of tradeoffs, of course, between expected unit volumes, performance requirements, and your market window. Once all of these are weighed, it may be that there's a better way for you to go than with a standard-cell methodology.

Currently, digital logic can be implemented through four primary methods. FPGAs are the lowest in risk but carry the highest unit costs. Gate arrays are a middle ground that has fallen from wide use, as they call for only somewhat less custom mask making than standard cell. Standard cell is the option with the highest performance and lowest unit cost (good), but it also has the longest design cycle and highest NREs (bad).

Then there's the fourth option. Quite a bit of excitement surrounds a reinvention of the gate-array concept that's emerged in the past couple of years. Known variously as platform ASICs or structured ASICs, these devices offer performance levels that are just a process generation or so behind standard-cell performance, which is good enough for a very large number of applications. For many users, structured ASICs more than make up for a small performance hit in two ways: NREs are a fraction of standard-cell costs, and turnaround times are dramatically shorter. Some vendors claim that designs can go from RTL or gate-level netlist to prototypes in as little as three weeks.

The concept behind structured ASICs is to simplify the design process by having a large array of predefined logic cells. Structured ASICs typically provide on-chip resources similar to those found on high-end FPGA devices, including embedded memory, I/Os, clock networks, computational blocks, IP blocks, and deterministic routing. To reduce design cost, these resources are pre-fabricated and have been physically verified to work properly under normal operating conditions. The chip is completed by adding two (and sometimes only one) user-customized metal layers.

STRUCTURED ASICs
In surveying some of the offerings currently on the market, one finds that generally, front-end tool flows for structured ASICs aren't all that different from the flows used for standard-cell implementations. That's good news. Design teams need not completely overhaul their front-end flows to contemplate a switch to a new implementation platform. But there are certainly considerations to be made in terms of methodologies.

With regard to the availability of EDA tools for structured ASICs, few vendors have rushed to market with offerings. Synplicity has been the most prominent by far. Recently, Magma Design Automation's announcement that it would acquire Aplus Design Technologies brings it into the mix as well. Aplus has implemented a physical planning stage that defines interconnect delays at all levels (global, semi-global, local, etc.) through mapping and placement. This timing-unified synthesis and mapping process has awareness of the design's physical implementation and is driven by the timing model defined in the physical planning stage.

It's important to understand that structured ASIC cell logic blocks are complex cells. They typically include lookup tables, storage elements, multiplexers, inverting inputs, and cascading outputs, all common in FPGAs. A custom mapper that can directly map to individual complex cells yields the best quality of results.
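To make the lookup-table element of such a complex cell concrete, here is a minimal Python sketch. It assumes a hypothetical 3-input LUT whose contents are generated from a Boolean function; the function names and the bit ordering are illustrative and do not describe any vendor's actual cell.

```python
def make_lut(func, num_inputs):
    """Build a truth table for func; input 0 is the least significant index bit."""
    table = []
    for index in range(2 ** num_inputs):
        bits = [(index >> pos) & 1 for pos in range(num_inputs)]
        table.append(func(*bits) & 1)
    return table


def lut_eval(table, inputs):
    """Evaluate the LUT by packing the input bits into a table index."""
    index = sum((bit & 1) << pos for pos, bit in enumerate(inputs))
    return table[index]


# Example: a 3-input majority function stored as eight configuration bits.
majority = make_lut(lambda a, b, c: (a & b) | (a & c) | (b & c), 3)
print(majority)                       # [0, 0, 0, 1, 0, 1, 1, 1]
print(lut_eval(majority, [1, 1, 0]))  # 1
```

A mapper that understands the cell can fold any function of up to three inputs into one such table, which is one reason a cell-aware mapper can outperform a generic gate-level one.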

Why should designers consider structured or platform ASICs for their chip implementation? Again, it's a matter of tradeoffs. If you know that the ASIC you're designing will be produced in extremely large volumes, or if it needs a very large number of gates, or requires absolute best-in-class performance, then standard cell is probably the best way for you to implement it. But for mid-volume production runs, or for those without the budgets for standard-cell NREs, structured ASICs may be the answer (Fig. 1).

Figure 1
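The volume tradeoff can be sketched as simple arithmetic: total cost is the NRE plus unit cost times volume. The Python below is only an illustration. Every dollar figure in it is a hypothetical placeholder, not vendor pricing, and the function names are made up for this example.

```python
def total_cost(nre, unit_cost, volume):
    """Total program cost: one-time NRE plus per-unit cost times volume."""
    return nre + unit_cost * volume


def break_even_volume(low_nre_option, high_nre_option):
    """Volume above which the high-NRE option becomes cheaper.

    Each option is an (nre, unit_cost) pair. Returns None if the high-NRE
    option never catches up.
    """
    nre_lo, unit_lo = low_nre_option
    nre_hi, unit_hi = high_nre_option
    if unit_lo <= unit_hi:
        return None
    return (nre_hi - nre_lo) / (unit_lo - unit_hi)


# Hypothetical (NRE in dollars, unit cost in dollars) for illustration only.
OPTIONS = {
    "fpga": (0, 100.0),
    "structured_asic": (150_000, 20.0),
    "standard_cell": (1_000_000, 5.0),
}


def cheapest(volume, options=OPTIONS):
    """Name of the option with the lowest total cost at this volume."""
    return min(options, key=lambda name: total_cost(*options[name], volume))


print(break_even_volume(OPTIONS["fpga"], OPTIONS["structured_asic"]))  # 1875.0
for volume in (1_000, 10_000, 100_000):
    print(volume, cheapest(volume))
```

With these assumed numbers the FPGA wins at 1000 units, the structured ASIC at 10,000, and standard cell at 100,000. Real crossover points depend on actual quotes, and on performance and schedule, which this sketch ignores.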

In many cases, the best approach to design for a given structured ASIC platform of the several on the market depends on the architecture of the platform itself. NEC Electronics' approach to a design methodology for its Instant Silicon Solution Platform (ISSP) is to leverage as much of the existing design flow and its users' experience with that flow as possible. At the same time, NEC took steps to improve on the existing cell-based front-end flow, largely through strategic partnerships with key EDA vendors. Other platform-ASIC vendors have taken this tack as well.

NEC's flow begins with an RTL design description. "The customer has to consider the structure of ISSP before doing RTL design," says Chung Ho, NEC's director of strategic marketing. Otherwise, it's difficult to wring all of the performance potential out of the architecture.

Enhancements to the flow come in two forms. One comes through NEC's partnership with Tera Systems. NEC uses Tera's TeraForm RTL Design Consultant and Virtual Prototype technologies to check incoming RTL. Tera's rule-checking technology, embodied in the RTL Design Consultant, is used to look for issues in the RTL that would be problematic for the physical implementation of an ISSP device. Items such as the number of clock domains and the number of logic levels are checked so that when the RTL is mapped into the ISSP architecture, areas of concern are flagged. The handoff point is at the netlist level. As do other ASIC vendors, NEC provides all of the backend place and route, silicon verification, design-for-test insertion, and so on.
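As a rough picture of the two checks named above, counting clock domains and counting logic levels, here is a small Python sketch. It is not Tera's tool or its rule set. The netlist format, the cell names, and the limits are all assumptions made for illustration, and the combinational logic is assumed to be free of loops.

```python
# Hypothetical netlist: name -> (kind, fan-in names, clock or None).
NETLIST = {
    "in_a": ("input", [], None),
    "in_b": ("input", [], None),
    "r1": ("flop", ["in_a"], "clk_core"),
    "r2": ("flop", ["in_b"], "clk_io"),
    "g1": ("gate", ["r1", "r2"], None),
    "g2": ("gate", ["g1", "r1"], None),
    "g3": ("gate", ["g2"], None),
    "r3": ("flop", ["g3"], "clk_core"),
}


def clock_domains(netlist):
    """Set of distinct clocks that drive flip-flops."""
    return {clk for kind, _, clk in netlist.values() if kind == "flop"}


def logic_levels(netlist):
    """Largest number of gates in series between inputs/flops and any node."""
    cache = {}

    def depth(name):
        kind, fanin, _ = netlist[name]
        if kind != "gate":
            return 0  # inputs and flops start a new timing path
        if name not in cache:
            cache[name] = 1 + max((depth(src) for src in fanin), default=0)
        return cache[name]

    return max((depth(name) for name in netlist), default=0)


def check_rtl(netlist, max_domains, max_levels):
    """Return warnings for designs that exceed the (assumed) limits."""
    warnings = []
    domains = clock_domains(netlist)
    if len(domains) > max_domains:
        warnings.append(f"{len(domains)} clock domains exceed limit of {max_domains}")
    levels = logic_levels(netlist)
    if levels > max_levels:
        warnings.append(f"{levels} logic levels exceed limit of {max_levels}")
    return warnings


print(check_rtl(NETLIST, max_domains=1, max_levels=2))
```

A production checker works on elaborated RTL rather than a hand-built dictionary, but the flagged quantities are the same kind of structural metrics.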

The other enhancement comes through Synplicity. NEC is one of several structured ASIC vendors that have joint development agreements with Synplicity to create custom synthesis mapping technology for the Synplify ASIC synthesis tool. Synplify ASIC has a graphical user interface that's not unlike that of Synplicity's Synplify Pro FPGA synthesis tool. Because NEC is seeing a number of its ISSP customers graduating up from the FPGA world, the company likes this about the Synplicity tools.

The custom mapper accounts for the fundamental ISSP architecture in terms of the complex gate structure. As a result, the user's logic can take full advantage of the tradeoff ratio between multiplexers, flip-flops, gates, and other resources.

LSI Logic has also chosen to incorporate both the Tera Systems and Synplicity tools in its newly launched RapidWorx design system (Fig. 2). Geared toward its RapidChip platform ASICs, RapidWorx represents a rules-based methodology and correct-by-construction flow.

Figure 2

There are five basic steps to the flow: configuration of the RapidChip slice resources, physical mapping of those resources, RTL rule checking, physical synthesis, and netlist handoff rule checking. Each step is launched from within the RapidWorx design cockpit. There's tight integration between tools to allow for cross-probing from one step to another.

As in the NEC flow, LSI Logic ensures RTL rule compliance early in the cycle to prevent problems from propagating downstream. Thanks to the final gate-level netlist check just before handoff, LSI feels confident in quoting a layout cycle of 1 million gates per week. The design platform costs about $50,000 per seat for a six-month license that includes Synopsys PrimeTime static timing analysis.

AMI Semiconductor, with its XPress Array line of structured ASICs, has a front-end flow that's more geared toward FPGA design conversions. Yet, AMI's flow also tries to take maximum advantage of existing tools. "We prefer to receive RTL design descriptions," says Bob Kirk, AMI's director of applications engineering for its digital ASIC business unit. "If, for example, that RTL code was written for Synplify Pro, we use Synplify ASIC to retarget that to the XPress Array architecture. Conversely, if the customer has designed the FPGA using the FPGA Compiler from Synopsys, we'll use Design Compiler to retarget it." AMI can also accept an FPGA netlist out of synthesis and retarget it.

For both flows, AMI uses Synopsys' LEDA rule checker to ensure that IP blocks are instantiated in a way that won't trip up physical design. "The most common problems are in the use of multiple clock domains," says Kirk. "Many designers don't know how to interface clock domains reliably and safely. For example, they'll omit the metastability flip flops."
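The omitted "metastability flip flops" Kirk mentions are usually a two-flop synchronizer on any signal that crosses between clock domains. The Python below is a simplified structural check for that pattern. It is not how LEDA works. The netlist format and cell names are hypothetical, and the rule used here (a crossing is accepted only when the receiving flop is fed directly by the source flop and drives exactly one more flop on the same clock) is a deliberately narrow assumption.

```python
# Hypothetical netlist: name -> (kind, clock or None, fan-in names).
CELLS = {
    "tx_reg": ("flop", "clk_a", ["data_in"]),
    "sync1": ("flop", "clk_b", ["tx_reg"]),     # first synchronizer stage
    "sync2": ("flop", "clk_b", ["sync1"]),      # second synchronizer stage
    "use_ok": ("gate", None, ["sync2"]),
    "bad_and": ("gate", None, ["tx_reg", "sync2"]),
    "bad_reg": ("flop", "clk_b", ["bad_and"]),  # samples clk_a data via logic
}


def driving_flops(cells, name):
    """Map each flop that reaches `name` through gates only to a 'direct' flag."""
    found, seen = {}, set()
    stack = [(src, True) for src in cells[name][2]]
    while stack:
        src, direct = stack.pop()
        if src not in cells:
            continue  # primary input, not modeled
        kind, _, fanin = cells[src]
        if kind == "flop":
            found[src] = found.get(src, True) and direct
        elif src not in seen:
            seen.add(src)
            stack.extend((s, False) for s in fanin)
    return found


def unsynchronized_crossings(cells):
    """Return sorted (source_flop, receiving_flop) pairs lacking a synchronizer."""
    fanout = {name: [] for name in cells}
    for name, (_, _, fanin) in cells.items():
        for src in fanin:
            if src in fanout:
                fanout[src].append(name)
    problems = []
    for name, (kind, clock, _) in cells.items():
        if kind != "flop":
            continue
        loads = fanout[name]
        has_second_stage = (
            len(loads) == 1
            and cells[loads[0]][0] == "flop"
            and cells[loads[0]][1] == clock
        )
        for src, direct in driving_flops(cells, name).items():
            if cells[src][1] != clock and not (direct and has_second_stage):
                problems.append((src, name))
    return sorted(problems)


print(unsynchronized_crossings(CELLS))  # [('tx_reg', 'bad_reg')]
```

Here the path through `sync1` and `sync2` passes, while `bad_reg` is flagged because it samples `clk_a` data through combinational logic with no synchronizer in front of it.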

Another structured ASIC vendor, Lightspeed Semiconductor, has also bent over backwards to accommodate existing ASIC flows as much as possible (Fig. 3). Users receive a library that looks like a standard-cell library. They perform synthesis using their preferred engine and work to achieve pre-place-and-route timing closure of their design. In the Lightspeed architecture, metal layer 5, via layer 5, and metal layer 6 are the customizable layers where users plug in their own logic.

Figure 3

The flow for Lightspeed's third-generation Luminance devices, which contain up to 10 million usable ASIC gates, does have a couple of attractive differences. For one, structures to facilitate 100% stuck-at fault testing are pre-built into the logic fabric. There's no performance hit or impact on timing, nor is there logic for the customer to insert. Also, the testing doesn't rely on user clocks.

Another wrinkle is in the post-place-and-route back end, where Lightspeed gets into its FastFlow timing-closure process. After a first pass at placement and routing, Lightspeed uses highly targeted tools that allow application engineers to modify placement to improve timing. In a matter of hours, engineers can perform placement improvements, incremental routing, standard-delay format (SDF) extraction, and timing analysis.

The keys, according to Lightspeed VP of engineering Michael Sydow, are the tools' tight coupling with the architecture and a highly optimized design database. "The router has a fast incremental capability that minimizes changes. The goal is to do as little rip-up and rerouting as possible to keep timing stable," he says. Lightspeed and Synplicity also are jointly developing a custom mapper for the Luminance products based on Synplify ASIC.

Coming at the structured ASIC market from a slightly different angle, ViASIC describes itself as a tool vendor first and foremost. Indeed, unlike other vendors, ViASIC isn't in the business of selling finished silicon. Rather, it licenses an architecture and sells a physical design tool with which to implement it. ViASIC's ViaMask architecture requires customization of just one mask layer (a via layer between metal layers 3 and 4) for designers to implement an ASIC for the TSMC 180- and 130-nm processes.

The ViaPath physical design tool is what's used to place, optimize, and route a ViaMask design. The architecture is basically a sea of gates with about a dozen basic logic gates, such as two-way NANDs, multiplexers, inverters, and other functions repeated throughout the chip. Also interspersed in the pre-defined metal layers is a true dual-port RAM.

A ViaMask design begins with a synthesized gate-level netlist. The ViaPath tool maps that netlist over to the ViaMask fabric, determining the required ratio of logic cells. The tool then places those particular cells onto the architecture and turns to the routing, which is a matter of determining which vias need to be placed and where. The tool's output is the GDSII file for the via layer, which is combined with the GDSII file for the foundry's master tile. The final steps are extraction and/or DRC/LVS.
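One way to picture "determining the required ratio of logic cells" is to tally the cell types in the synthesized netlist and compare them with what one repeated tile of the fabric provides. The Python below is a hedged sketch of that idea only. It is not ViaPath's algorithm, and the netlist, the cell names, and the tile contents are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical gate-level netlist as (instance name, cell type) pairs.
NETLIST = (
    [(f"u_nand{i}", "nand2") for i in range(6)]
    + [(f"u_mux{i}", "mux2") for i in range(3)]
    + [("u_inv0", "inv")]
)

# Hypothetical contents of one repeated tile in the prefabricated fabric.
TILE = {"nand2": 4, "mux2": 2, "inv": 2}


def cell_mix(netlist):
    """Fraction of the design made up of each cell type."""
    counts = Counter(cell_type for _, cell_type in netlist)
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}


def tiles_needed(netlist, tile):
    """Tiles required so that the scarcest cell type is still satisfied.

    Raises KeyError if the design uses a cell type the tile does not offer.
    """
    counts = Counter(cell_type for _, cell_type in netlist)
    return max(math.ceil(n / tile[cell_type]) for cell_type, n in counts.items())


print(cell_mix(NETLIST))            # {'nand2': 0.6, 'mux2': 0.3, 'inv': 0.1}
print(tiles_needed(NETLIST, TILE))  # 2
```

A real mapper can also re-express one cell type in terms of another to balance utilization, which this sketch does not attempt.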

Chip Express, which offers structured ASICs in both 0.25-µm and 180-nm technologies, sees customers coming down from standard-cell ASIC as well as FPGA conversions. "We see designs coming in at multiple entry points," says Steve Bateman, VP of engineering. An increasingly popular option is RTL handoff. "As we find ourselves with customers wanting to hand off at higher levels of abstraction, that gives us some interesting challenges from the design-flow perspective," he adds.

Many of Chip Express' customers follow more or less a traditional standard-cell model in which they hand off a gate-level netlist. Others are moving to RTL handoff, and for that, Chip Express relies on Atrenta's Spyglass RTL rule checker and eagerly awaits its forthcoming constraints checker. For Chip Express, comfort with RTL handoff requires up-front partnering with the customer to ensure that it will receive clean RTL.

But FPGA conversion customers, Bateman says, can be more problematic. Many of them are prototyping in FPGAs and have figured out that FPGAs can be prohibitively expensive for a production run of any appreciable size. The problem is that RTL written for FPGAs differs from that written for ASIC implementation. Because of the ability to quickly reprogram an FPGA prototype, many FPGA RTL coders are, as Bateman puts it, "less than disciplined." Here again is where RTL rule checking comes in. It enables Chip Express to quickly find problems in the RTL that would stack the deck against a quick turnaround in physical design while adding value to customers.

OPTING FOR FPGAs
While structured ASICs are the right choice for many designers, plenty of other designers ply the waters of FPGA design. This can be as a prototyping vehicle or for small-to-medium production runs. The fact is that FPGAs are becoming more affordable while performance continues to rise. Though large FPGA vendors continue to provide tools for their devices, others are stepping into the fray with everything from standalone tools to entire flows.

FPGA tools have been traditionally viewed as simple and easy to use, says Dennis Kish, VP of marketing at Actel. "However, this is changing with the availability of large FPGAs that are getting more complex with each new generation of technology," he says.

As FPGAs continue along the complexity curve, tool flows for FPGAs are beginning to resemble, in certain respects, traditional ASIC flows. This can be seen with technologies such as physical synthesis and formal verification moving into the FPGA flow.

Another example of ASIC-like tool technology moving into the FPGA flow comes from startup Hier Design. Hier's PlanAhead takes silicon virtual prototyping, a technology aimed at the ASIC mainstream, and moves it into the FPGA domain (Fig. 4). Hier claims that a flat flow is inefficient for today's large FPGA designs. Its tool, which is geared strictly for Xilinx Virtex-II and Spartan-3 devices, sits between synthesis and place-and-route to provide a less iterative path from netlist through physical design.

Figure 4

The tool partitions the synthesized netlist either manually or automatically. Designers can then implement blocks individually and assemble them for performance analysis before other blocks are completed. The block-based, hierarchical, and incremental approach permits creation and manipulation of physical hierarchy independently from logical hierarchy. At the same time, designers can plan and analyze multiple physical implementations.

"Floorplanning is not good enough," says Salil Raje, Hier's VP of engineering. "Floorplanning is essentially adding area constraints into your netlist. What you really need to know is whether the use of these partitions will cause congestion after place and route."

Hier's silicon virtual prototyping capability goes beyond partitioning and adding area constraints to ensure that partitioning results in more connectivity within blocks than between blocks. The opposite scenario, says Raje, defeats the purpose of partitioning, and the design might as well be placed and routed flat.
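The "more connectivity within blocks than between blocks" criterion can be expressed as a simple count of nets that stay inside one block versus nets that span blocks. The following Python is a minimal sketch of that metric, not PlanAhead's method. The net list, the cell names, and both partitions are hypothetical.

```python
def connectivity(nets, block_of):
    """Count nets contained in one block versus nets that span blocks.

    nets: list of nets, each a list of cell names.
    block_of: mapping from cell name to the block it is assigned to.
    """
    internal = external = 0
    for net in nets:
        blocks = {block_of[cell] for cell in net}
        if len(blocks) == 1:
            internal += 1
        else:
            external += 1
    return internal, external


def partition_is_worthwhile(nets, block_of):
    """True when more nets stay inside blocks than cross between them."""
    internal, external = connectivity(nets, block_of)
    return internal > external


# Hypothetical design: a chain of six cells joined by five two-pin nets.
NETS = [["a", "b"], ["b", "c"], ["c", "d"], ["d", "e"], ["e", "f"]]
GOOD = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
BAD = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}

print(connectivity(NETS, GOOD), partition_is_worthwhile(NETS, GOOD))  # (4, 1) True
print(connectivity(NETS, BAD), partition_is_worthwhile(NETS, BAD))    # (0, 5) False
```

The second partition is the scenario Raje warns about: every net crosses a block boundary, so the partitions add constraints without reducing the routing problem.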

The PlanAhead tool, which sells for $25,000 for a one-year time-based license, is available now. Supported platforms include Sun Solaris 5.8, Linux 7.3, and Windows XP.

Not all FPGA vendors are sold on the concept of silicon virtual prototyping, though. "If you have a well designed, high-performance device architecture, there is no need for this kind of tool for most designs," says Tim Southgate, Altera's VP of software and tools marketing. "However, for those difficult designs, it can be helpful to make use of floorplanning and physical synthesis."

Altera provides established FPGA synthesis vendors with its device architecture data. As a result, the vendors can optimize their tools for placement-based physical synthesis and shorten the synthesis placement iterative cycle. But the challenge for EDA vendors is that this capability is unique to each FPGA architecture, so it requires a large effort to develop physical synthesis support.

For its part, says Kish, Actel will introduce floorplanning technology in its next-generation integrated design environment. This new feature will aid logic placement, I/O assignment, and routing to achieve the best tradeoff between design density and performance.

Other vendors have pushed to beef up their FPGA flows to handle the larger devices. Mentor Graphics has announced a three-part flow extending from high-level design through the programmed part's interface to the pc board.

"We believe that design of complex FPGAs is not just synthesis and place and route anymore, which is currently a mainstream design flow," says Simon Bloch, Mentor's general manager for FPGA design. Issues like multiple clock domains complicate matters, as does complexity itself. "We're seeing place-and-route cycles for 6-million-gate devices taking 24 hours. If the engineers do 50 iterations, which is not unusual for devices this large, the resulting design cycle begins to look like that of a small ASIC," he continues.
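Bloch's numbers are easy to work through. The short Python calculation below assumes the runs happen back to back on one machine, with no parallel runs and no time spent between iterations; the function name is just for this example.

```python
def place_and_route_days(hours_per_run, iterations):
    """Elapsed days if every place-and-route run executes sequentially."""
    return hours_per_run * iterations / 24


# 24-hour runs, 50 iterations, as quoted above.
print(place_and_route_days(24, 50))  # 50.0
```

Fifty days of pure place-and-route time is roughly seven weeks, which is why the design cycle starts to resemble that of a small ASIC.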

Mentor's answer is a flow that spans design and verification for the FPGA itself, the embedded system it's part of, and the pc board(s) that carry that system. The flow comprises existing tools, such as the HDL Designer tool suite, the Precision RTL synthesis tool, ModelSim for HDL simulation, and Seamless for hardware/software co-verification. They'll be augmented, in time, with other offerings in areas such as high-level synthesis and, possibly, formal verification. Mentor also will work to tie generic tools, such as Platform Express, Seamless, and its software-development tools like Nucleus, CodeLab, and the XRay debugger, to specific FPGA devices.

Also taking a high-level approach to FPGA design is Celoxica, which recently announced support for the 90-nm version of Xilinx's Spartan-3 devices. Celoxica's DK Design Suite provides a C-based language design environment that enables partitioning of designs between hardware and software. It offers seamless verification of the system at the architectural level and direct synthesis from C-based models. Celoxica's methodology supports C, C++, SystemC, and Handel-C.

While Mentor contemplates high-level FPGA synthesis, the technology has already arrived in the form of AccelChip's AccelFPGA tool. Version 2.0 of the tool, which automatically generates synthesizable RTL models directly from Matlab, carries enhancements that boost DSP developers' ability to synthesize Matlab designs. DSP developers no longer have to treat Matlab designs merely as specifications for driving manual RTL modeling. New in version 2.0 is automated support for conversion from floating-point to fixed-point representations, as well as support for automatic import of foreign models. This paves the way for a new library of DSP IP, called AccelWare, for release in the third quarter.
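For readers unfamiliar with floating-point to fixed-point conversion, the basic operation is scaling by a power of two, rounding, and saturating to the word width. The Python sketch below shows that operation in general terms. It says nothing about how AccelFPGA chooses word lengths or performs the conversion, and the 16-bit format with 15 fractional bits is an assumption picked for the example.

```python
def to_fixed(value, frac_bits, total_bits):
    """Quantize a float to a signed two's-complement integer, with saturation.

    Uses Python's round(), which rounds exact halves to the nearest even value.
    """
    scaled = round(value * (1 << frac_bits))
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    return max(lowest, min(highest, scaled))


def to_float(raw, frac_bits):
    """Convert the stored integer back to the real value it represents."""
    return raw / (1 << frac_bits)


# A filter coefficient stored as 16 bits with 15 fractional bits.
raw = to_fixed(0.7071, frac_bits=15, total_bits=16)
print(raw)                       # 23170
print(to_float(raw, 15))         # 0.70709228515625
print(to_fixed(1.0, 15, 16))     # 32767 (saturated; +1.0 is not representable)
```

The difference between 0.7071 and 0.70709228515625 is the quantization error that a designer, or an automated tool, has to budget for when picking word lengths.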

Full FPGA tool suites don't always have to target the highest-end devices, though. In its Board-on-Chip technology rollout, Altium targets the Xilinx Spartan and Altera Cyclone devices, preferring to call these "low-cost system FPGAs."

Like Mentor's, Altium's flow is a system-level approach with several main thrusts. First is a capture environment for system-level design based on the nVisage schematic-entry package, which enables mixed schematic-level and HDL design. Altium's approach to IP, the second thrust, is a critical element here.

"An important part of the technology is the ability to supply high-level components in a pre-packaged, pre-synthesized form," says Rob Irwin, Altium's manager for brand strategy. "So we'd supply soft cores. Also a range of peripherals and analogous components to say, muxes, logic devices, that sort of thing."

A third component of Altium's flow is to provide reconfigurable hardware development platforms. The NanoBoard breadboards will eventually broaden into an application-specific line for various kinds of design work. Last is a comprehensive approach to software development, embodied in the Tasking products.

FPGA vendors themselves continue to improve both their hardware and software offerings. This is exemplified in Altera's HardCopy Stratix mask-programmed devices and Quartus II version 3.0 design software (see "Hardwired FPGA Option Shrinks Chip Size And Cost," Electronic Design, July 21, 2003, p. 34, ED Online 5318). The HardCopy Stratix parts offer up to 100% performance gains over the Stratix FPGAs, while lowering power consumption and per-part costs compared with the earlier HardCopy APEX devices.

The new Quartus II release lets designers directly target HardCopy Stratix devices from the beginning of the development cycle, rather than doing development work on FPGAs first and then moving the design over to the HardCopy platform. The software delivers design performance metrics for these devices as early in the cycle as it does for FPGAs. Designers can assess speed performance, power consumption, logic-cell placement, and I/O assignments before implementation, an essential capability for system-level board design.

Need More Information?

AccelChip
www.accelchip.com

Actel
www.actel.com

Altera
www.altera.com

Altium
www.altium.com

AMI Semiconductor
www.amis.com

Atrenta
www.atrenta.com

Celoxica
www.celoxica.com

Chip Express
www.chipexpress.com

Hier Design
www.hierdesign.com

Lightspeed Semiconductor
www.lightspeed.com

LSI Logic
www.lsilogic.com