Systems Technology Division,
Hewlett-Packard Company,
3404 East Harmony Road,
Fort Collins, CO, 80525,
USA
(from Jan 1998)
email: gerard@ee.ed.ac.uk
webSite: http://www.ee.ed.ac.uk/~gerard/
Main interest is digital VLSI - both component research and development, and CAD environment design.
Degrees: Mathematics, Computer Science and Electrical Engineering.
Employment: one year in ASIC design at Hitachi Central Research Laboratory, Tokyo; seven years as lecturer in VLSI Design and Project Management at the University of Edinburgh; three years in industrial VLSI and CAD development at ES2, Bracknell.
Commercial products: ROM generator and parameterizable logic module library.
Publications: VLSI articles on Memory design, CAD development and DSP components; and over a dozen articles on Management skills with a book published by the IEEE.
Digital CMOS VLSI design skills: layout, design rule checking, circuit level simulation (HSPICE), standard-cell design, schematic capture, layout vs schematic, place-and-route, behavioural simulation (Verilog), CAD development, digital design, design quality assurance, CADENCE design environment and tool development with Skill.
Programming skills: C, Lisp, Skill, Verilog
The University of Edinburgh, UK (1983-86)
The University of Cambridge, UK (1979-83)
Member of Multi-Media group. Focused on the design of a VLSI accelerator-module for MPEG audio encoding. Devised silicon implementation of discrete Fourier transform with new bit-serial architecture avoiding all control signals except an initial reset. The technique also led to the design of a hardware accelerator for the discrete cosine transform. The designs were specified in (C-generated!) Verilog and implemented through netlist conversion and layout using Cadence place-and-route tools[20, 25, 26, 29-34].
VLSI Design: third-year undergraduate course (wrote textbook[1]) and design laboratory. Initiated a unique collaboration with Cadence UK resulting in the donation of software (worth over a million pounds) to establish the Cadence Laboratory for Scotland and an annual student prize of a trip to Cadence in California.
Project Management: three-week intensive Masters module and a first-year undergraduate course. Proposed and devised the courses from scratch. The focus was on the skills which a graduate engineer needs to manage small teams and small projects, as determined by a survey[45] of employers of past graduates. The organization and philosophy were published in both professional and research literature[43, 44, 56-60], and the material as a series of ten articles in the IEE Engineering Management Journal[46-55] and a book[41] which has been reprinted[42] in the IEEE Engineers Guide to Business series.
Courses attended: To acquire skills and knowledge for the MEng course, attended several industrial courses:
Course | Company | Year |
Team Work | Hewlett-Packard | 1990 |
Management I | Royal Bank of Scotland | 1990 |
Project Planning | British Telecom | 1991 |
Making Meetings Matter | EU Personnel Office | 1991 |
People Management | Standard Life | 1991 |
Awards: Premium Award for the IEE Engineering Management Journal 1992; Commendation in the ESSO UK prize for "transferable skills in engineering" in the Partnership Trust Awards (a national "scheme for commending innovation in teaching and learning in higher education") 1994; promotion to Senior Lecturer in 1995.
Consultancy: EuCAD (the European R&D division of Cadence), VLSI Vision Ltd (the producers of the Peach and Imputer "camera on a chip"), and Motorola, East Kilbride.
Working for Scottish Enterprise, through Cadence, member of think tank: "group of top academic experts from around the world", to define vision and curriculum for the System Level Integration Institute in Scotland as part of Project Alba, 1997.
Research activity in VLSI Design: see below.
Softmacro Development Group: originated definition of new product/documentation format. Given sole responsibility for recruiting and managing new group; reviewed and updated base library elements; performed complete revision of existing softmacros; specified and implemented new parameterizable library with strategic library cells to assist in the penetration of new markets; devised and implemented automatic Quality Assurance procedure for cell-macro design.
Parameterized ROM Generator: sole responsibility from initial design through to prototype testing and documentation. Developed with Cadence software, the project consisted of schematic entry and design analysis using SPICE; leaf-cell layout for 2µm CMOS; critical path analysis; software in SKILL for HCI; automatic layout; creation of schematic, symbol, abstract views, and Silos model; and automatic post-commissioned verification using DRC and simulation of extracted layout. The design was then ported across design rules to 1.5µm CMOS, and finally integrated in ES2's solo1400 software with an IMP functional model.
General Activities: research into embedded-controller design in terms of the CAD interface and environment[10, 11]. Devised and promoted method by which customers could include full-custom modules within ES2's semi-custom layout environment. Provided support to sales and marketing through informal presentations to customers. Liaised with ES2's partners in DTI Behavioural Compilers Project. Coordinated final phases of Alvey DSP collaboration with Edinburgh University including complete revision of workplan to obtain final contract. Initiated successful LINK application with Southampton University and Lucas Research.
Overview: My interests are primarily in digital VLSI design. They include the direct implementation of DSP algorithms on silicon, clocking strategies for large designs, circuit design, low-power techniques, and effective design methodologies.
Bit-serial correlator with self-generating clock: The issue of clock distribution in large MOS designs led to the idea of constructing a system of independent modules with primarily bit-serial architectures driven by locally generated clocks and interfaced by hand-shaking protocols. In this view, the internal clock for each module is generated using its own distribution network as a ring oscillator. Thus, fast local clocks achieve rapid bit-serial computations in a relatively small area. To explore these ideas, the design of a spread-spectrum correlator was considered[12]. The device has an input data word of 4 bits which is correlated with 512 binary taps at a (typical) sample rate of 2.4 MHz, producing a full-precision 13-bit sum. The design consists of 3.5K bit-registers in 70 mm-sq using a 1.2µm process. The clock feedback loop includes several buffer stages running in the opposite direction to the data-flow (to avoid race hazards).
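The 13-bit figure for the correlator sum can be checked with a little arithmetic. The sketch below is only an illustration, not the design in [12]: it assumes unsigned 4-bit samples and taps encoded as 0/1 (neither is stated above), and the names `correlate` and `bits_needed` are hypothetical.

```python
TAPS = 512       # number of binary taps (from the text)
WORD_BITS = 4    # input word width in bits (from the text)


def correlate(samples, taps):
    """Sum the samples whose tap bit is 1.

    Assumption: samples are unsigned and taps are encoded as 0/1.
    """
    if len(samples) != len(taps):
        raise ValueError("samples and taps must have the same length")
    return sum(s for s, t in zip(samples, taps) if t)


def bits_needed(max_value):
    """Number of bits needed to hold an unsigned value up to max_value."""
    return max_value.bit_length()


max_sample = (1 << WORD_BITS) - 1   # largest 4-bit value: 15
max_sum = max_sample * TAPS         # worst case: every tap set, every sample 15
sum_bits = bits_needed(max_sum)     # width of a full-precision sum
```

Under these assumptions the worst-case sum is 15 x 512 = 7680, which lies between 2^12 and 2^13, so 13 bits hold the result without truncation.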
CAD for skew-free clock distribution: The main deficiency of the correlator design above was that the clock feedback path was unnecessarily long with respect to the optimal buffering of the clock signals; work thus began on overcoming the well-known problems of clock skew. The aim was to automate the generation of a balanced (for signal delay) clock-distribution network. My approach was to exploit a feature of the Cadence design software which allows the modification of layout according to a simple parameter even after place and route. Thus a good initial guess for a balanced network can be updated to be fully balanced after the actual routing and interconnect capacitances are known[5, 9]. The effect of fixed interconnect capacitances on the buffer taper factor has been derived[17].
Low-power PLA circuit design: The above work implicitly assumed a single-phase clock and the use of CMOS technology. One of the most common sub-modules in VLSI is the Programmable Logic Array (PLA); its implementation in single-phased CMOS is achieved using either pseudo-NMOS which introduces static power consumption, or dynamic techniques with discharge paths of transistors (to form a NAND gate) which introduces delay. Combining these two approaches, I proposed a hybrid design style which achieves the high speed of the first with far lower static-power consumption[6].
Augmented self-generating clocks: One problem associated with the self-generating clock is that the delay between latches is limited to that of the ring oscillator and so designs must avoid long combinatorial logic delays. This restriction can be reduced by augmenting the oscillator path to include such delay chains (using self-timed logic or delay matching). A square-root algorithm which avoids multiplication was implemented to test this idea[7]. In general, the idea is to use local, creative clocking; a novel circuit for the controlled generation of a clock's falling edge has also been proposed[40].
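As an illustration of how a square root can be computed without multiplication, the sketch below uses the well-known digit-by-digit (shift-and-subtract) integer method. It is not claimed to be the algorithm implemented in [7]; the function name and the `bits` parameter are hypothetical.

```python
def isqrt_shift_subtract(n, bits=32):
    """Return floor(sqrt(n)) using only shifts, adds, subtracts and compares.

    Assumption: n is a non-negative integer that fits in `bits` bits,
    and `bits` is even.
    """
    if n < 0 or n >= (1 << bits):
        raise ValueError("n must satisfy 0 <= n < 2**bits")
    root = 0
    bit = 1 << (bits - 2)      # highest power of four within the word
    while bit > n:             # skip leading powers of four above n
        bit >>= 2
    while bit:
        trial = root + bit
        if n >= trial:
            n -= trial         # accept this result bit
            root = (root >> 1) + bit
        else:
            root >>= 1         # reject this result bit
        bit >>= 2
    return root
```

For example, `isqrt_shift_subtract(25)` walks through the trial values 16, 20 and 9 and returns 5. Each iteration resolves one bit of the result, which is why this style of algorithm maps naturally onto a chain of adders and comparators.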
Twos-complement addition and redundant-number conversion: During an investigation into logic for conversion from redundant-binary to twos-complement numbers, the direct use of conversion logic as a twos-complement adder was noticed in two special cases[15, 16]. A formal proof of the general equivalence has been derived[8].
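For readers unfamiliar with redundant-binary numbers, the sketch below shows the usual textbook view of the conversion: each digit is -1, 0 or 1, and the twos-complement value is the word of positive digits minus the word of negative digits. It does not reproduce the conversion logic or the adder equivalence of [8, 15, 16]; the function names are hypothetical.

```python
def rb_to_twos_complement(digits, width):
    """Convert redundant-binary digits to a twos-complement bit pattern.

    digits: most-significant digit first, each one of -1, 0 or 1.
    width:  word width of the result in bits.
    Returns the bit pattern as a non-negative integer modulo 2**width.
    """
    pos = 0   # word formed by the +1 digits
    neg = 0   # word formed by the -1 digits
    for d in digits:
        if d not in (-1, 0, 1):
            raise ValueError("digits must be -1, 0 or 1")
        pos = (pos << 1) | int(d == 1)
        neg = (neg << 1) | int(d == -1)
    return (pos - neg) & ((1 << width) - 1)


def to_signed(pattern, width):
    """Interpret a width-bit pattern as a signed twos-complement value."""
    sign = 1 << (width - 1)
    return (pattern ^ sign) - sign
```

The representation is redundant because several digit strings denote the same value: `[1, -1]` and `[0, 1]` both convert to 1. The subtraction `pos - neg` is where carry (borrow) propagation enters, which is the part that dedicated conversion logic has to implement.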
Adder design: This interest in adder logic has been continued in the design of a high-speed wide adder[37] and circuit enhancements to improve the performance of multiple output domino logic[39].
Latch design: Recent activity on latch design has resulted in the discovery of problems with two published latch designs[18, 21, 22] and a proposal for a low-power double-edge triggered latch circuit[19].
Asynchronous logic: New two-phase micropipeline controller logic has been designed which offers superior performance to all other techniques[13, 35]. It has been verified using formal techniques[28] and included in a novel parallel-serial conversion architecture[38].
Miscellaneous: As part of background research, review articles have been written on low-power design[24], Discrete Cosine Transforms[36] and Verilog HDL[27]. Work has also been performed on Viterbi implementations[14] and a digital design for the Tower of Hanoi[23] has resulted from a student laboratory.
Citations: The Doctoral work on content-addressable memory[3, 4] is cited in IEE Proceedings, IEEE Journal of Solid-State Circuits, and IEEE Micro, and is described in full within a chapter of Advances in Computers by Academic Press (1992).
A paper on skew-free clock distribution[5] was selected for IEEE reprints[9].
The low-power single-phase programmable-logic-array[6] has been incorporated in standard textbooks (Weste and Eshraghian, 2nd edition, 1993; and Bellaouar and Elmasry, 1995), and has been cited in IEEE Trans Neural Networks (and implemented in silicon).
The survey on low-power digital techniques[24] is cited in Microelectronics Journal; and the low-cost sorting architecture[20] is cited in Nuclear Instruments & Methods in Physics Research.
12th December 1997