The Rocky and Bullwinkle solution to the address tracing problem met the original design goals fairly well. Holding to the simplicity goal at design time eliminated potential implementation headaches, especially through the choice of SRAM over DRAM; the FPGA worked perfectly with the SRAMs the very first time I wired them up. The design's modularity also allowed the various components of the system to be developed and constructed separately.
True expandability was not achieved for the system as built: it would be nearly impossible to widen it beyond 64 bits or to give it more than 64 kilobytes of SRAM per module. However, the design can easily be adjusted to new specifications if a new system is constructed from scratch.
Acceptable transparency was achieved, primarily as a result of the carefully designed staller program, which interrupts program execution during stalls. Several MS-DOS programs were run and stalled without problems, even with disk accesses, graphics modes, and network operations (telnet). Attempting to stall Windows would probably cause problems, but this has not been tested.
Testability was also achieved: most circuit difficulties were easily tracked down using the logic analyzer or even just a simple logic probe. For functional testing, the console program proved very useful, as it provided direct access to port command writes and status reads, as well as memory reads and writes to the FPGA.
The secondary goal of implementing a hardware-based compression scheme was not attempted. The functioning natasha FPGA program uses 59 of 144 available CLBs, so there is plenty of room available for a simple redundancy-removal circuit, as long as it is pipelined so that no trace records are missed.
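To make the idea of redundancy removal concrete, the following is a minimal software sketch of one possible scheme. It is not the circuit that was planned, since no compression scheme was designed or built. It assumes that much of the redundancy in an address trace comes from sequential accesses, and it collapses each run of addresses that advance by a fixed stride into a single (start, count) pair. The function names and the 4-byte stride are hypothetical choices made for illustration only. The encoding is lossless, which mirrors the requirement that no trace records be missed.

```python
from typing import Iterable, List, Tuple


def compress_trace(addresses: Iterable[int], stride: int = 4) -> List[Tuple[int, int]]:
    """Collapse runs of sequential addresses into (start, count) pairs.

    `stride` is an assumed step between consecutive accesses; it is an
    illustrative parameter, not a value taken from the hardware.
    """
    runs: List[Tuple[int, int]] = []
    start = None  # first address of the run currently being built
    count = 0     # number of addresses in that run
    for addr in addresses:
        if start is not None and addr == start + count * stride:
            # The address continues the current run.
            count += 1
        else:
            # The address breaks the run: flush it and begin a new one.
            if start is not None:
                runs.append((start, count))
            start = addr
            count = 1
    if start is not None:
        runs.append((start, count))
    return runs


def expand_trace(runs: Iterable[Tuple[int, int]], stride: int = 4) -> List[int]:
    """Rebuild the original address sequence from (start, count) pairs."""
    out: List[int] = []
    for start, count in runs:
        out.extend(start + i * stride for i in range(count))
    return out


if __name__ == "__main__":
    trace = [0x1000, 0x1004, 0x1008, 0x2000, 0x2004, 0x1008]
    packed = compress_trace(trace)
    print(packed)
    print(expand_trace(packed) == trace)
```

A hardware version of this idea would need only a register for the last address, a comparator, and a run counter, but whether that fits the timing constraints of the tracer has not been evaluated here.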