FUNCTIONAL DESIGN VERIFICATION FOR MICROPROCESSORS BY ERROR MODELING

by

David Van Campenhout

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Electrical Engineering) in The University of Michigan, 1999.

Doctoral Committee:
Professor Trevor Mudge, Chair
Professor Richard B. Brown
Professor John P. Hayes
Professor Karem A. Sakallah

ACKNOWLEDGEMENTS

I would like to express my sincere appreciation to my advisor Trevor Mudge for his mentoring and guidance. Trevor has also given me broad freedom in my research and has been much more than just a research advisor. My appreciation and thanks also go to John Hayes. Working with John has been a great learning experience. I would like to thank Karem Sakallah and Richard Brown for serving on my committee.

I would like to thank those who volunteered their time and effort to participate in our error collection effort: Hussain Al-Asaad, Todd Basso, Mary Brown, Juan Antonio Carballo, Subhachandra Chandra, Robert Chappell, Jim Dundas, David Greene, Jonathan Hauke, Rohini Krishna Kaza, Michael Kelley, Matt Postiff, and Steve Raasch.

I thank my friends and colleagues in ACAL: Hussain Al-Asaad, Jeff Bell, I-Cheng Chen, Brian Davis, Jim Dundas, Jonathan Hauke, Tom Huff, Jim Huggins, Hyungwon Kim, Keith Kraver, Victor Kravets, Chih-Chieh Lee, Charles Lefurgy, Joao Paulo Marques da Silva, Phiroze Parakh, Steve Raasch, Mike Riepe, Mike Upton, and Hakan Yalcin. These individuals, together with the faculty, provided an intellectually stimulating environment where it was also fun to work. I will never forget Tim Stanley, who left us much too early. I am grateful for several close friends I met during my years in Michigan and hope that they will continue to be part of my life.

My deepest appreciation goes to my parents for their unconditional love and encouragement. I am grateful to my brothers for their understanding. I would like to acknowledge the Belgian American Educational Foundation for having given me the opportunity to study in the US and supporting my first year of study.

TABLE OF CONTENTS

DEDICATION
ACKNOWLEDGEMENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF APPENDICES

CHAPTER
1. Introduction
   1.1 Microprocessor design
   1.2 Functional verification
   1.3 Test generation for FDV
   1.4 Checking the outcome of a simulation
   1.5 Measuring and predicting functional quality
   1.6 Related area: physical fault testing
   1.7 Related area: software testing
   1.8 Thesis outline
2. Design error data
   2.1 Published error data
   2.2 Collection method
   2.3 Collected error data
   2.4 Guidelines for implementing an error collection system
   2.5 Discussion
3. Design error models
   3.1 Error model requirements
   3.2 Design error models
   3.3 Number of error instances defined by error model
   3.4 Test generation
   3.5 Error simulation
   3.6 Analytical coverage evaluation of CSSL1
   3.7 Coverage evaluation using error simulation
   3.8 Coverage evaluation by analysis of actual errors
   3.9 Conclusions
4. High-level test generation for design verification of pipelined microprocessors
   4.1 Related work
   4.2 Pipelined processor model
   4.3 Pipeframe model
   4.4 Test generation algorithm
   4.5 DPTRACE: path selection in datapath
   4.6 DPRELAX: value selection in datapath
   4.7 CTRLJUST: CTRL line value justification
   4.8 Experiments
   4.9 Conclusions
5. Conclusions
   5.1 Contributions
   5.2 Future work
APPENDICES
BIBLIOGRAPHY

LIST OF TABLES

1.1 Phases in the design of a microprocessor
2.1 Design projects for which error data was collected
2.2 Design files written for the X86 project
2.3 Error distribution in X86
2.4 Design error distributions [%]
3.1 Characteristics of two modules of the DLX microprocessor implementation
3.2 Coverage of synthetic and actual errors: T0-T13
3.3 Actual design errors and the corresponding dominated modeled errors for DLX
3.4 Actual design errors and the corresponding dominated modeled errors for LC2
3.5 Comparison of practical design error models
4.1 Initial C- and O-values
4.2 Computation of controllability and observability measures for a node with incoming edges x1, ..., xm and outgoing edges y1, ..., yn
4.3 Model parameters of DLX design
4.4 High-level test generation for bus-SSL errors in DLX implementation
4.5 Gate-level test generation for standard SSL errors using HITEC
4.6 Comparison of high-level and gate-level test generation for DLX
A.1 Bridging functions Z(x, y)
B.1 Test generation and fault simulation of ISCAS'89 circuits using HITEC
B.2 Error simulation of ISCAS'89 circuits using CESIM

LIST OF APPENDICES

A. Relationship between CSSL1 errors and bridging faults
B. Conditional error simulation on ISCAS'89 benchmarks

CHAPTER 1

Introduction

Information technology is drastically changing our world. Economies are shifting from the industrial age of steel and cars to the information age of computer networks and ideas [Eco96]. In 1998, the gross domestic product (GDP) in the US due to computers, semiconductors, and electronics reached that of the automobile industry: 3.5% [Baum98]. The microprocessor, which saw its birth in 1971 with the Intel 4004, plays a central role in this information revolution. Indeed, Intel now dominates the hardware side of the computer industry. It is the largest (by dollar volume) chipmaker in the world.
Microprocessors have become commodity products and are essential parts, not just of computers, but also of cars, cellular phones, personal digital assistants, and video games, to name just a few. Continuous technological improvements have led to integrated circuits becoming smaller, faster, and cheaper. Simultaneously, people have continued to find new uses for microchips and computers. The Internet, with its explosive growth, is just the latest example.

The markets for microprocessors demand low cost and high performance, and are changing rapidly. To meet these demands, microprocessor design houses have to overcome great technological challenges. Circuits need to be designed that operate at very high speed. New design methodologies to deal with signal integrity and timing issues are becoming necessary now that the minimum feature size has dropped well into the deep sub-micron regime. The number of transistors integrated on a single chip is doubling every 18 months. This has led to an enormous growth in functional complexity. Furthermore, the pressure put on the design cycle by time-to-market is enormous. Functional verification, which is concerned with ensuring that the design implements the intended functional behavior, is considered one of microprocessor design's major challenges.

The purpose of the microarchitectural simulator, also referred to as a timer or a performance simulator, is to compute an accurate estimate of the execution time, in number of clock cycles, of a given benchmark on a given concrete microarchitecture [Burg97, John91]. As the number of design points is very large and the size of the benchmarks is very large as well, simulation speed is important. A popular simulation technique is trace-driven simulation [John91]. A concrete microarchitecture is defined by a set of parameters that further specify the features used, such as the number and latency of integer execution units, or the size of a cache. Also associated with each microarchitectural feature is its cost, in terms of hardware but also in terms of design and verification effort. To accurately estimate the cost, some floorplanning and circuit design studies may be required. The design problem is that of finding the microarchitecture that meets all the constraints and provides the best trade-off between performance and cost. Verification at this stage of the design is mainly concerned with the correctness of the microarchitectural simulator; the systematic study of this problem, also referred to as performance verification, has only recently gained interest [Bose98].

Design implementation

In the third phase the microarchitecture is implemented. The microarchitectural specification is typically not formal, and consists of textual descriptions, block diagrams, and parameter values. The first step in this phase is to design the first register-transfer level (RTL) description. Standardized hardware description languages (HDLs), such as Verilog [IEEE96] and VHDL [IEEE88], or C/C++ are commonly used to describe the RTL model. This activity is sometimes referred to as control design. Logic designers refine this behavioral RTL description to a structural RTL description. Circuit designers generate transistor-level netlists that implement the structural RTL. Layout designers generate layouts for the transistor-level schematics. The refinement from behavioral RTL to layout differs significantly among industrial design methodologies.
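As a small illustration of the preceding point that C/C++ is sometimes used alongside Verilog and VHDL for the first behavioral RTL model, the sketch below (my own toy example, not taken from this dissertation) models a 16-bit accumulator with a stall input as a C++ class evaluated once per clock cycle; a structural RTL refinement would later map the addition onto gates and registers.

    #include <cstdint>
    #include <cstdio>

    // Toy cycle-based behavioral RTL model (illustrative only):
    // a 16-bit accumulator that adds its input each cycle unless stalled.
    struct Accumulator {
        uint16_t acc = 0;                                   // architectural state (register)

        // Evaluate one clock cycle: compute next state from current inputs.
        void clock(uint16_t in, bool stall) {
            if (!stall)
                acc = static_cast<uint16_t>(acc + in);      // behavioral assignment
        }
    };

    int main() {
        Accumulator dut;
        const uint16_t stimulus[] = {1, 2, 3, 4};
        for (uint16_t v : stimulus)
            dut.clock(v, /*stall=*/false);
        std::printf("acc = %u\n", dut.acc);                 // expect 10
        return 0;
    }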
There are numerous verification problems at this stage. They include functional (logic) verification, timing verification, electrical verification, and physical design rule verification. In functional verification we distinguish between design verification, which is concerned with the functional correctness of the initial RTL description, and implementation verification, which is concerned with checking the functional equivalence between two versions of the implementation. The latter includes comparing the RTL view against the structural view, the structural RTL view against the transistor-level schematics, and the schematics against the layout.

Post tape-out

Functional design verification continues after the design has been taped out. Once first silicon is available, extensive functional testing can begin. The main difference with pre-silicon design verification is the vast increase in test throughput. Diagnosing the root cause of a discrepancy can be difficult. Functional test suites are complemented with test suites aimed at measuring performance. Other activities in this phase include electrical characterization, and physical fault testing and diagnosis. Software activities that critically depend on the hardware, such as machine-specific compiler tuning and post-hardware measurements, can start as soon as the silicon has been found sufficiently functional. This may involve engineering workarounds for remaining functional bugs, or fabrication of corrected versions of the chip.

1.2 Functional verification

Functional verification has gained a lot of interest in recent years, as evidenced by the surge in publications detailing industrial experience with the topic:

• AMD's K5: [Gana96]
• DEC's Alpha: [Kant96, Tayl98]
• HP's PA-RISC: [Alex96, Bass95, Mang97, Weir97]
• IBM's S/390: [Shep97, Wile97]
• Metaflow's Sparc: [Pope96]
• Motorola/IBM's PowerPC: [Mall95, Mona96]
• SGS Thomson's Chameleon: [Casa96]

Although concrete methodologies differ from company to company, some common themes can be identified, as we shall see.

Functional implementation verification. Functional implementation verification refers to checking the functional equivalence between two versions of the design. The two versions may be representations of the design at different levels of abstraction (below the microarchitectural level), such as behavioral and structural RTL. Alternatively, they may be different versions of the design at the same level of abstraction; for example, one may be a retimed version of the other. Efficient methods have been developed to formally check the Boolean equivalence of large combinational circuits [Kuel97], and have recently become available commercially [Goer95]. For library-based logic design methodologies these tools are readily applicable. In custom methodologies, significant effort may be involved in automatically extracting an accurate gate-level view from the transistor-level netlist. Nevertheless, these methods are becoming a favorable alternative to regression verification using a (switch-level) simulator. Combinational equivalence checkers can be used to check the equivalence of two sequential circuits if there is a one-to-one mapping between the state registers. If such a mapping does not exist, the complexity of the problem greatly increases. For the special case of circuits whose state registers differ because of retiming, more specialized methods have been developed [Bisc97, Hosk95].
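To make the idea of combinational equivalence checking concrete, here is a minimal miter-style sketch, assuming two tiny carry-out netlists written directly as C++ functions. This is an illustration of mine, far simpler than the BDD- and SAT-based engines cited above, which reason symbolically instead of enumerating inputs.

    #include <cstdio>

    // Two candidate implementations of the same combinational function
    // (a full-adder carry-out), e.g. a reference netlist and a restructured one.
    static bool carry_ref(bool a, bool b, bool cin) {
        return (a && b) || (cin && (a ^ b));                // reference netlist
    }
    static bool carry_impl(bool a, bool b, bool cin) {
        return (a && b) || (a && cin) || (b && cin);        // revised netlist
    }

    int main() {
        bool equivalent = true;
        // Miter: XOR the corresponding outputs and look for any input
        // assignment that drives the miter output to 1 (a counterexample).
        for (int v = 0; v < 8; ++v) {
            bool a = v & 1, b = v & 2, cin = v & 4;
            if (carry_ref(a, b, cin) ^ carry_impl(a, b, cin)) {
                std::printf("counterexample: a=%d b=%d cin=%d\n", a, b, cin);
                equivalent = false;
            }
        }
        std::printf(equivalent ? "equivalent\n" : "NOT equivalent\n");
        return 0;
    }

Sequential equivalence would additionally require the one-to-one register mapping discussed above.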
Functional design verification. Functional design verification (FDV) is concerned with verifying the functional correctness of the first RTL model of the design. For microprocessor design, correctness means conformance to the instruction set architecture (ISA) and to some (incomplete) microarchitectural specification. Functional design verification undergoes several phases as the project progresses. The complete effort can be divided into a pre-silicon and a post-silicon phase. The former is further divided into unit verification and system verification. During unit verification a portion of the design is verified in isolation. For larger units, another (lower) level of integration (designer macros) may be appropriate.

To take advantage of this simulation capability, automated test generation methods are needed.

Pseudo-random test generation. In the area of physical fault testing, it was recognized long ago that test patterns can easily be generated randomly. The method requires very little effort, but its efficiency and effectiveness are rather low compared to algorithmic approaches. Furthermore, the effectiveness and efficiency decrease with increasing design size [Abra90]. Nevertheless, random test pattern generation can be very useful to complement manual test generation in the absence of better methods. Sophisticated pseudo-random exercisers have been used very successfully to validate complex microprocessor designs [Ahar91, Kant96]. Taylor et al. [Tayl98] report that 79% of the functional bugs in the DEC Alpha 21264 microprocessor were found by pseudo-random tests. To achieve this high effectiveness, such pseudo-random test generators incorporate knowledge about the instruction set architecture and the concrete microarchitecture. They typically have many parameters that allow the verification engineer to bias test generation towards "interesting behaviors," such as corner cases. A strength of random test generators is that they can generate test cases that verification engineers might never have thought of. On the other hand, most random test generators have so-called holes: areas in the space of valid test sequences that are covered only with an extremely low probability, or are not covered at all. Random tests are also more difficult to debug than hand-written tests.

Template-based test generation. Certain aspects of designs are difficult to cover with biased random tests. This may be the case if the space of valid input sequences is highly constrained. Specialized tools have been developed to help automate the generation of such focused tests. One example is the code generator described in [Chan94, Chan95]. The user provides so-called symbolic instruction graphs that compactly describe a set of instruction sequences exhibiting certain properties. The tool generates actual instruction sequences that satisfy all the properties by using constraint-solving techniques. One property might be that the third instruction is a load-class instruction which causes a cache miss, and that the fourth instruction is an arithmetic instruction using the result produced by the load instruction. Free variables, such as the register that serves as the load target, are chosen in a biased random manner. The tool incorporates knowledge about the microarchitecture in the form of implicit constraints and biasing functions. A similar tool is described in [Hoss96].
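The generator of [Chan94, Chan95] is not publicly documented in detail, so the sketch below is only a hedged illustration of the general idea: two positions of a short instruction sequence are constrained (a load in the third slot and an arithmetic consumer of the load result in the fourth), while the remaining opcodes and registers are filled in by biased random choices. The mnemonics, register count, and bias weights are all invented for the example.

    #include <cstdio>
    #include <random>
    #include <string>
    #include <vector>

    // Hypothetical toy ISA mnemonics used only for this illustration.
    static const char* kAluOps[] = {"ADD", "SUB", "XOR"};

    int main() {
        std::mt19937 rng(1234);
        std::uniform_int_distribution<int> reg(0, 7);
        std::discrete_distribution<int> alu({5, 3, 2});      // biased opcode choice

        const int kLen = 6, kLoadPos = 2, kUsePos = 3;       // template constraints
        std::vector<std::string> prog(kLen);

        int loadTarget = reg(rng);                           // free variable: load target register
        for (int i = 0; i < kLen; ++i) {
            char buf[64];
            if (i == kLoadPos) {
                // Constraint: the third instruction is a load (intended to miss the cache).
                std::snprintf(buf, sizeof buf, "LW   r%d, 0(r%d)", loadTarget, reg(rng));
            } else if (i == kUsePos) {
                // Constraint: the fourth instruction consumes the loaded value.
                std::snprintf(buf, sizeof buf, "%-4s r%d, r%d, r%d",
                              kAluOps[alu(rng)], reg(rng), loadTarget, reg(rng));
            } else {
                // Unconstrained filler instruction, biased toward ADD.
                std::snprintf(buf, sizeof buf, "%-4s r%d, r%d, r%d",
                              kAluOps[alu(rng)], reg(rng), reg(rng), reg(rng));
            }
            prog[i] = buf;
        }
        for (const auto& insn : prog) std::printf("%s\n", insn.c_str());
        return 0;
    }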
Operating system (OS) and application code. Other sources of verification tests are operating system code and application code. However, booting an OS requires on the order of ten billion cycles [Kuma97]; therefore the use of this type of verification has only recently become feasible, through hardware emulation [Gana96, Bass95, Kuma97]. Demonstrating that the design correctly boots several OSs is a great confidence builder. Furthermore, for architectures that are not very well documented, such as the Intel x86 architecture [Wolf98a], successfully running application software with the OS in place is a common practice [Gana96]. In spite of the fact that the x86 architecture dominates the industry, there are some subtle features [X86] which are not officially documented and can cause compatibility problems. Ultimately, application and OS software are the yardsticks for compatibility. In-circuit emulation is one step closer towards real system operation. The emulator is hooked up to a (modified) system board and hence receives real external events from other devices on the system bus.

Coverage-directed test generation. Pseudo-random test generators are typically deployed in conjunction with extensive coverage measurements. Coverage is a measure of the completeness of a test suite for a design. A discussion of prevalent coverage metrics is given in Section 1.5. Coverage data is analyzed to identify regions of the behavior that are not (well) covered. Usually, verification engineers manually tune the pseudo-random test generators, or write a focused test to cover the verification hole.

Error-oriented test generation. A different approach is to use synthetic design error models to guide test generation. This exploits the similarity between hardware design verification and physical fault testing. For example, Al-Asaad and Hayes [AA95] define a class of design error models for gate-level combinational circuits. They describe how each of these errors can be mapped onto single-stuck line (SSL) faults that can be targeted with standard automated test pattern generation (ATPG) tools. This provides a method to generate tests with a provably high coverage for certain classes of modeled errors. A second method in this spirit stems from mutation testing, which is an error-oriented structural approach to software testing. Mutation testing will be discussed in greater detail in Section 1.7. Recently, Al Hayek and Robach [AH96] have successfully applied mutation testing to hardware design verification in the case of small VHDL modules.

1.4 Checking the outcome of a simulation

A nontrivial task in simulation-based FDV is to determine the outcome of simulating a verification test, i.e., did the verification test detect an error?

Manual inspection. Manual inspection of the simulation output is still a commonly used method, especially in the early stages of the verification effort. The engineer, typically the designer, inspects the simulation output through an interface similar to that of a logic analyzer. This method is a very flexible way of tracking down an error to its source. The interface allows the designer to explore the entire design; every signal in the circuit can be examined. Once the outcome of a simulation run has been validated, it can be stored together with the test for later use (regression testing). Although manual inspection is error-prone and impractical for large test sets, it is still necessary to diagnose the root cause of discrepancies detected by the methods discussed below.

Self-checking tests. A first approach to automated correctness checking is to make the tests self-checking. The tests start by setting up the system's initial state. This is followed by the bulk of the verification test. At the end, part of the system's final state is compared to a precomputed final state included with the test. If the test was generated manually, it is not uncommon that the test writer computes the expected final state himself. For larger tests, and for tests that were generated with tool assistance or even completely automatically, the final state is usually computed by running the test through a suitable high-level simulation model of the system, such as an interpreter for the ISA. The outcome of a self-checking test is basically pass or fail. In case of failure, the test needs to be simulated again, this time with full visibility. Tracking down the error (error diagnosis) can be very tedious and time-consuming.
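As a hedged, toy-scale sketch of the self-checking structure just described (not the actual infrastructure used in this dissertation), the code below runs one short program through a reference ISA interpreter to precompute the expected final state, runs it through a stand-in for the implementation model, and reports pass or fail by comparing the final register state. The three-instruction ISA and both models are invented for the illustration.

    #include <array>
    #include <cstdio>

    // Toy ISA: {op, dst, src} with op 0 = LI (load immediate src into dst),
    // op 1 = ADD (dst += r[src]), op 2 = NEG (dst = -r[src]).
    struct Insn { int op, dst, src; };
    using RegFile = std::array<int, 4>;

    static RegFile run_reference(const Insn* p, int n) {     // ISA-level interpreter
        RegFile r{};
        for (int i = 0; i < n; ++i)
            switch (p[i].op) {
                case 0: r[p[i].dst] = p[i].src; break;
                case 1: r[p[i].dst] += r[p[i].src]; break;
                case 2: r[p[i].dst] = -r[p[i].src]; break;
            }
        return r;
    }

    // Stand-in for the detailed implementation model; a real flow would invoke
    // the RTL simulator here and read back the architectural registers.
    static RegFile run_implementation(const Insn* p, int n) {
        return run_reference(p, n);
    }

    int main() {
        const Insn test[] = {{0, 1, 5}, {0, 2, 7}, {1, 1, 2}, {2, 3, 1}};
        RegFile expected = run_reference(test, 4);           // precomputed final state
        RegFile observed = run_implementation(test, 4);
        std::printf(expected == observed ? "PASS\n" : "FAIL\n");
        return 0;
    }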
Another complication of the approach is the problem of error masking.

Analysis of bug detection data

Upton collected design error data from the Aurora GaAs microprocessor designs at the University of Michigan [Upto94, Upto97]. He observed error rates of one design error per every 100 to 200 lines of Verilog code. This figure has been confirmed by industrial sources [Bent97]. Upton also analyzed the cumulative number of detected bugs over time and suggested that the bug detection process can be modeled as a function exponentially tapering off in time. Malka and Ziv apply techniques from software reliability engineering to the statistical analysis of bug detection data from two industrial microprocessor design projects [Malk98]. They use trend analysis to gauge the effect of the introduction of a new test generation technique on reliability growth. Modeling of the bug discovery process is used to make short-term predictions, such as the mean time to the next failure, and long-term predictions, such as when a certain level of reliability can be expected. They conclude that statistical analysis of bug detection data can provide very relevant information for determining tape-out dates.

The use of quality criteria based on analysis of bug detection data is widespread. However, very little data of this type has been published [Mona96]. One reason might be that this data is highly dependent on the design and verification methodology, the nature of the design, and the designers themselves. Also, bug detection data is only meaningful to the extent that a detailed verification plan has been carefully designed and implemented, and that continuous efforts have been made to improve and extend techniques to exercise the design. Biased-random test generators can generate new tests indefinitely, but these tests tend to lose their effectiveness over time. Analysis of coverage provides another means to assess the functional quality of a design.

Analysis of coverage

Coverage is a measure of the completeness of a test suite for a design. Moundanos, Abraham and Hoskote [Moun98] give an idealized definition of coverage as the ratio of the exercised behaviors over the total number of specified behaviors. A behavior can be modeled as an execution trace of the design. Unfortunately, attempting to exercise all possible execution paths is an intractable problem. Practical coverage metrics are needed to expose shortcomings of test suites and to spur further directed test generation. They are also needed to complement the methods discussed above for evaluating the state of completion of the verification effort.
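Before turning to specific coverage metrics, the "exponentially tapering" bug-discovery behavior noted above can be written down explicitly. One common functional form borrowed from software reliability engineering (an assumption on my part; neither Upton nor Malka and Ziv is quoted here as using exactly this model) is the Goel-Okumoto curve for the expected cumulative number of bugs found by time t:

    \mu(t) = a\,\bigl(1 - e^{-bt}\bigr), \qquad \frac{d\mu}{dt} = a\,b\,e^{-bt}

where a is the total number of latent bugs and b a detection-rate parameter; the decaying rate term matches the observation that biased-random tests lose effectiveness as the project progresses.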
Code coverage metrics from software testing

Software design also poses the problem of measuring the effectiveness of testing [Beiz90]. Classical structural metrics such as statement, branch, and path coverage also apply to hardware design, as designs are usually represented in hardware description languages today. It is well known that many design errors may still go undetected even though complete statement and branch coverage has been achieved. Full path coverage is an impractical goal, as the number of paths can be exponential. An advantage of these metrics is that their computation imposes only a small overhead on logic simulation. A typical flow for code coverage measurement is as follows. First, the original HDL description is instrumented. The instrumented code is then simulated for the given test suite using a standard logic simulator augmented with library functions provided by the coverage tool vendor. Part of the simulation outcome is coverage data that can be examined using a coverage analysis tool.

OCCOM

Although code coverage metrics such as statement coverage can be computed efficiently, they suffer from not taking observability into account. A verification test that activates a particular statement, but fails to propagate the effect of executing that statement to a part of the machine state that is truly observable (those signals that are also part of the specification), cannot be considered to have "covered" that statement. To address this shortcoming, Devadas et al. [Deva96] propose a code coverage metric based on tag propagation, which was later refined in [Fall98a] and is called OCCOM, for observability-based code coverage metric. Errors are associated with assignment statements in the code. The effect of an error is represented by a "tag" that can propagate through the circuit according to a set of rules similar to the D-calculus [Abra90]. The metric measures the fraction of injected tags that have been propagated to the observable state. The major extension of [Fall98a] over the earlier work in [Deva96] is an efficient method for computing OCCOM coverage using a standard logic simulator. The efficiency of computation is closely related to the definition of the tag propagation rules. The propagation rules are defined so that, in essence, the erroneous machine stays on the same execution path as the error-free machine. Experimental results on small examples show a modest overhead factor of 1.5-4 over logic simulation, which is a much smaller overhead than that incurred by fault simulation.

FSM-based metrics

Microprocessor designs typically have a natural partition: datapaths and controllers. Controllers have been found to be particularly prone to design errors [Ho96a]. An appropriate model for small controllers is that of a finite-state machine (FSM). Coverage can then be measured as the fraction of states or state transitions visited by a test sequence.
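As a hedged sketch of what measuring this looks like for one small controller (not a tool from the dissertation), the code below simulates an invented three-state FSM against a short stimulus sequence, records every (state, next state) pair that occurs, and reports the fraction of the machine's transitions that were exercised.

    #include <cstdio>
    #include <set>
    #include <utility>
    #include <vector>

    // Hypothetical 3-state controller: 0 = IDLE, 1 = BUSY, 2 = FLUSH.
    static int next_state(int s, bool req, bool flush) {
        if (flush) return 2;
        if (s == 0) return req ? 1 : 0;
        if (s == 1) return req ? 1 : 0;
        return 0;                                    // FLUSH always returns to IDLE
    }

    int main() {
        const int kTransitions = 8;                  // distinct (state, next state) pairs of the FSM above
        std::set<std::pair<int, int>> visited;       // transitions exercised so far

        int s = 0;
        const std::vector<std::pair<bool, bool>> stim = {
            {true, false}, {true, false}, {false, false}, {false, true}, {false, false}};
        for (auto [req, flush] : stim) {
            int ns = next_state(s, req, flush);
            visited.insert({s, ns});                 // record the exercised transition
            s = ns;
        }
        std::printf("FSM transition coverage: %zu / %d\n", visited.size(), kTransitions);
        return 0;
    }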
FSM transition coverage is not a meaningful metric for complete microprocessors, which can easily contain thousands of state registers. Even if it were possible to compute the set of reachable states, any coverage measurement with respect to the complete state graph would be negligibly small. However, most of the state registers are part of the datapath. Ho [Ho95] worked on the verification of the protocol processor in the FLASH project [Kusk94]. Even after abstracting the datapath, he was still faced with the state explosion problem. Control is usually distributed and consists of a number of interacting smaller FSMs. Ho proposed an incremental strategy in which coverage with respect to the individual state machines is attempted first. This type of coverage measurement is referred to as FSM coverage in industry [Hoss96, Kant96, Nels97, Paln94, SP92], and is also supported by EDA vendors [Clar98]. Next, larger composite state machines can be considered. To reduce the size of that state graph further, Ho defined an equivalence relation on states: all states that apply the same control signals to the datapath are considered equivalent. Provided that the machine has been partitioned in such a way that the datapath does not store any control state, this is a very reasonable assumption.

Combinational vs. sequential circuits. Test generation for sequential circuits is a much harder problem than test generation for combinational circuits [Micz86, Chen96, Marc96]. Although several commercially available test generators are able to generate very high quality tests for very large combinational circuits, test generation for sequential circuits the size of a modern microprocessor is well beyond the reach of any current automatic test pattern generation (ATPG) system. However, design-for-testability (DFT) techniques [Abra90] can greatly reduce the complexity. In full-scan design, every register is replaced by a scan register and the registers are linked in a chain, thereby making every register observable and controllable. This effectively reduces the test generation problem to one for combinational circuits. Unfortunately, these DFT techniques do not apply to design verification, since typically there is no one-to-one correspondence between the state elements of the design implementation and the reference model (specification).

Fault/error models. A third difference between the two areas is that physical fault testing has a proven and widely accepted logical fault model, the single-stuck line (SSL) model. The SSL model combines simplicity with the property that it forces each line in the circuit to be exercised. A large body of research has been based on this model. Design verification, as yet, does not have such a fault model. The success of the SSL model provides a motivation to develop error models for design verification, which can potentially benefit from the work on SSL faults.

Hierarchy. Physical fault testing and FDV also differ in the role hierarchy plays. In physical fault testing the goal is always complete (SSL) fault coverage at the gate level. Hierarchical test generation approaches have been proposed that carry the promise of being able to handle larger designs than purely gate-level methods. Typically, test sets are precomputed for gate-level descriptions of individual modules. System tests are then derived at the high-level representation; these apply the test stimuli to the module under test and propagate the error effects to the system's primary outputs [Murr90, Murr92]. FDV can be done in a bottom-up fashion. First, the units constituting the design are verified; verification tests are applied to each unit in isolation during this phase. Next, the complete design is verified, shifting the focus towards the interaction of the units.

1.7 Related area: software testing

Since the introduction of HDLs to mainstream design methodologies in the 1980s, hardware design has started to resemble software design.
HDLs, such as Verilog and VHDL, take after general-purpose programming languages, such as C/C++ and Ada. Complex mechanisms, such as dynamic memory allocation, recursion, arbitrary user-defined data types, and pointers, are readily supported by general-purpose programming languages and are commonly used in software design. In hardware designs, however, these mechanisms need to be implemented explicitly by the designers so that they readily map onto hardware. Functional verification of software is therefore substantially more complex than hardware verification.

The task of software testing is to ensure the reliability of software. A large number of methodologies and techniques have been proposed; an overview can be found in [Beiz90]. However, most of these techniques are not supported by tools that automate the testing process. This is in contrast to most areas of hardware testing and verification. One explanation is that software tends to be more complex and more diverse. Testing methods that are both practical and effective tend to rely on expert knowledge about the software under test that is very difficult to automate. Another explanation is that most testing methods crucially depend on a specification. Written specifications are now considered a cornerstone of any software development project [Post96b], but there was a time when the software community stubbornly tried to avoid taking time to record a description of how software was supposed to behave. Programmers obviously need specifications to write code, but these specifications should also be written down to facilitate making changes and repairs later. Testers need written specifications to determine whether an observed behavior conforms to the intended behavior. Informal specifications, plain-English descriptions of the requirements, have the benefit of being easy to read. Unfortunately, they are a major obstacle to the automation of software testing. Recently, formal languages that are still readable, such as the Semantic Transfer Language (STL) that appears in [IEEE94], have been proposed to capture specifications. Such formal specifications can automatically be checked not only for syntax problems, but also for semantic inconsistencies. Formal specifications are essential to systematic test case generation and functional coverage measurement.

Beizer [Beiz90] distinguishes two major approaches to software testing: functional and structural. Functional methods are specification-directed and view the implementation as a black box. Structural methods are driven by the implementation. In the remainder of this section we discuss some of the few techniques that 1) are general, i.e., not specific to a particular application domain, and 2) have substantial support for automation.

Control flowgraph path testing. Path testing methods are based on the use of the control flowgraph of the program. In this graph, nodes represent branching points or junction points in the program, and arcs represent branch-free code with a single entry and exit. Path testing is the oldest of all structural test techniques; Beizer references work at IBM from 1964. Tests are targeted at bugs that make the program take a different path than intended. Completeness of test sets is measured in terms of their statement, branch, and path coverage. Statement coverage requires that all statements in the program be executed at least once. Branch coverage requires that each alternative at each branch in the program be exercised at least once. Branch coverage implies statement coverage. Path coverage requires that all control-flow paths through the program be exercised; this is in general not practical to achieve, as the number of paths can be exponential. Full statement and branch coverage are common targets during unit testing. Additional paths are selected by other test methods, such as dataflow testing or logic-based testing. Coverage measurement is widely supported by software development tool vendors [Paxs98, SR].
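The gap between statement and branch coverage is easy to miss, so here is a small illustration of my own (not an example from the text): for the function below, a single test with a negative argument executes every statement, yet the false (fall-through) alternative of the branch is never exercised, so statement coverage is complete while branch coverage is not.

    #include <cassert>

    // Straight-line statements plus one 'if' with no 'else'.
    static int clamp_positive(int x) {
        int y = x;
        if (y < 0)
            y = 0;          // executed only when the branch is taken
        return y;
    }

    int main() {
        // Test set T1 = { x = -3 }: every statement above executes once, so
        // statement coverage is 100%, but the branch's fall-through alternative
        // is never exercised, so branch coverage is only 50%.
        assert(clamp_positive(-3) == 0);

        // Adding x = 7 exercises the fall-through path as well, giving full
        // branch coverage (and hence full statement coverage).
        assert(clamp_positive(7) == 7);
        return 0;
    }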
Mutation testing. Mutation testing is an error-oriented structural approach to software testing introduced by DeMillo et al. in [DeMi78]. Mutation testing considers programs, termed mutants, that differ from the given program by only simple errors, such as replacing '<' by '≤' in one conditional expression. The task of the tester is to construct tests that distinguish the mutants from the given program. Mutation testing provides a metric, mutation coverage, to grade test sets. King and Offutt described in [King91] a system to automatically generate such tests using constraint-solving techniques. Mutation testing is predicated on two hypotheses. The competent programmer hypothesis assumes that programmers write code that is very close to correct code.

Figure 1.2: Deployment of the proposed verification system (unverified design, assisted verification, error database, design error models, test generator, implementation simulator, specification simulator, diagnose and debug, unknown actual errors, verified design).

A discrepancy between the two simulation outcomes indicates an error, either in the implementation or in the specification. The figure also shows that, throughout the verification process, actual errors are recorded. This information can be used to tune the error models.

Chapter 2 examines design error data. Published design error data lack the detail needed to derive structural error models. We therefore devised a systematic method to collect error data. We present and analyze error data we collected from design projects at the University of Michigan.

Chapter 3 develops synthetic error models based on the empirical data. We identify requirements that error models must satisfy to be useful for design verification. We show how well each of the proposed error models meets these requirements.

Chapter 4 considers the problem of generating verification tests for synthetic errors in microprocessors. As we will see, this problem is not unlike test generation for SSL faults in a very large sequential circuit. To cope with this complexity, we consider a limited, but important, class of pipelined microprocessors, and develop a test generation method specific to these designs. To this end, we introduce a model that captures high-level information about the structure of pipelined microprocessors. We then develop a high-level test generation method that exploits this high-level knowledge. We describe experiments to evaluate the effectiveness of this algorithm.

Chapter 5 summarizes our research contributions and presents some directions for future research.

CHAPTER 2

Design error data

Our design verification approach uses design error models to direct test generation. Good design error models should result in test sets that detect many actual design errors. To construct such design error models, a good understanding of the nature, frequency, and severity of actual design errors is required.
Despite the abundance of design errors in large-scale projects, very little data has been published on these errors. It is common practice in industry to record design errors, but this information is considered proprietary and, perhaps, embarrassing, so it rarely appears in public. These considerations led us to collect error data from design projects at the university. Section 2.1 presents published error data from industry. Section 2.2 describes our method to systematically collect design errors. Section 2.3 presents the error data we collected. Section 2.4 offers some lessons learned. A summary and a discussion of our results are given in Section 2.5.

2.1 Published error data

Although design errors that make their way into final products are common, manufacturers have not always been forthcoming about them. This has changed since MIPS began to publish their bug list, beginning with [MIP94]; the Pentium bug [Beiz95] also influenced this change. To give a feel for these errors, we present a few examples of design errors that appeared in major commercial microprocessors. The errata list for the MIPS R4000PC and R4000SC microprocessors (revisions prior to revision 3.0) [MIP94] documents 55 bugs. Many of these require a rare combination of events before they become visible. The following is a representative bug: if an instruction sequence which contains a load causing a data cache miss is followed by a jump, and the jump instruction is the last instruction on the page and, further, the delay slot of the jump ...

Figure 2.2: Bug report example. The questionnaire asks the designer to mark the applicable entries (marked here with [X]):
  MOTIVATION: bug correction [X]; design modification; design continuation; performance optimization; synthesis simplification; documentation.
  BUG DETECTED BY: inspection; compilation; simulation [X]; synthesis.
  BUG CLASSIFICATION (identify the primary source of the error; if in doubt, check all categories that apply):
    combinational logic [X]: wrong signal source; missing input(s) [X]; unconnected (floating) input(s); unconnected (floating) output(s); conflicting outputs; wrong gate/module type; missing instance of gate/module.
    sequential logic: extra latch/flipflop; missing latch/flipflop; extra state; missing state; wrong next state; other finite state machine error.
    statement: if statement; case statement; always statement.
    declaration: port list of module declaration.
    expression (RHS of assignment): missing term/factor; extra term/factor; missing inversion; extra inversion; wrong operator; wrong constant; completely wrong.
    buses: wrong bus width; wrong bit order.
    verilog syntax error; conceptual error; new category (describe below).
  BUG DESCRIPTION: Forgot to select NOP in case of stall.

The operation of our error collection method within the design cycle is illustrated in Figure 2.3. From the raw revision management data, we identified the design modifications made to fix each error by computing the differences between successive revisions. The analysis of the design error data led to a preliminary classification of design errors. This classification was used in our first major design error collection effort, which took place in the fall term of 1996. Analysis of that design error data led us to revise our classification; the result is shown in Figure 2.2. The categories are not completely disjoint, so designers were asked to check all applicable categories.

2.3 Collected error data
Design projects. Design error data was collected from both class design projects and research projects at the University of Michigan. All of the designs were described in Verilog [IEEE96]. Table 2.1 lists these projects. LC2 concerns the design of the Little Computer 2 (LC-2) [Post96a], which is a small microprocessor used for teaching purposes at the University of Michigan. The design of both a behavioral and a synthesizable register-transfer-level model was carried out by Hussain Al-Asaad [AA98] in the summer of 1997. DLX1, DLX2, and DLX3 concern design projects that were undertaken as part of the senior/first-year-graduate-level computer architecture course (EECS470) in the fall of 1996. Students designed a pipelined implementation of the DLX [Henn90] microprocessor at the structural level. X86 concerns an EECS470 design project carried out in the fall of 1997. Students designed a pipelined implementation of a subset of the Intel x86 architecture [Int89]. FPU concerns the design of a floating-point unit for the PUMA processor [Brow96], which is a PowerPC microprocessor implemented in a complementary GaAs process technology, and was undertaken as part of the graduate-level VLSI design class (EECS627). Both a purely behavioral and a mixed synthesizable behavioral/structural model were designed. FXU concerns the design of the fixed-point unit of the PUMA processor. James Dundas and Todd Basso wrote the synthesizable behavioral description in the fall of 1996.

Figure 2.3: Error collection system (design input, simulate design, detect bug, fill out questionnaire, correct bug, CVS revision database).

Table 2.3: Error distribution in X86

  Error category                     Frequency
  Wrong signal source                32.8%
  Missing instance of gate/module    14.8%
  Missing input(s)                   11.5%
  Wrong gate/module type              9.8%
  Unconnected (floating) input(s)     8.2%
  Missing latch/flipflop              6.6%
  Conceptual error                    4.9%
  Wrong next state                    3.3%
  Other finite state machine error    1.6%
  Extra term/factor                   1.6%
  Extra inversion                     1.6%
  Wrong bit order                     1.6%
  Other                               1.6%

Figure 2.5: Project evolution: code size [lines] and lines touched over time.

Most of the design description is in place by day 21, and integration testing can start. Figure 2.6 shows the number of revisions over the duration of the project. The number of revisions logged on any day is broken up into revisions that are due to bug corrections and those due to other reasons. Ideally, there is a one-to-one correspondence between uncovered design errors and revisions motivated by error correction; hence the bar for the number of revisions logged due to error corrections also gives the total number of bugs corrected during the corresponding day. It can be seen that most of the bugs were discovered and corrected in the second half of the project.

Figure 2.6: Revisions motivated by bug correction and other revisions over time.

Figure 2.7 plots the time at which each error was corrected versus the number of lines of code that were touched to correct the error. The vertical coordinate is an indication of the structural complexity of the error.
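As a hedged sketch of how such a lines-touched count can be extracted from the revision history (the dissertation's own scripts are not shown in this excerpt), the fragment below diffs two successive revisions of a design file with the standard Unix diff utility and counts the added and removed lines; the file names are placeholders.

    #include <cstdio>
    #include <cstring>
    #include <string>

    // Count lines added or removed between two revisions of a design file,
    // using the ordinary "diff -u" utility available on Unix-like systems.
    static int lines_touched(const std::string& oldRev, const std::string& newRev) {
        std::string cmd = "diff -u " + oldRev + " " + newRev;
        FILE* p = popen(cmd.c_str(), "r");
        if (!p) return -1;
        int count = 0;
        char line[4096];
        while (std::fgets(line, sizeof line, p)) {
            if (std::strncmp(line, "+++", 3) == 0 || std::strncmp(line, "---", 3) == 0)
                continue;                             // unified-diff file headers
            if (line[0] == '+' || line[0] == '-')
                ++count;                              // added or removed line
        }
        pclose(p);
        return count;
    }

    int main() {
        // Placeholder file names; in the study these would be two successive
        // revisions of the same module checked out of the CVS repository.
        std::printf("lines touched: %d\n", lines_touched("alu_r1.v", "alu_r2.v"));
        return 0;
    }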
Although easy to compute, this metric is far from ideal. It does not distinguish between lines of code that have merely been reformatted and lines that have truly been changed. More accurate measures, such as the minimum number of 'atomic' modifications needed to remove the error from the control/data flow graph of the erroneous circuit, would be more appropriate but are also much harder to compute. For about half of the errors, fewer than ten lines of code were involved, and only four errors resulted in modifications to the design involving more than 100 lines of code.

Figure 2.7: Design errors: time to discovery [days] vs. error size [lines].

We further characterize these design errors based on purely structural properties. We define the size of an error as the order of the polynomial that computes the number of similar errors as a function of the size of the circuit. For example, single inversion errors and single-stuck errors are both of size 1, because there are O(N) such errors in a circuit with N lines. Signal source errors are of size 2, as there are O(N^2) such errors. We noted that some actual errors consist of multiple instances of the same type of error; an example is an inversion error on a port connection of a module instance that is repeated for all instances of the module. We define the multiplicity of an actual error as the number of identical and repeated instances of a simpler error that constitute the actual error. Figure 2.8 plots the frequency of design errors when binned according to size and multiplicity. We observe that design errors of higher multiplicity are rare. Design errors with multiplicity 1 and sizes 1 or 2 account for more than half of all design errors. Only about 12% of the errors are very complex, as indicated by a size of 10 or greater.

Table 2.4: Design error distributions [%] (per-project values are listed in the order LC2, DLX1, DLX2, DLX3, X86, FPU, FXU, with blank entries omitted; the final value in each row is the average over all seven projects)

  Wrong signal source: 27.3, 31.4, 25.7, 46.2, 32.8, 23.5, 25.7; average 30.4
  Missing instance: 28.6, 20.0, 23.1, 14.8, 5.9, 15.9; average 15.5
  Missing inversion: 8.6, 47.1, 16.8; average 10.3
  New category: 9.1, 8.6, 7.7, 6.6, 4.4; average 6.9
  Unconnected input(s): 8.6, 14.3, 7.7, 8.2, 5.9, 0.9; average 6.5
  Missing input(s): 9.1, 8.6, 5.7, 7.7, 11.5; average 6.1
  Wrong gate/module type: 13.6, 11.4, 9.8; average 5.0
  Missing term/factor: 9.1, 2.9, 5.7, 4.4; average 3.2
  Always statement: 9.1, 2.9, 2.7; average 2.1
  Wrong constant: 9.1, 5.3; average 2.1
  Missing latch/flipflop: 4.9, 5.9, 0.9; average 1.7
  Wrong bus width: 4.5, 7.4; average 1.7
  Missing state: 9.1; average 1.3
  Conflicting outputs: 7.7; average 1.1
  Wrong constant: 2.9, 4.4; average 1.0
  Conceptual error: 2.9, 3.3, 0.9; average 1.0
  Signal declaration: 5.7; average 0.8
  Extra term/factor: 2.9, 1.6, 0.9; average 0.8
  Wrong operator: 4.4; average 0.6
  Gate or module input: 2.9; average 0.4
  Case statement: 2.7; average 0.4
  Other FSM error: 1.6; average 0.2
  Extra inversion: 1.6; average 0.2
  Wrong bit order: 1.6; average 0.2
  Wrong next state: 1.6; average 0.2
  Latch: 0.9; average 0.1
  If statement: 0.9; average 0.1
  Expression completely wrong: 0.9; average 0.1

2.5 Discussion

Table 2.4 shows the error distributions for all projects. Also listed is the average error frequency over all projects. We observe that signal source errors are the most common type of error, at 30%. Errors involving missing logic (missing instance, missing input, missing term, missing state) are the second most common group, at 26%. Also notable is that apparently very simple errors, such as extra/missing inversions and unconnected inputs, account for 17% of all errors.
More detailed analysis of these simple errors shows that some of them were detected late in the project. This indicates that the behavior of some parts of the design is not properly exercised, since these simple errors do not require any activation conditions. Among the errors marked as new category are timing errors and errors that required very elaborate corrections.

The limitations of our error collection effort are as follows. Student designers have limited experience. Class projects are short in duration, and the verification effort in these projects is modest. Consequently, our data may contain a disproportionately small number of hard-to-detect errors, compared to data from industrial design projects. This concern also applies to the data from the projects related to PUMA, but to a lesser extent.

CHAPTER 3

Design error models

Manufacturing testing uses logical fault models to guide test generation. Logical fault models represent the effect of physical faults on the behavior of the system, and free us from having to deal with the plethora of physical fault types directly. Similarly, we use design error models to drive verification test generation. This chapter presents and studies design error models that are based on the error data described in the previous chapter. Section 3.1 presents four requirements that error models should satisfy to be useful for design verification. Section 3.2 proposes three classes of error models: basic, extended, and conditional error models. The following sections analyze how well these error models meet the requirements: Section 3.3 analyzes the number of error instances defined by each model (requirement 4). Section 3.4 analyzes test generation with the error models (requirement 2). Section 3.5 analyzes error simulation and presents an efficient error simulation technique for conditional error models called CESIM (requirement 3). Section 3.6 presents an analytical coverage evaluation of one conditional error model (requirement 1). Section 3.7 presents an experimental coverage evaluation using error simulation (requirement 1). Another experimental study with the same goal but a different approach is detailed in Section 3.8 (requirement 1). Our findings are summarized in Section 3.9.

3.1 Error model requirements

A design error model defines a class of modeled errors, also referred to as synthetic errors, for a given design. In design verification, design error models play the role of fault models in physical fault testing. The different terminology, error vs. fault, is to underscore the distinction between the two settings.

• Label count error (LCE): This error corresponds to incorrectly adding or removing the labels of a case statement.
• Expression structure error (ESE): This includes various deviations from the correct expression, such as extra/missing terms, extra/missing inversions, a wrong operator, and a wrong constant.
• Next state error (NSE): This error corresponds to an incorrect next-state function in a finite-state machine (FSM).

Although targeting these extended error models can increase coverage of actual errors, we have found them too complex for practical use in manual or automated test generation. Analysis of the more difficult actual errors revealed that these errors are often composed of multiple basic errors, and that the component basic errors interact in such a way that a test to detect the actual error must be much more specific than a test to detect any of the component basic errors.
An effective error model should necessitate the generation of these more specific tests without resorting to direct modeling of the composite errors. The complexity of the new error models should be comparable to that of the basic error models, and the (unavoidable) increase in the number of error instances should be controlled to allow trade-offs between test generation effort and verification confidence. These requirements can be combined by augmenting the basic error models with a condition.

Conditional error models. A conditional error (C, E) consists of a condition C and a basic error E; its interpretation is that E is only active when C is satisfied. In general, C is a predicate over the signals in the circuit during some time period. To limit the number of error instances, we restrict C to a conjunction of terms (y_i = w_i), where y_i is a signal in the circuit that is not in the transitive combinational fanout of the basic error (1), and w_i is a constant of the same signal width as y_i whose value is either all-0s or all-1s. The number of terms (condition variables) appearing in C is said to be the order of (C, E). Specifically, we consider the following conditional error (CE) types:

• conditional single-stuck line error of order n (CSSLn)
• conditional bus order error of order n (CBOEn)
• conditional bus source error of order n (CBSEn)

(1) The requirement that condition signals not be part of the transitive combinational fanout of the basic error eliminates problems of combinational feedback, and thus ensures that all conditional errors are well defined. This requirement also facilitates efficient error simulation, as we will see in Section 3.5.

Figure 3.1: CSSL1 error (x = 1, y/0): a) error-free design; b) erroneous design with x ≠ 1; c) erroneous design with x = 1.

Figure 3.1 gives an example of a CSSL1 error, (x = 1, y/0). If the condition does not hold (x ≠ 1), the erroneous circuit operates as the error-free circuit. If the condition holds (x = 1), line y is stuck at 0.

3.3 Number of error instances defined by error model

The fourth requirement on error models states that the number of modeled errors should be sufficiently small. Consider a design with N signals; we denote by #M the number of error instances defined by error model M on the design.

Basic error models

• #SSL = O(N)
• #MSE = O(N)
• #BOE = O(N)
• #BSE = O(N^2)
• #BDE = O(B.D^2), where B is the number of tristate buses and D is the rms (2) number of drivers on a bus.

(2) D = ((1/B) sum_i D_i^2)^(1/2), where D_i is the number of drivers on bus i, so that sum_i D_i^2 = B.D^2.

Extended error models

• #BCE = O(N^2)
• #MCE = O(N^2)
• LCE. Consider a case statement with an n-bit signal making the selection, and let L be the number of labels (branches). The simplest type of missing label error would occur if the statements selected by the missing label are identical to those of another (not missing) label. Hence, there are (2^n - L) missing label errors and L extra label errors. For a circuit with C case statements, we have #LCE = O(C.2^n) LCEs, with n appropriately averaged.
• ESE. Consider an expression with L literals and E subexpressions. If we restrict missing and extra term errors to a single literal, and further require that the missing literal appears elsewhere in the expression, then there are O(L) extra term errors and O(E.L) missing term errors.
• NSE. Consider an FSM with S states and E distinct state transitions. The simplest next state error would occur if, for one of the state transitions, the next state were wrong. There are O(S.E) errors of this type.
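As a concrete cross-check of two of these counts (a back-of-the-envelope derivation of my own, assuming a flat netlist with N signals and assuming that a bus source error replaces the source of one signal by some other signal in the design):

    \#\mathrm{SSL} = 2N \quad \text{(each of the } N \text{ lines stuck-at-0 or stuck-at-1)}, \qquad
    \#\mathrm{BSE} \le N(N-1) = O(N^2)

since each of the N destination signals can be misconnected to at most N - 1 alternative sources; width-compatibility restrictions only shrink this set.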
There are O(S·E) errors of this type.

Conditional error models
The number of instances defined by a conditional error model (C, E) is given by the product of the number of basic errors and the number of conditions:
• #CSSLn = O(2^(n+1)·N^(n+1))
• #CBOEn = O(2^n·N^(n+1))
• #CBSEn = O(2^n·N^(n+2))

For n = 0, a conditional error (C, E) reduces to the basic error E from which it is derived. Higher-order conditional errors enable the generation of more specific tests, but lead to a greater test generation cost due to the larger number of error instances. For example, the CSSL1 model defines a number of instances quadratic in the size of the circuit. Although the total set of signals considered for each term in the condition can possibly be reduced, CSSLn errors where n > 1 are probably not practical.

Augmenting targeted test generation with error simulation can reduce overall run times. Test generators typically target one error at a time. A targeted test may detect errors other than just the targeted error. These errors can be identified by an error simulator so that they do not need to be considered by the test generator any more. A stand-alone use of error simulation is the computation of the design error coverage of a given test suite. This is useful in regression testing, where one might be interested in selecting a subset of a given set of test sequences that provides coverage of design errors similar to that of the complete test set. Error simulation can also reveal areas of the design that are not sufficiently tested by a given test suite, and hence spur further targeted test generation.

Error simulation needs to be efficient. Not only the length of the test suites, which is extremely large for pseudo-random tests, but also the nature of the error models and the number of error instances to be considered affect the size of the task. It is clear that better methods are required than simple serial error simulation, which simulates the erroneous designs for the complete test suite one by one. In the remainder of this section we first discuss related work; we then present our method for error simulation with conditional errors, and conclude with experimental results.

Related work
Representative approaches to fault simulation for sequential circuits [Abra90, Nier91a] are parallel, concurrent, deductive, and differential fault simulation. Parallel fault simulation takes advantage of the word-level parallelism of the computer used. On a 32-bit computer, 32 faulty machines can be simulated in parallel. This method lacks the ability to drop errors. The other methods are motivated by the observation that as long as a fault is not detected, the good and faulty circuits differ in only a fraction of the signals present. For this purpose, such methods process the complete set of faulty machines one vector at a time. Both concurrent and deductive fault simulation compute the node values of a faulty machine for the current vector based on the good circuit's node values for the current vector and the faulty machine's node values for the previous vector. A drawback of both methods is their high memory requirement. Differential fault simulation, a variant of concurrent fault simulation, addresses the memory problem, but suffers from the inability to drop detected faults. Niermann, Cheng and Patel [Nier90, Nier91a] described a fault simulator, called PROOFS, that combines ideas of concurrent, differential and parallel fault simulation.
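As a concrete illustration of the word-level parallelism mentioned above, and exploited again by PROOFS and by CESIM below, the following Python sketch (written for this text, not taken from any of the cited simulators) packs a good machine and 31 faulty machines into the bits of a single word, so that one bitwise operation evaluates a gate for all of them.

    # Sketch of word-level parallel fault simulation: bit 0 holds the good
    # machine, bits 1..31 hold 31 faulty machines.
    WIDTH = 32
    ALL = (1 << WIDTH) - 1

    def replicate(bit):
        """Broadcast a scalar 0/1 input value to all 32 machines."""
        return ALL if bit else 0

    def inject_stuck_at(word, machine, value):
        """Force this line to 'value' in the given faulty machine only."""
        mask = 1 << machine
        return (word | mask) if value else (word & ~mask & ALL)

    # Example: z = a AND b, with 'a' stuck-at-0 in machine 1 and 'b'
    # stuck-at-1 in machine 2, for the input vector a=1, b=0.
    a = inject_stuck_at(replicate(1), machine=1, value=0)
    b = inject_stuck_at(replicate(0), machine=2, value=1)
    z = a & b                      # one machine word evaluates all copies

    good = z & 1                   # good-machine response (bit 0)
    diff = z ^ replicate(good)     # machines whose output differs
    detected = [m for m in range(1, WIDTH) if diff & (1 << m)]
    print(detected)                # -> [2]; the stuck-at-1 on b is detected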
As our error simulation method for conditional errors derives from PROOFS, we briefly describe its main features, referring to Figure 3.2. Given a gate-level sequential circuit, a fault list, and a test vector sequence, PROOFS maintains two sets of signal values: one for the good machine, and one for a faulty machine. For each undetected fault, PROOFS also stores the difference in present state between the good machine and the corresponding faulty machine. The outermost loop of PROOFS processes one test vector at a time. First, the good machine is simulated for the current vector. Next, faults that are active for the current test vector are identified. A fault is considered active if one or both of the following two conditions holds: 1) the present state of the faulty machine is different from that of the good machine; 2) the fault is excited by the current vector, and the faulty line is sensitized through the first two levels of logic. Checking condition 1 is straightforward since we have saved the faulty circuit's state while processing the previous vector. If condition 1 does not hold, that is, if the faulty circuit's present state is identical to that of the good circuit, checking condition 2 is inexpensive too, as it is very localized and requires only the good circuit's values. Faults that are not active for the current vector have the property that they are not detected by the current vector and that the next states of the corresponding faulty machines are identical to the next state of the good machine. Consequently, there is no need to simulate these faulty machines for the current vector. Each active fault is processed as follows. First, the fault is injected into the faulty circuit. The event list is initialized to reflect the fault injection and the present-state lines whose values differ in the good and the faulty machine. The event-driven simulation of the faulty machine in PROOFS typically has a very low event activity, as in concurrent fault simulation. If the fault is detected by the current vector, it is dropped. Otherwise, the difference between the next state of the faulty machine and that of the good machine is saved.

    PROOFS (circuit, faultList, testVectorSequence)
    1.  while (vectors left) {
    1.1   read next vector
    1.2   simulate good circuit
    1.3   determine which faults are active
    1.4   for each active fault {
    1.4.1   inject fault
    1.4.2   add faulty node events
    1.4.3   simulate faulty circuit
    1.4.4   drop detected faults
    1.4.5   store faulty next state
    1.4.6   remove fault
    1.5   }
    2.  }
    Figure 3.2: PROOFS' error simulation algorithm

The basic algorithm, as discussed above, can be augmented to take advantage of the word-level parallelism available on the computer executing the fault simulator. On a 32-bit machine, up to 32 iterations of the simulation step 1.4.3 of loop 1.4 can be executed in parallel. This is done by assigning the values of different faulty machines to different bit positions within a word. The other steps of loop 1.4 still have to be executed serially. A more detailed description of one implementation is given in [Nier90, Nier91a].

Extension to conditional errors
It is straightforward to modify PROOFS to handle conditional errors, such as CSSL1. For a given circuit and a given test sequence, the average run time per error for CSSL1 error simulation is very close to that for SSL error simulation. As the number of CSSL1 errors is quadratic in the size of the circuit, however, the cost of error simulation for CSSL1 may be prohibitively large.
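A direct way to handle a conditional error (C, E) in this framework is to fold the condition into the activation check. The sketch below is illustrative only — the data structures and helper names are invented, not code from PROOFS or CESIM — but it suggests why the cost per error stays close to that of plain SSL simulation.

    # Illustrative activation check for a conditional error (C, E); the
    # classes and helpers are hypothetical, not from PROOFS or CESIM.
    def condition_holds(condition, good_values):
        # condition: list of (signal, constant) terms, evaluated on the good
        # machine (condition signals lie outside the error's fanout).
        return all(good_values[sig] == const for sig, const in condition)

    def is_active(error, good_values, state_differs):
        # Condition 1: the faulty machine's present state already differs.
        if state_differs(error):
            return True
        # Condition 2: the condition holds and the basic error is excited
        # and locally sensitized -- a cheap check using good values only.
        return (condition_holds(error.condition, good_values)
                and error.basic_error_excited(good_values))

    # The work per error is essentially unchanged; what grows is the number
    # of errors itself, which is quadratic in circuit size for CSSL1.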
To address this, we develop an error simulation algorithm for conditional errors, called CESIM, that exploits the close relationship among CSSL1 errors derived from the same CSSL0 error. Its key features are the processing of sets of conditional errors.

For each set S_b of PSBE-equivalent errors in A, we inject the basic error corresponding to S_b, apply the erroneous present state corresponding to S_b, and simulate the erroneous circuit. If any outputs differ from those in the good circuit, all errors in S_b are dropped. Otherwise, we record the erroneous next state for S_b, and insert S_b into nextU.

Example. Figure 3.4 illustrates CESIM. Consider sets of conditional errors derived from three basic errors e_1, e_2, and e_3. Initially, the corresponding erroneous machines are all in the same present state, namely the unknown state s_u. The initial PS-partition has a single class, which is further partitioned with respect to PSBE-equivalence. First, the error-free machine is simulated for the first vector; the next state is s_0. This allows us to separate those conditional errors that are active (shaded in the figure) for the first vector from those that are not. For the dormant errors no further work is required: none of them is detected, and the next state of the corresponding erroneous machines is s_0. For each PSBE class that contains active conditional errors, the corresponding basic error is injected and the erroneous circuit is simulated for the current vector. In the example, none of these errors is detected, and the next states s_1 and s_2 are distinct. This process is repeated for the next vector. In the example, the active errors in one of the PSBE classes are detected by the second vector; all other errors remain undetected. Note that there is a one-to-one correspondence between a single transition in the state transition diagram in Figure 3.4 and a circuit simulation step in the algorithm (steps 2.2, 2.3.2, or 2.3.8.3 in Figure 3.3).

Figure 3.4: Example execution of CESIM for a 3-vector test sequence: a) PS- and PSBE-partitions of errors, b) corresponding state transitions

Analysis. CESIM minimizes the overall computational cost by exploiting the PS- and PSBE-equivalence of conditional errors. We now analyze the algorithm's complexity. The two major components of the cost of one iteration of the top-level loop (step 2) are the simulation cost of steps 2.2, 2.3.2, and 2.3.8.3, and the partition cost of step 2.3.3. The partition cost is proportional to the number of conditional errors for which we have to check activation condition 1, which is typically a small fraction of the total number of conditional errors. The event-driven simulator is called as many times as there are PSBE partition classes on all sets A; this is a fraction of the number of PSBE partition classes of U. In summary, the cost of one iteration has one component with complexity sublinear in the size of the error list (partition cost), and a second component proportional to the size of the circuit and the product of the number of basic errors and the number of distinct states (simulation cost). In our experiments, we observed that 90% of the execution time is due to partitioning, while only 10% is due to simulation.
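To make the partition bookkeeping concrete, the sketch below shows one way the PS- and PSBE-partitions could be kept in hash maps. It is a simplified illustration written for this text; the names and structure are not taken from the thesis implementation.

    # Illustrative sketch of CESIM's two partitions over the undetected
    # conditional errors (simplified; names are not from the thesis).
    from collections import defaultdict

    class Partitions:
        def __init__(self, conditional_errors, unknown_state="s_u"):
            # Every faulty machine starts in the same unknown state, so all
            # errors are PS-equivalent, and errors sharing a basic error are
            # PSBE-equivalent.
            self.ps = defaultdict(set)     # present state -> set of errors
            self.psbe = defaultdict(set)   # (present state, basic error) -> set
            for e in conditional_errors:
                self.ps[unknown_state].add(e)
                self.psbe[(unknown_state, e.basic)].add(e)

        def record_next_state(self, errors, state, basic):
            """Insert a whole PSBE-equivalent set under its new state."""
            self.ps[state] |= errors
            self.psbe[(state, basic)] |= errors

        def drop_detected(self, errors):
            """Detected errors are removed from both partitions."""
            for key in list(self.psbe):
                self.psbe[key] -= errors
            for key in list(self.ps):
                self.ps[key] -= errors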
The algorithm requires maintaining both partitions (PS and PSBE) on the set of undetected errors. All partitions are implemented using hash tables, which allow constant-time insertion of error sets. Initially, all errors are undetected and the corresponding erroneous machines all start from an unknown present state. Hence all errors are PS-equivalent initially, and all errors derived from the same basic error are PSBE-equivalent. In the partition step (2.3.3), the number of error sets (PSBE-equivalence classes) may increase. The worst case occurs when 1) neither A nor D is empty, 2) neither of them is detected, and 3) the next states generated in steps 2.3.8.3 and 2.3.2 are all distinct. In this case, the number of error sets can double in a single iteration of step 2, leading to growth that is exponential in the number of vectors. However, the total number of PSBE-equivalence classes can never exceed the total number of individual conditional errors we started with. Our experimental results (see below) show that, in practice, the number of error sets remains fairly constant.

Optimizations. As in PROOFS, we take advantage of the word-level parallelism of the host computer; hence multiple iterations of 2.3.2 and of 2.3.8.3 are executed in parallel. To further reduce execution time, static dominators [Nier91a] could be used to identify redundant errors during a preprocessing step.

Experiments. We used the ISCAS'89 benchmarks to evaluate the performance of CESIM. First, we generated test sequences for SSL faults using HITEC [Nier91b, Nier91a]. We then error-simulated these test sequences using CESIM for CSSL0 and CSSL1 errors. The error list for CSSL0 errors is identical to the collapsed SSL fault list. The CSSL1 error list was constructed as follows. For each CSSL0 error, we considered a maximum of 500 lines to derive CSSL1 errors. The smaller circuits have fewer than 500 lines, so every line in the circuit is considered as a condition line. This leads to a maximum of 1000 CSSL1 errors per CSSL0 error. However, some CSSL1 errors are rejected because their condition is part of the transitive fanout of the error site. A more detailed description of the experiments is given in Appendix B.

Figure 3.7: Error simulation on s1238 with CSSL0 and CSSL1: number of distinct states (curves: number of states during CSSL1 simulation and during CSSL0 simulation, plotted against the test vector applied)

Figure 3.7 shows the number of distinct states as a function of the number of test vectors applied. For CSSL0 error simulation, the number of states rapidly drops; after vector 300 there are at most five distinct states among the present states of the remaining undetected erroneous machines. For CSSL1 error simulation, we observe that the number of states hovers around 20 but never becomes larger than 35 (about twice the number of flip-flops in the circuit).

Figure 3.8 details the number of error sets occurring during the execution of CESIM. We show both the total number of error sets and the number of error sets in use. Both are normalized with respect to the total number of errors. The number of error sets in use is the number of PSBE-equivalence classes of the set of undetected errors U in loop 2.3 of Figure 3.3. The total number of error sets is the number of error sets in use plus the number of error sets detected by previous vectors (those error sets are dropped in steps 2.3.4.1 and 2.3.8.4.1 of Figure 3.3).
For CSSL0 simulation, the total number of error sets remains constant at the number of errors, whereas the number of error sets in use drops as coverage increases. For CSSL1 simulation, we observe that the total number of error sets increases steadily as coverage increases. However, the number of error sets in use remains fairly constant and hovers around the total number of basic errors, which is about 1000 times smaller than the total number of errors.

Figure 3.8: Number of error sets during error simulation on s1238 with CSSL0 and CSSL1 errors (curves: total no. of CSSL0 sets / no. of CSSL0 errors; no. of CSSL0 sets in use / no. of CSSL0 errors; total no. of CSSL1 sets / no. of CSSL1 errors; no. of CSSL1 sets in use / no. of CSSL1 errors; plotted against the test vector applied)

3.6 Analytical coverage evaluation of CSSL1

The first and foremost requirement for design error models is that complete test sets for the modeled errors should also provide very high coverage of actual design errors. In this section we analyze the detection of basic design errors by complete test sets for CSSL1 errors in gate-level circuits. Let D_0 be a gate-level circuit; construct D_1 by injecting a single error e_i into D_0, where e_i is an instance of error model M_i. Let T_0 and T_1 be test sets that provide complete coverage of all detectable CSSL0 and CSSL1 errors, respectively, in D_1. We analyze the coverage provided by test sets T_0 and T_1 with respect to the error models M_i proposed in [AA95]. In particular, we are interested in those error classes covered by T_1 but not by T_0.

Figure 3.9: Some basic error types [AA95] (erroneous and error-free versions of a design containing a gate substitution error, a missing gate error, a wrong input error, and a missing 2-input gate error)

We use the notation introduced in [AA95], and refer the reader to that paper for further details of the error models. Let y = G(x_1, ..., x_n) be a gate in the error-free circuit. A gate substitution error G/G' occurs if a gate G is erroneously replaced with a gate G' that has the same number of inputs but is of a different type. The set of all 2^n input vectors of an n-input gate is divided into disjoint subsets V_0, V_1, ..., V_n, where V_k contains all input vectors with exactly k 1s in their binary representation, 0 ≤ k ≤ n. The disjoint sets V_null, V_all, V_odd, and V_even are defined as follows:

    V_null = V_0        V_all = V_n
    V_odd  = ∪ { V_i : i odd, i ≠ n }
    V_even = ∪ { V_i : i even, i ≠ 0, i ≠ n }

The sets V_null, V_all, V_odd, and V_even are called the characterizing sets or C-sets of G. Consider the following sets of CSSL0 and CSSL1 errors in the erroneous circuit:

... sensitize z while z and x_i have opposite values. This is equivalent to detecting either of the CSSL1 errors (x_i = 0, z/0) or (x_i = 1, z/1).

Missing 2-input gate error. This error occurs if the error-free circuit contains a gate y = G(x_1, x_2) that is completely missing in the erroneous circuit and y = x_1. It can be shown that the error detection requirements for this error are equivalent to those of a CSSL1 error. For example, if G = AND, the corresponding CSSL1 error is (x_2 = 0, y/0). A complete test set for CSSL0 errors may fail to detect this error.

Conclusion. Complete test sets for CSSL1 errors also detect all wrong input errors, all missing input errors on gates that are not of type {XOR, XNOR}, and all missing 2-input gate errors; complete test sets for CSSL0 errors can fail to detect these errors. For the other error types, no increased coverage is guaranteed by complete test sets for CSSL1 errors.
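The characterizing sets used in this analysis are easy to compute mechanically. The following sketch, written for this text rather than taken from [AA95], enumerates the C-sets of an n-input gate and reports the C-sets on which two gate types disagree — the kind of bookkeeping on which detection conditions for gate substitution errors are phrased.

    # Sketch: C-sets of an n-input gate and the C-sets on which two gate
    # types differ (illustrative; not code from [AA95]).
    from itertools import product

    def c_sets(n):
        sets = {"V_null": [], "V_all": [], "V_odd": [], "V_even": []}
        for v in product((0, 1), repeat=n):
            k = sum(v)
            if k == 0:          sets["V_null"].append(v)
            elif k == n:        sets["V_all"].append(v)
            elif k % 2 == 1:    sets["V_odd"].append(v)
            else:               sets["V_even"].append(v)
        return sets

    GATES = {
        "AND": lambda v: int(all(v)),
        "OR":  lambda v: int(any(v)),
        "XOR": lambda v: sum(v) % 2,
    }

    def differing_c_sets(g1, g2, n):
        """C-sets containing at least one vector on which g1 and g2 disagree."""
        return [name for name, vecs in c_sets(n).items()
                if any(GATES[g1](v) != GATES[g2](v) for v in vecs)]

    print(differing_c_sets("AND", "OR", 3))   # ['V_odd', 'V_even']
    print(differing_c_sets("OR", "XOR", 3))   # ['V_even']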
3.7 Coverage evaluation using error simulation

From the analysis presented in the previous section, one could conclude that the class of design errors that is guaranteed to be detected by a complete test set for CSSL1 errors is very limited. In fact, most actual design errors do not fall in this class. However, as our analytical study tries to establish properties that hold for any design, its results are conservative. For a concrete design, complete test sets for CSSL1 errors may detect many more design errors than those reported in the previous section. To compare the effectiveness of two design error models, we could take an unverified design and generate test sets that are complete with respect to the two error models. The test set that uncovers more (and harder) design errors in a fixed amount of time is more effective. However, for such a comparison to be practical, fast and efficient high-level test generation tools for our error models appear to be necessary. Although this type of test generation is feasible, it has yet to be automated. Instead we consider test sets that were not specifically targeted, and compute their coverage of modeled design errors as well as of actual design errors.

In this section we present a set of experiments whose goal is to compare different design error models and investigate the relationship between coverage of modeled design errors and coverage of more complex actual errors. The test vehicle for this study is the well-known DLX microprocessor [Henn90]. The particular DLX version considered is a student-written design that implements 44 instructions, has a five-stage pipeline, and includes branch prediction logic. The design errors made by the student during the design process were systematically recorded. They were presented earlier in Chapter 2 (DLX1 in Table 2.1 and Table 2.4). Some characteristics of two of the modules of the design are shown in Table 3.1. Module top integrates the different pipeline stages and contains the forwarding logic. Module decode describes the decode stage of the pipeline. These modules are analyzed here because 75% of all actual errors were made within these two modules. A simplified block diagram of the design, indicating both modules, is shown in Figure 3.10.

    Table 3.1: Characteristics of two modules of the DLX microprocessor implementation

    Parameter                          Module 1: top    Module 2: decode
    No. of lines of code                      302              263
    No. of CSSL0 errors                       574              816
    No. of CSSL1 errors                   141,756          238,732
    No. of restricted CSSL0 errors            178               82
    No. of restricted CSSL1 errors         21,864           18,788
    No. of detectable actual errors             8               16

For these experiments, we modified the original design description to allow us to automatically inject synthetic errors into the design. The modifications do not cause a significant overhead during simulation and do not require recompilation of the simulator when a new error is injected. On the other hand, this approach requires a simulation run for each error considered. The error models considered in this study are the CSSL0 and CSSL1 models. Even for the moderately sized modules under consideration, the number of CSSL1 errors is very large; for example, there are 141,756 CSSL1 errors in top. Given our error simulation approach, the number of errors needs to be reduced to make the experiment practical.
A subset of the CSSL1 errors was selected by imposing the following constraints: 1) lines considered in the condition are restricted to signals of bit-width 1, and 2) lines considered as error sites are restricted to signals with bit-width > 1. CE's of this type are referred to as restricted CE's. This reduces the number of CSSL1 errors by about an order of magnitude. For example, there are 21,864 restricted CSSL1 errors in top.

Figure 3.10: Simplified schematic of DLX implementation showing modules decode and top

Error simulation for ...

... generation. The experiment evaluates the effectiveness of our verification methodology when applied to two student-designed microprocessors. A block diagram of the experimental set-up is shown in Figure 3.12. As design error models are used to guide test generation, the effectiveness of our design verification approach is closely related to the synthetic error models used. To evaluate our methodology, a circuit was chosen for which design errors were systematically recorded during its design. Let D_0 be the final, presumably correct, design. From the CVS revision database, the actual errors were extracted and converted so that they can be injected into the final design D_0. In the evaluation phase, the design was restored to an (artificial) erroneous state D_1 by injecting a single actual error into the final design D_0. This set-up approximates a realistic on-the-fly design verification scenario. The experiment answers the question: given D_1, can the proposed methodology produce a test that determines D_1 to be erroneous? This is achieved by examining the actual error in D_1 to determine if a modeled design error exists that is dominated by the actual error. Let D_2 be the design constructed by injecting the dominated modeled error into D_1, and let M be the error model that defines the dominated modeled error. Such a dominated modeled error has the property that any test that detects the modeled error in D_2 also detects the actual error in D_1. Consequently, if we were to generate a complete test set for every error defined on D_1 by error model M, D_1 would be found erroneous by that test set. Note that the concept of dominance in the context of design verification is slightly different than in physical fault testing. Unlike the case with the testing problem, we cannot remove the actual design error from D_1 before injecting the dominated modeled error. This distinction is important because generating a test for an error of omission, which is generally very hard, becomes easy if given D_0 instead of D_1.

Figure 3.12: Experiment to evaluate the proposed design verification methodology (two panels: the design-and-debug flow, in which the designer debugs and actual errors are collected into the error database, and the evaluation flow, in which a single actual error and a dominated modeled error are injected and simulated so as to expose them)

The erroneous design D_1 considered in this experiment is somewhat artificial. In reality the design evolves over time as bugs are introduced and eliminated. Only at the very end of the design process is the target circuit in a state where it differs from the final design D_0 in just a single design error. Prior to that time, the design may contain more than one design error. To the extent that the design errors are independent, it does not matter whether we consider a single error or multiple design errors at a time. Furthermore, our results are independent of the order in which one applies the generated test sequences.
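The dominance relation is easy to check exhaustively on a toy circuit. In the sketch below — constructed for this text, not taken from the thesis — D0 is the specification, D1 omits one term (an error of omission), and two candidate modeled errors are injected into D1; only the conditional one is dominated by the actual error.

    # Toy check of the dominance relation (illustrative, not thesis code).
    from itertools import product

    def D0(a, b, c): return (a & b) | c     # specification
    def D1(a, b, c): return a & b           # actual error: term 'c' omitted

    def D2_cond(a, b, c):                   # modeled error (c = 1, z/1) in D1
        return 1 if c == 1 else D1(a, b, c)

    def D2_ssl(a, b, c):                    # modeled error z/1 (no condition)
        return 1

    vectors = list(product((0, 1), repeat=3))
    detects_actual = {v for v in vectors if D1(*v) != D0(*v)}
    detects_cond   = {v for v in vectors if D2_cond(*v) != D1(*v)}
    detects_ssl    = {v for v in vectors if D2_ssl(*v) != D1(*v)}

    print(detects_cond <= detects_actual)   # True:  any test for (c=1, z/1)
                                            # in D2 also exposes the omission
    print(detects_ssl <= detects_actual)    # False: a test for plain z/1,
                                            # e.g. a=b=c=0, can miss it

This small case also previews the observation made in the conclusions below: conditional errors are particularly useful when the actual error involves missing logic.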
We implemented the preceding coverage-evaluation experiment for two small but representative designs: a simple microprocessor and a pipelined microprocessor. We present our results in the remainder of this section.

A pipelined microprocessor. Our first design case study considers the well-known DLX microprocessor [Henn90]. The particular DLX version considered is a student-written design that implements 44 instructions, has a five-stage pipeline and branch prediction logic, and consists of 1552 lines of structural Verilog code, excluding the models for library modules such as adders, register files, etc. The design errors committed by the student during the design process were systematically recorded using our error collection system. For each actual design error we painstakingly derived the requirements to detect it. Error detection was determined with respect to one of two reference models (specifications). The first reference model is an ISA model that is not cycle-accurate: only the changes made to the ISA-visible part of the machine state, that is, to the register file and memory, can be compared. The second reference model contains information about the microarchitecture of the implementation and gives a cycle-accurate view of the ISA-visible part of the machine state (including the program counter). We determined for each actual error whether it is detectable with respect to each reference model. Errors undetectable with respect to both reference models may arise for the following two reasons: (1) Designers sometimes make changes to don't-care features, and log them as errors. This happens when designers have a more detailed specification (design intent) in mind than what is actually specified. (2) Inaccuracies can occur when fixing an error requires multiple revisions.

We analyzed the detection requirements of each actual error and constructed a modeled error dominated by the actual error, wherever possible. One actual error involved multiple signal source errors, and is shown in Figure 3.13. Also shown are the truth tables for the immediately affected signals; differing entries are shaded. Error detection via fanout Y1 requires setting S1 = 1, S0 = 1, (X1 ≠ X2), and sensitizing Y1. However, the combination (S1 = 1, S0 = 1) is not achievable and thus error detection via Y1 is not possible. Detection ...

Figure 3.13: Example of an actual design error in our DLX implementation (multiplexers select among inputs X0, X1, X2; the truth tables of the immediately affected signals are reproduced below)

    Erroneous design D1                 Correct design D0
    S1,S0    Y1    Y2,Y3                S1,S0    Y1    Y2,Y3
    00       X0    X0                   00       X0    X0
    01       X1    X1                   01       X1    X0
    10       X2    X2                   10       X2    X2
    11       X2    X2                   11       X1    X2

    Table 3.4: Actual design errors and the corresponding dominated modeled errors for LC-2

                                      Actual errors                Corresponding dominated modeled errors
    Category               Total   Easily detected   Undetectable   SSL   BSE   CSSL1   Unknown
    Wrong signal source      4            0                0         2     2      0        0
    Expression error         4            0                0         2     0      1        1
    Wrong bus width          3            3                0         0     0      0        0
    Missing assignment       3            0                0         0     0      2        1
    Wrong constant           2            0                0         2     0      0        0
    Unused signal            2            0                2         0     0      0        0
    Wrong module             1            0                0         1     0      0        0
    Always statement         1            1                0         0     0      0        0
    Total                   20            4                2         7     2      3        2

We can infer from Table 3.4 that most errors are detected by tests for SSL errors or BSEs. About 75% of the actual errors in the LC-2 design can be detected after simulation with tests for SSL errors and BSEs. The coverage increases to 90% if tests for CSSL1 errors are added.
3.9 Conclusions

Unlike the case with other simulation-based design validation methodologies, we use design error models to direct test generation. We have identified four key requirements that error models should satisfy to be useful for design validation: 1) complete test sets for the modeled errors should also provide very high coverage of actual errors, 2) the error models should be amenable to automated test generation, 3) the error models should be amenable to error simulation, and 4) the number of modeled errors should be sufficiently small. Based on the error data presented in the previous chapter, we have proposed three classes of design error models: basic, extended and conditional design error models. We have analyzed how well each error model satisfies the four requirements. The extended error models were found too difficult for automated test generation, and have been discarded on that ground. Test generation for the other two classes of models was found to be similar to test generation for SSL errors. We have developed an error simulation algorithm for conditional errors called CESIM. Our experimental results show that CESIM outperforms a state-of-the-art fault simulation algorithm by a wide margin (a factor of 34 on average).

We conducted three studies to assess how well the error models meet requirement 1. An analytical study of CSSL1 errors shows that complete test sets for CSSL1 errors do provide higher coverage of common design errors in gate-level designs than test sets that are complete for SSL errors. A second study used error simulation, and compared the coverage of SSL, CSSL1 and actual errors on a microprocessor design. The correlation between coverage of SSL errors and coverage of actual errors was found to be very similar to that between coverage of CSSL1 errors and coverage of actual errors. A final study analyzed actual errors in microprocessor designs, and investigated whether our methodology can detect such errors. The results indicate that complete test sets for synthetic errors provide a very high coverage of actual errors (97% for one design and 90% for another). The results also show that the conditional error models are especially useful for detecting actual errors that involve missing logic, which are often difficult to detect using basic errors only.

Table 3.5 summarizes our findings. Each error model is graded with respect to the four requirements relative to the SSL model. The SSL model scores the highest on requirements 2 and 3, since standard ATPG tools use the SSL model. The scores of the other models reflect the effort required either to modify the design so that standard tools can be used, or to modify the tools to handle the new models.

    Table 3.5: Comparison of practical design error models*

                                Req. 1:    Req. 2:           Req. 3:            Req. 4:
                 Error model    Coverage   Test generation   Error simulation   No. of instances
                 SSL               +          +                 +                O(N)
    Basic        MSE               +          -                 -                O(N)
                 BOE               +          -                 -                O(N)
                 BSE               ++         -                 -                O(N²)
                 BDE               +          -                 -                O(B·D²)
                 CSSL1             +          -                 -                O(2²N²)
    Conditional  CBOE              +          -                 -                O(N²)
                 CSSL2             +++        -                 -                O(2³N³)
    * N is the number of signals in the circuit; D is the average number of drivers on a tristate bus.

Our methodology supports incremental design validation: first, generate tests for SSL errors; then generate tests for other basic error types such as MSE; finally, generate tests for conditional errors. Our studies suggest that the CSSL1 model is a good candidate to improve on the coverage provided by a complete test set for SSL errors. The CSSL1 model provides a natural extension of the SSL model; standard ATPG algorithms can easily be modified for CSSL1; and we have demonstrated efficient error simulation with CSSL1 errors. The number of CSSL1 errors is quadratic in the size of the circuit. Although the number of CSSL1 errors for a flattened design hierarchy is extremely large, the design hierarchy provides a natural means to reduce the number of CSSL1 errors.
If the stuck line and the condition line that constitute a CSSL1 error are restricted to signals belonging to the same hierarchical module, the number of CSSL1 errors to be targeted during test generation is typically small enough for practical use of CSSL1 errors.

... (ILA) model of the circuit. Kelsey et al. [Kels93] describe a test generation algorithm for sequential circuits that does not follow the iterative structure of the ILA. For a given fault, an estimate of the test sequence length is computed, and the circuit is unrolled over that many cycles. The PODEM algorithm [Chen96, Goel81] is applied to the resultant circuit, which is treated as a single combinational circuit. Because this approach only makes decisions on primary inputs and only propagates information forward, it can result in a more efficient search. On the other hand, the search process is performed on a much larger and deeper circuit than in conventional approaches, hence its efficiency depends critically on the backtracing heuristics used. Ghosh et al. [Ghos91] decompose the test generation problem into three subproblems: combinational test generation, fault-free state justification, and fault-free state differentiation. By performing state justification and differentiation in the fault-free machine, their algorithm can re-use a significant amount of computation.

High-level test generation
Lee and Patel describe a high-level test generation method for microprocessors in [Lee92a, Lee94]. They model a processor as an interconnection of high-level modules. During a preprocessing step they symbolically simulate each instruction of the instruction set to derive the control behaviors corresponding to each instruction. These control behaviors can be seen as 'configurations' of the processor (datapath) over a number of clock cycles (as many as the corresponding instruction takes to execute). The proposed test generation method has two phases: path selection and value selection. During path selection, a sequence of instructions is assembled so that a set of paths is sensitized to activate the targeted error and propagate its effect. These paths may span multiple clock cycles and may require multiple instructions. The task of computing the concrete values that need to be applied to the primary inputs is delegated to the value selection phase. This second problem can be formulated as a system of non-linear equations. The variables in this problem correspond to the signals in the datapath (in multiple timeframes). The equations correspond to the modules and interconnections of the datapath. They express the relationship between the input and output signals, as defined by the module's functionality. Lee and Patel propose a simple discrete relaxation method for value selection. The reason why such a simple method works well is that path selection tries to avoid selections that may lead to conflicts during value selection. A limitation of Lee and Patel's method is that it explicitly enumerates the control behaviors of the processor by considering every instruction in the ISA.
Such an enumeration is no longer possible for pipelined processors, as instructions do not execute in isolation.

Hansen and Hayes describe a high-level test generation algorithm, called SWIFT, in [Hans95b]. SWIFT can guarantee low-level fault coverage through the use of a functional fault model, described in [Hans95a]. SWIFT uses high-level information about the circuit in the form of a set of (multicycle) operations that the circuit can execute. Given a precomputed test for a module, SWIFT first constructs a partially-ordered set of operations needed to apply that test to the module and propagate the fault effects to the system outputs. It then proceeds with detailed low-level processing (scheduling). Although the results in [Hans95b] are very promising, it is not clear how to derive the needed high-level information automatically.

Iwashita et al. [Iwas94] describe a technique for generating instruction sequences to excite given "test cases", such as hazards, in pipelined processors. Test cases are mapped onto states of a reduced FSM model of the processor. The technique performs implicit enumeration of the reachable states to synthesize the desired test sequences. Some limitations are that the reduced FSM model is derived manually, and that no details are given on the effect of the abstraction on the types of test cases that can be handled.

Chandra et al. [Chan95] present a sophisticated code generator for architectural validation of microprocessors. The user provides symbolic instruction graphs together with a set of constraints; these compactly describe a set of instruction sequences that have certain properties. The system expands these templates into test sequences using constraint solvers, an architectural simulator, and biasing techniques. Similar work is discussed in [Hoss96]. As these techniques operate only on the microarchitectural specification of the design, they are not suitable for generating tests for structural errors in the implementation.

Formal verification
Bhagwati and Devadas [Bhag94] describe an automated method to verify pipelined processors with respect to their ISA specification. The method assumes that a mapping between input and output sequences of the implementation and the specification is given, and that the implementation can be approximated by a k-definite FSM, that is, an FSM that can only remember the last k inputs. The equivalence of the two machines is checked by symbolic simulation. The assumptions made about the implementation and the lack of abstraction limit the applicability of this approach.

Burch and Dill [Burc94] propose a method for microprocessor verification based on symbolic simulation and the use of a quantifier-free first-order logic with uninterpreted functions. The method requires manually generated abstract models of both the implementation and the specification in terms of uninterpreted functions. Symbolic simulation of the models is used to construct the next-state functions. The verification problem is turned into checking the equivalence of the next-state functions of the implementation and the specification.

Levitt and Olukotun [Levi97] develop a methodology for verifying the control logic of pipelined microprocessors. The datapath is modeled using uninterpreted functions. Verification is performed by iteratively merging the two deepest stages of the pipeline. After each step a check is made to see whether the newly obtained pipeline is still equivalent to the previous one. The equivalence is proven automatically using induction on the number of execution cycles.
To achieve this high degree of automation, the approach of [Levi97] uses high-level knowledge about the design, such as the design intent of a bypass.

Hybrid verification techniques
A class of hybrid verification techniques [Geis96, Gupt97, Ho95, Ho96b, Lewi96, Moun98] that combine simulation with formal verification has recently been proposed. These techniques construct a reduced FSM model of the implementation. Test sequences are then generated to achieve full coverage on the reduced FSM model. A non-trivial ...

Figure 4.1: Instruction interaction mechanisms: a) bypassing, b) squashing, c) stalling

... pipeline. They provide a means to characterize the control state of the pipeline in a much more compact way than by considering all the instructions in the pipeline simultaneously. Based on these considerations and on the analysis of actual designs, we have developed the model for pipelined processors shown in Figure 4.2, which exposes high-level knowledge that can be used during test generation. We assume that data-stationary control [Kogg77] is chosen as the implementation style of the controller, in which case control 'follows' the data through the pipeline, providing the control signals at each stage as needed. Such controller implementations mimic the pipeline structure of the datapath. The datapath and controller both exhibit pipeline structure and interact via status and control signals. The signals at each stage are classified as:

• primary: interfacing with the environment
• secondary: interfacing with the stage's pipeline registers
• tertiary: interfacing with another pipeline stage

The tertiary signals are precisely the signals needed to describe essential instruction interaction. Typical examples of tertiary signals in the controller are squash and stall; typical examples of tertiary signals in the datapath are bypasses. Using the model requires no more than the appropriate labeling of control signals, status signals, and pipe registers, along with appropriate high-level modeling of the datapath. The block labeled 'global combinational logic' generates the CTS's. By isolating this block, the number of tertiary signals can be minimized.

Figure 4.2: Pipelined microprocessor model (legend: Dxx: data signal; STS: status signal; Cxx: control signal; CTRL: control signal; xPI (xPO): primary input (output); xSI (xSO): secondary input (output); xTI (xTO): tertiary input (output); DPR: data pipe register; CPR: control pipe register; CTS: control tertiary signal)

Our test generation method attempts to decouple decisions concerning the interaction of instructions from those concerning only a single instruction. For example, a decision of the former type might be whether the current instruction needs to be stalled by the previous instruction. Such decisions allow us to defer deciding upon the particular opcode and operand registers of that previous instruction. This is in contrast to an approach where the search is performed in the flat product space of all instructions in the pipeline.
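Since using the model requires only labeling, the bookkeeping can be pictured as in the sketch below; the class names and the example signals are illustrative, not taken from the thesis.

    # Illustrative sketch of the signal labeling the model of Figure 4.2
    # asks for: every controller/datapath signal of a stage is tagged as
    # primary, secondary, or tertiary.
    from dataclasses import dataclass, field
    from enum import Enum

    class Kind(Enum):
        PRIMARY = "interfaces with the environment"
        SECONDARY = "interfaces with the stage's pipeline registers"
        TERTIARY = "interfaces with another pipeline stage"

    @dataclass
    class Signal:
        name: str
        kind: Kind

    @dataclass
    class Stage:
        name: str
        control: list = field(default_factory=list)   # controller signals
        data: list = field(default_factory=list)      # datapath signals

        def tertiary(self):
            """Signals capturing instruction interaction (stall, squash, bypass)."""
            return [s for s in self.control + self.data if s.kind is Kind.TERTIARY]

    # Example labeling for a hypothetical execute stage.
    ex = Stage("EX",
               control=[Signal("opcode_EX", Kind.SECONDARY),
                        Signal("stall_EX", Kind.TERTIARY),
                        Signal("squash_EX", Kind.TERTIARY)],
               data=[Signal("alu_result", Kind.SECONDARY),
                     Signal("bypass_from_MEM", Kind.TERTIARY)])
    print([s.name for s in ex.tertiary()])
    # -> ['stall_EX', 'squash_EX', 'bypass_from_MEM']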
In the next section we will show how the tertiary signals can be used to decouple decisions on instruction interaction from those that concern instructions in isolation.

4.3 Pipeframe model

Conventional test generation algorithms for sequential circuits use the ILA model and iteratively apply test generation techniques for combinational circuits in one timeframe. In this section we describe a different organizational model specific to pipelined processors. This pipeframe organizational model exploits high-level knowledge about pipeline structure that is captured with the processor model. The advantages of this approach are a reduction of the search space and the elimination of many conflicts.

Consider the application of a conventional test generation algorithm to a pipelined controller circuit without a datapath. Figure 4.3 shows a three-stage pipelined circuit. C1, C2 and C3 are combinational logic corresponding to the three pipe stages. The global combinational logic Cg sources all CPI's and all CSI's. In order not to clutter the figure, only the CPI's sourced by Cg are shown, and the CPO's produced by Cg have been omitted. The iterative logic array model for this circuit is shown in Figure 4.4a. If PODEM is used as the combinational test generation algorithm, the decision variables are the CPI's and the CSI's in each timeframe. The decision space to be searched during each iteration is that of the CSI's and CPI's. For the controller of pipelined microprocessors, the number of CSI's (state bits) is typically much larger than the number of CPI's. This is because the primary function of the controller is to decode the incoming instructions. Taking into account that the circuit is pipelined and performs several concurrent, and to a large extent independent, decodes, a different organization of the search, one that is directly in terms of the CPI's, is desirable. When the global control logic Cg is absent, it is easy to see how this can be accomplished. In this case, the iterative array model consists of unconnected (horizontal) slices spanning a number of timeframes equal to the number of pipe stages. These horizontal slices will be referred to as pipeframes. It can be seen that the size of the circuit to be considered is exactly the same as that in the conventional time-