A System and Language for Building System-Specific Static Analyses
Hallem, Chelf, Xie and Engler (Stanford University)
Summary by Abheek Anand

Motivation

There have been several recent techniques for detecting program bugs statically. PREfix performs symbolic evaluation of interprocedural execution paths, looking for errors such as uninitialized memory, buffer overflows and memory leaks. While it is very comprehensive, it supports only a fixed set of analyses, and thus finds only a fixed subset of bugs.

ESC/Java uses programmer-written annotations to drive a theorem prover that verifies program correctness. Recent work includes automatically inferring these annotations (Houdini), at the expense of exponentially higher performance overhead. However, techniques based on annotations require strenuous, invasive code modifications, which most programmers are unwilling to make. Moreover, the amount of annotation is proportional to the size of the codebase being tested, which limits the scalability of these analyses.

This paper describes a new technique for finding bugs in systems code. Metal is a language for specifying program properties, and an analysis engine, 'xgcc', is then used to test these properties against a given codebase.

Metal

Metal is designed to be easy to use and flexible enough to express a variety of properties. It provides the state machine (SM) as its fundamental abstraction for modeling these properties. These SMs (1) recognize source code relevant to a given rule by pattern matching, and (2) check that the matched actions satisfy the given constraints. For example, the following piece of Metal code describes an SM that checks for correct use of the 'kfree' function (detecting double frees and uses of a pointer after it has been freed).

    state decl any_pointer v;

    start: { kfree(v) } → v.freed;

    v.freed: { *v } → v.stop, { err("using %s after free", mc_identifier(v)); }
           | { kfree(v) } → v.stop, { err("double free of %s", mc_identifier(v)); }

The SM matches any pointer through the declared variable v, and attaches a new instance of the SM to each matched pointer. The number of instances grows and shrinks as the analysis follows the program's execution, since the SM keeps track of only those instances that are currently in use. Each SM has a global state (start) and local states (v.freed, v.stop) associated with each instance of the SM. Metal extensions can be augmented with arbitrary C code to perform more complex operations.

The expression 'v' in the above example is called a hole variable in the paper. Hole variables are typed expressions that are matched against the appropriate pattern in the actual source code.

XGCC

Recall that xgcc is the engine used to run these SMs over the actual source code. It applies extensions to the source code AST in a top-down manner, and in execution order. Each execution path is explored once, and interprocedural calls carry state from the caller to the callee and then back. Several optimization techniques are used to make this scalable.

1. Assuming that each extension is deterministic, the state at each block is cached for each incoming state. Two forms of caching are used:
   • Intraprocedural caching: Each block keeps a cache of the outgoing state for each incoming state (called a block summary); a cache hit means the block does not need to be analysed again.
   • Interprocedural caching: Each function call maintains a similar suffix summary; on a cache hit the function is not explored any further, and the final state is read from the cache. However, it