CMSC 631 - Program Analysis and
Understanding
Spring 2009
Analyzing and Understanding Software
• Formal systems and notations
  ■ Vocabulary for talking about programs
• Static analysis
  ■ Automatic reasoning about source code
• Programming language features
  ■ Affect programs and how we reason about them

Textbooks
• No required textbooks
• Two recommended texts
  ■ Pierce, Types and Programming Languages
  ■ Huth and Ryan, Logic in Computer Science
• Neither covers everything in the course
• On reserve in CS library

Forum
• Web forum on CS dept server
  ■ See class web page for link
• Can use the forum to communicate with others
  ■ Questions about assignments and projects
  ■ Thoughts of general interest

Expectations: Homework (30%)
• Written assignments
  ■ Short problem sets
• Programming assignments
  ■ Implement ideas from lecture
• Proofs in Coq
  ■ Solve problem sets using the Coq proof assistant
  ■ You will know immediately if you get it right!
• This is how you will learn things
  ■ Much more effective than listening to a lecture

Expectations: Project (35%)
• Class goal: Teach you how to do research
  ■ So you have to do research as part of the class
• Substantial research project (35% of grade)
  ■ Any topic vaguely related to the class is acceptable
    - Will post some suggestions for projects later on
    - May also be able to share the project with another class
  ■ Completed in groups of size 2 (possibly 1 or 3)
• This will consume the second half of the semester

Expectations: Project (cont’d)
• Deliverables
  ■ Project proposal (one page) + talk with me
  ■ Project write-up
    - A conference-style paper (5-15 pages, as appropriate)
  ■ Implementation, if any
  ■ In-class presentation
    - 15-20 minutes, depending on # of projects
• In the past, several 631 projects led to papers
  ■ Not required (!), but possible

Expectations: Exam (25%)
• Final exam
  ■ Based on course assignments
  ■ Take home exam
  ■ Will occur when we wrap up core course material (TBA)

Abstract Interpretation
• Rice’s Theorem: Any non-trivial semantic property of programs is undecidable
  ■ Uh-oh! We can’t do anything. So much for this course...
• Need to make some kind of approximation
  ■ Abstract the behavior of the program
  ■ ...and then analyze the abstraction
• Seminal papers: Cousot and Cousot, 1977, 1979

Example
e ::= n | e + e

α(n) = -   if n < 0
       0   if n = 0
       +   if n > 0

   +  |  -   0   +
  ----+------------
   -  |  -   -   ?
   0  |  -   0   +
   +  |  ?   +   +

• Notice the need for the ? value
• Arises because of the abstraction

Dataflow Analysis
• Classic style of program analysis
• Used in optimizing compilers
  ■ Constant propagation
  ■ Common sub-expression elimination
  ■ Loop unrolling and code motion
  ■ etc.
• Efficiently implementable
  ■ At least intraprocedurally (within a single procedure)
  ■ Use bit-vectors, fixpoint computation

Static Single Assignment Form
• Transform CFG so each use has a single defn

  Before:                       After:
    x := 0                        x1 := 0
    v := 3   |   v := 4 + x       v1 := 3   |   v2 := 4 + x1
    x := x + v                    v3 := Φ(v1,v2)
                                  x2 := x1 + v3

  (v is assigned on two branches that join before the final assignment to x)

Lambda Calculus
• Three syntactic forms
  e ::= x      variable
      | λx.e   function
      | e e    function application
• One reduction rule
  ■ (λx.e1) e2 → e1[e2\x] (replace x by e2 in e1)
• Can represent any computable function!

Example
• Conditionals
  ■ true = λx.λy.x    false = λx.λy.y
  ■ if a then b else c = a b c
    - if true then b else c = (λx.λy.x) b c → (λy.b) c → b
    - if false then b else c = (λx.λy.y) b c → (λy.y) c → c
• Can also represent numbers, pairs, data structures, etc.
• Result: Lingua franca of PL

Operational Semantics
• Evaluation is described as transitions in some abstract machine
  ■ Example: Beta reduction from lambda calculus
    (λx.e1) e2 → e1[e2\x]
  ■ State of machine described by current expression
• There are different styles of abstract machines
  ■ Small-step (as above), big-step, etc.
• The meaning of a program is its fully reduced form (a.k.a. a value)

Denotational Semantics
• The meaning of a program is defined as a mathematical object, e.g., a function or number
• Typically define an interpretation function ⟦ ⟧
  ■ Takes a program fragment as argument and returns its meaning
  ■ E.g., ⟦ 3+4 ⟧ = 7
• Gets interesting when we try to find denotations of loops or recursive functions

Denotational Semantics Example
• b ::= true | false | b ∨ b | b ∧ b
• e ::= 0 | 1 | ... | e + e | e * e
• s ::= e | if b then s else s
• Semantics:
  ■ ⟦ true ⟧ = true
  ■ ⟦ b1 ∨ b2 ⟧ = true    if ⟦b1⟧ = true or ⟦b2⟧ = true
                  false   otherwise
  ■ ⟦ if b then s1 else s2 ⟧ = ⟦s1⟧   if ⟦b⟧ = true
                               ⟦s2⟧   if ⟦b⟧ = false

Type Systems
• Machine represents all values as bit patterns
  ■ Is 00110110111100101100111010101000
    - A signed integer? Unsigned integer? Floating-point number? Address of an integer? Address of a function? etc.
• Type systems allow us to distinguish these
  ■ To choose the operation (which + op), e.g., FORTRAN
  ■ To avoid programming mistakes
    - E.g., don’t treat an integer as a function address

Simply-typed λ-calculus
e ::= x | n | λx:τ.e | e e
τ ::= int | τ → τ
A ⊢ e : τ    in type environment A, expression e has type τ

                      x ∊ dom(A)
  -----------       --------------
  A ⊢ n : int       A ⊢ x : A(x)

  A ⊢ e1 : τ→τ′    A ⊢ e2 : τ         A[τ\x] ⊢ e : τ′
  ----------------------------      ---------------------
         A ⊢ e1 e2 : τ′              A ⊢ λx:τ.e : τ→τ′

Subtyping
• Liskov:
  ■ If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2, then S is a subtype of T.
• Informal statement
  ■ If anyone expecting a T can be given an S instead, then S is a subtype of T.

Applications: Abstract Interp.
• Polyspace
  ■ Looks for race conditions, out-of-bounds array accesses, null pointer derefs, etc.
  ■ Also includes arithmetic equation solver
• ASTREE
  ■ Used to detect all possible runtime failures (divide by zero, null pointer deref, array out of bounds) on embedded code
  ■ Used regularly on Airbus avionics software

Applications: Dataflow analysis
• Optimizing compilers
  ■ I.e., any good compiler
• ESP: Path-sensitive program checker
  ■ Example: can check for correct file I/O properties, e.g., that files are opened for reading before being read
• LCLint: Memory error checker (plus more)
• Meta-level compilation: Checks lots of stuff
• ...

Applications: Axiomatic Semantics
• Extended Static Checker
  ■ Can perform deep reasoning about programs
  ■ Array out-of-bounds
  ■ Null pointer errors
  ■ Failure to satisfy internal invariants
• Uses the Simplify theorem prover
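
Sketch: the sign abstraction from the Abstract Interpretation example
The sign analysis of e ::= n | e + e can be written as a few lines of Python. This is only an illustrative sketch, not code from the course. The names alpha, abstract_add, abstract_eval, Num, and Add are hypothetical. The handling of ? as an operand (0 leaves the other operand unchanged, anything else combined with ? gives ?) is an assumption, since the slide's table only covers the operands -, 0, and +.

```python
from dataclasses import dataclass
from typing import Union

# Abstract values: negative, zero, positive, and "?" (sign unknown).
NEG, ZERO, POS, UNKNOWN = "-", "0", "+", "?"


def alpha(n: int) -> str:
    """Abstraction function: map a concrete integer to its sign."""
    if n < 0:
        return NEG
    if n == 0:
        return ZERO
    return POS


def abstract_add(a: str, b: str) -> str:
    """Abstract counterpart of + on signs (the table from the example)."""
    if a == ZERO:
        return b              # 0 + b has the sign of b
    if b == ZERO:
        return a              # a + 0 has the sign of a
    if a == b and a != UNKNOWN:
        return a              # (+) + (+) = +, (-) + (-) = -
    # Mixed signs, or an operand that is already "?" (assumed extension).
    return UNKNOWN


@dataclass
class Num:
    value: int


@dataclass
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Add]        # e ::= n | e + e


def abstract_eval(e: Expr) -> str:
    """Evaluate an expression over signs instead of integers."""
    if isinstance(e, Num):
        return alpha(e.value)
    return abstract_add(abstract_eval(e.left), abstract_eval(e.right))
```

For instance, abstract_eval(Add(Num(-3), Num(5))) yields ?: the analysis only sees (-) + (+), so it cannot determine the sign of the sum even though the concrete result is 2. That loss of precision is exactly why the ? value is needed.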