Chapter 1 Proof Of Correctness (Sections 1.6, 2.3) ***************************************************** 1.1 Introduction *=*=*=*=*=*=*=*=* ``Beware of bugs in the above code; I have only proved it correct, not tried it.'' Donald E. Knuth http://www-cs-faculty.stanford.edu/~knuth/ How to check if a program is correct ? Why checking if a program is correct ? What is a correct program ? Définition 1 Specification: precise description of how a program should behave. A program is correct, if it behaves in accordance with its specifications. How to check if a program is correct? - Testing Designing good test data sets You can prove there is a bug. No proof of no bug! - Proof of Correctness 1.2 Correctness *=*=*=*=*=*=*=*= 1.2.1 Assertions ================= ... x=sqrt(r); // r should be >= 0 ! x=a[i]; // i should be in the correct range ... Définition 2 Assertion: condition on the variables of a program that should be verified at some given step, if the program is running correctly. Using assert in C/C++: #include ... assert(r>=0); x=sqrt(r); assert((0<=i) && (i=0); if (n==0) return 1; return n*factorial(n-1); } The specification of a program can be formalized as follow: - P: program - X: input - P(X): output - Q: precondition: predicate Q(X) on the input - R: postcondition: predicate R(X,P(X)) on the input and the output Exemple 1 Program to compute of a square root - P: y:=sqrt(x); - X: x - P(x): y - Q(x): x>=0 - R(x,y): y^2=x Définition 3 P is correct if (for all X) Q(X)-> R(X,P(X)). A Hoare triple is just a short hand notation: {Q} P {R} Exemple 2 {x>=0} y:=sqrt(x);; {y^2=x} 1.3 Proof of Correctness *=*=*=*=*=*=*=*=*=*=*=*=* 1.3.1 Divide and Conquer ========================= Let P be a big program: s_0 s_1 s_2 ... s_n-1 To prove {Q} P {R}, we break P into elementary steps, and insert assertions that describes the state of the program at each step: {Q} s_0 {R_1} s_1 {R_1} s_2 ... s_n-1 {R} P is correct, if each step is correct, i.e. the following Hoare triples hold: - {Q} s_0 {R_1}, - {R_1} s_1 {R_2}, - ... - {R_n-1} s_n-1 {R} To prove that the elementary steps are correct, we will use some syntactic rules, exactly as we did in formal logic. 1.3.2 Assignment rule: ======================= Consider the following Hoare triple: {Q} x:=e {R} Théorème 1 If from Q you can derive R with x substituted everywhere by e, then the Hoare triple is valid. Exemple 3 {x=2} y:=x+1 {y=3} When substituting, y=3 becomes x+1=3. From x=2, you can deduce x+1=3. So this Hoare triple holds. Exemple 4 {x>0} x:=x+1 {x>1} Here it can be confusing. The same name x stands for both the value of x before and after the assignments. If you get confused, just rename the variable: {x_0>0} x_1:=x_0+1 {x_1>1} When substituting, x_1>1 becomes x_0+1>1, which you can deduce from x_0>0. 1.3.3 Conditional Rule ======================= Consider the following Hoare triple: {Q} if condition B then P_1 else P_2 end if {R} Théorème 2 If the Hoare triples {Q/\ B} P_1{R} and {Q/\ B'} P_2 {R} hold, then the Hoare triple above holds. Exemple 5 [, Exercise 11 p. 78] 1.3.4 Loop Rule ================ Consider the following Hoare triple: {Q} while condition B do P end_while {R} {Q} describes the state of the program before the loop. We need to have a predicate which describes the state of the program DURING the loop: The loop invariant {S} Most of the time, {Q} will do the job, but not always. Théorème 3 If the Hoare triple {S/\ B} P {S} holds, then the following Hoare triple will hold: {S} while condition B do P end_while {S/\ B'} 1.3.5 A simple example ======================= Consider this little program: Product(x,y) // Precondition: x and y are non-negative integers // Postcondition: returns the product of x and y begin i:=0; j:=0; while ( (i=x)' ) do j=j+y; i:=i+1; end while // Assertion: { j = x*y } return(j); end; It's pretty clear that this algorithm it's correct. Let's check it anyway to see how it works. Proof. We need a loop invariant, which will describe the state of the program after k iterations. A reasonable guess is S : j=iy Let i_k, j_k and S_k be the value of i, j and S after k iterations. Base case: using the assignment rule, i_0=0 and j_0=0 so S_0 holds. Induction step: Assume the invariant holds after k-1 iterations: j_k-1=i_k-1y We need to prove that the invariant is preserved by the k th iteration. Otherwise said, we have to check that the following Hoare triple holds: {S_k-1/\(i=x)'} j_k:=j_k-1+y; i_k:=i_k-1+1; {S_k} Let's applying the assignment rule. By substituting i_k and j_k by the expressions they have been assigned with, we get: S j i <=> = y k k k j (i <=> +y = +1)y (By the substitution rule) k-1 k-1 i (i S <=> y+y = +1)y (By ) k-1 k-1 k-1 i i <=> y+y = y+y k-1 k-1 So, S_k indeed holds. By induction, after any number k of iterations, S_k still hold. At the end, both S_k and i=x hold, so, as expected: j=iy=xy Remarque 1 We have proved that after the end of the execution the result is correct. But what if the program does not terminate and loop forever? This is not usually consider a proper behavior. We only have spoken about partial-correctness. A full proof of correctness needs also a proof of termination! 1.3.6 A more sophisticated example: The Euclidean algorithm ============================================================ GCD(a,b) Begin // Precondition: a and b are non-negative integers // with a>=b. // Postcondition: // returns gcd(a,b), the greater common divisor of // a and b. Local variables: i, j, q, r i:=a; j:=b; while j>0 do // Assertion: // r is the rest of the integer division i/j r:= i mod j; i:= j; j:= r; end while // Assertion: i is the gcd of a and b. return(i); end; Exercice 1 Use the Euclidean algorithm to compute: GCD(8,8), GCD(14,4), GCD(31,17). Here the proof of correctness of the algorithm is non-trivial. Proof. Let i_k and j_k be the value of i and j after k iterations. We need to find an invariant which describes the state of the program after each iteration. Take S_k: gcd(i_k, j_k)=gcd(a,b). 1. Base case: Before the loop, i_0=a and j_0=b. So the invariant S_0 holds: gcd(i_0, j_0)=gcd(a,b); 2. Induction step: Assume S_k-1 holds, that is gcd(i_k-1, j_k-1)=gcd(a,b). We just need to prove that gcd(i_k, j_k)=gcd(i_k-1, j_k-1). By the assignment rule, we have: - r is the rest of the division of i_k-1 by j_k-1, - i_k=j_k-1, - j_k=r. So, by the definition of the integer division, we have: i =qj +j k-1 k-1 k for some integer q. 1. Assume x divides both i_k and j_k. Then, of course j_k-1, since j_k-1=i_k. By the equation above, x also divides i_k-1. 2. Assume x divides both i_k-1 and j_k-1. Then of course x divides i_k. Once again, using the equation above, x also divides j_k Therefore, gcd(i_k, j_k)=gcd(i_k-1, j_k-1)=gcd(a,b), as wanted. 3. At the end: By induction, the loop invariant S still holds: gcd(i, j)=gcd(a,b) Moreover, the loop condition failed: j=0. So, gcd(a,b)=gcd(i, j)=gcd(i, 0)=i, as expected. 1.4 Conclusion *=*=*=*=*=*=*=* 1.4.1 A note on automatic program proving: =========================================== - There is no mechanical way to decide even if a program terminates or not. - Traditional imperative languages: - Automatic proving is very difficult because of side effects - side effects needs to be considered as some form of output - Functional languages (lisp, scheme, ML, caml): - Designed to make it easier to prove correctness - Fun to program. Try them! - Two difficulties: - Finding the correct assertions - Checking that the implications holds - Mixed approach: - The programmer gives the main assertions - The prover derives the other assertions 1.4.2 Check assertions in your programs! ========================================= - Use static type checking / assert / exceptions - When debugging, you can ask for all assertions to be checked - Documentation - Help for the proof ----------------------------------------------------------------------- This document was translated from LaTeX by HeVeA (http://pauillac.inria.fr/~maranget/hevea/index.html).