Lectures 6 and 7: Hoare Logic
Learning outcomes
- Understand how a small-step operational semantics explains what a program does.
- Explain what a Hoare triple means.
- Use Hoare logic rules to prove (simple) programs on paper.
Introduction
We have so far taken a view of program verification where the code is a functional program in Coq, the specification is a property about that function, and the proof uses any Coq-level reasoning required; we specifically developed a style where the code is an ADT and the specification relates it to a model.
We will now develop Hoare logic, a formal system for reasoning about program correctness. At first, the code will still be functional programs. The specifications will use a new style of pre- and post-conditions, attached to each function. Proofs will use the rules of Hoare logic to handle each programming feature and to compose existing specifications together. Hoare logic will provide a smooth path to separation logic and concurrent separation logic, which add extensions for reasoning about more interesting programs while retaining much of the basic structure of this simple version.
Reading these notes
These lecture notes are written in a bottom-up style: each definition is presented in detail, then used later. This is intended to be a good reference but you will not need to understand every detail before moving on.
When reading the notes, I suggest reading the introduction in each sub-section, then moving forward, until you get to the Hoare logic subsection. You can go back if you ever need to understand something in more detail or to see a specific rule or definition.
Hoare logic and its extensions are the basis of a good deal of program reasoning, including semi-automated verification tools like Dafny and F*, as well as the Iris framework we will use in this class. The general setup is that the user writes pre- and post-conditions in the source code and then some form of automation kicks in to prove that those annotations are correct. The purpose of learning Hoare logic itself, rather than starting with verifying programs, is to understand two things more deeply:
- What does it mean to have a "verified" program annotated with pre- and post-conditions when the verification succeeds?
- When a proof fails, what reasoning are tools following?
The second question is especially important since you will face failed proofs in verification (either because of incorrect code, incorrect specifications, or because the proof isn't done), and it is extremely helpful to have a good mental model of what's going on when debugging them.
Programming language
We'll start by defining a programming language and its semantics. The goal right now is to have just enough features to understand the rules of Hoare logic; eventually we'll add features to enable writing useful programs, without changing the structure of the verification features.
Syntax
The language is an extension of the λ-calculus that includes tuples, numbers, and booleans as primitive data, and arithmetic operations for addition and comparison.
At a high level, the language is built around expressions and values. Values are simpler: they are the data of the language, in this case consisting of first-class functions, numbers, booleans, and tuples. Expressions are where the computation lives, and include arithmetic, comparison between numbers, and a conditional `if` expression. What especially typifies expressions is that they evaluate to values, as opposed to being run for side effects like statements in other languages; hence we see no construct in this language like `x := v` to assign a (mutable) variable. `if` is a bit special in that we have it as an expression (which evaluates to either the `then` side or the `else` side), unlike the `if` statement in C or Java, for example.
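Concretely, the grammar is:

$$
\begin{aligned}
e &::= v \mid x \mid e_1\ e_2 \mid e_1 + e_2 \mid e_1 == e_2 \mid \mathsf{if}\ e\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2 \\
  &\qquad \mid (e_1, e_2) \mid \mathsf{fst}\ e \mid \mathsf{snd}\ e \\
v &::= \lambda x.\ e \mid \overline{n} \mid \mathsf{true} \mid \mathsf{false} \mid (v_1, v_2)
\end{aligned}
$$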
Common syntactic sugar:
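$$
\begin{aligned}
\mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 &\triangleq (\lambda x.\ e_2)\ e_1 \\
\lambda x\ y.\ e &\triangleq \lambda x.\ \lambda y.\ e
\end{aligned}
$$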
Some notation is worth explaining here. The notation uses metavariables to indicate what is being defined: $e$ always refers to an expression, $v$ to a value, $x$ to a variable, and $n$ to a number. A definition like $v ::= \dots \mid \dots$ says that the metavariable on the left can be formed in any of the ways listed on the right (the vertical bar separates alternatives, just like a Coq `Inductive`). The grammar for expressions includes one alternative that is just $v$, which says that any value (defined above) can also be used as an expression. The grammar defines a recursive inductive datatype for expressions and for values; the definitions refer to each other (the grammar for $e$ refers to values, but $\lambda x.\ e$ says anonymous functions are values whose body is an expression), so we have mutually recursive inductives.

The alternative $x$ in the grammar says that a variable is an expression. $\overline{n}$ says that an integer constant is a value; the overline is used to distinguish between a meta-level number $n$ and the literal $\overline{n}$, which is a value in the language. $\mathsf{fst}$ and $\mathsf{snd}$ are "projection functions" (often denoted $\pi_1$ and $\pi_2$) that get the first and second element of a tuple, respectively.
If you've seen the Ξ»-calculus before...
You may have seen a definition of the lambda calculus that basically said $e ::= x \mid \lambda x.\ e \mid e_1\ e_2$. Separately, you would then have a definition of a value predicate (probably as part of the semantics), defined so that only lambda abstractions are values.
The definition above instead makes values a separate syntax, and then embeds them into expressions: an expression is a value only if it is formed this way. When this definition is formalized in Coq, this does mean that there are some constructs, like pairs and lambda abstractions, that appear in both the definition of expressions and of values, but we can be ambiguous via an "abuse of notation" when working on paper.
The language may look too primitive to do anything if you haven't seen a presentation of the λ-calculus before; the examples below should help. The main thing we can do is to define functions using the $\lambda x.\ e$ construct, which is a lambda abstraction or anonymous function. The application syntax $e_1\ e_2$ is used when $e_1$ is a function to apply it to the argument $e_2$.
Examples of functions:
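For instance (these particular definitions are just for illustration):

$$
\begin{aligned}
\mathsf{id} &\triangleq \lambda x.\ x \\
\mathsf{add} &\triangleq \lambda x\ y.\ x + y \qquad \text{(curried, so that } \mathsf{add}\ \overline{1}\ \overline{2} \to^* \overline{3}\text{)} \\
\mathsf{isZero} &\triangleq \lambda x.\ x == \overline{0}
\end{aligned}
$$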
Semantics
The semantics answers precisely what it means to run some expression $e$. We need such a definition to ground any verification that $e$ meets some specification; the correctness theorem will need to talk about what $e$ does (when run) in order to say whether that behavior is correct or not.
We will give what is called a small-step operational semantics for this language; you can look to a programming language theory class to get a broader perspective on other approaches to semantics of a programming language.
The semantics is based on a step relation $e \to e'$ (to be defined below in the Reduction rules subsection). This is just a relation that we use special notation for; the arrow is not implication. Intuitively, $e \to e'$ is defined in such a way that it means $e$ executes to $e'$ in one execution step. We will then generally talk about $e \to^* e'$, which allows zero or more steps between $e$ and $e'$; formally it would be defined by these rules:
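$$
\frac{}{e \to^* e}
\qquad
\frac{e \to e' \qquad e' \to^* e''}{e \to^* e''}
$$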
When $e \to^* e'$, we often say $e$ reduces to $e'$. Reduction is computation in the λ-calculus; if you wrote an interpreter for this language, running an expression would mean reducing it as much as possible. If we have $e \to^* v$ (that is, if $e$ can reduce to a value specifically), then this is considered a terminating execution. The step relation is defined so that $v \not\to e'$; that is, a value will never step to anything (values are irreducible). If we have $e \neq v$ and $e \not\to e'$ (that is, $e$ is not a value and cannot step to anything), we will say $e$ is "stuck", which is how the semantics encodes "going wrong" - for example, $\overline{1} + \mathsf{true}$ is stuck.
Exercise
Let $\omega := \lambda x.\ x\ x$ and $\Omega := \omega\ \omega$. What does $\Omega$ reduce to?
Solution
$\Omega$ reduces to itself! Hence we have a non-terminating expression, also called a diverging expression. Note that $\Omega$ is not a value and also is not stuck.
Eventually we'll add recursion to the language and then we'll have much more ordinary examples of non-termination.
There are three things an expression can do: it can reduce to a value, it can reduce to a stuck expression, and it might have an infinite chain of reductions (it also might do more than one of these if the relation is non-deterministic!). A good interpreter would throw some sort of exception or print an error message if an expression gets stuck (`1 + true` is such an example), but we haven't written a semantics for what error messages should be printed. This is a language designed for verification, so we won't be too worried about exactly how programs go wrong; we're in the business of proving correct programs.
Reduction rules
Rules defining $e \to e'$ (the base cases of the step relation):
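$$
\begin{aligned}
(\lambda x.\ e)\ v &\to e[v/x] \\
\overline{n} + \overline{m} &\to \overline{n + m} \\
\overline{n} == \overline{m} &\to \mathsf{true} \quad (\text{if } n = m) \\
\overline{n} == \overline{m} &\to \mathsf{false} \quad (\text{if } n \neq m) \\
\mathsf{if}\ \mathsf{true}\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2 &\to e_1 \\
\mathsf{if}\ \mathsf{false}\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2 &\to e_2 \\
\mathsf{fst}\ (v_1, v_2) &\to v_1 \\
\mathsf{snd}\ (v_1, v_2) &\to v_2
\end{aligned}
$$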
The first rule, $(\lambda x.\ e)\ v \to e[v/x]$, is called a beta reduction. It's important enough to get special attention, and the name is worth remembering. The notation $e[v/x]$ for substitution means to substitute the value $v$ in the expression $e$ for each occurrence of the variable $x$. The way to remember the order here (which is rather standard) is to think of $v/x$ as $v$ "sits over" $x$. We will not give a formal definition of substitution.
Capture-avoiding substitution
For the curious, substitution is not entirely straightforward.
There is a generally tricky issue of variable capture when defining substitution. The issue is that if we have an expression like $\lambda x.\ \lambda y.\ x$ and try to apply it to the free variable $y$, a naive definition of substitution will reduce $(\lambda x.\ \lambda y.\ x)\ y$ to $\lambda y.\ y$. Notice that what was a free variable is now a reference to the lambda abstraction's bound variable; we call this "variable capture" and it's generally considered a bug in substitution. Instead, capture-avoiding substitution will rename bound variables and produce $\lambda z.\ y$, preserving the meaning of the original expression.
The Coq formalism we will eventually use, based on Iris, does not solve this problem (it uses a slightly wrong definition of substitution), but we will always substitute closed terms, and in that situation variable capture is not possible.
Evaluation contexts
Next, we need to give so-called structural rules. The attentive reader will notice that there is no rule for something like $(\lambda x.\ x)\ (\overline{1} + \overline{2})$ - strictly speaking, $\overline{1} + \overline{2}$ is not a value, so the beta rule for applications does not apply. We'll now fix that.
Here, we'll define structural rules all at once using a technique of evaluation contexts. The idea is that in $e_1 + e_2$, we want to say that the "next operation" is to run $e_2$, and when that finishes and produces (say) $\overline{7}$, steps can continue with $e_1 + \overline{7}$. First, we define an evaluation context:
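$$
\begin{aligned}
K ::= {} & \bullet \mid e\ K \mid K\ v \mid e + K \mid K + v \mid e == K \mid K == v \\
& \mid (e, K) \mid (K, v) \mid \mathsf{fst}\ K \mid \mathsf{snd}\ K \mid \mathsf{if}\ K\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2
\end{aligned}
$$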
An evaluation context is an expression with a hole in it (where the $\bullet$ is). Define $K[e']$ to mean "fill in" the hole of $K$ with the expression $e'$, producing an expression. For example, $(\bullet + \overline{2})[\overline{1}] = \overline{1} + \overline{2}$ and $((\lambda x.\ x)\ \bullet)[\overline{5}] = (\lambda x.\ x)\ \overline{5}$.
Full definition of $K[e']$
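$$
\begin{aligned}
\bullet[e'] &= e' \\
(e\ K)[e'] &= e\ (K[e']) \\
(K\ v)[e'] &= (K[e'])\ v \\
(e + K)[e'] &= e + (K[e']) \\
(K + v)[e'] &= (K[e']) + v \\
(\mathsf{fst}\ K)[e'] &= \mathsf{fst}\ (K[e'])
\end{aligned}
$$

and so on for the remaining context forms.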
Now we add one more rule to define the step relation:
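$$
\frac{e \to e'}{K[e] \to K[e']}\ \text{(step-context)}
$$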
Stop and think: in $e_1\ e_2\ e_3$, where $e_2$ and $e_3$ are expressions and not values, what order are $e_2$ and $e_3$ evaluated in? Try to answer this based only on the rules.
Solution
First, let's expand this to $(e_1\ e_2)\ e_3$. We need to decompose this into something of the form $K[e]$ for some context $K$. The only alternative for forming an evaluation context that applies is the one written $e\ K$, which we instantiate to get $(e_1\ e_2)\ \bullet$; $e_3$ is at the position of the "hole" and that's what runs first.
This means under this semantics a function with multiple arguments (via currying) has its arguments evaluated right-to-left.
Exercise: what would you change so that addition is evaluated left-to-right? What would you change so tuples could be evaluated in either order?
Intuition behind the semantics
The semantics likely mostly follows your intuition, but it makes some things precise. The most obvious element of precision is the evaluation order.
When we have $(\overline{1} + \overline{2}) + (\overline{3} + \overline{4})$, the semantics agrees with your intuition that the result is $\overline{10}$. But what is the actual sequence of steps? The rules above give an answer, hidden in the evaluation contexts.

The two relevant contexts are $K + v$ and $e + K$. The first does not apply here, since the right-hand side of the sum is not a value yet. Since $\overline{3} + \overline{4} \to \overline{7}$, we have $(\overline{1} + \overline{2}) + (\overline{3} + \overline{4}) \to (\overline{1} + \overline{2}) + \overline{7}$. Only then can we use the context $\bullet + \overline{7}$ and evaluate $\overline{1} + \overline{2}$ in that "hole" position.

Similarly, the beta rule requires that the argument be a value before we perform substitution (this is called a "call-by-value" semantics). Look at how the contexts $e\ K$ and $K\ v$ further tell us that the argument is reduced to a value before reducing the function to a lambda expression.
This semantics is deterministic (you could prove this). A different definition of evaluation contexts could make it non-deterministic, and yet another definition would be deterministic but with everything evaluated left-to-right.
Structural evaluation rules
The following lemmas about the step relation can be derived from the step-context rule; they formalize the intuition above that comes from inspecting the semantics.
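$$
\frac{e_2 \to e_2'}{e_1 + e_2 \to e_1 + e_2'}
\qquad
\frac{e_1 \to e_1'}{e_1 + v_2 \to e_1' + v_2}
\qquad
\frac{e_2 \to e_2'}{e_1\ e_2 \to e_1\ e_2'}
\qquad
\frac{e_1 \to e_1'}{e_1\ v_2 \to e_1'\ v_2}
$$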
Hoare logic
Hoare logic is based on the Hoare triple $\{P\}\ e\ \{Q\}$. The intuitive reading of this statement is that "if $P$ holds initially and $e$ terminates in a value $v$, then $Q(v)$ holds". The proposition $P$ is called the pre-condition and $Q$ is the post-condition. Note that $Q$ is a function of a value, namely the value that $e$ reduces to (if $e$ loops forever and doesn't reduce to a value, the Hoare triple doesn't say anything). To give a reminder that the postcondition is a function, these notes will write $\{P\}\ e\ \{\lambda v.\ Q(v)\}$. On the board we will also write $\{P\}\ e\ \{v.\ Q(v)\}$ to give a name to the return value, omitting the $\lambda$ to save some space.
The rules of Hoare logic explain how to prove a triple $\{P\}\ e\ \{Q\}$ by breaking it down into simpler verification tasks, based on the structure of $e$. Working bottom-up instead, if we prove a Hoare triple about a helper function, that triple can be used in the proof of a larger function at the point where it calls the helper, without having to repeat the reasoning about the helper's body.
Propositions
The Hoare triple has "propositions" $P$ and $Q$. You can proceed to the next sub-section with the intuition that a proposition is a Coq proposition, but eventually (when we upgrade to separation logic) it will be more like a predicate over the program state. The main thing to note is that in the Hoare logic rules we use a statement $P \vdash Q$ (pronounced "$P$ entails $Q$"), which says "whenever $P$ is true, $Q$ is true".
Let's start with the following grammar for propositions:
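$$
P ::= \ulcorner \phi \urcorner \mid P \land P \mid P \lor P \mid P \Rightarrow P \mid \exists x.\ P(x) \mid \forall x.\ P(x)
$$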
where $\phi$ is a "meta-level" proposition (think of it as a Coq proposition, which is what it will be when we use a formalization of all of this theory). We will use $\ulcorner \phi \urcorner$ to "lift" a meta-level proposition to the Hoare logic level; in the rest of these notes this conversion will be implicit, to keep things concise.
For now, a Hoare logic proposition appears no better than Coq propositions. This will change when we move to separation logic.
The rules for proving entailments between propositions are mostly intuitive, and we will come back to them and be more precise with separation logic. However, to give you a flavor, here are a few rules:
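$$
\frac{P \vdash Q_1 \qquad P \vdash Q_2}{P \vdash Q_1 \land Q_2}
\qquad
\frac{\phi}{P \vdash \ulcorner \phi \urcorner}
\qquad
\frac{P \vdash Q(a)}{P \vdash \exists x.\ Q(x)}
$$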
Proof rules
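The core rules are the following. The exact phrasing varies across presentations, but these versions line up with how the rules (value, pure-step, bind, consequence, exists, and pure) are used in the rest of these notes:

$$
\frac{P \vdash Q(v)}{\{P\}\ v\ \{Q\}}\ \text{value}
\qquad
\frac{e \to e' \qquad \{P\}\ e'\ \{Q\}}{\{P\}\ e\ \{Q\}}\ \text{pure-step}
$$

$$
\frac{\{P\}\ e\ \{Q'\} \qquad \forall v.\ \{Q'(v)\}\ K[v]\ \{Q\}}{\{P\}\ K[e]\ \{Q\}}\ \text{bind}
$$

$$
\frac{P \vdash P' \qquad \{P'\}\ e\ \{Q'\} \qquad \forall v.\ Q'(v) \vdash Q(v)}{\{P\}\ e\ \{Q\}}\ \text{consequence}
$$

$$
\frac{\forall x.\ \{P(x)\}\ e\ \{Q\}}{\{\exists x.\ P(x)\}\ e\ \{Q\}}\ \text{exists}
\qquad
\frac{\phi \Rightarrow \{P\}\ e\ \{Q\}}{\{\ulcorner \phi \urcorner \land P\}\ e\ \{Q\}}\ \text{pure}
$$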
Reading these rules
It's important to have some fluency with these rules. This first means knowing how to read them.
First, remember that above the line is a rule's premises (if there are any) and below is its conclusion.
In general with such proof rules, you will want to start by looking for axioms, or rules without premises. These are the ends of a proof, where using these rules will finish the proof. For other rules (which will be the majority) you will want to read rules from the conclusion to the premises: the effect of `apply`ing such a rule in Coq is to transform a goal that looks like the conclusion into the premises. If there's a Hoare triple in both, then the one in the premise is the "rest of the proof" and the rest of the premises are "side conditions" required to apply this rule.

For Hoare logic more specifically, read rules counter-clockwise starting at the bottom left: start with the precondition in the goal, observe what preconditions are provided in the premises, see what postcondition needs to be established in the premises, and finally see what postcondition this concludes by looking back at the goal. Try doing this to understand what the consequence and pure-step rules accomplish.
It's best for the purpose of this class to think about $\{P\}\ e\ \{Q\}$ as a Coq `Prop` that says $e$ has a particular specification, with a definition that we haven't yet given. The rules below are then theorems proven with that definition. There are two consequences to this view:

- The specifications we write will quantify over Coq-level variables (implicitly in the notes, but in Coq we will have to be explicit about this). For example, a specification that mentions numbers $n$ and $m$ should be viewed as holding for all values of the Coq variables $n$ and $m$.
- In addition to using the rules of Hoare logic, we can also use Coq reasoning principles, like `destruct` and arithmetic reasoning about numbers. The rules will handle anything specific to programs.

You can think of the horizontal line in each rule as Coq implication (the premises together imply the conclusion), as opposed to $\vdash$, which is entailment between propositions.
Let's walk through the pure-step rule. It says that to prove a triple about $e$ with arbitrary pre- and post-conditions, we can switch to a proof about $e'$ with the same pre- and post-conditions, if $e \to e'$. In practice, imagine you're doing a Coq proof whose goal is $\{P\}\ e\ \{Q\}$. Applying this rule would change your goal to one with $e'$ instead, which is simpler because $e'$ is the result of one step of computation. This is almost like running `simpl` on a Coq goal, except that we've moved forward by one step of computation in the programming language as opposed to steps of Coq's functions.
Example: consequences of bind rule
The bind rule is likely to be one of the most confusing, especially because it makes use of evaluation contexts. To illustrate what it does, here are some rules derived from it using examples of how evaluation contexts are constructed.
Learning by doing
The following examples are something you could construct yourself while reading a text like this, to generate your own exercise: take the bind rule and the rules for evaluation contexts, plug in some possible contexts for $K$, and see what the resulting rules say. Many things you learn (e.g., papers you read, the Intel Architectures Developer's Manual, the entire memcached codebase) will not be "scaffolded" appropriately like this, and being able to construct your own exercises to build understanding is an extremely valuable skill.
Remember that $\mathsf{let}\ x = e_1\ \mathsf{in}\ e_2$ is just sugar for $(\lambda x.\ e_2)\ e_1$.
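$$
\frac{\{P\}\ e_1\ \{Q'\} \qquad \forall v.\ \{Q'(v)\}\ e_2[v/x]\ \{Q\}}{\{P\}\ \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2\ \{Q\}}\ \text{hoare-let}
$$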
We know that first the expression $e_1$ is evaluated, then the result is substituted into $e_2$ and that is evaluated. Observe that the hoare-let rule says we can prove a let expression in a mirrored way with two steps: show that $e_1$ takes $P$ to an intermediate condition $Q'$, then that $e_2$ takes $Q'$ to $Q$. The proof about $e_2$ has to be carried out for an arbitrary value $v$ subject only to $Q'(v)$, since we don't know in that proof what $e_1$ will return other than what the specification says.

This rule is derived by choosing $K := (\lambda x.\ e_2)\ \bullet$. What is not so obvious is that we didn't determine exactly where inside $e_1$ the next step will occur; that step might be buried further within the expression. However, the Hoare bind rule doesn't care about this and still lets us split the proof this way.
Another derived rule, for `if` expressions:
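$$
\frac{\{P\}\ e\ \{v.\ (v = \mathsf{true} \land Q_1) \lor (v = \mathsf{false} \land Q_2)\} \qquad \{Q_1\}\ e_1\ \{R\} \qquad \{Q_2\}\ e_2\ \{R\}}{\{P\}\ \mathsf{if}\ e\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2\ \{R\}}\ \text{hoare-if}
$$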
This chooses a particular specification for the intermediate condition, but a very general one: we need to produce a boolean since `if` only works on booleans, and other than that this rule allows one proof where $e$ returns true with postcondition $Q_1$ and another where it returns false with postcondition $Q_2$. The `if` subsequently becomes one of the branches, and the premises require a proof about each branch.
Exercise: derive structural rules for pure steps
Many presentations of Hoare logic don't use the pure-step rule but, following an axiomatic style, give one "structural" rule for each language construct.
Prove the following structural rules.
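$$
\frac{}{\{\mathsf{True}\}\ \overline{n} + \overline{m}\ \{v.\ v = \overline{n+m}\}}\ \text{hoare-add}
\qquad
\frac{n \neq m}{\{\mathsf{True}\}\ \overline{n} == \overline{m}\ \{v.\ v = \mathsf{false}\}}\ \text{hoare-neq}
$$

$$
\frac{\{P\}\ e_2\ \{Q\}}{\{P\}\ \mathsf{if}\ \mathsf{false}\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2\ \{Q\}}\ \text{hoare-if-false}
$$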
Solution
hoare-add: pure-step + value
hoare-neq: pure-step + value
hoare-if-false: pure-step. Observe that the `if` construct takes a step even when the branches are arbitrary expressions; this is necessary for this rule to even be sound.
Hoare logic verification examples
Here are some examples of correct specifications.
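For instance, suppose we have curried functions $\mathsf{min}$ and $\mathsf{add}$ on numbers, and a function $\mathsf{f}$ that calls both; think of something like $\mathsf{f} \triangleq \lambda x.\ \mathsf{add}\ (\mathsf{min}\ x\ \overline{1})\ \overline{1}$ (the code and specs here are illustrative stand-ins). Then specs along these lines are provable:

$$
\begin{aligned}
&\{v_1 = \overline{n} \land v_2 = \overline{m}\}\ \mathsf{add}\ v_1\ v_2\ \{v.\ v = \overline{n+m}\} \\
&\{\mathsf{True}\}\ \mathsf{min}\ \overline{n}\ \overline{m}\ \{v.\ \exists p.\ v = \overline{p} \land p \le n\} \\
&\{v_0 = \overline{n}\}\ \mathsf{f}\ v_0\ \{v.\ \exists m.\ v = \overline{m} \land m \le n + 1\}
\end{aligned}
$$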
These specs illustrate a range of strength. A spec can be strong, pinning down the result exactly, or it can under-specify, either by having a precondition that isn't really needed (it would work just as well without it) or by saying less in the postcondition than we could prove (as with the min spec). The f spec is the best we can prove assuming the min and add specs: the precondition is necessary, and we under-specify in the postcondition.

Even these specs are not the strongest possible for their code - a function's code may not actually require an argument to have the type the spec demands, or may only need to be safe for one particular value of another argument.
Example proof: specification for f
Let's prove the spec for $\mathsf{f}$ above, assuming the other specs. This proof demonstrates using the rules as well as compositionality: notice how we end up using the specifications for min and add without having to know anything else about those functions. To conveniently state intermediate steps, we'll abbreviate the postcondition of the f spec as $Q_f$.
First, unfolding $\mathsf{f}$ we see that we have a function application. Apply the pure-step rule and show a beta reduction (substituting the argument for the formal parameter in the body of $\mathsf{f}$), so that the goal is now a triple about the body of $\mathsf{f}$.

We (still) have a function application. The "next thing to run" is the second argument; luckily it's already a value, so the first argument - the call to min - is next. This can be formally expressed by saying that the current expression we're verifying can be decomposed into an evaluation context $K$ around the min call. We want to apply the bind rule, producing two goals. The $K$ and $e$ in the rule are determined by matching the rule's conclusion against our current goal, but the intermediate condition $Q'$ is a free choice.

Observe how the bind rule splits up the proof of a complex expression into a proof about the next expression $e$ (the min reasoning in our current goals), and then a proof about the code $K[v]$ that consumes a value $v$ produced by $e$ (the add reasoning in our current goals). When applied, it allows using any intermediate proposition $Q'$ to connect the first and second parts.

In this case, we'll pick a $Q'$ derived from a combination of what we learn from the min spec's postcondition, what we need for the add spec's precondition, and some intuition; later we'll see a way of setting up the proof so that it isn't necessary to cleverly come up with this definition.

Let's see how to prove the first goal, using the existing min spec. Unfortunately, neither the precondition nor postcondition exactly match up. To make them align, we'll first apply the pure rule to "remember" the equality in the precondition that constrains $n$. Notice that $n$ is a meta-level variable (a number) for this goal, and what we know about it is buried in a precondition; applying the pure rule allows us to "extract" that knowledge up to the meta level without changing anything about the Hoare triple to be proven. Next, we'll use the rule of consequence, a general-purpose way to use one spec to prove another for the same code. With the extracted assumption about $n$, this now works; it just requires some logical reasoning for both entailments.

For the second goal, this time we don't need the pure rule, only the rule of consequence. However, we do need the fact recorded in $Q'$ in order to prove that the precondition in the goal implies the precondition in the proven add spec. This is also an opportunity to use the Hoare exists rule. Other than that, this goal also comes down to two entailments to be proven.

This completes the proof! Taking a step back, notice how we broke the proof down into two steps, one for each sub-call, and then proved each step by aligning the spec proven for the callee with the context in which it was called. This is great for modular reasoning: we can prove a specification, then use (and reuse) it without thinking about the implementation.
Exercise: prove the triples
Prove the Hoare triples above. Do the proofs as carefully as you can, annotating the rules you use. This is good practice for understanding the rules, and will also help you appreciate when we switch to Coq and it automates the book-keeping for you.
You may want to start by reading the next section on Hoare outlines and writing an outline-based proof sketch, and only then trying to do a detailed proof.
Hoare outlines
Writing an on-paper proof referencing these rules is quite tedious and obscures some important aspects of the proofs. There's another commonly used format for expressing a proof in Hoare logic that captures the essence of the reasoning: the Hoare outline. To use the outline, we will need programs to be primarily written with let-bindings; this names all the subexpressions and makes obvious what order computations happen in, as opposed to that order coming from the language semantics and being implicit in the program. Observe how in $\mathsf{let}\ x = e_1\ \mathsf{in}\ \mathsf{let}\ y = e_2\ \mathsf{in}\ \dots$ it's clear that $e_1$ runs first and then $e_2$, while $e_1 + e_2$ is ambiguous from just the syntax, and in fact evaluates $e_2$ first based on the language semantics above.
Here's a specification for a small function written in this style, with let bindings (a toy example, using the illustrative add spec from above). The specification is written vertically to fit a bigger program; it means exactly the same thing as the horizontal version.
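$$
\begin{aligned}
&\{\mathsf{True}\}\\
&\mathsf{let}\ x = \mathsf{add}\ \overline{1}\ \overline{2}\ \mathsf{in}\\
&\mathsf{let}\ y = \mathsf{add}\ x\ x\ \mathsf{in}\\
&y\\
&\{v.\ v = \overline{6}\}
\end{aligned}
$$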
And here is a Hoare outline for the above specification, which you can think of as a "proof sketch" for the Hoare triple that records the key intermediate assertions but not all the individual details.
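$$
\begin{aligned}
&\{\mathsf{True}\}\\
&\mathsf{let}\ x = \mathsf{add}\ \overline{1}\ \overline{2}\ \mathsf{in}\\
&\qquad \{x = \overline{3}\}\\
&\mathsf{let}\ y = \mathsf{add}\ x\ x\ \mathsf{in}\\
&\qquad \{x = \overline{3} \land y = \overline{6}\}\\
&\qquad \{y = \overline{6}\}\\
&y\\
&\{v.\ v = \overline{6}\}
\end{aligned}
$$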
The Hoare outline gives a statement that is true at each point in the code, starting with the precondition and ending with the postcondition. We re-use the program variables as variables in the outline's propositions. Whenever a line has a function call like $\mathsf{let}\ x = \mathsf{add}\ \overline{1}\ \overline{2}\ \mathsf{in}$, we'll apply a specification for the function being called. For the Hoare outline to be a valid proof, the condition above the call should imply that function's precondition, and the line below should be its postcondition (in both cases we may need to instantiate the specification's variables appropriately). We can also write two statements in a row if the first implies the second, which helps adapt pre- and post-conditions. Note how these rules about what makes a Hoare outline valid are implicitly justified by the rule of consequence (do you see why?).
In this language all variables are immutable and there's no state other than the variables (the language is purely functional), so any line of the outline gives statements that remain true forever afterward; this will no longer be the case once the program has mutable state.
Hoare logic supports modular proofs
Something that you probably take for granted if you have even just a little programming experience is that if you're working on a big function `f`, you'll split it into some helper functions `h1` and `h2` that `f` then calls. This is a more modular implementation. Ideally (if the split is a good one) when you're working on `f` you don't have to think about the complete code of `h1` and `h2`, because you just have some abstract "specification" in mind of what they do.

Hoare logic supports modular reasoning that exactly reflects this split into helper functions: we prove a specification for each of `h1` and `h2`, then prove `f` by referring to those specifications, without needing to know how the helpers are implemented. Modularity is important for scaling a proof up to a big piece of code, and in this case it's elegant that the way the proof is modular (by writing specifications for functions) matches how the code is modular.

Modularity is also beneficial when a helper function is reused, since then the proof of `h1` can be done just once and used multiple times. However, it has benefits even if a function is used only once, since it breaks down a big task into smaller, more manageable tasks. In a larger team, modularity even permits you to split and parallelize the work: if you can decide what `h1` does (its specification), then one person can implement `h1` while another uses it in `f` at the same time.
Soundness
What does it mean to prove $\{P\}\ e\ \{Q\}$? We have a bunch of "rules" above, but where do they come from?
The reason Hoare logic is useful and what it means to have a (proven) Hoare triple is that we can give a definition of soundness of a Hoare triple that relates it to the semantics of programs, and then all the rules given above can be proven as theorems. The soundness definition requires that we interpret propositions, and you can think of that interpretation as being Coq propositions here.
Hoare triple definition/soundness
$\{P\}\ e\ \{Q\}$ means: if $P$ holds and $e \to^* e'$, then either (a) $e'$ can take another step ($e'$ is not stuck), or (b) there exists a value $v$ such that $e' = v$ and $Q(v)$ holds.
First, notice that Hoare logic connects back to the operational semantics of the programming language, and in particular to the reduction relation $\to^*$: we can precisely say how a Hoare triple relates to $e$'s behavior.
Second, notice that we only learn anything about the result if $e \to^* v$. This is called partial correctness, because the definition only gives the postcondition for programs when they terminate; there is a variant of Hoare logic with so-called total correctness triples that also prove the program terminates, but proving termination can be quite complicated. Do note that regardless of termination, if $P$ holds initially the program is at least safe, in that it never does something "bad" like trying to execute $\overline{1} + \mathsf{true}$.
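To make this definition concrete, here is a sketch of how it might be stated in Coq. The names here (`expr`, `val`, `Val`, `step`, `rtc`) are assumptions about the formalization rather than the exact definitions we will use:

```coq
(* Sketch: partial-correctness Hoare triples, assuming an expression
   type [expr], a value type [val] with an embedding [Val : val -> expr],
   a small-step relation [step : expr -> expr -> Prop], and [rtc] for
   reflexive-transitive closure (as in stdpp). *)
Definition hoare (P : Prop) (e : expr) (Q : val -> Prop) : Prop :=
  forall e', P ->
    rtc step e e' ->
    (* (a) [e'] is not stuck, or (b) it is a value satisfying [Q] *)
    (exists e'', step e' e'') \/
    (exists v, e' = Val v /\ Q v).
```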
Hoare logic as a logic
We have taken the view that a Hoare triple means the soundness statement above, and the rules are proven under that definition. It is also possible to view Hoare logic as a logic where the rules are purely formal axioms of a proof system, and then soundness is a theorem about that proof system.
See Programming Language Foundations for a more detailed presentation of this view (note that there are some significant differences in that presentation of Hoare logic, and you will likely need to read earlier chapters). An important consequence of formalizing Hoare logic with axioms is a proof of completeness and not just soundness: informally, the rules of Hoare logic are enough to prove any true Hoare triple, and thus we aren't missing any important rules.
Example: soundness of exists rule
At a high level, this rule can be proven by unfolding the definition of soundness and following rules of propositional logic.
Our goal is a Hoare triple; expanding the definition of soundness for that triple, we have assumptions $\exists x.\ P(x)$ and $e \to^* e'$. We can eliminate the existential to get a specific $x$ such that $P(x)$ holds. From the premise of this rule, we know $\{P(x)\}\ e\ \{Q\}$ holds. Unfolding the soundness definition in that assumption, we get an implication with two premises: the first is to prove the precondition $P(x)$ (which we already have), and the second is $e \to^* e'$ (which we also have). Thus from that Hoare triple we conclude that either $e'$ is reducible or it's a value $v$ where $Q(v)$ holds, which exactly matches the goal.
Exercise: soundness of rule of consequence
Go back to the proof rules and look at the rule of consequence. Prove this rule, interpreting the Hoare triples with the definition above.