Lecture 26: Liveness
Learning outcomes
- Understand how fairness makes liveness complicated
- Explain the meaning of an LTL liveness specification
Motivation
We've only talked about safety so far: we prove the program "doesn't go wrong" as long as it runs, but the postcondition only holds if the function terminates.
Real systems also care about liveness properties: proving that programs terminate and systems eventually produce results.
In fact systems also care about (quantitative) performance properties: achieving certain throughput or latency targets. These have generally been out of scope for verification simply because the properties are too hard to formalize (performance is a complex interplay between many factors), but there is still some work in this area.
Liveness is an important topic when reasoning about distributed systems especially.
Example: spinlock
Consider the following implementation of a spinlock-based mutex. Three points in the control flow are labeled pc0, pc1, and pc2; each of the two spawned threads in main goes through these states.
```go
type Mutex = bool

const UNLOCKED = false
const LOCKED = true

func (m *Mutex) Lock() {
	// pc0
	for !CompareAndSwap(m, UNLOCKED, LOCKED) {
	}
}

func (m *Mutex) Unlock() {
	// pc1
	*m = UNLOCKED
	// pc2
}

func main() {
	var m Mutex

	go func() {
		m.Lock()
		m.Unlock()
	}()

	go func() {
		m.Lock()
		m.Unlock()
	}()
}
```
`CompareAndSwap(m, UNLOCKED, LOCKED)` atomically checks if `*m = UNLOCKED` and, if it does, sets it to `LOCKED`. If `*m` is not `UNLOCKED`, the call does nothing. `CompareAndSwap` returns `true` if it successfully changed the value of `*m`, so that `Lock()` loops until it successfully acquires the lock.
The full state of this program can be modeled as a state machine. The state consists of a program counter for each thread (either pc0, pc1, or pc2) plus the value of the single mutex.
The state of each thread in main follows these transitions:
The key transition is that the CompareAndSwap
loop advances from pc0 to pc1 only if the mutex is unlocked, and atomically sets the mutex value to locked.
Each thread independently goes through this loop.
Exercise: argue as rigorously as you can why the two threads must both terminate.
Solution
First, we can observe that one of the two threads must win the race to acquire the mutex; they cannot both simultaneously fail.
However, why should the other thread acquire the mutex? A key issue this exercise brings up is fairness. The system's scheduler must run both threads "often enough" in order to guarantee that if the first thread acquires the mutex, it gets to run at some point and finish its critical section (represented by whatever happens in the pc1 state) so it can unlock the mutex. Otherwise it would be possible for the second thread to spin forever, waiting for a mutex that will never be released.
Linear Temporal Logic
To talk about liveness formally we will introduce Linear Temporal Logic (LTL). You've already seen separation logic, so this won't be your first time seeing a logic.
First, we will use LTL to describe traces of system behavior. The semantics (the meaning) of an LTL formula is a set of traces, where a trace is a sequence of states. We will use these formulas to describe two things: (1) what is a legal trace that comes from running our system?, and (2) what traces meet our liveness specification? The first-order logic and separation logic we've seen so far have been all about describing a single state; LTL is all about sequences of states. It turns out to be simplest to always talk about infinite sequences of states; if a system terminates at some point we'll model its execution by repeating the terminating state infinitely once it's reached.
Key idea
The key idea to remember about LTL is that a formula describes a set of traces (equivalently, a formula either holds or doesn't hold for a given trace), where a trace is an infinite sequence of states.
This tells you what form the semantics has, which is necessary to understand any particular LTL construct or formula.
Syntax
Let's start working through the syntax of LTL formulas, with brief intuition behind each construct.
The first construct is actually the boring one. It lifts a propositional formula P about a single state into temporal logic about a trace; what it asserts is just that the first state in the trace satisfies P. You can think about what P is in one of two ways: it's a formula describing one state with a bunch of variables over finite domains (like booleans or the program counter pc0 | pc1 | pc2 in the example above), or you can just imagine a Coq proposition describing a single state (though such a P is not technically propositional in that it could use quantifiers; this distinction isn't important to understand LTL). We can take any such formula P for a single state and turn it into an LTL formula ⌜P⌝ (this is intentionally the same notation we used to lift pure propositions into separation logic).
Next we have an action ⟨a⟩, where a is a relation between two states. This construct is used to describe transition systems within LTL; we will see that it allows us to easily say that a trace corresponds to the behavior of a state machine, where a is the relation describing the state machine's transitions. In temporal logic ⟨a⟩ says that the relation a holds between the first and second states of the trace.
The second row has the key novelty of LTL: the temporal operators. □P (pronounced "always P") says that P holds from now onward, or henceforth. ◇P (pronounced "eventually P") says that P holds eventually; that is, starting at some point in the future from the current moment.
The third row has the usual connectives of a logic: negation ¬P, P ∧ Q (logical and), and P ∨ Q (logical or). There's nothing exceptional about these in LTL. We can write P → Q for implication, and it's the same as ¬P ∨ Q, as in regular propositional logic.
Semantics
Now let's formally give the semantics of an LTL formula. In this logic I think it helps to directly look at the semantics to get an intuition; it's simple enough to do so.
We'll take every formula P in the grammar above and describe when it holds of a trace σ, written σ ⊨ P, where σ is an infinite sequence of states, viewed as a function from natural numbers to states. A concrete trace can be written as a sequence, like s₀, s₁, s₂, …, but note that it is important that this be an infinite sequence. A trace is a function, so in this example we can say σ(0) = s₀ and σ(1) = s₁.
Define a shift operator σ[k..] on traces that removes the first k elements of σ. Formally we can define it to be σ[k..](i) := σ(k + i). Notice that σ[j..][k..] = σ[(j + k)..]. (The notation is intentionally meant to evoke Go's t[k..] sub-slicing, but note that σ[k..] is defined and makes sense even for infinite traces.)
Now we can give the semantics of each temporal operator. The first four definitions are interesting:

- σ ⊨ ⌜P⌝ iff P(σ(0))
- σ ⊨ ⟨a⟩ iff a(σ(0), σ(1))
- σ ⊨ □P iff ∀ k, σ[k..] ⊨ P
- σ ⊨ ◇P iff ∃ k, σ[k..] ⊨ P
The other connectives are boring (they don't talk about time in any interesting way):

- σ ⊨ ¬P iff ¬(σ ⊨ P)
- σ ⊨ P ∧ Q iff σ ⊨ P and σ ⊨ Q
- σ ⊨ P ∨ Q iff σ ⊨ P or σ ⊨ Q
Examples
We can think of ⌜P⌝ as reflecting the following picture:

P, s₁, s₂, s₃, …

where by P we mean "a state satisfying P", followed by arbitrary states s₁, s₂, and so on. ⌜P⌝ only cares about the first state in a trace.
□P has this picture:

P, P, P, P, …

Every state must satisfy P (although the concrete states could all be different).
◇P:

s₀, s₁, P, s₃, …

Notice that P must show up in this list, but after that anything can happen (P could go back and forth, or P could hold from then on).
Now consider what happens when we start combining the modalities:
The picture for ◇□P is this:

s₀, s₁, P, ¬P, P, P, P, …

The definition says that there is a time somewhere in the future (this is the ◇) where, from that point forward (this is the □), P holds. The example shows that at logical time 2 P becomes true, but that's not the point where it holds from then on: that only happens at time 4. This trace still satisfies the temporal formula.
Exercise: what's the intuition and picture for □◇P?
Solution
The picture for □◇P is this:

s₀, s₁, P, | s₃, P, | s₅, s₆, P, | …

I've grouped these into runs where ◇P holds in each segment (the grouping is just to help you see the pattern): P shows up at least once in every segment, so no matter how far along the trace we are, P holds again at some later time.

This one is more complicated. Expanding the definitions, the formula says that at every time step (this is the □), there is a later time (this is the ◇) where P holds.
Exercise: what does □□P mean? What does ◇◇P mean? Try to reason from the definitions as well as intuitively.
Negation: ¬□P = ◇¬P and ¬◇P = □¬P.
Exercise: simplify ¬□◇P.
Solution
Push the negation inward to get ◇□¬P.
Notice that □ flips to ◇ under negation, and the composite modality □◇ flips to ◇□. The same holds in reverse in both cases! We say that □ and ◇ are dual, and similarly that □◇ and ◇□ are dual.
Fun fact: ◇□◇P = □◇P.
State machine definition
We can describe the spinlock example in temporal logic as follows:
The state consists of locked (a boolean) and two program counters: t1 for thread 1 and t2 for thread 2. The program counters will be inductive datatypes with three values, pc0, pc1, and pc2, corresponding to the labeled points in the code. The threads will start in pc0 and terminate in pc2, to keep the state machine as small as possible.
It is easier to read the transitions if we introduce notation for changing just one field of the state: s⟨f := v⟩ is the state s but with the field f changed to the value v. It also helps to read each transition as consisting of a "guard" over the initial state s that says when the transition can happen, together with an "update" that says how the next state s' is derived from s.

There are three possible transitions for thread 1:

- s.t1 = pc0 ∧ s.locked = true ∧ s' = s
- s.t1 = pc0 ∧ s.locked = false ∧ s' = s⟨locked := true⟩⟨t1 := pc1⟩
- s.t1 = pc1 ∧ s' = s⟨locked := false⟩⟨t1 := pc2⟩

These correspond to the CompareAndSwap failing, the CompareAndSwap succeeding, and the call to Unlock. The transitions for thread 2 are the same but with the roles of t1 and t2 reversed.
The entire behavior of the spinlock can now be written in temporal logic. First, the transition relation of the whole system is just to take an arbitrary possible transition from either thread:

next(s, s') := step₁(s, s') ∨ step₂(s, s')

where stepᵢ is the disjunction of thread i's three transitions.
The initial state is this one:

init(s) := s.locked = false ∧ s.t1 = pc0 ∧ s.t2 = pc0
Putting it together, a valid execution of the spinlock is one satisfying:

⌜init⌝ ∧ □⟨next⟩

Think about what □⟨next⟩ means.
Stutter steps
For technical reasons, it's necessary to actually include stutter steps, where the state machine does nothing, as part of the next predicate, by adding the disjunct s' = s alongside the thread transitions. If we didn't do this, consider what happens once the program terminated and both threads' transitions were impossible. Then ⟨next⟩ would be always false, and there would be no infinite traces such that □⟨next⟩ holds! Stutter steps solve this problem by allowing the final state to repeat forever.
Fairness
For an action a, define a state predicate enabled(a)(s) := ∃ s', a(s, s'). Intuitively, enabled(a) holds in a state if the action can run in that state. We'll commit a minor abuse of notation and use enabled(a) as a temporal formula, rather than the technically correct ⌜enabled(a)⌝, which is cumbersome to read.

For example, when is thread 1's successful-CompareAndSwap transition enabled? That is, what is a simplified expression for enabled of that action? Looking at the definition, it's possible for this transition to run in s if s.t1 = pc0 ∧ s.locked = false (this is the guard of this transition).
Define a notion called weak fairness (of an action a): WF(a) := □(□enabled(a) → ◇⟨a⟩).
Weak fairness is equivalent to the following: □◇(¬enabled(a)) ∨ □◇⟨a⟩.
We will generally assume that each transition of the state machine runs fairly. In the spinlock example, we might assume weak fairness of each thread's transitions; that is, the scheduler runs each thread, as long as it stays ready to run.
TODO: examples and exercises
Specifying liveness properties
Recall how we described system behavior for the spinlock example: we chose a state space for our state machine, wrote a transition relation next (one action for each transition), and wrote a predicate init to describe initial states. Then a valid trace of executing this transition system is one satisfying ⌜init⌝ ∧ □⟨next⟩.
We need fair traces for liveness to be achievable. Use assumptions of the form WF(a) or □◇⟨a⟩ (they're not the same!).
Generally we want to say that the system eventually does something good: a liveness specification of the form ◇⌜P⌝, or □(⌜P⌝ → ◇⌜Q⌝) for a recurring obligation.
Putting it together: (⌜init⌝ ∧ □⟨next⟩ ∧ fairness) → liveness.
Further reading
See The Temporal Logic of Actions by Leslie Lamport.