Haskell Magic: Model checking made simple, part 1

As some of you know, I'm a PhD candidate, which involves doing teaching assistant work. One of the classes I'm working with involves a bunch of model checking, based on this text here. As this (re-)uses a whole lot of material from logic, automata and algorithm design and analysis, this is something I found fairly interesting, despite not really knowing much about it.

Just today, I got asked about one of the assignments, which requires implementing a way of checking invariant properties of a state graph (definitions coming, I promise). This led to me demonstrating how you can do (some of) this in Haskell, and how concise, clear and simple it is. Given that the majority of my students haven't even heard of, much less seen, any Haskell (our university is a Microsoft and Java shop, in all the worst ways), this elicited oohs and ahs of amazement. Based on this, I decided I'd try and do (something resembling) what they are being asked to do, in a series of blog posts to fulfil my birthday condition. If anyone from my class is reading - I hope this helps you learn Haskell, but don't expect to copy my code and get away with it.

Without further ado, let's begin.

Some basic ideas

Model checking is a broad field involving formal verification of the properties of certain systems. A lot of the time, these systems involve hardware or software, but they don't have to. The process of model checking usually goes something like this:

  1. Construct a model of the system you want to check.
  2. Write a property describing the thing you want to reason about regarding this system.
  3. Check whether the model entails this property.

Each of these steps can be approached in a wide variety of ways, many of which are borrowed from other fields. In particular, automata and logic come up frequently, and given that we want this checking to be doable by machine, algorithms and data structures are also quite important.

In this particular case, we are dealing with a state graph. This is a general representation of a system which can change states - that is, something where, over time, given certain actions, different effects can occur. More formally, a state graph is a 5-tuple (S, I, E, P, L) where

  • S is a finite set of states
  • I is a subset of S, whose elements are the initial states
  • E is a set of edges from states to other states
  • P is a finite set of propositions
  • L is a labelling of each state in S with some (possibly empty) set of propositions in P
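
To fix ideas, here's a minimal sketch of how this definition might look in Haskell. The names (StateGraph, states, initial, edges, labels) are my own inventions, not anything from the assignment, and P doesn't appear explicitly - it's implicit in the proposition type p.

data StateGraph s p = StateGraph
  { states  :: [s]        -- S: the finite set of states
  , initial :: [s]        -- I: the initial states, a subset of S
  , edges   :: [(s, s)]   -- E: directed edges between states
  , labels  :: s -> [p]   -- L: the labelling of each state with propositions
  }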

For a (visual) example, consider the diagram below:

A diamond-shaped graph of circular nodes. The topmost node, labelled
'{}', has arrows going from it to a node on the left, labelled '{p}', and a node
on the right, labelled '{q}', creating a fork. Both the left node and the right
node have arrows going from them to a bottom node, labelled '{p,q}'. The bottom
node also has an arrow pointing back to the left node. Finally, an arrow points
from empty space above the topmost node to the topmost node.

This state graph has four states, each labelled with their set of labels. There's one initial state (the one with the arrow coming into it 'from nowhere'), which happens to be labelled with the empty set. Arrows from states to states indicate edges, which have a direction indicated by the arrowhead. This state graph represents a process which may be in one of four different 'states', with different effects in each one, represented by the labelling.
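
Using the hypothetical StateGraph sketch from above, this particular graph might be encoded like so (numbering the states 0 to 3 is my own choice):

example :: StateGraph Int String
example = StateGraph
  { states  = [0, 1, 2, 3]   -- 0 = top, 1 = left, 2 = right, 3 = bottom
  , initial = [0]            -- the state with the arrow 'from nowhere'
  , edges   = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 1)]
  , labels  = \s -> case s of
      1 -> ["p"]
      2 -> ["q"]
      3 -> ["p", "q"]
      _ -> []                -- the top state gets the empty set
  }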

Properties of state graphs are stated in terms of the elements of P, and can be either true or false. For the state graph above, 'at some point, p will be true' is a property (which happens to be true); 'at any point, q will be true in the next state' is also a property (which happens to be false). Given enough propositions, and a complex enough state graph, we can describe some extremely sophisticated properties. Additionally, if we can state these properties in a certain way, depending on the property, we can mechanically check if a given state graph possesses that property or not. This is essential, as in most real cases, state graphs can be incredibly large (millions of states isn't unusual), and the resulting properties can be extremely complex.

In the context of this assignment, the properties of interest are invariants, which are properties of a 'global' nature for the entire state graph. An example of an invariant for the state graph above would be 'p is true in every state' (which is, of course, false); another would be 'in any state, if p is true, q is false' (also false, as it happens). Invariants are the simplest properties, both in terms of checking and description: for the former, it's enough to check that the property holds in each state individually (using some kind of graph traversal); for the latter, we can use propositional logic as our description language (which also makes checking easy). The assignment specifically asks for the ability to:

  1. Take a description of a state graph and an invariant property as input.
  2. Construct both.
  3. Check if the given invariant property holds.

For this part of the series, I will describe propositional logic, write a simple interpreter for it, and then massively over-engineer it. Because without that last step, this would be a highly dull blog post. We'll continue the over-engineering in the second part, where we veer dangerously into compiler design territory; then in the third and fourth parts, we'll design the rest.

Propositional logic for the masses

First off, a disclaimer: the following description is not, in any way, shape or form, meant to be a definitive one. Much more thorough treatments of this topic can be found elsewhere, and even Wikipedia is OK for detail purposes. I'm only going to give enough information to explain what I'm about to do and why we're writing the Haskell we're writing. If you happen to be a logic expert, this will probably be laughably simplistic for you (and you could, in all likelihood, teach me this course) - consider yourselves warned.

Propositional logic is basically the simplest kind of logic we can have that's still interesting. The 'objects' that we work with in propositional logic are propositions - statements which can either be true or false. 'It is sunny outside' is an example of a proposition; another one is 'Yesterday I ate a hamburger for breakfast'. To keep the notation simple and precise, we usually refer to propositions using symbols p1, p2 and so on. At any given time, we're only going to need a finite number of these, which we'll collect together into a big set P, which we can think of as the 'universe' of things we currently care about.

Now, if we could only deal with propositions alone, we wouldn't have a terribly interesting system. Thus, we also have operators, which can take propositions and construct other propositions. We can talk about the arity of an operator, which refers to how many propositions it 'consumes'; in our case, we have only unary ('consumes' one) and binary ('consumes' two) operators. In particular, we have the unary operator ∼ (pronounced 'not'), and the following binary operators:

  • ∧ ('and')
  • ∨ ('or')
  • → ('if')
  • ↔ ('iff', short for 'if and only if')

We write 'not' prefix (thus, before whatever it's operating on), and all the binary operators infix. To avoid ambiguity, we'll also use parentheses for grouping, just like with algebra. Our operators can nest inside each other arbitrarily - p1 ∧ (p2 ∨ p3) is fine.

Obviously though, we can't just put together propositions and operators any old how - we have to have some rules. These are called the syntax of propositional logic; anything which conforms to them is called a well-formed formula (or wff, pronounced 'woof'). Typically, if we're talking about arbitrary wffs, we'll call them Φ and Ψ, possibly with subscripts. To spell out our syntax precisely, we have the following rules:

  1. Any proposition p on its own is a wff.
  2. If Φ is a wff, then (∼Φ) is a wff.
  3. If Φ and Ψ are both wffs, then (Φ ∧ Ψ), (Φ ∨ Ψ), (Φ → Ψ) and (Φ ↔ Ψ) are all wffs.
  4. Nothing else is a wff.

We'll drop parentheses for 'obvious' cases in writing, but you can assume that they're all there. Based on the rules above, we can represent a wff as a tree - see the example below for the tree corresponding to p1 ∧ (p2 ∨ p3).

A binary tree, whose root is labelled '∧', with two children. Its left 
child is a leaf labelled 'p1'. Its right child is another binary tree, whose 
root is labelled '∨'; the left child of that subtree is a leaf labelled 'p2', 
and the right child of that same subtree is a leaf labelled 'p3'.

We call such a tree a syntax tree.

So far, everything we've talked about has been to do with form - we haven't given any meaning to anything yet. In the context of a logic, meaning is to do with truth - and in propositional logic, truth is binary. We define two symbols, representing truth and falsehood respectively; in our case, they'll be True and False. Every wff is one or the other - to determine which is the job of our semantics, which is what gives meaning to our logic. Before we can talk about the meaning of operators, let's first think about propositions. In order for us to have meaning in our propositions, we have to specify what the truth value of each proposition in P is. We call this an interpretation, and we can think of it as a function that takes propositions and emits True or False.

Armed with some interpretation I, we can now define the truth value of any wff.

  1. If a wff is a single proposition p, its truth value is I(p).
  2. If a wff has the form (∼Φ), its truth value is True if the truth value of Φ is False; otherwise, its truth value is False.
  3. If a wff has the form (Φ ∧ Ψ), its truth value is False unless the truth value of both Φ and Ψ is True, in which case it is True.
  4. If a wff has the form (Φ ∨ Ψ), its truth value is True unless the truth value of both Φ and Ψ is False, in which case it is False.
  5. If a wff has the form (Φ → Ψ), its truth value is True unless the truth value of Φ is True and Ψ is False; in that case, it's False.
  6. If a wff has the form (Φ ↔ Ψ), its truth value is True if the truth values of Φ and Ψ match; otherwise, it is False.

Looking at this from the point of view of syntax trees, you can think of this process as a 'collapsing' of the tree bottom-up using I to process the leaves. We can then repeatedly apply the rules given here until we reach the root, which will give us the truth value of the whole formula.
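
For example, take p1 ∧ (p2 ∨ p3) with an interpretation I where I(p1) = True, I(p2) = False and I(p3) = True. The leaves collapse to True, False and True; the ∨ node then collapses to (False ∨ True), which is True; finally, the ∧ node at the root collapses to (True ∧ True), giving True for the whole formula.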

How does this relate to invariant checking? Given that invariants are checked on a state-by-state basis, and that each state is labelled with a set of propositions, we can see each state as defining its own induced interpretation in which any proposition it's labelled with is True, and everything else is False. Logicians call this the 'closed-world assumption' - essentially, only the things we say are true are actually true. Thus, we can 'spell out' our invariant as a propositional wff, then just wander over the entire state graph state-by-state and check that every induced interpretation assigns that wff the truth value True. Easy!

Propositional logic interpreter - the simple version

Let's start off with a basic version, following the definitions pretty much word-for-word.

data Wff a
  = Prop a
  | Not (Wff a)
  | And (Wff a) (Wff a)
  | Or (Wff a) (Wff a)
  | If (Wff a) (Wff a)
  | Iff (Wff a) (Wff a)
  deriving (Show)

The syntax defined above practically cries out for a Haskell-style algebraic data type to enforce its rules. Given that we haven't specified anything about what propositions are (we've named them, but that's about it), we'll leave it to the user to decide how to deal with them.

newtype Interpretation a = Interpretation 
  { runInterpretation :: a -> Bool 
  }

truthValue :: Interpretation a -> a -> Bool
truthValue = runInterpretation

Given that our propositions are allowed to be anything, an interpretation for us is some way to map whatever we're using to represent propositions into Bool. This 'newtype wrapper' gives us a way of changing the implementation without having to break anything, and also avoids exposing internals unless we want to.

Now we have all the pieces needed to write our interpreter.

interpret :: Interpretation a -> Wff a -> Bool

If we have a single proposition, we just ask our interpretation what to do:

interpret i (Prop x) = truthValue i x

For a 'not', we 'push' interpret to the wff being operated on, then flip the outgoing truth value:

interpret i (Not phi) = not . interpret i $ phi

'And's and 'or's are dealt with similarly - we 'push' interpret to their 'child' wffs, then combine the results using (&&) and (||) respectively:

interpret i (And phi psi) = interpret i phi && interpret i psi
interpret i (Or phi psi) = interpret i phi || interpret i psi

For 'if', we want to give back True except when the left 'branch' is True and the right branch is False:

interpret i (If phi psi) = (not . interpret i $ phi) || interpret i psi

Lastly, for 'if and only if', we want to check if the truth values match:

interpret i (Iff phi psi) = interpret i phi == interpret i psi

... and that's it. This interpreter and the associated data structures work just fine, and could be used as-is. The simplicity and ease of translation shown here are why many (including myself) love Haskell.
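
To see it in action, here's a quick made-up example (the name demo is mine): the induced interpretation of the '{p,q}' state from our earlier graph, using Strings as propositions.

demo :: Bool
demo = interpret stateI wff  -- evaluates to True
  where
    -- the induced interpretation of the '{p,q}' state: p and q are True
    stateI = Interpretation (`elem` ["p", "q"])
    -- 'p and (q or not r)', written out as a Wff
    wff = And (Prop "p") (Or (Prop "q") (Not (Prop "r")))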

However, these definitions are basic, and probably not as efficient as they could be. Let's see how they can be improved.

Interpretations shouldn't be functions

Our definition of an interpretation as (a wrapper around) an a -> Bool is extremely general, and could be used to define a lot of interesting behaviour. However, in our case, this isn't really needed, and in fact would cause complications, as we would have to create (potentially different) functions for each state in our state graph when doing our checks. Because every state is labelled with exactly those propositions that happen to be true there, we can just as easily have our Interpretation a simply be a collection of as which are true at the moment.

Let's start with lists, as this is easy. First, we change our newtype appropriately:

newtype Interpretation a = Interpretation
  { runInterpretation :: [a] 
  }

We also have to adjust truthValue to look inside the Interpretation's list; if it finds the second argument, then we give back True, but not otherwise. This requires us to introduce an Eq constraint as well.

truthValue :: (Eq a) => Interpretation a -> a -> Bool
truthValue i x = x `elem` runInterpretation i

To get everything to compile again, we also need to add an Eq constraint to interpret:

interpret :: (Eq a) => Interpretation a -> Wff a -> Bool
-- rest of interpret unchanged

Now, we don't have to use functions for Interpretations anymore. While we did add an extra constraint, limiting what we can use to represent propositions, this is going to be much simpler to work with. This also shows why having the newtype wrapper around our interpretation implementation helps; any changes we make to the implementation can be (mostly) invisible at the interface level.
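
As a bonus, with the list representation, the induced interpretation of a state is literally just its label list. Using the hypothetical StateGraph sketch from earlier (holdsEverywhere is also my own name), the whole invariant check could look something like this:

-- Sketch only: StateGraph and labels are the hypothetical record from earlier.
-- A state's label list is exactly its induced interpretation.
holdsEverywhere :: (Eq p) => StateGraph s p -> Wff p -> Bool
holdsEverywhere g wff =
  all (\s -> interpret (Interpretation (labels g s)) wff) (states g)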

Speeding up interpretations

Lists are certainly useful, but they're definitely not good for lookups. Potentially, every time we want to call truthValue, we might have to scan the entire list before giving an answer! Additionally, if we have large wffs being used to check large state graphs, we're going to be doing this scanning over and over again, which is definitely not a good idea. Let's revisit this decision, think about it a little more, and choose a better data structure.

If we think about how the model checking process will work in our application, we expect the wff to be built once, but interpreted many times. Additionally, we will have to make a lot more calls to truthValue than constructions of Interpretations: while the former will have to be done for every proposition that shows up in our wff, the latter will only be done once per state. This clearly requires a structure optimized for lookups, which ideally doesn't constrain a too much. My suggestion would be something like HashSet from unordered-containers, as Hashable is an easy constraint to satisfy for anything we could realistically want to use for propositions, and lookups are fast.

The changes we need to make are quite minor:

import Data.Hashable

import qualified Data.HashSet as HS

-- Wff unchanged

newtype Interpretation a = Interpretation
  { runInterpretation :: HS.HashSet a
  }

truthValue :: (Eq a, Hashable a) => Interpretation a -> a -> Bool
truthValue i x = x `HS.member` runInterpretation i

interpret :: (Eq a, Hashable a) => Interpretation a -> Wff a -> Bool
-- rest of interpret unchanged

Although we could have used elem (as it's provided by the wonder-weapon that is Foldable, which HashSet is an instance of), it's much less efficient due to how HashSet is defined. We also need a Hashable constraint on truthValue and interpret, but, as we mentioned before, it's not a big ask for any realistic choice of representation.
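
Building the induced interpretation for a state is still a one-liner - something like Interpretation (HS.fromList (labels g s)), if we're using the hypothetical StateGraph from earlier - and now we pay the cost of hashing each label once per state, rather than scanning a list once per lookup.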

'Compiling' wffs

So far, our changes have been simple, and we've kept the data type we use for building a wff and the data type we use for interpreting the wff the same. However, these two tasks are at cross purposes: the structure for building the wff should be expressive and easy to use when defining arbitrarily complex wffs, but the structure for interpreting should be as efficient as possible. Given this, we'd like to separate these two concerns, and define another type which is simpler, but can be interpreted more quickly. This is actually what compilers do - they take something that's 'easy' for humans (to write and understand) and convert it into something that's 'easy' for machines (to execute).

At the same time, when we do anything like this, we have to make sure that we don't accidentally change the semantics - you can always get the wrong answer infinitely fast, after all. To ensure that this doesn't happen, we're going to use the concept of logical equivalence. Essentially, if we have two wffs Φ and Ψ, we say that they are equivalent (which is written Φ ≡ Ψ) if they have the same truth value under any interpretation. This property means that we can always replace Φ with Ψ (and vice-versa) without changing the semantics, no matter the interpretation. There are many equivalences in propositional logic (exercises on which are a major reason why many people end up hating first-year logic classes, but I digress); we will use a few of them. For example, (Φ → Ψ) ≡ ((∼Φ) ∨ Ψ) - we already exploited this one when writing the If case of interpret.

One way we can radically simplify things is to eliminate all our operators and replace them with a single one, from which each of the others can always be rebuilt. Thanks to Emil Post, we actually have two options: either ↑ (called the 'Sheffer stroke' or NAND) or ↓ (called the 'Quine dagger' or NOR). Although it really doesn't matter which one we go with, let's try the Sheffer stroke; the one definition we do need here is that (Φ ↑ Ψ) ≡ (∼(Φ ∧ Ψ)) - that is, it's True except when both its operands are True. As for the equivalences that let us rebuild everything else, I'll defer to Wikipedia and let you look them up yourself.

Welcome back! Let's now implement a new data type to represent Sheffer-stroke-only wffs, as well as a function to interpret them.

data CWff a -- for 'compiled wff'
  = CProp a
  | Stroke (CWff a) (CWff a)
  deriving (Show)

interpret' :: (Eq a, Hashable a) => Interpretation a -> CWff a -> Bool
interpret' i (CProp x) = truthValue i x
interpret' i (Stroke phi psi) = not (interpret' i phi && interpret' i psi)

Wow, much simpler. However, one important step remains - we currently have no way of converting a Wff into a CWff. Something like this, perhaps:

compile :: Wff a -> CWff a
compile (Prop p) = CProp p
compile (Not phi) = Stroke (compile phi) (compile phi)
compile (And phi psi) =
  Stroke
    (Stroke (compile phi) (compile psi))
    (Stroke (compile phi) (compile psi))
compile (Or phi psi) =
  Stroke
    (Stroke (compile phi) (compile phi))
    (Stroke (compile psi) (compile psi))
compile (If phi psi) = Stroke (compile phi) (Stroke (compile psi) (compile psi))
compile (Iff phi psi) =
  Stroke
    (Stroke (compile phi) (compile psi))
    (Stroke
      (Stroke (compile phi) (compile phi))
      (Stroke (compile psi) (compile psi)))

Eeek. While this certainly works, even in Haskell it's not a pretty sight, and writing this is thoroughly unpleasant. Despite the added simplicity in the data type, the compilation can massively inflate the size of the resulting structure: the And case alone duplicates both recursive results, so nested conjunctions double in size at every level (just look at all those nested Strokes if you have any doubts). While this is certainly easier to interpret in terms of the number of rules required, it's not looking like it'll be any faster.
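
One reassuring note before we finish: because every rewrite above is a known equivalence, compilation should preserve semantics, and we can state that as a property (agreesOn is my own name, not part of the assignment) - handy as a QuickCheck target, or just for spot-checking in GHCi.

-- Sanity check: for any interpretation, the compiled wff must have the
-- same truth value as the original.
agreesOn :: (Eq a, Hashable a) => Interpretation a -> Wff a -> Bool
agreesOn i wff = interpret i wff == interpret' i (compile wff)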

Where to from here?

Our ridiculous over-engineering needs to continue in order to make all this worthwhile. Join us next time, when we get close to writing a compiler in the pursuit of Ever Greater Speed. Meanwhile, feel free, as some of my students did, to appreciate how straightforward Haskell makes writing such things. Think hard and have fun!