As some of you know, I'm a PhD candidate, which involves doing teaching assistant work. One of the classes I'm working with involves a bunch of model checking, based on this text here. As this (re-)uses a whole lot of material from logic, automata and algorithm design and analysis, this is something I found fairly interesting, despite not really knowing much about it.
Just today, I got asked about one of the assignments, which requires implementing a way of checking invariant properties of a state graph (definitions coming, I promise). This led to me demonstrating how you can do (some of) this in Haskell, and how concise, clear and simple it is. Given that the majority of my students haven't even heard of, much less seen, any Haskell (our university is a Microsoft and Java shop, in all the worst ways), this elicited oohs and ahs of amazement. Based on this, I decided I'd try and do (something resembling) what they are being asked to do, in a series of blog posts to fulfil my birthday condition. If anyone from my class is reading - I hope this helps you learn Haskell, but don't expect to copy my code and get away with it.
Without further ado, let's begin.
Some basic ideas
Model checking is a broad field involving formal verification of the properties of certain systems. A lot of the time, these systems involve hardware or software, but they don't have to. The process of model checking usually goes something like this:
- Construct a model of the system you want to check.
- Write a property describing the thing you want to reason about regarding this system.
- Check whether the model entails this property.
Each of these steps can be approached in a wide variety of ways, many of which are borrowed from other fields. In particular, automata and logic come up frequently, and given that we want this checking to be doable by machine, algorithms and data structures are also quite important.
In this particular case, we are dealing with a state graph. This is a general representation of a system which can change states - that is, something where, over time, given certain actions, different effects can occur. More formally, a state graph is a 5-tuple (S, I, E, P, L) where
- S is a finite set of states
- I is a subset of initial states of S
- E is a set of edges from states to other states
- P is a finite set of propositions
- L is a labelling of each state in S with some (possibly empty) set of propositions in P
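To make this a little more concrete, here's one way this definition could be rendered in Haskell. This is purely my own sketch - the type and field names are invented, lists are used for simplicity, and the later parts of the series will build the real thing:

data StateGraph s a = StateGraph
  { states    :: [s]      -- S, the finite set of states
  , initials  :: [s]      -- I, a subset of the states
  , edges     :: [(s, s)] -- E, directed edges between states
  , props     :: [a]      -- P, the finite set of propositions
  , labelling :: s -> [a] -- L, the (possibly empty) label set of each state
  }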
For a (visual) example, consider the diagram below:
This state graph has four states, each labelled with its set of propositions. There's one initial state (the one with the arrow coming into it 'from nowhere'), which happens to be labelled with the empty set. Arrows from states to states indicate edges, which have a direction indicated by the arrowhead. This state graph represents a process which may be in one of four different 'states', with different effects in each one, represented by the labelling.
Properties of state graphs are stated in terms of the elements of P, and can be either true or false. For the state graph above, 'at some point, p in every state' is a property (which happens to be true); 'at any point, q will be true in the next state' is also a property (which happens to be false). Given enough propositions, and a complex enough state graph, we can describe some extremely sophisticated properties. Additionally, if we can state these properties in a certain way, depending on the property, we can mechanically check if a given state graph possesses that property or not. This is essential, as in any real case, state graphs can be incredibly large (millions of states isn't unusual), and the resulting properties can be extremely complex.
In the context of this assignment, the properties of interest are invariants, which are properties of a 'global' nature for the entire state graph. An example of an invariant for the state graph above would be 'p is true in every state' (which is, of course, false); another would be 'in any state, if p is true, q is false' (also false, as it happens). Invariants are the simplest properties, both in terms of checking and description: for the former, it's enough to check that the property holds in each state individually (using some kind of graph traversal); for the latter, we can use propositional logic as our description language (which also makes checking easy). The assignment specifically asks for the ability to:
- Take a description of a state graph and an invariant property as input.
- Construct both.
- Check if the given invariant property holds.
For this part of the series, I will describe propositional logic, write a simple interpreter for it, and then massively over-engineer it. Because without that last step, this would be a highly dull blog post. We'll continue the over-engineering in the second part, where we veer dangerously into compiler design territory; then in the third and fourth parts, we'll design the rest.
Propositional logic for the masses
First-off, a disclaimer: the following description is not, in any way, shape or form, meant to be a definitive one. Much more thorough treatments of this topic can be found elsewhere, and even Wikipedia is OK for detail purposes. I'm only going to give enough information to explain what I'm about to do and why we're writing the Haskell we're writing. If you happen to be a logic expert, this will probably be laughably simplistic for you (and you could likely teach me this course) - consider yourselves warned.
Propositional logic is basically the simplest kind of logic we can have that's still interesting. The 'objects' that we work with in propositional logic are propositions - statements which can either be true or false. 'It is sunny outside' is an example of a proposition; another one is 'Yesterday I ate a hamburger for breakfast'. To keep the notation simple and precise, we usually refer to propositions using symbols p1, p2 and so on. At any given time, we're only going to need a finite number of these, which we'll collect together into a big set P, which we can think of as the 'universe' of things we currently care about.
Now, if we could only deal with propositions alone, we wouldn't have a terribly interesting system. Thus, we also have operators, which can take propositions and construct other propositions. We can talk about the arity of an operator, which refers to how many propositions it 'consumes'; in our case, we have only unary ('consumes' one) and binary ('consumes' two) operators. In particular, we have the unary operator ∼ (pronounced 'not'), and the following binary operators:
- ∧ ('and')
- ∨ ('or')
- → ('if')
- ↔ ('iff', short for 'if and only if')
We write 'not' prefix (thus, before whatever it's operating on), and all the binary operators infix. To avoid ambiguity, we'll also use parentheses for grouping, just like with algebra. Our operators can nest inside each other arbitrarily - p1 ∧ (p2 ∨ p3) is fine.
Obviously though, we can't just put together propositions and operators any old how - we have to have some rules. These are called the syntax of propositional logic; anything which conforms to them is called a well-formed formula (or wff, pronounced 'woof'). Typically, if we're talking about arbitrary wffs, we'll call them Φ and Ψ, possibly with subscripts. To spell out our syntax precisely, we have the following rules:
- Any proposition p on its own is a wff.
- If Φ is a wff, then (∼Φ) is a wff.
- If Φ and Ψ are both wffs, then (Φ ∧ Ψ), (Φ ∨ Ψ), (Φ → Ψ) and (Φ ↔ Ψ) are all wffs.
- Nothing else is a wff.
We'll drop parentheses for 'obvious' cases in writing, but you can assume that they're all there. Based on the rules above, we can represent a wff as a tree - see the example below for the tree corresponding to p1 ∧ (p2 ∨ p3).
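In ASCII form, that tree looks something like this:

     ∧
    / \
  p1   ∨
      / \
    p2   p3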
We call such a tree a syntax tree.
So far, everything we've talked about has been to do with form - we haven't given any meaning to anything yet. In the context of a logic, meaning is to do with truth - and in propositional logic, truth is binary. We define two symbols, representing truth and falsehood respectively; in our case, they'll be True and False. Every wff is one or the other - determining which is the job of our semantics, which is what gives meaning to our logic. Before we can talk about the meaning of operators, let's first think about propositions. In order for us to have meaning in our propositions, we have to specify what the truth value of each proposition in P is. We call this an interpretation, and we can think of it as a function that takes propositions and emits True or False.
Armed with some interpretation I, we can now define the truth value of any wff.
- If a wff is a single proposition p, its truth value is I(p).
- If a wff has the form (∼Φ), its truth value is True if the truth value of Φ is False; otherwise, its truth value is False.
- If a wff has the form (Φ ∧ Ψ), its truth value is False unless the truth value of both Φ and Ψ is True, in which case it is True.
- If a wff has the form (Φ ∨ Ψ), its truth value is True unless the truth value of both Φ and Ψ is False, in which case it is False.
- If a wff has the form (Φ → Ψ), its truth value is True unless the truth value of Φ is True and Ψ is False; in that case, it's False.
- If a wff has the form (Φ ↔ Ψ), its truth value is True if the truth values of Φ and Ψ match; otherwise, it is False.
Looking at this from the point of view of syntax trees, you can think of this process as a 'collapsing' of the tree bottom-up, using I to process the leaves. We can then repeatedly apply the rules given here until we reach the root, which will give us the truth value of the whole formula. For example, if I(p1) = True, I(p2) = False and I(p3) = True, then p2 ∨ p3 collapses to True, and thus p1 ∧ (p2 ∨ p3) collapses to True as well.
How does this relate to invariant checking? Given that invariants are checked on a state-by-state basis, and that each state is labelled with a set of propositions, we can see each state as defining its own induced interpretation, in which any proposition it's labelled with is True, and everything else is False. Logicians call this the 'closed-world assumption' - essentially, only the things we say are true are actually true. Thus, we can 'spell out' our invariant as a propositional wff, then just wander over the entire state graph state-by-state and check that every induced interpretation assigns that wff the truth value True. Easy!
Propositional logic interpreter - the simple version
Let's start off with a basic version, following the definitions pretty much word-for-word.
data Wff a
  = Prop a
  | Not (Wff a)
  | And (Wff a) (Wff a)
  | Or (Wff a) (Wff a)
  | If (Wff a) (Wff a)
  | Iff (Wff a) (Wff a)
  deriving (Show)
The syntax defined above practically cries out for a Haskell-style algebraic data type to enforce its rules. Given that we haven't specified anything about what propositions are (we've named them, but that's about it), we'll leave it to the user to decide how to deal with them.
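For instance, the example from earlier, p1 ∧ (p2 ∨ p3), could be written like so (using Ints to stand in for propositions - my choice for illustration, nothing mandated):

-- p1 ∧ (p2 ∨ p3), with Ints as propositions
example :: Wff Int
example = And (Prop 1) (Or (Prop 2) (Prop 3))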
newtype Interpretation a = Interpretation
  { runInterpretation :: a -> Bool
  }
truthValue :: Interpretation a -> a -> Bool
truthValue = runInterpretation
Given that our propositions are allowed to be anything, an interpretation for us is some way to map whatever we're using to represent propositions into Bool. This 'newtype wrapper' gives us a way of changing the implementation without having to break anything, and also avoids exposing internals unless we want to.
Now we have all the pieces needed to write our interpreter.
interpret :: Interpretation a -> Wff a -> Bool
If we have a single proposition, we just ask our interpretation what to do:
interpret i (Prop x) = truthValue i x
For a 'not', we 'push' interpret to the wff being operated on, then flip the outgoing truth value:
interpret i (Not phi) = not . interpret i $ phi
'And's and 'or's are dealt with similarly - we 'push' interpret to their 'child' wffs, then combine the results using (&&) and (||) respectively:
interpret i (And phi psi) = interpret i phi && interpret i psi
interpret i (Or phi psi) = interpret i phi || interpret i psi
For 'if', we want to give back True except when the left 'branch' is True and the right branch is False:
interpret i (If phi psi) = (not . interpret i $ phi) || interpret i psi
Lastly, for 'if and only if', we want to check if the truth values match:
interpret i (Iff phi psi) = interpret i phi == interpret i psi
... and that's it. This interpreter and the associated data structures work just fine, and could be used as-is. The simplicity and ease of translation shown here are why many (including myself) love Haskell.
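As a quick demonstration, here's a throwaway interpretation of my own devising (treating the odd Ints as the true propositions) run against the example wff from before:

-- Exactly the odd propositions are true under this interpretation.
oddness :: Interpretation Int
oddness = Interpretation odd

-- interpret oddness example
-- => True, since p1 is odd, and so is p3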
However, these definitions are basic, and probably not as efficient as they could be. Let's see how they can be improved.
Interpretations shouldn't be functions
Our definition of an interpretation as (a wrapper around) an a -> Bool is extremely general, and could be used to define a lot of interesting behaviour. However, in our case, this isn't really needed, and in fact would cause complications, as we would have to create (potentially different) functions for each state in our state graph when doing our checks. Because every state is labelled with exactly those propositions that happen to be true there, we can just as easily have our Interpretation a simply be a collection of a values which are true at the moment.
Let's start with lists, as this is easy. First, we change our newtype appropriately:
newtype Interpretation a = Interpretation
  { runInterpretation :: [a]
  }
We also have to adjust truthValue to look inside the Interpretation's list; if it finds the second argument, then we give back True, but not otherwise. This requires us to introduce an Eq constraint as well.
truthValue :: (Eq a) => Interpretation a -> a -> Bool
truthValue i x = x `elem` runInterpretation i
To get everything to compile again, we also need to add an Eq constraint to interpret:
interpret :: (Eq a) => Interpretation a -> Wff a -> Bool
-- rest of interpret unchanged
Now, we don't have to use functions for Interpretations anymore. While we did add an extra constraint, limiting what we can use to represent propositions, this is going to be much simpler to work with. This also shows why having the newtype wrapper around our interpretation implementation helps; any changes we make to the implementation can be (mostly) invisible at the interface level.
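This also hints at how the invariant checking itself will eventually look: each state's label set is its induced interpretation, so checking an invariant is just a matter of interpreting the wff under every one of them. A minimal sketch (checkInvariant and the states-as-label-sets representation are my own, not the assignment's):

-- Check that a wff holds in every state, with each state given by its label set.
checkInvariant :: (Eq a) => Wff a -> [[a]] -> Bool
checkInvariant phi = all (\labels -> interpret (Interpretation labels) phi)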
Speeding up interpretations
Lists are certainly useful, but they're definitely not good for lookups.
Potentially, every time we want to call truthValue, we might have to scan the entire list before giving an answer! Additionally, if we have large wffs being used to check large state graphs, we're going to be doing this scanning over and over again, which is definitely not a good idea. Let's fix this decision by thinking about it a little more and choosing a better data structure.
If we think about how the model checking process will work in our application, we expect the wff to be built once, but interpreted many times. Additionally, we will have to make many more calls to truthValue than constructions of Interpretations: while the former will have to be done for every proposition that shows up in our wff, the latter will only be done once per state. This clearly calls for a structure optimized for lookups, which ideally doesn't constrain a too much. My suggestion would be something like HashSet from unordered-containers, as Hashable is an easy constraint to satisfy for anything we could realistically want to use for propositions, and lookups are fast.
The changes we need to make are quite minor:
import Data.Hashable
import qualified Data.HashSet as HS
-- Wff unchanged
newtype Interpretation a = Interpretation
  { runInterpretation :: HS.HashSet a
  }
truthValue :: (Eq a, Hashable a) => Interpretation a -> a -> Bool
truthValue i x = x `HS.member` runInterpretation i
interpret :: (Eq a, Hashable a) => Interpretation a -> Wff a -> Bool
-- rest of interpret unchanged
Although we could have used elem (as it's provided by the wonder-weapon that is Foldable, which HashSet is an instance of), it's much less efficient due to how HashSet is defined. We also need a Hashable constraint on truthValue and interpret, but, as we mentioned before, it's not a big ask for any realistic choice of representation.
'Compiling' wffs
So far, our changes have been simple, and we've kept the data type we use for building a wff and the data type we use for interpreting the wff the same. However, these two tasks are at cross purposes: the structure for building the wff should be expressive and easy to use when defining arbitrarily complex wffs, but the structure for interpreting should be as efficient as possible. Given this, we'd like to separate these two concerns, and define another type which is simpler, but can be interpreted more quickly. This is actually what compilers do - they take something that's 'easy' for humans (to write and understand) and convert it into something that's 'easy' for machines (to execute).
At the same time, when we do anything like this, we have to make sure that we don't accidentally change the semantics - you can always get the wrong answer infinitely fast, after all. To ensure that this doesn't happen, we're going to use the concept of logical equivalence. Essentially, if we have two wffs Φ and Ψ, we say that they are equivalent (written Φ ≡ Ψ) if they have the same truth value under any interpretation. This property means that we can always replace Φ with Ψ (and vice-versa) without changing the semantics, no matter the interpretation. There are many equivalences in propositional logic (exercises on these are a major reason why many people end up hating first-year logic classes, but I digress); we will use a few of them.
One way we can radically simplify things is to eliminate all our operators and replace them with a single one, from which any of the others can always be rebuilt. Thanks to Emil Post, we actually have two options: either ↑ (called the 'Sheffer stroke' or NAND) or ↓ (called the 'Quine dagger' or NOR). Although it really doesn't matter which one we go with, let's try the Sheffer stroke. Instead of rewriting the equivalences that make it possible, I'll defer to Wikipedia and let you look them up yourself.
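For the impatient, though, the ones we'll actually rely on below are the standard Sheffer-stroke identities, all of which follow from Φ ↑ Ψ being defined as ∼(Φ ∧ Ψ):

- ∼Φ ≡ Φ ↑ Φ
- Φ ∧ Ψ ≡ (Φ ↑ Ψ) ↑ (Φ ↑ Ψ)
- Φ ∨ Ψ ≡ (Φ ↑ Φ) ↑ (Ψ ↑ Ψ)
- Φ → Ψ ≡ Φ ↑ (Ψ ↑ Ψ)
- Φ ↔ Ψ ≡ (Φ ↑ Ψ) ↑ ((Φ ↑ Φ) ↑ (Ψ ↑ Ψ))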
Welcome back! Let's now implement a new data type to represent Sheffer-stroke-only wffs, as well as a function to interpret them.
data CWff a -- for 'compiled wff'
  = CProp a
  | Stroke (CWff a) (CWff a)
  deriving (Show)
interpret' :: (Eq a, Hashable a) => Interpretation a -> CWff a -> Bool
interpret' i (CProp x) = truthValue i x
interpret' i (Stroke phi psi) = not (interpret' i phi && interpret' i psi)
Wow, much simpler. However, one important step remains - we currently have no way of converting a Wff into a CWff. Something like this, perhaps:
compile :: Wff a -> CWff a
compile (Prop p) = CProp p
-- ∼Φ ≡ Φ ↑ Φ
compile (Not phi) = Stroke (compile phi) (compile phi)
-- Φ ∧ Ψ ≡ (Φ ↑ Ψ) ↑ (Φ ↑ Ψ)
compile (And phi psi) =
  Stroke
    (Stroke (compile phi) (compile psi))
    (Stroke (compile phi) (compile psi))
-- Φ ∨ Ψ ≡ (Φ ↑ Φ) ↑ (Ψ ↑ Ψ)
compile (Or phi psi) =
  Stroke
    (Stroke (compile phi) (compile phi))
    (Stroke (compile psi) (compile psi))
-- Φ → Ψ ≡ Φ ↑ (Ψ ↑ Ψ)
compile (If phi psi) = Stroke (compile phi) (Stroke (compile psi) (compile psi))
-- Φ ↔ Ψ ≡ (Φ ↑ Ψ) ↑ ((Φ ↑ Φ) ↑ (Ψ ↑ Ψ))
compile (Iff phi psi) =
  Stroke
    (Stroke (compile phi) (compile psi))
    (Stroke
      (Stroke (compile phi) (compile phi))
      (Stroke (compile psi) (compile psi)))
Eeek. While this certainly works, even in Haskell it's not a pretty sight, and writing this is thoroughly unpleasant. Despite the added simplicity in the data type, the compilation can massively inflate the size of the resulting structure (just look at all those nested Strokes if you have any doubts). While this is certainly easier to interpret in terms of the number of rules required, it's not looking like it'll be any faster.
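One thing worth pinning down before we continue: whatever a conversion like compile does to the structure, it has to preserve truth values under every interpretation - that's exactly the logical equivalence requirement from earlier. Here's a small helper of my own (not part of the assignment) that could be used to spot-check this, for example under QuickCheck:

-- Holds exactly when compiling phi doesn't change its truth value under i.
compileIsFaithful :: (Eq a, Hashable a) => Interpretation a -> Wff a -> Bool
compileIsFaithful i phi = interpret i phi == interpret' i (compile phi)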
Where to from here?
Our ridiculous over-engineering needs to continue in order to make all this worthwhile. Join us next time, when we get close to writing a compiler in the pursuit of Ever Greater Speed. Meanwhile, feel free, as some of my students did, to appreciate how straightforward Haskell makes writing such things. Think hard and have fun!