As some of you know, I'm a PhD candidate, which involves doing teaching assistant work. One of the classes I'm working with involves a bunch of model checking, based on this text here. As this (re-)uses a whole lot of material from logic, automata and algorithm design and analysis, this is something I found fairly interesting, despite not really knowing much about it.

Just today, I got asked about one of the assignments, which requires
implementing a way of checking invariant properties of a state graph
(definitions coming, I promise). This led to me demonstrating how you can do
(some of) this in Haskell, and how concise, clear and simple it is. Given that
the majority of my students haven't even *heard of*, much less seen, any Haskell
(our university is a Microsoft and Java shop, in all the worst ways), this
elicited oohs and ahs of amazement. Based on this, I decided I'd try and do
(something resembling) what they are being asked to do, in a series of blog
posts to fulfil my birthday condition. If anyone from my class is reading - I
hope this helps you learn Haskell, but don't expect to copy my code and get away
with it.

Without further ado, let's begin.

## Some basic ideas

*Model checking* is a broad field involving formal verification of the
properties of certain systems. A lot of the time, these systems involve hardware
or software, but doesn't have to. The process of model checking usually goes
something like this:

- Construct a *model* of the system you want to check.
- Write a *property* describing the thing you want to reason about regarding this system.
- *Check* it to see if the model entails this property.

Each of these steps can be approached in a wide variety of ways, many of which are borrowed from other fields. In particular, automata and logic come up frequently, and given that we want this checking to be doable by machine, algorithms and data structures are also quite important.

In this particular case, we are dealing with a *state graph*. This is a general
representation of a system which can change states - that is, something
where, over time, given certain actions, different effects can occur. More
formally, a state graph is a 5-tuple *(S, I, E, P, L)* where

- *S* is a finite set of *states*
- *I* is a subset of *S*, the *initial states*
- *E* is a set of *edges* from states to other states
- *P* is a finite set of *propositions*
- *L* is a *labelling* of each state in *S* with some (possibly empty) set of propositions in *P*

For a (visual) example, consider the diagram below:

This state graph has four states, each labelled with its set of propositions. There's one initial state (the one with the arrow coming into it 'from nowhere'), which happens to be labelled with the empty set. Arrows from state to state indicate edges, whose direction is indicated by the arrowhead. This state graph represents a process which may be in one of four different 'states', with different effects in each one, represented by the labelling.
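To make the 5-tuple definition concrete, here's one way it could be rendered as a Haskell record. This is only a sketch of my own devising - the field names are hypothetical, and the assignment may well call for a different representation:

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- A state graph (S, I, E, P, L) over state type s and proposition type p.
data StateGraph s p = StateGraph
  { states       :: Set.Set s             -- S: a finite set of states
  , initials     :: Set.Set s             -- I: the initial states, a subset of S
  , edges        :: Set.Set (s, s)        -- E: directed edges between states
  , propositions :: Set.Set p             -- P: a finite set of propositions
  , labelling    :: Map.Map s (Set.Set p) -- L: each state's (possibly empty) label set
  }
```

A graph with millions of states would obviously want a more compact encoding, but this mirrors the mathematical definition directly.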

Properties of state graphs are stated in terms of the elements of *P*, and can
be either true or false. For the state graph above, 'at some point, *p* will be
true in every state' is a property (which happens to be true); 'at any point, *q* will be true
in the next state' is also a property (which happens to be false). Given enough
propositions, and a complex enough state graph, we can describe some extremely
sophisticated properties. Additionally, if we can state these properties in a
certain way, depending on the property, we can mechanically check if a given
state graph possesses that property or not. This is essential, as in any real
cases, state graphs can be incredibly large (millions of states isn't unusual),
and the resulting properties can be extremely complex.

In the context of this assignment, the properties of interest are *invariants*,
which are properties of a 'global' nature for the entire state graph. An example
of an invariant for the state graph above would be '*p* is true in every state'
(which is, of course, false); another would be 'in any state, if *p* is true,
*q* is false' (also false, as it happens). Invariants are the simplest
properties, both in terms of checking *and* description: for the former, it's
enough to check that the property holds in each state individually (using some
kind of graph traversal); for the latter, we can use propositional logic as our
description language (which also makes checking easy). The assignment
specifically asks for the ability to:

- Take a description of a state graph and an invariant property as input.
- Construct both.
- Check if the given invariant property holds.

For this part of the series, I will describe propositional logic, write a simple
interpreter for it, and then *massively over-engineer* it. Because without that
last step, this would be a highly dull blog post. We'll continue the
over-engineering in the second part, where we veer dangerously into compiler
design territory; then in the third and fourth parts, we'll design the rest.

## Propositional logic for the masses

First off, a disclaimer: the following description is not, in any way, shape or form, meant to be a definitive one. Much more thorough treatments of this topic can be found elsewhere, and even Wikipedia is OK for detail purposes. I'm only going to give enough information to explain what I'm about to do and why we're writing the Haskell we're writing. If you happen to be a logic expert, this will probably be laughably simplistic for you (and you could likely teach this course better than me) - consider yourselves warned.

Propositional logic is basically the simplest kind of logic we can have that's
still interesting. The 'objects' that we work with in propositional logic are
*propositions* - statements which can either be true or false. 'It is sunny
outside' is an example of a proposition; another one is 'Yesterday I ate a
hamburger for breakfast'. To keep the notation simple and precise, we usually
refer to propositions using symbols *p₁*, *p₂* and so on. At any given time,
we're only going to need a finite number of these, which we'll collect together
into a big set *P*, which we can think of as the 'universe' of things we currently care about.

Now, if we could only deal with propositions alone, we wouldn't have a terribly
interesting system. Thus, we also have *operators*, which can take propositions
and construct other propositions. We can talk about the *arity* of an operator,
which refers to how many propositions it 'consumes'; in our case, we have only
*unary* ('consumes' one) and *binary* ('consumes' two) operators. In particular,
we have the unary operator ∼ (pronounced 'not'), and the following binary
operators:

- ∧ ('and')
- ∨ ('or')
- → ('if')
- ↔ ('iff', short for 'if and only if')

We write 'not' prefix (thus, before whatever it's operating on), and all the
binary operators infix. To avoid ambiguity, we'll also use parentheses for
grouping, just like with algebra. Our operators can nest inside each other
as much as we want - *p₁* ∧ (*p₂* ∨ *p₃*) is fine.

Obviously though, we can't just put together propositions and operators any old
how - we have to have some rules. These are called the *syntax* of propositional
logic; anything which conforms to them is called a *well-formed formula* (or
*wff*, pronounced 'woof'). Typically, if we're talking about arbitrary wffs,
we'll call them Φ and Ψ, possibly with subscripts. To spell out our syntax
precisely, we have the following rules:

- Any proposition *p* on its own is a wff.
- If Φ is a wff, then (∼Φ) is a wff.
- If Φ and Ψ are both wffs, then (Φ ∧ Ψ), (Φ ∨ Ψ), (Φ → Ψ) and (Φ ↔ Ψ) are all wffs.
- Nothing else is a wff.

We'll drop parentheses for 'obvious' cases in writing, but you can assume that
they're all there. Based on the rules above, we can represent a wff as a tree -
see the example below for the tree corresponding to *p₁* ∧ (*p₂* ∨ *p₃*).
We call such a tree a *syntax tree*.

So far, everything we've talked about has been to do with *form* - we haven't
given any meaning to anything yet. In the context of a *logic*, meaning is to do
with truth - and in propositional logic, truth is binary. We define two symbols,
representing truth and falsehood respectively; in our case, they'll be `True`
and `False`. *Every* wff is one or the other - determining which is the job
of our *semantics*, which is what gives meaning to our logic. Before we can talk
about the meaning of operators, let's first think about propositions. In order
for us to have meaning in our propositions, we have to specify what the truth
value of each proposition in *P* is. We call this an *interpretation*, and we
can think of it as a function that takes propositions and emits `True` or `False`.

Armed with some interpretation *I*, we can now define the *truth value* of any
wff.

- If a wff is a single proposition *p*, its truth value is *I(p)*.
- If a wff has the form (∼Φ), its truth value is `True` if the truth value of Φ is `False`; otherwise, its truth value is `False`.
- If a wff has the form (Φ ∧ Ψ), its truth value is `False` unless the truth value of *both* Φ and Ψ is `True`, in which case it is `True`.
- If a wff has the form (Φ ∨ Ψ), its truth value is `True` unless the truth value of *both* Φ and Ψ is `False`, in which case it is `False`.
- If a wff has the form (Φ → Ψ), its truth value is `True` unless the truth value of Φ is `True` and Ψ is `False`; in that case, it's `False`.
- If a wff has the form (Φ ↔ Ψ), its truth value is `True` if the truth values of Φ and Ψ match; otherwise, it is `False`.

Looking at this from the point of view of syntax trees, you can think of this
process as a 'collapsing' of the tree bottom-up using *I* to process the leaves.
We can then repeatedly apply the rules given here until we reach the root, which
will give us the truth value of the whole formula.
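To see this collapsing in action, here's the earlier example formula evaluated by hand under one particular interpretation (chosen arbitrarily for illustration), written out as plain Boolean arithmetic:

```haskell
-- Evaluating p1 ∧ (p2 ∨ p3) under an interpretation I where
-- I(p1) = True, I(p2) = False, I(p3) = True.
p1, p2, p3 :: Bool
p1 = True
p2 = False
p3 = True

-- The leaves become their truth values, then each operator node collapses
-- bottom-up: (False || True) is True, then (True && True) is True.
result :: Bool
result = p1 && (p2 || p3)
```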

How does this relate to invariant checking? Given that invariants are checked on
a state-by-state basis, and that each state is labelled with a set of
propositions, we can see each state as defining its own *induced interpretation*,
in which any proposition it's labelled with is `True`, and everything else is
`False`. Logicians call this the 'closed-world assumption' - essentially, only
the things we *say* are true are actually true. Thus, we can 'spell out' our
invariant as a propositional wff, then just wander over the entire state graph
state-by-state and check that every induced interpretation assigns that wff the
truth value `True`. Easy!
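The induced interpretation is simple enough to sketch right away. The names here (`induced`, `invariantHolds`) are my own placeholders, and lists stand in for whatever representation we end up with later:

```haskell
-- Closed-world induced interpretation: a proposition is True exactly
-- when it appears in the state's label set.
induced :: Eq a => [a] -> (a -> Bool)
induced labelSet = (`elem` labelSet)

-- An invariant holds iff it holds in every state; 'evalWff' is some
-- (not-yet-written) evaluation of the wff under one state's interpretation.
invariantHolds :: (interp -> Bool) -> [interp] -> Bool
invariantHolds evalWff = all evalWff
```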

## Propositional logic interpreter - the simple version

Let's start off with a basic version, following the definitions pretty much word-for-word.

```
data Wff a
  = Prop a
  | Not (Wff a)
  | And (Wff a) (Wff a)
  | Or (Wff a) (Wff a)
  | If (Wff a) (Wff a)
  | Iff (Wff a) (Wff a)
  deriving (Show)
```

The syntax defined above practically cries out for a Haskell-style algebraic
data type to enforce its rules. Given that we haven't specified anything about
what propositions *are* (we've named them, but that's about it), we'll leave it
to the user to decide how to deal with them.
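For instance, the example formula *p₁* ∧ (*p₂* ∨ *p₃*) from earlier translates directly, here using `Int` as the proposition type (the type is repeated so the snippet stands alone):

```haskell
data Wff a
  = Prop a
  | Not (Wff a)
  | And (Wff a) (Wff a)
  | Or (Wff a) (Wff a)
  | If (Wff a) (Wff a)
  | Iff (Wff a) (Wff a)
  deriving (Show)

-- p1 ∧ (p2 ∨ p3): the value is literally the syntax tree
example :: Wff Int
example = And (Prop 1) (Or (Prop 2) (Prop 3))
```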

```
newtype Interpretation a = Interpretation
  { runInterpretation :: a -> Bool
  }

truthValue :: Interpretation a -> a -> Bool
truthValue = runInterpretation
```

Given that our propositions are allowed to be anything, an interpretation for us
is some way to map whatever we're using to represent propositions into `Bool`.
This 'newtype wrapper' gives us a way of changing the implementation without
having to break anything, and also avoids exposing internals unless we want to.

Now we have all the pieces needed to write our interpreter.

```
interpret :: Interpretation a -> Wff a -> Bool
```

If we have a single proposition, we just ask our interpretation what to do:

```
interpret i (Prop x) = truthValue i x
```

For a 'not', we 'push' `interpret` to the wff being operated on, then flip the
outgoing truth value:

```
interpret i (Not phi) = not . interpret i $ phi
```

'And's and 'or's are dealt with similarly - we 'push' `interpret` to their
'child' wffs, then combine the results using `(&&)` and `(||)` respectively:

```
interpret i (And phi psi) = interpret i phi && interpret i psi
interpret i (Or phi psi) = interpret i phi || interpret i psi
```

For 'if', we want to give back `True` except when the left 'branch' is
`True` and the right branch is `False`:

```
interpret i (If phi psi) = (not . interpret i $ phi) || interpret i psi
```

Lastly, for 'if and only if', we want to check if the truth values match:

```
interpret i (Iff phi psi) = interpret i phi == interpret i psi
```

... and that's it. This interpreter and the associated data structures work just fine, and could be used as-is. The simplicity and ease of translation shown here is part of why many (myself included) love Haskell.
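As a quick sanity check, here's the whole thing run end-to-end on the example formula, under an interpretation where only propositions 1 and 3 are true (the definitions are repeated so the snippet stands alone):

```haskell
data Wff a
  = Prop a
  | Not (Wff a)
  | And (Wff a) (Wff a)
  | Or (Wff a) (Wff a)
  | If (Wff a) (Wff a)
  | Iff (Wff a) (Wff a)
  deriving (Show)

newtype Interpretation a = Interpretation { runInterpretation :: a -> Bool }

truthValue :: Interpretation a -> a -> Bool
truthValue = runInterpretation

interpret :: Interpretation a -> Wff a -> Bool
interpret i (Prop x) = truthValue i x
interpret i (Not phi) = not . interpret i $ phi
interpret i (And phi psi) = interpret i phi && interpret i psi
interpret i (Or phi psi) = interpret i phi || interpret i psi
interpret i (If phi psi) = (not . interpret i $ phi) || interpret i psi
interpret i (Iff phi psi) = interpret i phi == interpret i psi

-- p1 ∧ (p2 ∨ p3), where only 1 and 3 are true: (False ∨ True) is True,
-- then (True ∧ True) is True
demo :: Bool
demo = interpret (Interpretation (`elem` [1, 3 :: Int]))
                 (And (Prop 1) (Or (Prop 2) (Prop 3)))
```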

However, these definitions are basic, and probably not as efficient as they could be. Let's see how they can be improved.

## Interpretations shouldn't be functions

Our definition of an interpretation as (a wrapper around) an `a -> Bool` is
extremely general, and could be used to define a lot of interesting behaviour.
However, in our case, this isn't really needed, and in fact, would cause
complications, as we would have to create (potentially different) functions for
each state in our state graph when doing our checks. Because every state is
labelled with exactly those propositions that happen to be true there, we can
just as easily have our `Interpretation a` simply be a collection of the `a`s
which are true at the moment.

Let's start with lists, as this is easy. First, we change our `newtype`
appropriately:

```
newtype Interpretation a = Interpretation
  { runInterpretation :: [a]
  }
```

We also have to adjust `truthValue` to look inside the `Interpretation`'s
list; if it finds the second argument there, we give back `True`, but not
otherwise. This requires us to introduce an `Eq` constraint as well.

```
truthValue :: (Eq a) => Interpretation a -> a -> Bool
truthValue i x = x `elem` runInterpretation i
```

To get everything to compile again, we also need to add an `Eq` constraint to
`interpret`:

```
interpret :: (Eq a) => Interpretation a -> Wff a -> Bool
-- rest of interpret unchanged
```

Now, we don't have to use functions for `Interpretation`s anymore. While we
did add an extra constraint, limiting what we can use to represent propositions,
this is going to be much simpler to work with. This also shows why having the
`newtype` wrapper around our interpretation implementation helps; any changes
we make to the implementation can be (mostly) invisible at the interface level.
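This change also makes induced interpretations especially direct - a state's label set can be used as its interpretation wholesale. A small sketch, with the list-based definitions repeated and an illustrative label set standing in for a real state:

```haskell
newtype Interpretation a = Interpretation { runInterpretation :: [a] }

truthValue :: Eq a => Interpretation a -> a -> Bool
truthValue i x = x `elem` runInterpretation i

-- A (hypothetical) state labelled with {p, q}; its label set *is* its
-- induced interpretation under the closed-world assumption.
stateLabels :: [String]
stateLabels = ["p", "q"]

inducedI :: Interpretation String
inducedI = Interpretation stateLabels
```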

## Speeding up interpretations

Lists are certainly useful, but they're *definitely* not good for lookups.
Potentially, every time we want to call `truthValue`, we might have to scan
the entire list before giving an answer! Additionally, if we have large wffs
being used to check large state graphs, we're going to be doing this scanning
over and over again, which is definitely not a good idea. Let's fix this by
thinking about our usage a little more and choosing a better data structure.

If we think about how the model checking process will work in our application,
we expect the wff to be built *once*, but interpreted many times. Additionally,
we will have to make a lot more calls to `truthValue` than constructions of
`Interpretation`s: while the former will have to be done for every proposition
that shows up in our wff, the latter will only be done once per state. This
clearly requires a structure optimized for lookups, which ideally doesn't
constrain `a` too much. My suggestion would be something like `HashSet` from
`unordered-containers`, as `Hashable` is an easy constraint to satisfy
for anything we could realistically want to use for propositions, and lookups
are fast.

The changes we need to make are quite minor:

```
import Data.Hashable
import qualified Data.HashSet as HS

-- Wff unchanged

newtype Interpretation a = Interpretation
  { runInterpretation :: HS.HashSet a
  }

truthValue :: (Eq a, Hashable a) => Interpretation a -> a -> Bool
truthValue i x = x `HS.member` runInterpretation i

interpret :: (Eq a, Hashable a) => Interpretation a -> Wff a -> Bool
-- rest of interpret unchanged
```

Although we could have used `elem` (as it's provided by the wonder-weapon that
is `Foldable`, which `HashSet` is an instance of), it's much less
efficient due to how `HashSet` is defined. We also need a `Hashable`
constraint on `truthValue` and `interpret`, but, as we mentioned before,
it's not a big ask for any realistic choice of representation.
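As an aside, if the extra dependency is unwelcome, the same pattern works with `Set` from `containers` at the cost of an `Ord` constraint instead of `Hashable`. Lookups become logarithmic rather than (expected) constant-time, which is still a big step up from lists - a quick sketch:

```haskell
import qualified Data.Set as S

-- The same interface as before, backed by an ordered set instead.
newtype Interpretation a = Interpretation
  { runInterpretation :: S.Set a
  }

truthValue :: Ord a => Interpretation a -> a -> Bool
truthValue i x = x `S.member` runInterpretation i
```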

## 'Compiling' wffs

So far, our changes have been simple, and we've kept the data type we use for
*building* a wff and the data type we use for *interpreting* the wff the same.
However, these two tasks are at cross purposes: the structure for building the
wff should be expressive and easy to use when defining arbitrarily complex wffs,
but the structure for interpreting should be as efficient as possible. Given
this, we'd like to separate these two concerns, and define another type which is
simpler, but can be interpreted more quickly. This is actually what compilers do
- they take something that's 'easy' for humans (to write and understand) and
convert it into something that's 'easy' for machines (to execute).

At the same time, when we do anything like this, we have to make sure that we
don't accidentally change the semantics - you can always get the wrong answer
infinitely fast, after all. To ensure that this doesn't happen, we're going to
use the concept of *logical equivalence*. Essentially, if we have two wffs Φ
and Ψ, we say that they are *equivalent* (which is written Φ ≡ Ψ) if they have
the same truth value under *any* interpretation. This property means that we can
always replace Φ with Ψ (and vice-versa) without changing the semantics, no
matter the interpretation. There are many equivalences in propositional logic
(exercises on which being a major reason why many people end up hating
first-year logic classes, but I digress); we will use a few of them.

One way we can radically simplify things is to eliminate all our operators and
replace them with just *one*, from which any of the others can always be rebuilt.
Thanks to Emil Post, we actually have *two* options: either ↑ (called the
'Sheffer stroke' or `NAND`) or ↓ (called the 'Quine dagger' or `NOR`).
Although it really doesn't matter which one we go with, let's try the Sheffer
stroke. Instead of rewriting the equivalences that make it possible, I'll defer
to Wikipedia and let you look them up yourself.

Welcome back! Let's now implement a new data type to represent Sheffer-stroke-only wffs, as well as a function to interpret them.

```
data CWff a -- for 'compiled wff'
  = CProp a
  | Stroke (CWff a) (CWff a)
  deriving (Show)

interpret' :: (Eq a, Hashable a) => Interpretation a -> CWff a -> Bool
interpret' i (CProp x) = truthValue i x
interpret' i (Stroke phi psi) = not (interpret' i phi && interpret' i psi)
```

Wow, *much* simpler. However, one important step remains - we currently have no
way of converting a `Wff` into a `CWff`. Something like this, perhaps:

```
compile :: Wff a -> CWff a
compile (Prop p) = CProp p
compile (Not phi) = Stroke (compile phi) (compile phi)
compile (And phi psi) =
  Stroke
    (Stroke (compile phi) (compile psi))
    (Stroke (compile phi) (compile psi))
compile (Or phi psi) =
  Stroke
    (Stroke (compile phi) (compile phi))
    (Stroke (compile psi) (compile psi))
compile (If phi psi) = Stroke (compile phi) (Stroke (compile psi) (compile psi))
compile (Iff phi psi) =
  Stroke
    (Stroke (compile phi) (compile psi))
    (Stroke
      (Stroke (compile phi) (compile phi))
      (Stroke (compile psi) (compile psi)))
```

Eeek. While this certainly works, even in Haskell it's not a pretty sight, and
writing this is thoroughly unpleasant. Despite the added simplicity in the data
type, the compilation can *massively* inflate the size of the resulting
structure (just look at all those nested `Stroke`s if you have any doubts).
While this is certainly easier to interpret in terms of the number of rules
required, it's not looking like it'll be any faster.
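Since a translation like this is easy to get subtly wrong, it's worth checking that `compile` really does preserve semantics. Here's a sketch that exhaustively compares the two interpreters over *every* interpretation of a formula's propositions; to keep the snippet dependency-free and self-contained, it uses raw `a -> Bool` functions in place of the `Interpretation` wrapper, and repeats the definitions:

```haskell
import Data.List (subsequences)

data Wff a = Prop a | Not (Wff a) | And (Wff a) (Wff a)
           | Or (Wff a) (Wff a) | If (Wff a) (Wff a) | Iff (Wff a) (Wff a)

data CWff a = CProp a | Stroke (CWff a) (CWff a)

interpret :: (a -> Bool) -> Wff a -> Bool
interpret i (Prop x) = i x
interpret i (Not phi) = not (interpret i phi)
interpret i (And phi psi) = interpret i phi && interpret i psi
interpret i (Or phi psi) = interpret i phi || interpret i psi
interpret i (If phi psi) = not (interpret i phi) || interpret i psi
interpret i (Iff phi psi) = interpret i phi == interpret i psi

interpret' :: (a -> Bool) -> CWff a -> Bool
interpret' i (CProp x) = i x
interpret' i (Stroke phi psi) = not (interpret' i phi && interpret' i psi)

compile :: Wff a -> CWff a
compile (Prop p) = CProp p
compile (Not phi) = Stroke (compile phi) (compile phi)
compile (And phi psi) =
  let s = Stroke (compile phi) (compile psi) in Stroke s s
compile (Or phi psi) =
  Stroke (Stroke (compile phi) (compile phi))
         (Stroke (compile psi) (compile psi))
compile (If phi psi) = Stroke (compile phi) (Stroke (compile psi) (compile psi))
compile (Iff phi psi) =
  Stroke (Stroke (compile phi) (compile psi))
         (Stroke (Stroke (compile phi) (compile phi))
                 (Stroke (compile psi) (compile psi)))

-- Does a wff over propositions [1..n] agree with its compiled form under
-- all 2^n interpretations? Each subset of [1..n] is one interpretation.
agrees :: Int -> Wff Int -> Bool
agrees n wff = all check (subsequences [1 .. n])
  where
    check truePs = interpret (`elem` truePs) wff
                     == interpret' (`elem` truePs) (compile wff)
```

This is essentially a hand-rolled property test; with QuickCheck one could generate random wffs as well, but even a fixed exhaustive check over a few formulas catches most translation mistakes.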

## Where to from here?

Our ridiculous over-engineering needs to continue in order to make all this worthwhile. Join us next time, when we get close to writing a compiler in the pursuit of Ever Greater Speed. Meanwhile, feel free, as some of my students did, to appreciate how straightforward Haskell makes writing such things. Think hard and have fun!