making propellor safer with GADTs and type families
Since July, I have been aware of an ugly problem with propellor. Certain propellor configurations could have a bug. I've tried to solve the problem at least a half-dozen times without success; it's eaten several weekends.
Today I finally managed to fix propellor so it's impossible to write code that has the bug, bending the Haskell type checker to my will with the power of GADTs and type-level functions.
Code with the bug looked innocuous enough. Something like this:
foo :: Property foo = property "foo" $ unlessM (liftIO $ doesFileExist "/etc/foo") $ do bar <- liftIO $ readFile "/etc/foo.template" ensureProperty $ setupFoo bar
The problem comes about because some properties in propellor have Info associated with them. This is used by propellor to introspect over the properties of a host, and do things like set up DNS, or decrypt private data used by the property.
At the same time, it's useful to let a Property internally decide to
run some other Property. In the example above, that's the
line, and the
setupFoo Property is run only sometimes, and is
passed data that is read from the filesystem.
This makes it very hard, indeed probably impossible for Propellor to
look inside the monad, realize that
setupFoo is being used, and add
its Info to the host.
setupFoo doesn't have Info associated with it -- most
properties do not. But, it's hard to tell, when writing such a Property
if it's safe to use ensureProperty. And worse,
setupFoo could later
be changed to have Info.
Now, in most languages, once this problem was noticed, the solution would
probably be to make
ensureProperty notice when it's called on a Property
that has Info, and print a warning message. That's Good Enough in a sense.
But it also really stinks as a solution. It means that building propellor isn't good enough to know you have a working system; you have to let it run on each host, and watch out for warnings. Ugh, no!
This screams for GADTs. (Well, it did once I learned how what GADTs are and what they can do.)
Property NoInfo and
Property HasInfo can be separate data
types. Most functions will work on either type (
Property i) but
ensureProperty can be limited to only accept a
data Property i where IProperty :: Desc -> ... -> Info -> Property HasInfo SProperty :: Desc -> ... -> Property NoInfo data HasInfo data NoInfo ensureProperty :: Property NoInfo -> Propellor Result
Then the type checker can detect the bug, and refuse to compile it.
There are a lot of Property combinators in propellor. These combine
two or more properties in various ways. The most basic one is
which only runs the first Property after the second one has successfully
So, what's it's type when used with GADT Property?
requires :: Property i1 -> Property i2 -> Property ???
It seemed I needed some kind of type class, to vary the return type.
class Combine x y r where requires :: x -> y -> r
Now I was able to write 4 instances of
Combines, for each combination
of 2 Properties with HasInfo or NoInfo.
It type checked. But, type inference was busted. A simple expression
requires bar" blew up:
No instance for (Requires (Property HasInfo) (Property HasInfo) r0) arising from a use of `requires' The type variable `r0' is ambiguous Possible fix: add a type signature that fixes these type variable(s) Note: there is a potential instance available: instance Requires (Property HasInfo) (Property HasInfo) (Property HasInfo) -- Defined at Propellor/Types.hs:167:10
To avoid that, it needed "(foo
requires bar) :: Property HasInfo" --
I didn't want the user to need to write that.
I got stuck here for an long time, well over a month.
type level programming
Finally today I realized that I could fix this with a little type-level programming.
class Combine x y where requires :: x -> y -> CombinedType x y
CombinedType is a type-level function, that calculates the type that
should be used for a combination of types x and y. This turns out to be really
easy to do, once you get your head around type level functions.
type family CInfo x y type instance CInfo HasInfo HasInfo = HasInfo type instance CInfo HasInfo NoInfo = HasInfo type instance CInfo NoInfo HasInfo = HasInfo type instance CInfo NoInfo NoInfo = NoInfo type family CombinedType x y type instance CombinedType (Property x) (Property y) = Property (CInfo x y)
And, with that change, type inference worked again! \o/
(Bonus: I added some more intances of CombinedType for combining things like RevertableProperties, so propellor's property combinators got more powerful too.)
Then I just had to make a massive pass over all of Propellor, fixing the types of each Property to be Property NoInfo or Property HasInfo. I frequently picked the wrong one, but the type checker was able to detect and tell me when I did.
A few of the type signatures got slightly complicated, to provide the type checker with sufficient proof to do its thing...
before :: (IsProp x, Combines y x, IsProp (CombinedType y x)) => x -> y -> CombinedType y x before x y = (y `requires` x) `describe` (propertyDesc x) onChange :: (Combines (Property x) (Property y)) => Property x => Property y => CombinedType (Property x) (Property y) onChange = -- 6 lines of code omitted fallback :: (Combines (Property p1) (Property p2)) => Property p1 -> Property p2 -> Property (CInfo p1 p2) fallback = -- 4 lines of code omitted
.. This mostly happened in property combinators, which is an acceptable tradeoff, when you consider that the type checker is now being used to prove that propellor can't have this bug.
Mostly, things went just fine. The only other annoying thing was that some
things use a
[Property], and since a haskell list can only contain a
single type, while Property Info and Property NoInfo are two different
types, that needed to be dealt with. Happily, I was able to extend
(!) operators to work in this situation,
so a list can be constructed of properties of several different types:
propertyList "foos" $ props & foo & foobar ! oldfoo
The resulting 4000 lines of changes will be in the next release of propellor. Just as soon as I test that it always generates the same Info as before, and perhaps works when I run it. (eep)
These uses of GADTs and type families are not new; this is merely the first time I used them. It's another Haskell leveling up for me.
Anytime you can identify a class of bugs that can impact a complicated code base, and rework the code base to completely avoid that class of bugs, is a time to celebrate!