johnw is currently certified at Master level.

Name: John Wiegley
Member since: 2000-09-20 05:20:40
Last Login: 2017-02-17 18:32:35

FOAF RDF Share This

Homepage: http://www.newartisans.com/

Notes:

My journal is now kept here.

Projects

Recent blog entries by johnw

Syndication: RSS 2.0

Monads are monoid objects

Monads are monoid objects

Lately I’ve been working again on my Category Theory formalization in Coq, and just now proved, in a completely general setting, the following statement:

Monads are monoid (objects) in the (monoidal) category of endofunctors (which is monoidal with respect to functor composition).

The proof, using no axioms, is here.

Now, just how much category theory was needed to establish this fact?

Categories

We start with concept of a category, which has objects of some Type, and arrows between objects of some other Type. In this way, objects and arrows can be almost anything, except they must provide: identity arrows on every object, and composition of arrows, with composition being associative and identity arrows having no effect on composition.

All arrows between two objects forms a set of arrows, called a “hom-set”. In my library, these are actually constructive hom-setoids, allowing a category-specific definition of what it means for two members of a hom-setoid to be “equivalent”. The fact that it is constructive means that the witness to this equivalence must be available to later functions and proofs, and not only the fact that a witness had been found.

Functors

Given two categories, which may have different objects, arrows and hom equivalences, it is sometime possible to map objects to objects, arrows to arrows, and equivalences to equivalences, so long as identity arrows, composition, and the related laws are preserved. In this case we call such a mapping a “functor”.

Natural transformations

While functors map between categories, natural transformations map between functors, along with a “naturality” condition that performing the transformation before or after utilizing the related functors has no effect on the result.

Isomorphisms

Two objects in a category are said to be isomorphic if there are arrows for one to the other, and back, and the composition of these two arrows is equivalent to identity in both directions.

Note that since the type of objects and arrows is unknown in the general case, the “meaning” of isomorphism can vary from category to category, as we will see below in the case of Cat, the category of all categories.

Cartesian categories

Although objects are just abstract symbols, sometimes it’s possible to reveal additional structure about a category through the identification of arrows that give us details about the internal structure of some object.

One such structure is “cartesian products”. This identifies a product object in the category, in terms of introduction and elimination arrows, along with a universal property stating that all product-like objects in the category must be mappable (in terms of their being a product) to the object identified by the cartesian structure.

For example, I could pick tuples (a, b) in Haskell as a product , or some custom data type Tuple, or even a larger data structure (a, b, c), and all of these would be products for a and b. However, only tuples and Tuple are universal, in the sense that every other product has a mapping to them, but not vice versa. Further, the mapping between tuple and Tuple must be a isomorphism. This leaves me free to choose either as the product object for the Haskell category.

Product categories

Whereas cartesion categories tell us more about the internal structure of some product object in a category, product categories are a construction on top of some category, without adding anything to our knowledge of its internals. In particular, a product category is a category whose objects are pairs of objects from some other category, and whose arrows are pairs of the corresponding arrows between those two objects. Arrow equivalence, identity and composition, follow similarly. Thus, every object in a product category is a product, and arrows must always “operate on products”.

Bifunctors

If a functor maps from a product category to some other category (which could also be another product category, but doesn’t have to be), we call it a bifunctor. Another way to think of it is as a “functor of two arguments”.

Endofunctors

A functor that maps a category to itself (though it may map objects to different objects, etc) is called an endofunctor on that category.

The category of endofunctors

The category of endofunctors on some category has as objects every endofunctor, and as arrows natural transformations between these endofunctors. Here identity is the identity transformation, and composition is composition between natural transformations. We can designate the category of endofunctors using the name [C, C], for some category C.

Monoidal categories

A monoidal category reveals the structure of a tensor operation in the category, plus a special object, the unit of the tensor operation. Along with these come laws expressed in terms of isomorphisms between the results of the tensor:

tensor : C × C ⟶ C where "x ⨂ y" := (tensor (x, y));
I : C;

unit_left  {X} : I ⨂ X ≅ X;
unit_right {X} : X ⨂ I ≅ X;

tensor_assoc {X Y Z} : (X ⨂ Y) ⨂ Z ≅ X ⨂ (Y ⨂ Z)

Note that the same category may be monoidal in multiple different ways. Also, we needed product categories, since the tensor is a bifunctor from the product of some category C to itself.

We could also have specified the tensor in curried form, as a functor from C to the category of endofunctors on C:

tensor : C ⟶ [C, C]

However, this adds no information (the two forms are isomorphic), and just made some of the later proofs a bit more complicated.

Monoidal composition

The category of endofunctors on C is a monoidal category, taking the identity endofunctor as unit, and endofunctor composition as the tensor. It is monoidal in other ways too, but this is the structure of interest concerning monads.

Monoid categories

A monoid object in a monoidal category is an object in the category, plus a pair of arrows. Let’s call the arrows mappend and mempty. These map from a tensor product of the monoid object to itself, and from the monoidal unit to the monoid object, along with preservation of the monoid laws in terms of arrow equivlances. In Coq it looks like this:

Context `{C : Category}.
Context `{@Monoidal C}.

(* Here [mon] is the monoid object. *)
Class Monoid (mon : C) := {
  mappend : mon ⨂ mon ~> mon;
  mempty : I ~> mon;

  mempty_left : (* I ⨂ mon ≈ mon *)
    mappend ∘ bimap mempty id ≈ to (@unit_left C _ mon);
  mempty_right : (* mon ⨂ I ≈ mon *)
    mappend ∘ bimap id mempty ≈ to (@unit_right C _ mon);

  (* (mon ⨂ mon) ⨂ mon ≈ mon ⨂ (mon ⨂ mon) *)
  mappend_assoc :
    mappend ∘ bimap mappend id
      ≈ mappend ∘ bimap id mappend ∘ to tensor_assoc
}.

Monads are monoid objects

Given all of the above, we can now state that every monad is a monoid object in the monoidal category of endofunctors, taking composition as the tensor product. return is the mempty natural transformation of that object, and join, the mappend natural transformation:

Context `{C : Category}.
Context `{M : C ⟶ C}.

Definition Endofunctors `(C : Category) := ([C, C]).

Program Definition Monoid_Monad
        (m : @Monoid (Endofunctors C) Composition_Monoidal M) : 
  Monad := {|
  ret  := transform[mempty[m]];
  join := transform[mappend[m]]
|}.

This makes no assumptions about the structure of the category C, other than what has been stated above, and no other aspects of category theory are needed. The proof, again, is here.

Note that there is another way to arrive at monads, from the adjunction of two functors, which I also have a proof for, but this can wait until another post.

Footnotes: [1] We say small here to avoid the paradox of Cat not containing itself.

Syndicated 2017-05-04 00:00:00 from Lost in Technopolis

Putting lenses to work

Putting lenses to work

I gave a talk a couple of weeks ago at BayHac 2017 on “Putting lenses to work”, to show in a practical context how we use lenses at my workplace. I specifically avoided any theory about lenses, or the complex types, or the many operators, to show that at its core, lens is a truly invaluable library:

The videos are now available on YouTube, and the slides for this talk are on GitHub.

The code in the slides are taken directly (using Emacs) from a test file in that same repository, Lenses.hs, to serve as a way of preserving helpful examples, and to make it easy to cargo cult specific patterns into your code.

Syndicated 2017-04-22 00:00:00 from Lost in Technopolis

Submitting Haskell functions to Z3

Submitting Haskell functions to Z3

Conal Elliott has been working for several years now on using categories, specifically cartesian closed category, as a way to abstract Haskell functions at compile-time, so you can render the resulting “categorical term” into other categories.

Here’s an example Haskell function:

\x -> f x (g x)

And here’s its categorical rendering, just to give the flavor of the idea:

eval ∘ (f' △ g')

Where eval means uncurry ($), and f' and g' are the renderings of those two functions; and the operator is (&&&). I’m not using the typical Haskell names for these, by the way, in order to convince myself not to “think in Haskell” when working with these terms, but rather I’m choosing whatever symbols I find most often using in the literature on catgeory theory.

There are a few things to notice about these categorical terms:

  1. They must be point-free. There is no such thing as naming a term, only morphisms that use or produce objects. Hence Awodey calls category theory “the algebra of functions”.

  2. They quickly become very large and unreadable. All but the simplest terms are nearly impossible to understand just by looking at them. Think of it as the binary code for categories.

  3. Because they are just, in effect, chains of composition, without any name binding or scoping issue to consider, the nature of the computation is laid out in a very direct (albeit verbose) way, making rewrite rules available throughout the abstract term.

Although it seems a bit technical at first, the idea is quite simple: Discern the abstract, categorical meaning of a Haskell function, then realize that term in any other category that is cartesian (has products) and closed (has functions as objects, i.e., higher-order constructions). Nothing else needs to be known about the target category for the abstract term to have meaning there. That’s the beauty of using category theory as a universal language for expressing ideas: the meaning transports everywhere.

Here’s an equation meant for the solver, written in plain Haskell:

equation :: (Num a, Ord a) => a -> a -> Bool
equation x y =
    x < y &&
    y < 100 &&
    0 <= x - 3 + 7 * y &&
    (x == y || y + 20 == x + 30)

Here’s how I run the solver, using z3cat, which is built on top of Conal’s concat library:

mres <- liftIO $ runZ3 (ccc (uncurry (equation @Int))) $ do
    x <- mkFreshIntVar "x"
    y <- mkFreshIntVar "y"
    return $ PairE (PrimE x) (PrimE y)
case mres of
    Nothing  -> error "No solution found."
    Just sol -> putStrLn $ "Solution: " ++ show sol

And the result, also showing the equation submitted to Z3:

(let ((a!1 (ite (<= 0 (+ (- x!0 3) (* 7 y!1)))
                (ite (= x!0 y!1) true (= (+ y!1 20) (+ x!0 30)))
                false)))
  (ite (< x!0 y!1) (ite (< y!1 100) a!1 false) false))
Solution: [-8,2]

Now with one function, I have either a predicate function I can use in Haskell, or an input for Z3 to find arguments for which it is true!

In addition to using Conal’s work in Haskell, I’m also working on a Coq rendering of his idea, which I hope will give me a more principled way to extract Coq programs into Haskell, by way of their categorical representation.

Syndicated 2017-04-18 00:00:00 from Lost in Technopolis

A case of reflection

A case of reflection

A while back, Edward Kmett wrote a library called reflection, based on a 2004 paper by Oleg Kiselyov and Chung-chieh Shan that describes a neat trick for reifying data into types (here the word “reify” can be understood as turning a value into something that can be referenced at the type level). There was also an article written by Austin Seipp on how to use the library, and some great answers on reddit and stackoverflow that go into detail about how it works.

And yet, in all these years, though I’ve been on the lookout for a way to make use of this library, I wasn’t able to fit it into my workflow – until today! So let’s look at my real world use for reflection, which solves a problem that maybe others have encountered as well.

As you may know, the QuickCheck library provides a facility for generating arbitrary data sets. The property testing features of QuickCheck make use of this generation to search for test data that might violate a set of properties.

However, the generation facility can also be used on its own, separate from the testing components, to randomly generate data for any purpose. The library for producing this random data offers lots of combinators, and is based around instances for a type class called Arbitrary. Here’s a basic example:

module Main where

import Test.QuickCheck.Arbitrary
import Test.QuickCheck.Gen

data Foo = Foo [Int] [String]
    deriving Show

instance Arbitrary Foo where
    arbitrary = do
        xs  <- listOf chooseAny
        len <- choose (1, 100)
        ys  <- vectorOf len (shuffle "Hello, world")
        return $ Foo xs ys

main :: IO ()
main = print =<< generate (arbitrary :: Gen Foo)

This creates a specifically shaped set of random data, where the list of integers may be of any length, and any value, but the list of strings will always be from 1 to 100 elements long, and the strings will only consist of random arrangements of the characters found in "Hello, world".

Now, what if you wanted to guide the generation process for Foo using external information? Such as picking the length of the list of strings from a value provided by the user? Since Arbitrary does not allow the use of Reader, how do we get that user-supplied value into the arbitrary function above? And without using global IORefs or unsafePerformIO?

The reflection library allows us to reify a runtime value into a type (whose name we’ll never know, requiring us to reference it through a type variable), and then communicate that type via a constraint, such that we can reflect the value back out as needed. If this sounds a bit confusing, maybe an example can make it clearer:

{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE UndecidableInstances #-}

module Main where

import Data.Proxy
import Data.Reflection
import Test.QuickCheck.Arbitrary
import Test.QuickCheck.Gen
import System.Environment

data Foo s = Foo [Int] [String]
    deriving Show

instance Reifies s Int => Arbitrary (Foo s) where
    arbitrary = do
        xs  <- listOf chooseAny
        len <- choose (1, reflect (Proxy :: Proxy s))
        ys  <- vectorOf len (shuffle "Hello, world")
        return $ Foo xs ys

main :: IO ()
main = do
    [len] <- getArgs
    reify (read len :: Int) $ \(Proxy :: Proxy s) ->
        print =<< generate (arbitrary :: Gen (Foo s))

There are a few additional things to note here:

  1. A phantom type variable has been added to Foo. This type variable associates the reified data to our type, so it can be reflected back out in the instance for this type.

  2. The Arbitrary instance for Foo s has incurred a new contraint, stating that the type represented by s somehow reifies an Int. How this happens is the magic of the reflection library, and uses a clever GHC trick representing Edward’s unique twist on Oleg and Chung-chieh’s work. This instance requires the UndecidableInstances extension.

  3. We now call reify with the data we want to pass along. This function takes a lambda whose first argument is a Proxy s, giving us a way to know which type variable to use in the type of the call to arbitrary. This requires the ScopedTypeVariables extension.

That’s it: reflection gives us a way to plumb extra data into instances at runtime, at the cost of adding a single phantom type.

If the phantom type seems excessive for one use case, or if adding the phantom would effect a large family of types, then an alternative is to enable the FlexibleInstances extension, and use Edward’s tagged library to carry the phantom instead:

{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE UndecidableInstances #-}

module Main where

import Data.Proxy
import Data.Tagged
import Data.Reflection
import Test.QuickCheck.Arbitrary
import Test.QuickCheck.Gen
import System.Environment

data Foo = Foo [Int] [String]
    deriving Show

instance Reifies s Int => Arbitrary (Tagged s Foo) where
    arbitrary = fmap Tagged $ do
        xs  <- listOf chooseAny
        len <- choose (1, reflect (Proxy :: Proxy s))
        ys  <- vectorOf len (shuffle "Hello, world")
        return $ Foo xs ys

main :: IO ()
main = do
    [len] <- getArgs
    reify (read len :: Int) $ \(Proxy :: Proxy s) ->
        print . unTagged =<< generate (arbitrary :: Gen (Tagged s Foo))

This way we leave the original type alone – which may be the only option if you’re generating arbitrary data for types from libraries. You’ll just have to wrap and unwrap the Tagged newtype wrapper as necessary.

Another benefit of using Tagged is that, because it can be wrapped and unwrapped as necessary, it becomes possible to change the refied information in cases where nested types are involved. In this last example, the user is allowed to specify the value that should be supplied to the Bar constructor during data generation.

{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE UndecidableInstances #-}

module Main where

import Data.Proxy
import Data.Tagged
import Data.Reflection
import Test.QuickCheck.Arbitrary
import Test.QuickCheck.Gen
import System.Environment

newtype Bar = Bar Int
    deriving Show

data Foo = Foo [Bar] [String]
    deriving Show

instance Reifies s Int => Arbitrary (Tagged s Bar) where
    arbitrary = return $ Tagged $ Bar $ reflect (Proxy :: Proxy s)

instance Reifies s (Int, Int) => Arbitrary (Tagged s Foo) where
    arbitrary = fmap Tagged $ do
        let (len, bar) = reflect (Proxy :: Proxy s)
        xs <- listOf (reify bar $ \(Proxy :: Proxy r) ->
                          unTagged <$> (arbitrary :: Gen (Tagged r Bar)))
        l  <- choose (1, len)
        ys <- vectorOf l (shuffle "Hello, world")
        return $ Foo xs ys

main :: IO ()
main = do
    [len, barValue] <- getArgs
    reify (read len :: Int, read barValue :: Int) $ \(Proxy :: Proxy s) ->
        print . unTagged =<< generate (arbitrary :: Gen (Tagged s Foo))

Syndicated 2017-02-23 00:00:00 from Lost in Technopolis

Using the FMap library in Coq

Using the FMap library in Coq

One of the most common structures used in programming are key/value maps, also called hash maps, dictionaries, association lists, or simply functions. These maps generally provide a way to add new values, lookup keys, iterate over the collection, etc. Yet in Coq, even though this facility exists in the standard library under the module FMap, it can be quite difficult to get started with. This post intends to clarify the typical patterns in a way that is easy to copy into your own project, based on the four different ways this library is typically used.

Using a known key type and map structure

Very often, one maps from a known, ordered type, like nat, to some other type, using one of the concrete structures offered by the FMap library. In that case, the code you want to start with looks like this:

Require Import
  Coq.FSets.FMapList
  Coq.Structures.OrderedTypeEx.

Module Import M := FMapList.Make(Nat_as_OT).

You can now create a map using M.t A, where A is your value type. You can prefix the map-related functions with M., or just call them directly. Some common function to use on maps are as follows:

  • empty
  • add
  • remove
  • find
  • mem
  • is_empty
  • map
  • mapi
  • map2
  • fold

There are also several relations you can use to phrase theorems about maps and map membership:

  • In
  • MapsTo
  • Equal
  • Equiv
  • Equivb
  • Empty

Additional functions and lemmas

In order to complete most proofs concerning maps, there are additional lemmas and functions you’ll want to include:

Require Import
  Coq.FSets.FMapFacts.

Module P := WProperties_fun N_as_OT M.
Module F := P.F.

This provides two new prefixes, P. and F., which bring into scope many more helper functions and lemmas:

  • P.of_list
  • P.to_list
  • P.filter
  • P.for_all
  • P.exists_
  • P.partition
  • P.update
  • P.restrict
  • P.diff

Helper lemmas in the F module are generally best found using SearchAbout for the specific lemma you need. There are too many to list here, and they’re often quite specific in their use, such as F.find_mapsto_iff to reflect between the fact of a successful find operation, and its equivalent MapsTo relation.

Proofs involving maps

There are several induction principles you will need for completing inductive proofs over maps:

  • P.map_induction
  • P.map_induction_bis
  • P.fold_rec
  • P.fold_rec_bis
  • P.fold_rec_nodep
  • P.fold_rec_weak

The P.map_induction induction principle treats each intermediate map as an Add relation over a previous map, until it reaches the base Empty map. The _bis variant expresses the same information as successive calls to add down to an empty map.

P.fold_rec should be applied if the goal has the form of a call to M.fold over a map. If you use this, be sure to revert into the goal any hypotheses referring to the same map, since you’ll likely want to use those facts as part of the induction.

Note that these two sets of principles are used somewhat differently from each other:

-- Applies to any evidence in the context involving [m].
induction m using P.map_induction bis.

-- Applies only to evidence in the goal, thus sometimes
-- requiring use of [revert].
apply P.fold_rec.

Rewriting with maps

Since the internal structure of maps is not exposed by the FMap interface, rewriting can sometimes be a little confusing. Equality between maps is expressed by the equivalence Equal, which states that anything found in the first map is found at the same key in the second map. In other words:

forall k v, M.MapsTo k v m1 <-> M.MapsTo k v m2

This isn’t a problem if the terms you’re rewriting involve functions from the FMap modules, but if you create a new function that operates on maps, you’ll need to accompany it with a proof relating it to Equal. For example:

Definition map_operation `(m : M.t A) : M.t A := ...

Lemma map_operation_Proper :
  Proper (Equal ==> Equal) map_operation.

Now you can rewrite the arguments in a map_operation, provided you know they are Equal.

Also, if you find yourself facing difficulties using rewrite with folds, note that in addition to establishing a proof that the fold function is Proper for its arguments and result, you must also show that the final result is independent of the order of evaluation, since it’s not known from the FMap interface whether the contents of a map are reordered during insertion or not.

Abstracting the map implementation

Often when using maps, it’s not necessary to pick an implementation, you just need the map interface over a known key type. To do this, you just need to place your code in a module that itself requires and passes along the implementation module:

Require Import
  Coq.FSets.FMapFacts
  Coq.Structures.OrderedTypeEx.

Module MyModule (M : WSfun Nat_as_OT).

Module P := WProperties_fun Nat_as_OT M.
Module F := P.F.
...
End MyModule.

To later instantiate such a module functor using a map implementation, you’d write:

Require Import
  Coq.FSets.FMapFacts
  MyModule.

Module Import M := FMapList.Make(Nat_as_OT).
Module Import MyMod := MyModule M.

Abstracting over both map and key

When implementing generic algorithms that are applicable to any map, you’ll also need to abstract over the key type. In this case, you have two choices: Do you need to know that the key type is ordered, or do you only need to know that it’s decidable? Often the latter suffices, making the algorithm even more general.

In both cases, you may refer to the key type as either E.key or M.key (since the M module re-exports key), and you can check for key equality using E.eq:

Require Import
  Coq.FSets.FMapFacts
  Coq.Structures.DecidableTypeEx.

Module MoreFacts (E : DecidableType) (M : WSfun E).

Global Program Instance filter_Proper {elt} : forall P,
  Proper (E.eq ==> eq ==> eq) P
    -> Proper (M.Equal (elt:=elt) ==> M.Equal) (@P.filter elt P).
...

End MoreFacts.

To require an ordered type, which makes E.lt available, use:

Require Import
  Coq.FSets.FMapFacts
  Coq.Structures.OrderedTypeEx.

Module MoreFacts (E : OrderedType) (M : WSfun E).
...
End MoreFacts.

Putting it all together

Since you probably came here just wondering how to construct a map, add stuff to it, and then search for what you added, here is a complete example you can cut and paste to start off with:

Require Import
  Coq.FSets.FMapAVL
  Coq.FSets.FMapFacts
  Coq.Structures.OrderedTypeEx
  PeanoNat.

Module Import M := FMapAVL.Make(Nat_as_OT).

Module P := WProperties_fun Nat_as_OT M.
Module F := P.F.

Compute M.find 1 (M.add 1 10 (M.empty _)).
Compute P.for_all (fun k _ => k <? 10) (M.add 1 10 (M.empty _)).

Also note that there is N_as_OT, which is much faster to compute with if you are using large constants, but it requires familiarity with the NArith library.

Syndicated 2016-10-27 00:00:00 from Lost in Technopolis

77 older entries...

 

johnw certified others as follows:

  • johnw certified lerdsuwa as Journeyer
  • johnw certified Nafai77 as Apprentice
  • johnw certified rms as Master
  • johnw certified walters as Master
  • johnw certified rw as Apprentice

Others have certified johnw as follows:

  • aaronl certified johnw as Master
  • jtc certified johnw as Master
  • lerdsuwa certified johnw as Master
  • Ushakov certified johnw as Master
  • walters certified johnw as Master
  • cmm certified johnw as Master
  • rw certified johnw as Master
  • maragato certified johnw as Master
  • mishan certified johnw as Master
  • Nafai77 certified johnw as Master
  • lukeg certified johnw as Master
  • dgoel3 certified johnw as Master
  • deego certified johnw as Master
  • quarl certified johnw as Master
  • bpt certified johnw as Master
  • sachac certified johnw as Master
  • dhruva certified johnw as Master

[ Certification disabled because you're not logged in. ]

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!