mrd's blog

Refuting Myths about Polymorphism

Submitted by mrd on Sun, 03/02/2008 - 11:23pm.

I am addressing an article written by Mark Dominus named Subtypes and polymorphism.

References and Type Safety

In the second part of the article, Mark points out an example which he says demonstrates where Standard ML breaks type safety. In fact, he is passing along a common myth about the "value restriction" in Standard ML and other Hindley-Milner based languages. To state it quite simply up-front: the "value restriction" is not necessary to protect type safety. It is simply there to preserve an intuition about how code evaluates.

This is the example he presents. If you give this to a Standard ML compiler, it will fail on the line beginning with val a because of the "value restriction."

fun id x = x            (* id : α → α *)
val a = ref id;         (* a : ref (α → α) *)
fun not true = false
  | not false = true;   (* not : bool → bool *)
a := not;
(!a) 13

If the value restriction were to be lifted, the code would still be type-safe. Mark's interpretation of the code is incorrect. But the thinking process which led him to his incorrect interpretation is itself the reason for the value restriction.

The Value Restriction

The restriction itself is simple to describe: if you have code that looks like this:

val x = ...

and the ... has a type involving polymorphism, then it must be a value syntactically.

ref id of type (α → α) ref is not a value -- it will evaluate to a location-value. Also, it is polymorphic. Therefore the value restriction applies, and the code is not permitted.

The value restriction is seemingly helping us preserve type safety. But that is a misconception. If you removed the value restriction, you could compile the above code, and it would still be type-safe.

The short answer to what would happen is this: the a on the line a := not and the one on the line (!a) 13 do not evaluate to the same location-value! Therefore you are not, in fact, applying not to an integer. You are applying id to an integer, which is perfectly fine.

Under the hood

To explain why this is the case, I need to show you that the code above is omitting important details about polymorphism. Those details are automatically reconstructed and inserted by the typechecker. Here is a version of the code that is decorated with the details after typechecking:

val id : ∀α. α → α = Λα. fn (x:α) => x
val a : ∀α. (α → α) ref = Λα. (ref{α → α}) (id{α})
val not : bool → bool
a{bool} := not;
((!{int → int}) (a{int})) 13

Whew! Now you can see why people prefer to let the typechecker deal with this stuff. But now the Hindley-Milner style types have been translated into a System F (aka polymorphic lambda-calculus) style language.

When you see the type α → α in H-M types, it's omitting the universal quantifier. Haskell folks are already familiar with it, because a popular extension allows the explicit use of ∀ (aka "forall"). All the type variables must be quantified in a ∀ placed at the beginning of the type.

The term-level counterpart of ∀ is Λ (aka "type-lambda"): a term whose type begins with ∀ must be wrapped in a Λ, so a type-lambda must be introduced everywhere a ∀ appears in a type. The typechecker rewrites your code to insert the necessary type-lambdas.

In order to use a polymorphic value (a term wrapped in a type-lambda), you must apply the type-lambda to a type. This is denoted t{T} in the above example. For example, to use the polymorphic function id : ∀α. α → α, you must first apply its type-lambda to the type that you want: (id{int}) 13 steps to 13.
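GHC can make this machinery visible. With the (much more recent) TypeApplications extension, the type application written id{int} above can be spelled explicitly. A minimal sketch, with illustrative names:

{-# LANGUAGE TypeApplications #-}

-- idPoly's full type is forall a. a -> a; GHC inserts the
-- type-lambda for us, and @Int applies it to the type Int.
idPoly :: a -> a
idPoly x = x

thirteen :: Int
thirteen = idPoly @Int 13   -- corresponds to (id{int}) 13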

Now, I can explain what is happening behind the scenes when you use a. Since a is polymorphic, that means it is wrapped in a type-lambda. In order to be used as a reference, it must be applied to a type. For example:

a{bool} steps to (ref{bool → bool}) (id{bool}) which then steps to l1

where l1 of type (bool → bool) ref is a location-value -- the result of evaluating a call to ref.

On the other hand, later on we do:

a{int} steps to (ref{int → int}) (id{int}) which then steps to l2

where l2 is a different location-value. We've called ref a second time, and it's given us a different location.

The Unexpected

When Mark wrote val a = ref id he clearly did not expect ref to be called more than once. But I have demonstrated that this is exactly what would happen if his code were compiled and run.

This kind of behavior is type-safe; we never attempt to invoke not on the integer 13, because the function not is safely stored in an entirely different location. It is possible to go ahead and prove the type safety of a language like this with references. But clearly, the repeated evaluation of ref is surprising, and might even be called a sign of a "leaky abstraction" in the Hindley-Milner type system. It attempts to hide the details about type-lambdas, but they've revealed themselves in the dynamic behavior of the code.

The "value restriction" then, is simply a measure put in place to prevent "surprising" behavior from occurring in the first place. By allowing only values syntactically, you will not end up with any evaluation after applying the type-lambda, and all is as expected.

Subtyping and Polymorphism

Submitted by mrd on Sun, 03/02/2008 - 9:54pm.

I am addressing an article written by Mark Dominus named Subtypes and polymorphism.

There are two parts to the article. The first discusses the fact that subtyping and polymorphism can lead to some unintuitive results. I won't repeat his point here, but just note that the term for the behavior of "List" here is "invariant," and the incorrect behavior which he had assumed is called "covariant."

Java's broken arrays

While Java generic Lists are "invariant," Java arrays are "covariant." This is not due to some wonderful property of arrays, but in fact is a gigantic flaw in the Java language which breaks static type safety. As a result, arrays must carry run-time type information in order to dynamically check the type of objects being inserted. Consider this code:

String stringA[] = new String[1];
stringA[0] = "Hello";
Object objectA[] = stringA;

So far this seems okay -- String is a subtype of Object -- therefore it seems safe to say that String arrays are a subtype of Object arrays. This is called "covariant" behavior.

So if objectA is an array of Objects, and everything is a subtype of Object, then it should be okay to put an Integer there instead, right?

objectA[0] = 1;

But wait -- now stringA[0] has an Integer in it! A String array cannot have an Integer in it. Before, it seemed that arrays might be "covariant," but mutation makes it seem that arrays must be "contravariant." They can't be both at the same time, so it seems mutable arrays are "invariant" after all.

If you try running the code given above, you get:

Exception in thread "main" java.lang.ArrayStoreException: java.lang.Integer

Clearly, a run-time type check is being performed.

About Covariance, Contravariance, and Invariance

I've been using these terms, but haven't really defined them.

A type-constructor C is covariant iff: for all types S,T, if S is a subtype of T then C<S> is a subtype of C<T>.

Similarly, for contravariance, if S is a subtype of T then C<T> is a subtype of C<S>. Note that S and T have swapped.

Invariance (the name is confusing, I know) is the lack of covariance or contravariance.

It may be of interest to know that functions are contravariant in the input, but covariant in the output. Perhaps that is why the Java folks are not so eager to have function types!
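A small Haskell sketch of this point (the types and names are my own illustration, not from any library): a read-only source of values can be mapped covariantly, a write-only sink contravariantly, and something that is both read and written -- like a mutable array cell -- admits neither, which is why it must be invariant.

-- A producer of a's: covariant, because we only read from it.
newtype Source a = Source (IO a)

-- A consumer of a's: contravariant, because we only write to it.
newtype Sink a = Sink (a -> IO ())

-- Covariant mapping: post-process what comes out.
mapSource :: (a -> b) -> Source a -> Source b
mapSource f (Source get) = Source (fmap f get)

-- Contravariant mapping: pre-process what goes in (note the flipped
-- direction of the function argument).
mapSink :: (b -> a) -> Sink a -> Sink b
mapSink f (Sink put) = Sink (put . f)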

Broken arrays and generics

The brokenness of Java arrays has impacted the design of Java generics as well. Consider the following code that might be written if you were designing your own generic container of some sort:

public class Container<T> {
    private T[] a;

So far so good, right? An array of some parameterized type seems reasonable.

    public Container() {
        a = (T[])new T[10];
    }
}

Uh oh!

Container.java:5: generic array creation
        a = (T[])new T[10];
                 ^
Note: Container.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 error

No can do. Java is jumping down my throat over this seemingly reasonable code. Why? Creating an array of a type parameter is not permitted, because -- as described above -- Java arrays must carry run-time type information, but Generics promise total type erasure -- meaning NO run-time type information is kept. Sounds like they designed themselves into a corner when creating Java, and later, Generics.

Polymorphism is not Subtyping

Returning to the second part of the article, I want to dispel a few common misconceptions about polymorphism.

Mark Dominus: Then we define a logical negation function, not, which has type bool → bool, that is, it takes a boolean argument and returns a boolean result. Since this is a subtype of α → α, we can store this function in the cell referenced by a.

I want to make it absolutely clear that bool → bool is not a subtype of α → α. There is no notion of subtyping in basic Hindley-Milner, and the claim does not hold even in a "loose" sense.

The easiest way to verify this for yourself is to find your nearest Standard ML (or similar) prompt and type this:

not : 'a -> 'a

and get a type error. not has type bool → bool, and it is not of type α → α.
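The same experiment works in Haskell, where not is likewise Bool -> Bool. A sketch (this definition is deliberately ill-typed; GHC rejects it just as SML does):

-- Claiming a polymorphic type for the monomorphic 'not' is an
-- error: GHC cannot unify 'a -> a' with 'Bool -> Bool'.
badNot :: a -> a
badNot = not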

Now, his example does not typecheck in Standard ML because of the "value restriction." The value restriction does not exist to preserve type safety at all, but only to prevent programmer confusion.

This post has gone on too long about Java, and I am afraid it would wash out the more important points about polymorphism and the value restriction. I will leave those, then, to the next post.

Converting a ByteString to a Double using FFI

Submitted by mrd on Tue, 12/18/2007 - 3:36pm.

ByteString, the ever-useful fast string library, has some nice functions for reading Ints which make reading in test data a breeze. However, there's no similar function for Doubles which has caused some recent troubles for me when dealing with floating point data. The C library has a function for this task: strtod. So I decided to take some older but similar FFI code and mold it into an interface to strtod.


{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign
import Foreign.C.Types
import Data.ByteString.Base


foreign import ccall unsafe "stdlib.h strtod"
c_strtod :: Ptr Word8 -> Ptr (Ptr Word8) -> IO CDouble


strtod :: ByteString -> Maybe (CDouble, ByteString)
strtod str = inlinePerformIO . withForeignPtr x $ \ p ->
  alloca (\ p_endptr -> do
    let p_s = p `plusPtr` s
    n <- c_strtod p_s p_endptr
    endptr <- peek p_endptr
    let diff = endptr `minusPtr` p_s
    if diff == 0
      then return Nothing
      else return $ Just (n, PS x (s + diff) (l - diff)))
  where (x, s, l) = toForeignPtr str

The type is chosen in particular to be convenient for usage with unfoldr.
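For instance, this hypothetical helper (my own addition, not part of the original post) parses every whitespace-separated number in a ByteString; it works because C's strtod itself skips leading whitespace:

import Data.List (unfoldr)

-- Repeatedly apply strtod until it returns Nothing,
-- collecting each parsed CDouble along the way.
readDoubles :: ByteString -> [CDouble]
readDoubles = unfoldr strtod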

The code probably looks more complicated than it is. Since ByteStrings are really stored as ForeignPtr Word8 (along with start-offset and length), it is easy to grab the raw form for usage by C functions. withForeignPtr lets you manipulate the raw Ptr Word8 within a function. Pointer arithmetic is performed using functions like plusPtr. The second parameter to strtod is actually an "out" parameter which sets a pointer to the spot it finished reading. I allocate space to store a pointer, namely a Ptr (Ptr Word8) and Haskell can figure out how much space it needs from the inferred type. I peek to retrieve the "returned" end-pointer. Then it's just a matter of putting back together the ByteString (PS for "Packed String") with the new offset and length.

Data Parallel Bellman-Ford

Submitted by mrd on Thu, 11/29/2007 - 7:48pm.

The graph representation is an edge list.


bmf :: Int -> UArr (Int :*: Int :*: Double) -> Int -> UArr Double
bmf n es src = iterate rnd dm0 !! (n - 1)
  where
    dm0 = toU [ if i == src then 0 else inf | i <- [0 .. n - 1] ]
    rnd dm =
      updateU dm
              (mapU (\ e ->
                       if distOf dm (destin e) > distOf dm (source e) + weight e
                         then (destin e) :*: (distOf dm (source e) + weight e)
                         else (destin e) :*: (distOf dm (destin e)))
                    es)

source = fstS . fstS
destin = sndS . fstS
weight = sndS
distOf dm u = dm !: u

inf :: Double
inf = 1 / 0

The above code uses NDP but only the sequential portions. In order to get parallelism I need to invoke the Distributed operations. However, there are also Unlifted.Parallel operations which hide the usage of the distributed ops.


bmf :: Int -> UArr (Int :*: Int :*: Double) -> Int -> UArr Double
bmf n es src = iterate (rnd es) dm0 !! (n - 1)
  where
    dm0 = toU [ if i == src then 0 else inf | i <- [0 .. n - 1] ]

{-# INLINE rnd #-}
rnd :: UArr (Int :*: Int :*: Double) -> UArr Double -> UArr Double
rnd es dm = updateU dm
          . mapUP sndS
          . filterUP fstS
          . mapUP (\ e ->
                     let d  = distOf dm (destin e)
                         d' = distOf dm (source e) + weight e
                     in if d > d'
                          then True  :*: (destin e :*: d')
                          else False :*: (0 :*: 0))
          $ es

mapUP is the unlifted, parallelized version of mapU. However, there's no updateUP. Looking into the library code, I spotted a commented-out version of updateUP. There are problems in figuring out what to do about multiple concurrent writes to one location. NESL specifies "arbitrary" resolution of conflicting concurrent writes. Unfortunately, the commented-out code has bit-rotted and I haven't managed to fix it.

I also added some filtering to prevent it from getting stuck writing Infinity over a real distance constantly, due to "arbitrary" resolution of conflicting writes in updateU. The iterative function is now factored out into its own toplevel definition, for clarity.
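To make the conflict issue concrete, here is a pure, sequential model (my own sketch, not the NDP API) of an update operation in which, among duplicate indices, the last write wins; "arbitrary" resolution means a parallel implementation may legitimately keep any one of the conflicting writes instead.

-- For each position j, fold over the updates and keep the last
-- value written to j (if any). A parallel updateUP would be free
-- to keep any one of the conflicting writes.
updateModel :: [Double] -> [(Int, Double)] -> [Double]
updateModel xs us =
  [ foldl (\ v (i, w) -> if i == j then w else v) x us
  | (j, x) <- zip [0 ..] xs ]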

My Haskell Emacs configuration

Submitted by mrd on Fri, 11/02/2007 - 12:18pm.

I generally like some of the features that come with haskell-mode for Emacs. But I cannot reconcile my style with that of the "smart" indentation mode. I'm not too keen on the default indentation function of Emacs either, at least, not for Haskell code. I use the basic tab-to-tab-stop function but it needs a little tweaking to bring it down to reasonable levels of indentation.

Tabs should never be used.

(setq-default indent-tabs-mode nil)

I like align-regexp, a neat tool from Emacs 22.

(global-set-key (kbd "C-x a r") 'align-regexp)

Basic haskell-mode loading:

(add-to-list 'load-path "path/to/haskell-mode")
(load-library "haskell-site-file")
(add-to-list 'auto-mode-alist '("\\.hs\\'" . haskell-mode))

My indentation settings. I wrote my own "newline and indent" function which brings any code you split onto the newline back up to the same indentation level it was at previously.

(remove-hook 'haskell-mode-hook 'turn-on-haskell-indent)
;; Just use tab-stop indentation, 2-space tabs
(require 'cl)  ;; provides `loop', used below
(add-hook 'haskell-mode-hook
          (lambda ()
            (turn-on-haskell-doc-mode)
            (turn-on-haskell-simple-indent)
            (setq indent-line-function 'tab-to-tab-stop)
            (setq tab-stop-list
                  (loop for i from 2 upto 120 by 2 collect i))
            (local-set-key (kbd "RET") 'newline-and-indent-relative)))

(defun newline-and-indent-relative ()
  (interactive)
  (newline)
  (indent-to-column (save-excursion
                      (forward-line -1)
                      (back-to-indentation)
                      (current-column))))

And just in case you were wondering how to tweak syntax highlighting colors in your .emacs, here's what I do:

(set-face-bold-p 'font-lock-builtin-face nil)
(set-face-bold-p 'font-lock-comment-face nil)
(set-face-bold-p 'font-lock-function-name-face nil)
(set-face-bold-p 'font-lock-keyword-face nil)
(set-face-bold-p 'font-lock-variable-name-face nil)
(set-face-foreground 'font-lock-builtin-face "cyan")
(set-face-foreground 'font-lock-comment-face "pale green")
(set-face-foreground 'font-lock-function-name-face "green")
(set-face-foreground 'font-lock-variable-name-face "pale green")

I think it's fairly self-explanatory. You can play around with the values and find what you like. M-x list-colors-display is a handy tool to list out all the possible colors along with examples.

Strongly Specified Functions

Submitted by mrd on Sun, 09/23/2007 - 9:37pm.

How to make writing easy functions hard.

Kidding aside, a strongly specified function in Coq not only computes a result but simultaneously provides a proof of some property of that result. The construction of the proof is integrated with the body of the function. This differentiates it from "weakly" specified functions, for which the proof is written separately from the function.

Proving properties about functions is one of the main practical uses of Coq. Unfortunately, it's more difficult than simply writing code. But it is more interesting (and useful) than, say, simple proofs about the monad laws. One thing I learned, the hard way, is that it's better to start small.

The first example to talk about is reverse. It has a simple inductive definition, so that isn't a big deal here, as it can be for some functions. It uses basic structural induction on lists, so I don't need to define my own well-founded induction principle. Hopefully, it illustrates the meaning of "integrated computation and proof."

I've split the property up into two inductive definitions. The first one is the actual "reversed" relation between two lists. Notice that it is not necessarily in Set. The second definition is what actually gets constructed by the function. It says: in order to construct a reversed list of xs, you need to supply a list xs' and a proof of reversed xs xs'.

The definition is slightly different from other functions you may have seen in Coq (or not). It actually looks more like a theorem proof than a function. This is a good thing.

I now use the refine tactic extensively. This tactic allows you to specify an exact proof term, like exact, but also to leave "jokers" to be filled in by further elaboration. The two spots I've left correspond to the base case and the inductive case, which you should be able to figure out from the match. Side note: the return type annotation is necessary; otherwise, type reconstruction won't figure it out.

At this point, I've told Coq that there is a nil case and a "cons" case. The nil case is easy, so I just supplied an exact proof term. You can read that as saying: "The reverse of nil is nil and a proof of this is reversed_nil."

For the inductive case, I will use refine again to drill deeper into the function and the proof. Notice that the recursive call is made with match. This allows me to separate the recursively computed value xs' from the usage of the induction hypothesis Hxs'. There is a little trap here -- if you use automation prior to making the recursive call you may get a "Proof completed" message. But you won't be able to save the proof, because the "termination checker" will yell at you. The same thing happens if you incorrectly invoke the recursion.

The "proof completed" message isn't bogus -- it's just that the function the auto tactic had proven was not the function for which we were looking.

The refine statement provides the necessary hint to the theorem prover. I construct the returned data in a separate refine just to illustrate how you can drill down with the tool. This just leaves the simple proof that xs'++(x::nil) is, in fact, the reverse.

Recursive Extraction pulls out all the computation-related (stuff in Set) bits while leaving behind the proof-related bits. If you want it in Haskell, as always, change the language with Extraction Language Haskell.
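Since the computational content described above is just the naive reverse (nil maps to nil, and the cons case appends x::nil), the extracted code amounts to something like the following Haskell -- a sketch of the shape, not literal extraction output, which would use Coq's own list type as in the earlier posts:

-- The proof parts are erased; only the computation remains.
rev :: [a] -> [a]
rev []       = []
rev (x : xs) = rev xs ++ [x]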

(All examples were written using Coq 8.1pl1)

Coq and The Monad Laws: The Third

Submitted by mrd on Thu, 08/16/2007 - 2:18pm.

The Third Monad Law

The previous two articles introduced Coq and the first two Monad Laws. I am discussing the third one separately because it will take longer to prove.

The proof for the third law will proceed at first like the others, induction and some simplification.

Theorem third_law :
  forall (A B C : Set) (m : list A) (f : A -> list B) (g : B -> list C),
    bind B C (bind A B m f) g = bind A C m (fun x => bind B C (f x) g).
Proof.
  induction m.
  (* base case *)
  simpl. trivial.
  (* inductive case *)
  intros f g.
  simpl.
  unfold bind. unfold bind in IHm.

Which brings us to this state in the interactive theorem prover.

1 subgoal

  A : Set
  B : Set
  C : Set
  a : A
  m : list A
  IHm : forall (f : A -> list B) (g : B -> list C),
        flat_map g (flat_map f m) =
        flat_map (fun x : A => flat_map g (f x)) m
  f : A -> list B
  g : B -> list C
  ============================
   flat_map g (f a ++ flat_map f m) =
   flat_map g (f a) ++ flat_map (fun x : A => flat_map g (f x)) m

At this point, if we could rewrite flat_map g (f a ++ flat_map f m) into flat_map g (f a) ++ flat_map g (flat_map f m), then we would be able to apply the Inductive Hypothesis and be home free.

The "cut" tactic allows you to make an assumption, and then later come back and prove your assumption correct. Using "cut",

cut (flat_map g (f a ++ flat_map f m) =
     flat_map g (f a) ++ flat_map g (flat_map f m)).
intro Distrib.
rewrite Distrib.
rewrite IHm.
reflexivity.

the original goal is easily solved. But Coq has generated an additional subgoal: we must now prove that this cut is correct.

1 subgoal

  A : Set
  B : Set
  C : Set
  a : A
  m : list A
  IHm : forall (f : A -> list B) (g : B -> list C),
        flat_map g (flat_map f m) =
        flat_map (fun x : A => flat_map g (f x)) m
  f : A -> list B
  g : B -> list C
  ============================
   flat_map g (f a ++ flat_map f m) =
   flat_map g (f a) ++ flat_map g (flat_map f m)

We'll proceed by induction on f a which has inductive type list B.

induction (f a).
(* base case *)
simpl. reflexivity.
(* inductive case *)
simpl.
rewrite IHl.
rewrite app_ass.
reflexivity.
Qed.

End ListMonad.

All done. We only needed the associativity property of list append, which I found by querying SearchAbout app.

Summary

Here is a much shorter proof which takes advantage of some of Coq's automated tactics.

Theorem third_law' :
  forall (A B C : Set) (m : list A) (f : A -> list B) (g : B -> list C),
    bind B C (bind A B m f) g = bind A C m (fun x => bind B C (f x) g).
Proof.
  induction m; simpl; intuition.
  replace (bind B C (f a ++ bind A B m f) g)
     with (bind B C (f a) g ++ bind B C (bind A B m f) g);
    [ rewrite IHm
    | induction (f a); simpl; auto; rewrite app_ass; rewrite IHl ]; auto.
Qed.

On a final note, Coq has the ability to extract code into several different languages.

Extraction Language Haskell.
Recursive Extraction bind ret.

results in

module Main where

import qualified Prelude

data List a = Nil
            | Cons a (List a)

app l m =
  case l of
    Nil -> m
    Cons a l1 -> Cons a (app l1 m)

flat_map f l =
  case l of
    Nil -> Nil
    Cons x t -> app (f x) (flat_map f t)

ret :: a1 -> List a1
ret a = Cons a Nil

bind :: (List a1) -> (a1 -> List a2) -> List a2
bind m f = flat_map f m

Coq and The Monad Laws: The First and Second

Submitted by mrd on Wed, 08/15/2007 - 4:14pm.

The First Monad Law

In the previous article, I gave a brief introduction to Coq and the List Monad. I listed three Theorems but did not prove them. Now I will show how to prove the first two Theorems.

Here is the context of our code:

Section ListMonad.

Require Import List.

Definition ret (A : Set) (a : A) := a :: nil.

Definition bind (A B : Set) (m : list A) (f : A -> list B) :=
  flat_map f m.

The first Theorem, restated:

Theorem first_law :
  forall (A B : Set) (a : A) (f : A -> list B),
    bind A B (ret A a) f = f a.

At this point, Coq will enter an interactive proof mode. In coqtop, the prompt will change accordingly. In ProofGeneral Emacs, a new window will pop up. In either case, you will be presented with a view of the current assumptions and goals that looks like this:

1 subgoal

  ============================
   forall (A B : Set) (a : A) (f : A -> list B),
   bind A B (ret A a) f = f a

There are no assumptions, and one goal to prove.

Proof.
  intros A B a f.

Coq proofs proceed by "tactics", which instruct the proof engine to perform a certain operation, or series of operations. In this case, we are introducing assumptions named A, B, a, and f, drawn from the antecedents of the goal. Now the proof view looks like this:

1 subgoal

  A : Set
  B : Set
  a : A
  f : A -> list B
  ============================
   bind A B (ret A a) f = f a

The "unfold" tactic takes the definition of the given argument and tries to substitute it directly into the goal.

unfold bind.
unfold ret.
unfold flat_map.

Now we have a vastly simplified goal.

1 subgoal

  A : Set
  B : Set
  a : A
  f : A -> list B
  ============================
   f a ++ nil = f a

Everyone knows that appending nil to a list returns the same list. But how do we explain this to Coq? Let's search and find out if Coq already knows about this fact.

SearchRewrite (_ ++ nil).

SearchRewrite is used to search for theorems of the form a = b which can be used to rewrite portions of your goal. I use the wildcard _ to indicate that I want to find any theorem involving appending nil to anything. Coq finds:

app_nil_end: forall (A : Set) (l : list A), l = l ++ nil

We can use this with the "rewrite" tactic. We want to rewrite our code which looks like l ++ nil into l. That is a right-to-left rewrite, which is specified like so:

rewrite <- app_nil_end.

And finally we have a goal f a = f a which Coq can trivially solve with

reflexivity.

Or any of the more automated tactics, like "trivial" or "auto".

Qed.

The first monad law is discharged and proven. You can run Check first_law. to see the new theorem.

The Second Monad Law

This second proof will proceed by induction on the structure of the list parameter m. The "induction" tactic causes Coq to try a proof by induction. The proof will be split into two parts, based on case breakdown of the list type.

Theorem second_law :
  forall (A : Set) (m : list A),
    bind A A m (ret A) = m.
Proof.
  induction m.

The first is the base case for nil, and the second is the inductive case for cons.

2 subgoals

  A : Set
  ============================
   bind A A nil (ret A) = nil

subgoal 2 is:
 bind A A (a :: m) (ret A) = a :: m

The base case is easy, the "simpl" tactic can take care of the unfolding and simplification.

simpl. reflexivity.

The inductive case is also straightforward, because after the "simpl" step, the Inductive Hypothesis is already applicable. The appropriate IHm assumption is provided by the "induction" tactic.

simpl. rewrite IHm. reflexivity.
Qed.

Summary

The full proofs, minus the commentary:

Theorem first_law :
  forall (A B : Set) (a : A) (f : A -> list B),
    bind A B (ret A a) f = f a.
Proof.
  intros A B a f.
  unfold bind. unfold ret. unfold flat_map.
  rewrite <- app_nil_end.
  reflexivity.
Qed.

Theorem second_law :
  forall (A : Set) (m : list A),
    bind A A m (ret A) = m.
Proof.
  induction m.
  (* base case *)
  simpl. reflexivity.
  (* inductive case *)
  simpl. rewrite IHm. reflexivity.
Qed.

Next time, the Third Monad Law, and a more complicated proof.

Coq and The Monad Laws: Introduction

Submitted by mrd on Wed, 08/15/2007 - 2:26pm.

Over the past few months I've been spending some time with Coq, the proof assistant. I am still very much a beginner, so as an exercise I decided to try and prove the three Monad laws for the implementation of the List Monad.

In Haskell, the List Monad looks like this:

instance Monad [] where
  return x = [x]
  m >>= f  = concatMap f m
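In other words, return makes a one-element list, and >>= maps f over the list and concatenates the results. For example:

example :: [Int]
example = [1, 2, 3] >>= \x -> [x, x * 10]
-- evaluates to [1,10,2,20,3,30]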

Coq is a proof-assistant for a constructive logic, so it can also be used as a (dependently-typed) functional programming language. You can open an interactive session and type the lines in using coqtop. However, I use the ProofGeneral Emacs-based interface and save my code in a file like "listmonad.v". You can then step through the lines using C-c C-n. There also exists a custom GUI called CoqIDE.

In Coq, I defined the Monad methods like so:

Section ListMonad.

Require Import List.

Definition ret (A : Set) (a : A) := a :: nil.

Definition bind (A B : Set) (m : list A) (f : A -> list B) :=
  flat_map f m.

I opened up a new section (but did not close it yet). The definitions are fairly unremarkable, except that now the types A and B are explicit parameters. In Coq, a single colon is used for type annotations, and a double colon is used for list consing. flat_map is Coq's version of concatMap. I did not know that before, but found it by using SearchAbout list. at the interactive Coq prompt.

I can see what Coq thinks of my definitions:

Coq < Check ret.
ret : forall A : Set, A -> list A

Coq < Check bind.
bind : forall A B : Set, list A -> (A -> list B) -> list B

Now we are ready to state the three Monad Laws. First, in Haskell notation:

1. return a >>= f  =  f a
2. m >>= return    =  m
3. (m >>= f) >>= g =  m >>= (\x -> f x >>= g)

The first two Monad laws, in essence, assert that return is an identity operation on the left and right sides of bind. The third says that bind is associative.
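Before proving the laws, it can help to see them hold on concrete data. This is not a proof, just a sanity check in GHCi, with arbitrary choices of f, g, and inputs:

f :: Int -> [Int]
f x = [x, x + 1]

g :: Int -> [Int]
g x = [x * 2]

-- All three evaluate to True.
law1, law2, law3 :: Bool
law1 = (return 3 >>= f) == f 3
law2 = ([1, 2, 3] >>= return) == [1, 2, 3]
law3 = (([1, 2] >>= f) >>= g) == ([1, 2] >>= \x -> f x >>= g)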

In Coq:

Theorem first_law :
  forall (A B : Set) (a : A) (f : A -> list B),
    bind A B (ret A a) f = f a.

Theorem second_law :
  forall (A : Set) (m : list A),
    bind A A m (ret A) = m.

Theorem third_law :
  forall (A B C : Set) (m : list A) (f : A -> list B) (g : B -> list C),
    bind B C (bind A B m f) g = bind A C m (fun x => bind B C (f x) g).

Entering any of these Theorems into Coq will engage the interactive proof mode. In the next article, I will examine the proofs of the first two laws.

Haskell and XML

Submitted by mrd on Sun, 08/05/2007 - 8:59am.

I installed HXT this week to use in a small project which dealt with some XML data. It is a slick library, and it has introduced me to the joy of Arrows. I scavenged for tutorial information, and there are some useful pages out there -- The Haskellwiki page on HXT, and Neil Bartlett's recent blog entry.

However, when it came down to getting some practical work done, I found myself scouring the HXT Arrow API documentation. While it is pretty neat to figure out which combinator to use solely based on the examination of types, I would have preferred a selection of code examples.

That is what Practical Examples of HXT in Action is attempting to be. I've concocted 3 articles so far, and I'm hoping to add more. The current examples demonstrate: arrow syntax, basic useful combinators, a little development methodology, optional and list data, and simple integration with Network.HTTP.
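To give a flavor of the style, here is a minimal sketch (the file name is made up, and module names have shifted between HXT versions, so treat this as an approximation rather than a tested example):

import Text.XML.HXT.Core

-- Print the text of every <title> element in a document.
main :: IO ()
main = do
  titles <- runX $ readDocument [] "feed.xml"
                   >>> deep (isElem >>> hasName "title")
                   /> getText
  mapM_ putStrLn titles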