# News aggregator

### Danny Gratzer: Examining Hackage: folds

In keeping with the rest of the "Examining Hackage" series, I'd like to go through the source of the folds package today. We'll try to go through most of the code in an attempt to understand what exactly folds does and how it does it. To be honest, I hadn't actually heard of this one until someone mentioned it to me on /r/haskell, but it looks pretty cool. It also has the word "comonadic" in the description; how can I resist?

It's similar to Gabriel's foldl library, but it also seems to provide a wider suite of types of folds. In retrospect, folds has a general framework for talking about types of folds and composing them, whereas foldl defines only 2 types of folds, but defines a whole heap of prebuilt (left) folds.

#### Poking Around

After grabbing the source and looking at the files, we see that folds is actually reasonably large:

```
~$ cabal get folds && cd folds-0.6.2 && ag -g "hs$"
src/Data/Fold.hs
src/Data/Fold/L.hs
src/Data/Fold/L'.hs
src/Data/Fold/Class.hs
src/Data/Fold/M1.hs
src/Data/Fold/L1.hs
src/Data/Fold/R.hs
src/Data/Fold/Internal.hs
src/Data/Fold/L1'.hs
src/Data/Fold/R1.hs
src/Data/Fold/M.hs
Setup.lhs
tests/hlint.hs
```

One that jumps out at me is Internal, since it likely doesn't depend on anything. We'll start there.

#### Internal

Looking at the top gives a hint of what we're in for:

```haskell
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE DeriveDataTypeable #-}
module Data.Fold.Internal
  ( SnocList(..)
  , SnocList1(..)
  , List1(..)
  , Maybe'(..), maybe'
  , Pair'(..)
  , N(..)
  , Tree(..)
  , Tree1(..)
  , An(..)
  , Box(..)
  ) where
```

This module seems to be mostly a bunch of (presumably useful) data types plus their instances for Foldable, Functor, and Traversable. Since all 3 of these are simple enough that you can actually just derive them, I'll elide them in most cases.

First up is SnocList. If the name didn't give it away, it's a backwards list (snoc is cons backwards):

```haskell
data SnocList a = Snoc (SnocList a) a | Nil
  deriving (Eq,Ord,Show,Read,Typeable,Data)
```

Then we have the boilerplatey instances for Functor and Foldable. What's a bit odd is that both foldl and foldMap are implemented where we only need one of them. Presumably this is because just foldMap gives worse performance, but that's a little disappointing.

Next is SnocList1 and List1 which are quite similar.

```haskell
data SnocList1 a = Snoc1 (SnocList1 a) a | First a
  deriving (Eq,Ord,Show,Read,Typeable,Data)

data List1 a = Cons1 a (List1 a) | Last a
```

If you've never seen this before, notice how instead of Nil we have a constructor which requires an element. This means that no matter how we construct a list, we need to supply at least one element. Among other things, this means that head would be safe.

We also have a couple strict structures. Notice that these cannot be functors since they break fmap f . fmap g = fmap (f . g) (why?). We have

```haskell
data Maybe' a = Nothing' | Just' !a
data Pair' a b = Pair' !a !b
```

And we have the obvious instances for Foldable Maybe' and Monoid (Pair' a b). Now it may seem a little silly to define these types, but from experience I can say anything that makes strictness a bit more explicit is wonderfully helpful. Now we can just seq a Pair' and know that both components will be forced.
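A quick check of that claim, as a standalone sketch (the error value and the forced helper are just for illustration): since both fields are strict, evaluating a Pair' to WHNF must also evaluate its components.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Same shape as the library's strict pair.
data Pair' a b = Pair' !a !b

-- Forcing the constructor to WHNF forces both (strict) fields, so the
-- bottom hiding in the first component is hit immediately:
forced :: IO (Either SomeException (Pair' Int Int))
forced = try (evaluate (Pair' (error "left field") 2))
```

With lazy fields the error would survive evaluation to WHNF; here it cannot.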

Next we define a type for trees. One thing I noticed was that the docs mention this type reflects the structure of a foldMap:

```haskell
data Tree a = Zero | One a | Two (Tree a) (Tree a)
  deriving (Eq,Ord,Show,Read,Typeable,Data)
```

When we foldMap, each One should be an element of the original collection. From there we can fmap with the map part of foldMap, and we can imagine traversing the tree and replacing each Two l r with l <> r, each Zero with mempty, and each One a with a.
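That traversal can be written down directly; interp is my name for it, not the library's, but it makes the "Tree reflects foldMap" comment concrete:

```haskell
data Tree a = Zero | One a | Two (Tree a) (Tree a)

-- Interpret the suspended foldMap: Zero becomes mempty, One a becomes f a,
-- and Two l r becomes l <> r.
interp :: Monoid m => (a -> m) -> Tree a -> m
interp _ Zero      = mempty
interp f (One a)   = f a
interp f (Two l r) = interp f l <> interp f r
```

For example, interpreting into the list monoid recovers the elements in order.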

So that’s rather nifty. On top of this we have Foldable, Traversable, and Functor instances.

We also have Tree1 which is similar but elides the Zero

```haskell
data Tree1 a = Bin1 (Tree1 a) (Tree1 a) | Tip1 a
```

As you'd expect, this implements the same type classes as Tree.

Now is where things get a bit weird. First up is a type for reifying monoids using reflection. I actually was thinking about doing a post on it and then I discovered Austin Seipp has done an outstanding one. So we have this N type with the definition

```haskell
newtype N a s = N { runN :: a }
  deriving (Eq,Ord,Show,Read,Typeable,Data)
```

Now with reflection there are two key components: the type class instance floating around and a fresh type s that keys it. If we have s then we can easily demand a specific instance with reflect (Proxy :: Proxy s). That's exactly what we do here. We can create a monoid instance using this trick with

```haskell
instance Reifies s (a -> a -> a, a) => Monoid (N a s) where
  mempty = N $ snd $ reflect (Proxy :: Proxy s)
  mappend (N a) (N b) = N $ fst (reflect (Proxy :: Proxy s)) a b
```

So at each point we use our s to grab the tuple of monoid operations we expect to be around and use them in the obvious manner. The only reason I could imagine doing this is if we had a structure which we want to use as a monoid in a number of different ways. I suppose we also could have just passed the dictionary around, but maybe this was extremely ugly. We shall see later, I suppose.

Last come two data types I do not understand at all: An and Box. They look extremely boring.

```haskell
data Box a = Box a
newtype An a = An a
```

Their instances are the same everywhere as well. I have no clue what these are for. Grepping shows they are used, though, so hopefully this mystery will become clearer as we go.

#### Class

Going in order of the module DAG gives us Data.Fold.Class. This exports two type classes and one function:

```haskell
module Data.Fold.Class
  ( Scan(..)
  , Folding(..)
  , beneath
  ) where
```

One thing that worries me a little is that this imports Control.Lens, which I don't understand nearly as well as I'd like. We'll see how this turns out.

Our first class is

```haskell
class Choice p => Scan p where
  prefix1 :: a -> p a b -> p a b
  postfix1 :: p a b -> a -> p a b
  run1 :: a -> p a b -> b
  interspersing :: a -> p a b -> p a b
```

So right away we notice this is a subclass of Choice, which is in turn a subclass of Profunctor. Choice captures the ability to pull an Either through our profunctor:

```haskell
left'  :: p a b -> p (Either a c) (Either b c)
right' :: p a b -> p (Either c a) (Either c b)
```

Note that we can't do this with ordinary profunctors, since we'd need a function from Either a c -> a, which isn't total.
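To see what these operations mean, here are standalone analogues specialized to the function profunctor (leftF and rightF are my names; the class methods are left' and right'): the function is applied on one side of the Either and the other side passes through untouched.

```haskell
-- left' specialized to plain functions: transform the Left side only.
leftF :: (a -> b) -> Either a c -> Either b c
leftF f = either (Left . f) Right

-- right' specialized to plain functions: transform the Right side only.
rightF :: (a -> b) -> Either c a -> Either c b
rightF f = either Left (Right . f)
```

No partial function Either a c -> a is needed: the Right c case is simply passed along.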

Back to Scan p. It takes a profunctor which apparently represents our folds. We can then prefix the input we supply, postfix the input we supply, and run our fold on a single element of input. This is a bit weird to me; I'm not sure if the intention is to write something like

```haskell
foldList :: Scan p => [a] -> p a b -> b
foldList [x]      = run1 x
foldList (x : xs) = foldList xs . prefix1 x
```

or something else entirely. Additionally, this doesn't really conform to my intuition of what a scan is: I'd expect a scan to produce all of the intermediate output involved in folding. At this point, with no instances in scope, it's a little tricky to see what's supposed to be happening here.

There are a bunch of default-signature based implementations of these methods for when your type implements Folding. Since this is the next type class in the module, let's look at that and then skip back to the defaults.

```haskell
class Scan p => Folding p where
  prefix :: Foldable t => t a -> p a b -> p a b
  prefixOf :: Fold s a -> s -> p a b -> p a b
  postfix :: Foldable t => p a b -> t a -> p a b
  postfixOf :: Fold s a -> p a b -> s -> p a b
  run :: Foldable t => t a -> p a b -> b
  runOf :: Fold s a -> s -> p a b -> b
  filtering :: (a -> Bool) -> p a b -> p a b
```

At this point I looked at a few of the types and my first thought was "Oh dammit lens..", but it's actually not so bad! The first thing to do is ignore the *Of functions, which work across lens's Fold type. There seems to be a nice pair for each "running" function where it can work across a Foldable container or lens's notion of a fold.

```haskell
prefix  :: Foldable t => t a -> p a b -> p a b
postfix :: Foldable t => p a b -> t a -> p a b
run     :: Foldable t => t a -> p a b -> b
```

The first two functions let us create a new fold that will accept some input and supplement it with a bunch of other inputs. prefix gives the supplemental input followed by the new input, and postfix does the reverse. We can actually supply input and run the whole thing with run.
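To make the intent concrete, here's a toy left-fold type of my own (not the library's, which we'll meet later) with list-specialized versions of these three operations:

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- A toy fold: presentation function, step function, initial state.
data L a b = forall r. L (r -> b) (r -> a -> r) r

-- Feed a whole list in and present the result.
runL :: [a] -> L a b -> b
runL t (L k h z) = k (foldl h z t)

-- prefix: fold the supplemental input into the initial state,
-- so it is consumed before any fresh input.
prefixL :: [a] -> L a b -> L a b
prefixL t (L k h z) = L k h (foldl h z t)

-- postfix: fold the supplemental input in at presentation time,
-- so it is consumed after all fresh input.
postfixL :: L a b -> [a] -> L a b
postfixL (L k h z) t = L (\r -> k (foldl h r t)) h z

total :: L Int Int
total = L id (+) 0
```

Running runL [3,4] (prefixL [1,2] total) gives the same answer as folding [1,2,3,4] directly.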

All of these are defined with folded from lens, which reifies a Foldable container into a Fold. So foo = fooOf folded is the default implementation for all of these. Now for the corresponding *Of functions, I'm reading them as "if you give me a lens to treat s as a container that I can get elements from, and a fold, I'll feed the elements of s into the fold."

The types are tricky, but this type class seems to capture what it means to run a fold across some type of structure.

Now we see how An comes in handy: it's used as a single-element Foldable container. Since it's newtyped, this should basically run the same as just passing a single element in.

```haskell
prefix1 = prefix . An
run1 = run . An
postfix1 p = postfix p . An
```

So a Scan here apparently means a fold over a single element at a time. Still not sure why this is deserving of the name Scan, but there you are.

Last but not least we have a notion of dragging a fold through an optic with beneath.

```haskell
beneath :: Profunctor p => Optic p Identity s t a b -> p a b -> p s t
beneath l f = runIdentity #. l (Identity #. f)
```

Those #.'s are like lmaps, but they only work when the function we apply is a "runtime identity". Basically this means we shouldn't be able to tell whether we applied the function or just used unsafeCoerce when running the code. Otherwise all we do is set up our fold f to work across Identity and feed it into the optic.

#### Concrete Implementations

Now a lot of the rest of the code is implementing those two type classes we went over. To figure out where all these implementations are, I just ran

```
~$ cabal repl
> :info Scan
....
instance Scan R1  -- Defined at src/Data/Fold/R1.hs:25:10
instance Scan R   -- Defined at src/Data/Fold/R.hs:27:10
instance Scan M1  -- Defined at src/Data/Fold/M1.hs:25:10
instance Scan M   -- Defined at src/Data/Fold/M.hs:33:10
instance Scan L1' -- Defined at src/Data/Fold/L1'.hs:24:10
instance Scan L1  -- Defined at src/Data/Fold/L1.hs:25:10
instance Scan L'  -- Defined at src/Data/Fold/L'.hs:33:10
instance Scan L   -- Defined at src/Data/Fold/L.hs:33:10
```

Looking at the names, I really don't want to go through each of these in this much detail. Instead I'll skip all the *1's and go over R, L', and M to get a nice sampling of the sorts of folds we get.

#### R.hs

Up first is R.hs. This defines the first type for a fold we've seen:

```haskell
data R a b = forall r. R (r -> b) (a -> r -> r) r
```

Reading this as "a right fold from a to b" we notice a few parts here. It looks like that existential r encodes our fold's inner state, and r -> b maps the current state into the result of the fold. That leaves a -> r -> r as the stepping function. All in all this doesn't look *too* different from foldr itself.

The rest of this module is devoted to making a *lot* of instances for R. Some of these are really uninteresting, like Bind, but quite a few are enlightening. To start with, Profunctor.
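A minimal sketch of what that instance looks like, with the Profunctor class inlined so the snippet stands alone (the package uses Data.Profunctor from the profunctors library instead):

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Inlined, minimal Profunctor class for self-containedness.
class Profunctor p where
  dimap :: (a -> b) -> (c -> d) -> p b c -> p a d

data R a b = forall r. R (r -> b) (a -> r -> r) r

instance Profunctor R where
  -- g post-composes onto the presentation function,
  -- f pre-composes onto the step's input.
  dimap f g (R k h z) = R (g . k) (\a r -> h (f a) r) z
```

The existential state r is untouched; only the fold's boundary types change.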

This should more or less be what you expect, since it's really the only way to get the types to fit together. We fit the map from b -> d onto the presentation piece of the fold and stick the map from a -> c onto the stepper so it can take the new pieces of input.

Next we have the instance for Choice.

```haskell
instance Choice R where
  left' (R k h z) = R (_Left %~ k) step (Left z) where
    step (Left x) (Left y) = Left (h x y)
    step (Right c) _ = Right c
    step _ (Right c) = Right c
  right' (R k h z) = R (_Right %~ k) step (Right z) where
    step (Right x) (Right y) = Right (h x y)
    step (Left c) _ = Left c
    step _ (Left c) = Left c
```

This was slightly harder for me to read, but it helps to remember that here _Left %~ and _Right %~ are just mapping over the left and right sides of an Either. That clears up the presentation bit. For the initial state, when we're pulling our computation through the left side we wrap it in a Left; when we're pulling it through the right, we wrap it in Right.

The interesting bit is the new step function. It short circuits if either our state or our new value is the wrong side of an Either; otherwise it just applies our stepping function and wraps the result back up in an Either.

In addition to being a profunctor, R is also a monad and comonad, as well as a whole bunch of more finely grained classes built around those two. I'll just show the Monad, Applicative, and Comonad instances here.

```haskell
instance Applicative (R a) where
  pure b = R (\() -> b) (\_ () -> ()) ()
  R xf bxx xz <*> R ya byy yz = R
    (\(Pair' x y) -> xf x $ ya y)
    (\b ~(Pair' x y) -> Pair' (bxx b x) (byy b y))
    (Pair' xz yz)

instance Comonad (R a) where
  extract (R k _ z) = k z
  duplicate (R k h z) = R (R k h) h z

instance Monad (R a) where
  return b = R (\() -> b) (\_ () -> ()) ()
  m >>= f = R (\xs a -> run xs (f a)) (:) [] <*> m
```

Looking at the Comonad instance, nesting a fold within a fold doesn't change the accumulator, only the presentation. A nested fold is one that runs and returns a *new* fold which is identical except that its starting state is the result of the old fold.

The <*> operator here is kind of nifty. First off, it zips both folds together using the strict Pair'. Then, when we get to the presentation stage, the final state of the left fold maps to a function and the final state of the right fold maps to its argument. Applying one to the other gives us our final result.

Notice that there’s some craziness happening with irrefutable patterns. When we call this function we won’t attempt to force the second argument until bxx forces x or byy forces y. This is important because it makes sure that <*> preserves short circuiting.
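To see why <*> on folds is worth the trouble, here's a standalone copy of R with the instances above, used to fuse a sum and a length into a single pass (avg is my example, not something from the package):

```haskell
{-# LANGUAGE ExistentialQuantification #-}

data Pair' a b = Pair' !a !b
data R a b = forall r. R (r -> b) (a -> r -> r) r

instance Functor (R a) where
  fmap f (R k h z) = R (f . k) h z

instance Applicative (R a) where
  pure b = R (\() -> b) (\_ () -> ()) ()
  R xf bxx xz <*> R ya byy yz =
    R (\(Pair' x y) -> xf x (ya y))          -- apply left result to right
      (\b ~(Pair' x y) -> Pair' (bxx b x) (byy b y))  -- step both folds
      (Pair' xz yz)

runR :: [a] -> R a b -> b
runR t (R k h z) = k (foldr h z t)

-- sum and length computed in one traversal, then combined at presentation:
avg :: R Double Double
avg = (/) <$> R id (+) 0 <*> R id (\_ n -> n + 1) 0
```

One traversal of the input produces both statistics; the division only happens when the fold is presented.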

The monad instance has a suitably boring return, and >>= is a bit odd. We have one machine which accumulates all the elements it's given in a list, an "identity fold" of sorts. From there our presentation function returns a lambda which expects an a and runs f a over all the input we've saved. We combine this with m by running the two in parallel with <*> and feeding the result of m into the lambda generated on the right.

Now we’re finally in a position to define our Scan and Folding instances. Since the Scan instance can be determined from the Folding one I’ll show Folding.

```haskell
instance Folding R where
  run t (R k h z) = k (foldr h z t)
  prefix s = extend (run s)
  postfix t s = run s (duplicate t)
  runOf l s (R k h z) = k (foldrOf l h z s)
  prefixOf l s = extend (runOf l s)
  postfixOf l t s = runOf l s (duplicate t)
  filtering p (R k h z) = R k (\a r -> if p a then h a r else r) z
```

It took some time, but I understand how this works! The first thing to notice is that actually running a fold just relies on the foldr we have from Foldable. Postfixing a fold is particularly slick with right folds. Remember that z represents the accumulated state for the remainder of the items in our sequence.

Therefore, to postfix a number of elements all we need do is run the fold on the container we’re given and store the results as the new initial state. This is precisely what happens with run s (duplicate t).

Now prefix is the inefficient one here. To prefix an element we want to change how presentation works. Instead of just using the default presentation function, we actually want to take the final state we get, run the fold *again* using this prefixing sequence, and then present the result. For this we have another helpful comonadic function, extend. This leaks because it holds on to the sequence a lot longer than it needs to.

The rest of these functions are basically the same thing, except maybe postfixing (ha) a function name with Of here and there.

#### L'.hs

Next up is (strict) left folds. As with right folds, this module is just a data type and a bunch of instances for it:

```haskell
data L' a b = forall r. L' (r -> b) (r -> a -> r) r
```

One thing that surprised me here was that our state r isn't stored strictly! That's a bit odd, but presumably there's a good reason for it. Now, all the instances for L' are the same as those for R up to isomorphism, because the types are, well... isomorphic.

The real difference comes in the instances for Scan and Folding. Remember how Folding R used foldr? Well, here we just use foldl'. This has the upshot that all the strictness and whatnot is handled entirely by the Foldable instance!

```haskell
instance Folding L' where
  run t (L' k h z) = k $! foldl' h z t
  prefix s = run s . duplicate
  postfix t s = extend (run s) t
  runOf l s (L' k h z) = k $! foldlOf' l h z s
  prefixOf l s = runOf l s . duplicate
  postfixOf l t s = extend (runOf l s) t
  filtering p (L' k h z) = L' k (\r a -> if p a then h r a else r) z
```

So everywhere we had foldr we have foldl'. The other interesting switch is that our definitions of prefix and postfix are almost perfectly swapped! This actually makes perfect sense when you think about it. In a left fold the state propagates from the beginning to the end, versus a right fold where it propagates from the end to the beginning! So to prefix something when folding to the left, we add it to the initial state, and when postfixing we use the presentation function to take our final state and continue to fold with it.

If you check above, you’ll find this to be precisely the opposite of what we had for right folds and since they both have the same comonad instance, we can swap the two implementations.

In fact, having read the implementation for right folds I’m noticing that almost everything in this file is *so* close to what we had before. It really seems like there is a clever abstraction just waiting to break out.

#### M.hs

Now that we've seen how left and right folds are more or less the same, let's try something completely different! M.hs captures the notion of a foldMap and looks pretty different from what we've seen before.

First things first, here’s the type in question.

```haskell
data M a b = forall m. M (m -> b) (a -> m) (m -> m -> m) m
```

We still have a presentation function m -> b, and we still have an internal state m. However, we also have a conversion function to map our inputted values onto the values we know how to fold together, and we have a tensor operation m -> m -> m.
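To get a feel for the shape, here's a standalone copy of M with a simple list runner: convert each element with the conversion function, then merge everything with the tensor. (runM and sumM are my illustrations; the library routes running through foldMap and reflection instead, as we'll see below.)

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Presentation, conversion, tensor (assumed associative), unit state.
data M a b = forall m. M (m -> b) (a -> m) (m -> m -> m) m

-- Fold a list: h converts each element into the carrier, m merges,
-- z is the starting value. Associativity of m means the grouping
-- foldr picks here doesn't matter.
runM :: [a] -> M a b -> b
runM xs (M k h m z) = k (foldr (m . h) z xs)

-- Summation as a monoidal fold: identity conversion, (+) as tensor, 0 as unit.
sumM :: M Int Int
sumM = M id id (+) 0
```

Any (monoid-like) choice of tensor and unit gives a different fold over the same machinery.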

Now as before we have a profunctor instance

```haskell
instance Profunctor M where
  dimap f g (M k h m e) = M (g.k) (h.f) m e
  rmap g (M k h m e) = M (g.k) h m e
  lmap f (M k h m e) = M k (h.f) m e
```

which might start to look familiar from what we've seen so far. Next we have a Choice instance, which is still a little intimidating.

```haskell
instance Choice M where
  left' (M k h m z) = M (_Left %~ k) (_Left %~ h) step (Left z) where
    step (Left x) (Left y) = Left (m x y)
    step (Right c) _ = Right c
    step _ (Right c) = Right c
  right' (M k h m z) = M (_Right %~ k) (_Right %~ h) step (Right z) where
    step (Right x) (Right y) = Right (m x y)
    step (Left c) _ = Left c
    step _ (Left c) = Left c
```

As before, we use prisms and %~ to drag our presentation and conversion functions into Either; similarly, our starting state is wrapped in the appropriate constructor, and we define a new stepping function with similar characteristics to what we've seen before.

As before, we’ve got a wonderful world of monads and comonads to dive into now. We’ll start with monads here to mix it up.

```haskell
instance Applicative (M a) where
  pure b = M (\() -> b) (\_ -> ()) (\() () -> ()) ()
  M xf bx xx xz <*> M ya by yy yz = M
    (\(Pair' x y) -> xf x $ ya y)
    (\b -> Pair' (bx b) (by b))
    (\(Pair' x1 y1) (Pair' x2 y2) -> Pair' (xx x1 x2) (yy y1 y2))
    (Pair' xz yz)

instance Monad (M a) where
  return = pure
  m >>= f = M (\xs a -> run xs (f a)) One Two Zero <*> m
```

Our return/pure just instantiates a trivial fold that consumes ()s and outputs the value we gave it. For <*> we run both machines strictly next to each other and apply the final result of one to the final result of the other.

Bind creates a new fold that builds a tree. This tree contains every input fed to it as it's folding and stores each merge as a node in the tree. While we run this, we also run the original m we were given. Finally, when we reach the end, we apply f to the result of m and run the resulting fold over the tree we've created, which is Foldable. If you remember the earlier comment about Tree a capturing the structure of a foldMap, this is what was meant: we're using a tree to suspend a foldMap until we're in a position to run it.

Now for comonad.

```haskell
instance Comonad (M a) where
  extract (M k _ _ z) = k z
  duplicate (M k h m z) = M (\n -> M (k . m n) h m z) h m z
```

We can be pleasantly surprised that most of this code is the same. Extraction grabs our current state and presents it. Duplication creates a fold which will run and return a new fold. This new fold has the same initial state as the original fold, but when it goes to present its results it will merge them with the final state of the outer fold. This is very different from before, and I suspect it will significantly impact our Folding instance.

```haskell
instance Folding M where
  run s (M k h m (z :: m)) = reify (m, z) $
    \ (_ :: Proxy s) -> k $ runN (foldMap (N #. h) s :: N m s)
  prefix s (M k h m (z :: m)) = reify (m, z) $
    \ (_ :: Proxy s) -> case runN (foldMap (N #. h) s :: N m s) of
      x -> M (\y -> k (m x y)) h m z
  postfix (M k h m (z :: m)) s = reify (m, z) $
    \ (_ :: Proxy s) -> case runN (foldMap (N #. h) s :: N m s) of
      y -> M (\x -> k (m x y)) h m z
  filtering p (M k h m z) = M k (\a -> if p a then h a else z) m z
```

This was a little intimidating, so I took the liberty of ignoring the *Of functions, which are pretty much the same as what we have here.

To run a fold we use foldMap, but foldMap wants to work over monoids and we only have z and m. To promote these to a type class instance we use reify and N. Remember N from way back when? It's the data type that uses reflection to yank a tuple out of our context and treat it as a monoid instance. In all of this code we use reify to introduce a tuple into our environment and N as a pseudo-monoid that uses m and z.

With this in mind, this code uses N #. h, which uses the normal conversion function to bring something into the N monoid. Then foldMap takes care of the rest, and all we need do is call runN to extract the results.

prefix and postfix are actually markedly similar. They both start by running the fold over the supplied structure which reduces it to an m. From there, we create a new fold which is identical in all respects except the presentation function. The new presentation function uses m to combine the pre/post-fixed result with the new result. If we’re postfixing, the postfixed result is on the right, if we’re prefixing, the left.

What’s particularly stunning is that neither of these leak! We don’t need to hold onto the structure in our new fold so we can prefix and postfix in constant memory.

#### Fold.hs

Now that we've gone through a bunch of instances of Folding and Scan, we're in a position to actually look at what Data.Fold exports.

```haskell
module Data.Fold
  ( Scan(..)
  , Folding(..)
  , beneath
  , L1(..)  -- lazy Mealy machine
  , L1'(..) -- strict Mealy machine
  , M1(..)  -- semigroup reducer
  , R1(..)  -- reversed lazy Mealy machine
  , L(..)   -- lazy Moore machine
  , L'(..)  -- strict Moore machine
  , M(..)   -- monoidal reducer
  , R(..)   -- reversed lazy Moore machine
  , AsRM1(..)
  , AsL1'(..)
  , AsRM(..)
  , AsL'(..)
  ) where
```

So aside from the folds we've examined before, there are 4 new classes: AsRM[1] and AsL[1]'. We'll look at the non-1 versions.

```haskell
class AsRM1 p => AsRM p where
  asM :: p a b -> M a b
  asR :: p a b -> R a b
```

So this class covers the class of p's that know how to convert themselves to monoidal and right folds. Most of these instances are what you'd expect if you've ever done the "write foldl as foldr" trick or similar shenanigans.

For M

```haskell
instance AsRM M where
  asR (M k h m z) = R k (m.h) z
  asM = id
```

asM is trivially the identity, and since m is expected to be associative we don't really care that R is going to associate it strictly to the right. We just glue h onto the front to map the next piece of input into something we know how to merge.

Next is R

```haskell
instance AsRM R where
  asM (R k h z) = M (\f -> k (f z)) h (.) id
  asR = id
```

For right folds we do something a bit different. We transform each value into a function of type m -> m, which is the back half of a folding function. We can compose these associatively with (.) since they are just functions. Finally, when we need to present this, we apply this giant pipeline to the initial state and present the result. Notice here how we took a nonassociative function and bludgeoned it into associativity by partially applying it.
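The same partial-application trick shows up in miniature when recovering foldr from foldMap via Endo (whose Monoid is function composition); foldrViaFoldMap is my name for the standalone version:

```haskell
import Data.Monoid (Endo (..))

-- A step function a -> r -> r, partially applied to an element,
-- becomes an Endo r. Endo's (<>) is (.), which is associative,
-- so foldMap can combine the steps in any grouping. Applying the
-- resulting pipeline to the seed recovers foldr.
foldrViaFoldMap :: Foldable t => (a -> r -> r) -> r -> t a -> r
foldrViaFoldMap f z t = appEndo (foldMap (Endo . f) t) z
```

This is exactly the shape of asM above: elements become r -> r functions, the tensor is composition, and presentation applies the pipeline to the initial state.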

For L' we do something similar

```haskell
instance AsRM L' where
  asR (L' k h z) = R (\f -> k (f z)) (\b g x -> g $! h x b) id
  asM = asM . asR
```

We once again build up a pipeline of functions to make everything associative and apply it at the end. We can't just use (.) for composition, though, because we need to force intermediate results. That's why you see \b g x -> g $! h x b: it's just strict composition.

It makes sense that we’d bundle right and monoidal folds together because every right fold can be converted to a monoidal and every monoidal fold to a right. That means that every time we can satisfy one of these functions we can build the second.

This isn’t the case for left folds because we can’t convert a monoidal or right fold to a left one. For the people who are dubious of this, foldl doesn’t let us capture the same amount of laziness we need. I forgot about this too and subsequently hung my machine trying to prove Edward Kmett wrong.

This means that AsL' is a fairly boring class:

```haskell
class (AsRM p, AsL1' p) => AsL' p where
  asL' :: p a b -> L' a b

instance AsL' L where
  asL' (L k h z) = L' (\(Box r) -> k r) (\(Box r) a -> Box (h r a)) (Box z)
```

Now we finally see the point of Box: it's designed to stubbornly block attempts at making its contents strict. You can see this because all the instance for L does is wrap everything in Boxes! Since L' is the same as L with some extra seqs, we can use Box to nullify those attempts at strictness and give us a normal left fold.

That’s it! We’re done!

#### Wrap Up

Now that we've gone through a few concrete implementations and the overall structure of this package, hopefully this has come together for you. I must say, I'm really quite surprised at how effectively comonadic operations can capture compositional folds. I'm certainly going to make an effort to use this package or Gabriel's foldl a bit more in my random "tiny Haskell utility programs".

If you’re as entranced by these nice little folding libraries as I am, I’d recommend

Trivia fact: this is the longest article out of all 52 posts on Code & Co.

*Update: I decided it might be helpful to write some utility folds for folds. I figured this might be interesting to some.*

### Question about stepping, lists, ranges

Hello, I'm just starting the Learn You a Haskell tutorial and some stuff in the stepping/ranges part is boggling me a bit.

[2, 1..20] gives me an empty list.

[2, 2..20] gives an infinite list of 2's

[2, 3..20] gives [2..20]

[2, 4..20] gives [2, 4, 6, .. , 20]

What's going on here?
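For reference, this behaviour follows from how range syntax desugars in standard Haskell: [a, b .. c] means enumFromThenTo a b c, and the step is b - a (the snippet below just demonstrates the four cases from the question):

```haskell
-- [a, b .. c] desugars to enumFromThenTo a b c; the step is b - a.
down, same, byOne, byTwo :: [Int]
down  = [2, 1 .. 20]         -- step -1: counts down, never reaches 20, so []
same  = take 5 [2, 2 .. 20]  -- step 0: an infinite list of 2s (truncated here)
byOne = [2, 3 .. 20]         -- step 1: the same as [2 .. 20]
byTwo = [2, 4 .. 20]         -- step 2: evens up to 20
```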

submitted by sleepystudy

### Haskell + Sublime on Mac OS X

### HEAT: The Haskell Educational Advancement Tool - School of Computing - University of Kent

### Why does "assert" include the file location, while "error" does not?

I recently learned that "assert" will print the file location of an assert failure. It seems that this would be useful for "error" too. Asserts are not designed to be used as errors, since by default they are turned off when optimization is on. Is there an alternative to "error" that will print the file location?

submitted by -Robbie

### Magnus Therning: Adding some strictness

I've finally found some time to solve a few more exercises on exercism.io. One exercise, called *Robot Simulation*, I solved in a straightforward way, and promptly got the feedback that my choice of data type could lead to space leaks. Since it's a dimension of Haskell programming that I rarely (read "never") have had to think about, I thought I'd take the chance to play a bit.

I don’t want to give away the solution to the exercise so I’m using some code of a similar shape.

```haskell
import Data.List

newtype P = P (Int, Int)
  deriving (Eq, Show)

step :: P -> Char -> P
step (P c) 'A' = P (c `add` (0, 1))
step _ _ = undefined

add :: (Num a, Num b) => (a, b) -> (a, b) -> (a, b)
add (x0, y0) (x1, y1) = (x0 + x1, y0 + y1)

p :: P
p = foldl' step (P (0, 0)) $ replicate 5000000 'A'

main :: IO ()
main = print p
```

There's nothing complicated about the code, but the memory usage is rather ridiculous:

```
   1,132,746,080 bytes allocated in the heap
   1,149,823,448 bytes copied during GC
     346,747,064 bytes maximum residency (11 sample(s))
       3,817,608 bytes maximum slop
             600 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      2148 colls,     0 par    0.58s    0.58s     0.0003s    0.0006s
  Gen  1        11 colls,     0 par    0.56s    0.56s     0.0510s    0.2278s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.25s  (  0.25s elapsed)
  GC      time    1.15s  (  1.14s elapsed)
  EXIT    time    0.02s  (  0.02s elapsed)
  Total   time    1.42s  (  1.41s elapsed)

  %GC     time      80.9%  (81.0% elapsed)

  Alloc rate    4,593,597,274 bytes per MUT second

  Productivity  19.0% of total user, 19.1% of total elapsed
```

Surely it ought to be possible to add 5 million numbers in less than 600MB, right?

The reason, AFAIU is that tuples are lazy, which means a lot of thunk building when doing the additions.
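A small standalone demonstration of the difference that matters here (the error value and helper names are just for illustration): seq evaluates only to WHNF, i.e. the tuple constructor, so thunks inside the components survive it; deepseq forces the fields too.

```haskell
import Control.DeepSeq (deepseq)
import Control.Exception (SomeException, evaluate, try)

pair :: (Int, Int)
pair = (error "unevaluated thunk", 2)

-- seq stops at the outermost constructor; the bottom in the first
-- component is never touched.
whnfOnly :: IO (Either SomeException String)
whnfOnly = try (evaluate (pair `seq` "constructor forced"))

-- deepseq recurses into the fields and hits the bottom.
fullForce :: IO (Either SomeException String)
fullForce = try (evaluate (pair `deepseq` "fully forced"))
```

In the fold above, each addition builds exactly such an untouched thunk inside the tuple, five million times over.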

#### Using a stricter data type

The maybe easiest way to add some strictness is to add a few exclamation marks in the data type. Modifying the data type brings a few other changes though, so here's first a version without any strictness annotations, so I have something to compare with later on:

```haskell
{-# LANGUAGE RecordWildCards #-}

import Data.List

data P = P { pX :: Int, pY :: Int }
  deriving (Eq, Show)

p :: P
p = foldl' step (P 0 0) $ replicate 5000000 'A'

step :: P -> Char -> P
step q@(P {..}) 'A' = q { pX = pX + 1 }
step _ _ = undefined

main :: IO ()
main = print p
```

This solution uses about half as much memory:

```
     766,417,952 bytes allocated in the heap
     639,064,600 bytes copied during GC
     155,866,064 bytes maximum residency (11 sample(s))
       1,914,016 bytes maximum slop
             345 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      1460 colls,     0 par    0.32s    0.32s     0.0002s    0.0006s
  Gen  1        11 colls,     0 par    0.33s    0.33s     0.0303s    0.1071s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.15s  (  0.15s elapsed)
  GC      time    0.65s  (  0.65s elapsed)
  EXIT    time    0.01s  (  0.01s elapsed)
  Total   time    0.81s  (  0.81s elapsed)

  %GC     time      80.2%  (80.2% elapsed)

  Alloc rate    5,212,863,895 bytes per MUT second

  Productivity  19.8% of total user, 19.8% of total elapsed
```

Halving the memory usage is rather nice, but adding some strictness ought to make it better still. The code with added exclamation marks:

```haskell
{-# LANGUAGE RecordWildCards #-}

import Data.List

data P = P { pX :: !Int, pY :: !Int }
  deriving (Eq, Show)

p :: P
p = foldl' step (P 0 0) $ replicate 5000000 'A'

step :: P -> Char -> P
step q@(P {..}) 'A' = q { pX = pX + 1 }
step _ _ = undefined

main :: IO ()
main = print p
```

The runtime behaviour of this code is remarkably more frugal with memory:

```
     480,054,816 bytes allocated in the heap
          57,968 bytes copied during GC
          44,312 bytes maximum residency (2 sample(s))
          21,224 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0       918 colls,     0 par    0.00s    0.00s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0000s    0.0000s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.09s  (  0.09s elapsed)
  GC      time    0.00s  (  0.00s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    0.09s  (  0.09s elapsed)

  %GC     time       2.4%  (2.4% elapsed)

  Alloc rate    5,630,441,644 bytes per MUT second

  Productivity  97.3% of total user, 98.2% of total elapsed
```

Indeed, 1MB is a bit more like it!

#### Strictness with the original type

However impressive the results when using a stricter data type, it might still be preferable to keep the original, somewhat simpler, type. It is possible to do just that by placing some explicit strictness in well-chosen places. Since my suspicion is that it's the addition in step that causes the space leak, we'll rewrite it:

```haskell
import Data.List
import Control.DeepSeq

newtype P = P (Int, Int)
  deriving (Eq, Show)

add :: (Num a, Num b) => (a, b) -> (a, b) -> (a, b)
add (x0, y0) (x1, y1) = (x0 + x1, y0 + y1)

p :: P
p = foldl' step (P (0, 0)) $ replicate 5000000 'A'

step :: P -> Char -> P
step (P c) 'A' = let np = c `add` (0, 1) in np `deepseq` P np
step _ _ = undefined

main :: IO ()
main = print p
```

Note that it’s not enough to use seq; a bit more force is needed, which is what deepseq offers. It delivers sure enough:

```
    920,052,632 bytes allocated in the heap
        198,736 bytes copied during GC
         44,312 bytes maximum residency (2 sample(s))
         21,224 bytes maximum slop
              1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      1774 colls,     0 par    0.00s    0.00s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0001s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.29s  (  0.29s elapsed)
  GC      time    0.00s  (  0.00s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    0.30s  (  0.30s elapsed)

  %GC     time       1.4%  (1.4% elapsed)

  Alloc rate    3,124,303,717 bytes per MUT second

  Productivity  98.5% of total user, 98.9% of total elapsed
```

A further slight change is possible: implement NFData for P itself. That way it’s not necessary to pull out and force evaluation of the innards of P. Using the default implementation of rnf did not cut it though; since it uses seq it just isn’t forceful enough:

```haskell
import Data.List
import Control.DeepSeq

newtype P = P (Int, Int)
  deriving (Eq, Show)

instance NFData P where
  rnf (P q) = deepseq q ()

add :: (Num a, Num b) => (a, b) -> (a, b) -> (a, b)
add (x0, y0) (x1, y1) = (x0 + x1, y0 + y1)

p :: P
p = foldl' step (P (0, 0)) $ replicate 5000000 'A'

step :: P -> Char -> P
step (P c) 'A' = let np = P (c `add` (0, 1)) in force np
step _ _ = undefined

main :: IO ()
main = print p
```

The result is close to the previous:

```
    920,052,632 bytes allocated in the heap
        198,736 bytes copied during GC
         44,312 bytes maximum residency (2 sample(s))
         21,224 bytes maximum slop
              1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      1774 colls,     0 par    0.00s    0.00s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0001s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.33s  (  0.33s elapsed)
  GC      time    0.00s  (  0.00s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    0.33s  (  0.33s elapsed)

  %GC     time       1.2%  (1.2% elapsed)

  Alloc rate    2,817,175,186 bytes per MUT second

  Productivity  98.7% of total user, 99.0% of total elapsed
```

It looks like it’s slightly slower though, which might be caused by the double call to deepseq in this version.

**Conclusion**

From this little example it’s clear that a little strictness in well-chosen places can go a long way towards reducing memory footprint. The various solutions also show that the results can vary quite a bit depending on the strategy for introducing strictness. In this particular case I think the stricter data type (see *Using a stricter data type* above) is the best choice: it’s both faster and uses less memory than any of the others.

I’m using RecordWildCards to simplify the code a bit, which might influence evaluation as well. I’m simply not sure.↩

### Roman Cheplyaka: Taking advantage of type synonyms in monad-control

Bas van Dijk has recently released monad-control-1.0.0.0, the main purpose of which is to replace associated data types with associated type synonyms. The change caused minor breakages here and there, so people might wonder whether and why it was worth it. Let me show you a simple example that demonstrates the difference.

Let’s say we are writing a web application. wai defines an application as

```haskell
type Application = Request -> (Response -> IO ResponseReceived) -> IO ResponseReceived
```

Our web app will need a database connection, which we’ll pass using the ReaderT transformer:

```haskell
type ApplicationM m = Request -> (Response -> m ResponseReceived) -> m ResponseReceived

myApp :: ApplicationM (ReaderT DbConnection IO)
```

However, warp can only run an Application, not an ApplicationM:

```haskell
run :: Port -> Application -> IO ()
```

Can we build `runM :: Port -> ApplicationM m -> m ()` on top of the simple run function? Solving problems like this one is exactly the purpose of monad-control.

Here’s how such a function might look:

```haskell
runM :: (MonadBaseControl IO m) => Port -> ApplicationM m -> m ()
runM port app = do
  liftBaseWith $ \runInIO ->
    run port $ \req ack ->
      runInIO $ app req (liftBase . ack)
```

What’s going on here? liftBaseWith, like lift or liftBase, lets us run a primitive monadic action in a complex monad stack. The difference is that it also gives us a function, here named runInIO, which lets us “lower” complex actions to primitive ones. Here we use runInIO to translate the return value of our app, m (), into a basic IO () value that the run function can digest.

All is well, except…

```
Could not deduce (StM m ResponseReceived ~ ResponseReceived)
Expected type: IO ResponseReceived
  Actual type: IO (StM m ResponseReceived)
Relevant bindings include
  runInIO :: RunInBase m IO
In the expression: runInIO $ app req (liftBase . ack)
```

The type of runInIO is forall a. m a -> IO (StM m a) (a.k.a. RunInBase m IO), while we would like a simple forall a. m a -> IO a. The purpose of StM is to encompass any “output effects”, such as state or error.

In our case, we don’t have any “output effects” (nor would we be allowed to), so StM (ReaderT DbConnection IO) ResponseReceived is really isomorphic to ResponseReceived.
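To see concretely why a reader-like stack needs no StM wrapper, here is a base-only sketch using the transformers package (DbConnection here is a stand-in type for illustration, not taken from any of the libraries above): supplying the environment already yields a lowering function of type forall a. ReaderT DbConnection IO a -> IO a.

```haskell
import Control.Monad.Trans.Reader (ReaderT, asks, runReaderT)

-- Stand-in for a real database connection (illustration only).
newtype DbConnection = DbConnection String

-- A reader has no "output effects": once we supply the environment,
-- any action lowers to plain IO with no StM wrapper in sight.
lower :: DbConnection -> ReaderT DbConnection IO a -> IO a
lower conn m = runReaderT m conn
```

This is essentially what runInIO does for this particular stack, which is why demanding StM m ResponseReceived ~ ResponseReceived costs nothing here.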

In monad-control 0.x, StM used to be an associated data family, and its constructors for the standard monad transformers were hidden. Even though we knew that the above two types were isomorphic, we still couldn’t resolve the error nicely.

Not anymore! Since in monad-control 1.0 StM is an associated type synonym, StM (ReaderT DbConnection IO) ResponseReceived and ResponseReceived are not just hypothetically isomorphic; they are literally the same type. After we add the corresponding equality constraint to runM

```haskell
runM :: (MonadBaseControl IO m, StM m ResponseReceived ~ ResponseReceived)
     => Port -> ApplicationM m -> m ()
```

…and our app compiles!

This example is not just an isolated case. The general problem with monad-control is that it is all too easy to discard the output effects, as Edward Yang shows.

Monads for which StM m a ~ a provide a “safe subset” of monad-control. Previously, it was hard to tell apart safe and unsafe uses, because the output effects or absence thereof hid behind the opaque StM data family.

Now not only is it transparent when the output effects are absent, but we can actually encode that fact right in the type system! As an example, Mitsutoshi Aoe and I are experimenting with a safe lifted async module.

One may wonder if this subset is too boring, since it only includes monads that are isomorphic to a reader transformer over the base monad. While that is technically true, there are a lot of things you can do with a reader. The ZoomT and CustomWriterT transformers that I described in another article, as well as the Proxied transformer they’re based upon, are reader-like and thus safe to use with monad-control.

### Dominic Steinitz: Stopping Times

Let $S$ and $T$ be stopping times and let the filtration on which they are defined be right continuous. Then

- $S \wedge T$,
- $S \vee T$,
- $S + T$, and
- $\alpha S$

are stopping times, where $\alpha > 1$.

For the first we have $\{S \wedge T \le t\} = \{S \le t\} \cup \{T \le t\}$ and both the latter are in $\mathcal{F}_t$ by the definition of a stopping time.

Similarly for the second, $\{S \vee T \le t\} = \{S \le t\} \cap \{T \le t\}$.

For the fourth we have $\{\alpha S \le t\} = \{S \le t/\alpha\} \in \mathcal{F}_{t/\alpha} \subseteq \mathcal{F}_t$ since $t/\alpha \le t$.

The third is slightly trickier. For $t > 0$, $S + T < t$ if and only if for some rational $q$ we have $S + T < q < t$. We can thus find a rational $q_1$ such that $S < q_1 < q - T$. Writing $q_2 = q - q_1$ we also have $T < q_2$. Thus $S + T < t$ if and only if there exist rationals $q_1$ and $q_2$ such that $q_1 + q_2 < t$ and $S < q_1$ and $T < q_2$. In other words

$$\{S + T < t\} = \bigcup_{\substack{q_1, q_2 \in \mathbb{Q} \\ q_1 + q_2 < t}} \bigl(\{S < q_1\} \cap \{T < q_2\}\bigr)$$

By right continuity (Protter 2004, Theorem 1) of the filtration, we know the terms on the right-hand side are in $\mathcal{F}_t$, and so the whole right-hand side is in $\mathcal{F}_t$. We thus know that the left-hand side is in $\mathcal{F}_t$ and, using right continuity again, that $S + T$ therefore must be a stopping time.

**Bibliography**

Protter, P.E. 2004. *Stochastic Integration and Differential Equations: Version 2.1*. Applications of Mathematics. Springer. http://books.google.co.uk/books?id=mJkFuqwr5xgC.

### Daniil Frumin: ANN: Hastache version 0.6.1

Happy holidays, everyone!

I would like to announce a new version of the Hastache library, version 0.6.1. Some interesting and useful changes, as well as improvements and bugfixes are included in the release. See below for an extended changelog.

Hastache is a Haskell implementation of the mustache templating system.

**Quick start**

```
cabal update
cabal install hastache
```

A simple example:

```haskell
import Text.Hastache
import Text.Hastache.Context
import qualified Data.Text.Lazy.IO as TL

main = hastacheStr defaultConfig (encodeStr template)
         (mkStrContext context) >>= TL.putStrLn

template = "Hello, {{name}}!\n\nYou have {{unread}} unread messages."

context "name"   = MuVariable "Haskell"
context "unread" = MuVariable (100 :: Int)
```

Read the Mustache documentation for template syntax; consult the README for more details.

**What’s new in 0.6.1?**

Most of the new features in this release deal with generic contexts.

**Context merging**

composeCtx is a left-leaning composition of contexts. Given contexts c1 and c2, the behaviour of (c1 <> c2) x is as follows: if c1 x produces MuNothing, then the result is c2 x; otherwise the result is c1 x. Even if c1 x is MuNothing, the monadic effects of c1 still take place.
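Hastache’s real types aside, the left-leaning rule can be sketched with a toy model in which a context is an effectful lookup and Nothing plays the role of MuNothing (this composeCtx is a re-implementation for illustration, not the library’s code):

```haskell
-- Toy model of a context: an effectful lookup from keys to
-- possibly-missing values (Nothing stands in for MuNothing).
type Ctx m = String -> m (Maybe String)

-- Left-leaning composition: try c1 first and fall back to c2 only
-- when c1 yields Nothing. c1's monadic effects run either way.
composeCtx :: Monad m => Ctx m -> Ctx m -> Ctx m
composeCtx c1 c2 x = do
  r1 <- c1 x
  case r1 of
    Just v  -> return (Just v)
    Nothing -> c2 x
```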

**Generic contexts for more datatypes**

The mkGenericContext function now supports additional datatypes like Maybe (with Nothing being an empty/null value) and Either.

**Context modifiers and renaming**

The new mkGenericContext' is a generalized version of mkGenericContext that takes two additional arguments. The first one, of type String -> String, is simply a renaming function, similar to aeson’s fieldLabelModifier. To see this feature in action, consider the following example:

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Text.Hastache
import Text.Hastache.Context
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.IO as TL
import Data.Data (Data, Typeable)

data Test = Test { f :: Int } deriving (Data, Typeable)

val1 :: Test
val1 = Test 1

val2 :: Test
val2 = Test 2

r "f" = "foo"
r x   = x

example :: Test -> IO TL.Text
example v = hastacheStr defaultConfig (encodeStr template)
              (mkGenericContext' r defaultExt v)

template = "An integer: {{foo}}"

main = do
  example val1 >>= TL.putStrLn
  example val2 >>= TL.putStrLn
```

In the example we use the renaming function r to rename the field “f” to “foo”.

The second additional argument is a *query extension*, of type Ext:

A query extension is a way of turning arbitrary datatypes into strings. This might come in very handy, if you want to generate mustache contexts from records/datatypes that contain non-primitive datatypes (from non-base modules) that you want to display. Before 0.6.1, if you had a record that contained, for example, a Decimal field, and you wanted to convert it to a context and access that field, you were simply out of luck. With this release you can basically extend the mkGenericContext' function to support any datatypes you want! Once again, I believe an example is worth a thousand words, so let us consider a slightly modified version of the example above:

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE StandaloneDeriving #-}

-- Custom extension function for types that are not supported out of
-- the box in generic contexts

import Text.Hastache
import Text.Hastache.Context
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.IO as TL
import Data.Data (Data, Typeable)
import Data.Decimal
import Data.Generics.Aliases (extQ)

data DecimalOrInf = Inf | Dec Decimal deriving (Data, Typeable)
deriving instance Data Decimal

data Test = Test { n :: Int, m :: DecimalOrInf } deriving (Data, Typeable)

val1 :: Test
val1 = Test 1 (Dec $ Decimal 3 1500)

val2 :: Test
val2 = Test 2 Inf

query :: Ext
query = defaultExt `extQ` f
  where f Inf     = "+inf"
        f (Dec i) = show i

r "m" = "moo"
r x   = x

example :: Test -> IO TL.Text
example v = hastacheStr defaultConfig (encodeStr template)
              (mkGenericContext' r query v)

template = concat
  [ "An int: {{n}}\n"
  , "{{#moo.Dec}}A decimal number: {{moo.Dec}}{{/moo.Dec}}"
  , "{{#moo.Inf}}An infinity: {{moo.Inf}}{{/moo.Inf}}"
  ]

main = do
  example val1 >>= TL.putStrLn
  example val2 >>= TL.putStrLn
```

As you can see, query extensions are combined using the extQ function from Data.Generics, and the “unit” of this whole thing is the defaultExt function.

**Links**

**Acknowledgments**

This release would not have been possible without Tobias Florek, Edsko de Vries, Janne Hellsten, @clinty on Github, Stefan Kersten, Herbert Valerio Riedel, and other people who were submitting issues, patches, and requests.

Tagged: haskell, hastache

### Proposal to Haskell's pattern matching - (ie, what am I missing..?)

(Right off the bat - I'm no expert or professional on the subject matter, so please give me advice or critique (and I tried to do the code formatting but it doesn't seem to work :/))

A rewarding part of Haskell is that you can sometimes write clean and neat looking code. And occasionally, when you're doing trivial operations, you might run into the task of comparing two variables/functions/whatever.

This is where I feel inspired by other languages, more specifically Prolog. Say I want to simply compare two things (a and b) to see if they are the same (they must be comparable, of course).

In Haskell, I can write for example: comp a b = if a == b then...

or perhaps: comp a b | a == b = ...

This is so much talk for so little work! Is there any more concise way of writing this that I simply don't know? Because if not, why not do as they do in Prolog, where the corresponding predicate would look like: comp(X, X):- ... ., meaning both variables are the same. If you write anything like that in Haskell you get an error, obviously. But my suggestion would mean you could write it like this: comp a a = ... --(what would follow in the then-clause)

Perhaps you won't actually save any typing doing this, since you would want to match the second case too.. But please if you know why this would totally ruin everything or not work then please speak your mind.

submitted by Tickstart

### Is the LYAH website newer than the book?

I'm just wondering because I've been reading both and noticed that the chapters don't all line up. My other question would be does anyone know what the key differences are?

submitted by fellow_redditor

### Danny Gratzer: Examining Hackage: operational

In this installment of “jozefg is confused by other people’s code” we turn to operational. This is a package that’s a little less known than I’d like. It provides a way of turning an ADT of instructions into a monad that can be used with do notation, separating out interpretation.

Most people familiar with free monads are wondering what the difference is between operational’s approach and using free monads. Going into this, I have no clue. Hopefully this will become clear later on.

Diving Into The SourceLet’s get started shall we

```
~$ cabal get operational
```

Happily enough, there’s just one (small) file so we’ll go through that.

To start with Control.Monad.Operational exports

```haskell
module Control.Monad.Operational
  ( Program, singleton, ProgramView, view, interpretWithMonad
  , ProgramT, ProgramViewT(..), viewT, liftProgram
  ) where
```

Like with most “provides a single monad” packages, I’m most interested in how Program works. Looking at this, we see that it’s just a synonym

```haskell
type Program instr = ProgramT instr Identity
```

Just like the mtl, this is defined in terms of a transformer. So what’s this transformer?

```haskell
data ProgramT instr m a where
  Lift  :: m a -> ProgramT instr m a
  Bind  :: ProgramT instr m b -> (b -> ProgramT instr m a)
        -> ProgramT instr m a
  Instr :: instr a -> ProgramT instr m a
```

So ProgramT is a GADT; this is actually important because Bind has an existential type variable: b. Otherwise this is really just a plain tree. I assume (>>=) = Bind and return = Lift . return in the Monad instance for this. And finally we can see that instructions are also explicitly supported with Instr.

We can confirm that the Monad instance is as boring as we’d expect with

```haskell
instance Monad m => Monad (ProgramT instr m) where
  return = Lift . return
  (>>=)  = Bind

instance MonadTrans (ProgramT instr) where
  lift = Lift

instance Monad m => Functor (ProgramT instr m) where
  fmap = liftM

instance Monad m => Applicative (ProgramT instr m) where
  pure  = return
  (<*>) = ap
```

So clearly there’s no interesting computation happening here. Looking at the export list again, we see that there’s a helpful combinator, singleton, for building up these Program[T]s since they’re kept abstract.

```haskell
singleton :: instr a -> ProgramT instr m a
singleton = Instr
```

Which once again is very boring.

So this is a lot like free monads it seems since neither one of these actually does much in its monad instance. Indeed the equivalent with free monads would be

```haskell
data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Monad (Free f) where
  return = Pure
  Pure a >>= f = f a
  Free a >>= f = Free (fmap (>>= f) a)

singleton :: Functor f => f a -> Free f a
singleton = Free . fmap Pure
```

The obvious differences are that

- Free requires a functor while Program doesn’t
- Free’s monad instance automatically guarantees the laws

The second is the bigger one for me. Free has a tighter set of constraints on its f so it can guarantee the monad laws. This is clearly false for Program, since return a >>= f introduces an extra Bind instead of just giving f a.

This would explain why ProgramT is kept abstract, it’s hopelessly broken just to expose it in its raw form. Instead what we have to do is somehow partially normalize it before we present it to the user.

Indeed that’s exactly what ProgramViewT is representing. It’s a simpler data type

```haskell
data ProgramViewT instr m a where
  Return :: a -> ProgramViewT instr m a
  (:>>=) :: instr b -> (b -> ProgramT instr m a)
         -> ProgramViewT instr m a
```

This apparently “compiles” a Program so that everything is either binding an instruction or a pure value. What’s interesting is that this seems to get rid of all the Lifts as well.

How do we produce one of these? Well that seems to be viewT’s job.

```haskell
viewT :: Monad m => ProgramT instr m a -> m (ProgramViewT instr m a)
viewT (Lift m)                = m >>= return . Return
viewT ((Lift m)     `Bind` g) = m >>= viewT . g
viewT ((m `Bind` g) `Bind` h) = viewT (m `Bind` (\x -> g x `Bind` h))
viewT ((Instr i)    `Bind` g) = return (i :>>= g)
viewT (Instr i)               = return (i :>>= return)
```

Note that this function returns an m (ProgramViewT instr m a), not just a plain ProgramViewT. This makes sense because we have to get rid of the Lifts. What I think is particularly interesting here is that the 2nd and 3rd cases are just the monad laws!

The second one says binding to a computation is just applying the function to it in the obvious manner. The third re-associates bind in a way guaranteed by the monad laws.

This means that while ProgramT isn’t going to satisfy the monad laws, we can’t tell because all the things said to be equal by the monad laws will compile to the same view. Terribly clever stuff.

The rest of the module is mostly boring stuff like Monad* instances. The last interesting function is interpretWithMonad

```haskell
interpretWithMonad :: forall instr m b. Monad m
                   => (forall a. instr a -> m a)
                   -> (Program instr b -> m b)
interpretWithMonad f = eval . view
  where eval :: forall a. ProgramView instr a -> m a
        eval (Return a) = return a
        eval (m :>>= k) = f m >>= interpretWithMonad f . k
```

This nicely highlights how you’re supposed to write an interpreter for a Program: eval handles the two cases of the view using the mapping to a monad we provided, and view handles actually compiling the program into these two cases. All in all, not too shabby.
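To see the whole pattern end to end, here is a self-contained miniature (re-implemented from scratch rather than importing operational): a two-instruction counter DSL, the Bind-spine Program tree, and an interpreter in the spirit of interpretWithMonad, except that it collapses the tree directly instead of going through a view.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE RankNTypes #-}

import Data.IORef

-- The instruction set: a tiny counter DSL.
data Counter a where
  Incr :: Counter ()
  Get  :: Counter Int

-- A miniature version of ProgramT over Identity.
data Program instr a where
  Return :: a -> Program instr a
  Bind   :: Program instr b -> (b -> Program instr a) -> Program instr a
  Instr  :: instr a -> Program instr a

-- Interpret by folding the Bind spine with a per-instruction handler.
interpret :: Monad m => (forall a. instr a -> m a) -> Program instr b -> m b
interpret _ (Return a) = return a
interpret f (Bind m k) = interpret f m >>= interpret f . k
interpret f (Instr i)  = f i

-- Increment twice, then read the counter.
prog :: Program Counter Int
prog = Instr Incr `Bind` \_ -> Instr Incr `Bind` \_ -> Instr Get

-- One possible interpretation: mutable state in IO.
runIO :: Program Counter b -> IO b
runIO p = do
  ref <- newIORef (0 :: Int)
  let step :: Counter a -> IO a
      step Incr = modifyIORef ref (+ 1)
      step Get  = readIORef ref
  interpret step p
```

runIO prog increments twice and reads the counter; swapping step for a pure State-based handler would reuse the same prog unchanged, which is the whole point of separating instructions from interpretation.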

**Surprise, There Were Docs The Whole Time!**

Now I assume that most people didn’t actually download the source to operational, but you really should! Inside you’ll find a whole directory, doc. It contains a few markdown files with explanations and references to the appropriate papers, as well as a couple of examples of actually building things with operational.

Now that you understand how the current implementation works, you should be able to understand most of what is being said there.

**Wrap Up**

So operational illustrates a neat trick I rather like: using modularity to provide an O(1) implementation of >>= and hide its rule-breaking with a view.

This package also drops the positivity requirement that Free implies with its functor constraint. Which I suppose means you could have

```haskell
data Foo a where
  Bar :: (a -> ...) -> Foo a
```

Which is potentially useful.

Last but not least, operational really exemplifies having a decent amount of documentation even though there’s only ~100 lines of code. I think the ratio of documentation : code is something like 3 : 1 which I really appreciate.
