News aggregator

a brainf*ck monad

Haskell on Reddit - Thu, 12/11/2014 - 11:12pm
Categories: Incoming News - Thu, 12/11/2014 - 7:36pm
Categories: Offsite Blogs

Oliver Charles: 24 Days of GHC Extensions: Type Families

Planet Haskell - Thu, 12/11/2014 - 6:00pm

Today, we’re going to look at an extension that radically alters the behavior of GHC Haskell by extending what we can do with types. The extension that we’re looking at is known as type families, and it has a wide variety of applications.

> {-# LANGUAGE FlexibleContexts #-}
> {-# LANGUAGE TypeFamilies #-}
> import Control.Concurrent.STM
> import Control.Concurrent.MVar
> import Data.Foldable (forM_)
> import Data.IORef

As the extension is so large, we’re only going to touch the surface of the capabilities - though this extension is well documented, so there’s plenty of extra reading for those who are interested!

Associated Types

To begin, let’s look at the interaction of type families and type classes. In ordinary Haskell, a type class can associate a set of methods with a type. The type families extension will now allow us to associate types with a type.

As an example, let’s try to abstract over the various mutable stores that we have available in Haskell. In the IO monad, we can use IORefs and MVars to store data, whereas other monads have their own specific stores, as we’ll soon see. To begin with, we’ll start with a class over the different types of store:

> class IOStore store where
>   newIO :: a -> IO (store a)
>   getIO :: store a -> IO a
>   putIO :: store a -> a -> IO ()

This works fine for IO stores: we can add an instance for MVar…

> instance IOStore MVar where
>   newIO = newMVar
>   getIO = readMVar
>   putIO mvar a = modifyMVar_ mvar (return . const a)

and an instance for IORef:

> instance IOStore IORef where
>   newIO = newIORef
>   getIO = readIORef
>   putIO ioref a = modifyIORef ioref (const a)

Now we have the ability to write functions that are polymorphic over stores:

> type Present = String
> storePresentsIO :: IOStore store => [Present] -> IO (store [Present])
> storePresentsIO xs = do
>   store <- newIO []
>   forM_ xs $ \x -> do
>     old <- getIO store
>     putIO store (x : old)
>   return store

While this example is obviously contrived, hopefully you can see how we are able to interact with a memory store without choosing which store we are committing to. We can use this by choosing the type we need, as the following GHCi session illustrates:

.> s <- storePresentsIO ["Category Theory Books"] :: IO (IORef [Present])
.> :t s
s :: IORef [Present]
.> getIO s
["Category Theory Books"]

Cool - now we can go and extend this to TVar and other STM cells! Ack… there is a problem. Reviewing our IOStore type class, we can see that we’ve committed to working in the IO monad - and that’s a shame. What we’d like to be able to do is associate the type of monad with the type of store we’re using - as knowing the store tells us the monad that we have to work in.

To use type families, we use the type keyword within the class definition, and specify the kind of the type:

> class Store store where
>   type StoreMonad store :: * -> *
>   new :: a -> (StoreMonad store) (store a)
>   get :: store a -> (StoreMonad store) a
>   put :: store a -> a -> (StoreMonad store) ()

As you can see, the types of the methods in the type class have become a little more complicated. Rather than working in the IO monad, we calculate the monad by using the StoreMonad type family.

The instances are similar to what we saw before, but we also have to provide the necessary type of monad:

> instance Store IORef where
>   type StoreMonad IORef = IO
>   new = newIORef
>   get = readIORef
>   put ioref a = modifyIORef ioref (const a)
>
> instance Store TVar where
>   type StoreMonad TVar = STM
>   new = newTVar
>   get = readTVar
>   put tvar a = modifyTVar tvar (const a)

As you can see - our methods don’t need to change at all; type families naturally extend the existing type class functionality. Our original storePresentsIO can now be made to work in any monad, with only a change to the type:

> storePresents :: (Store store, Monad (StoreMonad store))
>               => [Present] -> (StoreMonad store) (store [Present])
> storePresents xs = do
>   store <- new []
>   forM_ xs $ \x -> do
>     old <- get store
>     put store (x : old)
>   return store

As we have an instance for Store TVar, we can now use this directly in an STM transaction:

.> atomically (do (storePresents ["Distributed Computing Through Combinatorial Topology"] :: STM (TVar [Present])) >>= get)
["Distributed Computing Through Combinatorial Topology"]
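The same class should stretch to stores outside IO and STM. As a minimal sketch, here is an instance for STRef, whose cells live in the ST monad. This instance is an assumption for illustration and is not part of the original post; presentsInST and pinToSTRef are hypothetical helper names, and the class is repeated so the block stands alone.

```haskell
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeFamilies #-}

import Control.Monad.ST (ST, runST)
import Data.Foldable (forM_)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)

-- Same class as in the post: each store names the monad it lives in.
class Store store where
  type StoreMonad store :: * -> *
  new :: a -> StoreMonad store (store a)
  get :: store a -> StoreMonad store a
  put :: store a -> a -> StoreMonad store ()

-- Assumed extra instance (not in the post): STRef cells live in ST s.
instance Store (STRef s) where
  type StoreMonad (STRef s) = ST s
  new = newSTRef
  get = readSTRef
  put = writeSTRef

type Present = String

-- Unchanged from the post.
storePresents :: (Store store, Monad (StoreMonad store))
              => [Present] -> StoreMonad store (store [Present])
storePresents xs = do
  store <- new []
  forM_ xs $ \x -> do
    old <- get store
    put store (x : old)
  return store

-- Hypothetical helper: run storePresents in ST and return the final list.
presentsInST :: [Present] -> [Present]
presentsInST xs = runST (inST xs)
  where
    inST :: [Present] -> ST s [Present]
    inST ys = do
      ref <- pinToSTRef (storePresents ys)
      get ref
    -- Identity function whose type pins the store to STRef,
    -- playing the role of the type annotation in the GHCi sessions.
    pinToSTRef :: ST s (STRef s a) -> ST s (STRef s a)
    pinToSTRef = id
```

Because StoreMonad is not injective, knowing the monad alone does not tell GHC which store is meant; that is why the store type has to be pinned somewhere, just as the GHCi sessions do with an annotation.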


Type Families and Computation

What we’ve seen so far is extremely useful, but the fun needn’t stop there! Type families also give us the ability to compute over types! Traditionally, Haskell is built around value level computation - running programs should do something. That said, we all know how useful it is to have functions - so why can’t we have them at the type level? Well, now that we have the ability to associate types with types, we can!
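To make "functions at the type level" a little more concrete, here is a small sketch using only the TypeFamilies extension. It declares a top-level (open) type family that maps a container type to its element type. The names Element, Container, elements and countElements are made up for this illustration and do not come from the post.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- A type-level function: given a container type, compute its element type.
type family Element c
type instance Element [a] = a
type instance Element (Maybe a) = a

-- The result type of 'elements' is computed from the argument type.
class Container c where
  elements :: c -> [Element c]

instance Container [a] where
  elements = id

instance Container (Maybe a) where
  elements = maybe [] (\x -> [x])

-- Works for any container, whatever its element type turns out to be.
countElements :: Container c => c -> Int
countElements = length . elements
```

Here GHC evaluates Element [Char] to Char and Element (Maybe Int) to Int while type checking, much as it would evaluate an ordinary function at run time.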

To look at this new functionality (closed type families), we need a few more extensions to really unlock the potential here, so I’ll finish this blog post on that cliffhanger. Watch this space!

This post is part of 24 Days of GHC Extensions - for more posts like this, check out the calendar.

Categories: Offsite Blogs

Sean Seefried: File system snapshots make build scripts easy

Planet Haskell - Thu, 12/11/2014 - 6:00pm
or, how Docker can relieve the pain of developing long running build scripts

I think I’ve found a pretty compelling use case for Docker. But before you think that this is yet another blog post parroting the virtues of Docker, I’d like to make clear that this post is really about the virtues of treating your file system as a persistent data structure. Thus, the insights of this post are equally applicable to other copy-on-write filesystems such as btrfs and ZFS.

The problem

Let’s start with the problem I was trying to solve. I was developing a long running build script that consisted of numerous steps.

  • It took 1-2 hours to run.
  • It downloaded many fairly large files from the Internet. (One exceeded 300M.)
  • Later stages depended heavily on libraries built in earlier stages.

But the most salient feature was that it took a long time to run.

Filesystems are inherently stateful

We typically interact with filesystems in a stateful way. We might add, delete or move a file. We might change a file’s permissions or its access times. In isolation most actions can be undone; for example, you can move a file back to its original location after having moved it somewhere else. What we don’t typically do is take a snapshot and revert back to that state. This post will suggest that making more use of this feature can be a great boon to developing long running build scripts.

Snapshots using union mounts

Docker uses a union filesystem called AUFS. A union filesystem implements what is known as a union mount. As the name suggests, this means that files and directories of separate file systems are layered on top of each other, forming a single coherent file system. This is done in a hierarchical manner. If a file appears in two filesystems, the one further up the hierarchy will be the one presented. (The version of the file further down the hierarchy is there, unchanged, but invisible.)

Docker calls each filesystem in the union mount a layer. The upshot of using this technology is that it implements snapshots as a side effect. Each snapshot is simply a union mount of all the layers up to a certain point in the hierarchy.
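The lookup rule described above can be sketched as a toy model in Haskell. This is not how AUFS or Docker is implemented; it only mirrors the description: layers are plain association lists, the topmost layer containing a path wins, and a snapshot is the union of the first n layers. Layer and readAt are hypothetical names.

```haskell
import Data.Maybe (listToMaybe, mapMaybe)

-- One layer of the union mount: file paths and their contents.
type Layer = [(FilePath, String)]

-- Read a file from the snapshot made of the first n layers.
-- Layers are listed bottom first, in the order they were created,
-- so we search them in reverse: the topmost match is the one presented.
readAt :: Int -> [Layer] -> FilePath -> Maybe String
readAt n layers path =
  listToMaybe (mapMaybe (lookup path) (reverse (take n layers)))
```

Reading at a smaller n never touches the later layers, which is why earlier snapshots stay available unchanged.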

Snapshots for build scripts

Snapshots make developing a long-running build script a dream. The general idea is to break the script up into smaller scripts (which I like to call scriptlets) and run each one individually, snapshotting the filesystem after each one is run. (Docker does this automatically.) If a scriptlet fails, you simply go back to the last snapshot (still in its pristine state!) and try again. Once you have completed your build script you have a very high assurance that the script works and can now be distributed to others.

Contrast this with what would happen if you weren’t using snapshots. Except for those among us with monk-like patience, no one is going to run their build script from scratch when it fails an hour and a half into building. Naturally, we’ll try our best to put the system back into the state it was in before we try to build the component that failed last time. For example, we might delete a directory or run a make clean.
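The workflow can be sketched with the filesystem modelled as an immutable value, which is the "persistent data structure" view this post argues for. The sketch below is a pure toy model under that assumption, not a Docker API; FS, Scriptlet, runAll, writeFileStep and failWith are hypothetical names.

```haskell
-- The filesystem as an immutable value: paths and contents.
type FS = [(FilePath, String)]

-- A scriptlet either fails with a message or produces a new filesystem.
type Scriptlet = FS -> Either String FS

-- Run scriptlets in order, keeping a snapshot after each success.
-- Returns the snapshots (newest first) and the error, if any.
-- On failure the newest snapshot is the last good state, untouched.
runAll :: FS -> [Scriptlet] -> ([FS], Maybe String)
runAll start = go [start] start
  where
    go snaps _ [] = (snaps, Nothing)
    go snaps current (s : rest) =
      case s current of
        Left err -> (snaps, Just err)
        Right next -> go (next : snaps) next rest

-- Example scriptlet: add a file.
writeFileStep :: FilePath -> String -> Scriptlet
writeFileStep path contents fs = Right ((path, contents) : fs)

-- Example scriptlet: always fails.
failWith :: String -> Scriptlet
failWith msg _ = Left msg
```

After a failure, the fixed scriptlet and everything after it can be run again with runAll starting from the newest snapshot, without repeating the steps that already succeeded.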

However, we might not have perfect understanding of the component we’re trying to build. It might have a complicated Makefile that puts files in places on the file system which we are unaware of. The only way to be truly sure is to revert to a snapshot.

Using Docker for snapshotted build scripts

In this section I’ll cover how I used Docker to implement a build script for a GHC 7.8.3 ARM cross compiler. Docker was pretty good for this task, but not perfect. I did some things that might look wasteful or inelegant but were necessary in order to keep the total time developing the script to a minimum. The build script can be found here.

Building with a Dockerfile

Docker reads from a file called Dockerfile to build images. A Dockerfile contains a small vocabulary of commands to specify what actions should be performed. A complete reference can be found here. The main ones used in my script are WORKDIR, ADD, and RUN. The ADD command is particularly useful because it allows you to add files that are external to the current Docker image into the image’s filesystem before running them. You can see the many scriptlets that make up the build script here.

Design

1. ADD scriptlets just before you RUN them.

If you ADD all the scriptlets too early in the Dockerfile you may run into the following problem: your script fails, you go back to modify the scriptlet and you run docker build . again. But you find that Docker starts building at the point where the scriptlets were first added! This wastes a lot of time and defeats the purpose of using snapshots.

This happens because of how Docker tracks its intermediate images (snapshots). As Docker steps through the Dockerfile it compares the current command with an intermediate image to see if there is a match. However, in the case of the ADD command the contents of the files being put into the image are also examined. This makes sense. If the files have changed with respect to an existing intermediate image, Docker has no choice but to build a new image from that point onwards. There’s just no way it can know that those changes don’t affect the build. Even if they wouldn’t, it must be conservative.

Also, beware using RUN commands that would cause different changes to the filesystem each time they are run. In this case Docker will find the intermediate image and use it, but this will be the wrong thing for it to do. RUN commands must cause the same change to the filesystem each time they are run. As an example, I ensured that in my scriptlets I always downloaded a known version of a file with a specific MD5 checksum.

A more detailed explanation of Docker’s build cache can be found here.
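The effect of ADD placement can be illustrated with a toy model of the matching rule described above. It is only a sketch under simplifying assumptions, not Docker’s actual cache: an ADD step is compared by file name and contents, a RUN step by its command text alone, and rebuilding starts at the first step that does not match. Step, cachedSteps, addEarly, addLate and the file names a.sh and b.sh are hypothetical.

```haskell
-- A build step in the toy model.
data Step
  = Add FilePath String -- file name and its current contents
  | Run String          -- command text only; its effects are not inspected
  deriving (Eq, Show)

-- Number of leading steps that can be reused from the previous build.
cachedSteps :: [Step] -> [Step] -> Int
cachedSteps previous current =
  length (takeWhile id (zipWith (==) previous current))

-- All scriptlets added up front; the argument is the contents of b.sh.
addEarly :: String -> [Step]
addEarly b = [Add "a.sh" "v1", Add "b.sh" b, Run "./a.sh", Run "./b.sh"]

-- Each scriptlet added just before it is run.
addLate :: String -> [Step]
addLate b = [Add "a.sh" "v1", Run "./a.sh", Add "b.sh" b, Run "./b.sh"]
```

In this model, editing b.sh leaves only one reusable step with addEarly (cachedSteps (addEarly "v1") (addEarly "v2") is 1), so a.sh is run again, whereas addLate keeps two (cachedSteps (addLate "v1") (addLate "v2") is 2) and the result of running a.sh is reused. Because Run steps compare by text only, the model also shows why a RUN command with varying effects would be wrongly treated as cached.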

2. Don’t use the ENV command to set environment variables. Use a scriptlet.

It may seem tempting to use the ENV command to set up all the environment variables you need for your build script. However, it does not perform variable substitution the way a shell would. For example, ENV BASE=$HOME/base will set BASE to have the literal value $HOME/base, which is probably not what you want.

Instead I used the ADD command to add a file that sets the environment variables. This file is included in each subsequent scriptlet with:

THIS_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
source $THIS_DIR/

What if you don’t get it right the first time around? Since it’s added so early in the Dockerfile, doesn’t this mean that modifying it would invalidate all subsequent snapshots?

Yes, and this leads to some inelegance. While developing the script I discovered that I’d missed adding a useful environment variable in it. The solution was to create a new file containing:

THIS_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
source $THIS_DIR/
if ! [ -e "$CONFIG_SUB_SRC/config.sub" ] ; then
  CONFIG_SUB_SRC=${CONFIG_SUB_SRC:-$NCURSES_SRC}
fi

I then included this file in all subsequent scriptlets. Now that I have finished the build script I could go back and fix this up but, in a sense, it would defeat the purpose. I would have to run the build script from scratch to see if this change worked.


The one major drawback to this approach is that the resulting image is larger than it needs to be. This is especially true in my case because I remove a large number of files at the end. However, these files are still present in a lower layer filesystem in the union mount, so the entire image is larger than it needs to be by at least the size of the removed files.

However, there is a work-around. I did not publish this image to the Docker Hub Registry. Instead, I:

  • used docker export to export the contents as a tar archive.
  • created a new Dockerfile that simply added the contents of this tar archive.

The resulting image was as small as it could be.
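The size difference can be seen in a toy model of layers that records removals. As before, this is only an illustrative sketch and not Docker’s storage format: a layer maps a path either to new contents or to a removal marker, and sizes are just string lengths. Layer, applyLayer, flatten, layeredSize and flatSize are hypothetical names.

```haskell
-- A layer records new contents (Just) or a removal (Nothing) per path.
type Layer = [(FilePath, Maybe String)]

-- Apply one layer on top of the files visible so far.
applyLayer :: [(FilePath, String)] -> Layer -> [(FilePath, String)]
applyLayer = foldl step
  where
    step acc (path, change) =
      let rest = filter ((/= path) . fst) acc
      in maybe rest (\contents -> (path, contents) : rest) change

-- The files visible at the top of the stack (layers listed bottom first).
-- This is what exporting and re-adding the contents keeps.
flatten :: [Layer] -> [(FilePath, String)]
flatten = foldl applyLayer []

-- Bytes stored when every layer is kept, including files removed later.
layeredSize :: [Layer] -> Int
layeredSize layers = sum [length c | layer <- layers, (_, Just c) <- layer]

-- Bytes stored after flattening to a single layer.
flatSize :: [Layer] -> Int
flatSize = sum . map (length . snd) . flatten
```

A file written in one layer and removed in a later one still counts towards layeredSize but not towards flatSize, which matches the overhead described above.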


The advantage of this approach is two-fold:

  • it keeps development time to a minimum. No longer do you have to sit through builds of sub-components that you already know succeed. You can focus on the bits that are still giving you grief.

  • it is great for maintaining a build script. There is a chance that the odd RUN command changes its behaviour over time (even though it shouldn’t). The build may fail but at least you don’t have to go back to the beginning once you’ve fixed the Dockerfile.

Also, as I alluded to earlier in the post, Docker only makes writing these build scripts easier. With the right tooling the same could be accomplished in any file system that provides snapshots.

Happy building!

Categories: Offsite Blogs

Newb Question: Am I taking the right approach to working with 'Right (Just "someString")'

Haskell on Reddit - Thu, 12/11/2014 - 2:39pm

I wanted to create a Haskell program that takes data stored in Redis and prints it out on the console. I was able to get stuff in and out of Redis, but I am not really sure what the best way is to print it out to the console. I came up with the following code. Can you guys tell me if I’m taking the right approach on this?

-- So I'm working with the redis library, hedis.
-- I was able to store and retrieve some json from redis.
-- And, I got the following:
--- Right (Just "theJson")
-- So, I played around on ghci and I arrived at the following solution

theJson = Right (Just "theJson") -- Stuff, I got from hedis

somethingWentWrong _ = Just "Whoops, something went wrong"
-- Return message if there's an error.
-- Used by checkForError.

checkForError = Data.Either.either somethingWentWrong id
-- Look at the json.
-- If it's a Left (Just "theJson") throw a Whoops.
-- If it's a Right (Just "theJson") return Just "theJson"

concatNothing = ( (++) "" )
-- take a string and return the string
-- I think I need it for getJson. I want to unwrap Just "theJson" so I can get to "theJson"

getJson mJson = maybe ("badJson") concatNothing mJson
-- take the Just "theJson"
-- If nothing, return "badJson"
-- Just "theJson", return "theJson"

showJson = show . getJson . checkForError

showJson theJson

submitted by tacit7
[link] [4 comments]
Categories: Incoming News

An ASM Monad

Haskell on Reddit - Thu, 12/11/2014 - 12:30pm
Categories: Incoming News - Thu, 12/11/2014 - 12:22pm
Categories: Offsite Blogs - Thu, 12/11/2014 - 12:22pm
Categories: Offsite Blogs