News aggregator

About learning Haskell!

Haskell on Reddit - Sun, 09/21/2014 - 3:33am
  1. This is better than almost all the monads tutorial out there: https://www.youtube.com/watch?v=ZhuHCtR3xq8

  2. Take any 5th grader that is taught in school about functions, injection, surjection, bijection, associativity, commutativity, neutral element, simetric element and groups, rings etc. and functional programming just gets intuitive and easy once you hit the wall of "tutorials" and "books".

The reason why math is not coming back to computer programming is all the people that write "tutorials"/"books". The end.

submitted by tetru
[link] [10 comments]
Categories: Incoming News

Inside the Wolfram Language

Lambda the Ultimate - Sat, 09/20/2014 - 8:20pm

Video of Stephen Wolfram showing off the Wolfram Language and sharing his perspective on the design of the language at Strange Loop conference.

Categories: Offsite Discussion

Linking/uploading a PDF file in HaskellWiki

haskell-cafe - Sat, 09/20/2014 - 7:30pm
L.S., I am trying to link a PDF file, that is to be uploaded to the HaskellWiki environment, but I encounter several problems: - The link [[File:wxhaskell.pdf | wxHaskell - A Portable and Concise GUI Library for Haskell]] is displayed as File:wxhaskell.pdf instead of wxHaskell - A Portable and Concise GUI Library for Haskell If I leave out the vertical bar, the entire string between the brackets is displayed and used as filename - If I click the link, I am asked to upload the file; the resulting file should be http://www.haskell.org/haskellwiki/File:Wxhaskell.pdf but this URL results in a blank page - The list of uploaded files should be at http://www.haskell.org/haskellwiki/Special:ListFiles but this URL results in a blank page as well Am I doing something wrong, or are these Wiki bugs? Regards, Henk-Jan van Tuyl
Categories: Offsite Discussion

Danny Gratzer: Introduction to Dependent Types: Off, Off to Agda Land

Planet Haskell - Sat, 09/20/2014 - 6:00pm
Posted on September 21, 2014 Tags: agda, types

First, an apology. Sorry this has take so long to push out. I’ve just started my first semester at Carnegie Mellon. I fully intend to keep blogging, but it’s taken a little while to get my feet under me. Happy readings :)

In this second post of my “intro to dependent types” series we’re going on a whirlwind tour of Agda. Specifically we’re going to look at translating our faux-Haskell from the last post into honest to goodness typecheckable Agda.

There are 2 main reasons to go through the extra work of using a real language rather than pseudo-code

  1. This is typecheckable. I can make sure that all the i’s are dotted and t’s crossed.
  2. It’s a lot cleaner We’re only using the core of Agda so it’s more or less a very stripped down Haskell with a much more expressive but simpler type system.

With that in mind let’s dive in!

What’s the Same

There’s quite a bit of shared syntax between Agda and Haskell, so a Haskeller can usually guess what’s going on.

In Agda we still give definitions in much the same way (single : though)

thingy : Bool thingy = true

where as in Haskell we’d say

name :: Type name = val

In fact, we even get Haskell’s nice syntactic sugar for functions.

function : A -> B -> ... -> C function a b ... = c

Will desugar to a lambda.

function : A -> B -> ... -> C function = \a b ... -> c

One big difference between Haskell and Agda is that, due to Agda’s more expressive type system, type inference is woefully undecidable. Those top level signatures are not optional sadly. Some DT language work a little harder than Agda when it comes to inference, but for a beginner this is a bit of a feature: you learn what the actual (somewhat scary) types are.

And of course, you always give type signatures in Haskell I’m sure :)

Like Haskell function application is whitespace and functions are curried

-- We could explicitly add parens -- foo : A -> (B -> C) foo : A -> B -> C foo = ... a : A a = ... bar : B -> C bar = foo a

Even the data type declarations should look familiar, they’re just like GADTs syntactically.

data Bool : Set where true : Bool false : Bool

Notice that we have this new Set thing lurking in our code. Set is just the kind of normal types, like * in Haskell. In Agda there’s actually an infinite tower of these Bool : Set : Set1 : Set2 ..., but won’t concern ourselves with anything beyond Set. It’s also worth noting that Agda doesn’t require any particular casing for constructors, traditionally they’re lower case.

Pattern matching in Agda is pretty much identical to Haskell. We can define something like

not : Bool -> Bool not true = false not false = true

One big difference between Haskell and Agda is that pattern matching must be exhaustive. Nonexhaustiveness is a compiler error in Agda.

This brings me to another point worth mentioning. Remember that structural induction I mentioned the other day? Agda only allows recursion when the terms we recurse on are “smaller”.

In other words, all Agda functions are defined by structural induction. This together with the exhaustiveness restriction means that Agda programs are “total”. In other words all Agda programs reduce to a single value, they never crash or loop forever.

This can occasionally cause pain though since not all recursive functions are modelled nicely by structural induction! A classic example is merge sort. The issue is that in merge sort we want to say something like

mergeSort : List Nat -> List Nat mergeSort [] = [] mergeSort (x :: []) = x :: [] mergeSort xs = let (l, r) = split xs in merge (mergeSort l, mergeSort r)

But wait, how would the typechecker know that l and r are strictly smaller than xs? In fact, they might not be! We know that the length of length xs > 1, but convincing the typechecker of that fact is a pain! In fact, without elaborate trickery, Agda will reject this definition.

So, apart from these restriction for totality Agda has pretty much been a stripped down Haskell. Let’s start seeing what Agda offers over Haskell.

Dependent Types

There wouldn’t be much point in writing Agda if it didn’t have dependent types. In fact the two mechanisms that comprise our dependent types translate wonderfully into Agda.

First we had pi types, remember those?

foo :: (a :: A) -> B foo a = ...

Those translate almost precisely into Agda, where we’d write

foo : (a : A) -> B

The only difference is the colons! In fact, Agda’s pi types are far more general than what we’d discussed previously. The extra generality comes from what we allow A to be. In our previous post, A was always some normal type with the kind * (Set in Agda). In Agda though, we allow A to be Set itself. In Haskell syntax that would be something like

foo :: (a :: *) -> B

What could a be then? Well anything with the kind * is a type, like Bool, (), or Nat. So that a is like a normal type variable in Haskell

foo :: forall a. B

In fact, when we generalize pi types like this, they generalize parametric polymorphism. This is kind of like how we use “big lambdas” in System F to write out polymorphism explicitly.

Here’s a definition for the identity function in Agda.

id : (A : Set) -> A -> A id A a = a

This is how we actually do all parametric polymorphism in Agda, as a specific use of pi types. This comes from the idea that types are also “first class”. We can pass them around and use them as arguments to functions, even dependent arguments :)

Now our other dependently typed mechanism was our generalized generalized algebraic data types. These also translate nicely to Agda.

data Foo : Bool -> Set where Bar : Foo True

We indicate that we’re going to index our data on something the same way we would in Haskell++, by adding it to the type signature on the top of the data declaration.

Agda’s GGADTs also allow us to us to add “parameters” instead of indices. These are things which the data type may use, but each constructor handles uniformly without inspecting it.

For example a list type depends on the type of it’s elements, but it doesn’t poke further at the type or value of those elements. They’re handled “parametrically”.

In Agda a list would be defined as

data List (A : Set) : Set where nil : List A cons : A -> List A -> List A

If your wondering what on earth the difference is, don’t worry! You’ve already in fact used parametric/non-parametric type arguments in Haskell. In Haskell a normal algebraic type can just take several type variables and can’t try to do clever things depending on what the argument is. For example, our definition of lists

data List a = Cons a (List a) | Nil

can’t do something different if a is Int instead of Bool or something like that. That’s not the case with GADTs though, there we can do clever things like

data List :: * -> * where IntOnlyCons :: Int -> List Int -> List Int ...

Now we’re not treating our type argument opaquely, we can figure things out about it depending on what constructor our value uses! That’s the core of the difference between parameters in indices in Agda.

Next let’s talk about modules. Agda’s prelude is absolutely tiny. By tiny I mean essentially non-existant. Because of this I’m using the Agda standard library heavily and to import something in Agda we’d write

import Foo.Bar.Baz

This isn’t the same as a Haskell import though. By default, imports in Agda import a qualified name to use. To get a Haskell style import we’ll use the special shortcut

open import Foo.Bar

which is short for

import Foo.Bar open Bar

Because Agda’s prelude is so tiny we’ll have to import things like booleans, numbers, and unit. These are all things defined in the standard library, not even the core language. Expect any Agda code we write to make heavy use of the standard library and begin with a lot of imports.

Finally, Agda’s names are somewhat.. unique. Agda and it’s standard library are unicode heavy, meaning that instead of unit we’d type ⊤ and instead of Void we’d use ⊥. Which is pretty nifty, but it does take some getting used to. If you’re familiar with LaTeX, the Emacs mode for Agda allows LaTeX style entry. For example ⊥ can be entered as \bot.

The most common unicode name we’ll use is ℕ. This is just the type of natural numbers as their defined in Data.Nat.

A Few Examples

Now that we’ve seen what dependent types look like in Agda, let’s go over a few examples of their use.

First let’s import a few things

open import Data.Nat open import Data.Bool

Now we can define a few simple Agda functions just to get a feel for how that looks.

not : Bool -> Bool not true = false not false = true and : Bool -> Bool -> Bool and true b = b and false _ = false or : Bool -> Bool -> Bool or false b = b or true _ = true

As you can see defining functions is mostly identical to Haskell, we just pattern match and the top level and go from there.

We can define recursive functions just like in Haskell

plus : ℕ -> ℕ -> ℕ plus (suc n) m = suc (plus n m) plus zero m = m

Now with Agda we can use our data types to encode “proofs” of sorts.

For example

data IsEven : ℕ -> Set where even-z : IsEven zero even-s : (n : Nat) -> IsEven n -> IsEven (suc (suc n))

Now this inductively defines what it means for a natural number to be even so that if Even n exists then n must be even. We can also state oddness

data IsOdd : ℕ -> Set where odd-o : IsOdd (suc zero) odd-s : (n : ℕ) -> IsOdd n -> IsOdd (suc (suc n))

Now we can construct a decision procedure which produces either a proof of evenness or oddness for all natural numbers.

open import Data.Sum -- The same thing as Either in Haskell; ⊎ is just Either evenOrOdd : (n : ℕ) -> Odd n ⊎ Even n

So we’re setting out to construct a function that, given any n, builds up an appropriate term showing it is either even or odd.

The first two cases of this function are kinda the base cases of this recurrence.

evenOrOdd zero = inj₁ even-z evenOrOdd (suc zero) = inj₂ odd-o

So if we’re given zero or one, return the base case of IsEven or IsOdd as appropriate. Notice that instead of Left or Right as constructors we have inj₁ and inj₂. They serve exactly the same purpose, just with a shinier unicode name.

Now our next step would be to handle the case where we have

evenOrOdd (suc (suc n)) = ?

Our code is going to be like the Haskell code

case evenOrOdd n of Left evenProof -> Left (EvenS evenProof) Right oddProof -> Right (OddS oddProof)

In words, we’ll recurse and inspect the result, if we get an even proof we’ll build a bigger even proof and if we can an odd proof we’ll build a bigger odd proof.

In Agda we’ll use the with keyword. This allows us to “extend” the current pattern matching by adding an expression to the list of expressions we’re pattern matching on.

evenOrOdd (suc (suc n)) with evenOrOdd n evenOrOdd (suc (suc n)) | inj₁ x = ? evenOrOdd (suc (suc n)) | inj₂ y = ?

Now we add our new expression to use for matching by saying ... with evenOrOdd n. Then we list out the next set of possible patterns.

From here the rest of the function is quite straightforward.

evenOrOdd (suc (suc n)) | inj₁ x = inj₁ (even-s n x) evenOrOdd (suc (suc n)) | inj₂ y = inj₂ (odd-s n y)

Notice that we had to duplicate the whole evenOrOdd (suc (suc n)) bit of the match? It’s a bit tedious so Agda provides some sugar. If we replace that portion of the match with ... Agda will just automatically reuse the pattern we had when we wrote with.

Now our whole function looks like

evenOrOdd : (n : ℕ) -> IsEven n ⊎ IsOdd n evenOrOdd zero = inj₁ even-z evenOrOdd (suc zero) = inj₂ odd-o evenOrOdd (suc (suc n)) with evenOrOdd n ... | inj₁ x = inj₁ (even-s n x) ... | inj₂ y = inj₂ (odd-s n y)

How can we improve this? Well notice that that suc (suc n) case involved unpacking our Either and than immediately repacking it, this looks like something we can abstract over.

bimap : (A B C D : Set) -> (A -> C) -> (B -> D) -> A ⊎ B -> C ⊎ D bimap A B C D f g (inj₁ x) = inj₁ (f x) bimap A B C D f g (inj₂ y) = inj₂ (g y)

If we gave bimap a more Haskellish siganture

bimap :: forall a b c d. (a -> c) -> (b -> d) -> Either a b -> Either c d

One interesting point to notice is that the type arguments in the Agda function (A and B) also appeared in the normal argument pattern! This is because we’re using the normal pi type mechanism for parametric polymorphism, so we’ll actually end up explicitly passing and receiving the types we quantify over. This messed with me quite a bit when I first starting learning DT languages, take a moment and convince yourself that this makes sense.

Now that we have bimap, we can use it to simplify our evenOrOdd function.

evenOrOdd : (n : ℕ) -> IsEven n ⊎ IsOdd n evenOrOdd zero = inj₁ even-z evenOrOdd (suc zero) = inj₂ odd-o evenOrOdd (suc (suc n)) = bimap (IsEven n) (IsOdd n) (IsEven (suc (suc n))) (IsOdd (suc (suc n))) (even-s n) (odd-s n) (evenOrOdd n)

We’ve gotten rid of the explicit with, but at the cost of all those explicit type arguments! Those are both gross and obvious. Agda can clearly deduce what A, B, C and D should be from the arguments and what the return type must be. In fact, Agda provides a convenient mechanism for avoiding this boilerplate. If we simply insert _ in place of an argument, Agda will try to guess it from the information it has about the other arguments and contexts. Since these type arguments are so clear from context, Agda can guess them all

evenOrOdd : (n : ℕ) -> IsEven n ⊎ IsOdd n evenOrOdd zero = inj₁ even-z evenOrOdd (suc zero) = inj₂ odd-o evenOrOdd (suc (suc n)) = bimap _ _ _ _ (even-s n) (odd-s n) (evenOrOdd n)

Now at least the code fits on one line! This also raises something interesting, the types are so strict that Agda can actually figure out parts of our programs for us! I’m not sure about you but at this point in time my brain mostly melted :) Because of this I’ll try to avoid using _ and other mechanisms for Agda writing programs for us where I can. The exception of course being situations like the above where it’s necessary for readabilities sake.

One important exception to that rule is for parameteric polymorphism. It’s a royal pain to pass around types explicitly everywhere. We’re going to use an Agda feature called “implicit arguments”. You should think of these as arguments for which the _ is inserted for it. So instead of writing

foo _ zero zero

We could write

foo zero zero

This more closely mimicks what Haskell does for its parametric polymorphism. To indicate we want something to be an implicit argument, we just wrap it in {} instead of (). So for example, we could rewrite bimap as

bimap : {A B C D : Set} -> (A -> C) -> (B -> D) -> A ⊎ B -> C ⊎ D bimap f g (inj₁ x) = inj₁ (f x) bimap f g (inj₂ y) = inj₂ (g y)

To avoid all those underscores.

Another simple function we’ll write is that if we can construct an IsOdd n, we can build an IsEven (suc n).

oddSuc : (n : ℕ) -> IsOdd n -> IsEven (suc n)

Now this function has two arguments, a number and a term showing that that number is odd. To write this function we’ll actually recurse on the IsOdd term.

oddSuc .1 odd-o = even-s zero even-z oddSuc .(suc (suc n)) (odd-s n p) = even-s (suc n) (oddSuc n p)

Now if we squint hard and ignore those . terms, this looks much like we’d expect. We build the Even starting from even-s zero even-z. From there we just recurse and talk on a even-s constructor to scale the IsEven term up by two.

There’s a weird thing going on here though, those . patterns. Those are a nifty little idea in Agda that pattern matching on one thing might force another term to be some value. If we know that our IsOdd n is odd-o n must be suc zero. Anything else would just be completely incorrect. To notate these patterns Agda forces you to prefix them with .. You should read .Y as “because of X, this must be Y”.

This isn’t an optional choice though, as . patterns may do several wonky things. The most notable is that they often use pattern variables nonlinearly, notice that n appeared twice in our second pattern clause. Without the . this would be very illegal.

As an exercise to the reader, try to write

evenSuc : (n : ℕ) -> IsEven n -> IsOdd (suc n) Wrap Up

That wraps up this post which came out much longer than I expected. We’ve now covered enough basics to actually discuss meaningful dependently typed programs. That’s right, we can finally kiss natural numbers good bye in the next post!

Next time we’ll cover writing a small program but interesting program and use dependent types to assure ourselves of it’s correctness.

As always, please comment with any questions :)

<script type="text/javascript"> /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */ var disqus_shortname = 'codeco'; // required: replace example with your forum shortname /* * * DON'T EDIT BELOW THIS LINE * * */ (function() { var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true; dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js'; (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq); })(); </script> <noscript>Please enable JavaScript to view the comments powered by Disqus.</noscript> comments powered by Disqus
Categories: Offsite Blogs

Christopher Done: shell-conduit: Write shell scripts in Haskell with Conduit

Planet Haskell - Sat, 09/20/2014 - 6:00pm

As part of my series of write-about-personal-projects, my latest obsession is writing shell scripts with Michael Snoyman’s Conduit.

Here is my package, shell-conduit. It’s still in the experimental phase, but I don’t forsee any changes now for a while.

Bash is evil

I hate writing scripts in Bash. Until now, it was the easiest way to just write unix scripts. Its syntax is insane, incredibly error prone, its defaults are awful, and it’s not a real big person programming language.

Perl/Python/Ruby are also evil

If you’re going to go as far as using a real programming language, why bother with these dynamically typed messes? Go straight for Haskell.

Like a glove

I had an inkling a while back that conduits mirror the behaviour of bash pipes very well. I knew there was something to the idea, but didn’t act on it fully for a while. Last week I experimented somewhat and realised that the following Haskell code

source $= conduit $ sink

does indeed accurately mirror

source | conduit > sink

And that also the following

(do source source $= conduit) $$ sink

is analogous to

source source | conduit

We’ll see examples of why this works later.

I must Haskell all the things

Another trick I realised is to write some template Haskell code which will calculate all executables in your PATH at compilation time and generate a top-level name that is a Haskell function to launch that process. So instead of writing

run "ls"

you could instead just write

ls

There are a few thousand executables, so it takes about 10 seconds to compile such a module of names. But that’s all.

Again, we’ll see how awesome this looks in a minute.

Modeling stdin, stderr and stdout

My choice of modeling the typical shell scripting pipe handles is by having a type called Chunk:

type Chunk = Either ByteString ByteString

All Left values are from stderr. All Right values are either being pulled from stdin or being sent to stdout. In a conduit the difference between stdin and stdout is more conceptual than real.

When piping two commands, the idea is that any Left values are just re-yielded along, they are not consumed and passed into the process.

A process conduit on chunks

Putting the previous model into practice, we come up with a type for launching a process like this:

conduitProcess :: (MonadResource m) => CreateProcess -> Conduit Chunk m Chunk

Meaning the process will be launched, and the conduit will accept any upstream stdin (Right values), and send downstream anything that comes from the actual process (both Left and Right values).

Process conduits API

I defined two handy functions for running process conduits:

shell :: (MonadResource m) => String -> Conduit Chunk m Chunk proc :: (MonadResource m) => String -> [String] -> Conduit Chunk m Chunk

One to launch via a shell, one to launch via program name and arguments. These functions can be used in your shell scripts. Though, we’ll see in a minute why you should rarely need either.

Executing a shell scripting conduit

First we want something to consume any remainder chunks after a script has finished. That’s writeChunks:

writeChunks :: (MonadIO m) => Consumer Chunk m () writeChunks = awaitForever (\c -> case c of Left e -> liftIO (S.hPut stderr e) Right o -> liftIO (S.hPut stdout o))

This simply consumes anything left in the pipeline and outputs to the correct file handles, either stderr or stdout.

Now we can write a simple run function:

run :: (MonadIO m,MonadBaseControl IO m) => Conduit Chunk (ShellT m) Chunk -> m () run p = runResourceT (runShellT (sourceList [] $= p $ writeChunks))

First it yields an empty upstream of chunks. That’s the source. Then our script p is run as the conduit in between, finally we write out any chunks that remain.

Let’s try that out:

λ> run (shell "echo hello!") hello! λ> run (proc "date" ["+%Y"]) 2014 λ> run (shell "echo oops > /dev/stderr") oops

Looks good. Standard output was written properly, as was stderr.

Returning to our mass name generation

Let’s take our earlier work of generating names with template-haskell. With that in place, we have a process conduit for every executable in PATH. Add to that variadic argument handling for each one, we get a list of names like this:

rmdir :: ProcessType r => r ls :: ProcessType r => r egrep :: ProcessType r => r dmesg :: ProcessType r => r

The real types when instantiated will look like:

rmdir "foo" :: Conduit Chunk m Chunk ls :: Conduit Chunk m Chunk ls "." :: Conduit Chunk m Chunk Putting it all together

We can now provide any number of arguments:

λ> run ls dist LICENSE README.md Setup.hs shell-conduit.cabal src TAGS TODO.org λ> run (ls "/") bin boot cdrom …

We can pipe things together:

λ> run (do ls "-1" $= head' "-2") dist LICENSE λ> run (ls $= grep "Key" $= shell "cat" $= CL.map (second (S8.map toUpper))) KEYBOARD.HI KEYBOARD.HS KEYBOARD.O

Results are outputted to stdout unless piped into other processes:

λ> run (do shell "echo sup"; shell "echo hi") sup hi λ> run (do shell "echo sup"; sed "s/u/a/"; shell "echo hi") sup hi λ> run (do shell "echo sup" $= sed "s/u/a/"; shell "echo hi") sap hi

Live streaming between pipes like in normal shell scripting is possible:

λ> run (do tail' "/tmp/example.txt" "-f" $= grep "--line-buffered" "Hello") Hello, world! Oh, hello!

(Remember that grep needs --line-buffered if it is to output things line-by-line).

Error handling

By default, if a process errors out, the whole script ends. This is contrary to Bash, which keeps going regardless of failure. This is bad.

In Bash, to revert this default, you run:

set -e

And the way to ignore erroneous commands on case-by-case basis is to use || true:

killall nonexistant || true echo OK, done.

Which means “do foo, or otherwise ignore it, continue the script”.

We can express the same thing using the Alternative instance for the ShellT type:

λ> run (do killall "nonexistant" "-q"; echo "OK, done.") *** Exception: ShellExitFailure 1 λ> run (do killall "nonexistant" "-q" <|> return (); echo "OK, done.") OK, done. String types

If using OverloadedStrings so that you can use Text for arguments, then also enable ExtendedDefaultRules, otherwise you’ll get ambiguous type errors.

{-# LANGUAGE ExtendedDefaultRules #-}

But this isn’t necessary if you don’t need to use Text yet. Strings literals will be interpreted as String. Though you can pass a value of type Text or any instance of CmdArg without needing conversions.

Examples of script files

Quick script to reset my keyboard (Linux tends to forget these things when I unplug my keyboard):

import Data.Conduit.Shell main = run (do xmodmap ".xmodmap" xset "r" "rate" "150" "50")

Cloning and initializing a repo (ported from a bash script):

import Control.Monad.IO.Class import Data.Conduit.Shell import System.Directory main = run (do exists <- liftIO (doesDirectoryExist "fpco") if exists then rm "fpco/.hsenvs" "-rf" else git "clone" "git@github.com:fpco/fpco.git" liftIO (setCurrentDirectory "fpco") shell "./dev-scripts/update-repo.sh" shell "./dev-scripts/build-all.sh" alertDone)

Script to restart a web process (ported from an old bash script I had):

import Control.Applicative import Control.Monad.Fix import Data.Conduit.Shell main = run (do ls echo "Restarting server ... ?" killall name "-q" <|> return () fix (\loop -> do echo "Waiting for it to terminate ..." sleep "1" (ps "-C" name $= discardChunks >> loop) <|> return ()) shell "dist/build/ircbrowse/ircbrowse ircbrowse.conf") where name = "ircbrowse" You’ve seen Shelly, right?

Right. Shelly’s fine. It just lacks the two killer things for me:

  • All names are bound, so I can just use them as normal functions.
  • shell-conduit also, due to its mass name binding, prioritizes commands. For example, Shelly has a group of functions for manipulating the file system. In shell-conduit, you just use your normal commands: rm "x" and mv "x" "y".
  • Not based on conduit. Conduit is a whole suite of streaming utilities perfect for scripting.
  • Piped is not the default, either. There’re a bunch of choices: Shelly, Shelly.Lifted, Shelly.Pipe. Choice is good, but for a scripting language I personally prefer one goto way to do something.

Also, Shelly cannot do live streams like Conduit can.

Conduits as good scripting libraries

You might want to import the regular Conduit modules qualified, too:

import qualified Data.Conduit.List as CL

Which contains handy functions for working on streams in a list-like way. See the rest of the handy modules for Conduit in conduit-extra.

Also of interest is csv-conduit, html-conduit, and http-conduit.

Finally, see the Conduit category on Hackage for other useful libraries: http://hackage.haskell.org/packages/#cat:Conduit

All of these general purpose Conduits can be used in shell scripting.

Using it for real scripts

So far I have ported a few small scripts to shell-conduit from Bash and have been happy every time. I suck at Bash. I’m pretty good at Haskell.

The next test is applying this to my Hell shell and seeing if I can use it as a commandline shell, too.

My friend complained that having to quote all arguments is a pain. I don’t really agree that this is bad. In Bash it’s often unclear how arguments are going to be interpreted. I’m happy just writing something predictable than something super convenient but possibly nonsense.

Summary

I set out a week ago to just stop writing Bash scripts. I’ve written a bunch of scripts in Haskell, but I would still write Bash scripts too. Some things were just too boring to write. I wanted to commit to Haskell for scripting. Today, I’m fairly confident I have a solution that is going to be satisfactory for a long while.

Categories: Offsite Blogs

Please point me at an eloquent paper or post on the business case for Haskell

Haskell on Reddit - Sat, 09/20/2014 - 2:22pm

To be clear, I've produced quite a bit of production software in Haskell so I'm not the one that needs convincing.

I'm at a startup and have selected to use Haskell for many obvious reasons. I am a decision maker but the COO has some old ideas about hire-ability and productivity. I'm finding that my own arguments are not well equipped to defend the business case!

I've experienced the increased productivity, quality of Software, and quality of programmers in using Haskell in my own startups but saying that isn't quite enough I think. I need better ammo and hope someone knows of any prose or papers by a prominent member of our community that makes a strong case for Haskell as a competitive advantage in business, or at least just being an intelligent and Not As Scary As You Think choice.

I would prefer to convince and provide the irrefutable logic, instead of saying "get your nose outta here!"

I'm speaking of a company in which I'm the sole programmer but will be building the team. So I'm not dealing with culture shift and already am an experienced Haskell programmer to lead such a team.

If there really is none I might try writing this myself and collaborating with our experienced community.

[UPDATE]

Thanks for the responses! The provided material is all good but I think there's still a need for something targeting non-technical leaders to make them feel comfortable so I think I'm going to try taking that on myself, I'll post it on the haskell mailing list (and here) for feedback from the community.

submitted by digitalmentat
[link] [14 comments]
Categories: Incoming News

hindent: A Haskell indenter

Haskell on Reddit - Sat, 09/20/2014 - 8:25am
Categories: Incoming News

Matthew Sackman: Concurrency, Actors, Locks and mailboxes

Planet Haskell - Sat, 09/20/2014 - 8:02am

Having been part of the original team that wrote RabbitMQ (indeed I wrote the very first prototype, back in the summer of 2006), and having worked full time with Erlang since 2009 until fairly recently, it's been interesting doing some work in Go recently.

Go's a language I currently have mixed feelings for. In some ways I like it - it's simple, it doesn't take too long to learn, the syntax is clean, the tool chain is superb (compilation is quick), performance is very good, it's very easy to drop into C whenever you really need to. It also has one very nice high end feature: first class channels - you can pass channels over channels which is pretty powerful. But equally, it's statically typed and has a terrible type system (i.e. no generics. Personally, I don't feel like the proofs offered by type checkers are worth much to me so I'd much rather have no static type checking than one that is this primitive and brain dead), it's not extensible properly (e.g. you can't create your own data structures which work with the range) and worst of all, there's no pattern matching. The lack of pattern matching is particularly horrific given Go's "best practise" of returning tuples from functions (except they're not really tuples - another mistake in the language), the right most of which indicates an error or success. But you can't pattern match on the assignment so you end up with endless if-statements that explicitly check the error for nil. There are other irritations which I've found, particularly related to its locks package (i.e. non-re-entrant; can't upgrade read to write; waiting for a write blocks all attempts to gain reads. Yes, I know I'm free to implement my own locks package if I want to).

Go also doesn't push you to using actors - you have to re-implement all that yourself if you want to. In a recent project, I started off with some locks and within about three days found it utterly impossible to reason about which go-routines can hold which locks and whether or not there's any deadlock potential. Inevitably, there was. So I ripped all the locking code out and wrote my own actor loops.

This was quite interesting as here I could now be more flexible than Erlang. I think most people think that actors mean "only one thread/process/routine can read and write its state" - there is a key concept of owning that state and regulating access to it. However, what I found is that I actually only needed to limit modifying the state to a single go-routine: each actor-receive-loop routine would take a write lock on its state whenever it needs to modify its own state, but it's perfectly reasonable to have anyone read the state, provided they take a read lock before doing so. The fact we can share pointers in Go makes this possible, whereas it's impossible to do this in Erlang (well, not quite - if you use ets then you can do it, which is exactly what we do in RabbitMQ in the rabbit_msg_store - but it's certainly not pretty!). So now we can have concurrent reads and no need to pass read-requests over a channel/mailbox. This seems pretty nice to me.

Recently I was reading a paper and it suggested that:

In message passing systems, processes interact exclusively by sending and receiving messages and they do not have access to shared memory.

Firstly, on a very technical note, they do have access to shared memory - the mailbox or queue is exactly that. The key reason why it leads to more composable systems is that when you hold the lock to write into a mailbox, you can never do anything other than write into that mailbox - you can never try to acquire multiple locks, so you can't deadlock in this way. And that's even assuming you're using locks for mailboxes - queues make lovely structures for lock-free concurrent access.

Secondly, as I suggest above, it appears to be safe to allow multiple concurrent readers of an actor's state, provided modifications to the state are done atomically by the actor thread - though more care has to be taken now to ensure updates are consistent - you have to make sure you update all the state you need to change in one go under a write lock (the sort of transactional semantics you end up needing to ensure makes me heavily think of STM). Whilst I would probably still call such a system a "message passing system" I can certainly imagine others would disagree and at a minimum it's some sort of hybrid (you could argue that the side effect of releasing the write lock when you've finished modifying the state is to publish an immutable copy of the state to any and all subscribers that want it - except without all that overhead. When viewed in these terms, it makes more intuitive sense that it's safe - provided of course that you don't do anything blocking whilst you're holding a state read-lock). This design also seems to get a fair bit trickier once you get to distributed systems and the need to have proxy objects representing the state and interface of a remote actor. By comparison, in Erlang a reference to an Actor is an immutable process identifier of Pid which is easy to send around and reason about.

But mainly I was thinking about the pattern of data flow: a mailbox allows multiple writers to send data to a single reader (a gather operation, maybe). The actor loop allows the exact opposite: a single reader of the mailbox can then affect multiple things (a scatter) - either by sending out messages to many other actors (in essence, a push action), or by (as I suggest above) modifying state which can be concurrently read by many other actors (correspondingly, a pull action). In my mind's eye, I see a sort of concertina effect as all these messages are pushed into a mailbox, and then slowly the effects of each message spread out again. In some ways it seems slightly odd how powerful this is, but in other ways it makes perfect sense: if you consider a finite state machine then your mailbox is just the stream of events coming into it and you have your little automaton updating the state with some operation combining the current state with the current message. It is the very fact that the next state is dependent on the current state and the current message that requires mutual exclusion around modifying the state. And of course by ensuring that that mutual exclusion lock is (implicitly) held in absence of any other locks that makes actor systems so much easier to reason about and understand - any deadlocks that occur are at the protocol level and, if you model your protocols between actors properly, can be determined statically (though I'm not aware that anyone actually does this - false positives may abound).

This then made makes me think about how, once all actors have done their initialisation and reached the core actor loop, the entire system is purely event driven. When looked at like this, are we really sure actors are enough? Are there not other forms of expression that capture the relation between events as inputs, with state, and an output more cleanly? In particular I'm thinking of things like Join calculus and Functional Reactive Programming. Given that actors are apparently becoming rather more mainstream these days, I wonder if that really means they're only part of the solution: sure I can write large distributed systems that scale, perform well, don't deadlock or livelock and are exceedingly robust. But I can I write them with less code and cleaner semantics?

Categories: Offsite Blogs

Matthew Sackman: Welcome

Planet Haskell - Sat, 09/20/2014 - 6:32am

Yet again, a new blog. This one is much simpler than the last. Whilst one always hopes, don't expect the frequency of posts to be greater than any of the previous blogs here... The styling is currently non-existent - seriously, there is no CSS right now. Sorry.

This one is much simpler than the last - gone is Serendipity and PyBlosxom is in, in its place. I'm currently trialing using muse to write posts. Not entirely sure it's worth it, but I need to get my emacs foo improved so working with a new major mode is likely a good idea.

Categories: Offsite Blogs

Matthew Sackman: First post!

Planet Haskell - Sat, 09/20/2014 - 4:34am

This is your first post! If you can see this with a web-browser, then it's likely that everything's working nicely!

Categories: Offsite Blogs

Chaining modifications using (>>>)

haskell-cafe - Sat, 09/20/2014 - 2:14am
Hello all, I've been playing with Data.Graph.Inductive, which provides functions to modify a graph. I take these functions and build something more complex from them. To do so, I have to apply several such modifications one after the other. I can write this using ($) or (.), but I didn't like to write things backwards (last function to apply comes first), so I use (>>>) from the Arrows module to chain functions Graph -> Graph (aka GraphTransforms) and my complex function looks like f >>> g >>> h. To actually build a Graph I have to apply my function to some Graph (often "empty"). This works okay as long as the inidividual functions f,g,h do not need access to the graph. However, there is one function where I use "pre" (predecessors) and "suc" (succerssors), which need the Graph as a parameter. groupNodes :: DynGraph gr => (Label,[Node]) -> GraphTransform gr pl groupNodes (lbl,ids) gr = let id = head $ newNodes 1 gr oldEdgesTo = [(toOld, old) | old <- ids, toOld <- pre gr old] -- <== he
Categories: Offsite Discussion

Eric Kidd: Rust lifetimes: Getting away with things that would be reckless in C++

Planet Haskell - Fri, 09/19/2014 - 9:07pm

Over the years, I've learned to be cautious with C++ pointers. In particular, I'm always very careful about who owns a given pointer, and who's in charge of calling delete on it. But my caution often forces me to write deliberately inefficient functions. For example:

vector<string> tokenize_string(const string &text);

Here, we have a large string text, and we want to split it into a vector of tokens. This function is nice and safe, but it allocates one string for every token in the input. Now, if we were feeling reckless, we could avoid these allocations by returning a vector of pointers into text:

vector<pair<const char *,const char *>> tokenize_string2(const string &text);

In this version, each token is represented by two pointers into text: One pointing to the first character, and one pointing just beyond the last character.1 But this can go horribly wrong:

// Disaster strikes! auto v = tokenize_string2(get_input_string()); munge(v);

Why does this fail? The function get_input_string returns a temporary string, and tokenize_string2 builds an array of pointers into that string. Unfortunately, the temporary string only lives until the end of the current expression, and then the underlying memory is released. And so all our pointers in v now point into oblivion—and our program just wound up getting featured in a CERT advisory. So personally, I'm going to prefer the inefficient tokenize_string function almost every time.

Rust lifetimes to the rescue!

Going back to our original design, let's declare a type Token. Each token is either a Word or an Other, and each token contains pointers into a pre-existing string. In Rust, we can declare this as follows:

#[deriving(Show, PartialEq)] enum Token<'a> { Word(&'a str), Other(&'a str) }

Read more…

Categories: Offsite Blogs

Christopher Done: hindent: A Haskell indenter

Planet Haskell - Fri, 09/19/2014 - 6:00pm
A question of style

In this post I’m going to use the word “style” to refer to the way code is printed in concrete terms. No changes in the code that would yield a different syntax tree are considered “style” here.

What’s the deal with code style?

Code style is important! If you’re a professional Haskell programmer, you’re working with Haskell code all day. The following things are affected by the style used:

  • How easily it can be manipulated with regular editors: the more code is laid out in a way that prevents you from readily using your normal editor functions, the less efficient you are.
  • How well general tooling works with it: do diff and things like that work well?
  • How easily you can absorb the structure: do you have to spend time hunting around for the start and end of syntactical nodes? Can’t see the tree for the forest?
  • How quickly you can write it: can you just write code or do you have to spend time thinking about how it’s going to be laid out before writing, or editing the layout afterwards?
  • How aesthetically offended you are1: does the code you’re looking at assault your sense of beauty?

Code style is important! Let’s have a code style discussion. I propose to solve it with tooling.

Is this really an issue, though?

Okay, so I’m one guy with a bee in his bonnet. Let’s do a quick Google and see what others are saying in this StackOverflow question:

Could someone provide a link to a good coding standard for Haskell? I’ve found this and this, but they are far from comprehensive.

The following points refer to style:

  • Format your code so it fits in 80 columns. (Advanced users may prefer 87 or 88; beyond that is pushing it.)
  • Put spaces around infix operators. Put a space following each comma in a tuple literal.
  • Prefer a space between a function and its argument, even if the argument is parenthesized.
  • Use line breaks carefully. Line breaks can increase readability, but there is a trade-off: Your editor may display only 40–50 lines at once. If you need to read and understand a large function all at once, you mustn’t overuse line breaks.
  • When possible, align – lines, = signs, and even parentheses and commas that occur in adjacent lines.

Even the Haskell community is not immune to long, protracted debates about tabs vs spaces. That reddit submission has zero points. That means it’s very controversial. The submission also has 117 comments. That means people are very vocal about this topic. That’s because bike-shedding is inversely proportional to the triviality of the debated thing. We know that.

Nevertheless, style is important enough to be discussed.

So let’s formalise a standard style

That’s a simplification. There are many style guides:

These are just public ones. In-house styles are also common, for a particular company. It’s not practical to force everyone into one single style. It’s a well-worn topic in the C world.

Okay, but is this a problem tooling even needs to solve?

There is precedent for other tooling based on style:

Everyone has their own style

So then let’s make a tool with a select number of styles, you might say. The problem is that people don’t even use the standards that exist out there. They used slightly varying versions. Ask any Haskeller what style they use, and they will say “mostly like X, but with some differences.”

For example What are some good style guides for writing Haskell code?2

I use a very similar style, with a few modifications. […] Perhaps I should write out my own style guide at some point, especially as I’ve become pretty set in my style lately.

And Good Haskell Coding Style3:

My Personal Pet Peeves

For the most part the style guides that I have added above (and the tools provided) mirror my own style guide (or perhaps my guide mirrors them). However, there is one item of style that particularly annoys me on a regular basis. […]

Can’t we just use structured editors?

Some more optimistic folk out there might suggest, perhaps, we should just throw away text files and go straight for structured code databases. Put all this formatting nonsense behind us. Layout is just a stylesheet! It’s not data to be stored in a file!

Maybe so. But nobody is using structured editors yet.

A practical way forward

Taking all of the above into consideration, here is my approach at the problem. The hindent library and program. Styled on GNU indent, the intention is that you simply run the program on some source code and it reformats it.

$ hindent hindent: arguments: --style [fundamental|chris-done|johan-tibell]

hindent has the concept of styles built in. There is a type for it:

data Style = forall s. Style {styleName :: !Text ,styleAuthor :: !Text ,styleDescription :: !Text ,styleInitialState :: !s ,styleExtenders :: ![Extender s] ,styleDefConfig :: !Config }

It contains authorship metadata. It holds an initial state which can be used during printing. Most importantly, it has a list of extenders. Means to extend the printer and change its behaviour on a node-type-specific basis.

Take a normal pretty printing approach. It’s usually something like:

class Pretty a where pretty :: a -> String

Then for all the types in the AST we implement an instance of Pretty.

The limitation here is that we can’t, as a user of the library, decide to print one particular node type differently.

Instead, here’s a new class:

class (Annotated ast,Typeable ast) => Pretty ast where prettyInternal :: ast NodeInfo -> Printer ()

(It runs in a Printer type to store some state about the current column and indentation level, things like that.)

Now, we implement the prettyInternal method for all our types. But when implementing instances, we never use the prettyInternal method directly. Instead, we use another function, pretty, which can be considered the main “API” for printing, like before. Here’s the type:

pretty :: (Pretty ast) => ast NodeInfo -> Printer ()

We’ll look at its implementation in a moment. Here is an example instance of Pretty:

instance Pretty ClassDecl where prettyInternal x = case x of ClsTyDef _ this that -> do write "type " pretty this write " = " pretty that …

See how pretty this and pretty that are used for recursing instead of prettyInternal? This approach is used for all instances.

Now let’s see what that affords us:

pretty :: (Pretty ast) => ast NodeInfo -> Printer () pretty a = do st <- get case st of PrintState{psExtenders = es,psUserState = s} -> do case listToMaybe (mapMaybe (makePrinter s) es) of Just m -> m Nothing -> prettyNoExt a printComments a where makePrinter s (Extender f) = case cast a of Just v -> Just (f s v) Nothing -> Nothing makePrinter s (CatchAll f) = (f s a)

In this function, we’re grabbing our (mentioned earlier) list of [Extender s] values from psExtenders and then looking up to see if any of the types match. To clarify, here is the Extender type:

data Extender s where Extender :: forall s a. (Typeable a) => (s -> a -> Printer ()) -> Extender s CatchAll :: forall s. (forall a. Typeable a => s -> a -> Maybe (Printer ())) -> Extender s

Both constructors are rank-n. Both accept the current state as an argument and the current node. The Extender constructor is Prismesque. It’s existential, and lets you say “I want things of this type”. The CatchAll will just accept anything.

All that adds up to me being able to do something like this. Here’s a demo style:

demo = Style {styleName = "demo" ,styleAuthor = "Chris Done" ,styleDescription = "Demo for hindent." ,styleInitialState = () ,styleExtenders = [Extender (\_ x -> case x of InfixApp _ a op b -> do pretty a pretty op pretty b _ -> prettyNoExt x)] ,styleDefConfig = def}

(The prettyNoExt is a publicly exposed version of the (private) method prettyInternal.)

Now let’s test the fundamental style versus our demo style:

λ> test fundamental "x = foo (a * b) + what" x = foo (a * b) + what λ> test demo "x = foo (a * b) + what" x = foo (a*b)+what

Viola! We’ve configured how to pretty print infix operators in a few lines of code.

In practice, there are three styles. Here’s a larger example:

foo = do print "OK, go"; foo (foo bar) -- Yep. (if bar then bob else pif) (case mu {- cool -} zot of Just x -> return (); Nothing -> do putStrLn "yay"; return (1,2)) bill -- Etc where potato Cakes {} = 2 * x foo * bar / 5

Fundamental style:

foo = do print "OK, go" foo (foo bar) (if bar then bob else pif) (case mu {- cool -} zot of Just x -> return () Nothing -> do putStrLn "yay" return (1,2)) bill -- Etc where potato Cakes{} = 2 * x foo * bar / 5

Johan Tibell’s style (which I’ve only started implementing, there’re some things I need to clarify):

foo = do print "OK, go" foo (foo bar) (if bar then bob else pif) (case mu {- cool -} zot of Just x -> return () Nothing -> do putStrLn "yay" return (1, 2)) bill -- Etc where potato Cakes{} = 2 * x foo * bar / 5

My style (pretty much complete):

foo = do print "OK, go" foo (foo bar) (if bar then bob else pif) (case mu {- cool -} zot of Just x -> return () Nothing -> do putStrLn "yay" return (1,2)) bill -- Etc where potato Cakes{} = 2 * x foo * bar / 5

The styles are part of the module hierarchy of the package.

Write your own style!

I welcome anyone to write their own style. All styles are based upon the fundamental style, which should never change, by extending it. You can base yours off of another style, or start from scratch. While you can keep your style locally, like your XMonad configuration, it’s encouraged to contribute your style as a module.

My style and Johan’s are quite different. But yours may be similar with small tweaks. Another distinctly different style is Simon Peyton Jones’s with explicit braces. This is a style you can implement if you want it.

See the contributing section of the README for more info.

Preserving meaning

A recommendation is to preserve the meaning of the code. Don’t make AST changes. Like removing parentheses, changing $ into parens, moving lets into wheres, etc. You can do it, but the results might surprise you.

Editing advantages

Having implemented Emacs support, I have been using this for a few weeks on my own code. It’s amazing. I’ve all but stopped manually making style changes. I just write code and then hit C-c i and it almost always does exactly what I want.

It can’t always do what I want. It has simple, predictable heuristics. But even when it doesn’t do what I want, I’ve so far been okay with the results. The advantages vastly outweigh that.

Remaining problems

I need to write a test suite for the fundamental style (and maybe the others). This isn’t hard, it’s just a bit laborious so I haven’t gotten round to it yet.

There’re some parts of the AST I haven’t finished filling out. You can see them which are marked by FIXME’s. This just means if you try to format a node type it doesn’t know how to print, you’ll get a message saying so. Your code won’t be touched.

Comment re-insertion is a little bit of a challenge. I have a decent implementation that generally preserves comments well. There’re some corner cases that I’ve noticed, but I’m confident I have the right change to make to clear that up.

The fundamental printer is fast. My personal ChrisDone style is slower, due to its heuristics of deciding when to layout a certain way. It took 6s to layout a complex and large declaration. I updated my style and brought it down to 2s. That’s okay for now. There are always speed tricks that can be done.

There are of course issues like whether HSE can parse your code, whether you have #ifdef CPP pragmas in your code, and things like that. That’s just part of the general problem space of tools like this.

Remaining ideas

Currently you can only reformat a declaration. I don’t yet trust (any) Haskell printer with my whole file. I invoke it on a per-declaration basis and I see the result. That’s good for me at the moment. But in the future I might extend it to support whole modules.

Implement some operator-specific layouts. There are some operators that can really only be laid out in a certain way. Control.Applicative operators spring to mind:

Foo <$> foo <*> bar <*> mu

This can easily be handled as part of a style. Other considerations might be strings of lens operators. I’ve noticed that people tend not to put spaces around them, like:4

foo^.bar.^mu.~blah

There’s also alignment, which is another possibility and easily implemented. The challenge will be deciding when alignment will look good versus making the code too wide and whitespacey. In my own style I personally haven’t implemented any alignment as it’s not that important to me, but I might one day.

Summary

Hopefully I’ve motivated the case that style is important, that formalizing style is important, and that automating it is practical and something we should solve and then move on, redirect our energy that was wasted on manually laying things out and debating.

  1. Granted, this isn’t a technical consideration. But it’s a real one.

  2. Both of these are referring to conventions other than simple layout, but I don’t think that detracts the point.

  3. Both of these are referring to conventions other than simple layout, but I don’t think that detracts the point.

  4. I don’t know lens’s operators so that might not be real code, but the point is that could be a style to implement.

Categories: Offsite Blogs

Christopher Done: Formatting in Haskell

Planet Haskell - Fri, 09/19/2014 - 6:00pm

This post is about the formatting package.

What’s wrong with printf?

The Text.Printf module is problematic simply because it’s not type-safe:

λ> import Text.Printf λ> printf "" 2 *** Exception: printf: formatting string ended prematurely λ> printf "%s" 2 *** Exception: printf: bad formatting char 's'

And it’s not extensible in the argument type. The PrintfType class does not export its methods.

And it’s not extensible in the formatting. You can’t add a “%K” syntax to it to print a value in Kelvins, for example.

And it’s implicit. You can’t just use your normal API searching facilities to search how to print a Day.

Holy Moly!

A while ago I was inspired by the HoleyMonoid package to use that mechanism to make a general replacement for printf.

It’s a continuation-based way of building up monoidal functions by composition with the ability to insert constants in-between. Example:

let holey = now "x = " . later show . now ", y = " . later show > run holey 3 5 "x = 3, y = 5"

The now function inserts a monoidal value directly into the composition. So

run (now x . now y)

is equivalent to

x <> y

And

run (later show . now x . later show . now y)

is equivalent to

\a b -> show a <> x <> show b <> y The Formatting package

The package is available on Hackage as formatting.

Comparison with other formatters

Example:

format ("Person's name is " % text % ", age is " % hex) "Dave" 54

or with short-names:

format ("Person's name is " % t % ", age is " % x) "Dave" 54

Similar to C’s printf:

printf("Person's name is %s, age is %x","Dave",54);

and Common Lisp’s FORMAT:

(format nil "Person's name is ~a, age is ~x" "Dave" 54) The Holey type newtype HoleyT r a m = Holey { runHM :: (m -> r) -> a } type Holey m r a = HoleyT r a m

This is my version of the HoleyMonoid. To make this into a useful package I changed a few things.

The Category instance implied a name conflict burden with (.), so I changed that to (%):

(%) :: Monoid n => Holey n b c -> Holey n b1 b -> Holey n b1 c

Rather than have the name-conflicting map function, I flipped the type arguments of the type and made it an instance of Functor.

Printers

There is an array of top-level printing functions for various output types:

-- | Run the formatter and return a lazy 'Text' value. format :: Holey Builder Text a -> a -- | Run the formatter and return a strict 'S.Text' value. sformat :: Holey Builder S.Text a -> a -- | Run the formatter and return a 'Builder' value. bprint :: Holey Builder Builder a -> a -- | Run the formatter and print out the text to stdout. fprint :: Holey Builder (IO ()) a -> a -- | Run the formatter and put the output onto the given 'Handle'. hprint :: Handle -> Holey Builder (IO ()) a -> a

All the combinators work on a lazy text Builder which has good appending complexity and can output to a handle in chunks.

The Format type

There is a short-hand type for any formatter:

type Format a = forall r. Holey Builder r (a -> r)

All formatters are written in terms of now or later.

Formatters

There is a standard set of formatters in Formatting.Formatters, for example:

text :: Format Text int :: Integral a => Format a sci :: Format Scientific hex :: Integral a => Format a

Finally, there is a general build function that will build anything that is an instance of the Build class from the text-format package:

build :: Buildable a => Format a

For which there are a bunch of instances. See the README for a full set of examples.

Composing formatters

%. is like % but feeds one formatter into another:

λ> format (left 2 '0' %. hex) 10 "0a" Extension

You can include things verbatim in the formatter:

> format (now "This is printed now.") "This is printed now."

Although with OverloadedStrings you can just use string literals:

> format "This is printed now." "This is printed now."

You can handle things later which makes the formatter accept arguments:

> format (later (const "This is printed later.")) () "This is printed later."

The type of the function passed to later should return an instance of Monoid.

later :: (a -> m) -> Holey m r (a -> r)

The function you format with (format, bprint, etc.) will determine the monoid of choice. In the case of this library, the top-level formating functions expect you to build a text Builder:

format :: Holey Builder Text a -> a

Because builders are efficient generators.

So in this case we will be expected to produce Builders from arguments:

format . later :: (a -> Builder) -> a -> Text

To do that for common types you can just re-use the formatting library and use bprint:

> :t bprint bprint :: Holey Builder Builder a -> a > :t bprint int 23 bprint int 23 :: Builder

Coming back to later, we can now use it to build our own printer combinators:

> let mint = later (maybe "" (bprint int)) > :t mint mint :: Holey Builder r (Maybe Integer -> r)

Now mint is a formatter to show Maybe Integer:

> format mint (readMaybe "23") "23" > format mint (readMaybe "foo") ""

Although a better, more general combinator might be:

> let mfmt x f = later (maybe x (bprint f))

Now you can use it to maybe format things:

> format (mfmt "Nope!" int) (readMaybe "foo") "Nope!" Retrospective

I’ve been using formatting in a bunch of projects since writing it. Happily, its API has been stable since releasing with some additions.

It has the same advantages as Parsec. It’s a combinator-based mini-language with all the same benefits.

Categories: Offsite Blogs