News aggregator

Christopher Done: Haskell structured diffs

Planet Haskell - Mon, 03/31/2014 - 6:00pm

Project-request: someone please implement a program that will diff Haskell in a cleverer way than lines.

In an effort to reign in my incessant work on Haskell tooling1, I’m outlining a tool that I’d personally like and welcome people to implement it. Otherwise it serves as a motivating problem description for the next time I come around to it myself with free time.

Before anyone emails me saying “lines/words are simple, other things are hard, that’s why it’s not been done yet. People undervalue the simple solution …” with a long lecture, spare me!

The concrete diff

The concrete diff is the line-based, and sometimes character-based, diff that we all know and love. There’s no reason to throw this away. You will need to keep this as an optional backend for when you are unable to parse a Haskell file.

Pros: simple to implement. You produce the necessary lines to delete and insert to create the change from A to B.

Cons: doesn’t know about syntactic redundancy where some changes don’t mean anything, and where the actual important change occurs. For example:

main = do putStrLn "Write your name!" name <- getLine print name

Now you change this to:

main = do args <- getArgs case args of [] -> do putStrLn "Write your name!" name <- getLine print name _ -> runWithArgs args

The diff will look like this:

@@ -5,3 +5,6 @@ module Main where -main = do putStrLn "Write your name!" - name <- getLine - print name +main = do args <- getArgs + case args of + [] -> do putStrLn "Write your name!" + name <- getLine + print name + _ -> runWithArgs args

But it’s clear to observe that this is not the change we made in spirit, it’s just one line-based way to achieve it. In actual fact, our do putStrLn … was moved into a case, un-changed. At this size, it’s not a big deal. When the code is more interesting, it’s important to know what was really changed, and what remains the same.

The abstract syntax diff

Enter the syntactic diff. We show the difference between two syntactic trees. How this is to be achieved in a readable way is the rub, but here are some ideas.

Take our example above, one approach can be to label nodes.


¹{main = ²{do putStrLn "Write your name!" name <- getLine print name}}


¹{main = do args <- getArgs case args of [] -> ²{do putStrLn "Write your name!" name <- getLine print name} _ -> runWithArgs args}

Now, at least at a superficial glance, you don’t even need this explained to you. You can see exactly what has happened: The code before has changed to the code after, but we can see that node2 has just moved to inside the case.

Where the trickiness arises is taking this to its logical conclusion and applying it generally. What’s displayed if you also change the string in the putStrLn? Good question. Here’s an idea:

¹{main = ²{do putStrLn -{"Write your name!"} name <- getLine print name}}

Because the node "Write your name" has now been lost, we don’t need to reference it any longer. So one way to show that it has been removed could be to put -{…}. And then to show what replaced it, put in +{…}, a la classic diffs:

¹{main = +{do args <- getArgs case args of [] -> ²{do putStrLn +{"Write your name!"} name <- getLine print name} _ -> runWithArgs args}}

In reality this rule would insert more -{…} and +{…} than I’ve written here, but I’m writing these examples manually so take them with a grain of salt. Let’s take it further and say that the string has actually been moved. Then we should indeed give it a number to reference it later:


¹{main = ²{do putStrLn ³{"Write your name!"} name <- getLine print name}}


¹{main = +{do args <- getArgs case args of [] -> ²{do putStrLn +{greeting} name <- getLine print name} _ -> runWithArgs args} +{where greeting = ³{"Write your name!"}}}

Again, I don’t think anybody is going to find this confusing. The node3 has moved into a where clause, which has been named greeting and referenced in place of its original place.

Am I making obvious sense, here? It’s not a particularly novel display, it states what happened syntactically, precisely. With a UI, you could expand/collapse nodes in a nested fashion or “explode” all the pieces into a flat list of numbered or +’d or -’d nodes, or just narrow down to one specific interesting expression, like

²{do putStrLn +{greeting} name <- getLine print name}

If you’re sufficiently nerd-sniped to find this interesting and do-able, then I invite you to go ahead and give it a go. I’d love to see a prototype. I don’t plan on implementing this in the near or distant future, so we won’t be toe stepping.

The reduced semantic diff

If you’re still reading by this point, let me try to entice you with ambitious ideas. Take the above approach, everything we just laid out, but let’s put an additional step in there: instead of diffing Haskell’s abstract syntax tree, diff the Core.

If you compile the below with GHC,

main = case Just () of Just () -> print "Hey!"

The external core is:

%module main:Main main:main5 :: (ZMZN Char) = unpackCStringzh ("Hey!"::Addrzh); main:main4 :: (ZMZN Char) = ZC @ Char base:zdfShowChar1 (ZMZN @ Char); main:main3 :: (ZMZN Char) = base:showLitString main:main5 main:main4; main:main2 :: (ZMZN Char) = ZC @ Char base:zdfShowChar1 main:main3; main:main1 :: (Statezh RealWorld) -> (Z2H ((Statezh RealWorld)) Z0T) = \ (etaB1::(Statezh RealWorld)) -> base:hPutStr2 base:stdout main:main2 True etaB1; main:main :: (IO Z0T) = %cast (main:main1) (%sym ((NTCoZCIO Z0T))); main:main6 :: (Statezh RealWorld) -> (Z2H ((Statezh RealWorld)) Z0T) = \ (etaXb::(Statezh RealWorld)) -> base:runMainIO1 @ Z0T (%cast (main:main1) (%sym ((NTCoZCIO Z0T)))) etaXb; main:ZCmain :: (IO Z0T) = %cast (main:main6) (%sym ((NTCoZCIO Z0T)));

You can see that the pointless case has been removed. This is the bread and butter of Core simplification. But if I remove the case myself, the Core is exactly the same. This is redundant semantic content, which is why GHC removed it.

If someone made a change like this in a real codebase which removed some redundant semantic content, not just syntactical redundancy, your diff could show it like that. In other words, nothing important semantically actually happened here.

In fact, if I refactored a bunch of code, re-organized a bit, does my next colleague really want to read through all the syntax tree just to see the crux of what changed? Sometimes, but not always. Sometimes, they just want to see the precise thing that will change at runtime.

It might actually be insane, with big blow ups in code difference for minute high-level changes, or it might be great for teams caring about performance. Difficult to know until you try it. You can also do a source-mapping back to the original Haskell source, for a more interesting display.

If you want to implement this, I would love to see any results.

The typed diff

Okay, you’re still reading so you’re pretty easily nerd sniped. Let’s continue with the ideas. Another type of difference between two sources is the types of expressions in there. Consider:

main = let x = [1,2,3] in print (x <> x)

Now you change the code to:

main = let x = myFancyMonoid in print (x <> x)

Our structural diff laid out earlier will show this:

main = let x = -{[1,2,3]} in print (x <> x)


main = let x = +{myFancyMonoid} in print (x <> x)

But actually, more things have changed here. As a result of the different monoid instance, the print (x <> x) will do something different. Maybe it’s a * rather than +, maybe it’s a number, whatever. Maybe that expression is used in a more interesting way than merely printing it. What’s the real diff?

main = let {x::[Integer]} = -{[1,2,3]} in print {{(x <> x)}::[Integer]}


main = let {x::MyFancyMonoid} = +{myFancyMonoid} in print {(x <> x)}::MyFancyMonoid}

Or something like that. I’m being hand-wavey in the display, here. The real difference is that we’ve changed the type of x. It’s an important change, which has semantic meaning. My ideas are more vague here. I haven’t thought through many scenarios of how to represent this. But one thing is clear: a diff of types can actually be useful and interesting.

The editing diff

The diffs above are all examples of “cold” diffs. Calculating the difference between two files as-is. If you’re in a structured editor like Lamdu, then you don’t have to do cold diffs and figure out and guess at what happened. You know exactly what happened. This node was raised here, this variable was renamed there, etc. But if you want to work on that, you pretty much have to work on Lamdu.


In summary I’ve intentionally listed increasingly more wacky diff ideas, from the familiar to the fairly novel. My general approach to tooling is progressive: start with the simplest working implementation then step up. Structured-haskell-mode is an example of this. It’s no Lamdu, and it’s no vanilla text-based mode. It’s a stepping stone inbetween. The impedance to try SHM is lower.

In the same way, maybe we can start with the abstract syntax diff, let people become acclimatized to it, let it stabilize, get it integrated into things like Git, and then work our way up from there.

If nobody bothers trying out these ideas, I’ll probably end up doing them myself eventually, but I thought I’d put the suggestion out there first.

  1. In favour of writing programs that concern themselves with things other than Haskell for once!

Categories: Offsite Blogs

Is there an elegant way to lazily produce values in ST?

Haskell on Reddit - Mon, 03/31/2014 - 4:35pm

I was playing with De Bruijn sequences, and I shamelessly stole the algorithm off of Wikipedia that computes the sequence in Python and ported it to Haskell.

Full text of the code: dbseq.hs

Here are the primitives I built the computation from:

class Monad m => STW m where type Yield m type RawState m yield :: Yield m -> m () readArr :: (Ix i) => STArray (RawState m) i e -> i -> m e writeArr :: (Ix i) => STArray (RawState m) i e -> i -> e -> m () newArr :: (Ix i) => (i,i) -> e -> m (STArray (RawState m) i e)

However, using m = WriterT [log] (ST s) clearly leads to running out of memory if the log gets big, because the head of the log can't be returned until the full ST computation is complete.

I tried instead using Control.Monad.ST.Lazy instead, which fared better, but still leaked memory and eventually ran out on large computations. This solution was the most elegant looking but fell over. My main question is if it's salvagable?

Then there's this monstrosity using strict ST, which actually seems to work with no leaks:

instance STW (ContT r (WriterT [t] (ST s))) where yield x = do lift $ tell [x] ContT $ \k -> WriterT $ unsafeInterleaveST $ runWriterT $ k ()

So, is this use of unsafeInterleaveST actually safe? I can't figure out how I'd go about proving it is, although it seems to work. And even if it is, it seems like a huge hack.

Is there a better way?

submitted by ryani
[link] [17 comments]
Categories: Incoming News

Monte Carlo Analysis in Haskell

Haskell on Reddit - Mon, 03/31/2014 - 4:13pm
Categories: Incoming News

Functional Jobs: Infrastructure Engineer at Zeta Project Germany GmbH (Full-time)

Planet Haskell - Mon, 03/31/2014 - 12:14pm

About us

We are an early-stage product company. At Zeta everyone has a voice and we expect you to change our world. If you like coming up with creative solutions then this is definitely the place for you.

We’ve built a team of passionately experienced, curious engineers and designers who are dedicated, hardworking and wholeheartedly embrace technical and design challenges of all types.

We are located in Berlin Mitte and are looking for talented individuals who share our passion.

Job Description

This position is for someone to work in the Infrastructure team, to contribute to the evolution of our software platform.

The team is responsible for developing and supporting the company's core systems and infrastructure, which is predominately written in Haskell.

You will work directly with other teams within the company, including client, backend, and QA engineers, ensuring that the infrastructure runs correctly, reliably and at scale.

The ability to educate others on the best use of the software resources the team provides is crucial, as is the ability to use feedback to continually improve our tooling and processes.

Initially we would like you to own our Jenkins setup and drive best practices for the automation of various teams' Continuous Integration and Deployment workflows, with future work planned to improve problem areas such as development workflow, monitoring, alerting, provisioning, and performance.

Skills & Requirements

  • Practical experience using various Amazon Web Services
  • Comfortable with admin tasks, like partitioning and formatting filesystems, Debian packaging, debugging live processes, and general Linux best practices
  • Working knowledge of Cassandra, Redis, or similar databases
  • Proactive with the ability to own and oversee projects
  • Extensive familiarity with Jenkins and Continuous Integration, for technologies such as Objective-C, C++ and Haskell
  • Good programmer with experience programming Ruby, Python, Erlang or other languages
  • Experience with various Configuration Management technologies like Ansible, Chef, Puppet, etc.
  • Interested in learning Haskell
  • Experience working as a Linux System Administrator

Get information on how to apply for this position.

Categories: Offsite Blogs

Can someone give me an example of how/when/why currying/uncurrying might be useful?

Haskell on Reddit - Mon, 03/31/2014 - 12:07pm

This is what I understand so far:

  • Uncurried functions take arguments simultaneously

    uncurriedFunction :: (a,b) -> c

  • Curried functions take arguments one at a time

    curriedFunction :: a -> b -> c

So currying, or a curried function, allows for partial application. With a curried function, we can give one argument to a function that requires more than one, right?

Beyond that, I'm having a hard time understand how the built-in functions, curry and uncurry, are useful, or why/when they would be used.

I have read through the sections of "Learn You a Haskell" that are relative to this topic, but I'm still struggling to really wrap my mind around it.

Thanks in advance.

submitted by Mister_Balloon_Hands
[link] [12 comments]
Categories: Incoming News

I'm trying to learn about free monads, but I get confused in the first paragraphs of /u/Tekmo's tutorial :/

Haskell on Reddit - Mon, 03/31/2014 - 11:52am

Why Free Monads Matter

In the tutorial, Gabriel defines a new data type, Toy:

data Toy b next = Output b next | Bell next | Done

After showing that different programs have different types, we are introduced to the Fix datatype:

data Fix f = Fix (f (Fix f))

Then we are shown that different programs can have the same type:

Fix (Output 'A' (Fix Done)) :: Fix (Toy Char) Fix (Bell (Fix (Output 'A' (Fix Done)))) :: Fix (Toy Char)

Now, this is the part where I become confused.

First, what's the intuition behind the Fix definition? It it supposed to be read as "the fixed point of an f is an f followed by a fixed point of f"?

I'm also not sure that I really follow the types and kinds; for example, the type of Done is Toy b next, and the kind of that type (sorry for the weak wording) is * -> * -> *. However, the kind of Fix is (* -> *) -> *, so how is Fix Done properly typed?

Sorry if this is very basic, but my brain seems unable to properly connect the dots.


Okay, so I read your comments and the Fix datatype makes more sense. Now to another question: is Fix integral to free monads? I found another tutorial where the author used a subset of Joy to show free monads and he also used the scheme of having an extra type variable that points to the next instruction. Is that required for free monads or do they just need an extra type variable (i.e. have kind * -> *) so that they can have a Functor instance?

submitted by gnuvince
[link] [11 comments]
Categories: Incoming News

Alejandro Cabrera: Migrating to Static, Self-Hosted Blog

Planet Haskell - Mon, 03/31/2014 - 9:50am
Hello, readers.

Soon, I'll be moving away from the Blogger platform to a statically hosted blog. I've been playing with hakyll for some time now, and am pleased with it's support for syntax highlighting, markup formats, RSS/Atom generation, and packaged Warp.

It'll be great to develop my blog with emacs, leveraging git for backup and rsync for deployment.


* Less latency
* A home-grown design
* More code samples, with pretty highlighting

With possibly:

* More frequent posts

Stay tuned. I'll be releasing the new blog by the end of the week!
Categories: Offsite Blogs

Daniil Frumin: ANN: hastache 0.6

Planet Haskell - Mon, 03/31/2014 - 1:49am
Announcing: hastache 0.6

Hastache is a Haskell implementation of the mustache templating system.

Quick start cabal update cabal install hastache

A simple example:

import Text.Hastache import Text.Hastache.Context import qualified Data.Text.Lazy.IO as TL main = hastacheStr defaultConfig (encodeStr template) (mkStrContext context) >>= TL.putStrLn template = "Hello, {{name}}!\n\nYou have {{unread}} unread messages." context "name" = MuVariable "Haskell" context "unread" = MuVariable (100 :: Int)

Read Mustache documentation for template syntax; consult README for more details.

Whats’s new in 0.6?

The interface of the library has been switched from ByteString to (lazy) Text. That means, for example, that the type of hastcheStr function is now the following:

hastacheStr :: MonadIO m => MuConfig m -> Text -> MuContext m -> m Text

The generic context generation (mkGenericContext) now supports functions of the types Text -> Text and Text -> m Text, as well as types with multiple constructors. That is, given a Haskell datastructure

data A = A { str :: String } | B { num :: Int }

it is possible to write a template like this:

{{#A}} A : {{str}} {{/A}} {{#B}} B : {{num}} {{/B}}

Please take a look at the multiple constructors example if you are interested.

Additionally, a couple of new MuVar instances has been added.

Links Acknowledgments

Special thanks to Mark Lentczner, Alexander Voikov and others who reported issues, tested the library and provided feedback.

Tagged: haskell, hastache
Categories: Offsite Blogs