News aggregator

Purity and referential transparency are different

Haskell on Reddit - Tue, 04/01/2014 - 1:14pm

I have always had the sense that the word "purity" is used pretty informally, and when I press people for a definition I've never been entirely satisfied with the answers.

Folks often point to the notion of referential transparency. There is a complaint that Strachey's introduction of the term was perhaps poorly explained given its original meaning (cf. Quine). I'm not as interested in that discussion, and I'm content to let RT mean something to do with having an equational theory of terms.

A good reference on the topic seems to be "What is a Purely Functional Language?" (Sabry 1998). In this article, Sabry argues that purity should be taken as a property of a language, not of an expression. RT is specifically discounted as a plausible definition of purity.

For the record, their definition of purity is:

A language is purely functional if (i) it includes every simply typed lambda calculus term, and (ii) its call-by-name, call-by-need, and call-by-value implementations are equivalent (modulo divergence and errors).

This definition is more satisfying than RT because I don't see how "replace equals with equals" could ever sensibly be violated without being obtuse. Indeed, Strachey's point is that when considering, say, "let x = y++" you cannot consider "x" and "y++" to be equals. Any replacement of "y++" with "x" is just ignorance of what the term "y++" actually equals, not a failure of referential transparency (a point better expressed in a blog post by Magnus).
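
To make condition (ii) concrete, here is a throwaway Haskell sketch (my own illustration, not from Sabry's paper): by smuggling a side effect in with unsafePerformIO, the call-by-need and call-by-name "implementations" of the same program give different answers, which is exactly what the definition rules out.

import Data.IORef (IORef, modifyIORef, newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- A hidden counter, bumped every time 'tick' is evaluated.
counter :: IORef Int
counter = unsafePerformIO (newIORef 0)
{-# NOINLINE counter #-}

tick :: Int
tick = unsafePerformIO (modifyIORef counter (+ 1) >> readIORef counter)
{-# NOINLINE tick #-}

main :: IO ()
main = print (tick + tick)
-- call-by-need (GHC): 'tick' is shared and evaluated once, printing 2 (1 + 1)
-- call-by-name: 'tick' would be evaluated twice, printing 3 (1 + 2)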

However, Sabry's definition doesn't totally agree with the informal usage of the term pure. For example, one can no longer say "prefer pure functions" because purity applies to languages as a whole, not portions of specific programs. A monolithic Haskell program written entirely in the IO monad can't be seen as somehow "less pure."

I would be interested if there are other papers that refine or dispute this definition. Specifically, is there a technical term that better captures the informal idea of "not creating abstractions that are functions of the rest of the program"? This is what I think people often mean when they advocate for the advantages of purity.

submitted by takeoutweight
[link] [18 comments]
Categories: Incoming News

Tutorial for building a compiler in Haskell?

Haskell on Reddit - Tue, 04/01/2014 - 7:57am

I have very often heard that Haskell is a very good language in which to write compilers, but have been unable to find resources or tutorials about this, besides the excellent Write You a Scheme.

I am currently working through Appel's "Modern Compiler Implementation in ML" in Haskell, and I think I am doing OK, but I would like to know if I am doing it the "proper" way, especially concerning features of ML he uses that are ugly, not available, or not idiomatic in Haskell.

submitted by TunaOfDoom
[link] [14 comments]
Categories: Incoming News

foldl is broken!

Haskell on Reddit - Tue, 04/01/2014 - 4:40am
Categories: Incoming News

Well-Typed.Com: Fixing foldl

Planet Haskell - Tue, 04/01/2014 - 4:38am

The foldl function is broken. Everyone knows it’s broken. It’s been broken for nearly a quarter of a century. We should finally fix it!

Today I am proposing that Prelude.foldl be redefined using the implementation currently known as Data.List.foldl'.

foldl is broken!

I’m sure you knew that already, but just in case…

Have you ever noticed that Haskellers usually recommend using either foldr or foldl' but not foldl? For example Real World Haskell has this to say:

Due to the thunking behaviour of foldl, it is wise to avoid this function in real programs: even if it doesn’t fail outright, it will be unnecessarily inefficient. Instead, import Data.List and use foldl'.

In the online version of the book the first user comments on that paragraph are

Why isn’t Data.List foldl implementation placed in Prelude?

I second the question: Why isn’t foldl' the default?

Good question.

Ok, so obviously we’re talking about the difference between foldl and foldl':

foldl :: (b -> a -> b) -> b -> [a] -> b
foldl f a []     = a
foldl f a (x:xs) = foldl f (f a x) xs

foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f a []     = a
foldl' f a (x:xs) = let a' = f a x in a' `seq` foldl' f a' xs

The dry technical difference is that foldl' evaluates the call to f before making the next recursive call. The consequences of that are perhaps not immediately obvious, so we’ll take a step back and look at a slightly bigger picture.

Folding left and right

When we first learn Haskell we learn that there are two ways to fold a list, from the left or right.

foldl f z [x1, x2, ..., xn] = (...((z `f` x1) `f` x2) `f`...) `f` xn
foldr f z [x1, x2, ..., xn] = x1 `f` (x2 `f` ... (xn `f` z)...)

Saying “from the left” or “from the right” is a description of what foldl and foldr calculate, with the parentheses nesting to the left or to the right. At runtime, of course, we always have to start from the left (front) of the list.

We later learn other ways of thinking about left and right folds, that a left fold can be used like a classic loop where we go through the whole list eagerly, while a right fold can be used like a demand-driven iterator. For the left fold that means using a function that is strict in its first argument (like (+)) while for the right fold that means using a function that is not strict in its second argument (like (:)).

Indeed when looking at whether we want foldl or foldr in any particular case our choice is usually governed by whether we want “all at once” behaviour (foldl) or if we want incremental or short-cut behaviour (foldr).
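
As a standard illustration of that rule of thumb (my example, not from the original text): use a strict left fold when the whole list has to be consumed anyway, and a right fold with a function that is lazy in its second argument when short-cutting is possible.

import Data.List (foldl')

-- "All at once": the whole list must be consumed, so use the strict left fold
-- (the variant this post goes on to recommend).
sumAll :: [Int] -> Int
sumAll = foldl' (+) 0

-- Incremental / short-cutting: (||) is lazy in its second argument, so the
-- fold can stop at the first even element.
anyEven :: [Int] -> Bool
anyEven = foldr (\x rest -> even x || rest) False

-- ghci> sumAll [1..100]
-- 5050
-- ghci> anyEven (2 : undefined)
-- True   -- the rest of the list is never forced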

Accumulating thunks

Again, as we are learning Haskell, we get told that foldl has this crazy behaviour

foldl (+) 0 (1:2:3:[])
  = foldl (+) (0 + 1) (2:3:[])
  = foldl (+) ((0 + 1) + 2) (3:[])
  = foldl (+) (((0 + 1) + 2) + 3) []
  = (((0 + 1) + 2) + 3)

when what we had in mind when we thought of an accumulating loop was

foldl' (+) 0 (1:2:3:[])
  = foldl' (+) 1 (2:3:[])
  = foldl' (+) 3 (3:[])
  = foldl' (+) 6 []
  = 6

Of course that’s just what foldl' does: it evaluates the call to + before making the next recursive call.
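
A quick way to see the consequence for yourself (a sketch; the precise failure mode depends on GHC version, optimisation level and stack limits):

import Data.List (foldl')

main :: IO ()
main = do
  -- The strict fold runs in constant space:
  print (foldl' (+) 0 [1 .. 10^7 :: Integer])   -- 50000005000000
  -- The lazy fold first builds ~10^7 nested (+) thunks and only then starts
  -- adding; depending on the setup this blows the stack or wastes a lot of
  -- memory:
  print (foldl (+) 0 [1 .. 10^7 :: Integer])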

When is foldl (rather than foldl') useful?

The short answer is “almost never”.

As beginners we often assume that foldl must still make sense in some cases (or why have both?) but it turns out that’s not really the case.

When the f argument of foldl is a strict function then delaying the evaluation does not gain us anything as it all has to be evaluated at the end anyway. The only time when delaying the evaluation could save us anything is when the f function is not strict in its first argument – in which case you either don’t care or probably should have been using foldr in the first place.

In fact even if our f function is non-strict in its first argument, we probably do not gain anything from delaying evaluation and it usually does no harm to evaluate earlier. Remember that we still have to traverse the whole list, we don’t get any short-cutting behaviour like with foldr.

We can, if we think about it, construct examples where foldl' would be too strict. We could define last and last' like this:

last  = foldl  (\_ y -> y) (error "empty list")
last' = foldl' (\_ y -> y) (error "empty list")

Now if we try

> last [1,undefined,3]
3
> last' [1,undefined,3]
*** Exception: Prelude.undefined

This is because our accumulator is always the latest element that we’ve seen but we don’t actually want to evaluate the elements (except the last one).

So it’s true that foldl' fails in this case, but it’s also a silly definition; the usual definition is a lot clearer:

last [x]    = x
last (_:xs) = last xs
last []     = error "empty list"

That goes for pretty much all the other examples you might be able to think of where foldl would work but foldl' would not: the examples are either artificial or are clearer written in other ways.

People sometimes point out that sum is defined using foldl and not foldl' and claim that this is something to do with Haskell’s designers wanting to allow Num instances where (+) might be lazy. This is pretty much nonsense. If that were the case then sum would have been defined using foldr rather than foldl to properly take advantage of short-cutting behaviour. A much simpler explanation is that foldl' was not available in early versions of Haskell when sum was defined.

In nearly 15 years as a Haskell programmer I think I’ve specifically needed foldl rather than foldl' about three times. I say “about” because I can only actually remember one. That one case was in a slightly cunning bit of code for doing cache updates in a web server. It would almost certainly have been clearer as a local recursion but I was amused to find a real use case for foldl and couldn’t help myself from using it just for fun. Of course it needed a comment to say that I was using it on purpose rather than by mistake!

So why do we have foldl and foldl'?

If foldl is almost always a mistake (or merely benign) then why do we have it in the first place?

I don’t know for sure, but here’s my guess…

When Haskell 1.0 was published on this day 24 years ago there was no seq function at all, so there was no choice but to define foldl in the “classic” way.

Eventually, six years later, after much discussion, we got the seq function in Haskell 1.3. Though actually in Haskell 1.3 seq was part of an Eval class, so you couldn’t use it just anywhere, such as in foldl. In Haskell 1.3 you would have had to define foldl' with the type:

foldl' :: Eval b => (b -> a -> b) -> b -> [a] -> b

Haskell 1.4 and Haskell 98 got rid of the Eval class constraint for seq but foldl was not changed. Hugs and GHC and other implementations added the non-standard foldl'.

I suspect that people then considered it a compatibility and inertia issue. It was easy enough to add a non-standard foldl' but you can’t so easily change the standard.

I suspect that if we had had seq from the beginning then we would have defined foldl using it.

Miranda, one of Haskell’s predecessor languages, already had seq 5 years before Haskell 1.0.

A strict foldl in Orwell!

Orwell is an interesting case. Orwell was another Haskell predecessor, very similar to Miranda and early Haskell. An informed source told me that Orwell had defined its foldl in the way that we now define foldl', i.e. with strict evaluation. Information on Orwell is a little hard to get hold of online these days so I asked Philip Wadler. Phil very kindly fished out the manuals and looked up the definitions for me.

In the original version:

An Introduction to Orwell (DRAFT) Philip Wadler 1 April 1985

In the standard prelude:

lred f a []     = a
lred f a (x:xs) = lred f (f a x) xs

But five years later, by the time Haskell 1.0 is being published…

An Introduction to Orwell 6.00 by Philip Wadler revised by Quentin Miller Copyright 1990 Oxford University Computing Lab

In the standard prelude:

foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f a []     = a
foldl f a (x:xs) = strict (foldl f) (f a x) xs

Note the use of strict. Presumably Orwell’s strict function was defined as (or equivalent to)

strict :: (a -> b) -> a -> b
strict f x = x `seq` f x

(These days in Haskell we call this function ($!).)

So my source was right, Orwell did change foldl to be the strict version!

I contend that this was and is the right decision, and that it was just a consequence of the late arrival of seq in Haskell and inertia and fears about backwards compatibility that have kept us from fixing foldl.

Just do it!

It’d help all of us who are sometimes tempted to use foldl because we can’t be bothered to import Data.List. It’d help confused beginners. It’d save teachers from the embarrassment of having to first explain foldl and then why you should never use it.

Orwell fixed this mistake at least 24 years ago, probably before Haskell 1.0 was released. Just because it’s an old mistake doesn’t mean we shouldn’t fix it now!

A postscript: which foldl'?

I hate to complicate a simple story but I should admit that there are two plausible definitions of foldl' and I’ve never seen any serious discussion of why we use one rather than the other (I suspect it’s another historical accident).

So the version above is the “standard” version, perhaps more clearly written using bang patterns as

foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f a []     = a
foldl' f a (x:xs) = let !a' = f a x in foldl' f a' xs

But an equally plausible version is

foldl'' :: (b -> a -> b) -> b -> [a] -> b
foldl'' f !a []     = a
foldl'' f !a (x:xs) = foldl'' f (f a x) xs

The difference is this: in the first version we evaluate the new accumulating parameter before making the recursive call, while in the second version we ensure that the accumulating parameter is always evaluated (to WHNF) on entry into each call.

These two ways of forcing the evaluation have almost the same effect. It takes a moment or two to find a case where it makes a difference, but here’s one:

foldl'  (\_ y -> y) undefined [1] = 1
foldl'' (\_ y -> y) undefined [1] = undefined

The standard foldl' ensures that all the new accumulating parameter values are evaluated, but still allows the initial value to be unevaluated. The alternative version simply says that the accumulating parameter is always evaluated.

The second version is attractive from a code generation point of view. One of the clever things that GHC can do with foldl' (and strict function arguments generally) is to unbox the values, so for example an Int can be unboxed to an Int# and passed in registers rather than on the heap. With the standard foldl', GHC needs a special case for the first iteration of the loop, where the initial value of the accumulating parameter might not be evaluated yet. With foldl'' that’s not a problem: we can unbox things right from the start. In practice, GHC can often see that the initial value is evaluated anyway, but not always.

(Don Stewart and I noticed this a few years back when we were working on stream fusion for lists. We had defined foldl' on streams in a way that corresponded to the second form above and then got a test failure when doing strictness testing comparing against the standard list functions.)

So if we’re going to fix foldl to be the strict version, then perhaps it should be the fully strict version, not just the “strict after the first iteration” version.

Categories: Offsite Blogs

Philip Wadler: 10 things that put people off cycling

Planet Haskell - Tue, 04/01/2014 - 4:28am
Sad to say, the photos above and below are not an April Fool. From yesterday's Guardian, 10 things that put people off cycling.
Categories: Offsite Blogs

Philip Wadler: Cycling: Past, Present and Possible Futures

Planet Haskell - Tue, 04/01/2014 - 4:24am
Spokes Spring meeting last week hosted Prof Colin Pooley, of Lancaster University, speaking about a detailed study of why people do and don't cycle. The standout lesson, from interviews and surveys, was that people don't cycle because they don't feel cycling is safe. His report concluded with a list of recommendations:
  • Fully separated cycle and pedestrian routes on all arterial roads.
  • Restrictions on traffic speeds, parking, access etc on all residential roads
  • Adopt ‘strict liability’ on roads to protect the most vulnerable road users
  • Changes to structure of cities to make accessing services by bike easy, and storing and parking bikes easy
  • Societal and economic changes to give people flexibility to travel more sustainably (flexi hours, school provision etc)
  • Change the image of cycling so that it becomes ‘normal’
The meeting was also attended by Andrew Burns, City of Edinburgh Council Leader (who reported it on his blog). The last question Councillor Burns was asked was whether he could remember the first recommendation on the list, but he could not. Which I think encapsulates the problem neatly.

The first recommendation is for separated cycle routes, as found in Copenhagen, Amsterdam, and elsewhere. ("Copenhagenize" is now a verb.) To Edinburgh's credit, it has allocated 7% of its transport budget to cycling. How much of that is going to separated cycling routes, and the other recommendations on the list? Councillor Burns says segregated routes are planned for George Street and Leith Walk. I argued we need a more aggressive plan. Listening to Pooley, the problems sound difficult, but all we really need are the money and the will. Let's construct a network of separated cycle routes covering the city. Build it, and they will come!

(Above: Edinburgh Links, one of the few segregated bike routes in the city, and my ride to work each morning. Below: Copenhagen.)

Categories: Offsite Blogs

Edward Z. Yang: Calculating Shanten in Mahjong

Planet Haskell - Tue, 04/01/2014 - 2:20am

Move aside, poker! While the probabilities of various poker hands are well understood and tabulated, the Chinese game of chance Mahjong [1] enjoys a far more intricate structure of expected values and probabilities. [2] This is due in large part to the much larger variety of tiles available (136 tiles, as opposed to the standard playing card deck size of 52), as well as the turn-by-turn game play, which means there is quite a lot of strategy involved with what is ostensibly a game of chance. In fact, the subject is so intricate, I’ve decided to write my PhD thesis on it. This blog post is a condensed version of one chapter of my thesis, considering the calculation of shanten, which we will define below. I’ll be using Japanese terms, since my favorite variant of mahjong is Riichi Mahjong; you can consult the Wikipedia article on the subject if you need to translate.

Calculating Shanten

The basic gameplay of Mahjong involves drawing a tile into a hand of thirteen tiles, and then discarding another tile. The goal is to form a hand of fourteen tiles (that is, after drawing, but before discarding a tile) which is a winning configuration. There are a number of different winning configurations, but most winning configurations share a similar pattern: the fourteen tiles must be grouped into four triples and a single pair. Triples are either three of the same tile, or three tiles in a sequence (there are three “suits” which can be used to form sequences); the pair is two of the same tiles. Here is an example:

Represented numerically, this hand consists of the triples and the pair 123 55 234 789 456.

One interesting quantity that is useful to calculate given a mahjong hand is the shanten number, that is, the number of tiles away from winning you are. This can be used to give you the most crude heuristic of how to play: discard tiles that get you closer to tenpai. The most widely known shanten calculator is this one on Tenhou’s website [3]; unfortunately, the source code for this calculator is not available. There is another StackOverflow question on the subject, but the “best” answer offers only a heuristic approach with no proof of correctness! Can we do better?

Naïvely, the shanten number is computed by a breadth-first search on the permutations of a hand. When a winning hand is found, the algorithm terminates and indicates the depth the search had gotten to. Such an algorithm is obviously correct; unfortunately, with 136 tiles, the number of hands to traverse (the choices of new tile times the choices of discard, at each step) grows exponentially when searching for a winning hand that is n-shanten away. If you are four tiles away, you will have to traverse over six trillion hands. We can reduce this number by avoiding redundant work if we memoize the shanten associated with hands: however, the total number of possible hands is roughly 2^59, i.e. 59 bits of state. Though we can fit a hand into a 64-bit integer (via a combinatorial number system), the resulting table is still far too large to hope to fit in memory.

The trick is to observe that the shanten calculation for each of the suits is symmetric; thus, we can use dynamic programming over a much smaller space, the tiles 1 through 9 for some generic suit, and then reuse these results when assembling the final calculation. The number of single-suit sub-hands is still rather large, so we can take advantage of the fact that, because there are four copies of each tile, an equivalent representation is a 9-vector of numbers from zero to four, with the constraint that the sum of these numbers is 13. Even without the constraint, the count is only about two million, which is quite tractable. At a byte per entry, that's 2MB of memory; less than your browser is using to view this webpage. (In fact, we want the constraint to be that the sum is less than or equal to 13, since not all hands are single-suited, so a single suit may contribute fewer tiles.)
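
(As a quick sanity check on those figures: each of the nine ranks can occur between zero and four times, so even without the sum constraint there are 5^9 = 1,953,125 possible count vectors — the "two million" entries, and roughly 2MB at one byte per entry.)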

The breadth-first search for solving a single suit proceeds as follows:

  1. Initialize a table A indexed by tile configuration (a 9-vector of 0..4).
  2. Initialize a todo queue Q of tile configurations.
  3. Initialize all winning configurations in table A with shanten zero (this can be done by enumeration), recording these configurations in Q.
  4. While the todo queue Q is not empty, pop the front element, mark the shanten of all adjacent uninitialized nodes as one greater than that node, and push those nodes onto the todo queue.
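
A skeleton of this multi-source BFS in Haskell might look like the following (a generic sketch, not the code from my thesis; the node type and the neighbours function, which for the per-suit table would add or remove a single tile from the 9-vector, are left as parameters):

import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Multi-source breadth-first search over an implicit (finite) graph: every
-- reachable node is labelled with its distance to the nearest source.  For
-- the shanten table the sources are the winning configurations (distance
-- zero) and the neighbours of a configuration are those reachable by adding
-- or removing one tile.
bfsLevels :: Ord a => (a -> [a]) -> [a] -> M.Map a Int
bfsLevels neighbours sources = go 0 (S.fromList sources) M.empty
  where
    go d frontier table
      | S.null frontier = table
      | otherwise =
          let table'    = S.foldr (\x -> M.insert x d) table frontier
              frontier' = S.fromList
                [ y | x <- S.toList frontier
                    , y <- neighbours x
                    , y `M.notMember` table' ]
          in go (d + 1) frontier' table'

For the per-suit table, bfsLevels would be seeded with the enumerated winning shapes at distance zero, exactly as in steps 1–4 above.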

With this information in hand, we can assemble the overall shanten of a hand. It suffices to try every distribution of the triples and the pair over the four types of tiles (also including null tiles), consult the shanten of the requested shape for each suit, and return the minimum over all these configurations. By stars and bars there are 35 ways to place the four triples and 4 ways to place the pair, for a total of 140 configurations. Computing the shanten of each configuration is a constant-time lookup into the table generated by the per-suit calculation. A true shanten calculator must also accommodate the rare other hands which do not follow this configuration, but those winning configurations are usually highly constrained, and it is quite easy to (separately) compute their shanten.

With a shanten calculator, there are a number of other quantities which can be calculated. Uke-ire refers to the number of possible draws which reduce the shanten of your hand: one strives for high uke-ire because it means a higher probability of drawing a tile which moves your hand closer to winning. Given a hand, it's very easy to calculate its uke-ire: just look at all adjacent hands and count the number of hands which have lower shanten.
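
In code, that description is essentially a one-liner (a sketch with the hand representation, the shanten function and the candidate tiles abstracted out, since none of them are pinned down here):

-- Uke-ire as described above: count the draws that strictly lower the
-- shanten of the hand.
ukeire :: (hand -> Int) -> (tile -> hand -> hand) -> [tile] -> hand -> Int
ukeire shanten addTile candidates hand =
  length [ t | t <- candidates, shanten (addTile t hand) < base ]
  where base = shanten hand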

Further extensions

Suppose that you are trying to design an AI which can play Mahjong. Would the above shanten calculator provide a good evaluation metric for your hand? Not really: it has a major drawback, in that it does not consider the fact that some tiles are simply unavailable (they were discarded). For example, if all four “nine stick” tiles are visible on the table, then no hand configuration containing a nine stick is actually reachable. Adjusting for this situation is actually quite difficult, for two reasons: first, we can no longer precompute a shanten table, since we need to adjust at runtime what the reachability metric is; second, the various suits are no longer symmetric, so we have to do three times as much work. (We can avoid an exponential blowup, however, since there is no inter-suit interaction.)

Another downside of the shanten and uke-ire metrics is that they are not direct measures of “tile efficiency”: that is, they do not directly dictate a strategy for discards which minimizes the expected time before you get a winning hand. Consider, for example, a situation where you have the tiles 233, and only need to make another triple in order to win. You have two possible discards: you can discard a 2 or a 3. In both cases, your shanten is zero, but discarding a 2, you can only win by drawing a 3, whereas discarding a 3, you can win by drawing a 1 or a 4. Maximizing efficiency requires considering the lifetime uke-ire of your hands.

Even then, perfect tile efficiency is not enough to see victory: every winning hand is associated with a point-score, and so in many cases it may make sense to go for a lower-probability hand that has higher expected value. Our decomposition method completely falls apart here, as while the space of winning configurations can be partitioned, scoring has nonlocal effects, so the entire hand has to be considered as a whole. In such cases, one might try a Monte Carlo approach, since the probability space is too difficult to characterize directly. However, in the Japanese Mahjong scoring system, there is yet another difficulty with this approach: the scoring system is exponential. Thus, we are in a situation where the majority of samples are low scoring, but an exponentially small number of samples have exponential payoff. In such cases, it’s difficult to say whether random sampling will actually give a good result, since it is likely to miscalculate the payoff unless exponentially many samples are taken. (On the other hand, because these hands are so rare, an AI might do considerably well simply ignoring them.)

To summarize, Mahjong is a fascinating game, whose large state space makes it difficult to accurately characterize the probabilities involved. In my thesis, I attempt to tackle some of these questions; please check it out if you are interested in more.

[1] No, I am not talking about the travesty that is mahjong solitaire.

[2] To be clear, I am not saying that poker strategy is simple—betting strategy is probably one of the most interesting parts of the game—I am simply saying that the basic game is rather simple, from a probability perspective.

[3] Tenhou is a popular Japanese online mahjong client. The input format for the Tenhou calculator is 123m123p123s123z, where numbers before m indicate man tiles, p pin tiles, s sou tiles, and z honors (in order, they are: east, south, west, north, white, green, red). Each entry indicates which tile you can discard to move closer to tenpai; the next list is of uke-ire (and the number of tiles which move the hand further).

Categories: Offsite Blogs

Well-Typed.Com: Fun and Profit with Strongly-Typed Data Schemas

Planet Haskell - Tue, 04/01/2014 - 1:24am

Over the past few months, Duncan and I have been working with Chris Dornan and Alfredo Di Napoli on api-tools, a DSL for specifying data schemas for REST-like APIs in Haskell. If you’re interested in the real-world use of Haskell, static types and DSLs, why not come along to hear Chris talk about it?

Wednesday 9th April, 6:30pm, London

Find out more and register for free over at Skills Matter:

Typical business apps store structured data, transform it and send it hither and thither. They are typically made of multiple components that have to agree on the schema of the data they exchange. So a large part of what it means to be “flexible” is for it to be easy to modify and extend the schema of the data that the system handles.

Strong typing can help with this, ensuring that the code that accesses the data is consistent with the schema. One idea that has been tried with databases is to generate Haskell types from the database schema, or generate the database schema from the Haskell types, or both from some standalone schema.

In this talk we will describe how we have applied this same idea but for REST APIs. We have a DSL that we use to specify the schema of the data available via a REST API. With a machine description of the schema we can generate the code and documentation for many parts of the system, and of course those parts are then easily adaptable when the schema changes. We generate Haskell types, JSON conversion functions, REST API documentation, disk persistence and more. We also have a story for managing schema changes and migrating data. This system is in use at Iris Connect and is now available on Hackage.

This talk will also discuss further applications we have found for these tools and some of the experience from the development project at Iris Connect, including problems we have had to solve building the Haskell components of the system and the wider challenge of integrating it into the overall development project.

And if that isn’t enough for you, Well-Typed’s Haskell courses are back at the end of April, with an all-new course on advanced features of the Haskell type system. Stay tuned for more events coming soon…

Categories: Offsite Blogs

Study finds that when no financial interests are involved programmers choose DECENT languages

Lambda the Ultimate - Mon, 03/31/2014 - 11:01pm

For immediate release. By studying the troves of open source software on the github service, a recent (and still unpublished) study by a team of empirical computer scientists found that programmers prefer to use DECENT languages. DECENT languages were defined by the team to be languages that conform to a set of minimal ethical standards, including clean syntax, expressive type systems, good package managers and installers, free implementations, no connection to the military-industrial complex and not harming animals. The researchers argue that their data support the view that the prevalent use of indecent languages, Java in particular, is the result of money influencing programmer decisions. The principal author of the study notes that DECENT languages do not necessarily make the most MORAL choices (e.g., many of them are not statically typed). He attributed this to a phenomenon called the "weakness of the will."

Details to follow.

Categories: Offsite Discussion

Christopher Done: Haskell structured diffs

Planet Haskell - Mon, 03/31/2014 - 6:00pm

Project-request: someone please implement a program that will diff Haskell in a cleverer way than lines.

In an effort to rein in my incessant work on Haskell tooling1, I’m outlining a tool that I’d personally like, and I welcome people to implement it. Otherwise it serves as a motivating problem description for the next time I come around to it myself with free time.

Before anyone emails me saying “lines/words are simple, other things are hard, that’s why it’s not been done yet. People undervalue the simple solution …” with a long lecture, spare me!

The concrete diff

The concrete diff is the line-based, and sometimes character-based, diff that we all know and love. There’s no reason to throw this away. You will need to keep this as an optional backend for when you are unable to parse a Haskell file.

Pros: simple to implement. You produce the necessary lines to delete and insert to create the change from A to B.

Cons: doesn’t know about syntactic redundancy where some changes don’t mean anything, and where the actual important change occurs. For example:

main = do putStrLn "Write your name!"
          name <- getLine
          print name

Now you change this to:

main = do args <- getArgs
          case args of
            [] -> do putStrLn "Write your name!"
                     name <- getLine
                     print name
            _ -> runWithArgs args

The diff will look like this:

@@ -5,3 +5,6 @@ module Main where
-main = do putStrLn "Write your name!"
-          name <- getLine
-          print name
+main = do args <- getArgs
+          case args of
+            [] -> do putStrLn "Write your name!"
+                     name <- getLine
+                     print name
+            _ -> runWithArgs args

But it’s clear that this is not the change we made in spirit; it’s just one line-based way to achieve it. In actual fact, our do putStrLn … was moved into a case, unchanged. At this size, it’s not a big deal. When the code is more interesting, it’s important to know what was really changed, and what remains the same.

The abstract syntax diff

Enter the syntactic diff. We show the difference between two syntactic trees. How this is to be achieved in a readable way is the rub, but here are some ideas.

Take our example above, one approach can be to label nodes.

Before:

¹{main = ²{do putStrLn "Write your name!"
              name <- getLine
              print name}}

After:

¹{main = do args <- getArgs
            case args of
              [] -> ²{do putStrLn "Write your name!"
                         name <- getLine
                         print name}
              _ -> runWithArgs args}

Now, at least at a superficial glance, you don’t even need this explained to you. You can see exactly what has happened: the code before has changed to the code after, but we can see that node ² has just moved to inside the case.

Where the trickiness arises is taking this to its logical conclusion and applying it generally. What’s displayed if you also change the string in the putStrLn? Good question. Here’s an idea:

¹{main = ²{do putStrLn -{"Write your name!"}
              name <- getLine
              print name}}

Because the node "Write your name" has now been lost, we don’t need to reference it any longer. So one way to show that it has been removed could be to put -{…}. And then to show what replaced it, put in +{…}, a la classic diffs:

¹{main = +{do args <- getArgs
              case args of
                [] -> ²{do putStrLn +{"Write your name!"}
                           name <- getLine
                           print name}
                _ -> runWithArgs args}}

In reality this rule would insert more -{…} and +{…} than I’ve written here, but I’m writing these examples manually so take them with a grain of salt. Let’s take it further and say that the string has actually been moved. Then we should indeed give it a number to reference it later:

Before:

¹{main = ²{do putStrLn ³{"Write your name!"}
              name <- getLine
              print name}}

After:

¹{main = +{do args <- getArgs
              case args of
                [] -> ²{do putStrLn +{greeting}
                           name <- getLine
                           print name}
                _ -> runWithArgs args}
         +{where greeting = ³{"Write your name!"}}}

Again, I don’t think anybody is going to find this confusing. Node ³ has moved into a where clause, where it has been bound to the name greeting, which is referenced in its original position.

Am I making obvious sense, here? It’s not a particularly novel display, it states what happened syntactically, precisely. With a UI, you could expand/collapse nodes in a nested fashion or “explode” all the pieces into a flat list of numbered or +’d or -’d nodes, or just narrow down to one specific interesting expression, like

²{do putStrLn +{greeting}
     name <- getLine
     print name}

If you’re sufficiently nerd-sniped to find this interesting and do-able, then I invite you to go ahead and give it a go. I’d love to see a prototype. I don’t plan on implementing this in the near or distant future, so we won’t be toe stepping.

The reduced semantic diff

If you’re still reading by this point, let me try to entice you with ambitious ideas. Take the above approach, everything we just laid out, but let’s put an additional step in there: instead of diffing Haskell’s abstract syntax tree, diff the Core.

If you compile the below with GHC,

main = case Just () of
         Just () -> print "Hey!"

The external core is:

%module main:Main
  main:main5 :: (ZMZN Char) =
    unpackCStringzh ("Hey!"::Addrzh);
  main:main4 :: (ZMZN Char) =
    ZC @ Char base:zdfShowChar1 (ZMZN @ Char);
  main:main3 :: (ZMZN Char) =
    base:showLitString main:main5 main:main4;
  main:main2 :: (ZMZN Char) =
    ZC @ Char base:zdfShowChar1 main:main3;
  main:main1 :: (Statezh RealWorld) -> (Z2H ((Statezh RealWorld)) Z0T) =
    \ (etaB1::(Statezh RealWorld)) ->
      base:hPutStr2 base:stdout main:main2 True etaB1;
  main:main :: (IO Z0T) =
    %cast (main:main1) (%sym ((NTCoZCIO Z0T)));
  main:main6 :: (Statezh RealWorld) -> (Z2H ((Statezh RealWorld)) Z0T) =
    \ (etaXb::(Statezh RealWorld)) ->
      base:runMainIO1 @ Z0T (%cast (main:main1) (%sym ((NTCoZCIO Z0T)))) etaXb;
  main:ZCmain :: (IO Z0T) =
    %cast (main:main6) (%sym ((NTCoZCIO Z0T)));

You can see that the pointless case has been removed. This is the bread and butter of Core simplification. But if I remove the case myself, the Core is exactly the same. This is redundant semantic content, which is why GHC removed it.

If someone made a change like this in a real codebase which removed some redundant semantic content, not just syntactical redundancy, your diff could show it like that. In other words, nothing important semantically actually happened here.

In fact, if I refactored a bunch of code, re-organized a bit, does my next colleague really want to read through all the syntax tree just to see the crux of what changed? Sometimes, but not always. Sometimes, they just want to see the precise thing that will change at runtime.

It might actually be insane, with big blow ups in code difference for minute high-level changes, or it might be great for teams caring about performance. Difficult to know until you try it. You can also do a source-mapping back to the original Haskell source, for a more interesting display.

If you want to implement this, I would love to see any results.

The typed diff

Okay, you’re still reading so you’re pretty easily nerd sniped. Let’s continue with the ideas. Another type of difference between two sources is the types of expressions in there. Consider:

main = let x = [1,2,3] in print (x <> x)

Now you change the code to:

main = let x = myFancyMonoid in print (x <> x)

Our structural diff laid out earlier will show this:

main = let x = -{[1,2,3]} in print (x <> x)

After:

main = let x = +{myFancyMonoid} in print (x <> x)

But actually, more things have changed here. As a result of the different monoid instance, the print (x <> x) will do something different. Maybe it’s a * rather than +, maybe it’s a number, whatever. Maybe that expression is used in a more interesting way than merely printing it. What’s the real diff?

main = let {x::[Integer]} = -{[1,2,3]} in print {{(x <> x)}::[Integer]}

After:

main = let {x::MyFancyMonoid} = +{myFancyMonoid} in print {{(x <> x)}::MyFancyMonoid}

Or something like that. I’m being hand-wavey in the display, here. The real difference is that we’ve changed the type of x. It’s an important change, which has semantic meaning. My ideas are more vague here. I haven’t thought through many scenarios of how to represent this. But one thing is clear: a diff of types can actually be useful and interesting.

The editing diff

The diffs above are all examples of “cold” diffs. Calculating the difference between two files as-is. If you’re in a structured editor like Lamdu, then you don’t have to do cold diffs and figure out and guess at what happened. You know exactly what happened. This node was raised here, this variable was renamed there, etc. But if you want to work on that, you pretty much have to work on Lamdu.

Summary

In summary I’ve intentionally listed increasingly more wacky diff ideas, from the familiar to the fairly novel. My general approach to tooling is progressive: start with the simplest working implementation, then step up. Structured-haskell-mode is an example of this. It’s no Lamdu, and it’s no vanilla text-based mode. It’s a stepping stone in between. The impedance to try SHM is lower.

In the same way, maybe we can start with the abstract syntax diff, let people become acclimatized to it, let it stabilize, get it integrated into things like Git, and then work our way up from there.

If nobody bothers trying out these ideas, I’ll probably end up doing them myself eventually, but I thought I’d put the suggestion out there first.

  1. In favour of writing programs that concern themselves with things other than Haskell for once!

Categories: Offsite Blogs

Is there an elegant way to lazily produce values in ST?

Haskell on Reddit - Mon, 03/31/2014 - 4:35pm

I was playing with De Bruijn sequences, and I shamelessly stole the algorithm off of Wikipedia that computes the sequence in Python and ported it to Haskell.

Full text of the code: dbseq.hs

Here are the primitives I built the computation from:

class Monad m => STW m where
  type Yield m
  type RawState m
  yield    :: Yield m -> m ()
  readArr  :: (Ix i) => STArray (RawState m) i e -> i -> m e
  writeArr :: (Ix i) => STArray (RawState m) i e -> i -> e -> m ()
  newArr   :: (Ix i) => (i,i) -> e -> m (STArray (RawState m) i e)

However, using m = WriterT [log] (ST s) clearly leads to running out of memory if the log gets big, because the head of the log can't be returned until the full ST computation is complete.

I tried using Control.Monad.ST.Lazy instead, which fared better, but still leaked memory and eventually ran out on large computations. This solution was the most elegant looking but fell over. My main question is whether it's salvageable.
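
For context, the basic behaviour I was relying on from lazy ST looks like this minimal sketch (not my actual code): the result list is produced on demand, interleaved with the stateful computation.

import Control.Monad.ST.Lazy (runST)
import Data.STRef.Lazy (newSTRef, readSTRef, writeSTRef)

-- An infinite list produced from inside lazy ST: each element is only
-- computed (and the STRef only touched) when the consumer forces it.
counting :: [Integer]
counting = runST $ do
    ref <- newSTRef 0
    let go = do
          n <- readSTRef ref
          writeSTRef ref (n + 1)
          rest <- go
          return (n : rest)
    go

-- ghci> take 5 counting
-- [0,1,2,3,4]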

Then there's this monstrosity using strict ST, which actually seems to work with no leaks:

instance STW (ContT r (WriterT [t] (ST s))) where
  yield x = do
    lift $ tell [x]
    ContT $ \k -> WriterT $ unsafeInterleaveST $ runWriterT $ k ()

So, is this use of unsafeInterleaveST actually safe? I can't figure out how I'd go about proving it is, although it seems to work. And even if it is, it seems like a huge hack.

Is there a better way?

submitted by ryani
[link] [17 comments]
Categories: Incoming News